"Raw HTML in Python contains "\n" characters that cannot be removed with replace"

from CodeB import simple_get

htmlPath = "https://en.wikipedia.org/wiki/Terminalia_nigrovenulosa"
raw_html = simple_get(htmlPath)
if raw_html is None:
    print("not found")
else:
    tmpHtml = str(raw_html)
    tmpHtmlB = tmpHtml.replace("\n","")
    print("tmpHtmlB:=", tmpHtmlB)

# CodeB (presumably the module imported above)
from requests import get
from requests.exceptions import RequestException
from contextlib import closing
from bs4 import BeautifulSoup

def simple_get(url):
    try:
        with closing(get(url, stream=True)) as resp:
            if is_good_response(resp):
                return resp.content
            else:
                return None
    except RequestException as e:
        log_error('Error during requests to {0} : {1}'.format(url, str(e)))
        return None

def is_good_response(resp):
    content_type = resp.headers['Content-Type'].lower()
    return (resp.status_code == 200
            and content_type is not None
            and content_type.find('html') > -1)

def log_error(e):
    print(e)

3 Answers

User

Reply 1 · edited 2024-09-26 17:44:04

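I believe you need to add another backslash "\" in front of \n, so that you search for the literal two-character text \n instead of having it interpreted as the newline escape.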

Quick example:

string = '\\n foo'
print(string.replace('\n', ''))

Returns:

\n foo

Whereas:

print(string.replace('\\n', ''))

returns only:

foo
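
A small sketch of why this probably applies to the code in the question, assuming Python 3: resp.content is a bytes object, and calling str() on bytes gives its printable representation, in which each newline shows up as the two characters backslash and n. The sample bytes below are made up for illustration; decoding the bytes instead of calling str() is offered only as one possible alternative, not something the question's author did.

```python
# Stand-in for resp.content; these sample bytes are made up for illustration.
raw_html = b"<p>line one</p>\n<p>line two</p>"

# str() on bytes returns the repr, so the newline becomes backslash + n.
as_text = str(raw_html)

# Searching for a real newline finds nothing in that repr.
unchanged = as_text.replace("\n", "")

# Searching for the literal backslash + n removes it.
stripped = as_text.replace("\\n", "")

# Alternative: decode to real text first, then remove real newlines.
# UTF-8 is an assumption about the page encoding.
decoded = raw_html.decode("utf-8").replace("\n", "")

print(unchanged)
print(stripped)
print(decoded)
```

Note that the str() route also leaves the b'...' wrapper in the result, while the decoded version does not.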

User

Reply 2 · edited 2024-09-26 17:44:04

I think putting a space between the double quotes would help you.

User

Reply 3 · edited 2024-09-26 17:44:04

Use the raw string r'\n', or remember that \n stands for a newline character, so you need to escape the backslash: .replace('\\n', '')
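
To illustrate the two spellings, here is a minimal sketch; the sample text is invented. It only shows that the raw string and the escaped string are the same two-character pattern, and that both differ from an actual newline.

```python
# Both spellings denote the same two characters: a backslash followed by n.
escaped = "\\n"
raw = r"\n"
same_pattern = (escaped == raw)

# A real newline is a single character and does not match that pattern.
newline = "\n"

# Invented sample: text that contains a literal backslash + n, not a newline.
text = "first\\nsecond"
cleaned = text.replace(r"\n", "")

print(same_pattern, len(raw), len(newline))
print(cleaned)
```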
