<p>Try joining with newlines instead of spaces:</p>
<pre><code>from bs4 import BeautifulSoup

with open('test.html', 'r') as html_file:
    soup = BeautifulSoup(html_file, 'lxml')
    tables = soup.find_all('table', class_='width-max')
    for table in tables:
        # split() breaks on any whitespace, so every word ends up on its own line
        texts = '\n'.join(table.text.split())
        print(texts)
</code></pre>
<p>Edit:
The previous snippet splits lines that contain several words into one word per line. Try the following instead:</p>
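<pre><code>from bs4 import BeautifulSoup

with open('test.html', 'r') as html_file:
    soup = BeautifulSoup(html_file, 'lxml')
    tables = soup.find_all('table', class_='width-max')
    for table in tables:
        # Skip tables whose text is only whitespace ("not", since Python has no "!" operator)
        if not table.get_text().isspace():
            # Keep each non-empty line whole; '\n' is used rather than os.linesep
            # because print() already translates it to the platform line ending
            text = '\n'.join([line for line in table.get_text().splitlines() if line])
            print(text.lstrip())
</code></pre>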
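<p>To see the difference between the two approaches without an HTML file, here is a minimal sketch on a plain string. The <code>sample</code> text and the helper names <code>join_words</code> and <code>join_lines</code> are made up for illustration and only stand in for what <code>table.get_text()</code> might return. Unlike the snippet above, <code>join_lines</code> also strips each line and drops lines that contain only spaces.</p>

```python
# Hypothetical sample standing in for the text of one table
sample = "\n\nFirst Name\n\n  John Smith\n\n"


def join_words(text):
    """First approach: split on any whitespace, one word per line."""
    return '\n'.join(text.split())


def join_lines(text):
    """Second approach: keep each non-blank line whole, drop blank lines."""
    return '\n'.join(
        line.strip() for line in text.splitlines() if line.strip()
    )


print(join_words(sample))
print('---')
print(join_lines(sample))
```

<p>The first call puts <code>First</code>, <code>Name</code>, <code>John</code> and <code>Smith</code> on four separate lines, while the second keeps <code>First Name</code> and <code>John Smith</code> together on two lines.</p>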