<p>Try calling <a href="https://docs.python.org/2/library/string.html#string.strip" rel="nofollow">strip</a> on the text returned by <code>Beautiful Soup</code>.</p>
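<p>Assuming you extract the text from the <code>li</code> tag with something like <code>text = soup.find('li').get_text()</code>, add a call to <code>strip()</code> on the result, i.e. <code>text.strip()</code>. That should remove the whitespace, including newlines, at both ends.</p>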
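<p>As a quick illustration of what <code>strip()</code> does on its own, here is a minimal sketch in plain Python. The variable names <code>raw</code> and <code>cleaned</code> are made up for this example, and the sample string is assumed to resemble what <code>get_text()</code> would return for one of the <code>li</code> elements.</p>

```python
# Hypothetical sample: text as it might come back from get_text() on one <li>
raw = '\n\n GUANGZHOU ADS AUDIO SCIENCE & TECHNOLOGY CO.,LTD.\n\n '

# With no arguments, strip() removes leading and trailing whitespace
# (spaces, tabs, newlines). Whitespace inside the string is left untouched.
cleaned = raw.strip()

print(repr(raw))
print(repr(cleaned))
```

<p>Only the ends are trimmed, so the single spaces between words survive.</p>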
<pre><code>from bs4 import BeautifulSoup


def get_li_texts(html):
    # Name the parser explicitly so the result does not depend on which parsers are installed
    soup = BeautifulSoup(html, 'html.parser')
    li_texts = []
    for li in soup.find_all('li'):
        # get_text() keeps the surrounding newlines and spaces; strip() trims both ends
        text = li.get_text().strip()
        li_texts.append(text)
    return li_texts


html = '<li>\n\n GUANGZHOU ADS AUDIO SCIENCE &amp; TECHNOLOGY CO.,LTD.\n\n </li>, <li>\n\n SHIMA ADS INDUSTRIAL DISTRICT GUANGZHOU GUANGDONG CHINA\n\n </li>, <li>\n\n GUANGDONGGUANGZHOU\n\n </li>, <li>\n\n 510440\n\n </li>, <li>\n\n http://www.adsaudio.cc\n\n </li>'
texts = get_li_texts(html)
print(texts)
# Output on Python 3 (wrapped here for readability):
# ['GUANGZHOU ADS AUDIO SCIENCE & TECHNOLOGY CO.,LTD.',
#  'SHIMA ADS INDUSTRIAL DISTRICT GUANGZHOU GUANGDONG CHINA',
#  'GUANGDONGGUANGZHOU',
#  '510440',
#  'http://www.adsaudio.cc']
</code></pre>