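<p>There are several ways to take the first two elements:</p>
<p>1) Use the map function with getattr. I like this approach because you only iterate over the first 2 elements.</p>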
<pre><code>from bs4 import BeautifulSoup
soup = BeautifulSoup(your_html, 'lxml')
r = soup.find_all('td')
gen_my_soup_text = map(lambda x: getattr(x, 'text'), r)
first_string = next(gen_my_soup_text)
second_string = next(gen_my_soup_text)
print(first_string)
print(second_string)
# output:
# Parte edibile, %
# 75
</code></pre>
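<p>The first approach works because map is lazy in Python 3: nothing is read until next is called. A minimal sketch of the same idea with itertools.islice follows. FakeCell is a hypothetical stand-in for a bs4 tag (it only counts how often .text is read), and the last two cell values are made up for illustration.</p>

```python
from itertools import islice
from operator import attrgetter


class FakeCell:
    """Hypothetical stand-in for a bs4 Tag; counts how often .text is read."""
    reads = 0

    def __init__(self, text):
        self._text = text

    @property
    def text(self):
        FakeCell.reads += 1
        return self._text


# The first two values come from the example above; the rest are invented.
cells = [FakeCell(s) for s in ("Parte edibile, %", "75", "Acqua, g", "73.4")]

# map() is lazy, so islice pulls only two items through it.
first_two = list(islice(map(attrgetter("text"), cells), 2))
print(first_two)       # ['Parte edibile, %', '75']
print(FakeCell.reads)  # 2
```

<p>Only two .text reads happen, no matter how many cells the list holds.</p>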
<p>2) Use map and slicing</p>
<pre><code>list(map(lambda x: getattr(x, 'text'), r))[:2]
</code></pre>
<p>3) Use a list comprehension and slicing</p>
<pre><code>[e.text for e in r][:2]
</code></pre>
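<p>Approaches 2 and 3 build the full list of texts before slicing it. Since find_all returns a list-like result, one possible variation is to slice r first, so that only two elements are ever read. A small sketch, where the SimpleNamespace objects are hypothetical stand-ins for the tags in r and the last two values are placeholders:</p>

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the Tag objects in r; only .text matters here.
r = [SimpleNamespace(text=s) for s in ("Parte edibile, %", "75", "a", "b")]

slice_last = [e.text for e in r][:2]   # reads every element, then discards most
slice_first = [e.text for e in r[:2]]  # reads only the first two elements
print(slice_first)  # ['Parte edibile, %', '75']
```

<p>Both forms give the same result; the difference only matters when r is large.</p>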
<p>To scrape the web page, you can try:</p>
<pre><code>from bs4 import BeautifulSoup
import requests
req = requests.get('http://www.bda-ieo.it/test/Alphabetical.aspx?Lan=Ita')
soup = BeautifulSoup(req.text, "lxml")
# rows holds the table rows (tr tags) of interest.
rows = soup.find_all("tr", attrs = {'class':'testonormale'})
first_second = [[e.text for e in row.find_all('td')][:2] for row in rows]
# output:
#[['1300', 'ACCIUGHE o ALICI '],
# ['1502', 'ACCIUGHE o ALICI SOTTO SALE'],
# ['1501', "ACCIUGHE o ALICI SOTT'OLIO"],
# ['100205', 'ACETO'],
# ...
# ['602004', 'ASTICE '],
# ['600009', 'AVENA '],
# ['999692', 'AVOCADO ']]
</code></pre>
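<p>Some of the scraped names carry trailing spaces, and a row without td cells would produce an empty list. As a rough follow-up sketch, the pairs could be cleaned into a dictionary. clean_pairs is a hypothetical helper, and treating the first column as a code and the second as a name is an assumption about the page; the empty row is added only to illustrate the skip.</p>

```python
def clean_pairs(first_second):
    """Strip whitespace and skip rows that do not have exactly two cells."""
    return {
        code.strip(): name.strip()
        for code, name in (row for row in first_second if len(row) == 2)
    }


# Sample rows taken from the output above, plus an empty row (assumed case).
sample = [
    ['1300', 'ACCIUGHE o ALICI '],
    ['100205', 'ACETO'],
    [],
    ['999692', 'AVOCADO '],
]
cleaned = clean_pairs(sample)
print(cleaned)
# {'1300': 'ACCIUGHE o ALICI', '100205': 'ACETO', '999692': 'AVOCADO'}
```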