<p><a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags">Do not parse HTML with regex.</a> Use a dedicated tool - an <code>HTML parser</code>.</p>
<p>Here is a solution using <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/" rel="nofollow noreferrer"><code>BeautifulSoup</code></a>:</p>
<pre><code>import urllib.request

from bs4 import BeautifulSoup

base_url = "http://www.boostmobile.com/stores/?page={page}&zipcode={zipcode}"
num_pages = 10
zipcode = 30008

for page in range(1, num_pages + 1):
    url = base_url.format(page=page, zipcode=zipcode)
    soup = BeautifulSoup(urllib.request.urlopen(url), "html.parser")

    print("Page Number: %s" % page)

    results = soup.find('table', class_="results")
    for h2 in results.find_all('h2'):
        print(h2.text)
</code></pre>
<p>It prints:</p>
<pre><code>Page Number: 1
Boost Mobile Store by Wireless Depot
Boost Mobile Store by KOB Wireless
Marietta Check Cashing Services
...
Page Number: 2
Target
Wal-Mart
...
</code></pre>
<p>As you can see, first we find a <code>table</code> tag with the <code>results</code> class - this is where the store names actually live. Then, inside that <code>table</code>, we find all of the <code>h2</code> tags. This is more robust than relying on the tags' <code>style</code> attributes.</p>
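<p>A variation not in the original answer: the same two-step lookup can be expressed as a single CSS selector via <code>select()</code>. A minimal self-contained sketch, using a made-up HTML snippet in place of the live page:</p>
<pre><code># Sketch: extract h2 tags inside table.results with one CSS selector
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the store-results table
html = """
&lt;table class="results"&gt;
  &lt;tr&gt;&lt;td&gt;&lt;h2&gt;Boost Mobile Store by Wireless Depot&lt;/h2&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;&lt;h2&gt;Target&lt;/h2&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
"""
soup = BeautifulSoup(html, "html.parser")

# "table.results h2" matches every h2 inside a table with class "results"
names = [h2.text for h2 in soup.select("table.results h2")]
print(names)
</code></pre>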
<hr/>
<p>You can also make use of <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#parsing-only-part-of-a-document" rel="nofollow noreferrer"><code>SoupStrainer</code></a>. It will improve performance, since it parses only the part of the document you specify:</p>
<pre><code>from bs4 import BeautifulSoup, SoupStrainer

required_part = SoupStrainer('table', class_="results")

for page in range(1, num_pages + 1):
    url = base_url.format(page=page, zipcode=zipcode)
    soup = BeautifulSoup(urllib.request.urlopen(url), "html.parser",
                         parse_only=required_part)

    print("Page Number: %s" % page)

    for h2 in soup.find_all('h2'):
        print(h2.text)
</code></pre>
<p>Here we are saying: "parse only the <code>table</code> tag with the class <code>results</code>, and give us all of the <code>h2</code> tags inside it."</p>
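<p>To see the filtering in action without fetching the live page, here is a minimal sketch with an invented snippet: an <code>h2</code> outside <code>table.results</code> never makes it into the soup at all:</p>
<pre><code># Sketch: SoupStrainer drops everything outside table.results at parse time
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical markup: one h2 outside the results table, one inside
html = """
&lt;div&gt;&lt;h2&gt;Not a store - outside the table&lt;/h2&gt;&lt;/div&gt;
&lt;table class="results"&gt;&lt;tr&gt;&lt;td&gt;&lt;h2&gt;KOB Wireless&lt;/h2&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
"""
only_results = SoupStrainer('table', class_="results")
soup = BeautifulSoup(html, "html.parser", parse_only=only_results)

# Only the h2 inside table.results was parsed
print([h2.text for h2 in soup.find_all('h2')])
</code></pre>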
<p>Also, if you want to improve performance, you can <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser" rel="nofollow noreferrer">let <code>BeautifulSoup</code> use the <code>lxml</code> parser under the hood</a>:</p>
<pre><code>soup = BeautifulSoup(urllib.request.urlopen(url), "lxml", parse_only=required_part)
</code></pre>
<p>Hope that helps.</p>