<p>Mechanize is only suited to fetching the HTML. Once you want to extract information from that HTML, you can use, for example, <a href="http://www.crummy.com/software/BeautifulSoup/" rel="nofollow noreferrer">BeautifulSoup</a>. (See also my answer to a similar question: <a href="https://stackoverflow.com/questions/7722876/web-mining-or-scraping-or-crawling-what-tool-library-should-i-use/7722953#7722953">Web mining or scraping or crawling? What tool/library should I use?</a>)</p>
<p>Depending on where the <code>&lt;td&gt;</code> sits in the HTML (your question does not make this clear), you can use code like the following:</p>
<pre><code>from bs4 import BeautifulSoup  # Beautiful Soup 4 (pip install beautifulsoup4)

html = ...  # this is the html you've fetched
soup = BeautifulSoup(html, "html.parser")

# use this (gets all &lt;td&gt; elements)
cols = soup.find_all('td')
# or this (gets only &lt;td&gt; elements with class='h3')
cols = soup.find_all('td', class_='h3')

# print the inner markup of the first matching &lt;td&gt; element
print(cols[0].decode_contents())
</code></pre>
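<p>If installing a third-party parser is not an option, the same idea can be sketched with only Python's standard library. This is a minimal sketch and not the BeautifulSoup approach itself: the <code>TdCollector</code> class, the <code>td_texts</code> helper and the sample markup are hypothetical names invented for illustration, and the code assumes the table cells are not nested inside one another.</p>

```python
from html.parser import HTMLParser


class TdCollector(HTMLParser):
    """Collect the text of <td> cells, optionally filtered by CSS class.

    Hypothetical helper for illustration; assumes <td> elements are not nested.
    """

    def __init__(self, css_class=None):
        super().__init__()
        self.css_class = css_class  # None means "accept every <td>"
        self.cells = []             # text of each matching cell, in document order
        self._buffer = None         # text pieces while inside a matching <td>

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        # A class attribute may hold several space-separated class names.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class is None or self.css_class in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


def td_texts(html, css_class=None):
    """Return the text of every <td>, or only of those with the given class."""
    parser = TdCollector(css_class)
    parser.feed(html)
    parser.close()
    return parser.cells


# Made-up sample markup standing in for the HTML fetched with Mechanize.
sample = '<table><tr><td class="h3">Title</td><td>42</td></tr></table>'
print(td_texts(sample))        # all cells
print(td_texts(sample, "h3"))  # only cells with class "h3"
```

<p>Unlike <code>renderContents()</code>, this sketch returns only the text inside each cell and drops any inner tags; for real-world, messy HTML a dedicated library such as BeautifulSoup is likely the more robust choice.</p>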