<p>You need to use the <a href="http://docs.python.org/release/2.5.2/lib/module-urllib2.html" rel="nofollow noreferrer">urllib2 python library</a> to fetch the HTML from the website, and then parse that HTML to pull out the text you want.</p>
<p>Use <a href="http://www.crummy.com/software/BeautifulSoup/" rel="nofollow noreferrer">BeautifulSoup</a> to parse the HTML.</p>
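<pre><code>import urllib2
from BeautifulSoup import BeautifulSoup  # import the class, not just the module

# fetch the raw HTML of the page
resp = urllib2.urlopen("http://stackoverflow.com")
rawhtml = resp.read()
# parse through html to get text
soup = BeautifulSoup(rawhtml)
</code></pre>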
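<p>If you just want to see what "parsing through the HTML to get the text" can look like, here is a rough sketch. Note that it is not the BeautifulSoup approach above: it assumes Python 3 and uses only the standard library's html.parser module, and the names TextExtractor and extract_text are made up for this illustration. It works on an HTML string you have already downloaded, so it does no network access itself.</p>

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script>/<style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(rawhtml):
    """Return the text fragments of an HTML string joined by single spaces."""
    parser = TextExtractor()
    parser.feed(rawhtml)
    parser.close()
    return " ".join(parser.chunks)


# Stand-in for the downloaded page; a real page would come from the HTTP response.
sample = "<html><head><style>p {color: red}</style></head><body><p>Hello <b>world</b></p></body></html>"
print(extract_text(sample))  # prints "Hello world"
```

<p>For real-world pages, which are often malformed, a dedicated parser such as BeautifulSoup is likely to be more forgiving than this sketch.</p>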