<p><a href="https://stackoverflow.com/a/1732454/843822">Don't use regex to extract data from HTML.</a> You already have lxml, so use its power (<a href="http://lxml.de/xpathxslt.html#xpath" rel="nofollow noreferrer">XPath</a>).</p>
<pre><code>>>> import lxml.html as html
>>> page = html.parse("http://www.insiderpages.com/b/3721895833/central-kia-of-irving-irving")
>>> print(page.xpath("//div[@class='rating_box']/abbr/@title"))
['3']
</code></pre>
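<p>To try the same XPath without hitting the network, here is a minimal sketch that runs it against an inline snippet. The <code>SAMPLE</code> markup and the <code>extract_ratings</code> helper are hypothetical stand-ins, and the snippet only assumes the real page has a <code>rating_box</code> div containing an <code>abbr</code> with a <code>title</code> attribute, as the query above implies.</p>

```python
import lxml.html as html

# Hypothetical markup standing in for the fetched page; the real page's
# structure is only assumed to match the XPath used in the answer.
SAMPLE = """
<html><body>
  <div class="rating_box"><abbr title="3">3 stars</abbr></div>
  <div class="other"><abbr title="ignored">n/a</abbr></div>
</body></html>
"""


def extract_ratings(markup):
    """Return the title attribute of each <abbr> directly under a rating_box div."""
    root = html.fromstring(markup)
    # The trailing /@title selects attribute values, so the result is a list of strings.
    return root.xpath("//div[@class='rating_box']/abbr/@title")


ratings = extract_ratings(SAMPLE)
print(ratings)
```

<p>Note that <code>@class='rating_box'</code> is an exact match on the whole attribute value, so an element with several classes would need something like <code>contains(@class, 'rating_box')</code> instead.</p>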