<p>The following code works on your input:</p>
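<pre><code>import lxml.html

# Parse the saved page; the HTML parser tolerates malformed markup.
root = lxml.html.parse('text.html').getroot()
# Select every span whose class attribute is exactly "zzAggregateRatingStat".
for span in root.xpath('//span[@class="zzAggregateRatingStat"]'):
    print(span.text)
</code></pre>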
<p>It prints:</p>
^{pr2}$
<p>I prefer <code>lxml</code>'s <em>xpath</em> over <em>cssselector</em>, although either one gets the job done.</p>
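<p>To illustrate that the two approaches are interchangeable here, this is a minimal sketch run against a made-up fragment (the markup and the variable names are assumptions, not the real page). Note that <code>cssselect</code> in <code>lxml</code> relies on the separately installed <code>cssselect</code> package, so the sketch guards against it being absent:</p>

```python
import lxml.html

# Hypothetical fragment standing in for the real page.
doc = lxml.html.fromstring(
    '<div><span class="zzAggregateRatingStat">3</span>'
    '<span class="other">x</span></div>'
)

# XPath: match spans whose class attribute is exactly this value.
via_xpath = [s.text for s in doc.xpath('//span[@class="zzAggregateRatingStat"]')]

# CSS selector: needs the external cssselect package, which may be missing.
try:
    via_css = [s.text for s in doc.cssselect('span.zzAggregateRatingStat')]
except ImportError:
    via_css = None  # cssselect is not installed

print(via_xpath, via_css)
```

<p>One difference worth keeping in mind: the XPath above compares the whole <code>class</code> attribute, while the CSS selector also matches elements that carry additional classes.</p>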
<p>ChrisP's example prints <code>3</code>, but running it on the actual input raises an error:</p>
<pre><code>$ python chrisp.py
Traceback (most recent call last):
  File "chrisp.py", line 6, in <module>
    doc = fromstring(text)
  File "lxml.etree.pyx", line 2532, in lxml.etree.fromstring (src/lxml/lxml.etree.c:48270)
  File "parser.pxi", line 1545, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:71812)
  File "parser.pxi", line 1424, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:70673)
  File "parser.pxi", line 938, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:67442)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63824)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64745)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64088)
lxml.etree.XMLSyntaxError: EntityRef: expecting ';', line 3, column 210
</code></pre>
<p>ChrisP's code can be changed to use <code>lxml.html.fromstring</code>, which is a more lenient parser, instead of {<cd4>}.</p>
<p>With that change, it prints <code>3</code>.</p>
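<p>The difference between the strict and the lenient parser can be reproduced with a small sketch. The markup below is invented for illustration (the real page is not shown here); it contains a bare <code>&amp;</code> in an attribute, which is the kind of input that triggers the <code>EntityRef</code> error from the strict XML parser seen in the traceback:</p>

```python
import lxml.etree
import lxml.html

# Hypothetical markup: the bare "&" in the href is not well-formed XML.
text = ('<div><a href="page?a=1&b=2">link</a>'
        '<span class="zzAggregateRatingStat">3</span></div>')

# The strict XML parser rejects the input.
try:
    lxml.etree.fromstring(text)
    strict_ok = True
except lxml.etree.XMLSyntaxError as exc:
    strict_ok = False
    print("strict XML parser failed:", exc)

# The HTML parser recovers from the same input.
doc = lxml.html.fromstring(text)
ratings = [span.text
           for span in doc.xpath('//span[@class="zzAggregateRatingStat"]')]
print(ratings)
```

<p>Real-world HTML is rarely well-formed XML, so the HTML parser is usually the safer default for scraped pages.</p>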