<p>A few remarks:</p>
<ul>
<li><p><code>sel.xpath("//div[@class='my_class']")</code> selects the <code>div</code> elements.</p></li>
<li><p><code>sel.xpath("//div[@class='my_class']").extract()</code> returns the string representation of the selected elements: HTML, in a list. If the text nodes inside the selection contain Unicode code points, that content is displayed as <a href="https://docs.python.org/2/howto/unicode.html#unicode-literals-in-python-source-code" rel="nofollow"><code>\uXXXX</code> escape sequences</a>.</p></li>
</ul>
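<p>The escape sequences mentioned above are only a display form of the same characters. The following is a minimal sketch in Python 3, using the standard library only; the list <code>extracted</code> is a hypothetical stand-in for what <code>extract()</code> returns, and <code>ascii()</code> is used here to approximate how Python 2 displays a list of unicode strings:</p>

```python
# The same eight characters, written once with escapes and once literally.
escaped_literal = "\u0422\u0420\u0410\u041a\u0422\u041e\u0420\u042b"
plain_literal = "ТРАКТОРЫ"
same = escaped_literal == plain_literal  # both literals denote one string

# Hypothetical stand-in for the list returned by extract().
extracted = [escaped_literal]

# ascii() escapes every non-ASCII character, much like the display of a
# list of unicode strings in Python 2.
ascii_view = ascii(extracted)

# Joining (or printing) the strings themselves yields the readable text.
joined = "".join(extracted)

print(same)
print(ascii_view)
print(joined)
```

<p>Printing the list shows the container's representation, while printing or joining its items shows the text itself, so no decoding step should be needed.</p>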
<p>You can also ask directly for the string representation of the selected node with <a href="http://www.w3.org/TR/xpath/#function-string" rel="nofollow">XPath's <code>string()</code> function</a>:</p>
<ul>
<li><p><code>sel.xpath("string(//div[@class='my_class'])").extract()</code></p></li>
<li><p>or use the common pattern of joining the <code>text()</code> nodes into one string: <code>"".join(sel.xpath("//div[@class='my_class']//text()").extract())</code></p></li>
</ul>
<p>Note that <code>string()</code> only considers the first element matching the expression passed as its argument. From the XPath 1.0 specification:</p>
<blockquote>
<p>A node-set is converted to a string by returning the string-value of <strong>the node in the node-set that is first in document order</strong>. </p>
</blockquote>
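<p>To make the "first in document order" rule concrete, here is a rough sketch that emulates it with Python's standard library instead of Scrapy. <code>xml.etree.ElementTree</code> has no <code>string()</code> function, so the helper <code>string_value</code> and the sample markup are assumptions made for illustration only:</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two elements share the class "my_class".
DOC = """<root>
  <div class="my_class">first <b>block</b></div>
  <div class="my_class">second block</div>
</root>"""

root = ET.fromstring(DOC)

# findall() returns every matching element, in document order.
matches = root.findall(".//div[@class='my_class']")


def string_value(element):
    """Concatenate the element's descendant text nodes (its string-value)."""
    return "".join(element.itertext())


# Emulates string(//div[@class='my_class']): only the first match counts.
first_only = string_value(matches[0])

# To cover every match, take the string-value of each element separately.
per_element = [string_value(m) for m in matches]

print(first_only)
print(per_element)
```

<p>With a Scrapy selector, the per-element variant should correspond to looping over <code>sel.xpath("//div[@class='my_class']")</code> and calling <code>.xpath("string(.)").extract()</code> on each result.</p>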
<hr/>
<p>Example scrapy shell session:</p>
<pre><code>$ scrapy shell
[s] Available Scrapy objects:
[s] crawler &lt;scrapy.crawler.Crawler object at 0x7f06700bc2d0&gt;
[s] item {}
[s] settings &lt;scrapy.settings.Settings object at 0x7f06700b6f10&gt;
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
In [1]: import scrapy
In [2]: sel = scrapy.Selector(text=u'''&lt;div class="my_class"&gt;&lt;ul&gt;&lt;li class="parent"&gt;\n&lt;a href="/category/tractors-ride-on-mowers/"&gt;\n\u0422\u0420\u0410\u041a\u0422\u041e\u0420\u042b \u0438 \u0420\u0410\u0419\u0414\u0415\u0420\u042b&lt;/a&gt;\n&lt;div class="sub1"&gt;&lt;div class="str"&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="/category/lawn-tractors/" class=""&gt;\u0421\u0430\u0434\u043e\u0432\u044b\u0435 \u0442\u0440\u0430\u043a\u0442\u043e\u0440''')
In [3]: print "".join(sel.xpath('//div[@class="my_class"]//text()').extract())
ТРАКТОРЫ и РАЙДЕРЫ
Садовые трактор
In [4]: for r in sel.xpath('string(//div[@class="my_class"])').extract():
   ...:     print r
   ...:
ТРАКТОРЫ и РАЙДЕРЫ
Садовые трактор
In [5]:
</code></pre>