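<p>To get from a <code><dt></code> to the <code><dd></code> that follows it, the <code>following-sibling</code> axis is the right choice.</p>
<p><code>following-sibling::dd</code> selects all <code>dd</code> elements that come after the context node among its siblings. You therefore need the positional predicate <code>[1]</code> to restrict the XPath to the first one only.</p>
<p>So for each <code>dt</code> element returned by <code>//dl/dt</code>, select <code>following-sibling::dd[1]</code>.</p>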
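<p>To see why the <code>[1]</code> matters, here is a minimal standard-library sketch. The <code>dl</code> fragment and the helper names <code>following_dd</code> and <code>text_of</code> are made up for illustration. <code>xml.etree.ElementTree</code> does not support the <code>following-sibling</code> axis, so the helper emulates it by walking the siblings by hand; it is not how Scrapy evaluates the expression.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for this example.
# Note that the second term has two <dd> siblings.
HTML = """
<dl>
  <dt><a href="#">alpha (n.)</a></dt>
  <dd>first letter</dd>
  <dt><a href="#">beta (n.)</a></dt>
  <dd>second letter</dd>
  <dd>a test version</dd>
</dl>
""".strip()


def following_dd(dl, dt):
    """Emulate following-sibling::dd: every <dd> child of dl located after dt."""
    children = list(dl)
    after = children[children.index(dt) + 1:]
    return [el for el in after if el.tag == "dd"]


def text_of(el):
    """Rough equivalent of XPath string(): concatenated descendant text."""
    return "".join(el.itertext()).strip()


dl = ET.fromstring(HTML)
pairs = []
for dt in dl.findall("dt"):
    all_dd = following_dd(dl, dt)   # like following-sibling::dd
    first_dd = all_dd[:1]           # like following-sibling::dd[1]
    pairs.append((text_of(dt), len(all_dd), [text_of(d) for d in first_dd]))

for word, count, definition in pairs:
    print(word, "| following dd count:", count, "| first:", definition)
```

<p>Without the positional predicate, the first term would pick up all three <code>dd</code> elements, including the ones that belong to the second term. Taking only the first match pairs each term with the definition directly after it.</p>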
<p>Here is a sample session using <code>scrapy shell</code> for the term "address":</p>
<pre><code>$ scrapy shell "http://www.etymonline.com/index.php?allowed_in_frame=0&search=address&searchmode=none"
...
2014-11-26 10:34:53+0100 [default] DEBUG: Crawled (200) <GET http://www.etymonline.com/index.php?allowed_in_frame=0&search=address&searchmode=none> (referer: None)
[s] Available Scrapy objects:
[s] crawler <scrapy.crawler.Crawler object at 0x7f1396cc6950>
[s] item {}
[s] request <GET http://www.etymonline.com/index.php?allowed_in_frame=0&search=address&searchmode=none>
[s] response <200 http://www.etymonline.com/index.php?allowed_in_frame=0&search=address&searchmode=none>
[s] settings <scrapy.settings.Settings object at 0x7f1397399bd0>
[s] spider <Spider 'default' at 0x7f13966c05d0>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
In [1]: for dt in response.xpath('//dl/dt'):
   ...:     print "Word:", dt.xpath('string(a)').extract()
   ...:     print "Definition:", dt.xpath('string(following-sibling::dd[1])').extract()
   ...:     print
   ...:
Word: [u'address (n.)']
Definition: [u'1530s, "dutiful or courteous approach," from address (v.) and from French adresse. Sense of "formal speech" is from 1751. Sense of "superscription of a letter" is from 1712 and led to the meaning "place of residence" (1888).']
Word: [u'addressee (n.)']
Definition: [u'1810; see address (v.) + -ee.']
Word: [u'address (v.)']
Definition: [u'early 14c., "to guide or direct," from Old French adrecier "go straight toward; straighten, set right; point, direct" (13c.), from Vulgar Latin *addirectiare "make straight," from Latin ad "to" (see ad-) + *directiare, from Latin directus "straight, direct" (see direct (v.)). Late 14c. as "to set in order, repair, correct." Meaning "to write as a destination on a written message" is from mid-15c. Meaning "to direct spoken words (to someone)" is from late 15c. Related: Addressed; addressing.']
Word: [u'salutatorian (n.)']
Definition: [u'1841, American English, from salutatory "of the nature of a salutation," here in the specific sense "designating the welcoming address given at a college commencement" (1702) + -ian. The address was originally usually in Latin and given by the second-ranking graduating student.']
...
Word: [u'reverend (adj.)']
Definition: [u'early 15c., "worthy of respect," from Middle French reverend, from Latin reverendus "(he who is) to be respected," gerundive of revereri (see reverence). As a form of address for clergymen, it is attested from late 15c.; earlier reverent (late 14c. in this sense). Abbreviation Rev. is attested from 1721, earlier Revd. (1690s). Very Reverend is used of deans, Right Reverend of bishops, Most Reverend of archbishops.']
Word: [u'nun (n.)']
Definition: [u'Old English nunne "nun, vestal, pagan priestess, woman devoted to religious life under vows," from Late Latin nonna "nun, tutor," originally (along with masc. nonnus) a term of address to elderly persons, perhaps from children\'s speech, reminiscent of nana (compare Sanskrit nona, Persian nana "mother," Greek nanna "aunt," Serbo-Croatian nena "mother," Italian nonna, Welsh nain "grandmother;" see nanny).']
In [2]:
</code></pre>