https://bankchart.kz/spravochniki/reytingi_cbr/2/2019/7
How can I get the text from each column, that is, from the last three <div class="col-currency-rate"> blocks inside each <div class="row">? I already have the table rows, but what is the next step?
>>> tree.xpath('//div[@class="table-currency"]/div[@class="row"]')
[<Element div at 0x7fcac2a47ba8>, <Element div at 0x7fcac2a47c00>, <Element div at 0x7fcac2a47c58>, <Element div at 0x7fcac2a47cb0>, <Element div at 0x7fcac2a47d08>, <Element div at 0x7fcac2a47d60>, <Element div at 0x7fcac2a47db8>, <Element div at 0x7fcac2a47e10>, <Element div at 0x7fcac2a47e68>, <Element div at 0x7fcac2a47ec0>, <Element div at 0x7fcac2a47f18>, <Element div at 0x7fcac2a47f70>, <Element div at 0x7fcac2a47fc8>, <Element div at 0x7fcac2a4e050>, <Element div at 0x7fcac2a4e0a8>, <Element div at 0x7fcac2a4e100>, <Element div at 0x7fcac2a4e158>, <Element div at 0x7fcac2a4e1b0>, <Element div at 0x7fcac2a4e208>, <Element div at 0x7fcac2a4e260>, <Element div at 0x7fcac2a4e2b8>, <Element div at 0x7fcac2a4e310>, <Element div at 0x7fcac2a4e368>, <Element div at 0x7fcac2a4e3c0>, <Element div at 0x7fcac2a4e418>, <Element div at 0x7fcac2a4e470>, <Element div at 0x7fcac2a4e4c8>, <Element div at 0x7fcac2a4e520>]
>>> len(tree.xpath('//div[@class="table-currency"]/div[@class="row"]'))
28
html
<div class="table-currency">
<div class="row"><div class="col col-currency">
2.
<img rel="nofollow" src="https://st6.prosto.im/cache/st6/1/0/5/5/1055/1055.jpg" width="16" height="16" alt="">
<a target="_blank" href="/spravochniki/reytingi_banka/2/1057">
ForteBank
</a></div><div class="col col-headery col-currency-rate"><p>Активы банков, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост за июль 2019 года, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост с начала 2019 года, тыс. тенге</p></div><div class="col col-currency-rate"><p>1 985 956 865</p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+89 298 547</p><p></p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+390 999 868</p><p></p></div></div>
<div class="row"><div class="col col-currency">
3.
<img rel="nofollow" src="https://st6.prosto.im/cache/st6/1/0/9/5/1095/1095.png" width="16" height="16" alt="">
<a target="_blank" href="/spravochniki/reytingi_banka/2/1076">
Сбербанк России
</a></div><div class="col col-headery col-currency-rate"><p>Активы банков, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост за июль 2019 года, тыс. тенге</p></div><div class="col col-headery col-currency-rate"><p>Прирост с начала 2019 года, тыс. тенге</p></div><div class="col col-currency-rate"><p>1 983 840 092</p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+88 853 745</p><p></p></div><div class="col col-currency-rate"><p></p><p class="arrow-up">+119 145 827</p><p></p></div></div>
</div>
Try this.
Check whether this works for you.
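One possible way to do this is sketched below. It assumes lxml is installed and that the live page keeps the markup shown in the question. SAMPLE and last_three_columns are hypothetical names introduced for illustration, not code from the original answer. The idea is to select every col-currency-rate div in a row and keep the last three with a Python slice.

```python
from lxml import html

# Trimmed copy of one row from the question's markup (assumption: the real
# page has the same structure: three header cells followed by three values).
SAMPLE = """
<div class="table-currency">
<div class="row"><div class="col col-currency">
2.
<a target="_blank" href="/spravochniki/reytingi_banka/2/1057">
ForteBank
</a></div>
<div class="col col-headery col-currency-rate"><p>Активы банков, тыс. тенге</p></div>
<div class="col col-headery col-currency-rate"><p>Прирост за июль 2019 года, тыс. тенге</p></div>
<div class="col col-headery col-currency-rate"><p>Прирост с начала 2019 года, тыс. тенге</p></div>
<div class="col col-currency-rate"><p>1 985 956 865</p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+89 298 547</p><p></p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+390 999 868</p><p></p></div>
</div>
</div>
"""


def last_three_columns(row):
    """Return the stripped text of the last three col-currency-rate cells."""
    # "./div[...]" is relative to the row, so only its own cells are matched.
    cells = row.xpath('./div[contains(@class, "col-currency-rate")]')
    # text_content() joins the text of all nested <p> elements.
    return [cell.text_content().strip() for cell in cells[-3:]]


tree = html.fromstring(SAMPLE)
rows = tree.xpath('//div[@class="table-currency"]/div[@class="row"]')
rows_text = [last_three_columns(row) for row in rows]
print(rows_text)
```

The slice keeps the selection logic in Python, which is easy to read but means the XPath itself still returns the header cells.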
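A fuller solution built on two specific XPath expressions.
Details:
descendant::a/text() - extracts the text node of the a element that is a descendant of the current row div.
div[contains(@class, "col-currency-rate")][position() > last() - 3] - selects the div elements whose class attribute contains col-currency-rate and whose position is among the last three of those matches. last() is the position of the last element, so position() > last() - 3 keeps the third-from-last through the last.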
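The two expressions described above could be combined roughly as follows. This is a sketch under assumptions, not the answer's original code: it assumes lxml is available and that each row has the structure shown in the question, and the names SAMPLE, ROW_XPATH, NAME_XPATH, LAST3_XPATH and parse_rows are hypothetical.

```python
from lxml import html

# Two shortened rows modelled on the question's HTML (header text abbreviated
# to "h1".."h3" for brevity; this is an assumption, not the real page).
SAMPLE = """
<div class="table-currency">
<div class="row"><div class="col col-currency">2.
<a href="/spravochniki/reytingi_banka/2/1057">
ForteBank
</a></div>
<div class="col col-headery col-currency-rate"><p>h1</p></div>
<div class="col col-headery col-currency-rate"><p>h2</p></div>
<div class="col col-headery col-currency-rate"><p>h3</p></div>
<div class="col col-currency-rate"><p>1 985 956 865</p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+89 298 547</p><p></p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+390 999 868</p><p></p></div>
</div>
<div class="row"><div class="col col-currency">3.
<a href="/spravochniki/reytingi_banka/2/1076">
Сбербанк России
</a></div>
<div class="col col-headery col-currency-rate"><p>h1</p></div>
<div class="col col-headery col-currency-rate"><p>h2</p></div>
<div class="col col-headery col-currency-rate"><p>h3</p></div>
<div class="col col-currency-rate"><p>1 983 840 092</p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+88 853 745</p><p></p></div>
<div class="col col-currency-rate"><p></p><p class="arrow-up">+119 145 827</p><p></p></div>
</div>
</div>
"""

ROW_XPATH = '//div[@class="table-currency"]/div[@class="row"]'
NAME_XPATH = 'descendant::a/text()'
# The second predicate is evaluated on the divs that passed the first one,
# so position() and last() count only the col-currency-rate cells.
LAST3_XPATH = 'div[contains(@class, "col-currency-rate")][position() > last() - 3]'


def parse_rows(tree):
    """Return (bank name, [three column values]) for every row."""
    parsed = []
    for row in tree.xpath(ROW_XPATH):
        name = "".join(row.xpath(NAME_XPATH)).strip()
        values = [div.text_content().strip() for div in row.xpath(LAST3_XPATH)]
        parsed.append((name, values))
    return parsed


parsed = parse_rows(html.fromstring(SAMPLE))
for name, values in parsed:
    print(name, values)
```

Because the row expressions are relative (they do not start with //), each one is evaluated against the current row only, so values from different banks are never mixed.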