Extracting the alt attribute with XPath and Python

Posted 2024-09-30 20:23:48


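I am trying to extract the 'alt' attribute of the images in the code block below, but only for images whose surrounding div has the class "onIcon". (Examples: Modelcontract or Kabeltelevisie)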

<tbody> <tr class="odd"><td><div class="roomdetail_icon onIcon Modelcontract"><a href="/nl/modelcontract"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_modelcontract_on.png" alt="Modelcontract" /></a></div></td><td><div class="roomdetail_icon onIcon Kamer"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_room_on.png" alt="Kamer" /></div></td><td><div class="roomdetail_icon offIcon Studio"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_studio_off.png" alt="Studio" /></div></td><td><div class="roomdetail_icon offIcon Appartement"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_apartment_off.png" alt="Appartement" /></div></td><td><div class="roomdetail_icon onIcon Internet"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_internet_on.png" alt="Internet" /></div></td> </tr> <tr class="even"><td><div class="roomdetail_icon onIcon Kabeltelevisie"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_cable_tv_on.png" alt="Kabeltelevisie" /></div></td><td><div class="roomdetail_icon onIcon Gemeenschappelijke leefruimte"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_shared_living_space_on.png" alt="Gemeenschappelijke leefruimte" /></div></td><td><div class="roomdetail_icon onIcon Tuin/terras"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_garden_on.png" alt="Tuin/terras" /></div></td><td><div class="roomdetail_icon onIcon Fietsenstalling"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_bicycle_shed_on.png" alt="Fietsenstalling" /></div></td><td><div class="roomdetail_icon offIcon Beddengoed"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_bedding_off.png" alt="Beddengoed" /></div></td> </tr> <tr class="odd"><td><div class="roomdetail_icon onIcon Keukengerei"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_kitchen_utensils_on.png" alt="Keukengerei" /></div></td><td><div class="roomdetail_icon offIcon Muziekinstrumenten toegelaten"><img 
src="/sites/all/themes/kotweb/images/icons/grid/grid_musical_instruments_allowed_off.png" alt="Muziekinstrumenten toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Roken niet toegelaten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_smoking_allowed_off.png" alt="Roken niet toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Huisdieren wel/niet toegelaten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_animals_allowed_off.png" alt="Huisdieren wel/niet toegelaten" /></div></td><td><div class="roomdetail_icon offIcon Bemeubeld"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_furnished_off.png" alt="Bemeubeld" /></div></td> </tr> <tr class="even"><td><div class="roomdetail_icon offIcon Toegankelijk voor rolstoelgebruikers"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_wheelchair_accssible_off.png" alt="Toegankelijk voor rolstoelgebruikers" /></div></td><td><div class="roomdetail_icon offIcon Geschikt voor allergiepatienten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_allergies_off.png" alt="Geschikt voor allergiepatienten" /></div></td><td><div class="roomdetail_icon offIcon Verhuur aan niet-studenten"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_non_students_off.png" alt="Verhuur aan niet-studenten" /></div></td><td><div class="roomdetail_icon offIcon Straatkant"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_street_off.png" alt="Straatkant" /></div></td><td><div class="roomdetail_icon onIcon Niet aan straatkant"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_notstreet_on.png" alt="Niet aan straatkant" /></div></td> </tr> <tr class="odd"><td><div class="roomdetail_icon onIcon Building regulations"><img src="/sites/all/themes/kotweb/images/icons/grid/grid_building_regulations_on.png" alt="Building regulations" /></div></td> </tr> </tbody>

I am using XPath in Python and came up with the following query:

^{pr2}$

Unfortunately, this returns an empty list ([]).

I have been stuck on this for a long time: what am I doing wrong?

Kind regards, Thomas


1 answer
User
#1 · Posted 2024-09-30 20:23:48
response.xpath("//div[@class[contains(., 'onIcon')]]//img/@alt")

The value of the class attribute is roomdetail_icon onIcon Modelcontract, not just onIcon, so you should use the contains function.

The . refers to the current context node (@class).

Output:

^{pr2}$
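As a rough cross-check of what the contains() query selects, here is a minimal sketch that mimics it with only the standard library's html.parser instead of XPath. It is not the Scrapy call from the answer. The class name OnIconAltCollector and the three-cell SAMPLE string (cut down from the table in the question) are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Shortened sample: three cells copied from the table in the question.
SAMPLE = (
    '<div class="roomdetail_icon onIcon Modelcontract"><a href="/nl/modelcontract">'
    '<img src="/sites/all/themes/kotweb/images/icons/grid/grid_modelcontract_on.png" '
    'alt="Modelcontract" /></a></div>'
    '<div class="roomdetail_icon offIcon Studio">'
    '<img src="/sites/all/themes/kotweb/images/icons/grid/grid_studio_off.png" '
    'alt="Studio" /></div>'
    '<div class="roomdetail_icon onIcon Kabeltelevisie">'
    '<img src="/sites/all/themes/kotweb/images/icons/grid/grid_cable_tv_on.png" '
    'alt="Kabeltelevisie" /></div>'
)


class OnIconAltCollector(HTMLParser):
    """Hypothetical helper: collects the alt text of every <img> that sits
    inside a <div> whose class attribute contains the substring 'onIcon'."""

    def __init__(self):
        super().__init__()
        self.div_stack = []  # one bool per open <div>: does its class contain 'onIcon'?
        self.alts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Substring test, like contains(@class, 'onIcon') in the answer.
            self.div_stack.append("onIcon" in (attrs.get("class") or ""))
        elif tag == "img" and any(self.div_stack) and "alt" in attrs:
            self.alts.append(attrs["alt"])

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()


collector = OnIconAltCollector()
collector.feed(SAMPLE)
collector.close()
print(collector.alts)  # ['Modelcontract', 'Kabeltelevisie']
```

The offIcon cell is skipped because the string "offIcon" does not contain "onIcon", which is the same reason the substring test in the XPath query does not pick it up.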

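Whenever you write [@class='onIcon'], XPath goes through the following steps:

  1. XPath sees the string 'onIcon', so it converts @class to a string in order to compare the two.
  2. The conversion uses the string() function; string(@class) returns the full value of the class attribute, roomdetail_icon onIcon Modelcontract.
  3. Finally, XPath evaluates 'roomdetail_icon onIcon Modelcontract' = 'onIcon', which is false, so the div is not selected.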
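The steps above can be mirrored in plain Python. This is only an analogy to show why the exact comparison fails and the substring test succeeds. It does not run any XPath, and the variable names are made up for illustration.

```python
# The value that string(@class) yields for the first div in the question.
class_value = "roomdetail_icon onIcon Modelcontract"

# [@class='onIcon'] compares the whole attribute value, so it is False here.
exact_match = class_value == "onIcon"

# contains(@class, 'onIcon') is a substring test, so it is True here.
substring_match = "onIcon" in class_value

# Stricter alternative (an assumption, not part of the answer): compare whole
# class tokens, which avoids matching a longer name such as 'onIconLarge'.
token_match = "onIcon" in class_value.split()

print(exact_match, substring_match, token_match)  # False True True
```

In this table the substring test is enough, because the only other state class is offIcon, which does not contain onIcon.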
