XPath and CSS selectors fail to extract dynamic content (using Python and Selenium)

<p class="image" id="thumb6" data-type="partition">
  <svg class="canvas" width="256" height="220" viewBox="0 0 256 220">...</svg>
  <div class="explanation" style="position: absolute; width: 110px; text-align: center; top: 82px; left: 73px;">10%</div>
</p>

driver.find_element_by_class_name('explanation')
driver.find_element_by_xpath("//div[@class='explanation']")
# Trying to reach the parent element:
driver.find_element_by_xpath("//p[@id='thumb6']")
driver.find_element_by_xpath("/html[1]/body[1]/div[1]/div[1]/a[7]/p[1]/svg[1]/g[1]/rect[1]")

2 answers

User

#1 · edited 2024-09-21 01:12:55

I don't think you need Selenium for this. First, build a list of URLs. The pattern is:

https://rcc.isbe.net/api/reportcardservice/(en)/Domain(school)/Id(340491250130001)/(Profile)/(2019)/Table/(Xml)

Here Id(340491250130001) is the ID of an individual school, and (2019) is the year of interest. If needed, you can specify a range of years instead, e.g. (2016-2019).
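The URL list described above can be built with a small helper. This is only a sketch: the names `URL_TEMPLATE` and `build_urls` are made up for illustration, and the single school ID is the one from the example URL. You would supply your own list of IDs.

```python
# URL pattern copied from the example above, with placeholders for the
# school ID and the year (or year range).
URL_TEMPLATE = (
    "https://rcc.isbe.net/api/reportcardservice/(en)/Domain(school)"
    "/Id({school_id})/(Profile)/({year})/Table/(Xml)"
)


def build_urls(school_ids, year):
    """Return one report-card URL per school ID (hypothetical helper).

    `year` may be a single year such as 2019 or a range string
    such as "2016-2019".
    """
    return [URL_TEMPLATE.format(school_id=sid, year=year) for sid in school_ids]


# Only this ID appears in the original post; add your own IDs here.
school_ids = ["340491250130001"]
urls = build_urls(school_ids, 2019)
print(urls[0])
```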

For each URL in the list, you need to get the resource URL that contains the data. XPath:

//resourceUrl

You will get a result like this:

https://sec.isbe.net/iircapi/tempData/XML/File1992993354.xml

For each XML file, you can get the chronic absenteeism rate with:

//ChronicAbsenteeism

For example:

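from lxml import html
import requests

# Step 1: fetch the index document for one school and year.
data = requests.get('https://rcc.isbe.net/api/reportcardservice/(en)/Domain(school)/Id(340491250130001)/(Profile)/(2019)/Table/(Xml)')
root = html.fromstring(data.content)
# lxml's HTML parser lowercases tag names, so <resourceUrl> is matched as "resourceurl".
xml = root.xpath('//resourceurl/text()')[0]

# Step 2: fetch the XML data file and read the chronic absenteeism value.
source = requests.get(xml)
tree = html.fromstring(source.content)
print(tree.xpath('//chronicabsenteeism/text()')[0])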

Output: 10
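To repeat the two-step lookup for many URLs, it may help to wrap it in a function. The sketch below swaps in the standard library's `xml.etree.ElementTree` instead of `lxml.html`, so it assumes both responses are well-formed XML. The names `first_text` and `chronic_absenteeism` are hypothetical, and the `fetch` parameter is any callable that returns the body of a URL as bytes. The sample documents in the demo are invented and only mimic the two tag names mentioned above.

```python
import xml.etree.ElementTree as ET


def first_text(xml_bytes, tag):
    """Return the text of the first element named `tag`, or None.

    The comparison ignores case and any XML namespace prefix.
    """
    root = ET.fromstring(xml_bytes)
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name.lower() == tag.lower():
            return (el.text or "").strip()
    return None


def chronic_absenteeism(url, fetch):
    """Follow the resource URL found at `url` and return the rate as a string."""
    resource_url = first_text(fetch(url), "resourceUrl")
    if resource_url is None:
        return None
    return first_text(fetch(resource_url), "ChronicAbsenteeism")


# Offline demo with made-up documents (the real responses may look different).
fake_pages = {
    "https://example.invalid/index":
        b"<Result><resourceUrl>https://example.invalid/file.xml</resourceUrl></Result>",
    "https://example.invalid/file.xml":
        b"<School><ChronicAbsenteeism>10</ChronicAbsenteeism></School>",
}
print(chronic_absenteeism("https://example.invalid/index", fake_pages.__getitem__))
```

Against the live service, `fetch` could be something like `lambda u: requests.get(u, timeout=30).content`. Looping over the URL list is then a matter of calling `chronic_absenteeism` once per URL.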

User

#2 · edited 2024-09-21 01:12:55

Here is a quick way to solve this:

driver.find_element_by_xpath("//div[@class='explanation']").text  # .text is a property, not a method; it returns the visible text of the div
