Get the next row's value based on the current row's value

url = 'https://en.wikipedia.org/wiki/Arcadia'
browser.get(url)
vcard_element = browser.find_element_by_css_selector('#mw-content-text > div > table.infobox.geography.vcard').find_element_by_xpath('tbody')
for row in vcard_element.find_elements_by_xpath('tr'):
    try:
        if 'Population' in row.find_element_by_xpath('th').text:
            print(row.find_element_by_xpath('th').text)
    except Exception:
        pass

2 Answers

User

#1 · edited 2024-05-19 22:25:45

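While you can certainly do this with selenium, I'd personally recommend requests and lxml, since they are much lighter than selenium and do the job just as well. I found that the following works for the few regions I tested: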

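import requests
from lxml import html

url = 'https://en.wikipedia.org/wiki/Arcadia'
population = None

try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # All rows of the infobox table
    infocard_rows = html.fromstring(response.content).xpath(
        "//table[@class='infobox geography vcard']/tbody/tr")
except requests.RequestException:
    print('Error retrieving information from ' + url)
    infocard_rows = []

# Find the header row mentioning 'Population'; the value sits in the row after it
for i, row in enumerate(infocard_rows[:-1]):
    header = row.find('th')
    if header is not None and 'Population' in header.text_content():
        population = infocard_rows[i + 1].findtext('td')
        break

if population is None:
    print('Unable to find population')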

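Essentially, html.fromstring().xpath() fetches every row of the infobox geography vcard table at that path. The second block then looks for the th whose text is Population and reads the td text from the row after it (the total population).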
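The row-pairing idea described above can be tried offline. The following is only a minimal sketch: the markup is a hand-written, simplified stand-in for the infobox (not the real page), the helper name value_after_header is hypothetical, and it uses the standard library's xml.etree.ElementTree instead of lxml so it runs without extra installs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the infobox markup (not the real page).
SAMPLE = """
<table class="infobox geography vcard">
  <tbody>
    <tr><th>Country</th><td>Greece</td></tr>
    <tr><th>Population (2011)</th></tr>
    <tr><th>Total</th><td>86,685</td></tr>
  </tbody>
</table>
"""


def value_after_header(rows, label):
    """Return the td text of the row that follows the first row whose th contains label."""
    # zip(rows, rows[1:]) walks the rows as (current, next) pairs.
    for current, following in zip(rows, rows[1:]):
        header = current.find('th')
        if header is not None and label in ''.join(header.itertext()):
            return following.findtext('td')
    return None


rows = ET.fromstring(SAMPLE).findall('./tbody/tr')
population = value_after_header(rows, 'Population')
print(population)
```

Pairing each row with its successor avoids index arithmetic, and returning None when nothing matches keeps a missing header from silently reading the wrong row.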

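Hope this helps, even if it isn't quite what you asked for! Selenium is usually the tool to reach for when you need to reproduce browser behavior or inspect JavaScript-rendered elements. You can certainly use it here too.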

User

#2 · edited 2024-05-19 22:25:45

Use ./following::tr[1] or ./following-sibling::tr[1]:

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

url = 'https://en.wikipedia.org/wiki/Arcadia'
browser = webdriver.Chrome()
browser.get(url)

# Selenium 4 locator style: find_element(By..., ...) replaces find_element_by_*
vcard_element = browser.find_element(By.CSS_SELECTOR, '#mw-content-text > div > table.infobox.geography.vcard').find_element(By.XPATH, 'tbody')

for row in vcard_element.find_elements(By.XPATH, 'tr'):
    try:
        header = row.find_element(By.XPATH, 'th').text
        if 'Population' in header:
            print(header)
            print(row.find_element(By.XPATH, './following::tr[1]').text)  # whole next row
            print(row.find_element(By.XPATH, './following::tr[1]/td').text)  # only the number
    except NoSuchElementException:
        pass  # row has no th, or there is no following row

Console output:

Population (2011)
 • Total 86,685
86,685
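The two axes are not interchangeable in every layout: following-sibling:: only sees later rows under the same parent, while following:: sees any later node in document order that is not a descendant. Here is a small sketch of that difference using lxml; the markup and the helper next_ids are made up for illustration and do not come from the real page.

```python
from lxml import html

# Hypothetical markup with two tbody sections, so the axes can disagree.
SAMPLE = """
<table>
  <tbody>
    <tr id="pop"><th>Population (2011)</th></tr>
    <tr id="total"><th>Total</th><td>86,685</td></tr>
  </tbody>
  <tbody>
    <tr id="area"><th>Area</th><td>example</td></tr>
  </tbody>
</table>
"""

doc = html.fromstring(SAMPLE)


def next_ids(row_id, axis):
    """Return the ids of the first tr reached from the given row along the given axis."""
    row = doc.xpath("//tr[@id=$rid]", rid=row_id)[0]
    return [tr.get('id') for tr in row.xpath('./%s::tr[1]' % axis)]


# Inside one tbody both axes land on the same next row.
same_parent = (next_ids('pop', 'following-sibling'), next_ids('pop', 'following'))
# On the last row of a tbody only following:: crosses into the next section.
across_parent = (next_ids('total', 'following-sibling'), next_ids('total', 'following'))
print(same_parent, across_parent)
```

For the infobox in the question the header row and the value row share a parent, so either axis should work; following-sibling:: is the stricter choice when an unrelated later table must not be matched by accident.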
