lxml xpath returns an empty list

import requests
from lxml import html

finalurl = f"https://www.twitter.com/user/status/{id}"
response = requests.get(finalurl, allow_redirects=True)
tree = html.fromstring(response.content)
print("getting photolink")
postPhotoLink = tree.xpath('//*[@id="react-root"]/div/div/div/main/div/div/div/div[1]/div/div[2]/div/section/div/div/div/div[1]/div/article/div/div[4]/div/div/div/a/div/div[2]/div/img/@src')
print(postPhotoLink)

2 answers

User

#1 · edited 2024-06-26 14:27:16

Try this XPath; it should work:

(//img[@class='css-9pa8cd'])[2]/@src

If that does not work, try this XPath instead, because the markup you get when you fetch the HTML differs from what the browser shows:

//img[@data-aria-label-part='']/@src

Selenium is not needed.
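One way to check whether a plain fetch is enough is to count how many nodes the XPath selects in the HTML you actually received. The sketch below is only an illustration: `count_matches` is a hypothetical helper, and both HTML strings are made-up stand-ins (an empty application shell versus a rendered page), not real responses from the site.

```python
from lxml import html


def count_matches(page_source: str, xpath: str) -> int:
    """Return how many nodes the XPath selects in the given HTML text."""
    tree = html.fromstring(page_source)
    return len(tree.xpath(xpath))


# Made-up stand-in for a fetch of a JavaScript-rendered page:
# only an empty shell, with no <img> elements yet.
shell = '<html><body><div id="react-root"></div></body></html>'

# Made-up stand-in for the same page after scripts have filled it in.
rendered = (
    '<html><body><div id="react-root">'
    '<img class="css-9pa8cd" src="https://example.com/avatar.jpg">'
    '<img class="css-9pa8cd" src="https://example.com/photo.jpg">'
    '</div></body></html>'
)

xpath = "//img[@class='css-9pa8cd']/@src"
shell_count = count_matches(shell, xpath)        # nothing to select in the shell
rendered_count = count_matches(rendered, xpath)  # both images are selected
print(shell_count, rendered_count)
```

If the count is zero for the text returned by `requests`, the XPath is probably not the problem; the images are simply absent from that HTML.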

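User

#2 · edited 2024-06-26 14:27:16

Thanks, everyone, for the help. I had to use selenium for this, because requests alone did not work. I still have some trouble selecting only the second img within the XPath itself, so I pick it manually from the list, and that works.

Full working code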

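from lxml import html
from selenium import webdriver
import time

finalurl = "https://twitter.com/iForex_com/status/1019547735614255104"

# Load the page in a real browser so that its JavaScript runs.
browser = webdriver.Safari()
browser.get(finalurl)
time.sleep(1)  # crude wait for rendering; may need to be longer on a slow connection

# Parse the rendered DOM rather than the raw HTTP response.
tree = html.fromstring(browser.page_source)
print("getting photolink")

# The XPath matches every image with this class; the second match is the post photo.
postPhotoLink = tree.xpath('//img[@class="css-9pa8cd"]/@src')
print(postPhotoLink[1])

browser.quit()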
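On selecting only the second img inside the XPath: the parentheses in `(//img[...])[2]` matter. Without them, `[2]` is evaluated per parent element, so it picks an image that is the second match among its siblings, not the second match in the whole document. The sketch below uses made-up markup and file names purely to show the difference.

```python
from lxml import html

# Made-up markup: two containers holding images that share one class.
doc = html.fromstring(
    '<html><body>'
    '<div><img class="css-9pa8cd" src="a.jpg"></div>'
    '<div>'
    '<img class="css-9pa8cd" src="b.jpg">'
    '<img class="css-9pa8cd" src="c.jpg">'
    '</div>'
    '</body></html>'
)

# Every match, in document order: a.jpg, b.jpg, c.jpg.
all_srcs = doc.xpath('//img[@class="css-9pa8cd"]/@src')

# Parenthesized: index into the whole result set, so this is b.jpg.
second_overall = doc.xpath('(//img[@class="css-9pa8cd"])[2]/@src')

# Unparenthesized: second matching img within each parent, so this is c.jpg.
second_in_parent = doc.xpath('//img[@class="css-9pa8cd"][2]/@src')

# The result is still a list; guard against an empty one before indexing.
photo = second_overall[0] if second_overall else None
print(all_srcs, second_overall, second_in_parent, photo)
```

Indexing the Python list with `postPhotoLink[1]`, as in the code above, gives the same element as the parenthesized form, because XPath positions start at 1 and Python indexes start at 0.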
