Scraping the second page in Python returns the first page's data

browser.get("https://XXXXXXXXX/0_9b34?P=2")
innerHTML = browser.execute_script("return document.body.innerHTML")  # type = str; returns the inner HTML as a string
Eroom_M7_htmlpage = innerHTML
soup = BeautifulSoup(Eroom_M7_htmlpage, 'html.parser')  # type = bs4.BeautifulSoup
htmlprettified = soup.prettify()  # type = str
project_items = soup.find_all('td', attrs={'headers': 'ID Item'})

1 answer

User

#1 · Posted 2024-09-30 06:30:18

innerHTML = browser.execute_script("return document.body.innerHTML")      #type = str    #returns the inner HTML as a string
Eroom_M7_htmlpage = innerHTML

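You should return the page source rather than the result of a JavaScript call.

.page_source is the attribute you want to use.

So run whatever JavaScript you need, then capture the HTML: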

Eroom_M7_htmlpage = browser.page_source

rather than assigning it from innerHTML.
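
As a rough sketch of that advice, the helpers below load a URL and return page_source, and read the page number out of a URL so the requested page can be compared with the page the browser actually ended up on. The names fetch_page_source and page_number are made up for illustration, the P query parameter is taken from the URL in the question, and driver is assumed to be a Selenium WebDriver such as the browser object above.

```python
from urllib.parse import urlsplit, parse_qs


def fetch_page_source(driver, url):
    """Load url in a Selenium-style driver and return the page HTML.

    Hypothetical helper: driver is assumed to expose get() and the
    page_source attribute, as Selenium WebDriver objects do.
    """
    driver.get(url)
    # page_source is an attribute, so there are no parentheses here.
    return driver.page_source


def page_number(url, param="P"):
    """Return the integer page number in the query string, or None.

    The parameter name "P" comes from the question's URL and is an
    assumption about how the site paginates.
    """
    values = parse_qs(urlsplit(url).query).get(param)
    return int(values[0]) if values else None
```

With a real driver, calling fetch_page_source(browser, url) and then comparing page_number(browser.current_url) with page_number(url) may help show whether the site sent the browser back to page 1, which is one possible cause of the symptom in the question.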

A basic example of using Selenium:

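from selenium import webdriver

options = webdriver.ChromeOptions()
# Chrome command-line switches need the leading "--"
options.add_argument('--ignore-certificate-errors')
options.add_argument('--test-type')
options.binary_location = "/usr/bin/chromium"  # path to the browser binary; adjust for your system
# options= replaces the deprecated chrome_options= keyword
driver = webdriver.Chrome(options=options)
driver.get('https://python.org')

# page_source holds the HTML of the page currently loaded in the browser
html = driver.page_source
print(html)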

It will output the webpage source, which is stored in the variable html.
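
Once the HTML is in a string, the cells the question looks for can be pulled out of it. The sketch below does this with the standard library's html.parser as a dependency-free stand-in for the question's soup.find_all('td', attrs={'headers': 'ID Item'}) call. It is not BeautifulSoup, and the names CellCollector and extract_cells are invented for illustration. It assumes the matching cells do not contain nested tables.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> whose headers attribute matches."""

    def __init__(self, headers_value):
        super().__init__()
        self.headers_value = headers_value
        self.cells = []       # text of each finished matching cell
        self._buffer = None   # text chunks while inside a matching <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("headers") == self.headers_value:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


def extract_cells(html, headers_value="ID Item"):
    """Return the stripped text of each matching <td> in html."""
    collector = CellCollector(headers_value)
    collector.feed(html)
    collector.close()
    return collector.cells
```

For example, extract_cells(driver.page_source) would give a list of cell texts for the page the driver currently has loaded.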
