How do I scrape a page's title?

Posted 2024-06-26 13:08:04


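I can't work out how to scrape the title of a web page. My code is below (it is very simple), but I don't know what is wrong with it. If you have any ideas, please let me know. Thanks.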


from selenium import webdriver
url="https://sukebei.nyaa.si/?s=seeders&o=desc&p=1"
driver_path = "C:\\webdriver\\chromedriver.exe"
option = webdriver.ChromeOptions()
driver = webdriver.Chrome(driver_path, options=option)
driver.implicitly_wait(10)
driver.get(url)
print(driver.find_element_by_xpath("/html/head/title").text)

2 Answers
from selenium import webdriver

# Same setup as in the question; only the last line changes.
url = "https://sukebei.nyaa.si/?s=seeders&o=desc&p=1"
driver_path = "C:\\webdriver\\chromedriver.exe"
option = webdriver.ChromeOptions()
driver = webdriver.Chrome(driver_path, options=option)
driver.implicitly_wait(10)
driver.get(url)
# Read the title from the driver instead of from the <title> element's .text
print(driver.title)
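
For context on why the question's version prints nothing: a WebElement's `.text` returns only the text that is actually rendered, and `<title>` lives in `<head>` and is never displayed, so `.text` comes back as an empty string. The sketch below contrasts two ways around that. The helper names are hypothetical, `driver` is assumed to be a Selenium WebDriver that has already loaded the page, and `get_attribute("textContent")` is one common workaround rather than the only option.

```python
# Hypothetical helpers for illustration; `driver` is assumed to be a
# Selenium WebDriver that has already loaded the page.

def title_from_property(driver):
    """Read the title via the WebDriver `title` property, as in the answer above."""
    return driver.title


def title_from_element(driver):
    """Read the <title> element's raw text content instead of its rendered text."""
    # "xpath" is the string value behind By.XPATH; passing it directly
    # keeps this sketch free of imports.
    element = driver.find_element("xpath", "/html/head/title")
    # element.text would be "" because <title> is never displayed;
    # textContent is a DOM property that does not depend on visibility.
    return element.get_attribute("textContent")
```

In most cases `driver.title` is the simpler choice; the element-based version is mainly useful to see what went wrong with `.text`.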

To scrape the page title, first use WebDriverWait to wait until the torrent-list <table> is visible. You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    driver.get('https://sukebei.nyaa.si/?s=seeders&o=desc&p=1')
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.torrent-list")))
    print(driver.title)
    
  • Using XPATH:

    driver.get('https://sukebei.nyaa.si/?s=seeders&o=desc&p=1')
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[contains(@class, 'torrent-list')]")))
    print(driver.title)
    
  • Console output:

    Browse :: Sukebei
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
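
The snippets above leave out driver creation and cleanup, so here is a minimal sketch that puts the pieces into one function. It assumes Selenium 4.6 or later, where Selenium Manager can locate chromedriver on its own (on older versions you would still pass the driver path as in the question). The names `fetch_title`, `URL`, and `TABLE_LOCATOR` are hypothetical, chosen for this example.

```python
URL = "https://sukebei.nyaa.si/?s=seeders&o=desc&p=1"
# ("css selector", ...) is the same tuple as (By.CSS_SELECTOR, ...).
TABLE_LOCATOR = ("css selector", "table.torrent-list")


def fetch_title(url=URL, locator=TABLE_LOCATOR, timeout=20):
    """Open `url` in Chrome, wait until `locator` is visible, return the page title."""
    # Imported inside the function so this module can be imported
    # even where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Assumption: Selenium 4.6+, so no explicit chromedriver path is needed.
    driver = webdriver.Chrome(options=webdriver.ChromeOptions())
    try:
        driver.get(url)
        # Raises TimeoutException if the table is not visible within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located(locator)
        )
        return driver.title
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling `print(fetch_title())` should print the same title as the console output shown above, provided the page loads and the table appears within the timeout.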
    
