Web scraping returns 0 records

Posted 2024-04-23 19:46:11


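I have been working on this code for a long time and cannot find a solution. Using Selenium, I first log in to the site and then scrape it. But the scrape returns 0 records, which is not what I expect.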

import pandas as pd
import requests
from bs4 import BeautifulSoup
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
# Github credentials
username = "xxxxxxxxxxxxxx@gmail.com"
password = "xxxxxxxxxxxxxxxxx"
# initialize the Chrome driver
driver = webdriver.Chrome(executable_path=r'C:\Users\snyder\seaborn-data\chromedriver.exe')
driver.maximize_window()
driver.implicitly_wait(5)
driver.get("https://www.finq.com/en/login")
wait = WebDriverWait(driver, 10)
wait.until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[id='login']")))
# find username/email field and send the username itself to the input field
driver.find_element_by_id("login_username").send_keys(username)
# find password input field and insert password as well
driver.find_element_by_id("login_password").send_keys(password)
# click login button
#driver.find_element_by_name("submit").click()
driver.find_element_by_css_selector("button[id='submit_login']").click()
# wait the ready state to be complete
WebDriverWait(driver=driver, timeout=10).until(
    lambda x: x.execute_script("return document.readyState === 'complete'")
)
error_message = "Incorrect username or password."
# get the errors (if there are)
errors = driver.find_elements_by_class_name("flash-error")
# print the errors optionally
# for e in errors:
#     print(e.text)
# if we find that error message within errors, then login is failed
if any(error_message in e.text for e in errors):
    print("[!] Login failed")
else:
    print("[+] Login successful")

    
url = "https://live-cosmos.finq.com/trading-platform/#trading/Shares/Global/USA/All/FACEBOOK"
data  = requests.get(url).text
soup = BeautifulSoup(data, 'html5lib')
df = pd.DataFrame(columns=["Instrument", "Sell", "Buy", "Change"])
for row in soup.find_all('tr'):
    col = row.find_all("td")
    Instrument = col[0].text
    Sell = col[1].text
    Buy = col[2].text
    Change = col[3].text
    df = df.append({"Instrument":Instrument,"Sell":Sell,"Buy":Buy,"Change":Change}, ignore_index=True)
print(df)

After logging in to the site, I get another message that has to be closed. I tried the following for it:

driver.find_element_by_css_selector("button[id='popup-actions fn-close-popup btn autotests__popup-btn-close']").click()

But it says:

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"button[id='popup-actions fn-close-popup btn autotests__popup-btn-close']"}
  (Session info: chrome=92.0.4515.159)

2 Answers

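If the login succeeds ("[+] Login successful" is printed), you need to replace

data  = requests.get(url).text

with

driver.get(url)
time.sleep(3)  # requires: import time
data = driver.page_source

because requests.get(url) opens a new, unauthenticated session that shares nothing with the browser you logged in with.

Alternatively, pass the cookies from the Selenium driver to requests.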
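The cookie route can be sketched as follows. This is a minimal sketch, not a tested solution for this site: the helper names `selenium_cookies_to_dict` and `fetch_with_browser_session` are made up for illustration, and it assumes the login state lives in cookies. `driver.get_cookies()` returns a list of dicts that carry at least `name` and `value`, and `requests.get` accepts a plain dict through its `cookies` parameter.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Turn the list of dicts from driver.get_cookies() into {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


def fetch_with_browser_session(driver, url):
    """Fetch url with requests, reusing the cookies of a logged-in driver.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver
    that has already completed the login.
    """
    import requests  # imported here so the helper above has no dependencies

    cookies = selenium_cookies_to_dict(driver.get_cookies())
    return requests.get(url, cookies=cookies, timeout=30).text
```

Even with the cookies copied, `requests` only receives the HTML that the server sends. The part of the URL after `#` is never sent to the server, so if the table is built by JavaScript in the browser, the `driver.page_source` approach is probably the one that works here.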

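You are missing

driver.switch_to.default_content()

I see that you switch into an iframe on the login page to reach some elements inside it, but to access elements outside that iframe afterwards, you have to switch back to the default content.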
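One way to make the switch back hard to forget is to wrap the iframe work in a context manager. A minimal sketch under assumptions: `inside_frame` is a hypothetical helper name, and the frame id `"login"` is taken from the selector in the question. It relies only on `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`, and it does not wait for the iframe to load, so the explicit wait from the question is still needed if the frame appears late.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame and always return to the top-level document.

    frame_reference is whatever driver.switch_to.frame() accepts:
    a frame name or id, an index, or a WebElement.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so later lookups are not
        # accidentally scoped to the login iframe.
        driver.switch_to.default_content()


# Usage sketch (needs a live Selenium driver, so it is left as a comment):
#
# with inside_frame(driver, "login"):
#     driver.find_element(By.ID, "login_username").send_keys(username)
#     driver.find_element(By.ID, "login_password").send_keys(password)
#     driver.find_element(By.ID, "submit_login").click()
# driver.get(url)  # runs against the top-level document again
```

The usage comment uses `driver.find_element(By.ID, ...)` because the `find_element_by_*` methods from the question were removed in Selenium 4.3.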
UPD
To close the popup that appears after login, you can use the following:

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.fn-close-popup.popup-actions"))).click()
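The selector in the question failed because it put a list of class names into an `id` attribute test. Assuming those names really are classes of the close control, each one becomes a `.name` part of the selector, as in the line above. A small sketch of that conversion follows; `classes_to_css` is a hypothetical helper, and it does no CSS escaping, so it only suits simple class names like these.

```python
def classes_to_css(tag, class_attr):
    """Build 'tag.a.b' from a tag name and a space-separated class attribute."""
    return tag + "".join("." + name for name in class_attr.split())


print(classes_to_css("div", "fn-close-popup popup-actions"))
# div.fn-close-popup.popup-actions
```

A class selector matches when the element carries all the listed classes in any order, so listing only the one or two most specific classes is usually enough.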

Also, after clicking the login button, you have to switch back to the default content:

driver.switch_to.default_content()
