Unable to log in to a website and scrape its data

Posted 2024-10-03 00:21:43


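I am trying to scrape data from a website, but I am running into a problem when logging in. When I submit my username and password to the site, the login does not go through. I think the problem is with the token: a new token is generated every time I try to log in (check the headers in the browser console).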

import requests
from bs4 import BeautifulSoup

url = "http://indiatechnoborate.tymra.com"

with requests.Session() as s:
    # Load the login page first so the session picks up its cookies and form fields.
    first = s.get(url)
    start_soup = BeautifulSoup(first.content, 'lxml')
    print(start_soup)

    # Value of the hidden "return" field in the login form.
    retVal = start_soup.find("input", {"name": "return"}).get('value')
    print(retVal)

    # Name of the second hidden input, which appears to be the per-request token.
    formdata = start_soup.find("form", {"id": "form-login"})
    dynval = formdata.find_all('input', {"type": "hidden"})[1].get('name')
    print(dynval)

    dictdata = {"username": "username", "password": "password", "return": retVal, dynval: "1"}
    print(dictdata)
    pr = {"task": "user.login"}
    print(pr)

    # Post the credentials with the same session that fetched the form.
    sec = s.post("http://indiatechnoborate.tymra.com/component/users/", data=dictdata, params=pr)
    print("------------------------------------------")
    print(sec.status_code, sec.url)
    print(sec.text)

I want to log in to the website and retrieve the data once the login has completed.
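Since the token changes on every page load, picking it by position (`[1]`) can break if the form's hidden inputs are reordered. One possible alternative, sketched below, is to copy every hidden input from the login form into the payload. The helper names `LoginFormParser` and `build_login_payload` and the sample HTML (including its field values) are made up for illustration and are not taken from the actual site. The sketch uses the standard library's `html.parser` so that it runs offline; the same idea applies to the `BeautifulSoup` object in the code above.

```python
from html.parser import HTMLParser


class LoginFormParser(HTMLParser):
    """Collect name/value pairs of hidden inputs inside one form (hypothetical helper)."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("type") == "hidden":
            name = attrs.get("name")
            if name:
                self.hidden[name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_login_payload(html, username, password, form_id="form-login"):
    """Return the hidden form fields plus the credentials as one dict."""
    parser = LoginFormParser(form_id)
    parser.feed(html)
    payload = dict(parser.hidden)
    payload.update({"username": username, "password": password})
    return payload


# Made-up sample form; the real page may look different.
SAMPLE_HTML = """
<form id="form-login" action="/component/users/?task=user.login" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="return" value="aW5kZXgucGhw">
  <input type="hidden" name="0123456789abcdef0123456789abcdef" value="1">
</form>
"""

payload = build_login_payload(SAMPLE_HTML, "alice", "secret")
print(payload)
```

The resulting dict could then be passed as `data=` to `s.post(...)` in place of a hand-built `dictdata`, as long as the same session is used for both the GET and the POST.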


2 Answers

Try replacing this line:

dictdata={"username":"username", "password":"password","return":retVal,dynval:"1"}

with this:

dictdata={"username":"username", "password":"password","return":retVal + "==",dynval:"1"}

Hope this helps.
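Appending `"=="` looks like an attempt to restore Base64 padding. That reading is an assumption: the question does not say that the `return` value is Base64, and whether the server requires padding is not confirmed. If it is Base64, a fixed `"=="` is only correct when the length leaves a remainder of 2 modulo 4. A minimal sketch that adds only the padding actually needed, using a hypothetical helper `pad_base64` and made-up sample values:

```python
import base64


def pad_base64(value):
    """Append only as many '=' characters as Base64 requires (hypothetical helper)."""
    return value + "=" * (-len(value) % 4)


# Made-up sample values: already aligned, missing one '=', already padded.
for raw in ("aW5kZXgucGhw", "aHR0cDovL2E", "aHR0cDovL2E="):
    padded = pad_base64(raw)
    print(repr(raw), "->", repr(padded), base64.b64decode(padded))
```

With this helper the line from the answer would read `"return": pad_base64(retVal)`.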

Try using an authentication method instead of passing a payload:

import requests
from requests.auth import HTTPBasicAuth
USERNAME = "<USERNAME>"
PASSWORD = "<PASSWORD>"
BASIC_AUTH = HTTPBasicAuth(USERNAME, PASSWORD)
LOGIN_URL = "http://indiatechnoborate.tymra.com"
response = requests.get(LOGIN_URL,headers={},auth=BASIC_AUTH)
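Note that HTTP Basic authentication does not submit a login form; it attaches an `Authorization` header to the request. It can therefore only help if the server actually uses Basic authentication, which the question does not confirm (the form fields in the question suggest a form-based login instead). The sketch below shows, with the standard library only, what that header contains. `basic_auth_header` is a hypothetical helper written for illustration, not part of `requests`.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value used by HTTP Basic auth (hypothetical helper)."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("user", "pass"))
```

A quick way to check whether this approach applies is to request the page without credentials: a server that expects Basic authentication normally answers with status 401 and a `WWW-Authenticate: Basic ...` header.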
