Logging in to a website with a requests session and BeautifulSoup

Posted 2024-10-02 00:29:48


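I am using the code below to log in to a website so that I can scrape data from my own profile page. However, after I request the profile URL, the parser (soup) only returns data from the login page. I still can't figure out why.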

import requests
from requests import session
from bs4 import BeautifulSoup


login_url='https://caicara.pizzanapoles.com.br/Account/Login'

url_perfil = 'https://caicara.pizzanapoles.com.br/AdminCliente'

payload = {
    'username' : 'MY_USERNAME',
    'password' : 'MY_PASSWORD'
}


with requests.session() as s:
    s.post(login_url, data = payload)
    r = requests.get(url_perfil)
    soup = BeautifulSoup(r.content, 'html.parser')
    print(soup.title)

2 Answers

Thanks for getting back to me, Karl.

However, it did not work. I changed my code using the tips mentioned above:

import requests
from bs4 import BeautifulSoup

login_url = 'https://caicara.pizzanapoles.com.br/Account/Login'
url = 'https://caicara.pizzanapoles.com.br/AdminCliente'
data = {
    'username': 'myuser',
    'password': 'mypass',
}

with requests.session() as s:
    r = s.get(login_url)

    soup = BeautifulSoup(r.content, 'html.parser')
    token = soup.find('input', name='__RequestVerificationToken')['value_of_my_token']
    payload['__RequestVerificationToken'] = token

    r1 = s.post(login_url, data=payload)
    r2 = s.get(url_perfil)

However, it returns the error below:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-140-760e35f7b327> in <module>
     13 
     14     soup = BeautifulSoup(r.content, 'html.parser')
---> 15     token = soup.find('input', name='__RequestVerificationToken')['QHlUQaro9sNo4lefL59lQRtbuziHnHtolV7Xm_Et_3tvnZKZnS4gjBBJZakw7crW0dyXy_lok44RozrMAvWm61XXGla5tC3AuZlgXC4GukA1']
     16 
     17     payload['__RequestVerificationToken'] = token

TypeError: find() got multiple values for argument 'name'
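
The TypeError comes from the signature of find(): its first positional parameter is itself called name (the tag name), so find('input', name=...) supplies name twice. A filter on the HTML attribute called name has to go through the attrs dictionary instead. Here is a minimal sketch on a made-up HTML fragment; the form markup and the token value abc123 are invented for illustration and are not taken from the real login page:

```python
from bs4 import BeautifulSoup

# Invented stand-in for the login page; the real page's markup may differ.
html = '''
<form action="/Account/Login" method="post">
  <input name="__RequestVerificationToken" type="hidden" value="abc123">
  <input name="username" type="text">
</form>
'''
soup = BeautifulSoup(html, 'html.parser')

# 'input' already fills the positional parameter called name,
# so passing name= as a keyword as well raises TypeError.
try:
    soup.find('input', name='__RequestVerificationToken')
    raised = False
except TypeError:
    raised = True

# Filter on the HTML attribute "name" through the attrs dictionary,
# then read the token from the tag's "value" attribute.
tag = soup.find('input', attrs={'name': '__RequestVerificationToken'})
token = tag['value']
print(raised, token)
```

Note that the subscript is the literal attribute name 'value'. The token string is what the lookup returns, not what goes inside the brackets.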

First, you need to use the session object s for all requests.

r = requests.get(url_perfil)

should be changed to

r = s.get(url_perfil)
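
The reason this matters is that cookies set by the login response live on the session, while a bare requests.get() starts with an empty cookie jar. The following network-free sketch illustrates the difference by preparing requests without sending them. The example.com URL and the auth cookie are hypothetical, and the cookie is placed in the jar by hand to stand in for whatever a successful login would set:

```python
import requests

# Hypothetical URL, used only to build requests; nothing is sent.
url = 'https://example.com/profile'

with requests.Session() as s:
    # Stand-in for the cookie a successful login POST would store on the session.
    s.cookies.set('auth', 'abc', domain='example.com', path='/')

    # What s.get(url) would send: the session merges its cookie jar in.
    via_session = s.prepare_request(requests.Request('GET', url))

# What requests.get(url) would send: a fresh request with no cookies.
standalone = requests.Request('GET', url).prepare()

print(via_session.headers.get('Cookie'))
print(standalone.headers.get('Cookie'))
```

Only the request prepared through the session carries a Cookie header, which is why the profile page treats the standalone request as not logged in.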

When you log in, a __RequestVerificationToken is sent in the POST data, so you probably need to send it as well.

It is present in the HTML of login_url:

<input name="__RequestVerificationToken" value="..."

This means you .get() the login page, extract the token, and then send your .post():

r = s.get(login_url)

soup = BeautifulSoup(r.content, 'html.parser')
token = soup.find('input', {'name': '__RequestVerificationToken'})['value']
payload['__RequestVerificationToken'] = token

r1 = s.post(login_url, data=payload)
r2 = s.get(url_perfil)

You may want to save each response in its own variable for further debugging.
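
One possible way to use those saved responses is a small helper that summarizes each one. This is only a sketch: describe is a hypothetical helper name, not part of requests or BeautifulSoup, and the choice of fields (status code, final URL, redirect count, page title) is an assumption about what is useful to inspect here:

```python
import requests
from bs4 import BeautifulSoup


def describe(label, response):
    """Return a one-line summary of a requests.Response for debugging."""
    soup = BeautifulSoup(response.content, 'html.parser')
    # soup.title is None when the page has no <title> element.
    title = soup.title.get_text(strip=True) if soup.title else None
    return (f'{label}: status={response.status_code} url={response.url} '
            f'redirects={len(response.history)} title={title!r}')
```

Calling print(describe('login page', r)), print(describe('login POST', r1)) and print(describe('profile', r2)) would show where the flow goes wrong. For example, if the profile request ends up on the login URL or shows the login page's title, the login most likely did not succeed.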
