Unable to get data from LifeInsuranceCouncil using the Requests POST method

Posted 2024-10-03 13:23:11

This is a follow-up to the question asked here: Selenium Python Pass Date parameter using Calendar

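Using Selenium is not feasible because I need data for every date in the past 5 years, and navigating the calendar control would be difficult. So I am trying the Requests POST method instead. print(mytable) is always empty: I do not get the required id in the output. response.text always shows a message saying that the file has not been uploaded for the particular date.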
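Since every date in the past five years is needed, the date strings can be generated up front instead of being clicked through a calendar. A minimal sketch, assuming the site expects the `dd-Mon-yy` form used in the code further down (`01-Sep-21`); the helper name `date_strings` is made up for illustration:

```python
from datetime import date, timedelta

# English month abbreviations, spelled out so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def date_strings(start, end):
    """Yield every day from start to end (inclusive) as 'dd-Mon-yy'."""
    current = start
    while current <= end:
        yield f"{current.day:02d}-{MONTHS[current.month - 1]}-{current.year % 100:02d}"
        current += timedelta(days=1)

days = list(date_strings(date(2021, 9, 1), date(2021, 9, 3)))
print(days)
```

For a five-year run, `start` and `end` would simply be set five years apart, and each string fed into the date field of the POST.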

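The same thing happens on the website (in a browser) when the date is typed into the date box manually instead of being picked in the calendar control.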
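This behaviour suggests that picking a day in the calendar control sends something that a typed date does not. The commented-out `ctl00$MainContent$CALnavdate` entry with the value `7935` in the code below is consistent with an ASP.NET Calendar postback, where the argument is a count of days since 2000-01-01 (7935 would be 22-Sep-2021). A minimal sketch of that conversion, assuming the site uses the standard control; `calendar_day_number` is a hypothetical helper name:

```python
from datetime import date

# Assumed base date of the calendar control's day numbers.
CALENDAR_EPOCH = date(2000, 1, 1)

def calendar_day_number(day):
    """Return the number of days between 2000-01-01 and the given date."""
    return (day - CALENDAR_EPOCH).days

print(calendar_day_number(date(2021, 9, 22)))  # 7935
```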

Here is the Python code using Requests:

import requests
from bs4 import BeautifulSoup
import urllib.parse


url = 'https://www.lifeinscouncil.org/industry%20information/list_of_fund_NAVs.aspx'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Host': 'www.lifeinscouncil.org',
    'Origin': 'https://www.lifeinscouncil.org',
    'Referer': 'https://www.lifeinscouncil.org/industry%20information/list_of_fund_NAVs.aspx',
    'sec-ch-ua': '"Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
}

#payload='ctl00%24txtusername=""&ctl00%24txtpwd=""&ctl00%24MainContent%24drpselectinscompany=BLS111&ctl00%24MainContent%24txtdateselect=01-Sep-21&ctl00%24MainContent%24btngetdetails=Get+Data'
# Use one session so the cookies set by the GET are reused by the POST.
s = requests.Session()

# Load the page first to pick up the ASP.NET hidden fields.
req = s.get(url, headers=headers)
data = req.text
bs = BeautifulSoup(data, 'html.parser')
cookies = s.cookies.get_dict()

# These hidden values have to be sent back with the POST.
viewstate = bs.find("input", {"id": "__VIEWSTATE"}).attrs['value']
viewstategenerator = bs.find("input", {"id": "__VIEWSTATEGENERATOR"}).attrs['value']
eventvalidation = bs.find("input", {"id": "__EVENTVALIDATION"}).attrs['value']


formData = (
    # ('__EVENTTARGET',''),
    # ('__EVENTARGUMENT',''),
    ('__VIEWSTATE', viewstate),
    ('__VIEWSTATEGENERATOR',viewstategenerator),
    ('__EVENTVALIDATION', eventvalidation),
    ('ctl00$txtusername', ''),
    ('ctl00$txtpwd', ''),
    # ('ctl00$MainContent$CALnavdate','7935'),
    ('ctl00$MainContent$drpselectinscompany', 'BLS111'),
    ('ctl00$MainContent$txtdateselect','01-Sep-21'),
    ('ctl00$MainContent$btngetdetails', 'Get Data')
)
encodedFields=urllib.parse.urlencode(formData)
print(encodedFields)
response = s.post(url, headers=headers, data=encodedFields,timeout=300,cookies=cookies)

soup=BeautifulSoup(response.text,'html.parser')
mytable=soup.find_all(id='MainContent_GridView1')
print(mytable)
print(response.text)
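One thing the code above does not try is replaying the calendar selection as its own postback before pressing Get Data. The sketch below only builds the form fields for those two requests. It is untested against the site and assumes the calendar is driven through `__EVENTTARGET` / `__EVENTARGUMENT` as on a standard ASP.NET page, with a day number counted from 2000-01-01. `build_form` and the placeholder hidden-field values are hypothetical:

```python
from urllib.parse import urlencode

def build_form(hidden, company, date_text,
               event_target="", event_argument="", press_button=True):
    """Return the POST fields as a list of (name, value) pairs.

    hidden holds the __VIEWSTATE-style values scraped from the latest response.
    """
    fields = [
        ("__EVENTTARGET", event_target),
        ("__EVENTARGUMENT", event_argument),
    ]
    fields += list(hidden.items())
    fields += [
        ("ctl00$txtusername", ""),
        ("ctl00$txtpwd", ""),
        ("ctl00$MainContent$drpselectinscompany", company),
        ("ctl00$MainContent$txtdateselect", date_text),
    ]
    if press_button:
        # A browser only sends a submit button's name/value when it is clicked.
        fields.append(("ctl00$MainContent$btngetdetails", "Get Data"))
    return fields

# Placeholder values; real ones come from the page, as in the code above.
hidden = {
    "__VIEWSTATE": "placeholder",
    "__VIEWSTATEGENERATOR": "placeholder",
    "__EVENTVALIDATION": "placeholder",
}

# Request 1 (assumed): emulate picking 01-Sep-21 in the calendar.
# 7914 is the number of days from 2000-01-01 to 2021-09-01.
select_day = build_form(hidden, "BLS111", "",
                        event_target="ctl00$MainContent$CALnavdate",
                        event_argument="7914", press_button=False)

# Request 2: press "Get Data". In real use, hidden would be re-scraped
# from the response to request 1 before building this form.
get_data = build_form(hidden, "BLS111", "01-Sep-21")

print(urlencode(get_data))
```

Either list can be passed directly as `data=` to `s.post(...)`; Requests URL-encodes a sequence of pairs itself, so the separate `urllib.parse.urlencode` call is optional.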

I also tried using Postman to generate the code for me, but Postman could not get the data in its output either.

What am I missing here?

Thanks in advance for your help. Regards, Kiran Jain

