Persistent sessions with the Python requests library

Posted 2024-09-30 14:26:48

I'm trying to fetch data from an online calculator like this one:

https://srv111.services.gc.ca/INT_02.aspx

It requires filling out a multi-page form.

I'm trying to do this with the Python requests library, but have had no luck. As I understand it, I start a session:

import requests

s = requests.Session()

and then step through each page:

s.get(url1)

r2 = s.post(url2, data=payload)  #payload is form data created by looking at page source

...

But it doesn't seem to work. When I run

print(r2.text)

I should get the next page, but I get the same page back. I'm new to this. Can someone point out my mistake? Many thanks!

Edit: here is the code:

import requests
import codecs
url1='https://srv111.services.gc.ca/INT_01.aspx?lang=e'
url2='https://srv111.services.gc.ca/INT_02.aspx'
url3='https://srv111.services.gc.ca/OAS_01.aspx'
url4='https://srv111.services.gc.ca/OAS65_01.aspx'
url5='https://srv111.services.gc.ca/OAS_11.aspx'

payload2={'ctl00$ContentPlaceHolder1$ddlMonth':'1', 'ctl00$ContentPlaceHolder1$txtYear':'1987', 'ctl00$ContentPlaceHolder1$lbGender':'1', 'ctl00$ContentPlaceHolder1$btnNext': 'Next'}


s = requests.Session()
r1 = s.get(url1)
c = {'ASP.NET_SessionId': r1.cookies['ASP.NET_SessionId']}
r2 = s.post(url2, data=payload2, cookies=c)
payload3 = {'ctl00$ContentPlaceHolder1$btnNextPrevious2$btnNext':'Next'}
r3 = s.post(url3, data=payload3, cookies=c)
payload4 = {'ctl00$ContentPlaceHolder1$lbYesNo':'-1'}
r4 = s.post(url4, data=payload4)
r5 = s.get(url5)
with codecs.open('result.html','w','utf-8') as f:
    f.write(r2.text)
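
One thing that may be worth checking: .aspx pages are usually ASP.NET Web Forms, which typically embed hidden inputs such as __VIEWSTATE and __EVENTVALIDATION in each page and expect them to be posted back together with the visible fields. The payloads above contain only the visible fields. The snippet below is a minimal sketch, not a confirmed fix for this site: it collects the hidden inputs from a page's HTML and merges them with the values to submit. The helper names (HiddenInputParser, hidden_fields, build_payload) and the sample markup are made up for illustration, and the real pages may use different fields.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


def build_payload(html, visible_fields):
    """Merge the page's hidden fields with the values we want to submit."""
    payload = hidden_fields(html)
    payload.update(visible_fields)
    return payload


# Stand-in markup for illustration only (assumption); the real page may differ.
SAMPLE_HTML = """
<form method="post" action="INT_02.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$ContentPlaceHolder1$txtYear" />
</form>
"""

payload = build_payload(SAMPLE_HTML, {"ctl00$ContentPlaceHolder1$txtYear": "1987"})
print(sorted(payload))
```

With the session from the question, this would be used roughly as s.post(url2, data=build_payload(r1.text, payload2)), taking the hidden fields from the most recently fetched page each time. A requests.Session already stores and resends the cookies it receives, so passing cookies=c explicitly should not be needed.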