I am trying to fetch data from a web page using the requests module. This is what I have so far:
import requests
# Python 2 only: make UTF-8 the default encoding so r.text can be written to a file
import sys
reload(sys)
sys.setdefaultencoding('utf8')
params = {'ResetPaging': 'false',
          'sortDirection': 'D',
          'sortField': 'TransDate',
          'startRow': 0,
          'timePeriod': 6,
          'transactionType': 'purchases'}
url = "http://markets.investorschronicle.co.uk/research/Markets/DirectorDealings"
headers = {"Content-type": "application/x-www-form-urlencoded",
           "Accept": "application/json, text/javascript, */*; q=0.01",
           'Referer': 'http://markets.investorschronicle.co.uk/research/Markets/DirectorDealings',
           'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0',
           'X-Requested-With': 'XMLHttpRequest'
           }
s = requests.Session()
# HEAD requests ask for *just* the headers, which is all you need to grab the
# session cookie
s.head(url)
r = s.post(url, data=params, headers=headers)
print(r.status_code, r.reason)
with open('page1.html', 'w') as f:
    f.write(r.text)
    f.flush()
params['startRow'] = 11
params['transactionType'] = 'sales'
r = s.post(url, data=params, headers=headers)
print(r.status_code, r.reason)
with open('sales.html', 'w') as f:
    f.write(r.text)
    f.flush()
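It seems the host is ignoring my parameters. The server responds with 200 OK, but I always get the "default" first page of purchases, no matter which parameters I send in the POST.
What am I doing wrong?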
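One way to narrow this down is to compare the form body the script sends with the body the browser sends for the same action (copied from the browser's network inspector). The following is a minimal sketch of that comparison, not a confirmed fix. It assumes Python 3 and relies on requests form-encoding a dict passed as data= the same way urlencode does. The helper name diff_form_bodies and the browser_body string are hypothetical placeholders, not values captured from the site.

```python
from urllib.parse import parse_qsl, urlencode


def diff_form_bodies(script_params, browser_body):
    """Return {field: (script_value, browser_value)} for every field that differs."""
    # Encode the dict the way a form POST would, then parse both bodies back
    # into dicts so they can be compared field by field.
    sent = dict(parse_qsl(urlencode(script_params), keep_blank_values=True))
    expected = dict(parse_qsl(browser_body, keep_blank_values=True))
    return {
        key: (sent.get(key), expected.get(key))
        for key in sorted(set(sent) | set(expected))
        if sent.get(key) != expected.get(key)
    }


# The parameters from the second request in the question.
params = {'ResetPaging': 'false',
          'sortDirection': 'D',
          'sortField': 'TransDate',
          'startRow': 11,
          'timePeriod': 6,
          'transactionType': 'sales'}

# Hypothetical placeholder: paste the real body copied from the browser here.
browser_body = ("ResetPaging=false&sortDirection=D&sortField=TransDate"
                "&startRow=10&timePeriod=6&transactionType=sales")

differences = diff_form_bodies(params, browser_body)
print(differences)
```

An empty result means the form fields match, in which case the difference is more likely in the URL, the cookies, or the headers than in the parameters themselves.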
No answers yet.