如何在Python中插入cookie进行web抓取?

2024-09-29 00:19:40 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在制作一个网页刮板,我将托管在我的树莓皮B,但我需要从网站刮需要一个cookie访问它。具体的cookie是。ROBLOSECURITY。在使用cookie之前,我已经登录了(在googlechrome的EditThisCookie扩展的帮助下)。如何让Python程序使用这个cookie登录?在

这是我的.ROBLOSECURITY cookie(尽管有一些字母/数字被更改):

26D59EEB62BB82BA679D88E391F5E43448FDC5EEE74BEBBFD9879204EABA2813E4C00248E65D7ADBFE0B91F1B140E4DD61CBA1F0EE5991E5099BE044AD9AF0C019EFAFDCF6A41355002355A602F9B8ADEF4CD14E70825687F9748B082089DE69C833E4F5AE9B358F1988B3D3BB04CA5D0BF96501E8B4AAACD68BBE3ACCAED5DA646BB4E7B3D8CC88D102DD53382C8FE8696C54445EB3716AF08DF9816E14EAC0DA451C04803BAB801BF61A20FD9BF6E3FE9BF06833D68C08BB1DF4FDD3ED969687F42BAA5D57C66246549F4323F3FAE71D7E38574690F6AB41D56C224C949018C5C24901EB7D8A4B6D262A173B60B16B413F347B21AC8901F86D818B039A88344A324670D726176F42485ADE295EE22ADEDA733452735B043B7A4FF8262D42DF60D63329C77E8AF9EF65AD25B01CEAD48FCBF59D8CB70AE32BDE1651FB372656C600DBCBF53F0D49FB89275830B0A5513EC201C808699428C0F09BF8FE64A227D9A94B43943E2F81E252B45297D38AF6D8E8FDA180DCB491AA33FA7EE87BB1D1E005050573294010E9169AB9AF716F69483128B93F87878C24380A57F64A8EF4BC9242A6125413548F88D15F6E6779A9B996BCADFEA7EABFEE3ED17EFEC148C33630CBCDCD9E1DDCB4B1C5DD42EF93C696C20D01A1E9D95AD40145ACE57C4664ACDF79EF78482DE6E40E7D3727C501A089993402F386A2D5997CDE530DBF93CDAD90E15F207D3B9DE168C3B669E1099B304192CD33D327150A57B9383BDBC99215448F21

这是一个屏幕截图,里面有一些关于cookie的更多信息。 More info about this .ROBLOSECURITY cookie


Tags: 程序刮板信息网页屏幕网站cookie字母
2条回答

这就是如何使用urllib实现的。在

import urllib2
handler = urllib2.HTTPHandler(debuglevel = 1)

req = urllib2.Request('http://www.example.com/')
req.add_header('Cookie', (".ROBLOSECURITY=26D59EEB62BB82BA679D88E391F5E43448FDC5EEE74BEBBFD9879204EABA2813E4C00248E65D7ADBFE0B91F1B"
    "140E4DD61CBA1F0EE5991E5099BE044AD9AF0C019EFAFDCF6A41355002355A602F9B8ADEF4CD14E70825687F9748B082089DE69C833E4F5AE9B358F1988B3D3BB04CA5D0B"
    "F96501E8B4AAACD68BBE3ACCAED5DA646BB4E7B3D8CC88D102DD53382C8FE8696C54445EB3716AF08DF9816E14EAC0DA451C04803BAB801BF61A20FD9BF6E3FE9BF06833D"
    "68C08BB1DF4FDD3ED969687F42BAA5D57C66246549F4323F3FAE71D7E38574690F6AB41D56C224C949018C5C24901EB7D8A4B6D262A173B60B16B413F347B21AC8901F86D"
    "818B039A88344A324670D726176F42485ADE295EE22ADEDA733452735B043B7A4FF8262D42DF60D63329C77E8AF9EF65AD25B01CEAD48FCBF59D8CB70AE32BDE1651FB372"
    "656C600DBCBF53F0D49FB89275830B0A5513EC201C808699428C0F09BF8FE64A227D9A94B43943E2F81E252B45297D38AF6D8E8FDA180DCB491AA33FA7EE87BB1D1E00505"
    "0573294010E9169AB9AF716F69483128B93F87878C24380A57F64A8EF4BC9242A6125413548F88D15F6E6779A9B996BCADFEA7EABFEE3ED17EFEC148C33630CBCDCD9E1DD"
    "CB4B1C5DD42EF93C696C20D01A1E9D95AD40145ACE57C4664ACDF79EF78482DE6E40E7D3727C501A089993402F386A2D5997CDE530DBF93CDAD90E15F207D3B9DE168C3B6"
    "69E1099B304192CD33D327150A57B9383BDBC99215448F21"))

opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)

resp = urllib2.urlopen(req)
print resp.read()

取自here

import requests
url = 'http://target_url'
cookies = dict(cookie='value')

r = requests.get(url, cookies=cookies)

相关问题 更多 >