How to send a URL request in Python without opening the browser (NOT using the webbrowser module)?

4 votes
2 answers
7767 views
Asked 2025-04-17 09:02

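I want to send this URL to the server while logged in, as a request for it to make some changes on the website. The problem is that when I open the URL with mechanize or urllib2, nothing changes on the website. When I use the webbrowser module, however, the content on the website does change. I want the functionality of the webbrowser module without opening an actual browser. Is there a way to do this? And why don't mechanize and urllib2 work?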

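EDIT: By "changes on the website" I mean that the information I post on the site earns things called "shares" and "votes". If the information is inaccurate, they kick you out. My program finds accurate information and then "inserts" it into the website through a URL.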

Example URL (all my other URLs follow this format):

http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194
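
Since every URL follows the same pattern, it could be assembled from its parts rather than pasted by hand. This is a minimal sketch assuming Python 3; the helper name `build_save_url` is made up for illustration, and the reading of `inPlaceID` as the place and `xxx_c_1_f_987` as the field being filled in is only a guess from the example URL.

```python
from urllib.parse import urlencode

BASE = "http://www.locationary.com/access/proxy.jsp"


def build_save_url(place_id, field_name, field_value):
    """Assemble a save URL in the same shape as the example (hypothetical helper)."""
    query = [
        ("ACTION_TOKEN", "proxy_jsp$JspView$SaveAction"),
        ("inPlaceID", str(place_id)),
        (field_name, field_value),
    ]
    # safe="$" keeps the literal "$" in the action token, as in the example URL;
    # everything else (":", "/", "?", "=") is percent-encoded.
    return BASE + "?" + urlencode(query, safe="$")


url = build_save_url(
    1020634218,
    "xxx_c_1_f_987",
    "http://www.yellowpages.com/pittsburgh-pa/mip/family-dollar-store-1349194?lid=1349194",
)
print(url)
```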

mechanize code:

import mechanize
br = mechanize.Browser()
url = http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194
br.open(url)

urllib2 code:

import urllib2
url = http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194
page = urllib2.urlopen(url)
page.read()

webbrowser code:

import webbrowser
url = http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194
webbrowser.open(url)

EDIT #2: I just tried this code:

import urllib2
import urllib

def log_in():
    url = 'https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction'
    values = {'inUserName' : 'me@gmail.com',
              'inUserPass' : 'myPass'}
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    req.add_header('Host', 'www.locationary.com')
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0')
    req.add_header('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8')
    req.add_header('Accept-Language', 'en-us,en;q=0.5')
    req.add_header('Accept-Encoding','gzip, deflate')
    req.add_header('Accept-Charset','ISO-8859-1,utf-8;q=0.7,*;q=0.7')
    req.add_header('Connection','keep-alive')
    req.add_header('Referer','http://www.locationary.com/')
    req.add_header('Cookie','site_version=REGULAR; __utma=47547066.1079503560.1321924193.1322707232.1324693472.36; __utmz=47547066.1321924193.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); nickname=jacob501; locaCountry=1033; locaState=1795; locaCity=Montreal; jforumUserId=1; PMS=1; TurnOFfTips=true; Locacookie=enable; __utma=47547066.1079503560.1321924193.1322707232.1324693472.36; __utmz=47547066.1321924193.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); nickname=jacob501; PMS=1; __utmb=47547066.15.10.1324693472; __utmc=47547066; JSESSIONID=DC7F5AB08264A51FBCDB836393CB16E7; PSESSIONID=28b334905ab6305f7a7fe051e83857bc280af1a9; __utmc=47547066; __utmb=47547066.15.10.1324693472; ACTION_RESULT_CODE=ACTION_RESULT_FAIL; ACTION_ERROR_TEXT=java.lang.NullPointerException')
    req.add_header('Content-Type','application/x-www-form-urlencoded')
    response = urllib2.urlopen(req)
    page = response.read()

url2 = 'http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194'

log_in()
response2 = urllib2.urlopen(url2)
page2 = response2.read()

But it didn't work.

EDIT 3: Tony's code didn't work for me.

import urllib2
import urllib
import cookielib

data = urllib.urlencode({"inUserName":"MYUSERNAMESHOULDBEHERE", "inUserPass":"MYPASSWORDSHOULDBEHERE"})
jar = cookielib.FileCookieJar("cookies")
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
request = urllib2.Request("https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction", data)
opener.open(request) 
url = "http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1012432546&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Fdennys-13470813%3Flid%3D13470813"
anything = opener.open(url)
anything.read()

FINAL EDIT! I finally got it working by following Tony's suggestion!

Here is the code that finally worked:

import urllib2
import urllib
import cookielib

data = urllib.urlencode({"inUserName":"myemail@gmail.com", "inUserPass":"mypassword"})
jar = cookielib.FileCookieJar("cookies")
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
opener.addheaders.append(('User-agent', 'Mozilla/4.0'))
opener.addheaders.append( ('Referer', 'http://www.hellboundhackers.org/index.php') )
opener.addheaders.append(('Cookie','site_version=REGULAR; __utma=47547066.912030359.1322003402.1324688192.1324930160.55; __utmz=47547066.1324655802.52.13.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=cache:dr23PN5fUj4J:www.locationary.com/%20locationary; nickname=jacob501; jforumUserId=1; PMS=1; locaCountry=1033; locaState=1786; locaCity=Vancouver; JSESSIONID=A8F241E1924CE7A25FAA8C5CA6597697; PSESSIONID=5c21c44245f978b917f17982c944a9ec2b5d2df5; Locacookie=enable; __utmb=47547066.5.10.1324930160; __utmc=47547066'))
request = urllib2.Request("https://www.locationary.com/index.jsp?ACTION_TOKEN=tile_loginBar_jsp$JspView$LoginAction", data)
response = opener.open(request) 
url = "http://www.locationary.com/"
anything = opener.open(url)
anything.read()

I just needed to add this line

opener.addheaders.append(('Cookie','site_version=REGULAR; __utma=47547066.912030359.1322003402.1324688192.1324930160.55; __utmz= 

and so on (that long string is the cookie).

I also added "Referer" and "User-Agent" headers, just in case.
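
A note on those header lines: a freshly built opener already carries a default `User-agent` entry in `addheaders`, so appending another one leaves two entries in the list. The sketch below replaces the list instead. It assumes Python 3 (where `urllib2` became `urllib.request`), the helper name `make_opener_with_headers` is made up for illustration, and the Referer value is the one from the earlier attempt.

```python
import urllib.request


def make_opener_with_headers(headers):
    """Build an opener whose default headers are exactly `headers` (hypothetical helper)."""
    opener = urllib.request.build_opener()
    # Assigning a new list drops the built-in "Python-urllib/x.y" User-agent
    # entry instead of leaving it alongside the custom one.
    opener.addheaders = list(headers)
    return opener


opener = make_opener_with_headers([
    ("User-agent", "Mozilla/4.0"),
    ("Referer", "http://www.locationary.com/"),
])
print(opener.addheaders)
```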

Thank you, Tony!!

2 Answers

0
{"manifest":{"errorTimeout":0,"succeed":true,"errorCode":0,"serverVersion":"1.0","type":"locaaccess"},"saveResult":{"message":"You don't have permissions!","placeOpenedState":0,"isSucess":false}} 

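I got this by putting your URL into my browser. I think you need to authenticate with the site first and then run this command. I can't give you the exact login steps, but if you go to the login page there will probably be a form, and you can simulate submitting that form with urllib2 by sending a request to its URL.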
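
To spot this kind of failure from a script rather than by eye, the response body can be parsed as JSON. This is only a sketch based on the single response shown above; the helper name `save_succeeded` is made up, and other responses from the site may be shaped differently.

```python
import json


def save_succeeded(body):
    """Return (ok, message) for a response shaped like the sample (hypothetical helper)."""
    payload = json.loads(body)
    result = payload.get("saveResult", {})
    # "isSucess" is spelled exactly as it appears in the server's response.
    return bool(result.get("isSucess")), result.get("message", "")


SAMPLE = """{"manifest":{"errorTimeout":0,"succeed":true,"errorCode":0,"serverVersion":"1.0","type":"locaaccess"},"saveResult":{"message":"You don't have permissions!","placeOpenedState":0,"isSucess":false}}"""

ok, message = save_succeeded(SAMPLE)
print(ok, message)
```

Note that `manifest.succeed` is true in the sample even though the save was refused, so `saveResult` is the part worth checking.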

1

First, you should put the URL value in quotes:

url = "http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194"

If you want to send a request without opening a browser, you can use urllib, just as you are doing now.

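If you need authentication (and it looks like you do), you should first send a request to authenticate and get the cookies (cookielib.FileCookieJar can handle this), then set those cookies on the opener. After that you can open pages and send requests.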

The code you need is probably something like this:

import cookielib
import urllib
import urllib2

data = urllib.urlencode({"login": "your login or whatever", "pass": "password"})  # change "login" and "pass" to the names of the fields in the login form
jar = cookielib.FileCookieJar("cookies")
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
request = urllib2.Request("url for authentication", data)
opener.open(request)  # now you should be authorized and able to send any request as a logged-in user, using opener

url = "http://www.locationary.com/access/proxy.jsp?ACTION_TOKEN=proxy_jsp$JspView$SaveAction&inPlaceID=1020634218&xxx_c_1_f_987=http%3A%2F%2Fwww.yellowpages.com%2Fpittsburgh-pa%2Fmip%2Ffamily-dollar-store-1349194%3Flid%3D1349194"
anything = opener.open(url)
anything.read()
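
On Python 3 the same flow could look roughly like this. It is a sketch, not something tested against the site: `urllib2` and `cookielib` became `urllib.request` and `http.cookiejar`, a plain in-memory `CookieJar` is used here instead of `FileCookieJar`, the helper names are made up, and the form field names are the ones from the question.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return (opener, jar); the opener stores and resends cookies through jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def make_login_request(login_url, fields):
    """Build a POST request carrying the form fields, URL-encoded as bytes."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(login_url, data=data)


def login_and_fetch(login_url, fields, target_url):
    """Log in, then fetch target_url with the same opener so the session cookies are sent."""
    opener, _jar = make_opener()
    opener.open(make_login_request(login_url, fields)).close()
    with opener.open(target_url) as response:
        return response.read()
```

Calling `login_and_fetch(login_url, {"inUserName": "...", "inUserPass": "..."}, url)` would then perform both requests. The important point is that both go through the same opener, since that is what carries the cookies from the login response over to the second request.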
