Sending form data to an aspx page

Posted 2024-10-01 22:27:05


I need to run a search on a website:

    url = r'http://www.cpso.on.ca/docsearch/'

It is an aspx page (I only started on this yesterday, so apologies for the noob questions).

Using BeautifulSoup, I can get the `__VIEWSTATE` and `__EVENTVALIDATION` values like this:

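    viewstate = soup.find('input', {'id' : '__VIEWSTATE'})['value']
    ev = soup.find('input', {'id' : '__EVENTVALIDATION'})['value']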

The headers can be set like this:

    headers = {'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13',
        'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml; q=0.9,*/*; q=0.8',
        'Content-Type': 'application/x-www-form-urlencoded'}
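
One thing worth checking in the dictionary above: `HTTP_USER_AGENT` and `HTTP_ACCEPT` are the names a CGI or WSGI server exposes to application code, not the names sent on the wire, where the headers are called `User-Agent` and `Accept`. Here is a minimal sketch in Python 3 (the original uses Python 2's `urllib2`). The helper `build_request` and the field `example_field` are hypothetical names used only for illustration, and nothing is actually sent over the network.

```python
import urllib.parse
import urllib.request

URL = "http://www.cpso.on.ca/docsearch/"

# Header names as they appear on the wire (not the CGI-style HTTP_* names).
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.13) "
        "Gecko/2009073022 Firefox/3.0.13"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Content-Type": "application/x-www-form-urlencoded",
}


def build_request(url, fields, headers=HEADERS):
    """Build (but do not send) a POST request carrying form-encoded fields."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=body, headers=headers)


# "example_field" is a hypothetical field name, not one from the real form.
request = build_request(URL, {"example_field": "value"})
```

Passing `data` makes the request a POST. Whether the wrong header names are what causes the missing results is not something this sketch can confirm.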

If you visit the page, the only things I want to pass are the first name and the last name:

    LN = "smith"
    FN = "a"
    data = {"__VIEWSTATE":viewstate,"__EVENTVALIDATION":ev,
    "ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtLastName":LN, 
    "ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtFirstName":FN}

So, putting it all together, it looks like this:

    import urllib
    import urllib2
    import urlparse
    import BeautifulSoup

    url = r'http://www.cpso.on.ca/docsearch/'
    html = urllib2.urlopen(url).read()
    soup = BeautifulSoup.BeautifulSoup(html)

    viewstate = soup.find('input', {'id' : '__VIEWSTATE'})['value']
    ev = soup.find('input', {'id' : '__EVENTVALIDATION'})['value']
    headers = {'HTTP_USER_AGENT': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.13) Gecko/2009073022 Firefox/3.0.13',
        'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml; q=0.9,*/*; q=0.8',
        'Content-Type': 'application/x-www-form-urlencoded'}

    LN = "smith"
    FN = "a"
    data = {"__VIEWSTATE":viewstate,"__EVENTVALIDATION":ev,
            "ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtLastName":LN, 
            "ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtFirstName":FN}

    data = urllib.urlencode(data)
    request = urllib2.Request(url,data,headers)
    response = urllib2.urlopen(request)
    newsoup = BeautifulSoup.BeautifulSoup(response)
    for i in newsoup:
        print i
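
The `urllib.urlencode(data)` step in the script above matters because the ASP.NET field names contain `$`, and a view state string typically contains `/`, `+` and `=`, all of which must be percent-escaped in a form-encoded body. Here is a small Python 3 sketch of what that encoding produces. The `viewstate` and `ev` values are made-up placeholders, not values from the real page, and `PREFIX` is only a shorthand introduced here.

```python
from urllib.parse import parse_qs, urlencode

# Shorthand for the long control-name prefix used in the question.
PREFIX = "ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$"

# Placeholder values; the real ones come from the hidden inputs on the page.
viewstate = "/wEPDwUK+placeholder=="
ev = "/wEWAg+placeholder"

data = {
    "__VIEWSTATE": viewstate,
    "__EVENTVALIDATION": ev,
    PREFIX + "txtLastName": "smith",
    PREFIX + "txtFirstName": "a",
}

encoded = urlencode(data)        # str with "$", "/", "+" and "=" percent-escaped
body = encoded.encode("ascii")   # Python 3 expects bytes for a POST body
decoded = parse_qs(encoded)      # round trip to check that nothing was lost
```

The round trip through `parse_qs` is just a sanity check that the server would see the same names and values that went in.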

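The problem is that it doesn't give me the real results. I don't know whether I need to supply a value for every text box in the form, or whether I'm simply doing it wrong. Either way, I'm hoping someone can set me straight. I thought I had it, but I expected to see a list of doctors and their contact information.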

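Any insight is greatly appreciated. I have used BeautifulSoup before, but I think my problem is just with sending the request and putting the right amount of information in the data part.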

Thanks!
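
On the question of whether every field has to be supplied: a browser submits every named input in the form, including hidden state fields and the button that was clicked, so one way to find out is to post the same set. The sketch below, in Python 3 using only the standard library (swapped in here for BeautifulSoup), collects all named `<input>` elements and then overrides the two search boxes. `InputCollector`, `SAMPLE_HTML` and the `ctl00$Demo$...` names are hypothetical stand-ins, not the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class InputCollector(HTMLParser):
    """Collect name/value pairs from every named <input> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are sent as empty strings.
            self.fields[name] = attributes.get("value") or ""


# Made-up page fragment standing in for the real search form.
SAMPLE_HTML = """
<form method="post" action="./">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="def456" />
  <input type="text" name="ctl00$Demo$txtLastName" />
  <input type="text" name="ctl00$Demo$txtFirstName" />
  <input type="submit" name="ctl00$Demo$btnSubmit" value="Search" />
</form>
"""

collector = InputCollector()
collector.feed(SAMPLE_HTML)

# Start from everything the form contains, then fill in the search terms.
payload = dict(collector.fields)
payload["ctl00$Demo$txtLastName"] = "smith"
payload["ctl00$Demo$txtFirstName"] = "a"
body = urlencode(payload).encode("ascii")
```

This is deliberately simplified: it ignores `<select>` and `<textarea>` elements, would wrongly include unchecked checkboxes, and assumes a single submit button. Whether the missing button field is what the real page objects to is an assumption that would need checking against the actual form.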


1 answer
User
#1 · Posted 2024-10-01 22:27:05

I took @pguardiario's advice and went the mechanize route, which is far simpler:

    import mechanize

    url = r'http://www.cpso.on.ca/docsearch/'
    request = mechanize.Request(url)
    response = mechanize.urlopen(request)
    forms = mechanize.ParseResponse(response, backwards_compat=False)
    response.close()

    form = forms[0]

    form['ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtLastName']='Smith'
    form['ctl00$ContentPlaceHolder1$MainContentControl1$ctl00$txtPostalCode']='K1H'

    print mechanize.urlopen(form.click()).read()

I still have a long way to go, but this got me a lot further.
