How to make an XHR POST request with Python

Posted 2024-10-01 07:43:10


I want to scrape a website that needs a POST request to retrieve its data, but I'm having no luck. My latest attempt was:

    from requests import Session
    from bs4 import BeautifulSoup

    session = Session()

    # HEAD requests ask for *just* the headers, which is all you need to grab the
    # session cookie
    session.head('http://www.betrebels.gr/sports')

    response = session.post(
        #url = "https://sports-itainment.biahosted.com/WebServices/SportEvents.asmx/GetEvents",
        url='http://www.betrebels.gr/sports',
        data={
                'champIds': '["1191783","1191784","1191785","939911","939912","939913","939914","175","190686","198881","542378","217750","91","201","2","38","201614","454","63077","60920","384","49251","61873","87095","110401","111033","122008","122019","342","343","344","430","213","95","10","1240912","1237673","1239055","339","340","124","1381","260549","1071542","437","271","510","1241462","72","277","137","308","488","2131","59178","433","434","347","203","348","349","92420","148716","322","184","127983","321","88173","417","418","284","2688","103419","618","487","56029","214640","215229","514","92","302","1084811","1084813","1084831","68739","81852","406","100","70","172","351","541730","541732","541733","548965","552442","554615","554616","554617","361","136","519","279","65","319","364","75","220","194676","149","121443","110902","171694","152501","568313","126998","758","740","1264928"]',
                'dateFilter':'All',
                'eventIds':'[]',
                'marketsId':'-1',
                'skinId':"betrebels"
            },

        headers={'Accept':'application/json, text/javascript, */*; q=0.01',
            'Accept-Encoding':'gzip, deflate, br',
            'Accept-Language':'el-GR,el;q=0.8',
            'Connection':'keep-alive',
            'Content-Length':'701',
            'Content-Type':'application/json; charset=UTF-8',
            'Cookie':'Language=el-gr; ASP.NET_SessionId=kp0b2xwf2vzuci4uwn33uh1o; IsBetApp=False; _ga=GA1.2.1005994943.1499255280; _gid=GA1.2.1197736989.1500201903; _gat=1; ParentUrl=ParentUrl is not need',
            'DNT':'1',
            'Host':'sports-itainment.biahosted.com',
            'Origin':'https://sports-itainment.biahosted.com',
            'Referer':'https://sports-itainment.biahosted.com/generic/prelive.aspx?token=&clientTimeZoneOffset=-180&lang=el-gr&walletcode=508729&skinid=betrebels&parentUrl=https%3A//ps.equalsystem.com/ps/game/BIASportbook.action',
            'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
            'X-Requested-With':'XMLHttpRequest'         
            }
        )

    print(response.text)

    soup = BeautifulSoup(response.content, "html.parser")

    # leagues = soup.find_all("div", {"class": "header"})[0].text
    # print(leagues)
    leagues = soup.find_all("div", {"class": "championship-header"})
    links = soup.find_all("a")

    for link in links:
        print(link.get("href"), link.text)

    for item in leagues:
        # print(item.contents[0].find_all("div", {"class": "header"})[0].text)
        print(item.find_all("div", {"class": "header"})[0].text)
        print(item.find_all("span")[0].text)

I want to scrape all the football leagues from the betrebels.com site. Any ideas?


1 answer
User
Answer 1 · Posted 2024-10-01 07:43:10

The actual data is cleaner and easier to get from the real source, which you can find by inspecting the requests your browser makes. Here is the URL: https://s5.sir.sportradar.com/betinaction/en/1

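The data underneath is also JSON, which means you can cut this down to just the requests module and the {} module. In fact, requests lets you take the raw JSON response directly and parses it into a dictionary for you.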
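As a rough illustration of the point above, here is a minimal sketch of fetching a JSON endpoint and getting Python objects back. The helper name `fetch_json` and the `timeout` default are assumptions made for this example, not part of the answer; `Session.get`, `Response.raise_for_status` and `Response.json` are the standard requests calls.

```python
def fetch_json(url, session=None, timeout=10):
    """GET `url` and return the decoded JSON body as Python objects.

    `fetch_json` is a hypothetical helper written for this example.
    """
    if session is None:
        # Third-party dependency (pip install requests); imported here so
        # callers can also pass in their own session object.
        import requests
        session = requests.Session()

    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()       # requests decodes the JSON body into dicts/lists
```

Passing one of the feed URLs from this answer to `fetch_json` should give back nested dictionaries and lists, assuming the endpoint still serves JSON; no BeautifulSoup parsing is involved.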

All of this means you can drastically simplify the process of getting what you want.

You can find the leagues for every country here:
https://ls.sportradar.com/ls/feeds/?/betinaction/en/Europe:Berlin/gismo/config_tree/41/0/1
You only need to grab all the _id fields, then iterate over them, building each URL in the form https://s5.sir.sportradar.com/betinaction/en/1/category/ + _id
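The step described above could be sketched roughly as follows. The feed's real layout is not shown in this answer, so the code simply assumes nested dictionaries and lists and collects every value stored under an `_id` key; the key names in `sample_tree` and the helper names `collect_ids` and `category_urls` are hypothetical. Whether every `_id` in the real feed is a category id would need checking against the actual response.

```python
BASE_URL = "https://s5.sir.sportradar.com/betinaction/en/1/category/"


def collect_ids(node):
    """Return every value stored under an "_id" key, in document order."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "_id":
                found.append(value)
            else:
                found.extend(collect_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_ids(item))
    return found


def category_urls(tree):
    """Build one category URL per collected _id."""
    return [BASE_URL + str(_id) for _id in collect_ids(tree)]


# Hypothetical stand-in for the parsed feed (e.g. the result of response.json()).
sample_tree = {
    "doc": [
        {"data": [{"_id": 1, "name": "A", "children": [{"_id": 32}, {"_id": 31}]}]}
    ]
}

for url in category_urls(sample_tree):
    print(url)
```

In real use, `sample_tree` would be replaced by the dictionary parsed from the config_tree feed, and each resulting URL requested in turn.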

But if you inspect the requests, you should be able to pick up the original URLs as well...

I'll leave the rest to you, but everything you want is there, and it is much easier to read and access.
