I am scraping a Korean flight website through its API but not getting the required data

Posted 2024-09-30 22:15:43


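I want to scrape a Korean flight website that exposes a hidden API for its flight listings. When I test the request in Postman it works and returns results, but when I send it with Python's requests library it returns an empty record set. This is the website: https://suvarnabhumi.airportthai.co.th/flight and this is the API: 'https://apis.airportthai.co.th/'. The request payload is given in the following code: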

import requests
from requests import session
import json
from pprint import pprint


headers = {
  #"Content-Type": "application/text;",
  "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36",
  "Accept" : "*/*",
  "Accept-Encoding": "gzip, deflate, br",
  "Connection": "keep-alive",
  "Accept-Language": "en-US,en;q=0.9,ur;q=0.8",
  "sec-ch-ua": 'Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"',
  "Sec-Fetch-Mode": "cors",
  "Referer": "https://suvarnabhumi.airportthai.co.th/"
}

data = {"query":"\n      query ($site: String, $type: FlightType, $search: String, $schedule_start: String, $schedule_end: String) {\n        flights(site: $site, type: $type, search: $search, schedule_start: $schedule_start, schedule_end: $schedule_end) {\n          flight_id\n          number\n          airline_id\n          aircraft_id\n          departure_scheduled_at\n          arrival_scheduled_at\n          flight_departure {\n            id\n            site_id\n            remark\n            terminal\n            gate\n            check_in_counter\n            status_color\n            estimated_at\n            actual_at\n            scheduled_at\n            updated_at\n            flight_shares\n            __typename\n          }\n          flight_arrival {\n            id\n            site_id\n            belt\n            terminal\n            remark\n            status_color\n            estimated_at\n            first_bag_at\n            last_bag_at\n            flight_shares\n            __typename\n          }\n          origin_airport {\n            id\n            name\n            city\n            iata_code\n            icao_code\n            __typename\n          }\n          destination_airport {\n            id\n            name\n            city\n            iata_code\n            icao_code\n            __typename\n          }\n          airline {\n            id\n            iata\n            icao\n            name\n            logo\n            __typename\n          }\n          aircraft {\n            id\n            name\n            iata\n            icao\n            __typename\n          }\n          updated_at\n          __typename\n        }\n      }\n      ","variables":{"site":"bkk","type":"A","search":"","schedule_start":"2021-08-24 11:49:00","schedule_end":"2021-08-24 23:59:59"}}

url = "https://apis.airportthai.co.th/"

r = requests.post(url,  data = data, headers = headers)

print(r.json())

When it runs, it gives me an empty result:

{'data': {'flights': []}}

In Postman it shows all the data, but here it does not work. This is the API request as seen on the website: [screenshot]


3 Answers

This works for me:

import requests

url = "https://apis.airportthai.co.th/"
data = {"query":"\n      query ($site: String, $type: FlightType, $search: String,"
                " $schedule_start: String, $schedule_end: String) {\n        "
                "flights(site: $site, type: $type, search: $search, schedule_start: "
                "$schedule_start, schedule_end: $schedule_end) {\n          flight_id\n         "
                " number\n          airline_id\n          aircraft_id\n        "
                "  departure_scheduled_at\n          arrival_scheduled_at\n         "
                " flight_departure {\n            id\n            site_id\n           "
                " remark\n            terminal\n            gate\n            check_in_counter\n   "
                "         status_color\n            estimated_at\n            actual_at\n           "
                " scheduled_at\n            updated_at\n            flight_shares\n           "
                " __typename\n          }\n          flight_arrival {\n            id\n           "
                " site_id\n            belt\n            terminal\n            remark\n            "
                "status_color\n            estimated_at\n            first_bag_at\n            "
                "last_bag_at\n            flight_shares\n            __typename\n          }\n         "
                " origin_airport {\n            id\n            name\n            city\n            "
                "iata_code\n            icao_code\n            __typename\n          }\n        "
                "  destination_airport {\n            id\n            name\n            city\n            "
                "iata_code\n            icao_code\n            __typename\n          }\n          airline {\n"
                "            id\n            iata\n            icao\n            name\n            logo\n         "
                "   __typename\n          }\n          aircraft {\n            id\n            name\n           "
                " iata\n            icao\n            __typename\n          }\n          updated_at\n          "
                "__typename\n        }\n      }\n      ",
        "variables":{"site":"bkk","type":"A","search":"","schedule_start":"2021-09-10 17:10:00","schedule_end":"2021-09-10 23:59:59"}}
# json= serializes the dict as a JSON request body
response = requests.post(url, json=data)
print(response.json())

Perhaps the line holding the data was too long.
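Another likely explanation, offered as a sketch rather than a confirmed diagnosis: the question passes the payload with `data=`, which makes requests form-encode the dictionary, whereas the code above uses `json=`, which sends a JSON body. The standard-library snippet below imitates both encodings without touching the network. It assumes that `urllib.parse.urlencode(..., doseq=True)` treats a nested dictionary the way requests' form encoding does, and the shortened query string is illustrative only. Under that assumption, form encoding keeps only the keys of `variables`, so values such as `bkk` never reach the server.

```python
import json
from urllib.parse import parse_qs, urlencode

# Shortened, illustrative payload; the variables are taken from the answer above.
payload = {
    "query": "query ($site: String) { flights(site: $site) { number } }",
    "variables": {
        "site": "bkk",
        "type": "A",
        "search": "",
        "schedule_start": "2021-09-10 17:10:00",
        "schedule_end": "2021-09-10 23:59:59",
    },
}

# Form encoding (what data=<dict> amounts to): iterating the nested dict
# yields its keys, so the values are silently dropped.
form_body = urlencode(payload, doseq=True)

# JSON encoding (what json=<dict> amounts to): the structure survives intact.
json_body = json.dumps(payload)

form_variables = parse_qs(form_body)["variables"]
json_variables = json.loads(json_body)["variables"]

print(form_variables)  # ['site', 'type', 'search', 'schedule_start', 'schedule_end']
print(json_variables["site"])  # bkk
```

If this is indeed the cause, the length of the line is not the issue; the server simply receives a request whose variables are missing and answers with an empty list.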

The headers you supplied are not required.

Also, when submitting the payload, send it as JSON rather than as raw data.

Here is the working solution on my side:

Code:

import requests
import json


body = {"query": "\n      query ($site: String, $type: FlightType, $search: String, $schedule_start: String, $schedule_end: String) {\n        flights(site: $site, type: $type, search: $search, schedule_start: $schedule_start, schedule_end: $schedule_end) {\n          flight_id\n          number\n          airline_id\n          aircraft_id\n          departure_scheduled_at\n          arrival_scheduled_at\n          flight_departure {\n            id\n            site_id\n            remark\n            terminal\n            gate\n            check_in_counter\n            status_color\n            estimated_at\n            actual_at\n            scheduled_at\n            updated_at\n            flight_shares\n            __typename\n          }\n          flight_arrival {\n            id\n            site_id\n            belt\n            terminal\n            remark\n            status_color\n            estimated_at\n            first_bag_at\n            last_bag_at\n            flight_shares\n            __typename\n          }\n          origin_airport {\n            id\n            name\n            city\n            iata_code\n            icao_code\n            __typename\n          }\n          destination_airport {\n            id\n            name\n            city\n            iata_code\n            icao_code\n            __typename\n          }\n          airline {\n            id\n            iata\n            icao\n            name\n            logo\n            __typename\n          }\n          aircraft {\n            id\n            name\n            iata\n            icao\n            __typename\n          }\n          updated_at\n          __typename\n        }\n      }\n      ", "variables": {"site": "bkk", "type": "A", "search": "", "schedule_start": "2021-09-10 21:38:00", "schedule_end": "2021-09-10 23:59:59"}}
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36",
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.9,bn;q=0.8,es;q=0.7,ar;q=0.6",
    "Connection": "keep-alive",
    # Content-Length is omitted: requests computes it from the actual body.
    "Content-Type": "application/json",
    "Host": "apis.airportthai.co.th",
    "Origin": "https://suvarnabhumi.airportthai.co.th",
    "Referer": "https://suvarnabhumi.airportthai.co.th/",
    "sec-ch-ua-mobile": "?0"
    }


url = "https://apis.airportthai.co.th/"

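# Serialize the body to a JSON string so it matches the Content-Type header
r = requests.post(url, data=json.dumps(body), headers=headers)
# The flight records sit under data -> flights in the response
response = r.json()['data']['flights']
for resp in response:
    print(resp['number'])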

Output: sample flight numbers

SG 089
VZ 317
TG 466
WE 137
ZE 761
LJ 011
TG 677
TR 850
KL 820
TG 222
TG 625
ZG 051
TG 418
TG 410
WE 121
KL 804
FD 4127
TG 482
TG 585
VZ 345
TG 607
KE 651
TG 436
WE 591
VZ 2107
WE 288
TG 635
WE 268
JL 707
TG 679
OZ 741
VZ 311
FD 4303
NH 805
RJ 181
VZ 329
VZ 3971
FD 4109
LJ 001
EK 385
7C 2215
CX 617
KE 659
VZ 121
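The payloads in this thread hard-code `schedule_start` and `schedule_end` for a single day in 2021. A minimal sketch of building that window from a given timestamp follows. The helper name `build_variables` is hypothetical, the `YYYY-MM-DD HH:MM:SS` format is inferred from the payloads above, and it is an assumption that the API expects these times in the airport's local time.

```python
from datetime import datetime, time

# Timestamp format inferred from the payloads in this thread.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_variables(site: str, flight_type: str, now: datetime) -> dict:
    """Return GraphQL variables covering `now` until 23:59:59 of the same day."""
    end_of_day = datetime.combine(now.date(), time(23, 59, 59))
    return {
        "site": site,
        "type": flight_type,
        "search": "",
        "schedule_start": now.strftime(TIME_FORMAT),
        "schedule_end": end_of_day.strftime(TIME_FORMAT),
    }


# Reproduces the window used in the answer above.
variables = build_variables("bkk", "A", datetime(2021, 9, 10, 21, 38, 0))
print(variables["schedule_start"])  # 2021-09-10 21:38:00
print(variables["schedule_end"])    # 2021-09-10 23:59:59
```

In live use one would pass the current time instead of a fixed `datetime`, and place the returned dictionary under the `"variables"` key of the request body.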

Try this:

data = {"query":"query($site: String, $type: FlightType, $search: String, $schedule_start: String, $schedule_end: String) {flights(site: $site, type: $type, search: $search, schedule_start: $schedule_start, schedule_end: $schedule_end) {flight_id number airline_id aircraft_id departure_scheduled_at arrival_scheduled_at flight_departure {id site_id remark terminal gate check_in_counter status_color estimated_at actual_at scheduled_at updated_at flight_shares __typename}flight_arrival {id site_id belt terminal remark status_color estimated_at first_bag_at last_bag_at flight_shares __typename}origin_airport {id name city iata_code icao_code  __typename}destination_airport {id name city iata_code icao_code __typename}airline {id iata icao name logo  __typename}aircraft {id name iata icao __typename}updated_at __typename}}","variables":{"site":"bkk","type":"A","search":"","schedule_start":"2021-09-10 00:00:00","schedule_end":"2021-09-10 23:59:59"}}

url = "https://apis.airportthai.co.th/"

from urllib import request
import json

req = request.Request(url, method='POST')
req.add_header('Content-Type', 'application/json')
r = request.urlopen(req, data=json.dumps(data).encode())
print(r.read())
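The body read through urllib arrives as raw bytes. A minimal sketch of turning it into a list of flights is shown below. The helper name `extract_flights` is hypothetical, and the check for a top-level `errors` key assumes the server follows the usual GraphQL response convention, which this thread does not confirm. The sample inputs reuse the empty result from the question and one flight number from the sample output above.

```python
import json


def extract_flights(raw: bytes) -> list:
    """Decode a GraphQL response body and return its list of flights."""
    payload = json.loads(raw.decode("utf-8"))
    # GraphQL servers conventionally report problems under a top-level "errors" key.
    if payload.get("errors"):
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    # Fall back to an empty list if "data" or "flights" is missing or null.
    return (payload.get("data") or {}).get("flights") or []


# The empty result shown in the question.
empty = extract_flights(b'{"data": {"flights": []}}')
print(empty)  # []

# A one-record body shaped like the responses described above.
numbers = [f["number"] for f in extract_flights(b'{"data": {"flights": [{"number": "TG 466"}]}}')]
print(numbers)  # ['TG 466']
```

With the urllib code above, the call would be `extract_flights(r.read())`.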
