Getting the URL of the request sent by a page rendered with Pyppeteer

Published 2024-09-30 04:36:50


Friends,

So there is a CSV file containing name-URL pairs. And there is me, wanting to get the URL of the request that the page at each given URL sends.

One page = one request URL, though in case it matters, this request is sent repeatedly.

With this code I get <coroutine object get_req_url at 0x0000021EF0F0E8C0> instead of the desired https://*
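For context, a repr like that appears whenever an async def function is called without being awaited: the call only builds a coroutine object, it never runs the body. A minimal stdlib sketch, independent of Pyppeteer (the get_req_url below is a stand-in, not the real function):

```python
import asyncio

async def get_req_url(url):
    # stand-in for the real Pyppeteer-based coroutine
    return url

obj = get_req_url("https://example.com")  # no await: just a coroutine object
print(type(obj).__name__)                 # coroutine
obj.close()                               # silence the "never awaited" warning

# running it through an event loop actually executes the body
result = asyncio.run(get_req_url("https://example.com"))
print(result)                             # https://example.com
```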

What am I doing wrong? Given limited time, which direction should I look in, and what should I google?

Thanks in advance for your help, and have a nice day! :)

import csv
from requests_html import HTMLSession

session = HTMLSession()
inFile = 'input.csv'  #placeholder path to the name;url CSV file

async def get_req_url(url):
    #wait until the page sends a request to the given URL
    #NOTE: 'page' must be a Pyppeteer Page object created beforehand
    tRequest = await page.waitForRequest(url)
    return tRequest.url

with open(inFile, 'r', newline='', encoding='utf8') as csvfile:
    items = list(csv.reader(csvfile))

#iterate over the rows, skipping the headings row
for row in items[1:]:
    #each row comes back as a single 'name;url' field, so split it apart
    itemName, itemUrl = row[0].split(';', 1)
    print('Name: ' + itemName)
    print('URL: ' + itemUrl)

    r = session.get(itemUrl)
    r.html.render()
    trUrl = get_req_url(itemUrl)
    print(trUrl)
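As an aside on the CSV reading: if the file's delimiter really is a semicolon (an assumption based on the split on ';' above), passing delimiter=';' to csv.reader yields the name and the url as separate fields directly. A minimal sketch with hypothetical in-memory data:

```python
import csv
import io

#hypothetical stand-in for the input file, assuming 'name;url' rows
data = "name;url\nexample;https://example.com\n"

with io.StringIO(data) as f:
    reader = csv.reader(f, delimiter=';')
    next(reader)  #skip the headings row
    for name, url in reader:
        print('Name: ' + name)
        print('URL: ' + url)
```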

