Friends,
So I have a csv file containing name–URL pairs. For each given URL, I want to get the URL of the request that the page sends.
One page = one request URL, though, if it matters, that request is sent repeatedly.
With the code below I get <coroutine object get_req_url at 0x0000021EF0F0E8C0>
instead of the desired https://*
What am I doing wrong? Given limited time, which direction should I look in, what should I google?
Thanks in advance for your help, and have a nice day! :)
import csv
from requests_html import HTMLSession
from requests import Request, Session
from pyppeteer.network_manager import Request

session = HTMLSession()
inFile = csv file

async def get_req_url(url):
    tRequest = await page.waitForRequest(url)
    trUrl = tRequest.url.headers

with open(inFile, 'r', newline='', encoding='utf8') as csvfile:
    items = list(csv.reader(csvfile))
    #set index for extracting values from given data
    index = 0
    for item in items:
        #iterating index, skipping headings row
        index += 1
        if index == len(items):
            continue
        #extract the name-url pair
        itemElem = str(items[index])
        p1 = itemElem.find("'")
        p2 = itemElem.find(";")
        #decompose pair into name and url individually
        itemName = itemElem[2:p2]
        itemUrl = itemElem[(p2 + 1):-2]
        print('Name: ' + itemName)
        print('URL: ' + itemUrl)
        #continue
        r = session.get(itemUrl)
        r.html.render()
        trUrl = get_req_url(itemUrl)
        print(trUrl)
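The likely cause can be sketched as a minimal, self-contained example (no pyppeteer involved): `get_req_url` is declared `async def`, so calling it without awaiting it only creates a coroutine object, which is exactly what prints as `<coroutine object get_req_url at 0x...>`. Driving the coroutine through the event loop (e.g. with `asyncio.run`) yields the actual return value. The body of `get_req_url` here is a hypothetical stand-in for the real one:

```python
import asyncio

async def get_req_url(url):
    # hypothetical stand-in for the real coroutine body
    return url + "/resolved"

# Calling the coroutine function does NOT run it; it only builds a
# coroutine object -- printing it gives <coroutine object ...>
coro = get_req_url("https://example.com")
print(coro)
coro.close()  # avoid the "coroutine was never awaited" warning

# Running it through the event loop produces the real value
result = asyncio.run(get_req_url("https://example.com"))
print(result)  # https://example.com/resolved
```

In the original script the analogous fix would be to await `get_req_url(itemUrl)` from async code (or hand it to the event loop) instead of printing the coroutine object directly.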
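Separately, the name/url extraction via `str(items[index])` and `.find` can be replaced by unpacking the rows that `csv.reader` already produces. A sketch, under the assumption that the file has a `name,url` headings row followed by comma-separated pairs (the sample data here is hypothetical; if the real file is semicolon-delimited, as the `.find(";")` in the original suggests, pass `delimiter=';'` to `csv.reader`):

```python
import csv
from io import StringIO

# hypothetical sample standing in for the real inFile contents
sample = "name,url\nExample,https://example.com\n"

with StringIO(sample) as csvfile:  # use open(inFile, ...) for a real file
    reader = csv.reader(csvfile)
    next(reader)                   # skip the headings row
    for itemName, itemUrl in reader:
        print('Name: ' + itemName)
        print('URL: ' + itemUrl)
```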
No answers yet