How to request multiple URLs at once with urllib in Python

Posted 2024-10-03 09:10:01


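I'm writing a program that downloads images from the internet, and I'd like to speed it up by making multiple requests at once.

So I wrote some code, which you can see here at GitHub.

Right now I can only request a web page like this: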

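from urllib.error import URLError
from urllib.request import Request, urlopen

def myrequest(url):
    # Send a browser-like User-Agent with the request
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    # Keep retrying until the page is downloaded
    while True:
        try:
            return urlopen(req).read()
        except URLError:  # also covers HTTPError, e.g. 504 Gateway Time-out
            print("failed to connect to \n{}".format(url))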

url = "http://www.mangahere.co/manga/mysterious_girlfriend_x"
webpage_read = myrequest(url).decode("utf-8")

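The while loop is there because I really do want to download every single picture, so I keep retrying until the request succeeds (nothing goes wrong except urllib.error.HTTPError: HTTP Error 504: Gateway Time-out).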
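An unbounded retry loop can spin forever if a URL never recovers. As a minimal sketch of a bounded alternative, the function below gives up after a fixed number of attempts and waits a little longer between each one. The name fetch_with_retry and the parameters retries, delay and opener are hypothetical and not part of the original code; opener exists only so that urlopen can be swapped out.

```python
import time
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_with_retry(url, retries=5, delay=1.0, opener=urlopen):
    """Return the response body, retrying on URLError up to `retries` times.

    Hypothetical helper: `retries`, `delay` and `opener` are illustrative
    parameters, not something the original code defines.
    """
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            with opener(req, timeout=30) as resp:
                return resp.read()
        except URLError as exc:  # HTTPError (e.g. 504) is a subclass
            last_error = exc
            print("attempt {} failed for {}: {}".format(attempt, url, exc))
            if attempt < retries:
                # Wait longer after each failure before trying again
                time.sleep(delay * attempt)
    # Every attempt failed: surface the last error to the caller
    raise last_error
```

With this shape, a URL that keeps failing raises an exception after the last attempt, so the caller can decide whether to skip it or queue it for later.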

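My question is: how can I run this many times at once?

My idea was to have a "commander" that runs 5 (or 85) Python scripts, gives each one a URL, and collects the web pages from them once they finish, but that is definitely a silly solution :)

EDIT: I tried _thread, but it does not seem to speed up the program. This should be the way to solve the problem, so am I doing something wrong? That is my new question. You can use the link to get to my code on GitHub.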

[code block missing in the source]

I use it here:

for url_ep in urls_eps:
    # Each entry looks like "<chapter url> <number of pages>"
    url, maxep = url_ep.split()
    maxep = int(maxep)
    # Chapter id: last path segment without its first two characters
    chap = url.split("/")[-1][2:]
    if "." in chap:
        chap = chap.replace(".", "")
    else:
        chap = "{}0".format(chap)

    for ep in range(1, maxep + 1):
        ted = time.time()
        # File name: chapter id followed by the page number padded to two digits
        name = "{}{}".format(chap, str(ep).zfill(2))
        if name in downloaded:
            continue

        # Start one thread per picture
        _thread.start_new_thread(thrue_thread_download_pics, (path, url, ep, name))

# Poll the shared counter until every download has reported completion
checker = -1
while finished != goal:
    if finished != checker:
        checker = finished
        print("{} of {} downloaded".format(finished, goal))
    time.sleep(0.1)

1 answer
User
#1 · Posted 2024-10-03 09:10:01

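Requests Futures is built on top of the very popular requests library. It runs each request in a background thread pool (via concurrent.futures), so session.get returns a future immediately instead of blocking: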

from requests_futures.sessions import FuturesSession

session = FuturesSession()

# These requests will run at the same time
future_one = session.get('http://httpbin.org/get')
future_two = session.get('http://httpbin.org/get?foo=bar')

# Get the first result
response_one = future_one.result()
print(response_one.status_code)
print(response_one.text)

# Get the second result
response_two = future_two.result()
print(response_two.status_code)
print(response_two.text)
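If you would rather stay with urllib, as the question asks, the same idea can be sketched with only the standard library by using concurrent.futures.ThreadPoolExecutor directly. This is a rough sketch under assumptions, not a drop-in replacement: the names fetch and fetch_all, the fetch_one parameter, and the default of 5 workers are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import Request, urlopen


def fetch(url, timeout=30):
    """Download one URL with urllib and return the raw bytes."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read()


def fetch_all(urls, fetch_one=fetch, max_workers=5):
    """Fetch URLs concurrently in a thread pool.

    Returns two dicts: {url: body} for successes and {url: exception}
    for failures. `fetch_one` and `max_workers` are illustrative
    parameters (assumptions), not part of any library API.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download up front; the pool runs them in parallel
        future_to_url = {pool.submit(fetch_one, u): u for u in urls}
        # Collect results in whatever order the downloads finish
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                errors[url] = exc
    return results, errors


# Example call (needs network access):
# pages, failed = fetch_all(["http://httpbin.org/get",
#                            "http://httpbin.org/get?foo=bar"])
```

Threads help here because the downloads spend most of their time waiting on the network. Leaving the with block waits for all submitted downloads, so no manual counter or sleep loop is needed.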
