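<p>Just use any asynchronous library. I think the async flavours of requests, such as <a href="https://github.com/kennethreitz/grequests" rel="nofollow noreferrer">grequests</a>, txrequests, requests-futures and requests-threads, would suit you best. Here is the code sample from the grequests README:</p>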
<blockquote>
<pre><code>import grequests
urls = [
'http://www.heroku.com',
'http://python-tablib.org',
'http://httpbin.org',
'http://python-requests.org',
'http://fakedomain/',
'http://kennethreitz.com'
]
</code></pre>
<p>Create a set of unsent Requests:</p>
<pre><code>rs = (grequests.get(u) for u in urls)
</code></pre>
<p>Send them all at the same time:</p>
<pre><code>grequests.map(rs)
</code></pre>
</blockquote>
<p>Using or learning the other modules mentioned, such as requests-threads, may be slightly more involved, especially on Python 2:</p>
<pre><code>from twisted.internet.defer import inlineCallbacks
from twisted.internet.task import react
from requests_threads import AsyncSession

# Session backed by a pool of 100 worker threads
session = AsyncSession(n=100)

@inlineCallbacks
def main(reactor):
    # Start all 100 requests without waiting for any of them
    responses = []
    for i in range(100):
        responses.append(session.get('http://httpbin.org/get'))

    # Wait for each response in turn and print it
    for response in responses:
        r = yield response
        print(r)

if __name__ == '__main__':
    react(main)
</code></pre>
<p><a href="https://medium.com/@santhoshhari/efficient-web-scraping-with-pythons-asynchronous-programming-6b9e730f1ff7" rel="nofollow noreferrer">asyncio</a>和{a3}可能更值得注意,但是,我想,学习一个已经熟悉的模块的版本会更容易。在</p>
<p>Multithreading is not necessary, but you could try <a href="https://stackoverflow.com/questions/38671803/multithread-python-requests">multithreading</a> or, perhaps better, multiprocessing, and see which one performs best.</p>
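<p>If you do want to try the thread-based route, something along these lines might be a starting point. It is only a sketch using the standard library's <code>concurrent.futures</code>; <code>fake_fetch</code>, <code>fetch_with_threads</code> and <code>timed</code> are hypothetical helpers, and <code>fake_fetch</code> sleeps rather than calling <code>requests.get</code>, so replace it with your real request function.</p>

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_fetch(url):
    """Hypothetical stand-in for requests.get(url); it only sleeps."""
    time.sleep(0.05)  # simulates network latency
    return url, 200   # pretend every request succeeded


def fetch_with_threads(urls, fetch=fake_fetch, max_workers=8):
    # pool.map runs fetch in worker threads and yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def timed(func, *args, **kwargs):
    # Small helper to compare approaches on your own workload.
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"http://example.invalid/{i}" for i in range(16)]
    _, sequential = timed(lambda: [fake_fetch(u) for u in urls])
    _, threaded = timed(fetch_with_threads, urls)
    print("sequential:", round(sequential, 2), "s")
    print("threaded:  ", round(threaded, 2), "s")
```

<p><code>ProcessPoolExecutor</code> exposes the same <code>map</code> interface, so you can swap it in to compare, as long as the fetch function is defined at module level so it can be pickled.</p>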