In Python 3.7, I use the urllib.request.urlretrieve(..) function to download a large file from a URL. In the documentation (https://docs.python.org/3/library/urllib.request.html), I read the following just above the urllib.request.urlretrieve(..) entry:
Legacy interface
The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.
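To future-proof my code, I am looking for an alternative. The official Python documentation does not name a specific one, but urllib.request.urlopen(..) looks like the most direct candidate. It is at the top of the documentation page.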
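Unfortunately, alternatives such as urlopen(..) do not provide the reporthook argument. This argument is a callable that you pass to urlretrieve(..). In turn, urlretrieve(..) calls it regularly with the following arguments:
block number
read size
total file size
I use it to update a progress bar. That is why I miss the reporthook argument in the alternatives.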
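To illustrate how such a callback is used, here is a minimal sketch of a reporthook that prints a percentage. The name `make_progress_hook` and the output format are assumptions made for this example; only the three-argument call signature comes from urlretrieve(..) itself.

```python
import sys


def make_progress_hook(stream=sys.stderr):
    """Return a reporthook-style callable that writes progress to `stream`.

    Hypothetical helper for illustration; not part of urllib.
    """
    def hook(blocknum, blocksize, totalsize):
        # urlretrieve passes the number of blocks transferred so far,
        # the block size in bytes, and the total size (-1 if unknown).
        received = blocknum * blocksize
        if totalsize > 0:
            # The last block is usually partial, so cap the value at 100.
            percent = min(100.0, received * 100.0 / totalsize)
            stream.write("\r%5.1f%%" % percent)
        else:
            stream.write("\r%d bytes" % received)
        stream.flush()
    return hook
```

It would be passed as `urllib.request.urlretrieve(url, filename, reporthook=make_progress_hook())`, where `url` and `filename` are whatever the download needs.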
I found out that urlretrieve(..) simply uses urlopen(..). See the request.py code file in the Python 3.7 installation (Python37/Lib/urllib/request.py):
_url_tempfiles = []
def urlretrieve(url, filename=None, reporthook=None, data=None):
    """
    Retrieve a URL into a temporary location on disk.

    Requires a URL argument. If a filename is passed, it is used as
    the temporary file location. The reporthook argument should be
    a callable that accepts a block number, a read size, and the
    total file size of the URL target. The data argument should be
    valid URL encoded data.

    If a filename is passed and the URL points to a local resource,
    the result is a copy from local file to new file.

    Returns a tuple containing the path to the newly created
    data file as well as the resulting HTTPMessage object.
    """
    url_type, path = splittype(url)

    with contextlib.closing(urlopen(url, data)) as fp:
        headers = fp.info()

        # Just return the local path and the "headers" for file://
        # URLs. No sense in performing a copy unless requested.
        if url_type == "file" and not filename:
            return os.path.normpath(path), headers

        # Handle temporary file setup.
        if filename:
            tfp = open(filename, 'wb')
        else:
            tfp = tempfile.NamedTemporaryFile(delete=False)
            filename = tfp.name
            _url_tempfiles.append(filename)

        with tfp:
            result = filename, headers
            bs = 1024*8
            size = -1
            read = 0
            blocknum = 0
            if "content-length" in headers:
                size = int(headers["Content-Length"])

            if reporthook:
                reporthook(blocknum, bs, size)

            while True:
                block = fp.read(bs)
                if not block:
                    break
                read += len(block)
                tfp.write(block)
                blocknum += 1
                if reporthook:
                    reporthook(blocknum, bs, size)

    if size >= 0 and read < size:
        raise ContentTooShortError(
            "retrieval incomplete: got only %i out of %i bytes"
            % (read, size), result)

    return result
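From all this, I see three possible decisions:
1. I keep my code unchanged. Hopefully the urlretrieve(..) function won't be deprecated anytime soon.
2. I write myself a replacement function that behaves like urlretrieve(..) on the outside and uses urlopen(..) on the inside. In effect, this function would be a copy-paste of the code above. Compared with using the official urlretrieve(..), that feels unclean.
3. I write myself a replacement function that behaves like urlretrieve(..) on the outside and uses something entirely different on the inside. But hey, why would I do that? urlopen(..) is not deprecated, so why not use it?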
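For reference, the second option might look roughly like the sketch below. It is a trimmed-down illustration and not a drop-in replacement: the name `download_with_hook` is made up for this example, a filename is always required, and the temporary-file and file:// shortcuts of urlretrieve(..) are left out.

```python
import urllib.error
import urllib.request


def download_with_hook(url, filename, reporthook=None, blocksize=8192):
    """Download `url` to `filename`, calling `reporthook` like urlretrieve does.

    Hypothetical helper built on urlopen; simplified on purpose.
    """
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        headers = response.headers
        length = headers.get("Content-Length")
        size = int(length) if length is not None else -1  # -1 means unknown
        read = 0
        blocknum = 0
        if reporthook:
            reporthook(blocknum, blocksize, size)
        while True:
            block = response.read(blocksize)
            if not block:
                break
            read += len(block)
            out.write(block)
            blocknum += 1
            if reporthook:
                reporthook(blocknum, blocksize, size)
    # Same completeness check as urlretrieve: fail on a truncated transfer.
    if size >= 0 and read < size:
        raise urllib.error.ContentTooShortError(
            "retrieval incomplete: got only %i out of %i bytes" % (read, size),
            (filename, headers))
    return filename, headers
```

Any callable that accepts the block number, block size, and total size can be passed as `reporthook`, so an existing progress-bar callback should work unchanged.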
What decision would you make?