Should I switch from "urllib.request.urlretrieve(..)" to "urllib.request.urlopen(..)"?

Posted 2024-09-28 01:30:27


1. Deprecation problem

In Python 3.7, I use the urllib.request.urlretrieve(..) function to download a large file from a URL. In the documentation (https://docs.python.org/3/library/urllib.request.html), I read the following just above the urllib.request.urlretrieve(..) docs:

Legacy interface
The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.

2. Searching for an alternative

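To keep my code future-proof, I'm looking for an alternative. The official Python docs don't mention a specific one, but urllib.request.urlopen(..) looks like the most straightforward candidate. It's at the top of the docs page.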

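Unfortunately, alternatives like urlopen(..) don't provide a reporthook argument. This argument is a callable you pass to the urlretrieve(..) function. In turn, urlretrieve(..) calls it once before the first read and then after every block it reads, with the following arguments:

block number
block size
total file size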

I use it to update a progress bar. That's why I miss the reporthook argument in the alternatives.
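To make that concrete, a reporthook for a simple text progress display might look like the sketch below. The names format_progress and print_progress are made up for illustration and are not part of urllib. The sketch assumes a total size of -1 means "unknown", which is what urlretrieve(..) passes when the response has no Content-Length header.

```python
import sys


def format_progress(blocknum, blocksize, totalsize):
    """Turn the three reporthook arguments into a short status string."""
    downloaded = blocknum * blocksize
    if totalsize <= 0:
        # No usable Content-Length: only the byte count is known.
        return f"{downloaded} bytes"
    # The last block is usually partial, so clamp to the total size.
    downloaded = min(downloaded, totalsize)
    percent = downloaded * 100 // totalsize
    return f"{percent:3d}% ({downloaded}/{totalsize} bytes)"


def print_progress(blocknum, blocksize, totalsize):
    """Hypothetical callable to pass as reporthook= to urlretrieve(..)."""
    sys.stderr.write("\r" + format_progress(blocknum, blocksize, totalsize))
    sys.stderr.flush()
```

It would be passed as urllib.request.urlretrieve(url, filename, reporthook=print_progress).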

3. urlretrieve(..) vs urlopen(..)

I discovered that urlretrieve(..) simply uses urlopen(..). See the request.py code file in the Python 3.7 installation (Python37/Lib/urllib/request.py):

_url_tempfiles = []
def urlretrieve(url, filename=None, reporthook=None, data=None):
    """
    Retrieve a URL into a temporary location on disk.

    Requires a URL argument. If a filename is passed, it is used as
    the temporary file location. The reporthook argument should be
    a callable that accepts a block number, a read size, and the
    total file size of the URL target. The data argument should be
    valid URL encoded data.

    If a filename is passed and the URL points to a local resource,
    the result is a copy from local file to new file.

    Returns a tuple containing the path to the newly created
    data file as well as the resulting HTTPMessage object.
    """
    url_type, path = splittype(url)

    with contextlib.closing(urlopen(url, data)) as fp:
        headers = fp.info()

        # Just return the local path and the "headers" for file://
        # URLs. No sense in performing a copy unless requested.
        if url_type == "file" and not filename:
            return os.path.normpath(path), headers

        # Handle temporary file setup.
        if filename:
            tfp = open(filename, 'wb')
        else:
            tfp = tempfile.NamedTemporaryFile(delete=False)
            filename = tfp.name
            _url_tempfiles.append(filename)

        with tfp:
            result = filename, headers
            bs = 1024*8
            size = -1
            read = 0
            blocknum = 0
            if "content-length" in headers:
                size = int(headers["Content-Length"])

            if reporthook:
                reporthook(blocknum, bs, size)

            while True:
                block = fp.read(bs)
                if not block:
                    break
                read += len(block)
                tfp.write(block)
                blocknum += 1
                if reporthook:
                    reporthook(blocknum, bs, size)

    if size >= 0 and read < size:
        raise ContentTooShortError(
            "retrieval incomplete: got only %i out of %i bytes"
            % (read, size), result)

    return result

4. Conclusion

From all this, I see three possible decisions:

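I keep my code unchanged. Let's hope the urlretrieve(..) function won't get deprecated anytime soon.
I write myself a replacement function that behaves like urlretrieve(..) on the outside and uses urlopen(..) on the inside. Actually, such a function would be a copy-paste of the code above. It feels unclean to do that, compared to using the official urlretrieve(..).
I write myself a replacement function that behaves like urlretrieve(..) on the outside and uses something entirely different on the inside. But hey, why would I do that? urlopen(..) is not deprecated, so why not use it?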
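For reference, the second option could be sketched roughly like this. It is a trimmed-down variant, not a copy of the stdlib code. The name download, the blocksize parameter and the plain OSError (instead of ContentTooShortError) are assumptions made for illustration. It also requires a target filename and drops the temporary-file and file:// shortcuts that urlretrieve(..) has.

```python
import contextlib
import urllib.request


def download(url, filename, reporthook=None, blocksize=8 * 1024):
    """Copy the resource at `url` into `filename`, reporting progress.

    `reporthook`, if given, is called as
    reporthook(blocknum, blocksize, totalsize), with totalsize == -1
    when the response carries no Content-Length header.
    Returns the number of bytes written.
    """
    with contextlib.closing(urllib.request.urlopen(url)) as response:
        length = response.info().get("Content-Length")
        totalsize = int(length) if length is not None else -1
        blocknum = 0
        written = 0
        with open(filename, "wb") as out:
            # Same call pattern as urlretrieve: once up front,
            # then once after every block.
            if reporthook:
                reporthook(blocknum, blocksize, totalsize)
            while True:
                block = response.read(blocksize)
                if not block:
                    break
                out.write(block)
                written += len(block)
                blocknum += 1
                if reporthook:
                    reporthook(blocknum, blocksize, totalsize)

    # Mirror urlretrieve's check for a truncated transfer.
    if totalsize >= 0 and written < totalsize:
        raise OSError(
            f"retrieval incomplete: got only {written} out of {totalsize} bytes"
        )
    return written
```

Since the hook is called with the same three arguments, an existing reporthook written for urlretrieve(..) should work with it unchanged.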

What decision would you take?
