Python 3: error 403 when downloading a file

Posted 2024-09-27 23:20:40


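I'm using a script that scrapes download links from an HTML page (sent to me by email) and then downloads the files. The script has been running for about 6 months, but last week I started getting a "403 error".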

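As far as I can tell, the site is blocking me because it thinks I'm a bot (I can't deny that). But I'm not scraping the site's HTML, I'm only trying to download a file with requests.get. I get this error from one specific site only; files from other sites download fine.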

I tried setting headers={'User-Agent': 'Mozilla/5.0'}, but that didn't help.
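A bare User-Agent may not be the only thing a server looks at. The sketch below puts a slightly fuller, browser-like header set on a requests.Session and previews what would be sent, without contacting the site. The header values, the example.com URLs, and the helper names make_session and preview_headers are assumptions for illustration; which headers this particular site checks (if any) is not known.

```python
import requests

# Assumed header set: which headers (if any) the site actually checks is unknown.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
    ),
    "Accept": "*/*",
    "Accept-Language": "en-US,en;q=0.9",
}


def make_session(referer=None):
    """Return a requests.Session that sends browser-like headers on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    if referer:
        # Hypothetical: the page the download link was found on.
        session.headers["Referer"] = referer
    return session


def preview_headers(session, url):
    """Build the request without sending it and return the headers that would go out."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return dict(prepared.headers)


if __name__ == "__main__":
    s = make_session(referer="https://example.com/page.html")
    for name, value in preview_headers(s, "https://example.com/file.zip").items():
        print(f"{name}: {value}")
```

A session also keeps cookies between requests, so a cookie set by an earlier page on the same site is sent along with the download request.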

Here is the function that downloads the file:

import os

import requests


def download_file(dl_url, local_save_path):
    """Download URL to given path"""
    # username, password, file_size, download_perc and percentage()
    # are defined elsewhere in the script.
    user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
    auth = (username.get(), password.get())
    headers = {'User-Agent': user_agent}

    local_filename = dl_url.split('/')[-1]
    complete_name = os.path.join(local_save_path, local_filename)

    # Get file size
    r = requests.head(dl_url, auth=auth, verify=False, headers=headers)
    try:
        dl_file_size = int(r.headers['content-length'])
        file_size.set(str(int(dl_file_size * (10 ** -6))) + "MB")
        size_known = True
    except KeyError:
        size_known = False

    # NOTE the stream=True parameter
    r = requests.get(dl_url, stream=True, auth=auth, verify=False, headers=headers)
    r.raise_for_status()  # raise on 403 instead of saving the error page to disk

    dnl_sum = 0  # bytes written so far
    with open(complete_name, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                dnl_sum += len(chunk)
                if size_known:
                    download_perc.set(percentage(dl_file_size, dnl_sum))
                else:
                    print(dnl_sum)
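Before changing anything else, it can help to look at what the server actually sends back with the 403, since the body or the Server header sometimes hints at what is doing the blocking. This is a minimal sketch; summarize_response and fetch_or_explain are hypothetical helper names, not part of the script above.

```python
import requests


def summarize_response(response, body_chars=200):
    """Collect the parts of a response that may explain why it was refused."""
    return {
        "status": response.status_code,
        "reason": response.reason,
        "server": response.headers.get("Server"),
        "body_start": response.text[:body_chars],
    }


def fetch_or_explain(url, **kwargs):
    """GET url with stream=True; return the response if it succeeded,
    otherwise raise an error that carries a summary of the refusal."""
    response = requests.get(url, stream=True, **kwargs)
    if not response.ok:
        raise RuntimeError(f"download refused: {summarize_response(response)}")
    return response
```

The same auth, verify and headers arguments used in the download function can be passed through as keyword arguments, for example fetch_or_explain(dl_url, auth=auth, headers=headers).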


1 Answer

User

#1 · Posted 2024-09-27 23:20:40

Have you tried using a proxy? You could use Tor, which gives you a changing IP address so the site can't identify you.

Try this: https://techoverflow.net/blog/2015/02/06/using-python-requests-over-tor/
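As a rough sketch of the proxy suggestion, requests can route traffic through a local Tor client using a SOCKS proxy URL. This assumes a Tor daemon is listening on 127.0.0.1:9050 (its default SOCKS port) and that SOCKS support is installed (pip install requests[socks]); tor_proxies and get_via_tor are hypothetical helper names. A site that blocks Tor exit nodes may still answer 403, so this is worth trying rather than a guaranteed fix.

```python
import requests


def tor_proxies(host="127.0.0.1", port=9050):
    """Proxy mapping for a local Tor SOCKS port.

    The socks5h scheme makes the proxy resolve hostnames too,
    so DNS lookups also go through Tor.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}


def get_via_tor(url, **kwargs):
    """Send a GET request through the local Tor proxy (assumed to be running)."""
    kwargs.setdefault("timeout", 60)  # Tor circuits can be slow
    return requests.get(url, proxies=tor_proxies(), **kwargs)
```

In the download function above, the same mapping could be passed as proxies=tor_proxies() to both the requests.head and requests.get calls.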
