How do I download a file from the web with Python? (no URL)

Posted 2024-09-29 21:57:23


I am a beginner trying to extract files from this web page, http://www.cnmv.es/ipps/ (Spanish company information).

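The problem is that I first have to fill in several fields (company, semester, year) and then click Download. In a browser this starts the download of a .zip file containing one or more .xbrl files, but I cannot find a way to do the same in Python with requests or something similar (the download button has no URL), keeping the file contents in a variable and saving the file to a path.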

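What I have tried is based on things I found online about similar problems; I read a bit about AJAX, JSON, BeautifulSoup... but without success. My current script is wrong, because all I get back is the response rather than the target file. I would appreciate your help.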

Here is a draft of what I want; it is very close to my actual script.


from itertools import product

from requests import Session

session = Session()

companies = ['']  # company names (fill in)
semesters = ['']  # semesters (fill in)
years = ['']      # years (fill in)

# Iterate over every company/semester/year combination.
for company, semester, year in product(companies, semesters, years):

    # Request the data for this combination.
    response = session.post(
        url='http://www.cnmv.es/ipps/',
        data={
            'wDescargas$drpEntidades': company,  # search parameters
            'wDescargas$drpPeriodos': semester,
            'wDescargas$drpEjercicios': year,
        },
        headers={
            'Referer': 'http://www.cnmv.es/ipps/',
        },
    )

    # Save the response body to a path.
    # As posted, this is the page the server returns, not the target .zip.
    data = response.content
    filename = semester + company + year

    # Use a separate name for the file handle so the session is not shadowed.
    with open(filename, 'wb') as f:
        f.write(data)

Thank you very much for your help.


1 answer
User
#1 · Posted 2024-09-29 21:57:23

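I suggest using the selenium package with Python 3 to automate the whole process, because the site is built on the .NET framework and expects many POST values besides 'wDescargas$drpEntidades', 'wDescargas$drpPeriodos' and 'wDescargas$drpEjercicios'.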
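
A rough sketch of what that automation could look like with Selenium and Chrome. Several things here are assumptions rather than facts about the site: that the three fields are `<select>` drop-downs, that their options can be chosen by visible text, that the download button can be located by a `name` attribute (passed in as the hypothetical `button_name`, since the real one is not shown in the question), and that a fixed pause is long enough for the download to finish.

```python
import time
from pathlib import Path


def chrome_download_prefs(download_dir):
    """Chrome preferences that save downloads to download_dir without a prompt."""
    return {
        "download.default_directory": str(Path(download_dir).resolve()),
        "download.prompt_for_download": False,
    }


def download_report(company, semester, year, download_dir, button_name):
    """Fill in the three fields and click the download button (sketch only)."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import Select

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_download_prefs(download_dir))
    driver = webdriver.Chrome(options=options)
    try:
        driver.get("http://www.cnmv.es/ipps/")
        choices = [
            ("wDescargas$drpEntidades", company),
            ("wDescargas$drpPeriodos", semester),
            ("wDescargas$drpEjercicios", year),
        ]
        for field_name, text in choices:
            # Look the element up again each time: a postback may reload the page.
            # Assumption: each field is a <select> whose options match `text`.
            Select(driver.find_element(By.NAME, field_name)).select_by_visible_text(text)
        # Assumption: `button_name` is the name attribute of the download button.
        driver.find_element(By.NAME, button_name).click()
        # Assumption: ten seconds is enough for the .zip to arrive.
        time.sleep(10)
    finally:
        driver.quit()
```

The browser then saves the .zip into `download_dir`, from where it can be opened with the standard `zipfile` module. If the page needs an extra click to reach the download form, that step would have to be added before the fields are filled in.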

Just look at the site's source code and you will see why the requests package is not a good choice here:

<form name="form1" method="post" action="./" id="form1">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwULLTE5MTQ0MzYzMDEPFgQeD01vZG9WaXN0YVBhZ2luYQspYUNOTVYuWEJSTElQUC5Nb2RvVmlzdGFQYWdpbmEsIENOTVYuWEJSTElQUCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwBHgxXZWJFbnRpZGFkZXMyrQIAAQAAAP////8BAAAAAAAAAAwCAAAAQ0NOTVYuWEJSTElQUCwgVmVyc2lvbj0xLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPW51bGwFAQAAAChDTk1WLlhCUkxJUFAuRW50aXRpZXMuV2ViRW50aWRhZGVzRUVMaXN0AwAAAA1MaXN0YDErX2l0ZW1zDExpc3RgMStfc2l6ZQ9MaXN0YDErX3ZlcnNpb24EAAAkQ05NVi5YQlJMSVBQLkVudGl0aWVzLldlYkVudGlkYWRFRVtdAgAAAAgIAgAAAAkDAAAAAAAAAAAAAAAHAwAAAAABAAAAAAAAAAQiQ05NVi5YQlJMSVBQLkVudGl0aWVzLldlYkVudGlkYWRFRQIAAAALFgICAw9kFgQCBw9kFgJmD2QWBgIBDw8WBh4IQ3NzQ2xhc3MFKWJ0biBidG4tT3BjaW9uIGNlbnRlci1ibG9jayBBY3Rpdm8gQWN0aXZvHgdFbmFibGVkaB4EXyFTQgICZBYCAgEPDxYCHghJbWFnZVVybAUWfi9pbWFnZXMvT3BjaW9uMU9uLmdpZmRkAgMPDxYEHwIFG2J0biBidG4tT3BjaW9uIGNlbnRlci1ibG9jax8EAgJkFgICAQ8PFgIfBQUXfi9pbWFnZXMvT3BjaW9uMk9mZi5naWZkZAIHD2QWBGYPZBYCZg9kFgwCAw8QZGQWAWZkAgUPZBYCAgMPEGRkFgBkAgcPZBYCAgMPEGRkFgBkAgkPZBYCAgMPEGRkFgBkAgsPZBYCAgMPEGRkFgBkAg0PZBYCAgMPEGRkFgBkAgEPZBYCAgEPZBYGAggPZBYCAgMPEGRkFgBkAgoPZBYCAgMPEGRkFgBkAgwPZBYEAgMPEGRkFgFmZAIHDxBkZBYAZAIJD2QWAmYPZBYEAgEPZBYEAgEPDxYCHwUFFn4vaW1hZ2VzL09wY2lvbjFPbi5naWZkZAIDDxYCHgRUZXh0BR5WaXN1YWxpemFjacOzbiBkZSBpbmZvcm1lcyBJUFBkAgMPZBYGZg9kFgICAw8WAh8GBaYEIGEgbGEgaGVycmFtaWVudGEgZGUgY29uc3VsdGEgeSBkZXNjYXJnYSBkZSBpbmZvcm1lcyBYQlJMIGNvbiBsb3MgZXN0YWRvcyBmaW5hbmNpZXJvcyBlbnZpYWRvcyBhIGxhIENOTVYgZW4gZWwgbWFyY28gZXN0YWJsZWNpZG8gcG9yIGxhcyBDaXJjdWxhcmVzIDxhIGhyZWY9Jy9JUFAvdGF4b25vbWlhLzIwMTYtMDYtMDEvaXBwXzIwMTYtMDYtMDEuemlwJyB0aXRsZT0nSXIgYSBDaXJjdWxhciA1LzIwMTUnPjUvMjAxNTwvYT4sIDxhIGhyZWY9Jy9JUFAvdGF4b25vbWlhLzIwMDgtMDEtMDEvaXBwXzIwMDgtMDEtMDEuemlwJyB0aXRsZT0nSXIgYSBDaXJjdWxhciAxLzIwMDgnPjEvMjAwODwvYT4geSA8YSBocmVmPScvSVBQL3RheG9ub21pYS8yMDA1LTA2LTMwL2lwcF8yMDA1LTA2LTMwX3YxLjIyLnppcCcgdGl0bGU9J0lyIGEgQ2lyY3VsYXIgMS8yMDA1Jz4xLzIwMDU8L2E+IHBhcmEgZWwgcmVwb3J0ZSBkZSBsYSBJbmZvcm1hY2nDs24gUMO6YmxpY2EgUGVyacOz
ZGljYSBkZSBsYXMgZW50aWRhZGVzIGNvbiB2YWxvcmVzIGFkbWl0aWRvcyBhIGNvdGl6YWNpw7NuLmQCAQ9kFgICDQ9kFgQCAQ9kFgYCAQ9kFgICAw8QZGQWAGQCAw9kFgICAw8QZGQWAGQCBQ9kFgICAw8QZGQWAGQCAg9kFgYCAQ9kFgZmDxBkZBYAZAIBDxBkZBYAZAIDDxBkZBYBZmQCAw9kFgoCAw8QZGQWAGQCBQ8QZGQWAGQCBw8QZGQWAGQCDQ8QZGQWAQIBZAIRDxBkZBYBZmQCBQ9kFgoCAw8QZGQWAGQCBQ8QZGQWAGQCBw8QZGQWAGQCDQ8QZGQWAQIBZAIRDxBkZBYBZmQCAg9kFgQCAQ9kFggCCQ8QZGQWAGQCEw8QZGQWAGQCFw8QZGQWAGQCGw8QZGQWAGQCAw9kFgICAQ88KwARAgEQFgAWABYADBQrAABkGAMFC212Q29udGVuaWRvDw9kZmQFH3dEZXNjYXJnYXNfTGlzdGFkbyRncmlkSW5mb3JtZXMPZ2QFDG12TW9kb1BhZ2luYQ8PZGZkIeVZrmpOfPFqcHtyXhTP+ho+VemJP+fQZiuA1wu5cOc=" />
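
To see which extra values a browser would submit along with the three drop-downs, the hidden inputs of the form can be listed programmatically. This is a minimal sketch using only the standard library; `hidden_fields` and `HiddenInputCollector` are illustrative names, and the sample HTML is a shortened stand-in for the real page, with a placeholder in place of the long `__VIEWSTATE` value.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect the name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields


# Shortened stand-in for the page source; the __VIEWSTATE value is a placeholder.
sample = """
<form name="form1" method="post" action="./" id="form1">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="PLACEHOLDER" />
<select name="wDescargas$drpEntidades"></select>
</form>
"""

print(hidden_fields(sample))
```

In an ASP.NET Web Forms page like this one, values such as `__VIEWSTATE` typically have to be read from a fresh copy of the page and sent back unchanged with every POST, which is the bookkeeping a real browser driven by Selenium does for you.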
