Download "servlets" to a text file instead of ex

Posted 2024-07-08 11:02:06


I have the following URL:

https://www.oslobors.no/ob/servlets/excel?type=history&columns=TIME%2C+BUYER%2C+SELLER%2C+PRICE%2C+VOLUME%2C+TYPE&format[TIME]=dd.mm.YY%20hh:MM:ss&format[PRICE]=%23%2C%23%230.00%23%23%23&format[VOLUME]=%23%2C%23%230&header[TIME]=Statoil&header[BUYER]=Kj%C3%B8per&header[SELLER]=Selger&header[PRICE]=Pris&header[VOLUME]=Volum&header[TYPE]=Type&view=DELAYED&source=feed.ose.trades.INSTRUMENTS&filter=ITEM_SECTOR%3D%3DsSTL.OSE%26%26DELETED!%3Dn1&stop=now&start=1493935200000&ascending=true

I can open it in Excel (remove the extra "l" from "tinyurll"):

Sub Get_File()
    Dim oXMLHTTP As Object: Set oXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP")
    Dim strURL As String: strURL = "http://tinyurll.com/api-create.php?url=https://www.oslobors.no/ob/servlets/excel?type=history&columns=TIME%2C+BUYER%2C+SELLER%2C+PRICE%2C+VOLUME%2C+TYPE&format[TIME]=dd.mm.YY%20hh:MM:ss&format[PRICE]=%23%2C%23%230.00%23%23%23&format[VOLUME]=%23%2C%23%230&header[TIME]=Statoil&header[BUYER]=Kj%C3%B8per&header[SELLER]=Selger&header[PRICE]=Pris&header[VOLUME]=Volum&header[TYPE]=Type&view=DELAYED&source=feed.ose.trades.INSTRUMENTS&filter=ITEM_SECTOR%3D%3DsSTL.OSE%26%26DELETED!%3Dn1&stop=now&start=1493935200000&ascending=true"
    With oXMLHTTP: .Open "GET", strURL, False: .send: End With

    strURL = oXMLHTTP.responseText

    With Workbooks: .Open strURL, IgnoreReadOnlyRecommended:=True: End With
End Sub

But I would like to use Python to download the contents to a text file instead of an Excel file.


3 answers

I downloaded it to an ".xlsx" file using the following code:

import requests
import time

url = 'https://www.oslobors.no/ob/servlets/excel?type=history&columns=TIME%2C+BUYER%2C+SELLER%2C+PRICE%2C+VOLUME%2C+TYPE&format[TIME]=dd.mm.YY%20hh:MM:ss&format[PRICE]=%23%2C%23%230.00%23%23%23&format[VOLUME]=%23%2C%23%230&header[TIME]=Statoil&header[BUYER]=Kj%C3%B8per&header[SELLER]=Selger&header[PRICE]=Pris&header[VOLUME]=Volum&header[TYPE]=Type&view=DELAYED&source=feed.ose.trades.INSTRUMENTS&filter=ITEM_SECTOR%3D%3DsSTL.OSE%26%26DELETED!%3Dn1&stop=now&start=1493935200000&ascending=true'
file_name = 'C:\\Users\\AR\\Documents\\DownloadFile.xlsx'

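# Retry every 3 seconds until the download succeeds.
while True:
    try:
        resp = requests.get(url)
        # Write the raw bytes; the response is a binary file, not text.
        with open(file_name, 'wb') as output:
            output.write(resp.content)
        break
    except Exception as e:
        print(str(e))
        time.sleep(3)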

Using the extension ".txt" in "file_name" produces a file that begins with:

PK    L©J               _rels/.rels­’ÁjÃ0†ï}
£{ã´ƒ1FÝ^Æ ·2ºÐl%1I,c«[öö3»l
l°£ôýH»Ã4ê•RölªË·ÖÀóùq}*‡2ûÕ²’;³*Œ
t"ñ^ël;1W)”NÃiD)ejuDÛcKz[×·:}gÀªŽÎ@:º
¨3¦–ÄÀ4è7Nýs_ni¼GúM*7·ôÀö2R+á³
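
The leading "PK" and the `_rels/.rels` entry suggest that the download is a ZIP container, which is how .xlsx files are packaged, so changing the extension to ".txt" does not change the contents. A minimal sketch, using only the standard library, that checks this for any downloaded bytes; the function name `zip_members` is hypothetical and not part of the original answer:

```python
import io
import zipfile


def zip_members(data: bytes):
    """Return the member names if `data` is a ZIP archive, else None."""
    buffer = io.BytesIO(data)
    # is_zipfile accepts a file-like object as well as a path
    if not zipfile.is_zipfile(buffer):
        return None
    with zipfile.ZipFile(buffer) as archive:
        return archive.namelist()
```

Calling `zip_members(resp.content)` on the response from the download loop would be expected to list entries such as `_rels/.rels` if the server really returns a workbook, and `None` if it returns plain text.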

I found a solution using openpyxl (i.e. the Excel file can be read without opening Excel or the workbook):

from openpyxl import load_workbook #https://openpyxl.readthedocs.io/en/latest/index.html
import requests
import time

url = 'https://www.oslobors.no/ob/servlets/excel?type=history&columns=TIME%2C+BUYER%2C+SELLER%2C+PRICE%2C+VOLUME%2C+TYPE&format[TIME]=dd.mm.YY%20hh:MM:ss&format[PRICE]=%23%2C%23%230.00%23%23%23&format[VOLUME]=%23%2C%23%230&header[TIME]=Statoil&header[BUYER]=Kj%C3%B8per&header[SELLER]=Selger&header[PRICE]=Pris&header[VOLUME]=Volum&header[TYPE]=Type&view=DELAYED&source=feed.ose.trades.INSTRUMENTS&filter=ITEM_SECTOR%3D%3DsSTL.OSE%26%26DELETED!%3Dn1&stop=now&start=1493935200000&ascending=true'
file_name = 'DownloadFile.xlsx'

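# Retry every 3 seconds until the download succeeds.
while True:
    try:
        resp = requests.get(url)
        # Write the raw bytes; the response is a binary .xlsx file.
        with open(file_name, 'wb') as output:
            output.write(resp.content)
        break
    except Exception as e:
        print(str(e))
        time.sleep(3)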

wb = load_workbook(file_name)
ws = wb['data']
for row in ws.rows:
    for cell in row:
        print(cell.value)
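
To end up with a text file rather than printed cell values, the rows can be written out as delimited lines. A minimal sketch under stated assumptions: `rows_to_text` is a hypothetical helper, the tab delimiter and output path are arbitrary choices, and empty cells (None) are written as empty strings.

```python
import csv


def rows_to_text(rows, out_path, delimiter='\t'):
    """Write an iterable of row tuples to a delimited text file."""
    with open(out_path, 'w', newline='', encoding='utf-8') as text_file:
        writer = csv.writer(text_file, delimiter=delimiter, lineterminator='\n')
        for row in rows:
            # Replace empty cells (None) with empty strings
            writer.writerow(['' if value is None else value for value in row])
```

With the worksheet loaded above, this could probably be called as `rows_to_text(ws.iter_rows(values_only=True), 'DownloadFile.txt')`; in recent openpyxl versions `iter_rows(values_only=True)` yields plain tuples of cell values instead of cell objects.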

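Even though the file will be downloaded as an Excel file (probably .xlsx), I think you can still open and read it as a CSV file with Python (perhaps see this question for more details on how to download it). If the Excel file has multiple sheets, that may end up causing problems. In that case you may need an additional library (like pandas) to handle opening the Excel file and extracting the data.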

Once the file has been opened and read, you can write whatever you want to keep to a new text file. This other question has some good information on how to do that.

If the file has only one sheet, the csv approach will work and would look like this:

(Edited: changed rb to rt when opening the CSV.)

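import csv

my_read_path = '/directory/some_excel_file.xlsx'
# Note: this only works if the file really contains plain delimited text.
# A genuine .xlsx file is a ZIP archive and needs openpyxl or pandas instead.
with open(my_read_path, 'rt', newline='') as csv_file, \
        open('/directory/my_output.txt', 'w') as text_file:
    csv_reader = csv.reader(csv_file)
    for line in csv_reader:
        # csv.reader yields a list of strings, so join it before writing
        text_file.write(','.join(line) + '\n')  # assumes you want to write every line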

Reading the file with something like pandas may be more complicated, but it could be a worthwhile learning experience anyway.
