How to download raw data from a Google Sheets spreadsheet with Python

Posted 2024-06-03 16:58:18


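I want the data shown in a Google Sheets spreadsheet, but the page has no download option. I tried the Beautifulsoup4 library but could not work out how to extract it.

Here is the data: https://docs.google.com/spreadsheets/d/e/2PACX-1vSc_2y5N0I67wDU38DjDh35IZSIS30rQf7_NYZhtYYGU1jJYT6_kDx4YpF-qw0LSlGsBYP8pqM_a1Pd/pubhtml#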


2 answers

The Beautiful Soup approach you tried can work like this:

import urllib.request

import pandas as pd
from bs4 import BeautifulSoup

html = urllib.request.urlopen('your_sheet_url').read()      # download the published page
soup = BeautifulSoup(html, "html.parser")
table = soup.table                                          # first <table> element on the page

columns = ['State', '', 'Confirmed', 'Recovered', 'Deaths', 'Active', 'Last_Updated_Time']
output_rows = []
for table_row in table.find_all('tr'):                      # iterate through rows
    cells = [cell.text for cell in table_row.find_all('td')]  # text of every cell in the row
    if len(cells) == len(columns):                          # skip rows that do not match the header
        output_rows.append(cells)

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
df = pd.DataFrame(output_rows, columns=columns)
df.to_excel("Output.xlsx")                                  # save as an Excel file (requires openpyxl)
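
If the sheet is published to the web, as the `pubhtml` link in the question suggests, it may be simpler to skip the HTML parsing and request the data as CSV. The following is a minimal sketch using only the standard library. It assumes that Google serves a CSV export when `pubhtml` is replaced by `pub?output=csv`, which is worth checking in a browser first. The helper names `published_csv_url`, `parse_csv_text`, and `download_rows` are made up for this illustration.

```python
import csv
import io
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def published_csv_url(pubhtml_url):
    """Turn a published '.../pubhtml' link into a '.../pub?output=csv' link.

    Assumption: the sheet is published to the web and exposes a CSV export
    at this address.
    """
    parts = urlsplit(pubhtml_url)
    if not parts.path.endswith("/pubhtml"):
        raise ValueError("expected a URL whose path ends with /pubhtml")
    csv_path = parts.path[: -len("pubhtml")] + "pub"
    # Drop the original query and fragment and ask for CSV output instead.
    return urlunsplit((parts.scheme, parts.netloc, csv_path, "output=csv", ""))


def parse_csv_text(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def download_rows(pubhtml_url):
    """Fetch the sheet as CSV and return its rows (needs network access)."""
    with urllib.request.urlopen(published_csv_url(pubhtml_url)) as response:
        return parse_csv_text(response.read().decode("utf-8"))
```

Calling `download_rows('your_sheet_url')` would then return the rows as lists of strings, with the header row first if the sheet has one. A workbook with several tabs may need an extra query parameter to select the tab, which this sketch does not handle.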

You can use google-api-python-client.

Google provides a Python quickstart guide for the Sheets API.

It boils down to something like this:

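from googleapiclient.discovery import build

SAMPLE_SPREADSHEET_ID = '<your spreadsheet id>'
SAMPLE_RANGE_NAME = '<your desired range>'
# creds must be a credentials object obtained as described in the quickstart
service = build('sheets', 'v4', credentials=creds)
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
                            range=SAMPLE_RANGE_NAME).execute()
values = result.get('values', [])                           # list of rows; empty list if no data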

But be sure to read the full quickstart to get the complete picture. (The sample code is taken from there.)
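
The `values` list returned by the API is a list of rows, each row a list of cell values, and trailing empty cells are omitted, so rows can be shorter than the header. The sketch below shows one way to turn such a list into dictionaries keyed by the header row. It assumes the first row holds the column names. The function name `rows_to_records` and the sample data are invented for illustration.

```python
def rows_to_records(values):
    """Convert [[header...], [row...], ...] into a list of dicts keyed by header."""
    if not values:
        return []
    header, *rows = values
    records = []
    for row in rows:
        # Pad short rows so every header gets a value; extra cells are ignored.
        padded = list(row) + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Made-up sample shaped like the API's 'values' field.
sample_values = [
    ["State", "Confirmed", "Deaths"],
    ["Kerala", "10"],          # trailing empty cell omitted
    ["Goa", "3", "0"],
]
records = rows_to_records(sample_values)
```

Padding with empty strings keeps every record the same shape, which makes it straightforward to pass the result on to `csv.DictWriter` or a DataFrame constructor.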
