Importing tables from multiple URLs to create a single DataFrame and CSV file

Posted 2024-09-24 06:21:10


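I am importing tables from multiple URLs and want to combine them into a single DataFrame, then store it as a CSV file. I am struggling to remove the repeated description rows from the tables, and I cannot manipulate the DataFrame dfmaster after creating it.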

Perhaps pd.read_html imports the data as a list rather than as a DataFrame?

I tried iterating over the incoming tables with:

for item in df:  
        if item not in dfmaster:            
            dfmaster.append(item)   
            print(dfmaster)

But this only seems to list the offending repeated rows.

I also tried drop.duplicates appended to dfmaster, and df.drop[0].

producturls = ['https://www.interactivebrokers.com/en/index.php?f=2222&exch=ecbot&showcategories=FUTGRP',
               'https://www.interactivebrokers.com/en/index.php?f=2222&exch=cfe&showcategories=FUTGRP',
               'https://www.interactivebrokers.com/en/index.php?f=2222&exch=dtb&showcategories=FUTGRP&p=&cc=&limit=100&page=2'
               ]
dfmaster =[]

for url in producturls: 
    table = pd.read_html(url, index_col=None, header=None,)
    df = table[2]

    for item in df:  
        if item not in dfmaster:            
            dfmaster.append(item)   
            print(dfmaster)

    dfmaster.to_csv('IB_tickers.csv')

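The output should stitch all the table data from the website into one DataFrame, without repeating the description headers, and then save it as a readable CSV file.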

Thank you very much for your attention.


1 answer
User
#1 · Posted 2024-09-24 06:21:10

This should work for you:

import pandas as pd
from tabulate import tabulate

producturls = ['https://www.interactivebrokers.com/en/index.php?f=2222&exch=ecbot&showcategories=FUTGRP',
               'https://www.interactivebrokers.com/en/index.php?f=2222&exch=cfe&showcategories=FUTGRP',
               'https://www.interactivebrokers.com/en/index.php?f=2222&exch=dtb&showcategories=FUTGRP&p=&cc=&limit=100&page=2'
               ]

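df_list = []

for url in producturls:
    # pd.read_html returns a list of DataFrames, one per table found on the page
    table = pd.read_html(url, index_col=None, header=None)
    df = table[2]  # pick the third table in that list (index 2)
    df_list.append(df)

# Stack all the tables into one DataFrame, then drop the repeated rows
dfmaster = pd.concat(df_list, sort=False)
dfmaster = dfmaster.drop_duplicates().reset_index(drop=True)
print(tabulate(dfmaster.head(), headers='keys'))
dfmaster.to_csv('IB_tickers.csv')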

Result:

    IB Symbol    Product Description (click link for more details)  Symbol    Currency
--  -----------  -------------------------------------------------  --------  ----------
 0  AC           Ethanol -CME                                       EH        USD
 1  AIGCI        Bloomberg Commodity Index                          AW        USD
 2  B1U          30-Year Deliverable Interest Rate Swap Futures     B1U       USD
 3  DJUSRE       Dow Jones US Real Estate Index                     RX        USD
 4  F1U          5-Year Deliverable Interest Rate Swap Futures      F1U       USD
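
To answer the question's guess directly: pd.read_html does return a list of DataFrames, which is why a single table is picked out by index. The loop in the question likely failed for two further reasons: iterating over a DataFrame yields its column labels rather than its rows, and dfmaster was a plain Python list, which has no to_csv method. The following is a minimal offline sketch of the concat and drop_duplicates approach. The two small frames (page1, page2) are hypothetical stand-ins for scraped tables, using a few rows from the result shown above, and writing with index=False is an optional choice that is not part of the answer.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the tables pd.read_html would return;
# "AIGCI" appears in both frames on purpose to create a duplicate row.
page1 = pd.DataFrame({
    "IB Symbol": ["AC", "AIGCI"],
    "Symbol": ["EH", "AW"],
    "Currency": ["USD", "USD"],
})
page2 = pd.DataFrame({
    "IB Symbol": ["AIGCI", "B1U"],
    "Symbol": ["AW", "B1U"],
    "Currency": ["USD", "USD"],
})

# Iterating over a DataFrame yields its column labels, not its rows,
# so a loop like the one in the question only collects header strings.
labels = [item for item in page1]

# Stack the frames, drop rows that repeat, and renumber the index.
dfmaster = pd.concat([page1, page2], sort=False)
dfmaster = dfmaster.drop_duplicates().reset_index(drop=True)

# Write to an in-memory buffer here; pass a file path to write to disk.
# index=False leaves the row numbers out of the CSV (assumed preference).
buffer = io.StringIO()
dfmaster.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(labels)
print(csv_text)
```

Because drop_duplicates compares whole rows, a row is removed only when every column matches an earlier row, so products that share a symbol but differ in another column are kept.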
