Python Pandas: concatenating one or more csv-fi

Posted 2024-10-01 15:33:29

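I am trying to concatenate multiple CSV files, each with a different number of columns, that are updated in a folder.

One of my files contains 68 columns and another contains 26. A few columns are common to both files, and a few are different.

I want to concatenate them all: where both files share a column, the rows of one file should be appended to the other; otherwise the non-matching columns should simply be added to the final DataFrame.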
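The behavior described above can be sketched with two toy frames. This is only an illustration, not the real files: the column subset and the values are borrowed loosely from the sample input, and `combined` is a hypothetical name. With the default `join="outer"`, `pd.concat` aligns columns by name, stacks rows for shared columns, and fills cells for columns missing from one frame with NaN.

```python
import pandas as pd

# Toy frames (illustrative values); only "Campaign" and "Market" are shared.
df1 = pd.DataFrame({"Campaign": ["Galaxy-A"], "Market": ["PL"], "Channel": ["Digital"]})
df2 = pd.DataFrame({"Campaign": ["Galaxy A70"], "Market": ["Poland"], "Days Ago": [48]})

# join="outer" is the default: the result has the union of all columns,
# and cells that a frame does not provide are filled with NaN.
combined = pd.concat([df1, df2], ignore_index=True, sort=False)
print(combined)
```

Note that this relies on every frame having unique column labels.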

Input:

df1
Funding Source  Campaign    Market  Date (mm/dd/yyyy)   Channel Phase
Local           Galaxy-A    PL      2019-05-27          Digital Display

df2
Date (mm/dd/yyyy)   Days Ago    Market  Data Source  Campaign                                         Objective  Keyword Strategy   Campaign Type   Product Division    Funding Source
2019-07-16           48         Poland  DCS          IM-PEM_telefony_Galaxy A70_modele+brand_[search]                       

Expected output:

^{pr2}$

I tried the following:

import os

import pandas as pd

# file_path (the folder holding the CSVs) and col_rename (assumed to be a
# two-column DataFrame of old name / new name) are defined elsewhere in the script.

# Collect the CSV file names to merge
all_csv = [file_name for file_name in os.listdir(file_path) if file_name.endswith('.csv')]

li = []
# Read each file and apply the column renames before concatenating
for filename in all_csv:
    print("Reading File Name : " + filename)

    df = pd.read_csv(os.path.join(file_path, filename), encoding="ISO-8859-1", low_memory=False)
    # Drop columns that are entirely empty
    df.dropna(how='all', axis=1, inplace=True)
    for f, g in zip(col_rename.iloc[:, 0], col_rename.iloc[:, 1]):
        if pd.isna(f):
            continue  # skip rows of the mapping that have no source name
        print("renaming Column Name : " + f + " with column Name : " + g)
        df.rename(columns={f: g}, inplace=True)

    print("Data append for fileName: " + filename)
    li.append(df)

frame = pd.concat(li, sort=True)

But it raises an exception:

Traceback (most recent call last):
  File "C:/sapientrepo/classes/ValidationScript.py", line 397, in <module>
    Object_Validation.main()
  File "C:/sapientrepo/classes/ValidationScript.py", line 385, in main
    self.merge_csv()
  File "C:/sapientrepo/classes/ValidationScript.py", line 96, in merge_csv
    frame = pd.concat(li, sort=True)
  File "C:\sapientrepo\myvenv\lib\site-packages\pandas\core\reshape\concat.py", line 229, in concat
    return op.get_result()
  File "C:\sapientrepo\myvenv\lib\site-packages\pandas\core\reshape\concat.py", line 426, in get_result
    copy=self.copy)
  File "C:\sapientrepo\myvenv\lib\site-packages\pandas\core\internals\managers.py", line 2056, in concatenate_block_managers
    elif is_uniform_join_units(join_units):
  File "C:\sapientrepo\myvenv\lib\site-packages\pandas\core\internals\concat.py", line 379, in is_uniform_join_units
    all(not ju.is_na or ju.block.is_extension for ju in join_units) and
  File "C:\sapientrepo\myvenv\lib\site-packages\pandas\core\internals\concat.py", line 379, in <genexpr>
    all(not ju.is_na or ju.block.is_extension for ju in join_units) and
AttributeError: 'NoneType' object has no attribute 'is_extension'

Please help.
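A possible cause worth ruling out, offered as an assumption and not a confirmed diagnosis: `pd.concat` can fail when a frame carries duplicate column labels, and the rename loop above could create them if two source columns are mapped to the same target name. The sketch below uses a hypothetical helper, `duplicated_columns`, and a made-up frame to show how such duplicates could be detected before concatenating.

```python
import pandas as pd

def duplicated_columns(df):
    """Return the column labels that occur more than once in df."""
    cols = df.columns
    return cols[cols.duplicated()].unique().tolist()

# Hypothetical frame: a rename maps "Country" onto the existing "Market" label.
df = pd.DataFrame([[1, 2, 3]], columns=["Market", "Country", "Campaign"])
df = df.rename(columns={"Country": "Market"})

print(duplicated_columns(df))  # ['Market']

# One possible way to proceed: keep only the first column for each label.
deduped = df.loc[:, ~df.columns.duplicated()]
print(list(deduped.columns))  # ['Market', 'Campaign']
```

Running a check like this on each `df` just before `li.append(df)` would show whether any file ends up with repeated labels. Whether keeping the first column is the right resolution depends on the data.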

