Converting Excel to Feather format with Python

import glob
import pandas as pd

path = r"filepath\*_name*.xlsx"
file_list = glob.glob(path)
for f in file_list:
    df = pd.read_excel(f, encoding='utf-8')
    df[['boola', 'boolb']] = df[['boola', 'boolb']].astype(int)
    pathname = f[:-5] + ".ftr"
    df.to_feather(pathname)

2 Answers

User

Answer 1 · edited 2024-10-02 10:34:02

Here is what solved my problem:

import glob
import pandas as pd

path = r"pathname\*_somename*.xlsx"
file_list = glob.glob(path)
for f in file_list:
    # European number format: comma as decimal separator, dot as thousands separator
    df = pd.read_excel(f, decimal=',', thousands='.')
    for col in df.columns:
        # Feather needs a single type per column, so a column whose
        # values have more than one Python type is cast to str
        if df[col].map(type).nunique() > 1:
            df[col] = df[col].astype(str)
    pathname = f[:-4] + "ftr"  # "name.xlsx" -> "name.ftr"
    df.to_feather(pathname)
df.head()

The , decimal=',', thousands='.' part is needed because my input files use European number formatting, i.e. a comma as the decimal separator and a dot as the thousands separator.
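To make the effect of those two separators concrete, here is a minimal plain-Python sketch of the conversion. The helper name parse_european_number is hypothetical and is not part of pandas; it only illustrates what the decimal and thousands settings mean for a single text value.

```python
def parse_european_number(text: str) -> float:
    """Convert a string such as '1.234,56' to a float (illustrative helper)."""
    # Drop the thousands separator first, then turn the decimal comma into a dot.
    return float(text.replace(".", "").replace(",", "."))


values = [parse_european_number(s) for s in ["1.234,56", "0,5", "12.000"]]
print(values)
```

Without this conversion, a value like "1.234,56" stays text, which is one way a column can end up with mixed types.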

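User

Answer 2 · edited 2024-10-02 10:34:02

You run into this problem because the columns named "stringa, stringb" contain some characters whose type Feather cannot determine. It tries to convert them to another type and returns an error. I had the same problem before, and my solution was to convert the column to string first and then replace the characters that cause the error: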

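import os
import pandas as pd

path = 'c:/examplepath/'
# Only pick up Excel files; anything else in the folder would make read_excel fail
files = [file for file in os.listdir(path) if file.endswith('.xlsx')]
for file in files:
    df = pd.read_excel(os.path.join(path, file))
    # Cast to str first, then replace the offending substring
    df['column'] = df['column'].astype(str)
    df['column'] = df['column'].str.replace('old character causing error', 'new character', regex=False)
    # splitext keeps dots inside the file name, unlike split('.')[0]
    df.to_feather(os.path.join(path, os.path.splitext(file)[0] + '.feather'))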

Note: according to the documentation, I believe pd.read_excel does not need an encoding argument.
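The code in this thread builds the output file name by slicing or splitting the input string, which depends on the exact extension length and on the name containing no other dots. As a hedged alternative, the sketch below uses pathlib from the standard library. The helper name feather_path and the example paths are assumptions for illustration only.

```python
from pathlib import PureWindowsPath


def feather_path(excel_path: str, suffix: str = ".feather") -> str:
    """Return excel_path with its final extension swapped for suffix."""
    # with_suffix replaces only the last extension, so dots elsewhere
    # in the file name are preserved.
    return str(PureWindowsPath(excel_path).with_suffix(suffix))


print(feather_path(r"c:\examplepath\report.v2.xlsx"))
print(feather_path(r"c:\examplepath\report.xlsx", ".ftr"))
```

PureWindowsPath is used here only because the paths in the thread are Windows paths. For real file access on the current machine, pathlib.Path offers the same with_suffix method.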
