Importing CSV data with multi-index columns using pandas 0.13

Posted 2024-10-02 12:24:10


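I am trying to import CSV data with multi-index columns using read_csv in pandas 0.13. The import succeeds, but the resulting column MultiIndex is surprising and not useful to me.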

employment=pd.read_csv('./data/spanish/employment1976-1987thousands.csv', index_col=0,header=[7,8])

The file is employment1976-1987thousands.csv; its contents are not reproduced here.

The resulting column index is:

Alava                  16  19 yo          
Unnamed: 2_level_0     20  24 yo          
Unnamed: 3_level_0     25  54 yo          
Unnamed: 4_level_0     55 y over yo       
Albacete               16  19 yo          
Unnamed: 6_level_0     20  24 yo          
Unnamed: 7_level_0     25  54 yo          
Unnamed: 8_level_0     55 y over yo

I would like the columns to look like this instead:

Alava    16  19 yo          
Alava    20  24 yo          
Alava    25  54 yo          
Alava    55 y over yo       
Albacete     16  19 yo          
Albacete     20  24 yo          
Albacete     25  54 yo          
Albacete    55 y over yo

1 answer

Answer #1 · posted 2024-10-02 12:24:10

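You can fix the columns after the fact. Define a helper that replaces each 'Unnamed' top-level label with the last real label before it, then apply it to the columns: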

In [40]: def unsparsify_labels(index):
   ....:     new_labels = []
   ....:     for label in index.values:
   ....:         if label[0].startswith('Unnamed'):
   ....:             label = list(label)
   ....:             label[0] = ll
   ....:             label = tuple(label)
   ....:         else:
   ....:             ll = label[0]
   ....:         new_labels.append(label)
   ....:     return MultiIndex.from_tuples(new_labels)
   ....: 

In [42]: data = """A,,B,
   ....: 1,2,1,2
   ....: 1,2,3,4
   ....: 5,6,7,8
   ....: """

In [43]: df = read_csv(StringIO(data),header=[0,1],index_col=None)

In [44]: df
Out[44]: 
   A  Unnamed: 1_level_0  B  Unnamed: 3_level_0
   1                   2  1                   2
0  1                   2  3                   4
1  5                   6  7                   8

[2 rows x 4 columns]

In [45]: df.columns = unsparsify_labels(df.columns)

In [46]: df
Out[46]: 
   A     B   
   1  2  1  2
0  1  2  3  4
1  5  6  7  8

[2 rows x 4 columns]
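For reference, here is a self-contained sketch of the same idea with explicit imports, written against a recent pandas on Python 3 rather than 0.13. The name fill_unnamed_labels is hypothetical and not part of pandas. It also assumes that a leading 'Unnamed' label, which has no earlier label to copy from, should be left unchanged.

```python
import io

import pandas as pd


def fill_unnamed_labels(columns):
    """Replace 'Unnamed: ...' top-level labels with the last real label seen.

    Hypothetical helper (not a pandas API). `columns` is a MultiIndex
    such as the one read_csv builds from several header rows.
    """
    new_labels = []
    last_seen = None  # most recent real top-level label
    for label in columns:  # iterating a MultiIndex yields tuples
        top, rest = label[0], tuple(label[1:])
        if str(top).startswith("Unnamed"):
            if last_seen is not None:
                top = last_seen
            # Assumption: with no earlier label, keep the placeholder as-is.
        else:
            last_seen = top
        new_labels.append((top,) + rest)
    return pd.MultiIndex.from_tuples(new_labels, names=columns.names)


# Same sample data as in the answer: two header rows, sparse top level.
data = "A,,B,\n1,2,1,2\n1,2,3,4\n5,6,7,8\n"
df = pd.read_csv(io.StringIO(data), header=[0, 1], index_col=None)
df.columns = fill_unnamed_labels(df.columns)
print(df)
```

Applied to the question, this would be employment.columns = fill_unnamed_labels(employment.columns) after the read_csv call.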
