Formatting column names of a pandas pivot table

Published 2024-05-15 18:10:56


I used the pandas.pivot_table function on a pandas DataFrame, and my output looks similar to this:

                    Winners                 Runnerup            
         year       2016    2015    2014    2016    2015    2014
Country  Sport                              
india    badminton                              
india    wrestling  

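What I actually need is a single level of column names, with each top-level label joined to its year, for example Winners_2016, Winners_2015, Winners_2014, Runnerup_2016 and so on.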

I have many columns and years, so I can't rename them by hand. Can anyone tell me how to do this?


2 answers

You can also use a list comprehension:

df.columns = ['_'.join(col) for col in df.columns]
print (df)
                   Winners_2016  Winners_2015  Winners_2014  Runnerup_2016  \
Country Sport                                                                
india   badminton             1             1             1              1   
        wrestling             1             1             1              1   

                   Runnerup_2015  Runnerup_2014  
Country Sport                                    
india   badminton              1              1  
        wrestling              1              1  
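For anyone who wants to reproduce this end to end, here is a minimal sketch. The input frame `raw` and its values are made up, since the question does not show the source data. One caveat it illustrates: if the year level holds integers rather than strings, a plain '_'.join(col) raises a TypeError, so each value is converted with str first.

```python
import pandas as pd

# Hypothetical input data; the original post does not show the source frame.
raw = pd.DataFrame({
    "Country": ["india"] * 4,
    "Sport": ["badminton", "badminton", "wrestling", "wrestling"],
    "year": [2015, 2016, 2015, 2016],
    "Winners": [1, 1, 1, 1],
    "Runnerup": [1, 1, 1, 1],
})

# Pivot so that the columns become a two-level MultiIndex: (value name, year).
df = pd.pivot_table(
    raw,
    index=["Country", "Sport"],
    columns="year",
    values=["Winners", "Runnerup"],
    aggfunc="sum",
)

# year is an integer here, so convert every level value to str before joining.
df.columns = ["_".join(map(str, col)) for col in df.columns]

flat_names = list(df.columns)
print(flat_names)
```

After this, df.columns is an ordinary single-level Index, so the flattened names can be used directly, e.g. df["Winners_2016"].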

Another solution converts the columns with to_series and then calls str.join:

df.columns = df.columns.to_series().str.join('_')

I was curious about the timings:

In [45]: %timeit ['_'.join(col) for col in df.columns]
The slowest run took 7.82 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 4.05 µs per loop

In [44]: %timeit ['{}_{}'.format(x,y) for x,y in zip(df.columns.get_level_values(0),df.columns.get_level_values(1))]
The slowest run took 4.56 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 131 µs per loop

In [46]: %timeit df.columns.to_series().str.join('_')
The slowest run took 4.31 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 452 µs per loop
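Outside IPython, the same comparison can be sketched with the standard timeit module. The small frame below is a stand-in with string year labels (an assumption, not the asker's data), and the function names are only illustrative. Absolute numbers will differ by machine and pandas version, so only the relative ordering is worth reading.

```python
import timeit

import pandas as pd

# Stand-in frame with two-level string columns (assumed, not the asker's data).
cols = pd.MultiIndex.from_product(
    [["Winners", "Runnerup"], ["2016", "2015", "2014"]]
)
df = pd.DataFrame([[1] * 6, [1] * 6], columns=cols)


def by_join():
    # Join each column tuple directly.
    return ["_".join(col) for col in df.columns]


def by_levels():
    # Zip the two levels and format each pair.
    return [
        "{}_{}".format(x, y)
        for x, y in zip(df.columns.get_level_values(0),
                        df.columns.get_level_values(1))
    ]


def by_series():
    # Convert the MultiIndex to a Series of tuples, then use the str accessor.
    return df.columns.to_series().str.join("_").tolist()


runs = 500
for fn in (by_join, by_levels, by_series):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} µs per call")
```

All three functions return the same list of names, so the choice between them comes down to speed and readability.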

Try this:

df.columns=['{}_{}'.format(x,y) for x,y in zip(df.columns.get_level_values(0),df.columns.get_level_values(1))]

get_level_values is what you need to get just one level of the resulting MultiIndex.
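To make that concrete, here is a small sketch of what each level looks like for column labels shaped like the ones in the question. The frame is constructed by hand for illustration, with integer years as an assumption.

```python
import pandas as pd

# Hand-built columns mirroring the question's layout (integer years assumed).
cols = pd.MultiIndex.from_product(
    [["Winners", "Runnerup"], [2016, 2015, 2014]],
    names=[None, "year"],
)
df = pd.DataFrame([[1] * 6], columns=cols)

# Each call returns one label per column, taken from a single level.
top = df.columns.get_level_values(0)    # Winners, Winners, Winners, Runnerup, ...
years = df.columns.get_level_values(1)  # 2016, 2015, 2014, 2016, ...

# str.format converts the integer years to text while building the names.
flat = ["{}_{}".format(x, y) for x, y in zip(top, years)]
print(flat)
```

A level can also be selected by name, for example df.columns.get_level_values("year").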

Side note: you might try working with the data as it is. I disliked the pandas MultiIndex for a long time, but it has grown on me.
