Pivoting a DataFrame with duplicate values in the index

       snapDate instance               waitEvent  AvgWaitInMs
0   2015-Jul-03       XX       gc cr block 3-way            1
1   2015-Jun-29       YY  gc current block 3-way            2
2   2015-Jul-03       YY  gc current block 3-way            1
3   2015-Jun-29       XX  gc current block 3-way            2
4   2015-Jul-01       XX  gc current block 3-way            2
5   2015-Jul-01       YY  gc current block 3-way            2
6   2015-Jul-03       XX  gc current block 3-way            2
7   2015-Jul-03       YY           log file sync            9
8   2015-Jun-29       XX           log file sync            8
9   2015-Jul-03       XX           log file sync            8
10  2015-Jul-01       XX           log file sync            8
11  2015-Jul-01       YY           log file sync            9
12  2015-Jun-29       YY           log file sync            8

2 Answers

User

#1 · edited 2024-10-02 20:43:05

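Here is one way to reshape the DataFrame into something similar to what you want. Let me know if you have any additional specific requirements for the resulting DataFrame.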

import pandas as pd

# your data
# ====================================
print(df)

       snapDate instance               waitEvent  AvgWaitInMs
0                                                            
0   2015-Jul-03       XX       gc cr block 3-way            1
1   2015-Jun-29       YY  gc current block 3-way            2
2   2015-Jul-03       YY  gc current block 3-way            1
3   2015-Jun-29       XX  gc current block 3-way            2
4   2015-Jul-01       XX  gc current block 3-way            2
5   2015-Jul-01       YY  gc current block 3-way            2
6   2015-Jul-03       XX  gc current block 3-way            2
7   2015-Jul-03       YY           log file sync            9
8   2015-Jun-29       XX           log file sync            8
9   2015-Jul-03       XX           log file sync            8
10  2015-Jul-01       XX           log file sync            8
11  2015-Jul-01       YY           log file sync            9
12  2015-Jun-29       YY           log file sync            8

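# processing
# ====================================
# move the three key columns into the index, then pivot waitEvent into columns;
# (snapDate, instance, waitEvent) combinations with no row are filled with 0
df_temp = df.set_index(['snapDate', 'instance', 'waitEvent']).unstack().fillna(0)

# drop the outer 'AvgWaitInMs' column level, keeping only the waitEvent names
df_temp.columns = df_temp.columns.get_level_values(1).values

# turn 'instance' back into a regular column, leaving snapDate as the index
df_temp = df_temp.reset_index('instance')

print(df_temp)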

            instance  gc cr block 3-way  gc current block 3-way  log file sync
snapDate                                                                      
2015-Jul-01       XX                  0                       2              8
2015-Jul-01       YY                  0                       2              9
2015-Jul-03       XX                  1                       2              8
2015-Jul-03       YY                  0                       1              9
2015-Jun-29       XX                  0                       2              8
2015-Jun-29       YY                  0                       2              8
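
For readers who want to run this end to end, here is a minimal self-contained sketch of the same set_index/unstack idea, with the question's rows rebuilt inline. Two details are my own choices rather than part of the answer above: the name `wide`, and passing `fill_value=0` to `unstack` instead of calling `fillna(0)` afterwards, which should avoid a detour through NaN.

```python
import pandas as pd

# The 13 rows from the question, rebuilt by hand.
rows = [
    ("2015-Jul-03", "XX", "gc cr block 3-way", 1),
    ("2015-Jun-29", "YY", "gc current block 3-way", 2),
    ("2015-Jul-03", "YY", "gc current block 3-way", 1),
    ("2015-Jun-29", "XX", "gc current block 3-way", 2),
    ("2015-Jul-01", "XX", "gc current block 3-way", 2),
    ("2015-Jul-01", "YY", "gc current block 3-way", 2),
    ("2015-Jul-03", "XX", "gc current block 3-way", 2),
    ("2015-Jul-03", "YY", "log file sync", 9),
    ("2015-Jun-29", "XX", "log file sync", 8),
    ("2015-Jul-03", "XX", "log file sync", 8),
    ("2015-Jul-01", "XX", "log file sync", 8),
    ("2015-Jul-01", "YY", "log file sync", 9),
    ("2015-Jun-29", "YY", "log file sync", 8),
]
df = pd.DataFrame(rows, columns=["snapDate", "instance", "waitEvent", "AvgWaitInMs"])

# snapDate alone repeats, but (snapDate, instance, waitEvent) is unique,
# so unstacking waitEvent from that three-level index is well defined.
wide = (
    df.set_index(["snapDate", "instance", "waitEvent"])["AvgWaitInMs"]
      .unstack("waitEvent", fill_value=0)   # missing combinations become 0
      .reset_index("instance")              # instance back to a regular column
)
wide.columns.name = None  # drop the leftover 'waitEvent' label on the columns

print(wide)
```

Selecting the `AvgWaitInMs` Series before unstacking means the result has a single column level, so the `get_level_values(1)` step is not needed in this variant.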

User

#2 · edited 2024-10-02 20:43:05

You can also use pivot_table:

df.pivot_table(index=['snapDate','instance'], columns='waitEvent', values='AvgWaitInMs')

Out[64]:
waitEvent             gc cr block 3-way  gc current block 3-way  log file sync
snapDate    instance
2015-Jul-01 XX                      NaN                       2              8
            YY                      NaN                       2              9
2015-Jul-03 XX                        1                       2              8
            YY                      NaN                       1              9
2015-Jun-29 XX                      NaN                       2              8
            YY                      NaN                       2              8
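
To illustrate why `pivot_table` helps here, the sketch below uses five of the question's rows (the 2015-Jul-03 ones). It assumes the underlying problem is that `snapDate` on its own is not unique: in that case `DataFrame.pivot` with only `snapDate` as the index raises a `ValueError`, while `pivot_table` with both `snapDate` and `instance` in the index succeeds. The `fill_value=0` argument and the names `pivot_failed` and `table` are additions for this example, not part of the answer above.

```python
import pandas as pd

# Five rows taken from the question (all for 2015-Jul-03).
df = pd.DataFrame({
    "snapDate": ["2015-Jul-03"] * 5,
    "instance": ["XX", "XX", "YY", "XX", "YY"],
    "waitEvent": [
        "gc cr block 3-way",
        "gc current block 3-way",
        "gc current block 3-way",
        "log file sync",
        "log file sync",
    ],
    "AvgWaitInMs": [1, 2, 1, 8, 9],
})

# pivot() needs each (index, columns) pair to be unique; with snapDate alone
# as the index, XX and YY collide on the same date and wait event.
try:
    df.pivot(index="snapDate", columns="waitEvent", values="AvgWaitInMs")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() with a two-level index separates the instances.
# fill_value=0 replaces the NaN for combinations that have no row.
table = df.pivot_table(
    index=["snapDate", "instance"],
    columns="waitEvent",
    values="AvgWaitInMs",
    fill_value=0,
)

print(pivot_failed)
print(table)
```

Keep in mind that `pivot_table` aggregates with the mean by default, so if two rows ever shared the same snapDate, instance and waitEvent, they would be averaged silently rather than reported as an error.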

Data:

I used a txt file with the question's data as input, read into a DataFrame with pandas' read_csv.
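
A minimal sketch of what that loading step could look like. The comma-separated layout is an assumption (the delimiter of the original file is not known), only the first four rows of the question are included, and `io.StringIO` stands in for a file path so the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical file contents: the comma delimiter is an assumption.
raw = """snapDate,instance,waitEvent,AvgWaitInMs
2015-Jul-03,XX,gc cr block 3-way,1
2015-Jun-29,YY,gc current block 3-way,2
2015-Jul-03,YY,gc current block 3-way,1
2015-Jun-29,XX,gc current block 3-way,2
"""

# read_csv accepts any file-like object; pass a path for a real file.
df = pd.read_csv(io.StringIO(raw))

print(df)
```

If the real file were whitespace-aligned instead, a plain whitespace separator would not work directly, because the waitEvent values themselves contain spaces.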
