Pandas,从DataFrame生成一个表,其中多个列合并到新索引中

2024-05-19 12:52:31 发布

您现在位置:Python中文网/ 问答频道 /正文

我有以下数据框:

      Sex        CC1         CC2          CC3          CC4         count
1     Male       GBM       cough       fever      poor feeding       1
2   Female       abd dist    GBM          DIB           AMS          1
3   Female       fever     convulsion     LOC           NaN          1
4     Male       fever      vomiting   convulsion       NaN          1
5   Female       cough       GBM          NaN           NaN          1
6   Female        AMS        NaN          NaN           NaN          1

我需要以下结果:

New-index        Female   Male  Total  
AMS              14.0   14.0   28.0
Abd pain          1.0    0.0    1.0
Headache          0.0    1.0    1.0
DIB              12.0   23.0   35.0
FOC               0      1.0    1.0

在结果数据框中,“New index”列是“CC1”、“CC2”、“CC3”和“CC4”的组合。我怎样才能在熊猫身上做到这一点?你知道吗


Tags: 数据newnanmalefemalefeveramsdib
1条回答
网友
1楼 · 发布于 2024-05-19 12:52:31

我想你可以先用^{},然后再用^{}

df1 = pd.melt(df, id_vars=['Sex','count'], value_name='New-index')
print df1
       Sex  count variable     New-index
0     Male      1      CC1           GBM
1   Female      1      CC1      abd dist
2   Female      1      CC1         fever
3     Male      1      CC1         fever
4   Female      1      CC1         cough
5   Female      1      CC1           AMS
6     Male      1      CC2         cough
7   Female      1      CC2           GBM
8   Female      1      CC2    convulsion
9     Male      1      CC2      vomiting
10  Female      1      CC2           GBM
11  Female      1      CC2           NaN
12    Male      1      CC3         fever
13  Female      1      CC3           DIB
14  Female      1      CC3           LOC
15    Male      1      CC3    convulsion
16  Female      1      CC3           NaN
17  Female      1      CC3           NaN
18    Male      1      CC4  poor feeding
19  Female      1      CC4           AMS
20  Female      1      CC4           NaN
21    Male      1      CC4           NaN
22  Female      1      CC4           NaN
23  Female      1      CC4           NaN
print df1.pivot_table('count', 
                      index=['New-index'], 
                      columns=['Sex'],  
                      aggfunc='sum', 
                      margins=True, 
                      margins_name='Total')

Sex           Female  Male  Total
New-index                        
AMS              2.0   NaN    2.0
DIB              1.0   NaN    1.0
GBM              2.0   1.0    3.0
LOC              1.0   NaN    1.0
abd dist         1.0   NaN    1.0
convulsion       1.0   1.0    2.0
cough            1.0   1.0    2.0
fever            1.0   2.0    3.0
poor feeding     NaN   1.0    1.0
vomiting         NaN   1.0    1.0
Total           16.0   8.0   24.0

相关问题 更多 >