How can I use pivot table output for further analysis in Python?

Posted 2024-10-05 10:37:37


Sample data

District    Taluka   Circle   Crop   Yield_2006  Yield_2007  Yield_2008  Yield_2009
AHMEDNAGAR  AKOLE    AKOLE    PADDY       875.3      1338.9       894.9       339.2
AHMEDNAGAR  AKOLE    KOTUL    PADDY       637.2      1007.4       919.7       323.9
AHMEDNAGAR  AKOLE    RAJUR    PADDY       857.8      1227.1      1114.5       506.5
AHMEDNAGAR  AKOLE    SAMSHE   PADDY       875.3      1338.9       894.9       339.2
AHMEDNAGAR  AKOLE    BRAMHA   PADDY       637.2      1007.4       919.7       323.9
AHMEDNAGAR  AKOLE    VIRGAO   PADDY       875.3      1338.9       894.9       339.2
AHMEDNAGAR  AKOLE    SHENDI   PADDY       857.8      1227.1      1114.5       506.5
AHMEDNAGAR  AKOLE    SAKWADI  PADDY       857.8      1227.1      1114.5       506.5
AMRAVATI    DHARNI   DHARNI   PADDY         590       888.6       437.8       201.9
AMRAVATI    DHARNI   DHULAT   PADDY       489.7       863.3         277       227.8
AMRAVATI    DHARNI   HARSUL   PADDY         590       888.6       437.8       201.9
AMRAVATI    DHARNI   SIKHEDA  PADDY       489.7       863.3         277       227.8
AMRAVATI    CHIKARA  CHHDARA  PADDY       539.8       698.5       388.9       373.8
AMRAVATI    CHIKARA  SEDOH    PADDY       539.8       698.5       388.9       338.2
AMRAVATI    CHIKARA  CHURNI   PADDY       539.8       698.5       388.9       338.2

Code:

^{pr2}$

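Now I want to work with this pivot output.

For example, I want to create a new column, "Average_Yield", holding the mean of Yield_2006 through Yield_2009 for each crop.

How can I create that column so that the "Average_Yield" values are rounded to 4 decimal places?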


2 answers

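An alternative approach with groupby. Passing numeric_only=True keeps the text columns (Taluka, Circle) out of the mean; newer pandas versions otherwise raise an error on them:

In [79]: res = df.groupby(["District","Crop"]).mean(numeric_only=True)

In [80]: res['Average_Yield'] = res.mean(axis=1)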

In [81]: res
Out[81]:
                  Yield_2006  Yield_2007  Yield_2008  Yield_2009  Average_Yield
District   Crop
AHMEDNAGAR PADDY  809.212500      1214.1      983.45    398.1125     851.218750
AMRAVATI   PADDY  539.828571       799.9      370.90    272.8000     495.857143
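
For readers who want to reproduce the groupby result, here is a minimal self-contained sketch. It assumes the sample data is pasted in as whitespace-separated text (the SAMPLE string and the use of io.StringIO are illustrative choices, not part of the original answer), and it adds the rounding to 4 decimal places that the question asks for.

```python
import io

import pandas as pd

# Sample data from the question, pasted as whitespace-separated text (assumption:
# the real data would normally come from a file or database instead).
SAMPLE = """\
District Taluka Circle Crop Yield_2006 Yield_2007 Yield_2008 Yield_2009
AHMEDNAGAR AKOLE AKOLE PADDY 875.3 1338.9 894.9 339.2
AHMEDNAGAR AKOLE KOTUL PADDY 637.2 1007.4 919.7 323.9
AHMEDNAGAR AKOLE RAJUR PADDY 857.8 1227.1 1114.5 506.5
AHMEDNAGAR AKOLE SAMSHE PADDY 875.3 1338.9 894.9 339.2
AHMEDNAGAR AKOLE BRAMHA PADDY 637.2 1007.4 919.7 323.9
AHMEDNAGAR AKOLE VIRGAO PADDY 875.3 1338.9 894.9 339.2
AHMEDNAGAR AKOLE SHENDI PADDY 857.8 1227.1 1114.5 506.5
AHMEDNAGAR AKOLE SAKWADI PADDY 857.8 1227.1 1114.5 506.5
AMRAVATI DHARNI DHARNI PADDY 590 888.6 437.8 201.9
AMRAVATI DHARNI DHULAT PADDY 489.7 863.3 277 227.8
AMRAVATI DHARNI HARSUL PADDY 590 888.6 437.8 201.9
AMRAVATI DHARNI SIKHEDA PADDY 489.7 863.3 277 227.8
AMRAVATI CHIKARA CHHDARA PADDY 539.8 698.5 388.9 373.8
AMRAVATI CHIKARA SEDOH PADDY 539.8 698.5 388.9 338.2
AMRAVATI CHIKARA CHURNI PADDY 539.8 698.5 388.9 338.2
"""

df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

# Mean of each yield column per (District, Crop); numeric_only=True skips
# the text columns Taluka and Circle.
res = df.groupby(["District", "Crop"]).mean(numeric_only=True)

# Row-wise mean across the four yearly means, rounded to 4 decimal places.
res["Average_Yield"] = res.mean(axis=1).round(4)
print(res)
```

Note that this is a mean of yearly means per group, which matches the output shown above; it is not weighted by the number of circles in each year.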

You can first remove the [] from aggfunc to avoid a MultiIndex in the columns, then use mean by rows (axis=1) together with round:

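import numpy as np
import pandas as pd

pivot = pd.pivot_table(Data1, values=["Yield_2006", "Yield_2007", "Yield_2008", "Yield_2009"],
                       index=["District", "Crop"],
                       aggfunc=np.mean, fill_value=False)

# Row-wise mean of the yield columns, rounded to 4 decimal places
pivot['Average_Yield'] = pivot.mean(axis=1).round(4)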
print (pivot)
                  Yield_2006  Yield_2007  Yield_2008  Yield_2009  \
District   Crop                                                    
AHMEDNAGAR PADDY  809.212500      1214.1      983.45    398.1125   
AMRAVATI   PADDY  539.828571       799.9      370.90    272.8000   

                  Average_Yield  
District   Crop                  
AHMEDNAGAR PADDY       851.2188  
AMRAVATI   PADDY       495.8571  
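
The point about removing the [] can be seen directly by comparing the column index of both variants. The sketch below uses a four-row subset of the sample data and the string "mean" in place of np.mean (an equivalent spelling that newer pandas versions may prefer); the variable names nested, flat and recovered are illustrative only.

```python
import pandas as pd

# Small subset of the sample data, enough to show the column structure.
df = pd.DataFrame({
    "District": ["AHMEDNAGAR", "AHMEDNAGAR", "AMRAVATI", "AMRAVATI"],
    "Crop": ["PADDY"] * 4,
    "Yield_2006": [875.3, 637.2, 590.0, 489.7],
    "Yield_2007": [1338.9, 1007.4, 888.6, 863.3],
})
years = ["Yield_2006", "Yield_2007"]

# aggfunc given as a list: columns become a two-level MultiIndex
# whose first level is the aggregation name.
nested = pd.pivot_table(df, values=years, index=["District", "Crop"],
                        aggfunc=["mean"])

# aggfunc given as a single function: columns stay a flat Index.
flat = pd.pivot_table(df, values=years, index=["District", "Crop"],
                      aggfunc="mean")

print(nested.columns.nlevels, flat.columns.nlevels)

# If the list form is needed elsewhere, dropping the outer column level
# gives back flat column names.
recovered = nested.droplevel(0, axis=1)
print(list(recovered.columns))
```

With flat columns, an assignment such as pivot['Average_Yield'] = ... adds an ordinary column, which is why the answer suggests the single-function form.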

To average only selected columns, take a subset of them first:

pivot['Average_Yield'] = pivot[['Yield_2006','Yield_2007']].mean(axis=1).round(4)
print (pivot)
                  Yield_2006  Yield_2007  Yield_2008  Yield_2009  \
District   Crop                                                    
AHMEDNAGAR PADDY  809.212500      1214.1      983.45    398.1125   
AMRAVATI   PADDY  539.828571       799.9      370.90    272.8000   

                  Average_Yield  
District   Crop                  
AHMEDNAGAR PADDY      1011.6563  
AMRAVATI   PADDY       669.8643  
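
Listing the columns by hand works, but two other ways of picking the subset may be convenient. The sketch below is an illustration under assumptions: the pivot frame is rebuilt from the values printed above, and the column name Avg_2006_2007 is hypothetical. Selecting by pattern also avoids a subtle problem: once Average_Yield exists, calling pivot.mean(axis=1) again would include it in the average.

```python
import pandas as pd

# Pivot result rebuilt from the output shown above (values copied by hand).
pivot = pd.DataFrame(
    {
        "Yield_2006": [809.2125, 539.828571],
        "Yield_2007": [1214.1, 799.9],
        "Yield_2008": [983.45, 370.90],
        "Yield_2009": [398.1125, 272.8],
    },
    index=pd.MultiIndex.from_tuples(
        [("AHMEDNAGAR", "PADDY"), ("AMRAVATI", "PADDY")],
        names=["District", "Crop"],
    ),
)

# Select only columns named Yield_<4 digits>; Average_Yield never matches,
# so re-running this line gives the same result.
year_pattern = r"^Yield_\d{4}$"
pivot["Average_Yield"] = pivot.filter(regex=year_pattern).mean(axis=1).round(4)
pivot["Average_Yield"] = pivot.filter(regex=year_pattern).mean(axis=1).round(4)

# Label-based slice for a contiguous range of years
# (Avg_2006_2007 is a hypothetical column name).
pivot["Avg_2006_2007"] = (
    pivot.loc[:, "Yield_2006":"Yield_2007"].mean(axis=1).round(4)
)
print(pivot)
```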
