Adding a difference column to a Python pivot_table

Published 2024-07-01 07:18:07


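I am new to Python. I have the data frame shown below, and I can pivot it in Excel.

I want to add a difference column (in the image, I added it by hand).

The difference is the B value minus the A value. Using a pandas pivot table I was able to reproduce everything except the difference column and the grand total. Here is my code: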

table = pd.pivot_table(data, index=['Category'], values = ['value'], columns=['Name','Date'], fill_value=0)

How can I add the difference column and compute its values?

How can I get the grand total at the bottom?

The data is as follows:

df = pd.DataFrame({
"Value": [0.1, 0.2, 3, 1, -.5, 4],
"Date": ["2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01", "2020-07-01"],
"Name": ['A', 'A', 'A', 'B', 'B', 'B'],
"HI Display1": ["X", "Y", "Z", "Z", "Y", "X"]})

I want a pivot table like the one shown below:

[Image: pivot table]


2 Answers

Another way to add the totals is to pass margins=True to the pivot function and then overwrite the Totals column with the difference, as shown below:

import pandas as pd

data = {
        'Name':['A', 'A' ,'A', 'B', 'B', 'B','A', 'A' ,'A', 'B', 'B', 'B' ],
        'Value':[1, 2, 3, 4, 5, 6,1, 2, 3, 4, 5, 6, ],
        'Category': ['X', 'Y', 'Z','X', 'Y', 'Z','X', 'Y', 'Z','X', 'Y', 'Z']
    }

df = pd.DataFrame(data)

pivot_ = df.pivot_table(index = ["Category"], 
              columns = "Name" , 
              values = "Value", 
              aggfunc = "sum", 
              margins=True, 
              margins_name='Totals')\
 .fillna('')

pivot_['Totals'] = pivot_['B'] - pivot_['A']

# rename() returns a new DataFrame, so assign the result to keep it
pivot_ = pivot_.rename(columns={"Totals": "Diff"})

Output:

Name    A   B   Diff
Category            
X       2   8   6
Y       4   10  6
Z       6   12  6
Totals  12  30  18
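
The approach above reuses the Totals column for the difference, so the per-row totals are lost. If both are wanted, a minimal sketch (assuming the same sample data as above) is to leave the margins in place and add Diff as an extra column:

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"] * 2,
    "Value": [1, 2, 3, 4, 5, 6] * 2,
    "Category": ["X", "Y", "Z"] * 4,
})

pivot_ = df.pivot_table(
    index="Category",
    columns="Name",
    values="Value",
    aggfunc="sum",
    margins=True,            # adds a "Totals" row and a "Totals" column
    margins_name="Totals",
)

# Keep the Totals column and add the difference next to it
pivot_["Diff"] = pivot_["B"] - pivot_["A"]
print(pivot_)
```

Because the Totals row is already part of the frame when Diff is computed, its Diff cell is simply the difference of the two column totals.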

Edit, following the update to the question:

Using the sample data you have now provided (loaded here as df_1):

pivot_1 = df_1.pivot_table(index = ["HI Display1"], 
              columns = ["Name", 'Date'], 
              values = "Value", 
              aggfunc = "sum", 
              margins=True, 
              margins_name='Totals'
).fillna('')

pivot_1['Totals'] = pivot_1['B'].sum(axis=1) - pivot_1['A'].sum(axis=1)

# rename() returns a new DataFrame, so assign the result to keep it
pivot_1 = pivot_1.rename(columns={"Totals": "Diff"})

Output:

Name        A           B           Diff
Date        2020-07-01  2020-07-01  
HI Display1         
X           0.1         4.0         3.9
Y           0.2         -0.5        -0.7
Z           3.0         1.0         -2.0
Totals      3.3         4.5         1.2
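
The snippet above does not show how df_1 is built. As a self-contained sketch, assuming df_1 is the DataFrame from the question, the same table can also be produced without margins by adding the Diff column and the Totals row by hand. The ("Diff", "") column key is an illustrative choice, not something from the original answer:

```python
import pandas as pd

# Sample data from the question (assumed to be what df_1 refers to)
df_1 = pd.DataFrame({
    "Value": [0.1, 0.2, 3, 1, -.5, 4],
    "Date": ["2020-07-01"] * 6,
    "Name": ["A", "A", "A", "B", "B", "B"],
    "HI Display1": ["X", "Y", "Z", "Z", "Y", "X"],
})

pivot_1 = df_1.pivot_table(
    index="HI Display1",
    columns=["Name", "Date"],
    values="Value",
    aggfunc="sum",
    fill_value=0,
)

# Sum over all dates within each name, then take B - A
diff = pivot_1["B"].sum(axis=1) - pivot_1["A"].sum(axis=1)
pivot_1[("Diff", "")] = diff  # two-level key to match the column MultiIndex

# Append the grand total as the last row
pivot_1.loc["Totals"] = pivot_1.sum()
print(pivot_1)
```

Computing the difference before appending the Totals row keeps the row totals from being counted twice.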

Here is one approach:

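import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"], 
    "Date": "2020-07-01", 
    "Value": [0.1, 0.2, 3, 2, -.5, 4], 
    "Category": ["Z", "Y", "X", "Z", "Y", "X"]
})

# Aggregate only "Value"; otherwise recent pandas versions also try to sum the "Date" strings
piv = pd.pivot_table(df, index="Category", columns="Name", values="Value", aggfunc="sum")
piv = piv.rename_axis(columns=None)  # drop the "Name" label from the column axis
piv["diff"] = piv.B - piv.A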

The output (piv) is:

            A    B  diff
Category                
X         3.0  4.0   1.0
Y         0.2 -0.5  -0.7
Z         0.1  2.0   1.9

To add a "total" row for A and B, do the following:

piv.loc["total"] = piv.sum()

To remove the total from the "diff" column:

piv.loc["total", "diff"] = "" # or np.nan, if you'd like to be more 
                              # 'pandas' style (keeps the column numeric). 

The output is now:

            A    B  diff
Category                
X         3.0  4.0   1.0
Y         0.2 -0.5  -0.7
Z         0.1  2.0   1.9
total     3.3  5.5   
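
Writing an empty string into the diff column turns it into an object column, which can get in the way of later arithmetic or styling. A minimal sketch of the NaN alternative mentioned in the comment above, with the pivoted values typed in directly for brevity:

```python
import numpy as np
import pandas as pd

# The pivoted values from above, entered by hand to keep the sketch short
piv = pd.DataFrame(
    {"A": [3.0, 0.2, 0.1], "B": [4.0, -0.5, 2.0]},
    index=pd.Index(["X", "Y", "Z"], name="Category"),
)
piv["diff"] = piv["B"] - piv["A"]

piv.loc["total"] = piv.sum()
piv.loc["total", "diff"] = np.nan  # missing value instead of "", column stays float

print(piv.dtypes)
# na_rep only changes how NaN is displayed, not the stored data
print(piv.to_string(na_rep=""))
```

The blank cell then exists only in the printed text, while the underlying column remains numeric.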

If you now want to add the header "Name" above the column labels, do the following:

piv.columns = pd.MultiIndex.from_product([["Name"], piv.columns])

piv is now:

         Name          
            A    B diff
Category               
X         3.0  4.0  1.0
Y         0.2 -0.5 -0.7
Z         0.1  2.0  1.9
total     3.3  5.5  
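
As an alternative sketch for the same header step, pd.concat with a one-entry dict also puts a new top level above the existing columns. The small frame below is a stand-in for piv, not the answer's exact object:

```python
import pandas as pd

# Stand-in for piv with flat columns A, B, diff
piv = pd.DataFrame(
    {"A": [3.0, 0.2, 0.1], "B": [4.0, -0.5, 2.0], "diff": [1.0, -0.7, 1.9]},
    index=pd.Index(["X", "Y", "Z"], name="Category"),
)

# The dict key becomes the outer level of the resulting column MultiIndex
piv_named = pd.concat({"Name": piv}, axis=1)
print(piv_named)
```

This returns a new frame rather than modifying piv in place, which can be convenient when the flat version is still needed.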

To add the date to each column, do the following:

date = df.Date.max()
piv.columns = pd.MultiIndex.from_tuples([c+(date,) for c in piv.columns])

==>
               Name                      
                  A          B       diff
         2020-07-01 2020-07-01 2020-07-01
Category                                 
X               3.0        4.0          1
Y               0.2       -0.5       -0.7
Z               0.1        2.0        1.9
total           3.3        5.5           

Finally, to color the column (for example, if you are using Jupyter), do the following:

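diff_col = piv.columns[2]  # the third column, i.e. the diff column
# The gradient needs numeric data, so the total's diff should be np.nan rather than ""
piv.style.background_gradient("PiYG", subset=[diff_col]).highlight_null("white").format(na_rep="")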

