Group by multiple attributes and pass the result to a join / save it

Posted 2024-09-28 20:46:00


df

OpCode         PYear OperationStartDate
0    MCI6AF2   1993           2017-03-24
1    MCI6BF2   1994           2017-03-24
2    APJ5LFC   1975           2017-03-13
3    MQL0AF3   1986           2017-03-16
4    BMR3KFC   1982           2017-03-15
5    BRM1NFC   1981           2017-03-15
6    LNP5GF2   1987           2017-03-16
7       EAS5   1999           2017-03-22
8    HRK1DSC   1971           2017-03-17
9    HRK2FSC   1972           2017-03-17
10   NMI6BF2   1994           2017-03-22
11   NMI6AF2   1994           2017-03-20
12   HHY1JF1   1996           2017-02-01

I can group by a single attribute, "OpCode", to get the minimum date in "OperationStartDate", pass that minimum date into a join, and save the output to Excel using the following:

df2 = df.reset_index()
df3 = df2.loc[df2.groupby('OpCode')['LastOperationDate'].idxmin()] #Try df3 = df2.loc[df2.groupby(['OpCode','PYEAR'])['LastOperationDate'].idxmin()]

Rename the "LastOperationDate" column in df3:

df3.rename(columns={'LastOperationDate':'EarliestStartDate'}, inplace=True)


df3.drop(df3.columns[[0,4,1,5,6]], axis=1, inplace=True)


df6 = pd.merge(left = df, right = df3, how ='left', on ='OpCode')

# raw string so the backslashes in the Windows path are not treated as escapes
with pd.ExcelWriter(r'C:\Current Projects\Mapping\Python\Process Outputs\Example1321.xlsx') as writer:
    df6.to_excel(writer, sheet_name='Sheet1', index=False)

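The problem is that when I group by multiple attributes, OpCode and PYear, the result can neither be passed into the join nor saved to Excel. In that case I use the following alternative code: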

df3 =df['Operation Start Date'].groupby([df['OpCode'], df['PYear']]).min()

df4= df.groupby(['OpCode', 'PYear'])['Operation Start Date'].apply(lambda x : x.min())

My result:

 Op Code                P Year                
ANN3EF1                1994                     2017-08-29
APJ5LFC                1975                     2017-03-13
APJ6EFC                1977                     2017-10-09
APJ6GFC                1976                     2017-09-04
APJ6HFC                1975                     2017-09-07
APJ6JFC                1977                     2017-07-27
AYN4RFC                1968                     2017-11-13
AYN6CF1                1984                     2017-03-31
AYN6CF3                1984                     2017-08-29
AYN6DF3                1985                     2017-09-05
AZZ7AF1                2004                     2017-08-04

Does anyone know how I can pass this to the join below:

df6 = pd.merge(left = df, right = df3, how ='left', on ='OpCode')

or add a column header to the date output,

or save it to Excel?


1 answer
User
#1 · Posted 2024-09-28 20:46:00

You don't need a join; just use transform here:

# make sure the date is in correct format
df['Operation Start Date'] = pd.to_datetime(df['Operation Start Date'])

df['min_date'] = df.groupby(['OpCode', 'PYear'])['Operation Start Date'].transform('min')
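To make the difference concrete, here is a minimal sketch on a small made-up sample. The rows, the `earliest` and `merged` names, and the `EarliestStartDate` column name are assumptions for illustration, not the asker's real data. It shows the transform approach next to a merge-based alternative: a grouped `min()` returns a Series indexed by (OpCode, PYear), so calling `reset_index(name=...)` turns it back into an ordinary DataFrame with a named date column that can be merged on both keys.

```python
import pandas as pd

# Made-up sample rows modeled on the question's columns (an assumption).
df = pd.DataFrame({
    "OpCode": ["MCI6AF2", "MCI6AF2", "APJ5LFC", "APJ5LFC"],
    "PYear": [1993, 1993, 1975, 1976],
    "Operation Start Date": ["2017-03-24", "2017-03-20", "2017-03-13", "2017-04-01"],
})
df["Operation Start Date"] = pd.to_datetime(df["Operation Start Date"])

# Option 1: transform broadcasts each group's minimum back onto every row.
df["min_date"] = (
    df.groupby(["OpCode", "PYear"])["Operation Start Date"].transform("min")
)

# Option 2: aggregate, move the MultiIndex levels back into columns
# (naming the value column), then merge on both grouping keys.
earliest = (
    df.groupby(["OpCode", "PYear"])["Operation Start Date"]
    .min()
    .reset_index(name="EarliestStartDate")
)
merged = df.merge(earliest, how="left", on=["OpCode", "PYear"])

print(merged)
```

Either way the result is a regular DataFrame with a named date column, so it should be writable with `to_excel` as in the question (for example `merged.to_excel(path, index=False)`, where `path` is whatever output file you use).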
