I have the following accounts.csv:
CustomerID,InvoiceID,InvoiceDate,DueDate,SettledDate,InvoiceAmount,DaysToSettle,DaysLate
1,4564,29-03-2012,28-04-2012,25-04-2012,62.68,27,0
1,7897,15-05-2012,14-06-2012,28-05-2012,77.19,13,0
1,8749,21-05-2012,20-06-2012,04-06-2012,51.65,14,0
1,4189,16-06-2012,16-07-2012,04-07-2012,64.47,18,0
2,1353,12-02-2012,13-03-2012,28-02-2012,28.21,16,0
2,4898,01-03-2012,31-03-2012,17-04-2012,48.65,47,17
2,7994,20-03-2012,19-04-2012,08-04-2012,103.64,19,0
2,4652,01-07-2012,31-07-2012,17-07-2012,42.25,16,0
2,1561,01-09-2012,01-10-2012,23-09-2012,69.55,22,0
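I can read it into a DataFrame with:
import pandas as pd

# Dates are day-month-year (e.g. 29-03-2012), so parse them day first
df = pd.read_csv('accounts.csv', parse_dates=['InvoiceDate','DueDate','SettledDate'], dayfirst=True)
# sort_values returns a new DataFrame, so assign the result back
df = df.sort_values(by=['CustomerID', 'InvoiceDate'])
df
which gives: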
   CustomerID  InvoiceID  InvoiceDate     DueDate  InvoiceAmount  SettledDate  DaysToSettle  DaysLate
0           1       4564   2012-03-29  2012-04-28          62.68   2012-04-25            27         0
1           1       7897   2012-05-15  2012-06-14          77.19   2012-05-28            13         0
2           1       8749   2012-05-21  2012-06-20          51.65   2012-06-04            14         0
3           1       4189   2012-06-16  2012-07-16          64.47   2012-07-04            18         0
4           2       1353   2012-02-12  2012-03-13          28.21   2012-02-28            16         0
5           2       4898   2012-03-01  2012-03-31          48.65   2012-04-17            47        17
6           2       7994   2012-03-20  2012-04-19         103.64   2012-04-08            19         0
7           2       4652   2012-07-01  2012-07-31          42.25   2012-07-17            16         0
8           2       1561   2012-09-01  2012-10-01          69.55   2012-09-23            22         0
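In Excel/LibreOffice, it is very easy to add new columns based on two or more IF conditions (for example, CustomerID equal to the current row's, InvoiceDate earlier than the current row's, DaysLate greater than 0) using the functions COUNTIFS, SUMIFS and AVERAGEIFS, where
I2 is =COUNTIFS(A:A,A2, C:C, "<"&C2, H:H,">0"),
J2 is =SUMIFS(E:E,A:A,A2,C:C,"<"&C2), and
K2 is =IFERROR(AVERAGEIFS(E:E,A:A,A2,C:C,"<"&C2),0)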
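I am very new to pandas and can't seem to figure out how to create new columns based on multiple conditions. Any help would be much appreciated.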
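First, create a boolean mask that marks the rows you want to fill. You can then fill values selectively using that mask.
Suppose you have a DataFrame and want to build a column c based on the values of a and b.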
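A minimal sketch of the mask-then-fill idea. The frame, its values, and the condition are hypothetical and chosen only for illustration; only the column names a, b and c come from the text above.

```python
import pandas as pd

# Hypothetical data; not from the original post.
df = pd.DataFrame({"a": [1, 5, 3, 8], "b": [10, 2, 7, 1]})

# Boolean mask: True for the rows that should be filled.
# The condition itself is an arbitrary example.
mask = (df["a"] > 2) & (df["b"] < 5)

# Start from a default value, then overwrite only the masked rows.
df["c"] = 0
df.loc[mask, "c"] = df.loc[mask, "a"] + df.loc[mask, "b"]
```

Combining conditions with & (and) or | (or) requires parentheses around each comparison, because these operators bind more tightly than the comparisons.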
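Basically, you first build a conditional selection and then apply some function to the result.
To get the rows where a column is null, use df[col].isnull(). To sum, use .sum().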
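Applied to the accounts data from the question, the same approach might look like the sketch below. It is a row-by-row illustration, not a tuned solution: the new column names (PriorLateCount, PriorAmountSum, PriorAmountMean) are made up here, and it assumes the SUMIFS and AVERAGEIFS formulas are meant to aggregate InvoiceAmount. The CSV is inlined so the example is self-contained.

```python
import io
import pandas as pd

csv_text = """CustomerID,InvoiceID,InvoiceDate,DueDate,SettledDate,InvoiceAmount,DaysToSettle,DaysLate
1,4564,29-03-2012,28-04-2012,25-04-2012,62.68,27,0
1,7897,15-05-2012,14-06-2012,28-05-2012,77.19,13,0
1,8749,21-05-2012,20-06-2012,04-06-2012,51.65,14,0
1,4189,16-06-2012,16-07-2012,04-07-2012,64.47,18,0
2,1353,12-02-2012,13-03-2012,28-02-2012,28.21,16,0
2,4898,01-03-2012,31-03-2012,17-04-2012,48.65,47,17
2,7994,20-03-2012,19-04-2012,08-04-2012,103.64,19,0
2,4652,01-07-2012,31-07-2012,17-07-2012,42.25,16,0
2,1561,01-09-2012,01-10-2012,23-09-2012,69.55,22,0
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["InvoiceDate", "DueDate", "SettledDate"],
    dayfirst=True,  # dates are day-month-year
)

rows = []
for row in df.itertuples(index=False):
    # Same customer AND strictly earlier invoice date than the current row.
    earlier = (df["CustomerID"] == row.CustomerID) & (df["InvoiceDate"] < row.InvoiceDate)
    amounts = df.loc[earlier, "InvoiceAmount"]
    rows.append({
        # COUNTIFS: earlier invoices of this customer that were late
        "PriorLateCount": int((earlier & (df["DaysLate"] > 0)).sum()),
        # SUMIFS: total amount of earlier invoices of this customer
        "PriorAmountSum": float(amounts.sum()),
        # IFERROR(AVERAGEIFS(...), 0): mean of earlier amounts, 0 if there are none
        "PriorAmountMean": float(amounts.mean()) if earlier.any() else 0.0,
    })

# Attach the hypothetical new columns to the original frame.
df = df.join(pd.DataFrame(rows, index=df.index))
```

This loop compares every row against the whole frame, so it scales quadratically with the number of rows. That is fine for small files; for large ones a grouped, cumulative approach would likely be faster.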