我想根据他们属于哪一组来计算百分位排名。我写了以下代码,并且能够计算,比如zscore,因为只有一个输入。我该怎么处理一个有两个参数的函数呢?谢谢。在
import pandas as pd
import scipy.stats as stats
import numpy as np
funZScore = lambda x: (x - x.mean()) / x.std()
funPercentile = lambda x, y: stats.percentileofscore(x[~np.isnan(x)], y)
A = pd.DataFrame({'Group' : ['A','A','A','A','B','B','B'],
'Value' : [4, 7, None, 6, 2, 8, 1]})
# Compute the Z-score by group
A['Z'] = A.groupby('Group')['Value'].apply(funZScore)
print(A)
Group Value Z
0 A 4.0 -1.091089
1 A 7.0 0.872872
2 A NaN NaN
3 A 6.0 0.218218
4 B 2.0 -0.440225
5 B 8.0 1.144586
6 B 1.0 -0.704361
# compute the percentile rank by group
# how to put two arguments into groupby apply?
# I hope to get something like below
Group Value Z P
0 A 4.0 -1.091089 33.33
1 A 7.0 0.872872 100
2 A NaN NaN NaN
3 A 6.0 0.218218 66.67
4 B 2.0 -0.440225 66.67
5 B 8.0 1.144586 100
6 B 1.0 -0.704361 33.33
我认为需要:
相关问题 更多 >
编程相关推荐