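I am trying to write a function in Python, but it keeps raising the error "cannot reindex from a duplicate axis". I don't know why I am getting this error.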
def get_match_digital(Demo, Imp, imp_cap):
    df102 = []
    # Loop for running a simulation n times
    d1 = df_new
    impcap = imp_cap
    for x in range(1):
        # Randomly selecting n number of random rows which would be no. of impressions in this case
        f2 = d1.sample(Imp)
        # applying frequency capping after sample selection
        d = f2.assign(rn=f2.sort_values(['MemberId'], ascending=False).groupby(['MemberId']).cumcount() + 1).query('rn <= 3')
        a = 20000
        df42 = []
        for y in range(10):
            df = f2.iloc[:a]
            df32 = df.loc[(df['ageGroup25_54'] == Demo)].MemberId.nunique()
            a = a + a
            df42.append(df32)
        df102.append(df42)
    transposed2 = list(zip(*df102))
    avg2 = lambda items: float(sum(items)) / len(items)
    averages2 = list(map(avg2, transposed2))
    columns = ['reach']
    final = pd.DataFrame(columns=columns)
    final = final.assign(reach=averages2)
    final['Imp'] = 20000 * (final.index.values + 1)
    final['reachP'] = round(((final['reach'] / Imp) * 100), 2)
    return final
I think the line of code that raises the error is:
d = f2.assign(rn=f2.sort_values(['MemberId'], ascending=False).groupby(['MemberId']).cumcount() + 1).query('rn <= 3')
What I am trying to do here is make sure that the same MemberId appears no more than 3 times in the random sample.
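For reference, here is a minimal sketch of that capping step on a toy frame. The helper name cap_frequency and the toy data are made up for illustration, and the sketch assumes the error comes from repeated index labels in the sampled frame, which has not been confirmed for df_new. It resets the index first so every row has a unique label, then keeps the first `cap` rows per MemberId.

```python
import pandas as pd


def cap_frequency(sample: pd.DataFrame, key: str = "MemberId", cap: int = 3) -> pd.DataFrame:
    """Keep at most `cap` rows per value of `key` (hypothetical helper)."""
    # Give every row a unique label so later index alignment cannot hit duplicates.
    sample = sample.reset_index(drop=True)
    # cumcount() numbers the rows inside each group 0, 1, 2, ...
    rank = sample.groupby(key).cumcount()
    return sample[rank < cap]


# Toy data with deliberately repeated index labels (an assumption about the real data).
toy = pd.DataFrame(
    {"MemberId": [1, 1, 1, 1, 2, 2, 3], "ageGroup25_54": [1, 1, 1, 1, 0, 0, 1]},
    index=[0, 0, 1, 1, 2, 2, 3],
)
capped = cap_frequency(toy)
```

In this sketch the fourth row for MemberId 1 is dropped and all other rows are kept. Note that the function in the question continues to work on f2 rather than on the capped frame d, so the cap would not affect the reach numbers even if the line succeeded.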
Once we remove this line of code, the function works. We tried using
d = f2.assign(rn=f2.sort_values(['MemberId'], ascending=False).groupby(level=0)(['MemberId']).cumcount() + 1).query('rn <= 3')
and resetting the index with
d.reset_index()
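One thing worth checking about the reset attempt: reset_index returns a new frame rather than modifying the original, and it only helps if it is applied before the assign call. The sketch below uses a made-up frame with repeated index labels (an assumption about what the sampled data looks like) to show the difference between discarding and keeping the result.

```python
import pandas as pd

# Hypothetical frame whose index labels repeat, e.g. after concatenation.
frame = pd.DataFrame({"MemberId": [10, 10, 20, 30]}, index=[0, 0, 1, 1])

# Calling reset_index without keeping the result leaves `frame` unchanged.
frame.reset_index(drop=True)
still_duplicated = not frame.index.is_unique

# Keeping the returned frame gives unique labels 0..n-1.
fixed = frame.reset_index(drop=True)

# With unique labels, assigning an index-aligned Series works.
fixed = fixed.assign(rn=fixed.groupby("MemberId").cumcount() + 1)
```

Checking f2.index.is_unique right after the sample call is a quick way to confirm whether duplicate labels are actually the cause here.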
I hope someone can help me find the cause of the error and help me solve the problem.
Thanks in advance.