How to access aggregate function values in Python

    Home      Away        HG        AG
                        mean      mean count
0  Astra  Bistrita  4.000000  0.000000     1
1  Astra  CFR Cluj  1.100000  2.100000    10
2  Astra  CS......  1.000000  1.000000     1

def functie(home, away):
    subset = final.loc[(final.Home==home)&(final.Away==away)]
    #print(subset['AG'].count)
    if (subset['AG'].count >= 1):
        print("It worked")

functie('Astra', 'Bistrita')

TypeError                                 Traceback (most recent call last)
<ipython-input-46-2e7b6464c49d> in <module>
     92         print("It worked")
     93
---> 94 functie('Astra', 'Bistrita')
     95
     96 final.head(30)

<ipython-input-46-2e7b6464c49d> in functie(home, away)
     89     subset = final.loc[(final.Home==home)&(final.Away==away)]
     90     #print(subset['AG'].count)
---> 91     if (subset['AG'].count >= 1):
     92         print("It worked")
     93

TypeError: '>=' not supported between instances of 'method' and 'int'

2 Answers

User

#1 · Edited 2024-09-23 08:25:11

OK, I managed to find my answer, and I'm posting it here for anyone else with the same problem. Here is what I did:


# Group by home and away team and take the mean of home goals (HG) and away goals (AG).
final = dataset.groupby(['Home', 'Away'], as_index=False).agg({'HG': ['mean'], 'AG': ['mean']})

# Count the matches in which each home team met the same away team. Any column
# without missing values gives the same count; AvgA is used here.
total_matches = dataset.groupby(['Home', 'Away'], as_index=False).AvgA.transform('count')

# Store the per-pair match count on every row of the full dataset.
dataset['total_matches'] = total_matches

def functie(home, away):
    # Select every 'home' vs 'away' match from the full dataset,
    # which covers 6-7 years of matches.
    subset = dataset.loc[(dataset.Home == home) & (dataset.Away == away)]

    # total_matches is identical on every row of the subset, so read the first row.
    x = subset['total_matches'].iloc[0]
    if x < 3:
        print("Less than three matches", x)
    else:
        print("Three or more matches", x)

functie('Astra', 'CFR Cluj')
# prints: Three or more matches 10

functie('Astra', 'Bistrita')
# prints: Less than three matches 1
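
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. The toy rows and the helper name `total_matches_for` are made up for illustration, and `AG` stands in for the `AvgA` column since any column without missing values should give the same count. It also returns 0 for a pair that never met, which the code above does not handle.

```python
import pandas as pd

# Hypothetical toy data; only the column names mirror the ones used above.
dataset = pd.DataFrame({
    "Home": ["Astra", "Astra", "Astra", "Astra"],
    "Away": ["CFR Cluj", "CFR Cluj", "CFR Cluj", "Bistrita"],
    "HG": [1, 2, 0, 4],
    "AG": [2, 3, 1, 0],
})

# transform('count') returns one value per original row, so the result
# can be assigned straight back to the DataFrame as a new column.
dataset["total_matches"] = dataset.groupby(["Home", "Away"])["AG"].transform("count")


def total_matches_for(home, away):
    """Return how many times `home` hosted `away`, or 0 if they never met."""
    subset = dataset.loc[(dataset["Home"] == home) & (dataset["Away"] == away)]
    if subset.empty:
        return 0
    # The count is the same on every row of the subset, so take the first.
    return int(subset["total_matches"].iloc[0])


print(total_matches_for("Astra", "CFR Cluj"))
print(total_matches_for("Astra", "Bistrita"))
```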

User

#2 · Edited 2024-09-23 08:25:11

subset['AG'].count is the pandas count method applied to subset['AG']. Because it is written without (), it gives you the method itself rather than any result of calling it.
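
The difference between the method and its result can be seen in a small sketch. The Series `s` below is a hypothetical stand-in for a goals column, not the asker's data.

```python
import pandas as pd

s = pd.Series([0, 2, 1])  # hypothetical stand-in for a goals column

method = s.count    # no parentheses: the bound method object itself
result = s.count()  # with parentheses: the number of non-null values

print(callable(method))
print(result)

# Comparing the method object with an int reproduces the error from the question.
try:
    method >= 1
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```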

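What you evidently want is to access a column of the subset DataFrame. That DataFrame has a MultiIndex on its columns, which means you can access a single column with a tuple of column names.

So in your code

subset['AG'].count

should be replaced with

subset['AG', 'count']

This resolves the ambiguity in your version, where count is interpreted as a method name although you want it to refer to a column name.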
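
As a rough illustration of the tuple lookup, here is a sketch on made-up data shaped like the table in the question. The rows and the helper name `match_count` are assumptions, and the aggregation is only a guess at how `final` was built, since the question does not show that step.

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
dataset = pd.DataFrame({
    "Home": ["Astra", "Astra", "Astra"],
    "Away": ["Bistrita", "CFR Cluj", "CFR Cluj"],
    "HG": [4, 1, 2],
    "AG": [0, 2, 3],
})

# Passing lists of aggregations produces two-level (MultiIndex) columns,
# for example ('AG', 'mean') and ('AG', 'count').
final = dataset.groupby(["Home", "Away"], as_index=False).agg(
    {"HG": ["mean"], "AG": ["mean", "count"]}
)


def match_count(home, away):
    """Return the aggregated match count for one pair (assumes the pair exists)."""
    subset = final.loc[(final["Home"] == home) & (final["Away"] == away)]
    # A tuple selects exactly one column of the two-level column index.
    return int(subset[("AG", "count")].iloc[0])


print(match_count("Astra", "CFR Cluj"))
```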

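Note, however, that subset['AG', 'count'] >= 1 gives you a boolean Series, because every value in that column is compared with 1. So you still need to decide what exactly the if condition should be.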
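
Which reduction is right depends on what the condition is meant to express. The sketch below shows a few common options on a hypothetical two-row column; the values and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the ('AG', 'count') column of a subset.
counts = pd.Series([1, 10], name=("AG", "count"))

mask = counts >= 1  # element-wise comparison: a boolean Series

any_played = bool(mask.any())                    # at least one row passes
all_at_least_three = bool((counts >= 3).all())   # every row passes
first_count = int(counts.iloc[0])                # or read a single scalar

if first_count >= 1:
    print("It worked")
```

If the subset is known to hold exactly one row, reading the scalar with .iloc[0] and comparing that is probably the most direct choice.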
