Inserting a zero where a pandas DataFrame .count() result is < 1

Posted 2024-09-27 04:19:46


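I'm trying to find a way to insert a zero when the result of the .count() aggregation is < 1. I've tried checking for null/None values in a condition and using a simple < 1 comparison. So far, I can only count cases where the categorical value actually occurs. Here is some sample code to demonstrate my problem: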


import pandas as pd

data = {'Person': ['Jim', 'Jim', 'Jim', 'Jim', 'Jim', 'Bob', 'Bob', 'Bob', 'Bob', 'Bob'],
        'Result': ['Good', 'Good', 'Good', 'Good', 'Good', 'Good', 'Bad', 'Good', 'Bad', 'Bad']}
dtf = pd.DataFrame.from_dict(data)

names = ['Jim','Bob']
append = []
for i in names:
    good = dtf[dtf['Person']==i]
    good = good[good['Result']=='Good']
    if good['Result'].count() > 0:
        good.insert(2,"Count",good['Result'].count())
    elif good['Result'].count() < 1:
        good.insert(2,"Count",0)

    bad = dtf[dtf['Person']==i]
    bad = bad[bad['Result']=='Bad']
    if bad['Result'].count() > 0:
        bad.insert(2,"Count",bad['Result'].count())
    elif bad['Result'].count() < 1:
        bad.insert(2,"Count",0)
    res = [good,bad]
    res = pd.concat(res)
    append.append(res)
    
    print(res)

The current output is:

  Person Result  Count
0    Jim   Good      5
1    Jim   Good      5
2    Jim   Good      5
3    Jim   Good      5
4    Jim   Good      5
  Person Result  Count
5    Bob   Good      2
7    Bob   Good      2
6    Bob    Bad      3
8    Bob    Bad      3
9    Bob    Bad      3

What I'm trying to achieve is a count of zero for Jim for the 'Bad' value in the dtf['Result'] column. Like this:

  Person Result  Count
0    Jim   Good      5
1    Jim   Good      5
2    Jim   Good      5
3    Jim   Good      5
4    Jim   Good      5
5    Jim    Bad      0
  Person Result  Count
6    Bob   Good      2
7    Bob   Good      2
8    Bob    Bad      3
9    Bob    Bad      3
10   Bob    Bad      3

I hope this makes sense. Viva la resistance! └[∵┌]└[ ∵ ]┘[┐∵]┘


1 answer
User
#1 · Posted 2024-09-27 04:19:46

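First, create a MultiIndex mi from the product of the unique Person and Result values, so that combinations missing from df are kept (df here is the question's dtf). Then count all groups with size() and reindex the counts by the MultiIndex, filling missing combinations with 0. Finally, merge the two DataFrames using the union of the keys from both (an outer merge).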

mi = pd.MultiIndex.from_product([df["Person"].unique(),
                                 df["Result"].unique()],
                                names=["Person", "Result"])

out = df.groupby(["Person", "Result"]) \
        .size() \
        .reindex(mi, fill_value=0) \
        .rename("Count") \
        .reset_index()

out = out.merge(df, on=["Person", "Result"], how="outer")
>>> out
   Person Result  Count
0     Jim   Good      5
1     Jim   Good      5
2     Jim   Good      5
3     Jim   Good      5
4     Jim   Good      5
5     Jim    Bad      0
6     Bob   Good      2
7     Bob   Good      2
8     Bob    Bad      3
9     Bob    Bad      3
10    Bob    Bad      3
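
For reference, the steps above can be put together as a self-contained sketch. It rebuilds the question's sample data as df and uses an intermediate name, counts, that does not appear in the original answer. It also assumes a left merge instead of the outer merge: since counts already holds every (Person, Result) pair, a left merge should lose no keys, and it keeps the rows in the order of counts.

```python
import pandas as pd

# Sample data from the question (named dtf there, df in the answer)
data = {
    "Person": ["Jim"] * 5 + ["Bob"] * 5,
    "Result": ["Good"] * 5 + ["Good", "Bad", "Good", "Bad", "Bad"],
}
df = pd.DataFrame(data)

# Every (Person, Result) pair, including pairs that never occur in df
mi = pd.MultiIndex.from_product(
    [df["Person"].unique(), df["Result"].unique()],
    names=["Person", "Result"],
)

# Group sizes, with 0 filled in for the pairs missing from df
counts = (
    df.groupby(["Person", "Result"])
    .size()
    .reindex(mi, fill_value=0)
    .rename("Count")
    .reset_index()
)

# Assumption: a left merge (not the answer's outer merge) to keep counts' row order
out = counts.merge(df, on=["Person", "Result"], how="left")

print(counts)
print(out)
```

The counts frame has one row per pair (Jim/Good 5, Jim/Bad 0, Bob/Good 2, Bob/Bad 3), and out repeats each count once per matching row of df, with a single row for the pair that has no match.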

To split the result back into the question's names and append variables:

names, append = list(zip(*out.groupby("Person")))
>>> names
('Bob', 'Jim')

>>> append
(   Person Result  Count
 6     Bob   Good      2
 7     Bob   Good      2
 8     Bob    Bad      3
 9     Bob    Bad      3
 10    Bob    Bad      3,
   Person Result  Count
 0    Jim   Good      5
 1    Jim   Good      5
 2    Jim   Good      5
 3    Jim   Good      5
 4    Jim   Good      5
 5    Jim    Bad      0)
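
If only the counts themselves are needed rather than one row per original record, a wide summary table may be enough. The sketch below is an alternative to the answer's approach, not part of it: it uses pd.crosstab, which reports 0 for combinations that never occur. The variable name table is only illustrative.

```python
import pandas as pd

# Sample data from the question
data = {
    "Person": ["Jim"] * 5 + ["Bob"] * 5,
    "Result": ["Good"] * 5 + ["Good", "Bad", "Good", "Bad", "Bad"],
}
df = pd.DataFrame(data)

# Rows are persons, columns are results; absent pairs are counted as 0
table = pd.crosstab(df["Person"], df["Result"])

print(table)
print(table.loc["Jim", "Bad"])  # 0: Jim has no 'Bad' rows
```

This gives one row per person instead of repeating the count on every record, so it suits reporting more than the row-level layout requested in the question.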
