Pandas DataFrame: add a column counting events in the past

   | EventID | PictureID | Date
0  | 1       | A         | 2010-01-01
1  | 2       | A         | 2010-02-01
2  | 3       | A         | 2010-02-15
3  | 4       | B         | 2010-01-01
4  | 5       | C         | 2010-02-01
5  | 6       | C         | 2010-02-15

   | EventID | PictureID | Date       | PastSix
0  | 1       | A         | 2010-01-01 | 0
1  | 2       | A         | 2010-02-01 | 1
2  | 3       | A         | 2010-02-15 | 2
3  | 4       | B         | 2010-01-01 | 0
4  | 5       | C         | 2010-02-01 | 0
5  | 6       | C         | 2010-02-15 | 1

1 Answer

User

#1 · Posted on 2024-10-02 00:29:54

I'm not sure how you want to define six months, so I used the previous 183 days. The basic idea is to use the asof() method:

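import io

import numpy as np
import pandas as pd

txt = """EventID  |  PictureID  |  Date
0        |  A          |  2009-07-01
1        |  A          |  2010-01-01
2        |  A          |  2010-02-01
3        |  A          |  2010-02-15
4        |  B          |  2010-01-01
5        |  C          |  2010-02-01
6        |  C          |  2010-02-15
7        |  A          |  2010-08-01
"""

# A regex separator requires the Python parsing engine.
df = pd.read_csv(io.StringIO(txt), sep=r"\s*\|\s*", engine="python", parse_dates=["Date"])

def f(group):
    # Running event count per picture; assumes dates are sorted within each group.
    # Floats let asof() return NaN for lookups that precede the first event.
    count = pd.Series(np.arange(1, len(group) + 1, dtype=float), index=group["Date"])
    prev1day = count.index.shift(-1, freq="D")      # the day before each event
    prev6month = count.index.shift(-183, freq="D")  # 183 days before each event
    # Events up to the previous day minus events up to 183 days earlier.
    result = count.asof(prev1day).fillna(0).values - count.asof(prev6month).fillna(0).values
    return pd.Series(result.astype(int), index=group.index)

# group_keys=False keeps the original row index so the result aligns with df.
df["PastSix"] = df.groupby("PictureID", group_keys=False)[["Date"]].apply(f)
print(df)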

Output: PastSix is 0, 0, 1, 2, 0, 0, 1, 2 for EventID 0 through 7.
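To make the asof() idea more concrete, here is a minimal sketch for a single date, using the dates of PictureID A from the sample data. The variable names (`before_first`, `day_before`, `window_start`, `past_six`) are only illustrative and not part of the original answer. It assumes the index is sorted, which asof() requires.

```python
import pandas as pd

# Dates for PictureID "A" from the sample data, already sorted.
dates = pd.to_datetime(
    ["2009-07-01", "2010-01-01", "2010-02-01", "2010-02-15", "2010-08-01"]
)
# Running count of events, stored as floats.
count = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=dates)

# asof() returns the value at the last index label <= the lookup date,
# or NaN when the lookup date precedes every label.
before_first = count.asof(pd.Timestamp("2009-06-30"))

# Number of events strictly before 2010-02-15 (looked up one day earlier).
day_before = count.asof(pd.Timestamp("2010-02-14"))

# Number of events up to the start of the 183-day window.
window_start = count.asof(pd.Timestamp("2010-02-15") - pd.Timedelta(days=183))

# The difference is the number of events inside the window.
past_six = int(day_before - window_start)
print(before_first, day_before, window_start, past_six)
```

Subtracting the two running counts leaves only the events that fall inside the 183-day window, which is what the grouped function does for every row at once.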
