Sliding-window data with a pandas DataFrame

Published 2024-10-05 10:17:44


I have a dataset like this:

df = DataFrame(dict(month = [1,2,3,4,5,6], a = [2,4,2,4,2,4], b = [3,5,6,3,4,6]))

enter image description here

What I want is a function that takes a window size as input and produces something like the following:

Function: def make_sliding_df(data, size)

  1. If I call make_sliding_df(df, 1), the output should be a DataFrame like this:

enter image description here

  2. With a larger window size, the output should likewise be a DataFrame like this:

enter image description here

I have tried many approaches, but none has worked so far. Any help would be greatly appreciated.


2 Answers

Here is one approach using shift, applymap, and reduce (on Python 3, reduce must first be imported from functools):

In [2007]: def make_sliding(df, N):
      ...:     dfs = [df.shift(-i).applymap(lambda x: [x]) for i in range(0, N+1)]
      ...:     return reduce(lambda x, y: x.add(y), dfs)
      ...:

In [2008]: make_sliding(df, 1)
Out[2008]:
          a         b     month
0  [2, 4.0]  [3, 5.0]  [1, 2.0]
1  [4, 2.0]  [5, 6.0]  [2, 3.0]
2  [2, 4.0]  [6, 3.0]  [3, 4.0]
3  [4, 2.0]  [3, 4.0]  [4, 5.0]
4  [2, 4.0]  [4, 6.0]  [5, 6.0]
5  [4, nan]  [6, nan]  [6, nan]

In [2009]: make_sliding(df, 2)
Out[2009]:
               a              b          month
0  [2, 4.0, 2.0]  [3, 5.0, 6.0]  [1, 2.0, 3.0]
1  [4, 2.0, 4.0]  [5, 6.0, 3.0]  [2, 3.0, 4.0]
2  [2, 4.0, 2.0]  [6, 3.0, 4.0]  [3, 4.0, 5.0]
3  [4, 2.0, 4.0]  [3, 4.0, 6.0]  [4, 5.0, 6.0]
4  [2, 4.0, nan]  [4, 6.0, nan]  [5, 6.0, nan]
5  [4, nan, nan]  [6, nan, nan]  [6, nan, nan]
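The shift-based idea above can also be sketched without applymap (deprecated since pandas 2.1) or reduce. This is only a minimal sketch, not the answerer's code: the name make_sliding_lists is hypothetical, and it assumes the same convention as above, where N extra rows are appended to each cell so that every window holds N + 1 values.

```python
import pandas as pd

df = pd.DataFrame(dict(month=[1, 2, 3, 4, 5, 6],
                       a=[2, 4, 2, 4, 2, 4],
                       b=[3, 5, 6, 3, 4, 6]))

def make_sliding_lists(df, n):
    """Hypothetical helper: each cell becomes [value, next value, ...] with n + 1 entries."""
    out = {}
    for col in df.columns:
        # shift(-i) pulls the value i rows below into the current row (NaN past the end)
        shifted = [df[col].shift(-i).tolist() for i in range(n + 1)]
        # zip(*shifted) regroups the shifted copies row by row into windows
        out[col] = [list(window) for window in zip(*shifted)]
    return pd.DataFrame(out, index=df.index)

print(make_sliding_lists(df, 1))
```

As in the output above, windows that run past the last row are padded with nan, and the shifted values become floats.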

Using numpy. This may look ugly, but it is my first attempt at using numpy...

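import numpy as np
import pandas as pd

def make_sliding_df(df, step=1, width=2):
    l = []
    for x in df.columns:
        a = np.array(df[x])
        # pad with NaN so the last windows can run past the end of the data
        b = np.append(a, [np.nan] * (width - 1))
        # row i of the index grid selects positions step*i .. step*i + width - 1
        idx = np.arange(width)[None, :] + step * np.arange(len(a))[:, None]
        l.append(b[idx].tolist())
    newdf = pd.DataFrame(data=l).T
    newdf.columns = df.columns
    return newdf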

make_sliding_df(df,step=1,width=2)
Out[157]: 
            a           b       month
0  [2.0, 4.0]  [3.0, 5.0]  [1.0, 2.0]
1  [4.0, 2.0]  [5.0, 6.0]  [2.0, 3.0]
2  [2.0, 4.0]  [6.0, 3.0]  [3.0, 4.0]
3  [4.0, 2.0]  [3.0, 4.0]  [4.0, 5.0]
4  [2.0, 4.0]  [4.0, 6.0]  [5.0, 6.0]
5  [4.0, nan]  [6.0, nan]  [6.0, nan]

make_sliding_df(df,step=1,width=3)
Out[158]: 
                 a                b            month
0  [2.0, 4.0, 2.0]  [3.0, 5.0, 6.0]  [1.0, 2.0, 3.0]
1  [4.0, 2.0, 4.0]  [5.0, 6.0, 3.0]  [2.0, 3.0, 4.0]
2  [2.0, 4.0, 2.0]  [6.0, 3.0, 4.0]  [3.0, 4.0, 5.0]
3  [4.0, 2.0, 4.0]  [3.0, 4.0, 6.0]  [4.0, 5.0, 6.0]
4  [2.0, 4.0, nan]  [4.0, 6.0, nan]  [5.0, 6.0, nan]
5  [4.0, nan, nan]  [6.0, nan, nan]  [6.0, nan, nan]
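For comparison, the same NaN-padded windows can be sketched with numpy's sliding_window_view instead of a hand-built index grid. This is a sketch under assumptions rather than the answer's method: it requires NumPy 1.20 or later, supports only a step of 1, and the name make_sliding_windows is hypothetical.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame(dict(month=[1, 2, 3, 4, 5, 6],
                       a=[2, 4, 2, 4, 2, 4],
                       b=[3, 5, 6, 3, 4, 6]))

def make_sliding_windows(df, width=2):
    """Hypothetical helper: each cell becomes a list of `width` consecutive values."""
    out = {}
    for col in df.columns:
        values = df[col].to_numpy(dtype=float)
        # pad with width - 1 NaNs so every original row starts a full window
        padded = np.concatenate([values, np.full(width - 1, np.nan)])
        # one window per original row, each of length `width`
        out[col] = sliding_window_view(padded, width).tolist()
    return pd.DataFrame(out, index=df.index)

print(make_sliding_windows(df, width=3))
```

Here width is the number of values per cell, matching the width parameter in the answer above.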
