How to get a sequence of the previous and next rows for each row in pandas

Published 2024-09-28 01:26:26


I have a DataFrame like this:

x     y   
1     0
5     1
3     0
2     0
5     1
6     0
1     0
4     0
3     1

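I am trying to create a new column that contains, for each row, the previous 2, the current, and the next 2 (x, y) pairs. It should look like this: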

x     y     seq
1     0     [(nan, nan), (nan, nan), (1, 0), (5, 1), (3, 0)]
5     1     [(nan, nan), (1, 0), (5, 1), (3, 0), (2, 0)]
3     0     [(1, 0), (5, 1), (3, 0), (2, 0), (5, 1)]
2     0     [(5, 1), (3, 0), (2, 0), (5, 1), (6, 0)]
5     1     [(3, 0), (2, 0), (5, 1), (6, 0), (nan, nan)]
6     0     [(2, 0), (5, 1), (6, 0), (nan, nan), (nan, nan)]

I wrote:

def sequences(df):

    back2 = (df.x.shift(2), df.y.shift(2))
    back1 = (df.x.shift(1), df.y.shift(1))
    current = (df.x, df.y)
    forward1 = (df.x.shift(-1), df.y.shift(-1))
    forward2 = (df.x.shift(-2), df.y.shift(-2))

    return [back2, back1, current, forward1, forward2]

df['data_sequence'] = df.apply(sequences, axis=1)

But .shift() fails inside df.apply(), because each item is treated as a single int rather than as an element of a Series. How can I do this?


2 Answers

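You can do:

import numpy as np
# Pair up x and y, then pad both ends with two (nan, nan) tuples
xy = list(zip(df['x'], df['y']))
xy = [(np.nan, np.nan)]*2 + xy + [(np.nan, np.nan)]*2
# Row i takes the padded slice i..i+4: 2 before, itself, 2 after
df['seq'] = [xy[i:i+5] for i in range(len(df))]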

df:

    x   y
0   1   0
1   5   1
2   3   0
3   2   0
4   5   1
5   6   0

Output:

    x   y                                                seq
0   1   0   [(nan, nan), (nan, nan), (1, 0), (5, 1), (3, 0)]
1   5   1       [(nan, nan), (1, 0), (5, 1), (3, 0), (2, 0)]
2   3   0           [(1, 0), (5, 1), (3, 0), (2, 0), (5, 1)]
3   2   0           [(5, 1), (3, 0), (2, 0), (5, 1), (6, 0)]
4   5   1       [(3, 0), (2, 0), (5, 1), (6, 0), (nan, nan)]
5   6   0   [(2, 0), (5, 1), (6, 0), (nan, nan), (nan, nan)]
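
The same padding-and-slicing idea can be wrapped in a small function so the window size is not hard-coded. This is only a sketch: the name add_window_column and its parameters n, cols and out are hypothetical, not part of pandas or of the answer above.

```python
import numpy as np
import pandas as pd


def add_window_column(df, n=2, cols=("x", "y"), out="seq"):
    """Hypothetical helper: for each row, collect the n rows before it,
    the row itself, and the n rows after it as a list of tuples,
    padding with nan tuples at both ends of the frame."""
    pairs = list(zip(*(df[c] for c in cols)))
    pad = [(np.nan,) * len(cols)] * n
    padded = pad + pairs + pad
    width = 2 * n + 1
    result = df.copy()  # leave the caller's frame untouched
    result[out] = [padded[i:i + width] for i in range(len(df))]
    return result


df = pd.DataFrame({"x": [1, 5, 3, 2, 5, 6], "y": [0, 1, 0, 0, 1, 0]})
out = add_window_column(df, n=2)
```

Because the slicing is positional, this version does not depend on the index labels of df.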

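Following the approach you were taking, this code is not as clean as SomeDude's, but it works:

import numpy as np

def get_sequence(row, df):
    # row.name is the index label; this assumes the default 0..n-1 RangeIndex
    idx = row.name
    output = []
    for i in range(-2, 3):
        if 0 <= idx+i < df.shape[0]:
            output.append((df.iloc[idx+i].x, df.iloc[idx+i].y))
        else:
            # Neighbour falls outside the frame: pad with (nan, nan)
            output.append((np.nan, np.nan))
    return output

df["sequence"] = df.apply(lambda row: get_sequence(row, df), axis=1)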

0    [(nan, nan), (nan, nan), (1, 0), (5, 1), (3, 0)]
1        [(nan, nan), (1, 0), (5, 1), (3, 0), (2, 0)]
2            [(1, 0), (5, 1), (3, 0), (2, 0), (5, 1)]
3            [(5, 1), (3, 0), (2, 0), (5, 1), (6, 0)]
4            [(3, 0), (2, 0), (5, 1), (6, 0), (1, 0)]
5            [(2, 0), (5, 1), (6, 0), (1, 0), (4, 0)]
6            [(5, 1), (6, 0), (1, 0), (4, 0), (3, 1)]
7        [(6, 0), (1, 0), (4, 0), (3, 1), (nan, nan)]
8    [(1, 0), (4, 0), (3, 1), (nan, nan), (nan, nan)]
dtype: object
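
For comparison, the .shift() idea from the question can be made to work if the shifts are applied to whole columns rather than inside df.apply(axis=1), where each value is a scalar. A minimal sketch, assuming the nine-row frame from the question; note that shifted columns are converted to float to hold NaN, so neighbouring values come back as 1.0 rather than 1.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 5, 3, 2, 5, 6, 1, 4, 3],
                   "y": [0, 1, 0, 0, 1, 0, 0, 0, 1]})

# Shift entire columns; offsets 2, 1, 0, -1, -2 correspond to
# back2, back1, current, forward1, forward2 in the question.
steps = [list(zip(df["x"].shift(k), df["y"].shift(k)))
         for k in range(2, -3, -1)]

# Transpose: zip(*steps) yields one group of five tuples per row.
df["data_sequence"] = [list(window) for window in zip(*steps)]
```

This trades the explicit padding for five shifted copies of each column, which may be easier to read if you are already thinking in terms of shift.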
