Forward fill (ffill) in pandas based on group and the previous row

import numpy as np
import pandas as pd

# A plain list is used instead of np.array so that np.nan stays a real missing value
# (np.array would coerce every cell, including NaN, to a string).
data = [
    [1949, '01/01/2018', np.nan, 17, '30/11/2017'],
    [1949, '01/01/2018', np.nan, 19, np.nan],
    [1811, '01/01/2018', 16, np.nan, '31/11/2017'],
    [1949, '01/01/2018', 15, 21, '01/12/2017'],
    [1949, '01/01/2018', np.nan, 20, np.nan],
    [3212, '01/01/2018', 21, 17, '31/11/2017'],
]
columns = ['id', 'ReceivedDate', 'PropertyType', 'MeterType', 'VisitDate']
df = pd.DataFrame(data, columns=columns)

     id ReceivedDate PropertyType MeterType   VisitDate
0  1949   01/01/2018          NaN        17  30/11/2017
1  1949   01/01/2018          NaN        19  30/11/2017
2  1811   01/01/2018           16       NaN  31/11/2017
3  1949   01/01/2018           15        21  01/12/2017
4  1949   01/01/2018           15        20  01/12/2017
5  3212   01/01/2018           21        17  31/11/2017

2 Answers

User

#1 · edited 2024-10-01 13:33:27

`groupby` and `ffill` with `limit=1`

df.groupby(['id', 'ReceivedDate']).ffill(limit=1)

     id ReceivedDate PropertyType MeterType   VisitDate
0  1949   01/01/2018          NaN        17  30/11/2017
1  1949   01/01/2018          NaN        19  30/11/2017
2  1811   01/01/2018           16        18  31/11/2017
3  1949   01/01/2018           15        21  01/12/2017
4  1949   01/01/2018           15        20  01/12/2017
5  3212   01/01/2018           21        17  31/11/2017
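
In recent pandas versions, `groupby(...).ffill()` returns only the non-key columns, so `id` and `ReceivedDate` would be missing from the result. Here is a minimal sketch of one way to keep the full frame, assuming the sample data is built from a plain list so that `np.nan` stays a real missing value. The names `keys`, `value_cols` and `filled` are mine, not from the post.

```python
import numpy as np
import pandas as pd

rows = [
    [1949, '01/01/2018', np.nan, 17, '30/11/2017'],
    [1949, '01/01/2018', np.nan, 19, np.nan],
    [1811, '01/01/2018', 16, np.nan, '31/11/2017'],
    [1949, '01/01/2018', 15, 21, '01/12/2017'],
    [1949, '01/01/2018', np.nan, 20, np.nan],
    [3212, '01/01/2018', 21, 17, '31/11/2017'],
]
columns = ['id', 'ReceivedDate', 'PropertyType', 'MeterType', 'VisitDate']
df = pd.DataFrame(rows, columns=columns)

keys = ['id', 'ReceivedDate']                           # grouping columns
value_cols = [c for c in df.columns if c not in keys]   # columns to fill

# Fill each missing cell from the previous row of the same group,
# at most one step (limit=1), and write the result back next to the keys.
filled = df.copy()
filled[value_cols] = df.groupby(keys)[value_cols].ffill(limit=1)
print(filled)
```

Keep in mind that "previous row" here means the previous row of the same group, which is not necessarily the row directly above in the frame.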

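`groupby` with `mask` and `shift`

An attempt using groupby, mask, and shift. `j` is not defined in the post as it survives here; it presumably holds the grouping keys `id` and `ReceivedDate`: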

df.mask(df.isnull().astype(int).groupby(j).cumsum().eq(1), df.groupby(j).shift())

or

df.where(df.isnull().astype(int).groupby(j).cumsum().ne(1), df.groupby(j).shift())

     id ReceivedDate PropertyType MeterType   VisitDate
0  1949   01/01/2018          NaN        17  30/11/2017
1  1949   01/01/2018          NaN        19  30/11/2017
2  1811   01/01/2018           16        18  31/11/2017
3  1949   01/01/2018           15        21  01/12/2017
4  1949   01/01/2018           15        20  01/12/2017
5  3212   01/01/2018           21        17  31/11/2017
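
The idea behind the `shift` variant can also be written with `fillna`: a missing cell takes the value from the previous row of its group, and stays missing if that value is missing too. A minimal sketch on a small hypothetical frame (the columns `g` and `v` are made up for illustration and are not from the post), which also checks that this matches `ffill(limit=1)`:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'g': ['a', 'a', 'a', 'a', 'b', 'b'],
    'v': [1.0, np.nan, np.nan, 4.0, np.nan, 6.0],
})

# Value of the previous row within the same group (NaN for the first row of a group).
prev_in_group = toy.groupby('g')['v'].shift()

# Only missing cells are replaced; existing values are never overwritten.
via_shift = toy['v'].fillna(prev_in_group)

# The same result through the built-in forward fill, limited to one step.
via_ffill = toy.groupby('g')['v'].ffill(limit=1)

print(via_shift.tolist())
print(via_shift.equals(via_ffill))
```

Using `fillna` rather than a cumulative count of nulls avoids touching cells that already hold a value.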

User

#2 · edited 2024-10-01 13:33:27

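# Assumes df has the default RangeIndex (0, 1, 2, ...), since the row above is found via index - 1.
cols_to_ffill = ['PropertyType', 'VisitDate']
i = df.copy()

# Dummy non-empty frame so the loop body runs at least once.
newdata = pd.DataFrame(['placeholder'])

while not newdata.index.empty:

    # id and ReceivedDate of the row directly above each row
    RowAboveid = i.id.shift()
    RowAboveRD = i.ReceivedDate.shift()
    # rows where every column to fill is still missing
    rows_with_cols_to_ffill_all_empty = i.loc[:, cols_to_ffill].isnull().all(axis=1)
    # fill only when the row above has the same id and ReceivedDate
    rows_to_ffill = (i.ReceivedDate == RowAboveRD) & (i.id == RowAboveid) & (rows_with_cols_to_ffill_all_empty)
    rows_used_to_fill = i[rows_to_ffill].index - 1

    # take the values from the rows above and relabel them to line up with the rows being filled
    newdata = i.loc[rows_used_to_fill, cols_to_ffill]
    newdata.index += 1
    i.loc[rows_to_ffill, cols_to_ffill] = newdata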

The loop repeats until nothing matches any more (that is, every column has been forward-filled).
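
If the goal is to fill only from the row directly above when it shares the same `id` and `ReceivedDate`, a loop-free variant is to label runs of consecutive rows with equal keys and forward-fill inside each run. This is a sketch of a related technique, not a drop-in replacement for the loop: it fills column by column instead of copying whole rows, and `run_id` is a helper name I introduced. The sample data is assumed to be built from a plain list.

```python
import numpy as np
import pandas as pd

rows = [
    [1949, '01/01/2018', np.nan, 17, '30/11/2017'],
    [1949, '01/01/2018', np.nan, 19, np.nan],
    [1811, '01/01/2018', 16, np.nan, '31/11/2017'],
    [1949, '01/01/2018', 15, 21, '01/12/2017'],
    [1949, '01/01/2018', np.nan, 20, np.nan],
    [3212, '01/01/2018', 21, 17, '31/11/2017'],
]
columns = ['id', 'ReceivedDate', 'PropertyType', 'MeterType', 'VisitDate']
df = pd.DataFrame(rows, columns=columns)

keys = ['id', 'ReceivedDate']
cols_to_ffill = ['PropertyType', 'VisitDate']

# A new run starts whenever any key differs from the row directly above.
starts_new_run = (df[keys] != df[keys].shift()).any(axis=1)
run_id = starts_new_run.cumsum()

# Forward-fill only inside each run of adjacent rows with identical keys.
result = df.copy()
result[cols_to_ffill] = df.groupby(run_id)[cols_to_ffill].ffill()
print(result)
```

On the sample data the runs are rows 0-1, row 2, rows 3-4 and row 5, so row 1 takes its `VisitDate` from row 0 and row 4 takes both values from row 3.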
