Data manipulation in Pandas/Python

2 answers

User

#1 · edited 2024-10-01 02:31:27

Try this, using the following as the list of records:

df2['global_content']

0    100
1    300
2    301
3    101
4    400
5    500
6    401
7    501

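import numpy as np
import pandas as pd
recs = pd.DataFrame()
# Per user: keep their own content, then append 2 random items from df2 they lack.
# Masking with ~isin and dropna casts the candidates to float (hence 300.0 below).
recs['content'] = df.groupby('Masteruserid')['content'].apply(lambda x: list(x) + np.random.choice(df2[~df2.isin(list(x))].dropna().values.flatten(), 2, replace=False).tolist())
recs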

                                    content
Masteruserid                               
1             [100, 101, 102, 300.0, 301.0]
2             [100, 101, 110, 501.0, 301.0]
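
The added items come back as floats because masking the frame introduces NaN before `dropna`. A minimal sketch of the same idea that keeps integers is below. The sample `df`, the helper name `with_extras`, and the seeded generator are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, shaped like the frames used in this answer.
df = pd.DataFrame({
    "Masteruserid": [1, 1, 1, 2, 2, 2],
    "content": [100, 101, 102, 100, 101, 110],
})
df2 = pd.DataFrame({"global_content": [100, 300, 301, 101, 400, 500, 401, 501]})

rng = np.random.default_rng(0)  # seeded only so the run is repeatable


def with_extras(own, pool, n_extra=2):
    """Return the user's own items plus n_extra random pool items they lack."""
    # Boolean filtering drops rows instead of inserting NaN, so ints stay ints.
    candidates = pool[~pool.isin(own)].to_numpy()
    extras = rng.choice(candidates, size=n_extra, replace=False)
    return own.tolist() + extras.tolist()


recs = (
    df.groupby("Masteruserid")["content"]
    .apply(with_extras, pool=df2["global_content"])
    .to_frame()
)
print(recs)
```

Filtering the `global_content` Series directly, rather than masking the whole frame, is what avoids the float cast. The picks are random, so the two appended values differ with the seed.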

User

#2 · edited 2024-10-01 02:31:27

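import numpy as np
import pandas as pd


def add_content(df, gc, k=5):
    """Pad one user's rows up to k with random items from gc they lack."""
    n = len(df)
    if n >= k:
        # Nothing to add; return the group unchanged rather than None
        return df
    gcs = set(gc.squeeze())
    choices = list(gcs.difference(df.content))
    mc = np.random.choice(choices, k - n, replace=False)
    ids = np.repeat(df.Masteruserid.iloc[-1], k - n)
    data = dict(Masteruserid=ids, content=mc)
    # DataFrame.append was removed in pandas 2.0; use pd.concat instead
    return pd.concat([df, pd.DataFrame(data)], ignore_index=True)


# gc is the pool of global content, presumably df2 from the first answer
gb = df.groupby('Masteruserid', group_keys=False)
gb.apply(add_content, gc).reset_index(drop=True)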
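
For comparison, here is a rough sketch of the same padding done with an explicit loop over the groups and a single `pd.concat` at the end, which sidesteps `groupby.apply` entirely. The sample `df` and `gc`, the seed, and the name `padded` are assumptions made up for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; user 2 starts with fewer rows than user 1.
df = pd.DataFrame({
    "Masteruserid": [1, 1, 1, 2, 2],
    "content": [100, 101, 102, 100, 110],
})
gc = pd.Series([100, 300, 301, 101, 400, 500, 401, 501], name="global_content")

rng = np.random.default_rng(0)  # seeded only so the run is repeatable
k = 5  # target number of rows per user

parts = []
for uid, group in df.groupby("Masteruserid"):
    parts.append(group)
    missing = k - len(group)
    if missing > 0:
        # Items in the global pool that this user does not have yet.
        choices = sorted(set(gc) - set(group["content"]))
        extra = rng.choice(choices, size=missing, replace=False)
        parts.append(pd.DataFrame({"Masteruserid": uid, "content": extra}))

padded = pd.concat(parts, ignore_index=True)
print(padded)
```

Note that `rng.choice(..., replace=False)` raises a `ValueError` if a user is missing more items than the pool can supply, so a real dataset may need a guard for that case.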
