How do I remove values from the lists in a Pandas DataFrame column?

[in]
testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])
# Split the product_id's for the testing data
testing_df.set_index(['transaction_id'], inplace=True)
testing_df['product_id'] = testing_df['product_id'].apply(lambda row: row.split(','))

[out]
                     product_id
transaction_id
001                       [P01]
002                  [P01, P02]
003             [P01, P02, P09]
004                  [P01, P03]
005             [P01, P03, P05]
006             [P01, P03, P07]
007             [P01, P03, P08]
008                  [P01, P04]
009             [P01, P04, P05]
010             [P01, P04, P08]

3 Answers

User

#1 · Edited 2024-06-01 14:19:15

I would do it before splitting:

Data:

In [269]: df
Out[269]:
                 product_id
transaction_id
1                       P01
2                   P01,P02
3               P01,P02,P09
4                   P01,P03
5               P01,P03,P05
6               P01,P03,P07
7               P01,P03,P08
8                   P01,P04
9               P01,P04,P05
10              P01,P04,P08

Solution:

^{pr2}$
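
Because the removal can be done on the raw comma-separated strings, one possible sketch of "remove first, split afterwards" is a regex replace followed by the split. This is not necessarily the code this answer originally used; the sample rows, the `exclude` list, and the `cleaned` name are assumptions made for illustration.

```python
import re

import pandas as pd

# Illustrative data in the same shape as df above (strings, not yet split).
df = pd.DataFrame(
    {"product_id": ["P01", "P01,P02", "P01,P04", "P01,P04,P08", "P04,P03"]},
    index=pd.Index([1, 2, 8, 10, 11], name="transaction_id"),
)

exclude = ["P04", "P08"]  # IDs to drop (assumed for this example)

# Match each excluded ID as a whole token, plus a preceding comma if there is one.
pattern = r",?\b(?:" + "|".join(re.escape(p) for p in exclude) + r")\b"

cleaned = (
    df["product_id"]
    .str.replace(pattern, "", regex=True)  # delete the unwanted IDs from the string
    .str.strip(",")                        # drop a comma left at either end
    .str.split(",")                        # only now split into lists
)

print(cleaned.tolist())
# [['P01'], ['P01', 'P02'], ['P01'], ['P01'], ['P03']]
```

One caveat of this sketch: a row that contains only excluded IDs ends up as an empty string, which splits into `['']` rather than an empty list.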

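Alternatively, you could change:

testing_df['product_id'] = testing_df['product_id'].apply(lambda row: row.split(','))

to:

testing_df['product_id'] = testing_df['product_id'].apply(lambda row: list(set(row.split(',')) - set(['P04', 'P08'])))

Demo (note that going through a set does not preserve the original order of the items):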

In [280]: df.product_id.apply(lambda row: list(set(row.split(','))- set(['P04','P08'])))
Out[280]:
transaction_id
1               [P01]
2          [P01, P02]
3     [P09, P01, P02]
4          [P01, P03]
5     [P01, P03, P05]
6     [P07, P01, P03]
7          [P01, P03]
8               [P01]
9          [P01, P05]
10              [P01]
Name: product_id, dtype: object

User

#2 · Edited 2024-06-01 14:19:15

A list comprehension is probably the most efficient option:

exc = {'P04', 'P08'}
df['product_id'] = [[i for i in L if i not in exc] for L in df['product_id']]

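Note that a comparatively slow Python-level loop is unavoidable here, because each cell holds a Python list. apply + lambda, map + lambda, and in-place solutions all involve a Python-level loop as well.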
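
As a self-contained illustration of the list comprehension above, here is a minimal sketch. The sample rows are assumptions chosen to mirror the question's data after splitting; only `exc` and the comprehension come from the answer itself.

```python
import pandas as pd

# Illustrative data: each cell already holds a list of product IDs.
df = pd.DataFrame(
    {
        "product_id": [
            ["P01"],
            ["P01", "P02", "P09"],
            ["P01", "P04", "P05"],
            ["P01", "P04", "P08"],
        ]
    },
    index=pd.Index([1, 3, 9, 10], name="transaction_id"),
)

exc = {"P04", "P08"}  # a set gives constant-time membership checks

# Build a new list per row, keeping items in their original order.
df["product_id"] = [[i for i in L if i not in exc] for L in df["product_id"]]

print(df["product_id"].tolist())
# [['P01'], ['P01', 'P02', 'P09'], ['P01', 'P05'], ['P01']]
```

Unlike the set-difference approach, this keeps the items in their original order and keeps any duplicates that are not excluded.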

User

#3 · Edited 2024-06-01 14:19:15

Store all the elements you want to remove in a list, then remove them from each row's list in place:

remove_results = ['P04', 'P08']
# Each cell holds a list object, so list.remove() mutates the data in place.
for products in testing_df['product_id']:
    for r in remove_results:
        if r in products:
            products.remove(r)  # removes the first occurrence only
