<p>Try exploding both the IDs and the lists, then filter conditionally based on each ID's position within its original row.</p>
<pre class="lang-py prettyprint-override"><code>import pandas as pd

df1 = pd.DataFrame(columns=['ID', 'Divide', 'Object', 'List'],
                   data=[['A, B', 2, 20, [0, 5]],
                         ['C, D', 2, 40, [10, 15, 35]],
                         ['E, F', 2, 20, [11, 15]],
                         ['G', 1, 10, [1, 5]],
                         ['H', 1, 10, ''],
                         ['I, J', 2, 20, '']])

# Split and Explode ID
df1['ID'] = df1['ID'].str.split(', ')
# Group By Each ID and set index so that First and Second IDs are tracked
df1 = df1.explode('ID') \
    .groupby(level=0) \
    .apply(lambda x: x.reset_index()) \
    .droplevel(0)

# Calculate Cap For Later
df1['cap'] = df1['Object'] // df1['Divide'] - 1


def split_lists(g):
    # If more than 1 row and non-empty list
    if len(g) > 1 and not g['List'].empty:
        # Check if is the First ID
        if g['level_0'].iloc[0] == 0:
            # Filter Less Than Equal To Cap
            g['List'] = g['List'][g['List'] <= g['cap']]
        else:
            # Filter Greater Than Cap
            g['List'] = g['List'][g['List'] > g['cap']]
    return g


# Explode Lists Group By ID filter using function
# Regroup and convert back to lists
df2 = df1 \
    .explode('List') \
    .reset_index() \
    .groupby('ID') \
    .apply(split_lists) \
    .groupby('ID')['List'] \
    .apply(lambda x: x.dropna().tolist())

# Drop Extra Columns from df1 and merge back
out = df1.drop(columns=['List', 'index', 'cap']) \
    .merge(df2, left_on='ID', right_index=True, how='left') \
    .reset_index(drop=True)

print(out)
</code></pre>
<p>Output:</p>
<pre><code>  ID  Divide  Object      List
0  A       2      20    [0, 5]
1  B       2      20        []
2  C       2      40  [10, 15]
3  D       2      40      [35]
4  E       2      20        []
5  F       2      20  [11, 15]
6  G       1      10    [1, 5]
7  H       1      10        []
8  I       2      20        []
9  J       2      20        []
</code></pre>
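<p>For comparison, here is a minimal sketch of the same rule written as a plain loop. It is not the approach used above. It assumes the first ID in a row keeps values less than or equal to <code>Object // Divide - 1</code>, that every later ID keeps the values above it, and that a non-list <code>List</code> cell (such as <code>''</code>) means "no values". The names <code>rows</code>, <code>pos</code>, and <code>keep</code> are only illustrative.</p>

```python
import pandas as pd

df = pd.DataFrame(columns=['ID', 'Divide', 'Object', 'List'],
                  data=[['A, B', 2, 20, [0, 5]],
                        ['C, D', 2, 40, [10, 15, 35]],
                        ['E, F', 2, 20, [11, 15]],
                        ['G', 1, 10, [1, 5]],
                        ['H', 1, 10, ''],
                        ['I, J', 2, 20, '']])

rows = []
for ids, divide, obj, values in df.itertuples(index=False):
    cap = obj // divide - 1
    # Assumption: anything that is not a list counts as "no values"
    values = values if isinstance(values, list) else []
    for pos, id_ in enumerate(ids.split(', ')):
        # First ID (pos == 0) keeps values <= cap; later IDs keep values > cap
        keep = [v for v in values if (v <= cap) == (pos == 0)]
        rows.append({'ID': id_, 'Divide': divide, 'Object': obj, 'List': keep})

out = pd.DataFrame(rows)
print(out)
```

<p>Because it avoids <code>groupby(...).apply(...)</code>, this variant may be less sensitive to behaviour differences between pandas versions, at the cost of iterating in Python.</p>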
<hr/>
<p><code>df1</code> with the additional columns:</p>
<pre><code>   index ID  Divide  Object          List  cap
0      0  A       2      20        [0, 5]    9
1      0  B       2      20        [0, 5]    9
0      1  C       2      40  [10, 15, 35]   19
1      1  D       2      40  [10, 15, 35]   19
0      2  E       2      20      [11, 15]    9
1      2  F       2      20      [11, 15]    9
0      3  G       1      10        [1, 5]    9
0      4  H       1      10                  9
0      5  I       2      20                  9
1      5  J       2      20                  9
</code></pre>
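<p>The repeating 0/1 index in the table above is what marks an ID as first or second within its original row. One possible alternative for obtaining that position, sketched here on a reduced frame, is <code>groupby(level=0).cumcount()</code>. The column name <code>pos</code> is an illustrative choice, not something from the code above.</p>

```python
import pandas as pd

df = pd.DataFrame({'ID': ['A, B', 'C, D', 'G'], 'Object': [20, 40, 10]})

# After explode, rows that came from the same source row share an index label
df['ID'] = df['ID'].str.split(', ')
df = df.explode('ID')

# cumcount numbers the rows within each shared index label: 0 = first ID, 1 = second ID
df['pos'] = df.groupby(level=0).cumcount()
df = df.reset_index(drop=True)
print(df)
```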
<p><code>df2</code> after filtering and regrouping:</p>
<pre><code>ID
A      [0, 5]
B          []
C    [10, 15]
D        [35]
E          []
F    [11, 15]
G      [1, 5]
H          []
I          []
J          []
Name: List, dtype: object
</code></pre>
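<p>One caveat worth checking: the sample input uses <code>''</code> for missing lists, and <code>explode</code> leaves a scalar string untouched, so the regrouped lists for H, I, and J may actually contain an empty string rather than being truly empty. As far as I can tell, pandas prints strings inside lists without quotes, so the two can look alike in the output. The sketch below uses a hand-built stand-in for <code>df2</code> (the name <code>df2_like</code> is hypothetical) to show one way to strip such entries.</p>

```python
import pandas as pd

# Hand-built stand-in for df2; the [''] entry mimics an empty-string input
df2_like = pd.Series([[0, 5], [], ['']], index=['A', 'B', 'H'], name='List')

# Drop empty-string placeholders so every "empty" entry is a real empty list
cleaned = df2_like.apply(lambda xs: [v for v in xs if v != ''])
print(cleaned.tolist())
```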