擅长:python、mysql、java
<p>您可以尝试:</p>
<pre><code>import pandas as pd
import numpy as np
import io
data_string = """ChildID,MotherID,preDiabetes
20,455,No
20,455,Not documented
13,102,NaN
13,102,Yes
702,946,No
82,571,No
82,571,Yes
82,571,Not documented
60,530,NaN
"""
data = io.StringIO(data_string)
df = pd.read_csv(data, sep=',', na_values=['NaN'])
df.fillna('no_value', inplace=True)
df = df.groupby(['MotherID', 'ChildID'])['preDiabetes'].apply(
lambda x: 'Yes' if 'Yes' in x.values else (np.NaN if 'no_value' in x.values.all() else 'No'))
df
</code></pre>
<p>结果:</p>
<pre><code>MotherID ChildID
102 13 Yes
455 20 No
530 60 NaN
571 82 Yes
946 702 No
Name: preDiabetes, dtype: object
</code></pre>