<p>I believe you need to process each value in the list separately:</p>
<pre><code>import numpy as np
import pandas as pd

df = pd.DataFrame({'Coach_q1': ['Favourable', 'Favourable', 'Favourable', 'nan'],
                   'Coach_q2': ['Neutral', 'Favourable', 'Favourable', 'NaN'],
                   'Coach_q8': ['Favourable', 'nan', 'Unfavourable', 'Unfavourable']})
print (df)
     Coach_q1    Coach_q2      Coach_q8
0  Favourable     Neutral    Favourable
1  Favourable  Favourable           nan
2  Favourable  Favourable  Unfavourable
3         nan         NaN  Unfavourable

#replace the strings 'nan' and 'NaN' with real missing values
df = df.replace(['nan','NaN'], np.nan)

ratingcollist = ['Coach_','Communication_','Development_','Diversity_','Engagement_']

for rat in ratingcollist:
    #select the columns whose names contain the substring
    cols = df.filter(like=rat).columns
    #mask of rows with no missing values in those columns
    mask = df[cols].notna().all(axis=1)
    #create the new columns only if some columns matched
    if len(cols) > 0:
        df[f'{rat.lower()}fav_count'] = (df[cols] == 'Favourable').sum(axis=1)
        df[f'{rat.lower()}fav_perc'] = df[f'{rat.lower()}fav_count'] / df[cols].count(axis=1)
        df.loc[mask, f'{rat.lower()}agg_perc'] = df.loc[mask, f'{rat.lower()}fav_count'] / len(cols)
</code></pre>
<hr/>
<pre><code>print (df)
     Coach_q1    Coach_q2      Coach_q8  coach_fav_count  coach_fav_perc  \
0  Favourable     Neutral    Favourable                2        0.666667
1  Favourable  Favourable           NaN                2        1.000000
2  Favourable  Favourable  Unfavourable                2        0.666667
3         NaN         NaN  Unfavourable                0        0.000000

   coach_agg_perc
0        0.666667
1             NaN
2        0.666667
3             NaN
</code></pre>
<hr/>
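<p>If you replace <code>nan</code> with the word <code>Missing</code> instead, the <code>fav_perc</code> output is wrong: the second value should be <code>1</code>, but <code>count</code> only excludes real missing values, so the <code>Missing</code> strings are still counted:</p>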
<pre><code>df = pd.DataFrame({'Coach_q1': ['Favourable', 'Favourable', 'Favourable', 'nan'],
                   'Coach_q2': ['Neutral', 'Favourable', 'Favourable', 'NaN'],
                   'Coach_q8': ['Favourable', 'nan', 'Unfavourable', 'Unfavourable']})
print (df)
     Coach_q1    Coach_q2      Coach_q8
0  Favourable     Neutral    Favourable
1  Favourable  Favourable           nan
2  Favourable  Favourable  Unfavourable
3         nan         NaN  Unfavourable

df = df.replace(['nan','NaN'], 'Missing')
print (df)
     Coach_q1    Coach_q2      Coach_q8
0  Favourable     Neutral    Favourable
1  Favourable  Favourable       Missing
2  Favourable  Favourable  Unfavourable
3     Missing     Missing  Unfavourable
</code></pre>
<hr/>
<pre><code>#create a list of all the rating columns
ratingcollist = ['Coach_','Diversity', 'Leadership', 'Engagement']

#loop over the keywords and select the columns that match each one
for rat in ratingcollist:
    cols = df.filter(like=rat).columns
    mask = (df[cols] != 'Missing').all(axis=1)
    #create 3 new columns for each factor: the count of Favourable responses,
    #the percentage of Favourable responses, and the factor-level percentage of Favourable responses
    if len(cols) > 0:
        df[f'{rat.lower()}fav_count'] = (df[cols] == 'Favourable').sum(axis=1)
        #count() still counts the 'Missing' strings, so this denominator is too large
        df[f'{rat.lower()}fav_perc'] = df[f'{rat.lower()}fav_count'] / df[cols].count(axis=1)
        df.loc[mask,f'{rat.lower()}agg_perc'] = df.loc[mask, f'{rat.lower()}fav_count'] / len(cols)
</code></pre>
<hr/>
<pre><code>print (df)
     Coach_q1    Coach_q2      Coach_q8  coach_fav_count  coach_fav_perc  \
0  Favourable     Neutral    Favourable                2        0.666667
1  Favourable  Favourable       Missing                2        0.666667
2  Favourable  Favourable  Unfavourable                2        0.666667
3     Missing     Missing  Unfavourable                0        0.000000

   coach_agg_perc
0        0.666667
1             NaN
2        0.666667
3             NaN
</code></pre>
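<p>To see why the second row comes out as <code>0.666667</code>, here is a minimal sketch on a single row. The frame <code>row</code> and the variable names are made up for illustration; only <code>count</code>, <code>ne</code>, <code>eq</code> and <code>sum</code> are pandas methods.</p>

```python
import pandas as pd

# one-row frame (illustrative data): two real answers and one 'Missing' placeholder
row = pd.DataFrame({'Coach_q1': ['Favourable'],
                    'Coach_q2': ['Favourable'],
                    'Coach_q8': ['Missing']})

# count() skips only real missing values (NaN/None), so the string 'Missing' is counted
counted = row.count(axis=1)

# comparing against the placeholder excludes it explicitly
answered = row.ne('Missing').sum(axis=1)

fav = row.eq('Favourable').sum(axis=1)

print(counted.iloc[0], answered.iloc[0])  # 3 2
```

<p>Dividing <code>fav</code> by <code>counted</code> gives 2/3, while dividing by <code>answered</code> gives 2/2, which is the expected value of <code>1</code>.</p>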
<p>So if you need to keep <code>Missing</code>, replace <code>count</code> with <code>sum</code> over a not-equal-to-<code>Missing</code> comparison:</p>
<pre><code>#create a list of all the rating columns
ratingcollist = ['Coach_','Diversity', 'Leadership', 'Engagement']

#loop over the keywords and select the columns that match each one
for rat in ratingcollist:
    cols = df.filter(like=rat).columns
    mask = (df[cols] != 'Missing').all(axis=1)
    #create 3 new columns for each factor: the count of Favourable responses,
    #the percentage of Favourable responses, and the factor-level percentage of Favourable responses
    if len(cols) > 0:
        df[f'{rat.lower()}fav_count'] = (df[cols] == 'Favourable').sum(axis=1)
        #divide by the number of values that are not 'Missing'
        df[f'{rat.lower()}fav_perc'] = df[f'{rat.lower()}fav_count'] / df[cols].ne('Missing').sum(axis=1)
        df.loc[mask,f'{rat.lower()}agg_perc'] = df.loc[mask, f'{rat.lower()}fav_count'] / len(cols)
</code></pre>
<hr/>
<pre><code>print (df)
     Coach_q1    Coach_q2      Coach_q8  coach_fav_count  coach_fav_perc  \
0  Favourable     Neutral    Favourable                2        0.666667
1  Favourable  Favourable       Missing                2        1.000000
2  Favourable  Favourable  Unfavourable                2        0.666667
3     Missing     Missing  Unfavourable                0        0.000000

   coach_agg_perc
0        0.666667
1             NaN
2        0.666667
3             NaN
</code></pre>
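<p>If the same logic is needed in more than one place, it could be wrapped in a function. The sketch below is only one possible arrangement: the name <code>add_favourable_stats</code> and its <code>missing</code> parameter are hypothetical, and it assumes every unanswered cell holds either the placeholder string or a real <code>NaN</code>.</p>

```python
import pandas as pd

def add_favourable_stats(df, prefixes, missing='Missing'):
    """Hypothetical helper: add fav_count, fav_perc and agg_perc columns per prefix."""
    out = df.copy()
    for prefix in prefixes:
        cols = out.filter(like=prefix).columns
        if len(cols) == 0:
            continue  # no columns match this keyword, so nothing to add
        name = prefix.lower()
        # True where a real answer exists (neither the placeholder nor NaN)
        answered = out[cols].ne(missing) & out[cols].notna()
        fav_count = out[cols].eq('Favourable').sum(axis=1)
        out[f'{name}fav_count'] = fav_count
        # share of Favourable among the answered questions only
        out[f'{name}fav_perc'] = fav_count / answered.sum(axis=1)
        # factor-level share, kept only for rows where every question was answered
        out[f'{name}agg_perc'] = (fav_count / len(cols)).where(answered.all(axis=1))
    return out

# illustrative data matching the example above
demo = pd.DataFrame({'Coach_q1': ['Favourable', 'Favourable', 'Favourable', 'Missing'],
                     'Coach_q2': ['Neutral', 'Favourable', 'Favourable', 'Missing'],
                     'Coach_q8': ['Favourable', 'Missing', 'Unfavourable', 'Unfavourable']})

result = add_favourable_stats(demo, ['Coach_', 'Diversity'])
print(result[['coach_fav_count', 'coach_fav_perc', 'coach_agg_perc']])
```

<p>The function works on a copy, so the input frame is left unchanged, and keywords with no matching columns (here <code>Diversity</code>) are skipped.</p>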