Transposing unstructured rows in pandas

category              UK       US      Germany
sales                 100000   48000   36000
budget                50000    20000   14000
n_employees           300      123     134
diversified           1        0       1
sustainability_score  22.8     38.9    34.5
e_commerce            37000    7000    11000
budget                25000    10000   10000
n_employees           18       22      7
traffic               150 mil  38 mil  12500
subsidy               33000    26000   23000
budget                14000    6000    6000
own_marketing         0        0       1
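
To make the table above reproducible, here is a minimal sketch that builds it as a DataFrame named `df`. Storing every value as a string is an assumption on my part (the question does not show the dtypes); it is chosen because the traffic row mixes plain numbers with text such as "150 mil".

```python
import pandas as pd

# Rows copied from the table in the question. Values are kept as strings
# (an assumption) because 'traffic' contains text like "150 mil".
rows = [
    ("sales", "100000", "48000", "36000"),
    ("budget", "50000", "20000", "14000"),
    ("n_employees", "300", "123", "134"),
    ("diversified", "1", "0", "1"),
    ("sustainability_score", "22.8", "38.9", "34.5"),
    ("e_commerce", "37000", "7000", "11000"),
    ("budget", "25000", "10000", "10000"),
    ("n_employees", "18", "22", "7"),
    ("traffic", "150 mil", "38 mil", "12500"),
    ("subsidy", "33000", "26000", "23000"),
    ("budget", "14000", "6000", "6000"),
    ("own_marketing", "0", "0", "1"),
]
df = pd.DataFrame(rows, columns=["category", "UK", "US", "Germany"])

print(df.shape)  # (12, 4)
```

Note that `budget` and `n_employees` appear more than once, so `category` cannot serve as a unique key until each row is tagged with the section it belongs to.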

UK_main_sales
UK_main_budget
UK_main_n_employees
UK_main_diversified
UK_main_sustainability_score
UK_e_commerce (we could also add sales, but I think it is simpler without it)
UK_e_commerce_budget
UK_e_commerce_n_employees
UK_e_commerce_traffic
UK_subsidy
UK_subsidy_budget
UK_subsidy_own_marketing

1 answer

User

#1 · Posted 2024-09-29 06:27:37

I think you need:

# boolean mask marking the rows that start a new section
mask = df['category'].isin(['subsidy', 'e_commerce'])

# where() keeps only the section names and sets every other row to NaN
# ffill() carries each section name down; fillna('main') labels the rows before the first section
# mask() blanks the section rows themselves, so they get an empty prefix
# join prefix and original category with '_', then strip the stray leading '_'
df['category'] = (df['category'].where(mask).ffill().fillna('main').mask(mask).fillna('')
                   + '_' + df['category']).str.strip('_')

# reshape to a single row: unstack() returns a Series indexed by (country, category)
df = df.set_index('category').unstack().to_frame().T
# flatten the MultiIndex columns into names like UK_main_sales
df.columns = df.columns.map('_'.join)

print(df)
  UK_main_sales UK_main_budget UK_main_n_employees UK_main_diversified  \
0        100000          50000                 300                   1   

  UK_main_sustainability_score UK_e_commerce UK_e_commerce_budget  \
0                         22.8         37000                25000   

  UK_e_commerce_n_employees UK_e_commerce_traffic UK_subsidy  \
0                        18               150 mil      33000   

               ...              Germany_main_n_employees  \
0              ...                                   134   

  Germany_main_diversified Germany_main_sustainability_score  \
0                        1                              34.5   

  Germany_e_commerce Germany_e_commerce_budget Germany_e_commerce_n_employees  \
0              11000                     10000                              7   

  Germany_e_commerce_traffic Germany_subsidy Germany_subsidy_budget  \
0                      12500           23000                   6000   

  Germany_subsidy_own_marketing  
0                             1  

[1 rows x 36 columns]
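
The chained expression in the answer is easier to follow when split into steps. The sketch below is only an illustration: the intermediate names `headers`, `section`, `prefix`, `labels` and `wide` are hypothetical and not part of the answer, and it uses just the UK and US columns (as strings, which is an assumption) to stay short.

```python
import pandas as pd

# Category column from the question; 'e_commerce' and 'subsidy' open new sections.
category = pd.Series(
    ["sales", "budget", "n_employees", "diversified", "sustainability_score",
     "e_commerce", "budget", "n_employees", "traffic",
     "subsidy", "budget", "own_marketing"],
    name="category",
)
mask = category.isin(["subsidy", "e_commerce"])

# Step 1: keep only the section-header labels, NaN everywhere else.
headers = category.where(mask)
# Step 2: carry each header down; rows before the first header become 'main'.
section = headers.ffill().fillna("main")
# Step 3: blank the prefix on the header rows themselves.
prefix = section.mask(mask).fillna("")
# Step 4: join prefix and label, trimming the stray '_' left on header rows.
labels = (prefix + "_" + category).str.strip("_")
print(labels.tolist())

# Reshape with two of the country columns from the question.
uk = ["100000", "50000", "300", "1", "22.8", "37000",
      "25000", "18", "150 mil", "33000", "14000", "0"]
us = ["48000", "20000", "123", "0", "38.9", "7000",
      "10000", "22", "38 mil", "26000", "6000", "0"]
wide = (
    pd.DataFrame({"category": labels, "UK": uk, "US": us})
    .set_index("category")
    .unstack()      # Series with a (country, category) MultiIndex
    .to_frame()
    .T              # single row, MultiIndex columns
)
wide.columns = wide.columns.map("_".join)
print(wide.shape)
```

One thing to keep in mind: `str.strip('_')` trims underscores from both ends, so a category name that itself begins or ends with `_` would be altered. None of the names in the question do.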
