Creating a new column from an existing column and a dictionary

Posted 2024-09-28 22:25:09


I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({"user_id" : ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
                   "score" : [0, 100, 50, 0, 25, 50, 100, 0, 7, 20],
                   "valval" : ["va2.3", "va1.1", "va2.1", "va2.2", "va1.2",
                               "va1.1", "va2.1", "va1.2", "va1.2", "va1.3"]})

print(df)


     | user_id | score | valval 
-----+---------+-------+--------
 0   |     a   |    0  | va2.3  
 1   |     b   |  100  | va1.1  
 2   |     c   |   50  | va2.1  
 3   |     d   |    0  | va2.2  
 4   |     e   |   25  | va1.2  
 5   |     f   |   50  | va1.1  
 6   |     g   |  100  | va2.1  
 7   |     h   |    0  | va1.2  
 8   |     i   |    7  | va1.2  
 9   |     j   |   20  | va1.3  

I also have a dictionary that looks like this:

dic_t = { "key1" : ["va1.1", "va1.2", "va1.3"], "key2" : ["va2.1", "va2.2", "va2.3"]}

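I want to add a new column, "keykey".

Each value in this column should be the dictionary key whose list contains that row's "valval" value.

The result should look like this: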

     | user_id | score | valval | keykey 
-----+---------+-------+--------+--------
 0   |     a   |    0  | va2.3  | key2
 1   |     b   |  100  | va1.1  | key1
 2   |     c   |   50  | va2.1  | key2
 3   |     d   |    0  | va2.2  | key2
 4   |     e   |   25  | va1.2  | key1
 5   |     f   |   50  | va1.1  | key1
 6   |     g   |  100  | va2.1  | key2
 7   |     h   |    0  | va1.2  | key1
 8   |     i   |    7  | va1.2  | key1
 9   |     j   |   20  | va1.3  | key1

3 answers

Start with an empty dictionary, fill it with a value-to-key mapping, and then use the map function

import pandas as pd
df = pd.DataFrame({"user_id" : ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
                   "score" : [0, 100, 50, 0, 25, 50, 100, 0, 7, 20],
                   "valval" : ["va2.3", "va1.1", "va2.1", "va2.2", "va1.2", "va1.1", "va2.1", "va1.2", "va1.2", "va1.3"]})

dic_t = { "key1" : ["va1.1", "va1.2", "va1.3"], "key2" : ["va2.1", "va2.2", "va2.3"]}

d_keykey = {}
for k, v in dic_t.items():
    for val in v:
        d_keykey.update({val: k})
df["keykey"] = df["valval"].map(d_keykey)
print(df)


  user_id  score valval keykey
0       a      0  va2.3   key2
1       b    100  va1.1   key1
2       c     50  va2.1   key2
3       d      0  va2.2   key2
4       e     25  va1.2   key1
5       f     50  va1.1   key1
6       g    100  va2.1   key2
7       h      0  va1.2   key1
8       i      7  va1.2   key1
9       j     20  va1.3   key1
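
One thing to watch with the loop above: if the same value were listed under more than one key, the later key would silently overwrite the earlier one. Below is a minimal sketch of an inversion that rejects such ambiguous input. The helper name invert_mapping is hypothetical (it is not part of pandas or of the answer above), and raising an error is only one possible policy.

```python
def invert_mapping(groups):
    """Invert {key: [values]} into {value: key}.

    Hypothetical helper: raises ValueError if a value is listed under two keys.
    """
    inverted = {}
    for key, values in groups.items():
        for val in values:
            if val in inverted:
                raise ValueError(
                    f"{val!r} is listed under both {inverted[val]!r} and {key!r}"
                )
            inverted[val] = key
    return inverted


dic_t = {"key1": ["va1.1", "va1.2", "va1.3"], "key2": ["va2.1", "va2.2", "va2.3"]}
d_keykey = invert_mapping(dic_t)
print(d_keykey)
```

The resulting d_keykey can be passed to df["valval"].map(d_keykey) exactly as in the answer.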

This is not the most efficient solution, but it gets the job done and is easy to follow


def get_keykey(search_val, ref_dict):
    for key in ref_dict:                       # loop over all keys
        if search_val in ref_dict[key]:        # if search_val is in the list of values for this key, return that key
            return key
    return None                                # no key lists this value

# apply to the valval column of df, passing the dictionary dic_t as ref_dict

df["keykey"] = df["valval"].apply(get_keykey, args = (dic_t,))
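
To check that this row-by-row lookup gives the same result as a dictionary-based map, here is a small self-contained sketch. It assumes the question's dictionary is the one passed in (dic_t) and uses a shortened three-row frame that is made up for illustration.

```python
import pandas as pd

dic_t = {"key1": ["va1.1", "va1.2", "va1.3"], "key2": ["va2.1", "va2.2", "va2.3"]}

# Shortened, illustrative frame (not the full data from the question)
df = pd.DataFrame({"valval": ["va2.3", "va1.1", "va1.2"]})


def get_keykey(search_val, ref_dict):
    # Scan each key's list; return the first key that contains the value
    for key, values in ref_dict.items():
        if search_val in values:
            return key
    return None


# Row-by-row lookup: scans the lists again for every row
via_apply = df["valval"].apply(get_keykey, args=(dic_t,))

# Value-to-key dictionary built once, then a single lookup per row
flat = {val: key for key, values in dic_t.items() for val in values}
via_map = df["valval"].map(flat)

print(via_apply.tolist())
print(via_map.tolist())
```

Both lists should be identical. The apply version repeats the list scan for each row, which is why it is the slower of the two on large frames.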

You can use Series.map after flattening the dictionary into a value-to-key mapping

d = {val:k for k,v in dic_t.items() for val in v}
df['keykey'] = df['valval'].map(d)

print(df)

  user_id  score valval keykey
0       a      0  va2.3   key2
1       b    100  va1.1   key1
2       c     50  va2.1   key2
3       d      0  va2.2   key2
4       e     25  va1.2   key1
5       f     50  va1.1   key1
6       g    100  va2.1   key2
7       h      0  va1.2   key1
8       i      7  va1.2   key1
9       j     20  va1.3   key1
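
A note on values that are not in the dictionary: Series.map leaves them as NaN. The sketch below shows one way to spot and fill them. The value "va9.9" and the "unknown" placeholder are made up for illustration and do not appear in the question.

```python
import pandas as pd

dic_t = {"key1": ["va1.1", "va1.2", "va1.3"], "key2": ["va2.1", "va2.2", "va2.3"]}
d = {val: k for k, v in dic_t.items() for val in v}

# "va9.9" is a made-up value that no key lists
df = pd.DataFrame({"valval": ["va1.1", "va9.9", "va2.2"]})

df["keykey"] = df["valval"].map(d)      # unmatched values become NaN
missing = df["keykey"].isna()           # boolean mask of unmatched rows

# Optional: replace NaN with a placeholder label (an assumption, pick your own)
df["keykey_filled"] = df["keykey"].fillna("unknown")

print(df)
```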
