Pandas: mapping columns from multiple DataFrames

Posted 2024-09-30 04:26:56


I have a DataFrame (FinalDF) that looks like this:

id  | Movie             | Cast
------------------------------------------
0   The Dark Knight     Christopher Nolan
1   The Dark Knight     Christian Bale
2   Pulp Fiction        Quentin Tarantino
3   Pulp Fiction        John Travolta
4   Schindler’s List    Steven Spielberg
5   Schindler’s List    Liam Neeson

Movie and cast names are mapped to ids in a second DataFrame, movie_cast_DF, which has a name column and a uuid column.

I need to map those ids into new columns of FinalDF, like this:

id  | Movie |   Cast |  mid     | cid
------------------------------------------------------------------
0   The Dark Knight     Christopher Nolan       m1      d1
1   The Dark Knight     Christian Bale          m1      a1
2   Pulp Fiction        Quentin Tarantino       m2      d2
3   Pulp Fiction        John Travolta           m2      a2
4   Schindler’s List    Steven Spielberg        m3      d3
5   Schindler’s List    Liam Neeson             m3      a3

I tried the following approach:

def getID(x):
    try:
        return movie_cast_DF[movie_cast_DF['name'].str.contains(x.lower(), case=False)]['uuid'].values[0]
    except:
        return None
FinalDF['mid'] = FinalDF['Movie'].apply(getID)
FinalDF['cid'] = FinalDF['Cast'].apply(getID)
FinalDF.head()

Is there a faster, more efficient way to do this mapping?


1 Answer
User
#1 · Posted 2024-09-30 04:26:56

First, set name as the index of df2 (the lookup DataFrame, movie_cast_DF in the question) and select its uuid column.

dfmap = df2.set_index("name").uuid
dfmap

name
The Dark Knight      m1
Pulp Fiction         m2
Schindler’s List     m3
Christopher Nolan    d1
Christian Bale       a1
Quentin Tarantino    d2
John Travolta        a2
Steven Spielberg     d3
Liam Neeson          a3
Name: uuid, dtype: object

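We will use this Series object to map the keys in df (FinalDF in the question) to their values. Next, call map (or replace) twice, once for the Movie column and once for the Cast column.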
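The answer's original code for this step did not survive on this page, so the following is only a minimal sketch of what the two map calls might look like. The sample frames are rebuilt from the tables and the dfmap output shown above; the names df and df2 follow the answer, and the exact column assignments are an assumption.

```python
import pandas as pd

# Sample data rebuilt from the question's table (illustrative only).
df = pd.DataFrame({
    "Movie": ["The Dark Knight", "The Dark Knight", "Pulp Fiction",
              "Pulp Fiction", "Schindler’s List", "Schindler’s List"],
    "Cast": ["Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
})

# Lookup table; the name/uuid pairs are taken from the dfmap output above.
df2 = pd.DataFrame({
    "name": ["The Dark Knight", "Pulp Fiction", "Schindler’s List",
             "Christopher Nolan", "Christian Bale", "Quentin Tarantino",
             "John Travolta", "Steven Spielberg", "Liam Neeson"],
    "uuid": ["m1", "m2", "m3", "d1", "a1", "d2", "a2", "d3", "a3"],
})

# Series indexed by name, with uuid as the values.
dfmap = df2.set_index("name")["uuid"]

# Assumed form of the two calls: one vectorized lookup per column.
df["mid"] = df["Movie"].map(dfmap)
df["cid"] = df["Cast"].map(dfmap)

print(df)
```

Note that map does an exact lookup on the index, so unlike the str.contains attempt in the question it will not match substrings or ignore case; names with no exact match in dfmap come back as NaN.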
