Merging data frames of different lengths in Python

Published 2024-06-28 11:14:02


My first data frame is df_movieid_genre

Second data frame is df_fraction_data

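I need to join them on the movie ID. Inner or outer joins don't work because df_fraction_data contains duplicate movie IDs. I think a for loop could be used, but I'm a beginner and ran into problems doing it that way. Thanks in advance. I need something like this (just a small example)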


2 answers

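What you can do is group df_fraction_data by Movie_id, then visit each group and attach the values from the df_movieid_genre row that has that Movie_id.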

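import pandas as pd

def merger(group, df2):
    # group.name is the Movie_id shared by every row in this group
    row_to_be_merged = df2.loc[df2['Movie_id'] == group.name].iloc[0]

    group = group.copy()  # avoid modifying a view of the original frame
    group['Genre'] = row_to_be_merged['Genre']
    group['Movie_name'] = row_to_be_merged['Movie_name']

    return group

merged_df = df_fraction_data.groupby('Movie_id').apply(merger, df2=df_movieid_genre)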

apply runs merger once per group dataframe, and merger copies the Genre and Movie_name values for that Movie_id from df_movieid_genre onto every row of the group. Hope it helps :)
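To see what "each group dataframe" means, here is a minimal sketch that loops over the groups by hand. The toy data and the Fraction column are made up for illustration; only Movie_id comes from the answer above.

```python
import pandas as pd

# Hypothetical toy data: Movie_id 1 appears twice, as in the question.
df_fraction_data = pd.DataFrame({
    "Movie_id": [1, 1, 2],
    "Fraction": [0.2, 0.8, 1.0],  # assumed column name
})

# Iterating over a groupby yields (key, sub-frame) pairs; apply calls
# its function once for each of these sub-frames.
group_sizes = {}
for movie_id, group in df_fraction_data.groupby("Movie_id"):
    group_sizes[int(movie_id)] = len(group)

print(group_sizes)
```

Each sub-frame holds only the rows for one Movie_id, which is why a single matching row from df_movieid_genre can be copied onto all of them.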

Try this:

df = pd.merge(left=df_movieid_genre, right=df_fraction_data, on=['Movie_Id'], how='inner')
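Duplicate IDs on one side should not, by themselves, stop a merge from working: pandas repeats the matching row from the other frame for each duplicate. A small sketch with made-up data (the Fraction column and all values are assumptions, and the key is spelled Movie_Id as in the line above):

```python
import pandas as pd

# Hypothetical toy data: one row per movie on the genre side,
# repeated Movie_Id values on the fraction side.
df_movieid_genre = pd.DataFrame({
    "Movie_Id": [1, 2],
    "Genre": ["Drama", "Comedy"],
})
df_fraction_data = pd.DataFrame({
    "Movie_Id": [1, 1, 2],
    "Fraction": [0.2, 0.8, 1.0],  # assumed column name
})

# how="left" keeps every row of df_fraction_data, duplicates included.
# validate="many_to_one" raises MergeError if Movie_Id is not unique
# in df_movieid_genre, which guards against accidental row blow-up.
merged = df_fraction_data.merge(
    df_movieid_genre, on="Movie_Id", how="left", validate="many_to_one"
)
print(merged)
```

If the merge still returns nothing on the real data, it may be worth checking that the key column has the same name and dtype in both frames.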
