Pandas: Answers to "Conditionally merging/joining DataFrames with duplicate IDs"


Category: Python Q&A


1 Answer

Anonymous, 1 day ago


You can use an approach like this: split the wide table into one sub-frame per issue slot, give the sub-frames common column names, and stack them.
<pre><code>import pandas as pd

df1 = pd.DataFrame(
    [[1, "Building M"], [2, "Building V"], [3, "Building H"]],
    columns=["GlobID", "Issue"],
)
df2 = pd.DataFrame(
    [
        [1, "Y", "broken", "bathroom", "N", "", ""],
        [2, "Y", "stained", "bedroom", "Y", "rusty", "basement"],
        [3, "Y", "missing", "kitchen", "Y", "cracked", "attic"],
    ],
    columns=["ID", "Issue_A", "Note_A", "Location_A",
             "Issue_B", "Note_B", "Location_B"],
)

df1 = df1.set_index("GlobID")
df2 = df2.set_index("ID")

# Split df2 into a list of data frames, one per issue slot
issues = ["A", "B"]
description = ["Issue", "Note", "Location"]
delimiter = "_"

issues_df_list = []
for issue in issues:
    # Build the column labels for this slot, e.g. "Issue_A", "Note_A", "Location_A"
    issue_labels = [descr + delimiter + issue for descr in description]
    # Select the sub-frame for this slot (copy to avoid SettingWithCopyWarning)
    df = df2[issue_labels].copy()
    # Rename the columns to the shared labels
    df.columns = description
    # Add the sub-frame to the list
    issues_df_list.append(df)

# Concatenate the sub-frames into one long data frame
issues_df = pd.concat(issues_df_list, sort=False)

# Reshape:
# drop rows whose Issue flag is "N"
issues_df = issues_df[issues_df["Issue"] != "N"]
# drop the Issue flag column
issues_df = issues_df.loc[:, issues_df.columns != "Issue"]
# rename the Note column to Issue
issues_df = issues_df.rename(columns={"Note": "Issue"})
issues_df
</code></pre>
This gives you:
<pre><code>+----+---------+----------+
|    | Issue   | Location |
+----+---------+----------+
| ID |         |          |
| 1  | broken  | bathroom |
| 2  | stained | bedroom  |
| 3  | missing | kitchen  |
| 2  | rusty   | basement |
| 3  | cracked | attic    |
+----+---------+----------+
</code></pre>
Then you can do a simple merge on the indexes. The "Issue" column of df1 is renamed to "Name" first so that it does not clash with the "Issue" column of issues_df:
<pre><code>pd.merge(df1.rename(columns={"Issue": "Name"}), issues_df,
         left_index=True, right_index=True)

# Output:
# +---+------------+---------+----------+
# |   | Name       | Issue   | Location |
# +---+------------+---------+----------+
# | 1 | Building M | broken  | bathroom |
# | 2 | Building V | stained | bedroom  |
# | 2 | Building V | rusty   | basement |
# | 3 | Building H | missing | kitchen  |
# | 3 | Building H | cracked | attic    |
# +---+------------+---------+----------+
</code></pre>
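As a possible alternative to the manual loop, the same reshaping can be sketched with `pd.wide_to_long`, which unpivots column groups that share a stub name and a suffix. This is not the method used in the answer, only a compact variant under the assumption that every column group follows the `Stub_Suffix` pattern with an uppercase-letter suffix. The names `Slot`, `long_df`, `issues`, and `result` are hypothetical and chosen for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, "Building M"], [2, "Building V"], [3, "Building H"]],
    columns=["GlobID", "Issue"],
)
df2 = pd.DataFrame(
    [
        [1, "Y", "broken", "bathroom", "N", "", ""],
        [2, "Y", "stained", "bedroom", "Y", "rusty", "basement"],
        [3, "Y", "missing", "kitchen", "Y", "cracked", "attic"],
    ],
    columns=["ID", "Issue_A", "Note_A", "Location_A",
             "Issue_B", "Note_B", "Location_B"],
)

# Unpivot the *_A / *_B column groups into one row per (ID, Slot).
# "Slot" is a hypothetical name for the column holding the suffix (A, B).
long_df = pd.wide_to_long(
    df2,
    stubnames=["Issue", "Note", "Location"],
    i="ID",
    j="Slot",
    sep="_",
    suffix=r"[A-Z]+",
).reset_index()

# Drop the slots flagged "N", then keep the note as the issue text.
flagged = long_df[long_df["Issue"] != "N"]
issues = flagged[["ID", "Note", "Location"]].rename(columns={"Note": "Issue"})

# Join on the ID columns; df1's "Issue" is renamed to avoid a name clash.
result = (
    df1.rename(columns={"Issue": "Name"})
    .merge(issues, left_on="GlobID", right_on="ID")
    .drop(columns="ID")
)
print(result)
```

Row order may differ from the loop-based version, so sort by `GlobID` if a specific order matters. Here the IDs stay as ordinary columns rather than indexes, which makes the join keys explicit in the `merge` call.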
