<p>You can use an approach like this:</p>
<pre><code>import pandas as pd

df1 = pd.DataFrame([[1,"Building M"],[2,"Building V"], [3, "Building H"]], columns=["GlobID","Issue"])
df2 = pd.DataFrame([[1,"Y","broken","bathroom","N","",""],
[2,"Y","stained","bedroom","Y","rusty","basement"],
[3,"Y","missing","kitchen","Y","cracked","attic"]],
columns=["ID","Issue_A","Note_A", "Location_A", "Issue_B", "Note_B", "Location_B"])
df1 = df1.set_index("GlobID")
df2 = df2.set_index("ID")
# split df2 into a list of data frames, one per issue
issues = ["A", "B"]
description = ["Issue", "Note", "Location"]
delimiter = "_"
issues_df_list = []
for issue in issues:
    # build this issue's column names, e.g. Issue_A, Note_A, Location_A
    issue_labels = [descr + delimiter + issue for descr in description]
    # select the sub-frame for this issue (copy so renaming does not touch df2)
    df = df2[issue_labels].copy()
    # rename the columns to the shared, suffix-free labels
    df.columns = description
    # add the sub-frame to the list
    issues_df_list.append(df)
# concatenate the list into one long data frame (stacks the A rows, then the B rows)
issues_df = pd.concat(issues_df_list, sort=False)
# drop rows with "N" values
issues_df = issues_df[issues_df["Issue"] != "N"]
# drop the Y/N Issue flag column
issues_df = issues_df.drop(columns="Issue")
# rename the Note column to Issue
issues_df = issues_df.rename(columns={"Note":"Issue"})
issues_df
</code></pre>
<p>This gives you:</p>
<pre><code>+----+---------+----------+
|    | Issue   | Location |
+----+---------+----------+
| ID |         |          |
| 1  | broken  | bathroom |
| 2  | stained | bedroom  |
| 3  | missing | kitchen  |
| 2  | rusty   | basement |
| 3  | cracked | attic    |
+----+---------+----------+
</code></pre>
<p>Then you can do a simple merge:</p>
<pre><code>pd.merge(df1.rename(columns={"Issue":"Name"}), issues_df, left_index=True, right_index=True)
+---+------------+---------+----------+
|   | Name       | Issue   | Location |
+---+------------+---------+----------+
| 1 | Building M | broken  | bathroom |
| 2 | Building V | stained | bedroom  |
| 2 | Building V | rusty   | basement |
| 3 | Building H | missing | kitchen  |
| 3 | Building H | cracked | attic    |
+---+------------+---------+----------+
</code></pre>
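<p>As a possible alternative to the explicit loop, the same wide-to-long reshape can probably be done with <code>pd.wide_to_long</code>. This is only a sketch under some assumptions: the columns follow the <code>Stub_Suffix</code> naming used above, <code>ID</code> is still a regular column (not the index), and the level name <code>Slot</code> is a made-up label for the <code>A</code>/<code>B</code> suffix. Row order may differ from the loop version.</p>

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, "Building M"], [2, "Building V"], [3, "Building H"]],
    columns=["GlobID", "Issue"],
)
df2 = pd.DataFrame(
    [
        [1, "Y", "broken", "bathroom", "N", "", ""],
        [2, "Y", "stained", "bedroom", "Y", "rusty", "basement"],
        [3, "Y", "missing", "kitchen", "Y", "cracked", "attic"],
    ],
    columns=["ID", "Issue_A", "Note_A", "Location_A",
             "Issue_B", "Note_B", "Location_B"],
)

# Reshape to one row per (ID, Slot); "Slot" is a hypothetical name
# for the index level that holds the "A"/"B" column suffix.
long_df = pd.wide_to_long(
    df2,
    stubnames=["Issue", "Note", "Location"],
    i="ID",
    j="Slot",
    sep="_",
    suffix=r"[A-Z]+",
)

# Drop the "N" rows, discard the Y/N flag, and reuse Note as the issue text.
issues_long = (
    long_df[long_df["Issue"] != "N"]
    .drop(columns="Issue")
    .rename(columns={"Note": "Issue"})
    .reset_index(level="Slot", drop=True)
)

# Inner join on the index: buildings without any issue would be dropped.
merged = (
    df1.rename(columns={"Issue": "Name"})
    .set_index("GlobID")
    .join(issues_long, how="inner")
)
print(merged)
```

<p>The loop version is easier to adapt when the column names are irregular; <code>wide_to_long</code> is shorter when they all share one separator and suffix pattern.</p>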