Adding values to a dataframe based on existing unique values

CLASS     STUDENT
'Sci'     'Francy'
'Sci'     Vacant
'math'    'Alex'
'math'    'Arthur'
'math'    'Katy'
'math'    Vacant
'eng'     'Jack'
'eng'     Vacant
'eng'     'Francy'
'Hist'    'Francy'
'Hist'    'Francy'
'Hist'    Vacant

unique_class = DF['unique_class'].drop_duplicates()
vacant_column = pd.Series(['vacant'] * unique_class.shape[0])
temp_df = pd.concat([unique_class, vacant_column], axis=1, ignore_index=True)
DF = DF.append(temp_df, ignore_index=True)
DF.drop_duplicates(inplace=True)

3 Answers

User

#1 · Edited 2024-06-01 06:04:36

Use pd.merge:

df_new = pd.DataFrame({'CLASS': df['CLASS'].unique(), 'STUDENT':'vacant'})

df_new.merge(df, how='outer', on=['CLASS','STUDENT'])

# Use `.sort_values(by='CLASS')` if a sorted df is needed

Output:

    CLASS   STUDENT
0   Sci     vacant
1   math    vacant
2   eng     vacant
3   Hist    vacant
4   Sci     Francy
5   math    Alex
6   math    Arthur
7   math    Katy
8   eng     Jack
9   eng     Francy
10  Hist    Francy
11  Hist    Francy
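For anyone who wants to run the merge approach end to end, here is a minimal self-contained sketch. The sample data is an assumption reconstructed from the output above, and the sort and index reset at the end are optional additions rather than part of the original answer.

```python
import pandas as pd

# Sample data (assumed, reconstructed from the output shown above)
df = pd.DataFrame({
    "CLASS": ["Sci", "math", "math", "math", "eng", "eng", "Hist", "Hist"],
    "STUDENT": ["Francy", "Alex", "Arthur", "Katy", "Jack", "Francy", "Francy", "Francy"],
})

# One 'vacant' row per distinct class
df_new = pd.DataFrame({"CLASS": df["CLASS"].unique(), "STUDENT": "vacant"})

# An outer merge on both columns keeps every row from both frames;
# a class that already has a 'vacant' row would match instead of duplicating
result = df_new.merge(df, how="outer", on=["CLASS", "STUDENT"])

# Optional: group rows of the same class together (mergesort is stable)
result = result.sort_values(by="CLASS", kind="mergesort").reset_index(drop=True)
print(result)
```

Because the merge keys include STUDENT, a class that already contains a 'vacant' row should not get a second one.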

User

#2 · Edited 2024-06-01 06:04:36

Here is another approach:

import pandas as pd

# Copy of your data
df = pd.DataFrame({
    "class": ["Sci", "Sci", "math", "math", "math", "eng", "eng", "eng", "Hist", "Hist"],
    "student": ["Francy", "vacant", "Alex", "Arthur", "Katy", "Jack", "vacant", "Francy", "Francy", "Francy"]
    })

# An identical DF with all students equal to "vacant"
vacant_df = pd.DataFrame({"class": df["class"], "student": "vacant"})

# Remove existing 'vacant' from original DF and concatenate with de-duplicated vacant dataframe (to avoid duplicate 'vacant' entries)
final_df = pd.concat([df.loc[df.student != "vacant"], vacant_df.drop_duplicates("class")])

Original dataframe:

  class student
8  Hist  Francy
9  Hist  Francy
0   Sci  Francy
1   Sci  vacant
5   eng    Jack
6   eng  vacant
7   eng  Francy
2  math    Alex
3  math  Arthur
4  math    Katy

Final dataframe:

  class student
8  Hist  Francy
9  Hist  Francy
8  Hist  vacant
0   Sci  Francy
0   Sci  vacant
5   eng    Jack
7   eng  Francy
5   eng  vacant
2  math    Alex
3  math  Arthur
4  math    Katy
2  math  vacant
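The concatenated frame keeps the original index labels, which is why 8, 0, 5 and 2 each appear twice above. As a rough sketch, not part of the original answer, the result can be sorted, re-indexed, and checked for exactly one 'vacant' row per class. The name `vacant_counts` is introduced here only for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    "class": ["Sci", "Sci", "math", "math", "math", "eng", "eng", "eng", "Hist", "Hist"],
    "student": ["Francy", "vacant", "Alex", "Arthur", "Katy", "Jack", "vacant", "Francy", "Francy", "Francy"],
})

vacant_df = pd.DataFrame({"class": df["class"], "student": "vacant"})
final_df = pd.concat([df.loc[df.student != "vacant"], vacant_df.drop_duplicates("class")])

# Stable sort by class: students keep their order and 'vacant' stays last in each class
final_df = final_df.sort_values("class", kind="mergesort").reset_index(drop=True)

# Hypothetical sanity check: count 'vacant' rows per class
vacant_counts = final_df["student"].eq("vacant").groupby(final_df["class"]).sum()
print(final_df)
print(vacant_counts)
```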

User

#3 · Edited 2024-06-01 06:04:36

For the record, your solution is not wrong. You can get the same result in "one line" with almost the same approach:

df = df.append(df[['CLASS']].drop_duplicates().assign(STUDENT='Vacant')).drop_duplicates()

[Output]

  CLASS STUDENT
0   Sci  Francy
1   Sci  Vacant
2  math    Alex
3  math  Arthur
4  math    Katy
5   eng    Jack
6   eng  Vacant
7   eng  Francy
8  Hist  Francy
2  math  Vacant
8  Hist  Vacant

If needed, you can chain sort_values and reset_index to make the table cleaner:

df = (df.append(df[['CLASS']].drop_duplicates().assign(STUDENT='Vacant'))
      .drop_duplicates()
      .sort_values('CLASS')
      .reset_index(drop=True))

[Output]

   CLASS STUDENT
0   Hist  Francy
1   Hist  Vacant
2    Sci  Francy
3    Sci  Vacant
4    eng    Jack
5    eng  Vacant
6    eng  Francy
7   math    Alex
8   math  Arthur
9   math    Katy
10  math  Vacant
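Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the snippets above fail on current versions. A minimal sketch of the same idea using `pd.concat` might look like this. The sample data is an assumption reconstructed from the output above, and `vacant_rows` is a name introduced only for illustration.

```python
import pandas as pd

# Sample data (assumed, reconstructed from the output shown above)
df = pd.DataFrame({
    "CLASS": ["Sci", "Sci", "math", "math", "math", "eng", "eng", "eng", "Hist", "Hist"],
    "STUDENT": ["Francy", "Vacant", "Alex", "Arthur", "Katy", "Jack", "Vacant", "Francy", "Francy", "Francy"],
})

# One 'Vacant' row per distinct class
vacant_rows = df[["CLASS"]].drop_duplicates().assign(STUDENT="Vacant")

# pd.concat replaces the removed DataFrame.append
result = (pd.concat([df, vacant_rows])
          .drop_duplicates()
          .sort_values("CLASS", kind="mergesort")
          .reset_index(drop=True))
print(result)
```

As in the original one-liner, `drop_duplicates()` also collapses the repeated ('Hist', 'Francy') row, so this only fits if duplicate students within a class should not be preserved.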
