Add a numeric column to a pandas DataFrame based on another text column

import pandas as pd

df = pd.DataFrame([['137', 'earn'], ['158', 'earn'], ['144', 'ship'], ['111', 'trade'], ['132', 'trade']], columns=['value', 'topic'])
print(df)

  value  topic
0   137   earn
1   158   earn
2   144   ship
3   111  trade
4   132  trade

3 Answers

User

#1 · Edited 2024-10-01 11:38:22

Option 1
pd.factorize

df['topic_id'] = pd.factorize(df.topic)[0]
df

  value  topic  topic_id
0   137   earn         0
1   158   earn         0
2   144   ship         1
3   111  trade         2
4   132  trade         2
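
pd.factorize returns the unique labels along with the integer codes, so the id-to-topic mapping can be kept as well. A minimal sketch reusing the question's data; the names codes, uniques and id_to_topic are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    [['137', 'earn'], ['158', 'earn'], ['144', 'ship'],
     ['111', 'trade'], ['132', 'trade']],
    columns=['value', 'topic'],
)

# factorize returns (codes, uniques); codes follow order of first appearance
codes, uniques = pd.factorize(df['topic'])
df['topic_id'] = codes

# uniques[i] is the label that received code i
id_to_topic = dict(enumerate(uniques))
```

Keeping id_to_topic around makes it possible to translate the ids back to topic names later.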

Option 2
np.unique

import numpy as np

df['topic_id'] = np.unique(df.topic, return_inverse=True)[1]
df

Option 3
pd.Categorical

df['topic_id'] = pd.Categorical(df.topic).codes
df

  value  topic  topic_id
0   137   earn         0
1   158   earn         0
2   144   ship         1
3   111  trade         2
4   132  trade         2
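
If the ids need to follow a particular order, pd.Categorical accepts an explicit categories list, and any value not in that list gets the code -1. A small sketch; the ordering and the extra 'crude' value are made up for illustration:

```python
import pandas as pd

topics = pd.Series(['earn', 'earn', 'ship', 'trade', 'trade', 'crude'])

# Hypothetical ordering: the position in this list becomes the code
order = ['trade', 'ship', 'earn']

# 'crude' is not listed in order, so it is encoded as -1
codes = pd.Categorical(topics, categories=order).codes
```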

Option 4
DataFrameGroupBy.ngroup

df['topic_id'] = df.groupby('topic').ngroup()
df

  value  topic  topic_id
0   137   earn         0
1   158   earn         0
2   144   ship         1
3   111  trade         2
4   132  trade         2
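
The options agree on the sample data only because the topics already appear in alphabetical order. pd.factorize numbers labels by first appearance, while groupby sorts the group keys by default. A small sketch with a hypothetical, reordered topic column shows the difference:

```python
import pandas as pd

# Hypothetical data where the first-seen order differs from alphabetical order
s = pd.DataFrame({'topic': ['trade', 'earn', 'ship', 'earn']})

# Order of first appearance: trade=0, earn=1, ship=2
by_appearance = pd.factorize(s['topic'])[0]

# Sorted group keys (the groupby default): earn=0, ship=1, trade=2
by_sorted = s.groupby('topic').ngroup()

# sort=False keeps the groups in first-seen order, matching factorize
by_appearance_grp = s.groupby('topic', sort=False).ngroup()
```

Which numbering is preferable depends on whether the ids should stay stable when rows are reordered.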

User

#2 · Edited 2024-10-01 11:38:22

You can use

In [63]: df['topic'].astype('category').cat.codes
Out[63]:
0    0
1    0
2    1
3    2
4    2
dtype: int8
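
The codes come back as int8, as the output above shows. To store them as a regular column, one possible sketch is to cast them and keep the category labels for reverse lookup; the names cat and categories are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    [['137', 'earn'], ['158', 'earn'], ['144', 'ship'],
     ['111', 'trade'], ['132', 'trade']],
    columns=['value', 'topic'],
)

cat = df['topic'].astype('category')

# Cast from int8 to int64 so the column behaves like any other integer column
df['topic_id'] = cat.cat.codes.astype('int64')

# categories[i] is the label encoded as i (sorted order for string labels)
categories = list(cat.cat.categories)
```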

User

#3 · Edited 2024-10-01 11:38:22

We can use the apply function to create a new column based on an existing column, as shown below.

topic_list = list(df["topic"].unique())
df['topic_id'] = df.apply(lambda row: topic_list.index(row["topic"]), axis=1)
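
Note that list.index does a linear search for every row, which may get slow when there are many distinct topics. A variation on the same idea (not this answer's code) is to build a dict once and use Series.map; the name topic_to_id is only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    [['137', 'earn'], ['158', 'earn'], ['144', 'ship'],
     ['111', 'trade'], ['132', 'trade']],
    columns=['value', 'topic'],
)

# Build the label -> id lookup once, in order of first appearance
topic_to_id = {topic: i for i, topic in enumerate(df['topic'].unique())}

# map does one dict lookup per row instead of a list search
df['topic_id'] = df['topic'].map(topic_to_id)
```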
