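Standardizing values in a DataFrame column

2 answers

User

#1 · edited 2024-06-13 09:27:50

You can use map, keyed on the first three characters of each value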

In [3664]: mapping = dict(zip(s.str[:3], s))

In [3665]: df.response.str[:3].map(mapping)
Out[3665]:
0    current
1       loan
2    current
3       loan
4    current
5       loan
Name: response, dtype: object

In [3666]: df['response2'] = df.response.str[:3].map(mapping)

In [3667]: df
Out[3667]:
   id  colour response response2
0   1    blue   curent   current
1   2     red  loaning      loan
2   3  yellow  current   current
3   4   green     loan      loan
4   5     red  currret   current
5   6   green     loan      loan

where s is a Series of the valid values (current, loan and transfer, per the mapping shown under the next heading).

Details

In [3652]: mapping
Out[3652]: {'cur': 'current', 'loa': 'loan', 'tra': 'transfer'}

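mapping can also be a Series, as long as the three-character prefixes are the index and the valid values are the data (map looks keys up in the index)

In [3678]: pd.Series(s.values, index=s.str[:3].values)
Out[3678]:
cur     current
loa        loan
tra    transfer
dtype: object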
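For anyone who wants to run this end to end, here is a minimal sketch that puts the steps of this answer together. The sample df and the Series s are reconstructed from the outputs and the mapping dict shown in this answer, so treat both as assumptions rather than the asker's real data.

```python
import pandas as pd

# Sample data reconstructed from the output shown in the answer (an assumption).
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "colour": ["blue", "red", "yellow", "green", "red", "green"],
    "response": ["curent", "loaning", "current", "loan", "currret", "loan"],
})

# Valid values, inferred from the mapping dict in the answer (an assumption).
s = pd.Series(["current", "loan", "transfer"])

# Key each valid value by its first three characters.
mapping = dict(zip(s.str[:3], s))

# Look up each response by its own three-character prefix.
# Prefixes that are not keys in mapping come back as NaN.
df["response2"] = df["response"].str[:3].map(mapping)

print(df)
```

Note that this only works while the typo sits after the third character and no two valid values share a prefix; a response such as "cruent" would map to NaN.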

User

#2 · edited 2024-06-13 09:27:50

Fuzzy matching?

from fuzzywuzzy import process

# val.validate is the Series of valid values.
# extractOne returns the best match as a tuple whose first item is the matched value.
df['response2'] = [process.extractOne(x, val.validate)[0] for x in df.response]
df
Out[867]: 
   id  colour response response2
0   1    blue   curent   current
1   2     red  loaning      loan
2   3  yellow  current   current
3   4   green     loan      loan
4   5     red  currret   current
5   6   green     loan      loan
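If installing fuzzywuzzy is not an option, a similar effect can probably be had with difflib from the standard library. This is a swapped-in technique, not what the answer above uses. The valid list, the closest helper and the 0.6 cutoff are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Valid values (assumed; the answer above reads them from val.validate).
valid = ["current", "loan", "transfer"]

def closest(value, choices=valid, cutoff=0.6):
    """Return the most similar valid value, or None if nothing reaches the cutoff."""
    matches = get_close_matches(value, choices, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Responses copied from the example table.
responses = ["curent", "loaning", "current", "loan", "currret", "loan"]
cleaned = [closest(r) for r in responses]
print(cleaned)
```

With a DataFrame, the same helper could be applied as df['response2'] = df.response.map(closest). Unlike the prefix approach, this also tolerates typos in the first few characters, at the cost of having to pick a cutoff.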
