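I built a keyword extractor on a pandas DataFrame that reads its keywords from a txt file and uses 'other'
as the fallback for rows that match nothing, but the code seems long. Here is my dataset: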
id description
1 description: kartu debit 20/10 indomaretcipete r
4 description: biaya adm
15 description: tarikan atm 14/10
20 description: trsf ws269b100420/home credit 0372540
22 description: kartu debit 09/10 starbuckspasaraya
Below is the txt file, named text.txt:
indomaret
starbucks
home credit
Here is my code:
with open('text.txt') as f:
    content = f.readlines()
content = [x.strip() for x in content]

def ambil(inp):
    try:
        out = []
        for x in content:
            if x in inp:
                out.append(x)
        if len(out) == 0:
            return 'other'
        else:
            output = ' '.join(out)
            return output
    except:
        return 'other'

df['keyword'] = df['description'].apply(ambil)
This is the output:
id description keyword
1 description: kartu debit 20/10 indomaretcipete r indomaret
4 description: biaya adm other
15 description: tarikan atm 14/10 other
20 description: trsf ws269b100420/home credit 0372540 home credit
22 description: kartu debit 09/10 starbuckspasaraya starbucks
I would like to shorten my code by using existing pandas functions. How can I do that?
This should work:
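The answer's original code did not survive on this page, so what follows is only one possible sketch, not necessarily what the answerer posted. It assumes the keywords are plain substrings, builds a single regular expression from them, and uses the pandas string methods `str.findall` and `str.join`. The DataFrame and the `content` list are rebuilt inline from the question's sample data so the snippet runs on its own; in the real script `content` would still come from text.txt.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 4, 15, 20, 22],
    "description": [
        "description: kartu debit 20/10 indomaretcipete r",
        "description: biaya adm",
        "description: tarikan atm 14/10",
        "description: trsf ws269b100420/home credit 0372540",
        "description: kartu debit 09/10 starbuckspasaraya",
    ],
})

# Stand-in for the stripped lines of text.txt.
content = ["indomaret", "starbucks", "home credit"]

# One alternation pattern; re.escape guards against regex metacharacters
# that might appear in a keyword.
pattern = "|".join(re.escape(k) for k in content)

df["keyword"] = (
    df["description"]
    .str.findall(pattern)   # list of matched keywords per row (NaN stays NaN)
    .str.join(" ")          # join the matches with spaces, "" when no match
    .replace("", "other")   # no match -> 'other'
    .fillna("other")        # missing description -> 'other'
)

print(df)
```

This is not an exact drop-in for `ambil`: matches come back in the order they appear in the description rather than in the order of text.txt, and a keyword that occurs twice in one description is listed twice. If only the first match per row is needed, `df["description"].str.extract(f"({pattern})", expand=False).fillna("other")` would likely be enough.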