How to make a txt-based keyword extractor on a pandas DataFrame more efficient, with 'other' as the exception handler

Posted 2024-10-05 14:26:43


I built a txt-based keyword extractor on a pandas DataFrame, with 'other' as the exception handler, but the code seems long. This is my dataset:

id  description
1   description: kartu debit 20/10 indomaretcipete r
4   description: biaya adm
15  description: tarikan atm 14/10
20  description: trsf ws269b100420/home credit 0372540
22  description: kartu debit 09/10 starbuckspasaraya

Below is the txt file, named text.txt:

indomaret
starbucks
home credit

This is my code:

# Read the keywords, one per line, from text.txt
with open('text.txt') as f:
    content = f.readlines()
content = [x.strip() for x in content]

def ambil(inp):
    try:
        out = []
        for x in content:
            if x in inp:
                out.append(x)
        if len(out) == 0:
            return 'other'
        else:
            output = ' '.join(out)
            return output
    except TypeError:
        # Non-string input such as NaN cannot be searched
        return 'other'

df['keyword'] = df['description'].apply(ambil)

This is the output:

id  description                                         keyword
1   description: kartu debit 20/10 indomaretcipete r    indomaret
4   description: biaya adm                              other
15  description: tarikan atm 14/10                      other
20  description: trsf ws269b100420/home credit 0372540  home credit
22  description: kartu debit 09/10 starbuckspasaraya    starbucks

I would like to shorten my code using existing pandas functions. How can I do that?


1 answer

User

#1 · Posted 2024-10-05 14:26:43

This should work. Joining an empty list of matches gives an empty string, not NaN, so the fallback uses or 'other' rather than fillna:

df['keyword'] = df['description'].apply(lambda x: ' '.join(i for i in content if i in x) or 'other')
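Since the question asks for existing pandas functions, here is a minimal sketch of a vectorized variant that uses the .str accessor instead of apply. It is an illustration under stated assumptions, not the answerer's method: the sample DataFrame and the keyword list are inlined from the question so the block runs on its own, and the name pattern is introduced here for illustration.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 4, 15, 20, 22],
    "description": [
        "description: kartu debit 20/10 indomaretcipete r",
        "description: biaya adm",
        "description: tarikan atm 14/10",
        "description: trsf ws269b100420/home credit 0372540",
        "description: kartu debit 09/10 starbuckspasaraya",
    ],
})

# Keywords as they would be read from text.txt
# (inlined here as an assumption, so no file is needed).
content = ["indomaret", "starbucks", "home credit"]

# One alternation pattern; re.escape keeps any regex
# metacharacters in the keywords literal.
pattern = "|".join(re.escape(k) for k in content)

df["keyword"] = (
    df["description"]
    .str.findall(pattern)    # list of matches per row; NaN stays NaN
    .str.join(" ")           # empty list becomes an empty string
    .replace("", "other")    # no keyword found
    .fillna("other")         # missing description
)

print(df)
```

One behavioral difference to be aware of: findall returns matches in the order they appear in the description and repeats a keyword that occurs more than once, whereas the loop in the question lists each keyword once, in file order. If one keyword is a substring of another, the two approaches can also disagree.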
