I have a DataFrame in which one column contains rows like these:
__label__JCB_Spare_Part __label__Differential_Housings jcb casting assy differential housing
__label__Vibrating_Roller __label__Road_Roller double drum mini roller seat drive model fyl engine nbsp hp aircolled diesel engine wheel size walk speed km climbing capacity drive hydrostatic drive nbsp nbsp
__label__Vibrating_Roller __label__Road_Roller double drum mini roller seat drive model fyl engine nbsp hp aircolled diesel engine wheel size walk speed km climbing capacity drive hydrostatic drive nbsp nbsp
__label__Crawler_Dozer __label__Bulldozer dozer bulldozer
__label__Crawler_Dozer __label__Bulldozer dozer bulldozer
I want to extract all words prefixed with __label__ into a separate column, like this:
__label__JCB_Spare_Part __label__Differential_Housings
__label__Vibrating_Roller __label__Road_Roller
__label__Vibrating_Roller __label__Road_Roller
__label__Crawler_Dozer __label__Bulldozer
__label__Crawler_Dozer __label__Bulldozer
I tried:
labels = input[0].str.extract(r'(__label__[\w]+)')
But it only extracts the first label from each row.
You can try this:
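Your code is essentially correct; you just want findall instead of extract. extract returns only the first match of the pattern in each row, while findall returns every match.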
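A minimal sketch of the findall approach, assuming the raw text sits in column 0 as in the question. The frame name df, the two sample rows, and the new column name labels are illustrative choices, not part of the original post:

```python
import pandas as pd

# Hypothetical sample frame; column 0 holds the raw text, as in the question.
df = pd.DataFrame({0: [
    "__label__JCB_Spare_Part __label__Differential_Housings jcb casting assy differential housing",
    "__label__Crawler_Dozer __label__Bulldozer dozer bulldozer",
]})

# str.findall returns a list of every match per row,
# whereas str.extract keeps only the first match.
matches = df[0].str.findall(r"__label__\w+")

# Join each list back into one space-separated string for the new column.
df["labels"] = matches.str.join(" ")

print(df["labels"].tolist())
```

If one row per label is preferable to a joined string, str.extractall with the same pattern wrapped in a capture group is another option; it returns a DataFrame indexed by row and match number.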