python pandas: how do I merge/join two tables based on a substring?

User

Answer 1 · edited 2024-09-30 02:18:48

You could index the Comments field with a library such as Whoosh and then run a text search for each shipment number you are looking for.
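
To illustrate the idea without depending on Whoosh itself, here is a minimal sketch of an inverted index in plain Python. It is a stand-in for what a search library does, not Whoosh's API, and the sample comments and shipment numbers are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample data: comment texts and the shipment numbers to look up.
comments = [
    "Delivered, ref SHP1001 signed by front desk",
    "Customer called about SHP1002 and SHP1001",
    "No shipment referenced",
]
ship_numbers = ["SHP1001", "SHP1002", "SHP9999"]

# Build the index: token -> set of positions of the comments containing it.
index = defaultdict(set)
for pos, text in enumerate(comments):
    for token in re.findall(r"\w+", text.lower()):
        index[token].add(pos)

# Look up each shipment number as a whole token.
matches = {num: sorted(index.get(num.lower(), set())) for num in ship_numbers}
print(matches)
```

Note that this only finds shipment numbers that appear as whole tokens; a number embedded inside a longer word would need a different tokenizer or a plain substring scan.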

User

Answer 2 · edited 2024-09-30 02:18:48

Here is an example based on some made-up data. Ignore the nonsense I put in the DataFrames; I just typed in something to get a sample df.

import pandas as pd

x = pd.DataFrame({'Location': ['Chicago','Houston','Los Angeles','Boston','NYC','blah'],
                  'Comments': ['chicago is winter','la is summer','boston is winter','dallas is spring','NYC is spring','seattle foo'],
                  'Dir':      ['N','S','E','W','S','E']})

y = pd.DataFrame({'Location': ['Miami','Dallas'],
                  'Season':   ['Spring','Fall']})


def findval(row):
    # Lowercase all three values; a missing value (NaN) becomes the string 'nan'
    comment, location, season = (str(value).lower() for value in row)
    return location in comment or season in comment

# Stack the two frames; columns missing from one frame are filled with NaN
merged = pd.concat([x,y])

# True where the row's own Location or Season occurs inside its Comments
merged['Helper'] = merged[['Comments','Location','Season']].apply(findval,axis=1)
print(merged)
filtered = merged[merged['Helper']]
print(filtered)

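You can simply concatenate the DataFrames and then create a helper column that checks whether the string in one column is found in another. Once you have the helper column, just filter on True.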
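
The helper above only compares columns within the same row. If every row of one table has to be checked against every row of the other, a cross join followed by a substring filter is one option. This is a sketch under assumptions: the frames `left` and `right` are made-up sample data, and `how='cross'` needs pandas 1.2 or later.

```python
import pandas as pd

# Hypothetical sample frames for illustration only.
left = pd.DataFrame({'Comments': ['dallas is spring', 'miami is hot', 'seattle foo']})
right = pd.DataFrame({'Location': ['Miami', 'Dallas'],
                      'Season': ['Spring', 'Fall']})

# Pair every comment with every location (requires pandas >= 1.2).
pairs = left.merge(right, how='cross')

# Keep the pairs where the location appears inside the comment (case-insensitive).
mask = [loc.lower() in com.lower()
        for com, loc in zip(pairs['Comments'], pairs['Location'])]
joined = pairs[mask].reset_index(drop=True)
print(joined)
```

The cross join builds len(left) * len(right) rows, so this is only practical when at least one of the tables is small.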

User

Answer 3 · edited 2024-09-30 02:18:48

Why not do something like this:

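# Assumes df1 (with a 'Comments' column) and df2 (with 'ShipNumber',
# 'TrackNumber' and 'Amount' columns) already exist, as in the question.
def MergeFunction(rowElement):
    # Scan df2 for a ShipNumber or TrackNumber contained in this row's Comments
    comment = str(rowElement['Comments'])
    for _, df2_row in df2.iterrows():
        if (str(df2_row['ShipNumber']) in comment
                or str(df2_row['TrackNumber']) in comment):
            rowElement['Amount'] = df2_row['Amount']
            break  # keep the first match only
    return rowElement

df1['Amount'] = 0  # Fill with zeros; rows without a match keep 0
new_df = df1.apply(MergeFunction, axis=1)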
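
A vectorized variant of the same idea, as a rough sketch: extract the shipment number from each comment with a regular expression, then do an ordinary merge on it. The column names follow the snippet above; the sample rows are invented, and only `ShipNumber` is handled here for brevity.

```python
import re
import pandas as pd

# Hypothetical sample data using the column names from the snippet above.
df1 = pd.DataFrame({'Comments': ['paid for A100 today', 'see B200', 'no number here']})
df2 = pd.DataFrame({'ShipNumber': ['A100', 'B200'], 'Amount': [10.0, 25.5]})

# One regex alternation of all shipment numbers, escaped so they match literally.
pattern = '(' + '|'.join(re.escape(s) for s in df2['ShipNumber']) + ')'

# Pull the first shipment number found in each comment (missing when none is found).
df1['ShipNumber'] = df1['Comments'].str.extract(pattern, expand=False)

# An ordinary left merge on the extracted key; unmatched comments get NaN amounts.
result = df1.merge(df2, on='ShipNumber', how='left')
print(result)
```

If one shipment number can be a prefix of another, sorting the numbers longest first before joining them into the pattern avoids matching the shorter one by accident.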
