Building a new column in a pandas DataFrame by matching string values in a list

   Col1  Col2     SearchCol3   NewCol4
0  20    'May'    'abc(feb)'   'February'
1  30    'March'  'def | mar'  'March'
2  40    'June'   'ghi | feb'  'February'
3  50    'July'   'jkl(apr)'   'April'
4  60    'May'    'mno(mar)'   'March'
5  70    'March'  'abc'        'March'

3 answers

User

Answer 1 · edited 2024-10-01 11:28:01

Here is another approach:

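# Default to Col2, then overwrite rows whose SearchCol3 contains an abbreviation.
# strings and replacement are parallel lists of abbreviations and full month names.
data['NewCol4'] = data.Col2
for i, s in enumerate(strings):
    data.loc[data.SearchCol3.str.contains(s), 'NewCol4'] = replacement[i]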
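The loop above relies on a DataFrame and two lists that this answer does not define. A minimal self-contained sketch, assuming the sample data from the question and the same abbreviation and month-name lists used elsewhere in this thread, might look like this (pairing the lists with zip and passing regex=False are choices made for this sketch, not part of the original answer):

```python
import pandas as pd

# Sample data copied from the question.
data = pd.DataFrame({
    'Col1': [20, 30, 40, 50, 60, 70],
    'Col2': ['May', 'March', 'June', 'July', 'May', 'March'],
    'SearchCol3': ['abc(feb)', 'def | mar', 'ghi | feb',
                   'jkl(apr)', 'mno(mar)', 'abc'],
})

# Parallel lists: the abbreviation to search for and the value to write.
strings = ['jan', 'feb', 'mar', 'apr', 'may']
replacement = ['January', 'February', 'March', 'April', 'May']

# Start from Col2 so rows without a match keep their original month.
data['NewCol4'] = data['Col2']
for abbreviation, full_name in zip(strings, replacement):
    # regex=False treats the abbreviation as a literal substring.
    mask = data['SearchCol3'].str.contains(abbreviation, regex=False)
    data.loc[mask, 'NewCol4'] = full_name

print(data)
```

If a row contained more than one abbreviation, the last matching entry in the list would win, because later iterations overwrite earlier ones.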

User

Answer 2 · edited 2024-10-01 11:28:01

This worked for me and, on the plus side, it is very readable!

strings = ['jan', 'feb', 'mar', 'apr', 'may']
replacement = ['January', 'February', 'March', 'April', 'May']

def match_string(col3, col2):
    # if in col3, return that result. Else, lazy eval for col2. If neither, return empty string.
    k = ([replacement[strings.index(s)] for s in strings if s in col3]) or ([s for s in replacement if s in col2])
    return k[0] if k else ''

df['NewCol4'] = df.apply(lambda x: match_string(x['SearchCol3'], x['Col2']), axis=1)


User

Answer 3 · edited 2024-10-01 11:28:01

.str.extract accepts a regular expression.

http://pandas.pydata.org/pandas-docs/version/0.15.2/generated/pandas.core.strings.StringMethods.extract.html#pandas.core.strings.StringMethods.extract

import pandas as pd
df = pd.DataFrame(
        {'Col1': [20, 30, 40, 50, 60, 70],
        'Col2': ['May','March','June','July','May','March'],
        'SearchCol3': ['abc(feb)','def | mar','ghi | feb','jkl(apr)','mno(mar)','abc']})


a_regex = '(jan|feb|mar|apr|may)'
month_replacements = {'jan': 'January','feb': 'February',
            'mar': 'March','apr': 'April','may': 'May'}

# Extract the month abbreviation with the regex (expand=False returns a Series)
df['NewCol4'] = df['SearchCol3'].str.extract(a_regex, expand=False).fillna('')
# Look up the full month name in the dictionary
df['NewCol4'] = df['NewCol4'].apply(lambda x: month_replacements.get(x, ''))
# Fall back to the value from Col2 when no abbreviation was found
df['NewCol4'] = df.apply(lambda row: row['Col2'] if row['NewCol4'] == '' else row['NewCol4'], axis=1)

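The extract, look-up, and fall-back steps in this answer can also be chained into one expression. The following is a sketch under the assumption of a reasonably recent pandas (the expand parameter of str.extract is not in the 0.15.2 documentation linked above); building the pattern from the dictionary keys is an addition made here for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': [20, 30, 40, 50, 60, 70],
    'Col2': ['May', 'March', 'June', 'July', 'May', 'March'],
    'SearchCol3': ['abc(feb)', 'def | mar', 'ghi | feb',
                   'jkl(apr)', 'mno(mar)', 'abc'],
})

month_replacements = {'jan': 'January', 'feb': 'February',
                      'mar': 'March', 'apr': 'April', 'may': 'May'}

# Build the pattern from the dictionary keys so the two stay in sync.
pattern = '(' + '|'.join(month_replacements) + ')'

df['NewCol4'] = (
    df['SearchCol3']
    .str.extract(pattern, expand=False)  # abbreviation, or NaN if no match
    .map(month_replacements)             # abbreviation -> full month name
    .fillna(df['Col2'])                  # no match: fall back to Col2
)

print(df)
```

Because unmatched rows stay NaN until the final fillna, this version needs no empty-string sentinel and no row-wise apply.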
