Look up values and add them to new columns

Posted 2024-06-13 21:40:11


I have a CSV with 2 columns, and I need to build a lookup in pandas that adds columns based on certain values in each row. For example:

DIMENSION    ACCOUNT NAME
Tax          Tiger Beta
Config       Tiger Alpha
S3           Lion Alpha
Lambda       Tiger Cigna
Glacier      Beta - Lion

What I want to end up with:

DIMENSION    ACCOUNT NAME    ADDED_COLUMN1    ADDED_COLUMN2    ADDED_COLUMN3
Tax          Tiger Beta      Other            Tiger_Group      Beta
Config       Tiger Alpha     Other            Tiger_Group      Alpha
S3           Lion Alpha      VM               Lion_Group       Alpha
Lambda       Tiger Cigna     Other            Tiger_Group      Cigna
Glacier      Beta - Lion     VM               Lion_Group       Beta
Snowball     Monkey Alpha    Disk             Monkey_Group     Alpha

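Basically, I need to pull parts out of the account name and put them into new columns, and I need to fill ADDED_COLUMN1 based on DIMENSION (I decide which dimensions are VM, which are Disk, and which are Other).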

What I have so far:

import pandas as pd
import csv
import numpy as np

#turn the csv to a pandas dataframe
data_1 = pd.read_csv('data.csv')


data_1['ADDED_COLUMN1'] = np.where(
    data_1.DIMENSION.isin(['S3', 'Glacier']), 'VM', 'Other')
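
The np.where call above can only choose between two labels, so it has no way to produce Disk. One possible way to cover all three labels is a plain dict passed to Series.map, with fillna supplying the default. This is only a sketch: the dict name dimension_to_type is made up for illustration, the S3/Glacier entries come from the code above, the Snowball entry is inferred from the desired output, and the sample rows are typed in by hand instead of being read from data.csv.

```python
import pandas as pd

# Sample rows copied from the question, plus the Snowball row
# that appears only in the desired output.
data_1 = pd.DataFrame({
    "DIMENSION": ["Tax", "Config", "S3", "Lambda", "Glacier", "Snowball"],
    "ACCOUNT NAME": ["Tiger Beta", "Tiger Alpha", "Lion Alpha",
                     "Tiger Cigna", "Beta - Lion", "Monkey Alpha"],
})

# Hypothetical lookup table (an assumption, extend as needed):
# S3/Glacier -> VM is taken from the np.where call,
# Snowball -> Disk is taken from the desired output.
dimension_to_type = {"S3": "VM", "Glacier": "VM", "Snowball": "Disk"}

# Dimensions missing from the dict become NaN after map();
# fillna() then turns those into the default label "Other".
data_1["ADDED_COLUMN1"] = (
    data_1["DIMENSION"].map(dimension_to_type).fillna("Other")
)

print(data_1)
```

Keeping the labels in a dict means a new dimension is a one-line change, and the dict itself could be loaded from a second CSV if the lookup table grows.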

1 answer
Anonymous user
#1 · Posted 2024-06-13 21:40:11

You can do it like this:

import pandas as pd

df = pd.read_csv('data.csv')  # the same CSV as in the question

# Words to look for in ACCOUNT NAME
rank = {"Alpha", "Beta", "Cigna"}
Animal = {"Tiger", "Lion"}

def Lookup1(x):
    # Return the rank word(s) found in the account name
    df_words = set(x.split(' '))
    extract_words = rank.intersection(df_words)
    return ', '.join(sorted(extract_words))

def Lookup2(x):
    # Return the animal word(s) found in the account name
    df_words = set(x.split(' '))
    extract_words = Animal.intersection(df_words)
    return ', '.join(sorted(extract_words))

# Column names follow the desired output in the question:
# ADDED_COLUMN2 is the animal group, ADDED_COLUMN3 is the rank.
df['ADDED_COLUMN3'] = df['ACCOUNT NAME'].apply(Lookup1)
df['ADDED_COLUMN2'] = df['ACCOUNT NAME'].apply(Lookup2) + '_Group'

This returns:

  DIMENSION ACCOUNT NAME ADDED_COLUMN3 ADDED_COLUMN2
0       Tax   Tiger Beta          Beta   Tiger_Group
1    Config  Tiger Alpha         Alpha   Tiger_Group
2        S3   Lion Alpha         Alpha    Lion_Group
3    Lambda  Tiger Cigna         Cigna   Tiger_Group
4   Glacier  Beta - Lion          Beta    Lion_Group
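
As an alternative to the apply-based lookups, the same words can probably be pulled out with the vectorized Series.str.extract. The following is a rough sketch, not the answerer's method: the list names animals and ranks and the regex patterns are assumptions made for illustration, the columns are named after the question's desired output (ADDED_COLUMN2 for the group, ADDED_COLUMN3 for the rank), and the sample frame is built inline instead of being read from the CSV.

```python
import pandas as pd

# Sample account names copied from the question.
df = pd.DataFrame({
    "ACCOUNT NAME": ["Tiger Beta", "Tiger Alpha", "Lion Alpha",
                     "Tiger Cigna", "Beta - Lion"],
})

# Hypothetical word lists; extend them to match the real data.
animals = ["Tiger", "Lion"]
ranks = ["Alpha", "Beta", "Cigna"]

# Build whole-word patterns such as r"\b(Tiger|Lion)\b".
animal_pattern = r"\b(" + "|".join(animals) + r")\b"
rank_pattern = r"\b(" + "|".join(ranks) + r")\b"

# With a single capture group, expand=False returns a Series holding the
# first match in each row (NaN where nothing matches).
df["ADDED_COLUMN2"] = (
    df["ACCOUNT NAME"].str.extract(animal_pattern, expand=False) + "_Group"
)
df["ADDED_COLUMN3"] = df["ACCOUNT NAME"].str.extract(rank_pattern, expand=False)

print(df)
```

Unlike the set-intersection version, this keeps only the first matching word per row and leaves NaN where no word matches, which may or may not be the behavior you want.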
