Matching lists to a DataFrame

Posted 2024-09-30 16:40:47


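I have a DataFrame with an Age column. I want to match each age to a group: Baby = 0-2, Child = 3-12, Young = 13-18, Young Adult = 19-30, Adult = 31-50, Senior Adult = 51-65.

I created lists that define these age groups, e.g. Adult = list(range(31, 51)) and so on. How can I create a new column that holds the name of the list (such as "Adult") that each age belongs to?

Small input: the DataFrame has three columns: df['Name'], df['Country'], df['Age']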

Name    Country  Age
Anthony France   15
Albert  Belgium  54
.
.
.
Zahra   Tunisia  14

So I need to match the Age column against the lists I already have. The output should look like this:

Name    Country  Age  Group
Anthony France   15   Young
Albert  Belgium  54   Adult
.
.
.
Zahra   Tunisia  14   Young

Thanks


3 answers

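To make this clearer for beginners: you can define a function that returns the age group for each person, then use df.apply with axis=1 to call that function on every row and store the result in the 'Group' column: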

import pandas as pd

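def age(row):
    # Return the age-group label for one row of the DataFrame.
    a = row['Age']
    if 0 <= a <= 2:  # 0 is included, since the question defines Baby as 0-2
        return 'Baby'
    elif 2 < a <= 12:
        return 'Child'
    elif 12 < a <= 18:
        return 'Young'
    elif 18 < a <= 30:
        return 'Young Adult'
    elif 30 < a <= 50:
        return 'Adult'
    elif 50 < a <= 65:
        return 'Senior Adult'
    return None  # ages outside 0-65 belong to no group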

df = pd.DataFrame({'Name':['Anthony','Albert','Zahra'],
                   'Country':['France','Belgium','Tunisia'],
                   'Age':[15,54,14]})

df['Group'] = df.apply(age, axis=1)

print(df)

Output:

      Name  Country  Age         Group
0  Anthony   France   15         Young
1   Albert  Belgium   54  Senior Adult
2    Zahra  Tunisia   14         Young

Here is one way to do this with pd.cut:

import numpy as np
import pandas as pd

df = pd.DataFrame({"person_id": range(25), "age": np.random.randint(0, 100, 25)})
print(df.head(10))
==>
   person_id  age
0          0   30
1          1   42
2          2   78
3          3    2
4          4   44
5          5   43
6          6   92
7          7    3
8          8   13
9          9   76

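# include_lowest=True puts age 0 in the first bin; otherwise it falls outside (0, 18] and becomes NaN
df["group"] = pd.cut(df.age, [0, 18, 50, 100], labels=["child", "adult", "senior"], include_lowest=True)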
print(df.head(10))
==>
   person_id  age   group
0          0   30   adult
1          1   42   adult
2          2   78  senior
3          3    2   child
4          4   44   adult
5          5   43   adult
6          6   92  senior
7          7    3   child
8          8   13   child
9          9   76  senior
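For the six groups in the question, the same idea might look like the sketch below. The bin edges are my reading of the ranges given in the question (0-2, 3-12, and so on), and -1 is used as the lowest edge only so that age 0 lands in the first right-closed bin; treat both as assumptions rather than the only way to set this up.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Anthony', 'Albert', 'Zahra'],
                   'Country': ['France', 'Belgium', 'Tunisia'],
                   'Age': [15, 54, 14]})

# Right-closed bins: (-1, 2], (2, 12], (12, 18], (18, 30], (30, 50], (50, 65]
bins = [-1, 2, 12, 18, 30, 50, 65]
labels = ['Baby', 'Child', 'Young', 'Young Adult', 'Adult', 'Senior Adult']

# Ages outside every bin (above 65 here) become NaN
df['Group'] = pd.cut(df['Age'], bins=bins, labels=labels)
print(df)
```

pd.cut returns a categorical column, so the labels keep their order from Baby to Senior Adult, which can be handy for sorting or grouping later.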

Going by your question, if you already have lists (like the ones below) and want to turn them into bins, you can do the following:

# for example, these are the lists
Adult = list(range(18,50))
Child = list(range(0, 18))
Senior = list(range(50, 100))

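# Build the bin edges from the lists: the lower bound of each list,
# plus one past the overall maximum so the last bin covers its largest age.
groups = [Child, Adult, Senior]
bins = [min(lst) for lst in groups]
bins.append(max(max(lst) for lst in groups) + 1)
labels = ["Child", "Adult", "Senior"]

# right=False makes every bin [low, high), which matches range(low, high) exactly
df["group"] = pd.cut(df.age, bins, labels=labels, right=False)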
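If you would rather use the lists themselves instead of converting them to bins, one option is to flatten them into an age-to-name lookup and pass it to Series.map. This is only a sketch: the `groups` dictionary and the `age_to_group` name are my own, and the ranges are taken from the groups listed in the question.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Anthony', 'Albert', 'Zahra'],
                   'Country': ['France', 'Belgium', 'Tunisia'],
                   'Age': [15, 54, 14]})

# Group name -> list of ages, built the same way as Adult = list(range(31, 51))
groups = {
    'Baby': list(range(0, 3)),
    'Child': list(range(3, 13)),
    'Young': list(range(13, 19)),
    'Young Adult': list(range(19, 31)),
    'Adult': list(range(31, 51)),
    'Senior Adult': list(range(51, 66)),
}

# Invert to a flat lookup: each individual age -> name of the list containing it
age_to_group = {age: name for name, ages in groups.items() for age in ages}

# Ages that appear in none of the lists map to NaN
df['Group'] = df['Age'].map(age_to_group)
print(df)
```

This only works for integer ages, because the lookup holds whole numbers; for fractional ages the pd.cut approach is the safer choice.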

IIUC, I would go with np.select:

import pandas as pd
import numpy as np
df = pd.DataFrame({'Age': [3, 20, 40]})
condlist = [df.Age.between(0,2),
            df.Age.between(3,12),
            df.Age.between(13,18),
            df.Age.between(19,30),
            df.Age.between(31,50),
            df.Age.between(51,65)]

choicelist = ['Baby', 'Child', 'Young',
           'Young Adult', 'Adult', 'Senior Adult']

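# An explicit string default labels ages that match no condition and keeps the result
# dtype consistent (newer NumPy versions may reject the implicit default of 0 alongside string choices)
df['Group'] = np.select(condlist, choicelist, default='Unknown')

Output:

   Age        Group
0    3        Child
1   20  Young Adult
2   40        Adult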
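To see what happens to an age that falls outside every condition, here is a small variation with one extra row (age 70). The 'Unknown' label and the 'Group' column name are placeholders I picked for the sketch, not something from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [3, 20, 40, 70]})

# Series.between is inclusive on both ends by default
condlist = [df['Age'].between(0, 2),
            df['Age'].between(3, 12),
            df['Age'].between(13, 18),
            df['Age'].between(19, 30),
            df['Age'].between(31, 50),
            df['Age'].between(51, 65)]

choicelist = ['Baby', 'Child', 'Young',
              'Young Adult', 'Adult', 'Senior Adult']

# Rows where no condition is True receive the default label
df['Group'] = np.select(condlist, choicelist, default='Unknown')
print(df)
```

np.select takes the first matching condition for each row, so the order of condlist only matters if the ranges overlap, which they do not here.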
