Pandas: selecting rows based on a function of a column

Posted 2024-09-30 16:20:16


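I am learning pandas. I found several examples of how to construct a pandas DataFrame and how to add columns, and they work well. I want to learn how to select all rows based on the values of a column. I also found multiple examples that show how to select rows when a column's value should be smaller or larger than some number, and that works too. My question is how to make a more general selection: I want to first compute a function of a column and then select all rows for which the function value is larger or smaller than some number.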

import names
import numpy as np
import pandas as pd
from datetime import date
import random

def randomBirthday(startyear, endyear):
    T1 = date.today().replace(day=1, month=1, year=startyear).toordinal()
    T2 = date.today().replace(day=1, month=1, year=endyear).toordinal()
    return date.fromordinal(random.randint(T1, T2))

def age(birthday):
    today = date.today()
    return today.year - birthday.year - ((today.month, today.day) < (birthday.month, birthday.day))

N_PEOPLE = 20
dict_people = { }
dict_people['gender'] = np.array(['male','female'])[np.random.randint(0, 2, N_PEOPLE)]
dict_people['names'] = [names.get_full_name(gender=g) for g in dict_people['gender']]

peopleFrame = pd.DataFrame(dict_people)

# Example 1: Add new columns to the data frame
peopleFrame['birthday'] = [randomBirthday(1920, 2020) for i in range(N_PEOPLE)]

# Example 2: Select all people with a certain age
peopleFrame.loc[age(peopleFrame['birthday']) >= 20]

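This code works except for the last line. Please suggest the correct way to write that line. I have considered adding an extra column holding the values of the function age and then selecting on that column. That would work, but I would like to know whether I have to do it that way. What if I do not want to store a person's age and only need it for the selection?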
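One possible way to write the last line, as a sketch rather than a definitive answer: age is written for a single date, while peopleFrame['birthday'] is a whole Series, so the function can be applied element by element with Series.apply and the resulting boolean mask passed to .loc. Nothing is stored in the frame. The small frame, the names in it, and the fixed reference date ref below are assumptions made only so the example is self-contained and reproducible (the names package is not needed here).

```python
from datetime import date

import pandas as pd


def age(birthday, today=None):
    """Age in whole years for a single datetime.date."""
    today = today or date.today()
    return today.year - birthday.year - (
        (today.month, today.day) < (birthday.month, birthday.day)
    )


# Hand-made stand-in for peopleFrame (hypothetical names and birthdays).
people = pd.DataFrame({
    "names": ["Ann", "Bob", "Cid"],
    "birthday": [date(1950, 5, 17), date(2015, 1, 1), date(1990, 12, 31)],
})

# Fixed reference date (an assumption) so the selection does not depend
# on the day the code is run; drop it to use date.today() instead.
ref = date(2024, 9, 30)

# apply() calls age() once per birthday and returns a Series of ints.
# Comparing that Series with 20 gives a boolean mask for .loc.
mask = people["birthday"].apply(lambda b: age(b, ref)) >= 20
adults = people.loc[mask]

print(adults)
```

With the original frame the same idea would read peopleFrame.loc[peopleFrame['birthday'].apply(age) >= 20]. The computed ages exist only in the temporary mask, so no extra column is added.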

