How to convert a pandas DataFrame into a list of multiple NamedTuples

Posted 2024-09-28 03:22:57


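I am writing code that needs to map several NamedTuples into a list. The code sample is below. My main question is about mapping the pair of NamedTuples PeopleName and PeopleAge into a List; I am not clear on how to do this. It should happen in two steps: 1/ extract the whole row into a generic NamedTuple, then 2/ split that record into the separate NamedTuples PeopleName and PeopleAge.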

from typing import NamedTuple, List

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])

# The code below is not correct
Demography = NamedTuple(
    "Demography", [("names", List[(PeopleName, PeopleAge)]), ("postalcodes", PeoplePC)],
)


def to_nested_tuple(k, g):
    peoples = list(
        g["Name"].to_frame().itertuples(name="Person", index=False),
        # rec["Age"].to_frame().itertuples(name="PeopleAge", index=False),
    )
    return Demography(peoples, PeoplePC(k))


d = [to_nested_tuple(*item) for item in people.groupby("PostalCode")]

print(d)

2 Answers

Use list(df.itertuples()), where df is your DataFrame.
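As a minimal sketch of that suggestion, using the DataFrame from the question: itertuples yields one namedtuple per row, and its index and name parameters control whether the index is included and what the tuple class is called. The class name Person is only an illustrative choice.

```python
import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# index=False leaves the DataFrame index out of each tuple;
# name="Person" sets the class name of the generated namedtuple.
rows = list(people.itertuples(index=False, name="Person"))

first = rows[0]
print(first)
# Fields are accessible by column name.
print(first.Name, first.Age, first.PostalCode)
```

This gives a flat list of whole-row records; it does not yet group by postal code or split a row into separate NamedTuples.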

The annotation List[(PeopleName, PeopleAge)] raises TypeError: Too many parameters for typing.List; actual 2, expected 1.

A tuple that holds two different types should itself be annotated with typing.Tuple:

List[Tuple[PeopleName, PeopleAge]]

However, when annotating parameters it is better to use an abstract collection type such as Sequence or Iterable:

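# Sequence and Tuple also need to be imported from typing.
from typing import NamedTuple, Sequence, Tuple

Demography = NamedTuple(
    "Demography", [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)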
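The difference between the two annotations can be checked directly. The sketch below catches the TypeError from the original annotation (the exact message wording varies between Python versions) and then builds a Demography with the corrected one. The variable names list_error and demo are just for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])

# Subscripting typing.List with a 2-tuple passes two type arguments,
# but List accepts exactly one, so this raises TypeError.
try:
    List[(PeopleName, PeopleAge)]
    list_error = None
except TypeError as exc:
    list_error = exc
print(repr(list_error))

# Wrapping the pair in Tuple gives the sequence a single element type.
Demography = NamedTuple(
    "Demography",
    [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)

demo = Demography([(PeopleName("tom"), PeopleAge(10))], PeoplePC("ab 11"))
print(demo)
```

Note that these annotations are not enforced at runtime; they only matter to type checkers and readers.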

Rather than applying to_nested_tuple to each group, I would build the result directly, like this:

d = [Demography([(PeopleName(row['Name']), PeopleAge(row['Age'])) for _, row in group.iterrows()], PeoplePC(k))
     for k, group in people.groupby("PostalCode")] 

The result now prints as:

[Demography(names=[(PeopleName(Name='tom'), PeopleAge(Age=10)), (PeopleName(Name='juli'), PeopleAge(Age=14))], postalcodes=PeoplePC(PostalCode='ab 11')),
 Demography(names=[(PeopleName(Name='nick'), PeopleAge(Age=15))], postalcodes=PeoplePC(PostalCode='ab 22'))]
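For the two-step approach described in the question, here is one possible self-contained sketch. It uses itertuples for step 1 (whole row into a generic namedtuple) and a list comprehension for step 2 (splitting into PeopleName and PeopleAge). The helper name to_demography and the tuple class name Record are assumptions made for this example, not part of the original answer.

```python
from typing import NamedTuple, Sequence, Tuple

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple(
    "Demography",
    [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)


def to_demography(postal_code, group):
    # Step 1: turn each row of the group into one generic namedtuple.
    records = group.itertuples(index=False, name="Record")
    # Step 2: split each record into the two specific NamedTuples.
    names = [(PeopleName(r.Name), PeopleAge(r.Age)) for r in records]
    return Demography(names, PeoplePC(postal_code))


# groupby on a single column name yields (key, sub-DataFrame) pairs.
d = [to_demography(k, g) for k, g in people.groupby("PostalCode")]
print(d)
```

Compared with iterrows, itertuples avoids building a Series for every row, which tends to matter on larger frames.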
