How to convert a DataFrame into nested named tuples with aggregation

Posted 2024-09-28 03:22:56

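I am looking for a way to build nested named tuples from a DataFrame. The object d below is the expected output. I am not sure whether the aggregation has to be done directly in pandas, with the conversion to NamedTuple done afterwards.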

from typing import NamedTuple
from typing import List
import pandas as pd

if __name__ == "__main__":
    data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
    People = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

    names = list(People[["Name"]].itertuples(name="Names", index=False))
    postal_codes = list(
        People[["PostalCode"]].itertuples(name="PostalCode", index=False)
    )

    # ...
    # ... The code below builds the expected output; the names of the NamedTuple classes don't matter

    PeopleName = NamedTuple("PeopleName", [("Name", str)])
    PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
    Demography = NamedTuple(
        "Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)]
    )

    d = [
        Demography(
            [PeopleName(Name="tom"), PeopleName(Name="juli")],
            PeoplePC(PostalCode="ab 11"),
        ),
        Demography([PeopleName(Name="nick")], PeoplePC(PostalCode="ab 22"),),
    ]

1 answer
User
#1 · Posted 2024-09-28 03:22:56

You can use groupby and then apply a function (to_nested_tuple) to each group:

from typing import NamedTuple, List

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple("Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)])


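def to_nested_tuple(k, g):
    # k is the group key (the postal code); g is the sub-DataFrame for that key.
    peoples = list(g['Name'].to_frame().itertuples(name='Person', index=False))
    return Demography(peoples, PeoplePC(k))


# Iterating over a groupby yields (key, group) pairs, sorted by key by default.
d = [to_nested_tuple(*item) for item in people.groupby('PostalCode')]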

print(d)

Output

[Demography(names=[Person(Name='tom'), Person(Name='juli')], postalcodes=PeoplePC(PostalCode='ab 11')), Demography(names=[Person(Name='nick')], postalcodes=PeoplePC(PostalCode='ab 22'))]
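
On the question of whether the aggregation can be done in pandas first and the conversion afterwards: one possible variant is sketched below. It is an alternative, not the approach used above. It assumes the class-based NamedTuple syntax is acceptable, and names_by_code is a name introduced here only for illustration. Because the inner tuples are built explicitly, they are PeopleName instances rather than the Person tuples generated by itertuples.

```python
from typing import List, NamedTuple

import pandas as pd


class PeopleName(NamedTuple):
    Name: str


class PeoplePC(NamedTuple):
    PostalCode: str


class Demography(NamedTuple):
    names: List[PeopleName]
    postalcodes: PeoplePC


data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# Step 1: aggregate in pandas, collecting one list of names per postal code.
# groupby sorts the group keys by default, so "ab 11" comes before "ab 22".
names_by_code = people.groupby("PostalCode")["Name"].agg(list)

# Step 2: convert each (postal code, names) pair into the nested named tuples.
d = [
    Demography([PeopleName(n) for n in names], PeoplePC(code))
    for code, names in names_by_code.items()
]

print(d)
```

Keeping the two steps separate means the intermediate Series can be inspected, or aggregated differently, before any tuples are created.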
