Creating a DataFrame in Python that contains every combination of values from three lists

Posted 2024-09-30 08:24:12

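I have two lists, gender = ['Male', 'Female'] and subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark', 'Math9_Exam_Mark', 'ELA3_Exam_Mark', 'ELA6_Exam_Mark', 'ELA9_Exam_Mark'], plus an ndarray, birthMonthYear, that holds a list of dates extracted from a CSV file.

I want to create a new DataFrame with three columns: gender, subject, and birthMonthYear. There should be one row for every combination of gender, subject, and birth month.

Is there a simple way to do this, perhaps with pandas? I could write nested for loops that iterate over each list to build the DataFrame, but if there is an easier way I would like to try it.

Thanks for your help!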


1 answer
User
#1 · Posted 2024-09-30 08:24:12

Setup

import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark', 'Math9_Exam_Mark',
           'ELA3_Exam_Mark', 'ELA6_Exam_Mark', 'ELA9_Exam_Mark']
# Two month-end dates as sample data; pandas 2.2+ prefers the alias 'ME' over 'M'
birthMonthYear = pd.date_range('2010-01-31', periods=2, freq='M')

Option 1
itertools.product

from itertools import product

pd.DataFrame(
    list(product(gender, subject, birthMonthYear)),
    columns=['Gender', 'Subject', 'BirthMonthYear']
)

    Gender          Subject BirthMonthYear
0     Male  Math3_Exam_Mark     2010-01-31
1     Male  Math3_Exam_Mark     2010-02-28
2     Male  Math6_Exam_Mark     2010-01-31
3     Male  Math6_Exam_Mark     2010-02-28
4     Male  Math9_Exam_Mark     2010-01-31
5     Male  Math9_Exam_Mark     2010-02-28
6     Male   ELA3_Exam_Mark     2010-01-31
7     Male   ELA3_Exam_Mark     2010-02-28
8     Male   ELA6_Exam_Mark     2010-01-31
9     Male   ELA6_Exam_Mark     2010-02-28
10    Male   ELA9_Exam_Mark     2010-01-31
11    Male   ELA9_Exam_Mark     2010-02-28
12  Female  Math3_Exam_Mark     2010-01-31
13  Female  Math3_Exam_Mark     2010-02-28
14  Female  Math6_Exam_Mark     2010-01-31
15  Female  Math6_Exam_Mark     2010-02-28
16  Female  Math9_Exam_Mark     2010-01-31
17  Female  Math9_Exam_Mark     2010-02-28
18  Female   ELA3_Exam_Mark     2010-01-31
19  Female   ELA3_Exam_Mark     2010-02-28
20  Female   ELA6_Exam_Mark     2010-01-31
21  Female   ELA6_Exam_Mark     2010-02-28
22  Female   ELA9_Exam_Mark     2010-01-31
23  Female   ELA9_Exam_Mark     2010-02-28
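
For comparison, here is a minimal sketch of how `itertools.product` relates to the nested loops mentioned in the question. It yields the same tuples in the same order, with the rightmost iterable varying fastest. The shortened `subject` list, the explicit dates, and the names `birth_month_year`, `rows_loops`, and `rows_product` are assumptions made for illustration only.

```python
from itertools import product

import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark']  # shortened for illustration
birth_month_year = pd.to_datetime(['2010-01-31', '2010-02-28'])  # sample dates

# Nested loops, as described in the question: the innermost loop varies fastest.
rows_loops = []
for g in gender:
    for s in subject:
        for d in birth_month_year:
            rows_loops.append((g, s, d))

# itertools.product generates the same tuples without the explicit nesting.
rows_product = list(product(gender, subject, birth_month_year))

columns = ['Gender', 'Subject', 'BirthMonthYear']
df_loops = pd.DataFrame(rows_loops, columns=columns)
df_product = pd.DataFrame(rows_product, columns=columns)
```

Both frames hold 2 × 2 × 2 = 8 rows here; the `product` version simply avoids writing one loop per list.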

Option 2
pd.MultiIndex.from_product

idx = pd.MultiIndex.from_product(
    [gender, subject, birthMonthYear],
    names=['Gender', 'Subject', 'BirthMonthYear']
)

pd.DataFrame(index=idx).reset_index()

    Gender          Subject BirthMonthYear
0     Male  Math3_Exam_Mark     2010-01-31
1     Male  Math3_Exam_Mark     2010-02-28
2     Male  Math6_Exam_Mark     2010-01-31
3     Male  Math6_Exam_Mark     2010-02-28
4     Male  Math9_Exam_Mark     2010-01-31
5     Male  Math9_Exam_Mark     2010-02-28
6     Male   ELA3_Exam_Mark     2010-01-31
7     Male   ELA3_Exam_Mark     2010-02-28
8     Male   ELA6_Exam_Mark     2010-01-31
9     Male   ELA6_Exam_Mark     2010-02-28
10    Male   ELA9_Exam_Mark     2010-01-31
11    Male   ELA9_Exam_Mark     2010-02-28
12  Female  Math3_Exam_Mark     2010-01-31
13  Female  Math3_Exam_Mark     2010-02-28
14  Female  Math6_Exam_Mark     2010-01-31
15  Female  Math6_Exam_Mark     2010-02-28
16  Female  Math9_Exam_Mark     2010-01-31
17  Female  Math9_Exam_Mark     2010-02-28
18  Female   ELA3_Exam_Mark     2010-01-31
19  Female   ELA3_Exam_Mark     2010-02-28
20  Female   ELA6_Exam_Mark     2010-01-31
21  Female   ELA6_Exam_Mark     2010-02-28
22  Female   ELA9_Exam_Mark     2010-01-31
23  Female   ELA9_Exam_Mark     2010-02-28
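
As a sanity check, the sketch below builds the frame both ways and confirms that the two options give the same rows, and that the row count equals the product of the three input lengths. The helper `as_tuples`, the explicit sample dates, and the variable names are hypothetical and not part of the original answer.

```python
from itertools import product

import pandas as pd

gender = ['Male', 'Female']
subject = ['Math3_Exam_Mark', 'Math6_Exam_Mark', 'Math9_Exam_Mark',
           'ELA3_Exam_Mark', 'ELA6_Exam_Mark', 'ELA9_Exam_Mark']
birth_month_year = pd.to_datetime(['2010-01-31', '2010-02-28'])  # sample dates
names = ['Gender', 'Subject', 'BirthMonthYear']

# Option 1: materialise the Cartesian product as a list of tuples.
df_product = pd.DataFrame(
    list(product(gender, subject, birth_month_year)),
    columns=names,
)

# Option 2: let pandas build the product as a MultiIndex, then flatten it.
idx = pd.MultiIndex.from_product([gender, subject, birth_month_year], names=names)
df_index = pd.DataFrame(index=idx).reset_index()


def as_tuples(df):
    """Return the rows of df as plain tuples (hypothetical helper)."""
    return list(df.itertuples(index=False, name=None))


expected_rows = len(gender) * len(subject) * len(birth_month_year)
same_rows = as_tuples(df_product) == as_tuples(df_index)
```

Comparing plain tuples rather than calling `DataFrame.equals` keeps the check independent of any dtype differences between the two construction paths.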
