Replicating a pandas DataFrame over a range of values

Published 2024-09-29 02:24:06


I have a DataFrame:

speciality_id   speciality_name
1               Acupuncturist
2               Andrologist
3               Anaesthesiologist
4               Audiologist
5               Ayurvedic Doctor
6               Biochemist
7               Biophysicist

I want to replicate the DataFrame above once for each combination of year and month in a given range.

For example:

year = [2018]
Month = [1,2]

I want to produce a DataFrame like this:

Year    Month   speciality_id   speciality_name
2018    1       1               Acupuncturist
2018    1       2               Andrologist
2018    1       3               Anaesthesiologist
2018    1       4               Audiologist
2018    1       5               Ayurvedic Doctor
2018    1       6               Biochemist
2018    1       7               Biophysicist

2018    2       1               Acupuncturist
2018    2       2               Andrologist
2018    2       3               Anaesthesiologist
2018    2       4               Audiologist
2018    2       5               Ayurvedic Doctor
2018    2       6               Biochemist
2018    2       7               Biophysicist

I can't figure out a way to do this. What is the right approach?


3 answers

You can compute the Cartesian product with pd.MultiIndex.from_product, then join it with a tiled DataFrame:

import numpy as np
import pandas as pd

year = [2018]
month = [1, 2]

# Cartesian product of years and months as a MultiIndex
cart_prod = pd.MultiIndex.from_product([year, month], names=['year', 'month'])

# tile the rows once per (year, month) pair, then set each pair, repeated
# once per row of df, as the index and move it back into columns
res = df.loc[np.tile(df.index, len(year) * len(month))]\
        .set_index(cart_prod.repeat(df.shape[0])).reset_index()

print(res)

    year  month  speciality_id    speciality_name
0   2018      1              1      Acupuncturist
1   2018      1              2        Andrologist
2   2018      1              3  Anaesthesiologist
3   2018      1              4        Audiologist
4   2018      1              5   Ayurvedic Doctor
5   2018      1              6         Biochemist
6   2018      1              7       Biophysicist
7   2018      2              1      Acupuncturist
8   2018      2              2        Andrologist
9   2018      2              3  Anaesthesiologist
10  2018      2              4        Audiologist
11  2018      2              5   Ayurvedic Doctor
12  2018      2              6         Biochemist
13  2018      2              7       Biophysicist
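To see why the tiled rows and the repeated index line up, here is a minimal sketch on a two-row stand-in frame (the shortened data is an assumption made for illustration, not part of the answer). np.tile cycles through the row labels, while MultiIndex.repeat holds each (year, month) pair for as many consecutive rows as the frame has.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row stand-in for the question's DataFrame
df = pd.DataFrame({'speciality_id': [1, 2],
                   'speciality_name': ['Acupuncturist', 'Andrologist']})
year = [2018]
month = [1, 2]

cart_prod = pd.MultiIndex.from_product([year, month], names=['year', 'month'])

# tile: the whole sequence of row labels, once per (year, month) pair
tiled_labels = np.tile(df.index, len(cart_prod))

# repeat: each (year, month) pair, once per row of df
repeated_pairs = cart_prod.repeat(len(df))

print(tiled_labels.tolist())
print(repeated_pairs.tolist())
```

Both sequences have len(df) * len(cart_prod) entries, so position i of the tiled rows always receives the pair at position i of the repeated index.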

Use itertools.product to build all combinations, create a DataFrame from them, then left-join the original with merge:

from itertools import product

import pandas as pd

year = [2018]
Month = [1, 2]

# one row per (Year, Month, speciality_id) combination
df1 = pd.DataFrame(list(product(year, Month, df['speciality_id'])),
                   columns=['Year', 'Month', 'speciality_id'])
print(df1)
    Year  Month  speciality_id
0   2018      1              1
1   2018      1              2
2   2018      1              3
3   2018      1              4
4   2018      1              5
5   2018      1              6
6   2018      1              7
7   2018      2              1
8   2018      2              2
9   2018      2              3
10  2018      2              4
11  2018      2              5
12  2018      2              6
13  2018      2              7

df = df1.merge(df, on='speciality_id', how='left')
print(df)
    Year  Month  speciality_id    speciality_name
0   2018      1              1      Acupuncturist
1   2018      1              2        Andrologist
2   2018      1              3  Anaesthesiologist
3   2018      1              4        Audiologist
4   2018      1              5   Ayurvedic Doctor
5   2018      1              6         Biochemist
6   2018      1              7       Biophysicist
7   2018      2              1      Acupuncturist
8   2018      2              2        Andrologist
9   2018      2              3  Anaesthesiologist
10  2018      2              4        Audiologist
11  2018      2              5   Ayurvedic Doctor
12  2018      2              6         Biochemist
13  2018      2              7       Biophysicist
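A possible variation, not from the answer itself: if your pandas is version 1.2 or newer, merge accepts how='cross', so the product only needs to cover year and month and no join key is required. The sketch below rebuilds the question's frame so that it runs on its own; the name periods is chosen here for illustration.

```python
from itertools import product

import pandas as pd

# The question's DataFrame, rebuilt so the example is self-contained
df = pd.DataFrame({
    'speciality_id': [1, 2, 3, 4, 5, 6, 7],
    'speciality_name': ['Acupuncturist', 'Andrologist', 'Anaesthesiologist',
                        'Audiologist', 'Ayurvedic Doctor', 'Biochemist',
                        'Biophysicist'],
})
year = [2018]
Month = [1, 2]

# One row per (Year, Month) combination; 'periods' is an illustrative name
periods = pd.DataFrame(list(product(year, Month)), columns=['Year', 'Month'])

# how='cross' pairs every period row with every speciality row
# (assumes pandas >= 1.2, where this option was introduced)
res = periods.merge(df, how='cross')
print(res)
```

This leaves the original df untouched, whereas the answer's last step rebinds df to the merged result.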

I hope this helps.

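import pandas as pd

# Create the new columns
df['Year'], df['Month'] = 2018, None

# Create the two new DataFrames
df1 = df.copy()
df2 = df.copy()

# Edit the month in both DataFrames
df1['Month'], df2['Month'] = 1, 2

# Stack the two copies and put the columns in the requested order
res = pd.concat([df1, df2], ignore_index=True)
res = res[['Year', 'Month', 'speciality_id', 'speciality_name']]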
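The copy-and-assign idea can be extended to any number of years and months by building one copy per combination and stacking them. The following is a sketch under that assumption rather than the answerer's own code; the names frames and res are chosen here for illustration.

```python
from itertools import product

import pandas as pd

# The question's DataFrame, rebuilt so the example is self-contained
df = pd.DataFrame({
    'speciality_id': [1, 2, 3, 4, 5, 6, 7],
    'speciality_name': ['Acupuncturist', 'Andrologist', 'Anaesthesiologist',
                        'Audiologist', 'Ayurvedic Doctor', 'Biochemist',
                        'Biophysicist'],
})
year = [2018]
month = [1, 2]

# assign returns a new frame each time, so df itself is not modified
frames = [df.assign(Year=y, Month=m) for y, m in product(year, month)]

# Stack the copies and move Year and Month to the front
res = pd.concat(frames, ignore_index=True)
res = res[['Year', 'Month', 'speciality_id', 'speciality_name']]
print(res)
```

For large inputs this makes one copy per combination, so the index-based or merge-based approaches may scale better.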
