Formatting a DataFrame into a single column

Posted 2024-09-29 09:26:43


I am currently working with a DataFrame built from a dictionary, and I would like to reformat it. The dictionary looks like this:

transactionDetails = {"paymentStatus":["COMPLETED", "REFUNDED", "COMPLETED"],
                  "address":["123 Fake Street", "123 Example Street", "123 Top Secret"],
                  "item":["Apple", "Banana", "Orange"],
                  "transactionID":["2132123", "54654645", "56754646"],
                  "orderTime":["14:55", "15:10", "23:11"],
                  "email":["example@example.com", "fake@example.com", "notreal@notreal.com"],
                  "refundNotes":[],
                  "notes": []}

The dictionary is written to a DataFrame as follows:

import pandas as pd
df = pd.DataFrame.from_dict(transactionDetails, orient='index')

The resulting DataFrame currently looks like this:

                                     0                   1                    2
paymentStatus            COMPLETED            REFUNDED            COMPLETED
address            123 Fake Street  123 Example Street       123 Top Secret
item                         Apple              Banana               Orange
transactionID              2132123            54654645             56754646
orderTime                    14:55               15:10                23:11
email          example@example.com    fake@example.com  notreal@notreal.com
refundNotes                   None                None                 None
notes                         None                None                 None

I would like to display the data vertically, one record after another, like this:

paymentStatus              COMPLETED
address              123 Fake Street
item                           Apple
transactionID                2132123
orderTime                      14:55
email            example@example.com
refundNotes                     None
notes                           None

paymentStatus               REFUNDED
address           123 Example Street
item                          Banana
transactionID               54654645 
orderTime                      15:10
email               fake@example.com
refundNotes                     None
notes                           None

etc

PS: I tried using .stack(), but it produces the following output, which is not what I want:

paymentStatus  0              COMPLETED
               1               REFUNDED
               2              COMPLETED
address        0        123 Fake Street
               1     123 Example Street
               2         123 Top Secret
item           0                  Apple
               1                 Banana
               2                 Orange
transactionID  0                2132123
               1               54654645
               2               56754646
orderTime      0                  14:55
               1                  15:10
               2                  23:11
email          0    example@example.com
               1       fake@example.com
               2    notreal@notreal.com

Thanks


3 answers

Option 1
unstack + reset_index

df.unstack().reset_index(level=0, drop=True)

paymentStatus              COMPLETED
address              123 Fake Street
item                           Apple
transactionID                2132123
orderTime                      14:55
email            example@example.com
refundNotes                     None
notes                           None
paymentStatus               REFUNDED
address           123 Example Street
item                          Banana
transactionID               54654645
orderTime                      15:10
email               fake@example.com
refundNotes                     None
notes                           None
paymentStatus              COMPLETED
address               123 Top Secret
item                          Orange
transactionID               56754646
orderTime                      23:11
email            notreal@notreal.com
refundNotes                     None
notes                           None
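
To make the two steps of Option 1 easier to follow, here is a minimal sketch that runs them separately. It assumes a shortened version of the question's dictionary (three fields instead of eight), and the names `transaction_details`, `stacked` and `flat` are illustrative only.

```python
import pandas as pd

# Shortened version of the dictionary from the question (fewer fields, same shape).
transaction_details = {
    "paymentStatus": ["COMPLETED", "REFUNDED", "COMPLETED"],
    "item": ["Apple", "Banana", "Orange"],
    "notes": [],  # empty list: becomes a row of missing values
}

df = pd.DataFrame.from_dict(transaction_details, orient="index")

# Step 1: unstack() on a frame with a plain (non-MultiIndex) index returns a
# Series whose index has two levels: (column label, row label).
stacked = df.unstack()

# Step 2: drop the outer level (the record number 0, 1, 2) so that only the
# field names remain as the index.
flat = stacked.reset_index(level=0, drop=True)

print(flat)
```

Because the outer level is the record number, the result lists every field of record 0 first, then record 1, and so on, and the missing `notes` entries are kept.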

Option 2
stack + sort_index + reset_index

df.stack().sort_index(level=1).reset_index(level=1, drop=True)

paymentStatus              COMPLETED
address              123 Fake Street
item                           Apple
transactionID                2132123
orderTime                      14:55
email            example@example.com
paymentStatus               REFUNDED
address           123 Example Street
item                          Banana
transactionID               54654645
orderTime                      15:10
email               fake@example.com
paymentStatus              COMPLETED
address               123 Top Secret
item                          Orange
transactionID               56754646
orderTime                      23:11
email            notreal@notreal.com

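Note that stack drops NaN values by default, so the empty refundNotes and notes rows are lost above; it may not be the best option for you.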

Use dropna=False:

df.stack(dropna=False).swaplevel(0,1).sort_index(level=0)
Out[261]: 
0  address              123 Fake Street
   email            example@example.com
   item                           Apple
   notes                           None
   orderTime                      14:55
   paymentStatus              COMPLETED
   refundNotes                     None
   transactionID                2132123
1  address           123 Example Street
   email               fake@example.com
   item                          Banana
   notes                           None
   orderTime                      15:10
   paymentStatus               REFUNDED
   refundNotes                     None
   transactionID               54654645
2  address               123 Top Secret
   email            notreal@notreal.com
   item                          Orange
   notes                           None
   orderTime                      23:11
   paymentStatus              COMPLETED
   refundNotes                     None
   transactionID               56754646
dtype: object
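
If the goal is to print each record as its own block with a blank line in between, as in the question, one possible sketch is to keep the record number as the outer index level and group on it. This uses `unstack()` rather than `stack(dropna=False)`, on the assumption that the handling of `dropna` in `stack` may differ between pandas versions; the shortened dictionary and the names `by_record` and `blocks` are illustrative only.

```python
import pandas as pd

# Shortened version of the dictionary from the question (fewer fields, same shape).
transaction_details = {
    "paymentStatus": ["COMPLETED", "REFUNDED", "COMPLETED"],
    "item": ["Apple", "Banana", "Orange"],
    "notes": [],  # empty list: becomes a row of missing values
}

df = pd.DataFrame.from_dict(transaction_details, orient="index")

# Series with a two-level index: (record number, field name).
# Missing values are kept and the fields stay in their original order.
by_record = df.unstack()

# One Series per record, keyed by record number, with only the field names as index.
blocks = {
    record: block.droplevel(0)
    for record, block in by_record.groupby(level=0)
}

# Print the records one after another, separated by a blank line.
for record, block in blocks.items():
    print(block.to_string())
    print()
```

Unlike the sorted output above, this keeps the fields in the order in which they appear in the dictionary.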

You can use a for loop to append the items to the DataFrame one record at a time:

indexes = transactionDetails.keys()
dfmade = False
for n in range(3):
    newdict = {}
    for i in indexes:
        if transactionDetails[i]:
            newdict[i]= transactionDetails[i][n]
        else:
            newdict[i] = []
    if dfmade:
        df = pd.concat([df, pd.DataFrame.from_dict(newdict, orient='index')])
    else: 
        df = pd.DataFrame.from_dict(newdict, orient='index')
        dfmade = True
print(df)

Output:

                                 0
paymentStatus            COMPLETED
transactionID              2132123
item                         Apple
notes                           []
orderTime                    14:55
address            123 Fake Street
refundNotes                     []
email          example@example.com
paymentStatus             REFUNDED
transactionID             54654645
item                        Banana
notes                           []
orderTime                    15:10
address         123 Example Street
refundNotes                     []
email             fake@example.com
paymentStatus            COMPLETED
transactionID             56754646
item                        Orange
notes                           []
orderTime                    23:11
address             123 Top Secret
refundNotes                     []
email          notreal@notreal.com
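
A variation on the loop above, sketched under a few assumptions: the record count is derived from the longest list instead of being hard-coded as 3, missing entries are filled with None rather than an empty list, and the pieces are collected in a list so that pd.concat is called only once. The shortened dictionary and the names `n_records`, `records` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

# Shortened version of the dictionary from the question (fewer fields, same shape).
transaction_details = {
    "paymentStatus": ["COMPLETED", "REFUNDED", "COMPLETED"],
    "item": ["Apple", "Banana", "Orange"],
    "notes": [],  # empty list: no value for any record
}

# Number of records = length of the longest list in the dictionary.
n_records = max(len(values) for values in transaction_details.values())

records = []
for n in range(n_records):
    # Take the n-th entry of every field, or None when the list is too short.
    record = {
        field: (values[n] if n < len(values) else None)
        for field, values in transaction_details.items()
    }
    records.append(pd.Series(record, dtype=object))

# A single concat at the end avoids rebuilding the result on every iteration.
result = pd.concat(records)
print(result)
```

The result is a Series rather than a one-column DataFrame; `result.to_frame()` converts it if a DataFrame is needed.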
