Creating DataFrame values using a dictionary lookup and some simple math

Posted 2024-09-28 22:17:20


My original DataFrame looks like this. These are the first 5 rows:

                     Date      Id        Earned        Redeemed    Type
0 2019-01-01 00:01:18.599      69          1000               0  REGULAR
1 2019-01-01 00:04:25.287      69          1000               0  REGULAR
2 2019-01-01 00:18:21.688      70          1000               0  REGULAR
3 2019-01-01 00:29:14.709      71          1000               0      VIP
4 2019-01-01 00:30:26.460      69             0            1000  REGULAR

I also have a dictionary like this:

dict = {
        '69': {'REGULAR': 5, 'VIP': 10},
        '70': {'REGULAR': 10},
        '71': {'REGULAR': 1, 'VIP': 2}
       }

I want to create a new DataFrame that looks like this:

                     Date      Id        Earned        Redeemed    Type   Earned_Normal
0 2019-01-01 00:01:18.599      69          1000               0  REGULAR            200
1 2019-01-01 00:04:25.287      69          1000               0  REGULAR            200
2 2019-01-01 00:18:21.688      70          1000               0  REGULAR            100
3 2019-01-01 00:29:14.709      71          1000               0      VIP            500
4 2019-01-01 00:30:26.460      69             0            1000  REGULAR              0

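The values in the "Id" and "Type" columns are used as keys into the dictionary to look up a factor; for example, Id: 69 and Type: REGULAR returns 5.

So at index 0, Earned_Normal = Earned / 5 = 200.

I have figured out how to do this for one specific row. How can I do it dynamically for all rows?

Thanks for your help.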


2 answers

You could try something like this. I would suggest changing the keys of the replacement dict to int.

import pandas as pd
import numpy as np

replacement_dict = {
        '69': {'REGULAR': 5, 'VIP': 10},
        '70': {'REGULAR': 10},
        '71': {'REGULAR': 1, 'VIP': 2}
       }

data = [
    {"Id":69,"Earned":1000,"Redeemed":0,"Type":"REGULAR"},
    {"Id":70,"Earned":1000,"Redeemed":0,"Type":"REGULAR"},
    {"Id":71,"Earned":1000,"Redeemed":0,"Type":"VIP"}
    ]

df = pd.DataFrame.from_dict(data)
df["Earned_Normal"] = np.nan

print(df)

def transform_row(r):
    # we add a default
    default = {'REGULAR': 5, 'VIP': 10}
    replacement_for_this_row = replacement_dict.get(str(r.Id),default)
    r.Earned_Normal = r.Earned / replacement_for_this_row[r.Type]
    return r


print(df.apply(transform_row, axis=1))

I hope this helps.
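The suggestion to use int keys could look roughly like the sketch below. It converts the string keys once, so the lookup no longer needs `str(...)` on every row. The names `int_keyed` and `earned_normal` are made up for illustration, and unlike the code above this version has no default: an Id or Type missing from the dict raises a `KeyError`.

```python
import pandas as pd

replacement_dict = {
    '69': {'REGULAR': 5, 'VIP': 10},
    '70': {'REGULAR': 10},
    '71': {'REGULAR': 1, 'VIP': 2},
}

# Convert the string keys to int once, up front (hypothetical helper name).
int_keyed = {int(key): factors for key, factors in replacement_dict.items()}

df = pd.DataFrame([
    {"Id": 69, "Earned": 1000, "Redeemed": 0, "Type": "REGULAR"},
    {"Id": 70, "Earned": 1000, "Redeemed": 0, "Type": "REGULAR"},
    {"Id": 71, "Earned": 1000, "Redeemed": 0, "Type": "VIP"},
])

def earned_normal(row):
    # Look up the factor by (Id, Type); raises KeyError if either is missing.
    factor = int_keyed[row["Id"]][row["Type"]]
    return row["Earned"] / factor

# Returning a scalar per row gives a Series that is assigned as the new column.
df["Earned_Normal"] = df.apply(earned_normal, axis=1)
print(df)
```

Returning a single value from the row function, rather than mutating and returning the whole row, keeps the other columns untouched.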

I don't know what the values in the dictionary mean, so I'll just call the column x.

df['x'] = df['Id'].apply(lambda x: dict[str(x)])
df['Earned_Normal'] = df.apply(lambda row: row['Earned'] / row['x'][row['Type']], axis=1)  # column labels instead of positions, so column order does not matter
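Both answers go row by row with `apply`. As an alternative that neither answer proposes, the nested dict can be flattened into a small lookup table and joined on Id and Type, which avoids the per-row Python function. This is only a sketch: the names `factors`, `factor_table` and `Factor` are invented here, the sample rows are the five rows from the question without the Date column, and it assumes `df` has the default 0..n-1 index.

```python
import pandas as pd

factors = {
    '69': {'REGULAR': 5, 'VIP': 10},
    '70': {'REGULAR': 10},
    '71': {'REGULAR': 1, 'VIP': 2},
}

df = pd.DataFrame({
    "Id": [69, 69, 70, 71, 69],
    "Earned": [1000, 1000, 1000, 1000, 0],
    "Redeemed": [0, 0, 0, 0, 1000],
    "Type": ["REGULAR", "REGULAR", "REGULAR", "VIP", "REGULAR"],
})

# Flatten the nested dict into one row per (Id, Type) pair.
factor_table = pd.DataFrame(
    [(int(id_), type_, factor)
     for id_, by_type in factors.items()
     for type_, factor in by_type.items()],
    columns=["Id", "Type", "Factor"],
)

# A left merge keeps every row of df in its original order;
# (Id, Type) pairs missing from the table get NaN as their factor.
merged = df.merge(factor_table, on=["Id", "Type"], how="left")

# to_numpy() assigns by position, which matches because the merge kept df's row order.
df["Earned_Normal"] = (merged["Earned"] / merged["Factor"]).to_numpy()
print(df)
```

With this approach a missing Id or Type yields NaN in `Earned_Normal` instead of raising, which may or may not be what you want.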
