Applying a z-score transformation to a DataFrame column

df = df.sample(frac=1, random_state=42)
x = df[["visitorid", "itemid"]].values
#y = df["code"].values
y = df["code"].apply(lambda x: (x - x.mean()) / x.std()).values
# Assuming training on 90% of the data and validating on 10%.
train_indices = int(0.9 * df.shape[0])
x_train, x_val, y_train, y_val = (
    x[:train_indices],
    x[train_indices:],
    y[:train_indices],
    y[train_indices:],
)
print(y)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-2712d78bf2a4> in <module>()
      2 x = df[["visitorid", "itemid"]].values
      3 #y = df["code"].values
----> 4 y = df["code"].apply(lambda x: (x - x.mean()) / x.std()).values
      5 # Assuming training on 90% of the data and validating on 10%.
      6 train_indices = int(0.9 * df.shape[0])

1 frames
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-7-2712d78bf2a4> in <lambda>(x)
      2 x = df[["visitorid", "itemid"]].values
      3 #y = df["code"].values
----> 4 y = df["code"].apply(lambda x: (x - x.mean()) / x.std()).values
      5 # Assuming training on 90% of the data and validating on 10%.
      6 train_indices = int(0.9 * df.shape[0])

AttributeError: 'int' object has no attribute 'mean'

2 Answers

User

#1 · Edited 2024-09-30 22:21:48

Because you call apply(lambda x: ...) on a single column, x is only one value at a time. Calling x.mean() on a single value raises the error.
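A quick check makes this concrete. The sketch below uses a made-up three-row frame (the values are an assumption; only the column name code comes from the question) and records whether each argument that Series.apply passes to the function is a scalar.

```python
import pandas as pd

# Hypothetical data; only the column name "code" comes from the question.
df = pd.DataFrame({"code": [1, 2, 3]})

# Series.apply calls the function once per element,
# so x is a single value here, never the whole column.
is_scalar = df["code"].apply(lambda x: pd.api.types.is_scalar(x))

print(is_scalar.tolist())
```

Every entry is a scalar, so column-level statistics such as the mean have to be taken from the Series itself rather than from x.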

What you want instead is to compute mean and std over the whole column. With apply, you can do it like this:

col = 'code'
df['z_score'] = df[col].apply(lambda x: (x - df[col].mean()) / df[col].std())

However, it is faster without apply, because the version above recomputes the mean and std for every row:

df['z_score'] = (df[col] - df[col].mean())/df[col].std()
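As a sanity check on the vectorized form, here is a minimal sketch on made-up data (the values are an assumption, not taken from the question). It compares the result with an apply-based version that computes the statistics once up front, and converts the result to a NumPy array as the question's code expects. Note that Series.std() uses the sample standard deviation (ddof=1) by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the question's "code" column.
df = pd.DataFrame({"code": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]})
col = "code"

# Vectorized z-score: arithmetic is broadcast over the whole column.
z = (df[col] - df[col].mean()) / df[col].std()

# Equivalent apply-based version, with mean and std computed only once.
mean, std = df[col].mean(), df[col].std()
z_apply = df[col].apply(lambda x: (x - mean) / std)

# NumPy array of targets, ready for the train/validation split.
y = z.to_numpy()

print(np.allclose(z, z_apply))
print(y.mean(), y.std(ddof=1))
```

After the transformation the column has a mean of about 0 and a sample standard deviation of about 1, up to floating-point error.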

User

#2 · Edited 2024-09-30 22:21:48

Maybe this is what you need:

y = ((df["code"] - df["code"].mean()) / df["code"].std()).values

I like this approach (it performs well if your dataset has more than 15,000 rows):

df.eval('(code-code.mean())/code.std()')
