I have to store each column (8 quantile buckets) in one DataFrame

Posted 2024-10-04 07:32:36


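I have a DataFrame with 4 columns, and each column has to be bucketed (its data distributed across 8 buckets). I want to do this iteratively, for the first column, the second column and so on, without specifying the column names by hand.

This is the code I am trying: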

for col in df3.columns[0:]:
    cb1 = np.linspace(min(col), max(col), 11)
    df3.insert(2 ,'buckets',pd.cut(col, cb1, labels=np.arange(1, 11, 1)))
    print(df3[col])

Here df3 is the sample dataset:

apple  orange  banana
    5       2       6
    6       4       6
    2       8       9
    4       7       0

The expected output is:

apple  orange  banana  buckets
    5       2       6  1 3 2
    6       4       6  1 4
    2       8       9  2 1 8
    4       7       0  5 4 1

Here the bucket columns give the bucket number associated with the data.


1 answer

User

#1 · Posted 2024-10-04 07:32:36

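Since the expected output is completely random, there is no correlation between the data columns and the bucket numbers, so in this case the buckets should be generated separately for each column.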

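import numpy as np

for c in df.columns:
    # Draw a random bucket number in 1..8 for every row of this column.
    df['bucket_' + c] = np.random.randint(8, size=len(df)) + 1
df  # the frame now also holds the random bucket columns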
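If the buckets are instead meant to follow the values, as the `pd.cut` call in the question suggests, a per-column version could look roughly like the sketch below. It assumes equal-width bins over each column's own range and uses the column names `apple`, `orange`, `banana` from the sample; the `bucket_` prefix and the `result` name are only illustrative.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "apple": [5, 6, 2, 4],
    "orange": [2, 4, 8, 7],
    "banana": [6, 6, 9, 0],
})

N_BUCKETS = 8  # bucket count taken from the question

buckets = {}
for col in df.columns:
    # An integer `bins` splits the column's range into equal-width intervals;
    # labels=False returns 0-based bin codes, so add 1 to get buckets 1..8.
    buckets["bucket_" + col] = pd.cut(df[col], bins=N_BUCKETS, labels=False) + 1

# Attach the bucket columns next to the original data.
result = df.join(pd.DataFrame(buckets))
print(result)
```

With this approach the smallest value of each column lands in bucket 1 and the largest in bucket 8. `pd.qcut` would be the equal-frequency (quantile) counterpart.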

If you want the random buckets to be equal in size:

for c in df.columns:
    arr = np.arange(8) + 1
    arr = np.repeat(arr, len(df) // 8)  # len(df) has to be divisible by 8
    np.random.shuffle(arr)  # shuffle the array in place
    df['bucket_' + c] = arr
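The equal-size code above needs `len(df)` to be a multiple of 8, which the four-row sample in the question is not. A possible workaround, sketched here under the assumption that bucket sizes may differ by at most one, is to cycle the labels 1..8 over the rows and then shuffle them. The seed and the `balanced` name are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (4 rows, not divisible by 8).
df = pd.DataFrame({
    "apple": [5, 6, 2, 4],
    "orange": [2, 4, 8, 7],
    "banana": [6, 6, 9, 0],
})

N_BUCKETS = 8
rng = np.random.default_rng(0)  # fixed seed only to make the run repeatable

balanced = df.copy()
for c in df.columns:
    # Cycle 1..8 over the rows so that bucket sizes differ by at most one.
    labels = np.arange(len(df)) % N_BUCKETS + 1
    # permutation returns a shuffled copy, independently for each column.
    balanced["bucket_" + c] = rng.permutation(labels)

print(balanced)
```

For a frame whose length is a multiple of 8 this yields exactly equal bucket sizes, the same as the `np.repeat` version.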
