How to balance results in a pandas DataFrame

   grade    section        area_steel  Nx       Myy     utilisation  Accceptable
0  C16/20   STD R 700 350  4534        -310000  240000  0.313        0
1  C90/105  STD R 400 600  4248        -490000  270000  0.618        0
3  C35/45   STD R 550 400  1282        580000   810000  7.049        1
4  C12/15   STD R 350 750  2386        960000   610000  5.180        1

2 answers

User

Answer 1 · edited 2024-10-03 06:24:22

You can group the data by the Accceptable column and then sample from each group:

data.groupby('Accceptable').sample(frac=0.5)
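Sampling the same fraction from each group keeps the original class proportions. To make the classes equal in size, one option is to draw the same number of rows from each group instead. The following is a minimal sketch, assuming pandas 1.1 or later (where `DataFrameGroupBy.sample` is available). The toy DataFrame and the names `n_min` and `balanced` are hypothetical; only the `Accceptable` column name comes from the question.

```python
import pandas as pd

# Hypothetical toy data; only the 'Accceptable' column name comes from the question
data = pd.DataFrame({
    "utilisation": [0.313, 0.618, 7.049, 5.180, 0.45, 0.72],
    "Accceptable": [0, 0, 1, 1, 0, 0],
})

# Size of the smallest class
n_min = data["Accceptable"].value_counts().min()

# Draw n_min rows from every class so that all classes end up the same size
balanced = data.groupby("Accceptable").sample(n=n_min, random_state=0)

print(balanced["Accceptable"].value_counts())
```

Passing `random_state` makes the draw reproducible; omit it to get a different sample on each run.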

User

Answer 2 · edited 2024-10-03 06:24:22

Try this:

import numpy as np
import pandas as pd

# Values from the answerer's dataset. no_fail must equal the number of rows
# with Accceptable == 1, otherwise the boolean mask below will not fit.
ratio = 1.979159389917336
no_fail = 16999

# Boolean mask that selects every row with Accceptable == 0
pass_to_choose = (data['Accceptable'] == 0)

# Random boolean mask of length no_fail; each entry is True with probability 1/ratio
fail_to_choose = np.random.uniform(low=0.0, high=1.0, size=no_fail) < (1 / ratio)

# Keep all rows with Accceptable == 0
new_data = data[pass_to_choose]

# Add the sampled rows with Accceptable == 1
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
new_data = pd.concat([new_data, data[~pass_to_choose][fail_to_choose]]).reset_index()
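The hard-coded `ratio` and `no_fail` above only fit one particular dataset. As a hedged alternative, the sketch below derives the class sizes from the data and downsamples the larger class to exactly the size of the smaller one. It swaps the random boolean mask for `numpy.random.Generator.choice` without replacement, so the result is exactly balanced, not just approximately. The toy DataFrame and the names `passed`, `failed` and `rng` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the question's DataFrame
data = pd.DataFrame({
    "utilisation": [0.3, 0.6, 7.0, 5.2, 1.4, 2.8, 3.1, 0.9],
    "Accceptable": [0, 0, 1, 1, 1, 1, 1, 0],
})

passed = data[data["Accceptable"] == 0]
failed = data[data["Accceptable"] == 1]

# Downsample whichever class is larger to the size of the smaller one
rng = np.random.default_rng(0)
if len(failed) > len(passed):
    keep = rng.choice(len(failed), size=len(passed), replace=False)
    failed = failed.iloc[keep]
else:
    keep = rng.choice(len(passed), size=len(failed), replace=False)
    passed = passed.iloc[keep]

# Combine both classes and renumber the rows
new_data = pd.concat([passed, failed]).reset_index(drop=True)

print(new_data["Accceptable"].value_counts())
```

Using `reset_index(drop=True)` discards the old row labels; drop the argument if the original index should be kept as a column.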
