Categorizing data with pandas

Posted 2024-09-27 04:27:09


I am trying to run a chi-square test on a dataset, and to do that I need to use pd.cut() to represent the categories in the dataset. However, I get this error:

ufunc 'subtract' did not contain a loop with signature matching types dtype('

My code:

import pandas as pd
import numpy as np
import scipy as sp
import math

data_main = pd.read_csv("sample_survey.csv")
data = data_main.iloc[:, [1,2]]

data["wrkstat"] = data["wrkstat"].astype(str)
data["marital"] = data["marital"].astype(str)
cols = ['wrkstat', 'marital']

cut_points = ['Divorced', 'Married', 'Never Married', 'Seperated','Widowed']
label_names = ['Divorced1', 'Married', 'Never Married', 
'Seperated','Widowed']
data["Marital_Categories"] = pd.cut(data["marital"], cut_points)

marital_by_wrkstat = data[['wrkstat', 'Marital_Categories']]
marital_by_wrkstat.head()
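
A likely cause, as far as can be told from the code above: pd.cut() is meant for binning numeric values into intervals, so passing string labels as bin edges makes it attempt arithmetic on strings, which would explain the ufunc 'subtract' message. Since the marital column already holds category labels, one possible approach is to skip binning altogether, convert the column to the category dtype, and build the contingency table with pd.crosstab(). The sketch below assumes that approach. The small DataFrame is a made-up stand-in for sample_survey.csv (its real contents are not shown), and the lowercase column name marital_categories is only illustrative.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical stand-in for sample_survey.csv; the real file is not shown.
data = pd.DataFrame({
    "wrkstat": ["Full time", "Part time", "Full time", "Retired",
                "Full time", "Part time", "Retired", "Full time"],
    "marital": ["Married", "Divorced", "Never Married", "Widowed",
                "Married", "Married", "Widowed", "Never Married"],
})

# The values are already category labels, so convert the dtype
# instead of binning with pd.cut().
data["marital_categories"] = data["marital"].astype("category")

# Contingency table of observed counts:
# rows are wrkstat values, columns are marital categories.
observed = pd.crosstab(data["wrkstat"], data["marital_categories"])

# Chi-square test of independence on the observed counts.
chi2, p_value, dof, expected = chi2_contingency(observed)

print(observed)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

With real survey data, the same three steps (astype("category"), pd.crosstab(), chi2_contingency()) should apply unchanged. Note that the chi-square approximation is usually considered unreliable when many expected counts are small, as they are in this toy table.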

