Trying to create dummy variables for multiple columns

Posted 2024-09-28 05:45:01


I am trying to create dummy variables for multiple columns, for example:

  • Gender (1 = male; 2 = female)
  • Education (1 = graduate school; 2 = university; 3 = high school; 4 = others)
  • Marital status (1 = married; 2 = single; 3 = others)
  • Defaulter (1 = default; 0 = no default)

Can anyone tell me how to do this?


Tags: default, status, gender, male, female, high, single, education
2 Answers

Suppose you have a DataFrame named data like this:

    Education         Gender    MarritalStatus
0   graduate school   male      married
1   university        female    single
2   high school       female    other
3   others            male      single
4   university        male      single

Then you can apply an encoding function to each column with pd.Series.apply(), for example:

def enc_for_gender(x):
    # 1 = male; anything else is treated as female (2)
    if x == 'male':
        return 1
    return 2

def enc_for_education(x):
    if x == 'graduate school':
        return 1
    elif x == 'university':
        return 2
    elif x == 'high school':
        return 3
    return 4

data['Gender'].apply(enc_for_gender)

Result:

0    1
1    2
2    2
3    1
4    1
Name: Gender, dtype: int64

The same works for Education (Series.map() accepts a function just like apply()):

data['Education'].map(enc_for_education)

Result:

0    1
1    2
2    3
3    4
4    2
Name: Education, dtype: int64

The remaining columns can be handled the same way.
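The functions above produce integer codes rather than dummy (indicator) columns. If one 0/1 column per category is what the question is after, pandas' pd.get_dummies may be a better fit. The following is only a sketch under that assumption: the sample frame is rebuilt from the table shown earlier, with column names copied as written there, and the variable name dummies is chosen just for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the table above (column names copied as written there).
data = pd.DataFrame({
    "Education": ["graduate school", "university", "high school", "others", "university"],
    "Gender": ["male", "female", "female", "male", "male"],
    "MarritalStatus": ["married", "single", "other", "single", "single"],
})

# Build one 0/1 indicator column per category for all listed columns at once.
# New columns are named "<column>_<category>", e.g. "Gender_male".
dummies = pd.get_dummies(
    data,
    columns=["Education", "Gender", "MarritalStatus"],
    dtype=int,
)

print(dummies.columns.tolist())
print(dummies["Gender_male"].tolist())
```

If the result feeds a regression model, passing drop_first=True to pd.get_dummies drops one category per column, which avoids perfectly collinear indicators. Whether that is needed depends on the model.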

Just use a dictionary as the key-value mapping:

Gender = {1: "male", 2: "female"}
Education = {1: "graduate school", 2: "university", 3: "high school", 4: "others"}

If possible, I recommend using strings as the dict keys, like this:

Gender = {"male": 1, "female": 2}

Or write the numeric codes as strings:

Gender = {"1": "male", "2": "female"}
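A dictionary like these can be applied to a whole column with Series.map. The sketch below is one possible way to wire it up, not necessarily what the answer's author had in mind: the sample column and the names gender, gender_codes, encoded and decoded are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical sample column; values follow the example table earlier in the thread.
gender = pd.Series(["male", "female", "female", "male", "male"], name="Gender")

# Label -> code lookup, matching the string-keyed dict above.
gender_codes = {"male": 1, "female": 2}

# Series.map looks each value up in the dict; labels missing from the dict become NaN.
encoded = gender.map(gender_codes)

# The code -> label dict reverses the encoding.
decoded = encoded.map({1: "male", 2: "female"})

print(encoded.tolist())
print(decoded.tolist())
```

Compared with writing one function per column, a dict keeps each mapping in a single place and makes unmapped labels show up as NaN instead of silently falling into a default code.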
