Leave-one-out encoding for categorical features
Leave One Out Encoder
Leave-one-out coding for categorical features.
Check out the source for this project here: https://github.com/welfare520/leave-one-out-encoder
Getting Started
Installation
$ pip install loo_encoder
Example
Fit the encoder on X and y, then transform the training data.
from loo_encoder.encoder import LeaveOneOutEncoder
import pandas as pd
import numpy as np

enc = LeaveOneOutEncoder(cols=['gender', 'country'], handle_unknown='impute', sigma=0.02, random_state=42)

X = pd.DataFrame({
    "gender": ["male", "male", "female", "male"],
    "country": ["Germany", "USA", "USA", "UK"],
    "clicks": [10, 33, 47, 21],
})
y = pd.Series([150, 250, 300, 100], name="orders")

df_train = enc.fit_transform(X=X, y=y, sample_weight=X['clicks'])
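To make the idea behind the fit step concrete, here is a minimal pure-Python sketch of plain leave-one-out encoding: each row is replaced by the mean target of the other rows in the same category. It is not the code of loo_encoder. The function name `leave_one_out_encode` is hypothetical, the sketch ignores `sample_weight` and the `sigma` noise term, and falling back to the global mean for a category that occurs only once is an assumption.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Encode each row as the mean target of the other rows in its category."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, target in zip(categories, targets):
        sums[cat] += target
        counts[cat] += 1

    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, target in zip(categories, targets):
        if counts[cat] > 1:
            # Remove the row's own target before averaging.
            encoded.append((sums[cat] - target) / (counts[cat] - 1))
        else:
            # Assumption: a category seen only once has no other rows,
            # so fall back to the global target mean.
            encoded.append(global_mean)
    return encoded


gender = ["male", "male", "female", "male"]
orders = [150, 250, 300, 100]
gender_encoded = leave_one_out_encode(gender, orders)
print(gender_encoded)  # [175.0, 125.0, 200.0, 200.0]
```

Because a row never sees its own target, the encoded value leaks less label information than a plain per-category mean. The values produced by the library for the same data may differ, since it also takes weights and noise into account.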
Apply the transformation to new categorical data.
X_val = pd.DataFrame({
    "gender": ["unknown", "male", "female", "male"],
    "country": ["Germany", "USA", "Germany", "Japan"],
})

df_test = enc.transform(X=X_val)
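New data has no target column, so nothing can be left out at transform time. One common approach, sketched below in pure Python, is to encode each value with the full per-category mean learned during fitting and to give unseen categories (such as "Japan" here) the global mean. This is an illustration only: the helper names `fit_category_means` and `encode_new` are hypothetical, and whether `handle_unknown='impute'` in loo_encoder behaves this way is an assumption, not something the project description states.

```python
def fit_category_means(categories, targets):
    """Learn the mean target per category plus the global mean."""
    totals, counts = {}, {}
    for cat, target in zip(categories, targets):
        totals[cat] = totals.get(cat, 0.0) + target
        counts[cat] = counts.get(cat, 0) + 1
    means = {cat: totals[cat] / counts[cat] for cat in totals}
    global_mean = sum(targets) / len(targets)
    return means, global_mean


def encode_new(categories, means, global_mean):
    """Encode unseen rows; unknown categories get the global mean (assumption)."""
    return [means.get(cat, global_mean) for cat in categories]


train_country = ["Germany", "USA", "USA", "UK"]
orders = [150, 250, 300, 100]
means, global_mean = fit_category_means(train_country, orders)

new_country = ["Germany", "USA", "Germany", "Japan"]
country_encoded = encode_new(new_country, means, global_mean)
print(country_encoded)  # [150.0, 275.0, 150.0, 200.0]
```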