Python category-encoders package - PyPI

A collection of sklearn transformers to encode categorical variables as numeric

Project description for the category-encoders Python package

Categorical Encoding Methods
============================

[![Travis Status](https://travis-ci.org/scikit-learn-contrib/categorical-encoding.svg?branch=master)](https://travis-ci.org/scikit-learn-contrib/categorical-encoding)
[![Coveralls Status](https://coveralls.io/repos/scikit-learn-contrib/categorical-encoding/badge.svg?branch=master&service=github)](https://coveralls.io/r/scikit-learn-contrib/categorical-encoding)
[![CircleCI Status](https://circleci.com/gh/scikit-learn-contrib/categorical-encoding.svg?style=shield&circle-token=:circle-token)](https://circleci.com/gh/scikit-learn-contrib/categorical-encoding/tree/master)
[![DOI](https://zenodo.org/badge/47077067.svg)](https://zenodo.org/badge/latestdoi/47077067)

Documentation: [http://contrib.scikit-learn.org/categorical-encoding/](http://contrib.scikit-learn.org/categorical-encoding/)

Encoding Methods
----------------

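 * Backward Difference Contrast [2][3]
 * BaseN [6]
 * Binary [5]
 * Hashing [1]
 * Helmert Contrast [2][3]
 * James-Stein Estimator [9]
 * LeaveOneOut [4]
 * M-estimator [7]
 * Ordinal [2][3]
 * One-Hot [2][3]
 * Polynomial Contrast [2][3]
 * Sum Contrast [2][3]
 * Target Encoding [7]
 * Weight of Evidence [8]

The package requires `numpy`, `statsmodels`, and `scipy`.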

To install the package, execute:

```shell
$ python setup.py install
```

To install the development version, you can use:

```shell
pip install --upgrade git+https://github.com/scikit-learn-contrib/categorical-encoding
```

The encoders are scikit-learn transformers, so they can be used in pipelines or in your existing scripts. Supported input formats include numpy arrays and pandas DataFrames. If the `cols` parameter is not passed, all columns with object or pandas categorical dtype are encoded. See the documentation for transformer-specific configuration options.

Examples
--------
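There are two types of encoders: unsupervised and supervised. An unsupervised example:
```python
from category_encoders import *
import pandas as pd
# note: load_boston was removed in scikit-learn 1.2, so this needs an older release
from sklearn.datasets import load_boston

# prepare some data
bunch = load_boston()
y = bunch.target
X = pd.DataFrame(bunch.data, columns=bunch.feature_names)

# use binary encoding to encode two categorical features
enc = BinaryEncoder(cols=['CHAS', 'RAD']).fit(X)

# transform the dataset
numeric_dataset = enc.transform(X)
```

And a supervised example:
```python
from category_encoders import *
import pandas as pd
from sklearn.datasets import load_boston

# prepare some data
bunch = load_boston()
y_train = bunch.target[0:250]
y_test = bunch.target[250:506]
X_train = pd.DataFrame(bunch.data[0:250], columns=bunch.feature_names)
X_test = pd.DataFrame(bunch.data[250:506], columns=bunch.feature_names)

# use target encoding to encode two categorical features
enc = TargetEncoder(cols=['CHAS', 'RAD'])

# fit on the training data (with its target), then transform both datasets
training_numeric_dataset = enc.fit_transform(X_train, y_train)
testing_numeric_dataset = enc.transform(X_test)
```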
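To make the unsupervised/supervised distinction concrete, here is a minimal standard-library sketch of one encoder of each kind. It is not the library's implementation: the function names are hypothetical, the binary encoder simply writes a 1-based ordinal code in bits, and the target encoder is a plain per-category mean with no smoothing, which the library's encoders may well handle differently.

```python
from collections import defaultdict


def binary_encode(categories):
    """Unsupervised: uses only the category values, never a target."""
    # Ordinal codes starting at 1, in order of first appearance.
    codes = {}
    for c in categories:
        codes.setdefault(c, len(codes) + 1)
    if not codes:
        return []
    # Bit columns needed for the largest code, most significant bit first.
    width = max(codes.values()).bit_length()
    return [[(codes[c] >> s) & 1 for s in range(width - 1, -1, -1)]
            for c in categories]


def fit_target_means(categories, targets):
    """Supervised: learns the mean target value of each category."""
    sums, counts = defaultdict(float), defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    global_mean = sum(targets) / len(targets)
    return means, global_mean


def apply_target_means(categories, means, global_mean):
    # Categories not seen during fitting fall back to the global mean.
    return [means.get(c, global_mean) for c in categories]


train_cat = ["a", "b", "a", "b", "c"]
train_y = [1.0, 0.0, 2.0, 1.0, 6.0]
test_cat = ["a", "c", "d"]

bits = binary_encode(train_cat)
means, global_mean = fit_target_means(train_cat, train_y)
test_encoded = apply_target_means(test_cat, means, global_mean)
print(bits)
print(test_encoded)
```

The supervised sketch is fitted on training data only and then applied to the test categories, mirroring the `fit_transform` / `transform` split used with a train/test partition.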

Additional examples and benchmarks can be found in the `examples` directory.

Check out the CONTRIBUTING.md file
or open an issue on the GitHub project to get started.

References:
----

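1. Kilian Weinberger; Anirban Dasgupta; John Langford; Alex Smola; Josh Attenberg (2009). Feature Hashing for Large Scale Multitask Learning. Proc. ICML.
2. Contrast Coding Systems for categorical variables. UCLA: Statistical Consulting Group. From https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/
3. Gregory Carey (2003). Coding Categorical Variables. From http://psych.colorado.edu/~carey/courses/psyc5741/handouts/coding%20categorical%20variables%202006-03-03.pdf
4. Strategies to encode categorical variables with many categories. From https://www.kaggle.com/c/caterpillar-tube-pricing/discussion/15748#143154
5. Beyond One-Hot: an exploration of categorical variables. From http://www.willmcginnis.com/2015/11/29/beyond-one-hot-an-exploration-of-categorical-variables/
6. BaseN Encoding and Grid Search in categorical variables. From http://www.willmcginnis.com/2016/12/18/basen-encoding-grid-search-category_encoders/
7. Daniele Micci-Barreca (2001). A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems. SIGKDD Explor. Newsl. 3, 1. From http://dx.doi.org/10.1145/507533.507538
8. Weight of Evidence (WOE) and Information Value Explained. From https://www.listendata.com/2015/03/weight-of-evidence-woe-and-information.html
9. Empirical Bayes for multiple sample sizes. From http://chris-said.io/2017/05/03/empirical-bayes-for-multiple-sample-sizes/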


category-encoders 2.0.0
