A set of Python modules for anomaly detection




kenchi

This is a scikit-learn compatible library for anomaly detection.

Dependencies

Installation

You can install via pip

pip install kenchi

or conda

conda install -c y_ohr_n kenchi

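Algorithms

  • Outlier detection
    1. FastABOD [8]
    2. LOF [2] (scikit-learn wrapper)
    3. KNN [1], [12]
    4. OneTimeSampling [14]
    5. HBOS [5]
  • Novelty detection
    1. OCSVM [13] (scikit-learn wrapper)
    2. MiniBatchKMeans
    3. IForest [10] (scikit-learn wrapper)
    4. PCA
    5. GMM (scikit-learn wrapper)
    6. KDE [11] (scikit-learn wrapper)
    7. SparseStructureLearning [6]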
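
To make the difference between the two settings concrete, here is a minimal pure-Python sketch of a k-nearest-neighbor distance score in the spirit of [12]. It is not kenchi's implementation: the function name knn_scores, its parameters, and the toy data are assumptions made for illustration only.

```python
import math


def knn_scores(train, queries, k=2, exclude_self=False):
    """Return, for each query, the distance to its k-th nearest training point.

    Hypothetical helper for illustration; a larger score means "more anomalous".
    """
    scores = []
    for i, q in enumerate(queries):
        dists = sorted(
            math.dist(q, p)
            for j, p in enumerate(train)
            # When scoring the training set itself, skip the point's own
            # zero distance so it does not count as a neighbor.
            if not (exclude_self and i == j)
        )
        scores.append(dists[k - 1])
    return scores


# Toy data (made up): a tight cluster plus one distant point.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]

# Outlier detection: the data may already contain anomalies, so the
# training points themselves are scored.
outlier_scores = knn_scores(X, X, k=2, exclude_self=True)

# Novelty detection: the training data is assumed clean, and unseen
# points are scored against it.
novelty_scores = knn_scores(X[:4], [(0.5, 0.5), (5.0, 5.0)], k=2)
```

In this sketch the distant point (10.0, 10.0) receives the largest outlier score, and the unseen point (5.0, 5.0) scores higher than (0.5, 0.5), which lies inside the cluster.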

Examples

    import matplotlib.pyplot as plt
    import numpy as np
    from kenchi.datasets import load_pima
    from kenchi.outlier_detection import *
    from kenchi.pipeline import make_pipeline
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    np.random.seed(0)

    scaler = StandardScaler()

    detectors = [
        FastABOD(novelty=True, n_jobs=-1), OCSVM(),
        MiniBatchKMeans(), LOF(novelty=True, n_jobs=-1),
        KNN(novelty=True, n_jobs=-1), IForest(n_jobs=-1),
        PCA(), KDE()
    ]

    # Load the Pima Indians diabetes dataset.
    X, y = load_pima(return_X_y=True)
    X_train, X_test, _, y_test = train_test_split(X, y)

    # Get the current Axes instance
    ax = plt.gca()

    for det in detectors:
        # Fit the model according to the given training data
        pipeline = make_pipeline(scaler, det).fit(X_train)

        # Plot the Receiver Operating Characteristic (ROC) curve
        pipeline.plot_roc_curve(X_test, y_test, ax=ax)

    # Display the figure
    plt.show()
https://raw.githubusercontent.com/HazureChi/kenchi/master/docs/images/readme.png
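
For readers unfamiliar with the ROC curve plotted above, the following pure-Python sketch shows how such a curve and its area can be computed from anomaly scores. It is an illustration under stated assumptions, not the code behind plot_roc_curve: the helper names, the toy labels and scores, and the convention that 1 marks an anomaly and higher scores mean "more anomalous" are all chosen here for the example.

```python
def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over scores.

    Assumed convention: y_true is 1 for anomalies and 0 for normal points,
    and a higher score means "more anomalous".
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Visit samples from the most to the least anomalous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# Toy labels and scores (made up): one normal point is scored too high.
y_true = [0, 0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.9, 0.6]

fpr, tpr = roc_curve_points(y_true, scores)
area = auc(fpr, tpr)  # 0.75 for this toy data
```

An area of 1.0 would mean every anomaly is scored above every normal point; here 6 of the 8 anomaly/normal pairs are ordered correctly, which gives 0.75.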

References

[1] Angiulli, F., and Pizzuti, C., “Fast outlier detection in high dimensional spaces,” In Proceedings of PKDD, pp. 15-27, 2002.
[2] Breunig, M. M., Kriegel, H.-P., Ng, R. T., and Sander, J., “LOF: identifying density-based local outliers,” In Proceedings of SIGMOD, pp. 93-104, 2000.
[3] Dua, D., and Karra Taniskidou, E., “UCI Machine Learning Repository,” 2017.
[4] Goix, N., “How to evaluate the quality of unsupervised anomaly detection algorithms?” In ICML Anomaly Detection Workshop, 2016.
[5] Goldstein, M., and Dengel, A., “Histogram-based outlier score (HBOS): A fast unsupervised anomaly detection algorithm,” KI: Poster and Demo Track, pp. 59-63, 2012.
[6] Ide, T., Lozano, C., Abe, N., and Liu, Y., “Proximity-based anomaly detection using sparse structure learning,” In Proceedings of SDM, pp. 97-108, 2009.
[7] Kriegel, H.-P., Kroger, P., Schubert, E., and Zimek, A., “Interpreting and unifying outlier scores,” In Proceedings of SDM, pp. 13-24, 2011.
[8] Kriegel, H.-P., Schubert, M., and Zimek, A., “Angle-based outlier detection in high-dimensional data,” In Proceedings of SIGKDD, pp. 444-452, 2008.
[9] Lee, W. S., and Liu, B., “Learning with positive and unlabeled examples using weighted Logistic Regression,” In Proceedings of ICML, pp. 448-455, 2003.
[10] Liu, F. T., Ting, K. M., and Zhou, Z.-H., “Isolation forest,” In Proceedings of ICDM, pp. 413-422, 2008.
[11] Parzen, E., “On estimation of a probability density function and mode,” Ann. Math. Statist., 33(3), pp. 1065-1076, 1962.
[12] Ramaswamy, S., Rastogi, R., and Shim, K., “Efficient algorithms for mining outliers from large data sets,” In Proceedings of SIGMOD, pp. 427-438, 2000.
[13] Scholkopf, B., Platt, J. C., Shawe-Taylor, J. C., Smola, A. J., and Williamson, R. C., “Estimating the Support of a High-Dimensional Distribution,” Neural Computation, 13(7), pp. 1443-1471, 2001.
[14] Sugiyama, M., and Borgwardt, K., “Rapid distance-based outlier detection via sampling,” Advances in NIPS, pp. 467-475, 2013.
