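Unsupervised Random Forest (Random Forest Clustering)
URF: Python project description
URF (Unsupervised Random Forest, also called Random Forest Clustering) is a Python implementation of the paper: Shi, T., & Horvath, S. (2006). Unsupervised learning with random forest predictors. Journal of Computational and Graphical Statistics, 15(1), 118-138.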
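For orientation, the general idea in Shi and Horvath (2006) can be sketched as follows: build a synthetic copy of the data by sampling each feature independently from its empirical marginal, train a random forest to separate real from synthetic rows, take the fraction of trees in which two real samples share a leaf as their proximity, and cluster on the resulting dissimilarity. The code below is a minimal sketch of that idea using scikit-learn and SciPy, not URF's own implementation. The function names `make_synthetic`, `rf_proximity`, and `cluster_from_proximity` are hypothetical, and average-linkage hierarchical clustering is swapped in here for whatever clustering step URF performs with Pycluster.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def make_synthetic(X, rng):
    """Sample every column independently (with replacement) from its own
    empirical marginal, which destroys dependence between features."""
    n = X.shape[0]
    cols = [rng.choice(X[:, j], size=n, replace=True) for j in range(X.shape[1])]
    return np.column_stack(cols)


def rf_proximity(X, n_estimators=200, random_state=0):
    """Proximity matrix from a forest trained on real vs. synthetic rows."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(random_state)
    X_all = np.vstack([X, make_synthetic(X, rng)])
    # Label 1 = real observation, label 0 = synthetic observation.
    y_all = np.r_[np.ones(len(X)), np.zeros(len(X))]
    forest = RandomForestClassifier(
        n_estimators=n_estimators, random_state=random_state
    )
    forest.fit(X_all, y_all)
    leaves = forest.apply(X)  # shape: (n_samples, n_trees)
    # Fraction of trees in which samples i and j land in the same leaf.
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)


def cluster_from_proximity(prox, k):
    """Cut an average-linkage tree built on sqrt(1 - proximity) into k clusters.
    Hierarchical clustering is an assumption of this sketch."""
    dist = np.sqrt(np.clip(1.0 - prox, 0.0, None))
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=k, criterion="maxclust")


if __name__ == "__main__":
    from sklearn.datasets import load_iris

    X = load_iris().data
    prox = rf_proximity(X)
    labels = cluster_from_proximity(prox, k=3)
    print(prox.shape, sorted(set(labels)))
```

The pairwise leaf comparison allocates an array of size n_samples × n_samples × n_trees, so this sketch is only suitable for small data sets.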
Prerequisites
conda install -c bioconda pycluster
Or build Pycluster from source:
wget http://bonsai.hgc.jp/~mdehoon/software/cluster/Pycluster-1.54.tar.gz
tar -zxvf Pycluster-1.54.tar.gz
cd Pycluster-1.54
python setup.py install
Installation
pip install URF
Usage
from sklearn.datasets import load_iris
from URF.main import random_forest_cluster, plot_cluster_result

iris = load_iris()
X = iris.data
y = iris.target
# Number of distinct class labels (3 for iris), which motivates k=3 below
print(len(list(set(y))))
clf, prox_mat, cluster_ids = random_forest_cluster(X, k=3, max_depth=20, random_state=0)
plot_cluster_result(prox_mat, cluster_ids, marker=y)
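If you encounter an error like
> QXcbConnection: Could not connect to display
then add the following lines at the top of your script, before any other matplotlib import:
import matplotlib as mpl
mpl.use("Agg")
With this non-interactive backend, you must also specify an output file when calling plot_cluster_result, as follows: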
plot_cluster_result(prox_mat, cluster_ids, marker=y, output="test_123.png")
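The backend switch can also be made conditional, so the same script still opens a window on a desktop and falls back to Agg on a headless machine. The following is a minimal sketch, assuming that a missing `DISPLAY` (or `WAYLAND_DISPLAY`) environment variable on Linux is a good enough signal for "no display". The helper names `needs_headless_backend` and `configure_matplotlib` are hypothetical and not part of URF.

```python
import os
import sys


def needs_headless_backend(environ=None, platform=None):
    """Return True on Linux when neither DISPLAY nor WAYLAND_DISPLAY is set.
    This heuristic is an assumption of the sketch, not a matplotlib rule."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        return False
    return not (environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))


def configure_matplotlib():
    """Select the Agg backend when headless; call before importing pyplot."""
    import matplotlib as mpl  # imported lazily so the helper above has no dependency

    if needs_headless_backend():
        mpl.use("Agg")
    return mpl.get_backend()
```

Call `configure_matplotlib()` once at the top of the script. When it selects Agg, pass `output=` to `plot_cluster_result` as shown above.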