PARC
PARC, "phenotyping by accelerated refined community-partitioning", is a fast, automated, combinatorial graph-based clustering approach that integrates hierarchical graph construction (HNSW) and data-driven graph pruning with the new Leiden community-detection algorithm.
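As a rough illustration of the pruning idea (not the PARC implementation itself): build a kNN graph, weight each edge by the Jaccard similarity of its endpoints' neighbor sets, and drop edges below a global threshold of mean minus a multiple of the standard deviation. A minimal NumPy sketch; `jaccard_prune`, `k`, and `num_std` are hypothetical names, and PARC replaces the brute-force kNN step with HNSW at scale:

```python
import numpy as np

def jaccard_prune(X, k=3, num_std=0.0):
    """Build a kNN graph, weight edges by Jaccard similarity of
    neighbor sets, and keep edges above mean - num_std * std
    (loosely analogous to PARC's jac_std_global pruning)."""
    n = X.shape[0]
    # brute-force pairwise squared distances (HNSW would replace this at scale)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)
    nbrs = [set(np.argsort(row)[:k]) for row in d]
    edges, weights = [], []
    for i in range(n):
        for j in nbrs[i]:
            # add each undirected edge exactly once
            if i < j or i not in nbrs[j]:
                jac = len(nbrs[i] & nbrs[j]) / len(nbrs[i] | nbrs[j])
                edges.append((i, int(j)))
                weights.append(jac)
    weights = np.array(weights)
    thresh = weights.mean() - num_std * weights.std()
    return [e for e, w in zip(edges, weights) if w >= thresh]

# two well-separated clusters of three points each
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
edges = jaccard_prune(X, k=2, num_std=0.0)
print(len(edges))  # 6: only within-cluster edges exist and all survive
```

The community-detection step (Leiden, via leidenalg) then runs on the pruned graph.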
Getting started
Install using pip:
conda create --name ParcEnv pip  # (optional)
pip install parc  # tested on linux
Install by cloning the repository and running setup.py
If needed, install the dependencies separately:
pip install leidenalg igraph hnswlib
Example Usage 1. (small test sets): sklearn's iris and digits datasets
import parc
import matplotlib.pyplot as plt
from sklearn import datasets

# load sample iris data (n_obs x k_dim, 150 x 4)
iris = datasets.load_iris()
X = iris.data
y = iris.target

plt.scatter(X[:, 0], X[:, 1], c=y)  # colored by 'ground truth'
plt.show()

Parc1 = parc.PARC(X, y)  # instantiate PARC
Parc1.run_PARC()  # run the clustering
parc_labels = Parc1.labels

# view scatterplot colored by PARC labels
plt.scatter(X[:, 0], X[:, 1], c=parc_labels)
plt.show()

# load sample digits data
digits = datasets.load_digits()
X = digits.data  # (n_obs x k_dim, 1797 x 64)
y = digits.target

Parc2 = parc.PARC(X, y, jac_std_global='median')  # 'median' is the default pruning level
Parc2.run_PARC()
parc_labels = Parc2.labels
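Since the examples pass the ground-truth labels `y`, one quick sanity check (not part of the original README) is the adjusted Rand index between `y` and the cluster labels. A sketch using a stand-in for the PARC output:

```python
from sklearn import datasets
from sklearn.metrics import adjusted_rand_score

iris = datasets.load_iris()
y = iris.target

# stand-in for Parc1.labels; substitute the actual PARC output here
parc_labels = list(y)

ari = adjusted_rand_score(y, parc_labels)
print(round(ari, 2))  # 1.0 only for a perfect match; a real run will differ
```

ARI is invariant to label permutation, so it does not matter that PARC's cluster IDs are arbitrary integers.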
Example Usage 2. (mid-scale scRNA-seq): 10X PBMC (Zheng et al., 2017)
import parc
import csv
import numpy as np
import pandas as pd

# load data (50 PCs of filtered gene matrix, pre-processed as per Zheng et al. 2017)
X = csv.reader(open("./pca50_pbmc68k.txt", 'rt'), delimiter=",")
X = np.array(list(X))  # (n_obs x k_dim, 68579 x 50)
X = X.astype("float")
# OR with pandas: X = pd.read_csv("./pca50_pbmc68k.txt").values.astype("float")

y = []  # annotations
with open('./annotations_zhang.txt', 'rt') as f:
    for line in f:
        y.append(line.strip().replace('"', ''))
# OR with pandas: y = list(pd.read_csv('./data/zheng17_annotations.txt', header=None)[0])

parc1 = parc.PARC(X, y)  # instantiate PARC
parc1.run_PARC()  # run the clustering
parc_labels = parc1.labels
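With annotations and cluster labels in hand, a cross-tabulation shows how the clusters map onto the known cell types. A minimal sketch with stand-in labels; replace them with `y` and `parc_labels` from above:

```python
import pandas as pd

y = ["B", "B", "T", "T", "NK", "NK"]  # stand-in annotations
parc_labels = [0, 0, 1, 1, 1, 2]      # stand-in cluster labels

# rows: annotated cell type, columns: PARC cluster, cells: counts
composition = pd.crosstab(pd.Series(y, name="annotation"),
                          pd.Series(parc_labels, name="cluster"))
print(composition)
```

Each row of the table shows how one annotated population is distributed across the clusters, which makes over- and under-merging easy to spot.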
t-SNE plots colored by annotations and by PARC clusters
Example Usage 3. 10X PBMC (Zheng et al., 2017), integrated with the Scanpy pipeline
pip install scanpy
import parc
import scanpy.api as sc  # in newer Scanpy versions: import scanpy as sc
import pandas as pd

# load data
path = './data/zheng17_filtered_matrices_mex/hg19/'
adata = sc.read(path + 'matrix.mtx', cache=True).T  # transpose the data
adata.var_names = pd.read_csv(path + 'genes.tsv', header=None, sep='\t')[1]
adata.obs_names = pd.read_csv(path + 'barcodes.tsv', header=None)[0]

# annotations as per correlation with pure samples
annotations = list(pd.read_csv('./data/zheng17_annotations.txt', header=None)[0])
adata.obs['annotations'] = pd.Categorical(annotations)

# pre-process as per Zheng et al. and take the first 50 PCs for analysis
sc.pp.recipe_zheng17(adata)
sc.tl.pca(adata, n_comps=50)

# run PARC on the PCA representation
parc1 = parc.PARC(adata.obsm['X_pca'], annotations)
parc1.run_PARC()
parc_labels = parc1.labels
adata.obs["PARC"] = pd.Categorical(parc_labels)

# visualize (the UMAP plots need a neighbors graph and embedding first)
sc.pp.neighbors(adata)
sc.tl.umap(adata)
sc.pl.umap(adata, color='annotations')
sc.pl.umap(adata, color='PARC')
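To keep the cluster assignments for downstream tools, the relevant `obs` columns can be written out as CSV. A sketch with a small stand-in DataFrame in place of `adata.obs`, round-tripped through an in-memory buffer instead of a file:

```python
import io
import pandas as pd

# stand-in for adata.obs[['annotations', 'PARC']]
obs = pd.DataFrame({"annotations": ["CD8 T", "B", "NK"],
                    "PARC": [0, 1, 1]},
                   index=["AAAC-1", "AAAG-1", "AACT-1"])

buf = io.StringIO()  # use a real path, e.g. 'parc_labels.csv', in practice
obs.to_csv(buf, index_label="barcode")
buf.seek(0)
restored = pd.read_csv(buf, index_col="barcode")
print(restored.shape)  # (3, 2)
```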
Example Usage 4. Large-scale (70K subset and 1.1M cells) lung cancer cells (multi-ATOM imaging cytometry based features)
- normalized image-based feature matrix (70K cells)
- lung cancer cell annotations (70K cells)
- 1.1M cell features and annotations
import parc
import pandas as pd

# load data: digital mix of 7 cell lines from 7 sets of pure samples (1.1M cells x 26 features)
X = pd.read_csv("./LungData.txt").values.astype("float")
y = list(pd.read_csv('./LungData_annotations.txt', header=None)[0])  # list of cell-type annotations

# run PARC
parc1 = parc.PARC(X, y)
parc1.run_PARC()
parc_labels = parc1.labels
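At 1.1M cells, parameter choices (e.g. the `jac_std_global` pruning level) are cheaper to explore on a random subsample before the full run. A sketch with synthetic stand-in data of the same width (26 features); the sizes and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1_100, 26))          # stand-in for the 1.1M x 26 matrix
y = list(rng.integers(0, 7, size=1_100))  # stand-in labels for the 7 cell lines

idx = rng.choice(len(X), size=200, replace=False)  # pilot subsample
X_sub = X[idx]
y_sub = [y[i] for i in idx]
# pilot run: parc.PARC(X_sub, y_sub, jac_std_global='median').run_PARC()
print(X_sub.shape)  # (200, 26)
```

Once the parameters look reasonable on the pilot, rerun on the full matrix.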
t-SNE plots colored by annotations and by PARC clusters, and a heatmap of features
References to dependencies
- Leiden (pip install leidenalg): V. A. Traag, 2019, doi.org/10.1038/s41598-019-41695-z
- hnswlib: Malkov, Yu A., and D. A. Yashunin. "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs." TPAMI, preprint: https://arxiv.org/abs/1603.09320
- igraph (igraph.org/python/)
Citing
If you find this code useful in your work, please consider citing the paper: PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells