biovida Python package - PyPI

Automated biomedical information management for machine learning applications.

Detailed description of the biovida Python project

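BioVida is a library designed to make it easy to gain access to existing data sets of biomedical images, as well as to build brand new, custom-made ones.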

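It is hoped that by automating the tedious data munging this process typically involves, more people will become interested in applying machine learning to biomedical images and, in turn, advance insights into human disease.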

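In a nod to recursion, BioVida tries to accomplish some of this automation with machine learning itself, using tools such as convolutional neural networks.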

Installation

Python Package Index:

$ pip install biovida

Images: Stable

In just a few lines of code, you can gain access to biomedical databases that store tens of millions of images.

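Please note that you are bound to adhere to the copyright and other usage restrictions under which this data is provided to you by its creators.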

Open-i Biomedical Image Search Engine

# 1. Import the Interface for the NIH's Open-i API.
from biovida.images import OpeniInterface
# 2. Create an Instance of the Tool
opi = OpeniInterface()
# 3. Perform a search for x-rays and cts of lung cancer
opi.search(query='lung cancer', image_type=['x_ray', 'ct'])  # Results Found: 9,220.
# 4. Pull the data
search_df = opi.pull()

Cancer Imaging Archive

# 1. Import the interface for the Cancer Imaging Archive
from biovida.images import CancerImageInterface
# 2. Create an Instance of the Tool
cii = CancerImageInterface(YOUR_API_KEY_HERE)
# 3. Perform a search
cii.search(cancer_type='esophageal')
# 4. Pull the data
cdf = cii.pull()

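Both CancerImageInterface and OpeniInterface cache images for later use. When data is 'pulled', a records_db is generated, which is a dataframe of all text data associated with the images. These are provided as class attributes, e.g., cii.records_db. While records_db only stores data from the most recent data pull, cache_records_db dataframes provide an account of all image data currently cached.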
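The difference between the two attributes can be pictured with a toy model. The class below is a hypothetical stand-in written for illustration only; it is not BioVida's implementation, and plain lists take the place of the dataframes described above.

```python
class ToyInterface:
    """Hypothetical stand-in for an interface that caches pulled records."""

    def __init__(self):
        self.records_db = []        # records from the most recent pull only
        self.cache_records_db = []  # records from every pull so far

    def pull(self, new_records):
        self.records_db = list(new_records)        # replaced on each pull
        self.cache_records_db.extend(new_records)  # grows across pulls
        return self.records_db


toy = ToyInterface()
toy.pull([{'image_id': 1}, {'image_id': 2}])
toy.pull([{'image_id': 3}])
print(len(toy.records_db), len(toy.cache_records_db))
```

After the second pull, the toy records_db holds only the latest record, while the toy cache_records_db still accounts for all three.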

Splitting Images

BioVida can divide cached images into train/validation/test sets.

from biovida.images import image_divvy
# 1. Define a rule to 'divvy' up images in the cache.
def my_divvy_rule(row):
    if row['image_modality_major'] == 'x_ray':
        return 'x_ray'
    elif row['image_modality_major'] == 'ct':
        return 'ct'
# 2. Define Proportions and Divide Data
tt = image_divvy(opi, my_divvy_rule, action='ndarray', train_val_test_dict={'train': 0.8, 'test': 0.2})
# 3. The resultant ndarrays can be unpacked as follows:
train_ct, train_xray = tt['train']['ct'], tt['train']['x_ray']
test_ct, test_xray = tt['test']['ct'], tt['test']['x_ray']
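To make the idea of a divvy rule combined with split proportions concrete, here is a minimal standard-library sketch. It is not how image_divvy is implemented: the divvy helper, its seed parameter, and the choice to skip rows for which the rule returns None are all assumptions made for illustration, and plain dictionaries stand in for rows of records.

```python
import random


def divvy(rows, rule, proportions, seed=0):
    """Group rows by rule(row), then split each group by the given proportions."""
    groups = {}
    for row in rows:
        label = rule(row)
        if label is not None:  # assumption: rows the rule does not label are skipped
            groups.setdefault(label, []).append(row)

    rng = random.Random(seed)
    result = {split: {} for split in proportions}
    splits = list(proportions.items())
    for label, members in groups.items():
        rng.shuffle(members)
        start = 0
        for i, (split, fraction) in enumerate(splits):
            # The last split takes whatever remains, so no row is lost to rounding.
            if i == len(splits) - 1:
                end = len(members)
            else:
                end = start + round(fraction * len(members))
            result[split][label] = members[start:end]
            start = end
    return result


def my_divvy_rule(row):
    if row['image_modality_major'] == 'x_ray':
        return 'x_ray'
    elif row['image_modality_major'] == 'ct':
        return 'ct'


# Toy records: 10 x-rays, 5 CTs and 3 MRIs (the MRIs match no rule).
rows = [{'image_id': i, 'image_modality_major': m}
        for i, m in enumerate(['x_ray'] * 10 + ['ct'] * 5 + ['mri'] * 3)]
tt = divvy(rows, my_divvy_rule, {'train': 0.8, 'test': 0.2})
train_ct, train_xray = tt['train']['ct'], tt['train']['x_ray']
test_ct, test_xray = tt['test']['ct'], tt['test']['x_ray']
```

The result has the same nested shape as the unpacking shown above: first the split name, then the label returned by the rule.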

Images: Experimental

Automated Image Data Cleaning

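Unfortunately, the data pulled from Open-i above is likely to contain a large number of images that are unrelated to the search query and/or unsuitable for machine learning.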

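The experimental OpeniImageProcessing class can be used to completely automate this data cleaning process, which is partly powered by a convolutional neural network.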

# 1. Import Image Processing Tools
from biovida.images import OpeniImageProcessing
# 2. Instantiate the Tool using the OpeniInterface Instance
ip = OpeniImageProcessing(opi)
# 3. Analyze the Images
idf = ip.auto()
# 4. Use the Analysis to Clean Images
ip.clean_image_dataframe()

It is easy to split these images into training and test sets.

from biovida.images import image_divvy
def my_divvy_rule(row):
    if row['image_modality_major'] == 'x_ray':
        return 'x_ray'
    elif row['image_modality_major'] == 'ct':
        return 'ct'
tt = image_divvy(ip, my_divvy_rule, action='ndarray', train_val_test_dict={'train': 0.8, 'test': 0.2})
# These ndarrays can be unpacked as shown above.

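Genomic Data

While BioVida is primarily focused on images, it also provides a simple interface for obtaining related information, such as genomic data.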

# 1. Import the Interface for DisGeNET.org
from biovida.genomics import DisgenetInterface
# 2. Create an Instance of the Tool
dna = DisgenetInterface()
# 3. Pull a Database
gdf = dna.pull('curated')

Diagnostic Data

BioVida also makes it easy to gain access to diagnostic data.

Information on disease definitions, families and synonyms:

# 1. Import the Interface for DiseaseOntology.org
from biovida.diagnostics import DiseaseOntInterface
# 2. Create an Instance of the Tool
doi = DiseaseOntInterface()
# 3. Pull the Database
ddf = doi.pull()

Information on symptoms associated with diseases:

# 1. Import the Interface for Disease-Symptoms Information
from biovida.diagnostics import DiseaseSymptomsInterface
# 2. Create an Instance of the Tool
dsi = DiseaseSymptomsInterface()
# 3. Pull the Database
dsdf = dsi.pull()

Unifying Information

The unify_against_images function integrates image data information against DisgenetInterface, DiseaseOntInterface and DiseaseSymptomsInterface.

from biovida.unification import unify_against_images
unify_against_images(interfaces=[cii, opi], db_to_extract='cache_records_db')

Left side of the DataFrame: image data only

	article_type	image_id	image_caption	modality_best_guess	age	sex	disease	…
0	case_report	1	…	Magnetic Resonance Imaging (MRI)	73	male	fibroma	…
1	case_report	2	…	Magnetic Resonance Imaging (MRI)	73	male	fibroma	…
2	case_report	1	…	Computed Tomography (CT): angiography	45	female	bile duct cancer	…

Right side of the DataFrame: added information

disease_family	disease_synonym	disease_definition	known_associated_symptoms	mentioned_symptoms	known_associated_genes
(cell type benign neoplasm,)	nan	nan	(abdominal pain,…)	(pain,)	((ANTXR2, 0.12), …)
(cell type benign neoplasm,)	nan	nan	(abdominal pain,…)	(pain,)	((ANTXR2, 0.12), …)
(biliary tract cancer,)	(bile duct tumor,…)	A biliary tract…	(abdominal obesity,..)	(colic,)	nan
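Conceptually, the unified result resembles a lookup of each image record's disease against the other data sources. The sketch below illustrates that idea with plain dictionaries. It is not the implementation of unify_against_images: the unify helper and the sample data are hypothetical, and None stands in for the nan values shown in the table.

```python
# Hypothetical image records, loosely modelled on the 'disease' column above.
image_records = [
    {'image_id': 1, 'disease': 'fibroma'},
    {'image_id': 2, 'disease': 'bile duct cancer'},
    {'image_id': 3, 'disease': 'unlisted disease'},
]

# Hypothetical disease information keyed by disease name.
disease_info = {
    'fibroma': {'disease_family': ('cell type benign neoplasm',)},
    'bile duct cancer': {'disease_family': ('biliary tract cancer',)},
}


def unify(records, lookup, columns):
    """Return copies of the records with the extra columns attached by disease name."""
    unified = []
    for record in records:
        extra = lookup.get(record['disease'], {})
        row = dict(record)  # copy, so the input records are left unchanged
        for column in columns:
            row[column] = extra.get(column)  # None when nothing matches
        unified.append(row)
    return unified


unified = unify(image_records, disease_info, columns=['disease_family'])
```

Every image record is kept, and records whose disease has no match simply carry an empty value in the added columns.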

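Documentation

Contributing

For details on how to contribute, see the contributing file.

Bug reports and feature requests are always welcome and can be submitted through the Issues page.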

Resources

The resources document provides an account of all data sources and scholarly work used by BioVida.

biovida 0.1.1
