How do I create a volcano plot with bioinfokit?

Posted 2024-10-05 15:28:52


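I have been trying to use Python and bioinfokit to create a volcano plot of gene expression data from an Excel file. I loaded the data into a pandas DataFrame and then filtered out some negative values. The last line of the code below is where I try to create the volcano plot: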

import pandas as pd
import numpy as np
import bioinfokit
from bioinfokit import analys, visuz


panda_brie = pd.read_csv("C:\\Users\\amorgan\\Documents\\brie_gRNA_stats.csv", encoding='ISO-8859-1', low_memory=False)
shape = panda_brie.shape
print(shape)
panda_brie = panda_brie.loc[(panda_brie[("fold_change")] > 0)]
shape = panda_brie.shape
print(shape)


bioinfokit.visuz.gene_exp.volcano(df=panda_brie, lfc="log_fold_change", pv="log_p_value")

I get the following error and am not sure what to do about it:

Traceback (most recent call last):
  File "C:/Users/amorgan/AppData/Local/Programs/Python/Python39/graphing brie data.py", line 19, in <module>
    bioinfokit.visuz.gene_exp.volcano(df=panda_brie, lfc="log_fold_change", pv="log_p_value")
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\bioinfokit\visuz.py", line 397, in volcano
    df.loc[(df[lfc] >= lfc_thr[0]) & (df[pv] < pv_thr[0]), 'color_add_axy'] = color[0]  # upregulated
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\common.py", line 69, in new_method
    return method(self, other)
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\arraylike.py", line 52, in __ge__
    return self._cmp_method(other, operator.ge)
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\series.py", line 5501, in _cmp_method
    res_values = ops.comparison_op(lvalues, rvalues, op)
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\array_ops.py", line 284, in comparison_op
    res_values = comp_method_OBJECT_ARRAY(op, lvalues, rvalues)
  File "C:\Users\amorgan\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\array_ops.py", line 73, in comp_method_OBJECT_ARRAY
    result = libops.scalar_compare(x.ravel(), y, op)
  File "pandas\_libs\ops.pyx", line 107, in pandas._libs.ops.scalar_compare
TypeError: '>=' not supported between instances of 'str' and 'int'

Here is the head of my pandas DataFrame, in case it helps:

                    Unnamed: 0  control.avg  ...  log_fold_change  log_p_value
0   Syt15_GGTACCACAAATGGTACACT         7.80  ...      0.421772618     9.665546
1  Fbxo21_CTTGTGTGCAAAACCCTCCG         3.67  ...      0.678371984     8.397940
2   Irgc1_GAGGCCCTCGGGTTTCAGCG         3.10  ...      0.736525011     8.151195
3  Ttll12_CCTGTGTCTAGGTCCCTTAG         3.98  ...      0.622833399     9.659556
4   Kdm4b_ATGTCATCATACGTCTGCCG         4.41  ...      0.545893109     9.899629

[5 rows x 24 columns]

Output of panda_brie.info():

<class 'pandas.core.frame.DataFrame'>
Int64Index: 50629 entries, 0 to 53135
Data columns (total 24 columns):
 #   Column                       Non-Null Count  Dtype  
---  ------                       --------------  -----  
 0   Unnamed: 0                   50629 non-null  object 
 1   control.avg                  50629 non-null  float64
 2   Tg50.avg                     50629 non-null  float64
 3   Tg100.avg                    50629 non-null  float64
 4   Tg150.avg                    50629 non-null  float64
 5   Tg250.avg                    50629 non-null  float64
 6   Treated.vs.Nontreated.p      50629 non-null  float64
 7   Treated.vs.Nontreated.FDR    50629 non-null  float64
 8   Treated.vs.Nontreated.logFC  50629 non-null  float64
 9   Treated.vs.Nontreated.FC     50629 non-null  float64
 10  Dose.Regression.p            50629 non-null  float64
 11  Dose.Regression.FDR          50629 non-null  float64
 12  Dose.Regression.Slope        50629 non-null  float64
 13  gene                         50629 non-null  object 
 14  gRNASeq                      50629 non-null  object 
 15  Unnamed: 15                  0 non-null      float64
 16  Unnamed: 16                  0 non-null      float64
 17  Unnamed: 17                  13 non-null     object 
 18  Unnamed: 18                  3 non-null      object 
 19  Unnamed: 19                  3 non-null      object 
 20  Unnamed: 20                  1 non-null      object 
 21  fold_change                  50629 non-null  float64
 22  log_fold_change              50629 non-null  object 
 23  log_p_value                  50629 non-null  float64
dtypes: float64(16), object(8)
memory usage: 9.7+ MB
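
One thing stands out in the info() output above: log_fold_change has dtype object, while log_p_value is float64. That matches the TypeError, which complains about comparing a 'str' with an 'int' on the df[lfc] >= lfc_thr[0] line. A likely cause is that at least one cell in that column holds text that pandas could not parse as a number, so the whole column was read as strings. The snippet below is only a minimal sketch of one way to check and repair this. The toy DataFrame and the "#NUM!" entry are hypothetical stand-ins, not values taken from the real file.

```python
import pandas as pd

# Hypothetical stand-in for panda_brie: the real file is not available here,
# and the "#NUM!" cell is an assumed example of a non-numeric entry.
df = pd.DataFrame({
    "log_fold_change": ["0.421772618", "0.678371984", "#NUM!", "-1.25"],
    "log_p_value": [9.665546, 8.397940, 8.151195, 9.659556],
})

# Try to convert every entry to a number; entries that cannot be parsed become NaN.
numeric = pd.to_numeric(df["log_fold_change"], errors="coerce")

# Inspect the raw entries that failed to convert, to see what is actually in the file.
bad_values = df.loc[numeric.isna(), "log_fold_change"]
print(bad_values.tolist())

# Replace the column with its numeric version and drop the rows that failed.
df["log_fold_change"] = numeric
clean = df.dropna(subset=["log_fold_change"]).copy()
print(clean["log_fold_change"].dtype)

# Numeric comparisons like the one in the traceback now work.
print((clean["log_fold_change"] >= 0.5).sum())
```

Applied to the real data, the same three steps would be run on panda_brie, and the cleaned frame passed as df to the volcano call. Looking at bad_values first is worthwhile, because it shows whether the unparseable cells are spreadsheet error markers, stray text, or something else that should be fixed in the source file rather than dropped.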
