pandas ValueError: cannot reindex from a duplicate axis

Published 2024-04-26 02:21:26

I have 20 DataFrames (named sample1 … sample20), each loaded with

sample1 = pd.read_table('pathtosample1.csv', sep='\t', index_col=0)["score"]

I then load each file again into a separate variable for the meta step below

(code block not preserved in the source)

sample1 df

Unique                    junction_id              score  splice_site  anchor  intron_size  exons_skipped  genes  transcripts
3:107915006-107915391(-)  ENSMUSG00000000001:E001  1017   GT-AG        DA      386          0              Gnai3  ENSMUST00000000001
3:107912225-107912321(-)  ENSMUSG00000000001:E002  10     GT-AG        D       97           0              Gnai3  ENSMUST00000000001
3:107912234-107912321(-)  ENSMUSG00000000001:E003  979    GT-AG        DA      88           0              Gnai3  ENSMUST00000000001
3:107912530-107914853(-)  ENSMUSG00000000001:E004  996    GT-AG        DA      2324         0              Gnai3  ENSMUST00000000001
3:107912530-107915391(-)  ENSMUSG00000000001:E005  3      GT-AG        NDA     2862         1              Gnai3  ENSMUST00000000001
3:107915520-107918681(-)  ENSMUSG00000000001:E006  1113   GT-AG        DA      3162         0              Gnai3  ENSMUST00000000001
3:107915520-107921219(-)  ENSMUSG00000000001:E007  1      GT-AG        NDA     5700         1              Gnai3  ENSMUST00000000001
3:107915520-107915944(-)  ENSMUSG00000000001:E008  1      GT-AG        A       425          0              Gnai3  ENSMUST00000000001
3:107918809-107921219(-)  ENSMUSG00000000001:E009  1141   GT-AG        DA      2411         0              Gnai3  ENSMUST00000000001

For illustration, the commands below use only 6 of the samples

concat =  pd.concat([sample1,sample2,sample3,sample4,sample5,sample6], axis=1).fillna(0)
concat.columns = ["score_1", "score_2", "score_3","score_4", "score_5", "score_6"]
meta = pd.concat([meta1,meta2,meta3,meta4,meta5,meta6], ignore_index=True)

meta = meta[~meta.index.duplicated(keep='first')]
concat = pd.concat([concat, meta], axis=1)
concat.to_csv('data.csv')

The error I get is

ValueError: cannot reindex from a duplicate axis
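One common trigger for this message is aligning objects whose index contains repeated labels. Whether that is what happens with these files is an assumption; the sketch below uses made-up toy Series (`s1`, `s2`) and a hypothetical helper `duplicated_labels` to show how each input could be checked before calling `pd.concat(..., axis=1)`.

```python
import pandas as pd

# Toy stand-ins for two of the score Series; the labels are invented.
s1 = pd.Series([1017, 10, 979], index=["E001", "E002", "E002"], name="score")
s2 = pd.Series([1, 7], index=["E001", "E003"], name="score")


def duplicated_labels(obj):
    """Return the index labels that occur more than once (hypothetical helper)."""
    idx = obj.index
    return idx[idx.duplicated(keep=False)].unique().tolist()


# Report, per input, which labels are repeated.
report = {name: duplicated_labels(s) for name, s in {"s1": s1, "s2": s2}.items()}
print(report)  # {'s1': ['E002'], 's2': []}

# Keeping only the first row per label makes the index unique,
# so the column-wise concat can align the two Series.
s1_unique = s1[~s1.index.duplicated(keep="first")]
aligned = pd.concat([s1_unique, s2], axis=1, keys=["score_1", "score_2"]).fillna(0)
```

Running the same check on each of sample1 … sample6 (and on the frame produced by the first concat) would show which object, if any, carries repeated labels.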

My expected output should first collect all the elements of the junction_id column from all the files, add one score column per sample, and then append the remaining meta columns that correspond to each row, i.e. the expected output is

Junction_id              score1  score2  score3  score4  score5  score6  Unique                    splice_site  intron_size  anchor  genes  transcripts         exons_skipped
ENSMUSG00000000001:E001  1017    1       1651    6       3       1       3:107915006-107915391(-)  GT-AG        386          DA      Gnai3  ENSMUST00000000001  0
ENSMUSG00000000001:E002  10      7       3       1144    1193    895     3:107912225-107912321(-)  GT-AG        97           D       Gnai3  ENSMUST00000000001  0
ENSMUSG00000000001:E003  979     1075    1588    923     1223    1017    3:107912234-107912321(-)  GT-AG        88           DA      Gnai3  ENSMUST00000000001  0
ENSMUSG00000000001:E004  996     3       1522    1       1       2       3:107912530-107914853(-)  GT-AG        2324         DA      Gnai3  ENSMUST00000000001  0
ENSMUSG00000000001:E005  3       1759    14      1127    4       1112    3:107912530-107915391(-)  GT-AG        2862         NDA     Gnai3  ENSMUST00000000001  1
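A minimal sketch of one way such a table could be assembled, under stated assumptions: the two small frames `df1` and `df2` are invented stand-ins for the real files, only a subset of the columns is shown, and each junction_id is assumed to be unique within a single file. Note that it stacks the meta frames without `ignore_index=True`, so the junction_id labels survive and `index.duplicated` has something meaningful to filter on; with `ignore_index=True` the index becomes 0..N-1 and that filter removes nothing.

```python
import pandas as pd

# Two tiny made-up frames standing in for the per-sample files.
cols = ["Unique", "junction_id", "score", "genes"]
df1 = pd.DataFrame([["3:1-2(-)", "G1:E001", 1017, "Gnai3"],
                    ["3:3-4(-)", "G1:E002", 10, "Gnai3"]], columns=cols)
df2 = pd.DataFrame([["3:1-2(-)", "G1:E001", 1, "Gnai3"],
                    ["3:5-6(-)", "G1:E003", 7, "Gnai3"]], columns=cols)

# Index every frame by junction_id so rows align on that key.
frames = [df.set_index("junction_id") for df in (df1, df2)]

# One score column per sample; junctions missing from a sample get 0.
scores = pd.concat([f["score"] for f in frames], axis=1,
                   keys=[f"score{i}" for i in range(1, len(frames) + 1)]).fillna(0)

# Stack the annotation columns, keeping the junction_id labels,
# then keep the first row seen for each junction.
meta = pd.concat([f.drop(columns="score") for f in frames])
meta = meta[~meta.index.duplicated(keep="first")]

# Left-join the annotations onto the score table by junction_id.
result = scores.join(meta)
print(result)
```

With the real data, the same pattern would apply to all 20 frames; reading the files with junction_id as the index column instead of the first column is an assumption about the file layout shown above.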

I'm not sure which step causes this error.

