Biopython motifs raises KeyError: 'd'

Posted 2024-09-21 05:49:22


I am using Biopython to process some data, and I ran into a strange problem with the motifs module. Here is the code:

frame = pd.DataFrame({'Spacer': seqs1.values()}, index=seqs.keys())
Motif = motifs.create(frame.Spacer.values, alphabet=IUPAC.IUPACAmbiguousDNA())

I then get a KeyError:

Traceback (most recent call last):
File "<input>", line 2, in <module>
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\Bio\motifs\__init__.py", line 23, in create
    return Motif(instances=instances, alphabet=alphabet)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\Bio\motifs\__init__.py", line 244, in __init__
    counts = self.instances.count()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python36\lib\site-packages\Bio\motifs\__init__.py", line 199, in count
    counts[letter][position] += 1
KeyError: 'd'

seqs1 contains the following elements:

seqs1 ={'E00491:315:HVLGTCCXY:1:1101:18193:49320': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:26250:49320': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:26534:49320': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:27651:49320': 'GGCACNGCGGCTGGAGGNGG', 'E00491:315:HVLGTCCXY:1:1101:28625:49320': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:4503:49338': 'GGCACTGCGGCTGGAGGNGG', 'E00491:315:HVLGTCCXY:1:1101:5781:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:6005:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:8176:49338': 'GGCGCTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:11099:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:15564:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:17553:49338': 'GGCGCTTCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:22059:49338': 'GGCGCTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:24129:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:24535:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:30117:49338': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:22191:49355': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:25134:49355': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:7243:49373': 'GGCACTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:10064:49373': 'GGCGCTGCGGCTGGAGGTGG', 'E00491:315:HVLGTCCXY:1:1101:14752:49373': 'GGCACTGCGGCTGGAGGTGG'}

There is no 'd' in any of my sequences.


1 answer
User
#1 · Posted 2024-09-21 05:49:22

This question was also raised on Biopython's GitHub page and was resolved there (https://github.com/biopython/biopython/issues/1978).

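In short: Bio.motifs.create() expects a list of sequences as input (e.g. ['ATTG', 'CTTG', ...]). The pandas DataFrame operation shown above does not do what the asker intended, so what reaches create() is not the list of sequence strings.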
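To see why a letter that never appears in the sequences can still show up in the error, here is a minimal sketch of a per-position counting loop like the one in the traceback (counts[letter][position] += 1). It is a simplified imitation written for illustration, not Biopython's actual code; the function name count_letters and the stray string "dummy" are hypothetical.

```python
# Simplified imitation of a per-position letter count; NOT Biopython's real implementation.
ALPHABET = "GATCRYWSMKHBVDN"  # IUPAC ambiguous DNA letters, uppercase only


def count_letters(instances):
    """Count how often each alphabet letter occurs at each position."""
    length = len(instances[0])
    counts = {letter: [0] * length for letter in ALPHABET}
    for instance in instances:
        for position, letter in enumerate(instance):
            # Raises KeyError if `letter` is not one of the alphabet letters.
            counts[letter][position] += 1
    return counts


# A proper list of sequence strings works, including the ambiguous base N.
counts = count_letters(["GGCACTGCGG", "GGCACNGCGG"])
print(counts["G"][0], counts["N"][5])

# Any input element that is not a sequence over the alphabet fails on its
# first unknown character ("dummy" is a hypothetical stand-in for such input).
try:
    count_letters(["dummy"])
except KeyError as err:
    caught = err.args[0]
    print("KeyError:", repr(caught))
```

The point of the sketch is only that the letter named in the KeyError comes from whatever was actually passed in, which is a strong hint that the input was not the list of sequences.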

Instead, the asker could skip the DataFrame and pass the dictionary's values as a list:

Motif = motifs.create(list(seqs1.values()), alphabet=IUPAC.IUPACAmbiguousDNA())
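If it helps, the input can be checked before it is handed to motifs.create(). The following is a rough sketch in plain Python, assuming the sequences live in a dict of read ID to string as in the question. The helper name to_instances and the shortened sample dict are made up for illustration and are not part of Biopython.

```python
# Hypothetical helper: turn a {read_id: sequence} dict into a validated list of strings.
IUPAC_AMBIGUOUS_DNA = set("GATCRYWSMKHBVDN")


def to_instances(seqs):
    """Return the dict's values as a list, checking type, length and letters."""
    instances = list(seqs.values())
    if not instances:
        raise ValueError("no sequences given")
    length = len(instances[0])
    for seq in instances:
        if not isinstance(seq, str):
            raise TypeError(f"expected str, got {type(seq).__name__}")
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        unknown = set(seq) - IUPAC_AMBIGUOUS_DNA
        if unknown:
            raise ValueError(f"letters outside the alphabet: {sorted(unknown)}")
    return instances


# Shortened sample with made-up read IDs; the real seqs1 from the question works the same way.
sample = {
    "read1": "GGCACTGCGGCTGGAGGTGG",
    "read2": "GGCACNGCGGCTGGAGGNGG",
}
instances = to_instances(sample)
print(instances)
```

The resulting list is what would be passed as the first argument to motifs.create(). Note that the Bio.Alphabet module used in the question was removed in Biopython 1.78; as far as I know, newer versions take the alphabet as a plain string of letters instead.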
