I am using a package that, for each element of a list, prints the following lines to a file:
Entry Entry name Status Protein names Gene names Organism
A0A20CSC4 A0A20CSC4_1PHYC unreviewed Uncharacterized protein OlL7_200 Ostreococcus lucimarinus virus 7
Entry Entry name Status Protein names Gene names Organism
A0A0P0DZ8 A0A0PCDZ8_9PLYC unreviewed Uncharacterized protein OlL7_159 Ostreococcus lucimarinus virus 7
Entry Entry name Status Protein names Gene names Organism
A0A1P0BY71 A0A1P0BY71_9PHYC unreviewed Uncharacterized protein OlL7_111c Ostreococcus lucimarinus virus 7
... (x1000)
So when I open this file with pandas, I get a DataFrame like this:
>>> blast
Entry Entry name Status Protein names Gene names
0 A0A20CSC4 A0A20CSC4_1PHYC unreviewed Uncharacterized protein OlL7_200
1 NaN NaN NaN NaN NaN
2 A0A0P0DZ8 A0A0PCDZ8_9PLYC unreviewed Uncharacterized protein OlL7_159
3 NaN NaN NaN NaN NaN
4 Entry Entry name Status Protein names Gene names
5 A0A1P0BY71 A0A1P0BY71_9PHYC unreviewed Uncharacterized protein OlL7_111c
I would like a DataFrame in which the column names appear only once, as the header:
Entry Entry name Status Protein names Gene names Organism
A0A20CSC4 A0A20CSC4_1PHYC unreviewed Uncharacterized protein OlL7_200 Ostreococcus lucimarinus virus 7
A0A0P0DZ8 A0A0PCDZ8_9PLYC unreviewed Uncharacterized protein OlL7_159 Ostreococcus lucimarinus virus 7
A0A1P0BY71 A0A1P0BY71_9PHYC unreviewed Uncharacterized protein OlL7_111c Ostreococcus lucimarinus virus 7
Do you know a way to do this with pandas in Python 3?
Updated DataFrame:
Entry Entry name Status Protein names Gene names
0 A0A20CSC4 A0A20CSC4_1PHYC unreviewed Uncharacterized protein OlL7_200
2 A0A0P0DZ8 A0A0PCDZ8_9PLYC unreviewed Uncharacterized protein OlL7_159
4 Entry Entry name Status Protein names Gene names
5 A0A1P0BY71 A0A1P0BY71_9PHYC unreviewed Uncharacterized protein OlL7_111c
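Row 4 still contains the header names.
One way to get this kind of output is to drop the rows containing NaN values and then drop the rows that repeat the header.
So you can do: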
blast.dropna(inplace=True)
blast.drop(blast[blast['Entry'] == 'Entry'].index, inplace=True)
This should work.
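The two steps above can be tried on a small, self-contained frame. This is only a sketch: the three-column frame below is a hypothetical miniature of the one shown in the question (the real file has more columns), and it uses `dropna(how="all")` plus a boolean mask instead of `inplace=True`, which is a variation on the answer rather than the exact same calls. `reset_index(drop=True)` is added so the index runs 0, 1, 2 after the rows are removed.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question:
# all-NaN rows and a repeated header row are mixed in with the data.
blast = pd.DataFrame(
    [
        ["A0A20CSC4", "A0A20CSC4_1PHYC", "unreviewed"],
        [np.nan, np.nan, np.nan],
        ["A0A0P0DZ8", "A0A0PCDZ8_9PLYC", "unreviewed"],
        [np.nan, np.nan, np.nan],
        ["Entry", "Entry name", "Status"],
        ["A0A1P0BY71", "A0A1P0BY71_9PHYC", "unreviewed"],
    ],
    columns=["Entry", "Entry name", "Status"],
)

# Step 1: drop rows in which every value is NaN.
cleaned = blast.dropna(how="all")

# Step 2: keep only rows whose 'Entry' value is not the header text,
# then renumber the index from 0.
cleaned = cleaned[cleaned["Entry"] != "Entry"].reset_index(drop=True)

print(cleaned)
```

Working on a copy rather than in place leaves the original `blast` frame untouched, which can be convenient while checking that only the intended rows were removed.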