Merging multiple dataframes and keeping only one set of column names

Posted 2024-10-02 08:26:29


I'm using a package that, for each element of a list, writes the following lines to a file:

Entry   Entry name  Status  Protein names   Gene names  Organism
A0A20CSC4   A0A20CSC4_1PHYC unreviewed  Uncharacterized protein OlL7_200    Ostreococcus lucimarinus virus 7

Entry   Entry name  Status  Protein names   Gene names  Organism
A0A0P0DZ8   A0A0PCDZ8_9PLYC unreviewed  Uncharacterized protein OlL7_159    Ostreococcus lucimarinus virus 7

Entry   Entry name  Status  Protein names   Gene names  Organism
A0A1P0BY71  A0A1P0BY71_9PHYC    unreviewed  Uncharacterized protein OlL7_111c   Ostreococcus lucimarinus virus 7

... (repeated about 1000 times)

So if I open this file with pandas, I get a dataframe like this:

>>> blast
        Entry        Entry name      Status            Protein names  Gene names
0   A0A20CSC4   A0A20CSC4_1PHYC  unreviewed  Uncharacterized protein    OlL7_200
1         NaN               NaN         NaN                      NaN         NaN
2   A0A0P0DZ8   A0A0PCDZ8_9PLYC  unreviewed  Uncharacterized protein    OlL7_159
3         NaN               NaN         NaN                      NaN         NaN
4       Entry        Entry name      Status            Protein names  Gene names
5  A0A1P0BY71  A0A1P0BY71_9PHYC  unreviewed  Uncharacterized protein   OlL7_111c

I want a dataframe with only one set of column names:

Entry   Entry name  Status  Protein names   Gene names  Organism
A0A20CSC4   A0A20CSC4_1PHYC unreviewed  Uncharacterized protein OlL7_200    Ostreococcus lucimarinus virus 7
A0A0P0DZ8   A0A0PCDZ8_9PLYC unreviewed  Uncharacterized protein OlL7_159    Ostreococcus lucimarinus virus 7
A0A1P0BY71  A0A1P0BY71_9PHYC    unreviewed  Uncharacterized protein OlL7_111c   Ostreococcus lucimarinus virus 7

Do you know a way to do this with pandas in Python 3?

Update: the dataframe after dropping the NaN rows:

        Entry        Entry name      Status            Protein names  Gene names
0   A0A20CSC4   A0A20CSC4_1PHYC  unreviewed  Uncharacterized protein    OlL7_200
2   A0A0P0DZ8   A0A0PCDZ8_9PLYC  unreviewed  Uncharacterized protein    OlL7_159
4       Entry        Entry name      Status            Protein names  Gene names
5  A0A1P0BY71  A0A1P0BY71_9PHYC  unreviewed  Uncharacterized protein   OlL7_111c

Row 4 still contains the column names.


1 answer

User

#1 · Posted 2024-10-02 08:26:29

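One way to get this output is to drop the NaN rows first:

blast.dropna(inplace=True)

Then drop the rows whose Entry value is the literal header text 'Entry':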

blast.drop(blast[blast['Entry'] == 'Entry'].index, inplace=True)

That should work.
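Here is a minimal, self-contained sketch of the two steps above, in case it helps to see them run end to end. The sample frame is rebuilt by hand from the rows shown in the question, and the helper name drop_blank_and_header_rows is made up for illustration. One deliberate difference, which is an assumption on my part: it uses dropna(how="all") so that a real record with a single missing field is not thrown away, whereas plain dropna() drops any row containing at least one NaN.

```python
import pandas as pd

columns = ["Entry", "Entry name", "Status", "Protein names", "Gene names"]

# Hand-built sample mirroring the `blast` frame in the question:
# real records, all-NaN rows from blank lines, and one repeated header row.
blast = pd.DataFrame(
    [
        ["A0A20CSC4", "A0A20CSC4_1PHYC", "unreviewed", "Uncharacterized protein", "OlL7_200"],
        [None] * len(columns),
        ["A0A0P0DZ8", "A0A0PCDZ8_9PLYC", "unreviewed", "Uncharacterized protein", "OlL7_159"],
        [None] * len(columns),
        columns,  # the header text repeated as a data row
        ["A0A1P0BY71", "A0A1P0BY71_9PHYC", "unreviewed", "Uncharacterized protein", "OlL7_111c"],
    ],
    columns=columns,
)


def drop_blank_and_header_rows(df, key="Entry"):
    """Return a copy without all-NaN rows and without rows repeating the header.

    `key` is the column whose value equals its own name on repeated header rows.
    (Hypothetical helper, not part of pandas.)
    """
    cleaned = df.dropna(how="all")          # remove rows where every cell is NaN
    cleaned = cleaned[cleaned[key] != key]  # remove rows that repeat the header
    return cleaned.reset_index(drop=True)   # renumber rows 0..n-1


cleaned = drop_blank_and_header_rows(blast)
print(cleaned)  # three data rows remain
```

Returning a new frame instead of using inplace=True leaves the original blast untouched, which makes it easier to compare before and after.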
