What is the appropriate delimiter?

Posted 2024-06-30 15:21:27

I have a text file with the following structure:

>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled
MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL
KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY
>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled
MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL
KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY

I need to load this file and convert it into the following table structure, with one row per record:

--------------------------------------------------------------
|>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled |
|MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL|
|KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY|
--------------------------------------------------------------
|>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled |
|MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL|
|KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY|
--------------------------------------------------------------

I tried the following code:

dataset = pd.read_csv(path, sep = ">")

But it does not work the way I expected.

How can I get exactly this format?
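
One possible direction, offered as a sketch rather than a definitive answer: `pd.read_csv` treats every line as a row and uses `sep` only to split within a line, so no choice of delimiter will group several lines into one record. The file looks like FASTA, where a line starting with `>` opens a new record, so the lines can be grouped by hand instead. The function name `read_fasta_records` and the keys `header` and `sequence` are illustrative assumptions, not part of the original question.

```python
import io


def read_fasta_records(handle):
    """Group lines into records; a line starting with '>' opens a new record.

    Returns a list of dicts with the (assumed) keys 'header' and 'sequence'.
    """
    records = []
    header, seq_lines = None, []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if header is not None:
                records.append({"header": header, "sequence": "\n".join(seq_lines)})
            header, seq_lines = line, []
        elif header is not None:
            seq_lines.append(line)
    # Close the last record, which has no following '>' line.
    if header is not None:
        records.append({"header": header, "sequence": "\n".join(seq_lines)})
    return records


# Sample data copied from the question; a real file would be opened with open(path).
sample = """\
>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled
MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL
KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY
>hsa:9934 K04299 purinergic receptor P2Y, G protein-coupled
MINSTSTQPPDESCSQNLLITQQIIPVLYCMVFIAGILLNGVSGWIFFYVPSSKSFIIYL
KNIVIADFVMSLTFPFKILGDSGLGPWQLNVFVCRVSAVLFYVNMYVSIVFFGLISFDRY
"""

records = read_fasta_records(io.StringIO(sample))
```

Passing the result to `pd.DataFrame(records)` should then give one row per record, with the header line and the sequence lines in separate columns. If a single cell per record is preferred, as in the table above, the two values can be joined with a newline.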

