Rename all lines after a pattern match using the matched header

Posted 2024-10-03 19:27:29


My file looks like this:

BLOCK: offset: 59051 len: 1615 phased: 37 SPAN: 1614 MECscore 65.96 fragments 266
59294   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    513 C   A   0/1:23,12:35:99:262,0,691   19,10:-40.6,-28.8,-78.7:-11.9:6.0
59876   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1095    G   A   0/1:35,12:47:99:328,0,1157  30,11:-61.1,-63.4,-134.7:2.2:12.0
59998   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1217    G   A   0/1:22,12:34:99:314,0,730   20,10:-68.4,-54.2,-109.0:-14.2:6.0
60000   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1219    A   C   0/1:22,12:34:99:308,0,715   20,10:-69.9,-54.2,-107.7:-15.7:6.0
60502   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1721    G   C   0/1:15,6:21:99:141,0,464    7,5:-21.8,-18.5,-30.1:-3.3:4.0
BLOCK: offset: 60874 len: 79 phased: 3 SPAN: 78 MECscore 11.99 fragments 21
60952   0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    2171    G   C   0/1:14,13:27:99:388,0,369   9,5:-35.3,-26.5,-46.7:-8.7:3.0
BLOCK: offset: 62339 len: 3617 phased: 123 SPAN: 3616 MECscore 1516.57 fragments 4565
62442   1   0   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    3661    G   A   0/1:148,55:203:99:1070,0,4008   107,39:-163.0,-160.9,-438.4:-2.1:33.0
62481   1   0   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    3700    C   T

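I want to read through the file and rename the first field of every line so that it is grouped with the preceding "BLOCK" line. I also want to rename the "BLOCK" lines so that the first one is called "BLOCK1", the second "BLOCK2", and so on. My desired output looks like this: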

BLOCK1: offset: 59051 len: 1615 phased: 37 SPAN: 1614 MECscore 65.96 fragments 266
BLOCK1  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    513 C   A   0/1:23,12:35:99:262,0,691   19,10:-40.6,-28.8,-78.7:-11.9:6.0
BLOCK1  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1095    G   A   0/1:35,12:47:99:328,0,1157  30,11:-61.1,-63.4,-134.7:2.2:12.0
BLOCK1  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1217    G   A   0/1:22,12:34:99:314,0,730   20,10:-68.4,-54.2,-109.0:-14.2:6.0
BLOCK1  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1219    A   C   0/1:22,12:34:99:308,0,715   20,10:-69.9,-54.2,-107.7:-15.7:6.0
BLOCK1  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    1721    G   C   0/1:15,6:21:99:141,0,464    7,5:-21.8,-18.5,-30.1:-3.3:4.0
BLOCK2: offset: 60874 len: 79 phased: 3 SPAN: 78 MECscore 11.99 fragments 21
BLOCK2  0   1   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    2171    G   C   0/1:14,13:27:99:388,0,369   9,5:-35.3,-26.5,-46.7:-8.7:3.0
BLOCK3: offset: 62339 len: 3617 phased: 123 SPAN: 3616 MECscore 1516.57 fragments 4565
BLOCK3  1   0   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    3661    G   A   0/1:148,55:203:99:1070,0,4008   107,39:-163.0,-160.9,-438.4:-2.1:33.0
BLOCK3  1   0   Locus_540_Transcript_32_Length_8324_genewise_newlength_8215__CDS__3870__6491    3700    C   T

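I'm fairly new to programming in general. I have tried awk/sed and perl, but I can't seem to figure this out :( I would really appreciate your help, ideally with an explanation of what each line of code does. Thanks very much.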


1 Answer
Anonymous user
#1 · Posted 2024-10-03 19:27:29

Using a Perl one-liner:

perl -pe 's/^BLOCK\K/++$i/e or s/^\d+/"BLOCK$i"/e' file.txt 

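Switches:

  • -p: wraps the code in a while(<>){...; print} loop that runs once for each line of the input file
  • -e: tells perl to execute the code given on the command line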
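
Since the question asks for a line-by-line explanation, here is a rough Python sketch of the same logic. It is not the answerer's code, only an equivalent I believe behaves the same way on the sample input. In the one-liner, `\K` keeps the already-matched `BLOCK` and inserts the replacement right after it, and the `/e` flag evaluates the replacement as Perl code, so `++$i` becomes the running block number. The function name `rename_blocks` and the shortened sample rows are made up for illustration.

```python
import re


def rename_blocks(lines):
    """Number each BLOCK header and relabel the rows that follow it."""
    count = 0  # how many BLOCK headers have been seen so far
    renamed = []
    for line in lines:
        if line.startswith("BLOCK"):
            # Header line: bump the counter and insert it right after "BLOCK",
            # which is what s/^BLOCK\K/++$i/e does.
            count += 1
            line = "BLOCK" + str(count) + line[len("BLOCK"):]
        else:
            # Data line: replace the leading digits with the current block name,
            # which is what s/^\d+/"BLOCK$i"/e does.
            # Assumption: rows appearing before any header get "BLOCK0" here.
            line = re.sub(r"^\d+", "BLOCK" + str(count), line)
        renamed.append(line)
    return renamed


if __name__ == "__main__":
    # Shortened, hypothetical rows in the same layout as the file in the question.
    sample = [
        "BLOCK: offset: 59051 len: 1615",
        "59294   0   1   513 C   A",
        "BLOCK: offset: 60874 len: 79",
        "60952   0   1   2171    G   C",
    ]
    for row in rename_blocks(sample):
        print(row)
```

To run it on a real file, you could pass the open file object (or `f.read().splitlines()`) to `rename_blocks` and write the returned lines back out. Only the first field is touched, so the original spacing between columns is preserved.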

