I have a text file like the small example below:
0,1,2,3,4,5,6
chr1,144566,144597,30,chr1,120000,210000
chr1,154214,154245,34,chr1,120000,210000
chr1,228904,228935,11,chr1,210000,240000
chr1,233265,233297,13,chr1,210000,240000
chr1,233266,233297,58,chr1,210000,240000
chr1,235438,235469,36,chr1,210000,240000
chr1,262362,262393,16,chr1,240000,610000
chr1,347253,347284,12,chr1,240000,610000
chr1,387022,387053,38,chr1,240000,610000
I want to remove the first line and, instead of a comma-separated file, create a tab-separated one. The expected output looks like this:
chr1 144566 144597 30 chr1 120000 210000
chr1 154214 154245 34 chr1 120000 210000
chr1 228904 228935 11 chr1 210000 240000
chr1 233265 233297 13 chr1 210000 240000
chr1 233266 233297 58 chr1 210000 240000
chr1 235438 235469 36 chr1 210000 240000
chr1 262362 262393 16 chr1 240000 610000
chr1 347253 347284 12 chr1 240000 610000
chr1 387022 387053 38 chr1 240000 610000
I am trying to do this in Python using pandas. I wrote the code below, but it does not return what I want. Do you know how to fix it?
import pandas
file = open('myfile.txt', 'rb')
new =[]
for line in file:
    new.append(line.split(','))
df = pd.DataFrame(new)
df.to_csv('outfile.txt', index=False)
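Depending on the size of the file, it may be more efficient to avoid pandas and use basic Python I/O. That way you do not have to read the whole file into memory; you read it line by line and write each line to a new tab-delimited file.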
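A minimal sketch of this line-by-line approach, assuming the first line of the input is the header to drop and that no field contains a comma of its own. The helper name `csv_to_tsv` is hypothetical, chosen here for illustration:

```python
def csv_to_tsv(src_path, dst_path):
    """Copy src_path to dst_path, skipping the first line and
    replacing every comma with a tab (hypothetical helper)."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        # Discard the header line ("0,1,2,3,4,5,6" in the example).
        next(src, None)
        # Stream the rest one line at a time, so the whole file
        # is never held in memory.
        for line in src:
            dst.write(line.replace(",", "\t"))
```

For the files in the question, this would be called as `csv_to_tsv('myfile.txt', 'myfile2.txt')`.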
myfile2.txt is now the tab-delimited version of myfile.txt.