Split a large file by field and add a unique identifier to each file

User

Reply 1 · edited 2024-09-30 16:20:14

In Python 2.x, you can do this with groupby, as follows:

import csv
from itertools import groupby

with open('huge.txt', 'rb') as f_input:
    csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True)

    for index, (k, g) in enumerate(groupby(csv_input, lambda x: x[0]), start=1):
        with open('{}.{}'.format(k, index), 'wb') as f_output:
            csv.writer(f_output, delimiter=' ').writerows(g)
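
One detail worth spelling out: groupby only groups consecutive rows that share a key, so a key that shows up again later in the file starts a new group. That is why the running index is part of the file name. A small illustration, using made-up sample rows (the data below is hypothetical, not from the question):

```python
from itertools import groupby

# Hypothetical sample rows; the first column is the grouping key.
rows = [['A', '1'], ['A', '2'], ['B', '3'], ['A', '4']]

names = []
for index, (key, group) in enumerate(groupby(rows, lambda r: r[0]), start=1):
    count = len(list(group))  # consume the group before advancing
    names.append('{}.{} ({} rows)'.format(key, index, count))

print(names)  # ['A.1 (2 rows)', 'B.2 (1 rows)', 'A.3 (1 rows)']
```

The second run of 'A' rows gets its own file (A.3) instead of being appended to A.1.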

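If you are using Python 3.x, open both files in text mode with newline='' ('r' and 'w' instead of 'rb' and 'wb'); the rest of the code is unchanged.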
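
A sketch of what a Python 3 variant could look like, assuming the same space-delimited input as above. The function name split_by_first_field is made up for this example, and skipping blank lines is an addition that the Python 2 code does not have:

```python
import csv
from itertools import groupby


def split_by_first_field(path):
    """Write each run of rows sharing a first field to '<key>.<index>'.

    Hypothetical helper for illustration; returns the file names created.
    """
    created = []
    # In Python 3 the csv module wants text-mode files opened with newline=''.
    with open(path, 'r', newline='') as f_input:
        csv_input = csv.reader(f_input, delimiter=' ', skipinitialspace=True)
        # Assumption: blank lines are skipped so that row[0] always exists.
        rows = (row for row in csv_input if row)
        for index, (k, g) in enumerate(groupby(rows, lambda x: x[0]), start=1):
            name = '{}.{}'.format(k, index)
            with open(name, 'w', newline='') as f_output:
                csv.writer(f_output, delimiter=' ').writerows(g)
            created.append(name)
    return created
```

Calling split_by_first_field('huge.txt') writes the output files into the current directory, one per consecutive run of the first field.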

User

Reply 2 · edited 2024-09-30 16:20:14

Awk really is simple, isn't it? For comparison, the same logic in Python:

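#!/usr/bin/env python3
files_count = 1
first_col = None
current_out = None
with open('maria.txt') as maria:
    for line in maria:
        line = line.rstrip()
        columns = line.split()
        if not columns:
            continue  # skip blank lines, which have no first column
        if columns[0] != first_col:
            # The key changed: close the previous output file and open the next one
            if current_out is not None:
                current_out.close()
            first_col = columns[0]
            current_out = open(first_col + '.' + str(files_count), 'w')
            files_count += 1
        print(line, file=current_out)
# Close the last output file
if current_out is not None:
    current_out.close()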

User

Reply 3 · edited 2024-09-30 16:20:14

It sounds like you might want this:

awk '$1!=prev{ close(out); out="File_"$1"."(++cnt); prev=$1 } { print > out }' test_file

Your question is not entirely clear, though.
