How to split a file

10900 PART1 3211034
10900 PART2 3400458
10900 PART4 3183857
10900 PART3 4152115
10900 PART5 3366650
10900 PART6 1548868
10920 PART3 4154075
10920 PART2 3404018
10920 PART1 3207571
10920 PART4 3178505
10920 PART6 1882924
10920 PART5 3363267
10940 PART6 2183534
10940 PART3 4153924
10940 PART4 3178554
10940 PART1 3207436
10940 PART5 3363585
10940 PART2 3404220

3 answers

User

Answer 1 · edited 2024-09-28 01:33:47

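I don't understand what you want to do with the first column. However, here is some Python that respects the limit on the sum of the last column: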

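import itertools

fileID = itertools.count(1)
threshold = 10000000

with open('path/to/file') as infile:
    total = 0  # running sum of the values written to the current output file
    outfile = open("file%d" % next(fileID), 'w')
    for line in infile:
        val = int(line.strip().split()[-1])  # the last field is the value to sum
        if total + val > threshold:
            # this line would push the sum past the limit: start a new file
            outfile.close()
            total = 0
            outfile = open("file%d" % next(fileID), 'w')
        outfile.write(line)
        total += val
    outfile.close()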

Hope this helps

User

Answer 2 · edited 2024-09-28 01:33:47

Using awk here:

awk '{ s+=$3 } s>=10000000 || $1!=x { s=$3; c++ } { print > ("File" c); x=$1 }' file

This creates 7 files. Here is the output of grep . File*, showing the contents of each of those files:

(output not preserved in the source)
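If you'd rather check the grouping without writing any files, here is a rough Python sketch of the same rule as the awk one-liner above: start a new group whenever the first column changes or the running sum of the third column reaches the limit. The names `group_rows` and `SAMPLE` are my own, not part of the answer, and unlike the awk version this sketch skips blank or short lines.

```python
# Sample rows copied from the question.
SAMPLE = """\
10900 PART1 3211034
10900 PART2 3400458
10900 PART4 3183857
10900 PART3 4152115
10900 PART5 3366650
10900 PART6 1548868
10920 PART3 4154075
10920 PART2 3404018
10920 PART1 3207571
10920 PART4 3178505
10920 PART6 1882924
10920 PART5 3363267
10940 PART6 2183534
10940 PART3 4153924
10940 PART4 3178554
10940 PART1 3207436
10940 PART5 3363585
10940 PART2 3404220
"""


def group_rows(lines, limit=10000000):
    """Group lines like the awk one-liner (hypothetical helper, not from the answer)."""
    groups = []       # one list of lines per would-be output file
    total = 0         # running sum of column 3 in the current group
    prev_key = None   # first column of the previous line
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: ignore blank or malformed lines
        key, val = fields[0], int(fields[2])
        total += val
        # New group on the first line, when the sum reaches the limit,
        # or when the first column changes.
        if not groups or total >= limit or key != prev_key:
            groups.append([])
            total = val
        groups[-1].append(line)
        prev_key = key
    return groups


groups = group_rows(SAMPLE.splitlines())
print(len(groups))  # 7 for the sample data in the question
```

Each inner list could then be written to its own file, for example `File1`, `File2`, and so on, if you want the same result on disk.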

User

Answer 3 · edited 2024-09-28 01:33:47

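If I haven't misread your specification, the following may work for you. Basically, it checks whether the third field is greater than 10000000; if so, it prints the line to file c (where c is a counter), resets the running sum, and increments the file counter. Otherwise it prints the line to the current file and adds the field to the running sum, moving on to the next file once the sum exceeds 10000000.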

awk 'BEGIN {c=1}
     $3>10000000 {print $0 > ("file" c) ; c++ ; sum=0 }
     $3<=10000000 {print $0 > ("file" c) ; sum+=$3 ; if (sum>10000000) {sum=0;c++}}' INPUTFILE
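One thing worth noticing about this approach is that the sum is checked after the line has already been written, so a file can end up over the limit. Here is a small Python sketch of the same two rules that only returns the file counter for each value instead of writing files. The function name `assign_counters` is made up for illustration, and treating a value exactly equal to the limit like a smaller one is my assumption.

```python
def assign_counters(values, limit=10000000):
    """Return the file counter each value would go to (hypothetical helper)."""
    counter = 1   # mirrors c in the awk script
    total = 0     # mirrors sum in the awk script
    assigned = []
    for val in values:
        assigned.append(counter)  # the line always goes to the current file
        if val > limit:
            # an oversized value closes the current file immediately
            counter += 1
            total = 0
        else:
            total += val
            if total > limit:
                # the limit was crossed by the line just written
                total = 0
                counter += 1
    return assigned


# Third-column values of the first six rows in the question.
first_six = [3211034, 3400458, 3183857, 4152115, 3366650, 1548868]
print(assign_counters(first_six))  # [1, 1, 1, 1, 2, 2]
```

With these values the first file receives four rows totalling 13947464, which is above the limit. If the sum must never exceed the limit, the check has to happen before writing the line, as in the Python answer above.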

If you want to split on the first column as well as on the sum of the third column:

(code not preserved in the source)

Yes, I know this could be shortened...
