How to merge two files in Python, ordered by the number at the start of each line

Posted 2024-09-27 23:21:57


I have multiple files whose lines are formatted like this:

8 upchimy79 291160.8516853 345706.9991016
9 upchimy79 291160.8516853 345706.9991016
70 upchimy79 291178.7591454 345733.5179607
134 upchimy79 291391.9184244 345688.8950164
190 upchimy79 291511.4331200 345634.4573389

and:

0 eapceou79 289109.1707774 345638.6043512
60 eapceou79 289091.8125863 345656.2855532
120 eapceou79 289041.8477906 345702.7290361
183 eapceou79 288993.3282226 345747.8902265
215 eapceou79 289074.9134241 345759.2455079

I want to merge all of the files so that the first number on each line is in ascending order. The output would look like this:

0 eapceou79 289109.1707774 345638.6043512
8 upchimy79 291160.8516853 345706.9991016
9 upchimy79 291160.8516853 345706.9991016
60 eapceou79 289091.8125863 345656.2855532
70 upchimy79 291178.7591454 345733.5179607
120 eapceou79 289041.8477906 345702.7290361
134 upchimy79 291391.9184244 345688.8950164

I have quite a lot of files to process, each with about 1400 lines, so I am not sure what the best way to do this is.


3 answers
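import pandas as pd

# Replace these placeholders with your own file names.
all_your_files = ["filenames", "filename2"]  # ... and so on

# The files are whitespace-separated and have no header row,
# so the column names are supplied explicitly via names=.
all_dfs = (
    pd.read_csv(f, sep=r"\s+", header=None, names=["nr", "name", "d2", "d3"])
    for f in all_your_files
)

df = pd.concat(all_dfs, ignore_index=True)
df.sort_values(by="nr", inplace=True)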

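This sorts all of them in one go. Then write the result out with pandas' to_csv:

df.to_csv("file_name", index=False, header=False, sep=" ")

Note that the numeric columns are parsed as floats, so trailing zeros (as in 291511.4331200) are not preserved in the output unless those columns are read as strings. This approach does not use the first number as the index, in case the files contain some ...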

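pandas is well suited to this kind of task:

import pandas as pd

# file1 and file2 are the paths of the two input files.
d1 = pd.read_csv(file1, delimiter=' ', index_col=0, header=None)
d2 = pd.read_csv(file2, delimiter=' ', index_col=0, header=None)

# Stack the two frames, then sort by the index (the first number on each line).
df = pd.concat([d1, d2], axis=0).sort_index()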
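
If pandas is not available, the same "load everything, then sort" idea can be sketched with the standard library alone. This is only an illustration under a few assumptions: the function name `merge_unsorted_files` is made up for this example, every non-blank line is assumed to start with an integer, and all files are assumed to fit in memory (which should be fine for files of about 1400 lines each).

```python
def merge_unsorted_files(in_paths, out_path):
    """Concatenate the lines of several files and sort them by first number.

    Hypothetical helper: the inputs do not need to be sorted beforehand.
    """
    lines = []
    for path in in_paths:
        with open(path, "r") as f:
            # Skip blank lines and normalise the line endings.
            lines.extend(ln.rstrip("\n") for ln in f if ln.strip())

    # list.sort is stable, so lines with equal numbers keep their input order.
    lines.sort(key=lambda ln: int(ln.split()[0]))

    with open(out_path, "w") as out:
        out.writelines(ln + "\n" for ln in lines)
```

Because the lines are kept as text, the numeric columns are written back exactly as they were read.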

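When every file is already sorted on its own (as in your example), you can use heapq.merge (see its documentation) with the key parameter to merge them. This example uses two files, but you can merge any number of files the same way: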

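from heapq import merge

with open('f1.txt', 'r', newline='') as f1_in, \
     open('f2.txt', 'r', newline='') as f2_in, \
     open('data_out.txt', 'w', newline='') as f_out:

    # merge() lazily yields lines in ascending order of their first number;
    # each input must already be sorted and every line must end with a newline.
    for line in merge(f1_in, f2_in, key=lambda l: int(l.split(' ')[0])):
        f_out.write(line)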

The lines in the output file look like this:

0 eapceou79 289109.1707774 345638.6043512
8 upchimy79 291160.8516853 345706.9991016
9 upchimy79 291160.8516853 345706.9991016
60 eapceou79 289091.8125863 345656.2855532
70 upchimy79 291178.7591454 345733.5179607
120 eapceou79 289041.8477906 345702.7290361
134 upchimy79 291391.9184244 345688.8950164
183 eapceou79 288993.3282226 345747.8902265
190 upchimy79 291511.4331200 345634.4573389
215 eapceou79 289074.9134241 345759.2455079
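
To extend the heapq.merge approach from two files to an arbitrary list of files, one possible sketch uses contextlib.ExitStack to keep all inputs open at once. The helper names `first_number`, `non_blank`, and `merge_sorted_files` are hypothetical, and the sketch assumes each input file is already sorted by its first number.

```python
from contextlib import ExitStack
from heapq import merge


def first_number(line):
    """Return the leading integer of a line such as '8 upchimy79 ...'."""
    return int(line.split()[0])


def non_blank(lines):
    """Yield only the lines that contain something other than whitespace."""
    return (ln for ln in lines if ln.strip())


def merge_sorted_files(in_paths, out_path):
    """Merge files that are each already sorted by their first number."""
    with ExitStack() as stack:
        # ExitStack closes every file it was given when the block exits.
        sources = [non_blank(stack.enter_context(open(p, "r"))) for p in in_paths]
        out = stack.enter_context(open(out_path, "w"))
        for line in merge(*sources, key=first_number):
            # Guard against a last line that has no trailing newline.
            out.write(line if line.endswith("\n") else line + "\n")
```

Something like `merge_sorted_files(["f1.txt", "f2.txt", "f3.txt"], "data_out.txt")` would then cover any number of inputs. Since merge() reads lazily, only one pending line per file is held in memory at a time.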
