How do I separate the columns of a .txt file? My script combines all columns into one

Posted 2024-09-27 07:27:16


I have a text file that looks like this:

      UWI/API             WELLNAME             WELLNO              LABEL                OPER               PRODFM         SURFLAT     SURFLON      BOTLAT      BOTLON   PERFTOP PERFBASE     PERFSOURCE         WELL        WELL        WELL        WELL        WELL        WELL        WELL      SYM   
                                                                                                                                                                                                              ELEV_KB        TD      COMP_DATE   SPUD_DATE     INJLIQ      INJGAS      INJWTR           
49025000510000      CH039068            3-A                 03WC2NE01           STANOLIND O&G CO    603WLCK2               43.384316 -106.275540    0.000000    0.000000  1525.0  1815.0                         4847.00     1815.00  03/20/1925  01/01/1925        0.00   857224.00  6538652.00CO2-INJ 
49025000540000      CH036906A           8-A                 08WC2NE01           STANOLIND O&G CO                           43.383111 -106.278775    0.000000    0.000000     0.0     0.0                                     1842.00  10/07/1926  01/01/1926        0.00   757266.00  2287490.00CO2-INJ 
49025000550000      CH03906B            12-A                12WC2NE01           STANOLIND O&G CO    603WLCK                43.383171 -106.272200    0.000000    0.000000  1448.0  1703.0                         4896.00     1916.00  07/23/1924  01/01/1924        0.00  1157524.00  9007039.00CO2-INJ 
49025000550000                                                                                                                                                            1448.0  1451.0                                                                                                    
49025000550000                                                                                                                                                            1700.0  1703.0                                                                                                    
49025000580000      CH03906B            19-A                19WC2NE01           STANOLIND O&G CO                           43.380695 -106.280337    0.000000    0.000000     0.0     0.0                                     1833.00  08/27/1924  01/01/1924        0.00  3602831.00  8002146.00CO2-INJ 
49025000580000                                                                                                                                                            1565.0  1833.0                                                                                                    
49025000590000      CH03906B            23-AX               23WC2NE01           STANOLIND O&G CO    603WLCK2               43.380810 -106.273800    0.000000    0.000000  1842.0  1916.0                                     1927.00  10/19/1923  01/01/1923        0.00  2673371.00  9392905.00CO2-INJ 
49025000590000                                                                                                                                                               0.0     0.0                                                                

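When I run the code below, the output puts the entire first line into a single column. I want the first column to be "UWI/API", the second to be "WELLNAME", and so on. As you can see, the data is separated by spaces rather than commas. The data is also a bit messy: it is not aligned. The headers should run from "UWI/API" to "INJWTR".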

Here is my code:

import pandas as pd

df = pd.read_csv(r'C:/Users/mmcintyre/Documents/Teresa CO2 Injectors_Spaces.txt',sep=r'\s{2,}')

df.to_csv(r'C:/Users/mmcintyre/Documents/Teresa.csv',index=None)

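The output looks like this: https://sru365edu-my.sharepoint.com/:x:/g/personal/mam1064_sru_edu/Ebq0tpHveAJBu710UcDt82IB23gmRs6nNrTE-MtDUgkU6Q?e=byhjuL

I highlighted the rows that end up in the wrong position. I am not sure whether this can be fixed.

Thanks! Morgan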


Tags: csv, data, api, df, date, pd, co2, well
1 answer
User
Answer 1 · Posted 2024-09-27 07:27:16

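This data looks hard to parse. I tried many settings with a number of CSV parsers, but I don't think there is a clean way to parse data this messy, so I went with the dirty approach: slicing every line at fixed character offsets. If you only need this for this one file, the script in this answer should work. If you need to do this regularly, I can't guarantee it will always work cleanly.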
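One likely reason delimiter-based parsers struggle with this file is that a blank field disappears when a line is split on runs of spaces, so every later value shifts one column to the left. The sketch below illustrates that on a shortened, made-up row layout. The helper names (`make_row`, `split_on_spaces`, `split_fixed`) and the widths 20, 20 and 12 are assumptions for illustration only.

```python
import re

def make_row(oper, prodfm, surflat):
    # Demo helper: lay out three fields at assumed widths 20, 20 and 12.
    return f"{oper:<20}{prodfm:<20}{surflat:>12}"

def split_on_spaces(line):
    # Same idea as sep=r'\s{2,}': split on runs of two or more spaces.
    return re.split(r"\s{2,}", line.strip())

def split_fixed(line, bounds=((0, 20), (20, 40), (40, 52))):
    # Slice at fixed character offsets, then trim the padding.
    return [line[start:end].strip() for start, end in bounds]

full_row = make_row("STANOLIND O&G CO", "603WLCK2", "43.384316")
gap_row = make_row("STANOLIND O&G CO", "", "43.383111")

print(split_on_spaces(full_row))  # 3 fields
print(split_on_spaces(gap_row))   # 2 fields: the blank PRODFM is lost
print(split_fixed(gap_row))       # 3 fields: the blank PRODFM is kept as ''
```

Slicing by position keeps the empty field in place, which is why the answer falls back to fixed offsets.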

#! /usr/bin/env python

import csv

data = []

with open('Teresa CO2 Injectors_Spaces.TXT') as file_handler:

    header = file_handler.readline().strip().split()
    header2 = ['' for i in range(13)] + file_handler.readline().strip().split() + ['']

    data.append(header)
    data.append(header2)

    for row in file_handler.readlines():

        tempRow = []

        UWIAPI = row[0:20]
        WELLNAME = row[20:40]
        WELLNO = row[40:60]
        LABEL = row[60:80]
        OPER = row[80:100]
        PRODFM = row[100:120]
        SURFLAT = row[120:132]
        SURFLON = row[132:144]
        BOTLAT = row[144:156]
        BOTLON = row[156:168]
        PERFTOP = row[168:176]
        PERFBASE = row[176:184]
        PERFSOURCE = row[184:200]
        # ELEV_KB ends where TD begins, so the two slices do not overlap
        ELEV_KB = row[200:216]
        TD = row[216:228]
        COMP_DATE = row[228:240]
        SPUD_DATE = row[240:252]
        INJLIQ = row[252:264]
        INJGAS = row[264:276]
        INJWTR = row[276:288]
        # SYM (e.g. CO2-INJ) follows INJWTR directly, with no space in between
        SYM = row[288:].strip()

        tempRow = [
            UWIAPI.strip(),
            WELLNAME.strip(),
            WELLNO.strip(),
            LABEL.strip(),
            OPER.strip(),
            PRODFM.strip(),
            SURFLAT.strip(),
            SURFLON.strip(),
            BOTLAT.strip(),
            BOTLON.strip(),
            PERFTOP.strip(),
            PERFBASE.strip(),
            PERFSOURCE.strip(),
            ELEV_KB.strip(),
            TD.strip(),
            COMP_DATE.strip(),
            SPUD_DATE.strip(),
            INJLIQ.strip(),
            INJGAS.strip(),
            INJWTR.strip(),
            SYM,
        ]

        data.append(tempRow)

# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('example.csv', 'w', newline='') as fh:

    writer = csv.writer(fh, delimiter=',')

    for row in data:

        writer.writerow(row)
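The same fixed-offset idea can also be written as a table of column boundaries, which keeps all offsets in one place and makes them easier to adjust. This is only a sketch: the offsets are inferred from the sample rows in the question and are an assumption about this particular file, and `COLSPECS`, `parse_fixed_width` and `make_row` are hypothetical names.

```python
import csv
import io

# (name, start, end) offsets inferred from the sample rows; assumptions only.
COLSPECS = [
    ("UWI/API", 0, 20), ("WELLNAME", 20, 40), ("WELLNO", 40, 60),
    ("LABEL", 60, 80), ("OPER", 80, 100), ("PRODFM", 100, 120),
    ("SURFLAT", 120, 132), ("SURFLON", 132, 144),
    ("BOTLAT", 144, 156), ("BOTLON", 156, 168),
    ("PERFTOP", 168, 176), ("PERFBASE", 176, 184),
    ("PERFSOURCE", 184, 200), ("ELEV_KB", 200, 216), ("TD", 216, 228),
    ("COMP_DATE", 228, 240), ("SPUD_DATE", 240, 252),
    ("INJLIQ", 252, 264), ("INJGAS", 264, 276), ("INJWTR", 276, 288),
    ("SYM", 288, None),  # None means "to the end of the line"
]

def parse_fixed_width(lines, colspecs=COLSPECS):
    """Yield one list of stripped field values per non-blank line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        yield [line[start:end].strip() for _, start, end in colspecs]

def make_row(values):
    # Demo helper only: place each given value at its assumed start offset.
    row = ""
    for name, start, _ in COLSPECS:
        row = row.ljust(start) + values.get(name, "")
    return row

# Two made-up lines: a full record and a continuation row with only
# the well identifier and a perforation interval.
sample = [
    make_row({"UWI/API": "49025000510000", "WELLNAME": "CH039068",
              "PRODFM": "603WLCK2", "TD": "1815.00", "SYM": "CO2-INJ"}),
    make_row({"UWI/API": "49025000550000",
              "PERFTOP": "1448.0", "PERFBASE": "1451.0"}),
]

# Write to an in-memory buffer here; a real run would use a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([name for name, _, _ in COLSPECS])
writer.writerows(parse_fixed_width(sample))
print(buffer.getvalue())
```

For the real file, the two header lines would be skipped first and the remaining lines passed to `parse_fixed_width`. pandas also provides `read_fwf` for fixed-width files, which accepts a `colspecs` list of (start, end) pairs and may be worth trying if the result is needed as a DataFrame.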
