Python: read blocks of lines from a file

Published 2024-09-27 05:02:22


Output taken from the CARI/201412 file (the same output also appears in CARI/201I):

---------------------------------------------
TM 05120970.01: Processing...
TM 05120970: Processing...
TM 05120970: current status Open
TM 05120970: Owner_Info.User_ref = crossi14
TM 05120970: Owner_Info.Email = Criss.Rossi@gmail.com
TM 05120970: CarModel = Nissan Micra
----------------------------------------------
TM 05157414.06: Processing...
TM 05157414: Processing...
TM 05157414: current status Open
TM 05157414: Owner_Info.User_ref = yumiao12
TM 05157414: Owner_Info.Email = Yu.Miao@gmail.com
TM 05157414: CarModel = Toyota Avensis
----------------------------------------------

I used: exec_cmd('cat ' + f1 + '| grep -e "CarModel = " -e "Owner_Info.User_ref = "'), but I also need the first line of each block (strictly speaking the second line, after the dashes).


What I am trying (and need) to do is parse each block and store its values in variables:

TM 05120970.01 -> car_number = 05120970.01

Owner_Info.User_ref = crossi14 -> owner_user = crossi14

CarModel = Nissan Micra -> car_model = Nissan Micra

To this information I will add some default values, such as:

priority = Unknown

I need to pass these variables as input to another script named insert_owner_car.pl:

 insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14",car_model="Nissan Micra",priority="Unknown"

This is what I have managed so far, but it is not usable because I cannot get the values mentioned above:

#!/usr/bin/python

import itertools, commands, datetime, os, re, sys, time

inFile = open("/tmp/20141202.194812_carStatus")
outFile = open("result.txt", "w")
keepCurrentSet = False
for line in inFile:
    if line.startswith("----------------------------------------------"):
        keepCurrentSet = False
    if keepCurrentSet:
        parts = line.split(" = ")[1:]
        part=','.join(parts)
        print part
#outFile.write(parts)   
    if line.startswith("----------------------------------------------"):
        keepCurrentSet = True
inFile.close()
outFile.close()

I don't know how to get 05120970.01, or how to use all the variables of one block as input to the other script.

PS: I have Python 2.5.1.


2 answers

You can process the file in chunks with a utility function, open_chunk:

import re
import subprocess

def open_chunk(readfunc, delimiter, chunksize=1024):
    """
    readfunc(chunksize) should return a string.
    """
    remainder = ''
    for chunk in iter(lambda: readfunc(chunksize), ''):
        pieces = re.split(delimiter, remainder + chunk)
        for piece in pieces[:-1]:
            yield piece
        remainder = pieces[-1]
    if remainder:
        yield remainder

filename = "/tmp/20141202.194812_carStatus"  # path taken from the question
f = open(filename, 'r')
for chunk in open_chunk(f.read, delimiter=r'-{45,}'):
    chunk = chunk.strip()
    if chunk:
        lines = chunk.splitlines()
        firstline = lines[0]
        car_number = firstline.split()[1][:-1]
        for line in lines[1:]:
            if 'Owner_Info.User_ref = ' in line:
                owner_user = line.split(" = ")[1]
            elif 'CarModel = ' in line:
                car_model =  line.split(" = ")[1]
        cmd = ['insert_owner_car.pl'
               , '-id'
               , car_number
               , '-o'
               , 'owner_user="%s"' % (owner_user, )
               , 'car_model="%s"' % (car_model, )
               , 'priority="Unknown"']
        print(' '.join(cmd))
        # subprocess.call(cmd)
f.close()

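This prints:

insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14" car_model="Nissan Micra" priority="Unknown"
insert_owner_car.pl -id 05157414.06 -o owner_user="yumiao12" car_model="Toyota Avensis" priority="Unknown"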
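The command in the question passes all key/value pairs as one comma-separated argument after -o, whereas the list above passes them as three separate arguments. A minimal sketch of building the single-argument form follows; build_command is a hypothetical helper name, and how insert_owner_car.pl actually parses its arguments is an assumption based only on the example in the question.

```python
def build_command(car_number, owner_user, car_model, priority="Unknown"):
    """Return the argument list for insert_owner_car.pl (hypothetical helper)."""
    # Assumption: the script expects every key="value" pair in a single
    # comma-separated argument after -o, as in the question's example.
    options = ",".join([
        'owner_user="%s"' % owner_user,
        'car_model="%s"' % car_model,
        'priority="%s"' % priority,
    ])
    return ["insert_owner_car.pl", "-id", car_number, "-o", options]


cmd = build_command("05120970.01", "crossi14", "Nissan Micra")
print(" ".join(cmd))
# subprocess.call(cmd) would run the script without going through a shell
```

Note that when the list is handed to subprocess.call no shell is involved, so the double quotes reach the script literally. If the script expects what a shell would pass (quotes stripped), drop the quotes from the format strings.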

If the data file is small, you can read the whole file into a single string and then use re.split to split it into chunks:

In [37]: import re

In [38]: re.split(r'-{45,}', open('data').read())
Out[38]: 
['\n\n',
 '\nTM 05120970.01: Processing...\nTM 05120970: Processing...\nTM 05120970: current status Open\nTM 05120970: Owner_Info.User_ref = crossi14\nTM 05120970: Owner_Info.Email = Criss.Rossi@gmail.com\nTM 05120970: CarModel = Nissan Micra\n',
 '\nTM 05157414.06: Processing...\nTM 05157414: Processing...\nTM 05157414: current status Open\nTM 05157414: Owner_Info.User_ref = yumiao12\nTM 05157414: Owner_Info.Email = Yu.Miao@gmail.com\nTM 05157414: CarModel = Toyota Avensis\n',
 '\n']

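This can be used instead of open_chunk above. The advantage of open_chunk is that it also works on very large files, where reading the whole file into one string and splitting it into a list would take too much memory.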
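Another low-memory option, offered here as an alternative sketch rather than as part of the answer above, is to read the file line by line and let itertools.groupby separate delimiter lines from block lines. The names DELIMITER and iter_blocks are made up for this example; the 45-dash threshold is carried over from the regular expression used above.

```python
import itertools
import re

# Same rule as above: a delimiter is a line of at least 45 dashes.
DELIMITER = re.compile(r"^-{45,}\s*$")


def iter_blocks(lines):
    """Yield each block as a list of stripped, non-empty lines."""
    def is_delimiter(line):
        return bool(DELIMITER.match(line))

    # groupby splits the stream into runs of delimiter / non-delimiter lines.
    for is_delim, group in itertools.groupby(lines, is_delimiter):
        if is_delim:
            continue
        block = [line.strip() for line in group if line.strip()]
        if block:
            yield block


sample_lines = [
    "-" * 45 + "\n",
    "TM 05120970.01: Processing...\n",
    "TM 05120970: CarModel = Nissan Micra\n",
    "-" * 46 + "\n",
    "TM 05157414.06: Processing...\n",
    "TM 05157414: CarModel = Toyota Avensis\n",
    "-" * 46 + "\n",
]
blocks = list(iter_blocks(sample_lines))
print(len(blocks))
```

Because a file object is itself an iterator over lines, iter_blocks(open(filename)) would work the same way and only one block is held in memory at a time.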

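You should use the re module to extract the relevant information: it is standard, simple and robust. You can then emit each block's information when the next block delimiter is reached, with a catch-all at the end of the file for the last block.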

The script would be:

import re

rnum = re.compile(r'\s*TM\s+([^\s:]+):.*')
ruser = re.compile(r'.*Owner_Info\.User_ref\s*=\s*(.*)')
rmodel = re.compile(r'.*CarModel\s*=\s*(.*)')


def display(out, num, user, model):
    print(num, user, model)
    out.write('insert_owner_car.pl -id %s -o owner_user="%s",car_model="%s",priority="Unknown"\n' % (num, user, model))

inFile = open("/tmp/20141202.194812_carStatus")
outFile = open("result.txt", "w")
firstOfBlock = False
carnum = None
for line in inFile:
    if line.startswith("----------------"):  # block delimiter line
        firstOfBlock = True
        if carnum is not None:
            display(outFile, carnum, user, model)
            carnum = None
    else:
        if firstOfBlock:
            m = rnum.match(line)
            if m is not None:
                carnum = m.group(1)
                firstOfBlock = False
        else:
            line = line.strip()
            m = ruser.match(line)
            if m is not None:
                user = m.group(1)
            else:
                m = rmodel.match(line)
                if m is not None:
                    model = m.group(1)

if carnum is not None:
    display(outFile, carnum, user, model)
    carnum = None

inFile.close()
outFile.close()

With your current example, the script prints the car number, user and model of each block, and result.txt contains:

insert_owner_car.pl -id 05120970.01 -o owner_user="crossi14",car_model="Nissan Micra",priority="Unknown"
insert_owner_car.pl -id 05157414.06 -o owner_user="yumiao12",car_model="Toyota Avensis",priority="Unknown"
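If the values are needed as variables rather than as text in result.txt, the same idea can be wrapped in a function that returns one dict per block, which also gives a natural place for defaults such as priority. This is only a sketch: parse_records, RE_NUM, RE_FIELD and KEY_NAMES are hypothetical names, and the patterns assume the line layout shown in the question.

```python
import re

# Hypothetical names; the patterns assume the "TM <id>: key = value" layout.
RE_NUM = re.compile(r"\s*TM\s+([^\s:]+):")
RE_FIELD = re.compile(r"TM\s+\S+:\s*(Owner_Info\.User_ref|CarModel)\s*=\s*(.*)")
KEY_NAMES = {"Owner_Info.User_ref": "owner_user", "CarModel": "car_model"}


def parse_records(lines):
    """Return one dict per dash-delimited block."""
    records = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith("-" * 16):
            current = None  # delimiter: the next TM line opens a new block
            continue
        if not line:
            continue
        if current is None:
            m = RE_NUM.match(line)
            if m:
                # The first line of a block carries the full car number.
                current = {"car_number": m.group(1), "priority": "Unknown"}
                records.append(current)
            continue
        m = RE_FIELD.match(line)
        if m:
            current[KEY_NAMES[m.group(1)]] = m.group(2).strip()
    return records


sample = [
    "-" * 45,
    "TM 05120970.01: Processing...",
    "TM 05120970: Processing...",
    "TM 05120970: current status Open",
    "TM 05120970: Owner_Info.User_ref = crossi14",
    "TM 05120970: Owner_Info.Email = Criss.Rossi@gmail.com",
    "TM 05120970: CarModel = Nissan Micra",
    "-" * 46,
]
records = parse_records(sample)
print(records[0]["car_number"])
```

Each record can then be turned into a command line or passed to subprocess, and fields that are missing from a block simply do not appear in its dict instead of silently reusing the previous block's value.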
