Parsing a Log File with Regular Expressions in Python

Posted 2024-06-26 01:45:13


I have to parse an email delivery log file (to get the SMTP reply for a message id). The log looks like this:

Nov 12 17:26:57 zeus postfix/smtpd[23992]: E859950021DB1: client=pegasus.os[172.20.19.62]
Nov 12 17:26:57 zeus postfix/cleanup[23995]: E859950021DB1: message-id=a92de331-9242-4d2a-8f0e-9418eb7c0123
Nov 12 17:26:58 zeus postfix/qmgr[22359]: E859950021DB1: from=<system@directoperation.de>, size=114324, nrcpt=1 (queue active)
Nov 12 17:26:58 zeus postfix/smtp[24007]: certificate verification failed for mx.elutopia.it[62.149.128.160]:25: untrusted issuer /C=US/O=RTFM, Inc./OU=Widgets Division/CN=Test CA20010517
Nov 12 17:26:58 zeus postfix/smtp[24007]: E859950021DB1: to=<mike@elutopia.it>, relay=mx.elutopia.it[62.149.128.160]:25, delay=0.89, delays=0.09/0/0.3/0.5, dsn=2.0.0, status=sent (250 2.0.0 d3Sx1m03q0ps1bK013Sxg4 mail accepted for delivery)
Nov 12 17:26:58 zeus postfix/qmgr[22359]: E859950021DB1: removed
Nov 12 17:27:00 zeus postfix/smtpd[23980]: connect from pegasus.os[172.20.19.62]
Nov 12 17:27:00 zeus postfix/smtpd[23980]: setting up TLS connection from pegasus.os[172.20.19.62]
Nov 12 17:27:00 zeus postfix/smtpd[23980]: Anonymous TLS connection established from pegasus.os[172.20.19.62]: TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)
Nov 12 17:27:00 zeus postfix/smtpd[23992]: disconnect from pegasus.os[172.20.19.62]
Nov 12 17:27:00 zeus postfix/smtpd[23980]: 2C04150101DB2: client=pegasus.os[172.20.19.62]
Nov 12 17:27:00 zeus postfix/cleanup[23994]: 2C04150101DB2: message-id=21e2f9d3-154a-3683-85d3-a7c52d429386
Nov 12 17:27:00 zeus postfix/qmgr[22359]: 2C04150101DB2: from=<system@directoperation.de>, size=53237, nrcpt=1 (queue active)
Nov 12 17:27:00 zeus postfix/smtp[24006]: ABE7C50001D62: to=<info@elvictoria.it>, relay=relay3.telnew.it[195.36.1.102]:25, delay=4.9, delays=0.1/0/4/0.76, dsn=2.0.0, status=sent (250 2.0.0 r9EFQt0J009467 Message accepted for delivery)
Nov 12 17:27:00 zeus postfix/qmgr[22359]: ABE7C50001D62: removed
Nov 12 17:27:00 zeus postfix/smtp[23998]: 2C04150101DB2: to=<peter@elgravo.ch>, relay=liberomx2.elgravo.ch[212.52.84.93]:25, delay=0.72, delays=0.07/0/0.3/0.35, dsn=2.0.0, status=sent (250 ok:  Message 2040264602 accepted)
Nov 12 17:27:00 zeus postfix/qmgr[22359]: 2C04150101DB2: removed

Now I fetch a message id (a UUID) from the database (for example a92de331-9242-4d2a-8f0e-9418eb7c0123) and then run my code over the log file:

[code snippet not preserved in the source]

With the message id I find the log_id (the Postfix queue id, e.g. E859950021DB1), and with the log_id I can find the SMTP reply.
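The two-step lookup described above might be sketched roughly as follows. This is only an illustration, not the asker's original code: the function name `find_smtp_reply` and both regular expressions are assumptions derived from the sample log lines shown earlier, and they may need adjusting for other Postfix configurations (for example different queue id formats).

```python
import re

# Assumed pattern for lines like "...]: E859950021DB1: message-id=<uuid>"
MESSAGE_ID_RE = re.compile(r"\]: ([0-9A-Za-z]+): message-id=<?([^>\s]+)>?")
# Assumed pattern for lines like "...]: E859950021DB1: to=<...>, ... status=sent (250 ...)"
STATUS_RE = re.compile(
    r"\]: ([0-9A-Za-z]+): to=<[^>]*>.*\bstatus=(\w+) \((.*)\)\s*$"
)


def find_smtp_reply(lines, message_id):
    """Return (status, reply_text) for message_id, or None if not found.

    Step 1: find the queue id (log_id) logged next to the message id.
    Step 2: find the delivery line carrying that queue id.
    """
    queue_id = None
    for line in lines:
        if queue_id is None:
            m = MESSAGE_ID_RE.search(line)
            if m and m.group(2) == message_id:
                queue_id = m.group(1)
            continue
        m = STATUS_RE.search(line)
        if m and m.group(1) == queue_id:
            return m.group(2), m.group(3)
    return None
```

Since a file object yields its lines when iterated, an open log file can be passed directly as `lines`. Note that this sketch returns only the first delivery line it finds, so a message with several recipients would need a loop that keeps going.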

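This works fine, but a better approach would be for the software to walk through the log file, collect the message ids and reply codes, and then update the database. I am not sure how to do that, though. The script has to run roughly every 2 minutes and check the log file for new entries. So how can I make sure it remembers where it left off and does not pick up the same message id twice? Thanks in advance.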


1 answer
Anonymous user
#1 · Posted 2024-06-26 01:45:13

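Use a dictionary to store the message IDs, and a separate file to store the byte offset in the log file where you left off on the last run.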

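import os

msgIDs = {}
# Get where you left off in the logfile during the last read:
try:
    with open('logfile_placemarker.txt', 'r') as f:
        lastRead = int(f.read())
except (IOError, ValueError):
    print("Can't find/read place marker file!  Starting at 0")
    lastRead = 0

# If the logfile is now shorter than the stored offset, it was probably
# rotated or truncated, so start again from the beginning.
if os.path.getsize('logfile.log') < lastRead:
    lastRead = 0

# Open in binary mode so that seek()/tell() work on plain byte offsets.
with open('logfile.log', 'rb') as f:
    f.seek(lastRead)
    for raw in f:
        line = raw.decode('utf-8', errors='replace')
        # ...
        # Pick out msgID and responseCode from line here
        # (and `continue` if the line contains neither).
        # ...
        if msgID in msgIDs:
            print("uh oh, found the same msg id twice!!")
        msgIDs[msgID] = responseCode
    lastRead = f.tell()

# Do whatever you need to do with the msgIDs you found
# (updateDB stands for your own database update function):
updateDB(msgIDs)
# Store lastRead (where you left off in the logfile) in a file so it persists until the next run
with open('logfile_placemarker.txt', 'w') as f:
    f.write(str(lastRead))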
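The "pick out" step left open inside the loop above could be filled in along these lines. This is a hedged sketch rather than part of the original answer: `parse_lines`, `pending` and the three regular expressions are hypothetical names and patterns inferred from the sample log in the question. Because the message-id line and the delivery line of one message can land in different runs of the script, the sketch keeps a `pending` mapping from queue id to message id that the caller can carry over between runs.

```python
import re

# Patterns are assumptions based on the sample Postfix log lines.
MESSAGE_ID_RE = re.compile(r"\]: ([0-9A-Za-z]+): message-id=<?([^>\s]+)>?")
STATUS_RE = re.compile(
    r"\]: ([0-9A-Za-z]+): to=<[^>]*>.*\bstatus=(\w+) \((.*)\)\s*$"
)
REMOVED_RE = re.compile(r"\]: ([0-9A-Za-z]+): removed\s*$")


def parse_lines(lines, pending):
    """Return {message_id: smtp_reply} for the delivery lines found in lines.

    pending maps queue id -> message id for messages still in the queue.
    It is updated in place so it can be reused on the next run.
    """
    replies = {}
    for line in lines:
        m = MESSAGE_ID_RE.search(line)
        if m:
            # Remember which message id belongs to this queue id.
            pending[m.group(1)] = m.group(2)
            continue
        m = STATUS_RE.search(line)
        if m and m.group(1) in pending:
            # A later delivery line for the same message overwrites an earlier one.
            replies[pending[m.group(1)]] = m.group(3)
            continue
        m = REMOVED_RE.search(line)
        if m:
            # Postfix has dropped the message from the queue; forget it.
            pending.pop(m.group(1), None)
    return replies
```

One possible way to use it would be to save `pending` next to the byte offset (for instance as JSON) and load it again at the start of each run. For messages with more than one recipient, a list of replies per message id may be a better fit than the single value kept here.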
