<p>Try the following (the regular expression is left untouched):</p>
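<pre><code>import glob
import gzip
import re

# Regex left as in the question: captures timestamp, method, URL, status, size, referrer.
logformat = re.compile(r'^\S+ \S+ \S+ \[([\w:/]+\s[+\-]\d{4})\] "(\S+) (\S+) .*" (\d+) (\d+) "([^"]*)" "[^"]*"')

with open('Logs.txt', 'w') as f_out:
    for i in glob.glob('*.gz'):
        # Text mode (Python 3) yields str lines, which the str pattern can match.
        with gzip.open(i, 'rt', errors='replace') as f_in:
            for txtline in f_in:
                parsedline = logformat.match(txtline)
                if parsedline:
                    # Groups: 1 = timestamp, 5 = response size, 3 = URL.
                    # The trailing newline keeps one record per output line.
                    f_out.write("time={t} size={s} url={u}\n".format(t=parsedline.group(1), s=parsedline.group(5), u=parsedline.group(3)))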
</code></pre>
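<p>To check which groups the pattern captures before running it over every archive, it can help to try it on a single line first. The sketch below is only an illustration: the sample line is made up (it is not taken from your logs), and <code>summarize</code> is a hypothetical helper name, not part of the code above.</p>

```python
import re

# Same pattern as in the answer above, unchanged.
LOGFORMAT = re.compile(r'^\S+ \S+ \S+ \[([\w:/]+\s[+\-]\d{4})\] "(\S+) (\S+) .*" (\d+) (\d+) "([^"]*)" "[^"]*"')

# Hypothetical log line in combined log format, invented for illustration.
SAMPLE = ('127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326 '
          '"http://example.com/start.html" "Mozilla/4.08"')


def summarize(line):
    """Return the 'time=... size=... url=...' record, or None if the line does not match."""
    m = LOGFORMAT.match(line)
    if m is None:
        return None
    return "time={t} size={s} url={u}".format(t=m.group(1), s=m.group(5), u=m.group(3))


print(summarize(SAMPLE))
# prints: time=10/Oct/2000:13:55:36 -0700 size=2326 url=/index.html
```

<p>Note that the size group is <code>(\d+)</code>, so a line whose size field is <code>-</code> will not match and is silently skipped.</p>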