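<p>If you search for "why is Python gzip slow", you'll find plenty of discussion of this issue, including patches that improved things in Python 2.7 and 3.2. In the meantime, use zcat as you did in Perl, which is very fast. With a 5 MB compressed file, your (first) function takes 4.19 s for me and the second function takes 0.78 s. However, I don't know what is going on with your uncompressed files. If I decompress the log files (Apache logs) and run the two functions on them, using a plain Python open(file) and Popen('cat'), Python is faster (0.17 s) than cat (0.48 s).</p>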
<pre>
#!/usr/bin/python
# Python 2 script: times the gzip module against piping through zcat.
import gzip
from subprocess import PIPE, Popen
import sys
import timeit

#pathToLog = 'big.log.gz' # 50M compressed (*10 uncompressed)
pathToLog = 'small.log.gz' # 5M compressed (*10 uncompressed)

def test_ori():
    # Read the compressed log line by line with the gzip module.
    counter = 0
    f = gzip.open(pathToLog, 'r')
    for line in f:
        counter = counter + 1
        if (counter % 100000 == 0): # 1000000
            print counter, line
    f.close()  # the call parentheses are required; a bare f.close does nothing

def test_new():
    # Let an external zcat process decompress, then split its output into lines.
    counter = 0
    content = Popen(["zcat", pathToLog], stdout=PIPE).communicate()[0].split('\n')
    for line in content:
        counter = counter + 1
        if (counter % 100000 == 0): # 1000000
            print counter, line

if '__main__' == __name__:
    # Run each function once and report the wall-clock time.
    to = timeit.Timer('test_ori()', 'from __main__ import test_ori')
    print "Original function time", to.timeit(1)
    tn = timeit.Timer('test_new()', 'from __main__ import test_new')
    print "New function time", tn.timeit(1)
</pre>
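<p>As a rough sketch of the same idea in Python 3 (not part of the timings above): the zcat approach can also read the pipe line by line instead of loading the whole output with communicate(). The function names count_lines_gzip and count_lines_zcat are made up for illustration, and the second one assumes a zcat binary is available on the PATH.</p>

```python
import gzip
import shutil
import subprocess


def count_lines_gzip(path):
    """Count lines using the standard-library gzip module."""
    with gzip.open(path, "rb") as f:
        return sum(1 for _ in f)


def count_lines_zcat(path):
    """Count lines by streaming the output of an external zcat process."""
    # Assumption: zcat is installed; fail early with a clear message if not.
    if shutil.which("zcat") is None:
        raise RuntimeError("zcat not found on PATH")
    # The context manager closes the pipe and waits for the child on exit.
    with subprocess.Popen(["zcat", path], stdout=subprocess.PIPE) as proc:
        # Iterating over the pipe yields one line (as bytes) at a time,
        # so the decompressed data is never held in memory all at once.
        count = sum(1 for _ in proc.stdout)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, proc.args)
    return count
```

<p>Iterating over the pipe keeps memory use flat for large logs, and unlike split('\n') it does not yield a trailing empty string when the output ends with a newline. Whether it beats the gzip module on your Python version is something to measure yourself, for example with timeit as in the script above.</p>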