<p>A quick example of parsing the line with a regular expression:</p>
<pre><code>>>> import re
>>> line = 'A@1:100;2:240'
>>> data = re.search(r'@(\d+):(\d+);(\d+):(\d+)',line).groups()
>>> D = {data[0]:data[1],data[2]:data[3]}
>>> D
{'1': '100', '2': '240'}
</code></pre>
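<p>Here are some timings, comparing three ways of parsing the line:</p>
<pre><code># Saved as x.py so that timeit can import it
import re

# Compile once at import time so the pattern is not re-parsed per call
regex = re.compile(r'@(\d+):(\d+);(\d+):(\d+)')

def optionA(line):
    # Split on "@", then build the dict with an explicit loop
    _id, info = line.split("@")
    data = {}
    for g_info in info.split(";"):
        k, v = g_info.split(":")
        data[k] = v
    return data

def optionB(line):
    # Same splits, but build the dict with map() and a lambda
    _id, info = line.split("@")
    return dict(map(lambda i: i.split(":"), info.split(";")))

def optionC(line):
    # Fixed-shape regex: expects exactly two key:value pairs
    data = regex.search(line).groups()
    return {data[0]: data[1], data[2]: data[3]}

line = 'A@1:100;2:240'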
</code></pre>
<p>Results:</p>
<pre><code>C:\>py -m timeit -s "import x" "x.optionA(x.line)"
100000 loops, best of 3: 3.01 usec per loop
C:\>py -m timeit -s "import x" "x.optionB(x.line)"
100000 loops, best of 3: 5.15 usec per loop
C:\>py -m timeit -s "import x" "x.optionC(x.line)"
100000 loops, best of 3: 2.88 usec per loop
</code></pre>
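<p>The same kind of measurement can also be taken from inside Python with the standard <code>timeit</code> module instead of the command line. The following is a minimal sketch, not the setup used for the numbers above: it defines <code>optionA</code> and <code>optionC</code> inline rather than importing them from <code>x</code>, and the helper name <code>best_usec</code> and its loop counts are illustrative assumptions. Wrapping the call in a <code>lambda</code> adds a little overhead, so absolute numbers will differ from the command-line runs.</p>
<pre><code>import re
import timeit

regex = re.compile(r'@(\d+):(\d+);(\d+):(\d+)')
line = 'A@1:100;2:240'

def optionA(line):
    _id, info = line.split("@")
    data = {}
    for g_info in info.split(";"):
        k, v = g_info.split(":")
        data[k] = v
    return data

def optionC(line):
    data = regex.search(line).groups()
    return {data[0]: data[1], data[2]: data[3]}

def best_usec(func, number=10000, repeat=3):
    """Best per-call time in microseconds (hypothetical helper)."""
    timer = timeit.Timer(lambda: func(line))
    # Take the minimum of the repeats, as "best of 3" does on the command line
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

if __name__ == "__main__":
    for func in (optionA, optionC):
        print("%s: %.2f usec per loop" % (func.__name__, best_usec(func)))
</code></pre>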
<p><strong>Edit:</strong> Since the requirements changed slightly, I tried a <code>findall</code> version of <code>optionC</code> and a slightly different version of <code>optionA</code>:</p>
<pre><code>import re

# Matches any number of key:value pairs, not just two
regex = re.compile(r'(\d+):(\d+)')

def optionA(line):
    _id, info = line.split("@")
    data = {}
    for g_info in info.split(";"):
        k, v = g_info.split(":")
        data[k] = v
    return data

def optionAA(line):
    # Skip the first split by slicing off the leading "A@"
    # (assumes the ID before "@" is exactly one character)
    data = {}
    for g_info in line[2:].split(";"):
        k, v = g_info.split(":")
        data[k] = v
    return data

def optionB(line):
    _id, info = line.split("@")
    return dict(map(lambda i: i.split(":"), info.split(";")))

def optionC(line):
    # findall returns a list of (key, value) tuples, which dict() accepts
    return dict(regex.findall(line))

line = 'A@1:100;2:240;3:250;4:260;5:100;6:100;7:100;8:100;9:100;10:100'
</code></pre>
<p>Timings:</p>
<pre><code>C:\>py -m timeit -s "import x" "x.optionA(x.line)"
100000 loops, best of 3: 8.35 usec per loop
C:\>py -m timeit -s "import x" "x.optionAA(x.line)"
100000 loops, best of 3: 8.17 usec per loop
C:\>py -m timeit -s "import x" "x.optionB(x.line)"
100000 loops, best of 3: 12.3 usec per loop
C:\>py -m timeit -s "import x" "x.optionC(x.line)"
100000 loops, best of 3: 12.8 usec per loop
</code></pre>
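<p>So it looks like the modified <code>optionAA</code> wins on this line. Hopefully this illustrates the importance of measuring your algorithms. I was surprised that the <code>findall</code> version was slower.</p>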
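<p>One caveat: <code>optionAA</code> relies on <code>line[2:]</code>, so it assumes the part before <code>@</code> is exactly one character long. As a hedged sketch, a variant built on <code>str.partition</code> drops that assumption. The name <code>optionAP</code> is hypothetical, and it has not been timed here, so measure it before drawing any conclusions about speed. The assertions only check that it agrees with the <code>findall</code> approach.</p>
<pre><code>import re

regex = re.compile(r'(\d+):(\d+)')

def optionAP(line):
    # Hypothetical variant: keep everything after the first "@",
    # regardless of how long the ID in front of it is
    _id, _sep, info = line.partition("@")
    data = {}
    for g_info in info.split(";"):
        k, v = g_info.split(":")
        data[k] = v
    return data

def optionC(line):
    return dict(regex.findall(line))

line = 'A@1:100;2:240;3:250'
assert optionAP(line) == optionC(line)
# A two-character ID is handled as well
assert optionAP('AB@1:100') == {'1': '100'}
</code></pre>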