<p>This is because <code>\w</code> also matches digit characters:</p>
<pre><code>>>> import re
>>> re.match(r'\w*', '12345')
<_sre.SRE_Match object at 0x021241E0>
>>> re.match(r'\w*', '12345').group()
'12345'
>>>
</code></pre>
<p>You need to be more specific and tell Python that you only want letters:</p>
^{pr2}$
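<p>A minimal sketch of that idea, assuming ASCII letters are what is wanted: the character class <code>[A-Za-z]</code> (the same one used in the pattern for the second question) matches letters only, so it rejects a string that starts with digits. The name <code>letters_only</code> is purely illustrative.</p>

```python
import re

# [A-Za-z] matches ASCII letters only, unlike \w, which also
# matches digits and the underscore.
letters_only = re.compile(r'[A-Za-z]+')

# No letters at the start of the string, so match() returns None.
print(letters_only.match('12345'))

# Only the leading run of letters is matched.
print(letters_only.match('abc123').group())  # 'abc'
```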
<hr/>
<p>As for your second question, you could use something like the following:</p>
<pre><code>import re
# Dictionary to hold the results
results = {}
# Break-up the file text to get the names and their associated data.
# filetext2.split('\n\n') breaks it up into individual data blocks (one per person).
# Mapping to str.splitlines breaks each data block into single lines.
for name, *data in map(str.splitlines, filetext2.split('\n\n')):
    # See if the name matches our pattern.
    if re.match(r'[A-Za-z]*\d{5}:', name):
        # Add the name and the relevant data to the results dictionary.
        # [:-1] gets rid of the colon on the end of the name.
        # The list comprehension gets only the file names from the data.
        results[name[:-1]] = [x for x in data if x.endswith('.zip')]
</code></pre>
<p>Or, without all the comments:</p>
<pre><code>import re
results = {}
for name, *data in map(str.splitlines, filetext2.split('\n\n')):
    if re.match(r'[A-Za-z]*\d{5}:', name):
        results[name[:-1]] = [x for x in data if x.endswith('.zip')]
</code></pre>
<p>Here is a demonstration:</p>
<pre><code>>>> import re
>>> filetext2 = '''\
... john123:
... 1
... 2
... coconut_rum.zip
...
... bob234513253:
... 0
... jackdaniels.zip
... nowater.zip
... 3
...
... judy88009:
... dontdrink.zip
... 9
...
... tommi54321:
... dontdrinkalso.zip
... 92
... '''
>>> results = {}
>>> for name, *data in map(str.splitlines, filetext2.split('\n\n')):
...     if re.match(r'[A-Za-z]*\d{5}:', name):
...         results[name[:-1]] = [x for x in data if x.endswith('.zip')]
...
>>> results
{'tommi54321': ['dontdrinkalso.zip'], 'judy88009': ['dontdrink.zip']}
>>>
</code></pre>
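<p>Keep in mind, however, that reading the entire contents of a file at once is not very efficient. Instead, you should consider writing a generator function that yields one data block at a time. You can also improve performance by precompiling the regex pattern.</p>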
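<p>One way this could look, as a rough sketch rather than a definitive implementation: the helper names <code>iter_blocks</code> and <code>collect_zip_files</code> and the constant <code>NAME_PATTERN</code> are hypothetical, and the code assumes that data blocks are separated by blank lines, as in the demonstration above.</p>

```python
import re

# Compiled once and reused for every block; same pattern as above.
NAME_PATTERN = re.compile(r'[A-Za-z]*\d{5}:')


def iter_blocks(lines):
    """Yield one data block at a time as a list of lines.

    `lines` can be any iterable of strings, such as an open file,
    so only the current block is held in memory.
    """
    block = []
    for line in lines:
        line = line.rstrip('\n')
        if line:
            block.append(line)
        elif block:
            # A blank line ends the current block.
            yield block
            block = []
    if block:
        # Last block, in case the input does not end with a blank line.
        yield block


def collect_zip_files(lines):
    """Map each matching name to the .zip entries in its block."""
    results = {}
    for name, *data in iter_blocks(lines):
        if NAME_PATTERN.match(name):
            results[name[:-1]] = [x for x in data if x.endswith('.zip')]
    return results
```

<p>Because a file object iterates line by line, it can be passed in directly, for example <code>with open(path) as f: results = collect_zip_files(f)</code>, where <code>path</code> stands for wherever your data lives.</p>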