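<p>Before writing any code, you should work out the correct specification for the files you want to match. The pseudo-regexp you gave for the file names ("<code>[number][a-z]</code> or <code>[a-z][number]</code>") does not even cover the example you provided, such as <code>0-file</code>.</p>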
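<p>A quick check makes the gap concrete. This is only a sketch: the two patterns are an assumed literal reading of the pseudo-regexp (digits then lowercase letters, or lowercase letters then digits), and <code>matches_spec</code> is a hypothetical helper name.</p>
<pre><code>import re

# Assumed literal reading of the pseudo-regexp: digits then letters, or letters then digits.
PATTERNS = (r"\d+[a-z]+$", r"[a-z]+\d+$")

def matches_spec(name):
    """Return True if name fits either assumed pattern."""
    return any(re.match(p, name) for p in PATTERNS)

for name in ("0-file", "000foo", "bar14"):
    print("%s: %s" % (name, matches_spec(name)))
</code></pre>
<p>Under that reading, <code>0-file</code> is rejected because of the hyphen, while <code>000foo</code> and <code>bar14</code> are accepted.</p>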
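<h2>Simple version</h2>
<p>Still, taking the specification at face value, and assuming you also want to include uppercase Latin letters, here is a simple function that matches <code>[number][a-z]</code> or <code>[a-z][number]</code> and returns the appropriate prefix, suffix, and number of digits.</p>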
<pre><code>import re

def find_number_in_filename(fn):
    """Return (prefix, suffix, num_length); num_length is 0 if fn does not match."""
    # digits followed by letters, e.g. "000foo"
    m = re.match(r"(\d+)([A-Za-z]+)$", fn)
    if m:
        prefix, suffix, num_length = "", m.group(2), len(m.group(1))
        return prefix, suffix, num_length
    # letters followed by digits, e.g. "bar14"
    m = re.match(r"([A-Za-z]+)(\d+)$", fn)
    if m:
        prefix, suffix, num_length = m.group(1), "", len(m.group(2))
        return prefix, suffix, num_length
    return fn, "", 0

example_fn = ("000foo", "bar14", "baz0", "file10name")
for fn in example_fn:
    prefix, suffix, num_length = find_number_in_filename(fn)
    if num_length == 0:
        print("%s: does not match" % fn)
    else:
        print("%s -> %s[%d-digits]%s" % (fn, prefix, num_length, suffix))
        # every zero-padded number with the same digit count as the input
        all_numbered_versions = [("%s%0" + str(num_length) + "d%s") % (prefix, ii, suffix) for ii in range(0, 10**num_length)]
        print("\t%s through %s" % (all_numbered_versions[0], all_numbered_versions[-1]))
</code></pre>
<p>The output is:</p>
<pre><code>000foo -> [3-digits]foo
        000foo through 999foo
bar14 -> bar[2-digits]
        bar00 through bar99
baz0 -> baz[1-digits]
        baz0 through baz9
file10name: does not match
</code></pre>
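<p>Note that I am using standard <code>printf</code>-style string formatting to convert the numbers into zero-padded strings, e.g. <code>%03d</code> for a zero-padded 3-digit number. Using the newer <a href="https://docs.python.org/2/library/stdtypes.html#str.format" rel="nofollow"><code>str.format</code></a> may be better for future-proofing.</p>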
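<p>For comparison, the same zero-padding could be written with <code>str.format</code> roughly as follows. This is a minimal sketch; <code>numbered_name</code> is a hypothetical helper, not part of the code above.</p>
<pre><code>def numbered_name(prefix, number, num_length, suffix):
    """Build prefix + zero-padded number + suffix using str.format."""
    # The nested {width} field supplies the padding width at run time.
    return "{prefix}{number:0{width}d}{suffix}".format(
        prefix=prefix, number=number, width=num_length, suffix=suffix)

print(numbered_name("bar", 7, 2, ""))    # bar07
print(numbered_name("", 42, 3, "foo"))   # 042foo
</code></pre>
<p>The nested replacement field avoids building the format string by concatenation, as <code>"%s%0"+str(num_length)+"d%s"</code> does.</p>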
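<h2>Handling full paths and extensions gracefully</h2>
<p>If your input contains full paths and file names with extensions (e.g. <code>/home/someone/project/foo000.txt</code>), and you only want to match against the last segment of the path, use <code>os.path.split</code> and <code>.splitext</code> to do so.</p>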
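<p>As a rough illustration of how those two calls take a path apart, the sketch below uses <code>posixpath</code> (the POSIX flavour of <code>os.path</code>) so that the result does not depend on the host operating system; that substitution is an assumption made for the example only.</p>
<pre><code>import posixpath  # POSIX flavour of os.path, so results do not depend on the host OS

path = "/home/someone/project/foo000.txt"
head, tail = posixpath.split(path)   # directory part and last path segment
fn, ext = posixpath.splitext(tail)   # name without extension, and the extension
head = posixpath.join(head, "")      # re-append the trailing separator

print(head)  # /home/someone/project/
print(fn)    # foo000
print(ext)   # .txt
</code></pre>
<p>Only <code>fn</code> is then matched against the regular expressions; <code>head</code> and <code>ext</code> are glued back onto the prefix and suffix.</p>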
<p><strong>Update:</strong> fixed the missing path separator.</p>
<pre><code>import re
import os.path

def find_number_in_filename(path):
    """Return (prefix, suffix, num_length); num_length is 0 if path does not match."""
    # remove the path and the extension
    head, tail = os.path.split(path)
    head = os.path.join(head, "")  # include / or \ on the end of head if it's missing
    fn, ext = os.path.splitext(tail)
    # digits followed by letters: the directory is the prefix
    m = re.match(r"(\d+)([A-Za-z]+)$", fn)
    if m:
        prefix, suffix, num_length = head, m.group(2) + ext, len(m.group(1))
        return prefix, suffix, num_length
    # letters followed by digits: the extension is the suffix
    m = re.match(r"([A-Za-z]+)(\d+)$", fn)
    if m:
        prefix, suffix, num_length = head + m.group(1), ext, len(m.group(2))
        return prefix, suffix, num_length
    return path, "", 0

example_paths = ("/tmp/bar14.so", "/home/someone/0000baz.txt", "/home/someone/baz00bar.zip")
for path in example_paths:
    prefix, suffix, num_length = find_number_in_filename(path)
    if num_length == 0:
        print("%s: does not match" % path)
    else:
        print("%s -> %s[%d-digits]%s" % (path, prefix, num_length, suffix))
        # every zero-padded number with the same digit count as the input
        all_numbered_versions = [("%s%0" + str(num_length) + "d%s") % (prefix, ii, suffix) for ii in range(0, 10**num_length)]
        print("\t%s through %s" % (all_numbered_versions[0], all_numbered_versions[-1]))
</code></pre>