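<p>I would split the task into separate concerns: first build a dictionary that groups files sharing the same root name, then check which groups contain both a video file and a subtitle file. (Please don't use a regex to split the file names; <code>os.path</code> does a better job here.)</p>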
<pre><code>from collections import defaultdict
import os

mylist = ['movie1.mp4', 'movie2.srt', 'movie1.srt', 'movie3.mp4', 'movie1.mp4']

# Group filenames by root name: {root: {extension: filename}}
movies = defaultdict(dict)
for filename in mylist:
    name, ext = os.path.splitext(filename)
    movies[name][ext] = filename

sub_extensions = {'.txt', '.srt'}
movie_extensions = {'.mp4', '.avi'}

for name, files in movies.items():
    files_set = set(files.keys())
    if not files_set & sub_extensions:
        continue  # no subs
    elif not files_set & movie_extensions:
        continue  # no movie
    else:
        print(name, list(files.values()))
# output (Python 3.7+): movie1 ['movie1.mp4', 'movie1.srt']
</code></pre>
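<p>If you need this in more than one place, the same two steps can be wrapped in a function. This is only a minimal sketch: the name <code>find_movies_with_subs</code> is hypothetical, the extension sets are just the ones from the example, and lowercasing the extension (so that <code>.AVI</code> matches <code>.avi</code>) is my own assumption about what you want.</p>

```python
from collections import defaultdict
import os

# Assumed extension sets, copied from the example; adjust to your collection.
SUB_EXTENSIONS = {'.txt', '.srt'}
MOVIE_EXTENSIONS = {'.mp4', '.avi'}


def find_movies_with_subs(filenames):
    """Return {root name: sorted filenames} for roots that have
    both a video file and a subtitle file."""
    # Step 1: group filenames by root name, keyed by lowercased extension.
    by_root = defaultdict(dict)
    for filename in filenames:
        root, ext = os.path.splitext(filename)
        by_root[root][ext.lower()] = filename

    # Step 2: keep only the groups that contain both kinds of file.
    result = {}
    for root, files in by_root.items():
        has_subs = bool(files.keys() & SUB_EXTENSIONS)
        has_movie = bool(files.keys() & MOVIE_EXTENSIONS)
        if has_subs and has_movie:
            result[root] = sorted(files.values())
    return result
```

<p>Calling it with the list from the example returns a dictionary with a single key, <code>movie1</code>. Returning a dictionary instead of printing keeps the grouping logic separate from whatever you do with the matches afterwards.</p>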
<p>Also, how do you plan to handle <code>.mkv</code> files that come with embedded subtitles? ;)</p>