<p>(Answering the edited question.)</p>
<p>Doing this in shell is awkward (and hard to read), so I resorted to Python:</p>
<pre><code>#!/usr/bin/env python3
import os
import re
import pprint
from subprocess import call

group1 = {}  # base path -> full path of the *_1.txt file
group2 = {}  # base path -> full path of the *_2.txt file

for root, directories, filenames in os.walk('.'):
    for filename in filenames:
        ff = os.path.join(root, filename)
        if filename.endswith("_1.txt"):
            base = re.sub(r'_1\.txt$', '', ff)
            group1[base] = ff
        if filename.endswith("_2.txt"):
            base = re.sub(r'_2\.txt$', '', ff)
            group2[base] = ff

#pprint.pprint(group1)
#pprint.pprint(group2)

# find the common ones: base paths that have both a _1 and a _2 file
# (the built-in set replaces the Python 2 "sets" module, which Python 3 lacks)
list1 = set(group1).intersection(group2)
#pprint.pprint(list1)

# call myscript.py in each directory; 'echo' makes this a dry run,
# remove it to actually run the script
cwd = os.getcwd()
for base in list1:
    path, filename = os.path.split(base)
    #print(path, " ", filename)
    try:
        os.chdir(path)
        call(['echo', 'myscript.py', filename + "_1.txt", filename + "_2.txt", "outputfile"])
    finally:
        os.chdir(cwd)
</code></pre>
<p>(Apologies for the poor Python style: I am really a Perl programmer.)</p>
<hr/>
<blockquote>
<p>Most recursive solutions I have seen so far use either find or grep for each individual file however I need the location as well, to get them in pairs and write to disk at the appropriate place. </p>
</blockquote>
<p>Don't iterate over files; iterate over directories. An example in shell:</p>
^{pr2}$
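<p>The same directory-first idea can also be sketched in Python. This is only an illustration, not the original example: the helper name <code>dirs_with_pairs</code> is hypothetical, the file names <code>xyz_1.gz</code> and <code>xyz_2.gz</code> are assumed from the rest of this answer, and the command is printed rather than executed.</p>

```python
import os

# Assumed file names; adjust to whatever pair the real script expects.
REQUIRED = ("xyz_1.gz", "xyz_2.gz")


def dirs_with_pairs(top="."):
    """Yield every directory under `top` that contains all REQUIRED files."""
    for root, _dirs, filenames in os.walk(top):
        if all(name in filenames for name in REQUIRED):
            yield root


if __name__ == "__main__":
    for d in dirs_with_pairs("."):
        # Dry run: show what would be executed in each matching directory.
        print("in", d, "run: myscript.py", *REQUIRED, "outputfile")
```

<p>Because each directory is visited exactly once, the location of the pair is known as soon as it is found, so there is nothing to extract from a file name afterwards.</p>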
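<p>Alternatively, you can still iterate over files and let <code>find</code> locate one file of each pair for us. Then extract the directory from the file name that was found:</p>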
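<pre><code>find . -type f -name xyz_1.gz -print |
while IFS= read -r FN; do
    DIR=$(dirname "$FN")
    # skip directories that lack the other required files
    test -r "$DIR/xyz_2.gz" || continue
    test -r "$DIR/some_other_file" || continue
    # run in a subshell so the cd does not affect the loop
    ( cd "$DIR" || exit; myscript.py xyz_1.gz xyz_2.gz outputfile )
done
</code></pre>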
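<p>You could also move the initial <code>cd $DIR</code> step (as <code>os.chdir()</code>) into the Python script itself: pass the directory as an argument or environment variable, and have the script check for its input files (for example, quit silently if they do not exist).</p>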
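<p>A minimal sketch of what that could look like inside the script, under stated assumptions: the input names match the shell example above, the environment variable <code>MYSCRIPT_DIR</code> and the helper <code>resolve_inputs</code> are hypothetical names chosen for illustration, and the actual processing is left as a placeholder.</p>

```python
import os

# Assumed input names, matching the shell example above.
INPUTS = ("xyz_1.gz", "xyz_2.gz")


def resolve_inputs(directory, names=INPUTS):
    """Return full paths of the inputs, or None if any is missing or unreadable."""
    paths = [os.path.join(directory, name) for name in names]
    if all(os.path.isfile(p) and os.access(p, os.R_OK) for p in paths):
        return paths
    return None


def main(argv, environ=os.environ):
    # The directory comes from the first argument, else from the hypothetical
    # MYSCRIPT_DIR environment variable, else the current directory.
    directory = argv[1] if len(argv) > 1 else environ.get("MYSCRIPT_DIR", ".")
    inputs = resolve_inputs(directory)
    if inputs is None:
        return 0  # nothing to do in this directory: quit silently
    output = os.path.join(directory, "outputfile")
    # Placeholder: the real processing of `inputs` into `output` goes here.
    print("would process", inputs, "->", output)
    return 0
```

<p>At the bottom of the script this would be invoked as <code>sys.exit(main(sys.argv))</code>. Building full paths instead of calling <code>os.chdir()</code> keeps the working directory unchanged, and the caller shrinks to something like <code>find . -type d -exec myscript.py {} \;</code>.</p>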