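<p>After my first attempt, some additional web searching turned up: <a href="https://stackoverflow.com/questions/773/how-do-i-use-pythons-itertools-groupby">How do I use Python's itertools.groupby()?</a></p>
<p>This is my current approach. Please let me know whether it can be made more Pythonic.</p>
<p>loadfile1.txt (no grouping variable; output is the same as for loadfile4.txt):</p>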
<pre><code>pgm1
pgm2
pgm3
pgm4
pgm5
pgm6
pgm7
pgm8
/a/path/with spaces/pgm9
</code></pre>
<p>loadfile2.txt (random grouping variable):</p>
<pre><code>10, pgm1
10, pgm2
10, pgm3
ZZ, pgm4
ZZ, pgm5
-5, pgm6
-5, pgm7
-5, pgm8
-5, /a/path/with spaces/pgm9
</code></pre>
<p>loadfile3.txt (same grouping variable, no dependencies, multi-threaded):</p>
<pre><code>,pgm1
,pgm2
,pgm3
,pgm4
,pgm5
,pgm6
,pgm7
,pgm8
,/a/path/with spaces/pgm9
</code></pre>
<p>loadfile4.txt (different grouping variables, dependencies, single-threaded):</p>
<pre><code>1, pgm1
2, pgm2
3, pgm3
4, pgm4
5, pgm5
6, pgm6
7, pgm7
8, pgm8
9, /a/path/with spaces/pgm9
</code></pre>
<p>My Python script:</p>
<pre><code>#!/usr/bin/python
from itertools import groupby

# See https://stackoverflow.com/questions/4842057/python-easiest-way-to-ignore-blank-lines-when-reading-a-file
# convert file to list of lines, ignoring any blank lines
filename = 'loadfile2.txt'
with open(filename) as f_in:
    # list() keeps this a real list on Python 3, where filter() is lazy
    lines = list(filter(None, (line.rstrip() for line in f_in)))
print(lines)

# convert list to a list of lists split on comma
lines = [i.split(',') for i in lines]
print(lines)

# create list of lists based on the key value (first item in sub-lists)
listofpgms = []
for key, group in groupby(lines, lambda x: x[0]):
    pgms = []
    for pgm in group:
        try:
            pgms.append(pgm[1].strip())
        except IndexError:
            # no comma on the line: the only field is the program itself
            pgms.append(pgm[0].strip())
    listofpgms.append(pgms)
print(listofpgms)
</code></pre>
<p>Output when using loadfile2.txt:</p>
<pre><code>['10, pgm1', '10, pgm2', '10, pgm3', 'ZZ, pgm4', 'ZZ, pgm5', '-5, pgm6', '-5, pgm7', '-5, pgm8', '-5, /a/path/with spaces/pgm9']
[['10', ' pgm1'], ['10', ' pgm2'], ['10', ' pgm3'], ['ZZ', ' pgm4'], ['ZZ', ' pgm5'], ['-5', ' pgm6'], ['-5', ' pgm7'], ['-5', ' pgm8'], ['-5', ' /a/path/with spaces/pgm9']]
[['pgm1', 'pgm2', 'pgm3'], ['pgm4', 'pgm5'], ['pgm6', 'pgm7', 'pgm8', '/a/path/with spaces/pgm9']]
</code></pre>
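<p>For comparison, here is a minimal sketch of one way the same grouping might be written more compactly. It is only an illustration, not a definitive answer: the function name <code>group_programs</code> and the <code>sample</code> list are hypothetical, and it assumes that splitting on the first comma only (via <code>str.partition</code>) is acceptable. As in the script above, <code>groupby</code> merges only consecutive lines that share a key, so the load file order still matters.</p>

```python
from itertools import groupby
from operator import itemgetter


def group_programs(lines):
    """Group program names by the key that precedes the first comma.

    Hypothetical helper: lines without a comma use the whole line as
    both key and program, mirroring the IndexError fallback above.
    """
    pairs = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines
        key, sep, pgm = line.partition(',')
        if not sep:
            pgm = key  # no comma: the line is just the program
        pairs.append((key, pgm.strip()))
    # groupby only merges *consecutive* items that have equal keys
    return [[pgm for _, pgm in group]
            for _, group in groupby(pairs, key=itemgetter(0))]


# Made-up sample input in the loadfile2.txt style
sample = ['10, pgm1', '10, pgm2', 'ZZ, pgm3', '', '-5, /a/path/with spaces/pgm9']
print(group_programs(sample))
```

<p>To run it against one of the load files, the open file object can be passed directly, for example <code>group_programs(f_in)</code> inside the <code>with open(filename) as f_in:</code> block.</p>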