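<p>Here is another approach:</p>
<p>While building the <code>email_to_indices</code> dictionary, you can store each row's phone number as the value, and let <code>phone_to_indices</code> hold the row's index. That way we build a map from <code>email_to_indices</code> to <code>phone_to_indices</code> to the row index.</p>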
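<p>As a minimal sketch of that two-step lookup (email to phone numbers, then phone numbers to row indices): the three-row sample and the names <code>email_to_phones</code>, <code>phone_to_rows</code>, and <code>rows_for_email</code> are hypothetical and chosen only for illustration.</p>

```python
from collections import defaultdict

# Hypothetical three-row sample of (email, phone) pairs, for illustration only.
rows = [('a@gmail.com', '1'), ('b@gmail.com', '1'), ('a@gmail.com', '2')]

email_to_phones = defaultdict(list)  # email -> phone numbers seen with it
phone_to_rows = defaultdict(list)    # phone -> 1-based row indices

for idx, (email, phone) in enumerate(rows, start=1):
    email_to_phones[email.lower()].append(phone)
    phone_to_rows[phone].append(idx)


def rows_for_email(email):
    """Follow email -> phones -> row indices and return the rows as a set."""
    found = set()
    for phone in email_to_phones.get(email.lower(), []):
        found.update(phone_to_rows[phone])
    return found


print(sorted(rows_for_email('b@gmail.com')))
```

<p>Row 2 shares phone <code>1</code> with row 1, so looking up <code>b@gmail.com</code> reaches both rows even though their emails differ.</p>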
<p>With a small modification and some basic set operations, I can get what you want:</p>
<pre><code>from collections import defaultdict

email_to_indices = defaultdict(list)  # email -> phone numbers seen with that email
phone_to_indices = defaultdict(list)  # phone -> row indices
combined = defaultdict(set)           # group key -> set of row indices

original = [['email', 'tel', 'fecha', 'descripcion', 'categ'],
            ['a@gmail.com', '1', '2014-08-06 00:00:06', 'MySpace a', 'animales'],
            ['b@gmail.com', '1', '2014-08-01 00:00:06', 'My Space a', 'ropa'],
            ['a@gmail.com', '2', '2014-08-06 00:00:06', 'My Space b', 'electronica'],
            ['b@gmail.com', '3', '2014-08-10 00:00:06', 'Myace c', 'animales'],
            ['c@gmail.com', '4', '2014-08-10 00:00:06', 'Myace c', 'animales']]

# Skip the header row; row indices start at 1.
for idx, row in enumerate(original[1:], start=1):
    email = row[0].lower()
    phone = row[1]
    email_to_indices[email].append(phone)  # Here is what I changed
    phone_to_indices[phone].append(idx)

random_key = 0
for idx, row in enumerate(original[1:], start=1):
    # Collect every row that shares a phone number with this row's email.
    grouped_rows = []
    if row[0].lower() in email_to_indices:
        for phone_no in email_to_indices[row[0].lower()]:
            grouped_rows.extend(phone_to_indices[phone_no])

    if len(combined[random_key]) > 0 and len(set(grouped_rows).intersection(combined[random_key])) > 0:
        # Overlaps the current group: merge into it.
        combined[random_key].update(set(grouped_rows))
    elif len(combined[random_key]) > 0:
        # No overlap with the current group: start a new one.
        random_key += 1
        combined[random_key].update(set(grouped_rows))
    else:
        # The current group is still empty: seed it.
        combined[random_key].update(set(grouped_rows))

print(combined)
</code></pre>
<p>This gives two groups: key <code>0</code> maps to rows 1, 2, 3, and 4, and key <code>1</code> maps to row 5.</p>
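<p>One caveat: the loop above only compares each row against the most recent group, so it relies on related rows appearing next to each other. If that ordering cannot be assumed, a union-find (disjoint-set) structure is one possible alternative. The sketch below is not the method used above; the function name <code>group_rows</code> and the reordered three-row sample are hypothetical.</p>

```python
from collections import defaultdict


def group_rows(rows):
    """Group 1-based row indices that share an email or phone, directly or transitively."""
    parent = {}

    def find(node):
        # Walk up to the representative, halving the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        # Register unseen nodes as their own representative, then merge.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for idx, row in enumerate(rows, start=1):
        # Link each row to its email node and to its phone node.
        union(("row", idx), ("email", row[0].lower()))
        union(("row", idx), ("phone", row[1]))

    groups = defaultdict(set)
    for node in parent:
        if node[0] == "row":
            groups[find(node)].add(node[1])
    # Order groups by their smallest row index so the result is stable.
    return sorted(groups.values(), key=min)


# Hypothetical sample where the two a@gmail.com rows are not adjacent.
rows = [
    ['a@gmail.com', '1'],
    ['c@gmail.com', '4'],
    ['a@gmail.com', '2'],
]
print(group_rows(rows))
```

<p>Here rows 1 and 3 end up in the same group even though row 2 sits between them, and each row belongs to exactly one group.</p>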