<p>First of all, a very well-written question. Thanks.</p>
<p>I'd suggest making a DataFrame for each family and concatenating them at the end.
You'll need to invert your <code>lookupMap</code>:</p>
<pre><code>In [80]: d = {'mammal': ['dolphin', 'lion', 'seal', 'tiger'],
   ....:      'insect': ['ladybug', 'locust', 'mosquito'],
   ....:      'fish': ['seabass', 'shark']}
</code></pre>
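<p>One way to do that inversion is sketched below. It assumes <code>lookupMap</code> is a plain dict mapping each species to its family; the literal shown here is reconstructed from the names in <code>d</code> and is only illustrative.</p>

```python
from collections import defaultdict

# Assumed shape of the question's lookupMap: species -> family.
lookupMap = {
    'dolphin': 'mammal', 'lion': 'mammal', 'seal': 'mammal', 'tiger': 'mammal',
    'ladybug': 'insect', 'locust': 'insect', 'mosquito': 'insect',
    'seabass': 'fish', 'shark': 'fish',
}

# Group species under their family: family -> [species, ...].
grouped = defaultdict(list)
for species, family in lookupMap.items():
    grouped[family].append(species)

d = dict(grouped)  # plain dict, same layout as the d used in this answer
```

<p>Because dicts keep insertion order on Python 3.7+, the species within each family stay in the order they appear in <code>lookupMap</code>.</p>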
<p>For example:</p>
^{pr2}$
<p>Now do this for each family:</p>
<pre><code>In [88]: for k, v in d.iteritems():  # Python 2; use d.items() on Python 3
   ....:     results.append(pd.DataFrame([habitat_family[k] for _ in v], index=v).T)
</code></pre>
<p>And concatenate:</p>
<pre><code>In [89]: habitat_species = pd.concat(results, axis=1)
In [90]: habitat_species
Out[90]:
    dolphin  lion  seal  tiger  ladybug  locust  mosquito  seabass  shark
1       101   101   101    101      345     345       345      625    625
2       123   123   123    123      928     928       928      254    254
3       523   523   523    523      183     183       183      929    929
4       562   562   562    562      645     645       645      827    827
5       546   546   546    546      113     113       113      102    102
6       213   213   213    213      942     942       942      295    295
7       562   562   562    562      689     689       689      174    174
8       234   234   234    234      539     539       539      777    777
9       987   987   987    987      789     789       789      123    123
10      901   901   901    901      814     814       814      763    763
[10 rows x 9 columns]
</code></pre>
<p>If you want a hierarchical index on the columns with (family, species) pairs, consider passing the families as the <code>keys</code> argument to <code>concat</code>.</p>
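<p>A minimal sketch of the <code>keys</code> idea, using a small made-up <code>habitat_family</code> table (two habitats, two families) rather than the real data. For brevity it builds each per-family frame with a dict comprehension instead of the transposed-list construction used above.</p>

```python
import pandas as pd

# Illustrative data only: habitat ids as the index, one column per family.
habitat_family = pd.DataFrame(
    {'mammal': [101, 123], 'fish': [625, 254]}, index=[1, 2]
)
d = {'mammal': ['dolphin', 'lion'], 'fish': ['seabass', 'shark']}

families = list(d)
# One frame per family: every species column repeats that family's values.
frames = [
    pd.DataFrame({species: habitat_family[family] for species in d[family]})
    for family in families
]

# keys= adds the family as an outer level of the column MultiIndex.
habitat_species = pd.concat(frames, axis=1, keys=families)
```

<p>With this layout, <code>habitat_species['mammal']</code> selects all mammal columns at once, and <code>habitat_species[('fish', 'shark')]</code> selects a single species.</p>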
<p>Since you said performance matters:</p>
<pre><code># Mine
In [97]: %%timeit
   ....: for k, v in d.iteritems():
   ....:     results.append(pd.DataFrame([habitat_family[k] for _ in v], index=v).T)
   ....: habitat_species = pd.concat(results, axis=1)
   ....:
1 loops, best of 3: 296 ms per loop
# Yours
In [98]: %%timeit
   ....: for id in habitat_family.index:  # loop through habitat ids
   ....:     for spec in species:  # loop through species
   ....:         corresp_family = lookupMap[spec]
   ....:         habitat_species.loc[id, spec] = habitat_family.loc[id, corresp_family]
10 loops, best of 3: 21.5 ms per loop
# Dan's
In [102]: %%timeit
   .....: habitat_species = habitat_family[Series(species).replace(lookupMap)]
   .....: habitat_species.columns = species
   .....:
100 loops, best of 3: 2.55 ms per loop
</code></pre>
<p>Looks like Dan wins by a long shot!</p>
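<p>For reference, here is a self-contained sketch of the column-lookup idea behind Dan's version. The data is made up, and it uses <code>Series.map</code> plus <code>.tolist()</code> in place of <code>Series.replace</code>, which is a substitution on my part rather than Dan's exact code.</p>

```python
import pandas as pd

# Illustrative data only: habitat ids as the index, one column per family.
habitat_family = pd.DataFrame(
    {'mammal': [101, 123], 'insect': [345, 928], 'fish': [625, 254]},
    index=[1, 2],
)
lookupMap = {'dolphin': 'mammal', 'lion': 'mammal',
             'ladybug': 'insect', 'shark': 'fish'}
species = ['dolphin', 'lion', 'ladybug', 'shark']

# Translate each species into its family name, keeping the species order.
family_per_species = pd.Series(species).map(lookupMap).tolist()

# Select the family columns (repeats allowed), then relabel them by species.
habitat_species = habitat_family[family_per_species].copy()
habitat_species.columns = species
```

<p>The whole lookup is a single column selection, which is likely why it avoids the overhead of the per-cell <code>.loc</code> loop.</p>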