<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html" rel="nofollow noreferrer"><code>groupby</code></a> with <code>size</code>. If combinations that never occur in the data should also appear in the result, add <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html" rel="nofollow noreferrer"><code>reindex</code></a> and fill the missing counts with <code>0</code>:</p>
<pre><code># build every (c_name, p_name) combination from the distinct values
mux = pd.MultiIndex.from_product([df1['c_name'].unique(), df1['p_name'].unique()], names=['c_name','p_name'])
df1 = (df1.groupby(['c_name','p_name']).size()
          .reindex(mux, fill_value=0).reset_index(name='Freq'))
</code></pre>
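<p>A minimal, self-contained sketch of the same idea follows. The sample data is hypothetical (only the column names <code>c_name</code> and <code>p_name</code> come from the answer), and the product is built from <code>.unique()</code> values on the assumption that each combination should appear exactly once in the result:</p>

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df1 = pd.DataFrame({
    'c_name': ['a', 'a', 'b', 'b', 'b'],
    'p_name': ['x', 'y', 'x', 'x', 'x'],
})

# All (c_name, p_name) combinations, built from the distinct values
# so that each pair is listed once.
mux = pd.MultiIndex.from_product(
    [df1['c_name'].unique(), df1['p_name'].unique()],
    names=['c_name', 'p_name'],
)

freq = (
    df1.groupby(['c_name', 'p_name'])
       .size()                      # count rows per observed pair
       .reindex(mux, fill_value=0)  # add unobserved pairs with count 0
       .reset_index(name='Freq')    # back to a flat DataFrame
)
print(freq)
```

<p>Here the pair <code>('b', 'y')</code> never occurs in the data, so it only shows up because of <code>reindex</code>, with a <code>Freq</code> of 0.</p>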
<hr/>
<p><strong>Timings</strong>:</p>
<p>The <code>groupby</code> solution is faster because it does not need <code>stack</code>:</p>
<pre><code>In [197]: %timeit pd.crosstab(df1['c_name'], df1['p_name']).stack().reset_index(name='Freq')
100 loops, best of 3: 6.74 ms per loop
In [198]: %timeit df1.groupby(['c_name','p_name']).size().reindex(pd.MultiIndex.from_product([df1['c_name'], df1['p_name']], names=['c_name','p_name']), fill_value=0).reset_index(name='Freq')
100 loops, best of 3: 3.12 ms per loop
</code></pre>
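<p>Before comparing speed, it can help to confirm that both approaches return the same counts. The sketch below does that on hypothetical sample data; the frame contents and the <code>.unique()</code> calls are assumptions for illustration, and actual timings will depend on the data size and the pandas version:</p>

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df1 = pd.DataFrame({
    'c_name': ['a', 'a', 'b', 'b', 'b'],
    'p_name': ['x', 'y', 'x', 'x', 'x'],
})

# Approach 1: crosstab, then stack the wide table back into long form.
via_crosstab = (
    pd.crosstab(df1['c_name'], df1['p_name'])
      .stack()
      .reset_index(name='Freq')
)

# Approach 2: groupby + size, then reindex to add the zero-count pairs.
mux = pd.MultiIndex.from_product(
    [df1['c_name'].unique(), df1['p_name'].unique()],
    names=['c_name', 'p_name'],
)
via_groupby = (
    df1.groupby(['c_name', 'p_name'])
       .size()
       .reindex(mux, fill_value=0)
       .reset_index(name='Freq')
)

# Sort both results the same way so row order cannot affect the comparison.
keys = ['c_name', 'p_name']
left = via_crosstab.sort_values(keys).reset_index(drop=True)
right = via_groupby.sort_values(keys).reset_index(drop=True)
same = left.values.tolist() == right.values.tolist()
print(same)
```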