<p>A <a href="http://pandas.pydata.org" rel="nofollow">pandas</a> example:</p>
<pre><code>>>> import pandas as pd
>>> df = pd.read_csv("grouped.csv", sep=r"[,\s]+", engine="python")
>>> df
   epochtime   name  score  level  extralives
0    1234455   suzy    120      3           0
1    1234457  billy    123      1           2
2    1234459  billy    124      2           4
3    1234459   suzy    224      5           4
4    1234460   suzy    301      7           1
5    1234461  billy    201      3           1
>>> g = df.groupby("name").describe()
>>> g
                  epochtime       score  level  extralives
name
billy count        3.000000    3.000000    3.0    3.000000
      mean   1234459.000000  149.333333    2.0    2.333333
      std          2.000000   44.747439    1.0    1.527525
      min    1234457.000000  123.000000    1.0    1.000000
      25%    1234458.000000  123.500000    1.5    1.500000
      50%    1234459.000000  124.000000    2.0    2.000000
      75%    1234460.000000  162.500000    2.5    3.000000
      max    1234461.000000  201.000000    3.0    4.000000
suzy  count        3.000000    3.000000    3.0    3.000000
      mean   1234458.000000  215.000000    5.0    1.666667
      std          2.645751   90.835015    2.0    2.081666
      min    1234455.000000  120.000000    3.0    0.000000
      25%    1234457.000000  172.000000    4.0    0.500000
      50%    1234459.000000  224.000000    5.0    1.000000
      75%    1234459.500000  262.500000    6.0    2.500000
      max    1234460.000000  301.000000    7.0    4.000000
</code></pre>
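<p>If you do not have <code>grouped.csv</code> at hand, the following is a minimal sketch that rebuilds the same sample data inline and summarises a single column per player. The comma-separated layout of <code>CSV_TEXT</code> and the names <code>CSV_TEXT</code> and <code>score_stats</code> are assumptions made for illustration, not part of the original file.</p>

```python
import io

import pandas as pd

# Inline copy of the sample rows shown above (assumed comma-separated),
# so the example runs without grouped.csv.
CSV_TEXT = """epochtime,name,score,level,extralives
1234455,suzy,120,3,0
1234457,billy,123,1,2
1234459,billy,124,2,4
1234459,suzy,224,5,4
1234460,suzy,301,7,1
1234461,billy,201,3,1
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Group by player, pick one column, and compute a few aggregates.
# The result has one row per name and one column per aggregate.
score_stats = df.groupby("name")["score"].agg(["count", "mean", "max"])
print(score_stats)
```

<p>Selecting the column before calling <code>agg</code> keeps the result small when you only care about one measurement.</p>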
<p>Or simply:</p>
^{pr2}$
<p>And then:</p>
^{3}$
<p>And so on. If you think in R/SQL terms but want to work in Python, definitely give pandas a try.</p>
<p>Note that you can also group by multiple columns:</p>
<pre><code>>>> df.groupby(["epochtime", "name"]).mean()
                 score  level  extralives
epochtime name
1234455   suzy     120      3           0
1234457   billy    123      1           2
1234459   billy    124      2           4
          suzy     224      5           4
1234460   suzy     301      7           1
1234461   billy    201      3           1
</code></pre>
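<p>A multi-column grouping returns a frame indexed by both keys. As a rough sketch, assuming the same sample data built inline (the names <code>by_time_and_name</code>, <code>row</code> and <code>flat</code> are hypothetical), you can look up one group by its key tuple or turn the keys back into ordinary columns:</p>

```python
import pandas as pd

# Same sample rows as above, built inline so the example is self-contained.
df = pd.DataFrame(
    {
        "epochtime": [1234455, 1234457, 1234459, 1234459, 1234460, 1234461],
        "name": ["suzy", "billy", "billy", "suzy", "suzy", "billy"],
        "score": [120, 123, 124, 224, 301, 201],
        "level": [3, 1, 2, 5, 7, 3],
        "extralives": [0, 2, 4, 4, 1, 1],
    }
)

# Grouping by two keys yields a MultiIndex of (epochtime, name).
by_time_and_name = df.groupby(["epochtime", "name"]).mean()

# Look up a single group with a tuple of key values.
row = by_time_and_name.loc[(1234459, "suzy")]
print(row["score"])

# reset_index() moves the group keys back into regular columns.
flat = by_time_and_name.reset_index()
print(flat.columns.tolist())
```

<p>The flattened form is often easier to merge with other tables or to write back out to CSV.</p>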