<p>If you use pandas, this is easy.</p>
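<pre><code>import pandas as pd

def sorted_df(df, ascending=False):
    # group on the first two columns
    grouped = df.groupby([0, 1])
    data = []
    for _, d in grouped:
        d = d.copy()  # work on a copy, not a view of df
        # overwrite column 4 with the rank of column 2 within the group
        d[4] = d[2].rank(ascending=ascending)
        # DataFrame.sort was removed from pandas; sort_values replaces it
        d = d.sort_values(4)
        data.append(d)
    return pd.concat(data)
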
# load our dataframe from a csv string
from io import StringIO  # Python 3 location of StringIO
f = StringIO("""uniquedata1,uniquecell1,42,data,1,data
uniquedata1,uniquecell1,32,data,2,data
uniquedata1,uniquecell1,13,data,3,data
uniquedata2,uniquecell2,41,data,2,data
uniquedata2,uniquecell2,39,data,3,data
uniquedata2,uniquecell2,45,data,1,data
uniquedata2,uniquecell2,22,data,4,data
uniquedata1,uniquecell2,36,data,3,data
uniquedata1,uniquecell2,66,data,1,data
uniquedata1,uniquecell2,40,data,2,data""")
df = pd.read_csv(f, header=None)
# sort descending
sorted_df(df)
=>            0            1    2     3    4     5
 0  uniquedata1  uniquecell1   42  data    1  data
 1  uniquedata1  uniquecell1   32  data    2  data
 2  uniquedata1  uniquecell1   13  data    3  data
 8  uniquedata1  uniquecell2   66  data    1  data
 9  uniquedata1  uniquecell2   40  data    2  data
 7  uniquedata1  uniquecell2   36  data    3  data
 5  uniquedata2  uniquecell2   45  data    1  data
 3  uniquedata2  uniquecell2   41  data    2  data
 4  uniquedata2  uniquecell2   39  data    3  data
 6  uniquedata2  uniquecell2   22  data    4  data
# sort ascending
sorted_df(df, ascending=True)
=>            0            1    2     3    4     5
 2  uniquedata1  uniquecell1   13  data    1  data
 1  uniquedata1  uniquecell1   32  data    2  data
 0  uniquedata1  uniquecell1   42  data    3  data
 7  uniquedata1  uniquecell2   36  data    1  data
 9  uniquedata1  uniquecell2   40  data    2  data
 8  uniquedata1  uniquecell2   66  data    3  data
 6  uniquedata2  uniquecell2   22  data    1  data
 4  uniquedata2  uniquecell2   39  data    2  data
 3  uniquedata2  uniquecell2   41  data    3  data
 5  uniquedata2  uniquecell2   45  data    4  data
# add some NA values
from numpy import nan
# .ix was removed from pandas; .loc does the same label-based assignment here
df.loc[1, 2] = nan
df.loc[4, 2] = nan
df.loc[5, 2] = nan
# sort ascending
sorted_df(df, ascending=True)
=>            0            1    2     3    4     5
 2  uniquedata1  uniquecell1   13  data    1  data
 0  uniquedata1  uniquecell1   42  data    2  data
 1  uniquedata1  uniquecell1  NaN  data  NaN  data
 7  uniquedata1  uniquecell2   36  data    1  data
 9  uniquedata1  uniquecell2   40  data    2  data
 8  uniquedata1  uniquecell2   66  data    3  data
 6  uniquedata2  uniquecell2   22  data    1  data
 3  uniquedata2  uniquecell2   41  data    2  data
 4  uniquedata2  uniquecell2  NaN  data  NaN  data
 5  uniquedata2  uniquecell2  NaN  data  NaN  data
</code></pre>
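<p>If the explicit loop over groups feels heavy, the same idea can probably be written without it. The following is only a sketch: <code>sorted_df_vectorized</code> and the shortened sample rows are names and data I made up for illustration, and it assumes a pandas version that has <code>sort_values</code>.</p>

```python
from io import StringIO

import pandas as pd


def sorted_df_vectorized(df, ascending=False):
    """Hypothetical loop-free variant of sorted_df."""
    out = df.copy()  # leave the caller's frame untouched
    # rank column 2 within each (column 0, column 1) group
    out[4] = out.groupby([0, 1])[2].rank(ascending=ascending)
    # order by group keys first, then by the rank inside each group
    return out.sort_values([0, 1, 4])


# shortened, made-up sample in the same six-column layout
csv = StringIO("""g1,c1,42,data,1,data
g1,c1,32,data,2,data
g1,c1,13,data,3,data
g2,c2,41,data,2,data
g2,c2,45,data,1,data
""")
df = pd.read_csv(csv, header=None)

descending = sorted_df_vectorized(df)
ascending = sorted_df_vectorized(df, ascending=True)
```

<p>Rows whose rank is NaN should end up last within their group, since <code>sort_values</code> puts NaN at the end by default.</p>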
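<p>I think the NA handling I've shown here (NA values stay NA and sort to the end of their group) is probably more appropriate than the behavior in your hypothetical example, but you can use <code>fillna</code> to fill in the NA values within each group.</p>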
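<p>A minimal sketch of the <code>fillna</code> route, in case it helps. The function name <code>sorted_df_filled</code>, the <code>fill</code> parameter, and the choice of the group minimum as the default fill value are all my assumptions; pick whatever fill value matches the behavior you actually want.</p>

```python
import numpy as np
import pandas as pd


def sorted_df_filled(df, ascending=True, fill=None):
    """Hypothetical variant: fill NA in column 2 per group, then rank and sort."""
    parts = []
    for _, d in df.groupby([0, 1]):
        d = d.copy()
        # assumption: default to the group's minimum when no fill value is given
        value = d[2].min() if fill is None else fill
        d[2] = d[2].fillna(value)
        # method="first" breaks ties by order of appearance, so ranks stay unique
        d[4] = d[2].rank(ascending=ascending, method="first")
        parts.append(d.sort_values(4))
    return pd.concat(parts)


# small made-up frame with missing values in column 2
df = pd.DataFrame({
    0: ["a", "a", "a", "b", "b"],
    1: ["x", "x", "x", "y", "y"],
    2: [42.0, np.nan, 13.0, np.nan, 22.0],
    3: ["data"] * 5,
    4: [1, 2, 3, 1, 2],
})
result = sorted_df_filled(df)
```

<p>After filling, no rank is NaN any more, so every row gets a definite position within its group.</p>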