<p>The top crop (the one with the largest area) for each village:</p>
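<pre><code># Sort by village, then by area descending, so each village's largest crop comes first
df1 = df.sort_values(['Village Name','Area in hec'], ascending=[True, False])
# Keep only the first (largest-area) row per village
df2 = df1.drop_duplicates('Village Name')
print(df2)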
      District    Taluka       Village Name     Crop  Area in hec
11  Ahmednagar  Pathardi             Adgaon   Cotton        310.0
9   Ahmednagar  Pathardi          Agaskhand   Bajara        100.0
17  Ahmednagar  Pathardi              Akola   Cotton        550.0
0   Ahmednagar  Pathardi          Alhanwadi   Bajara        370.0
12  Ahmednagar  Pathardi       Ambika Nagar   Cotton        131.0
28  Ahmednagar  Pathardi          Auranjpur  Soybean         45.0
16  Ahmednagar  Pathardi           Badewadi   Cotton        104.0
14  Ahmednagar  Pathardi           Bhalgaon   Cotton        562.0
13  Ahmednagar  Pathardi         Bharajwadi   Cotton        161.0
15  Ahmednagar  Pathardi  Bhawarwadi (N.V.)   Cotton        211.0
</code></pre>
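<p>As a cross-check, the same selection can also be sketched with <code>groupby</code> and <code>idxmax</code>. The snippet below is a minimal, self-contained example. The small <code>df</code> it builds is an assumption made for illustration (eight rows copied from the printed output in this answer, restricted to two villages), not the full dataset.</p>
<pre><code>import pandas as pd

# Illustrative sample only: rows for two villages, copied from the printed output.
df = pd.DataFrame({
    'Village Name': ['Adgaon', 'Adgaon', 'Adgaon', 'Adgaon',
                     'Agaskhand', 'Agaskhand', 'Agaskhand', 'Agaskhand'],
    'Crop': ['Cotton', 'Bajara', 'Soybean', 'Maize',
             'Bajara', 'Soybean', 'Maize', 'Cotton'],
    'Area in hec': [310.0, 302.0, 52.0, 1.5, 100.0, 20.0, 10.0, 0.0],
})

# For each village, find the index label of the row with the largest area,
# then select those rows from the original frame.
top = df.loc[df.groupby('Village Name')['Area in hec'].idxmax()]
print(top)
</code></pre>
<p>Here <code>idxmax</code> returns the index label of the largest area in each village, and <code>df.loc</code> pulls those rows. For this sample it selects Cotton for Adgaon and Bajara for Agaskhand, matching the first two rows of the output above.</p>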
<p>And each row's area as a percentage of the total area for its crop:</p>
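<pre><code># Total area of each crop, aligned to the rows of df1
s = df1.groupby("Crop")['Area in hec'].transform('sum')
# Each row's area as a percentage of its crop's total area
df1['perc'] = df1['Area in hec'].div(s).mul(100)
print(df1.head(10))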
      District    Taluka  Village Name     Crop  Area in hec       perc
11  Ahmednagar  Pathardi        Adgaon   Cotton        310.0  14.226709
1   Ahmednagar  Pathardi        Adgaon   Bajara        302.0  21.297602
21  Ahmednagar  Pathardi        Adgaon  Soybean         52.0   8.724832
31  Ahmednagar  Pathardi        Adgaon    Maize          1.5   1.176471
9   Ahmednagar  Pathardi     Agaskhand   Bajara        100.0   7.052186
29  Ahmednagar  Pathardi     Agaskhand  Soybean         20.0   3.355705
39  Ahmednagar  Pathardi     Agaskhand    Maize         10.0   7.843137
19  Ahmednagar  Pathardi     Agaskhand   Cotton          0.0   0.000000
17  Ahmednagar  Pathardi         Akola   Cotton        550.0  25.240936
7   Ahmednagar  Pathardi         Akola   Bajara        175.0  12.341326
</code></pre>
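<p>One way to sanity-check percentages computed like this is to confirm that they add up to 100 within each crop. The sketch below assumes a small illustrative <code>df</code> (eight rows for two villages, copied from the printed output), so its <code>perc</code> values differ from the ones shown above, which are based on the full dataset.</p>
<pre><code>import pandas as pd

# Illustrative sample only: rows for two villages, copied from the printed output.
df = pd.DataFrame({
    'Village Name': ['Adgaon', 'Adgaon', 'Adgaon', 'Adgaon',
                     'Agaskhand', 'Agaskhand', 'Agaskhand', 'Agaskhand'],
    'Crop': ['Cotton', 'Bajara', 'Soybean', 'Maize',
             'Bajara', 'Soybean', 'Maize', 'Cotton'],
    'Area in hec': [310.0, 302.0, 52.0, 1.5, 100.0, 20.0, 10.0, 0.0],
})

# Per-crop total area, repeated on every row of that crop.
crop_total = df.groupby('Crop')['Area in hec'].transform('sum')
df['perc'] = df['Area in hec'].div(crop_total).mul(100)

# The percentages of each crop should sum to 100 (up to floating-point error).
check = df.groupby('Crop')['perc'].sum()
print(check)
</code></pre>
<p>Note that a crop whose total area is zero would produce <code>NaN</code> percentages, since the division is 0 by 0. That case does not occur in this sample.</p>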