<p>I believe you can accomplish this with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.cut.html" rel="nofollow noreferrer">pd.cut</a> and <a href="https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.where.html" rel="nofollow noreferrer">np.where</a>:</p>
<pre><code>adjusted # copied text from your example
Out[86]:
fyear conm indadjsg
0 1999 1-800-FLOWERS.COM 26.64609
1 2000 1-800-FLOWERS.COM 22.72717
2 2001 1-800-FLOWERS.COM 7.31201
3 2002 1-800-FLOWERS.COM 4.94831
4 2003 1-800-FLOWERS.COM 6.27880
5 1996 ABERCROMBIE 34.83169
6 1997 ABERCROMBIE 48.05314
7 1998 ABERCROMBIE 48.91833
8 1999 ABERCROMBIE 46.95646
9 2000 ABERCROMBIE 33.91436
10 2001 ABERCROMBIE 67.23423
11 2002 ABERCROMBIE 99.09342
.. ... ... ...
25 2015 CM 68.51856
26 2009 VCA -5.55203
27 2010 VCA -3.35728
28 2011 VCA -0.93080
29 2012 VCA 5.97491
30 2007 VIASPACE -50.96687
31 2008 VIASPACE 149.95740
32 2009 VIASPACE 197.77686
33 2010 VIASPACE -25.20173
34 2011 VIASPACE 77.08262
35 2012 VIASPACE 78.03423
36 2005 YASHENG -3.72810
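import numpy as np
import pandas as pd

# Number of years of data per company, looked up again for every row
byyr = adjusted.groupby(by='conm')['fyear'].count().to_frame()
start = byyr.fyear[adjusted.conm]
# Mean indadjsg per company, looked up for every row in the same way
indadjsg = adjusted.groupby(by='conm')['indadjsg'].mean().to_frame()
px = indadjsg.indadjsg[adjusted.conm]
# Bin the per-company mean (pd.cut defaults to right=True, so each bin includes its right edge)
categories = pd.cut(px.values,
                    bins=[-np.inf, 0, 15, 100, np.inf],
                    labels=['decline', 'revival', 'mature', 'growth'])
# Companies with five or fewer years of data are labelled 'start'
adjusted.loc[:, 'stage'] = np.where(start <= 5, 'start', categories)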
adjusted # result
Out[130]:
fyear conm indadjsg stage
0 1999 1-800-FLOWERS.COM 26.64609 start
1 2000 1-800-FLOWERS.COM 22.72717 start
2 2001 1-800-FLOWERS.COM 7.31201 start
3 2002 1-800-FLOWERS.COM 4.94831 start
4 2003 1-800-FLOWERS.COM 6.27880 start
5 1996 ABERCROMBIE 34.83169 mature
6 1997 ABERCROMBIE 48.05314 mature
7 1998 ABERCROMBIE 48.91833 mature
8 1999 ABERCROMBIE 46.95646 mature
9 2000 ABERCROMBIE 33.91436 mature
10 2001 ABERCROMBIE 67.23423 mature
11 2002 ABERCROMBIE 99.09342 mature
.. ... ... ... ...
25 2015 CM 68.51856 start
26 2009 VCA -5.55203 start
27 2010 VCA -3.35728 start
28 2011 VCA -0.93080 start
29 2012 VCA 5.97491 start
30 2007 VIASPACE -50.96687 mature
31 2008 VIASPACE 149.95740 mature
32 2009 VIASPACE 197.77686 mature
33 2010 VIASPACE -25.20173 mature
34 2011 VIASPACE 77.08262 mature
35 2012 VIASPACE 78.03423 mature
36 2005 YASHENG -3.72810 start
</code></pre>
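<p>For reference, here is a self-contained sketch of the same idea that you can run without the original data. The small <code>adjusted</code> frame below is made up for illustration (it is not the full table from the question), and it uses <code>groupby(...).transform</code> instead of the label lookup above, which should give the same per-row values:</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's columns
adjusted = pd.DataFrame({
    'fyear': [1999, 2000, 2001] + list(range(2007, 2013)),
    'conm': ['FLOWERS'] * 3 + ['VIASPACE'] * 6,
    'indadjsg': [26.6, 22.7, 7.3, -51.0, 150.0, 197.8, -25.2, 77.1, 78.0],
})

grouped = adjusted.groupby('conm')
# transform broadcasts each per-company aggregate back onto that company's rows
n_years = grouped['fyear'].transform('count')
mean_growth = grouped['indadjsg'].transform('mean')

# Bin the per-company mean into stages
categories = pd.cut(mean_growth,
                    bins=[-np.inf, 0, 15, 100, np.inf],
                    labels=['decline', 'revival', 'mature', 'growth'])

# Five or fewer years of data overrides the binned label with 'start'
adjusted['stage'] = np.where(n_years <= 5, 'start', categories.astype(object))
print(adjusted)
```

<p>Here the three-year company ends up as <code>start</code>, while the six-year company is binned by its mean growth.</p>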
<p>With pd.cut, be sure to pass <code>right=True</code> or <code>right=False</code> to control which edge of each bin is inclusive.</p>
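<p>A quick way to see the difference is to cut a few values that sit exactly on the bin edges. The values below are chosen only for illustration; the bins and labels are the ones used above:</p>

```python
import numpy as np
import pandas as pd

# Values that fall exactly on the bin edges
values = pd.Series([0, 15, 100])
bins = [-np.inf, 0, 15, 100, np.inf]
labels = ['decline', 'revival', 'mature', 'growth']

# right=True (the default): intervals look like (a, b], so an edge value stays in the lower bin
right_closed = pd.cut(values, bins=bins, labels=labels, right=True)
# right=False: intervals look like [a, b), so an edge value moves to the upper bin
left_closed = pd.cut(values, bins=bins, labels=labels, right=False)

print(list(right_closed))  # ['decline', 'revival', 'mature']
print(list(left_closed))   # ['revival', 'mature', 'growth']
```

<p>So a mean of exactly 0 counts as <code>decline</code> with <code>right=True</code> but as <code>revival</code> with <code>right=False</code>; pick whichever matches your definition of the stages.</p>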