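<p>I suggest using the <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.cut.html" rel="nofollow noreferrer"><code>cut</code></a> function with new bins and new labels. It is best to avoid loops in pandas, because they are slow when a vectorized function exists:</p>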
<pre><code>import numpy as np
import pandas as pd

df = pd.DataFrame({'Floors': [0, 1, 10, 19, 20, 25, 40, 70]})
bins = [0, 10, 20, 30, 40, 50, np.inf]
labels = ['0-9', '10-19', '20-29', '30-39', '40-49', '50~']
# right=False makes the bins left-closed: [0, 10), [10, 20), ...
# so 10 lands in '10-19' rather than '0-9'
df['NumFloorsGroup'] = pd.cut(df['Floors'],
                              bins=bins,
                              labels=labels,
                              right=False)
# [0, 20) is LowFl, [20, 50) is NormalFl, [50, inf) is HighFl
df['Category'] = pd.cut(df['Floors'],
                        bins=[0, 20, 50, np.inf],
                        labels=['LowFl', 'NormalFl', 'HighFl'],
                        right=False)
print(df)
   Floors NumFloorsGroup  Category
0       0            0-9     LowFl
1       1            0-9     LowFl
2      10          10-19     LowFl
3      19          10-19     LowFl
4      20          20-29  NormalFl
5      25          20-29  NormalFl
6      40          40-49  NormalFl
7      70            50~    HighFl
</code></pre>
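<p>Which side of each bin is closed decides where edge values such as 10 or 20 land. Here is a minimal sketch of that behavior; the sample values, bins and variable names are assumptions chosen only for illustration:</p>
<pre><code>import numpy as np
import pandas as pd

# Illustrative edge values only (assumed for this sketch)
s = pd.Series([9, 10, 19, 20])
bins = [0, 10, 20, np.inf]
labels = ['0-9', '10-19', '20~']

# Default right=True: intervals are (0, 10], (10, 20], (20, inf]
right_closed = pd.cut(s, bins=bins, labels=labels)
# right=False: intervals are [0, 10), [10, 20), [20, inf)
left_closed = pd.cut(s, bins=bins, labels=labels, right=False)

print(right_closed.tolist())  # ['0-9', '0-9', '10-19', '10-19']
print(left_closed.tolist())   # ['0-9', '10-19', '10-19', '20~']
</code></pre>
<p>With labels written as <code>'0-9'</code>, <code>'10-19'</code> and so on, the left-closed form (<code>right=False</code>) is the one whose results match the label text.</p>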
<p>Alternatively, use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.map.html" rel="nofollow noreferrer"><code>map</code></a> with a dictionary, then <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.fillna.html" rel="nofollow noreferrer"><code>fillna</code></a> to replace the values that are not in the dict (<code>NaN</code>s) with <code>NormalFl</code>:</p>
<pre><code># Keys must match the NumFloorsGroup labels exactly ('50~', not '50+')
d = {'0-9': 'LowFl', '10-19': 'LowFl', '50~': 'HighFl'}
df['Category'] = df['NumFloorsGroup'].map(d).fillna('NormalFl')
</code></pre>
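<p>As a quick sanity check, the two approaches can be compared on the sample data. This is only a sketch: it assumes left-closed bins (<code>right=False</code>) and dictionary keys that match the group labels exactly, and the names <code>group</code>, <code>by_cut</code>, <code>by_map</code> and <code>same</code> are made up for the example:</p>
<pre><code>import numpy as np
import pandas as pd

df = pd.DataFrame({'Floors': [0, 1, 10, 19, 20, 25, 40, 70]})

# Fine-grained groups, left-closed: [0, 10), [10, 20), ..., [50, inf)
group = pd.cut(df['Floors'],
               bins=[0, 10, 20, 30, 40, 50, np.inf],
               labels=['0-9', '10-19', '20-29', '30-39', '40-49', '50~'],
               right=False)

# Approach 1: bin the floors directly into categories
by_cut = pd.cut(df['Floors'],
                bins=[0, 20, 50, np.inf],
                labels=['LowFl', 'NormalFl', 'HighFl'],
                right=False)

# Approach 2: map the group labels, then fill unmapped groups
d = {'0-9': 'LowFl', '10-19': 'LowFl', '50~': 'HighFl'}
# astype(str) turns the categorical labels into plain strings before mapping
by_map = group.astype(str).map(d).fillna('NormalFl')

same = by_cut.astype(str).tolist() == by_map.tolist()
print(same)  # True
</code></pre>
<p>If a key in <code>d</code> does not match a label (for example <code>'50+'</code> instead of <code>'50~'</code>), that group silently becomes <code>NormalFl</code>, so a check like this is worth running once.</p>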