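<p>My DataFrame has a column called "State" that contains US state abbreviations. I have hard-coded regions, and I want to create a new column that gives the region for each state.</p>
<p>I used pd.Series.apply(), but I'm wondering whether there is a better practice for this kind of mapping. Any suggestions on how to improve the code?</p>
<p>Here is my current code; I'm open to any advice on best practices.</p>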
<pre><code>def get_region(s, *regions):
    if s in regions[0]:
        return 'west'
    elif s in regions[1]:
        return 'midwest'
    elif s in regions[2]:
        return 'south'
    elif s in regions[3]:
        return 'northeast'
    else:
        return None

west = ['WA','OR','CA','ID','NV','MT','WY','UT','AZ','CO','NM']
midwest = ['ND','MN','WI','MI','SD','NE','KS','IA','MO','IL','IN','OH']
south = ['TX','OK','AR','LA','MS','TN','KY','AL','GA','FL','SC','NC','VA','WV','MD','DE']
northeast = ['PA','NJ','NY','CT','MA','RI','VT','NH','ME']
regions = [west,midwest,south,northeast]
full_df['Region'] = full_df['State'].apply(get_region, args=regions)
full_df['Region'].head(15)
Out:
0 west
1 midwest
2 south
3 south
4 midwest
5 west
6 south
7 south
8 west
9 midwest
10 south
11 northeast
12 northeast
13 west
14 west
Name: Region, dtype: object
</code></pre>
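<p>For comparison, one commonly suggested alternative is to invert the region lists into a single state-to-region dict and pass it to <code>Series.map</code>. The sketch below is only an illustration under assumptions: the name <code>state_to_region</code> and the five-row sample <code>full_df</code> are made up for the example and are not part of my real data. Note that states missing from every list (for example 'HI') come out as NaN rather than None.</p>

```python
import pandas as pd

# Region lists copied from the code above.
west = ['WA','OR','CA','ID','NV','MT','WY','UT','AZ','CO','NM']
midwest = ['ND','MN','WI','MI','SD','NE','KS','IA','MO','IL','IN','OH']
south = ['TX','OK','AR','LA','MS','TN','KY','AL','GA','FL','SC','NC','VA','WV','MD','DE']
northeast = ['PA','NJ','NY','CT','MA','RI','VT','NH','ME']

# Invert "region -> states" into a flat "state -> region" lookup.
region_lists = {
    'west': west,
    'midwest': midwest,
    'south': south,
    'northeast': northeast,
}
state_to_region = {
    state: region
    for region, states in region_lists.items()
    for state in states
}

# Hypothetical sample frame standing in for the real full_df.
full_df = pd.DataFrame({'State': ['CA', 'OH', 'TX', 'NY', 'HI']})

# Series.map looks each value up in the dict; unmatched keys become NaN.
full_df['Region'] = full_df['State'].map(state_to_region)
print(full_df)
```

<p>With a dict, each lookup is a single hash access instead of up to four list scans, and the mapping lives in one place, so adding a region only means adding one entry to <code>region_lists</code>.</p>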