<p>I've done this the long-winded way so that you can follow each step.</p>
<p>First, let's define a function that determines the value of "country":</p>
<pre><code>In [4]: def get_country(s):
   ...:     if 'Nor' in s:
   ...:         return 'Norway'
   ...:     if 'S' in s:
   ...:         return 'Sweden'
   ...:     # return 'Default Country' # if you get unmatched values
   ...:

In [5]: get_country('Sven')
Out[5]: 'Sweden'

In [6]: get_country('Norv')
Out[6]: 'Norway'
</code></pre>
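<p>As written, the function falls off the end and returns <code>None</code> for any string that matches neither check, which is what the commented-out line hints at. A minimal sketch of making that fallback explicit follows; the <code>default</code> parameter, the <code>'Unknown'</code> label and the <code>'Fin 2014'</code> input are assumptions added for illustration, not part of the original function.</p>

```python
def get_country(s, default='Unknown'):
    """Return the country for a season label, or `default` if nothing matches.

    `default` is a hypothetical parameter added for this sketch.
    """
    if 'Nor' in s:
        return 'Norway'
    if 'S' in s:  # case-sensitive: matches any string containing a capital S
        return 'Sweden'
    return default


print(get_country('Nor 2014'))
print(get_country('Sven 2013'))
print(get_country('Fin 2014'))  # made-up label that matches neither check
```

<p>Note that the order of the checks matters: the Norway test runs first, so a string containing both <code>'Nor'</code> and a capital <code>S</code> is labelled Norway.</p>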
<p>We can use <code>map</code> to run <code>get_country</code> on every row. Pandas DataFrames also have an <a href="http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.apply.html" rel="nofollow"><code>apply()</code></a> method, which works similarly*.</p>
<p>Now we assign this result to a column named "country":</p>
<pre><code>In [8]: df['country'] = list(map(get_country, df['season']))  # list() so this also works on Python 3, where map returns an iterator
</code></pre>
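<p>One thing to watch for: in Python 2 <code>map</code> returned a list, but in Python 3 it returns a lazy, one-shot iterator, so the result generally has to be materialised before it is stored. The sketch below illustrates this with a plain list standing in for the <code>season</code> column; the sample values are copied from the table in this answer, and the variable names are made up for the example.</p>

```python
def get_country(s):
    if 'Nor' in s:
        return 'Norway'
    if 'S' in s:
        return 'Sweden'


# Stand-in for df['season']; values taken from the table in this answer.
seasons = ['Nor 2014', 'Swe 2013', 'Sven 2013']

lazy = map(get_country, seasons)  # Python 3: an iterator, nothing computed yet
countries = list(lazy)            # materialise the results into a list
leftover = list(lazy)             # the iterator is now exhausted

print(countries)
print(leftover)
```

<p>Because the iterator can only be consumed once, wrapping the call in <code>list()</code> right away is usually the safest choice.</p>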
<p>Let's take a look at the final result:</p>
<pre><code>In [9]: df
Out[9]:
       season  country
0    Nor 2014   Norway
1    Nor 2013   Norway
2    Nor 2013   Norway
3   Norv 2013   Norway
4    Swe 2014   Sweden
5    Swe 2014   Sweden
6    Swe 2013   Sweden
7    Swe 2013   Sweden
8   Sven 2013   Sweden
9   Sven 2013   Sweden
10  Norv 2014   Norway
</code></pre>
<p>*Using <code>apply()</code>, it looks like this:</p>
<pre><code>In [16]: df['country'] = df['season'].apply(get_country)

In [17]: df
Out[17]:
       season  country
0    Nor 2014   Norway
1    Nor 2013   Norway
2    Nor 2013   Norway
3   Norv 2013   Norway
4    Swe 2014   Sweden
5    Swe 2014   Sweden
6    Swe 2013   Sweden
7    Swe 2013   Sweden
8   Sven 2013   Sweden
9   Sven 2013   Sweden
10  Norv 2014   Norway
</code></pre>
<h2>A more extensible country matcher</h2>
<p>Pseudocode only :)</p>
<pre><code># Modify this as needed
country_matchers = {
    'Norway': ['Nor', 'Norv'],
    'Sweden': ['S', 'Swed'],
}


def get_country(s):
    """
    Run the passed string s against "matchers" for each country
    Return the first matched country
    """
    for country, matchers in country_matchers.items():
        for matcher in matchers:
            if matcher in s:
                return country
</code></pre>
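<p>To show why the table-driven version is easier to extend, here is a small runnable sketch in which supporting a new country only means adding a dictionary entry. Passing the table in as the <code>matchers_by_country</code> argument, the <code>Denmark</code> entry and the <code>'Den 2014'</code> input are all assumptions made for illustration; they do not appear in the original data.</p>

```python
def get_country(s, matchers_by_country):
    """Return the first country with a matcher that occurs in s, else None."""
    for country, matchers in matchers_by_country.items():
        for matcher in matchers:
            if matcher in s:
                return country
    return None


base = {
    'Norway': ['Nor', 'Norv'],
    'Sweden': ['S', 'Swed'],
}

# Hypothetical extra entry: the function itself does not change.
extended = dict(base, Denmark=['Den'])

print(get_country('Sven 2013', base))
print(get_country('Den 2014', base))      # no matcher applies yet
print(get_country('Den 2014', extended))
```

<p>Dictionaries preserve insertion order in Python 3.7 and later, so countries listed first win when more than one matcher applies; on older versions the lookup order is not guaranteed.</p>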