<p>假设您有以下DF:</p>
<pre><code>In [73]: df
Out[73]:
text
0 Alabama[edit]
1 Auburn (Auburn University)[1]
2 Florence (University of North Alabama)
3 Jacksonville (Jacksonville State University)[2]
4 Livingston (University of West Alabama)[2]
5 Montevallo (University of Montevallo)[2]
6 Troy (Troy University)[2]
7 Tuscaloosa (University of Alabama, Stillman Co...
8 Tuskegee (Tuskegee University)[5]
9 Alaska[edit]
10 Fairbanks (University of Alaska Fairbanks)[2]
11 Arizona[edit]
12 Flagstaff (Northern Arizona University)[6]
13 Tempe (Arizona State University)
14 Tucson (University of Arizona)
15 Arkansas[edit]
</code></pre>
<p>您可以使用<a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.extract.html" rel="nofollow noreferrer">Series.str.extract()</a>方法:</p>
<pre><code>In [117]: df['State'] = df.loc[df.text.str.contains('[edit]', regex=False), 'text'].str.extract(r'(.*?)\[edit\]', expand=False)
In [118]: df['Region Name'] = df.loc[df.State.isnull(), 'text'].str.extract(r'(.*?)\s*[\(\[]+.*[\n]*', expand=False)
In [120]: df.State = df.State.ffill()
In [121]: df
Out[121]:
text State Region Name
0 Alabama[edit] Alabama NaN
1 Auburn (Auburn University)[1] Alabama Auburn
2 Florence (University of North Alabama) Alabama Florence
3 Jacksonville (Jacksonville State University)[2] Alabama Jacksonville
4 Livingston (University of West Alabama)[2] Alabama Livingston
5 Montevallo (University of Montevallo)[2] Alabama Montevallo
6 Troy (Troy University)[2] Alabama Troy
7 Tuscaloosa (University of Alabama, Stillman Co... Alabama Tuscaloosa
8 Tuskegee (Tuskegee University)[5] Alabama Tuskegee
9 Alaska[edit] Alaska NaN
10 Fairbanks (University of Alaska Fairbanks)[2] Alaska Fairbanks
11 Arizona[edit] Arizona NaN
12 Flagstaff (Northern Arizona University)[6] Arizona Flagstaff
13 Tempe (Arizona State University) Arizona Tempe
14 Tucson (University of Arizona) Arizona Tucson
15 Arkansas[edit] Arkansas NaN
In [122]: df = df.dropna()
In [123]: df
Out[123]:
text State Region Name
1 Auburn (Auburn University)[1] Alabama Auburn
2 Florence (University of North Alabama) Alabama Florence
3 Jacksonville (Jacksonville State University)[2] Alabama Jacksonville
4 Livingston (University of West Alabama)[2] Alabama Livingston
5 Montevallo (University of Montevallo)[2] Alabama Montevallo
6 Troy (Troy University)[2] Alabama Troy
7 Tuscaloosa (University of Alabama, Stillman Co... Alabama Tuscaloosa
8 Tuskegee (Tuskegee University)[5] Alabama Tuskegee
10 Fairbanks (University of Alaska Fairbanks)[2] Alaska Fairbanks
12 Flagstaff (Northern Arizona University)[6] Arizona Flagstaff
13 Tempe (Arizona State University) Arizona Tempe
14 Tucson (University of Arizona) Arizona Tucson
</code></pre>