<p>You can extract just the run of zero or more characters at the start of the string that are not <code>:</code>, <code>,</code> or <code>(</code>:</p>
<pre><code>df['RegionName'] = df['Region'].str.extract(r'^([^:(,]*)\b', expand=False)
</code></pre>
<p>If you are using Python 2.x, add <code>(?u)</code> at the start of the pattern so that the word boundary <code>\b</code> also matches at the correct positions in Unicode strings.</p>
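<p>A minimal sketch of why the flag matters, using only the standard <code>re</code> module. In Python 3, <code>str</code> patterns are Unicode-aware by default, so <code>re.ASCII</code> is used here, as an assumption, to approximate what happens in Python 2 without <code>(?u)</code>. The sample string is hypothetical and not taken from the data in this answer.</p>

```python
import re

PATTERN = r'^([^:(,]*)\b'
sample = 'Malmö (University)'  # hypothetical input ending in a non-ASCII letter

# Unicode-aware matching (the Python 3 default): 'ö' counts as a word
# character, so a word boundary exists between 'ö' and the following space.
unicode_result = re.match(PATTERN, sample).group(1)

# ASCII-only matching: 'ö' is not a word character, so the regex has to
# backtrack to the boundary between 'm' and 'ö'.
ascii_result = re.match(PATTERN, sample, flags=re.ASCII).group(1)

print(unicode_result)
print(ascii_result)
```

<p>With ASCII-only word boundaries the name is cut short, which is the kind of mismatch <code>(?u)</code> is meant to avoid.</p>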
<p><strong>Details</strong></p>
<ul>
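<li><code>^</code> - start of the string</li>
<li><code>([^:(,]*)</code> - Group 1: zero or more (<code>*</code>) consecutive characters other than <code>:</code>, <code>(</code> and <code>,</code> (the leading <code>^</code> inside the brackets makes this a <em>negated</em> character class)</li>
<li><code>\b</code> - a word boundary. Because <code>*</code> is greedy, the group first grabs any trailing space, and the boundary then forces it to backtrack so that it ends right after the last word character.</li>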
</ul>
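<p>The breakdown above can be checked outside pandas with the standard <code>re</code> module. This is only a sketch: the helper name <code>extract_region_name</code> is hypothetical, and the inputs are a few of the strings from this answer plus one made-up string that starts with a parenthesis.</p>

```python
import re

PATTERN = re.compile(r'^([^:(,]*)\b')
PATTERN_NO_BOUNDARY = re.compile(r'^([^:(,]*)')  # same pattern without \b

def extract_region_name(text):
    """Return Group 1 of PATTERN, or None when the pattern does not match."""
    match = PATTERN.match(text)
    return match.group(1) if match else None

samples = [
    'Boston (Boston University, Boston College, Bos...',
    'The Colleges of Worcester Consortium:',
    'Faribault, South Central College',
    'St. Cloud (St. Cloud State University, The Col...',
]
names = [extract_region_name(s) for s in samples]
print(names)

# Without \b the greedy group keeps the space before the parenthesis.
with_space = PATTERN_NO_BOUNDARY.match(samples[0]).group(1)
print(repr(with_space))

# A string starting with '(' has no word boundary at position 0, so no match.
print(extract_region_name('(no leading name)'))
```

<p>In pandas, a row that does not match yields <code>NaN</code> from <code>str.extract</code> rather than <code>None</code>.</p>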
<p>See the <a href="https://regex101.com/r/JXdYG4/1" rel="nofollow noreferrer">regex demo</a> and the Python 3 demo below:</p>
<pre><code>>>> import pandas as pd
>>> item_list = ['Boston (Boston University, Boston College, Bos...','Bridgewater (Bridgewater State College)[2]','Cambridge (Harvard University, Massachusetts I...','Chestnut Hill (Boston College)','The Colleges of Worcester Consortium:','Dudley (Nichols College)','Faribault, South Central College','Mankato (Minnesota State University, Mankato),...','Marshall (Southwest Minnesota State University...','Moorhead (Minnesota State University, Moorhead...','Morris (University of Minnesota Morris)[2]','Northfield (Carleton College, St. Olaf College...','North Mankato, South Central College','St. Cloud (St. Cloud State University, The Col...','St. Joseph (College of Saint Benedict)[2]','St. Peter (Gustavus Adolphus College)[2]']
>>> df = pd.DataFrame(item_list, columns=['Region'])
>>> df['RegionName'] = df['Region'].str.extract(r'^([^:(,]*)\b', expand=False)
>>> df['RegionName']
RegionName
0 Boston
1 Bridgewater
2 Cambridge
3 Chestnut Hill
4 The Colleges of Worcester Consortium
5 Dudley
6 Faribault
7 Mankato
8 Marshall
9 Moorhead
10 Morris
11 Northfield
12 North Mankato
13 St. Cloud
14 St. Joseph
15 St. Peter
>>>
</code></pre>