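<p>Try using <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extract.html#pandas-series-str-extract" rel="nofollow noreferrer"><code>str.extract</code></a> with a regular expression in which the first capture group takes the leading digits and the second capture group takes the value after the optional <code>-</code>:</p>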
<pre><code>df[['B', 'C']] = df['A'].str.extract(r"(\d+$|\d+(?=\s*-))?(?:\s*-\s*)?(.+)?")
</code></pre>
<pre class="lang-none prettyprint-override"><code>                          A       B                   C
0       00000-UNITED STATES   00000       UNITED STATES
1             01000-ALABAMA   01000             ALABAMA
2  01001-Autauga County, AL   01001  Autauga County, AL
3  01003-Baldwin County, AL   01003  Baldwin County, AL
4        Barbour County, AL     NaN  Barbour County, AL
5                     10234   10234                 NaN
6                32 Alabama     NaN          32 Alabama
7            432423 - state  432423               state
</code></pre>
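<p>To make the pattern easier to read, here is a minimal sketch that spells out the same regular expression with <code>re.VERBOSE</code> and runs it through the standard-library <code>re</code> module instead of pandas. The helper name <code>split_code_and_name</code> is hypothetical and only used for illustration. Note that plain <code>re</code> reports a missing group as <code>None</code>, whereas <code>str.extract</code> shows it as <code>NaN</code>.</p>
<pre><code>import re

# Same pattern as above, split across lines so each part can be annotated.
PATTERN = re.compile(r"""
    (\d+$|\d+(?=\s*-))?   # group 1: digits that end the string, or digits followed by a hyphen
    (?:\s*-\s*)?          # optional hyphen separator with surrounding whitespace, not captured
    (.+)?                 # group 2: whatever text remains, if any
""", re.VERBOSE)


def split_code_and_name(value):
    """Hypothetical helper: return (code, name); a missing part is None."""
    # Every part of the pattern is optional, so search() always finds a match.
    match = PATTERN.search(value)
    return match.group(1), match.group(2)


for sample in ['00000-UNITED STATES', 'Barbour County, AL', '10234',
               '32 Alabama', '432423 - state']:
    print(sample, split_code_and_name(sample))
</code></pre>
<p>The lookahead <code>(?=\s*-)</code> is what keeps <code>32 Alabama</code> in one piece: the digits are not followed by a hyphen and do not end the string, so the first group is skipped and the whole value falls into the second group.</p>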
<hr/>
<p>Full code:</p>
<pre><code>import pandas as pd

df = pd.DataFrame({
    'A': ['00000-UNITED STATES', '01000-ALABAMA',
          '01001-Autauga County, AL', '01003-Baldwin County, AL',
          'Barbour County, AL', '10234', '32 Alabama', '432423 - state']
})

# Group 1 (column B): leading digits that end the string or precede a hyphen.
# Group 2 (column C): the remaining text, if any.
df[['B', 'C']] = df['A'].str.extract(r"(\d+$|\d+(?=\s*-))?(?:\s*-\s*)?(.+)?")
print(df)
</code></pre>