<p>Here is another option that uses <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.where.html" rel="nofollow noreferrer"><strong><code>Series.where</code></strong></a>, which allows a simpler regex.</p>
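<p>Where the value does not start with <code>INC</code>, keep <code>df.COURSE_ID</code> as is; otherwise apply <a href="https://pandas.pydata.org/docs/reference/api/pandas.Series.str.extract.html" rel="nofollow noreferrer"><strong><code>Series.str.extract</code></strong></a> with the regex and rejoin the captured groups:</p>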
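<pre class="lang-py prettyprint-override"><code># Capture the INC-AAB prefix, the two-letter code, and the trailing six digits.
regex = r'(INC-AAB)-(WW|DE|NL|AU|NW).*(\d{6})'
df['COURSE_ID'] = df.COURSE_ID.where(
    # Condition: rows that do NOT start with INC keep their original value.
    ~df.COURSE_ID.str.startswith('INC'),
    # Replacement: join the captured groups with '-' for the remaining rows.
    df.COURSE_ID.str.extract(regex).dropna().agg('-'.join, axis=1)
)

#            COURSE_ID
# 0  INC-AAB-WW-105614
# 1  INC-AAB-DE-234567
# 2  INC-AAB-NL-123489
# 3  INC-AAB-NL-145678
# 4    EXI-WDFT-145678
</code></pre>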
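<p>To try the approach in isolation, here is a minimal self-contained sketch. The sample rows are hypothetical (the original input frame is not shown here), and the intermediate names <code>starts_with_inc</code> and <code>extracted</code> are introduced only for readability.</p>

```python
import pandas as pd

# Hypothetical sample data; the real input frame is not shown in this answer.
df = pd.DataFrame({'COURSE_ID': [
    'INC-AAB-WW-EXTRA-105614',
    'INC-AAB-DE-foo-234567',
    'EXI-WDFT-145678',
]})

regex = r'(INC-AAB)-(WW|DE|NL|AU|NW).*(\d{6})'

# Boolean mask of the rows that need rewriting.
starts_with_inc = df['COURSE_ID'].str.startswith('INC')

# One column per capture group; rows without a match are all-NaN and dropped,
# then the three groups of each remaining row are joined with '-'.
extracted = df['COURSE_ID'].str.extract(regex).dropna().agg('-'.join, axis=1)

# Keep the original value where the mask is False, else take the rebuilt ID.
df['COURSE_ID'] = df['COURSE_ID'].where(~starts_with_inc, extracted)

print(df)
```

<p>One thing to watch: because of the <code>dropna()</code>, a row that starts with <code>INC</code> but does not match the regex has no replacement value, so it should come out as <code>NaN</code> rather than keeping its original text. If that matters, the mask could be based on the regex match instead of <code>startswith</code>.</p>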