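<p>Sure! You can capture everything you need in one pass with this regular expression. I've included comments inside the pattern itself. To let <code>re</code> know about them, I pass the flag <code>re.X</code>, which marks the pattern as "verbose": whitespace and comments inside it are ignored when the actual matching is performed.</p>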
<pre><code>import re

# Raw string, so that "\d" reaches the regex engine unchanged.
pattern = r"""
^([^-]+)-  # From the beginning of the string, capture all non-hyphen characters and stop at the first actual hyphen.
.+?        # Consume all characters up to the next capture group in this pattern.
([\d.]+T)  # Capture a run of digits and literal periods that ends with a "T".
""".strip()
extracted_df = df["type"].str.extract(pattern, flags=re.X)
print(extracted_df)
        0     1
0   Hello   12T
1   Hello   12T
2   Hello   50T
3   Hello   50T
4   Happy   90T
5    Kind   14T
6    Kind   14T
7  AY14.5  6.4T
8  AY14.5  6.4T
</code></pre>
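<p>To see what each capture group picks up without a DataFrame, here is a minimal sketch that uses only the standard-library <code>re</code> module. The sample strings are hypothetical (the original column values are not shown in this answer); they are only assumed to have a <code>name-something-sizeT</code> shape.</p>
<pre><code>import re

# Same verbose pattern as above, compiled once with re.X.
pattern = re.compile(r"""
    ^([^-]+)-   # group 1: everything before the first hyphen
    .+?         # lazily skip ahead (at least one character)
    ([\d.]+T)   # group 2: digits and periods followed by "T"
""", flags=re.X)

# Hypothetical inputs, invented for illustration only.
samples = ["Hello-world-12T-extra", "AY14.5-foo-6.4T"]

# One (group 1, group 2) tuple per sample string.
results = [pattern.search(s).groups() for s in samples]
for sample, groups in zip(samples, results):
    print(sample, groups)
</code></pre>
<p>Each tuple corresponds to one row of the two-column frame that <code>str.extract</code> returns.</p>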
<p>Now that we have extracted the relevant pieces of information, we can glue them back together and overwrite the old <code>"type"</code> column:</p>
<pre><code>df["type"] = extracted_df[0] + " " + extracted_df[1]
print(df)
          type  free  use  total
0    Hello 12T    10   10     20
1    Hello 12T     5    1      6
2    Hello 50T     1    4      5
3    Hello 50T     2    1      1
4    Happy 90T    10    0     10
5     Kind 14T     7    4      3
6     Kind 14T     6    3      2
7  AY14.5 6.4T     3    0      3
8  AY14.5 6.4T     0   20     20
</code></pre>
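<p>As with any regular expression, this may not cover every corner case, but I hope it illustrates how to use regular expressions and capture groups to pull the relevant information out of a column.</p>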
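<p>As an illustration of such corner cases, here is a small sketch on hypothetical inputs (not taken from the original data). Because <code>.+?</code> must consume at least one character, a value whose size follows the first hyphen directly loses its leading digit, and a value with no hyphen does not match at all. The <code>relaxed</code> pattern is only one possible adjustment, assuming an empty gap between the hyphen and the size is acceptable for your data.</p>
<pre><code>import re

# Compact, non-verbose spelling of the same pattern.
pattern = re.compile(r"^([^-]+)-.+?([\d.]+T)")

# Hypothetical value where the size follows the first hyphen directly:
# ".+?" swallows the "1", so group 2 starts one character too late.
direct = pattern.search("Hello-12T").groups()
print(direct)

# Hypothetical value with no hyphen: no match at all
# (str.extract would produce NaN for such a row).
missing = pattern.search("Hello 12T")
print(missing)

# Possible adjustment (an assumption, not part of the original answer):
# ".*?" also allows zero characters between the hyphen and the size.
relaxed = re.compile(r"^([^-]+)-.*?([\d.]+T)")
relaxed_groups = relaxed.search("Hello-12T").groups()
print(relaxed_groups)
</code></pre>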