<p>Use <code>str.findall</code> with a regex built from the element symbols:</p>
<pre class="lang-py prettyprint-override"><code>import pandas as pd
patterns=['H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F', 'Ne', 'Na', 'Mg', 'Al',
'Si', 'P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn',
'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr', 'Rb',
'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In',
'Sn', 'Sb', 'Te', 'I', 'Xe', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Pm',
'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta',
'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Po', 'At',
'Rn']
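# Sort symbols longest-first so two-letter symbols ('He') are tried before one-letter ones ('H').
# Group 1 captures the element; optional group 2 captures an integer or decimal count.
rx = fr'({"|".join(sorted(patterns, key=len, reverse=True))})(\d+(?:\.\d+)?)?'
df = pd.DataFrame({'formulas' : ['Mg0.97Fe0.03B2', 'Tl0.5Hg0.5Ba2Ca2Cu3O8', 'Hg0.75SrBa2Ca2Cu3O8', 'NbSn3']})
# findall returns a list of (element, count) tuples per formula; count is '' when absent.
df['result'] = df['formulas'].str.findall(rx)
# Replace a missing count with 1 (explicit counts remain strings, e.g. '0.97').
df['result'] = df['result'].apply(lambda m: [(x, y) if y else (x, 1) for x, y in m])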
</code></pre>
<p><strong>Result</strong></p>
<pre class="lang-py prettyprint-override"><code>>>> df
   formulas               result
0  Mg0.97Fe0.03B2         [(Mg, 0.97), (Fe, 0.03), (B, 2)]
1  Tl0.5Hg0.5Ba2Ca2Cu3O8  [(Tl, 0.5), (Hg, 0.5), (Ba, 2), (Ca, 2), (Cu, 3), (O, 8)]
2  Hg0.75SrBa2Ca2Cu3O8    [(Hg, 0.75), (Sr, 1), (Ba, 2), (Ca, 2), (Cu, 3), (O, 8)]
3  NbSn3                  [(Nb, 1), (Sn, 3)]
</code></pre>
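<p>To see why the symbols are sorted by length before being joined into <code>rx</code>, here is a minimal sketch using only the standard <code>re</code> module. The names <code>symbols</code>, <code>build_regex</code> and <code>parse_formula</code> are made up for illustration, and the symbol list is a small subset, not the full table. As an assumption that differs from the snippet above, the sketch also converts every count to <code>float</code> so the tuples have a uniform numeric type.</p>

```python
import re

# Hypothetical small subset of element symbols, for illustration only.
symbols = ['H', 'He', 'B', 'Be', 'N', 'Nb', 'S', 'Sn']

def build_regex(names):
    # Longest symbols first, so 'Nb' is tried before 'N'.
    alternation = "|".join(sorted(names, key=len, reverse=True))
    return re.compile(rf'({alternation})(\d+(?:\.\d+)?)?')

def parse_formula(formula, regex):
    # A missing count becomes 1.0; an explicit count is converted to float.
    return [(el, float(n) if n else 1.0) for el, n in regex.findall(formula)]

sorted_rx = build_regex(symbols)
# Same alternation, but in list order: 'N' and 'S' come before 'Nb' and 'Sn'.
unsorted_rx = re.compile(rf'({"|".join(symbols)})(\d+(?:\.\d+)?)?')

print(parse_formula('NbSn3', sorted_rx))    # [('Nb', 1.0), ('Sn', 3.0)]
print(parse_formula('NbSn3', unsorted_rx))  # [('N', 1.0), ('S', 1.0)]
```

<p>Without the sort, the alternation stops at the first one-letter symbol that matches, so <code>Nb</code> is read as <code>N</code> and the count after <code>Sn</code> is lost.</p>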