<p>Try this; it splits each line on the first number and gives you the name part:</p>
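<pre><code>import re

# One or more digits, an optional literal '.', then one or more digits
exp = re.compile(r'(\d+\.?\d+)')

with open('mainfile.txt') as f, open('names.txt', 'w') as out:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            try:
                # Everything before the first number is the name
                out.write('{}\n'.format(re.split(exp, line)[0].strip()))
            except Exception:
                print('Could not parse {}'.format(line))
</code></pre>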
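<p>If you want to try the splitting step without touching any files, here is a minimal sketch that wraps it in a function. The name <code>extract_name</code> is made up for illustration, and <code>maxsplit=1</code> is my addition so the split stops at the first number:</p>

```python
import re

# Same pattern as in the snippet above.
exp = re.compile(r'(\d+\.?\d+)')


def extract_name(line):
    """Return the text that precedes the first number in line (hypothetical helper)."""
    # maxsplit=1 stops after the first match; element 0 is the leading text.
    return exp.split(line.strip(), maxsplit=1)[0].strip()


print(extract_name('Benzoyl Peroxide 50 MG/ML Topical Lotion'))
```

<p>A line with no number comes back unchanged (apart from the stripped whitespace), since there is nothing to split on.</p>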
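<p>The regular expression <code>\d+\.?\d+</code> means:</p>
<ul>
<li><code>\d+</code> one or more digits</li>
<li><code>\.?</code> an optional <code>.</code> (note that <code>.</code> has a special meaning in regular expressions, so we escape it when we mean a <em>literal</em> <code>.</code>)</li>
<li><code>\d+</code> followed by one or more digits (so the whole pattern needs at least two digits; a lone <code>5</code> will not match)</li>
</ul>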
<p>The <code>()</code> around the pattern make it a capturing group, so <code>re.split</code> keeps the matched number in the result:</p>
<pre><code>>>> x = r'(\d+\.?\d+)'
>>> l = 'Benzoyl Peroxide 50 MG/ML Topical Lotion'
>>> re.split(x, l)
['Benzoyl Peroxide ', '50', ' MG/ML Topical Lotion']
</code></pre>
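<p>One possible variation, in case your file also contains single-digit values such as <code>5 MG</code>: make the whole fractional part optional instead of only the dot. This is a sketch and not part of the answer as posted; the name <code>number</code> and the sample string are made up for illustration:</p>

```python
import re

# Pattern from the answer: needs at least two digits in total.
original = re.compile(r'(\d+\.?\d+)')

# Assumed alternative: an integer part, then an optional '.digits' part.
number = re.compile(r'(\d+(?:\.\d+)?)')

sample = 'Example Drug 5 MG Oral Tablet'  # made-up line with a single-digit dose

# The original pattern finds no match, so the line comes back as one piece.
print(original.split(sample))

# The alternative splits around '5' and, thanks to the capturing group, keeps it.
print(number.split(sample))
```

<p>The inner group is written as <code>(?:...)</code> so that it does not capture; otherwise <code>split</code> would also add the fractional part (or <code>None</code>) as an extra list element.</p>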