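<p>To drop every non-ASCII character, along with everything that follows the hyphen <code>-</code>, it is enough to replace them with the empty string <code>""</code> (the pattern also strips digits and parentheses such as <code>(10)</code>).<br/>A short solution using a specific regex pattern:</p>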
<pre><code>import re

dirty_name = '''
(10) Johny Doe
Eric E. Shelby
(1) Chris Melton - ಕಿರಿಕ್ ಕೀರ್ತಿ
Jonas Alexander Bay
Christopher Rockstar - An awesome guy
Jones Collier'''

# Three alternatives are replaced with "": any non-ASCII character,
# digits and parentheses, and " - " plus the rest of that line.
pattern = r'[^\x00-\x7f]|[\d()]| - .*'
clean_name = '\n'.join(l.lstrip() for l in re.sub(pattern, "", dirty_name).split('\n'))
print(clean_name)
</code></pre>
<p>Output:</p>
<pre><code>Johny Doe
Eric E. Shelby
Chris Melton
Jonas Alexander Bay
Christopher Rockstar
Jones Collier
</code></pre>
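<p>If the one-liner feels dense, the same three alternatives can be spelled out with <code>re.VERBOSE</code> and applied line by line. This is only a sketch: the names <code>PATTERN</code> and <code>clean_names</code> are hypothetical, and it assumes each name sits on its own line.</p>

```python
import re

# Hypothetical helper for illustration, not part of the original answer.
PATTERN = re.compile(
    r"""
      [ ]-[ ].*        # space, hyphen, space, then the rest of the line
    | [^\x00-\x7f]     # any character outside the ASCII range
    | [\d()]           # digits and parentheses, e.g. the (10) prefix
    """,
    re.VERBOSE,
)


def clean_names(text):
    """Return one cleaned name per non-empty line of text."""
    cleaned = (PATTERN.sub("", line).strip() for line in text.splitlines())
    return [name for name in cleaned if name]


print(clean_names("(10) Johny Doe\nChristopher Rockstar - An awesome guy"))
# ['Johny Doe', 'Christopher Rockstar']
```

<p>Returning a list instead of a joined string also drops the empty first line that the triple-quoted input produces.</p>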
<p><strong><em>Edit:</em></strong> removed the leading whitespace, since @TigerhawkT3 is too sensitive about spaces (in his own religion).</p>
<p><em>P.S.</em> <code>\x00-\x7f</code> is the <strong>ASCII</strong> character range.</p>
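<p>To see what that range means in practice, here is a small sketch (the helper name <code>strip_non_ascii</code> is made up for illustration). It assumes Python 3 strings, where <code>[^\x00-\x7f]</code> matches every character whose code point is 128 or higher:</p>

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7f]")


def strip_non_ascii(text):
    """Remove every character whose code point is 128 or higher."""
    return NON_ASCII.sub("", text)


sample = "Zo\u00eb M\u00fcller"  # contains two non-ASCII letters
print(strip_non_ascii(sample))   # Zo Mller

# Encoding to ASCII while ignoring errors drops the same characters.
print(sample.encode("ascii", errors="ignore").decode("ascii"))  # Zo Mller
```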