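<p>You can use a lookahead expression that looks for the same pattern, a name at the start of a line followed by a colon, to mark where each speaker's text ends:</p>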
<pre><code>s = '''CRO: How far are you from the World Trade Center, how many blocks, about? Three or four blocks?
63FDNY 911 Calls Transcript - EMS - Part 1 9-11-01
CALLER:
CRO: You're welcome. Thank you.
OPERATOR: Bye.
CRO: Bye.
RECORDER: The preceding portion of tape concludes at 0913 hours, 36 seconds.
This tape will continue on side B.
OPERATOR NEWELL: blah blah.
GUY IN DESK: I speak words!'''
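import re
from pprint import pprint

# Group 1: speaker name, from the start of a line up to the first colon.
# Group 2: the text, matched lazily across lines until the lookahead sees
#          the next line that starts with "NAME:" or the end of the string.
pattern = r'^([^:\n]+):\s*(.*?)(?=^[^:\n]+?:|\Z)'
pprint(re.findall(pattern, s, flags=re.MULTILINE | re.DOTALL), width=200)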
</code></pre>
<p>This outputs:</p>
<pre><code>[('CRO', 'How far are you from the World Trade Center, how many blocks, about? Three or four blocks?\n63FDNY 911 Calls Transcript - EMS - Part 1 9-11-01\n'),
 ('CALLER', ''),
 ('CRO', "You're welcome. Thank you.\n"),
 ('OPERATOR', 'Bye.\n'),
 ('CRO', 'Bye.\n'),
 ('RECORDER', 'The preceding portion of tape concludes at 0913 hours, 36 seconds.\nThis tape will continue on side B.\n'),
 ('OPERATOR NEWELL', 'blah blah.\n'),
 ('GUY IN DESK', 'I speak words!')]
</code></pre>
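<p>If the trailing newlines in the captured text are unwanted, one possible variation is sketched below. It is the same pattern written with <code>re.VERBOSE</code> and named groups so each part can be commented. The helper name <code>parse_transcript</code>, the group names, and the sample string are illustrative assumptions, not part of the original answer.</p>

```python
import re

# Same pattern as above, spelled out with re.VERBOSE so each piece is labelled.
# The group names "speaker" and "text" are illustrative choices.
TURN = re.compile(
    r"""
    ^(?P<speaker>[^:\n]+):   # speaker label: start of a line up to the first colon
    \s*                      # whitespace (including newlines) after the colon
    (?P<text>.*?)            # the utterance, matched lazily across lines
    (?=^[^:\n]+?:|\Z)        # stop before the next "LABEL:" line or at end of input
    """,
    flags=re.MULTILINE | re.DOTALL | re.VERBOSE,
)


def parse_transcript(transcript):
    """Hypothetical helper: return (speaker, text) pairs with the text stripped."""
    return [
        (m.group("speaker"), m.group("text").strip())
        for m in TURN.finditer(transcript)
    ]


# Small made-up sample in the same shape as the transcript above.
sample = (
    "CRO: Bye.\n"
    "RECORDER: Tape ends.\n"
    "This tape will continue on side B.\n"
    "CALLER:\n"
    "OPERATOR NEWELL: blah blah."
)
turns = parse_transcript(sample)
for speaker, text in turns:
    print(f"{speaker!r}: {text!r}")
```

<p>Note that, as with the original pattern, any continuation line that itself contains a colon (a timestamp such as <code>09:13</code>, for example) would be treated as the start of a new speaker, so the label part may need tightening for real data.</p>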