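<p>Regarding "I'm not sure whether this is the best approach, or whether it would be easier to search and group all the text at once": the best approach is the one you understand and can get working. The code below is quick and dirty, but it should get you started.</p>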
<pre><code>import pprint
test_data=""" <div class="entry" itemprop="articleBody" id="article-entry">...
<p> CARSON: extremely effectively.</p>
<p> (APPLAUSE)</p>
<p> BAIER: Gentlemen, the next series of questions deals with ObamaCare and the role of the federal government.</p>
<p> Mr. Trump, ObamaCare is one of the things you call a disaster.</p>
<p> TRUMP: A complete disaster, yes.</p>
<p> BAIER: Saying it needs to be repealed and replaced.</p>
<p> TRUMP: Correct.</p>
<p> BAIER: Now, 15 years ago, uncalled yourself a liberal on health care. You were for a single-payer system, a Canadian-style system.</p>
<p> Why were you for that then and why aren't you for it now? TRUMP: First of all, I'd like to just go back to one. In July of 2004, I came out strongly against the war with Iraq, because it was going to destabilize the Middle East. And I'm the only one on this stage that knew that and had the vision to say it. And that's exactly what happened.</p>
<p> BAIER: But on ObamaCare...</p>
<p> TRUMP: And the Middle East became totally destabilized. So I just want to say.</p>
<p> As far as single payer, it works in Canada. It works incredibly well in Scotland. It could have worked in a different age, which is the age you're talking about here.</p>
<p> What I'd like to see is a private system without the artificial lines around every state. I have a big company with thousands and thousands of employees. And if I'm negotiating in New York or in New Jersey or in California, I have like one bidder. Nobody can bid.</p>
<p> You know why?</p>
<p> Because the insurance companies are making a fortune because they have control of the politicians, of course, with the exception of the politicians on this stage.</p>
<p> But they have total control of the politicians. They're making a fortune.</p>
<p> Get rid of the artificial lines and you will have...</p>
<p> (BUZZER NOISE)</p>
<p> TRUMP: yourself great plans. And then we have to take care of the people that can't take care of themselves. And I will do that through a different system.</p>
<p> (CROSSTALK)</p>
<p> BAIER: Mr. Trump, hold up one second.</p>
<p> PAUL: I've got a news flash...</p>"""
## look for 3 capital letters at the start of the second token (the speaker name)
## assume every line starts with "<p>" (so won't test for it)
one_group = []
for record in test_data.split("\n"):
    record = record.strip()
    if len(record):
        split_rec = record.split()
        ## need a second token of at least 3 characters, all of them "A".."Z"
        found = len(split_rec) > 1 and len(split_rec[1]) >= 3
        if found:
            for ltr in split_rec[1][:3]:
                if ltr < "A" or ltr > "Z":
                    found = False
        ## found new name so print previous block
        if found and len(one_group):
            pprint.pprint(one_group)
            print("")
            one_group = []
        one_group.append(record)
## last group
print(one_group)
</code></pre>
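<p>If you would rather collect the groups than print them as you go, the same idea can be sketched with a regular expression. This is only a rough variant, not a tested replacement: the names <code>SPEAKER</code> and <code>group_by_speaker</code> are made up for illustration, and it assumes a speaker label is three or more capital letters followed by a colon right after the <code>&lt;p&gt;</code>, which is stricter than the three-capitals check above.</p>

```python
import re

# Assumption: a speaker label is 3+ capital letters followed by a colon,
# directly after the opening "<p>" (e.g. "<p> TRUMP: ...").
SPEAKER = re.compile(r"^<p>\s*([A-Z]{3,}):")


def group_by_speaker(text):
    """Return a list of (speaker, lines) tuples.

    speaker is None for any lines that come before the first label.
    """
    groups = []
    speaker, lines = None, []
    for record in text.split("\n"):
        record = record.strip()
        if not record:
            continue
        match = SPEAKER.match(record)
        if match:
            # New speaker found, so close the previous block first.
            if lines:
                groups.append((speaker, lines))
            speaker, lines = match.group(1), []
        lines.append(record)
    # Last group
    if lines:
        groups.append((speaker, lines))
    return groups


sample = """<p> TRUMP: Correct.</p>
<p> (APPLAUSE)</p>
<p> BAIER: But on ObamaCare...</p>"""

for speaker, lines in group_by_speaker(sample):
    print(speaker, len(lines))
```

<p>Like the loop above, this only notices a speaker at the start of a line, so a label that appears in the middle of a paragraph stays in the previous speaker's group.</p>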