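<p>You can use <a href="https://docs.python.org/2/library/re.html#re.sub" rel="nofollow noreferrer"><code>re.sub</code></a> to add a newline wherever a stopword occurs.</p>
<p>The regular expression is simple: <code>(and|\.|but|,)</code> matches any of your stopwords. Each match is captured as group 1 and replaced with itself followed by a newline.</p>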
<pre><code>>>> import re
>>> sentence = "Very disorganized and hard professor. Does not come to classes on time, she grades tough, does not help on anything. She says come for help but when you go to her office hour, she is not there to help."
>>> sample = re.sub(r'(and|\.|but|,)', r'\1\n', sentence)
>>> print(sample)
Very disorganized and
 hard professor.
 Does not come to classes on time,
 she grades tough,
 does not help on anything.
 She says come for help but
 when you go to her office hour,
 she is not there to help.
</code></pre>
<p>If you want the pieces as a list:</p>
<pre><code>>>> re.sub(r'(and|\.|but|,)', r'\1\n', sentence).split('\n')
['Very disorganized and', ' hard professor.', ' Does not come to classes on time,', ' she grades tough,', ' does not help on anything.', ' She says come for help but', ' when you go to her office hour,', ' she is not there to help.', '']
</code></pre>
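<p>To drop the whitespace at the start of each following line, also match the whitespace character after each stopword, outside the captured group, so that it is not copied into the replacement:</p>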
<pre><code>sample = re.sub(r'(and|\.|but|,)(?:\s)', r'\1\n', sentence)
</code></pre>
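<p>One caveat: a bare <code>and</code> or <code>but</code> in the pattern also matches inside longer words such as "understand" or "button". Below is a minimal sketch that wraps the same idea in a helper and adds <code>\b</code> word boundaries around the word stopwords. The name <code>split_on_stopwords</code> and its parameters are hypothetical, chosen only for this illustration.</p>

```python
import re

def split_on_stopwords(text, words=("and", "but"), punctuation=(".", ",")):
    """Hypothetical helper: break text after each stopword, return clean lines."""
    # Word stopwords get \b on both sides so "and" does not match in "understand".
    word_part = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    # Punctuation stopwords go into a character class.
    punct_part = "[" + "".join(re.escape(p) for p in punctuation) + "]"
    # Capture the stopword, then swallow any whitespace that follows it.
    pattern = re.compile("(" + word_part + "|" + punct_part + r")\s*")
    broken = pattern.sub(r"\1\n", text)
    # Discard the empty string left behind by the final newline.
    return [line for line in broken.split("\n") if line]

lines = split_on_stopwords("I understand the band, but hand it over and leave.")
print(lines)
```

<p>Using <code>\s*</code> instead of a single <code>\s</code> also handles a stopword at the very end of the string, where no whitespace follows.</p>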