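<p>You can use a regular expression to find the first word after <code>'symptoms'</code>, optionally followed by further matches that each start with a comma, possibly some whitespace, and more word characters:</p>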
<pre><code>import re
pattern = r"symptoms\s+(\w+)(?:,\s*(\w+))*"
regex = re.compile(pattern)
t = "kathy has symptoms cold,cough her gender is female. john's symptoms hunger, thirst."
symptoms = regex.findall(t)
print(symptoms)
</code></pre>
<p>Output:</p>
<pre><code>[('cold', 'cough'), ('hunger', 'thirst')]
</code></pre>
<hr/>
<p>Explanation:</p>
<pre><code>r"symptoms\s+(\w+)(?:,\s*(\w+))*"
# symptoms\s+     literal "symptoms" followed by 1+ whitespace characters
# (\w+)           1+ word characters (first symptom), captured as group 1
# (?:,\s*...)*    non-capturing group, repeated 0+ times: a comma plus optional whitespace
# (\w+)           1+ word characters, captured as group 2; if the group repeats,
#                 only the last repetition is kept (not groups 2..n)
</code></pre>
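<p>A small check of how the repeated group behaves may help here. The sketch below reuses the pattern from above; the sample sentences with three symptoms and with a single symptom are made up for illustration. Python's <code>re</code> module keeps only the last repetition of a capturing group, and <code>findall</code> returns an empty string for a group that did not take part in the match.</p>

```python
import re

# Same pattern as above: group 2 sits inside a repeated non-capturing group.
pattern = re.compile(r"symptoms\s+(\w+)(?:,\s*(\w+))*")

# Hypothetical inputs: three symptoms, and a single symptom.
three_text = "kathy has symptoms cold,cough,fever her gender is female."
single_text = "bob has symptoms nausea today."

three = pattern.findall(three_text)
single = pattern.findall(single_text)

print(three)   # [('cold', 'fever')]  -> 'cough' is lost, group 2 keeps the last repetition
print(single)  # [('nausea', '')]     -> group 2 did not participate in the match
```

<p>So this pattern is only reliable for one or two symptoms; for longer lists it is easier to capture the whole list in one group and split it afterwards.</p>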
<hr/>
<p>Alternative:</p>
<pre><code>import re
pattern = r"symptoms\s+(\w+(?:,\s*\w+)*(?:\s+and\s+\w+)?)"
regex = re.compile(pattern)
t1 = "kathy has symptoms cold,cough,fever and noseitch her gender is female. "
t2 = "john's symptoms hunger, thirst."
symptoms = regex.findall(t1+t2)
print(symptoms)
</code></pre>
<p>Output:</p>
<pre><code>['cold,cough,fever and noseitch', 'hunger, thirst']
</code></pre>
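<p>This only works for the "British" style of writing lists. The "American" style, with a comma before <code>and</code>:</p>
<pre><code>"kathy has symptoms cold,cough,fever, and noseitch"
</code></pre>
<p>will only produce <code>cold,cough,fever, and</code> as the match, because <code>, and</code> is consumed by <code>,\s*\w+</code> as if <code>and</code> were another symptom.</p>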
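<p>If both styles need to be handled, one possible tweak (a sketch, not part of the pattern above) is to stop the comma-separated part from swallowing the word <code>and</code> with a negative lookahead, and to allow an optional comma before the final <code>and</code>. The name <code>oxford_pattern</code> is made up for this example.</p>

```python
import re

# (?!and\b) keeps ", and" from being consumed as if "and" were a symptom;
# ,? lets the final "and ..." part start with an optional comma.
oxford_pattern = re.compile(
    r"symptoms\s+(\w+(?:,\s*(?!and\b)\w+)*(?:,?\s+and\s+\w+)?)"
)

text = (
    "kathy has symptoms cold,cough,fever, and noseitch her gender is female. "
    "john's symptoms hunger, thirst."
)

oxford_matches = oxford_pattern.findall(text)
print(oxford_matches)
# ['cold,cough,fever, and noseitch', 'hunger, thirst']
```

<p>The same pattern still matches the list without the extra comma, since that comma is optional.</p>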
<p>You can split each match at <code>','</code> and <code>" and "</code> to get the individual symptoms:</p>
<pre><code>sym = [ inner.split(",") for inner in (x.replace(" and ",",") for x in symptoms)]
print(sym)
</code></pre>
<p>Output:</p>
<pre><code>[['cold', 'cough', 'fever', 'noseitch'], ['hunger', ' thirst']]
</code></pre>
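<p>Note that <code>' thirst'</code> keeps its leading space, because the text is split at the bare comma. A variation that avoids this, assuming whitespace around the separators is never significant, is to split with <code>re.split</code> so the surrounding whitespace is consumed along with the separator. The names <code>separator</code> and <code>clean</code> are made up for this sketch.</p>

```python
import re

# The matches produced by the alternative pattern above.
matches = ['cold,cough,fever and noseitch', 'hunger, thirst']

# A separator is either a comma with optional surrounding whitespace
# (optionally followed by "and "), or the word "and" between whitespace.
separator = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")

clean = [separator.split(m) for m in matches]
print(clean)
# [['cold', 'cough', 'fever', 'noseitch'], ['hunger', 'thirst']]
```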