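<p>I don't think it is feasible to match this with a regex.</p>
<p>You can use a package called NLTK to <a href="http://www.nltk.org/book/ch05.html" rel="nofollow noreferrer">tokenize the words</a> and get <a href="https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html" rel="nofollow noreferrer">part-of-speech (POS) tags</a> for those tokens. The result is a list of tuples on which you can run your custom business logic.</p>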
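<pre><code>import nltk

# One-time setup: the tokenizer and tagger models must be downloaded first, e.g.
# nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
# (the exact resource names can differ between NLTK versions)
text = "Honestly 'The Art of War' should be required reading in schools (outside of China), it has so much wisdom packed into it that is so sorely lacking in our current education system."
tokens = nltk.word_tokenize(text)  # split the sentence into word tokens
pos_tags = nltk.pos_tag(tokens)    # list of (token, POS tag) tuples
print(pos_tags)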
</code></pre>
<p>Output:</p>
<pre><code>[
('Honestly', 'RB'),
("'The", 'POS'),
('Art', 'NNP'),
('of', 'IN'),
('War', 'NNP'),
("'", 'POS'),
('should', 'MD'),
('be', 'VB'),
('required', 'VBN'),
('reading', 'NN'),
('in', 'IN'),
('schools', 'NNS'),
('(', '('),
('outside', 'IN'),
('of', 'IN'),
('China', 'NNP'),
(')', ')'),
(',', ','),
('it', 'PRP'),
('has', 'VBZ'),
('so', 'RB'),
('much', 'JJ'),
('wisdom', 'NN'),
('packed', 'VBD'),
('into', 'IN'),
('it', 'PRP'),
('that', 'WDT'),
('is', 'VBZ'),
('so', 'RB'),
('sorely', 'RB'),
('lacking', 'VBG'),
('in', 'IN'),
('our', 'PRP$'),
('current', 'JJ'),
('education', 'NN'),
('system', 'NN'),
('.', '.')
]
</code></pre>
<p>Here <code>'IN'</code> marks a preposition (in the Penn Treebank tagset this tag also covers subordinating conjunctions).</p>
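<p>As a rough sketch of what the "custom business logic" step could look like, the snippet below filters the tagged tuples by tag. The helper name <code>words_with_tag</code> and the hard-coded <code>sample_tags</code> list (a few tuples copied from the output above) are assumptions made for illustration, so the example runs without NLTK installed; in practice you would pass in <code>pos_tags</code> directly.</p>

```python
# Hypothetical helper, not part of NLTK: keep only the tokens carrying a given POS tag.
def words_with_tag(tagged, tag):
    """Return the tokens from a list of (token, tag) tuples whose tag matches."""
    return [word for word, word_tag in tagged if word_tag == tag]


# A few (token, tag) tuples copied from the output above, hard-coded for illustration.
sample_tags = [
    ("Art", "NNP"),
    ("of", "IN"),
    ("War", "NNP"),
    ("should", "MD"),
    ("outside", "IN"),
    ("of", "IN"),
    ("China", "NNP"),
]

print(words_with_tag(sample_tags, "IN"))   # ['of', 'outside', 'of']
print(words_with_tag(sample_tags, "NNP"))  # ['Art', 'War', 'China']
```

<p>The same pattern works for any tag in the Penn Treebank list, so you can adapt the filter to whichever word classes your rule needs to treat specially.</p>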