<p>Following several other posts [e.g. <a href="https://stackoverflow.com/questions/3434144/detect-english-verb-tenses-using-nltk">Detect English verb tenses using NLTK</a>, <a href="https://stackoverflow.com/questions/19966345/identifying-verb-tenses-in-python">Identifying verb tenses in python</a>, <a href="https://stackoverflow.com/questions/2539782/python-nltk-figure-out-tense">Python NLTK figure out tense</a>], I wrote the following code to determine the tense of a sentence in Python using POS tagging:</p>
<pre><code>from nltk import word_tokenize, pos_tag

def determine_tense_input(sentence):
    # Tokenize and POS-tag the sentence (Penn Treebank tags).
    text = word_tokenize(sentence)
    tagged = pos_tag(text)

    # Count the tags associated with each tense.
    tense = {}
    # MD is the modal tag (will, would, can, ...), used here as a proxy for future.
    tense["future"] = len([word for word in tagged if word[1] == "MD"])
    tense["present"] = len([word for word in tagged if word[1] in ["VBP", "VBZ", "VBG"]])
    tense["past"] = len([word for word in tagged if word[1] in ["VBD", "VBN"]])
    return tense
</code></pre>
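<p>This returns counts of past/present/future verb usage, and I then typically take the tense with the highest count as the tense of the sentence. The accuracy is reasonably decent, but I am wondering whether there is a better way of doing this.</p>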
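<p>The "take the max value" step could look roughly like the sketch below. It works on the dictionary returned above, so it runs without the NLTK models. The helper name <code>dominant_tense</code> is hypothetical, and returning <code>None</code> for ties or for sentences with no matching tags is an assumption of this sketch, not something the original code does.</p>

```python
def dominant_tense(counts):
    """Return the tense with the highest count.

    Returns None when there are no counts, when every count is zero,
    or when two or more tenses tie for the highest count
    (an assumption of this sketch).
    """
    if not counts:
        return None
    best = max(counts.values())
    if best == 0:
        return None
    # Collect every tense that reaches the highest count to detect ties.
    winners = [tense for tense, n in counts.items() if n == best]
    return winners[0] if len(winners) == 1 else None


# Example counts in the shape produced by determine_tense_input.
example = {"future": 0, "present": 1, "past": 2}
print(dominant_tense(example))  # prints "past"
```

<p>Making ties explicit seems safer than relying on dictionary order, since a sentence such as one with one <code>VBD</code> and one <code>VBZ</code> tag has no clear majority.</p>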
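<p>For example, is there by any chance now a package that is more dedicated to extracting the tense of a sentence? [Note: 2 of the 3 Stack Overflow posts are 4 years old, so things may have changed by now.] Or, alternatively, should I be using a different parser than nltk's to improve accuracy? If not, I hope the above code may help someone else!</p>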