Determining the tense of a sentence in Python

from nltk import word_tokenize, pos_tag

def determine_tense_input(sentence):
    # Tokenize the sentence, then tag each token with its part of speech
    text = word_tokenize(sentence)
    tagged = pos_tag(text)

    # Count the tags associated with each tense
    tense = {}
    tense["future"] = len([word for word in tagged if word[1] == "MD"])
    tense["present"] = len([word for word in tagged if word[1] in ["VBP", "VBZ", "VBG"]])
    tense["past"] = len([word for word in tagged if word[1] in ["VBD", "VBN"]])
    return tense

3 answers

User

#1 · edited 2024-09-26 17:55:52

According to http://dev.lexalytics.com/wiki/pmwiki.php?n=Main.POSTags, the tags mean:

MD  Modal verb (can, could, may, must)
VB  Base verb (take)
VBC Future tense, conditional
VBD Past tense (took)
VBF Future tense
VBG Gerund, present participle (taking)
VBN Past participle (taken)
VBP Present tense (take)
VBZ Present 3rd person singular (takes)

So your code would be

tense["future"] = len([word for word in tagged if word[1] in ["VBC", "VBF"]])
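
One caveat worth checking: VBC and VBF come from the Lexalytics tag list quoted above. NLTK's default pos_tag uses the Penn Treebank tagset, which as far as I know has no such tags, so with the default tagger that count would stay at zero. Below is a minimal sketch of an alternative, not part of the original answer, that stays with Penn Treebank tags and counts "will"/"shall" modals followed by a base-form verb. The helper name count_future and the FUTURE_MODALS set are hypothetical, and the input is hand-tagged so the example runs without any NLTK models.

```python
# Hypothetical helper (not from the original answer): count "will"/"shall"
# modals that are followed, possibly after adverbs, by a base-form verb (VB).
FUTURE_MODALS = {"will", "shall", "'ll", "wo"}  # "wo" is what remains of "won't" after tokenizing


def count_future(tagged):
    """tagged: list of (word, tag) pairs, e.g. the output of nltk.pos_tag."""
    count = 0
    for i, (word, tag) in enumerate(tagged):
        if tag != "MD" or word.lower() not in FUTURE_MODALS:
            continue
        # Skip adverbs such as "not" or "n't" between the modal and the verb
        j = i + 1
        while j < len(tagged) and tagged[j][1] == "RB":
            j += 1
        if j < len(tagged) and tagged[j][1] == "VB":
            count += 1
    return count


# Hand-tagged input, so no tagger model is needed to run this
tagged = [("I", "PRP"), ("will", "MD"), ("not", "RB"), ("go", "VB"), ("home", "NN")]
print(count_future(tagged))  # prints 1
```

Other modals such as "can" or "must" are deliberately ignored here, since an MD tag on its own does not imply future tense.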

User

#2 · edited 2024-09-26 17:55:52

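You can strengthen your approach in various ways. You could think more about the grammar of English and add more rules based on what you observe; or you could push the statistical approach, extract more (relevant) features, and throw the whole lot at a classifier. NLTK gives you plenty of classifiers to work with, and they are well documented in the NLTK book.

You can have the best of both worlds: hand-written rules can take the form of features that are fed to the classifier, which will decide when it can rely on them.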
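
A rough sketch of what "rules as features" could look like, assuming Penn Treebank tags and a set of sentences you have already labeled with their tense (no such data is given in this thread). The function names tense_features and train_tense_classifier and the particular features are illustrative choices, not something the answer prescribes. NaiveBayesClassifier is one of the classifiers NLTK ships.

```python
def tense_features(tagged):
    """Turn hand-written rules into a feature dict for a classifier.

    tagged: list of (word, tag) pairs with Penn Treebank tags.
    """
    tags = [tag for _, tag in tagged]
    words = [word.lower() for word, _ in tagged]
    return {
        "has_past_verb": any(t in ("VBD", "VBN") for t in tags),
        "has_present_verb": any(t in ("VBP", "VBZ", "VBG") for t in tags),
        "has_modal": "MD" in tags,
        "has_will_or_shall": any(w in ("will", "shall", "'ll") for w in words),
    }


def train_tense_classifier(labeled_sentences):
    """labeled_sentences: list of (tagged_sentence, tense_label) pairs."""
    # Imported lazily so the feature extractor works without nltk installed
    from nltk import NaiveBayesClassifier
    train_set = [(tense_features(tagged), label)
                 for tagged, label in labeled_sentences]
    return NaiveBayesClassifier.train(train_set)


# Hand-tagged example, for illustration only
example = [("She", "PRP"), ("will", "MD"), ("leave", "VB")]
print(tense_features(example))
```

With labeled data in hand, you would call train_tense_classifier(data) once and then classifier.classify(tense_features(tagged)) for each new sentence. The classifier learns how much weight each rule deserves instead of you hard-coding it.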

User

#3 · edited 2024-09-26 17:55:52

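You could use the Stanford Parser to get a dependency parse of the sentence. The root of the dependency parse will be the "main" verb that defines the sentence (I'm not too sure what the specific linguistic term is). You can then take the POS tag of this verb to find its tense, and use that.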
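
To make the idea concrete, here is a minimal sketch that assumes the parse has already been produced and is available as CoNLL-style rows (index, word, tag, head, relation), where a head of 0 marks the root. Running the Stanford Parser itself is left out, since it needs a separate Java setup. The row layout, the function name root_tense and the tag-to-tense mapping (borrowed from the question's code) are all assumptions made for illustration.

```python
# Assumption: rows are (index, word, tag, head, relation), 1-indexed,
# with head == 0 marking the root of the dependency parse.
TAG_TO_TENSE = {
    "VBD": "past",
    "VBN": "past",
    "VBP": "present",
    "VBZ": "present",
    "VBG": "present",
}


def root_tense(rows):
    """Return (root_word, tense) based on the POS tag of the parse root."""
    for index, word, tag, head, rel in rows:
        if head == 0:
            return word, TAG_TO_TENSE.get(tag, "unknown")
    return None, "unknown"


# Hand-written parse of "She took the train"
rows = [
    (1, "She", "PRP", 2, "nsubj"),
    (2, "took", "VBD", 0, "root"),
    (3, "the", "DT", 4, "det"),
    (4, "train", "NN", 2, "dobj"),
]
print(root_tense(rows))  # prints ('took', 'past')
```

A root tagged VB (as in "She will leave") comes back as "unknown" here; handling it would mean also inspecting the root's auxiliary dependents, such as a "will" tagged MD.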
