The Python concordance command in NLTK

2 answers

User

Answer 1 · edited 2024-06-28 10:53:21

I got it working with this code:

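import sys
from nltk.tokenize import word_tokenize
from nltk.text import Text

def main():
    # Exit quietly when no file path was given on the command line
    if len(sys.argv) < 2:
        return
    # Read the whole text file
    with open(sys.argv[1], "r") as f:
        text = f.read()
    tokens = word_tokenize(text)  # needs the NLTK punkt tokenizer data
    textList = Text(tokens)
    # Print every occurrence of 'is' with its surrounding context
    textList.concordance('is')
    print(tokens)


if __name__ == '__main__':
    main()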

Based on this site.

User

Answer 2 · edited 2024-06-28 10:53:21

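.concordance() is a special nltk function, so you can't call it on just any Python object (such as your list).

More specifically: .concordance() is a method of the Text class of nltk.

Basically, if you want to use .concordance(), you must first instantiate a Text object and then call the method on that object.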

Text

A Text is typically initialized from a given document or corpus. E.g.:
import nltk.corpus  
from nltk.text import Text  
moby = Text(nltk.corpus.gutenberg.words('melville-moby_dick.txt'))

.concordance()

concordance(word, width=79, lines=25)
Print a concordance for word with the specified context window. Word matching is not case-sensitive.
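To illustrate the point above, here is a minimal sketch, assuming NLTK is installed. The sample sentence and the helper name build_text are made up for illustration, and a plain whitespace split stands in for a real tokenizer so that no extra NLTK data is needed.

```python
from nltk.text import Text


def build_text(raw):
    """Wrap whitespace-separated tokens in an nltk Text (hypothetical helper)."""
    # str.split() is only a stand-in tokenizer for this sketch.
    return Text(raw.split())


# Made-up sample text, not taken from the question.
sample = "The whale is large . The sea is deep . A list is not a Text ."
text = build_text(sample)

# width is the total width of each printed line in characters,
# lines caps how many matches are printed.
text.concordance("is", width=40, lines=2)

# A plain list has no concordance method, which is the error in the question.
try:
    sample.split().concordance("is")
except AttributeError as err:
    print("plain list:", err)
```

The only difference between the two calls is the wrapping step: the same tokens work once they are passed through Text.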

So I think something like this would work (untested):

import nltk.corpus
from nltk.text import Text
# Note: gutenberg.words() only accepts file IDs that ship with the Gutenberg corpus
textList = Text(nltk.corpus.gutenberg.words('YOUR FILE NAME HERE.txt'))
textList.concordance('CNA')
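Since gutenberg.words() only resolves files that are part of the Gutenberg corpus, a file of your own probably has to be read and tokenized directly. The following is a rough sketch of that variant, not the answer's original method. The helper name text_from_file, the temporary demo file, and its contents are all assumptions for illustration. wordpunct_tokenize is used because it is regex-based and needs no downloaded model.

```python
import os
import tempfile

from nltk.text import Text
from nltk.tokenize import wordpunct_tokenize


def text_from_file(path, encoding="utf-8"):
    """Read a local text file and wrap its tokens in a Text (hypothetical helper)."""
    with open(path, encoding=encoding) as handle:
        return Text(wordpunct_tokenize(handle.read()))


# Write a small made-up file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("CNA reported the story. Later, CNA issued a correction.")
    demo_path = tmp.name

demo_text = text_from_file(demo_path)
demo_text.concordance("CNA", width=60, lines=5)

# Clean up the temporary demo file.
os.remove(demo_path)
```

In real use you would pass the path of your own file to text_from_file and drop the temporary-file part.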
