How do I count the hyponyms of a noun that have no hyponyms of their own, using NLTK and WordNet?

Posted 2024-09-22 16:26:07


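I am trying to count, for a given noun, all of its hyponyms that have no hyponyms themselves (the terminal nodes of the noun hierarchy below that noun). For example, for "entity" (the topmost noun in the hierarchy), the result should be the count of all nouns without hyponyms (every terminal noun in the hierarchy). For a noun that is itself terminal, the count must be 1. I have a list of nouns, and the output must give this count for each noun in the list.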
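To pin down the expected numbers, here is a minimal sketch on a made-up hierarchy. The dictionary `TOY_HYPONYMS` and the function name `terminal_nouns` are hypothetical and are not taken from WordNet or NLTK; they only illustrate the definition of the count described above.

```python
# Toy hierarchy (made-up data, not taken from WordNet):
# each noun maps to the list of its direct hyponyms.
TOY_HYPONYMS = {
    "entity": ["animal", "plant"],
    "animal": ["dog", "cat"],
    "plant": ["tree"],
    "dog": [],
    "cat": [],
    "tree": [],
}


def terminal_nouns(noun, hyponyms=TOY_HYPONYMS):
    """Return the set of nouns at or below `noun` that have no hyponyms."""
    direct = hyponyms.get(noun, [])
    if not direct:
        return {noun}  # a terminal noun counts itself, so its count is 1
    leaves = set()
    for child in direct:
        leaves |= terminal_nouns(child, hyponyms)
    return leaves


for noun in ("entity", "animal", "tree"):
    print(noun, len(terminal_nouns(noun)))
# entity 3
# animal 2
# tree 1
```

Collecting the terminal nouns in a set rather than adding up numbers means a noun reachable through two different parents is still counted once.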

After a lot of searching and trial and error, this is the code I came up with (relevant part only):

import nltk
from nltk.corpus import wordnet as wn

def get_hyponyms(synset): #function source:https://stackoverflow.com/questions/15330725/how-to-get-all-the-hyponyms-of-a-word-synset-in-python-nltk-and-wordnet?rq=1
    hyponyms = set()
    for hyponym in synset.hyponyms():
        hyponyms |= set(get_hyponyms(hyponym))
    return hyponyms | set(synset.hyponyms())

with open("list-nouns.txt", "r") as wordList1:  #the "rU" mode was removed in Python 3.11
    myList1 = [line.rstrip('\n') for line in wordList1]
    for word1 in myList1:
        list1 = wn.synsets(word1, pos='n')
        countTerminalWord1 = 0  #counter for synsets without hyponyms
        countHyponymsWord1 = 0  #counter for synsets with hyponyms
        for syn_set1 in list1:
            syn_set11a = get_hyponyms(syn_set1)
            n = len(get_hyponyms(syn_set1))  #number of hyponyms
            if n > 0:
                countHyponymsWord1 += n
            else:
                countTerminalWord1 += 1
            for syn_set11 in syn_set11a:
                syn_set111a = get_hyponyms(syn_set11)
                n = len(get_hyponyms(syn_set11))
                if n > 0:
                    countHyponymsWord1 += n
                else: 
                    countTerminalWord1 += 1
                #...further iterates in the same way for the following levels
        print (countHyponymsWord1)
        print (countTerminalWord1)

(The code also tries to count all nouns that do have hyponyms, but that is not required.)

The main problem is that I cannot repeat this code by hand for all 19 levels of the noun hierarchy. Very quickly I get "SystemError: too many statically nested blocks".
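One way around the nesting limit is to replace the hand-written levels of loops with a single loop over an explicit stack, so the depth of the hierarchy no longer matters. The following is only a sketch under assumptions: `count_terminal_nodes` and the toy `chain` data are hypothetical names for illustration, and the `children` callable stands in for whatever returns the direct hyponyms of a node (for an NLTK synset that should be `lambda s: s.hyponyms()`).

```python
def count_terminal_nodes(start, children):
    """Count the distinct nodes at or below `start` that have no children.

    `children` is a callable returning the direct children of a node.
    For an NLTK synset this should be `lambda s: s.hyponyms()` (assumption:
    this sketch is only exercised on toy data below, not on WordNet).
    """
    seen = {start}      # nodes already scheduled, so shared nodes count once
    stack = [start]     # explicit stack instead of one nested loop per level
    terminals = 0
    while stack:
        node = stack.pop()
        kids = children(node)
        if not kids:
            terminals += 1  # no children: this is a terminal node
        for kid in kids:
            if kid not in seen:
                seen.add(kid)
                stack.append(kid)
    return terminals


# Made-up chain 25 levels deep, deeper than the 19 levels mentioned above.
chain = {level: [level + 1] for level in range(25)}
chain[25] = []
chain[3] = [4, 25]  # a second path to the same terminal node

print(count_terminal_nodes(0, lambda n: chain[n]))   # 1
print(count_terminal_nodes(25, lambda n: chain[n]))  # 1
```

For a word with several noun synsets, the function would be called once per synset returned by `wn.synsets(word, pos='n')`; whether those per-synset counts should be summed or merged is left open here, since the question does not say.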

Any help or suggestions would be much appreciated.

