Named entity recognition with NLTK in Python: identifying the NE

>>> sentence = "I am Jhon from America"
>>> sent1 = nltk.word_tokenize(sentence)
>>> sent2 = nltk.pos_tag(sent1)
>>> sent3 = nltk.ne_chunk(sent2, binary=True)
>>> sent3
Tree('S', [('I', 'PRP'), ('am', 'VBP'), Tree('NE', [('Jhon', 'NNP')]), ('from', 'IN'), Tree('NE', [('America', 'NNP')])])

>>> sent3[2]
Tree('NE', [('Jhon', 'NNP')])
>>> sent3[2][0]
('Jhon', 'NNP')
>>> sent3[2][1]
Traceback (most recent call last):
  File "<pyshell#121>", line 1, in <module>
    sent3[2][1]
  File "C:\Python26\lib\site-packages\nltk\tree.py", line 139, in __getitem__
    return list.__getitem__(self, index)
IndexError: list index out of range

3 answers

User

#1 · edited 2024-10-05 14:30:40

This answer may be off base, in which case I'll delete it, since I don't have NLTK installed to try it. But I think you can do this:

   >>> sent3[2].node
   'NE'

sent3[2][0] returns the first child of the tree, not the node itself.

Edit: I tried this when I got home, and it does work.
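To make the distinction between a subtree's label and its children concrete, here is a minimal sketch. It assumes NLTK 3.0 or later, where the `node` attribute used above was replaced by the `label()` method. The tree is built by hand to mirror the output in the question, so no tokenizer or chunker models are needed.

```python
from nltk.tree import Tree

# Hand-built copy of the tree printed in the question (an assumption made
# so the example runs without downloading any NLTK models).
sent3 = Tree('S', [
    ('I', 'PRP'),
    ('am', 'VBP'),
    Tree('NE', [('Jhon', 'NNP')]),
    ('from', 'IN'),
    Tree('NE', [('America', 'NNP')]),
])

subtree = sent3[2]

print(subtree.label())  # the node label of the subtree: NE
print(subtree[0])       # its first child, a (word, tag) tuple
print(len(subtree))     # number of children: 1, so subtree[1] is out of range
```

Indexing a `Tree` only ever reaches its children, which is why `sent3[2][1]` raises `IndexError` in the question: the `NE` subtree has a single child.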

User

#2 · edited 2024-10-05 14:30:40

This will do it:

for sent in chunked_sentences:
    for chunk in sent:
        if hasattr(chunk, "label"):
            print(chunk.label())

User

#3 · edited 2024-10-05 14:30:40

Here is my code:

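from nltk import ne_chunk

myNE = []  # collected named-entity strings
chunks = ne_chunk(postags, binary=True)  # postags: POS-tagged tokens, e.g. from nltk.pos_tag
for c in chunks:
    # `node` is the pre-3.0 NLTK attribute; on NLTK 3+ test for 'label' instead
    if hasattr(c, 'node'):
        myNE.append(' '.join(i[0] for i in c.leaves()))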
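The loops in the answers above can be folded into one small helper. This is only a sketch under a few assumptions: it targets NLTK 3.0 or later (`label()` rather than `node`), the function name `named_entities` is made up for illustration, and it swaps the `hasattr` check for `isinstance(child, Tree)`, which should be equivalent here because plain tokens are `(word, tag)` tuples. The input tree is again built by hand from the question's output.

```python
from nltk.tree import Tree


def named_entities(chunked):
    """Return (label, text) pairs for each top-level chunk in a chunked sentence.

    `named_entities` is a hypothetical helper name, not part of NLTK.
    """
    entities = []
    for child in chunked:
        # Plain tokens are (word, tag) tuples; only chunks are Tree instances.
        if isinstance(child, Tree):
            text = ' '.join(word for word, tag in child.leaves())
            entities.append((child.label(), text))
    return entities


# Hand-built copy of the question's ne_chunk(..., binary=True) result.
chunked = Tree('S', [
    ('I', 'PRP'),
    ('am', 'VBP'),
    Tree('NE', [('Jhon', 'NNP')]),
    ('from', 'IN'),
    Tree('NE', [('America', 'NNP')]),
])

print(named_entities(chunked))  # [('NE', 'Jhon'), ('NE', 'America')]
```

Returning the label alongside the text keeps the helper usable when `binary=False` is passed to `ne_chunk`, in which case the chunks carry more specific labels than `NE`.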
