How do I generate a tree from the output of a dependency parser?

{'shot': (('ROOT', 'ROOT'), {'I': (('nsubj', 'shot'), {}), 'elephant': (('dobj', 'shot'), {'an': (('det', 'elephant'), {})}), 'sleep': (('nmod', 'shot'), {'in': (('case', 'sleep'), {}), 'my': (('nmod:poss', 'sleep'), {})})})}

2 Answers

User

Answer 1 · edited 2024-07-05 15:02:46

This converts the output into a nested dictionary form. I will let you know if I manage to find the paths as well. Maybe this helps.

list_of_tuples = [('ROOT','ROOT', 'shot'),('nsubj','shot', 'I'),('det','elephant', 'an'),('dobj','shot', 'elephant'),('case','sleep', 'in'),('nmod:poss','sleep', 'my'),('nmod','shot', 'sleep')]

# Create one node per word, keyed by the word itself.
# Note: a word that occurs twice in the sentence would overwrite its earlier node.
nodes = {}

for rel, parent, child in list_of_tuples:
    nodes[child] = {'Name': child, 'Relationship': rel}

forest = []

# Attach each node to its parent; the node whose parent is 'ROOT' is a tree root.
for rel, parent, child in list_of_tuples:
    node = nodes[child]

    if parent == 'ROOT':
        forest.append(node)
    else:
        parent_node = nodes[parent]
        parent_node.setdefault('children', []).append(node)

print(forest)

The output is a list of nested dictionaries:

[{'Name': 'shot', 'Relationship': 'ROOT', 'children': [{'Name': 'I', 'Relationship': 'nsubj'}, {'Name': 'elephant', 'Relationship': 'dobj', 'children': [{'Name': 'an', 'Relationship': 'det'}]}, {'Name': 'sleep', 'Relationship': 'nmod', 'children': [{'Name': 'in', 'Relationship': 'case'}, {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]
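For a more compact view of the same structure, the nested dictionaries can be flattened into a bracketed string. This is only a minimal sketch: `to_bracketed` is a hypothetical helper name, and the `forest` literal simply repeats the output shown above.

```python
# The forest printed above, repeated here so the sketch is self-contained.
forest = [{'Name': 'shot', 'Relationship': 'ROOT', 'children': [
    {'Name': 'I', 'Relationship': 'nsubj'},
    {'Name': 'elephant', 'Relationship': 'dobj',
     'children': [{'Name': 'an', 'Relationship': 'det'}]},
    {'Name': 'sleep', 'Relationship': 'nmod', 'children': [
        {'Name': 'in', 'Relationship': 'case'},
        {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]


def to_bracketed(node):
    """Hypothetical helper: render a node as '(head child child ...)'."""
    children = node.get('children', [])
    if not children:
        # A leaf is rendered as the bare word.
        return node['Name']
    parts = [node['Name']] + [to_bracketed(child) for child in children]
    return '(' + ' '.join(parts) + ')'


bracketed = [to_bracketed(root) for root in forest]
print(bracketed[0])  # (shot I (elephant an) (sleep in my))
```

The relation labels are dropped in this rendering; only the head-to-dependent structure is kept.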

The following function can help you find the root-to-leaf paths:

^{pr2}$
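The function itself is not preserved in this copy of the answer. As a minimal sketch of what such a function could look like (not necessarily the answerer's version), the recursion below walks the nested dictionaries and collects one list of words per leaf. The name `root_to_leaf_paths` is an assumption made for illustration.

```python
# The forest built by the code above, repeated so the sketch is self-contained.
forest = [{'Name': 'shot', 'Relationship': 'ROOT', 'children': [
    {'Name': 'I', 'Relationship': 'nsubj'},
    {'Name': 'elephant', 'Relationship': 'dobj',
     'children': [{'Name': 'an', 'Relationship': 'det'}]},
    {'Name': 'sleep', 'Relationship': 'nmod', 'children': [
        {'Name': 'in', 'Relationship': 'case'},
        {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]


def root_to_leaf_paths(node, prefix=()):
    """Hypothetical helper: return every path from `node` down to a leaf."""
    path = prefix + (node['Name'],)
    children = node.get('children', [])
    if not children:
        # No dependents: this path is complete.
        return [list(path)]
    paths = []
    for child in children:
        paths.extend(root_to_leaf_paths(child, path))
    return paths


paths = [p for root in forest for p in root_to_leaf_paths(root)]
for p in paths:
    print(' -> '.join(p))
```

For the example sentence this yields four paths, one per leaf word (I, an, in, my).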

User

Answer 2 · edited 2024-07-05 15:02:46

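First, if you are just using the pre-trained model for the Stanford CoreNLP dependency parser, you should use CoreNLPDependencyParser from nltk.parse.corenlp and avoid the old nltk.parse.stanford interface.

See Stanford Parser and NLTK.

After downloading and running the Java server in a terminal, in Python: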

>>> from nltk.parse.corenlp import CoreNLPDependencyParser
>>> dep_parser = CoreNLPDependencyParser(url='http://localhost:9000')
>>> sent = "I shot an elephant with a banana .".split()
>>> parses = list(dep_parser.parse(sent))
>>> type(parses[0])
<class 'nltk.parse.dependencygraph.DependencyGraph'>

Now we can see that the parse is a DependencyGraph, from nltk.parse.dependencygraph: https://github.com/nltk/nltk/blob/develop/nltk/parse/dependencygraph.py#L36

To convert the DependencyGraph to an nltk.tree.Tree object, just call DependencyGraph.tree():

>>> parses[0].tree()
Tree('shot', ['I', Tree('elephant', ['an']), Tree('banana', ['with', 'a']), '.'])

To convert it to the bracketed parse format:

>>> print(parses[0].tree())
(shot I (elephant an) (banana with a) .)

If you are looking for dependency triples:

>>> [(governor, dep, dependent) for governor, dep, dependent in parses[0].triples()]
[(('shot', 'VBD'), 'nsubj', ('I', 'PRP')), (('shot', 'VBD'), 'dobj', ('elephant', 'NN')), (('elephant', 'NN'), 'det', ('an', 'DT')), (('shot', 'VBD'), 'nmod', ('banana', 'NN')), (('banana', 'NN'), 'case', ('with', 'IN')), (('banana', 'NN'), 'det', ('a', 'DT')), (('shot', 'VBD'), 'punct', ('.', '.'))]

>>> for governor, dep, dependent in parses[0].triples():
...     print(governor, dep, dependent)
... 
('shot', 'VBD') nsubj ('I', 'PRP')
('shot', 'VBD') dobj ('elephant', 'NN')
('elephant', 'NN') det ('an', 'DT')
('shot', 'VBD') nmod ('banana', 'NN')
('banana', 'NN') case ('with', 'IN')
('banana', 'NN') det ('a', 'DT')
('shot', 'VBD') punct ('.', '.')
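These triples can also be folded back into the nested-dictionary shape used in the first answer. The sketch below does this on the literal triples printed above, so it runs without a CoreNLP server. `triples_to_tree` is a hypothetical helper, and it assumes every word occurs only once in the sentence, because the triples carry (word, tag) pairs rather than token positions.

```python
# Triples copied from the output above.
triples = [
    (('shot', 'VBD'), 'nsubj', ('I', 'PRP')),
    (('shot', 'VBD'), 'dobj', ('elephant', 'NN')),
    (('elephant', 'NN'), 'det', ('an', 'DT')),
    (('shot', 'VBD'), 'nmod', ('banana', 'NN')),
    (('banana', 'NN'), 'case', ('with', 'IN')),
    (('banana', 'NN'), 'det', ('a', 'DT')),
    (('shot', 'VBD'), 'punct', ('.', '.')),
]


def triples_to_tree(triples):
    """Hypothetical helper; assumes each word appears once in the sentence."""
    children = {}       # governor word -> [(relation, dependent word), ...]
    dependents = set()  # every word that has a governor
    for (governor, _), rel, (dependent, _) in triples:
        children.setdefault(governor, []).append((rel, dependent))
        dependents.add(dependent)

    def build(word, rel):
        return {'Name': word, 'Relationship': rel,
                'children': [build(d, r) for r, d in children.get(word, [])]}

    # A root is a governor that is never anyone's dependent.
    roots = [word for word in children if word not in dependents]
    return [build(root, 'ROOT') for root in roots]


tree = triples_to_tree(triples)
print(tree[0]['Name'], [c['Name'] for c in tree[0]['children']])
```

If a sentence repeats a word, working from token indices is safer than working from the words themselves.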

CoNLL format:

>>> print(parses[0].to_conll(style=10))
1   I   I   PRP PRP _   2   nsubj   _   _
2   shot    shoot   VBD VBD _   0   ROOT    _   _
3   an  a   DT  DT  _   4   det _   _
4   elephant    elephant    NN  NN  _   2   dobj    _   _
5   with    with    IN  IN  _   7   case    _   _
6   a   a   DT  DT  _   7   det _   _
7   banana  banana  NN  NN  _   2   nmod    _   _
8   .   .   .   .   _   2   punct   _   _
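Because each CoNLL row carries a token index and a head index, this format is a convenient starting point for building a tree that stays correct even when a word is repeated. A minimal sketch follows, assuming the 10-column layout shown above with no spaces inside a field; `conll_to_tree` is a hypothetical helper name.

```python
# The CoNLL output above, with columns separated by whitespace.
conll = """
1 I I PRP PRP _ 2 nsubj _ _
2 shot shoot VBD VBD _ 0 ROOT _ _
3 an a DT DT _ 4 det _ _
4 elephant elephant NN NN _ 2 dobj _ _
5 with with IN IN _ 7 case _ _
6 a a DT DT _ 7 det _ _
7 banana banana NN NN _ 2 nmod _ _
8 . . . . _ 2 punct _ _
"""


def conll_to_tree(text):
    """Hypothetical helper: build nested dicts keyed by token index."""
    nodes = {}
    links = []  # (token index, head index) pairs in sentence order
    for line in text.strip().splitlines():
        fields = line.split()
        index, word = int(fields[0]), fields[1]
        head, rel = int(fields[6]), fields[7]
        nodes[index] = {'Name': word, 'Relationship': rel, 'children': []}
        links.append((index, head))

    forest = []
    for index, head in links:
        if head == 0:
            # Head index 0 marks the root of the sentence.
            forest.append(nodes[index])
        else:
            nodes[head]['children'].append(nodes[index])
    return forest


conll_forest = conll_to_tree(conll)
print(conll_forest[0]['Name'], [c['Name'] for c in conll_forest[0]['children']])
```

Keying the nodes by index rather than by word is the design choice that avoids collisions between repeated words.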
