Stanford CoreNLP splits entity words, ignoring the underscore "_"

Posted 2024-09-29 19:35:12


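I am using Stanford CoreNLP with Python (pycorenlp) to parse the sentence below, and I want to extract its dependency parse. When I inspect the result, however, I find that entity words joined by an underscore "_" are split apart, so the underscore is not treated as part of the word.

For example, the entity "lauren_conrad" is split into three tokens: lauren, _, conrad.

Here is an example of the text and the code: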

from stanfordcorenlp import StanfordCoreNLP

# Path to the local CoreNLP distribution
nlp = StanfordCoreNLP(r'C:\Users\Liao\Desktop\stanford-corenlp-full-2018-10-05')

sentence = "last november , lauren_conrad , 19 , attended a celebrity-filled function thrown by teen people magazine , a perk of being a central character on mtv 's popular reality series '' laguna_beach : the real orange county ."

# Run the dependency parser on the sentence
depend_path = nlp.dependency_parse(sentence)
print(depend_path)

nlp.close()

The output is as follows:

root(ROOT-0, attended-10)
amod(november-2, last-1)
nmod:tmod(attended-10, november-2)
punct(attended-10, ,-3)
compound(conrad-6, lauren-4)
compound(conrad-6, _-5)
nsubj(attended-10, conrad-6)
punct(conrad-6, ,-7)
amod(conrad-6, 19-8)
punct(conrad-6, ,-9)
det(function-13, a-11)
amod(function-13, celebrity-filled-12)
dobj(attended-10, function-13)
acl(function-13, thrown-14)
case(magazine-18, by-15)
amod(magazine-18, teen-16)
compound(magazine-18, people-17)
nmod:by(thrown-14, magazine-18)
punct(magazine-18, ,-19)
det(perk-21, a-20)
appos(magazine-18, perk-21)
mark(character-26, of-22)
cop(character-26, being-23)
det(character-26, a-24)
amod(character-26, central-25)
acl(perk-21, character-26)
case(series-32, on-27)
nmod:poss(series-32, mtv-28)
case(mtv-28, 's-29)
amod(series-32, popular-30)
compound(series-32, reality-31)
nmod:on(character-26, series-32)
punct(character-26, ''-33)
compound(beach-36, laguna-34)
compound(beach-36, _-35)
dep(character-26, beach-36)
punct(character-26, :-37)
det(county-41, the-38)
amod(county-41, real-39)
compound(county-41, orange-40)
dep(character-26, county-41)
punct(attended-10, .-42)
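
One possible workaround, offered only as a sketch, is to post-process the result and glue the `x`, `_`, `y` runs back into `x_y`. The helper names `merge_underscore_tokens` and `remap_dependencies` below are hypothetical and not part of any CoreNLP wrapper. The code assumes the dependencies are available as `(relation, head, dependent)` triples with 1-based token indices and 0 for ROOT, which matches the indices shown in the output above. Note that it also merges any underscore that really was a separate token in the input.

```python
def merge_underscore_tokens(tokens):
    """Re-join 'a', '_', 'b' runs into a single 'a_b' token.

    Returns the merged token list and a dict mapping each original
    1-based token index to its 1-based index in the merged list.
    """
    merged = []
    index_map = {}
    glue_next = False  # True right after a bare "_" was glued to the previous token
    for old_index, token in enumerate(tokens, start=1):
        if token == "_" and merged:
            merged[-1] += "_"
            glue_next = True
        elif glue_next:
            merged[-1] += token
            glue_next = False
        else:
            merged.append(token)
        index_map[old_index] = len(merged)
    return merged, index_map


def remap_dependencies(deps, index_map):
    """Rewrite (relation, head, dependent) triples onto the merged indices.

    Index 0 (ROOT) is left unchanged. Arcs that would point from a merged
    token to itself, such as compound(conrad, lauren), are dropped.
    """
    remapped = []
    for relation, head, dependent in deps:
        new_head = index_map.get(head, 0)
        new_dependent = index_map.get(dependent, 0)
        if new_head != new_dependent:
            remapped.append((relation, new_head, new_dependent))
    return remapped


# Illustration on the first few tokens of the sentence above.
tokens = ["last", "november", ",", "lauren", "_", "conrad", ",", "19"]
deps = [
    ("amod", 2, 1),
    ("compound", 6, 4),
    ("compound", 6, 5),
    ("punct", 6, 7),
    ("amod", 6, 8),
]
merged_tokens, index_map = merge_underscore_tokens(tokens)
merged_deps = remap_dependencies(deps, index_map)
print(merged_tokens)
print(merged_deps)
```

CoreNLP also documents a `tokenize.whitespace` property, which may prevent the split at the source for text that is already space-separated like this sentence. Whether the wrapper used here lets you pass that property to `dependency_parse` is not confirmed.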
