How can I extract nouns without their articles and possessives?

Posted 2024-10-05 15:25:16

Background

I would like to know how to separate a noun from its modifiers, such as articles and possessives.

Example

# sentence
The man with the star regarded her with his expressionless eyes.

# what to extract
man
star
eyes

Problem

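As the figure below shows, the displaCy tool merges "The man", "the star", and "his expressionless eyes" into single noun units, so each noun stays attached to its article or possessive.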

Visualization of part-of-speech tags and dependencies

https://explosion.ai/demos/displacy

[Figure: displaCy result for the sample sentence]

What I tried

I ran the sample code given on the spaCy page:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The man with the star regarded her with his calm, expressionless eyes.")
for token in doc:
    print(token.text, token.dep_, token.head.text, token.head.pos_,
            [child for child in token.children])

Using the result below, or some other method, how can I extract just the nouns, excluding articles and possessives?

$ python sample.py
The det man NOUN []
man nsubj regarded VERB [The, with]
with prep man NOUN [star]
the det star NOUN []
star pobj with ADP [the]
regarded ROOT regarded VERB [man, her, with, .]
her dobj regarded VERB []
with prep regarded VERB [eyes]
his poss eyes NOUN []
calm amod eyes NOUN [,]
, punct calm ADJ []
expressionless amod eyes NOUN []
eyes pobj with ADP [his, calm, expressionless]
. punct regarded VERB []

1 answer
User
#1 · Posted 2024-10-05 15:25:16

Try the following to get the output you asked for in your initial example:

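import spacy

# The 'en' shortcut was removed in spaCy v3; load the model by its full name.
nlp = spacy.load("en_core_web_sm")

text = "The man with the star regarded her with his expressionless eyes."

for word in nlp(text):
    # Keep tokens tagged NOUN; articles and possessives carry other tags.
    if word.pos_ == "NOUN":
        print(word.text)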

Output:

man
star
eyes
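
The same three nouns can also be read off the dependency listing printed in the question, because each row already carries the part of speech of the token's head. The sketch below uses plain Python with no spaCy call; the rows are copied from that listing, and the helper name `nouns_from_rows` is made up for illustration. It matches heads by text rather than by token identity, and a noun with no dependents never shows up as a head, so treat it as a way to understand the listing rather than a general method.

```python
# Rows copied from the question's listing: (text, dep, head text, head POS).
rows = [
    ("The", "det", "man", "NOUN"),
    ("man", "nsubj", "regarded", "VERB"),
    ("with", "prep", "man", "NOUN"),
    ("the", "det", "star", "NOUN"),
    ("star", "pobj", "with", "ADP"),
    ("regarded", "ROOT", "regarded", "VERB"),
    ("her", "dobj", "regarded", "VERB"),
    ("with", "prep", "regarded", "VERB"),
    ("his", "poss", "eyes", "NOUN"),
    ("calm", "amod", "eyes", "NOUN"),
    (",", "punct", "calm", "ADJ"),
    ("expressionless", "amod", "eyes", "NOUN"),
    ("eyes", "pobj", "with", "ADP"),
    (".", "punct", "regarded", "VERB"),
]


def nouns_from_rows(rows):
    """Hypothetical helper: collect head words tagged NOUN, in first-seen order."""
    nouns = []
    for _text, _dep, head_text, head_pos in rows:
        if head_pos == "NOUN" and head_text not in nouns:
            nouns.append(head_text)
    return nouns


# Tokens attached as determiners or possessives are the modifiers to drop.
dropped = [text for text, dep, _head, _pos in rows if dep in ("det", "poss")]

print(nouns_from_rows(rows))
print(dropped)
```

The `det` and `poss` rows are exactly the articles and possessives from the question; they hang off the noun as children and are not part of the noun token itself.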

You could also consider the nltk package, which may be faster for this use case:

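import nltk

# Needs NLTK's tokenizer and POS-tagger data; fetch them once with nltk.download().
text = "The man with the star regarded her with his expressionless eyes."

for word, pos in nltk.pos_tag(nltk.word_tokenize(text)):
    # Penn Treebank noun tags all start with "N" (NN, NNS, NNP, NNPS).
    if pos.startswith("N"):
        print(word)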

Output:

man
star
eyes
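
The first-letter test works because of how Penn Treebank tags are named: noun tags begin with "N", while articles are tagged DT and possessive pronouns PRP$. A minimal sketch of the filter on its own follows; the tagged pairs are written by hand for illustration, not captured from an nltk run, and `nouns_from_tagged` is a hypothetical helper name.

```python
# Hand-written (word, tag) pairs in Penn Treebank style; an assumption,
# not actual tagger output.
tagged = [
    ("The", "DT"), ("man", "NN"), ("with", "IN"), ("the", "DT"),
    ("star", "NN"), ("regarded", "VBD"), ("her", "PRP"), ("with", "IN"),
    ("his", "PRP$"), ("expressionless", "JJ"), ("eyes", "NNS"), (".", "."),
]


def nouns_from_tagged(pairs):
    """Hypothetical helper: keep words whose tag starts with 'N'."""
    return [word for word, tag in pairs if tag.startswith("N")]


print(nouns_from_tagged(tagged))
```

Note that this filter also keeps proper nouns (NNP, NNPS); compare against "NN" and "NNS" explicitly if only common nouns are wanted.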
