I trained a spaCy model on my own data, starting from the pre-existing en_core_web_sm-2.2.0 model. Some entities in my data are only partially captured by the trained model:
for text in ['KOYA MOTORS PRIVATE LTD.', 'KOYAL MOTORS PRIVATE LTD.', 'PUTTAR MOTORS LIMITED', 'BRENSON MOTORS LIMITED', 'MITASHI LIMITED', 'FEDERATION OF KARNATAKA CHAMBERS OF COMMERCE & INDUSTRY']:
    print("#####################")
    print(text, nlp_trained(text).ents)
    print("##")
    for i in nlp_trained(text):
        print(i, i.ent_iob_, i.ent_type_, i.pos_, i.tag_, i.head, i.lang_, i.lemma_)
Output:
#####################
KOYA MOTORS PRIVATE LTD. (MOTORS PRIVATE LTD.,)
##
KOYA O PROPN NNP LTD en KOYA
MOTORS B ORG PROPN NNP LTD en MOTORS
PRIVATE I ORG PROPN NNP LTD en PRIVATE
LTD I ORG PROPN NNP LTD en LTD
. I ORG PUNCT . LTD en .
#####################
KOYAL MOTORS PRIVATE LTD. (KOYAL MOTORS PRIVATE LTD.,)
##
KOYAL B ORG PROPN NNP LTD en KOYAL
MOTORS I ORG PROPN NNP LTD en MOTORS
PRIVATE I ORG PROPN NNP LTD en PRIVATE
LTD I ORG PROPN NNP LTD en LTD
. I ORG PUNCT . LTD en .
#####################
PUTTAR MOTORS LIMITED (MOTORS LIMITED,)
##
PUTTAR O NOUN NN LIMITED en puttar
MOTORS B ORG PROPN NNP LIMITED en MOTORS
LIMITED I ORG PROPN NNP LIMITED en LIMITED
#####################
BRENSON MOTORS LIMITED (BRENSON MOTORS LIMITED,)
##
BRENSON B ORG PROPN NNP LIMITED en BRENSON
MOTORS I ORG PROPN NNP LIMITED en MOTORS
LIMITED I ORG PROPN NNP LIMITED en LIMITED
#####################
MITASHI LIMITED ()
##
MITASHI O PROPN NNP MITASHI en MITASHI
LIMITED O PROPN NNP MITASHI en LIMITED
#####################
FEDERATION OF KARNATAKA CHAMBERS OF COMMERCE & INDUSTRY (KARNATAKA CHAMBERS OF COMMERCE & INDUSTRY,)
##
FEDERATION O NOUN NN FEDERATION en federation
OF O ADP IN FEDERATION en of
KARNATAKA B ORG PROPN NNP CHAMBERS en KARNATAKA
CHAMBERS I ORG NOUN NNS OF en chamber
OF I ORG ADP IN CHAMBERS en of
COMMERCE I ORG PROPN NNP OF en COMMERCE
& I ORG CCONJ CC COMMERCE en &
INDUSTRY I ORG PROPN NNP COMMERCE en INDUSTRY
What could be causing this, and how can I fix it?
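spaCy's en_core_web_sm-2.2.0 model was not trained on words such as KOYAL and KOYA. One way to get the model to predict words like KOYAL and KOYA is to update the en_core_web_sm-2.2.0 model. You can find more information here.
The code should look like this: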
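A minimal sketch of such an update, assuming spaCy v2.x (the `nlp.resume_training` / `nlp.update(texts, annotations, ...)` API) and that every company name in the question should be labelled as one full ORG span. The names `COMPANY_NAMES`, `make_train_data` and `update_ner`, as well as the iteration count and dropout value, are illustrative assumptions rather than anything from the question.

```python
import random

# Company names from the question; each whole string is treated as one ORG entity.
COMPANY_NAMES = [
    "KOYA MOTORS PRIVATE LTD.",
    "KOYAL MOTORS PRIVATE LTD.",
    "PUTTAR MOTORS LIMITED",
    "BRENSON MOTORS LIMITED",
    "MITASHI LIMITED",
    "FEDERATION OF KARNATAKA CHAMBERS OF COMMERCE & INDUSTRY",
]


def make_train_data(names, label="ORG"):
    """Build spaCy v2-style (text, annotations) pairs spanning the full string."""
    return [(name, {"entities": [(0, len(name), label)]}) for name in names]


def update_ner(model_name, train_data, n_iter=20, drop=0.35, seed=0):
    """Resume training of an existing pipeline's NER on extra examples (spaCy v2 API)."""
    # Imported lazily so the data helper above works without spaCy installed.
    import spacy
    from spacy.util import minibatch, compounding

    random.seed(seed)
    data = list(train_data)  # copy, so shuffling does not mutate the caller's list
    nlp = spacy.load(model_name)
    other_pipes = [p for p in nlp.pipe_names if p != "ner"]
    with nlp.disable_pipes(*other_pipes):  # only the NER weights are updated
        optimizer = nlp.resume_training()
        for _ in range(n_iter):
            random.shuffle(data)
            losses = {}
            for batch in minibatch(data, size=compounding(4.0, 32.0, 1.001)):
                texts, annotations = zip(*batch)
                nlp.update(texts, annotations, sgd=optimizer, drop=drop, losses=losses)
    return nlp


TRAIN_DATA = make_train_data(COMPANY_NAMES)
```

With spaCy v2 and the model installed, something like `nlp_trained = update_ner("en_core_web_sm", TRAIN_DATA)` would produce the updated pipeline. Six examples are far too few in practice; a real training set would need many more company names, ideally mixed with sentences the model already labels correctly so that it does not forget its existing predictions.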