spaCy TextCategorizer: 'unicode' object has no attribute 'to_array'

import spacy
from spacy.pipeline import TextCategorizer
nlp = spacy.load('en')

doc1 = u'This is my first document in the dataset.'
doc2 = u'This is my second document in the dataset.'

gold1 = u'Category1'
gold2 = u'Category2'

textcat = TextCategorizer(nlp.vocab)
textcat.add_label('Category1')
textcat.add_label('Category2')
losses = {}
optimizer = textcat.begin_training()
textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer)

1 Answer

Answer #1 · Posted 2024-06-13 23:08:07

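Apparently, textcat expects gold values created with GoldParse rather than plain-text values. Note that the working version below also runs each text through nlp() first, so that update() receives Doc objects instead of unicode strings: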

import spacy
from spacy.pipeline import TextCategorizer
from spacy.gold import GoldParse
nlp = spacy.load('en')

# Run the texts through the pipeline so they become Doc objects, not plain strings
doc1 = nlp(u'This is my first document in the dataset.')
doc2 = nlp(u'This is my second document in the dataset.')

# Wrap the gold labels in GoldParse; cats maps each label to 1 (applies) or 0 (does not)
gold1 = GoldParse(doc=doc1, cats={'Category1': 1, 'Category2': 0})
gold2 = GoldParse(doc=doc2, cats={'Category1': 0, 'Category2': 1})

textcat = TextCategorizer(nlp.vocab)
textcat.add_label('Category1')
textcat.add_label('Category2')
losses = {}
optimizer = textcat.begin_training()
# Pass the Doc objects together with their matching GoldParse objects
textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer)
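
If your data has a single label string per document, as in the question, the cats dictionaries do not have to be written by hand. The sketch below is plain Python and does not call spaCy; make_cats is a hypothetical helper name, not part of the spaCy API, and it assumes exactly one label applies to each document.

```python
def make_cats(label, all_labels):
    """Map every known label to 1 if it is the gold label, otherwise 0.

    Hypothetical helper for illustration only; not part of spaCy.
    Assumes each document has exactly one gold label.
    """
    if label not in all_labels:
        raise ValueError("unknown label: %r" % (label,))
    return {name: int(name == label) for name in all_labels}


labels = ['Category1', 'Category2']
cats1 = make_cats('Category1', labels)  # {'Category1': 1, 'Category2': 0}
cats2 = make_cats('Category2', labels)  # {'Category1': 0, 'Category2': 1}
```

The resulting dictionaries can then be passed as the cats argument, for example GoldParse(doc=doc1, cats=cats1).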

Thanks to @abarner for helping me debug this in the comments.
