Warning: [W030] Some entities could not be aligned in the text

Published 2024-06-18 13:15:49


TRAIN_DATA = [
    ("XYZxyzg hat die beste Camera für Selfies", {"entities": [(0, 7, "BRAND"), (23, 28, "CAMERA")]}),
]

When I train on this line, I keep getting the following error:

UserWarning: [W030] Some entities could not be aligned in the text "XYZxyzg hat die beste Camera für Selfie" with entities "[(0, 7, 'BRAND'), (23, 28, 'CAMERA')]". Use `spacy.gold.biluo_tags_from_offsets(nlp.make_doc(text), entities)` to check the alignment. Misaligned entities ('-') will be ignored during training.
  gold = GoldParse(doc, **gold)

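What is wrong with my indices? Should I exclude the whitespace? I tried that too, but it does not seem to work. How do I use spacy.gold.biluo_tags_from_offsets(nlp.make_doc(text), entities) to check the indices, as the warning suggests?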


1 answer

User

#1 · Posted 2024-06-18 13:15:49

From your post:

TRAIN_DATA = [
    ("XYZxyzg hat die beste Camera für Selfies", {"entities": [(0, 7, "BRAND"), (23, 28, "CAMERA")]}),
]

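Entity offsets must align with token boundaries: an entity cannot start or end in the middle of a token. Your example seems to contain a small off-by-one mistake. "Camera" starts at character 22, so offset 23 points at the "a" inside the token. I think the second entity should be (22, 28, "CAMERA").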
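If you want to sanity-check offsets before training, here is a minimal sketch in plain Python. It is not spaCy's own check. It assumes tokens are simply whitespace-separated, which holds for this sentence but only approximates the real tokenizer (spaCy also splits off punctuation). The helper names token_spans, misaligned and offsets_of are made up for illustration.

```python
import re


def token_spans(text):
    """Return (start, end) character spans of whitespace-separated tokens.

    Assumption: this only approximates spaCy's tokenizer.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def misaligned(text, entities):
    """Return the entities whose start or end is not on a token boundary."""
    spans = token_spans(text)
    starts = {start for start, _ in spans}
    ends = {end for _, end in spans}
    return [ent for ent in entities if ent[0] not in starts or ent[1] not in ends]


def offsets_of(text, phrase, label):
    """Build an entity tuple from the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return (start, start + len(phrase), label)


text = "XYZxyzg hat die beste Camera für Selfies"
original = [(0, 7, "BRAND"), (23, 28, "CAMERA")]

# Derive the offsets from the text instead of counting characters by hand.
fixed = [
    offsets_of(text, "XYZxyzg", "BRAND"),
    offsets_of(text, "Camera", "CAMERA"),
]

print(misaligned(text, original))  # [(23, 28, 'CAMERA')]
print(fixed)                       # [(0, 7, 'BRAND'), (22, 28, 'CAMERA')]
print(misaligned(text, fixed))     # []
```

Deriving offsets with str.find avoids hand-counting mistakes, although it only finds the first occurrence of a phrase. For the authoritative check, run the call named in the warning on your own spaCy version. According to the warning text, misaligned tokens show up as '-' in its output.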
