Hugging Face SciBERT fill-mask prediction not working

Posted 2024-10-04 09:19:03


I am trying to use Hugging Face's pretrained SciBERT model (https://huggingface.co/allenai/scibert_scivocab_uncased) to predict masked words in scientific/biomedical text. This raises an error, and I am not sure how to proceed from here.

Here is the code so far:

!pip install transformers

from transformers import pipeline, AutoTokenizer, AutoModel
  
tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")

model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
unmasker("the patient is a 55 year old [MASK] admitted with pneumonia")

The same approach works with plain BERT, but not with the specialized pretrained model:

!pip install transformers

from transformers import pipeline

unmasker = pipeline('fill-mask', model='bert-base-uncased')
unmasker("the patient is a 55 year old [MASK] admitted with pneumonia")

The error with SciBERT is:

/usr/local/lib/python3.7/dist-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, framework, revision, use_fast, use_auth_token, model_kwargs, **kwargs)
    494         kwargs["feature_extractor"] = feature_extractor
    495 
--> 496     return task_class(model=model, framework=framework, task=task, **kwargs)

/usr/local/lib/python3.7/dist-packages/transformers/pipelines/fill_mask.py in __init__(self, model, tokenizer, modelcard, framework, args_parser, device, top_k, task)
     73         )
     74 
---> 75         self.check_model_type(TF_MODEL_WITH_LM_HEAD_MAPPING if self.framework == "tf" else MODEL_FOR_MASKED_LM_MAPPING)
     76         self.top_k = top_k
     77 

/usr/local/lib/python3.7/dist-packages/transformers/pipelines/base.py in check_model_type(self, supported_models)
    652                 self.task,
    653                 self.model.base_model_prefix,
--> 654                 f"The model '{self.model.__class__.__name__}' is not supported for {self.task}. Supported models are {supported_models}",
    655             )
    656 

PipelineException: The model 'BertModel' is not supported for fill-mask. Supported models are ['BigBirdForMaskedLM', 'Wav2Vec2ForMaskedLM', 'ConvBertForMaskedLM', 'LayoutLMForMaskedLM', 'DistilBertForMaskedLM', 'AlbertForMaskedLM', 'BartForConditionalGeneration', 'MBartForConditionalGeneration', 'CamembertForMaskedLM', 'XLMRobertaForMaskedLM', 'LongformerForMaskedLM', 'RobertaForMaskedLM', 'SqueezeBertForMaskedLM', 'BertForMaskedLM', 'MegatronBertForMaskedLM', 'MobileBertForMaskedLM', 'FlaubertWithLMHeadModel', 'XLMWithLMHeadModel', 'ElectraForMaskedLM', 'ReformerForMaskedLM', 'FunnelForMaskedLM', 'MPNetForMaskedLM', 'TapasForMaskedLM', 'DebertaForMaskedLM', 'DebertaV2ForMaskedLM', 'IBertForMaskedLM']

1 Answer

As the error message tells you, you need to use AutoModelForMaskedLM. AutoModel loads the bare BertModel, which has no language-modeling head and therefore cannot be used with the fill-mask pipeline:

from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModelForMaskedLM.from_pretrained("allenai/scibert_scivocab_uncased")
unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
unmasker("the patient is a 55 year old [MASK] admitted with pneumonia")

Output:

[{'sequence': 'the patient is a 55 year old woman admitted with pneumonia',
  'score': 0.4025486707687378,
  'token': 10221,
  'token_str': 'woman'},
 {'sequence': 'the patient is a 55 year old man admitted with pneumonia',
  'score': 0.23970800638198853,
  'token': 508,
  'token_str': 'man'},
 {'sequence': 'the patient is a 55 year old female admitted with pneumonia',
  'score': 0.15444642305374146,
  'token': 3672,
  'token_str': 'female'},
 {'sequence': 'the patient is a 55 year old male admitted with pneumonia',
  'score': 0.1111455038189888,
  'token': 3398,
  'token_str': 'male'},
 {'sequence': 'the patient is a 55 year old boy admitted with pneumonia',
  'score': 0.015877680853009224,
  'token': 12481,
  'token_str': 'boy'}]
