input_ids = torch.tensor(input_ids): ValueError: expected sequence of length 133 at dim 1 (got 80)

Posted 2024-09-28 05:15:19


I am trying to convert a column of lists into a tensor, but I keep running into this error and have no idea how to fix it. The error:


Traceback (most recent call last):
  File "test.py", line 63, in <module>
    main()
  File "test.py", line 46, in main
    dic_acc, dic_report, dic_cm, s = cross_validation(data_train, data_label_train, models_list, name, language_model_dir)
  File "../traitements/processin_test.py", line 198, in cross_validation
    features, s = get_flaubert_layer(features, lge_model)
  File "../traitements/processin_test.py", line 108, in get_flaubert_layer
    input_ids = torch.as_tensor(input_ids)
ValueError: expected sequence of length 133 at dim 1 (got 80)

The following lines are my code:

    input_ids = []
    attention_masks = []
    print(flaubert)
    for sent in texte:
        encoded_sent = flaubert_tokenizer.encode_plus(sent, add_special_tokens=True, truncation=True, padding=True, return_attention_mask=True) #, return_tensors='pt'

        # Add the outputs to the lists
        input_ids.append(encoded_sent.get('input_ids'))
        attention_masks.append(encoded_sent.get('attention_mask'))

    print("len", len(input_ids))
    print(input_ids)

    # Convert lists to tensors
    input_ids = torch.as_tensor(input_ids)
    attention_masks = torch.as_tensor(attention_masks)

    outputs = flaubert(input_ids=input_ids, attention_mask=attention_masks)

    # Extract the last hidden state of the token `[CLS]` for classification task
    last_hidden_state_cls = outputs[0][:, 0, :]
    print(last_hidden_state_cls)

The input (the printed input_ids) looks like this:

[[02, 15, 21085, 14, 65, 65, 27536, 339, 612, 19, 24, 103],[ 1081, 14, 72, 20, 1702, 50, 20, 15646, 2031, 16, 55, 15646, 2031, 104, 12228], [26, 18059, 50, 30, 10169, 19, 1485, 14, 50, 17, 1216, 104, 73, 3742, 24, 26, 29556, 18, 24, 26, 40798, 14, 63, 3531, 48, 394, 42, 991, 13320, 19, 17, 1074],[ 16, 635, 34, 2268, 19, 21, 1120, 15, 86, 289, 14, 63, 101, 2339, 19218, 16, 1]]
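The inner lists have different lengths, and as far as I can tell torch.as_tensor only accepts rectangular nested lists, so even a tiny standalone example (just torch, no tokenizer involved) fails with the same kind of error:

# Standalone check: torch.as_tensor() rejects nested lists of unequal length
import torch

ragged = [
    [1, 2, 3, 4, 5],   # length 5
    [1, 2, 3],         # length 3
]

try:
    torch.as_tensor(ragged)
except ValueError as err:
    print(err)  # expected sequence of length 5 at dim 1 (got 3)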

Any ideas?
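
One thing I am wondering about, based on the Hugging Face documentation: would calling the tokenizer on the whole list at once, with padding=True and return_tensors='pt', give me rectangular tensors directly? Something like the sketch below (same flaubert_tokenizer and texte as above; I have not verified that this is the right fix):

# Sketch only: batch-encode all sentences in one call so that padding=True
# can actually pad every sequence to the longest one in the batch.
encoded = flaubert_tokenizer(
    texte,                       # list of sentences
    add_special_tokens=True,
    truncation=True,
    padding=True,                # pad to the longest sequence in the batch
    return_attention_mask=True,
    return_tensors='pt',         # returns rectangular torch tensors directly
)
input_ids = encoded['input_ids']             # shape: (batch_size, max_len)
attention_masks = encoded['attention_mask']  # shape: (batch_size, max_len)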


