ValueError: operands could not be broadcast together with shapes (1,55) (42,)

Posted 2024-10-01 17:35:33


To Download Dataset click link

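I am trying to identify the disease type from symptoms using a machine learning model. Everything went well until I tried to predict the disease type for a given set of symptoms, at which point I got the error "ValueError: operands could not be broadcast together with shapes (1,55) (42,)". I have read many similar posts, but none of them solved the problem.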

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

import re
import string
import nltk
from nltk.corpus import stopwords
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from nltk.stem.snowball import SnowballStemmer
from nlppreprocess import NLP

import math
import string
punct = string.punctuation
import spacy
import en_core_web_sm
nlp = en_core_web_sm.load()
#nlp = spacy.load("en_core_web_sm")
from spacy.lang.en.stop_words import STOP_WORDS

from sklearn.metrics import confusion_matrix,accuracy_score, classification_report, roc_curve, auc

from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()

input data

w = pd.read_csv("symptom_disease.csv")

w = w.fillna(int(0))

X = w.drop(["Disease"],axis=1)

m = w["Disease"]

data = [1,2,3,4,5,6,7,8,9,10]
y = pd.DataFrame(data,columns=["disease"])

gnb=gnb.fit(X,np.ravel(y))

X.head()


output:

Passing much less urine Bleeding from any body part Feeling extremely lethargic/weak    Excessive sleepiness/restlessness   Altered mental status   Seizure/fits    Breathlessness  Blood in sputum Chest pain  Sound/noise in breathing    ... diarrhoea   sweats and chills   difficulty breathing    sweating and shivering  rapid heartbeat sweating    shivering   loss of appetite    coughing up blood   vomiting
0   1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1   0.0 0.0 0.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2   0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4   0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0







import spacy
nlp = spacy.load("en_core_web_sm")
t = ['Passing much less urine','Bleeding from any body part','Feeling extremely lethargic/weak','Excessive sleepiness/restlessness','Altered mental status','Seizure/fits','Breathlessness','Blood in sputum','Chest pain','Sound/noise in breathing','Drooling of saliva','Difficulty in opening mouth','Eye irritation','Runny nose','Stuffy nose','watery eyes','Sneezing','itchy nose','itchy throat','fever','headache','intense pain','fatigue','dry cough','bloody stools','loose stools','nausea','shortness of breath','tight chest','cough','short of breath','muscle pains','diarrhoea','sweats and chills','difficulty breathing','sweating and shivering','rapid heartbeat','sweating','shivering','loss of appetite','coughing up blood','vomiting','Weakness','Stomach pain','constipation','Cough','Chills','Abdominal pain','Yellow skin color','skin color yellow','Dark-colored urine','clay-colored stool','yellow color urine','weight loss','itchy skin']
#t = ['Passing much less urine', 'Bleeding from any body part', 'Feeling extremely lethargic/weak', 'Excessive sleepiness/restlessness', 'Altered mental status', 'Seizure/fits', 'Breathlessness', 'Blood in sputum', 'Chest pain', 'Sound/noise in breathing', 'Drooling of saliva', 'Difficulty in opening mouth']
docs = nlp.pipe(t)

l1= []
for doc in docs:
    clean_doc = " ".join([tok.lemma_.lower() for tok in doc if not tok.is_stop and not tok.is_punct])
    l1.append(clean_doc)







l2=[]
for i in range(0,len(l1)):
    l2.append(0)
print(l2)


import spacy
nlp = spacy.load("en_core_web_sm")

psymptoms = ["Blood in sputum","Chest pain","Sound/noise in breathing","Breathlessness"]
docs = nlp.pipe(psymptoms)

sym= []
for doc in docs:
    clean_doc = " ".join([tok.lemma_.lower() for tok in doc if not tok.is_stop and not tok.is_punct])
    sym.append(clean_doc)


for k in range(0,len(l1)):
    for z in sym:
        #print(z)
        if(z==l1[k]):
            l2[k]=1

inputtest = [l2]
predict = gnb.predict(inputtest)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-39-d99236746b75> in <module>
      1 #print(inputtest)
----> 2 predict = gnb.predict(inputtest)

~\anaconda3\lib\site-packages\sklearn\naive_bayes.py in predict(self, X)
     76         check_is_fitted(self)
     77         X = self._check_X(X)
---> 78         jll = self._joint_log_likelihood(X)
     79         return self.classes_[np.argmax(jll, axis=1)]
     80 

~\anaconda3\lib\site-packages\sklearn\naive_bayes.py in _joint_log_likelihood(self, X)
    454             jointi = np.log(self.class_prior_[i])
    455             n_ij = - 0.5 * np.sum(np.log(2. * np.pi * self.sigma_[i, :]))
--> 456             n_ij -= 0.5 * np.sum(((X - self.theta_[i, :]) ** 2) /
    457                                  (self.sigma_[i, :]), 1)
    458             joint_log_likelihood.append(jointi + n_ij)

ValueError: operands could not be broadcast together with shapes (1,55) (42,)
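The failing expression in the traceback, X - self.theta_[i, :], subtracts a vector of 42 per-class means from an input of shape (1, 55). A minimal sketch that reproduces the same failure with plain NumPy (the array names x and theta are illustrative stand-ins, not taken from the code above):

```python
import numpy as np

# One sample with 55 features, like inputtest in the question.
x = np.zeros((1, 55))
# Per-class means learned from 42 training columns, like self.theta_[i, :].
theta = np.zeros(42)

try:
    x - theta  # trailing dimensions 55 and 42 differ, so broadcasting fails
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Broadcasting compares trailing dimensions, so the subtraction only works when the number of columns in the input equals the number of features the model was fitted on.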

1 answer
User
#1 · Posted 2024-10-01 17:35:33

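OK, I finally solved it. It was a dimension problem. The model was trained on input data of shape X = (10 rows × 42 columns) and y = (10 rows × 1 column), but the test data I passed to predict had shape (1 row × 55 columns). That mismatch caused the error. I changed the training input so that X = (10 rows × 55 columns), and now it runs and predicts correctly.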
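One way to guard against this kind of mismatch is to build the prediction row from the same column list used for training and to compare its width with the fitted model's n_features_in_ before calling predict. The sketch below is only an illustration under assumptions: the five column names, the toy training data, and the encode helper are hypothetical and stand in for the real 55 symptom columns.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical stand-in for the symptom columns (the real data has 55).
columns = ["fever", "headache", "cough", "nausea", "chest pain"]

# Toy training data: 4 samples x 5 binary symptom columns, 2 disease labels.
X_train = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
], dtype=float)
y_train = np.array([1, 1, 2, 2])

model = GaussianNB().fit(X_train, y_train)


def encode(symptoms, columns):
    """Return a 0/1 list aligned with `columns` (hypothetical helper)."""
    present = {s.lower() for s in symptoms}
    return [1.0 if c.lower() in present else 0.0 for c in columns]


# Build a single-sample 2D array with exactly one entry per training column.
row = np.array([encode(["Cough", "Chest pain"], columns)])

# Fail early with a readable message instead of a broadcasting error.
if row.shape[1] != model.n_features_in_:
    raise ValueError(
        f"expected {model.n_features_in_} features, got {row.shape[1]}"
    )

prediction = model.predict(row)
print(prediction)
```

Because the row is derived from the training column list, adding or removing symptom columns changes both the training matrix and the prediction input together.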
