Implementing K-fold RandomSearchCV without scikit-learn: error when removing a list of NumPy arrays from a list

Posted 2024-10-02 00:24:07


For the code below, I get this error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). 

I am posting the whole function so it is easier to follow:

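# Imports the snippet relies on (not shown in the original post)
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tqdm import tqdm

x,y = make_classification(n_samples=10000, n_features=2, n_informative=2, n_redundant=0, n_clusters_per_class=1, random_state=60)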

X_train, X_test, y_train, y_test=train_test_split(x,y,stratify=y,random_state=42)

def RandomSearchCV(x_train,y_train,classifier, param_range, folds):

    param = np.random.randint(param_range[0],param_range[1],10)
    
    trainscores=[]
    cvscores=[]
    
    train_data_copy=list(x_train)    #creating a copy of the train dataset
    x_train_split=[]                 #final dataset with K folds of data
    y_train_split=[]                 #final label of k fold train data
    y_train_copy=list(y_train)
    fold_size=int(len(x_train)/folds)
    
    for i in range(0,folds):          #loop to create a list of dataset with k folds
        fold = []
        y_fold = []
        while (len(fold) < fold_size):
            index = np.random.randint(len(train_data_copy))
            fold.append(train_data_copy.pop(index))
            y_fold.append(y_train_copy.pop(index))
        x_train_split.append(fold)
        y_train_split.append(y_fold)
    
        
    for k in tqdm(param):
        
        trainscore_fold=[]
        cvscore_fold=[]
        
        for (x_split, y_split) in zip(x_train_split,y_train_split):
            x_train_data = list(x_train_split)
            print(type(x_split[0]))
            
            
            x_split2=[]
            
            for z in x_split:
                x_split1=[]
                for w in z:
                    x_split1.append(w)
                x_split2.append(x_split1)
            print(type(x_split2[0]))    
            x_train_data.remove(x_split2)        #removing the CV data from train data to perform K-fold CV
            
            x_train_data_final=[]
            for ele in x_train_data:
                x_train_data_final += ele
            
            print((np.array(x_train_data_final)).shape)
                
            y_train_data = list(y_train_split)
            y_train_data.remove(y_split)
            
            y_train_data_final=[]
            for elem in y_train_data:
                y_train_data_final += elem
            print((np.array(y_train_data_final)).shape)
            
            x_cv_data = np.array(list(x_split))
            y_cv_data = np.array(list(y_split))
            
            print(x_cv_data.shape, y_cv_data.shape)
            
            classifier.n_neighbors = k

            classifier.fit(x_train_data_final,y_train_data_final)
            
            x_predicted = classifier.predict(x_train_data_final)
            trainscore_fold.append(accuracy_score(y_train_data_final,x_predicted))
            
            y_predicted = classifier.predict(x_cv_data)
            cvscore_fold.append(accuracy_score(y_cv_data,y_predicted))
            
        trainscores.append(np.mean(np.array(trainscore_fold)))
        cvscores.append(np.mean(np.array(cvscore_fold)))
        
    return trainscores,cvscores,param

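Here x_split is a list of NumPy arrays. I use it as the CV (validation) data, so I want to remove it from the training data. At first I called remove(x_split) directly and got the same error (the commented-out code in the code above). That makes sense, because x_split is a list but its elements are arrays.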
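The behaviour described above can be reproduced on a tiny example. This is only a minimal sketch: `folds_demo`, `train_part` and `target` are made-up names standing in for `x_train_split`, `x_train_data` and `x_split`. `list.remove` compares each stored fold with the target using `==`; two folds are both lists, so Python compares them element by element, and comparing two NumPy arrays yields a boolean array whose truth value is ambiguous.

```python
import numpy as np

# Hypothetical miniature of x_train_split: 3 folds, each a list of 1-D arrays.
folds_demo = [
    [np.array([float(i), float(i) + 0.5]) for i in range(start, start + 2)]
    for start in (0, 10, 20)
]

train_part = list(folds_demo)   # shallow copy, like list(x_train_split)
target = folds_demo[1]          # the fold to drop (deliberately not the first)

try:
    # remove() first evaluates folds_demo[0] == target, which compares
    # array with array and then needs a single True/False from the result.
    train_part.remove(target)
    remove_error = None
except ValueError as exc:
    remove_error = str(exc)

print(remove_error)
```

Note that removing the very first fold would appear to work, because Python checks object identity before falling back to `==`. That is probably why this kind of bug can look intermittent.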

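So I converted x_split into a plain list of lists with the nested loop (the `for z in x_split` / `for w in z` loop), expecting the error to go away. It is still raised. I also tried removing the element via set(), but set() applied to a list of lists fails with an unhashable type error.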
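A likely reason the conversion does not help, shown as a small sketch with invented names (`fold_of_arrays`, `fold_of_lists`, `container`): only the value passed to `remove` was converted. The list being searched still holds folds made of NumPy arrays, so the comparison becomes array versus list, which NumPy again evaluates element-wise. The same sketch also shows the `set()` failure.

```python
import numpy as np

fold_of_arrays = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]  # like x_split
fold_of_lists = [list(row) for row in fold_of_arrays]          # like x_split2

container = [fold_of_arrays]   # like x_train_data: its folds still hold arrays

try:
    # Evaluates fold_of_arrays == fold_of_lists, i.e. ndarray == list per row,
    # which gives a boolean array rather than a single bool.
    container.remove(fold_of_lists)
    outcome = "removed"
except ValueError:
    outcome = "ValueError"

try:
    set(fold_of_lists)         # inner lists are mutable, hence unhashable
    set_outcome = "ok"
except TypeError:
    set_outcome = "TypeError"

print(outcome, set_outcome)
```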

Stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-81-340dede088ab> in <module>
      9 folds = 3
     10 
---> 11 train_score,cv_score,k_parameters = RandomSearchCV(X_train, y_train, neigh, params, folds)
     12 
     13 

<ipython-input-80-2dd5163c904b> in RandomSearchCV(x_train, y_train, classifier, param_range, folds)
     44                 x_split2.append(x_split1)
     45             print(type(x_split2[0]))
---> 46             x_train_data.remove(x_split2)        #removing the CV data from train data to perform K-fold CV
     47 
     48             x_train_data_final=[]

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
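One way to sidestep the failing comparison altogether is to pick folds by position instead of by value. The sketch below is an assumption about how that could look, not the original poster's code: `kfold_indices` is a hypothetical helper, and `x_demo` / `y_demo` are toy data. It shuffles the row indices once, splits them into folds, and builds the training set from every fold except the current one.

```python
import numpy as np

def kfold_indices(n_samples, folds, seed=0):
    """Yield (train_idx, cv_idx) index arrays for each fold (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    parts = np.array_split(shuffled, folds)   # also handles uneven fold sizes
    for i in range(folds):
        cv_idx = parts[i]
        # Training rows are all folds except fold i, selected by position.
        train_idx = np.concatenate([p for j, p in enumerate(parts) if j != i])
        yield train_idx, cv_idx

# Toy data standing in for x_train / y_train.
x_demo = np.arange(24, dtype=float).reshape(12, 2)
y_demo = np.arange(12) % 2

splits = list(kfold_indices(len(x_demo), folds=3))
for train_idx, cv_idx in splits:
    x_tr, y_tr = x_demo[train_idx], y_demo[train_idx]
    x_cv, y_cv = x_demo[cv_idx], y_demo[cv_idx]
    print(x_tr.shape, y_tr.shape, x_cv.shape, y_cv.shape)
```

A smaller change in the same spirit would be to loop with `enumerate` over the folds and drop the current one with `del x_train_data[i]`, which never compares arrays. Unlike the `fold_size = int(len(x_train)/folds)` approach in the post, `np.array_split` keeps leftover rows when the sample count is not divisible by the number of folds.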

