Can I pass a dictionary into a Python function and use its key/value pairs in different places?

Posted 2024-10-04 05:33:41


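I'm currently working on regression models for machine learning. I'm testing 3 separate y variables. I wrote a function that:

  1. Fits the model to the training data
  2. Gets the R^2 for the training set and predicts on the test data, getting R^2 & MSE
  3. "Grid searches" a set of alpha values and "normalize" options*
  4. Gets the R^2 for the training set and predicts on the test data, getting R^2 & MSE for each grid-search combination
  5. Puts all the results into a table

*This is not a true grid search using the grid search function; it just repeatedly fits and tests the model with the defined values.

See below: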

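    import numpy as np
    import pandas as pd
    from sklearn.linear_model import Lasso, Ridge
    from sklearn.metrics import mean_squared_error

    ## GRID FOR y
    x=Lasso()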
    
    def regdf(x,i,j):
        yval = [y1_train, y2_train, y3_train]
        for y in yval:
            model = x(alpha=i)
            model.fit(Xtrain, y)
            rsqtrain= format(model.score(Xtrain, y), '.3f')
        
            # Predict the test set
            model_pred= model.predict(Xtest)
    
            # Calculating root mean squared error and R-sq for the predictions
            msetest = np.sqrt(mean_squared_error(y1_test, model_pred))  # here I need y_test to change
            rsqtest = format(model.score(Xtest, y1_test), '.3f')  # here I need y_test to change
            return rsqtrain, msetest, rsqtest
    
    
    y_ = "y1"
    y_test = "y1T"
    alphas = [1,0.1,0.01,0.001,0.0001, 0]
    rsqtrainlist = []
    msetestlist = []
    rsqtestlist = []
    modellist = []
    alphalist = []
    normlist = []
    norm = [True, False]
    yval = [y1, y2, y3]
    ylist= []
    for x in [Ridge]:
        for y in yval:
            for i in alphas:        
                for j in norm: 
                    a,b,c = regdf(x, i, j)
                    rsqtrainlist.append(a)
                    msetestlist.append(b)
                    rsqtestlist.append(c)
                    modellist.append(x)
                    alphalist.append(i)
                    normlist.append(j)
                    
        
    testdf = pd.DataFrame()
    testdf["yval"] = ylist
    testdf['rsqtrain'] = rsqtrainlist
    testdf["y_test"] = y_test
    testdf['msetest'] = msetestlist
    testdf['rsqtest'] = rsqtestlist
    testdf['model'] = modellist
    testdf['alpha'] = alphalist
    testdf["normalize"]=normlist


    gridLassoytest = testdf.copy()
    gridLassoytest
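
For comparison, the nested loops and parallel lists above could be flattened with `itertools.product`, collecting one dictionary per combination so that every column (including the target name) stays aligned. This is only a sketch: `run_one` and `target_names` are hypothetical placeholders for the fit-and-score step and the target labels, and the returned values are dummies.

    ```python
    from itertools import product

    alphas = [1, 0.1, 0.01, 0.001, 0.0001, 0]
    norm = [True, False]
    target_names = ["y1", "y2", "y3"]  # hypothetical labels for the three targets


    def run_one(target, alpha, normalize):
        """Hypothetical placeholder for the fit-and-score step; returns dummy values."""
        return {"rsqtrain": None, "msetest": None, "rsqtest": None}


    rows = []
    # product() yields every (target, alpha, normalize) combination exactly once
    for target, alpha, normalize in product(target_names, alphas, norm):
        row = {"yval": target, "alpha": alpha, "normalize": normalize}
        row.update(run_one(target, alpha, normalize))
        rows.append(row)
    ```

Passing `rows` to `pd.DataFrame(rows)` would then give one table row per combination, without needing a separate list per column.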

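Because I have 3 different y variables, I tried adding a for loop to the function so that it runs for each y variable. I added the loop to iterate over the y_train values. My problem is that as it iterates over the y variables, I need it to use y1_train and y1_test together. Right now it iterates through the three y_train values but never changes the y_test value. I figured I could use a dictionary to pair the y_train/y_test variables, like this:

    yvals = {y1_train: y1_test, y2_train: y2_test, y3_train: y3_test}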

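The problem is that I don't know how to make the function use the key/value pairs, or how to make it change the y_test variable to match the y_train one. Does anyone have any ideas? I'm using a generic dataset, so I've attached links to the files on my Google Drive. I've included the test and train sets, as well as how I define the variables.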
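
One possible way to do the pairing, sketched under assumptions: pandas Series are not hashable, so they cannot be used as dictionary keys directly. Keying the dictionary by a label and storing a `(train, test)` tuple as the value avoids that, and the function can then take both halves of the pair as arguments. In the sketch below, `MeanModel`, `fit_and_score`, `targets`, and the toy lists are hypothetical stand-ins so that it runs without the dataset; in the real code the model would presumably be something like `Ridge(alpha=i)` and the tuples would hold `y1_train`, `y1_test`, and so on.

    ```python
    import math


    class MeanModel:
        """Hypothetical stand-in with a scikit-learn-like fit/predict interface."""

        def __init__(self, alpha=1.0):
            self.alpha = alpha  # unused; kept only to mirror Ridge(alpha=...)

        def fit(self, X, y):
            self.mean_ = sum(y) / len(y)
            return self

        def predict(self, X):
            return [self.mean_] * len(X)


    def fit_and_score(make_model, X_train, y_train, X_test, y_test):
        """Fit on one training target and return RMSE against its matching test target."""
        model = make_model().fit(X_train, y_train)
        preds = model.predict(X_test)
        mse = sum((t - p) ** 2 for t, p in zip(y_test, preds)) / len(y_test)
        return math.sqrt(mse)


    # Toy data (assumption): plain lists standing in for Xtrain/Xtest and the y Series
    X_train, X_test = [[0], [1], [2]], [[3], [4]]
    targets = {
        "y1": ([1.0, 2.0, 3.0], [2.0, 2.0]),        # label -> (y_train, y_test)
        "y2": ([10.0, 20.0, 30.0], [20.0, 26.0]),
    }

    results = {}
    for name, (y_train, y_test) in targets.items():
        # Each iteration receives the matching train/test pair together
        results[name] = fit_and_score(
            lambda: MeanModel(alpha=0.1), X_train, y_train, X_test, y_test
        )
    ```

Because the pair is unpacked in the loop header, the test target changes in step with the training target, and the label can be stored alongside the scores.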

Thanks, everyone.

Test = https://drive.google.com/file/d/1FNeJ5YgT-_VP7DzR6sXJAREzTnhkxAB7/view?usp=sharing

Train = https://drive.google.com/file/d/1hwrnwbqjdmoRyo1hO1S5Gp_f2vudfsFU/view?usp=sharing

    # import csv's
    RegTrain80.drop(["Unnamed: 0"], axis=1, inplace=True)
    RegTest20.drop(["Unnamed: 0"], axis=1, inplace=True)

    # train set
    Xtrain = RegTrain80.iloc[:, 4:66]
    # y1 variable
    y1_train = RegTrain80.iloc[:, 1]
    # y2 variable
    y2_train = RegTrain80.iloc[:, 2]
    # y3 variable
    y3_train = RegTrain80.iloc[:, 3]

    # test set
    Xtest = RegTest20.iloc[:, 4:66]
    # y1 variable
    y1_test = RegTest20.iloc[:, 1]
    # y2 variable
    y2_test = RegTest20.iloc[:, 2]
    # y3 variable
    y3_test = RegTest20.iloc[:, 3]
