
Published 2024-09-18 22:20:27


I have an existing setup that uses scikit-learn, but I am considering extending it to deep learning with Keras. I am also using Dask, which recommends using SciKeras.

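The SciKeras KerasClassifier, as currently set up, appears to train as expected (judging from the verbose output), but the model does not seem to learn anything. I followed the SciKeras docs, but I may have overlooked something.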

With a Scikit-Learn RF classifier the kappa score is about 0.44, with Keras alone it is about 0.55, and with SciKeras it is 0.0, which clearly indicates a problem. In the second listing below ("2. Following SciKeras docs to use Keras"), where is the implementation error that prevents a result similar to the one achieved in the third listing ("3. Exclusively using Keras")?
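To make the kappa figures easier to read, here is a minimal sketch of how Cohen's kappa is computed and of one situation in which it comes out as exactly 0.0: a predictor that always returns the same class. The toy labels and the helper names (`argmax`, `cohen_kappa`) are hypothetical and written in plain Python; `sklearn.metrics.cohen_kappa_score` computes the same quantity. This is an illustration of the metric, not a diagnosis of the SciKeras run.

```python
from collections import Counter


def argmax(row):
    """Index of the largest entry in a row (first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length sequences of class labels."""
    n = len(y_true)
    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement, from the marginal label frequencies of both sides.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (one identical label everywhere); this sketch
        # reports 0.0 instead of dividing by zero. That choice is an assumption.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy data: one-hot targets for 3 classes, 6 samples.
y_onehot = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
y_true = [argmax(row) for row in y_onehot]

# A predictor that always answers class 0.
constant_pred = [0] * len(y_true)

accuracy_constant = sum(t == p for t, p in zip(y_true, constant_pred)) / len(y_true)
kappa_constant = cohen_kappa(y_true, constant_pred)
kappa_perfect = cohen_kappa(y_true, y_true)

print(accuracy_constant, kappa_constant, kappa_perfect)
```

A constant predictor can still reach a non-zero accuracy (here the share of class 0), while its kappa is 0 because the observed agreement equals the agreement expected by chance.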



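import numpy as np
from sklearn.ensemble import RandomForestClassifier

def default_classifier():
    # Random forest with out-of-bag scoring, using all available cores
    return RandomForestClassifier(oob_score=True, n_jobs=-1)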

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = default_classifier()
model.fit(X_train, y_train)

## Evaluation Metrics
from sklearn.model_selection import cross_val_score
score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, error_score='raise')
print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

# Verbose with results...
columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)


Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Mean: 0.374 (Std: 0.006)
Overall: 0.510  Kappa: 0.441
Weighted F1-Score: 0.539


from tensorflow import keras
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import train_test_split
import numpy as np

def fcn_model(hidden_layer_dim, meta):
    # note that meta is a special argument that will be
    # handed a dict containing input metadata
    n_features_in_ = meta["n_features_in_"]
    X_shape_ = meta["X_shape_"]
    n_classes_ = meta["n_classes_"]
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(n_features_in_, input_shape=X_shape_[1:]))
    return model

def get_model_fcn(modelargs={}):
    return KerasClassifier(fcn_model, 

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = get_model_fcn()
model.fit(X_train, y_train)

## Evaluation Metrics
from sklearn.model_selection import cross_val_score
score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, error_score='raise')
print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)


Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Epoch 1/10
2500/2500 [==============================] - 4s 1ms/step - loss: 1.6750 - accuracy: 0.3762
Epoch 2/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.3132 - accuracy: 0.5021
Epoch 3/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.2295 - accuracy: 0.5371
Epoch 4/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1651 - accuracy: 0.5599
Epoch 5/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1178 - accuracy: 0.5806
Epoch 6/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0889 - accuracy: 0.5935
Epoch 7/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0845 - accuracy: 0.5922
Epoch 8/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0548 - accuracy: 0.6043
Epoch 9/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0415 - accuracy: 0.6117
Epoch 10/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0316 - accuracy: 0.6172
Mean: 0.000 (Std: 0.000)
625/625 [==============================] - 0s 700us/step # Here it is running model.predict(X_test)
Overall: 0.130  Kappa: 0.000
Weighted F1-Score: 0.030
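The `Mean` line comes from `cross_val_score` with `scoring='accuracy'`. For 2-D indicator targets such as the `(n, 12)` arrays here, scikit-learn's accuracy is exact-match (subset) accuracy: a sample only counts as correct if its entire label row matches. The sketch below, using hypothetical probabilities and hypothetical helper names, shows how strongly that score depends on the way predicted probabilities are turned into label rows. Whether anything like this happens inside the run above is not established here.

```python
def threshold_rows(prob_rows, threshold=0.5):
    """Turn each probability into 0/1 independently of the others."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_rows]


def argmax_rows(prob_rows):
    """Turn each row into a one-hot row with a 1 at its largest entry."""
    one_hot = []
    for row in prob_rows:
        best = max(range(len(row)), key=row.__getitem__)
        one_hot.append([1 if i == best else 0 for i in range(len(row))])
    return one_hot


def subset_accuracy(true_rows, pred_rows):
    """Fraction of samples whose entire label row matches exactly."""
    hits = sum(t == p for t, p in zip(true_rows, pred_rows))
    return hits / len(true_rows)


# Hypothetical one-hot targets and softmax-like outputs for 3 classes.
y_true = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
probs = [
    [0.45, 0.35, 0.20],
    [0.30, 0.40, 0.30],
    [0.60, 0.30, 0.10],
    [0.40, 0.35, 0.25],
]

acc_threshold = subset_accuracy(y_true, threshold_rows(probs))
acc_argmax = subset_accuracy(y_true, argmax_rows(probs))

print(acc_threshold, acc_argmax)  # 0.25 0.75
```

With the same probabilities, thresholding each entry at 0.5 produces mostly all-zero rows that can never match a one-hot target, whereas argmax decoding recovers three of the four samples.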


# meta copies what SciKeras passes to the Keras model
meta = {
    #'classes_': array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
    #'target_type_': 'multilabel-indicator',
    'y_dtype_': np.dtype('uint8'),
    'y_ndim_': 2,
    'X_dtype_': np.dtype('float32'),
    'X_shape_': (79997, 96),
    'n_features_in_': 96,
    #'target_encoder_': ClassifierLabelEncoder(loss='categorical_crossentropy'),
    'n_classes_': 12,
    'n_outputs_': 1,
    'n_outputs_expected_': 1,
    #'feature_encoder_': FunctionTransformer()
}

def fcn_model(hidden_layer_dim, meta):
    # note that meta is a special argument that will be
    # handed a dict containing input metadata
    n_features_in_ = meta["n_features_in_"]
    X_shape_ = meta["X_shape_"]
    n_classes_ = meta["n_classes_"]
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(n_features_in_, input_shape=X_shape_[1:]))
    return model

def get_model_fcn(modelargs={}):
    model = fcn_model(128, meta)
    return model

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = get_model_fcn()
model.fit(X_train, y_train, epochs=10)

## Evaluation Metrics
#from sklearn.model_selection import cross_val_score
#score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, #error_score='raise')
#print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)


Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Epoch 1/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.6941 - accuracy: 0.3730
Epoch 2/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.3193 - accuracy: 0.5002
Epoch 3/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.2206 - accuracy: 0.5399
Epoch 4/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1585 - accuracy: 0.5613
Epoch 5/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1221 - accuracy: 0.5758
Epoch 6/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0923 - accuracy: 0.5928
Epoch 7/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0682 - accuracy: 0.5984
Epoch 8/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0611 - accuracy: 0.6046
Epoch 9/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0445 - accuracy: 0.6138
Epoch 10/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0236 - accuracy: 0.6186
Overall: 0.601  Kappa: 0.548
Weighted F1-Score: 0.600
