Scikit-learn is one of the most widely used machine learning libraries in the Python community. It provides implementations of most classical machine learning algorithms. One of the main reasons many developers prefer scikit-learn is the simplicity of its API. It lets us train an ML model with one method call, make predictions with another, and evaluate the model with a third. This easy-to-use API has made scikit-learn a widely accepted ML library, as it does not involve a steep learning curve.
However, many modern ML problems (object detection, image classification, speech recognition, etc.) are quite complicated and cannot be solved well with the algorithms available from scikit-learn. They require neural networks such as convolutional neural networks, recurrent neural networks, etc. One popular library for creating such networks is keras. Just as scikit-learn provides an easy API for the machine learning domain, keras provides an easy-to-use API for the deep learning domain. This is why many developers worldwide favor keras for creating deep neural networks, though it takes a little learning to get things right.
As a part of this tutorial, we are going to introduce a library named scikeras which lets us use keras deep neural networks through a simple API like that of scikit-learn. Scikeras lets us wrap our keras models in classes available from scikeras. We can then use these wrapped instances like scikit-learn estimators and call methods like fit(), predict() and score() on them. In short, scikeras lets us use keras models as if they were scikit-learn models. We'll explain the API of scikeras with simple examples using toy datasets available from scikit-learn.
Below we have highlighted important sections of the tutorial to give an overview of the material that we'll be covering.
Below we have imported the necessary libraries that we'll be using in this tutorial and printed the versions of each of them.
import sklearn
print("Scikit-Learn Version : {}".format(sklearn.__version__))
import tensorflow
from tensorflow import keras
print("Tensorflow Version : {}".format(tensorflow.__version__))
import scikeras
print("Scikeras Version : {}".format(scikeras.__version__))
In this section, we'll explain how we can solve a simple regression problem using keras neural net by wrapping it using scikeras so that we can use it like scikit-learn for training and evaluation. We'll be creating a very simple neural network for explanation purposes. The dataset used for example is a simple Boston housing toy dataset available from scikit-learn.
We'll start by loading the Boston housing dataset available from scikit-learn. It has information about houses in Boston such as the average number of rooms per dwelling, the crime rate in the area, the tax rate, etc. The target variable of the dataset is the median value of homes in thousands of dollars. As the target variable is continuous, this is a regression problem. Note that load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2, so the code below requires an older scikit-learn release.
We have divided the dataset into the train (80%) and test (20%) sets as well.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_boston(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have created a simple keras neural network that will be used for our regression problem. The input shape of the network matches the number of features in the data, which is 13 in our case. The first dense layer has 26 units, the second has 52 units and the final layer has 1 unit. The final layer has only 1 unit because it outputs a single predicted value. The activation for the first two dense layers is relu.
Please note that we have not covered the technical details of model creation, as we expect the reader to have some background in creating simple neural networks with keras.
from tensorflow import keras
from tensorflow.keras import models
neural_regressor = models.Sequential(
    [
        keras.layers.Dense(26, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(52, activation="relu"),
        keras.layers.Dense(1)
    ]
)
neural_regressor.summary()
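As a rough cross-check of what summary() reports, the parameter count of a stack of dense layers can be worked out by hand: each layer has (inputs × units) weights plus one bias per unit. The helper below is a hypothetical illustration (dense_param_count is not part of keras) and assumes the 13 input features described above.

```python
def dense_param_count(n_inputs, layer_units):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units  # this layer's output feeds the next layer
    return total


# 13 input features feeding Dense(26) -> Dense(52) -> Dense(1)
regressor_params = dense_param_count(13, [26, 52, 1])
print(regressor_params)  # 1821
```

If the total printed by summary() differs from this figure, the input shape or the layer sizes are probably not what was intended.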
In this section, we explain how to wrap the keras neural network in a scikeras model so that it can be used like a scikit-learn model. We'll be wrapping our network in the KerasRegressor class from scikeras, which provides an API for regression tasks. Below we have highlighted the definition of the KerasRegressor class.
Below we have wrapped our keras neural network inside the KerasRegressor class. We have specified adam as the optimizer and mean squared error as the loss. We have set a batch size of 8 and asked for the training process to run for 100 epochs. We have set the verbose parameter to 0 to silence the output, as we don't want to flood it with messages for each epoch.
from scikeras.wrappers import KerasRegressor
scikeras_regressor = KerasRegressor(model=neural_regressor,
                                    optimizer="adam",
                                    loss=keras.losses.mean_squared_error,
                                    batch_size=8,
                                    epochs=100,
                                    verbose=0
                                    )
Now, we have simply trained our KerasRegressor using fit() method by giving it train features and target values.
scikeras_regressor.fit(X_train, Y_train);
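To get a feel for how much work a fit() call with these settings does, the number of weight updates can be estimated from the batch size and the epoch count. This is a back-of-the-envelope sketch, not scikeras code; gradient_updates is a hypothetical helper, and the 404 training rows assume an 80% split of a 506-row dataset.

```python
import math


def gradient_updates(n_samples, batch_size, epochs):
    """Return (updates per epoch, total updates) for mini-batch training."""
    per_epoch = math.ceil(n_samples / batch_size)  # the last batch may be smaller
    return per_epoch, per_epoch * epochs


# Assumption: 404 training rows, i.e. 80% of a 506-row dataset.
per_epoch, total = gradient_updates(n_samples=404, batch_size=8, epochs=100)
print(per_epoch, total)  # 51 5100
```

A larger batch size reduces the number of updates per epoch, which is one reason batch size is worth tuning later on.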
In this section, we have made predictions on test data using predict() method of KerasRegressor instance.
Y_preds = scikeras_regressor.predict(X_test)
Y_preds[:5]
At last, we have calculated mean squared error and R^2 score on both train and test datasets to evaluate the performance of our neural network. We can notice from the results that it seems to have done a good job at the task.
The score() method will calculate the R^2 score for regression tasks by default.
If you are interested in learning about model evaluation metrics using scikit-learn then please feel free to check our tutorial on the same which explains the topic with simple and easy-to-understand examples.
from sklearn.metrics import mean_squared_error
print("Train MSE : {}".format(mean_squared_error(Y_train, scikeras_regressor.predict(X_train))))
print("Test MSE : {}".format(mean_squared_error(Y_test, scikeras_regressor.predict(X_test))))
print("\nTrain R^2 : {}".format(scikeras_regressor.score(X_train, Y_train)))
print("Test R^2 : {}".format(scikeras_regressor.score(X_test, Y_test)))
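For reference, both metrics are simple to compute by hand. The sketch below uses plain Python and made-up numbers (not outputs of the model above) to show that MSE is the average squared residual and that R^2 compares the residual sum of squares with the spread of the targets around their mean.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)


def r2_score_manual(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up targets and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]

mse_value = mean_squared_error_manual(y_true, y_pred)
r2_value = r2_score_manual(y_true, y_pred)
print(mse_value)            # 0.375
print(round(r2_value, 3))   # 0.925
```

An R^2 of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean target.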
The history object is available through history_ attribute of KerasRegressor instance. We can use it to access loss and metric values for both train and validation sets. Those values can later be used for plotting purposes as well.
The metrics that we have set in metrics parameter of KerasRegressor will also have entries here for each epoch.
scikeras_regressor.history_.keys()
scikeras_regressor.history_["loss"][-5:]
In this section, we'll explain how we can solve a simple classification problem using keras neural net by wrapping it using scikeras so that we can use it like scikit-learn estimator for training and evaluation. We'll be creating a very simple neural network for explanation purposes. The dataset used for example is a simple wine classification toy dataset available from scikit-learn.
In this section, we have loaded the wine dataset available from scikit-learn. The wine dataset holds chemical measurements of wines of three different types. The measurements are the features of our dataset and the wine type is the target variable.
After loading, we have divided the dataset into train (80%) and test (20%) sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have created a simple classification neural network which we'll use to solve our classification task. The input shape of the network is 13, which matches the number of features. The first dense layer has 13 units, the second has 26 units and the final layer has 3 units (the same as the number of wine classes). The first two dense layers use relu as the activation function. The final layer uses softmax.
from tensorflow import keras
from tensorflow.keras import models
neural_classifier = models.Sequential(
    [
        keras.layers.Dense(13, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(26, activation="relu"),
        keras.layers.Dense(3, activation="softmax")
    ]
)
neural_classifier.summary()
In this section, we have wrapped the keras neural network we created in the previous step into scikeras KerasClassifier. The KerasClassifier has an API for classification tasks. Below we have highlighted the definition of it which is almost the same as that of KerasRegressor.
Below we have wrapped our keras classifier in an instance of KerasClassifier. We have specified adam as the optimizer and categorical cross entropy as the loss. We have set the batch size to 8 and epochs to 100 so that training makes 100 passes through the data. We have set validation_split to 0.1, which instructs the model to hold out 10% of the training data for validation.
from scikeras.wrappers import KerasClassifier
scikeras_classifier = KerasClassifier(model=neural_classifier,
                                      optimizer="adam",
                                      loss=keras.losses.categorical_crossentropy,
                                      batch_size=8,
                                      epochs=100,
                                      verbose=0,
                                      validation_split=0.1
                                      )
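The effect of validation_split can be pictured with a small plain-Python sketch. Keras documents that the validation samples are taken from the end of the data passed to fit(), before any shuffling. The helper below (tail_validation_split, a hypothetical name) mimics that idea; the exact rounding is an assumption here.

```python
import math


def tail_validation_split(samples, validation_split):
    """Hold out the last fraction of the samples for validation."""
    split_at = math.floor(len(samples) * (1 - validation_split))
    return samples[:split_at], samples[split_at:]


rows = list(range(20))  # stand-in for 20 training rows
train_rows, val_rows = tail_validation_split(rows, 0.1)
print(len(train_rows), val_rows)  # 18 [18, 19]
```

Because the held-out rows come from the tail, it helps that train_test_split has already shuffled the data; otherwise the validation set could be dominated by a single class.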
In this section, we have performed actual training by calling fit() method on an instance of KerasClassifier.
scikeras_classifier.fit(X_train, Y_train);
In this section, we have made predictions on test data using the predict() method. We can also obtain the predicted class probabilities by calling the predict_proba() method.
Y_preds = scikeras_classifier.predict(X_test)
Y_probs = scikeras_classifier.predict_proba(X_test)
Y_preds[:5], Y_probs[:5]
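The relationship between the two outputs is straightforward: the softmax layer turns raw scores into probabilities that sum to 1, and the predicted label is the index of the largest probability. The sketch below illustrates this in plain Python with made-up scores; it is not how scikeras is implemented internally.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_labels(prob_rows):
    """Pick the index of the highest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in prob_rows]


# Made-up scores for two samples and three classes.
logit_rows = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]]
prob_rows = [softmax(row) for row in logit_rows]
labels = predict_labels(prob_rows)
print(labels)  # [0, 2]
```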
In this section, we have evaluated model performance by calculating accuracy on test and train datasets using score() method. It'll calculate accuracy for classification models.
print("Test Accuracy : {:.2f}".format(scikeras_classifier.score(X_test, Y_test)))
print("Train Accuracy : {:.2f}".format(scikeras_classifier.score(X_train, Y_train)))
Here, we have shown a few entries of training and validation losses using the history object of the model. The metrics that we have set in metrics parameter of KerasClassifier will also have entries here for each epoch.
scikeras_classifier.history_.keys()
scikeras_classifier.history_["loss"][-5:]
scikeras_classifier.history_["val_loss"][-5:]
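One common use of these per-epoch values is to find the epoch at which validation loss was lowest, which hints at where overfitting starts. The sketch below uses a hypothetical history dictionary with made-up numbers shaped like the one above (lists keyed by "loss" and "val_loss").

```python
# Hypothetical loss curves; real values would come from history_.
history = {
    "loss": [1.10, 0.80, 0.60, 0.45, 0.40],
    "val_loss": [1.05, 0.85, 0.70, 0.72, 0.75],
}


def best_epoch(values):
    """Return the 1-based epoch with the lowest value."""
    return min(range(len(values)), key=values.__getitem__) + 1


best_val_epoch = best_epoch(history["val_loss"])
print(best_val_epoch)  # 3
```

Here training loss keeps falling after epoch 3 while validation loss rises, which is the usual sign that extra epochs are no longer helping.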
When tuning a keras model, we generally train it for a few epochs, check performance, and then train it for a few more epochs to see whether performance is improving. We repeat these trials until we reach good accuracy. Each round continues from the weights produced by the previous round.
When we wrap our keras model inside a scikeras model, by default the weights of the model are reset each time we call fit(), which is referred to as a cold start. If we want to call fit() more than once and continue from the weights left by the previous call, we should set the warm_start parameter to True when creating the scikeras model. The model weights are then initialized only before the first call to fit(), and all subsequent calls update the weights produced by earlier calls.
By default, warm_start is set to False.
Our code for this example starts by loading the wine dataset and divides it into train/test sets. It then creates a keras model which is the same as that of the classification section. We have then wrapped our keras model inside of KerasClassifier scikeras model. The code in this part is almost the same as the code from the classification section.
### Load Dataset
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
### Define Neural Network
from scikeras.wrappers import KerasClassifier
from tensorflow import keras
from tensorflow.keras import models
neural_classifier = models.Sequential(
    [
        keras.layers.Dense(13, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(26, activation="relu"),
        keras.layers.Dense(3, activation="softmax")
    ]
)
### Initialize Model
scikeras_classifier = KerasClassifier(model=neural_classifier,
                                      optimizer="adam",
                                      loss=keras.losses.categorical_crossentropy,
                                      batch_size=8,
                                      epochs=5,
                                      warm_start=True
                                      )
Below we call the fit() method for the first time with the train data and the target variable. We have left verbose at its default for this example, so statistics are displayed at the end of each epoch. The output shows the training loss after each epoch.
scikeras_classifier.fit(X_train, Y_train);
Below we have called the fit() method again with the train data to run the training process for another 5 epochs. This call won't start with fresh model weights. Instead, it'll update the weights from the last call to fit() because we have set warm_start to True. We can see from the loss displayed after each epoch that it continues to decrease from where the last call to fit() ended.
scikeras_classifier.fit(X_train, Y_train);
Below we have called fit() method again to run the training process for another 5 epochs.
scikeras_classifier.fit(X_train, Y_train);
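The difference between a cold start and a warm start can be seen in isolation with a toy example. ToyEstimator below is a hypothetical stand-in, not a scikeras class: it fits a single weight toward the value 3.0 by gradient descent and either resets or keeps that weight between fit() calls.

```python
class ToyEstimator:
    """Hypothetical stand-in for a wrapped model; fits one weight toward 3.0."""

    def __init__(self, epochs=5, learning_rate=0.1, warm_start=False):
        self.epochs = epochs
        self.learning_rate = learning_rate
        self.warm_start = warm_start

    def fit(self):
        # Cold start: forget previous training unless warm_start is set.
        if not self.warm_start or not hasattr(self, "weight_"):
            self.weight_ = 0.0
        for _ in range(self.epochs):
            gradient = 2 * (self.weight_ - 3.0)  # derivative of (w - 3)^2
            self.weight_ -= self.learning_rate * gradient
        return self

    def loss(self):
        return (self.weight_ - 3.0) ** 2


cold = ToyEstimator(warm_start=False)
warm = ToyEstimator(warm_start=True)
for _ in range(3):  # three fit() calls of 5 epochs each
    cold.fit()
    warm.fit()

print(cold.loss() > warm.loss())  # True
```

The cold-start estimator ends every call with the result of only 5 epochs, while the warm-start one has accumulated 15 epochs of training.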
Below we have printed model accuracy on the train and test datasets after completion of the training process. The accuracy is quite low because we have run the training process for only 15 epochs. If we run it for around 100 epochs, as we did in the classification section, accuracy should improve significantly.
print("Test Accuracy : {:.2f}".format(scikeras_classifier.score(X_test, Y_test)))
print("Train Accuracy : {:.2f}".format(scikeras_classifier.score(X_train, Y_train)))
In this section, we'll explain how we can create a machine learning pipeline where we perform a list of steps on data before feeding it to the model. We'll explain how we can create a pipeline using scikit-learn and use our scikeras model in it. The pipeline will be simple and will have two steps only. The first step will scale the data and the second step will fit keras model to it. We'll be using the Boston housing dataset for our purpose.
If you are interested in learning about how to create a machine learning pipeline using scikit-learn then please feel free to check our tutorial on the same which tries to explain the topic with simple and easy-to-understand examples.
Below we have loaded the Boston housing dataset available from scikit-learn and divided it into train/test sets. The code is exactly the same as the one from the regression section.
### Load Dataset
from sklearn import datasets
from sklearn.model_selection import train_test_split
import numpy as np
X, Y = datasets.load_boston(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
Below we have created a simple keras model for performing regression task on our Boston housing dataset and wrapped it inside of scikeras KerasRegressor model. The code for this part is exactly the same as that of the regression section hence we have not included a detailed explanation.
### Define Model
from tensorflow import keras
from tensorflow.keras import models
neural_regressor = models.Sequential(
    [
        keras.layers.Dense(26, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(52, activation="relu"),
        keras.layers.Dense(1)
    ]
)
neural_regressor.summary()
### Initiate Model
from scikeras.wrappers import KerasRegressor
scikeras_regressor = KerasRegressor(model=neural_regressor,
                                    optimizer="adam",
                                    loss=keras.losses.mean_squared_error,
                                    batch_size=8,
                                    epochs=100,
                                    verbose=0
                                    )
In this section, we have created our machine learning pipeline using the Pipeline class of scikit-learn. It accepts a list of (name, estimator) tuples, and the steps are applied to the data in the order in which they are specified. Our pipeline has two steps: a RobustScaler named "Normalize" and our scikeras regressor named "Model".
After creating the pipeline, we have trained the pipeline by calling fit() method on it giving train data and target variables to it.
If you want to learn about scaling the data for machine learning tasks then please feel free to check our tutorial on the same which covers the topic with simple and easy-to-understand examples.
## Create Pipeline
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler
ml_pipeline = Pipeline([("Normalize", RobustScaler()), ("Model", scikeras_regressor)])
ml_pipeline.fit(X_train, Y_train)
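To make the first pipeline step less of a black box: by default RobustScaler centers each feature on its median and divides by its interquartile range, so a few extreme values barely affect the scale. The sketch below is a simplified one-feature illustration in plain Python (SimpleRobustScaler is a hypothetical class, not scikit-learn's implementation). It also shows that the statistics are learned from the train data only and then reused on the test data, which is what the pipeline does for us.

```python
from statistics import median, quantiles


class SimpleRobustScaler:
    """Hypothetical one-feature sketch of median/IQR scaling."""

    def fit(self, values):
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        self.center_ = median(values)
        self.scale_ = q3 - q1  # interquartile range
        return self

    def transform(self, values):
        return [(v - self.center_) / self.scale_ for v in values]


train_values = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is an outlier
test_values = [5.0, 3.0]

scaler = SimpleRobustScaler().fit(train_values)  # statistics from train only
scaled_train = scaler.transform(train_values)
scaled_test = scaler.transform(test_values)
print(scaled_train)  # [-1.0, -0.5, 0.0, 0.5, 48.5]
print(scaled_test)   # [1.0, 0.0]
```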
In this section, we have evaluated the performance of the ML pipeline by calculating MSE and R^2 score metrics on both train and test datasets. We can notice from the metrics output that the performance seems a little better due to scaling if we compare them with metrics results from the regression section.
### Evaluate Model
from sklearn.metrics import mean_squared_error
print("Train MSE : {}".format(mean_squared_error(Y_train, ml_pipeline.predict(X_train).reshape(-1))))
print("Test MSE : {}".format(mean_squared_error(Y_test, ml_pipeline.predict(X_test).reshape(-1))))
print("\nTrain R^2 : {}".format(ml_pipeline.score(X_train, Y_train)))
print("Test R^2 : {}".format(ml_pipeline.score(X_test, Y_test)))
In this section, we'll explain how we can perform a grid search on hyperparameters to tune the model for good performance. We'll be creating a simple keras model, wrapping it inside of the scikeras model, and grid searching different hyperparameters of the model to find parameters setting which gives the best results. We'll be using the Boston housing dataset for our purpose.
If you are interested in learning about hyperparameters grid search using scikit-learn then please feel free to check our tutorial on the same which covers the topic with simple and easy-to-understand examples.
In this section, we have loaded the Boston housing dataset and divided it into train/test sets. We have then created a simple keras model for the regression task and wrapped it inside of scikeras model. We have not provided optimizer parameter this time as we'll be trying different optimizers in a grid search.
### Load Dataset
from sklearn import datasets
from sklearn.model_selection import train_test_split
import numpy as np
X, Y = datasets.load_boston(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)
### Define Model
from tensorflow import keras
from tensorflow.keras import models
neural_regressor = models.Sequential(
    [
        keras.layers.Dense(26, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(52, activation="relu"),
        keras.layers.Dense(1)
    ]
)
neural_regressor.summary()
### Initiate Model
from scikeras.wrappers import KerasRegressor
scikeras_regressor = KerasRegressor(model=neural_regressor,
                                    loss="mean_squared_error",
                                    verbose=0,
                                    epochs=100
                                    )
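In this section, we have first declared a hyperparameter search dictionary with three hyperparameters: the batch size, the optimizer, and the optimizer's learning rate. The name optimizer__learning_rate uses the double-underscore convention of scikeras to route the learning_rate value to the optimizer.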
After creating a dictionary, we have created an instance of GridSearchCV by giving it scikeras model and hyperparameters dictionary. We have then called fit() method on an instance of GridSearchCV which will perform grid search by trying different combinations of those three hyperparameters to find the combination which gives the best result.
from sklearn.model_selection import GridSearchCV
import warnings
warnings.filterwarnings("ignore")
params = {
    "batch_size": [8, 16],
    "optimizer": ["adam", "sgd"],
    "optimizer__learning_rate": [0.001, 0.01, 0.1],
}
grid = GridSearchCV(scikeras_regressor, params, scoring='r2')
grid.fit(X_train, Y_train)
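Before launching a grid search, it is worth estimating how many models it will train, since each fit here runs for 100 epochs. The sketch below enumerates the combinations from the same dictionary with itertools. The fold count of 5 is an assumption based on GridSearchCV's default cv in recent scikit-learn versions, and the extra fit accounts for the final refit on the full training set.

```python
from itertools import product

params = {
    "batch_size": [8, 16],
    "optimizer": ["adam", "sgd"],
    "optimizer__learning_rate": [0.001, 0.01, 0.1],
}

# Every combination of one value per hyperparameter.
names = list(params)
candidates = [dict(zip(names, combo)) for combo in product(*params.values())]

n_folds = 5  # assumption: GridSearchCV's default cv in recent scikit-learn
n_fits = len(candidates) * n_folds + 1  # plus one refit on the full train set
print(len(candidates), n_fits)  # 12 61
```

Adding one more value to any list multiplies the cost, so it is usually cheaper to start with a coarse grid and refine it around the best setting.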
Below we have printed hyperparameters setting that gave the best result. We have also printed the best score.
print("Best Score : {}".format(grid.best_score_))
print("Best Params : {}".format(grid.best_params_))
Below we have evaluated MSE and R^2 score metrics on both train and test datasets to check the performance of the model with the best hyperparameters setting.
### Evaluate Model
from sklearn.metrics import mean_squared_error
print("Train MSE : {}".format(mean_squared_error(Y_train, grid.predict(X_train))))
print("Test MSE : {}".format(mean_squared_error(Y_test, grid.predict(X_test))))
print("\nTrain R^2 : {}".format(grid.score(X_train, Y_train)))
print("Test R^2 : {}".format(grid.score(X_test, Y_test)))
In this section, we have explained how we can perform a grid search on a machine learning pipeline. This way we can tune earlier components of the ML pipeline as well along with the ML model. We'll be using the same ML pipeline which we had used in the ML pipeline section. We'll be using the Boston housing dataset for this example.
In this section, we have loaded the Boston housing dataset and divided it into train/test sets. We have then created a simple keras model for the regression task and wrapped it inside of scikeras model. The code for this part is exactly the same as our code from the previous grid search section.
### Load Dataset
from sklearn import datasets
from sklearn.model_selection import train_test_split
import numpy as np
X, Y = datasets.load_boston(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)
### Define Model
from tensorflow import keras
from tensorflow.keras import models
neural_regressor = models.Sequential(
    [
        keras.layers.Dense(26, activation="relu", input_shape=(X_train.shape[1],)),
        keras.layers.Dense(52, activation="relu"),
        keras.layers.Dense(1)
    ]
)
neural_regressor.summary()
### Initiate Model
from scikeras.wrappers import KerasRegressor
scikeras_regressor = KerasRegressor(model=neural_regressor,
                                    loss="mean_squared_error",
                                    verbose=0,
                                    epochs=100
                                    )
In this section, we have first declared a hyperparameter search dictionary with three hyperparameters and the values to try for each. We have prefixed each hyperparameter name with the string 'Model__' to specify that it belongs to the scikeras model. The prefix is needed because we have named the scikeras step 'Model' inside the ML pipeline.
After creating a dictionary, we have created an ML pipeline as we had created in the ML pipeline section. We have then created an instance of GridSearchCV by giving ML pipeline and hyperparameters dictionary to it. We have then called fit() method on an instance of GridSearchCV to perform grid search on hyperparameters.
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler
## Declare Hyperparameters Range
params = {
    "Model__batch_size": [8, 16],
    "Model__optimizer": ["adam", "sgd"],
    "Model__optimizer__learning_rate": [0.001, 0.01, 0.1],
}
### Create Pipeline
ml_pipeline = Pipeline([("Normalize", RobustScaler()), ("Model", scikeras_regressor)])
## Grid Search Hyperparameters
grid = GridSearchCV(ml_pipeline, params, scoring="r2")
grid.fit(X_train, Y_train)
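The prefixed names work because everything before the first double underscore names a pipeline step and the remainder is handed to that step. The sketch below imitates that routing in plain Python; route_params is a hypothetical helper and the parameter values are made up for illustration.

```python
def route_params(params):
    """Group 'step__param' keys by pipeline step name."""
    routed = {}
    for key, value in params.items():
        # Split only on the first "__"; the rest may contain more of them.
        step, delimiter, sub_key = key.partition("__")
        if not delimiter:
            raise ValueError(f"{key!r} has no step prefix")
        routed.setdefault(step, {})[sub_key] = value
    return routed


example_params = {  # hypothetical values for illustration
    "Model__batch_size": 8,
    "Model__optimizer": "adam",
    "Model__optimizer__learning_rate": 0.01,
}
routed = route_params(example_params)
print(routed)
# {'Model': {'batch_size': 8, 'optimizer': 'adam', 'optimizer__learning_rate': 0.01}}
```

Note that optimizer__learning_rate keeps its own double underscore after the split, which is how the scikeras step then passes learning_rate on to the optimizer.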
Below we have printed the performance of the model which gave the best result and the hyperparameters setting which was responsible for that result.
print("Best Score : {}".format(grid.best_score_))
print("Best Params : {}".format(grid.best_params_))
Below we have printed MSE and R^2 scores evaluated on train and test datasets using the above ML pipeline with the best hyperparameters setting.
### Evaluate Model
from sklearn.metrics import mean_squared_error
print("Train MSE : {}".format(mean_squared_error(Y_train, grid.predict(X_train))))
print("Test MSE : {}".format(mean_squared_error(Y_test, grid.predict(X_test))))
print("\nTrain R^2 : {}".format(grid.score(X_train, Y_train)))
print("Test R^2 : {}".format(grid.score(X_test, Y_test)))
In this section, we'll explain how we can save keras model wrapped inside of scikeras model to a file and then load it again.
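We can access the keras model given to a scikeras model at any time through the model attribute of the scikeras model. The model that scikeras builds and trains during fit() is also exposed through the model_ attribute. Below we have accessed the model attribute of the scikeras model from the regression section.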
scikeras_regressor.model
Keras model has a method named save() which accepts pathname as input and will save keras model in that path.
Below we have called save() method on keras model present inside of scikeras model from regression section. We can notice a logging message informing us that model is saved inside of keras_regressor path.
scikeras_regressor.model.save("keras_regressor")
%ls keras_regressor/
We can load the keras model from the saved path by calling load_model() method available from keras.models module. Below we have reloaded our keras model from keras_regressor directory.
neural_regressor2 = keras.models.load_model("keras_regressor")
After loading the keras model, we have wrapped it again inside of KerasRegressor to create a new scikeras model. We have set other parameters exactly the same way as we had set earlier during the regression section.
scikeras_regressor2 = KerasRegressor(model=neural_regressor2,
                                     optimizer="adam",
                                     loss=keras.losses.mean_squared_error,
                                     batch_size=8,
                                     epochs=100,
                                     verbose=0
                                     )
After creating scikeras model, we need to initialize it as well so that it can be used to make predictions.
scikeras_regressor2.initialize(X_train, Y_train)
Below we have calculated MSE and R^2 score metrics on both train and test datasets using both the original scikeras model from the regression section and the one we loaded from a file. We can notice that both have given the same results hence we have correctly loaded the model from the file.
from sklearn.metrics import mean_squared_error
print("===== Original Model Performance ========\n")
print("Train MSE : {}".format(mean_squared_error(Y_train, scikeras_regressor.predict(X_train))))
print("Test MSE : {}".format(mean_squared_error(Y_test, scikeras_regressor.predict(X_test))))
print("\nTrain R^2 : {}".format(scikeras_regressor.score(X_train, Y_train)))
print("Test R^2 : {}".format(scikeras_regressor.score(X_test, Y_test)))
print("\n===== Loaded Model Performance ========\n")
print("Train MSE : {}".format(mean_squared_error(Y_train, scikeras_regressor2.predict(X_train))))
print("Test MSE : {}".format(mean_squared_error(Y_test, scikeras_regressor2.predict(X_test))))
print("\nTrain R^2 : {}".format(scikeras_regressor2.score(X_train, Y_train)))
print("Test R^2 : {}".format(scikeras_regressor2.score(X_test, Y_test)))
This ends our small tutorial explaining how we can wrap a keras model inside scikeras so that the resulting model can be used like a scikit-learn estimator with a simple API. Please feel free to let us know your views in the comments section.