Machine learning is one of the most prominent branches of computer science today.
Deep neural networks are constantly being developed to solve hard tasks (image classification, object detection, speech recognition, etc.) that were once considered impossible for computers. The datasets required to train complicated ML models are also growing in size day by day.
Because training a neural network requires trying different combinations of hyperparameter settings to get good results, building a model on a large dataset can take a lot of time.
The most common approach for finding the best hyperparameters of an ML model is grid search, which tries all possible combinations of hyperparameters and can take a very long time if the dataset is large.
Grid search also wastes time on ranges of hyperparameter values that clearly do not give good results, because it still tries every combination in those ranges.
What we want instead is a tuning process that concentrates on combinations of hyperparameters that are likely to give good results, rather than exhaustively trying every possible combination in the given ranges.
Python has a bunch of libraries that let us perform hyperparameters tuning, such as Optuna, Hyperopt, scikit-optimize, and bayes_opt.
As a part of this tutorial, we have explained how to use the Python library bayes_opt to perform hyperparameters tuning of sklearn ML models with simple and easy-to-understand examples. The tutorial provides a guide to using "bayes_opt" for regression and classification problems. Apart from hyperparameters tuning, it covers topics like changing parameter ranges during tuning, manually looping through trials, guiding the tuning process, saving tuning results, and resuming the tuning process later.
The bayes_opt library uses Bayesian inference and Gaussian processes to find hyperparameter values that give the best results in fewer trials. It can take any black-box function as input and maximize the value returned by that function.
The library starts by constructing a posterior distribution over functions (a Gaussian process) that describes the input function whose output we want to maximize. It then tries different combinations of the parameters that the function takes as input.
As more combinations are tried, the posterior distribution improves. The optimizer learns which regions of the parameter space give good results and keeps exploring those regions rather than sweeping the whole parameter space.
At each trial, the Gaussian process is refit, and the posterior distribution together with an exploration strategy (UCB (Upper Confidence Bound) or EI (Expected Improvement)) is used to decide which combination of parameters to try next. The goal is to find the best parameter setting in as few trials as possible.
We can easily install bayes_opt using pip or conda.
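The package is published under the name bayesian-optimization (while the import name is bayes_opt), so either of the commands below should work:

pip install bayesian-optimization

conda install -c conda-forge bayesian-optimization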
We'll start by importing the bayes_opt library.
import bayes_opt
import warnings
warnings.filterwarnings("ignore")
We'll be following the steps below in all our sections to explain how to perform hyperparameters optimization.
1. Define an objective function that takes hyperparameter values as input, trains a model, and returns a metric value that we want to maximize.
2. Define a search space dictionary that maps each hyperparameter name to a (low, high) range.
3. Create an instance of BayesianOptimization with the objective function and the search space.
4. Call the maximize() method to run the trials.
5. Retrieve the best results from the max and res attributes of the optimizer.
NOTE: Please feel free to skip this section if you are in a hurry and just want to use "bayes_opt" with sklearn models. This section introduces how to use the library to tune the parameter of a simple line formula. It has a few definitions that can be referred to later on.
In this section, we'll explain how we can use bayes_opt to maximize a simple line formula.
We'll be trying to optimize the formula -1 * abs(5x - 21). This formula has a maximum value of 0, reached when 5x - 21 evaluates to 0.
The abs() function makes sure that the output of 5x - 21 is always non-negative. Multiplying the output of abs() by -1 turns it negative (or leaves it at zero). This way, the output of the function is maximal when abs(5x - 21) returns 0, as every other value is negative and therefore less than 0.
We'll provide a range for parameter x, and the bayesian optimizer will try different values in that range to find the maximum of the function in as few trials as possible.
In this section, we have simply defined the objective function, which takes a single parameter x as input and returns the value of the formula -1 * abs(5x - 21). Our bayesian optimizer will feed different values of x to this function so that the value returned by it is maximized in as few trials as possible.
def objective(x):
    return -1 * abs(5*x - 21)
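As a quick sanity check of the objective defined above (the input values below are chosen only for illustration), x = 4.2 gives the maximum because 5 × 4.2 = 21, while any other value produces a negative output:

print(objective(4.2))   # ~0.0, the maximum possible value
print(objective(0.0))   # -21.0
print(objective(-1.0))  # -26.0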
In this section, we have defined the parameters search space dictionary. As we have only one parameter to optimize in our function, the dictionary has a single entry. We have provided the range (-1, 5) so that the optimizer will try values of x in that range.
search_space = {"x" : (-1, 5)}
In this section, we have defined an instance of the BayesianOptimization optimizer and used it to maximize our objective function. In order to perform a parameter search, we first need to create an instance of BayesianOptimization. Its most important constructor arguments are the objective function (f), the parameter bounds dictionary (pbounds), and an optional random_state for reproducibility.
Below, we have first created an instance of BayesianOptimization with our objective function and parameters search space. We have then called the maximize() method on it to maximize the value returned by the objective function for 15 trials.
We can notice from the output that it prints results for 20 trials (5 random initial points plus the 15 requested iterations). The function has reached its maximum by the end of these trials; the best value of x tried is about 4.2, which makes 5x - 21 almost exactly zero (5 × 4.2 = 21).
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123
)
optimizer.maximize(n_iter=15)
In this section, we are printing the results of our bayesian optimization process. The BayesianOptimization optimizer object has a few attributes that we can explore to retrieve the results of the optimization process.
The max attribute holds a dictionary with the parameter setting that produced the maximum value of our objective function, along with that maximum value. Below, we have printed which parameter setting gave the maximum value.
We can access information about every setting tried and the objective value for it through the res attribute of the BayesianOptimization optimizer. In the next cell, we have printed the last 10 settings tried.
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value : {}".format(optimizer.max["target"]))
results = optimizer.res
results[-10:]
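Each entry of res is a dictionary holding the "target" value and the "params" dictionary for that trial, so we can also format the history ourselves; a small sketch:

for trial in results[-3:]:
    # Print the last few trials in a compact form
    print("x = {:.3f} -> target = {:.3f}".format(trial["params"]["x"], trial["target"]))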
We can also call the maximize() method more than once with different values of n_iter and init_points, asking it to try more parameter settings if we are not satisfied with the results of the previous call.
Below we have called maximize() again with init_points set to 0 and n_iter set to 10. This asks it to skip random trials and run 10 more guided trials based on the previous ones.
We can notice from the results that the value of x stays around 4.2, as this is the value where the objective function returns its maximum.
optimizer.maximize(init_points=0, n_iter=10)
Below we have printed the best results after trying 10 more parameter combinations.
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value : {}".format(optimizer.max["target"]))
We can also change the range of any parameter if we are not getting good results from the currently set ranges. BayesianOptimization provides a method named set_bounds() which accepts a dictionary with a parameter search space. It replaces the search space for the parameters mentioned in this dictionary.
Below we have replaced the search space for parameter x with the new range (-5, -1). We know that we won't get good results by trying values in this range, but this is for explanation purposes.
optimizer.set_bounds(new_bounds={"x": (-5, -1)})
Below we have called maximize() again to run for 10 trials. It'll now try values in the new range and record results. We can notice from the results that they are not good compared to our previous trials.
optimizer.maximize(
    init_points=0,
    n_iter=10,
)
Below we have printed the best results across all trials. We can notice that the best recorded value of x is still around 4.2, which gave the best result earlier. None of the trials performed after setting the new bounds on x were able to improve the results further.
print("Best Parameter Setting : {}".format(optimizer.max["params"]))
print("Best Target Value : {}".format(optimizer.max["target"]))
In this section, we'll explain how we can use bayes_opt with scikit-learn to find the best hyperparameters of a ridge regression model solving a regression task. We'll be using the Boston housing dataset available from scikit-learn for our example. We'll be following the same steps which we listed earlier for using bayes_opt.
If you are interested in learning about regression using scikit-learn then please feel free to check our tutorial on the same which explains it with simple examples.
In this section, we have loaded the Boston housing dataset available from scikit-learn. It has information about houses in Boston such as the number of rooms, the crime rate in the area, the tax rate, etc. The target variable of the dataset is the median home value in thousands of dollars. As the target variable is continuous, this is a regression problem.
Below we have loaded the Boston housing dataset into variables X and Y. The variable X has the feature data and Y has the target values. We have then divided the dataset into train (80%) and test (20%) sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_boston(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
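Note: load_boston() was deprecated in scikit-learn 1.0 and removed in 1.2, so the call above only works on older versions. If you are on a newer release, you can substitute a similar regression dataset without changing the rest of the code, for example the California housing dataset:

# Alternative for scikit-learn >= 1.2, where load_boston() is no longer available
X, Y = datasets.fetch_california_housing(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=123)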
In this section, we have created an objective function that we'll be using for our regression task. We'll be fitting the Ridge regression model available from scikit-learn on our training dataset. We'll be tuning 3 hyperparameters of the Ridge model.
Our objective function takes as input these 3 hyperparameters. It then creates an instance of the Ridge model using the values of these 3 hyperparameters.
The bayes_opt library can only suggest float values for hyperparameters, hence we have included a little logic that converts these floats to the actual hyperparameter values. We have created lists of the values accepted by the hyperparameters fit_intercept and solver. We then map the float value provided by the optimizer for these parameters to an integer index and use that index to pick the actual hyperparameter value from the list.
After creating the Ridge regression model, we train it on the train data. At last, we return the R^2 score calculated on the test data using the trained model. The R^2 score has a best possible value of 1 and can become negative for badly performing models. We want our optimizer to push this score as high as possible for better performance of the model.
If you are interested in learning about R^2 score and other metrics available from scikit-learn then please feel free to check our tutorial on the same. It covers the majority of ML metrics.
from sklearn.linear_model import Ridge
fi_range = [True, False]
solvers = ["svd", "cholesky", "lsqr", "sparse_cg", "sag", "saga"]
def objective(alpha, fit_intercept, solver):
    regressor = Ridge(alpha=alpha,
                      fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                      solver=solvers[int(solver)],
                      random_state=123)
    regressor.fit(X_train, Y_train)
    return regressor.score(X_test, Y_test)
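As a quick illustration of how the float suggestions are decoded (the input values below are arbitrary, chosen only for demonstration), the call below trains a single Ridge model with fit_intercept=True (0.2 is not greater than 0.5, so index 0 is used) and solver "lsqr" (int(2.7) = 2), and returns its test R^2:

# One manual evaluation of the objective with arbitrary float inputs
print(objective(alpha=1.0, fit_intercept=0.2, solver=2.7))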
Below we have declared the hyperparameters search space for our regression model. We have set the range of the alpha parameter to (0.5, 5), so different float values in this range will be tried. The ranges for fit_intercept and solver are set to (0, 1) and (0, 5); the float values tried for these parameters are converted to integer indices, and the actual values are then selected from the lists by integer indexing. This logic is included inside our objective function above.
search_space = {
    "alpha": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 5)
}
In this section, we have first created an optimizer using BayesianOptimization constructor by giving it an objective function and hyperparameters search space. We have then called maximize() method with default parameters on the optimizer instance. This will run the optimization process for a total of 30 (5 random + 25 normal) iterations. It'll try 30 different combinations of hyperparameters on the objective function.
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123
)
optimizer.maximize()
Below we have retrieved the best performing hyperparameters setting using the max attribute of the optimizer. We have then decoded the float values for fit_intercept and solver back to their actual values, using the same logic as inside the objective function. We have also printed the best parameter combination and the value of the objective function.
alpha = optimizer.max["params"]["alpha"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[int(optimizer.max["params"]["solver"])]
print("Best Parameter Setting : {}".format({"alpha": alpha, "fit_intercept": fit_intercept, "solver": solver}))
print("Best R^2 : {}".format(optimizer.max["target"]))
Below we have called maximize() function again on optimizer for 2 random iterations and 5 normal iterations. We have tried 7 more iterations to check whether it can further improve results.
optimizer.maximize(init_points=2, n_iter=5)
Below we have printed the best hyperparameters combination and objective function value again to verify whether there is any change from the previously printed results. We can notice from the output that it is exactly the same as after our previous call to maximize(). We can stop the optimization process if it's not improving model performance further.
alpha = optimizer.max["params"]["alpha"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[int(optimizer.max["params"]["solver"])]
print("Best Parameter Setting : {}".format({"alpha": alpha, "fit_intercept": fit_intercept, "solver": solver}))
print("Best R^2 : {}".format(optimizer.max["target"]))
In this section, we have created an instance of Ridge regression using the best hyperparameters values that we got using the bayesian optimization process. We have then fit the model with train data and evaluated R^2 score on both train/test datasets.
regressor = Ridge(alpha=alpha,
                  fit_intercept=fit_intercept,
                  solver=solver,
                  random_state=123)
regressor.fit(X_train, Y_train)
print("Train R^2 : {:.2f}".format(regressor.score(X_train, Y_train)))
print("Test R^2 : {:.2f}".format(regressor.score(X_test, Y_test)))
In this section, we'll explain how we can use bayes_opt for optimizing hyperparameters for the classification task. We'll be using the wine dataset available from scikit-learn for this example. We'll be creating a logistic regression classification model and trying different hyperparameters combinations using the bayesian process to improve the performance of the model.
If you are interested in learning about classification using scikit-learn then please feel free to check our tutorial on the same which explains it with simple examples.
In this section, we have loaded the wine dataset available from scikit-learn. The wine dataset has measurements of ingredients used in the creation of three different types of wine. The ingredient measurements are the features of our dataset and the wine type is the target variable.
Below we have loaded the wine dataset from scikit-learn and divided it into the train (80%) and test (20%) sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have declared an objective function for our classification task. Our objective function takes values for 4 hyperparameters as input.
Our objective function first creates an instance of LogisticRegression with the hyperparameter values provided to it. We have included logic to convert the float values of hyperparameters using integer indexing. As we explained in the regression section, the optimizer can only suggest float values, while some of our hyperparameters take values of different data types. To solve this, we maintain lists of possible values for those hyperparameters, convert the suggested floats to integer indices, and use the indices to retrieve the actual hyperparameter values from the lists.
After creating the model, we fit it on the train data and return the model accuracy on the test dataset. We'll be trying to maximize this test accuracy returned by the objective function.
from sklearn.linear_model import LogisticRegression
fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]
def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)
    classifier.fit(X_train, Y_train)
    return classifier.score(X_test, Y_test)
In this section, we have declared hyperparameters search space as a python dictionary. The keys of the dictionary are hyperparameter names and values are range from which to try values of hyperparameters. The range for hyperparameter C is set as (0.5,5) which means float values in this range will be tried for it. The range for the other three hyperparameters is set as (0,1) which will try different float values in this range and our objective function will convert float values to hyperparameter values using integer indexing.
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}
In this section, we have first created an optimizer instance (BayesianOptimization) with objective function and hyperparameters search space. We have then called maximize() method on the optimizer to perform hyperparameters optimization. This call will try 30 different hyperparameters combinations (5 random & 25 normal) on an objective function to maximize its output.
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123
)
optimizer.maximize()
In this section, we have printed the parameters combination that maximized the output of our objective function. We have also converted float values of hyperparameters fit_intercept, solver, and penalty to their actual values which were tried. We have then printed the best hyperparameters combination and objective function output as well.
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer.max["target"]))
In this section, we have created an instance of LogisticRegression using the best hyperparameters combination which we found out using our bayesian optimization process. We have then fit it on the training dataset. At last, we have printed the accuracy of the model on train and test datasets.
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)
classifier.fit(X_train, Y_train)
print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
In this section, we'll explain how to loop through hyperparameter settings suggested by the Bayesian optimization process ourselves. Until now, all our examples called the maximize() method of the BayesianOptimization optimizer, which loops through different hyperparameter combinations on our behalf. But there are situations where we want more control, for example to perform extra operations between trials such as saving model weights if the training process is costly. In those situations, we can use other methods of the BayesianOptimization object that let us loop through hyperparameter combinations manually.
We'll be using the wine classification dataset from scikit-learn in our example hence we'll be reusing code from the classification section of our tutorial.
In this section, we have loaded the wine dataset and divided it into train/test sets. The code is almost the same as that of the classification section.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have defined the objective function that we have used for our classification problem. We have reused the objective function from the classification section which tries to optimize 4 hyperparameters of a logistic regression model. Please feel free to check the classification section if you want to understand this function in detail.
from sklearn.linear_model import LogisticRegression
fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]
def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)
    classifier.fit(X_train, Y_train)
    return classifier.score(X_test, Y_test)
Below we have defined hyperparameters search space for our problem. It is the same as that of the classification section.
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}
In this section, we have first defined our optimizer BayesianOptimization using objective function and hyperparameters search space. We'll be using a different approach to maximize objective function for this example.
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123
)
In order to maximize the objective function using a manual loop, we need to follow the steps below.
1. Create an instance of UtilityFunction that represents the acquisition function (UCB in our example).
2. Call the suggest() method of the optimizer with the utility function to get the next hyperparameters combination to try.
3. Evaluate the objective function with that combination.
4. Register the combination and the resulting target value with the optimizer by calling register().
5. Repeat steps 2-4 for as many trials as needed.
Below we have first created an instance of UtilityFunction. It specifies the acquisition function to use (UCB here) along with its exploration parameters kappa and xi.
from bayes_opt import UtilityFunction
utility = UtilityFunction(kind="ucb", kappa=2.5, xi=0.0)
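The kind argument selects the acquisition function. Besides "ucb" (Upper Confidence Bound, whose exploration is controlled by kappa), the library also supports "ei" (Expected Improvement) and "poi" (Probability of Improvement), both of which use xi as their exploration parameter. For example:

utility_ei = UtilityFunction(kind="ei", kappa=2.5, xi=0.01)   # Expected Improvement acquisition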
In this cell, we have called the suggest() method of the optimizer with the instance of UtilityFunction; it returns a single hyperparameters combination, which we have printed as well.
next_point_to_probe = optimizer.suggest(utility)
print("Next point to probe is:", next_point_to_probe)
Now, we have evaluated the objective function using the hyperparameters combination from the previous cell. We have also printed the result of the objective function for this setting.
target = objective(**next_point_to_probe)
print("Found the target value to be:", target)
At last, we have registered the hyperparameters combination which we tried and the result of the objective function by calling the register() method on the optimizer. This helps the optimizer decide which settings to suggest next.
optimizer.register(
    params=next_point_to_probe,
    target=target,
)
Here, we have put all the steps explained above into a loop. The loop tries 5 different combinations of hyperparameters on the objective function and registers their results.
for _ in range(5):
    next_point = optimizer.suggest(utility)
    target = objective(**next_point)
    optimizer.register(params=next_point, target=target)

    print("Hyperparameters Setting : {}".format(next_point))
    print("Objective Value : {}\n".format(target))
In this section, we have retrieved the best results and hyperparameters setting from max property of the optimizer instance and printed them.
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer.max["target"]))
In this section, we have created an instance of logistic regression using the best hyperparameters setting that we got through our optimization process. We have then trained the model on train data and evaluated accuracy on both train/test sets.
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)
classifier.fit(X_train, Y_train)
print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
Until now, in all our examples, the hyperparameter combinations were suggested by the Bayesian optimizer to maximize the objective function. We just gave it a range for each hyperparameter and it tried different values in those ranges.
But there are situations where we already know which hyperparameter values are likely to give good results.
For those situations, the bayes_opt library lets us suggest a hyperparameters combination ourselves. The BayesianOptimization instance provides a method named probe() for this purpose. This is referred to as guided optimization, as we are manually guiding the process towards which hyperparameter combinations to try.
We'll be using the wine dataset from scikit-learn to explain guided optimization in this section. The code in this section reuses much of the code from the classification section hence there won't be a detailed explanation here. Please check the classification section above if you have come to this section directly but want to understand the code in detail.
In this section, we have loaded the wine dataset from scikit-learn and divided it into train/test sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have defined an objective function that we'll be using for our classification problem. We have reused the function from the classification section.
from sklearn.linear_model import LogisticRegression
fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]
def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)
    classifier.fit(X_train, Y_train)
    return classifier.score(X_test, Y_test)
In this section, we have defined the hyperparameters search space for our problem.
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}
In order to maximize objective function using guided optimization, we first need to create an optimizer as usual. Below we have created our optimizer BayesianOptimization using objective function and hyperparameters search space.
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123,
    verbose=2
)
We can suggest a hyperparameters combination using the probe() method of the optimizer. We need to pass the combination to the params parameter of the method.
Below we have suggested one combination of hyperparameters (as a python dictionary) using the probe() method. We can also pass a combination as a plain python list if we know the order of the hyperparameters.
optimizer.probe(
    params={"C": 0.5, "fit_intercept": 0.7, "solver": 0.3, "penalty": 0.2},
    lazy=True,
)
We can retrieve the order of hyperparameters using space.keys attribute of optimizer instance.
print(optimizer.space.keys)
Below, we have suggested another three combinations using the probe() method: two as python lists and one as a python dictionary.
optimizer.probe(
    params=[0.5, 0.3, 0.7, 0.6],
    lazy=True,
)

optimizer.probe(
    params=[0.5, 0.7, 0.7, 0.6],
    lazy=True,
)

optimizer.probe(
    params={"C": 3.63, "fit_intercept": 0.7, "solver": 0.3, "penalty": 0.2},
    lazy=True,
)
At last, we need to call maximize() method of the optimizer to try combinations manually suggested by us. In order to try only our suggested combinations, we need to set init_points and n_iter parameters to 0.
If we provide non-zero values for these parameters instead, maximize() will first try all the combinations we suggested through probe() calls, and then try further combinations based on the init_points and n_iter values.
optimizer.maximize(init_points=0, n_iter=0)
In this section, we have printed the results of our optimization process as usual.
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer.max["target"]))
In this section, we have created an instance of logistic regression using the best parameters that we got using the bayesian optimization process. We have then trained the model on train data and evaluated the accuracy on train and test datasets.
classifier = LogisticRegression(C=C,
                                fit_intercept=fit_intercept,
                                solver=solver,
                                penalty=penalty,
                                max_iter=1000,
                                random_state=123)
classifier.fit(X_train, Y_train)
print("Train Accuracy : {:.2f}".format(classifier.score(X_train, Y_train)))
print("Test Accuracy : {:.2f}".format(classifier.score(X_test, Y_test)))
There can be situations where we need to log the results of our optimization process so that we can resume it later from where we left off. The bayes_opt library provides this functionality. It lets us define a JSON logger which logs details about each optimization step to a JSON file. We can then reload the optimizer history from this file so it has information about all the trials performed previously, and the optimization process resumes taking all of these previous steps into account.
We'll be using the wine dataset available from scikit-learn in this section to explain how we can save optimization results and reload them to resume the optimization process from where we left off last time. We'll be reusing much of the code from our classification section, so some parts of the code are not described in detail here. Please feel free to check the classification section if you want to understand those parts better.
In this section, we have loaded the wine dataset from scikit-learn and divided it into train/test sets.
from sklearn import datasets
from sklearn.model_selection import train_test_split
X, Y = datasets.load_wine(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, stratify=Y, random_state=123)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
In this section, we have defined the objective function that we'll use in this example. We have reused the objective function from the classification section again.
from sklearn.linear_model import LogisticRegression
fi_range = [True, False]
solvers = ["newton-cg", "lbfgs"]
penalties = ["l2", "none"]
def objective(C, fit_intercept, solver, penalty):
    classifier = LogisticRegression(C=C,
                                    fit_intercept=fi_range[1 if fit_intercept > 0.5 else 0],
                                    solver=solvers[1 if solver > 0.5 else 0],
                                    penalty=penalties[1 if penalty > 0.5 else 0],
                                    max_iter=1000,
                                    random_state=123)
    classifier.fit(X_train, Y_train)
    return classifier.score(X_test, Y_test)
In this section, we have defined the hyperparameters search space over which to search for values.
search_space = {
    "C": (0.5, 5),
    "fit_intercept": (0, 1),
    "solver": (0, 1),
    "penalty": (0, 1)
}
In this section, we have defined our optimizer BayesianOptimization using the objective function and the hyperparameters search space.
optimizer = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123,
    verbose=2
)
In order to log information about the optimization process, we need to create an instance of JSONLogger and subscribe it to the optimizer so that optimization events are sent to the logger.
Below we have first created an instance of JSONLogger with the file name classifier_opt.json. This is the file to which logging information about each optimization step will be stored.
We have then subscribed this logger to the optimization process by calling the subscribe() method of the optimizer. We need to give two values to the subscribe() method: the event to listen for and the subscriber (our logger instance).
There are three types of events available from bayes_opt: Events.OPTIMIZATION_START, Events.OPTIMIZATION_STEP, and Events.OPTIMIZATION_END. We have subscribed to OPTIMIZATION_STEP so that every trial gets logged.
from bayes_opt.logger import JSONLogger
from bayes_opt.event import Events
logger = JSONLogger(path="./classifier_opt.json")
optimizer.subscribe(Events.OPTIMIZATION_STEP, logger)
Now we have called maximize() method on the optimizer asking it to run the optimization process for 7 trials (2 random and 5 normal).
optimizer.maximize(
    init_points=2,
    n_iter=5,
)
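If you are curious about what gets saved, each line of classifier_opt.json is a self-contained JSON record for one trial; the exact keys can differ a bit between library versions, but it typically stores the target value, the parameter values, and a timestamp. A small sketch to peek at the file:

import json

# Look at the first couple of logged trials (assumes one JSON record per line,
# as written by JSONLogger)
with open("classifier_opt.json") as f:
    for line in f.readlines()[:2]:
        record = json.loads(line)
        print(record["target"], record["params"])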
Below we have printed the results of the optimization process which we performed in the previous step.
C = optimizer.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer.max["target"]))
In this section, we have created another optimizer using our objective function and hyperparameters search space. We'll be loading optimization steps logged from our main optimizer into this optimizer.
optimizer2 = bayes_opt.BayesianOptimization(
    f=objective,
    pbounds=search_space,
    random_state=123
)
We can load an optimizer with steps from a log file using the load_logs() function available from the bayes_opt.util module. We need to give it an optimizer instance and a list of log files from which to load logs.
Below we have loaded our second optimizer using the logs of our first optimizer. We have not called maximize() even once on the second optimizer, hence it does not have any history of its own. After loading, we have also printed its history to check whether it properly loaded all steps of the first optimizer. We can notice that it has loaded all 7 trials which we had tried in our first optimizer.
from bayes_opt.util import load_logs
load_logs(optimizer2, logs=["./classifier_opt.json"]);
print("Loaded optimizer is now aware of {} points.".format(len(optimizer2.space)))
optimizer2.res
In this section, we have printed the best results using the second optimizer to compare them with the first optimizer. We can notice that the results are the same as those of the first optimizer.
C = optimizer2.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer2.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer2.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer2.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer2.max["target"]))
In this section, we have called maximize() method on our second optimizer to let it try more trials to check whether it can further improve results or not.
optimizer2.maximize()
At last, we have printed the best results again to check whether the extra trials performed on the second optimizer improved results any further. We can notice that the results are the same as before, hence the last call to maximize() on the second optimizer was not able to improve results further. We can end the optimization process if we are satisfied with the results, or change the hyperparameter ranges and try again.
C = optimizer2.max["params"]["C"]
fit_intercept = fi_range[1 if optimizer2.max["params"]["fit_intercept"] > 0.5 else 0]
solver = solvers[1 if optimizer2.max["params"]["solver"] > 0.5 else 0]
penalty = penalties[1 if optimizer2.max["params"]["penalty"] > 0.5 else 0]
print("Best Parameter Setting : {}".format({"C": C, "fit_intercept": fit_intercept, "solver": solver, "penalty":penalty}))
print("Best Accuracy : {:.2f}".format(optimizer2.max["target"]))
If you are more comfortable learning through video tutorials then we would recommend that you subscribe to our YouTube channel.
When going through coding examples, it's quite common to have doubts and errors.
If you have doubts about some code examples or are stuck somewhere when trying our code, send us an email at coderzcolumn07@gmail.com. We'll help you or point you in the direction where you can find a solution to your problem.
You can even send us a mail if you are trying something new and need guidance regarding coding. We'll try to respond as soon as possible.