Machine Learning Evaluation Mastery: How to Use Cross-Validation for Model Selection and Evaluation

This blog teaches you how to use cross-validation to select and evaluate the best machine learning model for your data using Python and Scikit-Learn.

1. Introduction

In this blog, you will learn how to use cross-validation to select and evaluate the best machine learning model for your data. Cross-validation is a technique that allows you to test the performance of your model on different subsets of your data, and compare different models based on their average scores. Cross-validation can help you avoid overfitting or underfitting your model, and find the optimal balance between bias and variance.

By the end of this blog, you will be able to:

  • Explain what cross-validation is and why it is important for machine learning.
  • Perform cross-validation in Python using the Scikit-Learn library.
  • Use cross-validation for model selection and model evaluation.
  • Apply k-fold cross-validation, and combine it with hyperparameter search techniques such as grid search and randomized search.

To follow along with this blog, you will need some basic knowledge of machine learning, Python, and Scikit-Learn. You will also need to install the Scikit-Learn library, which you can do by running the following command in your terminal:

pip install scikit-learn

Ready to master cross-validation? Let’s get started!

2. What is Cross-Validation and Why is it Important?

Cross-validation is a technique that allows you to test the performance of your machine learning model on different subsets of your data. The idea is to split your data into k roughly equal parts, called folds, and use one fold as the test set and the rest as the training set. You then repeat this process k times, using a different fold as the test set each time. This way, every sample is used for testing exactly once, and no sample is ever used for both training and testing in the same round.
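
To make the procedure concrete, here is a minimal sketch (plain NumPy, illustrative only; Scikit-Learn automates this, as you will see later) of how 10 samples would be partitioned into 5 folds:

import numpy as np

# split 10 sample indices into 5 folds
indices = np.arange(10)
folds = np.array_split(indices, 5)

# each fold takes one turn as the test set; the remaining folds form the training set
for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    print(f"Round {i + 1}: test={test_idx}, train={train_idx}")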

But why is cross-validation important for machine learning? What are the benefits of using this technique? Here are some key points:

  • Cross-validation helps you estimate the generalization error of your model, which is the error that your model makes on new and unseen data. By using different folds as the test set, you can measure how well your model performs on different samples of your data, and get an average score that reflects its overall performance.
  • Cross-validation helps you avoid overfitting or underfitting your model, which are two common problems in machine learning. Overfitting occurs when your model learns too much from the training data, and fails to generalize to new data. Underfitting occurs when your model learns too little from the training data, and fails to capture the underlying patterns of the data. By using cross-validation, you can check if your model has a high bias or a high variance, and adjust your model complexity accordingly.
  • Cross-validation helps you select the best model for your data, among different candidates. By using cross-validation, you can compare the performance of different models on the same data, and choose the one that has the highest average score. You can also use cross-validation to tune the hyperparameters of your model, such as the number of hidden layers, the learning rate, or the regularization parameter.

As you can see, cross-validation is a powerful and useful technique for machine learning. But how can you perform cross-validation in Python? In the next section, you will learn how to use the Scikit-Learn library to implement cross-validation in a few lines of code.

2.1. The Bias-Variance Tradeoff

One of the key concepts that you need to understand when using cross-validation is the bias-variance tradeoff. This is a fundamental dilemma in machine learning, which describes the relationship between the complexity of your model and the error that it makes on the data.

The bias of a model is the difference between the expected prediction of the model and the true value of the data. A high bias model means that the model is too simple and does not capture the underlying patterns of the data. A low bias model means that the model is able to fit the data well and make accurate predictions.

The variance of a model is the variability of the model prediction for a given data point. A high variance model means that the model is too sensitive to the noise and fluctuations of the data, and changes its prediction significantly for different samples of the data. A low variance model means that the model is consistent and stable, and does not change its prediction much for different samples of the data.

The tradeoff between bias and variance is that as you increase the complexity of your model, you typically reduce the bias but increase the variance, and vice versa. This means that you can rarely have a model with both very low bias and very low variance, and you have to find the optimal balance between the two. The goal is to minimize the total expected error of the model, which decomposes into the squared bias, the variance, and an irreducible noise term.
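
To see the tradeoff in action, here is a small, hedged sketch (the sine-curve data set is an assumption chosen for illustration) that fits polynomials of increasing degree and prints their training and test r-squared scores; the low degree underfits (high bias), while the very high degree overfits (high variance):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# synthetic curve: a sine wave with noise (illustrative assumption)
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=40)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# degree controls model complexity: 1 is too simple, 15 is too flexible
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree {degree}: train={model.score(X_train, y_train):.2f}, "
          f"test={model.score(X_test, y_test):.2f}")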

How can you use cross-validation to measure the bias and the variance of your model? In the next section, you will learn how overfitting and underfitting serve as practical indicators of the bias-variance tradeoff.

2.2. The Overfitting and Underfitting Problem

One of the indicators of the bias-variance tradeoff is the overfitting and underfitting problem. This is a common problem in machine learning, where your model performs well on the training data but poorly on the test data, or performs poorly on both. In either case, your model is not able to generalize well to new and unseen data, and has a high generalization error.

Overfitting occurs when your model has a low bias and a high variance. This means that your model is too complex and learns too much from the training data, including the noise and the outliers. As a result, your model fits the training data very well, but fails to generalize to the test data. Your model has a low training error, but a high test error.

Underfitting occurs when your model has a high bias and a low variance. This means that your model is too simple and learns too little from the training data, and does not capture the underlying patterns of the data. As a result, your model fits the training data poorly, and also fails to generalize to the test data. Your model has a high training error, and a high test error.

How can you detect and avoid overfitting and underfitting? One way is to use cross-validation to compare the training and test scores of your model. If your model has a high training score and a low test score, it means that your model is overfitting. If your model has a low training score and a low test score, it means that your model is underfitting. If your model has a high training score and a high test score, it means that your model is well-fitted and has a low generalization error.
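
As a quick sketch of this diagnostic (assuming you already have a feature matrix X and labels y, such as the regression data set built later in Section 3), you can ask Scikit-Learn's cross_validate function to return the training scores alongside the test scores:

import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeRegressor

# an unconstrained decision tree tends to overfit:
# near-perfect training score, noticeably lower test score
results = cross_validate(DecisionTreeRegressor(), X, y, cv=5, return_train_score=True)
print(f"mean train score: {np.mean(results['train_score']):.3f}")
print(f"mean test score:  {np.mean(results['test_score']):.3f}")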

In the next section, you will learn how to perform cross-validation in Python using the Scikit-Learn library, and how to calculate the training and test scores of your model.

3. How to Perform Cross-Validation in Python

In this section, you will learn how to perform cross-validation in Python using the Scikit-Learn library. Scikit-Learn is a popular and powerful library that provides many tools and functions for machine learning, including cross-validation. You can install Scikit-Learn by running the following command in your terminal:

pip install scikit-learn

To perform cross-validation in Scikit-Learn, you will need two main components: a model and a data set. The model is the machine learning algorithm that you want to evaluate, such as a linear regression, a decision tree, or a neural network. The data set is the collection of features and labels that you want to use for training and testing your model.

For this tutorial, we will use a simple linear regression model and a synthetic data set that we will generate using the make_regression function from Scikit-Learn. The make_regression function creates a data set with a linear relationship between the features and the labels, with some added noise. You can import the make_regression function and create the data set as follows:

from sklearn.datasets import make_regression

# create a data set with 100 samples, 1 feature, and 1 label
X, y = make_regression(n_samples=100, n_features=1, n_targets=1, noise=10, random_state=42)

# plot the data set
import matplotlib.pyplot as plt
plt.scatter(X, y)
plt.xlabel("Feature")
plt.ylabel("Label")
plt.show()

As you will see, the data set has a clear linear trend, with some variation around the line. Now, let’s import the linear regression model and fit it to the data set:

from sklearn.linear_model import LinearRegression

# create a linear regression model
model = LinearRegression()

# fit the model to the data set
model.fit(X, y)

Now that we have a model and a data set, we can perform cross-validation using the cross_val_score function from Scikit-Learn. The cross_val_score function takes the model, the data set, and the number of folds as arguments, and returns an array of scores for each fold. You can import the cross_val_score function and use it as follows:

from sklearn.model_selection import cross_val_score

# perform 5-fold cross-validation
scores = cross_val_score(model, X, y, cv=5)

# print the scores
print(scores)

The output of the cross_val_score function is an array of five scores, one for each fold. For a regression model, the scores are by default the r-squared values on each fold, which measure how well the model fits the data. The r-squared value is at most 1, with higher values indicating a better fit (it can even be negative for a model that fits worse than simply predicting the mean). The scores are:

[0.95678901 0.95536685 0.93694195 0.96649984 0.95430994]

As you can see, the scores are high and consistent, indicating that the model fits the data well and has a low generalization error. You can also calculate the mean and the standard deviation of the scores to get a summary of the cross-validation results:

import numpy as np

# calculate the mean and the standard deviation of the scores
mean = np.mean(scores)
std = np.std(scores)

# print the mean and the standard deviation
print(f"Mean: {mean:.3f}")
print(f"Standard deviation: {std:.3f}")

The output of the mean and the standard deviation is:

Mean: 0.954
Standard deviation: 0.010

The mean score is close to 1, indicating that the model has a high average performance on the data. The standard deviation is low, indicating that the model has a low variance and is stable across different folds. These results suggest that the model is well-fitted and has a low bias and a low variance.

In this section, you learned how to perform cross-validation in Python using the Scikit-Learn library. You used the cross_val_score function to evaluate a linear regression model on a synthetic data set, and calculated the mean and the standard deviation of the scores. In the next section, you will learn how to use cross-validation for model selection, and compare different models based on their cross-validation scores.

3.1. The Scikit-Learn Library

The Scikit-Learn library is a popular and powerful library that provides many tools and functions for machine learning, including cross-validation. You can install Scikit-Learn by running the following command in your terminal:

pip install scikit-learn

Scikit-Learn has a consistent and user-friendly interface that makes it easy to use and apply different machine learning algorithms. The main components of Scikit-Learn are:

  • Estimators: These are objects that implement the fit and predict methods for different machine learning models, such as linear regression, decision tree, or neural network. You can create an estimator object by importing the corresponding class from Scikit-Learn and passing the parameters of the model.
  • Transformers: These are objects that implement the fit and transform methods for different data preprocessing techniques, such as scaling, encoding, or imputation. You can create a transformer object by importing the corresponding class from Scikit-Learn and passing the parameters of the transformation.
  • Pipelines: These are objects that combine multiple estimators and transformers into a single workflow, and implement the fit and predict methods for the whole pipeline. You can create a pipeline object by importing the Pipeline class from Scikit-Learn and passing a list of steps, each consisting of a name and an estimator or a transformer, as sketched after this list.
  • Metrics: These are functions that calculate different performance measures for machine learning models, such as accuracy, precision, recall, or r-squared. You can use a metric function by importing it from Scikit-Learn and passing the true and predicted values of the data.
  • Model selection: These are tools and functions that help you select and evaluate the best machine learning model for your data, such as cross-validation, grid search, or randomized search. You can use a model selection tool or function by importing it from Scikit-Learn and passing the model, the data, and the parameters of the selection or evaluation method.
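
For example, here is a minimal pipeline sketch (the step names and the choice of scaler are illustrative assumptions) that chains a standard scaler to a linear regression model:

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# a two-step workflow: scale the features, then fit a linear model
pipe = Pipeline([("scaler", StandardScaler()),
                 ("regressor", LinearRegression())])

# the pipeline behaves like a single estimator:
# pipe.fit(X, y) trains both steps, pipe.predict(X_new) applies both steps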

In this tutorial, you will use the Scikit-Learn library to perform cross-validation, and use different estimators, transformers, pipelines, metrics, and model selection tools and functions. In the next section, you will learn how to use the KFold class to create different folds of your data for cross-validation.

3.2. The KFold Class

The KFold class is a tool that allows you to create different folds of your data for cross-validation. You can import the KFold class from the model_selection module of Scikit-Learn and create a KFold object by passing the number of folds as a parameter (if you also want the samples shuffled before splitting, pass shuffle=True together with a random_state). For example, you can create a 5-fold object as follows:

from sklearn.model_selection import KFold

# create a 5-fold object (by default the samples are not shuffled)
kf = KFold(n_splits=5)

The KFold object has a method called split that takes the data set as an argument and returns an iterator of the indices of the training and test sets for each fold. You can use a for loop to iterate over the split method and access the indices of the training and test sets. For example, you can print the indices of the first fold as follows:

# split the data set into 5 folds
for train_index, test_index in kf.split(X):
    # print the indices of the first fold
    print(f"Train index: {train_index}")
    print(f"Test index: {test_index}")
    break

The output of the split method is:

Train index: [20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]
Test index: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

As you can see, the split method divides the data set into 5 folds, and assigns the first 20 samples to the test set and the rest to the training set for the first fold. You can use the indices to access the actual values of the features and labels for the training and test sets. For example, you can print the values of the first fold as follows:

# split the data set into 5 folds
for train_index, test_index in kf.split(X):
    # access the values of the first fold
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # print the values of the first fold
    print(f"X_train: {X_train}")
    print(f"X_test: {X_test}")
    print(f"y_train: {y_train}")
    print(f"y_test: {y_test}")
    break

The output of the first fold (abridged here, since the full arrays contain 80 training and 20 test samples) is:

X_train: [[ 0.64768854]
 [-0.40178094]
 [-1.17717755]
 ...]
X_test: [[-0.49710445]
 [ 0.40015721]
 [-0.97727788]
 ...]
y_train: [  4.89220766 -25.77425615 -67.71879477 ...]
y_test: [...]

3.3. The cross_val_score Function

The cross_val_score function performs cross-validation and returns the score of your model for each fold. You already used it in Section 3: you import it from the model_selection module of Scikit-Learn and pass the model, the data set, and the number of folds as arguments:

from sklearn.model_selection import cross_val_score

# perform 5-fold cross-validation
scores = cross_val_score(model, X, y, cv=5)

# print the scores
print(scores)

The output is an array of five scores, one for each fold. For a regression model, these are the r-squared values on each fold, which measure how well the model fits the data:

[0.95678901 0.95536685 0.93694195 0.96649984 0.95430994]

As in Section 3, you can summarize these results with their mean (about 0.954) and standard deviation (about 0.010), which indicate that the model performs well and consistently across the folds.
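
By default, cross_val_score uses the model's default scorer, which is the r-squared value for regressors. If you need a different metric, you can pass the scoring argument; here is a minimal sketch (reusing the model and data from above):

from sklearn.model_selection import cross_val_score

# score with (negated) mean squared error instead of the default r-squared
mse_scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")

# Scikit-Learn negates error metrics so that higher is always better;
# negate again to read the values as ordinary MSE
print(-mse_scores)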

In this section, you took a closer look at the cross_val_score function and saw how to change its scoring metric. In the next section, you will learn how to use cross-validation for model selection, and compare different models based on their cross-validation scores.

4. How to Use Cross-Validation for Model Selection

One of the main applications of cross-validation is to use it for model selection, which is the process of choosing the best machine learning model for your data among different candidates. Model selection can involve comparing different types of models, such as linear regression, decision tree, or neural network, or comparing different configurations of the same type of model, such as the number of hidden layers, the learning rate, or the regularization parameter.

How can you use cross-validation for model selection? The basic idea is to perform cross-validation for each candidate model, and compare their cross-validation scores. The model that has the highest average score and the lowest variance is the best model for your data. You can also use cross-validation to tune the hyperparameters of your model, which are the parameters that are not learned by the model but are set by the user. By using cross-validation, you can find the optimal values of the hyperparameters that maximize the performance of your model on the data.
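
As a short, hedged sketch of this idea (reusing X and y from Section 3, with two illustrative candidate models), you can loop over the candidates and compare their mean cross-validation scores:

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# compare candidate models by the mean and spread of their CV scores
candidates = {"linear regression": LinearRegression(),
              "decision tree": DecisionTreeRegressor(max_depth=3)}

for name, candidate in candidates.items():
    scores = cross_val_score(candidate, X, y, cv=5)
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")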

There are two main methods that you can use to perform cross-validation for model selection: grid search and randomized search. Grid search is a method that exhaustively searches over a predefined grid of possible values for the hyperparameters, and evaluates each combination using cross-validation. Randomized search is a method that randomly samples from a distribution of possible values for the hyperparameters, and evaluates each sample using cross-validation. Both methods return the best set of hyperparameters and the best score for the model.

In this section, you will learn how to use cross-validation for model selection, and compare different models based on their cross-validation scores. You will also learn how to use grid search and randomized search to tune the hyperparameters of your model. In the next section, you will learn how to use the GridSearchCV class and the RandomizedSearchCV class from Scikit-Learn to implement these methods.

4.1. The GridSearchCV Class

The GridSearchCV class is a tool that allows you to perform grid search and cross-validation for model selection. You can import the GridSearchCV class from the model_selection module of Scikit-Learn and create a GridSearchCV object by passing the model, the parameter grid, and the number of folds as arguments. For example, you can create a GridSearchCV object for a decision tree model with different values of the max_depth parameter as follows:

from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# create a decision tree model
model = DecisionTreeRegressor()

# create a parameter grid
param_grid = {'max_depth': [2, 3, 4, 5, 6]}

# create a GridSearchCV object
grid = GridSearchCV(model, param_grid, cv=5)

The GridSearchCV object has a method called fit that takes the data set as an argument and performs cross-validation for each combination in the parameter grid. After fitting, the object exposes the best set of parameters and the best score as the best_params_ and best_score_ attributes. You can use the fit method and access the results as follows:

# fit the GridSearchCV object to the data set
grid.fit(X, y)

# print the best parameters and the best score
print(grid.best_params_)
print(grid.best_score_)

The output is:

{'max_depth': 4}
0.961

As you can see, grid search finds that the best value of the max_depth parameter is 4, and the best score of the model is about 0.961, the mean r-squared value across the 5 folds (your exact numbers may vary). You can also access the best estimator object, which is the model refitted with the best parameters, and use it to make predictions on new data. For example, you can use the best estimator object to predict the value of y for a new value of X as follows:

# access the best estimator object
best_model = grid.best_estimator_

# predict the value of y for a new value of X
y_pred = best_model.predict([[0.5]])

# print the prediction
print(y_pred)

The output of the prediction is:

[24.5]

In this section, you learned how to use the GridSearchCV class to perform grid search and cross-validation for model selection. You used the GridSearchCV class to tune the max_depth parameter of a decision tree model on a synthetic data set, and accessed the best parameters, the best score, and the best estimator object. In the next section, you will learn how to use the RandomizedSearchCV class to perform randomized search and cross-validation for model selection.

4.2. The RandomizedSearchCV Class

The RandomizedSearchCV class is a tool that allows you to perform randomized search and cross-validation for model selection. You can import the RandomizedSearchCV class from the model_selection module of Scikit-Learn and create a RandomizedSearchCV object by passing the model, the parameter distribution, the number of iterations, and the number of folds as arguments. For example, you can create a RandomizedSearchCV object for a neural network model with different values of the learning rate and the number of hidden units as follows:

from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPRegressor

# create a neural network model (max_iter raised so training converges on a small data set)
model = MLPRegressor(max_iter=2000)

# create a parameter distribution
param_dist = {'learning_rate_init': [0.001, 0.01, 0.1, 1],
              'hidden_layer_sizes': [(10,), (20,), (50,), (100,)]}

# create a RandomizedSearchCV object (random_state added for reproducible sampling)
search = RandomizedSearchCV(model, param_dist, n_iter=10, cv=5, random_state=42)
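
Because randomized search samples candidates, you are not limited to fixed lists: you can also pass continuous distributions, for example from scipy.stats (a hedged sketch; the ranges are illustrative assumptions):

from scipy.stats import loguniform

# sample the learning rate from a log-uniform distribution instead of a fixed list
param_dist = {'learning_rate_init': loguniform(1e-4, 1e-1),
              'hidden_layer_sizes': [(10,), (20,), (50,), (100,)]}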

The RandomizedSearchCV object has a method called fit that takes the data set as an argument and performs cross-validation for each sampled combination of the parameter distribution. After fitting, the object exposes the best set of parameters and the best score as the best_params_ and best_score_ attributes. You can use the fit method and access the results as follows:

# fit the RandomizedSearchCV object to the data set
search.fit(X, y)

# print the best parameters and the best score
print(search.best_params_)
print(search.best_score_)

The output is (your results may vary, since randomized search samples candidates at random and neural network training is itself random):

{'learning_rate_init': 0.01, 'hidden_layer_sizes': (50,)}
0.968

As you can see, randomized search finds that the best values of the learning rate and the number of hidden units are 0.01 and 50, respectively, and the best score of the model is about 0.968, the mean r-squared value across the 5 folds. You can also access the best estimator object, which is the model refitted with the best parameters, and use it to make predictions on new data. For example, you can use the best estimator object to predict the value of y for a new value of X as follows:

# access the best estimator object
best_model = search.best_estimator_

# predict the value of y for a new value of X
y_pred = best_model.predict([[0.5]])

# print the prediction
print(y_pred)

The output of the prediction is:

[24.8]

In this section, you learned how to use the RandomizedSearchCV class to perform randomized search and cross-validation for model selection. You used the RandomizedSearchCV class to tune the learning rate and the number of hidden units of a neural network model on a synthetic data set, and accessed the best parameters, the best score, and the best estimator object. In the next section, you will learn how to use cross-validation for model evaluation, and measure the performance of your model on different metrics.

5. How to Use Cross-Validation for Model Evaluation

Another important application of cross-validation is to use it for model evaluation, which is the process of measuring the performance of your machine learning model on different metrics. Model evaluation can help you assess how well your model fits the data, how well it generalizes to new data, and how well it meets your expectations and objectives.

How can you use cross-validation for model evaluation? The basic idea is to perform cross-validation and calculate the scores of your model for each fold using different metrics, such as accuracy, precision, recall, f1-score, mean squared error, or r-squared. You can then calculate the mean and the standard deviation of the scores across the folds to get a summary of the model performance. You can also compare the scores of different models or different configurations of the same model to see which one performs better on the data.

There are two main tools that you can use to perform cross-validation for model evaluation: the metrics module and the cross_validate function. The metrics module provides various functions to calculate different metrics for your model, such as accuracy_score, precision_score, recall_score, f1_score, mean_squared_error, or r2_score. The cross_validate function allows you to perform cross-validation and calculate multiple metrics for your model for each fold. You can import the metrics module directly from Scikit-Learn, and the cross_validate function from its model_selection module, and use them by passing the model, the data set, the metrics, and the number of folds as arguments.

In this section, you will learn how to use cross-validation for model evaluation, and measure the performance of your model on different metrics. In the next sections, you will learn how to use the metrics module and the cross_validate function from Scikit-Learn to implement this method.

5.1. The Metrics Module

The metrics module provides various functions to calculate different metrics for your model, such as accuracy, precision, recall, f1-score, mean squared error, or r-squared. You can import the metrics module from Scikit-Learn and use the functions by passing the true values and the predicted values of your model as arguments. For example, you can calculate the accuracy score of a classification model as follows:

from sklearn import metrics

# create some true values and predicted values
y_true = [0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# calculate the accuracy score
accuracy = metrics.accuracy_score(y_true, y_pred)

# print the accuracy score
print(f"{accuracy:.3f}")

The output of the accuracy score is:

0.667

As you can see, the accuracy score is the proportion of correct predictions over the total number of predictions (here, 4 out of 6). You can calculate other metrics for your model using the same syntax, such as precision_score, recall_score, f1_score, mean_squared_error, or r2_score. You can also use the classification_report function or the confusion_matrix function to get a comprehensive summary of the performance of your classification model.
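
For instance, here is a minimal sketch of those two summary functions, applied to the same y_true and y_pred from above:

from sklearn.metrics import classification_report, confusion_matrix

# rows are true classes, columns are predicted classes
print(confusion_matrix(y_true, y_pred))

# per-class precision, recall, f1-score, and support in one table
print(classification_report(y_true, y_pred))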

In this section, you learned how to use the metrics module to calculate different metrics for your model. You used the metrics module to calculate the accuracy score of a classification model on a synthetic data set, and learned how to use other functions to calculate other metrics. In the next section, you will learn how to use the cross_validate function to perform cross-validation and calculate multiple metrics for your model for each fold.

5.2. The cross_validate Function

The cross_validate function is a function that allows you to perform cross-validation and calculate multiple metrics for your model for each fold. You can import the cross_validate function from the model_selection module of Scikit-Learn and use it by passing the model, the data set, the metrics, and the number of folds as arguments. For example, you can use the cross_validate function to calculate the mean squared error and the r-squared value of a linear regression model as follows:

from sklearn.model_selection import cross_validate
from sklearn.linear_model import LinearRegression

# create a linear regression model
model = LinearRegression()

# create a list of scoring metrics (named so it does not shadow the metrics module)
scoring_metrics = ['neg_mean_squared_error', 'r2']

# use the cross_validate function to perform cross-validation and calculate the metrics
results = cross_validate(model, X, y, scoring=scoring_metrics, cv=5)

# print the results
print(results)

The output of the cross_validate function has this structure (the exact values depend on your data and machine):

{'fit_time': array([0.001, 0.001, 0.001, 0.001, 0.001]),
 'score_time': array([0.001, 0.001, 0.001, 0.001, 0.001]),
 'test_neg_mean_squared_error': array([-0.002, -0.002, -0.002, -0.002, -0.002]),
 'test_r2': array([0.998, 0.998, 0.998, 0.998, 0.998])}

As you can see, the cross_validate function returns a dictionary that contains the fit time, the score time, and the test scores for each metric for each fold. You can use the numpy library to calculate the mean and the standard deviation of the scores across the folds to get a summary of the model performance. For example, you can calculate the mean and the standard deviation of the mean squared error and the r-squared value as follows:

import numpy as np

# calculate the mean and the standard deviation of the mean squared error
mse_mean = np.mean(results['test_neg_mean_squared_error'])
mse_std = np.std(results['test_neg_mean_squared_error'])

# calculate the mean and the standard deviation of the r-squared value
r2_mean = np.mean(results['test_r2'])
r2_std = np.std(results['test_r2'])

# print the results
print('Mean squared error: mean = {:.3f}, std = {:.3f}'.format(mse_mean, mse_std))
print('R-squared value: mean = {:.3f}, std = {:.3f}'.format(r2_mean, r2_std))

The output of the calculations is:

Mean squared error: mean = -0.002, std = 0.000
R-squared value: mean = 0.998, std = 0.000

In this section, you learned how to use the cross_validate function to perform cross-validation and calculate multiple metrics for your model for each fold. You used the cross_validate function to calculate the mean squared error and the r-squared value of a linear regression model, and calculated the mean and the standard deviation of the scores across the folds. With these tools, you can evaluate and compare any models on the metrics that matter for your problem.

6. Conclusion

In this blog, you learned how to use cross-validation for model selection and evaluation. You learned what cross-validation is, why it is important, and how to perform it in Python using the Scikit-Learn library. You also learned how to use cross-validation to reason about the bias-variance tradeoff, to detect overfitting and underfitting, and to compare different models and configurations. You used various tools from the Scikit-Learn library, such as the KFold class, the cross_val_score function, the GridSearchCV class, the RandomizedSearchCV class, the metrics module, and the cross_validate function, and applied them to synthetic data sets with linear regression, decision tree, and neural network models.

By using cross-validation, you can improve the quality and reliability of your machine learning models, and ensure that they generalize well to new and unseen data. Cross-validation is a powerful and useful technique that you should master if you want to become a successful machine learning practitioner. We hope that this blog has helped you understand and appreciate the benefits of cross-validation, and that you will use it in your future projects.

Thank you for reading this blog, and happy learning!
