Machine Learning Evaluation Mastery: How to Use Mean Absolute Error and Mean Absolute Percentage Error for Regression Problems

Learn how to measure and interpret mean absolute error and mean absolute percentage error for regression problems.

1. Introduction

Machine learning is a powerful tool for solving complex problems, but it also requires careful evaluation of the results. How do you know if your machine learning model is performing well on a regression problem? How do you measure the accuracy and error of your predictions?

There are many metrics that can be used to evaluate regression models, such as root mean squared error, coefficient of determination, and mean squared logarithmic error. However, some of these metrics have limitations or drawbacks that make them unsuitable for certain situations. For example, root mean squared error is sensitive to outliers and exaggerates the impact of large deviations. Coefficient of determination can be misleading when the target has little variance or when models are compared across different datasets. Mean squared logarithmic error can only be used for non-negative values and penalizes underestimation more than overestimation.

In this blog, you will learn how to use two metrics that can overcome some of these limitations: mean absolute error and mean absolute percentage error. These metrics are simple, intuitive, and easy to interpret. Mean absolute error reports the average error in the same units as the target, while mean absolute percentage error is unit-free, which makes it useful for comparing models across targets with different scales or units of measurement.

You will learn how to:

  • Define mean absolute error and mean absolute percentage error and understand their properties and assumptions.
  • Calculate mean absolute error and mean absolute percentage error in Python using scikit-learn and numpy.
  • Interpret mean absolute error and mean absolute percentage error and understand their advantages and disadvantages.
  • Compare mean absolute error and mean absolute percentage error with other metrics and choose the best one for your problem.
  • Improve mean absolute error and mean absolute percentage error by applying some techniques and best practices.

By the end of this blog, you will have a solid understanding of how to use mean absolute error and mean absolute percentage error for regression problems and how to improve your machine learning model performance.

Are you ready to master these metrics? Let’s get started!

2. What are Mean Absolute Error and Mean Absolute Percentage Error?

Before we dive into the calculation and interpretation of mean absolute error and mean absolute percentage error, let’s first understand what they are and how they work.

Mean absolute error (MAE) and mean absolute percentage error (MAPE) are two metrics that measure the average magnitude of the errors, or deviations, between the actual values and the predicted values in a regression problem. MAE is expressed in the same unit as the target, while MAPE expresses each error as a percentage of the actual value, which makes it independent of the scale or unit of measurement.

Mean absolute error is calculated by taking the average of the absolute values of the errors or residuals. A residual is the difference between an actual value and a predicted value. The formula for MAE is:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.

Mean absolute percentage error is calculated by taking the average of the absolute values of the percentage errors or relative errors. A percentage error is the ratio of the residual to the actual value, expressed as a percentage. The formula for MAPE is:

$$\text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where $y_i$, $\hat{y}_i$, and $n$ are the same as above.

Both MAE and MAPE are easy to understand and interpret, as they represent the average error or deviation in the same unit or percentage as the actual values. For example, if the MAE is 5 and the unit of measurement is dollars, it means that the average error is 5 dollars. If the MAPE is 10%, it means that the average error is 10% of the actual value.
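
To make these definitions concrete, here is a minimal pure-Python sketch that applies both formulas to a handful of made-up values (the numbers are purely illustrative):

# Made-up actual and predicted values, for illustration only
actual = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
n = len(actual)

# MAE: average absolute residual, in the same unit as the target
mae = sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / n

# MAPE: average absolute residual relative to the actual value, as a percentage
mape = 100 / n * sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))

print(mae)   # 10.0
print(mape)  # approximately 6.42

Section 3 shows the same calculations for the house-price example using scikit-learn and numpy.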

However, MAE and MAPE also have some assumptions and limitations that you need to be aware of. We will discuss them in more detail in section 4.

2.1. Mean Absolute Error (MAE)

Mean absolute error (MAE) is one of the simplest and most intuitive metrics for evaluating regression models. It measures the average magnitude of the errors, or deviations, between the actual values and the predicted values, and it is expressed in the same unit of measurement as the target, which makes it easy to relate to the quantity being predicted.

To calculate MAE, you need to take the absolute value of the difference between each actual value and its corresponding predicted value, and then take the average of these values. The formula for MAE is:

$$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.

For example, suppose you have a regression model that predicts the price of a house based on some features, such as the number of bedrooms, the size of the lot, and the location. You have five observations in your test set, and the actual and predicted prices are as follows:

Observation | Actual Price ($) | Predicted Price ($) | Absolute Error ($)
1           | 300,000          | 320,000             | 20,000
2           | 400,000          | 380,000             | 20,000
3           | 500,000          | 450,000             | 50,000
4           | 600,000          | 550,000             | 50,000
5           | 700,000          | 720,000             | 20,000

To calculate the MAE, you need to add up the absolute errors and divide by the number of observations. In this case, the MAE is:

$$\text{MAE} = \frac{1}{5} (20,000 + 20,000 + 50,000 + 50,000 + 20,000) = 32,000$$

This means that the average error of your model is $32,000. In other words, your model is off by $32,000 on average when predicting the price of a house.

MAE is easy to understand and interpret, as it represents the average error in the same unit as the actual values. However, MAE also has some limitations that you need to be aware of. We will discuss them in section 4.

2.2. Mean Absolute Percentage Error (MAPE)

Mean absolute percentage error (MAPE) is another scale-invariant metric for evaluating regression models. It measures the average magnitude of the percentage errors or relative errors between the actual values and the predicted values. It is useful for comparing models with different scales or units of measurement.

To calculate MAPE, you need to take the absolute value of the percentage error for each observation, and then take the average of these values. The formula for MAPE is:

$$\text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.

For example, suppose you have the same regression model and test set as in the previous section, where you predicted the price of a house based on some features. The actual and predicted prices are as follows:

Observation | Actual Price ($) | Predicted Price ($) | Percentage Error (%) | Absolute Percentage Error (%)
1           | 300,000          | 320,000             | -6.67                | 6.67
2           | 400,000          | 380,000             | 5.00                 | 5.00
3           | 500,000          | 450,000             | 10.00                | 10.00
4           | 600,000          | 550,000             | 8.33                 | 8.33
5           | 700,000          | 720,000             | -2.86                | 2.86

To calculate the MAPE, you need to add up the absolute percentage errors and divide by the number of observations. In this case, the MAPE is:

$$\text{MAPE} = \frac{1}{5} (6.67\% + 5.00\% + 10.00\% + 8.33\% + 2.86\%) \approx 6.57\%$$

This means that the average percentage error of your model is 6.57%. In other words, your model is off by 6.57% on average when predicting the price of a house.

MAPE is useful for comparing models with different scales or units of measurement, as it represents the average error as a percentage of the actual value. However, MAPE also has some limitations that you need to be aware of. We will discuss them in section 4.

3. How to Calculate Mean Absolute Error and Mean Absolute Percentage Error in Python?

Now that you know what mean absolute error and mean absolute percentage error are and how they work, let’s see how you can calculate them in Python. Python is a popular programming language for data science and machine learning, and it offers many libraries and tools that can help you with your tasks.

One of the most widely used libraries for machine learning in Python is scikit-learn, which provides a variety of functions and classes for data preprocessing, model building, evaluation, and more. Scikit-learn has a built-in function for calculating mean absolute error, which is sklearn.metrics.mean_absolute_error. This function takes two arguments: the actual values and the predicted values, and returns the MAE as a floating-point number.

For example, suppose you have the same regression model and test set as in the previous sections, where you predicted the price of a house based on some features. You have stored the actual and predicted prices in two numpy arrays, y_true and y_pred, as follows:

import numpy as np

y_true = np.array([300000, 400000, 500000, 600000, 700000])
y_pred = np.array([320000, 380000, 450000, 550000, 720000])

To calculate the MAE using scikit-learn, you need to import the function and pass the two arrays as arguments. The code and the output are as follows:

from sklearn.metrics import mean_absolute_error

mae = mean_absolute_error(y_true, y_pred)
print(mae)

The output is:

32000.0

This is the same result as we obtained manually in the previous section.

Older versions of scikit-learn do not include a function for mean absolute percentage error (versions 0.24 and later provide sklearn.metrics.mean_absolute_percentage_error, which returns the error as a fraction rather than a percentage), but you can also implement it easily using numpy. Numpy is another popular library for scientific computing in Python, and it offers many functions and methods for working with arrays and matrices. To calculate MAPE using numpy, you use the same formula as before:

$$\text{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.

You can use the same numpy arrays, y_true and y_pred, as before, and apply the formula using numpy functions and methods. The code and the output are as follows:

import numpy as np

mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
print(mape)

The output is:

6.571428571428571

This matches the 6.57% we obtained manually in the previous section, up to rounding.
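
As a side note, if you are using scikit-learn 0.24 or later, there is also a built-in helper for MAPE. Note that it returns the error as a fraction rather than a percentage, so multiply by 100 to match the formula above:

from sklearn.metrics import mean_absolute_percentage_error

# Returns a fraction (about 0.0657 here), not a percentage
mape_fraction = mean_absolute_percentage_error(y_true, y_pred)
print(mape_fraction * 100)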

As you can see, calculating mean absolute error and mean absolute percentage error in Python is quick and straightforward, thanks to powerful libraries such as scikit-learn and numpy. However, calculating these metrics is only the first step of evaluating your regression model. You also need to know how to interpret them and compare them with other metrics. We will discuss these topics in the next sections.

4. How to Interpret Mean Absolute Error and Mean Absolute Percentage Error?

Calculating mean absolute error and mean absolute percentage error is not enough to evaluate your regression model. You also need to know how to interpret these metrics and what they mean for your model performance. In this section, you will learn how to interpret MAE and MAPE and understand their advantages and disadvantages.

The first thing you need to know is that MAE and MAPE are both error metrics, meaning that lower values indicate better performance. This is different from some other metrics, such as coefficient of determination or accuracy, which are score metrics, meaning that higher values indicate better performance. You should always keep this in mind when comparing MAE and MAPE with other metrics.
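
One practical consequence of this distinction: scikit-learn's model-selection tools assume that higher scores are better, so error metrics such as MAE are used in negated form. A minimal sketch:

from sklearn.metrics import mean_absolute_error, make_scorer

# greater_is_better=False negates the metric, so that maximizing the score
# still corresponds to minimizing the MAE
mae_scorer = make_scorer(mean_absolute_error, greater_is_better=False)

# The built-in scoring string "neg_mean_absolute_error" behaves the same way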

The second thing you need to know is that MAE is expressed in the same unit as the target, while MAPE is unit-free because it expresses each error as a percentage of the actual value. This makes MAPE convenient for comparing models across targets with different scales or units of measurement, such as dollars, kilograms, or degrees, whereas an MAE value is only meaningful relative to the typical magnitude of the target. For example, an MAE of 10 might be acceptable when predicting house prices in thousands of dollars, but disastrous when predicting temperatures in degrees Celsius. Likewise, what counts as a good MAPE varies by domain: 5% may be excellent for one forecasting problem and unacceptable for another. Therefore, you should always consider the context and the domain of your problem when interpreting MAE and MAPE.

The third thing you need to know is that MAE and MAPE have different properties that affect their interpretation. MAE is relatively robust to outliers and large deviations, because it does not square the errors. However, MAE hides information about the distribution of the errors: it does not tell you their direction or their size relative to the actual values. For example, an MAE of 10 can come from consistently underestimating by 10, consistently overestimating by 10, or a mix of both; it can also come from a constant error of 10 or from errors that range from 0 to 20. Therefore, you should always inspect the distribution and the variance of the residuals alongside MAE.
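
A quick way to run that check is to look at the signed residuals directly. Here is a minimal sketch using the house-price arrays from section 3:

import numpy as np

y_true = np.array([300000, 400000, 500000, 600000, 700000])
y_pred = np.array([320000, 380000, 450000, 550000, 720000])

residuals = y_true - y_pred  # signed errors: positive means the model underestimated

print(np.mean(residuals))                    # average signed error (bias)
print(np.std(residuals))                     # spread of the errors
print(np.min(residuals), np.max(residuals))  # range of the errors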

MAPE, on the other hand, is very sensitive to small actual values, because each error is divided by the actual value. This can cause MAPE to become extremely large, or undefined, when an actual value is close to or equal to zero. However, MAPE can also be more intuitive, as it expresses each error relative to the actual value rather than in absolute terms. For example, a MAPE of 10% means that the model is off by 10% of the actual value on average, regardless of the unit of measurement, and it treats an error of 1 on an actual value of 10 the same as an error of 10 on an actual value of 100. Therefore, you should always check the scale and the range of the actual values, and in particular whether any of them are near zero, before relying on MAPE.
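
The near-zero problem is easy to demonstrate with a small made-up example (the numbers are purely illustrative):

import numpy as np

# One actual value close to zero dominates MAPE, even though its absolute error is tiny
y_true_demo = np.array([100.0, 200.0, 0.5])
y_pred_demo = np.array([110.0, 190.0, 1.0])

mae_demo = np.mean(np.abs(y_true_demo - y_pred_demo))
mape_demo = np.mean(np.abs((y_true_demo - y_pred_demo) / y_true_demo)) * 100

print(mae_demo)   # about 6.83, driven by the two large targets
print(mape_demo)  # about 38.3, driven by the near-zero target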

In summary, MAE and MAPE are both useful metrics for evaluating regression models, but each has limitations and assumptions that you need to be aware of. Always interpret them in the context and the domain of your problem, and compare them with other metrics to get a more complete picture of your model performance. The next section summarizes their advantages and disadvantages, and section 4.2 shows how to compare them with other metrics and choose the best one for your problem.

4.1. Advantages and Disadvantages of MAE and MAPE

Mean absolute error and mean absolute percentage error are two of the most commonly used metrics for evaluating regression models. They have some advantages and disadvantages that you need to be aware of when using them. In this section, you will learn about the pros and cons of MAE and MAPE and how they affect your model evaluation.

One of the main advantages of MAPE is that it is unit-free: because every error is expressed as a percentage of the actual value, it can be used to compare models that predict targets with different scales or units of measurement, such as dollars, kilograms, or degrees. For example, if you have one model that predicts the price of a house and another that predicts the weight of a car, you can compare their MAPE values directly, whereas their MAE values are in different units and cannot be compared meaningfully.

Another advantage of MAE and MAPE is that they are both easy to understand and interpret, as they represent the average error or deviation in the same unit or percentage as the actual values. For example, if the MAE is 5 and the unit of measurement is dollars, it means that the average error is 5 dollars. If the MAPE is 10%, it means that the average error is 10% of the actual value. This makes MAE and MAPE intuitive and meaningful for both the model developers and the end users.

However, MAE and MAPE also have some disadvantages that you need to be aware of. Neither metric can be judged in isolation: an MAE value is meaningless without knowing the typical magnitude and range of the target, and what counts as an acceptable MAPE varies from one domain to another. In addition, MAPE becomes unreliable when the actual values are very small or vary widely in magnitude. Therefore, you should always consider the context and the domain of your problem when using MAE and MAPE.

Another disadvantage, discussed in the previous section, is that each metric hides something. MAE is robust to outliers because it does not square the errors, but it says nothing about the direction of the errors or their size relative to the actual values: an MAE of 10 can come from consistent underestimation, consistent overestimation, or a mix of both, and from a constant error of 10 or from errors ranging from 0 to 20. MAPE, conversely, is intuitive because it reports each error relative to the actual value, but it becomes extremely large, or undefined, when actual values are close to or equal to zero. In practice, inspect the distribution and variance of the residuals when using MAE, and check the scale and range of the actual values when using MAPE.

In summary, MAE and MAPE are simple and interpretable, but neither tells the whole story on its own. In the next section, you will learn how to compare MAE and MAPE with other metrics and choose the best one for your problem.

4.2. How to Compare MAE and MAPE with Other Metrics?

MAE and MAPE are not the only metrics that can be used to evaluate regression models. There are many other metrics that have different properties and assumptions, and can provide different insights into your model performance. In this section, you will learn how to compare MAE and MAPE with other metrics and choose the best one for your problem.

Some of the most common metrics for evaluating regression models are:

  • Root mean squared error (RMSE): This metric measures the typical magnitude of the errors between the actual values and the predicted values; it squares the errors, averages them, and then takes the square root. This makes RMSE more sensitive to outliers and large deviations than MAE, because large errors are penalized disproportionately. The formula for RMSE is:
  • $$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

  • Coefficient of determination (R-squared): This metric measures how well the predicted values fit the actual values, by comparing the variation explained by the model to the total variation in the data. It is at most 1, with higher values indicating a better fit; it can be negative when the model performs worse than simply predicting the mean of the actual values. The formula for R-squared is:
  • $$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

  • Mean squared logarithmic error (MSLE): This metric measures the average of the squared differences between the logarithms of the actual and predicted values (using log(1 + x) so that zero values are allowed). This makes MSLE more suitable for problems where the relative error is more important than the absolute error, such as predicting growth rates or percentages. (A side-by-side computation of all of these metrics on the house-price example appears right after this list.) The formula for MSLE is:
  • $$\text{MSLE} = \frac{1}{n} \sum_{i=1}^{n} (\log{(1 + y_i)} - \log{(1 + \hat{y}_i)})^2$$
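
To see these metrics side by side, here is a short sketch that computes all of them for the house-price example from the earlier sections, using scikit-learn helpers where they exist:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, mean_squared_log_error, r2_score

y_true = np.array([300000, 400000, 500000, 600000, 700000])
y_pred = np.array([320000, 380000, 450000, 550000, 720000])

mae = mean_absolute_error(y_true, y_pred)
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
r2 = r2_score(y_true, y_pred)
msle = mean_squared_log_error(y_true, y_pred)

print(f"MAE:  {mae:.1f}")
print(f"MAPE: {mape:.2f}%")
print(f"RMSE: {rmse:.1f}")
print(f"R^2:  {r2:.3f}")
print(f"MSLE: {msle:.6f}")

Looking at several metrics together like this often reveals more than any single number: a large gap between RMSE and MAE, for example, suggests that a few unusually large errors are present.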

To compare MAE and MAPE with these metrics, you need to consider the following factors:

  • The scale and the range of the values: MAE is expressed in the unit of the target, so it is only comparable between models that predict the same quantity; MAPE is unit-free and can be compared across targets with different scales or units. RMSE and MSLE, like MAE, depend on the scale of the target (MSLE on its logarithmic scale), while R-squared is unit-free. Whatever metric you use, its value must be judged against the typical magnitude and range of the target, and MAPE in particular becomes unreliable when actual values are near zero. Therefore, always consider the context and the domain of your problem.
  • The distribution and the variance of the errors: MAE is robust to outliers because it does not square the errors, but it does not reveal the direction of the errors or how they are spread: an MAE of 10 can come from consistent under- or overestimation, or from errors ranging from 0 to 20. Squared-error metrics such as RMSE react strongly to a few large errors, so comparing MAE with RMSE on the same predictions is a quick way to detect whether large outlying errors are present.
  • The sensitivity and the penalization of the errors: MAE treats all errors linearly, so it is suitable when every unit of error costs roughly the same, such as predicting the price of a house or the weight of a car. However, it does not penalize large errors more than small ones: an MAE of 10 can hide a few very large errors among many accurate predictions. If occasional large errors are especially costly, a metric that penalizes them more heavily, such as RMSE, may be a better choice.
  • The interpretation and the intuition of the metrics: MAE and MAPE are easy to explain, because they report the average error in the unit of the target or as a percentage of the actual value, which makes them meaningful to both model developers and end users. Other metrics offer different insights: RMSE summarizes the typical error size while weighting large errors more heavily, R-squared reports the proportion of the variation in the target explained by the model, and MSLE focuses on relative (logarithmic) error. Choose the metric, or combination of metrics, whose insight matches the question you are trying to answer.

In summary, MAE and MAPE are both useful metrics for evaluating regression models, but they also have some limitations and assumptions that you need to be aware of. You should always compare MAE and MAPE with other metrics and choose the best one for your problem, depending on the scale and the range of the values, the distribution and the variance of the errors, the sensitivity and the penalization of the errors, and the interpretation and the intuition of the metrics. In the next section, you will learn how to improve MAE and MAPE by applying some techniques and best practices.

5. How to Improve Mean Absolute Error and Mean Absolute Percentage Error?

Now that you have learned how to calculate, interpret, and compare mean absolute error and mean absolute percentage error, you might be wondering how to improve them. After all, the goal of any machine learning model is to minimize the error and maximize the accuracy. In this section, you will learn some techniques and best practices that can help you improve MAE and MAPE for your regression problems.

One of the most important factors that affect MAE and MAPE is the quality of the data. If your data is noisy, incomplete, or inaccurate, your model will not be able to make accurate predictions. Therefore, you should always perform some data cleaning and preprocessing steps before feeding your data to your model. Some of the common steps are listed below, followed by a short preprocessing sketch:

  • Handling missing values: Missing values can cause errors or bias in your model, as they can affect the distribution and the variance of the data. You should always check for missing values in your data and handle them appropriately. Some of the common methods are deleting the rows or columns with missing values, imputing the missing values with the mean, median, or mode of the data, or using more advanced techniques such as k-nearest neighbors or regression imputation.
  • Handling outliers: Outliers are values that are far away from the rest of the data, and they can skew the results or cause high errors in your model. You should always check for outliers in your data and handle them appropriately. Some of the common methods are deleting the outliers, transforming the outliers using methods such as log, square root, or box-cox transformation, or using more robust techniques such as robust regression or robust scaling.
  • Handling categorical variables: Categorical variables are variables that have a finite number of discrete values, such as gender, color, or type. Most machine learning models cannot handle categorical variables directly, as they require numerical inputs. Therefore, you should always encode your categorical variables into numerical values before feeding them to your model. Some of the common methods are label encoding, one-hot encoding, or ordinal encoding.
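
Here is a minimal sketch of how the missing-value and categorical-encoding steps can be combined in a scikit-learn pipeline (outlier handling is usually done separately, before modeling). The column names and the tiny DataFrame are hypothetical, made up purely for illustration:

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy dataset with a missing value and a categorical column
df = pd.DataFrame({
    "bedrooms": [3, 4, np.nan, 5, 2, 3],
    "lot_size": [5000, 6500, 7000, 9000, 4000, 5200],
    "location": ["suburb", "city", "city", "suburb", "rural", "city"],
    "price": [300000, 400000, 500000, 600000, 250000, 420000],
})
X, y = df.drop(columns="price"), df["price"]

numeric_features = ["bedrooms", "lot_size"]
categorical_features = ["location"]

preprocess = ColumnTransformer([
    # Impute missing numeric values with the median, then scale
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_features),
    # One-hot encode the categorical column
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

model = Pipeline([("preprocess", preprocess), ("regressor", LinearRegression())])
model.fit(X, y)

# In-sample MAE, just to show the metric plugged into the workflow
print(mean_absolute_error(y, model.predict(X)))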

Another important factor that affects MAE and MAPE is the choice of the model. Different models have different assumptions, properties, and parameters, and they can perform differently on different problems. Therefore, you should always try different models and compare their performance on your problem. Some of the common models for regression problems are linear regression, ridge regression, lasso regression, decision tree regression, random forest regression, support vector regression, or neural network regression.
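
A minimal sketch of such a comparison, using synthetic data from make_regression purely for illustration, could look like this (remember from section 4 that scikit-learn reports MAE in negated form):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

for name, model in models.items():
    # scoring="neg_mean_absolute_error" returns negated MAE, so flip the sign back
    scores = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    print(f"{name}: mean MAE = {scores.mean():.2f}")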

A third important factor that affects MAE and MAPE is the tuning of the hyperparameters. Hyperparameters are the parameters that control the behavior and the complexity of the model, such as the learning rate, the regularization term, the number of trees, or the number of hidden layers. Different hyperparameters can have different effects on the model performance, and they can also interact with each other. Therefore, you should always try different values and combinations of the hyperparameters and find the optimal ones for your problem. Some of the common methods for hyperparameter tuning are grid search, random search, or Bayesian optimization.
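
For example, a small grid search over the regularization strength of a ridge regression, scored by negated MAE, might look like the following sketch (the parameter grid is illustrative, not a recommendation):

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = GridSearchCV(
    Ridge(),
    param_grid,
    scoring="neg_mean_absolute_error",  # higher (less negative) is better
    cv=5,
)
search.fit(X, y)

print(search.best_params_)
print(-search.best_score_)  # best cross-validated MAE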

In summary, MAE and MAPE can be improved by applying some techniques and best practices that can enhance the quality of the data, the choice of the model, and the tuning of the hyperparameters. You should always experiment with different methods and find the best ones for your problem. By doing so, you can reduce the error and increase the accuracy of your regression model.

6. Conclusion

This concludes our blog on how to use mean absolute error and mean absolute percentage error for regression problems. In this blog, you have learned:

  • What mean absolute error and mean absolute percentage error are and how they work.
  • How to calculate mean absolute error and mean absolute percentage error in Python using scikit-learn and numpy.
  • How to interpret mean absolute error and mean absolute percentage error and understand their advantages and disadvantages.
  • How to compare mean absolute error and mean absolute percentage error with other metrics and choose the best one for your problem.
  • How to improve mean absolute error and mean absolute percentage error by applying some techniques and best practices.

We hope you have found this blog useful and interesting, and that you have gained some valuable insights into how to evaluate your regression models using mean absolute error and mean absolute percentage error. If you have any questions or feedback, please feel free to leave a comment below. Thank you for reading and happy learning!
