MAE | RMSE | MAPE : Measures of model accuracy for data scientists

Mean Absolute Error (MAE)

MAE takes the difference between the predicted value and the actual value for every prediction and averages the results. However, to stop positive and negative errors cancelling one another out, it first takes the absolute value of each difference (that is, it makes all the values positive).

Let’s consider an example. Below, we have made three predictions. One is 1 higher than the actual value, one is 1 lower than the actual value, and the final prediction is exactly right. If we average these errors (+1, -1 and 0), we get a mean of zero, which does not describe the actual model performance.


To get around this, we take the absolute values; now the mean is 0.667 (2/3), which reflects the imperfect nature of the model in a way the plain average above did not.
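The example above can be sketched in a few lines of Python. This is only an illustration: the actual and predicted values are made-up numbers chosen to give errors of +1, -1 and 0, and the function names `mean_error` and `mean_absolute_error` are hypothetical helpers, not part of any library.

```python
def mean_error(actual, predicted):
    """Plain average of the signed errors (predicted - actual)."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mean_absolute_error(actual, predicted):
    """Average of the absolute errors, so positives and negatives cannot cancel."""
    errors = [abs(p - a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Assumed illustrative values: one prediction 1 too high, one 1 too low, one exact.
actual = [10, 10, 10]
predicted = [11, 9, 10]

print(mean_error(actual, predicted))           # signed errors cancel to zero
print(mean_absolute_error(actual, predicted))  # roughly 0.667
```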

Root Mean Square Error (RMSE)

As with MAE, we start from the difference between the actual and the forecast values. The formula is sqrt(avg(power(actual - forecast, 2))). I have broken this down below. We:

  • Calculate the difference between predicted and actual values
  • Square those values
  • Take an average of the squared values
  • Square root the result
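The four steps above map directly onto code. The following is a minimal sketch using only the standard library; `root_mean_square_error` is a hypothetical helper name and the input values are assumed for illustration.

```python
import math


def root_mean_square_error(actual, predicted):
    # 1. Calculate the difference between predicted and actual values
    differences = [p - a for a, p in zip(actual, predicted)]
    # 2. Square those values
    squared = [d ** 2 for d in differences]
    # 3. Take an average of the squared values
    mean_squared = sum(squared) / len(squared)
    # 4. Square root the result
    return math.sqrt(mean_squared)


# Assumed illustrative values: errors of +1, -1 and 0.
actual = [10, 10, 10]
predicted = [11, 9, 10]
print(root_mean_square_error(actual, predicted))  # sqrt(2/3), roughly 0.816
```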


By squaring the differences between the actual and predicted values, we add extra ‘weight’ to the larger errors, effectively punishing them and giving a worse overall model score. Below is the MAE for the same data.
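To see the weighting effect in isolation, here is a small sketch with two made-up sets of errors (these are not the figures from the example in this article). Both sets have the same MAE, but the set containing one large error gets a noticeably worse RMSE. The helper names `mae` and `rmse` are hypothetical.

```python
import math


def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


# Assumed error sets: same total absolute error, distributed differently.
even_errors = [2, 2, 2, 2]   # every prediction is off by 2
spiky_errors = [0, 0, 0, 8]  # three perfect predictions, one off by 8

print(mae(even_errors), rmse(even_errors))    # 2.0 2.0
print(mae(spiky_errors), rmse(spiky_errors))  # 2.0 4.0
```

MAE cannot tell the two models apart, while RMSE doubles for the model with the single large miss.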

Mean Absolute Percentage Error (MAPE)

With MAE and RMSE we sometimes run into issues interpreting the results. If you have an RMSE of 54, is that bad? Well, if you are trying to predict something where the actual value was 60, then yes, being wrong by 54 points is quite bad; but if you were predicting something where the actual value was 500,000,000, then being wrong by 54 on average is excellent.

For this reason we can use the percentage error, which gives us an output relative to the actual value rather than an unscaled number.

The formula to calculate this is:

avg(abs(actual - forecast) / abs(actual)) * 100

Let’s have a look at this in practice.
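As a rough sketch, the formula can be written in Python as follows. The function name `mean_absolute_percentage_error` is a hypothetical helper, and the inputs are assumed values that reuse the "wrong by 54" scenario described earlier. Note that the calculation is undefined whenever an actual value is zero, because of the division by abs(actual).

```python
def mean_absolute_percentage_error(actual, forecast):
    """avg(abs(actual - forecast) / abs(actual)) * 100.

    Raises ZeroDivisionError if any actual value is 0.
    """
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return sum(ratios) / len(ratios) * 100


# The same absolute error of 54 at two very different scales (assumed values).
small_scale = mean_absolute_percentage_error([60], [114])
large_scale = mean_absolute_percentage_error([500_000_000], [500_000_054])

print(small_scale)  # about 90 percent: a very poor prediction
print(large_scale)  # about 0.0000108 percent: an excellent prediction
```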

In the example below, we can see that the average percentage error is 55.55%, which is pretty bad!


Kodey