In machine learning, **misclassification rate** is a metric that tells us the percentage of observations that were incorrectly predicted by some classification model.

It is calculated as:

Misclassification Rate = # incorrect predictions / # total predictions

The value for misclassification rate can range from 0 to 1 where:

- **0** represents a model that had zero incorrect predictions.
- **1** represents a model for which every prediction was incorrect.

The lower the value for the misclassification rate, the better a classification model is able to predict the outcomes of the response variable.
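The formula above can be sketched in a few lines of plain Python. This is only a minimal illustration: the function name `misclassification_rate` and the label lists are made up for this example and do not come from any particular library.

```python
def misclassification_rate(y_true, y_pred):
    """Return the fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one prediction is required")
    # Count the positions where the predicted label is wrong
    incorrect = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return incorrect / len(y_true)


# Hypothetical labels (assumption): 1 = positive class, 0 = negative class
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

rate = misclassification_rate(y_true, y_pred)
print(rate)  # 2 of the 8 predictions are wrong -> 0.25
```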

The following example shows how to calculate misclassification rate for a logistic regression model in practice.

**Example: Calculating Misclassification Rate for a Logistic Regression Model**

Suppose we use a logistic regression model to predict whether or not 400 different college basketball players get drafted into the NBA.

The following confusion matrix summarizes the predictions made by the model:

Here is how to calculate the misclassification rate for the model:

- Misclassification Rate = # incorrect predictions / # total predictions
- Misclassification Rate = (false positive + false negative) / (total predictions)
- Misclassification Rate = (70 + 40) / (400)
- Misclassification Rate = 0.275

The misclassification rate for this model is 0.275 or **27.5%**.

This means the model incorrectly predicted the outcome for **27.5%** of the players.
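The same arithmetic can be checked with a short Python sketch. It uses only the counts stated above (70 false positives, 40 false negatives, 400 total predictions); the variable names are illustrative.

```python
# Counts taken from the example above
false_positives = 70
false_negatives = 40
total_predictions = 400

# Every incorrect prediction is either a false positive or a false negative
incorrect_predictions = false_positives + false_negatives
rate_from_counts = incorrect_predictions / total_predictions

print(f"{rate_from_counts:.3f}")  # 0.275
print(f"{rate_from_counts:.1%}")  # 27.5%
```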

The complement of misclassification rate is accuracy, which is calculated as:

- Accuracy = 1 – Misclassification rate
- Accuracy = 1 – 0.275
- Accuracy = 0.725

This means the model correctly predicted the outcome for **72.5%** of the players.
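As a quick sanity check, accuracy can also be computed directly from the number of correct predictions and compared with 1 minus the misclassification rate. The sketch below assumes only the counts from the example; the variable names are illustrative.

```python
import math

total = 400
incorrect = 70 + 40          # false positives + false negatives
correct = total - incorrect  # 290 correct predictions

accuracy = correct / total
misclassification = incorrect / total

# The two metrics always sum to 1 (up to floating-point rounding)
assert math.isclose(accuracy, 1 - misclassification)

print(f"{accuracy:.3f}")  # 0.725
```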

**Pros & Cons of Misclassification Rate**

Misclassification rate offers the following **pros**:

- **It’s easy to interpret**. A misclassification rate of 10% means a model made an incorrect prediction for 10% of the total observations.
- **It’s easy to calculate**. A misclassification rate is calculated as the number of total incorrect predictions divided by the total number of predictions.

However, misclassification rate has the following **con**:

**It doesn’t take into account how the data is distributed**. For example, suppose 90% of all players do not get drafted into the NBA. If we have a model that simply predicts every player to not get drafted, the model would have a misclassification rate of just 10%. This seems low, but the model fails to correctly identify a single player who does get drafted.
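The sketch below reproduces this scenario in Python. The total of 1,000 players is an assumption chosen only to make the 90%/10% split concrete; it does not come from the example above.

```python
# Hypothetical population (assumption): 1,000 players, 10% of whom get drafted
n_players = 1000
n_drafted = 100

# 1 = drafted, 0 = not drafted
actual = [1] * n_drafted + [0] * (n_players - n_drafted)

# A "model" that predicts "not drafted" for every player
predicted = [0] * n_players

incorrect = sum(1 for a, p in zip(actual, predicted) if a != p)
baseline_rate = incorrect / n_players

# Drafted players that the model correctly identified
true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
drafted_detected = true_positives / n_drafted

print(baseline_rate)     # 0.1 -> looks like a good model
print(drafted_detected)  # 0.0 -> yet it finds no drafted players at all
```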

In practice, we often calculate the misclassification rate of a model along with other metrics like:

- **Sensitivity**: The “true positive rate”, i.e. the percentage of positive outcomes the model is able to detect.
- **Specificity**: The “true negative rate”, i.e. the percentage of negative outcomes the model is able to detect.
- **F1 Score**: The harmonic mean of precision and sensitivity (recall), which is more informative than accuracy when the classes are imbalanced.
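These metrics can all be derived from the four cells of a confusion matrix. The sketch below uses the standard definitions, with F1 computed as the harmonic mean of precision and sensitivity. The function name `classification_metrics` is hypothetical. The false positive and false negative counts match the earlier example, but the split of the remaining 290 correct predictions into 120 true positives and 170 true negatives is an assumption made purely for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. the data contains at least
    one actual positive, one actual negative, and one predicted positive.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)    # share of predicted positives that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "misclassification_rate": (fp + fn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# fp=70 and fn=40 come from the example; tp=120 and tn=170 are assumed values
metrics = classification_metrics(tp=120, fp=70, fn=40, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting these values side by side shows whether a low misclassification rate is hiding poor performance on one of the classes.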

By calculating the value for each of these metrics, we can gain a fuller picture of how well the model is able to make predictions.

**Additional Resources**

The following tutorials provide additional information about common machine learning concepts:

Introduction to Logistic Regression

What is Balanced Accuracy?

F1 Score vs. Accuracy: Which Should You Use?