Logistic regression is a type of regression we use when the response variable is binary.

One common way to evaluate the quality of a logistic regression model is to create a **confusion matrix**, which is a 2×2 table that shows the predicted values from the model vs. the actual values from the test dataset.

The following step-by-step example shows how to create a confusion matrix in Google Sheets.

**Step 1: Enter the Data**

First, let’s enter a column of actual values for a response variable along with the values predicted by a logistic regression model:

**Step 2: Create the Confusion Matrix**

Next, we will create the following outline for a confusion matrix:

Then, we will type the following formula into cell **E3**:

=COUNTIFS($A$2:$A$21, $D3, $B$2:$B$21, E$2)

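This formula counts the rows in which the value in column A matches the label in cell **D3** and the value in column B matches the header in cell **E2**. The mixed references ($D3 and E$2) keep the label column and the header row fixed, so we can click and drag this formula to each remaining cell in the confusion matrix: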
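For readers who want to check the counts outside of Google Sheets, the two-condition counting that COUNTIFS performs can be sketched in Python. This is only an illustration: the `actual` and `predicted` lists are made-up values (not the data from this example), and it assumes column A holds the actual values and column B holds the predicted values.

```python
from collections import Counter

# Hypothetical data for illustration only (not the values from the sheet)
actual    = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]   # assumed to play the role of column A
predicted = [0, 1, 1, 1, 0, 0, 1, 0, 0, 1]   # assumed to play the role of column B

# Count how often each (actual, predicted) pair occurs,
# which is what COUNTIFS does with its two range/criterion pairs
pair_counts = Counter(zip(actual, predicted))

labels = [0, 1]
# Rows correspond to actual values, columns to predicted values
confusion = [[pair_counts[(a, p)] for p in labels] for a in labels]

for a, row in zip(labels, confusion):
    print(f"actual={a}: {row}")
```

Every observation lands in exactly one cell, so the four counts should always add up to the number of rows in the data.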

**Step 3: Calculate Accuracy, Precision and Recall**

Once we’ve created the confusion matrix, we can calculate the following metrics:

**Accuracy**: Percentage of correct predictions

**Precision**: Correct positive predictions relative to total positive predictions

**Recall**: Correct positive predictions relative to total actual positives
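The three definitions above can be written as a short Python sketch. It assumes that a value of 1 is treated as the positive class; the function name `classification_metrics` and the four counts passed to it are hypothetical and are not taken from the confusion matrix in this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from the four confusion matrix counts.

    tp: actual 1, predicted 1    fp: actual 0, predicted 1
    fn: actual 1, predicted 0    tn: actual 0, predicted 0
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                                  # correct / all predictions
    precision = tp / (tp + fp) if (tp + fp) else float("nan")     # correct positives / predicted positives
    recall = tp / (tp + fn) if (tp + fn) else float("nan")        # correct positives / actual positives
    return accuracy, precision, recall


# Hypothetical counts for illustration only
accuracy, precision, recall = classification_metrics(tp=6, fp=2, fn=1, tn=11)
print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```

Precision and recall are undefined when their denominators are zero (for example, when the model never predicts 1), so the sketch returns NaN in those cases rather than raising an error.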

The following formulas show how to calculate each of these metrics in Google Sheets:

The higher the accuracy, the better a model is able to correctly classify observations.

In this example, our model has an accuracy of **0.7**, which tells us that it correctly classified 70% of observations (14 of the 20 rows in A2:A21).

If we’d like, we can compare this accuracy to that of other logistic regression models to determine which model is best at classifying observations into categories of 0 or 1.
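As a rough sketch of such a comparison, the Python snippet below scores two sets of predictions against the same actual values and picks the one with the higher accuracy. The model names and all of the values are hypothetical.

```python
def accuracy(actual, predicted):
    """Share of observations where the prediction equals the actual value."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


# Hypothetical actual values and predictions from two hypothetical models
actual = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
model_predictions = {
    "model_a": [1, 0, 0, 1, 0, 1, 1, 0, 1, 1],
    "model_b": [1, 0, 1, 1, 0, 0, 0, 0, 1, 0],
}

scores = {name: accuracy(actual, preds) for name, preds in model_predictions.items()}
best_model = max(scores, key=scores.get)   # model with the highest accuracy
print(scores, best_model)
```

Accuracy is only a fair basis for comparison when every model is evaluated on the same test observations.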
