**Covariance** is a measure of how changes in one variable are associated with changes in a second variable. Specifically, it’s a measure of the degree to which two variables are linearly associated.
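For reference, a common way to write this down is the sample covariance between two variables $x$ and $y$ measured on $n$ observations. The sample version (dividing by $n - 1$) is assumed here, since it is the default in most statistical software:

$$\operatorname{cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Here $\bar{x}$ and $\bar{y}$ are the sample means of the two variables. When $x$ and $y$ are the same variable, the formula reduces to the sample variance.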

A **covariance matrix** is a square matrix that shows the covariance between many different variables. This can be a useful way to understand how different variables are related in a dataset.
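As a minimal sketch of where such a matrix comes from, the snippet below computes one with NumPy. The scores are made up purely for illustration, and the use of `np.cov` is an assumption about tooling rather than something prescribed here.

```python
import numpy as np

# Hypothetical exam scores: one row per student,
# columns are math, science, history (illustrative data only).
scores = np.array([
    [84, 85, 97],
    [82, 82, 94],
    [81, 72, 93],
    [89, 77, 95],
    [73, 75, 88],
    [94, 89, 82],
])

# rowvar=False tells NumPy that each column is a variable.
# By default np.cov uses the sample formula (divides by n - 1).
cov_matrix = np.cov(scores, rowvar=False)

print(cov_matrix.shape)  # (3, 3)
print(cov_matrix)
```

With three variables the result is a 3 x 3 matrix: one row and one column per variable.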

The following example shows how to read a covariance matrix in practice.

**How to Read a Covariance Matrix**

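Suppose we have the following covariance matrix that contains information about exam scores in three different subjects for a group of college students:

| | Math | Science | History |
|---|---|---|---|
| **Math** | 64.9 | 33.2 | -24.4 |
| **Science** | 33.2 | 56.4 | -24.1 |
| **History** | -24.4 | -24.1 | 75.6 |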

**The values along the main diagonal of the matrix represent the variance of each subject's scores.**

For example:

- The variance of the math scores is **64.9**.
- The variance of the science scores is **56.4**.
- The variance of the history scores is **75.6**.
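The diagonal can also be read programmatically. The sketch below enters the matrix by hand, using the values listed in this example and assuming the row and column order math, science, history, then pulls out the diagonal with `np.diag`:

```python
import numpy as np

subjects = ["math", "science", "history"]

# Covariance matrix from the example, entered by hand.
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

# The main diagonal holds the variance of each subject.
variances = dict(zip(subjects, np.diag(cov_matrix).tolist()))
print(variances)  # {'math': 64.9, 'science': 56.4, 'history': 75.6}

# Taking the square root gives each standard deviation,
# which is in the same units as the scores themselves.
std_devs = {name: float(np.sqrt(v)) for name, v in variances.items()}
print(std_devs)
```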

**The other values in the matrix represent the covariances between the various subjects.**

For example:

- The covariance between the math and science scores is **33.2**.
- The covariance between the math and history scores is **-24.4**.
- The covariance between the science and history scores is **-24.1**.
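Looking up an off-diagonal value amounts to picking a row and a column. The helper below, `covariance`, is a hypothetical convenience function written for this sketch, and the math, science, history ordering is assumed:

```python
import numpy as np

subjects = ["math", "science", "history"]
index = {name: i for i, name in enumerate(subjects)}

# Covariance matrix from the example, entered by hand.
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

def covariance(a, b):
    """Return the covariance between subjects a and b (hypothetical helper)."""
    return float(cov_matrix[index[a], index[b]])

print(covariance("math", "science"))     # 33.2
print(covariance("math", "history"))     # -24.4
print(covariance("science", "history"))  # -24.1
```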

A **positive number** for covariance indicates that two variables tend to increase or decrease in tandem.

For example, math and science have a positive covariance (**33.2**), which indicates that students who score high on math also tend to score high on science.

Conversely, students who score low on math also tend to score low on science.

A **negative number** for covariance indicates that as one variable increases, a second variable tends to decrease.

For example, math and history have a negative covariance (**-24.4**), which indicates that students who score high on math tend to score low on history.

Conversely, students who score low on math tend to score high on history.
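To see how the sign arises, here is a small sketch with made-up scores for five students. The numbers are hypothetical and chosen only so that science rises with math while history falls:

```python
import numpy as np

# Hypothetical scores for five students (illustrative data only).
math_scores    = np.array([60, 70, 80, 90, 100])
science_scores = np.array([62, 68, 81, 88, 97])   # rises along with math
history_scores = np.array([95, 90, 78, 74, 65])   # falls as math rises

# For two variables np.cov returns a 2x2 matrix;
# the [0, 1] entry is the covariance between them.
cov_math_science = np.cov(math_scores, science_scores)[0, 1]
cov_math_history = np.cov(math_scores, history_scores)[0, 1]

print("math vs science:", "positive" if cov_math_science > 0 else "negative")
print("math vs history:", "positive" if cov_math_history > 0 else "negative")
```

Note that the sign tells us the direction of the relationship, while the magnitude depends on the units of the variables and is harder to interpret on its own.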

**A Note on the Symmetry of a Covariance Matrix**

It’s worth noting that a covariance matrix is always symmetric: the covariance between X and Y is the same as the covariance between Y and X.

For example, the top right cell shows exactly the same value (**-24.4**) as the bottom left cell, because both cells measure the covariance between history and math.

Because a covariance matrix is symmetric, the values above the diagonal simply repeat the values below it, so nearly half of the entries in the matrix are redundant.
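The symmetry is easy to verify by comparing the matrix with its transpose. This sketch reuses the example values, entered by hand in the assumed order math, science, history:

```python
import numpy as np

# Covariance matrix from the example, entered by hand.
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

# A symmetric matrix is equal to its own transpose.
is_symmetric = bool(np.allclose(cov_matrix, cov_matrix.T))
print(is_symmetric)  # True

# Top right and bottom left cells hold the same covariance.
print(cov_matrix[0, 2] == cov_matrix[2, 0])  # True

# With n variables, only n * (n + 1) / 2 entries are distinct.
n = cov_matrix.shape[0]
distinct_entries = n * (n + 1) // 2
print(distinct_entries)  # 6
```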

Thus, sometimes only the lower (or upper) triangle of the covariance matrix, including the diagonal, is displayed.
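One way to produce such a half display is to blank out everything above the diagonal. The sketch below does this with `np.tril`; keeping the lower triangle rather than the upper one is an arbitrary choice made for illustration:

```python
import numpy as np

# Covariance matrix from the example, entered by hand.
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

# Boolean mask that is True on and below the main diagonal.
mask = np.tril(np.ones_like(cov_matrix, dtype=bool))

# Keep the lower triangle and replace the redundant upper cells with NaN.
lower = np.where(mask, cov_matrix, np.nan)
print(lower)
```

No information is lost: each blanked cell has a matching value in the lower triangle.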

**When to Use a Covariance Matrix**

In practice, you will need to create and interpret a correlation matrix more often than a covariance matrix.
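The two matrices are closely related: dividing each covariance by the product of the two standard deviations gives the correlation. The sketch below applies that standard relationship to the example matrix, with the values entered by hand:

```python
import numpy as np

# Covariance matrix from the example (order: math, science, history).
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

# Standard deviations are the square roots of the diagonal (the variances).
std = np.sqrt(np.diag(cov_matrix))

# corr[i, j] = cov[i, j] / (std[i] * std[j])
corr_matrix = cov_matrix / np.outer(std, std)

print(np.round(corr_matrix, 3))
```

Every value in the result lies between -1 and 1 with ones on the diagonal, which makes the strength of each relationship easier to compare than the raw covariances. For instance, the math and science correlation comes out at roughly 0.55.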

However, covariance matrices are often used “under the hood” by various machine learning algorithms and models.

For example, the covariance matrix is used when performing principal components analysis, which helps us understand underlying patterns in a dataset that contains a large number of variables.
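As a rough sketch of that idea, the principal components can be obtained from the eigenvectors of the covariance matrix. The example below runs the decomposition on the three-subject matrix from earlier, which is far smaller than a typical PCA use case and serves only as an illustration:

```python
import numpy as np

# Covariance matrix from the example (order: math, science, history).
cov_matrix = np.array([
    [ 64.9,  33.2, -24.4],
    [ 33.2,  56.4, -24.1],
    [-24.4, -24.1,  75.6],
])

# eigh is intended for symmetric matrices and returns
# eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Reorder so the component with the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Share of the total variance captured by each principal component.
explained = eigenvalues / eigenvalues.sum()

print(np.round(explained, 3))
print(np.round(eigenvectors[:, 0], 3))  # direction of the first component
```

Each column of `eigenvectors` is one principal component, and the eigenvalues sum to the total variance, that is, the sum of the diagonal of the covariance matrix.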

**Additional Resources**

The following tutorials explain how to create a covariance matrix using different statistical software:

- How to Create a Covariance Matrix in R
- How to Create a Covariance Matrix in Python
- How to Create a Covariance Matrix in SPSS
- How to Create a Covariance Matrix in Excel