One way to quantify the relationship between two variables is to use the Pearson correlation coefficient, which measures the linear association between two variables*.*

It always takes on a value between -1 and 1 where:

**-1**indicates a perfectly negative linear correlation**0**indicates no linear correlation**1**indicates a perfectly positive linear correlation

To determine if a correlation coefficient is statistically significant, you can calculate the corresponding t-score and p-value.

The formula to calculate the t-score of a correlation coefficient (r) is:

**t** = r * √n-2 / √1-r^{2}

The p-value is then calculated as the corresponding two-sided p-value for the t-distribution with n-2 degrees of freedom.

**Example: Correlation Test in Python**

To determine if the correlation coefficient between two variables is statistically significant, you can perform a correlation test in Python using the **pearsonr** function from the **SciPy** library.

This function returns the correlation coefficient between two variables along with the two-tailed p-value.

For example, suppose we have the following two arrays in Python:

#create two arrays x = [3, 4, 4, 5, 7, 8, 10, 12, 13, 15] y = [2, 4, 4, 5, 4, 7, 8, 19, 14, 10]

We can import the **pearsonr** function and calculate the Pearson correlation coefficient between the two arrays:

from scipy.stats.stats import pearsonr #calculation correlation coefficient and p-value between x and y pearsonr(x, y) (0.8076177030748631, 0.004717255828132089)

Here’s how to interpret the output:

- Pearson correlation coefficient (r):
**0.8076** - Two-tailed p-value:
**0.0047**

Since the correlation coefficient is close to 1, this tells us that there is a strong positive association between the two variables.

And since the corresponding p-value is less than .05, we conclude that there is a statistically significant association between the two variables.

Note that we can also extract the individual correlation coefficient and p-value from the **pearsonr** function as well:

#extract correlation coefficient (rounded to 4 decimal places) r = round(pearsonr(x, y)[0], 4) print(r) 0.8076 #extract p-value (rounded to 4 decimal places) p = round(pearsonr(x, y)[1], 4) print(p) 0.0047

These values are a bit easier to read compared to the output from the original **pearsonr** function.

**Additional Resources**

The following tutorials provide additional information about correlation coefficients:

An Introduction to the Pearson Correlation Coefficient

What is Considered to Be a “Strong” Correlation?

The Five Assumptions for Pearson Correlation