How to Perform a Correlation Test in R (With Examples)


One way to quantify the relationship between two variables is to use the Pearson correlation coefficient, which measures the linear association between two variables. It always takes on a value between -1 and 1, where:

  • -1 indicates a perfectly negative linear correlation between two variables
  • 0 indicates no linear correlation between two variables
  • 1 indicates a perfectly positive linear correlation between two variables
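
As a quick illustration of these three cases, the sketch below uses base R's cor() function on small made-up vectors. The names v, up, down, and curved are hypothetical and are not part of the example later in this tutorial.

```r
# Made-up data for illustration only
v <- c(-2, -1, 0, 1, 2)

up     <- 2 * v + 1   # increases linearly with v
down   <- -3 * v + 4  # decreases linearly with v
curved <- v^2         # related to v, but not linearly

r_up     <- cor(v, up)      # perfectly positive linear correlation
r_down   <- cor(v, down)    # perfectly negative linear correlation
r_curved <- cor(v, curved)  # no linear correlation

print(c(r_up, r_down, r_curved))
```

Note that curved depends entirely on v, yet its coefficient is 0. The Pearson coefficient only measures linear association.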

To determine if a correlation coefficient is statistically significant, you can calculate the corresponding t-score and p-value.

The formula to calculate the t-score of a correlation coefficient (r) is:

t = r√(n-2) / √(1-r²)

The p-value is calculated as the corresponding two-sided p-value for the t-distribution with n-2 degrees of freedom.
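
The calculation above can be sketched by hand in R. The values of r and n below are taken from the worked example later in this tutorial (r = 0.9279869 with 12 paired observations). The variable names t_score and p_value are only for illustration, and pt() is base R's cumulative distribution function for the t-distribution.

```r
# Values taken from the worked example in this tutorial
r <- 0.9279869  # sample correlation coefficient
n <- 12         # number of paired observations

# t-score: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_score <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t-distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_score), df = n - 2)

print(t_score)
print(p_value)
```

These values should agree, up to rounding, with the t and p-value that cor.test() reports for the same data.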

Correlation Test in R

To determine if the correlation coefficient between two variables is statistically significant, you can perform a correlation test in R using the following syntax:

cor.test(x, y, method=c("pearson", "kendall", "spearman"))

where:

  • x, y: Numeric vectors of data
  • method: Method used to calculate the correlation between the two vectors: "pearson" (the default), "kendall", or "spearman"

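
To illustrate the method argument, here is a minimal sketch that runs the two rank-based alternatives on a small made-up data set. The vectors a and b are hypothetical and were chosen to have no tied values, since ties can trigger a warning that an exact p-value cannot be computed for these methods.

```r
# Made-up vectors with no tied values (illustration only)
a <- c(1, 2, 3, 4, 5, 6, 7, 8)
b <- c(2, 1, 4, 3, 6, 5, 8, 7)

# Spearman rank correlation test
spearman_res <- cor.test(a, b, method = "spearman")

# Kendall rank correlation test
kendall_res <- cor.test(a, b, method = "kendall")

print(spearman_res$estimate)  # Spearman's rho
print(kendall_res$estimate)   # Kendall's tau
```

If method is omitted, cor.test() falls back to the Pearson test.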
The following example shows how to use this function to perform a correlation test in R.

Example: Correlation Test in R

Suppose we have the following two vectors in R:

x <- c(2, 3, 3, 5, 6, 9, 14, 15, 19, 21, 22, 23)
y <- c(23, 24, 24, 23, 17, 28, 38, 34, 35, 39, 41, 43)

Before we perform a correlation test between the two variables, we can create a quick scatterplot to view their relationship:

#create scatterplot 
plot(x, y, pch=16)

[Figure: scatterplot of x and y]

There appears to be a positive correlation between the two variables. That is, as one increases the other tends to increase as well.

To see if this correlation is statistically significant, we can perform a correlation test:

#perform correlation test between the two vectors
cor.test(x, y)

	Pearson's product-moment correlation

data:  x and y
t = 7.8756, df = 10, p-value = 1.35e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.7575203 0.9799783
sample estimates:
      cor 
0.9279869 

The correlation coefficient between the two vectors turns out to be 0.9279869.

The test statistic turns out to be 7.8756 and the corresponding p-value is 1.35e-05. Since this p-value is less than .05, we reject the null hypothesis that the true correlation is zero and conclude that the correlation between the two variables is statistically significant.
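
If the individual numbers are needed in later code, they can be pulled out of the object that cor.test() returns. The sketch below assumes the same x and y vectors as above; the names res, r_hat, t_stat, p_val, ci, and is_significant are illustrative.

```r
x <- c(2, 3, 3, 5, 6, 9, 14, 15, 19, 21, 22, 23)
y <- c(23, 24, 24, 23, 17, 28, 38, 34, 35, 39, 41, 43)

# Store the test result instead of only printing it
res <- cor.test(x, y)

r_hat  <- unname(res$estimate)   # correlation coefficient
t_stat <- unname(res$statistic)  # t test statistic
p_val  <- res$p.value            # two-sided p-value
ci     <- res$conf.int           # 95% confidence interval for the correlation

# Apply the .05 significance threshold used above
is_significant <- p_val < 0.05
print(is_significant)
```

Storing the result this way avoids copying numbers by hand from the printed output.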
