The Pearson correlation coefficient can be used to measure the linear association between two variables.
This coefficient always takes a value between -1 and 1, where:
- -1: A perfect negative linear correlation between two variables.
- 0: No linear correlation between two variables.
- 1: A perfect positive linear correlation between two variables.
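As a quick illustration of these three reference values, the following sketch uses R's built-in cor() function on small made-up vectors. The vector v and the names pos, neg, and curved are illustrative assumptions and are not part of the example data used later in this tutorial:

```r
# hypothetical values chosen only to illustrate the three reference cases
v <- c(-2, -1, 0, 1, 2)

pos    <- 2 * v + 1    # exact increasing linear function of v
neg    <- -3 * v + 4   # exact decreasing linear function of v
curved <- v^2          # fully determined by v, but not linearly

r_pos    <- cor(v, pos)      # approximately 1
r_neg    <- cor(v, neg)      # approximately -1
r_curved <- cor(v, curved)   # approximately 0
```

The last case shows that a coefficient of 0 only rules out a linear association: curved is completely determined by v, yet the correlation is zero.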
To determine if a correlation coefficient is statistically significant, you can calculate the corresponding t-score and p-value.
The formula to calculate the t-score of a correlation coefficient (r) based on a sample of n paired observations is:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
The p-value is calculated as the corresponding two-sided p-value for the t-distribution with n-2 degrees of freedom.
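To make the calculation concrete, here is a minimal sketch that computes the t-score and the two-sided p-value by hand, using the same data as the worked example in this tutorial. The variable names n, r, t_score, and p_value are illustrative choices, and pt() is base R's cumulative distribution function for the t-distribution:

```r
# same data as the worked example in this tutorial
x <- c(70, 78, 90, 87, 84, 86, 91, 74, 83, 85)
y <- c(90, 94, 79, 86, 84, 83, 88, 92, 76, 75)

n <- length(x)   # number of paired observations
r <- cor(x, y)   # Pearson correlation coefficient

# t-score: t = r * sqrt(n - 2) / sqrt(1 - r^2)
t_score <- r * sqrt(n - 2) / sqrt(1 - r^2)

# two-sided p-value from the t-distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_score), df = n - 2)

t_score
p_value
```

These values should agree with the t statistic and p-value that cor.test() reports for the same data (t = -1.7885 and p-value = 0.1115).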
To calculate the p-value for a Pearson correlation coefficient in R, you can use the cor.test() function.
cor.test(x, y)
The following example shows how to use this function in practice.
Example: Calculate P-Value for Correlation Coefficient in R
The following code shows how to use the cor.test() function to calculate the p-value for the correlation coefficient between two variables in R:
#create two variables
x <- c(70, 78, 90, 87, 84, 86, 91, 74, 83, 85)
y <- c(90, 94, 79, 86, 84, 83, 88, 92, 76, 75)
#calculate correlation coefficient and corresponding p-value
cor.test(x, y)
Pearson's product-moment correlation
data: x and y
t = -1.7885, df = 8, p-value = 0.1115
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.8709830 0.1434593
sample estimates:
cor
-0.5344408
From the output we can see:
- The Pearson correlation coefficient is -0.5344408.
- The corresponding p-value is 0.1115.
Since the correlation coefficient is negative, there is a negative linear relationship between the two variables in this sample.
However, since the p-value (0.1115) is not less than 0.05, the correlation is not statistically significant at the 0.05 significance level.
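The same decision can be expressed directly in code. The sketch below assumes a significance level of 0.05, stored in a variable named alpha; both the name and the threshold are choices made for illustration, not something cor.test() sets for you:

```r
x <- c(70, 78, 90, 87, 84, 86, 91, 74, 83, 85)
y <- c(90, 94, 79, 86, 84, 83, 88, 92, 76, 75)

alpha <- 0.05              # assumed significance level
result <- cor.test(x, y)   # store the full test result

# TRUE only if the p-value falls below the chosen significance level
is_significant <- result$p.value < alpha
is_significant
```

For this data the comparison returns FALSE, which matches the conclusion that the correlation is not statistically significant.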
Note that we can also type cor.test(x, y)$p.value to extract only the p-value for the correlation coefficient:
#create two variables
x <- c(70, 78, 90, 87, 84, 86, 91, 74, 83, 85)
y <- c(90, 94, 79, 86, 84, 83, 88, 92, 76, 75)
#calculate p-value for correlation between x and y
cor.test(x, y)$p.value
[1] 0.1114995
The p-value for the correlation coefficient is 0.1114995.
This matches the p-value from the previous output.
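Other parts of the output can be extracted in the same way, since cor.test() returns a list-like object. The following sketch stores the result in a variable named result (an illustrative name) and pulls out the remaining values shown in the printed output:

```r
x <- c(70, 78, 90, 87, 84, 86, 91, 74, 83, 85)
y <- c(90, 94, 79, 86, 84, 83, 88, 92, 76, 75)

result <- cor.test(x, y)

result$estimate    # Pearson correlation coefficient
result$statistic   # t-score
result$parameter   # degrees of freedom (n - 2)
result$conf.int    # 95 percent confidence interval for the correlation
```

Storing the result once and reading its components avoids running the test again for each value.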