How to Use rcorr in R to Create a Correlation Matrix


You can use the rcorr function from the Hmisc package in R to create a matrix of correlation coefficients along with a matrix of p-values for variables in a data frame.

This function is particularly useful because the matrix of p-values lets you check whether the correlation between each pair of variables is statistically significant.

This function uses the following basic syntax:

library(Hmisc)

#create matrix of correlation coefficients and matrix of p-values
rcorr(as.matrix(df))

The following example shows how to use the rcorr function in practice.

Example: How to Use the rcorr Function in R

Suppose we have the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#view data frame
df

  assists rebounds points steals
1       4       12     22      5
2       5       14     24      6
3       5       13     26      7
4       6        7     26      7
5       7        8     29      8
6       8        8     32      5
7       8        9     20      3
8      10       13     14      4

We can use the following syntax to create a matrix of correlation coefficients and a matrix of corresponding p-values for this data frame:

library(Hmisc)

#create matrix of correlation coefficients and matrix of p-values
rcorr(as.matrix(df))

         assists rebounds points steals
assists     1.00    -0.24  -0.33  -0.47
rebounds   -0.24     1.00  -0.52  -0.17
points     -0.33    -0.52   1.00   0.61
steals     -0.47    -0.17   0.61   1.00

n= 8 


P
         assists rebounds points steals
assists          0.5589   0.4253 0.2369
rebounds 0.5589           0.1844 0.6911
points   0.4253  0.1844          0.1047
steals   0.2369  0.6911   0.1047 

The first matrix shows the correlation coefficient between each pairwise combination of variables in the data frame.

For example, we can see:

  • The correlation coefficient between assists and rebounds is -0.24.
  • The correlation coefficient between assists and points is -0.33.
  • The correlation coefficient between assists and steals is -0.47.

And so on.
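The coefficients can also be pulled out of the result for further use. The following is a minimal sketch, assuming the same data frame as above: rcorr returns a list whose r element holds the correlation matrix and whose n element holds the number of observations used for each pair. The object name res is an arbitrary choice for illustration.

```r
library(Hmisc)

#same data frame as in the example above
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#store the result instead of printing it (res is an arbitrary name)
res <- rcorr(as.matrix(df))

#view matrix of correlation coefficients, rounded to two decimals
round(res$r, 2)

#view number of observations used for each pair of variables
res$n

#base R alternative that returns the coefficients only (no p-values)
round(cor(df), 2)
```

The coefficients from res$r should agree with those from cor(df), since both calculate the Pearson correlation by default.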

The second matrix shows the corresponding p-value for each correlation coefficient from the first matrix.

For example, we can see:

  • The p-value for the correlation coefficient between assists and rebounds is 0.5589.
  • The p-value for the correlation coefficient between assists and points is 0.4253.
  • The p-value for the correlation coefficient between assists and steals is 0.2369.

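And so on.

Since none of the p-values in the matrix is less than 0.05, none of the pairwise correlations is statistically significant at the 0.05 significance level.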
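To see where one of these p-values comes from, it can be reproduced in base R. The following is a sketch that assumes the Pearson case: the p-value for a correlation r based on n observations is taken from a two-sided t-test with n - 2 degrees of freedom. The names t_stat and p_manual are illustrative only.

```r
#same data frame as in the example above
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#calculate number of observations and Pearson correlation for one pair
n <- nrow(df)
r <- cor(df$assists, df$rebounds)

#calculate t statistic for the test that the true correlation is zero
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

#calculate two-sided p-value using t distribution with n - 2 degrees of freedom
p_manual <- 2 * pt(-abs(t_stat), df = n - 2)
round(p_manual, 4)

#compare with p-value from cor.test in base R
cor.test(df$assists, df$rebounds)$p.value
```

The manual calculation should match the value of 0.5589 shown for assists and rebounds in the matrix of p-values.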

Note: By default, the rcorr function calculates the Pearson correlation coefficient, but you can specify type="spearman" if you would instead like to calculate the Spearman rank correlation coefficient.
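As a short sketch of the Spearman option, assuming the same data frame as above, the type argument can be passed directly to rcorr. The object name res_spearman is an arbitrary choice for illustration.

```r
library(Hmisc)

#same data frame as in the example above
df <- data.frame(assists=c(4, 5, 5, 6, 7, 8, 8, 10),
                 rebounds=c(12, 14, 13, 7, 8, 8, 9, 13),
                 points=c(22, 24, 26, 26, 29, 32, 20, 14),
                 steals=c(5, 6, 7, 7, 8, 5, 3, 4))

#create matrix of Spearman rank correlations and matrix of p-values
res_spearman <- rcorr(as.matrix(df), type="spearman")
res_spearman

#base R alternative that returns the Spearman coefficients only
round(cor(df, method="spearman"), 2)
```

Because the Spearman coefficient is calculated from the ranks of the values, the coefficients and p-values will generally differ from the Pearson results shown earlier.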

Additional Resources

The following tutorials explain how to perform other common operations in R:

How to Calculate Rolling Correlation in R
How to Calculate Spearman Rank Correlation in R
How to Calculate Correlation in R with Missing Values
