How to Create a Biplot in R to Visualize PCA Results


Principal components analysis (PCA) is an unsupervised machine learning technique that seeks to find principal components that explain a large portion of the variation in a dataset.

To visualize the results of PCA for a given dataset we can create a biplot, which is a plot that displays every observation in the dataset on the plane formed by the first two principal components, together with the original variables drawn as arrows on that same plane.

We can use the following basic syntax in R to create a biplot:

#perform PCA
results <- princomp(df)

#create biplot to visualize results of PCA
biplot(results)
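
Note that princomp() expects numeric columns. As a minimal sketch, assuming a data frame that mixes numeric and non-numeric columns (the built-in iris dataset is used here purely as a stand-in for df), we could keep only the numeric columns before performing PCA:

```r
# iris has four numeric columns and one factor column (Species)
df <- iris[sapply(iris, is.numeric)]  # keep only the numeric columns

# perform PCA on the numeric columns
results <- princomp(df)

# create biplot to visualize results of PCA
biplot(results)
```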

The following example shows how to use this syntax in practice.

Example: How to Create a Biplot in R

For this example we’ll use the built-in R dataset called USArrests:

#view first six rows of USArrests dataset
head(USArrests)

           Murder Assault UrbanPop Rape
Alabama      13.2     236       58 21.2
Alaska       10.0     263       48 44.5
Arizona       8.1     294       80 31.0
Arkansas      8.8     190       50 19.5
California    9.0     276       91 40.6
Colorado      7.9     204       78 38.7

We can use the following code to perform PCA and visualize the results in a biplot:

#perform PCA
results <- princomp(USArrests)

#visualize results of PCA in biplot
biplot(results)

The x-axis displays the first principal component and the y-axis displays the second principal component. The individual observations (the states) are shown inside the plot as text labels, and the four original variables are drawn as red arrows.
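
To see the numbers behind the plot, we can inspect the fitted object directly. The following is a rough sketch rather than an exact reproduction of what biplot() draws, since the function rescales the scores and loadings before plotting them. The names scores, loads, and prop_var are our own:

```r
# perform PCA
results <- princomp(USArrests)

# coordinates of each observation on the first two components
scores <- results$scores[, 1:2]
head(scores)

# weights of each original variable on the first two components
loads <- unclass(results$loadings)[, 1:2]
loads

# proportion of total variance explained by each component
prop_var <- results$sdev^2 / sum(results$sdev^2)
round(prop_var, 3)
```

If the first two proportions add up to most of the total variance, the biplot gives a reasonably faithful two-dimensional picture of the data.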

Note that there are several arguments we can use within the biplot function to modify the appearance of the plot.

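For example, we can use the following code to modify the colors, font size, axis limits, plot title, axis titles, and size of the arrows in the plot. For the col and cex arguments, the first value applies to the observation labels and the second value applies to the variables: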

#create biplot with custom appearance
biplot(results,
       col=c('blue', 'red'),
       cex=c(1, 1.3),
       xlim=c(-.4, .4),
       main='PCA Results',
       xlab='First Component',
       ylab='Second Component',
       expand=1.2)

This biplot is a bit easier to read than the previous one.
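
Readability can also depend on the scale of the variables. By default princomp() works on the covariance matrix, so a variable measured on a much larger scale (here, Assault) can dominate the components. As a hedged sketch of one common alternative, we can set cor=TRUE so that the PCA is based on the correlation matrix instead, which is equivalent to standardizing the variables first. The name results_std and the specific cex values are our own choices:

```r
# perform PCA on the correlation matrix (standardized variables)
results_std <- princomp(USArrests, cor=TRUE)

# visualize the standardized results in a biplot
biplot(results_std,
       col=c('blue', 'red'),
       cex=c(0.7, 1),
       main='PCA Results (Standardized Variables)')
```

Whether to standardize depends on whether the differences in scale between the variables are meaningful for the analysis at hand.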

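You can find a full list of arguments that you can use to modify the appearance of the biplot in R's built-in help pages: run ?biplot.princomp for the arguments specific to princomp objects and ?biplot for the general plotting arguments.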
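
One argument that goes beyond appearance is choices, which selects the pair of components to plot. As a small sketch, assuming we want to look at the first and third components instead of the default first two:

```r
# perform PCA
results <- princomp(USArrests)

# plot the first and third principal components
biplot(results,
       choices=c(1, 3),
       xlab='First Component',
       ylab='Third Component')
```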

Additional Resources

The following tutorials provide additional information about principal components analysis:

A Quick Introduction to Supervised vs. Unsupervised Learning
Principal Components Analysis in R: Step-by-Step Example
