Logistic regression is a statistical method that we use to fit a regression model when the response variable is binary. To assess how well a logistic regression model classifies the observations in a dataset, we can look at the following two metrics:
- Sensitivity: The probability that the model predicts a positive outcome for an observation when the outcome is actually positive (the true positive rate).
- Specificity: The probability that the model predicts a negative outcome for an observation when the outcome is actually negative (the true negative rate).
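As a minimal sketch of these two definitions, the base R snippet below computes both metrics from a small set of class labels. The vectors `actual` and `predicted_class` are made-up toy data, not taken from the example later in this tutorial.

```r
# Toy data (hypothetical): 1 = positive outcome, 0 = negative outcome
actual          <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted_class <- c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1)

# Count the four cells of the confusion matrix
tp <- sum(predicted_class == 1 & actual == 1)  # true positives
fn <- sum(predicted_class == 0 & actual == 1)  # false negatives
tn <- sum(predicted_class == 0 & actual == 0)  # true negatives
fp <- sum(predicted_class == 1 & actual == 0)  # false positives

# Sensitivity: share of actual positives that were predicted positive
sensitivity <- tp / (tp + fn)

# Specificity: share of actual negatives that were predicted negative
specificity <- tn / (tn + fp)

sensitivity  # 0.75
specificity  # 0.6666667
```

Here 3 of the 4 actual positives and 4 of the 6 actual negatives are classified correctly.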
One easy way to visualize these two metrics is by creating a ROC curve, which is a plot that displays the sensitivity and specificity of a logistic regression model at every possible classification threshold.
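To make the idea concrete, here is a rough base R sketch of how the points on a ROC curve can be derived by hand. The data are made up, and `sens_spec()` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Toy data (hypothetical): true classes and predicted probabilities
actual <- c(1, 1, 0, 1, 0, 0)
prob   <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)

# Hypothetical helper: sensitivity and specificity at a single threshold
sens_spec <- function(threshold) {
  pred <- as.integer(prob >= threshold)  # classify as positive at or above the threshold
  c(threshold   = threshold,
    sensitivity = sum(pred == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(pred == 0 & actual == 0) / sum(actual == 0))
}

# Start above every probability (nothing predicted positive),
# then lower the threshold through each observed probability
thresholds <- c(Inf, sort(unique(prob), decreasing = TRUE))

# One row per threshold; each row is one point on the ROC curve
roc_points <- as.data.frame(t(sapply(thresholds, sens_spec)))
roc_points
```

As the threshold drops, sensitivity can only rise and specificity can only fall, which is the trade-off the ROC curve traces out.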
This tutorial explains how to create and interpret a ROC curve in R using the ggplot2 visualization package.
Example: ROC Curve Using ggplot2
Suppose we fit the following logistic regression model in R:
#load Default dataset from ISLR book
data <- ISLR::Default

#divide dataset into training and test set
set.seed(1)
sample <- sample(c(TRUE, FALSE), nrow(data), replace=TRUE, prob=c(0.7,0.3))
train <- data[sample, ]
test <- data[!sample, ]

#fit logistic regression model to training set
model <- glm(default~student+balance+income, family="binomial", data=train)

#use model to make predictions on test set
predicted <- predict(model, test, type="response")
To visualize how well the logistic regression model performs on the test set, we can create a ROC plot using the ggroc() function from the pROC package:
#load necessary packages
library(ggplot2)
library(pROC)

#define object to plot
rocobj <- roc(test$default, predicted)

#create ROC plot
ggroc(rocobj)
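The y-axis displays the sensitivity (the true positive rate) of the model and the x-axis displays the specificity (the true negative rate) of the model. By default, ggroc() draws the x-axis in reverse, running from 1 on the left to 0 on the right. To plot 1 - specificity (the false positive rate) on an increasing axis instead, pass legacy.axes = TRUE to ggroc().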
Note that we can add some styling to the plot and also provide a title that contains the AUC (area under the curve) for the plot:
#load necessary packages
library(ggplot2)
library(pROC)

#define object to plot and calculate AUC
rocobj <- roc(test$default, predicted)
auc <- round(auc(test$default, predicted),4)

#create ROC plot
ggroc(rocobj, colour = 'steelblue', size = 2) +
  ggtitle(paste0('ROC Curve ', '(AUC = ', auc, ')'))
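For intuition about what the AUC in the title measures, one standard interpretation is the probability that a randomly chosen positive observation receives a higher predicted probability than a randomly chosen negative one, with ties counted as one half. The base R sketch below computes that quantity directly on made-up data. The name `auc_manual` is chosen for this illustration only, and the sketch is not meant to reproduce how pROC computes AUC internally.

```r
# Toy data (hypothetical): true classes and predicted probabilities
actual <- c(1, 1, 0, 1, 0, 0)
prob   <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)

# Split the predicted probabilities by true class
pos <- prob[actual == 1]
neg <- prob[actual == 0]

# Compare every positive score with every negative score:
# 1 if the positive scores higher, 0.5 for a tie, 0 otherwise
cmp <- outer(pos, neg, function(p, n) (p > n) + 0.5 * (p == n))

# AUC is the average over all positive-negative pairs
auc_manual <- mean(cmp)
auc_manual  # 0.8888889
```

In this toy data, 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 would correspond to random ranking and a value of 1 to perfect separation of the two classes.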
Note that we can also modify the theme of the plot:
#create ROC plot with minimal theme
ggroc(rocobj, colour = 'steelblue', size = 2) +
  ggtitle(paste0('ROC Curve ', '(AUC = ', auc, ')')) +
  theme_minimal()
Refer to this article for a guide to the best ggplot2 themes.