Simple linear regression is used to quantify the relationship between a predictor variable and a response variable.
This method finds a line that best “fits” a dataset and takes on the following form:
ŷ = b0 + b1x
- ŷ: The estimated response value
- b0: The intercept of the regression line
- b1: The slope of the regression line
- x: The value of the predictor variable
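As a minimal sketch of how b0 and b1 are obtained, the Python snippet below computes the least-squares estimates by hand. The function name fit_line and the five data points are illustrative assumptions, not values from this tutorial.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of x, and of cross-products of x and y
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept
    return b0, b1


# Hypothetical toy data, for illustration only
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x_values, y_values)
print(f"y-hat = {b0:.3f} + {b1:.3f}*x")
```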
Often we’re interested in the value of b1, which tells us the average change in the response variable associated with a one-unit increase in the predictor variable.
We can use the following formula to calculate a confidence interval for β1, the slope for the overall population:
Confidence Interval for β1: b1 ± t_{1-α/2, n-2} * se(b1)
- b1 = Slope coefficient shown in the regression table
- t_{1-α/2, n-2} = The t critical value for confidence level 1-α with n-2 degrees of freedom, where n is the total number of observations in our dataset
- se(b1) = The standard error of b1 shown in the regression table
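The regression table reports se(b1) directly, but it can help to see where that number comes from. The sketch below assumes the usual simple linear regression formula, se(b1) = sqrt(MSE / Sxx) with MSE = SSE / (n-2). The function name slope_standard_error and the toy data are hypothetical and not taken from this tutorial.

```python
import math


def slope_standard_error(x, y):
    """Standard error of the least-squares slope in simple linear regression."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    # Sum of squared residuals around the fitted line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)  # residual variance, using n-2 degrees of freedom
    return math.sqrt(mse / sxx)


# Hypothetical toy data, for illustration only
se_b1 = slope_standard_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(se_b1, 4))
```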
The following example shows how to calculate a confidence interval for a regression slope in practice.
Example: Confidence Interval for Regression Slope
Suppose we’d like to fit a simple linear regression model using hours studied as a predictor variable and exam score as a response variable for 15 students in a particular class:
We can perform simple linear regression in Excel and receive the following output:
Using the coefficient estimates in the output, we can write the fitted simple linear regression model as:
Score = 65.334 + 1.982*(Hours Studied)
The value for the regression slope is 1.982.
This tells us that each additional hour studied is associated with an average increase of 1.982 in exam score.
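To make the interpretation of the slope concrete, the short sketch below evaluates the fitted model at a few values of hours studied. The function name predict_score and the chosen hour values are assumptions for illustration; only the two coefficients come from the fitted model above.

```python
B0 = 65.334  # intercept from the fitted model
B1 = 1.982   # slope from the fitted model


def predict_score(hours_studied):
    """Estimated exam score for a given number of hours studied."""
    return B0 + B1 * hours_studied


# Illustrative inputs: each extra hour adds B1 to the estimate
for hours in (0, 5, 10):
    print(hours, round(predict_score(hours), 3))
```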
We can use the following formula to calculate a 95% confidence interval for the slope:
- 95% C.I. for β1: b1 ± t_{1-α/2, n-2} * se(b1)
- 95% C.I. for β1: 1.982 ± t_{0.975, 15-2} * 0.248
- 95% C.I. for β1: 1.982 ± 2.1604 * 0.248
- 95% C.I. for β1: [1.446, 2.518]
The 95% confidence interval for the regression slope is [1.446, 2.518].
Since this confidence interval doesn’t contain the value 0, we can conclude that there is a statistically significant association between hours studied and exam score at the 5% significance level.
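The arithmetic above can be checked with a few lines of Python. This is a sketch that takes the slope, its standard error, and the t critical value as given; the function name slope_confidence_interval is an assumption for illustration.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) bounds of b1 ± t_crit * se(b1)."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin


# Values from the worked example: slope, standard error, t critical value
lower, upper = slope_confidence_interval(1.982, 0.248, 2.1604)
contains_zero = lower <= 0 <= upper

print(f"95% C.I. for the slope: [{lower:.3f}, {upper:.3f}]")
print("Interval contains 0:", contains_zero)
```

An interval that excludes 0 corresponds to rejecting a slope of zero at the matching significance level.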
Note: We used the Inverse t Distribution Calculator to find the t critical value that corresponds to a 95% confidence level with 13 degrees of freedom.
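If no calculator is at hand, the critical value can be approximated with the Python standard library alone. The sketch below integrates the t density numerically (Simpson's rule) and inverts the CDF by bisection. The function names, step count, and search bounds are illustrative choices, not part of the original tutorial. With SciPy installed, scipy.stats.t.ppf(0.975, 13) is the more usual route.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-sided t critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2  # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 50.0                 # assumed search bounds
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# 95% confidence with n - 2 = 15 - 2 = 13 degrees of freedom
t_crit = t_critical(0.95, 13)
print(round(t_crit, 4))
```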
The following tutorials provide additional information about linear regression: