Multiple Linear Regression by Hand (Step-by-Step)


Multiple linear regression is a method we can use to quantify the relationship between two or more predictor variables and a response variable.

This tutorial explains how to perform multiple linear regression by hand.

Example: Multiple Linear Regression by Hand

Suppose we have the following dataset with one response variable y and two predictor variables X1 and X2:

Use the following steps to fit a multiple linear regression model to this dataset.

Step 1: Calculate X1², X2², X1y, X2y and X1X2.
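The column totals for this step can be sketched in Python. The data table itself is not reproduced in this text, so the eight observations below are an assumption: they were picked only because they reproduce every total quoted in Step 2 (Σy = 1,452, ΣX1 = 555, ΣX2 = 145, and so on). Treat them as illustrative rather than as the original table.

```python
# ASSUMED dataset: the tutorial's table is not shown in this text.
# These values are used only because they reproduce every total
# quoted in Step 2; they are not confirmed to be the original data.
y  = [140, 155, 159, 179, 192, 200, 212, 215]
x1 = [60, 62, 67, 70, 71, 72, 75, 78]
x2 = [22, 25, 24, 20, 15, 14, 14, 11]

n = len(y)  # number of observations

# Column totals needed later for the regression sums.
sums = {
    "y": sum(y),
    "X1": sum(x1),
    "X2": sum(x2),
    "X1^2": sum(a * a for a in x1),
    "X2^2": sum(b * b for b in x2),
    "X1*y": sum(a * c for a, c in zip(x1, y)),
    "X2*y": sum(b * c for b, c in zip(x2, y)),
    "X1*X2": sum(a * b for a, b in zip(x1, x2)),
}

for name, total in sums.items():
    print(f"Sum of {name}: {total}")
```

Each entry is just the total of one extra column you would add to the table by hand.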


Step 2: Calculate Regression Sums.

Next, make the following regression sum calculations:

  • Σx1² = ΣX1² – (ΣX1)² / n = 38,767 – (555)² / 8 = 263.875
  • Σx2² = ΣX2² – (ΣX2)² / n = 2,823 – (145)² / 8 = 194.875
  • Σx1y = ΣX1y – (ΣX1Σy) / n = 101,895 – (555*1,452) / 8 = 1,162.5
  • Σx2y = ΣX2y – (ΣX2Σy) / n = 25,364 – (145*1,452) / 8 = -953.5
  • Σx1x2 = ΣX1X2 – (ΣX1ΣX2) / n = 9,859 – (555*145) / 8 = -200.375
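The five regression sums above can be checked with a few lines of Python. This is a minimal sketch that uses only the totals quoted in the bullets; the variable names are illustrative choices, not part of the tutorial.

```python
# Raw column totals quoted in the tutorial (n = 8 observations).
n = 8
sum_y, sum_x1, sum_x2 = 1452, 555, 145
sum_x1_sq, sum_x2_sq = 38767, 2823
sum_x1y, sum_x2y, sum_x1x2 = 101895, 25364, 9859

# Mean-corrected ("regression") sums: raw total minus
# (product of the two column totals) / n.
ss_x1 = sum_x1_sq - sum_x1 ** 2 / n        # Σx1²
ss_x2 = sum_x2_sq - sum_x2 ** 2 / n        # Σx2²
sp_x1y = sum_x1y - sum_x1 * sum_y / n      # Σx1y
sp_x2y = sum_x2y - sum_x2 * sum_y / n      # Σx2y
sp_x1x2 = sum_x1x2 - sum_x1 * sum_x2 / n   # Σx1x2

print(ss_x1, ss_x2, sp_x1y, sp_x2y, sp_x1x2)
# prints: 263.875 194.875 1162.5 -953.5 -200.375
```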


Step 3: Calculate b0, b1, and b2.

The formula to calculate b1 is: [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]

Thus, b1 = [(194.875)(1162.5) – (-200.375)(-953.5)] / [(263.875)(194.875) – (-200.375)²] = 3.148

The formula to calculate b2 is: [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]

Thus, b2 = [(263.875)(-953.5) – (-200.375)(1162.5)] / [(263.875)(194.875) – (-200.375)²] = -1.656

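The formula to calculate b0 is: ȳ – b1x̄1 – b2x̄2, where ȳ, x̄1 and x̄2 are the sample means of y, X1 and X2 (here 1,452/8 = 181.5, 555/8 = 69.375 and 145/8 = 18.125).

Thus, b0 = 181.5 – 3.148(69.375) – (-1.656)(18.125) = -6.867

Note that -6.867 is obtained with the unrounded values of b1 and b2. Plugging in the rounded values 3.148 and -1.656 gives about -6.88 instead.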
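The three coefficient formulas can be sketched in Python using the regression sums from Step 2 and the sample means 181.5, 69.375 and 18.125. The variable names are illustrative. The sketch also shows why carrying full precision matters for b0.

```python
# Regression sums from Step 2.
ss_x1, ss_x2 = 263.875, 194.875                       # Σx1², Σx2²
sp_x1y, sp_x2y, sp_x1x2 = 1162.5, -953.5, -200.375    # Σx1y, Σx2y, Σx1x2

# Sample means of y, X1 and X2 (column totals divided by n = 8).
y_bar, x1_bar, x2_bar = 181.5, 69.375, 18.125

# b1 and b2 share the same denominator.
denom = ss_x1 * ss_x2 - sp_x1x2 ** 2
b1 = (ss_x2 * sp_x1y - sp_x1x2 * sp_x2y) / denom
b2 = (ss_x1 * sp_x2y - sp_x1x2 * sp_x1y) / denom

# The intercept uses the means and the unrounded slopes.
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

print(round(b1, 3), round(b2, 3), round(b0, 3))
# prints: 3.148 -1.656 -6.867

# For comparison: the intercept computed from the rounded slopes
# drifts to roughly -6.88.
b0_from_rounded = y_bar - round(b1, 3) * x1_bar - round(b2, 3) * x2_bar
print(b0_from_rounded)
```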

Step 4: Place b0, b1, and b2 in the estimated linear regression equation.

The estimated linear regression equation is: ŷ = b0 + b1*x1 + b2*x2

In our example, it is ŷ = -6.867 + 3.148x1 – 1.656x2
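As a minimal sketch, the fitted equation can be wrapped in a small Python function to produce predictions. The function name `predict` and the input values x1 = 70 and x2 = 20 are hypothetical, chosen only for illustration.

```python
def predict(x1, x2, b0=-6.867, b1=3.148, b2=-1.656):
    """Return the fitted value y-hat = b0 + b1*x1 + b2*x2."""
    return b0 + b1 * x1 + b2 * x2


# Hypothetical inputs, not taken from the tutorial's data table.
y_hat = predict(70, 20)
print(f"Predicted y for x1=70, x2=20: {y_hat:.3f}")
```

Because the coefficients are rounded to three decimals, predictions from this function can differ slightly from those of a model fitted at full precision.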

How to Interpret a Multiple Linear Regression Equation

Here is how to interpret this estimated linear regression equation: ŷ = -6.867 + 3.148x1 – 1.656x2

b0 = -6.867. When both predictor variables are equal to zero, the mean value for y is -6.867.

b1 = 3.148. A one unit increase in x1 is associated with a 3.148 unit increase in y, on average, assuming x2 is held constant.

b2 = -1.656. A one unit increase in x2 is associated with a 1.656 unit decrease in y, on average, assuming x1 is held constant.
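The "held constant" reading of the slopes can be illustrated numerically. In this sketch the starting point (x1 = 70, x2 = 20) is a hypothetical choice; any other starting point gives the same differences, because the model is linear.

```python
B0, B1, B2 = -6.867, 3.148, -1.656  # coefficients from the fitted equation


def predict(x1, x2):
    """Fitted value from the estimated regression equation."""
    return B0 + B1 * x1 + B2 * x2


base = predict(70, 20)               # hypothetical starting point

# Raise x1 by one unit while x2 stays at 20.
delta_x1 = predict(71, 20) - base

# Raise x2 by one unit while x1 stays at 70.
delta_x2 = predict(70, 21) - base

print(f"Change in y-hat per unit of x1: {delta_x1:.3f}")
print(f"Change in y-hat per unit of x2: {delta_x2:.3f}")
```

The two differences match b1 and b2, which is what the interpretation above states in words.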

Additional Resources

The following tutorials provide additional information about linear regression:

An Introduction to Multiple Linear Regression
How to Perform Multiple Linear Regression in Excel
How to Perform Simple Linear Regression by Hand

21 Replies to “Multiple Linear Regression by Hand (Step-by-Step)”

  1. Hi there, how do I find the values of b1, b2 and b3 if there are 3 independent variables? What formula should I use? I hope you can share your solution, and thanks in advance. Have a good day.

  2. Hi Zach,
    Thanks for this tutorial!
    I am unclear on how to extend this from 2 variables to 3.
    Can you provide some guidance?
    Thanks!

  3. Your formulas are very useful, but I advise you to use examples that are relevant to many people and that have many independent variables, rather than a simple example that we cannot apply to our studies. Thanks. Your documents are very useful.

  4. Do you have an example of how to do multiple linear regression with, say, 5 variables in Excel without using the Regression tool in Data Analysis?

  5. Thanks, this was really helpful.

    I think there is a small mistake in the working for b_2. In the calculation below, 1152.5 should be 1162.5. The answer is correct so it looks like just a typo.

    Thus, b2 = [(263.875)(-953.5) – (-200.375)(1152.5)] / [(263.875) (194.875) – (-200.375)2] = -1.656

  6. 1152.5 in the line below should be 1162.5.

    Thus, b2 = [(263.875)(-953.5) – (-200.375)(1152.5)] / [(263.875) (194.875) – (-200.375)2] = -1.656

  7. Thanks a lot for using multiple linear regression questions like these, which are likely to come up when practicing for exams. The explanation helps a lot. JajakAllahu khairun (Thank you).

  8. Great work!
    The only small mistake is 1152.5, which needs to be 1162.5 according to the “Reg Sums” above.

  9. OK. I'm sorry, your formula is correct. Don't you have an Excel formula for the above? It is a tedious task for 20 variables.

  10. Hello again, I think using the sum of Y gives a better value for b0 = Y-b1x1-b2x2-b3x3-b4x4, because the =SLOPE() and LINEST() slope results using the sum of Y are more accurate than those using the average of Y.
