Multiple linear regression is a method we can use to quantify the relationship between two or more predictor variables and a response variable.

This tutorial explains how to perform multiple linear regression by hand.

**Example: Multiple Linear Regression by Hand**

Suppose we have the following dataset with one response variable *y* and two predictor variables X_{1} and X_{2}:

Use the following steps to fit a multiple linear regression model to this dataset.

**Step 1: Calculate X_{1}^{2}, X_{2}^{2}, X_{1}y, X_{2}y and X_{1}X_{2}.**

**Step 2: Calculate Regression Sums.**

Next, make the following regression sum calculations:

- Σx_{1}^{2} = ΣX_{1}^{2} – (ΣX_{1})^{2} / n = 38,767 – (555)^{2} / 8 = **263.875**
- Σx_{2}^{2} = ΣX_{2}^{2} – (ΣX_{2})^{2} / n = 2,823 – (145)^{2} / 8 = **194.875**
- Σx_{1}y = ΣX_{1}y – (ΣX_{1}Σy) / n = 101,895 – (555*1,452) / 8 = **1,162.5**
- Σx_{2}y = ΣX_{2}y – (ΣX_{2}Σy) / n = 25,364 – (145*1,452) / 8 = **-953.5**
- Σx_{1}x_{2} = ΣX_{1}X_{2} – (ΣX_{1}ΣX_{2}) / n = 9,859 – (555*145) / 8 = **-200.375**
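The five regression sums above can be checked with a few lines of Python. This is a minimal sketch that starts from the column totals quoted in the calculations (n = 8, ΣX_{1} = 555, and so on). The variable names and the helper `corrected_sum` are illustrative choices, not part of the original tutorial.

```python
# Column totals quoted in Step 2 (n = 8 observations).
n = 8
sum_x1, sum_x2, sum_y = 555, 145, 1452
sum_x1_sq, sum_x2_sq = 38767, 2823
sum_x1y, sum_x2y, sum_x1x2 = 101895, 25364, 9859


def corrected_sum(raw_sum, total_a, total_b, n):
    """Mean-corrected sum of products: Σab - (Σa)(Σb) / n."""
    return raw_sum - total_a * total_b / n


# Hypothetical short names: S11 = Σx1², S22 = Σx2², S1y = Σx1y, and so on.
S11 = corrected_sum(sum_x1_sq, sum_x1, sum_x1, n)
S22 = corrected_sum(sum_x2_sq, sum_x2, sum_x2, n)
S1y = corrected_sum(sum_x1y, sum_x1, sum_y, n)
S2y = corrected_sum(sum_x2y, sum_x2, sum_y, n)
S12 = corrected_sum(sum_x1x2, sum_x1, sum_x2, n)

print(S11, S22, S1y, S2y, S12)
```

Each squared term is just the cross-product of a column with itself, so one helper covers all five sums.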

**Step 3: Calculate b_{0}, b_{1}, and b_{2}.**

The formula to calculate b_{1} is: [(Σx_{2}^{2})(Σx_{1}y) – (Σx_{1}x_{2})(Σx_{2}y)] / [(Σx_{1}^{2})(Σx_{2}^{2}) – (Σx_{1}x_{2})^{2}]

Thus, **b_{1}** = [(194.875)(1162.5) – (-200.375)(-953.5)] / [(263.875)(194.875) – (-200.375)^{2}] = **3.148**

The formula to calculate b_{2} is: [(Σx_{1}^{2})(Σx_{2}y) – (Σx_{1}x_{2})(Σx_{1}y)] / [(Σx_{1}^{2})(Σx_{2}^{2}) – (Σx_{1}x_{2})^{2}]

Thus, **b_{2}** = [(263.875)(-953.5) – (-200.375)(1162.5)] / [(263.875)(194.875) – (-200.375)^{2}] = **-1.656**
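As a brief note on where these two formulas come from: they are the solution of the least-squares normal equations written in terms of the mean-corrected sums from Step 2. The notation below (lowercase x for mean-corrected quantities) follows this tutorial; the derivation itself is the standard one, sketched here rather than proved.

```latex
% Normal equations for two predictors, in mean-corrected form
\begin{aligned}
b_1 \sum x_1^2 + b_2 \sum x_1 x_2 &= \sum x_1 y \\
b_1 \sum x_1 x_2 + b_2 \sum x_2^2 &= \sum x_2 y
\end{aligned}

% Solving the 2x2 system by Cramer's rule
\begin{aligned}
b_1 &= \frac{\left(\sum x_2^2\right)\left(\sum x_1 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_2 y\right)}
            {\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2} \\
b_2 &= \frac{\left(\sum x_1^2\right)\left(\sum x_2 y\right) - \left(\sum x_1 x_2\right)\left(\sum x_1 y\right)}
            {\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2}
\end{aligned}
```

With three or more predictors the same normal equations form a larger linear system, which is usually solved numerically rather than with closed-form expressions.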

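The formula to calculate b_{0} is: ȳ – b_{1}X̄_{1} – b_{2}X̄_{2}, where ȳ, X̄_{1}, and X̄_{2} are the sample means: ȳ = 1,452 / 8 = 181.5, X̄_{1} = 555 / 8 = 69.375, and X̄_{2} = 145 / 8 = 18.125.

Thus, **b_{0}** = 181.5 – 3.148(69.375) – (-1.656)(18.125) = **-6.867**

Note that -6.867 is obtained with the unrounded values of b_{1} and b_{2}; plugging in the rounded values shown here gives approximately -6.88.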
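The three coefficient calculations in this step can be reproduced in Python. The sketch below assumes the regression sums and column totals from Step 2 as inputs. The function name `fit_two_predictors` and its argument names are hypothetical, chosen only for this illustration.

```python
def fit_two_predictors(S11, S22, S1y, S2y, S12, mean_y, mean_x1, mean_x2):
    """Return (b0, b1, b2) from mean-corrected sums and sample means.

    S11 = Σx1², S22 = Σx2², S1y = Σx1y, S2y = Σx2y, S12 = Σx1x2.
    """
    # Shared denominator of the b1 and b2 formulas.
    det = S11 * S22 - S12 ** 2
    if det == 0:
        raise ValueError("Predictors are perfectly collinear; no unique solution.")
    b1 = (S22 * S1y - S12 * S2y) / det
    b2 = (S11 * S2y - S12 * S1y) / det
    # The intercept uses the sample means of y, X1 and X2.
    b0 = mean_y - b1 * mean_x1 - b2 * mean_x2
    return b0, b1, b2


n = 8
b0, b1, b2 = fit_two_predictors(
    S11=263.875, S22=194.875, S1y=1162.5, S2y=-953.5, S12=-200.375,
    mean_y=1452 / n, mean_x1=555 / n, mean_x2=145 / n,
)
print(round(b0, 3), round(b1, 3), round(b2, 3))  # -6.867 3.148 -1.656
```

The zero check on the denominator guards against the case where X_{1} and X_{2} are perfectly correlated, in which case the formulas cannot be evaluated.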

**Step 4: Place b_{0}, b_{1}, and b_{2} in the estimated linear regression equation.**

The estimated linear regression equation is: ŷ = b_{0} + b_{1}*x_{1} + b_{2}*x_{2}

In our example, it is **ŷ = -6.867 + 3.148x_{1} – 1.656x_{2}**
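Once the equation is written down, using it to make a prediction is a single line of arithmetic. The sketch below wraps the fitted equation in a small function. The function name `predict` and the input values x_{1} = 70 and x_{2} = 18 are hypothetical, picked only to illustrate the calculation, and are not taken from the dataset.

```python
# Rounded coefficients from the estimated regression equation.
B0, B1, B2 = -6.867, 3.148, -1.656


def predict(x1, x2):
    """Evaluate ŷ = b0 + b1*x1 + b2*x2 for one pair of predictor values."""
    return B0 + B1 * x1 + B2 * x2


# Hypothetical inputs, used only to illustrate the calculation.
y_hat = predict(70, 18)
print(round(y_hat, 3))  # 183.685
```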

**How to Interpret a Multiple Linear Regression Equation**

Here is how to interpret this estimated linear regression equation: ŷ = -6.867 + 3.148x_{1} – 1.656x_{2}

**b_{0} = -6.867**. When both predictor variables are equal to zero, the mean value for y is -6.867.

**b_{1} = 3.148**. A one unit increase in x_{1} is associated with a 3.148 unit increase in y, on average, assuming x_{2} is held constant.

**b_{2} = -1.656**. A one unit increase in x_{2} is associated with a 1.656 unit decrease in y, on average, assuming x_{1} is held constant.
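These interpretations can be checked numerically: raising one predictor by one unit while the other stays fixed changes ŷ by exactly that predictor's coefficient. The sketch below assumes the rounded coefficients from the fitted equation. The helper `predict` and the baseline values 70 and 18 are hypothetical.

```python
B0, B1, B2 = -6.867, 3.148, -1.656


def predict(x1, x2):
    """Evaluate the fitted equation ŷ = b0 + b1*x1 + b2*x2."""
    return B0 + B1 * x1 + B2 * x2


base_x1, base_x2 = 70, 18  # hypothetical baseline values

# Change in ŷ for a one-unit increase in x1, holding x2 fixed.
effect_x1 = predict(base_x1 + 1, base_x2) - predict(base_x1, base_x2)
# Change in ŷ for a one-unit increase in x2, holding x1 fixed.
effect_x2 = predict(base_x1, base_x2 + 1) - predict(base_x1, base_x2)

print(round(effect_x1, 3), round(effect_x2, 3))  # 3.148 -1.656
print(predict(0, 0))  # -6.867, the intercept
```

Because the model is linear, the same differences result from any baseline values, not just the ones chosen here.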

**Additional Resources**

The following tutorials provide additional information about linear regression:

An Introduction to Multiple Linear Regression

How to Perform Multiple Linear Regression in Excel

How to Perform Simple Linear Regression by Hand

What about the F test, t test, r, and R-squared?

Hi there, how do I find the values of b1, b2, and b3 if there are 3 independent variables? What formula should I use? Hope you can share your solution, and thanks in advance. Have a good day.

Hi Zach,

Thanks for this tutorial!

I am unclear on how to extend this from 2 variables to 3.

Can you provide some guidance?

Thanks!

Hi Zach — could you clarify where your formulas came from? Thanks

So, how do you calculate a 3-variable multiple linear regression by hand?

This was really sooo helpful thank you so much!

Need more examples

Your formulas are very useful, but I advise you to use examples that are relevant to many people and that have many independent variables, rather than a simple example that we cannot apply to our own studies. Thanks. Your documents are very useful.

Thank you Zach, most helpful.

Do you have an example of how to do Multiple Linear Regression with, say, 5 variables in Excel without using the Regression Tool in Data Analysis?

This resource is a total lifesaver. Thank you and bless you!

Thanks, this was really helpful.

I think there is a small mistake in the working for b_2. In the calculation below, 1152.5 should be 1162.5. The answer is correct so it looks like just a typo.

Thus, b_{2} = [(263.875)(-953.5) – (-200.375)(1152.5)] / [(263.875)(194.875) – (-200.375)^{2}] = -1.656

Thanks a lot for using multiple linear regression questions of the kind that are likely to come up in exam practice; the explanation helps a lot. JajakAllahu khairun (Thank you)

Great work!

The only small mistake is 1152.5, which needs to be 1162.5 according to the “Reg Sums” above.

thank you, sir,

this is very useful.

How do I fit a regression with independent variables X1, X2, X3, and X4 and a dependent variable Y? Is there a mathematical procedure for finding the predicted Y values?

The formula to calculate b_{1} is: [(Σx_{2}^{2})(Σx_{1}y) – (Σx_{1}x_{2})(Σx_{2}y)] / [(Σx_{1}^{2})(Σx_{2}^{2}) – (Σx_{1}x_{2})^{2}]

What is the formula for finding b2, b3, and b4 in a multiple linear regression model?

[(Σx2^2)(Σx1y)

It’s supposed to be Sigma X1^2 for B1. Please check for B2 as well.

OK, I'm sorry, your formula is correct. Don't you have an Excel formula for the above? It is a tedious task for 20 variables.

Hello again. I think using the sum of Y gives a better value for b0 = Y-b1x1-b2x2-b3x3-b4x4, because the =SLOPE() and LINEST() slope results using the sum of Y are more accurate than those using the average of Y.