The **method of least squares** finds the regression line that best fits a given dataset by minimizing the sum of the squared differences between the observed and predicted values.

We can use the **linalg.lstsq()** function in NumPy to perform least squares fitting.

The following step-by-step example shows how to use this function in practice.

**Step 1: Enter the Values for X and Y**

First, let’s create the following NumPy arrays:

    import numpy as np

    #define x and y arrays
    x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
    y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

**Step 2: Perform Least Squares Fitting**

We can use the following code to perform least squares fitting and find the line that best “fits” the data:

    #perform least squares fitting
    np.linalg.lstsq(np.vstack([x, np.ones(len(x))]).T, y, rcond=None)[0]

    array([0.96938776, 7.76734694])

The result is an array that contains the **slope** and **intercept** values for the line of best fit.
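
The one-line call above can be easier to follow when it is split into steps. The sketch below is one possible way to write it, not the only one: it builds the design matrix explicitly and unpacks all four values that `np.linalg.lstsq()` returns. The variable names (`A`, `coeffs`, `slope`, `intercept`) are chosen here for illustration, and `np.column_stack()` is used in place of `np.vstack(...).T` purely for readability.

```python
import numpy as np

# Same data as in Step 1
x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

# Design matrix: first column holds x, second column holds ones (for the intercept)
A = np.column_stack([x, np.ones(len(x))])

# lstsq returns the coefficients, the sum of squared residuals,
# the rank of A, and the singular values of A
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)

# The coefficient order matches the column order of A: slope first, then intercept
slope, intercept = coeffs

print(round(slope, 3), round(intercept, 3))
```

Because the column of ones comes second in `A`, the intercept is the second coefficient. Swapping the columns would swap the order of the results.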

From the output we can see:

- Slope: **0.969**
- Intercept: **7.767**

Using these two values, we can write the equation for the line of best fit:

ŷ = 7.767 + 0.969x
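
As a sanity check, the same slope and intercept can be reproduced with the standard closed-form formulas for simple linear regression: the slope is the sum of the products of the deviations of x and y from their means, divided by the sum of squared deviations of x, and the intercept follows from the two means. The sketch below assumes the arrays from Step 1; the variable names are illustrative only.

```python
import numpy as np

# Same data as in Step 1
x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

x_mean, y_mean = x.mean(), y.mean()

# Slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# The fitted line passes through the point (x_mean, y_mean)
intercept = y_mean - slope * x_mean

print(round(slope, 3), round(intercept, 3))
```

For this dataset the two sums are 190 and 196, so the slope is 190 / 196 ≈ 0.969, which agrees with the value returned by `np.linalg.lstsq()`.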

**Step 3: Interpret the Results**

Here’s how to interpret the line of best fit:

- When x is equal to 0, the average value for y is **7.767**.
- For each one-unit increase in x, y increases by an average of **0.969**.

We can also use the line of best fit to predict the value of y based on the value of x.

For example, if x has a value of 10 then we predict that the value of y would be **17.457**:

- ŷ = 7.767 + 0.969x
- ŷ = 7.767 + 0.969(10)
- ŷ = 17.457
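
The same prediction can be computed in code. The following is a minimal sketch that assumes the data from Step 1; `x_new`, `y_hat`, and `y_hat_rounded` are names introduced here for illustration. It also shows that using the unrounded coefficients gives a slightly different result (about 17.461) than the hand calculation with rounded coefficients (17.457).

```python
import numpy as np

# Same data as in Step 1
x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

# Fit the line and unpack the coefficients (slope first, then intercept)
A = np.column_stack([x, np.ones(len(x))])
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

x_new = 10

# Prediction with the full-precision coefficients
y_hat = intercept + slope * x_new

# Prediction with the coefficients rounded to three decimals, as in the text
y_hat_rounded = 7.767 + 0.969 * x_new

print(round(y_hat, 3), round(y_hat_rounded, 3))
```

The small gap between the two values comes only from rounding the slope and intercept before multiplying.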

**Additional Resources**

The following tutorials explain how to perform other common tasks in NumPy:

How to Remove Specific Elements from NumPy Array

How to Get the Index of Max Value in NumPy Array

How to Fill NumPy Array with Values