How to Perform Least Squares Fitting in NumPy (With Example)

The method of least squares finds the regression line that best fits a given dataset by minimizing the sum of the squared differences between the observed and predicted values of y.
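To make "best fits" concrete, here is a minimal sketch of the quantity that least squares minimizes. The helper name sum_squared_residuals and the toy data are hypothetical and chosen only for illustration; they are not part of the example used in the rest of this tutorial.

```python
import numpy as np

def sum_squared_residuals(slope, intercept, x, y):
    """Return the sum of squared vertical distances between y and the line."""
    residuals = y - (intercept + slope * x)
    return float(np.sum(residuals ** 2))

# hypothetical toy data that lies exactly on the line y = 1 + 2x
x_toy = np.array([0.0, 1.0, 2.0, 3.0])
y_toy = np.array([1.0, 3.0, 5.0, 7.0])

sse_true = sum_squared_residuals(2.0, 1.0, x_toy, y_toy)  # the line the data came from
sse_off = sum_squared_residuals(1.5, 1.0, x_toy, y_toy)   # a line with the wrong slope

print(sse_true, sse_off)  # 0.0 3.5
```

The line with the smaller sum of squared residuals is the better fit, and the least squares line is the one for which this sum is as small as possible.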

We can use the linalg.lstsq() function in NumPy to perform least squares fitting.

The following step-by-step example shows how to use this function in practice.

Step 1: Enter the Values for X and Y

First, let’s create the following NumPy arrays:

import numpy as np

#define x and y arrays
x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])

y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

Step 2: Perform Least Squares Fitting

We can use the following code to perform least squares fitting and find the line that best “fits” the data:

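#build the design matrix: a column of x values and a column of ones for the intercept
A = np.vstack([x, np.ones(len(x))]).T
#perform least squares fitting; [0] selects the coefficients from the returned tuple
np.linalg.lstsq(A, y, rcond=None)[0]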

array([0.96938776, 7.76734694])

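The result is an array that contains the slope and intercept for the line of best fit, in the same order as the columns of the matrix passed to lstsq() (the x values first, then the column of ones).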

From the output we can see:

  • Slope: 0.969
  • Intercept: 7.767

Using these two values, we can write the equation for the line of best fit:

ŷ = 7.767 + 0.969x
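The call in Step 2 keeps only the first element of what lstsq() returns. As a rough sketch, the full return value can be unpacked as shown here, and the coefficients can be cross-checked against np.polyfit(), which is a different NumPy routine and not the method used in this tutorial. The variable names (A, coef, residual_ss and so on) are illustrative.

```python
import numpy as np

x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

# design matrix: one column for x, one column of ones for the intercept
A = np.vstack([x, np.ones(len(x))]).T

# lstsq returns a tuple: solution, sum of squared residuals,
# rank of the design matrix, and its singular values
coef, residual_ss, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

# cross-check: polyfit with degree 1 returns the slope first, then the intercept
slope_pf, intercept_pf = np.polyfit(x, y, 1)

print(round(slope, 3), round(intercept, 3))
print(rank)
print(np.allclose([slope, intercept], [slope_pf, intercept_pf]))
```

If the two approaches agree, the last line prints True. The rank tells us whether the columns of the design matrix are linearly independent; a rank of 2 here means the slope and intercept are uniquely determined.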

Step 3: Interpret the Results

Here’s how to interpret the line of best fit:

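  • When x is equal to 0, the predicted average value for y is 7.767. Note that x = 0 lies outside the observed range of x (6 to 19), so this value is an extrapolation.
  • For each one unit increase in x, y increases by an average of 0.969.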
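Both interpretations can be checked numerically. The following is a small sketch, assuming the same x and y arrays as above; the predict() helper is a hypothetical name introduced only for this illustration.

```python
import numpy as np

x = np.array([6, 7, 7, 8, 12, 14, 15, 16, 16, 19])
y = np.array([14, 15, 15, 17, 18, 18, 19, 24, 25, 29])

A = np.vstack([x, np.ones(len(x))]).T
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

def predict(x_new):
    """Evaluate the fitted line at x_new."""
    return intercept + slope * x_new

# the prediction at x = 0 is the intercept
at_zero = predict(0)

# increasing x by one unit changes the prediction by the slope
step = predict(11) - predict(10)

print(round(at_zero, 3), round(step, 3))
```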

We can also use the line of best fit to predict the value of y based on the value of x.

For example, if x has a value of 10 then, using the rounded coefficients, we predict that the value of y would be 17.457:

  • ŷ = 7.767 + 0.969x
  • ŷ = 7.767 + 0.969(10)
  • ŷ = 17.457

