Gauss Newton algorthm

The information presented here is based off the Wikipedia pages on Gauss-Newton.

The Gauss-Newton algorithm is a simple method for solving non-linear least square problems, typically expressed mathematically as:

$S = \displaystyle\sum\limits_{i=1}^n r_i^2$

where S is the sum of the residuals. The residual, r, is the difference between the actual observed value and the value predicted by the model:

$r_i = y_i - f(x_i, \beta)$

where

• y is the observed value
• x is the input value
• f is the function that we are trying to fit the data to
• $\beta = (\beta_1 .. \beta_n)$  is the parameters of the model that we are trying to find, to minimise S

Starting with an intial guess $\beta^{(0)}$, the Gauss-Newton algorithm will iteratively find the best values for $\beta$ as follows:

$\beta^{(t+1)} = \beta^{(t)} + \Delta$

$\Delta$ can be thought as the correction” applied to $\beta$ to get it closer to the solution. $\Delta$ in terms of the Jacobian matrix of f is:

$(J_f^T J_f)\Delta = J_f^T r$

Solving for $\Delta$

$\Delta = (J_f^T J_f)^{-1} J_f^T r$

$^T$ is the transpose of the matrix and $^{-1}$ is the inverse of the matrix.

I don’t know about most people, but whenever I see a bunch of matrix symbols clumped together it more than often leaves me in a state of confusion, particularly the matrix dimensions. For a better understanding of the matrices I’ll illustrate with an example. Suppose we want to do a quadratic fit to a set of data containing 100 points. A quadratic function has 3 parameters (A, B, C):

$y = Ax^2 + Bx + C$

thus the Jacbobian $J_f$ is 100×3 and the residual matrix, r, is 100×1, $J_f^T J_f$ ends up a 3×3 matrix.

One important note about the algorithm is that there is no guarantee it will converge. Each iteration will not necessarily improve the sum of residual. You can see this behaviour in my demo code. A plot of the mean square error vs iterations for my demo code is illustrated in the graph below. The code tries to fit the function $y = A\sin(Bx) + C\cos(Dx)$, which is sensitive to the initial estimate.

It is important to initialise the algorithm close to the true parameter value as possible, this will improve your chances of finding the correct solution.

Code

GaussNewton.zip

You’ll need OpenCV 2.x installed. In Linux just type make to compile and run. Replace the function Func() with your own one to play around with.

The program is very minimal and should be more or less self explanatory from the code, hopefully.

23 thoughts on “Gauss Newton algorthm”

1. javi says:

Hi, good job there! it seems quite clear.. Can you provide a new download link? The one here is broken 🙁

1. nghiaho12 says:

Thanks for letting me know. Link fixed.

Good day. Pls I’ve been able to download Opencv 2.4.4-beta. It contains so many files, i opened all but couldnt see the executable file to install to be able to use the code as directed. Please can you kindly assist on how to install the Opencv 2.x and how to go about the algorithm.
Thank you.

1. nghiaho12 says:

My experience is with Linux only. I always do

[go to the opencv directory]

mkdir build
cd build
cmake ..
make
make install

1. Thinh P. Tran says:

Hi,

The algorithm is clear but I sitll have some doubt. That is how can we be sure that J^{T}J has an inverse. Thank you for sharing. I am considering some nonlinear regression when I ran into your blog. 🙂

Best regards,
Thinh Tran

1. nghiaho12 says:

A quick check is to calculate the determinant of the matrix and see if it’s non-zero (good), zero (non-invertible).

3. Yeonuk Seong says:

Gooday for you!! Thanks for your help.

I have a project about PEM Fuel Cell.
I should draw the profile of lamda which is property of water. The parameter is lamda and I want to have 41 nodes of the system.

can you explain more specifically to me why it has 41×1 matrix?

1. nghiaho12 says:

Umm not sure if I understand you correctly. I assume your system has 41 data points, hence the 41×1 matrix?

4. driftye says:

Mat delta = ((Jf.t()*Jf)).inv() * Jf.t()*r;
params += delta;
How is your code to be right with plus instead of minus.

1. nghiaho12 says:

Look at the Wikipedia page history, someone changed the + to a minus, which is incorrect. I’ve checked this against other pdf files on the topic.

5. Thanks for your tutorial and your code which are both clear and easy to understand.
I have only one question: How can get the equation above Solving for \Delta ?

1. nghiaho12 says:

It’s from the Wikipedia page. But I think the page may have changed over the years, it looks different to what I remembered.

6. Linux says:

Hello , i would need the Gauss Newton algorithm to minimize reprojection error to estimate camera motion. I’ve seen your code but i don’t understand how i can use it for my target.

1. nghiaho12 says:

Modify the Func() function. It needs to return a single error value given your input and parameters.

1. Linux says:

Thank you for your kind support.
Which value i have to assign to initial guess if the images are rectified ? I’ m working on visual stereo odometry.

1. nghiaho12 says:

You should look at OpenCV’s way of doing stereo rectification. I’d probably not use my own code for serious work. This is more for learning.

1. Linux says:

the method that i’ve implemented based on a paper is not provided by opencv. Could you help me, in motion estimation if i provide you the paper’s name? I wrong something in the implementation, but i’m not able to understand what. The number of inliers provided is very low. And the estimated trajectory, is totally wrong.

2. nghiaho12 says:

Unfortunately, I don’t have much spare time these days. You might want to try posting on the OpenCV forum.

7. Linux says:

For me would be fine also a pseudo algorithm. I have to compute the 6 DOF motion parameters , i need to use a windowed bundle adjustement algorith that at each step use two pair of stereo images, current and previous, and use gauss newton to minimize a function cost . the algorithm use a dual residual scheme.

1. nghiaho12 says:

I haven’t done any work in bundle adjustment so won’t be of help here.

1. Linux says:

Ok thanks anyway.

8. Veri88 says:

Hi, I’m trying to estimate camera motion and I have two equation with eight parameters that describes the 2D parametric image motion, but I don’t understand how I can implement Gauss-Newton if the parameters are affecting both equations. I hope you can help me with some suggestions!

1. nghiaho12 says:

This is a topic that I’ve dabbled on in the past but not an expert in. You should check the Multiple View Geometry book for different minimization formulations for this problem. Gauss Newton is one of the many methods you can use for numerical optimization.