Nonlinear Equation System Solving Using Newton, Line Search, and LU factorization

Published

April 12, 2021

Summary

A couple of weeks ago, I was inspired to revisit Newton’s method for solving nonlinear equations. It reminded me of my undergrad days, when I read a paper by Madsen and Reid about finding the roots of a polynomial. What struck me at the time was that Madsen and Reid’s combination of Newton’s method with a line search was faster than any other algorithm. The Madsen-Reid papers were written in 1973 and 1975, yet ever since then, whenever the literature has benchmarked polynomial root-finding algorithms against the Fortran Madsen-Reid codes (PA16 and/or PA17), to my knowledge the Madsen-Reid algorithm has been the fastest and nearly the most accurate. What made the algorithm so successful? How can it generalize to solving a system of nonlinear equations? In this post, I plan to discuss these ideas.

Note that I use some math notation, but not as much as you would find in a math article or paper. This means my language is a little more inexact and high-level, without all the details spelled out.

The Madsen-Reid Algorithm

The Madsen-Reid algorithm can be summarized as follows:

  1. Calculate a Newton direction \(d = - {f'}^{- 1}f\). Do a line search along \(d\) to find where \(|f|\) is smallest. However, don’t locate the minimum exactly. Instead, quit as soon as \(|f|\) stops decreasing over a sequence of new guesses along \(d\). This is called “stage 1.”
  2. Switch to pure Newton iteration once an inequality that ensures convergence is satisfied. This is called “stage 2,” pure Newton iteration. If the inequality fails, go back to stage 1.

The beauty of the Madsen-Reid algorithm is that it is

  • Fast
  • Simple
  • Guaranteed quadratic convergence, even to roots with multiplicity greater than 1, because of the stage 1 line search. Guaranteed quadratic convergence is rare.
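To make the two stages concrete, here is a minimal Python sketch of the idea (a simplification I wrote for this post, not the actual PA16/PA17 logic): stage 1 walks along the Newton direction, growing or shrinking the step until \(|f|\) stops decreasing; near a simple root the doubling fails immediately and the unit step is kept, so the method reduces to the stage 2 pure Newton iteration.

```python
def newton_line_search(f, df, x, tol=1e-12, max_iter=100):
    """Newton's method with a stage-1 style line search on |f|.
    A simplified sketch of the idea, not the actual PA16/PA17 logic."""
    fx = f(x)
    for _ in range(max_iter):
        if abs(fx) <= tol:
            break
        d = -fx / df(x)                 # Newton direction
        alpha, trial = 1.0, f(x + d)
        if abs(trial) < abs(fx):
            # the unit step decreases |f|: keep doubling until it stops decreasing
            while True:
                longer = f(x + 2.0 * alpha * d)
                if abs(longer) < abs(trial):
                    alpha, trial = 2.0 * alpha, longer
                else:
                    break
        else:
            # otherwise shrink the step until |f| decreases
            while abs(trial) >= abs(fx) and alpha > 1e-12:
                alpha *= 0.5
                trial = f(x + alpha * d)
        x, fx = x + alpha * d, trial
    return x

# Example: a double root at x = 2, where plain Newton is only linearly convergent.
root = newton_line_search(lambda x: (x - 2.0)**2, lambda x: 2.0 * (x - 2.0), 0.0)
```

On this double-root example, the doubling in stage 1 takes the long steps that plain Newton can only creep through.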

Link to a paper describing the Madsen-Reid algorithm: https://apps.dtic.mil/sti/pdfs/ADA278422.pdf

C++ implementation of Madsen-Reid algorithm: http://www.hvks.com/Numerical/ports.html

Fortran code PA16 and PA17 can be found at https://www.hsl.rl.ac.uk/catalogue/

Nonlinear Equation Systems, Convex Optimization

Since Madsen-Reid, Newton iteration with a line search has shown up a lot in optimization and in root finding for nonlinear equation systems. It isn’t necessarily because of Madsen-Reid that Newton’s method with a line search is broadly used. The reason, in my opinion, is that many researchers and practitioners have independently found it to be highly effective.

The parallels between Newton iteration with line search in optimization and nonlinear equation solving are:

In each pairing below, the unconstrained convex optimization (minimization) concept comes first, followed by its analog for a nonlinear equation system.

Optimization: the cost to minimize is

Cost = F(x)

Where

  • x is a vector (the unknowns).

  • F is a scalar function of x. F must be at least twice differentiable, or have a Lipschitz continuous gradient. The point is F needs to be twice differentiable in some sense.

Nonlinear system: you can think of the cost as

Cost = \(\| F(x) \|_{1}\)

Where

  • \(\| \cdot \|_{1}\) is the one-norm (the absolute values of the elements of the vector, summed).

  • However, this isn’t necessary; the above equation only serves to make the analogy to optimization.

Optimization: the gradient is 0 at the minimum

\(\frac{\partial F}{\partial x} = 0\)

Where

\(\frac{\partial F}{\partial x}\) is a vector-valued function of the vector x.

Nonlinear system: the equations to solve are

F(x) = 0

Where

  • F(x) is a vector-valued function of the vector x.

  • F must be twice differentiable in some sense, at least locally. More advanced treatments make precise what this means.

Optimization: the Newton iteration with line search is

\(x_{k + 1} = x_{k} + \alpha D\)

\[D = \left( \frac{\partial^{2}F}{\partial x^{2}} \right)^{- 1}\left( - \frac{\partial F}{\partial x} \right)\]

where

  • \(\frac{\partial^{2}F}{\partial x^{2}}\) is the Hessian matrix. If F is twice continuously differentiable, then \(\frac{\partial^{2}F}{\partial x^{2}}\) is always symmetric. It must not be singular if you are going to invert it.

  • D is the direction of the line search.

  • α is the size of the step, found via a line search along D. There are many exact and inexact ways to do the line search: quadratic approximation, inexact line search, etc. The point is to find an α that decreases F.
    See https://youtu.be/MKmIvtq83LY for quadratic approximation.
    See https://youtu.be/MzmqM0tuO1Q for an inexact line search.
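The optimization side can be sketched in a few lines of Python (my own illustration; the helper `solve2`, the backtracking rule, and the test function are assumptions, not from any particular library). It takes Newton steps on a strictly convex cost, halving α until the cost decreases:

```python
import math

def solve2(a, rhs):
    """Solve a 2x2 linear system a @ x = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - rhs[1] * a[0][1]) / det,
            (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det]

def newton_minimize(F, grad, hess, x, tol=1e-10, max_iter=50):
    """Newton's method for minimization with a simple backtracking line search."""
    for _ in range(max_iter):
        g = grad(x)
        if max(abs(g[0]), abs(g[1])) <= tol:
            break
        D = solve2(hess(x), [-g[0], -g[1]])   # Newton direction
        alpha, f0 = 1.0, F(x)
        # halve the step until the cost actually decreases
        while F([x[0] + alpha * D[0], x[1] + alpha * D[1]]) >= f0 and alpha > 1e-12:
            alpha *= 0.5
        x = [x[0] + alpha * D[0], x[1] + alpha * D[1]]
    return x

# Example: a strictly convex cost, so the Hessian is always positive definite.
F = lambda x: math.exp(x[0] + x[1]) + x[0]**2 + x[1]**2
grad = lambda x: [math.exp(x[0] + x[1]) + 2 * x[0],
                  math.exp(x[0] + x[1]) + 2 * x[1]]
hess = lambda x: [[math.exp(x[0] + x[1]) + 2, math.exp(x[0] + x[1])],
                  [math.exp(x[0] + x[1]), math.exp(x[0] + x[1]) + 2]]
xmin = newton_minimize(F, grad, hess, [1.0, 1.0])
```

Because the example cost is strictly convex, the Newton direction is always a descent direction and the backtracking loop rarely has to shrink the step.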

Nonlinear system: the Newton iteration with line search is

\(x_{k + 1} = x_{k} + \alpha D\)

\[D = \left( \frac{\partial\ F}{\partial x} \right)^{- 1}( - F)\]

Where

  • \(\frac{\partial\ F}{\partial x}\) is the Jacobian matrix. Unlike the Hessian in optimization, it is not symmetric in general. It can’t be singular if you are going to invert it.

  • D is the direction of the line search.

  • α is the size of the step, found via a line search along D. There are many exact and inexact ways to do the line search: quadratic approximation, inexact line search, etc. The point is to find an α that decreases the one-norm of F, \(\| F(x_{k} + \alpha D) \|_{1}\).
    See https://youtu.be/MKmIvtq83LY for quadratic approximation.
    See https://youtu.be/MzmqM0tuO1Q for an inexact line search.
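The nonlinear-system side looks almost identical; here is a small Python sketch (again my own illustration, with `solve2` and the example system as assumptions) where the line search halves α until \(\| F \|_{1}\) decreases:

```python
def solve2(a, rhs):
    """Solve a 2x2 linear system a @ x = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - rhs[1] * a[0][1]) / det,
            (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det]

def norm1(v):
    """One-norm: absolute values of the elements, summed."""
    return abs(v[0]) + abs(v[1])

def newton_system(F, J, x, tol=1e-12, max_iter=50):
    """Newton iteration for F(x) = 0 with a line search that decreases ||F||_1."""
    fx = F(x)
    for _ in range(max_iter):
        if norm1(fx) <= tol:
            break
        D = solve2(J(x), [-fx[0], -fx[1]])    # direction from the Jacobian
        alpha = 1.0
        trial = F([x[0] + D[0], x[1] + D[1]])
        # halve the step until the one-norm of F decreases
        while norm1(trial) >= norm1(fx) and alpha > 1e-12:
            alpha *= 0.5
            trial = F([x[0] + alpha * D[0], x[1] + alpha * D[1]])
        x = [x[0] + alpha * D[0], x[1] + alpha * D[1]]
        fx = trial
    return x

# Example: intersect the circle x^2 + y^2 = 4 with the hyperbola x*y = 1.
F = lambda v: [v[0]**2 + v[1]**2 - 4.0, v[0] * v[1] - 1.0]
J = lambda v: [[2.0 * v[0], 2.0 * v[1]], [v[1], v[0]]]
sol = newton_system(F, J, [2.0, 0.5])
```

The only structural differences from the optimization sketch are the Jacobian in place of the Hessian and \(\| F \|_{1}\) in place of the cost.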

Optimization: convergence of the Newton iteration with line search is guaranteed when

  • \(\frac{\partial^{2}F}{\partial x^{2}}\) is positive definite. Said another way, F is strictly convex (shaped like a parabola).

  • If it is not positive definite, then we need a replacement direction to search. That direction can be chosen in many ways, but they all boil down to changing the direction so we move away from a saddle point. See:
    https://youtu.be/x1wMciVA6Xc?t=180

Nonlinear system: I am not sure what the convergence guarantee is for the Newton iteration in this case. As far as I know, \(\frac{\partial\ F}{\partial x}\) needs to be invertible, and thus nonsingular; I think that is the only requirement. However, we would like to find a way for the iteration to still work when \(\frac{\partial\ F}{\partial x}\) is sometimes singular. This is what we want to explore in the sections below.

What’s Next?

I need to

  • Write a MATLAB function for Newton iteration with line search and LU factorization to recognize singularity.

  • Implement a line search using a quadratic interpolation to find the step size, \(\alpha\). This will be key. Madsen-Reid’s line search would also work.

  • Add inequality constraints. This will most likely involve Lagrange multipliers or something similar. It will be needed especially for functions like log, where crossing a vertical asymptote is a terrible idea, and when we want one particular solution and don’t want to cross into the basin of attraction of another solution.

  • Write this in Simulink so that it can be compiled to code and even work on fixed-point types.

    • This will require some custom LU code. See the fixed-point type examples for MATLAB functions.
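As a preview of the first bullet, here is the kind of LU code I have in mind, sketched in pure Python rather than MATLAB (`lu_pivot` and its pivot tolerance are my own assumptions). With partial pivoting, singularity is easy to recognize: a pivot that is (near) zero even after row swaps flags the singular case.

```python
def lu_pivot(A, tiny=1e-12):
    """In-place LU factorization with partial pivoting.
    Returns (A, piv, singular); a tiny pivot flags a (near-)singular matrix."""
    n = len(A)
    piv = list(range(n))
    singular = False
    for k in range(n):
        # pick the largest-magnitude entry in column k as the pivot
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        if abs(A[p][k]) <= tiny:
            singular = True   # a full solver would switch to a fallback direction here
            continue
        A[k], A[p] = A[p], A[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return A, piv, singular

_, _, sing_rank1 = lu_pivot([[1.0, 2.0], [2.0, 4.0]])   # rank-deficient matrix
_, _, sing_perm = lu_pivot([[0.0, 1.0], [1.0, 0.0]])    # fine once rows are swapped
```

The `singular` flag is exactly the hook a Newton solver needs: when the Jacobian's factorization trips it, the solver can fall back to a replacement direction instead of dividing by a tiny pivot.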

Conclusion

Newton iteration with a line search, plus \(LU\) factorization to detect the singular case, looks to be a very robust nonlinear equation solver.