Broyden’s Method
Source inspiration: (Mathew 2000-2019).
Description
Solving a nonlinear system \(\mathbf{F}(\mathbf{X}) = \mathbf{0}\) in \(n\) dimensions by Newton’s method is powerful but expensive: every iteration requires computing all \(n^2\) partial derivatives that form the Jacobian matrix \(J(\mathbf{X}^{(k)})\). In large problems this cost can dominate the computation. Broyden’s method eliminates that bottleneck by maintaining an approximate Jacobian that is cheaply updated at each step using only the change in function values — no additional derivative or finite-difference evaluations are needed after the first iteration.
Newton’s Method for Systems (the baseline)
For a vector function \(\mathbf{F}(\mathbf{X}) = \bigl(f_1(\mathbf{X}),\dots,f_n(\mathbf{X})\bigr)^T\), Newton’s method linearizes \(\mathbf{F}\) at the current iterate and solves the resulting \(n \times n\) linear system:
\[J\!\left(\mathbf{X}^{(k)}\right)\Delta\mathbf{X}^{(k)} = -\mathbf{F}\!\left(\mathbf{X}^{(k)}\right), \qquad \mathbf{X}^{(k+1)} = \mathbf{X}^{(k)} + \Delta\mathbf{X}^{(k)}.\]
It converges quadratically near a simple root but demands \(n^2\) new partial-derivative evaluations each step. See Nonlinear Systems for the full treatment.
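The Newton step above can be sketched in a few lines. This is a minimal Python illustration (the linked animation scripts are in Julia); `newton_step`, `F`, and `J` are names chosen here, not from the source.

```python
import numpy as np

def newton_step(F, J, x):
    """One Newton iteration for F(X) = 0:
    solve J(X) dX = -F(X), then return X + dX."""
    dx = np.linalg.solve(J(x), -F(x))
    return x + dx
```

Each call requires a fresh Jacobian `J(x)`, which is exactly the cost Broyden's method avoids.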
Reducing Cost with a Finite-Difference Jacobian
One obvious improvement replaces each partial derivative with a central-difference quotient. Entry \((i,j)\) of the approximate Jacobian \(M\) becomes:
\[M_{ij} \approx \frac{F_i(\mathbf{X} + h\,\mathbf{e}_j) - F_i(\mathbf{X} - h\,\mathbf{e}_j)}{2h},\]
where \(\mathbf{e}_j\) is the \(j\)-th unit vector and \(h\) is a small step size. This “Pseudo-Newton” or finite-difference Newton approach trades symbolic derivatives for \(2n^2\) additional function evaluations per iteration. For large \(n\) this is still expensive.
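The finite-difference Jacobian can be assembled column by column. A minimal Python sketch (the function name `fd_jacobian` and the step size default are assumptions, not from the source):

```python
import numpy as np

def fd_jacobian(F, X, h=1e-6):
    """Approximate the Jacobian of F at X by central differences.

    Each column j needs F at X + h*e_j and X - h*e_j, so building the
    full matrix costs 2n vector evaluations (2n^2 scalar ones)."""
    n = X.size
    M = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        M[:, j] = (F(X + e) - F(X - e)) / (2.0 * h)
    return M
```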
The Broyden Update: A Quasi-Newton Method
Broyden’s method goes further: instead of recomputing (or re-approximating) the full Jacobian at every step, it updates the previous approximate Jacobian \(B_k\) using only the information gained in the last iteration. The key requirement is the quasi-Newton (secant) condition — the new approximate Jacobian must satisfy
\[B_{k+1}\,\mathbf{s}_k = \mathbf{y}_k,\]
where \(\mathbf{s}_k = \mathbf{X}^{(k+1)} - \mathbf{X}^{(k)}\) is the step taken and \(\mathbf{y}_k = \mathbf{F}(\mathbf{X}^{(k+1)}) - \mathbf{F}(\mathbf{X}^{(k)})\) is the corresponding change in function values. This is the multidimensional analogue of the scalar secant equation \(f'(x) \approx (f(x_1) - f(x_0))/(x_1 - x_0)\).
The secant condition alone does not uniquely determine \(B_{k+1}\) when \(n > 1\). Broyden’s rank-1 update resolves the ambiguity by choosing the matrix closest to \(B_k\) (in the Frobenius sense) that satisfies the secant condition:
\[B_{k+1} = B_k + \frac{\left(\mathbf{y}_k - B_k\,\mathbf{s}_k\right)\mathbf{s}_k^T}{\mathbf{s}_k^T\,\mathbf{s}_k}.\]
This is a least-change secant update: the correction to \(B_k\) has rank 1, so only \(O(n^2)\) arithmetic operations are needed to update the approximate Jacobian, and no new function evaluations are required beyond computing \(\mathbf{F}(\mathbf{X}^{(k+1)})\).
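The update formula is one line of code. A Python sketch (`broyden_update` is an illustrative name); note that the result satisfies the secant condition \(B_{k+1}\,\mathbf{s}_k = \mathbf{y}_k\) exactly:

```python
import numpy as np

def broyden_update(B, s, y):
    """Rank-1 least-change secant update:
    returns B' = B + (y - B s) s^T / (s^T s), so that B' s = y."""
    return B + np.outer(y - B @ s, s) / (s @ s)
```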
Algorithm
Step 0. Compute the exact Jacobian \(B_0 = J(\mathbf{X}^{(0)})\). Apply one Newton step to obtain the first iterate:
\[\mathbf{X}^{(1)} = \mathbf{X}^{(0)} - B_0^{-1}\,\mathbf{F}\!\left(\mathbf{X}^{(0)}\right).\]
For \(k = 1, 2, \dots\):
- Evaluate \(\mathbf{F}(\mathbf{X}^{(k)})\).
- Update the approximate Jacobian: \[B_k = B_{k-1} + \frac{\left(\mathbf{F}(\mathbf{X}^{(k)}) - \mathbf{F}(\mathbf{X}^{(k-1)}) - B_{k-1}\,\mathbf{s}_{k-1}\right)\mathbf{s}_{k-1}^T}{\mathbf{s}_{k-1}^T\,\mathbf{s}_{k-1}}.\]
- Solve \(B_k\,\Delta\mathbf{X} = -\mathbf{F}(\mathbf{X}^{(k)})\) for the step \(\Delta\mathbf{X}\).
- Update: \(\mathbf{X}^{(k+1)} = \mathbf{X}^{(k)} + \Delta\mathbf{X}\).
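The steps above can be combined into a compact solver. This is a minimal Python sketch (the linked scripts are in Julia); the function name, tolerance, and iteration cap are assumptions made for illustration:

```python
import numpy as np

def broyden(F, J, x0, tol=1e-10, maxit=50):
    """Broyden's method: one exact Jacobian, then rank-1 secant updates."""
    x = np.asarray(x0, dtype=float)
    B = J(x)                            # Step 0: exact Jacobian at X^(0)
    Fx = F(x)
    for _ in range(maxit):
        s = np.linalg.solve(B, -Fx)     # solve B_k dX = -F(X^(k))
        x = x + s
        Fnew = F(x)
        if np.linalg.norm(Fnew) < tol:
            return x
        y = Fnew - Fx                    # change in function values
        B = B + np.outer(y - B @ s, s) / (s @ s)   # Broyden rank-1 update
        Fx = Fnew
    return x
```

Only one linear solve and one evaluation of \(\mathbf{F}\) are needed per iteration; the Jacobian is never recomputed after Step 0.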
Convergence
Broyden’s method converges superlinearly near a simple root — faster than linear, but slower than Newton’s quadratic rate. In practice, because each iteration is far cheaper than a full Newton step, Broyden’s method often reaches a given accuracy in less total work than Newton’s method, especially in high dimensions. Convergence can fail if the starting point is far from a root or if the approximate Jacobian becomes ill-conditioned.
Animations
Each animation below shows Broyden’s method converging on the nonlinear system
\[f_1(x,y) = 1 - 4x + 2x^2 - 2y^2 = 0, \qquad f_2(x,y) = -4 + x^2 + 4y + y^2 = 0.\]
The blue curve is \(f_1 = 0\) and the red curve is \(f_2 = 0\); their intersections are the solutions. Each frame adds the next iterate \(\mathbf{X}^{(k)}\), connected by a line so the path toward the root is visible.
Julia source scripts that generated these animations are linked under each case.
Case A — Convergent, \(\mathbf{X}^{(0)} = (0,\,1)\), root \(\approx (-0.088,\, 0.827)\)
Behavior: Starting near the upper-left intersection, the initial exact Newton step lands close to the root and the subsequent Broyden quasi-Newton updates converge superlinearly. The approximate Jacobian is updated cheaply by the rank-1 formula without any new derivative evaluations.

Case B — Convergent, \(\mathbf{X}^{(0)} = (2.5,\,-1.5)\), root \(\approx (2.808,\,-1.665)\)
Behavior: Starting near the lower-right intersection, the same Broyden algorithm converges to the second solution. This case illustrates the method’s dependence on the starting point: a different initial guess finds a different root.
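Both cases can be reproduced with a short self-contained Python script (the animation scripts themselves are Julia); the exact Jacobian of the system is used only at the start, as in Step 0 of the algorithm:

```python
import numpy as np

def F(v):
    x, y = v
    return np.array([1 - 4*x + 2*x**2 - 2*y**2,
                     -4 + x**2 + 4*y + y**2])

def J(v):
    x, y = v
    return np.array([[-4 + 4*x, -4*y],
                     [2*x,      4 + 2*y]])

def solve(x0, tol=1e-10, maxit=50):
    """Broyden iteration for the example system above."""
    x = np.asarray(x0, dtype=float)
    B, Fx = J(x), F(x)
    for _ in range(maxit):
        s = np.linalg.solve(B, -Fx)
        x = x + s
        Fnew = F(x)
        if np.linalg.norm(Fnew) < tol:
            break
        B = B + np.outer(Fnew - Fx - B @ s, s) / (s @ s)
        Fx = Fnew
    return x
```

Starting from \((0, 1)\) the iteration lands on the root near \((-0.088, 0.827)\); starting from \((2.5, -1.5)\) it lands on the root near \((2.808, -1.665)\).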
