Goals:
See the pitfalls of Newton’s method
Understand the ND Newton-Raphson method
Be aware of line search algorithms for improved convergence.
‘Global’ convergence#
There are several options to modify the Newton-Raphson method in order to enhance the robustness of root finding, but the improvement in robustness has to be weighed against the computational expense.
We have to assume our initial guess is reasonable, so the goal is to ensure the iterates don’t wander too far from it.
Line search#
If we trust that \(\Delta \vec{x}\) is pointing in the right direction, then we should make sure it doesn’t step too far.
These approaches are called line search algorithms since we are moving along the direction prescribed by \(\Delta \vec{x}\).
Damped Newton-Raphson#
The easiest modification is adding a damping term. Calculate \(\Delta \vec{x}\) as usual from \(J \Delta \vec{x} = - \vec{f}\), but scale the update by a scalar factor \(\alpha\):

$$ \vec{x}^{i+1} = \vec{x}^i + \alpha \Delta \vec{x} $$
For \(\alpha =1\) we recover the standard Newton-Raphson method.
For \(0 \lt \alpha \lt 1\) the method is underdamped, and the solver is forced to take small steps, keeping it closer to the guess but at the cost of convergence speed.
For \(\alpha \gt 1\) the method is overdamped, and the solver’s steps are exaggerated. This may seem like a good idea to speed things up, but if you are not careful you will constantly overshoot your root or produce frenetic behaviour!
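As a concrete illustration, here is a minimal sketch of the damped iteration, assuming the residual \(\vec{f}\) and Jacobian \(J\) are available as callables (the names `f`, `J` and `damped_newton` are placeholders, not a particular library’s API):

```python
import numpy as np

def damped_newton(f, J, x0, alpha=0.5, tol=1e-10, max_iter=50):
    """Newton-Raphson with a fixed damping factor alpha (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        # Solve J dx = -f for the full Newton step ...
        dx = np.linalg.solve(J(x), -fx)
        # ... but only move a fraction alpha along it
        x = x + alpha * dx
    return x
```

With `alpha=1` this reduces to the standard update; smaller values trade convergence speed for robustness.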
Optimal step size#
From the damped approach, we can also attempt to find a ‘good’ step size algorithmically. Near the root, we know we want \(\alpha=1\) to get quadratic convergence, so this is often a starting point.
There are several approaches which leverage the information we have:
\(\vec{f}(\vec{x})\)
\(J(\vec{x})\)
\(\vec{f}(\vec{x}+\Delta \vec{x})\)
Fit a quadratic#
With these 3 pieces of information, we can fit a quadratic to the residual norm along the search direction (i.e. as a function of \(\alpha\)) and choose its minimum for the update.
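A possible sketch of this, measuring the residual by \(g(\alpha)=\| \vec{f}(\vec{x}+\alpha \Delta \vec{x}) \|^2\) (the function name `quadratic_step` and the clamping range are illustrative choices, not prescribed by the notes):

```python
import numpy as np

def quadratic_step(f, x, dx):
    """Pick a step size by fitting a quadratic to g(a) = ||f(x + a*dx)||^2."""
    g0 = np.linalg.norm(f(x))**2          # g(0), from f(x)
    g1 = np.linalg.norm(f(x + dx))**2     # g(1), from f(x + dx)
    # Since dx solves J dx = -f, the directional derivative is g'(0) = -2*g0
    gp0 = -2.0 * g0
    # Quadratic model q(a) = g0 + gp0*a + c*a^2, with c chosen so q(1) = g1
    c = g1 - g0 - gp0
    if c <= 0:
        return 1.0                        # no interior minimum: keep the full step
    alpha = -gp0 / (2.0 * c)              # minimiser of the quadratic model
    return min(max(alpha, 0.1), 1.0)      # clamp to a sensible range (arbitrary choice)
```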
Backtracking#
The backtracking routine starts with \(\alpha=1\) and then repeatedly reduces it (e.g. by halving) until the residual is sufficiently reduced.
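A minimal backtracking sketch, assuming we simply require the residual norm to decrease before accepting the step (`rho` and `max_halvings` are illustrative parameters):

```python
import numpy as np

def backtracking(f, x, dx, rho=0.5, max_halvings=8):
    """Start from alpha = 1 and shrink it until the residual norm decreases."""
    f0 = np.linalg.norm(f(x))
    alpha = 1.0
    for _ in range(max_halvings):
        if np.linalg.norm(f(x + alpha * dx)) < f0:
            return alpha      # this step actually reduces the residual: accept it
        alpha *= rho          # otherwise backtrack, e.g. halve alpha
    return alpha
```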
Bounded minimization#
Find \(\alpha\) in the range (0,1] that minimizes some measure of residual, e.g. \(\| \vec{f}(\vec{x}+\alpha \Delta \vec{x}) \|\).
This is a 1D minimization problem (similar to our 1D root finding) and can use similar tools, e.g. the secant method. The tradeoff here is the cost of this minimization vs just taking another Newton step.
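As one possible realisation, the 1D problem could be handed to an off-the-shelf bounded scalar minimizer, e.g. SciPy’s `minimize_scalar` (the wrapper `line_search_bounded` below is just a sketch; the lower bound is kept slightly above zero to avoid a zero step):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def line_search_bounded(f, x, dx):
    """Minimize ||f(x + alpha*dx)|| over alpha in (0, 1] with a 1D bounded minimizer."""
    phi = lambda alpha: np.linalg.norm(f(x + alpha * dx))
    res = minimize_scalar(phi, bounds=(1e-6, 1.0), method='bounded')
    return res.x
```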