Trust region methods#
Trust region methods define a region around the current guess within which the objective function is trusted to behave like some model function (to within some tolerance). They are designed to improve robustness on difficult surfaces, including indefinite and ill-conditioned ones.
They are somewhat dual to line search methods:

- Line search: choose a direction first, then search along that line for the step length.
- Trust region: choose a step length (or limit on it) first, then choose the direction.
The model function can be linear (based on the gradient at the current point) or quadratic (based on the gradient and the Hessian).
Algorithms differ by the criteria for updating the trust region, selection of the next guess, and what to do if the trust region fails.
(E.g.: if the new step deviates too far from the model, does one shrink the trust region for the next step, or reject the current step and recalculate it?)
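As a sketch of this idea, the pure-Python loop below minimizes a one-variable function using a quadratic model, accepting or rejecting each step based on the ratio of actual to predicted decrease. The function `trust_region_1d`, the thresholds (0.25, 0.75, 0.1), and the shrink/grow factors are illustrative choices, not any particular library's algorithm:

```python
import math

def trust_region_1d(f, grad, hess, x, radius=1.0, tol=1e-8, max_iter=100):
    """Minimal 1-D trust-region loop with a quadratic model (illustrative)."""
    for _ in range(max_iter):
        g, h = grad(x), hess(x)
        if abs(g) < tol:
            break
        # Step minimizing the model m(p) = f(x) + g*p + 0.5*h*p**2,
        # clipped to the trust region |p| <= radius.
        p = -g / h if h > 0 else -math.copysign(radius, g)
        p = max(-radius, min(radius, p))
        predicted = -(g * p + 0.5 * h * p * p)   # decrease the model predicts
        actual = f(x) - f(x + p)                 # decrease actually achieved
        rho = actual / predicted if predicted > 0 else 0.0
        if rho < 0.25:                  # poor agreement: shrink the region
            radius *= 0.25
        elif rho > 0.75 and abs(p) == radius:
            radius *= 2.0               # good agreement at the boundary: grow
        if rho > 0.1:                   # accept only genuinely useful steps
            x += p
    return x

# Minima of (x**2 - 1)**2 lie at x = ±1; starting at x = 2 we reach x = 1.
f1 = lambda x: (x**2 - 1)**2
x_min = trust_region_1d(f1, lambda x: 4*x*(x**2 - 1),
                        lambda x: 12*x**2 - 4, 2.0)
```

Note how the trust radius, not a line search, limits the step: when the model predicts the decrease well, the region grows; when it predicts poorly, the region shrinks and the step may be rejected.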
Comparison of optimization methods discussed so far#
Category | Optimization Technique | Method Name
---|---|---
Line Search | Broyden-Fletcher-Goldfarb-Shanno (BFGS) | `BFGS`
 | Newton-Conjugate-Gradient | `Newton-CG`
 | Limited-memory BFGS | `L-BFGS-B`
 | Sequential Least Squares Programming | `SLSQP`
Trust Region | Trust-Region Newton-Conjugate-Gradient | `trust-ncg`
 | Trust-Region Truncated Generalized Lanczos | `trust-krylov`
 | Trust-Region Nearly Exact | `trust-exact`
Direct | Nelder-Mead Simplex | `Nelder-Mead`
 | Powell’s Method | `Powell`
Gradient Descent | Conjugate Gradient | `CG`
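These methods can be compared empirically by calling `scipy.optimize.minimize` with each method string on the same problem. The sketch below uses Himmelblau's function (the same test function as the example below); which arguments each method needs varies, so `jac` and `hess` are passed only where required or useful:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    """Himmelblau's function: four local minima, all with f = 0."""
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

def grad_f(x):
    x1, x2 = x
    return np.array([4*x1*(x1**2 + x2 - 11) + 2*(x1 + x2**2 - 7),
                     2*(x1**2 + x2 - 11) + 4*x2*(x1 + x2**2 - 7)])

def hess_f(x):
    x1, x2 = x
    return np.array([[12*x1**2 + 4*x2 - 42, 4*(x1 + x2)],
                     [4*(x1 + x2), 12*x2**2 + 4*x1 - 26]])

x0 = np.array([1.5, 2.0])
for method in ['Nelder-Mead', 'CG', 'BFGS', 'Newton-CG',
               'trust-ncg', 'trust-krylov', 'trust-exact']:
    kwargs = {}
    if method != 'Nelder-Mead':          # gradient-free: no jac needed
        kwargs['jac'] = grad_f
    if method in ('Newton-CG', 'trust-ncg', 'trust-krylov', 'trust-exact'):
        kwargs['hess'] = hess_f          # Newton/trust-region need the Hessian
    res = minimize(f, x0, method=method, **kwargs)
    print(f"{method:12s} nit={res.nit:3d} nfev={res.nfev:4d} f={res.fun:.2e}")
```

Typically the Hessian-based trust-region methods converge in the fewest iterations, at the cost of requiring second derivatives.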
Example - Compare optimizers#
def f(x):
    """Himmelblau's function."""
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

def grad_f(x):
    """Gradient of the function f."""
    x1, x2 = x
    df_dx1 = 4 * (x1**2 + x2 - 11) * x1 + 2 * (x1 + x2**2 - 7)
    df_dx2 = 2 * (x1**2 + x2 - 11) + 4 * (x1 + x2**2 - 7) * x2
    return np.array([df_dx1, df_dx2])

def hess_f(x):
    """Hessian of the function f."""
    x1, x2 = x
    d2f_dx1dx1 = 12 * x1**2 + 4 * x2 - 44 + 2
    d2f_dx1dx2 = 4 * x1 + 4 * x2
    d2f_dx2dx1 = 4 * x1 + 4 * x2
    d2f_dx2dx2 = 2 + 4 * x1 + 12 * x2**2 - 28
    return np.array([[d2f_dx1dx1, d2f_dx1dx2], [d2f_dx2dx1, d2f_dx2dx2]])
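Before handing analytic derivatives to an optimizer, it is worth sanity-checking them against finite differences. The helpers `fd_grad` and `fd_hess` below are introduced here for illustration (they are not SciPy functions), and the step size `h=1e-6` is an arbitrary but typical choice:

```python
import numpy as np

def f(x):
    return (x[0]**2 + x[1] - 11)**2 + (x[0] + x[1]**2 - 7)**2

def grad_f(x):
    x1, x2 = x
    return np.array([4*x1*(x1**2 + x2 - 11) + 2*(x1 + x2**2 - 7),
                     2*(x1**2 + x2 - 11) + 4*x2*(x1 + x2**2 - 7)])

def hess_f(x):
    x1, x2 = x
    return np.array([[12*x1**2 + 4*x2 - 42, 4*(x1 + x2)],
                     [4*(x1 + x2), 12*x2**2 + 4*x1 - 26]])

def fd_grad(fun, x, h=1e-6):
    """Central-difference approximation to the gradient."""
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x))
        e[i] = h
        g[i] = (fun(x + e) - fun(x - e)) / (2 * h)
    return g

def fd_hess(gradfun, x, h=1e-6):
    """Central-difference Hessian, built column by column from the gradient."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        H[:, i] = (gradfun(x + e) - gradfun(x - e)) / (2 * h)
    return H

x = np.array([1.5, 2.0])
print(np.max(np.abs(grad_f(x) - fd_grad(f, x))))      # round-off small
print(np.max(np.abs(hess_f(x) - fd_hess(grad_f, x)))) # round-off small
```

If either difference is large, the analytic derivative has a sign or coefficient error, which typically shows up as stalled or erratic convergence in gradient-based optimizers.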
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
# Create a grid of x and y values
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
# Calculate the function values for each point on the grid
Z = f([X, Y])
# Initial guess
x0 = np.array([1.5, 2])
guesses = [x0]
def callback(xk):
    guesses.append(xk)
# Try other methods by changing the string below:
# 'BFGS', 'trust-ncg', 'Newton-CG', 'Nelder-Mead', 'CG', 'trust-exact'
result = minimize(f, x0, method='trust-exact', jac=grad_f, hess=hess_f, callback=callback)
# Create a contour plot
plt.contour(X, Y, Z, levels=20)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Contour Plot of f(x,y)')
# Plot the path of the guesses
x_guesses = [guess[0] for guess in guesses]
y_guesses = [guess[1] for guess in guesses]
plt.plot(x_guesses, y_guesses, marker='o', linestyle='-', color='red')
plt.show()