Notes on Nonlinear Programming #
tip
- from classes of OD
Problem Setup #
- nonlinear programming (NLP) minimizes a nonlinear objective, possibly subject to nonlinear constraints: \[\min_{x}\; f(x)\quad \text{s.t.}\quad g_i(x)\le 0,\; h_j(x)=0\]
- core challenge:
- objective geometry can be curved and nonconvex
- constraints can define complex feasible regions
- local minima can differ strongly from global minima
Lagrangian and KKT Conditions #
- define the Lagrangian: \[\mathcal{L}(x,\lambda,\nu)=f(x)+\sum_i \lambda_i g_i(x)+\sum_j \nu_j h_j(x)\]
- first-order necessary conditions (under regularity assumptions):
- stationarity: \[\nabla_x \mathcal{L}(x^\star,\lambda^\star,\nu^\star)=0\]
- primal feasibility: \[g_i(x^\star)\le0,\quad h_j(x^\star)=0\]
- dual feasibility: \[\lambda_i^\star \ge 0\]
- complementary slackness: \[\lambda_i^\star g_i(x^\star)=0\]
important
- KKT gives candidate solutions; for nonconvex NLP it does not guarantee global optimality.
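The conditions above can be checked numerically on a toy problem. This is a minimal sketch with a hand-picked example (the problem, its closed-form KKT pair, and all names are assumptions chosen for illustration, not from the notes):

```python
import numpy as np

# Toy problem (assumption, chosen so the KKT pair is known in closed form):
# minimize f(x) = x^2 subject to g(x) = 1 - x <= 0, i.e. x >= 1.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x); solving by hand gives
# x* = 1, lam* = 2.
f_grad = lambda x: 2.0 * x
g = lambda x: 1.0 - x
g_grad = lambda x: -1.0

x_star, lam_star = 1.0, 2.0

stationarity = f_grad(x_star) + lam_star * g_grad(x_star)  # want 0
comp_slack = lam_star * g(x_star)                          # want 0
print("stationarity residual:", stationarity)
print("complementarity residual:", comp_slack)
print("primal feasible:", g(x_star) <= 0, "dual feasible:", lam_star >= 0)
```

The same residual check applies to solver output: small stationarity and complementarity residuals are evidence (not proof) that a candidate satisfies the KKT system.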
Convex vs Nonconvex Programs #
convex program:
- convex objective + convex inequality constraints + affine equalities
- every local optimum is global
- strong duality typically holds (e.g., under Slater's condition), and solvers behave more predictably
nonconvex program:
- may have multiple local minima/saddles
- initialization strongly impacts final solution
- global methods buy optimality guarantees at a much higher computational cost
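The initialization sensitivity can be demonstrated with a multi-start run. A minimal sketch, assuming an illustrative 1-D objective (the function and the grid of starts are my own choices, not from the notes):

```python
import numpy as np
from scipy.optimize import minimize

# Nonconvex 1-D objective (illustrative assumption): the sine term creates
# several local minima; the quadratic term keeps iterates bounded.
def f(x):
    return np.sin(3.0 * x[0]) + 0.1 * x[0] ** 2

# Multi-start: run a local solver from a grid of initial points and
# collect every distinct minimizer found.
minimizers = []
for x0 in np.linspace(-2.0, 2.0, 9):
    res = minimize(f, np.array([x0]), method="BFGS")
    minimizers.append(round(float(res.x[0]), 2))

distinct = sorted(set(minimizers))
print("distinct local minimizers:", distinct)
best = min(distinct, key=lambda m: f(np.array([m])))
print("best found:", best, float(f(np.array([best]))))
```

Different starts land in different basins; keeping the best of the batch is the usual cheap heuristic when no global guarantee is needed.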
Major Algorithm Families #
gradient-based unconstrained methods
- steepest descent: low per-step cost, slower convergence
- Newton / quasi-Newton (BFGS, L-BFGS): faster near optimum using curvature
constrained methods
- projected gradient: simple constraints (box, simplex)
- penalty / augmented Lagrangian: move constraints into objective
- interior-point (barrier): stay inside feasible region
- SQP (Sequential Quadratic Programming): solve QP subproblems iteratively
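Projected gradient is the simplest of these to sketch. A minimal example, assuming a box constraint and a quadratic objective of my own choosing (not from the notes):

```python
import numpy as np

# Projected gradient on a box-constrained quadratic (illustrative
# assumption): minimize f(x) = ||x - c||^2 over the box [0, 1]^2,
# with c placed outside the box so constraints are active at the solution.
c = np.array([1.5, -0.5])
grad = lambda x: 2.0 * (x - c)
project = lambda x: np.clip(x, 0.0, 1.0)   # Euclidean projection onto the box

x = np.array([0.5, 0.5])
for _ in range(100):
    x = project(x - 0.1 * grad(x))          # gradient step, then project

print(x)  # converges to the box point closest to c: [1.0, 0.0]
```

The method only works when the projection is cheap (boxes, simplices, balls), which is exactly the restriction noted above.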
Line Search and Trust Region #
- line search update: \[x_{k+1} = x_k + \alpha_k d_k\]
- choose step length \(\alpha_k\) via Wolfe/Armijo conditions to ensure sufficient decrease
trust-region update solves: \[\min_{\|p\|\le\Delta_k}\; f(x_k) + \nabla f(x_k)^\top p + \tfrac{1}{2}\, p^\top B_k\, p\]
- trust-region methods are robust when Hessians are indefinite or badly conditioned
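The Armijo backtracking rule above can be sketched in a few lines. This is an assumed minimal implementation paired with steepest descent on a quadratic I chose for illustration (function, constants, and the `armijo_step` name are not from the notes):

```python
import numpy as np

# Backtracking line search enforcing the Armijo (sufficient-decrease)
# condition: accept alpha once
#   f(x + alpha*d) <= f(x) + c1 * alpha * grad_f(x) @ d
def armijo_step(f, grad_f, x, d, alpha0=1.0, c1=1e-4, shrink=0.5):
    fx, slope = f(x), grad_f(x) @ d   # slope < 0 for a descent direction
    alpha = alpha0
    for _ in range(30):               # safeguard against endless backtracking
        if f(x + alpha * d) <= fx + c1 * alpha * slope:
            break
        alpha *= shrink
    return alpha

# Steepest descent with Armijo backtracking on an ill-scaled quadratic
# (illustrative assumption): f(x) = x0^2 + 3*x1^2.
f = lambda x: x[0] ** 2 + 3.0 * x[1] ** 2
grad_f = lambda x: np.array([2.0 * x[0], 6.0 * x[1]])

x = np.array([2.0, -1.0])
for _ in range(100):
    d = -grad_f(x)
    x = x + armijo_step(f, grad_f, x, d) * d

print("x =", x, "f(x) =", float(f(x)))
```

Full Wolfe conditions add a curvature test on top of Armijo; backtracking alone is usually enough for gradient-type methods.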
Practical Workflow #
- scale decision variables and constraints before solving
- provide gradients/Jacobians/Hessians when possible
- pick solver based on structure:
- unconstrained smooth: L-BFGS / Newton-CG
- box constraints: L-BFGS-B
- general constraints: SLSQP / trust-constr / IPOPT
- run multiple starts for nonconvex objectives
- verify KKT residuals and sensitivity to initialization
```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def f(x):
    return (x[0] - 1.5)**2 + (x[1] + 0.5)**2 + np.sin(3.0 * x[0]) * 0.1

def g(x):  # inequality written as g(x) <= 0
    return x[0]**2 + x[1]**2 - 1.5

x0 = np.array([0.2, 0.2])
nlc = NonlinearConstraint(g, -np.inf, 0.0)
bounds = [(-2.0, 2.0), (-2.0, 2.0)]

res = minimize(f, x0, method="SLSQP", constraints=[nlc], bounds=bounds)
print(res.x, res.fun, res.success)
```