Continuous Optimization: Theory, Algorithms, and Duality
Optimization sits at the heart of applied mathematical analysis, providing the framework for decision-making in economics, engineering, and artificial intelligence. This lesson gives a rigorous treatment of extrema in $\mathbb{R}^n$, moving from unconstrained local analysis to constrained global optimization.
1. Extremum Problems in $\mathbb{R}^n$
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a scalar field. We are interested in finding points $x^* \in \mathbb{R}^n$ such that $f(x^*)$ is as small (or large) as possible.
Definition 1.1 (Local Extremum). A point $x^*$ is a local minimum of $f$ if there exists an open neighborhood $U$ of $x^*$ such that $f(x^*) \le f(x)$ for all $x \in U$. If the inequality is strict for $x \ne x^*$, it is a strict local minimum.
Definition 1.2 (Global Extremum). $x^*$ is a global minimum if $f(x^*) \le f(x)$ for all $x$ in the domain of $f$.
The Extreme Value Theorem guarantees the existence of global extrema if $f$ is continuous and its domain $K$ is compact (closed and bounded in $\mathbb{R}^n$). In non-compact domains, we must analyze the behavior of $f(x)$ as $\|x\| \to \infty$; for example, $f(x) = e^{-x}$ on $[0, \infty)$ has infimum $0$ but attains no minimum.
2. First-Order Necessary Conditions (FONC)
For a differentiable function $f$ to have a local extremum at an interior point $x^*$, it must be “flat” in all directions.
Theorem 2.1 (Vanishing Gradient). If $f$ is differentiable at $x^*$ and $x^*$ is a local extremum, then $\nabla f(x^*) = 0$, where $\nabla f(x^*) = \left( \frac{\partial f}{\partial x_1}(x^*), \dots, \frac{\partial f}{\partial x_n}(x^*) \right)$.
Proof Sketch: Consider the one-dimensional functions $g_i(t) = f(x^* + t e_i)$, where $e_i$ is the $i$-th standard basis vector. Since $x^*$ is a local extremum of $f$, $t = 0$ must be a local extremum of each $g_i$. By single-variable calculus, $g_i'(0) = 0$. Since $g_i'(0) = \frac{\partial f}{\partial x_i}(x^*)$, all partial derivatives must vanish.
Points where $\nabla f(x) = 0$ are called stationary or critical points. Note that this condition is necessary but not sufficient (e.g., $f(x) = x^3$ at $x = 0$).
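As a concrete illustration (the function below is our own example, not part of the lesson's running problem), take $f(x, y) = x^3 - 3xy + y^3$. Solving $\nabla f = (3x^2 - 3y,\; 3y^2 - 3x) = 0$ by hand gives the critical points $(0, 0)$ and $(1, 1)$; the short sketch below simply confirms that the analytic gradient vanishes there.

import numpy as np

# Illustrative example (not from the lesson): f(x, y) = x^3 - 3*x*y + y^3
def grad_f(p):
    x, y = p
    # Analytic gradient (df/dx, df/dy)
    return np.array([3 * x**2 - 3 * y, 3 * y**2 - 3 * x])

# Critical points found by solving grad f = 0 by hand
for point in [(0.0, 0.0), (1.0, 1.0)]:
    print(point, grad_f(point))  # both should print the zero vector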
3. Second-Order Conditions and Stability
To determine whether a critical point is a minimum, maximum, or saddle point, we analyze the curvature using the Hessian matrix $H_f(x)$.
Definition 3.1 (The Hessian). $H_f(x) = \left[ \dfrac{\partial^2 f}{\partial x_i \, \partial x_j}(x) \right]_{i, j = 1}^{n}$, which is symmetric when $f$ is twice continuously differentiable.
Theorem 3.2 (Second-Order Sufficiency). Let $\nabla f(x^*) = 0$ for a twice continuously differentiable function $f$, and write $H = H_f(x^*)$.
- If $H$ is positive definite ($z^T H z > 0$ for all $z \ne 0$), $x^*$ is a strict local minimum.
- If $H$ is negative definite ($z^T H z < 0$ for all $z \ne 0$), $x^*$ is a strict local maximum.
- If $H$ is indefinite (has both positive and negative eigenvalues), $x^*$ is a saddle point. (These three cases are checked numerically in the sketch after this list.)
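Continuing the illustrative function $f(x, y) = x^3 - 3xy + y^3$ from Section 2 (again our own example), its Hessian is $H_f(x, y) = \begin{pmatrix} 6x & -3 \\ -3 & 6y \end{pmatrix}$, and the eigenvalue test classifies the two critical points:

import numpy as np

# Hessian of the illustrative function f(x, y) = x^3 - 3*x*y + y^3
def hessian_f(p):
    x, y = p
    return np.array([[6 * x, -3.0],
                     [-3.0, 6 * y]])

for point in [(0.0, 0.0), (1.0, 1.0)]:
    eigenvalues = np.linalg.eigvalsh(hessian_f(point))  # real eigenvalues of a symmetric matrix
    print(point, eigenvalues)

# Expected classification:
#   (0, 0): eigenvalues -3 and 3  -> indefinite        -> saddle point
#   (1, 1): eigenvalues  3 and 9  -> positive definite -> strict local minimum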
4. Convexity: The Bridge to Global Optimality
Convexity is arguably more important than differentiability in modern optimization.
Definition 4.1 (Convex Set). A set $C \subseteq \mathbb{R}^n$ is convex if for any $x, y \in C$, the line segment $\{\theta x + (1 - \theta) y : \theta \in [0, 1]\}$ is contained in $C$.
Definition 4.2 (Convex Function). A function $f : C \to \mathbb{R}$ on a convex set $C$ is convex if $f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$ for all $x, y \in C$ and $\theta \in [0, 1]$.
Fundamental Property: If $f$ is a convex function on a convex set $C$, then any local minimum is a global minimum. This property makes convex optimization problems (like Least Squares or Support Vector Machines) computationally tractable and amenable to efficient numerical solvers.
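As a quick numerical check (a minimal sketch; the random problem data below are our own assumption, not from the lesson), consider the least-squares objective $f(x) = \|Ax - b\|^2$. Its Hessian is the constant matrix $2 A^T A$, whose eigenvalues are always nonnegative, so the objective is convex and any stationary point is a global minimum.

import numpy as np

# Arbitrary least-squares data, chosen only for illustration
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
b = rng.standard_normal(10)

# Hessian of ||Ax - b||^2 is the constant matrix 2 A^T A
H = 2 * A.T @ A
print(np.linalg.eigvalsh(H))  # all eigenvalues >= 0, so H is positive semidefinite and f is convex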
5. Equality Constrained Optimization: Lagrange Multipliers
Consider the problem:
$$\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad h_i(x) = 0, \quad i = 1, \dots, m,$$
where $f$ and the $h_i$ are continuously differentiable.
Theorem 5.1 (Lagrange’s Theorem). Let $x^*$ be a local extremum and a regular point of the constraints (meaning $\nabla h_1(x^*), \dots, \nabla h_m(x^*)$ are linearly independent). Then there exists a vector of Lagrange Multipliers $\lambda^* = (\lambda_1^*, \dots, \lambda_m^*)$ such that:
$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*) = 0.$$
We define the Lagrangian function: $\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i h_i(x)$. The necessary conditions for optimality are given by the system $\nabla_x \mathcal{L}(x^*, \lambda^*) = 0$, $\nabla_\lambda \mathcal{L}(x^*, \lambda^*) = 0$.
Geometric Intuition: At the optimum, the gradient of the objective must be orthogonal to the tangent space of the constraint manifold. Since the gradients of the constraints span the normal space at $x^*$, $\nabla f(x^*)$ must be expressible as a linear combination of these normal vectors.
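As a small worked example (chosen here for illustration; the same problem is solved numerically in Section 8), minimize $f(x, y) = x^2 + y^2$ subject to $h(x, y) = x + y - 1 = 0$. The Lagrangian is $\mathcal{L}(x, y, \lambda) = x^2 + y^2 + \lambda(x + y - 1)$, and setting its gradient to zero gives
$$
\begin{cases}
2x + \lambda = 0 \\
2y + \lambda = 0 \\
x + y - 1 = 0
\end{cases}
\quad\Longrightarrow\quad
x^* = y^* = \tfrac{1}{2}, \qquad \lambda^* = -1, \qquad f(x^*, y^*) = \tfrac{1}{2}.
$$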
6. Inequality Constraints and KKT Conditions
When constraints include inequalities ($g_j(x) \le 0$), we extend Lagrange’s method to the Karush-Kuhn-Tucker (KKT) conditions.
Consider minimizing $f(x)$ subject to $h_i(x) = 0$ ($i = 1, \dots, m$) and $g_j(x) \le 0$ ($j = 1, \dots, p$). For $x^*$ to be a local minimum, under a constraint qualification, there must exist multipliers $\lambda \in \mathbb{R}^m$ and $\mu \in \mathbb{R}^p$ such that:
- Stationarity: $\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla h_i(x^*) + \sum_{j=1}^{p} \mu_j \nabla g_j(x^*) = 0$
- Primal Feasibility: $h_i(x^*) = 0$ for all $i$, and $g_j(x^*) \le 0$ for all $j$
- Dual Feasibility: $\mu_j \ge 0$ for all $j$
- Complementary Slackness: $\mu_j \, g_j(x^*) = 0$ for all $j$
Complementary slackness is the most critical KKT condition: it implies that if a constraint is “slack” ($g_j(x^*) < 0$), its multiplier must be zero ($\mu_j = 0$). Conversely, if $\mu_j > 0$, the constraint must be “active” ($g_j(x^*) = 0$).
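To see the conditions at work, rewrite the worked example of Section 5 with an inequality: minimize $x^2 + y^2$ subject to $g(x, y) = 1 - x - y \le 0$. At the candidate point $x^* = (\tfrac{1}{2}, \tfrac{1}{2})$ with $\mu = 1$:
- Stationarity: $\nabla f(x^*) + \mu \nabla g(x^*) = (1, 1) + 1 \cdot (-1, -1) = (0, 0)$
- Primal Feasibility: $g(x^*) = 1 - \tfrac{1}{2} - \tfrac{1}{2} = 0 \le 0$
- Dual Feasibility: $\mu = 1 \ge 0$
- Complementary Slackness: $\mu \, g(x^*) = 1 \cdot 0 = 0$
All four conditions hold; the constraint is active and carries a strictly positive multiplier.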
7. Sensitivity and Shadow Prices
The multipliers provide a measure of the sensitivity of the optimal value to perturbations in the constraints. Suppose we relax a constraint $h_i(x) = 0$ to $h_i(x) = c_i$. Let $p^*(c)$ be the optimal objective value as a function of the perturbation vector $c = (c_1, \dots, c_m)$.
Theorem 7.1. Under regularity:
$$\frac{\partial p^*}{\partial c_i}\bigg|_{c = 0} = -\lambda_i^*$$
(with the sign convention $\mathcal{L} = f + \sum_i \lambda_i h_i$ of Section 5). In economics, this quantity $-\lambda_i^*$ is interpreted as the shadow price. It represents the marginal change in the optimal objective value if the $i$-th constraint is relaxed by one unit. This is vital for resource allocation problems where one must decide whether the cost of increasing a resource is justified by the resulting improvement in the objective.
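Returning to the worked example, relax $h(x, y) = x + y - 1 = 0$ to $x + y - 1 = c$. The optimum becomes $x = y = (1 + c)/2$, so
$$
p^*(c) = \frac{(1 + c)^2}{2},
\qquad
\frac{d p^*}{d c}\bigg|_{c = 0} = 1 = -\lambda^*,
$$
in agreement with the multiplier $\lambda^* = -1$ found in Section 5.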
8. Computational Example: Constrained Optimization in Python
The following code uses scipy.optimize to solve a quadratic objective with a linear inequality constraint.
import numpy as np
from scipy.optimize import minimize

# Objective: Minimize f(x, y) = x^2 + y^2
def objective(x):
    return x[0]**2 + x[1]**2

# Constraint: x + y >= 1 => x + y - 1 >= 0
def constraint1(x):
    return x[0] + x[1] - 1

# Initial guess
x0 = [2, 2]

# Define the constraint dictionary
con1 = {'type': 'ineq', 'fun': constraint1}
cons = [con1]

# Perform optimization
sol = minimize(objective, x0, method='SLSQP', constraints=cons)

print(f"Optimal solution: x = {sol.x[0]:.4f}, y = {sol.x[1]:.4f}")
print(f"Minimum value: f(x,y) = {sol.fun:.4f}")