A Generalized Primal-Dual Augmented Lagrangian
Philip E. Gill (University of California, San Diego)
Daniel P. Robinson (Computing Laboratory, University of Oxford)
SIAM Annual Meeting: July 9, 2008
Philip, Daniel (UCSD, OUCL)
PDAL
SIAM - 2008
1/1
Introduction
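The Target Problem (NEP)

$$\min_{x \in \mathbb{R}^n} \; f(x) \quad \text{subject to} \quad c(x) = 0$$

$f : \mathbb{R}^n \to \mathbb{R}$, $\quad g(x) := \nabla f(x) \in \mathbb{R}^n$
$c : \mathbb{R}^n \to \mathbb{R}^m$, $\quad J(x) := c'(x) \in \mathbb{R}^{m \times n}$
$L(x, y) := f(x) - c(x)^T y$ (the Lagrangian)
$H(x, y) := \nabla^2_{xx} L(x, y)$

Notation
$f := f(x)$, $g := g(x)$, $c := c(x)$, $J := J(x)$
$f_k := f(x_k)$, $g_k := g(x_k)$, $c_k := c(x_k)$, $J_k := J(x_k)$, $H_k := H(x_k, y_k)$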
The Augmented Lagrangian

$$L_A(x; y_e, \mu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2$$

$y_e$ is an estimate of a Lagrange multiplier vector.
$\mu > 0$ is the penalty parameter.

Definition
$$\pi(x) = y_e - c(x)/\mu$$
$$\nabla L_A(x) = g(x) - J(x)^T \pi(x)$$
$$\nabla^2 L_A(x) = H\bigl(x, \pi(x)\bigr) + \frac{1}{\mu} J(x)^T J(x)$$
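The definitions above can be checked numerically. The sketch below evaluates $L_A$ and the gradient formula $g - J^T \pi$ on a small toy problem and compares the result with central finite differences. The toy functions `f` and `c`, the test point, and all helper names are assumptions made for illustration; they do not come from the talk.

```python
# Toy instance (an assumption for illustration, not from the talk):
#   f(x) = x1^2 + 2*x2^2,  c(x) = x1 + x2 - 1   (n = 2, m = 1)

def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2

def g(x):
    """Gradient of f."""
    return [2.0 * x[0], 4.0 * x[1]]

def c(x):
    """Single equality constraint, returned as a scalar."""
    return x[0] + x[1] - 1.0

J = [1.0, 1.0]  # constant Jacobian row of c

def pi(x, y_e, mu):
    """Multiplier estimate pi(x) = y_e - c(x)/mu."""
    return y_e - c(x) / mu

def aug_lagrangian(x, y_e, mu):
    """L_A(x; y_e, mu) = f - c*y_e + c^2/(2*mu)."""
    return f(x) - c(x) * y_e + c(x) ** 2 / (2.0 * mu)

def grad_aug_lagrangian(x, y_e, mu):
    """Analytic gradient g(x) - J^T pi(x)."""
    p = pi(x, y_e, mu)
    return [gi - Ji * p for gi, Ji in zip(g(x), J)]

def fd_grad(x, y_e, mu, h=1e-6):
    """Central finite-difference approximation of the gradient of L_A."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((aug_lagrangian(xp, y_e, mu) - aug_lagrangian(xm, y_e, mu)) / (2.0 * h))
    return out

x0, y_e, mu = [0.3, -0.2], 0.7, 0.5
analytic = grad_aug_lagrangian(x0, y_e, mu)
numeric = fd_grad(x0, y_e, mu)
```

The two gradients agree to rounding error here because the toy $L_A$ is quadratic, so central differences are exact up to floating-point noise.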
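The Augmented Lagrangian Newton System

$$\Bigl( H\bigl(x, \pi(x)\bigr) + \frac{1}{\mu} J(x)^T J(x) \Bigr) \Delta x = -\bigl( g(x) - J(x)^T \pi(x) \bigr)$$

The Augmented Lagrangian Primal-Dual Newton System

Let $y$ be an arbitrary vector in $\mathbb{R}^m$. Then the solution to the augmented Lagrangian Newton system satisfies the following primal-dual system:

$$\begin{pmatrix} H(x, \pi(x)) & J(x)^T \\ J(x) & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g(x) - J(x)^T y \\ c(x) + \mu (y - y_e) \end{pmatrix}.$$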
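The equivalence can be seen by eliminating $\Delta y$ from the primal-dual system. The short derivation below uses only the definitions already given, writing $H$ for $H(x, \pi(x))$, $J$ for $J(x)$, and $\pi$ for $\pi(x)$; it is a worked check rather than part of the original slides.

```latex
% Second block row:  J \Delta x + \mu \Delta y = -(c + \mu (y - y_e)),  hence
\begin{align*}
y + \Delta y &= y_e - \tfrac{1}{\mu}\,(c + J \Delta x) = \pi - \tfrac{1}{\mu}\, J \Delta x .
\end{align*}
% First block row:  H \Delta x - J^T \Delta y = -(g - J^T y),  hence
\begin{align*}
H \Delta x &= -g + J^T (y + \Delta y) = -(g - J^T \pi) - \tfrac{1}{\mu}\, J^T J \Delta x, \\
\bigl( H + \tfrac{1}{\mu}\, J^T J \bigr) \Delta x &= -(g - J^T \pi).
\end{align*}
```

The last line is the augmented Lagrangian Newton system, so $\Delta x$ does not depend on the choice of $y$; only $\Delta y$ does, through $y + \Delta y = \pi - J \Delta x / \mu$.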
The Big Theorem

If $x^*$ satisfies the second-order sufficient conditions for a solution of problem (NEP), then there exists a $\bar\mu$ such that for all $0 < \mu < \bar\mu$, the point $x^*$ satisfies the second-order sufficient conditions for a solution of the unconstrained problem

$$\min_{x \in \mathbb{R}^n} \; L_A(x; y^*, \mu) = f(x) - c(x)^T y^* + \frac{1}{2\mu} \|c(x)\|^2 .$$
GPD Augmented Lagrangian
The Generalized Primal-Dual Augmented Lagrangian
$$M(x, y; y_e, \mu, \nu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2 + \frac{\nu}{2\mu} \|c(x) + \mu (y - y_e)\|^2$$

The Big Theorem

Assume that $(x^*, y^*)$ satisfies the second-order sufficient conditions associated with problem (NEP). Then $(x^*, y^*)$ is a stationary point of the primal-dual function

$$M(x, y; y^*, \mu, \nu) = f(x) - c(x)^T y^* + \frac{1}{2\mu} \|c(x)\|^2 + \frac{\nu}{2\mu} \|c(x) + \mu (y - y^*)\|^2 .$$

Moreover, if $\nu > 0$, then there exists a positive scalar $\bar\mu$ such that $(x^*, y^*)$ is an isolated unconstrained minimizer of $M(x, y; y^*, \mu, \nu)$ for all $0 < \mu < \bar\mu$.
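The stationarity claim can be checked by differentiating $M$ directly. The gradient formulas below are not printed on the slide; they are obtained here from the definition of $M$, with $\pi = y_e - c/\mu$ as in the introduction.

```latex
\begin{align*}
\nabla_x M(x, y; y_e, \mu, \nu) &= g - J^T \bigl( \pi + \nu (\pi - y) \bigr), \\
\nabla_y M(x, y; y_e, \mu, \nu) &= \nu \, \bigl( c + \mu (y - y_e) \bigr).
\end{align*}
% At (x, y) = (x^*, y^*) with y_e = y^*:  c(x^*) = 0, so \pi = y^* and
\begin{align*}
\nabla_x M = g(x^*) - J(x^*)^T y^* = 0, \qquad \nabla_y M = 0 .
\end{align*}
```

The first equality uses the first-order condition $g(x^*) = J(x^*)^T y^*$, which is part of the second-order sufficient conditions. With $\nu = 1$ the multiplier combination in $\nabla_x M$ becomes $2\pi - y$.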
Some Special Cases

Function                                      ν     y_e
Primal-Dual Augmented Lagrangian Function     1     y_e
Augmented Lagrangian Function                 0     y_e
Proximal-Point Lagrangian Function           −1     y_e
Primal-Dual Penalty Function                  1     0
Classical Penalty Function                    0     0
Proximal-Point Penalty Function              −1     0

The Primal-Dual Augmented Lagrangian (ν = 1)

$$M(x, y; y_e, \mu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2 + \frac{1}{2\mu} \|c(x) + \mu (y - y_e)\|^2$$
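A few rows of the special-case table can be reproduced numerically. The sketch below implements $M$ for a single constraint and checks that $\nu = 0$ gives the augmented Lagrangian, that $\nu = 0$ with $y_e = 0$ gives the classical quadratic penalty, and that $\nu = -1$ collapses to $f - c\,y - \tfrac{\mu}{2}(y - y_e)^2$ (the form obtained by expanding the squares). The toy `f` and `c` and the sample values are assumptions for illustration only.

```python
# Toy data (assumed for illustration): f(x) = x1^2 + 2*x2^2, c(x) = x1 + x2 - 1.

def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2

def c(x):
    return x[0] + x[1] - 1.0

def M(x, y, y_e, mu, nu):
    """Generalized primal-dual augmented Lagrangian for one constraint."""
    cx = c(x)
    return (f(x) - cx * y_e
            + cx ** 2 / (2.0 * mu)
            + nu * (cx + mu * (y - y_e)) ** 2 / (2.0 * mu))

def aug_lagrangian(x, y_e, mu):
    """Conventional augmented Lagrangian L_A(x; y_e, mu)."""
    return f(x) - c(x) * y_e + c(x) ** 2 / (2.0 * mu)

def proximal_point_lagrangian(x, y, y_e, mu):
    """f - c*y - (mu/2)*(y - y_e)^2, the nu = -1 case after expanding M."""
    return f(x) - c(x) * y - 0.5 * mu * (y - y_e) ** 2

def classical_penalty(x, mu):
    """Quadratic penalty f + c^2/(2*mu)."""
    return f(x) + c(x) ** 2 / (2.0 * mu)

x, y, y_e, mu = [0.4, 0.1], 0.9, 0.3, 0.25

diff_al = M(x, y, y_e, mu, 0.0) - aug_lagrangian(x, y_e, mu)
diff_pp = M(x, y, y_e, mu, -1.0) - proximal_point_lagrangian(x, y, y_e, mu)
diff_cp = M(x, y, 0.0, mu, 0.0) - classical_penalty(x, mu)
```

All three differences are zero up to rounding, which is one way to confirm the $\nu$ and $y_e$ columns of the table.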
The Primal-Dual Augmented Lagrangian
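The Primal-Dual Augmented Lagrangian Newton System

$$\begin{pmatrix} H(x, 2\pi - y) + \frac{2}{\mu} J^T J & J^T \\ J & \mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T (2\pi - y) \\ c + \mu (y - y_e) \end{pmatrix}$$

The Transformed Primal-Dual Augmented Lagrangian Newton System

$$\begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix}$$

The transformed system looks familiar.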
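That the two systems define the same step can be confirmed on a small example. The sketch below builds both systems for a toy problem with one nonlinear constraint, solves each with a hand-written Gaussian elimination, and compares $(\Delta x, \Delta y)$. The problem data, the test point, and the helper `solve` are assumptions for illustration, not material from the talk.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    T = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(T[i][k]))
        T[k], T[p] = T[p], T[k]
        for i in range(k + 1, n):
            t = T[i][k] / T[k][k]
            for j in range(k, n + 1):
                T[i][j] -= t * T[k][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = T[i][n] - sum(T[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / T[i][i]
    return z

# Hypothetical data: f(x) = x1^2 + 2*x2^2, c(x) = x1^2 + x2 - 1 (n = 2, m = 1).
x, y, y_e, mu = [0.5, 0.2], 0.3, 0.1, 0.5
g = [2.0 * x[0], 4.0 * x[1]]          # gradient of f
c = x[0] ** 2 + x[1] - 1.0            # constraint value
J = [2.0 * x[0], 1.0]                 # Jacobian row of c
pi = y_e - c / mu
w = 2.0 * pi - y                      # multiplier combination 2*pi - y

def H(mult):
    """Hessian of the Lagrangian f - c*mult for this toy problem."""
    return [[2.0 - 2.0 * mult, 0.0], [0.0, 4.0]]

Hw = H(w)

# Newton system for M (nu = 1); unknowns are (dx1, dx2, dy).
K1 = [[Hw[i][j] + 2.0 / mu * J[i] * J[j] for j in range(2)] + [J[i]] for i in range(2)]
K1.append([J[0], J[1], mu])
r1 = [-(g[i] - J[i] * w) for i in range(2)] + [-(c + mu * (y - y_e))]
dx1_a, dx2_a, dy_a = solve(K1, r1)

# Transformed system; unknowns are (dx1, dx2, -dy).
K2 = [Hw[i][:] + [J[i]] for i in range(2)]
K2.append([J[0], J[1], -mu])
r2 = [-(g[i] - J[i] * y) for i in range(2)] + [-(c + mu * (y - y_e))]
dx1_b, dx2_b, minus_dy_b = solve(K2, r2)
dy_b = -minus_dy_b
```

The transformed matrix drops the $\frac{2}{\mu} J^T J$ term from the (1, 1) block, which is the practical attraction when $\mu$ is small or $J$ is sparse.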
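The Transformed Primal-Dual Augmented Lagrangian Newton System

$$\begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix}$$

The Augmented Lagrangian Primal-Dual Newton System

$$\begin{pmatrix} H(x, \pi) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix}$$

If $y$ is replaced by $\pi$ in the (1, 1) block of the Hessian of the transformed system, then the two systems are identical.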
PDBCL
A Primal-Dual Bound Constrained Lagrangian Method
The Basic Idea (Conn, Gould and Toint, 1991)

Solve problem (NEP) by solving a sequence of bound constrained subproblems of the form

$$\min_{x \in \mathbb{R}^n,\; y \in \mathbb{R}^m} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad -y_\ell \le y \le y_u,$$

with parameter adjustments in between subproblems.

What Have We Proved?
- The algorithm is globally convergent and R-linearly convergent.
- The penalty parameter remains uniformly bounded away from zero.
- Convergence to points satisfying second-order optimality conditions may be proved.
- The algorithm converges to an "optimal" infeasible point when applied to an infeasible problem.
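The outer/inner structure described above can be sketched on a toy problem. The talk does not spell out the parameter adjustments, so the version below makes two loud assumptions: the multiplier estimate is updated by the simplest rule $y_e \leftarrow y$ after each subproblem, and $\mu$ is held fixed. The inner solver is plain projected gradient on the box for $y$, chosen only for brevity; the names `solve_subproblem`, `pd_bcl`, `y_lo`, and `y_hi` are hypothetical.

```python
# Toy problem (assumed): minimize 0.5*(x1^2 + x2^2) subject to x1 + x2 - 1 = 0.
# Its solution is x* = (0.5, 0.5) with multiplier y* = 0.5.

def grad_M(x, y, y_e, mu):
    """Gradient of the primal-dual augmented Lagrangian (nu = 1)."""
    cx = x[0] + x[1] - 1.0
    p = y_e - cx / mu                  # pi(x)
    w = 2.0 * p - y                    # multiplier combination 2*pi - y
    gx = [x[0] - w, x[1] - w]          # g - J^T (2*pi - y), with J = [1, 1]
    gy = cx + mu * (y - y_e)           # c + mu*(y - y_e)
    return gx, gy

def solve_subproblem(x, y, y_e, mu, y_lo, y_hi, step=0.3, iters=300):
    """Projected gradient on M over the box y_lo <= y <= y_hi (x is free)."""
    for _ in range(iters):
        gx, gy = grad_M(x, y, y_e, mu)
        x = [x[0] - step * gx[0], x[1] - step * gx[1]]
        y = min(max(y - step * gy, y_lo), y_hi)
    return x, y

def pd_bcl(mu=1.0, y_lo=-10.0, y_hi=10.0, outer=30):
    """Outer loop: solve a bound constrained subproblem, then update y_e."""
    x, y, y_e = [0.0, 0.0], 0.0, 0.0
    for _ in range(outer):
        x, y = solve_subproblem(x, y, y_e, mu, y_lo, y_hi)
        y_e = y                        # assumed multiplier update
    return x, y

x_sol, y_sol = pd_bcl()
```

For this toy problem with $\mu = 1$, each outer iteration shrinks the multiplier error by a factor of $\mu / (\mu + 2) = 1/3$, a small-scale picture of linear convergence with $\mu$ bounded away from zero. The fixed step size 0.3 is tuned to this quadratic and is not a general recommendation.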
PDLCL
A Primal-Dual Linearly Constrained Lagrangian Method
The Basic Idea (Friedlander and Saunders, 2005)

Solve the simplified problem (NEP) by solving a sequence of linearly constrained subproblems of the form

$$\min_{x \in \mathbb{R}^n,\; y \in \mathbb{R}^m} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad c_k + J_k (x - x_k) = 0, \quad -y_\ell \le y \le y_u .$$

What Have We Proved?
- The algorithm is globally convergent and R-quadratically convergent under exact solves.
- The penalty parameter remains uniformly bounded away from zero.
- The algorithm converges to second-order points when a second-order subproblem solver is used.
- The algorithm converges to an "optimal" infeasible point when applied to an infeasible problem.
Explicitly Bounding the Dual Variables
Bounding the Multipliers

Definition
$$y_v = \max(0,\; y - y_u,\; y_\ell - y)$$

Theorem
If the point $(\bar x, \bar y, \bar w)$ is a solution to

$$\min_{x, y} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad y_\ell \le y \le y_u,$$

then $(\bar x, \bar y)$ minimizes $M(x, y) + \|D(\bar w)\, y_v\|_1$, where $D(\bar w) = \mathrm{diag}(d_1, \dots, d_m)$ and $d_i \ge |\bar w_i|$ for all $i = 1, \dots, m$.

$$\|\bar w\|_\infty \le \|c(\bar x) - c\bigl(x(\mu)\bigr)\|_\infty + \mu \|\bar y - y(\mu)\|_\infty .$$
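The bound on $\|\bar w\|_\infty$ has a short justification under one reading of the notation. The derivation below assumes that $\bar w$ is the multiplier vector for the bounds on $y$, so that $\bar w$ equals $\nabla_y M$ at $(\bar x, \bar y)$ up to a sign convention, and that $(x(\mu), y(\mu))$ is a stationary point of $M$ when the bounds are ignored. The slide states neither assumption explicitly.

```latex
% Stationarity in y for the bound constrained problem (nu = 1):
%   \bar w = c(\bar x) + \mu (\bar y - y_e)
% Stationarity in y for the unconstrained problem:
%   0 = c(x(\mu)) + \mu (y(\mu) - y_e)
\begin{align*}
\bar w &= \bigl( c(\bar x) - c(x(\mu)) \bigr) + \mu \bigl( \bar y - y(\mu) \bigr), \\
\|\bar w\|_\infty &\le \| c(\bar x) - c(x(\mu)) \|_\infty + \mu \, \| \bar y - y(\mu) \|_\infty .
\end{align*}
```

Read this way, the bound multipliers are small whenever the bound constrained solution is close to the unconstrained minimizer of $M$, which in turn limits how large the weights $d_i$ need to be.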
PDSQP
A Primal-Dual SQP-like Method
Basic Idea

Use the primal-dual augmented Lagrangian function more like a merit function.
Use a composite-step approach:
- a trajectory step that aims for the trajectory;
- an SQP-like step that aims "down" the trajectory.

Guarantee convergence by piggy-backing on the primal-dual penalty methods already discussed as a "last resort".
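Notation
$v := (x, y)$, $v_k := (x_k, y_k)$, $\Delta v := (\Delta x, \Delta y)$,
$\nabla M_k(\mu) := \nabla M(v_k; \mu)$, $\nabla^2 M_k(\mu) := \nabla^2 M(v_k; \mu)$

The trajectory step: $\Delta v_T$

Compute $\Delta v_C$ as a solution to the convex problem

$$\min_{\Delta v \in \mathbb{R}^{m+n}} \; \nabla M_k(\mu)^T \Delta v + \tfrac{1}{2} \Delta v^T B_k \Delta v \quad \text{subject to} \quad \|\Delta v\| \le \delta_C,$$

where $B_k$ is a positive semidefinite approximation to $\nabla^2 M_k(\mu)$.

Compute $\alpha_C$ as a solution to the problem

$$\min_{0 \le \alpha \le 1} \; \alpha \, \nabla M_k(\mu)^T \Delta v_C + \frac{\alpha^2}{2} \, \Delta v_C^T \nabla^2 M_k(\mu) \Delta v_C .$$

Define $\Delta v_T := \alpha_C \cdot \Delta v_C$.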
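The scalar problem for $\alpha_C$ has a closed-form solution, since it minimizes a one-dimensional quadratic over $[0, 1]$. The sketch below computes it from $a = \nabla M_k(\mu)^T \Delta v_C$ and $b = \Delta v_C^T \nabla^2 M_k(\mu) \Delta v_C$, covering the case $b \le 0$ that can arise because the exact Hessian need not be positive semidefinite. Function names and the sample data are assumptions for illustration.

```python
def alpha_c(a, b):
    """Minimize phi(alpha) = alpha*a + 0.5*alpha**2*b over 0 <= alpha <= 1.

    a is the directional derivative grad^T dv_C,
    b is the curvature dv_C^T hess dv_C along dv_C.
    """
    if b > 0.0:
        # Strictly convex in alpha: clip the unconstrained minimizer -a/b.
        return min(1.0, max(0.0, -a / b))
    # Linear or concave in alpha: the minimum sits at an endpoint.
    return 1.0 if a + 0.5 * b < 0.0 else 0.0

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def trajectory_step(grad, hess, dv_c):
    """Return dv_T = alpha_C * dv_C for the given gradient and Hessian of M."""
    a = dot(grad, dv_c)
    b = dot(dv_c, matvec(hess, dv_c))
    alpha = alpha_c(a, b)
    return [alpha * d for d in dv_c]

# Hypothetical data: positive curvature along dv_C, so alpha_C is interior.
grad = [1.0, 0.0]
hess = [[2.0, 0.0], [0.0, -1.0]]
dv_c = [-1.0, 0.0]
dv_t = trajectory_step(grad, hess, dv_c)
```

Scaling $\Delta v_C$ this way guards against the case where $B_k$ overestimates or underestimates the true curvature of $M$ along the computed direction.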
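The SQP-like step: $\Delta v_S$

Compute $\Delta v_S$ as a solution to the problem

$$\min_{\Delta v \in \mathbb{R}^{m+n}} \; \bigl( \nabla M_k(\mu_D) + \nabla^2 M_k(\mu_D) \Delta v_T \bigr)^T \Delta v + \tfrac{1}{2} \Delta v^T \nabla^2 M_k(\mu_D) \Delta v \quad \text{subject to} \quad \|\Delta v\| \le \delta_S,$$

where $\mu_D \le \mu$ (think $\mu_D$ close to zero!).

The full step:

$$\Delta v := \underbrace{\Delta v_T}_{\text{to the trajectory}} + \underbrace{\Delta v_S}_{\text{down the trajectory}}$$

Control convergence by considering multiple objectives:
- $\|c\| + \|g - J^T y\|$ (solution of problem (NEP))
- $\|c + \mu (y - y_e)\| + \|g - J^T (2\pi - y)\|$ (point on the trajectory)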
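Why SQP-like?

The Transformed Primal-Dual Augmented Lagrangian Newton System

$$\begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu_D I \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu_D (y - y_e) \end{pmatrix}$$

VERSUS

Traditional SQP Step

$$\begin{pmatrix} H(x, y) & J^T \\ J & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c \end{pmatrix}$$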
Summary
Summary/Future Work
Summary
- Introduced a new generalized primal-dual augmented Lagrangian that is a function of both the primal and dual variables.
- Many popular functions (e.g., the augmented Lagrangian) are specific instances of the generalized primal-dual augmented Lagrangian.
- Extended the theory of traditional primal methods to the primal-dual setting via the primal-dual augmented Lagrangian.
- Briefly discussed the basis of a primal-dual SQP-like approach.

Future Work
- Compare the numerical results of a primal-dual BCL approach to a traditional primal BCL approach.
- Compare the numerical results of a primal-dual LCL approach to a traditional primal LCL approach.
- Provide a convergence proof for the primal-dual SQP-like method.
For Further Reading

References
- P. E. Gill and D. P. Robinson. A Primal-Dual Augmented Lagrangian. Numerical Analysis Report 08-2, Department of Mathematics, University of California San Diego.
- M. P. Friedlander and M. A. Saunders. A globally convergent linearly constrained Lagrangian method for nonlinear optimization. SIAM J. Optim., 15(3):863-897, 2005.
- A. R. Conn, N. I. M. Gould, and Ph. L. Toint. A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal., 28:545-572, 1991.