A Generalized Primal-Dual Augmented Lagrangian

Philip E. Gill (1) and Daniel P. Robinson (2)

(1) University of California, San Diego
(2) Computing Laboratory, University of Oxford

SIAM Annual Meeting: July 9, 2008

Philip, Daniel (UCSD, OUCL)

PDAL

SIAM - 2008

1/1

Introduction

The Target Problem (NEP)

\[ \min_{x \in \mathbb{R}^n} \; f(x) \quad \text{subject to} \quad c(x) = 0 \]

$f : \mathbb{R}^n \to \mathbb{R}$, $g(x) := \nabla f(x) \in \mathbb{R}^n$
$c : \mathbb{R}^n \to \mathbb{R}^m$, $J(x) := c'(x) \in \mathbb{R}^{m \times n}$
$L(x, y) := f(x) - c(x)^T y$ (the Lagrangian)
$H(x, y) := \nabla^2_{xx} L(x, y)$

Notation
$f := f(x)$, $g := g(x)$, $c := c(x)$, $J := J(x)$
$f_k := f(x_k)$, $g_k := g(x_k)$, $c_k := c(x_k)$, $J_k := J(x_k)$, $H_k := H(x_k, y_k)$

The Augmented Lagrangian

\[ L_A(x; y_e, \mu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2 \]

$y_e$ is an estimate of a Lagrange multiplier vector.
$\mu > 0$ is the penalty parameter.

Definition
$\pi(x) = y_e - c(x)/\mu$

\[ \nabla L_A(x) = g(x) - J(x)^T \pi(x) \]
\[ \nabla^2 L_A(x) = H\bigl(x, \pi(x)\bigr) + \frac{1}{\mu} J(x)^T J(x) \]
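The gradient formula above can be checked numerically. The sketch below does this on a small toy problem that is not from the talk: the functions `f`, `c`, the point `x`, and the values of `y_e` and `mu` are all assumptions chosen for illustration.

```python
# Toy instance (an assumption for illustration, not from the talk):
#   f(x) = x1^2 + x2^2,  c(x) = x1 + x2 - 1   (n = 2, m = 1)

def f(x):
    return x[0] ** 2 + x[1] ** 2

def g(x):
    # gradient of f
    return [2.0 * x[0], 2.0 * x[1]]

def c(x):
    # single equality constraint, returned as a scalar because m = 1
    return x[0] + x[1] - 1.0

J = [1.0, 1.0]  # constraint Jacobian (constant, since c is linear)

def aug_lagrangian(x, y_e, mu):
    # L_A(x; y_e, mu) = f(x) - c(x) y_e + c(x)^2 / (2 mu)
    return f(x) - c(x) * y_e + c(x) ** 2 / (2.0 * mu)

def pi(x, y_e, mu):
    # pi(x) = y_e - c(x) / mu
    return y_e - c(x) / mu

def grad_aug_lagrangian(x, y_e, mu):
    # grad L_A(x) = g(x) - J^T pi(x)
    p = pi(x, y_e, mu)
    gx = g(x)
    return [gx[i] - J[i] * p for i in range(2)]

def fd_gradient(fun, x, h=1e-6):
    # central finite differences, one coordinate at a time
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((fun(xp) - fun(xm)) / (2.0 * h))
    return grad

x, y_e, mu = [0.3, -0.2], 0.7, 0.1
analytic = grad_aug_lagrangian(x, y_e, mu)
numeric = fd_gradient(lambda z: aug_lagrangian(z, y_e, mu), x)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print(analytic, max_err)
```

The analytic and finite-difference gradients should agree to within rounding error, because the toy augmented Lagrangian is quadratic.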

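The Augmented Lagrangian Newton System

\[ \Bigl( H\bigl(x, \pi(x)\bigr) + \frac{1}{\mu} J(x)^T J(x) \Bigr) \Delta x = -\bigl( g(x) - J(x)^T \pi(x) \bigr) \]

The Augmented Lagrangian Primal-Dual Newton System

Let $y$ be an arbitrary vector in $\mathbb{R}^m$. Then the solution to the augmented Lagrangian Newton system satisfies the following primal-dual system:

\[ \begin{pmatrix} H(x, \pi(x)) & J(x)^T \\ J(x) & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g(x) - J(x)^T y \\ c(x) + \mu (y - y_e) \end{pmatrix} \]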
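The equivalence of the two systems can be seen by eliminating $\Delta y$ from the primal-dual system. The derivation below is a sketch that uses only the definitions above, with $H := H(x, \pi(x))$ and the arguments of $g$, $c$, $J$ and $\pi$ suppressed.

```latex
\begin{align*}
% Second block row, solved for \Delta y (using \pi = y_e - c/\mu):
J \Delta x + \mu \Delta y &= -\bigl(c + \mu (y - y_e)\bigr)
  \;\Longrightarrow\;
  \Delta y = \pi - y - \tfrac{1}{\mu} J \Delta x \\
% Substitute into the first block row H \Delta x - J^T \Delta y = -(g - J^T y):
H \Delta x - J^T \bigl(\pi - y - \tfrac{1}{\mu} J \Delta x\bigr) &= -(g - J^T y) \\
\bigl(H + \tfrac{1}{\mu} J^T J\bigr) \Delta x &= -(g - J^T \pi)
\end{align*}
```

The last line is the augmented Lagrangian Newton system, and the terms in $y$ cancel, which is why $y$ may be arbitrary.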

The Big Theorem

If $x^*$ satisfies the second-order sufficient conditions for a solution of problem (NEP), then there exists a $\bar{\mu}$ such that for all $0 < \mu < \bar{\mu}$, the point $x^*$ satisfies the second-order sufficient conditions for a solution of the unconstrained problem

\[ \min_{x \in \mathbb{R}^n} \; L_A(x; y^*, \mu) = f(x) - c(x)^T y^* + \frac{1}{2\mu} \|c(x)\|^2 . \]

GPD Augmented Lagrangian

The Generalized Primal-Dual Augmented Lagrangian

\[ M(x, y; y_e, \mu, \nu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2 + \frac{\nu}{2\mu} \|c(x) + \mu (y - y_e)\|^2 \]

The Big Theorem

Assume that $(x^*, y^*)$ satisfies the second-order sufficient conditions associated with problem (NEP). Then $(x^*, y^*)$ is a stationary point of the primal-dual function

\[ M(x, y; y^*, \mu, \nu) = f(x) - c(x)^T y^* + \frac{1}{2\mu} \|c(x)\|^2 + \frac{\nu}{2\mu} \|c(x) + \mu (y - y^*)\|^2 . \]

Moreover, if $\nu > 0$, then there exists a positive scalar $\bar{\mu}$ such that $(x^*, y^*)$ is an isolated unconstrained minimizer of $M(x, y; y^*, \mu, \nu)$ for all $0 < \mu < \bar{\mu}$.
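As a numerical sanity check of the theorem, the sketch below evaluates $M$ on a toy problem and tests stationarity and local increase at $(x^*, y^*)$. The toy problem, the values of `mu` and `nu`, and the coordinate-wise perturbation test are assumptions made for illustration. The perturbation test only samples six directions, so it is not a proof of minimality.

```python
# Toy problem (assumed for illustration): f(x) = x1^2 + x2^2, c(x) = x1 + x2 - 1.
# Its solution is x* = (0.5, 0.5) with Lagrange multiplier y* = 1.

def f(x):
    return x[0] ** 2 + x[1] ** 2

def c(x):
    return x[0] + x[1] - 1.0

def M(v, y_e, mu, nu):
    # v = (x1, x2, y); m = 1, so the dual variable y is a scalar
    x, y = v[:2], v[2]
    cx = c(x)
    return (f(x) - cx * y_e
            + cx ** 2 / (2.0 * mu)
            + nu * (cx + mu * (y - y_e)) ** 2 / (2.0 * mu))

def fd_gradient(fun, v, h=1e-6):
    # central finite differences, one coordinate at a time
    grad = []
    for i in range(len(v)):
        vp, vm = list(v), list(v)
        vp[i] += h
        vm[i] -= h
        grad.append((fun(vp) - fun(vm)) / (2.0 * h))
    return grad

v_star = [0.5, 0.5, 1.0]          # (x*, y*)
y_star, mu, nu = 1.0, 0.5, 1.0    # y_e is set to y*, as in the theorem

def merit(v):
    return M(v, y_star, mu, nu)

# Stationarity: the gradient of M at (x*, y*) should vanish.
grad_norm = max(abs(gi) for gi in fd_gradient(merit, v_star))

# Local increase: M should grow under small coordinate perturbations.
step = 1e-2
increases = []
for i in range(3):
    for sign in (1.0, -1.0):
        v = list(v_star)
        v[i] += sign * step
        increases.append(merit(v) - merit(v_star))

print(grad_norm, min(increases))
```

Setting `nu = 0.0` removes the dependence on `y`, and `M` then coincides with the augmented Lagrangian $L_A$.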

Some Special Cases

Function                                     $\nu$   $y_e$
Primal-Dual Augmented Lagrangian Function      1     $y_e$
Augmented Lagrangian Function                  0     $y_e$
Proximal-Point Lagrangian Function            −1     $y_e$
Primal-Dual Penalty Function                   1     0
Classical Penalty Function                     0     0
Proximal-Point Penalty Function               −1     0

The Primal-Dual Augmented Lagrangian ($\nu = 1$)

\[ M(x, y; y_e, \mu) = f(x) - c(x)^T y_e + \frac{1}{2\mu} \|c(x)\|^2 + \frac{1}{2\mu} \|c(x) + \mu (y - y_e)\|^2 \]

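The Primal-Dual Augmented Lagrangian Newton System

\[ \begin{pmatrix} H(x, 2\pi - y) + \frac{2}{\mu} J^T J & J^T \\ J & \mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T (2\pi - y) \\ c + \mu (y - y_e) \end{pmatrix} \]

The Transformed Primal-Dual Augmented Lagrangian Newton System

\[ \begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix} \]

The transformed system looks familiar.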
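The talk does not show the intermediate step between the two systems. One way to obtain the transformed system, sketched here with $H := H(x, 2\pi - y)$, is to subtract $\tfrac{2}{\mu} J^T$ times the second block row from the first and then use $\pi = y_e - c/\mu$.

```latex
\begin{align*}
% Left-hand side of (Row 1) - (2/\mu) J^T (Row 2):
\bigl(H + \tfrac{2}{\mu} J^T J\bigr) \Delta x + J^T \Delta y
  - \tfrac{2}{\mu} J^T \bigl(J \Delta x + \mu \Delta y\bigr)
  &= H \Delta x - J^T \Delta y \\
% Right-hand side of the same combination:
-\bigl(g - J^T (2\pi - y)\bigr) + \tfrac{2}{\mu} J^T \bigl(c + \mu (y - y_e)\bigr)
  &= -g + 2 J^T \bigl(\pi + \tfrac{1}{\mu} c - y_e\bigr) + J^T y \\
  &= -(g - J^T y)
\end{align*}
```

The second block row is unchanged, since $J \Delta x + \mu \Delta y = J \Delta x - \mu I_m (-\Delta y)$.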

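The Transformed Primal-Dual Augmented Lagrangian Newton System

\[ \begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix} \]

The Augmented Lagrangian Primal-Dual Newton System

\[ \begin{pmatrix} H(x, \pi) & J^T \\ J & -\mu I_m \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu (y - y_e) \end{pmatrix} \]

If $y$ is replaced by $\pi$ in the (1, 1) block of the Hessian of the transformed system, then the two systems are identical.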

PDBCL

A Primal-Dual Bound Constrained Lagrangian Method

The Basic Idea (Conn, Gould and Toint, 1991)

Solve problem (NEP) by solving a sequence of bound constrained subproblems of the form

\[ \min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad -y_\ell \le y \le y_u, \]

with parameter adjustments in between subproblems.

What Have We Proved?

- The algorithm is globally convergent and R-linearly convergent.
- The penalty parameter remains uniformly bounded away from zero.
- Convergence to points satisfying second-order optimality conditions may be proved.
- The algorithm converges to an “optimal” infeasible point when applied to an infeasible problem.
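To give a feel for the outer iteration, the sketch below runs a heavily simplified loop on a toy problem. It is not the algorithm from the talk. It assumes $\nu = 1$, a fixed penalty parameter, bounds on $y$ that never become active, an exact closed-form subproblem solve, and the update rule `y_e = y`. The toy problem and every function name are hypothetical.

```python
# Toy problem (assumed): f(x) = x1^2 + x2^2, c(x) = x1 + x2 - 1,
# with solution x* = (0.5, 0.5) and multiplier y* = 1.

def solve_subproblem(y_e, mu):
    # Closed-form unconstrained minimizer of M(x, y; y_e, mu) with nu = 1.
    # Stationarity in y gives y = pi(x); stationarity in x then gives
    # x1 = x2 = t with t = (mu * y_e + 1) / (2 * (mu + 1)).
    t = (mu * y_e + 1.0) / (2.0 * (mu + 1.0))
    x = [t, t]
    y = y_e - (x[0] + x[1] - 1.0) / mu   # y = pi(x) at the minimizer
    return x, y

def outer_loop(y_e=0.0, mu=0.5, tol=1e-10, max_iter=200):
    history = []                          # constraint violation per iteration
    for _ in range(max_iter):
        x, y = solve_subproblem(y_e, mu)
        violation = abs(x[0] + x[1] - 1.0)
        history.append(violation)
        if violation <= tol:
            break
        y_e = y                           # multiplier-estimate update (assumed rule)
    return x, y, history

x, y, history = outer_loop()
ratios = [b / a for a, b in zip(history, history[1:]) if a > 0.0]
print(x, y, len(history), ratios[0])
```

On this toy problem the constraint violation shrinks by the constant factor $\mu / (\mu + 1)$ per outer iteration, which illustrates linear convergence with $\mu$ held fixed away from zero.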

PDLCL

A Primal-Dual Linearly Constrained Lagrangian Method

The Basic Idea (Friedlander and Saunders, 2005)

Solve the simplified problem (NEP) by solving a sequence of linearly constrained subproblems of the form

\[ \min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad c_k + J_k (x - x_k) = 0, \quad -y_\ell \le y \le y_u . \]

What Have We Proved?

- The algorithm is globally convergent and R-quadratically convergent under exact solves.
- The penalty parameter remains uniformly bounded away from zero.
- The algorithm converges to second-order points when a second-order subproblem solver is used.
- The algorithm converges to an “optimal” infeasible point when applied to an infeasible problem.

Explicitly Bounding the Dual Variables

Bounding the Multipliers

Definition
$y_v = \max(0, \; y - y_u, \; y_\ell - y)$

Theorem
If the point $(\bar{x}, \bar{y}, \bar{w})$ is a solution to

\[ \min_{x, y} \; M(x, y; y_e, \mu) \quad \text{subject to} \quad y_\ell \le y \le y_u, \]

then $(\bar{x}, \bar{y})$ minimizes $M(x, y) + \|D(\bar{w}) y_v\|_1$, where $D(\bar{w}) = \mathrm{diag}(d_1, \ldots, d_m)$ and $d_i \ge |\bar{w}_i|$ for all $i = 1, \ldots, m$.

\[ \|\bar{w}\|_\infty \le \|c(\bar{x}) - c(x(\mu))\|_\infty + \mu \|\bar{y} - y(\mu)\|_\infty . \]
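The definition of $y_v$ and the weighted penalty term $\|D y_v\|_1$ can be computed componentwise. The sketch below assumes that the max in the definition is taken elementwise. The function names and the numerical values are hypothetical.

```python
def bound_violation(y, y_lower, y_upper):
    # Componentwise y_v = max(0, y - y_u, y_l - y): zero inside the bounds,
    # otherwise the distance to the violated bound.
    return [max(0.0, yi - ui, li - yi)
            for yi, li, ui in zip(y, y_lower, y_upper)]

def weighted_l1(d, y_v):
    # || D y_v ||_1 with D = diag(d_1, ..., d_m)
    return sum(abs(di * vi) for di, vi in zip(d, y_v))

y = [0.5, 3.0, -2.0]
y_v = bound_violation(y, [-1.0] * 3, [1.0] * 3)
penalty = weighted_l1([2.0, 2.0, 4.0], y_v)
print(y_v, penalty)
```

Only the second and third components violate their bounds here, so only they contribute to the penalty.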

PDSQP

A Primal-Dual SQP-like Method

Basic Idea

- Use the primal-dual augmented Lagrangian function more like a merit function.
- Use a composite-step approach:
  - a trajectory step that aims for the trajectory;
  - an SQP-like step that aims “down” the trajectory.
- Guarantee convergence by piggy-backing on the primal-dual penalty methods already discussed as a “last resort”.

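Notation
$v := (x, y)$, $v_k := (x_k, y_k)$, $\Delta v := (\Delta x, \Delta y)$, $\nabla M_k(\mu) := \nabla M(v_k; \mu)$, $\nabla^2 M_k(\mu) := \nabla^2 M(v_k; \mu)$

The trajectory step: $\Delta v_T$

Compute $\Delta v_C$ as a solution to the convex problem

\[ \min_{\Delta v \in \mathbb{R}^{m+n}} \; \nabla M_k(\mu)^T \Delta v + \tfrac{1}{2} \Delta v^T B_k \Delta v \quad \text{subject to} \quad \|\Delta v\| \le \delta_C, \]

where $B_k$ is a positive semi-definite approximation to $\nabla^2 M_k(\mu)$.

Compute $\alpha_C$ as a solution to the problem

\[ \min_{0 \le \alpha \le 1} \; \alpha \nabla M_k(\mu)^T \Delta v_C + \frac{\alpha^2}{2} \Delta v_C^T \nabla^2 M_k(\mu) \Delta v_C . \]

Define $\Delta v_T := \alpha_C \cdot \Delta v_C$.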
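The step-length problem for $\alpha_C$ is a one-dimensional quadratic on $[0, 1]$ and has a closed-form solution. The sketch below assumes that the two scalars $a = \nabla M_k(\mu)^T \Delta v_C$ and $b = \Delta v_C^T \nabla^2 M_k(\mu) \Delta v_C$ have already been computed. The function name is hypothetical, and $b$ may be negative because the exact Hessian need not be positive semi-definite.

```python
def steplength(a, b):
    """Minimize q(alpha) = alpha * a + 0.5 * alpha**2 * b over 0 <= alpha <= 1.

    a stands for grad M_k(mu)^T dv_C and b for dv_C^T hess M_k(mu) dv_C.
    """
    if b > 0.0:
        # Strictly convex: clip the unconstrained minimizer -a / b to [0, 1].
        return min(1.0, max(0.0, -a / b))
    # Concave or linear: the minimum is at an endpoint, and q(0) = 0.
    return 1.0 if a + 0.5 * b < 0.0 else 0.0

print(steplength(-1.0, 4.0))   # interior minimizer
print(steplength(-1.0, 0.5))   # unconstrained minimizer lies beyond alpha = 1
print(steplength(-1.0, -2.0))  # negative curvature: take the full step
```

The trajectory step is then `alpha_C` times $\Delta v_C$, applied componentwise.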

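The SQP-like step: $\Delta v_S$

Compute $\Delta v_S$ as a solution to the problem

\[ \min_{\Delta v \in \mathbb{R}^{m+n}} \; \bigl( \nabla M_k(\mu_D) + \nabla^2 M_k(\mu_D) \Delta v_T \bigr)^T \Delta v + \tfrac{1}{2} \Delta v^T \nabla^2 M_k(\mu_D) \Delta v \quad \text{subject to} \quad \|\Delta v\| \le \delta_S, \]

where $\mu_D \le \mu$ (think $\mu_D$ close to zero!).

The full step:

\[ \Delta v := \underbrace{\Delta v_T}_{\text{to the trajectory}} + \underbrace{\Delta v_S}_{\text{down the trajectory}} \]

Control convergence by considering multiple objectives:
- $\|c\| + \|g - J^T y\|$ (solution of problem (NEP))
- $\|c + \mu (y - y_e)\| + \|g - J^T (2\pi - y)\|$ (point on the trajectory)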

Why SQP-like?

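The Transformed Primal-Dual Augmented Lagrangian Newton System

\[ \begin{pmatrix} H(x, 2\pi - y) & J^T \\ J & -\mu_D I \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c + \mu_D (y - y_e) \end{pmatrix} \]

– VERSUS –

Traditional SQP Step

\[ \begin{pmatrix} H(x, y) & J^T \\ J & 0 \end{pmatrix} \begin{pmatrix} \Delta x \\ -\Delta y \end{pmatrix} = - \begin{pmatrix} g - J^T y \\ c \end{pmatrix} \]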

Summary

- Introduced a new generalized primal-dual augmented Lagrangian that is a function of both the primal and dual variables.
- Many popular functions (e.g. the augmented Lagrangian) are specific instances of the generalized primal-dual augmented Lagrangian.
- Extended the theory of traditional primal methods to the primal-dual setting via the primal-dual augmented Lagrangian.
- Briefly discussed the basis of a primal-dual SQP-like approach.

Future Work

- Compare the numerical results of a primal-dual BCL approach to a traditional primal BCL approach.
- Compare the numerical results of a primal-dual LCL approach to a traditional primal LCL approach.
- Provide a convergence proof for the primal-dual SQP-like method.

For Further Reading

References

- P. E. Gill and D. P. Robinson. A Primal Dual Augmented Lagrangian. Department of Mathematics, University of California San Diego. Numerical Analysis Report 08-2.
- M. P. Friedlander and M. A. Saunders. A globally convergent linearly constrained Lagrangian method for nonlinear optimization. SIAM J. Optim., 15(3):863-897, 2005.
- A. R. Conn, N. I. M. Gould, and Ph. L. Toint. A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds. SIAM J. Numer. Anal., 28:545-572, 1991.
