High-Order Methods for Overcoming Degeneracy in Interior-Point Methods for Quadratic Programming

Nicholas Gould (1), Dominique Orban (2), Daniel Robinson (3)

(1) Rutherford Appleton Laboratory
(2) École Polytechnique de Montréal
(3) Oxford University Mathematical Institute

Santiago, Chile, July 29, 2010

Outline

Introduction and Motivation
Nondegenerate QP
Degenerate QP
Implementation
Summary

Problem of interest (QP)

    minimize_x   c^T x + (1/2) x^T H x
    subject to   A x = b,   x ≥ 0

Interested in general H, but require H ⪰ 0 (positive semidefinite) for some results
Assume that A has full row rank

First-order optimality conditions

    Hx* + c = A^T y* + z*,    Ax* = b,    min(x*, z*) = 0

    max(x_i*, z_i*) > 0 for all i    =⇒  nondegenerate
    max(x_i*, z_i*) = 0 for some i   =⇒  degenerate
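
As a concrete illustration (not part of the talk), the hypothetical Python sketch below checks the first-order conditions for a small QP in this standard form and classifies the solution as degenerate or nondegenerate; the helper name `classify` and the tolerance are our own choices.

```python
# Hypothetical sketch (not from the talk): classify a first-order point of
#   minimize c'x + 0.5 x'Hx   subject to  Ax = b, x >= 0.
import numpy as np

def classify(H, c, A, b, x, y, z, tol=1e-8):
    stat = H @ x + c - A.T @ y - z          # dual feasibility residual
    feas = A @ x - b                        # primal feasibility residual
    comp = np.minimum(x, z)                 # complementarity min(x, z) = 0
    assert max(np.abs(stat).max(), np.abs(feas).max(), np.abs(comp).max()) < tol
    # degenerate iff some index has both x_i* and z_i* (numerically) zero
    return "degenerate" if np.any(np.maximum(x, z) < tol) else "nondegenerate"

# The talk's degenerate example, min 0.5 x^2 s.t. 0 <= x <= 1, written in
# standard form with a slack for the upper bound:  x1 + x2 = 1, x >= 0
H = np.diag([1.0, 0.0]); c = np.zeros(2)
A = np.array([[1.0, 1.0]]); b = np.array([1.0])
x_star, y_star, z_star = np.array([0.0, 1.0]), np.array([0.0]), np.zeros(2)
print(classify(H, c, A, b, x_star, y_star, z_star))   # degenerate: x1* = z1* = 0
```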

Interior-point subproblem

    minimize_x   c^T x + (1/2) x^T H x − µ Σ_{i=1}^{n} ln(x_i)
    subject to   A x = b

µ > 0 is the barrier parameter

If (x(µ), y(µ)) is the minimizer, then generally (x(µ), y(µ), z(µ)) → (x*, y*, z*) as µ → 0,
where z(µ) := µ X(µ)^{-1} e and X := diag(x_1, x_2, . . . , x_n)

Notation: v = (x, y, z), v(µ) = (x(µ), y(µ), z(µ)), etc.

Key result (Stoer/Wechs/Mizuno 1998, Wright/Orban 2002)

    ‖v(µ) − v*‖ = Θ(µ)     nondegenerate
    ‖v(µ) − v*‖ = Θ(√µ)    degenerate
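
The Θ(µ) versus Θ(√µ) rates can be seen directly on the two one-dimensional examples used later in the talk. The following hypothetical sketch (ours, not from the talk) computes x(µ) from the barrier stationarity condition, in closed form for the nondegenerate example and by bisection for the degenerate one, and prints the scaled distances to x*.

```python
# Hypothetical sketch (not from the talk): observe the Theta(mu) vs Theta(sqrt(mu))
# rates on the two 1-D examples.
import math

def x_nondeg(mu):
    # min 0.5 x^2 s.t. x >= 1: barrier stationarity x = mu/(x - 1), i.e.
    # x^2 - x - mu = 0, whose root in (1, inf) is
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * mu))

def x_deg(mu):
    # min 0.5 x^2 s.t. 0 <= x <= 1: solve x - mu/x + mu/(1 - x) = 0 by bisection
    g = lambda x: x - mu / x + mu / (1.0 - x)
    lo, hi = 1e-200, 1.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

for mu in (1e-2, 1e-4, 1e-6):
    e_nd = x_nondeg(mu) - 1.0      # distance to x* = 1 (nondegenerate)
    e_d = x_deg(mu)                # distance to x* = 0 (degenerate)
    print(f"mu={mu:.0e}  nondeg/mu={e_nd / mu:.4f}  deg/sqrt(mu)={e_d / math.sqrt(mu):.4f}")
```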

Our motivation

    ‖v(µ) − v*‖ = Θ(µ)     nondegenerate
    ‖v(µ) − v*‖ = Θ(√µ)    degenerate

1. "one off" solve of problem QP
   - µ = 1.0e−6  =⇒  ‖v(µ) − v*‖ ≈ 1.0e−3 in the degenerate case
2. improve cross-over methods for solving QP
   - use x(µ) to predict the optimal active set for an active-set phase
3. improve large-scale SQP methods (S2QP)
   - solve a convex QP to predict an optimal active set, which is used to compute an additional accelerator step
4. efficiency
   - higher-order methods  =⇒  fewer factorizations

Nondegenerate QP

Trajectory of minimizers (Fiacco and McCormick)

Let v* be a nondegenerate second-order sufficient local minimizer of QP and assume that A has full row rank. Then there exists a continuously differentiable path v(µ) such that

    lim_{µ→0} v(µ) = v*

and v(µ) minimizes the barrier problem for all µ sufficiently small.

[Figure: the trajectory of minimizers approaching the solution as µ → 0]

Question: Given v(µ) and µ̄ ∈ (0, µ), how do we find v(µ̄) efficiently?
Answer: Apply Newton's method to the perturbed optimality conditions

Barrier optimality conditions
1. Hx + c = A^T y + z
2. Ax = b
3. x = µ̄ Z^{-1} e
4. min(x, z) > 0

Equivalent conditions
1. Hx + c = A^T y + z
2. Ax = b
3. Xz = µ̄ e
4. min(x, z) > 0

Newton system:

    [ H      −A^T   −I   ] [ Δx^N ]       [ Hx(µ) + c − A^T y(µ) − z(µ) ]
    [ A       0      0   ] [ Δy^N ]  =  − [ Ax(µ) − b                   ]
    [ Z(µ)    0     X(µ) ] [ Δz^N ]       [ X(µ)z(µ) − µ̄ e              ]

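
A minimal dense-NumPy sketch of the Newton step above (our code, not QPB/CQP); a real implementation would typically eliminate Δz and factorize the smaller augmented system rather than solving the 3×3 block system as written.

```python
# Hypothetical sketch (not from the talk): form and solve the unreduced Newton
# system for a step from (x, y, z) ~ v(mu) toward v(mu_bar).
import numpy as np

def newton_step(H, c, A, b, x, y, z, mu_bar):
    n, m = len(x), len(b)
    #   [ H     -A^T   -I ] [dx]       [ Hx + c - A^T y - z ]
    #   [ A      0      0 ] [dy]  =  - [ Ax - b             ]
    #   [ Z      0      X ] [dz]       [ Xz - mu_bar e      ]
    K = np.block([
        [H,          -A.T,              -np.eye(n)],
        [A,           np.zeros((m, m)),  np.zeros((m, n))],
        [np.diag(z),  np.zeros((n, m)),  np.diag(x)],
    ])
    r = np.concatenate([H @ x + c - A.T @ y - z, A @ x - b, x * z - mu_bar])
    d = np.linalg.solve(K, -r)
    return d[:n], d[n:n + m], d[n + m:]

# e.g. on the degenerate example (0 <= x <= 1 via a slack), from a strictly
# feasible point, targeting a smaller barrier parameter:
H = np.diag([1.0, 0.0]); c = np.zeros(2)
A = np.array([[1.0, 1.0]]); b = np.array([1.0])
dx, dy, dz = newton_step(H, c, A, b,
                         x=np.array([0.5, 0.5]), y=np.array([0.0]),
                         z=np.array([0.5, 0.5]), mu_bar=1e-2)
```
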
Question: Given v(µ), is there a method to compute accurate estimates of v(µ̄) for 0 < µ̄ ≪ µ?
Answer: Yes, use a higher-order Taylor approximation for v(µ̄), i.e.,

    v(µ̄) ≈ v(µ) + Σ_{j=1}^{p} v^[j](µ) (µ̄ − µ)^j / j!  =:  v_p(µ̄)

- v_1(µ̄) ≡ Δv^N
- existence of the derivatives follows from Fiacco and McCormick
- the derivatives are obtained by repeated differentiation of

      Hx(µ) + c = A^T y(µ) + z(µ),    Ax(µ) = b,    X(µ)z(µ) = µe

- these results assume that v(µ) is computed exactly
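
A hypothetical sketch (ours) of the higher-order predictor: differentiating the three trajectory equations j times leaves the same coefficient matrix as the Newton system, and only the complementarity block of the right-hand side changes (Leibniz rule applied to X(µ)z(µ) = µe). As on the slide, this assumes that (x, y, z) lies exactly on the trajectory.

```python
# Hypothetical sketch (not from the talk): build v_p(mu_bar) by repeated
# differentiation of  Hx - A'y - z = -c,  Ax = b,  X(mu)z(mu) = mu e.
import numpy as np
from math import comb, factorial

def taylor_predict(H, c, A, b, x, y, z, mu, mu_bar, p):
    n, m = len(x), len(b)
    K = np.block([
        [H,          -A.T,              -np.eye(n)],
        [A,           np.zeros((m, m)),  np.zeros((m, n))],
        [np.diag(z),  np.zeros((n, m)),  np.diag(x)],
    ])
    v = np.concatenate([x, y, z])
    dx, dz = [], []                           # x^[j], z^[j] for j = 1..p
    for j in range(1, p + 1):
        if j == 1:
            rj = np.ones(n)                   # d/dmu of x_i z_i = mu
        else:                                 # Z x^[j] + X z^[j] = -sum C(j,i) x^[i] z^[j-i]
            rj = -sum(comb(j, i) * dx[i - 1] * dz[j - i - 1] for i in range(1, j))
        d = np.linalg.solve(K, np.concatenate([np.zeros(n + m), rj]))
        dx.append(d[:n]); dz.append(d[n + m:])
        v = v + d * (mu_bar - mu) ** j / factorial(j)   # add Taylor term j
    return v[:n], v[n:n + m], v[n + m:]
```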

Question: Given an approximation to v(µ), is there a method to compute accurate approximations to v(µ̄) for 0 < µ̄ ≪ µ?
Answer: Yes, use approximations to multiple points on the trajectory

- compute (µ_j, x_j) such that x_j ≈ x(µ_j) for j = 1, . . . , p
- solve the least-squares problem

      minimize_c  (1/2) Σ_{j=1}^{p} ‖ p_r(µ_j) − x_j ‖²

  for the coefficient vector c, where p_r(µ) := Σ_{l=0}^{r} c_l µ^l and r ≤ p
- use the approximation p_r(µ̄) ≈ x(µ̄)

For most of the talk we focus on higher-order approximations, but the key concepts may be adapted to a least-squares approach
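
A hypothetical sketch (ours) of the curve-fitting alternative: fit each component of x by a degree-r polynomial in µ through approximate trajectory points and evaluate the fit at µ̄; the sample points in the usage example are our own.

```python
# Hypothetical sketch (not from the talk): least-squares fit of trajectory
# samples (mu_j, x_j) by p_r(mu) = sum_{l=0}^r c_l mu^l, then extrapolation.
import numpy as np

def lstsq_predict(mus, xs, mu_bar, r):
    # xs has shape (p, n): row j is the approximation x_j ~ x(mu_j)
    V = np.vander(np.asarray(mus, float), N=r + 1, increasing=True)   # p x (r+1)
    C, *_ = np.linalg.lstsq(V, np.asarray(xs, float), rcond=None)     # (r+1) x n
    return np.vander(np.array([mu_bar]), N=r + 1, increasing=True) @ C

# e.g. three points on the central path of  min 0.5 x^2  s.t.  x >= 1, for
# which x(mu) = (1 + sqrt(1 + 4 mu))/2, extrapolated to mu_bar = 0:
mus = np.array([1e-4, 5e-5, 2.5e-5])
xs = (0.5 * (1.0 + np.sqrt(1.0 + 4.0 * mus)))[:, None]
print(lstsq_predict(mus, xs, mu_bar=0.0, r=2))   # close to x* = 1
```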

Nondegenerate example

    minimize_x  (1/2) x²    subject to  x ≥ 1

Errors for various approximation schemes, µ = 1.0e−5

             x(0)       x(µ²)      x(µ/10)
Interp 2     9.99e−10   9.99e−10   8.90e−10
Interp 3     1.99e−12   1.99e−12   1.77e−12
Interp 4     4.86e−14   4.84e−14   4.30e−14
Hermite 2    0.00e+00   0.00e+00   0.00e+00
Spline 2     9.99e−10   9.99e−10   8.90e−10
Spline 3     1.99e−12   1.99e−12   1.77e−12
Spline 4     4.86e−14   4.86e−14   4.30e−14
Taylor 2     1.99e−15   1.99e−15   1.33e−15

Degenerate QP

Degenerate example

    minimize_x  (1/2) x²    subject to  0 ≤ x ≤ 1

Errors for various approximation schemes, µ = 1.0e−5

             x(0)       x(µ²)      x(µ/10)
Interp 4     2.34e−03   2.33e−03   1.42e−03
Hermite 3    1.40e−03   1.39e−03   5.99e−04
Hermite 4    1.40e−03   1.39e−03   5.98e−04
Spline 4     2.34e−03   2.33e−03   1.42e−03
Taylor 2     1.18e−03   1.17e−03   4.19e−04
Taylor 3     9.88e−04   9.78e−04   2.75e−04
Taylor 4     8.64e−04   8.54e−04   1.93e−04
Taylor 5     7.78e−04   7.68e−04   1.42e−04

The main idea – go Puiseux!

Based on the result that ‖v(µ) − v*‖ = Θ(√µ), we postulate

    v(µ) = v* + Σ_{j=1}^{∞} c_j µ^{j/2},

and by defining ρ := √µ we have

    v̄(ρ) := v* + Σ_{j=1}^{∞} c_j ρ^j = v(µ)

- Hope that v̄(ρ) is analytic
- Use higher-order or least-squares approximations to v̄(ρ)
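
The same fitting machinery can simply be fed ρ_j = √µ_j instead of µ_j. The hypothetical sketch below (ours, not from the talk) does this for the degenerate one-dimensional example and compares the extrapolation error at µ̄ = 0; the sample values of µ and the cubic degree are our choices.

```python
# Hypothetical sketch (not from the talk): fit the degenerate 1-D example in
# rho = sqrt(mu) instead of mu and compare extrapolation errors at mu_bar = 0.
import numpy as np

def x_deg(mu):
    # central path of  min 0.5 x^2  s.t. 0 <= x <= 1:
    # x - mu/x + mu/(1-x) = 0   <=>   x^3 - x^2 - 2*mu*x + mu = 0
    r = np.roots([1.0, -1.0, -2.0 * mu, mu])
    r = r[np.abs(r.imag) < 1e-12].real
    return float(r[(r > 0.0) & (r < 1.0)][0])

def extrapolate_to_zero(ts, xs, deg=3):
    s = ts / ts.max()                    # rescale abscissae for conditioning
    return np.polyval(np.polyfit(s, xs, deg), 0.0)

mus = np.array([4e-5, 2e-5, 1e-5, 5e-6])
xs = np.array([x_deg(m) for m in mus])
err_mu = abs(extrapolate_to_zero(mus, xs))            # polynomial in mu
err_rho = abs(extrapolate_to_zero(np.sqrt(mus), xs))  # polynomial in rho = sqrt(mu)
print(err_mu, err_rho)   # the rho fit is many orders of magnitude more accurate
```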

Degenerate example revisited

    minimize_x  (1/2) x²    subject to  0 ≤ x ≤ 1

Errors for various approximation schemes (built in ρ = √µ), µ = 1.0e−5

             x(0)       x(µ²)      x(µ/10)
Interp 4     6.99e−09   6.96e−09   4.15e−09
Hermite 3    4.87e−13   4.82e−13   1.73e−13
Hermite 4    1.27e−14   1.26e−14   4.44e−15
Spline 4     6.99e−09   6.96e−09   4.15e−09
Taylor 2     1.19e−08   1.17e−08   3.79e−09
Taylor 3     5.49e−13   5.42e−13   1.29e−13
Taylor 4     1.38e−13   1.36e−13   2.07e−14
Taylor 5     4.97e−16   4.88e−16   5.05e−17

Comparison of the "µ" and "ρ" parametrizations, µ = 1.0e−5

             ----------- x(µ) -----------     ----------- x̄(ρ) -----------
             x(0)       x(µ²)      x(µ/10)    x(0)       x(µ²)      x(µ/10)
Interp 4     2.34e−03   2.33e−03   1.42e−03   6.99e−09   6.96e−09   4.15e−09
Hermite 3    1.40e−03   1.39e−03   5.99e−04   4.87e−13   4.82e−13   1.73e−13
Hermite 4    1.40e−03   1.39e−03   5.98e−04   1.27e−14   1.26e−14   4.44e−15
Spline 4     2.34e−03   2.33e−03   1.42e−03   6.99e−09   6.96e−09   4.15e−09
Taylor 2     1.18e−03   1.17e−03   4.19e−04   1.19e−08   1.17e−08   3.79e−09
Taylor 3     9.88e−04   9.78e−04   2.75e−04   5.49e−13   5.42e−13   1.29e−13
Taylor 4     8.64e−04   8.54e−04   1.93e−04   1.38e−13   1.36e−13   2.07e−14
Taylor 5     7.78e−04   7.68e−04   1.42e−04   4.97e−16   4.88e−16   5.05e−17

Give credit where credit is due

We are not the first, but few seem aware of this result
- Stoer/Wechs/Mizuno (1998) proved that x̄(ρ) is analytic for the sufficient horizontal linear complementarity problem
- Stoer/Wechs/Mizuno (1998) suggest different strategies for using x̄(ρ) and indicator functions as the basis of an algorithm
- Zhao/Sun (1999) suggested a "predictor-corrector" method and a single-step method based on perturbed trajectories
- We do not know of any numerical results
- We do not know of any software that incorporates this idea
- We have our own ideas for implementation

Implementation

Strategy 1: the "lunge"

1. Compute v(µ) for µ ≈ 10^{−6} using an interior-point solver
2. Approximate v* by a sixth-degree Puiseux expansion
3. Obtain an estimate of an optimal active set
4. Verify optimality (steps 3-4 are sketched below)
5. Use "cross-over" to resolve multipliers for general constraints

- May use the Puiseux expansion to "skip down the path"
- Strategy has been added to QPB
- Highly accurate solutions are typically returned
- Active-set identification appears to be improved
- Requires staying "close" to the path
- QPB is a two-stage method for solving indefinite QP
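
A minimal sketch of steps 3-4 (our code, not the QPB crossover): a bound is predicted active when the extrapolated x̂_i does not exceed ẑ_i, and optimality is then checked via the equality-constrained KKT system on the predicted free variables. The prediction rule and tolerance are assumptions.

```python
# Hypothetical sketch (not the QPB code): active-set prediction from Puiseux
# extrapolations (x_hat, z_hat) of x(mu), z(mu) at mu = 0, plus a basic check.
import numpy as np

def lunge_check(H, c, A, b, x_hat, z_hat, tol=1e-8):
    active = x_hat <= z_hat               # predicted active bounds (x_i = 0)
    F = ~active                           # predicted free variables
    nF, m = int(F.sum()), len(b)
    # KKT system of  min c_F' x_F + 0.5 x_F' H_FF x_F  s.t.  A_F x_F = b
    K = np.block([[H[np.ix_(F, F)], A[:, F].T],
                  [A[:, F],         np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-c[F], b]))
    x = np.zeros_like(x_hat); x[F] = sol[:nF]
    y = -sol[nF:]                         # sign chosen so that Hx + c = A'y + z
    z = H @ x + c - A.T @ y               # z_F ~ 0, z_active are bound multipliers
    ok = bool(np.all(x[F] >= -tol) and np.all(z[active] >= -tol))
    return x, y, z, ok
```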

Strategy 2: perturbed trajectory + Zhang + Zhao/Sun

1. Given v ≈ v(µ) where µ = (x^T z)/n, compute σ ∈ (0, 1)
2. Approximate v(σµ) by a Puiseux expansion
3. Perform a linesearch to ensure (see the sketch below)

       X+ z+ ≥ γ µ+ e    and    µ+ ≥ ν ‖( Hx+ + c − A^T y+ − z+ ,  Ax+ − b )‖

- Single phase
- Intended for convex QP
- Higher-order friendly
- Dynamic updating of µ
- CQP in GALAHAD
- May use a final "lunge"

[Figure: one iteration moves from v ≈ v(µ) to v+ near v(σµ), heading toward v*]
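
A hypothetical backtracking routine for step 3 (ours, not the CQP code), using the two acceptance tests as reconstructed above; the parameter values γ, ν and the shrink factor are placeholders.

```python
# Hypothetical sketch (not the CQP code): backtracking on the Puiseux predictor
# step (dx, dy, dz) using the two acceptance tests above.
import numpy as np

def backtrack(H, c, A, b, x, y, z, dx, dy, dz, gamma=1e-3, nu=1e-2, shrink=0.5):
    alpha = 1.0
    for _ in range(60):
        xp, yp, zp = x + alpha * dx, y + alpha * dy, z + alpha * dz
        if np.all(xp > 0.0) and np.all(zp > 0.0):
            mup = xp @ zp / len(xp)                     # mu+ = (x+)' z+ / n
            res = np.concatenate([H @ xp + c - A.T @ yp - zp, A @ xp - b])
            if np.all(xp * zp >= gamma * mup) and mup >= nu * np.linalg.norm(res):
                return xp, yp, zp, alpha
        alpha *= shrink
    raise RuntimeError("no acceptable step length found")
```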

Algorithm CQP on the degenerate problem using a Taylor expansion

 It   p-feas    d-feas    com-slk   obj       step      mu
  0   0.0E+00   0.0E+00   1.0E+00   5.0E-01      -      1.0E-02
  1   0.0E+00   0.0E+00   1.5E-01   7.3E-02   1.0E+00   1.5E-03
  2   0.0E+00   0.0E+00   2.1E-02   1.1E-02   1.0E+00   2.1E-04
  3   0.0E+00   0.0E+00   3.1E-03   1.6E-03   1.0E+00   3.1E-05
  4   0.0E+00   0.0E+00   4.6E-04   2.3E-04   1.0E+00   4.6E-06
  5   0.0E+00   0.0E+00   6.7E-05   3.4E-05   1.0E+00   5.5E-07
  :      :         :         :         :         :         :
 20   0.0E+00   0.0E+00   1.2E-17   5.9E-18   1.0E+00   4.0E-26
 21   0.0E+00   0.0E+00   1.7E-18   8.3E-19   1.0E+00   2.1E-27
 22   0.0E+00   0.0E+00   2.3E-19   1.2E-19   1.0E+00   1.1E-28
 23   0.0E+00   0.0E+00   3.3E-20   1.6E-20   1.0E+00   5.9E-30
 24   0.0E+00   0.0E+00   4.6E-21   2.3E-21   1.0E+00   3.1E-31
 25   0.0E+00   0.0E+00   6.5E-22   3.2E-22   1.0E+00   1.6E-32
 26   0.0E+00   3.2E-27   9.1E-23   4.5E-23   1.0E+00   8.7E-34
 27   0.0E+00   8.1E-28   1.3E-23   6.4E-24   1.0E+00   4.6E-35
 28   0.0E+00   4.0E-28   1.8E-24   9.0E-25   1.0E+00   2.4E-36
 29   0.0E+00   2.0E-28   2.5E-25   1.3E-25   1.0E+00   1.3E-37

Optimal solution value = 5.0259E-13

Algorithm CQP on the degenerate problem using a Puiseux expansion

 It   p-feas    d-feas    com-slk   obj       step      mu
  0   0.0E+00   0.0E+00   1.0E+00   5.0E-01      -      1.0E-02
  1   0.0E+00   0.0E+00   2.2E-04   1.1E-04   1.0E+00   2.2E-06
  2   0.0E+00   1.7E-18   5.0E-08   2.5E-08   1.0E+00   1.1E-11
  3   0.0E+00   8.1E-20   5.6E-15   2.8E-15   1.0E+00   4.2E-22
  4   0.0E+00   0.0E+00   7.1E-29   3.5E-29   1.0E+00   6.0E-43

Optimal solution value = 8.4205E-15

Summary

- Highly accurate solution of degenerate QPs is not possible by standard higher-order or curve-fitting methods
- Degeneracy in QP may be overcome by using a "fractional" Puiseux expansion, which is analytic (convex QP)
- Proposed two strategies for incorporating the Puiseux expansion into an interior-point framework
- Appears to work well for nonconvex QP, although this is not covered by existing theory
- Extends to classes of LCP
- Higher-order Puiseux expansions improve efficiency by reducing the number of overall factorizations, although generally more system solves are required

References

Josef Stoer, Martin Wechs, and Shinji Mizuno.
High-order infeasible-interior-point methods for solving sufficient linear complementarity problems.
Mathematics of Operations Research, 23, pp. 832-862, 1998.

Yin Zhang.
On the convergence of a class of infeasible interior-point methods for the horizontal linear complementarity problem.
SIAM Journal on Optimization, 4(1), pp. 208-227, 1994.

Gongyun Zhao and Jie Sun.
On the rate of local convergence of high-order infeasible path-following algorithms for P*-linear complementarity problems.
Computational Optimization and Applications, 14, pp. 293-307, 1999.
