OPTIMAL CONTROL APPLICATIONS AND METHODS Optim. Control Appl. Meth. (2013) Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/oca.2065

On the use of gradual dense–sparse discretizations in receding horizon control

John W. C. Robinson¹,*,† and Petter Ögren²

¹Department of Aeronautics, Swedish Defence Research Agency, SE-164 90, Stockholm, Sweden
²Centre for Autonomous Systems (CAS), Royal Institute of Technology (KTH), SE-100 44, Stockholm, Sweden

SUMMARY

A key factor for success in implementations of real time optimal control, such as receding horizon control (RHC), is making efficient use of computational resources. The main trade-off is then between the efficiency and accuracy of each RHC iteration and the resulting overall optimality properties of the concatenated iterations, that is, how closely these represent a solution to the underlying infinite time optimal control problem (OCP). Both these issues can be addressed by adapting the RHC solution strategy to the expected form of the solution. Using gradual dense–sparse (GDS) node distributions in direct transcription formulations of the finite time OCP solved in each RHC iteration is a way of adapting the node distribution of this OCP to the fact that it is actually part of an RHC scheme. We have previously argued that this is reasonable, because the near future plan must be implemented now, but the far future plan can and will be revised later. In this paper, we investigate RHC applications where the asymptotic qualitative behavior of the OCP solution can be analyzed in advance. For some classes of systems, explicit exponential convergence rates of the solutions can be computed. We establish such convergence rates for a class of control affine nonlinear systems with a locally quadratic cost and propose to use versions of GDS node distributions for such systems because they will (eventually) be better adapted to the form of the solution. The advantages of the GDS approach in such settings are illustrated with simulations. Copyright © 2013 John Wiley & Sons, Ltd.

Received 19 October 2011; Revised 19 June 2012; Accepted 17 January 2013

KEY WORDS: optimal control; receding horizon control; numerical solution; convergence

1. INTRODUCTION

Solution methods for optimal control problems (OCPs) have by now advanced to the state where they can be a viable alternative also in real time receding horizon control (RHC) applications [1, 2]. An important factor in this development is the recent advances in numerical procedures for the solution of OCPs using direct transcription methods, in particular the pseudo-spectral methods [3–5]. The pseudo-spectral methods offer fast convergence, have proven consistency properties [6, 7], and are easy to implement using general nonlinear programming software [8]. However, they require a particular distribution of the collocation nodes, which can conflict with the need to adapt the node density to the character of the solution. Generally, a higher node density is needed where the solution is expected to vary more [9]. One way to address this conflict is to divide the optimization horizon into a number of phases with different node densities. For certain OCPs, the asymptotic qualitative behavior of the solution is known a priori, such as nonlinear problems with locally quadratic cost where a linearization of the problem (locally) gives properties reminiscent of the linear quadratic (LQ) case [10–12]. This holds true also in the RHC

*Correspondence to: John Robinson, Swedish Defence Research Agency, SE-164 90 Stockholm, Sweden.
†E-mail: [email protected]


setting [13]. In particular, Jadbabaie and Hauser [14] have shown that exponential convergence (and thus stability) of the RHC solution can result in such cases even if no end term cost is used, provided that the planning horizon is long enough. This means that using one and the same node density when solving such RHC problems numerically will sooner or later be inefficient, and it provides a key to the node distribution problem when the horizon is long enough. Clearly, a long planning horizon can be beneficial in many RHC applications because a longer horizon is closer to the (infinite) horizon of the underlying OCP one normally wishes to solve. However, with a given number of nodes, a longer horizon reduces the density, and hence the accuracy, of the solution; the node distribution therefore has to be carefully adapted to the expected shape of the solution. In the present work, we start by deriving explicit bounds for the (exponential) convergence rate of solutions to OCPs for a general class of control affine systems that can be analyzed with the asymptotic linearization technique previously mentioned. This complements the results in [14] for RHC (without terminal cost) and provides information that can be used as a guide when choosing the planning horizon in RHC. It also gives a solid argument for using gradual dense–sparse (GDS) discretizations, see Figure 1, in the OCPs that are part of an RHC scheme. Using GDS discretizations for the OCPs in RHC problems was proposed in [15], inspired by the work on fixed end time OCPs by Kumar and Seywald [16]. GDS discretizations (similar to those in [15]) have also recently been obtained as part of a numerical solution to the problem of (efficiently) attaining a given accuracy for discretization schemes in (infinite horizon) constrained linear quadratic OCPs with iterative refinements of the node distribution [17].
We propose to use exponentially decaying node densities as a default choice for problems where exponential convergence can be expected, for instance, via asymptotic linearization analysis, and we present a numerically efficient scheme for implementing exponentially decaying GDS node densities (suitable for RHC). This provides a way to effectively achieve longer planning horizons without sacrificing accuracy, thus making better use of computational resources. We moreover show that exponential convergence (and thus stability) for RHC can occur for planning horizons which are not large. The outline of the paper is as follows. In Section 2, we present a family of OCPs for finite as well as infinite horizons that provide the basis for the generic RHC problem to be solved. In Section 3, we analyze these OCPs; in particular, we establish conditions for exponential convergence as well as explicit estimates for the convergence rate. In Section 4, we propose a solution for the RHC problem

Figure 1. Illustration of a gradual dense–sparse (GDS) node distribution (top) compared with two uniform distributions (middle and bottom) sampling an exponentially convergent trajectory. If a uniform discretization is to cover the same horizon as a GDS, the critical (in RHC) initial part gets very sparse (middle), and if a uniform discretization is to have the same initial density as a GDS, the important (in RHC) horizon length gets much shorter (bottom). We propose to use the GDS as a default choice for systems that are known to give exponentially convergent optimal solutions.



for the case where exponential convergence can be expected, and in Section 5, we show simulations illustrating the benefits of the approach. Finally, some conclusions are drawn in Section 6.

2. PROBLEM FORMULATION

We assume that the reader is familiar with the basic ideas behind RHC: to iteratively solve OCPs and implement the first phase of those solutions [18]. The underlying family of OCPs is as follows.

Problem 1 (OCP)
The objective is to solve
$$\inf_{u(\cdot)\,\in\,\mathcal{U}}\ J_T\big(x, u(\cdot)\big), \qquad \text{s.t.}\quad \dot x = f(x) + g(x)u, \quad x(0) = x_0, \qquad (1)$$
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ and the vector fields $f, g_i$ with values in $\mathbb{R}^n$ are $C^2$ with $f(0) = 0$, and $f(x) + g(x)u$ has a linearization $(A, B) = (D_x f(0), g(0))$, which is controllable. The cost function is defined as
$$J_T\big(x, u(\cdot)\big) = \int_0^T L\big(x(t), u(t)\big)\,dt \qquad (2)$$
where the planning horizon $T$ can be either finite or infinite, that is, $T \in [0,\infty]$. The control functions $u(\cdot)$ are assumed to be piecewise continuous and restricted so that their pointwise values lie in a set $U$ which is convex, compact, and contains 0 in its interior. The set of such control functions $u(\cdot)$ is denoted $\mathcal{U}$. This means in particular that a solution trajectory $t \mapsto x(t)$ to (1) is piecewise $C^1$ and uniformly Lipschitz continuous as long as it stays in a compact set. We further assume that there is some growth restriction on $x(\cdot)$ so as to exclude finite escape time behavior. (This cannot happen if, for example, $f, g$ are bounded, and then solutions to (1) are globally uniformly Lipschitz.) The incremental cost $L : \mathbb{R}^n \times \mathbb{R}^m \to [0,\infty)$ in (2) is assumed to be $C^2$ in both arguments and normalized such that $L(0,0) = 0$.

The OCP in Problem 1 describes both the underlying infinite time OCP one wishes to solve using an RHC formulation and the individual OCPs that are part of the RHC scheme. On the basis of Problem 1, we wish to solve the following RHC problem (using a direct method).

Problem 2 (RHC node distribution problem)
Using the notation in [14], let $RH(\delta, T)$ be an RHC scheme with planning horizon $T \in (0,\infty)$ and execution horizon $\delta \in (0,T)$ applied to Problem 1 with infinite horizon. Let the total number of nodes, as well as the length and node distribution (and hence density) of the first RHC phase (interval of node points), be given, and assume that the length of this phase is equal to the execution horizon. What horizon length and node distribution should be used for the rest of the RHC horizon (i.e., the remaining phases)?

Remark 1
Note that a given total number of RHC nodes is reasonable when the computation time is limited. Moreover, a given minimal density might be needed by the low level controller that is to execute the first phase of the computed trajectory.
In the formulation and analysis of the OCPs, we consider general piecewise continuous controls $u(\cdot)$, but in the later applications to GDS discretizations that are part of RHC schemes, we shall only consider controls $u(\cdot)$ that have their discontinuities confined to RHC phase boundaries.
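To make the direct transcription setting of Problems 1 and 2 concrete, the sketch below transcribes a scalar instance of (1)–(2) ($\dot x = u$ with $L(x,u) = x^2 + u^2$) on a prescribed, possibly nonuniform, node grid. Trapezoidal collocation is used here purely as one simple transcription choice; the grid, the function name `transcribe_and_solve`, and the solver settings are illustrative assumptions, not the scheme used in the paper's simulations.

```python
import numpy as np
from scipy.optimize import minimize

def transcribe_and_solve(t_nodes, x0):
    """Direct transcription of the scalar OCP xdot = u, L = x^2 + u^2 on an
    arbitrary node grid, using trapezoidal quadrature and collocation."""
    n = len(t_nodes)
    h = np.diff(t_nodes)                       # (possibly nonuniform) steps

    def split(z):                              # z = [x_0..x_{n-1}, u_0..u_{n-1}]
        return z[:n], z[n:]

    def cost(z):                               # trapezoidal quadrature of L
        x, u = split(z)
        L = x**2 + u**2
        return np.sum(0.5 * h * (L[:-1] + L[1:]))

    def defects(z):                            # trapezoidal dynamics defects
        x, u = split(z)
        return np.diff(x) - 0.5 * h * (u[:-1] + u[1:])

    cons = [{"type": "eq", "fun": defects},
            {"type": "eq", "fun": lambda z: z[0] - x0}]   # initial condition
    z0 = np.concatenate([np.linspace(x0, 0.0, n), np.zeros(n)])
    res = minimize(cost, z0, constraints=cons, method="SLSQP",
                   options={"maxiter": 500})
    return res.fun, res.x[:n]

T = 5.0
t = np.linspace(0.0, 1.0, 31)**2 * T   # denser near t = 0 (GDS-like grid)
J, x_traj = transcribe_and_solve(t, x0=1.0)
print(J)
```

For this instance the exact optimal value is $J_T^*(x_0) = \tanh(T)\,x_0^2$ (cf. Example 1 in Section 3.4), which gives a convenient accuracy check for any chosen node distribution.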



3. ANALYSIS OF EXPONENTIAL CONVERGENCE

In this section, Theorem 3 will provide the basis for a solution to Problem 2 under Assumption 1.

Assumption 1
The incremental cost $L$ in (2) is quadratically positive definite with‡
$$L(x, u) \ge c_x \|x\|^2 + c_u \|u\|^2, \qquad x \in \mathbb{R}^n,\ u \in \mathbb{R}^m, \qquad (3)$$
for some $c_x, c_u > 0$, and $u \mapsto L(x,u)$ is convex for all $x$. Moreover, $f, g, L$ are such that the minimum value $J_T^*(x)$ of the cost function $J_T(x, u(\cdot))$ is attained§ for all (relevant) $x$ by some admissible (but not necessarily unique) control $u(\cdot) \in \mathcal{U}$, and $J_T^*(\cdot)$ is proper (see succeeding text).

Remark 2
The first part of Assumption 1 (cf. [13, 14] and [19]) implies that the Hessian $D^2_{(x,u)}L(0,0)$ of $L$ at $(0,0)$ is positive definite, and this guarantees that $J_T^*(\cdot)$ is positive definite near 0 (Lemma 4).

Henceforth, we shall assume that Assumption 1 holds. We shall denote by $x_T^*(t;x)$ the value at time $t$ of an optimal solution to Problem 1 starting at $x$ at time 0, and let $u_T^*(t;x)$ be the value of the corresponding optimal control at time $t$. The sublevel sets of $J_T^*(\cdot)$ for $T \in [0,\infty]$ will be of interest, and we define them as
$$\Gamma_r^T = \big\{x \in \mathbb{R}^n : J_T^*(x) \le r^2\big\}, \qquad r \in [0,\infty),$$
and we moreover define $\Gamma^T = \bigcup_{r \in [0,\infty)} \Gamma_r^T$. It is easy to see (using (3)) that under a growth restriction on $f, g_i$ (in particular if $f, g_i$ are bounded), $J_T^*(\cdot)$ is proper on $\Gamma^T$ so that $\Gamma_r^T$ is compact (for the continuity, see Lemma 4).

3.1. Convergence of solutions

The following three lemmas will be used to prove Theorems 1–3, where Theorems 1 and 2 describe the tail behavior of OCP solutions that are part of the RHC schemes we consider in Theorem 3.

Lemma 1
For $j = 1, 2$, let $x_{T_j}^*(\cdot\,;x)$ be an optimal solution over $[0,T_j]$, with $T_j \in [0,\infty]$, starting from $x \in \Gamma^\infty$ and with corresponding optimal control $u_{T_j}^*(\cdot\,;x)$, where $T_1 \le T_2$. Then, we have
$$J_{T_2}^*(x) \le J_{T_1}^*(x) + J_{T_2-T_1}^*\big(x_{T_1}^*(T_1;x)\big) \qquad (4)$$
and
$$J_{T_2}^*(x) \ge J_{T_1}^*(x) + J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x)\big) \ge J_{T_1}^*(x). \qquad (5)$$

Proof
Let $u_{T_2}(\cdot)$ be an arbitrary control in $\mathcal{U}$ defined over $[0,T_2]$ and let $x_{T_2}(\cdot\,;x)$ be the corresponding solution over $[0,T_2]$ to the dynamic system in (1) starting at $x \in \Gamma^\infty$. Then,
$$J_{T_2}^*(x) \le \int_0^{T_1} L\big(x_{T_2}(t;x), u_{T_2}(t)\big)\,dt + \int_{T_1}^{T_2} L\big(x_{T_2}(t;x), u_{T_2}(t)\big)\,dt$$

‡ Here and henceforth, $\|\cdot\|$ denotes the two-norm.
§ Many results for this are available. For instance, in the case $T < \infty$, the present hypothesis is sufficient to guarantee the existence of an optimal control [20] (cf. also [10, Theorem 1.1], [19, Cor. 5.2; Theorem 6.1] and [21, Proposition 4]).


and if we in particular choose $u_{T_2}(\cdot)$ so that it becomes identical with $u_{T_1}^*(\cdot\,;x)$ over $[0,T_1)$ and so that $u_{T_2}(T_1+t) = u_{T_2-T_1}^*\big(t;\, x_{T_1}^*(T_1;x)\big)$ for $t \in [0, T_2-T_1]$, we obtain
$$\int_0^{T_1} L\big(x_{T_2}(t;x), u_{T_2}(t)\big)\,dt + \int_{T_1}^{T_2} L\big(x_{T_2}(t;x), u_{T_2}(t)\big)\,dt$$
$$= \int_0^{T_1} L\big(x_{T_1}^*(t;x), u_{T_1}^*(t;x)\big)\,dt + \int_0^{T_2-T_1} L\big(x_{T_2}(T_1+t;x), u_{T_2}(T_1+t)\big)\,dt$$
$$= J_{T_1}^*(x) + \int_0^{T_2-T_1} L\Big(x_{T_2-T_1}^*\big(t; x_{T_1}^*(T_1;x)\big),\, u_{T_2-T_1}^*\big(t; x_{T_1}^*(T_1;x)\big)\Big)\,dt = J_{T_1}^*(x) + J_{T_2-T_1}^*\big(x_{T_1}^*(T_1;x)\big),$$
which yields (4). Further, by the principle of optimality, the state $x_{T_2}^*(\cdot\,;x)$ and control $u_{T_2}^*(\cdot\,;x)$ restricted to the time interval $[T_1, T_2]$ will give the optimal value $J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x)\big)$ to the cost function for the OCP over $[0, T_2-T_1]$ starting at $x_{T_2}^*(T_1;x)$, and thus
$$J_{T_2}^*(x) = J_{T_1}^*(x) + \int_{T_1}^{T_2} L\big(x_{T_2}^*(t;x), u_{T_2}^*(t;x)\big)\,dt + \int_0^{T_1} \Big[L\big(x_{T_2}^*(t;x), u_{T_2}^*(t;x)\big) - L\big(x_{T_1}^*(t;x), u_{T_1}^*(t;x)\big)\Big]\,dt$$
$$\ge J_{T_1}^*(x) + \int_{T_1}^{T_2} L\big(x_{T_2}^*(t;x), u_{T_2}^*(t;x)\big)\,dt = J_{T_1}^*(x) + J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x)\big) \ge J_{T_1}^*(x),$$
which yields (5). □
Lemma 2
Consider the setting in Lemma 1 and let $\dot J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x);\, u_{T_2}^*(T_1;x)\big)$, for $T_1 \in (0,T_2)$, denote the derivative of the function $(0,T_2) \ni t \mapsto J_{T_2-t}^*\big(x_{T_2}^*(t;x)\big)$ obtained using the control $u_{T_2}^*(t;x)$. Then,
$$\dot J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x);\, u_{T_2}^*(T_1;x)\big) = -L\big(x_{T_2}^*(T_1;x), u_{T_2}^*(T_1;x)\big) \le -c_x \big\|x_{T_2}^*(T_1;x)\big\|^2. \qquad (6)$$

Proof
The fact that an optimal control $u_T^*$ exists implies (by Pontryagin's Maximum Principle, see e.g. [19]) continuity of the map $t \mapsto u_T^*(t;x)$. This, in its turn, implies that the derivative $\dot J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x);\, u_{T_2}^*(T_1;x)\big)$ is well defined for $T_1 \in (0,T_2)$, $x \in \Gamma^\infty$ (cf. [13]), and by (3), we have
$$\dot J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x);\, u_{T_2}^*(T_1;x)\big) = \lim_{\tau \searrow 0} \frac{J_{T_2-T_1-\tau}^*\big(x_{T_2}^*(T_1+\tau;x)\big) - J_{T_2-T_1}^*\big(x_{T_2}^*(T_1;x)\big)}{\tau}$$
$$= -L\big(x_{T_2}^*(T_1;x), u_{T_2}^*(T_1;x)\big) \le -c_x \big\|x_{T_2}^*(T_1;x)\big\|^2. \qquad \Box$$

Because the function $t \mapsto J_\infty^*\big(x_\infty^*(t;x)\big)$ is nonincreasing, every level set $\Gamma_r^\infty$, $r > 0$, is positively invariant for infinite horizon optimal trajectories $x_\infty^*$, and because the integral over $[0,\infty)$ defining $J_\infty^*(\cdot)$ is finite on $\Gamma^\infty$, its tail must asymptotically vanish, that is, $\lim_{t\to\infty} J_\infty^*\big(x_\infty^*(t;x)\big) = 0$. Moreover, because $L$ satisfies (3) and $t \mapsto x_\infty^*(t;x)$ is uniformly Lipschitz continuous for $x \in \Gamma_r^\infty$, it follows (by Barbalat's lemma) that $\lim_{t\to\infty} x_\infty^*(t;x) = 0$. Similar tail properties for finite horizon problems are given in Lemma 3. As a preparation for that lemma, we note that for a uniformly


Lipschitz continuous function $\alpha : [0,t] \to [0,\infty)$ with Lipschitz constant $\lambda > 0$, we have (by piecewise linear bounding) the estimate
$$\int_0^t \alpha(s)\,ds \ \ge\ \min\left\{\frac{tM}{4},\ \frac{M^2}{2\lambda}\right\} \ =\ \begin{cases} tM/4, & \text{if } t\lambda < 2M,\\ M^2/(2\lambda), & \text{if } t\lambda \ge 2M, \end{cases} \qquad (7)$$
where $M = \max_{s\in[0,t]} \alpha(s)$.

Lemma 3
Let $\{T_j\}_{j=1}^\infty \subset [0,\infty)$ be an increasing sequence of times such that $T_j \to \infty$ as $j \to \infty$ and assume without loss of generality that $T_{j+1} - T_j \ge \epsilon$ for some $\epsilon > 0$. Assume further that all the trajectories $\{x_\tau(\cdot\,;x)\}_{\tau>0}$, which are solutions to (1) with $x \in \Gamma^\infty$, $u \in \mathcal{U}$, are uniformly Lipschitz continuous (this holds e.g. if $f, g$ are bounded). Then,
$$\lim_{j,k\to\infty}\ \max_{t\in[T_j,T_k]} \big\|x_{T_k}^*(t;x)\big\| = 0, \qquad x \in \Gamma^\infty. \qquad (8)$$
Proof
Under the assumptions in the lemma, it follows from (5) that the sequence of functions
$$\big\{J_\infty^*(\cdot) - J_{T_j}^*(\cdot)\big\}_{j=1}^\infty \qquad (9)$$
is non-negative and decreasing on $\Gamma^\infty$ and thus has a pointwise limit there. From (3) and (5), we also see that the convergence of the sequence in (9) implies
$$0 = \lim_{j,k\to\infty} \int_{T_j}^{T_k} L\big(x_{T_k}^*(t;x), u_{T_k}^*(t;x)\big)\,dt \ \ge\ c_x \lim_{j,k\to\infty} \int_{T_j}^{T_k} \big\|x_{T_k}^*(t;x)\big\|^2\,dt, \qquad x \in \Gamma^\infty. \qquad (10)$$
Calculating a Lipschitz constant for $t \mapsto \|x_{T_k}^*(t;x)\|^2$ from that of $\{x_\tau(\cdot\,;x)\}_{\tau>0}$ and applying the estimate in (7) to (10) then yields the result (8). □

From (4), we have $J_\infty^*(x) - J_{T_j}^*(x) \le J_\infty^*\big(x_{T_j}^*(T_j;x)\big)$, but the right-hand side tends to zero as $j \to \infty$ by the continuity of $J_\infty^*(\cdot)$ (Lemma 4) when (8) holds. Thus, in this case, the limit of the sequence in (9) is 0. In fact, by Dini's theorem, the convergence is uniform on compact subsets of $\Gamma^\infty$ (cf. [14]).

3.2. Properties of the value function near the origin

The next result is due to Lukes [10] for the case $T = \infty$, $U = \mathbb{R}^m$. The case $T < \infty$ with constrained control is given in [14], and the development in [21] shows that the result in [10] for the case $T = \infty$ can be extended to our case with $U$-constrained controls.

Lemma 4
For $T \in (0,\infty]$, the function $J_T^*(\cdot)$ is continuous and positive definite on $\Gamma^T$ and is quadratically positive definite and $C^2$ near 0. More precisely, $J_T^*(x) > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$ and
$$J_T^*(x) = \tfrac12 x^T P_T x + h_T(x), \qquad h_T(x) = O\big(\|x\|^3\big), \qquad (11)$$
where $P_T \in \mathbb{R}^{n\times n}$ is symmetric and positive definite. If $T < \infty$, then $P_T = P(T)$ is the left-endpoint value of the unique symmetric positive definite solution to the Riccati differential equation
$$\dot P(t) = \big(A - BR^{-1}S^T\big)^T P(t) + P(t)\big(A - BR^{-1}S^T\big) - P(t)BR^{-1}B^T P(t) + Q - SR^{-1}S^T, \qquad (12)$$


where the matrices $Q \in \mathbb{R}^{n\times n}$, $R \in \mathbb{R}^{m\times m}$ and $S \in \mathbb{R}^{n\times m}$ are defined in terms of the Hessian of $L$ at $(0,0)$ as
$$D^2_{(x,u)}L(0,0) = \begin{bmatrix} Q & S \\ S^T & R \end{bmatrix} \qquad (13)$$
and the boundary condition on the right is $P(0) = 0$. If instead $T = \infty$, then $P_T$ is the solution to the limiting algebraic Riccati equation.

Remark 3
A matrix of the form (13) is positive definite if and only if both $R$ and $Q - SR^{-1}S^T$ are positive definite, and when this holds, the Riccati equation (12) has a unique symmetric positive definite solution (because $(A,B)$ is controllable and the linear quadratic (LQ) regulator problem with cross term penalty can be transformed to a block diagonal penalty problem [22, Sec. 3.4]). By Remark 2, we know that the Hessian in (13) has these properties.

For later use, we note also that the representation (11) shows that there exists a neighborhood $N^T$ of 0 such that
$$0 < c_T < \frac{J_T^*(x)}{\|x\|^2} < d_T < \infty, \qquad x \in N^T \setminus \{0\}, \qquad (14)$$
for some $c_T, d_T$ with $c_T < \tfrac12\lambda_{\min}(P_T)$ and $d_T > \tfrac12\lambda_{\max}(P_T)$.

3.3. Exponential convergence

The following theorem provides an explicit estimate of exponential decay for the optimal value function for the OCP along an optimal state trajectory.

Theorem 1 (Exponential convergence of OCP value functions)
Assume that the family $\{x_\tau(\cdot\,;x)\}_{\tau>0}$ of solutions to (1) with $x \in \Gamma^\infty$, $u \in \mathcal{U}$ is uniformly Lipschitz continuous.¶ Then, for each $x \in \Gamma^\infty$, there exist $r_0 > 0$, $T_0 < \infty$ such that $x_T^*(T_1;x) \in \Gamma_{r_0}^\infty$ for $0 < T_0 < T_1 < T \le \infty$ and the following estimate holds
$$J_{T-T_1}^*\big(x_T^*(T_1;x)\big) \le e^{-c(T_1-T_0)}\, J_{T-T_0}^*\big(x_T^*(T_0;x)\big), \qquad (15)$$
where $c = c_x/d_\infty$ with $c_x$ as in (3) and $d_\infty$ as in (14) with $N^T = \Gamma_{r_0}^\infty$.

Proof
From Lemma 3 and the continuity of $J_\infty^*(\cdot)$, it follows that for $x \in \Gamma^\infty$, there exist $r_0 > 0$, $T_0 < \infty$ such that (14) holds with $N^T = \Gamma_{r_0}^\infty$ and $x_T^*(T_1;x) \in \Gamma_{r_0}^\infty$ for $0 < T_0 < T_1 < T \le \infty$. Hence, from (14), we see that we have
$$\big\|x_T^*(T_1;x)\big\|^2 \ \ge\ \frac{1}{d_\infty}\, J_\infty^*\big(x_T^*(T_1;x)\big) \ \ge\ \frac{1}{d_\infty}\, J_{T-T_1}^*\big(x_T^*(T_1;x)\big). \qquad (16)$$
Combining this with the estimate (6) in Lemma 2 for the derivative $\dot J_{T-T_1}^*(\cdot)$, we obtain
$$\dot J_{T-T_1}^*\big(x_T^*(T_1;x);\, u_T^*(T_1;x)\big) \le -\frac{c_x}{d_\infty}\, J_{T-T_1}^*\big(x_T^*(T_1;x)\big),$$
where $u_T^*(\cdot\,;x)$ is the control corresponding to $x_T^*(\cdot\,;x)$. An application of Grönwall's lemma to the function $[T_0,T_1] \ni t \mapsto J_{T-t}^*\big(x_T^*(t;x)\big)$ then gives (15). □

¶ Clearly, when $T = \infty$, the Lipschitz continuity assumption on $\{x_\tau(\cdot\,;x)\}_{\tau>0}$ is redundant in this and the following result, and Lipschitz continuity is not needed for the RHC result of Theorem 3 either.


For fixed $T_0, T$ (where $T$ is the planning horizon), the right-hand side of (15) decays exponentially with $T_1 \in (T_0, T)$ and, via the inequality (15), this gives a description of the (tail) decay of the optimal value function along an optimal (finite or infinite horizon) solution trajectory. Using this result, it is straightforward to similarly establish explicit estimates also for the rate of convergence of optimal solutions to the OCP. (For another type of estimate for exponential convergence in the infinite horizon case, see [10].)

Theorem 2 (Exponential OCP convergence)
Let the assumptions of Theorem 1 be fulfilled. Then, for each $x \in \Gamma^\infty$, there exist $r_0 > 0$, $T_0 < \infty$ such that for $T_0 < T_1 < T \le \infty$, we have $x_T^*(T_1;x) \in \Gamma_{r_0}^\infty$ and
$$\int_0^{T-T_1} \big\|x_T^*(T_1+t;x)\big\|^2\,dt \ \le\ \frac{1}{c}\, e^{-c(T_1-T_0)}\, \big\|x_T^*(T_0;x)\big\|^2, \qquad (17)$$
and in the case of infinite horizon also
$$\big\|x_\infty^*(T_1;x)\big\|^2 \ \le\ \frac{d_\infty}{c_\infty}\, e^{-c(T_1-T_0)}\, \big\|x_\infty^*(T_0;x)\big\|^2, \qquad (18)$$
where $c = c_x/d_\infty$ with $c_x$ as in (3) and $d_\infty$ as in (14) with $N^T = \Gamma_{r_0}^\infty$.

Proof
From Theorem 1, we know that there exist $r_0 > 0$, $T_0 < \infty$ such that $x_T^*(T_1;x) \in \Gamma_{r_0}^\infty$ for $T_0 < T_1 < T \le \infty$ and (15) holds. Therefore, from (14) and (15), we obtain
$$J_{T-T_1}^*\big(x_T^*(T_1;x)\big) \le e^{-c(T_1-T_0)}\, J_{T-T_0}^*\big(x_T^*(T_0;x)\big) \le e^{-c(T_1-T_0)}\, J_\infty^*\big(x_T^*(T_0;x)\big) \le e^{-c(T_1-T_0)}\, d_\infty\, \big\|x_T^*(T_0;x)\big\|^2, \qquad (19)$$
where $c = c_x/d_\infty$. From the principle of optimality and (3), we obtain
$$J_{T-T_1}^*\big(x_T^*(T_1;x)\big) = \int_0^{T-T_1} L\big(x_T^*(T_1+t;x), u_T^*(T_1+t;x)\big)\,dt$$
$$\ge\ c_x \int_0^{T-T_1} \big\|x_T^*(T_1+t;x)\big\|^2\,dt + c_u \int_0^{T-T_1} \big\|u_T^*(T_1+t;x)\big\|^2\,dt \ \ge\ c_x \int_0^{T-T_1} \big\|x_T^*(T_1+t;x)\big\|^2\,dt, \qquad T < \infty. \qquad (20)$$
This, together with (19), yields (17). From (14) and (19), we obtain (18). □

Remark 4
For $T < \infty$, the constant $c = c_x/d_\infty$ determines the bound for the rate of convergence of $x_T^*(\cdot\,;x)$ to 0 (when the tail of the trajectory is in $\Gamma_{r_0}^\infty$). The constant $c_x$ must satisfy $c_x \le \tfrac12\lambda_{\min}(Q)$ and $d_\infty$ must satisfy $d_\infty > \tfrac12\lambda_{\max}(P_\infty)$. Hence, in this case, we have $c \le \lambda_{\min}(Q)/\lambda_{\max}(P_\infty)$, and the tightest bound for $c$ we can use near $x = 0$ is $c_b = \lambda_{\min}(Q)/\lambda_{\max}(P_\infty)$.

Remark 5
From (19) and the first two lines of (20), we see that in the considered OCPs, we will also have exponential convergence in an $L^2$ norm sense for the tails of the optimal control trajectories. Furthermore, the tails of the optimal state trajectories decay at an exponential rate also in the case of finite horizon $T < \infty$. Indeed, by considering (17) and applying (7), for growing $T, T_1$ with $T - T_1 \ge \epsilon > 0$, we obtain a bound for $\max_{t\in[0,T-T_1]} \|x_T^*(T_1+t;x)\|^3$ which decays exponentially with $T_1$ with exponent $c$. If the optimal controls are uniformly Lipschitz, such a bound follows for them too, and the optimal controls are uniformly Lipschitz under a slight strengthening|| of the hypothesis on $L$ [23].

|| For instance, this holds if our $L$ is of the form $L(x,u) = \ell(x) + u^T R u$ [23].
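To make the estimate in Remark 4 concrete, the snippet below computes $c_b = \lambda_{\min}(Q)/\lambda_{\max}(P_\infty)$ for a hypothetical double integrator with unit weights ($Q = I$, $R = 1$, $S = 0$); the plant and weights are illustrative choices, not taken from the paper. $P_\infty$ is obtained from the limiting algebraic Riccati equation via SciPy.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative plant: double integrator with unit LQ weights (Q = I, R = 1, S = 0).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# P_inf solves the algebraic Riccati equation A^T P + P A - P B R^{-1} B^T P + Q = 0.
P_inf = solve_continuous_are(A, B, Q, R)

# Remark 4 rate estimate: c_b = lambda_min(Q) / lambda_max(P_inf).
c_b = np.min(np.linalg.eigvalsh(Q)) / np.max(np.linalg.eigvalsh(P_inf))
print(P_inf)   # [[sqrt(3), 1], [1, sqrt(3)]] for this choice of weights
print(c_b)     # 1/(sqrt(3) + 1), about 0.366
```

For these weights $P_\infty$ has eigenvalues $\sqrt3 \pm 1$, so $c_b = 1/(\sqrt3+1) \approx 0.37$, which via Theorem 1 suggests a decay exponent of roughly that size for the value function near the origin.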


3.4. Receding horizon control convergence

We now turn to the RHC case. A concatenation of the execution horizon parts from $j$ consecutive RHC solutions $x_T^*\big(\delta;\, x_T^*(\delta;\, \cdots x_T^*(\delta;x)\cdots)\big)$ with common execution horizon $\delta$ and planning horizon $T$, initiated from a point $x$, will be denoted $\big(x_T^*(\delta;x)\big)^j$ (we set $\big(x_T^*(\delta;x)\big)^0 = x$). The following result is based on [14, Thm. 4] but adds explicit estimates of convergence rate.

Theorem 3 (Exponential RHC convergence)
Let $r_1 \le r$ and $T_1 < \infty$ be such that $\Gamma_{r_1}^{T_1} \subseteq \Gamma_r^\infty$. Then, for each RHC execution horizon $\delta > 0$ and constant $b \in (0,1)$, there exist constants $r_0 > 0$, $T_2 < \infty$ and a positive integer $N$ such that the concatenated solutions $\big(x_T^*(\delta;x)\big)^j$ of the RHC scheme $RH(\delta,T)$ starting at $x \in \Gamma_{r_1}^{T-\delta}$ with $T - \delta > \max\{T_1, T_2, \delta\}$ satisfy $\big(x_T^*(\delta;x)\big)^j \in \Gamma_{r_0}^\infty$ for $j \ge N$ and the following estimates hold
$$J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j+1}\Big) \le \tilde c^{\,(j-N+1)}\, J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{N}\Big), \qquad (21)$$
$$\Big\|\big(x_T^*(\delta;x)\big)^{j+1}\Big\|^2 \le \frac{d_{T-\delta}}{c_\delta}\, \tilde c^{\,(j-N+1)}\, \Big\|\big(x_T^*(\delta;x)\big)^{N}\Big\|^2, \qquad (22)$$
for $j \ge N$, where $\tilde c = 1 + c_\delta(b-1)/d_{T-\delta} \in (0,1)$ and $c_\delta, d_{T-\delta}$ are as in (14) with $N^T = \Gamma_{r_0}^\infty$.

Proof
First, because $J_{T_1}^*(\cdot)$ is continuous at 0, it is clear that for any $r > 0$, there always exists an $r_1 > 0$ such that $\Gamma_{r_1}^{T_1} \subseteq \Gamma_r^\infty$. Then, from the inequality (5) in Lemma 1, we have
$$J_\delta^*(x) + J_{T-\delta}^*\big(x_T^*(\delta;x)\big) \le J_T^*(x) \le J_\infty^*(x), \qquad x \in \Gamma^\infty,\ T \ge \delta,$$
and it follows that
$$J_{T-\delta}^*\big(x_T^*(\delta;x)\big) - J_{T-\delta}^*(x) \le J_\infty^*(x) - J_\delta^*(x) - J_{T-\delta}^*(x), \qquad x \in \Gamma^\infty,\ T \ge \delta. \qquad (23)$$
Further, the uniform convergence of the sequence in (9) on $\Gamma_r^\infty$ (together with the representation (11)) implies (cf. [14, Thm. 4]) that for any $b \in (0,1)$, there exists $T_2 \in (0,\infty)$ such that $J_\infty^*(x) - J_{T-\delta}^*(x) \le b\, J_\delta^*(x)$ for $x \in \Gamma_r^\infty$, $T - \delta \ge T_2$. Together with (23), this gives
$$J_{T-\delta}^*\big(x_T^*(\delta;x)\big) - J_{T-\delta}^*(x) \le (b-1)\, J_\delta^*(x), \qquad x \in \Gamma_r^\infty,\ T - \delta \ge T_2. \qquad (24)$$
Now, if $T - \delta > \max\{T_1, T_2\}$, then $\Gamma_{r_1}^{T-\delta} \subseteq \Gamma_{r_1}^{T_1} \subseteq \Gamma_r^\infty$ and the inequality (24) shows that when $x \in \Gamma_{r_1}^{T-\delta}$, then $\big(x_T^*(\delta;x)\big)^j \in \Gamma_{r_1}^{T-\delta}$ for all $j$, and by standard (discrete time) Lyapunov stability theory (e.g., [24, Thm. 1]), it follows that $\lim_{j\to\infty} \big(x_T^*(\delta;x)\big)^j = 0$. From the continuity of $J_\infty^*(\cdot)$ at 0, it follows that $\Gamma_{r_0}^\infty \subseteq N^\delta \cap N^{T-\delta}$ for $r_0 > 0$ small enough, where $N^\delta$ and $N^{T-\delta}$ are two neighborhoods of 0 in which estimates of the type (14) are valid for $J_\delta^*$ and $J_{T-\delta}^*$, respectively. It follows (using (5) again) that
$$c_\delta \|x\|^2 \le J_\delta^*(x) \le J_{T-\delta}^*(x) \le d_{T-\delta}\,\|x\|^2, \qquad x \in \Gamma_{r_0}^\infty,\ T - \delta \ge \delta, \qquad (25)$$
for some $c_\delta, d_{T-\delta} > 0$, and by the convergence of $\big\{\big(x_T^*(\delta;x)\big)^j\big\}_{j=0}^\infty$ for $x \in \Gamma_{r_1}^{T-\delta}$, it follows that there exists $N \in \mathbb{N}$ such that $\big(x_T^*(\delta;x)\big)^j \in \Gamma_{r_0}^\infty$ for $j \ge N$. The inequalities in (24) and (25) together imply
$$J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j+1}\Big) - J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j}\Big) \le (b-1)\, J_\delta^*\Big(\big(x_T^*(\delta;x)\big)^{j}\Big)$$
$$\le\ c_\delta(b-1)\, \Big\|\big(x_T^*(\delta;x)\big)^{j}\Big\|^2 \le \frac{c_\delta(b-1)}{d_{T-\delta}}\, J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j}\Big), \qquad x \in \Gamma_{r_1}^{T-\delta},\ T - \delta > \max\{T_1, T_2, \delta\},\ j \ge N, \qquad (26)$$


and if we define $\tilde c = 1 + c_\delta(b-1)/d_{T-\delta}$, we have $\tilde c \in (0,1)$, and the estimate in (21) follows. From (21) and (25), we obtain
$$\Big\|\big(x_T^*(\delta;x)\big)^{j+1}\Big\|^2 \le \frac{1}{c_\delta}\, J_\delta^*\Big(\big(x_T^*(\delta;x)\big)^{j+1}\Big) \le \frac{1}{c_\delta}\, J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j+1}\Big)$$
$$\le\ \frac{1}{c_\delta}\, \tilde c^{\,(j-N+1)}\, J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{N}\Big) \le \frac{d_{T-\delta}}{c_\delta}\, \tilde c^{\,(j-N+1)}\, \Big\|\big(x_T^*(\delta;x)\big)^{N}\Big\|^2,$$
$$x \in \Gamma_{r_1}^{T-\delta},\ T - \delta > \max\{T_1, T_2, \delta\},\ j \ge N,$$
which gives (22). □

Remark 6
The bound for the convergence rate in the RHC case is determined by the constant $\tilde c$. An estimate for the best value of $\tilde c$ near 0 can be obtained by setting $b = 0$ in the expression for $\tilde c$ and replacing the ratio $c_\delta/d_{T-\delta}$ with the eigenvalue ratio $\lambda_{\min}(P_\delta)/\lambda_{\max}(P_{T-\delta})$. This yields the estimated best value $\tilde c_b$ for $\tilde c$ as $\tilde c_b = 1 - \lambda_{\min}(P_\delta)/\lambda_{\max}(P_{T-\delta})$. For small values of $\delta$ and large values of $T$, the value $\tilde c_b$ is easily linked to the estimate $c_b$ of the best value of $c$ discussed in Remark 4. The first-order approximation of the solution $P_\delta$ to (12) is $P_\delta = \delta\big(Q - SR^{-1}S^T\big)$, so that (approximately) $\lambda_{\min}(P_\delta) = \delta\,\lambda_{\min}\big(Q - SR^{-1}S^T\big)$, and for large $T$, we can use the approximation $P_{T-\delta} = P_\infty$. The eigenvalue ratio $\lambda_{\min}(P_\delta)/\lambda_{\max}(P_{T-\delta})$ can thus be approximated as $\delta\,\lambda_{\min}\big(Q - SR^{-1}S^T\big)/\lambda_{\max}(P_\infty)$ and the estimate $\tilde c_b$ can be approximated as $1 - \delta\,\lambda_{\min}\big(Q - SR^{-1}S^T\big)/\lambda_{\max}(P_\infty)$.

By exploiting the local quadratic properties of $J_T^*(\cdot)$ further, the result in Theorem 3 can be extended to yield a stability result which holds also for planning horizons $T$ which need not be large.

Proposition 1
If $T \in (0,\infty)$ and $\delta \in (0,T)$ are such that the solution $P$ to (12) satisfies
$$P_T < P_{T-\delta} + P_\delta, \qquad (27)$$
then there exist $r_2 > 0$, $b \in (0,1)$ such that any RHC scheme $RH(\delta,T)$ starting at $x \in \Gamma_{r_2}^{T-\delta}$ is exponentially convergent as in (21), (22) with $N = 1$, and $N^T = \Gamma_{r_2}^\infty$.

Proof
When $0 < \delta < T < \infty$, we can proceed analogously as in the proof of Theorem 3 to obtain
$$J_{T-\delta}^*\big(x_T^*(\delta;x)\big) - J_{T-\delta}^*(x) \le J_T^*(x) - J_\delta^*(x) - J_{T-\delta}^*(x), \qquad x \in \Gamma^\infty.$$
Further, when (27) holds, there exists $\beta \in (0,1)$ such that $P_T < P_{T-\delta} + \beta P_\delta$, and this taken together with the representation (11) shows that there exists $b \in (0,1)$ such that $J_T^*(x) - J_{T-\delta}^*(x) \le b\, J_\delta^*(x)$ for $x \in \Gamma_{r_1}^\infty$, provided $r_1 > 0$ is sufficiently small. Moreover, for sufficiently small $r_2 \le r_1$, we have $\Gamma_{r_2}^{T-\delta} \subseteq \Gamma_{r_1}^\infty$ by the continuity of $J_{T-\delta}^*(\cdot)$. Hence,
$$J_{T-\delta}^*\big(x_T^*(\delta;x)\big) - J_{T-\delta}^*(x) \le (b-1)\, J_\delta^*(x), \qquad x \in \Gamma_{r_2}^{T-\delta}.$$
The rest of the proof is analogous to that of Theorem 3. □

The situation in (27) is not uncommon because of the monotonicity and exponential convergence properties of the solution $P_T$ to the Riccati equation (12) (cf. e.g. [25]). It can occur for values of $T$ which are moderate, or even small, as is easily verified, for example, for a double integrator (e.g., the one corresponding to the local approximation near 0 of the example in the succeeding text). The mechanism is illustrated by the following example.


Example 1
Consider the scalar integrator system $\dot\xi = u$, $\xi(0) = x$, with cost $J_T(x, u(\cdot)) = \int_0^T \xi^2(t) + u^2(t)\,dt$. The Riccati equation (12) for this problem is $\dot P(t) = 1 - P^2(t)$, $P(0) = 0$, with positive solution $P(t) = \tanh(t)$ (for $t \ge 0$) and left-endpoint value $P_T = P(T) = \tanh(T)$. The function $[0,\infty) \ni t \mapsto \tanh(t)$ converges to $P_\infty = 1$ exponentially fast and is moreover concave, so that (27) holds for all $T > 0$, $\delta \in (0,T)$.

3.4.1. Tightness. Before we leave this section, we make some, somewhat informal, observations about the sharpness of the bounds in Theorems 1–3 and Proposition 1 and the corresponding estimates in Remarks 4 and 6. For an RHC scheme $RH(\delta,T)$ as in Theorem 3, the reduction rate for the minimal value of the cost can be defined as
$$\Big[J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j+1}\Big) - J_{T-\delta}^*\Big(\big(x_T^*(\delta;x)\big)^{j}\Big)\Big]\Big/\delta$$
and this ratio is (for sufficiently large $T$ and $j$) bounded above by $(\tilde c - 1)\, J_{T-\delta}^*\big(\big(x_T^*(\delta;x)\big)^{j}\big)/\delta$ (Theorem 3). An approximation of the best value of this bound can be obtained by considering the infinite horizon case, for which we have $(\tilde c_b - 1)\, J_\infty^*\big(x_{\infty,j}^*\big)/\delta = -\lambda_{\min}\big(Q - SR^{-1}S^T\big)/\lambda_{\max}(P_\infty)\; J_\infty^*\big(x_{\infty,j}^*\big)$, where $x_{\infty,j}^* = \big(x_\infty^*(\delta;x)\big)^j$ (Remark 6). This, in turn, can be approximated (for $x_{\infty,j}^*$ close to 0, or large $j$) as $-\tfrac12\, \lambda_{\min}\big(Q - SR^{-1}S^T\big)/\lambda_{\max}(P_\infty)\; \big(x_{\infty,j}^*\big)^T P_\infty\, x_{\infty,j}^*$ (Lemma 4, Theorem 1). For small $\delta$, the reduction rate for the cost could (for the infinite horizon approximation) be compared with the derivative $\dot J_\infty^*\big(x_{\infty,j}^*;\, u_{\infty,j}^*\big) = -L\big(x_{\infty,j}^*, u_{\infty,j}^*\big)$ (cf. (6)), where $u_{\infty,j}^* = u_\infty^*\big(0;\, \big(x_\infty^*(\delta;x)\big)^j\big)$ is an optimal control for the $j$th RHC iteration. This derivative can, for $x_{\infty,j}^*$ near 0, be approximated by a Taylor expansion of $L$ as
$$L\big(x_{\infty,j}^*, u_{\infty,j}^*\big) = \tfrac12\Big[\big(x_{\infty,j}^*\big)^T Q\, x_{\infty,j}^* + 2\big(x_{\infty,j}^*\big)^T S\, u_{\infty,j}^* + \big(u_{\infty,j}^*\big)^T R\, u_{\infty,j}^*\Big] = \tfrac12\, \big(x_{\infty,j}^*\big)^T \Big[Q - SR^{-1}S^T + P_\infty BR^{-1}B^T P_\infty\Big]\, x_{\infty,j}^*,$$
where we have used the fact [10] that for $x_{\infty,j}^*$ near 0, the optimal control $u_{\infty,j}^*$ can be approximated by an LQ state feedback control as $u_{\infty,j}^* = -R^{-1}\big(S^T + B^T P_\infty\big)\, x_{\infty,j}^*$.

By comparing the two estimates obtained earlier, it is clear that the reduction rate estimate is not sharp in the case considered. Indeed, we have
$$L\big(x_{\infty,j}^*, u_{\infty,j}^*\big) \ \ge\ \tfrac12\, \lambda_{\min}\big(Q - SR^{-1}S^T\big)\, \big\|x_{\infty,j}^*\big\|^2 + \tfrac12\, \big(x_{\infty,j}^*\big)^T P_\infty BR^{-1}B^T P_\infty\, x_{\infty,j}^*,$$
whereas the possible values for the aforementioned reduction rate estimate are $\tfrac12\, \lambda_{\min}\big(Q - SR^{-1}S^T\big)\, \big\|x_{\infty,j}^*\big\|^2\, p$ for $p \in \big[\lambda_{\min}(P_\infty)/\lambda_{\max}(P_\infty),\ 1\big]$. The reduction rate bounds thus give pessimistic estimates, but they have the advantage that they are applicable in cases when $\delta$ is not necessarily small and $T$ is not necessarily large.
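The scalar integrator of Example 1 also makes the preceding estimates easy to check numerically. The sketch below integrates the Riccati equation of Example 1, verifies condition (27) for a sample horizon pair, and compares the actual per-iteration RHC contraction with the Remark 6 estimate $\tilde c_b$. The closed-form feedback $u = -\tanh(T-t)\,x$ follows from $P_t = \tanh(t)$; the horizon values $T = 5$, $\delta = 0.5$ are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Example 1: the Riccati equation (12) reduces to P' = 1 - P^2, P(0) = 0,
# whose solution is P(t) = tanh(t).
sol = solve_ivp(lambda t, P: 1.0 - P**2, (0.0, 5.0), [0.0],
                t_eval=np.linspace(0.0, 5.0, 51), rtol=1e-9, atol=1e-12)
riccati_error = np.max(np.abs(sol.y[0] - np.tanh(sol.t)))

# Condition (27) reads tanh(T) < tanh(T - d) + tanh(d); it holds for every
# T > 0 and d in (0, T) by strict concavity of tanh on [0, inf).
T, d = 5.0, 0.5
gap_27 = np.tanh(T - d) + np.tanh(d) - np.tanh(T)

# For this system the finite-horizon optimal feedback is u = -tanh(T - t) x,
# so one iteration of RH(d, T) contracts the state by
# rho = cosh(T - d)/cosh(T), i.e. the squared norm by rho**2, which the
# (scalar) Remark 6 estimate bounds by 1 - tanh(d)/tanh(T - d).
rho = np.cosh(T - d) / np.cosh(T)
c_tilde_b = 1.0 - np.tanh(d) / np.tanh(T - d)
print(riccati_error, gap_27, rho**2, c_tilde_b)
```

Consistent with the tightness discussion above, the actual squared contraction $\rho^2 \approx 0.37$ is well below the (pessimistic) bound $\tilde c_b \approx 0.54$.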

4. PROPOSED SOLUTION

As a general solution approach to the node distribution problem described in Problem 2, we now propose the following: (1) Over the basic planning interval $[0,T]$, use a GDS discretization, that is, one that is dense in the left part and gradually gets sparser toward the right. (2) In the GDS discretization, let the node density decay at the same rate as, or faster than, the expected decay of the optimal cost. (3) For the case where exponential decay of the optimal cost can be expected, so-called dyadic divisioning can be used to improve the reuse of old solutions as starting guesses in the next RHC iteration. We will now motivate all three items.

4.1. Gradual dense–sparse discretizations in receding horizon control

We believe that there are two good reasons for using a GDS approach, item (1). The first reason was investigated in [15] and relies on the fact that in an RHC iteration, near term parts of a plan need to

Optim. Control Appl. Meth. (2013) DOI: 10.1002/oca

J. W. C. ROBINSON AND P. ÖGREN

be very accurate because they will be executed shortly, whereas far term parts of a plan will be revised and refined a number of times before they are implemented. The second reason, which is independent of the first, is connected to the analysis in Section 3. The basic idea is that if we know beforehand that the solution will vary less in its later parts, and combine this with the fact that node distributions in general need to be denser where the solution varies more, we have another reason for applying GDS distributions. In detail, we argue as follows. As noted earlier, the number of nodes, as well as the density and duration of the first phase, is often given in a real time RHC setting. What remains to be decided is the planning horizon $T$ and the node distribution in the planning interval $[0, T]$. Generally, to improve accuracy, nodes should be distributed according to the variability of the solution, with higher node density in the regions where the solution is expected to vary more [9]. From Theorems 1 and 2 (and Remark 5) in the previous section, we know that, for long horizons and under the stated conditions, the optimal OCP solution decays exponentially. Moreover, from Theorem 3, we know that the RHC solution, when sampled with a sampling interval equal to the execution horizon $\delta$, inherits this property. As remarked earlier, under a slight strengthening of the hypothesis on $L$, the optimal controls are uniformly (in time) Lipschitz [23], and we then know by Remark 5 that the optimal controls too decay exponentially fast. Thus, an optimal control can (under a slight strengthening of hypothesis) be expected to converge exponentially to zero, and the derivative $\dot x_T$ will similarly decay exponentially along an optimal solution. It follows from [9] that, a priori, a GDS distribution is a natural choice. We now turn to item (2).
On the basis of the aforementioned observations, we argue that exponentially decaying GDS discretizations are a good default choice in the OCPs occurring in the RHC settings considered here. If estimates of the exponential convergence rate are known, such as those previously mentioned, they can be used as a guide when deciding the length and node distribution of the second phase and onwards, in order to minimize the total error over all phases given a bound on the total number of nodes. When the length and density of the first phase are not fixed, the entire node distribution over the phases can be adapted to the expected exponential convergence. This way, the density is indeed adapted to what is known a priori about the solutions. The aforementioned arguments hold even when the fact that the OCP is part of an RHC scheme is not considered. The reason for choosing density distributions that converge even faster than the bound is the first argument for item (1): in the RHC iterations, errors in the far time part of the plan are far less critical than errors in the near term, because the near term plan will be executed, whereas the far term plan will be revised and refined (with increased node densities) a number of times. The reason for proposing item (3) is as follows. In an RHC scheme $RH(\delta, T)$ with planning horizon $T$ and execution horizon $\delta$, it is in general beneficial to reuse the previously computed solution over $[t_0 + \delta, t_0 + T]$ as initial guess when computing a new solution over $[t_0 + \delta, t_0 + T + \delta]$. If a uniform node distribution has been used, this is very straightforward, but if a nonuniform distribution has been used, as in GDS, it becomes important to devise a method to efficiently reuse old data to produce a new initial guess. Dyadic divisioning GDS [15] is an exponentially decaying node distribution that makes it possible to reuse old data in an efficient way and at the same time is easy to implement.
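As an illustration of item (2), nodes can be placed by inverting the cumulative distribution of an exponentially decaying density. The function below is our own sketch (the decay rate `lam` and the node count are arbitrary illustrative values, not parameters from the paper):

```python
import math

def exp_decay_nodes(N, T, lam):
    """Place N + 1 nodes on [0, T] with local node density proportional
    to exp(-lam * t), by inverting the cumulative density
    F(t) = (1 - exp(-lam * t)) / (1 - exp(-lam * T))."""
    c = 1.0 - math.exp(-lam * T)
    return [-math.log(1.0 - (k / N) * c) / lam for k in range(N + 1)]

nodes = exp_decay_nodes(N=20, T=32.0, lam=0.1)
gaps = [b - a for a, b in zip(nodes, nodes[1:])]
print([round(t, 2) for t in nodes])
print(gaps[0] < gaps[-1])   # spacing grows toward the right endpoint
```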
It employs the multiwavelet framework of Alpert et al. [26] using locally polynomial bases and is well adapted to the Legendre–Gauss node distributions used in recently developed pseudospectral OCP solver software [6–8, 27]. The dyadic divisioning scheme is illustrated in Figure 2. The RHC problem over $[t_0, t_0 + T]$ is divided into $m + 1$ phases, where the time duration of each phase is an even multiple of the duration $\delta$ of the execution horizon and $\delta = T/2^m$, for some positive integer $m$. All phases have a Legendre–Gauss node distribution for the collocation, with the same number of nodes, but the time durations of the phases gradually increase toward the right end of the planning interval $[t_0, t_0 + T]$. Dyadic divisioning GDS can be (cyclically) updated in a manner which efficiently reuses the old solution as an initial guess and keeps the total amount of collocation (node) data constant in memory. This is described in detail in [15].
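The dyadic phase layout described above can be sketched as follows (a reconstruction from the description; the function name is ours). With $\delta = 2$ s and $m = 4$, it reproduces the five-phase, $T = 32$ s profile used in Section 5:

```python
def dyadic_gds_phases(delta, m):
    """Phase durations of a dyadic GDS discretization: m + 1 phases with
    total horizon T = 2**m * delta; the first two phases have duration
    delta, and each later phase doubles the previous duration."""
    return [delta] + [delta * 2**k for k in range(m)]

# The setup used in Section 5: delta = 2 s, m = 4 gives 5 phases, T = 32 s.
phases = dyadic_gds_phases(delta=2.0, m=4)
print(phases)          # one box per phase in Figure 2
print(sum(phases))     # total planning horizon T
```

Since each phase holds the same number of nodes, the node density halves whenever the phase duration doubles, which gives the four distinct density levels of Figure 2.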


GDS DISCRETIZATIONS IN RHC

Figure 2. Illustration of the base gradual dense–sparse (GDS) discretization used (with cyclical updating) in the simulations. This GDS is so-called dyadic, with $m + 1 = 5$ phases and four different discretization densities. Each box symbolizes a phase of the RHC problem, all phases have the same number of nodes, and the height of a box quantifies the node density (on a linear scale).

5. EXAMPLE
In this section, we will investigate an RHC control example of a low-flying helicopter where the objective is to reach a point target and stop there.

5.1. Model
A helicopter can be modeled by the following equations of motion

$\dot r = v,$  (28)

$\dot v = f_g + f_d(r, v) + u,$  (29)

where $r = (x, y, z)^T$, $v = (v_x, v_y, v_z)^T \in \mathbb{R}^3$ and $u \in U \subset \mathbb{R}^3$. The variables $r, v$ symbolize the position and velocity, respectively, and together they represent the motion of a vehicle modeled as a point mass where the dynamics are governed by Newton's second law. The constant vector $f_g \in \mathbb{R}^3$ and vector function $f_d : \mathbb{R}^3 \to \mathbb{R}^3$ represent the acceleration due to gravity and aerodynamic drag, respectively. The former is constant and the latter is $C^2$ and satisfies $f_d(r, 0) = 0$ for all $r$ ($f_g, f_d$ will be further specified in the succeeding text). The variable $u$ is the control acceleration, which we assume is perpendicular to $v$.

5.1.1. Motion constraints. We consider the case of terrain following with a given (low) clearance. Thus, we assume that the position $r$ is constrained to a height surface**

$z = h(x, y)$  (30)

given by a $C^2$-function $h : \mathbb{R}^2 \to \mathbb{R}$ symbolizing the terrain altitude plus a clearance. This restricts the position to a two-dimensional manifold $M$, which can also be viewed as an embedded submanifold $\tilde M \subset \mathbb{R}^3$.

5.1.2. Transforming the constraints I. The constrained system (28), (29), and (30) will, in two steps, be transformed into an equivalent unconstrained system of reduced dimension. The first step is to transform (30) into an equivalent velocity constraint. For this, we note that (30) holds along a solution curve $t \mapsto (x(t), y(t), z(t))$ to (28) in some time interval $I \ni 0$ if and only if

$\frac{d}{dt}\big(z(t) - h(x(t), y(t))\big) = 0, \quad t \in I, \qquad z(0) = h(x(0), y(0)).$  (31)

The first condition in (31) can be written in terms of, for example, the (non-normalized) normal vector $\Psi(x, y) \in \mathbb{R}^3$ to $\tilde M$ at $(x, y)$ as

$v(t)^T \Psi(x(t), y(t)) = 0, \quad t \in I,$  (32)

** The control effort needed to keep the vehicle on this surface is not taken into account.

Copyright © 2013 John Wiley & Sons, Ltd.

Optim. Control Appl. Meth. (2013) DOI: 10.1002/oca

J. W. C. ROBINSON AND P. ÖGREN

where $\Psi(x, y) = \big({-\partial h(x, y)/\partial x},\, {-\partial h(x, y)/\partial y},\, 1\big)^T$. If we let $[\cdot]$ denote column span, then $[\Psi(x, y)]^\perp$ is the tangent space to $\tilde M$ at $(x, y)$ and we have $[\Psi(x, y)]^\perp = [\Phi(x, y)]$, where $\Phi(x, y) \in \mathbb{R}^{3 \times 2}$ is the matrix†† $\Phi(x, y) = (\Phi_1(x, y), \Phi_2(x, y))$ with linearly independent columns

$\Phi_1(x, y) = \Big(1,\, 0,\, \frac{\partial h(x, y)}{\partial x}\Big)^T, \qquad \Phi_2(x, y) = \Big(0,\, 1,\, \frac{\partial h(x, y)}{\partial y}\Big)^T.$  (33)

5.1.3. Transforming the constraints II. We now apply the 'constraint backstepping' trick once more and transform the velocity constraint (31) into an equivalent acceleration constraint. By defining

$\frac{\partial \Psi(r)}{\partial r} = \begin{pmatrix} -\frac{\partial^2 h(x, y)}{\partial x^2} & -\frac{\partial^2 h(x, y)}{\partial x \partial y} & 0 \\ -\frac{\partial^2 h(x, y)}{\partial x \partial y} & -\frac{\partial^2 h(x, y)}{\partial y^2} & 0 \\ 0 & 0 & 0 \end{pmatrix}$  (34)

and taking time derivatives of (32), we see that (31) holds if and only if the condition

$\dot v(t)^T \Psi(x(t), y(t)) + v(t)^T \frac{\partial \Psi(r(t))}{\partial r}\, v(t) = 0, \quad t \in I,$  (35)

holds together with the initial condition in (31) for $r$ and the initial condition obtained from (32) for $v$. From (35), it follows that the Hessian of the map (30), occurring in the matrix on the right in (34), determines the constraint-induced acceleration‡‡ $P_{[\Psi(x, y)]} \dot v$ along a curve on $\tilde M$. The acceleration components in the tangent space $[\Phi(x, y)]$ are, however, free. Moreover, from (32) and (35), it follows that $v_z$, $\dot v_z$ are determined uniquely by $(x, y, v_x, v_y)$ for curves on $\tilde M$. The last step in our reduction procedure is thus to project the acceleration terms (normalized forces) on the right in (29) onto $[\Phi]$ (to obtain the free accelerations) and express the result, as well as the right-hand side of (28), in terms of $(x, y, v_x, v_y)$. The redundant equations for the evolution of $z$ and $v_z$ can then be dropped.
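The tangency claim $[\Psi(x, y)]^\perp = [\Phi(x, y)]$ is easy to check numerically for any sample height map; the Gaussian bump below is an illustrative choice of ours, not the terrain used in Section 5:

```python
import math

# Check, for a sample height map, that the columns of Phi(x, y) are
# orthogonal to the normal Psi(x, y) = (-dh/dx, -dh/dy, 1)^T, i.e. that
# they span the tangent space of the height surface z = h(x, y).
def h(x, y):
    return 3.0 * math.exp(-((x - 1.0) ** 2 + y ** 2))   # illustrative terrain

def grad_h(x, y, eps=1e-6):
    hx = (h(x + eps, y) - h(x - eps, y)) / (2.0 * eps)  # central differences
    hy = (h(x, y + eps) - h(x, y - eps)) / (2.0 * eps)
    return hx, hy

x, y = 0.3, -0.7
hx, hy = grad_h(x, y)
psi = (-hx, -hy, 1.0)
phi1 = (1.0, 0.0, hx)
phi2 = (0.0, 1.0, hy)
dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
print(dot(psi, phi1), dot(psi, phi2))   # both vanish
```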

5.1.4. Projecting the accelerations/forces. The gravity-induced acceleration $f_g$ is given by $f_g = [0, 0, -g_c]^T$, where $g_c > 0$ is the gravitation acceleration constant ($g_c \approx 9.81$). It is easy to calculate the projection $P_{[\Phi(x, y)]} f_g$ of the vector $f_g$ onto $[\Phi(x, y)]$ using the representation (33). The acceleration due to aerodynamic drag $f_d$ is of the form $f_d(r, v) = -(1/2m)\, C_d S_{ref}\, \rho(z)\, \|v\|\, v$, where $m$ is the mass of the vehicle, $C_d, S_{ref} > 0$ are constants that depend on the shape and area of the vehicle, and $\rho(z)$ is the air density, which is assumed to follow the standard ISA profile.§§ From (32), we obtain $v = P_{[\Phi(x, y)]} v$ and thus $P_{[\Phi(x, y)]} f_d(r, v) = f_d(r, v)$. As for $u$, finally, we can express $P_{[\Phi(x, y)]} u$ as

$P_{[\Phi(x, y)]} u = \tilde u_1 \frac{\Phi_1(x, y)}{\|\Phi_1(x, y)\|} + \tilde u_2 \frac{\Phi_2(x, y)}{\|\Phi_2(x, y)\|},$

where $(\tilde u_1, \tilde u_2)$ are the coordinates for the control.

5.1.5. The reduced ODE. The constrained system (28), (29), and (30) can now equivalently be solved on $M$ in terms of the ODE

$\dot x = v_x,$  (36)
$\dot y = v_y,$  (37)
$\dot v_x = \big[P_{[\Phi(x, y)]}\big(f_g + f_d(r, v) + \tilde u\big)\big]_x,$  (38)
$\dot v_y = \big[P_{[\Phi(x, y)]}\big(f_g + f_d(r, v) + \tilde u\big)\big]_y,$  (39)
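The reduced right-hand side (36)–(39) can be sketched as follows. This is our own minimal reconstruction: the flat-terrain evaluation point, the numeric constants (taken from Section 5.2), and the approximate ISA troposphere density law are assumptions made for the illustration only.

```python
import numpy as np

m, Cd, Sref, gc = 10.0, 0.25, 0.75, 9.81   # values from Section 5.2

def rho(z):
    # Approximate ISA troposphere density (z in metres, valid below 11 km).
    return 1.225 * (1.0 - 0.0065 * z / 288.15) ** 4.2559

def rhs(x, y, vx, vy, u1, u2, hx, hy, z):
    """Right-hand side of (36)-(39): project gravity, drag, and control
    onto the tangent space [Phi(x, y)] via P = Phi (Phi^T Phi)^{-1} Phi^T."""
    Phi = np.array([[1.0, 0.0], [0.0, 1.0], [hx, hy]])
    P = Phi @ np.linalg.inv(Phi.T @ Phi) @ Phi.T          # tangent projector
    v = np.array([vx, vy, hx * vx + hy * vy])             # v_z from (32)
    fg = np.array([0.0, 0.0, -gc])
    fd = -(1.0 / (2.0 * m)) * Cd * Sref * rho(z) * np.linalg.norm(v) * v
    u = (u1 * Phi[:, 0] / np.linalg.norm(Phi[:, 0])
         + u2 * Phi[:, 1] / np.linalg.norm(Phi[:, 1]))
    a = P @ (fg + fd + u)
    return vx, vy, a[0], a[1]                             # (36)-(39)

# On flat terrain (hx = hy = 0), gravity projects to zero and only drag acts:
print(rhs(0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0))
```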

†† In the language of [28], the columns of $\Phi(x, y)$ represent a basis of generalized speeds. (Our reduction procedure is a version of the one popularized by Kane and coworkers.)
‡‡ Given the appropriate initial conditions for $r$ and $v$, this acceleration is equivalent to the position constraint (30).
§§ ISO standard 2533:1975. (The profile is smooth up to 11,000 m.)


where $z$ and $v_z$ have been eliminated everywhere (including the initial conditions) using (30) and (31).

5.2. Simulation results
Problem 2 was solved numerically using GPOPS v2.4 [8] with system dynamics (1) given by (36)–(39), using both RHC with equal (but nonuniform) node distribution in each phase, here denoted ED, and dyadic divisioning GDS. (For both ED and GDS, a Legendre–Gauss node distribution was used in each phase.) The parameters were $m = 10$ kg, $S_{ref} = 0.75$ m², $C_D = 0.25$, and the incremental cost $L$ in (2) was given by $L(x, y, v_x, v_y, \tilde u_1, \tilde u_2) = Q_r(x^2 + y^2) + Q_v\big(v_x^2 + v_y^2\big) + R_u\big(\tilde u_1^2 + \tilde u_2^2\big)$, where $Q_r = 0.001$, $Q_v = 1.5$, and $R_u = 1$. The constraint on the control was $\|u\| \leq 20$. In the ED case, the planning horizon $T$ was 10 s, the execution horizon $\delta$ was 2 s, and the basic planning interval $[0, T]$ was divided into five equal length phases, each containing 11 nodes. In the GDS case, a dyadic divisioning phase profile with five phases as in Figure 2 was used, with $\delta = 2$ s and 11 nodes per phase, yielding a total planning horizon $T$ of 32 s (with the same total number of nodes as in the ED case). Thus, the executed parts of the trajectories (the first phase from each planning interval) were computed with the same accuracy in the ED and GDS cases. The criterion (27) is fulfilled near 0 for the system when $T = 10$ s and $\delta = 2$ s. The height map $h$ used is illustrated in Figure 3; it is a sum of Gaussian functions multiplied by a 'zeroing' function so that $h(0, 0) = 0$ and $h$ is locally quadratic near $(0, 0)$. This implies that the system (36)–(39) becomes a double integrator near the origin, and the asymptotic results in the previous section are applicable there. It is straightforward to compute the solutions $P_2$, $P_{30}$ to (12) numerically, and the eigenvalue ratio $\lambda_{\min}(P_2)/\lambda_{\max}(P_{30})$ is $1.6 \times 10^{-3}$.
The value for the rate of decay per ED-RHC or GDS-RHC iteration (corresponding to the execution horizon $\delta$) of the minimal value of the cost function predicted by $\tilde c_b$ in Remark 6 is $\tilde c_b = 0.9984$. Likewise, the ratio $\lambda_{\min}(Q)/\lambda_{\max}(P_\infty)$ is easily computed to be $0.8 \times 10^{-3}$. Consequently, the value of $c_b$ in Remark 4 for the predicted rate of decay of the minimal value of the cost function in the underlying OCP problem is $c_b = 0.8 \times 10^{-3}$ (and the approximation in Remark 6 is thus valid). The rather slow convergence of the OCP problem means that the ED formulation (with the same node distribution in all phases) can be considered consistent with the expected convergence properties of the solution for planning horizons on the order of tens of seconds. In Figure 3, the trajectories resulting from 180 s of simulation, starting from 10 random locations along a line on the height map, are shown. The corresponding values of the computed cost $J_T$ along the trajectories are shown in Figure 4, where a (dashed) line with the slope corresponding to the value of $\tilde c_b$ given in the previous paragraph is also added. In Figure 5, finally, the values of the state and control variables are shown.
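The quoted value $\tilde c_b = 0.9984$ is consistent with reading Remark 6 (not reproduced in this section, so this formula is our inference from the tightness discussion in Section 3.4.1) as $\tilde c_b = 1 - \delta\, \lambda_{\min}\big(Q - S R^{-1} S^T\big)/\lambda_{\max}(P_\infty)$:

```python
# Per-iteration cost reduction factor from the stated ratio and execution
# horizon; the formula is our reading of Remark 6, not quoted from it.
delta = 2.0       # execution horizon [s]
ratio = 0.8e-3    # lambda_min(Q) / lambda_max(P_inf) as computed in the text
c_b_tilde = 1.0 - delta * ratio
print(c_b_tilde)  # reproduces the quoted 0.9984
```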

Figure 3. Optimal trajectories for 10 different starting positions with a common goal at $(0, 0)$. Position coordinates, ED (left) and gradual dense–sparse (GDS) (right). Each executed short term plan (duration $\delta = 2$ s) is marked with a separate color. Note how the longer planning horizons make the GDS trajectories avoid the hills and progress faster toward the goal.


Figure 4. Calculated value function $J_T$ along calculated optimal trajectories (log scale, base 10) as a function of time, ED (left) and gradual dense–sparse (right). The value of the estimate $\tilde c_b$ for the rate of decay given in the text is indicated by a (red) dashed line.

Figure 5. State variables and control as a function of time, ED (left) and gradual dense–sparse (right).

The exponential convergence can clearly be seen in Figure 4, and it is apparent that the GDS strategy gives noticeably faster convergence, in particular in the region where the quadratic approximation of the incremental cost holds. The asymptotic rate of convergence per iteration is approximately 0.9899 for ED and approximately 0.9689 for GDS, both of which are faster than predicted by the value $\tilde c_b$ given earlier. From Figure 3, it is clear that, even though GDS may produce plans with tails that are non-optimal (due to low node density), the executed parts (i.e., the concatenation of the first phases from each iteration) of the GDS plans follow more efficient paths through the terrain than ED. Moreover, the extension of the planning horizon utilized by GDS helps the solver find solutions that are, when concatenated to form the overall trajectory, more efficient. This is manifested by a much higher utilization of control effort by GDS than by ED in the initial part of the overall trajectories (which results in a higher velocity initially).

6. CONCLUSION
We have shown that asymptotic exponential convergence of both OCPs and RHC schemes, with easily computed rates, can occur in settings where the incremental cost is locally quadratic. On the basis of this, we have proposed that, in these settings, the node distribution in RHC approaches based on the direct form OCP formulation should be adapted to the expected asymptotic exponential decay of the cost and solution. We have also devised a method for doing this, dyadic gradual dense–sparse discretization. In situations where iterative node redistribution is not an option, off-line adaptation of the node distribution to the expected form of the solution will lead to a more efficient use of computational resources. More importantly, however, we have highlighted the importance of selecting


the planning horizon length and the node distribution near the far right of the planning interval of the OCP that is repeatedly solved in an RHC scheme. In the simulations, we have shown that it can be beneficial to use an even lower node density near the right endpoint than suggested by a bound for the decay of the OCP solution, because this helps the solver find solutions that, when concatenated in an RHC scheme, form better (closer to infinite horizon OCP optimal) overall solutions. Extending the planning horizon provides a mechanism for backpropagating at least some information about conditions in the far future. Obtaining a longer horizon can therefore be more important than maintaining high accuracy in the computation of the solution near the far right, as in an RHC scheme with a short planning horizon but high node density throughout. This is intuitively reasonable, because the far-right part of the solution to an OCP that is part of an RHC scheme will anyway only be used as a starting guess for the following OCP in the next RHC iteration. The results can therefore be interpreted as suggesting that the decay bounds should, in many practical situations, be used not as maximum rates for the node density decay but as minimum rates. An important question for future investigation is thus the trade-off between the node density near the far right as dictated by accuracy concerns (and determined by decay bounds) versus as dictated by the positive effects on the overall optimality of the concatenated RHC solutions due to backpropagation of information about the far future.

ACKNOWLEDGEMENT

This research was supported by contract/grant sponsor FOI, project E20674.

REFERENCES
1. Mayne D, Rawlings J, Rao C, Scokaert P. Constrained model predictive control: stability and optimality. Automatica 2000; 36:789–814.
2. Ohtsuka T. A continuation/GMRES method for fast computation of nonlinear receding horizon control. Automatica 2004; 40:563–574.
3. Elnagar J, Kazemi M, Razzaghi M. The pseudospectral Legendre method for discretizing optimal control problems. IEEE Transactions on Automatic Control 1995; 40(10):1793–1796.
4. Gong Q, Kang W, Bedrossian N, Fahroo F, Sekhavat P, Bollino K. Pseudospectral optimal control for military and industrial applications. IEEE Conference on Decision and Control, New Orleans, LA, 2007; 4128–4142.
5. Strizzi J, Ross I, Fahroo F. Towards real-time computation of optimal controls for nonlinear systems. Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Monterey, CA, 2002, DOI: 10.2514/6.2002-4945. AIAA paper 2002-4945.
6. Benson D. A Gauss pseudospectral transcription for optimal control. Ph.D. Thesis, MIT, 2004.
7. Huntington G. Advancement and analysis of a Gauss pseudospectral transcription for optimal control problems. Ph.D. Thesis, MIT, 2007.
8. Rao A, Benson D, Huntington G, Francolin C. User's manual for GPOPS, a MATLAB package for dynamic optimization using the Gauss pseudospectral method, 2009.
9. Jain S. Multiresolution strategies for the solution of optimal control problems. Ph.D. Thesis, Georgia Institute of Technology, School of Aerospace Engineering, 2008.
10. Lukes D. Optimal regulation of nonlinear dynamical systems. SIAM Journal on Control 1969; 7(1):75–100.
11. Willemstein A. Optimal regulation of nonlinear dynamical systems on a finite interval. SIAM Journal on Control and Optimization 1977; 15(6):1050–1069.
12. van der Schaft A. Relations between H∞ optimal control and its linearization. Proceedings of the 30th Conference on Decision and Control, Brighton, UK, 1991; 1807–1808.
13. Jadbabaie A, Yu J, Hauser J. Unconstrained receding-horizon control of nonlinear systems. IEEE Transactions on Automatic Control 2001; 46(5):776–783.
14. Jadbabaie A, Hauser J. On the stability of receding horizon control with a general terminal cost. IEEE Transactions on Automatic Control 2005; 50(5):674–678.
15. Ögren P, Robinson J. Receding horizon control of UAVs using gradual dense–sparse discretizations. Proceedings of the AIAA Guidance, Navigation and Control Conference, Toronto, Canada, 2010, DOI: 10.2514/6.2010-8084. AIAA paper 2010-8084.
16. Kumar R, Seywald H. Dense–sparse discretization for optimization and real-time guidance. Journal of Guidance, Control and Dynamics 1996; 19(2):501–503.
17. Pannocchia G, Rawlings J, Mayne D, Marquardt W. On computing solutions to the continuous time constrained linear quadratic regulator. IEEE Transactions on Automatic Control 2010; 55(9):2192–2198.
18. Camacho E, Bordons C. Model Predictive Control (2nd edn). Springer-Verlag: London, 2004.
19. Byrnes C. On the Riccati differential equation for nonlinear Bolza and Lagrange problems. Journal of Mathematical Systems, Estimation and Control 1998; 8(1):1–54.


20. Vidyasagar M. On the existence of optimal controls. Journal of Optimization Theory and Applications 1975; 17(3/4):273–278.
21. Hauser J, Osinga H. On the geometry of optimal control: the inverted pendulum example. Proceedings of the IEEE American Control Conference, Vol. 2, Arlington, VA, 2001; 1721–1726.
22. Anderson B, Moore J. Linear Optimal Control. Prentice-Hall: Englewood Cliffs, NJ, 1971.
23. Galbraith G, Vinter R. Lipschitz continuity of optimal controls for state constrained problems. SIAM Journal on Control and Optimization 2003; 42(5):1727–1744.
24. Kalman R, Bertram J. Control system design via the "second method" of Lyapunov: Part II, discrete time systems. ASME Journal of Basic Engineering 1960; 82:394–400.
25. Callier F, Winkin J, Willems J. On the exponential convergence of the time-invariant matrix Riccati equation. Proceedings of the 31st IEEE Conference on Decision and Control, Vol. 2, Tucson, AZ, 1992; 1536–1537.
26. Alpert B, Beylkin G, Gines D, Vozovoi L. Adaptive solution of partial differential equations in multiwavelet bases. Journal of Computational Physics 2002; 182(1):149–190.
27. Ross I, Fahroo F. Legendre pseudospectral approximations of optimal control problems. In New Trends in Nonlinear Dynamics and Control and Their Applications, Vol. 295, Lecture Notes in Control and Information Sciences. Springer: Berlin, 2004; 327–342.
28. Kane T, Wang C. On the derivation of equations of motion. Journal of the Society for Industrial and Applied Mathematics 1965; 13(2):487–492.
