
Annals of Operations Research 109, 15–40, 2002. © 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

A Game Theoretical Multiple Resource Interaction Approach to Resource Allocation in an Air Campaign ∗

DEBASISH GHOSE [email protected]
Department of Aerospace Engineering, Indian Institute of Science, Bangalore 560 012, India

JASON L. SPEYER and JEFF S. SHAMMA {speyer, shamma}@seas.ucla.edu
Mechanical and Aerospace Engineering Department, University of California at Los Angeles, Los Angeles, CA 90095, USA

Abstract. In this paper we propose a multiple resource interaction model in a game-theoretical framework to solve resource allocation problems in theater level military campaigns. An air raid campaign using SEAD aircraft and bombers against an enemy target defended by air defense units is considered as the basic platform. Conditions for the existence of a saddle point in pure strategies are proved, and explicit feedback strategies are obtained for a simplified model with a linear attrition function limited by resource availability. An illustrative example demonstrates the key features.

Keywords: air campaign modeling, resource interaction models, resource allocation, military campaigns, applied game theory

1. Introduction

There has been considerable recent interest in automating theater level military campaign management systems, due to advancements in computation, communication, and weapon systems technology. Theater level campaign management encompasses several issues, among which those related to strategic and tactical decision-making in adversarial environments are of special importance. In this paper, we address a generic problem related to an air campaign where the diversity of operational assets, having varying degrees of effectiveness, role, and utility, plays a critical role in the decision-making process. Specifically, we address warfare as a multiple resource interaction problem, modeled in a game-theoretic framework. Here, two adversaries commit their resources to an arena, where a player’s resources inflict attrition on its adversary’s resources through a sequence of interactions defined by the spatial distribution of resources in the battlefield. As in large scale warfare, the resource types of the two players may differ significantly in their capabilities and operational roles. The payoff of the game is a function of the surviving resources of the two adversaries at intermediate and terminal time points. Players’ decisions are in terms of the resource levels to be committed by each player at intermediate time points. The proposed framework is general enough to encompass a large class of resource interaction problems in the context of military campaigns. However, this paper presents a specific application that involves an air campaign by friendly forces into enemy territory with the intention of destroying an enemy target.

∗ Partially supported by the DARPA grant N66001-99-8511.

One of the earliest related papers is by Blackwell [3] on multi-component attrition, where two players, possessing several resources each, deploy one resource each at each stage of the game, causing attrition to that resource. The game is played several times until at least one of the resources is reduced to zero. The strategy of each player is defined as a probability distribution on its resource set. In spite of the simplicity of the model, this game proves to be intractable [14]. So, Blackwell investigates the asymptotic behaviour of the optimal probability distribution on the discrete resource space (that is, optimal strategies) when the initial resource levels of the two players are infinitely large but the ratio of the players’ resource levels remains fixed. This transforms the game into an infinitely repeated game, for which sufficient conditions on the ratio of initial resource levels are obtained such that the probability of a win by one of the players asymptotically approaches one. While mathematically elegant, these results are difficult to interpret for practical warfare, mainly because of the assumption of infinitely large resource levels. It is relevant to mention that the literature on infinitely repeated non-zero sum games [16] deals with issues of cooperation emerging from infinite repetition and has no apparent application in the warfare problems addressed here.

Another paper that addresses the problem of resource allocation in an air campaign is by Berkovitz and Dresher [2]. In this paper, the two adversaries are evenly matched in terms of their resource types. This imposes a symmetry in their resource types, capabilities, and decision variables.
The solution is sought in terms of an optimal assignment of a single resource among several tasks. Specifically, both players have several aircraft that have to be assigned the separate roles of counter air, air defense, and ground support. The game is assumed to consist of a sequence of several missions, each of which consists of simultaneous counter-air, air defense, and close-support operations. The solution is sought in terms of an optimal partitioning of the aircraft resources at each mission by both adversaries. It is shown that for a large number of stages both players need to use mixed strategies.¹ Other papers addressing similar problems include Bracken and McGill [4], where a mathematical programming approach is taken to formulate the problem of aircraft sortie allocation for a two stage game. The formulation and the proposed algorithm both depend strongly on the convexity of the payoff. Related but different classes of weapon allocation problems have been covered in the book by Danskin [5]. Preliminary but recent results on modeling air campaigns as resource interaction problems have been reported in Ghose et al. [9,10] and Ghose, Speyer and Shamma [11,12].

¹ In fact, the solution given in the paper proposes a mixed strategy for only one of the players and a pure strategy for the other. Later L. Shapley, in an unpublished working note, showed that the correct solution is fairly complicated and involves mixed strategies by both players [13, p. 160].

The attrition game of Blackwell [3] and the air war game of Berkovitz and Dresher [2] have similarities with the model we propose here. However, there are some important distinctions:

(i) Our multiple resource interaction game has a continuous kernel, while in Blackwell’s multi-component attrition game the action space is discrete.
(ii) In our model the attrition matrices are not constant and are functions of the available resource levels.
(iii) The interaction dynamics in our model are defined through a sequence of interactions, whereas in the other models an instantaneous interaction is assumed. Since the air campaign problem must explicitly model the sequence in which resources interact, the multiple resource interaction model is closer to reality than the other models.
(iv) Our model differs from the tactical air war model both in terms of multiplicity of resources and in the mode of resource allocation.
(v) The payoff function in our model is more realistic and relaxes the convexity assumption made in Bracken and McGill [4].

2. Formulation of the SEAD air campaign

We consider an air campaign by friendly (BLUE) forces into enemy (RED) territory to destroy a target (TG) which is protected by several RED air defense (AD) units. The BLUE forces employ two types of resources. The first consists of SEAD (suppression of enemy air defense) units, which are flight vehicles equipped with sophisticated sensor systems that detect the presence of AD units and destroy them using antiradiation missiles [6,17]. The objective of using SEAD units is to create a safe corridor for bomber aircraft to penetrate enemy territory. Bombers (BMB) are the second type of BLUE resource and are used to destroy TGs (primarily) as well as ADs. A SEAD air campaign scenario is shown in figure 1. The shaded area shows the lethal zone of the AD units.

Figure 1. The SEAD assisted air campaign scenario.

The SEAD assisted air campaign problem has a significant spatial dimension since the locations of the TG and ADs determine the effectiveness of SEAD and BMB missions. Also, the multi-stage formulation of the game needs to take into account the physical movement of the resources (for example, the AD units can reposition themselves to guard the corridor more effectively, and the BLUE forces can re-compute the ingress corridor based upon the current risk map) [9,10]. In this paper, by assuming that the air campaign takes place on a single corridor, we subsume the spatial dimension into the attrition (or loss) functions that quantify damage suffered by resources due to interaction with the adversary’s resources. Thus, we consider only the temporal dimension of the problem in this paper and address the problem of optimal allocation of resources by the two adversaries at each stage of the air campaign modeled as a multi-stage game. A stage in a game is defined as a single sortie in which SEADs and BMBs participate.

At a stage $k$, BLUE has SEAD strength of $S_k^s$ and bomber strength of $S_k^b$, RED has air defense strength of $S_k^a$, and the target has a strength (or value) of $S_k^g$. These strengths are known to the players at the beginning of a stage. Note that these strengths may not be in terms of numbers of SEAD, BMB, or AD. Rather, they may derive from an aggregation process that models strength as the capability that each resource has in terms of the mission objectives. This is closely related to the spatial dimension of the problem, which determines the corridor of operation and which, in turn, defines the effectiveness of specific resources against the adversary’s resources through loss functions [9,10].

The solution to this problem is an optimal decision by both players on the amount of each resource to be used at each stage. Specifically, the BLUE forces have to decide on the amount (or number or strength) of SEADs and BMBs to be used in each sortie or stage and the amount to be kept in reserve for use in later stages. Similarly, the RED forces have to decide on the amount of AD strength to be used to defend the corridor at each stage and the amount to be kept in reserve. BLUE’s objective is to minimize the sum of the surviving target strength over a specified number of stages while RED’s objective is to maximize this sum. If the target is a production facility that produces material detrimental to BLUE’s interests then this objective reflects the players’ perception.

At any given stage $k$ of the game, the BLUE forces partition $S_k^s$ and $S_k^b$ as

$$S_k^s = u_k^s + r_k^s, \qquad S_k^b = u_k^b + r_k^b, \tag{1}$$

where $u_k^s \in [0, S_k^s]$ and $u_k^b \in [0, S_k^b]$ are used by BLUE at the $k$th stage and $r_k^s = S_k^s - u_k^s$ and $r_k^b = S_k^b - u_k^b$ are kept in reserve for later use. Similarly, at stage $k$, RED keeps some ADs “hidden” (or passive) and switches on (makes active) the rest to engage SEADs and BMBs. Thus, RED partitions its air defense strength as

$$S_k^a = v_k^a + r_k^a, \tag{2}$$

where $v_k^a \in [0, S_k^a]$ is the AD strength used to engage SEADs and BMBs and $r_k^a = S_k^a - v_k^a$ is the AD strength kept in reserve for later use. Thus, the BLUE decision variables at stage $k$ are $(u_k^s, u_k^b)$ and the RED decision variable is $v_k^a$.

The resource interaction sequence in a stage $k$ and the resulting attrition to resources is given next. Here $L_a^s(\cdot,\cdot)$ defines the attrition to SEADs by ADs, $L_s^a(\cdot,\cdot)$ defines the attrition to ADs by SEADs, $L_a^b(\cdot,\cdot)$ defines the attrition to BMBs by ADs, $L_b^a(\cdot,\cdot)$ defines the attrition to ADs by BMBs, and $L_b^g(\cdot,\cdot)$ defines the attrition to TGs by BMBs.

First, the SEADs fly along a designated corridor and engage ADs.

$$s_k^1 = \text{surviving SEAD strength} = \max\{0,\ u_k^s - L_a^s(v_k^a, u_k^s)\}, \tag{3}$$

$$a_k^1 = \text{surviving AD strength} = \max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\}. \tag{4}$$

Next, BMBs fly through the corridor and are engaged by ADs.

$$b_k^2 = \text{surviving BMB strength} = \max\{0,\ u_k^b - L_a^b(a_k^1, u_k^b)\} = \max\{0,\ u_k^b - L_a^b(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\}, \tag{5}$$

$$a_k^2 = \text{surviving AD strength} = \max\{0,\ a_k^1 - L_b^a(a_k^1, u_k^b)\} = \max\{0,\ \max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\} - L_b^a(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\}. \tag{6}$$

Finally, the BMBs engage the TGs at the end of the corridor.

$$g_k^3 = \text{surviving ground troop strength} = \max\{0,\ S_k^g - L_b^g(b_k^2, S_k^g)\} = \max\{0,\ S_k^g - L_b^g(\max\{0,\ u_k^b - L_a^b(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\},\ S_k^g)\}. \tag{7}$$

At the next stage $k+1$ the two players have the following resource strengths available:

$$S_{k+1}^s = r_k^s + s_k^1, \qquad S_{k+1}^b = r_k^b + b_k^2, \qquad S_{k+1}^a = r_k^a + a_k^2, \qquad S_{k+1}^g = g_k^3. \tag{8}$$

The state equations corresponding to the above sequence of interactions are as follows:

$$S_{k+1}^s = \max\{0,\ u_k^s - L_a^s(v_k^a, u_k^s)\} + S_k^s - u_k^s, \tag{9}$$

$$S_{k+1}^b = \max\{0,\ u_k^b - L_a^b(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\} + S_k^b - u_k^b, \tag{10}$$

$$S_{k+1}^a = \max\{0,\ \max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\} - L_b^a(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\} + S_k^a - v_k^a, \tag{11}$$

$$S_{k+1}^g = \max\{0,\ S_k^g - L_b^g(\max\{0,\ u_k^b - L_a^b(\max\{0,\ v_k^a - L_s^a(v_k^a, u_k^s)\},\ u_k^b)\},\ S_k^g)\}, \tag{12}$$

with controls $u_k^s \in [0, S_k^s]$, $u_k^b \in [0, S_k^b]$, $v_k^a \in [0, S_k^a]$. We may write the state equations as

$$S_{k+1} = f(S_k, u_k, v_k), \tag{13}$$

where $S_k = (S_k^s, S_k^b, S_k^a, S_k^g)$, $u_k = (u_k^s, u_k^b)$, $v_k = v_k^a$, and $f(\cdot,\cdot,\cdot)$ represents state transitions. A resource interaction tableau that summarizes the above is shown in figure 2.
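The interaction sequence (3)–(8) can be sketched as a single state-transition function. The following is only an illustrative sketch: the names `State` and `step`, the argument order, and the linear loss functions in the usage example are assumptions made here for illustration and are not taken from the paper.

```python
from typing import Callable, NamedTuple

Loss = Callable[[float, float], float]  # two-argument loss function L(., .)


class State(NamedTuple):
    s: float  # SEAD strength S_k^s
    b: float  # bomber strength S_k^b
    a: float  # air defense strength S_k^a
    g: float  # target strength S_k^g


def step(S: State, u_s: float, u_b: float, v_a: float,
         L_sa: Loss, L_as: Loss, L_ba: Loss, L_ab: Loss, L_gb: Loss) -> State:
    """One stage of the campaign. L_xy is the attrition to x inflicted by y."""
    assert 0 <= u_s <= S.s and 0 <= u_b <= S.b and 0 <= v_a <= S.a
    s1 = max(0.0, u_s - L_sa(v_a, u_s))  # (3) SEADs surviving the ADs
    a1 = max(0.0, v_a - L_as(v_a, u_s))  # (4) ADs surviving the SEADs
    b2 = max(0.0, u_b - L_ba(a1, u_b))   # (5) BMBs surviving the remaining ADs
    a2 = max(0.0, a1 - L_ab(a1, u_b))    # (6) ADs surviving the BMBs
    g3 = max(0.0, S.g - L_gb(b2, S.g))   # (7) target strength after bombing
    # (8): reserves plus survivors form the next state
    return State(S.s - u_s + s1, S.b - u_b + b2, S.a - v_a + a2, g3)


# Hypothetical linear losses (coefficients chosen only for this example).
nxt = step(
    State(10.0, 8.0, 6.0, 20.0), u_s=10.0, u_b=8.0, v_a=6.0,
    L_sa=lambda v, u: 0.5 * v,
    L_as=lambda v, u: 0.5 * u,
    L_ba=lambda a, u: 1.0 * a,
    L_ab=lambda a, u: 0.2 * u,
    L_gb=lambda b, g: 2.0 * b,
)
print(nxt)
```

Keeping the loss functions as parameters mirrors the generality of (9)–(12); any model of attrition can be substituted without changing the sequencing logic.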


Figure 2. The resource interaction tableau for stage k for SEAD assisted air campaign.

The payoff in an $n$-stage game, minimized by BLUE and maximized by RED, is the sum of surviving TG strengths and is proportional to the TGs’ cumulative damage potential:

$$J = \sum_{k=1}^{n} S_{k+1}^g. \tag{14}$$

This choice of payoff comes from purely practical considerations and satisfies the stage-additivity property [1].

3. Monotonic loss functions

We consider a simplified model for the loss functions so that the loss to a given resource is a function of the adversary’s interacting resource strength only. The function $l_y^x(\cdot) : \mathbb{R} \to \mathbb{R}$ has an interpretation similar to that of $L_y^x$ defined earlier, that is, it gives the attrition to resource $x$ inflicted by resource $y$:

$$L_a^s(v_k^a, u_k^s) = l_a^s(v_k^a), \qquad L_s^a(v_k^a, u_k^s) = l_s^a(u_k^s),$$
$$L_a^b(a_k^1, u_k^b) = l_a^b(a_k^1), \qquad L_b^a(a_k^1, u_k^b) = l_b^a(u_k^b),$$
$$L_b^g(b_k^2, S_k^g) = l_b^g(b_k^2). \tag{15}$$

Each of these loss functions is assumed to be a non-negative, continuous, and monotonically increasing² function of its argument, and to attain the value zero when its argument is zero. The resulting state equations are the same as (9)–(12) with the loss functions defined as in (15).

3.1. The single-stage game

Consider the payoff at the $k$th stage,

$$J_k(u_k, v_k) = S_{k+1}^g \tag{16}$$

with $u_k = (u_k^s, u_k^b) \in U_k = [0, S_k^s] \times [0, S_k^b]$ and $v_k = (v_k^a) \in V_k = [0, S_k^a]$. Suppose we solve the game for the $k$th stage only. That is,

$$\min_{(u_k^s, u_k^b) \in [0, S_k^s] \times [0, S_k^b]}\ \max_{v_k^a \in [0, S_k^a]}\ \max\{0,\ S_k^g - l_b^g(\max\{0,\ u_k^b - l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\})\})\}. \tag{17}$$
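The single-stage problem (17) can be explored numerically by evaluating the payoff on a grid of controls and comparing the min-max value with the max-min value. The sketch below is illustrative only: the linear loss functions with slopes `c_as`, `c_ba`, `c_gb`, the grid resolution, and the resource levels are assumptions, not values from the paper.

```python
def stage_payoff(S_g, u_s, u_b, v_a, c_as=0.5, c_ba=1.0, c_gb=2.0):
    """Surviving target strength in (17), with assumed linear losses l(x) = c * x."""
    a1 = max(0.0, v_a - c_as * u_s)   # ADs left after the SEAD strike
    b2 = max(0.0, u_b - c_ba * a1)    # BMBs left after the AD engagement
    return max(0.0, S_g - c_gb * b2)  # target strength left after bombing


def grid(hi, n):
    """n + 1 evenly spaced points on [0, hi], endpoints included."""
    return [hi * i / n for i in range(n + 1)]


def minmax_and_maxmin(S_s, S_b, S_a, S_g, n=20):
    """Brute-force upper (min-max) and lower (max-min) values on a control grid."""
    U = [(u_s, u_b) for u_s in grid(S_s, n) for u_b in grid(S_b, n)]
    V = grid(S_a, n)
    J = {(u, v): stage_payoff(S_g, u[0], u[1], v) for u in U for v in V}
    upper = min(max(J[u, v] for v in V) for u in U)  # BLUE minimizes the worst case
    lower = max(min(J[u, v] for u in U) for v in V)  # RED maximizes the worst case
    return upper, lower


upper, lower = minmax_and_maxmin(S_s=10.0, S_b=8.0, S_a=6.0, S_g=20.0)
print(upper, lower)
```

When the two numbers agree on the grid, the gridded game has a saddle point in pure strategies; for the assumed data both equal the payoff obtained when each side commits all of its resources.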

Below we state and prove a useful lemma on the monotonicity of the resource strengths.

Lemma 1. In the game defined in (17), using (9)–(12), with monotonic loss functions defined as in (15):

(i) $S_{k+1}^g$ is a monotonically decreasing function of $u_k^s$ and $u_k^b$ and a monotonically increasing function of $v_k^a$.
(ii) $S_{k+1}^a$ is a monotonically decreasing function of $u_k^s$, $u_k^b$, and $v_k^a$.
(iii) $S_{k+1}^b$ is a monotonically decreasing function of $u_k^b$ and $v_k^a$, and a monotonically increasing function of $u_k^s$.
(iv) $S_{k+1}^s$ is a monotonically decreasing function of $u_k^s$ and $v_k^a$.
(v) $S_{k+1}^b - r_k^b$ is a monotonically increasing function of $u_k^b$.

Proof. These can be easily proved using the monotonicity of the loss functions and the simple relation that if $x \ge y$, then $\max\{0, x\} \ge \max\{0, y\}$. See Ghose, Speyer, and Shamma [12] for details.

Theorem 1. A saddle point in mixed strategies exists for the $k$th stage of the game, treated as a single stage game, with the payoff given in (16).

Proof. $J_k$ is jointly continuous in $u_k$ and $v_k$ since the loss functions are continuous in their arguments, and the “max” operation over continuous functions preserves continuity. Since the control sets are closed, bounded intervals on the real line (and therefore compact), by standard results in game theory [1,8,15], the game admits a saddle point in mixed strategies.

² A function $f : \mathbb{R} \to \mathbb{R}$ is said to be monotonically increasing (monotonically decreasing) if $f(x) \ge f(y)$ ($f(x) \le f(y)$) whenever $x > y$.
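Part (i) of lemma 1 can be spot-checked numerically on a grid. This is a sanity check under assumed data, not a proof: the linear loss slopes, the resource levels, and the helper names below are hypothetical.

```python
def next_target(S_g, u_s, u_b, v_a, c_as=0.5, c_ba=1.0, c_gb=2.0):
    """S_{k+1}^g from (12) with assumed linear losses l(x) = c * x."""
    a1 = max(0.0, v_a - c_as * u_s)
    b2 = max(0.0, u_b - c_ba * a1)
    return max(0.0, S_g - c_gb * b2)


def grid(hi, n):
    return [hi * i / n for i in range(n + 1)]


def nonincreasing(xs, eps=1e-12):
    return all(y <= x + eps for x, y in zip(xs, xs[1:]))


def nondecreasing(xs, eps=1e-12):
    return all(y >= x - eps for x, y in zip(xs, xs[1:]))


def lemma1_part_i_holds(S_s, S_b, S_a, S_g, n=10):
    """Check on a grid: S_{k+1}^g falls with u_s and u_b, and rises with v_a."""
    us, ub, va = grid(S_s, n), grid(S_b, n), grid(S_a, n)
    in_us = all(nonincreasing([next_target(S_g, x, b, v) for x in us])
                for b in ub for v in va)
    in_ub = all(nonincreasing([next_target(S_g, s, x, v) for x in ub])
                for s in us for v in va)
    in_va = all(nondecreasing([next_target(S_g, s, b, x) for x in va])
                for s in us for b in ub)
    return in_us and in_ub and in_va


ok = lemma1_part_i_holds(S_s=10.0, S_b=8.0, S_a=6.0, S_g=20.0)
print(ok)
```

Monotonicity is tested in the weak sense of footnote 2, so flat stretches caused by the `max{0, ·}` clamps are accepted.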


This theorem proves the existence of a saddle point in mixed strategies for the single-stage game. However, for this game, saddle points in pure strategies also exist, and we invoke the following fundamental minimax theorem by Fan [7] to prove this in the next theorem. Note that the payoff function (16) does not satisfy the standard convexity-concavity property normally used to prove the existence of saddle points in pure strategies [1].

Theorem 2 (Fan’s minimax theorem [7]). Let $X$, $Y$ be two compact Hausdorff spaces and $f$ a real-valued function defined on $X \times Y$. Suppose that, for every $y \in Y$, $f(x, y)$ is lower semicontinuous (lsc) on $X$; and for every $x \in X$, $f(x, y)$ is upper semicontinuous (usc) on $Y$. Then the equality

$$\min_{x \in X} \max_{y \in Y} f(x, y) = \max_{y \in Y} \min_{x \in X} f(x, y)$$

holds if and only if for any two finite sets $\{x_1, x_2, \ldots, x_n\} \subset X$ and $\{y_1, y_2, \ldots, y_m\} \subset Y$, there exist $x_0 \in X$ and $y_0 \in Y$ such that $f(x_0, y_i) \le f(x_j, y_0)$ for all $1 \le j \le n$ and $1 \le i \le m$.

Theorem 3. A saddle point in pure strategies exists for the $k$th stage of the game, treated as a single stage game, with the payoff as given in (16).

Proof. As stated earlier, $J_k$ is jointly continuous with respect to $u_k$ and $v_k$ and the control sets are compact. Select

$$u_k^i = (u_k^{s,i}, u_k^{b,i}) \in U_k,\ i = 1, \ldots, p, \quad \text{and} \quad v_k^j = (v_k^{a,j}) \in V_k,\ j = 1, \ldots, q,$$

where $p$ and $q$ are arbitrary but finite integers. Define

$$\hat{u}_k = \left(\max\{u_k^{s,1}, \ldots, u_k^{s,p}\},\ \max\{u_k^{b,1}, \ldots, u_k^{b,p}\}\right) \quad \text{and} \quad \hat{v}_k = \max\{v_k^{a,1}, \ldots, v_k^{a,q}\}.$$

Then, from the monotonicity of $J_k = S_{k+1}^g$ with respect to $u_k$ and $v_k$ (lemma 1), for any $u_k^i$, $i = 1, \ldots, p$, and $v_k^j$, $j = 1, \ldots, q$, we have $J_k(\hat{u}_k, v_k^j) \le J_k(u_k^i, \hat{v}_k)$. Since this result is true for any finite sets of $u_k$’s and $v_k$’s selected from $U_k$ and $V_k$, respectively, by Fan’s minimax theorem, the game has a saddle point in pure strategies.

There could be multiple saddle points, but the interchangeability property ensures that the payoff for all saddle point strategies is the same [1,8]. The saddle point property can be verified using the fact that if $(u_k^*, v_k^*)$ is a saddle point strategy pair, then

$$J_k(u_k^*, v_k) \le J_k(u_k^*, v_k^*) \le J_k(u_k, v_k^*) \quad \text{for all } u_k \in U_k,\ v_k \in V_k, \tag{18}$$

which means that a player cannot improve its payoff by unilateral deviation. The following theorems give the exact expression for the saddle point solution of any single stage of the SEAD assisted air campaign game with monotonic loss functions.

Theorem 4. In the game described by (9)–(12) with monotonic loss functions defined in (15) and with payoff as given in (16), the saddle point strategies $(u_k^*, v_k^*)$ for the single stage game are as follows:

(i) If for every $(u_k^s, u_k^b) \in U_k$ there exists a $v_k^a \in V_k$ such that $S_{k+1}^g = S_k^g$, then

$$v_k^* = v_k^{a*} \in \{v_k^a \in V_k : l_a^b(v_k^a - l_s^a(S_k^s)) \ge S_k^b\}, \tag{19}$$

$$u_k^* = (u_k^{s*}, u_k^{b*}) \in [0, S_k^s] \times [0, S_k^b], \tag{20}$$

and the value of the game is $S_k^g$.

(ii) If for every $v_k^a \in V_k$ there exists a $(u_k^s, u_k^b) \in U_k$ such that $S_{k+1}^g = 0$, then

$$v_k^* = v_k^{a*} \in [0, S_k^a], \tag{21}$$

$$u_k^* = (u_k^{s*}, u_k^{b*}) \in \{(u_k^s, u_k^b) \in U_k : l_s^a(u_k^s) \ge S_k^a;\ l_b^g(u_k^b) \ge S_k^g\} \cup \{(u_k^s, u_k^b) \in U_k : l_s^a(u_k^s) < S_k^a;\ l_b^g(u_k^b - l_a^b(S_k^a - l_s^a(u_k^s))) \ge S_k^g\}, \tag{22}$$

and the value of the game is 0.

(iii) If neither of the conditions in (i) and (ii) above holds then

$$v_k^{a*} = S_k^a. \tag{23}$$

If $S_k^a \ge l_s^a(S_k^s)$ then

$$(u_k^{s*}, u_k^{b*}) = (S_k^s, S_k^b), \tag{24}$$

else (that is, if $S_k^a < l_s^a(S_k^s)$),

$$(u_k^{s*}, u_k^{b*}) \in [\hat{u}_k^s, S_k^s] \times \{S_k^b\}, \tag{25}$$

where $\hat{u}_k^s$ is such that $S_k^a = l_s^a(\hat{u}_k^s)$. The value of the game is $\max\{0,\ S_k^g - l_b^g(\max\{0,\ S_k^b - l_a^b(\max\{0,\ S_k^a - l_s^a(S_k^s)\})\})\}$.

Proof. (i) From (12), $S_{k+1}^g = S_k^g$ implies that for every $(u_k^s, u_k^b)$ there exists a $v_k^a$ such that

$$l_b^g(\max\{0,\ u_k^b - l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\})\}) = 0, \tag{26}$$

which, in turn, implies that

$$u_k^b \le l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\}). \tag{27}$$

Since the above must hold for all $(u_k^s, u_k^b)$, it must also be true for $(u_k^s, u_k^b) = (S_k^s, S_k^b)$, which implies that there exists a $v_k^a \in V_k$, denoted by $v_k^{a*}$, such that

$$l_a^b(v_k^{a*} - l_s^a(S_k^s)) \ge S_k^b. \tag{28}$$

To prove that this $v_k^{a*}$ and any $u_k^* \in U_k$ together form a saddle point, we need to validate (18). The left inequality of (18) is true since, by the monotonicity of loss functions, any deviation $v_k^a$ from $v_k^{a*}$ implies that $v_k^a < v_k^{a*}$. And so, by (i) of lemma 1, we have $J_k(u_k^*, v_k) \le J_k(u_k^*, v_k^*)$. The right inequality of (18) is true since, from (28) and the monotonicity of loss functions, deviation in $u_k$ will not change the value of $S_{k+1}^g$.

(ii) If for every $v_k^a \in V_k$ there exists a $(u_k^s, u_k^b) \in U_k$ such that $S_{k+1}^g = 0$, then from (12),

$$S_k^g \le l_b^g(\max\{0,\ u_k^b - l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\})\}) \tag{29}$$

must also be satisfied. This, in turn, implies that

$$u_k^b \ge l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\}), \tag{30}$$

$$l_b^g(u_k^b - l_a^b(\max\{0,\ v_k^a - l_s^a(u_k^s)\})) \ge S_k^g. \tag{31}$$

Equation (31), in turn, implies that if $v_k^a \le l_s^a(u_k^s)$, then

$$l_b^g(u_k^b) \ge S_k^g, \tag{32}$$

else, if $v_k^a > l_s^a(u_k^s)$, then

$$l_b^g(u_k^b - l_a^b(v_k^a - l_s^a(u_k^s))) \ge S_k^g. \tag{33}$$

Since this is true for all $v_k^a \in V_k$, it must hold for $v_k^a = S_k^a$. Then, there exists $(u_k^s, u_k^b) \in U_k$, denoted by $(u_k^{s*}, u_k^{b*})$, such that either

$$l_s^a(u_k^{s*}) \ge S_k^a \quad \text{and} \quad l_b^g(u_k^{b*}) \ge S_k^g, \tag{34}$$

or

$$l_s^a(u_k^{s*}) < S_k^a \quad \text{and} \quad l_b^g(u_k^{b*} - l_a^b(S_k^a - l_s^a(u_k^{s*}))) \ge S_k^g. \tag{35}$$

To prove that this $(u_k^{s*}, u_k^{b*})$ along with any $v_k^{a*} \in V_k$ together form a saddle point, we need to validate (18). To prove the left inequality of (18), we note that if $u_k^* = (u_k^{s*}, u_k^{b*})$ satisfies (35) then regardless of any admissible choice of $v_k$ we have $S_{k+1}^g = 0$. To prove the right inequality of (18), it is sufficient to show that for any $(u_k^s, u_k^b)$ that does not satisfy (22), there exists a $v_k^a \in V_k$ such that $S_{k+1}^g > 0$.

Now, if $l_s^a(u_k^s) \ge S_k^a$ and $l_b^g(u_k^b) < S_k^g$, then $S_{k+1}^g = S_k^g - l_b^g(u_k^b) > 0$.

Else, if $l_s^a(u_k^s) < S_k^a$ and $u_k^b \le l_a^b(S_k^a - l_s^a(u_k^s))$, then, by the continuity and monotonicity of $S_{k+1}^b - r_k^b$ (lemma 1), there exists a $\tilde{v}_k^a \in V_k$ for which $u_k^b = l_a^b(\tilde{v}_k^a - l_s^a(u_k^s))$, and from (12), for any $v_k^a \in [\tilde{v}_k^a, S_k^a]$, we have $S_{k+1}^g = S_k^g > 0$.

Finally, if

$$l_s^a(u_k^s) < S_k^a \quad \text{and} \quad u_k^b > l_a^b(S_k^a - l_s^a(u_k^s)), \quad \text{but} \quad l_b^g(u_k^b - l_a^b(S_k^a - l_s^a(u_k^s))) < S_k^g,$$

then by the monotonicity and continuity of $S_{k+1}^g$ with respect to $v_k^a$ (lemma 1), there exists a $\tilde{v}_k^a \in V_k$ such that $l_b^g(u_k^b - l_a^b(\tilde{v}_k^a - l_s^a(u_k^s))) = S_k^g$, and for any $v_k^a \in [\tilde{v}_k^a, S_k^a]$, we have

$$l_b^g(u_k^b - l_a^b(v_k^a - l_s^a(u_k^s))) < S_k^g$$

and so,

$$S_{k+1}^g = S_k^g - l_b^g(u_k^b - l_a^b(v_k^a - l_s^a(u_k^s))) > 0.$$

(iii) Since $S_{k+1}^g$ is monotonically increasing in $v_k^a$, the left inequality of (18) is automatically satisfied. Now, if $S_k^a \ge l_s^a(S_k^s)$ then any deviation from $u_k^*$ implies that either $u_k^s < S_k^s$, or $u_k^b < S_k^b$, or both. Since $S_{k+1}^g$ is monotonically decreasing in $u_k^s$ and $u_k^b$ (lemma 1), the right inequality is satisfied. If $S_k^a < l_s^a(S_k^s)$ then by the monotonicity and continuity of loss functions there exists a $\hat{u}_k^s$ such that $S_k^a = l_s^a(\hat{u}_k^s)$. So, any deviation from $u_k^*$ implies that either $u_k^s < \hat{u}_k^s$, or $u_k^b < S_k^b$, or both. Again, by the monotonicity of $S_{k+1}^g$, the right inequality holds.

An intuitive explanation of the above theorem follows. The condition in (i) implies that RED has sufficient ADs to destroy all BMBs before they reach the TGs, which survive without any damage. In fact, in (19) the ADs used by RED are sufficiently large in strength to ensure that the surviving ADs (after interaction with SEADs) can destroy the BMBs entirely even when BLUE uses the maximum available SEAD and BMB resources. To show that this is indeed the saddle point strategy, examine the validity of (18). The left inequality holds since if BLUE uses $(u_k^{s*}, u_k^{b*}) = (S_k^s, S_k^b)$ and RED deviates from its saddle point strategy, that is, RED uses a $v_k^a$ such that $l_a^b(v_k^a - l_s^a(S_k^s)) < S_k^b$, then some BMBs would survive and inflict damage on the TGs so that $S_{k+1}^g < S_k^g$. Similarly, the right inequality in (18) holds since the whole of the control set $U_k$ of BLUE qualifies for a saddle point strategy and so any deviation by BLUE does not change the payoff.

The condition in (ii) implies that BLUE has enough SEADs and BMBs to destroy all TGs. The optimal $u_k^*$ in (22) is a union of two sets, the first of which shows that if all ADs are destroyed by SEADs then the BMB strength used should be sufficient to destroy the TGs. Otherwise, the second set shows that, even if some ADs survive the SEADs, the BMBs that survive these surviving ADs should be sufficient to destroy the TGs. The left inequality in (18) is true since any deviation by RED will not change the payoff. Similarly, if RED uses $v_k^{a*} = S_k^a$, then any deviation by BLUE from its saddle point strategy (that is, if BLUE uses its SEAD and BMB resources such that the surviving BMBs are not sufficient to destroy all TGs) will result in a non-zero $S_{k+1}^g$.

Finally, (iii) implies that even when the players use the maximum resources available, the TGs are neither completely destroyed nor do they survive intact. To prove the saddle point property, note that any deviation by RED (that is, if $v_k^a < S_k^a$) would mean that fewer ADs survive to interact with BMBs. Consequently, more BMBs survive and so more TGs get destroyed, thus reducing the payoff. On the other hand, if BLUE deviates from the saddle point strategy, then BLUE uses fewer SEADs or fewer BMBs. In the former case, more ADs survive to interact with BMBs and so fewer BMBs survive to interact with the TGs. In the latter case, since the BMB strength used is less, the surviving BMBs are also fewer. Both cases lead to a larger surviving TG strength, thus increasing the payoff.

The above three cases (i)–(iii) exhaust all the possibilities in the single stage game. It is obvious that (i) and (ii) cannot both hold. Note that the saddle point, except in one of the cases in (iii), is not unique. However, by the interchangeability property of saddle point strategies in zero-sum games [1], each strategy pair selected from the above sets is a saddle point strategy pair and yields the same payoff, which is the value of the game. This multiplicity of saddle point strategies gives rise to a related problem of selection of saddle point strategies by the players. We will address this issue later.

The above theorem can be used to compute a saddle point strategy provided that the conditions specified can be verified easily. In the following theorem we simplify this verification process by using the monotonicity properties given in lemma 1.

Theorem 5. In the game described by (9)–(12) with monotonic loss functions as in (15) and with payoff as given in (16), the saddle point strategies are as follows:

(i) If $S_k^a \ge l_s^a(S_k^s)$ and $l_a^b(S_k^a - l_s^a(S_k^s)) \ge S_k^b$, then the saddle point strategies are as given in (19) and (20) and the saddle point value of the game is $S_k^g$.

(ii) If $l_s^a(S_k^s) \ge S_k^a$ and $l_b^g(S_k^b) \ge S_k^g$; or if $l_s^a(S_k^s) < S_k^a$ and $l_b^g(S_k^b - l_a^b(S_k^a - l_s^a(S_k^s))) \ge S_k^g$; then the saddle point strategies are as given in (21) and (22) and the saddle point value of the game is 0.

(iii) If neither of the conditions in (i) and (ii) above holds then the saddle point strategies are as given in (23)–(25) and the saddle point value of the game is $\max\{0,\ S_k^g - l_b^g(\max\{0,\ S_k^b - l_a^b(\max\{0,\ S_k^a - l_s^a(S_k^s)\})\})\}$.

Proof. For each of these cases we will show that the saddle point inequality (18) holds.

(i) Since $l_a^b(v_k^{a*} - l_s^a(S_k^s)) \ge S_k^b$, we get $v_k^{a*} - l_s^a(u_k^{s*}) \ge 0$ and $l_a^b(v_k^{a*} - l_s^a(u_k^{s*})) \ge u_k^{b*}$. Then, from (12) and (16), $J_k(u_k^*, v_k^*) = S_{k+1}^g = S_k^g$. Consider the left inequality in (18). If $v_k$ satisfies $l_a^b(v_k^a - l_s^a(u_k^{s*})) < S_k^b$ then $J_k(u_k^*, v_k) < S_k^g$. For all other values of $v_k$, $J_k(u_k^*, v_k) = S_k^g$. Now, consider the right inequality in (18). For all values of $u_k$, we get $J_k(u_k, v_k^*) = J_k(u_k^*, v_k^*) = S_k^g$. And so, the saddle point inequality is satisfied.

(ii) If $u_k^*$ satisfies $l_s^a(u_k^{s*}) \ge S_k^a$ then $l_s^a(u_k^{s*}) \ge v_k^{a*}$. Further, if $l_b^g(u_k^{b*}) \ge S_k^g$, then from (12) and (16), $J_k(u_k^*, v_k^*) = S_{k+1}^g = 0$. Otherwise, if $l_s^a(u_k^{s*}) < S_k^a$ then either $l_s^a(u_k^{s*}) \le v_k^{a*}$ or $l_s^a(u_k^{s*}) > v_k^{a*}$. In the former case,

$$l_b^g(u_k^{b*} - l_a^b(S_k^a - l_s^a(u_k^{s*}))) \ge S_k^g$$

implies

$$l_b^g(u_k^{b*} - l_a^b(v_k^{a*} - l_s^a(u_k^{s*}))) \ge S_k^g,$$

and so $J_k(u_k^*, v_k^*) = S_{k+1}^g = 0$. In the latter case,

$$l_b^g(\max\{0,\ u_k^{b*} - l_a^b(\max\{0,\ v_k^{a*} - l_s^a(u_k^{s*})\})\}) = l_b^g(u_k^{b*}),$$

and since

$$l_b^g(u_k^{b*} - l_a^b(S_k^a - l_s^a(u_k^{s*}))) \ge S_k^g,$$

it implies that $l_b^g(u_k^{b*}) \ge S_k^g$. Thus, $J_k(u_k^*, v_k^*) = S_{k+1}^g = 0$. Now, consider the left inequality in (18). This inequality holds since for all values of $v_k$ we have $J_k(u_k^*, v_k) = S_{k+1}^g = 0$. The right inequality in (18) holds since if either of the two conditions on $u_k^*$ given in (22) is violated, we get $J_k(u_k, v_k^*) = S_{k+1}^g \ge 0$. This satisfies the saddle point property.

(iii) If neither (i) nor (ii) holds then the proof is the same as (iii) in theorem 4.
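The tests in theorem 5 depend only on the current state and the loss functions, so they can be evaluated directly. The sketch below is one possible reading of those tests, not the authors’ implementation: the function names, the bisection used to find $\hat{u}_k^s$ in (25), and the linear losses in the usage lines are assumptions. The clamp `max(0, ·)` lets the two branches of condition (ii) be checked with a single expression, relying on $l(0) = 0$.

```python
from typing import Callable, Optional, Tuple

Loss = Callable[[float], float]  # assumed monotone, continuous, with l(0) = 0


def solve_u_hat(l_as: Loss, S_s: float, S_a: float, tol: float = 1e-9) -> float:
    """Bisection for u_hat in [0, S_s] with l_as(u_hat) = S_a (needs l_as(S_s) >= S_a)."""
    lo, hi = 0.0, S_s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if l_as(mid) < S_a:
            lo = mid
        else:
            hi = mid
    return hi


def single_stage_value(S_s: float, S_b: float, S_a: float, S_g: float,
                       l_as: Loss, l_ba: Loss, l_gb: Loss
                       ) -> Tuple[str, float, Optional[float]]:
    """Return (case, value, u_hat) following the tests of theorem 5."""
    ad_loss = l_as(S_s)                         # AD loss if BLUE commits all SEADs
    ads_left = max(0.0, S_a - ad_loss)          # ADs that then face the bombers
    bmbs_left = max(0.0, S_b - l_ba(ads_left))  # BMBs that reach the target
    if S_a >= ad_loss and l_ba(ads_left) >= S_b:
        return "i", S_g, None                   # RED has a winning strategy
    if l_gb(bmbs_left) >= S_g:
        return "ii", 0.0, None                  # BLUE has a winning strategy
    value = max(0.0, S_g - l_gb(bmbs_left))     # case (iii)
    u_hat = solve_u_hat(l_as, S_s, S_a) if S_a < ad_loss else None
    return "iii", value, u_hat


def linear(c: float) -> Loss:
    """Hypothetical linear loss l(x) = c * x."""
    return lambda x: c * x


l_as, l_ba, l_gb = linear(0.5), linear(1.0), linear(2.0)
print(single_stage_value(10.0, 8.0, 6.0, 20.0, l_as, l_ba, l_gb))
print(single_stage_value(2.0, 3.0, 10.0, 20.0, l_as, l_ba, l_gb))
print(single_stage_value(20.0, 15.0, 6.0, 20.0, l_as, l_ba, l_gb))
print(single_stage_value(20.0, 8.0, 6.0, 20.0, l_as, l_ba, l_gb))
```

The four calls are chosen to exercise case (iii) with full commitment, case (i), case (ii), and case (iii) with a non-unique SEAD allocation, respectively.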

If (i) in the above theorem holds then RED is said to have a winning (denoted by W) strategy, since RED can ensure destruction of all BMBs and survival of all TGs, irrespective of the resources used by BLUE. If (i) does not hold then RED has a non-winning (denoted by NW) strategy, implying that BLUE can destroy some TGs irrespective of the resources used by RED. Similarly, if (ii) holds then BLUE has a W strategy since BLUE can destroy all TGs, irrespective of the ADs used by RED. If (ii) does not hold then BLUE has an NW strategy, and some TGs will survive irrespective of the resources used by BLUE. If neither (i) nor (ii) is satisfied then both players have NW strategies. Obviously, as mentioned earlier, both players cannot have winning strategies at any stage.

3.2. The multi-stage game

In the multi-stage game termination occurs if either (i) all TGs get destroyed before the last stage $N$, or (ii) the last stage $N$ is reached. The game terminates before stage $N$ if BLUE has a winning strategy (and uses it) at any stage. Define

$$u = (u_1, \ldots, u_N), \qquad v = (v_1, \ldots, v_N), \tag{36}$$

where $u_k = (u_k^s, u_k^b)$ and $v_k = (v_k^a)$, with $u_k^s \in [0, S_k^s]$, $u_k^b \in [0, S_k^b]$, and $v_k^a \in [0, S_k^a]$, for all $k = 1, \ldots, N$. We would like to obtain saddle point strategies to achieve

$$\min_{u = (u_1, \ldots, u_N)}\ \max_{v = (v_1, \ldots, v_N)} J = \min_{u = (u_1, \ldots, u_N)}\ \max_{v = (v_1, \ldots, v_N)} \sum_{k=1}^{N} S_{k+1}^g. \tag{37}$$

To ensure the existence of a saddle point in pure strategies we need to impose some conditions on the payoff kernel at each stage. Using (13), define the payoff kernel at stage $k$ as

$$J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k)), \tag{38}$$

where $V_k(S_k)$ is the value of the game at stage $k$, obtained when players play optimally. The optimal payoff is given by

$$V_k(S_k) = \min_{u_k} \max_{v_k} \{J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k))\}. \tag{39}$$

If a saddle point exists then the solution of (39) gives the optimal strategies of the players at the $k$th stage. The optimal payoff of the multi-stage game is then given by $V_1(S_1)$.

Theorem 6. If $J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k))$ is a monotonically decreasing function of $u_k$ and a monotonically increasing function of $v_k$ for all $k$, then the multi-stage game has a saddle point in pure strategies at each stage $k$.


Proof. The proof uses arguments similar to those in theorem 3, which depend only on the monotonicity property of the objective function (which, in this case, is replaced by $J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k))$), to invoke Fan’s theorem.

The condition in theorem 6 is sufficient for the existence of saddle points and is quite restrictive. However, it is not a necessary condition since Fan’s theorem does not require monotonicity to be satisfied. In fact, we can relax the above condition further as follows.

Theorem 7. If $J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k))$ satisfies the property that

$$J_k(S_k, (S_k^s, S_k^b), v_k^a) + V_{k+1}(f(S_k, (S_k^s, S_k^b), v_k^a)) \le J_k(S_k, (u_k^s, u_k^b), v_k^a) + V_{k+1}(f(S_k, (u_k^s, u_k^b), v_k^a)) \quad \text{for all } u_k \text{ and a fixed } v_k,$$

$$J_k(S_k, (u_k^s, u_k^b), S_k^a) + V_{k+1}(f(S_k, (u_k^s, u_k^b), S_k^a)) \ge J_k(S_k, (u_k^s, u_k^b), v_k^a) + V_{k+1}(f(S_k, (u_k^s, u_k^b), v_k^a)) \quad \text{for all } v_k \text{ and a fixed } u_k,$$

then the multi-stage game has a saddle point in pure strategies at each stage $k$.

Proof. $J_k(S_k, u_k, v_k) + V_{k+1}(f(S_k, u_k, v_k))$ is continuous in $u_k$ and $v_k$, and the given conditions in the theorem ensure that for every finite set of $u_k$’s and $v_k$’s, we can define $\hat{u}_k = (S_k^s, S_k^b)$ and $\hat{v}_k = (S_k^a)$ such that

$$J_k(S_k, \hat{u}_k, v_k) + V_{k+1}(f(S_k, \hat{u}_k, v_k)) \le J_k(S_k, u_k, \hat{v}_k) + V_{k+1}(f(S_k, u_k, \hat{v}_k)).$$

Thus the conditions in Fan’s theorem are satisfied. Hence the proof.
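For a small number of stages, the recursion (39) can be evaluated by brute force on a grid of controls. The sketch below is a minimal illustration under stated assumptions: the linear loss slopes, the grid resolution, and the function names are hypothetical, and the function returns the min-max (upper) value on the grid, which coincides with the value of the game only when a pure-strategy saddle point exists.

```python
# Hypothetical linear loss slopes, chosen only for illustration.
C_SA, C_AS, C_BA, C_AB, C_GB = 0.3, 0.5, 1.0, 0.2, 2.0


def step(S, u_s, u_b, v_a):
    """State transition (9)-(12) with losses as in (15), assumed linear here."""
    S_s, S_b, S_a, S_g = S
    s1 = max(0.0, u_s - C_SA * v_a)  # SEADs surviving the ADs
    a1 = max(0.0, v_a - C_AS * u_s)  # ADs surviving the SEADs
    b2 = max(0.0, u_b - C_BA * a1)   # BMBs surviving the remaining ADs
    a2 = max(0.0, a1 - C_AB * u_b)   # ADs surviving the BMBs
    g3 = max(0.0, S_g - C_GB * b2)   # target strength after bombing
    return (S_s - u_s + s1, S_b - u_b + b2, S_a - v_a + a2, g3)


def grid(hi, n):
    """n + 1 evenly spaced points on [0, hi], endpoints included."""
    return [hi * i / n for i in range(n + 1)]


def upper_value(S, stages_left, n=4):
    """Grid approximation of min_u max_v { S_{k+1}^g + V_{k+1}(S_{k+1}) }."""
    if stages_left == 0 or S[3] == 0.0:
        return 0.0  # horizon reached, or all TGs already destroyed
    best = float("inf")
    for u_s in grid(S[0], n):
        for u_b in grid(S[1], n):
            worst = max(
                nxt[3] + upper_value(nxt, stages_left - 1, n)
                for nxt in (step(S, u_s, u_b, v_a) for v_a in grid(S[2], n))
            )
            best = min(best, worst)
    return best


S1 = (10.0, 8.0, 6.0, 20.0)
V_one = upper_value(S1, 1)
V_two = upper_value(S1, 2)
print(V_one, V_two)
```

The cost grows exponentially with the number of stages, so this approach is practical only for short horizons, which is consistent with the remark that the conditions of theorems 6 and 7 may be verified computationally for games with few stages.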

Theorem 8. If the conditions given in theorems 6 or 7 hold then a pure strategy saddle point exists for the multi-stage game (37), having state equations (9)–(12) with monotonic loss functions as in (15), and is given by the pure stationary saddle point strategies of the single-stage game at each stage k.

Proof. Suppose at stage k both players have only NW (non-winning) solutions (that is, condition (iii) in theorem 4 holds). Any deviation by RED would imply that $v_k^a < S_k^a$ which, by theorems 6 or 7, would imply a lower value of the cumulative TG strength at the end of the multi-stage game. If $l_s^a(S_k^s) \le S_k^a$ and BLUE deviates from the single-stage saddle point strategy, then either $u_k^s < S_k^s$ or $u_k^b < S_k^b$ or both. But, if $l_s^a(S_k^s) > S_k^a$, then a deviation by BLUE from its single-stage saddle point strategy would mean that $u_k^s < \hat{u}_k^s$ (where $l_s^a(\hat{u}_k^s) = S_k^a$) or $u_k^b < S_k^b$ or both. In both cases, by theorems 6 or 7, a higher value of the cumulative TG strength accrues at the end of the multi-stage game.

Now, at a stage k, if BLUE has a W (winning) solution and deviates from it (that is, uses a NW solution), then the payoff (that is, the surviving TG strength) in that stage is non-zero. In subsequent stages the surviving TG strength is either positive or zero and so the deviation by BLUE increases the cumulative TG strength. Similarly, if RED has a W solution and deviates from it (and uses a NW solution), then the surviving TG strength satisfies $S_{k+1}^g < S_k^g$. In subsequent stages the surviving TG strength may decrease further or remain the same. In any case, the deviation by RED decreases the cumulative TG strength. This proves that the single-stage saddle point solution is a stationary saddle point for the multi-stage game if the conditions in theorems 6 or 7 hold.

The monotonicity conditions in theorem 6, and even those in theorem 7, are stringent and are not easy to verify for a large number of stages. However, for games with fewer stages it might be possible to verify them computationally. If these conditions do not hold then the optimal strategies of the players are likely to be non-stationary and/or mixed. Later we will obtain optimal stationary pure strategies for an example where the specified conditions are indeed met. We will also give an example where these conditions are not met and no saddle point in pure strategies exists for the multi-stage game.

4. Linear loss functions

Here, we consider monotonic loss functions that are "linear" in the sense that the loss to a player's resource is proportional to the adversary's interacting resource strength, but within the bounds of resource availability. Let

$$l_s^a(u_k^s) = \beta u_k^s, \quad l_a^s(v_k^a) = \alpha v_k^a, \quad l_b^a(u_k^b) = \eta u_k^b, \quad l_b^g(b_k^2) = \theta b_k^2, \quad l_a^b(a_k^1) = \gamma a_k^1, \tag{40}$$

where α, β, γ, η, and θ are non-negative scalars. Thus, α units of SEAD strength are destroyed by one unit of AD strength. The other attrition parameters have a similar interpretation. Since these functions are also monotonic, all the previous results hold.

4.1. Optimal strategies

For linear loss functions, the optimal strategies for the players at each stage can be obtained from theorems 4 and 5 as follows:

– If $(S_k^a) \in M$, then $v_k^{a*} = S_k^a$.

– If $(S_k^a) \notin M$, then

$$v_k^{a*} \in [0, S_k^a] \setminus M, \tag{41}$$

where "\" denotes the set difference operation.

– If $(S_k^s, S_k^b) \in N$, then

$$u_k^{b*} = S_k^b, \tag{42}$$
$$u_k^{s*} = S_k^s \quad \text{if } S_k^a/\beta > S_k^s, \tag{43}$$
$$u_k^{s*} \in [S_k^a/\beta,\, S_k^s] \quad \text{if } S_k^a/\beta \le S_k^s. \tag{44}$$

– If $(S_k^s, S_k^b) \notin N$, then

$$(u_k^{s*}, u_k^{b*}) \in \{[0, S_k^s] \times [0, S_k^b]\} \setminus N. \tag{45}$$


Figure 3. Optimal resource allocation and the Pareto minimum set.

In the above, the sets $N = N_1 \cup N_2$ and M are defined as

$$N_1 = \{(x_{us}, x_{ub}) : x_{us} \ge S_k^a/\beta;\; x_{ub} < S_k^g/\theta\}, \tag{46}$$
$$N_2 = \{(x_{us}, x_{ub}) : x_{us} < S_k^a/\beta;\; x_{ub} < \gamma(S_k^a - \beta x_{us}) + S_k^g/\theta\}, \tag{47}$$
$$M = \{y_{va} : y_{va} < S_k^b/\gamma + \beta S_k^s\}, \tag{48}$$

where $x_{us}$, $x_{ub}$, and $y_{va}$ correspond to the SEAD, BMB, and AD strengths, respectively. The sets M and N are such that BLUE will not be able to destroy all TGs if it confines its allocation to N. Similarly, RED cannot ensure survival of all its TGs if it confines its allocation to M.

The optimal allocations for BLUE are in the shaded region shown in figure 3 if the available resource strength $(S_k^s, S_k^b)$ does not lie inside N. Any allocation in the shaded region is an optimal winning solution and will destroy all TGs. Otherwise, if $(S_k^s, S_k^b)$ lies inside N, then (i) if this point lies on the left of the line $x_{us} = S_k^a/\beta$ then the optimal allocation is $(S_k^s, S_k^b)$, (ii) if it lies on the right side of this line then the optimal allocation is $u_k^{s*} \in [S_k^a/\beta, S_k^s]$ and $u_k^{b*} = S_k^b$. These are the non-winning solutions.

Similarly, for RED, if $S_k^a$ lies in the interior of M, then $S_k^a$ is the optimal non-winning solution. Otherwise, the optimal allocation would be any point in $[S_k^b/\gamma + \beta S_k^s,\, S_k^a]$ and is a winning solution. Each such winning allocation would destroy all BMBs and no damage would be inflicted on the TGs.

Although, depending on the available resource levels, the game admits multiple saddle points in pure strategies, it is logical for the players to avoid using excessive resources. This implies that RED will use

$$v_k^{a*} = \min\{S_k^b/\gamma + \beta S_k^s,\; S_k^a\} \tag{49}$$

and BLUE will select a Pareto minimum point from its solution set given in (45). The Pareto minimum set is shown in figure 3 as the bold line when the available resources are not in the interior of N. When the resource level is in the interior of N then (i) if $S_k^s \le S_k^a/\beta$, then $u_k^* = (S_k^s, S_k^b)$, (ii) if $S_k^s > S_k^a/\beta$, then $u_k^* = (S_k^a/\beta, S_k^b)$.
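The set definitions (46)–(48) and RED's resource-saving choice (49) translate directly into a few lines of code. The following is a minimal sketch rather than the authors' implementation: the function names `in_N`, `in_M` and `red_allocation` are hypothetical, γ and θ are assumed to be strictly positive, and the test for N is written as β·x ≥ S^a instead of x ≥ S^a/β only to avoid dividing by β. The numbers are the stage 1 and stage 2 states of the example in section 5.

```python
def in_N(x_us, x_ub, S_a, S_g, beta, gamma, theta):
    """True if the BLUE allocation (x_us, x_ub) lies in N = N1 U N2, eqs. (46)-(47)."""
    if beta * x_us >= S_a:
        # N1: the SEADs can suppress every AD unit
        return x_ub < S_g / theta
    # N2: some ADs survive the SEADs and attrit the BMBs
    return x_ub < gamma * (S_a - beta * x_us) + S_g / theta


def in_M(y_va, S_s, S_b, beta, gamma):
    """True if the RED allocation y_va lies in M, eq. (48)."""
    return y_va < S_b / gamma + beta * S_s


def red_allocation(S_s, S_b, S_a, beta, gamma):
    """RED's saddle-point allocation that avoids excess resources, eq. (49)."""
    return min(S_b / gamma + beta * S_s, S_a)


beta, gamma, theta = 1.0, 0.5, 2.0   # attrition coefficients from section 5

states = {
    "stage 1": dict(S_s=100, S_b=150, S_a=180, S_g=300),
    "stage 2": dict(S_s=10, S_b=110, S_a=5, S_g=80),
}
for name, S in states.items():
    blue_nw = in_N(S["S_s"], S["S_b"], S["S_a"], S["S_g"], beta, gamma, theta)
    red_nw = in_M(S["S_a"], S["S_s"], S["S_b"], beta, gamma)
    v_star = red_allocation(S["S_s"], S["S_b"], S["S_a"], beta, gamma)
    print(name, "BLUE non-winning:", blue_nw, "| RED non-winning:", red_nw, "| v*:", v_star)
```

With these inputs BLUE's full resource vector lies inside N in stage 1 and outside N in stage 2, which agrees with the NW and W classifications used in the example.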


4.2. Surviving resources

Selection of an optimal strategy by RED, when there are multiple saddle points, is straightforward. But, for BLUE, any one among the Pareto minimum set of strategies is optimal. Selection of a strategy from among them may have to take into account extraneous factors based on the relative value of the SEAD and BMB aircraft. However, even before considering these factors, it is possible to eliminate some of the options.

Suppose, at stage k, BLUE has a winning strategy (that is, $(S_k^s, S_k^b) \notin N$). Then, from the previous section, $v_k^{a*} = S_k^a$. The surviving SEADs and BMBs are given by

$$S_{k+1}^s = \max\{0,\; u_k^{s*} - \alpha S_k^a\} + S_k^s - u_k^{s*}, \tag{50}$$
$$S_{k+1}^b = \max\{0,\; u_k^{b*} - \gamma \max\{0,\; S_k^a - \beta u_k^{s*}\}\} + S_k^b - u_k^{b*}. \tag{51}$$

The Pareto minimum allocation by the BLUE forces should satisfy

$$u_k^{b*} = \gamma(S_k^a - \beta u_k^{s*}) + S_k^g/\theta \tag{52}$$

with $(u_k^{s*}, u_k^{b*}) \in \{[0, S_k^s] \times [0, S_k^b]\} \setminus N$. In fact, the above expression can be refined further, depending on the position of the point $(S_k^s, S_k^b)$. The four possibilities are shown in figure 4. The ranges of optimal allocations for these four cases are as follows:

Case (i): $u_k^{s*} \in [0,\, S_k^s]$; $u_k^{b*} \in [\gamma(S_k^a - \beta S_k^s) + S_k^g/\theta,\; \gamma S_k^a + S_k^g/\theta]$,

Case (ii): $u_k^{s*} \in [S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta),\; S_k^a/\beta]$; $u_k^{b*} \in [S_k^g/\theta,\; S_k^b]$,

Case (iii): $u_k^{s*} \in [0,\, S_k^a/\beta]$; $u_k^{b*} \in [S_k^g/\theta,\; \gamma S_k^a + S_k^g/\theta]$,

Case (iv): $u_k^{s*} \in [S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta),\; S_k^s]$; $u_k^{b*} \in [\gamma(S_k^a - \beta S_k^s) + S_k^g/\theta,\; S_k^b]$.

Figure 4. The positions of the point (Sks , Skb ) in the BLUE resource space.


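These can be combined to yield:

$$u_k^{s*} \in \bigl[\max\{0,\; S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta)\},\; \min\{S_k^s,\; S_k^a/\beta\}\bigr], \tag{53}$$
$$u_k^{b*} \in \bigl[\max\{S_k^g/\theta,\; \gamma(S_k^a - \beta S_k^s) + S_k^g/\theta\},\; \min\{S_k^b,\; \gamma S_k^a + S_k^g/\theta\}\bigr]. \tag{54}$$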
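The ranges (53)–(54) and the survivor formulas (50)–(52) are easy to evaluate numerically. The sketch below is illustrative only: the function names `pareto_ranges` and `survivors` are hypothetical, it assumes β, γ and θ are strictly positive, that BLUE has a winning strategy, and that RED commits $v_k^{a*} = S_k^a$ as stated above. The usage lines take the stage 2 state of the example in section 5.

```python
def pareto_ranges(S_s, S_b, S_a, S_g, beta, gamma, theta):
    """Ranges of u_s* and u_b* on the Pareto minimum set, eqs. (53)-(54)."""
    us_lo = max(0.0, S_a / beta - (S_b - S_g / theta) / (gamma * beta))
    us_hi = min(S_s, S_a / beta)
    ub_lo = max(S_g / theta, gamma * (S_a - beta * S_s) + S_g / theta)
    ub_hi = min(S_b, gamma * S_a + S_g / theta)
    return (us_lo, us_hi), (ub_lo, ub_hi)


def survivors(u_s, S_s, S_b, S_a, S_g, alpha, beta, gamma, theta):
    """Surviving (SEAD, BMB) strengths when BLUE plays the Pareto point with SEAD level u_s."""
    u_b = gamma * (S_a - beta * u_s) + S_g / theta                       # eq. (52)
    next_s = max(0.0, u_s - alpha * S_a) + S_s - u_s                     # eq. (50)
    next_b = max(0.0, u_b - gamma * max(0.0, S_a - beta * u_s)) + S_b - u_b  # eq. (51)
    return next_s, next_b


# Stage 2 of the example in section 5: (SEAD, BMB, AD, TG) = (10, 110, 5, 80)
alpha, beta, gamma, theta = 0.5, 1.0, 0.5, 2.0
state = dict(S_s=10.0, S_b=110.0, S_a=5.0, S_g=80.0)

us_range, ub_range = pareto_ranges(beta=beta, gamma=gamma, theta=theta, **state)
print("u_s* range:", us_range, "u_b* range:", ub_range)
for u_s in (5.0, 2.5, 0.0):
    print("u_s* =", u_s, "->", survivors(u_s, alpha=alpha, beta=beta,
                                         gamma=gamma, theta=theta, **state))
```

For this state the ranges are [0, 5] for SEADs and [40, 42.5] for BMBs, and the three SEAD levels give survivors (7.5, 110), (7.5, 108.75) and (10, 107.5), the values quoted for points A, B and C in section 5.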

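Below, we analyze each of the above cases separately.

Case (i): From (50)–(52), we get

$$S_{k+1}^b = S_k^b - \gamma S_k^a + \gamma\beta u_k^{s*} \quad \text{with } u_k^{s*} \in [0,\, S_k^s]. \tag{55}$$

If $S_k^s \le \alpha S_k^a$ then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{for } u_k^{s*} \in [0,\, S_k^s]. \tag{56}$$

If $S_k^s > \alpha S_k^a$ then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{if } u_k^{s*} \in [0,\, \alpha S_k^a], \tag{57}$$
$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{if } u_k^{s*} \in [\alpha S_k^a,\, S_k^s]. \tag{58}$$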
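The bomber expression in (55) follows from (51) and (52) in two short steps, spelled out here as a worked derivation. The only fact used beyond those equations is that $u_k^{s*} \le S_k^a/\beta$ on the Pareto minimum set, which is the upper bound in (53); the same argument applies to the bomber expressions in the other three cases.

```latex
% Step 1: on the Pareto minimum set, \beta u_k^{s*} \le S_k^a, so the inner max in (51) is inactive,
% and by (52) the bombers that get through number exactly S_k^g/\theta \ge 0:
u_k^{b*} - \gamma \max\{0,\, S_k^a - \beta u_k^{s*}\}
   = u_k^{b*} - \gamma (S_k^a - \beta u_k^{s*})
   = S_k^g/\theta .

% Step 2: substitute into (51) and eliminate u_k^{b*} using (52):
S_{k+1}^b = S_k^g/\theta + S_k^b - u_k^{b*}
          = S_k^g/\theta + S_k^b - \gamma (S_k^a - \beta u_k^{s*}) - S_k^g/\theta
          = S_k^b - \gamma S_k^a + \gamma\beta\, u_k^{s*} .
```

Since the result is increasing in $u_k^{s*}$, committing more SEADs along the Pareto minimum set always saves bombers; the trade-off discussed below arises only through the SEAD survivors in (50).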

The surviving SEADs and BMBs for this case are shown in figure 5. The last two plots show these for all possible optimal allocations. When $S_k^s \le \alpha S_k^a$, a judicious choice of SEAD and BMB will depend on the relative value associated with these assets, evaluated for all ranges of $u_k^{s*} \in [0, S_k^s]$. When $S_k^s > \alpha S_k^a$, a similar evaluation is necessary but for the case when $u_k^{s*} = S_k^s$ and $u_k^{s*} \in [0, \alpha S_k^a)$. The case $u_k^{s*} = \alpha S_k^a$ need not be considered since $u_k^{s*} = S_k^s$ yields an improvement in the number of surviving BMBs while keeping the number of surviving SEADs the same.

Figure 5. Surviving SEAD and BMB resources: case (i).

Case (ii): From (50)–(52), we get

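$$S_{k+1}^b = S_k^b - \gamma S_k^a + \gamma\beta u_k^{s*} \quad \text{with } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \frac{S_k^a}{\beta}\Bigr]. \tag{59}$$

If $\alpha S_k^a \le S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta)$ then

$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{for } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \frac{S_k^a}{\beta}\Bigr]. \tag{60}$$

If $S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta) < \alpha S_k^a \le S_k^a/\beta$ then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{if } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \alpha S_k^a\Bigr], \tag{61}$$
$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{if } u_k^{s*} \in \Bigl[\alpha S_k^a,\; \frac{S_k^a}{\beta}\Bigr]. \tag{62}$$

If $S_k^a/\beta < \alpha S_k^a$ (that is, $\alpha\beta > 1$) then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{for } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \frac{S_k^a}{\beta}\Bigr]. \tag{63}$$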

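The surviving SEADs and BMBs for this case are shown in figure 6. The last three plots show these for all possible optimal allocations. These plots show that when $\alpha S_k^a \le S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta)$, the only optimal choice is $u_k^{s*} = S_k^a/\beta$. When $S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta) < \alpha S_k^a \le S_k^a/\beta$, the optimal choice should be based on an evaluation between

$$u_k^{s*} = \frac{S_k^a}{\beta} \quad \text{and} \quad u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \alpha S_k^a\Bigr).$$

When $\alpha\beta > 1$, an optimal choice of SEAD and BMB needs to be evaluated for all ranges of $u_k^{s*} \in [S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta),\; S_k^a/\beta]$.

Case (iii): From (50)–(52), we get

$$S_{k+1}^b = S_k^b - \gamma S_k^a + \gamma\beta u_k^{s*} \quad \text{with } u_k^{s*} \in [0,\, S_k^a/\beta]. \tag{64}$$

If $S_k^a/\beta \le \alpha S_k^a$ (that is, $\alpha\beta \ge 1$) then, since every admissible $u_k^{s*}$ satisfies $u_k^{s*} \le \alpha S_k^a$, (50) gives

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{for } u_k^{s*} \in [0,\, S_k^a/\beta]. \tag{65}$$

If $\alpha S_k^a < S_k^a/\beta$ (that is, $\alpha\beta < 1$) then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{if } u_k^{s*} \in [0,\, \alpha S_k^a], \tag{66}$$
$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{if } u_k^{s*} \in [\alpha S_k^a,\, S_k^a/\beta]. \tag{67}$$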

The surviving SEADs and BMBs for this case are shown in figure 7. These plots have an interpretation similar to figure 5. When $\alpha\beta \ge 1$, an optimal choice of SEAD and BMB needs to be evaluated for all ranges of $u_k^{s*} \in [0, S_k^a/\beta]$. When $\alpha\beta < 1$, a similar choice is necessary between the case when $u_k^{s*} = S_k^a/\beta$ and $u_k^{s*} \in [0, \alpha S_k^a)$.

Figure 6. Surviving SEAD and BMB resources: case (ii).

Case (iv): From (50)–(52), we get

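$$S_{k+1}^b = S_k^b - \gamma S_k^a + \gamma\beta u_k^{s*} \quad \text{with } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; S_k^s\Bigr]. \tag{68}$$

If $\alpha S_k^a \le S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta)$ then

$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{for } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; S_k^s\Bigr]. \tag{69}$$

If $S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta) < \alpha S_k^a \le S_k^s$ then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{if } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; \alpha S_k^a\Bigr], \tag{70}$$
$$S_{k+1}^s = S_k^s - \alpha S_k^a \quad \text{if } u_k^{s*} \in [\alpha S_k^a,\, S_k^s]. \tag{71}$$

If $S_k^s < \alpha S_k^a$ then

$$S_{k+1}^s = S_k^s - u_k^{s*} \quad \text{for } u_k^{s*} \in \Bigl[\frac{S_k^a}{\beta} - \frac{1}{\gamma\beta}\Bigl(S_k^b - \frac{S_k^g}{\theta}\Bigr),\; S_k^s\Bigr]. \tag{72}$$

Figure 7. Surviving SEAD and BMB resources: case (iii).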

The surviving SEADs and BMBs for this case are shown in figure 8. These plots have an interpretation similar to figure 6. When $\alpha S_k^a \le S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta)$, the only optimal choice is $u_k^{s*} = S_k^s$. When $S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta) < \alpha S_k^a \le S_k^s$, the optimal choice should be based upon an evaluation between $u_k^{s*} = S_k^s$ and $u_k^{s*} \in [S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta),\; \alpha S_k^a)$. When $S_k^s < \alpha S_k^a$, an optimal choice of SEAD and BMB needs to be evaluated for all ranges of $u_k^{s*} \in [S_k^a/\beta - (1/\gamma\beta)(S_k^b - S_k^g/\theta),\; S_k^s]$.

5. Examples

To illustrate the main results of this paper, we present an example. Let the initial resource strengths be $S_1^s = 100$ (SEAD), $S_1^b = 150$ (BMB), $S_1^a = 180$ (AD), $S_1^g = 300$ (TG). Let the attrition coefficients be α = 0.5, β = 1, γ = 0.5, η = 0.5, θ = 2. The sets N and M for the stage k = 1, obtained from (46)–(48), are shown in figure 9. In stage k = 1 both the optimal solutions are of type NW. Assuming that the condition in theorem 7 holds (we will discuss this later), the optimal allocations should be

$$v_1^{a*} = 180, \quad u_1^{s*} = 100, \quad u_1^{b*} = 150.$$

With this allocation the surviving resources at the end of stage 1 are

$$S_2^s = 10, \quad S_2^b = 110, \quad S_2^a = 5, \quad S_2^g = 80.$$

Figure 8. Surviving SEAD and BMB resources: case (iv).

Figure 9. Optimal SEAD and BMB allocation and the Pareto minimum set for BLUE in stages 1 and 2.

In stage k = 2, the sets N and M are shown in figure 9. BLUE now has W solutions while RED has a NW solution. Further, BLUE has multiple Pareto solutions, any of which could be used to destroy the TGs completely. BLUE thus has the option of a trade-off between its SEADs and BMBs. These Pareto solutions may be parameterized as

$$(u_2^{s*}, u_2^{b*}) = \rho\,(0, 42.5) + (1 - \rho)\,(5, 40)$$

with respect to a parameter ρ ∈ [0, 1]. The optimal solution for RED is to use $v_2^{a*} = 5$. These solutions are also shown in figure 9. The game thus ends after 2 stages with a total payoff of 80.

This game does not satisfy the monotonicity condition as stated in theorem 6, but it does satisfy the conditions in theorem 7. That is,

$$J_1(u_1^s, u_1^b, v_1^a) + J_2(S_2^s, S_2^b, S_2^a) = J_1(u_1^s, u_1^b, v_1^a) + J_2\bigl(\max\{0,\, u_1^s - \alpha v_1^a\} + S_1^s - u_1^s,\;\; \max\{0,\, u_1^b - \gamma\max\{0,\, v_1^a - \beta u_1^s\}\} + S_1^b - u_1^b,\;\; \max\{0,\, \max\{0,\, v_1^a - \beta u_1^s\} - \eta u_1^b\} + S_1^a - v_1^a\bigr)$$

satisfies the properties that it attains (i) its minimum with respect to $(u_1^s, u_1^b)$ at $u_1^s = S_1^s$ and $u_1^b = S_1^b$ while $v_1^a$ is held constant and (ii) its maximum with respect to $v_1^a$ at $v_1^a = S_1^a$, while $u_1^s$ and $u_1^b$ are held constant. These can be computationally verified with little effort.

BLUE may select from multiple Pareto solutions in stage 2 on the basis of surviving resources. When BLUE uses an optimal solution, all RED resources are destroyed at the end of stage 2. The available BLUE resources at the beginning of stage 2 are shown in figure 9 and match case (iii) in figure 4. The surviving SEADs and BMBs are similar to those in figure 7 for αβ < 1, and are shown in figure 10, where the points A, B, and C correspond to those shown in figure 9. Obviously, A is a better choice than B, and so BLUE may consider a trade-off between A (for which 7.5 SEADs and 110 BMBs survive) and the choices in (B, C] (for which the surviving SEAD and BMB strengths are in ((7.5, 108.75), (10, 107.5)]), depending on the relative values assigned to them.
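One way to carry out such a check numerically is sketched below. It is an illustration under stated assumptions, not the authors' procedure: the names `step` and `two_stage_payoff` are hypothetical, the state update is assembled from the expressions quoted above together with the TG update $S_{k+1}^g = \max\{0, S_k^g - \theta b\}$ (b being the deployed BMBs that survive the ADs), stage 2 is evaluated with both players committing everything they have left (assumed payoff-equivalent to optimal single-stage play), and the 5-unit grid spacing is arbitrary. The check tests the two saddle-point inequalities only at $\hat{u}_1 = (S_1^s, S_1^b)$ and $\hat{v}_1 = S_1^a$.

```python
ALPHA, BETA, GAMMA, ETA, THETA = 0.5, 1.0, 0.5, 0.5, 2.0
S1 = (100.0, 150.0, 180.0, 300.0)   # (SEAD, BMB, AD, TG) at the start of stage 1


def step(state, u_s, u_b, v_a):
    """Apply one stage of linear attrition and return the next state."""
    S_s, S_b, S_a, S_g = state
    ad_left = max(0.0, v_a - BETA * u_s)         # deployed ADs left after the SEAD attack
    bmb_left = max(0.0, u_b - GAMMA * ad_left)   # deployed BMBs left after the AD attack
    next_s = max(0.0, u_s - ALPHA * v_a) + S_s - u_s
    next_b = bmb_left + S_b - u_b
    next_a = max(0.0, ad_left - ETA * u_b) + S_a - v_a
    next_g = max(0.0, S_g - THETA * bmb_left)
    return next_s, next_b, next_a, next_g


def two_stage_payoff(u_s, u_b, v_a):
    """J1 + J2, with everything committed in stage 2 (an assumption, see lead-in)."""
    s2 = step(S1, u_s, u_b, v_a)
    s3 = step(s2, s2[0], s2[1], s2[2])
    return s2[3] + s3[3]


def grid(top, spacing=5.0):
    """Evenly spaced allocation levels 0, spacing, ..., top."""
    return [spacing * i for i in range(int(top // spacing) + 1)]


S2 = step(S1, 100.0, 150.0, 180.0)
value = two_stage_payoff(100.0, 150.0, 180.0)

# RED cannot raise the payoff by deviating while BLUE commits everything ...
red_cannot_gain = all(two_stage_payoff(100.0, 150.0, v) <= value for v in grid(180))
# ... and BLUE cannot lower it by deviating while RED commits everything.
blue_cannot_gain = all(two_stage_payoff(us, ub, 180.0) >= value
                       for us in grid(100) for ub in grid(150))
print(S2, value, red_cannot_gain, blue_cannot_gain)
```

The first two printed items reproduce the stage 2 state (10, 110, 5, 80) and the total payoff of 80 reported above. A finer grid, or a check at other fixed values of the opponent's allocation, only requires changing the arguments of `grid` and the fixed allocations.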
For comparison with a few other possible strategies, let $\nu_k$ denote the fraction of its resources (both SEAD and BMB) that BLUE deploys at the kth stage. Let $\nu_1$ take values between 0 and 1 while $\nu_k = 1$ for k ≥ 2. Note that $\nu_1 = 1$ corresponds to the optimal strategy. The results are given in table 1. The variable $S_k^g$ denotes the surviving TG strength after the stage k − 1.

Figure 10. Surviving SEADs and BMBs after stage 2.

Table 1. Payoffs for various strategies.

ν1      Payoff    S_2^g    S_3^g    S_4^g
1       80        80       0        0
0.9     120       120      0        0
0.75    188.75    180      8.75     0
0.5     432.5     280      152.5    0

Finally, we present an example in which the conditions of theorems 6 or 7 do not hold and the game does not have a pure strategy saddle point or a stationary strategy. In this game the SEADs do not have a role to play. Thus, α = 0 and β = 0. The other parameters are γ = 5, η = 1, and θ = 1. This corresponds to a scenario where only BMBs interact with ADs. Also, each unit of AD can destroy 5 units of BMBs and each unit of BMBs can destroy one unit of AD and one unit of TG. The initial conditions are $S_1^b = 30$, $S_1^a = 5$ and $S_1^g = 100$. We solve this game for two stages. The single-stage optimality conditions can be directly used to obtain an expression for the optimal payoff P if BLUE uses $u_1$ BMBs and RED uses $v_1$ ADs in the first stage, and then both play optimally in the second stage. The corresponding payoff is

$$P = J_1(u_1, v_1) + J_2(u_2^*, v_2^*) = J_1(u_1, v_1) + J_2(S_2^b, S_2^a),$$
$$J_1(u_1, v_1) = S_2^g = \max\{0,\; S_1^g - \theta\max\{0,\, u_1 - \gamma v_1\}\},$$
$$J_2(S_2^b, S_2^a) = \max\{0,\; S_2^g - \theta\max\{0,\, S_2^b - \gamma S_2^a\}\}.$$

Substituting numerical values, and since $u_1 \in [0, 30]$, $S_2^b \le 30$ and $S_2^g \ge 70$, we have

$$J_1(u_1, v_1) = S_2^g = \max\{0,\; 100 - \max\{0,\, u_1 - 5v_1\}\} = 100 - \max\{0,\, u_1 - 5v_1\},$$
$$J_2(S_2^b, S_2^a) = \max\{0,\; S_2^g - \max\{0,\, S_2^b - 5S_2^a\}\} = S_2^g - \max\{0,\, S_2^b - 5S_2^a\}.$$

Substituting $S_2^b = \max\{0, u_1 - 5v_1\} + (30 - u_1)$ and $S_2^a = \max\{0, v_1 - u_1\} + (5 - v_1)$,

$$P = 200 - 2\max\{0,\, u_1 - 5v_1\} - \max\bigl\{0,\; 5 - (u_1 - 5v_1) + \max\{0,\, u_1 - 5v_1\} - 5\max\{0,\, v_1 - u_1\}\bigr\}. \tag{73}$$
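The payoff in (73) is simple enough to explore by brute force. The sketch below evaluates the two-stage payoff on a grid and compares the pure-strategy minimax and maximin values; it is a numerical illustration, not part of the original analysis. The function name `payoff` and the grid spacing of 1/7 are assumptions made here, and exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction


def payoff(u1, v1):
    """Two-stage payoff P(u1, v1) of the BMB/AD example (gamma = 5, eta = theta = 1)."""
    hit = max(0, u1 - 5 * v1)                     # deployed BMBs that survive the ADs in stage 1
    s2_g = max(0, 100 - hit)                      # J1: TG strength after stage 1
    s2_b = hit + (30 - u1)                        # BMBs available in stage 2
    s2_a = max(0, v1 - u1) + (5 - v1)             # ADs available in stage 2
    j2 = max(0, s2_g - max(0, s2_b - 5 * s2_a))   # stage 2 with everything committed
    return s2_g + j2


us = [Fraction(j, 7) for j in range(30 * 7 + 1)]   # u1 in [0, 30]
vs = [Fraction(k, 7) for k in range(5 * 7 + 1)]    # v1 in [0, 5]

minimax = min(max(payoff(u, v) for v in vs) for u in us)   # BLUE commits first
maximin = max(min(payoff(u, v) for u in us) for v in vs)   # RED commits first
print(minimax, maximin, float(maximin))
```

On this grid the minimax value is 185 and the maximin value is 1245/7, that is, 177 6/7, so the two do not coincide and no pure-strategy saddle point exists. The spacing of 1/7 was chosen so that the grid contains the point $v_1 = 30/7$ at which the maximin is attained; a coarser or misaligned grid would only approximate it.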


Figure 11. (i) Rational reaction curves. (ii) BLUE’s optimal payoff. (iii) RED’s optimal payoff.

The payoff function is continuous and so, for every choice of $u_1 \in [0, S_1^b]$, there exists an optimal choice $\hat{v}_1 = l_v(u_1)$ of RED which maximizes the payoff. Here, $l_v(\cdot) : [0, S_1^b] \to [0, S_1^a]$ denotes the rational reaction function of RED. Also, the maximum payoff is denoted by $\overline{P}(u_1)$. Similarly, for every choice of $v_1 \in [0, S_1^a]$, there exists an optimal choice $\hat{u}_1 = l_u(v_1)$ of BLUE that minimizes the payoff. Here, $l_u(\cdot) : [0, S_1^a] \to [0, S_1^b]$ denotes the rational reaction function of BLUE. This minimum payoff is denoted by $\underline{P}(v_1)$. Plotting these quantities in figure 11, we observe the following: (i) There is a discontinuity in the rational reaction curve of BLUE. (ii) If we consider only pure strategies for the players then the minimax value of the game (= 185) is not equal to the maximin value (= $177\tfrac{6}{7}$). Thus, the two-stage game does not have a pure strategy saddle point. Figure 11(i) also shows that the conditions mentioned in theorems 6 and 7 are both violated. Also, BLUE cannot have an optimal pure strategy since, if it did, the optimal reaction to it would be a pure strategy for RED.

This example, although simple, shows that when the conditions of theorems 6 and 7 are not met, the solution of the game must be sought in terms of mixed strategies. Also, these mixed strategies may not have a finite support, that is, they may not be expressible as a probability distribution on a finite set of pure strategies. Thus, even simple resource interaction problems can give rise to a rich strategy space.

6. Conclusions

A game theoretical framework for a war game involving an air campaign against an enemy target, defended by air defense units, is proposed and modeled as a multiple resource interaction problem. Focusing only on the temporal aspect of the game, existence of optimal pure strategies for resource allocation is proved. Closed-form solutions are also obtained for attrition functions that are linear within the bounds of resource availability. This paper shows the potential that game theoretical concepts have for planning campaigns from a higher level command point of view. Further work in this direction involves the computation and implementation of non-stationary pure and mixed strategies when they exist, the extension of the model to its spatial dimension, incorporating multiple interactions among resources to account for approximations introduced due to aggregation of interaction events, development of computational algorithms to compute optimal strategies for large-scale interactions, and incorporation of non-linear attrition functions.

References

[1] T. Basar and G.J. Olsder, Dynamic Noncooperative Game Theory, 2nd edn. (Academic Press, London, 1995).
[2] L.D. Berkovitz and M. Dresher, A game-theory analysis of tactical air war, Operations Research 7 (1959) 599–620.
[3] D. Blackwell, On multi-component attrition games, Naval Research Logistics Quarterly 1 (1954) 210–216.
[4] J. Bracken and J.T. McGill, Defense applications of mathematical programs with optimization problems in the constraints, Operations Research 22 (1974) 1086–1096.
[5] J.M. Danskin, The Theory of Max–Min and its Applications to Weapons Allocation Problems (Springer, New York, 1967).
[6] P.K. Davis and J.H. Bigelow, Experiments in Multi-Resolution Modeling (MRM), RAND Publication MR-1004-DARPA, Santa Monica, CA (1998).
[7] K. Fan, Minimax theorems, Proc. Nat. Acad. Sci. 39 (1953) 42–47.
[8] F. Forgo, J. Szep and F. Szidarovszky, Introduction to the Theory of Games: Concepts, Methods, Applications (Kluwer Academic, Dordrecht, 1999).
[9] D. Ghose, M. Krichman, J.S. Shamma and J.L. Speyer, Modeling of a SEAD assisted air campaign as an integrated temporal and spatial resource allocation problem, Technical Report, Mechanical and Aerospace Engineering Department, University of California at Los Angeles (April 2000).
[10] D. Ghose, M. Krichman, J.L. Speyer and J.S. Shamma, Game theoretic campaign modeling and analysis, in: Proceedings of the IEEE Conference on Decision and Control (CDC '2000), Sydney, Australia (December 2000) pp. 2556–2561.
[11] D. Ghose, J.L. Speyer and J.S. Shamma, A game theoretical model for temporal resource allocation in an air campaign, in: Proceedings of the JFACC Symposium on Advances in Enterprise Control, Minneapolis, MN (2000) pp. 129–138.
[12] D. Ghose, J.L. Speyer and J.S. Shamma, A game theoretical analysis of a multiple resource interaction model for resource allocation in an air campaign, Technical Report, Mechanical and Aerospace Engineering Department, University of California at Los Angeles (October 2000).
[13] R.J. Hillestad and L. Moore, The theater-level campaign model: A research prototype for a new generation of combat analysis model, RAND Technical Report MR-388-AF/A, Rand Publication, Santa Monica, CA (1996).
[14] R.D. Luce and H. Raiffa, Games and Decisions (Wiley, New York, 1957).
[15] G. Owen, Game Theory, 3rd edn. (Academic Press, New York, 1995).
[16] E. van Damme, Stability and Perfection of Nash Equilibria (Springer, Berlin, 1987).
[17] D. Vaughan, J. Kvitky, K. Henry, M. Gabriele, G. Park, G. Halverson and B. Schweitzer, Capturing the essential factors in reconnaissance and surveillance force sizing and mix, Project Air Force, RAND Documented Briefing, DB 199-AF, Rand Publication, Santa Monica, CA (1998).
