Approximating the Stationary Distribution of an Infinite Stochastic Matrix
Daniel P. Heyman
Journal of Applied Probability, Vol. 28, No. 1 (Mar. 1991), pp. 96-103.
Stable URL: http://links.jstor.org/sici?sici=0021-9002%28199103%2928%3A1%3C96%3AATSDOA%3E2.0.CO%3B2-F
Journal of Applied Probability is published by the Applied Probability Trust.
J. Appl. Prob. 28, 96-103 (1991). Printed in Israel. © Applied Probability Trust 1991
APPROXIMATING THE STATIONARY DISTRIBUTION OF AN INFINITE STOCHASTIC MATRIX

DANIEL P. HEYMAN,* Bellcore
Abstract

We are given a Markov chain with states 0, 1, 2, .... We want to get a numerical approximation of the steady-state balance equations. To do this, we truncate the chain, keeping the first n states, make the resulting matrix stochastic in some convenient way, and solve the finite system. The purpose of this paper is to provide some sufficient conditions that imply that as n tends to infinity, the stationary distributions of the truncated chains converge to the stationary distribution of the given chain. Our approach is completely probabilistic, and our conditions are given in probabilistic terms. We illustrate how to verify these conditions with five examples.

MARKOV CHAINS; TRUNCATION; AUGMENTATION; HESSENBERG MATRICES
1. Introduction
We are given a Markov chain with states 0, 1, 2, .... Let X_∞ = {X_∞(m); m = 0, 1, 2, ...} be the chain, and let P_∞ be its transition matrix. We want to get a numerical approximation of the solution of the steady-state balance equations

(1.1)  π_∞ = π_∞ P_∞,  Σ_i π_∞(i) = 1.

To do this, we truncate P_∞, keeping the first n states, and make the resulting matrix stochastic in some convenient way. Let P_n be the resulting transition matrix, and let X_n denote the corresponding chain. We solve the finite system

(1.2)  π_n = π_n P_n,  Σ_i π_n(i) = 1.
We can guarantee that a solution of (1.2) will be a good approximation of π_∞ for a large enough value of n if

(1.3)  lim_{n→∞} π_n(i) = π_∞(i)

holds true for each state i. The purpose of this paper is to provide some sufficient conditions for (1.3) to hold. Our approach is completely probabilistic, and our conditions are given in probabilistic terms. We illustrate how to verify these conditions with several examples.

Received 20 December 1989; revision received 20 February 1990.
* Postal address: Bellcore, Room 3D-308, 331 Newman Springs Road, Red Bank, NJ 07701, USA.

1.1. Background. Keeping only states 0, 1, ..., n is called truncation, and n is called the truncation level. When P_∞ is truncated, some (at least one) of the rows will not sum to 1. The procedure that rectifies this is called augmentation. For example, one might augment column zero, or the last column kept, or (for each row separately) all kept columns equally. The validity of (1.3) may depend on the nature of P_∞ and the method of truncation. Examples are given in Wolf (1980) and Gibson and Seneta (1987b) where (1.3) is valid when column zero is augmented but is invalid when the last column kept is augmented. Example 2 in Section 3.2, which originally appeared in Golub and Seneta (1973), shows that when P_∞ has at least one column that is bounded away from zero, then (1.3) is valid for any truncation.

1.2. Previous work. Seneta (1967) is the first paper to treat this problem as the central issue. In a series of papers with various collaborators, Seneta has made many contributions, most of which are included in the results presented in Gibson and Seneta (1987b). The primary approach is to specify the nature of P_∞ and the truncations considered, and then establish (1.3) for the assumed situation by exploiting the special properties at hand. Wolf (1980) takes a different approach. The properties of P_∞ and the truncation are not specified. Some conditions are derived which, if satisfied, imply that (1.3) is valid. These conditions take the form of an infinite system of inequalities, and some classes of matrices and truncations can be shown to obey them. Our approach invokes a theorem from Heyman and Whitt (1989). They considered Markov chains arising in queueing models, and used properties that are natural in queueing systems to guarantee convergence.
The Markov chains they considered have states that consist of one counting variable and some supplementary variables that are of no interest here. They use the regenerative properties of the Markov chain to control the first-passage times when proving convergence. A special case of their theorem is applied to the problem of this paper.

1.3. Outline. In Section 2 we give our basic assumptions, construct the truncated and the original Markov chains on a common probability space, and state the theorem giving our conditions for (1.3) to be valid. In Section 3 we apply the theorem to five situations from the literature and obtain these results from a single framework.

2. Preliminaries
These are our basic assumptions.

(A1) P_∞ is irreducible and positive recurrent.

(A2) State zero is positive recurrent in P_n.

(A3) P_n(i, j) ≥ P_∞(i, j) for 0 ≤ i, j ≤ n.

(A4) lim_{n→∞} P_n(i, j) = P_∞(i, j).
Assumption (A1) guarantees that (1.1) has a unique solution, so the right-hand side of (1.3) is well defined. Assumption (A3) asserts that P_n is (element-wise) at least as large as the northwest corner of P_∞. Assumption (A4) is very natural. Assumption (A2) is needed for the regenerative argument used in the proof of the theorem of Heyman and Whitt (1989) to be valid. It also implies that if P_n has multiple communicating classes, we can choose for π_n that stationary vector that assigns positive probability to state zero; this makes π_n unique. It will turn out that the sufficient condition we use for the validity of (1.3) implies (A2), so this assumption will always be valid when the theorem is applied. There is nothing magical about state zero; any other fixed state can be used.

2.1. Construction of the Markov chains. We construct the Markov chains {X_n} in a way that facilitates coupling and comparison arguments. The idea is to imagine that X_∞ and X_n are simulated using a common set of uniform random numbers. Let the uniform random variables be U_1, U_2, .... The probability space that we use is the space that carries a countably infinite sequence of i.i.d. random variables that are uniformly distributed on [0, 1]. That is, the probability space is (Ω, F, P), where

Ω = [0, 1] × [0, 1] × ···;  F = the Borel sets of Ω;  P{U_1 ≤ a_1, U_2 ≤ a_2, ..., U_m ≤ a_m} = ∏_{i=1}^{m} a_i.
Since we are interested in the stationary distribution exclusively, we can set

(2.1)  X_∞(0) = X_n(0) = 0.

For each transition epoch m, when X_∞(m − 1) = i, if

(2.2)  Σ_{j=0}^{k−1} P_∞(i, j) ≤ U_m < Σ_{j=0}^{k} P_∞(i, j),

then X_∞(m) = k. When k = 0, the left-hand side of (2.2) is zero. For each n = 1, 2, ... and for each transition epoch m, when X_n(m − 1) = j, if (2.2) (with j in place of i) is valid, then

(2.3)  X_n(m) = k if k ≤ n,  and X_n(m) = k_a if k > n,

where k_a is determined by the augmentation. For example, k_a = 0 when only column 0 is augmented, and k_a = n when only the last column kept is augmented.
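As an illustration, the coupled construction (2.1)-(2.3) can be sketched in code. The chain below is a hypothetical birth-death example (not from the paper), and column-zero augmentation (k_a = 0) is assumed; the sketch checks that the two paths agree at every epoch before X_∞ first exceeds n.

```python
import numpy as np

p = 0.6   # upward jump probability of an illustrative birth-death chain
n = 5     # truncation level; states kept are 0, 1, ..., n

def step_inf(i, u):
    """Inverse-transform step (2.2) for row i of the infinite matrix P_inf."""
    if i == 0:
        return 0 if u < 1 - p else 1
    return i - 1 if u < 1 - p else i + 1

def step_trunc(j, u, k_a=0):
    """Step (2.3): same uniform, but a jump above n is sent to k_a."""
    k = step_inf(j, u)
    return k if k <= n else k_a

rng = np.random.default_rng(1)
x_inf = x_n = 0            # (2.1): both chains start in state 0
m, T_n = 0, None
while True:
    m += 1
    u = rng.random()       # the shared uniform U_m
    x_inf, x_n = step_inf(x_inf, u), step_trunc(x_n, u)
    if x_inf > n:
        T_n = m            # first epoch at which X_inf exceeds n
        break
    assert x_inf == x_n    # the paths agree for every m < T_n

print("decoupled at epoch T_n =", T_n)
```

With p > 1/2 the chain drifts upward, so T_n occurs quickly; at that epoch X_∞ = n + 1 while X_n = k_a = 0.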
2.2. The limit theorem. For Markov chain X_n, let C_n denote the first-passage time from state 0 back to itself, n = 1, 2, ..., ∞. Recall that X_n(0) = 0, so C_n is the first return time to state 0. Let T_n = inf{k ≥ 1 : X_∞(k) > n}; that is, T_n is the first time that X_∞ reaches a state larger than n. We use 1{·} to denote an indicator function, and all inequalities and equations among random variables are understood to hold with probability 1. The construction in Section 2.1 defines X_∞ and X_n on the same probability space. Also, for m < T_n, (2.3) can be written as

(2.3′)  X_n(m) = k if and only if X_∞(m) = k,

so X_n(m) = X_∞(m) for m < T_n. These are the representation conditions assumed by Heyman and Whitt (1989). Their extended representation conditions are

C_n 1{C_∞ < T_n} = C_∞ 1{C_∞ < T_n},  with C_n ≤ T_n when 1{C_∞ < T_n} = 0,

and

C_n < ∞ and C_∞ < ∞ w.p. 1.

To establish these conditions for our situation, observe that (2.3′) shows that when 1{C_∞ < T_n} = 1, then 1{C_n < T_n} = 1 and C_n = C_∞, so both conditions hold in this case. When 1{C_∞ < T_n} = 0, C_n ≤ T_n, establishing that the former condition holds. From (A1) and (A2), both C_∞ and C_n are finite, so the latter condition holds too. Ergodicity implies lim_{n→∞} T_n = ∞ w.p. 1. As n → ∞, X_n(m) → X_∞(m) for every m. We want to first let m → ∞, and then let n → ∞. By controlling the integral of T_n, Heyman and Whitt (1989) were able to apply the dominated convergence theorem and prove the following result.

Theorem. If for all n sufficiently large there exists a random variable Z, with Z ≥ 0 and E(Z) < ∞, such that

(2.4)  C_n ≤ Z

is valid, then (1.3) is valid.

Since (A1) and (2.4) imply that E[C_n] < ∞ when (2.4) is valid, we do not have to check that (A2) holds. When C_∞ < T_n, (2.3) makes C_n = C_∞, and C_∞ = T_n is impossible, so (2.4) only has to be verified for C_∞ > T_n.
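The pathwise comparison behind the theorem can be made concrete in a small simulation. The sketch below is an illustrative assumption, not an example from the paper: a positive-recurrent birth-death chain with column-zero augmentation, for which the coupled construction gives C_n ≤ C_∞ on every sample path (the bound exploited in Example 1 below).

```python
import numpy as np

p = 0.45   # upward probability; p < 1/2 makes the chain positive recurrent (A1)
n = 3      # a deliberately small truncation level, so excursions above n occur

def step_inf(i, u):
    # inverse-transform step (2.2) for the illustrative birth-death chain
    if i == 0:
        return 0 if u < 1 - p else 1
    return i - 1 if u < 1 - p else i + 1

def return_times(rng):
    """One coupled excursion from state 0; returns (C_inf, C_n)."""
    x_inf = x_n = 0
    c_inf = c_n = None
    m = 0
    while c_inf is None or c_n is None:
        m += 1
        u = rng.random()                 # shared uniform U_m
        if c_inf is None:
            x_inf = step_inf(x_inf, u)
            if x_inf == 0:
                c_inf = m
        if c_n is None:
            k = step_inf(x_n, u)
            x_n = k if k <= n else 0     # column-zero augmentation: k_a = 0
            if x_n == 0:
                c_n = m
    return c_inf, c_n

rng = np.random.default_rng(42)
pairs = [return_times(rng) for _ in range(2000)]
assert all(c_n <= c_inf for c_inf, c_n in pairs)   # C_n <= C_inf pathwise
```

Here Z = C_∞ serves in (2.4): it dominates every C_n and has a finite mean by (A1).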
3. Examples

In this section we apply our theorem to several examples drawn from the literature. Usually, verifying that (2.4) holds is easier to do than the extant convergence proof was, but not always (see Example 4).
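Before turning to the examples, the truncation-augmentation procedure itself can be checked numerically. The sketch below uses an assumed birth-death chain (not one of the paper's examples) whose stationary distribution is the geometric π_∞(i) = (1 − ρ)ρ^i with ρ = p/(1 − p); it applies each of the four augmentations of Section 3.1 and confirms that π_n is close to π_∞ at a moderate truncation level.

```python
import numpy as np

p = 0.3                  # birth probability; rho = p/(1-p) < 1
rho = p / (1 - p)

def nw_corner(n):
    """Northwest (n+1)x(n+1) corner of the infinite birth-death matrix P_inf."""
    Q = np.zeros((n + 1, n + 1))
    Q[0, 0], Q[0, 1] = 1 - p, p
    for i in range(1, n + 1):
        Q[i, i - 1] = 1 - p
        if i + 1 <= n:
            Q[i, i + 1] = p   # row n loses mass p: it is not stochastic
    return Q

def augment(Q, method):
    """Make the truncated corner stochastic by one of the four rules of Section 3.1."""
    Q = Q.copy()
    missing = 1 - Q.sum(axis=1)
    if method == "column0":
        Q[:, 0] += missing
    elif method == "last":
        Q[:, -1] += missing
    elif method == "uniform":
        Q += missing[:, None] / Q.shape[1]
    elif method == "renormalize":
        Q /= Q.sum(axis=1, keepdims=True)
    return Q

def stationary(P):
    """Solve pi = pi P, sum(pi) = 1, by least squares."""
    k = P.shape[0]
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    b = np.zeros(k + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

n = 60
exact = (1 - rho) * rho ** np.arange(n + 1)   # pi_inf(i) = (1-rho) rho^i
for method in ("column0", "last", "uniform", "renormalize"):
    pi_n = stationary(augment(nw_corner(n), method))
    assert np.max(np.abs(pi_n - exact)) < 1e-8, method
```

For this chain all four augmentations converge; the paper's point is that this is not automatic for general P_∞.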
3.1. Some special augmentations. These are some special augmentations that we shall consider.

Augment only column zero. All the missing probability is added to state 0:

P_n(i, j) = P_∞(i, 0) + Σ_{k=n+1}^{∞} P_∞(i, k)  if j = 0,
P_n(i, j) = P_∞(i, j)  if 0 < j ≤ n.
It is clear that any fixed state can be substituted for state 0.

Augment the last column kept. All the missing probability is added to state n, which is the truncation level:

P_n(i, j) = P_∞(i, j)  if 0 ≤ j < n,
P_n(i, n) = Σ_{k=n}^{∞} P_∞(i, k).
For these two examples, the precise statement of (2.3) has been given. For the next two examples, let

r_k(i) = Σ_{j=0}^{k} P_∞(i, j),  k ≥ 0,  and  r_{−1}(i) = 0.
Uniform augmentation. The missing probability is distributed equally to all the states that are kept:

P_n(i, j) = P_∞(i, j) + (1 − r_n(i))/(n + 1),  0 ≤ j ≤ n.

For this example, when X_n(m − 1) = i, X_n(m) = k if either r_{k−1}(i) ≤ U_m < r_k(i) or if

r_n(i) + k(1 − r_n(i))/(n + 1) ≤ U_m < r_n(i) + (k + 1)(1 − r_n(i))/(n + 1).
Row renormalization. Each of the kept rows is renormalized:

P_n(i, j) = P_∞(i, j)/r_n(i),  0 ≤ j ≤ n.

We require n to be large enough so that r_n(i) > 0. Here, (2.3) becomes k_a = k if

r_{k−1}(i) ≤ r_n(i)(U_m − r_n(i))/(1 − r_n(i)) < r_k(i).
This augmentation shows that the construction in (2.2) is crucial. It might be considered more natural to set X_n(m) = k if

r_{k−1}(i)/r_n(i) ≤ U_m < r_k(i)/r_n(i),

but that would destroy the coupling between X_n and X_∞.

3.2. The examples

Example 1: Augment only column zero. Here it seems clear that (2.4) holds, and we are done. Formally, when C_∞ < T_n, then C_n = C_∞. When C_∞ > T_n, then X_n(T_n) = 0, so C_n < C_∞. Since the numbering of the states is arbitrary, any fixed column can be specified. (Alternatively, in the proof of the theorem choose the entrance to that fixed state as the regenerative event.) Thus, we recover Theorem 3.1 of Gibson and Seneta (1987a) and Theorem 5.1 of Wolf (1980).

Example 2: Markov matrices. A Markov matrix is a stochastic matrix with at least one column bounded away from zero. Without loss of generality we take this special column to be column zero, so

P_∞(i, 0) ≥ δ > 0 for all i.
For the uniform random variables used to represent the Markov chains in Section 2.1, let Y_m = 1 if U_m ≥ P_∞(X_∞[m − 1], 0) and Y_m = 0 otherwise. Let Z be the index of the first Y_m that equals zero; then C_∞ ≤ Z. For any augmentation satisfying (A4), P{Y_m = 0} ≥ δ, so Z is no larger than a geometrically distributed random variable on the positive integers with mean 1/δ. The same argument along the path of X_n (for which (A3) gives P_n(j, 0) ≥ P_∞(j, 0) ≥ δ for j ≤ n) bounds C_n as well, so a suitable Z has been obtained. This recovers a result that was obtained first in Golub and Seneta (1973).

Example 3: Stochastically monotone matrices. A stochastically monotone matrix has the property that

i < k implies Σ_{j=m}^{∞} P(i, j) ≤ Σ_{j=m}^{∞} P(k, j)

for every state m. Probabilistically, this means that when {Y(m); m = 1, 2, ...} is the chain, the random variables [Y(m + 1) | Y(m) = i] are stochastically increasing in i. Any augmentation that preserves this property makes X_n(m) ≤ X_∞(m) valid for all positive integers m, so C_n ≤ C_∞ holds. Consequently our theorem applies and (1.3) is valid. Manifestly, augmenting the last column kept, uniform augmentation and row renormalization preserve stochastic monotonicity. This is essentially the result given in Gibson and Seneta (1987a).
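The defining tail-sum inequality is easy to test mechanically on a finite matrix. The sketch below is illustrative, assuming a birth-death northwest corner with last-column augmentation; the helper simply compares cumulative tails of consecutive rows.

```python
import numpy as np

def is_stoch_monotone(P):
    """True iff i < k implies sum_{j>=m} P(i,j) <= sum_{j>=m} P(k,j) for all m."""
    tails = P[:, ::-1].cumsum(axis=1)[:, ::-1]   # tails[i, m] = sum over j >= m of P(i, j)
    return bool(np.all(np.diff(tails, axis=0) >= -1e-12))

# an illustrative monotone birth-death corner, last column augmented
p, n = 0.3, 6
Q = np.zeros((n + 1, n + 1))
Q[0, 0], Q[0, 1] = 1 - p, p
for i in range(1, n + 1):
    Q[i, i - 1] = 1 - p
    Q[i, min(i + 1, n)] += p     # the overflow from row n goes to column n

assert is_stoch_monotone(Q)      # monotonicity survives the augmentation

# a non-monotone counterexample: rows push mass in opposite directions
B = np.array([[0.1, 0.9],
              [0.9, 0.1]])
assert not is_stoch_monotone(B)
```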
Example 4: Upper-Hessenberg matrices. A matrix is upper-Hessenberg if P(i, j) = 0 when i > j + 1. Probabilistically, this means that transitions from a state in {i + 1, i + 2, ...} to a state in {0, 1, ..., i} are always by a transition from state i + 1 to state i. This is also called the skip-free-to-the-left property.

It is clear that the skip-free-to-the-left property implies that augmenting the last column kept makes C_n as large as possible. Thus, (1.3) will hold for any augmentation satisfying (A4) if it holds for that augmentation. If C_∞ > T_n, then X_n(T_n) = n and X_∞(T_n) > n. From the skip-free-to-the-left property, if we stop the X_n-process until the X_∞-process reaches state n, the two processes will agree again. We do this over and over until the X_∞-process reaches state 0. Manifestly, stopping the X_n-process makes C_n too large, so we have (2.4) holding as a strict inequality, and we are done. A completely formal proof that the construction in the previous paragraph establishes (2.4) requires some reindexing of the uniform random variables in (2.3). This is standard fare, and is omitted. Thus, we have recovered Theorem 2.2 in Gibson and Seneta (1987b). Our proof for augmenting the last column is not as straightforward as the proof in Golub and Seneta (1974).
Example 5: Lower-Hessenberg matrices. A matrix is lower-Hessenberg if P(i, j) = 0 when i < j − 1. Probabilistically, this means that upward transitions are always one step at a time; this is the skip-free-to-the-right property. Let a_n(j) denote the probability that the augmentation assigns the missing probability of row n to state j. We require these augmentation distributions to be tight: for every ε > 0 there is a finite K such that

Σ_{j=0}^{K} a_n(j) > 1 − ε

for all n ≥ K. In the interesting case where C_∞ > T_n, let R_n denote a generic first-passage time from state n to state zero. Then

C_n ≤ T_n − 1 + R_n ≤ C_∞ + R_n,

so it suffices to find a bound for R_n. The skip-free-to-the-right property implies that X_n(T_n − 1) = n = X_∞(T_n − 1) and X_∞(T_n) = n + 1. The distribution of X_n(T_n) is known from the augmentation rule, and the tightness condition implies that P{X_n(T_n) ≤ K} > 1 − ε. Since ε is arbitrary, P{X_n(T_n) ≤ K} = 1. The epoch T_n is the first time that (2.2) attempts to make a transition to state n + 1; let S_n be the second time. Let p = P{C_n < S_n | C_∞ > T_n} and q = 1 − p. When X_n(T_n) ≤ K, let W_n be the conditional first-passage time from X_n(T_n) to zero given that C_n < S_n, and (unconditionally) let V_n = S_n − T_n + 1. The strong Markov property gives (w.p. 1)

R_n = W_n  with probability p,
R_n = V_n + R′_n  with probability q,

where R′_n is an i.i.d. copy of R_n. Let W be any random variable that is stochastically larger than each W_n, n ≤ K, and has a finite mean. Let V be any random variable that is stochastically larger than each V_n, n ≤ K, and has a finite mean. These definitions are legitimate because K < ∞. Let ŵ and v̂ be their generating functions. Let R be the random variable with generating function pŵ/(1 − qv̂). Clearly R_n is stochastically
smaller than R, and expressing these random variables in terms of the underlying uniform random variables will order them w.p. 1. This result first appeared in Gibson and Seneta (1987b).
Acknowledgements I would like to thank Walter Willinger and Ward Whitt for their helpful comments.
References

GIBSON, D. AND SENETA, E. (1987a) Monotone infinite stochastic matrices and their augmented truncations. Stoch. Proc. Appl. 24, 287-292.
GIBSON, D. AND SENETA, E. (1987b) Augmented truncations of infinite stochastic matrices. J. Appl. Prob. 24, 600-608.
GOLUB, G. H. AND SENETA, E. (1973) Computation of the stationary distribution of an infinite stochastic matrix. Bull. Austral. Math. Soc. 8, 333-341.
GOLUB, G. H. AND SENETA, E. (1974) Computation of the stationary distribution of an infinite stochastic matrix of special form. Bull. Austral. Math. Soc. 10, 255-261.
HEYMAN, D. P. AND WHITT, W. (1989) Limits of queues as the waiting room grows. QUESTA 5, 381-392.
SENETA, E. (1967) Finite approximation to infinite non-negative matrices. Proc. Camb. Phil. Soc. 63, 983-992.
WOLF, D. (1980) Approximation of the invariant probability distribution of an infinite stochastic matrix. Adv. Appl. Prob. 12, 710-726.