
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE ICC 2009 proceedings

Novel Shaping and Complexity-Reduction Techniques for Approaching Capacity over Queuing Timing Channels

Negar Kiyavash

Todd P. Coleman

Mavis Rodrigues

CS Department Information Trust Institute University of Illinois [email protected]

ECE Department University of Illinois [email protected]

ECE Department University of Illinois [email protected]

Abstract— This paper discusses practical codes for communication via packet timings across network queuing systems - an instantiation of the “Bits Through Queues” result for timing channels. It has recently been shown that sparse-graph linear codes followed by shaping techniques, combined with message-passing decoding, can enable practical timing channel codes with low symbol error rates near the capacity. The previous work had two main drawbacks. First, the shaping technique was only effective for very large finite field sizes. Secondly, the complexity of the message-passing decoder was quadratic in the block length. In this work, 1) we develop an alternative shaping technique using random dithers with provably good statistical guarantees; 2) we exploit Little’s Law from queuing theory along with a large deviations argument to reduce the message-passing decoder’s complexity from quadratic to linear in block length. We illustrate the effectiveness of this approach on simulated queuing systems with low symbol error rates near the capacity.

978-1-4244-3435-0/09/$25.00 ©2009 IEEE

I. INTRODUCTION

This paper discusses a linear-complexity implementation of coding schemes for queuing channels that approach capacity. We consider a communication channel in which the encoder conveys information through the timings between successive packets. A receiver observes the packet timings after they have traveled through a communication network with queues at intermediate router nodes. Based upon the encoding mechanism, the statistical structure of the network queues, and the packet timings it observes, the receiver finds the most likely bit sequence.

Anantharam and Verdu characterized, in closed form, the capacity of an instance of the problem in which a single-server ·/M/1 queue is placed between the packets at the transmitter and the packets at the receiver [1]. The characterization of capacity is nontrivial because queueing systems are nonstationary, nonlinear, and non-memoryless. Building upon the work in [2], we consider an architecture with linear decoding complexity and a more useful shaping technique that has empirical performance approaching the capacity of communication over queueing channels. In this work, 1) we develop an alternative shaping technique using random dithers [3] with provably good statistical guarantees; 2) we exploit Little's Law from queuing theory [4] along with a large deviations argument to reduce the message-passing decoder's complexity from quadratic to linear in block length.

Sundaresan and Verdu [5] showed the existence of tree codes with sequential decoding for the exponential server timing channel. However, 1) such codes can achieve at most half of capacity, and 2) they have infinite worst-case decoding time. Recently, in [2], the authors showed that sparse-graph linear codes followed by shaping techniques, combined with message-passing decoding, can enable practical timing channel codes with low symbol error rates near capacity. However, that work had two main drawbacks. First, the shaping technique was only effective for very large finite field sizes. Second, the complexity of the message-passing decoder was quadratic in the block length.

II. DETAILED TECHNICAL SUMMARY

We exploit the graphical structure of the conditional distribution of the departure process $d = (d_1, \ldots, d_n)$ given the arrival process $a = (a_1, \ldots, a_n)$ to develop algebraic codes with low-complexity decoding algorithms and performance approaching capacity. The methodology draws from understanding the dynamics of queueing systems [6], [4], as well as algebraic coding theory [7], [8], and message-passing on graphs [8].

[Figure 1 diagram: message m → encoder → arrival times a1, a2, a3 → ·/M/1 queue → departure times d1, d2, d3 → decoder → estimate m̂]

Fig. 1. Conveying information through packet timings in a queueing system. Bits are encoded into the arrival times of the packet sequence originating from the source host (blue). The routers in a network are modeled as a queueing system, which introduces “noise” in the sense that the timings of the packet sequence en route to the destination host (red) will be different than those of the source packet sequence.

A. Problem Formulation

The problem scenario is illustrated in Figure 1. The queuing timing channel maps n input packets to n output packets according to the model

$$P_{D^n|A^n}(d^n|a^n) = \prod_{i=1}^{n} P_{S_i}(s_i) \tag{1a}$$

$$s_i \triangleq d_i - \max(d_{i-1}, a_i) \tag{1b}$$

where the service times $\{S_i\}$ are i.i.d. and exponentially (geometrically) distributed in continuous (discrete) time. The encoder controls the queuing system's arrival process $A = (A_1, A_2, \ldots, A_n)$ subject to the constraint

$$E\left[\frac{1}{n}\sum_{i=1}^{n}(A_i - A_{i-1})\right] = \frac{1}{\lambda}. \tag{2}$$

We define $C(\lambda)$ to be the maximum achievable rate over all arrival processes with average rate $\lambda$. $C(\lambda)$ is known to be attained with a Poisson (Bernoulli) arrival process in continuous (discrete) time. Table I provides a concise description of the CT [1] and DT [9], [10] capacity results and optimal input distributions from the literature. $C(\lambda)$ denotes the capacity in bits/second (bits/slot) for the continuous (discrete) time case, for arrival processes of rate $\lambda$, i.e., constrained according to (2). Similarly, $\tilde{C}(\lambda)$ denotes the capacity in bits/packet for arrival processes of rate $\lambda$. Since for large T there are on average $\lambda T$ packets in T seconds/slots, $C(\lambda)/\tilde{C}(\lambda) = \lambda$. $\lambda^*$ is the maximizer of $C(\lambda)$ over all $\lambda \in [0, \mu)$. $H_2(\cdot)$ is the binary entropy function.

TABLE I
THE CAPACITY OF QUEUING TIMING CHANNELS IN DISCRETE AND CONTINUOUS TIME

Quantity | CT [1] | DT [9], [10]
$C(\lambda)$ (bits/s) | $\lambda \log_2 \frac{\mu}{\lambda}$ | $H_2(\lambda) - \frac{\lambda}{\mu} H_2(\mu)$
$\tilde{C}(\lambda)$ (bits/packet) | $\log_2 \frac{\mu}{\lambda}$ | $\frac{H_2(\lambda)}{\lambda} - \frac{H_2(\mu)}{\mu}$
$\lambda^*$ | $e^{-1}\mu$ | $\frac{\gamma}{\gamma+1}$
$C(\lambda^*)$ (bits/s) | $e^{-1}\mu \log_2 e$ | $\log_2(1+\gamma)$
$\tilde{C}(\lambda^*)$ (bits/packet) | $\log_2 e$ | $\left(1 + \frac{1}{\gamma}\right)\log_2(1+\gamma)$

Here $\gamma \triangleq 2^{-H_2(\mu)/\mu}$.

B. Shaping

We now consider forcing the inter-arrival times to satisfy certain algebraic conditions. We know that in the discrete-time case the inter-arrival times should be i.i.d. following a geometric distribution, and in the continuous-time case they should be i.i.d. following an exponential distribution. So we consider doing the following. If we would like to construct a random variable Z with cumulative distribution function (CDF) $F_Z(z)$, then we can first construct a uniform random variable U on [0, 1] and then construct Z as

$$Z = F_Z^{-1}(U). \tag{3}$$

So for the capacity-achieving distributions for the discrete-time (DT) and continuous-time (CT) queuing channel scenarios, we have:

$$\begin{array}{c|c|c} & \text{CT} & \text{DT} \\ \hline F_Z(z) & 1 - e^{-\lambda z} & 1 - (1-\lambda)^{z} \\ Z(U) & -\dfrac{\ln(1-U)}{\lambda} & \dfrac{\ln(1-U)}{\ln(1-\lambda)} \end{array} \tag{4}$$
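The inverse-CDF construction in (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the ceiling in the DT sampler is an assumption added here so that the output is a whole number of slots.

```python
import math
import random


def sample_interarrival_ct(lam, u):
    """CT column of (4): Z = -ln(1 - U) / lambda (exponential law)."""
    return -math.log(1.0 - u) / lam


def sample_interarrival_dt(lam, u):
    """DT column of (4): Z = ln(1 - U) / ln(1 - lambda) (geometric law).

    Assumption of this sketch: the value is rounded up to the next
    integer (and floored at 1) so that Z lies in {1, 2, ...}.
    """
    return max(1, math.ceil(math.log(1.0 - u) / math.log(1.0 - lam)))


def empirical_mean(sampler, lam, n, seed=0):
    """Average n samples driven by i.i.d. uniform [0, 1) inputs."""
    rng = random.Random(seed)
    return sum(sampler(lam, rng.random()) for _ in range(n)) / n
```

Under these assumptions, both samplers should produce inter-arrival times whose empirical mean is close to 1/λ, which is the rate constraint (2).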

So if we can generate n i.i.d. uniform-[0, 1] random variables $\{U_i\}_{i=1}^{n}$, then we can generate the inter-arrival times $Z_i$ according to (4).

1) Shaping Using Algebraic Codes and Dithers: By first using the inverse CDF, we collapse the encoding problem into constructing n i.i.d. uniform-[0, 1] random variables. It is well known [7] that the ensemble of random linear codes over $F_Q$ produces codewords whose elements are i.i.d. and uniformly distributed over $F_Q$. By shaping according to the method in the previous section, Gallager showed that random linear coset codes over finite fields with maximum-likelihood decoding suffice to achieve capacity on arbitrary discrete memoryless channels [7, p. 208]. This has recently been shown to hold also for LDPC coset codes specifically [11]. A similar approach was used in [12]. Other authors have used essentially the same inverse-CDF idea for other continuous communication channels that require shaping [13].

We propose using a technique based upon algebraic codes and dithering. Consider some field size $Q = 2^t$. We force our $X_i$'s to lie in the finite field $F_Q$. We consider a matrix H with m < n rows and n columns defined over $F_Q$. Define the linear coset code C as

$$C = \{x : Hx = v\}.$$

From here, interpret each $x_i \in F_Q$ as a member of $\mathbb{R}$ and define the $i$th inter-arrival time $Z_i$ as

$$\begin{aligned} Z_i &= T_i(x_i) && (5) \\ &\triangleq F_Z^{-1}\left(\left(\frac{x_i}{Q} + U_i\right) \bmod 1\right) && (6) \end{aligned}$$

where $(U_1, \ldots, U_n)$ are i.i.d. uniform-[0, 1] dithers, of the kind that have been used extensively in quantization [14], watermarking [15], communication on linear Gaussian channels [3], [16], and structured multiterminal binning [17]. We note that, by the Crypto Lemma [3], $(x_i/Q + U_i) \bmod 1$ is also uniformly distributed on [0, 1]. Secondly, over the ensemble of random linear codes, the $X_i$'s are uniformly distributed over $F_Q$, and thus the $Z_i$'s are i.i.d. and distributed according to the capacity-achieving distribution.

Note that we can define the n functions $T_i : F_Q \to \mathbb{R}$ a priori; they are completely characterized by a table of nQ real numbers known at the encoder and decoder. In practice, the encoder and decoder only need to know the state of a random number generator to generate the n $U_i$'s, which subsequently define the n functions $T_1, \ldots, T_n$. Since the rate for such a procedure is

$$R = \frac{\log_2 Q^{n-m}}{n} \text{ bits per packet},$$

we have from Table I that the rate must be upper-bounded by the following bits/packet capacities:

$$R = \frac{\log_2 Q^{n-m}}{n} \le \begin{cases} \log_2 \dfrac{\mu}{\lambda} & \text{(CT upper bound)} \\[2mm] \dfrac{H_2(\lambda)}{\lambda} - \dfrac{H_2(\mu)}{\mu} & \text{(DT upper bound)} \end{cases}$$
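The dithered shaping map and the rate bound above can be sketched as follows. This is only an illustrative sketch under stated assumptions: `make_shaping_tables`, `code_rate`, and `capacity_bits_per_packet` are hypothetical names, the integer seed stands in for the shared random-number-generator state, field elements are represented by the integers 0, ..., Q−1, and the DT branch rounds up to whole slots (an assumption not spelled out in the text).

```python
import math
import random


def binary_entropy(p):
    """H2(p) in bits, with H2(0) = H2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def make_shaping_tables(n, q, lam, seed, discrete=True):
    """Tabulate T_i(x) = F_Z^{-1}((x / Q + U_i) mod 1) for all i and x.

    Returns n rows of Q values: the table of nQ real numbers that the
    encoder and decoder can both regenerate from the shared seed.
    """
    rng = random.Random(seed)
    tables = []
    for _ in range(n):
        u_i = rng.random()  # dither U_i, uniform on [0, 1)
        row = []
        for x in range(q):
            v = (x / q + u_i) % 1.0
            if discrete:
                # Geometric inverse CDF, rounded up to whole slots (assumption).
                z = max(1, math.ceil(math.log(1.0 - v) / math.log(1.0 - lam)))
            else:
                # Exponential inverse CDF.
                z = -math.log(1.0 - v) / lam
            row.append(z)
        tables.append(row)
    return tables


def code_rate(n, m, q):
    """R = log2(Q^(n - m)) / n, in bits per packet."""
    return (n - m) * math.log2(q) / n


def capacity_bits_per_packet(lam, mu, discrete=True):
    """Bits/packet capacity from Table I for arrival rate lam."""
    if discrete:
        return binary_entropy(lam) / lam - binary_entropy(mu) / mu
    return math.log2(mu / lam)
```

A code designer would pick m so that `code_rate(n, m, q)` stays below `capacity_bits_per_packet(lam, mu, ...)` for the chosen arrival rate.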

C. The Arrival and Departure Process as Simple First-Order Stochastic Dynamical Systems

We know that the actual arrival times of our input process satisfy

$$\begin{aligned} a_i &= a_{i-1} + z_i && (7a) \\ &= a_{i-1} + T_i(x_i). && (7b) \end{aligned}$$

We denote the variable $\alpha_i$ to capture this as:

$$\alpha_i(a_i, a_{i-1}, x_i) \triangleq 1_{\{a_i = a_{i-1} + T_i(x_i)\}}. \tag{8}$$

The formulation of the arrival process as a simple first-order stochastic dynamical system is illustrated in Figure 2.

[Figure 2 diagram: block $P(a_i | a_{i-1}, x_i)$ with input $x_i$ and output $a_i$, which is fed back through a unit delay $z^{-1}$ as $a_{i-1}$]

Fig. 2. Viewing the arrival process as a simple first-order stochastic dynamical system with xi as an exogenous input

Now consider the departure process from the output of a queue with a first-come, first-serve (FCFS) discipline, depicted in Figure 3. Note that the service time $s_1$ for the first packet is given by $d_1 - a_1 = 6$. Note that the first packet does not depart from the queue until after the second packet arrives. Thus the service time $s_2$ for the second packet is given by $s_2 = d_2 - d_1$, because the server starts working on the second packet once the first packet departs. The second packet departs before the third arrival $a_3$. Thus the third service time is simply $s_3 = d_3 - a_3$. So in general, it follows that [6], [4]

$$s_i = d_i - \max(a_i, d_{i-1}) = w(d_i, a_i, d_{i-1}) \tag{9}$$

where (9) reflects that $s_i$ only depends on $d_i$, $a_i$, and $d_{i-1}$.

[Figure 3 timeline: events a1, a2, d1, d2, a3, d3 marked in that order along a time axis with ticks at 0, 1, 5, 7, 9, 11, 16]

Fig. 3. The arrivals (blue) and departures (red) from a First-Come, First-Serve Discrete-Time Queue

Note from (1) that the channel law is specified in terms of the service times, and the $i$th service time is only a function of $d_i$, $a_i$, and $d_{i-1}$. Thus,

$$P(d^n | a^n) = \prod_{i=1}^{n} \beta_i(d_i, d_{i-1}, a_i),$$

$$\beta_i(d_i, d_{i-1}, a_i) \triangleq f_S(w(d_i, a_i, d_{i-1})), \tag{10}$$

$$f_S(s) = \begin{cases} 1_{\{s>0\}}\, \mu (1-\mu)^{s-1} & \text{for DT} \\ 1_{\{s>0\}}\, \mu e^{-\mu s} & \text{for CT} \end{cases}$$

since in CT the service times are i.i.d. and exponentially distributed with parameter $\mu$, and in DT they are i.i.d. and geometrically distributed with parameter $\mu$. Figure 4 depicts the formulation of the departure process as a simple first-order stochastic dynamical system, with the arrival times as an exogenous input.

[Figure 4 diagram: block $P(d_i | d_{i-1}, a_i)$ with input $a_i$ and output $d_i$, which is fed back through a unit delay $z^{-1}$ as $d_{i-1}$]

Fig. 4. Viewing the departure process as a simple first-order stochastic dynamical system, with ai as an exogenous input.

D. An Aggregate State-Space System Representation and Message-Passing Decoder

Now that we have used a linear code to map input bits to code symbols $x_i$, viewed the arrival process as a simple first-order stochastic dynamical system with $x_i$ as an exogenous input, and viewed the departure process as a simple first-order stochastic dynamical system with $a_i$ as an exogenous input, we can characterize the joint likelihood of all observable and unobservable state variables [2]:

$$P(d, a, x) \propto 1_{\{Hx = v\}} \prod_{i=1}^{n} g_i(d_i, d_{i-1}, a_i, a_{i-1}, x_i),$$

$$g_i(d_i, d_{i-1}, a_i, a_{i-1}, x_i) \triangleq \beta_i(d_i, d_{i-1}, a_i)\, \alpha_i(a_i, a_{i-1}, x_i), \tag{11}$$

with $\alpha_i$ given by (8) and $\beta_i$ given by (10). This state-space representation can be captured using a Forney Factor Graph [8], also termed a "normal graph"; see Figure 5. We also note that, by viewing the arrival process and departure process as first-order stochastic dynamical systems with appropriate exogenous inputs, the red component of the graph has no cycles. It has a trellis structure reminiscent of the Kalman filter and the Viterbi decoding algorithm.

[Figure 5 diagram: a chain of factor nodes g1, ..., gn linked by state edges (d1, a1), ..., (dn−1, an−1), each gi attached to its observed departure di and, through an equality node, to its code symbol xi; the symbols x1, ..., xn connect to parity-check (+) nodes with syndromes s1, ..., sm]

Fig. 5. A Forney Factor Graph for P (a, x|d). Note the blue component is the Forney Factor Graph of a Traditional Linear Coset Code. Note that the red component has no cycles.

E. An Iterative Decoding Scheme with Message-Passing Update Rules

Note that there are no cycles in the factor graph representing $\prod_{i=1}^{n} g_i(d_i, d_{i-1}, a_i, a_{i-1}, x_i)$, and a sparse-graph (i.e., LDPC) coset code can be used so that $1_{\{Hx = v\}}$ has a sparse graph representation. Then by associating node $g_i$ with (11), node $rc_i$ with $x_i \in F_Q$, and node $pc_j$ with $1_{\{h_j x = s_j\}}$ (with $h_j$ the $j$th row of H and $s_j$ the $j$th entry of v), we have the following message-passing rules pertaining to the sum-product algorithm [8] on Forney Factor Graphs:

$$\mu_{g_i \to g_{i-1}}(d_{i-1}, a_{i-1}) = \sum_{x_i=0}^{Q-1} g_i(d_i, d_{i-1}, a_{i-1} + T_i(x_i), a_{i-1}, x_i)\, \mu_{rc_i \to g_i}(x_i)\, \mu_{g_{i+1} \to g_i}(d_i, a_{i-1} + T_i(x_i))$$

$$\mu_{g_i \to g_{i+1}}(d_i, a_i) = \sum_{x_i=0}^{Q-1} g_i(d_i, d_{i-1}, a_i, a_i - T_i(x_i), x_i)\, \mu_{rc_i \to g_i}(x_i)\, \mu_{g_{i-1} \to g_i}(d_{i-1}, a_i - T_i(x_i))$$

$$\mu_{g_i \to rc_i}(x_i) = \sum_{a_{i-1}} g_i(d_i, d_{i-1}, a_{i-1} + T_i(x_i), a_{i-1}, x_i)\, \mu_{g_{i-1} \to g_i}(d_{i-1}, a_{i-1})\, \mu_{g_{i+1} \to g_i}(d_i, a_{i-1} + T_i(x_i))$$

$$\mu_{rc_i \to g_i}(x_i) = \prod_{j \in N(i)} \mu_{pc_j \to rc_i}(x_i)$$

$$\mu_{pc_j \to rc_i}(x_i) = \sum_{x : h_j x = s_j}\ \prod_{i' \in N(j) \setminus i} \mu_{rc_{i'} \to pc_j}(x_{i'})$$

$$\mu_{rc_i \to pc_j}(x_i) = \mu_{g_i \to rc_i}(x_i) \prod_{j' \in N(i) \setminus j} \mu_{pc_{j'} \to rc_i}(x_i)$$

We note, however, that the messages $\mu_{g_i \to g_{i-1}}(d_{i-1}, a_{i-1})$ and $\mu_{g_i \to g_{i+1}}(d_i, a_i)$ must take on values for all $a_i \in [0, d_i)$ in the CT case and all $a_i \in \{1, 2, \ldots, d_i - 1\}$ in DT. Since $d_i$ grows linearly with i, this amounts to $O(n^2)$ complexity, even with a sparse-graph code. The expected system time $W = E[D_i - A_i]$ of a stable M/M/1 queue, however, is given by Little's Law [4]:

$$W = \frac{1}{\mu - \lambda}. \tag{12}$$

Many large-deviation bounds (e.g., Chebyshev, Chernoff) can assess $P(D_i - A_i > cW) \le \epsilon(c)$ for appropriate $\epsilon$. Thus we can reduce the decoder complexity by defining $\mu_{g_i \to g_{i+1}}(d, a) = \mu_{g_i \to g_{i-1}}(d, a) = \epsilon$ when $d - a > cW$, theoretically without much performance loss. Secondly, for the CT scenario, at the decoder we bin continuous time into discrete time using a width of $\Delta$ seconds. Thus for both scenarios, with these simplifications, we have a linear-complexity message-passing decoder.

F. Performance Results

We tested the performance of this architecture using a Q = 4, n = 1000 regular LDPC coset code to encode messages, and simulated their transmission over an FCFS memoryless queue. The DT performance for a geometric server with service rate μ = 0.9 is given in Figure 6.
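As a sanity check on the queue model and on the Little's-law truncation used by the decoder, the following sketch simulates a CT M/M/1 FCFS queue, recovers the service times through (9), and measures how often the system time exceeds cW. It is an illustration under stated assumptions and not the simulation code behind the reported results: the function names are hypothetical, the queue starts empty with $d_0 = 0$, and arrivals are Poisson.

```python
import random


def simulate_fcfs_queue(n, lam, mu, seed=0):
    """Simulate n packets through a CT M/M/1 FCFS queue.

    Inter-arrival times are exponential with rate lam and service times
    are exponential with rate mu. Departures follow
    d_i = max(a_i, d_{i-1}) + s_i, i.e. equation (9) solved for d_i.
    """
    rng = random.Random(seed)
    arrivals, departures, services = [], [], []
    a_prev, d_prev = 0.0, 0.0
    for _ in range(n):
        a = a_prev + rng.expovariate(lam)
        s = rng.expovariate(mu)
        d = max(a, d_prev) + s
        arrivals.append(a)
        departures.append(d)
        services.append(s)
        a_prev, d_prev = a, d
    return arrivals, departures, services


def service_times(arrivals, departures):
    """Recover s_i = d_i - max(a_i, d_{i-1}) as in (9), taking d_0 = 0."""
    out, d_prev = [], 0.0
    for a, d in zip(arrivals, departures):
        out.append(d - max(a, d_prev))
        d_prev = d
    return out


def tail_fraction(arrivals, departures, lam, mu, c):
    """Fraction of packets with system time above c * W, W = 1 / (mu - lam)."""
    w = 1.0 / (mu - lam)
    over = sum(1 for a, d in zip(arrivals, departures) if d - a > c * w)
    return over / len(arrivals)
```

If the Fig. 3 timeline is read as a = (1, 5, 11) and d = (7, 9, 16), which is an interpretation of the figure, then `service_times([1, 5, 11], [7, 9, 16])` returns `[6, 2, 5]`, matching $s_1 = 6$, $s_2 = d_2 - d_1$, and $s_3 = d_3 - a_3$ from Section II-C. A small value of `tail_fraction` for moderate c is what makes restricting the messages to the window $d - a \le cW$ plausible.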

[Figure 6 plot: symbol error rate (log scale, 10^-6 to 10^-1) versus R/C (bits/packet), from 0.1 to 1]

Fig. 6. Symbol Error Rate vs ratio to capacity for the DT queuing channel.

[Figure 7 plot: symbol error rate (log scale, 10^-4 to 10^0) versus ratio to capacity (bits/packet), from 0.55 to 0.9]

Fig. 7. Symbol Error Rate vs ratio to capacity for the CT queuing channel.

The CT performance for an exponential server with service rate μ = 20 is given in Figure 7. In the case of the DT queuing channel, with these simple sparse-graph codes and without optimizing the detailed structure of the sparse-graph code, we were still able to achieve nontrivial, near-capacity performance. This serves as a proof of concept for the effectiveness of the architecture. However, for the CT queuing channel the error does not decay as sharply. We attribute this slow drop in the waterfall region to at least two possible reasons: 1) A regular sparse graph was used for the LDPC code; performing density evolution will most certainly improve the performance. 2) In the CT case, continuous messages are being passed on the trellis part of the graph: the support of $\mu_{g_i \to g_{i-1}}(d_{i-1}, a_{i-1})$ and $\mu_{g_i \to g_{i+1}}(d_i, a_i)$ is $a_i \in [0, d_i)$. However, our implementation both truncated $[0, d_i)$ to $[d_i - cW, d_i)$ via Little's law and discretized the continuous time axis uniformly. Such a discretization turns the message-passing algorithm into a discrete inference problem for high-dimensional continuous densities, which is imprecise if not altogether infeasible. To improve the performance of our decoding algorithm, one can use particle filtering techniques to more accurately track the densities [18].

REFERENCES

[1] V. Anantharam and S. Verdu, "Bits through queues," IEEE Transactions on Information Theory, vol. 42, no. 1, pp. 4–18, 1996.
[2] T. Coleman and N. Kiyavash, "Practical codes for queueing channels: An algebraic, state-space, message-passing approach," in IEEE Information Theory Workshop, 2008, pp. 318–322.
[3] G. D. Forney Jr., "On the role of MMSE estimation in approaching the information-theoretic limits of linear Gaussian channels: Shannon meets Wiener," in Allerton Conference on Communications, Control and Computing, 2003.
[4] L. Kleinrock, Queueing Systems. Vol 1: Theory. New York, NY: Wiley, 1975.
[5] R. Sundaresan and S. Verdu, "Sequential decoding for the exponential server timing channel," IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 705–709, March 2000.
[6] R. Gallager, Discrete Stochastic Processes. Boston, MA: Kluwer, 1996.
[7] R. G. Gallager, Information Theory and Reliable Communication. New York: John Wiley & Sons, 1968.
[8] G. Forney, "Codes on graphs: Normal realizations," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 520–548, 2001.
[9] A. S. Bedekar and M. Azizoglu, "The information-theoretic capacity of discrete-time queues," IEEE Transactions on Information Theory, vol. 44, no. 2, pp. 446–461, 1998.
[10] B. Prabhakar and R. Gallager, "Entropy and the timing capacity of discrete queues," IEEE Transactions on Information Theory, vol. 49, no. 2, pp. 357–370, February 2003.
[11] A. Bennatan and D. Burshtein, "Design and analysis of nonbinary LDPC codes for arbitrary discrete-memoryless channels," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 549–583, February 2006.
[12] E. A. Ratzer and D. J. C. MacKay, "Sparse low-density parity-check codes for channels with cross-talk," in Proceedings of IEEE Information Theory Workshop (ITW), March/April 2003.
[13] F.-W. Sun and H. C. A. van Tilborg, "Approaching capacity by equiprobable signaling on the Gaussian channel," IEEE Transactions on Information Theory, vol. 39, no. 5, pp. 1714–1716, September 1993.
[14] J. Ziv, "On universal quantization," IEEE Transactions on Information Theory, vol. 31, no. 3, pp. 344–347, 1985.
[15] B. Chen and G. Wornell, "Quantization index modulation: a class of provably good methods for digital watermarking and information embedding," IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1423–1443, 2001.
[16] U. Erez and R. Zamir, "Achieving 1/2 log (1+ SNR) on the AWGN channel with lattice encoding and decoding," IEEE Transactions on Information Theory, vol. 50, no. 10, pp. 2293–2314, 2004.
[17] R. Zamir, S. Shamai, and U. Erez, "Nested linear/lattice codes for structured multiterminal binning," IEEE Transactions on Information Theory, vol. 48, no. 6, pp. 1250–1276, 2002.
[18] A. Ihler, J. Fisher III, R. Moses, and A. Willsky, "Nonparametric belief propagation for self-localization of sensor networks," IEEE Journal on Selected Areas in Communications, vol. 23, no. 4, pp. 809–819, 2005.
