List decoding of Reed-Muller codes up to the Johnson bound with almost linear complexity

Ilya Dumer
University of California, Riverside, CA, USA
Email: [email protected]

Grigory Kabatiansky
Inst. for Information Transmission Problems, Moscow, Russia
Email: [email protected]

Cédric Tavernier¹
THALES Communications, Colombes, France
Email: [email protected]
Abstract— A new deterministic list decoding algorithm is proposed for general Reed-Muller codes RM(s, m) of length n = 2^m and distance d = 2^{m−s}. Given n and d, the algorithm performs beyond the bounded-distance threshold of d/2 and has a low complexity order of n m^{s−1} for any decoding radius T that is less than the Johnson bound.
I. INTRODUCTION

Preliminaries. Reed-Muller (RM) codes RM(s, m), defined by two integers m and s with m ≥ s ≥ 0, have length n, dimension k, and distance d as follows:

    n = 2^m,   k = Σ_{i=0}^{s} \binom{m}{i},   d = 2^{m−s}.
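As a quick sanity check, these code parameters can be computed directly; the sketch below is ours (the function name rm_params is not from the paper):

```python
from math import comb

def rm_params(s, m):
    """Length, dimension, and distance of the Reed-Muller code RM(s, m)."""
    n = 1 << m                                 # n = 2^m
    k = sum(comb(m, i) for i in range(s + 1))  # k = sum_{i=0}^{s} C(m, i)
    d = 1 << (m - s)                           # d = 2^(m-s)
    return n, k, d

# RM(1, 4): the first-order code of length 16
print(rm_params(1, 4))  # (16, 5, 8)
```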
Over the past 50 years, RM codes have been studied extensively thanks to their simple code structure and fast decoding procedures. In particular, the majority decoding proposed in the seminal paper [1] has complexity order nk or less for any code RM(s, m). An even lower complexity order of n·min(s, m−s) suffices for the recursive techniques of [12], [13], and [14]. These algorithms also correct many error patterns beyond the bounded-distance decoding (BDD) radius of d/2. In particular, for long RM codes RM(s, m) of any given order s, majority decoding (see [2]) and the recursive technique (see [14]) correct most error patterns of weight T = n(1−ε)/2 or less, for any ε > 0. However, for any given error weight T ≥ d/2, decoding performance depends on the specific error pattern. As a result, these algorithms cannot necessarily output the complete list of codewords within a radius T ≥ d/2.

Another line of research focuses on list decoding algorithms and originated in the paper [11]. Given a code C and a decoding radius T, a list decoder takes any received vector y and produces the list L_{T;C}(y) = {c ∈ C : d(y, c) ≤ T} of all code vectors c ∈ C located within distance T of y. A particularly important question is to determine the maximum decoding radius T that allows low-complexity decoding of a specific code. As a first step toward this problem, we consider list decoding within the Johnson bound, which guarantees that the

¹ The work of C. Tavernier was partially supported by DISCREET, IST project no. 027679, funded in part by the European Commission's Information Society Technology 6th Framework Programme.
output list L_{T;C}(y) will stay limited for growing blocklengths. This bound is defined as follows. Consider a code C with minimum distance at least d = δn, and let B_{n,T}(y) = {x : d(y, x) ≤ T} be the Hamming ball of radius T = ωn centered at any y. The Johnson bound limits the number of codewords:

    |L_{T;C}(y)| = |C ∩ B_{n,T}(y)| ≤ δ / (δ − 2ω(1 − ω)),   (1)

provided that the denominator in (1) is positive. In particular, this bound shows that any code RM(s, m) contains at most a constant number

    |L_{T;RM(s,m)}(y)| ≤ δ / (2ε(1 − 2J_s + ε))   (2)

of codewords located within the decoding radius T_ε(s) = n(J_s − ε) of any center y, where

    J_s = (1 − √(1 − 2^{−s+1})) / 2,   (3)

and ε ∈ (0, J_s]. Note that J_1 = 1/2 and that, for any s ≥ 2,

    2^{−(s+1)} + 2^{−(2s+2)} ≤ J_s ≤ 2^{−(s+1)} + 2^{−(2s+1)}.

Therefore, for any code RM(s, m), nJ_s is larger than the BDD threshold of d/2.

Summary of the results. Our results are summarized in the following theorem.

Theorem 1: For any received vector y and any ε > 0, list decoding of the code RM(s, m) can be performed within the decoding radius n(J_s − ε) with complexity

    χ(m, s, ε) ≤  λ_1 n ε^{−3},                        if s = 1,
                  λ_s ε^{−2} m n + μ_s m^{s−1} n,      if s ≥ 2,   (4)

where the constants λ_s and μ_s depend only on the order s.

Let us compare Theorem 1 with other results known for list decoding of RM codes. The first significant breakthrough in this area was achieved in the paper [8], which presents a very efficient, though probabilistic, algorithm for the first-order codes RM(1, m). Later, this result was substantially extended in [9] and [10] to the larger context of nonbinary codes. Note that the algorithms of [8]-[10] are not error-specific and output the required lists with high fidelity regardless of
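The quantities in (2) and (3) are easy to evaluate numerically. The sketch below is ours (function names johnson_radius and johnson_list_bound are not from the paper); it computes J_s and checks the two-sided estimate for s ≥ 2:

```python
from math import sqrt

def johnson_radius(s):
    """Relative Johnson radius J_s = (1 - sqrt(1 - 2^(1-s))) / 2, as in (3)."""
    return (1 - sqrt(1 - 2 ** (1 - s))) / 2

def johnson_list_bound(s, eps):
    """List-size bound (2) for radius n(J_s - eps); here delta = d/n = 2^-s."""
    delta = 2 ** (-s)
    return delta / (2 * eps * (1 - 2 * johnson_radius(s) + eps))

# J_1 = 1/2, matching the n(0.5 - eps) decoding radius for RM(1, m)
print(johnson_radius(1))  # 0.5
# For s >= 2: 2^-(s+1) + 2^-(2s+2) <= J_s <= 2^-(s+1) + 2^-(2s+1)
for s in range(2, 9):
    J = johnson_radius(s)
    assert 2 ** -(s + 1) + 2 ** -(2 * s + 2) <= J <= 2 ** -(s + 1) + 2 ** -(2 * s + 1)
    assert J > 2 ** -(s + 1)  # hence nJ_s exceeds the BDD threshold d/2
```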
its input y. On the other hand, these algorithms output the required list of codewords with some small error probability P_err = P_1 + P_2. Here P_1 is the probability that a "good" code vector c (i.e., one with d(y, c) ≤ T) is discarded from the output list, and P_2 is the probability of leaving a "bad" code vector in this list. For binary codes RM(1, m), the algorithms of [8]-[10] have decoding radius T_ε(1) = n(1/2 − ε) and complexity poly(log n, ε^{−1}, log P_err^{−1}). To date, the question of constructing a (probabilistic) list decoding algorithm with polylog complexity is open for all binary RM codes of order s > 1. Theorem 1 can be extended to probabilistic decoding that performs within the radius T_ε(s) with polylog complexity for any given s; however, the proof is beyond the scope of this correspondence. Our conjecture is that such a decoding with polylog complexity does not exist if the decoding radius is larger than the Johnson bound.

Another, deterministic, list decoding algorithm is proposed in [5] for general RM codes RM(s, m). It applies the Guruswami-Sudan (GS) algorithm [4] to the 1-shortened codes RM′(s, m) of length 2^m − 1. This decoding design rests on two facts. First, the 1-shortened codes RM′(s, m) are subcodes of BCH codes of the same designed distance d. Second, BCH codes are subfield subcodes of RS codes, again of the same distance d. These two facts lead to a list decoding algorithm [5] that has decoding radius

    n(2J_{s+1} − ε)

and complexity O(n^3 ε^{−6}). Note that Theorem 1 compares favorably to the results of [5]. First, the cubic complexity O(n^3 ε^{−6}) is reduced to a quasi-linear order m^{s−1} n in the blocklength n. Second, the maximum decoding radius nJ_s of Theorem 1 always exceeds its counterpart 2nJ_{s+1}. For example, for RM(1, m) and RM(2, m), the former radii of n(0.293 − ε) and n(0.134 − ε) are replaced with n(0.5 − ε) and n(0.146 − ε), respectively. For higher orders s, the former radius 2nJ_{s+1} approaches the midpoint between d/2 and the Johnson bound nJ_s. Theorem 1 also extends our previous techniques of [6] and [7]. For RM-1 codes, these techniques give the same result as Theorem 1 and are partially extendable to RM-2 codes. At the same time, direct application of [6] and [7] to RM codes of general order s leads to less efficient algorithms. In this correspondence, we suggest one improvement that extends this approach to general RM codes and gives Theorem 1.

II. DETERMINISTIC LIST DECODING ALGORITHMS OF REED-MULLER CODES

The binary Reed-Muller code RM(s, m) of order s and length 2^m consists of the vectors f = (..., f(x_1, ..., x_m), ...), where

    f(x_1, ..., x_m) = f_0 + Σ_{1≤i≤m} f_i x_i + Σ_{1≤i_1<i_2≤m} f_{i_1,i_2} x_{i_1} x_{i_2} + ... + Σ_{1≤i_1<...<i_s≤m} f_{i_1,...,i_s} x_{i_1} ··· x_{i_s}

is a Boolean function of degree at most s, and (x_1, ..., x_m) ranges over all 2^m points of the Boolean cube of dimension m. Let y be the received vector. Our goal is to design the list

    L_ε(y) = {f ∈ RM(s, m) : d(y, f) ≤ T = n(J_s − ε)},   (5)

whose size is restricted by (2). First, we wish to recover all terms of degree s in f(x) ∈ L_ε(y). For the functions f ∈ L_ε(y), consider the corresponding list of (different) polynomials

    L_{ε,high}(y) = {f_{(high)} := Σ_{1≤i_1<...<i_s≤m} f_{i_1,...,i_s} x_{i_1} ··· x_{i_s} | f ∈ L_ε(y)}.

Here we shall design an algorithm Ψ_{s,m}(y, T) that outputs some list R_ε(y) of size ε^{−2} or less that includes the required list L_{ε,high}(y). This algorithm is called the Ratio algorithm and will be described later. To retrieve the lists L_ε(y) and L_{ε,high}(y) in the next step, we consider the list of new "received" vectors

    y′ = y + f_{(high)},   f_{(high)} ∈ R_ε(y).

Each vector y′ is decoded into RM(s−1, m). Note that y′ is decoded within the same decoding radius T < nJ_s, whereas the latter code has twice the distance, since

    d{RM(s−1, m)} ≥ 2J_s n.

Thus, we can employ bounded-distance (BD) decoding in this step. In particular, we shall use the recursive BDD algorithm Θ_{s−1,m}(y′) of [13], which has low complexity of order min(s, m−s)O(n). Finally, we keep only those decoded vectors Θ_{s−1,m}(y′) that are within distance T of the received vector y:

    d(Θ_{s−1,m}(y′), y) ≤ T.

Then the corresponding vectors (polynomials) f_{(high)} + Θ_{s−1,m}(y′) form the required list L_ε(y). The resulting algorithm, denoted Φ_{s,m}, outputs this list along with L_{ε,high}(y). This is summarized as follows.

Algorithm Φ_{s,m}(y, T) for T = n(J_s − ε).
1. Apply algorithm Ψ_{s,m}(y, T) and output the list R_ε(y).
2. Apply the BDD algorithm Θ_{s−1,m}(y′) to all y′ = y + f_{(high)}, f_{(high)} ∈ R_ε(y).
3. Choose all f_{(high)} with d(Θ_{s−1,m}(y′), y) ≤ T and output the lists L_{ε,high}(y) = {f_{(high)}} and L_ε(y) = {f_{(high)} + Θ_{s−1,m}(y′)}.

Thus, our problem is reduced to constructing the algorithm Ψ_{s,m}. To do this, we shall use three main ingredients from the previous papers [6] and [7], namely, the recursive decoding structure (common to decoding of RM codes), the acceptance criteria for a candidate, and the list decoding of
RM-1 codes with linear complexity. The new ingredient is based on the well-known Plotkin construction of RM codes, which represents any codeword f ∈ RM(s, m) in the form (u, u+v), where u ∈ RM(s, m−1) and v ∈ RM(s−1, m−1). Namely, given any vector f = (f_1, ..., f_n) of length 2^m, let a new vector f^{(j)} of length 2^{m−1} be obtained by the j-th folding. This is the mod-2 summation of the coordinates of the vector f that differ only in the j-th position. For example, for j = m,

    f^{(m)}_{i_1,...,i_{m−1}} = f_{i_1,...,i_{m−1},0} + f_{i_1,...,i_{m−1},1}.

Clearly, f^{(j)} ∈ RM(s−1, m−1). Moreover,

    f^{(j)}_{(high)} = Σ_{1≤i_1<...<i_s≤m : j∈{i_1,...,i_s}} f_{i_1,...,i_s} x_{i_1} ··· x_{i_s} / x_j.

Assume that the list decoding algorithms Ψ_{s−1,m′} and Φ_{s−1,m′} are already constructed for m′ ≤ m. Then the algorithm Ψ_{s,m} can be divided into three major steps as follows.

1. We first apply the list decoding algorithm Φ_{s−1,m−1}(y^{(j)}, J_s 2^m) for j = s, ..., m, using the (bigger) radius J_s 2^m > T. By the above recursive assumption, the decoding Φ_{s−1,m−1}(y^{(j)}, J_s 2^m) is possible due to the following inequalities:

    d(y^{(j)}, f^{(j)}) ≤ d(y, f) ≤ (J_s − ε)2^m < J_s 2^m < J_{s−1} 2^{m−1}.

The result is the list of polynomials

    L̂^{(j)}(y) = {f^{(j)}_{(high)}}

of degree s−1. The above inequalities also guarantee that this list contains the polynomials f^{(j)}_{(high)} that pertain to all functions f ∈ L_ε(y). Moreover, the Johnson bound also shows that decoding within the radius J_s 2^m gives a list of size |L̂^{(j)}(y)| ≤ 2^{s+1}. Note that the list L̂^{(j)}(y) yields the set of coefficients f^{(j,j)} = {f_{i_1,...,i_s} : i_1 < ... < i_s = j} for all code vectors f(x) ∈ L_ε(y). In the next step, we use the list L̂^{(j,j)}(y), which contains the functions

    f^{(j,j)}(x_1, ..., x_{j−1}) = Σ_{1≤i_1<...<i_{s−1}<j} f_{i_1,...,i_{s−1},j} x_{i_1} ··· x_{i_{s−1}}

for all f ∈ L_ε(y), but may contain other functions f^{(j,j)}. This list is used as follows.

2. For any f(x_1, ..., x_m), define its j-th prefix q_f^{(j)}:

    q_f^{(j)}(x_1, ..., x_j) = Σ_{1≤i_1<...<i_s≤j} f_{i_1,...,i_s} x_{i_1} ··· x_{i_s}.

The suggested algorithm Ψ_{s,m} works recursively for each j = s, ..., m. In each step j, we design the list R̂_ε^{(j)}(y) that includes the full list of j-th prefixes:

    L_ε^{(j)}(y) = {q_f^{(j)} : f(x_1, ..., x_m) ∈ L_ε(y)}.

To find R̂_ε^{(j)}(y), we first consider an auxiliary list

    Aux_ε^{(j)}(y) = {q(x_1, ..., x_{j−1}) + x_j g(x_1, ..., x_{j−1}) : q ∈ R̂_ε^{(j−1)}(y), g ∈ L̂^{(j,j)}(y)}.
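The j-th folding used in Step 1 is easy to state in code. Below is a minimal sketch of our own (the indexing convention is an assumption: coordinate b of f encodes the point whose value of x_i is bit i−1 of b). Folding the degree-s monomial x1·x2 along j = 2 leaves the degree-(s−1) function x1, while folding along a variable absent from the monomial gives 0:

```python
def fold(f, j):
    """j-th folding: mod-2 sum of the coordinates of f differing only in x_j."""
    mask = 1 << (j - 1)
    return [f[b] ^ f[b | mask] for b in range(len(f)) if not b & mask]

def evaluate(monomial_vars, m):
    """Evaluation vector of a monomial (a set of variable indices) on {0,1}^m."""
    return [int(all((b >> (i - 1)) & 1 for i in monomial_vars)) for b in range(1 << m)]

f = evaluate({1, 2}, 3)   # f = x1*x2 on the cube of dimension m = 3
print(fold(f, 2))         # x1 on the remaining variables: [0, 1, 0, 1]
print(fold(f, 3))         # x3 does not occur in f: [0, 0, 0, 0]
```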
All elements of Aux_ε^{(j)}(y) are verified against the Ratio Criteria described in Step 3 below. The list R̂_ε^{(j)}(y) is formed by those elements that pass this test. Finally, we set

    R_ε(y) = R̂_ε^{(m)}(y).

3. In this step, the key idea of the algorithm is to upper-bound the Hamming distance between the received vector y and any codeword f ∈ RM(s, m) using the Hamming distances from y − q^{(j)} to the codes RM(s−1, j). This is done as follows. Let j = s, ..., m, and let S_α = {(x_1, ..., x_j, α_1, ..., α_{m−j})} be a j-dimensional facet, where (x_1, ..., x_j) ranges over all 2^j binary j-dimensional vectors; α_1, ..., α_{m−j} are fixed, and α = α_1 + ... + α_{m−j} 2^{m−j−1} is the "number" assigned to this facet. The restriction of a function f(x_1, ..., x_m) ∈ RM(s, m) to any j-dimensional facet S_α can be represented as the sum of its j-th prefix q^{(j)}(x_1, ..., x_j) and the corresponding ("remaining") Boolean function r_α(x_1, ..., x_j) of degree less than s.

Let d_α^{(j)}(a, b) be the Hamming distance between two arbitrary vectors a and b restricted to a given j-dimensional facet S_α, and let

    Δ_α^{(j)}(y, q^{(j)}) = min_{h ∈ RM(s−1,j)} d_α^{(j)}(y − q^{(j)}, h)

be the minimal Hamming distance from the vector y − q^{(j)} to the code RM(s−1, j) restricted to S_α. Now we define the Ratio Criteria for specific candidates q^{(j)} ∈ Aux_ε^{(j)}(y). Namely, given the radius T = (J_s − ε)n, we accept a candidate q^{(j)} if it is "sufficiently close" to y on sufficiently many facets S_α in the following sense: q^{(j)} is accepted if

    |{α : Δ_α^{(j)}(y, q^{(j)}) ≤ 2^j (J_s − ε/2)}| ≥ ((ε/2) / (J_s − ε/2)) 2^{m−j}.   (6)

To verify the Ratio Criteria (6), we apply BD decoding of the RM(s−1, j) code to the "received" vector y′ = y + q^{(j)} restricted to the corresponding facet S_α. Let R_ε^{(j)}(y) = {q^{(j)}} be the list of all q^{(j)} satisfying the Ratio Criteria (6). The result is the list

    R̂_ε^{(j)}(y) = Aux_ε^{(j)}(y) ∩ R_ε^{(j)}(y)
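For very small parameters, the Ratio Criteria (6) can be illustrated by brute force, replacing BD decoding of RM(s−1, j) with exhaustive minimization over its codewords. The sketch below is ours, not the paper's implementation; it assumes the facets S_α are consecutive blocks of 2^j coordinates, which matches an ordering of the cube where x_1, ..., x_j occupy the low bits:

```python
from itertools import combinations, product

def rm_codewords(s, j):
    """All codewords of RM(s, j), generated from the monomials of degree <= s."""
    monos = [[int(all((b >> i) & 1 for i in idx)) for b in range(1 << j)]
             for deg in range(s + 1) for idx in combinations(range(j), deg)]
    return {tuple(sum(c * mono[b] for c, mono in zip(coeffs, monos)) % 2
                  for b in range(1 << j))
            for coeffs in product((0, 1), repeat=len(monos))}

def ratio_accept(y, q, s, j, m, eps, Js):
    """Ratio Criteria (6): count facets where y - q is close to RM(s-1, j)."""
    cwds, size = rm_codewords(s - 1, j), 1 << j
    yq = [a ^ b for a, b in zip(y, q)]  # binary y - q
    close = 0
    for a in range(1 << (m - j)):
        facet = yq[a * size:(a + 1) * size]
        delta = min(sum(u != v for u, v in zip(facet, c)) for c in cwds)
        close += delta <= size * (Js - eps / 2)
    return close >= (eps / 2) / (Js - eps / 2) * (1 << (m - j))

# s = 1, m = 3, j = 2, J_1 = 1/2: the true prefix of a codeword is accepted,
# while a candidate far from y on every facet is rejected.
x1 = [b & 1 for b in range(8)]                        # codeword f = x1 (its own prefix)
print(ratio_accept(x1, x1, 1, 2, 3, 0.2, 0.5))        # True
x1_x2 = [(b & 1) ^ ((b >> 1) & 1) for b in range(8)]  # q = x1 + x2 against y = 0
print(ratio_accept([0] * 8, x1_x2, 1, 2, 3, 0.2, 0.5))  # False
```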
required in Step 2 for all j = s, ..., m. Thus, the algorithm Ψ_{s,m}(y, T) has the following structure.

Algorithm Ψ_{s,m}(y, T) for T = n(J_s − ε).
1. For each j = s, ..., m, apply algorithm Φ_{s−1,m−1}(y^{(j)}, J_s 2^m). Output the list L̂^{(j,j)}(y) to Step 2.
2. Construct the list Aux_ε^{(j)}(y) from the lists R̂_ε^{(j−1)}(y) and L̂^{(j,j)}(y). Send Aux_ε^{(j)}(y) to Step 3 and return the list R̂_ε^{(j)}(y). For j = m, output R_ε(y) = R̂_ε^{(m)}(y).
3. Choose all q^{(j)} in Aux_ε^{(j)}(y) that satisfy the Ratio Criteria (6).

III. ANALYSIS OF THE RATIO ALGORITHM

The main part of the analysis of the described algorithm consists in estimating the size of the generated lists, R̂_ε^{(j)}(y) in particular. First of all, observe that

    d_α^{(j)}(y, f) ≥ Δ_α^{(j)}(y, q^{(j)})   (7)

and hence

    d(y, f) = Σ_{α=0}^{2^{m−j}−1} d_α^{(j)}(y, f) ≥ Σ_{α=0}^{2^{m−j}−1} Δ_α^{(j)}(y, q^{(j)}) := Δ^{(j)}(y, q^{(j)}).

Denote the corresponding relative distances as

    δ(y, f) = d(y, f)/n,   δ_α^{(j)}(a, b) = d_α^{(j)}(a, b)/2^j,
    δ̂_α^{(j)}(y, q) = Δ_α^{(j)}(y, q)/2^j,   δ̂^{(j)}(y, q) = Δ^{(j)}(y, q)/n,
    θ = T/n = J_s − ε.

Inequality (7) leads to the list

    L_θ^{(j)}(y) = {q^{(j)} : Δ^{(j)}(y, q^{(j)}) ≤ T}.

We choose the candidates q^{(j)} in a slightly different way. Namely, we introduce the lists R_{θ,γ}^{(j)}(y), which include all prefixes q^{(j)} that satisfy the inequality

    δ̂_α^{(j)}(y, q^{(j)}) ≤ θ + γ   (8)

for at least a γ/(θ+γ) fraction of all facets S_α. This criterion is based on the following simple lemma.

Lemma 2: Let f(x_1, ..., x_m) be a Boolean function such that d(y, f) ≤ T = θn, and let q_f^{(j)} be its j-th prefix. Then for any γ > 0 and every j = s, ..., m, a fraction of at least γ/(θ+γ) of the facets S_α satisfy inequality (8).

Proof. For 0 ≤ z ≤ 1/2, denote p_z = 2^{j−m} |{α : δ̂_α^{(j)}(y, q_f^{(j)}) = z}|. We shall prove that

    P := Σ_{z ≤ θ+γ} p_z > γ/(θ+γ).

Indeed,

    θ ≥ δ(y, f) ≥ δ̂^{(j)}(y, q_f^{(j)}) = Σ_{α=0}^{2^{m−j}−1} 2^{−m} Δ_α^{(j)}(y, q_f^{(j)}) = Σ_z z p_z.

On the other hand,

    Σ_z z p_z ≥ Σ_{z > θ+γ} z p_z > (1 − P)(θ + γ).

We conclude that P > γ/(θ+γ). □

Note that

    L_θ^{(j)}(y) ⊆ L_{θ+γ}^{(j)}(y) ⊆ R_{θ,γ}^{(j)}(y).   (9)

Let M_{s,m;θ} be the maximal possible size of the list for the RM(s, m) code with the decoding radius T = 2^m θ. The next lemma shows that the size of the lists R_{θ,γ}^{(j)}(y) cannot be much larger than the size of the list for the RM(s, j) code with the decoding radius 2^j(θ + γ).

Lemma 3: For any received vector y and every j = s, ..., m,

    |R_{θ,γ}^{(j)}(y)| ≤ ((θ + γ)/γ) M_{s,j;θ+γ}.

Proof. By definition, each j-dimensional facet S_α includes at most M_{s,j;θ+γ} code vectors v ∈ RM(s, j) such that δ̂_α^{(j)}(y, q^{(j)}) ≤ θ + γ. We say that both such vectors v and the corresponding pairs (v, S_α) are good. Hence the total number of good pairs is at most 2^{m−j} M_{s,j;θ+γ}. On the other hand, Lemma 2 guarantees that for any f ∈ R_{θ,γ}^{(j)}(y), its prefix q^{(j)} is a good vector for at least 2^{m−j} γ/(θ+γ) facets. Any two different prefixes restricted to the same facet generate two distinct vectors of length 2^j. Therefore, we have the inequality

    2^{m−j} M_{s,j;θ+γ} ≥ 2^{m−j} (γ/(θ+γ)) |R_{θ,γ}^{(j)}(y)|,

which proves the lemma. □

Recall that the Ratio Criteria (6) employ the list R_{J_s−ε, ε/2}^{(j)}(y), denoted R_ε^{(j)}(y). It follows from Lemma 3 that

    |R_ε^{(j)}(y)| ≤ (J_s/(ε/2)) M_{s,j;J_s−ε/2} < 2^{−2s+1} ε^{−2}.   (10)

Now we deduce from (10) that Aux_ε^{(j)}(y) has size

    |Aux_ε^{(j)}(y)| = |R̂_ε^{(j−1)}(y)| × |L̂^{(j,j)}(y)| ≤ 2^{−2s+1} ε^{−2} · 2^{s+1} = 2^{−s+2} ε^{−2}.   (11)
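The averaging argument behind Lemma 2 is purely combinatorial: if nonnegative numbers z_α have mean at most θ, then more than a γ/(θ+γ) fraction of them are at most θ+γ. A quick randomized illustration of our own (function and variable names are not from the paper):

```python
import random

def close_fraction(z, theta, gamma):
    """Fraction of the z_alpha not exceeding theta + gamma."""
    return sum(1 for x in z if x <= theta + gamma) / len(z)

random.seed(7)
theta, gamma = 0.14, 0.05
for _ in range(100):
    u = [random.random() for _ in range(256)]
    z = [x * theta / (sum(u) / len(u)) for x in u]  # rescale so mean(z) == theta
    # Lemma 2's counting bound: P > gamma / (theta + gamma)
    assert close_fraction(z, theta, gamma) > gamma / (theta + gamma)
print("counting bound verified on 100 random trials")
```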
IV. COMPLEXITY

For any algorithm A, let χ(A) denote its complexity. Then

    χ(Φ_{s,m}(∗, T)) = χ(Ψ_{s,m}(∗, T)) + χ(Θ_{s−1,m}) × |R_ε(y)|
                     ≤ χ(Ψ_{s,m}(∗, T)) + χ(Θ_{s−1,m}) ε^{−2},   (12)

where Θ_{s,m} denotes the chosen bounded-distance decoding algorithm for the RM(s, m) code, which has complexity χ(Θ_{s,m}) = O(s 2^m). At the j-th step of the algorithm Ψ_{s,m}, we have already obtained the list that contains all good candidates q^{(j−1)}. For each candidate q^{(j−1)}, we need to find all its good continuations x_j q^{(j,j)}(x_1, ..., x_{j−1}). Hence, we check the Ratio Criteria for all candidates from the auxiliary list. To do so, we take the "received" vector y′ = y + q^{(j)} restricted to the corresponding facet S_α and apply bounded-distance decoding Θ_{s−1,j}(y′) of the code RM(s−1, j). Since the size of the auxiliary list is bounded by ε^{−2} in (11), it follows that

    χ(Ψ_{s,m}) ≤ (m − s + 1) χ(Φ_{s−1,m−1}(∗, 2^m J_s)) + ε^{−2} Σ_{j=s}^{m} 2^{m−j} χ(Θ_{s−1,j}).   (13)
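The recursion (12)-(13) is easy to tabulate numerically. The toy model below is ours and fixes the hidden constants to 1 (an assumption: the actual λ_s, μ_s are not specified here), taking χ(Θ_{s,m}) = s·2^m and the RM-1 base case λ_1 n ε^{−3}:

```python
def chi_theta(s, m):
    """Toy BDD cost: chi(Theta_{s,m}) = s * 2^m, with the constant set to 1."""
    return max(s, 1) * (1 << m)

def chi_phi(s, m, eps):
    """Recursion (12): chi(Phi) = chi(Psi) + chi(Theta_{s-1,m}) * eps^-2."""
    if s == 1:
        return (1 << m) * eps ** -3  # base case lambda_1 * n * eps^-3
    return chi_psi(s, m, eps) + chi_theta(s - 1, m) * eps ** -2

def chi_psi(s, m, eps):
    """Recursion (13)."""
    return ((m - s + 1) * chi_phi(s - 1, m - 1, eps)
            + eps ** -2 * sum((1 << (m - j)) * chi_theta(s - 1, j)
                              for j in range(s, m + 1)))

print(chi_phi(2, 5, 0.5))  # 1152.0
```

With the constants fixed this way, the estimate grows quasi-linearly in n = 2^m for each fixed s, in line with Theorem 1.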
Now Theorem 1 follows from straightforward estimates based on the recursive inequalities (12) and (13).

REFERENCES

[1] I.S. Reed, "A class of multiple error correcting codes and the decoding scheme," IEEE Trans. Inform. Theory, vol. IT-4, pp. 38-49, 1954.
[2] R.E. Krichevskiy, "On the number of Reed-Muller code correctable errors," Dokl. Soviet Acad. Sciences, vol. 191, pp. 541-547, 1970.
[3] F.J. MacWilliams and N.J.A. Sloane, The Theory of Error-Correcting Codes, North-Holland, Amsterdam, 1981.
[4] V. Guruswami and M. Sudan, "Improved decoding of Reed-Solomon and algebraic-geometry codes," IEEE Trans. Inform. Theory, vol. 45, pp. 1757-1767, 1999.
[5] R. Pellikaan and X.-W. Wu, "List decoding of q-ary Reed-Muller codes," IEEE Trans. Inform. Theory, vol. 50, pp. 679-682, 2004.
[6] G. Kabatiansky and C. Tavernier, "List decoding of Reed-Muller codes of first order," Proc. ACCT-9, pp. 230-235, Bulgaria, 2004.
[7] G. Kabatiansky and C. Tavernier, "List decoding of second order Reed-Muller codes," Proc. 8th Intern. Symp. Comm. Theory and Applications, Ambleside, UK, July 2005.
[8] O. Goldreich and L.A. Levin, "A hard-core predicate for all one-way functions," Proc. 21st ACM Symp. Theory of Computing, pp. 25-32, 1989.
[9] O. Goldreich, R. Rubinfeld, and M. Sudan, "Learning polynomials with queries: the highly noisy case," SIAM J. Discrete Math., pp. 535-570, 2000.
[10] M. Sudan, L. Trevisan, and S. Vadhan, "Pseudorandom generators without the XOR lemma," J. Computer and System Sciences, vol. 62, no. 2, pp. 236-266, March 2001.
[11] P. Elias, "List decoding for noisy channels," 1957 IRE WESCON Convention Record, pt. 2, pp. 94-104, 1957.
[12] S. Litsyn, "On complexity of decoding low rate Reed-Muller codes," Proc. 9th All-Union Conf. Coding Theory and Inform. Transmission, vol. 1, pp. 202-204, 1988 (in Russian).
[13] G.A. Kabatianskii, "On decoding of Reed-Muller codes in semicontinuous channels," Proc. 2nd Int. Workshop "Algebr. and Comb. Coding Theory," Leningrad, USSR, pp. 87-91, 1990.
[14] I. Dumer, "Recursive decoding and its performance for low-rate Reed-Muller codes," IEEE Trans. Inform. Theory, vol. 50, pp. 811-823, 2004.
[15] I. Dumer, "On recursive decoding with sublinear complexity for Reed-Muller codes," Proc. IEEE Inform. Theory Workshop, Paris, France, pp. 14-17, 2003.