
Journal of Statistical Planning and Inference 139 (2009) 3649-3664


Asymptotic behavior of RA-estimates in autoregressive 2D processes

Oscar H. Bustos (a), Marcelo Ruiz (b), Silvia Ojeda (a), Ronny Vallejos (c,*), Alejandro C. Frery (d)

(a) FAMAF-CIEM, Universidad Nacional de Córdoba, Argentina
(b) Fac. de Cs. Exactas, Universidad Nacional de Río Cuarto, Argentina
(c) Departamento de Estadística-CIMFAV, Universidad de Valparaíso, Chile
(d) Instituto de Computação, Universidade Federal de Alagoas, Brazil

ARTICLE INFO

Article history:
Received 27 June 2008
Received in revised form 7 April 2009
Accepted 20 April 2009
Available online 3 May 2009

Keywords: Image processing; AR-2D models; Polynomial coefficients; Robust estimators; Residual autocovariance; Asymptotic normality and consistency

ABSTRACT

In this work we study the asymptotic behavior of a robust class of estimators of the coefficients of an AR-2D process. We establish precise conditions for the consistency and asymptotic normality of the RA estimator. The AR-2D model has many applications in image modeling and statistical image processing, hence the relevance of knowing such properties. The adequacy of the AR-2D model is analyzed with real images; we also show the impact of contamination and the capability of the RA estimator to produce useful results even in the presence of spurious data.

1. Introduction

Robust inference techniques appear in a diversity of contexts and applications, though the terms "robust" and "robustness" are used quite freely in the image processing and computer vision literature, not necessarily with the usual statistical meaning. The median and similar order-based filters are basic tools in image processing (Aysal and Barner, 2006; Huang and Lee, 2006; Palenichka et al., 2000, 1998), and in some cases particular attention has been devoted to obtaining the distribution of those estimators (Steland, 2005). Frery et al. (1997) derive a family of robust estimators for a class of low signal-to-noise ratio images, while Vallejos and Mardesic (2004) propose the robust estimation of structural parameters for their restoration. Other resistant approaches have proved successful in image restoration (see, for instance, Ben Hamza and Krim, 2001; Chu et al., 1998; Koivunen, 1995; Marroquin et al., 1998; Rabie, 2005; Tarel et al., 2002; Voloshynovskiy et al., 2000; Zervakis and Kwon, 1992). A common challenge in these applications is that the number of observations is reduced to a few, typically fewer than a hundred points.

When it comes to image analysis, many robust techniques have been proposed. In this case, the sample size is usually larger than the one available in filters and, frequently, structure and topology do not impose heavy requirements or constraints. In some cases, strong hypotheses are made on the laws governing the observed process (Allende and Pizarro, 2003; Brunelli and Messelodi, 1995; Bustos et al., 2002; Butler, 1998; Dryden et al., 2002; Van de Weijer and Van den Boomgaard, 2005); other approaches can be seen in the works by Bouzouba and Radouane (2000), Brandle et al. (2003), Nirel et al. (1998), Sim et al. (2004), Tohka et al. (2004), Xu (2005) and Zervakis et al. (1995).

* Corresponding author. E-mail address: [email protected] (R. Vallejos).
doi:10.1016/j.jspi.2009.04.016


High-level image analysis, or vision, also benefits from the use of robust estimation techniques, as can be seen in Black and Rangarajan (1996), Black et al. (1997), Chen et al. (2003), Comport et al. (2006), Glendinning (1999), Gottardo et al. (2006), Hasler et al. (2003), Kim and Han (2006), Li et al. (1998), Meer et al. (1991), Mirza and Boyer (1993), Prastawa et al. (2004), Roth (2006), Singh et al. (2004), Stewart (1999), Torr and Zisserman (2000) and Wang and Suter (2004a, b).

In a wide variety of situations such as image analysis, remote sensing and agricultural field trials, observations are obtained on a rectangular 2D lattice or grid. A class of 2D autoregressive processes has been suggested (Whittle, 1954) as a source of reasonable models for the spatial correlation in such data (Tjostheim, 1978). These models are natural extensions of the autoregressive processes used in time series analysis (Basu and Reinsel, 1993). Consequently, most of the robust techniques developed for parametric models in time series have been implemented for spatial parametric models when the process has been contaminated with innovation or additive outliers (Kashyap and Eom, 1988). Since a single outlier can produce bias and large variance in the estimators, most of the proposals are oriented to providing estimators that are resistant to the presence of contamination.

In the literature there are at least three classes of robust estimators that have been studied in this context: the M, GM and RA estimators. Kashyap and Eom (1988) introduced the M estimators for 2D autoregressive models; a recursive image restoration algorithm was implemented using the robust M estimates to produce a restored image. Later, Allende et al. (1998) studied the computational implementation of the generalized M (GM) estimators for the same class of models. The image restoration algorithm previously developed by Kashyap and Eom (1988) was generalized by Allende et al. (2001). The robust residual autocovariance (RA) estimators were introduced by Bustos and Yohai (1986) in the context of time series, where in the recursive estimation procedure the residuals are cleaned through the application of a robustifying $\psi$ function. An extension of the RA estimators to spatial unilateral autoregressive models and its computational aspects were studied by Ojeda et al. (2002). Monte Carlo simulation studies show that the performance of the RA estimator is better than that of the M estimator and slightly better than that of the GM estimator when the model has been contaminated with additive outliers. Although the performance of the M and GM estimators is acceptable under innovation outliers, their asymptotic properties are still open problems.

In this paper we study the asymptotic behavior of the RA estimator for unilateral autoregressive spatial processes, generalizing the 1D time series results established by Bustos and Fraiman (1984). We give precise conditions for the consistency and asymptotic normality of the RA estimators.

The paper is organized as follows. Section 2 presents the motivation of this study based on (i) the adequacy of 2D-AR models in the representation of real images, (ii) the impact of contamination on the performance of classical least squares estimators, and (iii) empirical evidence that the RA version of such estimators is able to resist contamination. In Section 3 the 2D-AR model is presented. Section 4 defines the RA estimator of a 2D autoregressive process.
In Sections 5 and 6 the strong consistency and asymptotic normality are established, respectively. Section 7 discusses some final remarks and directions for future work. The proofs of the results are organized in two subsections of the Appendix.

2. Motivation

In this section we present real data examples to illustrate the local approximation of images by 2D unilateral AR processes. The goal is to show graphically that 2D unilateral AR processes are useful and expressive processes, capable of representing a number of different real scenarios. We also show evidence that proposing and assessing the properties of robust estimators are relevant tasks. In this context, the following question arises: Is it possible to represent images with 2D-AR unilateral processes? In order to provide empirical evidence suggesting that the answer is yes, at least for a wide variety of real images, we consider here an algorithm that defines what we call a local AR-2D approximated image by using blocks.

Let $z = [z(m,n)]_{0 \le m \le M-1,\, 0 \le n \le N-1}$ be the original image, and let $x = [x(m,n)]_{0 \le m \le M-1,\, 0 \le n \le N-1}$, where for all $0 \le m \le M-1$ and $0 \le n \le N-1$, $x(m,n) = z(m,n) - \bar z$ and $\bar z$ is the mean of $z$. As an example, consider the approximated image $y$ of $x$ based on a causal AR-2D process of the form $y(i,j) = \phi_1 y(i-1,j) + \phi_2 y(i,j-1) + \epsilon(i,j)$, where $(i,j) \in \mathbb{Z}^2$ and $(\epsilon(i,j))_{(i,j)\in\mathbb{Z}^2}$ is a Gaussian white noise. Let $4 \le k \le \min(M,N)$. Consider $\tilde z = [z(m,n)]_{0 \le m \le \tilde M-1,\, 0 \le n \le \tilde N-1}$ and $\tilde x = [x(m,n)]_{0 \le m \le \tilde M-1,\, 0 \le n \le \tilde N-1}$,


where
$$\tilde M = \left\lfloor \frac{M-1}{k-1} \right\rfloor (k-1) + 1, \qquad \tilde N = \left\lfloor \frac{N-1}{k-1} \right\rfloor (k-1) + 1.$$

For all $i_b = 1, \dots, \lfloor (M-1)/(k-1) \rfloor$ and all $j_b = 1, \dots, \lfloor (N-1)/(k-1) \rfloor$ we define the $(k-1) \times (k-1)$ block $(i_b, j_b)$ of the image $x$ by
$$B_x(i_b, j_b) = [x(r,s)]_{(k-1)(i_b-1)+1 \le r \le (k-1)i_b,\ (k-1)(j_b-1)+1 \le s \le (k-1)j_b}.$$
The $\tilde M \times \tilde N$ approximated image $\hat x$ of $x$ is provided by the following algorithm:

Algorithm 1 (Image approximation). For each block $B_x(i_b, j_b)$:
1. Compute the least squares estimators $\hat\phi_1(i_b,j_b)$, $\hat\phi_2(i_b,j_b)$ of $\phi_1$ and $\phi_2$ corresponding to the block $B_x(i_b,j_b)$ extended to $\bar B_x(i_b,j_b) = [x(r,s)]_{(k-1)(i_b-1) \le r \le (k-1)i_b,\ (k-1)(j_b-1) \le s \le (k-1)j_b}$.
2. Let $\hat x$ be defined on the block $B_x(i_b,j_b)$ by $\hat x(r,s) = \hat\phi_1(i_b,j_b)\, x(r-1,s) + \hat\phi_2(i_b,j_b)\, x(r,s-1)$, where $(k-1)(i_b-1)+1 \le r \le (k-1)i_b$ and $(k-1)(j_b-1)+1 \le s \le (k-1)j_b$.

Then the approximated image $\hat z$ of the original image $z$ is
$$\hat z(m,n) = \hat x(m,n) + \bar z, \qquad 0 \le m \le \tilde M - 1,\ 0 \le n \le \tilde N - 1.$$
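A minimal NumPy sketch of Algorithm 1 follows (our illustration, not the authors' code). The function names and the raveled least squares setup are our own choices, and the first row and column of each reconstruction, which the algorithm leaves undefined, are simply kept at the block mean:

```python
import numpy as np

def ls_ar2d_block(block):
    """Least squares estimates of (phi1, phi2) for x(r,s) ~ phi1*x(r-1,s) + phi2*x(r,s-1),
    using all pixels of the extended block that have both neighbors available."""
    y = block[1:, 1:].ravel()
    X = np.column_stack([block[:-1, 1:].ravel(),   # x(r-1, s)
                         block[1:, :-1].ravel()])  # x(r, s-1)
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    return phi

def approximate_image(z, k):
    """Local AR-2D approximation by (k-1)x(k-1) blocks (a sketch of Algorithm 1)."""
    zbar = z.mean()
    x = z - zbar
    M, N = z.shape
    nb_i, nb_j = (M - 1) // (k - 1), (N - 1) // (k - 1)
    # Mtilde x Ntilde output; row 0 and column 0 stay at 0, i.e. at zbar in zhat
    x_hat = np.zeros((nb_i * (k - 1) + 1, nb_j * (k - 1) + 1))
    for ib in range(nb_i):
        for jb in range(nb_j):
            r0, s0 = ib * (k - 1), jb * (k - 1)
            phi1, phi2 = ls_ar2d_block(x[r0:r0 + k, s0:s0 + k])  # extended block
            x_hat[r0 + 1:r0 + k, s0 + 1:s0 + k] = (
                phi1 * x[r0:r0 + k - 1, s0 + 1:s0 + k]     # x(r-1, s)
                + phi2 * x[r0 + 1:r0 + k, s0:s0 + k - 1])  # x(r, s-1)
    return x_hat + zbar
```

With `zhat = approximate_image(z, k=8)`, the error image of Figs. 1(c), 2(d) and 2(j) corresponds to `z[:zhat.shape[0], :zhat.shape[1]] - zhat`.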

Figs. 1 and 2 present results obtained with Algorithm 1. Fig. 1(a) shows a single band obtained by the Landsat sensor over Colonia Tirolesa, Córdoba, Argentina. This area is mainly an urban spot (the light area to the left-center of the image) and a series of agricultural parcels (the geometrical regions). Fig. 1(b) shows the image estimated from the data using Algorithm 1, and Fig. 1(c) presents the difference between the true and estimated data. The lack of structure in Fig. 1(c) suggests that the model and the algorithm are able to retrieve the main features of the original data.

Fig. 2 presents further results on the adequacy of the model to images. Figs. 2(a) and (g) are the original datasets $x$; these images are frequently used in the literature as benchmarks, since they present areas both with and without detail. These datasets were contaminated with additive noise following a zero-mean Gaussian law with variance 50, yielding Fig. 2(b) (10% of pixels contaminated) and Fig. 2(h) (5% of pixels contaminated); though the contamination is barely noticeable, it precludes any sensible statistical analysis of these datasets. The contaminated datasets were used as input to Algorithm 1, which produced Figs. 2(c) and (i). Both estimated images show a decreased contrast, and contamination clearly appears as a grainy effect in Fig. 2(c). In spite of such defects, these estimated datasets closely resemble the original images, suggesting that the model is adequate for a variety of situations. Figs. 2(d) and (j) present the differences between the true and estimated datasets; the better the estimation, the less the information present in these error images. The presence of structure in Figs. 2(d) and (j) motivates the development of robust estimators. Such estimators should be able to produce better estimates in the presence of contamination and, thus, yield errors with as little information as possible. A robust estimation procedure applied to the estimation of $(\phi_1, \phi_2)$ produced Figs. 2(e) and (k) using the contaminated images as input; these are visually closer to the original data than the estimates yielded by Algorithm 1. The errors of these new images are presented in Figs. 2(f) and (l); they retain less information than the classical version did. It is therefore relevant to study the properties of these robust estimators, since they provide a good fit for an expressive model.

3. AR-2D models

Throughout this paper we assume that the random variables are defined on the same probability space $(\Omega, \mathcal{F}, P)$. If $m = (m_1, m_2)'$ and $k = (k_1, k_2)' \in \mathbb{Z}^2$, we write $m \le k$ if $m_i \le k_i$ for $i = 1, 2$. Let $I_0 = \{m \in \mathbb{Z}^2 : 0 \le m \text{ and } m \ne 0 = (0,0)'\}$, and let $T = \{t_1, \dots, t_L\} = \{(t_{1,1}, t_{1,2})', \dots, (t_{L,1}, t_{L,2})'\} \subset I_0$ be a finite and nonempty set. Consider $\phi^0 = (\phi^0_1, \dots, \phi^0_L)' \in \mathbb{R}^L$ such that
$$|\phi^0| = \sup_{1 \le p \le L} \{|\phi^0_p|\} < \frac{1}{L}, \tag{1}$$
and let $P_{\phi^0}$ be the polynomial defined over $\mathbb{C}^2$ as follows:
$$P_{\phi^0}(z, w) = 1 - \sum_{p=1}^{L} \phi^0_p\, z^{t_{p,1}} w^{t_{p,2}}. \tag{2}$$


Fig. 1. Landsat image, its reconstruction and the error. (a) Original image. (b) Estimated data. (c) Difference.

For each $m \in \mathbb{Z}^2$ let $X_m : \Omega \to \mathbb{R}$ be a random variable such that $X = (X_m)_{m\in\mathbb{Z}^2}$ is an AR$(P_{\phi^0}, e)$ process; that is, for each $m \in \mathbb{Z}^2$, $X_m$ satisfies
$$X_m = \mu + \sum_{p=1}^{L} \phi^0_p X_{m-t_p} + \epsilon_m, \tag{3}$$
where $e = (\epsilon_m)_{m\in\mathbb{Z}^2}$ is a white noise, i.e. the components of $e$ are independent and identically distributed random variables (not necessarily Gaussian) with common distribution function $F_\epsilon$, zero mean and finite variance $\sigma^2 > 0$. The following condition on $F_\epsilon$ is necessary:

Assumption 1. The distribution function of the errors, $F_\epsilon$, is absolutely continuous with density $f_\epsilon$.

Our interest consists in the estimation of the coefficients $\phi^0$ from the observed values of $X$, so we assume in this paper that $\mu$ and $\sigma$ are known and, without loss of generality, we take $\mu = 0$. If the position $\mu$ and the scale $\sigma$ are unknown, they can be estimated using preliminary robust estimators.
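For the unknown-$(\mu, \sigma)$ case, one standard preliminary robust choice (our illustration; the paper does not prescribe a specific pair) is the sample median together with the normalized median absolute deviation:

```python
import numpy as np

def preliminary_location_scale(x):
    """Robust preliminary estimates: median for the position mu and
    normalized MAD for the scale sigma (consistent under Gaussian errors)."""
    mu = np.median(x)
    sigma = 1.4826 * np.median(np.abs(x - mu))
    return mu, sigma
```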


Fig. 2. Original and contaminated images, their LS and RA estimators and their residuals. (a) Original $x$. (b) Contaminated. (c) LS estimator $\hat x_1$. (d) $x - \hat x_1$. (e) RA estimator $\hat x_2$. (f) $x - \hat x_2$. (g) Original $x$. (h) Contaminated. (i) LS estimator $\hat x_1$. (j) $x - \hat x_1$. (k) RA estimator $\hat x_2$. (l) $x - \hat x_2$.

Example 1. In practice, typical values of $L$ are 1, 2 or at most 3. The following case has been frequently found in recent applications (Ojeda et al., 2002). Consider $T = \{(1,0)', (1,1)', (0,1)'\}$ with $t_1 = (1,0)'$, $t_2 = (1,1)'$ and $t_3 = (0,1)'$. The polynomial in (2) is then given by $P_{\phi^0}(z,w) = 1 - \phi^0_1 z - \phi^0_2 zw - \phi^0_3 w$. Hence, the model is described by
$$X_{(m_1,m_2)} = \phi^0_1 X_{(m_1-1,m_2)} + \phi^0_2 X_{(m_1-1,m_2-1)} + \phi^0_3 X_{(m_1,m_2-1)} + \epsilon_{(m_1,m_2)}.$$

Remark 1. If we assume that $X_m \in L^2(\Omega,\mathcal{F},P)$ with $\|X_m\|^2_{L^2} = \mathrm{Var}(X_m) = \sigma^2_X$ and $E(X_m) = 0$ (see Guyon, 1993, Chapter 3), then, for each $m$, $X_m$ can be expressed as a series of terms of $(\epsilon_n)_{n\in\mathbb{Z}^2}$ in the $L^2(\Omega,\mathcal{F},P)$ space, that is,
$$X_m = \epsilon_m + \sum_{j=1}^{\infty} \sum_{\vec k \in L^j} I_{j,\vec k}(\phi^0)\, \epsilon_{m - s(j,\vec k)}, \tag{4}$$
where $L$ is the set $\{1, 2, \dots, L\}$; for each $j \in \mathbb{N}$ and $\vec k = (k_1, \dots, k_j) \in L^j$, $s(j,\vec k)$ is defined as $s(j,\vec k) = \sum_{i=1}^{j} t_{k_i}$, and $I_{h,\vec k} : \Theta \to \mathbb{R}$ is the function defined on
$$\Theta = \left\{ \phi = (\phi_1, \dots, \phi_L)' \in \mathbb{R}^L : |\phi| = \sup_{1\le p\le L} \{|\phi_p|\} < \frac{1}{L} \right\} \tag{5}$$
such that $I_{h,\vec k}(\phi) = \phi_{k_1} \cdots \phi_{k_h}$ for all $\phi \in \Theta$.

The following proposition gives a characterization of the covariance matrix of the process.

Proposition 1. Let $T - T = \{t - s : t \text{ and } s \text{ in } T\}$ and, for each $v \in T - T$, let
$$(T-T)_1(v) = \{(j,\vec k) : j \ge 1,\ \vec k \in L^j \text{ and } s(j,\vec k) = v\},$$
$$(T-T)_2(v) = \{(j,\vec k,\ell,\vec h) : j, \ell \ge 1,\ \vec k \in L^j,\ \vec h \in L^\ell \text{ and } s(j,\vec k) - s(\ell,\vec h) = v\}.$$
Let $I_\infty(\phi^0)$ be the $L \times L$ matrix given by
$$I_\infty(\phi^0)_{p,p} = 1 + \sum_{(j,\vec k,\ell,\vec h)\in(T-T)_2(0)} I_{j,\vec k}(\phi^0)\, I_{\ell,\vec h}(\phi^0)$$
for all $1 \le p \le L$, and
$$I_\infty(\phi^0)_{p,q} = \sum_{(j,\vec k)\in(T-T)_1(t_p-t_q)} I_{j,\vec k}(\phi^0) + \sum_{(j,\vec k)\in(T-T)_1(t_q-t_p)} I_{j,\vec k}(\phi^0) + \sum_{(j,\vec k,\ell,\vec h)\in(T-T)_2(t_p-t_q)} I_{j,\vec k}(\phi^0)\, I_{\ell,\vec h}(\phi^0)$$
for all $p \ne q$, $1 \le p, q \le L$. Then $\Sigma(X_T)$, the covariance matrix of $X_T = (X_{t_1}, \dots, X_{t_L})'$, is positive definite and satisfies $\Sigma(X_T) = \sigma^2 I_\infty(\phi^0)$.
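As a quick numerical check of Proposition 1, the following sketch (ours; the values of $\phi^0$, the grid size and the truncation level are our own choices, not from the paper) simulates the Example 1 process in raster order and compares the empirical covariance of $X_T$ with $\sigma^2 I_N(\phi^0)$, where $I_N$ truncates the path sums at $j < N$:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
T = [(1, 0), (1, 1), (0, 1)]                 # Example 1 neighborhood
phi0 = np.array([0.15, 0.10, 0.20])          # |phi0| < 1/L = 1/3
sigma = 1.0

def simulate_ar2d(n, burn=100):
    """Raster-order simulation of the causal Example 1 model."""
    size = n + burn
    X = np.zeros((size, size))
    eps = rng.normal(0.0, sigma, (size, size))
    for i in range(1, size):
        for j in range(1, size):
            X[i, j] = (phi0[0] * X[i - 1, j] + phi0[1] * X[i - 1, j - 1]
                       + phi0[2] * X[i, j - 1] + eps[i, j])
    return X[burn:, burn:]                   # drop a burn-in strip

def I_truncated(phi, N=10):
    """I_infty(phi) with the sums truncated at j < N: enumerate all paths
    k in L^j and accumulate the weight I_{j,k}(phi) at the lag s(j,k)."""
    w = {}                                   # lag v -> sum of I_{j,k}(phi)
    for j in range(1, N):
        for k in itertools.product(range(len(T)), repeat=j):
            v = tuple(np.sum([T[i] for i in k], axis=0))
            w[v] = w.get(v, 0.0) + float(np.prod(phi[list(k)]))
    L = len(T)
    I = np.eye(L)
    for p in range(L):
        for q in range(L):
            d = (T[p][0] - T[q][0], T[p][1] - T[q][1])
            if p != q:                       # the two (T-T)_1 sums
                I[p, q] += w.get(d, 0.0) + w.get((-d[0], -d[1]), 0.0)
            # the (T-T)_2 sum: pairs with s(j,k) - s(l,h) = t_p - t_q
            I[p, q] += sum(wjk * w.get((v[0] - d[0], v[1] - d[1]), 0.0)
                           for v, wjk in w.items())
    return I

X = simulate_ar2d(300)
V = np.stack([X[:-1, 1:].ravel(),            # X_{m - t_1}
              X[:-1, :-1].ravel(),           # X_{m - t_2}
              X[1:, :-1].ravel()])           # X_{m - t_3}
print(np.cov(V))                             # empirical Sigma(X_T)
print(sigma**2 * I_truncated(phi0))          # Proposition 1, truncated
```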


4. Estimators of the polynomial coefficients

For each positive integer $M$ we define the square window of order $M$ as $W_M = \{m \in I_0 : m \le (M,M)'\}$, and we denote by $\tilde X_M = (X_m)_{m\in W_M}$ the observed process in $W_M$. For all $m \in \mathbb{Z}^2$, the residual with respect to $\phi$ is the function $R_m : \Omega \times \Theta \to \mathbb{R}$ defined as follows:
$$R_m(\omega, \phi) = \frac{1}{\sigma}\left( X_m(\omega) - \sum_{p=1}^{L} \phi_p X_{m-t_p}(\omega) \right). \tag{6}$$

Now we introduce the classical definition of the least squares estimator.

Definition 1. The least squares estimator of $\phi^0$ based on $\tilde X_M$, with domain $\hat\Omega_M \subset \Omega$, is defined as the function $\hat\phi_M : \hat\Omega_M \to \Theta$ such that
$$\sum_{m\in W_M} (R_m(\omega, \hat\phi_M(\omega)))^2 \le \sum_{m\in W_M} (R_m(\omega,\phi))^2$$
for all $\omega \in \hat\Omega_M$ and $\phi \in \Theta$.

Considering that, for each $\omega \in \hat\Omega_M$ and $m \in \mathbb{Z}^2$, the function $\phi \mapsto R_m(\omega,\phi)$ is continuously differentiable, $\hat\phi_M$ satisfies
$$\sum_{m\in W_M} R_m(\omega, \hat\phi_M(\omega))\, X_{m-t_p}(\omega) = 0 \tag{7}$$
for all $1 \le p \le L$ and $\omega \in \hat\Omega_M$. Taking into account Eq. (4), we have an equivalent definition of the least squares estimator of $\phi^0$.

Definition 2. The least squares estimator of $\phi^0$ (based on the covariance of the residuals) corresponding to the observations $\tilde X_M$, with domain $\hat\Omega_M \subset \Omega$, is defined as the function $\hat\phi_M : \hat\Omega_M \to \Theta$ such that, for all $1 \le p \le L$ and $\omega \in \hat\Omega_M$, $\sum_{m\in W_M} (C_{m,M}(\omega, \hat\phi_M(\omega)))_p = 0$, where $C_{n,N} : \Omega \times \Theta \to \mathbb{R}^L$ is given by
$$(C_{n,N}(\omega,\phi))_p = R_n(\omega,\phi)\, R_{n-t_p}(\omega,\phi) + \sum_{j=1}^{N-1} \sum_{\vec k\in L^j} I_{j,\vec k}(\phi)\, R_n(\omega,\phi)\, R_{n-t_p-s(j,\vec k)}(\omega,\phi),$$
with $1 \le N$, $\omega \in \Omega$, $n \in \mathbb{Z}^2$, $\phi \in \Theta$ and $1 \le p \le L$.

Remark 2. If we assume that $X = (X_m)_{m\in\mathbb{Z}^2}$ is an AR$(P_{\phi^0}, e)$ Gaussian process, then the asymptotic properties of this estimator can be derived from more general results such as those in Guyon (1993, Chapter 3).

It is well known that the least squares estimators based on the covariances are not robust. Hence, the idea is to make them robust using adequate continuous and bounded score functions, which can be chosen from several families or types such as the Mallows, Hampel or Huber families (see Bustos and Yohai, 1986).

Definition 3. The robust covariance (RA) estimator of $\phi^0$ based on the observations $(X_m)$, with $m \le (M,M)'$, and domain $\hat\Omega_{RA,M} \subset \Omega$, is defined as the function $\hat\phi^{RA}_M : \hat\Omega_{RA,M} \to \Theta$ such that
$$\frac{1}{\#(W_M)} \sum_{m\in W_M} \bar C_{m,M}(\omega, \hat\phi^{RA}_M(\omega)) = 0$$
for all $\omega \in \hat\Omega_{RA,M}$, where $\bar C_{n,N} : \Omega \times \Theta \to \mathbb{R}^L$ is given by
$$(\bar C_{n,N}(\omega,\phi))_p = \psi(R_n(\omega,\phi),\, R_{n-t_p}(\omega,\phi)) + \sum_{j=1}^{N-1} \sum_{\vec k\in L^j} I_{j,\vec k}(\phi)\, \psi(R_n(\omega,\phi),\, R_{n-t_p-s(j,\vec k)}(\omega,\phi))$$
with $1 \le N \le \infty$, $n \in \mathbb{Z}^2$, $\omega \in \Omega$, $\phi \in \Theta$, $1 \le p \le L$ and $\psi$ a score function.

Remark 3. Note that for $N = \infty$ the convergence of the series involved in the definition of $\bar C_{n,\infty}$ is guaranteed by the definition of $\Theta$ and the assumptions on $\psi$ (see Assumption 2).
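To fix ideas, here is a small sketch (ours, not the authors' implementation) of the RA estimating equations for the Example 1 model, using the score $\psi(u,v) = \tanh(u)\tanh(v)$, a bounded, smooth product-type score with $\psi(0,v) = \psi(u,0) = 0$ in the spirit of the Mallows family; for symmetric error laws it also meets the zero-mean conditions of Assumption 2 stated in Section 5. The truncation level $N$ and the mean over the available lags (rather than a fixed window count) are boundary-handling simplifications:

```python
import itertools
import numpy as np
from scipy.optimize import fsolve

T = [(1, 0), (1, 1), (0, 1)]                 # Example 1 neighborhood

def psi(u, v):
    # bounded, smooth score with psi(0, v) = psi(u, 0) = 0
    return np.tanh(u) * np.tanh(v)

def residuals(X, phi, sigma=1.0):
    """R_m(phi) over the grid points whose three neighbors are observed."""
    return (X[1:, 1:] - phi[0] * X[:-1, 1:] - phi[1] * X[:-1, :-1]
            - phi[2] * X[1:, :-1]) / sigma

def lagged_pair(R, lag):
    a, b = lag                               # both components nonnegative here
    return R[a:, b:], R[:R.shape[0] - a, :R.shape[1] - b]

def ra_equations(phi, X, N=4, sigma=1.0):
    """p-th component: mean of psi(R_m, R_{m-t_p}) plus the I_{j,k}(phi)-weighted
    psi terms at lags t_p + s(j,k), j < N (a truncated version of C-bar_{m,N})."""
    R = residuals(X, phi, sigma)
    eqs = []
    for tp in T:
        terms = [(1.0, tp)]
        for j in range(1, N):
            for k in itertools.product(range(len(T)), repeat=j):
                s = np.sum([T[i] for i in k], axis=0)
                terms.append((float(np.prod(phi[list(k)])),
                              (tp[0] + s[0], tp[1] + s[1])))
        eqs.append(sum(wt * np.mean(psi(*lagged_pair(R, lag)))
                       for wt, lag in terms))
    return np.array(eqs)

# phi_ra = fsolve(ra_equations, x0=np.zeros(3), args=(X,))  # X: observed image
```

Taking $\psi(u,v) = uv$ in this sketch recovers the residual-covariance LS equations of Definition 2.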


5. Consistency of the RA estimators

In order to simplify the presentation we introduce some additional notation and considerations. Let
$$G_{M,N}(\omega,\phi) = \frac{1}{\#(W_M)} \sum_{m\in W_M} \bar C_{m,N}(\omega,\phi), \qquad g_N(\phi) = E(\bar C_{0,N}(\phi)), \qquad F_M(\omega,\phi) = G_{M,M}(\omega,\phi), \qquad f(\phi) = g_\infty(\phi), \tag{8}$$

where $1 \le N \le \infty$, $M_0 \le M < \infty$, $\omega \in \Omega$ and $\phi \in \Theta$. Note that, since $\psi$ is continuously differentiable, for each $\omega \in \Omega$ the function $\phi \mapsto F_M(\omega,\phi)$ is continuously differentiable on $\Theta$ for any $M$.

In general, if $H : \Theta \to \mathbb{R}^L$ is a differentiable function and, for each $1 \le p \le L$, $H_p : \Theta \to \mathbb{R}$ is its $p$-th component (that is, $H_p(\phi) = (H(\phi))_p$), then we denote the derivative of $H$ by $DH$ and, for each $1 \le p, q \le L$, $D_q(H_p)(\phi)$ is the partial derivative with respect to the $q$-th component of the function $H_p$ evaluated at $\phi$. Similarly, if $h : \Theta \to \mathbb{R}$ is a differentiable function, $D_q(h)(\phi)$ and $\nabla h(\phi)$ denote the partial derivative with respect to the $q$-th component and the gradient of $h$ evaluated at $\phi$, respectively.

The existence and consistency of the RA estimators of $\phi^0$ are based on the propositions given below. Proposition 2 states the differentiability properties of $f(\phi) = E(\bar C_{0,\infty}(\phi))$, while Proposition 3 relates the asymptotic behavior of the averages $(1/\#(W_M))\sum_{m\in W_M} \bar C_{m,M}(\omega,\phi)$ to $f(\phi)$. In this section we assume that the following conditions on the score function $\psi$ are satisfied:

Assumption 2.
(i) $\psi : \mathbb{R}^2 \to \mathbb{R}$ is a continuously differentiable function satisfying $\psi(0,v) = \psi(u,0) = 0$ and $|\psi(u,v)| \le K$ for some constant $K < \infty$ and for all $(u,v) \in \mathbb{R}^2$.
(ii) Let $\psi_1(u,v) = D_1\psi(u,v)$ and $\psi_2(u,v) = D_2\psi(u,v)$ be the partial derivatives of $\psi$ with respect to the first and second components, respectively. One of the two following conditions is satisfied:
(a) There are constants $K_1$, $K_2$ such that $|\psi_1(u,v)| \le K_1$ and $|\psi_2(u,v)| \le K_2$ for all $(u,v)$ in $\mathbb{R}^2$.
(b) There exists a constant $0 < K_3 < \infty$ such that $|\psi_1(u,v)| \le K_3 |v|$ and $|\psi_2(u,v)| \le K_3 |u|$ for all $(u,v)$ in $\mathbb{R}^2$.
(iii) $E(\psi(\epsilon_m/\sigma,\, \epsilon_{m'}/\sigma)) = 0$ if $m \ne m'$, where $\epsilon_m$ and $\epsilon_{m'}$ are independent random variables with distribution $F_\epsilon$.
(iv) $E(\psi_1(\epsilon_m/\sigma,\, \epsilon_{m'}/\sigma)\, \epsilon_{m'}) \ne 0$ if $m \ne m'$, where $\epsilon_m$ and $\epsilon_{m'}$ are independent random variables with distribution $F_\epsilon$.

Proposition 2. Let $\delta_0 > 0$ be such that $C = \{\phi \in \Theta : |\phi - \phi^0| < \delta_0\} \subset \Theta$, and let $f(\phi) = E(\bar C_{0,\infty}(\phi))$ be as in (8). Then:

(1) $f$ is continuously differentiable on $C$ and, for each $1 \le p, q \le L$ and each $\phi \in C$,
$$D_q((f(\phi))_p) = A1_{p,q}(\phi) + A2_{p,q}(\phi) + \sum_{j=1}^{\infty}\sum_{\vec k\in L^j} \big(A3_{p,q}(j,\vec k,\phi) + A4_{p,q}(j,\vec k,\phi) + A5_{p,q}(j,\vec k,\phi)\big),$$
where
$$A1_{p,q}(\phi) = -\frac{1}{\sigma}\, E\big(\psi_1(R_0(\phi), R_{-t_p}(\phi))\, X_{-t_q}\big),$$
$$A2_{p,q}(\phi) = -\frac{1}{\sigma}\, E\big(\psi_2(R_0(\phi), R_{-t_p}(\phi))\, X_{-t_p-t_q}\big),$$
$$A3_{p,q}(j,\vec k,\phi) = D_q(I_{j,\vec k}(\phi))\, E\big(\psi(R_0(\phi), R_{-t_p-s(j,\vec k)}(\phi))\big),$$
$$A4_{p,q}(j,\vec k,\phi) = -\frac{1}{\sigma}\, I_{j,\vec k}(\phi)\, E\big(\psi_1(R_0(\phi), R_{-t_p-s(j,\vec k)}(\phi))\, X_{-t_q}\big),$$
$$A5_{p,q}(j,\vec k,\phi) = -\frac{1}{\sigma}\, I_{j,\vec k}(\phi)\, E\big(\psi_2(R_0(\phi), R_{-t_p-s(j,\vec k)}(\phi))\, X_{-t_p-t_q-s(j,\vec k)}\big).$$

(2) $Df(\phi^0) = -(1/\sigma)\, E\big(\psi_1(\epsilon/\sigma,\, \tilde\epsilon/\sigma)\, \tilde\epsilon\big)\, I_\infty(\phi^0)$, where $\epsilon$ and $\tilde\epsilon$ are independent random variables with distribution $F_\epsilon$, and $Df(\phi^0)$ is not zero.

Proposition 3. Let $F_M$ and $f$ be as in (8). Then there exists a subset $\Omega_0$ of $\Omega$ with $P(\Omega_0) = 1$ such that
$$\Omega_0 \subset \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} |F_M(\omega,\phi) - f(\phi)| = 0 \ \text{ and } \ \lim_{M\to\infty} \sup_{\phi\in C} |DF_M(\omega,\phi) - Df(\phi)| = 0 \right\}.$$


Theorem 1 (Existence and consistency of the RA estimator). Let $F_M$ be as in (8). Then, given $M_0 < \infty$, there exists $\Omega^* \subset \Omega$ with $P(\Omega^*) = 1$ such that
$$\Omega^* \subset \left\{ \omega \in \Omega : \text{there exists } N > M_0 \text{ such that } M \ge N \Rightarrow \exists\, \hat\phi^{RA}_M(\omega) \in \Theta \text{ with } F_M(\omega, \hat\phi^{RA}_M(\omega)) = 0, \text{ and } \lim_{M\to\infty} \hat\phi^{RA}_M(\omega) = \phi^0 \right\}.$$

6. Asymptotic normality of the RA estimators

Before stating formally the main results, in addition to the conditions given in Assumption 2, we assume in this section that the score function $\psi$ also satisfies the following.

Assumption 3.
$$E\left(\psi\Big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\Big)\, \psi\Big(\frac{\epsilon}{\sigma}, \frac{\hat\epsilon}{\sigma}\Big)\right) = E\left(\psi\Big(\frac{\tilde\epsilon}{\sigma}, \frac{\epsilon}{\sigma}\Big)\, \psi\Big(\frac{\hat\epsilon}{\sigma}, \frac{\epsilon}{\sigma}\Big)\right) = 0,$$
where $\epsilon$, $\tilde\epsilon$ and $\hat\epsilon$ are independent random variables with distribution function $F_\epsilon$.

In this section we prove that $\sqrt{\#(W_M)}\, (\hat\phi^{RA}_M - \phi^0)$ converges, in distribution, to the $N(0, \sigma^2_A)$ distribution, where the asymptotic variance is specified in Theorem 2. By a mean value argument for
$$G_{M,\infty}(\omega,\phi) = \frac{1}{\#(W_M)} \sum_{m\in W_M} \bar C_{m,\infty}(\omega,\phi) \tag{9}$$
we have that
$$DG_{M,\infty}(\phi^*_M)\, \sqrt{\#(W_M)}\, (\hat\phi^{RA}_M - \phi^0) \tag{10}$$
$$= \sqrt{\#(W_M)}\, G_{M,\infty}(\hat\phi^{RA}_M) - \sqrt{\#(W_M)}\, G_{M,\infty}(\phi^0), \tag{11}$$
where, for each $M$, $\phi^*_M$ is a random vector belonging to the segment in $\mathbb{R}^L$ that connects $\hat\phi^{RA}_M$ and $\phi^0$. Then we state that $\sqrt{\#(W_M)}\, G_{M,\infty}(\hat\phi^{RA}_M) \xrightarrow{P} 0$ (see Proposition 4), $DG_{M,\infty}(\phi^*_M) \xrightarrow{P} Df(\phi^0)$ (see Lemma 6 in Appendix A.2), and $\sqrt{\#(W_M)}\, G_{M,\infty}(\phi^0)$ converges in distribution to the $N(0, G)$ law as $M\to\infty$ (see Proposition 5 below). More formally we have:

Proposition 4. Let $G_{M,\infty}$ be as in Eq. (9). Then
$$\sqrt{\#(W_M)}\, G_{M,\infty}(\hat\phi^{RA}_M) \xrightarrow{P} 0 \quad \text{as } M\to\infty.$$

Proposition 5. Let $G_{M,\infty}$ be as in Eq. (9) and let $G$ be the $L \times L$ matrix given by
$$G = I_\infty(\phi^0)\, E\left( \psi\Big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\Big)^2 \right), \tag{12}$$
where $\epsilon$ and $\tilde\epsilon$ are independent random variables with distribution $F_\epsilon$ and $I_\infty(\phi^0)$ is as in Proposition 1. Then $\sqrt{\#(W_M)}\, G_{M,\infty}(\phi^0) \xrightarrow{D} N(0, G)$ as $M\to\infty$.

Finally we state the main result of this section.

Theorem 2 (Asymptotic normality of the RA estimator). Let $(\hat\phi^{RA}_M)_{M \ge M_0}$ be a sequence of random variables in $\Theta$ such that
$$\sqrt{\#(W_M)} \left[ \frac{1}{\#(W_M)} \sum_{m\in W_M} \bar C_{m,M}(\hat\phi^{RA}_M) \right] \xrightarrow{P} 0 \quad \text{as } M\to\infty,$$
and $\hat\phi^{RA}_M \xrightarrow{P} \phi^0$ as $M\to\infty$. Then
$$\sqrt{\#(W_M)}\, (\hat\phi^{RA}_M - \phi^0) \xrightarrow{D} N(0, \sigma^2_A) \quad \text{as } M\to\infty, \tag{13}$$


where the asymptotic variance is given by
$$\sigma^2_A = \frac{E\Big(\psi\big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\big)^2\Big)}{\Big(E\Big(\psi_1\big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\big)\, \frac{\tilde\epsilon}{\sigma}\Big)\Big)^2}\; I_\infty(\phi^0)^{-1},$$
and the errors $\epsilon$, $\tilde\epsilon$ are independent random variables with distribution function $F_\epsilon$ and $I_\infty(\phi^0)$ is as in Proposition 1.

Remark 4. Theorem 2 gives the asymptotic distribution of the least squares estimator of $\phi^0$ based on the covariance of the residuals (LS estimator) when $\psi(u,v) = uv$. Hence, we see that the efficiency of the RA estimator with respect to the LS estimator is given by
$$\left[ \frac{E\Big(\psi\big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\big)^2\Big)}{\Big(E\Big(\psi_1\big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\big)\, \frac{\tilde\epsilon}{\sigma}\Big)\Big)^2} \right]^{-1}.$$
Therefore, the constants involved in the definition of the $\psi$ function can be tuned. A more detailed discussion can be found in Bustos and Yohai (1986).
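This efficiency constant is easy to approximate by Monte Carlo. The sketch below (our illustration, using the tanh product score from the sketch in Section 4 and standard Gaussian errors, i.e. $\sigma = 1$) estimates the ratio and reproduces efficiency 1 when $\psi(u,v) = uv$:

```python
import numpy as np

rng = np.random.default_rng(1)
u, v = rng.standard_normal((2, 1_000_000))   # eps/sigma, eps~/sigma, sigma = 1

def psi(u, v):
    return np.tanh(u) * np.tanh(v)

def psi1(u, v):                              # partial derivative of psi in u
    return (1.0 - np.tanh(u) ** 2) * np.tanh(v)

num = np.mean(psi1(u, v) * v) ** 2           # (E[psi_1(u, v) v])^2
den = np.mean(psi(u, v) ** 2)                # E[psi(u, v)^2]
print("efficiency w.r.t. LS:", num / den)    # psi(u, v) = u*v gives exactly 1
```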


7. Concluding remarks

The following comments give a brief summary of the results obtained in this paper. Under mild regularity conditions, we established the consistency and asymptotic normality of a class of robust estimators (the RA estimators) for the parameter $\phi$ of a 2D autoregressive unilateral process. The results we proved extend the asymptotic theory of the RA estimators, previously available only for 1D time series (see Bustos and Yohai, 1986). Although the literature presents several reasonable classes of estimators for the parameter $\phi$, such as the M and GM estimators, their asymptotic behavior is still an open problem. Moreover, the advantage of the RA estimator over the other classes is that it is less sensitive to the presence of additive outliers (see Ojeda et al., 2002).

In the following we outline future lines of research. The extension of the results proved in this paper to the model with colored noise (that is, with non-null autocorrelation) instead of white noise $e = (\epsilon_m)_{m\in\mathbb{Z}^2}$ would be of importance in signal and image processing. In the case of causal AR-2D processes of order two or three, algorithms for the computation of the RA estimators have been proposed (see Ojeda et al., 2002); it would be important to develop efficient algorithms when the order of the process is greater than three. In the motivation of this paper the difference between real and approximated images was computed; the resulting image could be used to detect borders and to classify the original image. It would be interesting to explore the limitations of a segmentation method based on the difference image between a real and a fitted image. It is also important to analyze the behavior of the RA estimator in combination with image restoration techniques. Equally relevant is the study of the properties of the RA estimator in particular, and of robust estimators in general, as alternatives to the least squares estimators under non-causal and semi-causal AR-2D processes.

Acknowledgments

The fourth author was partially supported by Grant 11075095 from Fondecyt, Chile. The fifth author acknowledges partial support from CNPq, Brazil.

Appendix A. Proofs

We only show the main results of the paper (propositions and theorems). For a more detailed argumentation and the proofs of the auxiliary lemmas, the reader may request a Technical Report from the first author.

A.1. Proof of the consistency of the RA estimator

Proof of Proposition 1. Let us prove that the covariance matrix is positive definite. Note first that, since $E(X_{t_p}) = 0$ for all $1 \le p \le L$, then $\Sigma(X_T) = [E(X_{t_p} X_{t_q})]_{1\le p,q\le L}$. Suppose that $\Sigma(X_T)$ is not positive definite. Then there exists a vector $a = (a_1, \dots, a_L)' \ne (0, \dots, 0)'$ such that $0 = \|\sum_{p=1}^{L} a_p X_{t_p}\|^2_{L^2}$, implying that $\sum_{p=1}^{L} a_p X_{t_p} = 0$ in $L^2$. Without loss of generality we may assume that the set $T = \{t_1, \dots, t_L\}$ satisfies $t_{p,1} \ge t_{q,1}$ if $1 \le p < q \le L$, and $t_{p,2} > t_{q,2}$ if $t_{p,1} = t_{q,1}$, $1 \le p < q \le L$. Considering that $a$ is different from the null vector, there exist $1 \le p_1 < \cdots < p_r \le L$, $2 \le r \le L$, such that $a_{p_i} \ne 0$ for all $i = 1, \dots, r$.


For every $U \subset L^2(\Omega,\mathcal{F},P)$ we denote by $H(U)$ the closed vector subspace of $L^2(\Omega,\mathcal{F},P)$ generated by $U$. Using (4) of Remark 1, we have that $X_m \in H(\{\epsilon_m\} \cup \{\epsilon_{m-s(j,\vec k)} : j \ge 1,\ \vec k \in L^j\})$ for all $m \in \mathbb{Z}^2$. By the previous considerations we conclude that
$$\epsilon_{t_{p_1}} \in H\big(\{\epsilon_{t_{p_2}}, \dots, \epsilon_{t_{p_r}}\} \cup \{\epsilon_{t_{p_i}-s(j,\vec k)} : 1 \le i \le r,\ j \ge 1,\ \vec k \in L^j\}\big) \tag{14}$$
and $t_{p_1} \notin \{t_{p_2}, \dots, t_{p_r}\} \cup \{t_{p_i} - s(j,\vec k) : 1 \le i \le r,\ j \ge 1,\ \vec k \in L^j\}$. Since $(\epsilon_m)_{m\in\mathbb{Z}^2}$ is a white noise, (14) is impossible. By this contradiction, and as $\Sigma(X_T)$ is non-negative definite, it has to be positive definite. Finally, the equality $\Sigma(X_T) = \sigma^2 I_\infty(\phi^0)$ follows from a straightforward calculation. □

Lemma 1. For each $m \in \mathbb{Z}^2$, let $Z_m$ be the random vector with values in $\mathbb{R}^{L+1}$ given by $Z_m = (X_m, X_{m-t_1}, \dots, X_{m-t_L})'$. Let $C$ be as in Proposition 2 and $c_0 : C \to \mathbb{R}$ a continuous function. For each $j \ge 1$ and $\vec k \in L^j$ let $c_{j,\vec k} : C \to \mathbb{R}$ be a continuous function. Assume that there exists a sequence of positive real numbers $(b_j)_{j\ge0}$ such that $\sum_{j=0}^{\infty} b_j < \infty$, $\sup_{\phi\in C} |c_0(\phi)| \le b_0$ and $\sup_{\phi\in C} \sum_{\vec k\in L^j} |c_{j,\vec k}(\phi)| \le b_j$ for all $j \ge 1$. Consider $W : \mathbb{R}^{L+1} \times \mathbb{R}^{L+1} \times C \to \mathbb{R}$ such that:

(w1) There exists $K_W < \infty$ satisfying, for each $(x,y) \in \mathbb{R}^{L+1} \times \mathbb{R}^{L+1}$: $\sup_{\phi\in C} |W(x,y,\phi)| \le K_W\, \rho(x,y)$, where $\rho : \mathbb{R}^{L+1} \times \mathbb{R}^{L+1} \to \mathbb{R}$ is a function such that there exists a constant $\kappa$ satisfying
$$E(|\rho(Z_m, Z_n)|) \le \kappa\, \|X_0\|_{L^2} \tag{15}$$
for all $m, n \in \mathbb{Z}^2$.

(w2) Given $K \subset \mathbb{R}^{L+1}$ compact and $d > 0$, there exists $\delta_d > 0$ such that $|\phi - \phi^*| < \delta_d$, $\phi, \phi^* \in C$ imply $|W(x,y,\phi) - W(x,y,\phi^*)| < d$ for all $x, y \in K$.

Consider $n_1 \in \mathbb{Z}^2$. For each integer $N \ge 1$ and $n \in \mathbb{Z}^2$ let $\Lambda_{n,N} : \Omega \times C \to \mathbb{R}$ be given by
$$\Lambda_{n,N}(\omega,\phi) = c_0(\phi)\, W(Z_n(\omega), Z_{n-n_1}(\omega), \phi) + \sum_{j=1}^{N-1} \sum_{\vec k\in L^j} c_{j,\vec k}(\phi)\, W(Z_n(\omega), Z_{n-n_1-s(j,\vec k)}(\omega), \phi).$$
Then:

1. $(\Lambda_{0,N}(\phi))_{N\ge1}$ converges uniformly on $\phi \in C$ in $L^2(\Omega,\mathcal{F},P)$ to the limit
$$\Lambda_{0,\infty}(\omega,\phi) = c_0(\phi)\, W(Z_0(\omega), Z_{-n_1}(\omega), \phi) + \sum_{j=1}^{\infty} \sum_{\vec k\in L^j} c_{j,\vec k}(\phi)\, W(Z_0(\omega), Z_{-n_1-s(j,\vec k)}(\omega), \phi).$$

2. There exists a strictly increasing sequence $(J(N))_N$ such that, for all $N$, there exists $\Omega_N \subseteq \Omega$ with $P(\Omega_N) = 1$ satisfying
$$\limsup_{M\to\infty} \sup_{\phi\in C} |T_{M,M}(\omega,\phi) - E(\Lambda_{0,\infty}(\phi))| \le \limsup_{M\to\infty} \sup_{\phi\in C} |T_{M,J(N)}(\omega,\phi) - E(\Lambda_{0,J(N)}(\phi))| + \frac{1}{N} \quad \text{if } \omega \in \Omega_N,$$
where $T_{M,N}(\omega,\phi) = (1/\#(W_M)) \sum_{m\in W_M} \Lambda_{m,N}(\omega,\phi)$.

3. For each $N$ there exists $\Omega'_N$ with $P(\Omega'_N) = 1$ satisfying
$$\lim_{M\to\infty} \sup_{\phi\in C} |T_{M,N}(\omega,\phi) - E(\Lambda_{0,N}(\phi))| = 0 \quad \text{if } \omega \in \Omega'_N.$$

4. There exists a subset $\Omega_0$ of $\Omega$ with $P(\Omega_0) = 1$ such that
$$\Omega_0 \subset \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)} \sum_{m\in W_M} \Lambda_{m,M}(\omega,\phi) - E(\Lambda_{0,\infty}(\phi)) \right| = 0 \right\}.$$

The proof, which is part of a Technical Report, can be requested to the first author.

The next lemma is very well known in robustness; it can be found in Ruskin (1978).

Lemma 2 (The zeros lemma). Let $U \subset \mathbb{R}^k$ be an open set and $\kappa_0 \in U$; for each $n = 1, 2, \dots$ let $q_n : U \to \mathbb{R}^k$ and $q : U \to \mathbb{R}^k$ be continuously differentiable functions. Assume that $q(\kappa_0) = 0$, $Dq(\kappa_0)$ is not zero, and there exists $\delta > 0$ such that $(q_n)_n$ and $(Dq_n)_n$ converge uniformly to $q$ and $Dq$, respectively, on $B(\kappa_0, \delta) = \{\kappa \in \mathbb{R}^k : \sup_{1\le i\le k} |\kappa_i - \kappa_{0i}| < \delta\}$. Then there exist $n_0 \ge 1$ and a sequence $(\kappa_n)_{n\ge n_0}$ in $B(\kappa_0, \delta)$ such that $(\kappa_n)_n$ converges to $\kappa_0$ and $q_n(\kappa_n) = 0$ for all $n \ge n_0$.


and this implies that (gN )N converges uniformly to f on . So (1) in Proposition 2 will follow if we show that (Dq ((gN (/))p ))N

converges uniformly on .

(16)

Notice that for each / ∈  Dq ((gN (/))p ) = A1p,q (/) + A2p,q (/) +

N−1 



(A3p,q (j,  k, /) + A4p,q (j,  k, /) + A5p,q (j,  k, /)).

j=1  k∈Lj

To prove (16), it is enough to show that for each d > 0, there exists a positive integer N such that for all positive integer n sup |Dq ((gN+n (/))p ) − Dq ((gN (/))p )| < d.

(17)

/∈

It is straightforward to check that, under (a) of Assumption 2(ii), the following inequality holds for each N, n positive integers and / ∈ : |Dq ((gN+n (/))p ) − Dq ((gN (/))p )| ⱕ

N+n 

K .j(b)j−1 +

j=N

1



K1 E(|X0 |)bj +

1



 K2 E(|X0 |)bj .

Similarly if we assume (b) of Assumption 2(ii) instead of (a) we can establish the following inequality: |Dq ((gN+n (/))p ) − Dq ((gN (/))p )| ⱕ

N+n 

K j(b)j−1 + 2

j=N

K3



 E(|X0 |2 )bj .

So (17) is satisfied and this finishes the proof of (1) in Proposition 2. Now, let us prove (2) in Proposition 2. Note that ⎧ 1        ⎪ if p = q, ⎪ ⎪ −  E 1  ,   ⎨        A1p,q (/0 ) =  1    ⎪ ⎪ ,  Ij, ⎪ − E 1 k (/0 ) if p  q, ⎩



 

(18)

(j, k)∈(T−T)1 (tp −tq )

where  and  are independent random variables with distribution F . ⎛ ⎛ ⎞⎞ ∞   1 ⎝ ⎠⎠ = 0. Ij, A2p,q (/0 ) = − E 2 (0 , −tp ) ⎝−tp −tq + k (/0 )−t −t −s(j, k)



j=1  k∈Lj

p

(19)

q

Now, by the assumptions on and considering that tq  0 and (4) imply the independence of 0 , −t A3p,q (j,  k, /0 ) = A5p,q (j,  k, /0 ) = 0.



p −s(j,k)

and X−t



p −s(j,k)−t q

then (20)

Finally, it is possible to prove that ∞  

A4p,q (j,  k, /) = −

j=1  k∈Lj

1



       E 1 ,  A4∗p,q ,

 

where  and  are independent random variables with distribution F and ⎛ ⎞   ⎜ ⎟ Ij, Ij, A4∗p,q = ⎝ k (/0 ) + k (/0 )Il, h (/0 )⎠ . (j, k)∈(T−T)1 (tq −tp )

(j, k,l, h)∈(T−T)2 (tp −tq )

By (18)–(21) and by the definition of I∞ (/0 ) it follows that        Df(0 ) = −E 1 ,  I∞ (/0 ),

 

(21)


where $\epsilon$ and $\tilde\epsilon$ are independent random variables with distribution $F_\epsilon$. Now, by Lemma 1 and Assumption 2(iv), $Df(\phi^0)$, up to a non-zero constant, is equal to the covariance matrix of $(X_m)_{m\in T}$ and, hence, it is invertible. This concludes the proof of Proposition 2. □

Proof of Proposition 3. It is divided into two stages, (a) and (b).

(a) There exists a subset $\Omega_{01}$ of $\Omega$ with $P(\Omega_{01}) = 1$ such that
$$\Omega_{01} \subset \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} |F_M(\omega,\phi) - f(\phi)| = 0 \right\}.$$
For all $1 \le p \le L$, it is enough to apply Lemma 1 under the following setup: $c_0(\phi) = 1$, $c_{j,\vec k}(\phi) = I_{j,\vec k}(\phi)$ for all $j \ge 1$, $\vec k \in L^j$ and $\phi \in C$; for $x = (x_1, \dots, x_{L+1})'$ and $y = (y_1, \dots, y_{L+1})'$ let
$$W(x, y, \phi) = \psi\left( \frac{x_1 - \sum_{p'=1}^{L} \phi_{p'}\, x_{p'+1}}{\sigma},\ \frac{y_1 - \sum_{p'=1}^{L} \phi_{p'}\, y_{p'+1}}{\sigma} \right)$$
and $\rho(x, y) = 1$; finally, set $n_1 = t_p$.

(b) There exists a subset $\Omega_{02}$ of $\Omega$ with $P(\Omega_{02}) = 1$ such that
$$\Omega_{02} \subset \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} |DF_M(\omega,\phi) - Df(\phi)| = 0 \right\}.$$
For all $1 \le p, q \le L$, the following notation will be used:
$$(a1_n(\omega,\phi))_{p,q} = -\frac{1}{\sigma}\, \psi_1(R_n(\omega,\phi), R_{n-t_p}(\omega,\phi))\, X_{n-t_q}(\omega),$$
$$(a2_n(\omega,\phi))_{p,q} = -\frac{1}{\sigma}\, \psi_2(R_n(\omega,\phi), R_{n-t_p}(\omega,\phi))\, X_{n-t_p-t_q}(\omega),$$
$$(a3_{n,j,\vec k}(\omega,\phi))_{p,q} = D_q(I_{j,\vec k}(\phi))\, \psi(R_n(\omega,\phi), R_{n-t_p-s(j,\vec k)}(\omega,\phi)),$$
$$(a4_{n,j,\vec k}(\omega,\phi))_{p,q} = -\frac{1}{\sigma}\, I_{j,\vec k}(\phi)\, \psi_1(R_n(\omega,\phi), R_{n-t_p-s(j,\vec k)}(\omega,\phi))\, X_{n-t_q}(\omega),$$
$$(a5_{n,j,\vec k}(\omega,\phi))_{p,q} = -\frac{1}{\sigma}\, I_{j,\vec k}(\phi)\, \psi_2(R_n(\omega,\phi), R_{n-t_p-s(j,\vec k)}(\omega,\phi))\, X_{n-t_p-s(j,\vec k)-t_q}(\omega),$$
where $\omega \in \Omega$, $n \in \mathbb{Z}^2$, $j$ is a positive integer, $\vec k \in L^j$ and $\phi \in C$. Then
$$D_q((DF_M(\omega,\phi))_p) = \frac{1}{\#(W_M)}\sum_{m\in W_M} (a1_m(\omega,\phi))_{p,q} + \frac{1}{\#(W_M)}\sum_{m\in W_M} (a2_m(\omega,\phi))_{p,q} + \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a3_{m,j,\vec k}(\omega,\phi))_{p,q} + \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a4_{m,j,\vec k}(\omega,\phi))_{p,q} + \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a5_{m,j,\vec k}(\omega,\phi))_{p,q}.$$
Thus, to prove (b) it is enough to show the following five statements:

(b1) There exists a subset $\Omega_G$ of $\Omega$ with $P(\Omega_G) = 1$ such that, for all $1 \le p, q \le L$, $\Omega_G$ is contained in $G$, where
$$G = \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)}\sum_{m\in W_M} (a1_m(\omega,\phi))_{p,q} - A1_{p,q}(\phi) \right| = 0 \right\}.$$

(b2) There exists a subset $\Omega_H$ of $\Omega$ with $P(\Omega_H) = 1$ such that, for each $s, t \in T$, $\Omega_H$ is contained in $H$, where
$$H = \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)}\sum_{m\in W_M} (a2_m(\omega,\phi))_{p,q} - A2_{p,q}(\phi) \right| = 0 \right\}.$$

(b3) There exists a subset $\Omega_A$ of $\Omega$ with $P(\Omega_A) = 1$ such that, for each $s, t \in T$, $\Omega_A$ is included in $A$, where
$$A = \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a3_{m,j,\vec k}(\omega,\phi))_{p,q} - \sum_{j=1}^{\infty}\sum_{\vec k\in L^j} A3_{p,q}(j,\vec k,\phi) \right| = 0 \right\}. \tag{22}$$

(b4) There exists a subset $\Omega_B$ of $\Omega$ with $P(\Omega_B) = 1$ such that, for each $s, t \in T$, $\Omega_B$ is included in $B$, where
$$B = \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a4_{m,j,\vec k}(\omega,\phi))_{p,q} - \sum_{j=1}^{\infty}\sum_{\vec k\in L^j} A4_{p,q}(j,\vec k,\phi) \right| = 0 \right\}.$$

(b5) There exists a subset $\Omega_D$ of $\Omega$ with $P(\Omega_D) = 1$ such that, for each $s, t \in T$, $\Omega_D$ is included in $D$, where
$$D = \left\{ \omega \in \Omega : \lim_{M\to\infty} \sup_{\phi\in C} \left| \frac{1}{\#(W_M)}\sum_{m\in W_M}\sum_{j=1}^{M-1}\sum_{\vec k\in L^j} (a5_{m,j,\vec k}(\omega,\phi))_{p,q} - \sum_{j=1}^{\infty}\sum_{\vec k\in L^j} A5_{p,q}(j,\vec k,\phi) \right| = 0 \right\}.$$

We only prove (b1) (the proofs of (b2)-(b5) follow the same scheme). If we suppose that (a) of Assumption 2(ii) holds, then we apply Lemma 1 under the following setup: $c_0(\phi) = 1$, $c_{j,\vec k}(\phi) = 0$ for all $j \ge 1$, $\vec k \in L^j$, $\phi \in C$, and $\rho(x,y) = x_{q+1}$. Consider also
$$W(x, y, \phi) = \psi_1\left( \frac{x_1 - \sum_{p'=1}^{L} \phi_{p'}\, x_{p'+1}}{\sigma},\ \frac{y_1 - \sum_{p'=1}^{L} \phi_{p'}\, y_{p'+1}}{\sigma} \right) x_{q+1},$$
with $x = (x_1, \dots, x_{L+1})'$, $y = (y_1, \dots, y_{L+1})'$ and $\phi = (\phi_1, \dots, \phi_L)'$. Finally, $n_1 = t_p$ is considered as in (a). Suppose now that (b) of Assumption 2(ii) holds; then we use $\rho(x,y) = |x_{q+1} y_1 / \sigma| + (1/L)\sum_{p'=1}^{L} |x_{q+1}\, y_{p'+1} / \sigma|$. This completes the proof of Proposition 3. □

Proof of Theorem 1. Let $\Omega_0$ be as in Proposition 3 and let $\omega \in \Omega_0$. We consider the zeros lemma (Lemma 2) under the following conditions: $U = C^o$ (the interior of $C$), $\kappa_0 = \phi^0$; for all $M$, $q_M$ is the function $\phi \mapsto F_M(\omega,\phi)$, and $q$ is the function $f$ defined in (8). Note that from (6), and (i) and (iii) of Assumption 2, it follows that $f(\phi^0) = 0$. With this result, part (2) of Proposition 2, and Proposition 3, the assumptions of Lemma 2 are fulfilled. Hence $\omega \in \Omega^*$. Setting $\Omega^* = \Omega_0$, Theorem 1 is proved. □

and x=(x1 , . . . , xL+1 ) , y=(y1 , . . . , yL+1 ) and /=(1 , . . . , L ) . Finally, n1 =tp is considered as in (a). Suppose now that (b) of Assumption  2(ii) holds, then we use (x, y) = |xq+1 y1 /  | + (1/L) Lp =1 |xq+1 (yp +1 )/  |. This completes the proof of Proposition 3.  Proof of Theorem 1. Let 0 be as in Proposition 3. Let ∈ 0 . We consider the zeros lemma (Lemma 2) under the following conditions: U = C o (the interior of C), k0 = /0 , for all M, qM is the function /FM ( , /), q is the function f defined in (8). Note that from (6), and (i) and (iii) of Assumption 2, it follows that f(/0 ) = 0. With this result, (2) of Propositions 2 and 3, the assumptions   of Lemma 2 are fullfilled. Hence ∈  . Setting  = 0 , Theorem 1 is proved.  A.2. Proofs of the asymptotic normality of the RA estimator  Lemma 3. For 1 ⱕ M, N ⱕ ∞ integers, n, m ∈ Z2 , ∈ , / ∈  and 1 ⱕ p ⱕ L define YM,N ( , /) = 1/#(WM ) m∈WM !m,N ( , /) where

(!n,N ( , /))p =

∞   j=N  k∈Lj

Ij, k (/) (Rn ( , /), Rn−t



p −s(j,k)

( , /)).

Then for all 1 ⱕ p ⱕ L we have that, in probability,  lim lim sup( #(WM )YM,N (/0 )) = 0.

N→∞

M→∞

The proof, which is part of a Technical Report, can be requested to the first author. The following central limit theorem for random fields (see Guyon, 1993, p. 99) is needed for proving Lemma 5. Lemma 4 (Central limit theorem for random fields). Let X = {Xi : i ∈ Zd } be a real valued random field such that E(Xi ) = 0, for all i ∈ Zd . For each S ⊂ Zd , let F(X, S) be the -algebra generated by {Xi−1 (C) : C is a Borel set, i ∈ S}. For each k,  ∈ N ∪ {∞} and n ∈ N, let k, (n)=sup{|P(A∩B)−P(A)P(B)| : A, B ∈ Ck, (n)} where Ck, (n)={A ∈ F(X, S1 ), B ∈ F(X, S2 ) : #(S1 ) ⱕ k, #(S2 ) ⱕ , dist(S1 , S2 ) ⱖ n}.  Now let (Dn )n ⱖ 1 be a decreasing sequence of finite subsets of Zd and Sn = i∈Dn Xi with variance 2n . If we assume that: 

if k +  ⱕ 4 and 1,∞ (m) = o(m−d );   d−1 ( (m))/(+2) < ∞, then lim sup (1/#(D )) (ii) there exists  > 0 such that supi Xi +2 < ∞ and n 1,1 n mⱖ1 m i,j∈Dn |Cov(Xi , Xj )| < ∞; (i)

d−1  (m) < ∞ k, mⱖ1 m

D

(iii) if, further, we assume that lim inf n→∞ (1/#(Dn ))2n > 0, then −1 n Sn → N(0, 1). If, for each i ∈ Zd , Xi takes values in Rp and if (iii) is replaced by lim inf n→∞

1

n ⱖ I0 > 0 #(Dn ) −1/2

with I0 a symmetric and positive definite matrix, then n matrix of order p.

D

Sn → N(0, Ip ), where n is the covariance matrix of Sn and Ip is the identity


Lemma 5. For each integer $N \ge 2$, let $G_N$ be the $L \times L$ matrix given by
$$G_N = I_N(\phi^0)\, E\left( \psi\Big(\frac{\epsilon}{\sigma}, \frac{\tilde\epsilon}{\sigma}\Big)^2 \right),$$
where $\epsilon$ and $\tilde\epsilon$ are independent random variables with distribution $F_\epsilon$ and $I_N(\phi^0)$ is the $L \times L$ matrix given by
$$I_N(\phi^0)_{p,p} = 1 + \sum_{(j,\vec k,\ell,\vec h)\in(T-T)^N_2(0)} I_{j,\vec k}(\phi^0)\, I_{\ell,\vec h}(\phi^0)$$
for all $1 \le p \le L$, and
$$I_N(\phi^0)_{p,q} = \sum_{(j,\vec k)\in(T-T)^N_1(t_p-t_q)} I_{j,\vec k}(\phi^0) + \sum_{(j,\vec k)\in(T-T)^N_1(t_q-t_p)} I_{j,\vec k}(\phi^0) + \sum_{(j,\vec k,\ell,\vec h)\in(T-T)^N_2(t_p-t_q)} I_{j,\vec k}(\phi^0)\, I_{\ell,\vec h}(\phi^0)$$
for $p \ne q$ with $1 \le p, q \le L$, where
$$(T-T)^N_1(v) = \{(j,\vec k) : N-1 \ge j \ge 1,\ \vec k \in L^j,\ s(j,\vec k) = v\},$$
$$(T-T)^N_2(v) = \{(j,\vec k,\ell,\vec h) : N-1 \ge j, \ell \ge 1,\ \vec k \in L^j,\ \vec h \in L^\ell,\ s(j,\vec k) - s(\ell,\vec h) = v\},$$
with $v \in (T-T)$ and $(T-T)$ as in Proposition 1. Let $G_{M,N}$ be as in (8). Then there exists $N_0$ such that, for each $N \ge N_0$, $\sqrt{\#(W_M)}\, G_{M,N}(\phi^0) \xrightarrow{D} N(0, G_N)$ as $M\to\infty$.

The proof, which is part of a Technical Report, can be requested to the first author.

Lemma 6. Let $G_{M,\infty}$ be as in (9) and $f$ be as in (8). Then $DG_{M,\infty}(\phi^*_M) \xrightarrow{P} Df(\phi^0)$ as $M\to\infty$, where $\phi^*_M$ is as in (10).

The proof, which is part of a Technical Report, can be requested to the first author.

Proof of Proposition 4. By the definition of $G_{M,\infty}$, $G_{M,N}$ and $Y_{M,N}$ (see (9), (8) and Lemma 3, respectively),
$$\sqrt{\#(W_M)}\, G_{M,\infty}(\phi) = \sqrt{\#(W_M)}\, G_{M,N}(\phi) + \sqrt{\#(W_M)}\, Y_{M,N}(\phi). \tag{23}$$
In particular ($F_M = G_{M,M}$),
$$\sqrt{\#(W_M)}\, G_{M,\infty}(\hat\phi^{RA}_M) = \sqrt{\#(W_M)}\, F_M(\hat\phi^{RA}_M) + \sqrt{\#(W_M)}\, Y_{M,M}(\hat\phi^{RA}_M). \tag{24}$$
By (13) and (24) it is enough to show that, for each $1 \le p \le L$,
$$\big(\sqrt{\#(W_M)}\, Y_{M,M}(\hat\phi^{RA}_M)\big)_p \xrightarrow{P} 0 \quad \text{as } M\to\infty.$$
Now, for each $\omega \in \Omega$ and $\phi \in C$, a straightforward calculation gives
$$\big|\big(\sqrt{\#(W_M)}\, Y_{M,M}(\omega,\phi)\big)_p\big| \le K\, \frac{\sqrt{\#(W_M)}}{M} \sum_{j=M}^{\infty} j\, b^{j-1}, \tag{25}$$
where $b$ is as in the proof of Proposition 2. Since $\sqrt{\#(W_M)}/M = \sqrt{1 + 2/M}$ and $b \in (0,1)$, then $\sum_{j=M}^{\infty} j\, b^{j-1} \to 0$ as $M\to\infty$. □

The following result can be found in Billingsley (1999, p. 27).

Lemma 7. Suppose that for each $N, M \in \mathbb{N}$, $(X_{N,M}, X_M)$ are random elements of $S \times S$, where $(S, \rho)$ is a metric space. If for all $N$, $X_{N,M} \xrightarrow{D} Z_N$ as $M\to\infty$, $Z_N \xrightarrow{D} X$ as $N\to\infty$, and for all $\epsilon > 0$
$$\lim_{N\to\infty} \limsup_{M\to\infty} P[\rho(X_{N,M}, X_M) \ge \epsilon] = 0,$$
then $X_M \xrightarrow{D} X$ as $M\to\infty$.

Proof of Proposition 5. Let $N_0$ be as in Lemma 5; for each $N \ge N_0$, let
$$X_{N,M} = \sqrt{\#(W_M)}\, G_{M,N}(\phi^0) \quad \text{and} \quad X_M = \sqrt{\#(W_M)}\, G_{M,\infty}(\phi^0).$$


For each $N \ge N_0$ let $Z_N$ be a random vector with distribution $N(0, G_N)$ and $X$ a random vector with distribution $N(0, G)$. By Lemma 5, for each $N \ge N_0$, $X_{N,M} \xrightarrow{D} Z_N$ as $M\to\infty$. Since $G_N \to G$ as $N\to\infty$, then $Z_N \xrightarrow{D} X$ as $N\to\infty$. By (23), $X_M - X_{N,M} = \sqrt{\#(W_M)}\, Y_{M,N}(\phi^0)$ and, by Lemma 3,
$$\lim_{N\to\infty} \limsup_{M\to\infty} \big(\sqrt{\#(W_M)}\, Y_{M,N}(\phi^0)\big) = 0.$$
So, the assumptions of Lemma 7 are satisfied; hence $X_M \xrightarrow{D} X$ as $M\to\infty$, that is,
$$\sqrt{\#(W_M)}\, G_{M,\infty}(\phi^0) \xrightarrow{D} N(0, G) \quad \text{as } M\to\infty. \qquad \square$$

Proof of Theorem 2. By Lemma 6, (10), Propositions 4 and 5 and Slutsky's theorem we have that
$$\sqrt{\#(W_M)}\, (\hat\phi^{RA}_M - \phi^0) \xrightarrow{D} N\big(0,\, Df(\phi^0)^{-1}\, G\, Df(\phi^0)^{-1}\big) \quad \text{as } M\to\infty. \tag{26}$$

Now, notice (see (12)) that $Df(\phi^0) = -(1/\sigma)\, E\big(\psi_1(\epsilon/\sigma,\, \tilde\epsilon/\sigma)\, \tilde\epsilon\big)\, I_\infty(\phi^0)$, where $\epsilon$ and $\tilde\epsilon$ are independent random variables with distribution $F_\epsilon$. From the results proved in the previous section and (26), the proof is completed. □

References

Allende, H., Galbiati, J., Vallejos, R., 1998. Digital image restoration using autoregressive time series models. In: Proceedings of the Second Latino-American Seminar on Radar Remote Sensing, vol. SP-434. ESA, pp. 53–59.
Allende, H., Galbiati, J., Vallejos, R., 2001. Robust image modelling on image processing. Pattern Recognition Letters 22 (11), 1219–1231.
Allende, H., Pizarro, L., 2003. Robust estimation of roughness parameter in SAR amplitude images. In: Sanfeliu, A., Ruiz-Shulcloper, J. (Eds.), CIARP, Lecture Notes in Computer Science, vol. 2905. Springer, Berlin.
Aysal, T.C., Barner, K.E., 2006. Quadratic weighted median filters for edge enhancement of noisy images. IEEE Transactions on Image Processing 15 (11), 3294–3310.
Basu, S., Reinsel, G.C., 1993. Properties of the spatial unilateral first-order ARMA model. Advances in Applied Probability 25 (3), 631–648.
Ben Hamza, A., Krim, H., 2001. Image denoising: a nonlinear robust statistical approach. IEEE Transactions on Signal Processing 49 (12), 3045–3054.
Billingsley, P., 1999. Convergence of Probability Measures. Wiley, New York.
Black, M.J., Rangarajan, A., 1996. On the unification of line processes, outlier rejection, and robust statistics with applications in early vision. International Journal of Computer Vision 19, 57–91.
Black, M.J., Sapiro, G., Marimont, D., Heeger, D., 1997. Robust anisotropic diffusion: connections between robust statistics, line processing, and anisotropic diffusion. In: Proceedings of the First International Conference on Scale-Space Theory in Computer Vision. Lecture Notes in Computer Science, vol. 1252. Springer, Berlin, pp. 323–326.
Bouzouba, K., Radouane, L., 2000. Image identification and estimation using the maximum entropy principle. Pattern Recognition Letters 21, 691–700.
Brandle, N., Bischof, H., Lapp, H., 2003. Robust DNA microarray image analysis. Machine Vision and Applications 15, 11–28.
Brunelli, R., Messelodi, S., 1995. Robust estimation of correlation with applications to computer vision. Pattern Recognition 28, 833–841.
Bustos, O., Fraiman, R., 1984. Asymptotic behavior of the estimates based on residual autocovariances for ARMA models. In: Robust and Nonlinear Time Series. Lecture Notes in Statistics, vol. 29. Springer, New York, pp. 26–49.
Bustos, O., Yohai, V., 1986. Robust estimates for ARMA models. Journal of the American Statistical Association 81, 155–168.
Bustos, O.H., Lucini, M.M., Frery, A.C., 2002. M-estimators of roughness and scale for GA0-modelled SAR imagery. EURASIP Journal on Applied Signal Processing 2002 (1), 105–114.
Butler, N., 1998. A frequency domain comparison of estimation methods for Gaussian fields. Communications in Statistics—Theory and Methods 27, 2325–2342.
Chen, J.-H., Chen, C.-S., Chen, Y.-S., 2003. Fast algorithm for robust template matching with M-estimators. IEEE Transactions on Signal Processing 51 (1), 230–243.
Chu, C.K., Glad, I.K., Godtliebsen, F., Marron, J.S., 1998. Edge-preserving smoothers for image processing. Journal of the American Statistical Association 93, 526–541.
Comport, A., Marchand, E., Chaumette, F., 2006. Statistically robust 2-D visual servoing. IEEE Transactions on Robotics 22 (2), 415–420.
Dryden, I., Ippoliti, L., Romagnoli, L., 2002. Adjusted maximum likelihood and pseudo-likelihood estimation for noisy Gaussian Markov random fields. Journal of Computational and Graphical Statistics 11, 370–388.
Frery, A.C., Sant'Anna, S.J.S., Mascarenhas, N.D.A., Bustos, O.H., 1997. Robust inference techniques for speckle noise reduction in 1-look amplitude SAR images. Applied Signal Processing 4, 61–76.
Glendinning, R.H., 1999. Robust shape classification. Signal Processing 77, 121–138.
Gottardo, R., Raftery, A.E., Yeung, K.Y., Bumgarner, R.E., 2006. Bayesian robust inference for differential gene expression in microarrays with multiple samples. Biometrics 62, 10–18.
Guyon, X., 1993. Champs aléatoires sur un réseau, modélisations, statistique et applications. Masson, Paris.
Hasler, D., Sbaiz, L., Susstrunk, S., Vetterli, M., 2003. Outlier modeling in image matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (3), 301–315.
Huang, H.-C., Lee, T., 2006. Data adaptive median filters for signal and image denoising using a generalized SURE criterion. IEEE Signal Processing Letters 13 (9), 561–564.
Kashyap, R.L., Eom, K.-B., 1988. Robust image techniques with an image restoration application. IEEE Transactions on Acoustics, Speech, and Signal Processing 36 (8), 1313–1325.
Kim, J.H., Han, J.H., 2006. Outlier correction from uncalibrated image sequence using the triangulation method. Pattern Recognition 39, 394–404.
Koivunen, V., 1995. A robust nonlinear filter for image restoration. IEEE Transactions on Image Processing 4 (5), 569–578.
Li, S.Z., Wang, H., Soh, W.Y.C., 1998. Robust estimation of rotation angles from image sequences using the annealing M-estimator. Journal of Mathematical Imaging and Vision 8, 181–192.
Marroquin, J.L., Servin, M., Rodriguez-Vera, R., 1998. Robust filters for low-level vision. Expert Systems with Applications 14, 169–177.
Meer, P., Mintz, D., Rosenfeld, A., Kim, D.Y., 1991. Robust regression methods for computer vision: a review. International Journal of Computer Vision 6, 59–70.
Mirza, M., Boyer, K., 1993. Performance evaluation of a class of M-estimators for surface parameter estimation in noisy range data. IEEE Transactions on Robotics and Automation 9 (1), 75–85.
Nirel, R., Mugglestone, M.A., Barnett, V., 1998. Outlier-robust spectral estimation for spatial lattice processes. Communications in Statistics—Theory and Methods 27, 3095–3111.
Ojeda, S., Vallejos, R., Lucini, M., 2002. Performance of robust RA estimator for bidimensional autoregressive models. Journal of Statistical Computation and Simulation 72 (1), 47–62.


Palenichka, R.M., Zinterhof, P., Ivasenko, I.B., 2000. Adaptive image filtering and segmentation using robust estimation of intensity. Advances in Pattern Recognition 1876, 888–897.
Palenichka, R.M., Zinterhof, P., Rytsar, Y.B., Ivasenko, I.B., 1998. Structure-adaptive image filtering using order statistics. Journal of Electronic Imaging 7, 339–349.
Prastawa, M., Bullitt, E., Ho, S., Gerig, G., 2004. A brain tumor segmentation framework based on outlier detection. Medical Image Analysis 8, 275–283.
Rabie, T., 2005. Robust estimation approach for blind denoising. IEEE Transactions on Image Processing 14 (11), 1755–1765.
Roth, V., 2006. Kernel Fisher discriminants for outlier detection. Neural Computation 18, 942–960.
Ruskin, D.M., 1978. M-estimates of nonlinear regression parameters and their jackknife constructed confidence intervals. Ph.D. Thesis, University of California, Los Angeles.
Sim, K.S., Cheng, Z., Chuah, H.T., 2004. Robust image signal-to-noise ratio estimation using mixed Lagrange time delay estimation autoregressive model. Scanning 26, 287–295.
Singh, M., Arora, H., Ahuja, N., 2004. A robust probabilistic estimation framework for parametric image models. In: Proceedings of the European Conference on Computer Vision—ECCV, vol. I. Springer, Berlin, pp. 508–522.
Steland, A., 2005. On the distribution of the clipping median under a mixture model. Statistics and Probability Letters 71, 1–13.
Stewart, C.V., 1999. Robust parameter estimation in computer vision. SIAM Review 41, 513–537.
Tarel, J.P., Ieng, S.-S., Charbonnier, P., 2002. Using robust estimation algorithms for tracking explicit curves. In: Proceedings of the European Conference on Computer Vision. Springer, Berlin, pp. 492–507.
Tjostheim, D., 1978. Statistical spatial series modelling. Advances in Applied Probability 10, 130–154.
Tohka, J., Zijdenbos, A., Evans, A., 2004. Fast and robust parameter estimation for statistical partial volume models in brain MRI. NeuroImage 23, 84–97.
Torr, P.H.S., Zisserman, A., 2000. MLESAC: a new robust estimator with application to estimating image geometry. Computer Vision and Image Understanding 78, 138–156.
Vallejos, R.O., Mardesic, T.J., 2004. A recursive algorithm to restore images based on robust estimation of NSHP autoregressive models. Journal of Computational and Graphical Statistics 13, 674–682.
Van de Weijer, J., Van den Boomgaard, R., 2005. Least squares and robust estimation of local image structure. International Journal of Computer Vision 64, 143–155.
Voloshynovskiy, S.V., Allen, A.R., Hrytskiv, Z.D., 2000. Robust edge-preserving image restoration in the presence of non-Gaussian noise. Electronics Letters 36, 2006–2007.
Wang, H.Z., Suter, D., 2004a. MDPE: a very robust estimator for model fitting and range image segmentation. International Journal of Computer Vision 59, 139–166.
Wang, H.Z., Suter, D., 2004b. Robust adaptive-scale parametric model estimation for computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence 26, 1459–1474.
Whittle, P., 1954. On stationary processes in the plane. Biometrika 41, 434–449.
Xu, P.L., 2005. Sign-constrained robust least squares, subjective breakdown point and the effect of weights of observations on robustness. Journal of Geodesy 79, 146–159.
Zervakis, M.E., Katsaggelos, A.K., Kwon, T.M., 1995. A class of robust entropic functionals for image restoration. IEEE Transactions on Image Processing 4 (6), 752–773.
Zervakis, M.E., Kwon, T.M., 1992. Robust estimation techniques in regularized image-restoration. Optical Engineering 31, 2174–2190.
