PROBABILITY AND MATHEMATICAL STATISTICS Vol. 0, Fasc. 0 (0000), pp. 000–000

MAXIMAL INEQUALITIES FOR U-PROCESSES OF STRONGLY MIXING RANDOM VARIABLES BY

ALESSIO SANCETTA∗ (LONDON)

Abstract. Maximal inequalities for U-processes are required in order to achieve a reduction to the first nonvanishing term in their Hoeffding's decomposition, which is the relevant quantity for statistical inference. This paper proves new maximal inequalities under strong mixing for U-processes in some function spaces. As an application we derive a uniform central limit theorem.

2000 AMS Mathematics Subject Classification: Primary: 62E20; Secondary: 60F17.

Key words and phrases: Besov Space, Hoeffding Decomposition, Stochastic Equicontinuity, Strong Mixing, U-Process.

1. INTRODUCTION

This paper establishes maximal inequalities for U-processes of arbitrary finite order. A U-process is a U-statistic whose U-kernel belongs to some class of functions. The simplest example is an empirical process, which corresponds to a first order U-process. Many statistical estimators can be written as U-statistics (e.g. quadratic forms) and the extension to a U-process is often considered, especially for nonparametric estimation (e.g. Han, 1987, Honoré and Powell, 1994, Cavanagh and Sherman, 1998, Ghosal et al., 2000). Unfortunately, for technical reasons, independent observations are usually assumed, some exceptions being Fan and Li (1999), Fan and Ullah (1999) and Denker and Keller (1986).

We derive uniform bounds for U-processes when the underlying observations are strongly mixing. Because of dependence, well known results in the literature for U-processes (e.g. Arcones and Giné, 1993) do not apply. Some maximal inequalities for U-processes under β mixing have been established by Arcones and Yu (1994) using Berbee's coupling method for β mixing sequences, but this approach requires some lengthy technical details and is not applicable in the strong mixing case (see also Borovkova et al., 2001). Recall that strong mixing is a weaker condition than β mixing (see Rio, 2000, for details). The goal of the paper is to establish some familiar results of U-processes in the unfamiliar context of strongly mixing random variables. As in Rio (2000) we use a representation of functions in some space by means of wavelets (see also Birgé and Massart, 2000). The result of this paper only applies to U-processes indexed by classes of functions in some Besov space, hence in this respect it is less general than the result derived in Arcones and Yu (1994).

The main motivation of the paper is the reduction to the first nonvanishing term in Hoeffding's decomposition of a U-process. The first nonvanishing term in a U-process is the one that determines the asymptotic distribution of the process. Hence, for statistical inference it is required that we find such a reduction. To this end, maximal inequalities for U-processes are the necessary technical tool. An example of such application will be given. Since in practice observations are not independent, the extension to dependent random variables should be pursued. The results are stated in such a way that we can easily bound the remainder terms in a U-process and obtain explicit rates of convergence.

∗ I thank a referee for comments that improved the content of the paper and in particular the proof of one of the lemmata. I also thank Oliver Linton for suggesting some references.
The proof of the result makes use of the wavelet representation of functions in some Besov spaces and the idea of Arcones and Giné (1993) to rewrite U-statistics in terms of powers of partial sums. Moment inequalities for powers of strongly mixing partial sums can then be applied (Rio, 2000).

The plan for the paper is as follows. Section 2 provides some background definitions and states the result of the paper. Section 3 contains further notation and proves the result.

2. MAXIMAL INEQUALITIES FOR U-PROCESSES UNDER STRONG MIXING

We will use $\|\cdot\|_{p,P}$ and $\|\cdot\|_{p,\lambda}$ to denote, respectively, the $L_p$ norm with respect to the underlying probability measure $P$ and the Lebesgue measure $\lambda$, while $|\cdot|$ is the Euclidean norm. The symbols $\lesssim$ and $\asymp$ mean inequality and equality up to some finite absolute constant of proportionality. We now turn to the definition of U-processes.

2.1. Definition and Notation for U-Processes. Consider a stationary sequence of random variables $(X_i)_{i\in\mathbb{Z}}$ with values in $\mathbb{R}$. Let $\delta_x$ be the point measure at $x$, i.e. $\delta_x(A) = 1$ if $x \in A \subset \mathbb{R}$, $0$ otherwise. Suppose $f : \mathbb{R}^m \to \mathbb{R}$ is a symmetric function of its arguments. Let $P := \mathrm{law}(X_i)$ ($\forall i$). Borrowing notation from Arcones and Giné (1993) and Arcones and Yu (1994), define $\pi_{k,m}^{P}$ as an operator on $f$ such that

\[
\pi_{k,m}^{P} f(x_1, \dots, x_k) = (\delta_{x_1} - P) \cdots (\delta_{x_k} - P)\, P^{m-k} f(X_1, \dots, X_m), \tag{2.1}
\]

where

\[
Q_1 \cdots Q_m f = \int \cdots \int f(x_1, \dots, x_m)\, dQ_1(x_1) \cdots dQ_m(x_m)
\]

for any marginal measure $Q_k$ ($= \delta_{x_k}$ or $P$ above) and $P^{m-k} = P \cdots P$ is the $(m-k)$-fold product of the marginals. Given a function $g : \mathbb{R}^m \to \mathbb{R}$, we call $g$ a $P$-canonical function if it is symmetric and $E g(x_1, \dots, x_{m-1}, X_m) = 0$ for any $x_1, \dots, x_{m-1}$. Then, $\pi_{k,m}^{P} f$ is a $P$-canonical function in $k$ variables ($k = 1, \dots, m$). If $f$ is not symmetric, we may write

\[
S f(x_1, \dots, x_m) = \frac{1}{m!} \sum f(x_{i_1}, \dots, x_{i_m}),
\]

where the sum runs over all permutations $(i_1, \dots, i_m)$ of $(1, \dots, m)$, for its symmetric version. To ease notation assume $f$ is symmetric. Let $\mathcal{F}$ be a class of symmetric measurable real functions on $\mathbb{R}^m$. A U-process of order $m$ with U-kernel in $\mathcal{F}$ is defined as

\[
\left( U_n^{(m)}(f) \right)_{f \in \mathcal{F}} := \left( \frac{(n-m)!}{n!} \sum_{(i_1, \dots, i_m) \in I_m^n} f(X_{i_1}, \dots, X_{i_m}) \right)_{f \in \mathcal{F}},
\]

where $I_m^n := \{ (i_1, \dots, i_m) : 1 \le i_j \le n,\ i_j \ne i_k \text{ if } j \ne k \}$ (see Serfling, 1980, and Arcones and Giné, 1993, for details on U-statistics and U-processes, respectively). Hence, $(U_n^{(m)}(f))_{f \in \mathcal{F}}$ is a collection of U-statistics. Then, $(U_n^{(m)}(f))_{f \in \mathcal{F}}$ has the following Hoeffding's decomposition:

\[
U_n^{(m)}(f) = \sum_{k=0}^{m} \binom{m}{k} U_n^{(k)} \pi_{k,m}^{P} f = P^m f + \sum_{k=1}^{m} \binom{m}{k} U_n^{(k)} \pi_{k,m}^{P} f, \tag{2.2}
\]

where $\pi_{k,m}^{P} f$ ($k = 1, \dots, m$) are $P$-canonical functions.
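As a hedged illustration of the definition of the U-statistic and of the decomposition (2.2), the following Python sketch evaluates a U-statistic by brute-force enumeration of the index set $I_m^n$ and checks (2.2) for the product kernel $f(x, y) = xy$ with $m = 2$. The function names, the sample and the value `mu` (standing for the mean of $P$) are assumptions introduced only for this example and are not part of the paper.

```python
from itertools import permutations
from math import factorial


def u_statistic(kernel, sample, m):
    """U-statistic of order m: average of the kernel over all ordered
    m-tuples of distinct indices, i.e. over the index set I_m^n."""
    n = len(sample)
    total = sum(kernel(*(sample[i] for i in idx))
                for idx in permutations(range(n), m))
    return total * factorial(n - m) / factorial(n)


def hoeffding_terms_product_kernel(sample, mu):
    """Terms of (2.2) for f(x, y) = x * y when the law P has mean mu
    (mu is a hypothetical, user-supplied value)."""
    p2f = mu ** 2  # P^2 f
    # pi_{1,2} f(x) = mu * (x - mu)
    first = u_statistic(lambda x: mu * (x - mu), sample, 1)
    # pi_{2,2} f(x, y) = (x - mu) * (y - mu)
    second = u_statistic(lambda x, y: (x - mu) * (y - mu), sample, 2)
    return p2f, first, second


sample = [0.5, 1.5, 2.0, 4.0]  # illustrative data only
lhs = u_statistic(lambda x, y: x * y, sample, 2)
p2f, first, second = hoeffding_terms_product_kernel(sample, mu=1.0)
rhs = p2f + 2 * first + second  # binomial weights C(2,1) = 2 and C(2,2) = 1
```

For this kernel the two sides agree up to rounding for any sample, because $xy = \mu^2 + \mu(x-\mu) + \mu(y-\mu) + (x-\mu)(y-\mu)$ holds pointwise. The enumeration costs of order $n^m$ kernel evaluations, so it is meant for small illustrative samples only.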

2.2. U-Kernels in Besov Spaces. The U-kernel $f \in \mathcal{F}$ of the process will be restricted to the Besov space $B_p^{s,\infty}(\mathbb{R}^m)$, which is a smoothness subspace of $L_p(\mathbb{R}^m)$. To define this space, define the $r$th difference in the direction of $h \in \mathbb{R}^m$:

\[
\Delta_h^r(f, x) := \sum_{j=0}^{r} (-1)^{r+j} \binom{r}{j} f(x + jh),
\]

so that $\Delta_h^1(f, x) = f(x + h) - f(x)$ and higher differences are obtained by induction. The modulus of smoothness of order $r$ of $f$ is given by

\[
\omega_r(f, t)_p := \sup_{|h| \le t} \left( \int_{\mathbb{R}^m} |\Delta_h^r(f, x)|^p \, dx \right)^{1/p}, \quad t > 0.
\]
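To make the definition concrete, here is a small Python sketch of the $r$th difference in the case $m = 1$ (scalar $x$ and $h$). The name `rth_difference` and the test function are assumptions of this example. It illustrates two elementary facts: the case $r = 1$ is the ordinary forward difference, and a polynomial of degree less than $r$ is annihilated by the $r$th difference.

```python
from math import comb


def rth_difference(f, x, h, r):
    """r-th forward difference of f at x with step h:
    sum_{j=0}^{r} (-1)^(r+j) * C(r, j) * f(x + j*h)."""
    return sum((-1) ** (r + j) * comb(r, j) * f(x + j * h)
               for j in range(r + 1))


def square(x):
    # Illustrative test function: a polynomial of degree 2.
    return x ** 2


first_diff = rth_difference(square, 1.0, 0.5, 1)   # f(1.5) - f(1.0)
second_diff = rth_difference(square, 1.0, 0.5, 2)  # equals 2 * h**2 for x**2
third_diff = rth_difference(square, 1.0, 0.5, 3)   # degree 2 < r = 3, so it vanishes
```

The same formula applies for $m > 1$ with $x$ and $h$ read as vectors; the scalar case is used here only to keep the sketch short.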

Let $s > 0$ and $r = \lfloor s \rfloor + 1$, where $\lfloor s \rfloor$ is the integer part of $s$. The Besov space $B_p^{s,\infty}(\mathbb{R}^m)$ is defined as the set of all functions in $L_p(\mathbb{R}^m)$ such that

\[
\left( \int_{\mathbb{R}^m} |\Delta_h^r(f, x)|^p \, dx \right)^{1/p} \le M |h|^s \tag{2.3}
\]

for all $h \in \mathbb{R}^m$ and some finite $M$. This space is equipped with the seminorm $|f|_{B_p^{s,\infty}} := \sup_{t>0} t^{-s} \omega_r(f, t)_p$ and the norm $\|f\|_{B_p^{s,\infty}} := |f|_{B_p^{s,\infty}} + \|f\|_{p,\lambda}$. Note that $|\cdot|_{B_p^{s,\infty}}$ is a seminorm because if $f$ is a polynomial of degree less than $r$, then $\Delta_h^r(f, x) = 0$, implying $|f|_{B_p^{s,\infty}} = 0$. A discussion of Besov spaces and their relation to Sobolev spaces can be found in Adams and Fournier (2003).

2.3. Dependence Condition. We introduce notation for the weak dependence condition satisfied by the stationary sequence $(X_i)_{i\in\mathbb{Z}}$. Let $\mathcal{F}_k := \sigma(X_i, i \le k)$ and $\mathcal{F}^k := \sigma(X_i, i \ge k)$ be, respectively, the sub σ-algebras generated by $(X_i)_{i \le k}$ and $(X_i)_{i \ge k}$. We say that $(X_i)_{i\in\mathbb{Z}}$ is α mixing if $\lim_n \alpha(\mathcal{F}_0, \mathcal{F}^n) = 0$, where

\[
\alpha(\mathcal{F}_0, \mathcal{F}^n) := \sup_{A,B} |\Pr(A \cap B) - \Pr(A)\Pr(B)| = \sup_{A,B} |\mathrm{cov}(I_A, I_B)|, \tag{2.4}
\]

and $A \in \mathcal{F}_0$, $B \in \mathcal{F}^n$. We call $\alpha_n := \alpha(\mathcal{F}_0, \mathcal{F}^n)$ the strong mixing coefficient of $(X_i)_{i\in\mathbb{Z}}$.

2.4. Statement of Result. We have the following equicontinuity inequality for U-processes.

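THEOREM 2.1. Suppose $(X_i)_{i\in\mathbb{Z}}$ has strong mixing coefficients satisfying $\sup_{j>0} j^m \alpha_j < \infty$. Let $\mathcal{F}$ be a class of symmetric functions such that $\mathcal{F} \subset B_p^{s,\infty}(\mathbb{R}^m) \cap L_2(\mathbb{R}^m)$ where $s \in (m/p, \infty)$ and $p \in [1, 2]$. Then, $\forall J \in \mathbb{N}$, and $\gamma > 0$,

\[
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(m)}(f) - U_n^{(m)}(g) \right| \right\|_{2,P} \lesssim n^{-1/2} \left( \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}} 2^{J(m/p-s)} + \gamma 2^{Jm/2} \right)
\]

and

\[
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(k)} \pi_{k,m}^{P} f - U_n^{(k)} \pi_{k,m}^{P} g \right| \right\|_{2,P} \lesssim n^{-k/2} \left( \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}} 2^{J(m/p-s)} + \gamma 2^{Jm/2} \right).
\]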

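REMARK 2.1. Clearly, Theorem 2.1 implies

\[
\left\| \sup_{f \in \mathcal{F}} \left| U_n^{(m)}(f) \right| \right\|_{2,P} \lesssim n^{-1/2} \sup_{f \in \mathcal{F}} \left( \|f\|_{B_p^{s,\infty}} 2^{J(m/p-s)} + \|f\|_{2,\lambda} 2^{Jm/2} \right).
\]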
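Since the bound of Theorem 2.1 holds for every $J \in \mathbb{N}$, the resolution level can be tuned to balance the two terms inside the bracket: the first decreases in $J$ (because $s > m/p$) while the second increases. The Python sketch below is only an illustration of this trade-off under assumed numerical values; the function names, the argument `besov_norm` (standing for $\sup_f \|f\|_{B_p^{s,\infty}}$) and the search cap `max_level` are hypothetical and not part of the paper.

```python
def bracket(J, gamma, besov_norm, s, m, p):
    """Bracketed factor of the bound in Theorem 2.1 (without n^{-1/2}):
    besov_norm * 2^{J(m/p - s)} + gamma * 2^{J m / 2}."""
    return besov_norm * 2.0 ** (J * (m / p - s)) + gamma * 2.0 ** (J * m / 2)


def best_level(gamma, besov_norm, s, m, p, max_level=60):
    """Resolution level J in {0, ..., max_level} minimising the bracket.
    max_level is an arbitrary cap chosen for this sketch."""
    return min(range(max_level + 1),
               key=lambda J: bracket(J, gamma, besov_norm, s, m, p))


# Illustrative values only: m = 1, p = 2, s = 2, so that s > m/p.
coarse = best_level(gamma=1e-3, besov_norm=1.0, s=2.0, m=1, p=2.0)
fine = best_level(gamma=1e-6, besov_norm=1.0, s=2.0, m=1, p=2.0)
```

A smaller $\gamma$ allows a larger $J$ before the second term dominates, which is the reason the two terms are balanced when $\gamma$ is of the same order as $2^{J(m/p - m/2 - s)}$.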

REMARK 2.2. Note that $B_p^{s,\infty}$ can be embedded into $B_{p'}^{s',\infty}$ as long as $p < p'$ and $s - m/p = s' - m/p' > 0$ (e.g. Theorem 2.7.1 in Triebel, 1983). Given that the statement of Theorem 2.1 depends on $s - m/p > 0$ only, we could choose $p = 2$ with no loss of generality. To simplify reference to some results to be used, we do not make use of this embedding.

2.5. Application: Donsker Theorem for U-Processes. As an application of Theorem 2.1 consider a U-process with non-degenerate first term in its Hoeffding's decomposition. Then,

\[
\left( \sqrt{n} \left( U_n^{(m)}(f) - P^m f \right) \right)_{f \in \mathcal{F}} = \left( m \sqrt{n}\, U_n^{(1)} \pi_{1,m}^{P} f + \sqrt{n} \sum_{k=2}^{m} \binom{m}{k} U_n^{(k)} \pi_{k,m}^{P} f \right)_{f \in \mathcal{F}}.
\]

Under the conditions of Theorem 2.1, for $\mathcal{F} \subset B_p^{s,\infty}(\mathbb{R}^m) \cap L_2(\mathbb{R}^m)$ the U-process is stochastically equicontinuous (setting $\gamma \asymp 2^{J(m/p - m/2 - s)}$) and has the same limiting distribution as $\left( m \sqrt{n}\, U_n^{(1)} \pi_{1,m}^{P} f \right)_{f \in \mathcal{F}}$ because

\[
\sqrt{n} \left\| \sup_{f \in \mathcal{F}} \left| \sum_{k=2}^{m} \binom{m}{k} U_n^{(k)} \pi_{k,m}^{P} f \right| \right\|_{2,P} \lesssim n^{-1/2}
\]

by Theorem 2.1. By Theorem 2.1 we also know that $\left( m \sqrt{n}\, U_n^{(1)} \pi_{1,m}^{P} f \right)_{f \in \mathcal{F}}$ is stochastically equicontinuous. Then, if $\mathcal{F}$ is totally bounded with respect to (the metric induced by) $\|\cdot\|_{2,\lambda}$, to show that $\left( \sqrt{n} ( U_n^{(m)}(f) - P^m f ) \right)_{f \in \mathcal{F}}$ converges to a Gaussian process with $\|\cdot\|_{2,\lambda}$ continuous sample paths, we only need finite dimensional convergence of $m \sqrt{n}\, U_n^{(1)} \pi_{1,m}^{P} f$ (e.g. van der Vaart and Wellner, 2000, Theorems 1.5.4, 1.5.7). This follows by an application of the central limit theorem for strongly mixing sequences (e.g. Rio, 2000, Theorem 4.2). Hence we have easily proved the following.

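COROLLARY 2.1. Suppose $\mathcal{F}$ is a totally bounded (with respect to $\|\cdot\|_{2,\lambda}$) class of symmetric functions in $B_p^{s,\infty}(\mathbb{R}^m) \cap L_2(\mathbb{R}^m)$ where $s \in (m/p, \infty)$ and $p \in [1, 2]$. Suppose the strong mixing coefficients satisfy $\sup_{j>0} j^m \alpha_j < \infty$. Then, $\left( \sqrt{n} ( U_n^{(m)}(f) - P^m f ) \right)_{f \in \mathcal{F}}$ converges weakly to a mean zero Gaussian process $(G(f))_{f \in \mathcal{F}}$ with a.s. continuous sample paths and covariance function

\[
E\, G(f) G(g) = m^2\, \mathrm{Cov}\!\left( \pi_{1,m}^{P} f(X_1), \pi_{1,m}^{P} g(X_1) \right) + m^2 \sum_{i=1}^{\infty} \left[ \mathrm{Cov}\!\left( \pi_{1,m}^{P} f(X_1), \pi_{1,m}^{P} g(X_{1+i}) \right) + \mathrm{Cov}\!\left( \pi_{1,m}^{P} f(X_{1+i}), \pi_{1,m}^{P} g(X_1) \right) \right].
\]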

3. PROOF

The proof of Theorem 2.1 relies on multidimensional wavelet representation for functions in $B_p^{s,\infty}$. Details on this can be found in the book of Meyer (1992) and the review article of DeVore and Lucier (1992). Since $\|f\|_{2,\lambda} < \infty$, $f : \mathbb{R}^m \to \mathbb{R}$ admits the following multiresolution representation via wavelet expansion

\[
f = \sum_{\theta \in \mathbb{Z}^m} b_\theta^f \varphi_\theta + \sum_{j=0}^{\infty} \sum_{\theta \in \Theta_j} a_\theta^f \psi_\theta \tag{3.1}
\]

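where $\Theta_j := 2^{-j-1}\mathbb{Z}^m \setminus 2^{-j}\mathbb{Z}^m$ and $\{\varphi_\theta : \theta \in \mathbb{Z}^m\}$, $\{\psi_\theta : \theta \in \Theta_j, j \in \mathbb{N}\}$ are functions which can be chosen to have compact support in $\mathbb{R}^m$. In particular, $\{\varphi_\theta : \theta \in \mathbb{Z}^m\}$ is a father wavelet, while $\{\psi_\theta : \theta \in \Theta_j, j \ge 0\}$ are mother wavelets (e.g. Meyer, 1992, ch.3.1). The multidimensional wavelets can be constructed from wavelets on $\mathbb{R}$ by tensor product method (Meyer, 1992, ch.3.3). Let $\varphi$ and $\psi$ be wavelets on $\mathbb{R}$ at resolution level $j = 0$. To ease notation define $\psi^0 := \varphi$ and $\psi^1 := \psi$. Hence,

\[
\psi_\theta(x_1, \dots, x_m) = \sum_{(\epsilon_1, \dots, \epsilon_m) \in \{0,1\}^m \setminus \{0\}^m} \prod_{k=1}^{m} 2^{j/2} \psi^{\epsilon_k}\!\left( 2^j x_k - q_k \right) = \sum_{(\epsilon_1, \dots, \epsilon_m) \in \{0,1\}^m \setminus \{0\}^m} \prod_{k=1}^{m} 2^{j/2} \psi_{j q_k}^{\epsilon_k}(x_k) \tag{3.2}
\]

when $\theta = 2^{-j}(q_1 - \epsilon_1, \dots, q_m - \epsilon_m) \in \Theta_j$ and $\epsilon_k \in \{0, 1\}$ with $\sum_{k=1}^{m} \epsilon_k \ge 1$, and $q_k \in \mathbb{Z}$. (Recall that $\Theta_j := 2^{-j-1}\mathbb{Z}^m \setminus 2^{-j}\mathbb{Z}^m$, hence the point $\theta = 2^{-j}(q_1, \dots, q_m)$ must be excluded.) On the other hand,

\[
\varphi_\theta(x_1, \dots, x_m) = \prod_{k=1}^{m} 2^{j/2} \varphi\!\left( 2^j x_k - q_k \right) \tag{3.3}
\]

when $\theta = 2^{-j}(q_1, \dots, q_m) \in 2^{-j}\mathbb{Z}^m$. The functions $\varphi$ and $\psi$ are bounded and have compact support. While in (3.1) the father wavelet is computed at a resolution level $j = 0$, in the proof we will need to consider the father wavelet at the resolution level $J > 0$, where $J$ is as in Theorem 2.1. In fact, we recall the following identity

\[
\sum_{\theta \in \mathbb{Z}^m} b_\theta^f \varphi_\theta + \sum_{j=0}^{J} \sum_{\theta \in \Theta_j} a_\theta^f \psi_\theta = \sum_{\theta \in 2^{-J}\mathbb{Z}^m} b_\theta^f \varphi_\theta. \tag{3.4}
\]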
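The mechanism behind an identity of this type can be checked numerically in the simplest possible setting, the Haar wavelet in dimension $m = 1$ over a single resolution step. This is only a sketch: the Haar system is used because it is explicit, although it does not have the regularity required of the wavelets in the proof, and all function names are assumptions of this example. For a function equal to $a$ on $[0, 1/2)$ and $b$ on $[1/2, 1)$, the coarse father-wavelet term plus one level of detail coincides with the expansion in father wavelets one level finer.

```python
import math


def phi(x):
    """Haar father wavelet: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def psi(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def coarse_plus_detail(a, b, x):
    """Coarse approximation plus one detail level for the step function
    taking the value a on [0, 1/2) and b on [1/2, 1)."""
    father_coeff = (a + b) / 2  # integral of f * phi
    mother_coeff = (a - b) / 2  # integral of f * psi
    return father_coeff * phi(x) + mother_coeff * psi(x)


def fine_scale(a, b, x):
    """Expansion of the same function in the finer father wavelets
    sqrt(2) * phi(2x - q), q = 0, 1."""
    coeffs = [a / math.sqrt(2), b / math.sqrt(2)]  # integrals of f * sqrt(2) * phi(2x - q)
    return sum(c * math.sqrt(2) * phi(2 * x - q) for q, c in enumerate(coeffs))
```

Evaluating both functions at any point of $[0, 1)$ returns the same value up to rounding, namely $a$ on the left half and $b$ on the right half.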

When $f \in B_p^{s,\infty}$, the wavelet coefficients can be related to $\|f\|_{B_p^{s,\infty}}$ by appropriate choice of the father and mother wavelets $\varphi$ and $\psi$. In this case, the wavelets are chosen to be $r = \lfloor s \rfloor + 1$ regular (an index of smoothness for the wavelets, Meyer, 1992, ch.2.2) so that there exists an integer $M < \infty$ (growing linearly in $r$) such that the support of $\varphi$ and $\psi$ is in $[(1 - M)/2, (1 + M)/2]$, implying that the support of $\varphi(2^j x - q_k)$ and $\psi(2^j x - q_k)$ (in terms of $x$) is in $2^{-j}[q_k + (1 - M)/2, q_k + (1 + M)/2]$, $q_k \in \mathbb{Z}$. Then, it follows (Meyer, 1992, ch.6.10, with the aid of Lemma 8.1 in Rio, 2000) that if $f \in B_p^{s,\infty}(\mathbb{R}^m)$ and $p \in [1, 2]$

\[
\left( \sum_{\theta \in \mathbb{Z}^m} \left| b_\theta^f \right|^2 \right)^{1/2} \lesssim \|f\|_{B_p^{s,\infty}}
\]

and

\[
\left( \sum_{\theta \in \Theta_j} \left| a_\theta^f \right|^2 \right)^{1/2} \lesssim \|f\|_{B_p^{s,\infty}}\, 2^{j(m/p - s - m/2)}. \tag{3.5}
\]

The goal is to substitute the kernel $f$ with its wavelet representation essentially given by the sum of (3.2), where (3.2) is the sum of products of univariate functions. Hence, the right-most side of (3.2) will be used as kernel in the proof. Then, as in Arcones and Giné (1992), we will represent the U-process as the product of powers of partial sums. To control these powers of partial sums, we will then use moment inequalities for powers of strongly mixing partial sums (Rio, 2000). The proof of Theorem 2.1 relies on a sequence of lemmata that formalize the mentioned ideas.

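LEMMA 3.1. Let $\mathcal{F}$ be a class of symmetric functions such that $\mathcal{F} \subset B_p^{s,\infty}(\mathbb{R}^m) \cap L_2(\mathbb{R}^m)$ where $s > m/p$ and $p \in [1, 2]$. Then, $\forall J \in \mathbb{N}$, and $\gamma > 0$,

\[
\begin{aligned}
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(m)}(f) - U_n^{(m)}(g) \right| \right\|_{2,P}
\le{} & \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}}\, 2^{J(m/p-s)} (2^m - 1) \max_{j > J}\, \max_{\epsilon \in \{0,1\}^m \setminus \{0\}^m} \left\| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right\|_{2,P} \\
& + \gamma\, 2^{Jm/2} \left\| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{J q_k}^{0} \right) \right\|_{2,P},
\end{aligned}
\]

where $\psi_{j q_k}^{\epsilon_k}$ is as in (3.2), $\epsilon := (\epsilon_1, \dots, \epsilon_m)$, and $\left( (e_{q_k}^{(k)})_{q_k \in \mathbb{Z}};\ k = 1, \dots, m \right)$ are iid sequences of Rademacher random variables (i.e. $\Pr(e_{q_k}^{(k)} = 1) = \Pr(e_{q_k}^{(k)} = -1) = 1/2$) independent of each other and independent of $(X_i)_{i\in\mathbb{Z}}$.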

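Proof of Lemma 3.1. Using the notation above, define

\[
\Pi_J f := \sum_{\theta \in \mathbb{Z}^m} b_\theta^f \varphi_\theta + \sum_{j=0}^{J} \sum_{\theta \in \Theta_j} a_\theta^f \psi_\theta.
\]

Clearly,

\[
\left| U_n^{(m)}(f) - U_n^{(m)}(g) \right| \le \left| U_n^{(m)}(f) - U_n^{(m)}(\Pi_J f) \right| + \left| U_n^{(m)}(g) - U_n^{(m)}(\Pi_J g) \right| + \left| U_n^{(m)}(\Pi_J f) - U_n^{(m)}(\Pi_J g) \right|.
\]

Then,

\[
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(m)}(f) - U_n^{(m)}(g) \right| \right\|_{2,P}
\le 2 \left\| \sup_{f \in \mathcal{F}} \left| U_n^{(m)}(f) - U_n^{(m)}(\Pi_J f) \right| \right\|_{2,P}
+ 2 \left\| \sup_{\|f\|_{2,\lambda} \le \gamma} \left| U_n^{(m)}(\Pi_J f) \right| \right\|_{2,P}
= I + II.
\]

Control over I. By the Cauchy-Schwarz inequality,

\[
\left| U_n^{(m)}(f) - U_n^{(m)}(\Pi_J f) \right| = \left| \sum_{j > J} \sum_{\theta \in \Theta_j} a_\theta^f\, U_n^{(m)}(\psi_\theta) \right|
\le \sum_{j > J} \left( \sum_{\theta \in \Theta_j} \left| a_\theta^f \right|^2 \right)^{1/2} \left( \sum_{\theta \in \Theta_j} \left| U_n^{(m)}(\psi_\theta) \right|^2 \right)^{1/2}.
\]

Therefore, from (3.5), we have

\[
\left\| \sup_{f \in \mathcal{F}} \left| U_n^{(m)}(f) - U_n^{(m)}(\Pi_J f) \right| \right\|_{2,P}
\le \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}} \sum_{j > J} 2^{j(m/p-s)} \left( \sum_{\theta \in \Theta_j} 2^{-jm}\, E \left| U_n^{(m)}(\psi_\theta) \right|^2 \right)^{1/2}. \tag{3.6}
\]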

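Let $(e_{q_k}^{(k)})_{q_k \in \mathbb{Z}}$ ($k = 1, \dots, m$) be as in the statement of the lemma. Then, from (3.2),

\[
\sum_{\theta \in \Theta_j} 2^{-jm}\, E \left| U_n^{(m)}(\psi_\theta) \right|^2
= \sum_{(q_1, \dots, q_m) \in \mathbb{Z}^m}\ \sum_{(\epsilon_1, \dots, \epsilon_m) \in \{0,1\}^m \setminus \{0\}^m} 2^{-jm}\, E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} 2^{j/2} \psi_{j q_k}^{\epsilon_k} \right) \right|^2
\]

[using (3.2), the sum over $\theta$ is equal to the sum over $(q_1, \dots, q_m)$]

\[
= \sum_{(\epsilon_1, \dots, \epsilon_m) \in \{0,1\}^m \setminus \{0\}^m} E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right|^2,
\]

by Lemma 3.2 (stated at the end of the proof) and noting that the $2^{-jm}$ simplifies with the $2^{jm/2}$ inside the squared absolute value. Hence, substituting the last display in (3.6),

\[
\begin{aligned}
I &\le \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}} \sum_{j > J} 2^{j(m/p-s)} \sum_{(\epsilon_1, \dots, \epsilon_m) \in \{0,1\}^m \setminus \{0\}^m} \left\| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right\|_{2,P} \\
&\lesssim \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}}\, 2^{J(m/p-s)} (2^m - 1) \max_{j > J}\, \max_{\epsilon \in \{0,1\}^m \setminus \{0\}^m} \left\| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right\|_{2,P}
\end{aligned}
\]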

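because $\sum_{j > J} 2^{j(m/p-s)} \asymp 2^{J(m/p-s)}$ when $(m/p - s) < 0$ (a geometric series with ratio $2^{m/p-s} < 1$), and there are $(2^m - 1)$ elements in the sum over $(\epsilon_1, \dots, \epsilon_m)$.

Control over II. Note that $\Pi_J f$ is the projection of $f$ onto the space spanned by the father wavelet $\varphi_\theta$ at the $J$th resolution level, i.e.

\[
\Pi_J f = \sum_{\theta \in 2^{-J}\mathbb{Z}^m} b_\theta^f \varphi_\theta,
\]

so that by the Cauchy-Schwarz inequality,

\[
\left| U_n^{(m)}(\Pi_J f) \right| = \left| \sum_{\theta \in 2^{-J}\mathbb{Z}^m} b_\theta^f\, U_n^{(m)}(\varphi_\theta) \right|
\le \left( \sum_{\theta \in 2^{-J}\mathbb{Z}^m} \left| b_\theta^f \right|^2 \right)^{1/2} \left( \sum_{\theta \in 2^{-J}\mathbb{Z}^m} \left| U_n^{(m)}(\varphi_\theta) \right|^2 \right)^{1/2}.
\]

Hence, using the same notation and argument as in the control over I, together with the last display and (3.3), we have

\[
II \le \gamma \left( \sum_{\theta \in 2^{-J}\mathbb{Z}^m} E \left| U_n^{(m)}(\varphi_\theta) \right|^2 \right)^{1/2}
= \gamma \left\| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)}\, 2^{J/2} \psi_{J q_k}^{0} \right) \right\|_{2,P},
\]

because $\left( \sum_{\theta \in 2^{-J}\mathbb{Z}^m} | b_\theta^f |^2 \right)^{1/2} \le \gamma$ if $\|f\|_{2,\lambda} \le \gamma$, using (3.1) and (3.4) (recall that the wavelets are orthonormal functions with respect to the Lebesgue measure $\lambda$). The following is used in the previous proof.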

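LEMMA 3.2. Suppose that $\left( (e_{q_k}^{(k)})_{q_k \in \mathbb{Z}};\ k = 1, \dots, m \right)$ are iid sequences of Rademacher random variables (i.e. $\Pr(e_{q_k}^{(k)} = 1) = \Pr(e_{q_k}^{(k)} = -1) = 1/2$) independent of each other and independent of $(X_i)_{i\in\mathbb{Z}}$. Then,

\[
\sum_{(q_1, \dots, q_m) \in \mathbb{Z}^m} E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k} \right) \right|^2
= E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right|^2.
\]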

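Proof of Lemma 3.2. By definition of the U-process,

\[
\left( \frac{n!}{(n-m)!} \right)^2 E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k} \right) \right|^2
= E \left| \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right|^2.
\]

Hence,

\[
\begin{aligned}
E \left| \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right|^2
&= E \left| \sum_{(q_1, \dots, q_m) \in \mathbb{Z}^m} e_{q_1}^{(1)} \cdots e_{q_m}^{(m)} \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right|^2 \\
&= \sum_{(q_1, \dots, q_m, q_1', \dots, q_m') \in \mathbb{Z}^{2m}} E \left[ e_{q_1}^{(1)} e_{q_1'}^{(1)} \cdots e_{q_m}^{(m)} e_{q_m'}^{(m)} \left( \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right) \left( \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k'}^{\epsilon_k}(X_{i_k}) \right) \right] \\
&= \sum_{(q_1, \dots, q_m, q_1', \dots, q_m') \in \mathbb{Z}^{2m}} E \left[ e_{q_1}^{(1)} e_{q_1'}^{(1)} \cdots e_{q_m}^{(m)} e_{q_m'}^{(m)} \right] \times E \left[ \left( \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right) \left( \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k'}^{\epsilon_k}(X_{i_k}) \right) \right]
\end{aligned}
\]

[by independence of the Rademacher r.v.'s and the $X$'s]

\[
= \sum_{(q_1, \dots, q_m) \in \mathbb{Z}^m} E \left| \sum_{(i_1, \dots, i_m) \in I_m^n} \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k}(X_{i_k}) \right|^2,
\]

because the Rademacher variables are independent of each other and have mean zero and variance one, so that only the terms with $q_k = q_k'$ for every $k$ survive. The term on the right hand side of the last equality is, by definition, equal to

\[
\left( \frac{n!}{(n-m)!} \right)^2 \sum_{(q_1, \dots, q_m) \in \mathbb{Z}^m} E \left| U_n^{(m)}\!\left( \prod_{k=1}^{m} \psi_{j q_k}^{\epsilon_k} \right) \right|^2.
\]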
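The orthogonality mechanism used in this proof can be verified exactly in a finite toy case with $m = 2$. Conditionally on the observations the U-statistic is linear in its kernel, so the randomized statistic is a sum of terms $e_{q_1}^{(1)} e_{q_2}^{(2)} a_{q_1 q_2}$, where $a_{q_1 q_2}$ stands for the value of the U-statistic with kernel $\psi_{j q_1}^{\epsilon_1} \psi_{j q_2}^{\epsilon_2}$ at fixed data. The Python sketch below, in which the function name and the array of values are hypothetical, averages the square of this sum over all sign patterns and compares it with the sum of squares.

```python
from itertools import product


def second_moment_rademacher(a):
    """Exact E | sum_{q1, q2} e1[q1] * e2[q2] * a[q1][q2] |^2, obtained by
    enumerating every sign pattern of two independent Rademacher vectors."""
    n1, n2 = len(a), len(a[0])
    total = 0
    count = 0
    for e1 in product((-1, 1), repeat=n1):
        for e2 in product((-1, 1), repeat=n2):
            s = sum(e1[i] * e2[j] * a[i][j]
                    for i in range(n1) for j in range(n2))
            total += s * s
            count += 1
    return total / count


values = [[1, 2, 3], [4, 5, 6]]  # hypothetical conditional U-statistic values
randomized = second_moment_rademacher(values)
sum_of_squares = sum(v * v for row in values for v in row)
```

The two quantities coincide because all cross terms have zero expectation, which is the finite-dimensional analogue of the last equality in the proof.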


The U-statistics on the right hand side of the bound of Lemma 3.1 can be bounded using the fact that the U-kernel is the product of functions on R.

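LEMMA 3.3. Define

\[
\phi_k(x_k) := \sum_{q_k \in \mathbb{Z}} e_{q_k}^{(k)} \psi_{j q_k}^{\epsilon_k}(x_k), \quad k = 1, \dots, m, \tag{3.7}
\]

where $\psi_{j q_k}^{\epsilon_k}$ and $(e_{q_k}^{(k)})_{q_k \in \mathbb{Z}}$ are as in Lemma 3.1. Then,

\[
\left\| U_n^{(k)} \pi_{k,m}^{P} \prod_{l=1}^{m} \phi_l \right\|_{2,P}
\lesssim \left| P \phi_1 \right|^{m-k} \left[ n^{-2k} \sum_{1 \le i_1, \dots, i_{2k} \le n} \left| E \prod_{s=1}^{2k} (1 - P) \phi_{i_s}(X_{i_s}) \right| \right]^{1/2}.
\]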

Proof of Lemma 3.3. Let $\kappa_m^{\phi}(x_1, \dots, x_m) := \phi(x_1) \cdots \phi(x_m)$, for some bounded function $\phi$. Then,

\[
\begin{aligned}
\pi_{k,m}^{P} \kappa_m^{\phi}(x_1, \dots, x_k) &= (\delta_{x_1} - P) \cdots (\delta_{x_k} - P)\, P^{m-k} \kappa_m^{\phi}(X_1, \dots, X_m) \\
&= (\delta_{x_1} - P) \phi(X_1) \cdots (\delta_{x_k} - P) \phi(X_k)\, (P \phi(X_1))^{m-k},
\end{aligned}
\]

and

\[
\pi_{k,m}^{P} \kappa_m^{\phi}(X_1, \dots, X_k) = \kappa_k^{(1-P)\phi}(X_1, \dots, X_k)\, (P \phi(X_1))^{m-k}. \tag{3.8}
\]

Define $\tilde{U}_n^{(k)} := \binom{n}{k} U_n^{(k)}$, noting that $n^{-k} \tilde{U}_n^{(k)} \asymp U_n^{(k)}$. Then,

\[
E \left| \tilde{U}_n^{(k)} \kappa_k^{(1-P)\phi} \right|^2 = E \left| \sum_{(i_1, \dots, i_k) \in I_k^n} \prod_{s=1}^{k} (1 - P) \phi(X_{i_s}) \right|^2
\le \sum_{(i_1, \dots, i_{2k}) \in \{1, \dots, n\}^{2k}} \left| E \prod_{s=1}^{2k} (1 - P) \phi(X_{i_s}) \right|,
\]

where we have expanded the square and taken absolute values so that the terms in the summation are positive, allowing us to change the indices of summation from $I_{2k}^n$ (recall we are taking squares) to $\{1, \dots, n\}^{2k}$ ($I_{2k}^n \subset \{1, \dots, n\}^{2k}$). This inequality together with (3.8) gives the result.

The final step is to bound the $k$th power of the partial sum of the function in (3.7).

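LEMMA 3.4. Suppose the strong mixing coefficients of $(X_i)_{i\in\mathbb{Z}}$ satisfy $\sup_{j>0} j^k \alpha_j < \infty$. Then,

\[
n^{-2k} \sum_{1 \le i_1, \dots, i_{2k} \le n} \left| E \prod_{s=1}^{2k} (1 - P) \phi_{i_s}(X_{i_s}) \right| \lesssim n^{-k},
\]

where $\phi_i$ is as in Lemma 3.3.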

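Proof of Lemma 3.4. At first we show that

\[
\sup_{x \in \mathbb{R}} |\phi_i(x)| = \sup_{x \in \mathbb{R}} \left| \sum_{q_i \in \mathbb{Z}} e_{q_i}^{(i)} \psi^{\epsilon_i}\!\left( 2^j x - q_i \right) \right| \lesssim 1,
\]

recalling the definition of $\phi_i(x)$ in Lemma 3.3 and using the notation in (3.2). In the remarks about wavelets we mentioned that there is a positive integer $M < \infty$ (depending linearly on the index of regularity $r$) such that $|\psi^{\epsilon_i}(2^j x - q_i)| \lesssim 1$ if $x \in 2^{-j}[q_i + (1 - M)/2, q_i + (1 + M)/2]$, and $\psi^{\epsilon_i}(2^j x - q_i) = 0$ otherwise. Hence,

\[
|\phi_i(x)| = \left| \sum_{q_i \in \mathbb{Z}} e_{q_i}^{(i)} \psi^{\epsilon_i}\!\left( 2^j x - q_i \right) \right|
\le \sum_{q_i \in \mathbb{Z}} \left| \psi^{\epsilon_i}\!\left( 2^j x - q_i \right) \right|
\le M \sup_{x \in \mathbb{R}} \left| \psi^{\epsilon_i}(x) \right| \lesssim 1,
\]

because the wavelets are bounded and, for arbitrary but fixed $x$, there are at most $M$ non-zero elements in the summation of the above display. Clearly,

\[
n^{-2k} \sum_{1 \le i_1, \dots, i_{2k} \le n} \left| E \prod_{s=1}^{2k} (1 - P) \phi_{i_s}(X_{i_s}) \right|
\asymp n^{-2k} \sum_{1 \le i_1 \le \cdots \le i_{2k} \le n} \left| E \prod_{s=1}^{2k} (1 - P) \phi_{i_s}(X_{i_s}) \right|
\]

(e.g. eq. 2.16 in Rio, 2000). Then, for bounded $\phi_i$, the right hand side of the above display is of order $n^{-k}$ if $\sup_{j>0} j^k \alpha_j < \infty$ (see Rio, 2000, eq. 2.16, 2.23 and Lemma 2.2).

Since $\pi_{k,m}^{P}$ does not affect the wavelet coefficients, Lemmata 3.1, 3.3 and 3.4 imply the following, from which Theorem 2.1 follows as a corollary.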

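LEMMA 3.5. Under the conditions of Theorem 2.1,

\[
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(k)} \pi_{k,m}^{P} (f - g) \right| \right\|_{2,P}
\lesssim n^{-k/2} \left( \sup_{f \in \mathcal{F}} \|f\|_{B_p^{s,\infty}}\, 2^{J(m/p-s)} + \gamma\, 2^{Jm/2} \right).
\]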

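We can now prove Theorem 2.1.

Proof of Theorem 2.1. From (2.2), we deduce

\[
\left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| (1 - P^m)\, U_n^{(m)}(f - g) \right| \right\|_{2,P}
\le \sum_{k=1}^{m} \binom{m}{k} \left\| \sup_{\substack{f,g \in \mathcal{F} \\ \|f-g\|_{2,\lambda} \le \gamma}} \left| U_n^{(k)} \pi_{k,m}^{P} (f - g) \right| \right\|_{2,P}.
\]

Hence, applying Lemma 3.5 to each term in the summation gives the result.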

REFERENCES

[1] Adams, R.A. and J.J.F. Fournier (2003) Sobolev Spaces. Amsterdam: Academic Press.
[2] Arcones, M.A. and E. Giné (1992) On the Bootstrap of U and V Statistics. The Annals of Statistics 20, 655-674.
[3] Arcones, M.A. and E. Giné (1993) Limit Theorems for U-Processes. The Annals of Probability 21, 1494-1542.
[4] Arcones, M.A. and B. Yu (1994) Central Limit Theorems for Empirical and U-Processes of Stationary Mixing Sequences. Journal of Theoretical Probability 7, 47-71.
[5] Birgé, L. and P. Massart (2000) An Adaptive Compression Algorithm in Besov Spaces. Constructive Approximation 16, 1-36.
[6] Borovkova, S., R. Burton and H. Dehling (2001) Limit Theorems for Functionals of Mixing Processes with Applications to U-Statistics and Dimension Estimation. Transactions of the American Mathematical Society 353, 4261-4318.
[7] Cavanagh, C. and R.P. Sherman (1998) Rank Estimators for Monotonic Index Models. Journal of Econometrics 84, 351-381.
[8] Denker, M. and G. Keller (1986) Rigorous Statistical Procedures for Data from Dynamical Systems. Journal of Statistical Physics 44, 67-93.
[9] DeVore, R.A. and B.J. Lucier (1992) Wavelets. Acta Numerica, 1-56.
[10] Fan, Y. and Q. Li (1999) Central Limit Theorem for Degenerate U-Statistics of Absolutely Regular Processes with Applications to Model Specification Testing. Journal of Nonparametric Statistics 10, 245-271.
[11] Fan, Y. and A. Ullah (1999) On Goodness-of-Fit Tests for Weakly Dependent Processes Using Kernel Methods. Journal of Nonparametric Statistics 11, 337-360.
[12] Ghosal, S., A. Sen and A.W. van der Vaart (2000) Testing Monotonicity of Regression. The Annals of Statistics 28, 1054-1082.
[13] Han, A.K. (1987) A Non-Parametric Analysis of Transformations. Journal of Econometrics 35, 191-209.
[14] Honoré, B.E. and J.L. Powell (1994) Pairwise Difference Estimators of Censored and Truncated Regression Models. Journal of Econometrics 64, 241-278.
[15] Meyer, Y. (1992) Wavelets and Operators. Cambridge: Cambridge University Press.
[16] Rio, E. (2000) Théorie Asymptotique des Processus Aléatoires Faiblement Dépendants. Paris: Springer.
[17] Serfling, R.J. (1980) Approximation Theorems of Mathematical Statistics. New York: Wiley.
[18] Triebel, H. (1983) Theory of Function Spaces. Basel: Birkhäuser.
[19] Van der Vaart, A. and J.A. Wellner (2000) Weak Convergence of Empirical Processes. New York: Springer.

FIRST, BNP Paribas
10 Harewood Avenue, London NW1 6AA
E-mail: [email protected]

Received on 06/03/2008; last revised version on xx.xx.xxxx