Inférence non-paramétrique pour des interactions poissoniennes (Nonparametric inference for Poissonian interactions)
PhD defense, Laure Sansonnet, Université Paris-Sud

Friday, June 14, 2013


Introduction

Examples of interactions: in economics, understanding bankruptcies spreading by contagion; in neuroscience, the interactions between neurons underlying cortical activity; ...


Introduction: context of genomics
DNA = a double helix composed of two complementary strands.
DNA strand = a sequence of nucleotides in {a,c,g,t}.
Motif = a word in the DNA alphabet {a,c,g,t}.
→ Work on the distribution of words in DNA sequences [Robin et al., 2005].
→ We work in a continuous framework.
→ Occurrences of a motif = a point process lying in [0; T], where T is the normalized length of the studied genome.
Aim: study favored or avoided distances between two given motifs along a genome.
Application on real data: the Escherichia coli genome.

Contents
1. Models and framework
   The bivariate Hawkes process
   Our Poissonian interactions model
   The Haar basis and a specific biorthogonal wavelet basis
2. Adaptive testing procedure
   Description of our testing procedure
   Main theoretical results
3. Adaptive estimation procedure
   Data-driven wavelet thresholding procedure
   Main theoretical results
   Simulations
   Application to genomic data
4. Lasso estimation for Poissonian interactions on the circle
   Formulation of the problem
   Our Lasso-type procedure
   The performance of the Lasso estimate
5. Conclusion and perspectives


Poisson process on the real line
Let N be a random countable set of points of ℝ (here). N_A denotes the number of points of N in A, and dN = Σ_{X∈N} δ_X.
N is a Poisson process if:
- whenever A_1, ..., A_ℓ are disjoint measurable sets, N_{A_1}, ..., N_{A_ℓ} are independent random variables;
- N_A obeys a Poisson law P(µ(A)), where µ is a measure called the "mean measure".
Generally, dµ(t) = h(t) dt, where h is the intensity of N. If h is constant, N is a homogeneous Poisson process; otherwise it is said to be inhomogeneous.
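To make the definition concrete, here is a minimal Python sketch (not part of the thesis) that simulates a Poisson process on [0; T] by thinning; the intensity function h and its bound h_max are illustrative assumptions.

```python
import numpy as np

def simulate_poisson(h, h_max, T, rng=None):
    """Simulate an inhomogeneous Poisson process on [0, T] by thinning.

    h     : intensity function t -> h(t) >= 0, bounded by h_max on [0, T]
    h_max : upper bound on h over [0, T]
    Returns the sorted array of accepted points.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Candidate points: homogeneous Poisson process with rate h_max.
    n_cand = rng.poisson(h_max * T)
    candidates = rng.uniform(0.0, T, size=n_cand)
    # Keep each candidate t independently with probability h(t) / h_max.
    keep = rng.uniform(size=n_cand) < h(candidates) / h_max
    return np.sort(candidates[keep])

# Example: a homogeneous (constant) versus an inhomogeneous intensity.
T = 100.0
homogeneous = simulate_poisson(lambda t: 2.0 + 0.0 * t, 2.0, T)
inhomogeneous = simulate_poisson(lambda t: 2.0 + np.sin(t) ** 2, 3.0, T)
print(len(homogeneous), len(inhomogeneous))
```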


The bivariate Hawkes process
A bivariate Hawkes process is defined by two point processes (N_t^(1))_{t∈ℝ} and (N_t^(2))_{t∈ℝ} with respective intensities, conditionally on the past,

λ^(1)(t) = ν^(1) + ∫_{−∞}^{t−} h_1^(1)(t − u) dN_u^(1) + ∫_{−∞}^{t−} h_2^(1)(t − u) dN_u^(2)

and

λ^(2)(t) = ν^(2) + ∫_{−∞}^{t−} h_2^(2)(t − u) dN_u^(2) + ∫_{−∞}^{t−} h_1^(2)(t − u) dN_u^(1),

with ν^(ℓ) > 0 (spontaneous apparition) and h_m^(ℓ) > 0 supported by ℝ_+ (self-excitation for m = ℓ, interaction with the other process for m ≠ ℓ).
A stationary version exists if ρ(Γ) < 1, with Γ_{ℓ,m} = ∫_0^{+∞} h_m^(ℓ)(u) du.
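To illustrate the two intensity formulas, the following hedged Python sketch evaluates λ^(1)(t) and λ^(2)(t) from two lists of past event times; the exponential kernels and all parameter values are assumptions chosen only for the example, not the kernels used in the thesis.

```python
import numpy as np

def exp_kernel(a, b):
    """h(u) = a * exp(-b * u) for u > 0, and 0 otherwise (illustrative kernel)."""
    return lambda u: a * np.exp(-b * u) * (u > 0)

def hawkes_intensity(t, own_events, other_events, nu, h_self, h_cross):
    """lambda^{(l)}(t) = nu^{(l)}
       + sum over the process's own past events of h_self(t - u)
       + sum over the other process's past events of h_cross(t - u)."""
    own = np.asarray(own_events)
    other = np.asarray(other_events)
    return (nu
            + h_self(t - own[own < t]).sum()
            + h_cross(t - other[other < t]).sum())

# Purely illustrative parameters and event times.
events_1 = [0.5, 1.2, 3.0]
events_2 = [0.8, 2.5]
lam1 = hawkes_intensity(4.0, events_1, events_2, nu=0.3,
                        h_self=exp_kernel(0.4, 1.0),   # plays the role of h_1^(1)
                        h_cross=exp_kernel(0.2, 1.0))  # plays the role of h_2^(1)
lam2 = hawkes_intensity(4.0, events_2, events_1, nu=0.5,
                        h_self=exp_kernel(0.3, 1.0),   # plays the role of h_2^(2)
                        h_cross=exp_kernel(0.1, 1.0))  # plays the role of h_1^(2)
print(lam1, lam2)
```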


Statistical inference for Hawkes processes
In the parametric framework: Ozaki (1979), Ogata and Akaike (1982), Daley and Vere-Jones (2003), ...
In the nonparametric framework:
Gusto and Schbath (2005) → maximum likelihood estimates of the coefficients of h_1^(1) and h_2^(1) on a spline basis, coupled with an AIC criterion.
  Pros: the FADO procedure is quite effective and produces smooth estimates.
  Cons: the criterion performs poorly for complex families of models; sparsity issues.
Reynaud-Bouret and Schbath (2010) → for the simple Hawkes process, an estimate of h_1^(1) based on a model selection principle, satisfying oracle and minimax properties.
  Pros: the procedure solves the sparsity issues and provides minimax properties for Hawkes processes for the first time.
  Cons: it manages only one motif; high computational cost.
Hansen et al. (2012) → an adaptive Lasso-type procedure for multivariate Hawkes processes, with oracle properties and an efficient, robust practical procedure.


Our Poissonian interactions model
[Illustration: parents U1, U2, U3, U4 on the line; each Ui spawns children from a Poisson process with intensity h(· − Ui), on top of background points.]
We observe the occurrences of both given motifs:
Parents: U1, ..., Un i.i.d. uniform random variables on [0; T].
Children: a Poisson process N with intensity ν + Σ_{i=1}^n h(· − Ui)
(each Ui gives birth to a Poisson process with intensity h(· − Ui), plus a homogeneous Poisson process with intensity ν).
Aim: estimate h or test some hypotheses on h, from the observation of U1, ..., Un and the realization of N.
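A minimal simulation sketch of this model, assuming for illustration an interaction function h with finite mass whose normalized shape can be sampled (the indicator-shaped h in the example is hypothetical): one Poisson cluster per parent is superposed with a homogeneous background of intensity ν.

```python
import numpy as np

def simulate_interactions(n, T, nu, h_mass, sample_child_offset, rng=None):
    """Simulate parents U_1..U_n ~ Uniform[0, T] and a children process N
    with intensity nu + sum_i h(. - U_i).

    h_mass              : integral of h (mean number of children per parent)
    sample_child_offset : function (rng, size) -> offsets drawn with density h / h_mass
    """
    rng = np.random.default_rng() if rng is None else rng
    parents = rng.uniform(0.0, T, size=n)
    children = []
    # Each parent U_i spawns Poisson(h_mass) children at positions U_i + offset.
    for u in parents:
        n_kids = rng.poisson(h_mass)
        children.append(u + sample_child_offset(rng, n_kids))
    # Homogeneous background with intensity nu on [0, T].
    n_background = rng.poisson(nu * T)
    children.append(rng.uniform(0.0, T, size=n_background))
    return np.sort(parents), np.sort(np.concatenate(children))

# Example: h = 2 x 1_{[0, 0.5]}, so h_mass = 1 and offsets are Uniform[0, 0.5].
parents, children = simulate_interactions(
    n=1000, T=10000.0, nu=0.05, h_mass=1.0,
    sample_child_offset=lambda rng, size: rng.uniform(0.0, 0.5, size=size))
print(len(parents), len(children))
```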


The Haar basis
Assumption: h ∈ L²(ℝ).
Decomposition of h on the Haar basis, obtained by dilations and translations of φ = 1_{[0;1]} and ψ = 1_{]1/2;1]} − 1_{[0;1/2]}:

h = Σ_{λ∈Λ} β_λ ϕ_λ   with   β_λ = ∫_ℝ h(x) ϕ_λ(x) dx,

where Λ = {λ = (j, k) : j ≥ −1, k ∈ ℤ} and, for all x ∈ ℝ and λ = (j, k) ∈ Λ,

ϕ_λ(x) = φ(x − k) if j = −1,   and   ϕ_λ(x) = 2^{j/2} ψ(2^j x − k) otherwise.

→ The Haar basis is essentially used for the practical procedures and the testing procedure.
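A small sketch, for a generic square-integrable h, that evaluates the Haar functions ϕ_(j,k) above and approximates β_λ = ∫ h ϕ_λ by numerical integration; it is only meant to make the indexing λ = (j, k) concrete, and the toy h is an assumption.

```python
import numpy as np

def phi(x):
    """Father wavelet: indicator of [0, 1]."""
    return ((x >= 0.0) & (x < 1.0)).astype(float)

def psi(x):
    """Mother wavelet: 1 on ]1/2, 1], -1 on [0, 1/2]."""
    return (((x > 0.5) & (x <= 1.0)).astype(float)
            - ((x >= 0.0) & (x <= 0.5)).astype(float))

def haar(j, k, x):
    """phi_lambda for lambda = (j, k): translated phi if j = -1, else 2^{j/2} psi(2^j x - k)."""
    if j == -1:
        return phi(x - k)
    return 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)

def haar_coefficient(h, j, k, lo=-2.0, hi=2.0, n_grid=100000):
    """Riemann/trapezoid approximation of beta_lambda = int h(x) phi_lambda(x) dx."""
    x = np.linspace(lo, hi, n_grid)
    return np.trapz(h(x) * haar(j, k, x), x)

# Example: a toy h supported on [0, 1].
h = lambda x: 4.0 * ((x >= 0.0) & (x <= 1.0))
print(haar_coefficient(h, -1, 0), haar_coefficient(h, 0, 0), haar_coefficient(h, 1, 0))
```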


A biorthogonal wavelet basis: an example
Decomposition of h ∈ L²(ℝ) on a particular biorthogonal wavelet basis, built by Cohen et al. (1992):

h = Σ_{λ∈Λ} β_λ ϕ̃_λ,   with   β_λ = ∫_ℝ h(x) ϕ_λ(x) dx.

Interest of this basis: the analysis wavelets ϕ_λ are piecewise constant functions, and the reconstruction wavelets ϕ̃_λ are typically smooth functions.
[Figure: top, φ and ψ; bottom, φ̃ and ψ̃.]
→ Implementation of low computational complexity algorithms.
→ Smooth reconstructions.


2. Adaptive testing procedure


Framework
Parents: U1, ..., Un & Children: N with intensity ν + Σ_{i=1}^n h(· − Ui).
Aim: test H0: "h = 0" against H1: "h ≠ 0".
We assume that supp(h) ⊂ [−1; 1]:
→ observation of the Ui's on [0; T] and realization of N on [−1; T + 1].
Decomposition of h on the Haar basis:

h = Σ_{λ∈Λ} β_λ ϕ_λ   with   β_λ = ∫_ℝ h(x) ϕ_λ(x) dx.

Remark: "h ≠ 0" ⇔ there exists at least one non-zero coefficient. If a coefficient β_(−1,k) is non-zero, then there exists at least one coefficient β_(j,k) with j ≥ 0 which is also non-zero.
→ We introduce the following subset Λ_m of Λ:

Λ_m = {λ = (j, k) ∈ Λ : j ≥ 0, −2^j ≤ k ≤ 2^j − 1}.


The single testing procedures
Let α ∈ ]0; 1[ and λ ∈ Λ_m. Aim: construct an α-level test of H0: "h = 0" against H1^λ: "β_λ ≠ 0".
1. We define the testing statistic by T̂_λ = |β̂_λ|, where β̂_λ = G(ϕ_λ)/n, with

G(ϕ_λ) = ∫_ℝ Σ_{i=1}^n ( ϕ_λ(t − U_i) − ((n − 1)/n) E_π(ϕ_λ(t − U)) ) dN_t.

Proposition. For all λ ∈ Λ_m, β̂_λ is an unbiased estimator of β_λ. Furthermore, its variance is upper bounded as follows:

Var(β̂_λ) ≤ C (1/n + n/T²),

where C is a positive constant depending on ν, R1 and R∞.
2. Under H0, conditionally on U1, ..., Un and on N_{[−1;T+1]} = m, the points of the process N obey a uniform law on [−1; T + 1]; thus the law of T̂_λ depends only on observable quantities. We then introduce the (1 − α)-quantile q_λ^{[U1,...,Un;m]}(α) of T̂_λ under H0, conditionally on U1, ..., Un and on N_{[−1;T+1]} = m, such that

P_0( T̂_λ > q_λ^{[U1,...,Un;m]}(α) | U1, ..., Un, N_{[−1;T+1]} = m ) ≤ α.

3. The corresponding test function is defined by

Φ_{λ,α} = 1{ T̂_λ > q_λ^{[U1,...,Un;N_{[−1;T+1]}]}(α) }.
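The estimator β̂_λ = G(ϕ_λ)/n can be computed directly from the observations. Below is a hedged Python sketch, using the algebraically equivalent grouping G(ϕ_λ) = Σ over child points t of [ Σ_i ϕ_λ(t − U_i) − (n − 1) E_π(ϕ_λ(t − U)) ], with E_π(ϕ_λ(t − U)) = (1/T) ∫_0^T ϕ_λ(t − u) du approximated by numerical integration; the Haar function and the simulated data are illustrative assumptions.

```python
import numpy as np

def beta_hat(children, parents, T, phi_lam, grid_size=20000):
    """Estimator beta_hat_lambda = G(phi_lambda) / n, with
    G(phi_lambda) = sum over child points t of
        [ sum_i phi_lambda(t - U_i) - (n - 1) * E_pi(phi_lambda(t - U)) ],
    where E_pi is the expectation for U uniform on [0, T]
    (approximated here by a trapezoid rule)."""
    children = np.asarray(children)
    parents = np.asarray(parents)
    n = len(parents)
    u_grid = np.linspace(0.0, T, grid_size)
    # sum_i phi_lambda(t - U_i) for every child point t
    sum_term = phi_lam(children[:, None] - parents[None, :]).sum(axis=1)
    # E_pi(phi_lambda(t - U)) = (1/T) * integral_0^T phi_lambda(t - u) du
    mean_term = np.trapz(phi_lam(children[:, None] - u_grid[None, :]), u_grid, axis=1) / T
    G = np.sum(sum_term - (n - 1) * mean_term)
    return G / n

# Example with the Haar mother wavelet at scale j = 0, position k = 0 (illustrative only).
psi00 = lambda x: (((x > 0.5) & (x <= 1.0)).astype(float)
                   - ((x >= 0.0) & (x <= 0.5)).astype(float))
rng = np.random.default_rng(0)
T, n = 200.0, 100
parents = rng.uniform(0.0, T, size=n)
children = rng.uniform(-1.0, T + 1.0, size=rng.poisson(0.5 * (T + 2)))  # pure noise: h = 0
print(beta_hat(children, parents, T, psi00))  # should fluctuate around 0
```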


The multiple testing procedure
Let α ∈ ]0; 1[. Aim: define a multiple testing procedure, by aggregating a collection of single tests, for the test of H0: "h = 0" against H1: "h ≠ 0".
1. Choose a collection of positive numbers {w_λ, λ ∈ Λ_m} such that Σ_{λ∈Λ_m} e^{−w_λ} ≤ 1.
2. We consider the test which rejects H0 when there exists at least one λ in Λ_m such that

T̂_λ > q_λ^{[U1,...,Un;N_{[−1;T+1]}]}( u_α^{[U1,...,Un;N_{[−1;T+1]}]} e^{−w_λ} ),

where

u_α^{[U1,...,Un;N_{[−1;T+1]}]} = sup{ u > 0 : P_0( max_{λ∈Λ_m} ( T̂_λ − q_λ^{[U1,...,Un;N_{[−1;T+1]}]}(u e^{−w_λ}) ) > 0 | U1, ..., Un, N_{[−1;T+1]} ) ≤ α }.

3. The corresponding test function is defined by

Φ_α = 1{ max_{λ∈Λ_m} ( T̂_λ − q_λ^{[U1,...,Un;N_{[−1;T+1]}]}( u_α^{[U1,...,Un;N_{[−1;T+1]}]} e^{−w_λ} ) ) > 0 }.
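Since, under H0 and conditionally on the parents and on N_{[−1;T+1]} = m, the child points are i.i.d. uniform on [−1; T + 1], the conditional quantiles can be approximated by Monte Carlo. The sketch below is an assumed practical implementation for a single statistic, not the thesis's exact algorithm; the calibration of u_α for the aggregated test follows the same conditional resampling idea.

```python
import numpy as np

def mc_quantile(stat_fn, parents, m, T, alpha, n_sim=2000, rng=None):
    """Approximate the (1 - alpha)-quantile of T_hat_lambda under H0,
    conditionally on the parents and on N_[-1, T+1] = m child points:
    under H0 the m points are i.i.d. uniform on [-1, T + 1].

    stat_fn : function (children, parents) -> value of |beta_hat_lambda|
    """
    rng = np.random.default_rng() if rng is None else rng
    sims = np.empty(n_sim)
    for s in range(n_sim):
        fake_children = rng.uniform(-1.0, T + 1.0, size=m)
        sims[s] = stat_fn(fake_children, parents)
    return np.quantile(sims, 1.0 - alpha)

# Usage for one single test (with, e.g., the beta_hat sketch above):
# q = mc_quantile(lambda c, p: abs(beta_hat(c, p, T, psi00)),
#                 parents, len(children), T, alpha=0.05)
# reject = abs(beta_hat(children, parents, T, psi00)) > q
```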


Probability of first kind error
Proposition. Let α be a fixed level in ]0; 1[. Then the multiple test Φ_α is of level α. Furthermore, u_α^{[U1,...,Un;N_{[−1;T+1]}]} satisfies u_α^{[U1,...,Un;N_{[−1;T+1]}]} ≥ α.
This result shows that the test is exactly of level α, which is what is required of a test from a non-asymptotic point of view (namely, n and T are not required to tend to infinity).


Probability of second kind error
We have to control the probability of second kind error in such a way that it is close to 0, in order to obtain powerful tests.
Theorem. Let α, β be fixed levels in ]0; 1[. Assume that there exists at least one finite subset L of Λ_m such that

‖h_L‖₂² ≥ C1 ( D_L + C2 Σ_{λ∈L} w_λ ) (1/n + n/T²) + remainder terms of order j_L/n², j_L²/n³ and 2^{j_L}/(n²T²) (with constants C3, C4, C5 and factors involving D_L and the weights w_λ),

where j_L = max{j ≥ 0 : (j, k) ∈ L} and C1, C2, C3, C4 and C5 are positive constants depending on α, β, ν, R1 and R∞. Then Ph(Φ_α = 0) ≤ β.


Classical and weak Besov bodies
We consider the regime "n and T proportional".
For δ > 0 and any R > 0, the Besov ball with radius R is defined by

B^δ_{2,∞}(R) = { f ∈ L²(ℝ) : f = Σ_{λ∈Λ} β_λ ϕ_λ, ∀j ≥ 0, Σ_k β²_{(j,k)} ≤ R² 2^{−2jδ} },

and for p > 0 and any R′ > 0, the weak Besov body with radius R′ is defined by

W*_p(R′) = { f ∈ L²(ℝ) : f = Σ_{λ∈Λ} β_λ ϕ_λ, sup_{s>0} s^p Σ_{λ∈Λ_m} 1_{|β_λ|>s} ≤ R′^p }.
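As a concrete reading of the Besov-ball constraint, here is a tiny sketch (illustrative only, with made-up coefficients) that computes sup_{j≥0} 2^{2jδ} Σ_k β²_{(j,k)} from wavelet coefficients organized by level; f satisfies the B^δ_{2,∞}(R) constraint exactly when this quantity is at most R².

```python
import numpy as np

def besov_radius_sq(coeffs_by_level, delta):
    """coeffs_by_level: dict j -> array of beta_(j,k) for that level (j >= 0).
    Returns sup_j 2^{2 j delta} * sum_k beta_(j,k)^2, i.e. the smallest R^2
    such that the coefficients satisfy the B^delta_{2,infty}(R) constraint."""
    return max(2.0 ** (2 * j * delta) * np.sum(np.asarray(b) ** 2)
               for j, b in coeffs_by_level.items())

# Example: coefficients decaying geometrically across levels.
coeffs = {j: 2.0 ** (-j) * np.ones(2 ** j) for j in range(6)}
print(besov_radius_sq(coeffs, delta=0.5))
```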


Uniform separation rates
Given a class S_δ of alternatives h, it is natural to measure the performance of the test via its uniform separation rate [Baraud, 2002]:

ρ(Φ_α, S_δ, β) = inf{ ρ > 0 : sup_{h∈S_δ, ‖h‖₂ ≥ ρ} Ph(Φ_α = 0) ≤ β }.

We consider the uniform separation rates over B^δ_{2,∞}(R) ∩ W*_{2/(1+2γ)}(R′), where the parameter δ measures the regularity and the parameter γ the sparsity.
Theorem. Let α, β be fixed levels in ]0; 1[. Assume that n and T are proportional, and consider Φ_α with well-chosen weights w_λ. Then, for any δ > 0, γ > 0, R > 0, R′ > 0, if 2δ > γ/(1 + 2γ),

ρ( Φ_α, B^δ_{2,∞}(R) ∩ W*_{2/(1+2γ)}(R′), β ) ≤ C ( ln n / n )^{γ/(1+2γ)},

with C a positive constant depending on δ, γ, R, R′, α, β, ν, R1 and R∞.


3. Adaptive estimation procedure


Framework
Parents: U1, ..., Un & Children: N with intensity Σ_{i=1}^n h(· − Ui).
Aim: estimate h.
Assumption: h is compactly supported in [−A; A], with A > 0 (A = the maximal memory along DNA sequences).
Decomposition of h on a particular biorthogonal wavelet basis:

h = Σ_{λ∈Λ} β_λ ϕ̃_λ   with   β_λ = ∫_ℝ h(x) ϕ_λ(x) dx.

Estimation of β_λ: β̂_λ = G(ϕ_λ)/n, with

G(ϕ_λ) = ∫_ℝ Σ_{i=1}^n ( ϕ_λ(t − U_i) − ((n − 1)/n) E_π(ϕ_λ(t − U)) ) dN_t.

For all λ ∈ Λ, β̂_λ is an unbiased estimator of β_λ and

Var(β̂_λ) ≤ C (1/n + n/T²).


Thresholding procedure
1. We introduce a deterministic subset Γ of Λ:

Γ = { λ = (j, k) ∈ Λ : −1 ≤ j ≤ j0, k ∈ K_j },   with j0 ∈ ℕ*.

2. Given some parameter γ > 0, for any λ ∈ Γ, we define the threshold

η_λ = √( 2 γ j0 Ṽ(ϕ_λ/n) ) + (γ j0/3) B(ϕ_λ/n) + Δ N_ℝ/n,

where
N_ℝ = number of points of the process N lying in ℝ,
Δ is a positive quantity (of order 2^{j0/2}√j0/n + √j0/T + j0/T for the theoretical results),
B and Ṽ (Ṽ(ϕ_λ) being an estimator of the variance of n β̂_λ) are quantities that depend only on the observations and can be computed exactly.
3. Finally, we denote by β̃_λ the thresholding estimator of β_λ: β̃_λ = β̂_λ 1_{|β̂_λ|>η_λ} 1_{λ∈Γ}, and we set h̃ = Σ_{λ∈Λ} β̃_λ ϕ̃_λ, an estimator of h that only depends on the choice of γ and j0, fixed later.
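A schematic version of the thresholding step, assuming the coefficient estimates β̂_λ and the data-driven thresholds η_λ have already been computed (the computation of Ṽ, B and Δ is specific to the procedure and not reproduced here): keep each β̂_λ only when it exceeds its own threshold.

```python
import numpy as np

def hard_threshold(beta_hat, eta):
    """Per-coefficient hard thresholding over the set Gamma:
    beta_tilde_lambda = beta_hat_lambda * 1{|beta_hat_lambda| > eta_lambda}."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    eta = np.asarray(eta, dtype=float)
    return np.where(np.abs(beta_hat) > eta, beta_hat, 0.0)

# Example: only the coefficients clearly above their threshold survive.
beta_hat = np.array([0.02, -0.50, 0.10, 1.30, -0.03])
eta      = np.array([0.15,  0.15, 0.15, 0.15,  0.15])
print(hard_threshold(beta_hat, eta))   # [0.0, -0.5, 0.0, 1.3, 0.0]
# The estimator h_tilde is then rebuilt as sum_lambda beta_tilde_lambda * phi_tilde_lambda.
```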


An oracle type inequality
Theorem. We assume that n ≥ 2, that j0 ∈ ℕ* is such that 2^{j0} ≤ n < 2^{j0+1}, that γ > 2 log 2 and that Δ is defined in a technical way. Then the estimator h̃ satisfies

E[ ‖h̃ − h‖₂² ] ≤ C1 inf_{m⊂Γ} { Σ_{λ∉m} β_λ² + |m| (1/n + n/T²) (log n)⁴ } + C2 (1/n + n/T²),

where C1 and C2 are positive constants depending on A, γ, R1 and R∞.
Corollary (n and T proportional).

E[ ‖h̃ − h‖₂² ] ≤ C1 inf_{m⊂Γ} { Σ_{λ∉m} β_λ² + |m| (log T)⁴/T } + C2/T.

A minimax result on Besov balls
We still consider the regime "n and T proportional".
Theorem. Let R > 0 and δ ∈ ℝ be such that 0 < δ < r + 1. Assume that h ∈ B^δ_{2,∞}(R) and that n and T are proportional. Then the estimator h̃ satisfies

E[ ‖h̃ − h‖₂² ] ≤ C3 ( (log T)⁴ / T )^{2δ/(2δ+1)},

where C3 is a positive constant depending on R, A, γ, R1 and R∞.


Simulations
Implementation: computation of the thresholds with a cascade algorithm (inspired by Mallat (1989)) for the two previous bases.
Simulations for the calibration of the parameters from a practical point of view.
Simulations illustrating the performance and the robustness of our practical procedure.


Simulations
Reconstruction of the function h = 4 × 1_{[0;1]}, with T = 10000, A = 10, j0 = 5 and n ≈ 1000.
[Figures: reconstruction of h on the Haar basis (left) and on the Spline basis (right); true function dotted, estimate solid.]


Simulations
Reconstruction of the function h(x) = 4 × (1/√(2π)) e^{−x²/2}, with T = 10000, A = 10, j0 = 5 and n ≈ 1000.
[Figures: reconstruction of h on the Haar basis (left) and on the Spline basis (right); true function dotted, estimate solid.]


Influence promoters/genes in E. coli
Most of the genes of E. coli should be preceded, at a very short distance, by the major promoter, the word tataat. In order to validate our thresholding estimation procedure, we hope to detect short favored distances between genes and previous occurrences of tataat.
Data: locations of 4,290 genes (we took the positions of the first base of the coding sequences) and locations of 1,036 occurrences of the word tataat, along a genome of normalized length T = 9288442, with A = 10000 for the maximal memory.


Influence promoters/genes in E. coli
How does the DNA motif tataat influence genes? Parents = tataat, children = genes.
[Figure: estimator h̃ on the scale 1:1000.]


Influence promoters/genes in E. coli
How do genes influence the DNA motif tataat? Parents = genes, children = tataat.
[Figure: estimator h̃ on the scale 1:1000.]


4. Lasso estimation for Poissonian interactions on the circle


Poissonian discrete model
Parents: U1, ..., Un, n i.i.d. uniform random variables on the set {0, ..., M − 1} → points on a circle (we work modulo M).
Number of children at each position k between 0 and M − 1: each Ui gives birth to some Poisson variables.
If h* = (h*_0, ..., h*_{M−1})^T ∈ ℝ^M_+, then let us define N^i_{Ui+j} ~ P(h*_j), independent of anything else. The variable N^i_{Ui+j} represents the number of children that the individual Ui has at distance j.
We set Y_k = Σ_{i=1}^n N^i_k, the total number of children at position k, whose distribution conditioned on the Ui's is given by

Y_k ~ P( Σ_{i=1}^n h*_{k−Ui} ).

Aim: estimate h* from the observation of the Ui's and the Y_k's.
Another formulation: Y = (Y_0, ..., Y_{M−1})^T ~ P(G h*).
→ Estimating h* ⇔ solving an inverse problem (potentially ill-posed) where the operator G is random and only depends on the Ui's.
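A hedged sketch of the discrete model and of the matrix formulation Y ~ P(G h*): with parents on ℤ/Mℤ, G can be taken as the M × M matrix whose entry (k, j) counts the parents sitting at position k − j (mod M), so that (G h*)_k = Σ_i h*_{k−Ui}. The interaction vector h* below is a toy assumption.

```python
import numpy as np

def simulate_discrete_model(h_star, n, rng=None):
    """Simulate the Poissonian interactions model on the circle Z/MZ:
    parents U_1..U_n uniform on {0, ..., M-1}, and Y_k ~ Poisson(sum_i h*_{k - U_i})."""
    rng = np.random.default_rng() if rng is None else rng
    M = len(h_star)
    parents = rng.integers(0, M, size=n)
    # G[k, j] = number of parents i with U_i = k - j (mod M),
    # so that (G h*)_k = sum_i h*_{k - U_i}.
    counts = np.bincount(parents, minlength=M)
    G = np.empty((M, M))
    for j in range(M):
        G[:, j] = np.roll(counts, j)
    intensity = G @ np.asarray(h_star, dtype=float)
    Y = rng.poisson(intensity)
    return parents, G, Y

# Toy example: M = 16 positions, interaction concentrated at distances 1 and 2.
h_star = np.zeros(16)
h_star[1], h_star[2] = 0.8, 0.4
parents, G, Y = simulate_discrete_model(h_star, n=50)
print(Y.sum(), 50 * h_star.sum())  # total number of children vs. its expectation n * sum_j h*_j
```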


Steps of our Lasso-type procedure
Aim: recover optimal sparse solutions.
1. Under some conditions, we derive a RIP [Candès and Tao, 2005]: there exist positive constants r and R such that, with high probability, for any K-sparse vector x (i.e. with at most K non-zero coordinates),

r ‖x‖₂ ≤ ‖Gx‖₂ ≤ R ‖x‖₂,   with R ≈ r ≈ n

(similar approach to Rudelson and Vershynin (2008)),
→ in order to derive oracle inequalities for our Lasso-type estimate.
2. A modified version of the Lasso estimate, which forces sparsity of the estimate:

ĥ^L := argmin_{h ∈ ℝ^M, K0-sparse} { ‖Y − Gh‖²_{ℓ2} + (2 + η) Σ_{k=0}^{M−1} d_k |h_k| },

for some η > 0 and random weights d_k to calibrate [Bertin et al., 2011; Hansen et al., 2012].
3. Choice of the d_k's: they only depend on observable quantities, and they are as small as possible. We need to control, with high probability, the event

|G^H (Y − G h*)|_k ≤ d_k,   for all k ∈ {0, ..., M − 1},

→ d_k = √( 2 log(3/δ) Ṽ_k ) + (1/3) log(3/δ) B_k.
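A minimal sketch of a weighted-ℓ1 solver by proximal gradient (ISTA) for the criterion ‖Y − Gh‖²_{ℓ2} + (2 + η) Σ_k d_k |h_k|; it is a generic stand-in, not the thesis's procedure: the explicit K0-sparsity constraint is not enforced, and the weights d_k in the usage comment are crude placeholders rather than the calibrated data-driven weights.

```python
import numpy as np

def weighted_lasso_ista(G, Y, d, eta=0.1, n_iter=1000, step=None):
    """Proximal-gradient (ISTA) minimization of
        ||Y - G h||_2^2 + (2 + eta) * sum_k d_k |h_k|.
    The weighted l1 penalty sets most coordinates of h to zero."""
    G = np.asarray(G, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d = np.asarray(d, dtype=float)
    if step is None:
        # gradient of ||Y - Gh||^2 is 2 G^T (Gh - Y); its Lipschitz constant is 2 ||G||_2^2
        step = 1.0 / (2.0 * np.linalg.norm(G, 2) ** 2)
    h = np.zeros(G.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * G.T @ (G @ h - Y)
        z = h - step * grad
        # proximal operator of step * (2 + eta) * sum_k d_k |h_k|:
        # per-coordinate soft thresholding
        h = np.sign(z) * np.maximum(np.abs(z) - step * (2.0 + eta) * d, 0.0)
    return h

# Usage with the simulated discrete model above (placeholder weights, for illustration only):
# d = np.full(G.shape[1], np.sqrt(2.0 * np.log(3.0 / 0.05)) * np.sqrt(Y.sum()))
# h_hat = weighted_lasso_ista(G, Y, d)
```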


The performance of the Lasso estimate
Proposition. For any δ ∈ ]0; 1[,

E(d_k²) ≤ C [ log(3/δ) (n + n³/M²) + (log(3/δ))² (n + n²/M²) ],

where C is a positive constant depending on h*_k and ‖h*‖_{ℓ1}.


The performance of the Lasso estimate
Theorem. Let Ω be the event on which, for all x at least 2K0-sparse in ℂ^M,

‖Gx‖₂ ≥ (1 − c) n ‖x‖₂, with c < 1/2,   and   |G^H (Y − G h*)|_k ≤ d_k for all k.

On the event Ω (i.e. with probability at least 1 − 2δ), if η > 0 and if h* is K0-sparse,

‖ĥ^L − h*‖²_{ℓ2} ≤ ( 8 (2 + η)² / ((1 − c)² n²) ) Σ_{k∈S(h*)} d_k²,

where S(h) denotes the support of h.
If n = O(M), with a wise choice of δ, the right-hand side is of order (log M)²/n, and this inequality is a classical oracle-type inequality, up to a logarithmic term, for Lasso estimates.


5. Conclusion and perspectives


Conclusion
For our Poissonian interactions model:
We have built a minimax-optimal testing procedure over weak Besov bodies for the problem of detecting dependence in our model.
We have developed an optimal, fully data-driven wavelet thresholding procedure for the problem of estimating the interaction function h.
Our theoretical results are supported by simulations illustrating the performance and the robustness of our procedures.
For the discrete version, we have proposed an adaptive Lasso-type procedure, calibrated through a convenient shape of the ℓ1-penalty weights, and we have obtained oracle inequalities.


Perspectives
It would be relevant to study similar processes, in the adaptive nonparametric setting, in the spatial framework and to connect them, for instance, to the Neyman-Scott process.
Our Poissonian interactions model postulates that ν is constant and that the same function h applies on the entire interval [0; T], which is not really reasonable from a biological point of view. Nevertheless, this assumption is quite acceptable on smaller intervals. To overcome the problem of stationarity, we could use a Benjamini-Hochberg approach, combined with our adaptive testing procedure.
The adaptive estimation procedure is optimal from an oracle point of view, but its optimality from a minimax point of view is only established when n and T are proportional. Indeed, the regime n ≫ T has not been studied; in this case, the main term in the oracle inequality is n/T² (up to logarithmic terms). Consequently, whether the rate n/T² is optimal remains an open question.


Detection of dependence: L. Sansonnet and C. Tuleau-Malot (2013). A model of Poissonian interactions and detection of dependence. Submitted.
Adaptive estimation: L. Sansonnet (2013). Wavelet thresholding estimation in a Poissonian interactions model with application to genomic data. To appear in Scandinavian Journal of Statistics.
Lasso estimation: a work in progress in collaboration with P. Reynaud-Bouret, V. Rivoirard and R. Willett.
Thanks for your attention!
