
Tel-Aviv University The Raymond and Beverly Sackler Faculty of Exact Sciences The Blavatnik School of Computer Science

Weak Epsilon-Nets, Davenport–Schinzel Sequences, and Related Problems

thesis submitted for the degree of doctor of philosophy

by Gabriel Nivasch under the supervision of Professor Micha Sharir

Submitted to the senate of Tel Aviv University June 2009

Acknowledgements

I am deeply thankful to my advisor Micha Sharir for the wonderful four-and-a-half years I have spent at Tel Aviv University. During this time I had the opportunity to do some very exciting research, and I also learnt a lot from Micha about academic matters and other general things.

I also wish to thank Noga Alon, Boris Bukh, Haim Kaplan, Jiří Matoušek, and Shakhar Smorodinsky, who together with Micha are coauthors of several parts of this work. It has been a great privilege and also a pleasure to work together. I especially enjoyed working with Boris. We first met by chance at Hebrew University (while he was in Israel for a short internship at Tel Aviv University), and we have continued working together since then via the Internet. I hope we keep in touch and continue our fruitful collaboration. I worked with Jirka only via email, and I hope we will have a chance to work together face-to-face in the future.

I would also like to thank Haim and Micha specifically for help in deriving a few results in [37]. Thanks to Seth Pettie for some pleasant and enlightening discussions on Davenport–Schinzel sequences and related philosophical issues (at SODA'08 and '09 and by email). And thanks to Martin Klazar for some useful email correspondence.

Finally, thanks to my dear wife Keren for all her support and sacrifice for me. My life is infinitely sweeter since I met you.

Abstract

In this thesis we give improved bounds for several problems in discrete geometry. One of our most surprising results is that a set of complicated expressions involving the inverse Ackermann function appears, independently, as the bound for two unrelated problems.

Weak epsilon-nets. Given a finite point set $X \subset \mathbb{R}^d$ and a parameter $\varepsilon < 1$, a set $N \subset \mathbb{R}^d$ is called a weak $\varepsilon$-net for $X$ if $N$ intersects every convex set $C$ with $|C \cap X| \ge \varepsilon|X|$. Let us write $r = 1/\varepsilon$, so $r > 1$.

We construct, for every fixed $d \ge 2$ and every $r > 1$, a point set $G_s \subset \mathbb{R}^d$ for which every weak $\frac{1}{r}$-net has size $\Omega(r \log^{d-1} r)$. This is the first nontrivial lower bound in terms of $r$ for fixed $d$. The set $G_s$ is a stretched grid, i.e., the Cartesian product of $d$ suitable fast-growing sequences. We show how convexity in this grid can be analyzed using stair-convexity, a new variant of the notion of convexity. This is joint work with Boris Bukh and Jiří Matoušek [13].

We also improve upper bounds for weak $\varepsilon$-nets for some special classes of sets $X$. Namely, we prove that if $X \subset \mathbb{R}^2$ is in convex position, then $X$ has a weak $\frac{1}{r}$-net of size $O(r\alpha(r))$, where $\alpha$ is the extremely slow-growing inverse Ackermann function. (The previous bound for this case was $O(r \log^{1.59} r)$.)

We generalize the above result to other cases in which $X \subset \mathbb{R}^d$ is, in a sense, “intrinsically one-dimensional”. For such sets $X$ we construct weak $\frac{1}{r}$-nets of size $O(r \cdot 2^{\mathrm{poly}(\alpha(r))})$, with the polynomial depending on $d$. (The previous bounds were of the form $O(r\,\mathrm{polylog}(r))$.)

We achieve the last two results by a reduction to stabbing interval chains, a new combinatorial problem of independent interest, for which we derive almost-tight upper and lower bounds. All this is joint work with Noga Alon, Haim Kaplan, Micha Sharir, and Shakhar Smorodinsky [5].

Coming back to lower bounds, we also construct, for every $d \ge 3$, an “intrinsically one-dimensional” point set $D_s \subset \mathbb{R}^d$ for which every weak $\frac{1}{r}$-net has size $\Omega(r \cdot 2^{\mathrm{poly}(\alpha(r))})$ (with a smaller polynomial than in the upper bounds). The set $D_s$ is actually the diagonal of the stretched grid $G_s$. This is joint work with Bukh and Matoušek [13].

Davenport–Schinzel sequences. Given an integer $s \ge 1$, a Davenport–Schinzel sequence of order $s$ is a sequence of symbols that does not contain adjacent repetitions, and does not contain any alternation of the form $a \cdots b \cdots a \cdots b \cdots$ of length $s + 2$. The maximum length of such a sequence containing only $n$ distinct symbols is denoted $\lambda_s(n)$.

For $s \ge 3$, $\lambda_s(n)$ is known to have inverse-Ackermann bounds surprisingly similar to the ones we obtained for weak $\varepsilon$-nets and for interval chains.

We slightly improve the upper bounds for $\lambda_s(n)$; for even $s \ge 4$ they now match the lower bounds. In addition, we re-achieve our improved upper bounds by a new method, which uses what we call almost-DS sequences. Finally, we apply our new technique to generalized Davenport–Schinzel sequences, improving their upper bounds as well.

Regarding lower bounds for $\lambda_s(n)$, we simplify the construction that achieves the current lower bounds for even $s \ge 4$. In addition, we improve the lower bound for $\lambda_3(n)$ by a constant factor, showing that $\lambda_3(n)$ equals $2n\alpha(n)$ plus lower-order terms. These results have been previously published in [37].

Other results. We show that the stretched grid $G_s \subset \mathbb{R}^d$ yields an improved upper bound for the so-called first selection lemma: No point in $\mathbb{R}^d$ is contained in more than $(n/(d+1))^{d+1} + o(n^{d+1})$ $d$-dimensional simplices spanned by $G_s$, where $n = |G_s|$. (The current lower bound is the following: For every $n$-point set $S \subset \mathbb{R}^d$ there exists a point $x \in \mathbb{R}^d$ that is contained in at least $\gamma_d n^{d+1} - O(n^d)$ simplices spanned by $S$, for $\gamma_d = (d^2+1)/((d+1)!\,(d+1)^{d+1})$.)

We also use the stretched grid to improve the upper bound for the second selection lemma in the plane by a logarithmic factor: For every $t$, $n^{2.5} \log n \le t \le \binom{n}{3}$, we construct a set $T$ of $t$ triangles with vertices in $G_s \subset \mathbb{R}^2$ such that no point in $\mathbb{R}^2$ is contained in more than $O(t^2/(n^3 \log n))$ triangles of $T$ (where, again, $n = |G_s|$). These latter results are joint work with Bukh and Matoušek [14].

Finally, we provide a correct proof of the current lower bound for the second selection lemma in the plane: For every $n$-point set $S \subset \mathbb{R}^2$ and every family $T$ of $t$ triangles with vertices in $S$, there exists a point in the plane that lies in $\Omega(t^3/(n^6 \log^2 n))$ triangles of $T$. Eppstein claimed this result in [22], but there is a problem with his proof. Our proof follows by a slight modification of Eppstein's argument. This is joint work with Micha Sharir [39].

Contents

Acknowledgements
Abstract

1 Introduction
  1.1 The Ackermann function and its inverse
    1.1.1 Variants of the Ackermann function
  1.2 Problems considered in this thesis
    1.2.1 Weak epsilon-nets
    1.2.2 Stabbing interval chains
    1.2.3 Davenport–Schinzel sequences
    1.2.4 Selection lemmas
  1.3 Organization of the thesis

2 Weak epsilon-nets
  2.1 The stretched grid and stair-convexity
    2.1.1 Properties of the stretched grid
    2.1.2 Epsilon-nets with respect to stair-convex sets
    2.1.3 Properties of the stretched grid: Proofs
  2.2 Intrinsically one-dimensional sets
    2.2.1 Planar point sets in convex position
    2.2.2 Point sets on convex curves
    2.2.3 Point sets on convex curves: A lower bound
  2.3 Conclusions

3 Stabbing interval chains
  3.1 Upper bounds
    3.1.1 Upper bounds for triples
    3.1.2 Upper bounds for j-tuples
  3.2 Lower bounds
    3.2.1 Lower bounds for triples
    3.2.2 Lower bounds for j-tuples
  3.3 Stabbing with pairs
  3.4 Conclusions

4 Davenport–Schinzel sequences
  4.1 General approach for the upper bounds
  4.2 Bounding $\psi_s(m, n)$
    4.2.1 Applying the recurrence relation
  4.3 A new technique for bounding $\psi_s(m, n)$
    4.3.1 Bounding ADS sequences
    4.3.2 Klazar's improved upper bound for $\lambda_3(n)$
    4.3.3 Bounding $\Pi_k^s(m)$ for general $s$
  4.4 Formation-free sequences
    4.4.1 Bounding formation-free sequences
    4.4.2 Almost-formation-free sequences
  4.5 Lower bound construction for $s = 3$
    4.5.1 Analysis
    4.5.2 Lower bound for ADS sequences of order 3
  4.6 Lower-bound construction for $s \ge 4$ even
    4.6.1 The construction
    4.6.2 Correctness of the construction
    4.6.3 Analysis
    4.6.4 Advantages over the previous construction
    4.6.5 Lower bound for ADS sequences, $s \ge 4$ even
  4.7 Conclusions

5 Selection lemmas
  5.1 The first selection lemma
  5.2 The second selection lemma
  5.3 Conclusions

Bibliography

A Comparing Ackermann-like functions
B Some recurrent quantities
C Proof of Klazar's Lemma 4.15

Chapter 1 Introduction

In this thesis we derive improved bounds for several problems in discrete geometry. We begin by introducing some notation.

Notation

If $f$ is a function and $n$ is a nonnegative integer, then $f^{(n)} = f \circ f \circ \cdots \circ f$ denotes the $n$-fold composition of $f$; $f^{(0)}$ is the identity map.

If $i \le j$ are integers, then $[i, j]$ denotes the interval $\{i, i+1, \ldots, j\}$. For convenience, we also write $[i, i]$ as $[i]$.

If $S = a_1 a_2 \ldots a_n$ is a sequence of symbols, we let $|S| = n$ denote the length of $S$ and $\|S\|$ denote the number of distinct symbols in $S$.

If $S$ and $u$ are sequences of symbols, we write $u \subset S$ if $S$ contains a subsequence $u'$ (not necessarily contiguous) which is isomorphic to $u$ (i.e., $u'$ can be made equal to $u$ by a one-to-one renaming of its symbols). In this case we say that $S$ contains $u$ or that $u$ is contained in $S$. Otherwise, we write $u \not\subset S$ and we say that $S$ is $u$-free. For example, $S = abcdbc$ contains $u = abab$, but it is $v$-free for $v = abba$.
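The containment relation $u \subset S$ can be checked mechanically on small examples. The following is a minimal brute-force sketch, not a method taken from the thesis: the helper names `canonical` and `contains` are hypothetical, and the running time is exponential, so it is only meant for short sequences such as the example above.

```python
from itertools import combinations

def canonical(seq):
    """Rename symbols by order of first appearance, e.g. 'bcbc' -> (0, 1, 0, 1).

    Two sequences are isomorphic (equal up to a one-to-one renaming of
    symbols) exactly when their canonical forms coincide.
    """
    names = {}
    return tuple(names.setdefault(sym, len(names)) for sym in seq)

def contains(S, u):
    """Return True if S has a (not necessarily contiguous) subsequence isomorphic to u."""
    target = canonical(u)
    return any(
        canonical(S[i] for i in idx) == target
        for idx in combinations(range(len(S)), len(u))
    )

S = "abcdbc"
print(len(S), len(set(S)))   # |S| and ||S||
print(contains(S, "abab"))   # S contains abab (via the subsequence bcbc)
print(contains(S, "abba"))   # S is abba-free
```

Comparing canonical forms avoids searching over renamings explicitly, at the cost of enumerating all index subsets of size $|u|$.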

1.1 The Ackermann function and its inverse

The Ackermann function, denoted $A(n)$, is a function that grows extremely fast in $n$. In this thesis we are mainly interested in its extremely slow-growing inverse, $\alpha(n)$.

We first define the Ackermann hierarchy, which is a sequence of functions $A_k(n)$, for $k = 1, 2, 3, \ldots$ and $n \ge 0$, by $A_1(n) = 2n$, and for $k \ge 2$ by $A_k(n) = A_{k-1}^{(n)}(1)$. Alternatively, the definition of $A_k(n)$ for $k \ge 2$ can be written recursively as

$$A_k(n) = \begin{cases} 1, & \text{if } n = 0; \\ A_{k-1}\bigl(A_k(n-1)\bigr), & \text{otherwise.} \end{cases} \tag{1.1}$$

We have $A_2(n) = 2^n$, and $A_3(n) = 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}$ is a “tower” of $n$ twos. Each function in this hierarchy grows in $n$ much faster than the preceding one. Namely, for every fixed $k$ and $c$ we have $A_{k+1}(n) \ge A_k^{(c)}(n)$ for all large enough $n$.

Notice that $A_k(1) = 2$ and $A_k(2) = 4$ for all $k$, but $A_k(3)$ already grows very rapidly with $k$. We define the Ackermann function as $A(n) = A_n(3)$. Thus, $A(n) = 6, 8, 16, 65536, \ldots$ for $n = 1, 2, 3, 4, \ldots$. For every fixed $k$ we have $A(n) \ge A_k(n)$ for all large enough $n$. It is also easy to verify that

$$A(n) = A_{n-2}\bigl(A(n-1)\bigr), \quad \text{for } n \ge 3. \tag{1.2}$$

We then define the slow-growing inverses of these rapidly-growing functions as

$$\alpha_k(x) = \min\{n \mid A_k(n) \ge x\}, \tag{1.3}$$

$$\alpha(x) = \min\{n \mid A(n) \ge x\}, \tag{1.4}$$

for all real $x \ge 0$.

Alternatively, and equivalently, we can define these inverse functions directly without making reference to $A_k$ and $A$. We define the inverse Ackermann hierarchy by $\alpha_1(x) = \lceil x/2 \rceil$ and

$$\alpha_k(x) = \begin{cases} 0, & \text{if } x \le 1; \\ 1 + \alpha_k\bigl(\alpha_{k-1}(x)\bigr), & \text{otherwise;} \end{cases} \tag{1.5}$$

for $k \ge 2$. In other words, for each $k \ge 2$, $\alpha_k(x)$ denotes the number of times we must apply $\alpha_{k-1}$, starting from $x$, until we reach a value not larger than 1. Thus, $\alpha_2(x) = \lceil \log_2 x \rceil$, and $\alpha_3(x) = \log^* x$. Finally, we define the inverse Ackermann function by

$$\alpha(x) = \min\{k \mid \alpha_k(x) \le 3\}. \tag{1.6}$$

It is easy to show by induction that the above two definitions of $\alpha_k$ and $\alpha$ are exactly equivalent. Note that $\alpha_k(x)$ and $\alpha(x)$ are integers for every $x$.

Each function $\alpha_k(x)$ grows to infinity much more slowly than $\alpha_{k-1}(x)$; namely, for every fixed $k$ and $c$ we have $\alpha_{k+1}(x) \le \alpha_k^{(c)}(x)$ for all large enough $x$. Finally, $\alpha(x)$ grows to infinity more slowly than every $\alpha_k(x)$. (Actually, even $A(\alpha(x) - 2)$ grows more slowly than every $\alpha_k(x)$, as we show below.)
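The definitions (1.1) and (1.3)–(1.6) translate almost directly into code, which makes it easy to check the small values quoted above. The following is a minimal sketch for nonnegative integer arguments only; the function names are hypothetical, and the closed form $2^n$ is used for $A_2$ purely to keep the recursion shallow.

```python
def A_k(k, n):
    """Ackermann hierarchy: A_1(n) = 2n, and recurrence (1.1) for k >= 2."""
    if k == 1:
        return 2 * n
    if k == 2:
        return 2 ** n  # closed form of (1.1) for k = 2; avoids deep recursion
    if n == 0:
        return 1
    return A_k(k - 1, A_k(k, n - 1))

def A(n):
    """Ackermann function A(n) = A_n(3)."""
    return A_k(n, 3)

def alpha_k(k, x):
    """Inverse hierarchy via (1.5): how often alpha_{k-1} is applied until x <= 1."""
    if k == 1:
        return -(-x // 2)  # ceil(x / 2) for integers
    steps = 0
    while x > 1:
        x = alpha_k(k - 1, x)
        steps += 1
    return steps

def alpha(x):
    """Inverse Ackermann function via (1.6): the least k with alpha_k(x) <= 3."""
    k = 1
    while alpha_k(k, x) > 3:
        k += 1
    return k

print([A(n) for n in range(1, 5)])       # [6, 8, 16, 65536]
print(alpha_k(2, 1000), alpha_k(3, 65536))
print([alpha(x) for x in (6, 8, 16, 17, 65536, 65537)])
```

Only $A(1), \ldots, A(4)$ are computed here; $A(5)$ is far too large to evaluate. The inverse functions, in contrast, are cheap to evaluate even for very large arguments.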


Note that, by definition, we have $\alpha_{\alpha(x)}(x) \le 3$ and $\alpha_{\alpha(x)-1}(x) \ge 4$. Furthermore, $\alpha_{\alpha(x)-2}(x) \ge 5$ (since $\alpha_{k-1}(x) > \alpha_k(x)$ whenever $\alpha_{k-1}(x) \ge 4$). We now show that $\alpha_{\alpha(x)-3}(x)$ grows to infinity with $x$, and in fact it does so much faster than $\alpha(x)$ (though obviously slower than $\alpha_k(x)$ for every fixed $k$):

Lemma 1.1. Let $x$ be large enough so that $\alpha(x) \ge 4$. Then, $\alpha_{\alpha(x)-3}(x) > A(\alpha(x) - 2)$.

Proof. As noted above, we have $\alpha_{\alpha(x)-2}(x) \ge 5$. Thus, by (1.5),

$$5 \le \alpha_{\alpha(x)-2}(x) = 1 + \alpha_{\alpha(x)-2}\bigl(\alpha_{\alpha(x)-3}(x)\bigr),$$

so $\alpha_{\alpha(x)-2}\bigl(\alpha_{\alpha(x)-3}(x)\bigr) \ge 4$. But $\alpha_k(y) \ge 4$ implies $k \le \alpha(y) - 1$, so in our case, $\alpha(x) - 1 \le \alpha\bigl(\alpha_{\alpha(x)-3}(x)\bigr)$. Finally, $n \le \alpha(y)$ implies $y > A(n - 1)$, and the lemma follows.

The fact that $\alpha_{\alpha(x)-3}(x) \to \infty$ is used in some lower-bound results in this thesis (Section 2.2.3).
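As a sanity check, Lemma 1.1 and the three inequalities preceding it can be tested numerically for a few integer values of $x$. This is only an illustrative sketch, not a proof: the function names are hypothetical, the table `A_VALUES` hard-codes $A(1), \ldots, A(4)$ as listed in Section 1.1, and the samples are restricted to values with $\alpha(x) \le 5$.

```python
A_VALUES = {1: 6, 2: 8, 3: 16, 4: 65536}  # A(1), ..., A(4) from Section 1.1

def alpha_k(k, x):
    """Inverse Ackermann hierarchy, following recurrence (1.5), for integers x >= 0."""
    if k == 1:
        return -(-x // 2)  # ceil(x / 2)
    steps = 0
    while x > 1:
        x = alpha_k(k - 1, x)
        steps += 1
    return steps

def alpha(x):
    """Inverse Ackermann function, following definition (1.6)."""
    k = 1
    while alpha_k(k, x) > 3:
        k += 1
    return k

def lemma_1_1_holds(x):
    """Check alpha_{alpha(x)-3}(x) > A(alpha(x) - 2); requires 4 <= alpha(x) <= 6."""
    a = alpha(x)
    return alpha_k(a - 3, x) > A_VALUES[a - 2]

for x in (17, 100, 65536, 65537, 10**6, 2**64):
    a = alpha(x)
    print(x, a, alpha_k(a - 3, x), A_VALUES[a - 2], lemma_1_1_holds(x))
```

The smallest admissible input is $x = 17$, where $\alpha(x) = 4$ and the lemma reads $\alpha_1(17) = 9 > A(2) = 8$.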

1.1.1 Variants of the Ackermann function

There does not seem to be a “standard” definition of the Ackermann function and its inverse in the literature; rather, there are several slightly different variants, all of which exhibit the same asymptotic behavior.

For example, some authors define $A_k(n)$ as we did above, but then define the Ackermann function by “diagonalizing” the hierarchy, letting $A'(n) = A_n(n)$. This does not make any asymptotic difference, as $A'(n-2) \le A(n) \le A'(n-1)$ for all $n \ge 5$. We prefer our definition because, first, diagonalization is unnecessary, and second, the corresponding “direct” definition (1.6) of $\alpha(x)$ comes out simpler. (For other references where $\alpha(x)$ is defined as in (1.6) see Pettie [40] and Seidel [43, slide 85].)

Sometimes even the hierarchy itself is defined by a recurrence relation differing slightly from (1.1) or (1.5). In fact, in this thesis we use such variants several times. For example, we use a variant $\hat{A}_k(n)$ of $A_k(n)$ given by the recurrence

$$\hat{A}_k(n) = n \cdot \hat{A}_{k-1}\Bigl(\hat{A}_{k-1}\bigl(2^{2k} \hat{A}_k(n-1)\bigr)\Bigr),$$

and a variant $\hat{\alpha}_k(x)$ of $\alpha_k(x)$ given by the recurrence

$$\hat{\alpha}_k(x) = 1 + \hat{\alpha}_k\bigl(\hat{\alpha}_{k-1}^{(j)}(x)/c\bigr).$$


We have developed a fairly general technique for proving that variants of this kind behave asymptotically just like the “canonical” functions $A_k(x)$ and $\alpha_k(x)$. We present this technique in Appendix A.

1.2 Problems considered in this thesis

We now give a brief overview of the problems we consider in this thesis, including their current bounds and the new results we obtained.

1.2.1 Weak epsilon-nets

Given a finite point set $X \subset \mathbb{R}^d$ and a parameter $\varepsilon < 1$, a set $N \subset \mathbb{R}^d$ is called a weak $\varepsilon$-net for $X$ (with respect to convex ranges) if $N$ intersects every convex set $C$ with $|C \cap X| \ge \varepsilon|X|$. Weak $\varepsilon$-nets were introduced by Haussler and Welzl [24], and later used in several results in discrete geometry, most notably in the proof of the Hadwiger–Debrunner $(p, q)$-conjecture by Alon and Kleitman [6].

Weak epsilon-nets and epsilon-nets

The weak $\varepsilon$-nets we defined above are more appropriately called weak $\varepsilon$-nets with respect to convex ranges. Furthermore, weak $\varepsilon$-nets are a generalization of the concept of $\varepsilon$-nets. Here we present this broader context.

Let $\mathcal{F}$ be a given family of subsets of $\mathbb{R}^d$, and let $X$ be a finite set of points in $\mathbb{R}^d$. Then, a set $N \subset \mathbb{R}^d$ is a weak $\varepsilon$-net for $X$ with respect to $\mathcal{F}$ if $N$ intersects every $F \in \mathcal{F}$ for which $|F \cap X| \ge \varepsilon|X|$. If in addition we have $N \subset X$, then $N$ is a (strong) $\varepsilon$-net.

These notions can also be defined for the case where $X \subset \mathbb{R}^d$ is a set with finite, positive Lebesgue measure. In this case we require $N$ to intersect every $F \in \mathcal{F}$ for which $\mathrm{vol}(F \cap X) \ge \varepsilon\,\mathrm{vol}(X)$, where $\mathrm{vol}$ denotes the $d$-dimensional Lebesgue measure on $\mathbb{R}^d$.

Even more generally, let $\mu$ be a probability measure in $\mathbb{R}^d$. Then, we say that a set $N$ is a weak $\varepsilon$-net for $\mu$ if $N$ intersects every $F \in \mathcal{F}$ for which $\mu(F) \ge \varepsilon$. If, in addition, $N$ is contained in the support of $\mu$, then $N$ is an $\varepsilon$-net for $\mu$.

A central result (by Haussler and Welzl [24]) states that if the so-called VC-dimension of the family $\mathcal{F}$ is finite, then every $X$ has an $\varepsilon$-net of size $O\bigl(\frac{1}{\varepsilon} \log \frac{1}{\varepsilon}\bigr)$ (where the constant of proportionality depends linearly on the VC-dimension).

However, the family of all convex sets, which we consider here, has infinite VC-dimension, so this result does not apply. And in fact, it is not possible in general to build $\varepsilon$-nets with respect to convex ranges of size depending only on $d$ and $\varepsilon$. Therefore, we have to make do with weak $\varepsilon$-nets. For more details see Matoušek [33, ch. 10].

In this thesis we are interested mainly in weak $\varepsilon$-nets with respect to convex ranges, so we call them simply weak $\varepsilon$-nets. Whenever we consider another range space, we identify it explicitly.
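To make the definition concrete, consider the easy case $d = 1$, which is not treated in the thesis: convex sets on the line are intervals, and taking every $k$-th point of the sorted set, with $k = \lceil n/r \rceil$ and $\varepsilon = 1/r$, gives a net of at most $r$ points. The sketch below assumes distinct points and an integer $r$; the function names are hypothetical, and the brute-force checker is only meant for small inputs.

```python
import random

def interval_net(points, r):
    """Every k-th point of the sorted input, where k = ceil(n / r)."""
    xs = sorted(points)
    k = max(1, -(-len(xs) // r))  # integer ceiling of n / r
    return xs[k - 1::k]

def is_net_for_intervals(points, net, r):
    """Brute-force check that net meets every interval holding >= n / r of the points.

    It suffices to test the intervals [xs[i], xs[j]] spanned by two input
    points: any other interval with the same points inside contains one of these.
    """
    xs = sorted(points)
    n = len(xs)
    for i in range(n):
        for j in range(i, n):
            heavy = (j - i + 1) * r >= n
            if heavy and not any(xs[i] <= p <= xs[j] for p in net):
                return False
    return True

rng = random.Random(1)
X = rng.sample(range(10_000), 60)   # 60 distinct points on the line
N = interval_net(X, r=7)
print(len(N), is_net_for_intervals(X, N, r=7))
```

Here the net is even a strong one, since it is a subset of $X$. This does not reflect the situation in the plane and in higher dimensions, where, as explained above, strong nets for convex ranges need not exist and the size of weak nets is the central question.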

Bounds for weak epsilon-nets

For convenience, we usually let $r = 1/\varepsilon$, so $r > 1$ is a parameter, and we speak of weak $\frac{1}{r}$-nets.

It is a nontrivial fact, first proved by Alon et al. [3], that every point set $X \subset \mathbb{R}^d$ has a weak $\frac{1}{r}$-net of size at most $E_d(r)$, for some function $E_d(r)$ of $d$ and $r$ alone. The main problem is to obtain upper and lower bounds for $E_d(r)$.

For $d = 2$, Alon et al. [3] proved that $E_2(r) = O(r^2)$ (also see Chazelle et al. [16] for another proof). For general $d \ge 3$ the current upper bound is of the form $E_d(r) = O(r^d (\log r)^{c(d)})$. This was first shown by Chazelle et al. [16], and later on by Matoušek and Wagner [35] via an alternative, simpler technique (which significantly reduced the exponents $c(d)$, to $c(d) = O(d^3 \log d)$).

The only known nontrivial lower bound for $E_d(r)$ asserts that $E_d(50) = \Omega(\exp(\sqrt{d/2}))$ (Matoušek [34]). It concerns the dependence of $E_d(r)$ on $d$. However, no lower bound, except for the obvious estimate $E_d(r) \ge r$, has been known for $d$ fixed and $r$ large.

In this thesis we derive the first superlinear lower bound for weak $\varepsilon$-nets:

Theorem 1.2. For every fixed $d \ge 2$ we have $E_d(r) = \Omega(r \log^{d-1} r)$.

For $d \ge 3$ this establishes a separation between weak $\varepsilon$-nets with respect to convex ranges and weak $\varepsilon$-nets with respect to range spaces with finite VC-dimension, since, as we have mentioned, for the latter case there always exist weak $\frac{1}{r}$-nets (and even strong ones) of size $O(r \log r)$ [24].

The point set that achieves the lower bound in Theorem 1.2 is a stretched grid, i.e., the Cartesian product of $d$ suitable fast-growing sequences; we denote this set by $G_s$. We show how convexity in $G_s$ can be analyzed using stair-convexity, a new variant of the notion of convexity.

We also show that the stretched grid $G_s$ does have a weak $\frac{1}{r}$-net of size $O(r \log^{d-1} r)$, so one cannot obtain any better lower bounds from it.

This is joint work with Boris Bukh and Jiří Matoušek [13].


Weak epsilon-nets for “intrinsically one-dimensional” sets

The weak $\varepsilon$-net problem has also been studied for special cases of the given set $X$. Here we consider cases in which $X$ is, in a sense, “intrinsically one-dimensional”.

Chazelle et al. [16] showed that, if $X \subset \mathbb{R}^2$ is a planar set in convex position, then $X$ has a weak $\frac{1}{r}$-net of size $O(r (\log r)^{\log_2 3}) = O(r \log^{1.59} r)$. We improve this bound as follows:

Theorem 1.3. Let $X \subset \mathbb{R}^2$ be a point set in convex position. Then $X$ has a weak $\frac{1}{r}$-net of size $O(r\alpha(r))$, where $\alpha$ is the inverse Ackermann function.

A generalization of this case is where $X \subset \mathbb{R}^d$ lies on a curve that is intersected at most $d$ times by every hyperplane. Such a curve is called a convex curve in some sources (Živaljević [53, p. 314]), and the most well-known example is the moment curve $\{(t, t^2, t^3, \ldots, t^d) : t \in \mathbb{R}\}$. For sets $X$ on such curves (see footnote 1), Matoušek and Wagner [35] derived an upper bound of the form $O(r\,\mathrm{polylog}(r))$, where the degree of the polynomial depends on $d$. We improve this bound as follows:

Theorem 1.4. Let $X \subset \mathbb{R}^d$ be a point set that lies on a curve that is intersected at most $d$ times by every hyperplane. Let

$$j = \begin{cases} (d^2 + d)/2, & d \text{ even;} \\ (d^2 + 1)/2, & d \text{ odd;} \end{cases}$$

and let $t = \lfloor (j - 2)/2 \rfloor$. Then $X$ has a weak $\frac{1}{r}$-net of size

$$\begin{cases} r \cdot 2^{O(\alpha(r)^t)}, & j \text{ even;} \\ r \cdot 2^{O(\alpha(r)^t \log \alpha(r))}, & j \text{ odd.} \end{cases}$$

(Note that $j$ is even if and only if $d$ is divisible by 4.)

Footnote 1. A convex curve is necessarily a point set in convex position. Proof: Suppose for a contradiction that $p \in \mathrm{conv}(\{q_0, \ldots, q_d\})$ where $p$ and $q_0, \ldots, q_d$ belong to a convex curve $\gamma$. Consider the order in which $\gamma$ passes through these $d + 2$ points. Let $q_i$, $q_j$ be adjacent in this order, and let $h$ be the hyperplane passing through the other $d$ points. Then $h$ must separate $q_i$ from $q_j$ (or else $p$ would lie outside of $\mathrm{conv}(\{q_0, \ldots, q_d\})$, assuming general position). Further, $\gamma$ intersects $h$ at most $d$ times, so it must do so at the $d$ points we used to define $h$. But then the segment of $\gamma$ connecting $q_i$ to $q_j$ contains at least one of these points, contradicting the adjacency assumption for $q_i$ and $q_j$.
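The parameters $j$ and $t$ of Theorem 1.4 are simple functions of the dimension $d$, and the parenthetical remark about the parity of $j$ can be confirmed by direct computation. A minimal sketch, with a hypothetical function name and restricted to $d \ge 2$:

```python
def theorem_1_4_parameters(d):
    """Return (j, t) as defined in Theorem 1.4, for a dimension d >= 2."""
    if d % 2 == 0:
        j = (d * d + d) // 2
    else:
        j = (d * d + 1) // 2
    t = (j - 2) // 2
    return j, t

for d in range(2, 9):
    j, t = theorem_1_4_parameters(d)
    print(f"d={d}: j={j}, t={t}, j even: {j % 2 == 0}")
```

For example, $d = 2$ gives $j = 3$, matching the reduction to stabbing with triples used for planar sets in convex position, and $d = 4$ is the first dimension in which $j$ is even.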


Figure 1.1: A 9-chain stabbed by a 5-tuple.

(Theorem 1.4 can be generalized to point sets on curves that are intersected at most $q$ times by every hyperplane, for fixed $q \ge d$. One obtains upper bounds of the form $r \cdot 2^{\mathrm{poly}(\alpha(r))}$, where the degree of the polynomial depends on $d$ and $q$.)

We also obtain a lower bound that shows that Theorem 1.4 is not too far from the truth in the worst case for $d \ge 3$:

Theorem 1.5. Let $d \ge 3$ be fixed, and let $t = \lfloor (d - 2)/2 \rfloor$. Then for every $r > 1$ there exists a point set $D_s$ which lies on a curve that is intersected at most $d$ times by every hyperplane, such that every weak $\frac{1}{r}$-net for $D_s$ has size at least

$$\begin{cases} r \cdot 2^{(1/t!)\alpha(r)^t - O(\alpha(r)^{t-1})}, & d \text{ even;} \\ r \cdot 2^{(1/t!)\alpha(r)^t \log_2 \alpha(r) - O(\alpha(r)^t)}, & d \text{ odd.} \end{cases}$$

The set $D_s$ is none other than the diagonal of the stretched grid $G_s$. We also show that the set $D_s$ does have a weak $\frac{1}{r}$-net of this asymptotic size (up to the lower-order term in the exponent).

Theorems 1.3, 1.4, and 1.5 follow by reduction to a new combinatorial problem, which we call stabbing interval chains. We present this problem and the bounds we obtained for it in Section 1.2.2 below.

Theorems 1.3 and 1.4 are joint work with Noga Alon, Haim Kaplan, Micha Sharir, and Shakhar Smorodinsky [5]. Theorem 1.5 is joint work with Boris Bukh and Jiří Matoušek [13].

1.2.2 Stabbing interval chains

As mentioned above, some of our results on weak $\varepsilon$-nets follow by reduction to a new combinatorial problem, which we call stabbing interval chains. The problem is as follows.

Recall that we write $[i, j] = \{i, i+1, \ldots, j\}$. An interval chain (see footnote 2) of size $k$ (also called a $k$-chain) is a sequence of $k$ consecutive, disjoint, nonempty intervals

$$C = I_1 I_2 \cdots I_k = [a_1, a_2][a_2 + 1, a_3] \cdots [a_k + 1, a_{k+1}],$$

where $a_1 \le a_2 < a_3 < \cdots < a_{k+1}$. We say that a $j$-tuple of integers $(p_1, \ldots, p_j)$ stabs an interval chain $C$ if each $p_i$ lies in a different interval of $C$ (see Figure 1.1).

Footnote 2. An identical definition of interval chains has already been given by Condon and Saks [18, sec. 2.2], for an unrelated application.

The problem is to stab, with as few $j$-tuples as possible, all interval chains of size $k$ that lie within a given range $[1, n]$. We let $Z_k^{(j)}(n)$ denote the minimum size of a collection of $j$-tuples that stab all $k$-chains that lie in $[1, n]$. Note that $Z_k^{(j)}(n)$ is increasing in $n$, decreasing in $k$, and increasing in $j$.

In this thesis we derive almost-tight upper and lower bounds for $Z_k^{(j)}(n)$, involving functions in the inverse Ackermann hierarchy. The case $j = 3$ is simpler (and tighter) than the general case $j \ge 4$, and we treat this case separately, in the analysis of both the upper and lower bounds.

Our bounds for stabbing interval chains are as follows:

Theorem 1.6. $Z_k^{(3)}(n)$ satisfies the following bounds:

$$Z_3^{(3)}(n) = \binom{n-1}{2}; \qquad Z_4^{(3)}(n) = \Theta(n \log n); \qquad Z_5^{(3)}(n) = \Theta(n \log \log n);$$

and, for every $k \ge 6$, we have

$$c' n \alpha_{\lfloor k/2 \rfloor}(n) - c'' n \le Z_k^{(3)}(n) \le c n \alpha_{\lfloor k/2 \rfloor}(n)$$

for all $n$, where $c$, $c'$, and $c''$ are absolute constants.

Theorem 1.7. Let $j \ge 4$ be fixed, and let $t = \lfloor (j - 2)/2 \rfloor$. Then there exist functions $P'_j(m)$, $Q'_j(m)$, both of which have upper and lower bounds of the form

$$P'_j(m),\ Q'_j(m) = \begin{cases} 2^{(1/t!) m^t \pm O(m^{t-1})}, & j \text{ even;} \\ 2^{(1/t!) m^t \log_2 m \pm O(m^t)}, & j \text{ odd;} \end{cases} \tag{1.7}$$

such that, for every $m \ge 2$, we have

$$Z_{P'_j(m)}^{(j)}(n) \le c_j n \alpha_m(n), \qquad Z_{Q'_j(m)}^{(j)}(n) \ge c'_j n \alpha_m(n) - c''_j n$$

for all $n$, where $c_j$, $c'_j$, and $c''_j$ are constants that depend only on $j$.

Thus, for every fixed $j$, once $k$ is sufficiently large, $Z_k^{(j)}(n)$ becomes barely superlinear in $n$. Moreover, if we let $k$ grow as an appropriate function of $\alpha(n)$, then the upper bounds become linear. Namely, we have $Z_k^{(3)}(n) = O(n)$ for $k \ge 2\alpha(n)$; and for $j \ge 4$, we have $Z_k^{(j)}(n) = O(n)$ for $k \ge P'_j(\alpha(n))$ (recall that $\alpha_{\alpha(n)}(n) \le 3$). However, if we let $k$ grow a little slower than that, then the lower bounds become again superlinear: If we let $k \le Q'_j(\alpha(n) - 3)$ then $Z_k^{(j)}(n)$ is superlinear in $n$ (recall Lemma 1.1).

These bounds for $Z_k^{(j)}(n)$ are used to prove Theorems 1.3–1.5: Theorem 1.3 follows by a reduction to stabbing interval chains with triples. Theorem 1.4 follows by a reduction to stabbing with $j$-tuples, for $j$ as given in the theorem, and Theorem 1.5 follows by a reduction to stabbing with $d$-tuples.

For completeness, we also derive almost-tight bounds for the case of stabbing with pairs ($j = 2$):

Lemma 1.8. We have

$$\frac{n}{\lfloor k/2 \rfloor} - 3 \le Z_k^{(2)}(n) \le \frac{n}{\lfloor k/2 \rfloor} - 1.$$

Our work on stabbing interval chains was done together with Noga Alon, Haim Kaplan, Micha Sharir, and Shakhar Smorodinsky [5]. However, the lower bounds, as we present them here in Theorems 1.6 and 1.7, are formulated in a stronger way than in [5]. We need this stronger formulation for the proof of Theorem 1.5.
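The definitions of interval chains and stabbing can be explored by brute force for small $n$. The sketch below is an illustration under stated assumptions rather than the construction used in the thesis: the function names are hypothetical, and the family of triples $(p, p+1, q)$ is one simple choice that has $\binom{n-1}{2}$ members and stabs every 3-chain in $[1, n]$, consistent with the value of $Z_3^{(3)}(n)$ in Theorem 1.6.

```python
from itertools import combinations
from math import comb

def chains(n, k):
    """Yield all k-chains in [1, n], each as a list of k intervals (lo, hi).

    A chain corresponds to cut points 0 <= b_0 < b_1 < ... < b_k <= n,
    with i-th interval [b_{i-1} + 1, b_i].
    """
    for b in combinations(range(n + 1), k + 1):
        yield [(b[i] + 1, b[i + 1]) for i in range(k)]

def stabs(tup, chain):
    """True if every entry of tup lies in a different interval of the chain."""
    hit = []
    for p in tup:
        idx = [i for i, (lo, hi) in enumerate(chain) if lo <= p <= hi]
        if not idx:
            return False  # p lies outside the chain altogether
        hit.append(idx[0])
    return len(set(hit)) == len(hit)

def stabs_all(family, n, k):
    """True if every k-chain in [1, n] is stabbed by some tuple of the family."""
    return all(any(stabs(t, c) for t in family) for c in chains(n, k))

def triples_for_3_chains(n):
    """The triples (p, p + 1, q) with 1 <= p and p + 2 <= q <= n."""
    return [(p, p + 1, q) for p in range(1, n - 1) for q in range(p + 2, n + 1)]

family = triples_for_3_chains(6)
print(len(family), comb(5, 2), stabs_all(family, 6, 3))
```

The family works because a 3-chain $[a_1, a_2][a_2 + 1, a_3][a_3 + 1, a_4]$ is stabbed by the triple $(a_2, a_2 + 1, a_3 + 1)$. Conversely, the chain $[p, p][p + 1, q - 1][q, q]$ can only be stabbed by a triple containing both $p$ and $q$, so each triple handles at most one chain of this special form.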

1.2.3 Davenport–Schinzel sequences

Davenport–Schinzel sequences are combinatorial objects with many applications in computational geometry. They are not related to weak $\varepsilon$-nets or to interval chains in any way (as far as we know), but Davenport–Schinzel sequences are known to have inverse-Ackermann bounds strikingly similar to the ones we obtained for these two latter problems.

Given a positive integer $s$, a sequence of symbols $S = a_1 a_2 a_3 \ldots$ is a Davenport–Schinzel sequence (DS-sequence, for short) of order $s$ if $a_i \ne a_{i+1}$ for all $i$, and if $S$ contains no alternating subsequence $a \cdots b \cdots a \cdots b \cdots$ of length $s + 2$ for any pair of distinct symbols $a$, $b$. The problem is to determine the maximum length of a Davenport–Schinzel sequence of fixed order $s$ containing only $n$ distinct symbols. This maximum length is denoted $\lambda_s(n)$.

These sequences are named after Harold Davenport and Andrzej Schinzel, who first studied them in 1965 [19]. The main motivation for Davenport–Schinzel sequences is the complexity of the lower envelope of a set of curves in the plane: Let $C = \{c_1, \ldots, c_n\}$ be a set of $n$ unbounded $x$-monotone curves in the plane (graphs of totally defined continuous functions), such that every pair of curves in $C$ intersect at most $s$ times. For example, $C$ could consist of graphs of polynomials of degree $s$. The pointwise minimum of the curves in $C$ is called the lower envelope of $C$. This lower envelope can be decomposed into maximal connected segments $\sigma_1, \sigma_2, \sigma_3, \ldots$, such that each segment $\sigma_i$ is a piece of some curve $c_{a_i} \in C$. The sequence of indices $S = a_1 a_2 a_3 \ldots$ is called the lower envelope sequence of $C$. See Figure 1.2 (left).

Figure 1.2: Left: A set of four unbounded $x$-monotone curves, each pair intersecting at most twice, and the corresponding lower envelope sequence. Right: A set of four nonvertical line segments and the corresponding lower envelope sequence.

The main observation is that this sequence $S$ is a Davenport–Schinzel sequence of order $s$: We have $a_i \ne a_{i+1}$ by the maximality condition on the segments $\sigma_i$, and $S$ cannot contain any alternation of the form $a \cdots b \cdots a \cdots b \cdots$ of length $s + 2$, or else the two corresponding curves in $C$ would intersect $s + 1$ times.

Now suppose $C$ consists of $n$ bounded $x$-monotone curves, where again each pair of curves intersect at most $s$ times. In this case, one can show that the lower envelope sequence of $C$ is a Davenport–Schinzel sequence of order $s + 2$. For example, if $C$ is a set of nonvertical line segments, then its lower envelope sequence is a Davenport–Schinzel sequence of order 3. See Figure 1.2 (right).

This basic idea has a large number of applications in discrete and computational geometry; the book [45] by Sharir and Agarwal is entirely devoted to this topic.

Bounds for Davenport–Schinzel sequences

As we said, the main problem is to bound the maximum length $\lambda_s(n)$ of a Davenport–Schinzel sequence of order $s$ on $n$ distinct symbols.

It is easy to show that $\lambda_1(n) = n$ (no $aba$) and $\lambda_2(n) = 2n - 1$ (no $abab$). However, bounding $\lambda_s(n)$ for $s \ge 3$ already becomes much more complicated.

Hart and Sharir showed in 1986 [23, 45] that $\lambda_3(n) = \Theta(n\alpha(n))$, where $\alpha(n)$ denotes, as above, the inverse Ackermann function. (For the upper bound see also Sharir [44] and Klazar [27], and for the lower bound see also Wiernik and Sharir [51], Komjáth [29], and Shor [46].)

The tightest previously known bounds for $\lambda_3(n)$ are

$$\tfrac{1}{2} n\alpha(n) - O(n) \le \lambda_3(n) \le 2n\alpha(n) + O\bigl(n\sqrt{\alpha(n)}\bigr). \tag{1.8}$$

The lower bound is due to Sharir and Agarwal [45] (based on the construction by Wiernik and Sharir [51]). The upper bound is due to Klazar [27]. Klazar [28] asks whether $\lim_{n\to\infty} \lambda_3(n)/(n\alpha(n))$ exists.

The current upper and lower bounds for $\lambda_s(n)$ for general $s$ were established by Agarwal, Sharir, and Shor in 1989 [2, 45], and are as follows. Let $t = \lfloor (s - 2)/2 \rfloor$. Then,

$$\lambda_s(n) \le \begin{cases} n \cdot 2^{\alpha(n)^t + O(\alpha(n)^{t-1})}, & s \ge 4 \text{ even;} \\ n \cdot 2^{\alpha(n)^t \log_2 \alpha(n) + O(\alpha(n)^t)}, & s \ge 3 \text{ odd;} \end{cases}$$

$$\lambda_s(n) \ge n \cdot 2^{(1/t!)\alpha(n)^t - O(\alpha(n)^{t-1})}, \quad s \ge 4 \text{ even.} \tag{1.9}$$

For odd $s \ge 5$ the asymptotically best lower bounds known are obtained by $\lambda_s(n) \ge \lambda_{s-1}(n)$.

Sharir and Agarwal's book [45] contains a complete derivation of the current upper and lower bounds for $\lambda_s(n)$ for all $s$.

Note the striking similarity between these bounds and the bounds we obtained for stabbing interval chains (Theorem 1.7), and for weak $\varepsilon$-nets (Theorems 1.3, 1.4, and 1.5). There is no connection between the two problems, as far as we know, besides the fact that they satisfy very similar recurrence relations; the bounds seem to arise “independently” in two different places.

Our results

In this thesis we derive several results on $\lambda_s(n)$. First, we improve the upper bounds for general $s$ by a constant factor in the exponent:

Theorem 1.9. Let $s \ge 3$ be fixed, and let $t = \lfloor (s - 2)/2 \rfloor$. Then

$$\lambda_s(n) \le \begin{cases} n \cdot 2^{(1/t!)\alpha(n)^t + O(\alpha(n)^{t-1})}, & s \text{ even;} \\ n \cdot 2^{(1/t!)\alpha(n)^t \log_2 \alpha(n) + O(\alpha(n)^t)}, & s \text{ odd.} \end{cases}$$

Thus, the bounds for $s$ even are now tight, up to lower-order terms in the exponent. For odd $s \ge 5$ there is still a $(\log \alpha(n))$-factor gap in the exponent between the upper and the lower bounds. We conjecture, by analogy to the bounds for stabbing interval chains, that the true bounds for $\lambda_s(n)$, $s \ge 5$ odd, do have the log-factor:


Conjecture 1.10. For every odd $s \ge 5$ we have

$$\lambda_s(n) \ge n \cdot 2^{(1/t!)\,\alpha(n)^t \log_2 \alpha(n) - O(\alpha(n)^t)},$$

where $t = (s-3)/2$.

We actually prove Theorem 1.9 in two different ways. We first prove it by making a small improvement in the argument of Agarwal, Sharir, and Shor [2]. We then prove it again by a new technique, using what we call almost-DS sequences.

An almost-DS sequence, roughly speaking, is a Davenport–Schinzel sequence with two additional restrictions: It must consist of at most $m$ blocks, each containing only distinct symbols; and every symbol must appear at least $k$ times in the sequence. Here $m$ and $k$ are, together with $s$, parameters of the definition. On the other hand, an almost-DS sequence is allowed to have repetitions at the interface between adjacent blocks (and that is why these are almost Davenport–Schinzel sequences).

In addition, we turn the problem around, and we ask for the maximum number of distinct symbols an almost-DS sequence can contain. We denote this quantity $\Pi_k^s(m)$. It turns out that $\Pi_k^s(m)$ behaves asymptotically very much like $Z_k^{(j)}(n)$ of stabbing interval chains, with $s$ taking the place of $j$ and $m$ the place of $n$. This makes almost-DS sequences, in our opinion, interesting objects in their own right (independently of their role in bounding $\lambda_s(n)$).

With our new technique of almost-DS sequences we also rederive Klazar's upper bound (1.8) for $\lambda_3(n)$. Regarding lower bounds, we obtain the exact leading coefficient of $\lambda_3(n)$:

Theorem 1.11. $\lambda_3(n) \ge 2n\alpha(n) - O(n)$.

Corollary 1.12. $\lim_{n\to\infty} \lambda_3(n)/(n\alpha(n)) = 2$.

In addition, we present a simpler variant of the construction of Agarwal, Sharir, and Shor [2, 45], which achieves the lower bounds (1.9) for $s \ge 4$ even. (A key step in the construction fails for $s \ge 5$ odd.)

Generalized Davenport–Schinzel sequences

Adamec, Klazar, and Valtr [1] considered a generalization of Davenport–Schinzel sequences, in which the forbidden pattern is not limited to $abab\ldots$, but can be an arbitrary sequence.
Recall that $|S|$ denotes the length of a sequence $S$, and that $\|S\|$ denotes the number of distinct symbols in $S$. Also recall that, if $S$ and $u$ are sequences, then we write $u \subset S$, and we say that


$S$ contains $u$, if $S$ contains a subsequence (not necessarily contiguous) which is isomorphic to $u$. A sequence $S = a_1 a_2 a_3 \ldots$ is called $r$-sparse if $a_i \ne a_j$ whenever $1 \le |j - i| \le r - 1$. In other words, $S$ is $r$-sparse if every interval in $S$ of length at most $r$ contains only distinct symbols.

Let $u$ (the forbidden pattern) be a sequence with $\|u\| = r$ distinct symbols and length $|u| = s$. Then we denote by $\mathrm{Ex}_u(n)$ the maximum length of an $r$-sparse, $u$-free sequence on $n$ distinct symbols. The standard Davenport–Schinzel sequences are obtained by taking $r = 2$ and $u = abab\ldots$ of length $s + 2$. The requirement of $r$-sparsity cannot be weakened, since an $(r-1)$-sparse, $u$-free sequence can be arbitrarily long; $r$-sparsity, however, ensures that $\mathrm{Ex}_u(n)$ is finite.

Generalized Davenport–Schinzel sequences have found several applications in discrete mathematics. Valtr [49] used generalized Davenport–Schinzel sequences to obtain bounds for some Turán-type problems for geometric graphs. Alon and Friedgut [4] used them to derive an almost-tight upper bound for the so-called Stanley–Wilf conjecture (the conjecture was later proved by Marcus and Tardos [31] by a different technique). For more information see the surveys by Klazar [28] and by Valtr [49]. More recently, Pettie [40] used generalized Davenport–Schinzel sequences to improve Sundar's [48] near-linear upper bound for the deque conjecture for splay trees.

Klazar in 1992 [26] developed a general technique for bounding $\mathrm{Ex}_u(n)$ in terms of only $r = \|u\|$ and $s = |u|$. His technique is based on considering what we call formation-free sequences (our name). Given integers $r$ and $s$, an $(r, s)$-formation is a sequence of $s$ permutations on $r$ symbols. For example, abcd dcab dcab cdba dabc is a $(4, 5)$-formation. An $(r, s)$-formation-free sequence is a sequence which is $r$-sparse and does not contain any $(r, s)$-formation as a subsequence.
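The two definitions above (r-sparsity and containing an (r, s)-formation) can be checked by brute force on small examples. The following is only an illustrative sketch, not part of the thesis: the function names `is_r_sparse` and `contains_formation` are hypothetical, and the containment test relies on the observation that, for a fixed set of r symbols, greedily cutting the sequence each time all r symbols have been seen maximizes the number of blocks.

```python
from itertools import combinations


def is_r_sparse(seq, r):
    """True if seq[i] != seq[j] whenever 1 <= |j - i| <= r - 1."""
    n = len(seq)
    return all(seq[i] != seq[j]
               for i in range(n)
               for j in range(i + 1, min(i + r, n)))


def contains_formation(seq, r, s):
    """True if some subsequence of seq is an (r, s)-formation.

    For each r-subset of the symbols, scan left to right and close a block
    as soon as every chosen symbol has been seen (greedy is optimal here).
    """
    for symbols in combinations(sorted(set(seq)), r):
        wanted = set(symbols)
        seen, blocks = set(), 0
        for c in seq:
            if c in wanted:
                seen.add(c)
                if seen == wanted:
                    blocks += 1
                    seen = set()
        if blocks >= s:
            return True
    return False


# The (4, 5)-formation from the text, written without spaces.
formation = "abcd" "dcab" "dcab" "cdba" "dabc"
print(contains_formation(formation, 4, 5))  # it is itself a (4, 5)-formation
print(is_r_sparse("abcabc", 3))
```

The search over all r-subsets is exponential in r, so this is only meant for experimenting with short sequences.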
Denote by $F_{r,s}(n)$ the length of the longest possible $(r, s)$-formation-free sequence on $n$ distinct symbols. Let $u$ be a sequence with $\|u\| = r$ and $|u| = s$. Since $u$ is trivially contained in every $(r, s)$-formation, it follows that $\mathrm{Ex}_u(n) \le F_{r,s}(n)$. Klazar made a slight improvement to this observation, by noting that if $r \ge 2$, then $u$ is contained in every $(r, s-1)$-formation, and thus,

$$\mathrm{Ex}_u(n) \le F_{r,s-1}(n) \qquad \text{for } r \ge 2. \tag{1.10}$$

(The case $r = 1$ is not interesting in any case.) Klazar proved the bound

$$F_{r,s}(n) \le n \cdot 2^{O(\alpha(n)^{s-3})},$$

where the $O$ notation hides constants that depend on $r$ and $s$. Together with (1.10), this implies that

$$\mathrm{Ex}_u(n) \le n \cdot 2^{O(\alpha(n)^{s-4})}.$$

Our results for generalized Davenport–Schinzel sequences

Our new technique of almost-DS sequences easily generalizes to formation-free sequences, yielding:

Theorem 1.13. For $s \ge 4$ we have

$$F_{r,s}(n) \le \begin{cases} n \cdot 2^{(1/t!)\,\alpha(n)^t + O(\alpha(n)^{t-1})}, & s \text{ odd;} \\ n \cdot 2^{(1/t!)\,\alpha(n)^t \log_2 \alpha(n) + O(\alpha(n)^t)}, & s \text{ even;} \end{cases}$$

where $t = \lfloor (s-3)/2 \rfloor$. (The $O$ notation hides factors dependent on $r$ and $s$.)

Independently, we improve on Klazar's bound (1.10):

Lemma 1.14. Let $u$ be a sequence with $\|u\| = r$, $|u| = s$. Then, $\mathrm{Ex}_u(n) \le F_{r,s-r+1}(n)$.

This, together with Theorem 1.13, yields (see footnote 3):

Theorem 1.15. Let $u$ be a sequence with $\|u\| = r$, $|u| = s$, and $s \ge r + 3$. Let $t = \lfloor (s - r - 2)/2 \rfloor$. Then,

$$\mathrm{Ex}_u(n) \le \begin{cases} n \cdot 2^{(1/t!)\,\alpha(n)^t + O(\alpha(n)^{t-1})}, & s - r \text{ even;} \\ n \cdot 2^{(1/t!)\,\alpha(n)^t \log_2 \alpha(n) + O(\alpha(n)^t)}, & s - r \text{ odd.} \end{cases}$$

Note that Theorem 1.15 is a generalization of Theorem 1.9: Taking $r = 2$ and $u = abab\ldots$ of length $s + 2$ yields Theorem 1.9 once again.

Our work on Davenport–Schinzel sequences and generalized Davenport–Schinzel sequences has been previously published in [37].

Footnote 3. Klazar himself [26] speculated that it should be possible to achieve roughly $\mathrm{Ex}_u(n) \le n \cdot 2^{O(\alpha(n)^{s/2})}$.

1.2.4 Selection lemmas

The first selection lemma

The following result, first shown by Bárány [9], is called the first selection lemma by Matoušek [33]: For every $n$-point set $S \subset \mathbb{R}^d$ there exists a point $x \in \mathbb{R}^d$ that is contained in at least $c_d n^{d+1} - O(n^d)$ of the $d$-dimensional simplices spanned by $S$, for some constants $c_d > 0$.

The problem is to determine the largest possible value of these constants $c_d$. For $d = 2$, Boros and Füredi [12] showed that $\frac{1}{27} \le c_2 \le \frac{1}{27} + \frac{1}{729}$ (they claimed that $c_2 \le \frac{1}{27}$, but their construction only yields this weaker upper bound; see [14]). The current lower bound for $d \ge 3$ is due to Wagner [50], who showed that a centerpoint of $S$ is always contained in at least $\gamma_d n^{d+1} - O(n^d)$ simplices, with

$$\gamma_d = \frac{d^2 + 1}{(d+1)!\,(d+1)^{d+1}}.$$

The only upper bound for $d \ge 3$ known until now is derived from the following result, which was proven by Kárteszi for $d = 2$ [25] (see also Moon [36, p. 7] and Boros and Füredi [11, 12]), and by Bárány for general $d$ [9]:

Lemma 1.16. For every $n$-point set $S \subset \mathbb{R}^d$ in general position and every point $x \in \mathbb{R}^d$, $x$ is contained in at most $n^{d+1}/(2^d (d+1)!) + O(n^d)$ $d$-simplices spanned by $S$.

(Lemma 1.16 is known to be tight.) This lemma yields a “trivial” upper bound of $c_d \le 1/(2^d (d+1)!)$ (trivial in the sense that it considers a worst-case $S$). In this thesis we present the first “nontrivial” upper bound for the first selection lemma:

Theorem 1.17. For every fixed $d \ge 2$ and every $n$ there exists an $n$-point set $S \subset \mathbb{R}^d$ such that every point $x \in \mathbb{R}^d$ is contained in at most

$$\left(\frac{n}{d+1}\right)^{d+1} + o(n^{d+1})$$

$d$-simplices spanned by $S$. (In particular, $c_2 = 1/27$, as Boros and Füredi tried to show.)

In fact, the set $S$ of the theorem is the same stretched grid $G_s$ that yields Theorem 1.2. Alternatively, the set $S$ could be taken to be $D_s$, the diagonal of the stretched grid which yields Theorem 1.5. Hence, Theorem 1.17 can also be achieved by point sets in convex position (see fn. 1 on p. 6). We conjecture that Theorem 1.17 is tight for every $d$. This is joint work with Boris Bukh and Jiří Matoušek [13, 14].


The second selection lemma

The following type of result is the planar case of what Matoušek [33] calls the second selection lemma: Let $S$ be a set of $n$ points in the plane, and let $T$ be a family of $t \le \binom{n}{3}$ triangles spanned by $S$. Then there exists a point in the plane that is contained in “many” triangles of $T$.

Aronov et al. [8] showed that there always exists a point contained in $\Omega(t^3/(n^6 \log^5 n))$ triangles of $T$. Eppstein [22] subsequently claimed to have improved the bound to $\Omega(t^3/(n^6 \log^2 n))$, but there is a problem with his proof. In this thesis we show that this bound is, nevertheless, correct:

Theorem 1.18. Let $S$ be a set of $n$ points in the plane, and let $T$ be a family of $t$ triangles spanned by $S$. Then, there exists a point in the plane that is contained in

$$\Omega\!\left(\frac{t^3}{n^6 \log^2 n}\right)$$

triangles of $T$.

The proof of Theorem 1.18 follows by a slight modification of Eppstein's argument. This is joint work with Micha Sharir [39].

Regarding upper bounds for the second selection lemma in the plane, Eppstein [22] showed that for every $n$-point set $S \subset \mathbb{R}^2$ in general position and for every $n^2 < t \le n^3$, there exists a family $T$ of $t$ triangles such that no point in the plane is contained in more than $O(t^2/n^3)$ triangles. In this thesis we improve this simple bound by a logarithmic factor (albeit only for special point sets):

Theorem 1.19. For every $n$ and every $n^{2.5} \log n < t \le n^3$ there exists an $n$-point set $S \subset \mathbb{R}^2$ and a family $T$ of $t$ triangles with vertices in $S$, such that no point $x \in \mathbb{R}^2$ is contained in more than

$$O\!\left(\frac{t^2}{n^3 \log(n^3/t)}\right)$$

triangles of $T$. (In particular, if $t < n^{3-\delta}$ for some constant $\delta > 0$, then the bound is $O(t^2/(n^3 \log n))$.)

The set $S$ of the theorem is, in fact, the stretched grid $G_s$ once again. This is joint work with Bukh and Matoušek [13].

The second selection lemma for general dimension is as follows: Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and let $T$ be a family of $t \le \binom{n}{d+1}$ $d$-simplices spanned by $S$. Then there exists a point $x \in \mathbb{R}^d$ that is contained in $\Omega((t/n^{d+1})^{s_d}\, n^{d+1})$ simplices of $T$, for some constants $s_d > 0$.


The second selection lemma for general dimension was established by the combined work of Alon et al. [3], Bárány et al. [10], and Živaljević and Vrećica [54]. Unfortunately, the proof involves some “heavy” tools from algebraic topology, and it gives very weak bounds, with $s_d = (4d + 1)^{d+1}$. The main application of the second selection lemma is in bounding the maximum number of $k$-sets, a problem we do not touch in this thesis (see footnote 4).

1.3 Organization of the thesis

This thesis is organized as follows: In Chapter 2 we derive our results on weak $\varepsilon$-nets. In Chapter 3 we derive our bounds for stabbing interval chains. In Chapter 4 we deal with Davenport–Schinzel sequences and their generalizations. Finally, in Chapter 5 we prove our results on the selection lemmas. Appendices A–C contain some technical calculations.

Footnote 4. But see our work [38] on simplifying and improving the lower bound construction for $k$-sets in the plane, which was carried out during the Ph.D. research period but was left out of the thesis because it did not quite fit its main theme.


Chapter 2 Weak epsilon-nets

In this chapter we prove our results on weak $\varepsilon$-nets (Theorems 1.2–1.5). We first prove Theorem 1.2 by constructing, for every fixed $d \ge 2$ and every $r > 1$, a point set $G_s \subset \mathbb{R}^d$ for which every weak $\frac{1}{r}$-net has size $\Omega(r \log^{d-1} r)$. Our point set $G_s$ is a stretched grid, and its analysis involves a variant of the notion of convexity, which we call stair-convexity. The stretched grid will also be used in Chapter 5.

2.1 The stretched grid and stair-convexity

The stretched grid $G_s$ is the Cartesian product $X_1 \times X_2 \times \cdots \times X_d$, where each $X_i$ is a suitable set of $m$ real numbers. The integer $m$ is a parameter of the construction of $G_s$, so we sometimes write $G_s = G_s(m)$, and $m$ has to be chosen sufficiently large in terms of $r$ and $d$.

The main idea in the construction of $G_s$ is that $X_2, X_3, \ldots, X_d$ are “fast-growing” sequences, and each $X_i$ grows much faster than $X_{i-1}$. For technical reasons, we will not define $G_s(m)$ uniquely; rather, we will introduce some conditions that the $X_i$ have to satisfy, and thus, formally speaking, $G_s(m)$ will stand for a whole class of grids.

We will define the $X_i$ by induction on $i$, together with relations $\ll_i$ on $\mathbb{R}$, which describe “at least how fast” the terms in $X_i$ must grow (but we will also use $\ll_i$ for comparing real numbers other than the members of $X_i$). Let us write $X_i = \{x_{i1}, x_{i2}, \ldots, x_{im}\}$, where $x_{i1} < x_{i2} < \cdots < x_{im}$. We start by letting $x \ll_1 y$ mean $x + 2d - 1 \le y$. Then we choose $X_1$ so that $x_{11} = 1$ and $x_{11} \ll_1 x_{12} \ll_1 \cdots \ll_1 x_{1m}$. Having defined $X_{i-1}$ and $\ll_{i-1}$, we set $K_i := (2d^2 x_{(i-1)m})^{2d-1}$, define $x \ll_i y$ to mean $K_i x \le y$, and choose $X_i$ so that $x_{i1} = 1$ and $x_{i1} \ll_i x_{i2} \ll_i \cdots \ll_i x_{im}$ (see footnote 1).

As we will explain, the intersections of convex sets with the stretched grid can be approximated, up to a small error, by sets that have a simple, essentially combinatorial description.

It is practically impossible to make a realistic drawing of the stretched grid, but we can conveniently think about it using a bijection with a uniform (equally spaced) grid. Namely, we define the uniform grid in the unit cube $[0, 1]^d$ by

$$G_u = G_u(m) := \left\{0, \tfrac{1}{m-1}, \tfrac{2}{m-1}, \ldots, \tfrac{m-1}{m-1}\right\}^d.$$

Let $\mathrm{BB}(G_s) := [1, x_{1m}] \times [1, x_{2m}] \times \cdots \times [1, x_{dm}]$ be the bounding box of $G_s$, and let $\pi : \mathrm{BB}(G_s) \to [0, 1]^d$ be a bijection that maps $G_s$ onto $G_u$ and preserves ordering in each coordinate; that is, we map points of $G_s$ to the corresponding points of $G_u$ and we squeeze the “elementary boxes” of $G_s$ onto the corresponding elementary boxes of $G_u$ (using, e.g., linear interpolation within each box).

Figure 2.1: The bijection transforming the stretched grid to the uniform grid: the images of two straight segments connecting grid points (left), and a schematic depiction of the image of a convex set, namely the convex hull of the points marked bold (right).

Figure 2.1 shows, for $d = 2$, the image under $\pi$ of two straight segments connecting grid points (left) and of a “generic” convex set (right). As we will argue shortly, the image of the straight segment $ab$, for example, first ascends almost vertically almost to the level of $b$, and then it continues almost horizontally towards $b$. This motivates a notion we call stair-convexity.

Footnote 1. This definition of $\ll_i$ differs slightly from the one we gave in [13], due to a different proof method we use here in some lemmas. Of course, this difference is of no conceptual significance.
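To get a feeling for how fast the coordinates of the stretched grid grow, the inductive definition of the sets X_i can be turned into a few lines of Python. This is only a sketch under an extra assumption that is not in the text: since G_s(m) stands for a whole class of grids, the hypothetical helper `stretched_grid_axes` simply picks the smallest integer values allowed by the conditions (each growth condition is taken with equality).

```python
def stretched_grid_axes(d, m):
    """Return [X_1, ..., X_d], one list of m integers per coordinate.

    Assumption for illustration: every growth condition is met with equality,
    which gives the smallest admissible grid.
    """
    # x <<_1 y means x + 2d - 1 <= y, and x_11 = 1.
    axes = [[1 + j * (2 * d - 1) for j in range(m)]]
    for _ in range(2, d + 1):
        largest_prev = axes[-1][-1]                   # x_{(i-1)m}
        K = (2 * d * d * largest_prev) ** (2 * d - 1) # K_i
        # x <<_i y means K_i * x <= y, and x_i1 = 1.
        axes.append([K ** j for j in range(m)])
    return axes


axes = stretched_grid_axes(2, 3)
print(axes[0])                        # first coordinate grows additively
print([len(str(x)) for x in axes[1]]) # digit counts of the second coordinate
```

Python integers have arbitrary precision, so the values are exact, but they become astronomically large already for small d and m; this is why the text works with the bijection π to the uniform grid instead of drawing G_s directly.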


Figure 2.2: Examples of a stair-path in the plane (left) and in 3-space (center). An example of a stair-convex set in the plane (right).

First we define, for points $a = (a_1, a_2, \ldots, a_d)$ and $b = (b_1, b_2, \ldots, b_d) \in \mathbb{R}^d$, the stair-path $\sigma(a, b)$. This is a polygonal path connecting $a$ and $b$ and consisting of at most $d$ closed line segments, each parallel to one of the coordinate axes. The definition proceeds by induction on $d$; for $d = 1$, $\sigma(a, b)$ is simply the segment $ab$. For $d \ge 2$, after possibly interchanging $a$ and $b$, let us assume $a_d \le b_d$. We set $a' := (a_1, a_2, \ldots, a_{d-1}, b_d)$ and we let $\sigma(a, b)$ be the union of the segment $aa'$ and of the stair-path $\sigma(a', b)$; for the latter we use the recursive definition after “forgetting” the (common) last coordinate of $a'$ and $b$. (Note that $\sigma(a', b)$ should be regarded as an undirected path, because of the possible exchange of $a$ and $b$ in the construction.) See Figure 2.2 for examples.

Now we define a set $S \subseteq \mathbb{R}^d$ to be stair-convex if for every $a, b \in S$ we have $\sigma(a, b) \subseteq S$. See Figure 2.2 again (and footnote 2).

Since the intersection of stair-convex sets is obviously stair-convex, we can also define the stair-convex hull $\mathrm{stconv}(X)$ of a set $X \subseteq \mathbb{R}^d$ as the intersection of all stair-convex sets containing $X$. As Figure 2.1 indicates, convex sets in the stretched grid transform to “almost” stair-convex sets. We will now express this connection formally.
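The recursive definition of the stair-path translates directly into code. The sketch below is illustrative only; the function name `stair_path` and the choice to return the list of vertices of the polygonal path (oriented from the endpoint with the smaller last coordinate) are assumptions made here, not notation from the thesis.

```python
def stair_path(a, b):
    """Vertices of the stair-path sigma(a, b), as a list of tuples.

    The list starts at the endpoint with the smaller last coordinate.
    Consecutive vertices differ in at most one coordinate; repeated
    vertices may occur when a segment degenerates to a point.
    """
    a, b = tuple(a), tuple(b)
    if len(a) == 1:
        return [a, b]                 # d = 1: the segment ab
    if a[-1] > b[-1]:
        a, b = b, a                   # ensure a_d <= b_d
    top = b[-1]
    # Stair-path between a' and b, computed after forgetting the last coordinate.
    sub = stair_path(a[:-1], b[:-1])
    if sub[0] != a[:-1]:
        sub.reverse()                 # the sub-path is undirected; orient it from a'
    # Segment a -> a', followed by the lifted (d-1)-dimensional stair-path.
    return [a] + [p + (top,) for p in sub]


print(stair_path((0, 0), (3, 2)))        # up first, then across
print(stair_path((0, 0, 0), (1, 2, 3)))  # last coordinate first, then recurse
```

In the plane this reproduces the behaviour described for the image of a segment under π: the path first moves vertically to the height of the higher endpoint and then horizontally.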

2.1.1 Properties of the stretched grid

We now state some basic properties of stair-convexity and the stretched grid. The proofs are somewhat technical and can be skipped on first reading; they appear in Section 2.1.3 below. The following lemma gives a local characterization of the stair-convex hull.

Footnote 2. Note that any line parallel to a coordinate axis intersects a stair-convex set in a (possibly empty) segment. Sets with this latter property are called rectilinearly convex, orthoconvex, or separately convex in various sources; however, stair-convexity is a considerably stronger property. Another notion somewhat resembling stair-convexity is that of the staircase connected sets studied by Magazanik and Perles [30].


Figure 2.3: A point $x$ in the plane is contained in the stair-convex hull of three points if and only if one of them lies below-left of $x$, another one lies below-right of $x$, and the third one lies above $x$.

Let $a = (a_1, \ldots, a_d)$ be a point in $\mathbb{R}^d$. We say that another point $b = (b_1, \ldots, b_d) \in \mathbb{R}^d$ has type 0 with respect to $a$ if $b_i \le a_i$ for every $i = 1, 2, \ldots, d$. For $j \in \{1, 2, \ldots, d\}$, we say that $b$ has type $j$ with respect to $a$ if $b_j \ge a_j$ but $b_i \le a_i$ for all $i = j + 1, \ldots, d$. (It may happen that $b$ has more than one type with respect to $a$, but only if some of the above inequalities are equalities. At any rate, $b$ always has at least one type with respect to $a$.)

Lemma 2.1. Let $X \subseteq \mathbb{R}^d$ be a point set, and let $x \in \mathbb{R}^d$ be a point. Then $x \in \mathrm{stconv}(X)$ if and only if $X$ contains a point of type $j$ with respect to $x$ for every $j = 0, 1, \ldots, d$. (See Figure 2.3.)

The next lemma shows that convex hulls and stair-convex hulls almost coincide in the stretched grid. Let us say that two points $a = (a_1, \ldots, a_d)$ and $b = (b_1, \ldots, b_d)$ in $\mathrm{BB}(G_s)$ are far apart if, for every $i = 1, 2, \ldots, d$, we have either $a_i \ll_i b_i$ or $b_i \ll_i a_i$.

Lemma 2.2. Let $P \subseteq \mathrm{BB}(G_s)$ be a point set and $q \in \mathrm{BB}(G_s)$ be a point in the bounding box of the stretched grid, such that every point of $P$ is far apart from $q$. Then $q \in \mathrm{stconv}(P)$ if and only if $q \in \mathrm{conv}(P)$.

(In [13] we prove a stronger version of Lemma 2.2: If $P, Q \subseteq \mathrm{BB}(G_s)$ are such that every point of $P$ is far apart from every point of $Q$, then $\mathrm{conv}(P) \cap \mathrm{conv}(Q) \ne \emptyset$ if and only if $\mathrm{stconv}(P) \cap \mathrm{stconv}(Q) \ne \emptyset$. However, Lemma 2.2 is sufficient for our purposes.)

Now recall that a set $N \subseteq [0, 1]^d$ is an $\varepsilon$-net for $[0, 1]^d$ with respect to stair-convex sets if $N \cap S \ne \emptyset$ for every stair-convex $S \subseteq [0, 1]^d$ with $\mathrm{vol}(S) \ge \varepsilon$ (where $\mathrm{vol}(\cdot)$ denotes the $d$-dimensional Lebesgue measure on $[0, 1]^d$).

Lemma 2.3 (Transference for weak $\varepsilon$-nets).
(i) Let $N$ be a weak $\varepsilon$-net (with respect to convex sets) for the $d$-dimensional stretched grid $G_s = G_s(m)$ of side $m$. Then the set $\pi(N) \subseteq [0, 1]^d$ is an $\varepsilon'$-net for $[0, 1]^d$ with respect to stair-convex sets with $\varepsilon' \le \varepsilon + O(|N|/m)$ (with the constant of proportionality depending on $d$).

(ii) Let $N$ be an $\varepsilon$-net for $[0, 1]^d$ with respect to stair-convex sets. Then $\pi^{-1}(N)$ is a weak $\varepsilon'$-net (with respect to convex sets) for $G_s(m)$ with $\varepsilon' \le \varepsilon + O(|N|/m)$, again with the constant of proportionality depending on $d$.

Theorem 1.2 immediately follows from Lemma 2.3(i) and the next proposition:

Proposition 2.4. Every $\frac{1}{r}$-net for $[0, 1]^d$ with respect to stair-convex sets has at least $\Omega(r \log^{d-1} r)$ points.

The proof, which we present in Section 2.1.2, is strongly inspired by Roth's beautiful lower bound in discrepancy theory [42]; also see [32] for a presentation of Roth's proof and wider context.

As we also show in Section 2.1.2, the lower bound in the proposition is actually tight (up to a constant factor). This means, via Lemma 2.3(ii), that the stretched grid itself is not going to provide any stronger lower bounds for weak $\varepsilon$-nets than those proved here.
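For a finite point set, the type-based characterization of Lemma 2.1 gives an immediate membership test for the stair-convex hull. The following sketch only illustrates that statement; the names `has_type` and `in_stair_convex_hull` are hypothetical, and coordinates are 0-indexed in the code while the lemma uses indices 1 to d.

```python
def has_type(b, a, j):
    """Does b have type j with respect to a, for j in 0, 1, ..., d?"""
    d = len(a)
    if j == 0:
        return all(b[i] <= a[i] for i in range(d))
    # Type j >= 1: b_j >= a_j, and b_i <= a_i for all i = j+1, ..., d.
    return b[j - 1] >= a[j - 1] and all(b[i] <= a[i] for i in range(j, d))


def in_stair_convex_hull(x, points):
    """Lemma 2.1 test: points must contain a point of every type 0..d w.r.t. x."""
    d = len(x)
    return all(any(has_type(p, x, j) for p in points) for j in range(d + 1))


# The planar situation of Figure 2.3: below-left, below-right, and above x.
triple = [(-1, -1), (1, -1), (0, 1)]
print(in_stair_convex_hull((0, 0), triple))
print(in_stair_convex_hull((0, 0), triple[:2]))  # no point above x any more
```

As a cross-check against the definition of the stair-path, the corner (0, 2) of σ((0, 0), (3, 2)) passes the test for the two-point set {(0, 0), (3, 2)}, whereas the opposite corner (3, 0) does not.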

2.1.2 Epsilon-nets with respect to stair-convex sets

Here we prove Proposition 2.4, stating that every $\frac{1}{r}$-net for $[0, 1]^d$ with respect to stair-convex sets has $\Omega(r \log^{d-1} r)$ points.

Proof of Proposition 2.4. Equivalently, we show that if $N \subseteq [0, 1]^d$ is an arbitrary set of $n$ points, then there exists a stair-convex set $S \subseteq [0, 1]^d$ of volume at least $\Omega((\log^{d-1} n)/n)$ that avoids $N$. We will produce such an $S$ as a union of suitable axis-parallel boxes.

Let $k = \Theta(\log n)$ be the unique integer satisfying $2^{d+1} n \le 2^k < 2^{d+2} n$, and let us call every integer vector $t = (t_1, t_2, \ldots, t_d)$ with $t_i \ge 1$ for all $i$ and with $t_1 + t_2 + \cdots + t_d = k$ a box type. For later use we record that the number $T$ of box types is $\binom{k-1}{d-1} = \Omega(k^{d-1})$.

Let $V := [\frac{1}{2}, 1]^d$ be the “upper right part” of the cube $[0, 1]^d$. For a box type $t$ and a point $p \in V$, we define the normal box of type $t$ anchored at $p$ as

$$B_t(p) := [p_1 - 2^{-t_1}, p_1] \times [p_2 - 2^{-t_2}, p_2] \times \cdots \times [p_d - 2^{-t_d}, p_d].$$

Since each side of $B_t(p)$ is at most $\frac{1}{2}$, each normal box is contained in $[0, 1]^d$.
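The counting facts used so far in the proof (the number of box types, and the volume and position of a normal box) are easy to confirm by enumeration for small parameters. The snippet below is a sanity-check sketch only; `box_types`, `normal_box` and `volume` are hypothetical helper names, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product
from math import comb


def box_types(k, d):
    """All integer vectors (t_1, ..., t_d) with t_i >= 1 and t_1 + ... + t_d = k."""
    return [t for t in product(range(1, k + 1), repeat=d) if sum(t) == k]


def normal_box(t, p):
    """B_t(p) as a list of (low, high) intervals, one per coordinate."""
    return [(pi - Fraction(1, 2 ** ti), pi) for ti, pi in zip(t, p)]


def volume(box):
    v = Fraction(1)
    for lo, hi in box:
        v *= hi - lo
    return v


k, d = 6, 3
types = box_types(k, d)
print(len(types), comb(k - 1, d - 1))  # the two counts should agree

p = (Fraction(1, 2), Fraction(3, 4), Fraction(1))  # a point of V = [1/2, 1]^d
print(all(volume(normal_box(t, p)) == Fraction(1, 2 ** k) for t in types))
```

The enumeration runs over all k^d candidate vectors, which is fine for the tiny values used here but not meant as an efficient generator of compositions.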


Figure 2.4: The fan $F(p)$ (left); the stair-convex set $S$ made of the empty boxes of $F(p_0)$ and the lower subboxes witnessing the volume of $S$ (right).

The volume of each normal box is $2^{-k} \le 1/(2^{d+1} n)$.

Let us call a normal box $B_t(p)$ empty if $B_t(p) \cap N = \emptyset$. We will show that for every box type $t$, if $p \in V$ is chosen uniformly at random, then

$$\Pr[B_t(p) \text{ is empty}] \ge \tfrac{1}{2}. \tag{2.1}$$

Indeed, for every point $x \in [0, 1]^d$ we have $\mathrm{vol}\{p \in V : x \in B_t(p)\} \le 2^{-k}$, which in probabilistic terms means $\Pr[x \in B_t(p)] \le 2^{-k}/\mathrm{vol}(V) = 2^{-k+d} \le \frac{1}{2n}$, and (2.1) follows by the union bound over the $n$ points of $N$.

Now we define the fan $F(p)$ of a point $p \in V$ as the set consisting of the normal boxes $B_t(p)$ for all the $T$ possible box types $t$ (see Figure 2.4 left). By (2.1) we get that for a random $p \in V$ the expected number of empty boxes in the fan of $p$ is at least $T/2$. Thus, there exists a particular point $p_0 \in V$ such that $F(p_0)$ has at least $T/2$ empty boxes. We define $S$ as the union of these empty boxes. Then $S \cap N = \emptyset$, and $S$ is clearly stair-convex (any union of axis-parallel boxes with a common “top-right” corner is stair-convex).

It remains to bound from below the volume of $S$. For an axis-parallel box $B = [a_1, a_1 + s_1] \times \cdots \times [a_d, a_d + s_d]$ we define the lower subbox $B' := [a_1, a_1 + \frac{1}{2} s_1] \times \cdots \times [a_d, a_d + \frac{1}{2} s_d]$. We observe that if $B_t(p)$ and $B_{t'}(p)$ are two normal boxes of different types $t \ne t'$ anchored at the same point, then their lower subboxes are disjoint. Hence, $\mathrm{vol}(S)$ is at least the sum of volumes of the lower subboxes of $T/2$ normal boxes, and so $\mathrm{vol}(S) \ge \frac{T}{2}\, 2^{-d}\, 2^{-k} = \Omega((\log^{d-1} n)/n)$. Proposition 2.4 is proved.

Now we show that Proposition 2.4 is asymptotically tight; namely, that for every $r \ge 1$ there exists a set $N \subset [0, 1]^d$, $|N| = O(r \log^{d-1} r)$, intersecting


every stair-convex $S \subseteq [0, 1]^d$ with $\mathrm{vol}(S) \ge \frac{1}{r}$.

We begin with the following fact: For every $s \ge 1$ there exists a set $N \subset [0, 1]^d$ of size $O(s)$ intersecting every axis-parallel box $B \subseteq [0, 1]^d$ with $\mathrm{vol}(B) \ge \frac{1}{s}$. Indeed, the Van der Corput set in the plane and the Halton–Hammersley sets in dimension $d$ have this property, as well as many other constructions of low-discrepancy sets (Faure sets, digital nets of Sobol, Niederreiter and others, etc.); see, e.g., [32].

Given $r \ge 1$, we now set $s := C r \log^{d-1} r$ for a sufficiently large constant $C$, and we let $N$ be a set as in the just mentioned fact. We claim that $N$ is the desired $\frac{1}{r}$-net for $[0, 1]^d$ with respect to stair-convex sets. This follows from the next lemma.

Lemma 2.5. Let $0 < v \le 1/e$ and let $S \subseteq [0, 1]^d$ be a stair-convex set that contains no axis-parallel box of volume larger than $v$. Then $\mathrm{vol}(S) \le e v \ln^{d-1} \frac{1}{v}$.

(Thus, a stair-convex set of volume $1/r$ must contain an axis-parallel box of volume $\Omega(1/(r \log^{d-1} r))$.)

Proof. We proceed by induction on $d$. The base case $d = 1$ is trivial, so we assume $d \ge 2$. Without loss of generality we can assume that $S$ intersects the “upper facet” of $[0, 1]^d$ (the facet of $[0, 1]^d$ with last coordinate equal to 1).

For $z \in [0, 1]$ let $h = h(1 - z)$ denote the “horizontal” hyperplane $\{x \in \mathbb{R}^d : x_d = 1 - z\}$. Let $S' := S \cap h$, and let $B'$ be an axis-parallel box of maximum $(d-1)$-dimensional volume in $S'$. We must have $\mathrm{vol}_{d-1}(B') \le \frac{v}{z}$, for otherwise, $B'$ could be extended upwards into a box $B$ of $d$-dimensional volume larger than $v$; since $B \subseteq S$ (by Lemma 2.6 below), this is a contradiction.

Since $S'$ is stair-convex (see again Lemma 2.6 below), for $z \ge ev$ the inductive assumption gives $\mathrm{vol}(S') \le \frac{ev}{z} \ln^{d-2} \frac{z}{v}$. We also have $\mathrm{vol}(S') \le 1$. So for $v \le 1/e$ we have

$$\mathrm{vol}(S) \le \int_0^{ev} dz + \int_{ev}^{1} \frac{ev}{z} \ln^{d-2} \frac{z}{v}\, dz = ev + \frac{ev}{d-1}\left(\ln^{d-1} \frac{1}{v} - 1\right) \le ev + ev\left(\ln^{d-1} \frac{1}{v} - 1\right) = ev \ln^{d-1} \frac{1}{v}.$$

This finishes the induction step, and thus establishes the claim.
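The closed form of the second integral in the proof of Lemma 2.5 follows from the substitution u = ln(z/v), and it can also be confirmed numerically. The sketch below is just such a sanity check, with hypothetical function names and a plain midpoint rule; the sample values of v and d are arbitrary choices satisfying 0 < v ≤ 1/e and d ≥ 2.

```python
import math


def closed_form(v, d):
    """(e v / (d - 1)) * (ln^{d-1}(1/v) - 1), the claimed value of the integral."""
    return math.e * v / (d - 1) * (math.log(1 / v) ** (d - 1) - 1)


def midpoint_integral(v, d, steps=200_000):
    """Midpoint-rule approximation of the integral of (e v / z) ln^{d-2}(z/v) over [e v, 1]."""
    lo, hi = math.e * v, 1.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        total += (math.e * v / z) * math.log(z / v) ** (d - 2)
    return total * h


for v, d in [(0.1, 2), (0.01, 3), (0.05, 4)]:
    numeric = midpoint_integral(v, d)
    exact = closed_form(v, d)
    bound = math.e * v * math.log(1 / v) ** (d - 1)  # e v ln^{d-1}(1/v)
    print(d, v, abs(numeric - exact) < 1e-6, math.e * v + exact <= bound + 1e-12)
```

The last column checks the final inequality of the proof, namely that ev plus the integral does not exceed ev ln^{d-1}(1/v); for d = 2 the two sides coincide, which is why a small tolerance is added.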

2.1.3 Properties of the stretched grid: Proofs

In this section we prove Lemmas 2.1 and 2.2, and then we use them to prove Lemma 2.3 (the transference lemma). Along the way, we establish other basic properties of stair-convexity.

Let us first introduce some notation. For a real number $y$ let $h(y)$ denote the “horizontal” hyperplane $\{x \in \mathbb{R}^d : x_d = y\}$. For a horizontal hyperplane $h = h(y)$ let $h^+ := \{x \in \mathbb{R}^d : x_d \ge y\}$ be the upper closed half-space bounded by $h$, and let $h^-$ be the lower closed half-space, defined analogously. For a set $S \subseteq \mathbb{R}^d$ let $S(y) := S \cap h(y)$ be the horizontal slice of $S$ at “height” $y$.

For a point $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ let $\overline{x} := (x_1, \ldots, x_{d-1})$ be the projection of $x$ into $\mathbb{R}^{d-1}$, and define $\overline{S}$ for $S \subset \mathbb{R}^d$ similarly. For a point $\overline{x} \in \mathbb{R}^{d-1}$ and a real number $x_d$, let $\overline{x} \times x_d := (x_1, \ldots, x_{d-1}, x_d)$, with a slight abuse of notation.

If $P$ and $Q$ are subsets of $\mathbb{R}^d$, we say that $P$ and $Q$ share the $i$-th coordinate if $p_i = q_i$ for some $p \in P$, $q \in Q$. Similarly, if $p \in \mathbb{R}^d$ and $Q \subset \mathbb{R}^d$, then we say that $p$ and $Q$ share the $i$-th coordinate if $\{p\}$ and $Q$ do so.

We begin with an equivalent, and perhaps somewhat more intuitive, description of stair-convex sets.

Lemma 2.6. A set $S \subseteq \mathbb{R}^d$ is stair-convex if and only if the following two conditions hold:

(sc1) For every $y \in \mathbb{R}$, the set $S(y)$ is a $(d-1)$-dimensional stair-convex set.

(sc2) (Slice-monotonicity) For every $y_1, y_2 \in \mathbb{R}$ with $y_1 \le y_2$ and $S(y_2) \ne \emptyset$, we have $\overline{S(y_1)} \subseteq \overline{S(y_2)}$.

Proof. First let $S$ be stair-convex. Condition (sc1) is clear from the definition of a stair-path. As for (sc2), we need to prove that for every $a = (a_1, \ldots, a_{d-1}, y_1) \in S(y_1)$ the point $a' := (a_1, \ldots, a_{d-1}, y_2)$ directly above $a$ lies in $S(y_2)$. But since $S(y_2) \ne \emptyset$, we can fix some $b \in S(y_2)$, and then $a'$ lies on the stair-path $\sigma(a, b)$ and so $a' \in S(y_2)$ indeed.

Conversely, let $S \subseteq \mathbb{R}^d$ satisfy (sc1) and (sc2), and let $a = (a_1, \ldots, a_d)$, $b = (b_1, \ldots, b_d) \in S$ with $a_d \le b_d$. Letting $a' := (a_1, \ldots, a_{d-1}, b_d)$ be the point directly above $a$ at the height of $b$ as in the definition of the stair-path $\sigma(a, b)$, we have $\sigma(a', b) \subseteq S$ by the stair-convexity of $S(b_d)$ and $aa' \subseteq S$ by (sc2).

Lemma 2.7. The stair-convex hull of a set $X \subseteq \mathbb{R}^d$ can be (recursively) characterized as follows: For every horizontal hyperplane $h = h(y)$ that does not lie entirely above $X$, let $X'$ stand for the vertical projection of $X \cap h^-$ into $h$. Then $h \cap \mathrm{stconv}(X) = \mathrm{stconv}(X')$ (where $\mathrm{stconv}(X')$ is a stair-convex hull in dimension $d - 1$).


Proof. First we prove the inclusion $\mathrm{stconv}(X') \subseteq h \cap \mathrm{stconv}(X)$. Let us fix a point $x_0 \in X \cap h^+$ (i.e., above $h$ or on it), and let $x$ be an arbitrary point of $X \cap h^-$. Then $x'$, the vertical projection of $x$ into $h$, lies on the stair-path $\sigma(x, x_0)$, and thus $X' \subseteq h \cap \mathrm{stconv}(X)$. Since $h \cap \mathrm{stconv}(X)$ is stair-convex (by (sc1) in Lemma 2.6), we also have $\mathrm{stconv}(X') \subseteq h \cap \mathrm{stconv}(X)$.

To establish the reverse inclusion, it suffices to show that for every $(d-1)$-dimensional stair-convex $S' \subseteq h$ that contains $X'$ there is a $d$-dimensional stair-convex set $S$ with $S \cap h = S'$ that contains $X$. Such an $S$ can be defined as $(\mathbb{R}^d \setminus h^-) \cup P^-(S')$, where $P^-(S') = \{(x_1, \ldots, x_d) \in \mathbb{R}^d : (x_1, \ldots, x_{d-1}, y) \in S',\ x_d \le y\}$ is the semi-infinite vertical prism obtained by extruding $S'$ downwards. The stair-convexity of $S$ follows from Lemma 2.6.

Next, we prove Lemma 2.1, which asserts that a point $x$ lies in the stair-convex hull of a set $X$ if and only if $X$ contains a point of type $j$ with respect to $x$ for every $j = 0, 1, \ldots, d$.

Proof of Lemma 2.1. Again, refer to Figure 2.3. Both directions follow by induction on $d$. The case $d = 1$ is trivial, and so we assume $d \ge 2$. Let $h$ be the horizontal hyperplane containing $x$.

First we suppose $x \in \mathrm{stconv}(X)$. There exists a point $p_d \in X$ whose last coordinate is at least as large as that of $x$, and this $p_d$ has type $d$ with respect to $x$. Next, let $X'$ be the vertical projection of $X \cap h^-$ into $h$ as in Lemma 2.7. By that lemma we have $x \in \mathrm{stconv}(X')$, and so, by induction, $X'$ contains points $p'_0, \ldots, p'_{d-1}$ (not necessarily distinct) of types $0, \ldots, d-1$, respectively, with respect to $x$. The corresponding points $p_0, \ldots, p_{d-1} \in X$ also have types $0, \ldots, d-1$ with respect to $x$.

For the other direction, we suppose that there are points $p_0, \ldots, p_d \in X$ of types $0, \ldots, d$ with respect to $x$. Then the vertical projections of $p_0, \ldots, p_{d-1}$ into $h$ also have types $0, \ldots, d-1$ with respect to $x$, and so by the inductive hypothesis, their stair-convex hull contains $x$. Since $p_d \in h^+$, it follows, again by Lemma 2.7, that $x \in \mathrm{stconv}(\{p_0, \ldots, p_d\})$.

We proceed to prove Lemma 2.2. We first establish some additional properties of stair-convex hulls. (Here we give a different proof of Lemma 2.2 from the one in [13].)

Lemma 2.8. Let $Q$ be a $k$-point set in $\mathbb{R}^d$ for some $k \le d + 1$, and let $p$ be a point in $\mathrm{stconv}(Q)$. Then $p$ shares at least $d - k + 1$ coordinates with $Q$.

Proof. By induction on $d$. Let $q$ be the highest point of $Q$. First suppose $p_d = q_d$. Then, $p$ shares the last coordinate with $Q$. Further, by induction, $\overline{p}$ shares at least $d - k$ coordinates with $\overline{Q}$, so $p$ also shares at least $d - k$ out of the first $d - 1$ coordinates with $Q$, and we are done.


Next suppose $p_d < q_d$. In this case, let $Q' = Q \setminus \{q\}$. Then, by Lemma 2.7 and by induction, $\overline{p}$ shares at least $d - (k - 1)$ coordinates with $\overline{Q'}$, so the same is true of $p$ and $Q'$, and thus of $p$ and $Q$.

We now prove a simple property of convexity, and its stair-convex analogue:

Lemma 2.9. Suppose $p \in \mathbb{R}^d$ is contained in $\mathrm{conv}(Q)$ where $Q = \{q_1, \ldots, q_k\}$ for some $k \le d + 1$. Then there exist points $r_1, \ldots, r_k \in \mathbb{R}^d$ such that $r_1 = q_1$, $r_k = p$, and for every $i = 2, \ldots, k$ the point $r_i$ lies in the segment $r_{i-1} q_i$.

(In other words, we can get to $p$ by starting at $q_1$ and “walking” towards $q_2, q_3, \ldots, q_k$ in succession.)

Proof. Write $p$ as $p = \alpha_1 q_1 + \cdots + \alpha_k q_k$ where $\sum \alpha_i = 1$ and $\alpha_i \ge 0$ for every $i$. If $\alpha_1 = \cdots = \alpha_{k-1} = 0$ then $p = q_k$, so we can let $r_1 = \cdots = r_{k-1} = q_1$ and $r_k = p$. Otherwise, let $r_{k-1} = \sum_{i=1}^{k-1} (\alpha_i/(\alpha_1 + \cdots + \alpha_{k-1}))\, q_i$. We have $p \in r_{k-1} q_k$, and $r_{k-1} \in \mathrm{conv}(\{q_1, \ldots, q_{k-1}\})$, so we can find $r_{k-2}, \ldots, r_1$ by induction on $k$.

This is the stair-convex analogue of Lemma 2.9:

Lemma 2.10. Suppose $p \in \mathbb{R}^d$ is contained in $\mathrm{stconv}(Q)$ where $Q = \{q_1, \ldots, q_k\} \subset \mathbb{R}^d$ for some $k \le d + 1$. Then there exist points $r_1, \ldots, r_k \in \mathbb{R}^d$ such that $r_1 = q_1$, $r_k = p$, and for every $i = 2, \ldots, k$ we have $r_i \in \sigma(r_{i-1}, q_i)$.

Proof. By induction on $d$. Let $h = h(p_d)$ be the horizontal hyperplane containing $p$. Let $Q' = Q \cap h^-$. Suppose without loss of generality that $Q' = \{q_1, \ldots, q_{k'}\}$ for some $k' \le k$.

We have $\overline{p} \in \mathrm{stconv}(\overline{Q'})$, so by induction there exist points $r_1^*, \ldots, r_{k'}^* \in \mathbb{R}^{d-1}$ such that $r_1^* = \overline{q_1}$, $r_{k'}^* = \overline{p}$, and for each $i = 2, \ldots, k'$ we have $r_i^* \in \sigma(r_{i-1}^*, \overline{q_i})$.

Coming back to $\mathbb{R}^d$, let $r_1 = r_1^* \times q_{1d}$ and, for $i \ge 2$, let $r_i = r_i^* \times \max\{r_{(i-1)d}, q_{id}\}$. Then $r_1 = q_1$ and $r_i \in \sigma(r_{i-1}, q_i)$ for $i \ge 2$.

Furthermore, we have $r_{k'd} \le p_d$. If $r_{k'd} = p_d$ then we are done. Otherwise, there exists a point $q_{k'+1} \in Q \setminus h^-$ which we have not used. Then let $r_{k'+1} = p$; we have $r_{k'+1} \in \sigma(r_{k'}, q_{k'+1})$, and we are done.
We now define auxiliary “much-smaller-than” relations $\prec_i$ for $i = 1, \ldots, d$, as follows: $x \prec_1 y$ means $x + 1 \le y$; and for $i \ge 2$, $x \prec_i y$ means $L_i x \le y$ with $L_i = 2d^2 x_{(i-1)m}$, where $x_{(i-1)m}$ is the largest element of $X_{i-1}$. Let $\prec_i^k$ denote the $k$-fold composition of the relation $\prec_i$. Note that the relations $\ll_i$ which we used to define the sets $X_i$ are $(2d-1)$-fold compositions of these “auxiliary” relations $\prec_i$; that is, ${\ll_i} \equiv {\prec_i^{2d-1}}$.

2.1. THE STRETCHED GRID AND STAIR-CONVEXITY


Let $a, b$ be two points in $BB(G_s)$. We say that $a$ and $b$ are $k$-far apart in coordinate $i$ if either $a_i \prec_i^k b_i$ or $b_i \prec_i^k a_i$. Otherwise, $a$ and $b$ are $k$-close in coordinate $i$. Points $a$ and $b$ are $k$-far apart if they are $k$-far apart in every coordinate, and they are $k$-close if they are $k$-close in every coordinate.

Recall from Section 2.1.1 that points $a$ and $b$ are far apart if for each $i$ we have either $a_i \ll_i b_i$ or $b_i \ll_i a_i$. Thus, $a$ and $b$ are far apart if and only if they are $(2d-1)$-far apart.

The connection between the stretched grid and stair-convexity is based on the following lemma. This lemma is the only place where the actual definition of $\ll_i$ is used:

Lemma 2.11. Let $a, b$ be two points in $BB(G_s)$, and let $ab$ and $\sigma(a, b)$ be the line segment and the stair-path between $a$ and $b$, respectively. Then every point in $ab$ is 1-close to a point in $\sigma(a, b)$ and vice versa.

To prove Lemma 2.11 we prove by induction a slightly stronger lemma. Given a point $p \in \mathbb{R}^d$ and a real number $\delta > 0$, let
\[ N_\delta(p) = \{q : |p_i - q_i| \le \delta \text{ for every } i = 1, \ldots, d\} \]
be the $L_\infty$ $\delta$-neighborhood of $p$.

Lemma 2.12. Let $a, a^*, b, b^*$ be points in $BB(G_s)$ such that $a^* \in N_{1/d}(a)$ and $b^* \in N_{1/d}(b)$. Then every point of the segment $a^*b^*$ is 1-close to a point of $\sigma(a, b)$, and every point of $\sigma(a, b)$ is 1-close to a point of the segment $a^*b^*$.

Proof. Assume without loss of generality that $a_d \le b_d$. Let $c$ be the point directly above $a$ at the same height as $b$. Let
\[ P = \{x \in \mathbb{R}^d : |x_i - a_i| \le 1/(d-1) \text{ for all } 1 \le i \le d-1\} \]
be an infinite vertical prism of cubical cross-section of side $2/(d-1)$, centered at point $a$. Thus, every “wall” of $P$ is at distance $1/(d-1)$ from segment $ac$. Let $c^*$ be the point of intersection of segment $a^*b^*$ and the boundary of $P$ (if such an intersection exists; otherwise the claim follows easily). We claim that $c^*$ is 1-close to $c$ in the last coordinate. See Figure 2.5 (left).

Indeed, there exists a coordinate $i \le d-1$ for which $|a_i - c_i^*| = 1/(d-1)$; for this coordinate we have $|a_i^* - c_i^*| \ge 1/(d-1) - 1/d > 1/d^2$. Since the points $a^*, c^*, b^*$ are collinear, with $a_d^* \le c_d^* \le b_d^*$, we have $(b_d^* - a_d^*)/(c_d^* - a_d^*) = |b_i^* - a_i^*|/|c_i^* - a_i^*|$; we also have $|b_i^* - a_i^*| \le x_{im} \le x_{(d-1)m}$, so
\[ \frac{b_d^*}{c_d^*} \le \frac{b_d^* - a_d^*}{c_d^* - a_d^*} = \frac{|b_i^* - a_i^*|}{|c_i^* - a_i^*|} < d^2 x_{(d-1)m}. \]


Hence, $c_d = b_d \le b_d^* + 1/d < 2b_d^* < 2d^2 x_{(d-1)m}\, c_d^*$, proving that $c^*$ is 1-close to $c$ in the last coordinate.

Now the claim follows easily: Split the stair-path $\sigma(a, b)$ into segment $ac$ and stair-path $\sigma(c, b)$, and split the segment $a^*b^*$ into $a^*c^*$ and $c^*b^*$. Every point of $ac$ is 1-close to a point of $a^*c^*$ and vice versa (with plenty of room to spare in the first $d-1$ coordinates). And every point of $\sigma(c, b)$ is 1-close to a point of $c^*b^*$ and vice versa: The last coordinate has already been taken care of, and for the first $d-1$ coordinates we use induction (noting that $|c_i - c_i^*| \le 1/(d-1)$ for every $i \le d-1$ by the definition of $P$).³

Figure 2.5: Left: Proof of Lemma 2.12. Right: Proof of Lemma 2.13. In each case, the picture shows the image under $\pi$ in the uniform grid.

³ The relations $\prec_i$ used to define 1-closeness also depend on $d$, the “target” dimension, and on $m$, the side of the grid. Therefore, strictly speaking, we should write $\prec_{i,d,m}$. The claim by induction here refers to the relations $\prec_{i,d-1,m}$; but $x \prec_{i,d,m} y$ implies $x \prec_{i,d-1,m} y$, so we are safe.

Lemma 2.13. Let $Q = \{q_1, \ldots, q_k\} \subset BB(G_s)$ for some $k \le d+1$. Then every point of $\mathrm{conv}(Q)$ is $(k-1)$-close to a point of $\mathrm{stconv}(Q)$ and vice versa.

Proof. Let $p$ be a point in $\mathrm{conv}(Q)$. By Lemma 2.9 there exist points $r_1, \ldots, r_k$ such that $r_1 = q_1$, $r_k = p$, and $r_i$ lies in the segment $r_{i-1}q_i$ for every $i = 2, \ldots, k$. Then, by repeatedly applying Lemma 2.11, we can find points $r_1', \ldots, r_k'$ such that $r_1' = q_1$, $r_i' \in \sigma(r_{i-1}', q_i)$ for every $2 \le i \le k$, and $r_i'$ is $(i-1)$-close to $r_i$ for every $2 \le i \le k$: Suppose we have already found $r_{i-1}'$ with these properties. By Lemma 2.11, there is a point $r_i^* \in \sigma(r_{i-1}, q_i)$ which is 1-close to $r_i \in r_{i-1}q_i$; then let $r_i'$ be a point in $\sigma(r_{i-1}', q_i)$ which is $(i-2)$-close to $r_i^*$. See Figure 2.5 (right).


Since $r_k' \in \mathrm{stconv}(Q)$, the claim follows. The other direction follows analogously, using Lemma 2.10.

Lemma 2.14. Let $P$ be a finite point set in $BB(G_s)$. Then every point in the boundary of $\mathrm{conv}(P)$ is $(d-1)$-close to some point of $P$ in some coordinate.

Proof. Every point $p$ in the boundary of $\mathrm{conv}(P)$ lies in the convex hull of a subset $P' \subseteq P$ of size at most $d$. By Lemma 2.13 there is a point $p' \in \mathrm{stconv}(P')$ which is $(d-1)$-close to $p$. Finally, by Lemma 2.8, $p'$ shares at least one coordinate with $P'$.

We can finally prove Lemma 2.2:

Proof of Lemma 2.2. Suppose $P \subset BB(G_s)$ and $q \in BB(G_s)$ are such that $q$ is $(2d-1)$-far from every point of $P$. The claim is that $q \in \mathrm{conv}(P)$ if and only if $q \in \mathrm{stconv}(P)$.

Suppose $q \in \mathrm{conv}(P)$. By Carathéodory's theorem, $q \in \mathrm{conv}(P')$ for some subset $P' \subseteq P$ of size at most $d+1$. Therefore, Lemma 2.13 implies there exists a point $q' \in \mathrm{stconv}(P')$ which is $d$-close to $q$. But since every point of $P'$ is $d$-far from $q$, there is no combinatorial difference between $q$ and $q'$ as far as containment in $\mathrm{stconv}(P')$ is concerned; thus, we also have $q \in \mathrm{stconv}(P') \subseteq \mathrm{stconv}(P)$, as desired.

Now suppose $q \in \mathrm{stconv}(P)$. By Lemmas 2.1 and 2.13 there exists a point $q' \in \mathrm{conv}(P)$ which is $d$-close to $q$. If $q$ were not contained in $\mathrm{conv}(P)$ then the segment $qq'$ would intersect the boundary of $\mathrm{conv}(P)$ at some point $q^*$. But by Lemma 2.14, $q^*$ is $(d-1)$-close to some point $p \in P$ in some coordinate $i$. Therefore, $q$ is $(2d-1)$-close to $p$ in coordinate $i$; contradiction.

Next, we derive auxiliary results needed for the proof of the transference lemma (Lemma 2.3). Given sets $P, Q \subseteq \mathbb{R}^d$, we define the operation $P \ominus Q := \{p \in P : p + Q \subseteq P\}$, where $p + Q = \{p + q : q \in Q\}$.

Lemma 2.15. Let $S \subseteq [0, 1]^d$ be a stair-convex set, and let $G_u = G_u(m)$ be the uniform grid of side $m$. Then, for every $\delta > 0$, the set $S^{\delta-} := S \ominus [0, \delta]^d$ is stair-convex,
\[ \mathrm{vol}(S^{\delta-}) \ge \mathrm{vol}(S) - d\delta, \qquad\text{and}\qquad |S^{\delta-} \cap G_u| \ge |S \cap G_u| - d\lceil (m-1)\delta \rceil m^{d-1}. \]

Proof. For an index $i \in \{1, 2, \ldots, d\}$ and $\delta > 0$ let $s_i(\delta)$ be the initial closed segment of the positive $x_i$-axis (starting at the origin) of length $\delta$.


We prove that for every stair-convex $S \subseteq [0, 1]^d$ and every $\delta > 0$ the set $S' := S \ominus s_i(\delta)$ is stair-convex, has volume at least $\mathrm{vol}(S) - \delta$, and contains at least $|S \cap G_u| - \lceil (m-1)\delta \rceil m^{d-1}$ points of $G_u$. The assertion of the lemma then follows by $d$-fold application of this statement and by noticing that $S \ominus [0, \delta]^d = S \ominus s_1(\delta) \ominus \cdots \ominus s_d(\delta)$.

As for the stair-convexity of $S'$, the following actually holds: If $S$ is stair-convex and $D$ is arbitrary, then $S \ominus D$ is stair-convex too. This follows from the translation invariance of stair-paths. Namely, $\sigma(a + x, b + x) = x + \sigma(a, b)$, and thus for $a, b \in S \ominus D$ we have $a + x$ and $b + x$ in $S$ for all $x \in D$, so $x + \sigma(a, b) = \sigma(a + x, b + x) \subseteq S$, and thus $\sigma(a, b) \subseteq S \ominus D$.

The claim about $\mathrm{vol}(S')$ follows by Fubini's theorem, since $S \setminus S'$ intersects every line parallel to the $x_i$-axis in a single segment of length at most $\delta$. The claim about the number of grid points follows similarly, by noticing that the grid $G_u(m)$ has step $\frac{1}{m-1}$ and thus $S \setminus S'$ contains at most $\lceil \delta(m-1) \rceil$ grid points on each line parallel to the $x_i$-axis.

Corollary 2.16 (Grid approximation). Let $S \subseteq [0, 1]^d$ be a stair-convex set, and let $g_S = |S \cap G_u(m)|$ be the number of points of the uniform grid contained in $S$. Then,
\[ \bigl| g_S - (m-1)^d\, \mathrm{vol}(S) \bigr| \le d\, m^{d-1}. \]

Proof. Let $\delta := \frac{1}{m-1}$ be the step of the grid $G_u(m)$. For every grid point $p \in G_u \cap S^{\delta-}$, the cube $p + [0, \delta]^d$ is contained in $S$, and since such cubes have disjoint interiors, we have $\mathrm{vol}(S) \ge \delta^d |S^{\delta-} \cap G_u| \ge \delta^d g_S - \delta^d d\, m^{d-1}$ by the second inequality in Lemma 2.15. Multiplying by $\delta^{-d}$ we get $g_S \le (m-1)^d\, \mathrm{vol}(S) + d\, m^{d-1}$, one of the inequalities in the corollary.

For the other inequality, we observe that if $p \in G_u(m)$ is a grid point such that the cube $p + [-\delta, 0]^d$ intersects $S^{\delta-}$, then $p \in S$. So using the first inequality of Lemma 2.15 gives $\mathrm{vol}(S) \le \mathrm{vol}(S^{\delta-}) + d\delta \le \delta^d g_S + d\delta$, and we are done.
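As a sanity check of Corollary 2.16, the following sketch compares the grid count and the scaled volume for axis-parallel boxes, which are a simple special case of stair-convex sets. The helper names and the choice of boxes are illustrative assumptions; exact rational arithmetic is used so that grid points on the boundary are counted reliably.

```python
from fractions import Fraction


def grid_count_in_box(box, m):
    """Number of points of the uniform grid {0, 1/(m-1), ..., 1}^d that
    lie in the closed box given as a list of (lo, hi) pairs."""
    count = 1
    for lo, hi in box:
        count *= sum(1 for j in range(m) if lo <= Fraction(j, m - 1) <= hi)
    return count


def box_volume(box):
    vol = Fraction(1)
    for lo, hi in box:
        vol *= hi - lo
    return vol


def approximation_gap(box, m):
    """Return (|g_S - (m-1)^d vol(S)|, d * m^(d-1)) for the box S."""
    d = len(box)
    g = grid_count_in_box(box, m)
    return abs(g - (m - 1) ** d * box_volume(box)), d * m ** (d - 1)


if __name__ == "__main__":
    box = [(Fraction(13, 100), Fraction(78, 100)),
           (Fraction(1, 5), Fraction(19, 20))]
    for m in (5, 11, 101):
        gap, bound = approximation_gap(box, m)
        print(m, float(gap), bound, gap <= bound)
```

For boxes the bound is far from tight; the corollary matters because it holds uniformly over all stair-convex sets.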

Proof of Lemma 2.3. Let us prove part (i). So let $N$ be a weak $\varepsilon$-net for $G_s = G_s(m)$, and let $s = |N|$. Let us call a point $p \in G_s$ good if it is far apart from every point of $N$; otherwise, $p$ is bad. There are at most $2dsm^{d-1}$ bad points in $G_s$.

Let $\varepsilon' := \varepsilon + 2d(s+1)/m$, and let us consider a stair-convex set $S' \subseteq [0, 1]^d$ of volume $\varepsilon'$. By Corollary 2.16, $S'$ contains a set $P' \subseteq G_u$ of at least $\varepsilon'(m-1)^d - dm^{d-1}$ grid points. Let $P = \pi^{-1}(P')$ be the corresponding subset of $G_s$. By removing all bad points from $P$ we obtain a set $P^*$ of at least $\varepsilon'(m-1)^d - d(2s+1)m^{d-1} \ge \varepsilon m^d$ good points. Since $N$ is a weak $\varepsilon$-net, there exists a point $x \in N \cap \mathrm{conv}(P^*)$.


Since all points of $P^*$ are far apart from $x$, it follows by Lemma 2.2 that $x \in \mathrm{stconv}(P^*)$. Further, $\pi$ preserves order in each coordinate, so $x' := \pi(x) \in \mathrm{stconv}(\pi(P^*)) \subseteq S'$. Since $x' \in \pi(N)$, this proves that $\pi(N)$ intersects every stair-convex set of volume $\varepsilon'$ in $[0, 1]^d$. This finishes the proof of part (i) of the transference lemma.

Part (ii) is proved similarly, only with the roles of convexity and stair-convexity interchanged.

2.2 Intrinsically one-dimensional sets

In this section we prove Theorems 1.3, 1.4, and 1.5. In each case, we perform a reduction to the problem of stabbing interval chains, and then we apply either the lower or the upper bound for this latter problem (Theorems 1.6 and 1.7). Recall that $Z_k^{(j)}(n)$ denotes the minimum number of $j$-tuples needed to stab all $k$-interval chains in $[1, n]$.

2.2.1 Planar point sets in convex position

Recall that Theorem 1.3 states that if $X \subset \mathbb{R}^2$ is in convex position, then $X$ has a weak $\frac{1}{r}$-net of size $O(r\alpha(r))$.

Proof of Theorem 1.3.

Lemma 2.17. Let $X$ be a set of $n$ points in convex position in the plane, and let $r > 1$. Then $X$ has a weak $\frac{1}{r}$-net of size $Z^{(3)}_{\ell/r-1}(\ell)$, where $\ell$ is a free parameter with $4r \le \ell < n$.

Proof. Partition the points of $X$ into $\ell$ “blocks” $B_0, B_1, \ldots, B_{\ell-1}$ of $n/\ell$ consecutive points, clockwise along the boundary of $\mathrm{conv}(X)$ (we ignore, without any real effect on the analysis, the rounding to integers). Construct a set of points $P = \{p_0, p_1, \ldots, p_{\ell-1}\}$, where each $p_j$ lies on the boundary of $\mathrm{conv}(X)$ between the last point of $B_{j-1}$ and the first point of $B_j$. (Indices are modulo $\ell$. See Figure 2.6(a).)

Consider a subset $X' \subset X$ of size at least $n/r$. $X'$ must contain $m = \ell/r$ points $q_0, q_1, \ldots, q_{m-1}$ lying on $m$ distinct blocks. Note that $m \ge 4$. Let $B_{j_k}$ be the block containing $q_k$; assume without loss of generality that $0 \le j_0 < j_1 < \cdots < j_{m-1} < \ell$. The blocks $B_{j_k}$ partition $P$ cyclically into $m$ nonempty intervals
\[ I_k = \{p_{j_k+1}, p_{j_k+2}, \ldots, p_{j_{k+1}}\}, \qquad \text{for } 0 \le k < m. \]
(Indices are modulo $\ell$ or modulo $m$ as appropriate.)

Figure 2.6: The case of planar point sets in convex position: (a) “Separator” points $p_j$ between consecutive blocks. (b) The intersection between two chords joining pairs of points from four different intervals falls inside $\mathrm{conv}(X')$.

Let $p_a, p_b, p_c, p_d \in P$ be four points belonging to four different intervals $I_k$, listed in cyclic order. Then the intersection between the segments $p_ap_c$ and $p_bp_d$ must lie inside $\mathrm{conv}(q_0, \ldots, q_{m-1}) \subseteq \mathrm{conv}(X')$. See Figure 2.6(b).⁴

Thus, it suffices to construct a set of quadruples of points of $P$, such that, no matter how $P$ is cyclically partitioned into $m$ intervals $I_0 I_1 \cdots I_{m-1}$, some quadruple will “stab” four different intervals. The set of chord-intersection points corresponding to these quadruples is our desired weak $\frac{1}{r}$-net.

We take point $p_0$ as the first point for all the quadruples; by construction, $p_0$ lies in the last interval $I_{m-1}$. Thus, it only remains to build a family $Z$ of triples of the form $(p_a, p_b, p_c)$, with $1 \le a < b < c < \ell$, such that some triple is guaranteed to fall on three distinct intervals among $I_0, \ldots, I_{m-2}$, in any given cyclic chain $I_0, \ldots, I_{m-1}$.

But this is isomorphic to the problem of stabbing all $(m-1)$-chains in $[1, \ell-1]$ with triples. Thus, there exists a family $Z$ of size at most $Z^{(3)}_{m-1}(\ell) = Z^{(3)}_{\ell/r-1}(\ell)$.

⁴ This basic idea, initially observed by Emo Welzl, already appears in [16].

Remark: Including point $p_0$ in all the quadruples entails a penalty of at most a factor of 2 in the number of quadruples. Indeed, given an optimal family $Z$ of quadruples that stab all cyclic partitions into $m$ intervals, we can replace each quadruple $q = (p_a, p_b, p_c, p_d) \in Z$, with $0 < a < b < c < d < \ell$, by the two quadruples $q^1 = (p_0, p_b, p_c, p_d)$, $q^2 = (p_0, p_a, p_b, p_c)$. If $q$ stabs four different intervals in such a partition, then one of $q^1, q^2$ must also do so.

Continuing the proof of Theorem 1.3, by Theorem 1.6 we have
\[ Z^{(3)}_{\ell/r-1}(\ell) = O\bigl(\ell\, \alpha_{\ell/(2r)-1}(\ell)\bigr). \tag{2.2} \]
We take $\ell = 2r(1 + \alpha(r))$, so $\ell/(2r) - 1 = \alpha(r)$. We claim that $\alpha_{\alpha(r)}(\ell) \le 4$ for all large enough $r$. Indeed, for all $k \ge 3$ and $r \ge 0$ we have $\alpha_k(r^2) \le 1 + \alpha_k(r)$. Thus, once $r$ is large enough, we have
\[ \alpha_{\alpha(r)}(\ell) = \alpha_{\alpha(r)}\bigl(2r(1 + \alpha(r))\bigr) \le \alpha_{\alpha(r)}(r^2) \le 1 + \alpha_{\alpha(r)}(r) \le 1 + 3 = 4, \]
since $\alpha_{\alpha(r)}(r) \le 3$ by definition. Hence, (2.2) becomes $O(r\alpha(r))$.
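The geometric fact behind Lemma 2.17 (two chords joining separators from four different intervals cross inside the convex hull of the interleaved points) can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: all points are placed on the unit circle, there are exactly four points $q_k$ and one separator per interval, and the function names are hypothetical.

```python
import math
import random


def point(deg):
    """Point on the unit circle at the given angle in degrees."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t))


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed non-parallel)."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])
    d2 = (p4[0] - p3[0], p4[1] - p3[1])
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    s = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return (p1[0] + s * d1[0], p1[1] + s * d1[1])


def in_convex_polygon(x, poly, eps=1e-9):
    """True if x lies in the convex polygon poly (counterclockwise order)."""
    n = len(poly)
    return all(cross(poly[i], poly[(i + 1) % n], x) >= -eps for i in range(n))


def chord_point_is_covered(seed):
    """Draw 8 distinct angles; even positions are the q's, odd positions are
    the separators p_a, p_b, p_c, p_d, so the two families interleave."""
    rng = random.Random(seed)
    degs = sorted(rng.sample(range(360), 8))
    qs = [point(a) for a in degs[0::2]]
    pa, pb, pc, pd = (point(a) for a in degs[1::2])
    x = line_intersection(pa, pc, pb, pd)
    return in_convex_polygon(x, qs)


if __name__ == "__main__":
    print(all(chord_point_is_covered(seed) for seed in range(200)))
```

In the actual reduction the separators lie on the boundary of $\mathrm{conv}(X)$ rather than on a circle; the circle is used here only to obtain points in convex position easily.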

2.2.2 Point sets on convex curves

Next we prove Theorem 1.4, concerning the size of a weak $\frac{1}{r}$-net for a set $X \subset \mathbb{R}^d$ lying on a curve $\gamma$ that is intersected at most $d$ times by every hyperplane. Recall that a curve with this property is called a convex curve.

The argument is similar to the one above, though now it has an extra step in which we apply Tverberg's theorem. Recall that Tverberg's theorem states that for all positive integers $s$ and $d$, if $A$ is a set of at least $(s-1)(d+1)+1$ points in $\mathbb{R}^d$, then $A$ can be partitioned into $s$ pairwise disjoint subsets $A_1, \ldots, A_s$ such that $\bigcap_{i=1}^{s} \mathrm{conv}(A_i) \neq \emptyset$ [33, p. 200].

If $A$ and $B$ are two finite sets of points along $\gamma$, we say that $A$ and $B$ are interleaving if between every two points of $A$ there is a point of $B$ along $\gamma$, and vice versa. In such a case, we must have $\bigl||A| - |B|\bigr| \le 1$.

Lemma 2.18. Let $s = \lceil (d+1)/2 \rceil$, and let $j = (s-1)(d+1)+1$. (Thus, $j = (d^2+d+2)/2$ for $d$ even, and $j = (d^2+1)/2$ for $d$ odd.) Let $A$ be a set of $j$ points along a convex curve $\gamma \subset \mathbb{R}^d$. Then there exists a point $x \in \mathrm{conv}(A)$ with the following property: For every point set $B \subset \gamma$ interleaving with $A$, with
\[ |B| = \begin{cases} j, & d \text{ even}, \\ j+1, & d \text{ odd}, \end{cases} \]
we have $x \in \mathrm{conv}(B)$.

Proof. By Tverberg's theorem, $A$ can be partitioned into $s$ pairwise disjoint subsets $A_1, \ldots, A_s$, whose convex hulls all contain some common point $x$. This point $x$ satisfies the assertion of the lemma, for if $x \notin \mathrm{conv}(B)$, then there would exist a hyperplane $h$ that separates $x$ from $B$. But there must be at least $s$ points of $A$ on the same side of $h$ as $x$ (at least one from each part $A_i$). By continuity, and since $A$ and $B$ are interleaving, it follows that the curve $\gamma$ must intersect $h$ at least $2s-1$ times if $d$ is even, or $2s$ times if $d$ is odd. In either case, this quantity equals $d+1$, contradicting the fact that no hyperplane intersects $\gamma$ more than $d$ times.⁵

⁵ The above argument is very similar to the one used by Matoušek and Wagner [35], applied to a different construction.

The reduction to stabbing interval chains with $j$-tuples is now straightforward:

Proof of Theorem 1.4.

Lemma 2.19. Let $X$ be a set of $n$ points along a convex curve $\gamma$, and let $r > 1$. Let
\[ j' = \begin{cases} (d^2+d)/2, & d \text{ even}; \\ (d^2+1)/2, & d \text{ odd}. \end{cases} \]
Then $X$ has a weak $\frac{1}{r}$-net of size at most $Z^{(j')}_{\ell/r-1}(\ell)$, where $\ell$ is a free parameter with $(j'+1)r \le \ell < n$.

Proof. Partition $X$ into $\ell$ blocks $B_0, B_1, \ldots, B_{\ell-1}$ of $n/\ell$ consecutive points. Construct a set of points $P = \{p_1, \ldots, p_{\ell-1}\} \subset \gamma$, where each $p_i$ lies between the last point of $B_{i-1}$ and the first point of $B_i$. Take also a point $p_\ell \in \gamma$ lying after $B_{\ell-1}$.

Consider a set $X' \subset X$ of size at least $n/r$. $X'$ must contain $m = \ell/r$ points $q_1, \ldots, q_m$ lying on $m$ different blocks $B_{i_1}, \ldots, B_{i_m}$. These points define on $P$ an $(m-1)$-chain $C = I_1 \cdots I_{m-1}$, where
\[ I_k = \{p_{i_k+1}, p_{i_k+2}, \ldots, p_{i_{k+1}}\}, \qquad \text{for } 1 \le k \le m-1. \]
Note that $m-1 \ge j'$. Construct an optimal family $Z'$ of $j'$-tuples of points in $P$ that stab all $(m-1)$-chains in $P$. Append the point $p_\ell$ to every $j'$-tuple in $Z'$, obtaining a family $Z$ of $(j'+1)$-tuples (actually, this is necessary only for $d$ even). We have $|Z| = Z^{(j')}_{m-1}(\ell-1)$.

There must exist some $p \in Z$ whose first $j'$ points stab the chain $C$. Thus, the $j'+1$ points of $p$ are interleaving with some $(j'+1)$-point subset of $\{q_1, \ldots, q_m\}$. By the choice of $j'$, Lemma 2.18 applies, so the point $x = x(p)$ guaranteed by the lemma lies in $\mathrm{conv}(X')$. Therefore, the set of all points $x(p)$, $p \in Z$, is our desired weak $\frac{1}{r}$-net.

To conclude the proof of Theorem 1.4, take $\ell = r\bigl(1 + P'_{j'}(\alpha(r))\bigr)$, with $P'_{j'}(m)$ as given in Theorem 1.7. Then, arguing as in the proof of Theorem 1.3,
\[ Z^{(j')}_{\ell/r-1}(\ell) = Z^{(j')}_{P'_{j'}(\alpha(r))}(\ell) \le c_{j'}\, \ell\, \alpha_{\alpha(r)}(\ell) \le 4c_{j'}\, \ell. \]

The claim follows.

Remark: Theorem 1.4 can be generalized to curves that are intersected at most $q$ times by every hyperplane, for some integer $q$. (We must have $q \ge d$, since we can always pass a hyperplane through $d$ given points.) In Lemma 2.18, we take instead $s = \lceil (q+1)/2 \rceil$, and we let $|B| = j$ for $q$ even and $|B| = j+1$ for $q$ odd. Lemma 2.19 is also modified accordingly. We obtain weak $\frac{1}{r}$-nets of size $r \cdot 2^{\mathrm{poly}(\alpha(r))}$ for point sets along such curves. (Note that the methods of Matoušek and Wagner [35] yield weak $\frac{1}{r}$-nets of size $O\bigl(r\, \mathrm{polylog}(r)\bigr)$ for these point sets.)

2.2.3 Point sets on convex curves: A lower bound

In this section we prove Theorem 1.5 by constructing, for every $d \ge 3$ and every $r > 1$, a convex curve $\gamma \subset \mathbb{R}^d$ and a point set $D_s \subset \gamma$, such that every weak $\frac{1}{r}$-net for $D_s$ must have size slightly superlinear in $r$, with the bound given in the theorem.

The set $D_s$ is the diagonal of the $d$-dimensional stretched grid $G_s$ introduced in Section 2.1. That is, with $G_s(n) = X_1 \times \cdots \times X_d$, where $X_i = \{x_{i1}, \ldots, x_{in}\}$, we set
\[ D_s(n) := \{(x_{1j}, \ldots, x_{dj}) : j = 1, 2, \ldots, n\}. \tag{2.3} \]
As before, $n$ must be chosen large enough in terms of $r$.

We start by showing that, if $G_s$ is defined appropriately, then $D_s$ lies on a convex curve. Indeed, if each element $x_{ij}$ of each $X_i$ in the definition of $G_s$ is chosen minimally, then we have $x_{1j} = 1 + j(2d-1)$, and for $i \ge 2$ we have $x_{ij} = K_i^{j-1}$. Then
\[ D_s = \bigl\{(2d + (2d-1)t,\ K_2^t, K_3^t, \ldots, K_d^t) : t = 0, 1, \ldots, n-1\bigr\} \]
is a subset of the curve
\[ \gamma = \bigl\{(2d + (2d-1)t,\ K_2^t, K_3^t, \ldots, K_d^t) : t \in \mathbb{R}\bigr\}. \tag{2.4} \]


Lemma 2.20. Let $\gamma \subset \mathbb{R}^d$ be a curve of the form
\[ \gamma = \bigl\{(c_0 + c_1 t,\ c_2^t, c_3^t, \ldots, c_d^t) : t \in \mathbb{R}\bigr\}, \]
for some constants $c_0, \ldots, c_d$ with $c_1 \neq 0$ and $c_2, \ldots, c_d > 0$ pairwise distinct. Then every hyperplane in $\mathbb{R}^d$ intersects $\gamma$ in at most $d$ points.

Proof. The claim follows almost directly from the following lemma:

Lemma 2.21. Let $f(t)$ be a function of the form
\[ f(t) = \alpha_1 c_1^t + \cdots + \alpha_d c_d^t + \alpha_{d+1}, \]
for $\alpha_1, \ldots, \alpha_{d+1}$ arbitrary, not all zero, and $c_1, \ldots, c_d$ positive and pairwise distinct. Then $f$ has at most $d$ zeros.

Proof. It suffices to show that $f'(t) = (\alpha_1 \ln c_1)c_1^t + \cdots + (\alpha_d \ln c_d)c_d^t$ has at most $d-1$ zeros. But this follows by taking $c_1^t$ as common factor and using induction.

Now, Lemma 2.20 is equivalent to showing that the function
\[ f(t) = \alpha_1(c_0 + c_1 t) + \alpha_2 c_2^t + \alpha_3 c_3^t + \cdots + \alpha_d c_d^t + \alpha_{d+1} \]
has at most $d$ zeros. But this follows by taking one derivative and applying Lemma 2.21 (with $d-1$ instead of $d$).

We now derive a lower bound for the size of a weak $\frac{1}{r}$-net for $D_s$.

Lemma 2.22. Given $r > 1$, let $N$ be a weak $\frac{1}{r}$-net for $D_s = D_s(n)$, for $n = n(r)$ large enough. Let $\ell = |N|$. Then $\ell$ must satisfy
\[ \ell \ge Z^{(d)}_{4d\ell/r}(\ell). \]

Proof. For each point $x \in N$ and each coordinate $1 \le j \le d$, mark as “bad” the two points of $D_s$ that surround $x$ when the points are projected into the $j$-th coordinate. Thus, at most $2d\ell$ points of $D_s$ are marked “bad”.

Partition $D_s$ into $4d\ell$ contiguous blocks of size $n/(4d\ell)$ each (we can safely ignore the rounding to integers if $n$ is large enough). Then there are $2d\ell$ blocks $B_1, \ldots, B_{2d\ell}$ which are “good”, in the sense that they do not contain any bad points. Place $2d\ell - 1$ abstract “separators” $Y_1, \ldots, Y_{2d\ell-1}$ along the curve $\gamma$ (given by (2.4)) between these blocks, such that $Y_i$ lies between $B_i$ and $B_{i+1}$.


Let $k = 4d\ell/r$. There is a natural one-to-one correspondence between sets $B$ of $k$ good blocks, and $(k-1)$-chains $B'$ on the separators. Namely, for every $i_1 < i_2 < \cdots < i_k$ we map
\[ B = \{B_{i_1}, \ldots, B_{i_k}\} \ \leftrightarrow\ B' = [Y_{i_1}, Y_{i_2-1}][Y_{i_2}, Y_{i_3-1}] \cdots [Y_{i_{k-1}}, Y_{i_k-1}], \]
where the notation $[Y_a, Y_b]$ means $\{Y_a, Y_{a+1}, \ldots, Y_b\}$.

Let $B = \{B_{i_1}, \ldots, B_{i_k}\}$ be an arbitrary such set. Let $D_s' = B_{i_1} \cup \cdots \cup B_{i_k} \subseteq D_s$. Since $|D_s'| = n/r$ and $N$ is a weak $\frac{1}{r}$-net for $D_s$, it follows that $\mathrm{conv}(D_s')$ must contain some point $x \in N$. By Carathéodory's theorem, $x$ is contained in the convex hull of some $d+1$ points of $D_s'$; let these points be $q_0, \ldots, q_d$ from left to right.

Recall that for each coordinate $1 \le j \le d$, the projection $x_j$ of $x$ into the $j$-th coordinate falls between two bad points of $D_s$. Therefore, $x_j$ does not lie in a good block, but rather between two adjacent good blocks $B_{a_j}, B_{a_j+1}$, which surround the separator $Y_{a_j}$. Thus, we can associate with $x$ the $d$-tuple of separators $x' = (Y_{a_1}, \ldots, Y_{a_d})$.

Furthermore, none of the points $q_0, \ldots, q_d$ are bad, and therefore they are far apart from $x$ in each coordinate. Therefore, Lemmas 2.1 and 2.2 apply, and so the $j$-th coordinate of $x$ must lie between the $j$-th coordinates of $q_{j-1}$ and $q_j$, for every $j = 1, 2, \ldots, d$.

It follows that $q_0, \ldots, q_d$ belong to $d+1$ distinct blocks $B_0', \ldots, B_d'$ of $B$, and furthermore, the relative order of these blocks and of the separators of $x'$ is
\[ B_0', Y_{a_1}, B_1', Y_{a_2}, \ldots, Y_{a_d}, B_d'. \]
In other words, the $d$-tuple $x'$ stabs the $(k-1)$-chain $B'$. Thus, $N$ must have enough points to stab all $(k-1)$-chains (and so all $k$-chains) with $d$-tuples in the range $[1, 2d\ell - 1] \supseteq [1, \ell]$. Therefore,
\[ \ell = |N| \ge Z^{(d)}_k(\ell) = Z^{(d)}_{4d\ell/r}(\ell). \]

Corollary 2.23. The quantity $\ell$ of Lemma 2.22 must satisfy
\[ \ell = \Omega\bigl(r \cdot Q_d'(\alpha(r) - 3)\bigr), \]
for the function $Q_d'$ of Theorem 1.7.

Note that Corollary 2.23 implies Theorem 1.5.


Proof. Suppose for a contradiction that $\ell \le \frac{1}{4d}\, r \cdot Q_d'(\alpha(r) - 3)$. Then $4d\ell/r \le Q_d'(\alpha(r) - 3)$, so by Lemma 2.22, Theorem 1.7, and Lemma 1.1,
\[ \ell \ge Z^{(d)}_{4d\ell/r}(\ell) \ge Z^{(d)}_{Q_d'(\alpha(r)-3)}(\ell) \ge c_d'\, \ell\, \alpha_{\alpha(r)-3}(\ell) - c_d''\, \ell \ge c_d'\, \ell\, \alpha_{\alpha(\ell)-3}(\ell) - c_d''\, \ell = \omega(\ell). \]
(We have $\ell \ge r$ since every weak $\frac{1}{r}$-net must trivially have at least $r$ points.) This is a contradiction for all large enough $\ell$, and so for all large enough $r$.

Finally, we show that the set $D_s$ does have a weak $\frac{1}{r}$-net of the size given in Theorem 1.5, up to lower order terms. For this, we use basically the same method as in the proof of Theorem 1.4, but without using Tverberg's theorem and instead working directly with $d$-tuples:

Lemma 2.24. The set $D_s \subset \mathbb{R}^d$ has a weak $\frac{1}{r}$-net of size
\[ \begin{cases} r \cdot 2^{(1/t!)\,\alpha(r)^t + O(\alpha(r)^{t-1})}, & d \text{ even}; \\ r \cdot 2^{(1/t!)\,\alpha(r)^t \log_2 \alpha(r) + O(\alpha(r)^t)}, & d \text{ odd}, \end{cases} \]
where $t = \lfloor (d-2)/2 \rfloor$.

Proof. As in the proof of Theorem 1.4, we let $\ell$ be a free parameter, and we define $\ell$ equal-sized blocks $B_1, B_2, \ldots, B_\ell$ of consecutive points of $D_s$. This time, however, we leave a pair of adjacent points $Y_i = \{y_i, y_i'\}$ between every two consecutive blocks $B_i, B_{i+1}$. We call each pair of points $Y_i$ a “separator”. We assume $\ell$ is much smaller than $n = |D_s|$, so the size of each block $B_i$ can be approximated by $n/\ell$.

Again consider a set $D_s' \subset D_s$ of size at least $n/r$. $D_s'$ must contain points belonging to a set $B$ of $k = \ell/r$ different blocks. These blocks define a $(k-1)$-chain $B'$ on the separators $Y_i$.

Let $Z$ be an optimal family of $d$-tuples of separators that stab all $(k-1)$-chains of separators. We have $|Z| = Z^{(d)}_{k-1}(\ell - 1)$. There must be a $d$-tuple $z = (Y_{a_1}, \ldots, Y_{a_d}) \in Z$ that stabs the $(k-1)$-chain $B'$.

Translate the $d$-tuple $z$ into a point $z' = (z_1', \ldots, z_d') \in \mathbb{R}^d$ such that, for each $1 \le i \le d$, the coordinate $z_i'$ lies between the $i$-th coordinates of the two points $y_{a_i}, y_{a_i}'$ that constitute $Y_{a_i}$. Then, $D_s'$ must contain points $q_0, \ldots, q_d$ of respective types $0, \ldots, d$ with respect to $z'$. Further, $z'$ is far from each of $q_0, \ldots, q_d$, so it follows from Lemmas 2.1 and 2.2 that $z' \in \mathrm{conv}(\{q_0, \ldots, q_d\}) \subseteq \mathrm{conv}(D_s')$.


Thus, the set $Z' \subset \mathbb{R}^d$ of all these points $z'$ for every $z \in Z$ is a weak $\frac{1}{r}$-net for $D_s$. Taking $\ell = r\bigl(1 + P_d'(\alpha(r))\bigr)$, with $P_d'$ as in Theorem 1.7, we get
\[ |Z'| \le Z^{(d)}_{\ell/r-1}(\ell) = Z^{(d)}_{P_d'(\alpha(r))}(\ell) \le c_d\, \ell\, \alpha_{\alpha(r)}(\ell) \le 4c_d\, \ell, \]
and the claim follows.

2.3 Conclusions

The gaps between the known lower and upper bounds for weak $\frac{1}{r}$-nets are still huge. The most significant gaps are: between $\Omega(r \log r)$ and $O(r^2)$ for the general planar case; between $\Omega\bigl(r \log^{d-1} r\bigr)$ and $O(r^d\, \mathrm{polylog}\, r)$ for the general case in $\mathbb{R}^d$; and (the not so huge gap) between $\Omega(r)$ and $O(r\alpha(r))$ for planar point sets in convex position.

We conjecture that the true bound for the case of planar sets in convex position is $\Theta(r)$, for the following reason: It is known (Chazelle et al. [16]) that if $X$ is the vertex set of a regular $n$-gon then it has a weak $\frac{1}{r}$-net of size $\Theta(r)$. On the other hand, we have shown that the diagonal of the stretched grid in the plane $D_s \subset \mathbb{R}^2$ also has a weak $\frac{1}{r}$-net of size $\Theta(r)$ (since the problem reduces to stabbing interval chains with pairs; see Lemma 1.8). These two cases are, in a sense, opposite extreme examples of point sets in convex position.


Chapter 3 Stabbing interval chains

In this chapter we derive our upper and lower bounds for $Z_k^{(j)}(n)$, the minimum number of $j$-tuples needed to stab all $k$-interval chains in $[1, n]$ (Theorems 1.6 and 1.7, and Lemma 1.8). In Section 3.1 we derive the upper bounds for all $j \ge 3$, and in Section 3.2 we derive the lower bounds. Finally, in Section 3.3 we deal with the simple case of $j = 2$ (Lemma 1.8).

3.1 Upper bounds

We now derive upper bounds on $Z_k^{(j)}(n)$. We will always take $j$ to be a constant, noting that the constants implicit in the asymptotic notations do depend on $j$ (though neither on $k$ nor on $n$).

We start with the easy case $k = j$, for which we have an exact bound.

Lemma 3.1. We have
\[ Z_j^{(j)}(n) = \binom{n - \lfloor j/2 \rfloor}{\lceil j/2 \rceil} = \Theta\bigl(n^{\lceil j/2 \rceil}\bigr) \]
for all $j \ge 2$.

Proof. Suppose first that $j$ is odd. Consider all $j$-chains of the form
\[ [a_1][a_1+1, a_2-1][a_2][a_2+1, a_3-1][a_3] \cdots [a_{(j+1)/2}], \]
where $1 \le a_i \le n$ and $a_i + 2 \le a_{i+1}$ for all $i$. There are $\binom{n-(j-1)/2}{(j+1)/2}$ such chains, each of which must be stabbed by a different $j$-tuple.

On the other hand, we can stab all $j$-chains by taking all $j$-tuples of the form
\[ (b_1, b_1+1, b_2, b_2+1, b_3, \ldots, b_{(j+1)/2}), \]


where $1 \le b_i \le n$ and $b_i + 2 \le b_{i+1}$ for all $i$. There are $\binom{n-(j-1)/2}{(j+1)/2}$ such $j$-tuples. Therefore, for $j$ odd we have $Z_j^{(j)}(n) = \binom{n-(j-1)/2}{(j+1)/2} = \binom{n - \lfloor j/2 \rfloor}{\lceil j/2 \rceil}$.

The case where $j$ is even is similar. For the lower bound, we consider all $j$-chains of the form
\[ [a_1][a_1+1, a_2-1][a_2] \cdots [a_{j/2}][a_{j/2}+1, n], \]
and, for the upper bound, we take all $j$-tuples of the form
\[ (b_1, b_1+1, \ldots, b_{j/2}, b_{j/2}+1). \]
We get $Z_j^{(j)}(n) = \binom{n-j/2}{j/2}$.

Once $k$ is large enough with respect to $j$, the number of $j$-tuples required to stab all $k$-chains becomes $O(n\, \mathrm{polylog}(n))$:

Lemma 3.2. For every fixed $j \ge 2$ we have¹
\[ Z^{(j)}_{2^{j-1}}(n) = O\bigl(n \log^{j-2} n\bigr). \]

Proof. By induction on $j$. The base case $j = 2$ is given by Lemma 3.1, so let $j \ge 3$, and put $k = 2^{j-1}$.

Divide the range $[1, n]$ into two blocks $B_1, B_2$, each of size at most $n/2$, leaving between them the element $y = \lceil n/2 \rceil$.

For each block $B_i$ we build an optimal family of $j$-tuples that stab all $k$-chains entirely contained in $B_i$. This requires at most $2Z_k^{(j)}(n/2)$ $j$-tuples in total.

It remains to stab those $k$-chains that contain the element $y$. Every such chain $C$ must have $k/2 = 2^{j-2}$ intervals entirely contained in either $B_1$ or $B_2$. Thus, it suffices to build on each $B_i$ an optimal family of $(j-1)$-tuples that stab all $k/2$-chains in $B_i$, and append the element $y$ to each $(j-1)$-tuple. The number of resulting $j$-tuples is at most $2Z^{(j-1)}_{k/2}(n/2)$, which is $O\bigl(n \log^{j-3} n\bigr)$ by the induction hypothesis.

We obtain the recurrence relation
\[ Z_k^{(j)}(n) \le 2Z_k^{(j)}\Bigl(\frac{n}{2}\Bigr) + O\bigl(n \log^{j-3} n\bigr), \]
which implies $Z_k^{(j)}(n) = O\bigl(n \log^{j-2} n\bigr)$.
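The upper-bound construction of Lemma 3.1 can be checked by brute force for small parameters. The sketch below assumes the definitions from earlier in the thesis, restated here as assumptions: a $k$-chain in $[1, n]$ is a sequence of $k$ consecutive, disjoint, nonempty intervals, and a $j$-tuple stabs a chain if its entries fall in pairwise different intervals of the chain. All function names are illustrative.

```python
from bisect import bisect_left
from itertools import combinations
from math import comb


def chains(n, k):
    """Yield every k-chain in [1, n] as (start, right_endpoints)."""
    for s in range(1, n + 1):
        for ends in combinations(range(s, n + 1), k):
            yield s, ends


def stabs(tup, chain):
    """True if the entries of tup lie in pairwise different intervals."""
    s, ends = chain
    hit = set()
    for p in tup:
        if p < s or p > ends[-1]:
            return False
        idx = bisect_left(ends, p)   # index of the interval containing p
        if idx in hit:
            return False
        hit.add(idx)
    return True


def lemma31_family(n, j):
    """j-tuples (b1, b1+1, b2, b2+1, ...) with b_i + 2 <= b_{i+1}; for odd j
    the last b stands alone, for even j every b is followed by b + 1."""
    m = (j + 1) // 2
    top = n if j % 2 else n - 1
    family = []
    for bs in combinations(range(1, top + 1), m):
        if all(bs[i] + 2 <= bs[i + 1] for i in range(m - 1)):
            tup = []
            for i, b in enumerate(bs):
                tup.append(b)
                if j % 2 == 0 or i < m - 1:
                    tup.append(b + 1)
            family.append(tuple(tup))
    return family


if __name__ == "__main__":
    for j in (2, 3, 4):
        for n in range(j, 8):
            fam = lemma31_family(n, j)
            ok = all(any(stabs(t, c) for t in fam) for c in chains(n, j))
            print(j, n, len(fam), comb(n - j // 2, (j + 1) // 2), ok)
```

This only verifies the upper bound (the family stabs every $j$-chain and has the stated size); the matching lower bound is the counting argument in the proof.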

We now derive upper bounds for $Z_k^{(j)}(n)$ for all $k$. We first tackle the simpler case $j = 3$, and then we address the general case $j \ge 4$.

Our derivations below (and of the lower bounds in Section 3.2) follow a recurring pattern: We first derive a recurrence relation for $Z_k^{(j)}(n)$, and then we apply it with appropriately chosen parameters.

¹ A more careful analysis shows that the constant of proportionality actually decreases exponentially with $j$.


Figure 3.1: Range [1, n] partitioned into blocks and separators.

Figure 3.2: A k-chain C must satisfy exactly one of these properties: Either C is contained within a block (a); or every interval of C, except possibly the first and last, contains a separator (b); or some interval of C, besides the first and last, falls entirely within a block, and another interval contains an adjacent separator (c).

3.1.1 Upper bounds for triples

We have already established that $Z_3^{(3)}(n) = \binom{n-1}{2}$ (Lemma 3.1) and $Z_4^{(3)}(n) = O(n \log n)$ (Lemma 3.2). Our bounds for stabbing $k$-chains with triples, $k \ge 5$, are based on the following recurrence relation.

Recurrence 3.3. Let $t$ be an integer parameter, with $1 \le t \le \sqrt{n/2} - 1$. Then,
\[ Z_k^{(3)}(n) \le \frac{n}{t}\, Z_k^{(3)}(t) + Z_{k-2}^{(3)}\Bigl(\frac{n}{t}\Bigr) + 2n. \]

Proof. Partition the range $[1, n]$ into blocks $B_1, B_2, \ldots, B_b$ of size $t$ (except for the last block, which might be smaller), leaving between each pair of adjacent blocks, as well as before the first block and after the last one, a single “separator” element. Let the set of separators be $Y = \{y_0, \ldots, y_b\}$, such that block $B_i$ lies between separators $y_{i-1}$ and $y_i$ (see Figure 3.1). The number of blocks is $b = \bigl\lceil \frac{n-1}{t+1} \bigr\rceil$. We have $b \le n/t - 1$, since $n \ge 2(t+1)^2 \ge 2t^2 + t$.

Consider an arbitrary $k$-chain $C = I_1 \cdots I_k$. $C$ must satisfy exactly one of the following properties (see Figure 3.2):


1. $C$ is entirely contained within a block $B_i$.
2. Every interval of $C$, except possibly the first and the last, contains a separator.
3. Some interval $I_j$ of $C$, $2 \le j \le k-1$, falls entirely within a block $B_i$, but not all of $C$ is contained in the block. Thus, some other interval of $C$ contains either $y_{i-1}$ or $y_i$.

We can take care of the first case by constructing within each block $B_i$ an optimal family of triples that stab all $k$-chains. This requires at most $bZ_k^{(3)}(t) \le (n/t)Z_k^{(3)}(t)$ triples.

The second case is handled by constructing on the separators $Y$ an optimal family of triples that stab all $(k-2)$-chains. This requires at most $Z_{k-2}^{(3)}(b+1) \le Z_{k-2}^{(3)}(n/t)$ triples.

Finally, the third case is handled by taking all triples of the forms
\[ (a, a+1, y_i) \ \text{ for } y_{i-1} \le a \le y_i - 2, \qquad (y_{i-1}, a, a+1) \ \text{ for } y_{i-1} < a \le y_i - 1, \]
for all $y_i$. There are at most $2n$ such triples.

Lemma 3.4. We have $Z_5^{(3)}(n) = O(n \log\log n)$.

Proof. Apply Recurrence 3.3 with $k = 5$ and $t = \sqrt{n/3}$, and use Lemma 3.1.

Lemma 3.5. There exists an absolute constant $c$ such that, for every $k \ge 6$, we have
\[ Z_k^{(3)}(n) \le c\,n\,\alpha_{\lfloor k/2 \rfloor}(n) \quad \text{for all } n. \]

Proof. Here it is convenient to work with a slight variant of the inverse Ackermann function. Let $n_0 = 2000$. For this proof, let $\hat\alpha_m(x)$, $m \ge 2$, be given by $\hat\alpha_2(x) = \alpha_2(x) = \lceil \log_2 x \rceil$, and, for $m \ge 3$, by the recurrence
\[ \hat\alpha_m(x) = \begin{cases} 1, & \text{if } x \le n_0; \\ 1 + \hat\alpha_m\bigl(2\hat\alpha_{m-1}(x)\bigr), & \text{otherwise.} \end{cases} \]
There exists a constant $c_0$ such that $\hat\alpha_m(x) - \alpha_m(x) \le c_0$ for all $m$ and $x$ (see Appendix A).

Let $k \ge 4$, and let $m = \lfloor k/2 \rfloor$. We prove, by induction on $k$, that
\[ Z_k^{(3)}(n) \le c_1 n\,\hat\alpha_m(n) \quad \text{for all } n, \]
for some absolute constant $c_1$. The base cases of the induction are $Z_4^{(3)}(n), Z_5^{(3)}(n) = O(n \log n)$, by Lemmas 3.2 and 3.4, respectively. Without loss of generality, assume that $c_1 \ge 4$ and that $c_1 \ge Z_4^{(3)}(n)/n$ for all $n \le n_0$.

Let now $k \ge 6$, and assume that the bound holds for $k - 2$. To establish the bound for $k$, assume first that $n \le n_0$. Then, we have
\[ Z_k^{(3)}(n) \le Z_4^{(3)}(n) \le c_1 n = c_1 n\,\hat\alpha_m(n). \]
Thus, let $n > n_0$. We apply Recurrence 3.3 with $t = 2\hat\alpha_{m-1}(n)$. (Note that $t \le \sqrt{n/2} - 1$ for $n > n_0$.) Letting $Z_k^{(3)}(n) = n\,g(n)$, and using the fact that $c_1 \ge 4$, we obtain
\[ g(n) \le g(t) + \frac{c_1}{t}\,\hat\alpha_{m-1}\!\left(\frac{n}{t}\right) + 2 \le g(t) + \frac{c_1}{t}\,\hat\alpha_{m-1}(n) + 2 = g(t) + \frac{c_1}{2} + 2 \le g(t) + c_1. \]
Since $\hat\alpha_m(t) = \hat\alpha_m(n) - 1$, it follows by induction on $n$ (with base case $n \le n_0$) that $g(n) \le c_1\hat\alpha_m(n)$ for all $n$. Therefore,
\[ Z_k^{(3)}(n) \le c_1 n\,\hat\alpha_m(n) \quad \text{for all } n. \]
This proves the upper bounds of Theorem 1.6.

Remark: Had we not been careful to add the factor 2 in the definition of $\hat\alpha_m(x)$ and in the choice of $t$, we would have got a weaker bound of $Z_k^{(3)}(n) = O\bigl(nk\,\alpha_{\lfloor k/2 \rfloor}(n)\bigr)$.
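The variant $\hat\alpha_m$ used in the proof of Lemma 3.5 is easy to evaluate directly from its definition. The following is a minimal sketch, assuming integer arguments $x \ge 1$; the names `alpha_hat`, `ceil_log2`, and `N0` are hypothetical and chosen only for this illustration.

```python
N0 = 2000  # the threshold n_0 from the proof of Lemma 3.5


def ceil_log2(x):
    """ceil(log2(x)) for an integer x >= 1, computed exactly."""
    return (x - 1).bit_length()


def alpha_hat(m, x):
    """Variant inverse Ackermann function of the proof (m >= 2, x >= 1)."""
    if m == 2:
        return ceil_log2(x)
    if x <= N0:
        return 1
    # One level of recursion replaces x by t = 2 * alpha_hat(m - 1, x).
    return 1 + alpha_hat(m, 2 * alpha_hat(m - 1, x))


if __name__ == "__main__":
    for n in (2001, 10**6, 2**3000):
        t = 2 * alpha_hat(2, n)
        print(n.bit_length(), t, alpha_hat(3, n), alpha_hat(3, t))
```

By construction, the choice $t = 2\hat\alpha_{m-1}(n)$ lowers $\hat\alpha_m$ by exactly one for $n > n_0$, which is the property the induction on $n$ relies on.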

3.1.2 Upper bounds for j-tuples

We now extend our techniques of the previous section and derive upper bounds for $Z_k^{(j)}(n)$, the minimum number of $j$-tuples needed to stab all $k$-chains in $[1, n]$, for $j \ge 4$. Our bounds are based on the following recurrence relation.

Recurrence 3.6. Let $j \ge 4$ be fixed. Let $t$ be a parameter, $1 \le t \le \sqrt{n/2} - 1$, and let $k_1, k_2, k_3$ be integers. Put $k = 2k_1 + k_2(k_3 - 2)$. Then,
\[ Z_k^{(j)}(n) \le \frac{n}{t}\Bigl( Z_k^{(j)}(t) + 2Z_{k_1}^{(j-1)}(t) + Z_{k_2}^{(j-2)}(t) \Bigr) + Z_{k_3}^{(j)}\!\left(\frac{n}{t}\right). \]

Figure 3.3: A chain which violates all three properties, like the one shown, can have at most $k - 1$ intervals.

Proof. As before, partition the range $[1, n]$ into blocks $B_1, \ldots, B_b$ of size $t$ (except for the last block, which might be smaller), such that each block $B_i$ is surrounded by separator elements $y_{i-1}, y_i$. Denote the set of separators by $Y = \{y_0, \ldots, y_b\}$. Again, since $t \le \sqrt{n/2} - 1$, we have $b \le n/t - 1$. Let $k_1, k_2, k_3$ be given, and put $k = 2k_1 + k_2(k_3 - 2)$. Then, every $k$-chain $C = I_1 \cdots I_k$ satisfies at least one of the following properties:

1. $C$ is entirely contained within a block $B_i$.
2. The first $k_1$ intervals of $C$, or the last $k_1$ intervals of $C$, fall entirely within a block $B_i$, and some other interval of $C$ contains the separator $y_i$ or $y_{i-1}$, respectively.
3. Some $k_2$ consecutive intervals of $C$ fall within a block $B_i$, and two other intervals contain the separators $y_{i-1}$ and $y_i$.
4. At least $k_3$ distinct intervals of $C$ contain separators.

Indeed, the largest number of intervals for which a chain might possibly violate all the above properties is
\[ (k_3 - 1) + (k_3 - 2)(k_2 - 1) + 2(k_1 - 1) = k - 1. \]
(See Figure 3.3.) Since we have a $k$-chain, one of the above properties must hold. Thus, we can stab all $k$-chains by building the following family of $j$-tuples. Within each block $B_i$ we build

• an optimal family of $j$-tuples that stab all $k$-chains;
• an optimal family of $(j-1)$-tuples that stab all $k_1$-chains, where each of these tuples is extended into a $j$-tuple in two ways, by appending either of the surrounding separators $y_{i-1}$, $y_i$;
• an optimal family of $(j-2)$-tuples that stab all $k_2$-chains, where each of these tuples is extended into a $j$-tuple by appending both separators $y_{i-1}$, $y_i$.

In addition, we construct on the set of separators $Y$ an optimal family of $j$-tuples that stab all $k_3$-chains. Every $k$-chain $C$ must be stabbed by some $j$-tuple in the union of all these subfamilies. The claimed recurrence relation follows.

Define integer-valued functions $P_j(m)$, $j, m \ge 2$, by
\[ P_2(m) = 2; \qquad P_3(m) = 2m; \]
\[ P_j(m) = \begin{cases} 2^{j-1}, & m = 2; \\ 2P_{j-1}(m) + P_{j-2}(m)\bigl(P_j(m-1) - 2\bigr), & m \ge 3; \end{cases} \qquad \text{for } j \ge 4. \]

Table 3.1: Values of $P_j(m)$ for small $j$ and $m$.

    m        |  2    3      4        5       6        7
    P_2(m)   |  2    2      2        2       2        2
    P_3(m)   |  4    6      8       10      12       14
    P_4(m)   |  8   24     60      136     292      608
    P_5(m)   | 16  132   1160    11852  142784  2000164
    P_6(m)   | 32  984  61240  8352072     ···
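The recurrence defining $P_j(m)$ can be evaluated mechanically, which gives a quick check on the values in Table 3.1 and on the closed form $P_4(m) = 5 \cdot 2^m - 4m - 4$. The sketch below is illustrative only; the function name `P` and the use of memoization are choices made here, not part of the original text.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def P(j, m):
    """P_j(m) for j, m >= 2, following the recursive definition."""
    if j == 2:
        return 2
    if j == 3:
        return 2 * m
    if m == 2:
        return 2 ** (j - 1)
    return 2 * P(j - 1, m) + P(j - 2, m) * (P(j, m - 1) - 2)


if __name__ == "__main__":
    for j in range(2, 7):
        print(f"P_{j}(m):", [P(j, m) for m in range(2, 8)])
```

With $k_1 = P_{j-1}(m)$, $k_2 = P_{j-2}(m)$ and $k_3 = P_j(m-1)$, the recursive case is exactly the relation $k = 2k_1 + k_2(k_3 - 2)$ of Recurrence 3.6.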

See Table 3.1. We can give an explicit formula for $P_4(m)$:
\[ P_4(m) = 5 \cdot 2^m - 4m - 4. \]
For general $j$ we can estimate $P_j(m)$ asymptotically:

Lemma 3.7. Let $j \ge 3$ be fixed, and let $t = \lfloor (j-2)/2 \rfloor$. Then $P_j(m)$ has upper and lower bounds of the form
\[ P_j(m) = \begin{cases} 2^{(1/t!)\,m^t \pm O(m^{t-1})}, & \text{for } j \text{ even}; \\ 2^{(1/t!)\,m^t \log_2 m \pm O(m^t)}, & \text{for } j \text{ odd}. \end{cases} \]
For the proof see Appendix B.

Lemma 3.8. Let $j \ge 2$ be fixed. Then, there exists a constant $c = c(j)$ such that, for every $m \ge 2$, we have
\[ Z_{P_j(m)}^{(j)}(n) \le c\,n\,\alpha_m(n)^{j-2}, \quad \text{for all } n. \tag{3.1} \]

Proof. We proceed along the lines of the proof of Lemma 3.5, except that now we also use induction on $j$. The case $j = 3$ was proven already (Lemmas 3.2 and 3.5), so let $j \ge 4$ be fixed.

We again work with a slight variant of the inverse Ackermann function. Let $n_0 = j^{4j}$. For this proof, let $\hat\alpha_m(x)$, $m \ge 2$, be given by $\hat\alpha_2(x) = \alpha_2(x) = \lceil \log_2 x \rceil$, and for $m \ge 3$ by the recurrence
\[ \hat\alpha_m(x) = \begin{cases} 1, & \text{if } x \le n_0; \\ 1 + \hat\alpha_m\bigl(4\hat\alpha_{m-1}(x)^{j-2}\bigr), & \text{otherwise.} \end{cases} \]
Again, there exists a constant $c_0$ (depending only on $j$) such that $\hat\alpha_k(x) - \alpha_k(x) \le c_0$ for all $k$ and $x$ (see Appendix A).^2

We will show, by induction on $m$, that there exists a constant $c_1$ (depending only on $j$) such that
\[ Z_{P_j(m)}^{(j)}(n) \le c_1 n\bigl(2\hat\alpha_m(n)^{j-2} + \hat\alpha_m(n)^{j-3} + \hat\alpha_m(n)\bigr) \tag{3.2} \]
for all $m \ge 2$ and all $n$. This is easily seen to imply the claim.

The base case $m = 2$ is given by Lemma 3.2, so assume $c_1$ is large enough that (3.2) holds for $m = 2$. Assume further that
\[ c_1 \ge Z_{P_j(3)}^{(j)}(n)/(4n), \quad \text{for } n \le n_0. \tag{3.3} \]
By induction on $j$, we know there exist constants $c_2, c_3$ (depending on $j$), such that
\[ Z_{P_{j-1}(m)}^{(j-1)}(n) \le c_2 n\,\hat\alpha_m(n)^{j-3}, \qquad Z_{P_{j-2}(m)}^{(j-2)}(n) \le c_3 n\,\hat\alpha_m(n)^{j-4}, \]
for all $m \ge 3$ and all $n$.^3 Without loss of generality, assume $c_1 \ge c_2, c_3$.

Now, let $m \ge 3$, and suppose (3.2) holds for $m - 1$. To establish (3.2) for $m$, assume first that $n \le n_0$. Then, by (3.3), we have
\[ Z_{P_j(m)}^{(j)}(n) \le Z_{P_j(3)}^{(j)}(n) \le 4c_1 n = c_1 n\bigl(2\hat\alpha_m(n)^{j-2} + \hat\alpha_m(n)^{j-3} + \hat\alpha_m(n)\bigr). \]
Thus, let $n > n_0$. Apply Recurrence 3.6 with the following parameters:
\[ k_1 = P_{j-1}(m), \quad k_2 = P_{j-2}(m), \quad k_3 = P_j(m-1), \quad k = P_j(m), \quad t = 4\hat\alpha_{m-1}(n)^{j-2}. \]

^2 Note that $\hat\alpha_m(x)$ depends on $j$, so strictly speaking we should write $\hat\alpha_{j,m}(x)$. However, this dependence is very minor, since $|\hat\alpha_{j,m}(x) - \hat\alpha_{j',m}(x)|$ is bounded by some constant $c_{j,j'}$.

^3 The multiplicative constants $c_2, c_3$ “pay” for the conversion from $\hat\alpha_{j-1,m}(n)$, $\hat\alpha_{j-2,m}(n)$ to $\hat\alpha_{j,m}(n)$, respectively.

(By our choice of $n_0$, we have $t \le \sqrt{n/2} - 1$ for $n > n_0$.) Using $t \le n$ and $n/t \le n$, we have
\[ 2Z_{k_1}^{(j-1)}(t) \le 2c_1 t\,\hat\alpha_m(n)^{j-3}; \qquad Z_{k_2}^{(j-2)}(t) \le c_1 t\,\hat\alpha_m(n)^{j-4}; \]
\[ Z_{k_3}^{(j)}\!\left(\frac{n}{t}\right) \le \frac{c_1 n}{t}\bigl(2\hat\alpha_{m-1}(n)^{j-2} + \hat\alpha_{m-1}(n)^{j-3} + \hat\alpha_{m-1}(n)\bigr) = \frac{c_1 n}{4}\bigl(2 + \hat\alpha_{m-1}(n)^{-1} + \hat\alpha_{m-1}(n)^{-j+3}\bigr) \le c_1 n. \]
Plugging these expressions into Recurrence 3.6 and letting $Z_k^{(j)}(n) = n\,g(n)$, we get
\[ g(n) \le g(t) + 2c_1\hat\alpha_m(n)^{j-3} + c_1\hat\alpha_m(n)^{j-4} + c_1. \]
Since $\hat\alpha_m(t) = \hat\alpha_m(n) - 1$, it follows by induction on $n$ that
\[ g(n) \le c_1\bigl(2\hat\alpha_m(n)^{j-2} + \hat\alpha_m(n)^{j-3} + \hat\alpha_m(n)\bigr). \]
(The base case $n \le n_0$ follows from (3.3), and for the induction on $n$ we apply $\bigl(\hat\alpha_m(n) - 1\bigr)^{j-x} \le \bigl(\hat\alpha_m(n) - 1\bigr)\hat\alpha_m(n)^{j-x-1}$ for $x = 2, 3$.) Thus,
\[ Z_{P_j(m)}^{(j)}(n) \le c_1 n\bigl(2\hat\alpha_m(n)^{j-2} + \hat\alpha_m(n)^{j-3} + \hat\alpha_m(n)\bigr), \]
as claimed.

Let $P'_j(m) = P_j(m+1)$ for $j \ge 4$, $m \ge 2$. Clearly, $P'_j(m)$ satisfies (1.7). There exists a constant $c'$, depending only on $j$, such that $\alpha_{m+1}(n)^{j-2} \le c'\alpha_m(n)$ for all $m$ and $n$. Therefore,
\[ Z_{P'_j(m)}^{(j)}(n) \le c'' n\,\alpha_m(n) \quad \text{for all } n, \]
for some constant $c'' = c''(j)$. This proves the upper bounds of Theorem 1.7.
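The induction step on $n$ in the proof of Lemma 3.8 can be spelled out as a short worked computation. In this sketch the abbreviation $a = \hat\alpha_m(n)$ is introduced only for readability (it does not appear in the original); the computation uses $\hat\alpha_m(t) = a - 1$, the induction hypothesis for $g(t)$, and the inequality $(a-1)^{j-x} \le (a-1)a^{j-x-1} = a^{j-x} - a^{j-x-1}$ for $x = 2, 3$.

```latex
\begin{align*}
g(n) &\le g(t) + 2c_1 a^{j-3} + c_1 a^{j-4} + c_1 \\
     &\le c_1\bigl(2(a-1)^{j-2} + (a-1)^{j-3} + (a-1)\bigr)
          + 2c_1 a^{j-3} + c_1 a^{j-4} + c_1 \\
     &\le c_1\bigl(2(a^{j-2} - a^{j-3}) + (a^{j-3} - a^{j-4}) + a - 1\bigr)
          + 2c_1 a^{j-3} + c_1 a^{j-4} + c_1 \\
     &= c_1\bigl(2a^{j-2} + a^{j-3} + a\bigr).
\end{align*}
```

All lower-order terms cancel exactly, which is why the bound (3.2) carries the three terms $2\hat\alpha_m(n)^{j-2}$, $\hat\alpha_m(n)^{j-3}$ and $\hat\alpha_m(n)$.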

3.2 Lower bounds

We now derive asymptotic lower bounds for $Z_k^{(j)}(n)$. As before, we take $j$ to be fixed, recalling that the implicit constants do depend on $j$.

As a warm-up, we first derive lower bounds of the form $Z_k^{(j)}(n) = \Omega(n \log n)$ for appropriate $k$, for each $j \ge 3$. (We do not use these bounds in our later arguments, but we are interested in the case $j = 3$, since it will yield $Z_4^{(3)}(n) = \Theta(n \log n)$.)

Figure 3.4: Blocks and contracted blocks defined on the range $[1, n]$.

Lemma 3.9. For every fixed $j \ge 3$ we have
\[ Z_{(j-1)^2}^{(j)}(n) = \Omega(n \log n), \]
where the constant of proportionality depends on $j$.

Proof. Let $t = \lceil n/j \rceil$. We define on the range $[1, n]$ a sequence of $j$ blocks of size $t$, in which every two consecutive blocks overlap at exactly one element. For this, let $y_i = 1 + i(t-1)$ for $0 \le i \le j$. Note that $y_0 = 1$ and $y_j \le n$. Then let $B_i = [y_{i-1}, y_i]$, for $1 \le i \le j$. We also define “contracted blocks” that do not contain the elements $y_i$:
\[ B'_i = [y_{i-1} + 1,\; y_i - 1], \quad \text{for } 1 \le i \le j. \]
(See Figure 3.4.) We have $|B'_i| = t - 2$ for all $i$.

Let $k = (j-1)^2$, and let $Z$ be a family of $j$-tuples that stab all $k$-chains in $[1, n]$. $Z$ must contain families $Z_1, \ldots, Z_j$ of “local” $j$-tuples that stab all the $k$-chains that are fully contained in $B_1, \ldots, B_j$, respectively. Further, these local families must be disjoint, since every two blocks overlap on at most one element. Thus,
\[ |Z_1 \cup \cdots \cup Z_j| \ge j\,Z_k^{(j)}(t) \ge j\,Z_k^{(j)}\!\left(\frac{n}{j}\right). \]
Now, consider the “global” $j$-tuples of $Z$, namely those that are not contained in any block $B_i$. Consider the elements of the contracted blocks $B'_i$ that are not contained in any global $j$-tuple. Call these elements “unused”.

Suppose each of the blocks $B'_1$, $B'_j$ contains a run of $j - 2$ consecutive unused elements, and each of the intermediate blocks $B'_2, \ldots, B'_{j-1}$ contains a run of $j - 3$ consecutive unused elements. Construct an interval chain $C$ that has these $j^2 - 3j + 2$ unused elements as singleton intervals, plus $j - 1$ “long” intervals between the runs of singletons. (If $j = 3$ then the two long intervals meet at an arbitrary place in $B'_2$.) Note that each long interval is nonempty, since it contains an element $y_i$.

The chain $C$ has $j^2 - 2j + 1 = k$ intervals, but it cannot be stabbed by any $j$-tuple in $Z$: It cannot be stabbed by a local $j$-tuple, since each block $B_i$ contains at most $j - 1$ intervals or parts thereof; and it cannot be stabbed by a global $j$-tuple, since the global $j$-tuples can only stab the long intervals, and there are only $j - 1$ long intervals.

Therefore, there cannot exist such runs of unused elements. Hence, at the very least, there must exist some $B'_i$ in which every $(j-2)$-nd element is “used” by some global $j$-tuple. Since $|B'_i| = \Omega(n)$, there are $\Omega(n)$ such “used” elements in $B'_i$, but each $j$-tuple “uses” only $j$ elements. Therefore, there are $\Omega(n)$ global $j$-tuples, and we obtain the following recurrence relation:
\[ Z_k^{(j)}(n) \ge j\,Z_k^{(j)}\!\left(\frac{n}{j}\right) + \Omega(n). \]
Thus, $Z_k^{(j)}(n) = \Omega(n \log n)$.

We now derive lower bounds for $Z_k^{(j)}(n)$ for all $k$. As in the case of the upper bounds, we first deal with $j = 3$, and then with $j \ge 4$.
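The block layout from the proof of Lemma 3.9 can be written out concretely. The following sketch is only illustrative, with hypothetical function names: it builds the separators $y_i = 1 + i(t-1)$, the overlapping blocks $B_i$, and the contracted blocks $B'_i$, and it counts the intervals of the chain $C$ constructed in the proof.

```python
def overlapping_blocks(n, j):
    """Blocks B_i = [y_{i-1}, y_i] and contracted blocks B'_i for t = ceil(n/j)."""
    t = -(-n // j)                                   # t = ceil(n / j)
    ys = [1 + i * (t - 1) for i in range(j + 1)]     # y_0, ..., y_j
    blocks = [(ys[i - 1], ys[i]) for i in range(1, j + 1)]
    contracted = [(lo + 1, hi - 1) for lo, hi in blocks]
    return t, ys, blocks, contracted


def hard_chain_length(j):
    """Number of intervals of the chain C built from runs of unused elements."""
    singletons = 2 * (j - 2) + (j - 2) * (j - 3)     # runs in B'_1, B'_j and in between
    long_intervals = j - 1
    return singletons + long_intervals


if __name__ == "__main__":
    t, ys, blocks, contracted = overlapping_blocks(100, 4)
    print(t, ys, blocks, contracted, hard_chain_length(4))
```

Consecutive blocks share exactly the element $y_i$, and the chain length equals $(j-1)^2$, matching the value of $k$ chosen in the lemma.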

3.2.1 Lower bounds for triples

Our asymptotically tight lower bounds for triples are based on the following recurrence relation.

Recurrence 3.10. Let $t$ be an integer parameter, with $3 \le t \le \sqrt{n}$. Then,
\[ Z_{k+2}^{(3)}(n) \ge \frac{n}{t}\,Z_{k+2}^{(3)}(t) + \min\left\{ \frac{n}{18},\; Z_k^{(3)}\!\left(\frac{n}{3t}\right) \right\}, \]
for all $n \ge 36$.

Proof. Let $b = \lceil n/t \rceil$. We define on the range $[1, n]$ a sequence of $b$ blocks of size $t$, in which every two consecutive blocks overlap at exactly one element:^4 Let $y_i = 1 + i(t-1)$ for $0 \le i \le b$. Note that $y_0 = 1$; and it can be checked that $y_b \le n$, since $n \ge t^2$. Then let $B_i = [y_{i-1}, y_i]$, for $1 \le i \le b$. As before, we also let $B'_i = [y_{i-1} + 1, y_i - 1]$, for $1 \le i \le b$ (refer again to Figure 3.4). Then, $|B_i| = t$ and $|B'_i| = t - 2$ for all $i$.

^4 In the proof of Lemma 3.9 above and Recurrence 3.13 below we divide $[1, n]$ into $j$ blocks of size $\lceil n/j \rceil$, while here and in the proof of Recurrence 3.15 below we divide it into $\lceil n/t \rceil$ blocks of size $t$. For the sake of clarity we opted to always let the block size be $t$. We apologize for any confusion that this might have caused.

Figure 3.5: The $m$ unused elements $x_1, \ldots, x_m$, from $m$ distinct blocks, define $m - 1$ nonempty “links” $L_1, \ldots, L_{m-1}$.

Let $Z$ be a family of triples that stab all $(k+2)$-chains in $[1, n]$. As before, $Z$ must contain $b$ disjoint families of “local” triples that stab all chains in each block $B_i$. The total size of these families is at least $b\,Z_{k+2}^{(3)}(t) \ge (n/t)\,Z_{k+2}^{(3)}(t)$.

Now consider the “global” triples of $Z$, namely those that are not contained in any block $B_i$. As before, consider the elements of the contracted blocks $B'_i$ that are not contained in any global triple, and call them “unused”.

Suppose that at most half the blocks $B'_i$ contain unused elements. Then there must be $\Omega(n)$ global triples. More precisely, the number of global triples must be at least
\[ \frac{1}{3} \cdot \frac{b}{2} \cdot (t - 2) \ge \frac{n}{6}\left(1 - \frac{2}{t}\right) \ge \frac{n}{18}, \]
since $t \ge 3$. In this case we are done.

Thus, suppose that at least half the blocks $B'_i$ contain unused elements. Let $x_1, \ldots, x_m$ be $m$ unused elements from $m$ distinct blocks, with $m \ge b/2$. These elements define a sequence of $m - 1$ intervals $L_i = [x_i + 1, x_{i+1} - 1]$ for $1 \le i \le m - 1$, which we call “links” (see Figure 3.5). Each link $L_i$ contains at least one element $y_{i'}$, so the links are nonempty.

Consider a $k$-chain $C' = I'_1 \cdots I'_k$ on the links, where $I'_i = [L_{a_i}, L_{a_{i+1}-1}]$ for some integers $a_i$, $1 \le i \le k + 1$. We can translate $C'$ into a $(k+2)$-chain $C = I_0 I_1 \cdots I_{k+1}$ on $[1, n]$, as follows: We make the unused elements right before $I'_1$ and after $I'_k$ into singleton intervals, and we append each intermediate unused element to the link at its right. Then we fuse the links in each $I'_i$ into one interval. See Figure 3.6(a,b).

This chain $C$ cannot be stabbed by any local triple, since each block $B_i$ contains parts of at most two intervals of $C$. Thus, $C$ must be stabbed by a global triple $\tau$.
Since $\tau$ does not contain any unused elements, it cannot stab the singleton intervals $I_0$ or $I_{k+1}$. Therefore, $\tau$ must stab three links on three different intervals among $I_1, \ldots, I_k$. Thus, we can translate $\tau$ back into a triple of links $\tau'$ that stabs $C'$. See Figure 3.6(c).

Figure 3.6: Every $k$-chain $C'$ on the links (a) can be translated into a $(k+2)$-chain $C$ on $[1, n]$ (b). A global triple (marked by x’s) must stab $C$ on three distinct links. We can translate this triple back into a triple of links that stabs $C'$ (c).

Hence, we have enough triples of links $\tau'$ to stab all $k$-chains on the $m - 1$ links. The number of original global triples $\tau$ must be at least as large. Thus, there are at least $Z_k^{(3)}(m - 1)$ global triples. Finally, note that $m - 1 \ge n/(3t)$, since $n \ge 6\sqrt{n}$ for $n \ge 36$.

Lemma 3.11. We have
\[ Z_5^{(3)}(n) = \Omega(n \log\log n). \]

Proof. Apply Recurrence 3.10 with $k = 3$ and $t = \sqrt{n}$, and use Lemma 3.1.
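The $\log\log n$ factor in Lemma 3.11 comes from the depth of the recursion $n \to \sqrt{n}$. A small numerical sketch makes this visible; it assumes that the recursion stops as soon as $n < 36$ (the threshold in Recurrence 3.10), rounds $\sqrt{n}$ down to an integer, and uses a hypothetical function name.

```python
import math


def sqrt_recursion_depth(n, threshold=36):
    """Number of times n can be replaced by floor(sqrt(n)) while n >= threshold."""
    depth = 0
    while n >= threshold:
        n = math.isqrt(n)
        depth += 1
    return depth


if __name__ == "__main__":
    for e in (8, 16, 32, 64, 128, 256):
        print(f"n = 2^{e}: depth {sqrt_recursion_depth(2**e)}")
```

Squaring $n$ adds exactly one level to the recursion, so the depth grows like $\log\log n$; since each level contributes $\Omega(n)$ triples in total, this gives the stated bound.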

Lemma 3.12. For every $k \ge 6$ we have
\[ Z_k^{(3)}(n) \ge c_1 n\,\alpha_{\lfloor k/2 \rfloor}(n) - c_2 n \quad \text{for all } n, \]
for some absolute constants $c_1 > 0$ and $c_2 < \infty$.

Proof. By induction from $k$ to $k + 2$, and then on $n$. The base cases are $k = 6, 7$, which we derive from Recurrence 3.10 with $k = 4$ and $t = \log n$, and with $k = 5$ and $t = \log\log n$, respectively. We use the lower bounds for $Z_4^{(3)}(n)$ and $Z_5^{(3)}(n)$ of Lemmas 3.9 and 3.11, respectively, and we obtain
\[ Z_6^{(3)}(n),\; Z_7^{(3)}(n) = \Omega(n \log^* n) = \Omega(n\,\alpha_3(n)). \]
(The recursion depth is $\log^* n$ for $Z_6^{(3)}(n)$ and $\frac{1}{2}\log^* n$ for $Z_7^{(3)}(n)$.) Thus, we can assume without loss of generality that $c_1$ is small enough so that our claim is satisfied for $k = 6, 7$.

We define a variant $\hat\alpha_m(n)$ of the inverse Ackermann function by $\hat\alpha_3(n) = \alpha_3(n) = \log^* n$ and, for $m \ge 4$, by:
\[ \hat\alpha_m(n) = \begin{cases} 0, & \text{if } n \le 1; \\ 1 + \hat\alpha_m\bigl(\hat\alpha_{m-1}(n)/c_3^2\bigr), & \text{otherwise;} \end{cases} \]

for an appropriate constant $c_3 > 1$, which must satisfy constraints to be specified below. It follows from the analysis in Appendix A that $\hat\alpha_m(n)$ differs from $\alpha_m(n)$ by an absolute constant.

Let $k \ge 8$, and put $m = \lfloor k/2 \rfloor$; thus, $m \ge 4$. We will show by induction on $k$ that
\[ Z_k^{(3)}(n) \ge \frac{12}{c_3}\,n\,\hat\alpha_m(n) - c_3 n, \quad \text{for all } n. \]
This clearly implies the claim.

Let $t = \hat\alpha_{m-1}(n)/c_3^2$. We consider three cases, according to how large $\hat\alpha_{m-1}(n)$ is. The first case is when $\hat\alpha_{m-1}(n) \le c_4$ for an appropriate constant $c_4$ (specified below). In this case we have
\[ \frac{12}{c_3}\,n\,\hat\alpha_m(n) - c_3 n \le \frac{12}{c_3}\,n\,\hat\alpha_{m-1}(n) - c_3 n \le \left(\frac{12c_4}{c_3} - c_3\right) n \le 0 \le Z_k^{(3)}(n), \]
assuming that $c_3$ is large enough so that $c_3 \ge \sqrt{12c_4}$.

The second case is when $c_4 \le \hat\alpha_{m-1}(n) \le 3c_3^2$. Now, $c_4$ is chosen large enough so that if $\hat\alpha_{m-1}(n) \ge c_4$ then $\hat\alpha_m(n) \le \hat\alpha_{m-1}(n)/36$. (It is clear that such a constant $c_4$ exists, since $\hat\alpha_m(n)$ decreases very rapidly with $m$ for fixed $n$; concretely, we can take $c_4 = 180$.) In this case we have
\[ \frac{12}{c_3}\,n\,\hat\alpha_m(n) - c_3 n \le \frac{12}{c_3}\,n \cdot \frac{\hat\alpha_{m-1}(n)}{36} - c_3 n \le \frac{n \cdot 3c_3^2}{3c_3} - c_3 n = 0 \le Z_k^{(3)}(n). \]

Finally, the third case is when $\hat\alpha_{m-1}(n) \ge 3c_3^2$. In this case we have $t \ge 3$, so we can apply Recurrence 3.10.

Suppose by induction on $k$ that $Z_{k-2}^{(3)}(n) \ge \frac{12}{c_3}\,n\,\hat\alpha_{m-1}(n) - c_3 n$ for all $n$. Since $\hat\alpha_{m-1}(n)$ grows so slowly in $n$, we have, for the values of $n$ under consideration, $\hat\alpha_{m-1}(n/(3t)) \ge \hat\alpha_{m-1}(n) - 1$, so certainly $\hat\alpha_{m-1}(n/(3t)) \ge \hat\alpha_{m-1}(n)/2$. Thus,
\[ Z_{k-2}^{(3)}\!\left(\frac{n}{3t}\right) \ge \frac{12}{c_3} \cdot \frac{n}{3t}\,\hat\alpha_{m-1}\!\left(\frac{n}{3t}\right) - c_3\frac{n}{3t} \ge \frac{2n\,\hat\alpha_{m-1}(n)}{c_3 t} - c_3 n = 2c_3 n - c_3 n = c_3 n \ge \frac{12}{c_3}\,n, \]
assuming $c_3 \ge \sqrt{12}$.

If we also have $c_3 \ge 216$, then the above quantity is smaller than $n/18$, so substituting into Recurrence 3.10 we obtain
\[ Z_k^{(3)}(n) \ge \frac{n}{t}\,Z_k^{(3)}(t) + \frac{12}{c_3}\,n. \]
Assuming by induction on $n$ the desired bound on $Z_k^{(3)}(t)$, and using the fact that $\hat\alpha_m(t) = \hat\alpha_m(n) - 1$, we get
\[ Z_k^{(3)}(n) \ge \frac{n}{t}\left(\frac{12}{c_3}\,t\bigl(\hat\alpha_m(n) - 1\bigr) - c_3 t\right) + \frac{12}{c_3}\,n = \frac{12}{c_3}\,n\,\hat\alpha_m(n) - c_3 n. \]
This proves the lower bounds of Theorem 1.6.
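The last step of the proof of Lemma 3.12 is a one-line cancellation, spelled out here as a worked equation. The abbreviation $a = \hat\alpha_m(n)$ is introduced only for this sketch; the induction hypothesis is applied to $Z_k^{(3)}(t)$ with $\hat\alpha_m(t) = a - 1$.

```latex
\begin{align*}
Z_k^{(3)}(n)
  &\ge \frac{n}{t}\left(\frac{12}{c_3}\,t\,(a - 1) - c_3 t\right) + \frac{12}{c_3}\,n \\
  &= \frac{12}{c_3}\,n\,a - \frac{12}{c_3}\,n - c_3 n + \frac{12}{c_3}\,n \\
  &= \frac{12}{c_3}\,n\,a - c_3 n .
\end{align*}
```

The additive term $\frac{12}{c_3}n$ gained from the global triples pays exactly for the unit lost when passing from $\hat\alpha_m(n)$ to $\hat\alpha_m(t)$.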

3.2.2 Lower bounds for j-tuples

We now derive general lower bounds for $Z_k^{(j)}(n)$, $j \ge 4$. We will construct a sequence of integer-valued functions $Q_j(m)$, $m \ge 2$, such that
\[ Z_{Q_j(2)}^{(j)}(n) = \Omega\bigl(n \log^{(j-1)} n\bigr); \tag{3.4} \]
\[ Z_{Q_j(m)}^{(j)}(n) = \Omega\bigl(n\,\alpha_m^{(j-2)}(n)\bigr) = \omega\bigl(n\,\alpha_{m+1}(n)\bigr), \quad m \ge 3; \tag{3.5} \]
for all $j \ge 4$. (Recall that $f^{(j)}$ denotes the $j$-fold composition of $f$.) Our arguments become more involved, because we now divide each block into sub-blocks. Let us start with the case $m = 2$ given by (3.4).

Recurrence 3.13. Let $j \ge 3$ be fixed. Let $q$ be a parameter, with $q \le n/(3j) - 2$. Let $k_1, k_2$ be integers, and put $k = 2k_1 + (j-2)k_2 + j - 1$. Then,
\[ Z_k^{(j)}(n) \ge \min\left\{ \frac{n}{3jq}\,Z_{k_1}^{(j-1)}(q),\; \frac{n}{3jq}\,Z_{k_2}^{(j-2)}(q),\; j\,Z_k^{(j)}\!\left(\frac{n}{j}\right) + \frac{n}{3j^2 q} \right\} \]
for all $n \ge 6j$.

Proof. Let $t = \lceil n/j \rceil$. Define the elements $y_0, \ldots, y_j$, the blocks $B_1, \ldots, B_j$, and the contracted blocks $B'_1, \ldots, B'_j$, as in the proof of Lemma 3.9. We have $|B_i| = t$ and $|B'_i| = t - 2$ for all $i$.

Next, define on each contracted block $B'_i$ a sequence $D_{i1}, \ldots, D_{id}$ of $d = \lfloor (t-2)/q \rfloor$ disjoint sub-blocks of size $q$ (these sub-blocks do not necessarily cover $B'_i$ completely; see Figure 3.7). Note that $d \ge 2n/(3jq)$, since $q \le n/(3j) - 2$.

Let $Z$ be a family of $j$-tuples that stab all $k$-chains in $[1, n]$. For each $i$, let $Z_i$ contain those $j$-tuples of $Z$ that lie entirely inside $B_i$. Note that the families $Z_i$ are pairwise disjoint.

Figure 3.7: Sub-blocks defined within a contracted block $B'_i$.

Figure 3.8: A $k$-chain which cannot be stabbed by any $j$-tuple, local or global.

Let $Z'_1$ (resp., $Z'_j$) be the family of $(j-1)$-tuples obtained by deleting the last (resp., first) element of each $j$-tuple in $Z_1$ (resp., $Z_j$). For each $2 \le i \le j - 1$, let $Z'_i$ be the family of $(j-2)$-tuples obtained by deleting the first and last elements of each $j$-tuple in $Z_i$.

We say that a sub-block $D_{i\ell}$, $i \in \{1, j\}$, is cleared if the $(j-1)$-tuples in $Z'_i$ stab all the $k_1$-chains in $D_{i\ell}$. And a sub-block $D_{i\ell}$, $2 \le i \le j - 1$, is cleared if the $(j-2)$-tuples in $Z'_i$ stab all the $k_2$-chains in $D_{i\ell}$. A block $B_i$ is cleared if at least half of its sub-blocks are cleared.

Now consider the “global” $j$-tuples of $Z$, namely those that are not contained in any $Z_i$. Let $B'_i$ be an uncleared block. We say that $B'_i$ is safe if every uncleared sub-block $D_{i\ell}$ within $B'_i$ (of which there are at least $d/2$) contains some point of a global $j$-tuple.

Suppose all the blocks are uncleared and unsafe. Then we can build a $k$-chain $C$ that cannot be stabbed by any $j$-tuple in $Z$: For each $1 \le i \le j$, we take an uncleared sub-block $D_{i\ell_i}$ of block $B'_i$ that is not “touched” by any global $j$-tuple. We take a “hardy” $k_1$-chain from each of the sub-blocks $D_{1\ell_1}$, $D_{j\ell_j}$, and a “hardy” $k_2$-chain from each intermediate block $D_{i\ell_i}$, $2 \le i \le j - 1$. These “hardy” chains are chains that are not stabbed by any tuple in the respective families $Z'_i$, and are also not touched by any global $j$-tuple. We connect the hardy chains together with $j - 1$ “long intervals” (see Figure 3.8). As before, the long intervals are nonempty, since each one contains an element $y_i$. The total length of $C$ is $2k_1 + (j-2)k_2 + j - 1 = k$.

Now, $C$ cannot be stabbed by a local $j$-tuple, because then the corresponding $(j-1)$- or $(j-2)$-tuple in $Z'_i$ would stab a hardy chain. And $C$ cannot be stabbed by a global $j$-tuple, since the global $j$-tuples can only stab the long intervals, and there are only $j - 1$ long intervals.

Therefore, there are two possibilities. The first one is that all the blocks are uncleared, but at least one of them is safe. This implies that there are at least
\[ \frac{1}{j} \cdot \frac{d}{2} \ge \frac{n}{3j^2 q} \]
global $j$-tuples. There must also be at least $j\,Z_k^{(j)}(t)$ local $j$-tuples,^5 and we get the third term of the recurrence.

The second possibility is that some block $B'_i$ is cleared. If $i \in \{1, j\}$, this implies that
\[ |Z_i| \ge |Z'_i| \ge \frac{d}{2}\,Z_{k_1}^{(j-1)}(q) \ge \frac{n}{3jq}\,Z_{k_1}^{(j-1)}(q). \]
And if $2 \le i \le j - 1$, this implies that
\[ |Z_i| \ge |Z'_i| \ge \frac{n}{3jq}\,Z_{k_2}^{(j-2)}(q). \]

Now, let
\[ Q_2(2) = 1; \qquad Q_3(2) = 5; \qquad Q_j(2) = 2Q_{j-1}(2) + (j-2)Q_{j-2}(2) + j - 1, \quad j \ge 4. \]
For $j \ge 4$ we have $Q_j(2) = 15, 49, 163, 577, 2139, \ldots$.

Lemma 3.14. For every fixed $j \ge 2$ we have
\[ Z_{Q_j(2)}^{(j)}(n) = \Omega\bigl(n \log^{(j-1)} n\bigr), \]
where the constant of proportionality depends on $j$.

Proof. By induction on $j$. The case $j = 2$ is trivial, since it is impossible to stab a 1-chain with a pair, so $Z_1^{(2)}(n) = \infty$. And the case $j = 3$ is given by Lemma 3.11. So let $j \ge 4$. Apply Recurrence 3.13 with
\[ k_1 = Q_{j-1}(2), \quad k_2 = Q_{j-2}(2), \quad k = Q_j(2), \quad q = \log n. \]
By induction, we have
\[ \frac{n}{3jq}\,Z_{k_1}^{(j-1)}(q) = \Omega\bigl(n \log^{(j-1)} n\bigr); \qquad \frac{n}{3jq}\,Z_{k_2}^{(j-2)}(q) = \Omega\bigl(n \log^{(j-2)} n\bigr) = \omega\bigl(n \log^{(j-1)} n\bigr). \]

^5 This of course holds in any case.

Now, consider the recurrence relation^6
\[ f(n) \ge j\,f\!\left(\frac{n}{j}\right) + \frac{n}{\log n}. \]
This recurrence has solution $f(n) = \Omega(n \log\log n) = \omega\bigl(n \log^{(j-1)} n\bigr)$. Therefore, substituting into Recurrence 3.13, we get $Z_{Q_j(2)}^{(j)}(n) = \Omega\bigl(n \log^{(j-1)} n\bigr)$, as desired.

We now derive the bounds (3.5). We use the following recurrence.

Recurrence 3.15. Let $j$ be fixed. Let $t$ and $q$ be parameters, with $t \le \sqrt{n}$ and $q \le t/9 - 2$. Let $k_1, k_2, k_3$ be integers, and put $k = 2k_1 + (k_2 + 1)(k_3 - 1) + 1$. Then,
\[ Z_k^{(j)}(n) \ge \min\left\{ \frac{n}{9q}\,Z_{k_1}^{(j-1)}(q),\; \frac{n}{9q}\,Z_{k_2}^{(j-2)}(q),\; \frac{n}{t}\,Z_k^{(j)}(t) + \min\left\{ \frac{n}{9jq},\; Z_{k_3}^{(j)}\!\left(\frac{n}{3t}\right) \right\} \right\} \]
for all $n \ge 36$.

Proof. Let $b = \lceil n/t \rceil$, and define the elements $y_0, \ldots, y_b$, the blocks $B_1, \ldots, B_b$, and the contracted blocks $B'_1, \ldots, B'_b$ as in the proof of Recurrence 3.10. We have $|B_i| = t$ and $|B'_i| = t - 2$ for all $i$. As in the proof of Recurrence 3.13, define on each contracted block $B'_i$ a sequence $D_{i1}, \ldots, D_{id}$ of $d = \lfloor (t-2)/q \rfloor$ disjoint “sub-blocks” of size $q$.

Let $Z$ be a family of $j$-tuples that stab all $k$-chains in $[1, n]$. For each $i$, let $Z_i$ be the family of $j$-tuples of $Z$ that are entirely contained in block $B_i$. The families $Z_i$ are pairwise disjoint, and each one has size at least $Z_k^{(j)}(t)$.

For each $i$, let $Z_i^{(1)}$ be the family of $(j-1)$-tuples obtained by removing the last element of each $j$-tuple in $Z_i$. Let $Z_i^{(2)}$ be the family of $(j-2)$-tuples obtained by removing the first and last elements of each $j$-tuple in $Z_i$. And let $Z_i^{(3)}$ be the family of $(j-1)$-tuples obtained by removing the first element of each $j$-tuple in $Z_i$.

Let $D_{i\ell}$ be a sub-block within block $B'_i$. We say that $D_{i\ell}$ is left-cleared (resp., right-cleared) if the $(j-1)$-tuples of $Z_i^{(1)}$ (resp., $Z_i^{(3)}$) stab all the $k_1$-chains in $D_{i\ell}$. And we say that $D_{i\ell}$ is middle-cleared if the $(j-2)$-tuples of $Z_i^{(2)}$ stab all the $k_2$-chains in $D_{i\ell}$.

^6 It is correct, in a recurrence like Recurrence 3.13, to consider each case separately and then take the minimum of the resulting “sub-solutions”.

Now consider the “global” $j$-tuples of $Z$, namely those that are not contained in any block $B_i$. We say that a sub-block $D_{i\ell}$ is visited if it contains some point of a global $j$-tuple.

If a sub-block $D_{i\ell}$ is neither left-, middle-, nor right-cleared, nor is it visited, then $D_{i\ell}$ is hot; otherwise, it is cold. A hot sub-block contains three hardy chains $H^{(1)}, H^{(2)}, H^{(3)}$ (not necessarily disjoint), of lengths $k_1$, $k_2$, and $k_1$, respectively, which are not stabbed by any tuple in $Z_i^{(1)}, Z_i^{(2)}, Z_i^{(3)}$, respectively, and are not “touched” by any global $j$-tuple. A block $B'_i$ is hot if it contains some hot sub-block $D_{i\ell}$; otherwise, it is cold.

Now, suppose that at least half the blocks $B'_i$ are cold. Then, there is a total of at least $bd/2$ cold sub-blocks. Therefore, there must be at least
\[ \frac{1}{4} \cdot \frac{bd}{2} \ge \frac{n}{9q} \]
sub-blocks which are either all left-cleared, or all middle-cleared, or all right-cleared, or all visited. (Note that $d \ge 8t/(9q)$, since $q \le t/9 - 2$.) The first or third case implies
\[ |Z| \ge \frac{n}{9q}\,Z_{k_1}^{(j-1)}(q); \]
while the second case implies
\[ |Z| \ge \frac{n}{9q}\,Z_{k_2}^{(j-2)}(q). \]
Finally, the fourth case implies that $Z$ contains at least $n/(9jq)$ global $j$-tuples, plus at least $(n/t)\,Z_k^{(j)}(t)$ local $j$-tuples.

Suppose, then, that there are $m \ge n/(2t)$ hot blocks $B'_i$. Let $K_1, \ldots, K_m$ be $m$ hot sub-blocks from $m$ distinct such blocks. These sub-blocks define a sequence of $m - 1$ nonempty “links” $L_1, \ldots, L_{m-1}$ between them, as in the proof of Recurrence 3.10. Each sub-block $K_i$ contains the hardy chains $H_i^{(1)}, H_i^{(2)}, H_i^{(3)}$ mentioned above, which we now index with $i$.

Consider a $k_3$-chain $C' = I'_1 \cdots I'_{k_3}$ on the links. This chain is uniquely determined by a sequence of $k_3 + 1$ sub-blocks
\[ K_{a_1}, K_{a_2}, \ldots, K_{a_{k_3+1}}, \tag{3.6} \]
where each interval $I'_i$ contains those links that lie between $K_{a_i}$ and $K_{a_{i+1}}$. We can translate $C'$ into the $k$-chain
\[ C = H_{a_1}^{(1)}\, I_1\, H_{a_2}^{(2)}\, I_2 \cdots I_{k_3-1}\, H_{a_{k_3}}^{(2)}\, I_{k_3}\, H_{a_{k_3+1}}^{(3)}, \]

on $[1, n]$, where each interval $I_i$ extends from the end of one hardy chain to the beginning of the next. The number of intervals in $C$ is $2k_1 + (k_3 - 1)k_2 + k_3 = k$.

Now, $C$ cannot be stabbed by any local $j$-tuple from a block $K_{a_i}$, since then the corresponding hardy chain $H_{a_i}^{(x)}$ would be stabbed by a tuple from $Z_{a_i}^{(x)}$ (for the same $x \in \{1, 2, 3\}$). Therefore, $C$ must be stabbed by a global $j$-tuple $\tau \in Z$. Further, $\tau$ must stab $j$ links from $j$ different intervals $I_i$ (since none of the chains $H_{a_i}^{(x)}$ is touched by $\tau$). Thus, we can translate $\tau$ back into a $j$-tuple of links $\tau'$ that stabs $C'$.

Hence, there are at least $Z_{k_3}^{(j)}(m - 1) \ge Z_{k_3}^{(j)}\bigl(n/(3t)\bigr)$ global $j$-tuples, plus at least $(n/t)\,Z_k^{(j)}(t)$ local $j$-tuples.

Hence, at least one of the terms in the recurrence must apply, and we are done.

Define integer-valued functions $Q_j(m)$, for $j, m \ge 2$, by
\[ Q_2(m) = 1; \qquad Q_3(m) = 2m + 1; \]
and for $j \ge 4$,
\[ Q_j(m) = 2Q_{j-1}(m) + \bigl(1 + Q_{j-2}(m)\bigr)\bigl(Q_j(m-1) - 1\bigr) + 1, \quad m \ge 3; \]
with $Q_j(2)$ as defined above. See Table 3.2. We have $Q_4(m) = 8 \cdot 2^m - 4m - 9$, and in general, letting $t = \lfloor (j-2)/2 \rfloor$,
\[ Q_j(m) = \begin{cases} 2^{(1/t!)\,m^t \pm O(m^{t-1})}, & \text{for } j \ge 4 \text{ even}; \\ 2^{(1/t!)\,m^t \log_2 m \pm O(m^t)}, & \text{for } j \ge 3 \text{ odd}; \end{cases} \]
just as in the case of $P_j(m)$; see Appendix B.

Table 3.2: Values of $Q_j(m)$ for small $j$ and $m$.

    m        |   2     3       4          5       6         7
    Q_2(m)   |   1     1       1          1       1         1
    Q_3(m)   |   5     7       9         11      13        15
    Q_4(m)   |  15    43     103        227     479       987
    Q_5(m)   |  49   471    4907      59327  831523  13306327
    Q_6(m)   | 163  8071  849095  193712087     ···

Lemma 3.16. For every fixed $j \ge 2$ we have
\[ Z_{Q_j(3)}^{(j)}(n) = \Omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr) \]
(with the constant of proportionality depending on $j$).
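The functions $Q_j(m)$ defined before Lemma 3.16 can be evaluated in the same mechanical way as $P_j(m)$. The sketch below takes the recursive case to be $Q_j(m) = 2Q_{j-1}(m) + (1 + Q_{j-2}(m))(Q_j(m-1) - 1) + 1$, which matches the relation $k = 2k_1 + (k_2 + 1)(k_3 - 1) + 1$ of Recurrence 3.15, and takes the column $m = 2$ from the separate recurrence for $Q_j(2)$. The function name `Q` is a choice made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def Q(j, m):
    """Q_j(m) for j, m >= 2, following the two recursive definitions."""
    if j == 2:
        return 1
    if j == 3:
        return 2 * m + 1
    if m == 2:
        # Column m = 2: Q_j(2) = 2 Q_{j-1}(2) + (j - 2) Q_{j-2}(2) + j - 1.
        return 2 * Q(j - 1, 2) + (j - 2) * Q(j - 2, 2) + j - 1
    return 2 * Q(j - 1, m) + (1 + Q(j - 2, m)) * (Q(j, m - 1) - 1) + 1


if __name__ == "__main__":
    for j in range(2, 7):
        print(f"Q_{j}(m):", [Q(j, m) for m in range(2, 8)])
```

Running it gives a way to compare against the entries of Table 3.2 and against the closed form $Q_4(m) = 8 \cdot 2^m - 4m - 9$.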

Proof. The case $j = 2$ is trivial, and the case $j = 3$ is given by Lemma 3.12. So let $j \ge 4$. We apply Recurrence 3.15 with the following parameters:
\[ k_1 = Q_{j-1}(3), \quad k_2 = Q_{j-2}(3), \quad k_3 = Q_j(2), \quad k = Q_j(3), \quad t = \log^{(j-1)} n, \quad q = \alpha_3(n). \]
By induction on $j$ we have
\[ \frac{n}{9q}\,Z_{k_1}^{(j-1)}(q) = \Omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr), \qquad \frac{n}{9q}\,Z_{k_2}^{(j-2)}(q) = \Omega\bigl(n\,\alpha_3^{(j-3)}(n)\bigr) = \omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr). \]
Now, consider a recurrence relation of the form
\[ f(n) \ge \frac{n}{t}\,f(t) + \frac{cn}{q}, \tag{3.7} \]
for some constant $c$. We have $\alpha_3\bigl(\log^{(i)} n\bigr) = \alpha_3(n) - i$ for every integer $i \ge 0$. Hence, (3.7) expands into a harmonic-like series, which yields $f(n) = \Omega\bigl(n \log \alpha_3(n)\bigr) = \omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr)$.

Finally, by Lemma 3.14 we have
\[ Z_{k_3}^{(j)}\!\left(\frac{n}{3t}\right) = \Omega\!\left(\frac{n}{t}\,\log^{(j-1)}\frac{n}{3t}\right) = \Omega(n). \]
The solution of the recurrence $f(n) \ge (n/t)f(t) + \Omega(n)$ is $f(n) = \Omega(n\,\alpha_3(n))$, which is also $\omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr)$. Plugging all these lower bounds into Recurrence 3.15, we get $Z_{Q_j(3)}^{(j)}(n) = \Omega\bigl(n\,\alpha_3^{(j-2)}(n)\bigr)$, as desired.

Lemma 3.17. For every $j \ge 2$ and every $m \ge 3$ we have
\[ Z_{Q_j(m)}^{(j)}(n) \ge c_j n\,\alpha_m^{(j-2)}(n) - c'_j n \quad \text{for all } n, \]
for some constants $c_j > 0$ and $c'_j < \infty$ that depend only on $j$.

Proof. The case $m = 3$ follows from Lemma 3.16 if $c_j$ is chosen small enough, so let $m \ge 4$.

We define a variant $\hat\alpha_m(n)$ of the inverse Ackermann function by $\hat\alpha_3(n) = \alpha_3(n) = \log^* n$ and, for $m \ge 4$, by
\[ \hat\alpha_m(n) = \begin{cases} 0, & \text{if } n \le 1; \\ 1 + \hat\alpha_m\bigl(\hat\alpha_{m-1}^{(j-2)}(n)/c''_j\bigr), & \text{otherwise;} \end{cases} \tag{3.8} \]
for an appropriate sufficiently large constant $c''_j > 1$, which must satisfy constraints to be specified below. As argued in Appendix A, because of

the $(j-2)$-fold application of $\hat\alpha_{m-1}$ in (3.8), we have $\hat\alpha_m(n) \approx \frac{1}{j-2}\alpha_m(n)$. Specifically, there exists a constant $\gamma_j$ such that for all $m \ge 4$ and all $n$ we have $\hat\alpha_m(n) \ge \frac{1}{j-2}\alpha_m(n) - \gamma_j$.

We will show by induction on $j$, $m$, and $n$ that
\[ Z_{Q_j(m)}^{(j)}(n) \ge \frac{4}{c''_j}\,n\,\hat\alpha_m^{(j-2)}(n) - c''_j n, \quad \text{for all } n. \]
This clearly implies the claim.

We apply Recurrence 3.15 with the following parameters:
\[ k_1 = Q_{j-1}(m), \quad k_2 = Q_{j-2}(m), \quad k_3 = Q_j(m-1), \quad k = Q_j(m), \quad t = \hat\alpha_{m-1}^{(j-2)}(n)/c''_j, \quad q = \hat\alpha_m(n). \]
We assume that $c''_j$ is large enough so that
\[ c''_j > 9\bigl(\log^* (c''_j)^2 + 3\bigr). \tag{3.9} \]
We consider three cases, according to how large $\hat\alpha_{m-1}^{(j-2)}(n)$ is (like we did in the proof of Lemma 3.12). The first case is when $\hat\alpha_{m-1}^{(j-2)}(n) \le \psi_j$ for some constant $\psi_j$ to be specified below. In this case we have
\[ \frac{4}{c''_j}\,n\,\hat\alpha_m^{(j-2)}(n) - c''_j n \le \frac{4}{c''_j}\,n\,\hat\alpha_{m-1}^{(j-2)}(n) - c''_j n \le \left(\frac{4\psi_j}{c''_j} - c''_j\right) n \le 0 \le Z_k^{(j)}(n), \]
assuming $c''_j \ge 2\sqrt{\psi_j}$.

The second case is when $\psi_j < \hat\alpha_{m-1}^{(j-2)}(n) \le (c''_j)^2$. Now, the constant $\psi_j$ is chosen large enough so that if $\hat\alpha_{m-1}^{(j-2)}(n) > \psi_j$ then $\hat\alpha_m^{(j-2)}(n) \le \frac{1}{4}\hat\alpha_{m-1}^{(j-2)}(n)$. (Again, such a $\psi_j$ clearly exists since $\hat\alpha_m^{(j-2)}(n)$ decreases so rapidly in $m$ for fixed $n$.)^7 Then,
\[ \frac{4}{c''_j}\,n\,\hat\alpha_m^{(j-2)}(n) - c''_j n \le \frac{1}{c''_j}\,n\,\hat\alpha_{m-1}^{(j-2)}(n) - c''_j n \le c''_j n - c''_j n = 0 \le Z_k^{(j)}(n). \]

^7 We first choose $\psi_j$ large enough according to this condition, and then we choose $c''_j$ large enough in terms of $\psi_j$. The function $\hat\alpha_m(n)$ decreases as $c''_j$ increases, so there is no circularity here.

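Finally, the third case is when $\hat\alpha_{m-1}^{(j-2)}(n) > (c''_j)^2$. In this case we apply Recurrence 3.15. For this we first have to check that $q \le t/9 - 2$ (we certainly also have $t \le \sqrt{n}$). Suppose for a contradiction that $q > t/9 - 2$. Then,
\[ \hat\alpha_{m-1}^{(j-2)}(n) < 9c''_j\bigl(\hat\alpha_m(n) + 2\bigr) = 9c''_j\Bigl(\hat\alpha_m\bigl(\hat\alpha_{m-1}^{(j-2)}(n)/c''_j\bigr) + 3\Bigr) \le 9c''_j\bigl(\log^* \hat\alpha_{m-1}^{(j-2)}(n) + 3\bigr); \]
but this condition is violated for $\hat\alpha_{m-1}^{(j-2)} > (c''_j)^2$ by (3.9).

Now, by induction on $j$ we have:
\[ Z_{k_1}^{(j-1)}(n) \ge b_{j-1}\,n\,\hat\alpha_m^{(j-3)}(n) - b'_{j-1} n, \qquad Z_{k_2}^{(j-2)}(n) \ge b_{j-2}\,n\,\hat\alpha_m^{(j-4)}(n) - b'_{j-2} n, \]
for some constants $b_{j-1}, b_{j-2} > 0$ and $b'_{j-1}, b'_{j-2} < \infty$.^8 Then,
\[ \frac{n}{9q}\,Z_{k_1}^{(j-1)}(q) \ge \frac{b_{j-1}}{9}\,n\,\hat\alpha_m^{(j-2)}(n) - \frac{b'_{j-1}}{9}\,n \ge \frac{4}{c''_j}\,n\bigl(\hat\alpha_m^{(j-2)}(n) + 1\bigr) - c''_j n, \]
\[ \frac{n}{9q}\,Z_{k_2}^{(j-2)}(q) \ge \frac{b_{j-2}}{9}\,n\,\hat\alpha_m^{(j-3)}(n) - \frac{b'_{j-2}}{9}\,n \ge \frac{4}{c''_j}\,n\bigl(\hat\alpha_m^{(j-2)}(n) + 1\bigr) - c''_j n, \]
assuming we choose $c''_j$ large enough in terms of $b_{j-1}, b_{j-2}, b'_{j-1}, b'_{j-2}$.

Next we bound $Z_{k_3}^{(j)}\bigl(n/(3t)\bigr)$ by induction on $m$. Using the assumption $\hat\alpha_{m-1}^{(j-2)}(n) > (c''_j)^2$ and the fact that $\hat\alpha_{m-1}^{(j-2)}\bigl(n/(3t)\bigr) \ge \hat\alpha_{m-1}^{(j-2)}(n)/2$ (since $\hat\alpha_{m-1}^{(j-2)}(n)$ grows so slowly in $n$), we have:
\[ Z_{k_3}^{(j)}\!\left(\frac{n}{3t}\right) \ge \frac{4n}{3c''_j t}\,\hat\alpha_{m-1}^{(j-2)}\!\left(\frac{n}{3t}\right) - c''_j\,\frac{n}{3t} \ge \frac{2n}{3c''_j t}\,\hat\alpha_{m-1}^{(j-2)}(n) - c''_j \cdot \frac{c''_j n}{3\hat\alpha_{m-1}^{(j-2)}(n)} \ge \frac{2n}{3} - \frac{n}{3} = \frac{n}{3} \ge \frac{n}{9jq}. \]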

(j)

Substituting into Recurrence 3.15, and letting g(n) = nZk (n), we obtain the following recurrence for g(n): 4 (j−2) 1 00 g(n) ≥ max 00 α b (n) + 1 − cj , g(t) + . (3.10) cj m 9jq 8

As in the proof of Lemma 3.8, strictly speaking we should write α bm (n) as α bj,m (n). Again, the dependence on j is very minor, since α bj,m (n)/b αj 0 ,m (n) is bounded by some constant cj,j 0 . The constants bj−1 , bj−2 here are appropriately chosen to “pay” for the conversion from α bj−1,m (n), α bj−2,m (n) to α bj,m (n), respectively.

66

CHAPTER 3. STABBING INTERVAL CHAINS

Consider the recurrence $h(n) \ge h(t) + 1/(9jq)$. By our choice of $t = t(n)$ and $q = q(n)$, we have $q(t^{(i)}(n)) = \hat\alpha_m(n) - i$. We can apply the recurrence as long as $n$ is large enough that $\hat\alpha_{m-1}^{(j-2)}(n) > (c''_j)^2$; i.e., as long as $\hat\alpha_m(n) > 1 + \hat\alpha_m(c''_j)$. Thus, we obtain

\[ h(n) \ge \frac{1}{9j}\Bigl( \frac{1}{\hat\alpha_m(n)} + \frac{1}{\hat\alpha_m(n) - 1} + \frac{1}{\hat\alpha_m(n) - 2} + \cdots + \frac{1}{1 + \hat\alpha_m(c''_j)} \Bigr) \ge \frac{1}{9j}\bigl( \ln \hat\alpha_m(n) - \ln \hat\alpha_m(c''_j) \bigr). \]

This is larger than

\[ \frac{4}{c''_j}\bigl(\hat\alpha_m^{(j-2)}(n) + 1\bigr) - c''_j \]

for all $n$ if $c''_j$ is chosen large enough.⁹

Thus, we conclude from (3.10) that

\[ g(n) \ge \frac{4}{c''_j}\, \hat\alpha_m^{(j-2)}(n) - c''_j, \]

and therefore,

\[ Z_k^{(j)}(n) \ge \frac{4}{c''_j}\, n \hat\alpha_m^{(j-2)}(n) - c''_j n, \]

as desired.

Define $Q'_j(m)$ for $j \ge 4$, $m \ge 2$, by

\[ Q'_j(2) = j; \qquad Q'_j(m) = Q_j(m-1), \quad m \ge 3. \]

Then, using the fact that $\alpha_{m-1}^{(j-1)}(n) \ge \alpha_m(n) - c^*_j$ for some constant $c^*_j$, we conclude by Lemmas 3.1, 3.14, 3.16, and 3.17 that for every $j \ge 4$, $m \ge 2$ we have

\[ Z^{(j)}_{Q'_j(m)}(n) \ge c_j\, n \alpha_m(n) - c'_j n \qquad \text{for all } n, \]

for some (other) constants $c_j$, $c'_j$. This proves the lower bounds in Theorem 1.7.

⁹ Again, intuition suggests that in a recurrence like (3.10) one can consider each case separately and then take the minimum of the resulting “sub-bounds”. This intuition is basically correct, though in our case we must subtract 1 from $\hat\alpha_m^{(j-2)}(n)$ in (3.10), since $\hat\alpha_m(n)$ is discontinuous in $n$ and increases in discrete steps of 1.

3.3 Stabbing with pairs

For the sake of completeness, we give almost-tight bounds on the number of pairs needed to stab all $k$-chains in $[1, n]$.

Lemma 3.18. We have

\[ \frac{n}{\lfloor k/2 \rfloor} - 3 \le Z_k^{(2)}(n) \le \frac{n}{\lfloor k/2 \rfloor} - 1. \]

Proof. For the upper bound, let $k$ be even, and let $q = k/2$. Take the family of pairs

\[ Z = \bigl\{ (iq, (i+1)q) \bigm| 1 \le i \le n/q - 1 \bigr\}. \]

It is easily verified that in any $k$-chain $C$, there must be at least two different intervals that contain elements of the form $iq$. Therefore, there must be two adjacent elements $iq$, $(i+1)q$ that fall on two different intervals, so $C$ is stabbed. We have

\[ |Z| = \Bigl\lfloor \frac{n}{q} \Bigr\rfloor - 1 \le \frac{n}{\lfloor k/2 \rfloor} - 1, \]

and we are done.

For the lower bound, let $k$ be odd, and let $q = (k-1)/2$. Let $Z = \{ (x_i, y_i) \mid 1 \le i \le m \}$ be a family of pairs that stabs all $k$-chains in $[1, n]$, with $x_i < y_i$ for all $i$. Let $X = \{ x_i \mid 1 \le i \le m \}$. We may assume that there exists an integer $a_1 \in [1, n-q+1]$ such that $X \cap [a_1, a_1+q-1] = \emptyset$, for otherwise we have $|Z| = |X| \ge \lfloor n/q \rfloor \ge n/q - 1$ and we are done. Let $a_1$ be the smallest integer with the above property. Partition $X$ into $X_1 = X \cap [1, a_1-1]$, $X_2 = X \cap [a_1, n]$. By the minimality of $a_1$, we have

\[ |X_1| \ge \Bigl\lfloor \frac{a_1 - 1}{q} \Bigr\rfloor \ge \frac{a_1}{q} - 1. \]

Let $Y = \{ y_i \mid x_i \in X_2 \}$. Suppose there exists an integer $a_2 \in [a_1+q+1, n-q+1]$ such that $Y \cap [a_2, a_2+q-1] = \emptyset$. Then the $k$-chain consisting of the $q$ singletons $[a_1] \cdots [a_1+q-1]$, followed by the interval $[a_1+q, a_2-1]$, followed by the $q$ singletons $[a_2] \cdots [a_2+q-1]$, cannot be stabbed by any pair in $Z$, as is easily checked. Thus, such an integer $a_2$ cannot exist, so we have

\[ |X_2| = |Y| \ge \Bigl\lfloor \frac{n - a_1 - q}{q} \Bigr\rfloor \ge \frac{n - a_1}{q} - 2, \]

so $|X| = |X_1| + |X_2| \ge n/q - 3$.
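The upper-bound construction of Lemma 3.18 can be checked by brute force for small parameters. The sketch below is only an illustration: it assumes the definitions used earlier in the thesis (a k-chain is a sequence of k consecutive, disjoint, nonempty intervals in [1, n], and a pair stabs a chain if its two points fall on two different intervals of the chain), and the function names are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def k_chains(n, k):
    """Yield every k-chain in [1, n] as breakpoints a_1 < ... < a_{k+1};
    the i-th interval of the chain is [a_i, a_{i+1} - 1]."""
    return combinations(range(1, n + 2), k + 1)


def stabs(pair, chain):
    """True if the two points of `pair` lie in two different intervals of `chain`."""
    idx = [bisect_right(chain, p) - 1 for p in pair]
    inside = all(0 <= i < len(chain) - 1 for i in idx)
    return inside and idx[0] != idx[1]


def pair_family(n, k):
    """The family {(iq, (i+1)q) : 1 <= i <= n/q - 1} for even k, q = k/2."""
    q = k // 2
    return [(i * q, (i + 1) * q) for i in range(1, n // q)]


def stabs_all(family, n, k):
    """True if every k-chain in [1, n] is stabbed by some pair of the family."""
    return all(any(stabs(p, c) for p in family) for c in k_chains(n, k))
```

For example, `stabs_all(pair_family(12, 4), 12, 4)` checks all 4-chains in [1, 12] against the five pairs (2, 4), (4, 6), ..., (10, 12). The enumeration is exponential in k, so this is meant only for small sanity checks.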

3.4 Conclusions

We have obtained, for every fixed $j$ and $k$, reasonably tight upper and lower bounds for $Z_k^{(j)}(n)$ as a function of $n$. It is an open problem to find the exact asymptotic form of $Z_k^{(j)}(n)$ for every fixed $j \ge 4$ and $k$.

Interestingly, there are several problems in the literature that are known to have bounds very similar to those of $Z_k^{(3)}(n)$ (stabbing with triples). Let us describe one of them, which is related to offline computation of partial sums in semigroups (Alon and Schieber [7], Chazelle and Rosenberg [17], Yao [52]). We are given the range $[1, n]$ and an integer $k$. We want to construct a family $Y$ of subsets of $[1, n]$, with $|Y|$ as small as possible, such that every interval $[a, b]$, $1 \le a \le b \le n$, can be expressed as the union of at most $k$ sets from $Y$. Let $y_k(n)$ denote the minimum size of such a family $Y$. Then (compare with the bounds for $Z_k^{(3)}(n)$ in Theorem 1.6),

\[ y_1(n) = \binom{n+1}{2}; \qquad y_2(n) = \Theta(n \log n); \qquad y_3(n) = \Theta(n \log\log n); \qquad y_k(n) = \Theta\bigl(n \alpha_{\lfloor k/2 \rfloor + 1}(n)\bigr), \quad k \ge 4. \]

In fact, these upper bounds can be achieved even if we require the sets in $Y$ to be intervals, and we require every $[a, b]$ to be expressed as a disjoint union of such intervals.

Bounds similar to these have also been obtained for some problems related to right-rotations in binary search trees (Sundar [48]), and for circuits of bounded depth (see, e.g., Chandra et al. [15], Dolev et al. [21], Pudlák [41]). In general, there is no known relation between these problems, except that they all satisfy similar recurrences.

In contrast, we do not know of any combinatorial problem with bounds like those of $Z_k^{(j)}(n)$ for general $j$, except for the almost-DS sequences we describe in the next chapter. (Again, the bounds are similar because both problems satisfy “essentially the same” recurrence.) Are there other problems with similar bounds? If there are, it would be interesting to find more examples.
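To make the partial-sums problem concrete, here is a minimal sketch of the case k = 2. It uses the standard divide-and-conquer construction (store all suffixes of the left half and all prefixes of the right half, then recurse), which is not taken from this thesis; it yields O(n log n) intervals, consistent with the bound for $y_2(n)$ quoted above. The function names are hypothetical.

```python
def two_piece_family(lo, hi, family=None):
    """Collect intervals (u, v) such that every [a, b] inside [lo, hi] is a
    disjoint union of at most two collected intervals."""
    if family is None:
        family = set()
    if lo > hi:
        return family
    if lo == hi:
        family.add((lo, hi))
        return family
    mid = (lo + hi) // 2
    for a in range(lo, mid + 1):          # suffixes [a, mid] of the left half
        family.add((a, mid))
    for b in range(mid + 1, hi + 1):      # prefixes [mid+1, b] of the right half
        family.add((mid + 1, b))
    two_piece_family(lo, mid, family)
    two_piece_family(mid + 1, hi, family)
    return family


def decompose(a, b, lo, hi):
    """Return one or two collected intervals whose disjoint union is [a, b]."""
    while lo < hi:
        mid = (lo + hi) // 2
        if b < mid:                       # [a, b] lies strictly inside the left half
            hi = mid
        elif a > mid + 1:                 # [a, b] lies strictly inside the right half
            lo = mid + 1
        else:                             # [a, b] touches or crosses the midpoint
            parts = []
            if a <= mid:
                parts.append((a, mid))
            if b > mid:
                parts.append((mid + 1, b))
            return parts
    return [(lo, hi)]
```

Each recursion level contributes at most as many intervals as the length of the range, and there are about log2 n levels, which gives the n log n growth.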


Chapter 4 Davenport–Schinzel sequences

In this chapter we derive our results on Davenport–Schinzel sequences and their generalizations. See Chapter 1 for a more detailed overview of these results.

Sections 4.1–4.4 contain our upper-bound results. In Section 4.1 we show how bounding $\lambda_s(n)$ (Theorem 1.9) reduces to bounding a function denoted $\psi_s(m, n)$. In Section 4.2 we improve the technique of Agarwal et al. [2, 45] for bounding $\psi_s(m, n)$. In Section 4.3 we present an alternative technique, based on almost-DS sequences, which yields the same improved bounds for $\psi_s(m, n)$. Section 4.4 addresses formation-free sequences. We first prove Lemma 1.14, and then we extend our new technique to formation-free sequences, proving Theorem 1.13.

Sections 4.5–4.6 contain our lower-bound results. Section 4.5 presents our construction for $\lambda_3(n)$ that proves Theorem 1.11. Section 4.6 contains the simplified lower-bound construction for Davenport–Schinzel sequences of even order $s \ge 4$.

For completeness, we provide proofs of most of the previous results we rely on.

4.1 General approach for the upper bounds

The upper bounds for $\lambda_s(n)$ are obtained by considering a function with an additional parameter $m$:

Definition 4.1: Let $m$, $n$, and $s$ be positive integers. Then $\psi_s(m, n)$ denotes the maximum length of a Davenport–Schinzel sequence of order $s$ on $n$ distinct symbols that can be partitioned into $m$ or fewer contiguous blocks, where each block contains only distinct symbols.
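The objects in Definition 4.1 are easy to test mechanically. The following sketch (hypothetical helper names, quadratic in the alphabet size and meant only for small examples) checks whether a sequence is a Davenport–Schinzel sequence of order s, i.e., has no adjacent equal symbols and no alternation of length s + 2, and computes the minimum number of contiguous blocks of distinct symbols into which it can be cut.

```python
from itertools import combinations


def max_alternation(seq):
    """Length of the longest alternation a b a b ... (a != b) occurring in seq
    as a not necessarily contiguous subsequence."""
    best = 1 if seq else 0
    for a, b in combinations(set(seq), 2):
        runs, prev = 0, None
        for x in seq:
            if x in (a, b) and x != prev:   # a new run of a's or b's starts here
                runs += 1
                prev = x
        best = max(best, runs)
    return best


def is_ds_sequence(seq, s):
    """True if seq has no adjacent repetitions and no alternation of length s + 2."""
    no_adjacent_repeat = all(x != y for x, y in zip(seq, seq[1:]))
    return no_adjacent_repeat and max_alternation(seq) < s + 2


def min_distinct_blocks(seq):
    """Fewest contiguous blocks with pairwise distinct symbols (greedy scan)."""
    blocks, current = 0, set()
    for x in seq:
        if not current or x in current:     # start a new block
            blocks += 1
            current = set()
        current.add(x)
    return blocks
```

For instance, the sequence `abcbdbe` is a Davenport–Schinzel sequence of order 2 on 5 symbols that splits into the 3 blocks `abc`, `bd`, `be`, so it witnesses $\psi_2(3, 5) \ge 7$.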

The relation between $\lambda_s(n)$ and $\psi_s(m, n)$ is as follows:

Lemma 4.2 ([2, 45]). Let $\varphi_{s-2}(n)$ be a nondecreasing function in $n$ such that $\lambda_{s-2}(n) \le n\varphi_{s-2}(n)$ for all $n$. Then,

\[ \lambda_s(n) \le \varphi_{s-2}(n)\bigl(\psi_s(2n, n) + 2n\bigr). \tag{4.1} \]

Proof. Let $S$ be a Davenport–Schinzel sequence of order $s$ on $n$ symbols with maximum length $\lambda_s(n)$. Partition $S$ greedily from left to right into blocks $S_1, S_2, \ldots, S_m$, such that each $S_i$ is a sequence of order $s - 2$; in other words, when scanning $S$ from left to right, start a new block $S_{i+1}$ only if an additional symbol would cause $S_i$ to contain an alternation of length $s$.¹

We claim that $m$, the number of blocks, is at most $2n$. Indeed, consider some block $S_i$ for $i < m$. Since $S_i$ was extended maximally to the right, it must contain an alternation $abab\ldots$ of length $s - 1$, which is extended to length $s$ by the first symbol of $S_{i+1}$ (which is either $a$ or $b$, depending on the parity of $s$). But then, we cannot have both $b$ appearing in a previous $S_j$, $j < i$, and $b$ or $a$ (depending on the parity of $s$) appearing in a subsequent $S_j$, $j > i$, because then $S$ would contain a forbidden alternation of length $s + 2$. Hence, each block $S_i$ (including the last one $S_m$) contains either the first occurrence or the last occurrence of at least one symbol. Thus, $m \le 2n$.

Let $n_i = \|S_i\|$. Then,

\[ \lambda_s(n) = |S| = \sum_{i=1}^{m} |S_i| \le \sum_{i=1}^{m} \lambda_{s-2}(n_i) \le \sum_{i=1}^{m} n_i \varphi_{s-2}(n_i) \le \varphi_{s-2}(n) \sum_{i=1}^{m} n_i. \]

Let us now bound $\sum n_i$. Construct a subsequence $S'$ of $S$ by taking, for each block $S_i$, just the first occurrence of each symbol in $S_i$. Note that $S'$ has length $|S'| = \sum n_i$ and, being a subsequence of $S$, it contains no alternation of length $s + 2$. Furthermore, $S'$ is decomposable into $m$ blocks of distinct symbols $S'_1, \ldots, S'_m$. However, $S'$ might contain adjacent equal symbols at the interface between blocks, but by removing at most one symbol from each block $S'_i$, we can obtain a sequence $S''$ with no adjacent equal symbols. Therefore, $|S''| \le \psi_s(m, n)$, and so $|S'| \le \psi_s(m, n) + m$. Since $m \le 2n$, we conclude that $\lambda_s(n) \le \varphi_{s-2}(n)\bigl(\psi_s(2n, n) + 2n\bigr)$.

In particular, for $s = 3$ we have $\lambda_3(n) \le \psi_3(2n, n) + 2n$ (by taking $\varphi_1(n) = 1$, since $\lambda_1(n) = n$). Actually for $s = 3$ we have $\lambda_3(n) = \psi_3(2n, n)$ (Hart and Sharir [23, 45]).

¹ This greedy left-to-right approach is in fact optimal: it yields a partition of $S$ into the minimum possible number of blocks of specified order $r < s$.
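The greedy partition used in the proof of Lemma 4.2 can be sketched as follows. This is an illustrative implementation under the assumption $s \ge 3$, with hypothetical function names; it recomputes alternations from scratch at every step, so it is suitable only for small inputs.

```python
from itertools import combinations


def max_alternation(seq):
    """Length of the longest alternation a b a b ... (a != b) in seq."""
    best = 1 if seq else 0
    for a, b in combinations(set(seq), 2):
        runs, prev = 0, None
        for x in seq:
            if x in (a, b) and x != prev:
                runs += 1
                prev = x
        best = max(best, runs)
    return best


def greedy_blocks(seq, s):
    """Scan seq left to right and cut it into maximal blocks of order s - 2,
    i.e., blocks that contain no alternation of length s (assumes s >= 3)."""
    blocks, current = [], []
    for x in seq:
        # Start a new block only if x would create an alternation of length s.
        if current and max_alternation(current + [x]) >= s:
            blocks.append(current)
            current = []
        current.append(x)
    if current:
        blocks.append(current)
    return blocks
```

On the order-3 sequence `abcbdbe` with s = 3 the blocks are `abc`, `bd`, `be`, so m = 3, well within the bound $m \le 2n = 10$ from the proof.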


The main issue, then, is to upper-bound $\psi_s(m, n)$. We present two different techniques for bounding $\psi_s(m, n)$. The first one is a minor modification of the technique of Agarwal et al. [2, 45]. The second one is our new technique. Both techniques yield the following bounds:

Lemma 4.3. For $s = 3$ we have

\[ \psi_3(m, n) = O\bigl(k m \alpha_k(m) + kn\bigr) \qquad \text{for all } k. \]

In general, for every fixed $s \ge 3$ we have

\[ \psi_s(m, n) \le C_{s,k}\bigl(m \alpha_k(m)^{s-2} + n\bigr) \qquad \text{for all } k, \]

for some constants $C_{s,k}$ of the form

\[ C_{s,k} = \begin{cases} 2^{(1/t!)k^t \pm O(k^{t-1})}, & s \text{ even}; \\ 2^{(1/t!)k^t \log_2 k \pm O(k^t)}, & s \text{ odd}; \end{cases} \]

where $t = \lfloor (s-2)/2 \rfloor$.

(Equivalent bounds for $\psi_3(m, n)$ and $\psi_4(m, n)$ were previously derived by Hart and Sharir [23, 45], and Agarwal, Sharir, and Shor [2, 45], respectively. For $s \ge 5$ these are improvements over [2, 45], which for $s \ge 6$ yield improved bounds for $\lambda_s(n)$.)

From Lemmas 4.2 and 4.3 it follows that $\lambda_s(n) = o(n\alpha_\ell(n))$ for every fixed $\ell$: Just take $k = \ell + 1$ in Lemma 4.3, bounding $\varphi_{s-2}(n)$ in Lemma 4.2 by induction. Here the magnitude of the constants $C_{s,k}$ is irrelevant. But we can also derive a tighter bound for $\lambda_s(n)$, namely Theorem 1.9, if we let $k$ grow very slowly with $m$; for this the dependence of $C_{s,k}$ on $k$ is significant:

Proof of Theorem 1.9. Take $k = \alpha(m)$ in Lemma 4.3 (recall that $\alpha_{\alpha(m)}(m) \le 3$ by definition), and substitute into Lemma 4.2. For $s = 3, 4$ we get $\lambda_3(n) = O(n\alpha(n))$, $\lambda_4(n) = O\bigl(n \cdot 2^{\alpha(n)}\bigr)$ (by taking $\varphi_1(n) = 1$, $\varphi_2(n) = 2$). For $s \ge 5$ we bound $\varphi_{s-2}(n)$ by induction on $s$ and we get the desired bounds (the factor $\varphi_{s-2}(n)$ only affects lower-order terms in the exponent).

4.2 Bounding $\psi_s(m, n)$: improving the known technique

In this section we prove Lemma 4.3 by making a small improvement on the technique of Agarwal et al. [2, 45]. The main ingredient in the proof is the following complicated-looking recurrence relation. This is a small modification of the recurrence in [2, 45] (and more complicated).


Recurrence 4.4. Let $m, n \ge 1$ and $b \le m$ be integers, and let $m = m_1 + m_2 + \cdots + m_b$ be a partition of $m$ into $b$ nonnegative integers. Then, there exists a partition of $n$ into $b + 1$ nonnegative integers

\[ n = n_1 + n_2 + \cdots + n_b + n^*, \tag{4.2} \]

and there exist nonnegative integers $n^*_1, n^*_2, \ldots, n^*_b \le n^*$ satisfying

\[ n^*_1 + n^*_2 + \cdots + n^*_b \le \psi_s(b, n^*) + b, \tag{4.3} \]

such that

\[ \psi_s(m, n) \le 2\psi_{s-1}(m, n^*) + 4m + \sum_{i=1}^{b} \bigl( \psi_{s-2}(m_i, n^*_i) + \psi_s(m_i, n_i) \bigr). \tag{4.4} \]

Here it is appropriate to repeat Matoušek’s advice [33, p. 179] to first study the proof below and then try to understand the statement of the recurrence.

Proof. Let $S$ be a maximum-length Davenport–Schinzel sequence of order $s$ that is partitionable into $m$ blocks $S_1, \ldots, S_m$, each of distinct symbols. Thus, $|S| = \psi_s(m, n)$. Group the blocks $S_1, \ldots, S_m$ into $b$ layers $L_1, L_2, \ldots, L_b$ from left to right, by letting each layer $L_i$ contain $m_i$ consecutive blocks.

We partition the alphabet of $S$ into two sets of symbols. The local symbols are those that appear in only one layer, and the global symbols are those that appear in two or more layers. Let $n_i$ be the number of symbols local to layer $L_i$, for $1 \le i \le b$, and let $n^*$ be the number of global symbols. Equation (4.2) follows.

For each layer $L_i$, let $n^*_i$ denote the number of global symbols that appear in $L_i$. Trivially $n^*_i \le n^*$ for all $i$. To see that (4.3) holds, build a subsequence $S'$ of $S$ by taking, for each layer $L_i$ and each global symbol $a$ in $L_i$, just the first occurrence of $a$ within $L_i$. The sequence $S'$, being a subsequence of $S$, does not contain any alternation of length $s + 2$. Furthermore, $S'$ consists of $b$ blocks of distinct symbols, corresponding to the $b$ layers of $S$. However, $S'$ might contain pairs of adjacent equal symbols at the interface between blocks. But there are at most $b - 1$ such pairs of symbols, and by deleting one symbol from each pair, we finally obtain a Davenport–Schinzel sequence. Bound (4.3) follows.

Each occurrence of a global symbol $a$ in a layer $L_i$ is classified into starting, middle, or ending, as follows: If $a$ does not appear in any previous layer $L_j$, $j < i$, we say that $a$ is a starting symbol for $L_i$. Similarly, if $a$ does not appear in any subsequent layer $L_j$, $j > i$, then $a$ is an ending symbol for $L_i$. If $a$ appears both before and after $L_i$, then $a$ is a middle symbol for $L_i$.

Decompose $S$ into four sequences $T_1, T_2, T_3, T_4$ (not necessarily contiguous), as follows: Let $T_1$ contain all occurrences of the local symbols of $S$. Let $T_2$ contain all occurrences of the starting global symbols in all the layers of $S$; similarly, let $T_3$ contain all occurrences of the middle global symbols, and let $T_4$ contain all occurrences of the ending global symbols in all the layers of $S$. Thus, $|T_1| + \cdots + |T_4| = \psi_s(m, n)$.

Each sequence $T_1, \ldots, T_4$ inherits from $S$ the partition into $b$ layers, in which the $i$-th layer is further partitioned into $m_i$ blocks. Each of the sequences $T_1, \ldots, T_4$ might contain pairs of adjacent equal symbols, but these can only occur at the interface between adjacent blocks. Hence, by removing at most $m - 1$ symbols from each sequence, we obtain sequences $T'_1, \ldots, T'_4$ with no adjacent equal symbols. Thus,

\[ \psi_s(m, n) \le |T'_1| + \cdots + |T'_4| + 4m. \]

We now bound each of $|T'_1|, \ldots, |T'_4|$ individually. Let us first consider $T'_1$. The $i$-th layer in $T'_1$ is a Davenport–Schinzel sequence of order $s$ on $n_i$ symbols, and it consists of $m_i$ blocks, each of distinct symbols. Thus

\[ |T'_1| \le \sum_{i=1}^{b} \psi_s(m_i, n_i). \]

Next consider $T'_2$. We claim that each layer in $T'_2$ is a Davenport–Schinzel sequence of order $s - 1$. Indeed, suppose for a contradiction that some layer in $T'_2$ contains an alternation $abab\ldots$ of length $s + 1$. Then, since $a$ and $b$ are starting symbols for this layer, they must both appear in $S$ in some subsequent layer, and so $S$ would contain an alternation of length $s + 2$, a contradiction. Furthermore, since each global symbol is a starting symbol for exactly one layer, the layers in $T'_2$ have pairwise disjoint sets of symbols, so all of $T'_2$ is a Davenport–Schinzel sequence of order $s - 1$. A similar argument applies for $T'_4$. Thus,

\[ |T'_2|, |T'_4| \le \psi_{s-1}(m, n^*). \]

Finally, consider $T'_3$. Each layer in $T'_3$ is composed of middle global symbols, which appear in $S$ in both previous and subsequent layers. Therefore, no layer in $T'_3$ can contain an alternation of length $s$, or else $S$ would contain an alternation of length $s + 2$. Thus, each layer in $T'_3$ is a Davenport–Schinzel sequence of order $s - 2$. (However, the whole $T'_3$ is not necessarily of order $s - 2$.) Since the $i$-th layer in $T'_3$ contains $n^*_i$ different symbols and is partitioned into $m_i$ blocks, each of distinct symbols, we have

\[ |T'_3| \le \sum_{i=1}^{b} \psi_{s-2}(m_i, n^*_i). \]

Bound (4.4) follows.

Remark: Our key improvement over the method of Agarwal, Sharir, and Shor lies in the bound for $|T'_3|$. They noted that each layer in $T'_3$ is a sequence of order $s - 2$, but they did not use the fact that the blocks in each layer have distinct symbols. In addition, they did not introduce the variables $n^*_i$.
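The symbol classification in the proof of Recurrence 4.4 (local symbols, and starting, middle, and ending global symbols per layer) can be illustrated with a short sketch. The function name and the input format (a list of layers, each given as a flat list of symbols) are assumptions made for this example only.

```python
def classify_symbols(layers):
    """For each layer, return the sets of local, starting, middle, and ending
    symbols, following the definitions in the proof of Recurrence 4.4."""
    first, last = {}, {}
    for i, layer in enumerate(layers):
        for a in layer:
            first.setdefault(a, i)      # index of the first layer containing a
            last[a] = i                 # index of the last layer containing a
    local = [set() for _ in layers]
    starting = [set() for _ in layers]
    middle = [set() for _ in layers]
    ending = [set() for _ in layers]
    for i, layer in enumerate(layers):
        for a in set(layer):
            if first[a] == last[a]:
                local[i].add(a)         # appears in only one layer
            elif i == first[a]:
                starting[i].add(a)      # global, no earlier layer contains it
            elif i == last[a]:
                ending[i].add(a)        # global, no later layer contains it
            else:
                middle[i].add(a)        # global, appears both before and after
    return local, starting, middle, ending
```

With this output, $n_i$ is `len(local[i])`, $n^*$ is the number of distinct symbols across all the `starting` sets, and $n^*_i$ is `len(starting[i] | middle[i] | ending[i])`, so the partition (4.2) can be checked directly on small examples.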

4.2.1 Applying the recurrence relation

We apply Recurrence 4.4 repeatedly to obtain successively better upper bounds on $\psi_s(m, n)$. We first obtain a polylogarithmic bound, and then we use induction to go all the way down the inverse Ackermann hierarchy.

For $s \ge 3$ let $m_0(s)$ be a large enough constant (depending only on $s$) such that

\[ m \ge 2 + 2\lceil \log_2 m \rceil^{s-2} \qquad \text{for all } m \ge m_0(s). \tag{4.5} \]

Define integers $P_{s,2}$, $Q_{s,2}$ for $s \ge 1$ by

\[ P_{1,2} = P_{2,2} = 0, \qquad Q_{1,2} = 1, \quad Q_{2,2} = 2, \tag{4.6} \]

and, for $s \ge 3$,

\[ P_{s,2} = 4P_{s-1,2} + 2P_{s-2,2} + 2Q_{s-2,2} + 8, \qquad Q_{s,2} = \max\bigl\{ m_0(s),\; 2Q_{s-1,2} + 2Q_{s-2,2} \bigr\}. \tag{4.7} \]

The reason for our choice of $m_0(s)$ will become apparent later on, in the proof of Lemma 4.6. (Also recall that we take $s$ to be a constant, so the growth of $P_{s,2}$, $Q_{s,2}$ in $s$ is irrelevant for us.) Our polylogarithmic bound is as follows:

Lemma 4.5. For all $m$, $n$, and $s$, we have

\[ \psi_s(m, n) \le P_{s,2}\, m (\log_2 m)^{s-2} + Q_{s,2}\, n. \]

Proof. We proceed by induction on $s$. If $s = 1$ then $\psi_1(m, n) \le n$, and if $s = 2$ then $\psi_2(m, n) \le 2n - 1$, and the claim holds. So let $s \ge 3$. For each $s$ we proceed by induction on $m$. If $m \le m_0(s)$ then $\psi_s(m, n) \le m_0(s) n \le Q_{s,2} n$, and we are done. So assume $m > m_0(s)$.

We apply Recurrence 4.4 with $b = 2$. Let $m_1 = \lfloor m/2 \rfloor$ and $m_2 = \lceil m/2 \rceil$, so $m_1 + m_2 = m$. Let us bound each term in the right-hand side of (4.4) separately. The term $2\psi_{s-1}(m, n^*)$ is bounded, by induction on $s$, by

\[ 2\psi_{s-1}(m, n^*) \le 2P_{s-1,2}\, m (\log_2 m)^{s-3} + 2Q_{s-1,2}\, n^*. \]

Next, we bound the term $\sum_{i=1}^{2} \psi_{s-2}(m_i, n^*_i)$. Using again induction on $s$, and applying $\log_2 m_i \le \log_2 m$, we get

\[ \sum_{i=1}^{2} \psi_{s-2}(m_i, n^*_i) \le P_{s-2,2}\, m (\log_2 m)^{s-4} + Q_{s-2,2} (n^*_1 + n^*_2). \]

Now, applying (4.3), we bound $n^*_1 + n^*_2$ loosely by

\[ n^*_1 + n^*_2 \le \psi_s(2, n^*) + 2 \le 2n^* + m. \]

Thus, being again very loose, we get

\[ \sum_{i=1}^{2} \psi_{s-2}(m_i, n^*_i) \le m (\log_2 m)^{s-3} (P_{s-2,2} + Q_{s-2,2}) + 2Q_{s-2,2}\, n^*. \]

Next we bound the term $\sum_{i=1}^{2} \psi_s(m_i, n_i)$, using induction on $m$. Applying $\log_2 m_i \le \log_2 m - \frac{1}{2}$, which is true for $m \ge 3$, and using the fact that $(x - \frac{1}{2})^{s-2} \le x^{s-2} - \frac{1}{2} x^{s-3}$ for all $x \ge \frac{1}{2}$, we get

\[ \sum_{i=1}^{2} \psi_s(m_i, n_i) \le \sum_{i=1}^{2} \bigl( P_{s,2}\, m_i (\log_2 m_i)^{s-2} + Q_{s,2}\, n_i \bigr) \le P_{s,2}\, m (\log_2 m)^{s-2} - \frac{1}{2} P_{s,2}\, m (\log_2 m)^{s-3} + Q_{s,2} (n - n^*). \]

Finally, we bound $4m$ (very loosely for $s \ge 4$) by $4m (\log_2 m)^{s-3}$. Putting everything together, we get

\[ \begin{aligned} \psi_s(m, n) \le{} & P_{s,2}\, m (\log_2 m)^{s-2} + Q_{s,2}\, n \\ & + m (\log_2 m)^{s-3} \Bigl( 2P_{s-1,2} + P_{s-2,2} + Q_{s-2,2} + 4 - \frac{1}{2} P_{s,2} \Bigr) \\ & + (2Q_{s-1,2} + 2Q_{s-2,2} - Q_{s,2})\, n^*. \end{aligned} \]

By the definition of $P_{s,2}$ and $Q_{s,2}$ in (4.7), the last two lines are non-positive, so

\[ \psi_s(m, n) \le P_{s,2}\, m (\log_2 m)^{s-2} + Q_{s,2}\, n. \]
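The constants of Lemma 4.5 can be tabulated directly from (4.6) and (4.7). The sketch below is illustrative only: the thesis requires merely that $m_0(s)$ be large enough for (4.5), whereas here, as an assumption, `m0` returns the least threshold for which (4.5) holds; the function names are hypothetical.

```python
from functools import lru_cache


def m0(s):
    """Least threshold such that m >= 2 + 2*ceil(log2 m)**(s-2) for all m >= m0(s).
    (Assumption: the thesis only asks for some sufficiently large constant.)"""
    p = s - 2
    last_fail = 1                          # m = 1 violates (4.5) since 1 < 2
    e = 1
    while True:
        lo, hi = 2 ** (e - 1) + 1, 2 ** e  # exactly the m with ceil(log2 m) == e
        bound = 1 + 2 * e ** p             # such an m violates (4.5) iff m <= bound
        if bound >= lo:
            last_fail = min(hi, bound)
        elif e >= 2 * s:
            # From here on 2**(e-1) outgrows 2*e**p, so no further violations occur.
            return last_fail + 1
        e += 1


@lru_cache(maxsize=None)
def PQ2(s):
    """Return (P_{s,2}, Q_{s,2}) as defined in (4.6) and (4.7)."""
    if s == 1:
        return (0, 1)
    if s == 2:
        return (0, 2)
    p1, q1 = PQ2(s - 1)
    p2, q2 = PQ2(s - 2)
    return (4 * p1 + 2 * p2 + 2 * q2 + 8, max(m0(s), 2 * q1 + 2 * q2))
```

With this choice of $m_0(s)$, the values for $s = 3$ are $P_{3,2} = Q_{3,2} = 10$, so Lemma 4.5 reads $\psi_3(m, n) \le 10\, m \log_2 m + 10\, n$ in that case.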


We are now ready to go all the way down the inverse Ackermann hierarchy. Define integers $P_{s,k}$, $Q_{s,k}$ for $k \ge 3$, $s \ge 1$ by

\[ P_{1,k} = P_{2,k} = 0, \qquad Q_{1,k} = 1, \quad Q_{2,k} = 2, \]

and, for $s \ge 3$,

\[ P_{s,k} = Q_{s-2,k} (1 + P_{s,k-1}) + 2d_s P_{s-1,k} + d'_s P_{s-2,k} + 4, \qquad Q_{s,k} = Q_{s-2,k}\, Q_{s,k-1} + 2Q_{s-1,k}, \tag{4.8} \]

for some constants $d_s$ and $d'_s$ to be specified later, with $P_{s,2}$, $Q_{s,2}$ as in (4.6), (4.7).

Lemma 4.6. For every $s$ there exists a constant $c_s$ such that

\[ \psi_s(m, n) \le P_{s,k}\, m (\alpha_k(m) + c_s)^{s-2} + Q_{s,k}\, n \tag{4.9} \]

for all integers $n$, $m$, $s$, and $k$.

The proof is similar to the proof of Lemma 4.5, though more complex, since we proceed by induction on $k$ for each $s$. Before delving into the actual details, we give a brief sketch of the proof. For the purposes of this sketch, denote the right-hand side of (4.9) by $\Gamma_{s,k}(m, n)$. Now refer to equation (4.4) in Recurrence 4.4. The proof proceeds as follows. We bound the term $\psi_{s-1}(m, n^*)$ by $\Gamma_{s-1,k}(m, n^*)$. We bound the terms $\psi_{s-2}(m_i, n^*_i)$ by $\Gamma_{s-2,k}(m_i, n^*_i)$; this produces the term $Q_{s-2,k} \sum n^*_i$, on which we apply (4.3). We bound the resulting term $\psi_s(b, n^*)$ by $\Gamma_{s,k-1}(b, n^*)$ (here is where we use induction on $k$). Finally, we bound the terms $\psi_s(m_i, n_i)$ by $\Gamma_{s,k}(m_i, n_i)$ by induction on $m$ (since $m_i < m$ for every $i$).

Proof of Lemma 4.6. By induction on $s$. As before, the claim is easily established for $s = 1, 2$, so assume $s \ge 3$ is fixed. For each $s$ we proceed by induction on $k$. If $k = 2$ then the claim reduces to Lemma 4.5, so assume $k \ge 3$. By our induction assumption on $s$, we have

\[ \begin{aligned} \psi_{s-1}(m, n) &\le P_{s-1,k}\, m (\alpha_k(m) + c_{s-1})^{s-3} + Q_{s-1,k}\, n, \\ \psi_{s-2}(m, n) &\le P_{s-2,k}\, m (\alpha_k(m) + c_{s-2})^{s-4} + Q_{s-2,k}\, n, \end{aligned} \tag{4.10} \]

for all $m$ and $n$.

Here it is convenient to work with a slight variant $\hat\alpha_k(x)$ of the inverse Ackermann hierarchy. Define $\hat\alpha_k(x)$ for $k \ge 2$, $x \ge 0$ by $\hat\alpha_2(x) = \alpha_2(x) = \lceil \log_2 x \rceil$, and for $k \ge 3$ by the recurrence

\[ \hat\alpha_k(x) = \begin{cases} 1, & \text{if } x \le m_0(s); \\ 1 + \hat\alpha_k\bigl(1 + 2\hat\alpha_{k-1}(x)^{s-2}\bigr), & \text{otherwise}; \end{cases} \tag{4.11} \]

with $m_0(s)$ as given in (4.5). (Compare (4.11) to the definition (1.5) of $\alpha_k(x)$; our choice of $m_0(s)$ guarantees that $\hat\alpha_k(x)$ is well-defined for all $k$ and $x$.) The functions $\hat\alpha_k(x)$ are almost equivalent to the usual inverse Ackermann functions $\alpha_k(x)$. In fact, there exists a constant $c_s$, depending only on $s$, such that $|\hat\alpha_k(x) - \alpha_k(x)| \le c_s$ for all $k$ and $x$ (see Appendix A).

We will show that

\[ \psi_s(m, n) \le P_{s,k}\, m \hat\alpha_k(m)^{s-2} + Q_{s,k}\, n \tag{4.12} \]

for all $n$, $m$, and $k$. We will do this by induction on $k$, and for each $k$ by induction on $m$. Then our claim will follow. If $m \le m_0(s)$, then $\psi_s(m, n) \le m_0(s) n \le Q_{s,2} n \le Q_{s,k} n$, and we are done. So assume $m > m_0(s)$.

We want to translate the bounds (4.10) into bounds involving $\hat\alpha_k$. Since $\alpha_k(m) \le \hat\alpha_k(m) + c_s$ and $\hat\alpha_k(m) \ge 1$, it follows (being somewhat slack) that there exist multiplicative constants $d_s$, $d'_s$ such that

\[ \psi_{s-1}(m, n) \le d_s P_{s-1,k}\, m \hat\alpha_k(m)^{s-3} + Q_{s-1,k}\, n, \tag{4.13} \]

\[ \psi_{s-2}(m, n) \le d'_s P_{s-2,k}\, m \hat\alpha_k(m)^{s-4} + Q_{s-2,k}\, n, \tag{4.14} \]

for all $n$ and $m$.²

Assume by induction on $k$ that (4.12) holds for $k - 1$. Choose

\[ b = \Bigl\lfloor \frac{m}{\hat\alpha_{k-1}(m)^{s-2}} \Bigr\rfloor. \tag{4.15} \]

Let $m_i = \lfloor m/b \rfloor$ or $\lceil m/b \rceil$ for each $i$, such that $\sum m_i = m$. We claim that

\[ m_i \le 1 + 2\hat\alpha_{k-1}(m)^{s-2} \qquad \text{for all } 1 \le i \le b. \tag{4.16} \]

² As in Chapter 3, $\hat\alpha_k(x)$ also depends on $s$, so strictly speaking we should write $\hat\alpha_{s,k}(x)$. The constants $d_s$, $d'_s$ here “pay” for the conversion from $\hat\alpha_{s-1,k}(m)$, $\hat\alpha_{s-2,k}(m)$ to $\hat\alpha_{s,k}(m)$, respectively.

Indeed, by our choice of $m_0(s)$ as given in (4.5), we have $\hat\alpha_{k-1}(m)^{s-2} \le \lceil \log_2 m \rceil^{s-2} \le m/2$ for all $m \ge m_0(s)$. Thus,

\[ m_i \le 1 + \frac{m}{b} \le 1 + \frac{m}{m/\hat\alpha_{k-1}(m)^{s-2} - 1} = 1 + \frac{m \hat\alpha_{k-1}(m)^{s-2}}{m - \hat\alpha_{k-1}(m)^{s-2}} \le 1 + \frac{m \hat\alpha_{k-1}(m)^{s-2}}{m - m/2} = 1 + 2\hat\alpha_{k-1}(m)^{s-2}. \]

Let us bound each term in the right-hand side of (4.4). We first bound the term $2\psi_{s-1}(m, n^*)$ using (4.13), and we obtain

\[ 2\psi_{s-1}(m, n^*) \le 2d_s P_{s-1,k}\, m \hat\alpha_k(m)^{s-3} + 2Q_{s-1,k}\, n^*. \]

Next we bound $\sum_{i=1}^{b} \psi_{s-2}(m_i, n^*_i)$ using (4.14). Observing that $\hat\alpha_k(m_i) \le \hat\alpha_k(m)$,

\[ \sum_{i=1}^{b} \psi_{s-2}(m_i, n^*_i) \le \sum_{i=1}^{b} \bigl( d'_s P_{s-2,k}\, m_i \hat\alpha_k(m_i)^{s-4} + Q_{s-2,k}\, n^*_i \bigr) \le d'_s P_{s-2,k}\, m \hat\alpha_k(m)^{s-4} + Q_{s-2,k} \sum_{i=1}^{b} n^*_i. \tag{4.17} \]

Now we apply (4.3), and we bound $\psi_s(b, n^*)$ by (4.12) with $k - 1$ in place of $k$.

\[ \sum_{i=1}^{b} n^*_i \le \psi_s(b, n^*) + b \le P_{s,k-1}\, b \hat\alpha_{k-1}(b)^{s-2} + Q_{s,k-1}\, n^* + b. \]

By our choice of $b$ in (4.15), we have $\hat\alpha_{k-1}(b)^{s-2} \le \hat\alpha_{k-1}(m)^{s-2} \le m/b$, so, being somewhat slack,

\[ \sum_{i=1}^{b} n^*_i \le P_{s,k-1}\, m + Q_{s,k-1}\, n^* + m \le m \hat\alpha_k(m)^{s-3} (1 + P_{s,k-1}) + Q_{s,k-1}\, n^*. \]

Substituting this into (4.17), and being slack again, we get

\[ \sum_{i=1}^{b} \psi_{s-2}(m_i, n^*_i) \le m \hat\alpha_k(m)^{s-3} \bigl( d'_s P_{s-2,k} + Q_{s-2,k} (1 + P_{s,k-1}) \bigr) + Q_{s-2,k}\, Q_{s,k-1}\, n^*. \]

Next we bound $\sum_{i=1}^{b} \psi_s(m_i, n_i)$, applying (4.12) by induction on $m$ (since $m_i < m$):

\[ \sum_{i=1}^{b} \psi_s(m_i, n_i) \le \sum_{i=1}^{b} \bigl( P_{s,k}\, m_i \hat\alpha_k(m_i)^{s-2} + Q_{s,k}\, n_i \bigr). \]

But by (4.16) and (4.11),

\[ \hat\alpha_k(m_i) \le \hat\alpha_k\bigl(1 + 2\hat\alpha_{k-1}(m)^{s-2}\bigr) = \hat\alpha_k(m) - 1. \]

Further, we have $(x - 1)^{s-2} \le x^{s-2} - x^{s-3}$ for all $x \ge 1$. Therefore,

\[ \sum_{i=1}^{b} \psi_s(m_i, n_i) \le P_{s,k}\, m \bigl( \hat\alpha_k(m)^{s-2} - \hat\alpha_k(m)^{s-3} \bigr) + Q_{s,k} (n - n^*). \]

Finally, we bound $4m$ very loosely by $4m \hat\alpha_k(m)^{s-3}$. Putting everything together, we get

\[ \begin{aligned} \psi_s(m, n) \le{} & P_{s,k}\, m \hat\alpha_k(m)^{s-2} + Q_{s,k}\, n \\ & + m \hat\alpha_k(m)^{s-3} \bigl( 2d_s P_{s-1,k} + d'_s P_{s-2,k} \\ & \qquad\qquad + Q_{s-2,k} (1 + P_{s,k-1}) + 4 - P_{s,k} \bigr) \\ & + (2Q_{s-1,k} + Q_{s-2,k}\, Q_{s,k-1} - Q_{s,k})\, n^*. \end{aligned} \]

By the definition of $P_{s,k}$ and $Q_{s,k}$ in (4.8), the expressions in the last three lines equal zero, and we get

\[ \psi_s(m, n) \le P_{s,k}\, m \hat\alpha_k(m)^{s-2} + Q_{s,k}\, n. \]

All that remains is to analyze the asymptotic growth of $P_{s,k}$, $Q_{s,k}$ in $k$ for fixed $s$. We have $P_{3,k}, Q_{3,k} = \Theta(k)$, $P_{4,k}, Q_{4,k} = \Theta(2^k)$, and, in general, letting $t = \lfloor (s-2)/2 \rfloor$,

\[ P_{s,k}, Q_{s,k} = \begin{cases} 2^{(1/t!)k^t \pm O(k^{t-1})}, & s \ge 4 \text{ even}; \\ 2^{(1/t!)k^t \log_2 k \pm O(k^t)}, & s \ge 3 \text{ odd}. \end{cases} \tag{4.18} \]

(See Appendix B for the proof; note that this is the same asymptotic form as that of $P_j(m)$, $Q_j(m)$ in Chapter 3.) Thus, Lemma 4.6 is equivalent to Lemma 4.3.
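The growth rates quoted for small $s$ can be observed numerically from the recurrence for $Q_{s,k}$ in (4.8), which does not involve the unspecified constants $d_s$, $d'_s$. In the sketch below the base values $Q_{3,2} = 10$ and $Q_{4,2} = 130$ are assumptions: they are what (4.7) gives if $m_0(s)$ is taken to be the least threshold satisfying (4.5). Other admissible choices of $m_0(s)$ change the constants but not the growth rates.

```python
from functools import lru_cache

# Assumed base values Q_{s,2}; they depend on the choice of m_0(s) in (4.5).
Q_BASE = {3: 10, 4: 130}


@lru_cache(maxsize=None)
def Q(s, k):
    """Q_{s,k} from (4.6)-(4.8): Q_{s,k} = Q_{s-2,k} * Q_{s,k-1} + 2 * Q_{s-1,k}."""
    if s == 1:
        return 1
    if s == 2:
        return 2
    if k == 2:
        return Q_BASE[s]
    return Q(s - 2, k) * Q(s, k - 1) + 2 * Q(s - 1, k)
```

For $s = 3$ the recurrence collapses to $Q_{3,k} = Q_{3,k-1} + 4$, which is linear in $k$, and for $s = 4$ it becomes $Q_{4,k} = 2Q_{4,k-1} + 2Q_{3,k}$, which roughly doubles at every step, in line with the $\Theta(k)$ and $\Theta(2^k)$ forms stated above.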


Remark: The investment we made in using a more complicated recurrence (Recurrence 4.4 instead of the one used by Agarwal et al. [2, 45]) paid off in Lemma 4.6. Besides being tighter, the lemma also has a simpler form. The corresponding bound in [2, 45] is of the form ψs (m, n) ≤ Fs,k (n) · mαk (m) + Gs,k (n) · n, where Fs,k (n) and Gs,k (n) are functions of α(n). Our constants Ps,k , Qs,k , in contrast, do not depend on n.

4.3 A new technique for bounding $\psi_s(m, n)$

We now present an alternative technique for bounding $\psi_s(m, n)$. Our new technique is based on a variant of Davenport–Schinzel sequences, in which we turn the problem around, in a sense. We call our variant sequences almost-DS sequences. As we will see, almost-DS sequences have bounds almost equivalent to those of $Z_k^{(j)}(n)$ for stabbing interval chains (Chapter 3); therefore, in our opinion, almost-DS sequences are interesting objects in their own right, independent of their use for bounding $\lambda_s(n)$.

An almost-DS sequence of order $s$ with multiplicity $k$ and $m$ blocks (or an $\mathrm{ADS}^s_k(m)$-sequence, for short) is a sequence that satisfies the following properties:

• It is a concatenation of $m$ blocks, each block containing only distinct symbols.
• Each symbol appears at least $k$ times (in different blocks, so we must have $m \ge k$ for there to be any symbols at all).
• The sequence contains no alternation $abab\cdots$ of length $s + 2$.

Note that we do allow repetitions at the interface between adjacent blocks (this simplifies matters). This is why these are almost Davenport–Schinzel sequences.

We now pose a different problem: we ask to maximize the number of distinct symbols. Let $\Pi^s_k(m)$ denote the maximum number of distinct symbols in an $\mathrm{ADS}^s_k(m)$-sequence. (Note that $\Pi^s_k(m) = 0$ for $m < k$.) The connection between $\psi_s(m, n)$ and $\Pi^s_k(m)$ is based on the following lemma:

Lemma 4.7. For all $s$, $n$, $m$, and $k$ we have

\[ \psi_s(m, n) \le k\bigl(\Pi^s_k(m) + n\bigr). \]

Proof. Let $S$ be a maximum-length Davenport–Schinzel sequence of order $s$ on $n$ distinct symbols that is partitionable into $m$ blocks, each of distinct symbols. Thus, $|S| = \psi_s(m, n)$. Let $k \ge 1$ be a parameter. We transform $S$ into another sequence $S'$ in which every symbol appears exactly $k$ times as follows:³ For each symbol $a$, group the occurrences of $a$ in $S$ from left to right into “clusters” of size $k$, deleting the last remaining $\le k - 1$ occurrences of $a$. Make the occurrences of $a$ in different clusters different, by replacing each $a$ in the $i$-th cluster by a new symbol $a_i$.

We deleted at most $kn$ symbols from $S$, so $|S'| \ge |S| - kn$. On the other hand, $S'$ is clearly an $\mathrm{ADS}^s_k(m)$-sequence (the symbol deletions might have created repetitions at the interface between blocks, but these are permitted in almost-DS sequences; on the other hand, the symbol replacements do not introduce any new forbidden alternations). Thus, $S'$ contains at most $\Pi^s_k(m)$ distinct symbols. Since each symbol appears exactly $k$ times, we have $|S'| \le k \cdot \Pi^s_k(m)$. The claim follows.

Thus, the problem of bounding $\psi_s(m, n)$ reduces to bounding $\Pi^s_k(m)$.
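The cluster transformation in the proof of Lemma 4.7 is simple to state in code. The following sketch assumes the sequence is given as a list of blocks and represents the new symbol $a_i$ as the pair `(a, i)`; both the function name and this representation are choices made for illustration only.

```python
from collections import defaultdict


def cluster_relabel(blocks, k):
    """Group the occurrences of each symbol, left to right, into clusters of
    size k; drop the leftover (at most k - 1) occurrences; rename each
    occurrence of a in its i-th cluster to the new symbol (a, i)."""
    total = defaultdict(int)
    for block in blocks:
        for a in block:
            total[a] += 1
    seen = defaultdict(int)
    result = []
    for block in blocks:
        new_block = []
        for a in block:
            cluster = seen[a] // k
            seen[a] += 1
            if (cluster + 1) * k <= total[a]:   # keep only complete clusters
                new_block.append((a, cluster))
        result.append(new_block)
    return result
```

In the output every surviving symbol appears exactly k times, the block structure is preserved, and at most k - 1 occurrences per original symbol are lost, which is the counting used in the proof.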

4.3.1 Bounding ADS sequences

We first derive some basic results: For every s ≥ 1, if we take k = s then Πsk (m) = ∞, but if we take k = s + 1 then Πsk (m) is already finite. Lemma 4.8. For all s ≥ 1, m ≥ s we have Πss (m) = ∞. Proof. Take the sequence abc . . . . . . cba abc . . . . . . with s blocks, with arbitrarily many symbols in each block. Each symbol appears s times, and the maximum alternation is of length s + 1. Lemma 4.9. We have Π12 (m) = m − 1. Proof. Let S be an ADS12 (m)-sequence. Since S cannot contain an alternation aba, each symbol must have all its occurrences contiguous. Given that S contains m blocks, the sequence that maximizes the number of distinct symbols is 1 12 23 . . . (m − 2)(m − 1) (m − 1), with m − 1 distinct symbols. Lemma 4.10. For all s ≥ 2 we have Πss+1 (m) ≤ 3

m−2 s−1

= O(ms−1 ).

A similar argument has been used by Sundar [48, Lemma 9] for a different problem.

84

CHAPTER 4. DAVENPORT–SCHINZEL SEQUENCES

Proof. Suppose for a contradiction that there exists an ADSss+1 (m)-sequence m−2 S with n = 1 + s−1 distinct symbols. Thus, each symbol appears in at least s + 1 out of m different blocks. For each symbol a, consider the s − 1 “internal” occurrences of a, meaning, all occurrences except the first and the last. These internal occurrences can fall in any of the m − 2 “internal” blocks of S (excluding the first and last blocks). By our choice of n, there must be two symbols a, b whose internal occurrences fall in the same s − 1 out of m − 2 internal blocks. These internal occurrences create an alternation of length at least s (in the best case, they form the subsequence ab ba ab . . .). Since both a and b also appear before and after this subsequence, S contains an alternation of length s + 2, a contradiction. We now bound Πsk (m) by deriving recurrences and solving them, in a manner almost entirely analogous to Chapter 3. We begin with the following recurrence and corollary, which are analogous to Lemma 3.2: Recurrence 4.11. For every s ≥ 3 and every k and m we have Πs2k−1 (2m) ≤ 2Πs2k−1 (m) + 2Πs−1 k (m). Proof. Given an ADSs2k−1 (2m)-sequence S, partition the 2m blocks of S into a “left half” and a “right half” of m blocks each. The symbols of S fall into four categories: • Symbols that appear only in the left half. Taking just these symbols produces an ADSs2k−1 (m)-sequence, so there are at most Πs2k−1 (m) such symbols. • Symbols that appear only in the right half. There are also at most Πs2k−1 (m) such symbols. • Symbols that appear in both halves, but appear at least k times in the left half. Taking just these symbols, and just their left-half occurrences, 0 produces an ADSs−1 k (m)-sequence S . (An alternation abab . . . of length s + 1 in S 0 would be extended to length s + 2 by an a or b that appears in the right half.) Thus, there are at most Πs−1 k (m) of these symbols. • Symbols that appear in both halves, but appear at least k times in the right half. 
There are also at most Πs−1 k (m) such symbols. Corollary 4.12. For every fixed s ≥ 2, if we let k = 2s−1 + 1, then Πsk (m) = O m(log m)s−2 (where the constant implicit in the O notation might depend on s).


Proof. Apply Recurrence 4.11 using induction on $s$, using Lemma 4.10 as base case for $s = 2$.

The following recurrence and corollary for $\Pi^3_k(m)$ are analogous to Recurrence 3.3 and Lemma 3.5:

Recurrence 4.13. Let $t$ be an integer parameter, with $t \le \sqrt{m}$. Then,
\[ \Pi^3_k(m) \le \left(1 + \frac{m}{t}\right)\Pi^3_k(t) + \Pi^3_{k-2}\!\left(1 + \frac{m}{t}\right) + 3m. \]

Proof. Take a sequence $S$ that maximizes $\Pi^3_k(m)$. Let $b = \lceil m/t \rceil \le 1 + m/t$. Partition the $m$ blocks of $S$ from left to right into $b$ layers $L_1, \ldots, L_b$ of at most $t$ blocks each. We classify the symbols of $S$ into different types.

A symbol is local for layer $L_i$ if it only appears in $L_i$. Taking just the symbols local to $L_i$ produces an $\mathrm{ADS}^3_k(t)$-sequence. Therefore, the number of local symbols is at most $\Pi^3_k(t)$ per layer, or at most $b\,\Pi^3_k(t) \le (1 + m/t)\,\Pi^3_k(t)$ altogether.

Symbols which appear in at least two layers are called global symbols. Call a global symbol left-concentrated for layer $L_i$ if it makes its first appearance in $L_i$, and it appears at least three times in $L_i$. Given a layer $L_i$, take just the left-concentrated symbols for $L_i$, and just their occurrences within $L_i$. The resulting sequence $S'_i$ cannot contain an alternation $abab$, or else $S$ would contain $ababa$. Therefore, $S'_i$ is an $\mathrm{ADS}^2_3(t)$-sequence, so by Lemma 4.10 it has at most $t - 2$ different symbols. Thus, there are at most $b(t - 2) \le (1 + m/t)(t - 2) \le m$ left-concentrated symbols altogether (since $t \le \sqrt{m}$). Similarly, there are at most $m$ right-concentrated symbols.

Next, call a global symbol middle-concentrated for layer $L_i$ if it appears at least twice in $L_i$, and it also appears before $L_i$ and after $L_i$. Given $L_i$, take just the middle-concentrated symbols for $L_i$, and just their occurrences within $L_i$. The resulting sequence $S''_i$ cannot contain an alternation $aba$, so $S''_i$ is an $\mathrm{ADS}^1_2(t)$-sequence, and so by Lemma 4.9 it contains at most $t - 1$ different symbols. Therefore, there are at most $b(t - 1) \le m$ middle-concentrated symbols. (Note that we might have counted the same middle-concentrated symbol more than once.)

Finally, take all the global symbols we have not accounted for so far: the scattered symbols. Each of these symbols must appear in at least $k - 2$ different layers. Build a subsequence of $S$ by taking just the scattered symbols, and for each scattered symbol, just one occurrence per layer. Each layer becomes a block, and no new forbidden alternation can arise. Hence, we get an $\mathrm{ADS}^3_{k-2}(b)$-sequence, which can have at most $\Pi^3_{k-2}(1 + m/t)$ different symbols.


Corollary 4.14. There exists an absolute constant $c$ such that, for every $k \ge 2$, we have $\Pi^3_{2k+1}(m) \le c\,m\,\alpha_k(m)$ for all $m$.

The proof is analogous to the proof of Lemma 3.5, and is omitted. (The additive terms of 1 next to $m/t$ in Recurrence 4.13 are immaterial, and do not present any difficulties. In any case, a complete derivation of Corollary 4.14 from Recurrence 4.13 can be found in [37].) The bound for $\psi_3(m, n)$ in Lemma 4.3 now follows from Corollary 4.14 and Lemma 4.7.

4.3.2

Klazar’s improved upper bound for λ3 (n)

Klazar’s tighter upper bound (1.8) for $\lambda_3(n)$ follows by using the following relation between $\lambda_3(n)$ and $\psi_3(m, n)$, instead of Lemma 4.2:

Lemma 4.15 (Klazar [27]). We have $\lambda_3(n) \le \psi_3(1 + 2n/\ell, n) + 3n\ell$, where $\ell \le n$ is a free parameter.

(Klazar actually proved this relation under a stricter definition of $\psi_3(m, n)$.) For completeness, we prove Lemma 4.15 in Appendix C.

Corollary 4.16. $\lambda_3(n) \le 2n\alpha(n) + O\bigl(n\sqrt{\alpha(n)}\bigr)$.

Proof. Taking $s = 3$ and $k = 2\alpha(m) + 1$ in Lemma 4.7, and bounding $\Pi^3_{2\alpha(m)+1}(m)$ by Corollary 4.14, we get
\[ \psi_3(m, n) \le \bigl(2\alpha(m) + 1\bigr)\bigl(c\,m\,\alpha_{\alpha(m)}(m) + n\bigr) = 2n\alpha(m) + n + O\bigl(m\alpha(m)\bigr). \]
We now apply Lemma 4.15 with $\ell = \sqrt{\alpha(n)}$.

4.3.3

Bounding $\Pi^s_k(m)$ for general $s$

The following recurrence and corollary for $\Pi^s_k(m)$ are analogous to Recurrence 3.6 and Lemma 3.8:

Recurrence 4.17. Let $s \ge 3$ be fixed. Let $k_1$, $k_2$, $k_3$ be integers, and put $k = k_2 k_3 + 2k_1 - 3k_2 - k_3 + 2$. Then,
\[ \Pi^s_k(m) \le \left(1 + \frac{m}{t}\right)\Bigl(\Pi^s_k(t) + 2\Pi^{s-1}_{k_1}(t) + \Pi^{s-2}_{k_2}(t)\Bigr) + \Pi^s_{k_3}\!\left(1 + \frac{m}{t}\right), \]
where $t$ is a free parameter.


Proof. Take a sequence $S$ that maximizes $\Pi^s_k(m)$. Again partition the $m$ blocks of $S$ into $b = \lceil m/t \rceil \le 1 + m/t$ layers $L_1, \ldots, L_b$, with at most $t$ blocks per layer. We again classify the symbols of $S$ into local (if the symbol appears in only one layer), or global. As before, there are at most $(1 + m/t)\,\Pi^s_k(t)$ local symbols. And we again classify the global symbols into left-concentrated, right-concentrated, middle-concentrated, and scattered. This time we do this as follows:

A global symbol is left-concentrated for layer $L_i$ if its first $k_1$ occurrences fall in $L_i$. The overall number of left-concentrated symbols is at most $(1 + m/t)\,\Pi^{s-1}_{k_1}(t)$. Right-concentrated symbols are defined and handled analogously.

A global symbol is middle-concentrated for layer $L_i$ if it appears at least $k_2$ times in $L_i$, and it also appears before $L_i$ and after $L_i$. There are at most $(1 + m/t)\,\Pi^{s-2}_{k_2}(t)$ middle-concentrated symbols altogether.

Finally, a global symbol is scattered if it appears in at least $k_3$ different layers. Taking just these symbols, and for each symbol, just one occurrence per layer, we obtain an $\mathrm{ADS}^s_{k_3}(b)$-sequence. Thus, there are at most $\Pi^s_{k_3}(b) \le \Pi^s_{k_3}(1 + m/t)$ scattered symbols.

All that remains is to show that we did not miss any global symbol. Suppose a global symbol is neither left-, middle-, nor right-concentrated, nor scattered. Then the symbol appears at most $k_1 - 1$ times in each of its first and last layers, and at most $k_2 - 1$ times in each of at most $k_3 - 3$ intermediate layers. Hence it appears at most $2(k_1 - 1) + (k_3 - 3)(k_2 - 1) = k - 1$ times in $S$, a contradiction.

The only significant difference between Recurrence 4.17 above and Recurrence 3.6 lies in the formula for $k$ in terms of $k_1$, $k_2$, and $k_3$. (As before, the extra terms 1 added to $m/t$ do not present any difficulty.) But in both cases we get the same asymptotic behavior:

Corollary 4.18. Define $R_s(d)$ for $s \ge 1$, $d \ge 2$ by $R_1(d) = 2$, $R_2(d) = 3$, and for $s \ge 3$ by $R_s(2) = 2^{s-1} + 1$ and
\[ R_s(d) = R_s(d-1)R_{s-2}(d) + 2R_{s-1}(d) - 3R_{s-2}(d) - R_s(d-1) + 2, \quad \text{for } d \ge 3. \]
Then, for every $s \ge 2$ and $d \ge 2$, if $k \ge R_s(d)$ then
\[ \Pi^s_k(m) \le c\,m\,\alpha_d(m)^{s-2} \quad \text{for all } m. \]
Here $c = c(s)$ is a constant that depends only on $s$.

(Again, a complete proof can be found in [37].)

Let us now study the asymptotic growth of $R_s(d)$ for fixed $s$. We have
\[ R_3(d) = 2d + 1, \qquad R_4(d) = 5 \cdot 2^d - 4d - 3. \]
In general, letting $t = \lfloor (s-2)/2 \rfloor$, we have
\[ R_s(d) = \begin{cases} 2^{(1/t!)\,d^t \pm O(d^{t-1})}, & s \text{ even}; \\ 2^{(1/t!)\,d^t \log_2 d \pm O(d^t)}, & s \text{ odd} \end{cases} \tag{4.19} \]
(see Appendix B again). Lemma 4.3 now follows from Lemma 4.7 by applying Corollary 4.18 with $k = R_s(d)$.
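As a sanity check on the closed forms for $R_3(d)$ and $R_4(d)$, the recurrence of Corollary 4.18 can be evaluated directly. The following is only an illustrative sketch; the function name `R` is our own choice for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def R(s: int, d: int) -> int:
    """R_s(d) as defined in Corollary 4.18, for s >= 1 and d >= 2."""
    if s == 1:
        return 2
    if s == 2:
        return 3
    if d == 2:
        return 2 ** (s - 1) + 1
    # Same shape as k = k2*k3 + 2*k1 - 3*k2 - k3 + 2 in Recurrence 4.17,
    # with k1 = R_{s-1}(d), k2 = R_{s-2}(d), k3 = R_s(d-1).
    k1, k2, k3 = R(s - 1, d), R(s - 2, d), R(s, d - 1)
    return k2 * k3 + 2 * k1 - 3 * k2 - k3 + 2


if __name__ == "__main__":
    for d in range(2, 8):
        print(d, R(3, d), R(4, d), R(5, d))
```

For small $d$ the values of `R(3, d)` and `R(4, d)` agree with $2d + 1$ and $5 \cdot 2^d - 4d - 3$, respectively.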

4.4

Formation-free sequences

We now deal with the generalizations of Davenport–Schinzel sequences described in the Introduction. Recall that $\mathrm{Ex}_u(n)$ denotes the maximum length of an $r$-sparse, $u$-free sequence on $n$ distinct symbols, where $r = \|u\|$. Recall also that we bound $\mathrm{Ex}_u(n)$ by considering what we call formation-free sequences: An $(r, s)$-formation-free sequence is a sequence that is $r$-sparse and does not contain any $(r, s)$-formation as a subsequence, where an $(r, s)$-formation is a sequence of $s$ permutations on $r$ symbols. The maximum length of an $(r, s)$-formation-free sequence on $n$ distinct symbols is denoted $F_{r,s}(n)$. Finally, recall Lemma 1.14, which states that $\mathrm{Ex}_u(n) \le F_{r,s-r+1}(n)$, where $r = \|u\|$ and $s = |u|$.

Proof of Lemma 1.14. Suppose $u = u_1 u_2 \ldots u_s$, where $1 \le u_i \le r$ for each $i$. We can assume that the symbols in $u$ make their first appearances in the order $1, 2, \ldots, r$. Let $s' = s - r + 1$, and let $\ell = \ell_1 \ell_2 \cdots \ell_{s'}$ be an arbitrary $(r, s')$-formation, where each $\ell_j$ is a permutation of $\{1, \ldots, r\}$. We want to show that $u \subset \ell$.

Define a partition $u = B_1 B_2 \ldots B_{s'}$ of $u$ into $s'$ blocks as follows: First let each symbol of $u$ constitute its own block of length 1. Then, for each $2 \le j \le r$, merge the block that contains the first occurrence of $j$ in $u$ with the block containing the immediately preceding symbol. The number of blocks goes down from $s$ to $s'$. Here is an example of a sequence thus partitioned:
\[ u = [1][1][12][134][2][4][1][25][5]. \tag{4.20} \]


Clearly, each block $B_j$ is an increasing sequence. Now we are going to define a permutation $\sigma$ on $\{1, \ldots, r\}$ such that, for each block $B_j$ with $1 \le j \le s'$, its image $\sigma(B_j)$ is a subsequence of $\ell_j$. We do this by examining the blocks from right to left, and by defining $\sigma$ in the order $\sigma(r), \sigma(r-1), \ldots, \sigma(1)$. Note that blocks of length 1 can be safely ignored.

Suppose we have already dealt with blocks $B_{s'}, B_{s'-1}, \ldots, B_{j+1}$, and that now is the turn of block $B_j$, where $|B_j| > 1$. Let $k$ be the last symbol in $B_j$. The symbols preceding $k$ in $B_j$ are $k - 1, k - 2, \ldots$, up to the second symbol of $B_j$. All these symbols make their first appearance in $u$ in $B_j$. Call these the “new” symbols of $B_j$. Suppose we have already assigned values to $\sigma(k+1), \ldots, \sigma(r)$ in such a way that, no matter how we assign $\sigma(1), \ldots, \sigma(k)$, the images $\sigma(B_{j+1}), \ldots, \sigma(B_{s'})$ will always be subsequences of $\ell_{j+1}, \ldots, \ell_{s'}$, respectively.

Now consider the symbols of $\ell_j$. Call a symbol of $\ell_j$ “free” if it has not yet been assigned as image $\sigma(i)$ to any symbol $i$, for $k + 1 \le i \le r$. We scan $\ell_j$ from right to left, considering only its free symbols, and we assign in a greedy fashion these free symbols as images $\sigma(k), \sigma(k-1), \ldots$ to $k, k-1, \ldots$ (the “new” symbols of $B_j$). After we are done with these assignments, the only symbol of $B_j$ which has not been assigned an image is the first symbol of $B_j$; call it $b_j$. But no matter how we define $\sigma(b_j)$ later on, we will always have that $\sigma(B_j)$ is a subsequence of $\ell_j$ (because of our greedy approach). At the end, the assignment $\sigma(1)$ of 1 will be forced.

For example, with $u$ as in (4.20), suppose that $\ell = \ell_1\,\ell_2\,32514\;35421\,\ell_5\,\ell_6\,\ell_7\,35142\,\ell_9$ (where $\ell_1, \ell_2, \ell_5, \ell_6, \ell_7, \ell_9$ do not matter). Then, our algorithm will assign $\sigma(5) = 2$, $\sigma(4) = 1$, $\sigma(3) = 4$, $\sigma(2) = 5$, and finally $\sigma(1) = 3$. Then the sequence $\sigma(u) = [3][3][35][341][5][1][3][52][2]$ is a subsequence of $\ell$, as desired.
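The greedy procedure in the proof above can be sketched in code. This is a minimal illustration rather than part of the proof: the names `partition_blocks`, `find_sigma`, and `is_subsequence` are our own, and the permutations that “do not matter” in the example are arbitrarily filled in with the identity.

```python
def partition_blocks(u):
    """Partition u as in the proof: the first occurrence of each symbol j >= 2
    is merged into the block of the symbol immediately preceding it."""
    blocks, seen = [], set()
    for x in u:
        if blocks and x not in seen:
            blocks[-1].append(x)   # first occurrence of a symbol j >= 2
        else:
            blocks.append([x])     # repeated symbol (or the very first symbol)
        seen.add(x)
    return blocks


def find_sigma(u, formation):
    """Greedy right-to-left assignment of sigma(r), ..., sigma(1)."""
    blocks = partition_blocks(u)
    assert len(blocks) == len(formation)
    r = max(u)
    sigma, used = {}, set()
    for block, perm in zip(reversed(blocks), reversed(formation)):
        new_symbols = block[1:]    # every symbol of the block except its first
        free = [x for x in reversed(perm) if x not in used]
        for symbol, image in zip(reversed(new_symbols), free):
            sigma[symbol] = image
            used.add(image)
    (forced,) = set(range(1, r + 1)) - used   # sigma(1) is forced
    sigma[1] = forced
    return sigma


def is_subsequence(small, big):
    it = iter(big)
    return all(x in it for x in small)


# The sequence u of (4.20) and the example formation; "don't care" slots are
# filled with the identity permutation (an arbitrary choice for this sketch).
u = [1, 1, 1, 2, 1, 3, 4, 2, 4, 1, 2, 5, 5]
ident = [1, 2, 3, 4, 5]
formation = [ident, ident, [3, 2, 5, 1, 4], [3, 5, 4, 2, 1],
             ident, ident, ident, [3, 5, 1, 4, 2], ident]
sigma = find_sigma(u, formation)
image = [sigma[x] for x in u]
flat = [x for perm in formation for x in perm]
```

On this input the sketch reproduces the assignment stated in the text, and `image` is a subsequence of `flat`.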
Remark: Lemma 1.14 is not the last word in finding sequences in formations. For example, consider the sequence $u = abcabca$. Lemma 1.14 states that $u$ is contained in every $(3, 5)$-formation, but in fact $u$ is contained in every $(3, 4)$-formation: Let $\ell = \ell_1 \ell_2 \ell_3 \ell_4$ be a $(3, 4)$-formation. Suppose $\ell_1 = abc$. Then, if $u$ itself is not a subsequence of $\ell$, then $\ell_2$ must have $b$ before $a$, $\ell_3$ must have $c$ before $b$, and $\ell_4$ must have $a$ before $c$. But then $\ell$ contains the subsequence $cbacbac$, which is isomorphic to $u$.
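The claim in this remark is small enough to confirm by exhaustive search over all $6^4$ $(3, 4)$-formations. The sketch below is only an illustration; the helper names are hypothetical, and “contains $u$” is taken to mean that some renaming of the symbols of $u$ occurs as a subsequence.

```python
from itertools import permutations, product


def is_subsequence(small, big):
    it = iter(big)
    return all(x in it for x in small)


def contains_copy(u, seq, alphabet):
    """True if seq contains a subsequence isomorphic to u (symbols renamed)."""
    for image in permutations(alphabet):
        rename = dict(zip(alphabet, image))
        if is_subsequence([rename[x] for x in u], seq):
            return True
    return False


alphabet = "abc"
u = "abcabca"
perms = list(permutations(alphabet))

# Enumerate every (3, 4)-formation, i.e. every 4-tuple of permutations of abc.
all_contain = all(
    contains_copy(u, [x for perm in formation for x in perm], alphabet)
    for formation in product(perms, repeat=4)
)
```

Here `all_contain` evaluates to `True`, in agreement with the argument given in the remark.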


4.4.1

Bounding formation-free sequences

Thus, the problem of bounding $\mathrm{Ex}_u(n)$ reduces to that of bounding $F_{r,s}(n)$. For completeness, we start by reproducing some simple bounds from [26]. We first prove that $F_{r,s}(n)$ is finite.

Lemma 4.19 (Klazar [26]). We have $F_{r,s}(n) \le s\,n^r$ for $n \ge r$.

Proof. Let $S$ be an $(r, s)$-formation-free sequence on $n$ distinct symbols. Partition $S$ from left to right into blocks of length $r$. Note that each block contains $r$ distinct symbols. Suppose we had $1 + (s - 1)\binom{n}{r}$ complete blocks. Then, by the pigeonhole principle, there would exist $s$ blocks that have the same set of $r$ symbols. Such a set of $s$ blocks would be an $(r, s)$-formation. Contradiction. Therefore, we must have
\[ |S| < r\left(1 + (s - 1)\binom{n}{r}\right) \le rs\binom{n}{r} \le s\,n^r. \]

It is also easy to get linear bounds for $F_{r,2}(n)$ and $F_{r,3}(n)$:

Lemma 4.20 (Klazar [26]). We have $F_{r,2}(n) \le rn$ and $F_{r,3}(n) \le 2rn$.

Proof. Let $S$ be an $r$-sparse sequence on $n$ distinct symbols. Again partition $S$ from left to right into blocks of length $r$ (the last block might be shorter). If $S$ contains no $(r, 2)$-formation then every block must contain the first occurrence of a symbol, and if $S$ contains no $(r, 3)$-formation, then every block must contain the first or last occurrence of a symbol. Thus, there are at most $n$ blocks in the first case, and at most $2n$ blocks in the second case.

Lemma 4.21 (Klazar [26]). Let $S = S_1 S_2 \cdots S_m$ be a sequence which is a concatenation of $m$ blocks, where each block $S_i$ contains only distinct symbols. Then $S$ can be made $r$-sparse by deleting at most $(r - 1)(m - 1)$ symbols.

Proof. Build an $r$-sparse subsequence $S'$ of $S$ in a greedy fashion, by scanning $S$ from left to right and adding a symbol from $S$ to $S'$ only if it does not equal any of the last $r - 1$ symbols currently in $S'$. In this way, we will skip at most $r - 1$ symbols of each block $S_i$, except for the first block $S_1$, which we will take entirely.

Next, we make a definition analogous to Definition 4.1:


Definition 4.22: Given integers $r$, $s$, $m$, and $n$, we denote by $\psi'_{r,s}(m, n)$ the length of the longest $r$-sparse, $(r, s)$-formation-free sequence on $n$ distinct symbols that can be partitioned into $m$ or fewer blocks, each block containing only distinct symbols.

Remark: The reader need not be intimidated (more than necessary) by the double subscript $r, s$ in $\psi'_{r,s}(m, n)$. We are never going to use induction on $r$, only on $s$. Thus, $r$ can be assumed to be fixed throughout our analysis.

The following lemma (analogous to Lemma 4.2) relates $F_{r,s}(n)$ to $\psi'_{r,s}(m, n)$.

Lemma 4.23. Given fixed integers $r$ and $s$, let $\varphi_{r,s-2}(n)$ be a nondecreasing function of $n$ such that $F_{r,s-2}(n) \le n\,\varphi_{r,s-2}(n)$ for all $n$. Then,
\[ F_{r,s}(n) \le 2n + \varphi_{r,s-2}(n)\bigl(2(r - 1)n + \psi'_{r,s}(2n, n)\bigr). \]
(This constitutes a minor improvement over Klazar [26], since Klazar related $F_{r,s}(n)$ to $\varphi_{r,s-1}(n)$.)

Proof. Let $S$ be a maximum-length $(r, s)$-formation-free sequence on $n$ symbols. Thus, $|S| = F_{r,s}(n)$. Partition $S$ from left to right into subsequences as follows: Let $S_1$ be the longest prefix of $S$ that is $(r, s-2)$-formation-free. Let $x_1$ be the symbol following $S_1$ in $S$. Thus $S_1 x_1$ contains an $(r, s-2)$-formation. Let $S_2$ be the longest segment of $S$ after $x_1$ which is $(r, s-2)$-formation-free, let $x_2$ be the symbol following $S_2$ in $S$, and so on. We obtain a partition $S = S_1 x_1 S_2 x_2 \ldots x_{m-1} S_m [x_m]$, where each $S_i$ is a subsequence and each $x_i$ is a symbol ($x_m$ might or might not be present).

Each subsequence $S_i x_i$ must contain either the first or the last occurrence of some symbol, for otherwise $S$ would contain an $(r, s)$-formation. Thus, $m \le 2n$. Let $n_i = \|S_i\|$. Then
\[ F_{r,s}(n) = |S| \le m + \sum_{i=1}^m |S_i| \le m + \sum_{i=1}^m F_{r,s-2}(n_i) \le 2n + \sum_{i=1}^m n_i\,\varphi_{r,s-2}(n_i) \le 2n + \varphi_{r,s-2}(n)\sum_{i=1}^m n_i. \]

So we just have to bound $\sum n_i$. Construct a subsequence $S'$ of $S$ by taking, for each subsequence $S_i$ in the above partition of $S$, just the first occurrence of each symbol in $S_i$. Thus, $|S'| = \sum n_i$. Next, using Lemma 4.21, “$r$-sparsify” $S'$ and obtain a sequence $S''$ with $|S''| \ge |S'| - (r - 1)(m - 1)$.


Since $S''$ is a subsequence of $S$, it contains no $(r, s)$-formation. Further, $S''$ is $r$-sparse and partitionable into $m$ blocks of distinct symbols. Therefore, $|S''| \le \psi'_{r,s}(m, n)$, and so
\[ \sum_{i=1}^m n_i = |S'| \le (r - 1)m + |S''| \le 2(r - 1)n + \psi'_{r,s}(2n, n). \]

The claim follows. We now apply our “almost-DS” technique to formation-free sequences. For this, we introduce and analyze “almost-formation-free” sequences. The analysis closely parallels the analysis of almost-DS sequences.
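The greedy “$r$-sparsification” of Lemma 4.21, used in the proof of Lemma 4.23 above, can be sketched as follows. This is an illustrative sketch only: the function names are our own, and $r$-sparse is taken here to mean that every $r$ consecutive symbols are pairwise distinct.

```python
def sparsify(blocks, r):
    """Greedy scan from the proof of Lemma 4.21: keep a symbol only if it
    differs from the last r - 1 symbols kept so far."""
    out = []
    for block in blocks:
        for x in block:
            recent = out[max(0, len(out) - (r - 1)):]
            if x not in recent:
                out.append(x)
    return out


def is_sparse(seq, r):
    """True if every window of r consecutive symbols has no repetition."""
    return all(len(set(seq[i:i + r])) == len(seq[i:i + r])
               for i in range(len(seq)))


# Three blocks of distinct symbols (m = 3), sparsified with r = 3.
blocks = [[1, 2, 3], [3, 1, 4], [4, 2, 1]]
r = 3
result = sparsify(blocks, r)
deleted = sum(len(b) for b in blocks) - len(result)
```

In this small example the scan deletes two symbols, within the bound $(r - 1)(m - 1) = 4$ of Lemma 4.21.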

4.4.2

Almost-formation-free sequences

If $S$ is a sequence, we say that $S$ is an $\mathrm{AFF}_{r,s,k}(m)$ sequence if $S$ contains no $(r, s)$-formation, can be partitioned into $m$ or fewer blocks, each composed of distinct symbols, and each symbol appears at least $k$ times (in $k$ different blocks). Note that we do not require $r$-sparsity; this is the reason for calling $S$ “almost” formation-free. Let $\Pi'_{r,s,k}(m)$ be the maximum possible number of distinct symbols in an $\mathrm{AFF}_{r,s,k}(m)$ sequence.

We first show the connection between AFF sequences and $\psi'_{r,s}(m, n)$, and then we derive upper bounds for $\Pi'_{r,s,k}(m)$.

Lemma 4.24. For all $s \ge 2$ and $k$ we have $\psi'_{r,s}(m, n) \le k\bigl(\Pi'_{r,s,k}(m) + n\bigr)$.

Proof. Let $S$ be a maximum-length $r$-sparse, $(r, s)$-formation-free sequence on $n$ distinct symbols, partitionable into $m$ blocks. Thus, $|S| = \psi'_{r,s}(m, n)$. Let $k \ge 1$ be the specified parameter. Transform $S$ into another sequence $S'$ in which each symbol appears exactly $k$ times as follows. For each symbol $a$, group the occurrences of $a$ from left to right into “clusters” of size $k$, discarding the fewer than $k$ occurrences left at the end. Replace each $a$ in the $i$-th cluster by a new symbol $a_i$. If $s \ge 2$, then this does not introduce any $(r, s)$-formations. (Proof: Call two symbols $a$ and $b$ disjoint if every occurrence of $a$ lies before every occurrence of $b$ or vice versa. Note that if $a$ and $b$ are disjoint, they cannot belong to the same $(r, s)$-formation for $s \ge 2$. Thus, if $S'$ contains an $(r, s)$-formation, that formation was already present in $S$.) We deleted at most $kn$ symbols from $S$, and the result $S'$ is an $\mathrm{AFF}_{r,s,k}(m)$ sequence ($S'$ is not necessarily $r$-sparse, but this is fine for an AFF sequence).


Therefore, $S'$ contains at most $\Pi'_{r,s,k}(m)$ symbols, each one appearing exactly $k$ times, so it has length at most $k \cdot \Pi'_{r,s,k}(m)$. The claim follows.

Lemma 4.25. For every $r \ge 2$ we have $\Pi'_{r,2,2}(m) = (r - 1)(m - 1)$.

Proof. For the upper bound, consider $m - 1$ “separators” between the $m$ blocks. We say that a symbol $a$ “contributes” to all the separators between the first two occurrences of $a$. Thus each symbol contributes to at least one separator. If there were $1 + (r - 1)(m - 1)$ symbols, then there would exist a separator with at least $r$ contributions, which would lead to the existence of an $(r, 2)$-formation.

For the lower bound, create $m$ blocks, and create $n = (r - 1)(m - 1)$ different symbols partitioned into $m - 1$ sets $A_1, \ldots, A_{m-1}$ of $r - 1$ symbols each. Make two copies of each $A_i$, and put one copy at the end of block $i$ and one copy at the beginning of block $i + 1$. We get a sequence with the desired properties.

Lemma 4.26. For every fixed $r \ge 2$ and $s \ge 3$ we have
\[ \Pi'_{r,s,s}(m) \le (r - 1)\binom{m-2}{s-2} = O(m^{s-2}). \]

Proof. Suppose for a contradiction that there is an $\mathrm{AFF}_{r,s,s}(m)$ sequence with $1 + (r - 1)\binom{m-2}{s-2}$ distinct symbols. Consider the $s - 2$ middle occurrences of each symbol. They fall on $s - 2$ out of $m - 2$ different blocks. Therefore, by the pigeonhole principle, there exist $r$ symbols whose $s - 2$ middle occurrences all fall in the same $s - 2$ blocks. This leads to the existence of an $(r, s)$-formation in the given sequence. Contradiction.

Recurrence 4.27. We have
\[ \Pi'_{r,s,2k-1}(2m) \le 2\Pi'_{r,s,2k-1}(m) + 2\Pi'_{r,s-1,k}(m). \]
The proof is exactly parallel to that of Recurrence 4.11.

Corollary 4.28. For fixed $r \ge 2$ and $s \ge 3$, if we let $k = 2^{s-2} + 1$, then $\Pi'_{r,s,k}(m) = O\bigl(m (\log m)^{s-3}\bigr)$ (where the constant implicit in the $O$ notation might depend on $r$ and $s$).

Recurrence 4.29. Let $r \ge 2$ and $s \ge 3$ be fixed. Let $k_1$, $k_2$, $k_3$, and $k$ be integers satisfying $k = k_2 k_3 + 2k_1 - 3k_2 - k_3 + 2$. Then,
\[ \Pi'_{r,s,k}(m) \le \left(1 + \frac{m}{t}\right)\Bigl(\Pi'_{r,s,k}(t) + 2\Pi'_{r,s-1,k_1}(t) + \Pi'_{r,s-2,k_2}(t)\Bigr) + \Pi'_{r,s,k_3}\!\left(1 + \frac{m}{t}\right), \]
where $t$ is a free parameter.


The proof exactly parallels that of Recurrence 4.17. The corollary is almost the same as Corollary 4.18; there is just a shift of 1 in the index $s$:

Corollary 4.30. Let $R_s(d)$ be the sequences defined in Corollary 4.18. Then, for every $s \ge 3$ and $d \ge 2$, if $k \ge R_{s-1}(d)$ then
\[ \Pi'_{r,s,k}(m) \le c\,m\,\alpha_d(m)^{s-3} \quad \text{for all } m \ge k. \]
Here, $c = c(r, s)$ is a constant that depends only on $r$ and $s$.

Combining Corollary 4.30 with Lemma 4.24, we obtain:

Corollary 4.31. Let $s \ge 4$. Then, for all $r$, $m$, and $n$ we have
\[ \psi'_{r,s}(m, n) \le C_{r,s,d}\bigl(m\,\alpha_d(m)^{s-3} + n\bigr) \quad \text{for all } d, \]
for some constants $C_{r,s,d}$ of the form
\[ C_{r,s,d} = \begin{cases} 2^{(1/t!)\,d^t \pm O(d^{t-1})}, & s \text{ odd}; \\ 2^{(1/t!)\,d^t \log_2 d \pm O(d^t)}, & s \text{ even}; \end{cases} \]
where $t = \lfloor (s-3)/2 \rfloor$.

We can finally prove our upper bounds for $F_{r,s}(n)$.

Proof of Theorem 1.13. Take $d = \alpha(m)$ in Corollary 4.31, then substitute into Lemma 4.23, bounding $\varphi_{r,s-2}(n)$ by induction on $s$. Use the base cases $F_{r,2}(n), F_{r,3}(n) = O(n)$ (by Lemma 4.20). (As before, $\varphi_{r,s-2}(n)$ contributes only to lower-order terms in the exponent.)

4.5

Lower bound construction for s = 3

The rest of this chapter deals with lower bounds for Davenport–Schinzel sequences. In this section we prove Theorem 1.11 by constructing, for every $n$, a Davenport–Schinzel sequence of order 3 on $n$ distinct symbols with length at least $2n\alpha(n) - O(n)$.

For this purpose, we first define a two-dimensional array of sequences $Z_d(m)$, for $d, m \ge 1$, with the following properties:
• Each symbol in $Z_d(m)$ appears exactly $2d + 1$ times.
• $Z_d(m)$ contains no forbidden alternation $ababa$. (We do not preclude the presence of adjacent repeated symbols in $Z_d(m)$.)


• $Z_d(m)$ is partitioned into blocks, where each block contains only distinct symbols. Some of the blocks in $Z_d(m)$ are special blocks. Each symbol in $Z_d(m)$ makes its first and last occurrences in special blocks. Furthermore, the special blocks are entirely composed of first and last occurrences of symbols (there might be both first and last occurrences in the same special block). Moreover, each special block in $Z_d(m)$ has length exactly $m$.
• For $d \ge 2$, each special block is surrounded by regular blocks on both sides, and no regular block is surrounded by special blocks on both sides. For the former property, we place empty regular blocks at the beginning and end of $Z_d(m)$, for $d \ge 2$.

In what follows, we enclose regular blocks by ( )’s, and special blocks by [ ]’s. The base cases of the construction are as follows: For $d = 1$, we let
\[ Z_1(m) = [12 \ldots m](m \ldots 21)[12 \ldots m]. \]
$Z_1(m)$ contains three blocks of length $m$; the first and last ones are special blocks. Note that each symbol appears exactly three times, as required. Also note that $Z_1(m)$ contains no alternation $ababa$.

For $m = 1$ and $d \ge 2$ we let
\[ Z_d(1) = (\,)[1](1)(1) \ldots (1)[1](\,), \]
with $2d + 1$ ones. Each symbol constitutes its own block; the first and the last nonempty blocks are special. Note that these special blocks have length 1, as required. At the beginning and end there are regular blocks of length zero.

Denote by $S_d(m)$ the number of special blocks in $Z_d(m)$.

The recursive construction

For $d, m \ge 2$, we construct $Z_d(m)$ recursively as follows. Let $Z' = Z_d(m - 1)$. Let $f = S_d(m - 1)$ be the number of special blocks in $Z'$, and let $Z^* = Z_{d-1}(f)$. Thus, the special blocks in $Z^*$ have length $f$. Let $g = S_{d-1}(f)$ be the number of special blocks in $Z^*$.

Create $g$ copies of $Z'$, each copy using “fresh” symbols which do not occur in $Z^*$ nor in any preceding copy of $Z'$. Thus, we have one copy of $Z'$ for each special block in $Z^*$. And each special block in $Z^*$ has as many symbols as there are special blocks in the corresponding copy of $Z'$.


Figure 4.1: Construction of $Z_d(m)$ from $Z^*$ and many copies of $Z'$. Two special blocks of $Z^*$ are depicted. In the left one, the symbol 1 makes its last occurrence, and symbols 2, 3 make their first occurrence. In the right block, symbols 2 and 4 make their last occurrence, and symbol 5 makes its first occurrence.

Let $C_i$ be the $i$-th special block in $Z^*$, and let $Z'_i$ be the $i$-th copy of $Z'$. Let $a$ be the $\ell$-th symbol in $C_i$, and let $D_\ell$ be the $\ell$-th special block in $Z'_i$. We duplicate $a$ into $aa$, and we insert the $aa$ into $Z'_i$ as follows: If the $a$ in $C_i$ is the first $a$ in $Z^*$, then the first of the two $a$’s falls at the end of $D_\ell$ and the second $a$ falls at the beginning of the block after $D_\ell$. And if the $a$ in $C_i$ is the last $a$ in $Z^*$, then the first of the two $a$’s falls at the end of the block before $D_\ell$ and the second $a$ falls at the beginning of $D_\ell$. (Recall that $D_\ell$ is surrounded by regular blocks in $Z'_i$.)

Since no regular block in $Z'_i$ is surrounded by special blocks on both sides, it follows that no block in $Z'_i$ receives more than one symbol from $Z^*$. Thus, even after the insertions, no block in $Z'_i$ has repeated symbols.

After these insertions, at the place in $Z^*$ where the block $C_i$ used to be there is now a hole. We insert $Z'_i$ (with its extra symbols) into this hole. After doing this for all special blocks $C_i$ in $Z^*$, we obtain the desired sequence $Z_d(m)$. See Figure 4.1.

It is easy to check that every symbol in $Z_d(m)$ has multiplicity $2d + 1$: The symbols of the copies of $Z'$ already had multiplicity $2d + 1$, and the symbols of $Z^*$ had their multiplicity increased from $2d - 1$ to $2d + 1$. It is also clear that each symbol makes its first and last occurrences in special blocks, that the special blocks in $Z_d(m)$ contain only first and last occurrences, and that their length increased from $m - 1$ to $m$. Furthermore, every special block is surrounded by regular blocks on both sides, and no regular block is surrounded by special blocks on both sides. And $Z_d(m)$ contains empty regular blocks at the beginning and at the end.


No $ababa$

Let us now verify that $Z_d(m)$ contains no alternation $ababa$ of length 5. Assume by induction that this is true for the component sequences $Z'$ and $Z^*$. Suppose for a contradiction that $Z_d(m)$ contains an alternation $ababa$.

The symbols $a$ and $b$ cannot come from the same copy of $Z'$, by induction, and they cannot come from different copies of $Z'$, since they would not alternate at all. Further, $a$ and $b$ cannot both come from $Z^*$: By the induction assumption, $Z^*$ contains no forbidden alternation. And the duplications of symbols $a \to aa$ cannot create a forbidden alternation, since the two $a$’s end up being adjacent in $Z_d(m)$.

Next, suppose that $a$ comes from a copy of $Z'$ and $b$ comes from $Z^*$. Then this copy of $Z'$ received two non-adjacent $b$’s. But this is impossible by construction: Our copy of $Z'$ received symbols from a single special block of $Z^*$, which contained at most one $b$. This $b$ was duplicated into two adjacent copies $bb$.

Finally, suppose that $a$ comes from $Z^*$ and $b$ comes from a copy of $Z'$. Then this copy of $Z'$ received an $a$ that is neither the first nor the last $a$ in $Z^*$. This is also a contradiction.

Remark: The above construction shares some similarities with an earlier construction by Komjáth [29].
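For very small parameters the construction of $Z_d(m)$ can be carried out explicitly and its stated properties checked by brute force. The sketch below is only an illustration under our own conventions: a sequence is a list of blocks `[is_special, symbols]`, symbols are integers drawn from a counter, and the names `build_Z`, `max_alternation`, and `check_Z` are hypothetical.

```python
from collections import Counter
from itertools import combinations, count


def build_Z(d, m, fresh):
    """Return Z_d(m) as a list of blocks [is_special, symbols].
    `fresh` is an iterator that yields unused symbol names."""
    if d == 1:
        syms = [next(fresh) for _ in range(m)]
        return [[True, syms[:]], [False, syms[::-1]], [True, syms[:]]]
    if m == 1:
        a = next(fresh)
        middle = [[False, [a]] for _ in range(2 * d - 1)]
        return [[False, []], [True, [a]]] + middle + [[True, [a]], [False, []]]
    copy = build_Z(d, m - 1, fresh)            # first copy of Z'
    f = sum(special for special, _ in copy)    # f = S_d(m - 1)
    z_star = build_Z(d - 1, f, fresh)          # Z* = Z_{d-1}(f)
    result, seen = [], set()
    for special, syms in z_star:
        if not special:
            result.append([False, syms])       # regular blocks of Z* are kept
            continue
        if copy is None:
            copy = build_Z(d, m - 1, fresh)    # fresh copy of Z' for this C_i
        targets = [i for i, (sp, _) in enumerate(copy) if sp]
        for a, i in zip(syms, targets):        # l-th symbol goes to D_l
            if a not in seen:                  # first a in Z*
                copy[i][1].append(a)
                copy[i + 1][1].insert(0, a)
            else:                              # last a in Z*
                copy[i - 1][1].append(a)
                copy[i][1].insert(0, a)
        seen.update(syms)
        result.extend(copy)                    # Z'_i fills the hole left by C_i
        copy = None
    return result


def max_alternation(seq, a, b):
    """Length of the longest alternating subsequence of seq on {a, b}."""
    length, last = 0, None
    for x in seq:
        if (x == a or x == b) and x != last:
            length, last = length + 1, x
    return length


def check_Z(d, m):
    """Verify the stated properties of Z_d(m); return (N, L, M)."""
    blocks = build_Z(d, m, count())
    seq = [x for _, syms in blocks for x in syms]
    assert all(c == 2 * d + 1 for c in Counter(seq).values())
    assert all(len(set(syms)) == len(syms) for _, syms in blocks)
    assert all(len(syms) == m for special, syms in blocks if special)
    assert all(max_alternation(seq, a, b) <= 4
               for a, b in combinations(set(seq), 2))
    return len(set(seq)), len(seq), len(blocks)
```

Calling `check_Z(2, 2)`, `check_Z(2, 3)`, or `check_Z(3, 2)` runs in well under a second. Larger parameters quickly become infeasible, since the sequences grow at an Ackermann-like rate.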

4.5.1

Analysis

Recall that $S_d(m)$ denotes the number of special blocks in $Z_d(m)$. We define a few other quantities related to $Z_d(m)$:
• $N_d(m) = \|Z_d(m)\|$ denotes the number of distinct symbols in $Z_d(m)$.
• $L_d(m) = |Z_d(m)|$ denotes the length of $Z_d(m)$.
• $M_d(m)$ denotes the total number of blocks (regular and special) in $Z_d(m)$.
• We let $X_d(m) = M_d(m)/S_d(m)$. Thus, $X_d(m)^{-1}$ is the fraction of blocks in $Z_d(m)$ that are special.
• We let $V_d(m) = L_d(m)/M_d(m)$ denote the average block length in $Z_d(m)$.


Note that
\[ N_d(m) = \frac{1}{2}\,m\,S_d(m), \tag{4.21} \]
\[ L_d(m) = (2d + 1)\,N_d(m) = \left(d + \frac{1}{2}\right) m\,S_d(m). \tag{4.22} \]

Equation (4.21) follows from the fact that each symbol appears in two special blocks, and each special block contains $m$ symbols. Equation (4.22) follows from the fact that each symbol appears $2d + 1$ times in $Z_d(m)$.

Theorem 1.11 follows from the following facts:

Lemma 4.32. The quantity $N_d(m)$ experiences Ackermann-like growth. Specifically, there exists a small absolute constant $c$ such that
\[ A_d(m) \le N_d(m) \le A_d(m + c) \tag{4.23} \]
for all $d \ge 3$ and all $m \ge 2$. We also have $X_d(m) \le 2d + 1$ and $V_d(m) \ge m/2$ for all $d$ and all $m$.

Let us first see how this lemma implies Theorem 1.11.

Proof of Theorem 1.11. Diagonalize by taking the sequences $Z^*_d = Z_d(d)$ for $d = 1, 2, 3, \ldots$. Let $N^*_d = N_d(d)$, $L^*_d = L_d(d)$, and $V^*_d = V_d(d)$. By (4.23) and (1.2) we have $N^*_d \le A_d(d + c) \le A_d\bigl(A(d + 1)\bigr) = A(d + 2)$. Thus,
\[ A(d) < N^*_d \le A(d + 2) \tag{4.24} \]
for all $d \ge 4$. Thus, by (1.4),
\[ \alpha(N^*_d) - 2 \le d < \alpha(N^*_d) \tag{4.25} \]
for $d \ge 4$, and so, by (4.22), $L^*_d \ge 2N^*_d \cdot \alpha(N^*_d) - O(N^*_d)$.

The sequences $Z^*_d$ are not necessarily Davenport–Schinzel sequences, since they might have adjacent repeated symbols. Therefore, create sequences $Z'_d$ by removing adjacent repetitions from $Z^*_d$. Since we delete at most one symbol per block, the length of $Z^*_d$ decreases by at most a $1/V^*_d$ fraction. But by Lemma 4.32 this ratio tends to zero with $d$ (this is why we diagonalized). Specifically, the length of $Z'_d$ is
\[ L'_d \ge L^*_d\left(1 - \frac{1}{V^*_d}\right) \ge L^*_d\left(1 - \frac{2}{d}\right) = 2N^*_d \cdot \alpha(N^*_d) - O(N^*_d). \tag{4.26} \]


We have just proven that $\lambda_3(n) \ge 2n\alpha(n) - O(n)$ for $n$ of the form $n = N^*_d$. We just have to interpolate to intermediate values of $n$. Given $n$, let $d = d(n)$ be the unique integer such that
\[ N^*_d < N^*_{d+1} \le n < N^*_{d+2}. \]
It follows, by applying (4.25) twice, that
\[ \alpha(n) \le \alpha\bigl(N^*_{d+2}\bigr) \le d + 4 < \alpha(N^*_d) + 4. \tag{4.27} \]
Also, by the rapid growth of $N^*_d$ in $d$, we certainly have
\[ N^*_d \le \sqrt{N^*_{d+1}} \le \sqrt{n} \tag{4.28} \]
for $d \ge 4$.

We now concatenate many copies of $Z'_d$ with disjoint sets of symbols, making sure we do not have more than $n$ distinct symbols altogether. Specifically, we let $t = \lfloor n/N^*_d \rfloor$, and we let $Z''(n)$ be a concatenation of $t$ copies of $Z'_d$ with disjoint sets of symbols. By (4.26), (4.27), and (4.28), it follows that the length of $Z''(n)$ is
\[ L''(n) = t\,L'_d \ge \left(\frac{n}{N^*_d} - 1\right)\bigl(2N^*_d \cdot \alpha(N^*_d) - O(N^*_d)\bigr) = 2n\alpha(n) - O(n). \]
Since $\lambda_3(n) \ge L''(n)$, the bound follows.

All that remains is to prove Lemma 4.32.

Proof of Lemma 4.32. The quantity $S_d(m)$ is given recursively by
\[ S_1(m) = 2; \qquad S_d(1) = 2; \qquad S_d(m) = fg = S_d(m - 1)\,S_{d-1}\bigl(S_d(m - 1)\bigr), \quad \text{for } d, m \ge 2. \tag{4.29} \]
In particular, we have $S_2(m) = 2^m = A_2(m)$, and $S_d(2) = 2^d$. It is not hard to show (see Appendix A) that there exists a small constant $c'$ such that
\[ A_d(m) \le S_d(m) \le A_d(m + c') \tag{4.30} \]
for all $d \ge 2$ and all $m$. Then, by (4.21) we have, for $d \ge 3$, $m \ge 2$,
\[ S_d(m) \le N_d(m) \le S_d(m)^2 \le S_d(m + 1), \]
so (4.23) follows with $c = c' + 1$.


Regarding $M_d(m)$, we have
\[ M_1(m) = 3; \qquad M_d(1) = 2d + 3, \quad \text{for } d \ge 2, \]
counting the empty blocks at the ends of $Z_d(1)$. And for $d, m \ge 2$, we have
\[ M_d(m) = g\,M_d(m - 1) + M_{d-1}(f) - g = S_{d-1}\bigl(S_d(m - 1)\bigr)\bigl(M_d(m - 1) - 1\bigr) + M_{d-1}\bigl(S_d(m - 1)\bigr) \tag{4.31} \]
(since the $g$ special blocks of $Z^*$ disappear). In particular, we have $M_2(m) = 2^{m+2} - 1$, and $M_d(2) = 2^{d+1} d - 1$.

Let us now examine $X_d(m) = M_d(m)/S_d(m)$. We have
\[ X_1(m) = 3/2, \qquad X_2(m) = 4 - 2^{-m}, \]
\[ X_d(1) = d + 3/2, \quad \text{for } d \ge 2, \qquad X_d(2) = 2d - 2^{-d}, \quad \text{for } d \ge 2. \]
In general, dividing (4.31) by (4.29),
\[ X_d(m) = X_d(m - 1) + \frac{X_{d-1}\bigl(S_d(m - 1)\bigr) - 1}{S_d(m - 1)}. \tag{4.32} \]

We now prove by induction that $X_d(m) \le 2d + 1$ for all $d$ and $m$. The claim has been verified for $d \le 2$ and for $m \le 2$, so assume $d, m \ge 3$. By (4.32) and using induction on $d$, we have
\[ X_d(m) \le X_d(m - 1) + \frac{2d - 2}{S_d(m - 1)}, \]
so
\[ X_d(m) \le X_d(2) + (2d - 2)\sum_{m=2}^{\infty} S_d(m)^{-1} = 2d - 2^{-d} + (2d - 2)\sum_{m=2}^{\infty} S_d(m)^{-1}. \]
It is easily checked that, for $d \ge 3$,
\[ \sum_{m=2}^{\infty} S_d(m)^{-1} \le 2S_d(2)^{-1} = 2^{1-d} \le \frac{1}{2d - 2}. \]
It follows that $X_d(m) \le 2d + 1$, as desired.

Finally, let us consider $V_d(m)$. By (4.22) we have
\[ V_d(m) = \frac{L_d(m)}{M_d(m)} = \left(d + \frac{1}{2}\right)\frac{m}{X_d(m)} \ge \frac{m}{2}. \]
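The recurrences (4.29) and (4.31), together with the special values and the bound $X_d(m) \le 2d + 1$ used in the proof above, can be checked with exact arithmetic for small parameters. The following is a hedged sketch; the function names `S`, `M`, and `X` simply mirror the notation of this section, and only tiny arguments are feasible because the values explode.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def S(d, m):
    """Number of special blocks in Z_d(m), by recurrence (4.29)."""
    if d == 1 or m == 1:
        return 2
    f = S(d, m - 1)
    return f * S(d - 1, f)


@lru_cache(maxsize=None)
def M(d, m):
    """Total number of blocks in Z_d(m), by recurrence (4.31)."""
    if d == 1:
        return 3
    if m == 1:
        return 2 * d + 3
    f = S(d, m - 1)
    g = S(d - 1, f)
    return g * (M(d, m - 1) - 1) + M(d - 1, f)


def X(d, m):
    """X_d(m) = M_d(m) / S_d(m), as an exact fraction."""
    return Fraction(M(d, m), S(d, m))


# Small parameter pairs only: S_3(4) already has more than 600 digits,
# and the plain recursion would exceed Python's default recursion limit.
small = [(d, m) for d in (1, 2, 3) for m in (1, 2, 3)]
small += [(d, m) for d in range(4, 9) for m in (1, 2)]
bound_holds = all(X(d, m) <= 2 * d + 1 for d, m in small)
```

On these inputs `bound_holds` is `True`, and for instance `X(3, 3)` equals $12799/2048 \approx 6.25$, below the bound $2d + 1 = 7$.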


Remark: The coefficient 2 in our bound for $\lambda_3(n)$ comes from the fact that each symbol appears roughly $2d$ times in $Z_d(m)$. In previous constructions [51, 29, 45] each symbol appears only $d \pm O(1)$ times in the equivalent sequence. Sharir and Agarwal [45] lost an additional factor of 2 in the interpolation step; we avoided this loss in the proof of Theorem 1.11 by letting $Z''(n)$ consist of many copies of $Z'_d$, instead of using $Z'_{d+1}$ (which would have been a more obvious choice).

4.5.2

Lower bound for ADS sequences of order 3

In Section 4.3 we introduced the notion of almost-DS sequences. We derived an upper bound on the maximum number $\Pi^s_k(m)$ of distinct symbols of an $\mathrm{ADS}^s_k(m)$-sequence, and we used this upper bound to bound $\lambda_s(n)$. But the problem of $\mathrm{ADS}^s_k(m)$-sequences is interesting in its own right, so one might naturally wonder about matching lower bounds for $\Pi^s_k(m)$.

It turns out that the construction $Z_d(m)$ described in this section also provides a roughly-matching lower bound for $\Pi^3_d(x)$. We just have to change our point of view: Instead of taking a diagonal (namely, $Z_d(d)$), we take the rows of the construction (meaning, $Z_d(m)$ for fixed $d$).

Theorem 4.33. For every fixed $d \ge 2$ we have $\Pi^3_{2d+1}(x) = \Omega\bigl(\tfrac{1}{d}\,x\,\alpha_d(x)\bigr)$.

Proof. For every $m \ge 1$, the sequence $Z_d(m)$ is an $\mathrm{ADS}^3_{2d+1}(x_m)$-sequence for $x_m = M_d(m)$. Let $n_m = N_d(m)$ be the number of distinct symbols in $Z_d(m)$. By the definition of $X_d(m)$, and by applying Lemma 4.32 and then (4.30), we have
\[ x_m = M_d(m) = X_d(m)\,S_d(m) \le (2d + 1)\,S_d(m) \le (2d + 1)\,A_d(m + c') \le A_d(m + c' + 1). \]
Thus, by (1.3) we have $m \ge \alpha_d(x_m) - c' - 1$. Therefore, by (4.22), and applying Lemma 4.32 again,
\[ n_m = N_d(m) = \frac{L_d(m)}{2d + 1} = \frac{V_d(m)\,M_d(m)}{2d + 1} \ge \frac{m\,M_d(m)}{4d + 2} = \Omega\left(\frac{1}{d}\,x_m\,\alpha_d(x_m)\right). \]
We interpolate to intermediate values of $x$ (for $x_m \le x < x_{m+1}$) as we did above, in Section 4.5.1.

Thus, for odd $d$ the bounds for $\Pi^3_d(m)$ are quite tight (they leave a multiplicative gap of $O(d)$). For even $d$ the bounds are not so tight: they are obtained by applying $\Pi^3_{d+1}(m) \le \Pi^3_d(m) \le \Pi^3_{d-1}(m)$.


Theorem 4.33 automatically yields a lower bound for $\Pi'_{r,4,k}(m)$: A sequence that does not contain $ababa$ cannot contain an $(r, 4)$-formation for any $r \ge 2$; further, as Adamec, Klazar, and Valtr [1] showed, an $r$-sparse, $u$-free sequence can be made $r'$-sparse for $r' > r$ at the cost of shrinking the sequence by at most a constant factor.

4.6 Lower-bound construction for s ≥ 4 even

In this section we present a construction that achieves the lower bounds (1.9) for $\lambda_s(n)$, $s$ even. This is a simpler variant of the construction of Agarwal, Sharir, and Shor [2, 45] that achieves the same bounds.

We first construct a family of sequences $S^s_k(m)$ for $s \ge 2$ even, $k \ge 0$, and $m \ge 1$. For all $s \ge 4$, $m \ge 2$, the sequences $S^s_k(m)$ are Davenport–Schinzel sequences of order $s$. The sequences $S^s_k(m)$ are highly regular; they satisfy the following properties:

• $S^s_k(m)$ is a concatenation of blocks of length $m$, where each block contains $m$ distinct symbols. (For $s = 2$ or $m = 1$ there are adjacent repeated symbols at the interface between blocks, but only in these cases.)

• $S^s_k(m)$ does not contain any forbidden alternation $abab\ldots$ of length $s + 2$, for any distinct symbols $a \ne b$. Thus, for $s \ge 4$, $m \ge 2$, the sequence $S^s_k(m)$ is a Davenport–Schinzel sequence of order $s$.

• All symbols in $S^s_k(m)$ occur with the same multiplicity $\mu_s(k)$, which depends only on $s$ and $k$. Further, for $s \ge 4$ each symbol in $S^s_k(m)$ makes all its appearances in the same position within the blocks, and no two symbols $a$, $b$ appear together in more than one block.

4.6.1 The construction

For $s = 2$, the sequences $S^2_k(m)$ are given (independently of $k$) by
\[ S^2_k(m) = 1\,2 \ldots m \;\; m \ldots 2\,1. \]
$S^2_k(m)$ consists of two blocks of length $m$, and each symbol occurs with multiplicity $\mu_2(k) = 2$. Clearly, $S^2_k(m)$ contains no forbidden alternation $abab$.

The construction for general $s \ge 4$ is as follows. For $k = 0$, we let $S^s_0(m)$ consist of a single block of length $m$:
\[ S^s_0(m) = 1\,2 \ldots m. \tag{4.33} \]


Thus, $\mu_s(0) = 1$. For general $k \ge 1$, we proceed as follows. The sequence $S^s_k(1)$ consists of
\[ \mu_s(k) = \mu_{s-2}(k-1)\,\mu_s(k-1) \tag{4.34} \]

copies of the symbol 1, each forming by itself a block of length one. Equation (4.34), together with the bounding cases $\mu_2(k) = 2$ and $\mu_s(0) = 1$ for $s \ge 4$, gives the recursive definition of $\mu_s(k)$.

For $m \ge 2$, the sequence $S^s_k(m)$ is constructed inductively on the lexicographic order of the triples $(s, k, m)$, using three previously created sequences as components.

The first sequence is $S' = S^s_k(m-1)$; note that $S'$ contains blocks of length $m - 1$. Let $f$ be the number of blocks in $S'$.

The second sequence is $S = S^{s-2}_{k-1}(f)$. Thus, $S$ contains blocks of length $f$. Let $g = \|S\|$ be the number of distinct symbols in $S$.

The third and final sequence is $S^* = S^s_{k-1}(g)$. Thus, $S^*$ contains blocks of length $g$.

Transform the sequence $S^*$ into a sequence $\hat S^*$ by replacing each block in $S^*$ by a copy of $S$ with the same set of $g$ symbols, making their first appearances in the same order as in the replaced block. Note that $\hat S^*$ contains blocks of length $f$, and by induction, each symbol in $\hat S^*$ occurs with multiplicity $\mu_{s-2}(k-1)\,\mu_s(k-1) = \mu_s(k)$. Let $h$ be the number of blocks in $\hat S^*$.

Now, create $h$ copies of $S'$, each copy using "fresh" symbols which do not occur in $\hat S^*$ nor in any preceding copy of $S'$, and concatenate them into a sequence $S''$. Note that $S''$ contains $fh$ blocks of length $m - 1$, while $|\hat S^*| = fh$.

Insert each symbol of $\hat S^*$ in order at the end of each block of $S''$. Thus, each component sequence $S'$ in $S''$, containing $f$ blocks, receives the $f$ distinct symbols of a block in $\hat S^*$. The resulting sequence is the desired $S^s_k(m)$. Note that it contains blocks of length $m$, and, by induction and construction, each symbol in it has multiplicity $\mu_s(k)$. See Figure 4.2.

Letting $t = s/2 - 1$, we have
\[ \mu_s(k) = 2^{\binom{k}{t}} = 2^{(1/t!)k^t - O(k^{t-1})}, \tag{4.35} \]
if we take $s$ to be a constant.
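The closed form in (4.35) can be checked numerically against the recurrence (4.34). The following is a minimal sketch in Python; the function names `mu_rec` and `mu_closed` are our own, chosen for illustration. The agreement is Pascal's rule $\binom{k}{t} = \binom{k-1}{t-1} + \binom{k-1}{t}$ applied to the exponents.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def mu_rec(s: int, k: int) -> int:
    """Multiplicity mu_s(k) from the recurrence (4.34) and its boundary cases."""
    if s == 2:
        return 2                      # mu_2(k) = 2 for every k
    if k == 0:
        return 1                      # mu_s(0) = 1 for s >= 4
    return mu_rec(s - 2, k - 1) * mu_rec(s, k - 1)


def mu_closed(s: int, k: int) -> int:
    """Closed form from (4.35): 2^C(k, t) with t = s/2 - 1."""
    return 2 ** comb(k, s // 2 - 1)


if __name__ == "__main__":
    for s in range(2, 13, 2):
        for k in range(11):
            assert mu_rec(s, k) == mu_closed(s, k)
    # Multiplicities for s = 6 (t = 2) and k = 0..5.
    print([mu_rec(6, k) for k in range(6)])   # [1, 1, 2, 8, 64, 1024]
```

For $s = 6$ the exponents are $\binom{k}{2} = 0, 0, 1, 3, 6, 10$, which illustrates the $k^t/t!$ growth of the exponent.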


Figure 4.2: The recursive construction of $S^s_k(m)$. The sequence $\hat S^*$ is the result of replacing each block of $S^*$ by a copy of $S$. Each block of $\hat S^*$ is then distributed among the $f$ blocks of a single copy of $S'$.
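To make the recursive construction concrete, here is a sketch of it in Python under conventions of our own (they are not taken from the text): a sequence is a list of blocks, the symbols are the integers $0, \ldots, N-1$, and the helper names `build`, `symbols`, and `max_alternation` are hypothetical. The variable `s_mid` stands for the second component sequence $S$. The sketch is only usable for very small parameters, since the sequences grow extremely fast.

```python
from itertools import combinations
from math import comb


def mu(s, k):
    """Multiplicity from (4.35): 2^C(k, t) with t = s/2 - 1."""
    return 2 ** comb(k, s // 2 - 1)


def symbols(blocks):
    """Map each distinct symbol to the rank of its first appearance."""
    rank = {}
    for block in blocks:
        for x in block:
            rank.setdefault(x, len(rank))
    return rank


def build(s, k, m):
    """Return S_k^s(m) as a list of blocks over the symbols 0, ..., N-1."""
    if s == 2:
        return [list(range(m)), list(range(m - 1, -1, -1))]
    if k == 0:
        return [list(range(m))]
    if m == 1:
        return [[0] for _ in range(mu(s, k))]
    s_prime = build(s, k, m - 1)          # S', blocks of length m - 1
    f = len(s_prime)
    s_mid = build(s - 2, k - 1, f)        # S, blocks of length f
    rank = symbols(s_mid)
    g = len(rank)
    s_star = build(s, k - 1, g)           # S*, blocks of length g
    # S-hat*: replace each block of S* by a relabelled copy of S that keeps
    # the order of first appearances of the replaced block.
    s_hat = [[block[rank[x]] for x in mid_block]
             for block in s_star for mid_block in s_mid]
    n_star, n_prime = len(symbols(s_star)), len(symbols(s_prime))
    result = []
    for i, hat_block in enumerate(s_hat):
        shift = n_star + i * n_prime      # fresh symbols for the i-th copy of S'
        for j, block in enumerate(s_prime):
            result.append([x + shift for x in block] + [hat_block[j]])
    return result


def max_alternation(blocks):
    """Length of the longest alternation a b a b ... over all symbol pairs."""
    seq = [x for block in blocks for x in block]
    best = 1
    for a, b in combinations(set(seq), 2):
        runs, last = 0, None
        for x in seq:
            if x in (a, b) and x != last:
                runs, last = runs + 1, x
        best = max(best, runs)
    return best


if __name__ == "__main__":
    print(build(4, 1, 2))                 # [[2, 0], [2, 1], [3, 1], [3, 0]]
    blocks = build(4, 2, 2)
    print(len(blocks), max_alternation(blocks))
```

For $s = 4$, $k = 2$, $m = 2$ the sketch produces 128 blocks of length 2 on 64 symbols, each symbol occurring $\mu_4(2) = 4$ times, and the brute-force check finds no alternation of length $s + 2 = 6$.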

4.6.2 Correctness of the construction

We now prove that, for $s \ge 4$, $m \ge 2$, the sequences $S^s_k(m)$ are indeed Davenport–Schinzel sequences of order $s$. Let us first recall some important properties of the construction:

• The last symbol in each block of $S^s_k(m)$ comes from $\hat S^*$ (which has the same set of symbols as $S^*$), while every other symbol in $S^s_k(m)$ comes from a copy of $S'$.

• The copies of $S'$ have pairwise disjoint sets of symbols, which are also disjoint from the set of symbols of $\hat S^*$.

• When merging $S''$ and $\hat S^*$ to form $S^s_k(m)$, each copy of $S'$ in $S''$ receives the $f$ distinct symbols of a block of $\hat S^*$.

The following lemma is easily proven by induction using the above properties:

Lemma 4.34. The sequence $S^s_k(m)$ satisfies the following properties:

1. For $s \ge 4$, each symbol in the sequence makes all its appearances in the same position within the blocks.

2. For $s \ge 4$, $m \ge 2$, there are no adjacent repeated symbols.

3. For $s \ge 4$, no two symbols of $S^s_k(m)$ appear together in more than one block.


For each symbol $a$ in $S^s_k(m)$, call the depth of $a$ the position within the blocks in which $a$ always appears in $S^s_k(m)$. This notion is well-defined by the above lemma. Thus, the symbols that come from copies of $S'$ have depth between 1 and $m - 1$, while the symbols that come from $\hat S^*$ have depth $m$.

The following lemma is also straightforward:

Lemma 4.35. Symbols at different depths in $S^s_k(m)$ make alternations of length at most 5.

Proof. By induction. The claim is clearly true if $s = 2$, $k = 0$, or $m = 1$. Thus, let $s \ge 4$, $k \ge 1$, and $m \ge 2$. Let $a$ and $b$ be two symbols at different depths in $S^s_k(m)$. If both $a$ and $b$ have depth at most $m - 1$, then they either come from the same copy of $S'$, in which case the claim follows by induction, or else they come from different copies of $S'$, in which case they do not alternate at all.

Thus, suppose one symbol, say $a$, has depth $m$ (so it comes from $\hat S^*$), while the other symbol, $b$, has depth at most $m - 1$ (so it comes from a copy of $S'$). The copy of $S'$ to which $b$ belongs receives at most one $a$ from $\hat S^*$. In the worst case, this $a$ is surrounded by $b$'s from our copy of $S'$, and this copy of $S'$ is in turn surrounded by other $a$'s from $\hat S^*$. Thus the longest alternation we can get is $ababa$.

The main issue is to show that $S^s_k(m)$ contains no forbidden alternating subsequence of length $s + 2$. For this, we prove by induction that $S^s_k(m)$ satisfies a stronger property.

Lemma 4.36. The sequence $S^s_k(m)$ satisfies the following properties:

1. $S^s_k(m)$ contains no forbidden alternation $abab\ldots$ of length $s + 2$.

2. Furthermore, if each block $B$ in $S^s_k(m)$ is replaced by a sequence $T(B)$ on the same set of symbols as $B$, such that $T(B)$ contains no alternation $abab\ldots$ of length $s$, and such that the symbols in $T(B)$ make their first appearances in the same order as they did in $B$, then the resulting sequence still contains no forbidden alternation of length $s + 2$.

Proof. Again by induction. Both properties clearly hold if $s = 2$, $k = 0$, or $m = 1$, so let $s \ge 4$, $k \ge 1$, and $m \ge 2$. Assume by induction that Properties 1 and 2 hold for the sequences $S'$, $S$, and $S^*$ from which $S^s_k(m)$ is built. We want to show that these properties hold for $S^s_k(m)$ itself.


Figure 4.3: The left figure shows the case of $s$ even. For a forbidden alternation to occur, a pair of symbols $a$, $b$ in a common block must be replaced by an alternation of length at most $s - 1$, and extended to length $s + 2$ by at least three more symbols $a$, $b$, according to one of four possible cases. In each case we get a contradiction. The right figure shows the case of $s$ odd. Here the argument fails, because case (ii) fails to yield a contradiction.

We start with Property 1. Suppose for a contradiction that $S^s_k(m)$ contains a forbidden alternation $abab\ldots$ or $baba\ldots$ of length $s + 2$. By Lemma 4.35, $a$ and $b$ must have the same depth (since $s + 2 \ge 6$). If $a$ and $b$ have depth at most $m - 1$, then they must belong to the same copy of $S'$, or else they would not alternate at all. But this contradicts our inductive assumption on $S'$. And if $a$ and $b$ have depth $m$ and come from $\hat S^*$, then $\hat S^*$ itself contains a forbidden alternation. But $\hat S^*$ is obtained from $S^*$ via block replacements, exactly as described in Property 2. Thus, the inductive assumption on $S^*$ is contradicted. In conclusion, $S^s_k(m)$ cannot contain an alternation of length $s + 2$, so it satisfies Property 1.

Now we show that $S^s_k(m)$ satisfies Property 2. Suppose for a contradiction that, after performing a certain set of block replacements in $S^s_k(m)$, we do get an alternation $abab\ldots$ or $baba\ldots$ of length $s + 2$. For this to happen, $a$ and $b$ must have appeared together in some block $B$ of $S^s_k(m)$. (By Lemma 4.34, they do not appear together in more than one block.) Say that $a$ appeared before $b$ in this block. This block was replaced, in the worst case, by a sequence containing an alternation $abab\ldots$ of length $s - 1$. (Without loss of generality we may assume the alternation starts with an $a$, since the block replacement preserves the order of first appearances of the symbols.) This alternation is extended to length $s + 2$ by at least three more instances of $a$ and $b$ before or after the block $B$, according to one of four possible cases, as depicted in Figure 4.3 (left).


To see why none of these cases can occur, consider again where the symbols $a$ and $b$ came from. If $a$ and $b$ came from the same copy of $S'$, then the same block replacement in $S'$ would also have generated a forbidden alternation of length $s + 2$. This contradicts our inductive assumption for $S'$. Further, $a$ and $b$ could not have come from different copies of $S'$, since then they would not lie together in the same block (and they would not alternate at all). For a similar reason, they cannot both come from $\hat S^*$.

Thus, one symbol (specifically, $a$) must originate from a copy of $S'$, and the other one (namely, $b$) must originate from $\hat S^*$. But all the other instances of $a$ in $S^s_k(m)$, to the left or right of our block $B$, also come from the same copy of $S'$. A case analysis shows that in each of the four cases shown in Figure 4.3 (left), this copy of $S'$ received two copies of $b$ from $\hat S^*$. (In cases (i) and (ii) there are two $b$'s surrounded by $a$'s, and in cases (iii) and (iv) there is a $b$ surrounded by $a$'s, plus another $b$ lying in the same block as an $a$.) This is impossible according to our construction.

Remark: Unfortunately, the above argument depends crucially on $s$ being even. If we try to make the same argument with $s$ odd, we get the four cases illustrated in Figure 4.3 (right), and in case (ii) we fail to get a contradiction: we cannot find two instances of $b$ sent to the same copy of $S'$.

4.6.3 Analysis

Given a fixed even number $s \ge 4$, take the sequences $S^s_k(2)$, for $k = 0, 1, 2, \ldots$. These are Davenport–Schinzel sequences of order $s$, in which the multiplicity of the symbols, $\mu_s(k)$, goes to infinity. Thus, the length of these sequences grows superlinearly in the number of symbols. We want to derive the exact relation between these two quantities. For this purpose, we derive an upper bound on the number of distinct symbols in $S^s_k(2)$.

Let $N^s_k(m) = \|S^s_k(m)\|$ denote the number of distinct symbols in $S^s_k(m)$, and let $F^s_k(m)$ be the number of blocks in $S^s_k(m)$. Then,
\[ |S^s_k(m)| = \mu_s(k)\, N^s_k(m) = m\, F^s_k(m). \tag{4.36} \]
The quantities $N^s_k(m)$ are initialized by
\[ N^2_k(m) = m; \qquad N^s_0(m) = m; \qquad N^s_k(1) = 1. \]


To get a recurrence relation for the general case, we analyze the recursive construction of $S^s_k(m)$. Using the notation there, we have
\[
\begin{aligned}
f &= F^s_k(m-1);\\
g &= N^{s-2}_{k-1}(f);\\
h &= F^s_{k-1}(g) \cdot F^{s-2}_{k-1}(f);\\
N^s_k(m) &= N^s_{k-1}(g) + h \cdot N^s_k(m-1).
\end{aligned}
\]
Thus, applying (4.36) three times and then (4.34),
\[
\begin{aligned}
N^s_k(m) &= N^s_{k-1}(g) + F^s_{k-1}(g) \cdot F^{s-2}_{k-1}(f) \cdot N^s_k(m-1)\\
&= N^s_{k-1}(g) + \frac{\mu_s(k-1)\, N^s_{k-1}(g)}{g} \cdot \frac{\mu_{s-2}(k-1) \cdot g}{f} \cdot \frac{(m-1) \cdot f}{\mu_s(k)}\\
&= m \cdot N^s_{k-1}(g)\\
&= m \cdot N^s_{k-1}\Bigl(N^{s-2}_{k-1}\bigl(F^s_k(m-1)\bigr)\Bigr).
\end{aligned}
\]
Since $\mu_s(k) \le 2^{2^k}$ and $m \ge 1$, by (4.36) we have
\[ F^s_k(m) \le 2^{2^k} N^s_k(m), \]
so
\[ N^s_k(m) \le m \cdot N^s_{k-1}\Bigl(N^{s-2}_{k-1}\bigl(2^{2^k} N^s_k(m-1)\bigr)\Bigr). \tag{4.37} \]
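As a sanity check on the algebra, the simplified recurrence $N^s_k(m) = m \cdot N^s_{k-1}(g)$ can be compared with the additive form $N^s_{k-1}(g) + h \cdot N^s_k(m-1)$ for small parameters. The sketch below is in Python with exact integer arithmetic; the names `num_symbols`, `num_blocks`, and `additive_form` are ours, and the block counts are derived from the identity $\mu_s(k) N^s_k(m) = m F^s_k(m)$ rather than by building the sequences.

```python
from functools import lru_cache
from math import comb


def mu(s, k):
    """Multiplicity mu_s(k) = 2^C(k, s/2 - 1), as in (4.35)."""
    return 2 ** comb(k, s // 2 - 1)


@lru_cache(maxsize=None)
def num_symbols(s, k, m):
    """N_k^s(m) via the simplified recurrence N = m * N_{k-1}^s(g)."""
    if s == 2 or k == 0:
        return m
    if m == 1:
        return 1
    f = num_blocks(s, k, m - 1)
    g = num_symbols(s - 2, k - 1, f)
    return m * num_symbols(s, k - 1, g)


def num_blocks(s, k, m):
    """F_k^s(m), obtained from mu * N = m * F."""
    return mu(s, k) * num_symbols(s, k, m) // m


def additive_form(s, k, m):
    """N_{k-1}^s(g) + h * N_k^s(m - 1), read off the construction directly."""
    f = num_blocks(s, k, m - 1)
    g = num_symbols(s - 2, k - 1, f)
    h = num_blocks(s, k - 1, g) * num_blocks(s - 2, k - 1, f)
    return num_symbols(s, k - 1, g) + h * num_symbols(s, k, m - 1)


if __name__ == "__main__":
    for s, k, m in [(4, 1, 2), (4, 1, 5), (4, 2, 2), (4, 2, 3), (6, 2, 2)]:
        assert additive_form(s, k, m) == num_symbols(s, k, m)
    print(num_symbols(4, 2, 2))                 # 64
    print(num_symbols(4, 2, 3) == 3 * 2**134)   # True
```

Already $N^4_2(3) = 3 \cdot 2^{134}$, which gives a feeling for how quickly the number of symbols grows with $m$ for fixed $k$.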

We now simplify the analysis by getting rid of the dependence on $s$ in the last inequality. For this, we define an Ackermann-like hierarchy of functions $\hat A_k(m)$ for $k \ge 0$, $m \ge 1$, by
\[ \hat A_0(m) = m; \]
and
\[
\hat A_k(m) =
\begin{cases}
1, & \text{if } m = 1;\\
m \cdot \hat A_{k-1}\Bigl(\hat A_{k-1}\bigl(2^{2^k} \hat A_k(m-1)\bigr)\Bigr), & \text{otherwise;}
\end{cases}
\]
for $k \ge 1$ (compare to (1.1)). It follows by induction that
\[ N^s_k(m) \le \hat A_k(m) \tag{4.38} \]
for all $s$, $k$, and $m$. In Appendix A we prove that
\[ \hat A_k(m) \le A_{k+1}(2m + 4) \quad \text{for all } k \ge 2 \text{ and all } m. \tag{4.39} \]
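The hierarchy $\hat A_k$ can be evaluated directly only at its lowest levels. The following Python sketch (the name `a_hat` is ours) implements the definition literally. For $k = 1$ the two applications of $\hat A_0$ are identities, so the recurrence collapses to $\hat A_1(m) = 4m \cdot \hat A_1(m-1)$, that is, $\hat A_1(m) = 4^{m-1} m!$; this closed form is our own observation and is checked in the code.

```python
from math import factorial


def a_hat(k: int, m: int) -> int:
    """Literal evaluation of the Ackermann-like hierarchy A-hat_k(m)."""
    if k == 0:
        return m
    if m == 1:
        return 1
    inner = 2 ** (2 ** k) * a_hat(k, m - 1)
    return m * a_hat(k - 1, a_hat(k - 1, inner))


if __name__ == "__main__":
    print([a_hat(1, m) for m in range(1, 5)])   # [1, 8, 96, 1536]
    for m in range(1, 11):
        assert a_hat(1, m) == 4 ** (m - 1) * factorial(m)
```

Calling `a_hat(2, 2)` is not advisable: it asks for $\hat A_1$ at an argument of order $10^{22}$, and this naive recursion descends once per unit of $m$.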


Now let us come back to the sequences with which we started this discussion. Let $T_k = S^s_k(2)$ for $k = 0, 1, 2, \ldots$, and let $n_k = \|T_k\|$. Then, applying (4.38), (4.39), and (1.2),
\[ n_k = N^s_k(2) \le \hat A_k(2) \le A_{k+1}(8) \le A_{k+1}\bigl(A(k+2)\bigr) = A(k+3). \]
Therefore, $k \ge \alpha(n_k) - 3$. Substituting into (4.36), applying (4.35), and letting $t = s/2 - 1$,
\[ |T_k| = n_k \cdot \mu_s(k) \ge n_k \cdot \mu_s\bigl(\alpha(n_k) - 3\bigr) \ge n_k \cdot 2^{(1/t!)\,\alpha(n_k)^t - O(\alpha(n_k)^{t-1})}. \]
We have thus achieved the desired lower bound on $\lambda_s(n)$ for $n$ of the form $n = n_k$. As in Section 4.5, interpolating to intermediate values of $n$ (for $n_k \le n < n_{k+1}$) is straightforward, and we obtain the desired bound for all $n$.

4.6.4 Advantages over the previous construction

The construction we just presented follows the same basic idea as the previous construction of Agarwal et al. [2, 45], but it has the following advantages:

• In our construction each block is just a sequence of $m$ distinct symbols. In the previous construction each block (there called a "fan") is of the form $1\,2 \ldots m \ldots 2\,1$.

• In our construction all symbols have the same exact multiplicity. This greatly simplifies calculations.

• In our construction there are no adjacent repeated symbols at the interface between blocks. (Removing these adjacent repetitions in the previous construction does not present any serious problem, but they constitute a small aesthetic blemish in our opinion.)

• The previous construction involves some "tiny" duplications of symbols, which our construction does not have. These duplications are not the cause of the asymptotic growth (and indeed, our construction works fine without them). This is a potential source of confusion, especially since these "tiny" duplications are also present in the lower-bound construction for order-3 sequences, and in that case they are critical.

4.6.5 Lower bound for ADS sequences, s ≥ 4 even

As was the case with the construction of order 3, the construction described in this section yields lower bounds for $\Pi^s_k(m)$, for $s \ge 4$ even. Again, the idea is to look at the rows of the construction, namely at $S^s_k(m)$ for fixed $s$ and $k$.

Theorem 4.37. For every fixed even $s \ge 4$ and every $k \ge 4$ we have
\[ \Pi^s_\mu(x) \ge x\,\alpha_k(x) \quad \text{for all large enough } x, \]
for some $\mu$ asymptotically of the form
\[ \mu \ge 2^{(1/t!)k^t - O(k^{t-1})}, \]
where $t = s/2 - 1$. Moreover, these lower bounds can be achieved by actual Davenport–Schinzel sequences.

The proof is similar to the proof of Theorem 4.33, though somewhat simpler, since the blocks in $S^s_k(m)$ have uniform length. We omit the details.

As before, Theorem 4.37 automatically yields lower bounds for $\Pi'_{r,s,k}(m)$ for odd $s \ge 5$.

It is an open problem whether the lower bounds for the case $s = 3$ shown above (Section 4.5.2) can be achieved with actual Davenport–Schinzel sequences (without adjacent repeated symbols), as was the case here.

4.7 Conclusions

The bounds for $\lambda_s(n)$ are now tight for every even $s$, up to lower-order terms in the exponent. Unfortunately, for odd $s \ge 5$ the problem is still not completely solved. We believe the new upper bounds for odd $s$ are the true bounds, simply by analogy to the interval-chain bounds. But the construction that gives the lower bounds does not seem to work when $s$ is odd.

The reason we can unambiguously talk about the coefficient that multiplies $\alpha(n)$ (e.g., in Theorems 1.9 and 1.11), despite the fact that there are several different versions of $\alpha(n)$ in the literature, is that all these versions differ from one another by at most an additive constant. Thus, the coefficient multiplying $\alpha(n)$ is not affected. On the other hand, one cannot talk about the leading coefficient in $\lambda_4(n) = \Theta\bigl(n \cdot 2^{\alpha(n)}\bigr)$, for example, unless a standard definition of $\alpha(n)$ is agreed upon.

Can our lower-bound construction for $\lambda_3(n)$ (Section 4.5) be realized as the lower envelope of segments in the plane? If so, it would yield a factor-of-2 improvement for this problem as well.

Chapter 5 Selection lemmas

In this chapter we prove our results on the first and second selection lemmas. This chapter is somewhat unrelated to the other chapters, but some of our results rely on the stretched grid and related notions which we introduced in Chapter 2.

5.1 The first selection lemma

We start by proving Theorem 1.17, by showing that no point in $\mathbb{R}^d$ is contained in more than $(n/(d+1))^{d+1} + O(n^{d+1-1/d})$ $d$-simplices spanned by the stretched grid $G_s(m) \subset \mathbb{R}^d$, where $n = m^d = |G_s|$.

Proof of Theorem 1.17. Let $x$ be a fixed point in $\mathbb{R}^d$. There are at most $O(m^{d-1}) = O(n^{1-1/d})$ points of $G_s$ which are "bad" in the sense that they are not far apart from $x$. These points participate in at most $O(n^{d+1-1/d})$ simplices spanned by $G_s$.

Now consider the remaining "good" simplices, i.e., those spanned by the "good" points of $G_s$. By Lemmas 2.2 and 2.1, the good points of $G_s$ can be partitioned into $d + 1$ disjoint subsets $A_0, \ldots, A_d$ according to their type with respect to $x$, and then a good simplex contains $x$ if and only if each of its vertices belongs to a different subset $A_i$. Thus, there are $t = |A_0| \cdots |A_d|$ good simplices containing $x$, where $|A_0| + \cdots + |A_d| \le n$. By the arithmetic-geometric mean inequality, $t$ is maximized when all the $A_i$'s have equal size $\le n/(d+1)$, and so $t \le (n/(d+1))^{d+1}$.

Note that, by the same argument, no point in $\mathbb{R}^d$ is contained in more than $(n/(d+1))^{d+1} + O(n^d)$ $d$-simplices spanned by $D_s$, the diagonal of the stretched grid (and in this case we obtain a better error term). Hence, the upper bound in Theorem 1.17 can also be attained by point sets in convex position (see fn. 1 on p. 6).
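The arithmetic-geometric mean step in the proof of Theorem 1.17 can be illustrated by brute force for small values. The Python sketch below (the function name `best_split` is ours) maximizes the product $|A_0| \cdots |A_d|$ over all ways of splitting $n$ points into $d + 1$ classes and compares it with the balanced split and with the bound $(n/(d+1))^{d+1}$.

```python
def best_split(n: int, parts: int) -> int:
    """Maximum of a_1 * ... * a_parts over nonnegative integers summing to n."""
    best = 0

    def rec(remaining: int, left: int, current: int) -> None:
        nonlocal best
        if left == 1:
            best = max(best, current * remaining)
            return
        for a in range(remaining + 1):
            rec(remaining - a, left - 1, current * a)

    rec(n, parts, 1)
    return best


if __name__ == "__main__":
    for n, parts in [(12, 3), (13, 3), (10, 4)]:
        q, r = divmod(n, parts)
        balanced = (q + 1) ** r * q ** (parts - r)
        print(n, parts, best_split(n, parts), balanced, (n / parts) ** parts)
```

In the planar case $d = 2$ with $n = 13$, for instance, the best split is $4 \cdot 4 \cdot 5 = 80$, just below $(13/3)^3 \approx 81.4$.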

5.2 The second selection lemma

The upper bound

Next we prove Theorem 1.19 by constructing, for every $n^{2.5} \log n < t \le n^3$, a set $T$ of $t$ triangles with vertices in the planar $n$-point stretched grid $G_s(\sqrt{n}) \subset \mathbb{R}^2$, such that no point in the plane is contained in more than
\[ O\Bigl(\frac{t^2}{n^3 \log(n^3/t)}\Bigr) \]
triangles of $T$.

Proof of Theorem 1.19. Let $m = \sqrt{n}$, and let us now write $G_s(m) = \{x_1, \ldots, x_m\} \times \{y_1, \ldots, y_m\}$. Let $\rho \in (0, 1]$ be a parameter, which we will later determine in terms of $n$ and $t$.

Let $p_1 = (x_{i_1}, y_{j_1})$, $p_2 = (x_{i_2}, y_{j_2})$, $p_3 = (x_{i_3}, y_{j_3})$ be three distinct points of $G_s(m)$. Let us call the triangle $\Delta = p_1 p_2 p_3$ increasing if $i_1 < i_2 < i_3$ and $j_1 < j_2 < j_3$. Let us define the horizontal dimensions of $\Delta$ as $h_{12} := i_2 - i_1$ and $h_{23} := i_3 - i_2$, and the vertical dimensions as $v_{12} := j_2 - j_1$ and $v_{23} := j_3 - j_2$. We define $T$ as the set of all increasing triangles $\Delta$ as above that satisfy
\[ \tfrac{1}{3} m \le i_2, j_2 \le \tfrac{2}{3} m; \qquad h_{12}, h_{23}, v_{12}, v_{23} \le \tfrac{1}{3} m; \qquad h_{12} v_{23} \le \rho n. \]
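The third condition restricts the pair $(h_{12}, v_{23})$ to lattice points of the square $[0, m/3]^2$ lying below the hyperbola $xy = \rho n$. As a rough numerical illustration, the Python sketch below counts such pairs exactly and prints the count next to $\rho n \ln(1/\rho)$. The function name `count_pairs` and the choice of passing the integer threshold `rho_n` (standing for $\rho n$) are ours; the sketch only illustrates the order of magnitude for small $\rho$ and says nothing about constants.

```python
from math import log


def count_pairs(m: int, rho_n: int) -> int:
    """Number of integer pairs (h, v) with 1 <= h, v <= m/3 and h * v <= rho_n."""
    side = m // 3
    # For a fixed h, the admissible v are 1, ..., min(side, floor(rho_n / h)).
    return sum(min(side, rho_n // h) for h in range(1, side + 1))


if __name__ == "__main__":
    m = 300
    n = m * m
    for rho_n in (90, 900, 9000):
        rho = rho_n / n
        print(rho, count_pairs(m, rho_n), round(rho_n * log(1 / rho)))
```

When $\rho n \ge (m/3)^2$ the hyperbola no longer cuts the square, and the count is simply $(m/3)^2$.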

The last condition may look mysterious but it will be explained soon. However, first we bound $|T|$ from below, which is routine. An increasing triangle $\Delta$ is determined by $p_2$ and by its horizontal and vertical dimensions. Each of $i_2$, $j_2$, $h_{23}$, $v_{12}$ can be chosen independently in $m/3$ ways. The pair $(h_{12}, v_{23})$ can then be chosen, independent of the previous choices, as a lattice point lying in the square $[0, m/3]^2$ and below the hyperbola $xy = \rho n$, and one can easily calculate (by integration, say) that the number of choices is of order $\rho n \log\frac{1}{\rho}$. Thus $|T| = \Omega\bigl(n^3 \rho \log\frac{1}{\rho}\bigr)$, and thus for $\rho := Ct/(n^3 \log(n^3/t))$ with a sufficiently large constant $C$ we obtain $|T| \ge t$ as needed. (Actually, the above calculation of integer points under the hyperbola is valid only if $\rho$ is not too small compared to $m$, but the assumptions of the theorem and our choice of $\rho$ guarantee $\rho = \Omega(1/m)$.)

Let us fix an arbitrary point $q$ in the plane. It remains to bound from above the number of triangles $\Delta \in T$ containing $q$. To this end, we partition the triangles in $T$ into classes according to their horizontal and vertical dimensions; let $T(h_{12}, h_{23}, v_{12}, v_{23})$ be one of these classes. The total number of triangles in such an equivalence class equals the number of choices of $p_2$, so it is $\Theta(n)$. We want to show that only $O(\rho n)$ of them contain $q$.

By Lemma 2.2, $q \in \Delta$ may hold only if $q \in \mathrm{stconv}\{p_1, p_2, p_3\}$ or if $q$ is not far apart from at least one of $p_1$, $p_2$, $p_3$. If, say, $p_2$ is not far apart from $q$, then its position is restricted to two rows or two columns of the grid, and similarly for $p_1$ and $p_3$. Thus, there are only $O(m)$ choices for $\Delta$.

It remains to deal with the case $q \in \mathrm{stconv}\{p_1, p_2, p_3\}$. The stair-convex hull of the vertex set of a triangle $\Delta \in T(h_{12}, h_{23}, v_{12}, v_{23})$ is depicted in Figure 5.1 (the picture actually shows the image under $\pi$ in the uniform grid). It contains $h_{12} v_{23} + O(m) \le \rho n + O(m)$ grid points, and thus there are at most $\rho n + O(m) = O(\rho n)$ placements of $p_2$ such that the stair-convex hull of the vertex set contains $q$ (here we used the assumption $t > n^{2.5} \log n$).

Figure 5.1: The stair-convex hull of the vertex set of a triangle in $T(h_{12}, h_{23}, v_{12}, v_{23})$.

So in every equivalence class of the triangles of $T$ only an $O(\rho)$ fraction of triangles contain $q$. Thus $q$ lies in no more than
\[ O(\rho |T|) = O\Bigl(\frac{t^2}{n^3 \log(n^3/t)}\Bigr) \]
triangles of $T$, as claimed.

The lower bound

Finally, we prove Theorem 1.18, which states that for every planar $n$-point set $S$ and every family $T$ of $t$ triangles spanned by $S$ there exists a point in the plane which is contained in
\[ \Omega\Bigl(\frac{t^3}{n^6 \log^2 n}\Bigr) \tag{5.1} \]


triangles of $T$. As we wrote in the Introduction, Eppstein [22] claimed this result, but there is a flaw in his argument.¹ (We remind the reader that this constitutes an improvement of a $\log^3 n$ factor over the old lower bound of [8].) Our proof here is based on a small modification of Eppstein's argument.

Proof of Theorem 1.18. We assume that $t = \Omega(n^2 \log^{2/3} n)$, since otherwise the bound (5.1) is trivial. The proof relies on the following two one-dimensional selection lemmas (by Aronov et al. [8]):

Lemma 5.1 (Unweighted Selection Lemma). Let $V$ be a set of $n$ points on the real line, and let $E$ be a set of $t$ distinct intervals with endpoints in $V$. Then there exists a point $x$ lying in the interior of $\Omega(t^2/n^2)$ intervals of $E$.

Lemma 5.2 (Weighted Selection Lemma). Let $V$ be a set of $n$ points on the real line, and let $E$ be a multiset of $t$ intervals with endpoints in $V$. Then there exists a multiset $E' \subseteq E$ of $t'$ intervals, having as endpoints a subset $V' \subseteq V$ of $n'$ points, such that all the intervals of $E'$ contain a common point $x$ in their interior, and such that
\[ \frac{t'}{n'} = \Omega\Bigl(\frac{t}{n \log n}\Bigr). \]

The proof of the desired bound (5.1) proceeds as follows: Assume without loss of generality that no two points of $S$ have the same $x$-coordinate. For each triangle in $T$ define its base to be the edge with the longest $x$-projection. For each pair of points $a, b \in S$, let $T_{ab}$ be the set of triangles in $T$ that have $ab$ as base, and let $t_{ab} = |T_{ab}|$. (Thus, $\sum_{ab} t_{ab} = t$.)

Discard all sets $T_{ab}$ for which $t_{ab} < t/n^2$. We discarded at most $\binom{n}{2} t/n^2 < t/2$ triangles, so we are left with a subset $T'$ of at least $t/2$ triangles, such that either $t_{ab} = 0$ or $t_{ab} \ge t/n^2$ for each base $ab$.²

Partition the bases into a logarithmic number of subsets $E_1, E_2, \ldots, E_k$ for $k = \log_4(n^3/t)$, so that each $E_j$ contains all the bases $ab$ for which
\[ \frac{4^{j-1} t}{n^2} \le t_{ab} < \frac{4^j t}{n^2}. \tag{5.2} \]
Let $T_j = \bigcup_{ab \in E_j} T_{ab}$ denote the set of triangles with bases in $E_j$, and $t_j = |T_j|$ denote their number.
There must exist an index $j$ for which $t_j \ge 2^{-(j+1)} t$, since otherwise the total number of triangles in $T'$ would be less than $t/2$. From now on we fix this $j$, and work only with the bases in $E_j$ and the triangles in $T_j$.

¹ The very last sentence in the proof of Theorem 4 (Section 4) in [22] reads: “So $\varepsilon = 1/2^{i+1}$, and $x = m/y = O(m/8^i)$, from which it follows that $x/\varepsilon^3 = O(n^2)$.” This is patently false, since what actually follows is that $x/\varepsilon^3 = O(m)$, and the entire argument falls through.
² This critical discarding step is missing in [22], and that is why the proof there does not work.

For each pair of triangles $abc$, $abd$ having the same base $ab \in E_j$, project the segment $cd$ into the $x$-axis, obtaining segment $c'd'$. We thus obtain a multiset $M_0$ of horizontal segments, with
\[ |M_0| \ge \frac{t_j}{2}\Bigl(\frac{4^{j-1} t}{n^2} - 1\Bigr) = \Omega\Bigl(\frac{2^j t^2}{n^2}\Bigr). \]
(Each of the $t_j$ triangles in $T_j$ is paired with all other triangles sharing the same base, and each such pair is counted twice.)

We now apply the Weighted Selection Lemma (Lemma 5.2) to $M_0$, obtaining a multiset $M_1$ of segments delimited by $n_1$ distinct endpoints, all segments containing some point $z_0$ in their interior, with
\[ \frac{|M_1|}{n_1} = \Omega\Bigl(\frac{|M_0|}{n \log n}\Bigr) = \Omega\Bigl(\frac{2^j t^2}{n^3 \log n}\Bigr). \]

Let $\ell$ be the vertical line passing through $z_0$. For each horizontal segment $c'd' \in M_1$, each of its (possibly multiple) instances in $M_1$ originates from a pair of triangles $abc$, $abd$, where points $a$ and $c$ lie to the left of $\ell$, and points $b$ and $d$ lie to the right of $\ell$. Let $p$ be the intersection of $\ell$ with $ad$, and let $q$ be the intersection of $\ell$ with $bc$. Then, $pq$ is a vertical segment along $\ell$, contained in the union of the triangles $abc$, $abd$ (see Figure 5.2).

Figure 5.2: Pairing two triangles with a common base.

Let $M_2$ be the set of all these segments $pq$ for all $c'd' \in M_1$. Observe that $|M_2| = |M_1|$ when the elements of $M_1$ are counted with multiplicity. Note that the vertical segments in $M_2$ are all distinct, since each such segment $pq$ uniquely determines the originating points $a$, $b$, $c$, $d$ (assuming $z_0$ was chosen in general position, and that the input set is also in general position).


Let $n_2$ be the number of endpoints of the segments in $M_2$. We have $n_2 \le n n_1$, since each endpoint (such as $p$) is uniquely determined by one of $n_1$ "inner" vertices (such as $d$) and one of at most $n$ "outer" vertices (such as $a$).

Next, apply the Unweighted Selection Lemma (Lemma 5.1) to $M_2$, obtaining a point $x_0 \in \ell$ that is contained in
\[ \Omega\Bigl(\frac{|M_2|^2}{n_2^2}\Bigr) = \Omega\Bigl(\frac{1}{n^2}\Bigl(\frac{|M_1|}{n_1}\Bigr)^2\Bigr) = \Omega\Bigl(\frac{4^j t^4}{n^8 \log^2 n}\Bigr) \]
segments in $M_2$. Thus, $x_0$ is contained in at least this many unions of pairs of triangles of $T_j$. But by (5.2), each triangle in $T_j$ participates in at most $4^j t/n^2$ pairs. Therefore, $x_0$ is contained in
\[ \Omega\Bigl(\frac{t^3}{n^6 \log^2 n}\Bigr) \]
triangles of $T_j$.
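The quantity bounded in the Unweighted Selection Lemma (Lemma 5.1) is easy to compute by brute force on small inputs. The Python sketch below (the function name `max_stabbing_depth` is ours) finds the largest number of intervals sharing a common interior point. It relies only on the observation that this depth is constant between consecutive points of $V$, so testing one midpoint per gap suffices; it illustrates the statement and is not part of the proof.

```python
from itertools import combinations


def max_stabbing_depth(points, intervals):
    """Largest number of intervals (a, b) whose interiors share a common point."""
    xs = sorted(set(points))
    best = 0
    # An interval containing a point of V in its interior also contains the
    # adjacent gaps, so midpoints of consecutive points are enough.
    for left, right in zip(xs, xs[1:]):
        x = (left + right) / 2
        depth = sum(1 for a, b in intervals if a < x < b)
        best = max(best, depth)
    return best


if __name__ == "__main__":
    points = list(range(6))
    intervals = list(combinations(points, 2))        # all 15 intervals
    n, t = len(points), len(intervals)
    print(max_stabbing_depth(points, intervals), t * t / (n * n))   # 9 6.25
```

With all $\binom{6}{2} = 15$ intervals on 6 points, the midpoint of the middle gap lies in $3 \cdot 3 = 9$ intervals, to be compared with $t^2/n^2 = 6.25$.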

5.3 Conclusions

In Chapter 2 and the present chapter of this thesis we have seen several applications of the stretched grid. In fact, we know there are more: Boris Bukh (personal communication) has shown that no line in $\mathbb{R}^3$ intersects more than $n^3/25 + o(n^3)$ triangles spanned by $G_s(\sqrt[3]{n}) \subset \mathbb{R}^3$. (This is shown using the stronger version of Lemma 2.2 mentioned in Section 2.1.1, and a tedious calculation.) This bound is known to be tight, as we have shown (Bukh et al. [14]) that for every $n$-point set $S \subset \mathbb{R}^3$ there exists a line in $\mathbb{R}^3$ that stabs at least $n^3/25 - O(n^2)$ triangles spanned by $S$.

We would like to generalize this result, and calculate the maximum number of $j$-simplices spanned by $G_s \subset \mathbb{R}^d$ that can be stabbed by a $k$-flat, for general $j$, $k$, and $d$. Unfortunately, Bukh's calculation seems hard to generalize.

One can compare the lower bound (5.1) for the planar second selection lemma with the following result in $\mathbb{R}^3$: If $T$ is a set of $t$ triangles spanned by an $n$-point set $S \subset \mathbb{R}^3$, there exists a line (specifically, a line determined by two points of $S$) that stabs $\Omega(t^3/n^6)$ triangles of $T$ (see Dey and Edelsbrunner [20] and Smorodinsky [47] for two different proofs of this fact). It might turn out that this logarithmic gap between the two cases is an artifact of the current proofs, but we believe that the three-dimensional problem does have a larger bound than the planar one.

Bibliography

[1] R. Adamec, M. Klazar, and P. Valtr. Generalized Davenport–Schinzel sequences with linear upper bound. Discrete Math., 108:219–229, 1992.
[2] P. K. Agarwal, M. Sharir, and P. Shor. Sharp upper and lower bounds for the length of general Davenport–Schinzel sequences. J. Comb. Theory, Ser. A, 52:228–274, 1989.
[3] N. Alon, I. Bárány, Z. Füredi, and D. J. Kleitman. Point selections and weak ε-nets for convex hulls. Combin. Probab. Comput., 1:189–200, 1992.
[4] N. Alon and E. Friedgut. On the number of permutations avoiding a given pattern. J. Comb. Theory, Ser. A, 89:133–140, 2000.
[5] N. Alon, H. Kaplan, G. Nivasch, M. Sharir, and S. Smorodinsky. Weak ε-nets and interval chains. J. ACM, 55, article 28, 32 pages, 2008.
[6] N. Alon and D. Kleitman. Piercing convex sets and the Hadwiger–Debrunner (p, q)-problem. Adv. Math., 96(1):103–112, 1992.
[7] N. Alon and B. Schieber. Optimal preprocessing for answering on-line product queries. Technical Report 71/87, The Moise and Frida Eskenasy Institute of Computer Science, Tel Aviv University, 1987.
[8] B. Aronov, B. Chazelle, H. Edelsbrunner, L. J. Guibas, M. Sharir, and R. Wenger. Points and triangles in the plane and halving planes in space. Discrete Comput. Geom., 6:435–442, 1991.
[9] I. Bárány. A generalization of Carathéodory’s theorem. Discrete Math., 40:141–152, 1982.
[10] I. Bárány, Z. Füredi, and L. Lovász. On the number of halving planes. Combinatorica, 10:175–183, 1990.


[11] E. Boros and Z. Füredi. Su un teorema di Kárteszi nella geometria combinatoria. Archimede, 2:71–76, 1977.
[12] E. Boros and Z. Füredi. The number of triangles covering the center of an n-set. Geom. Dedicata, 17:69–77, 1984.
[13] B. Bukh, J. Matoušek, and G. Nivasch. Lower bounds for weak epsilon-nets and stair-convexity. Israel J. Math., to appear. Extended abstract in Proc. 25th Annu. Sympos. Comput. Geom. (SoCG’09) (Århus, Denmark), pp. 1–10, ACM, 2009.
[14] B. Bukh, J. Matoušek, and G. Nivasch. Stabbing simplices by points and flats. Discrete Comput. Geom., to appear.
[15] A. K. Chandra, S. Fortune, and R. Lipton. Unbounded fan-in circuits and associative functions. J. Comput. Syst. Sci., 30:222–234, 1985.
[16] B. Chazelle, H. Edelsbrunner, M. Grigni, L. Guibas, M. Sharir, and E. Welzl. Improved bounds on weak ε-nets for convex sets. Discrete Comput. Geom., 13:1–15, 1995.
[17] B. Chazelle and B. Rosenberg. The complexity of computing partial sums off-line. Int. J. Comput. Geom. Appl., 1:33–45, 1991.
[18] A. Condon and M. Saks. A limit theorem for sets of stochastic matrices. Linear Algebra Appl., 381:61–76, 2004.
[19] H. Davenport and A. Schinzel. A combinatorial problem connected with differential equations. American J. Math., 87:684–694, 1965.
[20] T. K. Dey and H. Edelsbrunner. Counting triangle crossings and halving planes. Discrete Comput. Geom., 12(1):281–289, 1994.
[21] D. Dolev, C. Dwork, N. Pippenger, and A. Wigderson. Superconcentrators, generalizers and generalized connectors with limited depth. In Proc. 15th Annu. ACM Sympos. Theory Comput. (STOC’83), pages 42–51, 1983.
[22] D. Eppstein. Improved bounds for intersecting triangles and halving planes. J. Combin. Theory Ser. A, 62:176–182, 1993.
[23] S. Hart and M. Sharir. Nonlinearity of Davenport–Schinzel sequences and of generalized path compression schemes. Combinatorica, 6:151–177, 1986.


[24] D. Haussler and E. Welzl. ε-nets and simplex range queries. Discrete Comput. Geom., 2:127–151, 1987.
[25] F. Kárteszi. Extremalaufgaben über endliche Punktsysteme. Publ. Math. Debrecen, 4:16–27, 1955.
[26] M. Klazar. A general upper bound in extremal theory of sequences. Comment. Math. Univ. Carol., 33:737–746, 1992.
[27] M. Klazar. On the maximum lengths of Davenport–Schinzel sequences. In R. Graham et al., editors, Contemporary Trends in Discrete Mathematics, volume 49 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 169–178. Amer. Math. Soc., Providence, RI, 1999.
[28] M. Klazar. Generalized Davenport–Schinzel sequences: Results, problems, and applications. Integers, 2:A11, 39 pp., 2002. Electronic.
[29] P. Komjáth. A simplified construction of nonlinear Davenport–Schinzel sequences. J. Comb. Theory, Ser. A, 49:262–267, 1988.
[30] E. Magazanik and M. A. Perles. Staircase connected sets. Discrete Comput. Geom., 37:587–599, 2007.
[31] A. Marcus and G. Tardos. Excluded permutation matrices and the Stanley–Wilf conjecture. J. Comb. Theory, Ser. A, 107:153–160, 2004.
[32] J. Matoušek. Geometric Discrepancy (An Illustrated Guide). Springer-Verlag, Berlin, 1999.
[33] J. Matoušek. Lectures on Discrete Geometry. Springer-Verlag, New York, 2002.
[34] J. Matoušek. A lower bound for weak ε-nets in high dimension. Discrete Comput. Geom., 28:45–48, 2002.
[35] J. Matoušek and U. Wagner. New constructions of weak ε-nets. Discrete Comput. Geom., 32:195–206, 2004.
[36] J. W. Moon. Topics on Tournaments. Holt, Rinehart and Winston, New York, 1968.
[37] G. Nivasch. Improved bounds and new techniques for Davenport–Schinzel sequences and their generalizations. J. ACM, submitted. Extended abstract in Proc. 20th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA’09) (New York, NY), pp. 1–10, ACM and SIAM, 2009.


[38] G. Nivasch. An improved, simple construction of many halving edges. In J. E. Goodman et al., editors, Surveys on Discrete and Computational Geometry: Twenty Years Later, volume 453 of Contemporary Mathematics, pages 299–305. Amer. Math. Soc., Providence, RI, 2008.
[39] G. Nivasch and M. Sharir. Eppstein’s bound on intersecting triangles revisited. J. Combin. Theory Ser. A, 116:494–497, 2009.
[40] S. Pettie. Splay trees, Davenport-Schinzel sequences, and the deque conjecture. In Proc. 19th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA’08), pages 1115–1124, 2008.
[41] P. Pudlák. Communication in bounded depth circuits. Combinatorica, 14:203–216, 1994.
[42] K. F. Roth. On irregularities of distribution. Mathematika, 1:73–79, 1954.
[43] R. Seidel. Understanding the inverse Ackermann function. PDF presentation, 2006. Available at http://cgi.di.uoa.gr/~ewcg06/invited/Seidel.pdf.
[44] M. Sharir. Almost linear upper bounds on the length of general Davenport–Schinzel sequences. Combinatorica, 7:131–143, 1987.
[45] M. Sharir and P. K. Agarwal. Davenport-Schinzel Sequences and Their Geometric Applications. Cambridge University Press, 1995.
[46] P. Shor. Geometric realization of superlinear Davenport–Schinzel sequences I: Line segments. Unpublished manuscript, 1990.
[47] S. Smorodinsky. Combinatorial problems in computational geometry. PhD thesis, Tel Aviv University, June 2003. http://www.cs.bgu.ac.il/~shakhar/my_papers/phd.ps.gz.
[48] R. Sundar. On the deque conjecture for the splay algorithm. Combinatorica, 12:95–124, 1992.
[49] P. Valtr. Generalizations of Davenport–Schinzel sequences. In R. Graham et al., editors, Contemporary Trends in Discrete Mathematics, volume 49 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 349–389. Amer. Math. Soc., Providence, RI, 1999.


[50] U. Wagner. On k-Sets and Applications. PhD thesis, ETH Zürich, June 2003. Available at http://www.inf.ethz.ch/~emo/DoctThesisFiles/wagner03.pdf.
[51] A. Wiernik and M. Sharir. Planar realizations of nonlinear Davenport–Schinzel sequences by segments. Discrete Comput. Geom., 3:15–47, 1988.
[52] A. C. Yao. Space-time tradeoff for answering range queries. In Proc. 14th Annu. ACM Sympos. Theory Comput. (STOC’82), pages 128–136, 1982.
[53] R. T. Živaljević. Topological methods. In J. E. Goodman and J. O’Rourke, editors, Handbook of Discrete and Computational Geometry, chapter 14, pages 305–329. Chapman & Hall/CRC, second edition, 2004.
[54] R. T. Živaljević and S. T. Vrećica. The colored Tverberg’s problem and complexes of injective functions. J. Combin. Theory Ser. A, 61:309–318, 1992.


Appendix A
Comparing Ackermann-like functions

In this appendix we present a general technique for proving that variants of the Ackermann hierarchy exhibit equivalent rates of growth. We first give the lemma on which the technique is based, and then we illustrate the technique with some examples.

We consider the following general setting. Suppose $F(n), G(n) : \mathbb{N} \to \mathbb{R}^+$ are nondecreasing functions that satisfy $F(n), G(n) > n$ for all $n$. Define functions $F^\circ(n)$, $G^\circ(n)$ by
\[ F^\circ(n) = F^{(n)}(F_0), \qquad G^\circ(n) = G^{(n)}(G_0), \]
with some initial conditions $F_0$, $G_0$. (Recall that $f^{(n)}$ denotes the $n$-fold composition of $f$.) We want to prove that $F^\circ(n) \le G^\circ(dn + c)$ for some constants $d$ and $c$. The following lemma gives a sufficient condition for this.

Lemma A.1. Let $F(n)$, $G(n)$, $F^\circ(n)$, $G^\circ(n)$ be functions as given above. Suppose there exist an integer $d$ and a function $\delta(n)$ such that
\[ n \le \delta(n), \tag{A.1} \]
\[ \delta(F(n)) \le G^{(d)}(\delta(n)), \tag{A.2} \]
for all $n \ge 1$. Then $F^\circ(n) \le G^\circ(dn + c)$ for a constant $c$ large enough so that
\[ \delta(F_0) \le G^\circ(c). \tag{A.3} \]

Proof. Applying (A.1), then (A.2) $n$ times (using that $G$ is nondecreasing), and then (A.3),
\[ F^\circ(n) = F^{(n)}(F_0) \le \delta\bigl(F^{(n)}(F_0)\bigr) \le G^{(dn)}\bigl(\delta(F_0)\bigr) \le G^{(dn)}\bigl(G^\circ(c)\bigr) = G^{(dn+c)}(G_0) = G^\circ(dn + c). \]
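As a numerical sanity check of Lemma A.1, the following Python sketch tests hypotheses (A.1)–(A.3) and the conclusion on one toy instance. The instance is an assumption made for illustration: $F(n) = 3^{n^2}$, $G(n) = 2^n$, $\delta(n) = 8n^3$, $d = 1$, $c = 3$, $F_0 = G_0 = 1$, which is how the simple example of this appendix is read here. The helper names (`iterate`, `F_circ`, `G_circ`) are hypothetical, and only small arguments are tried, so this illustrates the lemma rather than proving anything.

```python
def iterate(f, times, start):
    """Return the `times`-fold composition of f applied to `start`."""
    x = start
    for _ in range(times):
        x = f(x)
    return x

# Toy instance (an assumption for illustration, not part of the lemma).
def F(n): return 3 ** (n * n)
def G(n): return 2 ** n
def delta(n): return 8 * n ** 3

d, c = 1, 3
F0 = G0 = 1

def F_circ(n): return iterate(F, n, F0)   # F°(n) = F^(n)(F0)
def G_circ(n): return iterate(G, n, G0)   # G°(n) = G^(n)(G0)

# Hypotheses (A.1) and (A.2), checked for n = 1..6 only.
cond_a1 = all(n <= delta(n) for n in range(1, 7))
cond_a2 = all(delta(F(n)) <= iterate(G, d, delta(n)) for n in range(1, 7))
# Choice of c, condition (A.3).
cond_a3 = delta(F0) <= G_circ(c)
# Conclusion F°(n) <= G°(dn + c); F°(3) is already too large to compute.
conclusion = all(F_circ(n) <= G_circ(d * n + c) for n in (1, 2))

print(cond_a1, cond_a2, cond_a3, conclusion)
```

Python's arbitrary-precision integers make the comparisons exact; the range of $n$ is kept small because $F^\circ$ and $G^\circ$ grow like towers.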


Let us see a simple example of the use of this lemma. Let $F(n) = 3^{n^2}$ and $F_0 = 1$; then $F^\circ(n) = F^{(n)}(1)$ is a “tower”-like function. How fast does it grow compared to $A_3(n)$? Let us apply the lemma with $G(n) = 2^n$, $G_0 = 1$, $G^\circ(n) = A_3(n)$. We can take $d = 1$ and $\delta(n) = 8n^3$, since then we have
\[ \delta(F(n)) = 8 \cdot 3^{3n^2} \le 2^{8n^3} = G(\delta(n)) \]
for all $n \ge 1$. We conclude that $F^\circ(n) \le A_3(n + 3)$ for all $n$.

Now let us see a real application; let us bound the function $\hat{A}_k(m)$ of Section 4.6.3.

Lemma A.2. Let $\hat{A}_k(m)$ be given by
\[ \hat{A}_0(m) = m, \qquad \text{for } m \ge 1; \]
and
\[ \hat{A}_k(m) = \begin{cases} 1, & \text{if } m = 1; \\ m \cdot \hat{A}_{k-1}\Bigl(\hat{A}_{k-1}\bigl(2^{2^k} \hat{A}_k(m-1)\bigr)\Bigr), & \text{otherwise}; \end{cases} \tag{A.4} \]
for $k \ge 1$. Then,
\[ \hat{A}_k(m) \le A_{k+1}(2m + 4) \tag{A.5} \]
for all $k \ge 2$ and all $m$.

(The factor 2 multiplying $m$ in (A.5) is due to the two-fold application of $\hat{A}_{k-1}$ in (A.4).)

Proof. We start by noting that
\[ \hat{A}_1(m) = 2^{2m-2}\, m! \le 2^{m^2}. \tag{A.6} \]

Unfortunately the recurrence (A.4) does not fit the general setting of Lemma A.1 because of the factor $m$ in it. But it is not hard to show that
\[ \hat{A}_k(m) \le \hat{A}_{k-1}\Bigl(\hat{A}_{k-1}\bigl(2^{2^k} \hat{A}_k(m-1)\bigr)\Bigr)^2 \qquad \text{for } m \ge 2, \]
so we will use this recurrence instead (the penalty we pay is minimal). We are going to apply Lemma A.1 with $d = 2$, with
\[ F(m) = \hat{A}_{k-1}\Bigl(\hat{A}_{k-1}\bigl(2^{2^k} m\bigr)\Bigr)^2, \tag{A.7} \]
\[ G(m) = A_k(m), \]
and with the initial conditions $F_0 = G_0 = 1$. Thus,
\[ F^\circ(m) \ge \hat{A}_k(m), \qquad G^\circ(m) = A_{k+1}(m). \]

Let us start with the case $k = 2$. In this case we have, by (A.6),
\[ F(m) = \hat{A}_1\bigl(\hat{A}_1(16m)\bigr)^2 \le 2^{2^{512m^2+1}}, \qquad G(m) = 2^m. \]
Then an appropriate choice of $\delta$ is $\delta(m) = 600m^3$, since
\[ \delta(F(m)) \le \delta\Bigl(2^{2^{512m^2+1}}\Bigr) = 600 \cdot 2^{3 \cdot 2^{512m^2+1}} \le 2^{2^{600m^3}} = G(G(\delta(m))) \]
for all $m \ge 1$, and so $\delta$ satisfies (A.2). Further, it is enough to take $c = 4$ in (A.3), since
\[ G^\circ(4) = 2^{2^{2^2}} = 65536 \ge 600 = \delta(F_0). \]
We conclude that $\hat{A}_2(m) \le A_3(2m + 4)$.

Now we deal with the general case $k \ge 3$. Suppose by induction that $\hat{A}_{k-1}(m) \le A_k(2m + 4)$. Substituting this into (A.7),
\[ F(m) \le A_k\Bigl(2 A_k\bigl(2^{2^k+1} m + 4\bigr) + 4\Bigr)^2. \]
Now it is easy to see that taking $\delta(m) = 2^{2^k+1} m + 5$ guarantees that
\[ \delta(F(m)) \le A_k(A_k(\delta(m))) = G(G(\delta(m))) \]
for all $m \ge 1$. Furthermore, we have
\[ G^\circ(4) = A_{k+1}(4) > 2^{2^k+1} + 5 = \delta(F_0). \]
We conclude that $\hat{A}_k(m) \le A_{k+1}(2m + 4)$, as desired.

With Lemma A.1 one can also bound the many inverse-Ackermann-like functions $\hat{\alpha}_k(n)$ we met in Chapters 3 and 4. To do this, one considers the Ackermann-like functions they are inverses of. For example, consider the function $\hat{\alpha}_m(n)$ of Lemma 3.17, which satisfies a recurrence relation of the form
\[ \hat{\alpha}_m(n) = 1 + \hat{\alpha}_m\Bigl(\hat{\alpha}^{(j-2)}_{m-1}(n)/c\Bigr). \]
This is the inverse of a function $\hat{A}_m(n)$ that satisfies
\[ \hat{A}_m(n) = \hat{A}^{(j-2)}_{m-1}\bigl(c \hat{A}_m(n-1)\bigr). \]
One can show that $\hat{A}_m(n) \le A_m((j-2)n + d)$ for some constant $d$, and thus $\hat{\alpha}_m(n) \ge \frac{1}{j-2}\alpha_m(n) - d'$ for some constant $d'$. Similarly, we have $\hat{\alpha}_m(n) \le \frac{1}{j-2}\alpha_m(n) + d''$ for another constant $d''$, and therefore $\bigl|\hat{\alpha}_m(n) - \frac{1}{j-2}\alpha_m(n)\bigr|$ is bounded by a constant.
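The elementary steps in the proof of Lemma A.2 can be spot-checked numerically. The sketch below assumes the reading of (A.4) in which $\hat{A}_1(m) = 4m\,\hat{A}_1(m-1)$ (that is, $k = 1$ with $\hat{A}_0$ the identity), and checks the closed form and the bound in (A.6), the inequality behind $\delta(m) = 600m^3$ for $k = 2$, and the choice $c = 4$. All function names are hypothetical. The $\delta$ inequality is compared on the level of exponents, using $600 \le 2^{10}$, since the numbers themselves are far too large to write down.

```python
import math

def a_hat_1(m):
    """A-hat_1(m) from (A.4) with k = 1: A-hat_1(1) = 1, A-hat_1(m) = 4*m*A-hat_1(m-1)."""
    value = 1
    for i in range(2, m + 1):
        value = 4 * i * value
    return value

def closed_form(m):
    """Closed form claimed in (A.6)."""
    return 2 ** (2 * m - 2) * math.factorial(m)

# (A.6): equality with the closed form and the bound by 2^(m^2), for m = 1..39.
a6_holds = all(a_hat_1(m) == closed_form(m) <= 2 ** (m * m) for m in range(1, 40))

def delta_step_ok(m):
    """Sufficient check for 600 * 2^(3 * 2^(512 m^2 + 1)) <= 2^(2^(600 m^3)).

    Since 600 <= 2^10, it suffices that 10 + 3 * 2^(512 m^2 + 1) <= 2^(600 m^3).
    """
    return 10 + 3 * 2 ** (512 * m * m + 1) <= 2 ** (600 * m ** 3)

delta_holds = all(delta_step_ok(m) for m in range(1, 4))

# c = 4: G°(4) with G(m) = 2^m and G0 = 1, compared with delta(F0) = delta(1) = 600.
g_circ_4 = 1
for _ in range(4):
    g_circ_4 = 2 ** g_circ_4
c_holds = g_circ_4 >= 600

print(a6_holds, delta_holds, c_holds, g_circ_4)
```

Only $k = 1$ is evaluated directly: already $\hat{A}_2(2)$ involves the factorial of a number larger than $2^{30}$, so larger cases are out of reach of direct computation.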


Appendix B
On the asymptotic growth of some recurrent quantities

A recurrent feature in this thesis (specifically, in Chapters 3 and 4) is the appearance of two-parameter quantities given roughly by $C_{s,k} \approx C_{s-2,k}\, C_{s,k-1}$, with base cases $C_{3,k} = \Theta(k)$ and $C_{4,k} = \Theta(2^k)$. In this appendix we give a generic analysis of the asymptotic growth of such quantities (as a function of $k$ for $s$ fixed).

Lemma B.1. Let $C_{s,k}$ be positive quantities given recursively, for $s \ge 3$, $k \ge 1$, by
\[ C_{3,k} = \Theta(k); \qquad C_{4,k} = \Theta(2^k); \]
\[ C_{s,k} = (C_{s-2,k} + a)\, C_{s,k-1} + a' C_{s-1,k} + a'' C_{s-2,k} + a''', \qquad \text{for } s \ge 5,\ k \ge 2; \]
for some implicit constants for $C_{3,k}$ and $C_{4,k}$, some real constants $a, \ldots, a'''$ (which might depend on $s$), and some initial conditions specifying $C_{s,1}$. Then for every fixed $s \ge 3$, $C_{s,k}$ has upper and lower bounds of the form
\[ C_{s,k} = \begin{cases} 2^{(1/t!)k^t \pm O(k^{t-1})}, & s \text{ even}; \\ 2^{(1/t!)k^t \log_2 k \pm O(k^t)}, & s \text{ odd}; \end{cases} \]
where $t = \lfloor (s-2)/2 \rfloor$.

Proof. Let $s \ge 5$, and assume by induction that $C_{s-1,k}$, $C_{s-2,k}$ have the claimed growth in $k$. The constant $a$ is “swallowed up” asymptotically by $C_{s-2,k}$, so, by a simple transformation, we may assume that $a = 0$. Let $R_{s,k} = a' C_{s-1,k} + a'' C_{s-2,k} + a'''$. Then, unrolling the recurrence,
\[ C_{s,k} = \sum_{i=2}^{k} R_{s,i} \prod_{j=i+1}^{k} C_{s-2,j} \;+\; C_{s,1} \prod_{j=2}^{k} C_{s-2,j}. \]


We bound each term on the right-hand side by substituting the assumed growth rates for $C_{s-1,k}$ and $C_{s-2,k}$, and bounding the resulting sums in the exponent by integrals. The calculations are fairly routine but tedious; they show that the last term on the right-hand side dominates, and has the form
\[ C_{s,1} \prod_{j=2}^{k} C_{s-2,j} = \begin{cases} 2^{(1/t!)k^t \pm O(k^{t-1})}, & s \text{ even}; \\ 2^{(1/t!)k^t \log_2 k \pm O(k^t)}, & s \text{ odd}; \end{cases} \]
and the claim follows.
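The unrolled sum-product formula in the proof of Lemma B.1 can be checked mechanically on small cases. The Python sketch below makes simplifying assumptions that are not in the lemma: exact base cases $C_{3,k} = k$ and $C_{4,k} = 2^k$, the constant $a$ set to $0$ (as in the proof), and arbitrary integer values for $a', a'', a'''$ and $C_{s,1}$. It compares the recurrence with the unrolled formula, and then looks at the simplest instance ($a' = a'' = a''' = 0$, $C_{s,1} = 1$), where $C_{5,k} = k!$ and $C_{6,k} = 2^{k(k+1)/2 - 1}$, matching the claimed growth for $t = 1$ (odd $s$) and $t = 2$ (even $s$). The names `table` and `unrolled` are hypothetical.

```python
from math import factorial, prod

def table(s_max, k_max, a1=0, a2=0, a3=0, init=1):
    """Fill C[s, k] by the recurrence of Lemma B.1 with a = 0.

    Assumed base cases: C[3, k] = k and C[4, k] = 2**k; C[s, 1] = init for s >= 5.
    a1, a2, a3 stand for the constants a', a'', a'''.
    """
    C = {}
    for k in range(1, k_max + 1):
        C[3, k] = k
        C[4, k] = 2 ** k
    for s in range(5, s_max + 1):
        C[s, 1] = init
        for k in range(2, k_max + 1):
            C[s, k] = (C[s - 2, k] * C[s, k - 1]
                       + a1 * C[s - 1, k] + a2 * C[s - 2, k] + a3)
    return C

def unrolled(C, s, k, a1, a2, a3):
    """Right-hand side of the unrolled formula from the proof."""
    def R(i):
        return a1 * C[s - 1, i] + a2 * C[s - 2, i] + a3
    def tail(i):
        # Product of C[s-2, j] over j = i+1..k (empty product is 1).
        return prod(C[s - 2, j] for j in range(i + 1, k + 1))
    return sum(R(i) * tail(i) for i in range(2, k + 1)) + C[s, 1] * tail(1)

consts = (2, 3, 5)
C = table(7, 8, *consts, init=7)
formula_matches = all(unrolled(C, s, k, *consts) == C[s, k]
                      for s in range(5, 8) for k in range(1, 9))

P = table(6, 10)  # all additive constants zero, C[s, 1] = 1
pure_odd = all(P[5, k] == factorial(k) for k in range(1, 11))
pure_even = all(P[6, k] == 2 ** (k * (k + 1) // 2 - 1) for k in range(1, 11))

print(formula_matches, pure_odd, pure_even)
```

In the pure case, $\log_2 k! = k \log_2 k - O(k)$ and $\log_2 C_{6,k} = \tfrac12 k^2 + O(k)$, which is the shape of the exponents in the lemma.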

Appendix C
Proof of Klazar’s Lemma 4.15

For completeness, here is the proof of Klazar’s Lemma 4.15. Recall that the claim is that
\[ \lambda_3(n) \le \psi_3(1 + 2n/\ell,\ n) + 3n\ell, \]
where $\ell \le n$ is a free parameter.

Proof of Lemma 4.15. Let $S$ be a maximum-length Davenport–Schinzel sequence of order 3 on $n$ distinct symbols. Thus, $|S| = \lambda_3(n)$. Call an occurrence of a symbol $a$ in $S$ a terminal occurrence if it is the first or last occurrence of $a$ in $S$. Partition $S$ into blocks $S = S_1 S_2 S_3 \ldots S_m$, where each $S_i$ starts with a terminal occurrence and contains exactly $\ell$ terminal occurrences (except for $S_m$, which might contain fewer terminal occurrences). Since $S$ contains $2n$ terminal occurrences, the number of blocks is $m = \lceil 2n/\ell \rceil \le 1 + 2n/\ell$.

For every block $S_i$ and every symbol $a$, let $n_i(a)$ be the number of occurrences of $a$ in $S_i$. Recall that these occurrences must be nonadjacent. If $S_i$ contains the first or last occurrence of $a$ in $S$, we say that $a$ is terminal in $S_i$; otherwise, $a$ is nonterminal in $S_i$. Let $\Lambda_i$ be the set of symbols that appear in $S_i$. Let $\Lambda'_i$ be the subset of these symbols which are terminal in $S_i$, and let $\Lambda''_i$ be the subset of those which are nonterminal. Clearly,
\[ |S_i| = \|S_i\| + \sum_{a \in \Lambda_i} \bigl(n_i(a) - 1\bigr), \]
where $\|S_i\| = |\Lambda_i|$ denotes the number of distinct symbols in $S_i$.

We claim that $n_i(a) \le \ell$ for all $a \in \Lambda_i$. Indeed, suppose for a contradiction that $n_i(a) \ge \ell + 1$ for some $a \in \Lambda_i$. Then the occurrences of $a$ in $S_i$ define $\ell$ interior-disjoint, nonempty intervals. But $S_i$ contains at most $\ell$ terminal occurrences of symbols, one of which is the first symbol of $S_i$. Therefore, one of the above-mentioned intervals must be free of terminal occurrences, and so it contains a symbol $b$ which also appears both before and after the interval. Thus, $S$ contains $babab$, which is a contradiction.


For a similar reason, $S_i$ cannot contain the pattern $aba$ for any $a, b \in \Lambda''_i$. Therefore, the nonterminal symbols in $S_i$ do not intermingle at all (meaning, for every $a, b \in \Lambda''_i$, all occurrences of $a$ appear before all occurrences of $b$ or vice versa). Therefore, the symbols which are nonterminal in $S_i$ define $\sum_{a \in \Lambda''_i} \bigl(n_i(a) - 1\bigr)$ interior-disjoint, nonempty intervals of the form $a \ldots a$ in $S_i$. On the other hand, the number of such intervals cannot be larger than $\ell - 1$ (by an argument similar to the one above). Therefore,
\[ |S_i| = \|S_i\| + \sum_{a \in \Lambda'_i} \bigl(n_i(a) - 1\bigr) + \sum_{a \in \Lambda''_i} \bigl(n_i(a) - 1\bigr) \le \|S_i\| + (\ell - 1)|\Lambda'_i| + (\ell - 1) \le \|S_i\| + \ell(\ell - 1) + (\ell - 1) = \|S_i\| + \ell^2 - 1. \]

Now, define a subsequence $S'$ of $S$ by taking just the first occurrence of each symbol in each $S_i$. Then, $|S'| = \sum_{i=1}^{m} \|S_i\|$, and $S'$ is composed of $m$ blocks, each of distinct symbols. $S'$ might still contain adjacent repeated symbols at the interface between blocks, but these can be eliminated by deleting at most $m - 1 \le 2n/\ell$ symbols. We get a Davenport–Schinzel sequence $S''$ which satisfies $|S''| \le \psi_3(m, n)$, and thus
\[ \lambda_3(n) = |S| = \sum_{i=1}^{m} |S_i| \le m(\ell^2 - 1) + \sum_{i=1}^{m} \|S_i\| \le (1 + 2n/\ell)(\ell^2 - 1) + \psi_3(m, n) + 2n/\ell \le \psi_3(1 + 2n/\ell,\ n) + 3n\ell. \]
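The block decomposition used in the proof above can be illustrated on a small sequence. The following Python sketch is an illustration under stated assumptions, not part of the proof: the helper names (`is_ds`, `terminal_positions`, `split_into_blocks`) are hypothetical, and the example sequence `abacacbc` (order 3, three symbols) is chosen here only because it is short. The code checks the Davenport–Schinzel property directly, splits the sequence into blocks with $\ell$ terminal occurrences each, and tests the two per-block facts $n_i(a) \le \ell$ and $|S_i| \le \|S_i\| + \ell^2 - 1$.

```python
from collections import Counter
from itertools import combinations

def is_ds(seq, s):
    """True if seq has no equal adjacent symbols and no alternation of length s + 2."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    for a, b in combinations(sorted(set(seq)), 2):
        alternation, last = 0, None
        for x in seq:
            # Count the longest a/b alternation by collapsing repeats.
            if x in (a, b) and x != last:
                alternation += 1
                last = x
        if alternation >= s + 2:
            return False
    return True

def terminal_positions(seq):
    """Positions holding the first or the last occurrence of some symbol."""
    first, last = {}, {}
    for i, x in enumerate(seq):
        first.setdefault(x, i)
        last[x] = i
    return set(first.values()) | set(last.values())

def split_into_blocks(seq, ell):
    """Blocks that each start at a terminal occurrence and hold ell of them (last may hold fewer)."""
    terminals = terminal_positions(seq)
    blocks, current, count = [], [], 0
    for i, x in enumerate(seq):
        if i in terminals:
            if count == ell:          # block is full: start a new one here
                blocks.append(current)
                current, count = [], 0
            count += 1
        current.append(x)
    if current:
        blocks.append(current)
    return blocks

S, ell = "abacacbc", 2
blocks = ["".join(b) for b in split_into_blocks(S, ell)]
per_block_ok = all(
    max(Counter(b).values()) <= ell                 # n_i(a) <= ell
    and len(b) <= len(set(b)) + ell * ell - 1       # |S_i| <= ||S_i|| + ell^2 - 1
    for b in blocks
)
print(is_ds(S, 3), blocks, per_block_ok)
```

The splitting rule starts a new block only at a terminal occurrence, so nonterminal symbols that follow the $\ell$-th terminal occurrence stay in the current block, as in the partition described in the proof.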

‫בנוגע לחסמים עליונים ‪ ,‬אפשטיין הראה שלכל קבוצה 𝑆 במצב קמור ולכל‬

‫𝑛‬ ‫‪3‬‬

‫≤ 𝑡 < ‪,𝑛 2‬‬

‫קיימת משפחה 𝑇 של 𝑡 משולשים עם קודקודים ב‪ ,𝑆 -‬כך שאף נקודה במישור אינה מוכלת‬ ‫ביותר מ‪ 𝑂(𝑡 2 /𝑛3 )-‬משולשים מ‪.𝑇 -‬‬ ‫אנו משפרים את החסם הפשוט הזה בפקטור לוגריתמי (אולם רק לקבוצות מסוימות )‪:‬‬

‫משפט ‪ :14‬לכל 𝑛 ולכל ‪ 𝑛 log 2.5 𝑛 < 𝑡 ≤ 𝑛3‬קיימת קבוצה ‪ 𝑆 ⊂ 𝐑2‬בת 𝑛 נקודות‬ ‫ומשפחה 𝑇 של 𝑡 משולשים עם קודקודים ב‪ ,𝑆 -‬כך שאף נקודה במישור אינה מוכלת ביותר מ‪-‬‬ ‫‪𝑡2‬‬ ‫)𝑡‪𝑛 3 log (𝑛 3 /‬‬

‫𝑂‬

‫משולשים מ‪( .𝑇-‬בפרט‪ ,‬אם 𝛿‪ 𝑡 < 𝑛3−‬לאיזה קבוע ‪ ,𝛿 > 0‬אז החסם הוא‬ ‫))𝑛 ‪).𝑂(𝑡 2 /(𝑛3 log‬‬ ‫הקבוצה 𝑆 של המשפט היא לא אחרת מאשר שוב השריג המתוח ‪ .𝐺s‬זאת עבודה משותפת עם‬ ‫בוך ומטושק [‪.]13‬‬

‫‪13‬‬

‫החסם התחתון הוא של וגנר [‪ ,]50‬והחסם העליון הוא של בראני [‪ .]9‬החסם העליון הוא‬ ‫"טריוויאלי" במובן מסוים ‪ ,‬מכיוון שבראני בעצם הראה ש לכל קבוצה 𝑑𝐑 ⊂ 𝑆 בת 𝑛 נקודות‬ ‫במצב כללי‪ ,‬אף נקודה במרחב אינה מוכלת ביותר מ‪𝑛𝑑+1 /(2𝑑 𝑑 + 1 !) + 𝑂(𝑛𝑑 ) -‬‬ ‫סימפלקסים עם קודקודים ב‪.𝑆 -‬‬ ‫בתזה זו אנו מציגים את החסם העליון ה "לא טריוויאלי " הראשון עבור למת הבחירה‬ ‫הראשונה‪:‬‬

‫משפט ‪ :12‬לכל ‪ 𝑑 ≥ 2‬קבוע ולכל 𝑛 קיימת קבוצה 𝑑𝐑 ⊂ 𝑆 בת 𝑛 נקודות כך שכל נקודה‬ ‫𝑑𝐑 ∈ 𝑥 מוכלת בלא יותר מ‪-‬‬ ‫) ‪+ 𝑜(𝑛𝑑+1‬‬

‫‪𝑑+1‬‬

‫𝑛‬ ‫‪𝑑+1‬‬

‫סימפלקסים 𝑑‪-‬ממדיים עם קודקודים ב‪.𝑆-‬‬ ‫בפרט נובע ש‪ ,𝑐2 ≤ 1/27-‬שהיא התוצאה שבורוש ופירדי ניסו להשיג ‪.‬‬ ‫הקבוצה 𝑆 של המשפט היא בעצם אותו שריג מתוח ‪ 𝐺s‬שנותן את משפט ‪ .2‬לחילופין ‪ ,‬ניתן‬ ‫לקחת בתור הקבוצה 𝑆 את ‪ ,𝐷s‬האלכסון של השריג המתוח ‪ ,‬ולכן ניתן להשיג את החסם של‬ ‫משפט ‪ 12‬גם ע"י קבוצות במצב קמור (מכיוון ש‪ 𝐷s -‬נמצאת במצב קמור; ראו הערה ‪ 1‬בפרק‬ ‫‪.)1‬‬ ‫אנו משערים שמשפט ‪ 12‬הוא הדוק לכל 𝑑‪.‬‬ ‫זאת עבודה משותפת עם בוריס בוך ויירי מטושק [‪.]14 ,13‬‬ ‫למת הבחירה השנייה‬

‫התוצאה הבאה היא המקרה המישורי של מה שמטושק [‪ ]33‬מכנה למת הבחירה השנייה‬ ‫(‪ :)second selection lemma‬תהי 𝑆 קבוצה בת 𝑛 נקודות במישור‪ ,‬ותהי 𝑇 משפחה של‬ ‫‪ 𝑡 ≤ 𝑛3‬משולשים עם קודקודים ב‪ .𝑆 -‬אז קיימת נקודה במישור המוכלת ב "הרבה"‬ ‫משולשים מ‪ ,𝑇-‬עם חסם תחתון התלוי ב‪ 𝑛 -‬ו‪.𝑡-‬‬ ‫ארונוב ושות' [‪ ]8‬הראו שקיימת נקודה המוכלת ב‪ Ω(𝑡 3 / 𝑛6 log 5 𝑛 )-‬משולשים מ‪.𝑇 -‬‬ ‫מאוחר יותר‪ ,‬אפשטיין [‪ ]22‬טען לשיפור החסם ל‪ ,Ω(𝑡 3 / 𝑛6 log 2 𝑛 )-‬אבל קיימת בעיה‬ ‫בהוכחה שלו‪ .‬בתזה זו אנו מראים שבכל זאת ‪ ,‬החסם הזה הוא נכון ‪:‬‬

‫משפט ‪ :13‬תהי 𝑆 קבוצה בת 𝑛 נקודות במישור‪ ,‬ותהי 𝑇 משפחה של 𝑡 משולשים עם‬ ‫קודקודים ב‪ .𝑆-‬אז קיימת נקודה במישור המוכלת ב‪-‬‬ ‫‪𝑡3‬‬ ‫𝑛 ‪𝑛 6 log 2‬‬

‫‪Ω‬‬

‫משולשים מ‪.𝑇-‬‬ ‫ההוכחה של משפט ‪ 13‬מסתמכת על שינוי קטן בטיעון של אפשטיין ‪ .‬זאת עבודה משותפת עם‬ ‫מיכה שריר [‪.]39‬‬

‫‪12‬‬

‫בהינתן סדרת סימנים קבועה 𝑢 ("תת‪-‬הסדרה האסורה")‪ ,‬נסמן ב‪ 𝑠-‬את האורך של 𝑢‪ ,‬וב‪𝑟-‬‬ ‫את מספר הסימנים השונים ב‪ .𝑢-‬נסמן ב‪ Ex𝑢 (𝑛)-‬את האורך המקסימאלי של סדרה 𝑆 שהיא‬ ‫𝑟‪-‬דלילה ואינה מכילה אף תת‪ -‬סדרה איזומורפית ל‪( . 𝑢 -‬למשל‪ ,‬הסדרה 𝑐𝑏𝑑𝑐𝑏𝑎 מכילה תת‪-‬‬ ‫סדרה איזומורפית ל‪).𝑎𝑏𝑎𝑏 -‬‬ ‫סדרות ה‪ DS-‬הרגילות מתקבלות ע"י … 𝑏𝑎𝑏𝑎 = 𝑢 באורך ‪( 𝑠 + 2‬ואז ‪.)𝑟 = 2‬‬ ‫דרישת ה‪-𝑟-‬דלילות היא נחוצה ‪ ,‬כיוון שסדרה )‪-(𝑟 − 1‬דלילה יכולה להיות בא ורך אינסופי‬ ‫ולא להכיל תת‪-‬סדרה איזומורפית ל‪ . 𝑢 -‬לעומת זאת‪ ,‬דרישת ה‪-𝑟 -‬דלילות מספיקה כדי ש‪-‬‬ ‫)𝑛( 𝑢‪ Ex‬יהיה סופי‪.‬‬ ‫קלאזר [‪ ]26‬הוכיח ב‪ 1992-‬ש‪-‬‬ ‫‪,‬‬

‫) ‪𝑛 𝑠−4‬‬

‫𝛼(𝑂‪Ex𝑢 (𝑛) ≤ 𝑛 ∙ 2‬‬

‫כאשר סימון ה‪ 𝑂 -‬מסתיר קבועים התלויים ב‪ 𝑟 -‬ו‪.𝑠-‬‬ ‫בתזה זו אנו משפרים את ה חסם הזה בעזרת הכללה של שיטת הסדרות הכמעט‪ .DS -‬אנו‬ ‫מוכיחים‪:‬‬

‫משפט ‪ :11‬תהי 𝑢 סדרה באורך 𝑟 בת 𝑠 סימנים שונים‪ ,‬כאשר‪ .𝑠 ≥ 𝑟 + 3‬נציב‬ ‫‪ .𝑡 = (𝑠 − 𝑟 − 2)/2‬אז‬ ‫‪,‬‬

‫;𝑟 ‪ 𝑠 −‬זוגי‬ ‫‪ 𝑠 − 𝑟.‬אי זוגי ‪,‬‬

‫)𝑡‬

‫) ‪𝑛 𝑡 +𝑂(𝛼 𝑛 𝑡−1‬‬

‫𝑛 𝛼(𝑂‪log 2 𝛼(𝑛)+‬‬

‫𝛼)!𝑡‪𝑛 ∙ 2(1/‬‬ ‫𝑡‬

‫𝑛‬

‫𝛼)!𝑡‪𝑛 ∙ 2(1/‬‬

‫≤ )𝑛( 𝑢‪Ex‬‬

‫בפרט‪ ,‬אם נציב … 𝑏𝑎𝑏𝑎 = 𝑢 באורך ‪ ,𝑠 + 2‬נקבל שוב את משפט ‪.9‬‬ ‫עבודתנו על סדרות דוונפורט‪ -‬שינצל והכללותיהן פורסמה ב‪.]37[ -‬‬ ‫‪ .2.4‬למות בחירה‬ ‫למת הבחירה הראשונה‬ ‫התוצאה הבאה של בראני [‪ ]9‬מכונה למת הבחירה הראשונה (‪ )first selection lemma‬ע"י‬ ‫מטושק [‪ :]33‬לכל קבוצה 𝑑𝐑 ⊂ 𝑆 בת 𝑛 נקודות קיימת נקודה 𝑑𝐑 ∈ 𝑥 המוכלת לפחות ב‪-‬‬ ‫) 𝑑𝑛(𝑂 ‪ 𝑐𝑑 𝑛𝑑+1 −‬סימפלקסים 𝑑‪-‬ממדיים עם קודקודים ב‪ ,𝑆 -‬עבור קבועים מסוימים‬ ‫‪.𝑐𝑑 > 0‬‬ ‫הבעיה היא למצוא את הערך המקסימאלי של קבועים 𝑑𝑐 אלו‪ .‬למקרה ‪ ,𝑑 = 2‬בורוש ופירדי‬ ‫‪1‬‬

‫‪1‬‬

‫‪1‬‬

‫‪1‬‬

‫[‪ ]12‬הראו ש‪( 27 ≤ 𝑐2 ≤ 27 + 729 -‬הם טענו ש‪ ,𝑐2 ≤ 27 -‬אבל הבניה שלהם נותנת רק את‬ ‫החסם העליון היותר חלש הזה ; ראו [‪.)]14‬‬ ‫עבור ‪ 𝑑 ≥ 3‬כללי ידוע ש‪-‬‬ ‫‪1‬‬ ‫! ‪𝑑+1‬‬

‫𝑑‪≤ 𝑐𝑑 ≤ 2‬‬

‫‪11‬‬

‫‪𝑑 2 +1‬‬ ‫‪𝑑+1 !(𝑑+1)𝑑 +1‬‬

‫‪.‬‬

‫‪,‬‬

‫;‪ 𝑠 ≥ 4‬זוגי‬ ‫‪,‬‬

‫;‪ 𝑠 ≥ 3‬אי זוגי‬

‫;‪ 𝑠 ≥ 4‬זוגי ‪,‬‬

‫) ‪𝑛 𝑡−1‬‬

‫𝛼(𝑂‪𝑡 +‬‬

‫)𝑛( 𝛼‪𝑛 ∙ 2‬‬

‫) 𝑡 𝑛 𝛼(𝑂‪𝛼(𝑛)𝑡 log 2 𝛼(𝑛)+‬‬

‫‪𝑛∙2‬‬

‫) ‪𝑛 𝑡−1‬‬

‫≤ )𝑛( 𝑠𝜆‬

‫𝛼(𝑂‪𝑡 −‬‬

‫)𝑛(𝛼 !𝑡‪𝜆𝑠 (𝑛) ≥ 𝑛 ∙ 2 1/‬‬

‫כאשר ‪ .𝑡 = (𝑠 − 2)/2‬הם הושגו ע"י אגרוואל‪ ,‬שריר‪ ,‬ושור ב‪ .]45 ,2[ 1989-‬עבור ‪𝑠 ≥ 5‬‬ ‫אי‪-‬זוגי לא ידוע חסם תחתון אסימפטוטית יותר טוב מאשר זה המתקבל ע"י‬ ‫)𝑛( ‪.𝜆𝑠 (𝑛) ≥ 𝜆𝑠−1‬‬ ‫נשים לב לדמיון בין חסמים אלה לבין החסמים שהשגנ ו עבור דקירת שרשראות אינטרוולים‬ ‫(משפט ‪ )7‬ועבור רשתות‪ 𝜖 -‬חלשות (משפטים ‪ .)3-5‬למיטב ידיעתנו אין שום קשר בין שתי‬ ‫הבעיות‪ ,‬חוץ מהעובדה שהן מקיימות נוסחאות נסיגה מאוד דומות ‪ .‬החסמים "צצים" באופן‬ ‫בלתי תלוי בשני מקומות שונים ‪.‬‬ ‫התוצאות שלנו‬ ‫בתזה זו אנו מציגים כמה תוצ אות עבור )𝑛( 𝑠𝜆‪ .‬ראשית‪ ,‬אנו משפרים את החסמים העליונים‬ ‫ל‪ 𝑠-‬כללי בפקטור קבוע באקספוננט ‪:‬‬

‫משפט ‪ :9‬יהי‪ 𝑠 ≥ 3‬קבוע‪ ,‬ונציב ‪ .𝑡 = (𝑠 − 2)/2‬אז‬ ‫;‪ 𝑠 ≥ 4‬זוגי‬ ‫‪ 𝑠 ≥ 3.‬אי זוגי‬

‫‪,‬‬ ‫‪,‬‬

‫)𝑡‬

‫) ‪𝑛 𝑡−1‬‬

‫𝛼(𝑂‪𝑡 +‬‬

‫)𝑛( 𝛼)!𝑡‪𝑛 ∙ 2(1/‬‬

‫𝑛 𝛼(𝑂‪log 2 𝛼 (𝑛)+‬‬

‫𝑡)𝑛(𝛼)!𝑡‪(1/‬‬

‫‪𝑛∙2‬‬

‫≤ )𝑛( 𝑠𝜆‬

‫לכן‪ ,‬עבור 𝑠 זוגי החסמים הם כבר הדוקים‪ ,‬עד כדי איברים מסדר גודל קטן יותר‬ ‫באקספוננט‪ .‬עבור ‪ 𝑠 ≥ 5‬אי‪-‬זוגי נותר פער של פקטור )𝑛(𝛼 ‪ log‬באקספוננט בין החסמים‬ ‫העליונים והתחתונים ‪ .‬אנו משערים‪ ,‬ע"פ החסמים עבור דקירת שרשראות אינטרוולים ‪,‬‬ ‫שהחסמים האמיתיים עבור )𝑛( 𝑠𝜆 ל‪ 𝑠 ≥ 5-‬כן מכילים את הפקטור הלוגריתמי ‪.‬‬ ‫אנו בעצם מוכיחים את משפט ‪ 9‬בשתי דרכים שונות‪ :‬ראשית‪ ,‬ע"י שיפור קטן בשיטה של‬ ‫אגרוואל‪ ,‬שריר‪ ,‬ושור [‪ .]2‬ובנוסף‪ ,‬אנו מוכיחים אותו ע "י שיטה חדשה‪ ,‬המסתמכת על מה‬ ‫שאנו קוראים סדרות כמעט‪ ;)almost-DS sequences( DS-‬ראו פרק ‪ 4‬לפרטים מלאים‪.‬‬ ‫בנוגע לחסמים תחתונים ‪ ,‬אנו מוצאים את המקדם המדויק של )𝑛( ‪:𝜆3‬‬ ‫) 𝑛( 𝜆‬

‫משפט ‪ ,𝜆3 𝑛 ≥ 2𝑛𝛼 𝑛 − 𝑂(𝑛) :10‬ולכן‪.lim𝑛→∞ 𝑛𝛼3 (𝑛) = 2‬‬ ‫בנוסף‪ ,‬אנו מציגים גרסה יותר פשוטה של הבניה שנותנת את החסמים התחתונים הידועים‬ ‫עבור )𝑛( 𝑠𝜆 ל‪ 𝑠 ≥ 4-‬זוגי‪.‬‬ ‫סדרות דוונפורט‪-‬שינצל מוכללות‬ ‫אדאמץ‪ ,‬קלאזר וולטר [‪ ]1‬חקרו הכללה של סדרות‪ ,DS -‬בה תת‪-‬הסדרה ה"אסורה" אינה‬ ‫חייבת להיות … 𝑏𝑎𝑏𝑎‪ ,‬אלא סדרה כלשהי המורכבת מיותר משני סימנים שונים ‪.‬‬ ‫בהינתן מספר טבעי ‪ ,𝑟 ≥ 2‬סדרת סימנים … ‪ 𝑆 = 𝑎1 𝑎2 𝑎3‬נקראת 𝑟‪-‬דלילה (‪ )𝑟-sparse‬אם‬ ‫𝑗𝑎 ≠ 𝑖𝑎 לכל 𝑖 ו‪ 𝑗-‬המקיימים ‪.1 ≤ 𝑗 − 𝑖 ≤ 𝑟 − 1‬‬ ‫‪10‬‬

‫לשם שלמות אנו מציגים גם חסמים כמעט הדוקים עבור המקרה ‪( 𝑗 = 2‬דקירה ע"י זוגות)‪:‬‬ ‫למה ‪− 1 :8‬‬

‫𝑛‬ ‫‪𝑘/2‬‬

‫≤ 𝑛 ‪− 3 ≤ 𝑍𝑘2‬‬

‫𝑛‬ ‫‪𝑘/2‬‬

‫‪.‬‬

‫עבודתנו על דקירת שרשראות אינטרוולים נעש תה בשיתוף עם נוגה אלון‪ ,‬חיים קפלן‪ ,‬מיכה‬ ‫שריר‪ ,‬ושחר סמורודינסקי [‪.]5‬‬ ‫‪ .2.3‬סדרות דוונפורט‪-‬שינצל‬ ‫סדרות דוונפורט‪ -‬שינצל (‪ )Davenport-Schinzel sequences‬הן מבנים קומבינטוריים עם‬ ‫יישומים רבים בגיאומטריה חישובית ‪ .‬הם אינם קשורים כלל לרשתות‪ 𝜖-‬חלשות ולשרשראות‬ ‫אינטרוולים (עד כמה שידוע לנו)‪ ,‬אבל החסמים הידועים עבור אורכי סדרות דוונפורט‪-‬שינצל‬ ‫מכילים את פונקצית אקרמן ההופכית ‪ ,‬והם דומים באופן מפליא לחסמים שקיבלנו ע בור שתי‬ ‫הבעיות האחרונות ‪.‬‬

‫בהינתן מספר שלם ‪ ,𝑠 ≥ 1‬סדרת סימנים … ‪ 𝑆 = 𝑎1 𝑎2 𝑎3‬נקראת סדרת דוונפורט‪-‬שינצל‬ ‫(סדרת‪ DS-‬בקיצור) מסדר 𝑠 אם ‪ 𝑎𝑖 ≠ 𝑎𝑖+1‬לכל 𝑖‪ ,‬ואם 𝑆 אינה מכילה אף אלטרנציה‬ ‫… 𝑏 … 𝑎 … 𝑏 … 𝑎 באורך ‪ 𝑠 + 2‬לאף זוג סימנים שונים 𝑎 ו‪.𝑏-‬‬ ‫הבעיה היא למצוא את האורך המקסימאלי של סדרת‪ DS -‬מסדר קבוע 𝑠 המכילה לכל היותר‬ ‫𝑛 סימנים שונים‪ .‬אורך מקסימאלי זה מסומן )𝑛( 𝑠𝜆‪.‬‬ ‫סדרות אלה נחקרו לראשונה ע "י הרולד דוונפורט ואנדרי שינצל ב‪ .]19[ 1965 -‬המוטיבטציה‬ ‫העיקרית של סדרות‪ DS-‬היא הסיבוכיות המקסימאלית של ה מעטפת התחתונה ( ‪lower‬‬ ‫‪ )envelope‬של קבוצת עקומים במישור‪ .‬אולם‪ ,‬לסדרות‪ DS-‬יש יישומים רבים נוספים‬ ‫בגיאומטריה בדידה וחישובית ‪ .‬הספר [‪ ]45‬של שריר ואגרוואל מוקדש כולו לנושא זה ‪.‬‬ ‫חסמים עבור סדרות דוונפורט‪-‬שינצל‬ ‫כמו שאמרנו‪ ,‬הבעיה העיקרית היא לחסום את האורך המקסימאלי )𝑛( 𝑠𝜆 סל סדרת‪DS-‬‬ ‫מסדר 𝑠 עם 𝑛 סימנים שונים‪.‬‬ ‫קל להראות ש‪( 𝜆1 𝑛 = 𝑛 -‬אין 𝑎𝑏𝑎)‪ ,‬ו‪( 𝜆2 𝑛 = 2𝑛 − 1 -‬אין 𝑏𝑎𝑏𝑎)‪ .‬אולם‪ ,‬לחסום את‬ ‫)𝑛( 𝑠𝜆 עבור ‪ 𝑠 ≥ 3‬נהיה כבר הרבה יותר מסובך ‪.‬‬ ‫הרט ושריר [‪ ]23‬הראו ב‪ 1986-‬ש‪ ,𝜆3 𝑛 = Θ(𝑛𝛼 𝑛 )-‬כאשר 𝛼 היא פונקצית אקרמן‬ ‫ההופכית‪ .‬החסמים הטובים ביותר עבור )𝑛( ‪ 𝜆3‬הם‬ ‫‪1‬‬

‫))𝑛(𝛼 𝑛(𝑂 ‪.2 𝑛𝛼 𝑛 − 𝑂(𝑛) ≤ 𝜆3 𝑛 ≤ 2𝑛𝛼 𝑛 +‬‬ ‫החסם העליון הושג ע "י שריר ואגרוואל [‪( ]45‬בהסתמך על הבניה של וירניק ושריר [‪.)]51‬‬ ‫) 𝑛( 𝜆‬

‫החסם התחתון הושג ע"י קלאזר [‪ .]27‬קלאזר [‪ ]28‬שואל האם )𝑛( ‪ lim𝑛→∞ 𝑛𝛼3‬קיים‪.‬‬ ‫החסמים הקודמים הטובים ביותר עבור )𝑛( 𝑠𝜆 ל‪ 𝑠-‬כללי הם‬

‫‪9‬‬

‫הבעיה היא‪ ,‬בהינתן מספרים 𝑛‪ ,𝑘 ,‬ו‪ ,𝑗-‬לבנות משפחה של 𝑗‪-‬יות‪ ,‬כמה שיותר קטנה‪ ,‬כך שכל‬ ‫𝑗‬

‫שרשרת‪ 𝑘-‬המוכלת ב‪ [1, 𝑛]-‬נדקרת ע"י איזו 𝑗‪-‬יה במשפחה‪ .‬נסמן ב‪ 𝑍𝑘 (𝑛)-‬את הגודל‬ ‫𝑗‬

‫המינימאלי של משפחה כזו‪ .‬נשים לב ש‪ 𝑍𝑘 (𝑛)-‬גדל עם 𝑛‪ ,‬קטן עם 𝑘‪ ,‬וגדל עם 𝑗‪.‬‬ ‫𝑗‬

‫בתזה זו אנו מציגים חסמים עליונים ותחתונים כמעט הדוקים עבור )𝑛( 𝑘𝑍‪ .‬המקרה ‪𝑗 = 3‬‬ ‫הוא יותר פשוט מהמקרה הכללי ‪ ,𝑗 ≥ 4‬ולכן אנו מטפלים בו באופן נפרד ‪ .‬החסמים שלנו הם‬ ‫כדלהלן‪:‬‬ ‫‪3‬‬

‫משפט ‪ 𝑍𝑘 (𝑛) :6‬מקיים את החסמים הבאים ‪:‬‬ ‫‪3‬‬

‫; 𝑛 ‪𝑛 = Θ 𝑛 log log‬‬

‫‪3‬‬

‫‪; 𝑍4‬‬

‫‪𝑛 = Θ 𝑛 log 𝑛 ; 𝑍5‬‬

‫‪𝑛−1‬‬ ‫‪2‬‬

‫‪3‬‬

‫‪𝑍3‬‬

‫= 𝑛‬

‫ולכל ‪,𝑘 ≥ 6‬‬ ‫)‪(3‬‬

‫)𝑛( ‪𝑐′𝑛𝛼 𝑘/2 (𝑛) − 𝑐′′𝑛 ≤ 𝑍𝑘 (𝑛) ≤ 𝑐𝑛𝛼 𝑘/2‬‬ ‫לכל 𝑛‪ ,‬עבור קבועים מסוימים 𝑐‪ ,𝑐′ ,‬ו‪.𝑐′′-‬‬

‫משפט ‪ :7‬יהי ‪ 𝑗 ≥ 4‬קבוע‪ ,‬ונציב ‪ .𝑡 = (𝑗 − 2)/2‬אז קיימות פונקציות )𝑚( 𝑗‪ 𝑃′‬ו‪-‬‬ ‫)𝑚( 𝑗 ‪ ,𝑄 ′‬שלשתיהן חסמים עליונים ותחתונים מהצורה‬ ‫;𝑗 זוגי‬ ‫;𝑗 אי זוגי‬

‫‪,‬‬ ‫‪,‬‬

‫) ‪𝑡 ±𝑂(𝑚 𝑡−1‬‬

‫) 𝑡 𝑚(𝑂‪𝑚 ±‬‬

‫𝑚)!𝑡‪2(1/‬‬

‫‪log 2‬‬

‫= 𝑚‬

‫𝑡 𝑚)!𝑡‪(1/‬‬

‫‪2‬‬

‫𝑗‬

‫‪′‬‬

‫‪′‬‬

‫𝑄‪𝑃 𝑗 𝑚 ,‬‬

‫כך שלכל ‪ 𝑚 ≥ 2‬מתקיים‬ ‫‪𝑛 ≤ 𝑐𝑗 𝑛𝛼𝑚 𝑛 ,‬‬

‫𝑗‬

‫𝑚‬

‫𝑛 𝑗‪𝑛 ≥ 𝑐 ′𝑗 𝑛𝛼𝑚 𝑛 − 𝑐′′‬‬

‫𝑗‬

‫‪𝑍𝑃 ′‬‬ ‫𝑗‬

‫𝑚‬

‫𝑗‬

‫‪𝑍𝑄 ′‬‬

‫לכל 𝑛‪ ,‬עבור קבועים מסוימים 𝑗𝑐‪ ,𝑐′𝑗 ,‬ו‪ 𝑐′′𝑗 -‬התלויים אך ב‪.𝑗-‬‬ ‫𝑗‬

‫לכן‪ ,‬לכל 𝑗 קבוע‪ ,‬אם 𝑘 הוא מספיק גדול אז )𝑛( 𝑘𝑍 הוא אך קצת על‪-‬לינארי ב‪ .𝑛 -‬בנוסף‪,‬‬ ‫אם נותנים ל‪ 𝑘-‬לגדול עם 𝑛 כפונקציה מתאימה של )𝑛(𝛼‪ ,‬אז החסמים נהיים לינאריים‪:‬‬ ‫עבור ‪ ,𝑗 = 3‬אם נציב )𝑛(𝛼‪ ,𝑘 ≥ 2‬אז )𝑛(𝑂 = 𝑛 ‪( 𝑍𝑘3‬מכיוון ש‪(𝑥) ≤ 3-‬‬ ‫עבור ‪ ,𝑗 ≥ 4‬אם נציב ) 𝑛 𝛼( 𝑗‪ ,𝑘 ≥ 𝑃′‬אז )𝑛(𝑂 = 𝑛‬

‫𝑥‬

‫𝛼𝛼)‪ .‬ובכלל‪,‬‬

‫𝑗‬

‫𝑘𝑍‪.‬‬

‫מצד שני‪ ,‬אם נותנים ל‪ 𝑘-‬לגדול רק קצת יותר לאט עם 𝑛‪ ,‬אז החסמים התחתונים נהיים שוב‬ ‫על‪-‬לינאריים‪ :‬אם נציב )‪ 𝑘 ≤ 𝑄 ′ 𝑗 (𝛼 𝑛 − 3‬אז )𝑛(𝜔 = 𝑛‬ ‫𝑗‬

‫𝑗‬

‫𝑘𝑍 (לפי למה ‪.)1‬‬

‫אנו משתמשים בחסמים עבו ר )𝑛( 𝑘𝑍 בהוכחות של משפטים ‪ :3-5‬משפט ‪ 3‬נובע מרדוקציה‬ ‫לדקירת שרשראות אינטרוולים ע "י שלשות‪ .‬משפט ‪ 4‬נובע מרדוקציה לדקירה ע"י 𝑗‪-‬יות‪ ,‬עם 𝑗‬ ‫כמו במשפט‪ ,‬ומשפט ‪ 5‬נובע מרדוקציה לדקירה ע "י 𝑑‪-‬יות‪.‬‬

‫‪8‬‬

‫משפט ‪ :4‬תהי 𝑑𝐑 ⊂ 𝑋 קבוצה סופית המוכלת בעקום הנחתך לא יותר מ‪ d -‬פעמים ע"י כל‬ ‫על‪-‬מישור‪ .‬נציב‬ ‫‪(𝑑 2 + 𝑑)/2,‬‬ ‫;𝑑 זוגי‬ ‫‪2‬‬ ‫;𝑑 אי זוגי ‪(𝑑 + 1)/2,‬‬

‫=𝑗‬

‫ונציב ‪ .𝑡 = (𝑗 − 2)/2‬אז קיימת עבור 𝑋 רשת‪ 1/r -‬חלשה בגודל‬ ‫‪,‬‬

‫;𝑗 זוגי‬ ‫‪,‬‬

‫‪ 𝑗.‬אי זוגי‬

‫)𝑡 𝑟‬

‫𝛼(𝑂‪𝑟 ∙ 2‬‬

‫))𝑟(𝛼 ‪𝑂(𝛼 𝑟 𝑡 log‬‬

‫‪𝑟∙2‬‬

‫(נשים לב ש‪ 𝑗-‬הוא זוגי אם ורק אם 𝑑 מתחלק ב‪).4-‬‬ ‫אנו גם נותנים חסמים תחתונים המראים שמשפט ‪ 4‬אינו רחוק מהאמת עבור ‪:𝑑 ≥ 3‬‬

‫משפט ‪ :5‬יהי 𝑑 קבוע‪ ,‬ונציב ‪ .𝑡 = (𝑑 − 2)/2‬אז‪ ,‬לכל ‪ 𝑟 > 1‬קיימת קבוצת נקודות ‪,𝐷s‬‬ ‫המוכלת בעקום שנחתך לא יותר מ‪ 𝑑-‬פעמים ע"י כל על‪-‬מישור‪ ,‬כך שכל רשת‪ 1/𝑟 -‬חלשה‬ ‫עבור ‪ 𝐷s‬הינה בגודל לפחות‬ ‫;𝑑 זוגי‬ ‫‪ 𝑑.‬אי זוגי‬

‫‪,‬‬ ‫‪,‬‬

‫) ‪𝑟 𝑡−1‬‬

‫𝛼(𝑂‪𝑡 −‬‬

‫)𝑟( 𝛼 !𝑡‪𝑟 ∙ 2 1/‬‬

‫) 𝑡 𝑟 𝛼(𝑂‪1/𝑡! 𝛼(𝑟)𝑡 log 2 𝛼(𝑟)−‬‬

‫‪𝑟∙2‬‬

‫הקבוצה ‪ 𝐷s‬היא לא אחרת מאשר ה אלכסון של השריג המתוח ‪ .𝐺s‬אנו גם כן מראים‬ ‫שלקבוצה ‪ 𝐷s‬אכן קיימת רשת‪ 1/𝑟 -‬חלשה בגודל אסימפטוטי זה (עד כדי הגורם היותר‬ ‫קטן באקספוננט)‪.‬‬

‫משפטים ‪ ,4 ,3‬ו‪ 5-‬נובעים מרדוקציה לבעיה קומבינטורית חדשה שאנו קוראים לה דקירת‬ ‫שרשראות אינטרוולים (‪ ,)stabbing interval chains‬שנציג כעת‪.‬‬ ‫משפטים ‪ 3‬ו‪ 4-‬הם עבודה משותפת עם נוגה אלון ‪ ,‬חיים קפלן‪ ,‬מיכה שריר‪ ,‬ושחר‬ ‫סמורודינסקי [‪ .]5‬משפט ‪ 5‬הוא עבודה משותפת עם בוריס בוך ויירי מטושק [‪.]13‬‬ ‫‪ .2.2‬דקירת שרשראות אינטרוולים‬ ‫כמו שהזכרנו‪ ,‬חלק מהתוצאות שלנו על רשתות‪ 𝜖 -‬חלשות נובעות מרדוקציה לבעיה‬ ‫קומבינטורית שאנו קוראים לה דקירת שרשראות אינטרוולים ‪ .‬הבעיה היא כדלהלן ‪.‬‬

‫עבור מספרים שלמים 𝑗 ≤ 𝑖 נסמן }𝑗 ‪ . 𝑖, 𝑗 = {𝑖, 𝑖 + 1, … ,‬נגדיר שרשרת אינטרוולים‬ ‫(‪ )interval chain‬באורך 𝑘 (בקיצור שרשרת‪ )𝑘-‬להיות סדרה של 𝑘 אינטרוולים עוקבים‪ ,‬לא‬ ‫חופפים ולא ריקים ‪:‬‬ ‫‪𝐶 = 𝑎1 , 𝑎2 𝑎2 + 1, 𝑎3 ⋯ 𝑎𝑘 + 1, 𝑎𝑘+1 ,‬‬ ‫כאשר ‪ .𝑎1 ≤ 𝑎2 < 𝑎3 < ⋯ < 𝑎𝑘+1‬אנו אומרים ש‪-𝑗 -‬יה של מספרים שלמים ) 𝑗𝑝 ‪(𝑝1 , … ,‬‬ ‫דוקרת את 𝐶 אם כל 𝑖𝑝 שייך לאינטרוול אחר ב‪.𝐶 -‬‬

‫‪7‬‬

‫)𝑟( 𝑑𝐸 כך שלכל קבוצה 𝑑𝐑 ⊂ 𝑋 קיימת רשת‪ 1/𝑟 -‬חלשה בגודל )𝑟( 𝑑𝐸‪ .‬הבעיה היא‬ ‫למצוא חסמים עליונים ותחתונים עבור )𝑟( 𝑑𝐸‪.‬‬ ‫למקרה ‪ 𝑑 = 2‬אלון ושות' [‪ ]3‬הראו ש‪( 𝐸2 𝑟 = 𝑂(𝑟 2 )-‬לאחר מכן‪ ,‬שאזל ושות' [‪ ]16‬נתנו‬ ‫הוכחה אחרת לאותה תוצאה)‪ .‬עבור ‪ ,𝑑 ≥ 3‬שאזל ושות' [‪ ]16‬הוכיחו ש‪-‬‬ ‫) 𝑑 𝑐)𝑟 ‪ ,𝐸𝑑 𝑟 = 𝑂(𝑟 𝑑 (log‬ומאוחר יותר‪ ,‬מטושק ווגנר [‪ ]35‬נתנו הוכחה פשוטה יותר‬ ‫לתוצאה זו‪ ,‬שגם שיפרה את התלות של הקבוע 𝑑‪ c‬ב‪.𝑑-‬‬ ‫אולם‪ ,‬עד עכשיו לא היה ידוע אף חסם תחתון ל‪ 𝐸𝑑 (𝑟)-‬עבור 𝑑 קבוע ו‪ 𝑟-‬גדול‪ ,‬חוץ מהחסם‬ ‫הטריוויאלי 𝑟 ≥ )𝑟( 𝑑𝐸‪ .‬מטושק [‪ ]34‬הראה חסם תחתון ל‪ 𝑑 -‬גדול עם 𝑟 קבוע‪:‬‬ ‫))‪.𝐸𝑑 50 = Ω(exp( 𝑑/2‬‬ ‫בתזה זו אנו נותנים את החסם התחתון העל‪-‬לינארי הראשון עבור רשתות‪ 𝜀 -‬חלשות‪:‬‬ ‫משפט ‪ :2‬לכל ‪ 𝑑 ≥ 2‬מתקיים )𝑟 ‪.𝐸𝑑 𝑟 = Ω(𝑟 log 𝑑−1‬‬ ‫הקבוצה ‪ 𝐺s‬שנותנת את משפט ‪ 2‬היא שריג מתוח (‪ ,)stretched grid‬כלומר‪ ,‬מכפלה קרטזית‬ ‫של 𝑑 סדרות הגדלות בקצב מספיק מהיר ‪ .‬אנו מראים איך ניתן לנתח קמירות ב שריג זה‬ ‫בעזרת מושג שאנו קוראים לו קמירות‪-‬מדרגה (‪ ,)stair-convexity‬שהוא וריאנט של מושג‬ ‫הקמירות הרגילה‪.‬‬ ‫אנו גם מראים שלשריג המתוח ‪ 𝐺s‬אכן קיימת רשת‪ 1/𝑟 -‬חלשה בגודל )𝑟 ‪,𝑂(𝑟 log 𝑑−1‬‬ ‫ולכן אי אפשר לקבל ממנו אף חסם תחתון יותר טוב ‪ .‬זאת עבודה משותפת עם בוריס בוך‬ ‫ויירי מטושק [‪.]13‬‬ ‫רשתות‪-‬אפסילון חלשות עבור קבוצות "חד‪-‬ממדיות להלכה"‬ ‫בעיית הרשתות‪ 𝜖-‬החלשות גם נחקרה עבור מקרים מיוחדים של הקבוצה הנתונה 𝑋‪ .‬כאן אנו‬ ‫מתייחסים למקרים מסוימים שבהם 𝑋 היא‪ ,‬במובן מסוים ‪ ,‬קבוצה חד‪-‬ממדית להלכה‬ ‫(‪.)intrinsically one-dimensional‬‬ ‫שאזל ושות' [‪ ]16‬הראו שאם ‪ 𝑋 ⊂ 𝐑2‬היא קבוצה מישורית במצב קמור‪ ,‬אז קיימת ל‪-‬‬ ‫𝑋 רשת‪ 1/r -‬חלשה בגודל )𝑟 ‪ .𝑂(𝑟 log1.59‬אנו משפרים את התוצאה הזאת כדלהלן ‪:‬‬

‫משפט ‪ :3‬תהי ‪ 𝑋 ⊂ 𝐑2‬קבוצה במצב קמור‪ .‬אז קיימת עבור 𝑋 רשת‪ 1/r -‬חלשה בגודל‬ ‫) 𝑟 𝛼𝑟(𝑂‪ ,‬כאשר 𝛼 היא פונקצית אקרמן ההופכית ‪.‬‬ ‫הכללה של מקרה זה היא כאשר 𝑑𝐑 ⊂ 𝑋 מוכלת בעקום הנחתך לא יותר מ‪ 𝑑 -‬פעמים ע"י כל‬ ‫על‪-‬מישור‪ .‬עקום עם תכונה זו נקרא עקום קמור (‪ )convex curve‬בכמה מקורות (למשל [‪.)]53‬‬ ‫הדוגמא הכי ידועה של עקום קמור היא עקום המומנט (‪:)moment curve‬‬ ‫‪𝑡, 𝑡 2 , 𝑡 3 , … , 𝑡 𝑑 𝑡 ∈ 𝐑 .‬‬ ‫לקבוצות 𝑋 על עקומים אלו‪ ,‬מטושק ווגנר [‪ ]35‬קיבלו חסם עליון מהצורה )𝑟 ‪,𝑂(𝑟 polylog‬‬ ‫כאשר דרגת הפולינום תלוי ה ב‪ .𝑑-‬אנו משפרים חסם זה כדלהלן ‪:‬‬

‫‪6‬‬

‫תקציר‬ ‫בתזה זו אנו משפרים חסמי ם של כמה בעיות בגיאומטריה בדידה ‪ .‬פונקצית אקרמן ההופכית‬ ‫(‪ )the inverse Ackermann function‬קשורה לכמה מהתוצאות שלנו ‪ ,‬ולכן נגדיר אותה‬ ‫תחילה‪.‬‬

‫‪ .1‬פונקצית אקרמן ההופכית‬ ‫תחילה אנו מגדירים היררכיה של פונקציות )𝑥( 𝑘𝛼 עבור ‪ 𝑘 ≥ 1‬שלם ו‪ 𝑥 ≥ 0-‬ממשי‪ .‬כל‬ ‫פונקציה בהיררכיה הזאת גדלה עם 𝑥 בקצב הרבה יותר איטי מהפונקציה הקודמת‪:‬‬ ‫אנו מגדירים ‪ ,𝛼1 𝑥 = 𝑥/2‬ועבור ‪ 𝑘 ≥ 2‬אנו מגדירים את )𝑥( 𝑘𝛼 באופן רקורסיבי ע"י‬ ‫‪0,‬‬ ‫;‪𝑥 ≤ 1‬‬ ‫‪.‬אחרת ‪1 + 𝛼𝑘 𝛼𝑘−1 𝑥 ,‬‬

‫= 𝑥 𝑘𝛼‬

‫במילים אחרות‪ ,‬לכל ‪ 𝛼𝑘 ,𝑘 ≥ 2‬מחזירה‪ ,‬בהינתן 𝑥‪ ,‬את מספר הפעמים שאנו צריכים לפעיל‬ ‫את ‪ 𝛼𝑘−1‬על 𝑥‪ ,‬עד שנגיע לתוצאה קטנה או שווה ל‪ .1-‬לכן‪ 𝛼2 𝑥 = log 2 𝑥 ,‬ו‪-‬‬ ‫𝑥 ∗ ‪.𝛼3 𝑥 = log‬‬ ‫לכל ‪ 𝑥 ≥ 5‬קבוע‪ ,‬הסדרה )𝑥( ‪ ... ,𝛼3 (𝑥) ,𝛼2 (𝑥) ,𝛼1‬קטנה מהר מאוד‪ ,‬עד שהיא מתייצבת‬ ‫בערך ‪ .3‬אנו מגדירים את פונקצית אקרמן ההופכית להיות‬ ‫‪𝛼 𝑥 = min 𝑘 𝛼𝑘 (𝑥) ≤ 3 .‬‬ ‫)𝑥(𝛼 גדלה עם 𝑥 בקצב יותר איטי מאשר )𝑥( 𝑘𝛼 לכל 𝑘 קבוע‪.‬‬ ‫נשים לב שלפי ההגדרה מתקיים ‪(𝑥) ≤ 3‬‬ ‫למה ‪(𝑥) :1‬‬

‫‪𝑥 −3‬‬

‫𝑥‬

‫𝛼𝛼‪ .‬מצד שני‪:‬‬

‫𝛼𝛼 שואף לאינסוף עם 𝑥‪.‬‬

‫‪ .2‬התוצאות שלנו‬ ‫כעת נציג את הבעיות שאנו חוקרים בתזה זו ‪ ,‬ואת התוצאות שקיבלנו לכל אחת מהן‪.‬‬ ‫‪ .2.1‬רשתות‪-‬אפסילון חלשות‬ ‫בהינתן קבוצה סופית 𝑋 של נקודות ב‪ 𝐑𝑑 -‬ופרמטר ‪ ,𝜖 > 0‬קבוצה 𝑁 ב‪ 𝐑𝑑 -‬נקראת רשת‪𝜖-‬‬ ‫חלשה (‪ )weak 𝜖-net‬עבור 𝑋 אם 𝑁 חותכת כל קבוצה קמורה שמכילה לפחות 𝑋 𝜖 נקודות‬ ‫מ‪.𝑋-‬‬ ‫רשתות‪ 𝜖-‬חלשות נחקרו לראשונה ע "י האוסלר וולצל [‪ .]24‬אחד מהשימושים שלהן הוא‬ ‫בהוכחה של אלון וקלייטמן [‪ ]6‬של השערת ה‪ (𝑝, 𝑞)-‬של האדוויגר‪-‬דברונר‪.‬‬ ‫לשם נוחיות נציב 𝜖‪ ,𝑟 = 1/‬כך ש‪ .𝑟 > 1-‬אלון ושות' [‪ ]3‬הראו שתמיד ניתן לבנות רשתות‪-‬‬ ‫𝑟‪ 1/‬חלשות בגודל שתלוי רק ב‪ 𝑟 -‬ובמימד 𝑑‪ .‬במילים אחרות‪ ,‬לכל ‪ 𝑑 ≥ 2‬קיימת פונקציה‬ ‫‪5‬‬

‫החדשה שלנו לסדרות דוונפורט‪ -‬שינצל מוכללות ( ‪generalized Davenport-Schinzel‬‬ ‫‪ ,)sequences‬ומשפרים את החסמים העליונים עבורן גם כן‪.‬‬ ‫בנוגע לחסמים תחתונים ‪ ,‬אנו מפשטים את הבנייה שנותנת את החסמים התחתונים ל‪𝜆𝑠 (𝑛) -‬‬ ‫עבור ‪ 𝑠 ≥ 4‬זוגי‪ .‬ולסיום‪ ,‬אנו משפרים את החסם התחתון עבור )𝑛( ‪ 𝜆3‬בפקטור קבוע‪,‬‬ ‫ומראים ש‪ 𝜆3 (𝑛)-‬שווה ל‪ 2𝑛𝛼(𝑛)-‬ועוד איברים מסדר גודל יותר קטן ‪.‬‬ ‫תוצאות אלה פורסמו ב‪.]37[ -‬‬ ‫תוצאות נוספות‪ .‬אנו מראים שהשריג המתוח 𝑑𝐑 ⊂ ‪ 𝐺s‬משפר את החסם העליון עבור מה‬ ‫שמכונה למת הבחירה הראשונה (‪ :)first selection lemma‬אף נקודה ב‪ 𝐑𝑑 -‬אינה מוכלת‬ ‫ביותר מ‪-‬‬ ‫) ‪(𝑛/(𝑑 + 1))𝑑+1 + 𝑜(𝑛𝑑+1‬‬ ‫סימפלקסים 𝑑‪-‬מימדיים עם קודקודים ב‪ ,𝐺s -‬כאשר ‪.𝑛 = 𝐺s‬‬

‫בנוסף‪,‬אנו משתמשים בשריג המתוח כדי לשפר את החסם העליון עבור למת הבחירה השנייה‬ ‫(‪ )second selection lemma‬במישור‪ ,‬בפקטור לוגריתמי ‪ :‬לכל‬

‫𝑛‬ ‫‪3‬‬

‫< 𝑡 < 𝑛 ‪ 𝑛2.5 log‬אנו‬

‫בונים קבוצה 𝑇 של 𝑡 משולשים עם קודקודים ב‪ 𝐺s ⊂ 𝐑2 -‬כך שאף נקודה ב‪ 𝐑2 -‬אינה מוכלת‬ ‫ביותר מ‪ 𝑂(𝑡 2 /(𝑛3 log 𝑛))-‬משולשים ב‪( 𝑇 -‬כאשר‪ ,‬שוב‪.)𝑛 = 𝐺s ,‬‬ ‫תוצאות אחרונות אלה הן עבודה משותפת עם בוך ומטושק [‪.]13‬‬ ‫ולסיום‪ ,‬אנו נותנים הוכחה נכונה של החסם התחתון הנוכחי עבור למת הבחירה השנ ייה‬ ‫במישור‪ :‬לכל קבוצה ‪ 𝑆 ⊂ 𝐑2‬בת 𝑛 נקודות ולכל משפחה 𝑇 של 𝑡 משולשים עם קודקודים ב‪-‬‬ ‫𝑆‪ ,‬קיימת נקודה במישור שמוכלת ב‪ Ω(𝑡 3 /(𝑛6 log 2 𝑛)) -‬משולשים מ‪ .𝑇-‬אפשטיין טען‬ ‫תוצאה זו ב‪ ,]22[-‬אך קיימת טעות בהוכחה שלו‪ .‬ההוכחה שלנו מסתמכת על שינוי קטן‬ ‫בטיעון של אפשטיין ‪ .‬זאת עבודה משותפת עם מיכה שריר [‪.]39‬‬

‫‪4‬‬

Abstract

In this thesis we improve bounds for several problems in discrete geometry. One of our more surprising results is that the same complicated expressions involving the inverse Ackermann function appear in the bounds for two unrelated problems.

Weak epsilon-nets. Given a finite set $X$ of points in $\mathbb{R}^d$ and a parameter $\epsilon > 0$, a set $N$ in $\mathbb{R}^d$ is called a weak $\epsilon$-net for $X$ if $N$ intersects every convex set that contains at least $\epsilon |X|$ points of $X$. We set $r = 1/\epsilon$, so that $r > 1$.

We construct, for every fixed $d \ge 2$ and every $r > 1$, a set $G_s$ in $\mathbb{R}^d$ for which every weak $(1/r)$-net has size $\Omega(r \log^{d-1} r)$. This is the first nontrivial lower bound (as a function of $r$ for fixed $d$). The set $G_s$ is a stretched grid, that is, a Cartesian product of $d$ sequences that grow sufficiently fast. We show how convexity in this grid can be analyzed using a notion that we call stair-convexity, which is a variant of the usual notion of convexity. This is joint work with Boris Bukh and Jiří Matoušek [13].

In addition, we improve the upper bounds for weak $\epsilon$-nets for special kinds of sets $X$: we show that if $X \subset \mathbb{R}^2$ is in convex position, then $X$ has a weak $(1/r)$-net of size $O(r\alpha(r))$, where $\alpha$ is the inverse Ackermann function. We generalize this result to other cases in which $X \subset \mathbb{R}^d$ is, in a certain sense, "intrinsically one-dimensional". For such sets $X$ we construct weak $(1/r)$-nets of size $O(r \cdot 2^{\mathrm{poly}(\alpha(r))})$, where the degree of the polynomial depends on $d$.

We obtain these results by a reduction to stabbing interval chains, a new combinatorial problem that is interesting in its own right. We show almost-tight upper and lower bounds for this problem. This is joint work with Noga Alon, Haim Kaplan, Micha Sharir, and Shakhar Smorodinsky [5].

Returning to lower bounds, we construct, for every $d \ge 3$, an "intrinsically one-dimensional" point set for which every weak $(1/r)$-net has size $\Omega(r \cdot 2^{\mathrm{poly}(\alpha(r))})$ (with a smaller polynomial than in the upper bound). The set $D_s$ is in fact the diagonal of the stretched grid $G_s$. This is joint work with Bukh and Matoušek [13].

Davenport–Schinzel sequences. Given an integer $s \ge 1$, a Davenport–Schinzel sequence (DS-sequence for short) of order $s$ is a sequence of symbols that contains no two identical consecutive symbols and no alternation of the form $a \ldots b \ldots a \ldots b \ldots$ of length $s + 2$. The maximum length of such a sequence that contains at most $n$ distinct symbols is denoted $\lambda_s(n)$. It is known that for $s \ge 3$ the bounds for $\lambda_s(n)$ involve the inverse Ackermann function, and they are strikingly similar to the bounds we obtained for weak $\epsilon$-nets and interval chains. We obtain a slight improvement in the upper bounds for $\lambda_s(n)$; for even $s \ge 4$ they already match the lower bounds almost exactly. In addition, we re-derive our upper bounds using a new technique, which uses what we call almost-DS sequences. And in addition, we apply the method



With the help of Heaven

Tel Aviv University
The Raymond and Beverly Sackler Faculty of Exact Sciences
The Blavatnik School of Computer Science

Weak Epsilon-Nets, Davenport–Schinzel Sequences, and Related Problems

This thesis was submitted for the degree of Doctor of Philosophy

by Gabriel Nivasch
under the supervision of Prof. Micha Sharir
Submitted to the Senate of Tel Aviv University
Sivan 5769
