Reconstruction of Generalized Depth-3 Arithmetic Circuits with Bounded Top Fan-in

Zohar S. Karnin∗

Amir Shpilka∗

Abstract

In this paper we give reconstruction algorithms for depth-3 arithmetic circuits with k multiplication gates (also known as ΣΠΣ(k) circuits), where k = O(1). Namely, we give an algorithm that, when given a black box holding a ΣΠΣ(k) circuit C over a field F as input, makes queries to the black box (possibly over a polynomial sized extension field of F) and outputs a circuit C′ computing the same polynomial as C. In particular we obtain the following results.

1. When C is a multilinear ΣΠΣ(k) circuit (i.e. each of its multiplication gates computes a multilinear polynomial), our algorithm runs in polynomial time (when k is a constant) and outputs a multilinear ΣΠΣ(k) circuit computing the same polynomial.

2. In the general case, our algorithm runs in quasi-polynomial time and outputs a generalized depth-3 circuit (a notion that is defined in the paper) with k multiplication gates. In particular, the polynomials computed by generalized depth-3 circuits can be computed by quasi-polynomial size depth-3 circuits. In fact, our algorithm works in the slightly more general case where the black box holds a generalized depth-3 circuit.

Prior to this work there were reconstruction algorithms for several different models of bounded depth circuits: the well studied class of depth-2 arithmetic circuits (that compute sparse polynomials) and the closely related model of depth-3 set-multilinear circuits. For the class of depth-3 circuits, only the case of k = 2 (i.e. ΣΠΣ(2) circuits) was known.

Our proof technique combines ideas from [Shp09] and [KS08] with some new ones. Most notably, we prove the existence of a unique canonical representation of depth-3 circuits. This enables us to work with a specific representation in mind. Another technical contribution is an isolation lemma for depth-3 circuits that enables us to reconstruct a single multiplication gate of the circuit.

∗ Faculty of Computer Science, Technion, Haifa 32000, Israel. Email: {zkarnin,shpilka}@cs.technion.ac.il. Research supported by the Israel Science Foundation (grant number 439/06).

Contents

1 Introduction
  1.1 Depth-3 circuits
  1.2 Statement of our results
  1.3 Related works
  1.4 Our Techniques
  1.5 Organization

2 Preliminaries
  2.1 Generalized Depth 3 Arithmetic Circuits
  2.2 Rank Preserving Subspaces
    2.2.1 Rank Preserving Subspaces that Preserve Multilinearity
  2.3 A "Distance Function" for ΣΠΣ Circuits

3 Canonical τ-Distant Circuits
  3.1 Proof of Existence
  3.2 Uniqueness

4 Reconstructing ΣΠΣ(k, d, ρ) Circuits
  4.1 Reconstructing a Low Rank Circuit
  4.2 Finding a τ-Distant Representation in a Low Dimension Subspace
    4.2.1 Step 1: Finding the set of subspaces
    4.2.2 Steps 2 & 3: Gluing the Restrictions Together
    4.2.3 The Algorithm for finding a τ-distant circuit
  4.3 The Reconstruction Algorithm

5 Reconstructing Multilinear ΣΠΣ(k) circuits
  5.1 Lifting a Low Rank Multilinear ΣΠΣ(k) Circuit
  5.2 Finding a Partition of the Circuit in a Low Dimension Subspace
  5.3 The Algorithm

A Toolbox
  A.1 Black-Box Factorization
  A.2 Brute Force Interpolation
  A.3 Reconstructing Linear Functions
  A.4 Deterministic Polynomial Identity Testing Algorithms for Depth-3 Circuits

B Proof of Lemma 4.16

List of Algorithms

1  Canonical partition of a circuit
2  Gluing together low dimension restrictions of a low rank circuit
3  Reconstructing a circuit given a depth-1 m-linear function tree
4  Reconstructing a circuit given an m-linear function tree
5  Finding the τ-distant circuit of a polynomial
6  Learning a ΣΠΣ(k, d, ρ) circuit
7  Lifting the g.c.d of a low ∆ measured circuit
8  Lifting a low rank multilinear circuit to F^n
9  Lifting a multilinear circuit to F^n
10 Reconstruction of a multilinear ΣΠΣ(k) circuit
11 Brute force interpolation

1 Introduction

In this work we consider the problem of reconstructing an arithmetic circuit to which we have oracle access: we are given a black box holding an arithmetic circuit C, and we wish to construct a circuit C′ that computes the same polynomial as C. Our only access to the polynomial computed by C is via black-box queries. That is, we are allowed to pick inputs (adaptively) and query the black box for the value of C on those inputs.

The problem of reconstructing arithmetic circuits using queries is an important question in algebraic complexity that has received a lot of attention. As the problem is notoriously difficult, research has focused on restricted models such as the model of depth-2 arithmetic circuits (circuits computing sparse polynomials) that was extensively studied over various fields (see e.g. [BOT88, KS01] and the references within), the model of read-once arithmetic formulas [HH91, BHH95, BB98, SV08, SV09] and the model of set-multilinear depth-3 circuits [BBB+00, KS06a]. Another fruitful line of research focused on proving hardness of learning results [FK06, KS06b].

The focus of this work is on the class of depth-3 arithmetic circuits that have a bounded number of multiplication gates, over some finite field F. We give two different algorithms for this model. The first is a polynomial time reconstruction algorithm for the case that the circuit is a multilinear circuit. In the general case, without the multilinearity assumption, we give a quasi-polynomial time reconstruction algorithm. These results extend and generalize a previous work by the second author that gave reconstruction algorithms for depth-3 circuits with only 2 multiplication gates [Shp09].

We would like to say a word about the terminology used in this paper. The problem of reconstruction is sometimes referred to as the learning problem or as the interpolation problem.
We prefer to think of the learning problem as describing a scenario in which the learner makes queries from the input domain and not, as in our algorithms, from an extension field. Similarly, we think of the interpolation problem as a problem of obtaining the polynomial and not an implicit representation of it (e.g. by an arithmetic circuit). We now give the required background before stating our results.

1.1 Depth-3 circuits

For reconstruction purposes we only consider circuits with a + gate at the top. Notice that for depth-3 circuits with a multiplication gate at the top, reconstruction is possible using known factorization algorithms [Kal85, KT90, Kal95]. A depth-3 circuit with a + gate at the top, i.e. a ΣΠΣ circuit, has the following structure:

$$C = \sum_{i=1}^{k} M_i = \sum_{i=1}^{k} \prod_{j=1}^{d_i} L_{i,j}(x_1, \ldots, x_n), \qquad (1)$$

where the L_{i,j}'s are linear functions. The {M_i}_{i=1}^k are called the multiplication gates of C, and k is the fan-in of the top plus gate. We call a depth-3 circuit with top fan-in k a ΣΠΣ(k) circuit.

Although depth-3 circuits seem to be a very restricted model of computation, understanding them is one of the greatest challenges in arithmetic circuit complexity. It is the first model for which lower bounds are difficult to prove. Over finite fields exponential lower bounds were obtained [GK98, GR00], but over fields of characteristic zero only quadratic lower bounds are known [SW01, Shp02]. Moreover, recently Agrawal and Vinay [AV08] showed¹ that exponential lower bounds for depth-4 circuits imply exponential lower bounds for general arithmetic circuits. In addition, [AV08] showed that a polynomial time black-box polynomial identity testing algorithm for depth-4 circuits implies a (quasi-polynomial time) derandomization of polynomial identity testing of general arithmetic circuits. Since for depth-2 circuits most of the questions are not very difficult, we see that depth-3 circuits stand between the relatively easy depth-2 case and the very difficult depth-4 case. Hence it is an important goal to better understand the class of depth-3 circuits.

¹ This result also follows from the earlier works of [VSBR83, Raz08].
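To make the black-box setting concrete, here is a minimal Python sketch (our own toy representation, not code from the paper) of a ΣΠΣ(k) circuit as in Equation (1): each multiplication gate is a list of linear functions, each given by a coefficient vector and a constant term, and evaluation plays the role of the black box.

```python
def linear(coeffs, const, point):
    """Evaluate L(x) = sum_i coeffs[i] * x[i] + const at the given point."""
    return sum(c * x for c, x in zip(coeffs, point)) + const

def eval_sps(circuit, point):
    """circuit: a list of multiplication gates; each gate is a list of
    (coeffs, const) pairs, one pair per linear function L_{i,j}."""
    total = 0
    for gate in circuit:
        prod = 1
        for coeffs, const in gate:
            prod *= linear(coeffs, const, point)
        total += prod
    return total

# C = (x0 + x1)(x0 - x1) + (2*x0 + 1): a SigmaPiSigma(2) circuit, here over Q.
C = [
    [([1, 1], 0), ([1, -1], 0)],  # M_1 = (x0 + x1)(x0 - x1)
    [([2, 0], 1)],                # M_2 = 2*x0 + 1
]
print(eval_sps(C, (3, 2)))  # (3+2)(3-2) + (2*3+1) = 5 + 7 = 12
```

A reconstruction algorithm only sees values such as the one printed above, for points of its choosing, and must recover a circuit of this form.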

1.2 Statement of our results

We give two reconstruction algorithms: the first for general ΣΠΣ(k) circuits and the second for multilinear ΣΠΣ(k) circuits. While in the case of multilinear depth-3 circuits our algorithm returns a multilinear ΣΠΣ(k) circuit, in the general case it returns what we call a generalized depth-3 circuit. We note that generalized depth-3 circuits can be presented as depth-3 circuits of quasi-polynomial size. The earlier result of [Shp09] also has a similar form. We now define the notion of generalized depth-3 circuits. We say that a polynomial f(x̄) is computed by a ΣΠΣ(k, d, ρ) circuit if it can be represented in the following form

$$f(\bar{x}) = \sum_{i=1}^{k} M_i = \sum_{i=1}^{k} \left( \prod_{j=1}^{d_i} L_{i,j}(\bar{x}) \right) \cdot h_i\left( \tilde{L}_{i,1}(\bar{x}), \ldots, \tilde{L}_{i,\rho_i}(\bar{x}) \right), \qquad (2)$$

where the L_{i,j}'s and the L̃_{i,j}'s are linear functions in the variables x̄ = (x_1, ..., x_n), over F. Every h_i is a polynomial in ρ_i ≤ ρ variables, and the functions {L̃_{i,j}}_{j=1}^{ρ_i} are linearly independent. We shall assume, w.l.o.g., that each h_i depends on all of its ρ_i variables. We call M_1, ..., M_k the multiplication gates of the circuit (M_i = Π_{j=1}^{d_i} L_{i,j} · h_i). The degree of the circuit, d = deg(C), is defined as the maximal degree of its multiplication gates (i.e. d = max_{i=1...k} {deg(M_i)}). Thus, a ΣΠΣ(k, d, ρ) circuit is a generalized depth-3 circuit of degree d with k multiplication gates. The size of a ΣΠΣ(k, d, ρ) circuit is defined as the sum of degrees of the multiplication gates of the circuit (thus, the size of the circuit given in Equation (2) is Σ_{i=1}^k deg(M_i) ≤ dk). We denote the size of a circuit C by size(C).

When ρ = 0 (i.e. each h_i is a constant function) we get the class of depth-3 circuits with k multiplication gates and degree d, also known as ΣΠΣ(k, d) circuits (similarly we have the class ΣΠΣ(k) of depth-3 circuits with k multiplication gates). When k and d are arbitrary we get the class of depth-3 circuits, which we denote by ΣΠΣ. Notice that the difference between generalized ΣΠΣ(k) circuits and ΣΠΣ(k) circuits is the non-linear part h_i of the multiplication gates. Also note that ΣΠΣ(k, d, ρ) circuits can be simulated by depth-3 circuits with k · d^ρ multiplication gates. Thus, when ρ is polylogarithmic in n (as in this paper), a ΣΠΣ(k, d, ρ) circuit can be simulated by a depth-3 circuit with quasi-polynomially many multiplication gates.

Our result for reconstructing ΣΠΣ(k) circuits actually holds for generalized depth-3 circuits. Namely, our reconstruction algorithm, when given black-box access to a ΣΠΣ(k, d, ρ) circuit, returns a ΣΠΣ(k, d, ρ′) circuit computing the same polynomial. Therefore, instead of stating our result for ΣΠΣ(k) circuits we state it in its most general form.

Theorem 1. Let f be an n-variate polynomial computed by a ΣΠΣ(k, d, ρ) circuit, over a field F. Then there is a reconstruction algorithm for f that runs in time poly(n) · exp(log(|F|) · log(d)^{O(k³)} · ρ^{O(k)}), and outputs a ΣΠΣ(k, d, ρ′) circuit for f (ρ′ is equal to ρ up to an additive polylogarithmic factor). When |F| = O(d⁵) the algorithm may ask queries from a polynomial size algebraic extension field of F.

Our second result deals with the case of multilinear ΣΠΣ(k) circuits. A multilinear circuit is a circuit in which every multiplication gate computes a multilinear polynomial.

Theorem 2. Let f be a multilinear polynomial in n variables that is computed by a multilinear ΣΠΣ(k) circuit, over a field F. Then there is a reconstruction algorithm for f that runs in time (n + |F|)^{2^{O(k log k)}} and outputs a multilinear ΣΠΣ(k) circuit computing f. When |F| = O(n⁵) the algorithm may ask queries from a polynomial size algebraic extension field of F.

Thus, for the multilinear case we give a reconstruction algorithm that also returns a multilinear circuit of the same complexity.
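The degree and size bookkeeping for generalized gates can be illustrated with a small sketch (our own representation, not the paper's code; it assumes deg(M_i) = d_i + deg(h_i), i.e. no degree cancellation when h_i is composed with the linear forms).

```python
# Each gate M_i is summarized by a triple (d_i, deg(h_i), rho_i): the number
# of linear factors, the total degree of the non-linear term h_i, and the
# number of linear functions h_i depends on.

def gate_degree(num_linear_factors, h_total_degree):
    """deg(M_i) = d_i + deg(h_i), assuming no cancellations occur when
    h_i is composed with linear forms (our simplifying assumption)."""
    return num_linear_factors + h_total_degree

def circuit_degree(gates):
    """deg(C) = max_i deg(M_i)."""
    return max(gate_degree(d_i, h_deg) for d_i, h_deg, _rho_i in gates)

def circuit_size(gates):
    """size(C) = sum_i deg(M_i), which is at most d * k."""
    return sum(gate_degree(d_i, h_deg) for d_i, h_deg, _rho_i in gates)

gates = [(4, 0, 0), (2, 3, 2), (5, 1, 1)]  # three hypothetical gates
print(circuit_degree(gates))  # max(4, 5, 6) = 6
print(circuit_size(gates))    # 4 + 5 + 6 = 15
```

Note that the first gate has ρ_i = 0 and deg(h_i) = 0, i.e. it is an ordinary ΣΠΣ multiplication gate.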

1.3 Related works

We are aware of two reconstruction algorithms for restricted depth-3 circuits. The first, by [BBB+00, KS06a], gives a reconstruction algorithm for set-multilinear depth-3 circuits. Although they are depth-3 circuits, set-multilinear circuits are closer in nature to depth-2 circuits; in fact, it is better to think of them as a generalization of depth-2 circuits. In particular, the techniques used to reconstruct set-multilinear circuits also apply when the input is a depth-2 circuit, but completely fail even for the case of multilinear ΣΠΣ(2) circuits (see [Shp09] for an explanation). The second related result is a reconstruction algorithm for ΣΠΣ(2) circuits [Shp09]. Our work uses ideas from this earlier work together with some new techniques in order to obtain our (more general) results.

Another line of results related to this paper is that of algorithms for polynomial identity testing (PIT for short) of depth-3 circuits. In [DS06] a quasi-polynomial time non-black-box PIT algorithm was given for ΣΠΣ(k) circuits (for a constant k). This result was later improved by [KS07], which gave a polynomial time algorithm for ΣΠΣ(k) circuits (again for a constant k). In [KS08] we managed to generalize the results of [DS06] to the black-box setting. [SS09, KS09, SS10b] managed to improve a theorem of [DS06] regarding the rank of depth-3 identities, which immediately improves our previous algorithm. Unfortunately, even after these improvements, the running time of the algorithm is quasi-polynomial in n (when F is a finite field). Recently, [SS10a] obtained a polynomial time algorithm for such circuits by using the methods in [KS08] along with a different structural theorem.

We note that black-box algorithms for the polynomial identity testing problem are closely related to reconstruction algorithms. This is because they give a set of inputs such that the value of the circuit on those inputs completely determines the circuit. Hence, they give sufficient information for reconstructing the circuit.
The main problem, of course, is coming up with an efficient algorithm that will use this information to reconstruct the circuit. Such algorithms are known for depth-2 circuits, where several PIT algorithms were used for reconstruction (see e.g. [KS01] and the references within). We are unaware of any generic way of doing so (namely, moving from PIT to reconstruction). This is exactly the reason that the black-box PIT algorithms of [KS01] for ΣΠΣ(2) circuits and of [KS08] for ΣΠΣ(k) circuits do not immediately imply reconstruction algorithms.

The problem of reconstructing arithmetic circuits is also closely related to the problem of learning arithmetic circuits. In the learning scenario the focus is on learning arithmetic circuits over small fields (usually over F₂). Moreover, one is usually satisfied with outputting an approximation of the unknown arithmetic circuit. For example, it is not difficult to see that the learning algorithm of Kushilevitz and Mansour [KM93] can learn ΣΠΣ(k) circuits over constant sized fields, for k = O(log n), in polynomial time. However, the model that we are considering is different: we are interested in larger fields, in particular we allow queries from extension fields, and we want to output the exact same polynomial. This difference is prominent when considering reconstruction algorithms; see e.g. the discussion in [KS01].

Other related works include hardness results for learning arithmetic circuits. In [FK06] Fortnow and Klivans showed that polynomial time reconstruction of arithmetic circuits implies a lower bound for the class ZPEXP^RP. However, for ΣΠΣ(k) circuits this does not give much, as it is easy to think of polynomials that are not computed in this model. Another related result was given in [KS06b], where Klivans and Sherstov showed a hardness result for PAC learning depth-3 arithmetic circuits.


Namely, they show that there is no efficient PAC learning algorithm for depth-3 arithmetic circuits that works for every distribution. It is unclear, however, whether this result can be extended to show a hardness result for a membership-query algorithm over the uniform distribution (like the model under consideration here). Indeed, for the reconstruction problem of arithmetic circuits the uniform distribution is the most relevant model. Thus, although we believe that the problem of reconstructing general depth-3 circuits is difficult, we do not have sufficiently strong hardness results to support this belief. Proving (or disproving) the hardness of this problem remains an interesting open question.

1.4 Our Techniques

Our algorithms use the intuition behind the construction of the test-sets of [KS08] (that give a black-box PIT algorithm), although here we need to evaluate the circuit at a much larger set of points. The scheme of the algorithm is similar in nature to the algorithm of [Shp09]; however, it is much more complicated and requires several new ideas that we now outline. Conceptually we have two main new ideas. The first is the notion of a canonical circuit and the second is an isolation lemma. We now briefly discuss each of these ideas.

Canonical τ-distant circuits: An important ingredient of our proof is the definition of a canonical τ-distant circuit for a polynomial f that is computed by a (generalized) depth-3 circuit. The canonical circuit is a uniquely defined (by f) generalized depth-3 circuit computing f (see Corollary 3.7). The advantage of this definition is that since the canonical circuit is unique, if we manage to compute its restriction to some low dimensional subspace then we can "lift" each multiplication gate separately to the whole space. In this way we obtain the canonical circuit for f. In other words, if C = M₁ + ... + M_k is the canonical circuit for f (each M_i is a multiplication gate), then M₁|V + ... + M_k|V is the canonical circuit for f|V, where V is a rank-preserving subspace (as defined in [KS08], see Section 2.2)². An important step towards the definition of a canonical circuit is the definition of a distance function between multiplication gates (see Section 2.3). Roughly, multiplication gates that are close to each other are merged to form a new (generalized) multiplication gate. As a result, in a canonical circuit any two multiplication gates are far from each other. This robust structure allows us to deal with each multiplication gate separately.

Isolation Lemma: The second new ingredient in our proof is an isolation lemma that basically shows that for any canonical circuit C = M₁ + ... + M_k there exist an index 1 ≤ m ≤ k and a set of subspaces V = {U_i} such that M_j|U_i = 0, for every j ≠ m and every subspace U_i. In particular, C|U_i ≡ M_m|U_i. Moreover, it is possible to reconstruct M_m from the set of circuits {M_m|U_i} (we discuss this in more detail in Section 4). The proof of the existence of such a structure relies on ideas from the work of Saxena and Seshadhri [SS09].

Finally, we also require an idea that appeared in our previous work [KS08]. There, we defined the notion of rank-preserving subspaces and used it to give a deterministic sub-exponential black-box PIT algorithm for ΣΠΣ(k, d, ρ) circuits. Here we show that rank-preserving subspaces can be used to derandomize the reconstruction algorithm of [Shp09]. In particular, this makes our algorithm deterministic whereas the algorithm of [Shp09] was randomized.

Given these tools we manage to follow the algorithmic scheme laid out in [Shp09]. Roughly, the idea is as follows: First we restrict our inputs to a rank-preserving subspace V. Then, using the isolation lemma, we reconstruct the multiplication gates of the canonical circuit for f|V. After that we further reconstruct f|V_i (for i ∈ [n]), where V is of co-dimension 1 inside each V_i and span(∪_{i=1}^n V_i) = F^n.

² To understand the general outline of the proofs, think of V as a random low dimension subspace in which (many of) the dependencies between the linear forms appearing in the circuit are preserved.


Finally, we use the uniqueness of the canonical circuit for f to "glue" the different circuits of {f|V_i}_i together and obtain a single circuit for f. While this is the general scheme, there are a few differences between the multilinear and the general case. In the multilinear case the difficult part is lifting the circuit, namely, reconstructing C from canonical representations of {C|V_i}_i. It turns out that, unlike in the general case, uniqueness is not guaranteed. However, we show that if the rank-preserving subspace V has several additional properties, then basically any lift will suffice. In contrast, for the general case the bottleneck lies in reconstructing the circuit C|V. This is the place where we need to apply the isolation lemma. A more detailed overview of the algorithm for the different cases can be found in Sections 4 and 5.

1.5 Organization

The paper is organized as follows. In Section 2 we give some definitions and discuss properties of restrictions of linear functions to affine subspaces. We then describe the results of [KS08, SS09] that give deterministic PIT algorithms for depth-3 circuits. Specifically, we present the rank bounds of [SS09] and the notion of rank-preserving subspaces of [KS08]. In Section 3 we define the notion of a τ-distant ΣΠΣ(k, d, ρ) circuit and prove an existence and uniqueness result. In Section 4 we prove Theorem 1 and in Section 5 we prove Theorem 2.

2 Preliminaries

For a positive integer k we denote [k] = {1, ..., k}. A partition of a set S is a set of nonempty subsets of S such that every element of S is in exactly one of these subsets. Let F be a field. We denote with F^n the n-dimensional vector space over F. We shall use the notation x̄ = (x_1, ..., x_n) to denote the vector of n indeterminates. For a pair of (multivariate) polynomials g and h we say that g divides h with multiplicity a when g^a divides h. For two non-zero linear functions L₁, L₂ we write L₁ ∼ L₂, or alternatively say that L₁ and L₂ are equivalent, whenever L₁ and L₂ are linearly dependent (that is, for some α, β ∈ F where at least one of the elements is non-zero, αL₁ + βL₂ = 0).

Let V = V₀ + v₀ ⊆ F^n be an affine subspace, where v₀ ∈ F^n and V₀ ⊆ F^n is a linear subspace. Let L(x̄) be a linear function. We denote with L|V the restriction of L to V. Assume the dimension of V₀ is t; then L|V can be viewed as a linear function of t variables in the following way. Let {v_i}_{i∈[t]} be a basis for V₀. For v ∈ V let v = Σ_{i=1}^t x_i · v_i + v₀ be its representation according to the basis. We get that

$$L(v) = \sum_{i=1}^{t} x_i \cdot L(v_i) + L(v_0) \stackrel{\triangle}{=} L|_V(x_1, \ldots, x_t).$$

In order for L|V(x_1, ..., x_t) to be well defined, {v_i}_{i∈[t]} should be chosen in some unique way; we give a definition for a "default" basis later³.

A linear function L will sometimes be viewed as a vector of n + 1 entries. Namely, the function L(x_1, ..., x_n) = Σ_{i=1}^n α_i · x_i + α₀ corresponds to the vector of coefficients (α₀, α₁, ..., α_n). Accordingly, we define the span of a set of linear functions of n variables as the span of the corresponding vectors (i.e. as a subspace of F^{n+1}). For an affine subspace V of dimension t, the linear function L|V can be viewed as a vector of t + 1 entries. Thus, V defines a linear transformation from F^{n+1} to F^{t+1}. For example, let V = V₀ + v₀ be as above, and let {v_i}_{i∈[t]} be a basis for V₀. Let A be the (t + 1) × (n + 1) matrix whose i'th row, for 1 ≤ i ≤ t, is v_i and whose (t + 1)'th row is v₀. Then A represents the required linear transformation.

Let L₁, ..., L_m be a set of linear functions. We denote the span of these linear functions along with 1 (i.e., the constant function) by span₁(L₁, ..., L_m). For a subspace L of linear functions, we define a measure of its dimension "modulo 1" as the dimension of the subspace obtained by taking the homogeneous parts of its linear functions. We denote the "modulo 1" dimension measure by dim₁(L). For convenience we say that L₁, ..., L_m are linearly independent_H to indicate that their homogeneous parts are linearly independent.

³ We mention that this is a technical issue. Any bijection between bases and subspaces would suffice.
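The restriction L|_V and the equivalence relation ∼ can be sketched as follows (a toy Python illustration of the definitions above; the representation and function names are our own):

```python
def restrict_linear(coeffs, const, basis, v0):
    """Restrict L(x) = <coeffs, x> + const to V = V0 + v0, where basis spans
    V0. The coefficient of the new variable x_i is the homogeneous part of L
    evaluated at basis[i]; the new constant is L(v0)."""
    hom = lambda vec: sum(a * b for a, b in zip(coeffs, vec))
    new_coeffs = [hom(vi) for vi in basis]
    new_const = hom(v0) + const
    return new_coeffs, new_const

def equivalent(L1, L2):
    """L1 ~ L2 iff their (n+1)-entry coefficient vectors are linearly
    dependent, i.e. all 2x2 minors of the 2 x (n+1) matrix vanish."""
    u = list(L1[0]) + [L1[1]]
    v = list(L2[0]) + [L2[1]]
    n1 = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n1) for j in range(n1))

# L(x1,x2,x3) = x1 + 2*x2 - x3 + 5, restricted to V0 = span{(1,0,1),(0,1,0)}
# shifted by v0 = (1,1,1): L|_V(x1,x2) = 0*x1 + 2*x2 + 7.
print(restrict_linear([1, 2, -1], 5, [(1, 0, 1), (0, 1, 0)], (1, 1, 1)))
print(equivalent(([2, 4], 6), ([1, 2], 3)))  # True: L1 = 2 * L2
```

This is exactly the linear map F^{n+1} → F^{t+1} described above, written out coordinate by coordinate.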

2.1 Generalized Depth 3 Arithmetic Circuits

We first recall the usual definition of depth-3 circuits. A depth-3 circuit with k multiplication gates of degree d has the following form:

$$C = \sum_{i=1}^{k} M_i = \sum_{i=1}^{k} \prod_{j=1}^{d_i} L_{i,j}(x_1, \ldots, x_n), \qquad (3)$$

where each L_{i,j} is a linear function in the input variables and d = max_{i=1...k} {deg(M_i)}. Recall that we defined a ΣΠΣ(k, d, ρ) circuit (see Equation (2)) to be a circuit of the form

$$C = \sum_{i=1}^{k} M_i = \sum_{i=1}^{k} \left( \prod_{j=1}^{d_i} L_{i,j}(\bar{x}) \right) \cdot h_i\left( \tilde{L}_{i,1}(\bar{x}), \ldots, \tilde{L}_{i,\rho_i}(\bar{x}) \right). \qquad (4)$$

We thus see that in a generalized depth-3 circuit multiplication gates can have an additional term that is a polynomial depending on at most ρ linear functions. For each M_i (as in Equation (4)), we assume w.l.o.g. that h_i has no linear factors. The following notions will be used throughout this paper.

Definition 2.1. Let C be a ΣΠΣ(k, d, ρ) arithmetic circuit that computes a polynomial as in Equation (4).

1. For every multiplication gate M_i we define Lin(M_i) = Π_{j=1}^{d_i} L_{i,j}(x̄) (we use the notations of Equation (4)). That is, Lin(M_i) is the product of all the linear factors of M_i (recall that h_i has no linear factors). We call h_i the non-linear term of M_i.

2. For each A ⊆ [k], we define C_A(x̄) to be a sub-circuit of C as follows: C_A(x̄) = Σ_{i∈A} M_i(x̄).

3. Define gcd(C) as the product of all the non-constant linear functions that appear as factors in all the multiplication gates, i.e. gcd(C) = gcd(Lin(M₁), ..., Lin(M_k)). A circuit will be called simple if gcd(C) = 1.

4. The simplification of C, sim(C), is defined as sim(C) ≜ C / gcd(C).

5. Define

$$\mathrm{Lin}(C) \triangleq \{L_{i,j}\}_{i\in[k], j\in[d_i]} \cup \left( \bigcup_{i=1}^{k} \mathrm{span}_1\left( \{\tilde{L}_{i,j}\}_{j\in[\rho_i]} \right) \right).$$

Notice that we take every linear function in the span of each {L̃_{i,j}}_{j∈[ρ_i]} to be in Lin(C).

6. Define rank(C) as the dimension of the span of the linear functions in C. That is,

$$\mathrm{rank}(C) = \dim_1\left( \mathrm{span}_1\left( \{L_{i,j}\}_{i,j} \cup \{\tilde{L}_{i,j}\}_{i,j} \right) \right) = \dim_1(\mathrm{Lin}(C)).$$
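For intuition, rank(C) for an explicit circuit is just the rank of the matrix of (homogeneous) coefficient vectors; the sketch below (our own code, shown over Q for readability and for the ρ = 0 case) computes it by Gaussian elimination.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix over Q by Gaussian elimination; over a finite field
    one would do the same arithmetic mod p."""
    rows = [[Fraction(x) for x in r] for r in rows if any(r)]
    rank, col = 0, 0
    ncols = len(rows[0]) if rows else 0
    while rank < len(rows) and col < ncols:
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
        col += 1
    return rank

def circuit_rank(gates):
    """gates: list of gates, each a list of homogeneous coefficient vectors
    of its linear functions (the rho = 0 case of Definition 2.1)."""
    return matrix_rank([L for gate in gates for L in gate])

# M1 = (x1 + x2)(x1 - x2), M2 = x1 * x3:
# span{(1,1,0), (1,-1,0), (1,0,0), (0,0,1)} has dimension 3.
print(circuit_rank([[(1, 1, 0), (1, -1, 0)], [(1, 0, 0), (0, 0, 1)]]))  # 3
```

For ρ > 0 one would also append a basis of the homogeneous parts of the L̃_{i,j}'s, per item 6 of the definition.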

A word of clarification is needed regarding the definition of Lin(C) and rank(C). Notice that the definition seems to depend on the specific choice of linear functions L̃_{i,j}. That is, it may be the case (and it is indeed the case) that every polynomial h_i(L̃_{i,1}, ..., L̃_{i,ρ_i}) can be represented as a (different) polynomial in some other set of linear functions. However, the following lemma from [Shp09] shows that the specific representation that we chose changes neither the rank nor the set Lin(C). We say that a polynomial h(x̄) is a polynomial in exactly k linear functions if h can be written as a polynomial in k linear functions but not in k − 1 linear functions.

Lemma 2.2 (Lemma 20 in [Shp09]). Let h(x̄) be a polynomial in exactly k linear functions. Let P(ℓ′₁, ..., ℓ′_k) = h = Q(ℓ₁, ..., ℓ_k) be two different representations for h. Then span₁({ℓ′_i}_{i∈[k]}) = span₁({ℓ_i}_{i∈[k]}).

We use the notation C ≡ f to denote the fact that a ΣΠΣ circuit⁴ C computes the polynomial f. Notice that this is a syntactic definition: we think of the circuit as computing a polynomial and not a function over the field. Let C be a ΣΠΣ(k) circuit. We say that C is minimal if there is no A ⊊ [k] such that C_A ≡ 0. The following theorem, which relies on the new results of [SS09], gives a bound on the rank of identically zero ΣΠΣ(k, d, ρ) circuits.

Theorem 2.3 (Lemma 4.2 of [KS08] combined with Theorem 2 of [SS09]). Let k ≥ 3, and let C be a simple and minimal ΣΠΣ(k, d, ρ) circuit such that deg(C) ≥ 2 and C ≡ 0. Using the notations of Equation (4), we have that

$$\mathrm{rank}(C) < O\left(k^3 \log(d)\right) + \sum_{i=1}^{k} \rho_i \leq O\left(k^3 \log(d)\right) + k\rho.$$

For convenience, we define R(k, d, ρ) ≜ O(k³ log(d) + kρ) to be the above bound on the rank. It follows that R(k, d, ρ) is larger than the rank of any identically zero simple and minimal ΣΠΣ(k, d, ρ) circuit. We also define R(k, d) ≜ R(k, d, 0) as the upper bound for the rank of a simple and minimal identically zero ΣΠΣ(k, d) circuit.

The following theorem gives a bound on the rank of multilinear ΣΠΣ(k) circuits that are identically zero.

Theorem 2.4 (Corollary 6.9 of [DS06] combined with Theorem 2 of [SS09]). There exists an integer function R_M(k) = O(k³ log k) such that every multilinear ΣΠΣ(k) circuit C that is simple, minimal and equal to zero satisfies rank(C) < R_M(k).

Specifically, R_M(k) denotes the minimal integer larger than the rank of any identically zero simple and minimal multilinear ΣΠΣ(k) circuit. This theorem will be used in Section 5, where we discuss multilinear circuits.

2.2 Rank Preserving Subspaces

Throughout the paper we use subspaces of low dimension that preserve the circuit rank to some extent. Such subspaces were introduced in [KS08] for the purpose of deterministic black-box identity testing of polynomials that are computable by ΣΠΣ(k, d, ρ) circuits. We define these subspaces, state some of their useful properties and give the construction of [KS08]. Most of the lemmas of this section appear in [KS08], and although the rank bound of [SS09] was not known to [KS08], their proofs remain the same. Therefore, we omit the proofs of the following lemmas.

Given a ΣΠΣ(k, d, ρ) circuit C = Σ_{i=1}^k M_i and a subspace V, we define C|V to be the circuit whose multiplication gates are {M_i|V}_{i∈[k]}. Note that this is a syntactic definition; we do not make an attempt to find a "better" representation for C|V.

⁴ When speaking of ΣΠΣ circuits we (also) refer to generalized depth-3 circuits.

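The syntactic restriction C|_V can be sketched directly (our own toy representation: each linear function is a coefficient vector plus a constant; each gate's linear functions are restricted to V = V0 + v0 independently).

```python
def restrict_linear(coeffs, const, basis, v0):
    """L|_V(x_1..x_t): coefficient i is the homogeneous part of L at basis[i],
    and the new constant term is L(v0)."""
    hom = lambda vec: sum(a * b for a, b in zip(coeffs, vec))
    return [hom(vi) for vi in basis], hom(v0) + const

def restrict_circuit(gates, basis, v0):
    """gates: list of gates, each a list of (coeffs, const) linear functions.
    Returns the gate-by-gate (syntactic) restriction C|_V."""
    return [[restrict_linear(c, a, basis, v0) for c, a in gate] for gate in gates]

# M1 = x1*x2 and M2 = x1 + x2 + 1, restricted to the line {(t, t) : t in F}:
gates = [[([1, 0], 0), ([0, 1], 0)], [([1, 1], 1)]]
print(restrict_circuit(gates, [(1, 1)], (0, 0)))
# -> [[([1], 0), ([1], 0)], [([2], 1)]], i.e. M1|_V = t*t and M2|_V = 2t + 1
```

Note that no simplification is attempted after restricting, matching the syntactic definition above.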

Definition 2.5. Let C be a ΣΠΣ(k, d, ρ) circuit and V an affine subspace. We say that V is r-rankpreserving for C if the following properties hold: 1. For any two linear functions L1 , L2 ∈ Lin(C) such that L1  L2 , it holds that L1 |V  L2 |V . 2. ∀A ⊆ [k], rank(sim(CA )|V ) ≥ min{rank(sim(CA )), r}. 3. No multiplication gate5 M ∈ C vanishes on V . In other words M |V ̸≡ 0 for every multiplication gate M ∈ C. 4. Lin(M )|V = Lin(M |V ) for every multiplication gate M in C (that is, the non-linear term of M has no new linear factors when restricted to V ). The following lemma lists some of the useful properties of rank-preserving subspaces. Lemma 2.6 (Lemma 3.2 and specific cases of Theorem 3.4 of [KS08]). Let C be a (generalized) depth-3 circuit and V be an r-rank-preserving affine subspace for C. Then we have the following: 1. For every ∅ ̸= A ⊆ [k], V is r-rank preserving for CA . 2. V is r-rank-preserving for sim(C). 3. If C is a ΣΠΣ(k, d, ρ) circuit and r ≥ ρ then gcd(C)|V = gcd(C|V ) and sim(C)|V = sim(C|V ). 4. If C is a ΣΠΣ(k, d, ρ) circuit and r ≥ R(k, d, ρ) then C ≡ 0 if and only if C|V ≡ 0. 5. If C is a ΣΠΣ(k) multilinear circuit, r ≥ RM (k) and C|V is a multilinear circuit then C ≡ 0 if and only if C|V ≡ 0. Now that we have seen their definition and some of their useful properties, we explain how to construct rank preserving subspaces. We show two construction methods for rank-preserving-subspaces. One method, used for ΣΠΣ(k, d, ρ) circuits, finds an r-rank preserving subspace. The second method, used for multilinear ΣΠΣ(k) circuits, finds an r-rank preserving subspace that preserves the multilinearity of circuits as well. The following definition and lemma explain how to find a rank preserving subspace for general ΣΠΣ(k, d, ρ) circuits. Definition 2.7. Let α ∈ F and r ∈ N+ .

• For 0 ≤ i ≤ r let v_{i,α} ∈ F^n be the vector v_{i,α} = (α^{i+1}, α^{2(i+1)}, ..., α^{n(i+1)}).

• Let P_{α,r} be the matrix whose j-th column (for 1 ≤ j ≤ r) is v_{j,α}. Namely,

$$P_{\alpha,r} = (v_{1,\alpha}, \ldots, v_{r,\alpha}) = \begin{pmatrix} \alpha^2 & \alpha^3 & \cdots & \alpha^{r+1} \\ \alpha^4 & \alpha^6 & \cdots & \alpha^{2(r+1)} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^{2n} & \alpha^{3n} & \cdots & \alpha^{n(r+1)} \end{pmatrix}.$$

• Let V_{0,α,r} be the linear subspace spanned by {v_{i,α}}_{i∈[r]}. Let V_{α,r} ⊆ F^n be the affine subspace V_{α,r} = V_{0,α,r} + v_{0,α}. In other words,

$$V_{\alpha,r} = \{P_{\alpha,r}\,\bar{y} + v_{0,\alpha} : \bar{y} \in F^r\}.$$

Lemma 2.8 (Corollary 4.9 in [KS08]). Let r ≥ R(k, d, ρ) and let S ⊆ F be a set of $n\left(\binom{r+2}{2} + 2\binom{dk}{2}k\right)/\epsilon$ different field elements (recall our assumption that if |F| is not large enough then we work over an algebraic extension field of F). Then, for every ΣΠΣ(k, d, ρ) circuit C over F there are at least (1 − ε)|S| elements α ∈ S such that V_{α,r} is an r-rank-preserving subspace for C.
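The matrix P_{α,r} has a (transposed) Vandermonde-like structure in the powers of α. As an illustration (ours, not the paper's), the construction can be sketched directly from the definitions:

```python
def v_alpha(i, alpha, n):
    """The vector v_{i,alpha} = (alpha^(i+1), alpha^(2(i+1)), ..., alpha^(n(i+1)))."""
    return [alpha ** (j * (i + 1)) for j in range(1, n + 1)]

def P_alpha(alpha, r, n):
    """The n x r matrix P_{alpha,r} whose j-th column is v_{j,alpha}."""
    cols = [v_alpha(j, alpha, n) for j in range(1, r + 1)]
    return [[cols[j][i] for j in range(r)] for i in range(n)]

def point_in_V(alpha, r, n, y):
    """A point P_{alpha,r} y + v_{0,alpha} of the affine subspace V_{alpha,r}."""
    P, v0 = P_alpha(alpha, r, n), v_alpha(0, alpha, n)
    return [sum(P[i][j] * y[j] for j in range(r)) + v0[i] for i in range(n)]
```

For n = 3, r = 2 and α = 2, the columns are (4, 16, 64) and (8, 64, 512), and the point at ȳ = 0 is v_{0,α} = (2, 4, 8).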

2.2.1  Rank Preserving Subspaces that Preserve Multilinearity

As stated before, when dealing with multilinear circuits we require that the rank-preserving subspaces also preserve the multilinearity of the circuit. Actually, for multilinear circuits we need to slightly change Definition 2.5. We first explain why it is necessary to change the definition and then give the modified definition for the multilinear case.

Let C be an n-input multilinear ΣΠΣ(k) circuit having a multiplication gate M = ∏_{i=1}^n x_i. Let V be a subspace of F^n of co-dimension r. Assume that C|_V is a multilinear circuit. Then at least r linear functions in M have been restricted to constants and are thus linearly dependent. Whenever r > 1, this violates Property 1 of Definition 2.5 (linearly independent linear functions must remain linearly independent), indicating that C does not have any low-dimension rank-preserving subspaces. We now give the definition of multilinear-rank-preserving subspaces (these subspaces are used only for non-generalized depth-3 circuits; thus, we only define them for such circuits):

Definition 2.9. Let C be a ΣΠΣ(k) circuit and V an affine subspace. We say that V is r-multilinear-rank-preserving for C if the following properties hold:

1. For any two linear functions L1 ≁ L2 ∈ Lin(C), we either have that L1|_V ≁ L2|_V or that both L1|_V, L2|_V are constant functions.
2. ∀A ⊆ [k], rank(sim(C_A)|_V) ≥ min{rank(sim(C_A)), r}.
3. No multiplication gate M ∈ C vanishes on V. In other words, M|_V ≢ 0 for every multiplication gate M ∈ C.
4. The circuit C|_V is a multilinear circuit.

Despite the modification of Property 1, Lemma 2.6 applies also for r-multilinear-rank-preserving subspaces (as a matter of fact, in the definition of rank-preserving subspaces in [KS08], Property 1 was the same as in Definition 2.9). The following definition shows how to construct such a subspace. The two lemmas following it prove the correctness of the construction:

Definition 2.10 (Definition 5.3 of [KS08]). Let B ⊆ [n] be a non-empty subset of the coordinates and α ∈ F be a field element.

• Define V_B as the following subspace: V_B = span{e_i : i ∈ B}, where e_i ∈ {0,1}^n is the vector that has a single nonzero entry in the i-th coordinate.

• Let v_{0,α} be, as before, the vector v_{0,α} = (α, α², ..., α^n).

• Let V_{B,α} = V_B + v_{0,α}.

Lemma 2.11 (Theorem 5.4 of [KS08]). Let C be a multilinear ΣΠΣ(k) depth-3 circuit over the field F and r ∈ N. There exists a subset B ⊆ [n] such that |B| = 2k·r and B has the following properties:

1. ∀A ⊆ [k], rank(sim(C_A)|_{V_B}) ≥ min{rank(sim(C_A)), r}.
2. For every ū ∈ F^n, C|_{V_B+ū} is a multilinear ΣΠΣ(k) circuit.

Lemma 2.12 (Easy modification of Theorem 5.6 of [KS08]). Let C be a multilinear n-variate ΣΠΣ(k) circuit. Let B be the set guaranteed by Lemma 2.11 for some integer r. Then there are less than n³k² many α ∈ F such that V_{B,α} is not r-multilinear-rank-preserving for C.

We require an additional property of the subspace, formulated in the following definition.

Definition 2.13. Let C be an n-variate multilinear ΣΠΣ(k, d) circuit over the field F. Let B ⊆ [n] and α ∈ F. We say that V_{B,α} is a liftable r-multilinear-rank-preserving subspace for C if the following holds: for each B′ ⊇ B, the subspace V_{B′,α} is an r-multilinear-rank-preserving subspace for C.

Clearly, the subspaces of Definition 2.10 might restrict linear functions from the circuit to constants. Hence, these subspaces are not always liftable. For example, take C(x, y, z) = (x + y + 1) + (x + 1)z, V₁ as the subspace of F³ where x = y = 0 and V₂ as the subspace where y = 0. Clearly, V₁ ⊆ V₂, V₁ is 1-rank-preserving and V₂ is not (as (x + y + 1)|_{V₂} ∼ (x + 1)|_{V₂}). The following lemma shows how to construct a liftable r-multilinear-rank-preserving subspace.

Lemma 2.14. Let C be a multilinear ΣΠΣ(k) arithmetic circuit over a field F. Let r ∈ N and B be the set guaranteed by Lemma 2.11 for C and r. Then there are less than n⁴k² many α ∈ F for which there exists a set B′ ⊇ B such that V_{B′,α} is not r-multilinear-rank-preserving for C.

Proof. Let α be some field element. We show that if V_{B̂,α} is r-multilinear-rank-preserving for every B̂ ⊇ B of size |B̂| ≤ |B| + 1, then V_{B,α} is liftable. The claim will then follow by Lemma 2.12.

Assume that for each B̂ ⊇ B of size |B̂| ≤ |B| + 1, it holds that V_{B̂,α} is r-multilinear-rank-preserving for C. Let B ⊆ B′. It can easily be seen that the only property that might be violated is Property 1 of Definition 2.9. Let L1, L2 ∈ Lin(C) be such that L1 ≁ L2 and L1|_{V_{B′,α}} is not a constant function. Then there exists some B̂ ⊇ B, where |B̂| = |B| + 1, such that L1|_{V_{B̂,α}} is not a constant function. Hence, by our assumption, L1|_{V_{B̂,α}} ≁ L2|_{V_{B̂,α}} and thus L1|_{V_{B′,α}} ≁ L2|_{V_{B′,α}}. It follows that V_{B′,α} is r-multilinear-rank-preserving for C.

Notice that by Lemma 2.12, there are at most n · n³k² many α ∈ F such that for some B̂ ⊇ B of size |B̂| ≤ |B| + 1, it holds that V_{B̂,α} is not r-multilinear-rank-preserving for C. This proves the lemma.

We conclude this section with the following corollary giving the method to find a liftable r-multilinear-rank-preserving subspace.

Corollary 2.15. Let r, n, k ∈ N. Let S ⊆ F be some set of size |S| > n⁴k². Let C be a multilinear ΣΠΣ(k) circuit of n inputs. There exist some B ⊆ [n] with |B| = 2k·r and α ∈ S such that V_{B,α} is r-multilinear-rank-preserving and liftable for C.
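Concretely, a point of V_{B,α} keeps the coordinates in B free (shifted by v_{0,α}) and fixes every other coordinate x_i to α^i. The sketch below (our own representation; gates are lists of sparse linear functions) performs this substitution and checks that a product of linear functions stays multilinear:

```python
def restrict_to_VBalpha(gate, B, alpha):
    """Restrict a multiplication gate (a list of linear functions, each given
    as ({index: coeff}, const)) to V_{B,alpha}: coordinate i stays free when
    i is in B, and is fixed to alpha^(i+1) otherwise (0-indexed coordinates,
    so the shift vector v_{0,alpha} = (alpha, alpha^2, ...))."""
    out = []
    for coeffs, const in gate:
        new_coeffs, new_const = {}, const
        for i, c in coeffs.items():
            new_const += c * alpha ** (i + 1)   # contribution of v_{0,alpha}
            if i in B:
                new_coeffs[i] = c               # coordinate stays free
        out.append((new_coeffs, new_const))
    return out

def is_multilinear_gate(gate):
    """A product of linear functions is multilinear iff its non-constant
    factors touch pairwise disjoint sets of variables."""
    seen = set()
    for coeffs, _ in gate:
        if set(coeffs) & seen:
            return False
        seen |= set(coeffs)
    return True
```

For example, restricting (x₀ + x₁)(x₂ + 1) to V_{B,α} with B = {0, 2} and α = 2 gives (x₀ + 6)(x₂ + 9), which is still multilinear.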

2.3

A “Distance Function” for ΣΠΣ Circuits

In this section we define a "distance function" for ΣΠΣ circuits and discuss some of its properties. The function measures the dimension of the linear functions in the simplification of the sum of the circuits. Finally we prove that the weights of any two circuits computing the same polynomial (the weight of a circuit is its distance from the 0 circuit) are equal up to some additive constant.

Definition 2.16. Let C1, ..., Ci be a collection of ΣΠΣ circuits (i ≥ 1). Define:

$$\Delta(C_1, \ldots, C_i) \stackrel{\Delta}{=} \operatorname{rank}\left(\operatorname{sim}\left(\sum_{j=1}^{i} C_j\right)\right).$$
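Once sim(·) of the summed circuit is in hand, computing ∆ reduces to a rank computation over the coefficient vectors of its linear functions. A minimal sketch of that rank step over the rationals (the circuit-manipulation part, i.e. extracting sim(·), is not shown):

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of coefficient vectors, via Gaussian elimination
    over the rationals (exact arithmetic, no pivoting issues)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r, col = 0, 0
    ncols = len(rows[0]) if rows else 0
    while r < len(rows) and col < ncols:
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            col += 1
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r, col = r + 1, col + 1
    return r
```

E.g. the three functions x₁, x₂ and x₁ + x₂ span a space of dimension 2.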

Note that this is a syntactic definition, as this sum might contain a multiplication gate M and the multiplication gate −M. The following lemma explains why we refer to ∆ as a distance function.

Lemma 2.17 (triangle inequality). Let C1, C2, C3 be ΣΠΣ circuits. Then ∆(C1, C3) ≤ ∆(C1, C2, C3) ≤ ∆(C1, C2) + ∆(C2, C3).

Proof. The first inequality is trivial since Lin(sim(C1 + C3)) ⊆ Lin(sim(C1 + C2 + C3)). We now show the second inequality. We do so by proving a stronger statement: any linear function appearing in Lin(sim(C1 + C2 + C3)) also appears in Lin(sim(C1 + C2)) ∪ Lin(sim(C2 + C3)). Let L be a linear function in Lin(sim(C1 + C2 + C3)). We discuss two different possibilities for the "origin" of L.

Option 1: L ∈ Lin(sim(Ci)) for some i ∈ {1, 2, 3}. In this case, the claim trivially holds.

Option 2: Let m ≥ 0 be the multiplicity of L in gcd(C2). That is, m is the maximal integer for which L^m divides gcd(C2). If L appears with the same multiplicity in both gcd(C1) and gcd(C3), then it cannot appear in Lin(sim(C1 + C2 + C3)) (as L ∉ Lin(sim(Ci)) for any i ∈ {1, 2, 3}). Hence, w.l.o.g., the multiplicity of L in gcd(C1) is not m and thus L ∈ Lin(sim(C1 + C2)).

We now prove that the weights of two minimal circuits computing the same polynomial are roughly the same. To do so we define a default circuit for a polynomial f. We then show that its weight is roughly the same as the weight of any other minimal circuit computing f.

Definition 2.18. Let U be a linear space of n-input linear functions. Define the default basis of U as the Gaussian elimination of some basis of U (the Gaussian elimination of a set of linear functions is done by performing a Gaussian elimination on the matrix whose rows are the coefficients of the linear functions).

Definition 2.19. Let f(x̄) be an n-variate polynomial of degree d. Define Lin(f) as the product of the linear factors of f (i.e., for f(x1, x2, x3) = x3(x1 + x2)²(x1 + x2²), Lin(f) = x3(x1 + x2)²). Let r ∈ N+ be such that f/Lin(f) is a polynomial of exactly r linear functions (as defined in Lemma 2.2). Let h be an r-variate polynomial and let L̃1, ..., L̃r be r linear functions such that

• f/Lin(f) = h(L̃1, ..., L̃r).
• L̃1, ..., L̃r are the default basis of the linear space they span.

Define C_f, the default circuit of f, as the following ΣΠΣ(1, d, r) circuit:

$$C_f \stackrel{\Delta}{=} \mathrm{Lin}(f) \cdot h(\tilde{L}_1, \ldots, \tilde{L}_r).$$
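Definition 2.18 can be made concrete by taking the reduced row echelon form of the coefficient matrix: the RREF is unique for a given row space, so it yields a canonical ("default") basis regardless of which basis we started from. A sketch over the rationals (our own illustration):

```python
from fractions import Fraction

def default_basis(functions):
    """Default basis of span{functions}: the reduced row echelon form of the
    coefficient matrix (each function is a list of rational coefficients).
    Uniqueness of the RREF makes the returned basis canonical."""
    rows = [[Fraction(x) for x in f] for f in functions]
    r, col, ncols = 0, 0, len(rows[0])
    while r < len(rows) and col < ncols:
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            col += 1
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [a / rows[r][col] for a in rows[r]]   # normalize the pivot to 1
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [a - rows[i][col] * b for a, b in zip(rows[i], rows[r])]
        r, col = r + 1, col + 1
    return rows[:r]
```

Two different bases of the same space produce the same default basis, which is exactly the property the canonical representation relies on.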

Notice that the existence of h and L̃1, ..., L̃r is guaranteed by the definition of r. In Appendix A.2 we give a brute force algorithm that, given black-box access to a polynomial, constructs its default circuit. The next lemma implies that for a polynomial f, the ∆ measure of the circuit C_f is the lowest among all ΣΠΣ circuits computing f. It also shows that the ∆ weight of any circuit computing f is close to the ∆ measure of C_f, thus showing that the ∆ measures of any two circuits computing the same polynomial are close.

Lemma 2.20. Let C be a minimal ΣΠΣ(k, d, ρ) circuit computing the polynomial f. Then ∆(C_f) ≤ ∆(C) < ∆(C_f) + R(k + 1, d) + k·ρ.

Proof. Using the notations of Definition 2.19, it is not hard to see that f/Lin(f) is a factor of sim(C) and that sim(C_f) ≡ f/Lin(f). It follows that the polynomial computed by sim(C) is a polynomial of at least ∆(C_f) linear functions and thus ∆(C) ≥ ∆(C_f).

We proceed to the second inequality. If C ≡ 0 (i.e., f = 0), then since it is also minimal, we get that ∆(C) < R(k, d, ρ) < R(k + 1, d) + k·ρ. Assume that C does not compute the zero polynomial. Consider the ΣΠΣ(k + 1, d, max{ρ, ∆(C_f)}) circuit C − C_f. As no subcircuit of C (nor C itself) computes the zero polynomial, we have that C − C_f is minimal. Since the circuit clearly computes the zero polynomial, Theorem 2.3 implies that ∆(C) ≤ ∆(C, −C_f) = ∆(C − C_f) < R(k + 1, d) + k·ρ + ∆(C_f).

Given a ΣΠΣ(k, d, ρ) circuit C we define its canonical representation in the following way. Let C = ∑_{i=1}^k M_i be the representation of C as a sum of multiplication gates. Let f_i be the polynomial computed by M_i. Then the canonical representation of C is

$$C = \sum_{i=1}^{k} C_{f_i}. \qquad (5)$$

Note that the only difference from the description C = ∑_{i=1}^k M_i is the basis with respect to which we represent sim(M_i).

3  Canonical τ-Distant Circuits

In this section we define the notion of a τ -distant circuit. We prove the existence of such a circuit C ′ computing f (Theorem 3.2) and prove its uniqueness (Theorem 3.6). We then show that for a subspace V that is rank preserving for C ′ , the restriction C ′ |V is the unique τ -distant circuit computing f |V (Corollary 3.7). Definition 3.1. Let C be a ΣΠΣ(s, d, r) circuit computing a polynomial f (in particular, assume that C is not a ΣΠΣ(s, d, r − 1) circuit). We say that C is τ -distant if for any two multiplication gates of C, M and M ′ , we have that ∆(M, M ′ ) ≥ τ · r.


3.1

Proof of Existence

In this section we prove the existence of a τ-distant ΣΠΣ(s, d, r) circuit C′ computing f (a polynomial that can be computed by a ΣΠΣ(k, d, ρ) circuit). We would like to have r as small as possible (as a function of k, d, ρ) as it will affect the running time of the reconstruction algorithm. Our results are given in the following theorem.

Theorem 3.2 (Existence). Let f be a polynomial that can be computed by a ΣΠΣ(k, d, ρ) circuit and let τ ∈ N+. Let r_init ∈ N+ be such that r_init ≥ R(k + 1, d) + k·ρ. Then there exist s, r ∈ N+ and a ΣΠΣ(s, d, r) circuit C′ computing f such that s ≤ k, r_init ≤ r ≤ r_init · k^{⌈log_k(τ+1)⌉·(k−2)}, and C′ is τ-distant.

Proof. The proof is algorithmic. That is, we give an algorithm for constructing a τ-distant circuit C′ that computes the same polynomial as C. The idea behind the algorithm is to cluster the multiplication gates of C, such that any two multiplication gates in the same cluster are close to each other, and any two multiplication gates in different clusters are far away from each other. Then we replace each cluster with the default circuit (recall Definition 2.19) for the polynomial that it computes. Before giving the algorithm and its analysis we make the following definition.

Definition 3.3. Let C be a ΣΠΣ(k, d, ρ) circuit and I = {A1, ..., As} be some partition of [k]. For each i ∈ [s] define C_i ≜ C_{A_i}. The set {C_i}_{i=1}^s is called a partition of C. We say that {C_i}_{i=1}^s is (τ′, r)-strong when the following conditions hold:

• ∀i ∈ [s], ∆(C_i) ≤ r.
• ∀i, j ∈ [s] such that i ≠ j, ∆(C_i, C_j) ≥ τ′·r.

Lemma 3.4. The partition outputted by Algorithm 1 is (τ′, r)-strong.

Proof. Let I = {A1, ..., As} be the partition found in the algorithm. We shall use the notations of Definition 3.3. First we note that at the end of the algorithm, for each i ≠ i′ ∈ [s], we have that ∆(C_i, C_{i′}) ≥ r_m = r·k^ζ ≥ r·τ′. Thus, we only have to prove that for every i ∈ [s] it holds that ∆(C_i) ≤ r.

Fix some i ∈ [s]. We consider two cases. The first is that C_i is a multiplication gate (i.e., A_i is a singleton). Clearly, its non-linear term is a polynomial of at most ρ linear functions. Hence, ∆(C_i) ≤ ρ ≤ r_init ≤ r. In the second case C_i = ∑_{ℓ=1}^{s′} C_{i,ℓ}, where the C_{i,ℓ} were computed at an earlier stage of the algorithm. Let j′ be the iteration in which the circuit C_i was computed (that is, the iteration in which the set of indices of the multiplication gates of C_i became a member of the partition). Let E′ ⊆ E_{j′} be some spanning tree of the connected component C_{i,1}, ..., C_{i,s′}. Then,

$$\Delta(C_i) = \Delta(C_{i,1}, \ldots, C_{i,s'}) \overset{(1)}{\le} \sum_{(C_{i,l_1}, C_{i,l_2}) \in E'} \Delta(C_{i,l_1}, C_{i,l_2}) \overset{(2)}{<} |E'| \cdot r_{j'} \le k \cdot r_{j'} \le r_{m-\zeta} = r.$$

Inequality (1) can be reached by repeatedly using the second inequality of Lemma 2.17. To justify inequality (2) notice that the partition did not change in the last ζ iterations and thus j′ < m − ζ. This proves the lemma.

Now that we are guaranteed that we have a strong partition, we prove an upper bound on r. Namely, we show that the weight of each C_i is not too large.

Input: n, k, d, ρ, r_init, τ′ ∈ N such that r_init ≥ ρ, and a ΣΠΣ(k, d, ρ) circuit C of n inputs.
Output: An integer r ≥ r_init and I, a partition of [k].

1:  ζ ← ⌈log_k(τ′)⌉;
2:  I₁ ← {{1}, {2}, ..., {k}};
3:  r₁ ← r_init;
4:  while the partition was changed in any one of the former ζ iterations do
5:      Define j as the number of the current iteration (its initial value is 1);
6:      Let G_j(I_j, E_j) be a graph where each subset belonging to the partition I_j is a vertex;
7:      E_j ← ∅;
8:      foreach A_i ≠ A_{i′} ∈ I_j do
9:          if ∆(C_{A_i}, C_{A_{i′}}) < r_j then
10:             E_j ← E_j ∪ {(A_i, A_{i′})}
11:         end
12:     end
13:     I_{j+1} ← the set of connected components of G_j. That is, every connected component is now a set in the partition;
14:     r_{j+1} ← r_j · k;
15: end
16: Define m as the total number of iterations (that is, in the last iteration we had j = m);
17: r ← r_m / k^ζ;
18: I ← I_m;

Algorithm 1: Canonical partition of a circuit
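The clustering loop of Algorithm 1 can be sketched as follows. This is our own illustration under two assumptions spelled out in the docstring: the ∆ oracle is supplied by the caller (we have no actual circuit here), and we force at least ζ + 1 iterations so that the returned r is at least r_init (the paper states the loop condition informally):

```python
def canonical_partition(k, r_init, tau_prime, delta):
    """Sketch of Algorithm 1 over gate indices 0..k-1.  `delta(A, B)` is an
    assumed oracle returning Delta(C_A, C_B) for two disjoint index sets."""
    zeta = 1
    while k ** zeta < tau_prime:   # zeta = ceil(log_k(tau_prime)), at least 1
        zeta += 1
    partition = [frozenset([i]) for i in range(k)]
    r_j, r_m, j, changes = r_init, r_init, 1, []
    while j <= zeta + 1 or any(changes[-zeta:]):
        cl = list(partition)
        # build the graph G_j: clusters are vertices, close clusters share an edge
        adj = {a: set() for a in range(len(cl))}
        for a in range(len(cl)):
            for b in range(a + 1, len(cl)):
                if delta(cl[a], cl[b]) < r_j:
                    adj[a].add(b)
                    adj[b].add(a)
        # the connected components of G_j form the next partition I_{j+1}
        seen, new_partition = set(), []
        for s in range(len(cl)):
            if s in seen:
                continue
            comp, stack = set(), [s]
            while stack:
                v = stack.pop()
                if v in seen:
                    continue
                seen.add(v)
                comp |= cl[v]
                stack.extend(adj[v])
            new_partition.append(frozenset(comp))
        changes.append(len(new_partition) != len(partition))
        partition, r_m, r_j, j = new_partition, r_j, r_j * k, j + 1
    return r_m // (k ** zeta), partition
```

With k = 2, r_init = 4 and τ′ = 2, two gates at distance 3 < r₁ are merged into one cluster, while gates at distance 100 stay separate; in both cases the returned r equals r_init.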

Lemma 3.5. At the end of Algorithm 1, r_m is at most r_init · k^{ζ·(k−1)}. Thus, r ≤ r_init · k^{ζ·(k−2)}.

Proof. In every ζ iterations, the number of elements in I is reduced by at least one (otherwise the algorithm terminates). The number of elements in I begins with k and ends with at least 1. Hence, the number of iterations is at most ζ·(k − 1), indicating that r_m is at most r_init · k^{ζ·(k−1)}. Hence, r is bounded from above by r_init · k^{ζ·(k−2)}.

We proceed with the proof of Theorem 3.2. We now set τ′ = τ + 1 and fix some integer r_init such that r_init ≥ R(k + 1, d) + k·ρ. Let {C_i}_{i=1}^s be the partition outputted by Algorithm 1. Let {f_i}_{i=1}^s be the polynomials computed by the subcircuits of the partition. That is, C_i computes f_i. Define

$$C' \stackrel{\Delta}{=} \sum_{i=1}^{s} C_{f_i}.$$

We now show that C′ satisfies the requirements of Theorem 3.2. Since the partition is (τ′, r)-strong, Lemma 2.20 implies that ∆(C_{f_i}) ≤ ∆(C_i) ≤ r for all i ∈ [s]. Hence, C′ is a ΣΠΣ(s, d, r) circuit. Let C_i, C_{i′} be two subcircuits in the partition of C. Then,

$$\Delta(C_{f_i} + C_{f_{i'}}) \overset{(1)}{\ge} \Delta(C_{f_i + f_{i'}}) \overset{(2)}{>} \Delta(C_i + C_{i'}) - (R(k+1,d) + k\cdot\rho) \overset{(3)}{\ge} \tau'\cdot r - r = \tau\cdot r.$$

Inequalities (1) and (2) stem from Lemma 2.20 (we assume w.l.o.g. that C is minimal and thus so is C_i + C_{i′}). Inequality (3) holds since r ≥ r_init ≥ R(k + 1, d) + k·ρ and ∆(C_i + C_{i′}) ≥ τ′·r (by Lemma 3.4). This concludes the proof of Theorem 3.2.

3.2

Uniqueness

In this section we prove, for a large enough value of τ, the uniqueness of a τ-distant ΣΠΣ(k, d, r) circuit computing a polynomial f. As a corollary we obtain a result showing that if C′ is a τ-distant circuit computing f and V is rank-preserving for C′, then the unique τ-distant circuit computing f|_V is C′|_V.

Theorem 3.6 (Uniqueness). Let f be a polynomial of degree d. Let k, r, τ ∈ N+ be such that τ ≥ R(2k, d, r)/r. Then there exists at most one canonical minimal τ-distant ΣΠΣ(s, d, r) circuit computing f such that s ≤ k.

Proof. Let s, s′ ≤ k and let C₁ ≜ ∑_{i=1}^{s} C_{f_i} and C₂ ≜ ∑_{i=1}^{s′} C_{g_i} be two canonical minimal τ-distant circuits computing f. It suffices to prove that s = s′ and that, for some reordering of the multiplication gates, ∀i ∈ [s], C_{f_i} = C_{g_i}. Consider the ΣΠΣ(s + s′, d, r) circuit

$$C = \sum_{i=1}^{s} C_{f_i} - \sum_{i=1}^{s'} C_{g_i}.$$

Clearly, C computes the zero polynomial. We now show that each minimal subcircuit of C is composed of exactly two multiplication gates C_{f_i} and C_{g_j}, where i ≤ s and j ≤ s′. This will prove our claim. Let C̃ be some minimal subcircuit of C. Clearly C̃ is a ΣΠΣ(m, d′, r) circuit where 2 ≤ m ≤ s + s′ and d′ ≤ d. It suffices to prove that C̃ cannot contain two multiplication gates originating from the same C_ℓ (ℓ ∈ {1, 2}). If s = s′ = 1 then both circuits are of the form C_f and are thus clearly equal. Assume w.l.o.g. that s ≥ 2. Assume for a contradiction, and w.l.o.g., that both C_{f_1} and C_{f_2} are multiplication gates in C̃. Then,

$$\tau\cdot r \overset{(1)}{\le} \Delta(C_{f_1}, C_{f_2}) \le \Delta(\tilde{C}) \overset{(2)}{<} R(s+s', d, r) \le R(2k, d, r).$$

Inequality (1) holds since C₁ is τ-distant, and Inequality (2) follows from the definition of R(k, d, r) right after Theorem 2.3. This contradicts our assumption regarding τ and thus proves the theorem.

Corollary 3.7. Let f be an n-variate polynomial of degree d.

• Let k, s, r, τ ∈ N+ be such that τ ≥ R(2k, d, r)/r and s ≤ k.
• Let C′ be a minimal τ-distant ΣΠΣ(s, d, r) circuit computing f.
• Let V be an (r·τ)-rank-preserving subspace for C′.

Then C′|_V is a minimal τ-distant ΣΠΣ(s, d, r) circuit computing f|_V. In addition, there is no other minimal τ-distant ΣΠΣ(k′, d, r) circuit computing f|_V for any k′ ≤ k.

Proof. Let M, M′ be two different multiplication gates of C′. As V is (r·τ)-rank-preserving for C′, we get that

∆((M + M′)|_V) ≥ min{∆(M + M′), r·τ} ≥ r·τ,    ∆(M|_V) ≤ ∆(M) ≤ r.

Hence C′|_V is a τ-distant ΣΠΣ(s, d′, r) circuit computing f|_V (d′ ≤ d). Since τ ≥ R(2k, d, r)/r ≥ R(2k, d′, r)/r, Theorem 3.6 implies the uniqueness of C′|_V.

4

Reconstructing ΣΠΣ(k, d, ρ) Circuits

In this section we give our main reconstruction algorithm (Theorem 1). Recall the general scheme of the algorithm that was described in Section 1.4: we first restrict the inputs of the polynomial f to a low dimensional rank-preserving subspace V, then we reconstruct the unique τ-distant circuit for f|_V, and finally we lift this to a circuit over F^n.

We now explain the intuition behind this approach. Recall that Corollary 3.7 states that if C is a τ-distant circuit for f and V is a rank-preserving subspace for C (with the adequate parameters), then C|_V is the unique τ-distant circuit for f|_V. Stated differently, if we manage to find a τ-distant circuit C′ that computes f|_V and lift each multiplication gate of C′ separately to F^n, we get back the circuit C. Therefore, our goal is to find a τ-distant circuit for f|_V. We do this in Section 4.2. This is the main technical part of the algorithm.

We now describe the main idea in the reconstruction of f|_V. Let us first start with the simple case that C′ (= C|_V) has a single generalized multiplication gate. Then, by factoring f|_V we can find gcd(C′) (specifically, we use a black-box factoring algorithm, which produces black boxes for the irreducible components of f|_V, and isolate the linear factors), and then use brute force interpolation to compute sim(C′) (this is possible since dim(V) is relatively small). This case is not so difficult (it is dealt with in Section 4.1); however, it is not clear what to do if we have more than a single multiplication gate. We handle this by finding a reduction to the case of a circuit with a single multiplication gate. The reduction is based on an isolation lemma that roughly says that there exists a set of subspaces that "zero-out" all but one multiplication gate.

Lemma 4.1 (Isolation Lemma (informal)). For every t-variate τ-distant circuit C = C_{f_1} + ... + C_{f_k}, there exists a polynomial sized set of subspaces V = {U_i}_i, where each U_i ⊂ F^t has co-dimension at most k, such that the following holds. There exists an index i₀ ∈ [k] such that for every U ∈ V we have that C_{f_j}|_U = 0 if and only if j ≠ i₀. (Actually, the more precise statement would be for the polynomial (f/M_U)|_U, where M_U is the product of linear functions, including multiplicities, that divide f and are spanned by the orthogonal complement of U. In this intuitive overview we assume that all M_U = 1, as dealing with these issues is quite simple.) Namely, all gates but C_{f_{i₀}} vanish when restricted to the subspaces in V. Moreover, there is an efficient algorithm for reconstructing C_{f_{i₀}} from the restrictions {C_{f_{i₀}}|_U}_{U∈V}.

The exact version of the lemma is given in Lemma 4.11 (which strongly depends on Lemma 4.13). The "moreover" part of the lemma is proved by gluing the different gates C_{f_{i₀}}|_U into a single C_{f_{i₀}}. The way to glue the different restrictions together is given in Section 4.2.2 (Algorithms 3 and 2), and is based on the earlier work of [Shp09].

The main question that remains then is how to find the isolating set of subspaces V (or even how to prove its existence). It turns out that because the dimension of V is relatively low (eventually it will be polylog(n)), if we know that such a set exists then we can go over all possibilities for it. That is, we go over all possibilities for the set V, and for each "guess" we try to reconstruct a multiplication gate (i.e., a circuit of the form C_{f_{i₀}}). The point is that after we reconstruct C_{f_{i₀}} we can continue by recursion and learn all the other multiplication gates. In this way we get many guesses for C′, and then we can simply check (via deterministic identity testing) which one is a correct representation for it (this is given in Algorithm 5).

In view of the above, we just have to prove the existence of such a set V. The idea is to construct the subspaces step by step. That is, we first find a set of linear functions L that splits the multiplication gates of C′ into two sets A and Ā such that for every ℓ ∈ L all the multiplication gates in Ā vanish when restricted to the subspace defined by ℓ = 0. On the other hand, none of the multiplication gates in A vanishes on ℓ = 0 (actually there is a stronger demand that we skip now). The definition of a splitting set and the proof of its existence are given in Section 4.2.1 (Definition 4.10 and Lemma 4.13). Given a splitting set L and the sets A, Ā, we can look for a splitting set L′ for A (that splits it into A′ and A \ A′ for some ∅ ⊊ A′ ⊊ A). The sets L and L′ define a set of co-dimension 2 subspaces: for every ℓ ∈ L and ℓ′ ∈ L′ we have the space defined by ℓ = ℓ′ = 0. We can continue to do so until we are left with a single multiplication gate that was split from the other multiplication gates by the subspaces that we generated, which are of co-dimension at most s when C is a ΣΠΣ(s, d, r) circuit (actually, this is not a completely accurate description; see Definition 4.7 and Lemma 4.11, which discuss m-linear function trees, for a more formal treatment). This proves the existence of V. In particular, we reduced the problem of proving the existence of the subspaces V to the problem of proving the existence of a splitting set. The proof of the existence of splitting sets is the most technical part of the proof. It relies on combinatorial methods used by [SS09] when proving a structural theorem regarding depth-3 circuits.

To conclude, the algorithm for constructing C has the following form (Algorithm 6).

• Find a set of subspaces, of low dimension, that contains a rank-preserving subspace for C. Denote this subspace with V. We do the following for each space in the family but focus on V, as we will later verify which of the circuits that we constructed is the correct one.
• Run Algorithm 5 to get C′|_V, the canonical τ-distant circuit computing f|_V. The algorithm uses as a subroutine Algorithm 4, which constructs a single multiplication gate of C′|_V using an m-linear function tree (namely, a family of low co-dimension spaces; their existence is based on the existence of splitting sets). In particular, we have to try Algorithm 5 for every "guess" of an m-linear function tree.
• Run Algorithm 2 to lift each multiplication gate of the τ-distant circuit C′|_V to F^n.
• Use the PIT algorithm of [KS08] to find the circuit C. That is, to determine the correct "candidate" (each subspace V and m-linear function tree produce a candidate).
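The four bullets above can be summarized as the following pseudocode skeleton (all subroutine names are placeholders standing in for the paper's numbered algorithms, not real APIs):

```
reconstruct(f, n, k, d):
    candidates = []
    for V in candidate_rank_preserving_subspaces(n, k, d):      # bullet 1
        for T in all_m_linear_function_trees(V, k):             # guesses for the tree
            C_V = algorithm5(f restricted to V, T)              # tau-distant circuit for f|_V
            if C_V is not None:
                candidates.append(algorithm2_lift(C_V, V))      # lift each gate to F^n
    return the candidate C passing the PIT of [KS08] against f  # bullet 4
```

Only the guesses that correspond to a genuinely rank-preserving V and a correct function tree yield the right circuit; the final identity test filters out all other candidates.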

4.1

Reconstructing a Low Rank Circuit

In this section we deal with the case in which the rank of the circuit is low. We later show how to reduce the general case to this one. Actually, we solve a slightly more complicated problem (dictated by the upcoming sections). Instead of having an oracle for f (that is, black box access to it), we are only allowed access to it on various low dimension subspaces. Formally put, we deal with the following problem: let f be an n-variate polynomial of degree d over a field F. Given the circuits {C_{f|V_i}}_i for various low dimensional subspaces V_i, we would like to construct the circuit C_f. That is, we show how to "glue" together representations of the restrictions of f to various low dimension subspaces. The set of subspaces {V_i} that we work with is such that each subspace V_i is ∆(C_f)-rank-preserving for C_f.

The algorithm has two parts. First we reconstruct gcd(C_f) and then sim(C_f). Using the properties of rank-preserving subspaces we manage to isolate the restriction of each linear function in gcd(C_f) to every subspace V_i. Having these restrictions, we reconstruct each linear function separately. In the second part (reconstruction of sim(C_f)) we use a result of [Shp09], where an algorithm is given for gluing together restrictions of a low rank circuit to various low dimensional subspaces.

For convenience we denote r ≜ ∆(C_f). We now describe the subspaces in which we receive the restrictions of f. One of the subspaces we have is contained in all other subspaces. We denote it by V. Define t ≜ dim(V) and keep in mind that in our main algorithm, both r and t have small values (polylogarithmic in the circuit size). The following definition describes the entire set of subspaces {V_i} and contains the notations used in this section:

Definition 4.2. Let V be an affine subspace of dimension t.

• Denote with V̂ the homogeneous subspace of V and let v₀ be some fixed vector such that V = V̂ + v₀.
• Let v₁, ..., v_n be a basis of F^n such that v₁, ..., v_t is the default basis (as in Definition 2.18) of V̂ and v_{t+1}, ..., v_n are the default basis of some complement subspace of V̂ (that is, of some subspace of largest possible dimension that has trivial intersection with V̂).
• For each 0 ≤ i ≤ n − t, set V_i ≜ span(V̂ ∪ {v_{t+i}}) + v₀ (note that V₀ = V).
• Let v₁*, ..., v_n* be the dual basis of v₁, ..., v_n. That is, each v_i* is a (homogeneous) linear function and v_i*(v_j) = 1 if and only if i = j (it is zero otherwise).

The set of subspaces w.r.t. which we receive the restrictions of f is {V_i}_{i=0}^{n−t}. The following theorem summarizes the properties of Algorithm 2 (the gluing algorithm).

Theorem 4.3. Let f be an n-variate polynomial over a field F and let r = ∆(C_f). Let V be an affine t-dimensional subspace of F^n (equivalently, V is of codimension n − t). Algorithm 2, given {C_{f|V_i}}_{i=0}^{n−t} as input (as defined in Definition 4.2), runs in time O(n·d^r). If V is r-rank-preserving for C_f then the algorithm outputs C_f.

Proof of Theorem 4.3. Notice that if V = V₀ is r-rank-preserving for C_f, then for each i ∈ [n − t] it holds that V_i is also r-rank-preserving for C_f (since V₀ ⊆ V_i). We shall see that this implies (Lemma 4.4) that in each subspace V_i,

gcd(C_{f|V_i}) = gcd(C_f)|_{V_i}  and  sim(C_{f|V_i}) = sim(C_f)|_{V_i}.

Hence, the circuits C_{f|V_i} and C_f|_{V_i} are identical. This gives us the method for obtaining the restrictions gcd(C_f)|_{V_i} and sim(C_f)|_{V_i} for every 0 ≤ i ≤ n − t, thus allowing us to reconstruct each part independently. We start by proving that gcd(C_{f|V_i}) = gcd(C_f)|_{V_i}.

Lemma 4.4. Let f be a non-zero n-variate polynomial over F. Let V ⊆ F^n be a subspace that is ∆(C_f)-rank-preserving for sim(C_f) and such that f|_V ≠ 0. Then gcd(C_f)|_V = gcd(C_{f|V}) and sim(C_f)|_V = sim(C_{f|V}).

Proof. Let sim(C_f) = p̃(L̃₁, ..., L̃_{∆(C_f)}). Let g(x̄) be some irreducible factor of sim(C_f). Obviously, g is not a linear function. Let g′ be a ∆(C_f)-variate polynomial such that g′(L̃₁, ..., L̃_{∆(C_f)}) = g. Since V is ∆(C_f)-rank-preserving for sim(C_f), we get that

dim₁(span₁{L̃₁|_V, ..., L̃_{∆(C_f)}|_V}) = ∆(C_f).

Thus, we have that g|_V (= g′(L̃₁|_V, ..., L̃_{∆(C_f)}|_V)) is also an irreducible non-linear polynomial, meaning that sim(C_f)|_V does not have any linear factor. Therefore, gcd(C_f)|_V, which computes the same polynomial as C_f|_V / sim(C_f)|_V, contains all the linear factors of f|_V, indicating that gcd(C_f)|_V = gcd(C_{f|V}). It follows that sim(C_f)|_V = sim(C_{f|V}). (Actually, we have shown that sim(C_{f|V}) and sim(C_f)|_V compute the same polynomial, which does not have any linear factor. The representations of this polynomial may not be the same in both circuits. However, it is easy to see that given the "correct" definition of a default basis (used in the definition of default circuits), we can make sure that the representation is equal, and we assume w.l.o.g. the equality of these circuits.)

Input: For each 0 ≤ i ≤ n − t, the circuit C_{f|_{V_i}} : p_i(L_1^{(i)}, ..., L_{r_i}^{(i)}) · ∏_{j=1}^{d_i} (ℓ_j^{(i)})^{e_j}.
Output: The circuit C_f : p(L_1, ..., L_r) · ∏_{j=1}^{d′} ℓ_j^{e_j}.
Part 1: Reconstructing gcd(C_f)
1: If there exist 0 ≤ i < i′ ≤ n − t such that d_i ≠ d_{i′}, output “fail”. Define d′ ≜ d_1;
2: foreach linear function ℓ^{(0)} dividing f|_V, with some multiplicity e, do
3:   For each i ∈ [n − t], find a linear function ℓ^{(i)} such that ℓ^{(i)} ∈ gcd(C_{f|_{V_i}}) and ℓ^{(i)}|_V = ℓ^{(0)};
4:   If for some i ∈ [n − t] no such linear function ℓ^{(i)} exists, or it is not unique, output “fail”;
5:   For each 0 ≤ i ≤ n − t, denote ℓ^{(i)} = α_{i,0} + Σ_{j=1}^{n} α_{i,j} v_j^*. Define ℓ ≜ α_{0,0} + Σ_{j=1}^{t} α_{0,j} v_j^* + Σ_{j=1}^{n−t} α_{j,j+t} v_{j+t}^*;
6:   Insert the linear function ℓ into gcd(C_f) with multiplicity e;
7: end
Part 2: Reconstructing sim(C_f)
8: If there exist 0 ≤ i < i′ ≤ n − t such that r_i ≠ r_{i′}, output “fail”. Define r ≜ r_1;
9: foreach i ∈ [n − t] do
10:   Find the unique r × r matrix that transforms the linear functions L_1^{(i)}|_V, ..., L_r^{(i)}|_V into L_1^{(0)}, ..., L_r^{(0)}. If no such matrix exists, output “fail”;
11:   Transform the linear functions L_1^{(i)}, ..., L_r^{(i)} and the polynomial p_i according to the found matrix, so that for each j ∈ [r] and i ∈ [n − t] it holds that L_j^{(i)}|_V = L_j^{(0)};
12:   Set p̂ ≜ p_0 = p_i;
13: end
14: Find the unique set of linear functions L̂_1, ..., L̂_r such that for every j ∈ [r] and 0 ≤ i ≤ n − t we have L̂_j|_{V_i} = L_j^{(i)};
15: Set L_1, ..., L_r to be the default basis of span₁{L̂_1, ..., L̂_r}. Construct the polynomial p such that p(L_1, ..., L_r) = p̂(L̂_1, ..., L̂_r);
16: Output C_f ≜ p(L_1, ..., L_r) · gcd(C_f);
Algorithm 2: Gluing together low dimension restrictions of a low rank circuit
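Step 5 of Part 1 is pure coefficient bookkeeping: the coordinates of V come from ℓ^(0), and each extra coordinate v_{j+t}^* comes from the restriction ℓ^(j). The toy sketch below illustrates this over the reals; the function name is ours, and the subspaces V_j are assumed to follow the structure used above (V_j extending V by the single direction v_{j+t}^*):

```python
def glue_linear(ells, t, n):
    """Assemble l from its restrictions, per Step 5 of Algorithm 2:
    l = alpha_{0,0} + sum_{j=1}^{t} alpha_{0,j} v_j* + sum_{j=1}^{n-t} alpha_{j,j+t} v_{j+t}*.
    ells[i] is the coefficient vector (alpha_{i,0}, ..., alpha_{i,n}) of ell^(i)."""
    out = [0] * (n + 1)
    out[0] = ells[0][0]              # constant term comes from ell^(0)
    for j in range(1, t + 1):
        out[j] = ells[0][j]          # coordinates of V come from ell^(0)
    for j in range(1, n - t + 1):
        out[j + t] = ells[j][j + t]  # coordinate v_{j+t}* comes from ell^(j)
    return out

# Toy run: restrictions of l = 2 + v1 + 3*v3 + 4*v4 + 5*v5 (t = 2, n = 5)
ells = [
    [2, 1, 0, 0, 0, 0],  # ell^(0) = l restricted to V
    [2, 1, 0, 3, 0, 0],  # ell^(1) = l restricted to V_1
    [2, 1, 0, 0, 4, 0],  # ell^(2) = l restricted to V_2
    [2, 1, 0, 0, 0, 5],  # ell^(3) = l restricted to V_3
]
print(glue_linear(ells, 2, 5))  # → [2, 1, 0, 3, 4, 5]
```

Note that each coefficient of ℓ is read off exactly one restriction, which is why uniqueness of the ℓ^(i) in Step 4 is needed.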


We now return to the proof of Theorem 4.3. We deal with the different parts of Algorithm 2.

Part 1: Constructing gcd(C_f): Denote the distinct linear factors of gcd(C_f) by ℓ′_1, ..., ℓ′_{d′} and let their multiplicities be e_1, ..., e_{d′}, respectively. Namely,

gcd(C_f) = ∏_{j=1}^{d′} (ℓ′_j)^{e_j}.

We shall think of these functions as linear functions in the basis v_1^*, ..., v_n^*, 1. That is, for each i ∈ [d′] we define:

ℓ′_i = α′_{i,0} + Σ_{j=1}^{n} α′_{i,j} v_j^*.

The following lemma gives the analysis of Part 1 of the algorithm.

Lemma 4.5. Part 1 of Algorithm 2 runs in time n · poly(d). If V is r-rank preserving for C_f, it outputs gcd(C_f).

Proof. The complexity analysis is trivial, so we only prove the correctness of the algorithm. Since all the subspaces are r-rank preserving, we have that gcd(C_f)|_{V_i} = gcd(C_{f|_{V_i}}) for every subspace V_i (Lemma 4.4). Furthermore, the restrictions of every two non-equivalent linear functions remain non-equivalent (Property 1 of Definition 2.5). It follows that¹⁴ deg(gcd(C_f)|_{V_i}) = deg(gcd(C_f)). Hence, Step 2 of the algorithm does not fail. As no two linearly independent linear functions become linearly dependent when restricted to V, we get that there indeed exists a unique ℓ^{(i)} such that ℓ^{(i)}|_V = ℓ^{(0)}. We now note that the function ℓ defined in Step 5 satisfies ℓ|_{V_i} = ℓ^{(i)} for every 0 ≤ i ≤ n − t. By the structure of the V_i's it is clear that this ℓ is unique. In particular, ℓ must belong to gcd(C_f) with multiplicity e, as required.

Part 2: Gluing the Restrictions of sim(C_f): In [Shp09] an algorithm for exactly this task was given (Algorithm 4 there). Although it is not presented in this form, Algorithm 4 of [Shp09] can be seen as getting representations of the form (f/Lin(f))|_{V_i} = Q(L_1^{(i)}, ..., L_r^{(i)}), where for each i and j, L_j^{(i)}|_V = L_j^{(0)}, and then computing L̃_j such that for every i it holds that L̃_j|_{V_i} = L_j^{(i)}. This is exactly what we wish to achieve in Part 2 of our algorithm, and indeed the algorithm we give is exactly the algorithm (presented implicitly) of [Shp09]. Thus, correctness follows from the following lemma.

Lemma 4.6. (implicit in [Shp09]) Let h be a non-zero n-variate polynomial of degree d. Let r ∈ N⁺ be such that h is a polynomial of exactly r linear functions. Let V_0 be a subspace of F^n such that h|_{V_0} is a polynomial of exactly r linear functions. Algorithm 4 of [Shp09], given the input {h|_{V_i}}_{i=0}^{n−t}, outputs a representation of h as a polynomial of r linear functions in O(n · d^r) time.

By setting h = f/Lin(f), we clearly satisfy the requirements of Lemma 4.6. Hence, Part 2 of Algorithm 2 (which acts exactly as Algorithm 4 of [Shp09]) outputs the circuit sim(C_f) in O(n · d^r) time. Theorem 4.3 follows from this and from Lemma 4.5.

¹⁴Actually, there can be one linear function that is reduced to a constant, resulting in a lower degree. This case can be easily dealt with.


4.2 Finding a τ-Distant Representation in a Low Dimension Subspace

In this section we show how to find a minimal τ-distant ΣΠΣ(s, d, r) circuit computing f|_V, given black-box oracle access to f|_V. The values of s, d, r and τ that we choose are such that the uniqueness of the circuit is guaranteed (according to Theorem 3.6 and Corollary 3.7). Since we only deal with the restriction of f to the subspace V, we assume for convenience that f is a polynomial in t ≜ dim(V) variables, and we find the minimal τ-distant ΣΠΣ(s, d, r) circuit C′ ≜ Σ_{i=1}^{s} C_{f_i} computing f (that is, we abuse notation and write f instead of f|_V).

4.2.1 Step 1: Finding the set of subspaces

In this section we deal with the following problem. Let

C′ ≜ Σ_{i=1}^{s} C_{f_i}    (6)

be a ΣΠΣ(s, d, r) minimal τ-distant t-variate circuit computing a polynomial f. We find an “interpolating tree” for f. As detailed in the beginning of the section, this tree is constructed via some set of subspaces which, in some sense, isolates a single multiplication gate of C′.

Definition 4.7. An m-linear function tree is a tree graph with the following properties:

• Each non-leaf vertex has exactly m children.
• Every edge e is labeled with a linear function φ_e.
• The linear functions labeling the edges from a vertex u to its children are linearly independentH.
• Define the size of the m-linear function tree as the number of non-leaf vertices in the tree.

Definition 4.8. Let f be a polynomial and let T be an m-linear function tree. For a vertex u in the tree, define f_u as follows: For the root r, f_r ≜ f. Let φ be the linear function labeling the edge between a vertex v and its child u, and let a be the multiplicity of φ in f_v. Then

f_u ≜ (f_v / φ^a)|_{φ=0}.

Definition 4.9. An m-linear function tree is an interpolating tree for a polynomial g w.r.t. a polynomial f, if the following hold:

1. Denote the degree of g by d. Then m ≥ max{100 log(d), ∆(C_g) + 2}.
2. For any leaf u, g_u = f_u.
3. For any non-leaf v and edge e connecting v to one of its children, the multiplicities of φ_e in f_v and in g_v are the same.
4. For any non-leaf v and edge e connecting v to one of its children, we have that φ_e ∉ Lin(sim(C_{g_v})).

The usefulness of this definition is demonstrated in the following section. There we give an algorithm that receives as input an interpolating tree for f_i w.r.t. f, together with black-box access to f, and outputs C_{f_i}. In the rest of this section we prove the existence of an m-linear function tree of low depth (lower than s) that is an interpolating tree for some f_i w.r.t. f. We now explain how to choose the labels of the edges so that the resulting tree has the required properties. This is done by using splitting sets, as detailed earlier.
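The tree object of Definition 4.7 is a plain m-ary tree with labeled edges. A minimal data-structure sketch (class and method names are ours; edge labels are kept abstract rather than actual linear functions):

```python
class FnTree:
    """Vertex of an m-linear function tree (Definition 4.7)."""

    def __init__(self, children=None):
        # children: list of (edge_label, subtree) pairs; a leaf has none
        self.children = children or []

    def size(self):
        # Definition 4.7: size = number of non-leaf vertices
        return 0 if not self.children else 1 + sum(c.size() for _, c in self.children)

    def depth(self):
        return 0 if not self.children else 1 + max(c.depth() for _, c in self.children)

# A 2-linear function tree of depth 2: the root and its two children are non-leaves
t = FnTree([
    ("phi1", FnTree([("phi3", FnTree()), ("phi4", FnTree())])),
    ("phi2", FnTree([("phi5", FnTree()), ("phi6", FnTree())])),
])
print(t.size(), t.depth())  # → 3 2
```

In the paper, each edge label would be a linear function and the recursion of Definition 4.8 would divide by the label's multiplicity before restricting to its zero set.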

Definition 4.10. Let f be a t-variate polynomial of degree d. Let r, s, k ∈ N⁺ with s ≤ k, and let ξ(k) be some increasing positive integer function. Let C′ ≜ Σ_{i=1}^{s} C_{f_i} be a ξ(k)-distant ΣΠΣ(s, d, r) circuit computing f. Let L be a set of linearly independentH linear functions. We say that L is a (ξ, r, k)-splitting set for C′ when it satisfies the following properties:

1. |L| ≥ max{100 log(d), r + 2} · k.
2. There exists some ∅ ≠ A ⊊ [s] such that for every φ ∈ L the following holds: let a ≥ 0 be the multiplicity of φ in gcd(C′); then φ is a factor of f_i with multiplicity at least a + 1 if and only if i ∉ A.
3. For every φ ∈ L and i ∈ [s], we have that φ ∉ Lin(sim(C_{f_i})).
4. For each i ≠ j ∈ A and φ ∈ L, it holds that ∆(C_{(f_i/φ^a)|_{φ=0}}, C_{(f_j/φ^a)|_{φ=0}}) ≥ ξ(k − 1) · r.

The reason for the name “splitting set” comes from the fact that it splits the multiplication gates of C′ into two non-trivial sets (A and [s] \ A), such that only the multiplication gates in A “survive” the process of dividing the circuit by a linear function in the set and then projecting this linear function to zero. Namely, for every i ∈ [s] \ A and every φ ∈ L we have that (f_i/φ^a)|_{φ=0} = 0. Before we prove the existence of a splitting set, we give a lemma that demonstrates its usefulness.

Lemma 4.11. Let ξ(k) be an increasing integer function and r, d ∈ N⁺. Assume that for every s ≤ k ∈ N⁺ and every ξ(k)-distant ΣΠΣ(s, d, r) circuit C′ ≜ Σ_{i=1}^{s} C_{f_i} computing the polynomial f, there exists a (ξ, r, k)-splitting set. Define m ≜ max{100 log(d), r + 2}. Assume that ξ(k)r ≥ R(2k, d, r). Then for each such circuit, there exists an m-linear function tree T with the following properties:

1. The depth of T is smaller than s.
2. There exists i′ ∈ [s] such that T is an interpolating tree for f_{i′} w.r.t. f.

Proof. We prove the claim by induction on s. For s = 1, the tree is a single node and the claim is trivial. For s > 1, let L be the (ξ, r, k)-splitting set guaranteed for the ΣΠΣ(s, d, r) circuit C′ = Σ_{i=1}^{s} C_{f_i}. Let φ ∈ L and let a be its multiplicity in gcd(C′). Consider the polynomial (f/φ^a)|_{φ=0}. By Definition 4.10, there exists some ∅ ≠ A ⊊ [s] such that C_φ ≜ Σ_{i∈A} C_{(f_i/φ^a)|_{φ=0}} is either a single multiplication gate or a ξ(k − 1)-distant ΣΠΣ(|A|, d, r) circuit computing (f/φ^a)|_{φ=0}. In particular, this means that (f/φ^a)|_{φ=0} ≠ 0, since ξ(k − 1) is sufficiently large and by Theorem 3.6 the canonical distant circuit computing it must be unique. Hence, the multiplicity of φ in f is exactly a. Since |A| ≤ s − 1 ≤ k − 1, we get by induction that there exists an m-linear function tree T_φ with the following properties:

1. The depth of T_φ is lower than |A| ≤ s − 1.
2. There exists i_φ ∈ A ⊆ [s] such that T_φ is an interpolating tree for (f_{i_φ}/φ^a)|_{φ=0} w.r.t. (f/φ^a)|_{φ=0}.

In particular, there is some i′ ∈ [s] such that at least |L|/s ≥ |L|/k ≥ m of the trees T_φ satisfy i_φ = i′. We now construct a new tree T by connecting m of these trees (those with i_φ = i′) to a new root and labeling each edge from the root to T_φ with the linear function φ. Since the multiplicity of each such φ in f and in f_{i′} is the same, it is not hard to see that T is an m-linear function tree satisfying the required properties (recall that by the definition of ∆(·) we have that ∆(C_{f_{i′}}) ≤ r, as f_{i′} is computed by a multiplication gate in a ΣΠΣ(s, d, r) circuit).

In other words, the lemma above implies that a splitting set guarantees the existence of an interpolating tree. We proceed to proving the existence of splitting sets. That is, we find a function ξ such that for every s ≤ k ∈ N⁺ and every ξ(k)-distant ΣΠΣ(s, d, r) circuit C′, there exists a (ξ, r, k)-splitting set L. Our proof has the following structure. We first define the function ξ and discuss some of its properties. We then find a set of linearly independentH linear functions, denoted L, for which Property 2 of Definition 4.10 holds. Next, we give an upper bound on the number of functions in L that do not satisfy either Property 3 or Property 4 of Definition 4.10. Finally, we remove the “bad” functions from L and verify that the number of functions remaining is large enough (thus guaranteeing Property 1). The function we work with is the following:

ξ_d(k) ≜ { ∏_{j=3}^{k} 2^{j−1} · 2^{(k+4)(k+5)/2} · (log(d))^k,   k > 2
         { 2^{(k+4)(k+5)/2} · (log(d))^k,                      k = 2.    (7)

For convenience, we state some technical, yet trivial, properties of ξ_d that we shall use later.

Fact 4.12. For k ≥ 2, r ≥ 50 log(d) and sufficiently large d (i.e., d ≥ C for some universal constant C), the following properties hold:

• ξ_d(k) is an increasing function (both as a function of k and of d).
• ξ_d(k) ≥ 2k + 1.
• ξ_d(k) − 2 ≥ ξ_d(k)/2.
• ξ_d(k)/2^{k+2} ≥ 3k.
• ξ_d(k) ≥ R(2k, d, r)/r.

We are now ready to find the linear functions required for the splitting set.

Lemma 4.13. Let f be a t-variate polynomial of degree d ≥ 2. Let C′ = Σ_{i=1}^{s} C_{f_i} be a ξ_d(k)-distant ΣΠΣ(s, d, r) circuit computing f, where r ≥ 50 log(d). Then there exists a (ξ_d, r, k)-splitting set for C′.

Proof. We start with the following lemma, which shows that there are many linear functions satisfying Property 2 of Definition 4.10.

Lemma 4.14. There exists a set L of ξ_d(k)r/2^{k+1} linearly independentH linear functions and a partition of [s] into two sets A, Ā = [s] \ A such that

• Both A and Ā are not empty.
• For every φ ∈ L, let a_φ be its multiplicity in gcd(f_1, ..., f_s). For any i ∈ [s], φ divides f_i with multiplicity exactly a_φ iff i ∈ A.

Proof. We first prove a weaker lemma, which shows that there are many linearly independentH functions that divide f_1 and f_2 (the polynomials computed by the first and second multiplication gates of C′) with different multiplicities. We shall later “extract” the required set from this initial set.

Lemma 4.15. There are at least ξ_d(k)r − 2r linearly independentH linear functions that divide f_1 with a different multiplicity than f_2.


Proof. It is not hard to see that

Lin(sim(C_{f_1} + C_{f_2})) = Lin(sim(C_{f_1})) ∪ Lin(sim(C_{f_2})) ∪ gcd(C_{f_1})/gcd(C_{f_1} + C_{f_2}) ∪ gcd(C_{f_2})/gcd(C_{f_1} + C_{f_2}),

where we abuse notation and refer to gcd(C_{f_1})/gcd(C_{f_1} + C_{f_2}) as the set of linear functions dividing the polynomial (without counting multiplicities). For each set, consider the dimension of its span and obtain the following inequality:

∆(C_{f_1}, C_{f_2}) ≤ ∆(C_{f_1}) + ∆(C_{f_2}) + dim₁( gcd(C_{f_1})/gcd(C_{f_1} + C_{f_2}) ∪ gcd(C_{f_2})/gcd(C_{f_1} + C_{f_2}) ).

Since C_{f_1}, C_{f_2} are multiplication gates of a ξ_d(k)-distant ΣΠΣ(s, d, r) circuit, we have that ξ_d(k) · r ≤ ∆(C_{f_1}, C_{f_2}) and that ∆(C_{f_1}) + ∆(C_{f_2}) ≤ 2r. Hence,

dim₁( gcd(C_{f_1})/gcd(C_{f_1} + C_{f_2}) ∪ gcd(C_{f_2})/gcd(C_{f_1} + C_{f_2}) ) ≥ ξ_d(k) · r − 2r.

It remains to notice that any linear function in gcd(C_{f_1})/gcd(C_{f_1} + C_{f_2}) ∪ gcd(C_{f_2})/gcd(C_{f_1} + C_{f_2}) divides f_1 and f_2 with different multiplicities.

Let L′ be the set of linear functions guaranteed by Lemma 4.15. For every linear function φ ∈ L′ and i ∈ [s], let a_i be the multiplicity of φ in f_i. For any such φ, define A_φ ⊆ [s] as the set of indices for which a_i is minimal. As a_1 ≠ a_2 for any choice of φ, it holds that ∅ ≠ A_φ ⊊ [s]. It follows that for some ∅ ≠ A ⊊ [s], there exist at least

|L′|/2^k ≥ (ξ_d(k)r − 2r)/2^k ≥ ξ_d(k)r/2^{k+1}

many functions in L′ s.t. A_φ = A (the last inequality is reached via Fact 4.12). It can easily be seen that the set L ≜ {φ | A_φ = A} and the partition (A, [s] \ A) realize the required properties of Lemma 4.14.

We continue with the proof of Lemma 4.13. Let L denote a set of functions guaranteed by Lemma 4.14. To bound the number of functions that do not satisfy Condition 3 of Definition 4.10, we notice that dim₁(Lin(sim(C_{f_i}))) ≤ r for every i ∈ [s]. Hence, there are at most s · r ≤ k · r linear functions in L that belong to Lin(sim(C_{f_i})) for some i ∈ [s] (because the functions of L are linearly independentH). To bound the number of functions that do not satisfy Condition 4, we need the following lemma, which is based on the rank bound for depth-3 circuits of Saxena and Seshadhri [SS09]. The proof is given in Section B of the appendix.

Lemma 4.16. Let s, d, r ∈ N and let C ≜ Σ_{i=1}^{s} C_{f_i} be a ΣΠΣ(s, d, r) circuit. Let L̂ be a set of linearly independentH linear functions and A ⊆ [s] a set of size |A| ≥ 2. Let η, R ∈ N. Assume that for every i, i′ ∈ A with i ≠ i′ we have that ∆(C_{f_i} + C_{f_{i′}}) ≥ R + 2r. Assume further that for each φ ∈ L̂ the following holds:

• For every i ∈ [s], φ divides f_i if and only if i ∉ A.
• For every i ∈ [s], φ ∉ Lin(sim(C_{f_i})).
• ∃ i, i′ ∈ A such that i ≠ i′ and ∆(C_{f_i}|_{φ=0}, C_{f_{i′}}|_{φ=0}) < R/(2 · η · log(d)).

Then

|L̂| ≤ (|A| choose 2) · R/η.

In other words, the lemma bounds the number of linear functions in L that satisfy Properties 2 and 3 of Definition 4.10 but not Property 4 (when applied to the circuit sim(C)). We now proceed with the proof of Lemma 4.13. Consider the functions of L for which Property 4 does not hold (but Properties 2 and 3 do hold). For such a function φ, there are i ≠ j ∈ A such that¹⁵

∆(C_{(f_i/φ^a)|_{φ=0}}, C_{(f_j/φ^a)|_{φ=0}}) < ξ_d(k − 1) · r = (ξ_d(k − 1)/ξ_d(k)) · (ξ_d(k) · r) = ξ_d(k) · r / (2^{k−1} · 2^{k+4} · log(d)).

Set R ≜ r · (ξ_d(k) − 2) and η ≜ 2^{k+2} · 2^{k−1}. It follows that

∆(C_{(f_i/φ^a)|_{φ=0}}, C_{(f_j/φ^a)|_{φ=0}}) < ξ_d(k) · r / (2^{k−1} · 2^{k+4} · log(d)) = ξ_d(k) · r / (4η · log(d)) ≤⁽¹⁾ (ξ_d(k) − 2) · r / (2η · log(d)) = R / (2η · log(d)),

where inequality (1) stems from Fact 4.12. In addition, for every i_1 ≠ i_2 ∈ [s] we have that ∆(C_{f_{i_1}}, C_{f_{i_2}}) ≥ ξ_d(k)r = (ξ_d(k) − 2)r + 2r = R + 2r. Hence, we may apply Lemma 4.16 with the parameters R and η (on the circuit sim(C)), and get that there are at most

(k−1 choose 2) · ξ_d(k) · r / (2^{k+2} · 2^{k−1}) ≤ ξ_d(k) · r / 2^{k+2}

linear functions in L for which Condition 4 is the only condition that does not hold (here we used |A| ≤ k − 1 and (k−1 choose 2) ≤ 2^{k−1}). We now remove the functions that do not satisfy Conditions 3 or 4 from L. According to the previous calculations, the number of functions that remain is at least

r · (ξ_d(k)/2^{k+1} − k − ξ_d(k)/2^{k+2}) = r · (ξ_d(k)/2^{k+2} − k) ≥⁽¹⁾ 2r · k ≥⁽²⁾ max{100 log(d), r + 2} · k,

where inequality (1) is derived from Fact 4.12 and inequality (2) holds since r ≥ 50 log(d) ≥ 50. This completes the proof of Lemma 4.13 and shows the existence of a (ξ_d, r, k)-splitting set for C′. We now have the following corollary.
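As a numeric sanity check, one can evaluate ξ_d(k) and verify the properties of Fact 4.12 together with the ratio ξ_d(k)/ξ_d(k−1) = 2^{k−1} · 2^{k+4} · log(d) used in the computation above. The sketch below assumes the case analysis of Equation (7) reads as reconstructed there, and that log means log base 2 (the paper leaves the base implicit):

```python
import math

def xi(k, d):
    """xi_d(k) per Equation (7); log taken base 2 (our assumption)."""
    base = 2 ** ((k + 4) * (k + 5) // 2) * math.log2(d) ** k
    if k == 2:
        return base
    return 2 ** sum(j - 1 for j in range(3, k + 1)) * base

d = 1024  # so log2(d) = 10
for k in range(3, 7):
    assert xi(k, d) >= 2 * k + 1               # Fact 4.12
    assert xi(k, d) - 2 >= xi(k, d) / 2        # Fact 4.12
    assert xi(k, d) / 2 ** (k + 2) >= 3 * k    # Fact 4.12
    # ratio used when passing from xi_d(k) to xi_d(k - 1)
    assert math.isclose(xi(k, d) / xi(k - 1, d),
                        2 ** (k - 1) * 2 ** (k + 4) * math.log2(d))
print("checks passed")
```

The doubly exponential growth of ξ_d(k) in k is harmless here because k is a constant.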

Corollary 4.17. Let r, d ∈ N⁺ be such that d ≥ 2 and r ≥ 50 log(d). Let m ≜ max{100 log(d), r + 2} and let ξ_d(k) be as in Equation (7). Let s ≤ k ∈ N⁺ and let C′ ≜ Σ_{i=1}^{s} C_{f_i} be a minimal ξ_d(k)-distant ΣΠΣ(s, d, r) circuit computing the polynomial f. Then there exists an m-linear function tree T with the following properties:

• The depth of T is at most s − 1.
• There exists some i′ ∈ [s] such that T is an interpolation tree for f_{i′} w.r.t. f.

¹⁵Notice that Condition 4 is only interesting for k ≥ 3. When k = 2 we have that |A| = 1 and the condition always holds.


4.2.2 Steps 2 & 3: Gluing the Restrictions Together

In this section we deal with the following problem: Let f be a t-variate polynomial and T be an m-linear function tree that is interpolating for f w.r.t. a polynomial F. Given the tree T and black-box access to F, we would like to reconstruct the circuit C_f (notice that f is actually the polynomial f_i described at the beginning of Section 4.2 and F is the polynomial computed by the original circuit). We give an algorithm for the problem above. We first describe its general scheme: As a first step, we construct, for each leaf u of the tree, the circuit C_{F_u} using the brute-force algorithm of Appendix A.2. As we have black-box access to F, we can query the polynomial F_u by using a black-box factoring algorithm, described in Appendix A.1. The computation is done from the root downwards: for a vertex v and its child u, connected by an edge labeled with φ, we divide F_v by φ^a, where a is the multiplicity of φ in F_v, and restrict it to the subspace in which φ = 0. Due to the properties of interpolating trees, for a leaf u, F_u = f_u. From here on, our methods are “local” with respect to the input tree. Specifically, let v be a vertex in the tree and denote the children of v by {u_j}_j. We show how to construct the circuit C_{f_v} given the circuits {C_{f_{u_j}}}_j. Using this “local construction method” we gradually construct the circuit C_{f_v} for each vertex v, until we reach the root, where we construct C_f. The goal of this section is to prove the following theorem.

Theorem 4.18. Let f, F be t-variate polynomials and T an m-linear function tree that is an interpolating tree for f w.r.t. F. Then Algorithm 4, when given T and black-box access to the polynomial F as input, runs in time size(T) · |F|^{O(t²)}, and outputs C_f.

Algorithm 3 is the “local” algorithm that in fact deals with an m-linear function tree of depth 1. It assumes that none of the linear functions in the tree divide the polynomial F. To deal with such cases, we preprocess the circuit by dividing F and the different f_u's by the appropriate linear functions via a factoring algorithm (Section A.1), and then multiply the constructed polynomial by them. As the multiplicity of each function is the same in f and F, we obtain the correct circuit. The running time of the preprocessing is dominated by that of Algorithm 4 (see Section A.1). The algorithm works in two stages. First we find the linear functions of gcd(C_f) (all linear functions dividing f), and then we construct sim(C_f). In the first stage (finding gcd(C_f)) we find the restriction gcd(C_f)|_{φ_u=0} for all leaves u (φ_u is the label of the edge connecting the root to u) and glue them together using a method presented in [Shp09]. In that paper, an algorithm is devised for reconstructing a single (non-generalized) multiplication gate given its restriction to various co-dimension 1 subspaces (see Appendix A.3). In the second stage we glue the different restrictions {sim(C_f)|_{φ_u=0}}_u. For this we use the same gluing algorithm given in Section 4.1. In particular, we quote Theorem 4.3 of Section 4.1 and show that its requirements are satisfied, which guarantees that we can perform the gluing.

Lemma 4.19. Algorithm 3 runs in d^{O(t)} time, for d = deg(f). If the tree is an interpolating tree for f w.r.t. F, then the algorithm outputs C_f.

Proof. Let u_1, ..., u_m be an enumeration of the leaves of the tree. Notice that the linear functions of the m-linear function tree given as input to Algorithm 3 are linearly independentH. Hence, we will hereon assume w.l.o.g. that the functions at hand are x_1, ..., x_m. In particular, f_{u_i} = f|_{x_i=0} for every i ∈ [m]. The proof has the following structure: First we show that in each co-dimension 1 subspace we are able to obtain the restrictions of gcd(C_f) and sim(C_f). We then prove that for every leaf u, ∆(C_{f_u}) = ∆(C_f). Hence, Step 2 of the algorithm succeeds and r = ∆(C_f). We proceed to prove that gcd(C_f) is found correctly by using the results of [Shp09], where an algorithm for Step 3 is given. To show that sim(C_f) is properly constructed, we prove the following: Let h be the polynomial computed by sim(C_f). Clearly, C_h = sim(C_f). We prove that a pair of leaves such as u_1, u_2 is found in Step 6.
Input: An m-linear function tree of depth 1. In addition, for every leaf u of the tree, the circuit C_{f_u}.
Output: A generalized multiplication gate.
1: For a leaf u, let φ_u be the linear function labeling the edge connected to it;
2: If for some pair of leaves u_1, u_2 we have ∆(C_{f_{u_1}}) ≠ ∆(C_{f_{u_2}}), output “fail”. Otherwise, define r ≜ ∆(C_{f_{u_1}});
3: Find a set of linear functions ℓ_1, ..., ℓ_{d′} such that for each leaf u, ∏_{j=1}^{d′} ℓ_j|_{φ_u=0} = gcd(C_{f_u}). If no such functions exist, output “fail”;
4: Output gcd(C_f) = ∏_{j=1}^{d′} ℓ_j;
5: For a leaf u, write sim(C_{f_u}) = p(L_1^u, ..., L_r^u);
6: Find a pair of leaves u_1 ≠ u_2 for which L_1^{u_1}|_{φ_{u_2}=0}, ..., L_r^{u_1}|_{φ_{u_2}=0} are linearly independentH. If no such pair exists, output “fail”;
7: Define U_0 as the co-dimension 2 subspace where φ_{u_1} = φ_{u_2} = 0. Let U_1 be the co-dimension 1 subspace where φ_{u_1} = 0 and U_2 the co-dimension 1 subspace where φ_{u_2} = 0;
8: Let h be the polynomial computed by sim(C_f). Run Algorithm 2 with input {C_{h|_{U_0}}, C_{h|_{U_1}}, C_{h|_{U_2}}}. Set sim(C_f) to the output of the algorithm;
Algorithm 3: Reconstructing a circuit given a depth-1 m-linear function tree
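The dimension argument behind Steps 6 and 7 (made precise in Lemma 4.21 below) can be sketched with exact rational linear algebra: restricting linear forms to x_i = 0 amounts to deleting coordinate i, and some coordinate must keep the forms at full rank. The sketch below ignores the constant-term subtleties of span₁ and uses our own function names:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over the rationals, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def restrict(rows, i):
    """Restricting linear forms to the subspace x_i = 0 deletes coordinate i."""
    return [row[:i] + row[i + 1:] for row in rows]

forms = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]   # r = 3 independent forms, 4 coordinates
print([rank(restrict(forms, i)) for i in range(4)])  # → [2, 2, 2, 3]
```

In this toy example only the last coordinate keeps the full rank 3; the condition m ≥ r + 2 in the paper guarantees that a rank-preserving coordinate always exists.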
Moreover, we show that the subspace U_0 is ∆(C_h)-rank preserving for C_h. This will guarantee the correctness of Step 8.

Lemma 4.20. For each i ∈ [m], the co-dimension 1 subspace defined by the equation x_i = 0 is ∆(C_f)-rank preserving for sim(C_f).

Proof. Let sim(C_f) = p̃(L̃_1, ..., L̃_{∆(C_f)}). Assume for a contradiction that for some i ∈ [m], the subspace where x_i = 0 is not ∆(C_f)-rank preserving for sim(C_f). Then L̃_1|_{x_i=0}, ..., L̃_{∆(C_f)}|_{x_i=0} must be linearly dependentH. Hence, there exists a non-trivial combination of these functions that gives a constant function:

Σ_{j=1}^{∆(C_f)} α_j · L̃_j|_{x_i=0} = γ

for some α_1, ..., α_{∆(C_f)}, γ ∈ F. The same combination of L̃_1, ..., L̃_{∆(C_f)} cannot be constant, since these functions are linearly independentH. Hence, for some 0 ≠ β ∈ F:

Σ_{j=1}^{∆(C_f)} α_j · L̃_j = β · x_i + γ.

Therefore, x_i ∈ Lin(sim(C_f)), meaning that the input tree is not an interpolating tree (see Definition 4.9).

We continue with the proof and show how to obtain, in each co-dimension 1 subspace, the corresponding restrictions of gcd(C_f) and sim(C_f). The given tree is interpolating for f w.r.t. F, and


for all i ∈ [m], x_i does not divide F. Therefore, for all i ∈ [m] it holds that f|_{x_i=0} ≠ 0. Hence, by Lemmas 4.20 and 4.4 it holds that

gcd(C_{f_{u_i}}) = gcd(C_f)|_{x_i=0},   sim(C_{f_{u_i}}) = sim(C_f)|_{x_i=0}.    (8)

Namely, if sim(C_f) = p̃(L̃_1, ..., L̃_{∆(C_f)}), then sim(C_{f_{u_i}}) = p̃(L̃_1|_{x_i=0}, ..., L̃_{∆(C_f)}|_{x_i=0}). Since L̃_1|_{x_i=0}, ..., L̃_{∆(C_f)}|_{x_i=0} are linearly independentH, for every i ∈ [m] we have that r_i = ∆(C_f). Hence, the algorithm does not fail in Step 2 and r = ∆(C_f).

We now prove that we can indeed reconstruct the linear functions of gcd(C_f). According to Equation (8), the set of linear functions in gcd(C_{f_{u_i}}) is the set of projections of the linear functions in gcd(C_f) to the co-dimension 1 subspace defined by x_i = 0. In [Shp09] an algorithm (Algorithm 6 there) is given for exactly this problem (see Section A.3 in the appendix). Therefore, we can assume that we have reconstructed gcd(C_f).

We proceed to the second stage of Algorithm 3. The following lemma proves the existence of a pair of leaves satisfying the requirement of Step 6.

Lemma 4.21. There exists 1 < î ≤ m such that

dim₁(span₁{L_1^{u_1}|_{x_î=0}, ..., L_r^{u_1}|_{x_î=0}}) = r.

Proof. Let 1 < i ≤ m. Assume that dim₁(span₁{L_1^{u_1}|_{x_i=0}, ..., L_r^{u_1}|_{x_i=0}}) < r. Then there exists a non-trivial linear combination of L_1^{u_1}|_{x_i=0}, ..., L_r^{u_1}|_{x_i=0} equal to some field element. That is, for some list of field elements (α_j)_{j=0}^{r} in F, containing at least one non-zero element,

α_0 + Σ_{j=1}^{r} α_j L_j^{u_1}|_{x_i=0} = 0.

As L_1^{u_1}, ..., L_r^{u_1} are linearly independentH, it must hold that the same combination of L_1^{u_1}, ..., L_r^{u_1} sums to x_i (after scaling by a field element). Hence, x_i ∈ span₁({L_1^{u_1}, ..., L_r^{u_1}}). Since m ≥ r + 2, we get by a simple dimension argument that there must exist some 1 < î ≤ m such that x_î ∉ span₁({L_1^{u_1}, ..., L_r^{u_1}}). For this î it holds that

dim₁(span₁{L_1^{u_1}|_{x_î=0}, ..., L_r^{u_1}|_{x_î=0}}) = r,

as required.

Thus, the lemma implies that the subspace U_0 exists. We are now close to completing the proof of Lemma 4.19. In Step 8 we run Algorithm 2 of Section 4.1 on the different restrictions of the circuit C_h, where h is the polynomial computed by sim(C_f). As h does not have any linear factors, it follows that C_h = sim(C_f). Thus, it suffices to prove that the output of Algorithm 2 is the circuit C_h, given the restrictions {C_{h|_{U_0}}, C_{h|_{U_1}}, C_{h|_{U_2}}}. The requirement of Algorithm 2 is that U_0 be ∆(C_h)-rank preserving for C_h (see Theorem 4.3). Clearly, since C_h = sim(C_f), U_0 satisfies this requirement, and thus Step 8 outputs C_h = sim(C_f). This proves the algorithm's correctness.

We now analyze the running time of the algorithm. By Lemma A.4, finding the linear functions of gcd(C_f) requires poly(d, t) time. The last step of the algorithm (constructing sim(C_f)) requires t · d^{O(∆(C_h))} = d^{O(t)} time (this is shown in Section 4.1, Theorem 4.3). It can easily be seen that the time required by all other steps is also poly(t, d, size(C_f)). Hence, the total running time of Algorithm 3 is d^{O(t)} (as size(C_f) = d^{O(t)}). This completes the proof of Lemma 4.19.

Now that we have analyzed Algorithm 3, we are ready to present Algorithm 4 that handles any m-linear function tree and not only a tree of depth 1.
Input: Two integers t, m ∈ N⁺ and black-box access to a t-variate polynomial F. An m-linear function tree T.
Output: A circuit, denoted C_f, computing a polynomial f.
1: if T is a single node then
2:   reconstruct C_f ≜ C_F via Algorithm 12 of Section A.2 (the brute-force interpolation algorithm);
3: end
4: else
5:   Let {e_j}_{j=1}^{m} be the set of edges connecting the root to its children, and denote the corresponding children by {v_j}_{j=1}^{m}. For each e_j, let a_j be the multiplicity of φ_{e_j} in F;
6:   Obtain, via the black-box factoring algorithm of Section A.1, the value of a_j and black-box access to the polynomial F_{v_j} = (F/φ_{e_j}^{a_j})|_{φ_{e_j}=0} for all j ∈ [m];
7:   For each j ∈ [m], run recursively with the subtree rooted at v_j and a black box computing F_{v_j}. Denote the outputted circuit by C′_j;
8:   Run Algorithm 3 with input T′, composed of only the first level of the tree T (i.e., the different v_j's are leaves), and {C′_j}_j to obtain C_f;
9: end
Algorithm 4: Reconstructing a circuit given an m-linear function tree
Proof of Theorem 4.18. We start by proving the correctness of the algorithm. It is easy to verify that if T is an interpolating tree for f w.r.t. F, then for any vertex v_j, the subtree rooted at v_j is an interpolating tree for f_{v_j} w.r.t. F_{v_j}. The proof of correctness now follows by a simple induction on the tree depth. The base case is trivial. By the induction hypothesis, for every j ∈ [m] the circuit C′_j is in fact equal to C_{f_{v_j}} (where f_{v_j} is as defined in Definition 4.8). Hence, by the correctness of Algorithm 3, the outputted circuit is indeed the default circuit of f.

We now analyze the running time of the algorithm. In total, Algorithm 3 is invoked size(T) times (once for every non-leaf vertex). Similarly, Algorithm 12 and the factoring algorithm of Section A.1 are called as subroutines at most O(size(T) · m) times. Hence, according to Lemma A.3, Lemma A.2 and Lemma 4.19, the running time of Algorithm 4 is

d^{O(t)} · size(T) + |F|^{O(t²)} · m · size(T).

Since m ≤ t (in a t-dimensional space there are at most t linearly independentH linear functions) and d ≤ |F| (otherwise we work with an algebraic extension of F containing more than d elements), the claim follows.

4.2.3 The Algorithm for finding a τ-distant circuit

We conclude Section 4.2 by giving its main algorithm. The following theorem concludes its analysis and this section.

Theorem 4.22. Let C be a t-variate minimal ΣΠΣ(k, d, ρ) circuit computing a polynomial f. Let r ∈ N be such that r ≥ max{50 log(d), R(2k, d)}. Assume also that τ ≥ ξ_d(k). Algorithm 5, given the inputs k, t, d, r, τ and C, runs in |F|^{O(k·t^{k+1})} time. It outputs an integer s ≤ k and a τ-distant ΣΠΣ(s, d, r) circuit computing f if they exist. If not, it outputs “fail”.
Input: k, t, d, r, τ ∈ N and oracle access to a ΣΠΣ(k, d, ρ) circuit C in t variables computing a polynomial f.
Output: s ∈ N⁺ such that s ≤ k, and a τ-distant ΣΠΣ(s, d, r) circuit C′ = Σ_{i=1}^{s} C_{f_i} computing f.
1: If ∆(C_f) ≤ r, output s = 1 and C′ = C_f;
2: If ∆(C_f) > r and k = 1, output “fail”;
3: m ≜ max{⌈100 log(d)⌉, r + 2};
4: foreach m-linear function tree T of depth lower than k do
5:   Run Algorithm 4 with inputs T and f;
6:   If the algorithm failed, or outputted a circuit C_g such that ∆(C_g) > r, continue to the next tree;
7:   Recursively construct a τ-distant circuit with at most k − 1 multiplication gates computing the polynomial f − f_1. If it does not exist, continue to the next tree;
8:   Denote the found circuit by C_1 and the number of its multiplication gates by ŝ. Check whether C_1 + C_{f_1} is a τ-distant ΣΠΣ(ŝ + 1, d, r) circuit computing f. If so, output s = ŝ + 1 and C′ = C_1 + C_{f_1}. Otherwise, continue to the next tree;
9: end
10: If no τ-distant circuit was found, output “fail”;
Algorithm 5: Finding the τ-distant circuit of a polynomial
Proof. (of Theorem 4.22) We first note that as τ ≥ ξ_d(k) ≥ R(2k, d, r)/r (see Fact 4.12), if a τ-distant ΣΠΣ(s, d, r) circuit exists then it is unique (see Theorem 3.6). We start by proving the correctness of the algorithm. Note that before the algorithm outputs any circuit (Step 8) it verifies that it is indeed a τ-distant circuit for f, so whenever we output a circuit we are guaranteed to have the unique circuit at hand. Assume that for some s ≤ k there exists a τ-distant ΣΠΣ(s, d, r) minimal circuit C = ∑_{i=1}^{s} C_{f_i} for f. We prove that for some s′ ≤ k, a τ-distant ΣΠΣ(s′, d, r) circuit is output by the algorithm, thus proving its correctness. Our proof is by induction on s. When s = 1, we clearly output {C_f}. Assume that s > 1. According to Corollary 4.17, as τ ≥ ξ_d(k), for some tree that we check, Algorithm 4 will produce, w.l.o.g., the circuit C_{f_1}. According to Theorem 3.6, the circuit ∑_{i=2}^{s} C_{f_i} is the unique τ-distant ΣΠΣ(s − 1, d, r) circuit computing f − f_1. Hence, by the induction hypothesis, the recursive call will output C_1 = ∑_{i=2}^{s} C_{f_i}. Therefore, the next step of the algorithm will output the circuit C′, as we wanted to prove.

We now analyze the running time. Our first action requires checking whether ∆(C_f) ≤ r. To do so we reconstruct C_f using Algorithm 12 (the brute-force algorithm) in |F|^{O(t^2)} time.¹⁶ We now analyze the number of iterations in each recursive call. For this we just have to bound the number of m-linear function trees. The size of any set of linearly independent linear functions is at most t, so m ≤ t. Each tree has at most O(m^k) = O(t^k) edges. For each edge there exists a t-variate linear function. The number of t-variate linear functions over the field F is at most |F|^{t+1}. Hence, the number of trees we check is bounded by |F|^{O(t^{k+1})}, and so is the number of iterations. In each iteration, besides the recursive call, we construct a circuit given an m-linear function tree (Step 3) and perform a PIT test (Step 8). The former requires |F|^{O(t^2)} time, according to Theorem 4.18. The latter (PIT) may be done deterministically by brute-force interpolation in d^{O(t)} = |F|^{O(t)} time (as d ≤ |F|). Hence, the time spent in each iteration, not including the recursive call, is |F|^{O(t^2)}. Concluding, we get that the total time spent, not including recursive calls, is |F|^{O(t^{k+1})}. The recursion depth is obviously bounded by k since we reduce its value by one in each call. Thus, the total running time is |F|^{O(k·t^{k+1})}.

Note: An alternative view of the algorithm is the following: We go over all possible k-tuples of trees of depth k, and for every such tuple we interpolate f_1 from the first tree, then f_2 from the second tree (given access to f − f_1), etc.

¹⁶ There are more efficient ways to check whether ∆(C_f) ≤ r. However, this does not affect the analysis of the running time.
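The brute-force interpolation PIT invoked above is elementary enough to spell out: a nonzero t-variate polynomial of total degree at most d cannot vanish on an entire grid S^t with |S| = d + 1, so d^{O(t)} black-box evaluations decide identity. A minimal sketch, with the black box modeled as a plain Python function over a prime field (function names and toy polynomials are ours, not the paper's):

```python
from itertools import product

def brute_force_pit(poly, t, d, p):
    """Decide whether a t-variate black-box polynomial of total degree <= d is
    identically zero over F_p by evaluating it on the grid {0..d}^t: a nonzero
    polynomial of degree <= d in each variable cannot vanish on all of S^t
    when |S| = d + 1 (so we assume d + 1 <= p)."""
    return all(poly(v) % p == 0 for v in product(range(d + 1), repeat=t))

# toy examples over F_7: the first expression is the zero polynomial in disguise
p = 7
zero_f = lambda v: ((v[0] + v[1]) ** 2 - v[0] ** 2 - 2 * v[0] * v[1] - v[1] ** 2) % p
nonzero_f = lambda v: (v[0] * v[1] + 1) % p
```

The same grid test is what makes Step 8's check deterministic; its cost d^{O(t)} is what the analysis charges per iteration.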

4.3 The Reconstruction Algorithm

We are now ready to summarize our findings and present the learning algorithm. Algorithm 6, given the inputs k, n, d ∈ N and black box access to an n-input ΣΠΣ(k, d, ρ) circuit C, outputs a ΣΠΣ(s, d, r) circuit C′ computing the same polynomial as C. Its running time is quasi-polynomial in n, d, |F|. The following lemma summarizes the analysis of the algorithm; Theorem 1 is immediately implied by it.

Lemma 4.23. Let k, n, d, ρ ∈ N and let C be an n-variate ΣΠΣ(k, d, ρ) circuit over the field F. Then Algorithm 6, given a black box computing C as input, outputs, for some r ≤ max{50 log(d), R(2k, d), R(k + 1, d) + k · ρ} · (ξ_d(k) · k)^k and s ≤ k, a ΣΠΣ(s, d, r) circuit C′ such that C′ ≡ C. The running time of the algorithm is

O( n^2 · |F|^{ (max{50 log(d), R(2k,d), R(k+1,d)+k·ρ} · ((ξ_d(k)+1)·k)^{k−2} · ξ_d(k))^{k+1} } ) = poly(n) · exp( log(|F|) · log(d)^{O(k^3)} · ρ^{O(k)} ) .

Proof. (of Lemma 4.23) We first prove the correctness of the algorithm. Notice that before we output any circuit we verify that it computes the correct polynomial. Hence, it suffices to prove that there exists at least one pair (r, α) for which a circuit is output. Let f be the polynomial computed by the input circuit C. In the main iteration, the integer r takes each value between r_init and r_init · k^{⌈log_k(ξ_d(k)+1)⌉·(k−2)}. According to Theorem 3.2, there exists at least one such value of r for which there exist an integer s′ ≤ k and a minimal ξ_d(k)-distant ΣΠΣ(s′, d, r) circuit C′ computing f. We focus on the iterations in which r takes a minimal such value. According to Lemma 2.8, there exists α ∈ S for which the chosen subspace V_{α,t} is t-rank preserving for C′. We focus on the iteration where such an element α is chosen along with the specified value of r.

Let C′ = ∑_{i=1}^{s′} C_{f′_i} be a minimal ξ_d(k)-distant ΣΠΣ(s′, d, r) circuit computing f whose existence stems from the choice of r. By Fact 4.12, R(2k, d, r)/r ≤ ξ_d(k). Hence, by Corollary 3.7, the circuit C′|_V is a ξ_d(k)-distant ΣΠΣ(s′, deg(f|_V), r) minimal circuit computing f|_V. In addition, by the same corollary, there is no other ξ_d(k)-distant ΣΠΣ(k′, deg(f|_V), r) minimal circuit computing f|_V for any k′ ≤ k. The same applies for all the other V_j's. Therefore, for each subspace V_j, the circuit reconstructed in Step 7 is in fact C′|_{V_j}, due to the correctness of Algorithm 5. In particular, all the s_j's are the same (and equal to s as defined in Step 8), and for each i ∈ [s] and 0 ≤ j ≤ n − t it holds, after Step 9, that C_{f_i^j} = C_{f′_i}|_{V_j}. Since each V_j is r-rank preserving for C′, we have by Lemma 4.4 that C_{f′_i}|_{V_j} = C_{f′_i|_{V_j}}. From this, and from V being t-rank preserving for each C_{f′_i}, it follows by Theorem 4.3 that in Step 10, for every i ∈ [s], Algorithm 2 outputs the circuit C_{f′_i} (that is, for every i ∈ [s], f_i = f′_i).

Input: k, n, d, ρ ∈ N and a black box holding an n-variate ΣΠΣ(k, d, ρ) circuit C.
Output: A ΣΠΣ(s, d, r) circuit C′ computing the same polynomial as C.
1  r_init = max{50 log(d), R(2k, d), R(k + 1, d) + k · ρ};
2  Let S ⊆ F be such that |S| = n · d · ( (r_init · k^{⌈log_k(ξ_d(k)+1)⌉·k} + 2) choose 2 ) · ( k choose 2 ) + 1;
3  foreach r_init ≤ r ≤ r_init · k^{⌈log_k(ξ_d(k)+1)⌉·(k−2)} and α ∈ S do
4    Set t = r · ξ_d(k);
5    Define V = V_{α,t}. Let V̂ be the homogeneous part of V, and v_1, . . . , v_t be the default basis of V̂. Also let v_0 be such that V = V̂ + v_0. Let v_{t+1}, . . . , v_n be the default basis of some V′ satisfying V′ ⊕ V̂ = F^n;
6    For each j ∈ [n − t], define V_j = span(V̂ ∪ {v_{t+j}}) + v_0;
7    For each 0 ≤ j ≤ n − t run Algorithm 5 on the circuit C|_{V_j}. If the algorithm failed on any of the restrictions then proceed to the next pair (r, α); otherwise, for every j, denote the output circuit by ∑_{i=1}^{s_j} C_{f_i^j};
8    If for any 0 ≤ j_1 < j_2 ≤ n − t, s_{j_1} ≠ s_{j_2}, then proceed to the next pair (r, α). Otherwise define s = s_1;
9    Reorder the numbering of the multiplication gates so that for every i ∈ [s] and j ∈ [n − t] it holds that f_i^j|_V = f_i^0. If this is not possible then proceed to the next pair (r, α);
10   For every i ∈ [s], run Algorithm 2 with input {C_{f_i^j}}_{j=0}^{n−t}. If the algorithm failed for any i ∈ [s] then proceed to the next pair (r, α). Otherwise, for each i ∈ [s] set C_{f_i} to be the circuit found by Algorithm 2, given i;
11   If ∑_{i=1}^{s} C_{f_i} ≡ C, output ∑_{i=1}^{s} C_{f_i}. Otherwise, proceed to the next pair (r, α);
12 end
Algorithm 6: Learning a ΣΠΣ(k, d, ρ) circuit



We have shown so far that for some pair (r, α), in Step 10 we reconstruct the circuits {C_{f′_i}}_{i=1}^{s}. Hence, the circuit we check in the following step is C′. As C′ ≡ f, the algorithm outputs it as the required circuit. This proves the correctness of Algorithm 6.

We now analyze the time complexity of the algorithm. The number of iterations we have is

|S| · 2^{O(k^3)} · (log(d))^{O(k^2)} · ρ = n · poly(d, ρ) · 2^{O(k^3)} · (log(d))^{O(k^2)} .

The only steps whose running time analyses are neither trivial nor already given (in previous sections) are Steps 9 and 11. In Step 9, by performing a brute-force PIT, we reach a running time bound of n · s^2 · d^{O(t)} ≤ n · |F|^{O(t^{k+1})}. In Step 11, we would like to deterministically check whether the ΣΠΣ(s + k, d, r) circuit C − C′ computes the zero polynomial, given only a black box computing the circuit. The results of [KS08], combined with [SS09], give such an algorithm whose running time is n · exp( k^3 · (log d) + kr log d ) (see Lemma A.5 of Section A.4). Recall that

t^{k+1} = (ξ_d(k) · r)^{k+1} = Ω( k^3 · (log d) + kr log d ) .

Therefore, assuming that d ≤ |F|, the running time of Step 11 is bounded by n · |F|^{O(t^{k+1})}. We have that the total time of each iteration, according to Theorems 4.22 and 4.3, is

n · |F|^{O(t^{k+1})} + n · d^{O(r)} = n · |F|^{O(t^{k+1})} .

Hence, the total running time of the algorithm is

( n · poly(d, ρ) · 2^{O(k^3)} · (log(d))^{O(k^2)} ) · ( n · |F|^{O(t^{k+1})} ) = n^2 · |F|^{O(t^{k+1})} .

This proves Lemma 4.23.

5 Reconstructing Multilinear ΣΠΣ(k) circuits

When reconstructing multilinear ΣΠΣ(k) circuits we obtain better results. We present an algorithm that, given black box access to some n-variate multilinear ΣΠΣ(k) circuit C computing a polynomial f, deterministically outputs, in (n + |F|)^{O_k(1)} time, a multilinear ΣΠΣ(k) circuit C′ computing f. The main outline of the reconstruction algorithm is the same as in the case of ΣΠΣ(k, d, ρ) circuits. We first reconstruct the restriction of the circuit to several low dimensional subspaces and then glue together the different restricted circuits. As before, the representation of the circuit (i.e., a canonical τ-distant circuit) over the low dimensional subspaces (detailed in Section 5.2) ensures a small number of possible “lifts” of the circuit. Specifically, we reduce the case of lifting a multilinear ΣΠΣ(k) circuit to the case of lifting a ΣΠΣ(k′) circuit (for some k′ ≤ k) of low ∆ measure (this case is detailed in Section 5.1). The low ∆ measure circuit is the analog of a single generalized multiplication gate in the non-multilinear case. Let us first discuss the case of low rank (i.e., ∆ measure) circuits. Namely, assume we are given black box access to a low rank ΣΠΣ(k) multilinear circuit (specifically, its rank is bounded by some

function of k) computing a polynomial f, and our goal is to reconstruct it. As opposed to ΣΠΣ(k, d, ρ) circuits, we do not wish to reconstruct a single generalized multiplication gate but a ΣΠΣ(k) multilinear circuit. Nevertheless, it turns out that, similarly to before, there is a way to reconstruct the circuit given only access to it on various low dimensional subspaces. We do so via the following process: First, we find a circuit computing f|_V in some low dimensional subspace V (its dimension depending only on k) via brute-force techniques. This is possible since C|_V is a multilinear ΣΠΣ(k) circuit containing only a constant number of variables. Hence, the size of C|_V is bounded by some constant. Once we obtain a circuit computing f|_V, we lift it onto many subspaces of dimension dim(V) + 1 (i.e., construct circuits computing the restrictions of f to these subspaces). It turns out that these circuits can be glued together, into one circuit computing f, using a rather simple algorithm. The rest of the details are given in Section 5.1.

We now explain the reconstruction of high rank circuits. Similarly to the non-multilinear case, we reduce to the case of low rank circuits. We reconstruct the circuit in some low dimensional subspace V and then find a strong partition of the circuit (Definition 3.3). This is an analog of τ-distant circuits. A strong partition is a partition of the multiplication gates s.t. any single circuit induced from the partition has low rank and the sum of any pair of circuits has high rank. It turns out that the partition is unique in the following sense: For any circuit computing f|_V, the polynomials computed by the circuits of the strong partition are the same. Hence, as in the previous section, this gives us a unique “decomposition” of the polynomial f into at most k “sub-polynomials”, all computable by low rank multilinear ΣΠΣ(k) circuits. Given this uniqueness, we manage to sample each low rank “sub-polynomial” independently in various low dimensional subspaces and thus reconstruct a circuit computing it.

5.1 Lifting a Low Rank Multilinear ΣΠΣ(k) Circuit

In this section we deal with the following problem: Let f be a polynomial over F^n for which there exists some multilinear ΣΠΣ(k) circuit C of low rank (i.e., ∆(C) = O_k(1)) computing it. As input we are given black box access to f in various low dimensional subspaces. Our goal is to output a multilinear ΣΠΣ(k) circuit computing f. We now give a formal definition of the problem: Let k, r, n ∈ N. Let C be an n-variate multilinear ΣΠΣ(k) circuit computing a polynomial f such that ∆(C) ≤ r. Let B ⊆ [n] and α ∈ F. Define V_{B,α} as in Definition 2.10. Assume that V_{B,α} is a liftable r-multilinear-rank-preserving t-dimensional subspace for C (recall Definition 2.13). Given black boxes computing the polynomial f|_{V_{B′,α}} for each B′ ⊇ B of size |B′| = |B| + 2, we construct an n-variate ΣΠΣ(k) multilinear circuit C′ that computes the polynomial f. For convenience, we set V = V_{B,α}. For each A ⊆ [n] let

V^A = V^A_{B,α} = V_{B∪A,α} .    (9)

We assume w.l.o.g. that the set B is in fact the set {n − t + 1, n − t + 2, . . . , n} and that the shifting vector v_{α,0} = (0, 0, . . . , 0) (that is, α = 0). As a first step towards computing a ΣΠΣ(k) circuit for f, we first construct a ΣΠΣ(k) circuit computing the restriction of f to V. Notice that C|_V is a multilinear ΣΠΣ(k) circuit in t variables. Hence, its size is bounded by some integer function of t and k. It follows that by going over all multilinear ΣΠΣ(k) circuits within that size bound, we will at some point find at least one circuit computing f|_V. This “pool of circuits” is of size poly(|F|), assuming that k and t are constants. Therefore, in the first model we work with, the circuit C|_V is given to us as input, while in practice we try to lift the circuit for each guess of C|_V.

The lifting process consists of two phases. First we find the linear functions of gcd(C), and then, having access to sim(C), we find a simple circuit C′ such that C′ ≡ sim(C). Algorithm 7 constructs gcd(C).

Input: Positive integers t, n. A depth-3 multilinear circuit C′(x_{n−t+1}, . . . , x_n). For some multilinear polynomial f, black box access to the polynomials f|_{V^{{i,j}}} for each i ≠ j ∈ [n − t], where V = span{x_{n−t+1}, . . . , x_n}.
Output: A set M of linear functions.
1  S ← [n − t];
2  L ← gcd(C′);
3  while S ≠ ∅ do
4    Pick some i ∈ S and remove it from S;
5    Obtain a linear function ℓ such that x_i appears in ℓ (i.e., the coefficient of x_i in ℓ is non-zero) and ℓ is a factor of f|_{V^{{i}}};
6    If no such function exists, or ℓ contains a variable x_j that appears in sim(C′), then proceed to the next iteration;
7    Find, for each j ∈ S, a linear function ℓ_j dividing f|_{V^{{i,j}}} such that ℓ_j|_{V^{{i}}} = ℓ. Denote by α_j the coefficient of x_j in ℓ_j. If α_j ≠ 0, then remove j from S;
8    Add the linear function ℓ + ∑_j α_j x_j to M;
9    L ← L \ {ℓ|_V};
10 end
11 Add all functions left in L to M;
Algorithm 7: Lifting the g.c.d. of a low ∆ measure circuit
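Step 8 of the listing above is plain arithmetic on coefficient vectors: the lifted function is ℓ plus, for every j removed from S, the coefficient α_j that x_j was seen to carry in ℓ_j. A sketch with linear functions represented as dictionaries mapping variable index to coefficient (index 0 for the constant term); the representation is our own illustration, not the paper's:

```python
def glue_linear_function(ell, alphas):
    """Given a linear function `ell` (dict: variable index -> coefficient,
    index 0 = constant term) found over V^{i}, and `alphas` mapping each other
    index j to the coefficient of x_j observed in the factor ell_j of
    f|V^{i,j}, return the lifted function ell + sum_j alphas[j] * x_j,
    mirroring Step 8 of Algorithm 7."""
    lifted = dict(ell)
    for j, a in alphas.items():
        if a != 0:                 # j is removed from S only when alpha_j != 0
            lifted[j] = lifted.get(j, 0) + a
    return lifted

# toy example: ell = 2*x1 + 3 seen on V^{1}; x2 carries coefficient 5, x3 none
lifted = glue_linear_function({1: 2, 0: 3}, {2: 5, 3: 0})
```

The uniqueness argument of Lemma 5.4 is what guarantees that these per-variable coefficients are consistent and assemble into the single function L ∈ gcd(C).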

Lemma 5.1. Given inputs n, t, a t-variate multilinear circuit C′ and black box access to a multilinear n-variate polynomial f, both over a field F, the running time of Algorithm 7 is n^2 · |F|^{O(t)}. Let C be an n-variate multilinear depth-3 circuit over a field F. Let r = ∆(C). Assume that the subspace V = span{x_{n−t+1}, . . . , x_n} is a liftable r-multilinear-rank-preserving subspace for C. Then Algorithm 7, given t, n and C′ = C|_V as inputs, outputs gcd(C).

Proof. We start by proving the correctness of the algorithm. Step 6 is meant to find whether x_i appears in gcd(C) (i.e., whether there exists a linear function in gcd(C) in which the coefficient of x_i is non-zero). The other steps in the main iteration are reached only when x_i appears in gcd(C). We prove the correctness of Step 6 with Lemmas 5.2 and 5.3, by giving sufficient and necessary conditions for the appearance of a variable and of a linear function in gcd(C). We then show that in Step 8, the function found is the unique linear function of gcd(C) in which x_i appears. An immediate conclusion is that at the end of the loop (i.e., when S = ∅), M contains exactly the linear functions in gcd(C) that contain a variable x_i with 1 ≤ i ≤ n − t. The linear functions in gcd(C) that are not yet in the set M are those that depend only on x_{n−t+1}, . . . , x_n. We will show that these are exactly the functions that remained in L.

Lemma 5.2. Let i ∈ [n − t]. Then x_i appears in gcd(C) if and only if x_i appears in gcd(C|_{V^{{i}}}).

Proof. Notice that V^{{i}} is a ∆(C)-multilinear-rank-preserving subspace for C. Hence, Lemma 2.6 implies¹⁷ that gcd(C|_{V^{{i}}}) = gcd(C)|_{V^{{i}}} and sim(C|_{V^{{i}}}) = sim(C)|_{V^{{i}}}.

¹⁷ The lemma discusses rank preserving subspaces for circuits that are not necessarily multilinear. It is easy to see that it holds for multilinear rank preserving subspaces as well.

Assume that x_i appears in gcd(C). Then x_i appears in gcd(C)|_{V^{{i}}}, meaning that it appears in gcd(C|_{V^{{i}}}). Assume now that x_i does not appear in gcd(C). If x_i does not appear in C then clearly it does not appear in gcd(C|_{V^{{i}}}). If it does, then it must appear in sim(C), and in a similar manner we get that x_i appears in sim(C|_{V^{{i}}}). Since the circuit C|_{V^{{i}}} is multilinear, x_i does not appear in gcd(C|_{V^{{i}}}).

Lemma 5.3. The following holds for every linear function ℓ appearing in the circuit C: ℓ ∈ gcd(C) if and only if ℓ divides f and each x_i appearing in ℓ does not appear in sim(C|_V).

Proof. Assume that ℓ ∈ gcd(C). Clearly ℓ divides f. Since C is a multilinear circuit, any x_i appearing in ℓ cannot appear in sim(C). Hence, it cannot appear in sim(C|_V). This proves the first direction of the claim. Let ℓ be a linear function dividing f such that ℓ ∉ gcd(C). To prove the second direction it suffices to show that some variable appearing in ℓ also appears in sim(C|_V). Indeed, as ℓ ∉ gcd(C), it follows that ℓ divides the polynomial computed by sim(C). Hence, ℓ is equal to some non-trivial linear combination of the linear functions in the default basis of span_1(Lin(sim(C))) (plus some field constant). Denote the functions of this basis by L_1, . . . , L_r. Since V is r-multilinear-rank-preserving for C, we have that L_1|_V, . . . , L_r|_V are linearly independent, and thus ℓ|_V is not a constant function. Let x_i be some variable appearing in ℓ|_V. Obviously, x_i appears in ℓ (and hence in sim(C)). By Lemma 2.6 we get that sim(C|_V) = sim(C)|_V, and therefore x_i appears in sim(C|_V). This completes the proof of the lemma.

We continue the proof of Lemma 5.1. According to Lemma 5.2, x_i appears in gcd(C) if and only if it appears in gcd(C|_{V^{{i}}}).
According to Lemma 5.3, x_i appears in gcd(C|_{V^{{i}}}) iff there exists some linear function ℓ with the following properties: ℓ contains x_i, ℓ divides f|_{V^{{i}}}, and any variable x_j (j ≠ i) appearing in ℓ does not appear in sim(C|_V). This is precisely the function we look for in Step 6. Hence, we indeed continue to the following steps if and only if x_i appears in gcd(C).

Lemma 5.4. Let i ∈ [n] and let ℓ ∈ gcd(C|_{V^{{i}}}) be the linear function containing x_i. There exists a unique function L ∈ gcd(C) such that L|_{V^{{i}}} = ℓ.

Proof. According to Lemma 2.6, gcd(C)|_{V^{{i}}} = gcd(C|_{V^{{i}}}). Hence, as ℓ ∈ gcd(C|_{V^{{i}}}), the function ℓ is one of the factors of gcd(C)|_{V^{{i}}}, meaning that for some L ∈ gcd(C) it holds that L|_{V^{{i}}} = ℓ. It remains to prove the uniqueness of L. Assume for contradiction that L is not unique. In other words, for some L ≠ L′ where L, L′ ∈ gcd(C) it holds that L|_{V^{{i}}} = ℓ and L′|_{V^{{i}}} = ℓ. By the definition of V^{{i}} it follows that x_i appears in both L and L′, meaning that C is not multilinear, which is a contradiction.

We now finish the proof of Lemma 5.1. So far we have proved the existence of a unique linear function L ∈ gcd(C) such that L|_{V^{{i}}} = ℓ. Hence, for each j ∈ [n − t] chosen in Step 7, the linear function ℓ_j is unique and L|_{V^{{i,j}}} = ℓ_j. It follows that in Step 8, the function added to M is indeed L. We get that at the end of the loop, M contains exactly the linear functions of gcd(C) that contain some x_i where i ∈ [n − t]. Since gcd(C′) = gcd(C|_V) = gcd(C)|_V (Lemma 2.6), it follows that the functions in gcd(C) which are not in M are exactly those that remained in L. This proves the correctness of the algorithm.

We now analyze the complexity of the algorithm. We require, for each polynomial f|_{V^{{i,j}}}, the linear functions dividing it. According to Lemma A.2 of Section A.1, there is a deterministic algorithm that finds the linear functions dividing a polynomial, given black box access to it, in |F|^{O(t)} time. Hence, finding the linear functions dividing each polynomial f|_{V^{{i,j}}} requires n^2 · |F|^{O(t)} running time. Having these linear functions, it is clear that the algorithm requires at most an additional n^2 · poly(t) running time. Hence, the total time required for the algorithm is n^2 · |F|^{O(t)}.

Denote by h the polynomial computed by sim(C). After constructing gcd(C) we obtain black box access to h|_V in the following way: We factor f|_V via the algorithm of Section A.1. The algorithm produces a list of black boxes, one for every irreducible factor of f|_V (see Lemma A.1). It is easy to verify that h|_V is the product of the non-linear factors of f|_V and the linear factors not appearing in gcd(C)|_V. In the same manner one can obtain black box access to h|_{V^{{i}}} for any i. As in the previous algorithm, we shall assume that the circuit sim(C) is given to us as input, while in practice we enumerate over all constant size circuits. Algorithm 8 reconstructs sim(C). Think of it as a complete reconstruction algorithm for simple circuits. Namely, given the restriction C′′|_V of a simple circuit C′′ (where C′′ = sim(C)) computing a polynomial h, and black boxes computing h|_{V^{{i}}} for each i ∈ [n − t] as input, it outputs a circuit C′ computing h (i.e., C′ ≡ C′′ = sim(C)). In the algorithm description and analysis we abuse notation, for convenience, and refer to the circuit sim(C) as C.
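The linear-factor subroutine used above (Lemma A.2 of Section A.1) admits a naive |F|^{O(t)}-time realization worth keeping in mind: a non-constant linear function ℓ divides f iff f vanishes on the affine hyperplane {ℓ = 0}, and for deg(f) ≤ d that vanishing can be certified on a grid. A sketch of this divisibility test over a prime field (all names are ours; the paper's Appendix A.1 algorithm is factorization-based):

```python
from itertools import product

def divides(ell, f, t, d, p):
    """Check whether the linear function ell (list of t coefficients c1..ct
    followed by the constant c0, representing c1*x1+...+ct*xt+c0, assumed not
    identically constant) divides the black-box polynomial f of total degree
    <= d over F_p. ell | f iff f vanishes on the hyperplane {ell = 0}; the
    vanishing is certified on a (d+1)^(t-1) grid (assumes d + 1 <= p)."""
    coeffs, c0 = ell[:t], ell[t]
    i = next(j for j in range(t) if coeffs[j] % p != 0)   # pivot variable
    inv = pow(coeffs[i], p - 2, p)                        # inverse mod p
    for rest in product(range(d + 1), repeat=t - 1):
        point = list(rest[:i]) + [0] + list(rest[i:])
        # solve ell(point) = 0 for the pivot coordinate
        s = (sum(coeffs[j] * point[j] for j in range(t)) + c0) % p
        point[i] = (-s * inv) % p
        if f(point) % p != 0:
            return False
    return True

# toy example over F_7: f = (x1 + x2)(x1 + 2)
p = 7
f = lambda v: ((v[0] + v[1]) * (v[0] + 2)) % p
```

Enumerating all |F|^{t+1} candidate linear functions and running this test is exactly the |F|^{O(t)} cost the lemma charges.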

Input: Positive integers k, t, n. A multilinear circuit C^0(x_{n−t+1}, . . . , x_n). For some multivariate polynomial h, black box access to the polynomials h|_{V^{{i}}} for each i ∈ [n − t], where V = span{x_{n−t+1}, . . . , x_n}.
Output: An n-variate multilinear circuit.
1  Let L_1, . . . , L_r be the default basis of span_1(Lin(C^0)). Find a circuit C̄ of r variables such that C̄(L_1, . . . , L_r) = C^0. Note that the equality is between the circuits and not just the polynomials they compute;
2  For each i ∈ [n − t], find a (multi) set of elements of F, {β_{i,j}}_{j∈[r]}, for which C^i = C̄(L_1 + β_{i,1}x_i, . . . , L_r + β_{i,r}x_i) ≡ h|_{V^{{i}}} and C^i is a multilinear circuit. If such a set does not exist, output “fail”;
3  Output C̄(L_1 + ∑_{i∈[n−t]} β_{i,1}x_i, . . . , L_r + ∑_{i∈[n−t]} β_{i,r}x_i);
Algorithm 8: Lifting a low rank multilinear circuit to F^n

Lemma 5.5. Let k, t, n be positive integers. Algorithm 8, given the above integers as inputs, runs in n · |F|^{O(t)} time. Let C be a simple multilinear n-variate ΣΠΣ(k) circuit computing a polynomial h. Assume that V = span(x_{n−t+1}, . . . , x_n) is a liftable ∆(C)-multilinear-rank-preserving subspace for C. Then Algorithm 8, given n, t, k, C^0 = C|_V and access to h in the appropriate subspaces as inputs, outputs a circuit computing h.

Proof. We first analyze the time complexity. In Step 2, in order to find a circuit of the wanted form, we simply go over all possibilities for {β_{i,j}}_{j∈[r]}. There are |F|^r such possible sets. For each option we perform an identity test to check whether the found circuit computes the needed polynomial. As the circuit we construct and the polynomial we compare it to are multilinear, a deterministic PIT can be performed in 2^{O(t)} time¹⁸. We have that for each i ∈ [n − t] we require a running time of |F|^r · 2^{O(t)} = |F|^{O(t)}. Hence, the total running time of Step 2 is at most n · |F|^{O(t)}. Clearly, all other steps require poly(t, k) time and thus, since¹⁹ t ≥ k, the total running time is n · |F|^{O(t)}.

¹⁸ According to the well-known Schwartz-Zippel lemma, it suffices to compute the output of C on each point in {0, 1}^t. A non-zero multilinear polynomial cannot vanish on the entire set of points.
¹⁹ The algorithm will always be invoked with t much larger than k. In order to ease the reading process, we only analyze this case.
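The observation of footnote 18 yields the cheapest deterministic PIT in this setting and is worth a two-line sketch: a nonzero multilinear polynomial cannot vanish on all of {0, 1}^t, so comparing two multilinear black boxes on the Boolean cube decides equivalence (function names and toy polynomials are ours):

```python
from itertools import product

def multilinear_identity(g, h, t, p):
    """Decide g ≡ h for two t-variate multilinear black-box polynomials over
    F_p by comparing them on every point of {0,1}^t: g - h is multilinear, and
    a nonzero multilinear polynomial cannot vanish on the whole Boolean cube,
    so equality on all 2^t points implies equality as polynomials."""
    return all((g(v) - h(v)) % p == 0 for v in product((0, 1), repeat=t))

p = 5
g = lambda v: (v[0] * v[1] + v[2]) % p        # x1*x2 + x3
h = lambda v: (v[0] * v[1]) % p               # x1*x2, differs from g at (0, 0, 1)
```

This is the identity test run |F|^r times per variable in Step 2 of Algorithm 8, giving the |F|^r · 2^{O(t)} bound in the proof.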

We now prove the correctness of the algorithm. We first prove that we do not fail at Step 2, namely, that there exists a circuit of the form C^i, computing the necessary polynomial, for each i ∈ [n − t]. We will then show that the circuit produced in Step 3 indeed computes h.

Since V is ∆(C)-multilinear-rank-preserving for C, we have that ∆(C|_V) = ∆(C) = r. Let {L̄_j}_{j=1}^{r} be a basis of span_1(Lin(C)) (recall that C is simple and thus r = ∆(C)). Since ∆(C|_V) = ∆(C), we have that the linear functions {L̄_j|_V}_{j=1}^{r} are linearly independent. Define {L̄_j}_{j=1}^{r} as the basis of span_1(Lin(C)) such that the linear functions {L̄_j|_V}_{j=1}^{r} are the default basis of the subspace they span (that is, L̄_j|_V = L_j for each j ∈ [r]). Let C̃ be the r-variate circuit for which C̃(L̄_1, . . . , L̄_r) = C (the equality is between circuits, not just polynomials). Then

C̃(L̄_1|_V, . . . , L̄_r|_V) = C|_V = C̄(L_1, . . . , L_r) .

It follows that C̃ = C̄. According to the definition of V, it follows that for some (multi) set of field elements {α_{i,j}}_{i∈[n−t],j∈[r]}, it holds that

C = C̄(L_1 + ∑_{i∈[n−t]} α_{i,1}x_i , . . . , L_r + ∑_{i∈[n−t]} α_{i,r}x_i) .

In particular,

C|_{V^{{i}}} = C̄(L_1 + α_{i,1}x_i , . . . , L_r + α_{i,r}x_i) .

This means that in Step 2 we are guaranteed to find a circuit C^i for each i ∈ [n − t]. In the following lemma we prove that the circuit output in Step 3 computes h, thus proving the correctness of the algorithm.

Lemma 5.6. Let {β_{i,j}}_{i∈[n−t],j∈[r]} be a (multi) set of field elements such that for each i ∈ [n − t]:

C̄(L_1 + β_{i,1}x_i , . . . , L_r + β_{i,r}x_i) ≡ C|_{V^{{i}}} .

Let {α_{i,j}}_{i∈[n−t],j∈[r]} be the (multi) set of field elements such that

C = C̄(L_1 + ∑_{i∈[n−t]} α_{i,1}x_i , . . . , L_r + ∑_{i∈[n−t]} α_{i,r}x_i) .

Let A_1, A_2 ⊆ [n − t] be such that A_1 ∩ A_2 = ∅. It holds that

C̄(L_1 + ∑_{i∈A_1} β_{i,1}x_i + ∑_{i∈A_2} α_{i,1}x_i , . . . , L_r + ∑_{i∈A_1} β_{i,r}x_i + ∑_{i∈A_2} α_{i,r}x_i) ≡ h|_{V^{A_1∪A_2}} .

Specifically,

C̄(L_1 + ∑_{i∈[n−t]} β_{i,1}x_i , . . . , L_r + ∑_{i∈[n−t]} β_{i,r}x_i) ≡ h .

Proof. By induction on |A_1|. The base case, where A_1 = ∅, is trivial. To prove the induction step we present a few definitions. For each i ∈ [n − t] let p_i(y_1, . . . , y_r, x) be the polynomial computed by C̄(y_1 + α_{i,1}x, . . . , y_r + α_{i,r}x), and let q_i(y_1, . . . , y_r, x) be the polynomial computed by C̄(y_1 + β_{i,1}x, . . . , y_r + β_{i,r}x). Since for each i ∈ [n − t] we have that L_1, . . . , L_r, x_i are linearly independent linear functions and

q_i(L_1, . . . , L_r, x_i) = h|_{V^{{i}}} = p_i(L_1, . . . , L_r, x_i) ,

it follows that for any r + 1 indeterminates y_1, . . . , y_r, x it holds that q_i(y_1, . . . , y_r, x) = p_i(y_1, . . . , y_r, x). Let m ∈ A_1 be some integer. For each j ∈ [r] define

L̂_j = L_j + ∑_{i∈A_1\{m}} β_{i,j}x_i + ∑_{i∈A_2} α_{i,j}x_i .

Then,

C̄(L_1 + ∑_{i∈A_1} β_{i,1}x_i + ∑_{i∈A_2} α_{i,1}x_i , . . . , L_r + ∑_{i∈A_1} β_{i,r}x_i + ∑_{i∈A_2} α_{i,r}x_i) =
C̄(L̂_1 + β_{m,1}x_m , . . . , L̂_r + β_{m,r}x_m) ≡ q_m(L̂_1, . . . , L̂_r, x_m) =
p_m(L̂_1, . . . , L̂_r, x_m) ≡ C̄(L̂_1 + α_{m,1}x_m , . . . , L̂_r + α_{m,r}x_m) ≡ h|_{V^{A_1∪A_2}} .

The last equivalence is due to the induction hypothesis. This proves the lemma.

By setting A_1 = [n − t], A_2 = ∅ in the former lemma, we get the proof of Lemma 5.5.

We conclude this section with Algorithm 9, which combines the results of this section. The algorithm works in the more general model where we are only given black boxes computing the restrictions of a low rank multilinear circuit C.

Input: k, n ∈ N, B ⊆ [n], α ∈ F. For some polynomial f computed by a multilinear ΣΠΣ(k) circuit, black box access to f|_{V_{B,α}^{{i,j}}} for every i, j ∈ [n].

Output: A set C of n-variate ΣΠΣ(k) multilinear circuits.
1  Define V = V_{B,α} and t = dim(V);
2  foreach ΣΠΣ(k) multilinear circuit Ĉ of t inputs such that Ĉ ≡ f|_V do
3    Run Algorithm 7 with inputs t, n, the black boxes for the restrictions of f, and the circuit Ĉ. Denote the output by M;
4    For every i ∉ B, obtain, via the factoring algorithm of Appendix A.1, black box access to the irreducible factors of f|_{V^{{i}}}. Denote by h^{{i}} the product of all non-linear factors and the linear factors not in M|_{V^{{i}}};
5    Invoke Algorithm 8 with inputs k, t, n, the circuit sim(Ĉ), and access to the polynomials h^{{i}} (as the restrictions of h). If the algorithm did not fail, add the product of the output circuit with the linear functions of M to C;
6  end
Algorithm 9: Lifting a multilinear circuit to F^n

Theorem 5.7. Let k, n ∈ N. Let C be an n-variate ΣΠΣ(k) multilinear circuit computing a polynomial f. Let B ⊆ [n], α ∈ F, and let the corresponding restrictions of f be the input of Algorithm 9. Let t = |B|. Then Algorithm 9 runs in n^2 · |F|^{O(k·t)} time and outputs a set of size |F|^{O(k·t)}. If V_{B,α} is a liftable ∆(C)-multilinear-rank-preserving subspace for C, then there exists at least one circuit C′ ∈ C such that C′ ≡ C.


Proof. Clearly, if V_{B,α} is a liftable ∆(C)-multilinear-rank-preserving subspace for C, then when we run Algorithms 7 and 8 on C|_V we get a circuit C′ computing the same polynomial as C. This proves the correctness of the algorithm. We now analyze the time complexity and the size of C. For each “guess” of C|_V, according to Lemmas 5.5 and 5.1, we require a running time of n^2 · |F|^{O(t)}. For each such guess we increase the size of C by at most one. The question remaining is how many guesses for C|_V we have. Using combinatorial methods²¹, it can be shown that the number of multilinear multiplication gates is at most |F|^{2t} · exp(t). Since we have at most k multiplication gates, the number of possible ΣΠΣ(k) multilinear circuits is at most |F|^{O(k·t)}. Hence, the total running time of the algorithm is |F|^{O(k·t)} · |F|^{O(t)} · n^2 = n^2 · |F|^{O(k·t)}. The maximum size of C is |F|^{O(k·t)}. This proves the theorem.
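The counting argument behind the "pool of circuits" can be made concrete: a multilinear multiplication gate is determined by a partition of the variables into blocks together with one linear function per block, so enumerating partitions and coefficient vectors lists (a superset of) all gates. A sketch of this enumeration (the encoding is ours; degenerate coefficient choices are not pruned):

```python
from itertools import product

def set_partitions(elems):
    """All partitions of a list of variable indices (footnote 21: the block
    structure determines which variables share a linear function)."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):          # put `first` into an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part              # or make it a singleton block

def multilinear_gates(t, p):
    """Enumerate (a superset of) the multilinear multiplication gates on
    x_1..x_t over F_p: a partition of [t] into blocks, and for each block a
    linear function in its variables (one coefficient per variable plus a
    constant per block)."""
    for blocks in set_partitions(list(range(1, t + 1))):
        for coeffs in product(range(p), repeat=t + len(blocks)):
            yield blocks, coeffs

gates = list(multilinear_gates(2, 2))       # tiny demo: t = 2 over F_2
```

For constant t the list has |F|^{O(t)} entries, which is the size of the search space each "guess" of C|_V ranges over.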

5.2 Finding a Partition of the Circuit in a Low Dimension Subspace

We now describe a method to reduce the case of ΣΠΣ(k) circuits of arbitrary rank to that of low rank circuits. We shall first analyze the specifics of Algorithm 1. For convenience we repeat it here, in a restricted version for non-generalized depth-3 circuits, along with parts of its analysis that were originally given in Section 3.1.

Input: n, k, d, r_init, τ ∈ N such that r_init ≥ 0, and a ΣΠΣ(k, d) circuit C of n inputs.
Output: An integer r ≥ r_init and I, a partition of [k].
1  ζ ← ⌈log_k(τ)⌉;
2  I_1 ← {{1}, {2}, . . . , {k}};
3  r_1 ← r_init;
4  while the partition was changed in any one of the former ζ iterations do
5    Define j as the number of the current iteration (its initial value is 1);
6    Let G_j(I_j, E_j) be a graph where each subset belonging to the partition I_j is a vertex;
7    E_j ← ∅;
8    foreach A_i ≠ A_{i′} ∈ I_j do
9      if ∆(C_{A_i}, C_{A_{i′}}) < r_j then
10       E_j ← E_j ∪ {(A_i, A_{i′})}
11     end
12   end
13   I_{j+1} ← the set of connected components of G_j. That is, every connected component is now a set in the partition;
14   r_{j+1} ← r_j · k;
15 end
16 Define m as the total number of iterations (that is, in the last iteration we had j = m);
17 r ← r_m / k^ζ;
18 I ← I_m;

Since we are about to use the algorithm, it is essential to discuss its running time:

²¹ We bound the number of coefficients in a multiplication gate by 2t; this gives us |F|^{2t} options. To determine which indeterminates appear in the same linear functions, we require a partition of [t]. There are exp(t) possible partitions of [t].
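The loop of Algorithm 1 is ordinary partition refinement: merge parts along edges {A, A′} with ∆(C_A, C_{A′}) below the current threshold, grow the threshold by a factor of k, and stop after ζ quiet rounds. A runnable sketch with the ∆ measure abstracted into a caller-supplied function (the interface and the toy measure are ours):

```python
from itertools import combinations

def strong_partition(k, r_init, tau, delta):
    """Algorithm 1, sketched. Parts of {1..k} are merged along the connected
    components of the graph G_j whose edges are the pairs (A, A') with
    delta(A, A') < r_j, and the threshold grows as r_{j+1} = r_j * k. The loop
    stops once the partition is unchanged for zeta = ceil(log_k(tau))
    consecutive iterations; returns (r_m / k^zeta, partition) as in Steps
    16-18. `delta` maps two frozensets of gate indices to the Delta measure
    of the corresponding pair of subcircuits (assumed symmetric)."""
    zeta = 1
    while k ** zeta < tau:              # zeta = ceil(log_k(tau)), tau > 1 assumed
        zeta += 1
    parts = [frozenset([i]) for i in range(1, k + 1)]
    r, unchanged = r_init, 0
    while unchanged < zeta:
        parent = {p: p for p in parts}
        def find(p):
            while parent[p] != p:
                p = parent[p]
            return p
        for a, b in combinations(parts, 2):
            if delta(a, b) < r:         # edge (a, b) in G_j: union the components
                parent[find(a)] = find(b)
        classes = {}
        for p in parts:
            classes.setdefault(find(p), []).append(p)
        merged = [frozenset().union(*group) for group in classes.values()]
        unchanged = unchanged + 1 if len(merged) == len(parts) else 0
        last_r, parts, r = r, merged, r * k
    return last_r // k ** zeta, parts

# toy measure: gates 1 and 2 are close (Delta 0), gate 3 is far (Delta 100)
toy_delta = lambda a, b: 0 if {1, 2} <= (a | b) and a & {1, 2} and b & {1, 2} else 100
r, parts = strong_partition(3, 1, 9, toy_delta)
```

The toy run merges gates 1 and 2 in the first round and then stays stable, illustrating how the output threshold is rescaled back down by k^ζ.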


Lemma 5.8. Algorithm 1 finds a (τ, r) strong partition of C in time O(log(τ) · n^3 k^4).

Proof. By Lemma 3.4 the partition found by the algorithm is (τ, r) strong. We analyze its running time. In each iteration of the main loop, all actions except the inner loop require O(k^2) time. The inner loop has O(k^2) iterations, in each of which we check the ∆ measure of a circuit having at most nk linear functions. To check the rank of a circuit we perform Gaussian elimination on the matrix of coefficients of the linear functions in it. This requires O(n^3 k) time (there are faster algorithms, of course, but this will not affect the total running time). Hence, the total running time of a single iteration of the main loop is O(n^3 k^3). In every ζ iterations, the number of elements in I is reduced by at least one. The number of elements in I begins at k and ends at no fewer than 1. Hence, the number of iterations is at most ζ · (k − 1), indicating that the total running time is O(ζ · n^3 k^4) = O(log_k(τ) · n^3 k^4).

To prove the uniqueness of the partition we first analyze the ∆ measure of the subcircuits outputted by Algorithm 1, and of their multiplication gates.

Lemma 5.9. Let C_1, . . . , C_s denote the subcircuits of C induced by the partition found in Algorithm 1. If ζ ≥ 2 then we have the following: For every i ≠ i′ ∈ [s] and any two multiplication gates M, M′ in the circuits C_i, C_{i′} (respectively), it holds that ∆(M, M′) ≥ r_{m−1}.

Proof. Let C_i = M_1 + . . . + M_ℓ (where each M_j is a multiplication gate) and C_{i′} = M′_1 + . . . + M′_{ℓ′} be two different subcircuits in the partition. Assume w.l.o.g., and for contradiction, that ∆(M_1, M′_1) < r_{m−1}. Notice that

∀ j_1, j_2 ∈ [ℓ]:  ∆(M_{j_1}, M_{j_2}) ≤(1) ∆(C_i) ≤(2) r = r_{m−ζ} ≤ r_{m−2} .

Inequality (1) holds according to Lemma 2.17. Since the partition is (τ, r)-strong (Lemma 3.4), inequality (2) holds as well. Analogously, we have that ∀ j′_1, j′_2 ∈ [ℓ′]: ∆(M′_{j′_1}, M′_{j′_2}) ≤ r_{m−2}. Hence,

r_m ≤(1) ∆(C_i, C_{i′}) = ∆(M_ℓ, M_{ℓ−1}, . . . , M_1, M′_1, M′_2, . . . , M′_{ℓ′}) ≤(2)
∑_{j=2}^{ℓ} ∆(M_j, M_{j−1}) + ∑_{j=2}^{ℓ′} ∆(M′_j, M′_{j−1}) + ∆(M_1, M′_1) ≤
r_{m−1} + (k − 1) · r_{m−2} < 2 · r_{m−1} ≤ k · r_{m−1} = r_m ,

which is a contradiction. Note that inequality (1) holds as otherwise the m'th iteration would not have been the last one (contradicting the definition of m). Inequality (2) follows from Lemma 2.17. This proves the claim.

The following theorem gives the required uniqueness result.

Theorem 5.10. Let r_init, τ ∈ N⁺ and let C and C′ be two minimal multilinear ΣΠΣ(k) circuits computing the same non-zero polynomial.

• Let C_1, …, C_s and C′_1, …, C′_{s′} (s′ ≥ s) be the partitions of C and C′ found by Algorithm 1 when given τ, r_init and C (respectively C′) as input.
• Assume that r_init ≥ R_M(2k) and that τ > k^2 (that is, ζ ≥ 3).


Then, s = s′ and there exists a reordering of the subcircuits such that for each i ∈ [s] it holds that C_i ≡ C′_i.

Proof. We will prove that for some i ∈ [s′] it holds that C_1 ≡ C′_i. By the same reasoning we will get that C_2 computes the same polynomial as some C′_{i′} (where i ≠ i′), and so on. If s ≠ s′ it will follow that either C or C′ has a subcircuit computing the zero polynomial, which contradicts their minimality. The claim easily follows.

Consider the circuit C′ − C. Clearly it computes the identically zero polynomial. This circuit is a sum of minimal circuits. In order to show that C_1 is equivalent to some C′_i we will prove the following: for some i ∈ [s′], any minimal subcircuit of C′ − C computing the zero polynomial that has one of its multiplication gates originating from C_1 or C′_i has all of its multiplication gates originating from C_1 or C′_i. From this statement it follows that C_1 − C′_i is a sum of minimal circuits computing the zero polynomial, and thus C_1 ≡ C′_i as required.

As a first step (Lemma 5.11) we show that any minimal zero subcircuit C̃ of C − C′ having one of its multiplication gates originating from C_1 has all of its multiplication gates originating either from C_1 or from C′_{i(C̃)} for some i(C̃) ∈ [s′]. We will then show (Lemma 5.12) that for any two minimal zero subcircuits C̃, Ĉ it holds that i(C̃) = i(Ĉ). By symmetry the claim follows.

Lemma 5.11. Let C̃ be a minimal zero subcircuit of C − C′ and let M ∈ C̃ be a multiplication gate that is also in C_1. Then for some i ∈ [s′] it holds that all of the multiplication gates in C̃ originate either from C_1 or from C′_i.

Proof. C̃ must contain at least one multiplication gate from C′, as otherwise we get either that C is not minimal or that C computes the zero polynomial; in both cases we reach a contradiction. It follows that for some i ∈ [s′] there exists a multiplication gate M′ in C̃ originating from C′_i. Let M̂ ∈ C̃. We have that

∆(M̂, M) ≤ ∆(C̃) < R_M(2k) ≤ r_init ≤ r_{m−1},
∆(M̂, M′) ≤ ∆(C̃) < r_{m−1},

where the bound ∆(C̃) < R_M(2k) holds as C̃ is a minimal zero circuit. Therefore, by Lemma 5.9, either M̂ ∈ C_1 or M̂ ∈ C′_i.

Lemma 5.12. Let C̃, Ĉ be two minimal zero circuits of C − C′, both having a multiplication gate originating from C_1. Then i(C̃), as defined in the previous lemma, is equal to i(Ĉ).

Proof. Let N ∈ Ĉ ∩ C_1, M ∈ C̃ ∩ C_1, N′ ∈ Ĉ ∩ C′_{i(Ĉ)} and M′ ∈ C̃ ∩ C′_{i(C̃)}. To prove the claim we will show that N′ ∈ C′_{i(C̃)}. We have

∆(M′, M) ≤ ∆(C̃) < R_M(2k) ≤ r_init ≤ r_{m−ζ} ≤ r_{m−3},
∆(M, N) ≤ ∆(C_1) ≤ r_{m−ζ} ≤ r_{m−3},
∆(N, N′) ≤ ∆(Ĉ) < R_M(2k) ≤ r_{m−3}.

Hence, by Lemma 2.17 we get that

∆(M′, N′) ≤ ∆(M′, M) + ∆(M, N) + ∆(N, N′) < 3 · r_{m−3} < k^2 · r_{m−3} = r_{m−1}.

Since M′ ∈ C′_{i(C̃)}, we have by Lemma 5.9 that it must be the case that N′ ∈ C′_{i(C̃)}.

The two lemmas above show that C_1 ≡ C′_i for some i ∈ [s′], and by the arguments above the proof of Theorem 5.10 follows.
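Theorem 5.10 asserts that the two partitions agree up to a reordering of the subcircuits. As an illustration only (not the paper's deterministic machinery), finding such a reordering can be sketched with a randomized identity test, assuming each subcircuit is given as an evaluation black box over a prime field F_p; the function names `probably_equal` and `match_partitions` are ours:

```python
import random

def probably_equal(f, g, n, p, trials=20):
    """Randomized equivalence test: evaluate both black boxes at random
    points of F_p^n; by Schwartz-Zippel, a disagreement between distinct
    low-degree polynomials is detected with high probability."""
    for _ in range(trials):
        x = [random.randrange(p) for _ in range(n)]
        if f(x) % p != g(x) % p:
            return False
    return True

def match_partitions(parts1, parts2, n, p):
    """Greedily pair each subcircuit of the first partition with an
    equivalent subcircuit of the second; returns the pairing or None."""
    if len(parts1) != len(parts2):
        return None
    unused = list(range(len(parts2)))
    pairing = []
    for i, f in enumerate(parts1):
        j = next((j for j in unused if probably_equal(f, parts2[j], n, p)), None)
        if j is None:
            return None
        unused.remove(j)
        pairing.append((i, j))
    return pairing
```

For instance, the partitions [x_1 + x_2, x_3] and [x_3, x_1 + x_2] of the polynomial x_1 + x_2 + x_3 are matched by the pairing [(0, 1), (1, 0)].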


Corollary 5.13. Let C be a non-zero minimal multilinear ΣΠΣ(k) circuit:

• Let r_init, τ ∈ N^+ be such that r_init ≥ R_M(2k) and τ > k^2.
• Let C_1, …, C_s be the partition of C outputted by Algorithm 1 when given C, r_init and τ as input.
• Let V be an (r_init · k^{(k−1)·⌈log_k(τ)⌉})-multilinear-rank-preserving subspace for C.
• Let C′ be a ΣΠΣ(k) circuit in dim(V) indeterminates such that C′ ≡ C|_V.
• Let C′_1, …, C′_{s′} be the partition of C′ outputted by Algorithm 1 when given C′, r_init and τ as input.

Then, s = s′ and there exists a reordering of the subcircuits such that for each i ∈ [s] it holds that C_i|_V ≡ C′_i.

Proof. According to Theorem 5.10, it suffices to show that Algorithm 1, given C|_V, r_init and τ as input, outputs C_1|_V, …, C_s|_V (or the partition of [k] defining these subcircuits). To see that this is indeed the case, notice that for any two ΣΠΣ(k) circuits C, C′ and any r_init and τ, if for every subset A ⊆ [k] and every r ≤ r_init · k^{(k−1)·⌈log_k(τ)⌉} it holds that ∆(C_A) < r iff ∆(C′_A) < r, then Algorithm 1 outputs the same partition of [k] given either C, r_init and τ or C′, r_init and τ as inputs. This is since, by Lemma 3.5, the values taken by the different r_j's in Algorithm 1 are all bounded by r_init · k^{(k−1)·⌈log_k(τ)⌉}. Notice now that V is (r_init · k^{(k−1)·⌈log_k(τ)⌉})-multilinear-rank-preserving for C. It follows that for every subcircuit C_A of C and every r ≤ r_init · k^{(k−1)·⌈log_k(τ)⌉}, it holds that ∆(C_A) < r ⇔ ∆((C_A)|_V) < r. By the discussion above we get that given C|_V, r_init and τ as inputs, Algorithm 1 outputs the same partition of [k] as it would if given C, r_init and τ as inputs; namely, the partition corresponding to C_1|_V, …, C_s|_V.
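Corollary 5.13 repeatedly compares a circuit with its restriction to a subspace. As a simplified stand-in for what "restricting a black box to V_{B,α}" involves, the sketch below fixes every coordinate outside B to the constant α; the actual subspaces of Definition 2.10 are not reproduced here, so this is only an assumption-laden illustration:

```python
def restrict(poly, n, B, alpha):
    """Restrict an n-variate black-box polynomial to the subspace on which
    every coordinate outside B is fixed to alpha (a simplified stand-in for
    the subspaces V_{B,alpha} of Definition 2.10, not reproduced here).
    Returns a black box in |B| variables."""
    B = sorted(B)
    def restricted(y):
        assert len(y) == len(B)
        x = [alpha] * n              # fix all coordinates to alpha ...
        for idx, b in enumerate(B):  # ... then overwrite those indexed by B
            x[b] = y[idx]
        return poly(x)
    return restricted
```

For example, restricting x_1·x_2 + x_3 to B = {1, 2} with α = 5 yields the bivariate black box y_1·y_2 + 5.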

5.3 The Algorithm

We are now ready to present the main reconstruction algorithm for multilinear ΣΠΣ(k) circuits. Algorithm 11 receives a black box computing an n-variate multilinear ΣΠΣ(k) circuit over a field F and outputs a multilinear ΣΠΣ(k) circuit computing the same polynomial. We give a lemma analyzing the algorithm that implies the correctness of Theorem 2.

Lemma 5.14. Algorithm 11, given as input k, n and a black box computing an n-variate multilinear ΣΠΣ(k) circuit C, outputs a circuit C′ computing the same polynomial as C in (n + |F|)^{2^{O(k log k)}} time.

Proof. We first prove the algorithm's correctness. Any circuit outputted by the algorithm is a multilinear ΣΠΣ(k) circuit. Furthermore, before the algorithm outputs a circuit, it verifies that it computes the correct polynomial (the one computed by C). Hence, it suffices to prove that in some iteration the algorithm outputs a circuit. According to Corollary 2.15, there exist a set B ⊆ [n] and a field element α ∈ S such that V_{B,α} is a liftable r′-multilinear-rank-preserving subspace for C. Focus on an iteration where such B, α are chosen. Let C_1, …, C_s and r be the circuits of the partition and the integer that would have been

Input: k, n ∈ N. A black box computing an n-variate multilinear ΣΠΣ(k) circuit C.
Output: A multilinear ΣΠΣ(k) circuit C′ such that C′ ≡ C.
1.  Let r_init = R_M(2k), τ = k^3, r′ = r_init · k^{3(k−1)} and S ⊆ F such that |S| = n^4 k^2 + 1;
2.  foreach α ∈ S and B ⊆ [n] of size |B| = r′ · 2^k do
3.    For each A ⊆ [n] define V^A_{B,α} as the affine subspace V_{B∪A,α}, as in Definition 2.10. Define A = {A | A ⊆ [n]\B, |A| ≤ 2};
4.    For each A ∈ A find a multilinear ΣΠΣ(k) circuit C^A such that C^A ≡ C|_{V^A_{B,α}};
5.    For each such circuit run Algorithm 1 with inputs n, k, r_init, τ and C^A. Denote the polynomials computed by the partition circuits as f^A_1, …, f^A_{s_A};
6.    If there exist two such sets A_1, A_2 ∈ A such that s_{A_1} ≠ s_{A_2}, proceed to the next pair of B, α. Otherwise, denote by s the number of polynomials in each partition. Also, reorder the polynomials so that for each A_1, A_2 ∈ A and each j ∈ [s] it holds that f^{A_1}_j|_{V_{B,α}} = f^{A_2}_j|_{V_{B,α}}. If that is not possible, proceed to the next pair of B, α;
7.    foreach k_1, …, k_s ∈ N^+ such that Σ^s_{i=1} k_i = k do
8.      For each j ∈ [s], activate Algorithm 9 with inputs k_j, B, α and {f^A_j}_{A∈A}. Denote the outputted set as C_{B,α,j};
9.      For each (Ĉ_1, …, Ĉ_s) ∈ C_{B,α,1} × … × C_{B,α,s} check whether Σ^s_{j=1} Ĉ_j ≡ C. If so, output Σ^s_{j=1} Ĉ_j. If no such circuits exist, proceed to the next iteration;
10.   end
11. end
Algorithm 11: Reconstruction of a multilinear ΣΠΣ(k) circuit

outputted by Algorithm 1 if it were run with inputs C, r_init = R_M(2k) and τ = k^3. Let k′_1, …, k′_s be the numbers of multiplication gates in C_1, …, C_s (respectively). Recall that the subspace V_{B,α} is a liftable r′-multilinear-rank-preserving subspace for C. Also, r_init ≥ R_M(2k), τ > k^2 and r′ = r_init · k^{(k−1)·⌈log_k(τ)⌉}. Hence, by Corollary 5.13 we have that for each A ∈ A it holds that s_A = s, and that for each set A there exists some reordering of the polynomials {f^A_j}_j such that f^A_j ≡ C_j|_{V^A_{B,α}}. In particular, this shows that in Step 6 the iteration will not end.

Assume w.l.o.g. that in Step 6 we reorder the polynomials so that f^A_j ≡ C_j|_{V^A_{B,α}} for each j ∈ [s] and A ∈ A. In the inner loop, focus on the iteration where k_1 = k′_1, …, k_s = k′_s. Notice that for any j ∈ [s], V_{B,α} is a liftable ∆(C_j)-multilinear-rank-preserving subspace for C_j. Hence, by Theorem 5.7 we have that for each j, the set C_{B,α,j} contains a circuit C′_j such that C_j ≡ C′_j. It follows that one of the circuits we check is the circuit Σ^s_{j=1} C′_j, which clearly computes the same polynomial as C (i.e., Σ^s_{j=1} C′_j ≡ C). This proves the correctness of the algorithm.

We now analyze the time complexity of Algorithm 11. In Step 4, we find several circuits via brute-force techniques. As detailed in the proof of Theorem 5.7, the number of (t + 2)-variate multilinear ΣΠΣ(k) circuits over F is |F|^{O(k·t)}. For each such circuit, we perform a PIT against |A| different multilinear O(t)-variate circuits. The time required for the PIT checks for each circuit is 2^{O(t)} · |A| = 2^{O(t)} · n^2. Hence, the total amount of time needed in Step 4 is n^2 · |F|^{O(kt)}. Step 5 requires, according to Lemma 5.8, O(log(τ) · t^3 k^3 · |A|) = n^2 · poly(t, k) time. Step 6 requires O(n^4 k^2) PIT checks of t-variate multilinear polynomials. Hence, it requires n^4 · 2^{O(t)} time (since t ≥ 2^k). The inner loop has 2^{O(k)} iterations. We analyze the running time of each iteration. Step 8 contains


s ≤ k calls to Algorithm 9. This requires, according to Theorem 5.7, n^2 · |F|^{O(k·t)} running time. In Step 9 we perform, according to Theorem 5.7, (|F|^{O(kt)})^s = |F|^{O(k^2 t)} PIT tests. Each such test is between two n-variate multilinear ΣΠΣ(k) circuits. To do so deterministically we use a deterministic algorithm given in [SV09] that runs in n^{O(k)} time (see Lemma A.6 in Section A.4). Hence, Step 9 requires |F|^{O(k^2 t)} · n^{O(k)} time. In total, each iteration of the main loop requires |F|^{O(k^2 t)} · n^{O(k)} time. The number of such iterations is at most |S| · n^{|B|} = |S| · n^t. Hence, the total running time of the algorithm is:

(n^4 k^2 + 1) · n^{k^{3(k−1)} R_M(2k) 2^k} · |F|^{O(k^{3(k−1)+2} R_M(2k) 2^k)} · n^{O(k)} = (n + |F|)^{2^{O(k log k)}}.
This proves Lemma 5.14.
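Step 9 of the algorithm runs over the Cartesian product of the candidate sets and tests each resulting sum against the black box. A sketch of that search, with the deterministic PIT of [SV09] replaced by a simple randomized point test over F_p (our simplification; the function name is ours):

```python
import itertools
import random

def find_consistent_sum(candidate_sets, target, n, p, trials=15):
    """Step-9-style search (sketch): run over the Cartesian product of the
    candidate sets C_{B,alpha,1} x ... x C_{B,alpha,s} and return the first
    tuple of candidate gates whose sum agrees with the target black box.
    Agreement is checked at random points of F_p^n instead of with a
    deterministic PIT."""
    points = [[random.randrange(p) for _ in range(n)] for _ in range(trials)]
    for combo in itertools.product(*candidate_sets):
        if all(sum(c(x) for c in combo) % p == target(x) % p for x in points):
            return combo
    return None
```

With two candidate sets for the target x_1 + x_1·x_2, the search returns the first tuple whose sum matches the target at every test point.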

References

[AV08]

M. Agrawal and V. Vinay. Arithmetic circuits: A chasm at depth four. In Proceedings of the 49th Annual FOCS, pages 67–75, 2008.

[BB98]

D. Bshouty and N. H. Bshouty. On interpolating arithmetic read-once formulas with exponentiation. J. of Computer and System Sciences, 56(1):112–124, 1998.

[BBB+00] A. Beimel, F. Bergadano, N. H. Bshouty, E. Kushilevitz, and S. Varricchio. Learning functions represented as multiplicity automata. Journal of the ACM (JACM), 47(3):506–530, 2000.

[BHH95]

N. H. Bshouty, T. R. Hancock, and L. Hellerstein. Learning arithmetic read-once formulas. SIAM Journal on Computing, 24(4):706–735, 1995.

[BOT88]

M. Ben-Or and P. Tiwari. A deterministic algorithm for sparse multivariate polynominal interpolation. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing (STOC), pages 301–309, 1988.

[DS06]

Z. Dvir and A. Shpilka. Locally decodable codes with 2 queries and polynomial identity testing for depth 3 circuits. SIAM Journal on Computing, 36(5):1404–1434, 2006.

[FK06]

L. Fortnow and A. R. Klivans. Efficient learning algorithms yield circuit lower bounds. In Proceeding of the 19th annual COLT, pages 350–363, 2006.

[GK98]

D. Grigoriev and M. Karpinski. An exponential lower bound for depth 3 arithmetic circuits. In Proceedings of the 30th Annual STOC, pages 577–582, 1998.

[GR00]

D. Grigoriev and A. A. Razborov. Exponential complexity lower bounds for depth 3 arithmetic circuits in algebras of functions over finite fields. Applicable Algebra in Engineering, Communication and Computing, 10(6):465–487, 2000.

[HH91]

T. R. Hancock and L. Hellerstein. Learning read-once formulas over fields and extended bases. In Proceedings of the 4th Annual COLT, pages 326–336, 1991.

[Kal85]

E. Kaltofen. Polynomial-time reductions from multivariate to bi- and univariate integral polynomial factorization. SIAM Journal on Computing, 14(2):469–489, 1985.

[Kal95]

E. Kaltofen. Effective Noether irreducibility forms and applications. J. of Computer and System Sciences, 50(2):274–295, 1995.

[KM93]

E. Kushilevitz and Y. Mansour. Learning decision trees using the Fourier spectrum. SIAM Journal on Computing, 22(6):1331–1348, 1993.

[KS01]

A. Klivans and D. Spielman. Randomness efficient identity testing of multivariate polynomials. In Proceedings of the 33rd Annual STOC, pages 216–223, 2001.

[KS06a]

A. Klivans and A. Shpilka. Learning restricted models of arithmetic circuits. Theory of Computing, 2(10):185–206, 2006.

[KS06b]

A. R. Klivans and A. A. Sherstov. Cryptographic hardness for learning intersections of halfspaces. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 553–562, 2006.

[KS07]

N. Kayal and N. Saxena. Polynomial identity testing for depth 3 circuits. Journal of Computational Complexity, 16(2):115–138, 2007.

[KS08]

Z. S. Karnin and A. Shpilka. Deterministic black box polynomial identity testing of depth3 arithmetic circuits with bounded top fan-in. In Proceedings of the 23rd Annual IEEE Conference on Computational Complexity, pages 280–291, 2008.

[KS09]

N. Kayal and S. Saraf. Blackbox polynomial identity testing for depth 3 circuits. In Proceedings of the 50th Annual FOCS, pages 198–207, 2009.

[KT90]

E. Kaltofen and B. M. Trager. Computing with polynomials given by black boxes for their evaluations: Greatest common divisors, factorization, separation of numerators and denominators. Journal of Symbolic Computation, 9(3):301–320, 1990.

[Raz08]

R. Raz. Elusive functions and lower bounds for arithmetic circuits. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC), pages 711–720, 2008.

[Sch80]

J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. Journal of the ACM (JACM), 27(4):701–717, 1980.

[Shp02]

A. Shpilka. Affine projections of symmetric polynomials. J. of Computer and System Sciences, 65(4):639–659, 2002.

[Shp09]

A. Shpilka. Interpolation of depth-3 arithmetic circuits with two multiplication gates. SIAM Journal on Computing, 38(6):2130–2161, 2009.

[SS09]

N. Saxena and C. Seshadhri. An almost optimal rank bound for depth-3 identities. In Proceedings of the 24th Annual IEEE Conference on Computational Complexity, pages 137–148, 2009.

[SS10a]

N. Saxena and C. Seshadhri. Blackbox identity testing for bounded top fanin depth-3 circuits: the field doesn’t matter. Electronic Colloquium on Computational Complexity (ECCC), (167), 2010.

[SS10b]

N. Saxena and C. Seshadhri. From Sylvester-Gallai configurations to rank bounds: Improved black-box identity test for depth-3 circuits. Electronic Colloquium on Computational Complexity (ECCC), (013), 2010.

[SV08]

A. Shpilka and I. Volkovich. Read-once polynomial identity testing. In Proceedings of the 40th Annual STOC, pages 507–516, 2008.

[SV09]

A. Shpilka and I. Volkovich. Improved polynomial identity testing for read-once formulas. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 13th International Workshop RANDOM, LNCS, pages 700–713, 2009.

[SW01]

A. Shpilka and A. Wigderson. Depth-3 arithmetic circuits over fields of characteristic zero. Journal of Computational Complexity, 10(1):1–27, 2001.

[VSBR83] L. G. Valiant, S. Skyum, S. Berkowitz, and C. Rackoff. Fast parallel computation of polynomials using few processors. SIAM Journal on Computing, 12(4):641–644, November 1983.

[Zip79] R. Zippel. Probabilistic algorithms for sparse polynomials. In Symbolic and algebraic computation, pages 216–226, 1979.

A Toolbox

In this section we present several algorithms that were used throughout the paper. These algorithms are not specific to the paper's main line of argument and are therefore deferred to the appendix.

A.1 Black-Box Factorization

In several sections of the paper we encounter the following problem: given black-box oracle access to a circuit computing a t-variate polynomial f of degree d over a field F, we would like to access its irreducible factors individually. In particular, we wish to reconstruct its linear factors. To solve this problem we use an algorithm for black-box factoring of a multivariate polynomial devised in [Kal85, KT90, Kal95]. The algorithm outputs black boxes to the irreducible factors of the polynomial and their multiplicities. It requires that the field we are working with is of a large enough size (Ω(d^5)). We assume that if the field is not large enough, we are allowed to make queries from an extension field. The following lemma gives the requirements and results of the algorithm:

Lemma A.1. Let d, t be integers and f be a polynomial in t variables and of degree d over the field F. Assume that |F| = Ω(d^5). Then there is a randomized algorithm that gets as input black-box access to f, and the parameters t and d, and outputs, in poly(t, d, log |F|) time, with probability 1 − exp(−t), black boxes to all the linear factors of f, with their multiplicities^{22}. The algorithm uses O(t log |F|) random bits.

In order to find the linear factors of a given black-box polynomial, we check which factors are linear functions and reconstruct them. Namely, we interpolate each factor as a linear function (to interpolate a linear function in t indeterminates we must query the polynomial at t + 1 different points). We then use a polynomial identity test (PIT) to check whether the factor is identical to the linear function. The PIT we use is the "classic" randomized PIT based on the Schwartz-Zippel lemma [Sch80, Zip79], requiring O(t log(d)) random bits and one query (for arbitrarily low constant error probability). In our setting t is relatively small, so we can go over all choices of random bits to get an efficient deterministic algorithm.

^{22} In fact, the basic algorithm only guarantees that if p is the characteristic of the field and g^{p^i · e} is a factor of f (where e is not divisible by p), then the black box corresponding to g that is outputted by the algorithm holds g^{p^i} and its multiplicity is e.


Lemma A.2. Let d, t be integers and f be a polynomial in t variables and of degree d. Let F be a finite field such that |F| = Ω(d^5). Then there is a deterministic algorithm that gets as input black-box access to f, and the parameters t and d, and outputs, in |F|^{O(t)} time, all the linear factors of f, with their multiplicities.

We note that while the original algorithm of Lemma A.1 does not necessarily output all the correct multiplicities and all the correct linear functions (in case the characteristic of the field divides the multiplicity), there is an easy way of taking care of that when the linear factors are all that we care about. Indeed, there is a simple algorithm that, given black-box access to ℓ^{p^i}, outputs both ℓ and p^i, when ℓ is a linear function, using queries from a polynomial-size extension field (the algorithm is randomized, but when t is small we can use its brute-force version).
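The interpolate-then-verify routine described above (query a candidate factor at t + 1 points, then run a PIT against the interpolated linear function) can be sketched as follows, with the derandomized test replaced by a plain randomized Schwartz-Zippel check over a prime field; the function names are ours:

```python
import random

def interpolate_linear(box, t):
    """Interpolate a candidate linear function c0 + c1*x1 + ... + ct*xt
    from t + 1 queries: the origin and the t standard basis points."""
    c0 = box([0] * t)
    coeffs = []
    for i in range(t):
        e = [0] * t
        e[i] = 1
        coeffs.append(box(e) - c0)
    return c0, coeffs

def is_this_linear(box, t, p, trials=20):
    """Check (randomized, Schwartz-Zippel style) whether the black box
    really computes the interpolated linear candidate; return the candidate
    if so, and None otherwise."""
    c0, coeffs = interpolate_linear(box, t)
    for _ in range(trials):
        x = [random.randrange(p) for _ in range(t)]
        if (c0 + sum(c * xi for c, xi in zip(coeffs, x))) % p != box(x) % p:
            return None
    return c0, coeffs
```

For example, the black box 3 + 2x_1 + 5x_2 is recognized as linear and its coefficients recovered, while the black box x_1·x_2 is rejected with high probability.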

A.2 Brute Force Interpolation

Throughout the paper we construct circuits computing polynomials to which we have black-box access. In Definition 2.19 we presented the notion of the default circuit for a polynomial f. Algorithm 12 shows how to construct this circuit. Namely, it solves the following problem: let f be a t-variate polynomial of degree d over the field F; given a black box computing f, construct the circuit C_f. Since we use brute-force techniques, the running time of the algorithm is exponential in t (the number of variables). However, for our applications the number of indeterminates is relatively small; that is, t is substantially smaller than d and |F|. The running time is exponential in t but polynomial in |F| and d. This translates into a quasi-polynomial time algorithm in the input size. By using the factoring algorithm of Section A.1 we can reconstruct all the linear factors of f (i.e., gcd(C_f)) and gain black-box access to the product of all non-linear factors (sim(C_f)). We then construct sim(C_f) via brute-force interpolation according to all possible bases for F^t. We note that there are more efficient methods of constructing the circuit C_f. However, improving the running time of Algorithm 12 would have a negligible effect on the running time of the reconstruction algorithm.

Input: t, d ∈ N and a black box holding a t-variate polynomial f of degree bounded by d.
Output: The circuit C_f = gcd(C_f) · sim(C_f).
1. Find the linear factors of f. Output their product as gcd(C_f);
2. Denote by h the polynomial computed by sim(C_f). Using the factoring algorithm of Section A.1, obtain black-box access to h;
3. foreach t linearly independent linear functions {L_i}_{i∈[t]} in the variables x̄ = (x_1, …, x_t) do
4.   Represent h as a polynomial in the L_i's. That is, look for a polynomial h̃ such that h̃(L_1, …, L_r) = h, for some r ≤ t;
5. end
6. For some set {L_i} for which the polynomial h̃ depends on a minimal number of linear functions (any such basis will do)^{23}, denote the polynomial as ĥ and the r linear functions on which it depends as L̂_1, …, L̂_r;
7. Let L_1, …, L_r be the linear functions forming the default basis of span_1{L̂_1, …, L̂_r}. Let h̄ be the polynomial such that h̄(L_1, …, L_r) = ĥ(L̂_1, …, L̂_r);
8. Set sim(C_f) as h̄(L_1, …, L_r);
Algorithm 12: Brute force interpolation

Lemma A.3. Let f be a t-variate polynomial over a field F. Algorithm 12, given a black box holding f as input (as well as t and deg(f)), outputs the circuit C_f in |F|^{O(t^2)} time.

Proof. We first prove the correctness of the algorithm. The linear functions we find in Step 1 are clearly the linear functions of gcd(C_f). By the choice of r it follows that h, the polynomial computing sim(C_f), is a polynomial in exactly r linear functions (see Lemma 2.2 and the discussion prior to it). Clearly the polynomial h̄(L_1, …, L_r) computes sim(C_f) as well. Since L_1, …, L_r are a default basis of the space they span and r is minimal, it follows that h̄(L_1, …, L_r) ≡ sim(C_f) (as circuits).

We now analyze the running time of the algorithm. The first step finds all the linear factors of f. Lemma A.2 states that this can be done deterministically in |F|^{O(t)} time (recall that we assumed that d ≤ |F|). The second step is a by-product of the first step and requires no additional running time. In Step 3, at each iteration we interpolate a polynomial of degree bounded by d in t inputs. This requires^{24} d^{O(t)} time. The number of iterations is the number of bases for F^t. It can easily be shown that this number is |F|^{O(t^2)}. The time required by Step 7 is linear in the size of the description of the polynomial ĥ; hence, it requires d^{O(t)} time. To conclude, by assuming that |F| ≥ d we have that the total running time is |F|^{O(t^2)}.
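In the multilinear setting relevant to this paper, the interpolation step has an especially simple instance: a t-variate multilinear polynomial is determined by its 2^t values on {0, 1}^t, and its monomial coefficients can be read off by Möbius inversion over subsets. The sketch below shows this special case only; it is not the basis-enumeration step of Algorithm 12:

```python
from itertools import product

def multilinear_interpolate(box, t):
    """Recover the coefficients of every multilinear monomial of a t-variate
    multilinear black box from its 2^t values on the Boolean cube, using
    Moebius inversion: coeff(S) = sum over T <= S of (-1)^{|S|-|T|} f(T).
    Returns a dict mapping each 0/1 exponent vector to its nonzero coefficient."""
    vals = {a: box(list(a)) for a in product((0, 1), repeat=t)}
    coeffs = {}
    for S in product((0, 1), repeat=t):
        c = 0
        for T in product((0, 1), repeat=t):
            if all(ti <= si for ti, si in zip(T, S)):
                c += (-1) ** (sum(S) - sum(T)) * vals[T]
        if c:
            coeffs[S] = c
    return coeffs
```

For example, the black box 2 + 3x_1x_2 yields the coefficient table {(0,0): 2, (1,1): 3}.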

A.3 Reconstructing Linear Functions

In [Shp09] a method for efficiently reconstructing ΣΠΣ(2) circuits was given. One of the algorithms presented there reconstructs a multiplication gate when given its restrictions to several co-dimension 1 subspaces on which it does not vanish. We use this algorithm in our paper for a similar purpose (we reconstruct the linear factors of a generalized multiplication gate). Parts 2 and 3 of Algorithm 6 in [Shp09] reconstruct a set of linear functions given the restriction of their product to several co-dimension 1 subspaces. The following lemma summarizes the results of the algorithm needed for our paper:

Lemma A.4 (implicit in [Shp09]). Let L be a (multi)set containing d linear functions in n indeterminates. Let {φ_1, …, φ_m} be a set of linearly independent linear functions such that m ≥ 100 log(d). For each j ∈ [m] define the (multi)set

L_j ≜ { ℓ|_{φ_j=0} : ℓ ∈ L }.

Then there exists a deterministic algorithm that, given {L_j}^m_{j=1}, outputs L in poly(n, d) time.

A.4 Deterministic Polynomial Identity Testing Algorithms for Depth-3 Circuits

In [KS08] a deterministic black-box PIT algorithm for ΣΠΣ(k, d, ρ) circuits was given. That is, [KS08] give a deterministic algorithm (Algorithm 1 in [KS08]) that verifies, in quasi-polynomial time, whether a black-box ΣΠΣ(k, d, ρ) circuit C computes the zero polynomial. Using the new rank bounds of [SS09] the following result is obtained^{25}.

Lemma A.5 (Lemma 4.10 of [KS08] combined with Theorem 2 of [SS09]). Let C be an n-variate ΣΠΣ(k, d, ρ) circuit. Then there exists a deterministic algorithm that, given black-box access to C, verifies whether C computes the zero polynomial. The running time of the algorithm is

((kd choose 2) · (R(k, d, ρ) + 2 choose 2) · n + 2k + 1) · (d + 1)^{R(k,d,ρ)} = n · exp(O(k^3 log(k) log^2(d) + kρ log d)).

^{24} In order to use a simple interpolation of a polynomial in t inputs and of degree d we may use Lagrange's formula, which gives a description of size d^{O(t)} of the polynomial.
^{25} Recently, [SS10a] gave a more efficient PIT algorithm for ΣΠΣ(k, d) circuits which requires d^{O(k)} time. While it can be adapted to work with ΣΠΣ(k, d, ρ) circuits, the improvement will not affect the running time of the reconstruction algorithm.


When C is a multilinear ΣΠΣ(k) circuit, a better result was proved in [SV09].

Lemma A.6 (Theorem of [SV09]). Let C be an n-variate multilinear ΣΠΣ(k) circuit. Then there exists a deterministic algorithm that, given black-box access to C, verifies whether C computes the zero polynomial. The running time of the algorithm is n^{O(k)}.
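The deterministic tests of Lemmas A.5 and A.6 are beyond the scope of a short sketch; for context, the classic randomized Schwartz-Zippel test that they derandomize looks as follows, assuming evaluations over a prime field and a known degree bound (the function name is ours):

```python
import random

def schwartz_zippel_is_zero(box, n, degree, p, error=2 ** -40):
    """Randomized zero test: a nonzero polynomial of the given total degree
    vanishes at a uniformly random point of F_p^n with probability at most
    degree/p, so repeating the test drives the error below the bound."""
    if degree >= p:
        raise ValueError("need p > degree for the bound to be meaningful")
    per_trial = degree / p
    trials = 1
    while per_trial ** trials > error:
        trials += 1
    for _ in range(trials):
        x = [random.randrange(p) for _ in range(n)]
        if box(x) % p != 0:
            return False  # witness found: definitely nonzero
    return True  # zero with probability at least 1 - error
```

For instance, the identically zero black box (x_1 + x_2)^2 − x_1^2 − 2x_1x_2 − x_2^2 passes the test, while x_1·x_2 is rejected with high probability.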

B Proof of Lemma 4.16

We begin by proving a simpler lemma discussing only two 'classic' multiplication gates, i.e., a ΣΠΣ(2) circuit.

Lemma B.1. Let M_1, M_2 be two products of linear functions of degree at most d. Let R, η > 0. Assume that ∆(M_1 + M_2) ≥ R. Let L be a set of linearly independent linear functions s.t. for any φ ∈ L it holds that M_1|_{φ=0} ≠ 0, M_2|_{φ=0} ≠ 0 and ∆((M_1 + M_2)|_{φ=0}) < R/(2·log(d)·η). Then |L| ≤ R/η.

Proof. Assume w.l.o.g. that gcd(M_1, M_2) = 1; otherwise we argue about M_1, M_2 divided by their gcd (such divisions are possible as M_1, M_2 do not vanish on any co-dimension 1 subspace defined by L). We also assume for convenience that all the linear functions are homogeneous; modifying the proof to deal with non-homogeneous functions is an easy exercise. Denote by N the set of linear functions appearing in M_1, M_2. Clearly, rank(N) ≥ R. We shall gradually construct a set of functions whose span contains N ∪ L. We divide the process into steps. At each step we add several linear functions to a partial basis we denote by B. Initially, B is defined as the empty set. We shall see that after a small number of steps, the basis is complete. By bounding the number of elements entered into B at each step, we obtain an upper bound on the size of the basis and, consequently, on the size of L. The construction of the basis is achieved by proving that if L is not spanned by the functions in B then B can be further expanded (i.e., we can perform an additional step).

To analyze the process, consider the partition I of N defined as follows: each set in the partition is an equivalence class modulo B. That is, every set in I is a subset of N of maximal size such that all of its linear functions are equal, up to a product by a non-zero field element, in the subspace orthogonal to B. Formally put, for any ℓ, ℓ′ ∈ N′ where N′ ∈ I (N′ ⊆ N) there exists some non-zero α ∈ F such that ℓ|_{B=0} = αℓ′|_{B=0}; furthermore, this property does not hold for any pair of linear functions that do not belong to the same set in the partition. The initial partition (corresponding to B = ∅) is w.l.o.g. a partition into singletons. Denote by I_j the partition at the beginning of step j and by B_j the basis at the beginning of step j. In particular B_1 = ∅ and I_1 is the partition of N into singletons. At step j ≥ 1, we pick some φ ∈ L (arbitrarily) that is not spanned by B_j. We stop when no such function can be found. We update the basis in the following way: B_{j+1} is defined as the default basis of

span(B_j ∪ {φ} ∪ {Lin(sim((M_1 + M_2)|_{φ=0}))}).

Define i_j as the number of sets in I_j which contain elements that are not in span(B_j). We denote the set of functions in span(B_j) as the 'zero-set' (since they vanish in the space orthogonal to B_j).

Lemma B.2. For any j ≥ 1 such that span(B_j) does not contain L, it holds that i_{j+1} ≤ i_j/2.

Before proving the lemma, we show how to deduce Lemma B.1 from it. From Lemma B.2 we get that after at most ⌈log(|N|)⌉ + 1 steps, i_j = 0 (as i_1 ≤ |N| and i_j is an integer). Hence, N is contained in the span of B_j. We claim that L must also be contained in the span of B_j. Let φ ∈ L. Since ∆((M_1 + M_2)|_{φ=0}) < ∆(M_1 + M_2), M_1|_{φ=0} ≠ 0 and M_2|_{φ=0} ≠ 0, we have that for some pair of linear functions ℓ, ℓ′ ∈ N and some α, β ∈ F it holds that αℓ + βℓ′ + φ = 0. Hence,

φ|_{B_j=0} = −αℓ|_{B_j=0} − βℓ′|_{B_j=0} = 0, meaning that φ ∈ span(B_j). It follows that when the process terminates, the size of B is upper bounded by^{26}

(⌈log(|N|)⌉ + 1) · R/(2·log(d)·η) ≤ 2·log(d) · R/(2·log(d)·η) = R/η.
Since φ|Bj =0 ̸= 0 we have that γ ̸= 0 and ℓ|Bj =0 = γ −1 φ|Bj =0 . As φ ∈ Bj+1 it follows that ℓ|Bj+1 =0 = 0 and thus ij+1 ≤ |Icon |/2 ≤ ij /2. This concludes the proof of Lemma B.1. Lemma B.3. Let r, d be integers and let f1 , f2 be two polynomials, computed by ΣΠΣ(1, d, r) circuits Cf1 , Cf2 . Let R, η > 0. Assume that ∆(Cf1 + Cf2 ) ≥ R + 2r. Let L be a set of linearly independentH linear functions s.t. for any hold: φ is not spanned by neither Lin(sim(Cf1 )) nor ( φ ∈ L, The following ) R Lin(sim(Cf2 )). Also, ∆ Cf1 |φ=0 + Cf2 |φ=0 < 2 log(d)η . Then |L| ≤ Rη . Proof. Write M1 = gcd(Cf1 ) and M2 = gcd(Cf2 ). Due to the definition of the ∆(·) function, ∆(M1 + M2 ) ≥ ∆(Cf1 + Cf2 ) − 2r ≥ R. Let φ ∈ L. As φ is not spanned by neither Lin(sim(Cf1 )) nor Lin(sim(Cf2 )), we get that gcd(Cf1 |φ=0 ) = M1 |φ=0 and gcd(Cf2 |φ=0 ) = M2 |φ=0 (Lemma 4.4). Hence, ( ) ∆ ((M1 + M2 )|φ=0 ) ≤ ∆ (Cf1 |φ=0 + Cf2 |φ=0 ) <

R . 2 log(d)η

By applying Lemma B.1, the bound on |L| is achieved. Notice that d ≥ |N |/2. We assume that d is sufficiently large so that ⌈log(|N |)⌉ + 1 ≤ 2 log(d). If that is not the case, it will translate into a trivial reconstruction problem (specifically, the polynomial we reconstruct is of constant degree). 27 Recall that ℓ, ℓ′ are similar (ℓ ∼ ℓ′ for short) when for some α, β ∈ F where (α, β) ̸= (0, 0) it holds that αℓ + βℓ′ = 0. 26
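For intuition, the halving guaranteed by Lemma B.2 can be chained across steps. The following LaTeX fragment is a sketch of that chain, under the assumption (implicit in the construction) that i1 ≤ |N|; the final comparison uses the condition ⌈log(|N|)⌉ + 1 ≤ 2 log(d) discussed above.

```latex
% Sketch: iterating Lemma B.2 at least halves i_j at every step,
% so starting from i_1 \le |N| we get
\[
  i_{j+1} \;\le\; \frac{i_j}{2} \;\le\; \frac{i_1}{2^{j}} \;\le\; \frac{|N|}{2^{j}},
\]
% and hence the process must terminate once
\[
  j \;\ge\; \lceil \log(|N|) \rceil + 1 \;\le\; 2\log(d).
\]
```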


Lemma (Lemma 4.16 restated). Let s, d, r ∈ N and let C = Σ_{i=1}^s Cfi be a ΣΠΣ(s, d, r) circuit. Let L̂ be a set of linearly independent linear functions and let A ⊆ [s] be a set of size |A| ≥ 2. Let η, R ∈ N. Assume that for every i, i′ ∈ A with i ̸= i′ we have that ∆(Cfi + Cfi′) ≥ R + 2r. Assume further that for each φ ∈ L̂ the following hold:

• For every i ∈ [s], φ divides fi if and only if i ∉ A.
• For every i ∈ [s], φ ∉ Lin(sim(Cfi)).
• ∃i, i′ ∈ A such that i ̸= i′ and ∆(Cfi|φ=0, Cfi′|φ=0) < R / (2·η·log(d)).

Then |L̂| ≤ (|A| choose 2) · Rη.

Proof. Clearly, for some i1, i2 ∈ A with i1 ̸= i2, there exists a set L ⊆ L̂ with at least |L̂| / (|A| choose 2) elements and the following property. For every φ ∈ L:

• φ is spanned by neither Lin(sim(Cfi1)) nor Lin(sim(Cfi2)).
• ∆(Cfi1|φ=0, Cfi2|φ=0) < R / (2·η·log(d)).

By Lemma B.3 we get a bound on |L|, which leads to the required bound on |L̂|.
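The final pigeonhole step can be spelled out. The following is a sketch of the arithmetic, where (i1, i2) is the pair obtained by averaging over the (|A| choose 2) pairs of indices in A:

```latex
% Averaging: each \varphi \in \hat{L} satisfies the third condition for
% at least one of the \binom{|A|}{2} pairs, so some pair collects
\[
  |L| \;\ge\; \frac{|\hat{L}|}{\binom{|A|}{2}} .
\]
% Lemma B.3, applied to this pair, gives |L| \le R\eta, and hence
\[
  |\hat{L}| \;\le\; \binom{|A|}{2} \cdot |L| \;\le\; \binom{|A|}{2} \cdot R\eta .
\]
```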
