Expressive reasoning on tree structures: recursion, inverse programs, Presburger constraints and nominals

Everardo Bárcenas¹ and Jesús Lavalle²

¹ Universidad Politécnica de Puebla
² Universidad Autónoma de Puebla

Abstract. The Semantic Web lays its foundations on the study of graph and tree logics. One of the most expressive graph logics is the fully enriched µ-calculus, which is a modal logic equipped with least and greatest fixed-points, nominals, inverse programs and graded modalities. Although it is well known that the fully enriched µ-calculus is undecidable, it was recently shown that this logic is decidable when its models are finite trees. In the present work, we study the fully enriched µ-calculus for trees extended with Presburger constraints. These constraints generalize graded modalities by restricting the number of child nodes with respect to Presburger arithmetic expressions. We show that the logic is decidable in EXPTIME. This is achieved by the introduction of a satisfiability algorithm based on a Fischer-Ladner model construction that is able to handle binary encodings of Presburger constraints.

1 Introduction

The µ-calculus is an expressive propositional modal logic with least and greatest fixed-points, which subsumes many temporal, modal and description logics (DLs), such as the Propositional Dynamic Logic (PDL) and the Computation Tree Logic (CTL) [2]. Due to its expressive power and nice computational properties, the µ-calculus has been extensively used in many areas of computer science, such as program verification, concurrent systems and knowledge representation. In this last domain, the µ-calculus has been particularly useful in the identification of expressive and computationally well-behaved DLs [2], which are now known as the standard ontology language OWL of the W3C. Another W3C standard is the XPath query language for XML. XPath also plays an important role in many XML technologies, such as XProc, XSLT and XQuery. Due to its capability to express recursive and multi-directional navigation, the µ-calculus has also been successfully used as a framework for the evaluation of, and reasoning about, XPath queries [1, 4]. Since the µ-calculus is as expressive as the monadic second order logic (MSOL), it has also been successfully used in the XML setting in the description of schema languages [1], which can be seen as the tree version of regular expressions. Just as regular expressions are interpreted as sets of strings, XML schemas (regular tree expressions) are interpreted as sets of unranked trees (XML documents). For example, the expression p(q*) represents the set of trees rooted at p with zero or more child subtrees matching q. See Figure 1(a) for an interpretation of p(q*). However, it is well known that expressing arithmetical constraints goes beyond regularity [1]. For instance, p(q > r) denotes the trees rooted at p with more q children than r children. Figure 1(b) depicts an interpretation of p(q > r). In the present work, we study an extension of the µ-calculus for trees with Presburger constraints that can be used to express arithmetical restrictions on regular (tree) languages.

Fig. 1. Tree expressions. (a) Regular tree expression: p(q*). (b) Non-regular tree expression for a balanced tree: p(q > r).

Related work. The extension of the µ-calculus with nominals, inverse programs and graded modalities is known as the fully enriched µ-calculus [2]. Nominals are intuitively interpreted as singleton sets, inverse programs are used to express past properties (backward navigation along accessibility relations), and graded modalities express numerical constraints on the number of immediate successor nodes [2]. All of them, nominals, inverse programs and graded modalities, are present in OWL. However, the fully enriched µ-calculus was proven by Bonatti and Peron to be undecidable [3]. Nevertheless, Bárcenas et al. [1] recently showed that the fully enriched µ-calculus is decidable in single exponential time when its models are finite trees. Graded modalities in the context of trees are used to constrain the number of child nodes with respect to constants. In this work, we introduce a generalization of graded modalities. This generalization considers numerical bounds on children with respect to Presburger arithmetical expressions, as for instance φ > ψ, which restricts the number of children where φ holds to be strictly greater than the number of children where ψ is true.
Other works have previously considered Presburger constraints on tree logics. MSOL with Presburger constraints was shown to be undecidable by Seidl et al. [9]. Demri and Lugiez proved a PSPACE bound on the propositional modal logic with Presburger constraints [5]. A tree logic with a fixed-point and Presburger constraints was shown to be decidable in EXPTIME by Seidl et al. [10]. In the current work, we push decidability further by allowing nominals and inverse programs in addition to Presburger constraints.

Outline. We introduce a modal logic for trees with fixed-points, inverse programs, nominals and Presburger constraints in Section 2. Preliminaries of the satisfiability algorithm for the logic are described in Section 3. In Section 4, an EXPTIME satisfiability algorithm is described and proven correct. A summary together with a discussion of further research is reported in Section 5.

2 A tree logic with recursion, inverse, counting and nominals

In this section, we introduce an expressive modal logic for finite unranked tree models. The tree logic (TL) is equipped with operators for recursion (µ), inverse programs (I), Presburger constraints (C), and nominals (O).

Definition 1 (µTLICO syntax). The set of µTLICO formulas is defined by the following grammar:

φ := p | x | ¬φ | φ ∨ φ | ⟨m⟩φ | µx.φ | φ − φ > k

Variables are assumed to be bound and under the scope of a modal formula (⟨m⟩φ) or a counting formula (φ − ψ > k).

Formulas are interpreted as subsets of tree nodes: propositions p are used to label nodes; negation and disjunction are interpreted as complement and union of sets, respectively; modal formulas ⟨m⟩φ are true in nodes such that φ holds in at least one node accessible through adjacency m, which may be either ↓, →, ↑ or ←, which in turn are interpreted as the children, right siblings, parent and left siblings relations, respectively; µx.φ is interpreted as a least fixed-point; and counting formulas φ − ψ > k hold in nodes such that the number of their children where φ holds minus the number of their children where ψ holds is greater than the non-negative integer k.

Consider for instance the following formula ψ:

p ∧ (q − r > 0)

This formula is true in nodes labeled by p such that the number of their q children is greater than the number of their r children. ψ actually corresponds to the example of Section 1 about a balanced tree: p(q > r). Figure 2 shows a graphical representation of a model for ψ. Due to the fixed-points in the logic, it is also possible to perform recursive navigation. For example, consider the following formula φ:

µx.ψ ∨ ⟨↓⟩x

φ is true in nodes where ψ is true or that have at least one descendant where ψ is true, that is, φ recursively navigates along children until it finds a ψ node. A model for φ is depicted in Figure 2. Backward navigation can also be expressed with the help of inverse programs (converse modalities).
For instance, consider the following formula ϕ:

µx.t ∨ ⟨↑⟩x

ϕ holds in nodes named t or with an ancestor named t, that is, ϕ recursively navigates along parents until it finds a t node. A model for ϕ is depicted in Figure 2.

In order to provide a formal semantics, we need some preliminaries. A tree structure 𝒯 is a tuple (P, N, R, L), where: P is a set of propositions; N is a finite set of nodes; R : N × M × N is a relation between nodes and modalities M forming a tree, written n′ ∈ R(n, m); and L : N × P is a labeling relation, written p ∈ L(n). Given a tree structure, a valuation V of variables is defined as a mapping from the set of variables X to the nodes, V : X ↦ N.

Fig. 2. A model for ψ ≡ (p ∧ [q − r > 0]), φ ≡ µx.ψ ∨ ⟨↓⟩x and ϕ ≡ µx.t ∨ ⟨↑⟩x.

Definition 2 (µTLICO semantics). Given a tree structure 𝒯 and a valuation V, µTLICO formulas are interpreted as follows:

\[
\begin{aligned}
[\![p]\!]^{\mathcal{T}}_{V} &= \{\, n \mid p \in L(n) \,\} \\
[\![x]\!]^{\mathcal{T}}_{V} &= V(x) \\
[\![\neg\varphi]\!]^{\mathcal{T}}_{V} &= N \setminus [\![\varphi]\!]^{\mathcal{T}}_{V} \\
[\![\varphi \vee \psi]\!]^{\mathcal{T}}_{V} &= [\![\varphi]\!]^{\mathcal{T}}_{V} \cup [\![\psi]\!]^{\mathcal{T}}_{V} \\
[\![\langle m \rangle \varphi]\!]^{\mathcal{T}}_{V} &= \{\, n \mid R(n, m) \cap [\![\varphi]\!]^{\mathcal{T}}_{V} \neq \emptyset \,\} \\
[\![\mu x.\varphi]\!]^{\mathcal{T}}_{V} &= \bigcap \{\, N' \subseteq N \mid [\![\varphi]\!]^{\mathcal{T}}_{V[N'/x]} \subseteq N' \,\} \\
[\![\varphi - \psi > k]\!]^{\mathcal{T}}_{V} &= \{\, n \mid |R(n, \downarrow) \cap [\![\varphi]\!]^{\mathcal{T}}_{V}| - |R(n, \downarrow) \cap [\![\psi]\!]^{\mathcal{T}}_{V}| > k \,\}
\end{aligned}
\]

We also use the following notation: ⊤ = p ∨ ¬p, φ ∧ ψ = ¬(¬φ ∨ ¬ψ), [m]φ = ¬⟨m⟩¬φ, νx.φ = ¬µx.¬φ[x/¬x], (φ − ψ ≤ k) = ¬(φ − ψ > k), φ − ψ # k for # ∈ {≤, >}, and

\[
(\varphi_1 + \varphi_2 + \dots + \varphi_n - \psi_1 - \psi_2 - \dots - \psi_m \,\#\, k) = \Big( \bigcup_{i=1}^{n} \varphi_i - \bigcup_{j=1}^{m} \psi_j \,\#\, k \Big).
\]

Note that ⊤ is true in every node, conjunction φ ∧ ψ holds whenever both φ and ψ are true, [m]φ holds in nodes where φ is true in each node accessible through m, νx.φ is a greatest fixed-point, φ − ψ ≤ k holds in nodes where the number of their φ children minus the number of their ψ children is at most the constant k, and φ1 + φ2 + … + φn − ψ1 − ψ2 − … − ψm # k is true in nodes where the sum of their φi (i = 1, …, n) children minus the sum of their ψj (j = 1, …, m) children satisfies # k.

The interpretation of nominals is a singleton, that is, nominals are formulas which are true in exactly one node in the entire model [2]. Now, it is easy to see that µTLICO can navigate recursively thanks to the fixed-points, and in all directions thanks to inverse programs. Hence, µTLICO can express that a formula is true in one node while being false in all other nodes of the model.

Definition 3 (Nominals). Nominals are defined as follows:

nom(φ) = φ ∧ siblings(¬φ) ∧ ancestors(¬φ) ∧ descendants(¬φ)

where siblings(¬φ), ancestors(¬φ), and descendants(¬φ) are true if and only if φ is false in all siblings, ancestors and descendants, respectively, and they are defined as follows:

siblings(φ) = [→]φ ∧ [←]φ
ancestors(φ) = [↑](µx.φ ∧ siblings(φ) ∧ [↑]x)
descendants(φ) = [↓](µx.φ ∧ [↓]x)

Fig. 3. Bijection of n-ary and binary trees.
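The semantics of Definition 2 can be illustrated with a small evaluator. The following is only a sketch under stated assumptions, not the paper's algorithm: the tuple encoding of formulas, the `Tree` class and the function names are hypothetical, → and ← are read as "all following" and "all preceding" siblings, and the least fixed-point is computed by iterating from the empty set, which agrees with the intersection-based definition when the variable occurs positively in the body and the tree is finite.

```python
# Hypothetical encoding (not from the paper): formulas are nested tuples.
#   ("prop", p)   ("var", x)   ("not", f)   ("or", f, g)
#   ("dia", m, f) with m in {"down", "right", "up", "left"}
#   ("mu", x, f)  ("gt", f, g, k)  standing for  f - g > k


class Tree:
    """Finite unranked tree given by ordered child lists and node labels."""

    def __init__(self, children, labels):
        self.children = children          # node -> ordered list of children
        self.labels = labels              # node -> set of propositions
        self.nodes = set(labels)
        self.parent = {c: n for n, cs in children.items() for c in cs}

    def succ(self, n, m):
        """R(n, m): the nodes accessible from n through adjacency m."""
        if m == "down":
            return set(self.children.get(n, []))
        if n not in self.parent:
            return set()                  # the root has no parent or siblings
        if m == "up":
            return {self.parent[n]}
        siblings = self.children[self.parent[n]]
        i = siblings.index(n)
        return set(siblings[i + 1:]) if m == "right" else set(siblings[:i])


def evaluate(tree, f, val=None):
    """Return the set of nodes of `tree` where formula f holds."""
    val = val or {}
    tag = f[0]
    if tag == "prop":
        return {n for n in tree.nodes if f[1] in tree.labels[n]}
    if tag == "var":
        return set(val[f[1]])
    if tag == "not":
        return tree.nodes - evaluate(tree, f[1], val)
    if tag == "or":
        return evaluate(tree, f[1], val) | evaluate(tree, f[2], val)
    if tag == "dia":
        target = evaluate(tree, f[2], val)
        return {n for n in tree.nodes if tree.succ(n, f[1]) & target}
    if tag == "mu":
        current = set()
        while True:                       # iterate from the empty set
            nxt = evaluate(tree, f[2], {**val, f[1]: current})
            if nxt == current:
                return current
            current = nxt
    if tag == "gt":
        a, b = evaluate(tree, f[1], val), evaluate(tree, f[2], val)
        return {n for n in tree.nodes
                if len(tree.succ(n, "down") & a)
                - len(tree.succ(n, "down") & b) > f[3]}
    raise ValueError(f"unknown formula tag: {tag}")


def conj(f, g):
    """Conjunction as derived notation: f AND g = not(not f or not g)."""
    return ("not", ("or", ("not", f), ("not", g)))


# A t-rooted tree whose p child has three q children and two r children.
tree = Tree(
    children={0: [1], 1: [2, 3, 4, 5, 6]},
    labels={0: {"t"}, 1: {"p"}, 2: {"q"}, 3: {"q"}, 4: {"q"},
            5: {"r"}, 6: {"r"}},
)
p, q, r, t = (("prop", s) for s in "pqrt")
psi = conj(p, ("gt", q, r, 0))                                  # p and (q - r > 0)
phi = ("mu", "x", ("or", psi, ("dia", "down", ("var", "x"))))  # psi here or below
vphi = ("mu", "x", ("or", t, ("dia", "up", ("var", "x"))))     # t here or above

print(sorted(evaluate(tree, psi)))
print(sorted(evaluate(tree, phi)))
print(sorted(evaluate(tree, vphi)))
```

On this toy tree, ψ holds only at the p node, φ holds at the p node and its parent, and ϕ holds everywhere, since every node is the t root or lies below it.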

3 Syntactic trees

In this section, we give a detailed description of the syntactic version of tree models used by the satisfiability algorithm. We first introduce some preliminaries.

There is a well-known bijection between binary and n-ary unranked trees [7]. One adjacency is interpreted as the first child relation and the other adjacency is for the right sibling relation. Figure 3 depicts an example of the bijection. Hence, without loss of generality, from now on, we consider binary unranked trees only. At the logic level, formulas are reinterpreted as expected: ⟨↓⟩φ holds in nodes such that φ is true in their first child; ⟨→⟩φ holds in nodes where φ is satisfied by their right (following) sibling; ⟨↑⟩φ is true in nodes whose parent satisfies φ; and ⟨←⟩φ is true in nodes whose left (previous) sibling satisfies φ.

For the satisfiability algorithm we consider formulas in negation normal form only. The negation normal form (NNF) of µTLIC formulas is defined by the usual De Morgan rules and the following ones: nnf(⟨m⟩φ) = ⟨m⟩¬φ ∨ ¬⟨m⟩⊤, nnf(µx.φ) = µx.¬φ[x/¬x], nnf(φ − ψ > k) = φ − ψ ≤ k, and nnf(φ − ψ ≤ k) = φ − ψ > k. Hence, negation in formulas in NNF occurs only in front of propositions and formulas of the form ⟨m⟩⊤. Also notice that we consider an extension of µTLIC formulas consisting of conjunctions, φ − ψ ≤ k and ⊤ formulas, with the expected semantics.

We now consider a binary encoding of natural numbers. Given a finite set of propositions, the binary encoding of a natural number is the boolean combination of propositions satisfying the binary representation of the given number. For example, number 0 is written ⋀_{i≥0} ¬p_i, and number 7 is p_2 ∧ p_1 ∧ p_0 ∧ ⋀_{i>2} ¬p_i (111 in binary). The binary encoding of constants is required in the definition of counters, which are used in the satisfiability algorithm to verify counting subformulas.
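The binary encoding of constants described above can be sketched as follows. The function names and the naming scheme `p0`, `p1`, … for the bit propositions are assumptions made for this illustration; the text only fixes that a number is represented by the propositions that hold in its binary representation.

```python
def binary_encoding(k, width):
    """Truth value of each bit proposition p_i in the encoding of k."""
    if k < 0 or k >= 2 ** width:
        raise ValueError("k does not fit in the given number of propositions")
    return {f"p{i}": bool((k >> i) & 1) for i in range(width)}


def positive_props(k, width):
    """Propositions occurring positively in the binary encoding of k."""
    return {name for name, value in binary_encoding(k, width).items() if value}


def decode(props):
    """Recover the number from the set of positively occurring propositions."""
    return sum(2 ** int(name[1:]) for name in props)


print(sorted(positive_props(7, 4)))   # bits 0, 1 and 2 are set (111 in binary)
print(sorted(positive_props(0, 4)))   # no proposition occurs positively
print(decode({"p0", "p1", "p2"}))
```

Representing a constant k this way needs only about log k propositions, which is what allows the algorithm to handle constants written in binary.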
Given a counting formula φ − ψ # k, counters C(φ) = k1 and C(ψ) = k2 are the sequences of propositions occurring positively in the binary encoding of the constants k1 and k2, respectively. We write Ci(φ) = k to refer to an individual element of the sequence. A formula φ induces a set of counters corresponding to its counting subformulas. The bound on the number of counter propositions is given by the constant K(φ), which is defined as the sum of the constants of the counting subformulas, more precisely: K(p) = K(x) = K(⊤) = 0, K(⟨m⟩φ) = K(¬φ) = K(µx.φ) = K(φ), K(φ ∨ ψ) = K(φ ∧ ψ) = K(φ) + K(ψ), and K(φ − ψ # k) = K(φ) + K(ψ) + k + 1.

Nodes in syntactic trees are defined as sets of subformulas. These subformulas are extracted with the help of the Fischer-Ladner Closure. Before defining the Fischer-Ladner Closure, consider the Fischer-Ladner relation R^FL for i = 1, 2, ∘ ∈ {∨, ∧} and j = 1, …, log(K(φ)):

\[
\begin{array}{ll}
R^{FL}(\psi, \mathrm{nnf}(\psi)), & R^{FL}(\langle m \rangle \psi, \psi), \\
R^{FL}(\psi_1 \circ \psi_2, \psi_i), & R^{FL}(\mu x.\psi, \psi[\mu x.\psi / x]), \\
R^{FL}(\psi_1 - \psi_2 \,\#\, k, \psi_i), & R^{FL}(\psi_1 - \psi_2 \,\#\, k, \langle \downarrow \rangle \mu x.\psi_i \vee \langle \rightarrow \rangle x), \\
R^{FL}(\psi_1 - \psi_2 \,\#\, k, C_j(\psi_i) = K(\varphi)), & R^{FL}(\neg\psi, \psi).
\end{array}
\]
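The recursive definition of K(φ) translates directly into code. The sketch below uses a hypothetical tuple encoding of formulas (not from the paper), and it assumes that a single-term constraint such as r > 0 can be written as r − ⊥ > 0, where ⊥ = ¬⊤ contributes nothing to K.

```python
# Hypothetical encoding (not from the paper): formulas are nested tuples.
#   ("prop", p)  ("var", x)  ("top",)  ("not", f)  ("or", f, g)  ("and", f, g)
#   ("dia", m, f)  ("mu", x, f)  ("gt", f, g, k)  ("le", f, g, k)


def K(f):
    """Sum of the constants of the counting subformulas of f."""
    tag = f[0]
    if tag in ("prop", "var", "top"):
        return 0
    if tag == "not":
        return K(f[1])
    if tag in ("dia", "mu"):
        return K(f[2])
    if tag in ("or", "and"):
        return K(f[1]) + K(f[2])
    if tag in ("gt", "le"):               # f[1] - f[2] # k, with k = f[3]
        return K(f[1]) + K(f[2]) + f[3] + 1
    raise ValueError(f"unknown formula tag: {tag}")


p, q, r = (("prop", s) for s in "pqr")
BOTTOM = ("not", ("top",))                # assumption: r > 0 read as r - BOTTOM > 0

# phi = p and [(q - r) > 1] and (r > 0)
phi = ("and", ("and", p, ("gt", q, r, 1)), ("gt", r, BOTTOM, 0))
print(K(phi))                             # (1 + 1) + (0 + 1) = 3
```

For this formula the bound is 3, and writing 3 in binary takes two bits.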

Definition 4 (Fischer-Ladner Closure). Given a formula φ, the Fischer-Ladner Closure of φ is defined as CL^FL(φ) = CL^FL_k(φ), such that k is the smallest positive integer satisfying CL^FL_k(φ) = CL^FL_{k+1}(φ), where for i ≥ 0:

\[
CL^{FL}_{0}(\varphi) = \{\varphi\}; \qquad
CL^{FL}_{i+1}(\varphi) = CL^{FL}_{i}(\varphi) \cup \{\, \psi \mid R^{FL}(\psi', \psi),\ \psi' \in CL^{FL}_{i}(\varphi) \,\}.
\]

Example 1. Consider the formula φ = p ∧ [(q − r) > 1] ∧ (r > 0). Then, for j = 1, 2, we have that

CL^FL(φ) = {p ∧ [(q − r) > 1] ∧ (r > 0), p ∧ [(q − r) > 1], (r > 0), p, [(q − r) > 1], q, r, Cj(q) = 3, Cj(r) = 3, ⟨↓⟩µx.q ∨ ⟨→⟩x, ⟨↓⟩µx.r ∨ ⟨→⟩x} ∪ CL^FL(nnf(φ))

We are now ready to define the lean set for nodes in syntactic trees. The lean set contains the propositions, modal subformulas, counters and counting subformulas of the formula in question (for the satisfiability algorithm). Intuitively, propositions will serve to label nodes, modal subformulas contain the topological information of the trees, and counters are used to verify the satisfaction of counting subformulas.

Definition 5 (Lean). Given a formula φ, its lean set is defined as follows:

lean(φ) = {p, ⟨m⟩ψ, ψ1 − ψ2 # k ∈ CL^FL(φ)} ∪ {⟨m⟩⊤, p0},

provided that p0 does not occur in φ and m ∈ {↓, →, ↑, ←}.

Example 2. Consider again the formula φ = p ∧ [(q − r) > 1] ∧ (r > 0) of Example 1. Then, for j = 1, 2, we have that:

lean(φ) = {p, q, r, Cj(q) = 3, Cj(r) = 3, ⟨↓⟩ψj, ⟨→⟩ψj, ⟨↓⟩nnf(ψj), ⟨→⟩nnf(ψj), (q − r) # 1, r # 0, ⟨m⟩⊤, p0},

where ψ1 = µx.q ∨ ⟨→⟩x and ψ2 = µx.r ∨ ⟨→⟩x. Recall that Cj(ψ) = k are the corresponding propositions required to define the binary encoding of k.
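The closure of Definition 4 is a fixed-point iteration: start from {φ} and keep adding R^FL-successors until nothing new appears. The sketch below shows that iteration on a deliberately simplified relation. It keeps the subformula, unfolding and sibling-navigation rules but omits the nnf and counter rules, so it does not compute the full closure of the paper. The tuple encoding and all function names are hypothetical.

```python
# Hypothetical encoding (not from the paper): formulas are nested tuples.
#   ("prop", p)  ("var", x)  ("top",)  ("not", f)  ("or", f, g)  ("and", f, g)
#   ("dia", m, f)  ("mu", x, f)  ("gt", f, g, k)  ("le", f, g, k)


def substitute(f, x, g):
    """Replace the free occurrences of variable x in f by formula g."""
    tag = f[0]
    if tag == "var":
        return g if f[1] == x else f
    if tag in ("prop", "top"):
        return f
    if tag == "not":
        return ("not", substitute(f[1], x, g))
    if tag in ("or", "and"):
        return (tag, substitute(f[1], x, g), substitute(f[2], x, g))
    if tag == "dia":
        return ("dia", f[1], substitute(f[2], x, g))
    if tag == "mu":                       # an inner binder for x shadows it
        return f if f[1] == x else ("mu", f[1], substitute(f[2], x, g))
    if tag in ("gt", "le"):
        return (tag, substitute(f[1], x, g), substitute(f[2], x, g), f[3])
    raise ValueError(f"unknown formula tag: {tag}")


def successors(f):
    """Direct successors of f under the simplified relation."""
    tag = f[0]
    if tag == "not":
        return {f[1]}
    if tag in ("or", "and"):
        return {f[1], f[2]}
    if tag == "dia":
        return {f[2]}
    if tag == "mu":                       # unfolding: psi[mu x.psi / x]
        return {substitute(f[2], f[1], f)}
    if tag in ("gt", "le"):
        out = {f[1], f[2]}
        for g in (f[1], f[2]):            # first child, then along right siblings
            walk = ("mu", "x", ("or", g, ("dia", "right", ("var", "x"))))
            out.add(("dia", "down", walk))
        return out
    return set()                          # propositions, variables and top


def closure(phi):
    """Iterate CL_{i+1} = CL_i plus successors until a fixed point is reached."""
    current = {phi}
    while True:
        nxt = current | {g for f in current for g in successors(f)}
        if nxt == current:
            return current
        current = nxt


q = ("prop", "q")
psi1 = ("mu", "x", ("or", q, ("dia", "right", ("var", "x"))))
for formula in sorted(closure(psi1), key=repr):
    print(formula)
```

For ψ1 = µx.q ∨ ⟨→⟩x the iteration stops after adding the unfolding, q and ⟨→⟩ψ1, because unfolding ψ1 again only reproduces formulas that are already present.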

Fig. 4. φ-tree model for φ = p ∧ [(q − r) > 1] ∧ (r > 0) built by the satisfiability algorithm in 5 steps.

A φ-node n^φ is defined as a subset of the lean, such that: at least one proposition (different from the counter propositions) occurs in n^φ; if a modal subformula ⟨m⟩ψ occurs in n^φ, then ⟨m⟩⊤ also does; and n^φ can be either the root, a child or a sibling. More precisely, the set of φ-nodes is defined as follows:

N^φ = {n^φ ⊆ lean(φ) | p ∈ n^φ, ⟨m⟩ψ ∈ n^φ ⇒ ⟨m⟩⊤ ∈ n^φ, ⟨↑⟩⊤ ∈ n^φ ⇔ ⟨←⟩⊤ ∉ n^φ}

When it is clear from the context, φ-nodes are simply called nodes. We are finally ready to define φ-trees. A φ-tree is defined either as empty ∅, or as a triple (n^φ, T1^φ, T2^φ), provided that n^φ is a φ-node and Ti^φ (i = 1, 2) are φ-trees. The root of (n^φ, T1^φ, T2^φ) is n^φ. We often call φ-trees simply trees.

Example 3. Consider the formula φ = p ∧ [(q − r) > 1] ∧ (r > 0). Then T = (n0, (n1, ∅, (n2, ∅, (n3, ∅, (n4, ∅, ∅)))), ∅) is a φ-tree, where

n0 = {p, C(q) = 0, C(r) = 0, (q − r) > 1, r > 0, ⟨↓⟩ψ1, ⟨↓⟩ψ2, ⟨↓⟩⊤},
n1 = {q, C(q) = 3, C(r) = 1, (q − r) ≤ 1, r ≤ 0, ⟨→⟩ψ1, ⟨→⟩ψ2, ⟨↑⟩⊤, ⟨→⟩⊤},
n2 = {q, C(q) = 2, C(r) = 1, (q − r) ≤ 1, r ≤ 0, ⟨→⟩ψ1, ⟨→⟩ψ2, ⟨←⟩⊤, ⟨→⟩⊤},
n3 = {q, C(q) = 1, C(r) = 1, (q − r) ≤ 1, r ≤ 0, ⟨→⟩ψ2, ⟨←⟩⊤, ⟨→⟩⊤},
n4 = {r, C(q) = 0, C(r) = 1, (q − r) ≤ 1, r ≤ 0, ⟨←⟩⊤}.

The φ-nodes ni (i = 0, …, 4) are defined from the lean of φ (Example 2). Figure 4 depicts a graphical representation of T. Notice that counters in the root n0 are set to zero. This is because counters are intended to count on the siblings only. For instance, counters in n1 are set to 3 and 1 for q and r, respectively, because there are three q's and one r in n1 and its siblings. Counting formulas occur positively only at the root n0, because they are intended to be true when the counters in the children of n0 satisfy the Presburger constraints. Since the nodes ni (i > 0) do not have children, counting formulas occur negatively in these nodes. Finally, notice that modal subformulas define the topology of the tree.
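The counter values of Example 3 can be reproduced with a short computation. This is an illustrative sketch only: it assumes that the counted formulas are plain propositions and that each child is given by its set of labels. The function name and data layout are hypothetical.

```python
def sibling_counters(children_labels, counted):
    """For each child (left to right), count how many nodes among that child
    and its right siblings carry each counted proposition."""
    result = []
    running = {c: 0 for c in counted}
    for labels in reversed(children_labels):       # rightmost sibling first
        running = {c: running[c] + (1 if c in labels else 0) for c in counted}
        result.append(running)
    result.reverse()
    return result


# Children n1..n4 of the root n0 in Example 3: three q nodes, then one r node.
children = [{"q"}, {"q"}, {"q"}, {"r"}]
counters = sibling_counters(children, ["q", "r"])
for name, c in zip(["n1", "n2", "n3", "n4"], counters):
    print(name, c)

# The first child carries the totals, so the root's counting formulas
# can be checked against it.
first = counters[0]
print(first["q"] - first["r"] > 1)    # (q - r) > 1
print(first["r"] > 0)                 # r > 0
```

The first child n1 ends up with C(q) = 3 and C(r) = 1, as in Example 3, and both constraints of φ hold for these totals.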

4 Satisfiability

In this section we define a satisfiability algorithm for the logic µTLIC following the Fischer-Ladner method [1, 5]. Given an input formula, the algorithm decides whether or not the formula is satisfiable. The algorithm builds φ-trees in a bottom-up manner. Starting from the leaves, parents are iteratively added until a satisfying tree with respect to φ is found.

Algorithm 1 describes the bottom-up construction of φ-trees. The set Init(φ) gathers the leaves. The satisfiability of formulas with respect to φ-trees is tested with the entailment relation ⊢. Inside the loop, the Update function consistently adds parents to previously built trees until either a satisfying tree is found or no more trees can be built. If a satisfying tree is found, the algorithm returns that the input formula is satisfiable; otherwise, the algorithm returns that the input formula is not satisfiable.

Example 4. Consider the formula φ = p ∧ [(q − r) > 1] ∧ (r > 0). The φ-tree T, described in Example 3, is built by the satisfiability algorithm in 5 steps. All leaves are first defined by Init(φ): notice that n4 is a leaf because it does not contain downward modal formulas. Once in the cycle, parents and previous siblings are iteratively added to previously built trees, which by the second step consist of leaves only: it is easy to see that n3 can be a previous sibling of n4, and the same holds for n2 and n3, and for n1 and n2, respectively; it is also clear that n0 can be a parent of n1. Notice that n0 is the root due to the absence of upward modal formulas. The construction of T is graphically represented in Figure 4.

We now give a detailed description of the algorithm components.

Definition 6 (Entailment). The entailment relation is defined as follows:

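\[
\frac{}{n \vdash \top} \qquad
\frac{\varphi \in n}{n \vdash \varphi} \qquad
\frac{n \vdash \varphi}{n \vdash \varphi \vee \psi} \qquad
\frac{n \vdash \psi}{n \vdash \varphi \vee \psi}
\]
\[
\frac{\varphi \notin n}{n \vdash \neg\varphi} \qquad
\frac{n \vdash \varphi[\mu x.\varphi / x]}{n \vdash \mu x.\varphi} \qquad
\frac{n \vdash \varphi \quad n \vdash \psi}{n \vdash \varphi \wedge \psi}
\]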

If there is a node n in a tree T, such that n entails φ (n ⊢ φ) and formulas ⟨↑⟩⊤ and ⟨←⟩⊤ do not occur in the root of T, we then say that the tree T entails φ, T ⊢ φ. Given a set of trees X, if there is a tree T in X entailing φ (T ⊢ φ), then X entails φ, X ⊢ φ. Relation ⊬ is defined in the obvious manner.

Leaves are φ-nodes without downward adjacencies, that is, formulas of the form ⟨↓⟩ψ or ⟨→⟩ψ do not occur in leaves. Also, counters are properly initialized, that is, for each counting subformula ψ1 − ψ2 # k of the input formula, if a leaf satisfies ψi (i = 1, 2), then C(ψi) = 1 is contained in the leaf, otherwise C(ψi) = 0, that is, no counting proposition corresponding to ψi occurs in the leaf. The set of leaves is defined by the Init function.
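The node-level entailment rules of Definition 6 can be read as a recursive check. The sketch below is one possible rendering under stated assumptions: nodes are frozensets of formulas, formulas use a hypothetical tuple encoding, and fixed-point variables are guarded by modalities, as the paper assumes, so that unfolding always ends at a membership test. Counters and counting formulas are left out of the example nodes for brevity.

```python
# Hypothetical encoding (not from the paper): formulas are nested tuples.
#   ("prop", p)  ("var", x)  ("top",)  ("not", f)  ("or", f, g)  ("and", f, g)
#   ("dia", m, f)  ("mu", x, f)

TOP = ("top",)


def substitute(f, x, g):
    """Replace the free occurrences of variable x in f by formula g."""
    tag = f[0]
    if tag == "var":
        return g if f[1] == x else f
    if tag in ("prop", "top"):
        return f
    if tag == "not":
        return ("not", substitute(f[1], x, g))
    if tag in ("or", "and"):
        return (tag, substitute(f[1], x, g), substitute(f[2], x, g))
    if tag == "dia":
        return ("dia", f[1], substitute(f[2], x, g))
    if tag == "mu":                       # an inner binder for x shadows it
        return f if f[1] == x else ("mu", f[1], substitute(f[2], x, g))
    raise ValueError(f"unknown formula tag: {tag}")


def entails(n, f):
    """Check n |- f following the inference rules of Definition 6."""
    if f == TOP or f in n:                # axiom for top, and membership
        return True
    tag = f[0]
    if tag == "or":
        return entails(n, f[1]) or entails(n, f[2])
    if tag == "and":
        return entails(n, f[1]) and entails(n, f[2])
    if tag == "not":                      # NNF: negation guards lean formulas
        return f[1] not in n
    if tag == "mu":                       # unfold once: phi[mu x.phi / x]
        return entails(n, substitute(f[2], f[1], f))
    return False                          # lean formula that is not in n


q, r = ("prop", "q"), ("prop", "r")
psi1 = ("mu", "x", ("or", q, ("dia", "right", ("var", "x"))))
psi2 = ("mu", "x", ("or", r, ("dia", "right", ("var", "x"))))

# n3 and n4 of Example 3, without counters and counting formulas.
n3 = frozenset({q, ("dia", "right", psi2),
                ("dia", "left", TOP), ("dia", "right", TOP)})
n4 = frozenset({r, ("dia", "left", TOP)})

print(entails(n3, psi1))          # q occurs in n3
print(entails(n3, psi2))          # via the modal formula for psi2 in n3
print(entails(n4, psi1))          # neither q nor a right step is available
print(entails(n4, ("not", q)))    # q does not occur in n4
```

Modal formulas are never unfolded further here; they are settled by membership in the node, which is why the recursion terminates for guarded formulas.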

Algorithm 1 Satisfiability algorithm for µTLIC
  Y ← N^φ
  X ← Init(φ)
  X′ ← ∅
  while X ⊬ φ and X ≠ X′ do
    X′ ← X
    X ← Update(X′, Y)
    Y ← Y \ root(X)
  end while
  if X ⊢ φ then
    return φ is satisfiable
  end if
  return φ is not satisfiable
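The control flow of Algorithm 1 can be written as a small higher-order function. This is only a skeleton: the components Init, Update, root and the entailment test are passed in as parameters, and the toy instance used to exercise it (integers standing in for trees) is a hypothetical stand-in that has nothing to do with φ-trees.

```python
def satisfiable(nodes, init, update, entails, roots):
    """Bottom-up loop of Algorithm 1, parameterised by its components.

    nodes   -- the set of candidate nodes (N^phi in the paper)
    init    -- the initial set of trees (the leaves)
    update  -- function (X, Y) -> new set of trees
    entails -- function X -> bool, true when some tree in X entails phi
    roots   -- function X -> set of roots of the trees in X
    """
    Y = set(nodes)
    X = set(init)
    X_prev = set()
    while not entails(X) and X != X_prev:
        X_prev = X
        X = update(X_prev, Y)
        Y = Y - roots(X)
    return entails(X)


# Toy stand-in: a "tree" is just its height, and a node of height h + 1 may be
# put on top of a tree of height h if that node is still available in Y.
def toy_update(X, Y):
    return X | {h + 1 for h in X if h + 1 in Y}


def toy_roots(X):
    return set(X)


print(satisfiable({1, 2, 3}, {1}, toy_update, lambda X: 3 in X, toy_roots))
print(satisfiable({1, 2, 3}, {1}, toy_update, lambda X: 5 in X, toy_roots))
```

The first call stops as soon as a tree of height 3 is built. The second stops when an iteration adds nothing new, which mirrors the X ≠ X′ test of the algorithm.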

Definition 7 (Init). Given a formula φ, its initial set is defined as follows:

Init(φ) = {n^φ ∈ N^φ | ⟨↓⟩⊤, ⟨→⟩⊤ ∉ n^φ,
  [ψ1 − ψ2 # k] ∈ lean(φ), ψi ∈ n^φ ⇒ [C(ψi) = 1] ∈ n^φ,
  [ψ1 − ψ2 # k] ∈ lean(φ), ψi ∉ n^φ ⇒ [C(ψi) = 0] ∈ n^φ}

Notice that, from the definition of φ-nodes, if formulas of the forms ⟨↓⟩⊤ and ⟨→⟩⊤ do not occur in leaves, then neither do formulas of the forms ⟨↓⟩ψ and ⟨→⟩ψ.

Example 5. Consider again the formula φ of Example 3. It is then easy to see that n4 is a leaf: n4 does not contain downward modal formulas ⟨↓⟩ψ and ⟨→⟩ψ. Also, counters are properly initialized in n4, i.e., C(r) = 1 occurs in n4.

Recall that the Update function consistently adds parents to previously built trees. Consistency is defined with respect to two different notions. One notion is with respect to modal formulas. For example, a modal formula ⟨↓⟩φ is contained in the root of a tree, if and only if, its first child satisfies φ.

Definition 8 (Modal Consistency). Given a φ-node n^φ and a φ-tree T with root r, n^φ and T are m-modally consistent, ∆_m(n^φ, T), if and only if, for all ⟨m⟩ψ, ⟨m⟩φ in lean(φ), we have that

⟨m⟩ψ ∈ n^φ ⇔ r ⊢ ψ,
⟨m⟩ψ ∈ r ⇔ n^φ ⊢ ψ.

Example 6. Consider φ of Figure 4. In step 2, it is easy to see that n3 is modally consistent with n4: formula ⟨→⟩µx.r ∨ ⟨→⟩x is clearly true in n3, because r occurs in n4. In the following steps, ni is clearly modally consistent with ni+1.

The other consistency notion is defined in terms of counters. Since the first child is the upper one in a tree, it must contain all the information regarding counters, i.e., each time a previous sibling is added by the algorithm, counters must be updated. Counter consistency must also consider that counting formulas occur in the parents, if and only if, the counters of their first child are consistent with the constraints in counting subformulas.

Definition 9 (Counter Consistency). Given a φ-node n^φ and trees T1 and T2, we say that n^φ and Ti are counter consistent, Θ(n^φ, T1, T2), if and only if, for the root ri of Ti and ∀[ψ1 − ψ2 # k] ∈ lean(φ) (i = 1, 2):

[C(ψi) = k′] ∈ n^φ, n^φ ⊢ ψi ⇔ [C(ψi) = k′ − 1] ∈ r2
[C(ψi) = k′] ∈ n^φ, n^φ ⊬ ψi ⇔ [C(ψi) = k′] ∈ r2
[ψ1 − ψ2 # k] ∈ n^φ ⇔ [C(ψi) = ki] ∈ r1, k1 − k2 # k

Example 7. Consider the formula φ of Example 3 and Figure 4. In steps 2, 3 and 4, since previous siblings are added, counters for q are incremented in n3, n2 and n1, respectively. In step 5, the counting formulas q − r > 1 and r > 0 are present in the root n0, due to the fact that the counters in the first child satisfy the Presburger constraints.

The Update function gathers the notions of counter and modal consistency.

Definition 10 (Update). Given a set of φ-trees X, the update function is defined as follows for i = 1, 2:

Update(X, Y) = {(n^φ, T1, T2) | Ti ∈ X, n^φ ∈ Y, ∆_i(n^φ, Ti), Θ(n^φ, T1, T2)}

We finally define the function root(X), which takes as input a set of φ-trees and returns a set with the roots of the φ-trees.

We now show that the algorithm is correct and terminates. It is clear that the algorithm terminates due to the facts that the set of φ-nodes is finite and that the Update function is monotone. Algorithm correctness is shown by proving that the algorithm is sound and complete.

Theorem 1 (Soundness). If the satisfiability algorithm returns that φ is satisfiable, then there is a tree model satisfying φ.

Proof. Assume T is the φ-tree that entails φ.
Then we construct a tree model 𝒯 isomorphic to T as follows: the nodes of 𝒯 are the φ-nodes; for each triple (n, T1, T2) in T, n1 ∈ R(n, ↓) and n2 ∈ R(n, →), provided that ni are the roots of Ti (i = 1, 2); and if p ∈ n, then p ∈ L(n). We now show by induction on the structure of the input formula φ that 𝒯 satisfies φ. All cases are immediate. In the case of fixed-points, we use the fact that there is an equivalent finite unfolding µx.ψ ≡ ψ[µx.ψ/x].

Theorem 2 (Completeness). If there is a model satisfying a formula φ, then the satisfiability algorithm returns that φ is satisfiable.

Proof. The proof is divided into two main steps: first we show that there is a lean labeled version of the satisfying model; and then we show that the algorithm actually builds the lean labeled version.

Assume 𝒯 satisfies the formula φ. We construct a lean labeled version T of 𝒯 as follows: the nodes and shape of T are the same as those of 𝒯; for each ψ ∈ lean(φ), if n in 𝒯 satisfies ψ, then ψ is in n of T; and the counters are set in the nodes of T as the algorithm does in a bottom-up manner. It is now shown by induction on the derivation of T ⊢ φ that T entails φ. By the construction of T and by induction most cases are straightforward. For the fixed-point case µx.ψ, we proceed by induction on the structure of the unfolding ψ[µx.ψ/x]. This is immediate since variables, and hence unfolded fixed-points, occur in the scope of modal or counting formulas only.

Before proving that the algorithm builds T, we need to show that there are enough φ-nodes to construct T. That is, counting formulas may enforce the duplication of nodes in order to be satisfied. Counters are used to distinguish potentially identical nodes. We then show that counters are consistent. More precisely, we will show that, provided that φ is satisfied, there is a φ-tree entailing φ such that each φ-node has no identical children. This is shown by contradiction. We assume that there is no tree with non-identical children entailing φ. Assume that T entails φ, such that T has identical children. We now prune T to produce a tree T′ by removing the duplicated children. It is now shown that T′ also entails φ by induction on the derivation T′ ⊢ φ. Most cases are trivial. Consider now the case of counting subformulas ψ1 − ψ2 # k. We need to ensure that counted nodes (the ones that satisfy ψi for i = 1, 2) were not removed in the pruning process. This is not possible since only identical nodes were removed, and hence removed nodes share the same counters.

We are now ready to show that the algorithm builds the lean labeled version T of the satisfying model 𝒯. We proceed by induction on the height of T. The base case is trivial. Consider now the induction step.
By induction, we know that the left and right subtrees of T were built by the algorithm; we now show that the root n of T can be joined to the previously built left and right subtrees. This is true due to the following: ∆(n, ni) is consistent with R, where i = 1, 2 and ni are the roots of the left and right subtrees, respectively; and K(φ) is consistent with the satisfaction of the counting subformulas.

Theorem 3 (Complexity). The satisfiability algorithm takes at most single exponential time with respect to the size of the input formula.

Proof. We first show that the lean set of the input formula φ has linear size with respect to the size of φ. This is easily proven by induction on the structure of φ. We then proceed to show that the algorithm is exponential with respect to the size of the lean. Since the number of φ-nodes is single exponential with respect to the lean size, there are at most an exponential number of steps in the loop of the algorithm. It remains to prove that each step, including the ones inside the loop, takes at most exponential time: computing Init(φ) implies the traversal of N^φ and hence takes exponential time; testing ⊢ takes linear time with respect to the node size, and hence its cost is exponential with respect to the set of trees; and since the cost of the relations of modal and counter consistency, ∆_m and Θ, is linear, the Update function takes exponential time.

5 Discussion

We introduced a modal logic for trees with least and greatest fixed-points, inverse programs, nominals and Presburger constraints (µTLICO). We showed that the logic is decidable in single exponential time, even if the Presburger constraints are in binary. Decidability was shown by a Fischer-Ladner satisfiability algorithm. We are currently exploring symbolic techniques, such as Binary Decision Diagrams, to achieve an efficient implementation of the satisfiability algorithm.

The fully enriched µ-calculus for trees has been previously used as a reasoning framework for the XPath query language enhanced with a limited form of counting (numerical constraints) [1]. Bárcenas et al. [1] also showed that XML schemas (regular tree languages) with numerical constraints can be succinctly expressed by the fully enriched µ-calculus. An immediate application of µTLICO is its use as a reasoning framework for XPath and XML schemas with a more general form of counting (arithmetical constraints). In another setting, arithmetical constraints on trees have also been successfully used in the verification of balanced tree structures such as AVL or red-black trees [8, 6]. We believe that another field of application for the logic presented in the current work is the verification of balanced tree structures.

References

1. E. Bárcenas, P. Genevès, N. Layaïda, and A. Schmitt. Query reasoning on trees with types, interleaving, and counting. In T. Walsh, editor, IJCAI, pages 718–723. IJCAI/AAAI, 2011.
2. P. A. Bonatti, C. Lutz, A. Murano, and M. Y. Vardi. The complexity of enriched mu-calculi. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, ICALP, volume 4052 of Lecture Notes in Computer Science, pages 540–551. Springer, 2006.
3. P. A. Bonatti and A. Peron. On the undecidability of logics with converse, nominals, recursion and counting. Artif. Intell., 158(1):75–96, 2004.
4. D. Calvanese, G. D. Giacomo, M. Lenzerini, and M. Y. Vardi. Node selection query languages for trees. In M. Fox and D. Poole, editors, AAAI. AAAI Press, 2010.
5. S. Demri and D. Lugiez. Complexity of modal logics with Presburger constraints. J. Applied Logic, 8(3):233–252, 2010.
6. P. Habermehl, R. Iosif, and T. Vojnar. Automata-based verification of programs with tree updates. Acta Inf., 47(1):1–31, 2010.
7. H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. ACM Trans. Program. Lang. Syst., 27(1):46–90, 2005.
8. Z. Manna, H. B. Sipma, and T. Zhang. Verifying balanced trees. In S. N. Artëmov and A. Nerode, editors, LFCS, volume 4514 of Lecture Notes in Computer Science, pages 363–378. Springer, 2007.
9. H. Seidl, T. Schwentick, and A. Muscholl. Numerical document queries. In F. Neven, C. Beeri, and T. Milo, editors, PODS, pages 155–166. ACM, 2003.
10. H. Seidl, T. Schwentick, A. Muscholl, and P. Habermehl. Counting in trees for free. In J. Díaz, J. Karhumäki, A. Lepistö, and D. Sannella, editors, ICALP, volume 3142 of Lecture Notes in Computer Science, pages 1136–1149. Springer, 2004.
