Query Containment under Bag and Bag-Set Semantics

Foto Afrati, Matthew Damigos and Manolis Gergatsoulis

Information Processing Letters (IPL) 2010

Query Containment Problem

• Q1, Q2 are queries over a schema S.
• D is a database instance of S.
• Q2 ⊑ Q1 if, for every database instance D, Q2(D) ⊆ Q1(D).

Motivation - Previous Work

• Related problems:
  • Query rewriting using views.
  • Information integration.
  • Query optimization.
  • ...
• The query containment problem under set semantics has been extensively investigated.
  • Most query classes give decidable results.
• SQL semantics: manipulation of duplicate tuples.
• The query containment problem for conjunctive queries under both bag and bag-set semantics has remained open for more than a decade.
  • Most of the super-classes give undecidable results.

Conjunctive queries

• Conjunctive query (CQ, for short):

  Q : q(X) :- g1(X1), ..., gn(Xn)

  Here q(X) is the head, the part after ":-" is the body, and each gi(Xi) is a subgoal.
• CQs correspond to Select-Project-Join SQL queries with equality comparisons.
• Distinguished variables: Vars(X).
• Safe CQ: every variable in Vars(head(Q)) appears in the body of Q.
• (True) valuation from Q to D:
  1. Every variable of Q is valuated by a constant appearing in D.
  2. If every valuated subgoal appears in the database instance D, then the valuated head is in the answer of Q (Q(D)).

Semantics

Set-valued DB D:        Bag-valued DB D:

link                    link
a  b                    a  b
a  c                    a  b
b  b                    b  b
c  c                    c  c

Query: Q : q(X) :- link(X, Y), link(Y, Y)

(1) Set-valued DB, set-operators: Q(D) = {a, b, c}
(2) Bag-valued DB, bag-operators: Q(D) = {a, a, b, c}
(3) Set-valued DB, bag-operators: Q(D) = {a, a, b, c}

• Bag-operators: treat duplicates as distinct tuples.
• (1) Set semantics: relations are sets (using DISTINCT in SQL).
• (2) Bag semantics: relations are bags (SQL semantics).
• (3) Bag-set semantics: relations are sets (normalized DB + SQL).
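The three semantics on the Semantics slide can be checked with a small evaluator. This is a minimal sketch under an encoding that is not part of the slides: a relation is a `Counter` from tuples to multiplicities, a query is a head tuple plus a list of subgoals, and the names `evaluate`, `LINK_SET` and `LINK_BAG` are illustrative only.

```python
from collections import Counter
from itertools import product

def evaluate(head, body, db):
    """Return the bag of answers of a conjunctive query as a Counter.

    head: tuple of variable names
    body: list of (relation_name, tuple_of_variable_names) subgoals
    db:   dict relation_name -> Counter mapping tuples to multiplicities
    """
    variables = sorted({v for _, args in body for v in args})
    constants = sorted({c for rel in db.values() for t in rel for c in t})
    answers = Counter()
    # Try every valuation of the variables by constants of the database.
    for values in product(constants, repeat=len(variables)):
        val = dict(zip(variables, values))
        mult = 1
        for rel, args in body:
            # A missing tuple has multiplicity 0, which kills the valuation.
            mult *= db[rel][tuple(val[a] for a in args)]
        if mult:
            answers[tuple(val[h] for h in head)] += mult
    return answers

# The two databases of the slide (assumed encoding).
LINK_SET = Counter({("a", "b"): 1, ("a", "c"): 1, ("b", "b"): 1, ("c", "c"): 1})
LINK_BAG = Counter({("a", "b"): 2, ("b", "b"): 1, ("c", "c"): 1})

# Q : q(X) :- link(X, Y), link(Y, Y)
Q = (("X",), [("link", ("X", "Y")), ("link", ("Y", "Y"))])

set_answer = set(evaluate(*Q, {"link": LINK_SET}))      # (1) duplicates dropped
bag_answer = evaluate(*Q, {"link": LINK_BAG})           # (2) bag-valued DB
bag_set_answer = evaluate(*Q, {"link": LINK_SET})       # (3) set-valued DB, bag-operators
```

The evaluator enumerates all valuations, so it is only meant for toy instances like the ones on the slides.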

Bag-Set Semantics - Projection causes duplicate tuples

• Bag-set semantics: set-valued database + bag-operators.
• Each tuple is unique in a relation.
• Queries: Select, Join, Cartesian Product, Projection.
• CQ Q without projection ⇔ the answer of Q is a set (Afrati, Damigos, Gergatsoulis, IPL 2009).

D: link
   a  b
   a  c
   b  b
   c  c

Q1 : q(X) :- link(X, Y), link(Y, Y)       Q1(D) = {a, a, b, c} (bag)
Q2 : q(X, Y) :- link(X, Y), link(Y, Y)    Q2(D) = {(a, b), (a, c), (b, b), (c, c)} (set)

Formal Definition

Definition: Q2 ⊑ Q1 if, for every database instance D of S, we have that Q2(D) ⊆ Q1(D).

Semantics   ⊑      D            ⊆
Set         ⊑s     set-valued   ⊆s
Bag         ⊑b     bag-valued   ⊆b
Bag-Set     ⊑bs    set-valued   ⊆b

• Q2 bag-contained in Q1 ⇒ Q2 bag-set-contained in Q1
• Q2 bag-set-contained in Q1 ⇒ Q2 set-contained in Q1
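The three containment relations of the Formal Definition slide can also be spelled out tuple by tuple. In the sketch below, the notation $|t|_{Q(D)}$ for the multiplicity of a tuple $t$ in $Q(D)$ is an assumption introduced here, not notation taken from the slides.

```latex
\begin{align*}
Q_2 \sqsubseteq_{s} Q_1 &\iff \forall D \text{ set-valued},\ \forall t:\ |t|_{Q_2(D)} > 0 \Rightarrow |t|_{Q_1(D)} > 0 \\
Q_2 \sqsubseteq_{b} Q_1 &\iff \forall D \text{ bag-valued},\ \forall t:\ |t|_{Q_2(D)} \le |t|_{Q_1(D)} \\
Q_2 \sqsubseteq_{bs} Q_1 &\iff \forall D \text{ set-valued},\ \forall t:\ |t|_{Q_2(D)} \le |t|_{Q_1(D)}
\end{align*}
```

Read this way, the two implications on the slide follow directly: every set-valued database is also a bag-valued one, and a multiplicity bound implies the positivity condition.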

⊑bs ⇏ ⊑b and ⊑s ⇏ ⊑bs

• Relation "path" stores paths of length 2.

Queries:
Q1 : q(X) :- path(X, Y)
Q2 : q(X) :- path(X, Y), path(Y, Z)
Q3 : q(X) :- path(X, Y), path(Y, Y)

• Q3 ⊑s Q2 ⊑s Q1
• Q3 ⊑bs Q1 but Q2 ⋢bs Q1

Database D: path = {(1, 3), (2, 2), (3, 3), (3, 5)}
[Figure: D drawn as a directed graph over the nodes 1, 2, 3, 4, 5]

Answers of the queries:
• Q1(D) = {1, 2, 3, 3}
• Q2(D) = {1, 1, 2, 3, 3}
• Q3(D) = {1, 2, 3}

⊑bs ⇏ ⊑b and ⊑s ⇏ ⊑bs - Cont.

• Same relation "path" and same queries Q1, Q2, Q3 as on the previous slide.
• Q3 ⋢b Q1

[Figure: a second, larger instance D of path, shown as a table and as a directed graph]

Answers of the queries:
• Q1(D) = {1, 2, 3, 3, 2, 3, 4, 4}
• Q2(D) = {1, 1, 2, 3, 3, 1, 2, 2, 3, 3, 3, 3, 3, 4, 4}
• Q3(D) = {1, 2, 3, 1, 2, 3, 3, 3, 4, 4}

Containment Mapping

• Containment mapping from Q1 to Q2: every distinct tuple that appears in Q2(D) also appears in Q1(D).
  • Each valuation over Q2 ⇒ at least one valuation over Q1.
• What about the multiplicity of each tuple (under bag(-set) semantics)?
  • Many valuations over Q2 ⇒ same valuation over Q1.
  • ⇒ Q2 ⋢bs Q1 ⇒ Q2 ⋢b Q1

Q1 : q(X) :- blue(X, Y), red(Y, Z), red(W, Z)
Q2 : q(A) :- blue(A, B), red(B, C), red(B, D)

[Figure: query graphs of Q1 (nodes X, Y, W, Z) and Q2 (nodes A, B, C, D), with the mappings µ1, µ2 : Q1 → Q2]

Valuations over Q1:
q(a) ← blue(a, b), red(b, c), red(b, c)
q(a) ← blue(a, b), red(b, d), red(b, d)

Valuations over Q2:
q(a) ← blue(a, b), red(b, c), red(b, d)
q(a) ← blue(a, b), red(b, d), red(b, c)
q(a) ← blue(a, b), red(b, c), red(b, c)
q(a) ← blue(a, b), red(b, d), red(b, d)
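The multiplicity argument of the Containment Mapping slide can be replayed by counting valuations. This is a minimal sketch, assuming a hypothetical encoding (facts as `(relation, constants)` pairs, queries as a head tuple plus a list of subgoals) and an illustrative helper name `answers`; the database is the three facts used in the valuations on the slide.

```python
from collections import Counter

def answers(head, body, facts):
    """Bag of answers of a CQ over a set of facts (bag-set semantics).

    Each successful valuation of the body adds one copy of the valuated head.
    """
    result = Counter()

    def extend(i, val):
        if i == len(body):
            result[tuple(val[v] for v in head)] += 1
            return
        rel, args = body[i]
        for fact_rel, consts in facts:
            if fact_rel != rel or len(consts) != len(args):
                continue
            new_val = dict(val)
            # Bind each variable, rejecting the fact on a conflicting binding.
            if all(new_val.setdefault(a, c) == c for a, c in zip(args, consts)):
                extend(i + 1, new_val)

    extend(0, {})
    return result

FACTS = {("blue", ("a", "b")), ("red", ("b", "c")), ("red", ("b", "d"))}

# Q1 : q(X) :- blue(X, Y), red(Y, Z), red(W, Z)
Q1 = (("X",), [("blue", ("X", "Y")), ("red", ("Y", "Z")), ("red", ("W", "Z"))])
# Q2 : q(A) :- blue(A, B), red(B, C), red(B, D)
Q2 = (("A",), [("blue", ("A", "B")), ("red", ("B", "C")), ("red", ("B", "D"))])

q1_answers = answers(*Q1, FACTS)   # two valuations produce q(a)
q2_answers = answers(*Q2, FACTS)   # four valuations produce q(a)
```

On this database q(a) has a higher multiplicity in Q2(D) than in Q1(D), which is exactly the witness for Q2 ⋢bs Q1.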

Query containment and Containment mapping

Q1 : q(X) :- edge(X, Y), edge(Y, Z), edge(Y, W)
Q2 : q(A) :- edge(A, B), edge(B, B), edge(B, C)

[Figure: query graphs of Q1 (nodes X, Y, Z, W) and Q2 (nodes A, B, C), with the mappings µ1, µ2, µ3 : Q1 → Q2]

• µ1: containment mapping from Q1 to Q2.
  • Q2 ⊑s Q1 ⇔ containment mapping from Q1 to Q2 (Chandra-Merlin, STOC 1977).
• µ2: variables-onto containment mapping from Q1 to Q2.
  • Variables-onto containment mapping from Q1 to Q2 ⇒ Q2 ⊑bs Q1 (Chaudhuri-Vardi, PODS 1993).
• µ3: subgoals-onto containment mapping from Q1 to Q2.
  • Subgoals-onto containment mapping from Q1 to Q2 ⇒ Q2 ⊑b Q1 (Chaudhuri-Vardi, PODS 1993).
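The three kinds of mapping on this slide can be enumerated by brute force for the two edge queries. The sketch below assumes a hypothetical encoding of a query as a head tuple plus a list of `(relation, variables)` subgoals; the function names are illustrative and the search is exponential, so it only suits toy queries.

```python
from itertools import product

def containment_mappings(q1, q2):
    """Yield every containment mapping from q1 to q2 as a dict over q1's variables."""
    (head1, body1), (head2, body2) = q1, q2
    vars1 = sorted({v for _, args in body1 for v in args})
    vars2 = sorted({v for _, args in body2 for v in args})
    goals2 = set(body2)
    for image in product(vars2, repeat=len(vars1)):
        mu = dict(zip(vars1, image))
        # The head of q1 must map to the head of q2 ...
        head_ok = tuple(mu[v] for v in head1) == tuple(head2)
        # ... and every subgoal of q1 must map to some subgoal of q2.
        body_ok = all((rel, tuple(mu[a] for a in args)) in goals2
                      for rel, args in body1)
        if head_ok and body_ok:
            yield mu

def is_variables_onto(mu, q2):
    """Every variable of q2 is the image of some variable of q1."""
    return set(mu.values()) == {v for _, args in q2[1] for v in args}

def is_subgoals_onto(mu, q1, q2):
    """Every subgoal of q2 is the image of some subgoal of q1."""
    images = {(rel, tuple(mu[a] for a in args)) for rel, args in q1[1]}
    return images == set(q2[1])

# Q1 : q(X) :- edge(X, Y), edge(Y, Z), edge(Y, W)
Q1 = (("X",), [("edge", ("X", "Y")), ("edge", ("Y", "Z")), ("edge", ("Y", "W"))])
# Q2 : q(A) :- edge(A, B), edge(B, B), edge(B, C)
Q2 = (("A",), [("edge", ("A", "B")), ("edge", ("B", "B")), ("edge", ("B", "C"))])

all_mappings = list(containment_mappings(Q1, Q2))
variables_onto = [mu for mu in all_mappings if is_variables_onto(mu, Q2)]
subgoals_onto = [mu for mu in all_mappings if is_subgoals_onto(mu, Q1, Q2)]
```

For these queries X must go to A and Y to B, so the mappings differ only in where Z and W are sent (B or C); the onto conditions then filter that small set.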

Necessary Conditions for CQ Containment

• Q2 ⊑bs Q1 ⇒ every variable of Q2 must be mapped by a containment mapping from Q1 (Chaudhuri-Vardi, PODS 1993).
  • Otherwise, there is a subgoal of Q2 that is not mapped by Q1.
• Q2 ⊑b Q1 ⇒ every subgoal of Q2 must be mapped by a containment mapping from Q1 (Chaudhuri-Vardi, PODS 1993).

Example: Q2 ⋢bs Q1

Q1 : q(X) :- link(X, Y), link(X, Z)
Q2 : q(A) :- link(A, C), link(C, D)

Valuation over Q1: q(a) ← link(a, b), link(a, b)
Valuations over Q2: q(a) ← link(a, b), link(b, c1); ...; q(a) ← link(a, b), link(b, cℓ)

Example: Q2 ⋢b Q1

Q1 : q(X) :- link(X, Y), link(X, Z)
Q2 : q(A) :- link(A, C), link(C, C)

Valuation over Q1: q(a) ← link(a, b), link(a, b)
Valuations over Q2: q(a) ← link(a, b), link(b, b), once for each copy of link(b, b) in the bag-valued database

CQs without projections

• Q1 is a CQ, Q2 is a CQ without projections:
  Q2 ⊑bs Q1 ⇔ Q2 ⊑s Q1
  • i.e. searching for a containment mapping from Q1 to Q2 (in NP).
• What about bag containment? i.e. is Q2 ⊑b Q1 ⇔ Q2 ⊑bs Q1?

Example

Q1 : q(X, Y) :- link(X, Y)
Q2 : q(X, Y) :- link(X, Y), link(Y, Y)

Q2 ⊑bs Q1, but Q2 ⋢b Q1:

D (bag-valued): link = {(a, a), (a, a)}
Q1(D) = {(a, a), (a, a)}
Q2(D) = {(a, a), (a, a), (a, a), (a, a)}
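The example on this slide is small enough to compute by hand, and the same arithmetic can be written down directly. The sketch below is an illustration under an assumed encoding (a bag relation as a `Counter` of tuple multiplicities); `q1_of`, `q2_of` and `bag_subset` are hypothetical helper names, hard-wired to the two queries of the example.

```python
from collections import Counter

def q1_of(link):
    """Q1 : q(X, Y) :- link(X, Y). Each answer keeps the multiplicity of its tuple."""
    return Counter({t: m for t, m in link.items() if m})

def q2_of(link):
    """Q2 : q(X, Y) :- link(X, Y), link(Y, Y). Multiplicities of the two tuples multiply."""
    out = Counter()
    for (x, y), m in link.items():
        loops = link[(y, y)]
        if m and loops:
            out[(x, y)] += m * loops
    return out

def bag_subset(small, big):
    """Bag inclusion: every tuple occurs in `big` at least as often as in `small`."""
    return all(big[t] >= m for t, m in small.items())

bag_db = Counter({("a", "a"): 2})   # link(a, a) stored twice
set_db = Counter({("a", "a"): 1})   # the same relation without duplicates

bag_ok = bag_subset(q2_of(bag_db), q1_of(bag_db))   # fails on the bag-valued D
set_ok = bag_subset(q2_of(set_db), q1_of(set_db))   # holds on the set-valued D
```

With n copies of link(a, a), Q1 returns n copies of (a, a) while Q2 returns n squared, so any n of at least 2 separates bag containment from bag-set containment for this pair.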

CQs without projections - Cont.

• Q1, Q2 both CQs without projections:
  • Q2 ⊑b Q1 ⇔ subgoals-onto containment mapping from Q1 to Q2
  • Q2 ⊑bs Q1 ⇔ containment mapping from Q1 to Q2
• Check whether or not the mapping of distinguished vars is also a subgoals-onto (resp. variables-onto) containment mapping: O(n² log(n))
• Why Q1 without projection?

Example: Q2 ⊑b Q1

Q1 : q(X, Y) :- e(X, Y), e(W, U), e(W, V)
Q2 : q(X, Y) :- e(X, Y), e(X, X), e(Y, Y)

Valuation over Q2: q(a, b) ← e(a, b), e(a, a), e(b, b)
Valuations over Q1: q(a, b) ← e(a, b), e(W, U), e(W, V), where the subgoals e(W, U) and e(W, V) are valuated by tuples among e(a, b), e(a, a) and e(b, b)

CQs without self-joins

• Q1 is a CQ, Q2 is a CQ without self-joins:
  • i.e. every relation name appears at most once.
  • Every subgoal of Q1 can map to at most one subgoal of Q2.
  • Q2 ⊑b Q1 (resp. Q2 ⊑bs Q1) ⇔ subgoals-onto (resp. variables-onto) containment mapping from Q1 to Q2.
• Complexity: O(n log(n))
  1. Sort w.r.t. relation names.
  2. Check whether or not each subgoal of Q1 maps to the unique subgoal of Q2 with the same relation name.
  3. Check whether or not there is a subgoals-onto (resp. variables-onto) containment mapping from Q1 to Q2.

Example: Q2 ⋢bs Q1

Q1 : q(X) :- blue(X, Y), blue(X, Z)
Q2 : q(X) :- blue(X, Y), green(Y, Y), red(Y, Z)

Valuation over Q1: q(a) ← blue(a, b), blue(a, b)
Valuations over Q2: q(a) ← blue(a, b), green(b, b), red(b, c) and q(a) ← blue(a, b), green(b, b), red(b, d)
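The three-step test on this slide can be sketched as follows. This is an illustration, not the authors' algorithm: it looks up subgoals by relation name in a dictionary instead of sorting, it assumes safe queries encoded as a head tuple plus a list of `(relation, variables)` subgoals, and all function names are hypothetical.

```python
def forced_mapping(q1, q2):
    """Return the only candidate containment mapping from q1 to q2, or None.

    Assumes q2 has no self-joins, so a relation name identifies one subgoal of q2.
    """
    (head1, body1), (head2, body2) = q1, q2
    target = {rel: args for rel, args in body2}
    mu = {}
    for rel, args in body1:
        if rel not in target or len(target[rel]) != len(args):
            return None                      # no subgoal of q2 to map to
        for var, image in zip(args, target[rel]):
            if mu.setdefault(var, image) != image:
                return None                  # variable forced to two images
    if tuple(mu[v] for v in head1) != tuple(head2):
        return None                          # heads do not match
    return mu

def is_bag_set_contained(q2, q1):
    """Variables-onto check for q2 ⊑bs q1 (q2 without self-joins)."""
    mu = forced_mapping(q1, q2)
    q2_vars = {v for _, args in q2[1] for v in args}
    return mu is not None and set(mu.values()) == q2_vars

def is_bag_contained(q2, q1):
    """Subgoals-onto check for q2 ⊑b q1 (q2 without self-joins).

    With a forced mapping, every subgoal of q2 is hit exactly when both
    queries use the same relation names.
    """
    mu = forced_mapping(q1, q2)
    return mu is not None and {r for r, _ in q1[1]} == {r for r, _ in q2[1]}

# Q1 : q(X) :- blue(X, Y), blue(X, Z)
Q1 = (("X",), [("blue", ("X", "Y")), ("blue", ("X", "Z"))])
# Q2 : q(X) :- blue(X, Y), green(Y, Y), red(Y, Z)
Q2 = (("X",), [("blue", ("X", "Y")), ("green", ("Y", "Y")), ("red", ("Y", "Z"))])
# A second contained query, added here only for contrast.
Q2_SMALL = (("X",), [("blue", ("X", "Y"))])

slide_mapping = forced_mapping(Q1, Q2)
```

For the queries of the example, the forced mapping sends both Y and Z of Q1 to Y of Q2, so the variable Z of Q2 is never reached and the variables-onto check fails.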

Generalized-Star Queries

• Labeled path: r1(W0, W1), r2(W1, W2), ..., rk(Wk−1, Wk), k ≥ 1
  • r1, r2, ..., rk are not necessarily distinct relation names, and
  • W0, W1, ..., Wk are distinct variables.
• Star S(X): collection of labeled paths starting from the same variable X (root).

[Figure: a star with root X and labeled paths leading to Z, W and U variables]

• Generalized-star query of arity n:
  Q : q(X1, ..., Xn) :- S1(X1), ..., Sn(Xn), N1(Y1), ..., Nm(Ym)
  Here S1(X1), ..., Sn(Xn) are the distinguished-stars (d-stars) and N1(Y1), ..., Nm(Ym) are the non-distinguished-stars (n-stars).
• m = 0 ⇒ Q is a star query.
• Simple generalized-star query: the length of each labeled path is 1 (i.e. it is of the form r(W0, W1)).

Star Queries

• Q1, Q2 are star queries of arity n:
  • Q2 ⊑b Q1 (resp. Q2 ⊑bs Q1) ⇔ subgoals-onto (resp. variables-onto) containment mapping from Q1 to Q2.
• Variables-onto (resp. subgoals-onto) containment mapping ⇒ Q2 ⊑bs Q1 (resp. Q2 ⊑b Q1).
• Existence of a containment mapping from Q1 to Q2.
• Every variable (resp. subgoal) of Q2 is mapped by Q1.

[Figure: Q1 with roots X1, ..., XN, where Xi has labeled paths Pi1, ..., PiKi ending at Wi1, ..., WiKi; Q2 with roots Y1, ..., YN, where Yi has labeled paths P'i1, ..., P'iK'i ending at Zi1, ..., ZiK'i]

• Check whether or not, for each distinct labeled path P: number of P(Yi) in Q2 ≤ number of P(Xi) in Q1.
• O(n² log n)
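The counting condition in the last check of the Star Queries slide can be sketched with multisets. The encoding below is an assumption made for illustration: a star query is a list with one entry per root, each entry being the list of its labeled paths, and a labeled path is the tuple of its relation names. The sketch covers only this counting condition; the existence of a containment mapping has to be established separately.

```python
from collections import Counter

def path_counts_ok(stars_q1, stars_q2):
    """For every root i and every label sequence P, check #P in Q2 <= #P in Q1."""
    if len(stars_q1) != len(stars_q2):
        return False                          # different arity
    for paths1, paths2 in zip(stars_q1, stars_q2):
        count1, count2 = Counter(paths1), Counter(paths2)
        # A label sequence that is more frequent in Q2 cannot be covered by Q1.
        if any(count2[p] > count1[p] for p in count2):
            return False
    return True

# One root each; Q1 has two paths labeled r and one path labeled s, t.
STARS_Q1 = [[("r",), ("r",), ("s", "t")]]
# Q2 has a single path for each of the two label sequences.
STARS_Q2 = [[("r",), ("s", "t")]]

forward = path_counts_ok(STARS_Q1, STARS_Q2)    # Q2 against Q1
backward = path_counts_ok(STARS_Q2, STARS_Q1)   # roles swapped
```

Building the counters is a single pass over the paths; the sorting-based bound quoted on the slide refers to the authors' own procedure, which is not reproduced here.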

No existence of variables-onto containment mapping

Suppose the following two queries:

Q1 : q(X, Y) :- r(X, X'), r(Z, U), r(Z, W), r(Y, Y')
Q2 : q(X, Y) :- r(X, X'), r(X, U), r(Y, W), r(Y, Y')

[Figure: query graphs of Q1 (nodes X, X', Z, U, W, Y, Y') and Q2 (nodes X, X', U, Y, W, Y')]

• Neither a subgoals-onto nor a variables-onto containment mapping exists from Q1 to Q2.
• Each variable and each subgoal of Q2 are mapped by Q1.
• For each distinct set of tuples: number of valuations over Q2 ≤ number of valuations over Q1.
• Q2 ⊑bs Q1 and Q2 ⊑b Q1.

Simple Generalized-Star Queries

• Q1 is a simple generalized-star query of arity n.
• Q2 is a star query of arity n.
• Schema contains a single binary relation.
• Q2 ⊑b Q1 (resp. Q2 ⊑bs Q1) ⇔ for every subset $\mathcal{S}$ of d-stars of Q1 and the set $\mathcal{S}'$ of corresponding d-stars of Q2:

  $\sum_{S' \in \mathcal{S}'} |subgoals(S')| \le \sum_{S \in \mathcal{S}} |subgoals(S)| + \sum_{j=1}^{m} |subgoals(N_j)|$

[Figure: Q1 with d-stars rooted at X1, X2, ... (leaves W11, ..., W1K1, W21, ..., W2K2) and an n-star Nj (leaves W1, ..., WK); Q2 with stars rooted at Y1, Y2, ... (leaves Z11, ..., Z1K1, Z21, ..., Z2K2)]

Test
• For each d-star S of Q1 and the corresponding d-star S' of Q2, calculate |subgoals(S)| − |subgoals(S')|.
• Calculate the sum s of all negative differences.
• Calculate the number sN of the subgoals of n-stars of Q1.
• Check whether or not s + sN ≥ 0.
• Linear time.
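The linear test for simple generalized-star queries can be compared against the subset condition it replaces. The sketch below assumes that each star is summarized by its number of subgoals, and it takes each difference as |subgoals(S)| − |subgoals(S')|, so that a negative value means the d-star of Q2 is the larger one; the function names are illustrative.

```python
from itertools import combinations

def linear_test(q1_d_sizes, q2_d_sizes, q1_n_sizes):
    """Sum the negative differences and compare with the n-star subgoals of Q1."""
    s = sum(min(0, a - b) for a, b in zip(q1_d_sizes, q2_d_sizes))
    s_n = sum(q1_n_sizes)
    return s + s_n >= 0

def subset_condition(q1_d_sizes, q2_d_sizes, q1_n_sizes):
    """Check the inequality for every subset of d-stars (exponential, for comparison)."""
    n = len(q1_d_sizes)
    s_n = sum(q1_n_sizes)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            lhs = sum(q2_d_sizes[i] for i in subset)
            rhs = sum(q1_d_sizes[i] for i in subset) + s_n
            if lhs > rhs:
                return False
    return True

# Q1 has d-stars with 1 and 3 subgoals, Q2 has d-stars with 2 and 2 subgoals.
with_n_star = linear_test([1, 3], [2, 2], [1])      # one n-star subgoal in Q1
without_n_star = linear_test([1, 3], [2, 2], [])    # no n-stars in Q1
```

The worst subset is the one that collects exactly the d-stars where Q2 is larger, which is why summing only the negative differences is enough.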

Variables Property and CQ-Enhanced

• Q1, Q2 are CQs.
• Q2 ⊑bs Q1 ⇒ for each finite set of tuples: number of valuations over Q2 ≤ number of valuations over Q1.
  • Each relation of arity n is the n-th Cartesian product of the set of constants a1, ..., ak.
  • Each variable of each query can be valuated by any constant.
  • If |Variables(Q2)| > |Variables(Q1)| then Q2 ⋢bs Q1.

[Figure: the variables X1, ..., Xn of Q1 and Y1, ..., Ym of Q2, each valuated by any of the constants a1, ..., ak]

• Q2 ⊑bs Q1 ⇒ |Variables(Q2)| ≤ |Variables(Q1)|.
• Suppose Q2 is Q1-enhanced: obtained by adding a sequence of subgoals to Q1:
  • Q2 ⊑bs Q1 ⇔ variables-onto containment mapping from Q1 to Q2.
  • Test: check whether every variable appearing in the additional subgoals also appears in Q1's body.
  • Linear time.
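The test for a Q1-enhanced query amounts to a single pass over the added subgoals. The sketch below assumes subgoals encoded as `(relation, variables)` pairs and reads the test as requiring every variable of the additional subgoals to occur in Q1's body; the function name and the sample subgoals are hypothetical.

```python
def enhanced_is_bag_set_contained(q1_body, extra_subgoals):
    """Q2 = Q1 plus extra_subgoals: check that the extras introduce no new variable."""
    q1_vars = {v for _, args in q1_body for v in args}
    return all(v in q1_vars for _, args in extra_subgoals for v in args)

# Q1 : q(X) :- r(X, Y)   (illustrative query, not from the slides)
Q1_BODY = [("r", ("X", "Y"))]

reuses_variables = enhanced_is_bag_set_contained(Q1_BODY, [("r", ("Y", "Y"))])
adds_variable = enhanced_is_bag_set_contained(Q1_BODY, [("r", ("Y", "Z"))])
```

When no new variable is introduced, the identity on Q1's variables is itself a variables-onto containment mapping from Q1 to Q2, which is the mapping the slide's equivalence asks for.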

Complexity results for the Bag and Bag-Set CQ containment problem

Containing Query (Q1) | Contained Query (Q2) | Bag-Set Semantics | Bag Semantics
CQ | CQ | Π^p_2-hard: ChaudVardiPODS93 | open
CQ | CQ without projections | NP: AfrDamGerIPL09 | open
CQ without projections | CQ without projections | O(n² log(n)): AfrDamGerIPL09 | O(n² log(n)): AfrDamGerIPL09
CQ | CQ without self-joins | O(n log(n)): AfrDamGerIPL09 | O(n log(n)): IoanRamTODS95, AfrDamGerIPL09
Star Query | Star Query | O(n² log(n)): AfrDamGerIPL09 | O(n² log(n)): AfrDamGerIPL09
Simple Gener. Star Query | Simple Star Query | Linear: AfrDamGerIPL09 | Linear: AfrDamGerIPL09
CQ | Enhanced Q1 | Linear: AfrDamGerIPL09 | open


Thank You
