Using conjunctive normal form for natural language semantics
Steven Abney and Ezra Keshet
January 2, 2013

Contents

1 Introduction
  1.1 Language as Interface to Reasoning
  1.2 Introducing CNF
2 CNF as Semantic Metalanguage
  2.1 Interpretation function as homomorphism
  2.2 Semantic types
  2.3 Semantic lexicon and root constraints
  2.4 Rules of interpretation
    2.4.1 Literals
    2.4.2 Zeros
    2.4.3 Copy
    2.4.4 Identity
    2.4.5 Negation
    2.4.6 Conjunction and disjunction
    2.4.7 Application
    2.4.8 Abstraction
    2.4.9 Q-Restriction
    2.4.10 Q-Scope
    2.4.11 Modification
  2.5 Index-assignment rules
  2.6 Lambda abstraction
  2.7 Examples
    2.7.1 Elementary examples
    2.7.2 The essence of the interpretation function
    2.7.3 More examples
  2.8 Shorthand
3 Anaphora
  3.1 Basic intra-sentential anaphora
  3.2 Discourse (inter-sentential) anaphora
  3.3 Donkey anaphora
  3.4 Negation, Disjunction, and Universal Quantification
  3.5 The Limits of DPL
  3.6 Cataphora
  3.7 Summary
4 Additional Issues
  4.1 Event variables
  4.2 Telescoping
  4.3 Additional constraints and open questions
5 Conclusion

1 Introduction

In this paper, we consider the consequences of using conjunctive normal form (CNF)—also known as clausal normal form—as the semantic metalanguage for the interpretation of natural language discourses. CNF is a quantifier-free subset of first-order predicate calculus (FOPC) that constitutes a normal form in the sense of being inferentially equivalent to general FOPC. It is the standard representation in automated reasoning systems. We will show that the use of CNF predicts the existence of certain classes of eccentric anaphora such as donkey anaphora and telescoping. Theories of discourse semantics that accommodate donkey anaphora include Discourse Representation Theory (Kamp 1981, Kamp and Reyle 1993) and Dynamic Predicate Logic (Groenendijk and Stokhof 1991). We will compare the CNF approach to these alternatives with respect to their empirical predictions. But of equal interest is the manner in which the predictions arise. CNF is designed for utility in reasoning, and the CNF-based system of interpretation that we describe is designed to enable a two-way connection between language and reasoning. Eccentric anaphora emerges as an accidental consequence of the design. In that sense, the CNF approach goes beyond merely accommodating eccentric anaphora to giving a possible explanation for why it exists in the first place.

1.1 Language as Interface to Reasoning

Natural language users are able to both produce and comprehend language. Information acquired through language comprehension feeds reasoning, as well as decision-making, emotions, and other basic cognitive functions; and these functions in turn generate information which may be conveyed to others via language production. Viewed from a computational perspective, language thus constitutes an input-output system for an intelligent agent. In the input direction, the system converts a sentence into an internal representation that is conducive to reasoning or other uses, and in the output direction, this conversion is reversed.

Conventional linguistic accounts of semantics (Montague 1970a,b) define interpretation functions, often in FOPC, but do not give an account of the connection between those functions and reasoning. An interpretation function typically assigns a set of possible worlds to a sentence: in particular, those possible worlds where the sentence is true. Most work in semantics does not consider how a set of possible worlds can be finitely represented, algorithmically computed, or used in reasoning algorithms.

In the field of artificial intelligence (AI), there is an enormous literature on reasoning algorithms. By far the commonest algorithm is resolution.¹

¹ We note that reasoning algorithms other than resolution do exist. We particularly mention model-building algorithms that, at first blush, appear more compatible with model-based interpretation functions. Model-building algorithms attempt to satisfy a formula by explicitly constructing a finite model for it. If a formula is satisfiable, its negation is not valid. As such, model building is a useful complement to deduction: deduction searches for a proof that a given formula is valid, and building a model for the negation can provide a counterexample, demonstrating that no proof will be found. But a model-building approach is not necessarily any more compatible with the standard interpretation functions. Generally, model-building algorithms also use CNF internally, or convert expressions to CNF as a step in reducing the first-order model-building problem to a propositional satisfiability problem (McCune 2003).


In the context of AI, the textbook approach to language comprehension involves a pipeline consisting of a parser that converts a sentence to an expression in first-order predicate calculus, followed by conversion to CNF, for processing by an automated reasoner.² CNF emerged from early work on logical inference by Whitehead and Russell (1910), Herbrand (1930), Skolem (1934), and Gödel (1939), and its special significance for automated reasoning was established with the introduction of resolution-based theorem provers (Robinson 1965). Resolution is a rule of inference that applies to CNF, not to general FOPC expressions. It is inferentially complete as a sole rule of inference and is therefore the basis for most practical reasoning systems.

To do language production, one must reverse the pipeline just sketched. However, most computational systems for language production do not actually reverse the comprehension pipeline, but rely instead on a production system that is entirely independent of the comprehension system. One difficulty is the conversion from FOPC to CNF, which is not readily inverted. The system of semantic interpretation that we present below is based on the idea of avoiding the FOPC-to-CNF conversion by doing semantic interpretation directly in CNF. We effectively use the same tree structure for both the English sentence and the CNF expression—this is made possible by the homomorphism between English logical forms (LFs) and CNF discussed in section 2.1—and the common structure makes it possible to do language production by parsing in reverse. One parses the CNF to construct an LF tree, and one then reads the English sentence off the LF tree (Abney 2012).

² For example, this is the approach spelled out in Chapter 9 of Russell and Norvig (1995), which is the standard introduction to artificial intelligence.
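As an aside, the resolution rule mentioned above is easy to state once formulas are in clausal form. The following is a minimal illustrative sketch (ours, not part of the paper) of propositional resolution over clauses represented as sets of signed literals; first-order resolution additionally unifies argument terms, which we omit here.

    # Minimal sketch of propositional resolution over CNF clauses.
    # A literal is a pair (atom_string, polarity); a clause is a frozenset of literals.

    def resolve(clause1, clause2):
        """Return all resolvents obtainable from the two clauses."""
        resolvents = []
        for (atom, sign) in clause1:
            complement = (atom, not sign)
            if complement in clause2:
                # Drop the complementary pair and union what remains.
                resolvents.append((clause1 - {(atom, sign)}) | (clause2 - {complement}))
        return resolvents

    # Example: from {A(c), not-B(f(x))} and {B(f(x))} we may infer {A(c)}.
    c1 = frozenset([("A(c)", True), ("B(f(x))", False)])
    c2 = frozenset([("B(f(x))", True)])
    print(resolve(c1, c2))   # [frozenset({('A(c)', True)})]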

1.2 Introducing CNF

An expression in CNF is a conjunction of clauses, where a clause is a disjunction of literals. A literal, in turn, is an atomic formula or a negated atomic formula, an atomic formula being the application of a predicate to zero or more terms. A term is a variable, an individual constant, or a function applied to a list of terms. For instance, the following breakdown labels the parts of the CNF expression [A(c) ∨ ¬B(f(x))] ∧ C(x), with predicates A, B, C, variable x, constant c, and function f:

[A(c) ∨ ¬B(f(x))] ∧ C(x)

• Clauses: A(c) ∨ ¬B(f(x)), and the unit clause C(x).
• Literals: A(c), ¬B(f(x)), and C(x); the second is a negated atomic formula, the others are (positive) atomic formulas.
• Atomic formulas: A(c), B(f(x)), and C(x), each consisting of a predicate (A, B, C) applied to a term.
• Terms: the constant c, the complex term f(x), and the variable x.
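To make the clause/literal/term structure fully explicit, the same expression can be written as nested data. The following rendering is ours (the representation is invented for illustration, not part of the formal system):

    # A CNF expression as plain data: a list of clauses; a clause is a list of
    # literals; a literal is (negated?, predicate, [terms]); a complex term is
    # (function, [terms]).
    expression = [                                   # [A(c) v ¬B(f(x))] ∧ C(x)
        [(False, "A", ["c"]),                        # literal A(c)
         (True,  "B", [("f", ["x"])])],              # literal ¬B(f(x))
        [(False, "C", ["x"])],                       # unit clause C(x)
    ]

    for clause in expression:
        for negated, predicate, args in clause:
            print(("¬" if negated else "") + predicate, args)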

Notice that there are no quantifiers in a CNF expression: all variables that occur are free and are implicitly universally bound. In addition, certain of the function symbols are distinguished as Skolem functions; these function symbols are implicitly existentially bound. We will extend the term variable to include both the standard variables over individuals and Skolem function symbols, which we think of as variables ranging over functions. The former we call individual variables or universal variables (because of the implicit universal binding), and the latter we call Skolem variables. To distinguish them typographically, we place a dot over Skolem variables.

The interpretation of a CNF expression is the same as for any predicate calculus expression, except for the implicit binding of variables. More precisely, for any CNF expression φ, let x₁, . . . , xₙ be the universal variables that occur in φ, and let ẏ₁, . . . , ẏₘ be the Skolem variables that occur. Then φ is interpreted identically to the (second-order) predicate calculus formula ∃ẏ₁ . . . ∃ẏₘ ∀x₁ . . . ∀xₙ φ. For example, consider the following sentence:

(1) Every farmer owns a donkey.

Its CNF translation is the following:³

(2) ¬farmer′(x) ∨ [donkey′(ẏ(x)) ∧ owns′(x, ẏ(x))].

³ We adopt the common convention of using ω′ to represent the logical predicate that translates the object-language word ω.
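Spelling out the implicit binding just described—this is simply the scheme above applied to (2), added here for concreteness—(2) is read as the second-order formula

∃ẏ ∀x [¬farmer′(x) ∨ (donkey′(ẏ(x)) ∧ owns′(x, ẏ(x)))],

that is, there is a choice of function ẏ such that every farmer x owns the donkey ẏ(x).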

As we will see, variables may be shared across multiple sentences. Hence our comments above regarding implicit quantification provide a definite interpretation only for a complete discourse or dialogue—that is, a complete database of CNF clauses. We therefore must address the interpretation of a CNF expression such as (2) that constitutes only a fragment of a database. In formal definitions of the predicate calculus, an open formula is typically treated as representing (at least indirectly) a function from assignments to truth values. A very similar interpretation applies here: an open formula such as (2) represents a function from assignments to truth values, although in this case the domain of the assignment includes not only individual variables but also Skolem variables.

CNF differs from the standard predicate calculus in that CNF additionally permits an open formula to be interpreted as if it were a sentence (i.e., a formula with no free variables).


With a slight abuse of terminology, we can think of these two interpretations as the intension and extension of the formula. The intension is the function that maps assignments to truth values. The extension is a truth value, defined as follows. A CNF expression whose intension is f has the extension true just in case there exists a partial assignment g₀, assigning values just to the Skolem variables, such that every extension of g₀ to a total assignment g yields f(g) = true. That is, an open formula is true qua sentence just in case there exists some way of fixing the values of Skolem variables, such that every way of assigning values to universal variables makes the formula, qua formula, true.

Note incidentally that intensions are easily conjoined: if we have a CNF formula whose intension is f₁, and we add a new clause with intension f₂, the conjoined intension is λg[f₁(g) ∧ f₂(g)]. Hence sentence meanings are readily conjoined to create discourse meanings.
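To make the extension definition concrete, here is a small illustrative sketch (ours; the names and the brute-force strategy are not from the paper) that evaluates a clause set over a finite domain by choosing denotations for the Skolem variables—assumed zero-place here for simplicity—and then universally quantifying the individual variables:

    from itertools import product

    # Literals are (negated?, predicate, [variable names]); predicates are given
    # as sets of tuples over the domain; Skolem variables are zero-place.
    def extension(clauses, universals, skolems, domain, predicates):
        def literal_true(lit, g):
            negated, pred, args = lit
            return (tuple(g[a] for a in args) in predicates[pred]) != negated
        def formula_true(g):
            return all(any(literal_true(l, g) for l in clause) for clause in clauses)
        # Exists values for Skolem variables such that, for all values of the
        # universal variables, the quantifier-free matrix is true.
        for sk_values in product(domain, repeat=len(skolems)):
            base = dict(zip(skolems, sk_values))
            if all(formula_true({**base, **dict(zip(universals, uv))})
                   for uv in product(domain, repeat=len(universals))):
                return True
        return False

    # "A donkey brayed. It was hungry.": donkey'(x.) ∧ brayed'(x.) ∧ hungry'(x.)
    clauses = [[(False, "donkey", ["x."])], [(False, "brayed", ["x."])],
               [(False, "hungry", ["x."])]]
    predicates = {"donkey": {("d1",)}, "brayed": {("d1",)}, "hungry": {("d1",)}}
    print(extension(clauses, [], ["x."], {"d1", "d2"}, predicates))  # True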

2 CNF as Semantic Metalanguage

In this section, we define a semantic system using CNF as the metalanguage. Instead of a complex pipeline from natural language to CNF, our system is a homomorphism from an LF tree to its corresponding CNF expression. After defining the system, we step through several illustrative examples, and, finally, we present a shorthand that will prove useful in later sections.

2.1 Interpretation function as homomorphism

The domain of our interpretation function consists of standard LF trees, in which quantifiers and negation have been raised. (That is, we treat both quantification and negation as scope-taking operators.) For example, we take the LF of sentence (1) to be something like (3a). Our system assigns it the interpretation (2), whose syntactic structure is shown in (3b):

(3) a. [S [DP_i every farmer] [S [DP_j a donkey] [S t_i [VP owns t_j]]]]

    b. [∨ ¬farmer′(x) [∧ donkey′(ẏ(x)) owns′(x, ẏ(x))]]

A key intuition is that our semantic interpretation function defines a homomorphism from LF trees to CNF expressions. That is, the structure of the CNF expression is a simplification of the LF tree, but it does not involve any restructuring. The nodes of the CNF tree (3b) correspond in an obvious way to nodes in the LF tree (3a): the leaf nodes correspond to the leaves farmer, donkey, and owns in the LF tree; the root node corresponds to the root node of the LF tree; and the node labeled "∧" corresponds to the middle S node in the LF tree. The structure of the LF tree is preserved, albeit in simplified form, in the CNF expression.

The next few sections define the semantic interpretation system necessary to achieve this homomorphism. Comprehensive examples of the system are postponed until section 2.7. On a first reading one may wish to skim the examples before launching into the definitions of sections 2.2–2.5.

2.2 Semantic types

Before defining the interpretation function, it will be convenient to introduce a system of semantic types. A type has three parts: a basic type ρ ∈ {L, 0, C, ∀, ∃, ∧, ∨}, governing how nodes combine; an index i, governing predicate-argument relationships and quantifier-trace relationships; and a polarity s ∈ {+1, −1}. The combination ρˢ of basic type and polarity is a signed type. We usually adopt a slightly modified notation and write a signed type simply as ρ if s = +1, and as ρ̄ (ρ with an overline) if s = −1. The combination ρˢ with an index i is a full type.

Basic types. The basic types are summarized in the following table. Unlike the types e, t, ⟨e, t⟩, etc., familiar from the typed predicate calculus, our types do not represent the mathematical class of the denotatum of a node, but rather the mode of combination used when interpreting the node.

    Node                                                Basic type
    Literals: nouns, verbs, adjectives, prepositions    L
    Uninterpreted nodes: all other terminal nodes       0
    Copy nodes (share a denotation with one child)      C
    Universally quantified DPs                          ∀
    Existentially quantified DPs                        ∃
    Disjunction                                         ∨
    Conjunction                                         ∧

The first two types are used for terminal nodes, and the remaining types are used for nonterminal nodes. The assignment of types to nodes is determined by the rules of section 2.4. Namely, a configuration is interpretable just in case there is an interpretation rule that licenses it. (By configuration we mean a local subtree consisting of a node and its children.) The reader may find it useful to think of the interpretation rules as rewrite rules in which semantic types represent the categories.

Polarity. The polarity of a node is also determined by the rules of interpretation given below. The general principle is quite simple, however, and we state it here. Root nodes have positive polarity, and polarity is inherited downward (i.e., the polarity of a node equals the polarity of its parent) with two exceptions:

• In a determiner phrase [DP D NP] in which the determiner is a universal (every, all, each, . . .), the polarity of the NP is opposite to that of the DP.

• In a negation [S not S], the two S nodes have opposite polarity.

Indices. An index term may be a universal variable, a Skolem term consisting of a Skolem function and its arguments, or a name (individual constant). An index is a tuple of index terms. For example:

• ⟨ ⟩ – an empty index. Used for uninterpreted nodes and saturated nodes, such as complete clauses.

• ⟨u⟩ – a single index term. Used for one-place predicates, such as simple nouns and intransitive verbs. Also used for quantified DPs and their traces.

• ⟨u, v⟩ – used for two-place predicates, such as transitive verbs.

• ⟨u, v, w⟩ – used for three-place predicates, such as ditransitive verbs.

Tuples of any length are allowed. In most cases, we omit the angle brackets when writing indices. For example, the full type C_x,y has index ⟨x, y⟩; the full type ∃_x has index ⟨x⟩; and the full type ∧ has index ⟨ ⟩.
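For readers who like to see the bookkeeping spelled out, a full type can be pictured as a small record. The rendering below is only illustrative (ours, not part of the formal system):

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class FullType:
        basic: str                      # one of 'L', '0', 'C', '∀', '∃', '∧', '∨'
        positive: bool = True           # polarity s in {+1, -1}
        index: Tuple[str, ...] = ()     # tuple of index terms, e.g. ('x', 'ẏ(x)')

    # The full type written C_x,y in the text: basic type C, positive, index ⟨x, y⟩.
    parent_type = FullType('C', True, ('x', 'y'))
    # A negative (overlined) L with a singleton index, as borne by "girl" in (21) below.
    girl_type = FullType('L', False, ('x',))
    print(parent_type, girl_type)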

2.3 Semantic lexicon and root constraints

The semantic types and interpretations of nodes are determined by four sets of rules: a semantic lexicon, a set of root constraints, a set of interpretation rules, and a set of index-assignment rules. The interpretation rules are given in section 2.4; the index-assignment rules are stated in section 2.5; and the lexicon and root constraints are dispensed with here.

Semantic lexicon. The semantic lexicon determines the basic type of each word. We do not give an explicit listing, but the general principles are very simple. Content words (nouns, non-copular verbs, adjectives, and prepositions) have type L, and all other words have type 0. The lexicon also determines a CNF predicate ω′ corresponding to each content word ω.

Root constraints. The root constraints are also simply stated: the root node must have positive polarity and its index must be ⟨ ⟩.


2.4 Rules of interpretation

For each interpreted node in an LF tree, the interpretation function ⟦ ⟧ determines a CNF intension—that is, a function from assignments to truth values. This intension is conjoined with the intensions of the previous sentences in the current discourse, and the resulting CNF expression (still an intension) is assigned a true/false extension as described in section 1.2.

The interpretation function is defined by the rules given below. In these rules we use "colon" notation to indicate the semantic type of a node. For example, "α : ∃_i" is a pattern that matches any node α whose semantic type is ∃_i. We use the variable ω for terminal nodes only; we use α, β, and γ for nodes that may be either terminal or nonterminal. We use the variables σ and τ for basic types other than 0. We use variables u and v for individual index terms, and we use i and j for indices. (Recall that indices are tuples of index terms.) The case i = ⟨ ⟩ is expressly permitted. Conversely, the omission of an index is significant: for example, a semantic type indicated simply as τ is incompatible with any index except ⟨ ⟩. The linear order of a node's children is not important and may be reversed in any of the rules.

As mentioned earlier, for any given node in an LF tree, there is only one rule that might possibly apply, the relevant rule being readily determined by the number of children (if any) and their semantic or syntactic types. If the semantic types in the relevant rule do not match the semantic types in the local configuration in the LF tree, the tree is semantically ill-formed. The determination of which rule is the "relevant rule" is made as follows; a schematic rendering of this dispatch is sketched after the list.

• For a terminal node, the only applicable rules are Literals (4) and Zeros (5). The choice between them is determined by the semantic type of the word (L or 0), as assigned by the semantic lexicon.

• For a nonterminal node with a single child, the only applicable rule is Copy (6).

• For branching nodes, rules (7)–(12) apply if there is exactly one child of type 0. The syntactic type of that child determines which rule applies. The relevant syntactic types, along with the notation we use to indicate them, are as follows: identity elements (ωI) such as copular verbs, negative elements (ω¬) such as not, conjunction (ω∧), disjunction (ω∨), referential elements (ωDP) such as personal pronouns and traces, relative pronouns (ωRP), universal quantificational determiners (ω∀) such as every and each, and existential quantificational determiners (ω∃) such as some and a(n).

• The two remaining rules are Modification (14) and Q-Scope (13), for which neither child may be of type 0. Q-Scope requires one child to be quantificational (basic type ∀ or ∃), and Modification prohibits either child from being quantificational, so again the choice of rule is determined by the types of the children.
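The rule-selection procedure just described is essentially a dispatch on the local configuration. A minimal sketch of that dispatch (ours; the node encoding and tag names are invented for illustration) might look as follows:

    from collections import namedtuple
    # A node carries its basic type, its children, and, for terminals of type 0,
    # a syntactic tag such as 'identity', 'neg', 'conj', 'disj', 'dp', 'rp',
    # 'forall-det', or 'exists-det'.
    Node = namedtuple('Node', 'basic children syntactic', defaults=((), None))

    RULES_FOR_ZERO = {'identity': 'Identity (7)', 'neg': 'Negation (8)',
                      'conj': 'Conjunction (9)', 'disj': 'Disjunction (9)',
                      'dp': 'Application (10)', 'rp': 'Abstraction (11)',
                      'forall-det': 'Q-Restriction (12)',
                      'exists-det': 'Q-Restriction (12)'}

    def relevant_rule(node):
        if not node.children:                                  # terminal node
            return 'Literals (4)' if node.basic == 'L' else 'Zeros (5)'
        if len(node.children) == 1:                            # unary branching
            return 'Copy (6)'
        zeros = [c for c in node.children if c.basic == '0']
        if len(zeros) == 1:                                    # exactly one type-0 child
            return RULES_FOR_ZERO[zeros[0].syntactic]
        if any(c.basic in ('∀', '∃') for c in node.children):  # quantificational child
            return 'Q-Scope (13)'
        return 'Modification (14)'

    vp = Node('C', (Node('L'), Node('0', syntactic='dp')))
    print(relevant_rule(vp))   # Application (10)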


2.4.1 Literals

Recall that ω must be a terminal node.

(4) ⟦ω : L_u1,...,un⟧ = ω′(u₁, . . . , uₙ)      ⟦ω : L̄_u1,...,un⟧ = ¬ω′(u₁, . . . , uₙ)

Thus, a content word ω with an index i comprising n index terms denotes a CNF literal whose predicate is the semantic constant ω′ listed for ω in the lexicon and whose argument list is identical to i. The cases n = 1 and n = 0 are permitted. Examples are the terminal nodes labeled "L" in any of the trees of section 2.7.

2.4.2 Zeros

A terminal node ω whose type is not L must have type 0. The interpretation of such nodes is undefined.

(5) ⟦ω : 0_i⟧ = ⊥      ⟦ω : 0̄_i⟧ = ⊥

Such nodes may, however, contribute to the interpretation less directly. As already mentioned, words of type 0 play a syncategorematic role in many of the following rules. Also, some of the index-assignment rules of section 2.5 apply to DP nodes of type 0. Note that we treat pronouns, proper nouns, and traces as terminal nodes, devoid of internal structure, despite having category DP. Examples are the terminal nodes labeled "0" in any of the trees of section 2.7.

2.4.3 Copy

The following rule applies to any unary-branching node γ whose only child is α.

(6) ⟦[γ:C_i α:τ_i]⟧ = ⟦α⟧      ⟦[γ:C̄_i α:τ̄_i]⟧ = ⟦α⟧

The index of the child is copied to the parent. Child and parent also share polarity and denotation. An example can be found in (21). On a first exposure, the reader may find it confusing that the interpretation appears to be the same, regardless of the polarity of the parent node. But remember that if the child α is a terminal node with a negative type τ̄_i, its interpretation ⟦α⟧ will be a negated CNF literal. Negations in CNF appear only on literals; negative polarity on a nonterminal node indicates that all the literals it contains are to be negated. Similar comments apply to all the following rules.

2.4.4 Identity

In the following rule, ωI is an identity element, such as a copula.⁴

(7) ⟦[γ:C_i ωI:0 β:τ_i]⟧ = ⟦β⟧      ⟦[γ:C̄_i ωI:0 β:τ̄_i]⟧ = ⟦β⟧

⁴ For the moment we set aside any tense interpretation of the copula. Tense is revisited in section 4.1.

Again, the parent shares index, polarity, and denotation with one child—the one not of type 0. An example can be found in (24): the word "is" is an identity element.

2.4.5 Negation

In this rule, ω¬ represents a negative element. The configuration under consideration is typically the result of operator raising at LF.

(8) ⟦[γ:C_i ω¬:0 β:τ̄_i]⟧ = ⟦β⟧      ⟦[γ:C̄_i ω¬:0 β:τ_i]⟧ = ⟦β⟧

The only effect of a negative element is to flip the polarity of its complement. An example is found in (25).

2.4.6 Conjunction and disjunction

In these rules, ω∧ represents a conjunction (and, but) and ω∨ represents a disjunction (or).

(9) ⟦[γ:∧_i α:σ_i ω∧ β:τ_i]⟧ = ⟦α⟧ ∧ ⟦β⟧      ⟦[γ:∧̄_i α:σ̄_i ω∧ β:τ̄_i]⟧ = ⟦α⟧ ∨ ⟦β⟧

    ⟦[γ:∨_i α:σ_i ω∨ β:τ_i]⟧ = ⟦α⟧ ∨ ⟦β⟧      ⟦[γ:∨̄_i α:σ̄_i ω∨ β:τ̄_i]⟧ = ⟦α⟧ ∧ ⟦β⟧

A negated conjunction (∧̄) is interpreted as ∨, and a negated disjunction (∨̄) is interpreted as ∧. Some readers may find the ternary branching structure distasteful. We have adopted it for simplicity; defining interpretation rules for a binary-branching structure involves technical issues that would complicate matters to little profit.

2.4.7 Application

This rule corresponds roughly to Functional Application. However, unlike with standard Functional Application, in our system the CNF predicate is not actually applied to its arguments until one interprets the terminal node. Recall that the Literals rule (4) takes its node's index terms to represent the predicate's arguments. Hence the present rule implements function application indirectly, by adding the index term v of the argument to the index i of the parent, yielding the index ⟨i, v⟩ for the head. The variable ωDP represents a referential DP—a proper noun, a pronoun, or a trace. We use ⟨i, v⟩ as a shorthand for ⟨u₁, . . . , uₙ, v⟩ where i = ⟨u₁, . . . , uₙ⟩.

(10) ⟦[γ:C_i α:σ_i,v ωDP:0_v]⟧ = ⟦α⟧      ⟦[γ:C̄_i α:σ̄_i,v ωDP:0_v]⟧ = ⟦α⟧

We can view the rule alternatively from a bottom-up perspective, as stating that a predicate α with a non-empty index ⟨i, v⟩ may discharge its last index term v by combining with an uninterpreted node ωDP whose sole index term is v. The resulting parent node has the same interpretation and polarity as the predicate, but its index list i is one term shorter, omitting the discharged term. Examples can be found in any of the trees of section 2.7.

2.4.8 Abstraction

Next, we have a rule akin to Predicate Abstraction. Here, ωRP represents a relative pronoun.

(11) ⟦[γ:C_u ωRP:0_u β:τ]⟧ = ⟦β⟧      ⟦[γ:C̄_u ωRP:0_u β:τ̄]⟧ = ⟦β⟧

In words, a relative pronoun (of basic type 0) may combine with a node having an empty index to form a node having a singleton index. Note that the lack of an index on τ is significant. The relative pronoun ωRP and its parent share the same index. The denotation of the parent is taken from the other child β. An example can be found in (24).

2.4.9 Q-Restriction

A set of syncategorematic rules is required to capture quantificational DPs. In the following, the parent γ is a quantificational DP; ω∀ must be a universal quantificational determiner such as every, all, each, etc.; and ω∃ must be an existential quantificational determiner such as a(n), some, etc.

(12) ⟦[γ:∃_u ω∃:0 β:τ_u]⟧ = ⟦β⟧      ⟦[γ:∃̄_u ω∃:0 β:τ̄_u]⟧ = ⟦β⟧

     ⟦[γ:∀_u ω∀:0 β:τ̄_u]⟧ = ⟦β⟧      ⟦[γ:∀̄_u ω∀:0 β:τ_u]⟧ = ⟦β⟧

Existentially quantified DPs carry the basic type ∃ while universally quantified DPs carry ∀. With existentials, the parent and both children share the same polarity; but universal quantificational determiners flip the polarity of their complements. In all cases, the parent and the restriction (β) share a common index, which must consist of a single index term. The parent always shares a denotation with the restriction. Most of the trees in section 2.7 contain examples.

In rule (12), the passing of the index term is being used in lieu of lambda abstraction. As a general matter, we implement lambda abstraction through the passing of index terms, rather than through the use of a lambda operator in the metalanguage. That may at first seem a mere notational idiosyncrasy, but it is actually central to the CNF system. See section 2.6 for discussion.

2.4.10 Q-Scope

Next we need rules to combine quantificational DPs with their nuclear scope.

(13) ⟦[γ:∧ α:∃_i β:τ]⟧ = ⟦α⟧ ∧ ⟦β⟧      ⟦[γ:∧̄ α:∃̄_i β:τ̄]⟧ = ⟦α⟧ ∨ ⟦β⟧

     ⟦[γ:∨ α:∀_i β:τ]⟧ = ⟦α⟧ ∨ ⟦β⟧      ⟦[γ:∨̄ α:∀̄_i β:τ̄]⟧ = ⟦α⟧ ∧ ⟦β⟧

A DP of basic type ∃ has a type-∧ parent (a conjunction) while a DP of basic type ∀ has a type-∨ parent (a disjunction). If the polarity is negative, the interpretation of the connective is inverted. That is, "∧̄" is interpreted as ∨ and "∨̄" is interpreted as ∧. Note that the parent and the second child must have an empty index. Most of the trees of section 2.7 contain examples.

2.4.11 Modification

The next rule corresponds to Predicate Modification. In this rule, the children may not be quantified DPs. That is, in (14), σ and τ may be any basic type except 0, ∀, or ∃.

(14) ⟦[γ:∧_i α:σ_i β:τ_i]⟧ = ⟦α⟧ ∧ ⟦β⟧      ⟦[γ:∧̄_i α:σ̄_i β:τ̄_i]⟧ = ⟦α⟧ ∨ ⟦β⟧

In words, two nodes with matching polarity and indices may combine to form a node of type ∧ with the same polarity and index as the children. The signed semantic type of the parent node determines how the interpretations of its children are combined, with "∧̄" being interpreted as ∨. An example is the combination of donkey and the relative clause in (26). Modification also applies when an adjective modifier is combined with a noun, though there are no examples of that in section 2.7.


2.5 Index-assignment rules

The interpretation rules of the previous section determine the basic types and polarities of all nodes. They also propagate indices from one node to another, and that propagation suffices to determine indices for all nodes, provided that indices for DPs are initially specified. The rules of this section provide those initial specifications.

Before stating the rules, we require a couple of preliminary definitions. We define the scope of a quantificational DP to be the set of nodes dominated by its parent. The rationale is as follows. In a well-formed LF tree, a quantificational DP occurs in the following configuration:

(15) [S [DP D NP] S]

The NP is the restriction of the DP, the lower S is its nuclear scope, and everything dominated by the upper S, including everything in both the restriction and nuclear scope, constitutes the scope of the DP. We assume a fixed enumeration of universal variables x₁, x₂, . . . and a fixed enumeration of Skolem variables ẋ₁, ẋ₂, . . ..⁵ We also enumerate the quantificational DPs in preorder as DP₁, . . . , DPₙ.⁶

Now we state the index-assignment rules. They determine the indices for DP nodes, and those indices are propagated throughout the tree by the interpretation rules of the previous section.

(16) Proper Nouns. The index of a proper noun ω is ⟨ω′⟩. ω′ is an individual constant, determined by the semantic lexicon.

(17) Anaphora. The index of a trace or pronoun must be the same as the index of its antecedent.

(18) Universal Variables. If DPₜ (the t-th DP in the preorder enumeration) has signed type ∀ or ∃̄, then its index is ⟨xₜ⟩, a singleton index consisting of the t-th universal variable.

(19) Skolem Variables. Otherwise the index of DPₜ is of the form ⟨ẋₜ(. . .)⟩, a singleton index consisting of the t-th Skolem function applied to an argument list, and the argument list ". . ." consists of all distinct universal variables that occur in the index terms of outscoping quantificational DPs.⁷

⁵ Hence our use, in example trees, of variables that vary unsystematically by letter—x, y, etc.—rather than systematically by subscript—x₁, x₂, x₃, etc.—is not strictly speaking correct; we have opted for readability over rigor.

⁶ In preorder, a parent node is ordered before its children, and the k-th child and all its descendants are ordered before the (k + 1)-st child and any of its descendants.

⁷ This definition is slightly nonstandard. In the standard definition, the argument list contains all universal variables that are index terms of outscoping quantificational DPs. The relaxation from "are" to "occur in" makes no difference under normal circumstances, but one case arises in section 4.3 where this tiny change becomes critical.


For example, in (21) below, the second DP ("a boy") has an index term consisting of the Skolem function ẏ combined with argument x, because it lies in the scope of the first DP ("every girl"), whose index term is x. In general, in what follows, we write the arguments of Skolem functions as subscripts rather than placing them within parentheses: ẏₓ rather than ẏ(x). The motivation is to reduce clutter.
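As an illustration of rules (18)–(19), the following sketch (ours; the input encoding is invented for the example) assigns indices to quantificational DPs in preorder and gives each Skolem variable the universal variables of the DPs that outscope it:

    # Sketch of index assignment: "universal" signed types (positive ∀, negative ∃)
    # get universal variables; the rest get Skolem terms whose arguments are the
    # universal variables contributed by outscoping DPs.
    def assign_indices(dps):
        """dps: preorder list of (signed_type, outscoping_positions), where
        signed_type is one of '∀', '∃', '∀-neg', '∃-neg'."""
        indices = {}
        for t, (signed, outscoping) in enumerate(dps, start=1):
            if signed in ('∀', '∃-neg'):                 # universal variable
                indices[t] = f"x{t}"
            else:                                        # Skolem term
                # keep only universal variables (Skolem terms contain parentheses)
                args = [indices[o] for o in outscoping if "(" not in indices[o]]
                indices[t] = f"ẋ{t}({', '.join(args)})"
        return indices

    # "Every girl likes a boy": DP1 = every girl (∀), DP2 = a boy (∃, inside DP1's scope)
    print(assign_indices([('∀', []), ('∃', [1])]))
    # {1: 'x1', 2: 'ẋ2(x1)'}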

2.6 Lambda abstraction

As a general matter, we have used index-term passing where the standard approach would have lambda abstraction. There is a temptation to view this as a mere notational idiosyncrasy and to make the interpretation rules look more familiar by adopting the more common approach of using lambda operators in the metalanguage, to be subsequently eliminated by beta reduction.

There are compelling reasons to resist the temptation. The use of lambda abstraction in any of the rules would have repercussions for the entire system. For example, if we use lambda abstraction in Q-Restriction (12), we would have to replace Q-Scope (13) with standard function application, and replace Modification (14) with set intersection (implemented as conjunction of characteristic functions), and so on. Introducing lambda abstraction into the semantic translation would destroy the homomorphism mentioned in section 2.1 and discussed in more detail in section 2.7.2 below. The interpretation rules would no longer translate directly to CNF. One would first need to do beta reduction to convert the output of interpretation to CNF, and beta reduction is not readily invertible. We would lose the symmetry between production and comprehension that is fundamental to the reversibility of parsing discussed in section 1.1.

Donkey anaphora would disappear as well. The indices provide a connection between pronouns and quantifiers that is global in scope. By contrast, lambda abstraction combined with beta reduction allows only connections that are compositions of chains of local connections. Eccentric connections by definition involve elements outside of the scope of the highest lambda operator, and would become unavailable if we converted the CNF system to one that uses lambda abstraction.

2.7 Examples

2.7.1 Elementary examples

Consider the following tree, with full types written in:


(20)
    S: C
      John: 0_John′
      VP: C_John′
        likes: L_John′,Mary′
        Mary: 0_Mary′

The semantic lexicon determines the semantic types of the terminal nodes: L for the verb, and 0 for the proper nouns. As for the VP and S nodes, each has a sole type-0 child with syntactic category DP, so the only rule that can be used is Application (10). The root constraints require the root node to have positive polarity, and Application guarantees that positive polarity is propagated to the rest of the tree. Notice that the index ⟨John′, Mary′⟩ on likes "becomes" ⟨John′⟩ on VP when the verb combines with the DP node Mary having index Mary′. This VP index then "becomes" ⟨ ⟩ on the S once the VP combines with the subject DP. The only nodes in this tree that have interpretations are likes, VP, and S, and they all have the same interpretation: likes′(John′, Mary′). The DP nodes contribute to the interpretation by providing the lexical constants John′ and Mary′, which they bear as index terms, by the Proper Nouns rule (16). The index terms on the verb are constrained by the Application rule to match the index terms of the DP nodes.

Next, consider the following tree, again annotated with full types:

(21)
    S: ∨
      DP: ∀_x
        every: 0
        NP: C̄_x
          girl: L̄_x
      S: ∧
        DP: ∃_ẏx
          a: 0
          NP: C_ẏx
            boy: L_ẏx
        S: C
          t: 0_x
          VP: C_x
            likes: L_x,ẏx
            t: 0_ẏx

This tree illustrates the use of several rules. By Q-Scope (13), the top node has a type of ∨, meaning that its two daughters are combined via disjunction. Also by Q-Scope, the second highest S has type ∧, meaning that its daughters are combined via conjunction. By Q-Restriction (12), the NP dominating girl has negative polarity, because of the universal quantification. The Copy rule (6) propagates that polarity to the N girl. Because the DP dominating girl has signed type ∀, the Universal Variables rule (18) specifies that its index term is a universal variable, x. The DP dominating boy has signed type ∃, so the Skolem Variables rule (19) specifies that its index is a Skolem term whose argument list comprises the outscoping universal variables. There is only one outscoping universal variable, x, hence the index term is ẏ(x), which we have written as ẏₓ. By the Anaphora rule (17), each trace bears the index of its antecedent. Thus, the original sentence for this tree was "every girl likes a boy" and not "a boy likes every girl."

2.7.2 The essence of the interpretation function

Next, consider the same tree annotated with the node interpretations:

(22)
    S ⇒ ¬girl′(x) ∨ [boy′(ẏₓ) ∧ likes′(x, ẏₓ)]
      DP ⇒ ¬girl′(x)
        every
        NP ⇒ ¬girl′(x)
          girl ⇒ ¬girl′(x)
      S ⇒ boy′(ẏₓ) ∧ likes′(x, ẏₓ)
        DP ⇒ boy′(ẏₓ)
          a
          NP ⇒ boy′(ẏₓ)
            boy ⇒ boy′(ẏₓ)
        S ⇒ likes′(x, ẏₓ)
          t
          VP ⇒ likes′(x, ẏₓ)
            likes ⇒ likes′(x, ẏₓ)
            t

Such an annotated tree is quite repetitious. In fact, almost all of the interpretation rules of section 2.4 are copy rules, in the sense that the denotation of the parent is identical to the denotation of one of the children. The only exceptions are Literals (4), which interprets terminal nodes of type L as literals, and Q-Scope (13) and Modification (14), which interpret nonterminal nodes of type ∧ and ∨ as connectives, understanding ∧̄ as ∨ and ∨̄ as ∧. For the example just given, it suffices to compute the following (where the literals' denotations, marked with "⇓", and the coordinating nodes' types are the highlighted elements):


(23)
    S: ∨
      DP: ∀_x
        every: 0
        NP: C̄_x
          girl: L̄_x  ⇓  ¬girl′(x)
      S: ∧
        DP: ∃_ẏx
          a: 0
          NP: C_ẏx
            boy: L_ẏx  ⇓  boy′(ẏₓ)
        S: C
          t: 0_x
          VP: C_x
            likes: L_x,ẏx  ⇓  likes′(x, ẏₓ)
            t: 0_ẏx

One can then read off the interpretation simply by reading the highlighted elements:

¬girl′(x) ∨ [boy′(ẏₓ) ∧ likes′(x, ẏₓ)]
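The read-off procedure can be phrased as a one-pass fold over the typed tree. The following sketch is ours (the node encoding is invented for illustration): it collects literal denotations at the leaves and combines them with the connective indicated by each coordinating node.

    # Sketch of "reading off" a CNF translation from a typed LF tree.
    # A node is ('lit', string) for an interpreted leaf literal,
    # ('conn', connective, children) for a coordinating node (connective already
    # adjusted for polarity), or ('copy', child) for any other node.
    def read_off(node):
        kind = node[0]
        if kind == 'lit':
            return node[1]
        if kind == 'copy':
            return read_off(node[1])
        _, conn, children = node
        return "[" + f" {conn} ".join(read_off(c) for c in children) + "]"

    # Tree (23): root ∨ over ¬girl′(x) and a ∧ node over boy′(ẏx) and likes′(x, ẏx).
    tree = ('conn', '∨', [
        ('copy', ('lit', "¬girl′(x)")),
        ('conn', '∧', [('copy', ('lit', "boy′(ẏx)")),
                       ('copy', ('copy', ('lit', "likes′(x, ẏx)")))]),
    ])
    print(read_off(tree))   # [¬girl′(x) ∨ [boy′(ẏx) ∧ likes′(x, ẏx)]]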

2.7.3 More examples

The following provides an example of a relative clause. The tree (24) is the LF for "all that glitters is gold."

(24)
    S: ∨
      DP: ∀_x
        all: 0
        CP: C̄_x
          that: 0_x
          S: C̄
            t: 0_x
            VP: C̄_x
              glitters: L̄_x  ⇓  ¬glitters′(x)
      S: C
        t: 0_x
        VP: C_x
          is: 0
          gold: L_x  ⇓  gold′(x)

The translation, ¬glitters′(x) ∨ gold′(x), is logically equivalent to glitters′(x) → gold′(x), with implicit universal binding of x. The negation on the CP sibling of all is introduced by Q-Restriction (12) and propagates to the entire subtree. The propagation of the index x throughout the tree is also noteworthy. The choice of the variable is determined at the DP by the Universal Variables rule (18). (Properly speaking, the first variable should be x₁, but we have used x instead to reduce subscript clutter.) The variable is propagated to the relative pronoun by Abstraction (11), from the relative pronoun to its trace by Anaphora (17), from the trace to the subordinate-clause VP by Application (10), and from the VP to glitters by Copy (6). In the other direction, the variable is propagated from the DP to its trace by Anaphora (17), from the trace to the matrix-clause VP by Application (10), and from the VP to gold by Identity (7).

Now let us consider what happens when we negate the sentence: "all that glitters is not gold." This of course has two readings; the following LF assigns wide scope to the negation.

(25)
    S: C
      not: 0
      S: ∨̄
        DP: ∀̄_ẋ
          all: 0
          CP: C_ẋ
            that: 0_ẋ
            S: C
              t: 0_ẋ
              VP: C_ẋ
                glitters: L_ẋ  ⇓  glitters′(ẋ)
        S: C̄
          t: 0_ẋ
          VP: C̄_ẋ
            is: 0
            gold: L̄_ẋ  ⇓  ¬gold′(ẋ)

This tree illustrates the use of Negation (8); it causes the second-highest S to have negative polarity. The basic type of a node combining a universally quantified DP with its sister is ∨, although the negated version ∨̄, which we have here, is interpreted as ∧. The negation spreads through the whole tree, except that it is canceled by a second negation on the sibling of all. The semantic translation is glitters′(ẋ) ∧ ¬gold′(ẋ), which can be read intuitively as "there is a thing ẋ such that ẋ glitters, but ẋ is not gold." Next we consider an example in which there are two literals in the restriction of the quantifier; it illustrates the use of Modification (14).


(26)
    S: ∨
      DP: ∀_x
        every: 0
        NP: ∧̄_x
          donkey: L̄_x  ⇓  ¬donkey′(x)
          CP: C̄_x
            that: 0_x
            S: C̄
              t: 0_x
              VP: C̄_x
                brayed: L̄_x  ⇓  ¬brayed′(x)
      S: C
        John: 0_John′
        VP: C_John′
          owns: L_John′,x  ⇓  owns′(John′, x)
          t: 0_x

Modification is used to combine donkey with the relative clause. It propagates the variable on NP to both the head noun and the modifier. The translation is ¬donkey′(x) ∨ ¬brayed′(x) ∨ owns′(John′, x). At this point, we have illustrated all of the interpretation rules. Now we turn to the classic donkey-anaphora example, "every farmer who owns a donkey beats it" (Geach 1962).

(27)
    S: ∨
      DP: ∀_x
        every: 0
        NP: ∧̄_x
          farmer: L̄_x  ⇓  ¬farmer′(x)
          CP: C̄_x
            that: 0_x
            S: ∧̄
              DP: ∃̄_y
                a: 0
                NP: C̄_y
                  donkey: L̄_y  ⇓  ¬donkey′(y)
              S: C̄
                t: 0_x
                VP: C̄_x
                  owns: L̄_x,y  ⇓  ¬owns′(x, y)
                  t: 0_y
      S: C
        t: 0_x
        VP: C_x
          beats: L_x,y  ⇓  beats′(x, y)
          it: 0_y

The upper DP (headed by farmer) is of type ∀, hence its parent S is of type ∨. The lower DP (headed by donkey) has basic type ∃ because its determiner is existential, hence its parent has basic type ∧. However, both are negated because they occur within the restriction of a universal. The index terms for both DPs are variables, not Skolem terms, because the signed types are universal: ∀ and ∃̄, respectively. (Note that the basic type of the NP dominating farmer is ∧, not because of an existential quantifier, but because it is a case of Modification.) The only thing that is new is the pronoun. Its antecedent is the DP headed by donkey; hence its index is y, by Anaphora (17). This—rather magically—gives the correct interpretation:

¬farmer′(x) ∨ ¬owns′(x, y) ∨ beats′(x, y)

See section 3 below for details on the CNF treatment of anaphora.

2.8 Shorthand

Despite the simplifications we have made, using trees to compute translations consumes considerable space. We can define a shorthand for computing the translation directly from a partially bracketed sentence. Rather than giving a formal definition, we give an informal characterization. Take the LF structure,

21

and add annotations according to the following templates:8 (28)

someu˙ NP ∧ everyu NP ∨

∧ that S Adj ∧ NP

not S

For the determiners, “u” or “u” ˙ represents the variable assigned to the determiner’s parent node. For example, consider “every dog that chases every cat is happy.” Switching to LF word order and adding annotations, this becomes: (29)

everyx dog ∧ that everyy cat ∨ chases ∨ is happy

To read off the interpretation, one makes the following adjustments in a negative context (that is, in any region with an odd number of superscribed lines): add dots to undotted variables, remove dots from dotted variables, place “¬” before each content word, and invert the connectives ∧ and ∨. Then one adds outscoping universal variables as subscripts on Skolem variables; one copies the variables to the argument positions of content words, as appropriate; and finally one deletes everything except the literals and connectives. Grouping follows the LF tree structure. For (29), the result is (30): (30)

¬dog0 (x) ∨ [cat0 (y˙ x ) ∧ ¬chases0 (x, y˙ x )] ∨ happy0 (x).

Note that a dot has been added to the variable y (it has been converted to a Skolem variable) because it appears in a negative context. The resulting Skolem function y˙ has argument x because its DP, every cat, appears within the scope of every dog at LF. Here is an example with wide-scope not. The (a) line is the S-structure, the (b) line is the LF, and the (c) line is the translation. (31)

a.

every dog that chases a cat1 does not catch it1

b. c.

not everyx dog ∧ that ay˙ cat ∧ tx chases ty˙ ∨ tx catches ity˙ dog0 (x) ˙ ∧ cat(y) ˙ ∧ chases0 (x, ˙ y) ˙ ∧ ¬catches0 (x, ˙ y) ˙

There are multiple occurrences of x and y˙ in (31b). The occurrences on the determiners (every x and a y˙ ) are the primary occurrences, and the context of the primary occurrence determines whether the variable remains as it is or switches between dotted and undotted. The secondary occurrences are all on anaphors that take a primary occurrence as antecedent. The variable x receives a dot because the primary occurrence is in a negative context. The other occurrences of x follow suit, regardless of their own context. That is, all occurrences of x are replaced uniformly with x. ˙ The primary occurrence of y˙ occurs in a positive context (double negative is positive), so y˙ remains unchanged. All occurrences of y˙ remain unchanged, regardless of their own context. In (31b), x outscopes y, ˙ but because x ends up as a dotted 8 The templates are largely sufficient for the examples we consider, but they are not exhaustive.

22

variable, it is not an argument of y: ˙ only outscoping universal variables appear as arguments of a Skolem function.
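The dot-toggling and connective-flipping steps of the shorthand can be pictured as a simple polarity-driven rewrite. The sketch below is ours and operates on an invented token list rather than on real annotated LF brackets; it shows only those two steps (variables are written with a combining dot).

    # Sketch of two shorthand adjustments (section 2.8): inside a negative region,
    # negate content words, toggle dots on variables, and swap ∧/∨.
    # Tokens are (text, in_negative_region?) pairs.
    def adjust(tokens, content_words):
        out = []
        for text, negative in tokens:
            if negative:
                if text in content_words:
                    text = "¬" + text + "′"
                elif text in ("∧", "∨"):
                    text = "∨" if text == "∧" else "∧"
                elif text.endswith("\u0307"):                  # dotted variable -> undotted
                    text = text[:-1]
                elif len(text) == 1 and text.isalpha():        # undotted variable -> dotted
                    text = text + "\u0307"
            elif text in content_words:
                text = text + "′"
            out.append(text)
        return out

    # From (29): 'dog', the ∧, and the inner every's variable y sit in the negative
    # restriction of the outer every; 'happy' sits in the positive nuclear scope.
    print(adjust([("dog", True), ("∧", True), ("y", True), ("happy", False)],
                 {"dog", "happy"}))
    # ['¬dog′', '∨', 'ẏ', 'happy′']  -- matching the shape of (30)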

3 Anaphora

One of the main motivations for Discourse Representation Theory (see Kamp and Reyle 1993), and its compositional cousin Dynamic Predicate Logic (Groenendijk and Stokhof 1991), is the treatment of pronouns—in particular, cross-sentential anaphora and intra-sentential donkey anaphora. In this section, therefore, we detail our treatment of anaphora and highlight the differences from the DPL approach. (We skip DRT, since DPL was designed as a compositional analog of DRT.)

3.1 Basic intra-sentential anaphora

The CNF system allows for basic intra-sentential anaphora via the mechanism of indexation. Two co-indexed nodes will have the same referent in this system, whether this referent is determined by a constant, a regular variable, or a Skolem term. The following examples illustrate the treatment of typical referential and bound pronouns: (32)

3.2

a. b. c.

everyx dog ∨ loves himselfx ⇒ ¬dog0 (x) ∨ loves0 (x, x) somex˙ dog ∧ loves himselfx˙ ⇒ dog0 (x) ∧ loves0 (x, x) JohnJohn0 loves himselfJohn0 ⇒ loves0 (John0 , John0 )

Discourse (inter-sentential) anaphora

One benefit of the DPL system over previous approaches is that it clearly enumerates the circumstances under which pronouns may co-refer with quantifiers in previous sentences. In the standard approach, pronouns translate as variables, and a formula containing unbound variables represents the set of assignments that verify the formula. For instance the meaning of the sentence iti brayed is the set of all assignments mapping i to an individual that brayed: {g | g(i) ∈ [[brayed]]} In DPL, on the other hand, sentences denote sets of pairs of assignments, which represent input-output contexts. The same sentence in DPL thus denotes the set of all pairs of assignments hg, hi such that g(i) brayed and h is the assignment after evaluating this sentence. Since there are no defined dynamic elements in this sentence, h will be the same as g, yielding: {hg, gi | g(i) ∈ [[brayed]]} Sentences with dynamic elements may change the output assignment, for instance to add a referent. Thus, the DPL meaning for the sentence “a donkeyi brayed”—which contains a dynamic existential quantifier—is as follows: 23

{hg, hi | ∃k : k[i]g ∧ k(i) ∈ [[donkey]] ∧ hk, hi ∈ [[i brayed]]} This is the set of all pairs of assignments hg, hi such that there is a third assignment k differing from the input function g at most in its value for i such that k(i) is a donkey and the pair hk, hi is in the denotation of i brayed. The dynamic existential quantifier has the effect of adding an individual (the one that brayed) to the input assignment g and using the result as the output assignment h. Next, DPL treats discourse sequencing by using the output assignment of one sentence as the input assignment of the subsequent sentence. Therefore, a short discourse such as “A donkey brayed. It was hungry.” will use the output assignment of the first sentence, containing a referent for the braying donkey, as the input assignment for the second sentence, allowing the pronoun it to refer back to this donkey. In the CNF system, on the other hand, there are no explicit quantifiers in the interpretations of sentences. Instead, Skolem variables are implicitly existentially quantified over whole discourses, and universal variables are implicitly universally quantified within the scope of the Skolem variables, as described in Section 1.2. Thus, the same discourse would simply yield the following CNF formulas: (33)

a. b.

ax˙ donkey ∧ brayed ⇒ donkey0 (x) ˙ ∧ brayed0 (x) ˙ 0 itx˙ was hungry ⇒ hungry (x) ˙

The Skolem Variables rule (19) specifies that a DP headed by a, in a positive context, has as index term a Skolem function applied to an argument list. In the case of (33a), the argument list is empty, since there are no outscoping universal variables. Next, the Anaphora rule (17) requires a pronoun to share an index with its antecedent, ensuring that it has the same index as a donkey. Sentence interpretations are implicitly conjoined, and so the final CNF formula (intension) is donkey0 (x) ˙ ∧ brayed0 (x) ˙ ∧ hungry0 (x). ˙ The extension of this formula is determined by existentially quantifying over the zero-arity Skolem function x, ˙ yielding the meaning “there was a donkey that brayed and was hungry.” DPL defines universal quantifiers such as every donkey such that they do not change the assignment for subsequent sentences, correctly ruling out sequences such as the following (although see section 4.2 below for a discussion of exceptions to this rule): (34)

Every donkeyi brayed. *Iti was hungry.

As things stand, the CNF system allows such sequences. The sentence in (34) has the following translation in CNF: (35)

a. b.

everyx donkey ∨ brayed ⇒ ¬donkey0 (x) ∨ brayed0 (x) itx was hungry ⇒ hungry0 (x)

24

Even worse, this yields the following meaning for the discourse, consisting of two CNF clauses: [¬donkey0 (x) ∨ brayed0 (x)] ∧ hungry0 (x).

(36)

In words, “every donkey brayed, and everything was hungry.” Obviously, this is not the correct interpretation even for this aberrant English discourse. A fairly natural way to refine the system is suggested by the following observation. The expression (36) is logically equivalent to the following: [¬donkey0 (x) ∨ brayed0 (x)] ∧ hungry0 (y).

(37)

In other words, there is no connection conveyed between being a (braying) donkey and being hungry; the only contribution that the universal quantifier every donkey makes towards the interpretation of the pronoun it is to make it a universal variable, rather than a Skolem term. To put it another way, the co-indexation is vacuous, in the sense that the interpretation of (34) is logically equivalent to the interpretation of the same discourse without the anaphora:9 (38)

Every donkeyx brayed. Ity was hungry.

We define: (39)

An anaphor α is vacuously bound in a discourse φ if [[φ]] is logically equivalent to [[φ0 ]], where φ0 is identical to φ except that the index of α is replaced with a new universal variable, not appearing elsewhere in the discourse.

We add the following constraint on anaphora. (40)

Prohibition against Vacuous Binding: A pronoun may not be vacuously bound.

The purpose of co-indexing two items is to indicate a certain pattern of coreference between them. It makes sense to disallow such co-indexing when the two items in question are not referentially linked whatsoever. See B¨ uring (2005, Ch. 6) for discussion of similar issues in standard binding theory. The prohibition in (40) also rules out certain illicit cases where the quantifier and pronoun appear in the same sentence. For instance, consider the following sentence and its LF representation: (41)

One of hisx friends likes everyx boy.

9 Note

(i)

a. b.

that the following are not logically equivalent: ∃x . [A(x) ∧ B(x)] [∃x . A(x)] ∧ [∃y . B(y)]

Expression (ia) entails (ib), but not vice versa. Hence, when a pronoun translates as a Skolem term, it is not vacuously bound, even when its antecedent occurs in a separate CNF clause.

25

(42)

S

DPy˙ one of hisx friends

S DPx every

boy

VP ty˙

VP likes

tx

This sequence is not ruled out by any syntactic binding condition (Chomsky 1981), since the coindexed his and every boy do not c-command one another. Furthermore, it does not run afoul of any prohibition against crossover, since the two items are in their surface order. However, such an indexing is ruled out by (40). The meaning derived would be something like the following: (43)

friend-of 0 (y, ˙ x) ∧ [¬boy0 (x) ∨ likes0 (y, ˙ x)]

In other words, there is a someone y˙ who is friends with everyone, and y˙ likes every boy. This is obviously not the correct interpretation for (41). Notice, however, that the anaphor his is vacuously bound: the variable x in the first clause— contributed by the index on the pronoun his—could be replaced by a fresh universal variable without changing the meaning of the discourse. Hence (40) rules out every boy as a possible antecedent for his in this construction.
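The Prohibition against Vacuous Binding lends itself to a mechanical test: replace the pronoun's variable by a fresh one and ask whether the two clause sets are logically equivalent. The following brute-force sketch over small finite models is ours (a real implementation would use a theorem prover, and only unary predicates and universal variables are handled, as in (35)–(36)):

    from itertools import product, chain, combinations

    # Literals are (negated?, predicate, [universal variable names]).
    def true_in(clauses, model, domain):
        variables = sorted({a for c in clauses for _, _, args in c for a in args})
        return all(
            all(any((tuple(g[a] for a in args) in model[p]) != neg
                    for neg, p, args in clause) for clause in clauses)
            for g in (dict(zip(variables, vs))
                      for vs in product(domain, repeat=len(variables))))

    def equivalent(cs1, cs2, predicates, domain=("a", "b")):
        extensions = list(chain.from_iterable(
            combinations([(d,) for d in domain], r) for r in range(len(domain) + 1)))
        for choice in product(extensions, repeat=len(predicates)):
            model = {p: set(e) for p, e in zip(predicates, choice)}
            if true_in(cs1, model, domain) != true_in(cs2, model, domain):
                return False
        return True

    # (34): [¬donkey′(x) ∨ brayed′(x)] ∧ hungry′(x)  vs. the same with a fresh y.
    original = [[(True, "donkey", ["x"]), (False, "brayed", ["x"])],
                [(False, "hungry", ["x"])]]
    renamed  = [[(True, "donkey", ["x"]), (False, "brayed", ["x"])],
                [(False, "hungry", ["y"])]]
    print(equivalent(original, renamed, ["donkey", "brayed", "hungry"]))
    # True -> the pronoun is vacuously bound, so (40) rules the co-indexing out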

3.3

Donkey anaphora

Another benefit to the DPL system is that it allows dynamic elements to change the assignment function in the middle of computing the denotation of a sentence. This comes in handy in interpreting so-called donkey anaphora, pronouns which seem to refer back to existential quantifiers not in a position to bind them in standard predicate logic. The DPL definition of universal quantification allows this by using the output assignment as changed by its restriction as the input assignment to its nuclear scope.10 So, for instance, the classic donkey-anaphora example every farmer who owns a donkey beats it (Geach 1962) is evaluated in such a way that the input assignment for the nuclear scope beats it contains a suitable referent for the pronoun it—a referent “added” to the assignment during the computation of the restriction farmer who owns a donkey. As seen in Section 2.7.3 above, the CNF system handles this type of example without any further machinery. 10 This is a simplification. DPL actually quantifies over all possible output assignments of the restriction and uses each as a potential input assignment for the nuclear scope.

26

3.4

Negation, Disjunction, and Universal Quantification

As seen above, DPL was designed to allow existential quantifiers to effectively extend their scope between two sentences sequenced in a discourse and between the restriction and nuclear scope of a quantifier. In addition, the DPL definition of conjunctive and renders it synonymous with discourse sequencing. Based on examples such as the following, however, the DPL definitions for negation, disjunction, and universally quantified sentences were designed to block existentials from scoping beyond a certain point: (44)

a. John doesn’t own a cari . *Iti ’s in his driveway. b. Every farmer who owns a donkeyi beats iti . *Iti is very stubborn. c. *John owns a cari or iti ’s in his driveway.

In the CNF approach, the Prohibition against Vacuous Binding defined in (40) above correctly rules out the first two cases. The analysis for (44a) goes as follows: (45)

a. b.

not ax˙ car ∧ JohnJohn0 owns tx˙ ∧ Itx˙ is in his driveway [¬car0 (x) ∨ ¬owns0 (John0 , x)] ∧ in-driveway0 (x)

This final formula runs afoul of the Prohibition against Vacuous Binding, since the pronoun it is vacuously bound: if its interpretation x is replaced with a new variable y, the result is logically equivalent to (45b): (46)

[¬car0 (x) ∨ ¬owns0 (John0 , x)] ∧ in-driveway0 (y)

The analysis for (44b) is a little more complex, but yields a very similar result: (47)

a. b.

everyx farmer ∧ who ay˙ donkey ∧ owns ∨ beats ity˙ ∧ ity˙ is stubborn [¬farmer0 (x) ∨ ¬donkey0 (y) ∨ ¬owns0 (x, y) ∨ beats0 (x, y)] ∧ stubborn0 (y)

Once again, the last instance of it is vacuously bound, since the y in stubborn0 (y) could be replaced with a new variable without altering the truth conditions. The Prohibition against Vacous Binding does not rule out (44c). However, examples of this form are not uniformly infelicitous. For instance, consider the following disjunctions, which sound fine despite exhibiting a scoping that is illegal under the rules of DPL: (48)

a. b. c.

John bought a cari , or perhaps he stole iti John bought a cari , or Mary bought iti A dogi ate our dinner, or at least iti ate most of our dinner.

And, in fact, the original example (44c) improves if you make it a more plausible disjunction, such as the following: (49)

I see John owns a Ferrarij , or maybe a rich friend parked itj in his driveway. 27

Under the CNF account, (44c) and (48a–c) all involve pronouns whose antecedents bear Skolem indices, hence they are all predicted to be good. So, what is the difference between cases like (49) that sound good and those such as (44c) that do not? Although we leave this as an open question, we suggest that cases like (44c) may be infelicitous for reasons of discourse coherence. The sentence in (44c) involves the disjunction of two seemingly unrelated statements: one about John owning a car, and the second about a car being in his driveway. To the extent that the sentences can be related by supplying more context, such as in (49), the example improves.

3.5 The Limits of DPL

The strict rules of DPL quickly run into problems in cases quite similar to those given above, as pointed out by Groenendijk and Stokhof (1991) themselves: (50)

a. Either there isn’t a bathroomi here or iti’s in a funny place.
b. Every player chooses a tokeni. Iti goes on square one.

The simplest formulation of DPL predicts these sentences to sound odd, since they involve scoping out of negation, disjunction, and/or a universally quantified sentence; but they are actually entirely acceptable. Groenendijk and Stokhof (1991) propose extensions to accommodate examples like these. By contrast, the CNF system makes the correct predictions without modification. For instance, consider the following analysis for (50a): (51)

a. There isn’t ax˙ bathroom ∧ here or ∨ itx˙ is in a funny place
b. ¬bathroom0(x) ∨ ¬here0(x) ∨ in-funny-place0(x)

Since the existential a bathroom is negated, it translates as a universal variable. The resulting interpretation is the same as the interpretation of “every/any bathroom here is in a funny place.” Similarly, (50b) has the following analysis under the CNF system: (52)

a. [everyx player ∨ [ay˙ token ∧ tx chooses ty˙]] ∧ ity˙ goes on square one
b. [¬player0(x) ∨ [token0(y˙x) ∧ chooses0(x, y˙x)]] ∧ goes-on-sq-one0(y˙x)

Remember for the last clause here that y˙x actually represents the application of the function y˙ to the argument x, and x is implicitly universally bound. Thus goes-on-sq-one0(y˙x) means that for all x, y˙(x) goes on square one. The grouping, and in particular the fact that the second sentence introduces a separate CNF clause, makes for a certain subtlety. The first clause establishes for all players p that y˙(p) is a token. However, for individuals x that are not players, the complete discourse (52b) is satisfied as long as y˙(x) is on square one, regardless of whether y˙(x) is a token or not. Let us call anything that is not a player’s token an “extraneous item.” The discourse is satisfied by a situation in which there are extraneous items on square one, but it is also satisfied if there are no extraneous items on square one, as long as all the players’ tokens are there. (As long as there exists any player p0, we may choose y˙(x) = y˙(p0) for all non-players x.) Hence the discourse asserts only that all players’ tokens are on square one, though it is not contradicted if there happen also to be extraneous items on square one. This does coincide with the meaning of the English sentence.
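To make the remark about extraneous items concrete, the following toy Python model check for (52b) is a sketch under invented assumptions: the individuals, the relations, and the particular Skolem function y˙ are chosen purely for illustration.

```python
individuals = ["p1", "p2", "t1", "t2", "chair"]
player = {"p1", "p2"}
token = {"t1", "t2"}
chooses = {("p1", "t1"), ("p2", "t2")}
on_square_one = {"t1", "t2"}   # only the players' tokens are on square one

# Skolem function y_dot: each player maps to the token they chose; for non-players we are
# free to pick any value, e.g. reuse a player's token (the choice y(x) = y(p0) in the text)
y_dot = {"p1": "t1", "p2": "t2", "t1": "t1", "t2": "t1", "chair": "t1"}

def satisfies(y_dot):
    # clause 1: for all x, not player(x) or [token(y_dot(x)) and chooses(x, y_dot(x))]
    clause1 = all((x not in player) or
                  (y_dot[x] in token and (x, y_dot[x]) in chooses)
                  for x in individuals)
    # clause 2: for all x, goes-on-square-one(y_dot(x))
    clause2 = all(y_dot[x] in on_square_one for x in individuals)
    return clause1 and clause2

print(satisfies(y_dot))   # True: the discourse is satisfied even though nothing but the
                          # players' tokens is on square one
```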

3.6 Cataphora

Cataphora is another phenomenon where the predictions of DPL and the CNF approach diverge.11 DPL admits of no cataphoric binding (besides perhaps accidental co-reference), since dynamic items such as quantifiers only affect the assignments of material after them. However, cataphoric bound pronouns are not uncommon: (53)

a. It1 will (probably never) be used, but an officer in each county has been equipped with a tranquilizer gun1.
b. Everyone in its1 way tends to flee a tiger1.

11 Thanks to David Beaver for pointing this out.

The case in (53a) is arguably a cataphoric case of standard quantifier-pronoun binding. The case in (53b) involves a cataphoric donkey pronoun. In contrast to DPL, the CNF system treats cataphora the same as anaphora. For example, here is the CNF analysis of (53a): (54)

a. It1 will (probably never) be used, but an officer in each county has been equipped with a tranquilizer gun1.
b. not ity˙ will be used, but ∧ [eachc county ∨ [anx˙ [officer ∧ in tc] ∧ ay˙ tranq. gun ∧ tx˙ has been equipped with ty˙]]
c. ¬be-used(y˙c) ∧ [¬county(c) ∨ [officer(x˙c) ∧ in(x˙c, c) ∧ tranq-gun(y˙c) ∧ equipped-with(x˙c, y˙c)]]

Classic Bach-Peters sentences (Bach 1970, Karttunen 1969, 1971) such as (55a) provide particularly complex examples of cataphora. Again, the CNF system as it stands predicts them to be good and assigns the correct interpretation, as shown in (55). For simplicity, we treat the definite determiner as a simple existential, putting the uniqueness presupposition aside. (55)

a. every pilot1 who shot at it2 hit the MIG2 that chased him1
b. everyx pilot ∧ who tx shot at ity˙ ∨ [they˙ MIG ∧ that ty˙ chased himx ∧ [tx hit ty˙]]
c. ¬pilot(x) ∨ ¬shot-at(x, y˙x) ∨ [MIG(y˙x) ∧ chased(y˙x, x) ∧ hit(x, y˙x)]
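As a sanity check on (55c), here is a toy Python model in which every pilot shot at, was chased by, and hit his own MIG; the pilot and MIG names and the Skolem function y˙ are illustrative assumptions, not part of the analysis.

```python
pilots = {"p1", "p2"}
individuals = pilots | {"m1", "m2"}
MIG = {"m1", "m2"}
# Skolem function y_dot: for each pilot, "the MIG that chased him"; values for non-pilots
# are free and chosen arbitrarily
y_dot = {"p1": "m1", "p2": "m2", "m1": "m1", "m2": "m1"}
shot_at = {("p1", "m1"), ("p2", "m2")}
chased = {("m1", "p1"), ("m2", "p2")}
hit = {("p1", "m1"), ("p2", "m2")}

def clause(x):
    # not pilot(x), or x did not shoot at y_dot(x), or y_dot(x) is a MIG that chased x and x hit it
    return ((x not in pilots)
            or ((x, y_dot[x]) not in shot_at)
            or (y_dot[x] in MIG and (y_dot[x], x) in chased and (x, y_dot[x]) in hit))

print(all(clause(x) for x in individuals))   # True in this model: every pilot who shot at
                                             # the MIG that chased him also hit it
```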

3.7 Summary

To summarize, the CNF account handles both standard anaphora and donkey anaphora. A proposed prohibition against vacuous binding accounts for the ill-formedness of certain cases of intrasentential anaphora that are not ruled out by the standard binding conditions or constraints on crossover. The vacuous binding prohibition also eliminates some but not all more complex cases combining negation, disjunction and quantification; and the predictions are in good accordance with human judgments. Moreover, DPL and the CNF account differ in their predictions about the acceptability of cataphora; the facts generally accord with the predictions of the latter. On the whole, the CNF account compares favorably with DPL both on empirical coverage and simplicity.

4 Additional Issues

In this section, we first present a CNF version of a ‘neo-Davidsonian’ analysis of LF. Next, we use this to help capture certain cases of telescoping anaphora (Roberts 1987, 1989). Last, we examine a case where CNF predicts anaphoric relations not actually possible in English.

4.1 Event variables

Many researchers following Davidson (1967) have argued that the traditional account in which transitive verbs are interpreted as two-place predicates does not accommodate verbal modifiers. The neo-Davidsonian account, in which a verb denotes a subclass of situations, is more adequate, though somewhat more verbose (see Higginbotham 1985, 2000, Dowty 1988, Parsons 1990, 2000). We sketch a neo-Davidsonian version of the CNF system, for two reasons. First, it necessitates some modifications to the system; and second, event variables play a role in our account of certain cases of telescoping, discussed below. Let us consider a simple example, “Fido chased Spike quickly.” We assume the following syntactic structure.


(56)
    TP: ∧
        T: ∃e˙
            ∅: 0
            past: Le˙  ⇓  past0(e˙)
        VP: ∧
            ΘP: Ca(e˙)
                θ: La(e˙),Fido0  ⇓  a(e˙) = Fido0
                Fido: 0Fido0
            VP: ∧
                VP: ∧
                    quickly: Le˙  ⇓  quick0(e˙)
                    chased: Pe˙  ⇓  chase0(e˙)
                ΘP: Cp(e˙)
                    θ: Lp(e˙),Spike0  ⇓  p(e˙) = Spike0
                    Spike: 0Spike0

Notice the use of event variables and the new thematic-argument phrases (ΘP). These ΘPs provide a place to attach the semantics of theta-role assignment—a(e˙) for agent and p(e˙) for patient—and they will also play a role in our analysis of telescoping below. These changes require additions to our interpretation rules, as we outline here.

Q-Scope Variant. For consistency with the Q-Restriction rule (12), the tense element T is treated as consisting of an empty determiner-like element with semantic type 0, and an empty literal that introduces the time predicate. A modified version of the Q-Scope rule (13) is needed to combine T with VP. Specifically, Q-Scope as it currently stands requires the non-∃ child (VP) to have no index. This is appropriate when the presence of the index indicates an unsaturated argument, but in the Davidsonian approach, a VP is analogous to an NP in the sense that the index does not represent an unsaturated argument but rather the object denoted by the phrase—in our example, e˙ represents the chasing event denoted by the VP. Here is the new variant of the Q-Scope rule. The child α is restricted to being a (semantic) quantifier that has not undergone QR—i.e., it is limited to being a tense element.

(57)
    [[ [γ:∧ α:∃u β:τu] ]] = [[α]] ∨ [[β]]        [[ [γ:∧ α:∃u β:τu] ]] = [[α]] ∧ [[β]]


Theta-role Assignment. An abstract head is also introduced for theta-role assignment. The nodes labeled θ in (56) are abstract words that translate as equality. They are combined with their DP arguments (Fido, Spike) using Application (10). We therefore require a new rule to combine the resulting ΘP with VP. Like the Modification rule, the Theta-role Assignment rule applies when neither child is of type 0 or quantificational (∀, ∃). Modification applies when one child modifies the other, and Theta-role Assignment applies when one child (β) is a syntactic argument of the other (α). The function θ in the rule represents the theta role assigned to the argument.

(58)
    [[ [γ:∧u α:σu β:τθ(u)] ]] = [[α]] ∧ [[β]]        [[ [γ:∧u α:σu β:τθ(u)] ]] = [[α]] ∨ [[β]]

With these added rules, the CNF system yields the meaning indicated above in (56): there is a past chasing event e˙ whose agent is Fido and whose patient is Spike.
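As a concrete illustration of the meaning just derived, here is a toy Python check of the resulting clause set for “Fido chased Spike quickly”; the event name and the dictionary encoding are illustrative assumptions only.

```python
past, chase, quick = {"e1"}, {"e1"}, {"e1"}
agent = {"e1": "Fido"}
patient = {"e1": "Spike"}

def satisfied(e):
    # past(e) and chase(e) and quick(e) and a(e) = Fido and p(e) = Spike
    return (e in past and e in chase and e in quick
            and agent[e] == "Fido" and patient[e] == "Spike")

e_dot = "e1"              # the zero-place Skolem term introduced by T names a witness event
print(satisfied(e_dot))   # True
```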

4.2 Telescoping

Roberts (1987, 1989) describes a phenomenon she calls telescoping, wherein a universal quantifier in one sentence seems to bind a pronoun in a subsequent sentence: (59)

[Each candidate]i approached the stage. Hei shook the dean’s hand and returned to hisi seat.

The CNF account as developed so far predicts this discourse to be ruled out, on the grounds that it violates the Prohibition against Vacuous Binding: (60)

[¬candidate0 (x) ∨ approached-stage0 (x)] ∧ shook-deans-hand0 (x) ∧ returned-to-seat0 (x)

Before suggesting a solution to this problem, we turn to another type of sentence discussed by Roberts (1989), where an existential below a universal appears to scope across two sentences. We saw an example of this in Section 3.5 above: (61)

Every player chooses [a token]i . Iti goes on the first square.

As shown above, this kind of example is captured straightforwardly in the CNF system. We repeat the analysis here: (62)

a. [everyx player ∨ [ay˙ token ∧ tx chooses ty˙]] ∧ ity˙ goes on square one
b. [¬player0(x) ∨ [token0(y˙x) ∧ chooses0(x, y˙x)]] ∧ goes-on-sq-one0(y˙x)

The Skolem term y˙x is contributed by the quantified DP a token. The pronoun it is coindexed with a token and therefore also contributes the term y˙x to the interpretation of the second sentence. In this case, the binding is not vacuous, and no violation results. So, the CNF account easily handles one of Roberts’ telescoping cases, but what about the other? Is there any way to capture the binding in (59) with a Skolem term, similar to that in (62b)? The event variables in the Davidsonian analysis of section 4.1 provide one possibility. Let us consider the Davidsonian analysis of the first sentence of (59). For clarity, we show only the indices, not the full semantic types (with the exception of the θ node): (63)

    S
        DPx: each candidate
        TP
            Te˙x
            VPe˙x
                ΘPa(e˙x)
                    θ: La(e˙x),x  ⇓  a(e˙x) = x
                    tx
                VPe˙x: approached the stage

The translation of (63) is: (64)

¬candidate0 (x) ∨ [a(e˙ x ) = x ∧ approach-stage0 (e˙ x )]

Note that, under the Davidsonian analysis, the predicate approach-stage0 is true of events in which someone approaches the stage. The Skolem variable e˙ x represents the stage-approaching event for each candidate x. The innovation we require to enable telescoping in this case is to assume that not only the subject trace, but also the ΘP, are potential antecedents for pronouns in subsequent sentences. Consider the continuation “he shook the dean’s hand and returned to his seat.” Choosing the trace as antecedent of “he” violates Vacuous Binding, but choosing the ΘP does not. If we choose ΘP as antecedent, the pronoun’s index is the Skolem term a(e˙ x ), analogous to the Skolem term y˙ x in (62b). A subtlety does arise. If we take the second clause (“he shook the dean’s hand”) to introduce an unrelated event variable s, ˙ we get the interpretation: (65)

[¬candidate0(x) ∨ [a(e˙x) = x ∧ approach-stage0(e˙x)]] ∧ a(s˙) = a(e˙x) ∧ shake-hand0(s˙)

Because s˙ names a single event, a(s˙) is a single individual a0 (the agent of the event). Under this interpretation, the second clause states that the agent of e˙x equals a0, for all x, entailing that all stage-approaching events have the same agent a0. It does not seem necessary to rule this interpretation out semantically, inasmuch as it flouts Gricean principles (Grice 1975): it is deceptive to say “every candidate” instead of “the only candidate” in a situation where there is only one candidate.12 If one entertains the possibility that the speaker is flouting Gricean principles in this manner, the interpretation does seem to be accessible: every candidate approached the stage, and he (the one and only candidate) shook the dean’s hand, etc.

12 This may also be viewed as a consequence of Heim’s (1991) ‘Maximize Presupposition’ principle.

However, we would also like to allow an interpretation in which the event variable of the second clause is a function of the event of the first clause—what we might call an anaphoric tense (cf. Partee 1973). Specifically, let “s(e˙)” denote the event that is the narrative successor of event e˙. We would like to allow the T node in the second sentence to bear the index s(e˙), instead of a freshly-allocated Skolem term—and indeed, assigning the index s(e˙) to the T node expresses the narrative succession discourse relation. This provides the correct interpretation. The event variable for the second sentence, “he shook the dean’s hand,” is s(e˙x), and the pronoun “he” receives the index a(e˙x) by taking the subject ΘP of the first sentence as antecedent. The CNF for the two sentences together comes out as: (66)

[¬candidate0 (x) ∨ [a(e˙ x ) = x ∧ approach-stage0 (e˙ x )]] ∧ a(s(e˙ x )) = a(e˙ x ) ∧ shake-hand0 (s(e˙ x ))
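To make the role of the narrative successor function concrete, the following toy Python check of (66) uses invented event names, an invented successor map s, and an invented agent map a; it is a sketch of the intended interpretation, not part of the formal proposal.

```python
candidates = {"c1", "c2"}
individuals = candidates | {"dean"}
# Skolem function e_dot: each candidate maps to their stage-approaching event; the value
# for the non-candidate is free and simply reuses a candidate's event
e_dot = {"c1": "app1", "c2": "app2", "dean": "app1"}
s = {"app1": "shake1", "app2": "shake2"}                           # narrative successor
a = {"app1": "c1", "app2": "c2", "shake1": "c1", "shake2": "c2"}   # agent of each event
approach_stage = {"app1", "app2"}
shake_hand = {"shake1", "shake2"}

def clause(x):
    first = (x not in candidates) or (a[e_dot[x]] == x and e_dot[x] in approach_stage)
    second = a[s[e_dot[x]]] == a[e_dot[x]] and s[e_dot[x]] in shake_hand
    return first and second

print(all(clause(x) for x in individuals))   # True: each candidate approached the stage,
                                             # and that same candidate then shook hands
```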

The modification needed to permit anaphoric tenses actually concerns the index-assignment rules of section 2.5, rather than the rules of interpretation. The Skolem Variables rule (19) needs to be revised to permit Skolem terms to be assigned to T nodes (and not only to DP nodes), while also permitting a T node optionally to receive the index s(e˙), where e˙ is the event variable of the previous sentence. A more formal statement would be premature at this point, in the absence of a systematic study of the referential possibilities for T nodes. We do note one issue that bears on the statement of the Skolem Variables rule. Consider a discourse such as (67): (67)

a. Each player1 approaches the board.
b. He1 chooses a token.

Suppose each player receives the index x. The DP a token in the second sentence is assigned a Skolem variable, y˙, and since a token is not outscoped by each player, it would appear that y˙ has no arguments. In the resulting reading, there is only one token; every player chooses the same token. Empirically, this is clearly incorrect. However, under the Davidsonian analysis, the event variable of the first sentence is e˙x, and the event variable of the second sentence, if anaphoric, is s(e˙x). If a token scopes below TP in the second sentence, then the x in the event variable s(e˙x) is an outscoping universal variable, and the index for a token is y˙x, correctly yielding a reading in which each player chooses a different token.

When we defined the Skolem Variables rule, we noted that its statement was slightly nonstandard in that the argument list of a Skolem function consisted of all universal variables occurring in the index terms of outscoping quantifiers. Under the standard definition, the argument list would include only universal variables that are themselves index terms of outscoping quantifiers. In all previous cases, the two definitions have been equivalent, but for (67) they diverge, and the standard definition would yield the wrong reading.

Finally, some comments on the postulated narrative successor function s(·) are in order. We emphasize that it is not merely an expedient way of converting a universal variable into something that behaves like a Skolem function. Empirically, when the discourse relation is something other than narrative succession, telescoping is usually unavailable. Consider the following sentence and several potential next sentences (after Fodor and Sag 1982): (68)

Each of the students was accused of cheating on the exam.

(69)

a. He was reprimanded by the dean.
b. #He has a Ph.D. in astrophysics.
c. #He was failing the course and wanted a better grade.

The continuation in (69a) is a valid case of telescoping, but (69b) and (69c) sound odd. Wang et al. (2006) discuss such cases at length and conclude that the discourse relation between two sentences is crucial in determining whether they will support telescoping. A Narration relation holds between (68) and (69a), but a Background or Explanation relation holds between (68) and (69c). Possibly, the Background and Explanation discourse relations fail to identify a unique successor situation, hence fail to provide an event like s(e˙x) in (66) that enables the telescoping interpretation. To complicate matters, Roberts (1989) points out that (69b) sounds fine as successor to a different sentence, to which it bears a different discourse relation: (70)

Each candidate for the space mission meets all our requirements. He has a Ph.D. in astrophysics and extensive prior flight experience.

In this case, the discourse relation is Elaboration, which, like Narration, appears to license telescoping. Perhaps what Narration and Elaboration have in common is that they involve a single large event divided into subevents. Unfortunately, it is unclear whether that genuinely sets them apart from other discourse relations, and whether it provides a compelling generalization of the successor-function mechanism that correctly discriminates between cases that permit telescoping and those that do not. We leave this as an open question.


4.3 Additional constraints and open questions

Cases remain in which the CNF system is overly permissive, allowing anaphoric relations that are empirically unavailable. It appears that two separate constraints are required. We provide tentative formulations, but we also flag significant questions that remain. Prohibition on vacuous Skolem binding. When a Skolem variable appears in a disjunction, but only in one disjunct, one cannot refer back to it using a pronoun. Consider the following discourse: (71)

a. Either Mary got a question1 wrong or I won my bet.
b. [question0(q˙) ∧ got-wrong0(Mary0, q˙)] ∨ [bet0(b˙) ∧ won0(I0, b˙)]

(72)

a. *It1 was (probably) question 14.
b. q˙ = question-140

The sentence (72) attempts to refer back to the question that Mary got wrong, but it just ends up sounding odd. Note, though, that (71) does not in fact assert that such a question exists, giving a natural explanation for why (72) is unacceptable.

More technically, define a vacuous Skolem function to be one that is undefined everywhere. That is, a one-place function y˙ is vacuous if y˙(x) = ⊥ for all x, and a zero-place function y˙ is vacuous if y˙ = ⊥. There exist models and assignments for (71b) in which the Skolem function q˙ is vacuous. In particular, in models in which I won my bet, (71b) is satisfied even if q˙ is vacuous. Under these conditions, we define the Skolem variable q˙ to be potentially vacuous. We propose to exclude the anaphora in (72) by extending the prohibition on vacuous binding to exclude binding in which the index term of the antecedent is a potentially vacuous Skolem variable.13 Most of the cases we have seen so far will not run afoul of this new prohibition, since they introduce Skolem variables in literals which are then connected to later clauses via conjunction. Due to the nature of conjunction, previous clauses must be true and hence the Skolem terms in them must be non-vacuous. The prohibition does not always prevent Skolems in disjunctions from acting as antecedents. For instance, contrast (71) with (50a), repeated here as (73), where the Skolem variable appears in both preceding disjuncts: (73)

a. Either there isn’t a bathroom1 here or it1’s in a funny place.
b. ¬bathroom0(b˙) ∨ ¬here0(b˙) ∨ in-funny-place0(b˙)

In (73b), the first English clause (“there isn’t a bathroom here”) translates as the disjunction ¬bathroom0(b˙) ∨ ¬here0(b˙), but in this case, the disjunction is not satisfied under any model if b˙ is vacuous: if b˙ is vacuous, then the value of ¬bathroom0(b˙) ∨ ¬here0(b˙) is undefined. Since b˙ cannot be vacuous, it is suitable as an antecedent index—that is, the pronoun it in the second English clause (“it’s in a funny place”) can refer back to the bathroom b˙.14

13 Notice that a follow-up statement that acknowledges this potential vacuity actually sounds much better: (i) ?If I lost my bet, it1 was probably question 14.

14 Similarly, example (49) above continues to be admitted. The context “I see John owns a Ferrari” cannot be satisfied if the Skolem variable for “a Ferrari” is vacuous; hence binding in the following phrase “or a rich friend parked it in his driveway” is possible. In fact, this whole sentence also licenses binding of “a Ferrari” in a following sentence, such as “It’s blocking my way out!”
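The contrast between (71b), where q˙ is potentially vacuous, and (73b), where b˙ is not, can be sketched computationally. In the Python fragment below, the treatment of undefined terms (any literal or clause mentioning ⊥ counts as unsatisfied) is our own simplification of the behavior described above, and the witness models are invented for illustration.

```python
BOT = None   # stands for the value of an everywhere-undefined Skolem function (⊥)

def defined(*terms):
    return BOT not in terms

def sat_71b(question, got_wrong, bet, won, q_dot, b_dot):
    # [question(q_dot) and got-wrong(Mary, q_dot)] or [bet(b_dot) and won(I, b_dot)]
    left = defined(q_dot) and q_dot in question and ("Mary", q_dot) in got_wrong
    right = defined(b_dot) and b_dot in bet and ("I", b_dot) in won
    return left or right

def sat_73b(bathroom, here, funny_place, b_dot):
    # not bathroom(b_dot), or not here(b_dot), or in-funny-place(b_dot)
    if not defined(b_dot):
        return False          # the whole clause is undefined, hence not satisfied
    return (b_dot not in bathroom) or (b_dot not in here) or (b_dot in funny_place)

# (71b): a model in which I won my bet satisfies the formula even with q_dot undefined,
# so q_dot is potentially vacuous
print(sat_71b(set(), set(), {"d1"}, {("I", "d1")}, BOT, "d1"))    # True

# (73b): with b_dot undefined the clause is unsatisfied; indeed this holds in every model,
# so b_dot is never vacuous and remains available as an antecedent
print(sat_73b({"d1"}, {"d1"}, set(), BOT))                        # False
```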

In all the examples we have considered so far, Skolem variables arise from existential determiners. It is also possible to generate Skolem variables from universal determiners, by placing them in a negative context. In such cases, attempts to refer back to the Skolem variable fail spectacularly:

(74)
a. **Every player who holds every suit1 eventually gets it1.
b. ¬player0(p) ∨ [suit0(s˙p) ∧ ¬hold0(p, s˙p)] ∨ gets0(p, s˙p)

(75)
a. Every girl that answered every1 question was rewarded. **It1 wasn’t even a trick question, usually.
b. [¬girl0(x) ∨ [question0(y˙x) ∧ ¬answered0(x, y˙x)] ∨ was-rewarded(x)] ∧ ¬trick-question0(y˙x)

The profound unacceptability of (74) and (75) suggests that the proposed prohibition against vacuous Skolem binding is not the whole story. We will turn next to a second prohibition, against a universal quantifier in a negative context serving as an antecedent for a pronoun. If negative contexts include the restrictions of universals, then (74) and (75) violate both prohibitions, giving a plausible reason why they should sound so bad.

Negative universals cannot be antecedents. An every phrase in the scope of a negative appears to be a bad antecedent, independently of anything else. Consider the following contrast:

(76)
a. Some runner1 must lose. He1 must be satisfied with doing his best.
b. runner0(r˙) ∧ loses0(r˙) ∧ must-be-satisfied0(r˙)

(77)
a. Not every runner1 wins. *He1 must be satisfied with doing his best.
b. runner0(r˙) ∧ ¬wins0(r˙) ∧ must-be-satisfied0(r˙)

Example (77) violates neither the prohibition on vacuous binding nor the prohibition on binding by vacuous Skolem variables. The only obvious difference between (76) and (77) is the explicit “not” in (77). The presence of “not” appears to trigger a violation even if we construct a doubly negative (hence semantically positive) context:


(78)
a. Every player who is missing a suit1 must wait till he gets it1
b. ¬player0(p) ∨ ¬suit0(s) ∨ ¬missing0(p, s) ∨ wait-for0(p, s)

(79)
a. *Every player who does not hold every suit1 must wait till he gets it1
b. ¬player0(p) ∨ ¬suit0(s) ∨ holds0(p, s) ∨ wait-for0(p, s)

Examples (76)–(77) contain Skolem variables in the translation and examples (78)–(79) contain universals, so the prohibition evidently does not depend on the type of variable. In fact, on the assumption that not winning is the same as losing, (76) and (77) are logically equivalent, and assuming that not missing a suit is the same as holding it, (78) and (79) are logically equivalent. Presumably, then, whatever distinguishes the good and bad examples is syntactic. A possibility that suggests itself is that every in the scope of not is syntactically plural. Indeed, changing the pronoun from “it” to “them” makes (79) good:15 (80)

Every player who does not hold every suit1 must wait till he gets them1 .

However, it is questionable whether (79b) is precisely the correct interpretation for (80). The pronoun them in (80) appears to refer explicitly to the quantifier’s restriction set, the set of suits. Unfortunately, exploring the question properly would require a general treatment of pluralities within the CNF approach, and would take us beyond the scope of this paper. We do append one observation. Anaphoric reference to the set reading of a universal quantifier appears to be an obviative strategy, in the sense that the set reading is available only when the normal distributive reading is unavailable. In (80), the distributive reading violates the prohibition against taking “not every” as antecedent. There are also cases in which the distributive reading violates the prohibition against vacuous binding, and the set reading is available as an alternative: (81)

a. Every girl1 in our class works hard. *She1 always turns in assignments on time.
b. Every girl1 in our class works hard. They1 always turn in assignments on time.

When the distributive reading is available, it appears generally to preclude the set reading:

(82)
a. Every boy1 in our class does his1 work carefully.
b. *Every boy1 in our class does their1 work carefully.

15 A similar change improves (74): (i) Every player who holds every suit eventually must use them all.


5 Conclusion

We have shown how a simple homomorphism can map LF structures into expressions in conjunctive normal form and how such expressions can serve as a full semantic metalanguage. Furthermore, we have shown that the predictions of such a CNF system accord with human judgments, in most cases more closely than those of competing systems such as dynamic predicate logic, over a range of phenomena including donkey anaphora and telescoping. We have left some significant issues to future work. In particular, we have not provided a method for interpreting inherently second-order phenomena, such as pluralities or generalized quantifiers (Barwise and Cooper 1981), using CNF. We also left some open questions in our discussion of telescoping (section 4.2) and in the additional constraints of section 4.3. On balance, though, using CNF as metalanguage yields a simple system capable of explaining a large range of empirical phenomena. Furthermore, the homomorphism from LF trees to CNF allows for simple bi-directional translation to and from CNF, facilitating the connection between natural-language parsers and logical inference systems.

References

Abney, S.: 2012, Reversibility of interpretation by direct translation to CNF. Unpublished manuscript.
Bach, E.: 1970, Problominalization, Linguistic Inquiry 1, 121–122.
Barwise, J. and Cooper, R.: 1981, Generalized quantifiers and natural language, Linguistics and Philosophy 4(2), 159–219.
Büring, D.: 2005, Binding Theory, Cambridge University Press.
Chomsky, N.: 1981, Lectures on Government and Binding.
Davidson, D.: 1967, The logical form of action sentences, Essays on Actions and Events 5, 105–148.
Dowty, D.: 1988, On the Semantic Content of the Notion of Thematic Role, Vol. 2, Kluwer, Dordrecht, The Netherlands, pp. 69–129.
Fodor, J. and Sag, I.: 1982, Referential and quantificational indefinites, Linguistics and Philosophy 5(3), 355–398.
Geach, P. T.: 1962, Reference and Generality: An Examination of Some Medieval and Modern Theories, Cornell University Press.
Gödel, K.: 1939, Consistency-proof for the generalized continuum-hypothesis, Proceedings of the National Academy of Sciences of the United States of America 25(4), 220.
Grice, H. P.: 1975, Logic and conversation, in P. Cole and J. Morgan (eds), Syntax and Semantics 3: Speech Acts, New York: Academic Press, pp. 41–58.
Groenendijk, J. and Stokhof, M.: 1991, Dynamic predicate logic, Linguistics and Philosophy 14(1), 39–100.
Heim, I.: 1991, Artikel und Definitheit, Semantik: Ein internationales Handbuch der zeitgenössischen Forschung, pp. 487–535.
Herbrand, J.: 1930, Recherches sur la théorie de la démonstration, number 33 in Travaux de la Société des Sciences et des Lettres de Varsovie, Classe III, Sciences Mathématiques et Physiques, Nakl. Tow. Naukowego Warszawskiego.
Higginbotham, J.: 1985, On semantics, Linguistic Inquiry 16(4), 547–593.
Higginbotham, J.: 2000, On Events in Linguistic Semantics, Oxford: Oxford University Press, pp. 49–79.
Kamp, H.: 1981, A theory of truth and semantic representation, Formal Semantics, pp. 189–222.
Kamp, H. and Reyle, U.: 1993, From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory, Vol. 42, Kluwer Academic, Dordrecht, The Netherlands.
Karttunen, L.: 1969, Pronouns and variables, Fifth Regional Meeting of the Chicago Linguistic Society, pp. 108–115.
Karttunen, L.: 1971, Definite descriptions with crossing coreference: A study of the Bach-Peters paradox, Foundations of Language 7(2), 157–182.
McCune, W.: 2003, Mace4 reference manual and guide, Technical Memorandum ANL/MCS-TM-264, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL.
Montague, R.: 1970a, English as a formal language, Edizioni di Communità, Milan, pp. 189–224.
Montague, R.: 1970b, Universal grammar, Theoria 36(3), 373–398.
Parsons, T.: 1990, Events in the Semantics of English, MIT Press, Cambridge, MA.
Parsons, T.: 2000, Underlying states and time travel, Oxford: Oxford University Press, pp. 81–93.
Partee, B.: 1973, Some structural analogies between tenses and pronouns in English, The Journal of Philosophy, pp. 601–609.
Roberts, C.: 1987, Modal subordination, anaphora, and distributivity, PhD thesis, University of Massachusetts at Amherst.
Roberts, C.: 1989, Modal subordination and pronominal anaphora in discourse, Linguistics and Philosophy 12(6), 683–721.
Robinson, J.: 1965, A machine-oriented logic based on the resolution principle, Journal of the ACM 12(1), 23–41.
Russell, S. and Norvig, P.: 1995, Artificial Intelligence: A Modern Approach, Prentice Hall Series in Artificial Intelligence, Prentice Hall.
Skolem, T.: 1934, Über die Nicht-charakterisierbarkeit der Zahlenreihe mittels endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen, Fundamenta Mathematicae 23, 150–161.
Wang, L., McCready, E. and Asher, N.: 2006, Information dependency in quantificational subordination, Where Semantics Meets Pragmatics, pp. 267–306.
Whitehead, A. N. and Russell, B.: 1910, Principia Mathematica, Vol. 1, Cambridge University Press.
