ENCAPSULATION OF STATE WITH MONAD TRANSFORMERS

Steven E. Ganz

Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Department of Computer Science Indiana University December 2006

Accepted by the Graduate Faculty, Indiana University, in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Daniel P. Friedman, Ph.D.

Christopher T. Haynes, Ph.D.

David Charles McCarty, Ph.D.

Paul W. Purdom, Jr., Ph.D.

December 8, 2006

© 2006

Steven E. Ganz ALL RIGHTS RESERVED

To my parents

Acknowledgements

This work was made possible by many people. Of course, any remaining errors are entirely my own responsibility. I would first like to thank my advisor, Dan Friedman, who had a large influence on the form of this work. The tutorial viewpoint that is maintained throughout is a result of his urging. This is clearly reflected in the incremental exposition. The ‘Categorical Interlude’ sections were also his idea. His emphasis on implementation led me to develop a working model of the enhanced languages. Typing derivations were produced by a home-grown logic system written in the Scheme programming language; reductions and translations operated over these derivations. This model mutually reinforced and was reinforced by the soundness proofs à la the Curry-Howard isomorphism. I thank him, also, for his painstaking editing of the final draft. I would also like to thank my research committee as a whole, for their effort in understanding what this work aims to accomplish and for generating thoughtful questions. This work has benefited from discussions with several experts in the field. I thank Amr Sabry, for discussions early on that led to the selection of this topic. I thank Eugenio Moggi, for some advice which both improved the presentation and led to finding a few errors, and for suggestions on the handling of polymorphism. I thank Phil Wadler, for suggestions that led to simplification of the object language. I thank Esfandiar Haghverdi, for discussions related to presenting the categorical foundations of this work. I also thank my employers at Elo Systems for hiring me while knowing that my attention would be diverted to complete this dissertation. Finally I thank my parents, to whom this work is dedicated, for their encouragement on this project, which, whether gentle or firm, I always knew had my best interests at heart.

Abstract We relate the type-and-effect system of Tofte and Talpin and monadic systems while handling the permissive notion of encapsulation and various other features of their system. In particular, we translate from an object language that allows interaction between computations at different regions to a language based on the state monad transformer. The syntactic classes of the languages are indexed respectively by sequences of region indicators and by the depth within evaluation constructs. The fundamental difference between a monadic programming language and a more traditional programming language with encapsulated effects is that the monadic language uses relative addressing of regions while the traditional language uses absolute addressing of regions. This relative addressing is inherent in the return and run constructs and there’s no need for region identifiers to indicate a region for an effectful operation. Even memory addresses are free of region names, so that the representation is truly modular. This approach carries a restriction against upward (inward) references that gives a sequence-of-trees structure to the regions of a program, and thus to the store of such a program with state effects. We also define effect-annotated monad transformers and make use of them in presenting a type system that uses tree-structured store types and is equally free of region identifiers. We show this type system to be sound with respect to the operational behavior. Semantics-preserving translations from a direct-style effect system are provided. The reduction semantics and typing judgments are also indexed, giving them a precise characterization in terms of the syntactic classes. Under our restriction against upward references, the tree structure of monadic programs carries information not available to direct-style programs with a linear store — namely that certain regions aren’t accessible. 
This difference cannot be made up by the translation, so the earlier one translates to monadic style, the more efficient the resulting program. We demonstrate support for various extensions, including recursion, polymorphism, eager deallocation of regions that departs from the stack regime, and subtyping to allow live pointers to be viewed as dangling ones. Finally, we briefly consider region-polymorphic functions and sketch an incremental implementation supporting computational reflection.

Contents

Introduction
  1. Historical Perspective
  2. Overview of Results
  3. Methodology

Part I. Modelling State With Monads

Categorical Interlude: Monads

Chapter 1. Source Languages
  1. Object Language Statics
  2. Monadic Language Statics
  3. Translation

Chapter 2. Intermediate Languages
  1. Object Language
  2. Monadic Language
  3. Translation

Part II. Modelling Encapsulation of State With Monad Transformers

Categorical Interlude: Monad Transformers

Chapter 3. Source Languages
  1. Object Language Statics
  2. Monadic Language Statics
  3. Translation

Chapter 4. Intermediate Languages
  1. Object Language
  2. Monadic Language
  3. Translation

Part III. Enhancing the Languages

Categorical Interlude: Strong Monads, Fixpoints, and iml-categories

Chapter 5. Source Languages
  1. Object Language Statics
  2. Monadic Language Statics
  3. Translation

Chapter 6. Intermediate Languages
  1. Object Language
  2. Monadic Language
  3. Translation

Conclusion and Future Work
  1. Allowing Actions on Deallocated Regions
  2. Subtyping
  3. Region Polymorphism
  4. Unrestricted Store
  5. Other Effects
  6. Incremental Implementation and Monadic Reflection

Bibliography

Introduction

1. Historical Perspective

In this work, we contribute to the categorical semantics of programming languages by relating effect systems to monads using monad transformers. We do this in a manner that includes encapsulation, an essential feature of effect systems. Let us define our terms, to help explain what that means. Programming language semantics is the endeavor to provide unambiguous descriptions of programming languages, i.e., unambiguous statements of what any particular program means. An effect is anything that happens when a program runs, other than the evaluation of a mathematical function. Mathematical functions always return exactly one value for any input, so effects include nontermination and nondeterminism. They also include accessing memory, handling asynchronous events, spawning new threads of computation, and jumping between arbitrary program points. What counts as an effect is relative to our level of analysis; at a low enough level practically anything that happens during program execution is a complex series of effects. For example, assigning values to local variables and passing arguments to procedures requires updating the stack, but following standard conventions we consider these to be functional and concern ourselves with heap access effects. Effects are often defined with respect to a region, which corresponds to an area of memory, a scheduler, etc. Type systems allow one to statically derive information about any value that might result from evaluating an expression. Effect systems are extensions of type systems that derive more information. Effect systems predict the effects that can take place at various regions during a program run. This is in contrast to a trace, which collects the actions that actually occur during the evaluation of an expression. Our predictions may overestimate the actual effect; otherwise, such a calculation is clearly undecidable. 
Encapsulation is that aspect of effectful systems that guarantees that a particular effect, or any effect at a particular region, is constrained to operate over a statically determinable portion of a program. Encapsulation schemes are differentiated by their decisions as to what forms of interaction to allow between regions. We aim to allow considerable interaction. We model encapsulation of effects using monad transformers from category theory. Category theory is a branch of mathematics that abstracts over various parts of mathematics and has been found to be valuable in representing the meanings of computer programs. Monads are a category-theoretic construct that has been found
to be very useful in modelling effects in programming languages [43, 44]. Monad transformers are functions that incrementally build more complex monads from simpler ones. We are interested in the monadic semantics of effects not only for its own sake, but because monads allow effects to be rendered efficiently in a functional language [62, 26, 28, 35]. This is because monads abstract the way in which computations that perform effects are combined, so that it is possible to make guarantees about how they operate. The state monad, for example, represents computations as state transformers that additionally return a value. Because we can guarantee that the state is single-threaded, i.e., that no more than one copy exists at any point in time, we can avoid passing it around at all and simply resort to imperative updates at a lower level. We are interested in extending monadic semantics to handle encapsulation because effects make programs difficult to analyze and thus difficult to optimize, even in a monadic setting. A functional program with effects implemented in terms of monads will allow for various functional optimizations, but not for optimizations that might alter the order of effectful operations. If we can guarantee that a portion of a program does not use a particular effect, then we may potentially have an opportunity to perform more optimizations. Encapsulation provides precisely that guarantee. Although there have been various other presentations of effect systems [39, 55, 56, 54], we use as our standard that of Tofte and Talpin [59]. Their system is notable both for the range of values that can be allocated in encapsulated regions and for the degree of interaction between regions that is permitted, i.e., the permeability of region boundaries. Specifically, it accounts for the allocation of procedures as first-class values. The form of encapsulation provided by Tofte and Talpin is a “stack-of-regions”. 
The syntactic form letregion is thus used to introduce a lexically-scoped region variable bound to a region of memory for use within a piece of code. Effectful operators take a region argument so that outer regions may be accessed via visible region variables. Regions may escape the encapsulation construct, but their associated effects will also escape. Addresses become dangling pointers upon escaping the encapsulation construct for the region to which they refer. Tofte and Talpin are concerned with region inference [39, 30, 58], in which code is written without any encapsulation construct and the region boundaries are applied via translation. Evaluation is described via a big-step semantics. Safety is guaranteed not for the target language with the encapsulation construct in general, but only for the translation. Calcagno, Helsen, and Thiemann [5] recast Tofte and Talpin’s effect system without region inference under a reduction semantics. This was done at the cost of dropping support for dangling
pointers, recursion, and polymorphism. Recursion and both implicit (Curry-style) type and explicit (Church-style) region polymorphism were considered by Helsen and Thiemann in an earlier work [22], but only for a stateless semantics. Calcagno et al. provide a type soundness proof that a program will not go wrong if its derived effect is free of region variables and deallocated regions. Programs such as the following, which dereferences a cell after its region is deallocated, must be precluded:

    deref letregion ρ in @ ρ ⟨ref unit⟩

Other researchers have considered generalizations that allow regions to consist of arbitrary collections of program points where particular effects occur [47]. Crary, Walker, and Morrisett present such a system in their CPS-based Calculus of Capabilities [6]. Henglein, Makholm, and Niss present a similar system in direct style [23]. It is unclear how monads can be used to model such approaches. We optimize Tofte and Talpin’s “stack-of-regions” somewhat as described in Section 2.7 below. Our optimization provides efficient region reclamation while maintaining a lexical extent of regions. Previous attempts to formally relate monads and effect systems have either ignored encapsulation in the effect system [64, 16, 65, 3, 61] or used overly restrictive notions of encapsulation [37, 50]. We demonstrate a monadic interpretation of encapsulation for an object language more similar to the target language of Tofte and Talpin,¹ but with a reduction semantics similar to that of Calcagno et al. Our object language is capable of expressing monadic operations on any of multiple nested regions, the escape of dangling pointers from the encapsulation construct, and storable functions. We achieve this level of expressivity through the use of monad transformers [43, 38, 25, 2] in our target language. Our object language differs from that of Calcagno et al. in that we track region indicators through indexes.
This allows for a more traditional statement of type soundness and resolves several problems with their proof. We separate the notions of type and effect soundness. The former states that any reduction sequence of a well-typed program will terminate with a well-typed program value or not at all. The latter states that no trace of a reduction sequence of a well-typed expression will contain actions not authorized by the expression’s effect. In fact, we need only prove these properties for programs and expressions reachable through evaluation.

¹ Kagawa [31] suggests an alternative scheme under which the region-defining construct may declare a region variable bound to the next outermost region, and operations may be performed either implicitly in the innermost region, or explicitly in an outer region. In spite of the title and notation, Kagawa presents no language with return and bind or other monadic operators. We choose to use the system of Tofte and Talpin.

Because much of the existing work on effect systems, encapsulation, and the relation between effect
systems and monads has been done in terms of state, we also work with that effect. As with much related work [64, 50, 5], we do not treat Tofte and Talpin’s effect variables, and are not concerned with region inference. Explicit declaration of regions may be tedious for a programmer, but Tofte and Talpin demonstrate that regions may be inferred by a compiler. There are several natural ways to view encapsulation in terms of monads: Single Monad: As a somewhat trivial solution, one can use a single monad (or monad transformer) to represent all regions. Computations are typed as the application of a single monad to the type of value computed and encapsulation is treated as an operation within the monad. Monad per Region: Computations are typed as the iterative application of the monads corresponding to each lexically defined region to the type of value computed. Encapsulation is provided as a run operator, which takes one out of a monad. Monad Transformer per Region: Once again, computations are typed as the application of a single monad to the type of value computed, but this monad is enhanced incrementally. The monad applied to the type of value computed is built as the iterative application of the monad transformers corresponding to each lexically defined region to the identity monad. An incremental return operator enhances monadic values to represent a trivial computation on an additional region. Encapsulation is provided as an incremental run operator, which takes one to a less complex monadic value. Eventually, it yields a value that is typed as an application of the identity monad to the type of value computed, which is, of course, isomorphic to the type of value computed. The single monad used by the first approach is intuitively a fixed point of the corresponding monad transformer, i.e., it is the monad that would cease to be enhanced by subsequent applications of the monad transformer. We demonstrate with the state monad. 
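The three typings listed above can be made concrete with a minimal Haskell sketch. Everything in it is an illustrative assumption rather than the formal system developed in this work: the names Store, State, StateT, Single, incReturn, incRun, flatten, and sample are hypothetical, a region store is modelled as a plain list of integers, and only two nested regions are shown.

```haskell
import Data.Functor.Identity (Identity (..))

-- Hypothetical model of the store of a single region.
type Store = [Int]

-- Plain state monad: a state transformer that also returns a value.
newtype State a = State { runState :: Store -> (a, Store) }

-- State monad transformer: the underlying monad t wraps the result pair.
newtype StateT t a = StateT { runStateT :: Store -> t (a, Store) }

-- Monad per Region, two regions: the monad is applied iteratively.
type PerRegion a = State (State a)

-- Monad Transformer per Region, two regions: transformers applied to Identity.
type PerTransformer a = StateT (StateT Identity) a

-- Single Monad: one monad threads the stores of both regions at once.
newtype Single a = Single { runSingle :: (Store, Store) -> (a, (Store, Store)) }

-- Incremental return: a trivial computation on one additional region.
incReturn :: Monad t => t a -> StateT t a
incReturn m = StateT $ \s -> fmap (\a -> (a, s)) m

-- Incremental run: supply a fresh (empty) store and discard it afterwards.
incRun :: Monad t => StateT t a -> t a
incRun m = fmap fst (runStateT m [])

-- Uncurry and reassociate the pairs to reach the single-monad form.
-- The pair of stores is ordered (inner, outer).
flatten :: PerTransformer a -> Single a
flatten m = Single $ \(inner, outer) ->
  let ((a, inner'), outer') = runIdentity (runStateT (runStateT m inner) outer)
  in (a, (inner', outer'))

-- A sample two-region computation: push 0 onto the inner store and
-- return the combined size of both initial stores.
sample :: PerTransformer Int
sample = StateT $ \inner -> StateT $ \outer ->
  Identity ((length inner + length outer, 0 : inner), outer)
```

Under these assumptions, runSingle (flatten sample) ([1], [2, 3]) evaluates to (3, ([0, 1], [2, 3])), and incRun undoes incReturn on any computation of the underlying monad.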
A computation represented with the state monad will be typed in a functional language as S ⇒ E × S, where E is an arbitrary type and S represents the type of a store. Iterating it under the monad per region approach yields ≬S ⇒ (. . . (≬S ⇒ P × ≬S) . . .) × ≬S, with P a pure, i.e., noneffectful type and ≬S the type of a region of memory. The state monad transformer applied to monad T takes the form ≬S ⇒ T (P × ≬S). Iterating it under the monad transformer per region approach yields ≬S ⇒ (. . . (≬S ⇒ (((P × ≬S) . . .) × ≬S))). A computation represented as a single monad takes the form [≬S] ⇒ P × [≬S],
i.e., a function from an initial list of region stores to a pair of a pure type and a resulting list of region stores, which can be obtained via uncurrying and reversing the association of the pairs.

It is useful at this point to reflect on previous work relating effects and monads. Wadler [64] presents a call-by-value translation relating effect systems to monads. Each of the languages is given both a type system and a reduction semantics. A computation with type P and effect ε is assigned the monadic type St^ε ⟦P⟧^obj_PN, while a functional type P1 ⇒^ε P2 translates to ⟦P1⟧^obj_PN ⇒ St^ε ⟦P2⟧^obj_PN. The latent effect of the function becomes an annotation on the functor applied to the function’s range. Such generalized monads were axiomatized by Filliâtre [16], who developed a similar translation independently. We provide a simpler axiomatization in the first Categorical Interlude that better respects the way annotated monads are actually used. A recent extension of Wadler’s work with Thiemann [65] adds type, region, and effect polymorphism to the treatment. While Wadler’s work relating effects and monads has been motivated largely by the use of both in delimiting the scope of effects within programs, his formal treatment (and that of Filliâtre) ignores the use of encapsulation in effect and monad systems. There is no encapsulation construct, and regions are assigned arbitrarily by the type system. However, we use much of the structure of Wadler’s approach as the basis of our own. Part I in particular is largely a presentation of his work. Benton and Kennedy [3] and Tolmach [61] use a single monad and monad transformers, respectively, to model various effects but also do not handle encapsulation.

The problem of feature-based modularity is related to that of encapsulation. The goal here is to build up a representation for a computation in terms of representations relating to the various features (e.g., exceptions, state, continuations, multithreading) supported by the language. This was Moggi’s motivation in using monad transformers to represent programming languages [43]. Liang, Hudak, and Jones [38] and Espinosa [11] were also concerned with this problem. They studied ways of lifting the operations of the original monad onto the transformed monad.
Filinski used layered monads, an alternative to monad transformers, to represent various language features, and showed that any such feature can also be represented by a continuation monad, and that any computation built up in this way can also be represented by a monad with just continuations and state [15]. While the problem of feature-based modularity is related to that of encapsulation of effects, it is not the same. The modularity problem involves combining representations of distinct effects, or distinct uses of the same effect, each relating to a feature available throughout a program. The encapsulation problem involves combining representations of similar uses of a single effect, where the number of
layers varies at different points of the program. The forms of interaction that can reasonably be expected and interference that must be guarded against will differ between these situations.²

Launchbury and Peyton Jones [36] describe how monadic types parameterized by a type variable can be used to represent stateful computations, albeit without detailed effect information. A state transformer that modifies a store and delivers a result value is assigned a type of the form St ϱ0 E (in our notation). A return operator of type E ⇒ St ϱ0 E creates trivial computations. A bind operator of type St ϱ0 E1 ⇒ (E1 ⇒ St ϱ0 E2) ⇒ St ϱ0 E2 composes two computations over the same type variable ϱ0. They also show how encapsulation can be provided by universally quantifying over the type variable. A run operator is typed as (∀ ϱ0. St ϱ0 E) ⇒ E. Because the ϱ0 in the domain is not a free type variable, run may make no assumptions about the initial state. Because the quantification is restricted to the domain of run, no reference to the store may escape in the value type E. Thus, the following unsafe program (which violates both constraints) will not typecheck:

    let x = run ⟨ref true⟩ in run deref x

Here, the context of the deref operation implies that it should operate over the inner region, but its argument points to the outer region. The Monad per Region model follows from Launchbury and Peyton Jones’ system by taking the type variable ϱ0 to correspond to a local region indicator and the value type E to be a computation in any outer regions. Launchbury and Sabry [37] provide a syntactic type soundness result for this system. Under the Monad per Region model (with the state monad), Semmelroth and Sabry [50] map an effect system to Launchbury and Peyton Jones’ monad system. Because their source-language encapsulation construct does not bind a region variable, they implicitly assume all effectful operations to be performed only at the innermost region.
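The rank-2 typing of run attributed above to Launchbury and Peyton Jones is the one used by Haskell's ST monad, so it can be illustrated with the standard Control.Monad.ST and Data.STRef modules from base. This is a sketch in Haskell rather than in the notation of this work; the names safeProgram, flipCell, flipped, and the commented-out unsafeProgram are hypothetical.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Allocate, update, and read a cell inside one encapsulated region.
-- Only the Bool escapes, so the rank-2 type of runST accepts the program.
safeProgram :: Bool
safeProgram = runST $ do
  cell <- newSTRef True
  writeSTRef cell False
  readSTRef cell

-- An operation that is polymorphic in the region variable s, so it can be
-- used inside any run.
flipCell :: STRef s Bool -> ST s Bool
flipCell cell = do
  v <- readSTRef cell
  writeSTRef cell (not v)
  return (not v)

-- Bind composes two computations over the same region variable.
flipped :: Bool
flipped = runST (newSTRef True >>= flipCell)

-- The unsafe program from the text would read as follows in this setting.
-- It is rejected by the type checker, because the reference returned by the
-- first runST would carry the quantified variable s out of its scope:
--
-- unsafeProgram = let x = runST (newSTRef True) in runST (readSTRef x)
```

Both safeProgram and flipped evaluate to False, while uncommenting unsafeProgram produces a type error rather than a runtime failure.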
Furthermore, in translating encaps e as return (run ⟦e⟧^obj_eN) and evaluating run’s argument in an empty store, they preclude any dependence between the inner computation and any embedding outer computations. Moggi and Palumbo [45] provide a natural semantics for a variant of run, again in the context of the state monad, in work later extended by Moggi and Sabry [46]. They are not concerned with translations from a nonmonadic source language. Interpreting run as pure evaluation, they similarly preclude dependence between computations on inner and outer regions.

² One way of combining the problems would be to allow for a variety of encapsulation constructs to introduce different monad transformers. Another would be to allow the monad transformers to themselves be built up incrementally.

Under a variant of the single monad model, Launchbury and Sabry [37] seek a way to allow some dependence among computations in nested scopes. They suggest an operator for importing values associated with one region into another, without modifying any operations to take a region argument. In the context of state, they would thus allow a cell allocated at an outer region to be read or written from an inner region scope. Their approach still has the significant limitation that all allocations must be to the innermost region. Much of the practical use of region-based memory management, such as an ability to allow closures to survive the activation record in which they were created, relies on an ability to allocate at outer regions [60]. Recent work by Fluet and Morrisett [17] provides a fuller notion of encapsulation under the single monad approach.³ Their primary concern is in reducing explicit region polymorphism to explicit type polymorphism. The disadvantage for us of the single monad approach is that the insight provided by the model is limited by the extent that the feature being studied, in our case encapsulation, is treated differently between the object language and the model.

Some related research has been done with effects other than state. In terms of continuations, Jouvelot and Gifford [29] took some steps in modelling the capture and calls of continuations as maskable effects. Using an approach closely related to CPS, Danvy and Filinski [8] implicitly parameterize their shift and reset operators by a region but do not explicitly describe their system in terms of monads. Thielecke [57] implicitly uses a single monad application, but when descending to an inner region, he modifies the answer type to be the prior type of continuation computations. He does not parameterize his call/cc operator by a region, so that the innermost region is assumed. Capturing and applying continuations at an outer level allows for coarser jumping behavior.
More recently, Dybvig, Jones, and Sabry [9] present a “monolithic” monadic model of encapsulated continuations, in which the encapsulation construct is an operator within the monad. We thus benefit from a rich history of work on effect systems, monads, and their relation. We turn now to our particular areas of contribution.

³ It is not entirely fair to classify it in this way, since strictly speaking they use a family of monads with a subtyping relation. Still, their approach does contain similar attributes. Encapsulation is provided separately from a run operation and both addresses and types contain explicit mention of region names.

2. Overview of Results

2.1. Monad Transformers for Representing Encapsulation of Effects. The difficulty with the Monad per Region approach is to some extent anticipated by the well-known difficulty of composing monads [27, 32], and conversely our use of the Monad Transformer per Region approach is to some extent anticipated by previous success in composing monad transformers [43, 11, 38]. We present an example that demonstrates the limitations of the Monad per Region approach in expressing dependencies between computations at various levels. Consider the following program in our object language, producing a result of false:

    letregion ρ1 in
      let outerRef = @ ρ1 ⟨ref true⟩ in
        letregion ρ2 in
          let innerRef = @ ρ2 ⟨ref false⟩ in
            set outerRef to deref innerRef;
            set innerRef to deref outerRef;
            deref innerRef

As we attempt to translate this under the Monad per Region approach, writing let for monadic bind, we obtain something like the following:⁴

    run let outerRef = ⟨ref true⟩ in
        run let innerRef = ⟨ref false⟩ in
            let innerVal = deref innerRef in
            return set outerRef to innerVal;
            let outerValComp = return deref outerRef in
            let outerVal = outerValComp in
            set innerRef to outerVal;
            deref innerRef

⁴ We assume here that booleans may be stored directly at any region, and that monadic operators are typed in the standard way, i.e., addresses are nonmonadic and allocation is typable as a computation at the region in which allocation takes place, producing an address.

Operations do not specify the region at which they are to operate; this is implicit in their context within, e.g., return constructs. These generate a trivial computation in the bypassed region that in turn yields a computation in the next outer region. let bindings give access to the value of a computation. Under the Monad per Region approach, for computations in outer regions this is another computation typed with one fewer application of the monad. Thus the third let binding
within the inner run gives access only to outerValComp, but we need outerVal to continue the inner computation. We then resort to another let binding to obtain outerVal. Unfortunately, this program is not typable. The let form requires that the right-hand side and the body be typable under the same monad [36], but while outerValComp is typable under the monad corresponding to the outer region, set innerRef to outerVal; deref innerRef is typable under the monad corresponding to the inner region. By contrast, our translation successfully generates the following program for this example:

    run let outerRef = ⟨ref true⟩ in
        run let innerRef = ⟨ref false⟩ in
            let innerVal = deref innerRef in
            return set outerRef to innerVal;
            let outerVal = return deref outerRef in
            set innerRef to outerVal;
            deref innerRef

Because the Monad Transformer per Region approach uses only a single monad application, the monadic bind operation now provides outerVal directly and not outerValComp, and the problem vanishes. We find that the Monad Transformer per Region approach justifies the intuition of run as an encapsulation construct while allowing for an enhanced, less restrictive, notion of encapsulation. An effect ε under regions ϱ is a set of atomic effects (ι @ ϱ), indicating that action ι is performed at some region ϱ in ϱ. Under the Monad Transformer per Region approach, a computation in regions ϱ with type P and effect ε is assigned the monadic type St^{ι|(ι @ ϱ) ∈ ε} Id ⟦P⟧^obj_PN. A functional type P1 ⇒^ε P2 translates to ⟦P1⟧^obj_PN ⇒ St^{ι|(ι @ ϱ) ∈ ε} Id ⟦P2⟧^obj_PN. In both cases, the effect ε is converted to a monad built from a sequence of annotated monad transformers.
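The successful translation of the two-region example can be approximated in Haskell with one StateT layer per region. This is only a sketch under stated assumptions: it relies on the transformers package (Control.Monad.Trans.State and Control.Monad.Trans.Class), the names Region, Addr, alloc, deref, set, runRegion, and example are hypothetical, a region is modelled as a list of booleans with offsets as addresses, and lift plays the part of the incremental return. Unlike the type system of this work, the sketch carries no effect annotations and does nothing to stop an address from escaping its region.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, evalStateT, gets, modify, state)
import Data.Functor.Identity (runIdentity)

-- Hypothetical model: a region is a list of boolean cells and an address
-- is an offset into that list.
type Region = [Bool]
type Addr = Int

-- Allocate a new cell in the innermost region of the current monad.
alloc :: Monad m => Bool -> StateT Region m Addr
alloc v = state $ \cells -> (length cells, cells ++ [v])

-- Read the cell at the given offset of the innermost region.
deref :: Monad m => Addr -> StateT Region m Bool
deref a = gets (!! a)

-- Overwrite the cell at the given offset of the innermost region.
set :: Monad m => Addr -> Bool -> StateT Region m ()
set a v = modify $ \cells -> take a cells ++ [v] ++ drop (a + 1) cells

-- Incremental run: start a fresh region and discard it at the end.
runRegion :: Monad m => StateT Region m a -> m a
runRegion body = evalStateT body []

-- The two-region example: every inner use of the outer region goes
-- through lift, and bind yields outerVal directly.
example :: Bool
example = runIdentity $ runRegion $ do
  outerRef <- alloc True
  runRegion $ do
    innerRef <- alloc False
    innerVal <- deref innerRef
    lift (set outerRef innerVal)
    outerVal <- lift (deref outerRef)
    set innerRef outerVal
    deref innerRef
```

Here example evaluates to False, in agreement with the object-language program. The inner block has the type of a computation in StateT Region (StateT Region Identity), a single monad built from two transformer applications, which is why outerVal is available as a plain value.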

2.2. Restriction to Downward/Outward References. Our object language does carry a restriction on allocations to prevent upward references, i.e., storing a reference to an inner region at an outer region. Thus we do not allow the following program (which obeys lexical scoping) that allocates two regions and stores a reference to unit at the inner region and a reference to that inner cell at the outer region:

    letregion ρ1 in letregion ρ2 in @ ρ1 ⟨ref @ ρ2 ⟨ref unit⟩⟩

Storing a value at an upper region in a cell at a lower region creates a reference to a dangling pointer when the upper region is deallocated. Had the upper cell been dereferenced from outside
the scope of the inner (upper) region, we would certainly have to reject the program (as with the first example of Section 1 that dereferences a dangling pointer). Although the program causes no problems as it is, we choose to reject it as well. With this restriction, we can provide storable functions by maintaining the store at increasing levels of monadification. This restriction seems less onerous than the restriction on dependencies described above. It is not unrealistic, and is similar to one proposed in the standard for Java real-time programming [4]. While that standard prevents storing at an outer region an object allocated at an inner region, we impose a stronger restriction that prevents storing at an outer region an object whose type refers to an inner region. Both are, however, unnecessary and could delay the reclamation of memory. Effect systems in the literature have allowed such upward references [59, 5]. While the restriction is unnecessary for effect systems, our interest is in modelling effect systems with monads. The removal of this restriction presents a challenge to monadic models of encapsulation. This restriction and that implied by the Monad per Region approach (Section 2.1) are technically independent. We have seen in that section a program involving inter-region dependencies that does not point upwards in the store, and in this one a program pointing upward in the store without involving inter-region dependencies. Under the Monad per Region approach we can translate this latter program as follows:

    run run let innerRef = ⟨ref unit⟩ in return ⟨ref innerRef⟩

But this is a particularly benign example because the upward reference is never dereferenced. That is good, because the allocation in the outer region will not take place until after the deallocation of the inner region.⁵ Had we needed to dereference the stored pointer, we would be unable to do so.
It thus seems that the Monad per Region approach would have serious problems with any more substantive example in an unrestricted store.

[Footnote 5: This is the case if in a reduction semantics we always replace running return e with e, consistent with the interpretation of return as a constructor of trivial computations that, when run, simply yield the constructor’s argument.]

2.3. Relative Addressing of Regions in Static Analysis. Effect systems provide information regarding “what is computed”, “where it is computed”, and “how it is computed”. Under our notation, storable types, B, describe what is computed. Pure types, P, add the region where it is computed, i.e., in what region the result resides. Finally,


expression types, E, add the effect, which describes how it is computed, i.e., what regions are used in what ways in the computation. In the object language, all of this is quite explicit. Storable types clearly distinguish constants, reference cells, and procedures. Reference types are defined in terms of the pure type of the stored value. Procedure types are defined in terms of the pure types of the input and output as well as the latent trace type of the procedure, i.e., the effect of applying it. Pure types clearly distinguish constants requiring only global allocation, typed as G, from values allocated in a particular region, typed as B @ ϱ0. Finally, effects {(ι @ ρ0)} clearly distinguish the various actions of allocating, reading, writing, and executing, and clearly identify the region where each occurs.

The hallmark of our monadic languages is that they derive the same information but make no explicit mention of region indicators. Monadic languages use a return operator to indicate trivial computations. Our incremental return operator is used to bypass inner regions in describing how to compute a result using only the outer regions. On the static side, each bypassed region corresponds to a monad transformer, St∅, which enhances any monad with a trivial computation on an inner region. In our monadic languages, allocations (and other operations) do not require a region indicator to be specified, and address values take the form of simple offsets, so that the region at which a value resides must be determined by context. Specifically, the region can be determined by the number of return forms around the allocating expression, or around any operator that actively uses the offset. We say that deref and set operators actively use the offset to the reference cell that they access and that an application actively uses the offset to the procedure applied.
On the static side, we introduce Return forms to indicate that a value does not reside at the current region, but at some outer region. Like return forms, Return forms are cumulative, i.e., within them the bypassed region is not visible. For example, in describing the location of a value stored at a reference cell in a particular region, we start counting outward from that region, and similarly for the domain, range and effects of stored procedures. This makes it impossible to express types that violate our restricted store (Section 2.2). For this purpose, Unit is considered to be outside of all regions, so that we get a treatment consistent with that of unit and return. A few examples should clarify the above discussion. In a context of two regions, a reference cell holding unit at the outer region has pure type Return Ref Return Unit while the same cell at the inner region has pure type Ref Return Return Unit. Similarly, an identity function on unit at


the outer region has pure type Return ((Return Unit) ⇒ St∅ Id (Return Unit)) while the same function at the inner region has pure type (Return Return Unit) ⇒ St∅ St∅ Id (Return Return Unit). When one reference cell points to another that in turn holds unit, the first has pure type Ref Ref Return Return Unit if both cells are at the inner region, Ref Return Ref Return Unit if the first is at the inner region and the second at the outer region, and Return Ref Ref Return Unit if both are at the outer region. A reference cell at the inner region holding an identity function on unit at the outer region has pure type Ref Return ((Return Unit) ⇒ St∅ Id (Return Unit)). Finally, a thunk at the inner region allocating and returning a reference cell holding unit at the outer region has pure type (Return Return Unit) ⇒ St∅ St{alloc} Id (Return Ref Return Unit). Recall that Launchbury and Peyton Jones type computations using monadic types parameterized by a type variable, and that considering that type variable to be a region indicator and run to be an encapsulation construct leads to the Monad per Region approach of modelling encapsulation of effects. We have removed the region indicator from our monadic types, so how can we perform encapsulation? In fact, with relative addressing of regions the problem largely vanishes, since there is no way to refer into a deallocated region except as ∅, the type of dangling pointers. The type system will not permit any effectful operation that actively uses a dangling pointer. As we emerge from the encapsulation construct, we must only perform that substitution of ∅ for live pointer types at the deallocated region and remove the level of Return constructs that bypass the deallocated region for types at more outer regions.
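The relative addressing convention can be illustrated with a small sketch. This is not the type system developed in this work: pure types are modelled as hypothetical nested tuples, and the helper names `ret`, `ref`, `regions_outward`, and `show` are assumptions introduced only for illustration. The sketch counts the leading Return forms of a pure type to find how many regions outward from the current one the value resides.

```python
# Hypothetical encoding of pure types (an assumption for illustration):
# ("Return", t), ("Ref", t), or the atom "Unit".
UNIT = "Unit"

def ret(t):
    """Wrap a type in one Return form (the value lives one region further out)."""
    return ("Return", t)

def ref(t):
    """Type of a reference cell whose contents have pure type t."""
    return ("Ref", t)

def regions_outward(t):
    """Count the leading Return forms, i.e. how many regions are bypassed."""
    count = 0
    while isinstance(t, tuple) and t[0] == "Return":
        count += 1
        t = t[1]
    return count

def show(t):
    """Render a type in the juxtaposition style used in the text."""
    if isinstance(t, tuple):
        return t[0] + " " + show(t[1])
    return t

# The two reference cells from the examples above, in a context of two regions.
outer_cell = ret(ref(ret(UNIT)))   # cell at the outer region
inner_cell = ref(ret(ret(UNIT)))   # the same cell at the inner region

print(show(outer_cell), "->", regions_outward(outer_cell))
print(show(inner_cell), "->", regions_outward(inner_cell))
```

Because counting restarts inside each Ref, the contents of a cell are always located relative to the cell's own region, which is why an upward reference has no representation in this encoding.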

2.4. Regions and Environments. There are two ways in which one might hope to use program variable bindings across region boundaries. These come for free in our object language and are assumed in standard presentations of effect systems.
(1) A variable bound in an outer region should remain visible from within an inner region.
(2) A variable bound in an inner region should remain visible from within effectful operations on outer regions.


For example, in:

    letregion ρ1 in let x1 = @ ρ1 ⟨ref unit⟩ in deref x1 ; letregion ρ2 in deref @ ρ2 ⟨ref x1⟩

the second reference to x1 is from within an inner region scope. In:

    letregion ρ1 in letregion ρ2 in let x1 = @ ρ1 ⟨ref unit⟩ in @ ρ2 ⟨ref x1⟩; set x1 to unit

the second reference to x1 is for an effectful operation on region ρ1, bypassing region ρ2. The first example will translate as:

    run let x1 = ⟨ref unit⟩ in deref x1 ; run let x2 = ⟨ref x1⟩ in deref x2

To handle the first situation in a monadic setting, the only complication is introduced by our relative addressing of regions (Section 2.3). Within an inner region, an additional Return form will be required to identify the location of any storable value. Thus, while the first occurrence of x1 in the translated example has pure type Ref Unit, the second has pure type Return Ref Unit. The second example will translate as:

    run run let x1 = return ⟨ref unit⟩ in ⟨ref x1⟩; return set x1 to unit

As we ascend to an outer region by entering a return construct, we drop a level of Return forms in variable bindings associated with the inner region. Thus, while the first occurrence of x1 is typed as Return Ref Unit, the second has pure type Ref Unit.

2.5. Handling Dangling Pointers. We model Tofte and Talpin more closely than others [5, 50] in that our type system allows region references to escape our encapsulation construct. In the case of addresses, such “dangling”


pointers may not be dereferenced. We thus allow the following program that allocates a reference to unit at a new region, returning a dangling pointer:

    letregion ρ in @ ρ ⟨ref unit⟩

Dangling pointers may legally be stored in and retrieved from reference cells, and passed to and returned from functions. This might be convenient if our system were extended with subtyping to allow live pointers to pass as dangling ones. One could also imagine, however, extending our system with pairs consisting of two reference cells. It might then be convenient, if only one of the cells is used, to free the memory referenced by the other. We would then leave behind a dangling pointer. An example is presented by Tofte and Talpin [60]. Pointers might also escape inwards. This is clearly true in the monadic language. We now modify the second example from Section 2.4 so that the type of the variable binding x1 uses the inner, bypassed region. In

    letregion ρ1 in let x1 = @ ρ1 ⟨ref letregion ρ3 in @ ρ3 ⟨3⟩⟩ in letregion ρ2 in let x2 = @ ρ2 ⟨4⟩ in set x1 to x2

we attempt to store a pointer to ⟨4⟩ at ρ2 in a cell that holds a dangling pointer, so this program will not typecheck (footnote 6). The corresponding monadic program

    run let x1 = ⟨ref run ⟨3⟩⟩ in run let x2 = ⟨4⟩ in return set x1 to x2

does typecheck. We refine the rule for variable bindings under return forms such that we drop a level of Return forms when any exist, and otherwise substitute ∅. Thus, within a return form, any reference to the cancelled region in a variable binding is treated as a dangling pointer. In the example, x1 is initialized with a dangling pointer. Because each appears under a return form, the occurrence of x1 has pure type Ref ∅ rather than Return Ref ∅ and the occurrence of x2 has pure type ∅ rather than Int. The set is thus able to proceed. These conventions lead to a situation where programs in the monadic language that appear to violate the restricted store constraint of Section 2.2 are legal.
[Footnote 6: It would typecheck given the subtyping regime described above.]

Take for example the following illegal object language program from that section:


    letregion ρ1 in letregion ρ2 in @ ρ1 ⟨ref @ ρ2 ⟨ref unit⟩⟩

and the corresponding monadic program:

    run run let x = ⟨ref unit⟩ in return ⟨ref x⟩

The constraint is never actually violated in the monadic program, however, because the upward reference is interpreted as a dangling pointer. The ruse must be given up if we try to dereference the pointer to the inner region; this is caught by the type system as a dereference of a dangling pointer. Our monadic type system (under the Monad Transformer per Region approach) thus supports an unrestricted store only to the extent that we considered safe in our discussion of the Monad per Region approach. It is also possible for pointers to escape inwards in the object language. In the body of a procedure stored in an outer region, uses of the inner region in visible variable bindings will be treated as dangling pointers. If we modify the second example above so that the set operation occurs in the body of a function allocated at the outer region, then the occurrence of x2 will have pure type ∅ even in the object language:

    letregion ρ1 in let x1 = @ ρ1 ⟨ref letregion ρ3 in @ ρ3 ⟨3⟩⟩ in letregion ρ2 in let x2 = @ ρ2 ⟨4⟩ in @ ρ1 ⟨λx.set x1 to x2⟩ unit

In both languages, this is safe because of our restricted store (Section 2.2). The concern is that within a return form or the body of a function at an outer region, where different types might both be considered dangling pointer types, a live pointer might be modified so as to point to a dangling pointer originally of a different type. This would cause an inconsistency after evaluation of the return expression or function application is completed. If any reference is treated as a dangling pointer within a return form or function body, then all more inner references in the context of the return form or function body will also be treated as dangling pointers, so they cannot be modified.
All more outer references could not point to an “escaping” cell because this would violate the restriction. Cells typed to contain dangling pointers and allocated at regions within the return form or function body,


however, may safely be modified to hold escaping pointers because such cells will be deallocated before the return expression or function application is completed.
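The refined rule for variable bindings under return forms can be sketched as a single function. This is a simplified illustration and not the typing rule itself: types are hypothetical nested tuples, the name `under_return` is an assumption, and only the outermost constructor of a type is inspected.

```python
DANGLING = "∅"  # the type of dangling pointers

def under_return(t):
    """Adjust the pure type of a variable binding on entering a return form.

    A leading Return form is dropped when one exists. Otherwise the value
    resides in the bypassed region, so it is treated as a dangling pointer.
    """
    if isinstance(t, tuple) and t[0] == "Return":
        return t[1]
    return DANGLING

# The bindings of the monadic example above, as seen from the inner region.
x1 = ("Return", ("Ref", DANGLING))   # Return Ref ∅
x2 = "Int"                           # allocated at the inner region

print(under_return(x1))   # type of x1 under the return form
print(under_return(x2))   # type of x2 under the return form
```

With both adjustments applied, the cell x1 holds a value of type ∅ and x2 itself has type ∅, which is why the set operation in the example is accepted.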

2.6. Trees of Regions. Our restricted store (Section 2.2) gives the region structure of monadic programs the form of a forest, i.e., a sequence of trees, in which additional regions become lexically visible as one descends from the root of any tree. We use running to describe encapsulated computations in progress. Consider the following program, in which the region name annotations are for descriptive purposes only:

    running_r1 running_r2 return_r2 running_r3 return_r3 return_r1 running_r4 return_r4 unit

The original program first allocated region r1, then r2. Thus r1 is visible from r2. The return on r2 prior to the allocation of region r3 ensures that only r1 and not r2 is visible from r3. We thus obtain a tree with root r1 and children r2 and r3. r2 and r3 might each be the root of a subtree with additional region structure. But because the program then descended past r1 to allocate r4, we obtain a second, trivial, tree r4, and see that in general the top level can also support multiple trees, giving the form of a forest. We will work a similar example in more detail later. Here, we just want to demonstrate the form of the region structure of a monadic program supporting only outward references. Following is a graphical display of the region structure in this example.

[Figure: a forest of two trees. The first has root r1 with children r2 and r3; the second is the single node r4.]

But consider that we work with the state monad, and that we support stored procedures. The level of monadification required of the body of a procedure in the store corresponds to its depth in the tree structure. While it is possible to force an unnatural sequential structure, such an approach would complicate the system while reducing efficiency. We thus take the tree structure seriously in our dynamic semantics, and any attempt to type the store must also be aware of it.
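The way the example program determines its region forest can be sketched as follows. The event encoding and the function name `build_forest` are assumptions made for illustration; they are not part of the languages defined in this work. Each running form makes its region a child of the innermost visible region (or a new root when none is visible), and each return form bypasses the innermost visible region.

```python
def build_forest(events):
    """Build a region forest from ("running", r) and ("return", r) events.

    Returns (roots, children): the root regions in order of allocation and
    a dict mapping each region to the list of its child regions.
    """
    roots, children, visible = [], {}, []
    for kind, region in events:
        if kind == "running":
            children[region] = []
            if visible:
                children[visible[-1]].append(region)
            else:
                roots.append(region)
            visible.append(region)
        elif kind == "return":
            # A return form bypasses the innermost visible region.
            if not visible or visible[-1] != region:
                raise ValueError("return does not match innermost region")
            visible.pop()
        else:
            raise ValueError("unknown event kind: " + kind)
    return roots, children

# The annotated program of this section, read left to right.
events = [
    ("running", "r1"), ("running", "r2"), ("return", "r2"),
    ("running", "r3"), ("return", "r3"), ("return", "r1"),
    ("running", "r4"), ("return", "r4"),
]
roots, children = build_forest(events)
print(roots)
print(children)
```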


We stress that because sibling region stores such as r2 and r3 may both contain data that is used in the program, the tree form carries through from the region structure of the program to its store. We demonstrate this by fleshing out the example above with additional code that uses values presumably stored in the regions in earlier processing. The metavariables e denote expressions over the regions in their superscript.

    running_r1 running_r2 let x2 = return_r2 running_r3 let x3 = return_r3 return_r1 running_r4 let x4 = return_r4 unit in e^{r4} in e^{r1 r3} in e^{r1 r2}

Because the superscripts on expression variables correspond to the assumed path from the bottom of the store to the top, and because these superscripts grow in a stack-like fashion, a tree-structured store is necessary.

2.7. Lazy vs. Eager Deallocation of Regions. In both the object languages and monadic languages, we must confront the issue of when regions are to be allocated and deallocated. We shall always allocate regions outer-most first. If we choose to deallocate regions inner-most first, we obtain a stack discipline. In our initial presentation of encapsulation we shall take this lazy approach, in fact waiting until only a trivial computation remains in the innermost region before deallocating. However, we will subsequently address the possibility of deallocating regions eagerly. A region may be deallocated when its associated computation performs no effect with respect to it. An outer region may be deallocated before an inner one if we can guarantee that the outer region is no longer necessary. Since static analysis is required to determine whether a function may be called, this enhancement would require us to formally define our reduction semantics over typing judgments. We do not, however, allow operations on dangling pointers, even in functions not called, so we take a more conservative approach that deallocates an outer region only when no operation is performed at it from code in its associated computation or code at any upper region in the store. Deallocating regions eagerly can greatly reduce the maximum


number of regions required simultaneously by a program. This is particularly true if we consider region tree structures (Section 2.6). We have seen that each region in a program corresponds to a node in a tree. The number of lexically visible regions at some region in a program corresponds to the depth of the corresponding node. Deallocating the innermost region (a leaf) leaves the depth of all remaining nodes unchanged, while deallocating the outermost region (the root) decrements the depth of all nodes remaining in the tree.
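The effect of deallocation order on node depth can be checked with a short sketch. The representation (a child-to-parent dictionary) and the names `depths` and `deallocate` are assumptions for illustration, as is the convention that a root has depth 1, i.e., that a region counts itself among its visible regions.

```python
def depths(parent):
    """Depth of every region, given a child -> parent map (None marks a root)."""
    def depth(region):
        if parent[region] is None:
            return 1
        return 1 + depth(parent[region])
    return {region: depth(region) for region in parent}

def deallocate(parent, region):
    """Remove a region; its children are re-attached to the region's own parent."""
    return {
        r: (parent[region] if p == region else p)
        for r, p in parent.items()
        if r != region
    }

tree = {"r1": None, "r2": "r1", "r3": "r1"}

print(depths(tree))
print(depths(deallocate(tree, "r3")))   # leaf removed: other depths unchanged
print(depths(deallocate(tree, "r1")))   # root removed: remaining depths drop by one
```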

3. Methodology

This part has presented our results at a high level and set them in a historical context. We next develop the presentation in three stages, addressed in the following three parts of this work. The first ignores encapsulation, presenting Wadler’s work in representing effects with monads. The second models encapsulation of effects using monad transformers. The third enhances the language with value-polymorphism, recursion, region-allocated constants, and eager deallocation of regions. At each stage, our methodology, like those of Wadler and Semmelroth and Sabry, is to present a type system and sound reduction semantics for both an object language and a monadic language, and a translation that preserves types as well as semantics. Our object language, unlike theirs, allows interdependent effectful operations to be performed at various regions. Each stage begins with a “Categorical Interlude”, which presents the category theory upon which it is based (footnote 7). These chapters may safely be skipped by those to whom this material is already familiar, and those in whom it inspires little interest (footnote 8).

Each of the stages is presented incrementally, treating the source languages before the more complex intermediate languages. This allows us to present translations sooner than would otherwise be the case. For the source languages, we present syntax, decidable type systems and translations between the languages that preserve types. We make a compromise in language design to allow us to define the translation from the object languages in terms of configurations rather than their typing derivations. The object language requires each effectful operator to receive an explicit region argument, although (except for allocations) this could be inferred from the type of a reference cell or procedure argument. The code examples in this Introduction have elided that detail. For the intermediate languages, we present syntax and type systems (augmented to support sound reduction semantics) and translations that preserve both types and semantics. Semantics are preserved in the sense that whenever there is a reduction sequence in the object language, the translations in the monadic language are interreducible. Indexes are introduced at the second stage and enhanced at the third. The reduction semantics for the monadic language makes direct use of the indexes on our configurations. We conclude in the final part with ideas for future work.

[Footnote 7: A disclaimer is in order here. These presentations are only semiformal and do not address every subtlety. Also, the stateful systems described therein differ in several respects from the more explicitly presented monadic systems that follow in the remainder of each Part. These presentations are intended to provide intuition about how those systems might be anchored in category theory.]

[Footnote 8: i.e., these are Categorical Interludes, not Categorical Imperatives.]


Part I

Modelling State With Monads

Categorical Interlude: Monads

Category theory is a branch of mathematics that is useful in abstracting over various parts of mathematics (including itself). It thus represents mathematical entities in a common language that helps to identify similarities that might otherwise be obscured by superficial differences. More recently, category theory has been found to be valuable in programming language semantics, i.e., in representing the meanings of computer programs. Here we present enough category theory to represent computations with monads (footnote 9). Our intention here is not to present a detailed semantics of any particular language in the style of Moggi [44]. Rather, we aim to provide insight into the mathematical foundations of the monadic programming languages to come.

A directed graph G consists of a collection of objects and a collection of directed edges, or morphisms, between particular objects. If a morphism f points from object A to object B we say that A is the domain of f and B is the codomain of f, and write f :_G A → B. A category C is a directed graph with:

(1) For each object A an identity morphism id_A :_C A → A.
(2) A morphism composition operation ◦_C defined such that whenever f1 : A → B and f2 : B → C, f2 ◦ f1 : A → C, and
    associativity of ◦: For all f1 : A → B, f2 : B → C, and f3 : C → D, (f3 ◦ (f2 ◦ f1) = (f3 ◦ f2) ◦ f1) : A → D.
    left-identity of ◦: For all f : A → B, (id_B ◦ f = f) : A → B.
    right-identity of ◦: For all f : A → B, (f ◦ id_A = f) : A → B.
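The three laws can be spot-checked in a programming setting by reading types as objects and functions as morphisms. The sketch below is only an informal illustration under that reading: the helper names and the sample functions are assumptions, and equality of functions is tested on a handful of inputs rather than proved.

```python
def identity(x):
    """Identity morphism on any object (type)."""
    return x

def compose(g, f):
    """Return g ◦ f, i.e. apply f first and then g."""
    return lambda x: g(f(x))

# Three composable sample morphisms: int -> int -> str -> int.
f1 = lambda n: n + 1
f2 = lambda n: str(n)
f3 = lambda s: len(s)

samples = range(-5, 6)

assoc_ok = all(
    compose(f3, compose(f2, f1))(n) == compose(compose(f3, f2), f1)(n)
    for n in samples
)
left_id_ok = all(compose(identity, f2)(n) == f2(n) for n in samples)
right_id_ok = all(compose(f2, identity)(n) == f2(n) for n in samples)

print(assoc_ok, left_id_ok, right_id_ok)
```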

[Footnote 9: This chapter draws on presentations by various authors [63, 14, 48, 1, 40, 34]. There are, however, several differences. Foremost, we present monads with effect annotations throughout the development. We use an original system for graphical presentation of categories that displays functors along with commuting diagrams. We point to a representation of the run operator of the state monad as a natural transformation. Also, as described in the final paragraph, we interpret the Kleisli construction in a somewhat unusual way.]

Thus, categories are reflexive and transitive directed graphs with additional structure, namely identification of identity morphisms and the results of composition. Notice that we freely omit the


category specifications when they are clear from the context. We call a category small when its collections of objects and morphisms are sets. When we use categories to represent computer programs, objects play the role of types while morphisms between objects represent terms of one type parameterized by terms of another type. This is appropriate because empty contexts leave the argument unchanged, performing no effects, and composition of contexts is associative and respects identities. A term of a given type need not, of course, actually denote any particular value of that type. Any attempt to process the term might lead to an abortive or divergent computation. For any morphism f : A → B, a pair ⟨f, f⁻¹⟩ such that f⁻¹ ◦ f = id_A and f ◦ f⁻¹ = id_B is called an isomorphism from A to B. If an isomorphism exists from A to B, then one exists from B to A, and the two objects are said to be isomorphic. Intuitively, two isomorphic objects have the same structure.

We consider a few constructions on categories. Given any category C, we can construct its dual, C^op, by leaving the objects unchanged and reversing the morphisms. Thus, identities of C^op are those of C, while g ◦_{C^op} f is f ◦_C g.

The category 1 has exactly one object, 1, and one morphism, id_1. Given any two categories A and B, we can construct a product category A×B. The objects of A×B are pairs ⟨A, B⟩, where A is an object in A and B is an object in B. The morphisms of A×B are pairs ⟨f, g⟩ : ⟨A1, B1⟩ → ⟨A2, B2⟩, where f :_A A1 → A2 and g :_B B1 → B2. Identities and composition are defined pairwise, i.e., id_{⟨A, B⟩} = ⟨id_A, id_B⟩ (taken in A×B, A, and B respectively), and ⟨f2, g2⟩ ◦_{A×B} ⟨f1, g1⟩ = ⟨f2 ◦_A f1, g2 ◦_B g1⟩. Given n categories C_i, 1 ≤ i ≤ n, n ≥ 0, we can construct a tuple category that is their left-associated, 1-initiated product ((1 × C1) × ...) × Cn.

The category 0 has no objects and no morphisms. Given any two categories A and B, we can construct a coproduct, or sum, category A + B. The objects and morphisms of A + B are the objects and morphisms of A and B, appropriately tagged. The objects of A + B are 1 A, where A is an object of A, and 2 B, where B is an object of B. The morphisms are 1 f :_{A+B} 1 A1 → 1 A2, where f :_A A1 → A2, and 2 g :_{A+B} 2 B1 → 2 B2, where g :_B B1 → B2. Identities and composition refer back to the appropriate component category, i.e., id_{1 A} = id_A and id_{2 B} = id_B (the left-hand sides taken in A + B, the right-hand sides in A and B respectively), and 1 f2 ◦_{A+B} 1 f1 = f2 ◦_A f1 and 2 g2 ◦_{A+B} 2 g1 = g2 ◦_B g1. Given n categories C_i, 1 ≤ i ≤ n, n ≥ 0, we can construct a variant category that is their left-associated, 0-initiated sum ((0 + C1) + ...) + Cn.

Let C and D be categories. A functor F from C to D is composed of a mapping from objects of C to objects of D and from morphisms f :_C A → B to morphisms F f :_D F A → F B that preserves


domains, codomains, identities, and composition. We write F A for the application of a functor F to an object A and F f for the application of F to a morphism f. We can think of this as passing the object or morphism through the functor. More formally, F is a functor when the following laws hold:

(1) F id_A = id_{F A}

Throughout, we use diagrams such as that in Figure I.1. For each category, labeled on the left, we present a commuting diagram on the right. We identify identity morphisms with a loop marked by a notch. Thus, in this case the diagram tells us that F id_A equals id_{F A}. To the left of the commuting diagrams, we represent any mentioned functors as arrows with a double shaft between the appropriate categories. This additional diagram represents the relationships between the objects and morphisms appearing in the diagrams to the right.

[Figure I.1. Functors Preserve Identities]

(2) F (g ◦ f) = (F g) ◦ (F f)

They are called commuting diagrams because they indicate that several compositions of morphisms are equivalent. The diagram in Figure I.2 tells us that since F (g ◦ f) and (F g) ◦ (F f) have the same endpoints and direction, they represent the same morphism.

[Figure I.2. Functors Preserve Composition]

For any category C, there is an identity functor that we will call Id[C] that maps objects and morphisms to themselves. Identity functors are described in the commuting diagram of Figure I.3. The triple bar in Figure I.3 indicates that the connected vertices represent the same object.

[Figure I.3. Identity Functors]

For any functors F : C → D and K : D → E, their composition K ◦ F : C → E, pictured in Figure I.4, is simply the composition of each component. Because functors correspond to morphisms in a category Cat whose objects are small categories (i.e., composition of functors is associative and identity functors act as such), we write F :_Cat C → D when F is a functor from C to D.

[Figure I.4. Composition of Functors]

We consider a few constructions on functors. The functor 1 :_Cat 1 → 1 maps the single object to itself and the single morphism to itself. Given any two functors F :_Cat A → C and G :_Cat B → D, we can construct a product functor F×G :_Cat A×B → C×D. We define the functor operations pairwise, i.e., (F×G) ⟨A, B⟩ = ⟨F A, G B⟩ and (F×G) ⟨f, g⟩ = ⟨F f, G g⟩. Given n functors F_i, 1 ≤ i ≤ n, n ≥ 0, we can construct a tuple functor that is their left-associated, 1-initiated product ((1×F1)×...)×Fn.
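The pairwise definition of the product functor can be sketched on morphisms as follows. Everything here is an assumption for illustration: morphisms of a product category are modelled as Python pairs of functions, F is taken to be a list functor, and G is a hypothetical "optional" functor that leaves None untouched. The sketch builds both sides of the composition law for F×G and compares them on sample inputs.

```python
def compose(g, f):
    """Return g ◦ f, i.e. apply f first and then g."""
    return lambda x: g(f(x))

def fmap_list(f):
    """Morphism part of a list functor: map f over a list."""
    return lambda xs: [f(x) for x in xs]

def fmap_optional(f):
    """Morphism part of a hypothetical optional functor: skip None."""
    return lambda x: None if x is None else f(x)

def fmap_product(fmap_f, fmap_g):
    """(F×G)⟨f, g⟩ = ⟨F f, G g⟩, with a morphism pair modelled as a tuple."""
    return lambda fg: (fmap_f(fg[0]), fmap_g(fg[1]))

def compose_pairs(fg2, fg1):
    """Pairwise composition of morphisms in a product category."""
    return (compose(fg2[0], fg1[0]), compose(fg2[1], fg1[1]))

fmap = fmap_product(fmap_list, fmap_optional)
square, sub1 = (lambda n: n * n), (lambda n: n - 1)
double, negate = (lambda n: 2 * n), (lambda n: -n)

# Both sides of the composition law for F×G, each a pair of functions.
lhs = fmap(compose_pairs((sub1, negate), (square, double)))
rhs = compose_pairs(fmap((sub1, negate)), fmap((square, double)))

print(lhs[0]([1, 2, 3]), rhs[0]([1, 2, 3]))
print(lhs[1](5), rhs[1](5))
```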


We define variants similarly. The functor 0 :_Cat 0 → 0 is empty. Given any two functors F :_Cat A → C and G :_Cat B → D, we can construct a sum functor F + G :_Cat A + B → C + D. We define the functor operations by referring to the appropriate component functor, i.e., (F + G) 1 A = 1 (F A), (F + G) 2 B = 2 (G B), (F + G) 1 f = 1 (F f), and (F + G) 2 g = 2 (G g). Given n functors F_i, 1 ≤ i ≤ n, n ≥ 0, we can construct a variant functor that is their left-associated, 0-initiated sum ((0 + F1) + ...) + Fn.

Although we have defined functors only between single categories, it is convenient to allow multi-argument functors. The constructions above can add this flexibility. In particular, to specify a multi-argument functor we can make use of the tuple construction in defining its domain, i.e., a functor F from C1, ..., Cn, 0 ≤ n, to D can be represented as a functor from ((1×C1)×...)×Cn to D. While this generality is useful, we will freely define unary functors directly from a category and binary functors directly from a product of two categories. If f1 :_{C1} C1a → C1b, ..., fn :_{Cn} Cna → Cnb then F (f1, ..., fn) :_D F (C1a, ..., Cna) → F (C1b, ..., Cnb). We thus have that:

(1) F (id_{C1}, ..., id_{Cn}) = id_{F (C1, ..., Cn)}, where each id_{Ci} is the identity on an object Ci of the i-th category and the right-hand identity is taken in D.
(2) F (f1b ◦_{C1} f1a, ..., fnb ◦_{Cn} fna) = F (f1b, ..., fnb) ◦_D F (f1a, ..., fna)

The category 1 gives rise to a zeroary functor 1 :_Cat → 1 with object component λ.1 and morphism component λ.id_1, which can alternatively be viewed as the identity functor on 1. The 1 construction itself can be viewed as such a functor or alternatively as 1 :_Cat → Cat. Similarly, every product category A×B gives rise to a product-construction functor × :_Cat A,B → A×B with object component λA, B.⟨A, B⟩ and morphism component λf, g.⟨f, g⟩, which can alternatively be viewed as the identity functor on A×B. The product construction itself on small categories and functors can be viewed as such a functor × :_Cat Cat,Cat → Cat×Cat, or alternatively as × :_Cat Cat,Cat → Cat. Product categories also give rise to projection functors π1 :_Cat A×B → A and π2 :_Cat A×B → B with object components λA, B.A and λA, B.B respectively, and similar morphism components.

The 0 construction is a zeroary functor λ.0 (the category) and λ.0 (the functor) to Cat. The sum construction above is also a functor λA, B.A + B and λF, G.F + G from Cat, Cat to Cat. Sum categories give rise to injection functors ι1 :_Cat A → A + B and ι2 :_Cat B → A + B with object components λA.1 A and λB.2 B, and similar morphism components.

We consider a few additional constructions on functors. Given any two functors F :_Cat A → B, and G :_Cat A → C, we can construct their mediating product F&G :_Cat A → B×C. F&G maps objects A to F A × G A and morphisms f :_A A1 → A2 to F f × G f :_{B×C} F A1 × G A1 → F A2 × G A2.


The mediating product of the projection functors on A and B is the identity functor on A×B. The product functor above is then the mediating product of the projection functors on Cat. Similarly, given any two functors F :_Cat A → C, and G :_Cat B → C, we can construct their mediating sum F|G :_Cat A + B → C. F|G maps objects 1 A to F A, 2 B to G B, and similarly for morphisms. The mediating sum of the injection functors on A and B is the identity functor on A + B.

We also introduce an additional construction on categories. For any category C we can define a morphism category C^→ whose objects are the morphisms of C. For objects f1 :_C A1 → B1 and f2 :_C A2 → B2 of C^→, a morphism from f1 to f2 is a commutative square of C.

[Diagram: in C, a commutative square with s : A1 → A2 along the top, f1 : A1 → B1 and f2 : A2 → B2 down the sides, and t : B1 → B2 along the bottom, so that t ◦ f1 = f2 ◦ s; in C^→, the corresponding morphism ⟨s, t⟩ : f1 → f2.]

We then write ⟨s, t⟩ :_{C^→} f1 → f2. For any f :_C A → B, there is an identity commutative square ⟨id_A, id_B⟩ :_{C^→} f → f. Commutative squares are composed by pasting them together, i.e., given f1 :_C A1 → B1, f2 :_C A2 → B2, and f3 :_C A3 → B3, with ⟨s1, t1⟩ :_{C^→} f1 → f2 and ⟨s2, t2⟩ :_{C^→} f2 → f3, we have ⟨s2, t2⟩ ◦ ⟨s1, t1⟩ = ⟨s2 ◦_C s1, t2 ◦_C t1⟩ :_{C^→} f1 → f3.

Endofunctors are functors from a category to itself. Endofunctors can be used to represent parameterized datatypes. For example, the List functor takes a type A to the type List(A), and procedures f from A to B to procedures which map f over a list of elements of A, yielding a list of elements of B. The first functor law tells us that we can obtain the identity procedure on lists of integers by mapping the identity procedure on integers over a given list of integers. The second functor law tells us that to replace each element of a given list of integers with one less than its square, we can map the composition sub1 ◦ square over the list, or map the two procedures separately and compose the results. The composition of a functor Tree with List, List ◦ Tree, represents, for any datatype A, lists of trees, i.e., forests, of A. Let C and D be categories and let F and G be functors from C to D. A natural transformation h from F to G assigns to each object A in C a morphism h_A :_D F A → G A, called a component of h, such that the application of the natural transformation commutes with mappings of any morphisms in C, i.e., if f :_C A → B, then G f ◦ h_A = h_B ◦ F f. The corresponding commuting diagram is in Figure I.5. In commuting diagrams, we represent natural transformations, like functors, as arrows

30

Categorical Interlude

with a double shaft. We place a circle at the tail of arrows representing natural transformations and their components. A C: ⇒ G ====== = ==== ====

⇐== ==== F ==== ==== =

f

h ◦====⇒

D:

? B FA ◦

hA

Ff ? FB ◦

- GA Gf

hB

? - GB

Figure I.5. Natural Transformations

For any functor F from C to D, there is an identity natural transformation, which we call id, pictured in Figure I.6 [10], that for any object A in C, yields $id_{F A}$ in D. For any functors F, G, and H from C to D, the “vertical” composition of two natural transformations h: F → G and k: G → H is, for any object A of C, simply $k_A \circ h_A : F A \to H A$. It is pictured in Figure I.7.

We can now introduce an additional construction on categories. Given any two small categories A and B, we can construct a functor category $A^B$ whose objects are functors from A to B and whose morphisms are natural transformations, using identity natural transformations and vertical composition (i.e., vertical composition of natural transformations is associative and identity natural transformations act as such). Thus we write $h :_{A^B} F \to G$ when h is a natural transformation from F to G.

Natural transformations over endofunctors can be used to represent polymorphic contexts. For example, a polymorphic context preorder: Tree → List takes a type A to a context that performs a pre-order traversal of a tree of elements of type A. The constraint on natural transformations (Figure I.5) ensures that such polymorphic contexts do not depend on the stored values. Let h: F → G be a natural transformation. For any possible modification f: A → B of the stored values that could manifest such a dependency, we obtain the same result whether f is applied before or after the component of h.

[10] The natural transformation self-loops should have circles at their tails.
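The functor laws for List and the naturality condition for preorder can be checked on small inputs. The following is a minimal sketch, not part of the original development: the `Tree` class and the helper names `list_map`, `tree_map`, `square`, and `sub1` are illustrative assumptions, and a check on sample values is of course not a proof.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Tree:
    """A rose tree (hypothetical representation): a value plus subtrees."""
    value: Any
    children: List["Tree"] = field(default_factory=list)


def list_map(f: Callable, xs: list) -> list:
    """Morphism part of the List functor."""
    return [f(x) for x in xs]


def tree_map(f: Callable, t: Tree) -> Tree:
    """Morphism part of the Tree functor."""
    return Tree(f(t.value), [tree_map(f, c) for c in t.children])


def preorder(t: Tree) -> list:
    """Component of the natural transformation preorder: Tree -> List."""
    out = [t.value]
    for c in t.children:
        out.extend(preorder(c))
    return out


def square(n):
    return n * n


def sub1(n):
    return n - 1


xs = [1, 2, 3]
t = Tree(1, [Tree(2, [Tree(3)]), Tree(4)])

# First functor law: mapping the identity is the identity on lists.
identity_law = list_map(lambda x: x, xs) == xs
# Second functor law: map (sub1 . square) == map sub1 . map square.
composition_law = (list_map(lambda n: sub1(square(n)), xs)
                   == list_map(sub1, list_map(square, xs)))
# Naturality: G f . h_A == h_B . F f, with F = Tree, G = List, h = preorder.
naturality = list_map(square, preorder(t)) == preorder(tree_map(square, t))
```

All three flags evaluate to `True` for these inputs, which matches the intuition that preorder never inspects the stored values.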

Figure I.6. Identity Natural Transformations

Figure I.7. “Vertical” Composition of Natural Transformations

Functors can be composed with natural transformations. Categories B, C, and D, functors E: B → C, F,G: C → D, and natural transformation h: F → G, are pictured in Figure I.8. When paths representing two functors share (unmarked) endpoints, this indicates that the associated mappings on objects and morphisms are the same. We define the natural transformation h ◦ E: F ◦ E → G ◦ E such that $(h \circ E)_A = h_{E A}$. To apply a functor prior to a natural transformation, we use its object component. The composition of a functor with a natural transformation intuitively applies the natural transformation on top of the functor. For example, preorder ◦ Set performs a preorder traversal of a tree of sets, yielding a list of sets.

Figure I.8. Composition of a Functor with a Natural Transformation

Given categories C, D, and E, functors F,G: C → D and K: D → E, and natural transformation h: F → G, as pictured in Figure I.9, we define the natural transformation K ◦ h: K ◦ F → K ◦ G such that $(K \circ h)_B = K h_B$. To apply a functor after a natural transformation, we use its morphism component. The composition of a natural transformation with a functor intuitively drops under the functor to perform the natural transformation. For example, Set ◦ preorder performs a preorder traversal of each tree in a set of trees, yielding a set of lists.

We can also define a “horizontal” composition of two natural transformations. Given categories C, D, and E, functors F,G:Cat C → D and K,L:Cat D → E, natural transformations $h :_{C^D} F \to G$ and $k :_{D^E} K \to L$, the horizontal composition k ⋄ h: K ◦ F → L ◦ G, as pictured in Figure I.10, is defined equivalently as (L ◦ h) ◦ (k ◦ F) or (k ◦ G) ◦ (K ◦ h). For example, given a list of trees of type A, applying reverse ⋄ preorder will yield a list of lists of type A, which can be constructed by reversing the list and performing the traversal in either order, i.e., we can reverse the list of trees and then traverse under the list, or we can traverse under the list and then reverse the list of lists.
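The two equivalent definitions of reverse ⋄ preorder can be compared on a concrete list of trees. This is a small sketch under assumptions of our own: trees are encoded as `(value, children)` tuples, and `list_map` plays the role of the List functor on morphisms (so `list_map(preorder, ...)` is List ◦ preorder).

```python
def list_map(f, xs):
    """Morphism part of the List functor."""
    return [f(x) for x in xs]


def reverse(xs):
    """Component of the natural transformation reverse: List -> List."""
    return list(reversed(xs))


def preorder(tree):
    """Component of preorder: Tree -> List; a tree is (value, children)."""
    value, children = tree
    out = [value]
    for c in children:
        out.extend(preorder(c))
    return out


trees = [(1, [(2, []), (3, [])]), (4, [])]

# (L . h) . (k . F): reverse the list of trees, then traverse under the list.
reverse_then_traverse = list_map(preorder, reverse(trees))
# (k . G) . (K . h): traverse under the list, then reverse the list of lists.
traverse_then_reverse = reverse(list_map(preorder, trees))
```

Both paths yield `[[4], [1, 2, 3]]` here; the agreement is exactly the naturality of reverse with respect to the morphism preorder.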


Figure I.9. Composition of a Natural Transformation with a Functor

Figure I.10. “Horizontal” Composition of Natural Transformations

We now begin to describe a particular category C for representing various common features of programming languages. The following objects are formally only defined up to isomorphism, i.e., if multiple objects satisfy any definition, they will be isomorphic. They are known as universal constructions, because for any graph of objects and morphisms of a particular shape, they require the existence of a unique morphism with properties expressible through a commuting diagram.

Unit: The datatype Unit is represented by a zeroary functor Unit: 1 → C that maps the single object and its identity to an object $1^C$, a terminal object, and its identity. For any object A of C, there is exactly one morphism $!^C_A : A \to 1$. Morphisms are specified as unique with a dashed arrow shaft. Thus, we need not distinguish between terms of unit type, parameterized by a given type. Any characteristics of the ensuing computations by which we might wish to distinguish them must then be incorporated into that given type. We can also use terminal objects to represent constants. A constant of type A can be represented as a global element, i.e., a morphism from 1 to A.

The category 1 is a terminal object in the category Cat. The functor 1 is a terminal object in the category Cat→.

Empty: The datatype Empty is represented by a zeroary functor Empty: 1 → C that maps the single object and its identity to an object $0^C$, an initial object, and its identity. For any object A in C, there is exactly one morphism ¡A :C 0 → A. Thus, we need not distinguish between terms of a given type, parameterized by empty type, because that parameter can never be instantiated.

While there are no terms of empty type, empty types can be used to build inhabited types, such as a type of lists with no elements, or a type of functions that cannot be called. These might be more useful when one considers subtyping: lists of the empty type are a subtype of all other list types, while functions of empty domain type are supertypes of all other function types of the same range type. The category 0 is an initial object in the category Cat. The functor 0 is an initial object in the category Cat→.

Products: There is a binary product functor P: C×C → C, such that for any two objects A and B, P ⟨A, B⟩ = A × B, and for any two morphisms f and g, P ⟨f, g⟩ = f × g. There are also projection natural transformations π1: P → π1 and π2: P → π2 [11]. Their components $\pi_{1\,A,B} : A \times B \to A$ and $\pi_{2\,A,B} : A \times B \to B$ satisfy, for any f: A → C and g: B → D,

$$f \circ \pi_{1\,A,B} = \pi_{1\,C,D} \circ (f \times g), \qquad g \circ \pi_{2\,A,B} = \pi_{2\,C,D} \circ (f \times g).$$

For any object C and morphisms f: C → A and g: C → B, there is a morphism f&g: C → A × B such that $\pi_{1\,A,B} \circ (f \& g) = f$ and $\pi_{2\,A,B} \circ (f \& g) = g$, and we consider all such morphisms to be indistinguishable.

[11] We overload the π1 and π2 symbols introduced earlier for functors.

The datatype of tuples of arbitrary “arity” n can be defined as the left-associated, 1-initiated product ((1 × A1) × . . .) × An. If we choose to view our terms as parameterized by an environment, finite products can be used to represent that environment. Any product of small categories A×B is a product object in the category Cat, and any product of functors over small categories F×G is a product object in the category Cat→.

Coproducts: There is a binary coproduct functor S: C×C → C, such that for any two objects A and B, S ⟨A, B⟩ = A + B, and for any two morphisms f and g, S ⟨f, g⟩ = f + g. There are also injection natural transformations ι1: π1 → S and ι2: π2 → S. Their components $\iota_{1\,A,B} : A \to A + B$ and $\iota_{2\,A,B} : B \to A + B$ satisfy, for any f: A → C and g: B → D,

$$(f + g) \circ \iota_{1\,A,B} = \iota_{1\,C,D} \circ f, \qquad (f + g) \circ \iota_{2\,A,B} = \iota_{2\,C,D} \circ g.$$

For any object C and morphisms f: A → C and g: B → C, there is a morphism f|g: A + B → C such that $(f | g) \circ \iota_{1\,A,B} = f$ and $(f | g) \circ \iota_{2\,A,B} = g$, and we consider all such morphisms to be indistinguishable.
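The mediating morphisms f&g and f|g have direct programming readings as pairing and case analysis. The sketch below is only an illustration: the names `pair`, `case`, `inl`, and `inr`, and the encoding of A + B as tagged tuples, are assumptions made for this example.

```python
def fst(p):
    """First projection out of a product A x B."""
    return p[0]


def snd(p):
    """Second projection out of a product A x B."""
    return p[1]


def pair(f, g):
    """Mediating product f&g: C -> A x B."""
    return lambda c: (f(c), g(c))


def inl(a):
    """First injection into a sum A + B (tagged-tuple encoding)."""
    return ("inl", a)


def inr(b):
    """Second injection into a sum A + B (tagged-tuple encoding)."""
    return ("inr", b)


def case(f, g):
    """Mediating sum f|g: A + B -> C."""
    def mediate(s):
        tag, payload = s
        return f(payload) if tag == "inl" else g(payload)
    return mediate


double = lambda n: 2 * n
show = lambda n: str(n)
length = lambda s: len(s)

# pi1 . (f&g) = f and pi2 . (f&g) = g
product_ok = (fst(pair(double, show)(21)) == double(21)
              and snd(pair(double, show)(21)) == show(21))
# (f|g) . iota1 = f and (f|g) . iota2 = g
coproduct_ok = (case(double, length)(inl(21)) == double(21)
                and case(double, length)(inr("abc")) == length("abc"))
```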

The datatype of variants of arbitrary “arity” can similarly be defined as the left-associated, 0-initiated coproduct ((0 + A1) + . . .) + An. Any sum of small categories A + B is a coproduct object in the category Cat, and any sum of functors over small categories F + G is a coproduct object in the category Cat→.

Procedures: We can represent the datatype of procedures from type B to type C as an exponential object $C^B$. The morphism $apply_{\langle B,C \rangle} : C^B \times B \to C$ intuitively takes such a procedure and an argument of type B to a result of type C. For any way f: A × B → C of deriving a result in C from a pair of inputs in A × B, there is a unique “Currying” operation on f, denoted by $\overbrace{f} : A \to C^B$, that yields a procedure from the exponential object $C^B$ whose application to the input in B yields the same result in C, i.e., $apply_{\langle B,C \rangle} \circ (\overbrace{f} \times id_B) = f$. For $f : A \to C^B$, the inverse operation is $\underbrace{f} = apply_{\langle B,C \rangle} \circ (f \times id_B)$. A procedure with effects can be represented by the same technique but treating the range type C as a computation returning a value of type C, as described below.

There is a binary exponential functor $E : C^{op} \times C \to C$, such that for any two objects A and B, $E \langle A, B \rangle = B^A$, and for any two morphisms f: A → B and g: C → D, $E \langle f, g \rangle = \overbrace{g \circ apply_{\langle A,C \rangle}} \circ \overbrace{apply_{\langle B,C \rangle} \circ (id_{C^B} \times f)} : C^B \to D^A$, i.e., it takes a representation of a morphism from B to C to a representation of a morphism from A to D by representing the result of composing f onto the front and g onto the rear of the represented morphism. Notice that this functor is contravariant in its first argument, i.e., the domain of f appears in the codomain of E ⟨f, g⟩, and vice versa. This is reflected in defining E such that its first argument is drawn from $C^{op}$. Categories with terminal objects,

products, and exponentials are called cartesian closed. We define a category Ccc whose objects are cartesian closed categories and whose morphisms are ccc-representations, i.e., functors that preserve terminal objects, products, and exponentials. The functor category $B^A$ between any two small categories is an exponential object in the category Cat.

We treat C as an implementation-level view of the language and enhance it further in order to derive another category $C^M$ providing a user-level view of the same language. We introduce monads

$$M ::= \langle \langle \varepsilon^M, \bot^M, \sqcup^M \rangle,\; M^{\varepsilon \in \varepsilon^M},\; \eta^M,\; \mu^{M,\varepsilon_1 \in \varepsilon^M,\varepsilon_2 \in \varepsilon^M} \rangle$$

to express computations, and then define user-level procedures in terms of them. Computations perform effects, so we require a set $\varepsilon^M$ of effects. Because effects may be combined and there is an empty effect, we assume that effects form a monoid, i.e., they are subject to a binary associative combination operation $\sqcup^M$ with left and right identity empty effect $\bot^M$ [12]. A computation is then represented in category C using a family of endofunctors $M^{\varepsilon \in \varepsilon^M}$ indexed over a monoid of effects $\varepsilon^M$. We can interpret $M^{\varepsilon}$ as follows:

• $M^{\varepsilon} A$ is the type of computations performing effect ε and (possibly) returning a value of type A.
• $M^{\varepsilon} f$ is the procedure that given some initial computation of v performing effect ε, returns the computation that performs the same effect ε and produces the result of applying procedure f to v.

Computations are in fact treated as a datatype, so $M^{\varepsilon}$ should be available in $C^M$. We require more structure to model computations, namely two natural transformations:

• $\eta^M :_{C^C} Id \to M^{\bot}$
• $\mu^{M,\varepsilon_1,\varepsilon_2} :_{C^{C \times C}} M^{\varepsilon_1} \circ M^{\varepsilon_2} \to M^{\varepsilon_1 \sqcup \varepsilon_2}$

such that the following laws hold:

[12] This notation may seem bloated, but it is useful for describing monad transformers in Part II. We omit the monad index on effects when it is clear from the context.

associativity of monad: $\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3} \circ (M^{\varepsilon_1} \circ \mu^{M,\varepsilon_2,\varepsilon_3}) = \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3} \circ (\mu^{M,\varepsilon_1,\varepsilon_2} \circ M^{\varepsilon_3})$, both sides being natural transformations from $M^{\varepsilon_1} \circ M^{\varepsilon_2} \circ M^{\varepsilon_3}$ to $M^{\varepsilon_1 \sqcup \varepsilon_2 \sqcup \varepsilon_3}$.

inner-identity of monad: $\mu^{M,\varepsilon,\bot} \circ (M^{\varepsilon} \circ \eta^M) = id_{M^{\varepsilon}}$

outer-identity of monad: $\mu^{M,\bot,\varepsilon} \circ (\eta^M \circ M^{\varepsilon}) = id_{M^{\varepsilon}}$

We can interpret $\eta^M$ and $\mu^{M,\varepsilon_1,\varepsilon_2}$ as parameterized terms as follows. Given a value v of type A, $\eta^M_A$ constructs a trivial computation producing v. $\mu^{M,\varepsilon_1,\varepsilon_2}_A$ collapses, or flattens, a computation of a computation (a two-layered computation) performing ε1 in the outer layer and ε2 in the inner layer, and producing v, into a single-layered computation performing effect ε1 ⊔ ε2 and producing v. The first constraint says that given a three-layer computation that is to be flattened twice, it doesn’t matter whether the inner computations are flattened first ($M^{\varepsilon_1} \circ \mu^{M,\varepsilon_2,\varepsilon_3}$) or the outer computations are flattened first ($\mu^{M,\varepsilon_1,\varepsilon_2} \circ M^{\varepsilon_3}$). Flattening is in this sense associative. This constraint relies on the associativity of ⊔. The second and third constraints then say that given a computation, $M^{\varepsilon}$, inserting an inner trivial computation ($M^{\varepsilon} \circ \eta^M$) and then flattening, and inserting an outer trivial computation ($\eta^M \circ M^{\varepsilon}$) and then flattening, both leave the original computation unchanged. Thus, $\eta^M$ is an “inner” and “outer” identity on $\mu^{M,\varepsilon_1,\varepsilon_2}$. These constraints rely on $\bot^M$ being a left and right identity on $\sqcup^M$.
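The three laws can be exercised on a deliberately simple toy model. The sketch below is an assumption of ours, not the construction used in this work: a computation is a record `Comp` that merely carries its effect annotation (a frozen set of effect names, combined by union with the empty set as ⊥) next to its value, so the effect index lives at run time rather than in the type.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet


@dataclass(frozen=True)
class Comp:
    """Toy computation: an effect annotation paired with a value."""
    effect: FrozenSet[str]
    value: Any


BOTTOM = frozenset()  # the empty effect; union plays the role of the join


def fmap(f, m):
    """M^e f: apply f to the produced value, keeping the effect e."""
    return Comp(m.effect, f(m.value))


def unit(v):
    """eta: the trivial computation producing v with the empty effect."""
    return Comp(BOTTOM, v)


def join(mm):
    """mu: flatten two layers, combining outer and inner effects."""
    return Comp(mm.effect | mm.value.effect, mm.value.value)


e1, e2, e3 = frozenset({"alloc"}), frozenset({"read"}), frozenset({"write"})
m = Comp(e1, 42)
mmm = Comp(e1, Comp(e2, Comp(e3, "v")))

# Flatten the inner layers first, or the outer layers first.
associativity = join(fmap(join, mmm)) == join(join(mmm))
# Insert a trivial inner layer, then flatten.
inner_identity = join(fmap(unit, m)) == m
# Insert a trivial outer layer, then flatten.
outer_identity = join(unit(m)) == m
```

In this model the associativity check reduces to associativity of set union, and the identity checks to the empty set being its unit, mirroring the reliance on the monoid laws noted above.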

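We can now define the category $C^M$ providing a user-level view of the language. The objects of category $C^M$ are the objects of C. The morphisms $f^{\varepsilon} :_{C^M} A \to B$ are precisely the morphisms $f :_C A \to M^{\varepsilon} B$. Thus, we consider a term parameterized by a domain type and representing an effectful computation of a value of the range type to implement a term performing the effect, parameterized by the domain type and representing a value of the range type. We must define identities and composition under $C^M$.

(1) For any object A, $id^{C^M}_A = \eta^M_A$. Identity morphisms have an effect of ⊥.

(2) Given two morphisms $f :_C A \to M^{\varepsilon_1} B$ and $g :_C B \to M^{\varepsilon_2} C$ we have $f^{\varepsilon_1} :_{C^M} A \to B$ and $g^{\varepsilon_2} :_{C^M} B \to C$. We define their composition $g^{\varepsilon_2} \circ^{C^M} f^{\varepsilon_1} = (g \circ^C f)^{\varepsilon_1 \sqcup \varepsilon_2} :_{C^M} A \to C$ to be $\mu^{M,\varepsilon_1,\varepsilon_2}_C \circ^C M^{\varepsilon_1} g \circ^C f :_C A \to M^{\varepsilon_1 \sqcup \varepsilon_2} C$.

With these definitions and the monad laws, we can prove that $C^M$ is a category:

$\circ^{C^M}$ is associative: Let $f^{\varepsilon_1} :_{C^M} A \to B$, $g^{\varepsilon_2} :_{C^M} B \to C$, $h^{\varepsilon_3} :_{C^M} C \to D$.

$$
\begin{aligned}
&(h^{\varepsilon_3} \circ^{C^M} g^{\varepsilon_2}) \circ^{C^M} f^{\varepsilon_1} \\
&= (\mu^{M,\varepsilon_2,\varepsilon_3}_D \circ^C M^{\varepsilon_2} h \circ^C g) \circ^{C^M} f^{\varepsilon_1} && \text{defn of } \circ^{C^M} \\
&= \mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_D \circ^C M^{\varepsilon_1} (\mu^{M,\varepsilon_2,\varepsilon_3}_D \circ^C M^{\varepsilon_2} h \circ^C g) \circ^C f && \text{defn of } \circ^{C^M} \\
&= \mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_D \circ^C (M^{\varepsilon_1} \mu^{M,\varepsilon_2,\varepsilon_3}_D \circ^C M^{\varepsilon_1} M^{\varepsilon_2} h \circ^C M^{\varepsilon_1} g) \circ^C f && M^{\varepsilon_1} \text{ is a functor} \\
&= (\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_D \circ^C M^{\varepsilon_1} \mu^{M,\varepsilon_2,\varepsilon_3}_D) \circ^C M^{\varepsilon_1} M^{\varepsilon_2} h \circ^C M^{\varepsilon_1} g \circ^C f && \text{assoc of } \circ^C \\
&= (\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_D \circ^C \mu^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_3} D}) \circ^C M^{\varepsilon_1} M^{\varepsilon_2} h \circ^C M^{\varepsilon_1} g \circ^C f && \text{assoc of } M \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_D \circ^C (\mu^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_3} D} \circ^C M^{\varepsilon_1} M^{\varepsilon_2} h) \circ^C M^{\varepsilon_1} g \circ^C f && \text{assoc of } \circ^C \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_D \circ^C (M^{\varepsilon_1 \sqcup \varepsilon_2} h \circ^C \mu^{M,\varepsilon_1,\varepsilon_2}_C) \circ^C M^{\varepsilon_1} g \circ^C f && \text{naturality of } \mu^{M,\varepsilon_1,\varepsilon_2} \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_D \circ^C M^{\varepsilon_1 \sqcup \varepsilon_2} h \circ^C (\mu^{M,\varepsilon_1,\varepsilon_2}_C \circ^C M^{\varepsilon_1} g \circ^C f) && \text{assoc of } \circ^C \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_D \circ^C M^{\varepsilon_1 \sqcup \varepsilon_2} h \circ^C (g^{\varepsilon_2} \circ^{C^M} f^{\varepsilon_1}) && \text{defn of } \circ^{C^M} \\
&= h^{\varepsilon_3} \circ^{C^M} (g^{\varepsilon_2} \circ^{C^M} f^{\varepsilon_1}) && \text{defn of } \circ^{C^M}
\end{aligned}
$$

right-identity of $\circ^{C^M}$: Let $f^{\varepsilon} :_{C^M} A \to B$.

$$
\begin{aligned}
&id^{C^M}_B \circ^{C^M} f^{\varepsilon} \\
&= \mu^{M,\varepsilon,\bot}_B \circ^C M^{\varepsilon}\, id^{C^M}_B \circ^C f && \text{defn of } \circ^{C^M} \\
&= \mu^{M,\varepsilon,\bot}_B \circ^C M^{\varepsilon} \eta^M_B \circ^C f && \text{defn of } id^{C^M} \\
&= id^C_{M^{\varepsilon} B} \circ^C f && \text{inner-identity of } M \\
&= f^{\varepsilon} && id^C \text{ is right-identity for } \circ^C
\end{aligned}
$$

left-identity of $\circ^{C^M}$: Let $f^{\varepsilon} :_{C^M} A \to B$.

$$
\begin{aligned}
&f^{\varepsilon} \circ^{C^M} id^{C^M}_A \\
&= \mu^{M,\bot,\varepsilon}_B \circ^C M^{\bot} f \circ^C id^{C^M}_A && \text{defn of } \circ^{C^M} \\
&= \mu^{M,\bot,\varepsilon}_B \circ^C M^{\bot} f \circ^C \eta^M_A && \text{defn of } id^{C^M} \\
&= \mu^{M,\bot,\varepsilon}_B \circ^C \eta^M_{M^{\varepsilon} B} \circ^C f && \text{naturality of } \eta^M \\
&= id^C_{M^{\varepsilon} B} \circ^C f && \text{outer-identity of } M \\
&= f^{\varepsilon} && id^C \text{ is right-identity for } \circ^C
\end{aligned}
$$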
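Composition in $C^M$ can be tried out on sample morphisms. The following sketch reuses a toy model that is our own assumption (a computation is a record `Comp` carrying a set of effect names and a value); `kleisli(g, f)` transcribes the composite µ ◦ M g ◦ f, and the checks only sample one input rather than prove the category laws.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet


@dataclass(frozen=True)
class Comp:
    """Toy computation: an effect annotation paired with a value."""
    effect: FrozenSet[str]
    value: Any


def fmap(f, m):
    return Comp(m.effect, f(m.value))


def unit(v):
    return Comp(frozenset(), v)


def join(mm):
    return Comp(mm.effect | mm.value.effect, mm.value.value)


def kleisli(g, f):
    """Composition in C^M: mu . M g . f (effects are joined)."""
    return lambda a: join(fmap(g, f(a)))


# Hypothetical effectful morphisms A -> M^e B in the toy model.
f = lambda n: Comp(frozenset({"read"}), n + 1)
g = lambda n: Comp(frozenset({"write"}), n * 2)
h = lambda n: Comp(frozenset({"alloc"}), str(n))

left = kleisli(kleisli(h, g), f)(3)    # (h . g) . f
right = kleisli(h, kleisli(g, f))(3)   # h . (g . f)
id_after = kleisli(unit, f)(3)         # id_B . f
id_before = kleisli(f, unit)(3)        # f . id_A
```

For the input 3, both bracketings produce the value `"8"` with the combined effect `{"read", "write", "alloc"}`, and composing with `unit` on either side returns `f(3)` unchanged.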

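We find another formulation to be more relevant. A Kleisli triple consists of:

(1) An effect-indexed family of type constructors $M^{\varepsilon \in \varepsilon}$.

(2) For each object A, $\eta^M_A :_C A \to M^{\bot} A$.

(3) For each pair of effects ε1 and ε2 and pair of objects A and B, a mapping $\ast^{M,\varepsilon_1,\varepsilon_2}_{A,B}$ such that for each morphism $f :_C A \to M^{\varepsilon_2} B$, there is a morphism $f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B} :_C M^{\varepsilon_1} A \to M^{\varepsilon_1 \sqcup \varepsilon_2} B$.

The type constructor corresponds to the object portion of the functor, $\eta^M_A$ is as before, and if $f :_C A \to M^{\varepsilon_2} B$, then $f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B} = \mu^{M,\varepsilon_1,\varepsilon_2} \circ M^{\varepsilon_1} f$.

We can thus restate the monad laws as follows:

associativity of monad (Kleisli): $(g \ast^{M,\varepsilon_2,\varepsilon_3}_{B,C} \circ f) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{A,C} = g \ast^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_{B,C} \circ f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B}$, with $f :_C A \to M^{\varepsilon_2} B$, $g :_C B \to M^{\varepsilon_3} C$.

inner-identity of monad (Kleisli): $\eta^M_A \ast^{M,\varepsilon,\bot}_{A,A} = id_{M^{\varepsilon} A}$

outer-identity of monad (Kleisli): $f \ast^{M,\bot,\varepsilon}_{A,B} \circ \eta^M_A = f$, with $f :_C A \to M^{\varepsilon} B$.

These laws can be reduced to the original monad laws:

associativity of monad (Kleisli): Let $f :_C A \to M^{\varepsilon_2} B$, $g :_C B \to M^{\varepsilon_3} C$.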

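$$
\begin{aligned}
&(g \ast^{M,\varepsilon_2,\varepsilon_3}_{B,C} \circ f) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{A,C} \\
&= \mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ M^{\varepsilon_1} (g \ast^{M,\varepsilon_2,\varepsilon_3}_{B,C} \circ f) && \text{defn of } \ast \\
&= \mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ (M^{\varepsilon_1} g \ast^{M,\varepsilon_2,\varepsilon_3}_{B,C} \circ M^{\varepsilon_1} f) && M^{\varepsilon_1} \text{ is a functor} \\
&= (\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ M^{\varepsilon_1} g \ast^{M,\varepsilon_2,\varepsilon_3}_{B,C}) \circ M^{\varepsilon_1} f && \text{assoc of } \circ \\
&= (\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ M^{\varepsilon_1} (\mu^{M,\varepsilon_2,\varepsilon_3}_C \circ M^{\varepsilon_2} g)) \circ M^{\varepsilon_1} f && \text{defn of } \ast \\
&= (\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ (M^{\varepsilon_1} \mu^{M,\varepsilon_2,\varepsilon_3}_C \circ M^{\varepsilon_1} M^{\varepsilon_2} g)) \circ M^{\varepsilon_1} f && M^{\varepsilon_1} \text{ is a functor} \\
&= ((\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_C \circ M^{\varepsilon_1} \mu^{M,\varepsilon_2,\varepsilon_3}_C) \circ M^{\varepsilon_1} M^{\varepsilon_2} g) \circ M^{\varepsilon_1} f && \text{assoc of } \circ \\
&= ((\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_C \circ \mu^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_3} C}) \circ M^{\varepsilon_1} M^{\varepsilon_2} g) \circ M^{\varepsilon_1} f && \text{assoc of } M \\
&= (\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_C \circ (\mu^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_3} C} \circ M^{\varepsilon_1} M^{\varepsilon_2} g)) \circ M^{\varepsilon_1} f && \text{assoc of } \circ \\
&= (\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_C \circ (M^{\varepsilon_1 \sqcup \varepsilon_2} g \circ \mu^{M,\varepsilon_1,\varepsilon_2}_B)) \circ M^{\varepsilon_1} f && \text{naturality of } \mu^{M,\varepsilon_1,\varepsilon_2} \\
&= (\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_C \circ M^{\varepsilon_1 \sqcup \varepsilon_2} g) \circ (\mu^{M,\varepsilon_1,\varepsilon_2}_B \circ M^{\varepsilon_1} f) && \text{assoc of } \circ \\
&= (\mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_C \circ M^{\varepsilon_1 \sqcup \varepsilon_2} g) \circ f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B} && \text{defn of } \ast \\
&= g \ast^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_{B,C} \circ f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B} && \text{defn of } \ast
\end{aligned}
$$

inner-identity of monad (Kleisli):

$$
\begin{aligned}
&\eta^M_A \ast^{M,\varepsilon,\bot}_{A,A} \\
&= \mu^{M,\varepsilon,\bot}_A \circ M^{\varepsilon} \eta^M_A && \text{defn of } \ast \\
&= id_{M^{\varepsilon} A} && \text{inner-identity of } M
\end{aligned}
$$

outer-identity of monad (Kleisli): Let $f :_C A \to M^{\varepsilon} B$.

$$
\begin{aligned}
&f \ast^{M,\bot,\varepsilon}_{A,B} \circ \eta^M_A \\
&= (\mu^{M,\bot,\varepsilon}_B \circ M^{\bot} f) \circ \eta^M_A && \text{defn of } \ast \\
&= \mu^{M,\bot,\varepsilon}_B \circ (M^{\bot} f \circ \eta^M_A) && \text{assoc of } \circ \\
&= \mu^{M,\bot,\varepsilon}_B \circ (\eta^M_{M^{\varepsilon} B} \circ f) && \text{naturality of } \eta^M \\
&= (\mu^{M,\bot,\varepsilon}_B \circ \eta^M_{M^{\varepsilon} B}) \circ f && \text{assoc of } \circ \\
&= id_{M^{\varepsilon} B} \circ f && \text{outer-identity of } M \\
&= f && id^C \text{ is right-identity for } \circ^C
\end{aligned}
$$

This formulation is equivalent; we can regain the monad as follows:

• $M^{\varepsilon_1} f = (\eta^M_{M^{\varepsilon_2} B} \circ^C f) \ast^{M,\varepsilon_1,\bot}_{A,M^{\varepsilon_2} B}$, with $f : A \to M^{\varepsilon_2} B$.
• $\mu^{M,\varepsilon_1,\varepsilon_2}_A = id^C_{M^{\varepsilon_2} A} \ast^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_2} A,A}$

The original monad laws can now be reduced to the revised ones: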

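associativity of monad:

$$
\begin{aligned}
&\mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_A \circ (M^{\varepsilon_1} \circ \mu^{M,\varepsilon_2,\varepsilon_3})_A \\
&= \mu^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_A \circ M^{\varepsilon_1} \mu^{M,\varepsilon_2,\varepsilon_3}_A && \text{functor} \circ \text{nat trans} \\
&= id_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2 \sqcup \varepsilon_3} A,A} \circ M^{\varepsilon_1} (id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A}) && \text{defn of } \mu \\
&= id_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2 \sqcup \varepsilon_3} A,A} \circ (\eta^M_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \circ id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A}) \ast^{M,\varepsilon_1,\bot}_{M^{\varepsilon_2} M^{\varepsilon_3} A,\,M^{\varepsilon_2 \sqcup \varepsilon_3} A} && \text{defn of } M \\
&= (id_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \ast^{M,\bot,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2 \sqcup \varepsilon_3} A,A} \circ (\eta^M_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \circ id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A})) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2} M^{\varepsilon_3} A,A} && \text{assoc of } M \text{ (K)} \\
&= ((id_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \ast^{M,\bot,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2 \sqcup \varepsilon_3} A,A} \circ \eta^M_{M^{\varepsilon_2 \sqcup \varepsilon_3} A}) \circ id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A}) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2} M^{\varepsilon_3} A,A} && \text{assoc of } \circ \\
&= (id_{M^{\varepsilon_2 \sqcup \varepsilon_3} A} \circ id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A}) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2} M^{\varepsilon_3} A,A} && \text{outer-identity of } M \text{ (K)} \\
&= (id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A} \circ id_{M^{\varepsilon_2} M^{\varepsilon_3} A}) \ast^{M,\varepsilon_1,\varepsilon_2 \sqcup \varepsilon_3}_{M^{\varepsilon_2} M^{\varepsilon_3} A,A} && \text{left/right identity of } \circ \\
&= id_{M^{\varepsilon_3} A} \ast^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_{M^{\varepsilon_3} A,A} \circ id_{M^{\varepsilon_2} M^{\varepsilon_3} A} \ast^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_2} M^{\varepsilon_3} A,\,M^{\varepsilon_3} A} && \text{assoc of } M \text{ (K)} \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_A \circ \mu^{M,\varepsilon_1,\varepsilon_2}_{M^{\varepsilon_3} A} && \text{defn of } \mu \\
&= \mu^{M,\varepsilon_1 \sqcup \varepsilon_2,\varepsilon_3}_A \circ (\mu^{M,\varepsilon_1,\varepsilon_2} \circ M^{\varepsilon_3})_A && \text{nat trans} \circ \text{functor}
\end{aligned}
$$

inner-identity of monad:

$$
\begin{aligned}
&\mu^{M,\varepsilon,\bot}_A \circ (M^{\varepsilon} \circ \eta^M)_A \\
&= \mu^{M,\varepsilon,\bot}_A \circ M^{\varepsilon} \eta^M_A && \text{functor} \circ \text{nat trans} \\
&= id_{M^{\bot} A} \ast^{M,\varepsilon,\bot}_{M^{\bot} A,A} \circ M^{\varepsilon} \eta^M_A && \text{defn of } \mu \\
&= id_{M^{\bot} A} \ast^{M,\varepsilon,\bot}_{M^{\bot} A,A} \circ (\eta^M_{M^{\bot} A} \circ \eta^M_A) \ast^{M,\varepsilon,\bot}_{A,M^{\bot} A} && \text{defn of } M \\
&= (id_{M^{\bot} A} \ast^{M,\bot,\bot}_{M^{\bot} A,A} \circ (\eta^M_{M^{\bot} A} \circ \eta^M_A)) \ast^{M,\varepsilon,\bot}_{A,A} && \text{assoc of } M \text{ (K)} \\
&= ((id_{M^{\bot} A} \ast^{M,\bot,\bot}_{M^{\bot} A,A} \circ \eta^M_{M^{\bot} A}) \circ \eta^M_A) \ast^{M,\varepsilon,\bot}_{A,A} && \text{assoc of } \circ \\
&= (id_{M^{\bot} A} \circ \eta^M_A) \ast^{M,\varepsilon,\bot}_{A,A} && \text{outer-identity of } M \text{ (K)} \\
&= \eta^M_A \ast^{M,\varepsilon,\bot}_{A,A} && \text{left-identity of } \circ \\
&= id_{M^{\varepsilon} A} && \text{inner-identity of } M \text{ (K)}
\end{aligned}
$$

outer-identity of monad:

$$
\begin{aligned}
&\mu^{M,\bot,\varepsilon}_A \circ (\eta^M \circ M^{\varepsilon})_A \\
&= \mu^{M,\bot,\varepsilon}_A \circ \eta^M_{M^{\varepsilon} A} && \text{nat trans} \circ \text{functor} \\
&= id_{M^{\varepsilon} A} \ast^{M,\bot,\varepsilon}_{M^{\varepsilon} A,A} \circ \eta^M_{M^{\varepsilon} A} && \text{defn of } \mu \\
&= id_{M^{\varepsilon} A} && \text{outer-identity of } M \text{ (K)}
\end{aligned}
$$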

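Given two morphisms $f :_C A \to M^{\varepsilon_1} B$ and $g :_C B \to M^{\varepsilon_2} C$, with $f^{\varepsilon_1} :_{C^M} A \to B$ and $g^{\varepsilon_2} :_{C^M} B \to C$, we can redefine their composition $g^{\varepsilon_2} \circ^{C^M} f^{\varepsilon_1} = (g \circ^C f)^{\varepsilon_1 \sqcup \varepsilon_2} :_{C^M} A \to C$ in terms of Kleisli triples as $g \ast^{M,\varepsilon_1,\varepsilon_2}_{B,C} \circ^C f : A \to M^{\varepsilon_1 \sqcup \varepsilon_2} C$.

We note that we can swap the order of the arguments for a variant of $\ast^{M,\varepsilon_1,\varepsilon_2}_{A,B}$ that is more convenient for programming:

$$\star^{M,\varepsilon_1,\varepsilon_2}_{A,B} : M^{\varepsilon_1} A \Rightarrow (A \Rightarrow M^{\varepsilon_2} B) \Rightarrow M^{\varepsilon_1 \sqcup \varepsilon_2} B$$

$$m \star^{M,\varepsilon_1,\varepsilon_2}_{A,B} f = f \ast^{M,\varepsilon_1,\varepsilon_2}_{A,B}\; m$$

This operation is now closer in form to a let construct, with the first argument being the definition computation with type A and effect ε1 and the second being the body computation with type B and effect ε2 in terms of the definition value, yielding a computation with type B and effect ε1 ⊔ ε2 for the $\star^{M,\varepsilon_1,\varepsilon_2}_{A,B}$ form.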

Finally, we identify a morphism $\zeta_{M^{\varepsilon} A} : M^{\varepsilon} A \to A$ that takes one out of a monad, corresponding to retrieving a value from a computation, i.e., running it. $\zeta_{M^{\bot}}$ serves as a right inverse to $\eta^M$ ($\zeta_{M^{\bot}} \circ \eta^M_A = id_A$) because running a trivial computation should return the value with which it was created.

We now stop briefly to define a few monads and their run operations. There is an identity monad on category C, Id[C], trivially defined as follows:

$$
\begin{aligned}
\langle \varepsilon^{Id[C]}, \bot^{Id[C]}, \sqcup^{Id[C]} \rangle &= \langle \{\bot\}, \bot, \{\langle \bot, \bot \rangle \mapsto \bot\} \rangle \\
Id[C]\; P &= P \\
Id[C]\; f &= f \\
\eta^{Id[C]}_P &= id^C_P \\
\mu^{Id[C],\bot,\bot}_P &= id^C_P \\
f \ast^{Id[C],\bot,\bot}_{A,B} &= f \\
(P := \emptyset) &= P \\
\zeta_{Id[C]^{\bot} P} &= id^C_P
\end{aligned}
$$

The state monad St is somewhat more substantial. We define it in terms of a category C representing a programming language and a store object S in C. The run operation is defined in terms of an initial store. Rather than define the monad directly, we assume that each morphism is associated with a term of the untyped λ-calculus, extended with unit, pairs, and offset values, in a manner that respects the commuting diagrams above, subject to the equational theory of β, η, π-reducibility. We then specify particular λ terms and rely on that equational theory. An application of an abstraction λx.e1 to a term e2 is thus equated with the capture-avoiding substitution e1[x := e2], and an abstraction λx.e x is thus equated with e if x is not free in e. We assume extensions for pattern-matching to hide the use of projections [13, 14].

$$
\begin{aligned}
\langle \varepsilon^{St}, \bot^{St}, \sqcup^{St} \rangle &= \langle \{alloc, read, write, exec\}, \emptyset, \cup \rangle \\
St^{\varepsilon}\; P &= S \Rightarrow (P \times S) \\
St^{\varepsilon}\; f &= \lambda mv.\lambda s.\,\mathsf{let}\ \langle v, s \rangle = mv\ s\ \mathsf{in}\ \langle f\ v, s \rangle \\
\eta^{St}_P &= \lambda v.\lambda s.\langle v, s \rangle \\
\mu^{St,\_,\_}_P &= \lambda mmv.\lambda s.\,\mathsf{let}\ \langle mv, s \rangle = mmv\ s\ \mathsf{in}\ mv\ s \\
f \ast^{St,\_,\_}_{P_1,P_2} &= \lambda mv.\lambda s.\,\mathsf{let}\ \langle v, s \rangle = mv\ s\ \mathsf{in}\ f\ v\ s \\
\zeta_{St^{\varepsilon}} &= \lambda mv.\,\mathsf{let}\ \langle v, s \rangle = mv\ s_{init}\ \mathsf{in}\ v
\end{aligned}
$$

[13] The reader should bear in mind that with some tedium, these definitions could be expressed in terms of products and exponentials.
[14] In the following definition, mv and mmv are to be considered single variable names. The naming convention indicates the number of applications of the monad in the variable’s type.
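The λ-terms for St translate almost literally into a language with first-class functions. The sketch below is a transcription under assumptions of our own: a computation is a Python function from a store to a `(value, store)` pair, effect annotations are not tracked, the initial store is passed to `run` explicitly, and the `tick` computation with an integer store is purely hypothetical.

```python
def st_map(f):
    """St f: run mv, then apply f to the value, threading the store."""
    def mapped(mv):
        def run_with(s):
            v, s1 = mv(s)
            return (f(v), s1)
        return run_with
    return mapped


def unit(v):
    """eta: return v and leave the store untouched."""
    return lambda s: (v, s)


def join(mmv):
    """mu: run the outer computation, then the computation it returns."""
    def run_with(s):
        mv, s1 = mmv(s)
        return mv(s1)
    return run_with


def star(f):
    """f*: extend f: A -> St B to St A -> St B."""
    def extended(mv):
        def run_with(s):
            v, s1 = mv(s)
            return f(v)(s1)
        return run_with
    return extended


def run(mv, s_init):
    """zeta: run from an initial store and keep only the value."""
    v, _ = mv(s_init)
    return v


# Hypothetical example: the store is an int counter; tick returns it and bumps it.
tick = lambda s: (s, s + 1)
program = star(lambda a: star(lambda b: unit(a + b))(tick))(tick)
result = run(program, 10)

# f* agrees with mu . St f on a sample input.
f = lambda v: unit(v * 2)
star_matches_join = star(f)(tick)(3) == join(st_map(f)(tick))(3)
```

Starting from store 10, the two ticks return 10 and 11, so `result` is 21. The last line samples the identity f∗ = µ ◦ St f relating the Kleisli form to the original one.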

We next make some further assumptions so that we can define state operations in terms of St. We assume that C is cartesian closed and contains an exponential functor ⇒, an endofunctor Ref and zeroary functors Unit and ∅ [15]. We can view Ref as a degenerate, unary product with only a single projection [16] and both Unit and ∅ as identifying terminal objects. We also assume that stores are finite functions that can be extended, updated, and applied, and update our calculus accordingly. We define a sequence of zeroary denotable value functors beginning with functor $D_0$ of results. We set this to be equal to Unit, but recognize that it could be extended to a disjoint union over base types. We then define $D_{j+1}$ as $D_j \,|\, Ref \circ D_j \,|\, (\Rightarrow) \circ (D_j \,\&\, (M^{\varepsilon} \circ D_j))$ and D as $\lim_{j:0 \to \infty} D_j$. Stores can be seen as an object relating an object O of offsets with D. We define operators to allocate, dereference, and update reference cells and to allocate and apply procedures. In each case the range type of the morphism is an application of the state monad, appropriately annotated with effect information. In the case of application, this effect includes the latent effect of the procedure, drawn from the range of the procedure object.

$$
\begin{aligned}
ref &:_C P \to St^{\{alloc\}}\, Ref\, P &&= \lambda v.\lambda s.\langle o, s\{o \mapsto v\} \rangle \\
deref &:_C Ref\, P \to St^{\{read\}}\, P &&= \lambda o.\lambda s.\langle s\ o, s \rangle \\
set &:_C Ref\, P \times P \to St^{\{write\}}\, Unit &&= \lambda \langle o, v \rangle.\lambda s.\langle unit, s\{o \mapsto v\} \rangle \\
abs &:_C (St^{\varepsilon} P_2)^{P_1} \to St^{\{alloc\}}\, (P_1 \Rightarrow (St^{\varepsilon} P_2)) &&= \lambda f.\lambda s.\langle o, s\{o \mapsto f\} \rangle \\
app &:_C (P_1 \Rightarrow St^{\varepsilon} P_2) \times P_1 \to St^{\{exec\} \cup \varepsilon}\, P_2 &&= \lambda \langle o, v \rangle.\lambda s.\ s\ o\ v\ s
\end{aligned}
$$
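The five operators can be sketched over a concrete store. Everything concrete here is an assumption made for illustration: the store is a Python dict from integer offsets to values, a fresh offset is chosen as the current size of the store, `set_` and `app` are curried rather than taking a pair, and effect annotations are not tracked. `bind` is the argument-swapped form of the extension operation.

```python
def unit(v):
    """Trivial computation: return v, store unchanged."""
    return lambda s: (v, s)


def bind(mv, f):
    """Run mv, then run f applied to its value on the resulting store."""
    def run_with(s):
        v, s1 = mv(s)
        return f(v)(s1)
    return run_with


def ref(v):
    """Allocate a cell holding v; returns the fresh offset."""
    def alloc(s):
        o = len(s)  # assumption: offsets are consecutive integers
        return (o, {**s, o: v})
    return alloc


def deref(o):
    """Read the contents of the cell at offset o."""
    return lambda s: (s[o], s)


def set_(o, v):
    """Overwrite the cell at offset o with v; returns unit."""
    return lambda s: ("unit", {**s, o: v})


def abs_(f):
    """Allocate a procedure in the store, like a reference cell."""
    return ref(f)


def app(o, v):
    """Look up the procedure at offset o and apply it to v in store s."""
    return lambda s: s[o](v)(s)


cell_program = bind(ref(5), lambda r:
               bind(set_(r, 7), lambda _:
               deref(r)))
cell_value, cell_store = cell_program({})

proc_program = bind(abs_(lambda x: unit(x + 1)), lambda p: app(p, 41))
proc_value, _ = proc_program({})
```

Run on the empty store, `cell_program` allocates offset 0, overwrites it, and reads back 7; `proc_program` stores an increment procedure and applies it to 41, giving 42.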

We seek a natural transformation $run :_{C^1} St^{\varepsilon} \circ D \to (\ + \emptyset) \circ D_0$ to represent the generation of a result value or dangling pointer from a computation. This is done through a natural transformation $drop :_{C^1} D \to (\ + \emptyset) \circ D_0$. The latter is defined as the infinite mediating sum that leaves base values intact and maps allocated values to the dangling pointer:

drop0 = ι1 ∅
dropj+1 = dropj | ι2 ◦ !∅ 1 | ι2 ◦ !1

[15] We reuse these names to refer to the indicated objects.
[16] If we want to be parsimonious, we can represent this as a binary product with a terminal object.

Intuitively, drop forgets the escaping reference types embedded in D. We can then define run as $drop \circ \zeta_{St^{\varepsilon}}$.

Before proceeding, we must mention that while we have followed convention in deriving $C^M$ from C, our perspective will be somewhat different. We are studying the user-level language represented by $C^M$, so we could just as well have taken that as given. The effect of any morphism is then subject to the constraints that for any object A, the effect of $id_A$ is ⊥ and the effect of any composition $g^{\varepsilon_2} \circ f^{\varepsilon_1}$ is $\varepsilon_1 \sqcup^M \varepsilon_2$. Our goal is to provide a deeper interpretation in terms of computations. We could introduce monads to express computations, and then find another category C that provides an implementation-level view of the same language, i.e., we can find in C an implementation of any morphism in $C^M$. Category C requires an effect-indexed family of endofunctors $M^{\varepsilon \in \varepsilon^M}$. The objects of category C are the objects of $C^M$. Rather than identifying morphisms $f^{\varepsilon} :_{C^M} A \to B$ with morphisms $f :_C A \to M^{\varepsilon} B$, we could postulate the existence of a translation function from the former to the latter, i.e., for every morphism $f^{\varepsilon} :_{C^M} A \to B$, there is a morphism [f^ε] obj N :C A → $M^{\varepsilon} B$. Thus, we’d represent parameterized terms by morphisms from the object representing the parameter type to the object representing computations yielding values of the term’s type and performing the intended effect. Our goal would not be to present a detailed representation of any category $C^M$, but only to present a general framework for such representations. We require that they preserve associativity and identities. We’d thus postulate further that [g^ε2 ◦C^M f^ε1] obj N = $\mu^{M,\varepsilon_1,\varepsilon_2} \circ^C M^{\varepsilon_1}$ [g^ε2] obj N ◦C [f^ε1] obj N and [id_A] obj N = $\eta^M_A$. Then, rather than proving that $C^M$ is a category, we would have to demonstrate that any translation satisfying our constraints is consistent with that fact, i.e., that morphisms of $C^M$ that are equated via the definition of a category have equal translations.

CHAPTER 1

Source Languages

We first present two languages with memory effects — an object language and a monadic language. Each is given a type system that describes the effects that any subprogram may perform as well as the type of any result of execution. We then present a translation between the languages that preserves types and effects. This presentation is largely based on that of Wadler [64], with the notable exception of our inclusion of storable functions.

1. Object Language Statics

In Figure 1.1.1, we begin our presentation of an object source language with effects.1,2 The class src obj q of programs is equated with the class src obj e of expressions. let x = e in e is a standard let binding, which binds the local variable to the first expression (the definition) over the scope of the second expression (the body). Allocations are performed using prestorables b, which abstract the typing rules for expressions. Intuitively, a prestorable is destined to be placed in the store, perhaps after some evaluation. These include reference cells and λ-abstractions. The latter bind a program variable (the formal parameter) over the scope of an expression (the body). All are distinguished from other expressions by being enclosed in angle brackets. Operators are provided to read from and write to reference cells, and to apply functions. deref forms expect an expression yielding a reference cell and extract its contents. set forms expect an expression yielding a reference cell and one yielding a replacement for the contained object. Function applications expect an operator expression yielding a function and an operand expression yielding its argument. Pures p are particular expressions without effect. They include program variables x, declared by let and λ-abstractions, as well as values v. Values are legitimate results of computations.3 They consist of global constants g, i.e., constants requiring only global allocation. Global constants and program variables are treated as primitive syntactic classes and not defined further, although we assume the former to include the booleans true and false and the singular value unit. We are conservative in our definition of pures, e.g., let x = unit in x performs no effect, but for convenience we do not consider it to be pure. An abbreviation for sequencing of imperative commands is also defined. Here and throughout, as is standard practice, we identify expressions that differ only by a consistent renaming of bound program variables.

1 Our presentation involves descriptions of various languages that differ from each other in some ways but also share many common features. Throughout, we use boxes like this to attract attention to certain items in a figure or inset that have not been seen in previous languages. This diversion of attention is not meant to be complete. Readers are encouraged to ensure that nonhighlighted material is consistent with their expectations. Because this section presents the first language, it contains much highlighting.

2 Our BNF notation indicates, e.g., that e is a metavariable used to represent elements of the object source language syntactic class src obj e, and which can take any of the forms in the ‘|’-separated list following ‘::=’. We occasionally also define syntactic classes by construction from other, defined syntactic classes.

3 The values presented here do not define results of computations under any direct semantics of this language. They are interpreted as such with respect to the intermediate languages of the next chapter.

q ∈ src obj q ::= e
e ∈ src obj e ::= p | let x = e in e | b | deref e | set e to e | e e
b ∈ src obj b ::= ⟨ref e⟩ | ⟨λx.e⟩
p ∈ src obj p ::= v | x
v ∈ src obj v ::= g
g ∈ src obj g
x ∈ src obj x
e1; e2 → let x = e1 in e2 (x ∉ fpv(e2))

Figure 1.1.1. Object Source Language Syntax

We are not interested in all object language programs, but only those satisfying certain properties including, for example, that they will run safely. We introduce a type system to recognize such programs. Figure 1.1.2 demonstrates some typable programs. The first is just a global constant. The second allocates and then dereferences a cell holding a global constant. The third modifies the contents of a cell before dereferencing it. The fourth allocates a reference cell that is treated as a dangling pointer at the level of programs. The fifth replaces an identity function in a reference cell with a function returning a global constant. It then dereferences the cell and calls the function. The final program allocates a reference cell pointing towards a global constant and a function that updates its contained value. The function is called to modify the global constant and then the cell is dereferenced.

(1) unit
(2) deref ⟨ref unit⟩
(3) let x = ⟨ref false⟩ in set x to true; deref x
(4) ⟨ref unit⟩
(5) let x = ⟨ref ⟨λx.x⟩⟩ in set x to ⟨λx.true⟩; deref x false
(6) let x2 = ⟨ref false⟩ in ⟨λx1.set x2 to x1⟩ true; deref x2

Figure 1.1.2. Some Typable Object Language Programs

(1) ⟨λx1.x2⟩
(2) deref unit
(3) deref ⟨λx.x⟩
(4) ⟨ref unit⟩ unit
(5) let x = ⟨ref unit⟩ in set x to x
(6) ⟨λx.x x⟩ ⟨λx.x x⟩
(7) set ⟨ref unit⟩ to ⟨ref unit⟩
(8) set ⟨ref ⟨ref unit⟩⟩ to ⟨λx.x⟩
(9) let x = ⟨λx.x⟩ in x unit; x ⟨ref unit⟩
(10) let x = ⟨λx.x⟩ in x ⟨ref unit⟩; x ⟨λx.unit⟩

Figure 1.1.3. Some Untypable Object Language Programs

In Figure 1.1.3 we motivate our type system by presenting some programs that it should reject. The first includes an unbound variable x2. The second applies an operation expecting an allocated reference cell (deref) to a global constant; the third applies it to an allocated function. The fourth program conversely applies an operation expecting an allocated function (the application operator) to an allocated reference cell. There are also programs that our type system might safely accept, but does not. The next two programs demonstrate the lack of recursion; the first attempts a self-referential cell while the second attempts an infinite loop by self-application. We do not address the former concern, and address the latter one only to some extent, when we add a recursion operator in Part III. To do more would require recursive types.4 The following two programs set a reference cell to hold a value of a different type; the first sets a reference cell initialized to a global constant to hold an allocated reference cell while the second sets a reference cell initialized to an allocated reference cell to hold an allocated function. The final two programs demonstrate the lack of polymorphic functions; functions may only be applied to arguments of a single type. The first applies the same function to a global constant and to an allocated reference cell while the second applies the same function to an allocated reference cell and to an allocated function. We address polymorphism in Part III.

4 Typing the former program would also require modifying it, and the language, so that reference cells are initialized to a ‘nil’ value.

Q ∈ src obj Q ::= E
Γ ∈ src obj Γ ::= ∅ | Γ{x ↦ P}
E ∈ src obj E ::= P ! T
B ∈ src obj B ::= Ref P | P ⇒^T P
P ∈ src obj P ::= G | B
T ∈ src obj T, ε ∈ src obj ε = {src obj F}
F ∈ src obj F ::= ι
ι ∈ src obj ι ::= alloc | read | write | exec
G ∈ src obj G

Figure 1.1.4. Object Source Language Static Syntax

Figure 1.1.4 describes the static syntax of the same language.5 We use components of the static syntax to analyze programs. Program types Q are equated with expression types E. Expression types E are composed of a pure type P and a trace type T, separated by ‘!’. The pure type represents what an expression evaluates to while the trace type represents how it is evaluated. Storable types B include reference cell types and function types. Reference cell types require the pure type of contained values. Function types contain domain and range pure types and the latent effect of applying the function (written above the arrow), which is realized upon application. Pure types are either global constant types G or located types represented by storable types. Global constant types form a primitive syntactic class, but we assume that it includes the boolean type Bool and the unit type Unit. Atomic effects F record the occurrence of an action ι; these include allocations, reads, writes, and function executions. In the object language, trace types are the same as effects ε; both refer to sets of atomic effects.

5 The grammar is not reduced to simplest form because we wish to maintain consistency with the more complex languages in the remainder of this work.


Environments Γ are finite functions from program variables to pure types. Our notation for finite functions is such that ∅ represents an empty environment and Γ {x 7→ P} extends the environment Γ such that it maps program variable x to pure type P. Although such finite functions are built up as sequences of bindings, we consider them to be sets for purposes of checking equality, i.e., the order of bindings is not relevant. We thus preclude duplicate program variables in our environments. We can, however, still typecheck programs that redeclare a program variable within the scope of the same program variable because we have identified expressions that differ only by a consistent renaming of bound program variables. Environments allow us to determine the type of any program variable lexically visible at a given point in the program.

We define x ∈ Dom(Γ) such that x1 ∉ Dom(∅) and x1 ∈ Dom(Γ{x2 ↦ P}) iff either x1 = x2 or x1 ∈ Dom(Γ). Our constraint against duplicate bindings of program variables is thus equivalent to the requirement that Γ{x ↦ P} assumes x ∉ Dom(Γ). We define Γ(x) to apply the environment Γ to program variable x; Γ{x1 ↦ P}(x2) = P if x1 = x2 and Γ(x2) otherwise. ∅(x) is undefined, i.e., Γ(x) assumes x ∈ Dom(Γ).
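As an informal illustration (not part of the formal development), the environment operations defined above can be sketched in Python. The class name Env and its method names are hypothetical, and pure types are represented by arbitrary Python values; the sketch only assumes that extension rejects duplicates and that lookup is undefined outside the domain.

```python
class Env:
    """Finite function from program variables to pure types.

    Illustrative only: 'Env', 'dom', 'extend' and 'lookup' are names
    chosen for this sketch, not taken from the dissertation.
    """

    def __init__(self, bindings=None):
        self._b = dict(bindings or {})

    def dom(self):
        # Dom(Γ): the set of bound program variables.
        return set(self._b)

    def extend(self, x, p):
        # Γ{x ↦ P} assumes x ∉ Dom(Γ); a duplicate is rejected, not shadowed.
        if x in self._b:
            raise ValueError(f"{x} is already bound")
        return Env({**self._b, x: p})

    def lookup(self, x):
        # Γ(x) assumes x ∈ Dom(Γ); in particular ∅(x) is undefined.
        if x not in self._b:
            raise KeyError(x)
        return self._b[x]

    def __eq__(self, other):
        # Environments are compared as sets of bindings: order is irrelevant.
        return isinstance(other, Env) and self._b == other._b
```

Building the same bindings in a different order yields equal environments, which mirrors the convention that the order of bindings is not relevant.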

programs: $\vdash^{src}_{obj\,q} q : Q$
expressions: $\Gamma \vdash^{src}_{obj\,e} e : E$
prestorables: $\Gamma \vdash^{src}_{obj\,b} b : B\,!\,T$
pures: $\Gamma \vdash^{src}_{obj\,p} p : P$
values: $\vdash^{src}_{obj\,v} v : P$

Figure 1.1.5. Object Source Language Typing Judgments

Figure 1.1.5 defines five judgments that may be derived with respect to programs, expressions, prestorables, pures, and values. Each judgment asserts that elements of a particular syntactic class have a corresponding type. A static environment is required for typing expressions, prestorables, and pures. It is not required for typing programs because no variables are free at the top level. It is not required for typing values because they do not include program variables. Programs and expressions are assigned program and expression types, both of which include an effect. Prestorables are assigned a trace type as well as a storable type, because processing may be required before they are stored. Both pures and values are assigned a pure type.


Modelling State With Monads


$$\frac{}{\vdash^{src}_{obj\,v} g : \mathrm{TypeOf}(g)}\quad(\vdash^{src}_{obj\,v}\text{-glob-const})$$

Figure 1.1.6. Typing of Object Source Language Values

Figure 1.1.6 presents the typing rule for values. It assumes a total function TypeOf: src obj g ⇒ src obj G, which assigns a global constant type to each global constant. We further assume that TypeOf(true) = Bool, TypeOf(false) = Bool, and TypeOf(unit) = Unit.

$$\frac{\emptyset \vdash^{src}_{obj\,e} q : Q}{\vdash^{src}_{obj\,q} q : Q}\quad(\vdash^{src}_{obj\,q})$$

Figure 1.1.7. Typing of Object Source Language Programs

Object source language programs are typed as expressions in an empty environment in Figure 1.1.7.

$$\frac{}{\Gamma \vdash^{src}_{obj\,p} x : \Gamma(x)}\quad(\vdash^{src}_{obj\,p}\text{-var})
\qquad
\frac{\vdash^{src}_{obj\,v} v : P}{\Gamma \vdash^{src}_{obj\,p} v : P}\quad(\vdash^{src}_{obj\,p}\text{-value})$$

Figure 1.1.8. Typing of Object Source Language Pures

Figure 1.1.8 presents the typing rules for pures. Variables are given the type determined by the environment. Values are typed as in Figure 1.1.6, as a degenerate case of pures for which the environment may be ignored.

$$\frac{\Gamma \vdash^{src}_{obj\,e} e : P\,!\,\varepsilon}{\Gamma \vdash^{src}_{obj\,b} \langle \mathrm{ref}\ e \rangle : \mathrm{Ref}\ P\,!\,\varepsilon}\quad(\vdash^{src}_{obj\,b}\text{-ref})$$

$$\frac{\Gamma\{x \mapsto P_1\} \vdash^{src}_{obj\,e} e : P_2\,!\,\varepsilon}{\Gamma \vdash^{src}_{obj\,b} \langle \lambda x.e \rangle : P_1 \stackrel{\varepsilon}{\Rightarrow} P_2\,!\,\emptyset}\quad(\vdash^{src}_{obj\,b}\text{-}\lambda)$$

Figure 1.1.9. Typing of Object Source Language Prestorables

Figure 1.1.9 presents the typing rules for prestorables. These include the differences between various sorts of allocations that are abstracted out by our use of a common allocation rule for expressions. Isolating them here allows us to collect the remaining logic as a single alloc rule for expressions. $\vdash^{src}_{obj\,b}$-ref types reference cells as references to the type of the expression describing the cell’s contents. Only reference cell prestorables may have nonempty effect, namely the effect of evaluating that expression. $\vdash^{src}_{obj\,b}$-λ assigns functions a function type with range and latent effect determined as the type and effect of the body in an environment extended with the formal parameter bound to the domain type.

$$\frac{\Gamma \vdash^{src}_{obj\,e} e_1 : P_1\,!\,\varepsilon_1 \qquad \Gamma\{x \mapsto P_1\} \vdash^{src}_{obj\,e} e_2 : P_2\,!\,\varepsilon_2}{\Gamma \vdash^{src}_{obj\,e} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 : P_2\,!\,\varepsilon_1 \cup \varepsilon_2}\quad(\vdash^{src}_{obj\,e}\text{-let})$$

$$\frac{\Gamma \vdash^{src}_{obj\,p} p : P}{\Gamma \vdash^{src}_{obj\,e} p : P\,!\,\emptyset}\quad(\vdash^{src}_{obj\,e}\text{-pure})
\qquad
\frac{\Gamma \vdash^{src}_{obj\,b} b : B\,!\,\varepsilon}{\Gamma \vdash^{src}_{obj\,e} b : B\,!\,\varepsilon \cup \{\mathrm{alloc}\}}\quad(\vdash^{src}_{obj\,e}\text{-alloc})$$

$$\frac{\Gamma \vdash^{src}_{obj\,e} e : \mathrm{Ref}\ P\,!\,\varepsilon}{\Gamma \vdash^{src}_{obj\,e} \mathrm{deref}\ e : P\,!\,\varepsilon \cup \{\mathrm{read}\}}\quad(\vdash^{src}_{obj\,e}\text{-deref})$$

$$\frac{\Gamma \vdash^{src}_{obj\,e} e_1 : \mathrm{Ref}\ P\,!\,\varepsilon_1 \qquad \Gamma \vdash^{src}_{obj\,e} e_2 : P\,!\,\varepsilon_2}{\Gamma \vdash^{src}_{obj\,e} \mathrm{set}\ e_1\ \mathrm{to}\ e_2 : \mathrm{Unit}\,!\,\varepsilon_1 \cup \varepsilon_2 \cup \{\mathrm{write}\}}\quad(\vdash^{src}_{obj\,e}\text{-set})$$

$$\frac{\Gamma \vdash^{src}_{obj\,e} e_1 : P_1 \stackrel{\varepsilon}{\Rightarrow} P_2\,!\,\varepsilon_1 \qquad \Gamma \vdash^{src}_{obj\,e} e_2 : P_1\,!\,\varepsilon_2}{\Gamma \vdash^{src}_{obj\,e} e_1\ e_2 : P_2\,!\,\varepsilon \cup \varepsilon_1 \cup \varepsilon_2 \cup \{\mathrm{exec}\}}\quad(\vdash^{src}_{obj\,e}\text{-app})$$

Figure 1.1.10. Typing of Object Source Language Expressions

Figure 1.1.10 presents the typing rules for expressions. Pure expressions are typed as in Figure 1.1.8 and assigned an empty effect. let expressions are assigned the type of the body and an effect that is the union of the effects of the two component expressions, the body being typed with an environment extended with the local variable bound to the type of the definition. Allocations of prestorables are typed as in Figure 1.1.9, and assigned an effect that consists of an indication of an allocation, in addition to the effect of the prestorable. Dereferences of reference cells require that their argument expression be typed as a reference cell and are assigned its contained type. They are assigned an effect that consists of an indication of a read, in addition to the effect of the argument. Settings of reference cells are assigned the unit type. They are assigned an effect that consists of the unions of the effects of their component expressions, along with an indication of a write. They require that the first argument be typed as a reference cell type whose contained type is the type of the second argument. Applications require that their operator have a function type and that the operand have its domain type. They are assigned the range type, and an effect that is the union of the effects of the component expressions and the latent function effect, along with an indication of a function execution.

Figure 1.1.11. Sample Object Source Language Derivation. The judgments of the derivation tree are listed from the leaves to the root, each followed by the rule that concludes it; d2, d3, and d4 name the subderivations of the original figure.

⊢ false : Bool ($\vdash^{src}_{obj\,v}$-glob-const)
∅ ⊢ false : Bool ($\vdash^{src}_{obj\,p}$-value)
∅ ⊢ false : Bool ! ∅ ($\vdash^{src}_{obj\,e}$-pure)
∅ ⊢ ⟨ref false⟩ : Ref Bool ! ∅ ($\vdash^{src}_{obj\,b}$-ref)
∅ ⊢ ⟨ref false⟩ : Ref Bool ! {alloc} ($\vdash^{src}_{obj\,e}$-alloc)
∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x2 : Ref Bool ($\vdash^{src}_{obj\,p}$-var)
∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x2 : Ref Bool ! ∅ ($\vdash^{src}_{obj\,e}$-pure)
d2: ∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x1 : Bool ($\vdash^{src}_{obj\,p}$-var)
d2: ∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x1 : Bool ! ∅ ($\vdash^{src}_{obj\,e}$-pure)
∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ set x2 to x1 : Unit ! {write} ($\vdash^{src}_{obj\,e}$-set)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ : Bool ⇒^{write} Unit ! ∅ ($\vdash^{src}_{obj\,b}$-λ)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ : Bool ⇒^{write} Unit ! {alloc} ($\vdash^{src}_{obj\,e}$-alloc)
d3: ⊢ true : Bool ($\vdash^{src}_{obj\,v}$-glob-const)
d3: ∅{x2 ↦ Ref Bool} ⊢ true : Bool ($\vdash^{src}_{obj\,p}$-value)
d3: ∅{x2 ↦ Ref Bool} ⊢ true : Bool ! ∅ ($\vdash^{src}_{obj\,e}$-pure)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ true : Unit ! {alloc, exec, write} ($\vdash^{src}_{obj\,e}$-app)
d4: ∅{x2 ↦ Ref Bool} ⊢ x2 : Ref Bool ($\vdash^{src}_{obj\,p}$-var)
d4: ∅{x2 ↦ Ref Bool} ⊢ x2 : Ref Bool ! ∅ ($\vdash^{src}_{obj\,e}$-pure)
d4: ∅{x2 ↦ Ref Bool} ⊢ deref x2 : Bool ! {read} ($\vdash^{src}_{obj\,e}$-deref)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ true; deref x2 : Bool ! {alloc, exec, write, read} ($\vdash^{src}_{obj\,e}$-seq)
∅ ⊢ let x2 = ⟨ref false⟩ in ⟨λx1.set x2 to x1⟩ true; deref x2 : Bool ! {alloc, exec, write, read} ($\vdash^{src}_{obj\,e}$-let)
⊢ let x2 = ⟨ref false⟩ in ⟨λx1.set x2 to x1⟩ true; deref x2 : Bool ! {alloc, exec, write, read} ($\vdash^{src}_{obj\,q}$)

We present in Figure 1.1.11 a sample typing derivation of the last program of Figure 1.1.2. To conserve space, we do not annotate judgments with the syntactic class of the term, which is manifest in the rule name. The static environment is not present for programs, empty for top-level expressions, extended as one descends into let and function bodies, and discarded for values. Effects are not present for pures or values, and are accumulated as one emerges to top-level expressions. This program performs all four possible action types. We consider the rule $\vdash^{src}_{obj\,e}$-seq used here to be derived from $\vdash^{src}_{obj\,e}$-let.
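To make the rules of Figures 1.1.6 through 1.1.10 concrete, the following Python sketch checks expressions encoded as nested tuples and returns a pure type together with an effect. It is an illustration under stated assumptions, not the system used in this work: the tuple encoding and the name check are hypothetical, λ-abstractions carry an explicit domain type (the object language itself has no annotations, so full inference would need unification), and extending a dictionary shadows an existing binding, which stands in for the renaming of bound variables assumed in the text.

```python
ALLOC, READ, WRITE, EXEC = "alloc", "read", "write", "exec"
TYPE_OF = {"true": "Bool", "false": "Bool", "unit": "Unit"}  # TypeOf on constants

def check(env, e):
    """Return (pure type, effect) for expression e under env, or raise TypeError."""
    tag = e[0]
    if tag == "const":                        # v-glob-const, p-value, e-pure
        return TYPE_OF[e[1]], frozenset()
    if tag == "var":                          # p-var, e-pure
        if e[1] not in env:
            raise TypeError(f"unbound variable {e[1]}")
        return env[e[1]], frozenset()
    if tag == "let":                          # e-let: union of both effects
        _, x, e1, e2 = e
        p1, eff1 = check(env, e1)
        p2, eff2 = check({**env, x: p1}, e2)
        return p2, eff1 | eff2
    if tag == "ref":                          # b-ref followed by e-alloc
        p, eff = check(env, e[1])
        return ("Ref", p), eff | {ALLOC}
    if tag == "lam":                          # b-λ followed by e-alloc
        _, x, dom, body = e                   # dom is an assumed annotation
        rng, latent = check({**env, x: dom}, body)
        return ("Fun", dom, latent, rng), frozenset({ALLOC})
    if tag == "deref":                        # e-deref
        p, eff = check(env, e[1])
        if not (isinstance(p, tuple) and p[0] == "Ref"):
            raise TypeError("deref expects a reference cell")
        return p[1], eff | {READ}
    if tag == "set":                          # e-set
        p1, eff1 = check(env, e[1])
        p2, eff2 = check(env, e[2])
        if not (isinstance(p1, tuple) and p1[0] == "Ref" and p1[1] == p2):
            raise TypeError("set expects a cell and a value of its contained type")
        return "Unit", eff1 | eff2 | {WRITE}
    if tag == "app":                          # e-app: latent effect is realized
        p1, eff1 = check(env, e[1])
        p2, eff2 = check(env, e[2])
        if not (isinstance(p1, tuple) and p1[0] == "Fun" and p1[1] == p2):
            raise TypeError("application expects a function and a matching operand")
        return p1[3], p1[2] | eff1 | eff2 | {EXEC}
    raise TypeError(f"unknown form {tag}")

# Program (6) of Figure 1.1.2, with e1; e2 expanded to a let on an unused variable.
prog6 = ("let", "x2", ("ref", ("const", "false")),
         ("let", "_",
          ("app", ("lam", "x1", "Bool", ("set", ("var", "x2"), ("var", "x1"))),
                  ("const", "true")),
          ("deref", ("var", "x2"))))
```

Under these assumptions, check({}, prog6) yields the type Bool with all four atomic effects, in agreement with the conclusion of Figure 1.1.11.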

2. Monadic Language Statics

Figure 1.2.12 presents the syntax of a simple monadic language with effects. It has only a few differences from the object language in Figure 1.1.1. First, programs and expressions are separated, with expressions denoting computations yielding values. Programs thus consist of runs of expressions and values. Second, pures are excluded from expressions unless wrapped in a return form, representing a trivial computation. Third, the component expressions of allocations, dereferences, and settings of reference cells, and applications of functions are now restricted to be pures. The components of let forms and function bodies remain expressions since they represent computations.

q ∈ src mon q ::= run e | v
e ∈ src mon e ::= return p | let x = e in e | b | deref p | set p to p | p p
b ∈ src mon b ::= ⟨ref p⟩ | ⟨λx.e⟩
p ∈ src mon p ::= v | x
v ∈ src mon v ::= g
g ∈ src mon g ⊇ src obj g
x ∈ src mon x ⊇ src obj x
e1; e2 → let x = e1 in e2 (x ∉ fpv(e2))

Figure 1.2.12. Monadic Source Language Syntax


(1) unit
(2) run return unit
(3) run let x = ⟨ref unit⟩ in deref x
(4) run let x = ⟨ref false⟩ in set x to true; deref x
(5) run ⟨ref unit⟩
(6) run let x = let x1 = ⟨λx.return x⟩ in ⟨ref x1⟩ in let x1 = ⟨λx.return true⟩ in set x to x1; let x1 = deref x in x1 false
(7) run let x2 = ⟨ref false⟩ in let x3 = ⟨λx1.set x2 to x1⟩ in let x4 = return true in x3 x4; deref x2

Figure 1.2.13. Some Typable Monadic Language Programs

(1) run ⟨λx1.return x2⟩
(2) run deref unit
(3) run let x = ⟨λx.return x⟩ in deref x
(4) run let x = ⟨ref unit⟩ in x unit
(5) run let x = ⟨ref unit⟩ in set x to x
(6) run let x = ⟨λx.x x⟩ in x x
(7) let x1 = ⟨ref unit⟩ in let x2 = ⟨ref unit⟩ in set x1 to x2
(8) let x1 = ⟨ref ⟨ref unit⟩⟩ in let x2 = ⟨λx.return x⟩ in set x1 to x2
(9) run let x1 = ⟨λx.return x⟩ in let x2 = ⟨ref unit⟩ in x1 unit; x1 x2
(10) run let x1 = ⟨λx.return x⟩ in let x2 = ⟨ref unit⟩ in let x3 = ⟨λx.return unit⟩ in x1 x2; x1 x3

Figure 1.2.14. Some Untypable Monadic Language Programs

In Figure 1.2.13, we present typable programs similar to those of Figure 1.1.2 (but not necessarily their translations) and in Figure 1.2.14, we present untypable programs similar to those of Figure 1.1.3. The programs now follow the syntax of Figure 1.2.12. Observe that the first two programs in Figure 1.2.13 both correspond to the global constant unit of Figure 1.1.2. The remaining programs all have a run form on the outside. Complex subexpressions are pulled out using let forms. Pures in expression context such as the definition of a let or the body of a function are enclosed in a return form.

Q ∈ src mon Q ::= G | ∅
Γ ∈ src mon Γ ::= ∅ | Γ{x ↦ P}
E ∈ src mon E ::= T P
B ∈ src mon B ::= Ref P | P ⇒ E
P ∈ src mon P ::= G | B
T ∈ src mon T ::= St^ε
ε ∈ src mon ε = {src mon F}
F ∈ src mon F ::= ι
ι ∈ src mon ι ::= alloc | read | write | exec
G ∈ src mon G ⊇ src obj G

Figure 1.2.15. Monadic Source Language Static Syntax

Figure 1.2.15 presents a static syntax for this monadic language. Like the dynamic syntax, it is similar to the object language (Figure 1.1.4). It differs in that a trace type takes the form of a functor annotated with an effect and in that an expression type takes the form of an application of a trace type to an object represented by a pure type. Also, the latent effect of a function type now resides in the range expression type. Finally, this semantics will allow a first, primitive example of encapsulation: effects and storable types will not be visible at the level of programs. To that end, a dangling pointer type ∅ is introduced to replace storable types at the level of program types.

programs: $\vdash^{src}_{mon\,q} q : Q$
expressions: $\Gamma \vdash^{src}_{mon\,e} e : E$
prestorables: $\Gamma \vdash^{src}_{mon\,b} b : B$
pures: $\Gamma \vdash^{src}_{mon\,p} p : P$
values: $\vdash^{src}_{mon\,v} v : P$

Figure 1.2.16. Monadic Source Language Typing Judgments

Figure 1.2.16 presents the judgments for this monadic language. The judgment for prestorables differs from that of the object language (Figure 1.1.5) in that it does not assign any effect (since the contained expression of a prestorable reference cell must be pure).


$$\frac{}{\vdash^{src}_{mon\,v} g : \mathrm{TypeOf}(g)}\quad(\vdash^{src}_{mon\,v}\text{-glob-const})$$

Figure 1.2.17. Typing of Monadic Source Language Values

The typing rule for values in Figure 1.2.17 reiterates that of the object language (Figure 1.1.6).

$$\frac{\emptyset \vdash^{src}_{mon\,e} e : \mathrm{St}^{\varepsilon} P}{\vdash^{src}_{mon\,q} \mathrm{run}\ e : P := \emptyset}\quad(\vdash^{src}_{mon\,q}\text{-run})
\qquad
\frac{\vdash^{src}_{mon\,v} v : P}{\vdash^{src}_{mon\,q} v : P}\quad(\vdash^{src}_{mon\,q}\text{-value})$$

Figure 1.2.18. Typing of Monadic Source Language Programs

There are now two typing rules for programs, presented in Figure 1.2.18. Computations to be run are typed with the rules of Figure 1.2.21 below, although the effect is dropped and the pure type is modified. The operation P := ∅ replaces storable types B with ∅ but leaves global constant types unchanged. Programs that evaluate to an allocated value are thus assigned the dangling pointer type, preventing storable types from escaping to the level of programs. Value programs are typed with the rule in Figure 1.2.17, in terms of the typing of values.

$$\frac{}{\Gamma \vdash^{src}_{mon\,p} x : \Gamma(x)}\quad(\vdash^{src}_{mon\,p}\text{-var})
\qquad
\frac{\vdash^{src}_{mon\,v} v : P}{\Gamma \vdash^{src}_{mon\,p} v : P}\quad(\vdash^{src}_{mon\,p}\text{-value})$$

Figure 1.2.19. Typing of Monadic Source Language Pures

The typing rules for pures in Figure 1.2.19 reiterate those of the object language (Figure 1.1.8).

$$\frac{\Gamma \vdash^{src}_{mon\,p} p : P}{\Gamma \vdash^{src}_{mon\,b} \langle \mathrm{ref}\ p \rangle : \mathrm{Ref}\ P}\quad(\vdash^{src}_{mon\,b}\text{-ref})
\qquad
\frac{\Gamma\{x \mapsto P_1\} \vdash^{src}_{mon\,e} e_0 : \mathrm{St}^{\varepsilon} P_2}{\Gamma \vdash^{src}_{mon\,b} \langle \lambda x.e_0 \rangle : (P_1 \Rightarrow \mathrm{St}^{\varepsilon} P_2)}\quad(\vdash^{src}_{mon\,b}\text{-}\lambda)$$

Figure 1.2.20. Typing of Monadic Source Language Prestorables

The typing rules for prestorables in Figure 1.2.20 differ from the object language (Figure 1.1.9) in that they do not assign any effect, and in that they use the revised syntax for function and expression types. The antecedent of $\vdash^{src}_{mon\,b}$-ref is a pure judgment in the monadic language. The typing rules for expressions in Figure 1.2.21 similarly differ from the object language (Figure 1.1.10) in that they use the revised syntax for function and expression types. No effect is provided by the prestorable in an allocation, or by typing components of dereferences and settings of reference cells, or applications of functions, so none is incorporated into the effect of these expressions. let forms combine the effects of the definition and the body, and applications must incorporate the latent effect of the function. $\vdash^{src}_{mon\,e}$-pure expects a return form for the expression.

Figures 1.3.22 and 1.3.23 present a derivation of the last program of Figure 1.2.13. The rule $\vdash^{src}_{mon\,q}$-run is used to derive the program. Nested monadic operations are pulled out through let constructs. The antecedents of derivations of monadic operations are now pure derivations. $\vdash^{src}_{mon\,e}$-pure is used only to introduce a return form. The syntax of expression types and the typing of prestorables are adapted to the monadic language. Both environments and trace types are discarded at the level of programs.
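The monadic rules (Figures 1.2.17 through 1.2.21) admit a similar informal sketch. As before, the tuple encoding, the function names pure, expr, and program, the explicit domain annotation on λ-abstractions, and the string "Dangling" standing in for the dangling pointer type are all assumptions made for illustration. Expression types are encoded as ("St", effect, pure type), and function types carry their latent effect inside the range expression type.

```python
ALLOC, READ, WRITE, EXEC = "alloc", "read", "write", "exec"
TYPE_OF = {"true": "Bool", "false": "Bool", "unit": "Unit"}
DANGLING = "Dangling"          # stand-in for the dangling pointer type

def pure(env, p):
    """Pure judgment: no effect is assigned."""
    if p[0] == "const":
        return TYPE_OF[p[1]]
    if p[0] == "var" and p[1] in env:
        return env[p[1]]
    raise TypeError(f"ill-typed pure {p!r}")

def is_type(p, head):
    return isinstance(p, tuple) and p[0] == head

def expr(env, e):
    """Expression judgment: returns ("St", effect, pure type)."""
    tag = e[0]
    if tag == "return":                           # e-pure
        return ("St", frozenset(), pure(env, e[1]))
    if tag == "let":                              # e-let
        _, x, e1, e2 = e
        _, eff1, p1 = expr(env, e1)
        _, eff2, p2 = expr({**env, x: p1}, e2)
        return ("St", eff1 | eff2, p2)
    if tag == "ref":                              # b-ref, e-alloc
        return ("St", frozenset({ALLOC}), ("Ref", pure(env, e[1])))
    if tag == "lam":                              # b-λ, e-alloc
        _, x, dom, body = e                       # dom is an assumed annotation
        return ("St", frozenset({ALLOC}), ("Fun", dom, expr({**env, x: dom}, body)))
    if tag == "deref":                            # e-deref
        p = pure(env, e[1])
        if not is_type(p, "Ref"):
            raise TypeError("deref expects a reference cell")
        return ("St", frozenset({READ}), p[1])
    if tag == "set":                              # e-set
        p1, p2 = pure(env, e[1]), pure(env, e[2])
        if not (is_type(p1, "Ref") and p1[1] == p2):
            raise TypeError("set expects a cell and a value of its contained type")
        return ("St", frozenset({WRITE}), "Unit")
    if tag == "app":                              # e-app
        p1, p2 = pure(env, e[1]), pure(env, e[2])
        if not (is_type(p1, "Fun") and p1[1] == p2):
            raise TypeError("application expects a function and a matching operand")
        _, latent, rng = p1[2]
        return ("St", latent | {EXEC}, rng)
    raise TypeError(f"unknown form {tag}")

def program(q):
    """Program judgment: q-run drops the effect and applies P := ∅; q-value otherwise."""
    if q[0] == "run":
        _, _effect, p = expr({}, q[1])
        return DANGLING if isinstance(p, tuple) else p
    return TYPE_OF[q[1]]

# Body of program (7) of Figure 1.2.13, with e1; e2 expanded to a let.
body7 = ("let", "x2", ("ref", ("const", "false")),
         ("let", "x3", ("lam", "x1", "Bool", ("set", ("var", "x2"), ("var", "x1"))),
          ("let", "x4", ("return", ("const", "true")),
           ("let", "_", ("app", ("var", "x3"), ("var", "x4")),
            ("deref", ("var", "x2"))))))
```

In this sketch the expression body7 receives all four atomic effects, while program(("run", body7)) reports only Bool, which illustrates how effects are hidden at the level of programs.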

3. Translation

$$\frac{\Gamma \vdash^{src}_{mon\,e} e_1 : \mathrm{St}^{\varepsilon_1} P_1 \qquad \Gamma\{x \mapsto P_1\} \vdash^{src}_{mon\,e} e_2 : \mathrm{St}^{\varepsilon_2} P_2}{\Gamma \vdash^{src}_{mon\,e} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 : \mathrm{St}^{\varepsilon_1 \cup \varepsilon_2} P_2}\quad(\vdash^{src}_{mon\,e}\text{-let})$$

$$\frac{\Gamma \vdash^{src}_{mon\,p} p : P}{\Gamma \vdash^{src}_{mon\,e} \mathrm{return}\ p : \mathrm{St}^{\emptyset} P}\quad(\vdash^{src}_{mon\,e}\text{-pure})
\qquad
\frac{\Gamma \vdash^{src}_{mon\,b} b : B}{\Gamma \vdash^{src}_{mon\,e} b : \mathrm{St}^{\{\mathrm{alloc}\}} B}\quad(\vdash^{src}_{mon\,e}\text{-alloc})$$

$$\frac{\Gamma \vdash^{src}_{mon\,p} p : \mathrm{Ref}\ P}{\Gamma \vdash^{src}_{mon\,e} \mathrm{deref}\ p : \mathrm{St}^{\{\mathrm{read}\}} P}\quad(\vdash^{src}_{mon\,e}\text{-deref})
\qquad
\frac{\Gamma \vdash^{src}_{mon\,p} p_1 : \mathrm{Ref}\ P \qquad \Gamma \vdash^{src}_{mon\,p} p_2 : P}{\Gamma \vdash^{src}_{mon\,e} \mathrm{set}\ p_1\ \mathrm{to}\ p_2 : \mathrm{St}^{\{\mathrm{write}\}} \mathrm{Unit}}\quad(\vdash^{src}_{mon\,e}\text{-set})$$

$$\frac{\Gamma \vdash^{src}_{mon\,p} p_1 : (P_1 \Rightarrow \mathrm{St}^{\varepsilon} P_2) \qquad \Gamma \vdash^{src}_{mon\,p} p_2 : P_1}{\Gamma \vdash^{src}_{mon\,e} p_1\ p_2 : \mathrm{St}^{\varepsilon \cup \{\mathrm{exec}\}} P_2}\quad(\vdash^{src}_{mon\,e}\text{-app})$$

Figure 1.2.21. Typing of Monadic Source Language Expressions

Figure 1.3.22. Sample Monadic Source Language Derivation, I. The judgments are listed from the leaves to the root, each followed by the rule that concludes it; d2, d3, and d5 name subderivations, with d3 given in Figure 1.3.23.

⊢ false : Bool ($\vdash^{src}_{mon\,v}$-glob-const)
∅ ⊢ false : Bool ($\vdash^{src}_{mon\,p}$-value)
∅ ⊢ ⟨ref false⟩ : Ref Bool ($\vdash^{src}_{mon\,b}$-ref)
∅ ⊢ ⟨ref false⟩ : St^{alloc} Ref Bool ($\vdash^{src}_{mon\,e}$-alloc)
d2: ∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x2 : Ref Bool ($\vdash^{src}_{mon\,p}$-var)
∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ x1 : Bool ($\vdash^{src}_{mon\,p}$-var)
∅{x2 ↦ Ref Bool}{x1 ↦ Bool} ⊢ set x2 to x1 : St^{write} Unit ($\vdash^{src}_{mon\,e}$-set)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ : Bool ⇒ St^{write} Unit ($\vdash^{src}_{mon\,b}$-λ)
∅{x2 ↦ Ref Bool} ⊢ ⟨λx1.set x2 to x1⟩ : St^{alloc} (Bool ⇒ St^{write} Unit) ($\vdash^{src}_{mon\,e}$-alloc)
∅{x2 ↦ Ref Bool} ⊢ let x3 = ⟨λx1.set x2 to x1⟩ in let x4 = return true in x3 x4; deref x2 : St^{alloc,exec,write,read} Bool ($\vdash^{src}_{mon\,e}$-let, using d3)
∅ ⊢ let x2 = ⟨ref false⟩ in let x3 = ⟨λx1.set x2 to x1⟩ in let x4 = return true in x3 x4; deref x2 : St^{alloc,exec,write,read} Bool ($\vdash^{src}_{mon\,e}$-let)
⊢ run let x2 = ⟨ref false⟩ in let x3 = ⟨λx1.set x2 to x1⟩ in let x4 = return true in x3 x4; deref x2 : Bool ($\vdash^{src}_{mon\,q}$-run)
d5: ∅{x2 ↦ Ref Bool}{x3 ↦ Bool ⇒ St^{write} Unit}{x4 ↦ Bool} ⊢ x4 : Bool ($\vdash^{src}_{mon\,p}$-var)

We are now ready to present our first translation between an object language with effects and a monadic language. Figure 1.3.24 translates source program syntax. It provides separate translation functions for each syntactic class, except that prestorables are translated as the expressions that allocate them. The translation for programs inserts the run form (even for values). The first two lines of the translation for expressions insert return forms for pures and leave let forms in place. Operations expecting an expression in the object language and a pure in the monadic language are translated by inserting a let form for each such operand. In both languages, the body of a function is an expression, so no let form need be inserted. The translations on pures and values are identity functions. We have assumed that the program variables and global constants of the object language are a subset of those of the monadic language. Consider the sample object language derivation in Figure 1.1.11. Applying the translation of Figure 1.3.24 to the program let x2 = ⟨ref false⟩ in ⟨λx1.set x2 to x1⟩ true; deref x2 yields6

run let x2 = let x3 = return false in ⟨ref x3⟩ in let x4 = ⟨λx1.let x5 = return x2 in let x6 = return x1 in set x5 to x6⟩ in let x7 = return true in x4 x7; let x8 = return x2 in deref x8
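The dynamic translation of Figure 1.3.24 can be sketched as a recursive function over the same hypothetical tuple encoding of syntax used for illustration here. The name translate_program and the reserved prefix _t for generated variables are assumptions of this sketch: source programs are assumed not to use that prefix, so each generated let binding is fresh and the second operand of set and application can safely be translated within the scope of the first binding.

```python
import itertools

def translate_program(e):
    """Translate an object-language program into a monadic one: [q] = run [e]."""
    counter = itertools.count(1)

    def fresh():
        # Generated names use a reserved prefix (an assumption of this sketch).
        return f"_t{next(counter)}"

    def tr(e):
        tag = e[0]
        if tag in ("var", "const"):               # pures are wrapped in return
            return ("return", e)
        if tag == "let":                          # let forms are left in place
            _, x, e1, e2 = e
            return ("let", x, tr(e1), tr(e2))
        if tag in ("ref", "deref"):               # bind the operand, then operate on it
            x = fresh()
            return ("let", x, tr(e[1]), (tag, ("var", x)))
        if tag == "lam":                          # the body is already an expression
            _, x, body = e
            return ("lam", x, tr(body))
        if tag in ("set", "app"):                 # two operands, two nested lets
            x1 = fresh()
            t1 = tr(e[1])
            x2 = fresh()
            t2 = tr(e[2])
            return ("let", x1, t1,
                    ("let", x2, t2, (tag, ("var", x1), ("var", x2))))
        raise ValueError(f"unknown form {tag}")

    return ("run", tr(e))
```

For example, translating deref ⟨ref unit⟩ produces a run form whose body binds the translated allocation to a generated variable and then dereferences that variable, following the shape of the deref and ref lines of Figure 1.3.24.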

Figure 1.3.25 translates the static syntax. It modifies the form of trace types (effects) to be annotated state functors, and modifies the form of expression types to be applications of these. Global constant types are not modified by the translation. We have assumed that the global constant types of the object language are a subset of those of the monadic language.7 Storable types are translated as dangling pointer types at the level of programs only. Otherwise, reference cell types are translated in place and function types are modified to apply the latent effect to form an expression type as the range. Applying the translation of Figure 1.3.25 to the program type Bool ! {alloc, exec, write, read} simply yields Bool. The effect is not included in the monadic program type, but using the translation for expression types we obtain St^{alloc,exec,write,read} Bool.

6 When a prestorable or configuration takes more than one line, we vertically align the left and right angle brackets at the top line.

7 If we dropped the inclusion constraint on constants and generalized the translation, we would require an assumption that the TypeOf( ) operator commutes with translation, i.e., $[\![\mathrm{TypeOf}(g)]\!]^{src}_{obj\,PN} = \mathrm{TypeOf}([\![g]\!]^{src}_{obj\,vN})$.

Figure 1.3.23. Sample Monadic Source Language Derivation, II. The judgments are listed from the leaves to the root of each subderivation, each followed by the rule that concludes it. All judgments below d3 use the environment ∅{x2 ↦ Ref Bool}{x3 ↦ Bool ⇒ St^{write} Unit}{x4 ↦ Bool}, abbreviated Γ′ here.

d3: ⊢ true : Bool ($\vdash^{src}_{mon\,v}$-glob-const)
d3: ∅{x2 ↦ Ref Bool}{x3 ↦ Bool ⇒ St^{write} Unit} ⊢ true : Bool ($\vdash^{src}_{mon\,p}$-value)
d3: ∅{x2 ↦ Ref Bool}{x3 ↦ Bool ⇒ St^{write} Unit} ⊢ return true : St^∅ Bool ($\vdash^{src}_{mon\,e}$-pure)
d3: ∅{x2 ↦ Ref Bool}{x3 ↦ Bool ⇒ St^{write} Unit} ⊢ let x4 = return true in x3 x4; deref x2 : St^{alloc,exec,write,read} Bool ($\vdash^{src}_{mon\,e}$-let, using d4)
d4: Γ′ ⊢ x3 : Bool ⇒ St^{write} Unit ($\vdash^{src}_{mon\,p}$-var)
d4: Γ′ ⊢ x3 x4 : St^{exec,write} Unit ($\vdash^{src}_{mon\,e}$-app, using d5)
d4: Γ′ ⊢ x3 x4; deref x2 : St^{exec,write,read} Bool ($\vdash^{src}_{mon\,e}$-seq, using d6)
d6: Γ′ ⊢ x2 : Ref Bool ($\vdash^{src}_{mon\,p}$-var)
d6: Γ′ ⊢ deref x2 : St^{read} Bool ($\vdash^{src}_{mon\,e}$-deref)

The translation preserves types in that if any of the object language judgments for programs, expressions, pures, or values is derivable, then a derivable monadic language judgment is obtainable via the translation.

Theorem 1.3.1 (Types Preservation).

$$\vdash^{src}_{obj\,q} q : Q \;\rightarrow\; \vdash^{src}_{mon\,q} [\![q]\!]^{src}_{obj\,qN} : [\![Q]\!]^{src}_{obj\,QN}$$
$$\Gamma \vdash^{src}_{obj\,e} e : E \;\rightarrow\; [\![\Gamma]\!]^{src}_{obj\,\Gamma N} \vdash^{src}_{mon\,e} [\![e]\!]^{src}_{obj\,eN} : [\![E]\!]^{src}_{obj\,EN}$$
$$\Gamma \vdash^{src}_{obj\,p} p : P \;\rightarrow\; [\![\Gamma]\!]^{src}_{obj\,\Gamma N} \vdash^{src}_{mon\,p} [\![p]\!]^{src}_{obj\,pN} : [\![P]\!]^{src}_{obj\,PN}$$
$$\vdash^{src}_{obj\,v} v : P \;\rightarrow\; \vdash^{src}_{mon\,v} [\![v]\!]^{src}_{obj\,vN} : [\![P]\!]^{src}_{obj\,PN}$$

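Programs and Expressions

$$[\![e]\!]^{src}_{obj\,qN} = \mathrm{run}\ [\![e]\!]^{src}_{obj\,eN}$$
$$[\![p]\!]^{src}_{obj\,eN} = \mathrm{return}\ [\![p]\!]^{src}_{obj\,pN}$$
$$[\![\mathrm{let}\ x = e_1\ \mathrm{in}\ e_2]\!]^{src}_{obj\,eN} = \mathrm{let}\ x = [\![e_1]\!]^{src}_{obj\,eN}\ \mathrm{in}\ [\![e_2]\!]^{src}_{obj\,eN}$$
$$[\![\langle \mathrm{ref}\ e \rangle]\!]^{src}_{obj\,eN} = \mathrm{let}\ x = [\![e]\!]^{src}_{obj\,eN}\ \mathrm{in}\ \langle \mathrm{ref}\ x \rangle$$
$$[\![\mathrm{deref}\ e]\!]^{src}_{obj\,eN} = \mathrm{let}\ x = [\![e]\!]^{src}_{obj\,eN}\ \mathrm{in}\ \mathrm{deref}\ x$$
$$[\![\mathrm{set}\ e_1\ \mathrm{to}\ e_2]\!]^{src}_{obj\,eN} = \mathrm{let}\ x_1 = [\![e_1]\!]^{src}_{obj\,eN}\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![e_2]\!]^{src}_{obj\,eN}\ \mathrm{in}\ \mathrm{set}\ x_1\ \mathrm{to}\ x_2$$
$$[\![\langle \lambda x.e \rangle]\!]^{src}_{obj\,eN} = \langle \lambda x.[\![e]\!]^{src}_{obj\,eN} \rangle$$
$$[\![e_1\ e_2]\!]^{src}_{obj\,eN} = \mathrm{let}\ x_1 = [\![e_1]\!]^{src}_{obj\,eN}\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![e_2]\!]^{src}_{obj\,eN}\ \mathrm{in}\ x_1\ x_2$$

Pures and Values

$$[\![x]\!]^{src}_{obj\,pN} = x$$
$$[\![v]\!]^{src}_{obj\,pN} = [\![v]\!]^{src}_{obj\,vN}$$
$$[\![g]\!]^{src}_{obj\,vN} = g$$

Figure 1.3.24. Translating Object Source Language to Monadic (Dynamic)

The interesting cases are application and set because those have two components. Since our let construct supports only a single binding and our translation creates bindings for each component, when the second component is bound it will be within the scope of the binding of the first component. Thus, we require a derivation of the second component that does not use the translation of the environment from the object language derivation but one extended with the new binding of the first component. That motivates the following lemma.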

Lemma 1.3.1 (Weakening).

$$\Gamma \vdash^{src}_{mon\,e} e : E \;\rightarrow\; \Gamma\{x' \mapsto P'\} \vdash^{src}_{mon\,e} e : E$$
$$\Gamma \vdash^{src}_{mon\,b} b : B \;\rightarrow\; \Gamma\{x' \mapsto P'\} \vdash^{src}_{mon\,b} b : B$$
$$\Gamma \vdash^{src}_{mon\,p} p : P \;\rightarrow\; \Gamma\{x' \mapsto P'\} \vdash^{src}_{mon\,p} p : P$$

Recall that finite functions are only extended to bind objects not already in their domain, so we need not fear that the environment extension might shadow an existing binding of a program variable free in the expression, prestorable, or pure.


Environments

$$[\![\emptyset]\!]^{src}_{obj\,\Gamma N} = \emptyset$$
$$[\![\Gamma\{x \mapsto P\}]\!]^{src}_{obj\,\Gamma N} = [\![\Gamma]\!]^{src}_{obj\,\Gamma N}\{x \mapsto [\![P]\!]^{src}_{obj\,PN}\}$$

Expression and Trace Types

$$[\![P\,!\,T]\!]^{src}_{obj\,EN} = [\![T]\!]^{src}_{obj\,TN}\ [\![P]\!]^{src}_{obj\,PN}$$
$$[\![T]\!]^{src}_{obj\,TN} = \mathrm{St}^{T}$$

Program and Pure Types

$$[\![G\,!\,T]\!]^{src}_{obj\,QN} = G$$
$$[\![B\,!\,T]\!]^{src}_{obj\,QN} = \emptyset$$
$$[\![G]\!]^{src}_{obj\,PN} = G$$
$$[\![B]\!]^{src}_{obj\,PN} = [\![B]\!]^{src}_{obj\,BN}$$

Storable Types

$$[\![\mathrm{Ref}\ P]\!]^{src}_{obj\,BN} = \mathrm{Ref}\ [\![P]\!]^{src}_{obj\,PN}$$
$$[\![P_1 \stackrel{T}{\Rightarrow} P_2]\!]^{src}_{obj\,BN} = [\![P_1]\!]^{src}_{obj\,PN} \Rightarrow [\![T]\!]^{src}_{obj\,TN}\ [\![P_2]\!]^{src}_{obj\,PN}$$

Figure 1.3.25. Translating Object Source Language to Monadic (Static)

We choose to present this and subsequent translations of the dynamic syntax independently of any translations of the static syntax, and to allow translations of typing judgments and derivations to be derived from translations of their component syntax. A translation consistent with our own but defined directly over typing derivations is a proof that our translation preserves types. The Curry/Howard isomorphism relates type systems to logics so that types correspond to propositions, programs correspond to (realize) proofs, and reduction rules correspond to proof transformations. In terms of the Curry/Howard isomorphism (assuming any side-conditions are formalized into additional judgments) the translation over derivations would correspond to a realizer of the statement that the original translation preserves types.

So far the reader has patiently absorbed two languages and a translation between them. We claim that these languages relate to memory effects; the names of operations and the type systems provided seem to support the claim, and we have asked you to consider that sufficient. We next, however, demonstrate precisely how each language provides memory effects by describing how programs are reduced as a computation proceeds.
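The static translation of Figure 1.3.25 is small enough to sketch directly. The encoding below is hypothetical: global constant types are strings, object-language function types are ("Fun", P1, T, P2) with the latent effect T as a set, object-language expression and program types are pairs (P, T), and "Dangling" stands in for the dangling pointer type. The function names are likewise chosen for this illustration only.

```python
DANGLING = "Dangling"      # stand-in for the dangling pointer type

def tr_pure(p):
    """Pure types: constants are unchanged, storable types are translated in place."""
    if isinstance(p, str):                    # global constant type G
        return p
    if p[0] == "Ref":                         # Ref P  becomes  Ref [P]
        return ("Ref", tr_pure(p[1]))
    _, dom, latent, rng = p                   # P1 =T=> P2  becomes  [P1] => St^T [P2]
    return ("Fun", tr_pure(dom), tr_expr((rng, latent)))

def tr_expr(e):
    """Expression types: P ! T becomes the state functor St^T applied to [P]."""
    p, t = e
    return ("St", frozenset(t), tr_pure(p))

def tr_prog(q):
    """Program types: the effect is dropped and storable types become dangling."""
    p, _t = q
    return p if isinstance(p, str) else DANGLING

def tr_env(env):
    """Environments are translated pointwise."""
    return {x: tr_pure(p) for x, p in env.items()}
```

Applied to the pair ("Bool", {alloc, exec, write, read}), tr_prog returns Bool alone while tr_expr retains the effect inside the St type, matching the two results discussed for the sample program.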


CHAPTER 2

Intermediate Languages

In this chapter we repeat the process of presenting an object and a monadic language along with type systems for each and a translation between them. We augment this description, however, with a reduction semantics that provides an operational description of how language forms are processed. In order to support a reduction semantics we must enlarge both languages. Again, except for the inclusion of storable functions, this presentation is largely based on that of Wadler [64].¹ At this stage, handling storable functions will not require substantial changes in Wadler's infrastructure; they will be more difficult to support in Parts II and III. As a reduction semantics for a language with state, this work is reminiscent of seminal work by Felleisen and Friedman [12] and Felleisen and Hieb [13].

1. Object Language

1.1. Dynamics.

Figure 2.1.1 presents the syntax of the object intermediate language. It differs from the object source language in Figure 1.1.1 in its inclusion of stores and configurations, as well as traces and offsets. A configuration is a state of a system that will be affected by our evaluation contexts and reduction rules. Program configurations $\langle q; s\rangle$ pair a program $q$ with a store $s$. Similarly, expression and value configurations pair the respective syntactic class with a store. In the case of expression configurations, the expression will be closed.

¹ Other differences include our inclusion of run and running in the monadic language and the fact that we take the contextual closure of reduction in both languages simultaneously with and not prior to the reflexive/transitive closure. Both points will be carried over to Parts II and III in order to model encapsulation.

\[
\begin{array}{rclcl}
\langle q; s\rangle & \in & \langle\mathbf{q};\mathbf{s}\rangle^{imd}_{obj} & = & \mathbf{q}^{imd}_{obj} \times \mathbf{s}^{imd}_{obj} \\
q & \in & \mathbf{q}^{imd}_{obj} & ::= & e \\
\langle e; s\rangle & \in & \langle\mathbf{e};\mathbf{s}\rangle^{imd}_{obj} & = & \mathbf{e}^{imd}_{obj} \times \mathbf{s}^{imd}_{obj} \\
e & \in & \mathbf{e}^{imd}_{obj} & ::= & p \mid \mathsf{let}\ x = e\ \mathsf{in}\ e \mid b \mid \mathsf{deref}\ e \mid \mathsf{set}\ e\ \mathsf{to}\ e \mid e\ e \\
b & \in & \mathbf{b}^{imd}_{obj} & ::= & \langle\mathsf{ref}\ e\rangle \mid \langle\lambda x.e\rangle \\
d & \in & \mathbf{d}^{imd}_{obj} & ::= & \langle\mathsf{ref}\ v\rangle \mid \langle\lambda x.e\rangle \\
p & \in & \mathbf{p}^{imd}_{obj} & ::= & v \mid x \\
\langle v; s\rangle & \in & \langle\mathbf{v};\mathbf{s}\rangle^{imd}_{obj} & = & \mathbf{v}^{imd}_{obj} \times \mathbf{s}^{imd}_{obj} \\
v & \in & \mathbf{v}^{imd}_{obj} & ::= & g \mid o \\
s & \in & \mathbf{s}^{imd}_{obj} & ::= & \emptyset\,\overline{\{o \mapsto d\}} \\
t & \in & \mathbf{t}^{imd}_{obj} & = & [\mathbf{f}]^{imd}_{obj} \\
f & \in & \mathbf{f}^{imd}_{obj} & ::= & (\iota\ @\ o) \\
\iota & \in & \boldsymbol{\iota}^{imd}_{obj} & ::= & \mathsf{alloc} \mid \mathsf{read} \mid \mathsf{write} \mid \mathsf{exec} \\
g & \in & \mathbf{g}^{imd}_{obj} & & \\
o & \in & \mathbf{o}^{imd}_{obj} & & \\
x & \in & \mathbf{x}^{imd}_{obj} & &
\end{array}
\]

Figure 2.1.1. Object Intermediate Language Syntax

A store is a finite function from offsets $o$ to storables $d$, i.e., it is either empty or extended to bind an offset to a storable. Offsets, which form a new primitive syntactic class, are thus values that are used to access a store. The notation for empty and extended stores is similar to that for environments in the preceding chapter, as is the constraint preventing duplicate bindings.² Storables are similar to prestorables but do not require further processing. As a program runs, allocations of prestorables will be replaced by offsets that point to the corresponding stored values. Our storables include both references to values and closed functions. Prestorables will now abstract reduction as well as typing rules for expression configurations. Clearly, $\mathbf{v}^{imd}_{obj} \subseteq \mathbf{e}^{imd}_{obj}$ and thus $\mathbf{d}^{imd}_{obj} \subseteq \mathbf{b}^{imd}_{obj}$.

² The formal definition of the notation for stores is abbreviated using an overbar to indicate zero or more repetitions.

Traces $t$ describe the execution of a program. Atomic traces $f$ indicate that an action occurs at a particular offset. Traces consist of lists of atomic traces, representing actions that occur in sequence as a program executes.

In preparation for the reduction rules below, we assume the standard definition for substitution $e[x' := v']$ of a value $v'$ for a program variable $x'$ in an expression $e$: $x[x' := v'] = v'$ if $x = x'$ and $x$ otherwise. Substitution is defined recursively through expressions, prestorables, pures, and values, except that the bodies of $\lambda$ prestorables and let expressions are shielded from substitution for the bound variable. We use $\mathrm{fpv}(e)$ to refer to the free program variables of an expression.

Stored values can be accessed through offsets. We define domain-checking and application of stores similarly to the corresponding operations on environments, i.e., $o_1 \in Dom(s\{o_2 \mapsto d\})$ if $o_1 = o_2$ or $o_1 \in Dom(s)$ (and thus $s\{o \mapsto d\}$ assumes $o \notin Dom(s)$), and $s\{o_1 \mapsto d\}(o_2) = d$ if $o_1 = o_2$ and $s(o_2)$ otherwise. We define $s\{o := d\}$ to update the store $s$ such that it maps offset $o$ to $d$; $s\{o_1 \mapsto d_1\}\{o_2 := d_2\} = s\{o_1 \mapsto d_2\}$ if $o_1 = o_2$ and $s\{o_2 := d_2\}\{o_1 \mapsto d_1\}$ otherwise. $\emptyset(o)$ and $\emptyset\{o := d\}$ are both undefined, i.e., both $s(o)$ and $s\{o := d\}$ assume $o \in Dom(s)$.

\[
\begin{array}{ll}
\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}\text{-let} &
\langle \mathsf{let}\ x = v\ \mathsf{in}\ e;\ s\rangle \overset{[\,]}{\rightharpoonup} \langle e[x := v];\ s\rangle \\[2ex]
\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}\text{-alloc} &
\langle d;\ s\rangle \overset{[(\mathsf{alloc}\ @\ o)]}{\rightharpoonup} \langle o;\ s\{o \mapsto d\}\rangle \\[2ex]
\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}\text{-app-}\lambda &
\dfrac{s(o) = \langle\lambda x.e\rangle}{\langle o\ v;\ s\rangle \overset{[(\mathsf{exec}\ @\ o)]}{\rightharpoonup} \langle e[x := v];\ s\rangle} \\[3ex]
\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}\text{-deref} &
\dfrac{s(o) = \langle\mathsf{ref}\ v\rangle}{\langle \mathsf{deref}\ o;\ s\rangle \overset{[(\mathsf{read}\ @\ o)]}{\rightharpoonup} \langle v;\ s\rangle} \\[3ex]
\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}\text{-set} &
\dfrac{s(o) = \langle\mathsf{ref}\ v_1\rangle}{\langle \mathsf{set}\ o\ \mathsf{to}\ v;\ s\rangle \overset{[(\mathsf{write}\ @\ o)]}{\rightharpoonup} \langle \mathsf{unit};\ s\{o := \langle\mathsf{ref}\ v\rangle\}\rangle}
\end{array}
\]

Figure 2.1.2. Object Language Expression Configuration Reduction Rules

A reduction semantics is presented in Figures 2.1.2 through 2.1.4. Figure 2.1.2 presents five reduction rules. These simple rules derive judgments of the form $\langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle$. They indicate that a reduction is valid between two expression configurations, yielding a trace $t$. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let processes a let form by substituting the definition value for each free occurrence of the local variable in the body. The store remains unchanged. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc is appropriate when the expression prestorable has the form of a storable, i.e., for reference cells, the contained expression must be a value. It allocates an offset and extends the store, registering an allocation at the offset. The resulting configuration has the new offset as the expression and a store extended by binding the new offset to the storable. It implicitly requires that $o \notin Dom(s)$.

The remaining rules $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref, $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set, and $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$ each require that all of the components of the expression be values and that the first component be an offset that is defined in the store. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref and $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set require that it be defined as a reference cell while $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$ requires that it be defined as a function. They use that offset to access and (in the case of $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set) update the store. The three rules register a read, a write, and an execution at the offset, respectively. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref extracts the contained value from the reference cell and makes it the resulting expression, leaving the store unchanged. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set results in unit, but updates the store to hold a new contained value. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$ substitutes an operand value for the formal parameter in the body of the function, yielding a resulting expression, and leaves the store unchanged. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$ and $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let correspond to $\beta_v$ reduction. We reduce sequences (abbreviations for let forms in which the bound variable does not occur in the body) using $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-seq, which is just a restriction of $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let to this condition.
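The five rules can be read as a partial function from redex configurations to a trace and a residual configuration. The following is a minimal sketch under stated assumptions, not the implementation used for this work: expressions are encoded as nested tuples with illustrative tags (`"const"`, `"off"`, `"var"`, `"let"`, `"ref"`, `"lam"`, `"deref"`, `"set"`, `"app"`), stores are Python dictionaries, and fresh offsets are chosen as 0, 1, 2, and so on. The name `reduce_top` is hypothetical.

```python
TRUE, FALSE, UNIT = ("const", True), ("const", False), ("const", "unit")

def is_value(e):
    return e[0] in ("const", "off")

def fpv(e):
    """Free program variables of an expression."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag in ("const", "off"):
        return set()
    if tag == "let":
        return fpv(e[2]) | (fpv(e[3]) - {e[1]})
    if tag == "lam":
        return fpv(e[2]) - {e[1]}
    return set().union(*(fpv(sub) for sub in e[1:]))

def subst(e, x, v):
    """e[x := v]; bodies that rebind x are shielded from substitution."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag in ("const", "off"):
        return e
    if tag == "let":
        body = e[3] if e[1] == x else subst(e[3], x, v)
        return ("let", e[1], subst(e[2], x, v), body)
    if tag == "lam":
        return e if e[1] == x else ("lam", e[1], subst(e[2], x, v))
    return (tag,) + tuple(subst(sub, x, v) for sub in e[1:])

def reduce_top(e, s):
    """One reduction of <e; s> at top level: (trace, e', s'), or None if no rule applies."""
    tag = e[0]
    if tag == "let" and is_value(e[2]):                               # -let
        return [], subst(e[3], e[1], e[2]), s
    if (tag == "ref" and is_value(e[1])) or (tag == "lam" and not fpv(e)):
        o = len(s)                                                    # -alloc, o not in Dom(s)
        return [("alloc", o)], ("off", o), {**s, o: e}
    if tag in ("deref", "set", "app") and e[1][0] == "off" and e[1][1] in s:
        o, d = e[1][1], s[e[1][1]]
        if tag == "deref" and d[0] == "ref":                          # -deref
            return [("read", o)], d[1], s
        if tag == "set" and d[0] == "ref" and is_value(e[2]):         # -set
            return [("write", o)], UNIT, {**s, o: ("ref", e[2])}
        if tag == "app" and d[0] == "lam" and is_value(e[2]):         # -app-lambda
            return [("exec", o)], subst(d[2], d[1], e[2]), s
    return None

# Allocate a cell and a function that sets it, apply the function, then read the cell.
t1, e1, s1 = reduce_top(("ref", FALSE), {})
t2, e2, s2 = reduce_top(("lam", "x1", ("set", e1, ("var", "x1"))), s1)
t3, e3, s3 = reduce_top(("app", e2, TRUE), s2)
t4, e4, s4 = reduce_top(e3, s3)
t5, e5, s5 = reduce_top(("deref", e1), s4)
trace = t1 + t2 + t3 + t4 + t5
```

Each call handles a redex only; locating the redex inside a larger expression is the job of the evaluation contexts introduced next.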

\[
\begin{array}{rclcl}
{}^{\to^*}[e]_q & \in & {}^{\to^*}[\mathbf{e}]\,\mathbf{q}^{imd}_{obj} & ::= & [e] \\
{}^{\to^*}[e]_e & \in & {}^{\to^*}[\mathbf{e}]\,\mathbf{e}^{imd}_{obj} & ::= & [e] \mid \mathsf{let}\ x = [e]\ \mathsf{in}\ e \mid \langle\mathsf{ref}\ [e]\rangle \mid \mathsf{deref}\ [e] \mid \mathsf{set}\ [e]\ \mathsf{to}\ e \mid \mathsf{set}\ v\ \mathsf{to}\ [e] \mid [e]\ e \mid v\ [e]
\end{array}
\]

Figure 2.1.3. CBV Object Language Atomic Expression Evaluation Contexts

A program expression context is a program with a hole expecting an expression. Metavariables for program expression contexts take the form $[e]_q$, representing a program around a hole expecting an expression. An expression context is an expression with a hole expecting another expression. Metavariables for expression contexts take the form $[e]_e$, representing an expression around a hole expecting another expression. We refer to an empty expression context as $[e]$. This context expects (and returns) an expression. We refer to the filling of a context $[e]_e$ with an expression $e_1$ as $[e]_e[e_1]$. We say that a syntactic form can be factored into a context and a subform if filling the context with the subform yields the original form.

We identify atomic evaluation contexts in Figure 2.1.3 to guide the call-by-value (CBV) reduction of expressions. These contexts regulate the process of reduction by indicating where the reduction rules of Figure 2.1.2 may be applied. The only atomic program-expression context is an empty expression context. There are eight atomic expression contexts. These consist of the empty expression context, the definition of a let form, the contained expression of an allocation of a reference cell, the cell subexpression of a deref or set form, the operator subexpression of an application, or, if the first subexpression is a value, the second subexpression of a set form or application. These evaluation contexts are CBV because they specify evaluation in the operand of an application and the contents of a reference cell. They specify a left-to-right evaluation order because for set forms and applications, the first argument must be fully evaluated before evaluation proceeds to the second argument. Our evaluation contexts do not include the scope of any bound program variables. We inductively define expression evaluation contexts as the least set of expression contexts including these and closed over composition. We define program expression evaluation contexts as the least set including atomic program expression evaluation contexts and closed over composition with expression evaluation contexts.
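Factoring an expression into an evaluation context and a subform can be sketched as a purely syntactic search that always takes the leftmost non-value position allowed by Figure 2.1.3. This is an illustrative sketch only: the tuple encoding of expressions, the string labels for atomic contexts, and the name `decompose` are assumptions, and a sequence `e1; e2` is encoded as a let whose variable (`"_"` here) does not occur in the body.

```python
def is_value(e):
    return e[0] in ("const", "off")

# Labels for the atomic contexts whose hole is the first or second subexpression.
FIRST = {"ref": "<ref [e]>", "deref": "deref [e]",
         "set": "set [e] to e", "app": "[e] e"}
SECOND = {"set": "set v to [e]", "app": "v [e]"}

def decompose(e):
    """Return (path, sub): the atomic evaluation contexts descended through,
    outermost first, and the subexpression found at the hole."""
    tag = e[0]
    if tag == "let" and not is_value(e[2]):
        label, hole = "let x = [e] in e", e[2]
    elif tag in FIRST and not is_value(e[1]):
        label, hole = FIRST[tag], e[1]
    elif tag in SECOND and not is_value(e[2]):
        # The first subexpression is already a value: left-to-right order.
        label, hole = SECOND[tag], e[2]
    else:
        return [], e                      # empty context [e]
    path, sub = decompose(hole)           # composition of contexts
    return [label] + path, sub

set_fn = ("lam", "x1", ("set", ("off", 0), ("var", "x1")))
sequence = ("let", "_", ("app", set_fn, ("const", True)), ("deref", ("off", 0)))
path, sub = decompose(sequence)
```

For the sequence above, the search descends through the let (sequence) context and then the operator context of the application, stopping at the function that must be allocated. Note that the body of a let and the body of a function are never entered, matching the remark that evaluation contexts do not include the scope of bound program variables.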

\[
\begin{array}{ll}
\to^{*\,imd}_{obj\,\langle q;s\rangle}\text{-cntxt} &
\dfrac{\langle e; s\rangle \overset{t}{\to^*} \langle e'; s'\rangle}
      {\langle {}^{\to^*}[e]_q[e];\ s\rangle \overset{t}{\to^*} \langle {}^{\to^*}[e]_q[e'];\ s'\rangle} \\[3ex]
\to^{*\,imd}_{obj\,\langle e;s\rangle}\text{-cntxt} &
\dfrac{\langle e; s\rangle \overset{t}{\to^*} \langle e'; s'\rangle}
      {\langle {}^{\to^*}[e]_e[e];\ s\rangle \overset{t}{\to^*} \langle {}^{\to^*}[e]_e[e'];\ s'\rangle} \\[3ex]
\to^{*\,imd}_{obj\,\langle e;s\rangle}\text{-reflex} &
\langle e; s\rangle \overset{[\,]}{\to^*} \langle e; s\rangle \\[2ex]
\to^{*\,imd}_{obj\,\langle e;s\rangle}\text{-step} &
\dfrac{\langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle}
      {\langle e; s\rangle \overset{t}{\to^*} \langle e'; s'\rangle} \\[3ex]
\to^{*\,imd}_{obj\,\langle e;s\rangle}\text{-trans} &
\dfrac{\langle e; s\rangle \overset{t}{\to^*} \langle e'; s'\rangle \qquad \langle e'; s'\rangle \overset{t'}{\to^*} \langle e''; s''\rangle}
      {\langle e; s\rangle \overset{t+t'}{\to^*} \langle e''; s''\rangle}
\end{array}
\]

Figure 2.1.4. Object Language Multiple Deep Reduction Rules

We refer to the reflexive, transitive, and contextual closure of $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$ as $\to^{*\,imd}_{obj\,\langle e;s\rangle}$. Figure 2.1.4 presents rules for judgments of the form $\langle q; s\rangle \overset{t}{\to^*} \langle q'; s'\rangle$ for programs or $\langle e; s\rangle \overset{t}{\to^*} \langle e'; s'\rangle$ for expressions. The former are derived using $\to^{*\,imd}_{obj\,\langle q;s\rangle}$-cntxt. We define a program-expression configuration context to be a program configuration with an expression-configuration shaped hole. We define a program-expression configuration evaluation context to be a program-expression configuration context that applies a program-expression evaluation context to the expression component and leaves the store component unchanged. $\to^{*\,imd}_{obj\,\langle q;s\rangle}$-cntxt indicates that one program configuration evaluates to another with a given trace if each program configuration can be factored (using the same program-expression configuration evaluation context) into expression configurations related by evaluation with the same trace.

Similarly, the rule $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-cntxt allows evaluation within subexpressions. Define an expression-configuration context to be an expression configuration with an expression-configuration shaped hole. Define an expression-configuration evaluation context to be an expression-configuration context that applies an expression evaluation context to the expression component and leaves the store component unchanged. $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-cntxt indicates that one expression configuration evaluates to another with a given trace if each expression configuration can be factored (using the same expression configuration evaluation context) into expression configurations related by evaluation with the same trace. In essence, these rules allow us to descend into an evaluation context to perform an evaluation sequence, leaving the store unchanged, and later ascend the same evaluation context with a revised expression and store.

The next three rules of Figure 2.1.4, derived from Wadler and Thiemann [65], formalize that a chain of steps is defined to have a trace formed as the concatenation of the traces of its links. The rule $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-reflex indicates that a configuration evaluates to itself with an empty trace. The rule $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-step allows us to treat a reduction as an evaluation with the same trace. $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-trans allows two evaluations to be linked together, appending their traces.

To obtain notions $\overset{t}{=}$ of equality of program and expression configurations, we use rules similar to these along with a rule $=^{imd}_{obj\,\langle e;s\rangle}$-revrs for the symmetric closure. $=^{imd}_{obj\,\langle e;s\rangle}$-revrs relates two configurations in one direction if they are related in the reverse direction. We could require that the trace be reversed, but it will be sufficient to insist upon an empty trace. We use a strong notion of equality that considers evaluation contexts because in a language with effects, the same program can produce different results under different evaluation orders.

Let us extend our notion of traces to infinite lists and also define $\to$ to be the restriction of $\to^*$ obtained by refraining from the use of $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-reflex and $\to^{*\,imd}_{obj\,\langle e;s\rangle}$-trans. Then we can say $\langle q; s\rangle \overset{t}{\Uparrow}$, i.e., $\langle q; s\rangle$ diverges with trace $t$, if and only if for each $q'$, $s'$, and $t'$ such that $\langle q; s\rangle \overset{t'}{\to^*} \langle q'; s'\rangle$, there exist $q''$, $s''$, and $t''$ such that $\langle q'; s'\rangle \overset{t''}{\to} \langle q''; s''\rangle$ and $t' + t''$ is a prefix of $t$. We say that $\langle q; s\rangle \overset{t}{\Downarrow} \langle q'; s'\rangle$, i.e., $\langle q; s\rangle$ converges to $\langle q'; s'\rangle$ with trace $t$, if and only if $\langle q; s\rangle \overset{t}{\to^*} \langle q'; s'\rangle$ and there are no $s''$, $q''$, and $t'$ such that $\langle q'; s'\rangle \overset{t'}{\to} \langle q''; s''\rangle$.
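The way traces accumulate under reflexivity, single steps, and transitivity can be sketched as a driver loop over any one-step relation. This is a hedged illustration, not part of the formal development: the names `evaluate`, `countdown_step`, and `loop_step` are hypothetical, the two step functions are toy relations rather than the object language, and the `fuel` bound is an assumption needed because divergence cannot be observed in finite time.

```python
def evaluate(step, config, fuel=1000):
    """Iterate a one-step relation, concatenating the traces of its links.
    step(config) returns (trace, config') or None when no step applies."""
    trace = []                            # reflexivity: zero steps, empty trace
    for _ in range(fuel):
        result = step(config)
        if result is None:                # no further step from this configuration
            return "converged", config, trace
        t, config = result                # a single step with its own trace
        trace = trace + t                 # transitivity: append the traces
    return "out of fuel", config, trace   # only a finite prefix was observed

def countdown_step(config):
    """Toy relation: write the counter to offset 0 until it reaches zero."""
    n, store = config
    if n == 0:
        return None
    return [("write", 0)], (n - 1, {**store, 0: n})

def loop_step(config):
    """Toy relation that always steps, reading offset 0 forever."""
    return [("read", 0)], config

converged = evaluate(countdown_step, (3, {0: 0}))
looping = evaluate(loop_step, (), fuel=5)
```

In the second call every finite run can be extended by another step, so the accumulated trace is only a prefix of the infinite trace that the definition of divergence refers to. The driver also reports "converged" for any configuration without a successor; it does not by itself distinguish values from faulty configurations.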

[Figure 2.1.5. Sample Object Language Reduction Dynamics. Reduction tree for the program $\mathsf{let}\ x_2 = \langle\mathsf{ref}\ \mathsf{false}\rangle\ \mathsf{in}\ \langle\lambda x_1.\mathsf{set}\ x_2\ \mathsf{to}\ x_1\rangle\ \mathsf{true};\ \mathsf{deref}\ x_2$ starting from the empty store. Internal nodes: the context $\mathsf{let}\ x_2 = [e]\ \mathsf{in}\ \ldots$ with trace $[(\mathsf{alloc}\ @\ o_1)]$; the context $[e];\ \mathsf{deref}\ o_1$ with trace $[(\mathsf{alloc}\ @\ o_2),(\mathsf{exec}\ @\ o_2),(\mathsf{write}\ @\ o_1)]$; and the context $[e]\ \mathsf{true}$ with trace $[(\mathsf{alloc}\ @\ o_2)]$.]

In Figure 2.1.5 we demonstrate this dynamic semantics on the sample source program typed in Figure 1.1.11. The figure presents the reduction sequence in the form of a tree. Leaves of the tree are expression configurations. Internal nodes are evaluation contexts. Sibling nodes are linked by reduction arcs. Only the first and last sibling are drawn connected to the parent. Arcs to and from leaves indicate reductions to and from the expression configurations. An arc pointing to an internal node indicates that the evaluation context may be used to descend into the residual, and the subtree describes a reduction sequence within that context. Internal nodes are annotated with a trace of this sequence. An arc from an internal node indicates that a reduction is applied to the configuration resulting from ascending through the evaluation context from the rightmost leaf, i.e., the configuration resulting from the internal node's evaluation sequence.

The example program is reduced as follows. We first descend into the let context to allocate the reference cell at $o_1$ with $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc, producing a trace of $[(\mathsf{alloc}\ @\ o_1)]$. We then return to top level and apply $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let, substituting the pointer to the reference cell for $x_2$ and yielding an empty trace. We next descend into a sequential context derived from the let context, and further into the application operator context. We allocate the function at $o_2$ with $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc and ascend from the operator context to apply $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$. This rule substitutes the value true for the formal $x_1$ with a trace of $[(\mathsf{exec}\ @\ o_2)]$, $o_2$ being the location of the function. The expression from the function body is now prepared to be processed with $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set, modifying the contents of the reference cell and yielding a trace of $[(\mathsf{write}\ @\ o_1)]$, $o_1$ being the location of the reference cell. We now ascend from the sequential context to apply the rule $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-seq, derived from $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let. The dereference remains, and is processed using $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref with a trace of $[(\mathsf{read}\ @\ o_1)]$. The final result of true is retrieved from the reference cell. The trace of the entire example would be: $[(\mathsf{alloc}\ @\ o_1),(\mathsf{alloc}\ @\ o_2),(\mathsf{exec}\ @\ o_2),(\mathsf{write}\ @\ o_1),(\mathsf{read}\ @\ o_1)]$.

We explicitly identify here particular expression configurations that we call immediately faulty, for which no reduction sequence can lead to a value. These include unbound variables, allocations of functions with free variables, and the active use in a dereference, set, or application of either nonoffset values or offsets not present in the store as the appropriate form of storable.

Definition 2.1.1 (Immediately Faulty Expression Configurations).
(1) $\langle x; s\rangle$
(2) $\langle\langle\lambda x.e\rangle; s\rangle$ ($\lambda x.e$ not closed)
(3) $\langle \mathsf{deref}\ v; s\rangle$, $\langle \mathsf{set}\ v\ \mathsf{to}\ e; s\rangle$, $\langle v\ e; s\rangle$ ($v \neq o$)
(4) $\langle \mathsf{deref}\ o; s\rangle$, $\langle \mathsf{set}\ o\ \mathsf{to}\ e; s\rangle$, $\langle o\ e; s\rangle$ ($o \notin Dom(s)$)
(5) $\langle \mathsf{deref}\ o; s\rangle$, $\langle \mathsf{set}\ o\ \mathsf{to}\ e; s\rangle$ ($s(o) \neq \langle\mathsf{ref}\ v\rangle$)
(6) $\langle o\ e; s\rangle$ ($s(o) \neq \langle\lambda x.e\rangle$)

Our definition of immediately faulty expression configurations is, in a sense, complete.

Proposition 2.1.1 (Nonvalue program configurations are reducible or faulty). Every program configuration $\langle q; s\rangle$ with a nonvalue program $q$ can be decomposed into the form $\langle {}^{\to^*}[e]_q[e]; s\rangle$, where $\langle e; s\rangle$ is either a redex or immediately faulty. In the former case, we say that $\langle q; s\rangle$ is reducible, in the latter case that it is faulty.

Proof: Proposition 2.1.1. We show by induction on the structure of expressions that every nonvalue expression $e$ in an expression configuration $\langle e; s\rangle$ can be decomposed into the form ${}^{\to^*}[e]_e[e_0]$, where $\langle e_0; s\rangle$ is either a redex or immediately faulty. The result for program configurations follows immediately. In each case consider an expression configuration formed with an arbitrary store $s$.

$\langle x; s\rangle$: Let ${}^{\to^*}[e]_e = [e]$. $\langle x; s\rangle$ is immediately faulty.

$\langle \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2; s\rangle$: If $e_1$ is a value, then let ${}^{\to^*}[e]_e = [e]$ and $\langle \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2; s\rangle$ is a redex. Otherwise, by induction, $e_1$ can be decomposed into ${}^{\to^*}[e]_{e_1}[e_{01}]$, where $\langle e_{01}; s\rangle$ is either a redex or immediately faulty, so let ${}^{\to^*}[e]_e = \mathsf{let}\ x = {}^{\to^*}[e]_{e_1}\ \mathsf{in}\ e_2$.

$\langle\langle\mathsf{ref}\ e_1\rangle; s\rangle$: If $e_1$ is a value, then let ${}^{\to^*}[e]_e = [e]$ and $\langle\langle\mathsf{ref}\ e_1\rangle; s\rangle$ is a redex. Otherwise, by induction, $e_1$ can be decomposed into ${}^{\to^*}[e]_{e_1}[e_{01}]$, where $\langle e_{01}; s\rangle$ is either a redex or immediately faulty, so let ${}^{\to^*}[e]_e = \langle\mathsf{ref}\ {}^{\to^*}[e]_{e_1}\rangle$.

$\langle \mathsf{deref}\ e_1; s\rangle$: If $e_1$ is an offset mapping to a reference cell, then let ${}^{\to^*}[e]_e = [e]$ and $\langle \mathsf{deref}\ e_1; s\rangle$ is a redex. Otherwise, if $e_1$ is a value, then let ${}^{\to^*}[e]_e = [e]$ and $\langle \mathsf{deref}\ e_1; s\rangle$ is immediately faulty. Otherwise, by induction, $e_1$ can be decomposed into ${}^{\to^*}[e]_{e_1}[e_{01}]$, where $\langle e_{01}; s\rangle$ is either a redex or immediately faulty, so let ${}^{\to^*}[e]_e = \mathsf{deref}\ {}^{\to^*}[e]_{e_1}$.

$\langle \mathsf{set}\ e_1\ \mathsf{to}\ e_2; s\rangle$: If $e_1$ is an offset mapping to a reference cell, then consider $e_2$. If $e_2$ is a value, then let ${}^{\to^*}[e]_e = [e]$ and $\langle \mathsf{set}\ e_1\ \mathsf{to}\ e_2; s\rangle$ is a redex. Otherwise, by induction, $e_2$ can be decomposed into ${}^{\to^*}[e]_{e_2}[e_{02}]$, where $\langle e_{02}; s\rangle$ is either a redex or immediately faulty, so let ${}^{\to^*}[e]_e = \mathsf{set}\ e_1\ \mathsf{to}\ {}^{\to^*}[e]_{e_2}$. Otherwise ($e_1$ not an offset mapping to a reference cell), if $e_1$ is a value, then let ${}^{\to^*}[e]_e = [e]$ and $\langle \mathsf{set}\ e_1\ \mathsf{to}\ e_2; s\rangle$ is immediately faulty. Otherwise, by induction, $e_1$ can be decomposed into ${}^{\to^*}[e]_{e_1}[e_{01}]$, where $\langle e_{01}; s\rangle$ is either a redex or immediately faulty, so let ${}^{\to^*}[e]_e = \mathsf{set}\ {}^{\to^*}[e]_{e_1}\ \mathsf{to}\ e_2$.

$\langle\langle\lambda x.e_0\rangle; s\rangle$: Let ${}^{\to^*}[e]_e = [e]$. If $\lambda x.e_0$ is closed then $\langle\langle\lambda x.e_0\rangle; s\rangle$ is a redex; otherwise, it is immediately faulty.

$\langle e_1\ e_2; s\rangle$: This case is similar to that for set above, but we check whether or not $e_1$ is an offset mapping to a stored procedure.
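The case analysis of the proof can be mirrored by a function that inspects only the head of an expression configuration. The sketch below is an assumption-laden illustration: the tuple encoding of expressions, the dictionary encoding of stores, and the names `stored` and `classify` are hypothetical. The answer `"descend"` stands for the inductive cases, where the proof looks inside a subexpression.

```python
def is_value(e):
    return e[0] in ("const", "off")

def fpv(e):
    """Free program variables of an expression."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag in ("const", "off"):
        return set()
    if tag == "let":
        return fpv(e[2]) | (fpv(e[3]) - {e[1]})
    if tag == "lam":
        return fpv(e[2]) - {e[1]}
    return set().union(*(fpv(sub) for sub in e[1:]))

def stored(s, v, kind):
    """True when v is an offset bound in s to a storable of the given kind."""
    return v[0] == "off" and v[1] in s and s[v[1]][0] == kind

def classify(e, s):
    """Return 'value', 'redex', 'faulty', or 'descend' for the head of <e; s>."""
    tag = e[0]
    if is_value(e):
        return "value"
    if tag == "var":
        return "faulty"                                  # item (1)
    if tag == "lam":
        return "faulty" if fpv(e) else "redex"           # item (2)
    if tag in ("let", "ref"):
        first = e[2] if tag == "let" else e[1]
        return "redex" if is_value(first) else "descend"
    kind = "lam" if tag == "app" else "ref"              # storable form required
    if stored(s, e[1], kind):
        if tag == "deref" or is_value(e[2]):
            return "redex"
        return "descend"                                 # evaluate the second subexpression
    # Items (3)-(6): a value in first position that is not a suitable offset.
    return "faulty" if is_value(e[1]) else "descend"

cell_store = {0: ("ref", ("const", 1))}
fun_store = {0: ("lam", "x", ("var", "x"))}
```

For example, `classify(("deref", ("off", 0)), fun_store)` falls under item (5) of Definition 2.1.1, since the offset is bound to a function rather than a reference cell.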

1.2. Statics.

We now extend the type system of Section 1.1 to handle intermediate language programs. Figure 2.1.1 revises the static syntax of Figure 1.1.4. It adds lines for configuration types and store types, analogous to the configurations and stores of Figure 2.1.1. Configuration types combine the type of objects of some syntactic class with a store type. They are defined for program, expression, and pure types. Store types map offsets to storable types, rather than storables.

\[
\begin{array}{rclcl}
\langle Q; S\rangle & \in & \langle\mathbf{Q};\mathbf{S}\rangle^{imd}_{obj} & = & \mathbf{Q}^{imd}_{obj} \times \mathbf{S}^{imd}_{obj} \\
Q & \in & \mathbf{Q}^{imd}_{obj} & ::= & E \\
\Gamma & \in & \boldsymbol{\Gamma}^{imd}_{obj} & ::= & \emptyset \mid \Gamma\{x \mapsto P\} \\
\langle E; S\rangle & \in & \langle\mathbf{E};\mathbf{S}\rangle^{imd}_{obj} & = & \mathbf{E}^{imd}_{obj} \times \mathbf{S}^{imd}_{obj} \\
E & \in & \mathbf{E}^{imd}_{obj} & ::= & P\ !\ T \\
B & \in & \mathbf{B}^{imd}_{obj} & ::= & \mathsf{Ref}\ P \mid P \overset{T}{\Rightarrow} P \\
\langle P; S\rangle & \in & \langle\mathbf{P};\mathbf{S}\rangle^{imd}_{obj} & = & \mathbf{P}^{imd}_{obj} \times \mathbf{S}^{imd}_{obj} \\
P & \in & \mathbf{P}^{imd}_{obj} & ::= & G \mid B \\
S & \in & \mathbf{S}^{imd}_{obj} & ::= & \emptyset\,\overline{\{o \mapsto B\}} \\
T, \varepsilon & \in & \mathbf{T}^{imd}_{obj}, \boldsymbol{\varepsilon}^{imd}_{obj} & = & \{\mathbf{F}^{imd}_{obj}\} \\
F & \in & \mathbf{F}^{imd}_{obj} & ::= & \iota \\
\iota & \in & \boldsymbol{\iota}^{imd}_{obj} & ::= & \mathsf{alloc} \mid \mathsf{read} \mid \mathsf{write} \mid \mathsf{exec} \\
G & \in & \mathbf{G}^{imd}_{obj} & &
\end{array}
\]

Figure 2.1.1. Object Intermediate Language Static Syntax

Our use of store types here is somewhat of a convenience; it is needed to keep derivations finite in the presence of a circular store. Store types serve to prevent our type rules from chasing pointers in the store, i.e., to prevent us from requiring for a judgment regarding an address that one prove a judgment regarding the contents stored at that address. Without a store type, this would be the case for the rule $\vdash^{imd}_{obj\,v}$-addr-live below. If the store contained a cycle, this process would not terminate.³ Store types have been used by other authors in formulating static semantics [6, 5, 66]. We define $S \leq S'$ if and only if $\forall o \in Dom(S).\ S(o) = S'(o)$ and define $\geq$ accordingly.

³ Our store of Chapter 6 has cycles due to our implementation of recursion; these could be avoided if we were to choose an implementation of recursion using stored recursive closures. Cycles in the store would also need to be considered if we were to include recursive types in our language, e.g., to model multithreading.

Store types seem to be dependent upon runtime values (offsets) so, although they are used in a restricted way, it is unclear that type-checking should be decidable for arbitrary intermediate language configurations. In fact, we do not require such decidability. It is sufficient that type-checking be decidable for source language programs, and that reduction preserve typability.

Figure 2.1.2 updates Figure 1.1.5. Additional judgments are provided for typing configurations, stores and storables, and both traces and atomic traces. Judgments for typing configurations are intended to apply to the content of evaluation contexts. Type soundness will be stated in terms of program configuration judgments, while effect soundness will be stated in terms of expression configuration judgments.

\[
\begin{array}{lrcl}
\text{prog configs} & & \vdash^{imd}_{obj\,\langle q;s\rangle} & \langle q; s\rangle : \langle Q; S\rangle \\
\text{programs} & S & \vdash^{imd}_{obj\,q} & q : Q \\
\text{expr configs} & & \vdash^{imd}_{obj\,\langle e;s\rangle} & \langle e; s\rangle : \langle E; S\rangle \\
\text{expressions} & \Gamma; S & \vdash^{imd}_{obj\,e} & e : E \\
\text{prestorables} & \Gamma; S & \vdash^{imd}_{obj\,b} & b : B\ !\ T \\
\text{pures} & \Gamma; S & \vdash^{imd}_{obj\,p} & p : P \\
\text{value configs} & & \vdash^{imd}_{obj\,\langle v;s\rangle} & \langle v; s\rangle : \langle P; S\rangle \\
\text{values} & S & \vdash^{imd}_{obj\,v} & v : P \\
\text{stores} & & \vdash^{imd}_{obj\,s} & s : S \\
\text{storables} & S & \vdash^{imd}_{obj\,d} & d : B \\
\text{traces} & & \vdash^{imd}_{obj\,t} & t : T \\
\text{atomic traces} & & \vdash^{imd}_{obj\,f} & f : F
\end{array}
\]

Figure 2.1.2. Object Intermediate Language Typing Judgments

We see here that pure type configurations are used to type value configurations, with program and expression configuration types typing program and expression configurations, respectively. Expression configurations do not require an environment because evaluation contexts do not include the scope of bound program variables.⁴ Stores are typed with store types. Storables are typed with storable types, using a store type. A store type is similarly added to the judgments from the source language. Traces are assigned trace types and atomic traces are assigned atomic trace types. Atomic effects classify atomic traces, disregarding the offset at which the action occurs. Effects classify traces, additionally disregarding duplicate actions and the sequence in which they occur. We thus define $\vdash^{imd}_{obj\,f} f : F$ to hold for $\vdash^{imd}_{obj\,f} (\iota\ @\ o) : \iota$ and $\vdash^{imd}_{obj\,t} t : T$ to hold when $\forall f \in t.\ \exists F \in T.\ \vdash^{imd}_{obj\,f} f : F$, i.e., every atomic trace in the trace must be justified by some atomic trace type in the trace type.
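The definition of trace typing amounts to forgetting offsets, order, and repetition. The following minimal sketch assumes that atomic traces are encoded as `(action, offset)` pairs and that trace types are Python sets of action names; the helper `least_effect` is an illustrative addition and not a notion defined in the text.

```python
def atomic_type(f):
    """(iota @ o) : iota, i.e., classify an atomic trace by forgetting its offset."""
    action, _offset = f
    return action

def trace_has_type(t, T):
    """t : T holds when every atomic trace in t is justified by some member of T."""
    return all(atomic_type(f) in T for f in t)

def least_effect(t):
    """Smallest set of atomic trace types that classifies t (illustrative helper)."""
    return {atomic_type(f) for f in t}

# The trace of the sample reduction, with offsets o1 and o2 written as 1 and 2.
sample = [("alloc", 1), ("alloc", 2), ("exec", 2), ("write", 1), ("read", 1)]
everything = {"alloc", "read", "write", "exec"}
```

Because a trace type only has to contain the actions that occur, any superset of `least_effect(sample)` also classifies the sample trace, and the empty trace is classified by the empty effect.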

Values are typed in Figure 2.1.3. $\vdash^{imd}_{obj\,v}$-glob-const is unchanged except for the presence of the store type. $\vdash^{imd}_{obj\,v}$-addr-live is a new rule that types offsets by applying the store type.

⁴ There is no fundamental difficulty in including an environment here and stating type and effect soundness results more generally in those terms, but it makes those results more unwieldy, particularly in the remaining parts of this work. Also, there is some economy in only providing what is needed that leads to insight regarding the environments that actually develop.

\[
\begin{array}{ll}
\vdash^{imd}_{obj\,v}\text{-glob-const} & S \vdash^{imd}_{obj\,v} g : \mathrm{TypeOf}(g) \\[2ex]
\vdash^{imd}_{obj\,v}\text{-addr-live} & S \vdash^{imd}_{obj\,v} o : S(o)
\end{array}
\]

Figure 2.1.3. Typing of Object Intermediate Language Values

\[
\begin{array}{ll}
\vdash^{imd}_{obj\,q} &
\dfrac{\emptyset; S \vdash^{imd}_{obj\,e} q : Q}{S \vdash^{imd}_{obj\,q} q : Q}
\end{array}
\]

Figure 2.1.4. Typing of Object Intermediate Language Programs

Object intermediate language programs are typed using the rules in Figure 2.1.4 as in the source language (Figure 1.1.7), except for the addition of the store type.

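\[
\begin{array}{ll}
\vdash^{imd}_{obj\,d}\text{-ref} &
\dfrac{S \vdash^{imd}_{obj\,v} v : P}{S \vdash^{imd}_{obj\,d} \langle\mathsf{ref}\ v\rangle : \mathsf{Ref}\ P} \\[3ex]
\vdash^{imd}_{obj\,d}\text{-}\lambda &
\dfrac{\emptyset; S \vdash^{imd}_{obj\,b} \langle\lambda x.e\rangle : B\ !\ \emptyset}{S \vdash^{imd}_{obj\,d} \langle\lambda x.e\rangle : B}
\end{array}
\]

Figure 2.1.5. Typing of Object Language Storables

Object storables, in Figure 2.1.5, are typed using $\vdash^{imd}_{obj\,d}$-ref⁵ and $\vdash^{imd}_{obj\,d}$-$\lambda$. A storable reference cell is typable if the contained value has the contained type, using the given store type. A storable function is typable if it is typable as a prestorable in an empty environment and with empty effect.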

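\[
\begin{array}{ll}
\vdash^{imd}_{obj\,s} &
\dfrac{\overline{\emptyset\,\overline{\{o \mapsto B\}} \vdash^{imd}_{obj\,d} d : B}}
      {\vdash^{imd}_{obj\,s} \emptyset\,\overline{\{o \mapsto d\}} : \emptyset\,\overline{\{o \mapsto B\}}}
\end{array}
\]

Figure 2.1.6. Typing of Object Language Stores

Object stores are typed using the rule $\vdash^{imd}_{obj\,s}$ in Figure 2.1.6. It says that a store is typable with a store type if they cover the same address space and every stored value is typable as the storable type at the same offset, using the entire store type.⁶ Using the entire store type allows for reference cells allocated earlier to be set to refer to those allocated later.

⁵ We could type storable reference cells in terms of prestorable reference cells using the following rule:

\[
\vdash^{imd}_{obj\,d}\text{-ref} \qquad
\dfrac{\emptyset; S \vdash^{imd}_{obj\,b} \langle\mathsf{ref}\ v\rangle : \mathsf{Ref}\ P\ !\ \emptyset}{S \vdash^{imd}_{obj\,d} \langle\mathsf{ref}\ v\rangle : \mathsf{Ref}\ P}
\]

We avoid doing so only because the rule is slightly more complex, and because in the corresponding situation with constants in Part III, the rule is considerably more complex.

⁶ Our overbar notation is used here to repeat the storable typing judgment.

Pures, prestorables, and expressions are typed using the rules in Figures 2.1.7 through 2.1.9 as in Figures 1.1.8 through 1.1.10, except for the addition of the store type. $\vdash^{imd}_{obj\,b}$-ref is similar to $\vdash^{imd}_{obj\,d}$-ref but depends on the typing of a contained expression rather than value. As we have seen, prestorables in our intermediate languages are used to abstract over the typing rules for storables as well as allocations.

\[
\begin{array}{ll}
\vdash^{imd}_{obj\,p}\text{-var} & \Gamma; S \vdash^{imd}_{obj\,p} x : \Gamma(x) \\[2ex]
\vdash^{imd}_{obj\,p}\text{-value} &
\dfrac{S \vdash^{imd}_{obj\,v} v : P}{\Gamma; S \vdash^{imd}_{obj\,p} v : P}
\end{array}
\]

Figure 2.1.7. Typing of Object Intermediate Language Pures

\[
\begin{array}{ll}
\vdash^{imd}_{obj\,b}\text{-ref} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,e} e : P\ !\ \varepsilon}{\Gamma; S \vdash^{imd}_{obj\,b} \langle\mathsf{ref}\ e\rangle : \mathsf{Ref}\ P\ !\ \varepsilon} \\[3ex]
\vdash^{imd}_{obj\,b}\text{-}\lambda &
\dfrac{\Gamma\{x \mapsto P_1\}; S \vdash^{imd}_{obj\,e} e : P_2\ !\ \varepsilon}{\Gamma; S \vdash^{imd}_{obj\,b} \langle\lambda x.e\rangle : P_1 \overset{\varepsilon}{\Rightarrow} P_2\ !\ \emptyset}
\end{array}
\]

Figure 2.1.8. Typing of Object Intermediate Language Prestorables

\[
\begin{array}{ll}
\vdash^{imd}_{obj\,e}\text{-let} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,e} e_1 : P_1\ !\ \varepsilon_1 \qquad \Gamma\{x \mapsto P_1\}; S \vdash^{imd}_{obj\,e} e_2 : P_2\ !\ \varepsilon_2}
      {\Gamma; S \vdash^{imd}_{obj\,e} \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : P_2\ !\ \varepsilon_1 \cup \varepsilon_2} \\[3ex]
\vdash^{imd}_{obj\,e}\text{-pure} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,p} p : P}{\Gamma; S \vdash^{imd}_{obj\,e} p : P\ !\ \emptyset} \\[3ex]
\vdash^{imd}_{obj\,e}\text{-alloc} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,b} b : B\ !\ \varepsilon}{\Gamma; S \vdash^{imd}_{obj\,e} b : B\ !\ \varepsilon \cup \{\mathsf{alloc}\}} \\[3ex]
\vdash^{imd}_{obj\,e}\text{-deref} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,e} e : \mathsf{Ref}\ P\ !\ \varepsilon}{\Gamma; S \vdash^{imd}_{obj\,e} \mathsf{deref}\ e : P\ !\ \varepsilon \cup \{\mathsf{read}\}} \\[3ex]
\vdash^{imd}_{obj\,e}\text{-set} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,e} e_1 : \mathsf{Ref}\ P\ !\ \varepsilon_1 \qquad \Gamma; S \vdash^{imd}_{obj\,e} e_2 : P\ !\ \varepsilon_2}
      {\Gamma; S \vdash^{imd}_{obj\,e} \mathsf{set}\ e_1\ \mathsf{to}\ e_2 : \mathsf{Unit}\ !\ \varepsilon_1 \cup \varepsilon_2 \cup \{\mathsf{write}\}} \\[3ex]
\vdash^{imd}_{obj\,e}\text{-app} &
\dfrac{\Gamma; S \vdash^{imd}_{obj\,e} e_1 : P_1 \overset{\varepsilon}{\Rightarrow} P_2\ !\ \varepsilon_1 \qquad \Gamma; S \vdash^{imd}_{obj\,e} e_2 : P_1\ !\ \varepsilon_2}
      {\Gamma; S \vdash^{imd}_{obj\,e} e_1\ e_2 : P_2\ !\ \varepsilon \cup \varepsilon_1 \cup \varepsilon_2 \cup \{\mathsf{exec}\}}
\end{array}
\]

Figure 2.1.9. Typing of Object Intermediate Language Expressions

Typings of program and expression configurations are derived using the rules $\vdash^{imd}_{obj\,\langle q;s\rangle}$ and $\vdash^{imd}_{obj\,\langle e;s\rangle}$, which require both that the store type properly describe the store and that the program or expression be typable using the store type but without regard to the specific values in the store.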

[Figure 2.1.10. Sample Object Intermediate Language Derivation. A derivation of $\vdash^{imd}_{obj\,\langle e;s\rangle} \langle \mathsf{deref}\ o_1;\ \emptyset\{o_1 \mapsto \langle\mathsf{ref}\ \mathsf{true}\rangle\}\{o_2 \mapsto \langle\lambda x_1.\mathsf{set}\ o_1\ \mathsf{to}\ x_1\rangle\}\rangle : \langle \mathsf{Bool}\ !\ \{\mathsf{read}\};\ \emptyset\{o_1 \mapsto \mathsf{Ref}\ \mathsf{Bool}\}\{o_2 \mapsto \mathsf{Bool} \overset{\{\mathsf{write}\}}{\Rightarrow} \mathsf{Unit}\}\rangle$, composed of a store derivation and an expression derivation.]

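Typing expression configurations requires that the expression be typable in an empty environment (since our evaluation contexts do not include the scope of program variables).

\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S \qquad S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q} q : Q}
     {\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q;s\rangle : \langle Q;S\rangle}
\;\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}
\qquad
\frac{\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S \qquad \emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : E}
     {\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle E;S\rangle}
\;\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}
\]

It will be convenient to introduce configurations at a lower level as well. Value configurations are typed as follows:

\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S \qquad S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P}
     {\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle v;s\rangle} \langle v;s\rangle : \langle P;S\rangle}
\;\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle v;s\rangle}
\]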

In Figure 2.1.10 we present a sample expression configuration typing derivation for part of the reduction sequence in Figure 2.1.5. At this point, the reference cell and function have been allocated in the store, the store has been updated, and all that remains is to perform the dereference. The expression configuration derivation is composed of a store derivation and an expression derivation. The expression derivation is similar to that in Figure 1.1.11 but the store type is maintained throughout the derivation and used to type the dereference subexpression as an offset. The store derivation contains derivations of each stored value. Stored reference cells are typed using value derivations for the contained values and stored functions are typed using prestorable derivations with an initially empty environment and no effect.
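The split of a configuration derivation into a store derivation and an expression derivation can be sketched in code. The following is only an illustrative Python model, not the dissertation's Scheme implementation: it covers constants, offsets, reference cells, and `deref`, and it omits stored functions. The tuple encodings, the function names, and the reading of the store rule as "same offsets, each storable has its recorded type" are assumptions made for the sketch.

```python
def type_value(v, S):
    """S |- v : P, covering constants (v-glob-const) and live offsets (v-addr-live)."""
    if isinstance(v, bool):
        return "Bool"
    if isinstance(v, str) and v in S:      # offsets are modelled as strings such as "o1"
        return S[v]
    raise TypeError(f"untypable value: {v!r}")

def type_storable(d, S):
    """S |- d : B for reference cells ("ref", v) only (d-ref)."""
    tag, v = d
    if tag != "ref":
        raise TypeError(f"unsupported storable: {d!r}")
    return ("Ref", type_value(v, S))

def type_store(s, S):
    """|- s : S, read here as: same offsets, and each storable has its recorded type."""
    return s.keys() == S.keys() and all(type_storable(d, S) == S[o] for o, d in s.items())

def type_expr(e, S):
    """Empty environment; S |- e : P ! eps. Returns the pair (P, eps)."""
    if e[0] == "val":                      # e-pure over p-value: no effect
        return type_value(e[1], S), frozenset()
    if e[0] == "deref":                    # e-deref adds the read effect
        P, eps = type_expr(e[1], S)
        if not (isinstance(P, tuple) and P[0] == "Ref"):
            raise TypeError("deref of a non-reference")
        return P[1], eps | {"read"}
    raise TypeError(f"unsupported expression: {e!r}")

def type_config(e, s, S):
    """|- <e; s> : <P ! eps; S>: a store derivation plus an expression derivation."""
    if not type_store(s, S):
        raise TypeError("store type does not describe the store")
    return type_expr(e, S), S
```

Note that `type_expr` never inspects the store `s` itself, only the store type `S`, which mirrors the remark that the expression is typed without regard to the specific values in the store.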

1.3. Type and Effect Soundness.

Type soundness provides a guarantee that if a program is accepted by our type system, it is not faulty with respect to the reduction semantics. We prove type soundness using the technique of Wright and Felleisen [67]. Our result states that well-typed source language programs paired with an empty store either diverge or converge to well-typed value configurations of the same pure type.7 With little effort, our proof can be modified to show type soundness for intermediate language programs, i.e., that well-typed program configurations either diverge or converge to a well-typed value configuration of the same pure type and no lesser store type.

7 Relating the result of evaluation back to the source language is difficult. One must perform a dependency analysis on the offsets mentioned in λ-abstraction bodies and replace them with free variables bound outside the abstraction by let constructs.

Theorem 2.1.1 (Type Soundness).

${}^{\mathrm{src}}_{\mathrm{obj}}\langle q;s\rangle$:

\[
\vdash^{\mathrm{src}}_{\mathrm{obj}\,q} q : P\ !\ \varepsilon \;\rightarrow\;
\Big(\exists t.\ \langle q;\emptyset\rangle \Uparrow^{t}\Big) \;\vee\;
\Big(\exists v, s, t.\ \langle q;\emptyset\rangle \overset{t}{\Downarrow}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle v;s\rangle
\;\wedge\; \exists S \geq \emptyset.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle v;s\rangle : \langle P\ !\ \emptyset; S\rangle\Big)
\]

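We also present an effect soundness result that corresponds to part of Wadler's [64] notion of type soundness. During the evaluation of well-typed programs and expressions, no unexpected operations are performed. Again, corresponding results are provable for arbitrary intermediate language configurations.

Theorem 2.1.2 (Effect Soundness).

${}^{\mathrm{src}}_{\mathrm{obj}}\langle q;s\rangle$:

\[
\vdash^{\mathrm{src}}_{\mathrm{obj}\,q} q : P\ !\ \varepsilon \;\wedge\;
\langle q;\emptyset\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle
\;\rightarrow\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon
\]

${}^{\mathrm{src}}_{\mathrm{obj}}\langle e;s\rangle$:

\[
\emptyset \vdash^{\mathrm{src}}_{\mathrm{obj}\,e} e : P\ !\ \varepsilon \;\wedge\;
\langle e;\emptyset\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle
\;\rightarrow\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon
\]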
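The conclusion of effect soundness, that a trace has type ε, amounts to a membership check on the operations recorded in the trace. A minimal Python sketch follows, assuming a trace is a list of atomic traces written as `(operation, offset)` pairs and an effect is a set of operation names; the function name and this encoding are illustrative assumptions, not part of the formal development.

```python
def trace_has_effect(trace, eps):
    """|- t : eps holds when every atomic trace (op @ offset) uses an operation in eps."""
    return all(op in eps for (op, _offset) in trace)
```

For instance, a trace that allocates and then reads a cell is accepted at the effect `{"alloc", "read"}` but rejected at `{"read"}`, and the empty trace is accepted at every effect, which is the observation used in the reflexive case of the proof.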

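We will require two lemmas. The first states that typable program and expression configurations remain typable after evaluation, with no greater effect and no lesser store type.

Lemma 2.1.1 (Evaluation Preserves Type and Effect).

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q;s\rangle : \langle P\ !\ \varepsilon; S\rangle \;\wedge\;
\langle q;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle
\;\rightarrow\;
\exists \varepsilon' \subseteq \varepsilon,\ S' \geq S.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$:

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle P\ !\ \varepsilon; S\rangle \;\wedge\;
\langle e;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle
\;\rightarrow\;
\exists \varepsilon' \subseteq \varepsilon,\ S' \geq S.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle
\]

The second simply states that faulty program configurations are untypable.

Lemma 2.1.2 (Faulty Program Configurations Untypable).

If $\langle q;s\rangle$ is faulty, then there are no $Q$ and $S$ such that $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q;s\rangle : \langle Q;S\rangle$.

Proof: Theorem 2.1.1.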

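We have $\vdash^{\mathrm{src}}_{\mathrm{obj}\,q} q : P\ !\ \varepsilon$. Because the intermediate language is an extension of the source language, there is a similar derivation $\emptyset \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q} q : P\ !\ \varepsilon$ that simply passes around the empty store type. Applying rules $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$ and then $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$, we have $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q;\emptyset\rangle : \langle P\ !\ \varepsilon; \emptyset\rangle$. Clearly, if $\langle q;\emptyset\rangle \Uparrow^{t}$, the condition is satisfied. Assume $\langle q;\emptyset\rangle \overset{t}{\Downarrow}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle$, i.e., $\langle q;\emptyset\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle$ with $\langle q';s'\rangle$ irreducible. By Lemma 2.1.1, $\exists \varepsilon', S' \geq \emptyset.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q';s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle$ and by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$, $S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q} q' : P\ !\ \varepsilon'$. By Lemma 2.1.2, $\langle q';s'\rangle$ is not faulty. By Proposition 2.1.1, $q'$ is a value. By inspection of the typing rules, values have no effect.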

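We will require two more lemmas and a proposition. The first lemma asserts both that reduction preserves type and effect, and also that it performs no unexpected actions.

Lemma 2.1.3 (Subject Reduction).

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle P\ !\ \varepsilon; S\rangle \;\wedge\;
\langle e;s\rangle \overset{t}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle
\;\rightarrow\;
\exists \varepsilon' \subseteq \varepsilon,\ S' \geq S.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle \;\wedge\; \vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon
\]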

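The remaining proposition and lemma relate to contexts. The proposition asserts that if a program or expression configuration is typable, its subexpressions within evaluation contexts can form typable configurations.

Proposition 2.1.2 (Removing Context Preserves Typability).

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle
\;\rightarrow\;
\exists P_0,\ \varepsilon_0 \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$:

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle
\;\rightarrow\;
\exists P_0,\ \varepsilon_0 \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle
\]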

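The following lemma states that within a well-typed program configuration, expression configuration, or expression, replacing a well-typed subexpression in an evaluation context with another subexpression of the same pure type and no greater effect, and the store and store type with no lesser store and store type, yields another well-typed program configuration, expression configuration, or expression.

Lemma 2.1.4 (Replacement).

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

\[
\left(\begin{array}{l}
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle \;\wedge \\
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle \;\wedge \\
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle P_0\ !\ \varepsilon_0'; S'\rangle \;\wedge \\
\varepsilon_0' \subseteq \varepsilon_0 \;\wedge\; S' \geq S
\end{array}\right)
\rightarrow
\exists \varepsilon' \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0']; s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$:

\[
\left(\begin{array}{l}
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle \;\wedge \\
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle \;\wedge \\
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle P_0\ !\ \varepsilon_0'; S'\rangle \;\wedge \\
\varepsilon_0' \subseteq \varepsilon_0 \;\wedge\; S' \geq S
\end{array}\right)
\rightarrow
\exists \varepsilon' \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e_0']; s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}e$:

\[
\left(\begin{array}{l}
\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0] : P\ !\ \varepsilon \;\wedge \\
\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_0 : P_0\ !\ \varepsilon_0 \;\wedge \\
\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_0' : P_0\ !\ \varepsilon_0' \;\wedge \\
\varepsilon_0' \subseteq \varepsilon_0 \;\wedge\; S' \geq S
\end{array}\right)
\rightarrow
\exists \varepsilon' \subseteq \varepsilon.\ \emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0'] : P\ !\ \varepsilon'
\]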

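Proof: Theorem 2.1.2.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

As in the proof of Theorem 2.1.1, we obtain $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle q;\emptyset\rangle : \langle P\ !\ \varepsilon; \emptyset\rangle$. The reduction derivation can only use $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$-cntxt.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$-cntxt:

Using Proposition 2.1.2 on the typing derivation of the initial program configuration $\langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle$, we obtain $\exists P_0,\ \varepsilon_0 \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle$. $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$-cntxt determines an evaluation $\langle e_0;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0';s'\rangle$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon_0$ implies $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon$, so the statement for program configurations reduces to that for expression configurations.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$:

We proceed by induction on the expression configuration reduction derivation.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-cntxt:

Using Proposition 2.1.2 on the typing derivation of the initial expression configuration, we obtain $\exists P_0,\ \varepsilon_0 \subseteq \varepsilon.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle$. $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-cntxt determines an evaluation $\langle e_0;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0';s'\rangle$. By induction $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon_0$. It follows that $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon$.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-reflex:

$t = [\,]$ so $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon$ clearly holds.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-step:

Use Lemma 2.1.3.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-trans:

We have $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle P\ !\ \varepsilon; S\rangle$, $\langle e;s\rangle \overset{t''}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e'';s''\rangle$, and $\langle e'';s''\rangle \overset{t'}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle$ with $t = t'' + t'$. By Lemma 2.1.1, we have that $\exists \varepsilon' \subseteq \varepsilon,\ S''.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e'';s''\rangle : \langle P\ !\ \varepsilon'; S''\rangle$. By induction, we have $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t'' : \varepsilon$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t' : \varepsilon'$. Because $\varepsilon' \subseteq \varepsilon$, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t' : \varepsilon$ and thus $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t} t : \varepsilon$.

Proof: Lemma 2.1.1.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$-cntxt: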

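We again use Proposition 2.1.2 to reduce the first statement to the second, but now we need Lemma 2.1.4 as well.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$:

Proceed as follows for each evaluation rule.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-cntxt:

Use Proposition 2.1.2 and the induction hypothesis, then Lemma 2.1.4.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-reflex:

$s = s'$ and $e = e'$, so let $S' = S$ and $\varepsilon' = \varepsilon$.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-step:

Use Lemma 2.1.3.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-trans:

We have $\langle e;s\rangle \overset{t''}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e'';s''\rangle$ and $\langle e'';s''\rangle \overset{t'}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle$ with $t = t'' + t'$. By induction, we have $\exists \varepsilon'' \subseteq \varepsilon,\ S'' \geq S.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e'';s''\rangle : \langle P\ !\ \varepsilon''; S''\rangle$, and then $\exists \varepsilon' \subseteq \varepsilon'',\ S' \geq S''.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle$.

Proof: Lemma 2.1.2.

We observe that typing any faulty program configuration would require typing the expression and store of an immediately faulty expression configuration and, moreover, that because evaluation contexts do not fall in the scope of any program variable, the expression would be typable in an empty environment so the immediately faulty expression configuration would be typable. We thus proceed to show that if $\langle e;s\rangle$ is immediately faulty, then there are no $S$ and $E$ such that $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle E;S\rangle$. Our proof is by contradiction. We perform case analysis on immediately faulty expression configurations.

$\langle x;s\rangle$:

By rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$, we require $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} x : E$. This must be generated with $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-var, which requires $x \in \mathrm{Dom}(\emptyset)$.

$\langle\langle\lambda x.e_1\rangle;s\rangle$ ($\lambda x.e_1$ not closed):

By rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$, we require $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \langle\lambda x.e_1\rangle : E$. This must be generated with $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-alloc and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ, which in turn requires $\emptyset\{x \mapsto P_1\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_1 : P_2\ !\ \varepsilon_1$ and $E = P_1 \overset{\varepsilon_1}{\Rightarrow} P_2\ !\ \{\mathrm{alloc}\}$. But $e_1$ contains a free variable other than $x$ so this will not be derivable.

$\langle\mathrm{deref}\ v;s\rangle$, $\langle\mathrm{set}\ v\ \mathrm{to}\ e_2;s\rangle$, $\langle v\ e_2;s\rangle$ ($v \neq o$):

By rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$, we require $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : P\ !\ \varepsilon$, for $e$ equals $\mathrm{deref}\ v$, $\mathrm{set}\ v\ \mathrm{to}\ e_2$, or $v\ e_2$. These must be generated with $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-deref, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-set, or $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-app, respectively, each of which requires $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} v : B\ !\ \varepsilon_1$, for $B$ equals $\mathrm{Ref}\ P_1$ or $P_1 \overset{\varepsilon_3}{\Rightarrow} P$. This must be generated with $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live, which requires $v = o$.

$\langle\mathrm{deref}\ o;s\rangle$, $\langle\mathrm{set}\ o\ \mathrm{to}\ e_2;s\rangle$, $\langle o\ e_2;s\rangle$ ($o \notin \mathrm{Dom}(s)$):

By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live (as above), $o \in \mathrm{Dom}(S)$. By rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ we also require $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$, which gives that $o \in \mathrm{Dom}(s)$.

$\langle\mathrm{deref}\ o;s\rangle$, $\langle\mathrm{set}\ o\ \mathrm{to}\ e_2;s\rangle$ ($s(o) \notin \{\langle\mathrm{ref}\ v\rangle\}$):

Also by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live (as above), $S(o) = \mathrm{Ref}\ P_1$. By rules $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} s(o) : \mathrm{Ref}\ P_1$. This must be generated with $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-ref, which requires that $s(o) = \langle\mathrm{ref}\ v\rangle$.

$\langle o\ e_2;s\rangle$ ($s(o) \notin \{\langle\lambda x.e_1\rangle\}$):

Also by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live (as above), $S(o) = P_1 \overset{\varepsilon_3}{\Rightarrow} P$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} s(o) : P_1 \overset{\varepsilon_3}{\Rightarrow} P$. This must be generated by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-λ, which requires that $s(o) = \langle\lambda x.e_1\rangle$.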

We lay some preliminary groundwork for the proof of subject reduction by introducing a substitution lemma. We demonstrate that typing is preserved by substitution as used in the reduction semantics.

Lemma 2.1.5 (Value Substitution).

${}^{\mathrm{imd}}_{\mathrm{obj}}p$:

\[
\emptyset\{x' \mapsto P'\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p : P \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p[x' := v'] : P
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}b$:

\[
\emptyset\{x' \mapsto P'\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b : B\ !\ \varepsilon \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b[x' := v'] : B\ !\ \varepsilon
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}e$:

\[
\emptyset\{x' \mapsto P'\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : E \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e[x' := v'] : E
\]

The following proposition states that for various judgments, we can maintain derivability by replacing a store type with a greater or equal store type. The store could be enlarged by the allocation of additional offsets. This holds by way of a simple induction, noticing that our static judgments do not require negative assertions of the store type.8 Store weakening is needed for proving Lemma 2.1.4 as well as Lemma 2.1.3, and is thus stated in terms of more general environments.

8 Contrast this with our dynamic judgments, which for allocation of an offset $o$ in store $s$ require that $o \notin \mathrm{Dom}(s)$.

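Proposition 2.1.3 (Store Weakening).

${}^{\mathrm{imd}}_{\mathrm{obj}}v$:

\[
S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P \;\wedge\; S \leq S' \;\rightarrow\; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}d$:

\[
S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} d : B \;\wedge\; S \leq S' \;\rightarrow\; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} d : B
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}p$:

\[
\Gamma; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p : P \;\wedge\; S \leq S' \;\rightarrow\; \Gamma; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p : P
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}b$:

\[
\Gamma; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b : B\ !\ \varepsilon \;\wedge\; S \leq S' \;\rightarrow\; \Gamma; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b : B\ !\ \varepsilon
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}e$:

\[
\Gamma; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : E \;\wedge\; S \leq S' \;\rightarrow\; \Gamma; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : E
\]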
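Store weakening for values can be illustrated with a small Python sketch. It assumes that the ordering on store types means "extension that preserves existing bindings", which matches the remark that a store may be enlarged by allocating additional offsets but is not spelled out in this section; the dictionary encoding and the names `extends` and `type_value` are likewise assumptions made for illustration.

```python
def extends(S, S_prime):
    """S <= S': S' keeps every binding of S and may bind additional offsets."""
    return all(o in S_prime and S_prime[o] == B for o, B in S.items())

def type_value(v, S):
    """S |- v : P for boolean constants and live offsets."""
    if isinstance(v, bool):
        return "Bool"
    if v in S:
        return S[v]
    raise TypeError(f"untypable value: {v!r}")
```

Because `type_value` only ever looks bindings up and never asserts that an offset is absent, any value typable under `S` receives the same type under every `S_prime` with `extends(S, S_prime)`.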

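If a storable is typed as a prestorable, it can also be typed as a storable. It is sufficient that the environment be empty because storables are either reference cells holding values or closed functions.

Proposition 2.1.4 (Storification).

\[
\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} d : B\ !\ \emptyset \;\rightarrow\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} d : B
\]

We now proceed with the proof of Subject Reduction.

Proof: Lemma 2.1.3.

We have $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e;s\rangle : \langle P\ !\ \varepsilon; S\rangle$ and $\langle e;s\rangle \overset{t}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e';s'\rangle$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$ and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : P\ !\ \varepsilon$. By case analysis on the reduction step:

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-let: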

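$\langle \mathrm{let}\ x = v\ \mathrm{in}\ e_1; s\rangle \overset{[\,]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_1[x := v]; s\rangle$:

We have by rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-let that $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} v : P_1\ !\ \varepsilon_1$ and $\varepsilon = \varepsilon_1 \cup \varepsilon_2$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, $\varepsilon_1 = \emptyset$ and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} v : P_1$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P_1$. Also by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-let, $\emptyset\{x \mapsto P_1\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_1 : P\ !\ \varepsilon$. By Lemma 2.1.5 we have that $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_1[x := v] : P\ !\ \varepsilon$. We leave the derivation of $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$ in place and complete by reapplying rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$.

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-alloc: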

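$\langle d; s\rangle \overset{[(\mathrm{alloc}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle o; s\{o \mapsto d\}\rangle$:

Let $s' = s\{o \mapsto d\}$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-alloc, $P = B$, $\varepsilon = \varepsilon_1 \cup \{\mathrm{alloc}\}$, and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} d : B\ !\ \varepsilon_1$, where by observation $\varepsilon_1 = \emptyset$. Letting $S' = S\{o \mapsto B\}$ and applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} o : B\ !\ \emptyset$. By Proposition 2.1.4 (Storification), $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} d : B$. We have by rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$ a sequence of storable derivations. Weakening these and the new derivation to $S'$ using Proposition 2.1.3 and reapplying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$ yields $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s' : S'$. We complete with an application of rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$.

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-deref: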

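$\langle \mathrm{deref}\ o; s\rangle \overset{[(\mathrm{read}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle v; s\rangle$:

where $s(o) = \langle\mathrm{ref}\ v\rangle$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-deref, $\varepsilon = \varepsilon_1 \cup \{\mathrm{read}\}$, and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} o : \mathrm{Ref}\ P\ !\ \varepsilon_1$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, $\varepsilon_1 = \emptyset$ and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} o : \mathrm{Ref}\ P$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} o : \mathrm{Ref}\ P$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live, $S(o) = \mathrm{Ref}\ P$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} \langle\mathrm{ref}\ v\rangle : \mathrm{Ref}\ P$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-ref, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} v : P\ !\ \emptyset$. We leave the derivation of $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$ in place and complete with an application of rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$.

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-set: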

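$\langle \mathrm{set}\ o\ \mathrm{to}\ v; s\rangle \overset{[(\mathrm{write}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \mathrm{unit}; s\{o \mapsto \langle\mathrm{ref}\ v\rangle\}\rangle$:

Let $s' = s\{o \mapsto \langle\mathrm{ref}\ v\rangle\}$. By $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-set, $s(o) = \langle\mathrm{ref}\ v_0\rangle$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-set, $P = \mathrm{Unit}$ and $\varepsilon = \varepsilon_1 \cup \varepsilon_2 \cup \{\mathrm{write}\}$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-glob-const, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure we obtain $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \mathrm{unit} : \mathrm{Unit}\ !\ \emptyset$. We also have by rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-set that $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} o : \mathrm{Ref}\ P_1\ !\ \varepsilon_1$ and $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} v : P_1\ !\ \varepsilon_2$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} o : \mathrm{Ref}\ P_1$ and $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P_1$. By the former and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live, $S(o) = \mathrm{Ref}\ P_1$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-ref using the latter, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d} \langle\mathrm{ref}\ v\rangle : \mathrm{Ref}\ P_1$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s' : S$. We complete with an application of rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$.

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$-app-λ: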

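$\langle o\ v; s\rangle \overset{[(\mathrm{exec}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_1[x := v]; s\rangle$:

where $s(o) = \langle\lambda x.e_1\rangle$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-λ, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ, there are $P_1'$, $P'$, and $\varepsilon_3'$ such that $S(o) = P_1' \overset{\varepsilon_3'}{\Rightarrow} P'$. Also by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ, $\emptyset\{x \mapsto P_1'\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_1 : P'\ !\ \varepsilon_3'$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-app, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-pure, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value, $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v : P_1$ and $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} o : P_1 \overset{\varepsilon_3}{\Rightarrow} P$, and $\varepsilon = \varepsilon_3 \cup \{\mathrm{exec}\}$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live, $P_1' = P_1$, $P' = P$, and $\varepsilon_3' = \varepsilon_3$. By Lemma 2.1.5, $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_1[x := v] : P\ !\ \varepsilon_3$. We complete with an application of rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$.

Proof: Lemma 2.1.4.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: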

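From $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle$ we have $S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q} \overset{\to^*}{q}_{[e]}[e_0] : P\ !\ \varepsilon$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$. The former gives $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{q}_{[e]}[e_0] : P\ !\ \varepsilon$. Applying rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ gives $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle$. Also using $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle P_0\ !\ \varepsilon_0'; S'\rangle$ with $\varepsilon_0' \subseteq \varepsilon_0$ and $S' \geq S$, the result on expression configurations provides $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0']; s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle$. From rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ we have $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{q}_{[e]}[e_0'] : P\ !\ \varepsilon'$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s' : S'$. Applying rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,q}$ to the former gives $S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q} \overset{\to^*}{q}_{[e]}[e_0'] : P\ !\ \varepsilon'$. Applying rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}$ yields $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle} \langle \overset{\to^*}{q}_{[e]}[e_0']; s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle$.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle$: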

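From $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e_0]; s\rangle : \langle P\ !\ \varepsilon; S\rangle$ we have $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0] : P\ !\ \varepsilon$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s : S$. From $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0; s\rangle : \langle P_0\ !\ \varepsilon_0; S\rangle$ and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle P_0\ !\ \varepsilon_0'; S'\rangle$ we also have $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_0 : P_0\ !\ \varepsilon_0$, $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_0' : P_0\ !\ \varepsilon_0'$, and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s} s' : S'$. By the result on expressions we have $\exists \varepsilon' \subseteq \varepsilon.\ \emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0'] : P\ !\ \varepsilon'$. Applying rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ yields $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e_0']; s'\rangle : \langle P\ !\ \varepsilon'; S'\rangle$.

${}^{\mathrm{imd}}_{\mathrm{obj}}e$: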

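By induction on the portion of the derivation $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0] : P\ !\ \varepsilon$ corresponding to $\overset{\to^*}{e}_{[e]}$. We show only selected cases.

$[e]$:

$P_0 = P \wedge \varepsilon_0 = \varepsilon$. Use $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e_0' : P_0\ !\ \varepsilon_0'$ letting $\varepsilon' = \varepsilon_0' \subseteq \varepsilon_0 = \varepsilon$.

$\langle\mathrm{ref}\ \overset{\to^*}{e}_{[e]}\rangle$:

We have $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \langle\mathrm{ref}\ \overset{\to^*}{e}_{[e]}[e_0]\rangle : \mathrm{Ref}\ P_1\ !\ \varepsilon' \cup \{\mathrm{alloc}\}$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-alloc and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-ref, we get $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0] : P_1\ !\ \varepsilon'$. Then by induction we obtain $\exists \varepsilon'' \subseteq \varepsilon'.\ \emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0'] : P_1\ !\ \varepsilon''$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-ref and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-alloc yields $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \langle\mathrm{ref}\ \overset{\to^*}{e}_{[e]}[e_0']\rangle : \mathrm{Ref}\ P_1\ !\ \varepsilon'' \cup \{\mathrm{alloc}\}$, with $\varepsilon'' \cup \{\mathrm{alloc}\} \subseteq \varepsilon' \cup \{\mathrm{alloc}\}$.

$\mathrm{let}\ x = \overset{\to^*}{e}_{[e]}\ \mathrm{in}\ e$:

We have $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \mathrm{let}\ x = \overset{\to^*}{e}_{[e]}[e_0]\ \mathrm{in}\ e : P\ !\ \varepsilon_0 \cup \varepsilon_2$. By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-let, $\emptyset; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0] : P_0\ !\ \varepsilon_0$ and $\emptyset\{x \mapsto P_0\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : P\ !\ \varepsilon_2$. By induction we get $\exists \varepsilon_0' \subseteq \varepsilon_0.\ \emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \overset{\to^*}{e}_{[e]}[e_0'] : P_0\ !\ \varepsilon_0'$. Modifying the second subderivation using Proposition 2.1.3 yields $\emptyset\{x \mapsto P_0\}; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : P\ !\ \varepsilon_2$. Applying $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-let yields $\emptyset; S' \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} \mathrm{let}\ x = \overset{\to^*}{e}_{[e]}[e_0']\ \mathrm{in}\ e : P\ !\ \varepsilon_0' \cup \varepsilon_2$, with $\varepsilon_0' \cup \varepsilon_2 \subseteq \varepsilon_0 \cup \varepsilon_2$.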

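Proof: Lemma 2.1.5.

We allow the environment to hold additional bindings beyond the program variable being replaced and show instead:

${}^{\mathrm{imd}}_{\mathrm{obj}}p$:

\[
\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p : P \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} p[x' := v'] : P
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}b$:

\[
\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b : B\ !\ \varepsilon \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} b[x' := v'] : B\ !\ \varepsilon
\]

${}^{\mathrm{imd}}_{\mathrm{obj}}e$:

\[
\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : E \;\wedge\; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v} v' : P' \;\rightarrow\; \emptyset\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e[x' := v'] : E
\]

The proof is by induction on the derivation of the term into which we are substituting. We show only a few interesting cases.

${}^{\mathrm{imd}}_{\mathrm{obj}}p$:

$\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} x : P$:

By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-var, $(\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\})(x) = P$. If $x' = x$ then $P = P'$ and $x[x' := v'] = v'$, so the result follows from an application of $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-value. Otherwise, by the definition of environments, $\emptyset\{x'' \mapsto P''\}(x) = P$. $\emptyset\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p} x : P$ follows from $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}$-var, and is sufficient because $x = x[x' := v']$.

${}^{\mathrm{imd}}_{\mathrm{obj}}b$:

$\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} \langle\lambda x.e\rangle : P_1 \overset{\varepsilon}{\Rightarrow} P_2\ !\ \emptyset$:

By $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ, $\emptyset\{x' \mapsto P'\}\{x'' \mapsto P''\}\{x \mapsto P_1\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e : P_2\ !\ \varepsilon$. We then have by induction that $\emptyset\{x'' \mapsto P''\}\{x \mapsto P_1\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e} e[x' := v'] : P_2\ !\ \varepsilon$. $\emptyset\{x'' \mapsto P''\}; S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b} \langle\lambda x.e[x' := v']\rangle : P_1 \overset{\varepsilon}{\Rightarrow} P_2\ !\ \emptyset$ follows by $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ, with $\langle\lambda x.e[x' := v']\rangle = \langle\lambda x.e\rangle[x' := v']$.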
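The substitution operation that Lemma 2.1.5 reasons about can be sketched as follows. This is a hedged Python illustration over a tuple-encoded term language chosen for the example (the tags `"var"`, `"val"`, `"lam"`, `"let"`, and the remaining operator tags are assumptions, not the dissertation's syntax). Only closed values are substituted, so the sketch does not rename bound variables; it does stop at a binder that rebinds the substituted variable.

```python
def subst(t, x, v):
    """Return t[x := v], where v is a closed value, so no capture can occur."""
    tag = t[0]
    if tag == "var":
        return ("val", v) if t[1] == x else t
    if tag == "val":
        return t
    if tag == "lam":                      # ("lam", y, body)
        _, y, body = t
        return t if y == x else ("lam", y, subst(body, x, v))
    if tag == "let":                      # ("let", y, e1, e2); y scopes over e2 only
        _, y, e1, e2 = t
        return ("let", y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    # deref, set, app, ref: substitute in every subterm
    return (tag, *(subst(c, x, v) for c in t[1:]))
```

The `"lam"` case corresponds to the identity between substituting under the binder and substituting into the whole abstraction that closes the λ case of the proof.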


2. Monadic Language

2.1. Dynamics.

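\[
\begin{array}{rcl}
\langle q;{}^{\emptyset}s\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle q;{}^{\emptyset}s\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}q \times {}^{\mathrm{imd}}_{\mathrm{mon}}{}^{\emptyset}s \\
q \in {}^{\mathrm{imd}}_{\mathrm{mon}}q & ::= & \mathrm{run}\ e \mid \mathrm{running}\ e \mid v \\
\langle e;s\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle e;s\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}e \times {}^{\mathrm{imd}}_{\mathrm{mon}}s \\
e \in {}^{\mathrm{imd}}_{\mathrm{mon}}e & ::= & \mathrm{return}\ p \mid \mathrm{let}\ x = e\ \mathrm{in}\ e \mid b \mid \mathrm{deref}\ p \mid \mathrm{set}\ p\ \mathrm{to}\ p \mid p\ p \\
b \in {}^{\mathrm{imd}}_{\mathrm{mon}}b & ::= & \langle\mathrm{ref}\ p\rangle \mid \langle\lambda x.e\rangle \\
d \in {}^{\mathrm{imd}}_{\mathrm{mon}}d & ::= & \langle\mathrm{ref}\ v\rangle \mid \langle\lambda x.e\rangle \\
p \in {}^{\mathrm{imd}}_{\mathrm{mon}}p & ::= & v \mid x \\
\langle v;s\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle v;s\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}v \times {}^{\mathrm{imd}}_{\mathrm{mon}}s \\
v \in {}^{\mathrm{imd}}_{\mathrm{mon}}v & ::= & g \mid o \\
{}^{\emptyset}s \in {}^{\mathrm{imd}}_{\mathrm{mon}}{}^{\emptyset}s & ::= & s \mid \emptyset \\
s \in {}^{\mathrm{imd}}_{\mathrm{mon}}s & ::= & \emptyset\,\{o \mapsto d\} \\
t \in {}^{\mathrm{imd}}_{\mathrm{mon}}t & = & [f] \\
f \in {}^{\mathrm{imd}}_{\mathrm{mon}}f & ::= & (\iota\ @\ o) \\
\iota \in {}^{\mathrm{imd}}_{\mathrm{mon}}\iota & ::= & \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec} \\
g \in {}^{\mathrm{imd}}_{\mathrm{mon}}g & \supseteq & {}^{\mathrm{imd}}_{\mathrm{obj}}g \\
o \in {}^{\mathrm{imd}}_{\mathrm{mon}}o & \supseteq & {}^{\mathrm{imd}}_{\mathrm{obj}}o \\
x \in {}^{\mathrm{imd}}_{\mathrm{mon}}x & \supseteq & {}^{\mathrm{imd}}_{\mathrm{obj}}x
\end{array}
\]

Figure 2.2.1. Monadic Intermediate Language Syntax

In Figure 2.2.1, we similarly extend the monadic language to support a reduction semantics. Like the monadic source language (Figure 1.2.12), components of monadic operations are pure, and a return form is present. Programs again include run forms and values but now also include a running form for computations in progress. As in the object intermediate language (Figure 2.1.1), offsets are present as values and both traces and stores are defined. Again, storables $d$ of the form $\langle\lambda x.e\rangle$ are assumed closed. Potential stores ${}^{\emptyset}s$ are either stores or $\emptyset$, which we have overloaded from the static syntax. Configurations are again defined to represent the current state of a computation. They pair a program with a potential store, or an expression, pure, or value with a store. The intention is that program configurations that are running expressions will have an actual store, while expressions yet to be run and values resulting from running an expression have $\emptyset$.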

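\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-run}: \quad
\langle \mathrm{run}\ e; \emptyset\rangle \rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle \mathrm{running}\ e; \emptyset\rangle
\]
\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-running}: \quad
\langle \mathrm{running}\ \mathrm{return}\ v; s\rangle \rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle v; \emptyset\rangle
\]

Figure 2.2.2. Monadic Program Configuration Reduction Rules

We now have reduction rules at the level of programs as well as at the level of expressions. Program-level reduction rules derive judgments of the form $\langle q;{}^{\emptyset}s\rangle \rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q';{}^{\emptyset}s'\rangle$. In particular, they do not include any trace, which is visible only within a running form. Figure 2.2.2 presents the two program-level reduction rules. The first takes place before all expression-level rules; the second takes place after them. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}$-run replaces the run form with a running form, and initiates the store, replacing nonstore $\emptyset$ with an empty store. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}$-running replaces the running of a trivial computation (returning a value) with the computed value, and destructs the store, replacing it with $\emptyset$.

\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-let}: \quad
\langle \mathrm{let}\ x = \mathrm{return}\ v\ \mathrm{in}\ e; s\rangle \overset{[\,]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e[x := v]; s\rangle
\]
\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-alloc}: \quad
\langle d; s\rangle \overset{[(\mathrm{alloc}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle \mathrm{return}\ o; s\{o \mapsto d\}\rangle
\]
\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-app-}\lambda: \quad
\frac{s(o) = \langle\lambda x.e\rangle}
     {\langle o\ v; s\rangle \overset{[(\mathrm{exec}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e[x := v]; s\rangle}
\]
\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-deref}: \quad
\frac{s(o) = \langle\mathrm{ref}\ v\rangle}
     {\langle \mathrm{deref}\ o; s\rangle \overset{[(\mathrm{read}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle \mathrm{return}\ v; s\rangle}
\]
\[
\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-set}: \quad
\frac{s(o) = \langle\mathrm{ref}\ v_1\rangle}
     {\langle \mathrm{set}\ o\ \mathrm{to}\ v; s\rangle \overset{[(\mathrm{write}\ @\ o)]}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle \mathrm{return}\ \mathrm{unit}; s\{o \mapsto \langle\mathrm{ref}\ v\rangle\}\rangle}
\]

Figure 2.2.3. Monadic Language Expression Configuration Reduction Rules

The expression-level reduction rules in Figure 2.2.3 are similar to those of the object language (Figure 2.1.2) but differ in the judicious placement of return forms. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-alloc, $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-set, and $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-deref, which yield simple values in the object language, must instead provide trivial computations using return. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-let, which consumes a simple value in the object language, must instead consume a trivial computation with return. Only $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-app-λ is unchanged.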
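The five expression-level rules, together with reduction inside a `let` context, can be sketched as a small-step interpreter. The following Python sketch is illustrative only and is not the dissertation's Scheme model: terms are tuples, stores are dictionaries keyed by offsets, fresh offsets are generated by counting store entries, and all function and tag names are assumptions made for the example.

```python
def subst(t, x, v):
    """t[x := v] for a closed value v, so no variable capture can occur."""
    tag = t[0]
    if tag == "var":
        return v if t[1] == x else t
    if tag in ("const", "off"):
        return t
    if tag == "lam":
        _, y, body = t
        return t if y == x else ("lam", y, subst(body, x, v))
    if tag == "let":
        _, y, e1, e2 = t
        return ("let", y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    return (tag, *(subst(c, x, v) for c in t[1:]))   # return, ref, deref, set, app

def step(e, s):
    """One step <e; s> -> (trace, e', s'), reducing inside `let x = [ ] in e`."""
    tag = e[0]
    if tag == "let":
        _, x, e1, e2 = e
        if e1[0] == "return":                                   # -let
            return [], subst(e2, x, e1[1]), s
        t, e1_next, s_next = step(e1, s)                        # evaluation context
        return t, ("let", x, e1_next, e2), s_next
    if tag in ("ref", "lam"):                                   # -alloc
        o = ("off", f"o{len(s) + 1}")                           # fresh: stores only grow
        return [("alloc", o)], ("return", o), {**s, o: e}
    d = s.get(e[1]) if tag in ("deref", "set", "app") else None
    if tag == "deref" and d is not None and d[0] == "ref":     # -deref
        return [("read", e[1])], ("return", d[1]), s
    if tag == "set" and d is not None and d[0] == "ref":       # -set
        return [("write", e[1])], ("return", ("const", "unit")), {**s, e[1]: ("ref", e[2])}
    if tag == "app" and d is not None and d[0] == "lam":       # -app-λ
        return [("exec", e[1])], subst(d[2], d[1], e[2]), s
    raise ValueError(f"no rule applies to {e!r}")

def evaluate(e, s):
    """Iterate `step` until a trivial computation `return v` is reached."""
    trace = []
    while e[0] != "return":
        t, e, s = step(e, s)
        trace += t
    return trace, e, s
```

Every rule except the application rule leaves a `("return", ...)` residual, which is the shape the `let` rule consumes, mirroring the placement of return forms discussed above.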

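\[
\begin{array}{rcl}
\overset{\to^*}{q}_{[e]} \in {}^{\mathrm{imd}}_{\mathrm{mon}}\overset{\to^*}{q}_{[e]} & ::= & \mathrm{running}\ \overset{\to^*}{e}_{[e]} \\
\overset{\to^*}{e}_{[e]} \in {}^{\mathrm{imd}}_{\mathrm{mon}}\overset{\to^*}{e}_{[e]} & ::= & [e] \mid \mathrm{let}\ x = \overset{\to^*}{e}_{[e]}\ \mathrm{in}\ e
\end{array}
\]

Figure 2.2.4. CBV Monadic Language Atomic Expression Evaluation Contexts

Contexts are defined as in the object language. Because components of monadic operations are pure and pures are not reducible, evaluation contexts (in Figure 2.2.4) are fewer than in the object language (Figure 2.1.3). The only evaluation contexts to survive from the object language are the empty one and the definition of a let form. Rather than an empty program-expression evaluation context, however, evaluation of an expression within a program must occur through a running form.

\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-cntxt}: \quad
\frac{\langle e;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e';s'\rangle}
     {\langle \overset{\to^*}{q}_{[e]}[e]; s\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to^*}{q}_{[e]}[e']; s'\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-reflex}: \quad
\langle q;{}^{\emptyset}s\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q;{}^{\emptyset}s\rangle
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-step}: \quad
\frac{\langle q;{}^{\emptyset}s\rangle \rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q';{}^{\emptyset}s'\rangle}
     {\langle q;{}^{\emptyset}s\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q';{}^{\emptyset}s'\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}\text{-trans}: \quad
\frac{\langle q;{}^{\emptyset}s\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q';{}^{\emptyset}s'\rangle \qquad
      \langle q';{}^{\emptyset}s'\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q'';{}^{\emptyset}s''\rangle}
     {\langle q;{}^{\emptyset}s\rangle \to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle} \langle q'';{}^{\emptyset}s''\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-cntxt}: \quad
\frac{\langle e;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e';s'\rangle}
     {\langle \overset{\to^*}{e}_{[e]}[e]; s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle \overset{\to^*}{e}_{[e]}[e']; s'\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-reflex}: \quad
\langle e;s\rangle \overset{[\,]}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e;s\rangle
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-step}: \quad
\frac{\langle e;s\rangle \overset{t}{\rightharpoonup}{}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e';s'\rangle}
     {\langle e;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e';s'\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}\text{-trans}: \quad
\frac{\langle e;s\rangle \overset{t}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e';s'\rangle \qquad
      \langle e';s'\rangle \overset{t'}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e'';s''\rangle}
     {\langle e;s\rangle \overset{t+t'}{\to}{}^{*\,\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle} \langle e'';s''\rangle}
\]

Figure 2.2.5. Monadic Language Multiple Deep Reduction Rules

As in the object language (Figure 2.1.4), we define the reflexive, transitive, and contextual closure of our reductions, at the level of programs as well as expressions. Just as the program-level reductions do not maintain a trace, neither does their closure relation. Because we now have program-level reductions, the closure must be taken explicitly at both levels. Figure 2.2.5 presents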

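all of these rules. We assume definitions of equality, divergence, and convergence similar to those of the object language.

$\langle \mathrm{run}\ \mathrm{let}\ x = \langle\mathrm{ref}\ \mathrm{true}\rangle\ \mathrm{in}\ \mathrm{deref}\ x; \emptyset\rangle$
$\rightharpoonup$ (by -run) $\langle \mathrm{running}\ \mathrm{let}\ x = \langle\mathrm{ref}\ \mathrm{true}\rangle\ \mathrm{in}\ \mathrm{deref}\ x; \emptyset\rangle$
within $\mathrm{running}\ [e]$ and $\mathrm{let}\ x = [e]\ \mathrm{in}\ \mathrm{deref}\ x$: $\langle\langle\mathrm{ref}\ \mathrm{true}\rangle; \emptyset\rangle \overset{[(\mathrm{alloc}\ @\ o_1)]}{\rightharpoonup}$ (by -alloc) $\langle \mathrm{return}\ o_1; \emptyset\{o_1 \mapsto \langle\mathrm{ref}\ \mathrm{true}\rangle\}\rangle$
within $\mathrm{running}\ [e]$: $\overset{[\,]}{\rightharpoonup}$ (by -let) $\langle \mathrm{deref}\ o_1; \emptyset\{o_1 \mapsto \langle\mathrm{ref}\ \mathrm{true}\rangle\}\rangle$
$\overset{[(\mathrm{read}\ @\ o_1)]}{\rightharpoonup}$ (by -deref) $\langle \mathrm{return}\ \mathrm{true}; \emptyset\{o_1 \mapsto \langle\mathrm{ref}\ \mathrm{true}\rangle\}\rangle$
$\rightharpoonup$ (by -running) $\langle \mathrm{true}; \emptyset\rangle$

Figure 2.2.6. Sample Monadic Language Reduction

Figure 2.2.6 presents a sample reduction sequence for the monadic language. The sequence begins with program-level rule $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}$-run. Within a running context, we immediately enter a let context and proceed with an allocation of a reference cell. The residual of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-alloc is now a return form, which is precisely what $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-let now expects. The residual of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-deref is another return form, as required by $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle q;{}^{\emptyset}s\rangle}$-running, the final program-level reduction. No trace information is maintained at the program level.

The expression configurations faulty for the monadic language are similar to those faulty for the object language. There are additionally a few faulty program configurations for the monadic language.

Definition 2.2.1 (Immediately Faulty Program Configurations).

(1) $\langle \mathrm{run}\ e; s\rangle$, $\langle \mathrm{running}\ e; \emptyset\rangle$, $\langle v; s\rangle$

Proposition 2.2.1 (Nonvalue program configurations are reducible or faulty). Call a program configuration of the form $\langle v; \emptyset\rangle$ a value program configuration. Every program configuration $\langle q;{}^{\emptyset}s\rangle$ is either a value program configuration, a redex, immediately faulty, or decomposable into the form $\langle \overset{\to^*}{q}_{[e]}[e_0]; s\rangle$, where $\langle e_0; s\rangle$ is either a redex or immediately faulty.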

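Proof: Proposition 2.2.1.

${}^{\mathrm{imd}}_{\mathrm{mon}}\langle q;{}^{\emptyset}s\rangle$:

We use case analysis.

$\langle \mathrm{run}\ e;{}^{\emptyset}s\rangle$: $\langle \mathrm{run}\ e; \emptyset\rangle$ is a redex. $\langle \mathrm{run}\ e; s\rangle$ is immediately faulty.

$\langle \mathrm{running}\ e;{}^{\emptyset}s\rangle$: $\langle \mathrm{running}\ e; \emptyset\rangle$ is immediately faulty. Otherwise, ${}^{\emptyset}s = s$. If $e$ is $\mathrm{return}\ v$, then $\langle \mathrm{running}\ \mathrm{return}\ v; s\rangle$ is a redex. Otherwise, by the result on expression configurations, $e$ can be decomposed into $\overset{\to^*}{e}_{[e]}[e_0]$, where $\langle e_0; s\rangle$ is either a redex or immediately faulty, so let $\overset{\to^*}{q}_{[e]} = \mathrm{running}\ \overset{\to^*}{e}_{[e]}$.

$\langle v;{}^{\emptyset}s\rangle$: $\langle v; \emptyset\rangle$ is a value program configuration. $\langle v; s\rangle$ is immediately faulty.

${}^{\mathrm{imd}}_{\mathrm{mon}}\langle e;s\rangle$:

We can again show by induction on the structure of expressions that every expression configuration $\langle e; s\rangle$ with a nonvalue expression $e$ can be decomposed into the form $\langle \overset{\to^*}{e}_{[e]}[e_0]; s\rangle$, where $\langle e_0; s\rangle$ is either a redex or immediately faulty. The clause for let differs only slightly from the corresponding clause for the object language.

$\langle \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2; s\rangle$: If $e_1$ is $\mathrm{return}\ v$, then let $\overset{\to^*}{e}_{[e]} = [e]$ and $\langle \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2; s\rangle$ is a redex. Otherwise, by induction, $e_1$ can be decomposed into $\overset{\to^*}{e}_{[e]\,1}[e_{01}]$, where $\langle e_{01}; s\rangle$ is either a redex or immediately faulty, so let $\overset{\to^*}{e}_{[e]} = \mathrm{let}\ x = \overset{\to^*}{e}_{[e]\,1}\ \mathrm{in}\ e_2$.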
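The program-level case analysis of Proposition 2.2.1, together with the two program-level rules of Figure 2.2.2, can be sketched directly. In this illustrative Python model, the nonstore is represented by `None` and an actual (possibly empty) store by a dictionary, which keeps the two overloaded uses of ∅ apart; programs are tuples tagged `"run"`, `"running"`, or `"val"`. These encodings and names are assumptions for the sketch, and the decomposable case is only reported, not carried out.

```python
NO_STORE = None   # the nonstore; an actual store is a dict, possibly empty

def classify(q, store):
    """Classify a program configuration <q; potential store>."""
    has_store = store is not NO_STORE
    tag = q[0]
    if tag == "val":
        return "immediately faulty" if has_store else "value"
    if tag == "run":
        return "immediately faulty" if has_store else "redex"
    if tag == "running":
        if not has_store:
            return "immediately faulty"
        # running (return v) is a redex; otherwise look inside the running context
        return "redex" if q[1][0] == "return" else "decompose"
    raise ValueError(f"not a program: {q!r}")

def step_program(q, store):
    """Apply the -run or -running rule; neither records a trace."""
    if q[0] == "run" and store is NO_STORE:
        return ("running", q[1]), {}                  # initiate an empty store
    if q[0] == "running" and store is not NO_STORE and q[1][0] == "return":
        return ("val", q[1][1]), NO_STORE             # destruct the store
    raise ValueError(f"no program-level rule applies to {q!r}")
```

A configuration classified as `"decompose"` is the case where reduction proceeds at the expression level inside the running form.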

2.2. Statics.

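We now present a static semantics of the monadic intermediate language. Figure 2.2.1 contains the syntax. All forms that were present in the source language (Figure 1.2.15) are identical here. Configuration types and the store type are identical to those of the object intermediate static syntax (Figure 2.1.1) except for the inclusion of potential store types. Potential store types are similar to the potential stores of Figure 2.1.1. The $\leq$ relation is extended to potential store types by assuming $\forall\, {}^{\emptyset}S.\ \emptyset \leq {}^{\emptyset}S \;\wedge\; {}^{\emptyset}S \leq \emptyset$.

\[
\begin{array}{rcl}
\langle Q;{}^{\emptyset}S\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle Q;{}^{\emptyset}S\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}Q \times {}^{\mathrm{imd}}_{\mathrm{mon}}{}^{\emptyset}S \\
Q \in {}^{\mathrm{imd}}_{\mathrm{mon}}Q & ::= & G \mid \emptyset \\
\Gamma \in {}^{\mathrm{imd}}_{\mathrm{mon}}\Gamma & ::= & \emptyset \mid \Gamma\{x \mapsto P\} \\
\langle E;S\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle E;S\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}E \times {}^{\mathrm{imd}}_{\mathrm{mon}}S \\
E \in {}^{\mathrm{imd}}_{\mathrm{mon}}E & ::= & T\ P \\
B \in {}^{\mathrm{imd}}_{\mathrm{mon}}B & ::= & \mathrm{Ref}\ P \mid P \Rightarrow E \\
\langle P;S\rangle \in {}^{\mathrm{imd}}_{\mathrm{mon}}\langle P;S\rangle & = & {}^{\mathrm{imd}}_{\mathrm{mon}}P \times {}^{\mathrm{imd}}_{\mathrm{mon}}S \\
P \in {}^{\mathrm{imd}}_{\mathrm{mon}}P & ::= & G \mid B \\
{}^{\emptyset}S \in {}^{\mathrm{imd}}_{\mathrm{mon}}{}^{\emptyset}S & ::= & S \mid \emptyset \\
S \in {}^{\mathrm{imd}}_{\mathrm{mon}}S & ::= & \emptyset\,\{o \mapsto B\} \\
T \in {}^{\mathrm{imd}}_{\mathrm{mon}}T & ::= & \mathrm{St}^{\varepsilon} \\
\varepsilon \in {}^{\mathrm{imd}}_{\mathrm{mon}}\varepsilon & = & {}^{\mathrm{imd}}_{\mathrm{mon}}\{F\} \\
F \in {}^{\mathrm{imd}}_{\mathrm{mon}}F & ::= & \iota \\
\iota \in {}^{\mathrm{imd}}_{\mathrm{mon}}\iota & ::= & \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec} \\
G \in {}^{\mathrm{imd}}_{\mathrm{mon}}G & \supseteq & {}^{\mathrm{imd}}_{\mathrm{obj}}G
\end{array}
\]

Figure 2.2.1. Monadic Intermediate Language Static Syntax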

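$$
\begin{array}{ll}
\text{program configs} & \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle \\
\text{programs} & {}^{\emptyset}S \vdash^{imd}_{mon\,q} q : Q \\
\text{expression configs} & \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e; s\rangle : \langle E; S\rangle \\
\text{expressions} & \Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} e : E \\
\text{prestorables} & \Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,b} b : B \\
\text{pures} & \Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p : P \\
\text{value configs} & \vdash^{imd}_{mon\,\langle v;s\rangle} \langle v; s\rangle : \langle P; S\rangle \\
\text{values} & {}^{\emptyset}S \vdash^{imd}_{mon\,v} v : P \\
\text{potential stores} & \vdash^{imd}_{mon\,{}^{\emptyset}s} {}^{\emptyset}s : {}^{\emptyset}S \\
\text{storables} & S \vdash^{imd}_{mon\,d} d : B \\
\text{traces} & \vdash^{imd}_{mon\,t} t : T \\
\text{atomic traces} & \vdash^{imd}_{mon\,f} f : F
\end{array}
$$

Figure 2.2.2. Monadic Intermediate Language Typing Judgments

Figure 2.2.2 presents typing judgments for the monadic intermediate language. Judgments of the source language (Figure 1.2.16) differ here only in their inclusion of a store type, as in the object intermediate language typing judgments (Figure 2.1.2). Judgments for configurations, traces, atomic traces, stores, and storables are also similar to the object intermediate language. The judgment for stores is replaced by one for potential stores.

We again define $\vdash^{imd}_{mon\,f} f : F$ to hold for $\vdash^{imd}_{mon\,f} (\iota\ @\ o) : \iota$ and now define $\vdash^{imd}_{mon\,t} t : \mathrm{St}^{\varepsilon}$ to hold when $\forall f \in t.\ \exists F \in \varepsilon.\ \vdash^{imd}_{obj\,f} f : F$, i.e., the atomic trace type must be included in the functor's annotation.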
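The trace typing condition just stated can be read as a simple membership check. The following is a minimal sketch, assuming an encoding that is not part of the original text: an atomic trace `(ι @ o)` is a Python pair `(kind, offset)`, and an effect annotation ε is a set of kind names. The function names are hypothetical.

```python
from typing import Iterable, Set, Tuple

# Assumed encoding: an atomic trace (iota @ o) is the pair (iota, o).
AtomicTrace = Tuple[str, str]
ATOMIC_KINDS = {"alloc", "read", "write", "exec"}


def atomic_trace_type(f: AtomicTrace) -> str:
    """The type of an atomic trace (iota @ o) is its kind iota."""
    kind, _offset = f
    if kind not in ATOMIC_KINDS:
        raise ValueError(f"unknown atomic trace kind: {kind}")
    return kind


def trace_has_type(t: Iterable[AtomicTrace], effect: Set[str]) -> bool:
    """t : St^effect holds when every atomic trace type in t occurs in effect."""
    return all(atomic_trace_type(f) in effect for f in t)
```

Under this reading, the empty trace is typable at every annotation, including the empty one, and enlarging the annotation never invalidates a typing.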

$$\frac{}{{}^{\emptyset}S \vdash^{imd}_{mon\,v} g : \mathrm{TypeOf}(g)}\ (\vdash^{imd}_{mon\,v}\text{-glob-const}) \qquad \frac{}{S \vdash^{imd}_{mon\,v} o : S(o)}\ (\vdash^{imd}_{mon\,v}\text{-addr-live})$$

Figure 2.2.3. Typing of Monadic Intermediate Language Values

Monadic intermediate language values are typed using the rules in Figure 2.2.3 as in the object intermediate language (Figure 2.1.3).

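$$\frac{}{\emptyset \vdash^{imd}_{mon\,q} g : \mathrm{TypeOf}(g)}\ (\vdash^{imd}_{mon\,q}\text{-glob-const}) \qquad \frac{}{\emptyset \vdash^{imd}_{mon\,q} o : \emptyset}\ (\vdash^{imd}_{mon\,q}\text{-addr-dead})$$

$$\frac{\emptyset; \emptyset \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P}{\emptyset \vdash^{imd}_{mon\,q} \mathrm{run}\ e : P := \emptyset}\ (\vdash^{imd}_{mon\,q}\text{-run}) \qquad \frac{\emptyset; S \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P}{S \vdash^{imd}_{mon\,q} \mathrm{running}\ e : P := \emptyset}\ (\vdash^{imd}_{mon\,q}\text{-running})$$

Figure 2.2.4. Typing of Monadic Intermediate Language Programs

Programs are typed using the rules in Figure 2.2.4. The typing rules follow the BNF for the static syntax rather than that for the dynamic syntax, leading to duplication of the rule for global constants (see footnote 9). $\vdash^{imd}_{mon\,q}$-run and $\vdash^{imd}_{mon\,q}$-glob-const differ from the source language (Figure 1.2.18) only in the inclusion of the non-store type $\emptyset$. $\vdash^{imd}_{mon\,q}$-addr-dead also carries the non-store type, and represents program-level pointers as offsets. $\vdash^{imd}_{mon\,q}$-running is similar to $\vdash^{imd}_{mon\,q}$-run, but takes a store type.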

$$\frac{S \vdash^{imd}_{mon\,v} v : P}{S \vdash^{imd}_{mon\,d} \langle \mathrm{ref}\ v\rangle : \mathrm{Ref}\ P}\ (\vdash^{imd}_{mon\,d}\text{-ref}) \qquad \frac{\emptyset; S \vdash^{imd}_{mon\,b} \langle \lambda x.e_0\rangle : B}{S \vdash^{imd}_{mon\,d} \langle \lambda x.e_0\rangle : B}\ (\vdash^{imd}_{mon\,d}\text{-}\lambda)$$

Figure 2.2.5. Typing of Monadic Language Storables

Storables and potential stores are typed using the rules in Figures 2.2.5 and 2.2.6, respectively. They differ from the object language rules (Figures 2.1.5 and 2.1.6) in that the antecedent to $\vdash^{imd}_{mon\,d}$-$\lambda$ does not require any effect and in that there is a rule that types nonstores as nonstore types. Pures in Figure 2.2.7 are typed similarly to those of the object language (Figure 2.1.7).

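$$\frac{}{\vdash^{imd}_{mon\,{}^{\emptyset}s} \emptyset : \emptyset}\ (\vdash^{imd}_{mon\,{}^{\emptyset}s}\text{-}\emptyset) \qquad \frac{\emptyset\,\{o \mapsto B\} \vdash^{imd}_{mon\,d} d : B}{\vdash^{imd}_{mon\,{}^{\emptyset}s} \emptyset\,\{o \mapsto d\} : \emptyset\,\{o \mapsto B\}}\ (\vdash^{imd}_{mon\,{}^{\emptyset}s}\text{-}s)$$

Figure 2.2.6. Typing of Monadic Language Potential Stores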

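$$\frac{}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} x : \Gamma(x)}\ (\vdash^{imd}_{mon\,p}\text{-var}) \qquad \frac{{}^{\emptyset}S \vdash^{imd}_{mon\,v} v : P}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} v : P}\ (\vdash^{imd}_{mon\,p}\text{-value})$$

Figure 2.2.7. Typing of Monadic Intermediate Language Pures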

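$$\frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p : P}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,b} \langle \mathrm{ref}\ p\rangle : \mathrm{Ref}\ P}\ (\vdash^{imd}_{mon\,b}\text{-ref}) \qquad \frac{\Gamma\{x \mapsto P_1\}; {}^{\emptyset}S \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P_2}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,b} \langle \lambda x.e\rangle : (P_1 \Rightarrow \mathrm{St}^{\varepsilon}\,P_2)}\ (\vdash^{imd}_{mon\,b}\text{-}\lambda)$$

Figure 2.2.8. Typing of Monadic Intermediate Language Prestorables

Figures 2.2.8 and 2.2.9 type prestorables and expressions, respectively. They are typed just as in the source language (Figures 1.2.20 and 1.2.21) except for the inclusion of a potential store type in place of the store type of the object intermediate language (Figures 2.1.8 and 2.1.9).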

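Configurations are typed using the rules $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$, $\vdash^{imd}_{mon\,\langle e;s\rangle}$, and $\vdash^{imd}_{mon\,\langle v;s\rangle}$. These are the same as those of the object language except for the use of ${}^{\emptyset}s$ and ${}^{\emptyset}S$ instead of $s$ and $S$ in $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$.

$$\frac{\vdash^{imd}_{mon\,{}^{\emptyset}s} {}^{\emptyset}s : {}^{\emptyset}S \qquad {}^{\emptyset}S \vdash^{imd}_{mon\,q} q : Q}{\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle}\ (\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle})$$

$$\frac{\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S \qquad \Gamma; S \vdash^{imd}_{mon\,e} e : E}{\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e; s\rangle : \langle E; S\rangle}\ (\vdash^{imd}_{mon\,\langle e;s\rangle})$$

$$\frac{\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S \qquad S \vdash^{imd}_{mon\,v} v : P}{\vdash^{imd}_{mon\,\langle v;s\rangle} \langle v; s\rangle : \langle P; S\rangle}\ (\vdash^{imd}_{mon\,\langle v;s\rangle})$$

9. This is justified because dangling pointers only make sense at the top level, and live ones only internally.

$$\frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} e_1 : \mathrm{St}^{\varepsilon_1}\,P_1 \qquad \Gamma\{x \mapsto P_1\}; {}^{\emptyset}S \vdash^{imd}_{mon\,e} e_2 : \mathrm{St}^{\varepsilon_2}\,P_2}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 : \mathrm{St}^{(\varepsilon_1 \cup \varepsilon_2)}\,P_2}\ (\vdash^{imd}_{mon\,e}\text{-let})$$

$$\frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p : P}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} \mathrm{return}\ p : \mathrm{St}^{\emptyset}\,P}\ (\vdash^{imd}_{mon\,e}\text{-pure}) \qquad \frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,b} b : B}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} b : \mathrm{St}^{\{\mathrm{alloc}\}}\,B}\ (\vdash^{imd}_{mon\,e}\text{-alloc})$$

$$\frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p : \mathrm{Ref}\ P}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} \mathrm{deref}\ p : \mathrm{St}^{\{\mathrm{read}\}}\,P}\ (\vdash^{imd}_{mon\,e}\text{-deref}) \qquad \frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p_1 : \mathrm{Ref}\ P \qquad \Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p_2 : P}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} \mathrm{set}\ p_1\ \mathrm{to}\ p_2 : \mathrm{St}^{\{\mathrm{write}\}}\,\mathrm{Unit}}\ (\vdash^{imd}_{mon\,e}\text{-set})$$

$$\frac{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p_2 : P_1 \qquad \Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,p} p_1 : (P_1 \Rightarrow \mathrm{St}^{\varepsilon}\,P_2)}{\Gamma; {}^{\emptyset}S \vdash^{imd}_{mon\,e} p_1\ p_2 : \mathrm{St}^{(\varepsilon \cup \{\mathrm{exec}\})}\,P_2}\ (\vdash^{imd}_{mon\,e}\text{-app})$$

Figure 2.2.9. Typing of Monadic Intermediate Language Expressions

$$
\begin{array}{ll}
\emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash o_1 : \mathrm{Ref}\ \mathrm{Bool} & (\vdash^{imd}_{mon\,v}\text{-addr-live}) \\
\emptyset; \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash o_1 : \mathrm{Ref}\ \mathrm{Bool} & (\vdash^{imd}_{mon\,p}\text{-value}) \\
\emptyset; \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash \mathrm{deref}\ o_1 : \mathrm{St}^{\{\mathrm{read}\}}\,\mathrm{Bool} & (\vdash^{imd}_{mon\,e}\text{-deref}) \\
\emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash \mathrm{running}\ \mathrm{deref}\ o_1 : \mathrm{Bool} & (\vdash^{imd}_{mon\,q}\text{-running}) \\
\emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash \mathrm{true} : \mathrm{Bool} & (\vdash^{imd}_{mon\,v}\text{-glob-const}) \\
\emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} \vdash \langle \mathrm{ref}\ \mathrm{true}\rangle : \mathrm{Ref}\ \mathrm{Bool} & (\vdash^{imd}_{mon\,d}\text{-ref}) \\
\vdash \emptyset\,\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{true}\rangle\} : \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\} & (\vdash^{imd}_{mon\,{}^{\emptyset}s}\text{-}s) \\
\vdash \langle \mathrm{running}\ \mathrm{deref}\ o_1; \emptyset\,\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{true}\rangle\}\rangle : \langle \mathrm{Bool}; \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\}\rangle & (\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle})
\end{array}
$$

Figure 2.2.10. Sample Monadic Intermediate Language Derivation

In Figure 2.2.10, we present a sample program configuration typing derivation for part of the reduction sequence in Figure 2.2.6. At this point, the reference cell has been allocated in the store and all that remains is to perform the dereference. The program configuration derivation is composed of a store derivation and a program derivation. Neither the program configuration judgment nor the program judgment reveals the effect that is manifest in the antecedent expression derivation. The structure of the expression derivation is similar to those of the source language in Figures 1.3.22 and 1.3.23. As in the sample object intermediate language derivation of Figure 2.1.10, the store type is maintained throughout the expression derivation in order to type the explicit offset.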
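The expression rules of Figure 2.2.9 are syntax-directed, so they can be read as a checker that returns an effect together with a pure type. The following is a minimal sketch under stated assumptions, none of which come from the original text: terms and types are encoded as Python tuples, λ-abstractions carry an explicit parameter type (the calculus itself does not annotate them), only three global constants exist, and the names `type_pure` and `type_expr` are hypothetical. A store type of `None` stands in for the nonstore type.

```python
# Assumed encodings (illustrative only):
#   types:  "Bool", "Unit", ("Ref", P), ("Fun", P1, effect, P2)
#   pures:  ("var", x), ("const", g), ("addr", o)
#   exprs:  ("return", p), ("let", x, e1, e2), ("ref", p), ("lam", x, P1, e),
#           ("deref", p), ("set", p1, p2), ("app", p1, p2)
CONST_TYPES = {"true": "Bool", "false": "Bool", "unit": "Unit"}


def type_pure(env, store_ty, p):
    """Rules p-var and p-value (with v-glob-const and v-addr-live)."""
    tag = p[0]
    if tag == "var":
        return env[p[1]]
    if tag == "const":
        return CONST_TYPES[p[1]]
    if tag == "addr":
        if store_ty is None:
            raise TypeError("an address needs a store type")
        return store_ty[p[1]]
    raise TypeError(f"not a pure: {p!r}")


def type_expr(env, store_ty, e):
    """Return (effect, P) such that e : St^effect P, or raise TypeError."""
    tag = e[0]
    if tag == "return":                                   # e-pure
        return frozenset(), type_pure(env, store_ty, e[1])
    if tag == "let":                                      # e-let
        _, x, e1, e2 = e
        eps1, p1 = type_expr(env, store_ty, e1)
        eps2, p2 = type_expr({**env, x: p1}, store_ty, e2)
        return eps1 | eps2, p2
    if tag == "ref":                                      # e-alloc with b-ref
        return frozenset({"alloc"}), ("Ref", type_pure(env, store_ty, e[1]))
    if tag == "lam":                                      # e-alloc with b-lambda
        _, x, p1, body = e
        eps, p2 = type_expr({**env, x: p1}, store_ty, body)
        return frozenset({"alloc"}), ("Fun", p1, eps, p2)
    if tag == "deref":                                    # e-deref
        ref_ty = type_pure(env, store_ty, e[1])
        if ref_ty[0] != "Ref":
            raise TypeError("deref expects a reference")
        return frozenset({"read"}), ref_ty[1]
    if tag == "set":                                      # e-set
        ref_ty = type_pure(env, store_ty, e[1])
        if ref_ty != ("Ref", type_pure(env, store_ty, e[2])):
            raise TypeError("set expects a reference to the value's type")
        return frozenset({"write"}), "Unit"
    if tag == "app":                                      # e-app
        fun_ty = type_pure(env, store_ty, e[1])
        if fun_ty[0] != "Fun" or fun_ty[1] != type_pure(env, store_ty, e[2]):
            raise TypeError("ill-typed application")
        return fun_ty[2] | {"exec"}, fun_ty[3]
    raise TypeError(f"not an expression: {e!r}")
```

With the store type of Figure 2.2.10 encoded as `{"o1": ("Ref", "Bool")}`, checking `("deref", ("addr", "o1"))` yields the effect `{read}` and the type `Bool`, matching the expression judgment in that derivation.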

2.3. Type and Effect Soundness.

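We are now ready to prove type and effect soundness for the monadic language.

Theorem 2.2.1 (Type Soundness).

$$\vdash^{src}_{mon\,q} q : Q \;\to\; \Big( \langle q; \emptyset\rangle \Uparrow^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \;\vee\; \exists v.\ \langle q; \emptyset\rangle \Downarrow^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle v; \emptyset\rangle \;\wedge\; \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle v; \emptyset\rangle : \langle Q; \emptyset\rangle \Big)$$

Effect soundness is now defined only at the level of expressions.

Theorem 2.2.2 (Effect Soundness).

${}^{imd}_{mon}\langle e; s\rangle$:

$$\emptyset \vdash^{src}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P \;\wedge\; \langle e; \emptyset\rangle \overset{t}{\to}{}^{*}\,{}^{imd}_{mon\,\langle e;s\rangle} \langle e'; s'\rangle \;\to\; \vdash^{imd}_{mon\,t} t : \mathrm{St}^{\varepsilon}$$

We will again require two lemmas.

Lemma 2.2.1 (Evaluation Preserves Type and Effect).

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

$$\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle \;\wedge\; \langle q; {}^{\emptyset}s\rangle \to^{*}\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q'; {}^{\emptyset}s'\rangle \;\to\; \exists\, {}^{\emptyset}S' \geq {}^{\emptyset}S.\ \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q'; {}^{\emptyset}s'\rangle : \langle Q; {}^{\emptyset}S'\rangle$$

${}^{imd}_{mon}\langle e; s\rangle$:

$$\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle \;\wedge\; \langle e; s\rangle \overset{t}{\to}{}^{*}\,{}^{imd}_{mon\,\langle e;s\rangle} \langle e'; s'\rangle \;\to\; \exists\, \varepsilon' \subseteq \varepsilon,\ S' \geq S.\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e'; s'\rangle : \langle \mathrm{St}^{\varepsilon'}\,P; S'\rangle$$

Lemma 2.2.2 (Faulty Program Configurations Untypable).

If $\langle q; {}^{\emptyset}s\rangle$ is faulty, then there are no $Q$ and ${}^{\emptyset}S$ such that $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle$.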

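Proof: Theorem 2.2.1.

We have $\vdash^{src}_{mon\,q} q : Q$. Because the intermediate language is an extension of the source language, there is a similar derivation $\emptyset \vdash^{imd}_{mon\,q} q : Q$. It uses $\vdash^{imd}_{mon\,q}$-glob-const in place of $\vdash^{src}_{mon\,q}$-value and $\vdash^{src}_{mon\,v}$-glob-const, and passes around the nonstore type. Applying rules $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$\emptyset$ and then $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$, we have $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; \emptyset\rangle : \langle Q; \emptyset\rangle$. Clearly, if $\langle q; \emptyset\rangle \Uparrow$, the condition is satisfied. Assume $\langle q; \emptyset\rangle \Downarrow \langle q'; {}^{\emptyset}s'\rangle$, i.e., $\langle q; \emptyset\rangle \to^{*} \langle q'; {}^{\emptyset}s'\rangle$ with $\langle q'; {}^{\emptyset}s'\rangle$ irreducible. By Lemma 2.2.1, $\exists\, {}^{\emptyset}S' \geq \emptyset.\ \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q'; {}^{\emptyset}s'\rangle : \langle Q; {}^{\emptyset}S'\rangle$. By Lemma 2.2.2, $\langle q'; {}^{\emptyset}s'\rangle$ is not faulty. By Proposition 2.2.1, $\langle q'; {}^{\emptyset}s'\rangle$ is a value program configuration. By rule $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$\emptyset$, ${}^{\emptyset}S' = \emptyset$.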

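We will again require two more lemmas and a proposition. Subject reduction is now stated at the level of program configurations in addition to that of expression configurations.

Lemma 2.2.3 (Subject Reduction).

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

$$\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle \;\wedge\; \langle q; {}^{\emptyset}s\rangle \rightharpoonup \langle q'; {}^{\emptyset}s'\rangle \;\to\; \exists\, {}^{\emptyset}S' \geq {}^{\emptyset}S.\ \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q'; {}^{\emptyset}s'\rangle : \langle Q; {}^{\emptyset}S'\rangle$$

${}^{imd}_{mon}\langle e; s\rangle$:

$$\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle \;\wedge\; \langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle \;\to\; \exists\, \varepsilon' \subseteq \varepsilon,\ S' \geq S.\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e'; s'\rangle : \langle \mathrm{St}^{\varepsilon'}\,P; S'\rangle \;\wedge\; \vdash^{imd}_{mon\,t} t : \mathrm{St}^{\varepsilon}$$

Again, the proposition and lemma relate to contexts.

Proposition 2.2.2 (Removing Context Preserves Typability).

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

$$\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}[e_0]; s\rangle : \langle Q; S\rangle \;\to\; \exists\, P_0, \varepsilon_0.\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0; s\rangle : \langle \mathrm{St}^{\varepsilon_0}\,P_0; S\rangle$$

${}^{imd}_{mon}\langle e; s\rangle$:

$$\vdash^{imd}_{mon\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}[e_0]; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle \;\to\; \exists\, P_0, \varepsilon_0 \subseteq \varepsilon.\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0; s\rangle : \langle \mathrm{St}^{\varepsilon_0}\,P_0; S\rangle$$

Lemma 2.2.4 (Replacement).

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

$$\left(\begin{array}{l}
\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}[e_0]; s\rangle : \langle Q; S\rangle \\
\wedge\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0; s\rangle : \langle \mathrm{St}^{\varepsilon_0}\,P_0; S\rangle \\
\wedge\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle \mathrm{St}^{\varepsilon_0'}\,P_0; S'\rangle \\
\wedge\ S' \geq S\ \wedge\ \varepsilon_0' \subseteq \varepsilon_0
\end{array}\right) \;\to\; \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}[e_0']; s'\rangle : \langle Q; S'\rangle$$

${}^{imd}_{mon}\langle e; s\rangle$:

$$\left(\begin{array}{l}
\vdash^{imd}_{mon\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}[e_0]; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle \\
\wedge\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0; s\rangle : \langle \mathrm{St}^{\varepsilon_0}\,P_0; S\rangle \\
\wedge\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle \mathrm{St}^{\varepsilon_0'}\,P_0; S'\rangle \\
\wedge\ S' \geq S\ \wedge\ \varepsilon_0' \subseteq \varepsilon_0
\end{array}\right) \;\to\; \exists\, \varepsilon' \subseteq \varepsilon.\ \vdash^{imd}_{mon\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}[e_0']; s'\rangle : \langle \mathrm{St}^{\varepsilon'}\,P; S'\rangle$$

${}^{imd}_{mon}e$:

$$\left(\begin{array}{l}
\emptyset; S \vdash^{imd}_{mon\,e} \overset{\to*}{[e]}{}_{e}[e_0] : \mathrm{St}^{\varepsilon}\,P \\
\wedge\ \emptyset; S \vdash^{imd}_{mon\,e} e_0 : \mathrm{St}^{\varepsilon_0}\,P_0 \\
\wedge\ \emptyset; S' \vdash^{imd}_{mon\,e} e_0' : \mathrm{St}^{\varepsilon_0'}\,P_0 \\
\wedge\ S' \geq S\ \wedge\ \varepsilon_0' \subseteq \varepsilon_0
\end{array}\right) \;\to\; \exists\, \varepsilon' \subseteq \varepsilon.\ \emptyset; S' \vdash^{imd}_{mon\,e} \overset{\to*}{[e]}{}_{e}[e_0'] : \mathrm{St}^{\varepsilon'}\,P$$

Proof: Theorem 2.2.2.

${}^{imd}_{mon}\langle e; s\rangle$: Similar to the proof of Theorem 2.1.2.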

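Proof: Lemma 2.2.1.

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$: We reduce the first statement to the second by induction on the program configuration reduction derivation.

$\to^{*}\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-cntxt: Use Proposition 2.2.2 and the corresponding statement on expression configurations, and then Lemma 2.2.4.

$\to^{*}\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-reflex: ${}^{\emptyset}s = {}^{\emptyset}s'$ and $q = q'$, so let ${}^{\emptyset}S' = {}^{\emptyset}S$.

$\to^{*}\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-step: Use Lemma 2.2.3.

$\to^{*}\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-trans: We have $\langle q; {}^{\emptyset}s\rangle \to^{*} \langle q''; {}^{\emptyset}s''\rangle$ and $\langle q''; {}^{\emptyset}s''\rangle \to^{*} \langle q'; {}^{\emptyset}s'\rangle$. By induction, we have first that $\exists\, {}^{\emptyset}S'' \geq {}^{\emptyset}S.\ \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q''; {}^{\emptyset}s''\rangle : \langle Q; {}^{\emptyset}S''\rangle$, and then that $\exists\, {}^{\emptyset}S' \geq {}^{\emptyset}S''.\ \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q'; {}^{\emptyset}s'\rangle : \langle Q; {}^{\emptyset}S'\rangle$.

${}^{imd}_{mon}\langle e; s\rangle$: Similar to the proof of Lemma 2.1.1.

Proof: Lemma 2.2.2.

We observe that typing any faulty program configuration would require typing an immediately faulty program configuration or a faulty expression configuration in an empty environment, and show that if $\langle q; {}^{\emptyset}s\rangle$ is immediately faulty, then there are no ${}^{\emptyset}S$ and $Q$ such that $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle$. The proof for expression configurations is similar to that of the object language (Lemma 2.1.2).

${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

$\langle \mathrm{run}\ e; s\rangle$, $\langle \mathrm{running}\ e; \emptyset\rangle$, $\langle v; s\rangle$: $\vdash^{imd}_{mon\,q}$-glob-const, $\vdash^{imd}_{mon\,q}$-addr-dead, and $\vdash^{imd}_{mon\,q}$-run require ${}^{\emptyset}s = \emptyset$. $\vdash^{imd}_{mon\,q}$-running requires ${}^{\emptyset}s = s$.

We again introduce a substitution lemma in preparation for the proof of subject reduction.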

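Lemma 2.2.5 (Value Substitution).

${}^{imd}_{mon}p$: $\emptyset\{x' \mapsto P'\}; S \vdash^{imd}_{mon\,p} p : P \;\wedge\; S \vdash^{imd}_{mon\,v} v' : P' \;\to\; \emptyset; S \vdash^{imd}_{mon\,p} p[x' := v'] : P$

${}^{imd}_{mon}b$: $\emptyset\{x' \mapsto P'\}; S \vdash^{imd}_{mon\,b} b : B \;\wedge\; S \vdash^{imd}_{mon\,v} v' : P' \;\to\; \emptyset; S \vdash^{imd}_{mon\,b} b[x' := v'] : B$

${}^{imd}_{mon}e$: $\emptyset\{x' \mapsto P'\}; S \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P \;\wedge\; S \vdash^{imd}_{mon\,v} v' : P' \;\to\; \emptyset; S \vdash^{imd}_{mon\,e} e[x' := v'] : \mathrm{St}^{\varepsilon}\,P$

Proposition 2.2.3 (Store Weakening).

${}^{imd}_{mon}v$: $S \vdash^{imd}_{mon\,v} v : P \;\wedge\; S \leq S' \;\to\; S' \vdash^{imd}_{mon\,v} v : P$

${}^{imd}_{mon}d$: $S \vdash^{imd}_{mon\,d} d : B \;\wedge\; S \leq S' \;\to\; S' \vdash^{imd}_{mon\,d} d : B$

${}^{imd}_{mon}p$: $\Gamma; S \vdash^{imd}_{mon\,p} p : P \;\wedge\; S \leq S' \;\to\; \Gamma; S' \vdash^{imd}_{mon\,p} p : P$

${}^{imd}_{mon}b$: $\Gamma; S \vdash^{imd}_{mon\,b} b : B \;\wedge\; S \leq S' \;\to\; \Gamma; S' \vdash^{imd}_{mon\,b} b : B$

${}^{imd}_{mon}e$: $\Gamma; S \vdash^{imd}_{mon\,e} e : E \;\wedge\; S \leq S' \;\to\; \Gamma; S' \vdash^{imd}_{mon\,e} e : E$

As in the object language, if a storable is typed as a prestorable, it can also be typed as a storable.

Proposition 2.2.4 (Storification).

$$\emptyset; S \vdash^{imd}_{mon\,b} d : B \;\to\; S \vdash^{imd}_{mon\,d} d : B$$

We require a simple proposition asserting that any expression typable without a store is typable with an empty store.

Proposition 2.2.5 (No Store / Empty Store). In each implication, the $\emptyset$ in store position on the left is the nonstore type and the one on the right is the empty store type.

${}^{imd}_{mon}v$: $\emptyset \vdash^{imd}_{mon\,v} v : P \;\to\; \emptyset \vdash^{imd}_{mon\,v} v : P$

${}^{imd}_{mon}p$: $\Gamma; \emptyset \vdash^{imd}_{mon\,p} p : P \;\to\; \Gamma; \emptyset \vdash^{imd}_{mon\,p} p : P$

${}^{imd}_{mon}b$: $\Gamma; \emptyset \vdash^{imd}_{mon\,b} b : B \;\to\; \Gamma; \emptyset \vdash^{imd}_{mon\,b} b : B$

${}^{imd}_{mon}e$: $\Gamma; \emptyset \vdash^{imd}_{mon\,e} e : E \;\to\; \Gamma; \emptyset \vdash^{imd}_{mon\,e} e : E$

Proof: Lemma 2.2.3.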

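${}^{imd}_{mon}\langle q; {}^{\emptyset}s\rangle$:

We have $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle q; {}^{\emptyset}s\rangle : \langle Q; {}^{\emptyset}S\rangle$ and $\langle q; {}^{\emptyset}s\rangle \rightharpoonup \langle q'; {}^{\emptyset}s'\rangle$. By $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$, $\vdash^{imd}_{mon\,{}^{\emptyset}s} {}^{\emptyset}s : {}^{\emptyset}S$ and ${}^{\emptyset}S \vdash^{imd}_{mon\,q} q : Q$. By case analysis on the reduction step:

$\rightharpoonup^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-run: $\langle \mathrm{run}\ e; \emptyset\rangle \rightharpoonup \langle \mathrm{running}\ e; \emptyset\rangle$:
By rule $\vdash^{imd}_{mon\,q}$-run, $\emptyset; \emptyset \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P$, for $Q = P := \emptyset$. By Proposition 2.2.5, $\emptyset; \emptyset \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P$ with the empty store type. Applying rule $\vdash^{imd}_{mon\,q}$-running, $\emptyset \vdash^{imd}_{mon\,q} \mathrm{running}\ e : Q$. We complete by applying rule $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$ for an empty store, and reapplying rule $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$.

$\rightharpoonup^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-running: $\langle \mathrm{running}\ \mathrm{return}\ v; s\rangle \rightharpoonup \langle v; \emptyset\rangle$:
By rule $\vdash^{imd}_{mon\,q}$-running, $\emptyset; S \vdash^{imd}_{mon\,e} \mathrm{return}\ v : \mathrm{St}^{\varepsilon}\,P$, for $Q = P := \emptyset$. By $\vdash^{imd}_{mon\,e}$-pure and $\vdash^{imd}_{mon\,p}$-value, $S \vdash^{imd}_{mon\,v} v : P$. For rule $\vdash^{imd}_{mon\,v}$-glob-const, we apply $\vdash^{imd}_{mon\,q}$-glob-const. For rule $\vdash^{imd}_{mon\,v}$-addr-live, we apply $\vdash^{imd}_{mon\,q}$-addr-dead. We complete by applying rule $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$\emptyset$ and reapplying rule $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$.

${}^{imd}_{mon}\langle e; s\rangle$:

We have $\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle$ and $\langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle$. By $\vdash^{imd}_{mon\,\langle e;s\rangle}$, $\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S$ and $\emptyset; S \vdash^{imd}_{mon\,e} e : \mathrm{St}^{\varepsilon}\,P$. By case analysis on the reduction step:

$\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-let: $\langle \mathrm{let}\ x = \mathrm{return}\ v\ \mathrm{in}\ e_1; s\rangle \overset{[\,]}{\rightharpoonup} \langle e_1[x := v]; s\rangle$:
We have by rule $\vdash^{imd}_{mon\,e}$-let that $\emptyset; S \vdash^{imd}_{mon\,e} \mathrm{return}\ v : \mathrm{St}^{\varepsilon_1}\,P_1$ and $\varepsilon = \varepsilon_1 \cup \varepsilon_2$. By $\vdash^{imd}_{mon\,e}$-pure, $\varepsilon_1 = \emptyset$ and $\emptyset; S \vdash^{imd}_{mon\,p} v : P_1$. By $\vdash^{imd}_{mon\,p}$-value, $S \vdash^{imd}_{mon\,v} v : P_1$. Also by $\vdash^{imd}_{mon\,e}$-let, $\emptyset\{x \mapsto P_1\}; S \vdash^{imd}_{mon\,e} e_1 : \mathrm{St}^{\varepsilon_2}\,P$. By Lemma 2.2.5 we have that $\emptyset; S \vdash^{imd}_{mon\,e} e_1[x := v] : \mathrm{St}^{\varepsilon_2}\,P$, where $\varepsilon_2 = \varepsilon$ because $\varepsilon_1 = \emptyset$. We leave the derivation of $\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S$ in place and complete by reapplying rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-alloc: $\langle d; s\rangle \overset{[(\mathrm{alloc}\ @\ o)]}{\rightharpoonup} \langle \mathrm{return}\ o; s\{o \mapsto d\}\rangle$:
Let $s' = s\{o \mapsto d\}$. By $\vdash^{imd}_{mon\,e}$-alloc, $P = B$, $\varepsilon = \{\mathrm{alloc}\}$, and $\emptyset; S \vdash^{imd}_{mon\,b} d : B$. Letting $S' = S\{o \mapsto B\}$ and applying $\vdash^{imd}_{mon\,v}$-addr-live, $\vdash^{imd}_{mon\,p}$-value, and $\vdash^{imd}_{mon\,e}$-pure, $\emptyset; S' \vdash^{imd}_{mon\,e} \mathrm{return}\ o : \mathrm{St}^{\emptyset}\,B$. By Proposition 2.2.4, $S \vdash^{imd}_{mon\,d} d : B$. We have by rule $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$ a sequence of storable derivations. Weakening these and the new derivation to $S'$ using Proposition 2.2.3 and reapplying $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$ yields $\vdash^{imd}_{mon\,{}^{\emptyset}s} s' : S'$. We complete with an application of rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-deref: $\langle \mathrm{deref}\ o; s\rangle \overset{[(\mathrm{read}\ @\ o)]}{\rightharpoonup} \langle \mathrm{return}\ v; s\rangle$:
where $s(o) = \langle \mathrm{ref}\ v\rangle$. By $\vdash^{imd}_{mon\,e}$-deref, $\varepsilon = \{\mathrm{read}\}$, and $\emptyset; S \vdash^{imd}_{mon\,p} o : \mathrm{Ref}\ P$. By $\vdash^{imd}_{mon\,p}$-value, $S \vdash^{imd}_{mon\,v} o : \mathrm{Ref}\ P$. By $\vdash^{imd}_{mon\,v}$-addr-live, $S(o) = \mathrm{Ref}\ P$. By rule $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$, $S \vdash^{imd}_{mon\,d} \langle \mathrm{ref}\ v\rangle : \mathrm{Ref}\ P$. By $\vdash^{imd}_{mon\,d}$-ref, $S \vdash^{imd}_{mon\,v} v : P$. Applying $\vdash^{imd}_{mon\,p}$-value and $\vdash^{imd}_{mon\,e}$-pure, $\emptyset; S \vdash^{imd}_{mon\,e} \mathrm{return}\ v : \mathrm{St}^{\emptyset}\,P$. We leave the derivation of $\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S$ in place and complete with an application of rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-set: $\langle \mathrm{set}\ o\ \mathrm{to}\ v; s\rangle \overset{[(\mathrm{write}\ @\ o)]}{\rightharpoonup} \langle \mathrm{return}\ \mathrm{unit}; s\{o \mapsto \langle \mathrm{ref}\ v\rangle\}\rangle$:
Let $s' = s\{o \mapsto \langle \mathrm{ref}\ v\rangle\}$. By $\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-set, $s(o) = \langle \mathrm{ref}\ v_0\rangle$. By $\vdash^{imd}_{mon\,e}$-set, $P = \mathrm{Unit}$ and $\varepsilon = \{\mathrm{write}\}$. Applying $\vdash^{imd}_{mon\,v}$-glob-const, $\vdash^{imd}_{mon\,p}$-value, and $\vdash^{imd}_{mon\,e}$-pure we obtain $\emptyset; S \vdash^{imd}_{mon\,e} \mathrm{return}\ \mathrm{unit} : \mathrm{St}^{\emptyset}\,\mathrm{Unit}$. We also have by rule $\vdash^{imd}_{mon\,e}$-set that $\emptyset; S \vdash^{imd}_{mon\,p} o : \mathrm{Ref}\ P_1$ and $\emptyset; S \vdash^{imd}_{mon\,p} v : P_1$. By $\vdash^{imd}_{mon\,p}$-value, $S \vdash^{imd}_{mon\,v} o : \mathrm{Ref}\ P_1$ and $S \vdash^{imd}_{mon\,v} v : P_1$. By the former and $\vdash^{imd}_{mon\,v}$-addr-live, $S(o) = \mathrm{Ref}\ P_1$. Applying $\vdash^{imd}_{mon\,d}$-ref using the latter, $S \vdash^{imd}_{mon\,d} \langle \mathrm{ref}\ v\rangle : \mathrm{Ref}\ P_1$. Applying $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$, $\vdash^{imd}_{mon\,{}^{\emptyset}s} s' : S$. We complete with an application of rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-app-$\lambda$: $\langle o\ v; s\rangle \overset{[(\mathrm{exec}\ @\ o)]}{\rightharpoonup} \langle e_1[x := v]; s\rangle$:
where $s(o) = \langle \lambda x.e_1\rangle$. By $\vdash^{imd}_{mon\,{}^{\emptyset}s}$-$s$, $\vdash^{imd}_{mon\,d}$-$\lambda$, and $\vdash^{imd}_{mon\,b}$-$\lambda$, there are $P_1'$, $P'$, and $\varepsilon_3'$ such that $S(o) = P_1' \Rightarrow \mathrm{St}^{\varepsilon_3'}\,P'$. Also by $\vdash^{imd}_{mon\,b}$-$\lambda$, $\emptyset\{x \mapsto P_1'\}; S \vdash^{imd}_{mon\,e} e_1 : \mathrm{St}^{\varepsilon_3'}\,P'$. By $\vdash^{imd}_{mon\,e}$-app and $\vdash^{imd}_{mon\,p}$-value, $S \vdash^{imd}_{mon\,v} v : P_1$ and $S \vdash^{imd}_{mon\,v} o : P_1 \Rightarrow \mathrm{St}^{\varepsilon_3}\,P$, and $\varepsilon = \varepsilon_3 \cup \{\mathrm{exec}\}$. By $\vdash^{imd}_{mon\,v}$-addr-live, $P_1' = P_1$, $P' = P$, and $\varepsilon_3' = \varepsilon_3$. By Lemma 2.2.5, $\emptyset; S \vdash^{imd}_{mon\,e} e_1[x := v] : \mathrm{St}^{\varepsilon_3}\,P$. We complete with an application of rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$.

Proof: Lemma 2.2.4.

${}^{imd}_{mon}\langle q; s\rangle$:

Write $\overset{\to*}{[e]}{}_{q} = \mathrm{running}\ \overset{\to*}{[e]}{}_{e}$. From $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}[e_0]; s\rangle : \langle Q; S\rangle$ we have $S \vdash^{imd}_{mon\,q} \overset{\to*}{[e]}{}_{q}[e_0] : Q$ and $\vdash^{imd}_{mon\,{}^{\emptyset}s} s : S$. The former gives $\emptyset; S \vdash^{imd}_{mon\,e} \overset{\to*}{[e]}{}_{e}[e_0] : \mathrm{St}^{\varepsilon}\,P$ via $\vdash^{imd}_{mon\,q}$-running, with $Q = P := \emptyset$. Applying rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$ gives $\vdash^{imd}_{mon\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}[e_0]; s\rangle : \langle \mathrm{St}^{\varepsilon}\,P; S\rangle$. Also using $\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0; s\rangle : \langle \mathrm{St}^{\varepsilon_0}\,P_0; S\rangle$ and $\vdash^{imd}_{mon\,\langle e;s\rangle} \langle e_0'; s'\rangle : \langle \mathrm{St}^{\varepsilon_0'}\,P_0; S'\rangle$ with $\varepsilon_0' \subseteq \varepsilon_0$ and $S' \geq S$, the result on expression configurations provides $\vdash^{imd}_{mon\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}[e_0']; s'\rangle : \langle \mathrm{St}^{\varepsilon'}\,P; S'\rangle$ for some $\varepsilon' \subseteq \varepsilon$. From rule $\vdash^{imd}_{mon\,\langle e;s\rangle}$ we have $\emptyset; S' \vdash^{imd}_{mon\,e} \overset{\to*}{[e]}{}_{e}[e_0'] : \mathrm{St}^{\varepsilon'}\,P$ and $\vdash^{imd}_{mon\,{}^{\emptyset}s} s' : S'$. Applying rule $\vdash^{imd}_{mon\,q}$-running to the former gives $S' \vdash^{imd}_{mon\,q} \overset{\to*}{[e]}{}_{q}[e_0'] : Q$. Applying rule $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$ yields $\vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}[e_0']; s'\rangle : \langle Q; S'\rangle$.

${}^{imd}_{mon}\langle e; s\rangle$, ${}^{imd}_{mon}e$: Similar to the proof of Lemma 2.1.4.

Proof: Lemma 2.2.5. Similar to the proof of Lemma 2.1.5.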
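The reduction cases analysed in the proof of Lemma 2.2.3 (let, alloc, deref, set, app-λ) can be exercised with a small interpreter that records the trace of each step. This is a sketch under stated assumptions, not the author's implementation: terms are encoded as Python tuples, offsets are generated as `o1`, `o2`, ... from the store size, faulty configurations simply raise an exception, and the names `subst`, `step`, and `evaluate` are hypothetical.

```python
# Assumed encodings (illustrative only):
#   pures:  ("var", x), ("const", g), ("addr", o)
#   exprs:  ("return", p), ("let", x, e1, e2), ("ref", p), ("lam", x, e),
#           ("deref", p), ("set", p1, p2), ("app", p1, p2)
#   stores: dict from offsets to storables ("ref", v) or ("lam", x, e)


def subst(e, x, v):
    """Replace free occurrences of variable x in e by the closed value v."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag in ("const", "addr"):
        return e
    if tag == "let":
        _, y, e1, e2 = e
        return ("let", y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    if tag == "lam":
        _, y, body = e
        return e if y == x else ("lam", y, subst(body, x, v))
    # return, ref, deref, set, app: every argument is a pure
    return (tag,) + tuple(subst(p, x, v) for p in e[1:])


def step(e, s):
    """One reduction step: returns (e', s', trace of that step)."""
    tag = e[0]
    if tag == "let":
        _, x, e1, e2 = e
        if e1[0] == "return":                        # let
            return subst(e2, x, e1[1]), s, []
        e1_next, s_next, t = step(e1, s)             # reduce inside the context
        return ("let", x, e1_next, e2), s_next, t
    if tag in ("ref", "lam"):                        # alloc
        o = f"o{len(s) + 1}"                         # assumed fresh-offset scheme
        return ("return", ("addr", o)), {**s, o: e}, [("alloc", o)]
    if tag == "deref":                               # deref
        o = e[1][1]
        return ("return", s[o][1]), s, [("read", o)]
    if tag == "set":                                 # set
        o = e[1][1]
        return ("return", ("const", "unit")), {**s, o: ("ref", e[2])}, [("write", o)]
    if tag == "app":                                 # app-lambda
        o = e[1][1]
        _, x, body = s[o]
        return subst(body, x, e[2]), s, [("exec", o)]
    raise ValueError(f"no reduction applies to {e!r}")


def evaluate(e, s=None):
    """Iterate step until a return form is reached, concatenating traces."""
    s = dict(s or {})
    trace = []
    while e[0] != "return":
        e, s, t = step(e, s)
        trace += t
    return e, s, trace
```

For example, evaluating `let x = ⟨ref true⟩ in let y = set x to false in deref x` from the empty store produces a trace whose atomic kinds are alloc, write, and read, which are exactly the kinds that the typing rules of Figure 2.2.9 would collect in the annotation of that expression.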

3. Translation

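The dynamic translation for the intermediate languages is presented in Figure 2.3.1. The translations of configurations and traces as well as stores and offsets are new but straightforward. Object language stores are mapped to actual monadic stores. Object language programs are mapped to running forms in the monadic language. The translation of expressions is similar to that of the source language (Figure 1.3.24).

Configurations and Traces

$$
\begin{array}{rcl}
[\![\langle q; s\rangle]\!]^{imd}_{obj\,\langle q;s\rangle N} & = & \langle [\![q]\!]^{imd}_{obj\,qN}; [\![s]\!]^{imd}_{obj\,sN}\rangle \\
[\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} & = & \langle [\![e]\!]^{imd}_{obj\,eN}; [\![s]\!]^{imd}_{obj\,sN}\rangle \\
[\![\langle v; s\rangle]\!]^{imd}_{obj\,\langle v;s\rangle N} & = & \langle [\![v]\!]^{imd}_{obj\,vN}; [\![s]\!]^{imd}_{obj\,sN}\rangle \\
[\![t]\!]^{imd}_{obj\,tN} & = & t
\end{array}
$$

Programs and Expressions

$$
\begin{array}{rcl}
[\![q]\!]^{imd}_{obj\,qN} & = & \mathrm{running}\ [\![q]\!]^{imd}_{obj\,eN} \\
[\![p]\!]^{imd}_{obj\,eN} & = & \mathrm{return}\ [\![p]\!]^{imd}_{obj\,pN} \\
[\![\mathrm{let}\ x = e_1\ \mathrm{in}\ e_2]\!]^{imd}_{obj\,eN} & = & \mathrm{let}\ x = [\![e_1]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ [\![e_2]\!]^{imd}_{obj\,eN} \\
[\![\langle \mathrm{ref}\ e\rangle]\!]^{imd}_{obj\,eN} & = & \mathrm{let}\ x = [\![e]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \langle \mathrm{ref}\ x\rangle \\
[\![\mathrm{deref}\ e]\!]^{imd}_{obj\,eN} & = & \mathrm{let}\ x = [\![e]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \mathrm{deref}\ x \\
[\![\mathrm{set}\ e_1\ \mathrm{to}\ e_2]\!]^{imd}_{obj\,eN} & = & \mathrm{let}\ x_1 = [\![e_1]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![e_2]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \mathrm{set}\ x_1\ \mathrm{to}\ x_2 \\
[\![\langle \lambda x.e\rangle]\!]^{imd}_{obj\,eN} & = & \langle \lambda x.[\![e]\!]^{imd}_{obj\,eN}\rangle \\
[\![e_1\ e_2]\!]^{imd}_{obj\,eN} & = & \mathrm{let}\ x_1 = [\![e_1]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![e_2]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ x_1\ x_2
\end{array}
$$

Stores and Storables

$$
\begin{array}{rcl}
[\![\emptyset]\!]^{imd}_{obj\,sN} & = & \emptyset \\
[\![s\{o \mapsto d\}]\!]^{imd}_{obj\,sN} & = & [\![s]\!]^{imd}_{obj\,sN}\{o \mapsto [\![d]\!]^{imd}_{obj\,dN}\} \\
[\![\langle \mathrm{ref}\ v\rangle]\!]^{imd}_{obj\,dN} & = & \langle \mathrm{ref}\ [\![v]\!]^{imd}_{obj\,vN}\rangle \\
[\![\langle \lambda x.e\rangle]\!]^{imd}_{obj\,dN} & = & \langle \lambda x.[\![e]\!]^{imd}_{obj\,eN}\rangle
\end{array}
$$

Pures and Values

$$
\begin{array}{rcl}
[\![x]\!]^{imd}_{obj\,pN} & = & x \\
[\![v]\!]^{imd}_{obj\,pN} & = & [\![v]\!]^{imd}_{obj\,vN} \\
[\![g]\!]^{imd}_{obj\,vN} & = & g \\
[\![o]\!]^{imd}_{obj\,vN} & = & o
\end{array}
$$

Figure 2.3.1. Translating Object Intermediate Language to Monadic (Dynamic)

Consider the sample object language derivation in Figure 2.1.10. Applying the translation of Figure 2.3.1 to the expression configuration

$$\langle \mathrm{deref}\ o_1;\ \emptyset\,\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{true}\rangle\}\,\{o_2 \mapsto \langle \lambda x_1.\mathrm{set}\ o_1\ \mathrm{to}\ x_1\rangle\}\rangle$$

yields

$$\langle \mathrm{let}\ x_2 = \mathrm{return}\ o_1\ \mathrm{in}\ \mathrm{deref}\ x_2;\ \emptyset\,\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{true}\rangle\}\,\{o_2 \mapsto \langle \lambda x_1.\mathrm{let}\ x_3 = \mathrm{return}\ o_1\ \mathrm{in}\ \mathrm{let}\ x_4 = \mathrm{return}\ x_1\ \mathrm{in}\ \mathrm{set}\ x_3\ \mathrm{to}\ x_4\rangle\}\rangle$$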
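The dynamic translation of Figure 2.3.1 is a structural recursion, which the following sketch spells out for expressions, storables, and expression configurations. The tuple encoding of terms, the function names, and the fresh-name supply are assumptions made for illustration; in particular, the counter-based `make_fresh` only avoids clashes if it starts beyond any `x<n>` name already used in the term.

```python
import itertools

# Assumed encodings (illustrative only):
#   object pures:  ("var", x), ("const", g), ("addr", o)
#   object exprs:  pures, ("let", x, e1, e2), ("ref", e), ("deref", e),
#                  ("set", e1, e2), ("lam", x, e), ("app", e1, e2)
PURE_TAGS = {"var", "const", "addr"}


def make_fresh(start=1):
    """Return a supplier of names x<start>, x<start+1>, ... (assumed scheme)."""
    counter = itertools.count(start)
    return lambda: f"x{next(counter)}"


def translate_expr(e, fresh):
    """Translate an object expression into a monadic expression."""
    tag = e[0]
    if tag in PURE_TAGS:                       # [p] = return [p]
        return ("return", e)
    if tag == "let":
        _, x, e1, e2 = e
        return ("let", x, translate_expr(e1, fresh), translate_expr(e2, fresh))
    if tag in ("ref", "deref"):                # name the argument, then act on it
        x = fresh()
        return ("let", x, translate_expr(e[1], fresh), (tag, ("var", x)))
    if tag in ("set", "app"):                  # name both arguments, left to right
        x1, x2 = fresh(), fresh()
        return ("let", x1, translate_expr(e[1], fresh),
                ("let", x2, translate_expr(e[2], fresh),
                 (tag, ("var", x1), ("var", x2))))
    if tag == "lam":
        return ("lam", e[1], translate_expr(e[2], fresh))
    raise ValueError(f"not an object expression: {e!r}")


def translate_storable(d, fresh):
    """Stored references keep their value; stored lambdas translate their body."""
    if d[0] == "ref":
        return d
    if d[0] == "lam":
        return ("lam", d[1], translate_expr(d[2], fresh))
    raise ValueError(f"not a storable: {d!r}")


def translate_config(e, store, fresh):
    """Translate an expression configuration componentwise."""
    return (translate_expr(e, fresh),
            {o: translate_storable(d, fresh) for o, d in store.items()})
```

Starting the name supply at `x2`, this reproduces the worked example above: `deref o1` becomes `let x2 = return o1 in deref x2`, and the stored abstraction at `o2` has its body rewritten with `x3` and `x4`.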

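Environments

$$
\begin{array}{rcl}
[\![\emptyset]\!]^{imd}_{obj\,\Gamma N} & = & \emptyset \\
[\![\Gamma\{x \mapsto P\}]\!]^{imd}_{obj\,\Gamma N} & = & [\![\Gamma]\!]^{imd}_{obj\,\Gamma N}\{x \mapsto [\![P]\!]^{imd}_{obj\,PN}\}
\end{array}
$$

Configuration Types

$$
\begin{array}{rcl}
[\![\langle Q; S\rangle]\!]^{imd}_{obj\,\langle Q;S\rangle N} & = & \langle [\![Q]\!]^{imd}_{obj\,QN}; [\![S]\!]^{imd}_{obj\,SN}\rangle \\
[\![\langle E; S\rangle]\!]^{imd}_{obj\,\langle E;S\rangle N} & = & \langle [\![E]\!]^{imd}_{obj\,EN}; [\![S]\!]^{imd}_{obj\,SN}\rangle \\
[\![\langle P; S\rangle]\!]^{imd}_{obj\,\langle P;S\rangle N} & = & \langle [\![P]\!]^{imd}_{obj\,PN}; [\![S]\!]^{imd}_{obj\,SN}\rangle
\end{array}
$$

Expression and Trace Types

$$
\begin{array}{rcl}
[\![P\ !\ T]\!]^{imd}_{obj\,EN} & = & [\![T]\!]^{imd}_{obj\,TN}\ [\![P]\!]^{imd}_{obj\,PN} \\
[\![T]\!]^{imd}_{obj\,TN} & = & \mathrm{St}^{T}
\end{array}
$$

Program and Pure Types

$$
\begin{array}{rcl}
[\![G\ !\ T]\!]^{imd}_{obj\,QN} & = & G \\
[\![B\ !\ T]\!]^{imd}_{obj\,QN} & = & \emptyset \\
[\![G]\!]^{imd}_{obj\,PN} & = & G \\
[\![B]\!]^{imd}_{obj\,PN} & = & [\![B]\!]^{imd}_{obj\,BN}
\end{array}
$$

Storable Types

$$
\begin{array}{rcl}
[\![\mathrm{Ref}\ P]\!]^{imd}_{obj\,BN} & = & \mathrm{Ref}\ [\![P]\!]^{imd}_{obj\,PN} \\
[\![P_1 \overset{T}{\Rightarrow} P_2]\!]^{imd}_{obj\,BN} & = & [\![P_1]\!]^{imd}_{obj\,PN} \Rightarrow [\![T]\!]^{imd}_{obj\,TN}\ [\![P_2]\!]^{imd}_{obj\,PN}
\end{array}
$$

Store Types

$$
\begin{array}{rcl}
[\![\emptyset]\!]^{imd}_{obj\,SN} & = & \emptyset \\
[\![S\{o \mapsto B\}]\!]^{imd}_{obj\,SN} & = & [\![S]\!]^{imd}_{obj\,SN}\{o \mapsto [\![B]\!]^{imd}_{obj\,BN}\}
\end{array}
$$

Figure 2.3.2. Translating Object Intermediate Language to Monadic (Static)

The static translation for the intermediate languages is presented in Figure 2.3.2. The translations of configuration types and store types are new but straightforward. The remainder of the translation is similar to the source languages (Figure 1.3.25). Applying the translation of Figure 2.3.2 to the expression configuration type (again from Figure 2.1.10)

$$\langle \mathrm{Bool}\ !\ \{\mathrm{read}\};\ \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\}\,\{o_2 \mapsto \mathrm{Bool} \overset{\{\mathrm{write}\}}{\Rightarrow} \mathrm{Unit}\}\rangle$$

yields

$$\langle \mathrm{St}^{\{\mathrm{read}\}}\,\mathrm{Bool};\ \emptyset\,\{o_1 \mapsto \mathrm{Ref}\ \mathrm{Bool}\}\,\{o_2 \mapsto \mathrm{Bool} \Rightarrow \mathrm{St}^{\{\mathrm{write}\}}\,\mathrm{Unit}\}\rangle$$

Our statement of Types Preservation is now extended to configurations.

Theorem 2.3.1 (Types Preservation).

$$
\begin{array}{rcl}
\vdash^{imd}_{obj\,\langle q;s\rangle} \langle q; s\rangle : \langle Q; S\rangle & \to & \vdash^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle} [\![\langle q; s\rangle]\!]^{imd}_{obj\,\langle q;s\rangle N} : [\![\langle Q; S\rangle]\!]^{imd}_{obj\,\langle Q;S\rangle N} \\
\vdash^{imd}_{obj\,\langle e;s\rangle} \langle e; s\rangle : \langle E; S\rangle & \to & \vdash^{imd}_{mon\,\langle e;s\rangle} [\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} : [\![\langle E; S\rangle]\!]^{imd}_{obj\,\langle E;S\rangle N} \\
\vdash^{imd}_{obj\,\langle v;s\rangle} \langle v; s\rangle : \langle P; S\rangle & \to & \vdash^{imd}_{mon\,\langle v;s\rangle} [\![\langle v; s\rangle]\!]^{imd}_{obj\,\langle v;s\rangle N} : [\![\langle P; S\rangle]\!]^{imd}_{obj\,\langle P;S\rangle N}
\end{array}
$$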

The proof brings up no interesting issues beyond those mentioned with respect to the source languages. We can now claim more than that our translation preserves types; we can claim that it preserves the dynamic semantics as well.

Theorem 2.3.2 (Semantics Preservation (i)).

$$
\begin{array}{rcl}
\langle q; s\rangle \to^{*}\,{}^{imd}_{obj\,\langle q;s\rangle} \langle q'; s'\rangle & \to & [\![\langle q; s\rangle]\!]^{imd}_{obj\,\langle q;s\rangle N} \;=\,{}^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}\; [\![\langle q'; s'\rangle]\!]^{imd}_{obj\,\langle q;s\rangle N} \\
\langle e; s\rangle \overset{t}{\to}{}^{*}\,{}^{imd}_{obj\,\langle e;s\rangle} \langle e'; s'\rangle & \to & [\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} \;\overset{t}{=}\,{}^{imd}_{mon\,\langle e;s\rangle}\; [\![\langle e'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}
\end{array}
$$

Although $\langle v; s\rangle$ is a fully reduced program configuration in the object language, $[\![\langle v; s\rangle]\!]^{imd}_{obj\,\langle q;s\rangle N}$ is not fully reduced in the monadic language. A single application of $\rightharpoonup^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-running remains. As described earlier, the notion of equality on monadic language configurations intended here involves not only analogues of the reduction rules presented in Figure 2.2.5 but also $=^{imd}_{mon\,\langle e;s\rangle}$-revrs and $=^{imd}_{mon\,\langle q;{}^{\emptyset}s\rangle}$-revrs.

We prove the theorem using two lemmas. The first shows that translations of object language evaluation contexts pure-reduce to monadic language evaluation contexts. The second shows a correspondence between reductions in the two languages such that translations of object language redexes reduce in the monadic language to corresponding monadic language redexes, and that translation of the residuals yields the corresponding residuals.

To state the first lemma more precisely, we generalize expressions and programs of both languages to include evaluation contexts. We then generalize the translations $[\![\cdot]\!]^{imd}_{obj\,qN}$ and $[\![\cdot]\!]^{imd}_{obj\,eN}$ on programs and expressions to contexts by mapping $[\![[e]]\!]^{imd}_{obj\,qN}$ to $\mathrm{running}\ [e]$ and $[\![[e]]\!]^{imd}_{obj\,eN}$ to $[e]$. Other atomic expression evaluation contexts are translated as expressions, replacing holes with holes. Translations of nonatomic evaluation contexts are derived by composition of atomic evaluation contexts. We use the same evaluation contexts and reduction rules on the modified languages, and allow context reduction derivations to be filled with an expression $e_1$ as in $\big(\langle \overset{\to*}{[e]}{}_{e}; s\rangle \overset{t}{\rightharpoonup}{}^{*}\,{}^{imd}_{obj\,\langle e;s\rangle} \langle \overset{\to*}{[e]}{}_{e}'; s'\rangle\big)[e_1]$.

Lemma 2.3.1 (Evaluation Contexts Preserved by Translation up to Evaluation).

${}^{imd}_{obj}[e]_{q}$: For any object language evaluation context $\overset{\to*}{[e]}{}_{q}$ and monadic store $s$, for some monadic language evaluation context $\overset{\to*}{[e]}{}_{q}'$,

$$\langle [\![\overset{\to*}{[e]}{}_{q}]\!]^{imd}_{obj\,[e]qN}; s\rangle \to^{*}\,{}^{imd}_{mon\,\langle \overset{\to*}{[e]}{}_{q};{}^{\emptyset}s\rangle} \langle \overset{\to*}{[e]}{}_{q}'; s\rangle.$$

${}^{imd}_{obj}[e]_{e}$: For any object language evaluation context $\overset{\to*}{[e]}{}_{e}$ and monadic store $s$, for some monadic language evaluation context $\overset{\to*}{[e]}{}_{e}'$,

$$\langle [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,[e]eN}; s\rangle \overset{[\,]}{\to}{}^{*}\,{}^{imd}_{mon\,\langle \overset{\to*}{[e]}{}_{e};s\rangle} \langle \overset{\to*}{[e]}{}_{e}'; s\rangle.$$

We also state the second lemma more precisely.

Lemma 2.3.2 (Reduction Translates as Evaluation).

From $\langle e; s\rangle \overset{t}{\rightharpoonup}{}^{imd}_{obj\,\langle e;s\rangle} \langle e'; s'\rangle$, it follows that $[\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} \overset{t}{\to}{}^{*}\,{}^{imd}_{mon\,\langle e;s\rangle} [\![\langle e'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}$.

Proof: Theorem 2.3.2. The proof proceeds by induction on the object language evaluation derivation.

${}^{imd}_{obj}\langle q; s\rangle$:

$\to^{*}\,{}^{imd}_{obj\,\langle q;s\rangle}$-cntxt: The proof is similar to that for $\to^{*}\,{}^{imd}_{obj\,\langle e;s\rangle}$-cntxt below and also uses a derivation from Figure 2.3.3.

${}^{imd}_{obj}\langle e; s\rangle$:

$\to^{*}\,{}^{imd}_{obj\,\langle e;s\rangle}$-cntxt: We have $e = \overset{\to*}{[e]}{}_{e}[e_1]$ and $e' = \overset{\to*}{[e]}{}_{e}[e_1']$, with $\langle e_1; s\rangle \overset{t}{\to}{}^{*}\,{}^{imd}_{obj\,\langle e;s\rangle} \langle e_1'; s'\rangle$. By induction, we obtain $[\![\langle e_1; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} \overset{t}{=}\,{}^{imd}_{mon\,\langle e;s\rangle} [\![\langle e_1'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}$. We call this evaluation derivation $d$. By Lemma 2.3.1, for some monadic language evaluation context $\overset{\to*}{[e]}{}_{e}'$, we have that $\langle [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,[e]eN}; [\![s']\!]^{imd}_{obj\,sN}\rangle \overset{[\,]}{\to}{}^{*} \langle \overset{\to*}{[e]}{}_{e}'; [\![s']\!]^{imd}_{obj\,sN}\rangle$. We call this evaluation derivation on evaluation contexts $\overset{\to*}{[e]}{}_{d}$. Our monadic language equality derivation is then built as in Figure 2.3.3.

Figure 2.3.3. Monadic Language Expression Configuration Equality Derivations for Proof of Theorem 2.3.2. For expression configurations, the derivation chains the following three equalities with $=^{imd}_{mon\,\langle e;s\rangle}$-trans; the derivation for program configurations has the same shape, using $\overset{\to*}{[e]}{}_{q}$, $\overset{\to*}{[e]}{}_{q}'$, and the $=^{imd}_{mon\,\langle q;s\rangle}$ rules.

$$
\begin{array}{ll}
[\![\langle \overset{\to*}{[e]}{}_{e}[e_1]; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \overset{\to*}{[e]}{}_{e}'[[\![e_1]\!]^{imd}_{obj\,eN}]; [\![s]\!]^{imd}_{obj\,sN}\rangle & \text{by } \overset{\to*}{[e]}{}_{d}\,[[\![e_1]\!]^{imd}_{obj\,eN}] \\
\langle \overset{\to*}{[e]}{}_{e}'[[\![e_1]\!]^{imd}_{obj\,eN}]; [\![s]\!]^{imd}_{obj\,sN}\rangle \overset{t}{=} \langle \overset{\to*}{[e]}{}_{e}'[[\![e_1']\!]^{imd}_{obj\,eN}]; [\![s']\!]^{imd}_{obj\,sN}\rangle & \text{by } d \text{ and } =^{imd}_{mon\,\langle e;s\rangle}\text{-cntxt} \\
\langle \overset{\to*}{[e]}{}_{e}'[[\![e_1']\!]^{imd}_{obj\,eN}]; [\![s']\!]^{imd}_{obj\,sN}\rangle = [\![\langle \overset{\to*}{[e]}{}_{e}[e_1']; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} & \text{by } \overset{\to*}{[e]}{}_{d}\,[[\![e_1']\!]^{imd}_{obj\,eN}] \text{ and } =^{imd}_{mon\,\langle e;s\rangle}\text{-revrs}
\end{array}
$$

$\to^{*}\,{}^{imd}_{obj\,\langle e;s\rangle}$-reflex: We have that $\langle e; s\rangle = \langle e'; s'\rangle$, so $[\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = [\![\langle e'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}$. $t = [\,]$, so $[\![t]\!]^{imd}_{obj\,tN} = [\,]$ and we can apply $=^{imd}_{mon\,\langle e;s\rangle}$-reflex.

$\to^{*}\,{}^{imd}_{obj\,\langle e;s\rangle}$-step: Use Lemma 2.3.2.

$\to^{*}\,{}^{imd}_{obj\,\langle e;s\rangle}$-trans: We have $\langle e; s\rangle \overset{t'}{\to}{}^{*} \langle e'; s'\rangle$ and $\langle e'; s'\rangle \overset{t''}{\to}{}^{*} \langle e''; s''\rangle$ with $t = t' + t''$. By induction, we obtain $[\![\langle e; s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} \overset{t'}{=}\,{}^{imd}_{mon\,\langle e;s\rangle} [\![\langle e'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}$ and $[\![\langle e'; s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} \overset{t''}{=}\,{}^{imd}_{mon\,\langle e;s\rangle} [\![\langle e''; s''\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N}$. The result follows by rule $=^{imd}_{mon\,\langle e;s\rangle}$-trans.

Proof: Lemma 2.3.1.

${}^{imd}_{obj}[e]_{q}$: $[\![[e]_{q}]\!]^{imd}_{obj\,[e]qN}$ is $\mathrm{running}\ [\![[e]_{q}]\!]^{imd}_{obj\,[e]eN}$. $\mathrm{running}\ [e]$ is a monadic language atomic program-expression evaluation context, so this reduces to the expression case.

${}^{imd}_{obj}[e]_{e}$: We proceed by induction on object-language evaluation contexts. We show only particular cases; the others are similar.

$[e]$: $[e]$ is a monadic language evaluation context.

$\mathrm{let}\ x = \overset{\to*}{[e]}{}_{e}\ \mathrm{in}\ e$: By induction, $[\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}$ evaluates with empty trace to a monadic expression evaluation context so, using $\to^{*}\,{}^{imd}_{mon\,\langle e;s\rangle}$-cntxt, $\mathrm{let}\ x = [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ [\![e]\!]^{imd}_{obj\,eN}$ does as well.

$\overset{\to*}{[e]}{}_{e}\ e$: By induction, $[\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}$ evaluates with empty trace to a monadic expression evaluation context so, using $\to^{*}\,{}^{imd}_{mon\,\langle e;s\rangle}$-cntxt, $\mathrm{let}\ x_1 = [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![e]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ x_1\ x_2$ does as well.

$v\ \overset{\to*}{[e]}{}_{e}$: By $\rightharpoonup^{imd}_{mon\,\langle e;s\rangle}$-let,

$$\langle \mathrm{let}\ x_1 = \mathrm{return}\ v\ \mathrm{in}\ \mathrm{let}\ x_2 = [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ x_1\ x_2; s\rangle \overset{[\,]}{\rightharpoonup} \langle \mathrm{let}\ x_2 = [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ v\ x_2; s\rangle.$$

By induction, $[\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}$ evaluates with empty trace to a monadic expression evaluation context so, using $\to^{*}\,{}^{imd}_{mon\,\langle e;s\rangle}$-cntxt, $\mathrm{let}\ x_2 = [\![\overset{\to*}{[e]}{}_{e}]\!]^{imd}_{obj\,eN}\ \mathrm{in}\ v\ x_2$ does as well. We complete with applications of $\to^{*}\,{}^{imd}_{mon\,\langle e;s\rangle}$-step and $\to^{*}\,{}^{imd}_{mon\,\langle e;s\rangle}$-trans.

We require a proposition asserting that translation may be commuted with substitution.

Proposition 2.3.1 (Translation Respects Substitution of Values).

$$[\![e]\!]^{imd}_{obj\,eN}\,[x := [\![v]\!]^{imd}_{obj\,vN}] = [\![e[x := v]]\!]^{imd}_{obj\,eN}$$

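Proof: Lemma 2.3.2. We proceed by case analysis on object language reductions. In each case, we elide the details of how the derivations are filled out using $\to^{*\,imd}_{\langle e;s\rangle\,obj}$-step and $\to^{*\,imd}_{\langle e;s\rangle\,obj}$-trans.

$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-let: $\langle \text{let } x = v \text{ in } e;\ s\rangle \overset{[\,]}{\rightharpoonup}{}^{imd}_{\langle e;s\rangle\,obj} \langle e[x := v];\ s\rangle$:

$[\![\langle \text{let } x = v \text{ in } e;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \text{let } x = \text{return } v \text{ in } [\![e]\!]^{imd}_{obj\,eN};\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. In the monadic language we obtain $\langle [\![e]\!]^{imd}_{obj\,eN}[x := v];\ [\![s]\!]^{imd}_{obj\,sN}\rangle$ by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-let with an empty trace. Then by Proposition 2.3.1, $[\![e]\!]^{imd}_{obj\,eN}[x := v] = [\![e[x := v]]\!]^{imd}_{obj\,eN}$, so that configuration is the translation of the object language residual.

$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-alloc: $\langle d;\ s\rangle \overset{[(\text{alloc}\,@\,o)]}{\rightharpoonup}{}^{imd}_{\langle e;s\rangle\,obj} \langle o;\ s\{o \mapsto d\}\rangle$:

If $d = \langle \text{ref } v\rangle$, then $[\![\langle d;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \text{let } x = \text{return } v \text{ in } \langle \text{ref } x\rangle;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. By $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-let we obtain $\langle \langle \text{ref } v\rangle;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$ with empty trace and by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-alloc with a trace of $[(\text{alloc}\,@\,o)]$ we obtain $\langle \text{return } o;\ [\![s]\!]^{imd}_{obj\,sN}\{o \mapsto \langle \text{ref } v\rangle\}\rangle$, the translation of the object language residual.

If $d = \langle \lambda x.e\rangle$, then $[\![d]\!]^{imd}_{obj\,eN} = \langle \lambda x.[\![e]\!]^{imd}_{obj\,eN}\rangle$. By $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-alloc with a trace of $[(\text{alloc}\,@\,o)]$ we obtain $\langle \text{return } o;\ [\![s]\!]^{imd}_{obj\,sN}\{o \mapsto \langle \lambda x.[\![e]\!]^{imd}_{obj\,eN}\rangle\}\rangle$, the translation of the object language residual.

$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-deref: $\langle \text{deref } o;\ s\rangle \overset{[(\text{read}\,@\,o)]}{\rightharpoonup}{}^{imd}_{\langle e;s\rangle\,obj} \langle v;\ s\rangle$:

where $s(o) = \langle \text{ref } v\rangle$. The translation $[\![\langle \text{ref } v\rangle]\!]^{imd}_{obj\,dN}$ at $s(o)$ yields $\langle \text{ref } v\rangle$, and the translation $[\![\langle \text{deref } o;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \text{let } x = \text{return } o \text{ in deref } x;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. Thus, by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-let we obtain $\langle \text{deref } o;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$ with empty trace and by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-deref with a trace of $[(\text{read}\,@\,o)]$ we obtain $\langle \text{return } v;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$, the translation of the object language residual.

$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-set: $\langle \text{set } o \text{ to } v;\ s\rangle \overset{[(\text{write}\,@\,o)]}{\rightharpoonup}{}^{imd}_{\langle e;s\rangle\,obj} \langle \text{unit};\ s\{o \mapsto \langle \text{ref } v\rangle\}\rangle$:

where $s(o) = \langle \text{ref } v_1\rangle$. The translation $[\![\langle \text{ref } v_1\rangle]\!]^{imd}_{obj\,dN}$ at $s(o)$ yields $\langle \text{ref } v_1\rangle$. $[\![\langle \text{set } o \text{ to } v;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \text{let } x_1 = \text{return } o \text{ in let } x_2 = \text{return } v \text{ in set } x_1 \text{ to } x_2;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. By two applications of $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-let we obtain $\langle \text{set } o \text{ to } v;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$ with empty trace and by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-set with a trace of $[(\text{write}\,@\,o)]$ we obtain $\langle \text{return unit};\ [\![s]\!]^{imd}_{obj\,sN}\{o \mapsto \langle \text{ref } v\rangle\}\rangle$, the translation of the object language residual.

$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-app-λ: $\langle o\ v;\ s\rangle \overset{[(\text{exec}\,@\,o)]}{\rightharpoonup}{}^{imd}_{\langle e;s\rangle\,obj} \langle e_1[x := v];\ s\rangle$:

where $s(o) = \langle \lambda x.e_1\rangle$. The translation $[\![\langle \lambda x.e_1\rangle]\!]^{imd}_{obj\,dN}$ at $s(o)$ yields $\langle \lambda x.[\![e_1]\!]^{imd}_{obj\,eN}\rangle$. $[\![\langle o\ v;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle N} = \langle \text{let } x_1 = \text{return } o \text{ in let } x_2 = \text{return } v \text{ in } x_1\ x_2;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. By two applications of $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-let we obtain $\langle o\ v;\ [\![s]\!]^{imd}_{obj\,sN}\rangle$ with empty trace and by $\rightharpoonup^{imd}_{\langle e;s\rangle\,mon}$-app-λ with a trace of $[(\text{exec}\,@\,o)]$ we obtain $\langle [\![e_1]\!]^{imd}_{obj\,eN}[x := v];\ [\![s]\!]^{imd}_{obj\,sN}\rangle$. By Proposition 2.3.1 we have $[\![e_1]\!]^{imd}_{obj\,eN}[x := v] = [\![e_1[x := v]]\!]^{imd}_{obj\,eN}$, so that configuration is the translation of the object language residual.

Sabry and Wadler demonstrate how to build a translation that is complete as well as sound, by adding reductions to both calculi [49]. They also demonstrate how to obtain a stronger result that considers the directionality of reduction by modifying the translation to avoid adding let constructs when the defining expression is pure, thus creating no administrative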

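$\rightharpoonup^{imd}_{\langle e;s\rangle\,obj}$-let redexes during translation. We refer to the translations incorporating this latter modification as $[\![\ ]\!]^{imd}_{obj\,\langle q;s\rangle H}$ and $[\![\ ]\!]^{imd}_{obj\,\langle e;s\rangle H}$. The first subclaim above would then be strengthened to assert that translations of object language evaluation contexts are (and not merely pure-reduce to) monadic language evaluation contexts. Compare this directional result with Theorem 2.3.2.

Theorem 2.3.3 (Semantics Preservation (ii)).

$$\langle q;\ s\rangle \to^{*\,imd}_{\langle q;s\rangle\,obj} \langle q';\ s'\rangle \;\implies\; [\![\langle q;\ s\rangle]\!]^{imd}_{obj\,\langle q;s\rangle H} \to^{*\,imd}_{\langle q;s\rangle\,mon} [\![\langle q';\ s'\rangle]\!]^{imd}_{obj\,\langle q;s\rangle H}$$

$$\langle e;\ s\rangle \overset{t}{\to^{*}}{}^{imd}_{\langle e;s\rangle\,obj} \langle e';\ s'\rangle \;\implies\; [\![\langle e;\ s\rangle]\!]^{imd}_{obj\,\langle e;s\rangle H} \overset{t}{\to^{*}}{}^{imd}_{\langle e;s\rangle\,mon} [\![\langle e';\ s'\rangle]\!]^{imd}_{obj\,\langle e;s\rangle H}$$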

To demonstrate the difference, the object-language program configuration:

    ⟨o let x = unit in x; ∅{o ↦ ⟨λx.x⟩}⟩

reduces to:

    ⟨o unit; ∅{o ↦ ⟨λx.x⟩}⟩

The former object-language program configuration translates (under the original translation) to:

    ⟨running let x1 = return o in let x2 = let x = return unit in return x in x1 x2; ∅{o ↦ ⟨λx.return x⟩}⟩

which does not reduce to the translation of the latter object-language program configuration:

    ⟨running let x1 = return o in let x2 = return unit in x1 x2; ∅{o ↦ ⟨λx.return x⟩}⟩

because the program configuration evaluation context:

    ⟨o [ e]; ∅{o ↦ ⟨λx.x⟩}⟩

translates to:

    ⟨running let x1 = return o in let x2 = [ e] in x1 x2; ∅{o ↦ ⟨λx.return x⟩}⟩

which is not an evaluation context in the monadic language (although it reduces to one).

However, the former program configuration translates (under the revised translation) to:

    ⟨running let x2 = return unit in o x2; ∅{o ↦ ⟨λx.return x⟩}⟩

which does reduce to the translation of the latter program configuration:

    ⟨running o unit; ∅{o ↦ ⟨λx.return x⟩}⟩
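The contrast between the two translations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dissertation's definition: the term classes (`Unit`, `Var`, `Loc`, `Let`, `App`), the fixed administrative names `x1` and `x2`, and the reading that the revised translation also contracts a source `let` whose defining expression is pure (which is what the example above suggests) are all hypothetical choices. The `running` wrapper and the store are omitted, and `subst` assumes bound names are distinct from the names being substituted.

```python
from dataclasses import dataclass

# Hypothetical mini object language: pures are Unit, Var and Loc (a store offset).
@dataclass(frozen=True)
class Unit:
    pass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Loc:
    name: str

@dataclass(frozen=True)
class Let:
    var: str
    defn: object
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def is_pure(e):
    return isinstance(e, (Unit, Var, Loc))

def show(p):
    return "unit" if isinstance(p, Unit) else p.name

def translate(e):
    """Original translation: both parts of an application are always let-bound."""
    if is_pure(e):
        return f"return {show(e)}"
    if isinstance(e, Let):
        return f"let {e.var} = {translate(e.defn)} in {translate(e.body)}"
    return f"let x1 = {translate(e.fun)} in let x2 = {translate(e.arg)} in x1 x2"

def subst(e, x, p):
    """Naive e[x := p]; assumes bound names are distinct from the names in p."""
    if isinstance(e, Var):
        return p if e.name == x else e
    if isinstance(e, Let):
        body = e.body if e.var == x else subst(e.body, x, p)
        return Let(e.var, subst(e.defn, x, p), body)
    if isinstance(e, App):
        return App(subst(e.fun, x, p), subst(e.arg, x, p))
    return e

def translate_h(e):
    """Revised translation: no let is introduced for a pure defining expression."""
    if is_pure(e):
        return f"return {show(e)}"
    if isinstance(e, Let):
        if is_pure(e.defn):
            return translate_h(subst(e.body, e.var, e.defn))
        return f"let {e.var} = {translate_h(e.defn)} in {translate_h(e.body)}"
    binds, atoms = "", []
    for name, sub in (("x1", e.fun), ("x2", e.arg)):
        if is_pure(sub):
            atoms.append(show(sub))          # pure: use it directly
        else:
            binds += f"let {name} = {translate_h(sub)} in "
            atoms.append(name)
    return binds + " ".join(atoms)

former = App(Loc("o"), Let("x", Unit(), Var("x")))   # o (let x = unit in x)
latter = App(Loc("o"), Unit())                        # o unit
```

Under these assumptions, `translate(former)` and `translate_h(former)` reproduce the expression parts of the two translated configurations shown above, and `translate_h(latter)` yields `o unit`.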

We have expended great effort in comparing two languages, an object language and a monadic language, which are not particularly different from each other. The reader is at least, by this point, familiar with our models of memory effects, our notations, and our methodology. All of this will be useful as we tackle encapsulation, under which the object and monadic languages further diverge.


Part II

Modelling Encapsulation of State With Monad Transformers

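Categorical Interlude: Monad Transformers

Monad transformers provide a way of enhancing monads. A monad transformer is a mapping between monads:

$$T ::= \langle\langle \varepsilon^{M}, \bot^{M}, \sqcup^{M}\rangle,\ M^{\varepsilon^{M} \in \varepsilon^{M}},\ \eta^{M},\ \mu^{M,\,\varepsilon_1^{M} \in \varepsilon^{M},\,\varepsilon_2^{M} \in \varepsilon^{M}}\rangle \;\mapsto\; \langle\langle \varepsilon^{TM}, \bot^{TM}, \sqcup^{TM}\rangle,\ (TM)^{\varepsilon^{TM} \in \varepsilon^{TM}},\ \eta^{TM},\ \mu^{TM,\,\varepsilon_1^{TM} \in \varepsilon^{TM},\,\varepsilon_2^{TM} \in \varepsilon^{TM}}\rangle$$

along with

• a monoid $\langle \varepsilon^{T}, \bot^{T}, \sqcup^{T}\rangle$ of effects such that:
  – $\varepsilon^{TM} = \varepsilon^{T} \times \varepsilon^{M}$
  – $\bot^{TM} = \langle \bot^{T}, \bot^{M}\rangle$
  – $\langle \varepsilon_1^{T}, \varepsilon_1^{M}\rangle \sqcup^{TM} \langle \varepsilon_2^{T}, \varepsilon_2^{M}\rangle = \langle (\varepsilon_1^{T} \sqcup^{T} \varepsilon_2^{T}),\ (\varepsilon_1^{M} \sqcup^{M} \varepsilon_2^{M})\rangle$

• a family of endofunctor transformers $T^{\varepsilon^{T} \in \varepsilon^{T}}$ indexed by $\varepsilon^{T}$ such that:
  – $(TM)^{\langle \varepsilon^{T}, \varepsilon^{M}\rangle} = T^{\varepsilon^{T}} M^{\varepsilon^{M}}$

• a monad-indexed family of incremental η natural transformations $\eta^{T,M,\varepsilon^{M}} : M^{\varepsilon^{M}} \to T^{\bot^{T}} M^{\varepsilon^{M}}$ such that:
  – $\eta^{TM} = \eta^{T,M,\bot^{M}} \circ \eta^{M}$

We thus have that the monoid of effects of the transformed monad is a product of the monoid of effects of the transformer and of that of the original monad, that the family of endofunctors of the transformed monad is indexed with the pair of the index of the transformer and that of the original monad, and that the $\eta^{TM}$ of the transformed monad is the composition of the $\eta^{M}$ of the original monad followed by the incremental $\eta^{T,M,\bot^{M}}$ of the transformer.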
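As a small illustration of the first bullet, the product of two effect monoids can be written out directly. This is a sketch only; the names `Monoid`, `product`, `state_effects` and `identity_effects` are hypothetical, and the identity monad is assumed to carry the trivial one-element monoid.

```python
from typing import Callable, NamedTuple

class Monoid(NamedTuple):
    bottom: object                              # the unit element ⊥
    join: Callable[[object, object], object]    # the operation ⊔

def product(t: Monoid, m: Monoid) -> Monoid:
    """Effects of the transformed monad: pairs, combined componentwise."""
    return Monoid(
        bottom=(t.bottom, m.bottom),
        join=lambda a, b: (t.join(a[0], b[0]), m.join(a[1], b[1])),
    )

# Sets of actions under union (assumed shape of the state transformer's effects).
state_effects = Monoid(frozenset(), lambda a, b: a | b)
# Trivial monoid, assumed here for the identity monad.
identity_effects = Monoid((), lambda a, b: ())

st_id = product(state_effects, identity_effects)
alloc = (frozenset({"alloc"}), ())
read = (frozenset({"read"}), ())
combined = st_id.join(alloc, read)
```

Applying `product` again to `state_effects` and `st_id` gives the effect monoid for two stacked transformers, mirroring the iteration of the construction.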

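There is also an incremental operator for taking a value out of a computation, one monad transformer at a time:

$$\zeta^{T,M,\varepsilon^{T},\varepsilon^{M}} : T^{\varepsilon^{T}} M^{\varepsilon^{M}} P \to M^{\varepsilon^{M}} P$$

such that:

• $\zeta^{TM\,\langle \varepsilon^{T}, \varepsilon^{M}\rangle} = \zeta^{M\,\varepsilon^{M}} \circ \zeta^{T,M,\varepsilon^{T},\varepsilon^{M}}$

The $\zeta^{TM\,\langle \varepsilon^{T}, \varepsilon^{M}\rangle}$ of the transformed monad is the composition of the incremental $\zeta^{T,M,\varepsilon^{T},\varepsilon^{M}}$ of the transformed monad followed by the $\zeta^{M\,\varepsilon^{M}}$ of the original monad.

We will make use of a unary product functor $Return :_{Cat} C \to C$ to increment the relative level to which a result of computation refers. Then the unique morphism for any object $P$ from $P$ into $Return\ P$ corresponding to $id_P$ is a natural transformation:

• $\&_{Return}\,id :^{C}_{C} Id \to Return$

We need a lift operation to promote a computation through the monad transformer. Rather than using $\eta^{T,M,\varepsilon}$ directly to promote computations, we will build another natural transformation return that allows us to also promote the relative level to which a result of computation refers. We can define the return natural transformation:

• $return^{T,M,\varepsilon^{M}} :^{C}_{C} M^{\varepsilon^{M}} \to T^{\bot^{T}} M^{\varepsilon^{M}} \circ Return$

as $return^{T,M,\varepsilon^{M}} = \eta^{T,M,\varepsilon^{M}} \circ M^{\varepsilon^{M}}\,\&_{Return}\,id$, i.e., $\eta^{T,M,\varepsilon^{M}} \circ (M^{\varepsilon^{M}} \circ \&_{Return}\,id)$. For any morphism $f :_{C} A \to B$,

$$\begin{array}{cll}
 & return^{T,M,\varepsilon^{M}}_{B} \circ M^{\varepsilon} f & \\
= & (\eta^{T,M,\varepsilon}_{B} \circ M^{\varepsilon}\,\&_{Return}\,id_{B}) \circ M^{\varepsilon} f & \text{defn of return} \\
= & \eta^{T,M,\varepsilon}_{B} \circ (M^{\varepsilon}\,\&_{Return}\,id_{B} \circ M^{\varepsilon} f) & \text{associativity} \\
= & \eta^{T,M,\varepsilon}_{B} \circ M^{\varepsilon} (\&_{Return}\,id_{B} \circ f) & M^{\varepsilon} \text{ a functor} \\
= & \eta^{T,M,\varepsilon}_{B} \circ M^{\varepsilon} (Return\ f \circ \&_{Return}\,id_{A}) & \&_{Return}\,id \text{ a natural transformation} \\
= & T^{\bot} M^{\varepsilon} (Return\ f \circ \&_{Return}\,id_{A}) \circ \eta^{T,M,\varepsilon}_{A} & \eta^{T,M,\varepsilon} \text{ a natural transformation} \\
= & (T^{\bot} M^{\varepsilon}\,Return\ f \circ T^{\bot} M^{\varepsilon}\,\&_{Return}\,id_{A}) \circ \eta^{T,M,\varepsilon}_{A} & T^{\bot} M^{\varepsilon} \text{ a functor} \\
= & T^{\bot} M^{\varepsilon}\,Return\ f \circ (T^{\bot} M^{\varepsilon}\,\&_{Return}\,id_{A} \circ \eta^{T,M,\varepsilon}_{A}) & \text{associativity} \\
= & (T^{\bot} M^{\varepsilon} \circ Return)\ f \circ (\eta^{T,M,\varepsilon}_{A} \circ M^{\varepsilon}\,\&_{Return}\,id_{A}) & \eta^{T,M,\varepsilon} \text{ a natural transformation} \\
= & (T^{\bot} M^{\varepsilon} \circ Return)\ f \circ return^{T,M,\varepsilon}_{A} & \text{defn of return}
\end{array}$$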

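We proceed to present a category in which encapsulation of state operations can be expressed. Cartesian closed category C contains the identity monad Id and monad transformer St. The state monad transformer St is defined over our implementation category C in terms of a region store type ≬S.

$$\begin{array}{rcl}
\langle \varepsilon^{St}, \bot^{St}, \sqcup^{St}\rangle & = & \langle \mathcal{P}(\{\text{alloc}, \text{read}, \text{write}, \text{exec}\}),\ \emptyset,\ \cup \rangle \\[4pt]
St^{\varepsilon^{St}} M^{\varepsilon^{M}} P & = & (M^{\varepsilon^{M}} (P \times ≬S))^{≬S} \\[4pt]
St^{\varepsilon^{St}} M^{\varepsilon^{M}} f & = & \lambda tmv.\lambda ≬s.\,M^{\varepsilon^{M}} (\lambda\langle v, ≬s\rangle.\langle f\ v, ≬s\rangle)\ (tmv\ ≬s) \\[4pt]
\eta^{St,M,\varepsilon^{M}}_{P} & = & \lambda ma.\lambda ≬s.\,(\lambda v.\eta^{M}_{P \times ≬S} \langle v, ≬s\rangle)^{*^{M,\varepsilon^{M},\bot^{M}}_{P,\,P \times ≬S}}\ ma \\[4pt]
\mu^{St\,M,\,\langle \varepsilon_1^{St}, \varepsilon_1^{M}\rangle,\,\langle \varepsilon_2^{St}, \varepsilon_2^{M}\rangle}_{P} & = & \lambda tmtmv.\lambda ≬s.\,(\lambda\langle tmv, ≬s\rangle.tmv\ ≬s)^{*^{M,\varepsilon_1^{M},\varepsilon_2^{M}}_{St\,M\,P \times ≬S,\,P \times ≬S}}\ (tmtmv\ ≬s) \\[4pt]
f^{*^{St\,M,\,\langle \varepsilon_1^{St}, \varepsilon_1^{M}\rangle,\,\langle \varepsilon_2^{St}, \varepsilon_2^{M}\rangle}_{P_1, P_2}} & = & \lambda tmv.\lambda ≬s.\,(\lambda\langle v, ≬s\rangle.f\ v\ ≬s)^{*^{M,\varepsilon_1^{M},\varepsilon_2^{M}}_{P_1 \times ≬S,\,P_2 \times ≬S}}\ (tmv\ ≬s) \\[4pt]
\zeta^{St,M,\varepsilon^{St},\varepsilon^{M}} & = & \lambda tmv.\,(\lambda\langle v, ≬s\rangle.\eta^{M}_{P}\ v)^{*^{M,\varepsilon^{M},\bot^{M}}_{P \times ≬S,\,P}}\ (tmv\ ≬s_{init})
\end{array}$$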
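The definitions above can be read operationally. The following Python sketch mirrors them for an untyped setting, ignoring effect indexes; it is an illustration under assumptions rather than the implementation used in this work. The class names `Identity` and `StateT`, the method names, the representation of stores as dictionaries that are never mutated in place, and the `tick` example are all hypothetical.

```python
class Identity:
    """Base monad Id: a computation is just its value."""
    @staticmethod
    def unit(v):
        return v

    @staticmethod
    def bind(m, f):
        return f(m)

    @staticmethod
    def zeta(m):
        return m

class StateT:
    """St applied to a base monad M: a computation maps a store to M (value, store)."""
    def __init__(self, base, init_store):
        self.base = base
        self.init_store = init_store

    def lift(self, ma):
        # Incremental eta: run ma in the base monad and pair its value with the store.
        return lambda s: self.base.bind(ma, lambda v: self.base.unit((v, s)))

    def unit(self, v):
        # eta of the transformed monad: base eta followed by the incremental eta.
        return self.lift(self.base.unit(v))

    def bind(self, tmv, f):
        # Kleisli extension: thread the store produced by tmv into f v.
        return lambda s: self.base.bind(tmv(s), lambda vs: f(vs[0])(vs[1]))

    def join(self, tmtmv):
        # mu, expressed through bind.
        return self.bind(tmtmv, lambda tmv: tmv)

    def zeta_step(self, tmv):
        # Incremental zeta: run from the initial store and discard the final store.
        return self.base.bind(tmv(self.init_store), lambda vs: self.base.unit(vs[0]))

    def zeta(self, tmv):
        # Full zeta: incremental zeta followed by the base monad's zeta.
        return self.base.zeta(self.zeta_step(tmv))

st = StateT(Identity, {})

def tick(s):
    """Example computation in St Id: return a counter and increment it."""
    n = s.get("n", 0)
    return Identity.unit((n, {**s, "n": n + 1}))

two_ticks = st.bind(tick, lambda a: st.bind(tick, lambda b: st.unit((a, b))))

# A second layer: St (St Id). The inner computation is promoted with lift.
st2 = StateT(st, {})
lifted = st2.bind(st2.lift(tick), lambda a: st2.unit(a + 10))
```

Here `st.zeta(two_ticks)` evaluates to `(0, 1)` and `st2.zeta(lifted)` to `10`, which exercises the composition of the incremental ζ with the ζ of the underlying monad.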

As in the prior part, we assume zeroary functors ∅ and Unit as well as unary functor Ref, all identifying distinct products, and exponential functor ⇒. We restrict our attention to the portion of category C that can be built up iteratively as the closure under composition of the union of a family of categories $C_i$, $i \geq 0$, along with return and run morphisms, as follows.

We define first a sequence of pure categories $P_i$, $i \geq 0$. $P_0$ contains two distinct objects, terminal in $P_0$, identified by zeroary functors Unit and ∅. $P_{i+1}$ also contains a terminal object ∅, identified by a zeroary functor, as well as the image of $P_i$ via unary product functor $Return :_{Cat} P_i \to P_{i+1}$. We extend the categories $P_i$ to form a sequence of categories $C_i$, $i \geq 0$. $C_i$ is defined as the smallest extension of $P_i$ with monad $St^{i}\,Id$. Thus, $C_0 = P_0$. We can then extend $P_{i+1}$ with distinct objects and morphisms identified by a binary exponential functor ⇒ from $P_{i+1}$ and $C_{i+1}$ to $P_{i+1}$. The first three iterations are displayed in Figure II.4.

We again assume that stores are finite functions that can be extended, updated, and applied. We break the sequence of zeroary denotable value functors of the previous Part into a two-dimensional sequence. The outer sequence begins with zeroary functor $D_0 :_{Cat} 1 \to P_0$ of results, set equal to Unit. The remainder is composed of unary functors $D_{i+1} :_{Cat} P_i \to P_{i+1}$ defined via the inner sequence. Intuitively, this functor maps a sum of pures at one level to the sum of the pures at the next. We define $D_{i+1,0} :_{Cat} P_i \to P_{i+1}$ as Return. We then define $D_{i+1,j+1} :_{Cat} P_i \to P_{i+1}$ as

$$D_{i+1,j} \mid Ref \circ D_{i+1,j} \mid (\Rightarrow) \circ (D_{i+1,j}\ \&\ (St^{\varepsilon^{St}} M^{\varepsilon^{M}} \circ D_{i+1,j}))$$

and $D_{i+1}$ as $\lim_{j:0\to\infty} D_{i+1,j}$. Region store types ≬S at level $i$ can be seen as objects relating an object $O$ of offsets with $D_i$. We assume that values include offsets and a unit value, so as to define state operations in terms of St as follows:

[Figure II.4. Incremental Development of Implementation Category C. Diagram labels recoverable from the source: categories C0, C1, C2 and P0, P1, P2; monads Id and St Id; functors Return, Ref, Unit, and ∅; return morphisms.]

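The natural transformations, each typed in $C_{i+1}$, are given with the functors they relate:

$$\begin{array}{lcl}
ref^{M} & : & P \to St^{\{\text{alloc}\}} M^{\bot} (Ref\ P) \\
 & = & \lambda v.\lambda ≬s.\,\eta^{St\,M} \langle o, ≬s\{o \mapsto v\}\rangle \\[4pt]
deref^{M} & : & Ref\ P \to St^{\{\text{read}\}} M^{\bot} P \\
 & = & \lambda o.\lambda ≬s.\,\eta^{St\,M} \langle ≬s\ o, ≬s\rangle \\[4pt]
set^{M} & : & ((Ref\ P) \times P) \to St^{\{\text{write}\}} M^{\bot} (Return^{i}\ Unit) \\
 & = & \lambda\langle o, v\rangle.\lambda ≬s.\,\eta^{St\,M} \langle \text{unit}, ≬s\{o \mapsto v\}\rangle \\[4pt]
abs^{M} & : & (St^{\varepsilon_1} M^{\varepsilon_2} P_2)^{P_1} \to St^{\{\text{alloc}\}} M^{\bot} (P_1 \Rightarrow St^{\varepsilon_1} M^{\varepsilon_2} P_2) \\
 & = & \lambda f.\lambda ≬s.\,\eta^{St\,M} \langle o, ≬s\{o \mapsto f\}\rangle \\[4pt]
app^{M} & : & (P_1 \Rightarrow St^{\varepsilon_1} M^{\varepsilon_2} P_2) \times P_1 \to St^{\{\text{exec}\} \cup \varepsilon_1} M^{\varepsilon_2} P_2 \\
 & = & \lambda\langle o, v\rangle.\lambda ≬s.\,≬s\ o\ v\ ≬s
\end{array}$$

where the operators are now parameterized by the previous monad $M = St^{i}\,Id$. Each operator that constructs a pair (all but app) must now enclose it in a trivial computation in the current monad. We can define the return natural transformation as before, with the domain of both functors restricted to $P_i$, i.e.,

• $return^{St,M,\varepsilon^{M}} :^{P_i}_{C} M^{\varepsilon^{M}} \to St^{\bot^{St}} M^{\varepsilon^{M}} \circ Return$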
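The five state operations can be sketched concretely for a single store layered over the identity monad. This is an illustrative reading, not the dissertation's implementation: stores are modelled as Python dictionaries that are copied rather than mutated, fresh offsets are taken to be the current store size, and the helper names (`eta`, `bind`, `fresh`, `run`, `set_`, `abs_`) are hypothetical. Effect indexes and the level-raising `Return` functor are ignored.

```python
def eta(v):
    """Trivial computation in St Id: return v and leave the store unchanged."""
    return lambda s: (v, s)

def bind(m, f):
    """Sequence m with f, threading the store."""
    def run_both(s):
        v, s1 = m(s)
        return f(v)(s1)
    return run_both

def fresh(s):
    """Hypothetical choice of a fresh offset: the next unused integer."""
    return len(s)

def ref(v):
    return lambda s: (fresh(s), {**s, fresh(s): v})

def deref(o):
    return lambda s: (s[o], s)

def set_(o, v):
    return lambda s: ((), {**s, o: v})   # () stands in for unit

def abs_(f):
    return ref(f)                         # a function is stored like any other value

def app(o, v):
    return lambda s: s[o](v)(s)           # look up the function, apply it, run it

def run(m):
    """Run from the empty store and discard the final store."""
    return m({})[0]

# Allocate a cell holding 1, allocate a function that swaps in its argument
# and returns the old contents, apply it to 5, then read the cell again.
swap_demo = bind(ref(1), lambda c:
    bind(abs_(lambda x: bind(deref(c), lambda old:
                        bind(set_(c, x), lambda _: eta(old)))), lambda f:
    bind(app(f, 5), lambda old:
    bind(deref(c), lambda new: eta((old, new))))))
```

As in the definitions above, `app` is the only operation that does not wrap a pair in a trivial computation; it delegates to the stored function instead.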

We must demonstrate that operations that make sense with respect to the original monad can still be carried out with respect to the transformed monad. Because operations of the state monad transformer require only values (not computations), return will suffice to promote any operator, ignoring the new region.

We this time seek natural transformations $run_i :^{P_i}_{C} St^{\varepsilon^{St}} M^{\varepsilon^{M}} \circ D_{i+1} \to M^{\varepsilon^{M}} \circ (\ + \emptyset)$ to represent the generation of a result value or dangling pointer from a computation. This is done through natural transformations $drop_i :^{P_i}_{C} D_{i+1} \to (\ + \emptyset)$. The latter is defined as the infinite mediating sum that leaves base values intact and maps allocated values to the dangling pointer:

$$\begin{array}{rcl}
drop_{i,0} & = & \iota_1 \circ \pi_1^{Return} \\
drop_{i,j+1} & = & drop_{i,j} \mid \iota_2 \circ\, !^{\emptyset}_{1} \mid \iota_2 \circ\, !^{\emptyset}_{1}
\end{array}$$

We can then define run as $drop_i \circ \zeta^{M\,\varepsilon}$.

CHAPTER 3

Source Languages

The reader has already seen encapsulation in this work; the monadic languages in Part I included an all-or-nothing variety. Effects were visible within any subexpression of a program, but not at the level of the program itself. This was appropriate because monadic expressions participated in programs via their embedding in a single run form. We now reintroduce object and monadic languages that provide encapsulation at the level of expressions. Object-language encapsulation is modelled in the monadic language by using one monad transformer per region as described in Section 2.1 of the Introduction. Our presentation of an imperative region-monomorphic language with encapsulated state in Sections 3.1 and 4.1 has some similarity with that of Calcagno, Helsen and Thiemann [5].¹ The monadic language and translation that form the remainder of this part are largely original.

¹ We differ in our use of a region-indexed syntax and in that our environments are extended with region indicators, as well as in our support for allocations that escape their lexical context. For brevity, we drop their copy operator.

1. Object Language Statics

Figure 3.1.1 presents an object source language with encapsulation of memory effects. We immediately notice differences from the corresponding language without encapsulation in Figure 1.1.1. First, there is a new primitive syntactic class of region variables ${}^{src}_{obj}\rho$. Region variables are introduced by a new expression form, letregion, that binds them over the scope of an encapsulated expression. That expression defines the extent of the corresponding region. Region variables are not expressions or even pures, as are program variables. They are referenced in the effectful operations that must now specify the region upon which they are to act. Thus, the letregion construct indicates the portion of the program to which these effects relate and through which the relevant memory must be maintained.

$$\begin{array}{rclcl}
q & \in & {}^{src}_{obj}q & ::= & e^{\epsilon} \\
e^{\rho} & \in & {}^{src\,\rho}_{obj}e & ::= & p^{\rho} \mid \text{let } x = e^{\rho} \text{ in } e^{\rho} \mid \text{letregion } \rho_0 \text{ in } e^{\rho\rho_0} \\
e^{\rho} & \in & {}^{src\,\rho_1\rho_0\rho_2}_{obj}e & ::= & @\rho_0\ b^{\rho_1\rho_0/\rho_2} \mid @\rho_0\ \text{deref } e^{\rho} \mid @\rho_0\ \text{set } e^{\rho} \text{ to } e^{\rho} \mid @\rho_0\ e^{\rho}\ e^{\rho} \\
b^{\overset{\rightharpoonup}{\rho}} & \in & {}^{src\,\rho_1\rho_0/\rho_2}_{obj}b & ::= & \langle \text{ref } e^{\rho_1\rho_0\rho_2}\rangle \mid \langle \lambda x.e^{\rho_1\rho_0}\rangle \\
p^{\rho} & \in & {}^{src\,\rho}_{obj}p & ::= & v^{\rho} \mid x \\
v^{\rho} & \in & {}^{src\,\rho}_{obj}v & ::= & g \\
g & \in & {}^{src}_{obj}g & & \\
\rho & \in & {}^{src}_{obj}\rho & & \\
x & \in & {}^{src}_{obj}x & & \\[4pt]
e_1 ; e_2 & \to & \multicolumn{3}{l}{\text{let } x = e_1 \text{ in } e_2 \quad (x \notin fpv(e_2))}
\end{array}$$

Figure 3.1.1. Object Source Language Syntax

Only allocations logically require that the region be indicated explicitly; in the other forms it could be determined statically from the first subexpression. We require that it be stated explicitly throughout in order that the translation, which requires this information, be independent of the type system. Our definition of pures is still conservative; we again exclude a let construct, even if its subexpressions are pure, and also exclude a run construct, even if its subexpression does not operate on outer regions. We identify expressions that differ only by a consistent renaming of bound region as well as program variables.

We now turn our attention to the indexes on the metavariables and syntactic classes for expressions, prestorables, pures, and values. In these cases our BNF defines an indexed family of syntactic classes, i.e., a function from the index set to a syntactic class. Our indexed metavariables represent an element of the syntactic class instantiated with the same index value. In the cases of expressions, pures, and values, the indexes are sequences of region variables, representing the lexical context of the syntactic form within letregion constructs. We will thus call them lexical indexes. Programs, occurring at top level, require no index. We use ϵ to refer to an empty sequence and otherwise simply concatenate the sequence elements. We consider ρ to be a metavariable for sequences of region variables. Prestorables are indexed by a nonempty sequence of region variables, partitioned after the region at which the prestorable is allocated. We represent such partitioned sequences with a slash '/' at the point of partition and simple sequences on either side, omitting the indication ϵ of an

empty sequence. We consider $\overset{\rightharpoonup}{\rho}$ to be a metavariable for partitioned sequences of region variables. If $\overset{\rightharpoonup}{\rho} = \rho_1/\rho_2$, we occasionally write $\overset{\rightharpoonup}{\rho}\,\rho$ for $\rho_1/\rho_2\rho$. seq-/($\overset{\rightharpoonup}{\rho}$) is the sequence of region variables with the partition removed. We stress that in this and all subsequent languages, the indexes are an analytic tool used to define syntactic classes; actual terms are not annotated with level information.

In the BNF, the index of the syntactic class may be more specific than that of the metavariable. Metavariables may then only be instantiated with index values for which the syntactic class is defined. On the third line, for example, the index of the expression is nonempty and on the fourth line the sequence to the left of the partition in the index of the prestorable is nonempty. The syntactic classes defined by lines in this BNF are not all disjoint. A syntactic class is defined by the union of all matching clauses in the BNF. The first class for expressions is general, covering pures, let, and letregion. It shares its scope with a more specialized class, defining the remaining forms, and valid when the lexical index is nonempty.

Programs q are defined to be expressions with empty index. The index of the encapsulated expression of a letregion construct is extended with the bound region variable. Then, the occurrences of region variables in effectful operators must be drawn from the nonempty lexical index. Our use of region variables as indexes to syntactic classes thus defines, prior to the type system, where in programs they may legally appear. Because, as discussed above, all effectful operators require explicit mention of the region, we cannot syntactically represent "programs" that dereference dangling pointers such as:

    deref letregion ρ in @ρ ⟨ref unit⟩

In an allocation, the index of the prestorable is the index of the expression partitioned after the region at which the allocation occurs. For a reference cell, the partition is removed for the contained expression, indicating that that expression may mention any lexically visible region. The body of a lambda expression, however, has an index formed by dropping the partition and any region variables to its right. This indicates that the body of a function may only mention regions from the point of allocation and continuing outwards. Thus, the following "program", which applies a function that sets a cell in a more inner region, is not syntactically valid:

    letregion ρ1 in @ρ1 letregion ρ2 in let x1 = @ρ2 ⟨ref unit⟩ in @ρ1 ⟨λx.@ρ2 set x1 to unit⟩ unit
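The role of lexical indexes can be made concrete with a small well-formedness check. The sketch below is an assumption-laden illustration, not part of the formal development: the term constructors and the function `well_formed` are hypothetical names, region variables are plain strings, and bound region variables are assumed to be distinct (the text identifies terms up to renaming).

```python
from collections import namedtuple

Unit = namedtuple("Unit", [])
Var = namedtuple("Var", "name")
Let = namedtuple("Let", "var defn body")
LetRegion = namedtuple("LetRegion", "region body")
Ref = namedtuple("Ref", "expr")                    # prestorable <ref e>
Lam = namedtuple("Lam", "var body")                # prestorable <λx.e>
Alloc = namedtuple("Alloc", "region prestorable")  # @ρ b
Deref = namedtuple("Deref", "region expr")         # @ρ deref e
Set = namedtuple("Set", "region cell value")       # @ρ set e to e
App = namedtuple("App", "region fun arg")          # @ρ e e

def well_formed(e, index=()):
    """Check that e belongs to the syntactic class with the given lexical index."""
    if isinstance(e, (Unit, Var)):
        return True
    if isinstance(e, Let):
        return well_formed(e.defn, index) and well_formed(e.body, index)
    if isinstance(e, LetRegion):
        # The body's index is extended with the (assumed fresh) bound region.
        return e.region not in index and well_formed(e.body, index + (e.region,))
    # Every effectful form must name a region drawn from the lexical index.
    if e.region not in index:
        return False
    if isinstance(e, Alloc):
        b = e.prestorable
        if isinstance(b, Ref):
            # Partition removed: any lexically visible region may be mentioned.
            return well_formed(b.expr, index)
        # Function body: drop the regions to the right of the partition.
        outer = index[: index.index(e.region) + 1]
        return well_formed(b.body, outer)
    if isinstance(e, Deref):
        return well_formed(e.expr, index)
    if isinstance(e, Set):
        return well_formed(e.cell, index) and well_formed(e.value, index)
    return well_formed(e.fun, index) and well_formed(e.arg, index)

# deref letregion ρ in @ρ <ref unit>, with the outer deref naming an unbound region
dangling_deref = Deref("r", LetRegion("r", Alloc("r", Ref(Unit()))))

# The function allocated at r1 mentions the more inner region r2 in its body.
inner_set = LetRegion("r1", App("r1",
    LetRegion("r2", Let("x1", Alloc("r2", Ref(Unit())),
        Alloc("r1", Lam("x", Set("r2", Var("x1"), Unit()))))),
    Unit()))

# A cell at r1 whose contained expression allocates at r2: syntactically fine.
inner_cell = LetRegion("r1", LetRegion("r2",
    Alloc("r1", Ref(Alloc("r2", Ref(Unit()))))))
```

The third term passes the syntactic check even though it would later be rejected by the type system, which reflects the split between the two mechanisms: function bodies are restricted syntactically, while the contents of reference cells are left to typing.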


If permitted, it would lead to illegal operations on dangling pointers when the function is called from outside the inner region. The programs are invalidated by the mere allocation of these functions; applying them is not necessary. This is consistent with our restriction to downward/outward references described in Section 2.2 of the Introduction. Although we permit the contained expression of a reference cell to make use of regions inward of the point of allocation, the type system will ensure that any contained value does not. We are not so lenient with function bodies, which will be placed in the store at the region of allocation after only some substitution for program variables that cannot change its region structure.

(1) letregion ρ in @ρ deref @ρ ⟨ref unit⟩
(2) letregion ρ in @ρ ⟨ref unit⟩
(3) letregion ρ1 in @ρ1 ⟨λx.@ρ1 ⟨ref x⟩⟩ letregion ρ2 in @ρ2 ⟨ref unit⟩
(4) letregion ρ1 in letregion ρ2 in @ρ2 ⟨ref @ρ1 ⟨ref unit⟩⟩
(5) letregion ρ1 in let x2 = @ρ1 ⟨ref unit⟩ in letregion ρ2 in let x3 = @ρ2 ⟨ref x2⟩ in @ρ1 @ρ1 ⟨λx1.@ρ1 deref letregion ρ3 in x2⟩ @ρ2 deref x3

Figure 3.1.2. Some Typable Object Language Programs

Figure 3.1.2 demonstrates some typable programs. The first dereferences a pointer prior to deallocation of its region. The second creates a dangling pointer. The third creates and calls a function that accepts a dangling pointer, but does not access it. The fourth stores a value from an outer region in a cell from an inner region. Consider the final program. It first allocates an outer region ρ1 with a cell holding unit and then an inner region ρ2 with a cell pointing to the outer cell. It then performs an application of a function allocated at the outer region to the result of dereferencing the inner cell, i.e., to the outer cell. The function ignores its argument and dereferences the outer cell via the free variable x2 within a new region ρ3.

Figure 1.1.3 presented some untypable programs in an object language without encapsulation. Corresponding programs in this language continue to be untypable, and we have more reasons to reject programs as well.

(1) letregion ρ1 in @ρ2 ⟨ref unit⟩
(2) letregion ρ1 in letregion ρ2 in let x = @ρ2 ⟨ref @ρ2 ⟨ref unit⟩⟩ in @ρ2 set x to (@ρ1 ⟨ref unit⟩)
(3) letregion ρ1 in letregion ρ2 in let x = @ρ2 ⟨λx.x⟩ in @ρ2 x (@ρ2 ⟨ref unit⟩); @ρ2 x (@ρ1 ⟨ref unit⟩)
(4) let x = letregion ρ in @ρ ⟨ref true⟩ in letregion ρ in @ρ deref x
(5) letregion ρ1 in letregion ρ2 in @ρ1 ⟨ref @ρ2 ⟨ref unit⟩⟩
(6) letregion ρ1 in let x = @ρ1 ⟨λx.x⟩ in letregion ρ2 in @ρ1 x (@ρ2 ⟨ref unit⟩)

Figure 3.1.3. Some Untypable Object Language Programs

The examples in Figure 3.1.3 rely on operations prohibited by our effect system. The first contains an unbound region variable ρ2. The second shows that we cannot reference objects at different regions from the same cell. The third demonstrates our continuing lack of polymorphic functions; we do not support attempts to apply the same function to values allocated at different regions. The fourth example, corresponding to a monadic program from Section 1 of the Introduction, demonstrates encapsulation. It is rejected because typing it would require letting region information from the letregion form in the let definition escape and become available within the letregion form in the let body. The final two programs demonstrate the restriction to downward/outward references. The first, similar to a program in Section 2.2 of the Introduction, stores a value at an upper region in a cell at a lower region. The second applies a function to a value allocated at a higher region.

We have already seen that, because a region must be explicitly mentioned when it is actively used, "programs" that from the body of a function actively use regions that are inward of the region at which the function is allocated, like those that dereference dangling pointers, are not well-formed. We could easily exclude them using the type system, however, were we to drop that convention and accept programs as they appear in the Introduction. To be safe, under a restricted store we require that all free regions of the argument type occur outside of the

region where the function is allocated, and, more generally, that all free regions of the type of an allocated object occur outside of the region where it is allocated.

$$\begin{array}{rclcl}
\Gamma^{\rho} & \in & {}^{src\,\epsilon}_{obj}\Gamma & ::= & \emptyset \mid \Gamma^{\epsilon}\{x \mapsto P^{\epsilon}\} \\
\Gamma^{\rho} & \in & {}^{src\,\rho_1\rho_0}_{obj}\Gamma & ::= & \Gamma^{\rho}\{x \mapsto P^{\rho}\} \mid \Gamma^{\rho_1}\{\rho_0\} \\
E^{\rho} & \in & {}^{src\,\rho}_{obj}E & ::= & P^{\rho}\ !\ T^{\rho} \\
B^{\rho} & \in & {}^{src\,\rho_1\rho_0}_{obj}B & ::= & Ref\ P^{\rho} \mid P^{\rho} \overset{T^{\rho}}{\Rightarrow} P^{\rho} \\
Q & \in & {}^{src}_{obj}Q & ::= & G \mid \emptyset \\
P^{\rho} & \in & {}^{src\,\rho_1\rho_0}_{obj}P & ::= & B^{\rho}\ @\ \rho_0 \mid P^{\rho_1} \\
P^{\rho} & \in & {}^{src\,\epsilon}_{obj}P & = & Q \\
T^{\rho}, \varepsilon^{\rho} & \in & {}^{src\,\rho}_{obj}T,\ {}^{src\,\rho}_{obj}\varepsilon & = & \{\,{}^{src\,\rho}_{obj}F\,\} \\
F^{\rho} & \in & {}^{src\,\rho_1\rho_0\rho_2}_{obj}F & ::= & (\iota\ @\ \rho_0) \\
\iota & \in & {}^{src}_{obj}\iota & ::= & \text{alloc} \mid \text{read} \mid \text{write} \mid \text{exec} \\
G & \in & {}^{src}_{obj}G & &
\end{array}$$

Figure 3.1.4. Object Source Language Static Syntax

The static syntax for this language is presented in Figure 3.1.4. It can be compared with that in Figure 1.1.4 without encapsulation. First, there is a new form for environment extension that appends a region variable in braces to the old environment. We can consider this in terms of finite functions as mapping region specifiers to an arbitrary value (thus duplicates are not permitted), however it also acts as a declaration. In particular, it indicates that any pure types for which bindings are subsequently added may mention the region. A ≤ operator is defined on environments as finite functions, i.e., one environment is less than or equal to another only when it declares a subset of the region indicators and program variables, and assigns the same pure types to the program variables.

Storable and program types are unchanged from Figure 1.1.4. Located types are now not simply storable types but indications that a storable type is at a region variable. Similarly, atomic effects are not simply actions but indications that an action occurs at a region variable. T − ρ is the subset of T that does not refer to ρ, i.e., it removes from T any atomic trace types of the form (ι @ ρ). We define ρ0 ∈ T to hold when any element of T refers to ρ0. We let frv( ) refer to the free region variables of environments, pure or storable types, or of effects.

The static syntax is also indexed. All syntactic forms except program types, global constant types, and actions are indexed by a sequence of region variables. Storable types in particular are indexed by a nonpartitioned sequence of region variables. As one descends into an environment past a region variable extension, the index of the environment is truncated, i.e., environments are

indexed with the region specifiers that they declare. Program types are the same as in the monadic language of Figure 1.2.15 and are equated with pure types with an empty index. In particular, program types exclude located types, which require a nonempty region context, replacing them with dangling pointer types. Since encapsulation now takes place at the level of expressions, dangling pointer types are included as pure types at any index. More generally, pure types at higher indexes include program types along with located types at any region variable in the index. The storable type in these located types may refer to any region variable at the point of allocation or outward. Atomic effects may include actions at any region variable in their index.

$$\begin{array}{rcll}
\Gamma^{\rho_1\rho_2} - \rho_2 & \in & {}^{src\,\rho_1}_{obj}\Gamma & \\
\Gamma\{\rho_0\} - \rho_2\rho_0 & = & \Gamma - \rho_2 & \\
\Gamma\{x \mapsto P\} - \rho_2 & = & (\Gamma - \rho_2)\{x \mapsto (P - \rho_2)\} & \\
\Gamma - \epsilon & = & \Gamma & \\[6pt]
P^{\rho_1\rho_2} - \rho_2 & \in & {}^{src\,\rho_1}_{obj}P & \\
G - \rho_2 & = & G & \\
\emptyset - \rho_2 & = & \emptyset & \\
(B_0\ @\ \rho_0) - \rho_{21}\rho_0\rho_{22} & = & \emptyset & \\
(B_0\ @\ \rho_0) - \rho_2 & = & B_0\ @\ \rho_0 & (\rho_0 \notin \rho_2)
\end{array}$$

Figure 3.1.5. Object Source Language Definitions

Figure 3.1.5 defines the restriction of environments and pure types to their lowermost region variables. Restriction of environments removes any declarations of restricted region variables and modifies declarations of program variables to restrict their types. It is defined only when the regions to be restricted are declared in order at the top of the environment. Restriction of a pure type is defined such that restricting the region of allocation from a located type yields a dangling pointer type.
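The restriction operators of Figure 3.1.5, together with the removal of atomic trace types at a region, can be sketched as follows. The representation is an assumption made for illustration: environments are lists of `("region", ρ)` and `("var", x, P)` entries with the most recent declaration last, located types are `Located` records, the dangling pointer type is the string `"∅"`, and trace types are sets of `(action, region)` pairs. The function `type_letregion` is a hypothetical helper showing how a letregion construct would combine the two restrictions.

```python
from collections import namedtuple

Located = namedtuple("Located", "storable region")   # B @ ρ
DANGLING = "∅"                                        # dangling pointer type

def restrict_type(p, rhos):
    """P − ρs: a located type at a restricted region becomes a dangling pointer."""
    if isinstance(p, Located) and p.region in rhos:
        return DANGLING
    return p

def restrict_effect(t, rhos):
    """T − ρs: drop every atomic trace type (ι @ ρ) with ρ among the restricted regions."""
    return {(action, region) for (action, region) in t if region not in rhos}

def restrict_env(env, rhos):
    """Γ − ρs: defined only when ρs are declared, in order, at the top of Γ."""
    if not rhos:
        return list(env)
    if not env:
        raise ValueError("restricted regions are not declared at the top of the environment")
    *rest, last = env
    if last[0] == "region":
        if last[1] != rhos[-1]:
            raise ValueError("regions must be restricted in declaration order")
        return restrict_env(rest, rhos[:-1])
    _, x, p = last
    return restrict_env(rest, rhos) + [("var", x, restrict_type(p, rhos))]

def type_letregion(rho, body_type, body_effect):
    """Type and trace type of `letregion ρ in e`, given those of e."""
    return restrict_type(body_type, (rho,)), restrict_effect(body_effect, (rho,))

env = [
    ("region", "r1"),
    ("var", "x2", Located("Ref Unit", "r1")),
    ("region", "r2"),
    ("var", "x3", Located("Ref (Ref Unit @ r1)", "r2")),
]
restricted = restrict_env(env, ("r2",))
```

With this representation, a letregion body of type `Located("Ref Unit", "r")` with trace type `{("alloc", "r")}` is assigned the dangling pointer type and the empty trace type, which matches the description of program (2) of Figure 3.1.2 as creating a dangling pointer.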

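$$\begin{array}{lrcl}
\text{programs} & & \vdash^{src}_{obj\,q} & q : Q \\
\text{expressions} & \Gamma^{\rho} & \vdash^{src\,\rho}_{obj\,e} & e^{\rho} : E^{\rho} \\
\text{prestorables} & \Gamma^{\rho_1\rho_0\rho_2} & \vdash^{src\,\rho_1\rho_0/\rho_2}_{obj\,b} & b^{\rho_1\rho_0/\rho_2} : B^{\rho_1\rho_0}\ !\ T^{\rho_1\rho_0\rho_2} \\
\text{pures} & \Gamma^{\rho} & \vdash^{src\,\rho}_{obj\,p} & p^{\rho} : P^{\rho} \\
\text{values} & & \vdash^{src\,\rho}_{obj\,v} & v^{\rho} : P^{\rho}
\end{array}$$

Figure 3.1.6. Object Source Language Typing Judgments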

Figure 3.1.6 presents the typing judgments, which are the same as in Figure 1.1.5 except for the presence of indexes. The judgments themselves are indexed along with the metavariables; we thus omit the indexes on metavariables in the remainder. In general, the indexes of all of the metavariables in a judgment and the index of the judgment itself must be the same. The only exception is the judgment for prestorables, for which the indexes of the environment and trace type use the sequences from both sides of the partition while the index for the storable type uses the sequence from the left (outer) side of the partition only. This supports the restriction to downward/outward references, since that restriction would otherwise be violated at the time of allocation. Although the contained expression of a reference cell allocation may make use of program variables from and perform effects on regions inward of the region of allocation, the type of the prestorable may not mention such region variables.

Since judgments are indexed by a sequence of region variables, the typing rules that derive them are more accurately called typing rule schemes. Their names include a syntactic class family that we may choose to instantiate with an index value corresponding to the index of the conclusion judgment.

$$\vdash^{src}_{obj\,v}\text{-glob-const} \qquad \frac{}{\vdash^{src\,\rho}_{obj\,v}\ g : \text{TypeOf}(g)}$$

Figure 3.1.7. Typing of Object Source Language Values

The rule for global constant values in Figure 3.1.7 is like that of Figure 1.1.6.

$$\vdash^{src}_{obj\,q} \qquad \frac{\emptyset\ \vdash^{src\,\epsilon}_{obj\,e}\ q : Q\ !\ \emptyset}{\vdash^{src}_{obj\,q}\ q : Q}$$

Figure 3.1.8. Typing of Object Source Language Programs

The rule for typing programs in Figure 3.1.8 is similar to that of Figure 1.1.7. The main differences are that we must install the empty index for the expression judgment and that as our program judgments require no trace type, the top-level expression trace type is required to be empty. Regarding the latter point, the trace type will be empty because by the time we emerge at the top level, all effects will be encapsulated by the relevant letregion construct. The rules for typing pures in Figure 3.1.9 are the same as without encapsulation (Figure 1.1.8) except for the presence of indexes.

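$$\vdash^{src}_{obj\,p}\text{-var} \qquad \frac{}{\Gamma\ \vdash^{src\,\rho}_{obj\,p}\ x : \Gamma(x)}$$

$$\vdash^{src}_{obj\,p}\text{-value} \qquad \frac{\vdash^{src\,\rho}_{obj\,v}\ v : P}{\Gamma\ \vdash^{src\,\rho}_{obj\,p}\ v : P}$$

Figure 3.1.9. Typing of Object Source Language Pures

$$\vdash^{src}_{obj\,b}\text{-ref} \qquad \frac{\Gamma\ \vdash^{src\,\rho_1\rho_0\rho_2}_{obj\,e}\ e : P_0\ !\ T}{\Gamma\ \vdash^{src\,\rho_1\rho_0/\rho_2}_{obj\,b}\ \langle \text{ref } e\rangle : Ref\ P_0\ !\ T}$$

$$\vdash^{src}_{obj\,b}\text{-}\lambda \qquad \frac{(\Gamma - \rho_2)\{x \mapsto P_{0.1}\}\ \vdash^{src\,\rho_1\rho_0}_{obj\,e}\ e_0 : P_{0.2}\ !\ T_0}{\Gamma\ \vdash^{src\,\rho_1\rho_0/\rho_2}_{obj\,b}\ \langle \lambda x.e_0\rangle : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}\ !\ \emptyset}$$

Figure 3.1.10. Typing of Object Source Language Prestorables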

Prestorables are typed in Figure 3.1.10.² In rule $\vdash^{src}_{obj\,b}$-ref, we have begun to use trace types instead of effects. This will be convenient for the monadic language of Section 3.2. The change in the index between the prestorable judgment and the expression judgment for each rule follows the dynamic syntax. Rule $\vdash^{src}_{obj\,b}$-λ requires that we restrict the environment (Figure 3.1.5) before extending it with the formal parameter, and that the free regions of the parameter type, body type, and latent effect be visible in the restricted environment. Restricted region variables (those pointing upward of the region of allocation) are considered to refer to dangling pointers when typing the function body, as described at the end of Section 2.5 of the Introduction. Given our typing rules for expressions in Figure 3.1.11, this prevents those variables from being accessed actively, i.e., in order to dereference or set a cell or call a function, while allowing them to be used where a dangling pointer is expected. Because we always restrict the innermost regions, variables in the function body referencing other regions are safe to use in any manner.

The rules for typing expressions are presented in Figure 3.1.11. In each rule, the indexes on the antecedent judgments can be derived from those on the conclusion by following the syntax. They differ from the rules without encapsulation in Figure 1.1.10 in that they include region information. In $\vdash^{src}_{obj\,e}$-alloc, allocation @ρ0 b is assigned located type B0 @ ρ0 and registers an effect that includes both the effect of b and an allocation at region ρ0. Rules $\vdash^{src}_{obj\,e}$-deref, $\vdash^{src}_{obj\,e}$-set, and $\vdash^{src}_{obj\,e}$-app require that the first subexpression be typable as a located type at region ρ0, and each registers an effect at ρ0. As described above, dereferences and settings of reference cells and applications of functions mention ρ0 for convenience. It is implicit in our syntactic categories that this region be declared in the static environment.

² Our unusual subscripts have no formal role other than to distinguish occurrences of distinct metavariables. The convention is intended to directly impart significant information regarding the index of the metavariable that could otherwise be inferred by the reader.

137

Modelling Encapsulation of State With Monad Transformers

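\[
\vdash^{src}_{obj\,e}\text{-pure}\quad
\frac{\Gamma \vdash^{src\,\rho}_{obj\,p} p : P}
     {\Gamma \vdash^{src\,\rho}_{obj\,e} p : P \mathbin{!} \emptyset}
\]
\[
\vdash^{src}_{obj\,e}\text{-let}\quad
\frac{\Gamma \vdash^{src\,\rho}_{obj\,e} e_{.1} : P_{.1} \mathbin{!} T_{.1}
      \qquad
      \Gamma\{x \mapsto P_{.1}\} \vdash^{src\,\rho}_{obj\,e} e_{.2} : P_{.2} \mathbin{!} T_{.2}}
     {\Gamma \vdash^{src\,\rho}_{obj\,e} \mathsf{let}\ x = e_{.1}\ \mathsf{in}\ e_{.2} : P_{.2} \mathbin{!} T_{.1} \cup T_{.2}}
\]
\[
\vdash^{src}_{obj\,e}\text{-alloc}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0/\rho_2}_{obj\,b} b : B_0 \mathbin{!} T}
     {\Gamma \vdash^{src\,\rho_1\rho_0\rho_2}_{obj\,e} @\rho_0\ b : B_0\ @\ \rho_0 \mathbin{!} T \cup \{(\mathsf{alloc}\ @\ \rho_0)\}}
\]
\[
\vdash^{src}_{obj\,e}\text{-deref}\quad
\frac{\Gamma \vdash^{src\,\rho}_{obj\,e} e : \mathsf{Ref}\ P_0\ @\ \rho_0 \mathbin{!} T}
     {\Gamma \vdash^{src\,\rho = \rho_1\rho_0\rho_2}_{obj\,e} @\rho_0\ \mathsf{deref}\ e : P_0 \mathbin{!} T \cup \{(\mathsf{read}\ @\ \rho_0)\}}
\]
\[
\vdash^{src}_{obj\,e}\text{-set}\quad
\frac{\Gamma \vdash^{src\,\rho}_{obj\,e} e_{.1} : \mathsf{Ref}\ P_0\ @\ \rho_0 \mathbin{!} T_{.1}
      \qquad
      \Gamma \vdash^{src\,\rho}_{obj\,e} e_{.2} : P_0 \mathbin{!} T_{.2}}
     {\Gamma \vdash^{src\,\rho = \rho_1\rho_0\rho_2}_{obj\,e} @\rho_0\ \mathsf{set}\ e_{.1}\ \mathsf{to}\ e_{.2} : \mathsf{Unit} \mathbin{!} T_{.1} \cup T_{.2} \cup \{(\mathsf{write}\ @\ \rho_0)\}}
\]
\[
\vdash^{src}_{obj\,e}\text{-app}\quad
\frac{\Gamma \vdash^{src\,\rho}_{obj\,e} e_{.1} : P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2}\ @\ \rho_0 \mathbin{!} T_{.1}
      \qquad
      \Gamma \vdash^{src\,\rho}_{obj\,e} e_{.2} : P_{0.1} \mathbin{!} T_{.2}}
     {\Gamma \vdash^{src\,\rho = \rho_1\rho_0\rho_2}_{obj\,e} @\rho_0\ e_{.1}\ e_{.2} : P_{0.2} \mathbin{!} T_0 \cup T_{.1} \cup T_{.2} \cup \{(\mathsf{exec}\ @\ \rho_0)\}}
\]
\[
\vdash^{src}_{obj\,e}\text{-letregion}\quad
\frac{\Gamma\{\rho_0\} \vdash^{src\,\rho\rho_0}_{obj\,e} e : P_0 \mathbin{!} T_0}
     {\Gamma \vdash^{src\,\rho}_{obj\,e} \mathsf{letregion}\ \rho_0\ \mathsf{in}\ e : P_0 - \rho_0 \mathbin{!} T_0 - \rho_0}
\]

Figure 3.1.11. Typing of Object Source Language Expressions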

The rule $\vdash^{src}_{obj\,e}$-letregion is new, but bears some similarity to the rule $\vdash^{src}_{mon\,q}$-run of the monadic source language (Figure 1.2.18). When typing the letregion body, the environment is extended with the bound region variable. It is implicit in our indexes that the bound region variable is fresh, i.e., does not occur in the environment Γ. If the body has type P0 and effect T0, the letregion construct has type P0 − ρ0 and effect T0 − ρ0. The former is defined in Figure 3.1.5 to treat any references to the bound region variable as dangling pointers, allowing us to support operations on dangling pointers as described in Section 2.5 of the Introduction. This also serves to make it impossible that any such values could be dereferenced, set, or applied outside of the letregion construct in a well-typed program, since we disallow effects on ∅. We restrict the trace type because we do not need to be concerned with these effects in analyzing code outside of this letregion construct.³

³ For symmetry, in this and subsequent languages we could modify the conclusion to derive a judgment using the static environment Γ − ρ0, ensuring that the dependence of the static environment on ρ0 not be used outside of the encapsulation construct. This, however, is unnecessary for the language as specified. It is not possible for static environments to contain free references to region indicators before those region indicators are declared, since any allocation requires an initial value of the appropriate type. With this modification, however, this technique might be applied to language extensions, such as allowing reference cells to be initialized to a nil value that is permitted to stand in as any pure type.

In the presentation of Calcagno et al., based on Tofte and Talpin [60], it is required in the rule corresponding to $\vdash^{src}_{obj\,e}$-letregion with bound variable ρ0 that ρ0 ∉ frv(Γ) and ρ0 ∉ frv(P). The motivation for these requirements, from Launchbury and Peyton Jones [36], as described in Section 1 of the Introduction, is to prevent programs such as the fourth in Figure 3.1.3, which uses an address generated from an allocation at one region to dereference a cell at another region. Tofte and Talpin, however, do not require this, but only that effects relating to Γ and P not be masked, i.e., a reference to a region may escape the encapsulation construct, and a variable declared outside the encapsulation construct may involve the region, but no analysis information may be gleaned from such regions. This is acceptable because Tofte and Talpin do not prove type soundness for arbitrary programs with the encapsulation construct, but only for translations of encapsulation-free source-language programs. Thus, they need not be concerned by examples such as that above, which could not have been generated by their region-inference algorithm. We choose to avoid these requirements, but prevent such harmful examples by replacing P0 with P0 − ρ0 in the context of a letregion construct.

Figures 3.1.12 and 3.1.13 present a typing derivation of the last program of Figure 3.1.2. We again conserve space by omitting the syntactic class from the judgment; we now instantiate the rule scheme name with the index of the conclusion so as not to lose this information. As we descend through the instances of $\vdash^{src}_{obj\,e}$-letregion in Figure 3.1.12 and d1 of Figure 3.1.13, we add each bound region variable to both the judgment index and the static environment. At the level of programs there is no effect at all, and in the antecedent of each instance of $\vdash^{src}_{obj\,e}$-letregion, atomic effects for the bound region variable are included in the judgment's effect. The allocation of the inner reference cell (d2) in Figure 3.1.12 uses a prestorable derivation in which both regions are to the left of the partition, and both are available for use in the subexpression representing the contained value. The type of the prestorable may mention either region variable; in fact it mentions only ρ1. The allocation at region ρ2 is registered in the effect of the expression. Had the reference cell index been ρ1/ρ2, i.e., had the allocation been at ρ1, the only change would be that, because of the restricted store, we would preclude the type of the prestorable from mentioning ρ2. In the case of a function with the partition set between the two regions, as with the allocation of the function in d1 of Figure 3.1.13, neither the body nor its type or effect may mention ρ2, as indicated by the index of ρ1 and the absence of ρ2 in the environment. Within the body of that function, x3, which had been typed as a reference cell at ρ2, is interpreted to have a type of ∅ and the declaration of ρ2 is removed from the environment. Upon reaching the declaration of ρ3, the index is extended to ρ1ρ3. The body registers a read at ρ1 which becomes the latent effect of the function.

Figure 3.1.12. Sample Object Source Language Derivation, I (root derivation of ⊢ letregion ρ1 in let x2 = @ρ1 ⟨ref unit⟩ in letregion ρ2 in let x3 = @ρ2 ⟨ref x2⟩ in @ρ1 @ρ1 ⟨λx1.@ρ1 deref letregion ρ3 in x2⟩ @ρ2 deref x3 : Unit, with subderivations d2 and d3)

Figure 3.1.13. Sample Object Source Language Derivation, II (subderivation d1)
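The effect bookkeeping of the sample derivation can be replayed with a small sketch. It assumes only what the rules above state: effects are sets of atomic effects (action @ region), subexpression effects are combined by union, and letregion removes the atomic effects at its bound region. The names `union` and `mask` and the string encoding of regions are illustrative choices, not part of the formal system.

```python
# Effects are modelled as frozensets of (action, region) pairs.
# Region names are plain strings chosen for this illustration.

def union(*effects):
    """Combine the effects of subexpressions, as in the let, set and app rules."""
    return frozenset().union(*effects)

def mask(effect, region):
    """T - rho0: drop every atomic effect at the region bound by letregion."""
    return frozenset((a, r) for (a, r) in effect if r != region)

# Effect of the body of `letregion rho2` in the sample program.
body_rho2 = union(
    {("alloc", "rho2")},                       # let x3 = @rho2 <ref x2>
    {("alloc", "rho1"), ("exec", "rho1")},     # allocation and application of the function
    {("read", "rho1")},                        # latent effect of the function
    {("read", "rho2")},                        # @rho2 deref x3
)
after_rho2 = mask(body_rho2, "rho2")

# let x2 = @rho1 <ref unit> in letregion rho2 ...
body_rho1 = union({("alloc", "rho1")}, after_rho2)
after_rho1 = mask(body_rho1, "rho1")
```

After the inner letregion only the three atomic effects at ρ1 remain, and after the outer letregion the effect is empty, which is what the program rule requires.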

2. Monadic Language Statics

\[
\begin{array}{rcl}
q \in {}^{src}_{mon\,q} & ::= & e^{\epsilon} \\
e^{\rho} \in {}^{src\,\rho}_{mon\,e} & ::= & \mathsf{return}^{\rho}\ p^{\rho} \mid \mathsf{let}\ x = e^{\rho}\ \mathsf{in}\ e^{\rho} \mid \mathsf{run}\ e^{\rho\rho_0} \\
e^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,e} & ::= & \mathsf{return}\ e^{\rho_1} \mid b^{\rho} \mid \mathsf{deref}\ p^{\rho} \mid \mathsf{set}\ p^{\rho}\ \mathsf{to}\ p^{\rho} \mid p^{\rho}\ p^{\rho} \\
b^{\rho} \in {}^{src\,\rho}_{mon\,b} & ::= & \langle\mathsf{ref}\ p^{\rho}\rangle \mid \langle\lambda x.e^{\rho}\rangle \\
p^{\rho} \in {}^{src\,\rho}_{mon\,p} & ::= & v^{\rho} \mid x \\
v^{\rho} \in {}^{src\,\rho}_{mon\,v} & ::= & g \\
g \in {}^{src}_{mon\,g} & \supseteq & {}^{src}_{obj\,g} \\
x \in {}^{src}_{mon\,x} & \supseteq & {}^{src}_{obj\,x}
\end{array}
\]
\[
e_1 ; e_2 \;\to\; \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 \qquad (x \notin \mathrm{fpv}(e_2))
\]

Figure 3.2.14. Monadic Source Language Syntax

The monadic source language syntax is presented in Figure 3.2.14. Like the object language of Figure 3.1.1, it provides encapsulation at the level of expressions, but unlike that language it does so without region variables. To this end, some of the encapsulation syntax of the monadic language of Chapter 1, Figure 1.2.12 reappears in a modified form. Unlike that monadic language, run is not needed at the level of programs. Instead, run, which does not bind any region variable, replaces letregion from the object language as an expression form. Effectful operators no longer require a region argument. As in the prior monadic language, their components are pure. Because our definition of pures is again conservative, we preclude the use of let and run in these contexts. That is not much of a problem, since such complex operands could always be let-bound. A return form is present, but its argument can be either pure or an expression, as indicated by the two distinct BNF entries. In the former case, the superscript on return indicates that there is an instance of return for every region variable in the sequence. In the latter, although the lack of region variables in the language forces us to make all effects relate only to the innermost region, we can obtain access to outer regions via the subexpression of return.


The syntactic classes of expressions, prestorables, pures, and values are indexed by a sequence of region variables. The grammar is similar to that used by Taha to describe levels of code.4 As in the object language, monadic operations are permissible only when the region sequence is nonempty. Also as in the object language, the index remains unchanged as one descends to the components of let forms, effectful operations, and prestorables. There are, however, several important differences from the object language. First, only a single explicit, generic, region variable is used, so that the sequence could effectively be replaced by a numeric counter. Second, although the index is extended upon descending into a run form, it may be retracted upon entering a return form. When a return form is built from an expression (the second BNF entry), the index of that expression is retracted, cancelling the last uncancelled run form in a stack protocol. In that case, the outer layer of the computation is considered to be trivial. The remainder of the computation is considered to refer to the next lower region. Thus, as promised, we support effectful operations on outer regions. When a return form is built from a pure, however, the index is left unchanged. In this case, enough return forms must be provided to consider the entire computation trivial, but the pure may reference values allocated at any region. This corresponds to uses of return in the monadic language with program-level encapsulation. Figures 3.2.15 and 3.2.16 present typable and untypable monadic language programs corresponding to (but not necessarily translations of) the object language programs in Figures 3.1.2 and 3.1.3, respectively. In the case of the final typable program, we choose to place the allocation of the function at the outer region and its application under the same return form. For the first untypable program, an unbound region variable is simulated with an excess return form. 
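Since the index can be replaced by a numeric counter, the index discipline described above can be sketched as a depth check. The sketch below is an illustration only: the class names are hypothetical, operands of effectful operators are elided, `ReturnP` stands for the whole series of return forms around a pure, and passing the check is necessary but not sufficient for typability.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Run:          # run e: the index is extended in e
    body: object

@dataclass(frozen=True)
class ReturnE:      # return e: the index is retracted in e
    body: object

@dataclass(frozen=True)
class ReturnP:      # return^rho p: one return per region, index unchanged
    pure: str

@dataclass(frozen=True)
class Let:          # let x = defn in body: index unchanged in both parts
    var: str
    defn: object
    body: object

@dataclass(frozen=True)
class Op:           # allocation, deref, set or application on pure operands
    name: str

def well_indexed(e, depth=0):
    """Check the index discipline, with the region sequence kept as a counter."""
    if isinstance(e, Run):
        return well_indexed(e.body, depth + 1)
    if isinstance(e, ReturnE):
        # Retraction cancels the last uncancelled run; there must be one.
        return depth > 0 and well_indexed(e.body, depth - 1)
    if isinstance(e, ReturnP):
        return True
    if isinstance(e, Let):
        return well_indexed(e.defn, depth) and well_indexed(e.body, depth)
    if isinstance(e, Op):
        # Monadic operations need a nonempty region sequence.
        return depth > 0
    raise TypeError(e)

# Typable program (4): run run let x = return <ref unit> in <ref x>
typable_4 = Run(Run(Let("x", ReturnE(Op("alloc")), Op("alloc"))))
# Untypable program (1): run return return <ref unit>
untypable_1 = Run(ReturnE(ReturnE(Op("alloc"))))
```

Here `well_indexed(typable_4)` holds, while `well_indexed(untypable_1)` fails at the excess return form. The other untypable programs of Figure 3.2.16 are not claimed to be rejected by this check; ruling them out is the job of the typing rules.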
The static syntax, presented in Figure 3.2.17, is also free of explicit mention of region variables. Rather than extending environments with region variables, we maintain environments as sequences of region environments ≬Γ. Region environments are similar to the environments Γ of the simple monadic language of Figure 1.2.15, i.e., sets of bindings of program variables to pure types. As in Figure 1.2.15, expression types are applications of a trace type to a pure type, but trace types are now built up from region trace types. In particular, they are a right-associative sequence of applications of region trace types to the identity functor Id. Region trace types are instances of the state monad transformer annotated with an effect ε. Effects, as in the simple monadic language, are simply sets of actions. We define an operator to combine two annotated monad transformers, taking the union of their effect annotations, i.e., we define ≬T.1 ⊔ ≬T.2 as

\[
\mathsf{St}^{\varepsilon_{.1}} \sqcup \mathsf{St}^{\varepsilon_{.2}} = \mathsf{St}^{\varepsilon_{.1} \cup \varepsilon_{.2}}.
\]

We then define an operator to combine two monads of the same structure pointwise. Formally, we define T.1 ⊔ T.2 such that Id ⊔ Id = Id and ≬T0.1 T1.1 ⊔ ≬T0.2 T1.2 = (≬T0.1 ⊔ ≬T0.2) (T1.1 ⊔ T1.2). Later in Part III, we will also require a relational operator on trace types and region trace types. We define ≬T′0 ⊑ ≬T0 to hold when there is a subset relation on the annotations, i.e., St^{ε′0} ⊑ St^{ε0} if ε′0 ⊆ ε0. T′ ⊑ T is defined such that Id ⊑ Id and ≬T′0 T′1 ⊑ ≬T0 T1 when ≬T′0 ⊑ ≬T0 and T′1 ⊑ T1.

⁴ The length of the sequence corresponds to his numeric indexes [52]. More recently [53], he has used sequences of names for indexes. With respect to MetaML syntax, run e corresponds to ⟨e⟩ and return e corresponds to ˜e. MetaML has the properties that ${}^{src\,\rho}_{mon\,e} = {}^{src\,\rho\rho_0}_{mon\,v} \subseteq {}^{src\,\rho\rho_0}_{mon\,e}$ and $\forall e^{\rho} \in {}^{src\,\rho}_{mon\,e},\ e^{\epsilon} \in {}^{src\,\epsilon}_{mon\,e}.\ e^{\rho}[x := e^{\epsilon}] \in {}^{src\,\rho}_{mon\,e}$. These properties are not necessary for a monad language. The first equality would imply that within a region, operations on outer regions are considered values, i.e., execution of computations on an outer region is not (at least implicitly) interspersed with execution of computations on an inner region; in Section 2.1 of the Introduction we described the benefits of allowing such dependency between computations at various levels. The subset relation would hold if we allowed pure terms beyond their given level; this would require a reduction that inserted implicit return forms around values, or else a more complex reduction rule for let. The final property fails in our system because of our conservative definition of pures.

(1) run let x = ⟨ref unit⟩ in deref x
(2) run ⟨ref unit⟩
(3) run ⟨λx.⟨ref x⟩⟩ run ⟨ref unit⟩
(4) run run let x = return ⟨ref unit⟩ in ⟨ref x⟩
(5) run let x2 = ⟨ref unit⟩ in run let x3 = ⟨ref x2⟩ in let x5 = deref x3 in return let x4 = ⟨λx1.let x6 = run return return x2 in deref x6⟩ in x4 x5

Figure 3.2.15. Some Typable Monadic Language Programs

(1) run return return ⟨ref unit⟩
(2) run run let x = let x3 = ⟨ref unit⟩ in ⟨ref x3⟩ in let x3 = return ⟨ref unit⟩ in set x to x3
(3) run run let x = ⟨λx.return return x⟩ in let x1 = ⟨ref unit⟩ in x x1 ; let x1 = return ⟨ref unit⟩ in x x1
(4) let x = run ⟨ref true⟩ in run deref x
(5) run run let x1 = ⟨ref unit⟩ in return ⟨ref x1⟩
(6) run let x = ⟨λx.return x⟩ in run let x1 = ⟨ref unit⟩ in return x x1

Figure 3.2.16. Some Untypable Monadic Language Programs

\[
\begin{array}{rcl}
\Gamma^{\rho} \in {}^{src\,\epsilon}_{mon\,\Gamma} & ::= & \{{\between}\Gamma^{\epsilon}\} \\
\Gamma^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,\Gamma} & ::= & \Gamma^{\rho_1}\,\{{\between}\Gamma^{\rho}\} \\
{\between}\Gamma^{\rho} \in {}^{src\,\rho}_{mon\,{\between}\Gamma} & ::= & \emptyset \mid {\between}\Gamma^{\rho}\,\{x \mapsto P^{\rho}\} \\
E^{\rho} \in {}^{src\,\rho}_{mon\,E} & ::= & T^{\rho}\ P^{\rho} \\
B^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,B} & ::= & \mathsf{Ref}\ P^{\rho} \mid P^{\rho} \Rightarrow E^{\rho} \\
P^{\rho}, Q \in {}^{src\,\epsilon}_{mon\,P} = {}^{src}_{mon\,Q} & ::= & G \mid \emptyset \\
P^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,P} & ::= & B^{\rho} \mid \mathsf{Return}\ P^{\rho_1} \\
T^{\rho} \in {}^{src\,\epsilon}_{mon\,T} & ::= & \mathsf{Id} \\
T^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,T} & ::= & {\between}T^{\rho}\ T^{\rho_1} \\
{\between}T^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,{\between}T} & ::= & \mathsf{St}^{\varepsilon^{\rho}} \\
\varepsilon^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,\varepsilon} & = & \{\,{}^{src\,\rho_1\rho_0}_{mon\,F}\,\} \\
F^{\rho} \in {}^{src\,\rho_1\rho_0}_{mon\,F} & ::= & \iota \\
\iota \in {}^{src}_{mon\,\iota} & ::= & \mathsf{alloc} \mid \mathsf{read} \mid \mathsf{write} \mid \mathsf{exec} \\
G \in {}^{src}_{mon\,G} & \supseteq & {}^{src}_{obj\,G}
\end{array}
\]

Figure 3.2.17. Monadic Source Language Static Syntax

Like the dynamic syntax of Figure 3.2.14, the static syntax is indexed by a sequence of occurrences of a single, generic region variable. Environments Γ are sequences of region environments at increasing monadic level; the initial region environment has an empty index. As in the object language, program types are pure types with an empty index and consist of global constant types and dangling pointer types. Pure types of nonempty index include storable types of the same index, corresponding to the type of a prestorable at the innermost region. Types of prestorables at outer regions are specified using a Return form, which expects a pure type of retracted index. Examples of such monadic pure types are provided in Section 2.3 of the Introduction. We present below an example of an environment representing a context of two regions. The outermost binding set is outside of any region. It binds x1 to Unit. The next one binds x2 and x3. x2 takes only unit, typed here as Return Unit. x3 takes an integer. The third binding set contains another unit binding, x4, as well as another binding for an integer from the outer region, x5, typed here as Return Int, a binding of x6 for an integer from the inner region, and finally a binding of x7 for a function at the inner region from an integer at the outer region to an integer at the inner region, performing a read at the outer region and an allocation at the inner region.

{∅ {x1 ↦ Unit}}
{∅ {x2 ↦ Return Unit} {x3 ↦ Int}}
{∅ {x4 ↦ Return Return Unit} {x5 ↦ Return Int} {x6 ↦ Int} {x7 ↦ Return Int ⇒ St{alloc} (St{read} Id) Int}}
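The combination operator ⊔ and the relation ⊑ on trace types defined above can be sketched directly. The encoding is an assumption made for illustration: a trace type is a tuple of effect sets, innermost region first, with the empty tuple standing for Id, and ⊔ is treated as undefined (`None`) when the two trace types differ in structure.

```python
from typing import FrozenSet, Optional, Tuple

# St^{e0} (St^{e1} (... Id)) is modelled as (e0, e1, ...); () stands for Id.
Trace = Tuple[FrozenSet[str], ...]

ID: Trace = ()

def join(t1: Trace, t2: Trace) -> Optional[Trace]:
    """T.1 join T.2: pointwise union of the effect annotations."""
    if len(t1) != len(t2):
        return None                      # different structure: not defined
    return tuple(a | b for a, b in zip(t1, t2))

def leq(t1: Trace, t2: Trace) -> bool:
    """T' below T: same structure and a subset relation at every region."""
    return len(t1) == len(t2) and all(a <= b for a, b in zip(t1, t2))

# Latent trace type of x7 in the example environment: St{alloc} (St{read} Id).
x7_latent: Trace = (frozenset({"alloc"}), frozenset({"read"}))
# An execution registered at the innermost region only.
exec_inner: Trace = (frozenset({"exec"}), frozenset())

combined = join(x7_latent, exec_inner)
```

With these definitions `combined` is St{alloc,exec} (St{read} Id), each operand is below it in the ⊑ ordering, and Id ⊔ Id is Id.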

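\[
\begin{array}{rcll}
\multicolumn{4}{l}{\Gamma^{\rho_1}(x) \in {}^{src\,\rho_1}_{mon\,P}} \\
\Gamma(x) & = & \Gamma(x)^{\rho_1/} \\[4pt]
\multicolumn{4}{l}{\Gamma^{\rho_1}(x)^{\rho_1/\rho_2} \in {}^{src\,\rho_1\rho_2}_{mon\,P}} \\
\Gamma\{{\between}\Gamma\{x \mapsto P\}\}(x)^{\rho_1/\rho_2} & = & \mathsf{Return}^{\rho_2}\ P \\
\Gamma\{{\between}\Gamma\{x_1 \mapsto P\}\}(x_2)^{\rho_1/\rho_2} & = & \Gamma\{{\between}\Gamma\}(x_2)^{\rho_1/\rho_2} & (x_1 \neq x_2) \\
\Gamma\{{\between}\Gamma\}\{\emptyset\}(x)^{\rho_1\rho_0/\rho_2} & = & \Gamma\{{\between}\Gamma\}(x)^{\rho_1/\rho_0\rho_2} \\[4pt]
\multicolumn{4}{l}{{\between}\Gamma_1 \mathbin{\Box_{\rho_1\rho_0}} {\between}\Gamma_2 \in {}^{src\,\rho_1}_{mon\,{\between}\Gamma}} \\
{\between}\Gamma_1 \mathbin{\Box_{\rho_1\rho_0}} \emptyset & = & {\between}\Gamma_1 \\
{\between}\Gamma_1 \mathbin{\Box_{\rho_1\rho_0}} {\between}\Gamma_2\{x \mapsto P\} & = & ({\between}\Gamma_1 \mathbin{\Box_{\rho_1\rho_0}} {\between}\Gamma_2)\{x \mapsto P :=_{\rho_1\rho_0} \emptyset\} \\[4pt]
\multicolumn{4}{l}{\Gamma^{\rho_1\rho_2}\,\Box_{\rho_1/\rho_2} \in {}^{src\,\rho_1}_{mon\,\Gamma}} \\
\Gamma\{{\between}\Gamma\}\,\Box_{\rho_1/} & = & \Gamma\{{\between}\Gamma\} \\
\Gamma\{{\between}\Gamma_1\}\{{\between}\Gamma_2\}\,\Box_{\rho_1/\rho_2\rho_0} & = & (\Gamma\{{\between}\Gamma_1 \mathbin{\Box_{\rho_1\rho_2\rho_0}} {\between}\Gamma_2\})\,\Box_{\rho_1/\rho_2} \\[4pt]
\multicolumn{4}{l}{P^{\rho_1\rho_0} :=_{\rho_1\rho_0} \emptyset \in {}^{src\,\rho_1}_{mon\,P}} \\
B :=_{\rho_1\rho_0} \emptyset & = & \mathsf{Return}^{\rho_1}\ \emptyset \\
\mathsf{Return}\ P :=_{\rho_1\rho_0} \emptyset & = & P
\end{array}
\]

Figure 3.2.18. Monadic Source Language Definitions

Figure 3.2.18 presents definitions related to the monadic source language. Application of an environment to a program variable at ρ1, Γ(x)^{ρ1}, searches each region environment for the variable binding, not just the upper one, in order to satisfy the first condition in Section 2.4 of the Introduction. It adds a Return form for each region boundary that must be crossed to reach the program variable binding. The merging of ≬Γ2 into ≬Γ1 at ρ1ρ0, ≬Γ1 □_{ρ1ρ0} ≬Γ2, is defined to include entries from the upper region environment in the lower region environment after restricting the innermost region from the pure type. The restriction of an environment, i.e., the iterated merging of the uppermost region environments of an environment, Γ □_{ρ1/ρ2}, is also defined. Restricting the innermost region from a pure type at ρ1ρ0, P :=_{ρ1ρ0} ∅, is defined so as to remove a level of referencing to outer regions, or replace a storable type at the innermost region with a reference to ∅, outside of all remaining regions. We omit the indexes on these operators when they can be inferred from the context.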
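The definitions described above can be sketched as follows. The encoding is hypothetical: an environment is a list of dictionaries, outermost region environment first; pure types are nested tuples such as `("Return", P)` and `("Ref", P)`; and the string `"Dangling"` stands for ∅. Any type that is not a Return form is treated as a storable type at the innermost region.

```python
def wrap(p, n):
    """Apply n Return forms to the pure type p."""
    for _ in range(n):
        p = ("Return", p)
    return p

def lookup(env, x):
    """Search region environments from the innermost outward, adding one
    Return form per region boundary crossed."""
    for crossed, region_env in enumerate(reversed(env)):
        if x in region_env:
            return wrap(region_env[x], crossed)
    raise KeyError(x)

def restrict_pure(p, remaining):
    """P := dangling. Strip one Return form, or replace an innermost storable
    type by a dangling type placed outside the `remaining` enclosing regions."""
    if isinstance(p, tuple) and p[0] == "Return":
        return p[1]
    return wrap("Dangling", remaining)

def merge_top(env):
    """Merge the uppermost region environment into the one below it."""
    *rest, lower, upper = env
    merged = dict(lower)
    for x, p in upper.items():
        merged[x] = restrict_pure(p, len(rest))
    return rest + [merged]

REF_UNIT = ("Ref", ("Return", "Unit"))
env = [
    {},                                        # outside of any region
    {"x2": REF_UNIT},                          # outer region
    {"x3": ("Ref", ("Return", REF_UNIT)),      # inner region
     "x5": ("Return", REF_UNIT)},
]
merged = merge_top(env)
```

Looking up x2 from the inner region crosses one boundary and yields Return Ref Return Unit. Merging strips the Return form from the type of x5 and turns the inner reference cell x3 into a dangling type, here wrapped in one Return form for the single remaining region.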

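\[
\begin{array}{ll}
\text{programs} & \vdash^{src}_{mon\,q} q : Q \\
\text{expressions} & \Gamma^{\rho} \vdash^{src\,\rho}_{mon\,e} e^{\rho} : E^{\rho} \\
\text{prestorables} & \Gamma^{\rho} \vdash^{src\,\rho}_{mon\,b} b^{\rho} : B^{\rho} \\
\text{pures} & \Gamma^{\rho} \vdash^{src\,\rho}_{mon\,p} p^{\rho} : P^{\rho} \\
\text{values} & \vdash^{src\,\rho}_{mon\,v} v^{\rho} : P^{\rho}
\end{array}
\]

Figure 3.2.19. Monadic Source Language Typing Judgments

Judgments in the encapsulated monadic language, in Figure 3.2.19, are similar to those of the simple monadic language of Figure 1.2.16 except that the syntax is indexed as in Figures 3.2.14 and 3.2.17.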

\[
\vdash^{src}_{mon\,v}\text{-glob-const}\quad
\frac{}{\vdash^{src\,\epsilon}_{mon\,v} g : \mathrm{TypeOf}(g)}
\qquad
\vdash^{src}_{mon\,v}\text{-ret-run}\quad
\frac{\vdash^{src\,\rho_1}_{mon\,v} v : P_1}
     {\vdash^{src\,\rho_1\rho_0}_{mon\,v} v : \mathsf{Return}\ P_1}
\]

Figure 3.2.20. Typing of Monadic Source Language Values

Values are typed using the two rules in Figure 3.2.20. Rule $\vdash^{src}_{mon\,v}$-ret-run is used to type values allocated in outer regions, or to type global constants and dangling pointers from within a region. It requires a derivation of the same value, typed without the Return form, with a retracted index. The rule $\vdash^{src}_{mon\,v}$-glob-const is like that of the object language (Figure 3.1.7), but restricted to an empty index.

\[
\vdash^{src}_{mon\,q}\quad
\frac{\{\emptyset\} \vdash^{src\,\epsilon}_{mon\,e} q : \mathsf{Id}\ Q}
     {\vdash^{src}_{mon\,q} q : Q}
\]

Figure 3.2.21. Typing of Monadic Source Language Programs

Programs are typed using the rule in Figure 3.2.21. The antecedent expression derivation has an empty index, as in the object language (Figure 3.1.8). The required expression type is thus formed by applying the identity functor to the program type. The environment consists of a single, empty, region environment.

\[
\vdash^{src}_{mon\,p}\text{-var}\quad
\frac{}{\Gamma \vdash^{src\,\rho}_{mon\,p} x : \Gamma(x)}
\qquad
\vdash^{src}_{mon\,p}\text{-value}\quad
\frac{\vdash^{src\,\rho}_{mon\,v} v : P}{\Gamma \vdash^{src\,\rho}_{mon\,p} v : P}
\]

Figure 3.2.22. Typing of Monadic Source Language Pures

The rules for pures in Figure 3.2.22 are the same as those of the object language (Figure 3.1.9). The application of an environment to a program variable is defined above in Figure 3.2.18.

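\[
\vdash^{src}_{mon\,b}\text{-ref}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p : P_0}
     {\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,b} \langle\mathsf{ref}\ p\rangle : \mathsf{Ref}\ P_0}
\]
\[
\vdash^{src}_{mon\,b}\text{-}\lambda\quad
\frac{\Gamma_1\{{\between}\Gamma_0\{x \mapsto P_{0.1}\}\} \vdash^{src\,\rho_1\rho_0}_{mon\,e} e_0 : {\between}T_0\ T_1\ P_{0.2}}
     {\Gamma_1\{{\between}\Gamma_0\} \vdash^{src\,\rho_1\rho_0}_{mon\,b} \langle\lambda x.e_0\rangle : P_{0.1} \Rightarrow {\between}T_0\ T_1\ P_{0.2}}
\]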

Figure 3.2.23. Typing of Monadic Source Language Prestorables

Rules for typing prestorables are presented in Figure 3.2.23. They are indexed versions of those of Figure 1.2.20, except that in the rule $\vdash^{src}_{mon\,b}$-λ, it is the upper region environment that is extended with the bound program variable. The expression types in that rule follow the syntax in Figure 3.2.17.

In $\vdash^{src}_{mon\,e}$-let, it is again the uppermost region environment that is extended with the bound program variable. The ⊔ operator is used to combine the effects of the definition and of the body. The ⊔ operator in $\vdash^{src}_{mon\,e}$-app operates similarly for functor transformers, combining the latent effect at the innermost region with an execution registered for the application. The resulting functor transformer is in turn applied to the remaining latent effect. Other monadic operators have a trace type formed by applying a functor transformer that simply registers an action at the innermost region to a trivial trace type that registers no effects at other regions. Rule $\vdash^{src}_{mon\,e}$-pure types a pure surrounded by a series of return forms, one per region variable in the context. This is indicated by the syntax return^ρ. The rule requires that the pure be typable at the same index, with the same pure type.

We can now see that our definition of application of environments in Figure 3.2.18, which inserts Return forms at region boundaries, is consistent with the desired treatment described in Section 2.4 of the Introduction. This is because in rule $\vdash^{src}_{mon\,e}$-run, the environment is extended for variable references within a run form. If we reference a program variable declared outside the run form, we thus expect it to have a pure type modified by an application of Return. The exception is that rule $\vdash^{src}_{mon\,e}$-ret-run retracts the environment in order to prevent the insertion of a Return form for variables declared outside of a run form but accessed within operations on an outer region. This is because $\vdash^{src}_{mon\,e}$-ret-run ensures that the entire pure type is enclosed in a Return form. In order to satisfy the second condition in Section 2.4 of the Introduction, $\vdash^{src}_{mon\,e}$-ret-run merges the upper region environment into the next highest region environment instead of dropping the upper region environment outright. The merging process drops a Return form, allowing variable references to an outer region to be considered as references to the current region within an operation on the outer region. The rule $\vdash^{src}_{mon\,e}$-pure, by contrast, neither retracts the environment nor encloses the pure type in a Return form, as the index remains unchanged.

\[
\vdash^{src}_{mon\,e}\text{-let}\quad
\frac{\Gamma\{{\between}\Gamma\} \vdash^{src\,\rho}_{mon\,e} e_{.1} : T_{.1}\ P_{.1}
      \qquad
      \Gamma\{{\between}\Gamma\{x \mapsto P_{.1}\}\} \vdash^{src\,\rho}_{mon\,e} e_{.2} : T_{.2}\ P_{.2}}
     {\Gamma\{{\between}\Gamma\} \vdash^{src\,\rho}_{mon\,e} \mathsf{let}\ x = e_{.1}\ \mathsf{in}\ e_{.2} : (T_{.1} \sqcup T_{.2})\ P_{.2}}
\]
\[
\vdash^{src}_{mon\,e}\text{-pure}\quad
\frac{\Gamma \vdash^{src\,\rho}_{mon\,p} p : P}
     {\Gamma \vdash^{src\,\rho}_{mon\,e} \mathsf{return}^{\rho}\ p : (\mathsf{St}^{\emptyset})^{\rho}\ \mathsf{Id}\ P}
\]
\[
\vdash^{src}_{mon\,e}\text{-alloc}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,b} b : B_0}
     {\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,e} b : \mathsf{St}^{\{\mathsf{alloc}\}}\ ((\mathsf{St}^{\emptyset})^{\rho_1}\ \mathsf{Id})\ B_0}
\]
\[
\vdash^{src}_{mon\,e}\text{-deref}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p : \mathsf{Ref}\ P_0}
     {\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,e} \mathsf{deref}\ p : \mathsf{St}^{\{\mathsf{read}\}}\ ((\mathsf{St}^{\emptyset})^{\rho_1}\ \mathsf{Id})\ P_0}
\]
\[
\vdash^{src}_{mon\,e}\text{-set}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p_{.1} : \mathsf{Ref}\ P_0
      \qquad
      \Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p_{.2} : P_0}
     {\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,e} \mathsf{set}\ p_{.1}\ \mathsf{to}\ p_{.2} : \mathsf{St}^{\{\mathsf{write}\}}\ ((\mathsf{St}^{\emptyset})^{\rho_1}\ \mathsf{Id})\ \mathsf{Unit}}
\]
\[
\vdash^{src}_{mon\,e}\text{-app}\quad
\frac{\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p_{.1} : (P_{0.1} \Rightarrow {\between}T_0\ T_1\ P_{0.2})
      \qquad
      \Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,p} p_{.2} : P_{0.1}}
     {\Gamma \vdash^{src\,\rho_1\rho_0}_{mon\,e} p_{.1}\ p_{.2} : ({\between}T_0 \sqcup \mathsf{St}^{\{\mathsf{exec}\}})\ T_1\ P_{0.2}}
\]
\[
\vdash^{src}_{mon\,e}\text{-run}\quad
\frac{\Gamma_1\{\emptyset\} \vdash^{src\,\rho_1\rho_0}_{mon\,e} e_0 : {\between}T_0\ T_1\ P_0}
     {\Gamma_1 \vdash^{src\,\rho_1}_{mon\,e} \mathsf{run}\ e_0 : T_1\ (P_0 :=_{\rho_1\rho_0} \emptyset)}
\]
\[
\vdash^{src}_{mon\,e}\text{-ret-run}\quad
\frac{\Gamma\{{\between}\Gamma_1 \mathbin{\Box} {\between}\Gamma_0\} \vdash^{src\,\rho_1}_{mon\,e} e_1 : T_1\ P_1}
     {\Gamma\{{\between}\Gamma_1\}\{{\between}\Gamma_0\} \vdash^{src\,\rho_1\rho_0}_{mon\,e} \mathsf{return}\ e_1 : \mathsf{St}^{\emptyset}\ T_1\ (\mathsf{Return}\ P_1)}
\]

Figure 3.2.24. Typing of Monadic Source Language Expressions

Figures 3.2.25 through 3.3.27 present a sample derivation of the final program of Figure 3.2.15. As we descend the typing derivation in Figure 3.2.25 through each of the two instances of $\vdash^{src}_{mon\,e}$-run, we extend the judgment index with the generic region variable and extend the static environment with an empty region environment. The trace type begins as Id for the outermost expression and is modified by an application of the state monad transformer, annotated with the effect at each region, as the pure type Unit is modified by applications of Return. Within antecedents of the derivation of the let forms, the effect at each region trace type is divided, with that pertaining to the inner region peeling off to each definition antecedent (e.g., the reference cell allocations in d1 and d2, Figure 3.2.26), and that pertaining to the outer region belonging to the body antecedent. In d2, the variable reference of x2 yielding the outer reference cell in the allocation of the inner cell is typed with an additional Return construct because x2 is accessed one region environment back. Within the body antecedents, the uppermost region environment is extended with the bound program variable.

The rule for each effectful operator expression applies a trace type; in the case of $\vdash^{src}_{mon\,e}$-app, this includes the latent effect of the function. $\vdash^{src}_{mon\,p}$-value is used to introduce the environment. In d3 of Figure 3.3.27, the return form is typed using $\vdash^{src}_{mon\,e}$-ret-run, in whose antecedent the index is retracted, restricting the remaining processing to the outer region. An empty region trace type and instance of Return are stripped off, and the uppermost two region environments are merged. The latter process involves stripping a Return form from the type of x5 and converting the type of x3 to ∅. The revised type of x5 is appropriate for the application of the function allocated in d4. The application incorporates the latent read effect of the function with a newly registered execution effect. In the derivation of the function allocation, $\vdash^{src}_{mon\,b}$-λ requires an expression derivation in an environment with the uppermost region environment extended with a binding of the formal parameter, x1, to the domain type, Ref Return Unit, yielding the range type and effect of St{read} Id Ref Return Unit. $\vdash^{src}_{mon\,e}$-run introduces the encapsulation construct, expecting an expression derivation with extended index and environment, and producing a pure type with an additional application of Return and a trace type with an additional application of the state monad transformer, in this case annotated with the empty effect. Because the access of program variable x2 from within an additional region occurs in an expression context, $\vdash^{src}_{mon\,e}$-pure is used to supply both required return forms and the empty trace type. It preserves both the Return form corresponding to the new region and the additional region environment, which cancel each other in the $\vdash^{src}_{mon\,p}$-var derivation. The typing of the dereference of the outer cell in d5 of Figure 3.2.26 is straightforward and provides the required read action.

Figure 3.2.25. Sample Monadic Source Language Derivation, I (root derivation of ⊢ run let x2 = ⟨ref unit⟩ in run let x3 = ⟨ref x2⟩ in let x5 = deref x3 in return let x4 = ⟨λx1.let x6 = run return return x2 in deref x6⟩ in x4 x5 : Unit, referring to subderivations d1, d2 and d3)


Each judgment is followed by the rule that concludes it; its antecedent is indented beneath it.

d1 =
    {∅} {∅} ⊢ ⟨ref unit⟩ : St{alloc} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,e}$-alloc]
      {∅} {∅} ⊢ ⟨ref unit⟩ : Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,b}$-ref]
        {∅} {∅} ⊢ unit : Return Unit    [$\vdash^{src\ \rho}_{mon\,p}$-value]
          ⊢ unit : Return Unit    [$\vdash^{src\ \rho}_{mon\,v}$-ret-run]
            ⊢ unit : Unit    [$\vdash^{src\ \epsilon}_{mon\,v}$-glob-const]

d2 =
    {∅} {∅ {x2 ↦ Ref Return Unit}} {∅} ⊢ ⟨ref x2⟩ : St{alloc} (St∅ Id) Ref Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,e}$-alloc]
      {∅} {∅ {x2 ↦ Ref Return Unit}} {∅} ⊢ ⟨ref x2⟩ : Ref Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,b}$-ref]
        {∅} {∅ {x2 ↦ Ref Return Unit}} {∅} ⊢ x2 : Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,p}$-var]

d5 =
    {∅} {∅ {x2 ↦ Ref Return Unit} {x3 ↦ ∅} {x5 ↦ Ref Return Unit} {x1 ↦ Ref Return Unit} {x6 ↦ Ref Return Unit}} ⊢ deref x6 : St{read} Id Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,e}$-deref]
      {∅} {∅ {x2 ↦ Ref Return Unit} {x3 ↦ ∅} {x5 ↦ Ref Return Unit} {x1 ↦ Ref Return Unit} {x6 ↦ Ref Return Unit}} ⊢ x6 : Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,p}$-var]

Figure 3.2.26. Sample Monadic Source Language Derivation, II

3. Translation

In Figure 3.3.28 we present a translation between the two languages of this chapter. Like the translation of the nonencapsulated object language (Figure 1.3.24), it is divided by object-language syntactic class, but these (for classes other than programs) are now indexed by sequences of region variables. As in that translation, subexpressions of monadic operations are broken out using let

Each judgment is followed by the rule that concludes it; its antecedents are indented beneath it.

d3 =
    {∅} {∅ {x2 ↦ Ref Return Unit}} {∅ {x3 ↦ Ref Return Ref Return Unit} {x5 ↦ Return Ref Return Unit}} ⊢ return let x4 = ⟨λx1. let x6 = run return return x2 in deref x6⟩ in x4 x5 : St{} (St{alloc,exec,read} Id) Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,e}$-ret-run]
      Γ ⊢ let x4 = ⟨λx1. let x6 = run return return x2 in deref x6⟩ in x4 x5 : St{alloc,exec,read} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,e}$-let]
        d4
        Γ′ ⊢ x4 x5 : St{exec,read} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,e}$-app]
          Γ′ ⊢ x4 : Ref Return Unit ⇒ St{read} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,p}$-var]
          Γ′ ⊢ x5 : Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,p}$-var]

d4 =
    Γ ⊢ ⟨λx1. let x6 = run return return x2 in deref x6⟩ : St{alloc} Id (Ref Return Unit ⇒ St{read} Id Ref Return Unit)    [$\vdash^{src\ \rho}_{mon\,e}$-alloc]
      Γ ⊢ ⟨λx1. let x6 = run return return x2 in deref x6⟩ : Ref Return Unit ⇒ St{read} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,b}$-λ]
        Γ″ ⊢ let x6 = run return return x2 in deref x6 : St{read} Id Return Unit    [$\vdash^{src\ \rho}_{mon\,e}$-let]
          Γ″ ⊢ run return return x2 : St{} Id Ref Return Unit    [$\vdash^{src\ \rho}_{mon\,e}$-run]
            Γ″ {∅} ⊢ return return x2 : St{} St{} Id Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,e}$-pure]
              Γ″ {∅} ⊢ x2 : Return Ref Return Unit    [$\vdash^{src\ \rho\rho}_{mon\,p}$-var]
          d5

where

    Γ  = {∅} {∅ {x2 ↦ Ref Return Unit} {x3 ↦ ∅} {x5 ↦ Ref Return Unit}}
    Γ′ = {∅} {∅ {x2 ↦ Ref Return Unit} {x3 ↦ ∅} {x5 ↦ Ref Return Unit} {x4 ↦ Ref Return Unit ⇒ St{read} Id Ref Return Unit}}
    Γ″ = {∅} {∅ {x2 ↦ Ref Return Unit} {x3 ↦ ∅} {x5 ↦ Ref Return Unit} {x1 ↦ Ref Return Unit}}

Figure 3.3.27. Sample Monadic Source Language Derivation, III

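forms. Rather than introducing run at the level of programs, the translation replaces letregion constructs with run. Object-language pures are given not a single return construct, but a series of them. As in the rule $\vdash^{src}_{mon\,e}$-pure, the syntax $\mathrm{return}^{\rho}$ indicates an instance of return for every region variable in ρ. Series of return constructs are also provided to bypass inner regions when a monadic operation is to be performed in an outer region. Recursive calls in the translation follow the syntax definition so that the body of a letregion form is translated with an extended index and the body of a function allocated in an outer region is translated with a retracted index. One can check that the translated code can be assigned an index of the same length as that of the source code. In particular, it can be assigned the translation of the object-language index, defined by replacing any region variable with the unique monadic region variable.

Programs and Expressions

$$\begin{array}{rcl}
\llbracket e \rrbracket^{src}_{obj\,q\,N} &=& \llbracket e \rrbracket^{src\ \epsilon}_{obj\,e\,N} \\[4pt]
\llbracket p \rrbracket^{src\ \rho}_{obj\,e\,N} &=& \mathrm{return}^{\rho}\ \llbracket p \rrbracket^{src\ \rho}_{obj\,p\,N} \\[4pt]
\llbracket \mathrm{let}\ x = e_1\ \mathrm{in}\ e_2 \rrbracket^{src\ \rho}_{obj\,e\,N} &=& \mathrm{let}\ x = \llbracket e_1 \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \llbracket e_2 \rrbracket^{src\ \rho}_{obj\,e\,N} \\[4pt]
\llbracket @\,\rho_0\ \langle \mathrm{ref}\ e \rangle \rrbracket^{src\ \rho = \rho_1\rho_0\rho_2}_{obj\,e\,N} &=& \mathrm{let}\ x = \llbracket e \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{return}^{\rho_2}\ \langle \mathrm{ref}\ x \rangle \\[4pt]
\llbracket @\,\rho_0\ \mathrm{deref}\ e \rrbracket^{src\ \rho = \rho_1\rho_0\rho_2}_{obj\,e\,N} &=& \mathrm{let}\ x = \llbracket e \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{return}^{\rho_2}\ (\mathrm{deref}\ x) \\[4pt]
\llbracket @\,\rho_0\ \mathrm{set}\ e_{.1}\ \mathrm{to}\ e_{.2} \rrbracket^{src\ \rho = \rho_1\rho_0\rho_2}_{obj\,e\,N} &=& \mathrm{let}\ x_1 = \llbracket e_{.1} \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{let}\ x_2 = \llbracket e_{.2} \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{return}^{\rho_2}\ (\mathrm{set}\ x_1\ \mathrm{to}\ x_2) \\[4pt]
\llbracket @\,\rho_0\ \langle \lambda x.e \rangle \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,e\,N} &=& \mathrm{return}^{\rho_2}\ \langle \lambda x.\llbracket e \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} \rangle \\[4pt]
\llbracket @\,\rho_0\ e_{.1}\ e_{.2} \rrbracket^{src\ \rho = \rho_1\rho_0\rho_2}_{obj\,e\,N} &=& \mathrm{let}\ x_1 = \llbracket e_{.1} \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{let}\ x_2 = \llbracket e_{.2} \rrbracket^{src\ \rho}_{obj\,e\,N}\ \mathrm{in}\ \mathrm{return}^{\rho_2}\ (x_1\ x_2) \\[4pt]
\llbracket \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e \rrbracket^{src\ \rho_1}_{obj\,e\,N} &=& \mathrm{run}\ \llbracket e \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N}
\end{array}$$

Pures and Values

$$\begin{array}{rcl}
\llbracket x \rrbracket^{src\ \rho}_{obj\,p\,N} &=& x \\[4pt]
\llbracket v \rrbracket^{src\ \rho}_{obj\,p\,N} &=& \llbracket v \rrbracket^{src\ \rho}_{obj\,v\,N} \\[4pt]
\llbracket g \rrbracket^{src\ \rho}_{obj\,v\,N} &=& g
\end{array}$$

Region Variables

$$\begin{array}{rcl}
\llbracket \rho \rrbracket^{src}_{obj\,\rho\,N} &=& \rho \\[4pt]
\llbracket \vec{\rho}\, \rrbracket^{src}_{obj\,\rho\,N} &=& \overrightarrow{\llbracket \rho \rrbracket^{src}_{obj\,\rho\,N}}
\end{array}$$

Figure 3.3.28. Translating Object Source Language to Monadic (Dynamic)

Recall that in order to make our translation of the dynamic syntax independent of any translation of the static syntax, we required the relevant region to be explicitly mentioned in all monadic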

operations. We see now that this region variable is used to divide the region sequence into ρ1, ρ0, and ρ2, thus indicating how many return forms must be inserted and at what level procedure bodies are to be translated. If our translation were defined over typing derivations and this crutch were not present, all that would remain is to identify the relevant ρ0 in the region variable sequence as the location of the first argument to the monadic operation.

Consider the sample object language derivation in Figure 3.1.12. Applying the translation of Figure 3.3.28 to the program

    letregion ρ1 in
    let x2 = @ ρ1 ⟨ref unit⟩ in
    letregion ρ2 in
    let x3 = @ ρ2 ⟨ref x2⟩ in
    @ ρ1 @ ρ1 ⟨λx1. @ ρ1 deref letregion ρ3 in x2⟩ @ ρ2 deref x3

yields

    run let x2 = let x4 = return unit in ⟨ref x4⟩ in
    run let x3 = let x6 = return return x2 in ⟨ref x6⟩ in
    let x7 = return ⟨λx1. let x5 = run return return x2 in deref x5⟩ in
    let x8 = let x9 = return return x3 in deref x9 in
    return (x7 x8)

The static translation is in Figure 3.3.29. The translation of environments differs from that of Figure 1.3.25 in that it must remove region variable declarations, using them to delimit the region environments of the monadic language. Similarly, the translation removes explicit mention of region variables in object language trace types, using them to divide the effect among the series of functor transformers applied, eventually, to the identity functor. Pure types are translated by applying a series of Return forms. In the cases of global constant types and dangling pointer types, a Return form is added for each region variable in the index. In the case of located types, a Return form is added for each region variable in the index inward of the point of allocation. Storable types are translated as for the nonencapsulated object language. Again, translated code can be assigned an index of the same length as that of the source code.
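The way the region annotation splits the index into ρ1, ρ0, and ρ2 can be illustrated with a small executable sketch of the dynamic translation. This is only an illustration under stated assumptions: the tuple encoding of object-language terms, the function names (`translate`, `split`, `returns`), the generated variable names `t1, t2, ...`, and the ASCII rendering `<lam x. e>` are all hypothetical and are not part of the dissertation.

```python
from itertools import count

def returns(n, body):
    """Prefix body with n return forms (return^rho for a sequence of length n)."""
    return "return " * n + body

def split(index, rho0):
    """Split index as rho1 rho0 rho2; give (rho1 rho0, length of rho2)."""
    i = index.index(rho0)
    return index[: i + 1], len(index) - i - 1

def translate(e, index=(), fresh=None):
    """Translate an object-language term (a tagged tuple) at a lexical index.

    Assumed encoding: ("var", x), ("const", g), ("let", x, e1, e2),
    ("letregion", rho, e), ("ref", rho, e), ("deref", rho, e),
    ("set", rho, e1, e2), ("lam", rho, x, e), ("app", rho, e1, e2).
    """
    fresh = fresh if fresh is not None else (f"t{i}" for i in count(1))
    tag = e[0]
    if tag in ("var", "const"):
        # A pure receives one return per region variable in the index.
        return returns(len(index), e[1])
    if tag == "let":
        _, x, e1, e2 = e
        return f"let {x} = {translate(e1, index, fresh)} in {translate(e2, index, fresh)}"
    if tag == "letregion":
        # letregion becomes run; the body sees an extended index.
        _, rho0, body = e
        return f"run ({translate(body, index + (rho0,), fresh)})"
    outer, inner = split(index, e[1])
    if tag == "lam":
        # The procedure body is translated at the retracted index rho1 rho0.
        _, _, x, body = e
        return returns(inner, f"<lam {x}. {translate(body, outer, fresh)}>")
    # ref, deref, set, app: name each subexpression with let, then bypass
    # the inner regions rho2 with return forms.
    names, code = [], ""
    for sub in e[2:]:
        x = next(fresh)
        names.append(x)
        code += f"let {x} = {translate(sub, index, fresh)} in "
    shape = {"ref": "<ref {}>", "deref": "(deref {})",
             "set": "(set {} to {})", "app": "({} {})"}[tag]
    return code + returns(inner, shape.format(*names))
```

For instance, a dereference at the outer of two regions is given one return form to bypass the inner region, while its operand, a pure, is given two.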


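Environments

$$\begin{array}{rcl}
\llbracket \emptyset \rrbracket^{src\ \epsilon}_{obj\,\Gamma\,N} &=& \{\emptyset\} \\[4pt]
\llbracket \Gamma\,\{x \mapsto P\} \rrbracket^{src\ \rho}_{obj\,\Gamma\,N} &=& \mathrm{let}\ \Gamma\,\{{\between}\Gamma\} = \llbracket \Gamma \rrbracket^{src\ \rho}_{obj\,\Gamma\,N}\ \mathrm{in}\ \Gamma\,\{{\between}\Gamma\,\{x \mapsto \llbracket P \rrbracket^{src\ \rho}_{obj\,P\,N}\}\} \\[4pt]
\llbracket \Gamma\,\{\rho_0\} \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} &=& \llbracket \Gamma \rrbracket^{src\ \rho_1}_{obj\,\Gamma\,N}\ \{\emptyset\}
\end{array}$$

Expression and Trace Types

$$\begin{array}{rcl}
\llbracket P \mathbin{!} T \rrbracket^{src\ \rho}_{obj\,E\,N} &=& \llbracket T \rrbracket^{src\ \rho}_{obj\,T\,N}\ \llbracket P \rrbracket^{src\ \rho}_{obj\,P\,N} \\[4pt]
\llbracket T_0 \rrbracket^{src\ \rho\rho_0}_{obj\,T\,N} &=& \mathrm{St}^{\{\iota \mid (\iota\,@\,\rho_0) \in T_0\}}\ \llbracket T_0 - \rho_0 \rrbracket^{src\ \rho}_{obj\,T\,N} \\[4pt]
\llbracket T \rrbracket^{src\ \epsilon}_{obj\,T\,N} &=& \mathrm{Id}
\end{array}$$

Pure Types

$$\begin{array}{rcl}
\llbracket G \rrbracket^{src\ \rho}_{obj\,P\,N} &=& \mathrm{Return}^{\rho}\ G \\[4pt]
\llbracket B_0\,@\,\rho_0 \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N} &=& \mathrm{Return}^{\rho_2}\ \llbracket B_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,B\,N} \\[4pt]
\llbracket \emptyset \rrbracket^{src\ \rho}_{obj\,P\,N} &=& \mathrm{Return}^{\rho}\ \emptyset
\end{array}$$

Storable Types

$$\begin{array}{rcl}
\llbracket \mathrm{Ref}\ P_0 \rrbracket^{src\ \rho = \rho_1\rho_0}_{obj\,B\,N} &=& \mathrm{Ref}\ \llbracket P_0 \rrbracket^{src\ \rho}_{obj\,P\,N} \\[4pt]
\llbracket P_{0.1} \overset{T}{\Rightarrow} P_{0.2} \rrbracket^{src\ \rho = \rho_1\rho_0}_{obj\,B\,N} &=& \llbracket P_{0.1} \rrbracket^{src\ \rho}_{obj\,P\,N} \Rightarrow \llbracket T \rrbracket^{src\ \rho}_{obj\,T\,N}\ \llbracket P_{0.2} \rrbracket^{src\ \rho}_{obj\,P\,N}
\end{array}$$

Figure 3.3.29. Translating Object Source Language to Monadic (Static)

Applying the static translation to the environment

    ∅ {x1 ↦ Unit} {ρ1} {x2 ↦ Unit} {x3 ↦ Int @ ρ1} {ρ2} {x4 ↦ Unit} {x5 ↦ Int @ ρ1} {x6 ↦ Int @ ρ2}
      {x7 ↦ Int @ ρ1 ⇒{(read @ ρ1),(alloc @ ρ2)} Int @ ρ2}

yields

    {∅ {x1 ↦ Unit}}
    {∅ {x2 ↦ Return Unit} {x3 ↦ Int}}
    {∅ {x4 ↦ Return Return Unit} {x5 ↦ Return Int} {x6 ↦ Int} {x7 ↦ Return Int ⇒ St{alloc} (St{read} Id) Int}}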

the environment presented early in this chapter. Corresponding to Theorem 1.3.1, we have another statement of preservation of types. In each case the monadic index is the translation of the object index.
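The type-level clauses of Figure 3.3.29 can be exercised on the example above with a short sketch. Everything about the encoding is an assumption made for illustration: types are tagged tuples, results are rendered as strings, `Int` is treated as an opaque storable (as the example itself does informally), and the function type of x7 is assumed to be located at ρ2, since the example does not state its location explicitly.

```python
def trace_type(index, effects):
    """Translate a trace type: one St layer per region, innermost region outermost.

    effects is a set of (action, region) pairs, standing for (action @ region).
    """
    t = "Id"
    for rho in index:  # outermost region first, so it ends up innermost in the type
        acts = sorted(a for (a, r) in effects if r == rho)
        label = "{" + ",".join(acts) + "}"
        t = f"St{label} {t}" if t == "Id" else f"St{label} ({t})"
    return t

def pure_type(index, p):
    """Translate a pure type at the given index (a tuple of region variables).

    Assumed encoding: ("global", name), ("dangling", name),
    ("at", storable, rho0) for a located type B0 @ rho0.
    """
    if p[0] in ("global", "dangling"):
        # One Return per region variable in the index.
        return "Return " * len(index) + p[1]
    _, b, rho0 = p
    i = index.index(rho0)
    # One Return per region variable inward of the point of allocation.
    return "Return " * (len(index) - i - 1) + storable_type(index[: i + 1], b)

def storable_type(index, b):
    """Translate a storable type: ("base", name), ("ref", P), ("fun", P1, T, P2)."""
    if b[0] == "base":
        return b[1]
    if b[0] == "ref":
        return "Ref " + pure_type(index, b[1])
    _, dom, effects, rng = b
    return f"{pure_type(index, dom)} => {trace_type(index, effects)} {pure_type(index, rng)}"

index = ("rho1", "rho2")
int_at = lambda rho: ("at", ("base", "Int"), rho)
x7 = ("at", ("fun", int_at("rho1"),
             {("read", "rho1"), ("alloc", "rho2")}, int_at("rho2")), "rho2")
print(pure_type(index, ("global", "Unit")))  # type of x4
print(pure_type(index, int_at("rho1")))      # type of x5
print(pure_type(index, x7))                  # type of x7
```

Under these assumptions the three printed types agree with the entries for x4, x5, and x7 in the translated environment above.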

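Theorem 3.3.1 (Types Preservation).

$$\begin{array}{rcl}
\vdash^{src}_{obj\,q} q : Q &\rightarrow& \vdash^{src}_{mon\,q} \llbracket q \rrbracket^{src}_{obj\,q\,N} : \llbracket Q \rrbracket^{src}_{obj\,Q\,N} \\[4pt]
\Gamma \vdash^{src\ \rho}_{obj\,e} e : E &\rightarrow& \llbracket \Gamma \rrbracket^{src}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \llbracket e \rrbracket^{src}_{obj\,e\,N} : \llbracket E \rrbracket^{src}_{obj\,E\,N} \\[4pt]
\Gamma \vdash^{src\ \rho}_{obj\,p} p : P &\rightarrow& \llbracket \Gamma \rrbracket^{src}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,p} \llbracket p \rrbracket^{src}_{obj\,p\,N} : \llbracket P \rrbracket^{src}_{obj\,P\,N} \\[4pt]
\vdash^{src\ \rho}_{obj\,v} v : P &\rightarrow& \vdash^{src\ \llbracket \rho \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,v} \llbracket v \rrbracket^{src}_{obj\,v\,N} : \llbracket P \rrbracket^{src}_{obj\,P\,N}
\end{array}$$

Lemma 3.3.1 (Correspondence of Environments).

$$\llbracket \Gamma_2 - \rho_2 \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} = \llbracket \Gamma_2 \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,\Gamma\,N}\ \Box_{\rho_1\rho_0/\rho_2}$$

Lemma 3.3.2 (Correspondence of Pure Types).

$$\llbracket P_0 - \rho_0 \rrbracket^{src\ \rho_1}_{obj\,P\,N} = \llbracket P_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} := \emptyset$$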

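Proof: Theorem 3.3.1. The proof is by induction on object language typing derivations. We present some interesting cases.

$\vdash^{src}_{obj\,v}$-glob-const:
Use an instance of $\vdash^{src}_{mon\,v}$-ret-run for each region variable in the index, and then an instance of $\vdash^{src}_{mon\,v}$-glob-const.

$\vdash^{src}_{obj\,p}$-var:
$$\llbracket \Gamma \rrbracket^{src\ \rho_1\rho_2}_{obj\,\Gamma\,N}\big(\llbracket x \rrbracket^{src\ \rho_1\rho_2}_{obj\,p\,N}\big) = \llbracket \Gamma \rrbracket^{src\ \rho_1\rho_2}_{obj\,\Gamma\,N}(x) = \mathrm{Return}^{\rho_2}\ \llbracket \Gamma(x) \rrbracket^{src\ \rho_1}_{obj\,P\,N} = \llbracket \Gamma(x) \rrbracket^{src\ \rho_1\rho_2}_{obj\,P\,N},$$
with x occurring in Γ between the declarations of ρ1 and ρ2. By $\vdash^{src}_{obj\,p}$-var, Γ(x) = P, so we can apply $\vdash^{src}_{mon\,p}$-var.

$\vdash^{src}_{obj\,e}$-pure:
By induction the translation of the pure antecedent judgment is derivable. Use a single application of $\vdash^{src}_{mon\,e}$-pure. The sequence of return forms added by the translation of the expression corresponds to that added to the translation of the pure antecedent by $\vdash^{src}_{mon\,e}$-pure.

$\vdash^{src}_{obj\,e}$-alloc / $\vdash^{src}_{obj\,b}$-λ:
Let the index be ρ1 ρ0 ρ2, with the allocation taking place at ρ0. By $\vdash^{src}_{obj\,e}$-alloc and $\vdash^{src}_{obj\,b}$-λ we have a derivation of
$$(\Gamma - \rho_2)\,\{x \mapsto P_{0.1}\} \vdash^{src\ \rho_1\rho_0}_{obj\,e} e_0 : P_{0.2} \mathbin{!} T_0 .$$
By induction, we get a derivation of
$$\llbracket (\Gamma - \rho_2)\,\{x \mapsto P_{0.1}\} \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho_1\rho_0 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \llbracket e_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} : \llbracket T_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,T\,N}\ \llbracket P_{0.2} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} .$$
By definition of the translation,
$$\llbracket (\Gamma - \rho_2)\,\{x \mapsto P_{0.1}\} \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} = \mathrm{let}\ \Gamma_1\,\{{\between}\Gamma_0\} = \llbracket (\Gamma - \rho_2) \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N}\ \mathrm{in}\ \Gamma_1\,\{{\between}\Gamma_0\,\{x \mapsto \llbracket P_{0.1} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}\}\} .$$
We can apply $\vdash^{src}_{mon\,b}$-λ and $\vdash^{src}_{mon\,e}$-alloc to obtain a derivation of
$$\llbracket \Gamma - \rho_2 \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho_1\rho_0 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \langle \lambda x.\llbracket e_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} \rangle : \mathrm{St}^{\{alloc\}}\,(\mathrm{St}^{\emptyset}\ \mathrm{Id})\ \big(\llbracket P_{0.1} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} \Rightarrow \llbracket T_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,T\,N}\ \llbracket P_{0.2} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}\big) .$$
By Lemma 3.3.1, we can apply $\vdash^{src}_{mon\,e}$-ret-run once for each region variable in ρ2 to obtain
$$\llbracket \Gamma \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho_1\rho_0\rho_2 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \mathrm{return}^{\rho_2}\ \langle \lambda x.\llbracket e_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} \rangle : \mathrm{St}^{\emptyset}\,(\mathrm{St}^{\{alloc\}}\,(\mathrm{St}^{\emptyset}\ \mathrm{Id}))\ \mathrm{Return}^{\rho_2}\ \big(\llbracket P_{0.1} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} \Rightarrow \llbracket T_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,T\,N}\ \llbracket P_{0.2} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}\big) .$$

$\vdash^{src}_{obj\,e}$-alloc / $\vdash^{src}_{obj\,b}$-ref:
Let the index be ρ1 ρ0 ρ2, with the allocation taking place at ρ0. By $\vdash^{src}_{obj\,e}$-alloc and $\vdash^{src}_{obj\,b}$-ref we have a derivation of $\Gamma \vdash^{src\ \rho_1\rho_0\rho_2}_{obj\,e} e : P^{\rho_1\rho_0} \mathbin{!} T$. By induction, there is a derivation of
$$\llbracket \Gamma \rrbracket^{src}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho_1\rho_0\rho_2 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \llbracket e \rrbracket^{src}_{obj\,e\,N} : \llbracket T \rrbracket^{src}_{obj\,T\,N}\ \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N},$$
in which $\llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N} = \mathrm{Return}^{\rho_2}\ \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}$. Let $\Gamma\,\{{\between}\Gamma\} = \llbracket \Gamma \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,\Gamma\,N}$. We select a fresh variable x and derive
$$\Gamma\,\{{\between}\Gamma\,\{x \mapsto \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N}\}\}\ \Box_{\rho_1\rho_0/\rho_2} \vdash^{src\ \llbracket \rho_1\rho_0 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,p} x : \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} .$$
We then apply $\vdash^{src}_{mon\,b}$-ref and $\vdash^{src}_{mon\,e}$-alloc to derive
$$\Gamma\,\{{\between}\Gamma\,\{x \mapsto \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N}\}\}\ \Box_{\rho_1\rho_0/\rho_2} \vdash^{src\ \llbracket \rho_1\rho_0 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \langle \mathrm{ref}\ x \rangle : \mathrm{St}^{\{alloc\}}\,(\mathrm{St}^{\emptyset}\ \mathrm{Id})\ \mathrm{Ref}\ \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} .$$
We apply $\vdash^{src}_{mon\,e}$-ret-run for each region variable in ρ2, deriving
$$\Gamma\,\{{\between}\Gamma\,\{x \mapsto \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,P\,N}\}\} \vdash^{src\ \llbracket \rho_1\rho_0\rho_2 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \mathrm{return}^{\rho_2}\ \langle \mathrm{ref}\ x \rangle : \mathrm{St}^{\emptyset}\,(\mathrm{St}^{\{alloc\}}\,(\mathrm{St}^{\emptyset}\ \mathrm{Id}))\ \mathrm{Return}^{\rho_2}\ \mathrm{Ref}\ \llbracket P^{\rho_1\rho_0} \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N},$$
then apply $\vdash^{src}_{mon\,e}$-let, using the derivation by induction for the definition antecedent and this latter derivation for the body antecedent.

$\vdash^{src}_{obj\,e}$-deref, $\vdash^{src}_{obj\,e}$-set, $\vdash^{src}_{obj\,e}$-app:
These are similar. For $\vdash^{src}_{obj\,e}$-set and $\vdash^{src}_{obj\,e}$-app, we require an analogue of Lemma 1.3.1 (Weakening), for the same reason as in the translation of the nonencapsulated object language.

$\vdash^{src}_{obj\,e}$-letregion:
By $\vdash^{src}_{obj\,e}$-letregion, we have a derivation of $\Gamma_1\,\{\rho_0\} \vdash^{src\ \rho_1\rho_0}_{obj\,e} e_0 : P_0 \mathbin{!} T_0$. By induction and the definition of the static translation, we can derive
$$\llbracket \Gamma_1 \rrbracket^{src\ \rho_1}_{obj\,\Gamma\,N}\ \{\emptyset\} \vdash^{src\ \llbracket \rho_1\rho_0 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \llbracket e_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} : \mathrm{St}^{\{\iota \mid (\iota\,@\,\rho_0) \in T_0\}}\ \llbracket T_0 - \rho_0 \rrbracket^{src\ \rho_1}_{obj\,T\,N}\ \llbracket P_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} .$$
Applying $\vdash^{src}_{mon\,e}$-run, we obtain a derivation of
$$\llbracket \Gamma_1 \rrbracket^{src\ \rho_1}_{obj\,\Gamma\,N} \vdash^{src\ \llbracket \rho_1 \rrbracket^{src}_{obj\,\rho\,N}}_{mon\,e} \mathrm{run}\ \llbracket e_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,e\,N} : \llbracket T_0 - \rho_0 \rrbracket^{src\ \rho_1}_{obj\,T\,N}\ \llbracket P_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N} := \emptyset .$$
By the definition of the translation of letregion and Lemma 3.3.2, we are done.

Proof: Lemma 3.3.1. Let $\Gamma_2 = \Gamma_1\,\{\rho_0\}\,\{x_0 \mapsto P_0\}\,\{\rho_2\}\,\{x_2 \mapsto P_2\}$. With harmless abuse of notation, we then have $\Gamma_2 - \rho_2 = \Gamma_1\,\{\rho_0\}\,\{x_0 \mapsto P_0\}\,\{x_2 \mapsto P_2 - \rho_2\}$. Then
$$\llbracket \Gamma_2 - \rho_2 \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N} = \llbracket \Gamma_1 \rrbracket^{src\ \rho_1}_{obj\,\Gamma\,N}\ \{\emptyset\,\{x_0 \mapsto \llbracket P_0 \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}\}\,\{x_2 \mapsto \llbracket P_2 - \rho_2 \rrbracket^{src\ \rho_1\rho_0}_{obj\,P\,N}\}\} .$$
Let $\llbracket \Gamma_2 \rrbracket^{src\ \rho_1\rho_0\rho_2}_{obj\,\Gamma\,N} = \Gamma_1\,\{{\between}\Gamma_0\}\,\{{\between}\Gamma_2\}$. Then, by multiple applications of Lemma 3.3.2, this is equivalent to $\llbracket \Gamma_1 \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N}\,\{{\between}\Gamma_0 \mathbin{\Box} {\between}\Gamma_2\}$, or $\llbracket \Gamma_1 \rrbracket^{src\ \rho_1\rho_0}_{obj\,\Gamma\,N}\ \Box_{\rho_1\rho_0/\rho_2}$.

Proof: Lemma 3.3.2. This breaks down to a tedious proof by cases, for P0 either G, ∅, B @ ρ0, or B @ ρ1.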

We have now presented a pair of languages with encapsulation of effects, an object language and a monadic language, and demonstrated a translation between them. But we are in a similar situation to that at the end of Chapter 1, in that we must take on faith that these languages can be evaluated to perform effects. Next, we continue to follow the methodology of the previous part as we demonstrate such evaluation strategies. Doing so will force us to confront head-on the tree structure of monadic stores.


CHAPTER 4

Intermediate Languages

In this chapter we take the source languages of Chapter 3 with expression-level encapsulation and provide them with reduction semantics as done for the simple languages in Chapter 2. The object language is again similar to that of Calcagno, Helsen and Thiemann [5], although there are several differences. Foremost, the store type of Calcagno et al. is unrestricted, in that values allocated at one region may be stored in reference cells in any region. By contrast, we present rules corresponding to a restricted store in which values allocated at (or otherwise referencing) a region may not be stored in outer regions. Our support for allocations that escape their lexical context develops into a support for dangling pointers. Additionally, our lexical tracking of region information allows our proof of type soundness to proceed where theirs fails. A final difference is our use of evaluation contexts by contrast to their search rules. Wadler and Thiemann [65] use evaluation contexts and traces but do not consider encapsulation. The monadic language and translation that form the remainder of this part are again largely original.

1. Object Language

1.1. Dynamics.

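Figure 4.1.1 presents the object intermediate language syntax, an extension of the syntax of the source object language (Figure 3.1.1) to support a reduction semantics. Specifically, we include support for addresses and make the deallocation of regions explicit. In addition to region variables we now have region names r. Region names refer to allocated regions of memory. Region indicators ϱ include region variables and region names. Region indicators replace the region variables of the source language in indicating a region at which monadic operations are to operate. We use the freeregion r after e form to indicate that region r should be deallocated after processing of e is complete, i.e., it corresponds to an active letregion form.

$$\begin{array}{rcl}
\langle q; s \rangle \in {}^{imd}_{obj}\langle q; s \rangle &=& {}^{imd}_{obj}q \times {}^{imd}_{obj}s^{r_3} \\
q \in {}^{imd}_{obj}q &::=& e^{\epsilon} \\
\langle e; s \rangle^{r} \in {}^{imd}_{obj}\langle e; s \rangle^{r} &=& {}^{imd}_{obj}e^{r} \times {}^{imd}_{obj}s^{r\,r_3} \\
e^{\varrho} \in {}^{imd}_{obj}e^{\varrho} &::=& p^{\varrho} \mid \mathrm{let}\ x = e^{\varrho}\ \mathrm{in}\ e^{\varrho} \mid \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e^{\varrho\rho_0} \\
e^{r} \in {}^{imd}_{obj}e^{r} &::=& \mathrm{freeregion}\ r_0\ \mathrm{after}\ e^{r\,r_0} \\
e^{\varrho} \in {}^{imd}_{obj}e^{\varrho_1\varrho_0\varrho_2} &::=& @\,\varrho_0\ b^{\varrho_1\varrho_0/\varrho_2} \mid @\,\varrho_0\ \mathrm{deref}\ e^{\varrho} \mid @\,\varrho_0\ \mathrm{set}\ e^{\varrho}\ \mathrm{to}\ e^{\varrho} \mid @\,\varrho_0\ e^{\varrho}\ e^{\varrho} \\
b^{\varrho} \in {}^{imd}_{obj}b^{\varrho_1\varrho_0/\varrho_2} &::=& \langle \mathrm{ref}\ e^{\varrho_1\varrho_0\varrho_2} \rangle \mid \langle \lambda x.e^{\varrho_1\varrho_0} \rangle \\
d^{r} \in {}^{imd}_{obj}d^{r_1 r_0} &::=& \langle \mathrm{ref}\ v^{r} \rangle \mid \langle \lambda x.e^{r} \rangle \\
p^{\varrho} \in {}^{imd}_{obj}p^{\varrho} &::=& v^{\varrho} \mid x \\
\langle v; s \rangle^{r} \in {}^{imd}_{obj}\langle v; s \rangle^{r} &=& {}^{imd}_{obj}v^{r} \times {}^{imd}_{obj}s^{r} \\
v^{\varrho} \in {}^{imd}_{obj}v^{\varrho} &::=& g \mid {}^{\emptyset}a^{\varrho} \\
{}^{\emptyset}a^{\varrho} \in {}^{imd}_{obj}{}^{\emptyset}a^{\varrho} &::=& \langle \emptyset, o \rangle \mid a^{\varrho} \\
a^{\varrho} \in {}^{imd}_{obj}a^{r_1 r_0 \varrho_2} &::=& \langle r_0, o \rangle \\
\varrho \in {}^{imd}_{obj}\varrho &::=& \rho \mid r_0 \\
s^{r} \in {}^{imd}_{obj}s^{\epsilon} &::=& \emptyset \\
s^{r} \in {}^{imd}_{obj}s^{r_1 r_0} &::=& s^{r}\,\{r_0 \mapsto {\between}s\} \\
{\between}s^{r} \in {}^{imd}_{obj}{\between}s^{r_1 r_0} &::=& \emptyset\,\{o \mapsto d^{r}\} \\
t^{r} \in {}^{imd}_{obj}t^{r} &=& [\,{}^{imd}_{obj}f^{r}\,] \\
f^{r} \in {}^{imd}_{obj}f^{r} &::=& (\iota\ @\ a^{r}) \\
\iota \in {}^{imd}_{obj}\iota &::=& \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec} \\
g \in {}^{imd}_{obj}g \qquad \rho \in {}^{imd}_{obj}\rho && r \in {}^{imd}_{obj}r \qquad o \in {}^{imd}_{obj}o \qquad x \in {}^{imd}_{obj}x
\end{array}$$

Figure 4.1.1. Object Intermediate Language Syntax

As in the simple object intermediate language (Figure 2.1.1) the class of values is extended with locations of allocated storables. Addresses a represent existing locations and indicate both an offset and a region name. A dangling pointer is similar but uses ∅ instead of a region name. These are similar to the dangling pointer programs of the monadic language of Section 2.2, but since encapsulation now takes place at the level of expressions, they are available as values. They represent allocations that escape the construct in which their region variable is declared, as described in Section 2.5 of the Introduction. Generalized addresses ${}^{\emptyset}a$ include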

addresses and dangling pointers. As in the simple object intermediate language, there are syntactic classes for program, expression, pure, and value configurations. The stores of that language are now region stores ≬s, i.e., finite functions mapping offsets to storable values. Storable values d are defined as before, with lambda expressions here having neither free program variables nor freeregion constructs, although they may have free region names. Stores s are now finite functions mapping region names to region stores. We thus preclude duplicate occurrences of region names in a store. Atomic effects f now record the occurrence of an action ι at an address.

Expressions, pures, and values are indexed with a sequence of region indicators that, unlike that of the source language, includes region names and describes the context within freeregion as well as letregion forms. We continue to refer to such sequences as lexical indexes. The corresponding configurations as well as addresses, region stores, traces, and atomic traces are all lexically indexed by a sequence of region names. As in the source language, the region to which effectful operations apply must occur in the index of the expression. Prestorables are indexed with a partitioned sequence of region indicators, similar to the partitioned sequence of region variables of the source language. Addresses (and thus traces) are constrained to point to regions in their lexical index. Once expressions are indexed, stores must be indexed as well because they contain procedures whose code is indexed based on its position in the store. Stores are indexed by a sequence of region names that represents a lexical context within freeregion constructs as well as the sequence of names of included region stores.

We clearly have that ${}^{imd}_{obj}v^{\varrho} \subseteq {}^{imd}_{obj}e^{\varrho}$ and thus ${}^{imd}_{obj}d^{r} \subseteq {}^{imd}_{obj}b^{r/}$. The components of value configurations are constrained to share the same index. A program, however, may be combined with a store of any index to form a program configuration, while for an expression and store to be combined as an expression configuration, the lexical index of the expression (and configuration) must be a prefix of the store index. If these latter two are somewhat unsatisfying, the situation will be rectified in Chapter 6, where we introduce nonlexical indexes.

We refer to expressions and prestorables without freeregion forms as active source expressions and active source prestorables. These intermediate language expressions are called "pure" by Calcagno et al. [5]. Our operational semantics will ensure that nonevaluation contexts such as the bodies of let forms and procedures only contain active source expressions.

As in the simple object intermediate language, we require substitution of values for program variables ($e^{r}[x' := v'^{\,r}]$). We define substitution nontraditionally in that to substitute $\langle \lambda x.e \rangle^{r_1 r_0/r_2}[x_2 := \langle r_2, o \rangle]$ for $r_2 \in \vec{r}_2$ with $x \neq x_2$, we obtain $\langle \lambda x.e[x_2 := \langle \emptyset, o \rangle] \rangle$. It will be convenient¹ to allow this not just in active source expressions, but in more general expressions as well. We also require two additional forms of substitution. Region allocation ($e_0[\rho_0 := r_0]$) substitutes region name r0 for region variable ρ0 in an active source expression e0. Region deallocation ($v_0[r_0 := \emptyset]$) substitutes ∅ for region name r0 in value v0, creating dangling pointers. We use frv( ) to refer to the free region variables of an active source expression and frn( ) to refer to the free region names of a value. These are all defined in the obvious way. We will also define fr(e) as frv(e) ∪ frn(e), for active source expression e. We make no consideration here for an extension of these forms of substitution to more general expressions, although this will be necessary for the latter one in Chapter 6.

We apply the definitions of update, application, and domain of region stores to stores. Following Tofte and Talpin, we allow stores to be accessed at an address using the following abbreviations:

    extending s at ⟨r, o⟩ with d, written s{⟨r, o⟩ ↦ d}, abbreviates updating s at r with s(r){o ↦ d};
    updating s at ⟨r, o⟩ with d abbreviates updating s at r with the update of s(r) at o with d;
    s(⟨r, o⟩) abbreviates s(r)(o).

Thus, the use of any of these forms implicitly assumes that r ∈ Dom(s). We can also check the domain of a store for an address, i.e., a ∈ Dom(s) exactly when r ∈ Dom(s) ∧ o ∈ Dom(s(r)), where a = ⟨r, o⟩.

The reduction rules in Figure 4.1.2 allow state operations on region names. They are used to demonstrate reduction relations between expression configurations indexed by a sequence of region names. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let is the same as in the object language without encapsulation (Figure 2.1.2). Rules $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc, $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref, $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set, and $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-λ are similar, but operate on a store with a full address rather than on a region store with an offset. They require that the expression configuration be indexed by a nonempty sequence (including r0) and now carry an implicit requirement that r0 ∈ Dom(s1). $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc now implicitly requires that o ∉ Dom(s(r0)), while the other three implicitly require o ∈ Dom(s(r0)).

Our two additional forms of substitution are used to implement region allocation and region deallocation. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-letregion replaces a letregion construct with a freeregion construct, substituting a fresh region name for occurrences of the region variable in the body and extending the store to bind that name to an empty region store.²

¹ This is just true of the monadic language of the following section.
² The region name is fresh because the store cannot define duplicate region names.
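The two-level store and its address abbreviations can be sketched with nested dictionaries. The following is a minimal illustration, not the dissertation's implementation: the function names, the encoding of storables as tagged tuples, and the choice of the next unused integer as the fresh offset are all assumptions.

```python
def in_dom(store, addr):
    """a is in Dom(s) exactly when r is in Dom(s) and o is in Dom(s(r))."""
    r, o = addr
    return r in store and o in store[r]

def lookup(store, addr):
    """s(<r, o>) abbreviates s(r)(o)."""
    r, o = addr
    return store[r][o]

def extend(store, addr, d):
    """Extension at an address: r must exist and o must be unused in s(r)."""
    r, o = addr
    assert r in store and o not in store[r]
    return {**store, r: {**store[r], o: d}}

def update(store, addr, d):
    """Update at an address: the address must already be in Dom(s)."""
    r, o = addr
    assert in_dom(store, addr)
    return {**store, r: {**store[r], o: d}}

def letregion(store, r0):
    """Bind a fresh region name to an empty region store."""
    assert r0 not in store
    return {**store, r0: {}}

def alloc(store, r0, d0):
    """Allocate d0 in region r0; the trace records (alloc @ a0)."""
    a0 = (r0, len(store[r0]))  # assumption: offsets are handed out as 0, 1, 2, ...
    return a0, extend(store, a0, d0), [("alloc", a0)]

def deref(store, a0):
    """Read a reference cell; the trace records (read @ a0)."""
    tag, v0 = lookup(store, a0)
    assert tag == "ref"
    return v0, store, [("read", a0)]

def set_ref(store, a0, v0):
    """Overwrite a reference cell; the trace records (write @ a0)."""
    assert lookup(store, a0)[0] == "ref"
    return "unit", update(store, a0, ("ref", v0)), [("write", a0)]
```

A short usage example: `s = letregion({}, "r1")` followed by `a, s, t = alloc(s, "r1", ("ref", "unit"))` yields the address `("r1", 0)`, after which `deref(s, a)` returns the stored value together with an unchanged store and a one-element trace.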

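$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r}$:
$$\langle \mathrm{let}\ x = v\ \mathrm{in}\ e;\ s \rangle \overset{[\,]}{\rightharpoonup} \langle e[x := v];\ s \rangle$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-deref, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1 r_0 r_2}$:
$$\frac{a_0 = \langle r_0, o \rangle \qquad s(a_0) = \langle \mathrm{ref}\ v_0 \rangle}{\langle @\,r_0\ \mathrm{deref}\ a_0;\ s \rangle \overset{[(\mathrm{read}\,@\,a_0)]}{\rightharpoonup} \langle v_0;\ s \rangle}$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-set, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1 r_0 r_2}$:
$$\frac{a_0 = \langle r_0, o \rangle \qquad s(a_0) = \langle \mathrm{ref}\ v_{0.1} \rangle}{\langle @\,r_0\ \mathrm{set}\ a_0\ \mathrm{to}\ v_0;\ s \rangle \overset{[(\mathrm{write}\,@\,a_0)]}{\rightharpoonup} \langle \mathrm{unit};\ s\ \text{updated at}\ a_0\ \text{to}\ \langle \mathrm{ref}\ v_0 \rangle \rangle}$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-alloc, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1 r_0 r_2}$:
$$\frac{a_0 = \langle r_0, o \rangle}{\langle @\,r_0\ d_0;\ s \rangle \overset{[(\mathrm{alloc}\,@\,a_0)]}{\rightharpoonup} \langle a_0;\ s\,\{a_0 \mapsto d_0\} \rangle}$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-λ, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1 r_0 r_2}$:
$$\frac{a_0 = \langle r_0, o \rangle \qquad s(a_0) = \langle \lambda x.e_0 \rangle}{\langle @\,r_0\ a_0\ v_0;\ s \rangle \overset{[(\mathrm{exec}\,@\,a_0)]}{\rightharpoonup} \langle e_0[x := v_0];\ s \rangle}$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-letregion, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1}$:
$$\langle \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0;\ s_1 \rangle \overset{[\,]}{\rightharpoonup} \langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0[\rho_0 := r_0];\ s_1\,\{r_0 \mapsto \emptyset\} \rangle$$

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion, relating configurations in ${}^{imd}_{obj}\langle e;s\rangle^{r_1}$:
$$\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0;\ s_1\,\{r_0 \mapsto {\between}s_0\} \rangle \overset{[\,]}{\rightharpoonup} \langle v_0[r_0 := \emptyset];\ s_1 \rangle$$

Figure 4.1.2. Object Language Expression Configuration Reduction Rules

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion drops the freeregion construct after the body is a value, and substitutes ∅ for any remaining occurrences of the region name in the value.³ Both carry an implicit requirement that r0 ∉ Dom(s). The order of region stores in the store is not important, so for convenience, $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-letregion appends the new region store to the end of the store. $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion assumes that the region store to be deallocated is the uppermost region store of the store.

We define evaluation contexts to guide the CBV reduction of expressions. We define program expression contexts and expression contexts as indexed versions of those of the object language without encapsulation. A program expression context is a program with a hole expecting an expression. Metavariables for program expression contexts take the form $[e^{r_1}]_q$, representing a program defining regions r1 around a hole expecting an expression. We refer to such metavariables as $[e]^{r_1}_q$.

³ With our restriction on the store, it can remain unchanged in the residual. An unrestricted store would require a similar substitution. Our eager deallocation of regions in Chapter 6 will require such a substitution in part of the store.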

An expression context is an expression with a hole expecting another expression. We refer to an empty expression context as $[e^{r}]$, assuming a surrounding context defining regions r. Metavariables for expression contexts take the form $[e^{r\,r_1}]^{r}_{e}$, representing an expression defining regions r1 around a hole expecting another expression, assuming a surrounding context defining regions r. We refer to such metavariables as $[e]^{r\,\hat{}\,r_1}_{e}$. We refer to the filling of a context $[e]^{r\,\hat{}\,r_1}_{e}$ with an expression $e_1^{r\,r_1}$ as $[e]_{e}[e_1]$.

We refer to an empty trace context as $[t^{r}]$. It expects and returns unchanged a trace over r. Metavariables for trace contexts take the form $[t^{r\,r_1}]^{r}_{t}$, representing a restriction of actions at regions r1 around a hole expecting a trace of actions at r r1. We refer to such metavariables as $[t]^{r\,\hat{}\,r_1}_{t}$. Filling a trace context is indicated similarly to filling an expression context. The notions of expression contexts and trace contexts are combined to define expression/trace contexts of the form $([e]_{e} \mathbin{!} [t]_{t})^{r\,\hat{}\,r_1}$, corresponding to the concatenation $([e]^{r\,\hat{}\,r_1}_{e} \mathbin{!} [t]^{r\,\hat{}\,r_1}_{t})$.

$$\begin{array}{rcl}
{}^{\to^*}[e]^{\epsilon}_{q} \in {}^{imd}_{obj}\,{}^{\to^*}[e]^{\epsilon}_{q} &::=& [e] \\[6pt]
({}^{\to^*}[e]_{e} \mathbin{!} {}^{\to^*}[t]_{t})^{r\,\hat{}\,\epsilon} \in {}^{imd}_{obj}({}^{\to^*}[e]_{e} \mathbin{!} {}^{\to^*}[t]_{t})^{r\,\hat{}\,\epsilon} &::=& ([e] \mathbin{!} [t]) \mid (\mathrm{let}\ x = [e]\ \mathrm{in}\ e \mathbin{!} [t]) \mid (@\,r_0\ \langle \mathrm{ref}\ [e] \rangle \mathbin{!} [t]) \\
&\mid& (@\,r_0\ \mathrm{deref}\ [e] \mathbin{!} [t]) \mid (@\,r_0\ \mathrm{set}\ [e]\ \mathrm{to}\ e \mathbin{!} [t]) \mid (@\,r_0\ \mathrm{set}\ v\ \mathrm{to}\ [e] \mathbin{!} [t]) \\
&\mid& (@\,r_0\ [e]\ e \mathbin{!} [t]) \mid (@\,r_0\ v\ [e] \mathbin{!} [t]) \\[6pt]
({}^{\to^*}[e]_{e} \mathbin{!} {}^{\to^*}[t]_{t})^{r\,\hat{}\,r_0} \in {}^{imd}_{obj}({}^{\to^*}[e]_{e} \mathbin{!} {}^{\to^*}[t]_{t})^{r\,\hat{}\,r_0} &::=& (\mathrm{freeregion}\ r_0\ \mathrm{after}\ [e] \mathbin{!} [t] - r_0)
\end{array}$$

Figure 4.1.3. CBV Object Language Atomic Expression/Trace Evaluation Contexts

Because our reductions may occur within an evaluation context that performs encapsulation in Figure 4.1.3, we extend expression evaluation contexts as presented in the nonencapsulated object language (Figure 2.1.3) to expression/trace evaluation contexts, which perform trace masking. An expression evaluation context is used first to descend into syntax, and then to rebuild the syntax after reducing. The trace portion of the expression/trace evaluation context is used only in the rebuilding phase. Just as application of an evaluation context adds syntax to an expression, it restricts a trace, removing actions to particular regions.

Figure 4.1.3 presents atomic program expression evaluation contexts and atomic expression/trace evaluation contexts. It differs from Figure 2.1.3 in that it includes indexes, trace contexts, and a case for freeregion contexts. The first clause defines atomic program expression evaluation contexts, expecting an expression with empty index, to include only empty expression contexts. The second clause defines expression/trace evaluation contexts expecting an expression and trace with the same index as the context. In each case, the expression contexts are as in the nonencapsulated object language while the trace context is empty. The third clause defines expression/trace evaluation contexts expecting an expression and trace with an extended lexical index. The only case is for freeregion expression contexts; the corresponding trace context restricts a trace to exclude addresses in the defined region.

The lack of a case for letregion ensures that regions are allocated outermost-first, so freeregion forms may not appear within letregion forms. It follows that in the indexes of expressions reachable through evaluation, region names will always precede region variables. The deterministic nature of these contexts ensures that regions are allocated in a depth-first manner, so that the property that no two freeregion forms occur in parallel within the same (empty or nonempty) sequence of nested freeregion forms is maintained. Our evaluation contexts do not include the scope of any bound program or region variables.

We inductively define expression/trace evaluation contexts over subconfigurations as the least set of expression/trace contexts including these and closed over composition, and define program expression evaluation contexts as the least set including atomic program expression evaluation contexts and closed over composition with the expression context component of expression/trace evaluation contexts.
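The effect of leaving a freeregion context can be sketched as follows. This is an illustrative simplification rather than the dissertation's formulation: it fuses the freeregion reduction step (deallocating the region and turning escaping addresses into dangling pointers) with the trace masking performed by the freeregion evaluation context. Values are assumed to be either constants or `(region, offset)` pairs, with `None` standing in for the ∅ of a dangling pointer; all names are hypothetical.

```python
def dangle(value, r0):
    """Substitute the empty region for r0: addresses into r0 become dangling."""
    if isinstance(value, tuple) and len(value) == 2 and value[0] == r0:
        return (None, value[1])  # None plays the role of the empty region marker
    return value

def mask(trace, r0):
    """Restrict a trace by dropping the atomic effects at addresses in region r0."""
    return [(action, addr) for (action, addr) in trace if addr[0] != r0]

def freeregion(r0, body_result):
    """Finish `freeregion r0 after v0`, given the body's (value, store, trace).

    The region store for r0 is dropped, the value has r0 replaced by the
    empty region, and effects at r0 are masked from the trace.
    """
    v0, store, trace = body_result
    rest = {r: rs for r, rs in store.items() if r != r0}
    return dangle(v0, r0), rest, mask(trace, r0)

store = {"r1": {0: ("ref", "unit")}, "r2": {0: ("ref", ("r1", 0))}}
trace = [("alloc", ("r1", 0)), ("alloc", ("r2", 0)), ("read", ("r1", 0))]
value, rest, visible = freeregion("r2", (("r2", 0), store, trace))
```

In this example the address escaping region r2 becomes a dangling pointer, the region store for r2 is removed, and only the effects at r1 remain visible to the surrounding context. Note that the sample store respects the restricted-store discipline: the cell in the inner region r2 refers to the outer region r1, not the other way around.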
We refer to the reflexive, transitive, and contextual closure of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle^{r}}$ as $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle^{r}}$. We obtain notions $\overset{t}{=}$ of equality of program and expression configurations as in the language without encapsulation. We again extend our notion of traces to infinite lists and also define $\to$ to be the restriction of $\to^{*}$ by refraining from the use of $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-reflex and $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-trans.

We say $\langle q;s\rangle \Downarrow \langle q';s'\rangle$, i.e., $\langle q;s\rangle$ converges to $\langle q';s'\rangle$, if and only if $\langle q;s\rangle \to^{*} \langle q';s'\rangle$ and there are no $q''$ and $s''$ such that $\langle q';s'\rangle \to \langle q'';s''\rangle$. Then we can say $\langle q;s\rangle \Uparrow$ if and only if for each $q'$ and $s'$ such that $\langle q;s\rangle \to^{*} \langle q';s'\rangle$, there exist $q''$ and $s''$ such that $\langle q';s'\rangle \to \langle q'';s''\rangle$.
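The convergence and divergence definitions can be illustrated with a minimal Python sketch. It assumes a deterministic single-step function that returns `None` when no step applies; the names `converge`, `countdown`, and `spin` are hypothetical and stand in for the reduction relation, which is not modelled here. Divergence is only approximated by a step budget, since it cannot be decided in general.

```python
from typing import Callable, Optional, TypeVar

C = TypeVar("C")


def converge(step: Callable[[C], Optional[C]], start: C, fuel: int = 1000) -> Optional[C]:
    """Iterate `step` until no further step applies.

    Returns the configuration from which no step is possible, or None if
    no such configuration was confirmed within `fuel` steps.
    """
    current = start
    for _ in range(fuel):
        nxt = step(current)
        if nxt is None:  # there is no successor configuration
            return current
        current = nxt
    return None


def countdown(n: int) -> Optional[int]:
    """Toy relation: n steps to n - 1 until 0 is reached."""
    return n - 1 if n > 0 else None


def spin(n: int) -> Optional[int]:
    """Toy relation in which every configuration has a successor."""
    return n
```

Here `converge(countdown, 3)` stops at the configuration with no successor, while `converge(spin, 1, fuel=50)` exhausts its budget, which is the finite approximation of divergence used in this sketch.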

The rules in Figure 4.1.4 are indexed versions of those of the simple object language without encapsulation in Figure 2.1.4. Rule $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle q;s\rangle}$-cntxt allows the expression within a program expression context to evaluate with any index. For the subexpressions to form configurations with the store in the antecedent, it is necessary that $r_1$ be a prefix of the indexes of $s$ and $s'$. In fact, our program expression evaluation contexts have empty index, so $r_1$ is always empty. Intuitively, if a reduction performs some effect, its trace is masked as we view the transformation of the redex in its surrounding context including the freeregion construct that declares the affected region. Thus, traces are fully masked at the level of programs.

\[
\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle q;s\rangle}\text{-cntxt}\quad
\dfrac{\langle e_1; s\rangle^{r_1} \xrightarrow{\,t_1\,}{}^{*} \langle e_1'; s'\rangle^{r_1}}
      {\langle \overset{\to*}{[e]}_{q}[e_1]; s\rangle \to^{*} \langle \overset{\to*}{[e]}_{q}[e_1']; s'\rangle}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}\text{-cntxt}\quad
\dfrac{\langle e_1; s\rangle^{r\,r_1} \xrightarrow{\,t_1\,}{}^{*} \langle e_1'; s'\rangle^{r\,r_1}}
      {\langle \overset{\to*}{[e]}_{e}[e_1]; s\rangle^{r} \xrightarrow{\,\overset{\to*}{[t]}_{t}[t_1]\,}{}^{*} \langle \overset{\to*}{[e]}_{e}[e_1']; s'\rangle^{r}}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}\text{-reflex}\quad
\dfrac{}{\langle e; s\rangle^{r} \xrightarrow{\,[\,]\,}{}^{*} \langle e; s\rangle^{r}}
\qquad
\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}\text{-step}\quad
\dfrac{\langle e; s\rangle^{r} \overset{t}{\rightharpoonup} \langle e'; s'\rangle^{r}}
      {\langle e; s\rangle^{r} \xrightarrow{\,t\,}{}^{*} \langle e'; s'\rangle^{r}}
\]
\[
\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}\text{-trans}\quad
\dfrac{\langle e; s\rangle^{r} \xrightarrow{\,t\,}{}^{*} \langle e'; s'\rangle^{r}
       \qquad
       \langle e'; s'\rangle^{r} \xrightarrow{\,t'\,}{}^{*} \langle e''; s''\rangle^{r}}
      {\langle e; s\rangle^{r} \xrightarrow{\,t+t'\,}{}^{*} \langle e''; s''\rangle^{r}}
\]

Figure 4.1.4. Object Language Multiple Deep Reduction Rules

Rule $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-cntxt allows the expression within an expression/trace context to evaluate with an index extended (with $r_1$) from that ($r$) of the original expression and trace. For the subexpressions to form configurations with the store in the antecedent, it is necessary that $r\,r_1$ be a prefix of the indexes of $s$ and $s'$. It performs effect masking by applying the trace component as well as the expression component of the context. We emphasize that although the lexical index is bounded by the size of the program and determines which regions a piece of code may access, the total number of regions that may potentially be created is in fact unbounded, as the same letregion construct may be executed repeatedly. However, the currently existing regions are the region names defined in the store index.

In Figures 4.1.5 through 4.1.7 we demonstrate this dynamic semantics on the sample source program typed in Figures 3.1.12 and 3.1.13. The program is reduced as follows. We first apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-letregion, creating a freeregion context into which we descend, and allocating region $r_1$ on the store. We then descend into the let definition context and apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-alloc for the outer cell, registering an allocation at $o_1$ at the new region and updating this uppermost region store to include the new cell.

[Figure 4.1.5. Sample Object Language Reduction, I. Derivation tree reducing the initial configuration ⟨letregion ρ1 in let x2 = @ ρ1 ⟨ref unit⟩ in letregion ρ2 in let x3 = @ ρ2 ⟨ref x2⟩ in @ ρ1 @ ρ1 ⟨λx1. @ ρ1 deref letregion ρ3 in x2⟩ @ ρ2 deref x3; ∅⟩ to ⟨unit; ∅⟩ with trace [ ]. Inside the freeregion r1 context the trace is [(alloc @ ⟨r1, o1⟩)] followed by [(alloc @ ⟨r1, o2⟩), (exec @ ⟨r1, o2⟩), (read @ ⟨r1, o1⟩)]; the allocation of the outer cell yields ⟨⟨r1, o1⟩; ∅ {r1 ↦ ∅ {o1 ↦ ⟨ref unit⟩}}⟩. The body of the let is reduced in subderivation d1.]

[Figure 4.1.6. Sample Object Language Reduction, II. Subderivation d1: reduction of the letregion ρ2 construct under store ∅ {r1 ↦ ∅ {o1 ↦ ⟨ref unit⟩}}, allocating the inner cell ⟨ref ⟨r1, o1⟩⟩ at ⟨r2, o1⟩ with trace [(alloc @ ⟨r2, o1⟩)], and ending in ⟨⟨r1, o1⟩; s1⟩ after the freeregion r2 context masks the actions at r2. The operator and operand are reduced in subderivations d2 and d3.]

[Figure 4.1.7. Sample Object Language Reduction, III. Subderivation d2: allocation of the function at ⟨r1, o2⟩ with trace [(alloc @ ⟨r1, o2⟩)], yielding store s2. Subderivation d3: dereference of ⟨r2, o1⟩ with trace [(read @ ⟨r2, o1⟩)], followed by the application step with trace [(exec @ ⟨r1, o2⟩)], the letregion ρ3 and freeregion r3 steps with empty traces (through store s3), and the dereference of ⟨r1, o1⟩ with trace [(read @ ⟨r1, o1⟩)], yielding ⟨unit; s2⟩. The stores are:
s1 = ∅ {r1 ↦ ∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. @ r1 deref letregion ρ3 in ⟨r1, o1⟩⟩}}
s2 = s1 {r2 ↦ ∅ {o1 ↦ ⟨ref ⟨r1, o1⟩⟩}}
s3 = s2 {r3 ↦ ∅}]

Returning to the let definition context, we apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-let. This replaces occurrences of the bound variable ($x_2$) in the body with the address of the new cell. The process is repeated in d1 of Figure 4.1.6 with the next letregion construct, region $r_2$, and the inner reference cell. We then, in d2 in Figure 4.1.7, descend into the operator context. We apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-alloc for the function at $o_2$ at the outer region, registering the allocation and updating the lower region store to include the function. We then, in d3 in Figure 4.1.7, descend into the operand context just to apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-deref, registering a read. After ascending, we apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-app-λ. No substitution for the bound variable $x_1$ is needed; we just register an execution, and then descend into the deref context for the outer cell. We apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-letregion, allocating $r_3$ on the store, and apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-freeregion, immediately deallocating the new region. Returning to the dereference, we apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-deref, registering a read. We then ascend to the inner freeregion construct, masking all actions at the inner region $r_2$ from the trace, and apply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-freeregion, removing the upper region store. We next ascend to the outer freeregion construct, masking the remaining actions at the outer region from the trace. Finally, we reapply $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-freeregion, removing the lower region store.
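The masking steps of this walkthrough can be sketched in Python. This is only an illustration under simplifying assumptions: a trace is a flat list of (action, region, offset) triples, and the hypothetical helper `mask` stands in for the trace component of a freeregion context. The actions listed are those registered in the walkthrough, in the order it mentions them.

```python
from typing import List, Tuple

Action = Tuple[str, str, str]  # (kind, region name, offset)


def mask(trace: List[Action], region: str) -> List[Action]:
    """Drop every action performed at `region`, keeping the order of the rest."""
    return [action for action in trace if action[1] != region]


# Actions registered while reducing the body of the inner freeregion r2.
inner_body = [
    ("alloc", "r2", "o1"),  # inner reference cell
    ("alloc", "r1", "o2"),  # function allocated at the outer region
    ("read", "r2", "o1"),   # dereference in the operand
    ("exec", "r1", "o2"),   # application of the stored function
    ("read", "r1", "o1"),   # dereference of the outer cell
]

# Leaving freeregion r2 masks the actions at r2.
after_inner = mask(inner_body, "r2")

# The outer freeregion r1 also sees the allocation of the outer cell.
outer_body = [("alloc", "r1", "o1")] + after_inner
after_outer = mask(outer_body, "r1")
```

After the outer mask nothing remains, which matches the observation that traces are fully masked at the level of programs.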

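We now present a few definitions and prove a few facts about them. These will be useful for our proof of type soundness.

Definition 4.1.1 (Immediately Faulty Expression Configurations).

(1) $\langle x; s\rangle^{r}$

(2) $\langle @\,r_0\ \langle\lambda x.e_0\rangle; s\rangle^{r_1 r_0 r_2}$  ($\lambda x.e_0$ not closed)

(3) $\langle @\,r_0\ \mathrm{deref}\ v; s\rangle^{r_1 r_0 r_2}$, $\langle @\,r_0\ \mathrm{set}\ v\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$, $\langle @\,r_0\ v\ e; s\rangle^{r_1 r_0 r_2}$  ($v \neq a$)

(4) $\langle @\,r_0\ \mathrm{deref}\ a; s\rangle^{r_1 r_0 r_2}$, $\langle @\,r_0\ \mathrm{set}\ a\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$, $\langle @\,r_0\ a\ e; s\rangle^{r_1 r_0 r_2}$  ($a \notin \mathrm{Dom}(s)$)

(5) $\langle @\,r_0\ \mathrm{deref}\ a; s\rangle^{r_1 r_0 r_2}$, $\langle @\,r_0\ \mathrm{set}\ a\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$  ($s(a) \neq \langle\mathrm{ref}\ v\rangle$)

(6) $\langle @\,r_0\ a\ e; s\rangle^{r_1 r_0 r_2}$  ($s(a) \neq \langle\lambda x.e_0\rangle$)

(7) $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0; s\rangle^{r}$  ($s \notin {}^{\mathrm{imd}}_{\mathrm{obj}}s^{\,r\,r_0\,r_3}$)

(8) $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0; s\rangle^{r}$  ($s \notin {}^{\mathrm{imd}}_{\mathrm{obj}}s^{\,r\,r_0}$)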

Immediately faulty expressions correspond to those of Definition 2.1.1. One additional case declares freeregion construct configurations to be immediately faulty if the declared region is not just above the lexical portion of the store. These are faulty because $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-cntxt cannot then be used with freeregion $r_0$ after $[e]$ to enter an evaluation context, and even if $e$ is a value, $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-freeregion cannot be used to deallocate the region. More specifically, another case declares freeregion construct configurations to be immediately faulty if the body is a value and the declared region is not uppermost on the store. We needn't consider operations on region variables or allocations at regions not in the store, because these do not even qualify as expression configurations. Our definition of immediately faulty expression configurations is complete in the sense of Proposition 2.1.1.

Proposition 4.1.1 (Nonvalue program configurations are reducible or faulty). Every program configuration $\langle q; s\rangle$ with a nonvalue program $q$ can be decomposed into the form $\langle \overset{\to*}{[e]}{}^{r}_{q}[e]; s\rangle$, where $\langle e; s\rangle^{r}$ is either a redex or immediately faulty. In the former case, we say that $\langle q; s\rangle$ is reducible, in the latter case that it is faulty.

Proof: Proposition 4.1.1. The proof is structured similarly to that of Proposition 2.1.1; most cases are not substantially different and are omitted. Two additional cases are at the level of expression configurations. Recall that it is assumed under our syntax that region indicators mentioned in an expression are in the domain of any store with which it may form a configuration.

$\langle \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0; s\rangle^{r}$: Let $\overset{\to*}{[e]}{}^{r}_{e} = [e]$. $\langle \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0; s\rangle^{r}$ is a redex.

$\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0; s\rangle^{r}$: If $e_0$ is a value, then let $\overset{\to*}{[e]}{}^{r}_{e} = [e]$, and $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0; s\rangle^{r}$ is a redex or immediately faulty, depending on whether or not $r_0$ is the uppermost region store in $s$. Otherwise, by induction, $\langle e_0; s\rangle^{r r_0}$ can be decomposed into $\langle \overset{\to*}{[e]}_{e_0}[e_1]; s\rangle^{r r_0}$, where $\langle e_1; s\rangle^{r r_0 r_1}$ is either a redex or immediately faulty, so $\overset{\to*}{[e]}_{e}$ is freeregion $r_0$ after $\overset{\to*}{[e]}_{e_0}$.
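The two freeregion cases of Definition 4.1.1, on which the freeregion case of the proof of Proposition 4.1.1 depends, can be sketched as a small Python predicate. The representation is an assumption made for illustration: a store is identified with its index (a list of region names, outermost first), and `freeregion_faulty` is a hypothetical helper, not part of the formal development.

```python
from typing import List


def freeregion_faulty(store_index: List[str], lexical: List[str],
                      r0: str, body_is_value: bool) -> bool:
    """Decide cases (7) and (8) for a configuration <freeregion r0 after e0; s>^r.

    `lexical` is the index r of the configuration and `store_index` the
    index of the store s.
    """
    expected = lexical + [r0]
    if body_is_value:
        # Case (8): r0 must be the uppermost region, so the index is exactly r r0.
        return store_index != expected
    # Case (7): r0 must sit just above the lexical portion; r3 may follow.
    return store_index[:len(expected)] != expected
```

For example, with lexical index `["r1"]` and declared region `"r2"`, the store index `["r1", "r2", "r3"]` is acceptable while the body is still being evaluated, but faulty once the body is a value, since `r3` would have to be deallocated first.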

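We need not be concerned with proving type soundness for arbitrary configurations, but only those reachable from source language programs and empty stores. $R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$ is the least set that includes program configurations formed from source language programs and empty stores, and is closed under evaluation. $R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$ is the least set of expression configurations including those formed from $R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$ by pulling expressions from program evaluation contexts and leaving the store unchanged, and closed under evaluation.

Definition 4.1.2 (Configurations Reachable from Source).

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: $R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$ is the least set such that both:

(1) $\forall q \in {}^{\mathrm{src}}_{\mathrm{obj}}q.\ \langle q; \emptyset\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$

(2) $\forall \langle q;s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle.\ \langle q;s\rangle \to^{*} \langle q';s'\rangle \ \rightarrow\ \langle q';s'\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$:

$\langle e;s\rangle^{r} \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r} \ \Leftrightarrow\ \exists\, \overset{\to*}{[e]}{}^{r}_{q}.\ \langle \overset{\to*}{[e]}{}^{r}_{q}[e]; s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$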

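Source-reachable program configurations constructed with a value must contain an empty store. Source-reachable expression configurations constructed with a value must contain a store whose structure matches the lexical context of the expression configuration.

Lemma 4.1.1 (Source-Reachable Value Configurations Have Lexical Store).

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: $\langle v;s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle \ \rightarrow\ s \in {}^{\mathrm{imd}}_{\mathrm{obj}}s^{\,\epsilon}$

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: $\langle v;s\rangle^{r} \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r} \ \rightarrow\ s \in {}^{\mathrm{imd}}_{\mathrm{obj}}s^{\,r}$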

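Proof: Lemma 4.1.1. We write $|q|$ and $|e^{r}|$ for the number of freeregion constructs in $q$, $e^{r}$, respectively, and $|s|$ for the number of region stores in $s$. We write $|r|$ for the number of region names in $r$. Then, we define $\varphi(\langle q;s\rangle)$ to hold when $|s| = |q|$, and $\varphi(\langle e;s\rangle^{r})$ to hold when $|s| = |e| + |r|$, and show instead that:

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: $\langle q;s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle \ \rightarrow\ \varphi(\langle q;s\rangle)$

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: $\langle e;s\rangle^{r} \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r} \ \rightarrow\ \varphi(\langle e;s\rangle^{r})$

We write $|[e]^{r_1}_{q}|$ and $|[e]^{r\,\hat{\ }\,r_1}_{e}|$ for the number of freeregion constructs in $[e]^{r_1}_{q}$ and $[e]^{r\,\hat{\ }\,r_1}_{e}$, respectively, where the number of freeregion constructs in a context includes those whose body contains the hole. By inspection of the evaluation contexts, $|\overset{\to*}{[e]}{}^{r_1}_{q}| = |r_1|$, and $|\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}| = |r_1|$.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

(1) The empty store contains no region stores and source language programs contain no freeregion constructs, so $|\emptyset| = 0 = |q|$.

(2) We show that $\varphi$ is preserved by $\to^{*}$:

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: Assume $\langle q;s\rangle \to^{*} \langle q';s'\rangle$ and $\varphi(\langle q;s\rangle)$. Then, $|s| = |q|$.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle q;s\rangle}$-cntxt: We have $q = \overset{\to*}{[e]}{}^{r_1}_{q}[e_1]$, $q' = \overset{\to*}{[e]}{}^{r_1}_{q}[e_1']$, and $\langle e_1;s\rangle^{r_1} \xrightarrow{\,t_1\,}{}^{*} \langle e_1';s'\rangle^{r_1}$. Clearly, $|q| = |e_1| + |\overset{\to*}{[e]}{}^{r_1}_{q}| = |e_1| + |r_1|$. Thus we have $\varphi(\langle e_1;s\rangle^{r_1})$. By preservation of $\varphi$ by expression configuration evaluation, $\varphi(\langle e_1';s'\rangle^{r_1})$. Because $|s'| = |e_1'| + |r_1| = |e_1'| + |\overset{\to*}{[e]}{}^{r_1}_{q}| = |q'|$, we have $\varphi(\langle q';s'\rangle)$.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: Assume $\langle e;s\rangle^{r} \xrightarrow{\,t\,}{}^{*} \langle e';s'\rangle^{r}$ and $\varphi(\langle e;s\rangle^{r})$. Then, $|s| = |e| + |r|$.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-cntxt: We have $e = \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}[e_1]$, $e' = \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}[e_1']$, and $\langle e_1;s\rangle^{r r_1} \xrightarrow{\,t_1\,}{}^{*} \langle e_1';s'\rangle^{r r_1}$. Clearly, $|e| = |e_1| + |\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}| = |e_1| + |r_1|$, so $|s| = |e_1| + |r_1| + |r|$. Thus $\varphi(\langle e_1;s\rangle^{r r_1})$. By induction, $\varphi(\langle e_1';s'\rangle^{r r_1})$. Because $|s'| = |e_1'| + |r| + |r_1| = |e_1'| + |\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}| + |r| = |e'| + |r|$, we have $\varphi(\langle e';s'\rangle^{r})$.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-reflex: Clearly, this rule preserves the equality.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-step: The index $r$ is unchanged by all reduction rules. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-letregion introduces a freeregion construct and extends the store with a region store. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-freeregion removes a freeregion construct and retracts the store by a region store. The other rules affect neither the number of freeregion constructs nor the size of the store.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-trans: By induction, the antecedent evaluations preserve $\varphi$, so their composition does as well.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: By the result above, $\langle \overset{\to*}{[e]}{}^{r}_{q}[e]; s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle \ \rightarrow\ |\overset{\to*}{[e]}{}^{r}_{q}[e]| = |s|$. Then $|\overset{\to*}{[e]}{}^{r}_{q}[e]| = |\overset{\to*}{[e]}{}^{r}_{q}| + |e| = |r| + |e|$, i.e., $\varphi(\langle e;s\rangle^{r})$.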
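The counting invariant used in this proof can be checked on a toy model. The Python sketch below is an assumption-laden simplification: programs consist only of `unit`, `letregion`, and `freeregion` (no cells, functions, or traces), a store is just its list of region names, and fresh names are generated from the store size. It asserts $|s| = |q|$ before every step.

```python
from typing import List, Optional, Tuple

# ("unit",) | ("letregion", body) | ("freeregion", name, body)
Expr = tuple


def count_free(e: Expr) -> int:
    """Number of freeregion constructs in e."""
    if e[0] == "freeregion":
        return 1 + count_free(e[2])
    if e[0] == "letregion":
        return count_free(e[1])
    return 0


def step(e: Expr, store: List[str]) -> Optional[Tuple[Expr, List[str]]]:
    """One reduction step, or None if e is a value."""
    if e[0] == "letregion":
        name = f"r{len(store) + 1}"  # naming scheme assumed for this sketch
        return ("freeregion", name, e[1]), store + [name]
    if e[0] == "freeregion":
        if e[2] == ("unit",):
            # The declared region must be uppermost on the store.
            assert store and store[-1] == e[1]
            return ("unit",), store[:-1]
        inner = step(e[2], store)
        if inner is None:
            return None
        body, new_store = inner
        return ("freeregion", e[1], body), new_store
    return None


def run_checked(e: Expr) -> Tuple[Expr, List[str], int]:
    """Reduce from the empty store, checking |s| = |q| at every configuration."""
    store: List[str] = []
    steps = 0
    while True:
        assert len(store) == count_free(e)
        nxt = step(e, store)
        if nxt is None:
            return e, store, steps
        e, store = nxt
        steps += 1


result = run_checked(("letregion", ("letregion", ("unit",))))
```

Running two nested letregion constructs takes four steps (two allocations, two deallocations) and ends with a value and an empty store, in line with the statement of Lemma 4.1.1 for program configurations.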

Our proof of type soundness (Lemma 4.1.6) will require an additional lemma. Programs and expressions reachable from the source language are factorable only into active source evaluation contexts.

Lemma 4.1.2 ($R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$ and $R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$ Terms Factorable Only Into Active Source Evaluation Contexts). We define an active source expression context to be an expression, with a hole, that does not contain full freeregion constructs. We overlook freeregion constructs whose body includes the hole.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: $\langle \overset{\to*}{[e]}{}^{r_1}_{q}[e_1]; s\rangle \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle \ \rightarrow\ \overset{\to*}{[e]}{}^{r_1}_{q}$ is an active source program expression context.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: $\langle \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}[e_1]; s\rangle^{r} \in R^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r} \ \rightarrow\ \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}$ is an active source expression context.

Proof: Lemma 4.1.2. We define $\varphi(\langle q;s\rangle)$ to hold when $q$ is factorable only by active source program expression contexts, and $\varphi(\langle e;s\rangle^{r})$ to hold when $e$ is factorable only by active source expression contexts.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$:

(1) Clearly, a source language program is factorable only by active source program expression evaluation contexts.

(2) We show that $\varphi$ is preserved by $\to^{*}$:

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle q;s\rangle$: Assume $\langle q;s\rangle \to^{*} \langle q';s'\rangle$ and $\varphi(\langle q;s\rangle)$. Then, $q$ is factorable only into active source program expression evaluation contexts.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle q;s\rangle}$-cntxt: This case is similar to that for $\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-cntxt below.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: Assume $\langle e;s\rangle^{r} \xrightarrow{\,t\,}{}^{*} \langle e';s'\rangle^{r}$ and $\varphi(\langle e;s\rangle^{r})$. Then, $e$ is factorable only into active source expression evaluation contexts.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-cntxt: $e = \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}[e_1]$. $e$ is factorable only into active source expression evaluation contexts, so this holds for $e_1$ as well. We have by induction that $e_1'$ is only factorable into active source evaluation contexts. $\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}$ is unchanged from $e$, so filling it with $e_1'$ to yield $e'$ preserves this property.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-reflex: Clearly, this rule does not introduce freeregion.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-step: The only reduction rule that could introduce freeregion is $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-letregion, but that instance of freeregion would surround the hole of a freeregion $r_0$ after $[e]$ context.

$\to^{*\,\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-trans: By induction, the antecedent evaluations preserve $\varphi$, so their composition does as well.

${}^{\mathrm{imd}}_{\mathrm{obj}}\langle e;s\rangle^{r}$: By the result above we have that $\overset{\to*}{[e]}{}^{r}_{q}[e]$ is factorable only by active source program expression evaluation contexts. If $e = \overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}[e_1]$, then $\overset{\to*}{[e]}{}^{r}_{q}[\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}]$ must be an active source program expression evaluation context, so $\overset{\to*}{[e]}{}^{r\,\hat{\ }\,r_1}_{e}$ must be an active source expression evaluation context.

1.2. Statics. Tracking region names in the lexical index allows detection by the type system of faulty configurations (that could not have been generated via the reduction semantics), such as:

⟨freeregion r0 after ((freeregion r0 after g); @ r0 deref @ r0 ⟨ref g⟩); ∅ {r0 ↦ ∅}⟩.

Here, we attempt to access a region store after it has been deallocated. This allows us to exclude reference to region variables and names from our statement of type soundness,⁴ and allows more straightforward localization of errors in intermediate programs.

The static syntax is presented in Figure 4.1.1. It is similar to the source language static syntax in Figure 3.1.4, except that region variables are generalized to region indicators (as in the dynamic syntax, Figure 3.1.1) and that it includes configuration types (as in the unencapsulated object language, Figure 2.1.1). We extend frn( ) to refer to the free region names of a pure type or static environment. Program, expression, and pure configuration types are indexed similarly to the corresponding configurations in the dynamic case. Region store types ≬S are finite functions mapping offsets to storable type schemes. Store types $S^{r}$ are finite functions mapping region names to region store types.

We define $S^{r r_1} \le_{r} S'^{\,r r_2}$ if and only if $\forall r \in r.\ \forall o \in \mathrm{Dom}(S^{r r_1}(r)).\ S^{r r_1}(r)(o) = S'^{\,r r_2}(r)(o)$ (the bound $r$ ranges over the region names of the index sequence $r$), and define $\ge$ accordingly. We define $T'^{\,r r_1} \le_{r} T^{r r_2}$ if and only if $\forall r \in r.\ (\iota\ @\ r) \in T'^{\,r r_1} \rightarrow (\iota\ @\ r) \in T^{r r_2}$, and define $\ge$ accordingly.

⁴ We do not need the assumption of Calcagno et al. that T refers only to region names.
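The two orderings just defined can be sketched in Python. The encodings are assumptions made for illustration only: a store type is a dictionary from region names to dictionaries from offsets to storable types (written as strings), and a trace type is a set of (action, region) pairs; `store_type_leq` and `trace_type_leq` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

StoreType = Dict[str, Dict[str, str]]  # region name -> offset -> storable type
TraceType = Set[Tuple[str, str]]       # (action, region name)


def store_type_leq(regions: List[str], lower: StoreType, upper: StoreType) -> bool:
    """lower <=_r upper: on the shared regions, every offset bound in
    `lower` is bound to the same storable type in `upper`."""
    for region in regions:
        for offset, storable in lower.get(region, {}).items():
            if upper.get(region, {}).get(offset) != storable:
                return False
    return True


def trace_type_leq(regions: List[str], lower: TraceType, upper: TraceType) -> bool:
    """lower <=_r upper: every action of `lower` at a shared region is in `upper`."""
    return all(effect in upper for effect in lower if effect[1] in regions)


small: StoreType = {"r1": {"o1": "Ref Unit"}}
large: StoreType = {
    "r1": {"o1": "Ref Unit", "o2": "Ref Unit"},
    "r2": {"o1": "Ref (Ref Unit @ r1)"},
}
```

With these values, `small` is below `large` on the shared prefix `["r1"]`, but not the other way around, because `large` binds an extra offset in `r1`.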

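\[
\begin{array}{rcll}
\Gamma^{\epsilon} \in {}^{\mathrm{imd}}_{\mathrm{obj}}\Gamma^{\epsilon} & ::= & \emptyset \mid \Gamma^{\epsilon}\{x \mapsto P^{\epsilon}\} \\
\Gamma^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}\Gamma^{\varrho_1\varrho_0} & ::= & \Gamma^{\varrho}\{x \mapsto P^{\varrho}\} \mid \Gamma^{\varrho_1}\{\varrho_0\} \\
\langle Q;S\rangle \in {}^{\mathrm{imd}}_{\mathrm{obj}}\langle Q;S\rangle & = & {}^{\mathrm{imd}}_{\mathrm{obj}}Q \times {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r_3} \\
\langle E;S\rangle^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}\langle E;S\rangle^{r} & = & {}^{\mathrm{imd}}_{\mathrm{obj}}E^{r} \times {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r r_3} \\
\langle P;S\rangle^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}\langle P;S\rangle^{r} & = & {}^{\mathrm{imd}}_{\mathrm{obj}}P^{r} \times {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r} \\
E^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}E^{\varrho} & ::= & P^{\varrho}\ !\ T^{\varrho} \\
B^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}B^{\varrho_1\varrho_0} & ::= & \mathrm{Ref}\ P^{\varrho} \mid P^{\varrho} \overset{T^{\varrho}}{\Rightarrow} P^{\varrho} \\
P^{\epsilon} \in {}^{\mathrm{imd}}_{\mathrm{obj}}P^{\epsilon},\ Q \in {}^{\mathrm{imd}}_{\mathrm{obj}}Q & ::= & G \mid \emptyset \\
P^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}P^{\varrho_1\varrho_0} & ::= & B^{\varrho_1\varrho_0}\ @\ \varrho_0 \mid P^{\varrho_1} \\
S^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}S^{\epsilon} & ::= & \emptyset \\
S^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r_1 r_0} & ::= & S^{r_1}\{r_0 \mapsto {≬}S\} \\
{≬}S^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}{≬}S^{r_1 r_0} & ::= & \emptyset\,\{o \mapsto B^{r}\} \\
T^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}T^{\varrho},\ \varepsilon^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}\varepsilon^{\varrho} & = & \{\,{}^{\mathrm{imd}}_{\mathrm{obj}}F^{\varrho}\,\} \\
F^{\varrho} \in {}^{\mathrm{imd}}_{\mathrm{obj}}F^{\varrho_1\varrho_0\varrho_2} & ::= & (\iota\ @\ \varrho_0) \\
\iota \in {}^{\mathrm{imd}}_{\mathrm{obj}}\iota & ::= & \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec} \\
G \in {}^{\mathrm{imd}}_{\mathrm{obj}}G
\end{array}
\]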

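Figure 4.1.1. Object Intermediate Language Static Syntax

Definitions for the object intermediate language in Figure 4.1.2 include those of the source language (Figure 3.1.5) generalized to region indicators as well as substitution of a region variable by a region name in an environment and restriction $|S^{r_1 r_2}|_{r_1}$ of a store type $S^{r_1 r_2}$ to its lowermost region names $r_1$. The sequence to restrict to may actually contain region variables; these are ignored as bindings of region names not mentioned are removed from the store type. We assume that $|\varrho|_{r}$ restricts the sequence of region indicators $\varrho$ to those that are region names. Thus, uses of the store restriction operator take the form $|S^{r_0 r_2}|_{\varrho_0}$, where $|\varrho_0|_{r} = r_0$. Substitution of a region name for a region variable is also assumed.

\[
\begin{array}{rcl}
\multicolumn{3}{l}{\Gamma^{r_1\rho_0\rho_2}[\rho_0 := r_0] \in {}^{\mathrm{imd}}_{\mathrm{obj}}\Gamma^{r_1 r_0 \rho_2}} \\
\Gamma_1\{\rho_0\}\{x_0 \mapsto P_0\}\{\rho_2\}\{x_2 \mapsto P_2\}[\rho_0 := r_0] & = & \Gamma_1\{r_0\}\{x_0 \mapsto P_0[\rho_0 := r_0]\}\{\rho_2\}\{x_2 \mapsto P_2[\rho_0 := r_0]\} \\[1ex]
\multicolumn{3}{l}{\Gamma^{\varrho_1\varrho_2} - \varrho_2 \in {}^{\mathrm{imd}}_{\mathrm{obj}}\Gamma^{\varrho_1}} \\
\Gamma\{\varrho_{20}\} - \varrho_{21}\varrho_{20} & = & \Gamma - \varrho_{21} \\
\Gamma\{x \mapsto P\} - \varrho_2 & = & (\Gamma - \varrho_2)\{x \mapsto (P - \varrho_2)\} \\
\Gamma - \epsilon & = & \Gamma \\[1ex]
\multicolumn{3}{l}{P^{\varrho_1\varrho_2} - \varrho_2 \in {}^{\mathrm{imd}}_{\mathrm{obj}}P^{\varrho_1}} \\
G - \varrho_2 & = & G \\
\emptyset - \varrho_2 & = & \emptyset \\
(B_0\ @\ \varrho_0) - \varrho_{21}\varrho_0\varrho_{22} & = & \emptyset \\
(B_0\ @\ \varrho_0) - \varrho_2 & = & B_0\ @\ \varrho_0 \qquad (\varrho_0 \notin \varrho_2) \\[1ex]
\multicolumn{3}{l}{|S^{r_1 r_2}|_{\varrho_1} \in {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r_1} \qquad (|\varrho_1|_{r} = r_1)} \\
|S|_{\varrho_1\rho_0} & = & |S|_{\varrho_1} \\
|S\{r_2 \mapsto {≬}S\}|_{\varrho_1 r_0} & = & |S|_{\varrho_1 r_0} \qquad (r_2 \neq r_0) \\
|S\{r_0 \mapsto {≬}S\}|_{\varrho_1 r_0} & = & S\{r_0 \mapsto {≬}S\} \\
|\emptyset|_{\epsilon} & = & \emptyset
\end{array}
\]

Figure 4.1.2. Object Intermediate Language Definitions

The typing judgments in Figure 4.1.3 are similar to those without encapsulation in Figure 2.1.2. The main difference is the presence of indexes. For the judgments present in the source language (Figure 3.1.6), indexes are assigned similarly but use region indicators instead of region variables. The store type is given an index that reflects the region names of the judgment's index followed, in the case of expressions and prestorables, by additional, nonlexical, region names $r_3$. The latter are required to handle freeregion constructs that may occur within these terms. That is no longer necessary in Chapter 6, where we assign judgments a nonlexical index that allows us to more fully specify the store types that they might use. Judgments for configurations, stores, region stores, storables, traces, and atomic traces are indexed by a sequence of region names. The judgments for region stores and storables have a nonempty index and require a store type of the same index. This allows us to type references to values stored in outer regions.

\[
\begin{array}{ll}
\text{prog configs} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle q;s\rangle}\ \langle q;s\rangle : \langle Q;S\rangle \\
\text{programs} & S^{r_2} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q}\ q : Q \\
\text{expr configs} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle^{r}}\ \langle e;s\rangle^{r} : \langle E;S\rangle^{r} \\
\text{expressions} & \Gamma^{\varrho};\ S^{|\varrho|_{r}\,r_3} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e^{\varrho}}\ e^{\varrho} : E^{\varrho} \\
\text{prestorables} & \Gamma^{\varrho = \text{seq-/}(\varrho)};\ S^{|\varrho|_{r}\,r_3} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b^{\varrho = \varrho_1\varrho_0/\varrho_2}}\ b^{\varrho} : B^{\varrho_1\varrho_0}\ !\ T^{\varrho} \\
\text{pures} & \Gamma^{\varrho};\ S^{|\varrho|_{r}} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p^{\varrho}}\ p^{\varrho} : P^{\varrho} \\
\text{value configs} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle v;s\rangle^{r}}\ \langle v;s\rangle^{r} : \langle P;S\rangle^{r} \\
\text{values} & S^{|\varrho|_{r}} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{\varrho}}\ v^{\varrho} : P^{\varrho} \\
\text{stores} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,s^{r}}\ s^{r} : S^{r} \\
\text{region stores} & S^{r} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,{≬}s^{r = r_1 r_0}}\ {≬}s^{r} : {≬}S^{r} \\
\text{storables} & S^{r} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d^{r = r_1 r_0}}\ d^{r} : B^{r} \\
\text{traces} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,t^{r}}\ t^{r} : T^{r} \\
\text{atomic traces} & \vdash^{\mathrm{imd}}_{\mathrm{obj}\,f^{r}}\ f^{r} : F^{r}
\end{array}
\]

Figure 4.1.3. Object Intermediate Language Typing Judgments

We define $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,f^{r}} f : F$ to hold for $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,f^{r}} (\iota\ @\ \langle r, o\rangle) : (\iota\ @\ r)$, $r \in r$ (the region name $r$ occurs in the index sequence $r$), incorporating the region name information into the rule from the object language without encapsulation. We define $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,t^{r}} t : T$ to hold when $\forall f \in t.\ \exists F \in T.\ \vdash^{\mathrm{imd}}_{\mathrm{obj}\,f^{r}} f : F$, an indexed version of the rule for the object language without encapsulation.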
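The store type restriction operator of Figure 4.1.2 can be sketched in Python. This is a minimal illustration under stated assumptions: a store type is an ordered list of (region name, region store type) pairs with the outermost region first, region variables are recognised by a naming convention invented for this sketch (`rho...`), and the store type is assumed to be well indexed, so that the regions below the last named one are exactly the earlier names.

```python
from typing import Dict, List, Tuple

RegionStoreType = Dict[str, str]                # offset -> storable type
StoreType = List[Tuple[str, RegionStoreType]]   # outermost region first


def is_region_variable(indicator: str) -> bool:
    """Naming convention assumed for this sketch only."""
    return indicator.startswith("rho")


def restrict(store_type: StoreType, indicators: List[str]) -> StoreType:
    """Compute |S|_indicators: ignore region variables, then drop the
    uppermost bindings until the last mentioned region name is on top."""
    names = [i for i in indicators if not is_region_variable(i)]
    if not names:
        return []
    kept = list(store_type)
    while kept and kept[-1][0] != names[-1]:
        kept.pop()
    return kept


S3: StoreType = [
    ("r1", {"o1": "Ref Unit"}),
    ("r2", {"o1": "Ref (Ref Unit @ r1)"}),
    ("r3", {}),
]
```

For instance, restricting `S3` to `["r1", "rho2"]` keeps only the binding for `r1`, since the region variable is ignored and the bindings for `r2` and `r3` are removed.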

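\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}\text{-glob-const}\quad
\dfrac{}{S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{\varrho}}\ g : \mathrm{TypeOf}(g)}
\]
\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}\text{-addr-live}\quad
\dfrac{a_0 = \langle r_0, o\rangle}{S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{r_1 r_0 \varrho_2}}\ a_0 : S(a_0)\ @\ r_0}
\qquad
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}\text{-addr-dead}\quad
\dfrac{}{S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{\varrho}}\ \langle \emptyset, o\rangle : \emptyset}
\]

Figure 4.1.4. Typing of Object Intermediate Language Values

Values are typed using the rules in Figure 4.1.4. The rule for typing global constants is an indexed version of that from the language without encapsulation (Figure 2.1.3). The rules $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-live and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-dead must take into account the modified form of generalized addresses. $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,v}$-addr-dead is used to type values that result from an allocation that has escaped its lexical scope, i.e., dangling pointers.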

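\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,q}\quad
\dfrac{\emptyset;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e^{\epsilon}}\ q : Q\ !\ \emptyset}
      {S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,q}\ q : Q}
\]

Figure 4.1.5. Typing of Object Intermediate Language Programs

The rule in Figure 4.1.5 for typing programs is identical to that of the source language (Figure 3.1.8).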

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}\text{-ref}\quad
\dfrac{S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{r_1 r_0}}\ v_0 : P_0}
      {S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d^{r_1 r_0}}\ \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0}
\qquad
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}\text{-}\lambda\quad
\dfrac{\emptyset\{r_1\}\{r_0\};\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b^{r_1 r_0/}}\ \langle \lambda x.e_0\rangle : B_0\ !\ \emptyset}
      {S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d^{r_1 r_0}}\ \langle \lambda x.e_0\rangle : B_0}
\]

Figure 4.1.6. Typing of Object Language Storables

The rules for typing storables in Figure 4.1.6 are indexed versions of those for the unencapsulated object language (Figure 2.1.5). In both rules, the allocation is assumed to take place at $r_0$, the final region in the sequence, as the sequence is retracted at lower region stores. Thus, in rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,d}$-λ, the prestorable antecedent judgment is given an index with the partition at the end of the sequence of region names. That judgment uses an environment composed only of declarations of these region names. Storable functions may not have free program variables.

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,{≬}s}\quad
\dfrac{S_0 \vdash^{\mathrm{imd}}_{\mathrm{obj}\,d^{r}}\ d_0 : B_0}
      {S_0 \vdash^{\mathrm{imd}}_{\mathrm{obj}\,{≬}s^{r = r_1 r_0}}\ \emptyset\,\{o \mapsto d_0\} : \emptyset\,\{o \mapsto B_0\}}
\]
\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}\text{-empty}\quad
\dfrac{}{\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s^{\epsilon}}\ \emptyset : \emptyset}
\qquad
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}\text{-nonempty}\quad
\dfrac{\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s^{r_1}}\ s_1 : S_1
       \qquad
       S_1\{r_0 \mapsto {≬}S_0\} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,{≬}s^{r}}\ {≬}s_0 : {≬}S_0}
      {\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s^{r = r_1 r_0}}\ s_1\{r_0 \mapsto {≬}s_0\} : S_1\{r_0 \mapsto {≬}S_0\}}
\]

Figure 4.1.7. Typing of Object Language Stores and Region Stores

Figure 4.1.7 presents rules for typing stores and region stores. The rule for typing region stores is similar to that for typing stores in the unencapsulated object language (Figure 2.1.6) except for the presence of indexes and the fact that a store type is required in both the conclusion and the antecedent. By using the same store type to type each storable in a particular region, the rule ensures that each region is fully visible from any part of that region.

There are two rules for typing stores. Rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$-empty declares that an empty store is typed with an empty store type. Rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$-nonempty declares that a store extended to bind a region name to a region store is typed with a store type extended to bind the same region name to a region store type if the store is typed with the store type (under a retracted index) and the region store is typed with the region store type using the store type (under the full index). The fact that the store antecedent derivation uses a retracted index is a consequence of our restricted store (Section 2.2 of the Introduction).⁵ We can now see the role played by the restriction in the prestorable judgment of the storable type to the region of allocation, for if this were not the case our restricted store could be violated by the rule $\rightharpoonup^{\mathrm{imd}}_{\mathrm{obj}\langle e;s\rangle}$-alloc.

⁵ By comparison, to type an unrestricted store we could instead assume the declarations $s^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}s^{r} ::= \emptyset\,\{r_0 \mapsto {≬}s^{r}\}$ and $S^{r} \in {}^{\mathrm{imd}}_{\mathrm{obj}}S^{r} ::= \emptyset\,\{r_0 \mapsto {≬}S^{r}\}$ and use a rule such as:
\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}\text{-unrestr}\quad
\dfrac{\emptyset\,\{r_0 \mapsto {≬}S\} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,{≬}s^{r}}\ {≬}s : {≬}S}
      {\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s^{r}}\ \emptyset\,\{r_0 \mapsto {≬}s\} : \emptyset\,\{r_0 \mapsto {≬}S\}}
\]
The difference is that while $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$-unrestr makes the entire store visible from any part of the store, $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$-empty and $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,s}$-nonempty only allow regions to see “outwards”. In the former case, the store type is shared among all the antecedents, while in the latter the store type is reduced region-by-region during the recursion. The index of an unrestricted store can be considered to be unordered.
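The "outwards only" visibility that the restricted store rules impose can be illustrated with a small Python check. It is a simplification under assumptions of this sketch: a store is an ordered list of regions (outermost first), each cell records only the address it refers to (or `None`), and the hypothetical function `sees_only_outwards` checks region names but not whether the target offset exists or is well typed.

```python
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, str]                       # (region name, offset)
RegionStore = Dict[str, Optional[Address]]      # offset -> referenced address
Store = List[Tuple[str, RegionStore]]           # outermost region first


def sees_only_outwards(store: Store) -> bool:
    """Each region may refer to itself and to regions allocated before it."""
    visible = set()
    for region, cells in store:
        visible.add(region)
        for target in cells.values():
            if target is not None and target[0] not in visible:
                return False
    return True


# Shape of store s2 of the sample reduction, ignoring the stored function.
ok_store: Store = [
    ("r1", {"o1": None}),
    ("r2", {"o1": ("r1", "o1")}),
]

# An outer cell pointing into an inner region is rejected.
bad_store: Store = [
    ("r1", {"o1": ("r2", "o1")}),
    ("r2", {"o1": None}),
]
```

The check walks the store in allocation order, mirroring the way the store type is extended region by region during the recursion of the nonempty-store rule.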

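\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}\text{-var}\quad
\dfrac{}{\Gamma;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p^{\varrho}}\ x : \Gamma(x)}
\qquad
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,p}\text{-value}\quad
\dfrac{S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,v^{\varrho}}\ v : P}
      {\Gamma;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,p^{\varrho}}\ v : P}
\]

Figure 4.1.8. Typing of Object Intermediate Language Pures

The rules for typing pures in Figure 4.1.8 are indexed versions of those for the unencapsulated object language (Figure 2.1.7).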

\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}\text{-ref}\quad
\dfrac{\Gamma;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e^{\varrho_1\varrho_0\varrho_2}}\ e : P_0\ !\ T}
      {\Gamma;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b^{\varrho_1\varrho_0/\varrho_2}}\ \langle \mathrm{ref}\ e\rangle : \mathrm{Ref}\ P_0\ !\ T}
\]
\[
\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}\text{-}\lambda\quad
\dfrac{(\Gamma - \varrho_2)\{x \mapsto P_{0.1}\};\ |S|_{\varrho_1\varrho_0} \vdash^{\mathrm{imd}}_{\mathrm{obj}\,e^{\varrho_1\varrho_0}}\ e_0 : P_{0.2}\ !\ T_0}
      {\Gamma;\ S \vdash^{\mathrm{imd}}_{\mathrm{obj}\,b^{\varrho_1\varrho_0/\varrho_2}}\ \langle \lambda x.e_0\rangle : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}\ !\ \emptyset}
\]

Figure 4.1.9. Typing of Object Intermediate Language Prestorables

The rules for typing prestorables in Figure 4.1.9 are indexed versions of those for the unencapsulated object language (Figure 2.1.8) except that in rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ the antecedent for typing the function body must use an environment restricted (as in the source language, Figure 3.1.10) by the region indicators outside of the region of allocation as well as a store type similarly restricted. This is necessary because the body of a function is assigned an index retracted to the region at which the function is allocated. The prestorable judgment restricts the storable type to the index $\varrho_1\varrho_0$, so $P_0$ in $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-ref as well as $P_{0.1}$, $P_{0.2}$, and $T_0$ in $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,b}$-λ are similarly restricted.

The rules for typing expressions in Figure 4.1.10 are similar to those of the source language (Figure 3.1.11) except that their indexes are generalized to region indicators and they incorporate a store type. The store type is simply passed on to the antecedents except in the case of pures, for which it is restricted to its lexical portion. The rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-freeregion resembles $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-letregion but declares a region name rather than a region variable. To maintain type soundness, this region name is similarly restricted from the static environment and the pure type, outside of the freeregion constructs. Although the antecedent has an extended index, the conclusion can use the same store type because the expression judgment permits its store type index to exceed its own. It is implicit in our finite function notation that the bound region name is fresh, i.e., does not occur in the environment Γ.

Configurations are typed using indexed versions of the rules for the unencapsulated object language. The antecedent for the store may use any index prefixed with that of the conclusion, but the rule $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,e}$-freeregion above will require that the additional region names be those declared in sequence by nested freeregion constructs within the term. The environment for typing the expression in the antecedent of $\vdash^{\mathrm{imd}}_{\mathrm{obj}\,\langle e;s\rangle}$ is no longer empty but declares the region names from the expression's index.

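\[
\frac{\Gamma;\,|S|_{\varrho} \vdash^{imd\;\varrho}_{obj\;p} p : P}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} p : P \,!\, \emptyset}
\quad (\vdash^{imd}_{obj\;e}\text{-pure})
\]

\[
\frac{\Gamma;\,S \vdash^{imd\;\varrho_1\varrho_0/\varrho_2}_{obj\;b} b : B_0 \,!\, T \qquad \varrho = \varrho_1\varrho_0\varrho_2}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} @\varrho_0\; b : B_0\,@\,\varrho_0 \,!\, T \cup \{(\mathrm{alloc}\,@\,\varrho_0)\}}
\quad (\vdash^{imd}_{obj\;e}\text{-alloc})
\]

\[
\frac{\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.1} : P_{.1} \,!\, T_{.1} \qquad \Gamma\{x \mapsto P_{.1}\};\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.2} : P_{.2} \,!\, T_{.2}}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} \mathrm{let}\ x = e_{.1}\ \mathrm{in}\ e_{.2} : P_{.2} \,!\, T_{.1} \cup T_{.2}}
\quad (\vdash^{imd}_{obj\;e}\text{-let})
\]

\[
\frac{\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e : \mathrm{Ref}\ P_0\,@\,\varrho_0 \,!\, T \qquad \varrho = \varrho_1\varrho_0\varrho_2}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} @\varrho_0\ \mathrm{deref}\ e : P_0 \,!\, T \cup \{(\mathrm{read}\,@\,\varrho_0)\}}
\quad (\vdash^{imd}_{obj\;e}\text{-deref})
\]

\[
\frac{\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.1} : \mathrm{Ref}\ P_0\,@\,\varrho_0 \,!\, T_{.1} \qquad \Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.2} : P_0 \,!\, T_{.2} \qquad \varrho = \varrho_1\varrho_0\varrho_2}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} @\varrho_0\ \mathrm{set}\ e_{.1}\ \mathrm{to}\ e_{.2} : \mathrm{Unit} \,!\, T_{.1} \cup T_{.2} \cup \{(\mathrm{write}\,@\,\varrho_0)\}}
\quad (\vdash^{imd}_{obj\;e}\text{-set})
\]

\[
\frac{\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.1} : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}\,@\,\varrho_0 \,!\, T_{.1} \qquad \Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e_{.2} : P_{0.1} \,!\, T_{.2} \qquad \varrho = \varrho_1\varrho_0\varrho_2}
     {\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} @\varrho_0\ e_{.1}\ e_{.2} : P_{0.2} \,!\, T_0 \cup T_{.1} \cup T_{.2} \cup \{(\mathrm{exec}\,@\,\varrho_0)\}}
\quad (\vdash^{imd}_{obj\;e}\text{-app})
\]

\[
\frac{\Gamma_1\{\rho_0\};\,S \vdash^{imd\;\varrho_1\rho_0}_{obj\;e} e_0 : P_0 \,!\, T_0}
     {\Gamma_1;\,S \vdash^{imd\;\varrho_1}_{obj\;e} \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0 : P_0 - \rho_0 \,!\, T_0 - \rho_0}
\quad (\vdash^{imd}_{obj\;e}\text{-letregion})
\]

\[
\frac{\Gamma_1\{r_0\};\,S \vdash^{imd\;r_1 r_0}_{obj\;e} e_0 : P_0 \,!\, T_0}
     {\Gamma_1;\,S \vdash^{imd\;r_1}_{obj\;e} \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0 : P_0 - r_0 \,!\, T_0 - r_0}
\quad (\vdash^{imd}_{obj\;e}\text{-freeregion})
\]

Figure 4.1.10. Typing of Object Intermediate Language Expressions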
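The effect bookkeeping in the rules of Figure 4.1.10 can be illustrated with a small sketch. It models a trace type as a set of (operation, region) pairs and assumes that `T - r0` simply drops every effect on region `r0`; the function names and this set-based encoding are illustrative assumptions, not part of the formal system.

```python
from typing import FrozenSet, Tuple

# Assumed encoding: an effect is (operation, region), e.g. ("read", "r1"),
# and a trace type T is a finite set of such effects.
Effect = Tuple[str, str]
TraceType = FrozenSet[Effect]


def deref_effect(t: TraceType, region: str) -> TraceType:
    """Effect of `@ region deref e` when e has effect t (cf. the -deref rule)."""
    return t | {("read", region)}


def set_effect(t1: TraceType, t2: TraceType, region: str) -> TraceType:
    """Effect of `@ region set e1 to e2` (cf. the -set rule)."""
    return t1 | t2 | {("write", region)}


def app_effect(latent: TraceType, t1: TraceType, t2: TraceType,
               region: str) -> TraceType:
    """Effect of `@ region e1 e2`; `latent` is the effect on the arrow type."""
    return latent | t1 | t2 | {("exec", region)}


def mask(t: TraceType, region: str) -> TraceType:
    """Assumed reading of T - region: drop all effects on the closed region."""
    return frozenset(e for e in t if e[1] != region)


inner = deref_effect(frozenset(), "r1")   # effect inside the region construct
outer = mask(inner, "r1")                 # effect seen outside `freeregion r1`
```

Under these assumptions, an effect on `r1` is visible inside the construct that declares `r1` and disappears from the trace type of the construct itself, which is the masking behaviour the -letregion and -freeregion rules express.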

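\[
\frac{\vdash^{imd\;r_2}_{obj\;s} s : S \qquad S \vdash^{imd}_{obj\;q} q : Q}
     {\vdash^{imd}_{obj\;\langle q;s\rangle} \langle q; s\rangle : \langle Q; S\rangle}
\quad (\vdash^{imd}_{obj\;\langle q;s\rangle})
\]

\[
\frac{\vdash^{imd\;r_1 r_2}_{obj\;s} s : S \qquad \emptyset\{r_1\};\,S \vdash^{imd\;r_1}_{obj\;e} e : E}
     {\vdash^{imd\;r_1}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle E; S\rangle}
\quad (\vdash^{imd}_{obj\;\langle e;s\rangle})
\]

\[
\frac{\vdash^{imd\;r_1 r_2}_{obj\;s} s : S \qquad S \vdash^{imd\;r_1}_{obj\;v} v : P}
     {\vdash^{imd\;r_1}_{obj\;\langle v;s\rangle} \langle v; s\rangle : \langle P; S\rangle}
\quad (\vdash^{imd}_{obj\;\langle v;s\rangle})
\]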

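Stores and store types used in the derivation:

\[
\begin{aligned}
s_1 &= \emptyset\{r_1 \mapsto \emptyset\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{unit}\rangle\}\{o_2 \mapsto \langle \lambda x_1.\,@r_1\ \mathrm{deref}\ \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle\rangle\}\} \\
s_2 &= s_1\{r_2 \mapsto \emptyset\{o_1 \mapsto \langle \mathrm{ref}\ \langle r_1, o_1\rangle\rangle\}\} \\
s_3 &= s_2\{r_3 \mapsto \emptyset\} \\
S_1 &= \emptyset\{r_1 \mapsto \emptyset\{o_1 \mapsto \mathrm{Ref\ Unit}\}\{o_2 \mapsto \mathrm{Ref\ Unit}\,@\,r_1 \overset{\{(\mathrm{read}\,@\,r_1)\}}{\Rightarrow} \mathrm{Ref\ Unit}\,@\,r_1\}\} \\
S_2 &= S_1\{r_2 \mapsto \emptyset\{o_1 \mapsto \mathrm{Ref}\ (\mathrm{Ref\ Unit}\,@\,r_1)\}\} \\
S_3 &= S_2\{r_3 \mapsto \emptyset\}
\end{aligned}
\]

Judgments of the derivation, from the leaves to the root (rule used in parentheses):

Region store for $r_1$:
- $\vdash \emptyset : \emptyset$ ($\vdash^{imd}_{obj\;s}$-empty)
- $S_1 \vdash \mathrm{unit} : \mathrm{Unit}$ ($\vdash^{imd}_{obj\;v}$-glob-const)
- $S_1 \vdash \langle \mathrm{ref}\ \mathrm{unit}\rangle : \mathrm{Ref\ Unit}$ ($\vdash^{imd\;r_1}_{obj\;d}$-ref)
- $d_3$ (the function at $o_2$; see Figure 4.1.12)
- $S_1 \vdash \emptyset\{o_1 \mapsto \langle \mathrm{ref}\ \mathrm{unit}\rangle\}\{o_2 \mapsto \langle \lambda x_1.\,@r_1\ \mathrm{deref}\ \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle\rangle\} : \emptyset\{o_1 \mapsto \mathrm{Ref\ Unit}\}\{o_2 \mapsto \mathrm{Ref\ Unit}\,@\,r_1 \overset{\{(\mathrm{read}\,@\,r_1)\}}{\Rightarrow} \mathrm{Ref\ Unit}\,@\,r_1\}$ (region store rule, index $r_1$)
- $\vdash s_1 : S_1$ ($\vdash^{imd\;r_1}_{obj\;s}$-nonempty)

Region store for $r_2$ ($d_2$):
- $S_2 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1 r_2}_{obj\;v}$-addr-live)
- $S_2 \vdash \langle \mathrm{ref}\ \langle r_1, o_1\rangle\rangle : \mathrm{Ref}\ (\mathrm{Ref\ Unit}\,@\,r_1)$ ($\vdash^{imd\;r_1 r_2}_{obj\;d}$-ref)
- $S_2 \vdash \emptyset\{o_1 \mapsto \langle \mathrm{ref}\ \langle r_1, o_1\rangle\rangle\} : \emptyset\{o_1 \mapsto \mathrm{Ref}\ (\mathrm{Ref\ Unit}\,@\,r_1)\}$ (region store rule, index $r_1 r_2$)
- $\vdash s_2 : S_2$ ($\vdash^{imd\;r_1 r_2}_{obj\;s}$-nonempty)

Region store for $r_3$ and the configuration:
- $S_3 \vdash \emptyset : \emptyset$ (region store rule, index $r_1 r_2 r_3$)
- $\vdash s_3 : S_3$ ($\vdash^{imd\;r_1 r_2 r_3}_{obj\;s}$-nonempty)
- $d_1$ (the program; see Figure 4.1.12)
- $\vdash \langle \mathrm{freeregion}\ r_1\ \mathrm{after}\ \mathrm{freeregion}\ r_2\ \mathrm{after}\ @r_1\ \mathrm{deref}\ \mathrm{freeregion}\ r_3\ \mathrm{after}\ \langle r_1, o_1\rangle;\ s_3\rangle : \langle \mathrm{Unit}; S_3\rangle$ ($\vdash^{imd}_{obj\;\langle q;s\rangle}$)

Figure 4.1.11. Sample Object Intermediate Language Derivation, I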

We now present in Figures 4.1.11 and 4.1.12 a sample derivation of a configuration from a snapshot of the sample reduction of Figures 4.1.5 through 4.1.7, just before the deallocation of region r3. Typing the program configuration requires typing the program and the store. Typing the store requires typing each of the three region stores in the context of any lower region store types. Collectively, the region stores contain two reference cells and a function. The reference cells are typed using $\vdash^{imd}_{obj\;d}$-ref, which requires a value derivation. In the case of the inner cell, the value ⟨r1, o1⟩ is typed using $\vdash^{imd}_{obj\;v}$-addr-live, using the region store type for r1. The function is typed in d3 of Figure 4.1.12. It requires a prestorable derivation with an environment declaring only the visible region names (in this case just r1). In the course of this derivation, the environment is extended with the formal parameter and the region variable declared in the body. The reduced store type S1 is used in typing the address ⟨r1, o1⟩. The program is typed in derivation d1 of Figure 4.1.12. The freeregion constructs are typed similarly to the letregion constructs of the source language example. The store type is passed through this derivation unchanged and is used in typing the address ⟨r1, o1⟩ corresponding to the free variable of the function.

1.3. Type and Effect Soundness.

We extend the type and effect soundness theorems of Section 2.1.3 to handle our object language with encapsulation. This time, our proof cannot be extended to show type soundness for the intermediate language without respect to the source language; in fact, such a result does not hold. To get a sense of the problem, consider that we might be given a program schema:

freeregion r0 after (freeregion r1 after e1) (freeregion r2 after e2)

Clearly, e1 must reduce to an offset holding a procedure at region r0 (so that it survives the deallocation of region r1). As we descend into the evaluation context freeregion r0 after freeregion r1 after [e], the lexical index r0 r1 can no longer be a prefix of the store index, which must be r0 r2 r1 if we expect the first region to be deallocated, r1, to be on top of the stack. Such an expression configuration would not be syntactically well-formed. We could, however, be given the well-formed and well-typed program:

freeregion r0 after (letregion ρ1 in e1) (freeregion r2 after e2)

in which the above situation is yet to develop as the program runs. Thus, we instead prove type and effect soundness only for source language programs.6 The programs above cannot be derived from any source language program by our operational semantics.

Derivation $d_1$ (each judgment follows from the previous one by the rule named in parentheses):
- $S_3 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1 r_2 r_3}_{obj\;v}$-addr-live)
- $\emptyset\{r_1\}\{r_2\}\{r_3\};\,S_3 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1 r_2 r_3}_{obj\;p}$-value)
- $\emptyset\{r_1\}\{r_2\}\{r_3\};\,S_3 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \emptyset$ ($\vdash^{imd\;r_1 r_2 r_3}_{obj\;e}$-pure)
- $\emptyset\{r_1\}\{r_2\};\,S_3 \vdash \mathrm{freeregion}\ r_3\ \mathrm{after}\ \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \emptyset$ ($\vdash^{imd\;r_1 r_2}_{obj\;e}$-freeregion)
- $\emptyset\{r_1\}\{r_2\};\,S_3 \vdash @r_1\ \mathrm{deref}\ \mathrm{freeregion}\ r_3\ \mathrm{after}\ \langle r_1, o_1\rangle : \mathrm{Unit} \,!\, \{(\mathrm{read}\,@\,r_1)\}$ ($\vdash^{imd\;r_1 r_2}_{obj\;e}$-deref)
- $\emptyset\{r_1\};\,S_3 \vdash \mathrm{freeregion}\ r_2\ \mathrm{after}\ @r_1\ \mathrm{deref}\ \mathrm{freeregion}\ r_3\ \mathrm{after}\ \langle r_1, o_1\rangle : \mathrm{Unit} \,!\, \{(\mathrm{read}\,@\,r_1)\}$ ($\vdash^{imd\;r_1}_{obj\;e}$-freeregion)
- $\emptyset;\,S_3 \vdash \mathrm{freeregion}\ r_1\ \mathrm{after}\ \mathrm{freeregion}\ r_2\ \mathrm{after}\ @r_1\ \mathrm{deref}\ \mathrm{freeregion}\ r_3\ \mathrm{after}\ \langle r_1, o_1\rangle : \mathrm{Unit} \,!\, \{(\mathrm{read}\,@\,r_1)\}$ ($\vdash^{imd\;\epsilon}_{obj\;e}$-freeregion)

Derivation $d_3$:
- $S_1 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1 \rho_3}_{obj\;v}$-addr-live)
- $\emptyset\{r_1\}\{x_1 \mapsto \mathrm{Ref\ Unit}\,@\,r_1\}\{\rho_3\};\,S_1 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1 \rho_3}_{obj\;p}$-value)
- $\emptyset\{r_1\}\{x_1 \mapsto \mathrm{Ref\ Unit}\,@\,r_1\}\{\rho_3\};\,S_1 \vdash \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \emptyset$ ($\vdash^{imd\;r_1 \rho_3}_{obj\;e}$-pure)
- $\emptyset\{r_1\}\{x_1 \mapsto \mathrm{Ref\ Unit}\,@\,r_1\};\,S_1 \vdash \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \emptyset$ ($\vdash^{imd\;r_1}_{obj\;e}$-letregion)
- $\emptyset\{r_1\}\{x_1 \mapsto \mathrm{Ref\ Unit}\,@\,r_1\};\,S_1 \vdash @r_1\ \mathrm{deref}\ \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \{(\mathrm{read}\,@\,r_1)\}$ ($\vdash^{imd\;r_1}_{obj\;e}$-deref)
- $\emptyset\{r_1\};\,S_1 \vdash \langle \lambda x_1.\,@r_1\ \mathrm{deref}\ \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \overset{\{(\mathrm{read}\,@\,r_1)\}}{\Rightarrow} \mathrm{Ref\ Unit}\,@\,r_1 \,!\, \emptyset$ ($\vdash^{imd\;r_1/}_{obj\;b}$-λ)
- $S_1 \vdash \langle \lambda x_1.\,@r_1\ \mathrm{deref}\ \mathrm{letregion}\ \rho_3\ \mathrm{in}\ \langle r_1, o_1\rangle\rangle : \mathrm{Ref\ Unit}\,@\,r_1 \overset{\{(\mathrm{read}\,@\,r_1)\}}{\Rightarrow} \mathrm{Ref\ Unit}\,@\,r_1$ ($\vdash^{imd\;r_1}_{obj\;d}$-λ)

Figure 4.1.12. Sample Object Intermediate Language Derivation, II

6 Various approaches would have allowed the type soundness result on arbitrary intermediate language configurations, but only at the cost of introducing complications to handle situations such as those described above, which cannot arise from source language programs. One such approach is to define the syntax and/or type system tightly around the operational semantics so that these programs are no longer expressible/typable. Another is to let the store be a tree, as we will do for different reasons with the monadic language in the next chapter. In fact, we could get by with a linear store, even maintaining our stack-of-regions architecture, with only a slight modification to our indexing structure. The difficulty comes with the translation, where this violation of lexical scope forces us either to use the expression structure in translating the store or to give up the principle that the monadic "level" of code is determined by the number of enclosing encapsulation constructs. We choose instead to make an assumption that we begin with a source language program.
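The well-formedness condition that fails for the first program schema is a prefix test between the lexical index of the expression and the index of the store. A minimal sketch, assuming indexes are represented as lists of region names (the function name is hypothetical):

```python
from typing import Sequence


def is_prefix(lexical_index: Sequence[str], store_index: Sequence[str]) -> bool:
    """True when the lexical index is an initial segment of the store index."""
    n = len(lexical_index)
    return n <= len(store_index) and list(store_index[:n]) == list(lexical_index)


# Inside `freeregion r0 after freeregion r1 after [e]` the lexical index is
# r0 r1, but the store index would have to be r0 r2 r1 for r1 to be on top.
problem_case = is_prefix(["r0", "r1"], ["r0", "r2", "r1"])

# Before any region other than r0 has been entered, the condition still holds.
initial_case = is_prefix(["r0"], ["r0"])
```

Here `problem_case` is false and `initial_case` is true, matching the observation that the second program is well-formed at the outset while the problematic situation only develops as it runs.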


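Theorem 4.1.1 (Type Soundness).

\[
\vdash^{src}_{obj\;q} q : Q \;\to\; \langle q; \emptyset\rangle \Uparrow \;\lor\; \exists v.\,\bigl(\langle q; \emptyset\rangle \Downarrow^{imd}_{obj\;\langle q;s\rangle} \langle v; \emptyset\rangle \;\land\; \vdash^{imd}_{obj\;\langle q;s\rangle} \langle v; \emptyset\rangle : \langle Q; \emptyset\rangle\bigr)
\]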

During the evaluation of a well-typed expression in a context, no unexpected operations are performed. We present only the contextualized effect soundness result. Because we mask effects, this result becomes vacuous if carried to the top level (programs).7 The result is constrained to reachable expression configurations.

Theorem 4.1.2 (Effect Soundness).

\[
\langle e; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle} \;\land\; \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle P \,!\, T; S\rangle \;\land\; \langle e; s\rangle \overset{t}{\to^{*}} \langle e'; s'\rangle \;\to\; \vdash^{imd\;r}_{obj\;t} t : T
\]

where $\to^{*}$ is the relation $\to^{*\;imd\;r}_{obj\;\langle e;s\rangle}$.

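We again have lemmas stating that evaluation preserves type and effect, and that faulty program configurations are untypable.

Lemma 4.1.3 (Evaluation Preserves Type and Effect).

$\vdash^{imd}_{obj\;\langle q;s\rangle}$:
\[
\langle q; s\rangle \in R^{imd}_{obj\;\langle q;s\rangle} \;\land\; \vdash^{imd}_{obj\;\langle q;s\rangle} \langle q; s\rangle : \langle Q; S\rangle \;\land\; \langle q; s\rangle \to^{*} \langle q'; s'\rangle \;\to\; \exists S' \geq_{\epsilon} S.\ \vdash^{imd}_{obj\;\langle q;s\rangle} \langle q'; s'\rangle : \langle Q; S'\rangle
\]

$\vdash^{imd\;r}_{obj\;\langle e;s\rangle}$:
\[
\langle e; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle} \;\land\; \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle P \,!\, T; S\rangle \;\land\; \langle e; s\rangle \overset{t}{\to^{*}} \langle e'; s'\rangle \;\to\; \exists T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e'; s'\rangle : \langle P \,!\, T'; S'\rangle
\]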

Lemma 4.1.4 (Faulty Program Configurations Untypable). If $\langle q; s\rangle \in R^{imd}_{obj\;\langle q;s\rangle}$ is faulty, then there are no Q and S such that $\vdash^{imd}_{obj\;\langle q;s\rangle} \langle q; s\rangle : \langle Q; S\rangle$.

This example is not an issue for Calcagno et al., because they do not require an ordered index to their unrestricted store. However, their system can type unsafe program configurations such as ⟨freeregion r1 after set freeregion r2 after ⟨r1, o2⟩ to deref ⟨r2, o1⟩; ∅{r1 ↦ ∅{o1 ↦ ⟨ref ⟨∅, o⟩⟩}}{r2 ↦ ∅{o1 ↦ ⟨ref ⟨∅, o⟩⟩}}⟩. These could be avoided by only claiming type soundness for source language programs or by excluding such programs on the basis of their lack of lexical region structure.

7 If we were to avoid masking effects, we could obtain the corresponding top-level result. However, it would be near trivial (at least for source programs), as most regions in a program will experience reads, allocations, and writes at some point.


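Proof: Theorem 4.1.1. We have $\vdash^{src}_{obj\;q} q : Q$. Because the intermediate language is an extension of the source language, there is a similar derivation $\emptyset \vdash^{imd}_{obj\;q} q : Q$ that simply passes around the empty store type. Applying rule $\vdash^{imd}_{obj\;s}$-empty, then $\vdash^{imd}_{obj\;\langle q;s\rangle}$, we have $\vdash^{imd}_{obj\;\langle q;s\rangle} \langle q; \emptyset\rangle : \langle Q; \emptyset\rangle$. Clearly, if $\langle q; \emptyset\rangle \Uparrow$, the condition is satisfied. Assume $\langle q; \emptyset\rangle \overset{t}{\Downarrow} \langle q'; s'\rangle$, i.e., $\langle q; \emptyset\rangle \overset{t}{\to^{*}} \langle q'; s'\rangle$ with $\langle q'; s'\rangle$ irreducible. By Lemma 4.1.3, $\exists S' \geq_{\epsilon} \emptyset.\ \vdash^{imd}_{obj\;\langle q;s\rangle} \langle q'; s'\rangle : \langle Q; S'\rangle$. By Lemma 4.1.4, $\langle q'; s'\rangle$ is not faulty. By Proposition 4.1.1, $q'$ is a value. By Lemma 4.1.1, $s' = \emptyset$. By rule $\vdash^{imd}_{obj\;s}$-empty, $S' = \emptyset$.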

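As with the unencapsulated object language, the proofs of effect soundness and that stepping preserves typability require two lemmas and a proposition. In the statement of Subject Reduction, both the trace type and the store type are mediated by the index of the configuration.

Lemma 4.1.5 (Subject Reduction).

\[
\langle e; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle} \;\land\; \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle P \,!\, T; S\rangle \;\land\; \langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle \;\to\; \exists T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e'; s'\rangle : \langle P \,!\, T'; S'\rangle \;\land\; \vdash^{imd\;r}_{obj\;t} t : T
\]

where $\rightharpoonup$ is the relation $\rightharpoonup^{imd\;r}_{obj\;\langle e;s\rangle}$.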

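In the proposition on Removing Context, the relation between the inner and outer trace type is mediated by the index outside of the context.

Proposition 4.1.2 (Removing Context Preserves Typability).

$\vdash^{imd}_{obj\;\langle q;s\rangle}$:
\[
\vdash^{imd}_{obj\;\langle q;s\rangle} \langle \overset{\to^{*}}{[e^{r_3}]}{}_{q}[e_3]; s\rangle : \langle Q; S\rangle \;\to\; \exists P_3, T_3.\ \vdash^{imd\;r_3}_{obj\;\langle e;s\rangle} \langle e_3; s\rangle : \langle P_3 \,!\, T_3; S\rangle
\]

$\vdash^{imd\;r}_{obj\;\langle e;s\rangle}$:
\[
\vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle \overset{\to^{*}}{[e^{r_3}]}{}_{e}[e_3]; s\rangle : \langle P \,!\, T; S\rangle \;\to\; \exists P_3,\ T_3 \leq_{r} T.\ \vdash^{imd\;r\,r_3}_{obj\;\langle e;s\rangle} \langle e_3; s\rangle : \langle P_3 \,!\, T_3; S\rangle
\]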

The relational constraints on both the trace type and store type in the Replacement lemma are now mediated by the index within the context. In the latter two clauses of the lemma, we obtain a constraint on the resulting trace type with respect to the original trace type, mediated by the index outside of the context.


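Lemma 4.1.6 (Replacement).

$\vdash^{imd}_{obj\;\langle q;s\rangle}$:
\[
\left(\begin{array}{l}
\langle \overset{\to^{*}}{[e]}{}_{q}[e_3]; s\rangle \in R^{imd}_{obj\;\langle q;s\rangle} \\
\land\ \vdash^{imd}_{obj\;\langle q;s\rangle} \langle \overset{\to^{*}}{[e]}{}_{q}[e_3]; s\rangle : \langle Q; S\rangle \\
\land\ \vdash^{imd\;r_3}_{obj\;\langle e;s\rangle} \langle e_3; s\rangle : \langle P_3 \,!\, T_3; S\rangle \\
\land\ \vdash^{imd\;r_3}_{obj\;\langle e;s\rangle} \langle e_3'; s'\rangle : \langle P_3 \,!\, T_3'; S'\rangle \\
\land\ T_3' \leq_{r_3} T_3 \ \land\ S' \geq_{r_3} S
\end{array}\right)
\;\to\;
\vdash^{imd}_{obj\;\langle q;s\rangle} \langle \overset{\to^{*}}{[e]}{}_{q}[e_3']; s'\rangle : \langle Q; S'\rangle
\]

$\vdash^{imd\;r}_{obj\;\langle e;s\rangle}$:
\[
\left(\begin{array}{l}
\langle \overset{\to^{*}}{[e]}{}_{e}[e_3]; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle} \\
\land\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle \overset{\to^{*}}{[e]}{}_{e}[e_3]; s\rangle : \langle P \,!\, T; S\rangle \\
\land\ \vdash^{imd\;r\,r_3}_{obj\;\langle e;s\rangle} \langle e_3; s\rangle : \langle P_3 \,!\, T_3; S\rangle \\
\land\ \vdash^{imd\;r\,r_3}_{obj\;\langle e;s\rangle} \langle e_3'; s'\rangle : \langle P_3 \,!\, T_3'; S'\rangle \\
\land\ T_3' \leq_{r\,r_3} T_3 \ \land\ S' \geq_{r\,r_3} S
\end{array}\right)
\;\to\;
\exists T' \leq_{r} T.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle \overset{\to^{*}}{[e]}{}_{e}[e_3']; s'\rangle : \langle P \,!\, T'; S'\rangle
\]

$\vdash^{imd}_{obj\;e}$:
\[
\left(\begin{array}{l}
\exists s.\ \langle \overset{\to^{*}}{[e]}{}_{e}[e_3]; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle} \\
\land\ \emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} \overset{\to^{*}}{[e]}{}_{e}[e_3] : P \,!\, T \\
\land\ \emptyset\{r\}\{r_3\};\,S \vdash^{imd\;r\,r_3}_{obj\;e} e_3 : P_3 \,!\, T_3 \\
\land\ \emptyset\{r\}\{r_3\};\,S' \vdash^{imd\;r\,r_3}_{obj\;e} e_3' : P_3 \,!\, T_3' \\
\land\ T_3' \leq_{r\,r_3} T_3 \ \land\ S' \geq_{r\,r_3} S
\end{array}\right)
\;\to\;
\exists T' \leq_{r} T.\ \emptyset\{r\};\,S' \vdash^{imd\;r}_{obj\;e} \overset{\to^{*}}{[e]}{}_{e}[e_3'] : P \,!\, T'
\]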

Proof: Theorem 4.1.2. We have $\Gamma;\,S \vdash^{imd\;r}_{obj\;e} e : P \,!\, T$.

$\vdash^{imd\;r}_{obj\;\langle e;s\rangle}$: We proceed by induction on the expression configuration reduction derivation.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-cntxt: Using Proposition 4.1.2 on the typing derivation of the initial expression configuration, we obtain $\exists P_1,\ T_1 \leq_{r} T.\ \vdash^{imd\;r\,r_1}_{obj\;\langle e;s\rangle} \langle e_1; s\rangle : \langle P_1 \,!\, T_1; S\rangle$. The rule $\to^{*\;imd}_{obj\;\langle e;s\rangle}$-cntxt determines an evaluation $\langle e_1; s\rangle \overset{t_1}{\to^{*}} \langle e_1'; s'\rangle$ at index $r\,r_1$, for (by inspection of trace evaluation contexts) $t = t_1$. By induction $\vdash^{imd\;r\,r_1}_{obj\;t} t_1 : T_1$. It follows that $\vdash^{imd\;r}_{obj\;t} t : T$.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-reflex: $t = [\,]$, so $\vdash^{imd\;r}_{obj\;t} t : T$ clearly holds.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-step: Use Lemma 4.1.5.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-trans: We have $\vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle P \,!\, T; S\rangle$, $\langle e; s\rangle \overset{t'}{\to^{*}} \langle e''; s''\rangle$, and $\langle e''; s''\rangle \overset{t''}{\to^{*}} \langle e'; s'\rangle$ with $t = t' + t''$. By Lemma 4.1.3, $\exists T' \leq_{r} T,\ S''.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e''; s''\rangle : \langle P \,!\, T'; S''\rangle$. By the induction hypothesis, we have $\vdash^{imd\;r}_{obj\;t} t' : T$ and $\vdash^{imd\;r}_{obj\;t} t'' : T'$. Because $T' \leq_{r} T$, $\vdash^{imd\;r}_{obj\;t} t'' : T$ and thus $\vdash^{imd\;r}_{obj\;t} t : T$.
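The -trans case can be mirrored by a small executable sketch. It assumes, as a simplification that ignores the index $r$, that a trace is a list of (operation, region, offset) events, that a trace has type T when every event's (operation, region) pair occurs in T, and that $T' \leq T$ is set inclusion. These readings and the function name are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Tuple

Event = Tuple[str, str, int]            # (operation, region, offset)
TraceType = FrozenSet[Tuple[str, str]]  # set of (operation, region)


def trace_has_type(trace: List[Event], trace_type: TraceType) -> bool:
    """Assumed trace typing: each event is permitted by the trace type."""
    return all((op, region) in trace_type for (op, region, _offset) in trace)


T = frozenset({("read", "r1"), ("write", "r1")})
T_prime = frozenset({("read", "r1")})   # a smaller trace type, T' <= T

t1 = [("write", "r1", 0)]               # first half of the evaluation, typed by T
t2 = [("read", "r1", 0)]                # second half, typed by T'

# Weakening t2 from T' to T, then concatenating, gives a trace typed by T.
whole_trace_ok = (
    trace_has_type(t1, T)
    and trace_has_type(t2, T_prime)
    and T_prime <= T
    and trace_has_type(t1 + t2, T)
)
```

The empty trace is typed by any trace type under this reading, which corresponds to the -reflex case.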

Proof: Lemma 4.1.3.

$\vdash^{imd}_{obj\;\langle q;s\rangle}$:

$\to^{*\;imd}_{obj\;\langle q;s\rangle}$-cntxt: We again use Proposition 4.1.2 to reduce the first statement to the second, but now we need Proposition 4.1.6 as well.

$\vdash^{imd}_{obj\;\langle e;s\rangle}$: Proceed as follows for each evaluation rule.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-cntxt: Use Proposition 4.1.2 and the induction hypothesis, then Proposition 4.1.6.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-reflex: $s = s'$ and $e = e'$, so let $S' = S$ and $T' = T$.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-step: Use Lemma 4.1.5.

$\to^{*\;imd}_{obj\;\langle e;s\rangle}$-trans: We have $\langle e; s\rangle \overset{t'}{\to^{*}} \langle e''; s''\rangle$ and $\langle e''; s''\rangle \overset{t''}{\to^{*}} \langle e'; s'\rangle$ with $t = t' + t''$. By induction, we have $\exists T'' \leq_{r} T,\ S'' \geq_{r} S.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e''; s''\rangle : \langle P \,!\, T''; S''\rangle$, and then $\exists T' \leq_{r} T'',\ S' \geq_{r} S''.\ \vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e'; s'\rangle : \langle P \,!\, T'; S'\rangle$.

Proof: Lemma 4.1.4.

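We show below that if $\langle e; s\rangle \in R^{imd\;r}_{obj\;\langle e;s\rangle}$ is immediately faulty, then there are no S and E such that $\vdash^{imd\;r}_{obj\;\langle e;s\rangle} \langle e; s\rangle : \langle E; S\rangle$. One can then observe that typing any faulty program configuration would require typing an immediately faulty configuration from a program expression evaluation context. Our proof is by contradiction. We perform case analysis on immediately faulty expression configurations.

$\langle x; s\rangle^{r}$: By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, we require $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} x : E$. This must be generated with $\vdash^{imd}_{obj\;p}$-var, which requires $x \in Dom(\emptyset\{r\})$.

$\langle @r_0\ \langle \lambda x.e_0\rangle; s\rangle^{r_1 r_0 r_2}$ ($\lambda x.e_0$ not closed): By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, we require $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} @r_0\ \langle \lambda x.e_0\rangle : E$. This must be generated with $\vdash^{imd}_{obj\;e}$-alloc and $\vdash^{imd}_{obj\;b}$-λ, which in turn require $(\emptyset\{r\} - r_2)\{x \mapsto P_{0.1}\};\,S \vdash^{imd\;r_1 r_0}_{obj\;e} e_0 : P_{0.2} \,!\, T_0$ and $E = P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2} \,!\, \{(\mathrm{alloc}\,@\,r_0)\}$. But $e_0$ contains a free variable other than x, so this will not be derivable.

$\langle @r_0\ \mathrm{deref}\ v; s\rangle^{r_1 r_0 r_2}$, $\langle @r_0\ \mathrm{set}\ v\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$, $\langle @r_0\ v\ e; s\rangle^{r_1 r_0 r_2}$ ($v \neq a$): By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, we require $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} e_0 : E$, for $e_0$ equals $@r_0\ \mathrm{deref}\ v$, $@r_0\ \mathrm{set}\ v\ \mathrm{to}\ e$, or $@r_0\ v\ e$. These must be generated with $\vdash^{imd}_{obj\;e}$-deref, $\vdash^{imd}_{obj\;e}$-set, or $\vdash^{imd}_{obj\;e}$-app, respectively, each of which requires $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} v : B\,@\,r_0 \,!\, T$, for B equals $\mathrm{Ref}\ P_0$ or $P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}$. This must be generated with $\vdash^{imd}_{obj\;e}$-pure, $\vdash^{imd}_{obj\;p}$-val, and $\vdash^{imd}_{obj\;v}$-addr-live, the latter of which requires $v = \langle r_0, o\rangle$.

$\langle @r_0\ \mathrm{deref}\ a; s\rangle^{r_1 r_0 r_2}$, $\langle @r_0\ \mathrm{set}\ a\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$, $\langle @r_0\ a\ e; s\rangle^{r_1 r_0 r_2}$ ($a \notin Dom(s)$): By $\vdash^{imd}_{obj\;v}$-addr-live (as above), $a = \langle r_0, o\rangle \in Dom(S)$. By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ we also require $\vdash^{imd\;r\,r_3}_{obj\;s} s : S$, so by $\vdash^{imd}_{obj\;s}$-nonempty, $r_0 \in Dom(s)$ and by $\vdash^{imd}_{obj\;≬s}$, $o \in Dom(s(r_0))$.

$\langle @r_0\ \mathrm{deref}\ a; s\rangle^{r_1 r_0 r_2}$, $\langle @r_0\ \mathrm{set}\ a\ \mathrm{to}\ e; s\rangle^{r_1 r_0 r_2}$ ($s(a) \neq \langle \mathrm{ref}\ v\rangle$): As above, by $\vdash^{imd}_{obj\;e}$-deref or $\vdash^{imd}_{obj\;e}$-set, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} a : \mathrm{Ref}\ P_0\,@\,r_0 \,!\, T$, and by $\vdash^{imd}_{obj\;v}$-addr-live $a = \langle r_0, o\rangle$ and $S(a) = \mathrm{Ref}\ P_0$. By rules $\vdash^{imd}_{obj\;\langle e;s\rangle}$, $\vdash^{imd}_{obj\;s}$-nonempty, and $\vdash^{imd}_{obj\;≬s}$, $S \vdash^{imd}_{obj\;d} s(a) : \mathrm{Ref}\ P_0$. By $\vdash^{imd}_{obj\;d}$-ref, $s(a) = \langle \mathrm{ref}\ v\rangle$.

$\langle @r_0\ a\ e; s\rangle^{r_1 r_0 r_2}$ ($s(a) \neq \langle \lambda x.e_0\rangle$): As above, by $\vdash^{imd}_{obj\;e}$-app, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} a : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}\,@\,r_0 \,!\, T$, and by $\vdash^{imd}_{obj\;v}$-addr-live we have $a = \langle r_0, o\rangle$ and $S(a) = P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}$. By rules $\vdash^{imd}_{obj\;\langle e;s\rangle}$, $\vdash^{imd}_{obj\;s}$-nonempty, and $\vdash^{imd}_{obj\;≬s}$, $S \vdash^{imd}_{obj\;d} s(a) : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}$. By $\vdash^{imd}_{obj\;d}$-λ, $s(a) = \langle \lambda x.e_0\rangle$.

$\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0; s\rangle^{r_1}$ ($s \notin {}^{imd}_{obj}s^{\,r_1 r_0 r_2}$): By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, we require $\emptyset\{r_1\};\,S \vdash^{imd\;r_1}_{obj\;e} \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0 : E$. This must be generated with $\vdash^{imd}_{obj\;e}$-freeregion, which requires $E = P_0 - r_0 \,!\, T_0 - r_0$ and $\emptyset\{r_1\}\{r_0\};\,S \vdash^{imd\;r_1 r_0}_{obj\;e} e_0 : P_0 \,!\, T_0$. By the definition of $\vdash^{imd\;r_1 r_0}_{obj\;e}$ we have $S \in {}^{imd}_{obj}S^{\,r_1 r_0 r_2}$. By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ we also require $\vdash^{imd\;r_1 r_3}_{obj\;s} s : S$, so $S \in {}^{imd}_{obj}S^{\,r_1 r_3}$, $r_3 = r_0 r_2$, and $s \in {}^{imd}_{obj}s^{\,r_1 r_0 r_2}$.

$\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0; s\rangle^{r_1}$ ($s \notin {}^{imd}_{obj}s^{\,r_1 r_0}$): By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, we require $\emptyset\{r_1\};\,S \vdash^{imd\;r_1}_{obj\;e} \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0 : E$. This must be generated with $\vdash^{imd}_{obj\;e}$-freeregion, which requires $E = P_0 - r_0 \,!\, T_0 - r_0$ and $\emptyset\{r_1\}\{r_0\};\,S \vdash^{imd\;r_1 r_0}_{obj\;e} v_0 : P_0 \,!\, T_0$. freeregion $r_0$ after $[e]$ is an evaluation context, so $\langle v_0; s\rangle \in R^{imd\;r_1 r_0}_{obj\;\langle e;s\rangle}$. By Lemma 4.1.1, $s \in {}^{imd}_{obj}s^{\,r_1 r_0}$.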

We now prepare for the proof of subject reduction by introducing three lemmas. We have seen that the reduction semantics uses three forms of substitution for βv reduction, region allocation, and region deallocation. We will show that typing is preserved in each of these cases.

Lemma 4.1.7 (Value Substitution).

$\vdash^{imd}_{obj\;p}$:
\[
\emptyset\{r\}\{x' \mapsto P'\};\,S \vdash^{imd\;r}_{obj\;p} p : P \;\land\; S \vdash^{imd\;r}_{obj\;v} v' : P' \;\to\; \emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;p} p[x' := v'] : P
\]

$\vdash^{imd}_{obj\;b}$ (with $r = r_1 r_0 r_2$):
\[
\emptyset\{r\}\{x' \mapsto P'\};\,S \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} b : B \,!\, T \;\land\; S \vdash^{imd\;r}_{obj\;v} v' : P' \;\to\; \emptyset\{r\};\,S \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} b[x' := v'] : B \,!\, T
\]

$\vdash^{imd}_{obj\;e}$:
\[
\emptyset\{r\}\{x' \mapsto P'\};\,S \vdash^{imd\;r}_{obj\;e} e : E \;\land\; S \vdash^{imd\;r}_{obj\;v} v' : P' \;\to\; \emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} e[x' := v'] : E
\]

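Two additional substitution lemmas handle the allocation and deallocation of regions. Each has two parts: one relating to expressions and another relating to stores. The constraint on r0 in Lemma 4.1.8 is required for the extensions of the static environment and store type in the conclusion to be defined.8

Lemma 4.1.8 (Region Allocation). For all $r_0 \notin Dom(S_1)$:

$\vdash^{imd}_{obj\;v}$:
\[
S_1 \vdash^{imd\;r_1 \rho_0}_{obj\;v} v_0 : P_0 \;\to\; S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;r_1 r_0}_{obj\;v} v_0[\rho_0 := r_0] : P_0[\rho_0 := r_0]
\]

$\vdash^{imd}_{obj\;p}$:
\[
\emptyset\{r_1\}\{\rho_0\};\,S_1 \vdash^{imd\;r_1 \rho_0}_{obj\;p} p_0 : P_0 \;\to\; \emptyset\{r_1\}\{r_0\};\,S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;r_1 r_0}_{obj\;p} p_0[\rho_0 := r_0] : P_0[\rho_0 := r_0]
\]

$\vdash^{imd}_{obj\;b}$ (active source prestorable, $\text{seq-/}(\overset{\rightharpoonup}{\varrho}) = r_1 \rho_0$, $\text{seq-/}(\overset{\rightharpoonup}{\varrho}{}') = r_1 r_0$):
\[
\emptyset\{r_1\}\{\rho_0\};\,S_1 \vdash^{imd\;\overset{\rightharpoonup}{\varrho}}_{obj\;b} b_0 : B_0 \,!\, T_0 \;\to\; \emptyset\{r_1\}\{r_0\};\,S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;\overset{\rightharpoonup}{\varrho}{}'}_{obj\;b} b_0[\rho_0 := r_0] : B_0[\rho_0 := r_0] \,!\, T_0[\rho_0 := r_0]
\]

$\vdash^{imd}_{obj\;e}$ (active source expression):
\[
\emptyset\{r_1\}\{\rho_0\};\,S_1 \vdash^{imd\;r_1 \rho_0}_{obj\;e} e_0 : E_0 \;\to\; \emptyset\{r_1\}\{r_0\};\,S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;r_1 r_0}_{obj\;e} e_0[\rho_0 := r_0] : E_0[\rho_0 := r_0]
\]

$\vdash^{imd}_{obj\;s}$:
\[
\vdash^{imd\;r_1}_{obj\;s} s_1 : S_1 \;\to\; \vdash^{imd\;r_1 r_0}_{obj\;s} s_1\{r_0 \mapsto \emptyset\} : S_1\{r_0 \mapsto \emptyset\}
\]

8 By the definition of, e.g., $\vdash^{imd\;r\,r_0}_{obj\;e}$, $r_0 \notin Dom(S_1) \to r_0 \notin Dom(\Gamma)$.

Lemma 4.1.9 (Region Deallocation).

$\vdash^{imd}_{obj\;v}$:
\[
S_1\{r_0 \mapsto ≬S_0\} \vdash^{imd\;r_1 r_0}_{obj\;v} v_0 : P_0 \;\to\; S_1 \vdash^{imd\;r_1}_{obj\;v} v_0[r_0 := \emptyset] : P_0 - r_0
\]

$\vdash^{imd}_{obj\;s}$:
\[
\vdash^{imd\;r_1 r_0}_{obj\;s} s_1\{r_0 \mapsto ≬s_0\} : S_1\{r_0 \mapsto ≬S_0\} \;\to\; \vdash^{imd\;r_1}_{obj\;s} s_1 : S_1
\]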

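Our statement of store weakening now operates over the generalized ordering relation on stores. It serves to maintain derivability as the store is enlarged through the allocation of additional offsets in lexically visible regions. It is again proven by induction on the typing derivations. In the case for $\vdash^{imd}_{obj\;v}$-addr-live, we have that $a = \langle r, o\rangle$ for $r \in \varrho$. For such r, $\forall o \in Dom(S(r)).\ S(r)(o) = S'(r)(o)$, so $S'$ may be substituted for S. Because the expression case is restricted to active source expressions, we need not handle a case for freeregion.

Proposition 4.1.3 (Store Weakening).

$\vdash^{imd}_{obj\;v}$:
\[
S \vdash^{imd\;\varrho}_{obj\;v} v : P \;\land\; S \leq_{\varrho} S' \;\to\; S' \vdash^{imd\;\varrho}_{obj\;v} v : P
\]

$\vdash^{imd}_{obj\;d}$:
\[
S \vdash^{imd\;\varrho}_{obj\;d} d : B \;\land\; S \leq_{\varrho} S' \;\to\; S' \vdash^{imd\;\varrho}_{obj\;d} d : B
\]

$\vdash^{imd}_{obj\;p}$:
\[
\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;p} p : P \;\land\; S \leq_{\varrho} S' \;\to\; \Gamma;\,S' \vdash^{imd\;\varrho}_{obj\;p} p : P
\]

$\vdash^{imd}_{obj\;b}$ (active source prestorable):
\[
\Gamma;\,S \vdash^{imd\;\overset{\rightharpoonup}{\varrho}}_{obj\;b} b : B \,!\, T \;\land\; S \leq_{\text{seq-/}(\overset{\rightharpoonup}{\varrho})} S' \;\to\; \Gamma;\,S' \vdash^{imd\;\overset{\rightharpoonup}{\varrho}}_{obj\;b} b : B \,!\, T
\]

$\vdash^{imd}_{obj\;e}$ (active source expression):
\[
\Gamma;\,S \vdash^{imd\;\varrho}_{obj\;e} e : E \;\land\; S \leq_{\varrho} S' \;\to\; \Gamma;\,S' \vdash^{imd\;\varrho}_{obj\;e} e : E
\]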
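The property of the ordering that the -addr-live case relies on can be checked mechanically. The sketch below assumes store types are nested dictionaries (region name to offset to type) and treats $S \leq_{\varrho} S'$ as "every offset typed by S in a region of the index is typed identically by S'". The full definition of the ordering is given elsewhere in the text, so this is only the fragment used in the proof, and the names are hypothetical.

```python
from typing import Dict, Sequence

RegionStoreType = Dict[str, str]            # offset -> type (types as strings)
StoreType = Dict[str, RegionStoreType]      # region name -> region store type


def store_type_leq(small: StoreType, large: StoreType,
                   index: Sequence[str]) -> bool:
    """Assumed fragment of S <=_index S': agreement on existing offsets."""
    for region in index:
        if region not in small:
            continue                        # nothing to preserve for this region
        if region not in large:
            return False                    # a visible region disappeared
        for offset, ty in small[region].items():
            if large[region].get(offset) != ty:
                return False                # offset missing or retyped
    return True


S0 = {"r1": {"o1": "Ref Unit"}}
# Allocating a fresh offset o2 in the visible region r1 enlarges the store type.
S0_extended = {"r1": {"o1": "Ref Unit", "o2": "Ref (Ref Unit @ r1)"}}

grows = store_type_leq(S0, S0_extended, ["r1"])
shrinks = store_type_leq(S0_extended, S0, ["r1"])
```

With this reading `grows` holds and `shrinks` does not, so a derivation that looks up an address in `S0` finds the same type in `S0_extended`.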

The following two additional lemmas relate to our restriction on the store. They mediate between the typing of code in an expression context and the typing of code in the store. Expification, a form of weakening, asserts that an expression in the store can be dropped into an expression context if the additional regions visible in that context are declared in the store type and static environment. Additional, nonlexical, region store types may also be added to the store type. Storification asserts that if a storable is typable as a prestorable in an extended context and if the additional regions visible in this context were not referenced in the storable type, it is also typable as a storable with these regions dropped from the store type.


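Lemma 4.1.10 (Expification).

$\vdash^{imd}_{obj\;v}$:
\[
S_0 \vdash^{imd\;r_1 r_0}_{obj\;v} v : P \;\to\; S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;v} v : P
\]

$\vdash^{imd}_{obj\;p}$:
\[
\emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0}_{obj\;p} p : P \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;p} p : P
\]

$\vdash^{imd}_{obj\;b}$ (active source prestorable):
\[
\emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0/}_{obj\;b} b : B \,!\, T \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} b : B \,!\, T
\]

$\vdash^{imd}_{obj\;e}$ (active source expression):
\[
\emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0}_{obj\;e} e : E \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;e} e : E
\]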

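In the premise of Storification, it is implicit that no region name in the sequence $r_2$ occurs free in B, i.e., $\forall r \in r_2.\ r \notin fr(B)$.

Proposition 4.1.4 (Storification).

\[
\emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} d : B \,!\, \emptyset \;\to\; S_0 \vdash^{imd\;r_1 r_0}_{obj\;d} d : B
\]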

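Proposition 4.1.4 will require a lemma with clauses converse to those of Lemma 4.1.10, but requiring an additional constraint that the regions being dropped do not occur in the term or its type. In each clause, r ranges over the region names of the sequence $r_2$.

Lemma 4.1.11 (Storification-aux).

$\vdash^{imd}_{obj\;v}$:
\[
\bigl(S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;v} v : P \;\land\; \forall r \in r_2.\ r \notin frn(v) \cup frn(P)\bigr) \;\to\; S_0 \vdash^{imd\;r_1 r_0}_{obj\;v} v : P
\]

$\vdash^{imd}_{obj\;p}$:
\[
\bigl(\emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;p} p : P \;\land\; \forall r \in r_2.\ r \notin frn(p) \cup frn(P)\bigr) \;\to\; \emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0}_{obj\;p} p : P
\]

$\vdash^{imd}_{obj\;b}$ (active source prestorable):
\[
\bigl(\emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} b : B \,!\, T \;\land\; \forall r \in r_2.\ r \notin frn(b) \cup frn(B)\bigr) \;\to\; \emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0/}_{obj\;b} b : B \,!\, T
\]

$\vdash^{imd}_{obj\;e}$ (active source expression):
\[
\bigl(\emptyset\{r_1\}\{r_0\}\{r_2\};\,S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\} \vdash^{imd\;r_1 r_0 r_2}_{obj\;e} e : E \;\land\; \forall r \in r_2.\ r \notin frn(e) \cup frn(E)\bigr) \;\to\; \emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0}_{obj\;e} e : E
\]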

Proof: Proposition 4.1.4. We have $B \in {}^{imd}_{obj}B^{\,r_1 r_0}$. If $d = \text{closed}\ \langle \lambda x.e\rangle$, then $e \in {}^{imd}_{obj}e^{\,r_1 r_0}$ and we can use the prestorable case of Lemma 4.1.11. Otherwise, $d = \langle \mathrm{ref}\ v\rangle$, so by $\vdash^{imd}_{obj\;d}$-ref, $B = \mathrm{Ref}\ P$, with $P \in {}^{imd}_{obj}P^{\,r_1 r_0}$. By $\vdash^{imd}_{obj\;v}$-addr-live, $r_2 \notin frn(P) \to r_2 \notin frn(v)$, so we can use the value case of Lemma 4.1.11.

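Proof: Lemma 4.1.5. By case analysis on the reduction step:

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-let: $\langle \mathrm{let}\ x = v\ \mathrm{in}\ e; s\rangle \overset{[\,]}{\rightharpoonup} \langle e[x := v]; s\rangle$ at index $r$.
We have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} \mathrm{let}\ x = v\ \mathrm{in}\ e : P \,!\, T$. By $\vdash^{imd}_{obj\;e}$-let and because values have no effect, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} v : P_{.1} \,!\, \emptyset$ and $\emptyset\{r\}\{x \mapsto P_{.1}\};\,S \vdash^{imd\;r}_{obj\;e} e : P \,!\, T$. The former gives, by $\vdash^{imd}_{obj\;e}$-pure, $S \vdash^{imd\;r}_{obj\;v} v : P_{.1}$. By Lemma 4.1.7 we have that $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} e[x := v] : P \,!\, T$. We complete with a reapplication of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-alloc: $\langle @r_0\ d_0; s\rangle \overset{[(\mathrm{alloc}\,@\,a_0)]}{\rightharpoonup} \langle a_0; s'\rangle$ at index $r = r_1 r_0 r_2$, where $s' = s\{a_0 \mapsto d_0\}$ and $a_0 = \langle r_0, o\rangle$.
We have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} @r_0\ d_0 : P \,!\, T$. By $\vdash^{imd}_{obj\;e}$-alloc, $P = B_0\,@\,r_0$, $T = T' \cup \{(\mathrm{alloc}\,@\,r_0)\}$, and $\emptyset\{r\};\,S \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} d_0 : B_0 \,!\, T'$, where by observation $T' = \emptyset$. By the definition of $\vdash^{imd\;r_1 r_0/r_2}_{obj\;b}$, $S = S_0\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\}$, with $S_0 = S_1\{r_0 \mapsto ≬S_0\}$. By Proposition 4.1.4, $S_1\{r_0 \mapsto ≬S_0\} \vdash^{imd\;r_1 r_0}_{obj\;d} d_0 : B_0$. By Proposition 4.1.3, $S_0' \vdash^{imd\;r_1 r_0}_{obj\;d} d_0 : B_0$, with $S_0' = S_1\{r_0 \mapsto ≬S_0\{o \mapsto B_0\}\}$. Applying $\vdash^{imd}_{obj\;v}$-addr-live, $\vdash^{imd}_{obj\;p}$-value, and $\vdash^{imd}_{obj\;e}$-pure, $\emptyset\{r\};\,S' \vdash^{imd\;r}_{obj\;e} a_0 : B_0\,@\,r_0 \,!\, \emptyset$, with $S' = S_0'\{r_2 \mapsto ≬S_2\}\{r_3 \mapsto ≬S_3\}$. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ and a sequence of $\vdash^{imd}_{obj\;s}$-nonempty that $s = s_0\{r_2 \mapsto ≬s_2\}\{r_3 \mapsto ≬s_3\}$, with $s_0 = s_1\{r_0 \mapsto ≬s_0\}$, $\vdash^{imd\;r_1}_{obj\;s} s_1 : S_1$, and $S_0 \vdash^{imd\;r_1 r_0}_{obj\;≬s} ≬s_0 : ≬S_0$. Replacing $≬s_0$ with $≬s_0\{o \mapsto d_0\}$ and $≬S_0$ with $≬S_0\{o \mapsto B_0\}$ and reapplying the sequence, $\vdash^{imd\;r\,r_3}_{obj\;s} s' : S'$, with $S' = S\{a_0 \mapsto B_0\}$. We complete with an application of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-deref: $\langle @r_0\ \mathrm{deref}\ a_0; s\rangle \overset{[(\mathrm{read}\,@\,a_0)]}{\rightharpoonup} \langle v_0; s\rangle$ at index $r = r_1 r_0 r_2$, where $a_0 = \langle r_0, o\rangle$ and $s(a_0) = \langle \mathrm{ref}\ v_0\rangle$.
We have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} \mathrm{deref}\ a_0 : P_0 \,!\, T$. By $\vdash^{imd}_{obj\;e}$-deref, $\vdash^{imd}_{obj\;e}$-pure, and $\vdash^{imd}_{obj\;p}$-value and because values have no effect, $S \vdash^{imd\;r}_{obj\;v} a_0 : \mathrm{Ref}\ P_0\,@\,r_0$ with $T = \{(\mathrm{read}\,@\,r_0)\}$. By $\vdash^{imd}_{obj\;v}$-addr-live, $a_0 = \langle r_0, o\rangle$ and $S(a_0) = \mathrm{Ref}\ P_0$. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\vdash^{imd}_{obj\;s} s : S$, and, by a sequence of $\vdash^{imd}_{obj\;s}$-nonempty, that $\vdash^{imd\;r_1 r_0}_{obj\;s} s_0 : S_0$. By an additional $\vdash^{imd}_{obj\;s}$-nonempty, $S_0 \vdash^{imd\;r_1 r_0}_{obj\;d} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0$. By $\vdash^{imd}_{obj\;d}$-ref, $S_0 \vdash^{imd\;r_1 r_0}_{obj\;v} v_0 : P_0$. By Lemma 4.1.10, $S \vdash^{imd\;r}_{obj\;v} v_0 : P_0$. Applying $\vdash^{imd}_{obj\;p}$-value and $\vdash^{imd}_{obj\;e}$-pure, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} v_0 : P_0 \,!\, \emptyset$. We complete with an application of rule $\vdash^{imd\;r}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-set: $\langle @r_0\ \mathrm{set}\ a_0\ \mathrm{to}\ v_0; s\rangle \overset{[(\mathrm{write}\,@\,a_0)]}{\rightharpoonup} \langle \mathrm{unit}; s'\rangle$ at index $r = r_1 r_0 r_2$, where $a_0 = \langle r_0, o\rangle$ and $s' = s\{a_0 \mapsto \langle \mathrm{ref}\ v_0\rangle\}$.
By $\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-set, $s(a_0) = \langle \mathrm{ref}\ v_0\rangle$. We have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} \mathrm{set}\ a_0\ \mathrm{to}\ v_0 : P \,!\, T_1$. By $\vdash^{imd}_{obj\;e}$-set, $P = \mathrm{Unit}$. We need $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} \mathrm{unit} : \mathrm{Unit} \,!\, \emptyset$, which follows using $\vdash^{imd}_{obj\;v}$-glob-const, $\vdash^{imd}_{obj\;p}$-value, and $\vdash^{imd}_{obj\;e}$-pure. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ and a sequence of $\vdash^{imd}_{obj\;s}$-nonempty that $\vdash^{imd\;r_1 r_0}_{obj\;s} s_0 : S_0$. By $\vdash^{imd}_{obj\;e}$-set, $\vdash^{imd}_{obj\;e}$-pure, and $\vdash^{imd}_{obj\;p}$-val, $S \vdash^{imd\;r}_{obj\;v} a_0 : \mathrm{Ref}\ P_0\,@\,r_0$. By $\vdash^{imd}_{obj\;v}$-addr-live, $a_0 = \langle r_0, o\rangle$ and $S(a_0) = \mathrm{Ref}\ P_0$. Also by $\vdash^{imd}_{obj\;e}$-set, $\vdash^{imd}_{obj\;e}$-pure, and $\vdash^{imd}_{obj\;p}$-value, $S \vdash^{imd\;r}_{obj\;v} v_0 : P_0$. By Lemma 4.1.11, $S \vdash^{imd\;r_1 r_0}_{obj\;v} v_0 : P_0$. Applying $\vdash^{imd}_{obj\;d}$-ref, $S \vdash^{imd\;r_1 r_0}_{obj\;d} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0$. Reapplying the sequence of $\vdash^{imd}_{obj\;s}$-nonempty, we obtain $\vdash^{imd\;r}_{obj\;s} s' : S$. We complete with an application of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-app-λ: $\langle a_0\ v_0; s\rangle \overset{[(\mathrm{exec}\,@\,a_0)]}{\rightharpoonup} \langle e_0[x := v_0]; s\rangle$ at index $r = r_1 r_0 r_2$, where $a_0 = \langle r_0, o\rangle$ and $s(a_0) = \langle \lambda x.e_0\rangle$.
By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} a_0\ v_0 : P \,!\, T$. By $\vdash^{imd}_{obj\;e}$-app and because values have no effect, $S \vdash^{imd\;r}_{obj\;v} v_0 : P_{0.1}$ and $S \vdash^{imd\;r}_{obj\;v} a_0 : (P_{0.1} \overset{T_0}{\Rightarrow} P)\,@\,r_0$ with $T_1 = \{(\mathrm{exec}\,@\,r_0)\} \cup T_0$. By $\vdash^{imd}_{obj\;v}$-addr-live, $S(a_0) = P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}$. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\vdash^{imd\;r\,r_3}_{obj\;s} s : S$. By a sequence of $\vdash^{imd}_{obj\;s}$-nonempty, $\vdash^{imd\;r_1 r_0}_{obj\;s} s_0 : S_0$ and $S_0 \vdash^{imd\;r_1 r_0}_{obj\;≬s} ≬s_0 : ≬S_0$. By $\vdash^{imd}_{obj\;≬s}$, $S \vdash^{imd}_{obj\;d} \langle \lambda x.e_0\rangle : S(a)$, and by $\vdash^{imd}_{obj\;d}$-λ and $\vdash^{imd}_{obj\;b}$-λ, $S_0 \vdash^{imd\;r_1 r_0}_{obj\;d} \langle \lambda x.e_0\rangle : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}$ and $\emptyset\{r_1\}\{r_0\};\,S_0 \vdash^{imd\;r_1 r_0/}_{obj\;b} \langle \lambda x.e_0\rangle : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2} \,!\, \emptyset$. Because $\langle \lambda x.e_0\rangle$ is an active source prestorable, we can apply Lemma 4.1.10 to obtain $\emptyset\{r\};\,S \vdash^{imd\;r_1 r_0/r_2}_{obj\;b} \langle \lambda x.e_0\rangle : P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2} \,!\, \emptyset$. By $\vdash^{imd}_{obj\;b}$-λ, $\emptyset\{r\}\{x \mapsto P_{0.1}\};\,S_0 \vdash^{imd\;r}_{obj\;e} e_0 : P_{0.2} \,!\, T_0$. By Lemma 4.1.7, $\emptyset\{r\};\,S \vdash^{imd\;r}_{obj\;e} e_0[x := v_0] : P_{0.2} \,!\, T_0$. We complete with an application of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-letregion: $\langle \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0; s_1\rangle \overset{[\,]}{\rightharpoonup} \langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0[\rho_0 := r_0]; s_1\{r_0 \mapsto \emptyset\}\rangle$ at index $r_1$.
By $\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-letregion, $r_0 \notin r_1$. By rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$, $\emptyset\{r_1\};\,S \vdash^{imd\;r_1}_{obj\;e} \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0 : P_1 \,!\, T_1$. By $\vdash^{imd}_{obj\;e}$-letregion, $\emptyset\{r_1\}\{\rho_0\};\,S_1 \vdash^{imd\;r_1 \rho_0}_{obj\;e} e_0 : P_0 \,!\, T_0$, $P_1 = P_0 - \rho_0$, and $T_1 = T_0 - \rho_0$. By Lemma 4.1.8, $\emptyset\{r_1\}\{r_0\};\,S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;r_1 r_0}_{obj\;e} e_0[\rho_0 := r_0] : P_0[\rho_0 := r_0] \,!\, T_0[\rho_0 := r_0]$. Apply $\vdash^{imd}_{obj\;e}$-freeregion to get $\emptyset\{r_1\};\,S_1\{r_0 \mapsto \emptyset\} \vdash^{imd\;r_1}_{obj\;e} \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0[\rho_0 := r_0] : P_0[\rho_0 := r_0] - r_0 \,!\, T_0[\rho_0 := r_0] - r_0$. Because $r_0 \notin r_1$, $r_0 \notin frv(P_1)$ and $r_0 \notin frv(T_1)$. Thus, $r_0 \notin frv(P_0)$ and $r_0 \notin frv(T_0)$, so $P_0[\rho_0 := r_0] - r_0 = P_0 - \rho_0$ and $T_0[\rho_0 := r_0] - r_0 = T_0 - \rho_0$, i.e., reduction leaves the pure type and trace type unchanged. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\vdash^{imd\;r_1}_{obj\;s} s_1 : S_1$, so applying Lemma 4.1.8 on stores we obtain $\vdash^{imd\;r_1 r_0}_{obj\;s} s_1\{r_0 \mapsto \emptyset\} : S_1\{r_0 \mapsto \emptyset\}$. We complete with an application of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\;\langle e;s\rangle}$-freeregion: $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0; s_1\{r_0 \mapsto ≬s_0\}\rangle \overset{[\,]}{\rightharpoonup} \langle v_0[r_0 := \emptyset]; s_1\rangle$ at index $r_1$.
We have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\emptyset\{r_1\};\,S_0 \vdash^{imd\;r_1}_{obj\;e} \mathrm{freeregion}\ r_0\ \mathrm{after}\ v_0 : P_1 \,!\, T_1$. By $\vdash^{imd}_{obj\;e}$-freeregion, $\vdash^{imd}_{obj\;e}$-pure, and $\vdash^{imd}_{obj\;p}$-val (because values have no effect), $S_0 \vdash^{imd\;r_1 r_0}_{obj\;v} v_0 : P_0$, with $P_1 = P_0 - r_0$ and $T_1 = \emptyset$. By Lemma 4.1.1, $S_0 \in {}^{imd}_{obj}S^{\,r_1 r_0}$, i.e., $S_0 = S_1\{r_0 \mapsto ≬S_0\}$. By Lemma 4.1.9, $S_1 \vdash^{imd\;r_1}_{obj\;v} v_0[r_0 := \emptyset] : P_1$. Applying $\vdash^{imd}_{obj\;p}$-val and $\vdash^{imd}_{obj\;e}$-pure provides an empty effect, as required. We also have by rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$ that $\vdash^{imd\;r_1 r_0}_{obj\;s} s_0 : S_0$. Using Lemma 4.1.9 on stores obtains $\vdash^{imd\;r_1}_{obj\;s} s_1 : S_1$. We complete with an application of rule $\vdash^{imd}_{obj\;\langle e;s\rangle}$.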
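The two region-management steps treated at the end of this proof can be sketched operationally. The encoding below is an assumption made only for illustration: terms are nested tuples such as `("deref", region, term)` and `("addr", region, offset)`, the store is a list of (region name, region store) pairs with the top of the stack last, fresh names are generated as `r0`, `r1`, and so on, and the dead region written ∅ in `v0[r0 := ∅]` is represented by the string `"∅"`.

```python
import itertools


def substitute_region(term, old, new):
    """Replace region `old` by `new`, stopping at a letregion that rebinds `old`."""
    if isinstance(term, str):
        return new if term == old else term
    if isinstance(term, tuple):
        if term[0] == "letregion" and term[1] == old:
            return term                      # `old` is shadowed below this binder
        return tuple(substitute_region(t, old, new) for t in term)
    return term                              # offsets and other atoms


def fresh_region(index):
    """Pick a region name that does not occur in the current store index."""
    for n in itertools.count():
        name = f"r{n}"
        if name not in index:
            return name


def step_letregion(rho, body, store):
    """<letregion rho in e0; s1>  steps to  <freeregion r0 after e0[rho := r0]; s1{r0 -> {}}>."""
    r0 = fresh_region([name for name, _ in store])
    return ("freeregion", r0, substitute_region(body, rho, r0)), store + [(r0, {})]


def step_freeregion(r0, value, store):
    """<freeregion r0 after v0; s1{r0 -> s0}>  steps to  <v0[r0 := dead]; s1>."""
    top_name, _ = store[-1]
    if top_name != r0:
        raise ValueError("region to free is not on top of the region stack")
    return substitute_region(value, r0, "∅"), store[:-1]


term, store = step_letregion("rho0", ("deref", "rho0", ("addr", "rho0", 0)),
                             [("r0", {})])
value, store_after = step_freeregion("r1", ("addr", "r0", 0), store)
```

In this run the fresh name chosen is `r1`, so `term` is `("freeregion", "r1", ("deref", "r1", ("addr", "r1", 0)))`, and freeing `r1` returns the store to its original single region. The explicit top-of-stack check corresponds to the stack discipline that the index structure enforces in the typing rules.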

Proof: Lemma 4.1.6. imd obj hq;

si: imd

∗

From ⊢obj hq; si h→

imd

q [e1 ]; si : hQ; Si we have S ⊢obj q

[ e]

imd

The former gives ∅; S1 ⊢obj e ∗

h→

ǫ

q[e1 ] : Q and ⊢obj s s: S.9 imd

→∗ [ e]

imd

ǫ

q [e1 ]: Q ! ∅. Applying rule ⊢obj he; si gives ⊢obj he; si imd

q[e1 ]; si: hP ! ∅; Si. Also using ⊢obj he; si

[ e]

imd

→∗ [ e]

r1

imd

r1

he1 ; si: hP1 ! T1 ; Si and ⊢obj he; si

he′1 ;

s′ i: hP1 ! T1′ ; S ′ i with T1′ ≤ T1 and S ′ ≥r1 S, the result on expression configurations proimd

ǫ

vides ⊢obj he; si h→

∗

imd

imd

q[e′1 ]; s′ i: hQ ! ∅; S ′ i. From rule ⊢obj he; si we have ∅; S ′ ⊢obj e

[ e]

ǫ

9We omit indexes on storable judgments to avoid confusion and because we will not need to refer to them.


imd

→∗ [ e]

imd

imd

q[e′1 ]: Q ! ∅ and ⊢obj s s′ : S ′ . Applying rule ⊢obj q to the former gives S ′ ⊢obj q imd

→∗ [ e]

imd

∗

q[e′1 ] : Q. Applying rule ⊢obj hq; si yields ⊢obj hq; si h→

imd obj he;

q [e′1 ]; si : hQ; Si.

[ e]

si: imd

r

From ⊢obj he; si h→ imd

∗

imd

e [e1 ]; si: hP ! T; Si we have ∅ {r}; S ⊢obj e

[ e]

imd

⊢obj s s: S. From ⊢obj he; si

r r1

imd

also have ∅ {r} {r1 }; S ⊢obj e

imd

he1 ; si: hP1 ! T1 ; Si and ⊢obj he; si

r r1

r

r r1

imd

e1 : P1 ! T1 , ∅ {r} {r1 }; S ′ ⊢obj e

→∗ [ e]

e [e1 ]: P ! T and

he′1 ; s′ i: hP1 ! T1′ ; S ′ i we

r r1

imd

imd

e′1 : P1 ! T1′ , and ⊢obj s

s′ : S ′ . By the result on expressions we have ∃ T ′ ≤ T. Γ; S ′ ⊢obj e imd

imd

r

∗

Applying rule ⊢obj he; si yields ⊢obj he; si h→

r

→∗ [ e]

e[e′1 ] : P ! T ′ .

e [e′1 ]; s′ i: hP ! T ′ ; S ′ i.

[ e]

imd obj e: imd

By induction on the portion of the derivation Γ; S ⊢obj e to

∗

r

→∗ [ e]

e[e3 ]: P ! T corresponding

→ [ e]

e . We show only selected cases.

[ e r ]: imd

r3 = ǫ ∧ P3 = P ∧ T3 = T. Use ∅ {r} {r3 }; S ′ ⊢obj e

r r3

e′3 : P3 ! T3′ letting T ′ = T3′ ≤

T3 = T. ∗

freeregion r0 after →

[ e]

e0

rr0 ˆ r2 r

imd

We have ∅ {r}; S ⊢obj e -freeregion,

: ∗

freeregion r0 after →

imd

∅ {r} {r0 }; S ⊢obj e imd

rr0

T0′ ≤ T0 . ∅ {r} {r0 }; S ′ ⊢obj e

rr0

imd

e 0 [e2 ]: P0 - r0 ! T0 - r0 . By ⊢obj e

[ e]

→∗ [ e]

e 0 [e2 ]: P0 ! T0 . By induction, we then obtain ∃ imd

→∗ [ e]

e 0 [e′2 ] : P0 ! T0′ . After applying ⊢obj e -freeregion

imd

to this result, we have ∅ {r}; S ′ ⊢obj e

r

∗

freeregion r0 after →

e 0 [e′2 ]: P0 - r0 ! T0′ - r0 ,

[ e]

with T0′ - r0 ⊆ T0 - r0 . ∗ r1 r0 r2 ˆ r3 @ r0 href → [ e]e′ i: imd

We have ∅ {r}; S ⊢obj e imd

r

→∗ [ e] ′

e [e3 ]i: Ref P0 @ r0 ! T ′ ∪ {(alloc @ r0 )}. By

@ r0 href

imd

imd

⊢obj e -alloc and ⊢obj b -ref, ∅ {r}; S ⊢obj e imd

T ′′ ≤ T ′ . ∅ {r}; S ′ ⊢obj e imd

yields ∅ {r}; S ′ ⊢obj e

r

r

r

→∗ [ e] ′

e [e3 ]: P0 ! T ′ . By induction we get ∃ imd

→∗ [ e] ′

imd

e [e′3 ] : P0 ! T ′′ . Applying ⊢obj b -ref and ⊢obj e -alloc ∗

@ r0 href →

[ e] ′

e [e′3 ]i: Ref P0 @ r0 ! T ′′ ∪ {(alloc @ r0 )}, with T ′′

∪ {(alloc @ r0 )} ⊆ T ′ ∪ {(alloc @ r0 )}. ∗ r ˆ r3 let x= → [ e]e′ in e: 10 imd

We have ∅ {r}; S ⊢obj e

r

let x= →

∗

[ e] ′

imd

e [e3 ] in e: P ! T.1 ∪ T.2 . By ⊢obj e -let, ∅ {r}; S

10 The proof of Calcagno et al. fails in cases such as this, with an additional subexpression besides the context. Without proving any properties of the residual store type, we cannot know that it is safe to substitute it into the derivation of the additional subexpression.


imd

⊢obj e

r

imd

→∗ [ e] ′

e [e3 ]: P.1 ! T.1 and ∅ {r} {x 7→ P.1 }; S ⊢obj e imd

we get ∃ T.1′ ≤ T.1 . ∅ {r}; S ′ ⊢obj e

r

r

e: P ! T.2 . By induction

→∗ [ e] ′

e [e′3 ] : P.1 ! T.1′ . By Lemma 4.1.2 on

→∗ [ e]

e , we can modify the second subderivation using Proposition 4.1.3 to yield ∅ {r} imd

{x 7→ P.1 }; S ′ ⊢obj e

r

imd

imd

e: P ! T.2 . Applying ⊢obj e -let yields ∅ {r}; S ′ ⊢obj e

r

let x=

→∗ [ e] ′

e [e′3 ] in e: P ! T.1′ ∪ T.2 , with T.1′ ∪ T.2 ⊆ T.1 ∪ T.2 .

Adding context can also preserve typability. There are two formulations for each syntactic category, depending on whether the new context is described by a region name or a region variable. In each case the store remains unchanged. For region names, we consider more of it to be lexical; we thus require that the first nonlexical region store correspond to the region name describing the new context. We restrict the last two cases to active source expressions and prestorables; in fact we only require the value case.

Proposition 4.1.5 (Refinement).

imd obj v: imd

r

S ⊢obj v v : P imd r S rr3 r4 ⊢obj v v : P

imd

rρ3

S ⊢obj v v: P imd rr3 S rr3 r4 ⊢obj v v: P

→ →

imd obj p: imd

r

Γ; S ⊢obj p p : P imd r Γ; S rr3 r4 ⊢obj p p : P

imd obj b

imd

rρ3

Γ {ρ3 } {x3 7→ P3 }; S ⊢obj p p: P imd rr3 rr3 r4 p obj Γ {r3 } {x3 7→ P3 }; S ⊢ p: P

→ →

(active source prestorable): imd

r0 /r2

imd

r0 /r2 ρ3

Γ; S ⊢obj b b: B!T → Γ {ρ3 } {x3 7→ P3 }; S ⊢obj b b: B!T imd r0 /r2 imd r0 /r2 r3 r0 r2 r3 r4 r0 r2 r3 r4 b b obj obj Γ; S ⊢ b : B ! T → Γ {r3 } {x3 7→ P3 }; S ⊢ b: B!T

imd obj e

(active source expression): imd

r

Γ; S ⊢obj e e : P ! T imd r Γ; S rr3 r4 ⊢obj e e : P ! T

→ →

imd

rρ3

Γ {ρ3 } {x3 7→ P3 }; S ⊢obj e e: P!T imd rr3 Γ {r3 } {x3 7→ P3 }; S rr3 r4 ⊢obj e e: P!T

Proof: Lemma 4.1.7. We actually show: imd obj p: imd

Γ; S ⊢obj p

̺

imd

̺

p : P ∧ Γ ( x′ ) = P ′ ∧ S ⊢obj v v ′ : P ′ → imd

Γ − x′ ; S ⊢obj p

̺

p [ x′ := v ′ ] : P

imd obj b: imd

Γ; S ⊢obj b

⇁ ̺

imd

⇁ ̺ = seq-/( ̺ )

imd

̺

b: B ! T ∧ Γ ( x′ ) = P ′ ∧ S ⊢obj v

imd

Γ - x′ ; S ⊢obj b

⇁ ̺

v′ : P ′ →

b [ x′ := v ′ ]: B ! T

imd obj e: imd

Γ; S ⊢obj e

̺

e: P ! T ∧ Γ ( x′ ) = P ′ ∧ S ⊢obj v v ′ : P ′ →

imd

Γ - x′ ; S ⊢obj e

̺

e [ x′ := v ′ ]: P ! T

The proof is by induction on the derivation of the term into which we are substituting. We show only selected cases.

⊢_{obj p}: Γ; S ⊢^{ϱ}_{obj p} x : P:

By ⊢_{obj p}-var, Γ(x) = P. If x′ = x then P = P′ and x[x′ := v′] = v′, so the result follows from an application of ⊢_{obj p}-value. Otherwise, by ⊢_{obj p}-var and the definition of environments, (Γ − x′)(x) = P. Γ − x′; S ⊢^{ϱ}_{obj p} x : P follows from ⊢_{obj p}-var, and is sufficient because x = x[x′ := v′].

⊢_{obj b}: Γ; S ⊢^{ϱ1 ϱ0 / ϱ2}_{obj b} ⟨λx.e0⟩ : P0.1 ⇒^{T0} P0.2 ! ∅:

By ⊢_{obj b}-λ, (Γ - ϱ2){x ↦ P0.1}; |S|_{ϱ1 ϱ0} ⊢^{ϱ1 ϱ0}_{obj e} e0 : P0.2 ! T0. To perform induction we use either the value derivation rebuilt with index ϱ1 ϱ0 or, if v′ = ⟨r2, o⟩ for some r2 ∈ ϱ2, a value derivation built using ⊢_{obj v}-addr-dead. The result of the induction is (Γ − ϱ2){x ↦ P0.1} - x′; |S|_{ϱ1 ϱ0} ⊢^{ϱ1 ϱ0}_{obj e} e0[x′ := v′] : P0.2 ! T0. By definition of environment extension, x′ ≠ x, so (Γ − ϱ2){x ↦ P0.1} - x′ = (Γ − x′ − ϱ2){x ↦ P0.1}. (Γ - x′); S ⊢^{ϱ1 ϱ0 / ϱ2}_{obj b} ⟨λx.e0[x′ := v′]⟩ : P0.1 ⇒^{T0} P0.2 ! ∅ follows by ⊢_{obj b}-λ, with ⟨λx.e0[x′ := v′]⟩ = ⟨λx.e0⟩[x′ := v′].

⊢_{obj e}: Γ; S ⊢^{r}_{obj e} freeregion r0 after e0 : P0 - r0 ! T0 - r0:

By ⊢_{obj e}-freeregion, Γ{r0}; S ⊢^{r r0}_{obj e} e0 : P0 ! T0. By the definition of ⊢^{r r0}_{obj e}, S ∈ obj S^{r r0 r3}, so by Proposition 4.1.5, S ⊢^{r r0}_{obj v} v′ : P′, and then by induction, Γ{r0} - x′; S ⊢^{r r0}_{obj e} e0[x′ := v′] : P0 ! T0. By definition of environment extension, Γ{r0} - x′ = (Γ - x′){r0}. Applying ⊢_{obj e}-freeregion, (Γ - x′); S ⊢^{r}_{obj e} freeregion r0 after e0[x′ := v′] : P0 - r0 ! T0 - r0, where freeregion r0 after e0[x′ := v′] = (freeregion r0 after e0)[x′ := v′].

Proof: Lemma 4.1.8. Unlike the situation with value substitution, the property of the variable being substituted always being declared on the outside of the static environment cannot be maintained through the induction. This is because the expression may declare program variables whose types use this region variable. Thus, we prove for all r0 ∉ Dom(S1):

⊢_{obj v}:
S1 ⊢^{r1 ρ0 ρ2}_{obj v} v : P →
S1{r0 ↦ ∅} ⊢^{r1 r0 ρ2}_{obj v} v[ρ0 := r0] : P[ρ0 := r0]

⊢_{obj p}:
Γ; S1 ⊢^{r1 ρ0 ρ2}_{obj p} p : P →
Γ[ρ0 := r0]; S1{r0 ↦ ∅} ⊢^{r1 r0 ρ2}_{obj p} p[ρ0 := r0] : P[ρ0 := r0]

⊢_{obj b} (active source prestorable, seq-/(⇁ϱ) = r1 ρ0 ρ2):
Γ; S1 ⊢^{⇁ϱ}_{obj b} b : B ! T →
Γ[ρ0 := r0]; S1{r0 ↦ ∅} ⊢^{⇁ϱ[ρ0 := r0]}_{obj b} b[ρ0 := r0] : B[ρ0 := r0] ! T[ρ0 := r0]

⊢_{obj e} (active source expression):
Γ; S1 ⊢^{r1 ρ0 ρ2}_{obj e} e : P ! T →
Γ[ρ0 := r0]; S1{r0 ↦ ∅} ⊢^{r1 r0 ρ2}_{obj e} e[ρ0 := r0] : P[ρ0 := r0] ! T[ρ0 := r0]

⊢_{obj s}:
⊢^{r}_{obj s} s : S1 →
⊢^{r r0}_{obj s} s{r0 ↦ ∅} : S1{r0 ↦ ∅}

The proof is by mutual induction: imd obj b: imd

Γ; S1 ⊢obj b

⇁ ̺

T

hλx.e i: P.1 ⇒ P.2 ! ∅:

⇁

imd

imd

Let ̺ = ̺1 ̺0 /̺2 . If ρ0 ∈ ̺1 ̺0 , by ⊢obj b-λ, (Γ - ̺2 ) {x 7→ P.1 }; | S1 |̺1 ̺0 ⊢obj e

̺1 ̺0

e:

P.2 ! T. We have by induction that (Γ - ̺2 ) {x 7→ P.1 } [ ρ0 := r0 ]; | S1 |̺1 ̺0 {r0 7→ ∅} imd

⊢obj e

̺1 ̺0 [ ρ0 := r0 ]

e [ ρ0 := r0 ]: P.2 [ ρ0 := r0 ] ! T [ ρ0 := r0 ]. (Γ - ̺2 ) {x 7→ P.1 } [ ρ0 :=

r0 ] = (Γ [ ρ0 := r0 ] - ̺2 ) {x 7→ P.1 [ ρ0 := r0 ]} as well as that | S1 |̺1 ̺0 {r0 7→ ∅} = | S1 imd

{r0 7→ ∅} |̺1 ̺0 [ ρ0 := r0 ] , so we can apply ⊢obj b -λ to obtain Γ [ ρ0 := r0 ]; S1 {r0 7→ ∅} imd

⊢obj b

⇁ ̺ [ ρ0 := r0 ]

hλx.e [ ρ0 := r0 ] i: P.1 [ ρ0 := r0 ]

:= r0 ] i = hλx.e i [ ρ0 := r0 ] and P.1 [ ρ0 := r0 ]

T [ ρ0 := r0 ]

⇒

T [ ρ0 := r0 ]

⇒

P.2 [ ρ0 := r0 ] ! ∅. hλx.e [ ρ0 T

P.2 [ ρ0 := r0 ] = P.1 ⇒ P.2 [ ρ0 T

:= r0 ], so we are done. Otherwise (ρ0 ∈ ̺2 ), hλx.e i, P.1 ⇒ P.2 , and ∅ do not include ρ0 so they are unchanged by the substitutions and Γ may be replaced with Γ [ ρ0 := r0 ], while S1 {r0 7→ ∅} provides the single additional region store required.

imd obj e: imd

Γ; S1 ⊢obj e By ⊢

imd obj e

̺ ′ ̺ ′ ̺ ′ = r1 ρ0 ρ2 1 0 2

-alloc,

Γ; S1 ⊢

@ ̺0′ b: B @ ̺0′ ! T:

imd ̺1′ ̺0′ /̺2′ obj b

b: B ! T ′ and ̺ ∈ ̺ where T = T ′ ∪ {(alloc @ ̺)}. imd (̺1′ ̺0′ /̺2′ ) [ ρ0 := r0 ]

By induction, Γ [ ρ0 := r0 ]; S1 {r0 7→ ∅} ⊢obj b r0 ] ! T ′ [ ρ0 := r0 ]. Applying ⊢

imd obj e

b [ ρ0 := r0 ]: B [ ρ0 := imd

-alloc

gives Γ [ ρ0 := r0 ]; S1 {r0 7→ ∅} ⊢obj e

r1 r0 ρ2

@ ̺0′ [ ρ0 := r0 ] b [ ρ0 := r0 ]: B [ ρ0 := r0 ] @ ̺0′ [ ρ0 := r0 ] ! T ′ [ ρ0 := r0 ] ∪ {(alloc @ ̺0′ [ ρ0 := r0 ])}, with @ ̺0′ [ ρ0 := r0 ] b [ ρ0 := r0 ] = @ ̺0′ b [ ρ0 := r0 ], B [ ρ0 := r0 ] @ ̺0′ [ ρ0 := r0 ] = (B @ ̺0′ ) [ ρ0 := r0 ], and T ′ [ ρ0 := r0 ] ∪ {(alloc @ ̺0′ [ ρ0 := r0 ])} = (T ′ ∪ {(alloc @ ̺0′ )}) [ ρ0 := r0 ]. imd

Γ; S1 ⊢^{r1 ρ0 ρ2}_{obj e} letregion ρ3 in e3 : P3 - ρ3 ! T3 - ρ3:

By ⊢_{obj e}-letregion, Γ{ρ3}; S1 ⊢^{r1 ρ0 ρ2 ρ3}_{obj e} e3 : P3 ! T3. By definition of environment extension, ρ3 ≠ ρ0. By induction, Γ{ρ3}[ρ0 := r0]; S1{r0 ↦ ∅} ⊢^{r1 r0 ρ2 ρ3}_{obj e} e3[ρ0 := r0] : P3[ρ0 := r0] ! T3[ρ0 := r0], with Γ{ρ3}[ρ0 := r0] = Γ[ρ0 := r0]{ρ3}. Applying ⊢_{obj e}-letregion, we obtain Γ[ρ0 := r0]; S1{r0 ↦ ∅} ⊢^{r1 r0 ρ2}_{obj e} letregion ρ3 in e3[ρ0 := r0] : P3[ρ0 := r0] - ρ3 ! T3[ρ0 := r0] - ρ3, with letregion ρ3 in e3[ρ0 := r0] = (letregion ρ3 in e3)[ρ0 := r0], P[ρ0 := r0] - ρ3 = (P - ρ3)[ρ0 := r0], and T[ρ0 := r0] - ρ3 = (T - ρ3)[ρ0 := r0].

Proof: Lemma 4.1.9. The proof is by case analysis on the derivation of S1{r0 ↦ ≬S0} ⊢^{r1 r0}_{obj v} v0 : P0:

S1{r0 ↦ ≬S0} ⊢^{r1 r0}_{obj v} a : B @ r:

By ⊢_{obj v}-addr-live, a = ⟨r, o⟩, S(a) = B. If r = r0 then S1 ⊢^{r1}_{obj v} ⟨∅, o⟩ : ∅ follows from ⊢_{obj v}-addr-dead with ⟨∅, o⟩ = a[r0 := ∅] and ∅ = (B @ r)[r0 := ∅]. Otherwise S1(a) = B. Reapplying ⊢_{obj v}-addr-live gives S1 ⊢^{r1}_{obj v} a : B @ r, with a = a[r0 := ∅] and B @ r = (B @ r)[r0 := ∅].

We provide only sketches for the remaining two proofs.

Proof: Lemma 4.1.10. We must again prove a more general result. In addition to the context of region names r2 defined between the points of allocation and application, there is a second context of lexical region and program variables that develops as we descend into the derivation of the function body. We write the prestorable case in two clauses, depending on whether the allocation is in the outer store regions r1 or the current store region and virtual regions r0 ρ2. We thus prove:

imd obj v: imd

S0 ⊢obj v

r1 r0 ρ2

imd

v: P → S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj v

r1 r0 r2 ρ2

v: P

imd obj p: imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj p

r1 r0 ρ2

p: P → imd

∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj p imd obj b

r1 r0 r2 ρ2

p: P

(active source prestorable, i): imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b

⇁ r1 r0 ρ2

b: B1 ! T → imd

∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b

⇁ r1 r0 r2 ρ2

b: B1 ! T


imd obj b


(active source prestorable, ii): imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b

⇁ r1 r0 ρ2

b: B2 ! T → imd

∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b imd obj e

⇁ r1 r0 r2 ρ2

b: B2 ! T

(active source expression): imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj e

r1 r0 ρ2

e: E → imd

∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj e

r1 r0 r2 ρ2

e: E

Proof: Lemma 4.1.11. The auxiliary lemma for storification requires the converse statement of that for expification: imd obj v: imd

(S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj v S0 ⊢

imd r1 r0 ρ2 obj v

r1 r0 r2 ρ2

v: P ∧ ∀r2 ∈ r2 .r2 ∈ / frn(v) ∪ frn(P)) →

v: P

imd obj p: imd

(∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj p

r1 r0 r2 ρ2

p: P ∧

∀r2 ∈ r2 .r2 ∈ / frn(p) ∪ frn(P)) → imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj p imd obj b

r1 r0 ρ2

p: P

(active source prestorable, i): ⇁ imd r1 r0 r2 ρ2

(∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b

b: B ! T ∧

∀r2 ∈ r2 .r2 ∈ / frn(b) ∪ frn(B)) → imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b imd obj b

⇁ r1 r0 ρ2

b: B ! T

(active source prestorable, ii): ⇁ imd r1 r0 r2 ρ2

(∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b ∀r2 ∈ r2 .r2 ∈ / frn(b) ∪ frn(B)) → imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b


⇁ r1 r0 ρ2

b: B ! T

b: B ! T ∧


imd obj e


(active source expression): imd

(∅ {r1 } {r0 } {r2 }{ρ2 }{x2 7→ P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj e

r1 r0 r2 ρ2

e: E ∧

∀r2 ∈ r2 .r2 ∈ / frn(e) ∪ frn(E)) → imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj e

r1 r0 ρ2

e: E

2. Monadic Language

2.1. Dynamics.

The syntax of a monadic intermediate language with encapsulation is presented in Figure 4.2.1. It modifies the monadic source language syntax of Figure 3.2.14 by generalizing the region variables to region indicators and adding features to support a reduction semantics as in Figure 4.1.1. It includes a running form, corresponding to freeregion of the object language, that, like run, does not bind any region indicator.

The object language has a single global view of the store. It thus requires a two-dimensional address to access first a region store and then a location. By our own rules, the view of a monadic store must be relative to the context within a program. Then all effectful operations can be applied locally. Because region names are not necessary for reduction, full addresses of the object language are replaced by simple offsets. In particular, the uppermost region stores become irrelevant as we use return to access outer regions. If they are popped from the stack the relevant information will always be on top. This is clear for lexically visible region stores but is just as true for region stores whose encapsulation construct lies within the current context (see Section 2.6 of the Introduction). The challenge is to maintain these alternate views in such a way that the appropriate region stores are made available for local processing.

Our store now has two components, adjoined by 'ˆ'. To the left, the lexical store ⊚s holds the lexically visible region stores. It takes the form of a simple sequence and not a finite function. The empty lexical store is ∅ and each region store is enclosed between '{' and '}' braces. To the right, the nonlexical store ⊚̸s holds the rest of the store as a '.'-left-delimited sequence of forests (sequences of trees) of


hq; 6 ⊚si ∈ he;

imd 6⊚ mon hq; s i imd q ∈ mon q r r si ∈ imd mon he; si ̺ e ̺ ∈ imd mon e

e r1 ∈

hv;

imd r1 mon e ̺ imd ̺1 ̺0 e ∈ mon e ̺1 ̺0 b ̺ ∈ imd mon b r1 r0 d r ∈ imd mon d ̺ p ̺ ∈ imd mon p r ⊚ ⊚ r si ∈ imd mon hv; s i ̺ v ̺ ∈ imd mon v

s

⊚ r

∈

s

6⊚ r

s

6⊚ r

s

∈

= ::=

imd imd imd 6 ⊚ ǫ mon q mon × mon s ǫ

e

imd r imd imd r mon e mon × mon s ̺ ̺ ̺

return p | let x = e in e ̺ | run e ̺ρ0 running e r1 r0

::= ::= ::= ::= ::= = ::=

return e ̺1 | b ̺ | deref p ̺ | set p ̺ to p ̺ | p ̺ p ̺ href p ̺ i | hλx.e ̺ i href v r i | hλx.e r i v̺ | x imd r mon v

::=

⊚ r

imd ⊚ ǫ mon s

::=

∅

imd ⊚ r1 r0 mon s

::=

⊚ r1

∈

×

s ˆ 6 ⊚s

s

r

r

{≬s }

imd 6 ⊚ ǫ mon s

::=

fǫ ≬ .s

imd 6 ⊚ r1 r0 mon s

::=

fr 6 ⊚ r1 ≬ s . s

∈

imd ⊚ r mon s

g| o

imd r mon s

sr ∈ ⊚ r

= ::=

Intermediate Languages

≬ r1 r0

s

f r1 ≬

f r1 imd ≬ s ∈ mon s ≬ r ≬ r1 r0 s ∈ imd mon s ̺ t r ∈ imd mon t r1 r0 f r ∈ imd mon f

::=

f r1 r0 ≬s

::= = ::=

∅ {o 7→ d r } r [ imd mon f ] (ι @ o)

fr ∈

::=

(ι @ o) | return f r2 r1

::= ::= ::= ::=

alloc | read | write | exec r|ρ ρ r

g ∈

imd r2 r1 r0 mon f ι ∈ imd mon ι imd ̺ ∈ mon ̺ ρ ∈ imd mon ρ r ∈ imd mon r imd imd mon g ⊇ obj g

o ∈

imd mon o

⊇

imd obj o

x ∈

imd mon x

⊇

imd obj x

Figure 4.2.1. Monadic Intermediate Language Syntax

region stores. We use the metavariable ≬s^f to denote a tree of region stores. Atomic traces no longer explicitly record the region at which an action occurs. Rather, they use return forms to identify its position with respect to the current level. Program configurations require only a nonlexical store because programs have no lexical context. Value configurations include only a lexical store because values do not contain running constructs that would be required to access nonlexical region stores. Expression configurations include a full store.
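As an informal illustration of the two-part store just described, the following Python sketch models a lexical store as a plain sequence of region stores and a nonlexical store as a sequence of forests. The names Tree, Store, lexical, nonlexical and count_region_stores, and the use of dictionaries from offsets to stored values, are assumptions made for this illustration only; they are not part of the formal syntax of Figure 4.2.1.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical encoding: a region store maps offsets to stored values.
RegionStore = Dict[int, Any]


@dataclass
class Tree:
    """A tree of region stores: a root region store and a forest of children."""
    region: RegionStore = field(default_factory=dict)
    children: List["Tree"] = field(default_factory=list)


@dataclass
class Store:
    """A full store: lexical part left of the adjoining symbol, nonlexical part right."""
    # Lexically visible region stores, outermost first (a plain sequence).
    lexical: List[RegionStore] = field(default_factory=list)
    # One forest per '.'; the first forest belongs to the top level.
    nonlexical: List[List[Tree]] = field(default_factory=lambda: [[]])


def count_region_stores(store: Store) -> int:
    """Count every region store held in either part of the store."""
    def in_forest(forest: List[Tree]) -> int:
        return sum(1 + in_forest(tree.children) for tree in forest)
    return len(store.lexical) + sum(in_forest(f) for f in store.nonlexical)


# Empty lexical store and a nonlexical store with a single empty forest.
initial = Store()
# One lexical region store holding a reference cell, and one childless
# nonlexical region store in the last forest.
example = Store(lexical=[{1: ("ref", "unit")}], nonlexical=[[], [Tree()]])
```

In this encoding a program configuration would carry only the nonlexical field and a value configuration only the lexical field, mirroring the distinction drawn in the text.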


We extend the indexing of the monadic source language to include region names as well as region variables, but region names are also restricted to a unique element. A region indicator metavariable is then just a boolean variable. As in the monadic source language, monadic operations are assigned a nonempty index, run requires an expression of index extended by a region variable, and return either occurs in sequence and requires a pure or requires an expression of retracted index. The subexpression of a running form has an index extended with an additional region name. The atomic trace component of a return form has an index retracted by a region indicator.

Lexical stores are indexed similarly to the stores of the object language. A region has access to those other regions in its lexical context that are net of (not cancelled by) return constructs. The trees in each successive forest in a nonlexical store have an index extended with an additional region name, as they correspond to regions that have access to successively increasing portions (prefixes) of the lexical store. The first forest has an empty index, corresponding to the top level. Thus, program configurations include a nonlexical store of empty index. The last forest has access to the entire lexical store, i.e., it contains regions declared by freeregion constructs that do not fall within enough net return constructs to cancel regions declared by freeregion constructs in the context of the expression. Each node of a tree contains a region store and a forest of children, both with an index extended by a single region name. Regions deeper in a tree are declared by more deeply nested freeregion constructs, net of return constructs, within the expression. Thus, a region store has an index extended by a number of region names equal to its depth in a tree plus its position in the sequence of forests. The region stores at successive depths of the tree, however, were created at the same depth of nesting of running constructs net of return constructs, in the expression's context.

A lexical and a nonlexical store that are combined as a full store must have the same index. An expression and store, or value and lexical store, that are combined as a configuration must have the same index. Again, actual terms are not annotated with level information. In particular, offsets may be represented as their name implies, and need not contain any region information.

With the object language, interaction between regions comes largely for free. But with the Monad Transformer per Region Language, stored values at different regions have different representations. For example, ⟨λx.return x⟩ and ⟨λx.return return x⟩ represent identity functions at the first and second levels, respectively. The question thus arises of what interactions between levels to allow and how to mediate such interaction. Because we maintain the store at increasing monadic levels, there is a direct association between region r0 of the store, and the portion of the program (with lexical index r1 r0) that may access it. It falls to the translation to ensure that all object language programs can be interpreted in this way.

⇀_{mon ⟨e; s⟩}-let (index r):
    ⟨let x = return^{r} v in e; s⟩ ⇀^{[]} ⟨e[x := v]; s⟩

⇀_{mon ⟨e; s⟩}-alloc (index r1 r0):
    ⟨d0; ⊚s1{≬s0} ˆ ⊚̸s0⟩ ⇀^{[(alloc @ o)]} ⟨return^{r1 r0} o; ⊚s1{≬s0{o ↦ d0}} ˆ ⊚̸s0⟩

⇀_{mon ⟨e; s⟩}-deref (index r1 r0):
    ≬s0(o) = ⟨ref v0⟩
    ⟨deref o; ⊚s1{≬s0} ˆ ⊚̸s0⟩ ⇀^{[(read @ o)]} ⟨return^{r1 r0} v0; ⊚s1{≬s0} ˆ ⊚̸s0⟩

⇀_{mon ⟨e; s⟩}-set (index r1 r0):
    ≬s0(o) = ⟨ref v0.1⟩
    ⟨set o to v0; ⊚s1{≬s0} ˆ ⊚̸s0⟩ ⇀^{[(write @ o)]} ⟨return^{r1 r0} unit; ⊚s1{≬s0{o ↦ ⟨ref v0⟩}} ˆ ⊚̸s0⟩

⇀_{mon ⟨e; s⟩}-app-λ (index r1 r0):
    ≬s0(o) = ⟨λx.e0⟩
    ⟨o v0; ⊚s1{≬s0} ˆ ⊚̸s0⟩ ⇀^{[(exec @ o)]} ⟨e0[x := v0]; ⊚s1{≬s0} ˆ ⊚̸s0⟩

⇀_{mon ⟨e; s⟩}-run (index r1):
    ⟨run e0; ⊚s1 ˆ ⊚̸s . ≬s^f_1⟩ ⇀^{[]} ⟨running e0; ⊚s1 ˆ ⊚̸s . ≬s^f_1 ∅⟩

⇀_{mon ⟨e; s⟩}-running (index r1):
    ⟨running return^{r1 r0} v0; ⊚s1 ˆ ⊚̸s . ≬s0⟩ ⇀^{[]} ⟨return^{r1} v0; ⊚s1 ˆ ⊚̸s . ε⟩

Figure 4.2.2. Monadic Expression Configuration Reduction Rules

Figure 4.2.2 presents the reduction rules of the language. The rule ⇀⟨e; s⟩-let does not use the store. It consumes a return form for each region name in the index. The rules for monadic operations use the top region store of the lexical store in a manner consistent with the language with only program-level encapsulation (Figure 2.2.3). The rules ⇀⟨e; s⟩-alloc, ⇀⟨e; s⟩-deref, and ⇀⟨e; s⟩-set must now generate trivial computations with a return form for each region in the lexical index. ⇀⟨e; s⟩-run and ⇀⟨e; s⟩-running correspond to ⇀⟨e; s⟩-letregion and ⇀⟨e; s⟩-freeregion of the object language (Figure 4.1.2). The last forest in the period-separated nonlexical store holds region stores representing running forms prefixed by the current lexical index. The rule ⇀⟨e; s⟩-run installs a new empty region store (with no children) at the end of this last forest. It is convenient to mark the leading ⊚̸s in these rules as optional; it will be unnecessary in the case of r1 = ε. ⇀⟨e; s⟩-running replaces the single region store (of index r1 r0, with no children) comprising the last period-terminated forest (of index r1) with an empty sequence. It consumes a single running form and a single return form. The remainder of the term takes the form return^{r1} v. All of these reduction rules leave the index unchanged.

The portion of the store required for evaluation is strictly decreasing as we descend expression evaluation contexts. Because all effects relate only to the innermost region, as we descend into return contexts we must truncate the store. This could be implemented by resorting to the more general notion of search rules instead of evaluation contexts, but we choose instead to redefine evaluation contexts as configuration contexts. We index program expression contexts as in the object language. Expression contexts are more general; because they need no longer take the form [e^{r r1}]e^{r}, no condensed representation is possible.
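The way the reduction rules of Figure 4.2.2 act on the store can be sketched in Python. This is an illustrative model under stated assumptions, not the formal semantics: region stores are dictionaries from integer offsets to stored values, a reference cell is the tuple ("ref", v), a tree is a pair (region store, children), and the fresh-offset policy in alloc (one more than the largest offset in use) is a hypothetical choice. Each operation returns the resulting term, the trace it emits, and the updated lexical and nonlexical stores.

```python
def returns(n, value):
    """Wrap `value` in n return forms, one per region in the lexical index."""
    for _ in range(n):
        value = ("return", value)
    return value


def alloc(lexical, nonlexical, storable):
    """alloc: bind a fresh offset in the top lexical region store."""
    top = lexical[-1]
    offset = max(top, default=0) + 1  # fresh-offset policy: an assumption
    lexical = lexical[:-1] + [{**top, offset: storable}]
    return returns(len(lexical), offset), [("alloc", offset)], lexical, nonlexical


def deref(lexical, nonlexical, offset):
    """deref: read a reference cell in the top lexical region store."""
    stored = lexical[-1][offset]
    assert stored[0] == "ref"
    return returns(len(lexical), stored[1]), [("read", offset)], lexical, nonlexical


def set_(lexical, nonlexical, offset, value):
    """set: overwrite a reference cell in the top lexical region store."""
    top = lexical[-1]
    assert top[offset][0] == "ref"
    lexical = lexical[:-1] + [{**top, offset: ("ref", value)}]
    return returns(len(lexical), "unit"), [("write", offset)], lexical, nonlexical


def run(nonlexical):
    """run: add an empty, childless region store at the end of the last forest."""
    return nonlexical[:-1] + [nonlexical[-1] + [({}, [])]]


def running(nonlexical):
    """running: the last forest is one childless region store; empty that forest."""
    (region, children), = nonlexical[-1]
    assert children == []
    return nonlexical[:-1] + [[]]
```

Note that alloc, deref and set_ touch only the last element of the lexical store, while run and running touch only the last forest of the nonlexical store, which is the locality property the text relies on.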

Configuration contexts are configurations with two holes, one in the store and one in the expression. They take the form ⟨[e^{r2}]e^{r1}; [s^{r2}]s^{r1}⟩, i.e., there is no longer any necessary relation between the index of the expression or store and its hole, but both indexes are modified in the same way. We refer to configuration contexts as [⟨e; s⟩^{r2}]⟨e; s⟩^{r1}, treating the holes expecting an expression and store as a single hole expecting a configuration and allowing indexes on metavariables to be inferred. We combine configuration contexts and trace contexts to form configuration/trace contexts. These take the form: [⟨e; s⟩^{r2}]⟨e; s⟩^{r1} ! [t^{r2}]t^{r1}.

Figure 4.2.3 defines atomic program expression configuration and expression configuration/trace evaluation contexts in terms of their three component holes. The first clause defines program expression configuration evaluation contexts to include only empty program configuration contexts. The second clause leaves the index unchanged. It includes empty expression configuration/trace contexts and let contexts, whose store and trace components are empty, i.e., they leave the store and trace unchanged. Similarly to the object language, we have a context for running but not run, so that regions are again allocated outermost-first. Effect-masking now takes place in the context for running. It is handled by a clause allowing the hole to have incremented index. Upon entering a running context, we must rearrange the store, moving a region store from the root of the rightmost

→∗ [ he; ∅ ˆ 6 ⊚s i ] ǫ


imd →∗ [ he; ∅ ˆ 6 ⊚s i ] hq; 6 ⊚s i mon ǫ

hq; 6 ⊚si ∈

::=

h[ e]; [ 6 ⊚s ]i →∗ [ he; sir ]

he; sir ! →

∗

[ tr ] r

t

∈

→∗ [ he; sir ]

∗

he; sir ! →

[ tr ] r

t ::=

h[ e]; [ ⊚s ] ˆ [ 6 ⊚s]i ! [ t] | hlet x = [ e] in e; [ ⊚s] ˆ [ 6 ⊚s ]i ! [ t] →∗ [ he; sir1 r0 ]

r1

he; si

∗

!→

[ t r1 r0 ]

t ∈

→∗ [ he; sir1 r0 ]

he; si

r1

!→

∗

[ t r1 r0 ] r 1

t

::=

hrunning [ e]; &([ ⊚s] ˆ [ 6 ⊚s ])i ! &[ t] →∗ [ he; sir1 ]

∗

he; sir1 r0 ! →

[ t r1 r0 ]

t ∈

→∗ [ he; sir1 ]

he; sir1 r0 ! →

∗

[ t r1 ] r 1 r 0

t

::=

⟨return [e]; [⊚s]{≬s0} ˆ [⊚̸s] . ≬s^f_0⟩ ! '[t]

Figure 4.2.3. CBV Monadic Language Atomic Expression Configuration/Trace Evaluation Contexts

list of trees in the nonlexical portion to the top of the lexical portion, inserting a period and letting the children form a new rightmost tree sequence. We must also modify the trace, adding a level of return to each atomic trace. Upon leaving, we must readjust both the store and the trace. We make the region store at the top of the lexical store the root of a tree whose children are the rightmost tree sequence, remove a dot, and merge this new tree into the next tree sequence to the left. We mask the trace, keeping only those atomic traces from which we can remove a level of return. These are achieved using the operator &.

The final clause allows the holes to have decremented index and includes return. Both portions of the store are retracted upon descending into a return context and restored upon ascending. Thus, the reduction rules, which operate by default on the uppermost region store, will be made to operate on a lower region store when they operate within a return context. Traces are treated opposite to the way they are treated in running contexts, i.e., upon entering a return context, we keep only those atomic traces from which we can remove a level of return. Upon leaving, we add a level of return to each atomic trace. This is accomplished with the operator '. We define the operators & and ' formally below. They form a retraction pair, i.e., & ◦ ' = id.

f f &(⊚s 1 {≬s 0 } ˆ 6 ⊚s. ≬s 1 . ≬s 0 )

≬ s0 f f ⊚ s 1 ˆ 6 ⊚s . ≬s 1 ≬s 0

=

≬ s0 f f '(⊚s 1 ˆ 6 ⊚s. ≬s 1 ≬s 0 )

&t 't

f

f

s 1 {≬s 0 } ˆ 6 ⊚s. ≬s 1 . ≬s 0

=

⊚

= =

[f|return f ∈ t] [return f|f ∈ t]

Iterating versions &r2 s r1 r2 and 'r2 s r1 of these operators are defined as follows: &ǫ s &r1 r0 s

= =

'ǫ s 'r1 r0 s

s &r1 &s

= =

s ''r1 s
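The behavior of & and ' on traces, and of their iterated versions, can be checked with a small Python sketch. The encoding is an assumption made for illustration: an atomic trace (ι @ o) is the pair (ι, o), a return form over an atomic trace f is the pair ("return", f), and the function names mask, lift and iterate are hypothetical.

```python
def mask(trace):
    """The operator & on traces: keep atomic traces under a return, stripping it."""
    return [f[1] for f in trace if f[0] == "return"]


def lift(trace):
    """The operator ' on traces: add one level of return to each atomic trace."""
    return [("return", f) for f in trace]


def iterate(op, n, x):
    """Apply `op` n times, once per region name in the index sequence."""
    for _ in range(n):
        x = op(x)
    return x


# A trace with one effect at the current level and one a level out.
t = [("alloc", 1), ("return", ("read", 2))]

# Masking drops the current-level effect and exposes the outer one.
masked = mask(t)

# Lifting then masking is the identity ...
round_trip = mask(lift(t))

# ... but masking then lifting loses the masked effect.
lossy = lift(mask(t))
```

The last two values illustrate why the text speaks of a retraction pair rather than an isomorphism: mask(lift(t)) recovers t, while lift(mask(t)) does not in general.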

We inductively define expression configuration/trace evaluation contexts over subconfigurations as the least set of expression configuration/trace contexts including these and closed over composition, and define program configuration evaluation contexts as the least set including atomic program configuration evaluation contexts and closed over composition with the expression configuration context component of expression configuration/trace evaluation contexts. We extend the operators over region sequences to stores to help in the definition of evaluation contexts. We define ⟨q; s⟩ ⇑ and ⟨q; s⟩ ⇓ ⟨q′; s′⟩ as in the object language.

t1 he1 ; s1 i →∗ he′1 ; s′1 i

∗

→

imd hq; si mon

-cntxt

imd he; sir1 mon

→∗ [ he; si]

hq;

6⊚

s i[he1 ; s1 i] →∗

→∗ [ he; si]

hq;

imd hq; si mon

6⊚

s i[he′1 ; s′1 i]

t1 he1 ; s1 i →∗ he′1 ; s′1 i

∗

→

imd he; sir mon

imd he; sir1 mon

-cntxt

→∗ [ t] →∗ [ he; si]

he; si[he1 ; s1 i]

t [t1 ] →∗

→∗ [ he; si]

imd he; sir mon

→∗

imd he; sir mon

-reflex

[] he; si →∗ he; si

t he; si →∗ he′ ; s′ i

imd he; sir mon

→

imd he; sir mon

imd he; sir mon ′

t he; si ⇀ he′ ; s′ i

∗

-step

imd he; sir mon

t he; si →∗ he′ ; s′ i imd he; sir mon

he; si[he′1 ; s′1 i]

→

t he′ ; s′ i →∗ he′′ ; s′′ i

∗

imd he; sir mon

-trans

imd he; sir mon

t+t′ he; si →∗ he′′ ; s′′ i imd he; sir mon

Figure 4.2.4. Monadic Language Multiple Deep Reduction Rules


Multiple deep reduction rules are presented in Figure 4.2.4. We can derive the antecedent index for

→∗

imd hq; si obj

-cntxt and

→∗

imd he; si obj

-cntxt from the index of the conclusion only as the cumulative result

of the index transformations in Figure 4.2.3. These rules now enter a context for the store as well as for the expression and, in the case of

→∗

imd he; si obj

-cntxt, the trace.

In Figures 4.2.5 through 4.2.7 we demonstrate this dynamic semantics on the sample source program typed in Figures 3.2.25 through 3.3.27. Initially, the lexical store is empty and the nonlexical store contains a single empty sequence of trees. After the first run reduction, the store is ∅ ˆ∅ Dropping into the new running context yields a store of ∅ {∅} ˆ .ǫ. ǫ After the allocation, the store becomes ∅ {∅ {o1 7→ href uniti}} ˆ .ǫ. ǫ We apply

⇀

imd he; si obj

-run, leading to a store of ∅ {∅ {o1 7→ href uniti}} ˆ .ǫ. ∅

Again dropping into the running context yields ∅ {∅ {o1 7→ href uniti}} {∅} ˆ .ǫ. ǫ. ǫ After a few operations on the upper region store, we obtain ∅ {∅ {o1 7→ href uniti}} {∅ {o1 7→ href o1 i}} ˆ .ǫ. ǫ. ǫ Descending into the return form, the store is truncated to ∅ {∅ {o1 7→ href uniti}} ˆ .ǫ. ǫ We truncate both the lexical and nonlexical parts; in this case the upper portion of the nonlexical store is empty. Allocating the function, the store becomes ∅ {∅ {o1 7→ href uniti} } ˆ .ǫ. ǫ {o2 7→ hλx1 .let x6 = run return return o1 in deref x6 i}

[Derivation tree for the reduction of the program configuration ⟨run let x2 = ⟨ref unit⟩ in run let x3 = ⟨ref x2⟩ in let x5 = deref x3 in return let x4 = ⟨λx1. let x6 = run return return x2 in deref x6⟩ in x4 x5; ∅ ˆ .ε⟩, continued in subderivation d1.]

Figure 4.2.5. Sample Monadic Language Reduction, I

Monadic Language

Dynamics


Executing the function provides a deref and a run form; after ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e; s\rangle}$-run we get

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ∅

We could now descend into the running form to obtain

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} {∅} ˆ .ε. ε. ε

and into another return form to obtain

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ε

but this is not necessary, so we back out to

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ∅

If we choose to continue backing out to the top level, we obtain

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} {∅ {o1 ↦ ⟨ref o1⟩}} ˆ .ε. ∅. ε

upon passing the return form,

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ∅ ∅ {o1 ↦ ⟨ref o1⟩}

upon passing the inner running form, and

    ∅ ˆ .∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩} ∅ ∅ {o1 ↦ ⟨ref o1⟩}

upon passing the outer running form. We can now see the tree structure of the program. The region store in the root holds the procedure. There are two child region stores. One, including only a cell holding unit, was created before the procedure was allocated. The other, empty, region store, was created during execution of the procedure body. Neither sibling region store has access to the other; both have access to the parent region store.

[Derivation tree for subderivation d1, which starts from ⟨run let x3 = ⟨ref o1⟩ in let x5 = deref x3 in return let x4 = ⟨λx1. let x6 = run return return o1 in deref x6⟩ in x4 x5; ∅ {∅ {o1 ↦ ⟨ref unit⟩}} ˆ .ε. ε⟩ and refers to subderivations d2 and d3.]

Figure 4.2.6. Sample Monadic Language Reduction, II

In terms of traces, [(alloc @ o2), (exec @ o2), (read @ o1)] is modified upon ascending to the return form to yield [return (alloc @ o2), return (exec @ o2), return (read @ o1)]. Combining with traces of the operations on the inner region, we obtain [(alloc @ o1), (read @ o1), return (alloc @ o2), return (exec @ o2), return (read @ o1)], but this is masked to [(alloc @ o2), (exec @ o2), (read @ o1)] upon ascending to the next running form. It is combined with the allocation of the outer cell to yield [(alloc @ o1), (alloc @ o2), (exec @ o2), (read @ o1)] but masked to [ ] upon ascending to the top level.

Otherwise we continue execution of the procedure. Applying ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e; s\rangle}$-running clears an empty region store to obtain

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ε

After ascending to the outer return form, the upper portion of the store is restored to yield

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} {∅ {o1 ↦ ⟨ref o1⟩}} ˆ .ε. ε. ε

Ascending to the inner running form, the store is

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ∅ {o1 ↦ ⟨ref o1⟩}

Another application of ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e; s\rangle}$-running reduces it to

    ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}} ˆ .ε. ε

After ascending to the outer running form, the store is

    ∅ ˆ .∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}

A final application of ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{obj}\,\langle e; s\rangle}$-running reduces it to

    ∅ ˆ ε
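The wrapping and masking of traces described in this passage can be checked mechanically. The sketch below assumes a representation that is not in the original: each trace entry is a triple `(depth, action, offset)`, where `depth` counts the enclosing return wrappers; `ascend_return` and `ascend_running` are hypothetical names.

```python
from typing import List, Tuple

Entry = Tuple[int, str, str]   # (number of return wrappers, action, offset)


def ascend_return(trace: List[Entry]) -> List[Entry]:
    """Ascending through a return form wraps every entry in one more return."""
    return [(depth + 1, action, offset) for depth, action, offset in trace]


def ascend_running(trace: List[Entry]) -> List[Entry]:
    """Ascending through a running form masks the entries of the region being
    closed (depth 0) and strips one return wrapper from the rest."""
    return [(depth - 1, action, offset)
            for depth, action, offset in trace if depth > 0]


# Replaying the example from the text.
body = [(0, "alloc", "o2"), (0, "exec", "o2"), (0, "read", "o1")]
wrapped = ascend_return(body)
inner = [(0, "alloc", "o1"), (0, "read", "o1")] + wrapped
after_inner_running = ascend_running(inner)
outer = [(0, "alloc", "o1")] + after_inner_running
top_level = ascend_running(outer)
```

Under this reading, `after_inner_running` equals the original `body` trace and `top_level` is empty, in agreement with the masking to [ ] at the top level.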

[Derivation trees for subderivations d2 and d3, where ⊚s1 = ∅ {∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. let x6 = run return return o1 in deref x6⟩}}.]

Figure 4.2.7. Sample Monadic Language Reduction, III

We continue along a similar course to that followed for the object language.

Definition 4.2.1 (Immediately Faulty Expression Configurations).

(1) ⟨return^r x; s⟩^r

(2) ⟨⟨λx.e0⟩; s⟩^{r1 r0} (λx.e0 not closed)

(3) ⟨deref v; s⟩^{r1 r0}, ⟨set v to p; s⟩^{r1 r0}, ⟨v p; s⟩^{r1 r0} (v ≠ o)

(4) ⟨deref o; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0}, ⟨set o to p; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0}, ⟨o p; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0} (o ∉ Dom(≬s))

(5) ⟨deref o; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0}, ⟨set o to p; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0} (≬s(o) ≠ ⟨ref v⟩)

(6) ⟨o p; ⊚s {≬s} ˆ ⊚̸s⟩^{r1 r0} (≬s(o) ≠ ⟨λx.e0⟩)

(7) ⟨running e0; s1⟩^r (s1 ≠ ⊚s1 ˆ ⊚̸s. ≬s̃1 ≬s0^{≬s̃0})

(8) ⟨running return e1; s1⟩^{r1} (e1 = return^{r1} v1 ∧ s1 ≠ ⊚s1 ˆ ⊚̸s. ≬s0)

Most immediately faulty expressions are similar to those of Definition 4.1.1. They differ in that all operations are assumed to take place with respect to the uppermost lexical region store. Additionally, the first clause for running requires the uppermost forest to be nonempty. The second, when the body is fully reduced, requires the uppermost forest to be a region store, i.e., a single tree with no children. As in the corresponding clauses for the object language, the second condition is a special case of the first. Intuitively, we can understand the second condition as follows. There are no siblings because they would have been moved into the lexical context as we descended into running forms and then would have been removed by truncation as we descended into return forms. There are no children because regions are deallocated innermost-first. Again, we needn’t consider varieties of syntactically invalid expression configurations, such as those with excess return forms. We again claim that our definition is complete.
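The two running clauses can be paraphrased as a check on the uppermost forest. This is a hedged sketch, not the dissertation's definition: a forest is modelled as a list of `(region, children)` pairs, and `running_status` with its three result labels is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# A tree is a pair (region store, forest of children); this encoding is an assumption.
TreeT = Tuple[Dict[str, str], list]


def running_status(top_forest: List[TreeT], body_is_value: bool) -> str:
    """Classify a `running e0` configuration by its uppermost forest."""
    if not top_forest:
        return "faulty"      # first clause: the uppermost forest must be nonempty
    if not body_is_value:
        return "descend"     # not immediately faulty: reduction continues in the body
    region, children = top_forest[-1]
    if len(top_forest) == 1 and not children:
        return "redex"       # a single region store: one tree with no children
    return "faulty"          # second clause: siblings or children remain
```

For example, a fully reduced body over a forest with one childless tree is a redex, while the same body over a forest with two trees is immediately faulty.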


Proposition 4.2.1 (Nonvalue program configurations are reducible or faulty). Every program configuration ⟨q; s⟩ with a nonvalue program q can be decomposed into the form (→∗[⟨e; s⟩]^r)⟨q; s⟩[⟨e; s⟩], where ⟨e; s⟩ is either a redex or immediately faulty. In the former case, we say that ⟨q; s⟩ is reducible, in the latter case that it is faulty.

Proof: Proposition 4.2.1. We call an expression of the form return^r v a value expression. We show by induction on the structure of expressions that every expression configuration ⟨e; s⟩ with a nonvalue expression e can be decomposed into the form (→∗[⟨e; s⟩])⟨e; s⟩[⟨e11; s11⟩], where ⟨e11; s11⟩ is either a redex or immediately faulty. The result for programs follows immediately because program configurations with nonvalue program can be decomposed using the empty program evaluation context into expression configurations with nonvalue expression. We then recognize that if a program q is an expression value e^ε, then it is just a value v^ε. We show only new and interesting cases.

⟨run e0; s⟩^r: Let (→∗[⟨e; s⟩]^r)⟨e; s⟩ = [⟨e; s⟩]^r. ⟨run e0; s⟩ is a redex.

⟨running e0; s⟩^r: If e0 is an expression value, then let (→∗[⟨e; s⟩]^r)⟨e; s⟩ = [⟨e; s⟩]^r and ⟨running e0; s⟩ is a redex or immediately faulty, depending on whether or not the uppermost forest in the nonlexical store is a singleton region store. Otherwise, by induction, ⟨e0; 's⟩^{r r0} can be decomposed into ⟨(→∗[e])e0[e1]; (→∗[s])s0[s1]⟩^{r r0}, where ⟨e1; s1⟩ is either a redex or immediately faulty, so (→∗[⟨e; s⟩])⟨e; s⟩ is ⟨running (→∗[e])e0; &(→∗[s])s0⟩.

⟨return e1; ⊚s1 {≬s0} ˆ ⊚̸s1. ≬s̃0⟩^{r1 r0}: If e1 is not an expression value, by induction ⟨e1; ⊚s1 ˆ ⊚̸s1⟩^{r1} can be decomposed into ⟨(→∗[e])e1[e2]; (→∗[s])s1[s2]⟩^{r1}, where ⟨e2; ⊚s2 ˆ ⊚̸s2⟩^{r2} is either a redex or immediately faulty, so (→∗[⟨e; s⟩])⟨e; s⟩ is ⟨return (→∗[e])e1; (→∗[⊚s])⊚s1 {≬s0} ˆ (→∗[⊚̸s])⊚̸s1. ≬s̃0⟩.

R^imd_mon⟨q; s⟩ is defined as in the object language but using configuration contexts. R^imd_mon⟨e; s⟩^r is the least set of expression configurations including those formed from R^imd_mon⟨q; s⟩ by pulling expression configurations from program configuration evaluation contexts, and closed under evaluation.


Definition 4.2.2 (Configurations Reachable from Source).

⟨q; ⊚̸s⟩: R^imd_mon⟨q; ⊚̸s⟩ is the least set such that both:

(1) ∀ q ∈ ^src_mon q. ⟨q; ε⟩ ∈ R^imd_mon⟨q; ⊚̸s⟩

(2) ∀ ⟨q; ⊚̸s⟩ ∈ R^imd_mon⟨q; ⊚̸s⟩. ⟨q; ⊚̸s⟩ →∗ ⟨q′; ⊚̸s′⟩ → ⟨q′; ⊚̸s′⟩ ∈ R^imd_mon⟨q; ⊚̸s⟩

⟨e; s⟩^r: ⟨e; s⟩^r ∈ R^imd_mon⟨e; s⟩^r ⇔ ∃ (→∗[⟨e; s⟩]^r)⟨q; ⊚̸s⟩. (→∗[⟨e; s⟩]^r)⟨q; ⊚̸s⟩[⟨e; s⟩] ∈ R^imd_mon⟨q; ⊚̸s⟩

Source-reachable program configurations constructed with a value must contain an empty nonlexical store. Source-reachable expression configurations constructed with a value expression must contain a store whose structure matches the lexical context of the expression configuration, with each forest empty.

Lemma 4.2.1 (Source-Reachable Value Configurations Have Lexical Store).

⟨q; ⊚̸s⟩: ⟨v; ⊚̸s⟩ ∈ R^imd_mon⟨q; ⊚̸s⟩ → ⊚̸s = ε

⟨e; s⟩^r: ⟨return^r v; ⊚s ˆ ⊚̸s⟩ ∈ R^imd_mon⟨e; s⟩^r → ⊚̸s = .ε. ε

Proof: Lemma 4.2.1. To prove this lemma, we must home in on the correspondence between running constructs and region stores in reachable configurations. We write |q| and |e^r|_{r/} for the number of running constructs net of return constructs (in the sense described below) in q and e^r, respectively, and |≬s̃| for the number of region stores in ≬s̃. Formally,

    |q| ∈ Nat
    |q| = |q|_ε

    |e^{r1 r2}|_{r1/r2} ∈ Nat
    |return e|_{r1 r0/} = 0
    |return e|_{r1/r2 r0} = |e|_{r1/r2}
    |running e|_{r1/} = 1 + |e|_{r1/r}
    |running e|_{r1/r2 r0} = |e|_{r1/r2 r0 r}
    |let x = e1 in e2|_{r1/r2} = |e1|_{r1/r2} + |e2|_{r1/r2}
    otherwise |e|_{r1/r2} = 0

    |⊚̸s|_{r1/r2} ∈ Nat
    |.≬s̃. ≬s̃0|_{r1/r2 r0} = |.≬s̃|_{r1/r2}
    |.≬s̃. ≬s̃0|_{r1/} = |≬s̃0|

We define φ(⟨q; ≬s̃⟩) to hold when |q| = |≬s̃|, and φ(⟨e; ⊚s ˆ ⊚̸s⟩^r) to hold when r = seq-/(r1/r2) → |e|_{r1/r2} = |⊚̸s|_{r1/r2}, and show instead that:

⟨q; ⊚̸s⟩: ⟨q; ⊚̸s⟩ ∈ R^imd_mon⟨q; ⊚̸s⟩ → φ(⟨q; ⊚̸s⟩)

⟨e; s⟩^r: ⟨e; s⟩^r ∈ R^imd_mon⟨e; s⟩^r → φ(⟨e; s⟩).

⟨q; ⊚̸s⟩:

(1) The empty program store contains no region stores and source language programs contain no running constructs, so |q| = 0 = |ε|.

(2) We show that φ is preserved by →∗:

⟨q; ⊚̸s⟩: Assume ⟨q; ⊚̸s⟩ →∗ ⟨q′; ⊚̸s′⟩ and φ(⟨q; s⟩). Then, |⊚̸s| = |q|.

${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle q; s\rangle}$-cntxt: We have q = (→∗[e]^{r1})q[e1], q′ = (→∗[e]^{r1})q[e1′], and ⟨e1; s⟩ →∗ ⟨e1′; s′⟩ with trace t1.

⟨e; s⟩^r: Assume ⟨e; s⟩ →∗ ⟨e′; s′⟩ with trace t and φ(⟨e; s⟩).

${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-cntxt: We have ⟨e; s⟩ = (→∗[⟨e; s⟩^{r1}])⟨e; s⟩[⟨e1; s1⟩], ⟨e′; s′⟩ = (→∗[⟨e; s⟩^{r1}])⟨e; s⟩[⟨e1′; s1′⟩], and ⟨e1; s1⟩ →∗ ⟨e1′; s1′⟩ with trace t1. Clearly, we can simulate ${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-cntxt with rules that allow only atomic contexts.

⟨running [e]; &([⊚s] ˆ [⊚̸s])⟩: Assume φ(⟨running e0; &(⊚s {≬s0} ˆ .≬s̃1. ≬s̃2. ≬s̃0)⟩). We have that for all r⃗, |running e0|_{r⃗} = |.≬s̃1. ≬s̃2 ≬s0^{≬s̃0}|_{r⃗}. Letting r⃗ = r1/ (the running construct is at “top-level” within the expression, net of return constructs), we have that |running e0|_{r⃗} = |e0|_{r⃗ r} + 1 and |.≬s̃1. ≬s̃2 ≬s0^{≬s̃0}|_{r⃗} = |.≬s̃1. ≬s̃2. ≬s̃0|_{r⃗ r} + 1, so we have φ(⟨e0; (⊚s {≬s0} ˆ .≬s̃1. ≬s̃2. ≬s̃0)⟩). Letting r⃗ = r1/r2 r0 (the running construct is not at “top-level” within the expression, net of return constructs), we have that |running e0|_{r⃗} = |e0|_{r⃗ r} and |.≬s̃1. ≬s̃2 ≬s0^{≬s̃0}|_{r⃗} = |.≬s̃1. ≬s̃2. ≬s̃0|_{r⃗ r}, so again φ(⟨e0; (⊚s {≬s0} ˆ .≬s̃1. ≬s̃2. ≬s̃0)⟩). We thus have by induction that φ(⟨e0′; (⊚s {≬s0} ˆ .≬s̃′1. ≬s̃′2. ≬s̃′0)⟩). Reversing the argument above gives φ(⟨running e0′; &(⊚s {≬s0} ˆ .≬s̃′1. ≬s̃′2. ≬s̃′0)⟩).

⟨return [e]; [⊚s] {≬s0} ˆ [⊚̸s]. ≬s̃0⟩: Assume φ(⟨return e1; ⊚s1 {≬s0} ˆ ⊚̸s1. ≬s̃0⟩). Letting r⃗ = r⃗1 r0 (we are cancelling a running construct that is within the expression), we have that |return e1|_{r⃗} = |⊚̸s1. ≬s̃0|_{r⃗}. Thus, |e1|_{r⃗1} = |⊚̸s1. ≬s̃0|_{r⃗1 r0} = |⊚̸s1|_{r⃗1}. By induction, |e1′|_{r⃗1} = |⊚̸s1′|_{r⃗1}. As above, |return e1′|_{r⃗1 r0} = |e1′|_{r⃗1} = |⊚̸s1′|_{r⃗1} = |⊚̸s1′. ≬s̃0|_{r⃗1 r0}. Letting r⃗ = r1 r0/ (we are cancelling a running construct that is outside of the expression), we have 0 = |return e1|_{r⃗} = |⊚̸s1. ≬s̃0|_{r⃗} = |≬s̃0|. Thus, |return e1′|_{r1 r0/} = 0 = |≬s̃0| = |⊚̸s1′. ≬s̃0|_{r1 r0/}.

${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-reflex: Clearly, this rule preserves the equality.

${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-step: The index r is unchanged by all reduction rules. ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-run introduces a running construct and extends the uppermost forest with a region store. In both cases, the count is affected iff the index is r1/. Similarly, ${\rightharpoonup}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-running removes a running construct and retracts the uppermost forest by a region store. The other rules affect neither the number of running constructs nor the number of region stores in the nonlexical store.

${\to^{*}}^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$-trans: By induction, the antecedent evaluations preserve φ, so their composition does as well.

⟨e; s⟩^r: By the result above, ⟨(→∗[e]^r)q[e]; s⟩ ∈ R^imd_mon⟨q; s⟩ → |(→∗[e]^r)q[e]| = |s|. As above, we have that |(→∗[e]^r)q[e]| = |(→∗[e]^r)q| + |e| = |r| + |e|, i.e., φ(⟨e; s⟩^r).

We again require as a lemma that programs and expressions reachable from the source language are factorable only into active source evaluation contexts.

Lemma 4.2.2 (R^imd_mon⟨q; ⊚̸s⟩ and R^imd_mon⟨e; s⟩^r Terms Factorable Only Into Active Source Evaluation Contexts). We define an active source expression context to be an expression, with a hole, that does not contain full running constructs. We overlook running constructs whose body includes the hole.

⟨q; ⊚̸s⟩: (→∗[⟨e; s⟩]^{r1})⟨q; ⊚̸s⟩[⟨e1; s1⟩] ∈ R^imd_mon⟨q; ⊚̸s⟩ → (→∗[⟨e; s⟩]^{r1})⟨q; ⊚̸s⟩ is an active source program expression configuration context.

⟨e; s⟩^r: ⟨(→∗[e]^{r ˆ r1})e[e1]; s⟩ ∈ R^imd_mon⟨e; s⟩^r → (→∗[e]^{r ˆ r1})e is an active source expression context.

Proof: Lemma 4.2.2. The proof is similar to that of Lemma 4.1.2.

2.2. Statics.

[Grammar of the static syntax: environments Γ and region environments ≬Γ, configuration types ⟨Q; ⊚̸S⟩, ⟨E; S⟩ and ⟨P; ⊚S⟩, expression types E, prestorable types B, pure types P, store types S (lexical ⊚S, nonlexical ⊚̸S, region ≬S), trace types T, effects ε, and atomic effects ι ::= alloc | read | write | exec.]

Figure 4.2.1. Monadic Intermediate Language Static Syntax

The static syntax is presented in Figure 4.2.1. As in the source language of Figure 3.2.17, environments are sequences of region environments of increasing index, there is a Return construct for identifying the region of allocation, and trace types, corresponding to monads, are built up from T^ε = Id using region trace types, corresponding to monad transformers. Store types correspond closely to the stores of Figure 4.2.1, but, as in the object language (Figure 4.1.1), region store types associate a storable type with an offset. We define ⊚S^r ˆ ⊚̸S^r ≤_r ⊚S′^r ˆ ⊚̸S′^r if and only if each region store type of ⊚S^r approximates the corresponding region store type of ⊚S′^r, and define ≥ accordingly. We define T′^r ≤_r T^r if and only if T′^r and T^r have the same structure and the effect at each level of T′^r is a subset of the effect at the corresponding level of T^r, and define ≥ accordingly. We assume definitions of & and ' on store types analogous to those on stores in the dynamic semantics. We also assume definitions analogous to those of Figure 3.2.18, but operating over the syntax of Figure 4.2.1 and generalized from region variables to region indicators. Finally, we define the empty nonlexical store type at index ϱ such that ⊚̸∅^ε = .ε, ⊚̸∅^{r r} = ⊚̸∅^r. ε, and ⊚̸∅^{ϱ ρ} = ⊚̸∅^ϱ.

[Table of typing judgment forms for program configurations, programs, expression configurations, expressions, prestorables, pures, values, value configurations, stores, lexical stores, nonlexical stores, region store trees, region stores, storables, and traces.]

Figure 4.2.2. Monadic Intermediate Language Typing Judgments

The typing judgments are in Figure 4.2.2. As in the object intermediate language, our typing judgments for terms must provide typing information regarding the store. Programs, having no context, require only a nonlexical store type. Prestorables, pures, and values require only a lexical store type, since they cannot contain running constructs. The index of these judgments must be a sequence of region names followed by a sequence of region variables. The lexical store type is assigned the region names as its own index. We now have separate judgments for lexical and nonlexical stores as well as a judgment for region trees. Typing a nonlexical store or region tree requires a lexical store type for resolving stored references. We will no longer use a typing judgment for atomic traces.

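We define $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,t^{r}} t : T$ to hold for $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,t^{\epsilon}} [\,] : \mathrm{Id}$ and for $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,t^{r_1 r_0}} t_0 : \mathrm{St}^{\varepsilon_0}\, T_1$ when $(\iota\ @\ o) \in t_0 \rightarrow \iota \in \varepsilon_0$ and $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,t^{r_1}} \&t_0 : T_1$, i.e., when actions occurring within a number of return constructs in the trace occur at the same depth of monad transformer applications in the trace type.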
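The trace typing condition and the ordering on trace types can be illustrated on a small model. This is a sketch under assumptions rather than the author's formulation: a trace type is a list of effect sets with the outermost transformer first (the empty list standing for Id), a trace entry is `(depth, action, offset)` with `depth` counting return wrappers, and `trace_has_type` and `trace_type_le` are hypothetical names.

```python
from typing import List, Set, Tuple

Entry = Tuple[int, str, str]    # (number of return wrappers, action, offset)
TraceType = List[Set[str]]      # effect set per transformer level; [] models Id


def trace_has_type(trace: List[Entry], ttype: TraceType) -> bool:
    """Check that each action sits at the transformer depth matching the
    number of return constructs wrapping it."""
    if not ttype:
        return trace == []                      # only the empty trace has type Id
    effects, rest = ttype[0], ttype[1:]
    for depth, action, _ in trace:
        if depth == 0 and action not in effects:
            return False                        # unwrapped action must be in this level's effect
    # Models &t0: drop the unwrapped actions and strip one return from the others.
    stripped = [(d - 1, a, o) for d, a, o in trace if d > 0]
    return trace_has_type(stripped, rest)


def trace_type_le(t1: TraceType, t2: TraceType) -> bool:
    """Same structure, and each level's effect in t1 is a subset of t2's."""
    return len(t1) == len(t2) and all(a <= b for a, b in zip(t1, t2))
```

For instance, the trace with an unwrapped alloc and a once-wrapped read has the type with effects {alloc} over {read}, and it keeps that typing under any larger trace type in the ordering.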

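\[
\frac{}{\emptyset \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{\epsilon}} g : \mathrm{TypeOf}(g)}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}\text{-glob-const}
\qquad
\frac{}{\emptyset \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{\epsilon}} o : \emptyset}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}\text{-addr-dead}
\]

\[
\frac{}{{}^{\circledcirc}S\,\{{}^{\between}S\} \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{r_1 r_0}} o : {}^{\between}S(o)}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}\text{-addr-live}
\]

\[
\frac{{}^{\circledcirc}S_1 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{r_1}} v : P_1}
     {{}^{\circledcirc}S_1\,\{{}^{\between}S_0\} \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{r_1 r_0}} v : \mathrm{Return}\ P_1}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}\text{-ret-running}
\qquad
\frac{{}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{\varrho}} v : P_1}
     {{}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{\varrho \rho_0}} v : \mathrm{Return}\ P_1}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}\text{-ret-run}
\]

Figure 4.2.3. Typing of Monadic Intermediate Language Values

Typing rules for values are presented in Figure 4.2.3. The rules $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-glob-const, $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-addr-live, and $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-addr-dead are similar to the corresponding rules of the object intermediate language (Figure 4.1.4), but require only the lexical portion of the store. For purposes of typing, we distinguish between uses of return that cancel a virtual region declared with run and those that cancel an actual region, declared with running. $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-ret-run, for virtual regions, is similar to the corresponding rule of the monadic source language (Figure 3.2.20), but is generalized to region indicators and also requires a lexical store. The new rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-ret-running, for actual regions, differs only in that the lexical store is retracted in the antecedent. The retraction of the store allows the rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-addr-live to access by default the uppermost region store type.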

\[
\frac{\{\emptyset\};\ \emptyset\ \hat{}\ {}^{\not\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,e^{\epsilon}} q : \mathrm{Id}\ Q}
     {{}^{\not\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,q} q : Q}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,q}
\]

Figure 4.2.4. Typing of Monadic Intermediate Language Programs

The rule for typing programs in Figure 4.2.4 is similar to that for the source language (Figure 3.2.21), but must handle store types. The corresponding rule of the object language (Figure 4.1.5) passes the store type through unchanged. By contrast, the conclusion of $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,q}$ requires a nonlexical store, which is combined with an empty lexical store in the antecedent.

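\[
\frac{{}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{r_1 r_0}} v_0 : P_0}
     {{}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,d^{r_1 r_0}} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,d}\text{-ref}
\]

\[
\frac{\{\emptyset\}\,\{\emptyset\}\,\{\emptyset\};\ {}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,b^{r_1 r_0}} \langle \lambda x.e_0\rangle : B_0}
     {{}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,d^{r_1 r_0}} \langle \lambda x.e_0\rangle : B_0}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,d}\text{-}\lambda
\]

Figure 4.2.5. Typing of Monadic Language Storables

Figure 4.2.5 contains rules for typing storables. They are similar to the rules of the object language (Figure 4.1.6), except that the empty environment is structured differently and that we pass a lexical store type through them.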

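\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s^{r}} {}^{\circledcirc}s : {}^{\circledcirc}S
      \qquad
      {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\not\circledcirc}s^{r}} {}^{\not\circledcirc}s : {}^{\not\circledcirc}S}
     {\vdash^{\mathrm{imd}}_{\mathrm{mon}\,s^{r}} {}^{\circledcirc}s\ \hat{}\ {}^{\not\circledcirc}s : {}^{\circledcirc}S\ \hat{}\ {}^{\not\circledcirc}S}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,s}
\]

\[
\frac{{}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,d^{r}} d_0 : B_0}
     {{}^{\circledcirc}S_0 \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\between}s^{r = r_1 r_0}} \emptyset\,\{o \mapsto d_0\} : \emptyset\,\{o \mapsto B_0\}}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\between}s}
\]

Figure 4.2.6. Typing of Monadic Language Stores and Region Stores

The rules of Figure 4.2.6 are used to type stores and region stores. The rule for region stores is similar to that of the object language (Figure 4.1.7), but again passes a lexical store type through. The rule for stores simply defers to separate rules for lexical and nonlexical stores, the latter still requiring the lexical store type.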

\[
\frac{}{\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s^{\epsilon}} \emptyset : \emptyset}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s}\text{-empty}
\]

\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s^{r_1}} {}^{\circledcirc}s_1 : {}^{\circledcirc}S_1
      \qquad
      {}^{\circledcirc}S_1\,\{{}^{\between}S_0\} \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\between}s} {}^{\between}s_0 : {}^{\between}S_0}
     {\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s^{r = r_1 r_0}} {}^{\circledcirc}s_1\,\{{}^{\between}s_0\} : {}^{\circledcirc}S_1\,\{{}^{\between}S_0\}}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s}\text{-nonempty}
\]

Figure 4.2.7. Typing of Monadic Lexical Stores

The rules for typing monadic lexical stores in Figure 4.2.7 are similar to those for typing stores in the object language (Figure 4.1.7), except that store extension does not require a region name.

Figure 4.2.8 presents rules for typing nonlexical stores. These are defined in terms of a rule for typing region store trees. Typing a region store tree in the context of a lexical store type with $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\between}\tilde{s}}$ requires typing the region store at the root and each region store tree in the forest of children, both with incremented index and lexical store type extended with the root region store type. Like derivations of lexical store typings, derivations of nonlexical store typings are built up using a pair of rules. The rule for empty nonlexical stores has a sequence of antecedents corresponding to the sequence of region store trees in the single region store tree forest. Typing a nonempty nonlexical store requires a sequence of typings for the uppermost region store forest and a typing of the remaining nonlexical store with a retracted index and lexical store type.

[Typing rules for region store trees and for nonlexical stores (empty and nonempty).]

Figure 4.2.8. Typing of Monadic Nonlexical Stores

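\[
\frac{}{\Gamma;\ {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,p^{\varrho}} x : \Gamma(x)}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,p}\text{-var}
\qquad
\frac{{}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{\varrho}} v : P}
     {\Gamma;\ {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,p^{\varrho}} v : P}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,p}\text{-value}
\]

Figure 4.2.9. Typing of Monadic Intermediate Language Pures

\[
\frac{\Gamma;\ {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,p^{\varrho_1 \varrho_0}} p : P_0}
     {\Gamma;\ {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,b^{\varrho_1 \varrho_0}} \langle \mathrm{ref}\ p\rangle : \mathrm{Ref}\ P_0}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,b}\text{-ref}
\]

\[
\frac{\Gamma_1\,\{{}^{\between}\Gamma_0\,\{x \mapsto P_{0.1}\}\};\ {}^{\circledcirc}S\ \hat{}\ {}^{\not\circledcirc}\emptyset^{\varrho_1 \varrho_0} \vdash^{\mathrm{imd}}_{\mathrm{mon}\,e^{\varrho_1 \varrho_0}} e_0 : {}^{\between}T_0\ T_1\ P_{0.2}}
     {\Gamma_1\,\{{}^{\between}\Gamma_0\};\ {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,b^{\varrho_1 \varrho_0}} \langle \lambda x.e_0\rangle : (P_{0.1} \Rightarrow {}^{\between}T_0\ T_1\ P_{0.2})}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,b}\text{-}\lambda
\]

Figure 4.2.10. Typing of Monadic Intermediate Language Prestorables

Monadic intermediate language pures and prestorables are typed using the rules of Figures 4.2.9 and 4.2.10, which are similar to those of the source language (Figures 3.2.22 and 3.2.23, respectively) except that the indexes are generalized to region indicators and a lexical store type is passed through. In the case of $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,b}$-λ, the lexical store type must be combined with an empty nonlexical store type of appropriate index in order to type the body as an active source expression.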

Monadic intermediate language expressions are typed using the rules of Figure 4.2.11, which are similar to those of the source language (Figure 3.2.24), but generalize the indexes to region indicators and thread a full store type through. $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-pure, $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-deref, $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-set, $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-app, and $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-alloc pass only the lexical part of the store type to their pure and, for the latter, prestorable, antecedents. Two new rules, $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-running and $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-ret-running, are similar to $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-run and $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-ret-run, respectively, but must actively modify the store type instead of simply passing it through.

[Typing rules for expressions: -pure, -let, -alloc, -deref, -set, -app, -run, -running, -ret-run, and -ret-running.]

Figure 4.2.11. Typing of Monadic Intermediate Language Expressions

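In the case of $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-running, we use ' to reorganize the store type in the antecedent, dividing the uppermost region store type forest and placing the root of the rightmost tree on top of the lexical store type. In the case of $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,e}$-ret-running, we retract both the lexical and nonlexical store types in the antecedent. This can be seen as a generalization of the rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,v}$-ret-running that allows other rules to operate by default on the uppermost region store type of the lexical store type and the uppermost region store type forest of the nonlexical store type.

Typings of program configurations are derived using the rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle q; {}^{\not\circledcirc}s\rangle}$, which requires both that the nonlexical store type properly describe the nonlexical store and that the program be typable using the nonlexical store type but without regard to the specific values in the nonlexical store.

\[
\frac{\emptyset \vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\not\circledcirc}s^{\epsilon}} {}^{\not\circledcirc}s : {}^{\not\circledcirc}S
      \qquad
      {}^{\not\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,q} q : Q}
     {\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle q; {}^{\not\circledcirc}s\rangle} \langle q; {}^{\not\circledcirc}s\rangle : \langle Q; {}^{\not\circledcirc}S\rangle}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle q; {}^{\not\circledcirc}s\rangle}
\]

Typings of expression configurations are derived using the rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}$, which requires both that the store type properly describe the store and that the expression be typable using the store type but without regard to the specific values in the store.

\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{mon}\,s^{r}} s : S
      \qquad
      \{\emptyset\}\,\{\emptyset\};\ S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,e^{r}} e : E}
     {\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle^{r}} \langle e; s\rangle : \langle E; S\rangle}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle e; s\rangle}
\]

Typings of value configurations are derived using the rule $\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle v; {}^{\circledcirc}s\rangle}$, which requires both that the lexical store type properly describe the lexical store and that the value be typable using the lexical store type but without regard to the specific values in the lexical store.

\[
\frac{\vdash^{\mathrm{imd}}_{\mathrm{mon}\,{}^{\circledcirc}s^{r}} {}^{\circledcirc}s : {}^{\circledcirc}S
      \qquad
      {}^{\circledcirc}S \vdash^{\mathrm{imd}}_{\mathrm{mon}\,v^{r}} v : P}
     {\vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle v; {}^{\circledcirc}s\rangle^{r}} \langle v; {}^{\circledcirc}s\rangle : \langle P; {}^{\circledcirc}S\rangle}
\quad \vdash^{\mathrm{imd}}_{\mathrm{mon}\,\langle v; {}^{\circledcirc}s\rangle}
\]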

Figures 4.2.12 through 4.2.14 present a typing of a snapshot of the reduction of Figures 4.2.5 through 4.2.7, at the short-lived point at which all three regions are allocated. Throughout, locations are represented by offsets, and the region to which they refer is determined by context and specified via Return forms. In particular, the multiple occurrences of $o_1$ are distinguished by context. Typing the program configuration again requires typing the program as well as the nonlexical store. For the latter, $\vdash^{imd\ ⊚̸}_{mon\ s}$-empty is used to type the single region store forest and $\vdash^{imd\ \overset{f}{≬}}_{mon\ s}$ to type its

[Figure 4.2.12: typing of the program configuration, using the following region stores and region store types.]

$≬s_1 = \emptyset\,\{o_1 \mapsto \langle \mathrm{ref\ unit}\rangle\}\,\{o_2 \mapsto \langle \lambda x_1.\,\mathrm{let}\ x_6 = \mathrm{run\ return\ return}\ o_1\ \mathrm{in\ deref}\ x_6\rangle\}$

$≬s_2 = \emptyset\,\{o_1 \mapsto \langle \mathrm{ref}\ o_1\rangle\}$

$≬S_1 = \emptyset\,\{o_1 \mapsto \mathrm{Ref\ Return\ Unit}\}\,\{o_2 \mapsto \mathrm{Ref\ Return\ Unit} \Rightarrow St^{\{read\}}\ Id\ \mathrm{Ref\ Return\ Unit}\}$

$≬S_2 = \emptyset\,\{o_1 \mapsto \mathrm{Ref\ Return\ Ref\ Return\ Unit}\}$

Figure 4.2.12. Sample Monadic Intermediate Language Derivation, I

Modelling Encapsulation of State With Monad Transformers

Intermediate Languages

single tree. This in turn requires typing the root region store and the two child subtrees, each of which contains only a single region store. We thus require derivations of typings of the three region stores, corresponding to those of the object language example of Figures 4.1.11 and 4.1.12. The languages differ primarily in that typing the empty region store in the monadic language requires a lexical store type that includes the parent region store type $≬S_1$ but excludes the sibling region store type $≬S_2$. They differ also in the use of $\vdash^{imd}_{mon\ v}$-ret-running to reduce the index and lexical store type before either a global constant derivation or a derivation (with $\vdash^{imd}_{mon\ v}$-addr-live) of an offset referring to a lower region. The first region store contains a function, typed in d2 of Figure 4.2.14. An empty nonlexical store type is installed as we descend into the body. The program is typed in d1 of Figure 4.2.13 and d3 of Figure 4.2.14. The derivation begins with only a nonlexical store type and produces only a pure type. The first expression derivation introduces an empty lexical store type and generates an empty trace type. The store type is modified as we proceed through antecedents of $\vdash^{imd}_{mon\ e}$-running and $\vdash^{imd}_{mon\ e}$-ret-running derivations. It is modified in ways similar to those in which the store is modified in the dynamic semantics upon entering the corresponding evaluation contexts. At the same time, the trace type and pure type are modified, as with $\vdash^{imd}_{mon\ e}$-run and $\vdash^{imd}_{mon\ e}$-ret-run in the source language. The use of $\vdash^{imd}_{mon\ e}$-pure keeps only the lexical store type in place while $\vdash^{imd}_{mon\ p}$-value drops the environment. $\vdash^{imd}_{mon\ v}$-ret-running and $\vdash^{imd}_{mon\ v}$-addr-live are then used to make $o_1$ point to the lower region. We could instead have used $\vdash^{imd}_{mon\ e}$-ret-running followed by $\vdash^{imd}_{mon\ e}$-pure, $\vdash^{imd}_{mon\ p}$-value, and $\vdash^{imd}_{mon\ v}$-addr-live to accomplish the same thing.

2.3. Type and Effect Soundness.

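As for the object language, we state Type Soundness only for source language programs, and Effect Soundness only for reachable expression configurations.

Theorem 4.2.1 (Type Soundness).

$$\vdash^{src}_{mon\ q} q : Q \;\rightarrow\; \langle q;\epsilon\rangle \Uparrow \;\lor\; \exists v.\,\bigl(\langle q;\epsilon\rangle \Downarrow^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle v;\epsilon\rangle \;\land\; \vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle v;\epsilon\rangle : \langle Q;\epsilon\rangle\bigr)$$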

During the evaluation of a well-typed expression in a context, no unexpected operations are performed. We again present only the contextualized effect soundness result.
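As a rough illustration of what "no unexpected operations" means, the sketch below checks a recorded trace against a trace type. The encoding is an assumption made for this example: a trace is a list of `(operation, offset, depth)` triples, and a trace type is a list of sets of permitted operations, with index 0 standing for the uppermost region. The function names are hypothetical.

```python
def trace_has_type(trace, trace_type):
    """Every recorded operation must be permitted at the region depth it touched."""
    for operation, _offset, depth in trace:
        if depth >= len(trace_type):
            return False
        if operation not in trace_type[depth]:
            return False
    return True


def subeffect(smaller, larger):
    """Pointwise inclusion of permitted operations (an assumed reading of T' <= T)."""
    if len(smaller) > len(larger):
        return False
    return all(a <= b for a, b in zip(smaller, larger))


def concat(first, second):
    """Traces of consecutive evaluations are concatenated: t = t' + t''."""
    return first + second
```

Under this reading, if each of two consecutive traces satisfies a trace type, their concatenation does too, and a trace that satisfies a smaller trace type also satisfies any larger one.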

[Figure 4.2.13. Sample Monadic Intermediate Language Derivation, II (derivation d1 and the typing of the program)]

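Theorem 4.2.2 (Effect Soundness).

$$\langle e;s\rangle \in R^{imd}_{mon\ \langle e;s\rangle^{r}} \;\land\; \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e;s\rangle : \langle T\,P;\,S\rangle \;\land\; \langle e;s\rangle \overset{t}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle \;\rightarrow\; \vdash^{imd\ r}_{mon\ t} t : T$$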

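Lemma 4.2.3 (Evaluation Preserves Type and Effect).

$\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle}$:

$$\langle q;\,⊚̸s\rangle \in R^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \;\land\; \vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q;\,⊚̸s\rangle : \langle Q;\,⊚̸S\rangle \;\land\; \langle q;\,⊚̸s\rangle \rightarrow^{*\ imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle \;\rightarrow\; \exists\, ⊚̸S' \geq_{\epsilon} ⊚̸S.\ \vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle : \langle Q;\,⊚̸S'\rangle$$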

[Figure 4.2.14. Sample Monadic Intermediate Language Derivation, III (derivations d2, d3, and d4)]

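$\vdash^{imd}_{mon\ \langle e;s\rangle^{r}}$:

$$\langle e;s\rangle \in R^{imd}_{mon\ \langle e;s\rangle^{r}} \;\land\; \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e;s\rangle : \langle T\,P;\,S\rangle \;\land\; \langle e;s\rangle \overset{t}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle \;\rightarrow\; \exists\, T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle : \langle T'\,P;\,S'\rangle$$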

Lemma 4.2.4 (Faulty Program Configurations Untypable).

If $\langle q;\,⊚̸s\rangle \in R^{imd}_{mon\ \langle q;\,⊚̸s\rangle}$ is faulty, then there are no $Q$ and $⊚̸S$ such that $\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q;\,⊚̸s\rangle : \langle Q;\,⊚̸S\rangle$.


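Proof: Theorem 4.2.1.

We have $\vdash^{src}_{mon\ q} q : Q$. Because the intermediate language is an extension of the source language, there is a similar derivation $\epsilon \vdash^{imd}_{mon\ q} q : Q$ that simply passes around required components of the empty store type. Applying rule $\vdash^{imd\ ⊚̸}_{mon\ s}$-empty, then $\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle}$, we have $\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q;\epsilon\rangle : \langle Q;\epsilon\rangle$.

Clearly, if $\langle q;\epsilon\rangle \Uparrow$, the condition is satisfied. Assume $\langle q;\epsilon\rangle \overset{t}{\Downarrow}{}^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle$, i.e., $\langle q;\epsilon\rangle \overset{t}{\rightarrow^{*}}{}^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle$ with $\langle q';\,⊚̸s'\rangle$ irreducible. By Lemma 4.2.3, $\exists\, ⊚̸S' \geq_{\epsilon} \emptyset.\ \vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle : \langle Q;\,⊚̸S'\rangle$. By Lemma 4.2.4, $\langle q';\,⊚̸s'\rangle$ is not faulty. By Proposition 4.2.1, $q'$ is a value. By Lemma 4.2.1, $⊚̸s' = \epsilon$. By rule $\vdash^{imd\ ⊚̸}_{mon\ s}$-empty, $⊚̸S' = \epsilon$.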

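We will again require a Subject Reduction lemma.

Lemma 4.2.5 (Subject Reduction).

$$\langle e;s\rangle \in R^{imd}_{mon\ \langle e;s\rangle^{r}} \;\land\; \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e;s\rangle : \langle T\,P;\,S\rangle \;\land\; \langle e;s\rangle \overset{t}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle \;\rightarrow\; \exists\, T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle : \langle T'\,P;\,S'\rangle \;\land\; \vdash^{imd\ r}_{mon\ t} t : T$$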

We introduce a proposition relating typings of stores under & and '.

Proposition 4.2.2 (Store Typing).

$$\vdash^{imd\ r_1}_{mon\ s} \&s_0 : S_1 \iff \vdash^{imd\ r_1 r_0}_{mon\ s} s_0 : {}'S_1$$

Proof: Theorem 4.2.2.

$\vdash^{imd}_{mon\ \langle e;s\rangle^{r}}$:

We have $\Gamma; S \vdash^{imd\ r}_{mon\ e} e : T\,P$. We proceed by induction on the expression configuration reduction derivation.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-cntxt:

We again notice that we can simulate $\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-cntxt with a rule that operates over atomic contexts. $\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-cntxt determines an evaluation $\langle e_1;s_1\rangle \overset{t_1}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r_1}} \langle e'_1;s'_1\rangle$, for $t = [t][t_1]$. Thus we have by induction that $\vdash^{imd\ r_1}_{mon\ t} t_1 : T_1$. We show by cases that $\vdash^{imd\ r}_{mon\ t} t : T$.

$\langle [e];\ [⊚s] \mathbin{\hat{}} [⊚̸s]\rangle\ !\ [t]$: Because the subconfiguration is identical to the full configuration, we can use the full typing derivation unchanged. $t_1 = t$ and $T = T_1$, so the result holds by our induction on rules.


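$\langle \mathrm{let}\ x = [e]\ \mathrm{in}\ e;\ [⊚s] \mathbin{\hat{}} [⊚̸s]\rangle\ !\ [t]$: Again, $t_1 = t$ and $T = T_1$. By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we have $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r}_{mon\ e} \mathrm{let}\ x = e_1\ \mathrm{in}\ e_{.2} : T\,P$ and $\vdash^{imd\ r}_{mon\ s} ⊚s \mathbin{\hat{}} ⊚̸s : S$. By $\vdash^{imd}_{mon\ e}$-let, $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r_1}_{mon\ e} e_1 : T_1\,P_1$, for $T = T_1 \sqcup T_{.2}$. We can reapply $\vdash^{imd}_{mon\ \langle e;s\rangle}$ and by induction $\vdash^{imd\ r}_{mon\ t} t_1 : T_1$. By definition of $\vdash^{imd}_{mon\ t}$ and $\sqcup$, $\vdash^{imd\ r}_{mon\ t} t : T_1 \sqcup T_{.2}$.

$\langle \mathrm{running}\ [e];\ \&([⊚s] \mathbin{\hat{}} [⊚̸s])\rangle\ !\ \&[t]$: $r_1 = r r_0$ and $t = \&t_1$. By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we have $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r}_{mon\ e} \mathrm{running}\ e_1 : T\,P$ and $\vdash^{imd\ r}_{mon\ s} \&(⊚s_1 \mathbin{\hat{}} ⊚̸s_1) : S$. By $\vdash^{imd}_{mon\ e}$-running, $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; {}'S \vdash^{imd\ r r_0}_{mon\ e} e_1 : St^{\varepsilon_0}\,T\,P_1$, i.e., $T_1 = St^{\varepsilon_0}\,T$. By Proposition 4.2.2, $\vdash^{imd\ r r_0}_{mon\ s} (⊚s_1 \mathbin{\hat{}} ⊚̸s_1) : {}'S$. We can reapply $\vdash^{imd}_{mon\ \langle e;s\rangle}$ and by induction $\vdash^{imd\ r_1}_{mon\ t} t_1 : St^{\varepsilon_0}\,T$, i.e., $(\iota\ @\ o) \in t_1 \rightarrow \iota \in \varepsilon_0$ and $\vdash^{imd\ r}_{mon\ t} \&t_1 : T$. The latter is precisely what we need to show.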

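$\langle \mathrm{return}\ [e];\ [⊚s]\{≬s\} \mathbin{\hat{}} [⊚̸s].\overset{f}{≬}s\rangle\ !\ {}'[t]$: $r = r_1 r_0$ and $t = {}'t_1$. By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we have $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{return}\ e_1 : T\,P$ and $\vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_1.\overset{f}{≬}s_0 : S$. By $\vdash^{imd}_{mon\ e}$-ret-running, $\emptyset\{\emptyset\}\{\emptyset\}; S_1 \vdash^{imd\ r_1}_{mon\ e} e_1 : T_1\,P$ with $S_1 = ⊚s_1 \mathbin{\hat{}} ⊚̸s_1$ and $T = St^{\emptyset}\,T_1$. By $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty and $\vdash^{imd\ ⊚̸}_{mon\ s}$-nonempty, and reapplying $\vdash^{imd}_{mon\ s}$, we have $\vdash^{imd\ r_1}_{mon\ s} ⊚s_1 \mathbin{\hat{}} ⊚̸s_1 : S_1$. We can then reapply $\vdash^{imd}_{mon\ \langle e;s\rangle}$. We need to show $\vdash^{imd\ r}_{mon\ t} t : T$, i.e., $\vdash^{imd\ r_1 r_0}_{mon\ t} {}'t_1 : St^{\emptyset}\,T_1$, i.e., $(\iota\ @\ o) \in {}'t_1 \rightarrow \iota \in \emptyset$ and $\vdash^{imd\ r_1}_{mon\ t} \&{}'t_1 : T_1$. The former holds because by definition of ', the premise is always false. The latter is equivalent to $\vdash^{imd\ r_1}_{mon\ t} t_1 : T_1$, which we have by induction.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-reflex:

$t = [\,]$ so $\vdash^{imd\ r}_{mon\ t} t : T$ clearly holds.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-step:

Use Lemma 4.2.5.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-trans:

We have $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e;s\rangle : \langle T\,P;\,S\rangle$, $\langle e;s\rangle \overset{t'}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'';s''\rangle$, and $\langle e'';s''\rangle \overset{t''}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle$ with $t = t' + t''$. By Lemma 4.2.3, $\exists\, T' \leq_{r} T,\ S''.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'';s''\rangle : \langle T'\,P;\,S''\rangle$. By the induction hypothesis, we have $\vdash^{imd\ r}_{mon\ t} t' : T$ and $\vdash^{imd\ r}_{mon\ t} t'' : T'$. Because $T' \leq_{r} T$, $\vdash^{imd\ r}_{mon\ t} t'' : T$ and thus $\vdash^{imd\ r}_{mon\ t} t : T$.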


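Proof: Lemma 4.2.3.

$\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle}$:

$\rightarrow^{*\ imd}_{mon\ \langle q;\,⊚̸s\rangle}$-cntxt:

By $\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle}$, we have $⊚̸S \vdash^{imd}_{mon\ q} q : Q$ and $\emptyset \vdash^{imd\ ⊚̸\ \epsilon}_{mon\ s} ⊚̸s : ⊚̸S$. By $\vdash^{imd}_{mon\ q}$, we have $\emptyset\{\emptyset\}; \emptyset \mathbin{\hat{}} ⊚̸S \vdash^{imd\ \epsilon}_{mon\ e} q : Id\ Q$. We can apply $\vdash^{imd\ ⊚}_{mon\ s}$-empty, $\vdash^{imd}_{mon\ s}$, and $\vdash^{imd}_{mon\ \langle e;s\rangle}$ to obtain $\vdash^{imd}_{mon\ \langle e;s\rangle^{\epsilon}} \langle q;\ \emptyset \mathbin{\hat{}} ⊚̸s\rangle : \langle Id\ Q;\ \emptyset \mathbin{\hat{}} ⊚̸S\rangle$. Using the result on expressions, we obtain $\exists\, ⊚̸S'.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{\epsilon}} \langle q';\ \emptyset \mathbin{\hat{}} ⊚̸s'\rangle : \langle Id\ Q;\ \emptyset \mathbin{\hat{}} ⊚̸S'\rangle$. We can reverse the steps above to yield $\vdash^{imd}_{mon\ \langle q;\,⊚̸s\rangle} \langle q';\,⊚̸s'\rangle : \langle Q;\,⊚̸S'\rangle$.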

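$\vdash^{imd}_{mon\ \langle e;s\rangle}$:

Proceed as follows for each evaluation rule.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-cntxt:

$\langle [e];\ [⊚s] \mathbin{\hat{}} [⊚̸s]\rangle\ !\ [t]$: Because the subconfiguration is identical to the full configuration, we can use the full typing derivation unchanged. The result holds by our induction on rules, with $T' = T'_1$ and $S' = S'_1$.

$\langle \mathrm{let}\ x = [e]\ \mathrm{in}\ e;\ [⊚s] \mathbin{\hat{}} [⊚̸s]\rangle\ !\ [t]$: By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we have $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r}_{mon\ e} \mathrm{let}\ x = e_{.1}\ \mathrm{in}\ e_{.2} : T\,P$ and $\vdash^{imd\ r}_{mon\ s} s : S$. By $\vdash^{imd}_{mon\ e}$-let, $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r}_{mon\ e} e_{.1} : T_{.1}\,P$ and (*) $\emptyset\{\emptyset\}\{\emptyset\{x \mapsto P_{.1}\}\}; S \vdash^{imd\ r}_{mon\ e} e : T_{.2}\,P$, for $T = T_{.1} \sqcup T_{.2}$. We can reapply $\vdash^{imd}_{mon\ \langle e;s\rangle}$ to form $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e_{.1};s\rangle : \langle T_{.1}\,P;\,S\rangle$. Because there is an evaluation $\langle e_{.1};s\rangle \overset{t}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'_{.1};s'\rangle$ we have by induction $\exists\, T'_{.1} \leq_{r} T_{.1},\ S' \geq_{r} S.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'_{.1};s'\rangle : \langle T'_{.1}\,P_{.1};\,S'\rangle$. By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\emptyset\{\emptyset\}\{\emptyset\}; S' \vdash^{imd\ r}_{mon\ e} e'_{.1} : T'_{.1}\,P_{.1}$. By Lemma 4.2.2 on $\rightarrow^{*}_{[e]\,e}$, we can modify the second subderivation (*) with Proposition 4.2.3 to get $\emptyset\{\emptyset\}\{\emptyset\{x \mapsto P_{.1}\}\}; S' \vdash^{imd\ r}_{mon\ e} e_{.2} : T_{.2}\,P$. Applying $\vdash^{imd}_{mon\ e}$-let yields $\emptyset\{\emptyset\}\{\emptyset\}; S' \vdash^{imd\ r}_{mon\ e} \mathrm{let}\ x = e'_{.1}\ \mathrm{in}\ e_{.2} : T'_{.1} \sqcup T_{.2}\,P$. Applying $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we obtain $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle \mathrm{let}\ x = e'_{.1}\ \mathrm{in}\ e_{.2};\,s'\rangle : \langle T'_{.1} \sqcup T_{.2}\,P;\,S'\rangle$. This is sufficient since $(T'_{.1} \sqcup T_{.2}) \leq (T_{.1} \sqcup T_{.2})$.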


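$\langle \mathrm{running}\ [e];\ \&([⊚s] \mathbin{\hat{}} [⊚̸s])\rangle\ !\ \&[t]$: By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we have $\emptyset\{\emptyset\}\{\emptyset\}; S_1 \vdash^{imd\ r_1}_{mon\ e} \mathrm{running}\ e_0 : T_1\,(P_0 :=_{r_1 r_0} \emptyset)$ and $\vdash^{imd\ r_1}_{mon\ s} \&s_0 : S_1$. By $\vdash^{imd}_{mon\ e}$-running, $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; {}'S_1 \vdash^{imd\ r_1 r_0}_{mon\ e} e_0 : ≬T_0\,T_1\,P_0$. We can reapply $\vdash^{imd}_{mon\ \langle e;s\rangle}$ to form $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e_0;\,\&s_0\rangle : \langle ≬T_0\,T_1\,P_0;\,S_1\rangle$. Then, because there is an evaluation $\langle e_0;s_0\rangle \overset{t_0}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle e'_0;s'_0\rangle$, we have by induction that $\exists\, T'_0 = ≬T'_0\,T'_1 \leq_{r_1 r_0} ≬T_0\,T_1,\ S'_0 \geq_{r_1 r_0} {}'S_1.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle e'_0;s'_0\rangle : \langle T'_0\,P_0;\,S'_0\rangle$. By $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; S'_0 \vdash^{imd\ r_1 r_0}_{mon\ e} e'_0 : T'_0\,P_0$ and $\vdash^{imd\ r_1 r_0}_{mon\ s} s'_0 : S'_0$. Because ' is a surjection, $\exists\, S'_1.\ S'_0 = {}'S'_1$. Applying $\vdash^{imd}_{mon\ e}$-running, $\emptyset\{\emptyset\}\{\emptyset\}; S'_1 \vdash^{imd\ r_1}_{mon\ e} \mathrm{running}\ e'_0 : T'_1\,(P_0 :=_{r_1 r_0} \emptyset)$. By Proposition 4.2.2, $\vdash^{imd\ r_1}_{mon\ s} \&s'_0 : S'_1$. Applying $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we obtain $\vdash^{imd}_{mon\ \langle e;s\rangle^{r_1}} \langle \mathrm{running}\ e'_0;\,\&s'_0\rangle : \langle T'_1\,(P_0 :=_{r_1 r_0} \emptyset);\,S'_1\rangle$. This is sufficient since $T'_1 \leq_{r_1} T_1$.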

hreturn [ e]; [ ⊚s ] {≬s} ˆ [ 6 ⊚s]. ≬s i ! '[t]: imd

imd

By ⊢mon he; si , we have ∅ {∅} {∅} {∅}; S0 ⊢mon e imd

P1 ) and ⊢mon s

r1 r0

imd

⊢

-ret-running,

imd mon s

and ⊢

imd ⊚ mon s

⊢

imd 6 ⊚ mon s

0

⊢

s 0 = 6 ⊚s 1 . ≬s 0 ,

f

S 0 = ⊚S 1 {≬S 0 }, and 6 ⊚S 0 = 6 ⊚S 1 . ≬S 0 . By

⊚

imd

-nonempty

S 0 . By the rules ⊢ ⊚S

f

6⊚

∅ {∅} {∅}; S1 ⊢mon e

≬

and

return e1 : St∅ T1 (Return

s0 : S0 , with s0 = ⊚s 0 ˆ 6 ⊚s 0 and S0 = ⊚S 0 ˆ 6 ⊚S 0 , and in turn

with ⊚s 0 = ⊚s 1 {≬s 0 }, ⊢mon e

r1 r0

f r1 r0 imd ≬S mon

imd mon s

we have ⊢

and ⊢

f f ≬s : ≬S . 0 0

imd 6 ⊚ mon s

r1

e1 : T1 P1 , with S1 = ⊚S 1 ˆ 6 ⊚S 1 .

imd ⊚ r1 mon s

⊚

s 1 : ⊚S 1 and ⊚S 0 ⊢

-nonempty,

we have ⊚S 1 ⊢ imd

imd mon he;

si

to form ⊢

with s1 = ⊚s 1 ˆ 6 ⊚s 1 . With the evaluation have by induction ∃ T1′ ≤r1 T1 , S1′ =

imd mon he;

si

t0 he1 ; s1 i →∗ he′1 ; s′1 i imd he; sir1 mon

⊚ ′ S1

imd

r1

imd

s′1 : S1′ . Applying ⊢mon e -ret-running, ∅ {∅} {∅}; S1′ ⊢mon e

above, we can reapply ⊢ ⊢

imd r1 r0 mon s

s′0 : S0′ , with imd

-nonempty,

r1 r0

Applying ⊢mon he; si , ⊢mon he; si

⊢

imd 6 ⊚ mon s

s

and

he′1 ; s′1 i : imd

r1 r0

-nonempty

r1

running e′0 : St∅ T1′

and ⊢

s

imd mon s

derivations to obtain f

and S0′ = ⊚S 1 {≬S ′ 0 } ˆ 6 ⊚S 1 . ≬S ′ 0 .

hreturn e′1 ; s′0 i: hSt∅ T1′ (Return P1 ); S0′ i. This

is sufficient since St∅ T1′ ≤r1 r0 St∅ T1 .


r1

f imd ≬

f s′0 = ⊚s 1 {≬s′ 0 } ˆ 6 ⊚s 1 . ≬s′ 0 imd

s 1 : 6 ⊚S 1

e′1 : T1′ P1 and ⊢mon s

derivation and the sequence of ⊢mon

imd ⊚ mon s

s0:

for s′1 = ⊚s ′1 ˆ 6 ⊚s ′1 we imd

imd

≬

he1 ; s1 i: hT1 P1 ; S1 i,

ˆ 6 ⊚S ′1 ≥r1 S1 . ⊢mon he; si

imd

s

r1

By

6⊚

imd ⊚

hT1′ P1 ; S1′ i. By ⊢mon he; si , ∅ {∅} {∅} {∅}; S1′ ⊢mon e

imd ≬

imd 6 ⊚ r1 mon s

We can reapply ⊢mon s (using the ⊢mon

derivations) and then ⊢

(Return P1 ). Using the ⊢mon

imd ≬ r1 r0 mon s

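$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-reflex:

$s = s'$ and $e = e'$, so let $S' = S$ and $T' = T$.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-step:

Use Lemma 4.2.5.

$\rightarrow^{*\ imd}_{mon\ \langle e;s\rangle}$-trans:

We have $\langle e;s\rangle \overset{t'}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'';s''\rangle$ and $\langle e'';s''\rangle \overset{t''}{\rightarrow^{*}}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle$ with $t = t' + t''$. By induction, we have $\exists\, T'' \leq_{r} T,\ S'' \geq_{r} S.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e'';s''\rangle : \langle T''\,P;\,S''\rangle$, and then $\exists\, T' \leq_{r} T'',\ S' \geq_{r} S''.\ \vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e';s'\rangle : \langle T'\,P;\,S'\rangle$.

Proof: Lemma 4.2.4.

We show below that if $\langle e;s\rangle \in R^{imd}_{mon\ \langle e;s\rangle^{r}}$ is immediately faulty, then there are no $S$ and $E$ such that $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e;s\rangle : \langle E;\,S\rangle$. One can then observe that typing any faulty program configuration would require typing an immediately faulty configuration from a program expression evaluation context. Our proof is by contradiction. We perform case analysis on immediately faulty expression configurations.

$\langle \mathrm{return}^{r}\ x;\,s\rangle^{r}$: By rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we require $\emptyset\{\emptyset\}\{\emptyset\}; S \vdash^{imd\ r}_{mon\ e} \mathrm{return}^{r}\ x : E$. This must be generated with a sequence of $\vdash^{imd}_{mon\ e}$-ret-running, followed by $\vdash^{imd}_{mon\ e}$-pure and $\vdash^{imd}_{mon\ p}$-var, which require $x \in Dom(\emptyset)$.

$\langle\langle \lambda x.e_0\rangle;\,s_0\rangle^{r_1 r_0}$ ($\lambda x.e_0$ not closed): By rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we require $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} \langle \lambda x.e_0\rangle : E$. This must be generated with $\vdash^{imd}_{mon\ e}$-alloc and $\vdash^{imd}_{mon\ b}$-$\lambda$, which in turn require $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\{x \mapsto P_{0.1}\}\}; S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} e_0 : T_0\,P_{0.2}$ and $E = St^{\{alloc\}}\,(St^{\emptyset}\,Id)\,(P_{0.1} \Rightarrow T_0\,P_{0.2})$. But $e_0$ contains a free variable other than $x$ so this will not be derivable.

$\langle \mathrm{deref}\ v;\,s_0\rangle^{r_1 r_0}$, $\langle \mathrm{set}\ v\ \mathrm{to}\ p;\,s_0\rangle^{r_1 r_0}$, $\langle v\ p;\,s_0\rangle^{r_1 r_0}$, ($v \neq o$): By rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$, we require $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} e_0 : E$, for $e_0$ equals $\mathrm{deref}\ v$, $\mathrm{set}\ v\ \mathrm{to}\ p$, or $v\ p$. These must be generated with $\vdash^{imd}_{mon\ e}$-deref, $\vdash^{imd}_{mon\ e}$-set, or $\vdash^{imd}_{mon\ e}$-app, respectively, each of which requires $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ p} v : B$, for $S_0 = ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0$ and $B$ equals $\mathrm{Ref}\ P_0$ or $P_{0.1} \Rightarrow T_0\,P_{0.2}$. This must be generated with $\vdash^{imd}_{mon\ p}$-val and $\vdash^{imd}_{mon\ v}$-addr-live, the latter of which requires $v = o$.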

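$\langle \mathrm{deref}\ o;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$, $\langle \mathrm{set}\ o\ \mathrm{to}\ p;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$, $\langle o\ p;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$, ($o \notin Dom(≬s_0)$): By $\vdash^{imd}_{mon\ v}$-addr-live (as above), $o \in Dom(≬S_0)$. By rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ we also require $\vdash^{imd\ r_1 r_0}_{mon\ s} s_0 : S_0$, so by $\vdash^{imd}_{mon\ s}$-nonempty and $\vdash^{imd\ ≬}_{mon\ s}$, $o \in Dom(≬s_0)$.

$\langle \mathrm{deref}\ o;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$, $\langle \mathrm{set}\ o\ \mathrm{to}\ p;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$ ($≬s_0(o) \neq \langle \mathrm{ref}\ v\rangle$): As above, by $\vdash^{imd}_{mon\ e}$-deref or $\vdash^{imd}_{mon\ e}$-set, $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ p} o : \mathrm{Ref}\ P_0$, and by $\vdash^{imd}_{mon\ v}$-addr-live $≬S_0(o) = \mathrm{Ref}\ P_0$. By rules $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\vdash^{imd}_{mon\ s}$-nonempty, and $\vdash^{imd\ ≬}_{mon\ s}$, $⊚S_1\{≬S_0\} \vdash^{imd}_{mon\ d} ≬s_0(o) : \mathrm{Ref}\ P_0$. By $\vdash^{imd}_{mon\ d}$-ref, $≬s_0(o) = \langle \mathrm{ref}\ v\rangle$.

$\langle o\ p;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle^{r_1 r_0}$ ($≬s_0(o) \neq \langle \lambda x.e_0\rangle$): As above, by $\vdash^{imd}_{mon\ e}$-app, $\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\}; ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ p} o : P_{0.1} \Rightarrow T_0\,P_{0.2}$, and by $\vdash^{imd}_{mon\ v}$-addr-live we have $≬S_0(o) = P_{0.1} \Rightarrow T_0\,P_{0.2}$. By rules $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\vdash^{imd}_{mon\ s}$-nonempty, and $\vdash^{imd\ ≬}_{mon\ s}$, $⊚S_1\{≬S_0\} \vdash^{imd}_{mon\ d} ≬s_0(o) : P_{0.1} \Rightarrow T_0\,P_{0.2}$. By $\vdash^{imd}_{mon\ d}$-$\lambda$, $≬s_0(o) = \langle \lambda x.e_0\rangle$.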

hrunning e0 ; s1 ir (s1 6= imd

s0 f f ⊚ s 1 ˆ 6 ⊚s . ≬s 1 ≬s 0 ):

imd

By rule ⊢mon he; si, we require ∅ {∅} {∅}; S1 ⊢mon e with ⊢

r1

running e0 : E. This must be generated

imd mon e -running

imd

, which requires E = T1 P0 := ∅, and ∅ {∅} {∅} {∅}; 'S1 ⊢mon e

r1 r0

e0 :

≬

≬

T 0 T1 P0 . For 'S1 to be defined we require imd

we also require ⊢mon s

r1

S0 f f S1 = ⊚S 1 ˆ 6 ⊚S. ≬S 1 ≬S 0 . imd

imd

By rule ⊢mon he; si

imd 6 ⊚

s1 : S1 . This must be generated by ⊢mon s , ⊢mon

s

-nonempty

and at

≬

f imd ≬

least one application of ⊢mon s , which require r

s0 f f s1 = ⊚s 1 ˆ 6 ⊚s . ≬s 1 ≬s 0 .

hrunning returnr1 r0 v1 ; s1 i 1 (s1 6= ⊚s 1 ˆ 6 ⊚s. ≬s 0 ):

We have above that running constructs form reachable, typeable configurations with nonf

lexical stores with a rightmost region store tree rooted at some ≬s 0 in their uppermost forest. We show here that when the body of the running construct is an expression value, f ≬ s0

imd

will have no children or siblings. As above, we have ∅ {∅} {∅} {∅}; 'S1 ⊢mon e imd

imd

turnr1 r0 v1 : ≬T 0 T1 P0 . By rule ⊢mon he; si we also have ⊢mon s

r1

r1 r0

re-

s1 : S1 , so s1 and S1 have the

same shape. The rightmost forest in the nonlexical store resulting from 'S1 will consist of f

the children of some ≬S 0 and the next rightmost of its siblings; by Lemma 4.2.1, these are empty.


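We proceed to the preparatory lemmas for subject reduction. The value substitution lemma differs from the object language mainly in the representation of environments.

Lemma 4.2.6 (Value Substitution).

$\vdash^{imd}_{mon\ p}$:
$$\emptyset\{\emptyset\}\{\emptyset\{x' \mapsto P'\}\};\ ⊚S \vdash^{imd\ r}_{mon\ p} p : P \;\land\; ⊚S \vdash^{imd\ r}_{mon\ v} v' : P' \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\};\ ⊚S \vdash^{imd\ r}_{mon\ p} p\,[x' := v'] : P$$

$\vdash^{imd}_{mon\ b}$ (with $r = r_1 r_0$):
$$\emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\{x' \mapsto P'\}\};\ ⊚S \vdash^{imd\ r_1 r_0}_{mon\ b} b : B \;\land\; ⊚S \vdash^{imd\ r}_{mon\ v} v' : P' \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\}\{\emptyset\};\ ⊚S \vdash^{imd\ r_1 r_0}_{mon\ b} b\,[x' := v'] : B$$

$\vdash^{imd}_{mon\ e}$:
$$\emptyset\{\emptyset\}\{\emptyset\{x' \mapsto P'\}\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} e : T\,P \;\land\; ⊚S \vdash^{imd\ r}_{mon\ v} v' : P' \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} e\,[x' := v'] : T\,P$$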

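Since region indicators do not occur in the syntax of the monadic language, substitutions are not required for the allocation and deallocation of regions. The pure type, however, must be modified on region deallocation, as defined in Figure 3.2.18.

Lemma 4.2.7 (Region Allocation).

$\vdash^{imd}_{mon\ v}$:
$$⊚S_1 \vdash^{imd\ r_1 \rho_0}_{mon\ v} v_0 : P_0 \;\rightarrow\; ⊚S_1\{\emptyset\} \vdash^{imd\ r_1 r_0}_{mon\ v} v_0 : P_0$$

$\vdash^{imd}_{mon\ p}$:
$$\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1 \vdash^{imd\ r_1 \rho_0}_{mon\ p} p_0 : P_0 \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{\emptyset\} \vdash^{imd\ r_1 r_0}_{mon\ p} p_0 : P_0$$

$\vdash^{imd}_{mon\ b}$ (active source prestorable):
$$\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1 \vdash^{imd\ r_1 \rho_0}_{mon\ b} b_0 : B_0 \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{\emptyset\} \vdash^{imd\ r_1 r_0}_{mon\ b} b_0 : B_0$$

$\vdash^{imd}_{mon\ e}$ (active source expression):
$$\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1 \mathbin{\hat{}} ⊚̸S_1 \vdash^{imd\ r_1 \rho_0}_{mon\ e} e_0 : E_0 \;\rightarrow\; \emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{\emptyset\} \mathbin{\hat{}} ⊚̸S_1.\epsilon \vdash^{imd\ r_1 r_0}_{mon\ e} e_0 : E_0$$

$\vdash^{imd}_{mon\ s}$:
$$\vdash^{imd\ r_1}_{mon\ s} ⊚s_1 \mathbin{\hat{}} ⊚̸s_1 : ⊚S_1 \mathbin{\hat{}} ⊚̸S_1 \;\rightarrow\; \vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{\emptyset\} \mathbin{\hat{}} ⊚̸s_1.\epsilon : ⊚S_1\{\emptyset\} \mathbin{\hat{}} ⊚̸S_1.\epsilon$$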

⊚

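Lemma 4.2.8 (Region Deallocation).

$\vdash^{imd}_{mon\ v}$:
$$⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} v_0 : P_0 \;\rightarrow\; ⊚S_1 \vdash^{imd\ r_1}_{mon\ v} v_0 : P_0 := \emptyset$$

$\vdash^{imd}_{mon\ s}$:
$$\vdash^{imd\ r_1 r_0}_{mon\ s} s_1\{≬s_0\} : S_1\{≬S_0\} \;\rightarrow\; \vdash^{imd\ r_1}_{mon\ s} s_1 : S_1$$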

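Our statement of store weakening corresponds to that of the object language (Proposition 4.1.3).

Proposition 4.2.3 (Store Weakening).

$\vdash^{imd}_{mon\ v}$:
$$⊚S \vdash^{imd\ \varrho}_{mon\ v} v : P \;\land\; ⊚S \leq_{\varrho} ⊚S' \;\rightarrow\; ⊚S' \vdash^{imd\ \varrho}_{mon\ v} v : P$$

$\vdash^{imd}_{mon\ d}$:
$$⊚S \vdash^{imd\ \varrho}_{mon\ d} d : B \;\land\; ⊚S \leq_{\varrho} ⊚S' \;\rightarrow\; ⊚S' \vdash^{imd\ \varrho}_{mon\ d} d : B$$

$\vdash^{imd}_{mon\ p}$:
$$\Gamma;\ ⊚S \vdash^{imd\ \varrho}_{mon\ p} p : P \;\land\; ⊚S \leq_{\varrho} ⊚S' \;\rightarrow\; \Gamma;\ ⊚S' \vdash^{imd\ \varrho}_{mon\ p} p : P$$

$\vdash^{imd}_{mon\ b}$ (active source prestorable):
$$\Gamma;\ ⊚S \vdash^{imd\ \varrho}_{mon\ b} b : B \;\land\; ⊚S \leq_{\varrho} ⊚S' \;\rightarrow\; \Gamma;\ ⊚S' \vdash^{imd\ \varrho}_{mon\ b} b : B$$

$\vdash^{imd}_{mon\ e}$ (active source expression):
$$\Gamma;\ S \vdash^{imd\ \varrho}_{mon\ e} e : E \;\land\; S \leq_{\varrho} S' \;\rightarrow\; \Gamma;\ S' \vdash^{imd\ \varrho}_{mon\ e} e : E$$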
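The intuition behind store weakening can be sketched for a single flat region, where a store type is a mapping from offsets to types and "at least as large" is read as pointwise extension. Both the encoding and the names `extends` and `type_of_offset` are assumptions made for this example; the ordering used in the dissertation is indexed by region and is richer than this.

```python
def extends(smaller, larger):
    """smaller <= larger: every offset described by `smaller` keeps its type."""
    return all(
        offset in larger and larger[offset] == described
        for offset, described in smaller.items()
    )


def type_of_offset(offset, store_type):
    """Typing an offset only looks the offset up in the store type."""
    return store_type[offset]


def weaken(offset, smaller, larger):
    """If `offset` is typable under `smaller` and smaller <= larger, the same
    type is obtained under `larger`."""
    if not extends(smaller, larger):
        raise ValueError("larger store type does not extend the smaller one")
    return type_of_offset(offset, larger)
```

Allocation is the typical use: adding a fresh offset produces a larger store type, and every typing made against the old store type remains valid.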

Expification and storification, two lemmas of the object language type soundness proof, are unnecessary for the monadic language proof; only a trivial version of storification remains. The fundamental reason is that configuration contexts make an expression's representation in the store the same as its representation within a program. Even if we call a function at an outer region, this must be done from within return constructs that peel off the outer regions. One might think that we at least need to blank out the nonlexical store of an expression upon storification and replace it upon expification, but even this is not the case. It is not necessary for storification because function bodies have empty nonlexical store. It is not necessary for expification because reachable applications have empty nonlexical store. This holds because the components of an application are pure, and because if the application occurs in the definition of a let construct, the body of the let construct must be an active source expression.

Proposition 4.2.4 (Storification).

$$\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_0 \vdash^{imd\ r_1 r_0}_{mon\ b} d : B \;\rightarrow\; ⊚S_0 \vdash^{imd\ r_1 r_0}_{mon\ d} d : B$$

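Proof: Lemma 4.2.5.

By case analysis on the reduction step:

$\rightharpoonup^{imd}_{mon\ \langle e;s\rangle}$-let: $\langle \mathrm{let}\ x = \mathrm{return}^{r}\ v\ \mathrm{in}\ e;\,s\rangle \overset{[\,]}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r}} \langle e\,[x := v];\,s\rangle$:

We have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} \mathrm{let}\ x = \mathrm{return}^{r}\ v\ \mathrm{in}\ e : T\,P$. By $\vdash^{imd}_{mon\ e}$-let and because values have no effect, $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} \mathrm{return}^{r}\ v : St^{\emptyset}\,Id\ P_{.1}$ and $\emptyset\{\emptyset\}\{\emptyset\{x \mapsto P_{.1}\}\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} e : T\,P$. The former must be generated by a sequence of $\vdash^{imd}_{mon\ e}$-ret-running followed by $\vdash^{imd}_{mon\ e}$-pure and $\vdash^{imd}_{mon\ p}$-value. Replacing each occurrence of $\vdash^{imd}_{mon\ e}$-ret-running by an occurrence of $\vdash^{imd}_{mon\ v}$-ret-running in the antecedent of $\vdash^{imd}_{mon\ p}$-value gives $⊚S \vdash^{imd\ r}_{mon\ v} v : P_{.1}$. By Lemma 4.2.6 we have that $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S \mathbin{\hat{}} ⊚̸S \vdash^{imd\ r}_{mon\ e} e\,[x := v] : T\,P$. We complete with a reapplication of rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\ \langle e;s\rangle}$-alloc: $\langle d_0;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle \overset{[(alloc\ @\ o)]}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle \mathrm{return}^{r_1 r_0}\ o;\ ⊚s_1\{≬s_0\{o \mapsto d_0\}\} \mathbin{\hat{}} ⊚̸s_0\rangle$:

We have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} d_0 : T\,P$. By $\vdash^{imd}_{mon\ e}$-alloc, $P = B_0$, $T = St^{\{alloc\}}\,(St^{\emptyset}\,Id)$, and $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ b} d_0 : B_0$. By Proposition 4.2.4, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ d} d_0 : B_0$. By Proposition 4.2.3, $⊚S'_0 \vdash^{imd\ r_1 r_0}_{mon\ d} d_0 : B_0$, with $⊚S'_0 = ⊚S_1\{≬S_0\{o \mapsto B_0\}\}$. Applying $\vdash^{imd}_{mon\ v}$-addr-live, $\vdash^{imd}_{mon\ p}$-value, and $\vdash^{imd}_{mon\ e}$-pure, $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S'_0 \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{return}^{r_1 r_0}\ o : St^{\emptyset}\,(St^{\emptyset}\,Id)\ B_0$. We also have by rules $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\vdash^{imd}_{mon\ s}$, and $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty that $\vdash^{imd\ ⊚\ r_1}_{mon\ s} ⊚s_1 : ⊚S_1$ and $⊚S_1\{≬S_0\} \vdash^{imd\ ≬\ r_1 r_0}_{mon\ s} ≬s_0 : ≬S_0$. Reapplying $\vdash^{imd\ ≬}_{mon\ s}$ using the storable derivation gives $⊚S_1\{≬S_0\{o \mapsto B_0\}\} \vdash^{imd\ ≬\ r_1 r_0}_{mon\ s} ≬s_0\{o \mapsto d_0\} : ≬S_0\{o \mapsto B_0\}$. We then reapply $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty and $\vdash^{imd}_{mon\ s}$, to obtain $\vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{≬s_0\{o \mapsto d_0\}\} \mathbin{\hat{}} ⊚̸s_0 : ⊚S_1\{≬S_0\{o \mapsto B_0\}\} \mathbin{\hat{}} ⊚̸S_0$. We complete with an application of rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\ \langle e;s\rangle}$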

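-deref: $\langle \mathrm{deref}\ o;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle \overset{[(read\ @\ o)]}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle \mathrm{return}^{r_1 r_0}\ v_0;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle$ where $≬s(o) = \langle \mathrm{ref}\ v_0\rangle$:

We have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{deref}\ o : T\,P_0$. By $\vdash^{imd}_{mon\ e}$-deref and $\vdash^{imd}_{mon\ p}$-value and because values have no effect, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} o : \mathrm{Ref}\ P_0$ with $T = St^{\{read\}}\,(St^{\emptyset}\,Id)$. By $\vdash^{imd}_{mon\ v}$-addr-live $≬S(o) = \mathrm{Ref}\ P_0$. We also have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0 : ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0$, and, by $\vdash^{imd}_{mon\ s}$ and $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ d} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0$. By $\vdash^{imd}_{mon\ d}$-ref, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} v_0 : P_0$. Applying $\vdash^{imd}_{mon\ p}$-value and $\vdash^{imd}_{mon\ e}$-pure, $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{return}^{r_1 r_0}\ v_0 : St^{\emptyset}\,(St^{\emptyset}\,Id)\ P_0$. We complete with an application of rule $\vdash^{imd}_{mon\ \langle e;s\rangle^{r}}$.

$\rightharpoonup^{imd}_{mon\ \langle e;s\rangle}$-set: $\langle \mathrm{set}\ o\ \mathrm{to}\ v_0;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle \overset{[(write\ @\ o)]}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle \mathrm{return}^{r_1 r_0}\ \mathrm{unit};\ ⊚s_1\{≬s_0\{o\ \langle \mathrm{ref}\ v_0\rangle\}\} \mathbin{\hat{}} ⊚̸s_0\rangle$ where $≬s(o) = \langle \mathrm{ref}\ v_{0.1}\rangle$:

We have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{set}\ o\ \mathrm{to}\ v_0 : T\,P$. By $\vdash^{imd}_{mon\ e}$-set, $P = \mathrm{Unit}$ and $T = St^{\{write\}}\,(St^{\emptyset}\,Id)$. We need $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} \mathrm{return}^{r_1 r_0}\ \mathrm{unit} : St^{\emptyset}\,(St^{\emptyset}\,Id)\ \mathrm{Unit}$, which follows using $\vdash^{imd}_{mon\ v}$-glob-const, $\vdash^{imd}_{mon\ p}$-value, and $\vdash^{imd}_{mon\ e}$-pure. We also have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\vdash^{imd}_{mon\ s}$ and $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty, that $⊚S_1\{≬S_0\} \vdash^{imd\ ≬\ r_1 r_0}_{mon\ s} ≬s_0 : ≬S_0$. By $\vdash^{imd}_{mon\ e}$-set, $\vdash^{imd}_{mon\ e}$-pure, and $\vdash^{imd}_{mon\ p}$-val, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} o : \mathrm{Ref}\ P_0$. By $\vdash^{imd}_{mon\ v}$-addr-live, $≬S_0(o) = \mathrm{Ref}\ P_0$. Also by $\vdash^{imd}_{mon\ e}$-set, $\vdash^{imd}_{mon\ e}$-pure, and $\vdash^{imd}_{mon\ p}$-value, $S \vdash^{imd\ r_1 r_0}_{mon\ v} v_0 : P_0$. Applying $\vdash^{imd}_{mon\ d}$-ref, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ d} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0$. Reapplying $\vdash^{imd\ ≬}_{mon\ s}$, $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty, and $\vdash^{imd}_{mon\ s}$ we obtain $\vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{≬s_0\{o\ \langle \mathrm{ref}\ v_0\rangle\}\} \mathbin{\hat{}} ⊚̸s_0 : ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0$. We complete with an application of rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$.

$\rightharpoonup^{imd}_{mon\ \langle e;s\rangle}$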

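-app-λ: $\langle o\ v_0;\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle \overset{[(exec\ @\ o)]}{\rightharpoonup}{}^{imd}_{mon\ \langle e;s\rangle^{r_1 r_0}} \langle e_0\,[x := v_0];\ ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0\rangle$ where $≬s_0(o) = \langle \lambda x.e_0\rangle$:

By rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$, $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} o\ v_0 : T\,P$. By $\vdash^{imd}_{mon\ e}$-app and $\vdash^{imd}_{mon\ p}$-value, $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} v_0 : P_{0.1}$ and $⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ v} o : P_{0.1} \Rightarrow ≬T_0\,T_1\,P$ with $T = (≬T_0 \sqcup St^{\{exec\}})\,T_1$. By $\vdash^{imd}_{mon\ v}$-addr-live, $≬S(o) = P_{0.1} \Rightarrow ≬T_0\,T_1\,P_{0.2}$. We also have by rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$ that $\vdash^{imd\ r_1 r_0}_{mon\ s} ⊚s_1\{≬s_0\} \mathbin{\hat{}} ⊚̸s_0 : ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0$. By $\vdash^{imd}_{mon\ s}$ and $\vdash^{imd\ ⊚}_{mon\ s}$-nonempty, $⊚S_1\{≬S_0\} \vdash^{imd\ ≬\ r_1 r_0}_{mon\ s} ≬s_0 : ≬S_0$. By $\vdash^{imd}_{mon\ d}$-$\lambda$, we have $\emptyset\{\emptyset\}\{\emptyset\};\ ⊚S_1\{≬S_0\} \vdash^{imd\ r_1 r_0}_{mon\ b} \langle \lambda x.e_0\rangle : P_{0.1} \Rightarrow ≬T_0\,T_1\,P_{0.2}$. By $\vdash^{imd}_{mon\ b}$-$\lambda$, $\emptyset\{\emptyset\}\{\emptyset\{x \mapsto P_{0.1}\}\};\ ⊚S_1\{≬S_0\} \mathbin{\hat{}} ⊚̸S_0 \vdash^{imd\ r_1 r_0}_{mon\ e} e_0 : ≬T_0\,T_1\,P_{0.2}$. By Lemma 4.2.6, $\emptyset\{\emptyset\}\{\emptyset\};\ S \vdash^{imd\ r_1 r_0}_{mon\ e} e_0\,[x := v_0] : ≬T_0\,T_1\,P_{0.2}$. We complete with an application of rule $\vdash^{imd}_{mon\ \langle e;s\rangle}$.

$\rightharpoonup$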

imd he; si mon

hrun e0 ;

-run:

By rule ⊢

⊚

f

s1

si

, ∅ {∅};

f

imd

S 1 ˆ 6 ⊚S. ≬S 1 ⊢mon e f

imd

f

ˆ 6 ⊚S. ≬S 1 ∅ ⊢ f

imd r1 mon e

f

S 1 ˆ 6 ⊚S . ≬S 1 ⊢

⊚

r1 ρ0

S 1 {∅} ˆ 6 ⊚S. ≬S 1 . ǫ ⊢mon e

⊚

hrunning e0 ;

imd he; sir1 mon

imd mon he;

⊚

[]

ˆ  6 ⊚s  . ≬s 1 i ⇀

r1 r0

⊚

s1

f

ˆ  6 ⊚s  . ≬s 1 ∅i

imd r1 mon e

: imd

run e0 : T1 P1 . By ⊢mon e-run, ∅ {∅} {∅};

e0 : ≬T 0 T1 P0 , P1 = (P0 := ∅). By Lemma 4.2.7, ∅ {∅} {∅}; imd

e0 : ≬T 0 T1 P0 . We can apply ⊢mon e -running to get ∅ {∅};

running e0 : T1 P0 := ∅. We also have by rule ⊢ f

imd mon he;

si

that ⊢

⊚

imd

s 1 ˆ 6 ⊚s. ≬s 1 : ⊚S 1 ˆ 6 ⊚S. ≬S 1 , so applying Lemma 4.2.7 on stores we obtain ⊢mon s

⊚


S1

imd r1 mon s

r1 r0


f

f

s 1 {∅} ˆ 6 ⊚s. ≬s 1 . ǫ: ⊚S 1 {∅} ˆ 6 ⊚S. ≬S 1 . ǫ. We complete with an application of rule

⊚

imd

⊢mon he; si.

hrunning returnr1 r0 v0 ;

⇀

imd he; si mon

-running:

[] s 1ˆ6 ⊚s . ≬s 0i ⇀ hreturnr1 v0 ;

⊚

We have by rule ⊢

imd mon he;

si

that ⊢

imd r1 mon s

s 1ˆ6 ⊚s . ǫi

⊚

imd he; sir1 mon

: f

s 1 ˆ 6 ⊚s. ≬s 0 : ⊚S 1 ˆ 6 ⊚S. ≬S 1 . Thus, the

⊚

f

store and store type have the same shape and in particular ≬S 1 = ≬S 0 . We also have by imd

rule ⊢mon he; si that ∅ {∅}; -running,

imd

S 1 ˆ 6 ⊚S . ≬S 0 ⊢mon e

⊚

f

imd

r1

S 1 {≬S 0 } ˆ 6 ⊚S. ≬S 1 . ǫ ⊢mon e

⊚

∅ {∅} {∅};

imd

running returnr1 r0 v0 : T1 P1 . By ⊢mon e

r1 ρ0

returnr1 r0 v0 : ≬T 0 T1 P0 (*), with P1 f

P1 .

imd

S 1 ˆ 6 ⊚S. ≬S 1 ⊢mon e

⊚

= (P0 := ∅). We require a derivation (**) ∅ {∅};

r1

returnr1 v0 : T1

imd

We have two cases. If the above judgment (*) is generated by ⊢mon e -ret-running, then imd

we have a derivation (**) with P0 = Return P1 . Otherwise (*) is generated by ⊢mon e -pure imd

imd

and by ⊢mon p -value we have ⊚S 1 {≬S 0 } ⊢mon v P1 . We can apply ⊢ imd

From ⊢mon s

r1

imd mon p -value

and ⊢

imd mon e -pure

r1 r0

imd

v0 : P0 . By Lemma 4.2.8, ⊚S 1 ⊢mon v

r1

v0 :

to obtain (**). In either case, T1 = St∅ Id. imd

s 1 ˆ 6 ⊚s. ≬s 0 : ⊚S 1 ˆ 6 ⊚S. ≬S 0 we obtain ⊢mon s

⊚

r1

s 1 ˆ 6 ⊚s. ǫ: ⊚S 1 ˆ

⊚

imd

6 ⊚S. ǫ. We complete with an application of rule ⊢mon he; si.

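Proof: Lemma 4.2.6. We actually show:

$\vdash^{imd}_{mon\,p}$:

\[
\Gamma;\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,p} p : P \ \wedge\ \Gamma(x') = P' \ \wedge\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,v} v' : P' \ \rightarrow\ \Gamma - x';\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,p} p[x' := v'] : P
\]

$\vdash^{imd}_{mon\,b}$:

\[
\Gamma;\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,b} b : T\,B \ \wedge\ \Gamma(x') = P' \ \wedge\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,v} v' : P' \ \rightarrow\ \Gamma - x';\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,b} b[x' := v'] : T\,B
\]

$\vdash^{imd}_{mon\,e}$:

\[
\Gamma;\ {}^{\circledcirc}S \,\hat{}\, {}^{\not\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,e} e : T\,P \ \wedge\ \Gamma(x') = P' \ \wedge\ {}^{\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,v} v' : P' \ \rightarrow\ \Gamma - x';\ {}^{\circledcirc}S \,\hat{}\, {}^{\not\circledcirc}S \vdash^{imd,\,\varrho}_{mon\,e} e[x' := v'] : T\,P
\]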


Modelling Encapsulation of State With Monad Transformers

Intermediate Languages

The proof is by induction on the derivation of the term into which we are substituting. We show only selected cases. imd mon p: imd

Γ; ⊚S ⊢mon p By ⊢

imd mon p

̺

x : Γ( x):

-var,

Γ ( x) = P. If x′ = x then P = P ′ and x [ x′ := v ′ ] = v ′ , so the result imd

imd

follows from an application of ⊢mon p -value. Otherwise, by ⊢mon p -var and the definition imd

of environments, (Γ − x′ ) ( x) = P. Γ − x′ ; S ⊢mon p

̺

imd

x : P follows from ⊢mon p-var,

and is sufficient because x = x [ x′ := v ′ ].

imd mon b:

Γ1 {≬Γ 0 };

⊚

imd

S ⊢mon b

̺1 ̺0

hλx.e0 i: (P0.1 ⇒ ≬T 0 T1 P0.2 ):

imd

imd

̺1 ̺0

By ⊢mon b-λ, Γ1 {≬Γ 0 {x 7→ P0.1 }}; ⊚S ˆ .ǫ. ǫ. ǫ ⊢mon e e0 : ≬T 0 T1 P0.2 . The result of imd ̺1 ̺0 ≬ ′ ⊚ mon the induction is Γ1 { Γ 0 {x 7→ P0.1 }} - x ; S ˆ .ǫ. ǫ. ǫ ⊢ e e0 [ x′ := v ′ ]: ≬T 0 T1 P0.2 . By definition of environment extension, x′ 6= x, so Γ1 {≬Γ 0 {x 7→ P0.1 }} - x′ = Γ1 - x′ {≬Γ 0 - x′ {x 7→ P0.1 }}. (Γ − x′ );

⊚

imd

S ⊢mon b

̺1 ̺0

hλx.e0 [ x′ := v ′ ] i: P0.1 ⇒ ≬T 0 T1 P0.2

imd

follows by ⊢mon b-λ, with hλx.e0 [ x′ := v ′ ] i = hλx.e0 i [ x′ := v ′ ].

imd mon e:

Γ1 ;

≬ s0 f f (⊚s 1 ˆ 6 ⊚s. ≬s 1 ≬s 0 ) imd

imd

⊢mon e

By ⊢mon e -running, Γ1 {∅};

r1

running e0 : T1 P0 :=r1 r0 ∅: f

f

imd

s 1 {≬s 0 } ˆ 6 ⊚s. ≬s 1 . ≬s 0 ⊢mon e

⊚

imd

erate a derivation of ⊚S 1 {≬S 0 } ⊢mon v

r1 r0

r1 r0

e0 : ≬T 0 T1 P0 . We gen-

v ′ : Return P ′ using a single additional ap-

imd

plication of ⊢mon v -ret-running. By definition of environment application, Γ1 {∅} ( x′ ) = Return P ′ . Then by induction, Γ1 {∅} - x′ ;

f

f

imd

s 1 {≬s 0 } ˆ 6 ⊚s . ≬s 1 . ≬s 0 ⊢mon e

⊚

r1 r0

e0 [ x′

imd

:= v ′ ]: ≬T 0 T1 P0 . Γ1 {∅} - x′ = (Γ1 - x′ ) {∅}, so applying ⊢mon e-running, (Γ1 - x′ ); (⊚s 1 ˆ ≬ s0 f f 6 ⊚s. ≬s 1 ≬s 0 )

imd

⊢mon e

r1

running e0 [ x′ := v ′ ]: T1 P0 :=r1 r0 ∅, where running e0 [ x′ :=

v ′ ] = (running e0 ) [ x′ := v ′ ].


f

imd

S 1 {≬S 0 }ˆ 6 ⊚S 1 . ≬S 0 ⊢mon e

Γ {≬Γ 1 } {≬Γ 0 };

⊚

imd

By ⊢obj e -ret-running, Γ {≬Γ 1 2 ≬Γ 0 };

r1 r0

return e1 : St∅ T1 (Return P1 ): imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

r1

e1 : T1 P1 . To perform induc-

tion we use either the antecedent derivation of a value derivation generated with imd

imd

⊢mon v -ret-running or, if v ′ is built with ⊢obj v -addr-live, a value derivation built using imd

an application of ⊢mon v -ret-running for each region name in r1 and a single applicaimd

tion of ⊢mon v 6⊚

S1 ⊢

imd r1 mon e

The result of the induction is Γ {≬Γ 1 2 ≬Γ 0 } - x′ ;

-addr-dead.

⊚

S1 ˆ

e1 [ x′ := v ′ ]: T1 P1 . Because Γ {≬Γ 1 2 ≬Γ 0 } - x′ = (Γ - x′ ) {(≬Γ 1 - x′ ) 2 imd

(≬Γ 0 - x′ )}, we can reapply ⊢mon e -ret-running to obtain (Γ - x′ ) {(≬Γ 1 - x′ )} {(≬Γ 0 - x′ )}; f

imd

S 1 {≬S 0 }ˆ 6 ⊚S 1 . ≬S 0 ⊢mon e

⊚

r1 r0

return e1 [ x′ := v ′ ]: St∅ T1 (Return P1 ), which is sufficient

because (Γ - x′ ) {(≬Γ 1 - x′ )} {(≬Γ 0 - x′ )} = Γ {≬Γ 1 } {≬Γ 0 } - x′ and return e1 [ x′ := v ′ ] = (return e1 ) [ x′ := v ′ ]. Proof: Lemma 4.2.7. We prove instead: imd mon v: imd

⊚

S 1 ⊢mon v

⊚

S 1 {∅} ⊢

r1 ρ0 ρ2

v0 : P0 →

imd r1 r0 ρ2 mon v

v0 : P0

imd mon p: imd

∅ {∅} {∅}; ⊚S 1 ⊢mon p

r1 ρ0 ρ2

imd

∅ {∅} {∅}; ⊚S 1 {∅} ⊢mon p imd mon b

r1 r0 ρ2

p0 : P0

(active source prestorable):

∅ {∅} {∅};

⊚

∅ {∅} {∅};

⊚

imd mon e

p0 : P0 →

imd

S 1 ⊢mon b

r1 ρ0 ρ2

imd

S 1 {∅} ⊢mon b

b0 : B0 →

r1 r0 ρ2

b0 : B0

(active source expression): imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

∅ {∅} {∅};

⊚

∅ {∅} {∅};

⊚

r1 ρ0 ρ2

imd

S 1 {∅} ˆ 6 ⊚S 1 . ǫ ⊢mon e

e0 : E0 → r1 r0 ρ2

e0 : E0


imd mon s: imd

⊢mon s

r1

imd

s 1 ˆ 6 ⊚s 1 : ⊚S 1 ˆ 6 ⊚S 1 → ⊢mon s

⊚

r1 r0

s 1 {∅} ˆ 6 ⊚s 1 . ǫ: ⊚S 1 {∅} ˆ 6 ⊚S 1 . ǫ

⊚

The proof is by induction on the derivation. Again, we show selected cases. imd mon v: imd

⊚

S 1 ⊢mon v

r1 ρ0 ρ2

v: Return P: imd

imd

Consider the case of ρ2 = ǫ. By ⊢mon v -ret-run we have ⊚S 1 ⊢mon v imd

⊢mon v -ret-running to obtain

imd

⊚

S 1 {∅} ⊢mon v

r1 r0

imd

imd

r1 r0 ρ21

v: P. We apply

v: Return P as required. Alternatively, imd

ρ2 = ρ21 ρ20 . By ⊢mon v -ret-run we have ⊚S 1 ⊢mon v ⊢mon v

r1

r1 ρ0 ρ21

v: P. By induction,

imd

imd

v: P. Reapplying ⊢mon v-ret-run yields ⊚S 1 {∅} ⊢mon v

r1 r0 ρ2

⊚

S 1 {∅}

v: Return P.

imd mon b:

Γ1 {≬Γ 0 };

imd

⊚

S 1 ⊢mon b

̺1 ̺0

hλx.e0 i: (P0.1 ⇒ ≬T 0 T1 P0.2 ):

imd

̺1 ̺0 = r1 ρ0 ρ2 . By ⊢mon b -λ, Γ1 {≬Γ 0 {x 7→ P0.1 }}; P0.2 . Then by induction, Γ1 {≬Γ 0 {x 7→ P0.1 }}; e0 : ≬T 0 T1 P0.2 . imd

⊚

S 1 {∅} ⊢mon b

6 ⊚ ̺1 ̺0

∅

(̺1 ̺0 ) [ ρ0 := r0 ]

. ǫ = 6 ⊚∅

(̺1 ̺0 ) [ ρ0 := r0 ]

S 1 ˆ 6 ⊚∅

⊚

̺1 ̺0

S 1 {∅} ˆ 6 ⊚∅

⊚

̺1 ̺0

imd

⊢mon e

̺1 ̺0

imd

. ǫ ⊢mon e

e0 : ≬T 0 T1

(̺1 ̺0 ) [ ρ0 := r0 ]

imd

so we can apply ⊢mon b-λ to obtain Γ1 {≬Γ 0 };

hλx.e0 i: P0.1 ⇒ ≬T 0 T1 P0.2 .

imd mon e:

Γ {≬Γ } {≬Γ ′ };

imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

r1 ρ0 ρ2

return e: St∅ T (Return P): imd

Consider the case of ρ2 = ǫ. By ⊢mon e -ret-run, Γ {≬Γ 2 ≬Γ ′ }; imd

P. We can apply ⊢mon e -ret-running to obtain Γ {≬Γ 2 ≬Γ ′ };

imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

r1

imd

e: T

S 1 {∅} ˆ 6 ⊚S 1 . ǫ ⊢mon e

⊚

r1 r0

imd

return e: St∅ T (Return P) Alternatively, ρ2 = ρ21 ρ20 . By ⊢mon e -ret-run we have Γ {≬Γ 2 ≬Γ ′ }; imd

r1 r0 ρ21

imd

r1 r0 ρ2

⊢mon e ⊢mon e

imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

imd

S 1 {∅} ˆ 6 ⊚s 1 . ǫ

⊚

S 1 {∅} ˆ 6 ⊚s 1 . ǫ

⊚

return e: St∅ T (Return P). imd

r1 ρ0 ρ2

imd

imd

S 1 {∅} ˆ 6 ⊚S 1 . ǫ ⊢mon e

⊚

imd

{∅} ˆ 6 ⊚S 1 . ǫ ⊢mon e

run e3 : T2 P3 :=r1 ρ0 ρ2 ρ3 ∅: imd

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

By ⊢mon e -run, Γ1 {∅};


e: T P. By induction, Γ {≬Γ 2 ≬Γ ′ };

e: T P. Reapplying ⊢mon e -ret-run yields Γ {≬Γ} {≬Γ ′ };

S 1 ˆ 6 ⊚S 1 ⊢mon e

⊚

Γ1 ;

r1 ρ0 ρ21

r1 r0 ρ2 ρ3

r1 r0 ρ2

r1 ρ0 ρ2 ρ3

e3 : ≬T 3 T2 P3 . By induction, Γ1 {∅}; imd

e3 : ≬T 3 T2 P3 . Applying ⊢mon e -run, we obtain Γ1 ;

run e3 : T2 P3 :=r1 r0 ρ2 ρ3 ∅.

⊚

S1

Translation

Proof: Lemma 4.2.8. The proof is by case analysis on the derivation of ${}^{\circledcirc}S_1\{{}^{\between}S_0\} \vdash^{imd,\,r_1 r_0}_{mon\,v} v_0 : P_0$:

${}^{\circledcirc}S_1\{{}^{\between}S_0\} \vdash^{imd,\,r_1 r_0}_{mon\,v} v : Return\ P_1$:

By $\vdash^{imd}_{mon\,v}$-ret-running we have ${}^{\circledcirc}S_1 \vdash^{imd,\,r_1}_{mon\,v} v : P_1$, which is all that we require as $(Return\ P_1) := \emptyset = P_1$.

${}^{\circledcirc}S_1\{{}^{\between}S_0\} \vdash^{imd,\,r_1 r_0}_{mon\,v} o : B$:

This is generated by $\vdash^{imd}_{mon\,v}$-addr-live. We can type the offset at an index of $r_1$ by using an application of $\vdash^{imd}_{mon\,v}$-addr-dead, followed by an application of $\vdash^{imd}_{mon\,v}$-ret-running for each region name in $r_1$, which is sufficient since $B := \emptyset = Return^{r_1}\ \emptyset$.

3. Translation

The intermediate monadic language carries more information than the intermediate object language. Specifically, an object language program does not maintain the information that a particular lexically visible region is not being used because we are currently executing the body of a function from an outer region. Thus, in translating object language programs, we must make assumptions about what region structure to provide for the monadic program. The obvious choice, which we use here, is to assume a linear tree structure, i.e., that each region in the object language store requires every preceding region. The additional levels of monads will yield an intermediate language program that is less efficient than that which would result from translating and reducing a source language program.

Figure 4.3.1 presents the dynamic translation for the core intermediate language. As with the core source language in Figure 3.3.28, the translation requires a region sequence that is extended with a region variable at letregion forms. The translation again inserts return forms around operations at outer regions and retracts the index on translating the bodies of functions allocated at outer regions. The translation rules, however, must be modified to use the more general region indicators ϱ rather than region variables, as the index is extended with a region name at freeregion forms. Region


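Configurations and Traces

\[
\begin{aligned}
\llbracket \langle q; s\rangle \rrbracket^{imd}_{obj\,\langle q;s\rangle\,N} &= \langle \llbracket q \rrbracket^{imd}_{obj\,q\,N};\ \pi_{\not\circledcirc}\,\llbracket s \rrbracket^{imd\;\hat{}\,Dom(s)}_{obj\,s\,N} \rangle \\
\llbracket \langle e; s\rangle \rrbracket^{imd\;r}_{obj\,\langle e;s\rangle\,N} &= \langle \llbracket e \rrbracket^{imd\;r}_{obj\,e\,N};\ \llbracket s \rrbracket^{imd\;r\,\hat{}\,Dom(s)-r}_{obj\,s\,N} \rangle \\
\llbracket \langle v; s\rangle \rrbracket^{imd\;r}_{obj\,\langle v;s\rangle\,N} &= \langle \llbracket v \rrbracket^{imd\;r}_{obj\,v\,N};\ \pi_{\circledcirc}\,\llbracket s \rrbracket^{imd\;r\,\hat{}\,Dom(s)-r}_{obj\,s\,N} \rangle \\
\llbracket t \rrbracket^{imd\;r}_{obj\,t\,N} &= [\, return^{r_2}\,(\iota\ @\ o) \mid r = r_1 r_0 r_2 \ \wedge\ (\iota\ @\ \langle r_0, o\rangle) \in t \,]
\end{aligned}
\]

Programs and Expressions

\[
\begin{aligned}
\llbracket q \rrbracket^{imd}_{obj\,q\,N} &= \llbracket q \rrbracket^{imd\;\epsilon}_{obj\,e\,N} \\
\llbracket p \rrbracket^{imd\;\varrho}_{obj\,e\,N} &= return^{\varrho}\ \llbracket p \rrbracket^{imd\;\varrho}_{obj\,p\,N} \\
\llbracket \text{let } x = e_1 \text{ in } e_2 \rrbracket^{imd\;\varrho}_{obj\,e\,N} &= \text{let } x = \llbracket e_1 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in } \llbracket e_2 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \\
\llbracket @\ \varrho_0\ \langle \text{ref } e\rangle \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0 \varrho_2}_{obj\,e\,N} &= \text{let } x = \llbracket e \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in } return^{\varrho_2}\ \langle \text{ref } x\rangle \\
\llbracket @\ \varrho_0\ \text{deref } e \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0 \varrho_2}_{obj\,e\,N} &= \text{let } x = \llbracket e \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in } return^{\varrho_2}\ (\text{deref } x) \\
\llbracket @\ \varrho_0\ \text{set } e.1 \text{ to } e.2 \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0 \varrho_2}_{obj\,e\,N} &= \text{let } x.1 = \llbracket e.1 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in let } x.2 = \llbracket e.2 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in } return^{\varrho_2}\ (\text{set } x.1 \text{ to } x.2) \\
\llbracket @\ \varrho_0\ \langle \lambda x.e_0\rangle \rrbracket^{imd\;\varrho_1 \varrho_0 \varrho_2}_{obj\,e\,N} &= return^{\varrho_2}\ \langle \lambda x.\llbracket e_0 \rrbracket^{imd\;\varrho_1 \varrho_0}_{obj\,e\,N}\rangle \\
\llbracket @\ \varrho_0\ e.1\ e.2 \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0 \varrho_2}_{obj\,e\,N} &= \text{let } x.1 = \llbracket e.1 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in let } x.2 = \llbracket e.2 \rrbracket^{imd\;\varrho}_{obj\,e\,N} \text{ in } return^{\varrho_2}\ (x.1\ x.2) \\
\llbracket \text{letregion } \rho_0 \text{ in } e_0 \rrbracket^{imd\;\varrho_1}_{obj\,e\,N} &= \text{run } \llbracket e_0 \rrbracket^{imd\;\varrho_1 \rho_0}_{obj\,e\,N} \\
\llbracket \text{freeregion } r_0 \text{ after } e_0 \rrbracket^{imd\;r_1}_{obj\,e\,N} &= \text{running } \llbracket e_0 \rrbracket^{imd\;r_1 r_0}_{obj\,e\,N}
\end{aligned}
\]

Stores and Storables

\[
\begin{aligned}
\llbracket s \rrbracket^{imd\;r\,\hat{}\,r_3}_{obj\,s\,N} &= \&^{r_3}\,(\llbracket s \rrbracket^{imd\;r\,r_3}_{obj\,s\,N}\ \hat{}\ .\epsilon.\epsilon.\epsilon) \\
\llbracket \emptyset \rrbracket^{imd\;\epsilon}_{obj\,s\,N} &= \emptyset \\
\llbracket s_1\{r_0 \mapsto {}^{\between}s_0\} \rrbracket^{imd\;r_1 r_0}_{obj\,s\,N} &= \llbracket s_1 \rrbracket^{imd\;r_1}_{obj\,s\,N}\,\{\llbracket {}^{\between}s_0 \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}s\,N}\} \\
\llbracket \emptyset \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}s\,N} &= \emptyset \\
\llbracket {}^{\between}s_0\{o \mapsto d_0\} \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}s\,N} &= \llbracket {}^{\between}s_0 \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}s\,N}\,\{o \mapsto \llbracket d_0 \rrbracket^{imd\;r_1 r_0}_{obj\,d\,N}\} \\
\llbracket \langle \text{ref } v_0\rangle \rrbracket^{imd\;r_1 r_0}_{obj\,d\,N} &= \langle \text{ref } \llbracket v_0 \rrbracket^{imd\;r_1 r_0}_{obj\,v\,N}\rangle \\
\llbracket \langle \lambda x.e_0\rangle \rrbracket^{imd\;r_1 r_0}_{obj\,d\,N} &= \langle \lambda x.\llbracket e_0 \rrbracket^{imd\;r_1 r_0}_{obj\,e\,N}\rangle
\end{aligned}
\]

Pures and Values

\[
\begin{aligned}
\llbracket x \rrbracket^{imd\;\varrho}_{obj\,p\,N} &= x \\
\llbracket v \rrbracket^{imd\;\varrho}_{obj\,p\,N} &= \llbracket v \rrbracket^{imd\;\varrho}_{obj\,v\,N} \\
\llbracket g \rrbracket^{imd\;\varrho}_{obj\,v\,N} &= g \\
\llbracket \langle r_0, o\rangle \rrbracket^{imd\;r_1 r_0 \varrho_2}_{obj\,v\,N} &= o \\
\llbracket \langle \emptyset, o\rangle \rrbracket^{imd\;\varrho}_{obj\,v\,N} &= o
\end{aligned}
\]

Region Indicators

\[
\begin{aligned}
\llbracket \rho \rrbracket^{imd}_{obj\,\varrho\,N} &= \rho \\
\llbracket r \rrbracket^{imd}_{obj\,\varrho\,N} &= r
\end{aligned}
\]

with the translation applied elementwise to sequences of region indicators.

Figure 4.3.1. Translating Object Intermediate Language to Monadic (Dynamic)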
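The expression clauses of the dynamic translation can be sketched in executable form. The following is only a minimal sketch under stated assumptions, not the implementation used in this work: the class names (`Addr`, `DerefAt`, `LamAt`, `LetRegion`, `FreeRegion`), the rendering of monadic terms as strings, and the fresh-variable scheme are all hypothetical, only a few expression forms are covered, and an indexed form such as return^{ϱ2} is read as one return per region in ϱ2, which agrees with the sample translation discussed below.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Addr:          # generalized address <region, offset>
    region: str
    offset: str


@dataclass(frozen=True)
class DerefAt:       # @ region deref arg
    region: str
    arg: object


@dataclass(frozen=True)
class LamAt:         # @ region <lambda param. body>
    region: str
    param: str
    body: object


@dataclass(frozen=True)
class LetRegion:     # letregion rho in body
    rho: str
    body: object


@dataclass(frozen=True)
class FreeRegion:    # freeregion name after body
    name: str
    body: object


def paren(term):
    """Parenthesize a rendered term unless it is a single token."""
    return f"({term})" if " " in term else term


def returns(n, term):
    """Wrap a rendered term in n return forms (assumed reading of return^rho)."""
    for _ in range(n):
        term = "return " + paren(term)
    return term


def translate(e, index, fresh=None):
    """Translate e at the region sequence index (a list of region names)."""
    fresh = fresh if fresh is not None else (f"x{i}" for i in count(1))
    if isinstance(e, Addr):
        # A pure in expression position: keep only the offset, one return per region.
        return returns(len(index), e.offset)
    if isinstance(e, DerefAt):
        # Regions after the operated-on region in the index each contribute a return.
        outer = len(index) - index.index(e.region) - 1
        x = next(fresh)
        arg = translate(e.arg, index, fresh)
        return f"let {x} = {arg} in {returns(outer, 'deref ' + x)}"
    if isinstance(e, LamAt):
        # The body is translated at the index retracted to the allocating region.
        cut = index.index(e.region) + 1
        body = translate(e.body, index[:cut], fresh)
        return returns(len(index) - cut, f"<lambda {e.param}. {body}>")
    if isinstance(e, LetRegion):
        return "run " + paren(translate(e.body, index + [e.rho], fresh))
    if isinstance(e, FreeRegion):
        return "running " + paren(translate(e.body, index + [e.name], fresh))
    raise TypeError(f"unsupported expression: {e!r}")


example = FreeRegion("r2", DerefAt("r1", FreeRegion("r3", Addr("r1", "o1"))))
print(translate(example, ["r1"]))
```

Applied to the hypothetical `example` term at index r1, the sketch produces a `running` form around a `let` whose bound expression is `running` applied to three nested `return` forms over `o1`, and whose body is a single `return` around the dereference. Region names are assumed to be distinct within an index.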


names are translated similarly to region variables and freeregion forms are translated similarly to letregion forms. Generalized addresses are translated by keeping only the offset and dropping the region indicator.

As in Figure 2.3.1, we must translate configurations and stores. This is more problematic with our tree-structured monadic stores, but in accordance with the strategy described above we convert the linear store of the object language (including both lexical and nonlexical region stores) to a linear tree at the appropriate lexical level. We begin by translating stores in a straightforward way, extending the index as we ascend. Region stores are translated as are the stores in Figure 2.3.1. This provides a monadic lexical store covering all the regions. We adjoin this to an empty nonlexical store and iteratively graft (using the operation &) each region not lexically visible in the object language program into the tree structure. In the case of program or value configurations, we finish by projecting the appropriate store component. Traces are translated by replacing the region name of the address with sufficient return forms in the atomic trace type.

Consider the sample object language derivation in Figure 4.1.11. Applying the translation of Figure 4.3.1 to the expression configuration

⟨freeregion r2 after @ r1 deref freeregion r3 after ⟨r1, o1⟩ ;
 ∅ {r1 ↦ ∅ {o1 ↦ ⟨ref unit⟩} {o2 ↦ ⟨λx1. @ r1 deref letregion ρ3 in ⟨r1, o1⟩⟩}} {r2 ↦ ∅ {o1 ↦ ⟨ref ⟨r1, o1⟩⟩}} {r3 ↦ ∅}⟩

yields ⟨ running ; let x6 = running return return return o1 in return deref x6 ∅ {∅ {o1 ↦ ⟨ref unit⟩} } ˆ .∅ {o1 ↦ ⟨ref o1⟩}⟩ {o2 ↦ ⟨λx1. let x6 = run return return o1 ⟩} ∅ in return deref x6

Under the translation, there is a parent/child relationship between regions corresponding to r2 and r3, which were siblings in the examples of Section 4.2.

The translation of the static syntax is presented in Figure 4.3.2. As with the core source language in Figure 3.3.29, the translation requires a region sequence that is retracted as Return forms are inserted around pure types at outer regions. Also as in Figure 3.3.29, the trace type is partitioned into region trace types. Configuration types are translated similarly to the configurations of Figure 4.3.1.


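Environments

\[
\begin{aligned}
\llbracket \emptyset \rrbracket^{imd\;\epsilon}_{obj\,\Gamma\,N} &= \emptyset\,\{\emptyset\} \\
\llbracket \Gamma\{\varrho_0\} \rrbracket^{imd\;\varrho_1 \varrho_0}_{obj\,\Gamma\,N} &= \llbracket \Gamma \rrbracket^{imd\;\varrho_1}_{obj\,\Gamma\,N}\,\{\emptyset\} \\
\llbracket \Gamma\{x \mapsto P\} \rrbracket^{imd\;\varrho}_{obj\,\Gamma\,N} &= \text{let } \Gamma\{{}^{\between}\Gamma\} = \llbracket \Gamma \rrbracket^{imd\;\varrho}_{obj\,\Gamma\,N} \text{ in } \Gamma\{{}^{\between}\Gamma\{x \mapsto \llbracket P \rrbracket^{imd\;\varrho}_{obj\,P\,N}\}\}
\end{aligned}
\]

Configuration Types

\[
\begin{aligned}
\llbracket \langle Q; S\rangle \rrbracket^{imd}_{obj\,\langle Q;S\rangle\,N} &= \langle \llbracket Q \rrbracket^{imd\;\epsilon}_{obj\,P\,N};\ \pi_{\not\circledcirc}\,\llbracket S \rrbracket^{imd\;\hat{}\,Dom(S)}_{obj\,S\,N} \rangle \\
\llbracket \langle E; S\rangle \rrbracket^{imd\;r}_{obj\,\langle E;S\rangle\,N} &= \langle \llbracket E \rrbracket^{imd\;r}_{obj\,E\,N};\ \llbracket S \rrbracket^{imd\;r\,\hat{}\,Dom(S)-r}_{obj\,S\,N} \rangle \\
\llbracket \langle P; S\rangle \rrbracket^{imd\;r}_{obj\,\langle P;S\rangle\,N} &= \langle \llbracket P \rrbracket^{imd\;r}_{obj\,P\,N};\ \pi_{\circledcirc}\,\llbracket S \rrbracket^{imd\;r\,\hat{}\,Dom(S)-r}_{obj\,S\,N} \rangle
\end{aligned}
\]

Expression and Trace Types

\[
\begin{aligned}
\llbracket P\ !\ T \rrbracket^{imd\;\varrho}_{obj\,E\,N} &= \llbracket T \rrbracket^{imd\;\varrho}_{obj\,T\,N}\ \llbracket P \rrbracket^{imd\;\varrho}_{obj\,P\,N} \\
\llbracket T \rrbracket^{imd\;\varrho\,\varrho_0}_{obj\,T\,N} &= St_{\{\iota \mid (\iota\ @\ \varrho_0) \in T\}}\ \llbracket T - \varrho_0 \rrbracket^{imd\;\varrho}_{obj\,T\,N} \\
\llbracket T \rrbracket^{imd\;\epsilon}_{obj\,T\,N} &= Id
\end{aligned}
\]

Pure Types

\[
\begin{aligned}
\llbracket G \rrbracket^{imd\;\varrho}_{obj\,P\,N} &= Return^{\varrho}\ G \\
\llbracket B\ @\ \varrho_0 \rrbracket^{imd\;\varrho_1 \varrho_0 \varrho_2}_{obj\,P\,N} &= Return^{\varrho_2}\ \llbracket B \rrbracket^{imd\;\varrho_1 \varrho_0}_{obj\,B\,N} \\
\llbracket \emptyset \rrbracket^{imd\;\varrho}_{obj\,P\,N} &= Return^{\varrho}\ \emptyset
\end{aligned}
\]

Storable Types

\[
\begin{aligned}
\llbracket Ref\ P \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0}_{obj\,B\,N} &= Ref\ \llbracket P \rrbracket^{imd\;\varrho}_{obj\,P\,N} \\
\llbracket P.1 \Rightarrow^{T} P.2 \rrbracket^{imd\;\varrho = \varrho_1 \varrho_0}_{obj\,B\,N} &= \llbracket P.1 \rrbracket^{imd\;\varrho}_{obj\,P\,N} \Rightarrow \llbracket T \rrbracket^{imd\;\varrho}_{obj\,T\,N}\ \llbracket P.2 \rrbracket^{imd\;\varrho}_{obj\,P\,N}
\end{aligned}
\]

Store Types

\[
\begin{aligned}
\llbracket S \rrbracket^{imd\;r_0\,\hat{}\,r_3}_{obj\,S\,N} &= \&^{r_3}\,(\llbracket S \rrbracket^{imd\;r_0\,r_3}_{obj\,S\,N}\ \hat{}\ .\epsilon.\epsilon.\epsilon) \\
\llbracket \emptyset \rrbracket^{imd\;\epsilon}_{obj\,S\,N} &= \emptyset \\
\llbracket S\{r_0 \mapsto {}^{\between}S\} \rrbracket^{imd\;r_1 r_0}_{obj\,S\,N} &= \llbracket S \rrbracket^{imd\;r_1}_{obj\,S\,N}\,\{\llbracket {}^{\between}S \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}S\,N}\} \\
\llbracket \emptyset \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}S\,N} &= \emptyset \\
\llbracket {}^{\between}S\{o \mapsto B\} \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}S\,N} &= \llbracket {}^{\between}S \rrbracket^{imd\;r_1 r_0}_{obj\,{}^{\between}S\,N}\,\{o \mapsto \llbracket B \rrbracket^{imd\;r_1 r_0}_{obj\,B\,N}\}
\end{aligned}
\]

Figure 4.3.2. Translating Object Intermediate Language to Monadic (Static)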
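The trace-type and pure-type clauses of the static translation can be illustrated the same way. This is a hedged sketch rather than the model used in this work: the classes `Ground`, `Ref` and `At`, the string rendering of monadic types, and the representation of a trace type as a list of (effect, region) pairs are assumptions made for illustration, function types are omitted, and Return^{ϱ} is again read as one Return per region in ϱ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ground:        # ground pure type G, e.g. Unit
    name: str


@dataclass(frozen=True)
class Ref:           # storable type Ref P
    content: object


@dataclass(frozen=True)
class At:            # pure type B @ region
    storable: object
    region: str


def paren(text):
    """Parenthesize a rendered type unless it is a single token."""
    return f"({text})" if " " in text else text


def wrap_return(n, text):
    """Wrap a rendered type in n Return forms."""
    for _ in range(n):
        text = "Return " + paren(text)
    return text


def translate_pure(p, index):
    """Pure types: one Return per region of the index beyond the allocating region."""
    if isinstance(p, Ground):
        return wrap_return(len(index), p.name)
    cut = index.index(p.region) + 1
    return wrap_return(len(index) - cut, translate_storable(p.storable, index[:cut]))


def translate_storable(b, index):
    """Storable types: only Ref P is covered in this sketch."""
    return "Ref " + paren(translate_pure(b.content, index))


def translate_trace_type(trace, index):
    """Partition a trace type into one St layer per region, innermost region first."""
    if not index:
        return "Id"
    *outer, r0 = index
    here = ",".join(sorted({eff for eff, region in trace if region == r0}))
    rest = [(eff, region) for eff, region in trace if region != r0]
    return f"St{{{here}}} " + paren(translate_trace_type(rest, outer))


print(translate_trace_type([("read", "r1")], ["r1"]))
print(translate_storable(Ref(At(Ref(Ground("Unit")), "r1")), ["r1", "r2"]))
```

On the trace type {(read @ r1)} at index r1 the sketch yields `St{read} Id`, and on Ref (Ref Unit @ r1) at index r1 r2 it yields a Ref of a Return of a Ref of a Return of Unit, which is the shape that appears in the sample store type translation.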


Applying the translation of Figure 4.3.2 to the expression configuration type hUnit ! {(read @ r1 )}; ∅ {r1 7→ ∅ {o1 7→ Ref Unit}

} i {(read @ r1 )}

{o2 7→ Ref Unit @ r1 ⇒ {r2 7→ ∅ {o1 7→ Ref (Ref Unit @ r1 )}} {r3 7→ ∅}

Ref Unit @ r1 }

yields hSt{read} Id ; ∅ {∅ {o1 → 7 Ref Return Unit} } ˆ .∅ {o1 7→ Ref Return Ref Return Unit}i {(read @ r1 )} Return Unit {o2 7→ Ref Return Unit ⇒ } ∅ Ref Return Unit We again have a statement of Types Preservation. Theorem 4.3.1 (Types Preservation). imd

⊢obj hq; si imd 6⊚ ⊢mon hq; s i imd

⊢obj he; si ⊢

imd mon he;

r

si

⊢obj hv; si imd

⊢mon hv; imd

⊢obj s ⊢

imd

S imd

π⊚ [S ] obj S

r1 r0

ˆN S

imd r1 r0 obj S

π⊚ [S ]

imd obj S

π6 ⊚[S ]

ˆN

[he; si]

⊚

si

[hv; si]

[s ]

r

imd

[s ] obj s

⊢

s

imd

⊢obj d ⊢

imd ̺N imd [r [ 1 r0]]obj mon d

⊢obj q

ˆ r3 N

imd mon q

⊢

r

N

s

imd ≬ r1 r0 N obj s

[≬s ]

d imd r1 r0 N obj d

[d ]

imd

S

q imd obj qN

[q ]

: [hE; Si]

:

:

:

[hP; Si]

[S ]

→

imd

[S ] obj S

r

N

≬

→

S

imd ≬

[≬S ] obj

S

r1 r0

N

B

→

imd r1 r0 N obj B

[B ]

: :

→ ˆ r3 N

S

: :

SiN

S

: :

SiN

→

imd obj hP;

imd r obj S

→

→

hP; Si

:

≬

r1 r0

imd obj hE;

: ˆ r3 N

s

imd ≬ r1 r0 imd ̺ N [ 1 r0]]obj imd ≬ [r mon s

siN

s imd r obj s

hE; Si

:

imd obj hv;

r r3

imd ̺N [ ]]obj imd ⊚ [r mon s

⊢obj

siN

hv; si imd ̺N [r [ ]]obj

: hQ; Si imd : [hQ; Si] obj hQ; SiN :

imd obj he;

r

imd ̺N imd [r [ r3]]obj mon s

⊢obj s ⊢

he; si

imd ̺N [r [ ]]obj

imd

hq; si imd [hq; si] obj hq; siN

Q

→

imd obj QN

[Q ]


imd

r

imd

imd ̺N [r [ ]]obj

Γ; S imd imd r ˆ r3 N [Γ ] obj ΓN ; [S ] obj S

⊢obj e

S

⊢obj v

imd r obj S

π⊚ [S ]

ˆ r3 N

e

imd

⊢

:

imd obj eN

⊢mon e

: [E ]

v

imd ̺N imd [r [ ]]obj mon v

→

E imd obj EN

[e ]

r


:

imd obj vN

→

P imd obj PN

[v ]

: [P ]

Proof: Theorem 4.3.1. The proof is by induction on object language typing derivations. We state the theorem over judgments for programs, expressions, values, and their configurations, stores, region stores, and storables. We present some cases that differ from the proof of Theorem 3.3.1. imd

⊢obj hq;

6⊚

si

: imd

imd

r2

6⊚

imd

6⊚

[hq; 6 ⊚s i] obj hq;

imd

s: S and S ⊢obj q q : Given ⊢obj hq; si hq; 6 ⊚s i: hQ; 6 ⊚Si, by ⊢obj hq; s i we have ⊢obj s imd ˆ r2 imd ˆ r2 imd ˆ r2 imd imd ǫ imd P. By induction, ⊢mon s [s ] obj s N : [S ] obj S N and π6 ⊚[S ] obj S N ⊢mon q [q ] obj qN : imd

imd

[Q ] obj QN . Applying ⊢mon hq; ⊢

imd obj he;

6⊚

si

imd

yields ⊢mon hq;

si

imd

6⊚

s iN

: [hQ; 6 ⊚Si] obj hQ;

imd

6⊚

imd

r

S iN

.

si

: imd

r

imd

imd

r r2

Given ⊢obj he; si he; si: hE; Si, by ⊢obj he; si we have ⊢obj s s: S and ∅ {r}; S ⊢obj e e: imd ̺N imd r ˆ r2 imd r ˆ r2 imd r ˆ r2 imd [r [ ]]obj N N N E. By induction, we have ⊢mon s [s ] obj s : [S ] obj S and {∅} {∅}; [S ] obj S imd ̺N [r [ ]]obj

imd

imd

imd

imd

imd

imd

[e ] obj eN : [E ] obj EN . Applying ⊢mon he; si then yields ⊢mon he; si [he; si] obj he; siN :

⊢mon e

imd

[hE; Si] obj hE; SiN . imd

⊢obj s-empty: imd

We have ⊢obj s imd

⊢mon s ⊢

ǫ

We have ⊢

s

imd 6 ⊚

-empty

and ⊢mon

s

-empty,

imd

along with ⊢mon s to obtain

:

imd r = r1 r0 obj s

imd ≬ r

S 0 }⊢obj

imd ⊚

∅: ∅. We use ⊢mon

∅ ˆ ǫ: ∅ ˆ ǫ.

imd obj s -nonempty

≬

ǫ

s

imd

s1 {r0 7→ ≬s 0 }: S1 {r0 7→ ≬S 0 }. By ⊢obj s -nonempty, we get S1 {r0 7→ imd

s 0 : ≬S 0 and ⊢obj s

≬

r1

s1 : S1 . We need to show that the corresponding monadic

stores at each lexical depth are typable. We first demonstrate that the corresponding monadic fully lexical store is typable, and then that & preserves typability. The for≬

imd r obj s N

mer is straightforward. By induction, π⊚ [S1 {r0 7→ S 0 } ] imd ≬ r obj S N

[≬S 0 ]

imd r1 N obj s

and, if we let ⊚s 1 ˆ 6 ⊚s 1 = [s1 ]

s 1 ˆ 6 ⊚s 1 : ⊚S 1 ˆ 6 ⊚S 1 . The latter gives rise to ⊢

⊚

6⊚

imd ⊚

s 1 : 6 ⊚S 1 . We can apply ⊢mon


s

imd r1 N obj S

and ⊚S 1 ˆ 6 ⊚S 1 = [S1 ]

imd ̺N [ 1]]obj imd ⊚ [r obj s imd 6 ⊚

-nonempty

⊢

imd ̺N [ ]]obj imd ≬ [r obj s

and ⊢mon

s

⊚

imd ≬ r

[≬s 0 ] obj

,⊢

:

imd ̺N [ 1]]obj imd 6 ⊚ [r

s 1 : ⊚S 1 and ⊚S 1 ⊢obj

-nonempty,

s N

imd ̺N imd [r [ 1]]obj obj s

imd

s

along with ⊢mon s to obtain


imd

⊢mon s

ǫ

s 1 {≬s 0 } ˆ 6 ⊚s 1 . ǫ: ⊚S 1 {≬S 0 } ˆ 6 ⊚S 1 . ǫ. That & preserves typability holds by a vari-

⊚

ant of Proposition 4.2.2. In brief, we can augment the nonlexical store derivation using f imd ≬

⊢mon

s

imd ̺N [ 1]]obj imd 6 ⊚ [r

s

on the region store derivation to obtain ⊚S 1 ⊢obj

6⊚

s 1 ≬s 0 : 6 ⊚S 1 ≬S 0 .

⊢obj v-addr-live:

We use an instance of ⊢mon v-ret-running for each region variable in the index beyond the point of allocation. This removes the extra region store types that would be allowed to remain in the object language derivation. We then complete with an instance of ⊢mon v-addr-live. The latter is derivable because the translation preserves types of storables.

⊢obj d-λ:

We have S ⊢obj d

r1 r0

imd

imd

hλx.e0 i: B0 . By ⊢obj d -λ and ⊢obj b -λ as well as the definition of T

restriction of environments and store types, we have B0 = P0.1 ⇒0 P0.2 and ∅ {r1 } {r0 } {x 7→ imd

P0.1 }; S ⊢obj e imd r1 r0 N obj S

⊢

[S ]

r1 r0

imd

e0 : P0.2 ! T0 . By induction, can derive [∅ {r1 } {r0 } {x 7→ P1 }] obj Γ

imd ρN imd [r [ 1 r0]]obj mon e

imd r1 r0 N obj e

[e0 ]

imd r1 r0 N obj T

: [T0 ]

imd r1 r0 N obj Γ

translation, [∅ {r1 } {r0 } {x 7→ P0.1 } ] apply ⊢

imd mon b-λ

imd r1 r0 N obj P

[P0.1 ] ⊢

imd obj d-ref

and ⊢

imd mon d-λ

[P0.2 ]

imd

in order to derive [S ]

⇒ [T0 ]

imd r1 r0 N obj P

[P0.2 ]

⊢

N

;

. By definition of the

= {∅} {∅} {∅ {x 7→ [P0.1 ] obj P imd r1 r0 N obj S

imd r1 r0 N obj T

imd r1 r0 N obj P

r1 r0

imd rN imd [r [ 1 r0]]obj mon d

r1 r0

N

}}, so we

imd r1 r0 N obj e

hλx.[[e0 ]

i:

.

: imd

imd

r1 r0

imd

r1 r0

href v0 i: Ref P0 . By ⊢obj d-ref we have a derivation of S ⊢mon v v0 : P0 . imd ̺N imd r1 r0 ˆ imd imd imd [r [ 1 r0]]obj N mon v By induction, there is a derivation of π⊚ [S ] obj S ⊢ [v0 ] obj vN : [P0 ] obj PN . imd ̺N imd r1 r0 ˆ imd imd imd [r [ 1 r0]]obj N imd Applying ⊢mon d-ref yields π⊚ [S ] obj S ⊢mon d href [v0 ] obj vN i: Ref [P0 ] obj PN . We have S ⊢mon d

imd

⊢obj e-freeregion: imd

We have Γ1 ; S ⊢obj e

r1

imd

freeregion r0 after e0 : P0 - r0 ! T0 - r0 . By ⊢obj e -freeregion, we have a imd

derivation of Γ1 {r0 }; S ⊢obj e

r1 r0

e0 : P0 ! T0 . By induction and the definition of the static imd ˆ r3 N imd e[r[ 1 r0]]obj ̺N imd r1 r0 N translation, [Γ1 ] {∅}; [S ] ⊢mon [e0 ] obj e : St{ι|(ι @ r0 ) ∈ T0 } imd r1 imd r1 r0 imd r1 imd r1 r0 ˆ r3 imd N N [T0 − r0 ] obj T N [P0 ] obj P . Applying ⊢mon e-running, we derive [Γ1 ] obj Γ N ; &[[S ] obj S imd r1 N obj Γ

imd

imd ̺N [r [ 1]]obj

imd r1 r0 obj S

imd

r1 r0

imd

r1

imd

r1 r0

N N running [e0 ] obj e : [T0 − r0 ] obj T N [P0 ] obj P := ∅. Recognizing that imd r1 r0 ˆ r3 imd r1 ˆ r0 r3 imd r1 r0 imd r1 N N N &[[S ] obj S = [S ] obj S , that running [e0 ] obj e = [freeregion r0 after e0 ] obj e N ,

⊢mon e

imd

and also that [P0 ] obj P

r1 r0

N

imd

:= ∅ = [P0 − r0 ] obj P

r1

N

, we are done.

Semantics is preserved by the translation in the nondirectional sense of Theorem 2.3.2.


Theorem 4.3.2 (Semantics Preservation). hq; si →∗ hq′ ; s′ i

imd

hq; si ∈ R obj hq; si ∧ imd obj he;

he; si ∈ R

sir

imd hq; 6 ⊚s i mon

t he; si →∗ he′ ; s′ i

∧

imd he; sir mon

imd hq; siN

→

[hq; [ si]]obj

imd he; sir N

→

imd hq; siN

= [hq [ ′ ; s′ i]]obj

imd hq; 6 ⊚s i mon

[he; [ si]]obj

imd he; t = [he [ ′ ; s′ i]]obj

sir N

imd he; sir mon

We again require two lemmas to prove Theorem 4.3.2. The first lemma is modified to account for the fact that our monadic evaluation contexts are now defined over configurations. Thus, we posit monadic configuration evaluation contexts for the result of evaluation, and generalize expressions of the object language and expression configurations of the monadic language to include evaluation contexts.

To motivate the following definition, notice that the translation of an expression evaluation context (as an expression) never includes a return context (surrounding the hole).¹¹ Therefore, the monadic configuration context will not include any region stores. Its store component will simply apply & for each letregion form surrounding the hole. We thus generalize the

imd

translations [ ] obj qN and [ ] obj e → [ e] r/r1 ∗

e

r

N

on programs and expressions to evaluation contexts imd →∗ [ e ǫ ] qN obj

by mapping, in the case of programs, [[ e ǫ ] ] imd →∗ [ e ] r /

e

expressions, [[ e r ]] obj

N

→∗ [ e r1 ]

q or

ǫ

to [ he; si ], and in the case of imd →∗ [ e] r/r0

[r] []

to [ he; si ], and [freeregion r0 after [ e rr0 ]] obj

e

N

to hrunning

[ 0]] [ 0]] [ e[rr ]; &[r[ 0]][ s[rr ]i. Other atomic expression evaluation contexts are translated to expression con-

figuration contexts whose expression component applies the translation on expressions (replacing holes with holes) and whose store component is an empty store context. Translations of nonatomic evaluation contexts are derived by composition of atomic evaluation contexts. We use the same evaluation contexts and reduction rules on the modified languages, and allow object language context h→

reduction derivations to be filled with an expression e1 as in (

t ∗ e ; si ⇀∗ h→ [

∗ [ e r r1 ]

imd he; sir obj

e r r1 ] ′

e ; s′ i

)[e1 ] and

monadic language context reduction derivations to be filled with an expression configuration he1 ; s1 i t he; si ⇀∗

→∗ [ he; sir r1 ]

as in (

→∗ [ he; sir r1 ]

imd he; sir mon

he′ ; s′ i

)[he1 ; s1 i].

¹¹ The translation of an evaluation context may include a return construct that does not surround the hole, and the translation of an object language redex may include a return context that must be descended to arrive at a monadic redex.


Lemma 4.3.1 (Evaluation Contexts Preserved by Translation up to Evaluation). imd [ e] q: obj

For any object language evaluation context

→∗ [ e r1 ]

q,

for some monadic language evaluation context imd →∗ [ e r1 ]q N obj

→∗ [ e r1 ]

[[

q ]]

→

[ 1]]] ∗ →∗ [ he; si[r

hq′ ;

[ 1]] →∗ [ [r he; si]

hq ′ ; 6 ⊚si,

6⊚

si

[ 1]]] imd →∗ [ he; si[r hq; 6 ⊚s i mon

.

imd [ e] e: obj

For any object language evaluation context

→∗ [ e] r/r1

e

for some monadic language evaluation context imd →∗ [ e]e r/r1 N

∗ [ e r r1 ]

[[→

e ]]obj

[] →∗

,

[ r1]] →∗ [ he; si[r ]

[r] []

he′ ; si ,

[ r1]]] →∗ [ he; si[r

[] he′ ; si[r]

[r] [] [ r1]]] imd →∗ [ he; si[r he; si mon

.

The second lemma is modified merely to restrict it to reachable expression configurations. Lemma 4.3.2 (Reduction Translates as Evaluation). imd obj he;

From he; si ∈ R it follows that

sir

and

imd he; sir N si]]obj

[he; [

t he; si ⇀ he′ ; s′ i imd he; sir obj

t imd he; →∗ [he [ ′ ; s′ i]]obj

,

sir N

.

imd he; si[r] [] mon

Proof: Theorem 4.3.2. The proof proceeds by induction on the object language evaluation derivation. imd obj hq;

si: →∗

imd hq; si obj

-cntxt: →∗

The proof is similar to that for

imd he; si obj

-cntxt below and also uses a derivation from

Figure 4.3.3. imd obj he;

si: →∗

imd he; si obj

-cntxt:

We have e = tion,

[he [ 1;

→∗ [ e r r1 ]

′

e [e1 ] and e =

imd he; sir r1 N si]]obj

→∗ [ e r r1 ]

imd he; t = [he [ ′1 ; s′ i]]obj

e [e′1 ],

with

t he1 ; si →∗ he′1 ; s′ i imd he; sir r1 obj

. Call this evaluation derivation d. By

[ r1]] imd he; si[r mon

∗

Lemma 4.3.1, we have for some monadic language evaluation context → ∗

[h→

imd [ e]

e; si ] obj h

[ e]

e ; siN

evaluation contexts

[]

→∗

. By induc-

sir r1 N

[ r1]] →∗ [ he; si[r ]

[] he; si[r] .

[ r1]] [ he; si[r ]

[] he; si[r] ,

Call its evaluation derivation on

[ r1]] →∗ [ he; si[r ]

d . Our monadic language equality derivation is then

built as in Figure 4.3.3.

[Figure 4.3.3. Monadic Language Expression Configuration Equality Derivations for Proof of Theorem 4.3.2]

→∗ imd he; si obj

-reflex: imd

r

We have that he; si = he′ ; s′ i, so [he; si] obj he; si imd r obj t N

[t ] →∗

imd he; si obj

= [ ] and we can apply

= imd he; si mon

N

imd

r

= [he′ ; s′ i] obj he; si

N

. t = [ ], and thus

-reflex.

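-step:

Use Lemma 4.3.2.

$\rightarrow^{*\;imd}_{obj\,\langle e;s\rangle}$-trans:

We have $\langle e; s\rangle \rightarrow^{*\,t'} \langle e''; s''\rangle$ and $\langle e''; s''\rangle \rightarrow^{*\,t''} \langle e'; s'\rangle$ with $t = t' + t''$. By induction, we have

\[
\llbracket \langle e; s\rangle \rrbracket^{imd\;r}_{obj\,\langle e;s\rangle\,N} \overset{t'}{=} \llbracket \langle e''; s''\rangle \rrbracket^{imd\;r}_{obj\,\langle e;s\rangle\,N}
\quad\text{and}\quad
\llbracket \langle e''; s''\rangle \rrbracket^{imd\;r}_{obj\,\langle e;s\rangle\,N} \overset{t''}{=} \llbracket \langle e'; s'\rangle \rrbracket^{imd\;r}_{obj\,\langle e;s\rangle\,N}.
\]

The result follows by rule $=^{imd}_{mon\,\langle e;s\rangle}$-trans because appending of traces commutes with their translation.

Proof: Lemma 4.3.1. imd [ e] q: obj imd [ e]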

qN

[[ e]q ] obj

imd [ e ]

is running [[ e]q ] obj

eN

. running [ e] is a monadic language atomic program-

expression evaluation context, so this reduces to the expression case. imd [ e] e: obj

We proceed by induction on object-language evaluation contexts. We show only particular cases; the others are similar. [ e]: [ e] is a monadic language evaluation context. ∗

let x= →

[ e]

e in e:

By induction, [→ ∗

let x1 = [→

∗

imd

e ] obj eN evaluates to a monadic expression evaluation context, so

[ e]

imd

e ] obj eN in e does as well.

[ e]

→∗ [ e]

e e:

By induction, [→ ∗

let x1 = [→

∗

imd

imd

e ] obj eN evaluates to a monadic expression evaluation context, so

[ e]

imd

e ] obj eN in let x2 = [e ] obj eN in x1 x2 does as well.

[ e]


∗

v→


[ e]

e:

h let x1

[] ∗ ; si ⇀ hlet x2 = [[→ [

= return v

in let x2 = [→

∗

imd obj eN

[ e]

e]

imd eN

e]

e ]]obj

in v x2 ; si

in x1 x2 . By induction, we

imd he; si mon ∗

have that [→ ∗

x2 = [→

[ e]

imd obj eN

[ e]

e]

imd obj eN

e]

evaluates to a monadic expression evaluation context, so let

in v x2 is as well.

We require propositions asserting that translation may be commuted with each form of substitution.

Proposition 4.3.1 (Translation Respects Substitution of Values).

\[
\llbracket e \rrbracket^{imd\;r}_{obj\,e\,N}\,[x := \llbracket v \rrbracket^{imd\;r}_{obj\,v\,N}] = \llbracket e[x := v] \rrbracket^{imd\;r}_{obj\,e\,N}
\]

Proposition 4.3.2 (Translation Respects Region Allocation).

\[
\llbracket e_0 \rrbracket^{imd\;r_1 \rho_0}_{obj\,e\,N} = \llbracket e_0[\rho_0 := r_0] \rrbracket^{imd\;r_1 r_0}_{obj\,e\,N}
\]

Proposition 4.3.3 (Translation Respects Region Deallocation).

\[
\llbracket v_0 \rrbracket^{imd\;r_1 r_0}_{obj\,v\,N} = \llbracket v_0[r_0 := \emptyset] \rrbracket^{imd\;r_1}_{obj\,v\,N}
\]

We use the following proposition freely and without explicit comment. Translating values at an extended index does not change the result.

Proposition 4.3.4 (Translation of Values is Conservative).

\[
\llbracket v^{r_1} \rrbracket^{imd\;r_1}_{obj\,v\,N} = \llbracket v^{r_1} \rrbracket^{imd\;r_1 r_2}_{obj\,v\,N}
\]

Proof: Lemma 4.3.2. We proceed by case analysis on object language reductions. In each case, we again elide the details of how the derivations are filled out using ⇀ imd he; si obj

-let:

[] hlet x= v in e; si ⇀ he [ x:= v] ; si imd he; sir obj imd obj he;

[hlet x = v in e; si ] s∈

imd r r3 ). obj s

imd r/r3 N obj s

[s ]

→∗ imd he; si obj

r

si N

-step and

→∗ imd he; si obj

-trans.

: imd ̺N obj

= hlet x= return[r[ ]]

In the monadic language, by

⇀

imd

[v ] obj v

imd he; si mon

r

N

imd

in [e ] obj e

r

N

imd r obj e N

-let we get h[[e ]

imd

; [s ] obj s

r/r3

N

i (for

imd r obj v N

[ x := [v ]

];

i with an empty trace. Then by Proposition 4.3.1 and the definition of transla-

tion of configurations, that configuration is the translation of the object language residual. The result follows by ⇀ imd he; si obj

-alloc:

→∗

imd he; si mon

-step.

[(alloc @ a0 )] h@ r0 d0 ; si ⇀ ha0 ; s′ i imd he; sir = r1 r0 r2 obj

: imd

where s′ = s {a0 7→ d0 } and a0 = hr0 , oi. [h@ r0 d0 ; si] obj he; si

r

N

imd

= h[[@ r0 d0 ] obj e

r

N

imd

; [s ] obj s

r

N

i.

Translation

s = ∅ {r1 7→ ≬s 1 } {r0 7→ ≬s 0 } {r2 7→ ≬s 2 } (by the proof of Lemma 4.1.1) and thus we have imd

[s ] obj s

r

imd ≬s r1 N

N

= ∅ {[[≬s 1 ] obj

imd ≬ r1 r0

imd r obj e N

If d0 = href v0 i, [@ r0 d0 ] ⇀

we apply

imd he; si mon

s

} {[[≬s 0 ] obj

N

imd ≬s r1 r0 r2 N

} {[[≬s 2 ] obj

imd ̺N [r [ ]]obj

= let x= return

-let we get hreturn[r[ 2

imd ̺N ]]obj

[v0 ]

imd r obj v N

href [v0 ]

i; [s ]

[r [2 plying imd mon he; si -alloc within the evaluation context hreturn imd ≬s r1 N obj

ˆ [ 6 ⊚s]. ǫi then yields hreturn imd ≬s r1 r0 r2 N obj

{[[≬s 2 ]

o; ∅ {[[≬s 1 ]

imd ̺N obj

in return[r[ 2]]

imd r obj s N

⇀

imd ̺N [r [ ]]obj

} ˆ .ǫ. ǫ. ǫ. ǫ.

imd r obj v N

imd ̺N ]]obj

href xi. If

i with empty trace. Apimd ≬s r1 r0 r2 N

[ e]; [ ⊚s] {[[≬s 2 ] obj

imd ≬ r1 r0 N obj s

imd r obj v N

}{[[≬s 0 ]

{o 7→ href[[v0 ]

}

i}}

} ˆ .ǫ. ǫ. ǫ. ǫi (the translation of the object language residual) with a trace

imd ̺N [r [ 2]]obj

of [return

(alloc @ o)] (the translation of the object language trace). imd

If d0 = hλx.e0 i, [@ r0 d0 ] obj e

r

N

imd ̺N obj

= return[r[ 2]]

imd

hλx.[[e0 ] obj e

imd ̺N obj

imd ̺N obj

imd ≬s r1 N

o; ∅ {[[≬s 1 ] obj

imd ≬ r1 r0

} {[[≬s 0 ] obj

s

N

[ e]; [ ⊚s] {[[≬s 2 ] obj

N

⇀

i. Applying

imd ≬s r1 r0 r2 N

-alloc within the evaluation context hreturn[r[ 2]] yields hreturn[r[ ]]

r1 r0

imd

{o 7→ hλx.[[e0 ] obj e

r1 r0

N

imd he; si mon

} ˆ [ 6 ⊚s]. ǫi imd ≬s r1 r0 r2 N

i}} {[[≬s 2 ] obj

imd ̺N [r [ 2]]obj

ˆ .ǫ. ǫ. ǫ. ǫi (the translation of the object language residual) with a trace of [return (alloc @ o)] (the translation of the object language trace). ⇀

imd he; si obj

h@ r0 deref a0 ; si

-deref:

[(read @ a0 )] ⇀ hv0 ; si

imd he; sir = r1 r0 r2 obj

: imd

where a0 = hr0 , oi and s ( a0 ) = href v0 i. [h@ r0 deref a0 ; si] obj he; si imd r obj s N

[s ]

imd

= h[[@ r0 deref a0 ] obj e

r

N

;

i. As above, we have s = ∅ {r1 7→ ≬s 1 } {r0 7→ ≬s 0 } {r2 7→ ≬s 2 } and thus [s ] imd ≬s r1 N

imd ̺N [r [ ]]obj

x= return [s ]

N

imd r obj s N

= ∅ {[[≬s 1 ] obj

imd r obj s N

r

imd ≬ r1 r0

} {[[≬s 0 ] obj

s

N

imd ≬s r1 r0 r2 N

} {[[≬s 2 ] obj

imd ̺ N [r [ 2]]obj

o in return

deref x.

imd

} ˆ .ǫ. ǫ. ǫ. ǫ. [@ r0 deref a0 ] obj e

⇀

By

imd he; si mon

-let we get hreturn[r[ 2

imd r1 r0 N obj d

[r [2 Applying imd mon he; si -deref within the evaluation context hreturn imd ̺N [r [ ]]obj

ˆ [ 6 ⊚s]. ǫi yields hreturn

imd r1 r0 N obj v

[v0 ]

guage residual) with a trace of [return[r[ 2

imd r obj s N

; [s ]

imd ̺N ]]obj

N

= let

deref o; imd

at ≬s 0 ( o) yields href [v0 ] obj v

i with empty trace. The translation [href v0 i] ⇀

imd ̺ N ]]obj

r

imd ̺N ]]obj

r1 r0

N

imd ≬s r1 r0 r2 N obj

[ e]; [ ⊚s ] {[[≬s 2 ]

i (the translation of the object lan-

(read @ o)] (the translation of the object lan-

guage trace). ⇀ imd he; si obj

-set:

h@ r0 set a0 to v0 ; si

[(write @ a0 )] ⇀ hunit; s′ i

imd he; sir = r1 r0 r2 obj

imd

href v0 i}. We translate [h@ r0 set a0 to v0 ; si] obj he; si

where a0 = hr0 , oi and s′ = s {a0 imd

= h[[@ r0 set a0 to v0 ] obj e

r

N

imd

r

N

i.

imd

r

; [s ] obj s

proof of Lemma 4.1.1) and thus [s ] obj s imd r obj e N

ˆ .ǫ. ǫ. ǫ. ǫ. [@ r0 set a0 to v0 ] in return[r[ 2

imd ̺N ]]obj imd

set o to [v0 ] obj v

r

:

s = ∅ {r1 7→ ≬s 1 } {r0 7→ N

imd ≬s r1 N

= ∅ {[[≬s 1 ] obj

imd ̺N [r [ ]]obj

= let x.1 = return

r

N

≬

s 0 } {r2 7→ ≬s 2 } (by the

imd ≬ r1 r0

} {[[≬s 0 ] obj

s

N

imd ≬s r1 r0 r2 N

} {[[≬s 2 ] obj imd ̺N [r [ ]]obj

o in let x.2 = return

[v0 ]

imd ̺N obj

⇀

[r [ 2]] set x.1 to x.2 . Thus, by two applications of imd mon he; si -let we obtain hreturn N

imd

; [s ] obj s

r

N

}

imd r obj v N

imd

i with empty trace. The translation [href v0.1 i ] obj d

r1 r0

N

at ≬s 0


i.

}

}


imd

( o) gives rise to href [v0.1 ] obj v hreturn[r[ 2

imd ̺N ]]obj

r1 r0

N

i. Applying the rule

imd ≬s r1 r0 r2 N obj

[ e]; [ ⊚s] {[[≬s 2 ]

imd ≬s r1 N

unit; ∅ {[[≬s 1 ] obj

imd ≬ r1 r0

s

} {[[≬s 0 ] obj


N

⇀

-set within the context

imd he; si mon

imd ̺N obj

} ˆ [ 6 ⊚s ]. ǫi yields the configuration hreturn[r[ ]] imd

[v0 ] obj v

{o

r

N

imd ≬s r1 r0 r2 N

}} {[[≬s 2 ] obj

} ˆ .ǫ. ǫ. ǫ. ǫi (the

imd ̺N [r [ 2]]obj

translation of the object language residual) with a trace of [return

(write @ o)] (the

translation of the object language trace). ⇀

imd he; si obj

-app-λ:

ha0 v0 ; si

[(exec @ a0 )] ⇀ he0 [ x:= imd he; sir = r1 r0 r2 obj

v0 ]; si

: imd

where a0 = hr0 , oi and s ( a0 ) = hλx.e0 i. The translation [h@ r0 a0 v0 ; si] obj he; si imd

= h[[@ r0 a0 v0 ] obj e

r

N

imd

; [s ] obj s

r

N

imd

ˆ .ǫ. ǫ. ǫ. ǫ. [@ r0 a0 v0 ] imd ̺ N [r [ 2]]obj

return

imd

(o [v0 ] obj v

r

N

imd ≬s r1 N

r

imd

imd

yields hλx.[[e0 ] obj e

= let x.1 = return

s

N

imd ≬s r1 r0 r2 N

} {[[≬s 2 ] obj

imd ̺N [r [ ]]obj

o in let x.2 = return ⇀ imd he; si mon

i with empty trace. The translation [hλx.e0 i ] obj d

imd

r1 r0

N

i. Applying

N

imd ̺N ]]obj

at ≬s 0 ( o)

-app-λ within the context hreturn[r[ 2]]

} ˆ [ 6 ⊚s]. ǫi yields h[[e0 ] imd ≬s r1 r0 r2 N

r1 r0

imd r obj v N

[ x := [v0 ]

}

in

[v0 ]

imd ̺N obj

⇀ imd he; si mon

imd r1 r0 N obj e

} {[[≬s 2 ] obj

imd r obj v N

-let we obtain hreturn[r[ 2

N

[ ⊚s] {[[≬s 2 ]

imd ≬ r1 r0 N obj s

imd ≬ r1 r0

} {[[≬s 0 ] obj

r

imd ≬s r1 r0 r2 N obj

{[[≬s 0 ]

imd ̺N [r [ ]]obj

x.1 x.2 . Thus, by two applications of

); [s ] obj s

N

i. s = ∅ {r1 7→ ≬s 1 } {r0 7→ ≬s 0 } {r2 7→ ≬s 2 } (by the proof of

Lemma 4.1.1) and thus we get [s ] obj s N= ∅ {[[≬s 1 ] obj imd r obj e N

r

imd ≬s r1 N obj

]; ∅ {[[≬s 1 ]

[ e]; }

} ˆ .ǫ. ǫ. ǫ. ǫi (by Proposition 4.3.1, the translation of the imd ̺N obj

object language residual) with trace [return[r[ 2]]

(exec @ o)] (the translation of the object

language trace). ⇀ imd he; si obj

-letregion:

[] hletregion ρ0 in e0 ; s1 i ⇀ hfreeregion r0 after e0 [ ρ0 := r0 ]; s1 {r0 7→ ∅}i imd he; sir1 obj imd obj he;

[hletregion ρ0 in e0 ; s1 i ]

r1

si

N

imd r1 N obj e

= h[[letregion ρ0 in e0 ]

imd r1 N obj e

Lemma 4.1.1). [letregion ρ0 in e0 ] imd ≬s r1 N

= ∅ {[[≬s 1 ] obj

} ˆ .ǫ. ǫ. Applying

imd r1 ρ0 N obj e

= run [e0 ]

⇀

imd he; si mon

imd r1 N obj s

; [s1 ]

:

i (by the proof of imd

. s1 = ∅ {r1 7→ ≬s 1 } and [s1 ] obj s imd

-run yields hrunning [e0 ] obj e

r1 ρ0

N

r1

N

imd ≬s r1 N

; ∅ {[[≬s 1 ] obj

ˆ .ǫ. ǫ∅i (by Proposition 4.3.2 and the definition of &, the translation of the object language residual) with trace [] (the translation of the object language trace). ⇀ imd he; si obj

-freeregion:

[] hfreeregion r0 after v0 ; s1 {r0 7→ ≬s 0 }i ⇀ hv0 [ r0 := ∅]; s1 i imd he; sir1 obj

imd obj he;

Translate [hfreeregion r0 after v0 ; s1 {r0 7→ ≬s 0 }i] imd r1 /r0 N obj s

[s1 {r0 7→ ≬s 0 } ]

imd ≬ r1 r0 N obj s

ˆ .ǫ.[[≬s 0 ]


r1 /r0

N

imd

= h[[freeregion r0 after v0 ] obj e

r1

N

i (by the proof of Lemma 4.1.1). [freeregion r0 after v0 ]

running returnr1 r0 [v0 ] imd

N

:

imd r1 N obj e

imd r1 r0 N obj v

[s1 {r0 7→ ≬s 0 } ] obj s

si

r1

=

. The store s1 = ∅ {r1 7→ ≬s 1 } and the translation of the store imd ≬s r1 N

= &(∅ {[[≬s 1 ] obj

. Applying

;

⇀

imd he; si mon

imd ≬ r1 r0

} {[[≬s 0 ] obj

s

N

imd ≬s r1 N

} ˆ .ǫ. ǫ. ǫ) = ∅ {[[≬s 1 ] obj imd r1 r0 N obj v

-running yields hreturnr1 [v0 ]

imd ≬s r1 N obj

; ∅ {[[≬s 1 ]

} }

}


ˆ .ǫ. ǫi (by Proposition 4.3.3 and the definition of &, the translation of the object language residual) with trace [] (the translation of the object language trace).

We do not present a directional result analogous to Theorem 2.3.3. It is precisely the information differential described at the beginning of this section that makes such a result undesirable: it would force monadic language reduction to be less efficient than it otherwise might be, so as to conform to the object language.

The reader may now take a deep breath! You have completed the core material of this work. You have seen how a language with encapsulation can be implemented in either a direct or a monadic setting. The monadic implementation maintains a tree-structured store so as to minimize the length of the sequence of visible region stores and thus the number of applications of monad transformers. You have seen how the translation operates in building a store structured as a linear tree. As you may appreciate or apprehend, depending on your propensity to fill in the details yourself, the remainder of this work uses a less formal style as we introduce various enhancements to the basic ideas already presented.
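The object-language reductions treated case by case in the preceding proof (region creation and release, allocation, dereference, and assignment, each with its trace) can be pictured with a small executable model. The following is only a sketch under stated assumptions, not the author's implementation: the class `RegionStore`, its method names, the tagged-tuple representation of storables, and the choice of the next free integer as an offset are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RegionStore:
    """A store maps region names to region stores; each region store maps
    offsets to storables. All names here are illustrative assumptions."""
    regions: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)  # visible effects, in order

    def letregion(self, r):
        # letregion extends the store with an empty region; the trace is empty.
        assert r not in self.regions, "region name must be fresh"
        self.regions[r] = {}

    def freeregion(self, r):
        # freeregion discards the whole region store; the trace is empty.
        del self.regions[r]

    def alloc(self, r, storable):
        # The new address is a pair (region, offset); records (alloc @ a0).
        offset = len(self.regions[r])  # assumption: next free integer
        self.regions[r][offset] = storable
        self.trace.append(("alloc", (r, offset)))
        return (r, offset)

    def deref(self, addr):
        # Reads the contents of a reference cell; records (read @ a0).
        r, offset = addr
        tag, value = self.regions[r][offset]
        assert tag == "ref", "only reference cells can be dereferenced"
        self.trace.append(("read", addr))
        return value

    def set(self, addr, value):
        # Overwrites a reference cell and returns unit; records (write @ a0).
        r, offset = addr
        assert self.regions[r][offset][0] == "ref"
        self.regions[r][offset] = ("ref", value)
        self.trace.append(("write", addr))
        return None


store = RegionStore()
store.letregion("r0")
cell = store.alloc("r0", ("ref", 5))
store.set(cell, store.deref(cell) + 1)
latest = store.deref(cell)
store.freeregion("r0")
```

The sketch keeps a flat dictionary of regions; it does not model the tree-structured store of the monadic language, only the visible behaviour of the object-language rules.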


Part III

Enhancing the Languages

Categorical Interlude: Strong Monads, Fixpoints, and iml-categories

Crole and Pitts [7] define a monadic semantics of recursion in the presence of Kleisli categories. We present the essence of their semantics, and add effect annotations. We also present Mitchell and Scedrov’s [42] categorical semantics of implicit polymorphism.

Monoidal Categories. A monoidal category is one with a binary tensor product and a unit object. This construction is weaker than that which we have already assumed; it does not require that projection morphisms be defined over arbitrary products nor that 1 be a terminal object. It does require the following three isomorphisms:

(1) Let $\chi^1_A : A \to A \times 1$ be the isomorphism that constructs a product with the unit object on the right. The left projection $\pi_1$ is its inverse.
(2) Similarly, let $\chi^2_A : A \to 1 \times A$ be the isomorphism that constructs a product with the unit object on the left. The right projection $\pi_2$ is its inverse.
(3) Finally, $\chi^3_{\langle A,B,C\rangle} : (A \times B) \times C \to A \times (B \times C)$ and its inverse provide associativity.

The following two diagrams must commute. The first relates the two ways of reassociating $((A \times B) \times C) \times D$ into $A \times (B \times (C \times D))$:

$$ (id_A \times \chi^3_{\langle B,C,D\rangle}) \circ \chi^3_{\langle A,B\times C,D\rangle} \circ (\chi^3_{\langle A,B,C\rangle} \times id_D) \;=\; \chi^3_{\langle A,B,C\times D\rangle} \circ \chi^3_{\langle A\times B,C,D\rangle} $$

The second relates associativity to the unit object, with both sides morphisms from $(A \times 1) \times B$ to $A \times B$:

$$ (id_A \times \pi_2) \circ \chi^3_{\langle A,1,B\rangle} \;=\; \pi_1 \times id_B $$


We can define these isomorphisms in terms of our products as follows:

(1) $\chi^1_A = id_A \,\&\, {!_A}$
(2) $\chi^2_A = {!_A} \,\&\, id_A$
(3) $\chi^3_{\langle A,B,C\rangle} = (\pi_1 \circ \pi_1) \,\&\, ((\pi_2 \circ \pi_1) \,\&\, \pi_2)$
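As a concrete reading of these three definitions, the sketch below interprets products as Python pairs and the unit object as the empty tuple. This is an illustration only; the function names (`chi1`, `chi2`, `chi3`, `cross`) are hypothetical and the check is performed on sample values rather than proved.

```python
UNIT = ()  # the single value of the unit object 1 (an assumption of this sketch)


def ident(x):
    return x


def pi1(p):
    return p[0]


def pi2(p):
    return p[1]


def chi1(a):
    """A -> A x 1: pair with the unit on the right (id & !)."""
    return (a, UNIT)


def chi2(a):
    """A -> 1 x A: pair with the unit on the left (! & id)."""
    return (UNIT, a)


def chi3(p):
    """(A x B) x C -> A x (B x C): reassociate a nested pair."""
    return (pi1(pi1(p)), (pi2(pi1(p)), pi2(p)))


def cross(f, g):
    """f x g: apply f and g to the two components of a pair."""
    return lambda p: (f(p[0]), g(p[1]))


sample = (((1, 2), 3), 4)
pentagon_left = cross(ident, chi3)(chi3(cross(chi3, ident)(sample)))
pentagon_right = chi3(chi3(sample))
```

On this sample both sides of the first coherence condition produce the same fully right-nested tuple, and `pi1(chi1(a))` and `pi2(chi2(a))` return `a`, matching the statement that the projections invert the unit isomorphisms.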

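Strong Monads. In a monoidal category, one can define a strong monad that is useful in modelling dependencies between computations. A strong monad allows one to defer a partial result to be returned as part of an existing computation. A strong monad is a monad augmented with an additional natural transformation $\tau^{M,\varepsilon}$ from $M^{\varepsilon} \times Id$ to $M^{\varepsilon} \circ (\times)$,¹² called a tensorial strength, with components $\tau^{M,\varepsilon}_{\langle A,B\rangle} : M^{\varepsilon} A \times B \to M^{\varepsilon}(A \times B)$. We require that the following four diagrams commute, written here as equations:

$$ M^{\varepsilon}\chi^3_{\langle A,B,C\rangle} \circ \tau^{M,\varepsilon}_{\langle A\times B,C\rangle} \circ (\tau^{M,\varepsilon}_{\langle A,B\rangle} \times id_C) \;=\; \tau^{M,\varepsilon}_{\langle A,B\times C\rangle} \circ \chi^3_{\langle M^{\varepsilon}A,B,C\rangle} $$

$$ M^{\varepsilon}\pi_1 \circ \tau^{M,\varepsilon}_{\langle A,1\rangle} \;=\; \pi_1 $$

$$ \tau^{M,\bot}_{\langle A,B\rangle} \circ (\eta^{M}_{A} \times id_B) \;=\; \eta^{M}_{A\times B} $$

$$ \mu^{M,\varepsilon_1,\varepsilon_2}_{A\times B} \circ M^{\varepsilon_1}\tau^{M,\varepsilon_2}_{\langle A,B\rangle} \circ \tau^{M,\varepsilon_1}_{\langle M^{\varepsilon_2}A,B\rangle} \;=\; \tau^{M,\varepsilon_1\sqcup\varepsilon_2}_{\langle A,B\rangle} \circ (\mu^{M,\varepsilon_1,\varepsilon_2}_{A} \times id_B) $$

They require that $\tau^{M,\varepsilon}$ commute with $\chi^3$, $\pi_1$, $\eta^{M}$, and $\mu^{M,\varepsilon_1,\varepsilon_2}$, respectively.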
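To make the tensorial strength tangible, the sketch below instantiates it for the list monad in Python and checks three of the four conditions on sample data. It is an illustration under assumptions: effect annotations are dropped entirely, the list monad merely stands in for $M$, and the names `unit`, `fmap`, `join`, and `strength` are hypothetical.

```python
def unit(a):
    """eta: wrap a value as a trivial computation."""
    return [a]


def fmap(f, ma):
    """The action of the functor M on a function f."""
    return [f(a) for a in ma]


def join(mma):
    """mu: flatten a computation of computations."""
    return [a for ma in mma for a in ma]


def strength(pair):
    """tau: M A x B -> M (A x B), pairing every result with the plain value."""
    ma, b = pair
    return [(a, b) for a in ma]


nested = [[1, 2], [3]]
# Left side of the multiplication condition: strength twice, then flatten.
mu_left = join(fmap(strength, strength((nested, "b"))))
# Right side: flatten first, then apply strength once.
mu_right = strength((join(nested), "b"))
```

Both `mu_left` and `mu_right` pair each of the three results with `"b"`, which is the content of the fourth condition for this monad. The argument order (computation first, plain value second) follows the convention stated in footnote 12.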

Exponentials Revisited. We presented exponential objects in Part I as a way of representing abstractions in programming languages. Because morphisms between objects represent parameterized terms, there is a correspondence between a morphism and its representation as an abstraction. For any morphism $f : A \to B$, a representation of $f$, $\mathrm{rep}\,f : 1 \to B^{A}$, is defined up to isomorphism to be $\overbrace{(f \circ \pi_2)}$. Given a representation $\mathrm{rep}\,f$, we can regain the original morphism as $\mathrm{abs}\,(\mathrm{rep}\,f) = f = \underbrace{(\mathrm{rep}\,f)} \circ \chi$. Now, given morphisms $f : B^{A} \to D^{C}$ and $g : A \to B$, we define $f(g) : C \to D$ as $\mathrm{abs}\,(\mathrm{apply} \circ (\mathrm{rep}\,f \,\&\, \mathrm{rep}\,g))$. This operation provides a “nesting of morphisms” using exponentials.

¹² For convenience, we swap the order of the arguments from that of convention. A commutative strong monad would support both.

Algebras. For an effect-annotated endofunctor $F^{\varepsilon} :_{Cat} C \to C$, an algebra is an object $A$ of $C$ and a morphism $\alpha^{\varepsilon} :_{C} F^{\varepsilon} A \to A$. Intuitively, the endofunctor $F^{\varepsilon}$ adds structure and the morphism $\alpha^{\varepsilon}$ interprets away that structure for a particular object $A$. Recall that our endofunctors generally represent data structures, and in the case of monads, computations. Algebras then interpret a “layer” of computation for some value type. Because we are interested in the case where $F$ is a monad $M$, we also require that algebras respect $\eta^{M}$ and $\mu^{M,\varepsilon_1,\varepsilon_2}$. In particular, we require that interpreting a trivial computation yields the original value, and that flattening a computation of a computation is equivalent to interpreting, innermost computation first:

$$ \alpha^{\bot} \circ \eta^{M}_{A} \;=\; id_A $$

$$ \alpha^{\varepsilon_1 \sqcup \varepsilon_2} \circ \mu^{M,\varepsilon_1,\varepsilon_2}_{A} \;=\; \alpha^{\varepsilon_1} \circ M^{\varepsilon_1}\alpha^{\varepsilon_2} $$

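An algebra homomorphism is a structure-preserving mapping between algebras on the same endofunctor. Formally, a homomorphism from algebra $\langle A, \alpha^{\varepsilon}\rangle$ to $\langle B, \beta^{\varepsilon}\rangle$ is a morphism $f :_{C} A \to B$ such that $f \circ \alpha^{\varepsilon} = \beta^{\varepsilon} \circ M^{\varepsilon} f$, that is, such that the following square commutes in $C$:

$$ M^{\varepsilon}A \xrightarrow{\;M^{\varepsilon} f\;} M^{\varepsilon}B \xrightarrow{\;\beta^{\varepsilon}\;} B \quad=\quad M^{\varepsilon}A \xrightarrow{\;\alpha^{\varepsilon}\;} A \xrightarrow{\;f\;} B $$

The same homomorphism may be drawn in $C^{M}$ as a single morphism $f$ from the algebra $\alpha^{\varepsilon} : M^{\varepsilon}A \to A$ to the algebra $\beta^{\varepsilon} : M^{\varepsilon}B \to B$.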
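The algebra conditions and the homomorphism condition can be checked on a familiar instance. The sketch below takes the list monad for $M$ (without effect annotations) and summation of integers for the algebra; these choices, and the names `alpha` and `double`, are illustrative assumptions rather than anything taken from the text.

```python
def unit(a):
    """eta: a trivial computation."""
    return [a]


def fmap(f, ma):
    """The action of M on a function."""
    return [f(a) for a in ma]


def join(mma):
    """mu: flatten one layer of computation."""
    return [a for ma in mma for a in ma]


def alpha(ints):
    """An algebra on integers: interpret a list layer by summing it."""
    return sum(ints)


def double(n):
    """A candidate homomorphism from the algebra (int, alpha) to itself."""
    return 2 * n


layers = [[1, 2], [3, 4, 5]]
# Flatten then interpret, versus interpret innermost layers first.
flatten_then_interpret = alpha(join(layers))
interpret_inner_first = alpha(fmap(alpha, layers))
```

Summation satisfies both algebra conditions on this sample, and `double` commutes with it: doubling a sum equals summing the doubled elements, which is the equation $f \circ \alpha = \beta \circ M f$ with $\beta = \alpha$.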


Algebra homomorphisms support identities and associative composition, so we define a category $C^{M}$ of algebras. Thus, for us, an algebra homomorphism is a mapping between types that preserves the interpretation of computations. We defined initial objects $0_{C}$ in the category $C$ to represent empty types. We refer to an initial object in $C^{M}$ as an initial algebra. Thus, for an initial algebra $0_{C^{M}} = \langle I, \sigma^{\varepsilon}\rangle$ and any algebra $\langle A, \alpha^{\varepsilon}\rangle$, there is exactly one homomorphism $¡^{C^{M}}_{\langle A, \alpha^{\varepsilon}\rangle}$ from $\langle I, \sigma^{\varepsilon}\rangle$ to $\langle A, \alpha^{\varepsilon}\rangle$. Intuitively, an initial algebra is a trivial interpretation that leaves the structure in place. The constructors of data structures, which merely repackage their input, are examples.

Equalizers. Equalizers are an additional universal construction, also defined up to isomorphism. For any morphisms $h, k : A \to B$, there is an equalizer object $h\,\mathrm{Eq}\,k$ and an equalizing morphism $\epsilon_{h,k} : h\,\mathrm{Eq}\,k \to A$ satisfying $h \circ \epsilon_{h,k} = k \circ \epsilon_{h,k}$.¹³ For any object $C$ and morphism $f : C \to A$ with $h \circ f = k \circ f$, there is a morphism $f^{=h,k} : C \to h\,\mathrm{Eq}\,k$ such that $\epsilon_{h,k} \circ f^{=h,k} = f$, and we consider all such morphisms to be indistinguishable.

The dual notion of coequalizer can be used to represent equivalence relations.

¹³ Compare this construction with the diagram for products in Part I. There are now two morphisms, $h$ and $k$, which must participate in the commuting diagram. The equalizer corresponds to $\pi_1$. There is no explicit morphism corresponding to $\pi_2$; it is the composition $h \circ \epsilon_{h,k} = k \circ \epsilon_{h,k}$. Similarly, our mediating morphism depends only upon a single morphism $f$. We could use our two-sided commuting diagrams, but would have to restrict the functor to be unary in the object and binary in the morphism. For simplicity, we refrain from doing so.


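Recursion via Fixpoint Objects. We extend Crole and Pitts in defining a fixpoint object as an initial algebra $\langle \mathrm{Fix}, \sigma^{\varepsilon}\rangle$ on an annotated monad $M^{\varepsilon}$ (i.e., $\sigma^{\varepsilon} :_{C} M^{\varepsilon}\mathrm{Fix} \to \mathrm{Fix}$), and a global element $\omega :_{C} 1 \to M^{\varepsilon}\mathrm{Fix}$ which is the equalizer of $\iota^{M,\bot,\varepsilon}_{\mathrm{Fix}} \circ \eta^{M}_{\mathrm{Fix}} \circ \sigma^{\varepsilon}$ and $id_{M^{\varepsilon}\mathrm{Fix}}$. In other words, $\omega$ is the morphism that is not modified by composition into $\eta^{M}_{\mathrm{Fix}} \circ \sigma^{\varepsilon}$. Crole and Pitts required that it be the equalizer of $\eta^{M}_{\mathrm{Fix}} \circ \sigma$ and $id_{M\,\mathrm{Fix}}$; because $\eta^{M}$ generates computations with empty effect, we conjure up the injections $\iota^{M,\varepsilon_1,\varepsilon_2}$ to allow computations to be treated as if they had greater effects.¹⁴ For any $g : A \to M^{\varepsilon}\mathrm{Fix}$, $g = \eta^{M}_{\mathrm{Fix}} \circ \sigma^{\varepsilon} \circ g$ exactly when $g = \omega \circ {!_A}$. We will require that our monad transformers preserve the existence of a fixpoint object.

Crole and Pitts describe how, in the presence of a fixpoint object, one may define a combinator to interpret recursively defined programs. We add additional detail and effect annotations to their presentation. To start, we have that

$$ y_{A,B} :_{C} ((M^{\varepsilon}B)^{A})^{((M^{\varepsilon}B)^{A})} \to (M^{\varepsilon}B)^{A} $$

satisfies $y_{A,B}(f) = f(y_{A,B}(f))$ for all $f :_{C} (M^{\varepsilon}B)^{A} \to (M^{\varepsilon}B)^{A}$. The definition requires an algebra $\langle (M^{\varepsilon}B)^{A}, \varphi^{\varepsilon}_{f}\rangle$ with $\varphi^{\varepsilon}_{f} : M^{\varepsilon}((M^{\varepsilon}B)^{A}) \to (M^{\varepsilon}B)^{A}$, where

$$ \varphi^{\varepsilon}_{f} \;=\; \overbrace{\bigl(\underbrace{f}\bigr)^{*\,M,\varepsilon,\varepsilon}_{((M^{\varepsilon}B)^{A})\times A,\;B} \circ\; \tau^{M,\varepsilon}_{\langle (M^{\varepsilon}B)^{A},\,A\rangle}} $$

We then have $¡^{C^{M}}_{\langle (M^{\varepsilon}B)^{A},\,\varphi^{\varepsilon}_{f}\rangle} : \mathrm{Fix} \to (M^{\varepsilon}B)^{A}$, so $\mathrm{abs}\,(¡^{C^{M}}_{\langle (M^{\varepsilon}B)^{A},\,\varphi^{\varepsilon}_{f}\rangle} \circ \sigma^{\varepsilon} \circ \omega) : A \to M^{\varepsilon}B$. $y_{A,B}\,f$ can then be defined as an abstraction of this morphism over the morphism $f$.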
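The defining equation $y_{A,B}(f) = f(y_{A,B}(f))$ can be observed operationally. The sketch below is not the categorical construction via the initial algebra; it is a plain fixpoint combinator for an eager language, with computations taken to be ordinary Python results (an assumption), and with `fix` and `fact_step` as hypothetical names.

```python
def fix(f):
    """Return g such that g(a) == f(g)(a) for every argument a.

    The recursive reference is delayed inside g, so unfolding happens only
    when g is applied; this is what makes the equation usable under eager
    evaluation.
    """
    def g(a):
        return f(g)(a)
    return g


def fact_step(rec):
    """One unfolding of factorial, parameterized over the recursive call."""
    return lambda n: 1 if n == 0 else n * rec(n - 1)


factorial = fix(fact_step)
```

Applying `fact_step` to `factorial` yields a function that agrees with `factorial` on every input tried, which is the fixpoint equation read pointwise.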

Implicit Polymorphism via iml-categories. Polymorphism is a feature of programming languages under which a single definition applies to different types of data. Under parametric polymorphism, the definitions are general and do not refer to any particular types. Implicit polymorphism is parametric polymorphism in which the types to which a definition applies are determined by context. This is possible in predicative polymorphic systems, in which definitions apply to “flat” types, rather than polymorphic ones. Mitchell and Scedrov [42] define iml-categories¹⁵ and describe how an iml-category may be derived from any cartesian closed category. An iml-category is given by:

• A base category $B$ with finite products, generated by some distinguished object $V$. The object $V$ represents the kind Type of all types, so the objects of $B$ represent tuples of types. A morphism of $B$ represents a tuple of types, parameterized by a tuple of type variables.
• A functor $F :_{Cat} B^{op} \to Cat$ that assigns a cartesian closed category $F_n$ to each object $V^{n}$ of $B$. The objects of this category will be the set of morphisms from $V^{n}$ to $V$ in $B$, representing types parameterized by a tuple of type variables. The morphisms of this category will represent typings of polymorphic terms, i.e., terms of a parameterized type, defined over an environment of variables of parameterized type. To each morphism $f : V^{m} \to V^{n}$ of $B$, $F$ assigns a ccc-representation from $F_n$ to $F_m$. This functor represents the substitution of the $n$ types over $m$ type variables for the $n$ type variables of an object or morphism of $F_n$.

Given any cartesian closed category $C$, we can define $C_{iml}$. This allows us to add implicit polymorphism to our existing model. Let the base category $B$ have natural numbers as objects and functions over tuples of objects of $C$ as morphisms. $F_n$ is the functor category from the category of $n$-tuples of objects of $C$ to $C$. For a morphism $h :_{B} m \to n$, i.e., a function from an $m$-tuple of $C$-objects to an $n$-tuple of $C$-objects, $F\,h :_{Cat} F_n \to F_m$ is defined such that for $g$ a function from $n$-tuples of $C$-objects to $C$-objects, $F\,h\,g = g \circ h$. For a natural transformation $t :_{F_n} g_1 \to g_2$, $F\,h\,t :_{F_m} g_1 \circ h \to g_2 \circ h$ is defined such that $(F\,h\,t)_{C^{m}} = t_{h\,C^{m}}$.

¹⁴ See Section 2 of the Conclusion.
¹⁵ The name is a reference to the implicit polymorphism that is typified by the ML programming language.
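The action $F\,h\,g = g \circ h$ is simply precomposition, which the sketch below spells out on the object part only. Types are represented as tagged Python tuples and parameterized types as functions from tuples of types to types; this representation and the names `reindex`, `fun_type`, `h`, and `k` are assumptions made for illustration, and morphisms of $F_n$ (typings) are not modelled.

```python
def reindex(h):
    """F h: turn a type over n variables into a type over m variables
    by substituting the n types produced by h."""
    return lambda g: (lambda types: g(h(types)))


def fun_type(types):
    """An object of F_2: the function type from the first variable to the second."""
    return ("fun", types[0], types[1])


def h(types):
    """A morphism 1 -> 2 of the base category: X |-> (X, Ref X)."""
    return (types[0], ("ref", types[0]))


def k(types):
    """A morphism 1 -> 1 of the base category: X |-> Ref X."""
    return (("ref", types[0]),)


substituted = reindex(h)(fun_type)       # an object of F_1
at_int = substituted(("Int",))           # instantiate its single variable
```

Because the functor is contravariant, reindexing along the composite of `h` after `k` agrees with reindexing along `h` and then along `k`, which the accompanying assertions check on a sample type.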


CHAPTER 5

Source Languages

Our source language presents the following enhancements:

• Heap-allocated constants including constant functions.
• Heap-allocated, recursive functions.
• Implicit storable type let-polymorphism.
• A weak form of effect-polymorphism through allowing a function to be considered to have a greater latent effect than it actually does.

Although for brevity we present a single language with all of these enhancements, they are in fact independent. Our object languages in Sections 5.1 and 6.1 now differ more from that of Calcagno, Helsen and Thiemann [5]. Recursion and implicit (Curry-style) type polymorphism were considered by Helsen and Thiemann in their stateless semantics [22]; here we adapt recursion and implicit type polymorphism to a semantics with state. Other researchers have presented similar languages, enhanced in various ways. Filliâtre [16] defines an object language that is restricted to prevent aliasing. Elsman [10] defines an object language that is restricted to prevent dangling pointers.

1. Object Language Statics

An enhanced object source language syntax is presented in Figure 5.1.1. Prestorables now include constants and recursive lambda abstractions. Constants c are provided in a new primitive syntactic class. The same application form is used for recursive and standard user-defined functions and for constant functions. We assume a large supply of constant functions that operate over various regions in a program. The form @ ρ0 ⟨µx1@ρ0. λx2. e⟩ allocates a (recursive) function x1 with formal parameter x2. The region of allocation ρ0 is duplicated to simplify the grammar. The function name and formal parameter must be distinct and both are bound in the function body. Recursive functions are applied just as nonrecursive ones.

$$
\begin{array}{rcl}
q & ::= & e^{\epsilon} \\
e^{\rho} & ::= & {}^{\forall}e^{\rho} \;\mid\; {}^{\not\forall}e^{\rho} \\
{}^{\forall}e^{\rho} & ::= & p^{\rho} \\
{}^{\forall}e^{\rho_1\rho_0\rho_2} & ::= & @\,\rho_0\; {}^{\forall}b^{\rho_1\rho_0/\rho_2} \\
{}^{\not\forall}e^{\rho_1} & ::= & \mathbf{let}\ x = e^{\rho_1}\ \mathbf{in}\ e^{\rho_1} \;\mid\; \mathbf{letregion}\ \rho_0\ \mathbf{in}\ e^{\rho_1\rho_0} \\
{}^{\not\forall}e^{\rho_1\rho_0\rho_2} & ::= & @\,\rho_0\; {}^{\not\forall}b^{\rho_1\rho_0/\rho_2} \;\mid\; @\,\rho_0\ \mathbf{deref}\ e^{\rho} \;\mid\; @\,\rho_0\ \mathbf{set}\ e^{\rho}\ \mathbf{to}\ e^{\rho} \;\mid\; @\,\rho_0\ e^{\rho}\ e^{\rho} \\
b^{\rho_1\rho_0/\rho_2} & ::= & {}^{\forall}b^{\rho_1\rho_0/\rho_2} \;\mid\; {}^{\not\forall}b^{\rho_1\rho_0/\rho_2} \\
{}^{\forall}b^{\rho_1\rho_0/\rho_2} & ::= & \langle c\rangle \;\mid\; \langle \lambda x.\,e^{\rho_1\rho_0}\rangle \;\mid\; \langle \mu x@\rho_0.\ \lambda x.\ e^{\rho_1\rho_0}\rangle \\
{}^{\not\forall}b^{\rho_1\rho_0/\rho_2} & ::= & \langle \mathbf{ref}\ e^{\rho_1\rho_0\rho_2}\rangle \\
p^{\rho} & ::= & v^{\rho} \;\mid\; x \\
v^{\rho} & ::= & g \;\mid\; \emptyset \\[4pt]
e_1;\ e_2 & \to & \mathbf{let}\ x = e_1\ \mathbf{in}\ e_2 \qquad (x \notin \mathrm{fpv}(e_2))
\end{array}
$$

Here $\rho_0$ ranges over regions, $x$ over program variables, $c$ over constants, and $g$ over global constants.

Figure 5.1.1. Object Source Language Syntax

Expressions and prestorables are defined in two classes (${}^{\forall}e$, ${}^{\not\forall}e$) and (${}^{\forall}b$, ${}^{\not\forall}b$), respectively, to specify whether or not they are safe for generalization [41, 33]. An expression

is safe for generalization if it can be bound to a variable whose type allows distinct references to use different types. Clearly, it is unsafe to assume a cell to have one type when setting its value and another when dereferencing it (see untypable program #3 of Figure 5.1.3 below). Constants and functions, however, can be assumed to have whatever type is convenient. A larger class of safe expressions could have been provided [55].

Figure 5.1.2 demonstrates some typable programs. We quickly see that to be useful, constant functions would be region-polymorphic. We do not include that feature formally, but instead assume a rich supply of awkwardly-named constants.

(1) letregion ρ in @ ρ @ ρ ⟨display@ρ⟩ @ ρ @ ρ ⟨add1@ρ⟩ @ ρ ⟨5⟩

(2) letregion ρ1 in
    let x1 = ⟨µx1@ρ1. λx2. letregion ρ2 in
               @ ρ2 @ ρ2 @ ρ2 @ ρ2 @ ρ2 ⟨if@ρ2⟩ (@ ρ2 ⟨zero?@ρ1⟩ x2)
                 @ ρ2 ⟨λx.@ ρ1 ⟨0⟩⟩
                 @ ρ2 ⟨λx.let x3 = @ ρ1 x1 @ ρ2 @ ρ2 ⟨sub1@ρ1⟩ x2 in @ ρ2 @ ρ2 ⟨add1@ρ1⟩ x3⟩
                 unit⟩
    in @ ρ1 @ ρ1 ⟨display@ρ1⟩ (@ ρ1 x1 @ ρ1 ⟨1⟩)

(3) letregion ρ in let x = @ ρ ⟨λx.x⟩ in @ ρ x @ ρ ⟨5⟩; @ ρ x @ ρ ⟨ref unit⟩

(4) letregion ρ in let x1 = @ ρ ⟨ref @ ρ ⟨λx.true⟩⟩ in let x2 = @ ρ deref x1 in set x1 to @ ρ ⟨λx.@ ρ x2 x⟩; @ ρ (@ ρ deref x1) false

(5) letregion ρ in let x0 = @ ρ ⟨λx.@ ρ x unit⟩ in let x1 = @ ρ ⟨λx.@ ρ ⟨ref unit⟩⟩ in let x2 = @ ρ ⟨λx.@ ρ x1 unit⟩ in @ ρ x0 x1; @ ρ x0 x2

Figure 5.1.2. Some Typable Object Language Programs

The first program applies two constant functions, ⟨add1@ρ⟩ and ⟨display@ρ⟩, consecutively to the constant integer ⟨5⟩. The second demonstrates recursion. The recursive function simply copies its argument; the surrounding program calls it and displays the result. The recursive function, its arguments and results, and the ⟨display@ρ1⟩ function are allocated in an outer region. The recursive function uses a constant function ⟨if@ρ2⟩ to select between thunks (at ρ2) depending on the result (a boolean) of applying ⟨zero?@ρ1⟩ to the argument. Constant functions ⟨sub1@ρ1⟩ and ⟨add1@ρ1⟩ are used for the copying. The constant functions and the thunks are defined in a region internal to the body of the recursive function. The ⟨if@ρ2⟩ function in this example is later defined to be type-polymorphic.

The third program demonstrates type polymorphism. It applies an identity function to a constant integer and to a reference cell allocated in the same region.

The final two examples demonstrate our effect polymorphism for reference cells and functions, respectively. Program 4 replaces a function with no effect in a reference cell with a second function that calls the first and therefore has an execution effect. The cell is subsequently dereferenced and the function called. Program 5 applies a function that applies its argument to unit separately to two arguments, in sequence. The first argument is a function with an allocation effect. The second is a function that calls the first and thus has both an allocation effect and an execution effect. Type polymorphism would be insufficient in both cases because we cannot type an application without knowing that the operand has a functional type, and a type variable cannot provide that guarantee.
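The syntactic split between expressions that are safe for generalization and those that are not can be pictured as a simple classifier over syntax trees. The sketch below is an assumption-laden illustration: the tagged-tuple encoding and every constructor name (`"var"`, `"value"`, `"at"`, `"const"`, `"lam"`, `"rec"`, `"ref"`, and so on) are hypothetical, and the check is purely syntactic, so it says nothing about the additional typing condition that prevents generalizing over types fixed by the environment.

```python
# Prestorables that may be generalized: constants, functions, recursive functions.
GENERALIZABLE_PRESTORABLES = {"const", "lam", "rec"}


def safe_for_generalization(expr):
    """Return True when expr belongs to the generalizable syntactic class."""
    tag = expr[0]
    if tag in ("var", "value"):
        # Pures (variables and values) are always safe.
        return True
    if tag == "at":
        # ("at", region, prestorable): allocation of a prestorable at a region.
        prestorable = expr[2]
        return prestorable[0] in GENERALIZABLE_PRESTORABLES
    # let, letregion, deref, set, and application are never generalized.
    return False


identity = ("at", "rho", ("lam", "x", ("var", "x")))
cell = ("at", "rho", ("ref", identity))
read_cell = ("deref", "rho", ("var", "x1"))
```

Under this encoding `identity` is classified as safe, while `cell` and `read_cell` are not, mirroring the observation that functions may be used at several types but reference cells may not.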

(1) letregion ρ in let x = @ ρ ⟨λx.x⟩ in @ ρ x (@ ρ ⟨ref unit⟩); @ ρ x unit

(2) letregion ρ1 in letregion ρ2 in let x = @ ρ2 ⟨λx.x⟩ in @ ρ2 x (@ ρ1 ⟨ref unit⟩); @ ρ2 x (@ ρ2 ⟨ref unit⟩)

(3) letregion ρ in let x1 = @ ρ ⟨ref @ ρ ⟨λx.x⟩⟩ in @ ρ set x1 to @ ρ ⟨add1⟩; @ ρ (@ ρ deref x1) @ ρ ⟨ref unit⟩

(4) letregion ρ in let x1 = @ ρ ⟨ref @ ρ ⟨λx.x⟩⟩ in let x2 = @ ρ ⟨λx.@ ρ (@ ρ deref x1) x⟩ in @ ρ set x1 to @ ρ ⟨add1⟩; @ ρ x2 @ ρ ⟨ref unit⟩

(5) letregion ρ in ⟨µx1@ρ. λx2. @ ρ x1 @ ρ ⟨5⟩; @ ρ x1 @ ρ ⟨ref unit⟩⟩ unit

Figure 5.1.3. Some Untypable Object Language Programs

Figure 5.1.3 contains untypable programs. The first two examples, derived from those of earlier sections, demonstrate that pure types are not polymorphic. The remaining examples demonstrate that our treatment of polymorphism is standard.

Program 3 demonstrates that only certain syntactic forms are generalizable; references, for example, are not. It makes clear why this must be the case. If x1 were bound in the environment to a cell holding a polymorphic identity function, we could (treating the cell contents as a function over integers) set x1 to @ ρ ⟨add1⟩ and then (treating the cell contents as a function over cells holding integers) apply the contents of x1 to @ ρ ⟨ref unit⟩. Such a configuration would be faulty.

Program 4 attempts to avoid this problem by placing one of the references to x1 (the cell dereference) in a generalizable context, via an η-expansion.¹ We must not allow this function to be generalized, lest we be subject to the same dangers as in the previous example, because its type is linked to the fixed type of x1. Multiple occurrences of x1 and x2 must then use a single type for the function argument, as with x1 in the previous example.

The final example demonstrates that recursion is not polymorphic, i.e., within the body of the recursive function, the function name x1 must be used monomorphically.

¹ An η-expansion of an expression of functional type is an equivalent (subject to execution effects) expression formed by wrapping, in a function whose formal parameter doesn’t appear free in the expression, an application of the expression to the formal parameter. In this case, the expression is the dereference of x1.

Γρ ∈

src ǫ obj Γ

Γρ ∈ ∀

B

src ρ1 ρ0 obj Γ ρ ρ E ∈ src obj E

::= Γ ρ {x 7→

ρ

::= ∀ X. B ρ

∈

Bρ ∈ ∀ ρ

P

P̺ ∈ Tρ ∈

src ǫ obj P , ρ

∈

src ∀ ρ obj B ρ1 ρ0 src obj B src ∀ ρ obj P

src obj Q src ρ1 ρ0 P ∈ obj P src ρ ρ ρ ∈ src obj T , ε obj ε ρ src ρ1 ρ0 ρ2 F ∈ obj F ι ∈ src obj ι

ρ ∈

::= ∅ | Γ ǫ {x 7→ P ǫ }

src obj ρ

Q ∈

G ∈

src obj G

∀ ρ

P

} | Γ ρ1 {ρ0 }

::= P ρ ! T ρ

::=

Tρ

X | C | Ref P ρ | P ρ ⇒ P ρ

::= ∀ X. P ρ ::= G | ∅ ::= B ρ @ ρ0 | P ρ1 =

ρ { src obj F }

::= (ι @ ρ0 ) ::= alloc | read | write | exec X ∈

src obj X

C ∈

src obj C

Figure 5.1.4. Object Source Language Static Syntax

Our static semantics is presented in Figure 5.1.4. Static environments are redefined to map program variables to pure type schemes. Storable types B now include storable type variables X and base types C, defined as new primitive syntactic classes. While the set of base types is left unspecified, it is assumed to be disjoint from the other types in the system, and to include Int. Storable type schemes ${}^{\forall}B$ universally quantify storable types over storable type variables. We identify type schemes that differ only by a consistent renaming of bound type variables. We say that a storable type scheme ${}^{\forall}B = \forall X.\,B^{*}$ generalizes a storable type B if and only if there exists a substitution σ with Dom(σ) = {X} such that σ(B*) = B. We assume the obvious definitions of ftv(B) and ftv(${}^{\forall}B$). We allow B to stand in for the corresponding ${}^{\forall}B$ with no parameters.

Pure and program types are unchanged from the core language (Figure 3.1.4). We define pure type schemes ${}^{\forall}P$ by analogy with storable type schemes ${}^{\forall}B$, i.e., they are pure types universally quantified over storable type variables. Environments of nonempty index bind program variables to pure type schemes so that we can implement value-polymorphism for let. This could be done for top-level environments as well, but would not increase the power of the type system.

We extend frv(·) to refer to the free region variables of pure or storable type schemes.
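The generalization relation between a type scheme and a type amounts to a one-sided matching problem: find a substitution for the quantified variables that turns the scheme body into the given type. The sketch below implements that check under simplifying assumptions: types are tagged tuples such as `("var", "X")`, `("base", "Int")`, `("ref", T)`, and `("fun", T1, T2)`, region and effect annotations are omitted, and the names `match` and `generalizes` are hypothetical.

```python
def match(pattern, target, bound, subst):
    """Extend subst so that substituting into pattern yields target, if possible."""
    if pattern[0] == "var" and pattern[1] in bound:
        name = pattern[1]
        if name in subst:
            # A quantified variable must be instantiated consistently.
            return subst[name] == target
        subst[name] = target
        return True
    if pattern[0] != target[0] or len(pattern) != len(target):
        return False
    for p, t in zip(pattern[1:], target[1:]):
        if isinstance(p, tuple):
            if not (isinstance(t, tuple) and match(p, t, bound, subst)):
                return False
        elif p != t:
            return False
    return True


def generalizes(scheme, target):
    """scheme is (quantified variable names, body); target is a type."""
    bound, body = scheme
    return match(body, target, set(bound), {})


identity_scheme = (["X"], ("fun", ("var", "X"), ("var", "X")))
int_to_int = ("fun", ("base", "Int"), ("base", "Int"))
int_to_unit = ("fun", ("base", "Int"), ("base", "Unit"))
```

Here `identity_scheme` generalizes `int_to_int` but not `int_to_unit`, because the single quantified variable cannot be instantiated to two different types. A scheme with no quantified variables generalizes only its own body, matching the convention that a type may stand in for the scheme with no parameters.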


A type system is said to have principal types [24, 33] if, given a term and environment, there exists a type representing all possible types for the term in the environment. For this system, we would require that, given an expression (prestorable) and environment, there exists an expression (prestorable) type such that any expression (prestorable) type for the expression (prestorable) in the environment is obtainable by substitution of storable types for storable type variables. This system is not polymorphic in pure types, i.e., there is no provision for pure type variables. Thus, this system does not have principal types. An identity function in an environment declaring only region variable ρ0, for example, can be assigned the storable types Unit ⇒^∅ Unit or Ref Unit @ ρ0 ⇒^∅ Ref Unit @ ρ0, but there is no single storable type that encompasses both. The lack of principal types is also due to the fact that our system is region-monomorphic. An identity function in an environment declaring only the region variables ρ1 and ρ2 can similarly be assigned the storable types Ref Unit @ ρ1 ⇒^∅ Ref Unit @ ρ1 or Ref Unit @ ρ2 ⇒^∅ Ref Unit @ ρ2, although there is no single storable type that encompasses both. See the first two examples in Figure 5.1.3 with respect to this discussion. As we are not particularly interested here in type or region inference, the lack of principal types is not a major concern.

Restriction of environments (Γ^(ρ1 ρ2) − ρ2 ∈ Γ^ρ1):

  Γ {ρ0} − ρ2 ρ0    =  Γ − ρ2
  Γ {x ↦ ∀P} − ρ2   =  (Γ − ρ2) {x ↦ (∀P − ρ2)}
  Γ − ε             =  Γ

Restriction of pure type schemes (∀P^(ρ1 ρ2) − ρ2 ∈ ∀P^ρ1):

  ∀X. P − ρ2        =  ∀X. (P − ρ2)

Restriction of pure types (P^(ρ1 ρ2) − ρ2 ∈ P^ρ1):

  G − ρ2                      =  G
  ∅ − ρ2                      =  ∅
  (B0 @ ρ0) − ρ1 ρ0 ρ2        =  ∅
  (B0 @ ρ0) − ρ2  (ρ0 ∉ ρ2)   =  B0 @ ρ0

Figure 5.1.5. Object Source Language Definitions

Restriction of environments and pure types is defined in Figure 5.1.5 as in the core language (Figure 3.1.5), except that pure type schemes in the region environment are restricted by restricting their embedded pure type. Typing judgments in Figure 5.1.6 are the same as in the core language (Figure 3.1.6), subject to our revised syntax.
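As a rough illustration of restriction, the following sketch represents a pure type either as a global type name, as the dangling type, or as a pair of a storable type and its region. The representation and the function names are assumptions made for this example only. The treatment of effects (dropping every effect at a removed region) is likewise an assumption about T − ρ0, which is defined with the core language rather than in this section.

```python
DANGLING = "∅"   # stands for the dangling-pointer type

def restrict_pure(pure, removed):
    """Restrict a pure type by a collection of removed region variables.

    pure is a global type name (str), DANGLING, or a pair (storable, region).
    A type allocated in a removed region becomes DANGLING."""
    if isinstance(pure, tuple):
        _storable, region = pure
        return DANGLING if region in removed else pure
    return pure

def restrict_scheme(scheme, removed):
    """Restrict a pure type scheme (params, body) by restricting its body."""
    params, body = scheme
    return (params, restrict_pure(body, removed))

def restrict_effect(effect, removed):
    """Drop every (operation, region) pair whose region is removed (assumed)."""
    return {(op, region) for (op, region) in effect if region not in removed}
```

For instance, restricting the pair for X2 @ ρ1 by ρ1 gives the dangling type, which is consistent with the sample program later in this section being typed as a dangling pointer at top level.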

Object Language Statics

  programs       ⊢obj q                    q : Q
  expressions    Γ^ρ ⊢obj e                e^ρ : E^ρ
  prestorables   Γ^(ρ1 ρ0 ρ2) ⊢obj b       b^(ρ1 ρ0 / ρ2) : B^(ρ1 ρ0) ! T^(ρ1 ρ0 ρ2)
  pures          Γ^ρ ⊢obj p                p^ρ : P^ρ
  values         ⊢obj v                    v^ρ : P^ρ

Figure 5.1.6. Object Source Language Typing Judgments

  ⊢obj v-glob-const
    ----------------------
    ⊢ g : TypeOf(g)

  ⊢obj v-addr-dead
    ----------------------
    ⊢ ∅ : ∅

Figure 5.1.7. Typing of Object Source Language Values

  ⊢obj q
    ∅ ⊢ q : Q ! ∅
    ----------------------
    ⊢ q : Q

Figure 5.1.8. Typing of Object Source Language Programs

Values and programs are typed in Figures 5.1.7 and 5.1.8 as in the core language (Figures 3.1.7 and 3.1.8, respectively).

  ⊢obj p-var
    Γ(x) generalizes P
    ----------------------
    Γ ⊢ x : P

  ⊢obj p-value
    ⊢ v : P
    ----------------------
    Γ ⊢ v : P

Figure 5.1.9. Typing of Object Source Language Pures

The typing rules for pures are presented in Figure 5.1.9. ⊢obj p-var allows a variable to be assigned any pure type that is a specialization of its entry in the static environment.

The typing rules for prestorables are presented in Figure 5.1.10. ⊢obj b-const makes use of an indexed total function TypeOf^ρ : c ⇒ ∀B^ρ to obtain a storable type scheme for any constant, specializing the result. We assume, for example, that TypeOf(zero?@ρ0) = Int @ ρ0 ⇒^{(read @ ρ0)} Bool, that TypeOf(add1@ρ0) = Int @ ρ0 ⇒^{(alloc @ ρ0),(read @ ρ0)} Int @ ρ0, and also that TypeOf(if@ρ0) = ∀X. Bool ⇒^{(alloc @ ρ0)} (X @ ρ0) ⇒^{(alloc @ ρ0)} (X @ ρ0) ⇒^∅ (X @ ρ0). ⊢obj b-λ allows the latent effect of a function to be greater than the effect of evaluating the body [59]; we accept this as a simpler alternative to effect polymorphism. ⊢obj b-µλ is stated in terms of ⊢obj b-λ. When typing the function, the static environment is extended with a trivial pure type scheme (without type variables) based on the function type. Thus, there is no polymorphic recursion, as demonstrated in the last example of Figure 5.1.3.

  ⊢obj b-const
    TypeOf(c) generalizes B0
    ------------------------------------------
    Γ ⊢ ⟨c⟩ : B0 ! ∅

  ⊢obj b-ref
    Γ ⊢ e : P0 ! T
    ------------------------------------------
    Γ ⊢ ⟨ref e⟩ : Ref P0 ! T

  ⊢obj b-λ
    T0′ ⊆ T0      (Γ − ρ2) {x ↦ P0.1} ⊢ e0 : P0.2 ! T0′
    ------------------------------------------
    Γ ⊢ ⟨λx.e0⟩ : P0.1 ⇒^T0 P0.2 ! ∅

  ⊢obj b-µλ
    Γ {x1 ↦ B0 @ ρ0} ⊢ ⟨λx2.e0⟩ : B0 ! ∅
    ------------------------------------------
    Γ ⊢ ⟨µx1@ρ0. λx2. e0⟩ : B0 ! ∅

Prestorable judgments carry the index ρ1 ρ0 / ρ2; the premises of ⊢obj b-ref and ⊢obj b-λ are expression judgments.

Figure 5.1.10. Typing of Object Source Language Prestorables

  ⊢obj e-letn
    Γ ⊢ ∀̸e : P.1 ! T.1      Γ {x ↦ P.1} ⊢ e : P.2 ! T.2
    ----------------------------------------------------------
    Γ ⊢ let x = ∀̸e in e : P.2 ! T.1 ∪ T.2

  ⊢obj e-letg
    Γ ⊢ ∀e : P0.1 ! T.1      X = ftv(P0.1) − ftv(Γ)      Γ {x ↦ ∀X. P0.1} ⊢ e : P0.2 ! T.2
    ----------------------------------------------------------
    Γ ⊢ let x = ∀e in e : P0.2 ! T.1 ∪ T.2

  ⊢obj e-pure
    Γ ⊢ p : P
    ----------------------------------------------------------
    Γ ⊢ p : P ! ∅

  ⊢obj e-alloc   (index ρ1 ρ0 ρ2)
    b = ⟨µx1@ϱ. λx2. e⟩ → ϱ = ρ0      Γ ⊢ b : B ! T
    ----------------------------------------------------------
    Γ ⊢ @ ρ0 b : B @ ρ0 ! T ∪ {(alloc @ ρ0)}

  ⊢obj e-deref   (index ρ3 = ρ1 ρ0 ρ2)
    Γ ⊢ e : Ref P0 @ ρ0 ! T
    ----------------------------------------------------------
    Γ ⊢ @ ρ0 deref e : P0 ! T ∪ {(read @ ρ0)}

  ⊢obj e-set   (index ρ3 = ρ1 ρ0 ρ2)
    Γ ⊢ e.1 : Ref P0 @ ρ0 ! T.1      Γ ⊢ e.2 : P0 ! T.2
    ----------------------------------------------------------
    Γ ⊢ @ ρ0 set e.1 to e.2 : Unit ! T.1 ∪ T.2 ∪ {(write @ ρ0)}

  ⊢obj e-app   (index ρ3 = ρ1 ρ0 ρ2)
    Γ ⊢ e.1 : (P0.1 ⇒^T0 P0.2) @ ρ0 ! T.1      Γ ⊢ e.2 : P0.1 ! T.2
    ----------------------------------------------------------
    Γ ⊢ @ ρ0 e.1 e.2 : P0.2 ! T0 ∪ T.1 ∪ T.2 ∪ {(exec @ ρ0)}

  ⊢obj e-letregion   (premise at index ϱ ρ0, conclusion at index ϱ)
    Γ {ρ0} ⊢ e : P ! T
    ----------------------------------------------------------
    Γ ⊢ letregion ρ0 in e : P − ρ0 ! T − ρ0

Figure 5.1.11. Typing of Object Source Language Expressions
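The effect bookkeeping in the rules of Figure 5.1.11 amounts to set unions. The sketch below shows only that bookkeeping, assuming effects are Python sets of (operation, region) pairs; the function names are hypothetical and no type checking is performed.

```python
def effect_alloc(region, t_body):
    """Effect of `@ region b`: the prestorable's effect plus an allocation."""
    return set(t_body) | {("alloc", region)}

def effect_deref(region, t_arg):
    """Effect of `@ region deref e`: the argument's effect plus a read."""
    return set(t_arg) | {("read", region)}

def effect_set(region, t_cell, t_value):
    """Effect of `@ region set e.1 to e.2`: both subeffects plus a write."""
    return set(t_cell) | set(t_value) | {("write", region)}

def effect_app(region, latent, t_fun, t_arg):
    """Effect of `@ region e.1 e.2`: the latent effect of the function,
    both subeffects, and an execution at the function's region."""
    return set(latent) | set(t_fun) | set(t_arg) | {("exec", region)}

# The application `@ ρ1 x1 (@ ρ1 ⟨1⟩)` from the sample derivation: x1 is a
# pure (no effect), the argument allocates at ρ1, and the latent effect of
# x1 is {alloc, exec, read} at ρ1.
latent = {("alloc", "ρ1"), ("exec", "ρ1"), ("read", "ρ1")}
sample = effect_app("ρ1", latent, set(), effect_alloc("ρ1", set()))
```

Here `sample` is the set of allocation, execution, and read effects at ρ1, the effect the sample derivation assigns to this application.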

The typing rules for expressions are presented in Figure 5.1.11. There are now two rules for let. The first, ⊢obj e-letn, is familiar, but requires the definition to be nongeneralizable. It extends the environment with a trivial pure type scheme based on the type of the definition. The rule ⊢obj e-letg is available when the definition is generalizable. It differs in extending the environment with a pure type scheme formed by universally quantifying the pure type of the definition over its free type variables that are not free in the environment. This allows distinct occurrences of the bound variable in the body to be typed using different substitutions of storable types for the type variables.

We reject untypable program #3 of Figure 5.1.3 above by insisting that reference cells not be generalizable. If x1 were bound in the environment to ∀X. Ref ((X @ ρ ⇒^∅ X @ ρ) @ ρ) @ ρ, we could set x1 to add1 (treating X as Int) and then apply the contents of x1 to a reference cell (treating X as Ref Unit). We reject untypable program #4 of Figure 5.1.3 above by insisting that one can only generalize fresh type variables. In that case, although the definition in the inner let is generalizable, we must not quantify its type over X because that type variable occurs free in the environment, in the binding of x1. The rule ⊢obj e-alloc ensures that a recursive prestorable is allocated at its internally specified region.

A sample derivation in the enhanced object language is presented in Figures 5.1.12 through 5.1.15. The program is not particularly useful, to say the least, but it concisely demonstrates the main added features. It enters an infinite loop twice, in one case treating the result as an integer and in the other as a reference cell. See Figure 5.1.2 for examples of more practical programs.

In Figure 5.1.12, the top-level program and corresponding expression are typed as dangling pointers and the expression has no effect. Within region ρ1, the pure type is X2 @ ρ1 and the effect includes an allocation, an execution, and a read at ρ1. Allocations of recursive functions are generalizable; the rule ⊢obj e-letg is used so that within the body of the let construct that binds the recursive function, the function appears in the environment with a pure type scheme of ∀X1. (Int @ ρ1 ⇒^{(alloc @ ρ1),(exec @ ρ1),(read @ ρ1)} X1 @ ρ1) @ ρ1. Thus it takes an integer at ρ1, performs allocations, executions, and reads at ρ1, and returns an element of pure type X1 @ ρ1. The let body consists of a sequence. The first element of the sequence is typed in Figure 5.1.12. The second is d4 in Figure 5.1.14. In each case we have an application of the recursive function, x1, to a constant integer (whose allocation is typed in d3 in Figure 5.1.15). In the first sequence element, we have x1 typed as returning Int @ ρ1 while in the second it is typed as returning Ref X2 @ ρ1; both are specializations of X1 @ ρ1, the range type from the environment. These are the most general derivable return types; in the latter case we could have made the return type any reference cell type at ρ1.

The recursive function is typed in d1 in Figure 5.1.13. Within the body, x1, which happens to also serve as the internally visible name of the recursive function, is typed in the environment without generalization as (Int @ ρ1 ⇒^{(alloc @ ρ1),(exec @ ρ1),(read @ ρ1)} X1 @ ρ1) @ ρ1. Thus, within the recursive function's body no assumptions can be made regarding its return type.

[Derivation tree not reproduced. Its root is the judgment
  ⊢ letregion ρ1 in let x1 = @ ρ1 ⟨µx1@ρ1. λx2. letregion ρ2 in @ ρ1 x1 (@ ρ2 @ ρ2 ⟨add1@ρ1⟩ x2)⟩ in @ ρ1 @ ρ1 ⟨sub1@ρ1⟩ (@ ρ1 x1 @ ρ1 ⟨1⟩); @ ρ1 deref (@ ρ1 x1 @ ρ1 ⟨1⟩) : ∅
obtained through ⊢obj q, e-letregion, e-letg, e-seq, and e-app, with the let body typed at X2 @ ρ1 ! {(alloc @ ρ1), (exec @ ρ1), (read @ ρ1)}.]

Figure 5.1.12. Sample Object Source Language Derivation, I

[Derivation tree not reproduced. It gives d1, the typing of the allocation of the recursive function through e-alloc, b-µλ, b-λ, e-letregion, e-app, e-pure, and p-var, concluding with type (Int @ ρ1 ⇒^{(alloc @ ρ1),(exec @ ρ1),(read @ ρ1)} X1 @ ρ1) @ ρ1 and effect {(alloc @ ρ1)}.]

Figure 5.1.13. Sample Object Source Language Derivation, II

[Derivation trees not reproduced. They cover the allocation of ⟨1⟩ at ρ1, with type Int @ ρ1 and effect {(alloc @ ρ1)}, and the second sequence element @ ρ1 deref (@ ρ1 x1 @ ρ1 ⟨1⟩), with type X2 @ ρ1 and effect {(alloc @ ρ1), (exec @ ρ1), (read @ ρ1)}.]

Figure 5.1.14. Sample Object Source Language Derivation, III

[Derivation trees not reproduced. They cover the allocation of ⟨sub1@ρ1⟩ at ρ1, the application @ ρ2 @ ρ2 ⟨add1@ρ1⟩ x2 with effect {(alloc @ ρ1), (read @ ρ1), (alloc @ ρ2), (exec @ ρ2)}, and the typing of x1 at the instance returning Int @ ρ1.]

Figure 5.1.15. Sample Object Source Language Derivation, IV
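The difference between ⊢obj e-letn and ⊢obj e-letg described in this section comes down to which type variables are quantified when the let-bound variable enters the environment. A minimal sketch follows, assuming types are nested tuples, that ("var", name) marks a type variable, and that ("forall", names, body) marks a scheme; these encodings and the function names are hypothetical.

```python
def ftv(ty, bound=frozenset()):
    """Free type variables of a type or scheme written as nested tuples."""
    if isinstance(ty, tuple):
        if ty[0] == "var":
            return set() if ty[1] in bound else {ty[1]}
        if ty[0] == "forall":
            return ftv(ty[2], bound | set(ty[1]))
        free = set()
        for part in ty[1:]:          # ty[0] is the constructor tag
            free |= ftv(part, bound)
        return free
    return set()                      # base types are plain strings

def ftv_env(env):
    """Free type variables of every scheme bound in the environment."""
    free = set()
    for scheme in env.values():
        free |= ftv(scheme)
    return free

def bind_let(env, x, definition_type, generalizable):
    """Extend env for the let body. A generalizable definition is quantified
    over X = ftv(P) - ftv(env); otherwise a trivial scheme is used."""
    if generalizable:
        quantified = tuple(sorted(ftv(definition_type) - ftv_env(env)))
    else:
        quantified = ()
    extended = dict(env)
    extended[x] = ("forall", quantified, definition_type)
    return extended
```

With an empty environment, a generalizable definition of type Int ⇒ X1 is bound at ∀X1. Int ⇒ X1, so the body may instantiate X1 differently at each use. If X already occurs free in the environment, as in untypable program #4, it is not quantified.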

2. Monadic Language Statics

  q             ::=  e^ε
  e^ρ           ::=  ∀e^ρ | ∀̸e^ρ
  ∀e^ρ          ::=  return^ρ p^ρ
  ∀e^(ρ1 ρ0)    ::=  return ∀e^ρ1 | ∀b^ρ
  ∀̸e^ρ          ::=  let x = e^ρ in e^ρ | run e^(ρ ρ0)
  ∀̸e^(ρ1 ρ0)    ::=  return ∀̸e^ρ1 | ∀̸b^ρ | deref p^ρ | set p^ρ to p^ρ | p^ρ p^ρ
  b^ρ           ::=  ∀b^ρ | ∀̸b^ρ
  ∀b^(ρ1 ρ0)    ::=  ⟨c⟩ | ⟨λx.e^ρ⟩ | ⟨µx. λx. e^ρ⟩
  ∀̸b^(ρ1 ρ0)    ::=  ⟨ref p^ρ⟩
  p^ρ           ::=  v^ρ | x
  v^ρ           ::=  g | ∅

  ρ ∈ mon ρ      g ∈ mon g ⊇ obj g      x ∈ mon x ⊇ obj x      c ∈ mon c

  e1; e2  →  let x = e1 in e2      (x ∉ fpv(e2))

Figure 5.2.16. Monadic Source Language Syntax

The monadic source language syntax is presented in Figure 5.2.16. It enhances the core monadic source language of Figure 3.2.14 with the features first presented in Figure 5.1.1. As in the core monadic language, new regions may be declared with a run form, indexes are formed using a single region variable, and the region to which an operation applies is determined by its context within return forms. As in the enhanced object language, constants and recursive functions are included. Recursive functions need no longer mention the region of allocation. Expressions and prestorables are again divided by whether or not they are generalizable. The cascaded use of return constructs around pures is considered generalizable, while the potential for generalizing the use of an individual return construct to perform an operation on an outer region is determined by the operation being performed.

(1) run let x1 = ⟨display⟩ in let x2 = let x3 = ⟨add1⟩ in let x4 = ⟨5⟩ in x3 x4 in x1 x2

(2) run let x3 = ⟨display⟩ in let x4 = let x5 = ⟨µx1. λx2. run let x14 = let x15 = let x6 = let x7 = ⟨if⟩ in let x8 = let x9 = ⟨zero?-1⟩ in x9 x2 in x7 x8 in let x10 = ⟨λx.return ⟨0⟩⟩ in x6 x10 in let x13 = ⟨λx.let x12 = let x17 = ⟨sub1-1⟩ in let x18 = x17 x2 in x1 x18 in let x16 = ⟨add1-1⟩ in x16 x12⟩ in x15 x13 in x14 unit⟩ in let x11 = ⟨1⟩ in x5 x11 in x3 x4

(3) run let x = ⟨λx.return x⟩ in let x1 = ⟨5⟩ in x x1; let x2 = ⟨ref unit⟩ in x x2

(4) run let x1 = let x3 = ⟨λx.return true⟩ in ⟨ref x3⟩ in let x2 = deref x1 in let x4 = ⟨λx.x2 x⟩ in set x1 to x4; let x5 = deref x1 in x5 false

(5) run let x0 = ⟨λx.x unit⟩ in let x1 = ⟨λx.⟨ref unit⟩⟩ in let x2 = ⟨λx.x1 unit⟩ in x0 x1; x0 x2

Figure 5.2.17. Some Typable Monadic Language Programs

Figures 5.2.17 and 5.2.18 present sample monadic programs corresponding to the object-language programs of Figures 5.1.2 and 5.1.3. Constants are named with the number of regions back at which they are to operate.

(1) run let x = ⟨λx.return x⟩ in let x1 = ⟨ref unit⟩ in x x1; x unit

(2) run run let x = ⟨λx.return return x⟩ in let x1 = return ⟨ref unit⟩ in x x1; let x2 = ⟨ref unit⟩ in x x2

(3) run let x1 = let x2 = ⟨λx.return x⟩ in ⟨ref x2⟩ in let x5 = ⟨add1⟩ in set x1 to x5; let x3 = deref x1 in let x4 = ⟨ref unit⟩ in x3 x4

(4) run let x1 = let x3 = ⟨λx.return x⟩ in ⟨ref x3⟩ in let x2 = ⟨λx.let x4 = deref x1 in x4 x⟩ in let x5 = ⟨add1⟩ in set x1 to x5; let x6 = ⟨ref unit⟩ in x2 x6

(5) run ⟨µx1. λx2. let x3 = ⟨5⟩ in x1 x3; let x4 = ⟨ref unit⟩ in x1 x4⟩ unit

Figure 5.2.18. Some Untypable Monadic Language Programs

  Γ^ε          ::=  ∅ {≬Γ^ε}
  Γ^(ρ1 ρ0)    ::=  Γ^ρ1 {≬Γ^ρ}
  ≬Γ^ρ         ::=  ∅ | ≬Γ^ρ {x ↦ ∀P^ρ}
  E^ρ          ::=  T^ρ P^ρ
  ∀B^ρ         ::=  ∀X. B^ρ
  B^(ρ1 ρ0)    ::=  X | C | Ref P^ρ | P^ρ ⇒ E^ρ
  ∀P^ρ         ::=  ∀X. P^ρ
  P^ε, Q       ::=  G | ∅
  P^(ρ1 ρ0)    ::=  B^ρ | Return P^ρ1
  T^ε          ::=  Id
  T^(ρ1 ρ0)    ::=  ≬T^ρ T^ρ1
  ≬T^(ρ1 ρ0)   ::=  St^ε
  ε^(ρ1 ρ0)    =    {F}
  F^(ρ1 ρ0)    ::=  ι
  ι            ::=  alloc | read | write | exec

  G ∈ mon G ⊇ obj G      X ∈ mon X ⊇ obj X      C ∈ mon C ⊇ obj C

Figure 5.2.19. Monadic Source Language Static Syntax

The static syntax is presented in Figure 5.2.19. It is similar to that of the core monadic source language (Figure 3.2.17). Environments are sequences of region environments. Expression types take the form of the application of a trace type to a pure type. Trace types are functor transformers annotated with an effect. Pure types are prestorable types embedded within Return forms. The indexing is also similar to the core monadic source language. As in the enhanced object language (Figure 5.1.4) storable types include type variables and constant types. Storable type schemes are storable types universally quantified over type variables. Pure type schemes are pure types universally quantified over type variables. Region environments hold pure type schemes.

[Definitions not reproduced. The figure defines the application of an environment to a program variable, Γ(x), the operator □ on region environments and on environments, and the substitution := ∅ on pure types and pure type schemes, with ∀X. P := ∅ defined as ∀X. (P := ∅).]

Figure 5.2.20. Monadic Source Language Definitions

Monadic source language definitions are presented in Figure 5.2.20. They are like those of the core language (Figure 3.2.18) except for the presence of pure type schemes as in the object language (Figure 5.1.5). In applying an environment, the Return forms for an outer-region binding are inserted within the scope of any type variables. Monadic source language typing judgments in Figure 5.2.21, value rules in Figure 5.2.22, and program rules in Figure 5.2.23 are identical to those of the core language (Figures 3.2.19, 3.2.20, and 3.2.21).

  programs       ⊢mon q              q : Q
  expressions    Γ^ρ ⊢mon e          e^ρ : E^ρ
  prestorables   Γ^ρ ⊢mon b          b^ρ : B^ρ
  pures          Γ^ρ ⊢mon p          p^ρ : P^ρ
  values         ⊢mon v              v^ρ : P^ρ

Figure 5.2.21. Monadic Source Language Typing Judgments

  ⊢mon v-glob-const   (index ε)
    ----------------------
    ⊢ g : TypeOf(g)

  ⊢mon v-addr-dead   (index ε)
    ----------------------
    ⊢ ∅ : ∅

  ⊢mon v-ret-run   (premise at index ρ, conclusion at index ρ ρ0)
    ⊢ v : P
    ----------------------
    ⊢ v : Return P

Figure 5.2.22. Typing of Monadic Source Language Values

  ⊢mon q
    {∅} ⊢ q : Id Q
    ----------------------
    ⊢ q : Q

Figure 5.2.23. Typing of Monadic Source Language Programs

  ⊢mon p-var
    Γ(x) generalizes P
    ----------------------
    Γ ⊢ x : P

  ⊢mon p-value
    ⊢ v : P
    ----------------------
    Γ ⊢ v : P

Figure 5.2.24. Typing of Monadic Source Language Pures

The rules for pures in Figure 5.2.24 are again like those of the object language (Figure 5.1.9), except for the slight difference in indexing.

Figure 5.2.25 presents typing rules for prestorables. Rule ⊢mon b-ref is identical to the rule from the core monadic language (Figure 3.2.23). Rules ⊢mon b-const and ⊢mon b-µλ are similar to the corresponding rules for the object language (Figure 5.1.10). ⊢mon b-const again makes use of an indexed function TypeOf^(ρρ) : c ⇒ ∀B^(ρρ) to obtain a storable type scheme for any constant. We assume, for example, that TypeOf^(ρρρ)(zero?-1) = Return Int ⇒ (St^∅ (St^{read} (St^∅ Id))) Return^(ρρρ) Bool, that TypeOf^(ρρρ)(add1-1) = Return Int ⇒ (St^∅ (St^{alloc,read} (St^∅ Id))) Return Int, and that TypeOf^(ρρ)(if) = ∀X. Return^(ρρ) Bool ⇒ (St^{alloc} (St^∅ Id)) X ⇒ (St^{alloc} (St^∅ Id)) X ⇒ (St^∅ (St^∅ Id)) X. Rule ⊢mon b-µλ differs from the object language in that B is always allocated in the innermost region. Rule ⊢mon b-λ differs from the core monadic language version in that, as in the object language, we allow the effect of the body to fall short of the latent effect of the function.

  ⊢mon b-ref
    Γ ⊢ p : P
    ------------------------------------------
    Γ ⊢ ⟨ref p⟩ : Ref P

  ⊢mon b-λ
    ≬T0′ T1′ ⊑ ≬T0 T1      Γ {≬Γ {x ↦ P.1}} ⊢ e0 : ≬T0′ T1′ P.2
    ------------------------------------------
    Γ {≬Γ} ⊢ ⟨λx.e0⟩ : (P.1 ⇒ ≬T0 T1 P.2)

  ⊢mon b-µλ
    Γ {≬Γ {x1 ↦ B}} ⊢ ⟨λx2.e0⟩ : B
    ------------------------------------------
    Γ {≬Γ} ⊢ ⟨µx1. λx2. e0⟩ : B

  ⊢mon b-const
    TypeOf(c) generalizes B
    ------------------------------------------
    Γ ⊢ ⟨c⟩ : B

Figure 5.2.25. Typing of Monadic Source Language Prestorables

Figure 5.2.26 presents typing rules for expressions. The only differences from the core monadic language (Figure 3.2.24) are that, as in the object language (Figure 5.1.11), the rule ⊢mon e-letn is restricted so that the definition is nongeneralizable and there is a rule ⊢mon e-letg. ⊢mon e-letg operates similarly to the object language except that it is the uppermost region environment that is extended and that we use the operator ⊔ to combine the trace types. Figures 5.2.27 through 5.2.29 present a sample derivation in the enhanced monadic language. The program differs from that of the sample object-language derivation (Figures 5.1.12 through 5.1.15) in that the allocation of ⟨1⟩ is only performed once. The structure of the derivations is similar.

  ⊢mon e-letn
    Γ {≬Γ} ⊢ ∀̸e : T.1 P.1      Γ {≬Γ {x ↦ P.1}} ⊢ e : T.2 P.2
    ----------------------------------------------------------
    Γ {≬Γ} ⊢ let x = ∀̸e in e : (T.1 ⊔ T.2) P.2

  ⊢mon e-letg
    Γ {≬Γ} ⊢ ∀e : T.1 P.1      X = ftv(P.1) − ftv(Γ)      Γ {≬Γ {x ↦ ∀X. P.1}} ⊢ e : T.2 P.2
    ----------------------------------------------------------
    Γ {≬Γ} ⊢ let x = ∀e in e : (T.1 ⊔ T.2) P.2

  ⊢mon e-alloc   (index ρ1 ρ0)
    Γ ⊢ b : B
    ----------------------------------------------------------
    Γ ⊢ b : St^{alloc} (St^∅ Id) B

  ⊢mon e-deref   (index ρ1 ρ0)
    Γ ⊢ p : Ref P0
    ----------------------------------------------------------
    Γ ⊢ deref p : St^{read} (St^∅ Id) P0

  ⊢mon e-set   (index ρ1 ρ0)
    Γ ⊢ p.1 : Ref P0      Γ ⊢ p.2 : P0
    ----------------------------------------------------------
    Γ ⊢ set p.1 to p.2 : St^{write} (St^∅ Id) Unit

  ⊢mon e-app   (index ρ1 ρ0)
    Γ ⊢ p.1 : (P0.1 ⇒ ≬T T.1 P0.2)      Γ ⊢ p.2 : P0.1
    ----------------------------------------------------------
    Γ ⊢ p.1 p.2 : (≬T ⊔ St^{exec}) T.1 P0.2

  ⊢mon e-pure
    Γ ⊢ p : P
    ----------------------------------------------------------
    Γ ⊢ return^ρ p : St^∅ Id P

  ⊢mon e-run   (premise at index ρ1 ρ0, conclusion at index ρ1)
    Γ {∅} ⊢ e : ≬T T P
    ----------------------------------------------------------
    Γ ⊢ run e : T P [ρ0 := ∅]

  ⊢mon e-ret-run   (premise at index ρ1, conclusion at index ρ1 ρ0)
    Γ {≬Γ1 □ ≬Γ2} ⊢ e : T P
    ----------------------------------------------------------
    Γ {≬Γ1} {≬Γ2} ⊢ return e : St^∅ T (Return P)

Figure 5.2.26. Typing of Monadic Source Language Expressions

[Derivation tree not reproduced. Its root is the judgment
  ⊢ run let x1 = ⟨µx1. λx2. run let x3 = let x4 = ⟨add1-1⟩ in x4 x2 in x1 x3⟩ in let x2 = ⟨1⟩ in let x4 = ⟨sub1⟩ in let x5 = x1 x2 in x4 x5; let x6 = x1 x2 in deref x6 : ∅
obtained through ⊢mon q, e-run, e-letg, e-seq, e-letn, and e-app, with the body of the run typed at St^{alloc,exec,read} Id X2. The environments used are
  Γ1  = {∅} {∅ {x1 ↦ ∀X1. Int ⇒ St^{alloc,exec,read} Id X1} {x2 ↦ Int}}
  Γ12 = {∅} {∅ {x1 ↦ ∀X1. Int ⇒ St^{alloc,exec,read} Id X1} {x2 ↦ Int} {x4 ↦ Int ⇒ St^{alloc,read} Id Int}}.]

Figure 5.2.27. Sample Monadic Source Language Derivation, I

[Derivation trees not reproduced. d1 types the allocation of the recursive function at St^{alloc} Id (Int ⇒ St^{alloc,exec,read} Id X1); d5 types let x6 = x1 x2 in deref x6 at St^{alloc,exec,read} Id X2. The environments used are
  Γ2  = {∅} {∅ {x1 ↦ Int ⇒ St^{alloc,exec,read} Id X1} {x2 ↦ Int}}
  Γ3  = {∅} {∅ {x1 ↦ ∀X1. Int ⇒ St^{alloc,exec,read} Id X1} {x2 ↦ Int} {x6 ↦ Ref X2}}
  Γ21 = Γ2 {∅ {x4 ↦ Return Int ⇒ St^∅ (St^{alloc,read} Id) Return Int}}.]

Figure 5.2.28. Sample Monadic Source Language Derivation, II

[Derivation trees not reproduced. d2 types the allocation of ⟨1⟩ at St^{alloc} Id Int; d3 types the allocation of ⟨sub1⟩ at St^{alloc} Id (Int ⇒ St^{alloc,read} Id Int); d4 types x4 x5 at St^{alloc,exec,read} Id Int; d6 types the allocation of ⟨add1-1⟩ at St^{alloc} (St^∅ Id) (Return Int ⇒ St^∅ (St^{alloc,read} Id) Return Int). The environment used in d4 is
  Γ4 = {∅} {∅ {x1 ↦ ∀X1. Int ⇒ St^{alloc,exec,read} Id X1} {x2 ↦ Int} {x4 ↦ Int ⇒ St^{alloc,read} Id Int} {x5 ↦ Int}}.]

Figure 5.2.29. Sample Monadic Source Language Derivation, III

3. Translation

Figure 5.3.30 describes the translation of enhanced object source language programs to monadic programs. The translation differs from that of the core languages (Figure 3.3.28) only in handling 296

Translation

Programs and Expressions src obj qN

src

= [e ] obj e

[e ]

src ρ obj e N

[let x = e1 in e2 ] src

[@ ρ0 hci] obj e

ρ1 ρ0 ρ2 N

src ρ1 ρ0 ρ2 N obj e

[@ ρ0 hλx.e i]

src ρ1 ρ0 ρ2 N obj e

[@ ρ0 hµx1 @ρ0 . λx2 . ei]

src ρ = ρ1 ρ0 ρ2 N obj e

[@ ρ0 href ei] src ρ = ρ1 ρ0 ρ2 N [@ ρ0 e.1 e.2 ] obj e

src

ρ = ρ1 ρ0 ρ2

N [@ ρ0 deref e ] obj e src ρ = ρ1 ρ0 ρ2 e N [@ ρ0 set e.1 to e.2 ] obj

src

[letregion ρ0 in e ] obj e src ρ [p ] obj e N

ρ

N

Pures and src ρ [x ] obj p N src ρ [v ] obj p N src ρ [g ] obj v N src ρ [∅] obj v N

N src

ρ

N

src

=

let x = [e1 ] obj e

=

returnρ2 hci

=

returnρ2 hλx.[[e ] obj e

= = =

= =

= = Values = x src ρ = [v ] obj v N = g = ∅

ǫ

src

return

ρ2

in [e2 ] obj e ρ1 ρ0 N

ρ

N

i

src ρ1 ρ0 N obj e

hµx1 . λx2 . [e ] src ρ obj e N

i

ρ2

let x = [e ] in return href xi src ρ e N obj let x1 = [e.1 ] src ρ in let x2 = [e.2 ] obj e N in returnρ2 (x1 x2 ) src ρ let x = [e ] obj e N in returnρ2 (deref x) src ρ let x1 = [e.1 ] obj e N src ρ in let x2 = [e.2 ] obj e N in returnρ2 (set x1 to x2 ) src ρρ0 run [e ] obj e N src ρ returnρ [p ] obj p N Region Variables src

[ρ ] obj ρN src obj ρN

[ρ ]

=

ρ src

= [ρ ] obj ρN

Figure 5.3.30. Translating Object Source Language to Monadic (Dynamic)

constants and recursive functions. Allocations of constants are translated so that the region of allocation becomes implicit in the number of return forms, as with other allocations. As with program variables and global constants, we assume that all object-language allocated constants are present in the monadic language. Allocations of recursive functions are translated just like allocations of nonrecursive functions. The translation of let forms is unaffected by the introduction of polymorphism.

Consider the sample object language derivation in Figure 5.1.12. Applying the translation of Figure 5.3.30 to the program

letregion ρ1 in let x1 = @ ρ1 ⟨µx1@ρ1. λx2. letregion ρ2 in @ ρ1 x1 (@ ρ2 @ ρ2 ⟨add1@ρ1⟩ x2)⟩ in @ ρ1 @ ρ1 ⟨sub1@ρ1⟩ (@ ρ1 x1 @ ρ1 ⟨1⟩); @ ρ1 deref (@ ρ1 x1 @ ρ1 ⟨1⟩)

yields

run let x1 = ⟨µx1. λx2. run let x3 = return return x1 in let x4 = let x5 = ⟨add1-1⟩ in let x6 = return return x2 in x5 x6 in x3 x4⟩ in let x7 = ⟨sub1⟩ in let x8 = let x9 = return x1 in let x10 = ⟨1⟩ in x9 x10 in x7 x8; let x11 = let x12 = return x1 in let x13 = ⟨1⟩ in x12 x13 in deref x11
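The way the region of allocation becomes implicit in the number of return forms can be illustrated with a minimal Python sketch. It is not the dissertation's translation: it covers only variables, constant allocations, let and letregion, and the tuple encoding of terms and the names `returns` and `translate` are assumptions made for this illustration.

```python
def returns(n, term):
    """Wrap `term` in n nested return forms."""
    return "return " * n + term


def translate(e, index=()):
    """Translate a tiny object-language term at the lexical region index `index`.

    Terms are tuples (a hypothetical encoding):
      ("var", x)                 a program variable (a pure)
      ("const", rho0, c)         allocation of constant c at region rho0
      ("let", x, e1, e2)
      ("letregion", rho0, body)
    """
    tag = e[0]
    if tag == "var":
        # A pure is returned once per region in the current index.
        return returns(len(index), e[1])
    if tag == "const":
        _, rho0, c = e
        # Only the regions declared inside rho0 (the suffix rho2) are counted.
        suffix = index[index.index(rho0) + 1:]
        return returns(len(suffix), f"<{c}>")
    if tag == "let":
        _, x, e1, e2 = e
        return f"let {x} = {translate(e1, index)} in {translate(e2, index)}"
    if tag == "letregion":
        _, rho0, body = e
        # The region name disappears; only a run form remains.
        return "run " + translate(body, index + (rho0,))
    raise ValueError(f"unknown term: {tag}")


print(translate(("letregion", "r1",
                 ("letregion", "r2", ("const", "r1", "1")))))
```

A constant allocated in the outer of two regions is wrapped in one return, while a variable used under both regions is wrapped in two, matching the `return return x1` forms in the translated sample program.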

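Environments

$$\begin{aligned}
\llbracket \emptyset \rrbracket^{\epsilon} &= \emptyset\,\{\emptyset\} \\
\llbracket \Gamma\,\{x \mapsto \forall P\} \rrbracket^{\rho} &= \mathrm{let}\ \Gamma\,\{\between\Gamma\} = \llbracket \Gamma \rrbracket^{\rho}\ \mathrm{in}\ \Gamma\,\{\between\Gamma\,\{x \mapsto \llbracket \forall P \rrbracket^{\rho}\}\} \\
\llbracket \Gamma\,\{\rho_0\} \rrbracket^{\rho_1\rho_0} &= \llbracket \Gamma \rrbracket^{\rho_1}\,\{\emptyset\}
\end{aligned}$$

Expression and Trace Types

$$\begin{aligned}
\llbracket P\,!\,T \rrbracket^{\rho} &= \llbracket T \rrbracket^{\rho}\ \llbracket P \rrbracket^{\rho} \\
\llbracket T \rrbracket^{\rho_1\rho_0} &= \mathrm{St}\{\iota \mid (\iota\,@\,\rho_0) \in T\}\ \llbracket T - \rho_0 \rrbracket^{\rho_1} \\
\llbracket T \rrbracket^{\epsilon} &= \mathrm{Id}
\end{aligned}$$

Pure Types and Schemes

$$\begin{aligned}
\llbracket \forall X.\ P \rrbracket^{\rho} &= \forall X.\ \llbracket P \rrbracket^{\rho} \\
\llbracket G \rrbracket^{\rho} &= \mathrm{Return}^{\rho}\ G \\
\llbracket B\,@\,\rho_0 \rrbracket^{\rho_1\rho_0\rho_2} &= \mathrm{Return}^{\rho_2}\ \llbracket B \rrbracket^{\rho_1\rho_0} \\
\llbracket \emptyset \rrbracket^{\rho} &= \mathrm{Return}^{\rho}\ \emptyset
\end{aligned}$$

Storable Types and Schemes (with $\rho = \rho_1\rho_0$)

$$\begin{aligned}
\llbracket \forall X.\ B \rrbracket^{\rho} &= \forall X.\ \llbracket B \rrbracket^{\rho} \\
\llbracket X \rrbracket^{\rho} &= X \\
\llbracket C \rrbracket^{\rho} &= C \\
\llbracket \mathrm{Ref}\ P \rrbracket^{\rho} &= \mathrm{Ref}\ \llbracket P \rrbracket^{\rho} \\
\llbracket P_1 \overset{T}{\Rightarrow} P_2 \rrbracket^{\rho} &= \llbracket P_1 \rrbracket^{\rho} \Rightarrow \llbracket T \rrbracket^{\rho}\ \llbracket P_2 \rrbracket^{\rho}
\end{aligned}$$

Figure 5.3.31. Translating Object Source Language to Monadic (Static)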


The static translation in Figure 5.3.31 differs from the core language (Figure 3.3.29) in handling type variables, allocated constant types, and both pure and storable type schemes. As with global constant types, we assume that all object-language type variables and allocated constant types are present in the monadic language, and that $\llbracket \mathrm{TypeOf}(c) \rrbracket = \mathrm{TypeOf}(c)$. As we translate, we leave any quantifiers in place. Our statement of preservation of types is unchanged from the core source languages.

Theorem 5.3.1 (Types Preservation).

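$$\begin{aligned}
\vdash^{src}_{obj\,q}\ q : Q \quad &\rightarrow \quad \vdash^{src}_{mon\,q}\ \llbracket q \rrbracket : \llbracket Q \rrbracket \\
\Gamma \vdash^{src\ \rho}_{obj\,e}\ e : E \quad &\rightarrow \quad \llbracket \Gamma \rrbracket \vdash^{src\ \llbracket\rho\rrbracket}_{mon\,e}\ \llbracket e \rrbracket : \llbracket E \rrbracket \\
\Gamma \vdash^{src\ \rho}_{obj\,p}\ p : P \quad &\rightarrow \quad \llbracket \Gamma \rrbracket \vdash^{src\ \llbracket\rho\rrbracket}_{mon\,p}\ \llbracket p \rrbracket : \llbracket P \rrbracket \\
\vdash^{src\ \rho}_{obj\,v}\ v : P \quad &\rightarrow \quad \vdash^{src\ \llbracket\rho\rrbracket}_{mon\,v}\ \llbracket v \rrbracket : \llbracket P \rrbracket
\end{aligned}$$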

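Proof: Theorem 5.3.1. The proof is again by induction on object language typing derivations. We present the new and modified cases:

$\vdash^{src}_{obj\,p}$-var:
We now have that $\Gamma(x)$ instantiates to $P$ as an antecedent of both $\vdash^{src}_{obj\,p}$-var and $\vdash^{src}_{mon\,p}$-var, but this does not change the rest of the proof because $\Gamma(x)$ instantiating to $P$ implies that $\llbracket \Gamma(x) \rrbracket$ instantiates to $\llbracket P \rrbracket$.

$\vdash^{src}_{obj\,e}$-letg:
By $\vdash^{src}_{obj\,e}$-letg we have $\Gamma \vdash^{src\ \rho}_{obj\,e}\ {}^{\forall}e : P_1\,!\,T_1$ and $\Gamma\,\{x \mapsto \forall X.\ P_1\} \vdash^{src\ \rho}_{obj\,e}\ e : P_2\,!\,T_2$, with $X = \mathrm{ftv}(P_1) - \mathrm{ftv}(\Gamma)$. By induction, we have that $\llbracket \Gamma \rrbracket \vdash^{src}_{mon\,e}\ \llbracket {}^{\forall}e \rrbracket : \llbracket T_1 \rrbracket\ \llbracket P_1 \rrbracket$ and that $\llbracket \Gamma\,\{x \mapsto \forall X.\ P_1\} \rrbracket \vdash^{src}_{mon\,e}\ \llbracket e \rrbracket : \llbracket T_2 \rrbracket\ \llbracket P_2 \rrbracket$. We expand using the translation on environments to get:

$$\mathrm{let}\ \Gamma\,\{\between\Gamma\} = \llbracket \Gamma \rrbracket\ \mathrm{in}\ \Gamma\,\{\between\Gamma\,\{x \mapsto \forall X.\ \llbracket P_1 \rrbracket\}\} \vdash^{src}_{mon\,e}\ \llbracket e \rrbracket : \llbracket T_2 \rrbracket\ \llbracket P_2 \rrbracket.$$

We also have that $\mathrm{ftv}(P_1) = \mathrm{ftv}(\llbracket P_1 \rrbracket)$ and $\mathrm{ftv}(\Gamma) = \mathrm{ftv}(\llbracket \Gamma \rrbracket)$, so their difference is $X$ and we can apply $\vdash^{src}_{mon\,e}$-letg, obtaining a derivation of

$$\llbracket \Gamma \rrbracket \vdash^{src}_{mon\,e}\ \mathrm{let}\ x = \llbracket {}^{\forall}e \rrbracket\ \mathrm{in}\ \llbracket e \rrbracket : \llbracket T_1 \rrbracket \sqcup \llbracket T_2 \rrbracket\ \llbracket P_2 \rrbracket.$$

Noticing that $\mathrm{let}\ x = \llbracket {}^{\forall}e \rrbracket\ \mathrm{in}\ \llbracket e \rrbracket = \llbracket \mathrm{let}\ x = {}^{\forall}e\ \mathrm{in}\ e \rrbracket$ and $\llbracket T_1 \rrbracket \sqcup \llbracket T_2 \rrbracket = \llbracket T_1 \sqcup T_2 \rrbracket$, we are done.

$\vdash^{src}_{obj\,e}$-alloc/$\vdash^{src}_{obj\,b}$-const:
Let the index be $\rho_1\rho_0\rho_2$, with the allocation taking place at $\rho_0$. By $\vdash^{src}_{obj\,e}$-alloc and $\vdash^{src}_{obj\,b}$-const we have that $\mathrm{TypeOf}(c)$ instantiates to $B^{\rho_1\rho_0}$. By assumption and because ${}^{\forall}B$ instantiating to $B$ implies that $\llbracket {}^{\forall}B \rrbracket$ instantiates to $\llbracket B \rrbracket$, we have that $\mathrm{TypeOf}(c)$ instantiates to $\llbracket B^{\rho_1\rho_0} \rrbracket$. We can thus apply the monadic $\vdash^{src}_{mon\,b}$-const and $\vdash^{src}_{mon\,e}$-alloc at an index of $\rho_1\rho_0$ and with an environment of $\llbracket \Gamma \rrbracket^{\rho_1\rho_0\rho_2}$ 2ρ1 ρ0 /ρ2. Applying an instance of $\vdash^{src}_{mon\,e}$-ret-run for each element of $\rho_2$ provides a derivation of

$$\llbracket \Gamma \rrbracket \vdash^{src\ \rho_1\rho_0\rho_2}_{mon\,e}\ \mathrm{return}^{\rho_2}\ @\,\rho_0\ \langle c \rangle : \mathrm{Return}^{\rho_2}\,(\llbracket B^{\rho_1\rho_0} \rrbracket\ @\ \rho_0).$$

By the definition of the translation, this is all that is required.

$\vdash^{src}_{obj\,e}$-alloc/$\vdash^{src}_{obj\,b}$-λ:
The proof is the same as in the core language except that we use $T \subseteq T_0 \rightarrow \llbracket T \rrbracket \sqsubseteq \llbracket T_0 \rrbracket$.

$\vdash^{src}_{obj\,e}$-alloc/$\vdash^{src}_{obj\,b}$-µλ:
This case is similar to that of $\vdash^{src}_{obj\,e}$-alloc/$\vdash^{src}_{obj\,b}$-λ.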

We have now seen some enhancements of the languages. Next, we show how those enhancements are borne out in the reduction semantics and explore other enhancements that are specific to the intermediate language.

CHAPTER 6

Intermediate Languages

In addition to the enhancements already discussed in Chapter 5, we provide the following enhancements for the intermediate languages:

• More precise indexing.

• Eager deallocation of regions in the store, as described in Section 2.7 of the Introduction. Because we support heap-allocated functions and reject programs that perform operations on deallocated regions from functions that are not called, we must carefully restrict when we can deallocate a region eagerly. Specifically, we must ensure that there are no operations on the region being deallocated from anywhere in the expression or the store.

As with the source language, the enhancements are independent. In particular, the presence of nonlexical information in the indexes is not necessary for eager deallocation of regions.

1. Object Language

1.1. Dynamics.

The object intermediate language syntax is presented in Figure 6.1.1. As in the source language (Figure 5.1.1), expressions and prestorables are divided into separate classes on the basis of whether or not they are generalizable. Also as in that language, prestorables include constants and recursive functions. Constants are now also included among the storables. As in the core object intermediate language, a freeregion construct is present among expressions and generalized addresses are present


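$$\begin{aligned}
\langle q; s \rangle^{r_2} &= q^{r_2} \times s^{r_2} \\
\langle e; s \rangle^{r_1\,\hat{}\,r_2} &= e^{r_1\,\hat{}\,r_2} \times s^{r_1 r_2} \\
\langle v; s \rangle^{r_1\,\hat{}\,r_2} &= v^{r_1} \times s^{r_1 r_2} \\
q^{r_2} &= e^{\epsilon\,\hat{}\,r_2} \\
e^{\varrho\,\hat{}\,r_3} &::= {}^{\forall}e^{\varrho\,\hat{}\,r_3} \mid {}^{\not\forall}e^{\varrho\,\hat{}\,r_3} \\
{}^{\forall}e^{\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3} &::= p^{\varrho} \mid @\,\varrho_0\ {}^{\forall}b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r_3} \\
{}^{\not\forall}e^{\varrho\,\hat{}\,r_3} &::= \mathrm{let}\ x = e^{\varrho\,\hat{}\,r_3}\ \mathrm{in}\ e^{\varrho\,\hat{}\,r_3} \mid \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e^{\varrho\rho_0\,\hat{}\,r_3} \\
{}^{\not\forall}e^{r_1\,\hat{}\,r_0 r_2} &::= \mathrm{freeregion}\ r_0\ \mathrm{after}\ e^{r_1 r_0\,\hat{}\,r_2} \\
{}^{\not\forall}e^{\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3} &::= @\,\varrho_0\ {}^{\not\forall}b \mid @\,\varrho_0\ \mathrm{deref}\ e^{\varrho\,\hat{}\,r_3} \mid @\,\varrho_0\ \mathrm{set}\ e^{\varrho\,\hat{}\,r_3}\ \mathrm{to}\ e^{\varrho\,\hat{}\,r_3} \mid @\,\varrho_0\ e^{\varrho\,\hat{}\,r_3}\ e^{\varrho\,\hat{}\,r_3} \\
b^{\varrho\,\hat{}\,r_3} &::= {}^{\forall}b^{\varrho\,\hat{}\,r_3} \mid {}^{\not\forall}b^{\varrho\,\hat{}\,r_3} \\
{}^{\forall}b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r_3} &::= \langle c \rangle \mid \langle \lambda x.\,e^{\varrho_1\varrho_0\,\hat{}\,r_3} \rangle \mid \langle \mu x@\varrho_0.\ \lambda x.\ e^{\varrho_1\varrho_0\,\hat{}\,r_3} \rangle \\
{}^{\not\forall}b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r_3} &::= \langle \mathrm{ref}\ e^{\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3} \rangle \\
d^{r} &::= \langle c \rangle \mid \langle \lambda x.\,e^{r\,\hat{}\,\epsilon} \rangle \mid \langle \mathrm{ref}\ v^{r} \rangle \\
p^{\varrho} &::= v^{\varrho} \mid x \\
v^{\varrho} &::= g \mid a^{\varrho} \\
a &::= \langle \emptyset, o \rangle \mid \langle r_0, o \rangle \\
\varrho &::= \rho \mid r \\
s^{\epsilon} &::= \emptyset \\
s^{r_1 r_0} &::= s^{r_1}\,\{r_0 \mapsto \between s\} \\
\between s^{r} &::= \emptyset \mid \between s\,\{o \mapsto d^{r}\} \\
t^{r} &= [f^{r}] \\
f^{r} &::= (\iota\ @\ a) \\
\iota &::= \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec}
\end{aligned}$$

with metavariables $g$ (global constants), $\rho$ (region variables), $r$ (region names), $o$ (offsets) and $x$ (program variables).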

Figure 6.1.1. Object Intermediate Language Syntax

among values. Region names and region indicators are defined and configurations, stores, region stores, traces, and atomic traces are defined as in that language. We extend frn( ) to refer to the free region names of an expression. The syntactic class of programs is given a nonlexical index: a sequence of region names, corresponding to the regions on the store. The syntactic classes of expressions and prestorables are given both lexical and nonlexical indexes ($\varrho\,\hat{}\,r$). Here, the lexical index $\varrho$ corresponds to the indexes of the core object intermediate language (Chapter 4). The nonlexical index $r$ corresponds to regions “pending” on the store, for use either in a freeregion form within the current expression or prestorable, or outside of (but not including) the current expression or prestorable. Because region indicator sequences do not contain duplicates, expressions that declare the same region name in overlapping scopes are precluded prior to the type system. The linearity of our nonlexical indexes ensures that any sibling running forms declare the same region indicator. Stores are given only a nonlexical index. Function bodies have empty nonlexical index. This will be convenient because we will want to retract their lexical index. We could ensure that all nonevaluation contexts, i.e., all contexts required to hold active source expressions, such as let bodies and operands of functions with nonvalue operators, also have empty nonlexical index but for convenience refrain from doing so. We have that ${}^{imd}_{obj}v^{\varrho} \subseteq {}^{imd}_{obj}e^{\varrho\,\hat{}\,\epsilon}$ and thus ${}^{imd}_{obj}d^{r} \subseteq {}^{imd}_{obj}b^{r/\,\hat{}\,\epsilon}$.

Program configurations take the form $\langle q^{r_0}; s^{r_0} \rangle$, while expression configurations take the form $\langle e^{r_1\,\hat{}\,r_2}; s^{r_1 r_2} \rangle$ and value configurations take the form $\langle v^{r_1\,\hat{}\,r_2}; s^{r_1 r_2} \rangle$. We can see from this template that for an expression (value) and store to be combined as a configuration, we must be able to partition the store index so that the prefix is the expression's (value's) lexical index and the suffix its nonlexical index. We refer to program configurations as $\langle q; s \rangle^{r_0}$, expression configurations as $\langle e; s \rangle^{r_1\,\hat{}\,r_2}$, and value configurations as $\langle v; s \rangle^{r_1\,\hat{}\,r_2}$.

We now require, in addition to the substitution forms already mentioned, substitution of ∅ for a region name in an arbitrary expression ($e - r_0$). We extend frn( ) accordingly. Only the nonlexical index can change during reduction, so a program configuration reduction $\langle q; s \rangle \overset{t}{\rightharpoonup} \langle q'; s' \rangle$ with index change $r_2 \to r_2'$ can be abbreviated as $\langle q; s \rangle^{r_2} \overset{t}{\rightharpoonup} \langle q'; s' \rangle^{r_2'}$, and an expression configuration reduction $\langle e; s \rangle \overset{t}{\rightharpoonup} \langle e'; s' \rangle$ with index change $r\,\hat{}\,r_3 \to r_3'$ can be abbreviated as $\langle e; s \rangle^{r\,\hat{}\,r_3} \overset{t}{\rightharpoonup} \langle e'; s' \rangle^{r\,\hat{}\,r_3'}$.

The reduction rules in Figure 6.1.2 leave the lexical index unchanged. They extend the nonlexical index only for $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion and retract it only for $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion. These indexes are again more general than required by our evaluation contexts. Only the redex of rule $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion may contain freeregion constructs, so $r_3$ is empty in each reduction rule, i.e., the nonlexical indexes are empty except for the rule $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion and the region being allocated in $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion.

We specify eager deallocation of regions by allowing $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion to be applied with a nonlexical index of length greater than one, and with an arbitrary subexpression rather than a value. $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion can no longer make any assumption regarding the position within the store of the region store to be deallocated. For this rule, we extend the operation [ := ∅] to expressions and segments of stores not actively mentioning the region to be deallocated. For expressions, this

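$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-let (index $r\,\hat{}\,r_3 \to r_3$):
$$\langle \mathrm{let}\ x = v\ \mathrm{in}\ e;\ s \rangle \overset{[\,]}{\rightharpoonup} \langle e\,[x := v];\ s \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$:
$$\langle @\,r_0\ d;\ s \rangle \overset{[(\mathrm{alloc}\,@\,a)]}{\rightharpoonup} \langle a;\ s\,\{a \mapsto d\} \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-µλ (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$:
$$\langle \langle \mu x_1@r_0.\ \lambda x_2.\ e \rangle;\ s \rangle \overset{[(\mathrm{alloc}\,@\,a)]}{\rightharpoonup} \langle a;\ s\,\{a \mapsto \langle \lambda x_2.\,e\,[x_1 := a] \rangle\} \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-deref (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$ and $s(a) = \langle \mathrm{ref}\ v \rangle$:
$$\langle @\,r_0\ \mathrm{deref}\ a;\ s \rangle \overset{[(\mathrm{read}\,@\,a)]}{\rightharpoonup} \langle v;\ s \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-set (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$ and $s(a) = \langle \mathrm{ref}\ v_1 \rangle$:
$$\langle @\,r_0\ \mathrm{set}\ a\ \mathrm{to}\ v;\ s \rangle \overset{[(\mathrm{write}\,@\,a)]}{\rightharpoonup} \langle \mathrm{unit};\ s\,\{a \mapsto \langle \mathrm{ref}\ v \rangle\} \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-λ (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$ and $s(a) = \langle \lambda x.\,e_0 \rangle$:
$$\langle @\,r_0\ a\ v;\ s \rangle \overset{[(\mathrm{exec}\,@\,a)]}{\rightharpoonup} \langle e_0\,[x := v];\ s \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-const (index $r_1 r_0 r_2\,\hat{}\,r_3 \to r_3$), with $a = \langle r_0, o \rangle$ and $s(a) = \langle c \rangle$:
$$\langle @\,r_0\ a\ v;\ s \rangle \overset{[(\mathrm{exec}\,@\,a)] + \delta'(c)(\langle v; s \rangle)}{\rightharpoonup} \delta(c)(\langle v; s \rangle)$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion (index $r\,\hat{}\,r_3 \to r_0 r_3$):
$$\langle \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0;\ s \rangle \overset{[\,]}{\rightharpoonup} \langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0\,[\rho_0 := r_0];\ s\,\{r_0 \mapsto \emptyset\} \rangle$$

$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion (index $r_1\,\hat{}\,r_0 r_2 \to r_2$), with $\langle e_0; s_2 \rangle$ not actively mentioning $r_0$:
$$\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0;\ s_1\,\{r_0 \mapsto \between s_0\}\ s_2 \rangle \overset{[\,]}{\rightharpoonup} \langle e_0\,[r_0 := \emptyset];\ s_1\ s_2\,[r_0 := \emptyset] \rangle$$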

Figure 6.1.2. Object Language Expression Configuration Reduction Rules

is straightforward except in the case of freeregion. If freeregion constructs declaring the same region name were nested, we would confront the situation of substituting for a region name in a construct that declares the same region name. For completeness, we define this consistently with substitution of program variables in λ expressions, so that the freeregion construct is left unchanged. As in the previous part, however, $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion will preclude that situation from developing. We define [ := ∅] over store segments as follows. We push it into each stored value and then, in the case of reference cells, use our existing definition on the contained values, or in the case of functions, use our definition over expressions on the function body. To the extent that constants are defined in terms of existing regions or effects, these must also be subject to [ := ∅]. The requirement of $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion that $\langle e; s_2 \rangle$ not actively mention $r_0$, described in Section 2.7 of the Introduction, precludes operations on the region to be deallocated but permits addresses pointing to the region.
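The substitution of ∅ for a region name over a store segment can be sketched in Python as follows. This is only an illustration under assumed encodings: stores are nested dictionaries, storables and expressions are tagged tuples, `None` stands for ∅, and the function names are hypothetical. Constants are assumed to carry at most one region name.

```python
def subst_expr(e, r0):
    """e[r0 := None]: replace the region name r0 by None (standing for ∅)."""
    if isinstance(e, tuple):
        if e[0] == "addr":                        # ("addr", region, offset)
            _, region, offset = e
            return ("addr", None if region == r0 else region, offset)
        if e[0] == "freeregion" and e[1] == r0:   # redeclares r0: left unchanged
            return e
        return tuple(subst_expr(part, r0) for part in e)
    # Leaves: a bare region name is replaced, everything else is kept.
    return None if e == r0 else e


def subst_storable(d, r0):
    """Push the substitution into one stored value."""
    tag = d[0]
    if tag == "ref":                              # ("ref", value)
        return ("ref", subst_expr(d[1], r0))
    if tag == "lam":                              # ("lam", param, body)
        return ("lam", d[1], subst_expr(d[2], r0))
    if tag == "const":                            # ("const", name, region)
        return ("const", d[1], None if d[2] == r0 else d[2])
    raise ValueError(f"unknown storable: {tag}")


def subst_store(segment, r0):
    """Apply [r0 := None] to every storable of a store segment."""
    return {region: {o: subst_storable(d, r0) for o, d in cells.items()}
            for region, cells in segment.items()}


segment = {"r1": {"o1": ("ref", ("addr", "r2", "o1"))}}
print(subst_store(segment, "r2"))
```

The reference cell keeps its address, now pointing into ∅, which mirrors the remark that addresses pointing to the deallocated region are permitted while operations on it are not.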


$\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-const is a new reduction rule for handling applications of constant functions. It assumes a partial function $\delta^{r_1\,\hat{}\,r_2} : {}^{imd}_{obj}c \Rightarrow {}^{imd}_{obj}\langle v; s \rangle^{r_1\,\hat{}\,r_2} \Rightarrow {}^{imd}_{obj}\langle v; s \rangle^{r_1\,\hat{}\,r_2}$ over value configurations, which we assume is restricted to modify only the lexical portion of the store. The rule also assumes a partial function $\delta'^{\,r_1\,\hat{}\,r_2} : {}^{imd}_{obj}c \Rightarrow {}^{imd}_{obj}\langle v; s \rangle^{r_1\,\hat{}\,r_2} \Rightarrow {}^{imd}_{obj}t^{r_1}$, defined over the same domain, to provide the trace for any constant function and initial value configuration. We assume, for example, that $\delta(\text{zero?}@r_0)(\langle \langle r_0, o \rangle; s \rangle)$ is true if $s(\langle r_0, o \rangle) = \langle 0 \rangle$ and false otherwise, and that $\delta'(\text{zero?}@r_0)(\langle \langle r_0, o \rangle; s \rangle) = [(\mathrm{read}\ @\ \langle r_0, o \rangle)]$. δ and δ′ similarly implement other constant functions as one might expect. We restrict δ and δ′ later via a relationship with TypeOf in the type system.

We wish to implement recursion in such a way that the recursive function is allocated only once, and not on each recursive call. This constraint precludes a simple unfolding. Recursion is implemented in $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-µλ by allocating a closure, substituting the address for the function name in the function body, and extending the store with the closure. Thus, like $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc, $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-µλ allocates a new offset in an existing region and extends the store. It also carries the implicit requirement that $r_0 \in \mathrm{Dom}(s)$ but $o \notin \mathrm{Dom}(s(r_0))$.

We define program expression contexts and expression contexts as in the core language but extend the indexes to include nonlexical information. A program expression context is a program with a hole expecting an expression. Metavariables for program expression contexts take the form $[e^{r_2\,\hat{}\,r_3}]^{r_2 r_3}_{q}$, representing a program defining regions $r_2$ around a hole expecting an expression defining regions $r_3$. We refer to such metavariables as $[e]_q^{r_2\,\hat{}\,r_3}$.

An expression context is an expression with a hole expecting another expression. We refer to an empty expression context as $[e^{r_1\,\hat{}\,r_2}]$, assuming a surrounding context defining regions $r_1$. This context expects (and returns) an expression that may make use of the nonlexical context $r_2$. Metavariables for expression contexts take the form $[e^{r_1 r_2\,\hat{}\,r_3}]^{r_1\,\hat{}\,r_2 r_3}_{e}$, representing an expression defining regions $r_2$ around a hole expecting another expression defining regions $r_3$, assuming a surrounding context defining regions $r_1$. We refer to such metavariables as $[e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}$. We refer to the filling of a context $[e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}$ with an expression $e_0^{r_1 r_2\,\hat{}\,r_3}$ as $[e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}[e_0]$.

We refer to an empty trace context as $[t^{r_0}]$. It expects and returns unchanged a trace over $r_0$. Metavariables for trace contexts take the form $[t^{r_1 r_2}]^{r_1}_{t}$, representing a restriction of actions at regions $r_2$ around a hole expecting a trace of actions at $r_1 r_2$. We refer to such metavariables as $[t]_t^{r_1\,\hat{}\,r_2}$. Filling of a context with a trace is indicated similarly to expression contexts.
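The allocation performed by the -alloc and -µλ rules, and the zero? instance of δ and δ′, can be sketched in Python. This is an illustrative model only: the dictionary store, the tagged tuples, the offset naming scheme and the names `alloc`, `alloc_rec`, `substitute`, `delta_zero` and `delta_prime_zero` are assumptions, and the substitution ignores variable shadowing.

```python
def substitute(term, name, value):
    """Naive term[name := value] over tuple-encoded terms (no shadowing check)."""
    if term == name:
        return value
    if isinstance(term, tuple):
        return tuple(substitute(part, name, value) for part in term)
    return term


def fresh_offset(cells):
    """Pick an offset not yet in the region store (assumes no cell is ever removed)."""
    return f"o{len(cells) + 1}"


def alloc(store, region, storable):
    """Place a storable at a fresh offset of an existing region."""
    cells = store[region]            # KeyError if the region is not in Dom(s)
    addr = (region, fresh_offset(cells))
    cells[addr[1]] = storable
    return addr, [("alloc", addr)]   # resulting value and trace


def alloc_rec(store, region, fname, param, body):
    """Allocate a recursive function once: its name becomes its own address."""
    cells = store[region]
    addr = (region, fresh_offset(cells))
    cells[addr[1]] = ("lam", param, substitute(body, fname, addr))
    return addr, [("alloc", addr)]


def delta_zero(store, addr):
    """Result of zero? applied to the constant stored at addr."""
    region, offset = addr
    return store[region][offset] == ("const", 0)


def delta_prime_zero(store, addr):
    """Trace of zero?: a single read at the argument's address."""
    return [("read", addr)]


store = {"r1": {}}
f, _ = alloc_rec(store, "r1", "x1", "x2", ("app", "r1", "x1", "x2"))
print(store["r1"][f[1]])
```

The stored closure is an ordinary (technically nonrecursive) function whose body mentions its own address, so later recursive calls reuse the same cell rather than allocating again.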


The notions of program expression contexts and trace contexts are combined to define program expression/trace contexts of the form $([e]_q\ !\ [t]_t)^{r_2\,\hat{}\,r_3}$, corresponding to the concatenation $([e]_q^{r_2\,\hat{}\,r_3}\ !\ [t]_t^{\,\hat{}\,r_2})$. The notions of expression contexts and trace contexts are combined to define expression/trace contexts of the form $([e]_e\ !\ [t]_t)^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}$, corresponding to the concatenation $([e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}\ !\ [t]_t^{r_1\,\hat{}\,r_2})$.

$$\begin{aligned}
{}^{\to^*}[e]_q^{\epsilon\,\hat{}\,r_3} \ &::=\ [e] \\
({}^{\to^*}[e]_e\ !\ {}^{\to^*}[t]_t)^{r_1\,\hat{}\,\epsilon\,\hat{}\,r_3} \ &::=\ ([e]\ !\ [t]) \mid (\mathrm{let}\ x = [e]\ \mathrm{in}\ e\ !\ [t]) \mid (@\,r_0\ \langle \mathrm{ref}\ [e] \rangle\ !\ [t]) \mid (@\,r_0\ \mathrm{deref}\ [e]\ !\ [t]) \\
&\ \ \mid (@\,r_0\ \mathrm{set}\ [e]\ \mathrm{to}\ e\ !\ [t]) \mid (@\,r_0\ \mathrm{set}\ v\ \mathrm{to}\ [e]\ !\ [t]) \mid (@\,r_0\ [e]\ e\ !\ [t]) \mid (@\,r_0\ v\ [e]\ !\ [t]) \\
({}^{\to^*}[e]_e\ !\ {}^{\to^*}[t]_t)^{r_1\,\hat{}\,r_0\,\hat{}\,r_3} \ &::=\ (\mathrm{freeregion}\ r_0\ \mathrm{after}\ [e]\ !\ [t] - r_0)
\end{aligned}$$

Figure 6.1.3. CBV Object Language Atomic Expression/Trace Evaluation Contexts

In Figure 6.1.3 we define evaluation contexts to guide the CBV reduction of expressions. These again merely augment the core-language evaluation contexts of Figure 4.1.3 with enhanced indexes. freeregion $r_0$ after $[e]$ is an evaluation context while letregion $\rho_0$ in $[e]$ is not, so regions continue to be allocated outermost-first. The rules still ensure, as with the core language, that regions are allocated depth-first. The multiple deep reduction rules in Figure 6.1.4 similarly augment the core-language rules of Figure 4.1.4 with enhanced indexes. With the change to $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion and (freeregion $r_0$ after $[e]$ ! $[t] - r_0$) remaining an evaluation context, the operational semantics is nondeterministic in that there is an option of deallocating regions eagerly or not.

In Figures 6.1.5 through 6.1.7 we demonstrate this dynamic semantics on the sample source program typed in Figures 5.1.12 through 5.1.15. The program is partially reduced as follows. We first create the outer region as $r_1$, then drop through several contexts to allocate the recursive function using $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-µλ. This causes a technically nonrecursive function to be allocated at $\langle r_1, o_1 \rangle$, with that same address replacing occurrences of the formal parameter $x_1$. Eventually, we reach $d_1$ in Figure 6.1.6. Here, we allocate a constant sub1 function that operates on the region $r_1$ (this will not


[Figure 6.1.4 body: inference rules $\to^{*\,imd}_{obj\langle q;s\rangle}$-context, $\to^{*\,imd}_{obj\langle e;s\rangle}$-context, $\to^{*\,imd}_{obj\langle e;s\rangle}$-trans, $\to^{*\,imd}_{obj\langle e;s\rangle}$-reflex and $\to^{*\,imd}_{obj\langle e;s\rangle}$-step. The context rules lift a reduction $\langle e_1; s \rangle \overset{t_1}{\to^*} \langle e_1'; s' \rangle$ through an evaluation context, the trans rule composes $\langle e; s \rangle \overset{t}{\to^*} \langle e'; s' \rangle$ and $\langle e'; s' \rangle \overset{t'}{\to^*} \langle e''; s'' \rangle$ into $\langle e; s \rangle \overset{t+t'}{\to^*} \langle e''; s'' \rangle$ with index change $r\,\hat{}\,r_3 \to r_3''$, the reflex rule gives $\langle e; s \rangle \overset{[\,]}{\to^*} \langle e; s \rangle$, and the step rule embeds a single $\rightharpoonup$ step.]

Figure 6.1.4. Object Language Multiple Deep Reduction Rules

be used) and a constant integer one, both using $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc. We then apply the recursive function to the latter using $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-λ, as with any other function application. In $d_2$ in Figure 6.1.7, we create another region $r_2$. We allocate there a constant add1 function that operates back at region $r_1$, and apply it using $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-const to the integer ⟨1⟩ at $\langle r_1, o_3 \rangle$ in the store. δ defines the resulting configuration holding the integer two, newly allocated at $\langle r_1, o_4 \rangle$. δ′ defines the resulting trace, which is registered after the execution at $\langle r_2, o_1 \rangle$. Rather than reapplying the recursive function immediately, we emerge to the freeregion construct and deallocate the region $r_2$. The pending application remains, and the trace continues indefinitely. This infinite loop creates a new region with every iteration but, through eager region deallocation, uses a maximum of two region stores. This example barely demonstrates the power of eager region deallocation, since we still maintain a stack discipline. That discipline would be broken if, for example, we declared a second region in the function body and deallocated only those resulting from the outer letregion construct, letting those resulting from the inner one accumulate indefinitely.

The definition of immediately faulty expression configurations is modified to drop any reference to freeregion. The reduction rule no longer requires the region being deallocated to be on top of the store, while our indexes guarantee that it will appear just past the lexical portion of the store. The definition is also modified to allow the store to hold a constant function in the case of an application.
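The effect of eager deallocation on the number of live region stores in this loop can be illustrated with a small Python simulation. It is a deliberately coarse model rather than the reduction semantics: the loop is cut off after a fixed number of iterations, and the function name `peak_regions`, the region naming scheme and the stored placeholders are assumptions made for the illustration.

```python
def peak_regions(iterations, eager):
    """Return the largest number of region stores alive at any point.

    Each iteration models one unfolding of the recursive function: a fresh
    region is created, add1 is allocated in it, and the result is allocated
    back in the outer region r1.
    """
    store = {"r1": {}}                 # outer region, alive for the whole run
    peak = len(store)
    pending = []                       # regions still awaiting deallocation
    for i in range(iterations):
        region = f"r2_{i}"             # hypothetical fresh region name
        store[region] = {"o1": "add1@r1"}
        store["r1"][f"o{i + 4}"] = i + 2
        peak = max(peak, len(store))
        if eager:
            del store[region]          # freeregion fires before the next call
        else:
            pending.append(region)     # region stays until the call returns
    for region in pending:
        del store[region]
    return peak


print(peak_regions(5, eager=True), peak_regions(5, eager=False))
```

With eager deallocation the peak stays at two region stores regardless of the iteration count, while without it the peak grows by one per iteration.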


[Figure 6.1.5 body: derivation tree reducing the sample program in the empty store ∅. It applies $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion (trace [ ]), producing freeregion $r_1$ after $[e^{r_1}]$ with store ∅{$r_1$ ↦ ∅}, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-µλ (trace [(alloc @ ⟨r1, o1⟩)]), reaching ⟨⟨r1, o1⟩; s1⟩, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-let, and continues in $d_1$. The stores used in Figures 6.1.5 through 6.1.7 are:]

$$\begin{aligned}
s_1 &= \emptyset\,\{r_1 \mapsto \between s_{1,1}\} & \between s_{1,1} &= \emptyset\,\{o_1 \mapsto \langle \lambda x_2.\ \mathrm{letregion}\ \rho_2\ \mathrm{in}\ @\,r_1\ \langle r_1, o_1 \rangle\ (@\,\rho_2\ @\,\rho_2\ \langle \mathrm{add1}@r_1 \rangle\ x_2) \rangle\} \\
s_2 &= \emptyset\,\{r_1 \mapsto \between s_{1,2}\} & \between s_{1,2} &= \between s_{1,1}\,\{o_2 \mapsto \langle \mathrm{sub1}@r_1 \rangle\} \\
s_3 &= \emptyset\,\{r_1 \mapsto \between s_{1,3}\} & \between s_{1,3} &= \between s_{1,2}\,\{o_3 \mapsto \langle 1 \rangle\} \\
s_4 &= s_3\,\{r_2 \mapsto \between s_{2,1}\} & \between s_{1,4} &= \between s_{1,3}\,\{o_4 \mapsto \langle 2 \rangle\} \\
s_5 &= \emptyset\,\{r_1 \mapsto \between s_{1,4}\}\,\{r_2 \mapsto \between s_{2,1}\} & \between s_{2,1} &= \emptyset\,\{o_1 \mapsto \langle \mathrm{add1}@r_1 \rangle\} \\
s_6 &= \emptyset\,\{r_1 \mapsto \between s_{1,4}\}
\end{aligned}$$

Figure 6.1.5. Sample Object Language Reduction, I

[Figure 6.1.6 body: derivation $d_1$. It applies $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc to ⟨@ r1 ⟨sub1@r1⟩; s1⟩ (trace [(alloc @ ⟨r1, o2⟩)]), reaching ⟨⟨r1, o2⟩; s2⟩, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc to ⟨@ r1 ⟨1⟩; s2⟩ (trace [(alloc @ ⟨r1, o3⟩)]), reaching ⟨⟨r1, o3⟩; s3⟩, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-λ (trace [(exec @ ⟨r1, o1⟩)]), and continues in $d_2$.]

Figure 6.1.6. Sample Object Language Reduction, II

[Figure 6.1.7 body: derivation $d_2$. Starting from ⟨letregion ρ2 in @ r1 ⟨r1, o1⟩ (@ ρ2 @ ρ2 ⟨add1@r1⟩ ⟨r1, o3⟩); s3⟩, it applies $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-letregion (trace [ ]), producing freeregion $r_2$ after $[e^{r_1 r_2}]$, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-alloc on ⟨@ r2 ⟨add1@r1⟩; s3{r2 ↦ ∅}⟩ (trace [(alloc @ ⟨r2, o1⟩)]), reaching ⟨⟨r2, o1⟩; s4⟩, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-const (trace [(exec @ ⟨r2, o1⟩), (read @ ⟨r1, o3⟩), (alloc @ ⟨r1, o4⟩)]), reaching ⟨⟨r1, o4⟩; s5⟩, then $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion (trace [ ]), reaching ⟨@ r1 ⟨r1, o1⟩ ⟨r1, o4⟩; s6⟩, and finally $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-app-λ (trace [(exec @ ⟨r1, o1⟩)]), after which the reduction continues indefinitely.]

Figure 6.1.7. Sample Object Language Reduction, III

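Definition 6.1.1 (Immediately Faulty Expression Configurations).

(1) $\langle x; s \rangle^{r_1\,\hat{}\,r_3}$

(2) $\langle @\,r_0\ \langle \lambda x.e_0 \rangle; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($\lambda x.e_0$ not closed)

(3) $\langle @\,r_0\ \mathrm{deref}\ v; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$, $\langle @\,r_0\ \mathrm{set}\ v\ \mathrm{to}\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$, $\langle @\,r_0\ v\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($v \neq a$)

(4) $\langle @\,r_0\ \mathrm{deref}\ a; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$, $\langle @\,r_0\ \mathrm{set}\ a\ \mathrm{to}\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$, $\langle @\,r_0\ a\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($a \notin \mathrm{Dom}(s)$)

(5) $\langle \mathrm{deref}\ a; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$, $\langle \mathrm{set}\ a\ \mathrm{to}\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($s(a) \neq \langle \mathrm{ref}\ v \rangle$)

(6) $\langle a\ e; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($s(a) \notin \{\langle c \rangle, \langle \lambda x.e_0 \rangle\}$)

(7) $\langle a\ v; s \rangle^{r_1 r_0 r_2\,\hat{}\,r_3}$ ($s(a) = \langle c \rangle$, $\delta(c)(\langle v; s \rangle)$ not defined)

Our definition of immediately faulty expression configurations is complete.

Proposition 6.1.1 (Nonvalue program configurations are reducible or faulty). Every program configuration $\langle q; s \rangle^{r}$ with a nonvalue program $q^{r}$ can be decomposed into the form $\langle {}^{\to^*}[e]_q^{r_1\,\hat{}\,r_2}[e]; s \rangle^{r}$, for $r = r_1 r_2$, where $\langle e; s \rangle^{r_1\,\hat{}\,r_2}$ is either a redex or immediately faulty. In the former case, we say that $\langle q; s \rangle^{r}$ is reducible, in the latter case that it is faulty.

We again define the sets of reachable program and expression configurations in anticipation of restricting our type soundness theorem to them.

Definition 6.1.2 (Configurations Reachable from Source).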

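${}^{imd}_{obj}\langle q; s \rangle^{r_2}$: $R\,{}^{imd}_{obj}\langle q; s \rangle^{r_2}$ is the least set such that both:

(1) $\forall q \in {}^{src}_{obj}q.\quad \langle q; \emptyset \rangle^{\epsilon} \in R\,{}^{imd}_{obj}\langle q; s \rangle^{\epsilon}$

(2) $\forall \langle q; s \rangle^{r_2} \in R\,{}^{imd}_{obj}\langle q; s \rangle^{r_2}.\quad \langle q; s \rangle \to^* \langle q'; s' \rangle$ (with index change $r_2 \to r_2'$) $\ \rightarrow\ \langle q'; s' \rangle^{r_2'} \in R\,{}^{imd}_{obj}\langle q; s \rangle^{r_2'}$

${}^{imd}_{obj}\langle e; s \rangle^{r_1\,\hat{}\,r_2}$:

$$\langle e; s \rangle^{r_1\,\hat{}\,r_2} \in R\,{}^{imd}_{obj}\langle e; s \rangle^{r_1\,\hat{}\,r_2} \ \Leftrightarrow\ \exists\, {}^{\to^*}[e]_q^{r_1}.\ \langle {}^{\to^*}[e]_q^{r_1}[e]; s \rangle \in R\,{}^{imd}_{obj}\langle q; s \rangle^{r_1 r_2}$$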

Source-reachable program configurations constructed with a value must contain an empty store. Source-reachable expression configurations constructed with a value must contain a store whose structure matches the lexical context of the expression configuration.

Lemma 6.1.1 (Source-Reachable Value Configurations Have Lexical Store).

${}^{imd}_{obj}\langle q; s \rangle^{r_2}$: $\quad \langle v; s \rangle^{r_2} \in R\,{}^{imd}_{obj}\langle q; s \rangle^{r_2} \ \rightarrow\ r_2 = \epsilon$

${}^{imd}_{obj}\langle e; s \rangle^{r_1\,\hat{}\,r_2}$: $\quad \langle v; s \rangle^{r_1\,\hat{}\,r_2} \in R\,{}^{imd}_{obj}\langle e; s \rangle^{r_1\,\hat{}\,r_2} \ \rightarrow\ r_2 = \epsilon$

Our proof of type soundness (Lemma 6.1.6) will require an additional lemma. Programs and expressions reachable from the source language are factorable only into active source evaluation contexts.

Lemma 6.1.2 ($R\,{}^{imd}_{obj}\langle q; s \rangle^{r_2}$ and $R\,{}^{imd}_{obj}\langle e; s \rangle^{r_1\,\hat{}\,r_2}$ Terms Factorable Only Into Active Source Evaluation Contexts). We define an active source expression context to be an expression, with a hole, that does not contain full freeregion constructs. We overlook freeregion constructs whose body includes the hole.

${}^{imd}_{obj}\langle q; s \rangle^{r_1 r_2}$: $\quad \langle {}^{\to^*}[e]_q^{r_1\,\hat{}\,r_2}[e_1]; s \rangle^{r_1 r_2} \in R\,{}^{imd}_{obj}\langle q; s \rangle^{r_1 r_2} \ \rightarrow\ {}^{\to^*}[e]_q^{r_1\,\hat{}\,r_2}$ is an active source program expression context.

${}^{imd}_{obj}\langle e; s \rangle^{r_1 r_2\,\hat{}\,r_3}$: $\quad \langle {}^{\to^*}[e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}[e_1]; s \rangle^{r_1 r_2\,\hat{}\,r_3} \in R\,{}^{imd}_{obj}\langle e; s \rangle^{r_1 r_2\,\hat{}\,r_3} \ \rightarrow\ {}^{\to^*}[e]_e^{r_1\,\hat{}\,r_2\,\hat{}\,r_3}$ is an active source expression context.

1.2. Statics.

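$$\begin{aligned}
\Gamma^{\epsilon} &::= \emptyset \mid \Gamma^{\epsilon}\,\{x \mapsto P^{\epsilon}\} \\
\Gamma^{\varrho_1\varrho_0} &::= \Gamma^{\varrho}\,\{x \mapsto {}^{\forall}P^{\varrho}\} \mid \Gamma^{\varrho_1}\,\{\varrho_0\} \\
\langle Q; S \rangle^{r_2} &= Q \times S^{r_2} \\
\langle E; S \rangle^{r_1\,\hat{}\,r_2} &= E^{r_1} \times S^{r_1 r_2} \\
E^{\varrho} &::= P^{\varrho}\ !\ T^{\varrho} \\
{}^{\forall}B^{\varrho} &::= \forall X.\ B^{\varrho} \\
B^{\varrho_1\varrho_0} &::= X \mid \mathrm{Ref}\ P^{\varrho} \mid P^{\varrho} \overset{T^{\varrho}}{\Rightarrow} P^{\varrho} \mid C \\
\langle P; S \rangle^{r_1\,\hat{}\,r_2} &= P^{r_1} \times S^{r_1 r_2} \\
{}^{\forall}P^{\varrho} &::= \forall X.\ P^{\varrho} \\
P^{\epsilon},\ Q &::= G \mid \emptyset \\
P^{\varrho_1\varrho_0} &::= B^{\varrho_1\varrho_0}\ @\ \varrho_0 \mid P^{\varrho_1} \\
S^{\epsilon} &::= \emptyset \\
S^{r_1 r_0} &::= S^{r_1}\,\{r_0 \mapsto \between S\} \\
\between S^{r} &::= \emptyset \mid \between S\,\{o \mapsto {}^{\forall}B^{r}\} \\
T^{\varrho} &= \{F^{\varrho}\} \\
F^{\varrho_1\varrho_0\varrho_2} &::= (\iota\ @\ \varrho_0) \\
\iota &::= \mathrm{alloc} \mid \mathrm{read} \mid \mathrm{write} \mid \mathrm{exec}
\end{aligned}$$

with $G$ (global constant types), $X$ (type variables) and $C$ (allocated constant types) taken from the object source language.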

Figure 6.1.1. Object Intermediate Language Static Syntax

Our introduction of let-bound polymorphic program variables in the source language requires a related modification as we define a reduction semantics. Not only must our environments (of nonempty index) hold pure type schemes, but our region store types must now hold storable type schemes. This is the most noticeable difference from Calcagno, Helsen and Thiemann [5] in implementing polymorphism. Definitions of restriction operations on the static syntax are presented in Figure 6.1.2. The restriction of store types to the innermost regions is defined as in the core language (Figure 4.1.2). However, there are several major differences, all due to rule $\rightharpoonup^{imd}_{obj\langle e;s\rangle}$-freeregion, that will cascade through the rest of the definitions.

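Region substitution on environments, $\Gamma^{\rho_0\varrho_1\rho_2}[\rho_0 := r_0] \in {}^{imd}_{obj}\Gamma^{r_0\varrho_1\rho_2}$:

$$(\Gamma_1\,\{\rho_0\}\,\{x_0 \mapsto P_0\}\,\{\rho_2\}\,\{x_2 \mapsto P_2\})[\rho_0 := r_0] = \Gamma_1\,\{r_0\}\,\{x_0 \mapsto P_0[\rho_0 := r_0]\}\,\{\rho_2\}\,\{x_2 \mapsto P_2[\rho_0 := r_0]\}$$

Restriction of environments, $\Gamma^{\varrho_1} - \varrho_2 \in {}^{imd}_{obj}\Gamma^{\varrho_1 - \varrho_2}$:

$$\begin{aligned}
\Gamma\,\{\varrho_0\} - \varrho_2\varrho_0 &= \Gamma - \varrho_2 \\
\Gamma\,\{\varrho_1\} - \varrho_2\varrho_0 &= (\Gamma - \varrho_2\varrho_0)\,\{\varrho_1\} \qquad (\varrho_1 \neq \varrho_0) \\
\Gamma\,\{x \mapsto {}^{\forall}P\} - \varrho_2 &= (\Gamma - \varrho_2)\,\{x \mapsto ({}^{\forall}P - \varrho_2)\} \\
\Gamma - \epsilon &= \Gamma
\end{aligned}$$

Restriction of pure type schemes and pure types, ${}^{\forall}P^{\varrho_1} - \varrho_2 \in {}^{imd}_{obj}{}^{\forall}P^{\varrho_1 - \varrho_2}$ and $P^{\varrho_1} - \varrho_2 \in {}^{imd}_{obj}P^{\varrho_1 - \varrho_2}$:

$$\begin{aligned}
\forall X.\ P - \varrho_2 &= \forall X.\ (P - \varrho_2) \\
G - \varrho_2 &= G \\
\emptyset - \varrho_2 &= \emptyset \\
(B_0\ @\ \varrho_0) - \varrho_{21}\varrho_0\varrho_{22} &= \emptyset \\
(B_0\ @\ \varrho_0) - \varrho_2 &= (B_0 - \varrho_2)\ @\ \varrho_0 \qquad (\varrho_0 \notin \varrho_2)
\end{aligned}$$

Restriction of storable type schemes and storable types, ${}^{\forall}B^{\varrho_1\varrho_0} - \varrho_2$ and $B^{\varrho_1\varrho_0} - \varrho_2$:

$$\begin{aligned}
\forall X.\ B_0 - \varrho_2 &= \forall X.\ (B_0 - \varrho_2) \\
C - \varrho_2 &= C \\
X - \varrho_2 &= X \\
(\mathrm{Ref}\ P_0) - \varrho_2 &= \mathrm{Ref}\ (P_0 - \varrho_2) \\
(P_{0.1} \overset{T_0}{\Rightarrow} P_{0.2}) - \varrho_2 &= (P_{0.1} - \varrho_2) \overset{T_0 - \varrho_2}{\Rightarrow} (P_{0.2} - \varrho_2)
\end{aligned}$$

Restriction of store types, $S^{r_1 r_0 r_2} - r_0 \in {}^{imd}_{obj}S^{r_1 r_2}$, and of region store types:

$$\begin{aligned}
(S_1\,\{r_0 \mapsto \between S\}\ S_2) - r_0 &= S_1\ S_2\,[r_0 := \emptyset] \\
(\between S\,\{o \mapsto {}^{\forall}B\})[r_0 := \emptyset] &= \between S\,[r_0 := \emptyset]\,\{o \mapsto {}^{\forall}B - r_0\} \\
\emptyset\,[r_0 := \emptyset] &= \emptyset
\end{aligned}$$

Restriction of store types to the lexical regions, $|S^{r_1 r_2}|_{\varrho_1} \in {}^{imd}_{obj}S^{r_1}$ (where $|\varrho_1|_r = r_1$):

$$\begin{aligned}
|S|_{\varrho_1\rho_0} &= |S|_{\varrho_1} \\
|S\,\{r_2 \mapsto \between S\}|_{\varrho_1 r_0} &= |S|_{\varrho_1 r_0} \qquad (r_2 \neq r_0) \\
|S\,\{r_0 \mapsto \between S\}|_{\varrho_1 r_0} &= S\,\{r_0 \mapsto \between S\} \\
|\emptyset|_{\epsilon} &= \emptyset
\end{aligned}$$

Figure 6.1.2. Object Intermediate Language Definitions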


In the core language we restricted only the uppermost regions. This is no longer necessarily the case, so we may need to skip over upper region indicators in the environment. In the core language, we restricted the environment implicitly, in that the required expression derivation was rebuilt from a value derivation. Now we must make this explicit, as $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion requires us to operate directly on expression derivations. Finally, there is now a separate restriction operation on stores that is generalized to remove references to any region store, not necessarily the innermost. This corresponds, in $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion of our dynamic semantics, to the substitution of $\emptyset$ for occurrences in upper region stores of the name of a region being eagerly deallocated below. It is defined in terms of restriction of the storable type schemes of the region store type, and thus of storable types and pure types. Also, a located type at a region not being restricted may point into a region being restricted. We thus recur on the storable type. Because functions are storable, they provide another use for generalized versions of restriction of environments and pure type schemes: typing their bodies in a restricted region store. Restriction is defined for functions only when the latent effect does not include any of the regions being restricted.
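The restriction of pure and storable types described above can be sketched in Python. This is only an illustration under assumptions: the class names (`Ground`, `Dead`, `Located`, `Const`, `TVar`, `Ref`, `Fun`) and the representation of effects as sets of (action, region) pairs are hypothetical encodings, not the dissertation's implementation. The sketch follows the prose: a located type at a restricted region becomes the dead type, a located type elsewhere recurs on its storable type, and restriction of a function type is treated as undefined when its latent effect mentions a restricted region.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Hypothetical encoding of the type grammar; all names are illustrative.
@dataclass(frozen=True)
class Ground:            # ground pure type G, e.g. Unit
    name: str

@dataclass(frozen=True)
class Dead:              # the pure type of a dangling address (written as the empty set)
    pass

@dataclass(frozen=True)
class Located:           # B @ region
    storable: object
    region: str

@dataclass(frozen=True)
class Const:             # storable constant type C, e.g. Int
    name: str

@dataclass(frozen=True)
class TVar:              # type variable X
    name: str

@dataclass(frozen=True)
class Ref:               # Ref P
    content: object

@dataclass(frozen=True)
class Fun:               # P1 => P2 with a latent effect
    arg: object
    effect: FrozenSet[Tuple[str, str]]   # set of (action, region) pairs
    res: object

def restrict_pure(p, regions):
    """P - regions: a located type at a restricted region becomes Dead."""
    if isinstance(p, Located):
        if p.region in regions:
            return Dead()
        # Not restricted here, but it may point into a restricted region.
        return Located(restrict_storable(p.storable, regions), p.region)
    return p  # Ground and Dead are unchanged

def restrict_storable(b, regions):
    """B - regions: recur through Ref and function types."""
    if isinstance(b, Ref):
        return Ref(restrict_pure(b.content, regions))
    if isinstance(b, Fun):
        if any(r in regions for (_, r) in b.effect):
            raise ValueError("restriction undefined: latent effect mentions a restricted region")
        return Fun(restrict_pure(b.arg, regions), b.effect, restrict_pure(b.res, regions))
    return b  # Const and TVar are unchanged
```

For example, restricting `r0` in a reference cell located at `r1` whose content lives in `r0` leaves the cell located at `r1` but marks its content as dead.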

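prog configs: $\vdash^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle q;s\rangle : \langle Q;S\rangle$

programs: $S^{r_2} \vdash^{imd}_{obj\,q^{r_2}} q : Q$

expr configs: $\vdash^{imd}_{obj\,\langle e;s\rangle^{r_1\,\hat{}\,r_2}} \langle e;s\rangle : \langle E;S\rangle$

expressions: $\Gamma^{\varrho};\ S^{\text{seq-}\hat{}_r(\varrho\,\hat{}\,r)} \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e : E^{\varrho}$

prestorables: $\Gamma^{\text{seq-/}(\vec{\varrho})};\ S^{\text{seq-}\hat{}_r(\vec{\varrho}\,\hat{}\,r)} \vdash^{imd}_{obj\,b^{\vec{\varrho}\,\hat{}\,r}} b : B^{\varrho_1\varrho_0} \;!\; T^{\varrho}$, where $\vec{\varrho} = \varrho_1\varrho_0/\varrho_2$

pures: $\Gamma^{\varrho};\ S^{\text{seq-}\hat{}_r(\varrho\,\hat{}\,r)} \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r}} p : P^{\varrho}$

value configs: $\vdash^{imd}_{obj\,\langle v;s\rangle^{r_1\,\hat{}\,r_2}} \langle v;s\rangle : \langle P;S\rangle$

values: $S^{\text{seq-}\hat{}_r(\varrho\,\hat{}\,r)} \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r}} v : P^{\varrho}$

stores: $\vdash^{imd}_{obj\,s^{r}} s : S^{r}$

region stores: $S^{r} \vdash^{imd}_{obj\,{\between}s^{r = r_1 r_0}} {\between}s : {\between}S^{r}$

storables: $S^{r} \vdash^{imd}_{obj\,d^{r}} d : {}^{\forall}\!B^{r}$

traces: $\vdash^{imd}_{obj\,t^{\varrho}} t : T^{\varrho}$

atomic traces: $\vdash^{imd}_{obj\,f^{\varrho}} f : F^{\varrho}$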

Figure 6.1.3. Object Intermediate Language Typing Judgments

Our typing judgments are presented in Figure 6.1.3. They are similar to those of the core language (Figure 4.1.3), but use our enhanced indexes. The index of the store type is now fully derivable from the index of the program, expression, prestorable, pure, or value for which it provides typing of store locations. We introduce an abbreviation to obtain a store index from a combined lexical/nonlexical index by dropping region variables from the lexical index and concatenating:
\begin{align*}
\text{seq-}\hat{}_r(\varrho\,\hat{}\,r) &= \text{seq-}\hat{}\,(|\varrho|_r\,\hat{}\,r) \\
\text{seq-}\hat{}_r(\vec{\varrho}\,\hat{}\,r) &= \text{seq-}\hat{}\,(|\text{seq-/}(\vec{\varrho})|_r\,\hat{}\,r)
\end{align*}

A particular pure or prestorable judgment assigns a particular pure or storable type, not a type scheme, because it represents an occurrence within an expression, not a declaration. A storable judgment, however, assigns a storable type scheme because various occurrences of the address of the storable may be assigned different types (using a value judgment). An additional difference is that the judgment for values now also requires an environment. The purpose is to allow live addresses to be assigned a specialized type that is restricted to only introduce fresh type variables. We define $\vdash^{imd}_{obj\,f^{\varrho}} f : F$ and $\vdash^{imd}_{obj\,t^{\varrho}} t : T$ as in the core language.
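The store-index abbreviation can be illustrated with a small Python sketch. It assumes a hypothetical encoding in which each region indicator is a tagged pair, `("name", "r1")` for a region name and `("var", "ρ0")` for a region variable; the function names are illustrative and do not come from the dissertation.

```python
def drop_region_variables(lexical):
    """|ϱ|_r: keep only the region names of a lexical index."""
    return [ind for ind in lexical if ind[0] == "name"]

def store_index(lexical, nonlexical):
    """Store index for a combined index (lexical ^ nonlexical):
    drop region variables from the lexical part, then concatenate."""
    return drop_region_variables(lexical) + list(nonlexical)

def store_index_split(upper, lower, nonlexical):
    """Variant for a prestorable index written upper/lower: the two
    lexical parts are first concatenated across the slash."""
    return store_index(list(upper) + list(lower), nonlexical)
```

With lexical index `r1 ρ2` and nonlexical index `r3`, the resulting store index is `r1 r3`: the region variable has no store of its own, so only region names survive.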

\[
\frac{}{S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r_0}} g : \mathrm{TypeOf}(g)}\ (\vdash^{imd}_{obj\,v}\text{-glob-const})
\]
\[
\frac{a = \langle r_0, o\rangle \qquad S(a) \succ B}{S \vdash^{imd}_{obj\,v^{r_1 r_0 \varrho_2\,\hat{}\,r_3}} a : B \,@\, r_0}\ (\vdash^{imd}_{obj\,v}\text{-addr-live})
\]
\[
\frac{}{S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r_0}} \langle\emptyset, o\rangle : \emptyset}\ (\vdash^{imd}_{obj\,v}\text{-addr-dead})
\]

Figure 6.1.4. Typing of Object Intermediate Language Values

The typing rules for values in the core intermediate language (Figure 4.1.4) carry over to the intermediate language (Figure 6.1.4) with enhanced indexes. $\vdash^{imd}_{obj\,v}$-addr-live makes use of the presence of an environment in value judgments. It is modified to assign to an address a specialization of the type scheme specified in the store type, again at the region component of the address. This corresponds to our treatment of program variables with respect to a generalized environment. The typing rule for programs, except for its indexes, is unchanged for our enhancements:
\[
\frac{\emptyset;\ S \vdash^{imd}_{obj\,e^{\epsilon\,\hat{}\,r_2}} q : Q \;!\; \emptyset}{S \vdash^{imd}_{obj\,q^{r_2}} q : Q}\ (\vdash^{imd}_{obj\,q})
\]
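The specialization that the live-address rule relies on (a type is an instance of a type scheme stored in the store type) can be checked by one-way matching. The sketch below is an illustration only: types are encoded as nested tuples such as `("var", "X1")`, `("con", "Int")`, `("at", B, "r1")`, and `("fun", P1, effect, P2)`, and the function name `instance_of` is hypothetical.

```python
def instance_of(quantified, body, target):
    """Return a substitution for the quantified variables that turns
    `body` into `target`, or None if `target` is not an instance."""
    subst = {}

    def match(pat, tgt):
        # A quantified variable may be bound once, consistently.
        if isinstance(pat, tuple) and len(pat) == 2 and pat[0] == "var" and pat[1] in quantified:
            if pat[1] in subst:
                return subst[pat[1]] == tgt
            subst[pat[1]] = tgt
            return True
        # Constructors must agree position by position.
        if isinstance(pat, tuple) and isinstance(tgt, tuple):
            return len(pat) == len(tgt) and all(match(p, t) for p, t in zip(pat, tgt))
        # Leaves (names, regions, effect sets) must be equal.
        return pat == tgt

    return subst if match(body, target) else None
```

As a usage example modeled on the sample store type in this section, the scheme for offset `o1` quantifies `X1` in a function from `Int @ r1` to `X1 @ r1`; matching it against a function returning `Int @ r1` yields the substitution `X1 := Int`.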

The typing rules for storables are presented in Figure 6.1.5. Rule $\vdash^{imd}_{obj\,d}$-const types constants by using the TypeOf function directly.1 If $\mathrm{TypeOf}(c_{12}) \succ P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2}$, $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} v : P_{0.1}$, and $\vdash^{imd}_{obj\,s^{r r_3}} s : S$, then we require that $\delta(c_{12})(\langle v;s\rangle)$ and $\delta'(c_{12})(\langle v;s\rangle)$ must be defined, $\exists S'.\ \vdash^{imd}_{obj\,\langle v;s\rangle^{r\,\hat{}\,r_3}} \delta(c_{12})(\langle v;s\rangle) : \langle P_{0.2}; S'\rangle$, and $\vdash^{imd}_{obj\,t^{r}} \delta'(c_{12})(\langle v;s\rangle) : T_0$. It follows easily that $\exists S'.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \delta(c_{12})(\langle v;s\rangle) : \langle P_{0.2} \;!\; \emptyset; S'\rangle$.

1 We could type storable constants in terms of prestorable constants using the following less direct rule:
\[
\frac{\mathrm{ftv}(B) = X \qquad \emptyset\{r_1\}\{r_0\};\ S \vdash^{imd}_{obj\,b^{r_1 r_0/\,\hat{}\,\epsilon}} \langle c\rangle : B \;!\; \emptyset}{S \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle c\rangle : \forall X.\,B}\ (\vdash^{imd}_{obj\,d}\text{-const})
\]
It differs in that the derived storable type scheme will not bind type variables that are not referenced, even if TypeOf returns such type schemes, and because it is also capable of deriving less general storable type schemes than those returned by TypeOf.

\[
\frac{S_0 \vdash^{imd}_{obj\,v^{r_1 r_0\,\hat{}\,\epsilon}} v_0 : P_0}{S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle \mathrm{ref}\ v_0\rangle : \mathrm{Ref}\ P_0}\ (\vdash^{imd}_{obj\,d}\text{-ref})
\]
\[
\frac{\mathrm{ftv}(B_0) = X \qquad \emptyset\{r_1\}\{r_0\};\ S_0 \vdash^{imd}_{obj\,b^{r_1 r_0/\,\hat{}\,\epsilon}} \langle\lambda x.e_0\rangle : B_0 \;!\; \emptyset}{S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle\lambda x.e_0\rangle : \forall X.\,B_0}\ (\vdash^{imd}_{obj\,d}\text{-}\lambda)
\]
\[
\frac{}{S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle c\rangle : \mathrm{TypeOf}(c)}\ (\vdash^{imd}_{obj\,d}\text{-const})
\]

Figure 6.1.5. Typing of Object Language Storables

$\vdash^{imd}_{obj\,d}$-$\lambda$ is again stated in terms of derivations of prestorables. Such prestorable derivations use a storable type scheme similar to that of the storable, but stripped of quantifier formal parameters. In $\vdash^{imd}_{obj\,d}$-ref, the type of the cell is not generalized; its type scheme is a trivial one with no parameters.

\[
\frac{S_0 \vdash^{imd}_{obj\,d^{r}} d_0 : {}^{\forall}\!B_0}{S_0 \vdash^{imd}_{obj\,{\between}s^{r = r_1 r_0}} \emptyset\,\overline{\{o \mapsto d_0\}} : \emptyset\,\overline{\{o \mapsto {}^{\forall}\!B_0\}}}\ (\vdash^{imd}_{obj\,{\between}s})
\]
\[
\frac{}{\vdash^{imd}_{obj\,s^{\epsilon}} \emptyset : \emptyset}\ (\vdash^{imd}_{obj\,s}\text{-empty})
\]
\[
\frac{\vdash^{imd}_{obj\,s^{r_1}} s_1 : S_1 \qquad S_1\{r_0 \mapsto {\between}S_0\} \vdash^{imd}_{obj\,{\between}s^{r = r_1 r_0}} {\between}s_0 : {\between}S_0}{\vdash^{imd}_{obj\,s^{r = r_1 r_0}} s_1\{r_0 \mapsto {\between}s_0\} : S_1\{r_0 \mapsto {\between}S_0\}}\ (\vdash^{imd}_{obj\,s}\text{-nonempty})
\]

Figure 6.1.6. Typing of Object Language Stores and Region Stores

The typing rules for stores and region stores in Figure 6.1.6 are unchanged from the core language (Figure 4.1.7), except that in rule $\vdash^{imd}_{obj\,{\between}s}$, the storable type scheme in the region store type corresponds to the storable type scheme in the storable typing judgment.

\[
\frac{S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r}} v : P}{\Gamma;\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r}} v : P}\ (\vdash^{imd}_{obj\,p}\text{-value})
\qquad
\frac{\Gamma(x) \succ P}{\Gamma;\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r}} x : P}\ (\vdash^{imd}_{obj\,p}\text{-var})
\]

Figure 6.1.7. Typing of Object Intermediate Language Pures

The rules for typing pures in Figure 6.1.7 and prestorables in Figure 6.1.8 are like those for the source language in Figures 5.1.9 and 5.1.10, but include a store type and replace region variables with more general region indicators.

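\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho_1\varrho_0\varrho_2\,\hat{}\,r}} e : P_0 \;!\; T}{\Gamma;\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} \langle \mathrm{ref}\ e\rangle : \mathrm{Ref}\ P_0 \;!\; T}\ (\vdash^{imd}_{obj\,b}\text{-ref})
\]
\[
\frac{\mathrm{TypeOf}(c) \succ B_0}{\Gamma;\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} \langle c\rangle : B_0 \;!\; \emptyset}\ (\vdash^{imd}_{obj\,b}\text{-const})
\]
\[
\frac{T_0' \subseteq T_0 \qquad (\Gamma - \varrho_2)\{x \mapsto P_{0.1}\};\ |S|_{\varrho_1\varrho_0} \vdash^{imd}_{obj\,e^{\varrho_1\varrho_0\,\hat{}}} e_0 : P_{0.2} \;!\; T_0'}{\Gamma;\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} \langle\lambda x.e_0\rangle : P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2} \;!\; \emptyset}\ (\vdash^{imd}_{obj\,b}\text{-}\lambda)
\]
\[
\frac{\Gamma\{x_1 \mapsto B_0 \,@\, \varrho_0\};\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} \langle\lambda x_2.e_0\rangle : B_0 \;!\; \emptyset}{\Gamma;\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} \langle\mu x_1@\varrho_0.\,\lambda x_2.\,e_0\rangle : B_0 \;!\; \emptyset}\ (\vdash^{imd}_{obj\,b}\text{-}\mu\lambda)
\]

Figure 6.1.8. Typing of Object Intermediate Language Prestorables

Figure 6.1.9 presents the typing rules for expressions. Polymorphism is handled as in the source language (Figure 5.1.11). The rules for let and other source language constructs are modified as were those for pures and prestorables above. freeregion is handled as in the core intermediate language (Figure 4.1.10). Configurations are typed as expected.
\[
\frac{\vdash^{imd}_{obj\,s^{r_2}} s : S \qquad S \vdash^{imd}_{obj\,q^{r_2}} q : Q}{\vdash^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle q;s\rangle : \langle Q;S\rangle}\ (\vdash^{imd}_{obj\,\langle q;s\rangle})
\]
\[
\frac{\vdash^{imd}_{obj\,s^{r_1 r_2}} s : S \qquad \emptyset\{r_1\};\ S \vdash^{imd}_{obj\,e^{r_1\,\hat{}\,r_2}} e : E}{\vdash^{imd}_{obj\,\langle e;s\rangle^{r_1\,\hat{}\,r_2}} \langle e;s\rangle : \langle E;S\rangle}\ (\vdash^{imd}_{obj\,\langle e;s\rangle})
\]
\[
\frac{\vdash^{imd}_{obj\,s^{r_1 r_2}} s : S \qquad S \vdash^{imd}_{obj\,v^{r_1\,\hat{}\,r_2}} v : P}{\vdash^{imd}_{obj\,\langle v;s\rangle^{r_1\,\hat{}\,r_2}} \langle v;s\rangle : \langle P;S\rangle}\ (\vdash^{imd}_{obj\,\langle v;s\rangle})
\]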

Figures 6.1.10 through 6.1.12 provide a program configuration typing derivation in the enhanced object language. It reflects the situation towards the end of the reduction sequence displayed in Figure 6.1.7, at the end of the first loop iteration. It is similar to the source language typing derivation in Figures 5.1.12

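\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} {}^{\not\forall}\!e : P_{.1} \;!\; T_{.1} \qquad \Gamma\{x \mapsto P_{.1}\};\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e : P_{.2} \;!\; T_{.2}}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} \mathrm{let}\ x = {}^{\not\forall}\!e\ \mathrm{in}\ e : P_{.2} \;!\; T_{.1} \cup T_{.2}}\ (\vdash^{imd}_{obj\,e}\text{-letn})
\]
\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} {}^{\forall}\!e : P_{.1} \;!\; T_{.1} \qquad X = \mathrm{ftv}(P_{.1}) - \mathrm{ftv}(\Gamma) \qquad \Gamma\{x \mapsto \forall X.\,P_{.1}\};\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e : P_{.2} \;!\; T_{.2}}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} \mathrm{let}\ x = {}^{\forall}\!e\ \mathrm{in}\ e : P_{.2} \;!\; T_{.1} \cup T_{.2}}\ (\vdash^{imd}_{obj\,e}\text{-letg})
\]
\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r}} p : P}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} p : P \;!\; \emptyset}\ (\vdash^{imd}_{obj\,e}\text{-pure})
\]
\[
\frac{(b = \langle\mu x_1@\varrho_0'.\,\lambda x_2.\,e\rangle \to \varrho_0 = \varrho_0') \qquad \Gamma;\ S \vdash^{imd}_{obj\,b^{\varrho_1\varrho_0/\varrho_2\,\hat{}\,r}} b : B_0 \;!\; T}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho_1\varrho_0\varrho_2\,\hat{}\,r}} @\varrho_0\ b : B_0 \,@\, \varrho_0 \;!\; T \cup \{(\mathrm{alloc}\ @\ \varrho_0)\}}\ (\vdash^{imd}_{obj\,e}\text{-alloc})
\]
\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e_{.1} : (P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2}) \,@\, \varrho_0 \;!\; T_{.1} \qquad \Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e_{.2} : P_{0.1} \;!\; T_{.2}}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r}} @\varrho_0\ e_{.1}\ e_{.2} : P_{0.2} \;!\; T_0 \cup T_{.1} \cup T_{.2} \cup \{(\mathrm{exec}\ @\ \varrho_0)\}}\ (\vdash^{imd}_{obj\,e}\text{-app})
\]
\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e : \mathrm{Ref}\ P_0 \,@\, \varrho_0 \;!\; T}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r}} @\varrho_0\ \mathrm{deref}\ e : P_0 \;!\; T \cup \{(\mathrm{read}\ @\ \varrho_0)\}}\ (\vdash^{imd}_{obj\,e}\text{-deref})
\]
\[
\frac{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e_{.1} : \mathrm{Ref}\ P_0 \,@\, \varrho_0 \;!\; T_{.1} \qquad \Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r}} e_{.2} : P_0 \;!\; T_{.2}}{\Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r}} @\varrho_0\ \mathrm{set}\ e_{.1}\ \mathrm{to}\ e_{.2} : \mathrm{Unit} \;!\; T_{.1} \cup T_{.2} \cup \{(\mathrm{write}\ @\ \varrho_0)\}}\ (\vdash^{imd}_{obj\,e}\text{-set})
\]
\[
\frac{\Gamma_1\{\rho_0\};\ S \vdash^{imd}_{obj\,e^{\varrho_1\rho_0\,\hat{}\,r}} e_0 : P_0 \;!\; T_0}{\Gamma_1;\ S \vdash^{imd}_{obj\,e^{\varrho_1\,\hat{}\,r}} \mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0 : P_0 - \rho_0 \;!\; T_0 - \rho_0}\ (\vdash^{imd}_{obj\,e}\text{-letregion})
\]
\[
\frac{\Gamma_1\{r_0\};\ S \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}\,r_2}} e_0 : P_0 \;!\; T_0}{\Gamma_1;\ S \vdash^{imd}_{obj\,e^{r_1\,\hat{}\,r_0 r_2}} \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0 : P_0 - r_0 \;!\; T_0 - r_0}\ (\vdash^{imd}_{obj\,e}\text{-freeregion})
\]

Figure 6.1.9. Typing of Object Intermediate Language Expressions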
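The way effects accumulate in the application rule and are masked by letregion can be traced with a short Python sketch. The encoding of an effect as a set of (action, region) pairs and the function names `app_effect` and `mask` are assumptions made for illustration; reading the restriction of an effect by a region as the removal of every atomic effect at that region is also an assumption, although it agrees with the effects shown in the sample derivation.

```python
def app_effect(latent, t_fun, t_arg, region):
    """Effect of an application at `region`: the latent effect of the
    function, the effects of evaluating both subexpressions, and exec."""
    return frozenset(latent) | frozenset(t_fun) | frozenset(t_arg) | {("exec", region)}

def mask(effect, region):
    """Effect restriction: drop every atomic effect at `region`."""
    return frozenset((act, r) for (act, r) in effect if r != region)

# Inner application at ρ2: add1 is allocated in ρ2 and applied to x2.
add1_latent = {("alloc", "r1"), ("read", "r1")}
inner = app_effect(add1_latent, {("alloc", "ρ2")}, set(), "ρ2")

# Outer application at r1 of the stored function to the inner result.
outer_latent = {("alloc", "r1"), ("exec", "r1"), ("read", "r1")}
body = app_effect(outer_latent, set(), inner, "r1")

# letregion ρ2 hides the effects on the local region.
visible = mask(body, "ρ2")
```

After masking, only the effects on `r1` remain visible, which is why the stored function's latent effect mentions `r1` alone.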

Figure 6.1.10. Sample Object Intermediate Language Derivation, I (derivation tree for the program configuration $\langle \mathrm{freeregion}\ r_1\ \mathrm{after}\ \ldots;\ s_1\rangle : \langle\emptyset; S_1\rangle$, with subderivations $d_2$, $d_4$, $d_6$, $d_7$, $d_8$, $d_9$, and $d_{10}$)

Figure 6.1.11. Sample Object Intermediate Language Derivation, II (derivation $d_6$ of the store typing), where
\begin{align*}
{\between}s_1 &= \emptyset\,\{o_1 \mapsto \langle\lambda x_2.\,\mathrm{letregion}\ \rho_2\ \mathrm{in}\ @r_1\ \langle r_1, o_1\rangle\ (@\rho_2\ @\rho_2\ \langle\mathrm{add1}@r_1\rangle\ x_2)\rangle\}\,\{o_2 \mapsto \langle\mathrm{sub1}@r_1\rangle\}\,\{o_3 \mapsto \langle 1\rangle\}\,\{o_4 \mapsto \langle 2\rangle\} \\
s_1 &= \emptyset\,\{r_1 \mapsto {\between}s_1\} \\
{\between}S_1 &= \emptyset\,\{o_1 \mapsto \forall X_1.\ \mathrm{Int}\,@\,r_1 \stackrel{\{(\mathrm{alloc}@r_1),(\mathrm{exec}@r_1),(\mathrm{read}@r_1)\}}{\Rightarrow} X_1\,@\,r_1\}\,\{o_2 \mapsto \mathrm{Int}\,@\,r_1 \stackrel{\{(\mathrm{alloc}@r_1),(\mathrm{read}@r_1)\}}{\Rightarrow} \mathrm{Int}\,@\,r_1\}\,\{o_3 \mapsto \mathrm{Int}\}\,\{o_4 \mapsto \mathrm{Int}\} \\
S_1 &= \emptyset\,\{r_1 \mapsto {\between}S_1\}
\end{align*}

Figure 6.1.12. Sample Object Intermediate Language Derivation, III (derivations $d_4$ and $d_5$), where
\[
\Gamma = \emptyset\,\{r_1\}\,\{x_1 \mapsto (\mathrm{Int}\,@\,r_1 \stackrel{\{(\mathrm{alloc}@r_1),(\mathrm{exec}@r_1),(\mathrm{read}@r_1)\}}{\Rightarrow} X_1\,@\,r_1)\,@\,r_1\}\,\{x_2 \mapsto \mathrm{Int}\,@\,r_1\}\,\{\rho_2\}
\]

through 5.1.15, but the recursive function has already been allocated, so the polymorphism is manifest through its address ⟨r1, o1⟩ rather than through the program variable x1. The derivation d7 in Figure 5.1.12 treats the return type as Int whereas d4 in Figure 5.1.14 treats the return type as a reference cell. The derivation d6 of the stored (nonrecursive) function leaves the storable type of the return located type completely open. The remaining storable derivations d8, d9, and d10 in Figure 5.1.12 demonstrate the typing of stored constants.

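1.3. Type and Effect Soundness. Type and effect soundness are proven using now familiar techniques. We present the main theorems and lemmas, and provide proofs only when they differ substantially from those of the previous parts.

Theorem 6.1.1 (Type Soundness).
\[
\vdash^{src}_{obj\,q^{r_2}} q : Q \;\to\; \langle q;\emptyset\rangle \Uparrow^{imd}_{obj\,\langle q;s\rangle^{r_2}} \;\lor\; \exists v.\ \big(\langle q;\emptyset\rangle \Downarrow^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle v;\emptyset\rangle \;\land\; \vdash^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle v;\emptyset\rangle : \langle Q;\emptyset\rangle\big).
\]

Theorem 6.1.2 (Effect Soundness).
\[
\langle e;s\rangle \in R^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \;\land\; \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \langle e;s\rangle : \langle P \;!\; T; S\rangle \;\land\; \langle e;s\rangle \xrightarrow[obj\,\langle e;s\rangle^{r\,\hat{}\,r_3 \to r_3'}]{t,\ imd}{}^{*} \langle e';s'\rangle \;\to\; \vdash^{imd}_{obj\,t^{r}} t : T
\]

Lemma 6.1.3 (Evaluation Preserves Type and Effect).

$^{imd}_{obj}\langle q;s\rangle$:
\[
\langle q;s\rangle \in R^{imd}_{obj\,\langle q;s\rangle^{r_2}} \;\land\; \vdash^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle q;s\rangle : \langle Q;S\rangle \;\land\; \langle q;s\rangle \xrightarrow[obj\,\langle q;s\rangle^{r_2 \to r_2'}]{imd}{}^{*} \langle q';s'\rangle \;\to\; \exists S' \geq_{\epsilon} S.\ \vdash^{imd}_{obj\,\langle q;s\rangle^{r_2'}} \langle q';s'\rangle : \langle Q;S'\rangle
\]

$^{imd}_{obj}\langle e;s\rangle$:
\[
\langle e;s\rangle \in R^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \;\land\; \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \langle e;s\rangle : \langle P \;!\; T; S\rangle \;\land\; \langle e;s\rangle \xrightarrow[obj\,\langle e;s\rangle^{r\,\hat{}\,r_3 \to r_3'}]{t,\ imd}{}^{*} \langle e';s'\rangle \;\to\; \exists T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3'}} \langle e';s'\rangle : \langle P \;!\; T'; S'\rangle
\]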

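Lemma 6.1.4 (Faulty Program Configurations Untypable). If $\langle q;s\rangle \in R^{imd}_{obj\,\langle q;s\rangle^{r_2}}$ is faulty, then there are no $Q$ and $S$ such that $\vdash^{imd}_{obj\,\langle q;s\rangle^{r_2}} \langle q;s\rangle : \langle Q;S\rangle$.

Lemma 6.1.5 (Subject Reduction).
\[
\langle e;s\rangle \in R^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \;\land\; \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \langle e;s\rangle : \langle P \;!\; T; S\rangle \;\land\; \langle e;s\rangle \overset{t}{\rightharpoonup}{}^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3 \to r_3'}} \langle e';s'\rangle \;\to
\]
\[
\exists T' \leq_{r} T,\ S' \geq_{r} S.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3'}} \langle e';s'\rangle : \langle P \;!\; T'; S'\rangle \;\land\; \vdash^{imd}_{obj\,t^{r}} t : T
\]

Proposition 6.1.2 (Removing Context Preserves Typability).

$^{imd}_{obj}\langle q;s\rangle$:
\[
\vdash^{imd}_{obj\,\langle q;s\rangle^{r_1 r_2}} \langle \to^{*}\![e^{r_1}]\,q[e_1]; s\rangle : \langle Q;S\rangle \;\to\; \exists P_1, T_1.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r_1\,\hat{}\,r_2}} \langle e_1;s\rangle : \langle P_1 \;!\; T_1; S\rangle
\]

$^{imd}_{obj}\langle e;s\rangle$:
\[
\vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_1 r_2}} \langle \to^{*}\![e^{r r_1}]\,e[e_1]; s\rangle : \langle P \;!\; T; S\rangle \;\to\; \exists P_1, T_1 \leq_{r} T.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r r_1\,\hat{}\,r_2}} \langle e_1;s\rangle : \langle P_1 \;!\; T_1; S\rangle
\]

Lemma 6.1.6 (Replacement).

$^{imd}_{obj}\langle q;s\rangle$:
\begin{align*}
& \langle \to^{*}\![e]\,q[e_1]; s\rangle \in R^{imd}_{obj\,\langle q;s\rangle^{r_1 r_2}} \\
\land\ & \vdash^{imd}_{obj\,\langle q;s\rangle^{r_1 r_2}} \langle \to^{*}\![e]\,q[e_1]; s\rangle : \langle Q;S\rangle \\
\land\ & \vdash^{imd}_{obj\,\langle e;s\rangle^{r_1\,\hat{}\,r_2}} \langle e_1;s\rangle : \langle P_1 \;!\; T_1; S\rangle \\
\land\ & \vdash^{imd}_{obj\,\langle e;s\rangle^{r_1\,\hat{}\,r_2'}} \langle e_1';s'\rangle : \langle P_1 \;!\; T_1'; S'\rangle \\
\land\ & T_1' \leq_{r_1} T_1 \;\land\; S' \geq_{r_1} S \\
\to\ & \vdash^{imd}_{obj\,\langle q;s\rangle^{r_1 r_2'}} \langle \to^{*}\![e]\,q[e_1']; s'\rangle : \langle Q;S'\rangle
\end{align*}

$^{imd}_{obj}\langle e;s\rangle$:
\begin{align*}
& \langle \to^{*}\![e]\,e[e_1]; s\rangle \in R^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_1 r_2}} \\
\land\ & \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_1 r_2}} \langle \to^{*}\![e]\,e[e_1]; s\rangle : \langle P \;!\; T; S\rangle \\
\land\ & \vdash^{imd}_{obj\,\langle e;s\rangle^{r r_1\,\hat{}\,r_2}} \langle e_1;s\rangle : \langle P_1 \;!\; T_1; S\rangle \\
\land\ & \vdash^{imd}_{obj\,\langle e;s\rangle^{r r_1\,\hat{}\,r_2'}} \langle e_1';s'\rangle : \langle P_1 \;!\; T_1'; S'\rangle \\
\land\ & T_1' \leq_{r r_1} T_1 \;\land\; S' \geq_{r r_1} S \\
\to\ & \exists T' \leq_{r} T.\ \vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_1 r_2'}} \langle \to^{*}\![e]\,e[e_1']; s'\rangle : \langle P \;!\; T'; S'\rangle
\end{align*}

$^{imd}_{obj}e$:
\begin{align*}
& \exists s.\ \langle \to^{*}\![e]\,e[e_1]; s\rangle \in R^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_1 r_2}} \\
\land\ & \emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_1 r_2}} \to^{*}\![e]\,e[e_1] : P \;!\; T \\
\land\ & \emptyset\{r\}\{r_1\};\ S \vdash^{imd}_{obj\,e^{r r_1\,\hat{}\,r_2}} e_1 : P_1 \;!\; T_1 \\
\land\ & \emptyset\{r\}\{r_1\};\ S' \vdash^{imd}_{obj\,e^{r r_1\,\hat{}\,r_2'}} e_1' : P_1 \;!\; T_1' \\
\land\ & T_1' \leq_{r r_1} T_1 \;\land\; S' \geq_{r r_1} S \\
\to\ & \exists T' \leq_{r} T.\ \emptyset\{r\};\ S' \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_1 r_2'}} \to^{*}\![e]\,e[e_1'] : P \;!\; T'
\end{align*}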

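Proof: Lemma 6.1.4.

$\langle a\ v; s\rangle$, ($s(a) = \langle c\rangle$, $\delta(c)(\langle v;s\rangle)$ not defined): $S \vdash^{imd}_{obj\,d} \langle c\rangle : P_1 \stackrel{T_0}{\Rightarrow} P$ must be generated with $\vdash^{imd}_{obj\,d}$-const, which requires $\Gamma;\ S \vdash^{imd}_{obj\,b} \langle c\rangle : P_1 \stackrel{T_0}{\Rightarrow} P \;!\; \emptyset$. But our restriction on $\delta$ then requires that $\delta(c)(\langle v;s\rangle)$ must be defined.

The value substitution lemma now supports polymorphism. We would require $X \in \mathrm{ftv}(P_0) - \mathrm{ftv}(\Gamma)$, but $\Gamma = \emptyset\{r\}$, so there is no need to include the difference.

Lemma 6.1.7 (Value Substitution).

$^{imd}_{obj}p$:
\[
\emptyset\{r\}\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,p^{r\,\hat{}\,r_2}} p : P \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \emptyset\{r\};\ S \vdash^{imd}_{obj\,p^{r\,\hat{}\,r_2}} p[x' := v'] : P
\]

$^{imd}_{obj}b$:
\[
\emptyset\{r\}\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,b^{\vec{r}\,\hat{}\,r_2}} b : B \;!\; T \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \emptyset\{r\};\ S \vdash^{imd}_{obj\,b^{\vec{r}\,\hat{}\,r_2}} b[x' := v'] : B \;!\; T
\]

$^{imd}_{obj}e$:
\[
\emptyset\{r\}\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_2}} e : E \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_2}} e[x' := v'] : E
\]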

The Region Allocation lemma is similar to that of Chapter 4. Restricting the region name from the upper part of the store and store type in the Region Deallocation lemma is required to support the eager deallocation of regions.
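The store operation behind the Region Deallocation lemma can be sketched in Python: the freed region store is removed, and in the region stores above it every address into the freed region is replaced by a dead address. The encoding is hypothetical: a store is an ordered list of `(region, {offset: storable})` pairs, storables are nested tuples, an address is `("addr", region, offset)`, and `None` stands for the empty region in a dead address.

```python
def subst_dead(term, r0):
    """term[r0 := ∅]: turn addresses into region r0 into dead addresses."""
    if isinstance(term, tuple):
        if len(term) == 3 and term[0] == "addr" and term[1] == r0:
            return ("addr", None, term[2])
        return tuple(subst_dead(t, r0) for t in term)
    return term

def free_region(store, r0):
    """Remove region r0 and apply [r0 := ∅] to the region stores above it.
    Region stores below r0 are left untouched, as in the lemma."""
    idx = [r for r, _ in store].index(r0)
    lower, upper = store[:idx], store[idx + 1:]
    new_upper = [(r, {o: subst_dead(d, r0) for o, d in rs.items()}) for r, rs in upper]
    return lower + new_upper
```

The substitution is applied only to the upper part of the store, which matches the shape of the lemma's conclusion, where the lower part is unchanged.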


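Lemma 6.1.8 (Region Allocation). For all $r_0 \notin \mathrm{Dom}(S_1)$:

$^{imd}_{obj}v$:
\[
S_1 \vdash^{imd}_{obj\,v^{r_1\rho_0\,\hat{}\,r_2}} v_0 : P_0 \;\to\; S_1\{r_0 \mapsto \emptyset\} \vdash^{imd}_{obj\,v^{r_1 r_0\,\hat{}\,r_2}} v_0[\rho_0 := r_0] : P_0[\rho_0 := r_0]
\]

$^{imd}_{obj}p$:
\[
\emptyset\{r_1\}\{\rho_0\};\ S_1 \vdash^{imd}_{obj\,p^{r_1\rho_0\,\hat{}\,r_2}} p_0 : P_0 \;\to\; \emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto \emptyset\} \vdash^{imd}_{obj\,p^{r_1 r_0\,\hat{}\,r_2}} p_0[\rho_0 := r_0] : P_0[\rho_0 := r_0]
\]

$^{imd}_{obj}b$ (active source prestorable, $\text{seq-/}(\vec{\varrho}) = r_1\rho_0$, $\text{seq-/}(\vec{\varrho}') = r_1 r_0$):
\[
\emptyset\{r_1\}\{\rho_0\};\ S_1 \vdash^{imd}_{obj\,b^{\vec{\varrho}\,\hat{}\,r_2}} b_0 : B_0 \;!\; T_0 \;\to\; \emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto \emptyset\} \vdash^{imd}_{obj\,b^{\vec{\varrho}'\,\hat{}\,r_2}} b_0[\rho_0 := r_0] : B_0[\rho_0 := r_0] \;!\; T_0[\rho_0 := r_0]
\]

$^{imd}_{obj}e$ (active source expression):
\[
\emptyset\{r_1\}\{\rho_0\};\ S_1 \vdash^{imd}_{obj\,e^{r_1\rho_0\,\hat{}\,r_2}} e_0 : E_0 \;\to\; \emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto \emptyset\} \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}\,r_2}} e_0[\rho_0 := r_0] : E_0[\rho_0 := r_0]
\]

$^{imd}_{obj}s$:
\[
\vdash^{imd}_{obj\,s^{r_1}} s_1 : S_1 \;\to\; \vdash^{imd}_{obj\,s^{r_1 r_0}} s_1\{r_0 \mapsto \emptyset\} : S_1\{r_0 \mapsto \emptyset\}
\]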

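Lemma 6.1.9 (Region Deallocation).

$^{imd}_{obj}v$:
\[
S_1\{r_0 \mapsto {\between}S\}\,S_3 \vdash^{imd}_{obj\,v^{r_1 r_0\,\hat{}\,r_3}} v : P \;\to\; S_1\,S_3[r_0 := \emptyset] \vdash^{imd}_{obj\,v^{r_1\,\hat{}\,r_3}} v[r_0 := \emptyset] : P - r_0
\]

$^{imd}_{obj}p$:
\[
\emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto {\between}S\}\,S_3 \vdash^{imd}_{obj\,p^{r_1 r_0\,\hat{}\,r_3}} p : P \;\to\; \emptyset\{r_1\};\ S_1\,S_3[r_0 := \emptyset] \vdash^{imd}_{obj\,p^{r_1\,\hat{}\,r_3}} p[r_0 := \emptyset] : P - r_0
\]

$^{imd}_{obj}b$:
\[
\emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto {\between}S\}\,S_3 \vdash^{imd}_{obj\,b^{\vec{r}\,\hat{}\,r_3}} b : B \;!\; T \;\to\; \emptyset\{r_1\};\ S_1\,S_3[r_0 := \emptyset] \vdash^{imd}_{obj\,b^{\vec{r}\,\hat{}\,r_3}} b[r_0 := \emptyset] : B - r_0 \;!\; T - r_0
\]
where $\text{seq-/}(\vec{r}) = r_1 r_0 \varrho_2$ and $b$ does not actively mention $r_0$

$^{imd}_{obj}e$:
\[
\emptyset\{r_1\}\{r_0\};\ S_1\{r_0 \mapsto {\between}S\}\,S_3 \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}\,r_3}} e : P \;!\; T \;\to\; \emptyset\{r_1\};\ S_1\,S_3[r_0 := \emptyset] \vdash^{imd}_{obj\,e^{r_1\,\hat{}\,r_3}} e[r_0 := \emptyset] : P - r_0 \;!\; T - r_0
\]
where $e$ does not actively mention $r_0$

$^{imd}_{obj}s$:
\[
\vdash^{imd}_{obj\,s^{r_1 r_0 r_2}} s_1\{r_0 \mapsto {\between}s\}\,s_2 : S_1\{r_0 \mapsto {\between}S\}\,S_2 \;\to\; \vdash^{imd}_{obj\,s^{r_1 r_2}} s_1\,s_2[r_0 := \emptyset] : S_1\,S_2[r_0 := \emptyset]
\]
where $s_2$ does not actively mention $r_0$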

As in Chapter 4, the Expification and Storification lemmas handle the insertion and removal of context as terms are moved between the store and expression portions of a configuration. In addition, Storification now binds some free type variables, while Expification now allows specialization of the result type. Expification is used for applications of (possibly) polymorphic functions in $\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app. It must allow for a difference in generality between the pure type of the stored function body (in terms of type variables) and its post-substitution pure type (specialized to the requirements of the context of the application).

Lemma 6.1.10 (Expification).

$^{imd}_{obj}v$:
\[
S_0 \vdash^{imd}_{obj\,v^{r_1 r_0\,\hat{}}} v : P^{*} \;\land\; \exists X.\ \forall X.\,P^{*} \succ P \;\to\; S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\} \vdash^{imd}_{obj\,v^{r_1 r_0 r_2\,\hat{}\,r_3}} v : P
\]

$^{imd}_{obj}p$:
\[
\emptyset\{r_1\}\{r_0\};\ S_0 \vdash^{imd}_{obj\,p^{r_1 r_0\,\hat{}}} p : P^{*} \;\land\; \exists X.\ \forall X.\,P^{*} \succ P \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\ S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\} \vdash^{imd}_{obj\,p^{r_1 r_0 r_2\,\hat{}\,r_3}} p : P
\]

$^{imd}_{obj}b$ (active source prestorable):
\[
\emptyset\{r_1\}\{r_0\};\ S_0 \vdash^{imd}_{obj\,b^{r_1 r_0/\,\hat{}}} b : B^{*} \;!\; T \;\land\; \exists X.\ \forall X.\,B^{*} \succ B \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\ S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\} \vdash^{imd}_{obj\,b^{r_1 r_0/r_2\,\hat{}\,r_3}} b : B \;!\; T
\]

$^{imd}_{obj}e$ (active source expression):
\[
\emptyset\{r_1\}\{r_0\};\ S_0 \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}}} e : P^{*} \;!\; T \;\land\; \exists X.\ \forall X.\,P^{*} \succ P \;\to\; \emptyset\{r_1\}\{r_0\}\{r_2\};\ S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\} \vdash^{imd}_{obj\,e^{r_1 r_0 r_2\,\hat{}\,r_3}} e : P \;!\; T
\]

Proposition 6.1.3 (Storification).
\[
\emptyset\{r_1\}\{r_0\}\{r_2\};\ S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\} \vdash^{imd}_{obj\,b^{r_1 r_0/r_2\,\hat{}\,r_3}} d : B^{*} \;!\; \emptyset \;\to\; S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} d : \forall X.\,B^{*},
\]
where $X = \emptyset$ if $d$ is a reference cell and $X = \mathrm{ftv}(B^{*})$ otherwise.

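Proof: Lemma 6.1.5. By case analysis on the reduction step:

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-let: $\langle \mathrm{let}\ x = v\ \mathrm{in}\ e;\ s\rangle \overset{[]}{\rightharpoonup} \langle e[x := v];\ s\rangle$:

We have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} \mathrm{let}\ x = v\ \mathrm{in}\ e : P \;!\; T$. By $\vdash^{imd}_{obj\,e}$-letg and because values have no effect, $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} v : P_{.1} \;!\; \emptyset$ and $\emptyset\{r\}\{x \mapsto \forall X.\,P_{.1}\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} e : P \;!\; T$ where $X = \mathrm{ftv}(P_{.1})$. The former gives, by $\vdash^{imd}_{obj\,e}$-pure and $\vdash^{imd}_{obj\,p}$-value, $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} v : P_{.1}$. By Lemma 4.1.7(i) we have that $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} e[x := v] : P \;!\; T$. We complete with a reapplication of rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$.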

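$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-const: $\langle a\ v;\ s\rangle \overset{[(\mathrm{exec}\ @\ a)] + \delta'(c)(\langle v;s\rangle)}{\rightharpoonup} \delta(c)(\langle v;s\rangle)$:

where $a = \langle r_0, o\rangle$ and $s(a) = \langle c\rangle$. We have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} a\ v : P \;!\; T$. By $\vdash^{imd}_{obj\,e}$-app, $\vdash^{imd}_{obj\,e}$-pure, and $\vdash^{imd}_{obj\,p}$-value, $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} v : P_{0.1}$ and $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} a : (P_{0.1} \stackrel{T_0}{\Rightarrow} P) \,@\, r_0$ with $T = \{(\mathrm{exec}\ @\ r_0)\} \cup T_0$. By $\vdash^{imd}_{obj\,v}$-addr-live, $S(a) \succ P_{0.1} \stackrel{T_0}{\Rightarrow} P$. We also have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\vdash^{imd}_{obj\,s^{r r_3}} s : S$. By a sequence of $\vdash^{imd}_{obj\,s}$-nonempty, $\vdash^{imd}_{obj\,s^{r_1 r_0}} s_0 : S_0$ and $S_0 \vdash^{imd}_{obj\,{\between}s^{r_1 r_0}} {\between}s_0 : {\between}S_0$. By $\vdash^{imd}_{obj\,{\between}s}$, $S \vdash^{imd}_{obj\,d} \langle c\rangle : S(a)$ and by $\vdash^{imd}_{obj\,d}$-const, $\mathrm{TypeOf}(c) = S(a)$. Thus, $\mathrm{TypeOf}(c) \succ P_{0.1} \stackrel{T_0}{\Rightarrow} P$. By our restriction on $\delta$ and $\delta'$, $\delta(c)(\langle v;s\rangle)$ and $\delta'(c)(\langle v;s\rangle)$ must be defined, $\exists S'.\ \vdash^{imd}_{obj\,\langle v;s\rangle^{r\,\hat{}\,r_3}} \delta(c)(\langle v;s\rangle) : \langle P; S'\rangle$, and $\vdash^{imd}_{obj\,t^{r}} \delta'(c)(\langle v;s\rangle) : T_0$. Thus, $\vdash^{imd}_{obj\,\langle e;s\rangle^{r\,\hat{}\,r_3}} \delta(c)(\langle v;s\rangle) : \langle P \;!\; \emptyset; S'\rangle$ and $\vdash^{imd}_{obj\,t^{r}} [(\mathrm{exec}\ @\ a)] + \delta'(c)(\langle v;s\rangle) : T$.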

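$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-app-$\lambda$: $\langle a\ v;\ s\rangle \overset{[(\mathrm{exec}\ @\ r_0)]}{\rightharpoonup} \langle e[x := v];\ s\rangle$:

where $a = \langle r_0, o\rangle$ and $s(a) = \langle\lambda x.e\rangle$. By rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$, $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} a_0\ v : P \;!\; T$. By $\vdash^{imd}_{obj\,e}$-app and because values have no effect, $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} v : P_{0.1}$ and $S \vdash^{imd}_{obj\,v^{r\,\hat{}\,r_3}} a_0 : (P_{0.1} \stackrel{T_0}{\Rightarrow} P) \,@\, r_0$ with $T = T_0 \cup \{(\mathrm{exec}\ @\ r_0)\}$. By $\vdash^{imd}_{obj\,v}$-addr-live, $S(a_0) \succ P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2}$. We also have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\vdash^{imd}_{obj\,s^{r}} s : S$. By a sequence of $\vdash^{imd}_{obj\,s}$-nonempty, $\vdash^{imd}_{obj\,s^{r_1 r_0}} s_0 : S_0$ and $S_0 \vdash^{imd}_{obj\,{\between}s^{r_1 r_0}} {\between}s_0 : {\between}S_0$. By $\vdash^{imd}_{obj\,{\between}s}$, $S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle\lambda x.e_0\rangle : S(a)$ and by $\vdash^{imd}_{obj\,d}$-$\lambda$ and $\vdash^{imd}_{obj\,b}$-$\lambda$, $S_0 \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle\lambda x.e_0\rangle : S(a_0)$ and $\emptyset\{r_1\}\{r_0\};\ S_0 \vdash^{imd}_{obj\,b^{r_1 r_0/}} \langle\lambda x.e_0\rangle : P_{0.1}^{*} \stackrel{T_0}{\Rightarrow} P^{*} \;!\; \emptyset$ with $S(a_0) = \forall X.\ P_{0.1}^{*} \stackrel{T_0}{\Rightarrow} P^{*}$. Because $\langle\lambda x.e_0\rangle$ is an active source prestorable, using the generalization relation above we can apply Lemma 6.1.10 to obtain $\emptyset\{r\};\ S \vdash^{imd}_{obj\,b^{r_1 r_0/r_2\,\hat{}\,r_3}} \langle\lambda x.e_0\rangle : P_{0.1} \stackrel{T_0}{\Rightarrow} P \;!\; \emptyset$. By $\vdash^{imd}_{obj\,b}$-$\lambda$, $\emptyset\{r\}\{x \mapsto P_{0.1}\};\ S_0 \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} e_0 : P \;!\; T_0$. By Lemma 6.1.7, $\emptyset\{r\};\ S \vdash^{imd}_{obj\,e^{r\,\hat{}\,r_3}} e_0[x := v] : P_{0.2} \;!\; T_0$. We complete with an application of rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-$\mu\lambda$: $\langle @r_0\ \langle\mu x_1@r_0.\,\lambda x_2.\,e_0\rangle;\ s\rangle \overset{[(\mathrm{alloc}\ @\ \langle r_0, o\rangle)]}{\rightharpoonup} \langle\langle r_0, o\rangle;\ s'\rangle$:

where $a = \langle r_0, o\rangle$ and $s' = s\{a \mapsto \langle\lambda x_2.e_0[x_1 := a]\rangle\}$ for $o \notin \mathrm{Dom}(s(r_0))$. We have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\Gamma;\ S \vdash^{imd}_{obj\,e^{r_1 r_0 r_2\,\hat{}\,r_3}} @r_0\ \langle\mu x_1@r_0.\,\lambda x_2.\,e_0\rangle : P \;!\; T$. By $\vdash^{imd}_{obj\,e}$-alloc, $\Gamma;\ S \vdash^{imd}_{obj\,b^{r_1 r_0/r_2\,\hat{}\,r_3}} \langle\mu x_1@r_0.\,\lambda x_2.\,e_0\rangle : B_0 \;!\; \emptyset$, with $P = B_0 \,@\, r_0$ and $T = \{(\mathrm{alloc}\ @\ r_0)\}$. By $\vdash^{imd}_{obj\,b}$-$\mu\lambda$, $\Gamma\{x_1 \mapsto B_0 \,@\, r_0\};\ S \vdash^{imd}_{obj\,b^{r_1 r_0/r_2\,\hat{}\,r_3}} \langle\lambda x_2.e_0\rangle : B_0 \;!\; \emptyset$. By $\vdash^{imd}_{obj\,b}$-$\lambda$, $\Gamma\{x_1 \mapsto B_0 \,@\, r_0\}\{x_2 \mapsto P_{0.1}\};\ S_0 \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}}} e_0 : P_{0.2} \;!\; T_0$, with $B_0 = P_{0.1} \stackrel{T_0}{\Rightarrow} P_{0.2}$ and $S = S_0\{r_2 \mapsto {\between}S_2\}\{r_3 \mapsto {\between}S_3\}$. Since $x_2 \neq x_1$, $\Gamma\{x_1 \mapsto B_0 \,@\, r_0\}\{x_2 \mapsto P_{0.1}\} = \Gamma\{x_2 \mapsto P_{0.1}\}\{x_1 \mapsto B_0 \,@\, r_0\}$. By weakening to $S_0' = S_0\{a \mapsto B_0\}$, $\Gamma\{x_2 \mapsto P_{0.1}\}\{x_1 \mapsto B_0 \,@\, r_0\};\ S_0' \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}}} e_0 : P_{0.2} \;!\; T_0$. Applying $\vdash^{imd}_{obj\,v}$-addr-live gives $S_0' \vdash^{imd}_{obj\,v^{r_1 r_0\,\hat{}}} a : B_0 \,@\, r_0$. By Lemma 6.1.7(i), $\Gamma\{x_2 \mapsto P_{0.1}\};\ S_0' \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}}} e_0[x_1 := a] : P_{0.2} \;!\; T_0$. Reapplying $\vdash^{imd}_{obj\,b}$-$\lambda$ gives $\Gamma;\ S_0' \vdash^{imd}_{obj\,b^{r_1 r_0/\,\hat{}}} \langle\lambda x_2.e_0[x_1 := a]\rangle : B_0 \;!\; \emptyset$. Applying $\vdash^{imd}_{obj\,d}$-$\lambda$ gives $S_0' \vdash^{imd}_{obj\,d^{r_1 r_0}} \langle\lambda x_2.e_0[x_1 := a]\rangle : B_0$. We also have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\vdash^{imd}_{obj\,s^{r}} s : S$, so peeling back the outer layers and reapplying with the new storable yields $\vdash^{imd}_{obj\,s^{r}} s' : S'$. We complete with an application of rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$.

$\rightharpoonup^{imd}_{obj\,\langle e;s\rangle}$-freeregion: $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0;\ s_1\{r_0 \mapsto {\between}s_0\}\,s_2\rangle \overset{[]}{\rightharpoonup} \langle e_0[r_0 := \emptyset];\ s_1\,s_2[r_0 := \emptyset]\rangle$:

with $\langle e_0; s_2\rangle$ not actively mentioning $r_0$. We have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\emptyset\{r_1\};\ S \vdash^{imd}_{obj\,e^{r_1\,\hat{}\,r_0 r_2}} \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0 : P_1 \;!\; T_1$. By $\vdash^{imd}_{obj\,e}$-freeregion, $\emptyset\{r_1\}\{r_0\};\ S \vdash^{imd}_{obj\,e^{r_1 r_0\,\hat{}\,r_2}} e_0 : P_0 \;!\; T_0$, with $P_1 = P_0 - r_0$ and $T_1 = T_0 - r_0$. Let $S = S_1\{r_0 \mapsto {\between}S_0\}\,S_2$. By Lemma 6.1.9, $\emptyset\{r_1\};\ S_1\,S_2[r_0 := \emptyset] \vdash^{imd}_{obj\,e^{r_1\,\hat{}\,r_2}} e_0[r_0 := \emptyset] : P_1 \;!\; T_1$. We also have by rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$ that $\vdash^{imd}_{obj\,s^{r_1 r_0 r_2}} s_1\{r_0 \mapsto {\between}s_0\}\,s_2 : S_1\{r_0 \mapsto {\between}S_0\}\,S_2$. Using Lemma 6.1.9 on stores obtains $\vdash^{imd}_{obj\,s^{r_1 r_2}} s_1\,s_2[r_0 := \emptyset] : S_1\,S_2[r_0 := \emptyset]$. We complete with an application of rule $\vdash^{imd}_{obj\,\langle e;s\rangle}$.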

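Proof: Lemma 6.1.7. We again actually show variations that work on judgments over region indicators with general environments:

$^{imd}_{obj}p$:
\[
\Gamma\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r_2}} p : P \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \Gamma;\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r_2}} p[x' := v'] : P
\]

$^{imd}_{obj}b$:
\[
\Gamma\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,b^{\vec{\varrho}\,\hat{}\,r_2}} b : B \;!\; T \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \Gamma;\ S \vdash^{imd}_{obj\,b^{\vec{\varrho}\,\hat{}\,r_2}} b[x' := v'] : B \;!\; T
\]

$^{imd}_{obj}e$:
\[
\Gamma\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r_2}} e : E \;\land\; X \in \mathrm{ftv}(P'^{*}) \;\land\; S \vdash^{imd}_{obj\,v^{\varrho\,\hat{}\,r_2}} v' : P'^{*} \;\to\; \Gamma;\ S \vdash^{imd}_{obj\,e^{\varrho\,\hat{}\,r_2}} e[x' := v'] : E
\]

$^{imd}_{obj}p$:

$\Gamma\{x' \mapsto \forall X.\,P'^{*}\};\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r_2}} x : P$:

By $\vdash^{imd}_{obj\,p}$-var, $\Gamma\{x' \mapsto \forall X.\,P'^{*}\}(x) \succ P$. If $x' = x$ then $x[x' := v'] = v'$ and $\forall X.\,P'^{*} \succ P$. Thus, $\exists \sigma_1.\ \sigma_1(P'^{*}) = P$. Unless $v'$ is a live address $a = \langle r_0, o\rangle$, the result follows easily. Otherwise, we have $P'^{*} = B'^{*} \,@\, r_0$, $P = B \,@\, r_0$, $\sigma_1(B'^{*}) = B$, and $S(a) \succ B'^{*}$. Thus, $S(a) = \forall X_2.\,B'^{**}$ with $\exists \sigma_2.\ \sigma_2(B'^{**}) = B'^{*}$. Let $\sigma_3 = \sigma_2 \circ \sigma_1$ ($\sigma_2$ followed by $\sigma_1$) so that $\sigma_3(B'^{**}) = B$. Thus, $S(a) \succ B$. Reapplying $\vdash^{imd}_{obj\,v}$-addr-live, $\Gamma;\ S \vdash^{imd}_{obj\,p^{\varrho\,\hat{}\,r_2}} a : B \,@\, r_0$.

If $x \neq x'$, by $\vdash^{imd}_{obj\,p}$-var and the definition of environments, $\Gamma(x) \succ P$. $\Gamma;\ S \vdash^{imd}_{obj\,p} x : P$ follows by reapplying $\vdash^{imd}_{obj\,p}$-var, and is sufficient because $x = x[x' := v']$.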

imd obj b: imd

Γ {x′ 7→ ∀ X. P ′∗ }; S ⊢obj b By ⊢

imd obj b -µλ

ˆ r2

⇁ ̺

hµx1 @̺0 . λx2 . e0 i : B0 ! ∅: imd

, Γ {x′ 7→ ∀ X. P ′∗ } {x1 7→ B0 @ ̺0 }; S ⊢obj b

⇁ ̺

ˆ r2

hλx2 .e0 i: B0 ! ∅. By

definition of environment extension, x′ 6= x1 . Γ {x′ 7→ ∀ X. P ′∗ } {x1 7→ B0 @ ̺0 } = Γ {x1 7→ B0 @ ̺0 } {x′ 7→ ∀ X. P ′∗ } so by induction, we have Γ {x1 7→ B0 @ ̺0 }; S ⇁ imd ̺ ˆ r2 ⊢obj b hλx2 .e0 i [ x′ := v ′ ]: B0 ! ∅. If x′ = x2 then hλx2 .e0 i [ x′ := v ′ ] = hλx2 .e0 i, so ⇁ imd imd ̺ ˆ r2 hµx1 @̺0 . λx2 . e0 i : B0 ! ∅ follows by ⊢obj b -µλ, with hµx1 @̺0 . λx2 . e0 i Γ; S ⊢obj b = hµx1 @̺0 . λx2 . e0 i [ x′ := v ′ ]. Otherwise, hλx2 .e0 i [ x′ := v ′ ] = hλx2 .e0 [ x′ := v ′ ] i imd

imd

and Γ; S ⊢obj b hµx1 @̺0 . λx2 . e0 [ x′ := v ′ ]i : B0 ! ∅ follows by ⊢obj b-µλ, and is sufficient because hµx1 @̺0 . λx2 . e0 [ x′ := v ′ ]i = hµx1 @̺0 . λx2 . e0 i [ x′ := v ′ ].

imd obj e: imd

Γ {x′ 7→ ∀P ′ }; S ⊢obj e

̺

ˆr

let x= ∀e in e: P ! T: imd imd ̺ ˆ r imd ̺ ˆ r ∀ By ⊢obj e-letg, Γ {x′ 7→ ∀P ′ }; S ⊢obj e e: P.1 ! T.1 , and by induction, Γ; S ⊢obj e imd

e [ x′ := v ′ ]: P.1 ! T.1 . Also by ⊢obj e-letg (assuming without loss of generality that x′ 6= imd ̺ ˆ r e: P ! T.2 , where X = ftv(P.1 ) − ftv(Γ), x), Γ {x′ 7→ ∀P ′ } {x 7→ ∀ X. P.1 }; S ⊢obj e ∀

and T = T.1 ∪ T.2 . Γ {x′ 7→ ∀P ′ } {x 7→ ∀ X. P.1 } = Γ {x 7→ ∀ X. P.1 } {x′ 7→ ∀P ′ }, so imd ̺ ˆ r imd by induction, Γ {x 7→ ∀ X. P.1 }; S ⊢obj e e [ x′ := v ′ ]: P ! T.2 . Applying ⊢obj e-letg, imd ̺ ˆ r we get Γ; S ⊢obj e let x= ∀e [ x′ := v ′ ] in e [ x′ := v ′ ]: P ! T, where let x= ∀e [ x′ := v ′ ] in e [ x′ := v ′ ] = (let x= ∀e in e) [ x′ := v ′ ]. Proof: Lemma 6.1.9. We generalize the result to allow lexically visible region indicators above the region to be deallocated, and to allow program variables in the environments:


imd obj v: imd

S1 {r0 7→ ≬S} S3 ⊢obj v S1 S3 [ r0 := ∅] ⊢

r1 r0 ̺2

imd r1 ̺2 obj v

ˆ r3

ˆ r3

v: P →

v [ r0 := ∅]: P - r0

imd obj p: imd

Γ; S1 {r0 7→ ≬S} S3 ⊢obj p

r1 r0 ̺2

imd

Γ − r0 ; S1 S3 [ r0 := ∅] ⊢obj p

ˆ r3

r1 ̺2

p: P →

ˆ r3

p [ r0 := ∅] : P − r0

imd obj b: imd

Γ; S1 {r0 7→ ≬S} S3 ⊢obj b

ˆ r3

⇁ r

imd

Γ - r0 ; S1 S3 [ r0 := ∅] ⊢obj b

ˆ r3

⇁ r

b: B ! T → b [ r0 := ∅]: B - r0 ! T - r0

⇁

where seq-/( r ) = r1 r0 ̺2 and b does not actively mention r0 imd obj e: imd

Γ; S1 {r0 7→ ≬S } S3 ⊢obj e Γ - r0 ; S1 S3 [ r0 := ∅] ⊢

r1 r0 ̺2

imd r1 ̺2 obj e

ˆ r3 ˆ r3

e: P ! T → e [ r0 := ∅]: P - r0 ! T - r0

where e does not actively mention r0 imd obj s: imd

r1 r0 r2

imd

r1 r2

⊢obj s ⊢obj s

s1 {r0 7→ ≬s} s2 : S1 {r0 7→ ≬S} S2 → s1 s2 [ r0 := ∅]: S1 S2 [ r0 := ∅]

where s2 does not actively mention r0 The proof is by induction on the derivation: imd obj v: imd

S ⊢obj v

r1 r0 ̺2

By ⊢

ˆ r3

a: B @ r:

imd obj v-addr-live

imd

, a = hr, oi and S( a) B. If r = r0 then S − r0 ⊢obj v

∅ follows from ⊢

r1 ̺2

ˆ r3

h∅, oi:

imd obj v -addr-dead

with h∅, oi = a [ r0 := ∅] and ∅ = B @ r [ r0 := ∅]. Othimd imd r1 ̺2 ˆ r3 erwise (S [ r0 := ∅])( a) B. Reapplying ⊢obj v -addr-live gives S − r0 ⊢obj v a: B [ r0 := ∅] @ r0 , with a = a [ r0 := ∅] and (B [ r0 := ∅]) @ r = (B @ r) [ r0 := ∅].


imd obj e: imd

Γ1 ; S11 {r10 7→ ≬S 10 } S12 S0 S3 ⊢obj e

r11 r10 r12

ˆ r0 r2

imd

freeregion r0 after e0 : P0 - r0 ! T0 - r0 :

imd

By ⊢obj e-freeregion, Γ1 {r0 }; S11 {r10 7→ ≬S 10 } S12 S0 S3 ⊢obj e imd

By induction, Γ1 {r0 } - r10 ; S11 (S12 S0 S3 ) [ r10 := ∅] ⊢obj e

r11 r10 r12 r0

r11 r12 r0

ˆ r2

ˆ r2

e0 : P0 ! T0 .

e0 [ r10 := ∅]: imd

P0 - r10 ! T0 - r10 . Γ1 {r0 } - r10 = (Γ1 - r10 ) {r0 }, so we can reapply ⊢obj e -freeregion to imd r11 r10 r12 ˆ r0 r2 obtain Γ1 - r10 ; S11 (S12 S0 S3 ) [ r10 := ∅] ⊢obj e freeregion r0 after e0 [ r10 := ∅]: P0 − r10 - r0 ! T0 − r10 - r0 . Because freeregion r0 after e0 [ r10 := ∅] = freeregion r0 after e0 [ r10 := ∅], P0 − r10 - r0 = P0 − r0 - r10 and T0 − r10 - r0 = T0 − r0 - r10 , we are done. Proof: Lemma 6.1.10. As in the core language, we generalize the result to allow for contexts due to function bodies. We thus prove: imd obj v: imd

r1 r0 ρ2

ˆ

v: P ∗ ∧ ∃ X. ∀ X. P ∗ P → imd r1 r0 r2 ρ2 ˆ r3 7 ≬S 2 } {r3 7→ ≬S 3 } ⊢obj v v: P S0 {r2 →

S0 ⊢obj v

imd obj p: imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj p

r1 r0 ρ2

ˆ

p : P ∗ ∧ ∃ X. ∀ X. P ∗ P → imd r1 r0 r2 ρ2 ˆ r3 ∅ {r1 } {r0 } {r2 }{ρ2 }{x2 → 7 P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj p p: P

imd obj b

(active source prestorable, i): imd

⇁ r1 r0 ρ2

ˆ

b: B1 ∗ ! T ∧ ∃ X. ∀ X. B1 ∗ B1 → ⇁ imd r1 r0 r2 ρ2 ˆ r3 ∅ {r1 } {r0 } {r2 }{ρ2 }{x2 → 7 P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b b: B1 ! T ∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b

imd obj b

(active source prestorable, ii): ⇁ imd r1 r0 ρ2

ˆ

b: B2 ∗ ! T ∧ ∃ X. ∀ X. B2 ∗ B2 → ⇁ imd r1 r0 r2 ρ2 ˆ r3 ∅ {r1 } {r0 } {r2 }{ρ2 }{x2 → 7 P2 }; S0 {r2 → 7 ≬S 2 } {r3 7→ ≬S 3 } ⊢obj b b: B2 ! T ∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj b


imd obj e


(active source expression): imd

∅ {r1 } {r0 }{ρ2 }{x2 7→ P2 }; S0 ⊢obj e

r1 r0 ρ2

ˆ

e: P ∗ ! T ∧ ∃ X. ∀ X. P ∗ P → imd r1 r0 r2 ρ2 ˆ r3 ∅ {r1 } {r0 } {r2 }{ρ2 }{x2 → 7 P2 }; S0 {r2 7→ ≬S 2 } {r3 7→ ≬S 3 } ⊢obj e e: P ! T The cases for variable references and live addresses are modified to use a substitution extended from the one already in use.

2. Monadic Language

2.1. Dynamics.

The monadic intermediate language dynamic syntax is presented in Figure 6.2.1. We enhance our monadic language in a manner analogous to that just applied to the object language. In particular, we augment our indexes on expressions to reflect the structure of the store. Rather than maintaining separate lexical and nonlexical indexes as we did for stores, it will be more convenient to overlay them. Indexes are composed of a nonempty sequence of '.'-left-delimited segments. The first segment contains just a forest, as in the first segment of the nonlexical store but with region stores replaced by region names. Subsequent segments are divided into a lexical portion (simply a region name) and a nonlexical portion (another forest of lexically visible and uncancelled region names)2. We use the metavariable $\widetilde{r}$ to represent these segments. Any remaining segments represent lexically visible (and uncancelled) virtual regions and contain only a region variable. We use the metavariable $\widetilde{\varrho}$ to represent segments of either of the latter two types.

We will require trees of region names and sequences of trees (forests) of region names. We indicate a tree with root r0 and child subtrees r1 by writing the root r0 immediately followed by the forest r1. Effectful operations are permissible when the index contains more than one segment. It is again the case that configurations are formed out of components of the same index. Stores are just like those of the core language (Figure 4.2.1). The following operations on indexes are similar to those defined on stores in Chapter 4. We still assume the availability of those operations, and corresponding ones on traces.

2 The lexical portion could be omitted, since there is again only a single region name and all relevant structure resides in the nonlexical portion, but including it makes it easier to express transformations.
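The shape of these indexes can be made concrete with a small data model. The following is a minimal sketch rather than the working model mentioned in the acknowledgements: the class names (`Tree`, `RegionSegment`, `VirtualSegment`, `Index`) and the methods are hypothetical, and the well-formedness check only encodes the ordering stated above, namely that region segments precede the virtual-region segments.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Tree:
    """A tree of region names: a root followed by a forest of subtrees."""
    root: str
    children: Tuple["Tree", ...] = ()


@dataclass(frozen=True)
class RegionSegment:
    """A segment with a lexical region name and a nonlexical forest."""
    lexical: str
    forest: Tuple[Tree, ...] = ()


@dataclass(frozen=True)
class VirtualSegment:
    """A segment for a virtual region: only a region variable."""
    variable: str


Segment = Union[RegionSegment, VirtualSegment]


@dataclass(frozen=True)
class Index:
    """A nonempty sequence of segments; the first holds just a forest."""
    first: Tuple[Tree, ...]
    rest: Tuple[Segment, ...] = ()

    def segment_count(self) -> int:
        return 1 + len(self.rest)

    def effects_permitted(self) -> bool:
        # Effectful operations need more than one segment.
        return self.segment_count() > 1

    def well_formed(self) -> bool:
        # Assumption of this sketch: once a virtual segment appears,
        # only virtual segments may follow.
        seen_virtual = False
        for seg in self.rest:
            if isinstance(seg, VirtualSegment):
                seen_virtual = True
            elif seen_virtual:
                return False
        return True
```

A program-level index such as `Index(first=(Tree("r2"),))` has a single segment and so permits no effects, whereas an index extended with a `RegionSegment` or `VirtualSegment` does.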


Dynamics f . r2

hq; 6 ⊚s i

imd 6⊚ mon hq; si

∈

f

f

q . r2 ∈ f

.r2 . fr

he; si

f f r.ρ

e . r2 . f

f

f

f

f

f

::= e .r2 f

imd .r2 . fr mon e

=

f imd 6 ∀ .r2 . r . ρ mon e

∈

e

f

f imd 6 ∀ .r2 . r mon e

∈

e

f

f f imd 6 ∀ .r2 . ̺ 1 . ̺ 0 mon e

∈

e

b̺ ∈ ∀ rρ

∈

b

6∀ ̺

∈

b

imd ̺1 ̺0 mon b imd ∀ ̺1 ̺0 mon b imd 6 ∀ ̺1 ̺0 mon b

r1 r0 d r ∈ imd mon d ̺ p ̺ ∈ imd mon p

hv; ⊚si

r

∈

imd ⊚ r mon hv; si ̺ v ̺ ∈ imd mon v

f

f

| 6 ∀e

e

f

s .r2 . r ˆ r2 ∈

.r2 . fr . ρ

::= returnr ρ p r ρ f

::= return ∀e f

.r2 . f̺ 1

imd .r2 . fr mon s ⊚ ǫ ⊚ r s ∈ imd mon s ⊚ r ⊚ r1 r0 s ∈ imd mon s

| ∀b

fr . ρ. ρ 0

::= run e .r2 .

rρ f

| let x = e .r2 .

fr . ρ

f

in e .r2 .

fr . ρ

f f r

::= running e '.r2 .

f

6 ∀ .r2 . r ˆ r . ρ

f

f ∀ . r2 . r . ρ

f

f 6 ∀ . r2 . r

f

imd .r2 . fr mon s

×

f

::=

f

f 6 ∀ . r2 . r . ρ

f . r2

f

f f imd ∀ .r2 . ̺ 1 . ̺ 0 mon e

∈

e

imd 6 ⊚ mon s

×

f

∀ .r2 . r ˆ r2 . ρ

f

.r2 . fr

imd ∀ .r2 . r ˆ r . ρ mon e

∈

e

f

imd .r2 mon q

=

f

si

imd .r2 . fr . ρ mon e

∈

f

∀ .r2 . r ˆ r . ρ f

imd .r2 mon q

imd mon he;

∈

f . r2

f

.r2 . f̺

rρ

1 ::= return 6 ∀e | 6 ∀b | deref p r ρ rρ rρ | set p to p | p r ρ p r ρ

::=

∀ ̺

b | 6 ∀b

̺

::= hλx.e ǫ. r ˆ ǫ. ρ i | hµx. λx. e ǫ. r ˆ ǫ. ρ i | hci ::= href p ̺ i ::= href v r i | hλx.e ǫ. r ˆ ǫ i | hci ::= v ̺ | x imd r = mon v × ::= g | o

::=

imd ⊚ r mon s

f

⊚ r

s ˆ 6 ⊚s

.r2 . fr

::= ∅ ::=

⊚ r1

s

r

{≬s } f

f

f

6 ⊚ .r2 . r1 ˆ r21

s

::=

f ǫˆ r2 ≬s

::=

f f f r0 ˆ r20 6 ⊚ .r2 . r1 ˆ r21 ≬ s . s

f

∈

imd 6 ⊚ r2 mon s

f

f

f

6 ⊚ .r2 . r ˆ r2

s

f

∈

f

f

imd 6 ⊚ .r2 . r1 ˆ r21 . r0 ˆ r20 mon s f r1 ˆ ≬

s

r0 f r2

f r1 ˆ ≬ ∈ imd mon s ≬ r ≬ r1 r0 s ∈ imd mon s ̺ t ̺ ∈ imd mon t ̺ ̺ ̺ 1 0 f ∈ imd mon f ̺ imd ̺2 ̺1 ̺0 f ∈ mon f ι ∈ imd mon ι

f

≬ r1 r0

s

r0 f r2

f

::=

f r1 r0 ˆ r2 ≬s

::= = ::= ::= ::=

∅ {o 7→ d r } ̺ [ imd mon f ] (ι @ o) (ι @ o) | return f ̺2 ̺1 alloc | read | write | exec

f

f f r ∈ imd ::= rˆ r f̺ ∈ imd ::= fr | ρ mon r mon ̺ imd ̺ ∈ imd ::= ρ r ∈ imd mon ̺ ::= r | ρ ρ ∈ mon ρ mon r ::= r imd imd imd imd imd g ∈ mon g ⊇ obj g o ∈ mon o ⊇ obj o x ∈ imd mon x ⊇ obj x

Figure 6.2.1. Monadic Intermediate Language Syntax


'(.


r20 f f '(. r21 r201 ) r200 f f f f r2 . r1 ˆ r21 . r0 ˆ r201 r2001 ) f

f

&(. r21 . r20 ˆ r201 ) &(.

f r2

f f f . r1 ˆ r21 . r0 ˆ r201 . r200 ˆ r2001 )

f

f

= . r21 . r20 ˆ r201 f

f

f

f

= . r2 . r1 ˆ r21 . r0 ˆ r201 . r200 ˆ r2001 f

r20 f

= . r21 r201 = .

r200 f f f . r1 ˆ r21 . r0 ˆ r201 r2001

f r2

' is used to modify the index upon descending into a running form. It extends the number of index segments by one. Upon descending into a run form, by contrast, we append a region variable segment, and upon descending into a return form we simply truncate the index. The latter ensures that a return form cancels the innermost lexically visible, uncancelled region.

We define $(.r_2.\widetilde{r}_1)\,r_0$ such that $(.r_2)\,r_0 = r_2\,r_0$ and $(.r_2.\widetilde{r}_{11}.r_{10}\,\hat{}\,r_{10})\,r_0 = .r_2.\widetilde{r}_{11}.r_{10}\,\hat{}\,r_{10}\,r_0$. This operation does not change the length of the list of forests at the top level of the index.

The reduction semantics for the monadic language is presented in Figures 6.2.2 through 6.2.5. Figures 6.2.2 and 6.2.3 present the reduction rules. The rules largely correspond to those of the object language. Again, the lexical portion of the index (in this case the number of forests in the sequence) is preserved by the rules, but the nonlexical index (the structure of the forests themselves) is not. The reduction rules, with the exceptions of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-run and $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-running, leave the index entirely unchanged. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-run extends the last forest (the list of siblings corresponding to the innermost region) with a trivial tree consisting only of a root. In terms of the stores, this root corresponds to an empty region store. $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-running removes the root of the leftmost tree from the last forest.

In $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-running, $e\,[r_0 := \emptyset]$ has a special meaning. It indicates the removal from e of all return forms corresponding to the running form being removed. These are not necessarily the outermost return forms, because additional run and running forms may occur in e. We can implement this by extending an index for the bodies of run and running, and retracting it for the argument to return if it is nonempty; otherwise we drop that return construct entirely. We are again conservative in the side condition of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-running; we require not just that the effect of e be of the form $\mathrm{St}_{\emptyset}\ T$, but that effectful operations on the region being deallocated not occur in e. In terms of stores, we additionally modify the remainder of the leftmost tree by removing all return forms corresponding to the running form being removed. Only the rightmost sequence is thus affected by these reduction rules.
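The index-based implementation of the return-removing substitution can be sketched as follows. This is an illustration under stated assumptions, not the definition used in the reduction rules: the miniature syntax (`Leaf`, `Return`, `Run`, `Running`, `Let`) and the function name `strip_returns` are hypothetical, the index is simplified to a counter of regions opened inside e, and function bodies are treated as opaque leaves. Once the return form belonging to the removed running form has been dropped, its argument lies outside the removed region and is left unchanged.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Leaf:
    """Any expression with no nested run, running, return or let form."""
    text: str


@dataclass(frozen=True)
class Return:
    body: Any


@dataclass(frozen=True)
class Run:
    body: Any


@dataclass(frozen=True)
class Running:
    body: Any


@dataclass(frozen=True)
class Let:
    var: str
    bound: Any
    body: Any


def strip_returns(e: Any, depth: int = 0) -> Any:
    """Remove the return forms that cancel the region being deallocated.

    depth counts the run/running forms entered (and not yet cancelled)
    since the start of e; it plays the role of the extended index.
    """
    if isinstance(e, Leaf):
        return e
    if isinstance(e, (Run, Running)):
        # Extend the index for the body of run and running.
        return type(e)(strip_returns(e.body, depth + 1))
    if isinstance(e, Let):
        return Let(e.var, strip_returns(e.bound, depth), strip_returns(e.body, depth))
    if isinstance(e, Return):
        if depth > 0:
            # Nonempty index: retract it and keep this return.
            return Return(strip_returns(e.body, depth - 1))
        # Empty index: this return cancels the removed region, so drop it.
        return e.body
    raise TypeError(f"unexpected expression: {e!r}")
```

For example, in `let x = return o1 in run (return (return x))` the first return and the innermost return are dropped, while the return that cancels the nested run is kept.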

The rule $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-µλ allocates the recursive function, returning its offset in the top region of the lexical store and generating a return form for each region in the lexical index. In $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-app-const, we now assume $\delta : {}^{\mathrm{imd}}_{\mathrm{mon}}c \Rightarrow {}^{\mathrm{imd}}_{\mathrm{mon}}\langle v;\ {}^{\circledcirc}s\rangle^{r} \Rightarrow {}^{\mathrm{imd}}_{\mathrm{mon}}\langle v;\ {}^{\circledcirc}s\rangle^{r}$. Thus, we extract the value and lexical store from the expression configuration, pass these as a value configuration to δ, and reapply the return forms and nonlexical store to the result of the transformation. Again, δ and δ′ will be restricted further via a relation with the type system.

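$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-let:
\[
\langle \mathrm{let}\ x = \mathrm{return}\ v\ \mathrm{in}\ e;\ s\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}\,\to\,r_2.\widetilde{r}}_{[\,]}\;
\langle e\,[x := v];\ s\rangle
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-alloc:
\[
\langle d_0;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{alloc}\,@\,o)]}\;
\langle \mathrm{return}_{r_1 r_0}\ o;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\{o \mapsto d_0\}\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-deref:
\[
\frac{{}^{\between}s_0(o) = \langle \mathrm{ref}\ v\rangle}
{\langle \mathrm{deref}\ o;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{read}\,@\,o)]}\;
\langle \mathrm{return}_{r_1 r_0}\ v;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle}
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-set:
\[
\frac{{}^{\between}s_0(o) = \langle \mathrm{ref}\ v_{0.1}\rangle}
{\langle \mathrm{set}\ o\ \mathrm{to}\ v_0;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{write}\,@\,o)]}\;
\langle \mathrm{return}_{r_1 r_0}\ \mathrm{unit};\ {}^{\circledcirc}s_1\{{}^{\between}s_0\{o \mapsto \langle \mathrm{ref}\ v_0\rangle\}\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle}
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-app-λ:
\[
\frac{{}^{\between}s_0(o) = \langle \lambda x.e_0\rangle}
{\langle o\ v_0;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{exec}\,@\,o)]}\;
\langle e_0\,[x := v_0];\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle}
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-app-const:
\[
\frac{{}^{\between}s_0(o) = \langle c\rangle}
{\begin{array}{l}
\langle o\ v;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{exec}\,@\,o)]\,+\,\delta'(c)(\langle v;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\rangle)} \\
\quad \mathrm{let}\ \langle v';\ {}^{\circledcirc}s'_1\{{}^{\between}s'_0\}\rangle = \delta(c)(\langle v;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\rangle)\ \mathrm{in}\ \langle \mathrm{return}_{r_1 r_0}\ v';\ {}^{\circledcirc}s'_1\{{}^{\between}s'_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\end{array}}
\]

$\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-µλ:
\[
\langle \langle \mu x_1.\,\lambda x_2.\,e_0\rangle;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\;\rightharpoonup^{\,r_2.\widetilde{r}_1.\widetilde{r}_0\,\to\,r_2.\widetilde{r}_1.\widetilde{r}_0}_{[(\mathrm{alloc}\,@\,o)]}\;
\langle \mathrm{return}_{r_1 r_0}\ o;\ {}^{\circledcirc}s_1\{{}^{\between}s_0\{o \mapsto \langle \lambda x_2.\,e_0\,[x_1 := o]\rangle\}\}\,\hat{}\,{}^{\not\circledcirc}s_0\rangle
\]

Figure 6.2.2. Monadic Expression Configuration Reduction Rules I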
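The store-manipulating rules of Figure 6.2.2 can be approximated by a few step functions over the innermost region store alone. This is a simplified sketch under several assumptions: the lexical and nonlexical stores around the innermost region store are omitted, the annotated return form is represented by a plain `("return", v)` pair, the fresh offset is supplied by the caller, and the names `step_alloc`, `step_deref`, `step_set` and `Faulty` are hypothetical. The raised exception corresponds to the immediately faulty cases in which the offset is absent or does not hold a reference.

```python
from typing import Any, Dict, List, Tuple

Storable = Tuple[str, Any]          # e.g. ("ref", v) or ("const", c)
RegionStore = Dict[str, Storable]   # offsets mapped to storables
Trace = List[Tuple[str, str]]       # e.g. [("alloc", "o1")]
Step = Tuple[Tuple[str, Any], RegionStore, Trace]


class Faulty(Exception):
    """Raised for an immediately faulty configuration."""


def step_alloc(d: Storable, s0: RegionStore, fresh: str) -> Step:
    # Allocation: extend the region store and return the new offset.
    if fresh in s0:
        raise ValueError("offset is not fresh")
    return ("return", fresh), {**s0, fresh: d}, [("alloc", fresh)]


def step_deref(o: str, s0: RegionStore) -> Step:
    # Dereference: the offset must hold a reference cell.
    d = s0.get(o)
    if d is None or d[0] != "ref":
        raise Faulty(f"deref {o}")
    return ("return", d[1]), s0, [("read", o)]


def step_set(o: str, v: Any, s0: RegionStore) -> Step:
    # Assignment: overwrite the reference cell and return unit.
    d = s0.get(o)
    if d is None or d[0] != "ref":
        raise Faulty(f"set {o}")
    return ("return", "unit"), {**s0, o: ("ref", v)}, [("write", o)]
```

Each function returns the resulting expression, the updated region store and the one-element trace labelling the step, mirroring the three parts of a reduction in the figure.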


⇀

imd he; si mon


-run

f

f

r2 . fr 1 → (r2 . fr 1 )r0 f [] ⊚ s 1 ˆ 6 ⊚s. ≬s 1 i ⇀ hrunning e0 ; ⊚s 1

hrun e0 ;

f

⇀

imd he; si mon

f

ˆ 6 ⊚s. ≬s 1 ∅i

e0 , ≬s 0 not actively mentioning the current region

-running

r0 f

f f

f

f f

r2 . fr . r0 r1 → r2 . fr . r0 r1 ≬ s0 f f [] ≬s ≬s i ⇀ he := ∅; ⊚s 0 0 1 1

hrunning e0 ; ⊚s 1 ˆ 6 ⊚s .

f

f

ˆ 6 ⊚s . (≬s 0 := ∅) ≬s 1 i

Figure 6.2.3. Monadic Expression Configuration Reduction Rules II →∗ [he; ∅ ˆ 6 ⊚s i

f r0

f

hq; 6 ⊚si

]

r0

imd →∗ [he; ∅ ˆ 6 ⊚s i mon

∈

f r

f

r

hq; 6 ⊚si

]

::=

h[ e]; [ 6 ⊚s]i f f r

→∗ [he; sir0 .

f

r0 . fr

]

he; si

∗

!→

[ t]

t ∈

→∗ [

imd mon he;

f f r

sir0 .

] imd mon he;

f

si

∗

r0 . fr

!→

[ t]

t ::=

h[ e]; [s]i ! [t] | hlet x = [ e] in e; [⊚s] ˆ [ 6 ⊚s ]i ! [t] f f r)

→∗ [he; si' (r0 .

]

f

he; si

r0 . fr

∗

!→

[ t]

t ∈

→∗ [

imd mon he;

f f r)

si'(r0 .

] imd mon he;

f

r0 . fr

si

!→

∗

[ t]

t ::=

hrunning [ e]; &([⊚s] ˆ [ 6 ⊚s])i ! &[t] f f r1

→∗ [he; sir2 .

f f r

he; sir2 .

]

1.

f

r0

∗

!→

∗

t ∈→

[ t]

[

imd mon he;

sir2 . r1 ˆ r31 ] imd mon he; f

f

f

f

f

sir2 . r1 ˆ r31 . r0 ˆ r30 ! →

∗

[ t]

t ::=

f

hreturn [ e]; [⊚s] {≬s} ˆ [ 6 ⊚s]. ≬si ! '[t]

Figure 6.2.4. CBV Monadic Atomic Expression Configuration/Trace Evaluation Contexts

Figure 6.2.4 defines program expression configuration and expression configuration/trace evaluation contexts as in the core intermediate monadic language (Figure 4.2.3). The multiple deep reduction rules in Figure 6.2.5 are the same as those of Section 4.2.1. We present in Figures 6.2.6 through 6.2.8 a monadic language reduction sequence corresponding to that of Figures 6.1.5 through 6.1.7. The sequence is again infinite. The main difference is the insertion of $\rightharpoonup^{\mathrm{imd}}_{\mathrm{mon}\,\langle e;s\rangle}$-let reductions. Constants and recursion are handled similarly to the object-language sequence, but use the forms of the monadic source language.


f

→∗ imd hq; si mon

f

r11 . fr 11 → r12 . fr 12 t3 he3 ; si →∗ he′3 ; s′ i

-cntxt

f

→∗ [ he; si]

f

r1

hq; si[he′3 ; s′ i]

hq; si[he3 ; si] → f

→∗

imd he; si mon

r2

∗ →∗ [ he; si]

f

r11 . fr 11 → r12 . fr 12 t3 he3 ; si →∗ he′3 ; s′ i

-cntxt

f

f

r21 . fr 21 → r22 . fr 22 →∗ [ t ]

t [t3 ]

→∗ [ he; si]

→∗

he; si[he3 ; si]

→∗ imd he; si mon

-reflex

f

→∗ [ he; si]

he; si[he′3 ; s′ i]

f

r0 . fr → r0 . fr [] ∗

f

f

r1 . fr 1 → r2 . fr 2 t ∗ ′ ′

he; si → he; si

he; si → he ; s i f

f

→∗

imd he; si mon

-step

he; si ⇀ he ; s i f r1 . fr 1

→ t ∗

f r2 . fr 2

→∗

imd he; si mon

f

r2 . fr 2 → r3 . fr 3 t′ ′ ′ ∗ ′′ ′′

f

r1 . fr 1 → r2 . fr 2 t ′ ′

-trans

he ; s i → he ; s i f

f

r1 . fr 1 → r3 . fr 3 t+t′ ∗ ′′ ′′

he; si → he′ ; s′ i

he; si → he ; s i

Figure 6.2.5. Monadic Language Multiple Deep Reduction Rules

Immediately faulty expression configurations are again similar to those of the object language.

Definition 6.2.1 (Immediately Faulty Expression Configurations).

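(1) $\langle \mathrm{return}_{r}\ x;\ s\rangle^{r_2.\widetilde{r}}$

(2) $\langle \langle \lambda x.e_0\rangle;\ s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ ($\lambda x.e_0$ not closed)

(3) $\langle \mathrm{deref}\ v;\ s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$, $\langle \mathrm{set}\ v\ \mathrm{to}\ p;\ s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$, $\langle v\ p;\ s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ ($v \neq o$)

(4) $\langle \mathrm{deref}\ o;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$, $\langle \mathrm{set}\ o\ \mathrm{to}\ p;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$, $\langle o\ p;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ ($o \notin \mathrm{Dom}({}^{\between}s)$)

(5) $\langle \mathrm{deref}\ o;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$, $\langle \mathrm{set}\ o\ \mathrm{to}\ p;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ (${}^{\between}s(o) \neq \langle \mathrm{ref}\ v\rangle$)

(6) $\langle o\ p;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ (${}^{\between}s(o) \notin \{\langle c\rangle, \langle \lambda x.e_0\rangle\}$)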


h run let x1 = h µx1 . λx2 . run let x3 = let x4 = hadd1-1i in x4 x2 in return x1 x3 in let x4 = hsub1i ; in let x5 = let x6 = h1i in x1 x6 in x4 x5 let x4 = let x5 = h1i in x1 x5 in deref x4 ∅ ˆ .ǫi

; i

[]

[]

⇀ hrunning [ e r ]; &[ s]i imd he; si -run - 6 mon ⇀

[ (alloc @ o2 ), (alloc @ o3 ), ] [ (alloc @ o1 ) ] r

h let x1 = [ e ] in let x4 = hsub1i in let x5 = let x6 = h1i in x1 x6 in x4 x5 let x4 = let x5 = h1i in x1 x5 in deref x4 [ s]i

(exec @ o1 ), (read @ o3 ), (alloc @ o4 ), . . . [] ⇀ ; h [ e r ]; ⇀ imd he; si -let let x = let x = h1i mon 4 5 in o1 x5 in deref x4 [ s]i

; ;

6 hh µx1 . λx2 . run let x3 = let x4 = hadd1-1i in x4 x2 in return x1 x3 ∅ {∅} ˆ .ǫ. ǫi

;

[(alloc @ o1 ) ]

i

⇀

imd he; si mon

-µλ

⇀ ho1 ; ⊚ s 1 ˆ .ǫ. ǫi

d1

≬

⊚

≬

∅ { s 1,1 } ∅ {≬s 1,2 } ∅ {≬s 1,3 } s3 {≬s 2,1 } ∅ {≬s 1,4 } {≬s 2,1 } ⊚ s 6 = ∅ {≬s 1,4 } s1 ⊚ s2 ⊚ s3 ⊚ s4 ⊚ s5

= = = = =

s 1,1 = ∅ {o1 7→ h λx2 . run let x3 = let x4 = hadd1-1i in x4 x2 in return o1 x3 ≬ s 1,2 = ≬s 1,1 {o2 7→ hsub1i} ≬ s 1,3 = ≬s 1,2 {o3 7→ h1i} ≬ s 1,4 = ≬s 1,3 {o4 7→ h2i} ≬ s 2,1 = ∅ {o1 7→ hadd1-1i}

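Figure 6.2.6. Sample Monadic Language Reduction, I

(7) $\langle o\ v;\ {}^{\circledcirc}s\{{}^{\between}s\}\,\hat{}\,{}^{\not\circledcirc}s\rangle^{r_2.\widetilde{r}_1.\widetilde{r}_0}$ (${}^{\between}s(o) = \langle c\rangle$, $\delta(c)(\langle v;\ {}^{\circledcirc}s\rangle)$ not defined)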

} i


d1 = [ (alloc @ o3 ), ]

;

[ (alloc @ o2 ) ] h let x4 = [ e] in let x5 = let x6 = h1i in o1 x6 in x4 x5 [ s]i

(exec @ o1 ), (read @ o3 ), (alloc @ o4 ), (exec @ o1 ), ... ⇀ h let x5 = [ e r ] ; ⇀ imd he; si -let in o2 x5 mon [ s]i

[ (read @ o3 ), ]

6

hhsub1i; ⊚s 1 ˆ .ǫ. ǫi

[(alloc @ o2 ) ]

⇀ ho2 ; ⇀ ⊚ imd he; si -alloc s 2 ˆ .ǫ. ǫi mon

[ (alloc @ o3 ) ] h let x6 = [ e r ] ; in o1 x6 [ s]i

⇀

imd he; si mon

⇀ ho1 o3 ; -let ⊚s 3 ˆ .ǫ. ǫi

(alloc @ o4 ), (exec @ o1 ), [(exec @ o1 )] . . . ⇀ d2 ⇀ imd he; si -app-λ mon

6 [(alloc @ o ) ] 3 hh1i; ⊚s 2 ˆ .ǫ. ǫi ⇀ ho3 ; ⊚s 3 ˆ .ǫ. ǫi ⇀ imd he; si -alloc mon Figure 6.2.7. Sample Monadic Language Reduction, II


[ (read @ o3 ), ]

h run let x3 = let x4 = hadd1-1i in x4 o3 in return o1 x3 ⊚ s 3 ˆ .ǫ. ǫi

;

(alloc @ o4 ) ⇀ hrunning [ e rr ]; ⇀ imd he; si -run &[ s]i mon 6 []

[ (alloc @ o1 ),

-

(exec @ o1 ), return (read @ o3 ), return (alloc @ o4 ) h let x3 = [ e rr ] ; in return o1 x3 [ s]i

[(alloc @ o1 )]

h let x4 = [ e rr ] ; in x4 o3 [ s]i

⇀ ho4 o3 ; ⊚ imd he; si -let s 4 ˆ .ǫ. ǫ. ǫi mon ⇀

[] ⇀ imd he; si mon

⇀ ho1 o4 ; ⊚ s 6 ˆ .ǫ. ǫi

-running

]

⇀ imd he; si mon

]

⇀ hreturn o1 o4 ; ⊚s 5 ˆ .ǫ. ǫ. ǫi -let

[(exec @ o1 ), ] return (read @ o3 ), return (alloc @ o4 ) ⇀ imd he; si mon

⇀ ho4 ; ⊚s 5 ˆ .ǫ. ǫ. ǫi

-app-const

[(alloc @ o1 )] ⇀ ho1 ; ⊚s 4 ˆ .ǫ. ǫ. ǫi ⇀ imd he; si -alloc mon Figure 6.2.8. Sample Monadic Language Reduction, III


6 hhadd1-1i; ⊚s 3 {∅} ˆ .ǫ. ǫ. ǫi

[(exec @ o1 )] [ . . . ⇀ ∞ ⇀ imd he; si -app-λ mon


d2 =


2.2. Statics.

Γ̺ ∈

imd ǫ mon Γ ̺ imd ̺1 ̺0 Γ ∈ mon Γ ≬ ̺ ≬ ̺ Γ ∈ imd mon Γ f

r2

hQ; 6 ⊚Si f

r2 . fr

hE; Si

∈

∅ {≬Γ }

::=

Γ ̺1 {≬Γ }

::=

∅ | ≬Γ {x 7→ ∀P }

̺

̺

̺

f

imd 6 ⊚ r2 mon hQ; S i

∈

f f r . rˆ r Si 2 ̺ ∈ imd mon E

imd mon hE; ̺

E

̺

∀ ̺ B ∈ imd mon B ̺1 ̺0 B ̺ ∈ imd mon B r r ⊚ hP; ⊚Si ∈ imd mon hP; S i ∀ ̺ ∀ ̺ P ∈ imd mon P ǫ src P ̺ ∈ imd mon P , Q ∈ obj Q ̺1 ̺0 P ̺ ∈ imd mon P ∀

f

ǫ

::=

f

f

imd mon Q

= ::=

r imd mon E ̺ ̺

T P

::= ::=

∀ X. B ̺ X | Ref P ̺ | P ̺ ⇒ E ̺ | C

=

imd r0 . fr mon S ⊚ r ⊚ ǫ S ∈ imd mon S ⊚ r ⊚ r1 r0 S ∈ imd mon S

× ×

f

imd r2 . fr mon S

imd r imd ⊚ r mon P × mon S ̺

::= ::=

∀ X. P G|∅

::=

B ̺ | Return P ̺1

::=

⊚ r10

f

S r0 . r10 ˆ r20 ∈

imd 6 ⊚ r2 mon S

=

f

r0 . fr

S

ˆ 6 ⊚S

::=

∅

::=

⊚ r1

{≬S }

S

r

f

f

f

6 ⊚ .r2 . r10 ˆ r20

S

::=

f ǫ ˆ r2 ≬S

::=

f f f r0 ˆ r20 6 ⊚ .r2 . r1 ˆ r21 ≬ S . S

f

∈

imd 6 ⊚ r2 mon S

f

f

f

6 ⊚ .r2 . r0 ˆ r20

S

f

∈

f r1 ˆ ≬

G ∈

imd mon G

⊇

f

f

imd 6 ⊚ .r2 . r1 ˆ r21 . r0 ˆ r20 mon S r0 f r2

r0 f r2

≬

S

r1 r0

f r1 ˆ f r1 r0 ˆ r2 ≬ ≬S S ∈ imd S ::= mon r ≬ r ≬ r1 r0 S ∈ imd ::= ∅ {o 7→ ∀B } mon S ǫ T ̺ ∈ imd ::= Id mon T ̺ ̺ imd ̺1 ̺0 T ∈ mon T ::= ≬T T ̺1 ̺ ≬ ̺ ≬ ̺1 ̺0 T ∈ imd ::= Stε mon T ̺ ̺1 ̺0 ε ̺ ∈ imd = { imd mon ε mon F } ̺ imd ̺1 ̺0 F ∈ mon F ::= ι imd ι ∈ mon ι ::= alloc | read | write | imd imd imd X ∈ imd mon X ⊇ obj X C ∈ mon C ⊇ obj G f

exec imd obj C

Figure 6.2.1. Monadic Intermediate Language Static Syntax

The monadic language static syntax is presented in Figure 6.2.1. It, and the static semantics in general, are a simple combination of the core intermediate monadic language static semantics of Section 4.2.2 and the enhanced source language static semantics of Section 5.2. We thus present it without further comment.

imd

⊢mon hq;

program configs f

6 ⊚ r2

programs

f

;

Γr ρ

prestorables pures

Γ

rρ

f

S r2 . r0 ˆ r1

;

⊚ r

;

⊚ r

Q

:

hE; Si.r2 . r0 ˆ r1

e .r2 . r0 ˆ r1 . ρ0

:

Er ρ

imd r ρ mon b

br ρ

:

Br ρ

imd r ρ mon p

vr ρ

:

Pr ρ

:

hP; ⊚Si

:

Pr ρ

:

S .r2 . r1 ˆ r21

imd

⊢mon e ⊢

S

⊢

S

imd

⊚ r

values

imd

⊢mon v

S

imd

f

q r2

f si.r2 . r0

f .r2 . r0

⊢mon hv;

value configs

⊚

ˆ

f r1

f . r2 . r0

he; si

ˆ rf1 . ρ0

si

f

r

⊢

lexical stores

⊢mon

rρ

⊢

S

f

f

f

s

f imd 6 ⊚ .r2 . r1 mon s

ˆ rf21

6⊚

s

⊚ r1

region stores

⊚ r

⊢

S

⊢

S

storables

S

r

f

S

r0 f

f r1 ˆ r2 ≬

:

s

≬ r

s

S

≬ r

:

S

:

∀

̺

t̺

:

T̺

imd ̺ mon f

f̺

:

F̺

d

f

6 ⊚ .r2 . r1 ˆ r21

:

r

imd

⊢

S

r0 f

imd r = r1 r0 mon d

⊢mon t

atomic traces

ˆ

f r21

f

⊚ r

:

f r1 ˆ r2 ≬

ˆ rf2

imd ≬ r = r1 r0 mon s

⊢

traces

f r2 . r1

f

r

f

f

⊚ r

r0

region trees

r

s .r2 . r1 ˆ r21

s

f r1 imd ≬ mon S

f

vr ρ

imd ⊚ r

⊚ r1

ˆ

f r1

f

hv; ⊚si

s .r2 . r1 ˆ r21

stores

nonlexical stores

f

r2

:

imd

Γr ρ

f

r2

hq; 6 ⊚s i

f r2

⊢mon he;

expressions

f r2

hQ; 6 ⊚Si

imd

expression configs

si

:

⊢mon q

S

6⊚

B

r

Figure 6.2.2. Monadic Intermediate Language Typing Judgments

imd

⊢mon v-glob-const

⊢

imd mon v-ret-run

imd

∅ ⊢mon v

ǫ

imd

⊚

g : TypeOf( g)

S ⊢mon v

⊚

S ⊢

imd ̺ ρ0 mon v

̺

v: P

v : Return P

⊢

imd mon v-ret-running

imd

⊚

S 1 ⊢mon v

⊚

≬

S1 { S0} ⊢

r1

imd r1 r0 mon v

v : P1 v : Return P1

≬

S 0 ( o) B0

imd

⊢mon v-addr-live

⊚

≬

S1 { S0} ⊢

imd r1 r0 mon v

imd

o : B0

⊢mon v-addr-dead

imd

∅ ⊢mon v

ǫ

o: ∅

Figure 6.2.3. Typing of Monadic Intermediate Language Values


imd

⊢

imd mon q

{∅}; ∅ ˆ 6 ⊚S ⊢mon e 6⊚

S ⊢

ˆ r2

ǫ

f imd r2 mon q

q : Id Q

q: Q

Figure 6.2.4. Typing of Monadic Intermediate Language Programs imd

⊚

S 0 ⊢mon v

imd

⊢mon d-ref

imd

⊚

S 0 ⊢mon d

r1 r0

r1 r0

v0 : P0

imd

⊢mon d-λ

href v0 i : Ref P0

ftv(B0 ) = X imd r1 r0 ∅; ⊚S 0 ⊢mon b hλx.e0 i : B0 imd

⊚

S 0 ⊢mon d

r1 r0

hλx.e0 i : ∀ X. B0

imd

⊢mon d-const

imd

⊚

S ⊢mon d

r1 r0

hci : TypeOf( c)

Figure 6.2.5. Typing of Monadic Language Storables ⊢ ⊢

imd⊚ r

⊚

S⊢

imd mon s

⊢

⊚

s

⊚

s:

f imd 6 ⊚ .r2 . r mon s

f imd .r2 . r mon s

ˆ

f r1

S ˆ rf1

6⊚

6⊚

s:

s ˆ 6 ⊚s :

⊚

S

⊢

⊚S

imd ≬ mon s

⊚

S0 ⊢

S ˆ 6 ⊚S

⊚

r

imd

0

∀

⊢mon d d0 :

imd ≬ r = r1 r0 mon s

B0

∅ {o 7→ d0 } : ∅ {o 7→

∀

B0 }

Figure 6.2.6. Typing of Monadic Language Stores and Region Stores imd ≬ r

⊚ imd ⊚

⊢mon

s

imd ⊚

-empty

⊢

imd ⊚ ǫ mon s

⊢mon

∅: ∅

s

-nonempty

S 1 {≬S 0 } ⊢mon s ≬s 0 : ≬S 0 imd ⊚ r1 ⊢mon s ⊚s 1 : ⊚S 1 imd ⊚ r = r1 r0

⊢mon

⊚

s 1 {≬s 0 } :

s

⊚

S 1 {≬S 0 }

Figure 6.2.7. Typing of Monadic Lexical Stores

f imd 6 ⊚ r2

∅ ⊢mon imd

⊢mon hq;

6⊚

S ⊢

6⊚

si imd

⊢mon hq;

6⊚

si

s

f imd r2 mon q

f r2

6⊚

imd

⊢mon s

6⊚

s:

S

q: Q

imd

⊢mon he; si

ˆ rf1

∅ {∅}; S ⊢ imd

hq; 6 ⊚s i : hQ; 6 ⊚Si

f .r2 . r1

⊢mon he;

s: S ˆ rf1

f imd .r2 . r1 mon e

f si.r2 . r1

ˆ

f r1

e: E

he; si : hE; Si

imd ⊚ r

imd

⊢mon hv;

⊢mon s ⊚s : ⊚S imd r ⊚ S ⊢mon v v : P

⊚

si imd

⊢mon hv;

⊚

r

si

hv; ⊚si : hP; ⊚Si

2.3. Type and Effect Soundness.

We now merely restate the type and effect soundness results, which are proven using by now familiar techniques.

imd ≬ r1 r0

⊚

S 1 {≬S 0 } ⊢mon

⊢

f imd ≬ mon s

⊚S

≬

s 0 : ≬S 0

s

f r1 r0 imd ≬S

ˆ rf2

r0

⊚

S1 ⊢

f r1 imd ≬ mon S

⊢

⊢

≬

s 0 ≬S 0 f f ≬s : ≬S 0 0

ˆ rf2

⊚S

1

{≬S

⊚

S1 ⊢

imd 6 ⊚ mon s -nonempty

ˆ rf2

f f ≬s : ≬S 0 0

≬ 1 { S 0 } ⊢mon

⊚

S 1 {≬S 0 } ⊢

0}

imd 6 ⊚ mon s -empty

f ǫ imd ≬S

∅ ⊢mon

⊢

f r1 r0 imd ≬S mon

f imd6 ⊚ .r2 . r1

s

f imd 6 ⊚ .r2 . r1 mon s

ˆ rf21

ˆ rf20

s

f

: ≬S

f f ≬s : ≬S 0 0

6⊚

6⊚

s1 :

ˆ rf21 . r0 ˆ rf20

f ≬s

f imd 6 ⊚ r2

∅ ⊢mon

f f ≬s: ≬S

S1 f

6⊚

f

6⊚

s 1 . ≬s 0 :

S 1 . ≬S 0

Figure 6.2.8. Typing of Monadic Nonlexical Stores ⊢

Γ( x) P

imd mon p -var

⊚

Γ; S ⊢

imd ̺ mon p

⊢

x: P

imd mon p -value

imd

⊚

S ⊢mon v ⊚

Γ; S ⊢

̺

v: P

imd ̺ mon p

v: P

Figure 6.2.9. Typing of Monadic Intermediate Language Pures ⊢

Γ; ⊚S ⊢ ≬ ′ T0

imd

TypeOf( c) B

imd mon b -const

⊢

f imd .r2 . f̺ 1 . f̺ mon b

hci : B

⊢

Γ1 {≬Γ 0 {x 7→ P0.1 }}; ⊚S ˆ .ǫ. ǫ ⊢mon e Γ1 {≬Γ 0 }; ⊚S ⊢

imd r ρ mon b

⊚

Γ; S ⊢

imd ̺1 ̺0 mon b

̺1 ̺0

p : P0

href pi : Ref P0

Γ {≬Γ {x1 7→ B}}; ⊚S ⊢mon b ≬

⊚

Γ { Γ }; S ⊢

imd ̺1 ̺0 mon b

.ǫ. r0

ˆ ǫ. ρ0

e0 : ≬T ′0 T1′ P0.2

hλx.e0 i : (P0.1 ⇒ ≬T 0 T1 P0.2 )

imd

imd

⊢mon b-µλ

Γ; ⊚S ⊢mon p

T1 ⊑ ≬T 0 T1 imd

imd mon b -λ

imd mon b-ref

̺1 ̺0

hλx2 .e0 i : B0

hµx1 . λx2 . e0 i : B0

Figure 6.2.10. Typing of Monadic Intermediate Language Prestorables

Theorem 6.2.1 (Type Soundness).

src

⊢mon q

r2

q: Q

r2

→

hq; ǫi ⇓ hv; ǫi

hq; ǫi ⇑ ∨ ∃ v.(

imd hq; 6 ⊚s ir2 mon

imd

∧ ⊢mon hq;

r2

6⊚

si

hv; ǫi : hQ; ǫi).

Theorem 6.2.2 (Effect Soundness).

⊢

he; si ∈ R t: T

imd r mon t


imd mon he;

r

si

ˆ r3

∧ ⊢

imd mon he;

r

si

ˆ r3

t he; si →∗ he′ ; s′ i

he; si : hT P; Si ∧

imd he; sir mon

ˆ r3

→

3. Translation

Figure 6.3.1 and Figure 6.3.2 present the dynamic translation for the enhanced intermediate language. It updates the source translation of Figure 5.3.30 with rules for configurations and stores (as did Figure 4.3.1). In taking into account our extended indexes, we obtain a cleaner formulation. As with the core language, the translation does not take advantage of the full generality of our tree-shaped stores and creates a linear one. The only new clause on terms is that for translating storable constants. There are, however, new clauses for translating the indexes. They serve to convert linear object indexes into a "forest" of trivial trees on the monadic side.

We now return to our final example with a translation of the sample object language derivation of Figure 6.1.10. Applying the translation of Figures 6.3.1 and 6.3.2 to the configuration

imd

Γ; ⊚S ⊢mon p f f imd .r2 . r1 ˆ r3 . ρ0 e

imd

⊢mon e-pure

returnr1 ρ p : St∅ Id P

Γ; ⊚S ˆ 6 ⊚S ⊢mon

imd

Γ {≬Γ }; S ⊢mon e

f .r2 . f̺ 1

6∀

e : T.1 P.1

Γ {≬Γ {x 7→ P.1 }}; S ⊢

imd

⊢mon e-letn

Γ {≬Γ }; S ⊢

f imd .r2 . f̺ 1 mon e

p: P

f imd .r2 . f̺ 1 mon e

e : T.2 P.2

let x = 6 ∀e in e : (T.1 ⊔ T.2 ) P.2

X = ftv(P.1 ) − ftv(Γ) imd

Γ {≬Γ}; S ⊢mon e

f .r2 . f̺ 1

∀

e : T.1 P.1 imd

⊢

Γ {≬Γ {x 7→ ∀ X. P.1 }}; S ⊢mon e

imd mon e-letg

Γ {≬Γ}; S ⊢

f imd .r2 . f̺ 1 mon e

imd

⊢

imd mon e-run

Γ1 {∅}; S1 ⊢mon e Γ1 ; S1 ⊢

⊢

f .r2 . f̺ 1 . ρ0

f imd .r2 . f̺ 1 mon e

imd

f .r2 . fr 1

f '(.r2 . fr 1 )

r1 ρ

Γ; ⊚S ⊢mon b f f imd .r2 . r1 ˆ r3 . ρ0 e

Γ; ⊚Sˆ 6 ⊚S ⊢mon

e0 : ≬T 0 T1 P0

running e0 : T1 P0 :=r1 r0 ∅ imd

imd

⊢mon e-alloc

e0 : ≬T 0 T1 P0

run e0 : T1 P0 :=̺1 ρ0 ∅

Γ1 {∅}; 'S1 ⊢mon e Γ1 ; S1 ⊢mon e

e : T.2 P.2

let x = ∀e in e : (T.1 ⊔ T.2 ) P.2

imd

imd mon e -running

f .r2 . f̺ 1

b: B

b : St{alloc} (St∅ Id) B

Figure 6.2.11. Typing of Monadic Intermediate Language Expressions I

r1 ρ

imd

Γ; ⊚S ⊢mon p f f imd .r2 . r1 ˆ r3 . ρ0 e

imd

⊢mon e-deref

Γ; ⊚S ˆ 6 ⊚S ⊢mon imd

⊢

imd mon e-set

Γ; ⊚S ⊢mon p

deref p : St{read} (St∅ Id) P

r1 ρ

imd

Γ; ⊚S ⊢mon p

r1 ρ

imd

Γ; ⊚S ⊢mon p

p2 : P1 imd

Γ; ⊚Sˆ 6 ⊚S ⊢mon e

f .r2 . r1

ˆ rf3 . ρ0

⊢

Γ {≬Γ 1 2 ≬Γ 0 }; S1 ⊢mon e Γ {≬Γ 1 } {≬Γ 0 }; S1 ⊢

r1 ρ

f .r2 . fr 1 . ρ1

f imd .r2 . fr 1 . ρ1 . ρ0 mon e

⊢

Γ {≬Γ 1 2 ≬Γ 0 }; ⊚S 1 ˆ 6 ⊚S 1 ⊢mon e Γ {≬Γ 1 } {≬Γ 0 }; ⊚S 1 {≬S}ˆ 6 ⊚S 1

f . ≬S

⊢

e1 : T1 P1

return e1 : St∅ T1 (Return P1 ) imd

imd mon e -ret-running

p1 : (P1 ⇒ ≬T T1 P2 )

p1 p2 : (≬T ⊔ St{exec} ) T1 P2 imd

imd mon e -ret-run

r1 ρ

p1 : Ref P Γ; ⊚S ⊢mon p p2 : P f f imd .r2 . r1 ˆ r3 . ρ0 Γ; ⊚Sˆ 6 ⊚S ⊢mon e set p1 to p2 : St{write} (St∅ Id) Unit imd

imd

⊢mon e-app

p : Ref P

f .r2 . fr 1

f imd .r2 . fr 1 . fr mon e

e1 : T1 P1

return e1 : St∅ T1 (Return P1 )

Figure 6.2.12. Typing of Monadic Intermediate Language Expressions II

hfreeregion r1 after @ r1 hr1 , o2 i @ r1 hr1 , o1 i hr1 , o4 i; ; @ r1 deref (@ r1 hr1 , o1 i @ r1 h1i) ∅ {r1 7→ ∅ {o1 7→ hλx2 .letregion ρ2 i} in @ r1 hr1 , o1 i (@ ρ2 @ ρ2 hadd1@r1 i x2 ) {o2 7→ hsub1@r1 i} {o3 7→ h1i} {o4 7→ h2i}

}i

yields h running ; let x3 = return o2 ; in let x4 = let x5 = return o1 in let x6 = return o4 in x5 x6 in x3 x4 let x7 = let x8 = return o1 in let x9 = h1i in x8 x9 in deref x7 ∅ ˆ . ∅ {o1 7→ hλx2 .run let x10 = return return o1 in let x11 = let x12 = hadd1i in let x13 = return return x2 in x12 x13 in return x10 x11 {o2 7→ hsub1i} {o3 7→ h1i} {o4 7→ h2i} 348

i}

i


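Configurations and Traces

\[
\begin{array}{rcl}
[\![\langle q; s\rangle]\!]^{\mathrm{imd}\;r_3\;N}_{\mathrm{obj}\,\langle q;s\rangle} & = & \langle [\![q]\!]^{\mathrm{imd}\;r_3\;N}_{\mathrm{obj}\,q};\ \pi_{\not\circledcirc}[\![s]\!]^{\mathrm{imd}\;\hat{}\,r_3\;N}_{\mathrm{obj}\,s}\rangle \\[2pt]
[\![\langle e; s\rangle]\!]^{\mathrm{imd}\;r\,\hat{}\,r_3\;N}_{\mathrm{obj}\,\langle e;s\rangle} & = & \langle [\![e]\!]^{\mathrm{imd}\;r\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e};\ [\![s]\!]^{\mathrm{imd}\;r\,\hat{}\,r_3\;N}_{\mathrm{obj}\,s}\rangle \\[2pt]
[\![\langle v; s\rangle]\!]^{\mathrm{imd}\;r\,\hat{}\,r_3\;N}_{\mathrm{obj}\,\langle v;s\rangle} & = & \langle [\![v]\!]^{\mathrm{imd}\;r\;N}_{\mathrm{obj}\,v};\ \pi_{\circledcirc}[\![s]\!]^{\mathrm{imd}\;r\,\hat{}\,r_3\;N}_{\mathrm{obj}\,s}\rangle \\[2pt]
[\![t]\!]^{\mathrm{imd}\;r\;N}_{\mathrm{obj}\,t} & = & [\,\mathrm{return}_{r_2}\ (\iota\,@\,o) \mid r = r_1 r_0 r_2 \wedge (\iota\,@\,\langle r_0, o\rangle) \in t\,]
\end{array}
\]

Programs and Expressions

\[
\begin{array}{rcl}
[\![q]\!]^{\mathrm{imd}\;r_3\;N}_{\mathrm{obj}\,q} & = & [\![q]\!]^{\mathrm{imd}\;\epsilon\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} \\[2pt]
[\![p]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{return}_{\varrho}\ [\![p]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,p} \\[2pt]
[\![\mathrm{let}\ x = e_1\ \mathrm{in}\ e_2]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{let}\ x = [\![e_1]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ [\![e_2]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} \\[2pt]
[\![@\varrho_0\,\langle \mathrm{ref}\ e\rangle]\!]^{\mathrm{imd}\;\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{let}\ x = [\![e]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{return}_{\varrho_2}\ \langle \mathrm{ref}\ x\rangle \\[2pt]
[\![@\varrho_0\,\mathrm{deref}\ e]\!]^{\mathrm{imd}\;\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{let}\ x = [\![e]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{return}_{\varrho_2}\ (\mathrm{deref}\ x) \\[2pt]
[\![@\varrho_0\,\mathrm{set}\ e_{.1}\ \mathrm{to}\ e_{.2}]\!]^{\mathrm{imd}\;\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{let}\ x_{.1} = [\![e_{.1}]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{let}\ x_{.2} = [\![e_{.2}]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{return}_{\varrho_2}\ (\mathrm{set}\ x_{.1}\ \mathrm{to}\ x_{.2}) \\[2pt]
[\![@\varrho_0\,\langle \lambda x.e_0\rangle]\!]^{\mathrm{imd}\;\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{return}_{\varrho_2}\ \langle \lambda x.[\![e_0]\!]^{\mathrm{imd}\;\varrho_1\varrho_0\,\hat{}\,\epsilon\;N}_{\mathrm{obj}\,e}\rangle \\[2pt]
[\![@\varrho_0\,e_{.1}\ e_{.2}]\!]^{\mathrm{imd}\;\varrho = \varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{let}\ x_{.1} = [\![e_{.1}]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{let}\ x_{.2} = [\![e_{.2}]\!]^{\mathrm{imd}\;\varrho\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}\ \mathrm{in}\ \mathrm{return}_{\varrho_2}\ (x_{.1}\ x_{.2}) \\[2pt]
[\![@\varrho_0\,\langle c\rangle]\!]^{\mathrm{imd}\;\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{return}_{\varrho_2}\ \langle c\rangle \\[2pt]
[\![@\varrho_0\,\langle \mu x_1@\varrho_0.\,\lambda x_2.\,e\rangle]\!]^{\mathrm{imd}\;\varrho_1\varrho_0\varrho_2\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{return}_{\varrho_2}\ \langle \mu x_1.\,\lambda x_2.\,[\![e]\!]^{\mathrm{imd}\;\varrho_1\varrho_0\,\hat{}\,\epsilon\;N}_{\mathrm{obj}\,e}\rangle \\[2pt]
[\![\mathrm{letregion}\ \rho_0\ \mathrm{in}\ e_0]\!]^{\mathrm{imd}\;\varrho_1\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{run}\ [\![e_0]\!]^{\mathrm{imd}\;\varrho_1\rho_0\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e} \\[2pt]
[\![\mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0]\!]^{\mathrm{imd}\;r_1\,\hat{}\,r_0 r_3\;N}_{\mathrm{obj}\,e} & = & \mathrm{running}\ [\![e_0]\!]^{\mathrm{imd}\;r_1 r_0\,\hat{}\,r_3\;N}_{\mathrm{obj}\,e}
\end{array}
\]

Figure 6.3.1. Translating Object Intermediate Language to Monadic (Dynamic) I

We can now complete the example. Applying the translation of Figure 6.3.3 to the configuration type

⟨∅; ∅ {r1 ↦ ∅ {o1 ↦ ∀ X1 . Int @ r1 {o2 ↦ Int @ r1 {o3 ↦ Int} {o4 ↦ Int}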

{ (alloc @ r1 ), } (exec @ r1 ), (read @ r1 )

⇒

{ (alloc @ r1 ), } (read @ r1 )

⇒

X1 @ r1 }

}i

Int @ r1 }

yields


Stores and Storables imd r obj s

ˆ r3 N

[s ] imd ǫ [∅ ] obj s N

= = imd

[s1 {r0 7→ ≬s 0 } ] obj s

r1 r0

imd

N

= [s1 ] obj s

imd ≬ r1 r0 N obj s

[∅ ]

= imd ≬ r1 r0 N obj s

≬

imd

&r3 ([[s ] obj s ∅ r1

N

r r3

imd ≬ r1 r0

{[[≬s 0 ] obj

imd ≬ r1 r0

imd r1 r0 N obj d

[href v0 i]

=

imd r1 r0 N obj d

[hλx.e0 i]

imd r1 r0 N obj d

[hci]

ˆ .ǫ. ǫ. ǫ) s

N

}

∅ s

= [≬s 0 ] obj

[ s 0 {o 7→ d0 } ]

N

N

imd

{o 7→ [d0 ] obj d

imd r1 r0 N obj v

href [v0 ]

imd r1 r0 N obj e

=

hλx.[[e0 ]

=

hci

r1 r0

N

}

i i

Pures and Values imd ̺ obj p N

[x ]

imd ̺ obj p N

[v ]

imd

[g ] obj v

̺

N

=

x

imd

imd ̺ obj v N

= [v ] =

[hr0 , oi] obj v

r1 r0 ̺2

N

imd ̺ obj v N

[h∅, oi]

g

=

o

=

o

Region Indicators imd obj ρN

[ρ ]

imd obj rN

[r ]

imd obj ̺N

[̺ ]

imd [r ρˆ ] obj ̺ ˆ rN imd [ˆ r3 ] obj ̺ ˆ rN imd [r r ˆ r ] obj ̺ ˆ rN

= ρ = r imd

= [̺ ] obj ̺N

1 0

3

imd

imd

= =

.ǫ. [r ] obj ̺N ˆ ǫ. [ρ ] obj ̺N imd .[[r3 ] obj ̺N

=

.ǫ. [r1 ] obj ̺N ˆ ǫ. [r0 ] obj ̺N ˆ [r3 ] obj ̺N

imd

imd

imd

Figure 6.3.2. Translating Object Intermediate Language to Monadic (Dynamic) II

h∅; { alloc, } exec, read

∅ ˆ . ∅ {o1 7→ ∀ X1 . Int ⇒ { alloc, } read

{o2 7→ Int ⇒ {o3 7→ Int} {o4 7→ Int}

X1 }

i

Int}
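The expression clauses of Figure 6.3.1 follow one pattern: each subexpression is bound with a let, and the operation itself is wrapped in one return form for every region lying above the region at which it acts. The following sketch illustrates that pattern for a fragment of the object language. It is an assumption-laden illustration: the tuple encoding of terms, the function names `returns` and `translate`, and the generated variable names `y1, y2, ...` are hypothetical, the nonlexical part of the index is ignored, region names are assumed distinct, and only pures, constants, deref, application and letregion are covered.

```python
import itertools
from typing import Any, Iterator, List, Optional


def returns(n: int, body: Any) -> Any:
    """Wrap body in n nested return forms."""
    for _ in range(n):
        body = ("return", body)
    return body


def translate(e: tuple, regions: List[str],
              fresh: Optional[Iterator[str]] = None) -> Any:
    """Translate an object term; regions lists the lexically visible
    regions, outermost first (the lexical index)."""
    if fresh is None:
        fresh = (f"y{i}" for i in itertools.count(1))
    tag = e[0]
    if tag == "pure":
        # A pure gets one return per region in the index.
        return returns(len(regions), e[1])
    if tag == "letregion":
        _, rho, body = e
        return ("run", translate(body, regions + [rho], fresh))
    _, rho0, *args = e
    # Number of regions above rho0 (the length of the suffix after it).
    above = len(regions) - 1 - regions.index(rho0)
    if tag == "const":
        return returns(above, ("const", args[0]))
    if tag == "deref":
        x = next(fresh)
        return ("let", x, translate(args[0], regions, fresh),
                returns(above, ("deref", x)))
    if tag == "app":
        x1, x2 = next(fresh), next(fresh)
        return ("let", x1, translate(args[0], regions, fresh),
                ("let", x2, translate(args[1], regions, fresh),
                 returns(above, ("app", x1, x2))))
    raise ValueError(f"unsupported form: {tag}")
```

Applied to the function body of the running example, `letregion ρ2 in @ r1 o1 (@ ρ2 (@ ρ2 add1) x2)` in the presence of region r1, the sketch produces a run form whose operands are bound by lets, with `return return o1` and `return return x2` for the pures and a single return around the outer application, which agrees in shape with the translated store entry for o1 shown earlier.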

Our statement that the translation preserves types is similar to that of the core language, but incorporates our extended indexes. It also incorporates a requirement that the types of constants correspond.

Theorem 6.3.1 (Types Preservation). If the types of constants in the object and monadic languages correspond, i.e., $\forall c.\ \llbracket \mathrm{TypeOfObj}\ c \rrbracket^{\mathrm{imd}\,\forall}_{\mathrm{obj}\,B} = \mathrm{TypeOfMon}\ c$, then

[Figure: translation rules for Environments, Configuration Types, Expression and Trace Types, Pure Types and Schemes, Storable Types and Schemes, and Store Types. In particular, a trace type $T$ at region index $\varrho_1 \varrho_0$ translates to $\mathrm{St}_{\{\iota \mid (\iota\,@\,\varrho_0) \in T\}}$ applied to the translation of $T - \varrho_0$ at index $\varrho_1$, and at the empty index $\epsilon$ it translates to $\mathrm{Id}$.]

Figure 6.3.3. Translating Object Intermediate Language to Monadic (Static)


each object-language typing judgment is carried to the corresponding monadic judgment on the translated subject and the translated type. Each turnstile and translation bracket carries its region index, which is not repeated here:

$\vdash_{\mathrm{obj}} \langle q; s\rangle : \langle Q; S\rangle \ \rightarrow\ \vdash_{\mathrm{mon}} \llbracket\langle q; s\rangle\rrbracket : \llbracket\langle Q; S\rangle\rrbracket$
$\vdash_{\mathrm{obj}} \langle e; s\rangle : \langle E; S\rangle \ \rightarrow\ \vdash_{\mathrm{mon}} \llbracket\langle e; s\rangle\rrbracket : \llbracket\langle E; S\rangle\rrbracket$
$\vdash_{\mathrm{obj}} \langle v; s\rangle : \langle P; S\rangle \ \rightarrow\ \vdash_{\mathrm{mon}} \llbracket\langle v; s\rangle\rrbracket : \llbracket\langle P; S\rangle\rrbracket$
$\vdash_{\mathrm{obj}} s : S \ \rightarrow\ \vdash_{\mathrm{mon}} \llbracket s\rrbracket : \llbracket S\rrbracket$
$\vdash_{\mathrm{obj}} \between\!s : \between\!S \ \rightarrow\ \vdash_{\mathrm{mon}} \llbracket \between\!s\rrbracket : \llbracket \between\!S\rrbracket$
$S \vdash_{\mathrm{obj}} d : B \ \rightarrow\ \llbracket S\rrbracket \vdash_{\mathrm{mon}} \llbracket d\rrbracket : \llbracket B\rrbracket$
$\Gamma; S \vdash_{\mathrm{obj}} q : Q \ \rightarrow\ \llbracket\Gamma\rrbracket; \llbracket S\rrbracket \vdash_{\mathrm{mon}} \llbracket q\rrbracket : \llbracket Q\rrbracket$
$\Gamma; S \vdash_{\mathrm{obj}} e : E \ \rightarrow\ \llbracket\Gamma\rrbracket; \llbracket S\rrbracket \vdash_{\mathrm{mon}} \llbracket e\rrbracket : \llbracket E\rrbracket$
$\Gamma; S \vdash_{\mathrm{obj}} v : P \ \rightarrow\ \llbracket\Gamma\rrbracket; \llbracket S\rrbracket \vdash_{\mathrm{mon}} \llbracket v\rrbracket : \llbracket P\rrbracket$

The translation again also preserves semantics. We now require that the semantics of constants correspond.

Theorem 6.3.2 (Semantics Preservation). If the semantics of constants of the object and monadic languages correspond, i.e.,

$\forall c.\ \delta(c)(\llbracket\langle v; s\rangle\rrbracket^{r\,\hat r_3}) = \llbracket\delta(c)(\langle v; s\rangle)\rrbracket^{r\,\hat r_3} \ \wedge\ \delta'(c)(\llbracket\langle v; s\rangle\rrbracket^{r\,\hat r_3}) = \llbracket\delta'(c)(\langle v; s\rangle)\rrbracket^{r}_{t}$,

then, for a configuration $\langle q; s\rangle$ in the object-language reduction domain with $\langle q; s\rangle \rightarrow^{*} \langle q'; s'\rangle$, monadic evaluation of $\llbracket\langle q; s\rangle\rrbracket$ equals monadic evaluation of $\llbracket\langle q'; s'\rangle\rrbracket$; and likewise, for a configuration $\langle e; s\rangle$ in the reduction domain with $\langle e; s\rangle \overset{t}{\rightarrow^{*}} \langle e'; s'\rangle$, monadic evaluation of $\llbracket\langle e; s\rangle\rrbracket^{r\,\hat r_3}$ equals monadic evaluation of $\llbracket\langle e'; s'\rangle\rrbracket^{r\,\hat r_3}$.

The proof is structured similarly to that of the core language. We focus on the lemma requiring that reduction translate as evaluation.

Lemma 6.3.1 (Reduction Translates as Evaluation). From $\langle e; s\rangle \in R^{r\,\hat r_3}_{\mathrm{obj}\,\langle e; s\rangle}$ and $\langle e; s\rangle \overset{t}{\rightharpoonup} \langle e'; s'\rangle$ it follows that $\llbracket\langle e; s\rangle\rrbracket^{r\,\hat r_3} \overset{t}{\rightarrow^{*}} \llbracket\langle e'; s'\rangle\rrbracket^{r\,\hat r_3}$ in the monadic language.

As with the core languages, translation respects the substitution of values.

Proposition 6.3.1 (Translation Respects Substitution of Values). $\llbracket e\rrbracket^{r\,\hat r_3}\,[x := \llbracket v\rrbracket^{r}] = \llbracket e\,[x := v]\rrbracket^{r\,\hat r_3}$

We extend the proposition that translation respects the deallocation of regions to expressions, and require := ∅ on the monadic side.

Proposition 6.3.2 (Translation Respects Region Deallocation). $(\llbracket e\rrbracket^{r_1 r_0 r_2} := \emptyset) = \llbracket e\,[r_0 := \emptyset]\rrbracket^{r_1 r_2}$

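Proof: Lemma 6.3.1.

Case $\rightharpoonup_{\mathrm{obj}}$-app-const: $\langle a\ v; s\rangle \overset{[(\mathrm{exec}\,@\,a)] + \delta'(c)(\langle v; s\rangle)}{\rightharpoonup} \delta(c)(\langle v; s\rangle)$, where $a = \langle r_0, o\rangle$ and $s(a) = \langle c\rangle$. The proof relies on the assumption regarding δ.

Case $\rightharpoonup_{\mathrm{obj}}$-µλ: $\langle @\,r_0\,\langle \mu x_1 @ r_0.\ \lambda x_2.\ e_0\rangle; s\rangle \overset{[(\mathrm{alloc}\,@\,\langle r_0, o\rangle)]}{\rightharpoonup} \langle\langle r_0, o\rangle; s'\rangle$, where $a = \langle r_0, o\rangle$ and $s' = s\{a \mapsto \langle \lambda x_2.\ e_0[x_1 := a]\rangle\}$ for $o \notin \mathrm{Dom}(s(r_0))$. The translation of the redex is $\llbracket\langle @\,r_0\,\langle \mu x_1 @ r_0.\ \lambda x_2.\ e_0\rangle; s\rangle\rrbracket^{r\,\hat r_3} = \langle \llbracket @\,r_0\,\langle \mu x_1 @ r_0.\ \lambda x_2.\ e_0\rangle\rrbracket^{r};\ \llbracket s\rrbracket^{r\,\hat r_3}\rangle$. The store is $s = \emptyset\{r_1 \mapsto \between\!s_1\}\{r_0 \mapsto \between\!s_0\}\{r_2 \mapsto \between\!s_2\}$ (by the proof of Lemma 6.1.1), and thus $\llbracket s\rrbracket^{r} = \emptyset\{\llbracket\between\!s_1\rrbracket^{r_1}\}\{\llbracket\between\!s_0\rrbracket^{r_1 r_0}\}\{\llbracket\between\!s_2\rrbracket^{r_1 r_0 r_2}\}\ \hat{.}\,\epsilon.\,\epsilon.\,\epsilon.\,\epsilon$. The expression translates as $\llbracket @\,r_0\,\langle \mu x_1 @ r_0.\ \lambda x_2.\ e_0\rangle\rrbracket^{r} = \mathrm{return}^{\llbracket r_2\rrbracket}\,\langle \mu x_1 @ r_0.\ \lambda x_2.\ \llbracket e_0\rrbracket^{r_1 r_0}\rangle$. Applying $\rightharpoonup_{\mathrm{mon}}$-µλ within the evaluation context $\langle \mathrm{return}^{\llbracket r_2\rrbracket}\,[e];\ [\circledcirc s]\{\llbracket\between\!s_2\rrbracket^{r_1 r_0 r_2}\}\ \hat{\ }\,[\not\circledcirc s].\,\epsilon\rangle$ yields $\langle \mathrm{return}^{\llbracket r_2\rrbracket}\,o;\ \emptyset\{\llbracket\between\!s_1\rrbracket^{r_1}\}\{\llbracket\between\!s_0\rrbracket^{r_1 r_0}\{o \mapsto \langle \lambda x_2.\ \llbracket e_0\rrbracket^{r_1 r_0}[x_1 := o]\rangle\}\}\{\llbracket\between\!s_2\rrbracket^{r_1 r_0 r_2}\}\ \hat{.}\,\epsilon.\,\epsilon.\,\epsilon.\,\epsilon\rangle$. By Proposition 6.3.1, this equals the translation of the object language residual, and the trace $[\mathrm{return}^{\llbracket r_2\rrbracket}\,(\mathrm{alloc}\,@\,o)]$ is the translation of the object language trace.

Case $\rightharpoonup_{\mathrm{obj}}$-freeregion: $\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0;\ s_1\{r_0 \mapsto \between\!s_0\}\,s_2\rangle \overset{[\,]}{\rightharpoonup} \langle e_0[r_0 := \emptyset];\ s_1\,s_2[r_0 := \emptyset]\rangle$, with $\langle e_0; s_2\rangle$ not actively mentioning $r_0$. The nonlexical store on the monadic side is still linear, but it is now nontrivial due to the eager deallocation of regions. Translate $\llbracket\langle \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0;\ s_1\{r_0 \mapsto \between\!s_0\}\,s_2\rangle\rrbracket^{r_1\,\widehat{r_0 r_2}} = \langle \llbracket \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0\rrbracket^{r_1\,\widehat{r_0 r_2}};\ \llbracket s_1\{r_0 \mapsto \between\!s_0\}\,s_2\rrbracket^{r_1\,\widehat{r_0 r_2}}\rangle$, where $\llbracket \mathrm{freeregion}\ r_0\ \mathrm{after}\ e_0\rrbracket^{r_1\,\widehat{r_0 r_2}} = \mathrm{running}\ \llbracket e_0\rrbracket^{r_1 r_0\,\hat r_2}$. The store $s_1 = \emptyset\{r_1 \mapsto \between\!s_1\}$, and the translation of the store is $\llbracket s_1\{r_0 \mapsto \between\!s_0\}\,s_2\rrbracket^{r_1\,\widehat{r_0 r_2}} = \&_{r_0 r_2}(\emptyset\{\llbracket\between\!s_1\rrbracket^{r_1}\}\{\llbracket\between\!s_0\rrbracket^{r_1 r_0}\}\{\llbracket\between\!s_2\rrbracket^{r_2}\}\ \hat{.}\,\epsilon.\,\epsilon.\,\epsilon.\,\epsilon) = \emptyset\{\llbracket\between\!s_1\rrbracket^{r_1}\}\ \hat{.}\,\epsilon.\,\llbracket\between\!s_0\rrbracket^{r_1 r_0}.\,\llbracket\between\!s_2\rrbracket^{r_1 r_0 r_2}$. Because the property of not actively mentioning a region is preserved by translation, we can apply $\rightharpoonup_{\mathrm{mon}}$-running to yield $\langle \llbracket e_0\rrbracket^{r_1 r_0\,\hat r_2} := \emptyset;\ \emptyset\{\llbracket\between\!s_1\rrbracket^{r_1}\}\ \hat{.}\,\epsilon.\,\llbracket\between\!s_2\rrbracket^{r_1 r_0 r_2}\rangle$ (by Proposition 6.3.2 and the definition of &, the translation of the object language residual) with trace $[\,]$ (the translation of the object language trace).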

This completes our systematic investigation of enhancements, and in fact, of our systematic investigation of the topic at hand. In the final part that follows, the reader will enjoy a faster-paced, if more ad hoc, treatment of topics on the periphery of this study.


Conclusion and Future Work

Allowing Actions on Deallocated Regions

We have related effect systems to monadic ones while handling encapsulation in a manner supporting dependencies between computations in different regions, and have demonstrated some enhancements, but much remains to be done. In this final part, we more informally touch on some of these areas that we believe to be ripe for exploration. We first approach additional enhancements to our object language. Allowing actions on deallocated regions and supporting subtyping are enhancements with the potential of adding a great deal of expressibility to our languages. Expressibility could be further enhanced with explicit region-polymorphic functions. We briefly explore the implications of several ways of including them. A modification that we have already mentioned is an unrestricted store. Certainly, we could extend our approach to other effects beyond state. Finally, we take advantage of a convenient junction between incremental implementation and monadic reflection.

1. Allowing Actions on Deallocated Regions

Where is it okay to perform actions on deallocated regions? One such place is in the body of functions that will not be called. How can we statically guarantee this? Of course, we must be conservative. One possible approach is to first allow actions on ∅ from within the body of any function. This can be accomplished by placing a marker in the environment when entering function bodies. Then, we could add an antecedent to the application rule of the form ∅ ∈ T0 → ∅ ∈ Dom(Γ), i.e., an application is valid unless the body of the function may perform an action on ∅ and we are not within the body of a function. Functions deallocating dangling pointers (directly or via other functions) may continually call each other, but once any of them is called from the top level, the typing will fail. We would then require a lemma for subject reduction that allows us to drop ∅ from the static environment when it does not appear in the effect unless it appears elsewhere in the static environment.

Lemma C.1.1 (Drop Dead Region). $\Gamma\{\emptyset\}; S \vdash_{\mathrm{obj}\,e} e : P\ !\ T \ \wedge\ (\emptyset \in T \rightarrow \emptyset \in \mathrm{Dom}(\Gamma)) \ \rightarrow\ \Gamma; S \vdash_{\mathrm{obj}\,e} e : P\ !\ T$

Allowing actions on deallocated regions in the body of functions that will not be called is necessary to take advantage of eager deallocation of regions, at least in the absence of some way of deallocating such functions. Otherwise, once a function that acts on a region is placed in the store at some higher region, the lower region cannot be deallocated before the upper one. The greatest difficulty in applying this to a monadic system would be the most obvious one, i.e., how to represent operations on dangling pointers in the first place.
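As a small illustration of the proposed antecedent and of the hypothesis of Lemma C.1.1, the checks can be sketched in Python. The representation is an assumption made only for this sketch: an effect is a set of (action, region) pairs, the environment's domain is a collection of region names, and the string "empty" stands in for ∅. The function names are hypothetical.

```python
DEAD = "empty"  # hypothetical stand-in for the deallocated-region indicator ∅


def acts_on_dead(effect):
    """True when some action in the effect is located at the dead region."""
    return any(region == DEAD for _action, region in effect)


def application_allowed(latent_effect, env_regions):
    """Proposed antecedent: (∅ in T0) implies (∅ in Dom(Γ)).

    env_regions holds the region indicators recorded in Γ; the DEAD marker is
    present exactly when we are inside some function body.
    """
    return (not acts_on_dead(latent_effect)) or (DEAD in env_regions)


def can_drop_dead_marker(rest_env_regions, effect):
    """Hypothesis of Lemma C.1.1 for dropping one ∅ marker from Γ{∅}.

    rest_env_regions is Dom(Γ) without the marker being dropped.
    """
    return (not acts_on_dead(effect)) or (DEAD in rest_env_regions)
```

For example, an application whose latent effect contains ("read", "empty") is rejected at top level, where the environment holds only live regions, and accepted inside a function body, where the marker is present.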

2. Subtyping

We could allow subtyping to be induced from an (inverted) flat ordering of regions, with ρ ≤ ∅. That would allow us to write functions whose parameter types reference ∅. The implication would be that these functions would not dereference those pointers, so any live pointers could safely be provided. This in fact provides a weak form of region polymorphism, and could allow us to deallocate memory sooner or to reuse certain functions. With this enhancement, we could write programs such as the following:

    letregion ρ1 in
      let x1 = @ ρ1 ⟨λx2. unit⟩ in
        x1 letregion ρ2 in @ ρ2 ⟨ref unit⟩ ;
        x1 @ ρ1 ⟨ref unit⟩

Here, the function x1 is applied first to a dangling pointer and then to a live one.

We could also allow subtyping to be induced from a subset relation on effects, as in other work [3, 16]. This would allow us to provide a more powerful form of effect polymorphism. Our previous form of effect polymorphism from Section 5.1 allowed a function to be typed with a greater latent effect than would be obtained by analyzing its body. The relative advantage of this approach would be that a function would now be free to maintain its stronger type (with lesser latent effect). Various references to the function could use a weaker type where necessary, for example, to share a reference cell with another function of weaker type, or the strong type where necessary, for example, to be passed to an external function that requires a guarantee that some effect will not occur. The subtyping rules are presented in Figure C.2.1. To preserve type safety, subtyping of reference cells must be invariant in the pure type of the contents.

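≤-ρ-reflex: $\rho \le \rho$
≤-ρ-∅: $\rho \le \emptyset$
≤-P-Unit: $\mathrm{Unit} \le \mathrm{Unit}$
≤-P-@: if $B_1 \le B_2$ and $\rho_1 \le \rho_2$, then $B_1\,@\,\rho_1 \le B_2\,@\,\rho_2$
≤-B-C: $C \le C$
≤-B-X: $X \le X$
≤-B-Ref: $\mathrm{Ref}\ P \le \mathrm{Ref}\ P$
≤-B-fun: if $P_{21} \le P_{11}$, $\varepsilon_1 \le \varepsilon_2$, and $P_{12} \le P_{22}$, then $P_{11} \overset{\varepsilon_1}{\Rightarrow} P_{12} \le P_{21} \overset{\varepsilon_2}{\Rightarrow} P_{22}$
≤-ε: if $\forall (\iota\,@\,\rho_1) \in \varepsilon_1.\ \exists \rho_2.\ (\iota\,@\,\rho_2) \in \varepsilon_2 \wedge \rho_1 \le \rho_2$, then $\varepsilon_1 \le \varepsilon_2$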

Figure C.2.1. Object Language SubTyping
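The rules of Figure C.2.1 are syntax-directed, so they can be read as a recursive check. The following Python sketch is one such reading, not part of the formal development. Its encoding is assumed for illustration: types are tagged tuples, effects are sets of (action, region) pairs, and the string "empty" stands in for ∅.

```python
DEAD = "empty"  # hypothetical stand-in for ∅


def region_le(r1, r2):
    """Rules ≤-ρ-reflex and ≤-ρ-∅: every region is below itself and below ∅."""
    return r1 == r2 or r2 == DEAD


def effect_le(e1, e2):
    """Rule ≤-ε: each action in e1 is matched in e2 at a region that is at least as high."""
    return all(
        any(a1 == a2 and region_le(r1, r2) for a2, r2 in e2)
        for a1, r1 in e1
    )


def subtype(t1, t2):
    """Types: ("unit",), ("const", c), ("var", x), ("ref", P),
    ("at", B, region), ("fun", P1, effect, P2)."""
    if t1[0] != t2[0]:
        return False
    tag = t1[0]
    if tag in ("unit", "const", "var", "ref"):
        # Reflexive only; in particular Ref is invariant in its contents.
        return t1 == t2
    if tag == "at":
        return subtype(t1[1], t2[1]) and region_le(t1[2], t2[2])
    if tag == "fun":
        # Contravariant in the parameter, covariant in effect and result.
        return (subtype(t2[1], t1[1])
                and effect_le(t1[2], t2[2])
                and subtype(t1[3], t2[3]))
    return False
```

Under this encoding a live pointer type such as ("at", ("ref", ("unit",)), "r1") is a subtype of the same pointer type located at "empty", which is what lets a function with a parameter type referencing ∅ accept both dangling and live pointers.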

3. Region Polymorphism

3.1. Local Region Polymorphism. Our languages up to this point have been generally region-monomorphic. One slight exception has been that since located types mention region indicators and storable types incorporate pure types, our storable type polymorphism provided a very coarse region polymorphism. Another exception has been our assumption of constant functions specific to particular regions. Since under our static analysis of region monomorphism the region index is extended with region variables and region names only upon processing ⊢obj e-letregion and ⊢obj e-freeregion, we can determine the static level of any region variable or region name, i.e., we can determine where it originated in the nesting of letregion constructs. This property is specific to region monomorphism.

We will briefly consider two ways of providing region polymorphism. The first is to force region-polymorphic functions to be defined over a fixed scope. The second is to allow region-polymorphic functions to be passed as arguments. In both cases the region-polymorphism is explicit, by contrast with the implicit type-polymorphism presented previously. Helsen and Thiemann [22] present a region-polymorphic object language without an explicit store.

The let ⟨µx@ϱ. λ[ρ]x. e⟩ in e binding construct replaces let x = ⟨µx@ϱ0. λx. e⟩ in e and is explicitly region-polymorphic, with a sequence, ρ, of region formal parameters. The only way a region-polymorphic function can be used is to be applied to region indicator arguments. This can be achieved with alternate syntactic classes of program variables, offsets, and addresses for referring to region-polymorphic functions. Only the applications of these to region indicators belong to the prestorable class b. They belong to b rather than to the pure class p because we will allocate a region-monomorphic function for each such application. There will be a storable in the class d for nonrecursive region-polymorphic functions, but these must not be allocated directly.

To get a feel for programming with local region-polymorphic functions, consider the program in Figure C.3.1, which computes the Fibonacci sequence, revised from Tofte and Talpin [60] (see footnote 3). The recursive fib routine takes two region arguments and one value argument. The first region argument specifies where to find the input value; the second specifies where to put the output result. We assume that the final output is expected in pre-declared region ρ1. Constant functions are similarly parameterized. Regions declared within the body of fib provide localized storage.

Footnote 3: We borrow some of their syntactic sugar, such as multiple region variables per letregion declaration and infix operators.

    letregion ρ2 in
    let ⟨µ fib @ ρ2. λ[ρ3 ρ4] x.
          if zero?[ρ3](x) or one?[ρ3](x)
          then @ ρ4 ⟨1⟩
          else letregion ρ5, ρ6 in
                 letregion ρ7, ρ8 in
                   fib[ρ8 ρ5] @ ρ7 letregion ρ9 in x -[ρ3 ρ9 ρ8] (@ ρ9 ⟨2⟩)
                 +[ρ5 ρ6 ρ4]
                 letregion ρ10, ρ11 in
                   fib[ρ11 ρ6] @ ρ10 letregion ρ12 in x -[ρ3 ρ12 ρ11] (@ ρ12 ⟨1⟩)⟩
    in letregion ρ12, ρ13 in
         @ ρ12 fib[ρ13 ρ1] @ ρ13 ⟨15⟩

Figure C.3.1. Fibonacci Example: Local Region-polymorphic Functions
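The region discipline of the Fibonacci program in Figure C.3.1 can be mimicked in Python as a rough operational sketch. Everything here is an assumption made for illustration: the `Region` class, the `letregion` context manager, and the `arith` helper are hypothetical names; an address is modeled as a (region, offset) pair; and the allocation of a region-monomorphic instance of fib at ρ7, ρ10, and ρ12 is reduced to storing a label. The sketch follows the figure's convention that fib of 0 and of 1 are both 1.

```python
import operator
from contextlib import contextmanager


class Region:
    """A region is a growable block of cells that can be deallocated as a whole."""

    def __init__(self, name):
        self.name, self.cells, self.live = name, [], True

    def alloc(self, value):
        assert self.live, f"alloc in dead region {self.name}"
        self.cells.append(value)
        return (self, len(self.cells) - 1)  # an address: (region, offset)


def read(addr):
    region, offset = addr
    assert region.live, f"read from dead region {region.name}"
    return region.cells[offset]


@contextmanager
def letregion(*names):
    """Declare fresh regions and deallocate them when the block is left."""
    regions = [Region(n) for n in names]
    try:
        yield regions[0] if len(regions) == 1 else regions
    finally:
        for region in regions:
            region.live = False
            region.cells.clear()


def arith(op, r_left, r_right, r_out, left, right):
    """Region-parameterized constant function, e.g. -[r3 r9 r8] or +[r5 r6 r4]."""
    assert left[0] is r_left and right[0] is r_right
    return r_out.alloc(op(read(left), read(right)))


def fib(r_in, r_out, x):
    """fib[r_in r_out] x: the input lives in r_in, the result is put in r_out."""
    assert x[0] is r_in
    n = read(x)
    if n == 0 or n == 1:
        return r_out.alloc(1)
    with letregion("r5", "r6") as (r5, r6):
        with letregion("r7", "r8") as (r7, r8):
            r7.alloc("fib[r8 r5]")  # stands in for the monomorphic instance
            with letregion("r9") as r9:
                x2 = arith(operator.sub, r_in, r9, r8, x, r9.alloc(2))
            left = fib(r8, r5, x2)
        with letregion("r10", "r11") as (r10, r11):
            r10.alloc("fib[r11 r6]")
            with letregion("r12") as r12:
                x1 = arith(operator.sub, r_in, r12, r11, x, r12.alloc(1))
            right = fib(r11, r6, x1)
        return arith(operator.add, r5, r6, r_out, left, right)


def main(n=15):
    r1 = Region("r1")  # pre-declared region that receives the final output
    with letregion("r2"):  # fib itself is allocated at r2 in the figure
        with letregion("r12", "r13") as (r12, r13):
            r12.alloc("fib[r13 r1]")
            result = fib(r13, r1, r13.alloc(n))
    return read(result)
```

Calling `main()` leaves only the final result live in `r1`; every intermediate region has been deallocated by the time its `letregion` block exits, and the assertions in `alloc` and `read` would fail on any access to a dead region.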


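We revise the reduction semantics below. We modify $\rightharpoonup_{\mathrm{obj}}$-µλ to handle the extension to region-polymorphism and require the let expression form. $\rightharpoonup_{\mathrm{obj}}$-reg-app applies a region-polymorphic function to region indicator arguments. It must allocate a region-monomorphic function at the given region r0, and return its address. The region arguments may or may not be live. The rule $\rightharpoonup_{\mathrm{obj}}$-alloc would be modified to exclude region-polymorphic functions.

$\rightharpoonup_{\mathrm{obj}}$-let-µλ-reg (with $r = r_1 r_0 r_2$): if ${}^{\rho}a_0 = \langle r_0, o\rangle$, then
$\langle \mathrm{let}\ \langle \mu x @ r_0.\ \lambda[\rho]x.\ e_1\rangle\ \mathrm{in}\ e_2;\ s\rangle \overset{[(\mathrm{alloc}\,@\,\langle r_0, o\rangle)]}{\rightharpoonup} \langle e_2[x := {}^{\rho}a_0];\ s\{{}^{\rho}a_0 \mapsto \langle \lambda[\rho]x.\ e_1[x := {}^{\rho}a_0]\rangle\}\rangle$

$\rightharpoonup_{\mathrm{obj}}$-reg-app (with $r = r_1 r_0 r_2$): if ${}^{\rho}a = \langle r, o\rangle$, $s({}^{\rho}a) = \langle \lambda[\rho]x.\ e\rangle$, and $a_0 = \langle r_0, o\rangle$, then
$\langle @\,r_0\ {}^{\rho}a\,[\emptyset r];\ s\rangle \overset{\{(\mathrm{exec}\,@\,r),\ (\mathrm{alloc}\,@\,r_0)\}}{\rightharpoonup} \langle a_0;\ s\{a_0 \mapsto \langle \lambda x.\ e[\rho := \emptyset r]\rangle\}\rangle$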

Additional and modified rules for typing local region-polymorphic functions are presented in Figure C.3.2. For storable values, we add a rule for region-polymorphic procedures in terms of the prestorable judgment with an environment extended only with region names and an empty effect. Two rules type applications of region-polymorphic functions to region indicator arguments. The first applies when the operator is a variable and the second when it is an address. In both cases, the storable type is the result of a substitution for the region variable parameters and an execution effect is registered. The rule for region-polymorphic functions extends the environment with the region variable parameters. A rule for a recursive region-polymorphic function types it as an abstraction over region variables. It requires a derivation for a nonrecursive function, extending the environment with a type scheme abstracted over the region variables. The final rule, for let-bound recursive region-polymorphic functions, requires two antecedents: one for the function and the other for the expression body. Throughout, both region arguments and regions at which to allocate functions must occur in the environment, i.e., they may be declared in either encapsulation constructs or region-polymorphic functions.

⊢obj d-λ-reg: if $\mathrm{ftv}(B_0) = X$ and $\emptyset\{r_1\}\{r_0\}; S_0 \vdash^{r_1 r_0\,\hat\epsilon}_{\mathrm{obj}\,b} \langle \lambda[\rho]x.\ e\rangle : \Lambda\rho.\ B_0\ !\ \emptyset$, then $S_0 \vdash^{r_1 r_0}_{\mathrm{obj}\,d} \langle \lambda[\rho]x.\ e\rangle : \forall X.\ \Lambda\rho.\ B_0$
⊢obj b-reg-app-var: if each region argument $\varrho \in \mathrm{Dom}(\Gamma)$, $\varrho \in \mathrm{Dom}(\Gamma)$, and $\Gamma(x)$ $\Lambda\rho.\ B\,@\,\varrho$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} x\,[\varrho] : B[\rho := \varrho]\ !\ \{(\mathrm{exec}\,@\,\varrho)\}$
⊢obj b-reg-app-addr: if each region argument $\varrho \in \mathrm{Dom}(\Gamma)$, $r \in \varrho_1\varrho_0$, and $S(\langle r, o\rangle)$ $\Lambda\rho.\ B$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle r, o\rangle\,[\varrho] : B[\rho := \varrho]\ !\ \{(\mathrm{exec}\,@\,r)\}$
⊢obj b-λ-reg: if $\Gamma\{\rho\}; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda x.\ e\rangle : B\ !\ \emptyset$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda[\rho]x.\ e\rangle : \Lambda\rho.\ B\ !\ \emptyset$
⊢obj b-µλ-reg: if $\Gamma\{x \mapsto \Lambda\rho.\ B\,@\,\varrho_0\}; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda[\rho]x.\ e\rangle : \Lambda\rho.\ B\ !\ \emptyset$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \mu x @ \varrho_0.\ \lambda[\rho]x.\ e\rangle : \Lambda\rho.\ B\ !\ \emptyset$
⊢obj e-let-µλ-reg: if $\varrho \in \mathrm{Dom}(\Gamma)$, $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \mu x @ \varrho.\ \lambda[\rho]x.\ e_1\rangle : \Lambda\rho.\ B\ !\ \emptyset$, $\Gamma\{x \mapsto \forall X.\ \Lambda\rho.\ B\,@\,\varrho\}; S \vdash^{\varrho\,\hat r}_{\mathrm{obj}\,e} e_2 : P\ !\ T_2$, and $X = \mathrm{ftv}(B) - \mathrm{ftv}(\Gamma)$, then $\Gamma; S \vdash^{\varrho\,\hat r}_{\mathrm{obj}\,e} \mathrm{let}\ \langle \mu x @ \varrho.\ \lambda[\rho]x.\ e_1\rangle\ \mathrm{in}\ e_2 : P\ !\ \{(\mathrm{alloc}\,@\,\varrho)\} \cup T_2$

Figure C.3.2. Object (LRPF) Language Typing Rules

Programs incorporating local region-polymorphic functions can be translated into programs including only region-monomorphic functions. Since we can identify all occurrences of region-polymorphic functions, we can replace the declaration of each such function with multiple declarations of functions corresponding to each occurring application. We then must replace each application of a region-polymorphic function to region indicator arguments with a reference to the appropriate region-monomorphic function. Although the resulting explosion in code size may not be acceptable, it is thus at least theoretically possible to model such an object language in our existing monadic language.

3.2. First-class Region-polymorphic Functions.

A language with first-class region-polymorphic functions would remove the requirement that region-polymorphic functions be let-bound. We present some possible reduction rules that support unary region-polymorphic functions. $\rightharpoonup_{\mathrm{obj}}$-reg-app is similar to that for applying local region-polymorphic functions. The remaining two rules allocate recursive and nonrecursive region-polymorphic functions, respectively.

$\rightharpoonup_{\mathrm{obj}}$-reg-app (with $r = r_1 r_0 r_2$): if $s(a) = \langle \lambda\rho.\ e\rangle$, then $\langle a\,[\emptyset r];\ s\rangle \overset{[(\mathrm{exec}\,@\,r_0)]}{\rightharpoonup} \langle e[\rho := \emptyset r];\ s\rangle$
$\rightharpoonup_{\mathrm{obj}}$-µλ-reg (with $r = r_1 r_0 r_2$): if $a = \langle r_0, o\rangle$, then $\langle \langle \mu x @ r_0.\ \lambda\rho.\ e\rangle;\ s\rangle \overset{[(\mathrm{alloc}\,@\,r_0)]}{\rightharpoonup} \langle a;\ s\{a \mapsto \langle \lambda\rho.\ e[x := a]\rangle\}\rangle$
$\rightharpoonup_{\mathrm{obj}}$-λ-reg (with $r = r_1 r_0 r_2$): if $a = \langle r_0, o\rangle$, then $\langle @\,r_0\,\langle \lambda\rho.\ e\rangle;\ s\rangle \overset{[(\mathrm{alloc}\,@\,r_0)]}{\rightharpoonup} \langle a;\ s\{a \mapsto \langle \lambda\rho.\ e\rangle\}\rangle$

⊢obj d-λ-reg: if $\mathrm{ftv}(B) = X$ and $\emptyset\{r_1\}\{r_0\}; S_0 \vdash^{r_1 r_0\,\hat\epsilon}_{\mathrm{obj}\,b} \langle \lambda\rho.\ e_0\rangle : B_0\ !\ \emptyset$, then $S_0 \vdash^{r_1 r_0}_{\mathrm{obj}\,d} \langle \lambda\rho.\ e_0\rangle : \forall X.\ B_0$
⊢obj b-µ-reg: if $T \subseteq T_0$ and $\Gamma\{\rho\}; S \vdash^{\varrho_1\varrho_0\,\hat r}_{\mathrm{obj}\,e} e : P\ !\ T$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda\rho.\ e\rangle : \Lambda^{T_0}\rho.\ P\ !\ \emptyset$
⊢obj b-µλ: if $\varrho \in \mathrm{Dom}(\Gamma)$ and $\Gamma\{x_1 \mapsto B\,@\,\varrho\}; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda x_2.\ e\rangle : B\ !\ \emptyset$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \mu x_1 @ \varrho.\ \lambda x_2.\ e\rangle : B\ !\ \emptyset$
⊢obj b-µλ-reg: if $\varrho \in \mathrm{Dom}(\Gamma)$ and $\Gamma\{x \mapsto B\,@\,\varrho\}; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \lambda\rho.\ e\rangle : B\ !\ \emptyset$, then $\Gamma; S \vdash^{\varrho_1\varrho_0/\varrho_2\,\hat r}_{\mathrm{obj}\,b} \langle \mu x @ \varrho.\ \lambda\rho.\ e\rangle : B\ !\ \emptyset$
⊢obj e-reg-app: if $\Gamma; S \vdash^{\varrho_1\varrho_0\varrho_2\,\hat r}_{\mathrm{obj}\,e} e_1 : (\Lambda^{T_0}\rho.\ P)\,@\,\varrho_1\ !\ T_1$, $\varrho_1 \in \mathrm{Dom}(\Gamma)$, and $\varrho_2 \in \mathrm{Dom}(\Gamma)$, then $\Gamma; S \vdash^{\varrho_1\varrho_0\varrho_2\,\hat r}_{\mathrm{obj}\,e} e_1\,[\varrho_2] : P[\rho := \varrho_2]\ !\ T_0[\rho := \varrho_2] \cup T_1 \cup \{(\mathrm{exec}\,@\,\varrho_1)\}$

Figure C.3.1. Object (FCRPF) Language Typing Rules

Typing rules for values are presented in Figure C.3.1. We make use of $\Lambda^{T}\rho.\ P$, i.e., pure types abstracted over by a region variable and annotated with an effect. Unlike the previous system, there is no customized expression presented for allocating region-polymorphic functions. We thus assume the existence of a standard typing rule for allocations of prestorables.

Consider translating into a monadic language. The actual location of any operation performed in the body of a first-class region-polymorphic function at the region parameter is not fixed, and cannot, in general, be determined statically. This is because the arguments may only be supplied in other routines to which the function is passed. Clearly, we will not be able to support the same level of effect annotations on monads that we have enjoyed so far in this study. This may recommend the use of local over first-class region-polymorphic functions.

4. Unrestricted Store

We have already described unrestricted stores in Section 2.2 of the Introduction and presented some typing rules in Section 4.1.2. An unrestricted store does not change much in the object languages — in fact, that is how such systems are generally structured. When we model an unrestricted store with monad transformers, we must arrange for each region to allow for the maximum number of applications of the monad transformer. One way to accomplish this is to keep the entire store at the maximum level of monadification, and transform all the stored values as regions are created and destroyed.

5. Other Effects

A natural topic of consideration is the application of these ideas to other effects. Exceptions, multithreading, and continuations are particularly attractive candidates.

Operators for control effects are generally expressed in terms of continuations. Related work on this monad was described in the Introduction.

Recall that the nesting of regions controls the granularity of the steps of computation. Grossman's Cyclone system [20] is a region-based system in which regions correspond to both memory and multithreading. Trampolining [15, 19] can be used to implement multithreading effects functionally. The encapsulation construct for multithreading corresponds to a scheduler. The schedulers corresponding to inner regions run as processes within others corresponding to outer regions. Operations, such as spawning processes, would refer to some enclosing region. A reduction semantics would include some representation of these nested queues.

Exceptions are now a common feature of programming languages: they allow an exception to be raised in one part of the code and handled in another part of the code higher on the stack. Guzmán and Suárez present a type system and reduction semantics for representing exceptions without encapsulation [21]. The try construct provides some implicit encapsulation for languages with forms for handling exceptions when it is only possible to raise or handle an exception within such a construct. The encapsulation could be enriched by separating try from catch and allowing the same exception to be raised (or caught) in any enclosing try construct. There is clearly work to be done in modelling such systems.
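To make the remark on trampolining and schedulers concrete, here is a minimal Python sketch of a round-robin trampolined scheduler. It is an illustration under assumptions, not the formulation of the cited work: a thread is a zero-argument function that performs one step and returns either a finished value or the thunk for its next step, and the names `done`, `bounce`, `scheduler`, and `countdown` are hypothetical. Nesting of schedulers within regions is not modeled.

```python
from collections import deque


def done(value):
    """A thread step that has finished with a value."""
    return ("done", value)


def bounce(thunk):
    """A thread step that yields control, leaving a thunk for the rest."""
    return ("more", thunk)


def scheduler(threads):
    """Run each thread one step at a time, round-robin, until all finish."""
    queue = deque(enumerate(threads))
    results = {}
    while queue:
        index, step = queue.popleft()
        tag, payload = step()
        if tag == "done":
            results[index] = payload
        else:
            queue.append((index, payload))
    return [results[i] for i in range(len(threads))]


def countdown(name, n, log):
    """A sample thread that logs one entry per step and then returns its name."""
    def step():
        if n == 0:
            return done(name)
        log.append((name, n))
        return bounce(countdown(name, n - 1, log))
    return step
```

Running `scheduler([countdown("a", 2, log), countdown("b", 1, log)])` with an empty list `log` interleaves the two threads step by step, so the scheduler plays the role the text assigns to the encapsulation construct.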

6. Incremental Implementation and Monadic Reflection

So far we have interpreted the object language in terms of a monadic language, but have not formally interpreted the monadic language in terms of more primitive constructs. An appropriate next step might be to implement the monadic language in a functional setting. We could then build representations for effectful computations that make clear how the effect is achieved. Something similar was sketched for the λ-calculus in our Categorical Interludes.

Reflection [51, 18] is the capacity of programming languages to access their own implementation. It generally consists of a reify form that makes an implementation data structure available to run-time code and a reflect procedure that installs a user-generated data structure into the run-time implementation. Monadic reflection [43, 14] is the use of reify and reflect in monadic code. The implementation data structures are just the functional representations of effectful programs. Rather than providing a fixed set of operators, it is simpler and less restrictive to allow effectful operators to be defined in terms of reify and reflect. If we treat the lambda abstractions and pairs in these definitions as new (stateless) language constructs, we can use reify and reflect to move in and out of the abstract computation type.

run    :C  Stε P                    →  P                            =  λe. let ⟨v, s⟩ = reify e ∅ in v
ref    :C  P                        →  St{alloc} Ref P              =  λv. reflect λs. ⟨o, s{o ↦ v}⟩
deref  :C  Ref P                    →  St{read} P                   =  λo. reflect λs. ⟨s o, s⟩
set    :C  Ref P × P                →  St{write} Unit               =  λ⟨o, v⟩. reflect λs. ⟨unit, s{o ↦ v}⟩
abs    :C  P1 ⇒ (Stε P2)            →  St{alloc} (P1 ⇒ (Stε P2))    =  λf. reflect λs. ⟨o, s{o ↦ f ◦ reify}⟩
app    :C  (P1 ⇒ (Stε P2)) × P1     →  St{exec} ∪ ε P2              =  λ⟨o, v⟩. reflect λs. s o v s
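The operator definitions above can be tried out in a small Python model. This is a sketch under stated assumptions rather than the intended implementation: the abstract computation type is a wrapper class `Comp` around a function from stores to (value, store) pairs, stores are dictionaries keyed by integer offsets, a fresh offset is taken to be the current store size, and `unit` and `bind` are the usual state-monad return and bind. The names `set_` and `abs_` avoid Python built-ins. The stored function in `abs_` is read as "apply f, then reify the resulting body", which is how the surrounding prose describes the wrapping.

```python
class Comp:
    """Abstract computation; hides a function store -> (value, store)."""

    def __init__(self, fn):
        self._fn = fn


def reflect(fn):
    """Install a functional representation as an abstract computation."""
    return Comp(fn)


def reify(comp):
    """Expose the functional representation of an abstract computation."""
    return comp._fn


def unit(v):
    return reflect(lambda s: (v, s))


def bind(m, k):
    def rep(s):
        v, s_next = reify(m)(s)
        return reify(k(v))(s_next)
    return reflect(rep)


def run(e):
    v, _store = reify(e)({})  # start from the empty store
    return v


def ref(v):
    def rep(s):
        o = len(s)  # assumed fresh offset
        return o, {**s, o: v}
    return reflect(rep)


def deref(o):
    return reflect(lambda s: (s[o], s))


def set_(o, v):
    return reflect(lambda s: (None, {**s, o: v}))


def abs_(f):
    def rep(s):
        o = len(s)
        return o, {**s, o: (lambda v: reify(f(v)))}  # store the reified body
    return reflect(rep)


def app(o, v):
    return reflect(lambda s: s[o](v)(s))


def example():
    """Allocate a cell and a stored function that adds to it, then call it."""
    def bump(n):
        return bind(deref(0), lambda old:
               bind(set_(0, old + n), lambda _: deref(0)))
    return bind(ref(41), lambda _cell:
           bind(abs_(bump), lambda f: app(f, 1)))
```

Here `run(example())` threads the store through the cell allocation, the allocation of the stored function, and its application; the operators never inspect a `Comp` except through `reify`, which mirrors the role of reflect in keeping the operators abstract.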

reflect surrounds the definition of each operator invocation so that the operators remain abstract. The body of the function f would be abstract, either using reflect or the effectful operators themselves, so the stored function is wrapped in reify. In the definition of app, the application of s o to v applies the stored function, and the reified body computation is ready to accept a store. reify must also be used in run, prior to extracting the value.

An implementation function would translate the run or running form and each stored function, and within the reify form replace return, bind, and effectful operators with their definitions, treating alternating reflect and reify forms as signals to cease and resume, respectively, the translation. The intuition is that since the argument to reflect is already a representation of a computation, it can be added to the newly implemented code without modification, subject to occurrences of reify. Finally, the extended intermediate language including reify and reflect could be given an operational semantics consistent with this implementation and the semantics of the lambda-calculus.

We have seen that our monadic computations are built up incrementally using monad transformers. We would hope to apply the above strategy to our core language. We do not demonstrate that here, but merely provide a taste of how it could be achieved. We would like to implement monadic computations incrementally, outermost regions or innermost monad transformers first. The implementation function would then be the least fixed point of the function that identifies and removes in parallel the outermost monadic encapsulation constructs.

We have seen that the state monad transformer takes the form $\between\!S \Rightarrow M(P \times \between\!S)$ for prior monad M and value type P. A monadic value of the intermediate language might have type $\mathrm{St}_{\varrho_2}(\mathrm{St}_{\varrho_1}(\mathrm{Id}))\,P$. Applying the definition of the state monad transformer to the innermost transformer application (outermost instance of run or running), we obtain $\between\!S_{\varrho_1} \Rightarrow \mathrm{St}_{\varrho_2}(\mathrm{Id})(P \times \between\!S_{\varrho_1})$. Applying the definition a second time, we obtain $\between\!S_{\varrho_1} \Rightarrow \between\!S_{\varrho_2} \Rightarrow \mathrm{Id}(P \times \between\!S_{\varrho_1} \times \between\!S_{\varrho_2})$. Our hybrid language for partially implemented programs will thus need to contain storage-free procedures, pairs, and region stores (see footnote 4). The extension of a region store type will be considered a subtype of the original. Since the prior monad is applied not to the value type but to a pair type including a region store type, our bind form will need to pattern-match on the entire tuple. Effectful operators would, operationally, use one of these for computations on already-implemented regions. Computations on other regions would be handled as before, using the uppermost region store of the lexical store. As run and running forms are implemented, they are replaced by code that applies the functional representations and projects the value.

Incremental implementation is a good idea on its own, but combining it with monadic reflection simplifies the system while adding power. The basis of the added power is that we can allow a user to add any operator supported by the monad. The basis of the simplification is that both ideas rely on combining code at various stages of implementation within the same language. Thus, a strategy for incorporating reflection into our study would be as follows. First, define a source language with reify and reflect instead of effectful and encapsulating operators. Translate from the original monadic source language to this one. Then enhance the source language with constructs such as a let form that binds tuples of region stores (for already-implemented regions) as well as the computed value. Finally, define a staged implementation function from the revised source language to a fully functional implementation language via this extended monadic language.

Footnote 4: One might wonder whether we have trivialized our problem of implementing heap-allocated functions by assuming the availability of storage-free functions. The answer is in the negative because our storage-free functions are quite restricted and will be used only to abstract over region stores. Additionally, if we execute each computation exactly once and on an empty store, as is done with the traditional definition of run, our implementation does not need them. In all instances not pertaining directly to heap-allocated functions, these functions are applied immediately; thus they can all be changed to lets. And except for the outer application which sets the initial store, they are applied to an operand which is the formal parameter; thus all but one application per region can be eliminated entirely.
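The twice-unfolded state monad transformer type discussed in this section, a function from the store of ϱ1 to a function from the store of ϱ2 to a nested pair, can be modeled directly in Python. The sketch below is illustrative only and rests on assumptions: stores are dictionaries, the result is shaped as ((value, store1), store2), and the names `unit`, `bind`, `read1`, `write1`, `read2`, `write2`, and `run1` are hypothetical. `run1` plays the part of implementing the run form for ϱ1: it supplies the empty store, projects the value, and leaves a computation over the store of ϱ2 alone.

```python
def unit(v):
    return lambda s1: lambda s2: ((v, s1), s2)


def bind(m, k):
    """Bind pattern-matches on the entire nested tuple, as the text requires."""
    def rep(s1):
        def rep2(s2):
            (v, s1_next), s2_next = m(s1)(s2)
            return k(v)(s1_next)(s2_next)
        return rep2
    return rep


def write1(o, v):
    return lambda s1: lambda s2: ((None, {**s1, o: v}), s2)


def read1(o):
    return lambda s1: lambda s2: ((s1[o], s1), s2)


def write2(o, v):
    return lambda s1: lambda s2: ((None, s1), {**s2, o: v})


def read2(o):
    return lambda s1: lambda s2: ((s2[o], s1), s2)


def run1(m):
    """Implement run for the first region: empty store in, value projected out."""
    def rep(s2):
        (v, _s1), s2_next = m({})(s2)
        return v, s2_next
    return rep


def example():
    return bind(write1("o", 5), lambda _:
           bind(write2("p", 7), lambda _:
           bind(read1("o"), lambda a:
           bind(read2("p"), lambda b: unit(a + b)))))
```

Applying `run1(example())` to an empty second store returns the computed value together with the second region's final store, while the first region's store has been discarded, which matches the description of replacing an implemented run form by code that applies the functional representation and projects the value.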

Bibliography

[1] A. Asperti and G. Longo. Categories, Types, and Structures. Foundations of Computing. MIT Press, 1991.
[2] N. Benton, J. Hughes, and E. Moggi. Monads and effects. Sept. 2000.
[3] N. Benton and A. Kennedy. Monads, effects and transformations. In 3rd International Workshop in Higher Order Operational Techniques in Semantics, Paris, Sept. 1999. Elsevier. Also vol 26 of Electronic Notes in Theoretical Computer Science.
[4] G. Bollella, B. Brosgol, P. Dibble, S. Furr, J. Gosling, D. Hardin, and M. Turnbull. The Real-Time Specification for Java™, page 59. Addison-Wesley, June 2000.
[5] C. Calcagno, S. Helsen, and P. Thiemann. Syntactic type soundness results for the region calculus. Information and Computation, to appear, 20 2001.
[6] K. Crary, D. Walker, and G. Morrisett. Typed memory management in a calculus of capabilities. In Proceedings of the ACM Symposium on Principles of Programming Languages, pages 262–275, New York, NY, Jan. 1999. ACM Press.
[7] R. L. Crole and A. M. Pitts. New foundations for fixpoint computations: Fix-hyperdoctrines and the fix-logic. Information and Computation, 98(2):171–210, June 1992. Preliminary version appeared in "Proceedings, Fifth Annual IEEE Symposium on Logic in Computer Science, pages 489-497, Philadelphia, Pennsylvania, 4-7 June 1990. IEEE Computer Society Press."
[8] O. Danvy and A. Filinski. Abstracting control. In Proceedings of the 1990 ACM Conference on LISP and Functional Programming, Nice, pages 151–160, New York, NY, 1990. ACM.
[9] R. K. Dybvig, S. P. Jones, and A. Sabry. A monadic framework for delimited continuations. Journal of Functional Programming, 2006. To appear.
[10] M. Elsman. Garbage collection safety for region-based memory management. In Workshop on Types in Language Design and Implementation. ACM Press, 2003.
[11] D. Espinosa. Modular denotational semantics. Technical report, Columbia University, Dec. 1993.
[12] M. Felleisen and D. P. Friedman. A syntactic theory of sequential state. Theoretical Computer Science, 1987. Preliminary version in: Proceedings of the 14th Annual Symposium on Principles of Programming Languages, 1987, 314–325.
[13] M. Felleisen and R. Hieb. A revised report on the syntactic theories of sequential control and state. Theoretical Computer Science, 103(2):235–271, 1992.
[14] A. Filinski. Representing monads. In Symposium on Principles of Programming Languages, pages 446–457. ACM Press, 1994.

[15] A. Filinski. Representing layered monads. In Symposium on Principles of Programming Languages, San Antonio, TX, 1999. ACM Press.
[16] J.-C. Filliâtre. A theory of monads parameterized by effects. Research Report 1367, LRI, Université Paris Sud, Nov. 1999.
[17] M. Fluet and G. Morrisett. Monadic regions. SIGPLAN Notices, 39(9):103–114, 2004.
[18] D. P. Friedman and M. Wand. Reification: Reflection without metaphysics. LISP and Functional Programming, pages 348–355, 1984.
[19] S. E. Ganz, D. P. Friedman, and M. Wand. Trampolined style. In International Conference on Functional Programming, Paris, 1999. ACM Press.
[20] D. Grossman. Type-safe multithreading in Cyclone. In Workshop on Types in Language Design and Implementation. ACM Press, 2003.
[21] J. Guzmán and A. Suárez. An extended type system for exceptions. In 5th Workshop on ML and its Applications. ACM, June 1994. Also appeared as Research Report 2265, INRIA, BP 105-78153 Le Chesnay Cedex, France.
[22] S. Helsen and P. Thiemann. Syntactic type soundness for the region calculus. In A. Jeffrey, editor, HOOTS ’00 Higher Order Operational Techniques in Semantics, volume 41 of Electronic Notes in Theoretical Computer Science, Montreal, Canada, Sept. 2000. Elsevier Science.
[23] F. Henglein, H. Makholm, and H. Niss. A direct approach to control-flow sensitive region-based memory management. In Proceedings of the 3rd International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming. ACM Press, Sept. 2001.
[24] R. Hindley. The principal type scheme of an object in combinatory logic. Transactions of the American Mathematical Society, 146:29–60, Dec. 1969.
[25] R. Hinze. Deriving backtracking monad transformers. In International Conference on Functional Programming, pages 186–197, Montreal, Canada, 2000. ACM Press.
[26] P. Hudak. Mutable abstract datatypes (or how to have your state and munge it too). Technical Report YALEU/DCS/RR-914, Department of Computer Science, Yale University, New Haven, CT 06520, May 1993.
[27] M. P. Jones and L. Duponcheel. Composing monads. Technical Report YALEU/DCS/RR 1004, Yale University, New Haven, CT, 1993.
[28] S. L. P. Jones and P. Wadler. Imperative functional programming. In ACM, editor, Conference record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages: papers presented at the symposium, Charleston, South Carolina, January 10–13, 1993, pages 71–84, New York, NY, USA, 1993. ACM Press.
[29] P. Jouvelot and D. K. Gifford. Reasoning about continuations with control effects. In Conference on Programming Language Design and Implementation, pages 218–226, Portland, Oregon, 1989.
[30] P. Jouvelot and D. K. Gifford. Algebraic reconstruction of types and effects. In Symposium on Principles of Programming Languages, pages 303–310, Orlando, Florida, 1991. ACM Press.
[31] K. Kagawa. Monadic encapsulation with stack of regions. In FLOPS 2001, number 2024 in LNCS, pages 264–279. Springer-Verlag, 2001.

[32] D. J. King and P. Wadler. Combining monads. In J. Launchbury and P. M. Sansom, editors, 5th Glasgow Workshop on Functional Programming, Ayr, Scotland, 1992. Springer-Verlag.
[33] L. Damas and R. Milner. Principal type schemes for functional programs. In Conference Record of 9th Annual ACM Symposium on Principles of Programming Languages, Albuquerque, NM, 1982. ACM Press.
[34] J. Lambek and P. J. Scott. Introduction to Higher Order Categorical Logic. Number 7 in Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1986.
[35] J. Launchbury and S. L. P. Jones. Lazy functional state threads. In Proceedings of the ACM SIGPLAN ’94 Conference on Programming Language Design and Implementation (PLDI), pages 24–35, Orlando, Florida, June 1994.
[36] J. Launchbury and S. L. P. Jones. State in Haskell. Lisp and Symbolic Computation, 8(4):293–341, 1995.
[37] J. Launchbury and A. Sabry. Monadic state: Axiomatization and type safety. In International Conference on Functional Programming, pages 227–238, Amsterdam, July 1997. ACM Press.
[38] S. Liang, P. Hudak, and M. Jones. Monad transformers and modular interpreters. In 22nd Symposium on Principles of Programming Languages, pages 333–343, San Francisco, CA, Jan. 1995. ACM Press.
[39] J. M. Lucassen and D. K. Gifford. Polymorphic effect systems. In Symposium on Principles of Programming Languages, pages 47–57, San Diego, California, Jan. 1988. ACM Press.
[40] C. McLarty. Elementary Categories, Elementary Toposes. Number 21 in Oxford Logic Guides. Oxford University Press, 1992.
[41] R. Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17:348–375, Aug. 1978.
[42] J. C. Mitchell and A. Scedrov. Notes on sconing and relators. In E. Boerger et al., editors, Computer Science Logic ’92, Selected Papers, volume 702 of LNCS, pages 352–378. Springer-Verlag, 1993.
[43] E. Moggi. An abstract view of programming languages. Technical Report ECS-LFCS-90-113, University of Edinburgh, Laboratory for Foundations of Computer Science, 1990.
[44] E. Moggi. Notions of computation and monads. Information and Computation, 93(1):55–92, 1991.
[45] E. Moggi and F. Palumbo. Monadic encapsulation of effects: A revised approach. In Third International Workshop on Higher-order Operational Techniques in Semantics, Electronic Notes in Theoretical Computer Science. Elsevier, Sept. 1999.
[46] E. Moggi and A. Sabry. Monadic encapsulation of effects: A revised approach (extended version). Journal of Functional Programming, 11(6):591–627, Nov. 2001.
[47] F. Nielson and H. R. Nielson. Type and effect systems. In Correct System Design, pages 114–136, 1999.
[48] B. C. Pierce. Basic Category Theory for Computer Scientists. Foundations of Computing. MIT Press, 1991.
[49] A. Sabry and P. Wadler. A reflection on call-by-value. In International Conference on Functional Programming, pages 13–24. ACM Press, 1996. Also SIGPLAN Notices, 31(6).
[50] M. Semmelroth and A. Sabry. Monadic encapsulation in ML. In International Conference on Functional Programming, pages 8–17, Paris, Sept. 1999. ACM Press.

[51] B. C. Smith. Reflection and Semantics in a Procedural Language. PhD thesis, Massachusetts Institute of Technology, Jan. 1982.
[52] W. Taha. A sound reduction semantics for untyped CBN multi-stage computation. Or, the theory of MetaML is nontrivial (preliminary report). In Symposium on Partial Evaluation and Semantics Based Program Manipulation, Boston, Massachusetts, Jan. 2000. ACM Press.
[53] W. Taha and M. F. Nielsen. Environment classifiers. In Symposium on Principles of Programming Languages. ACM Press, 2003.
[54] J.-P. Talpin. A simplified account of region inference. Technical Report 1380, IRISA, Jan. 2000.
[55] J.-P. Talpin and P. Jouvelot. Polymorphic type, region and effect inference. Journal of Functional Programming, 2(3):245–271, July 1992.
[56] J.-P. Talpin and P. Jouvelot. The type and effect discipline. Information and Computation, 111:245–296, 1994. Preliminary version appeared in “Proceedings of the 1992 Conference on Logic in Computer Science, IEEE Computer Society Press”.
[57] H. Thielecke. From control effects to typed continuation passing. In Symposium on Principles of Programming Languages. ACM Press, 2003.
[58] M. Tofte and L. Birkedal. A region inference algorithm. Transactions on Programming Languages and Systems, 20(5):724–757, July 1998.
[59] M. Tofte and J.-P. Talpin. Implementation of the typed call-by-value λ-calculus using a stack of regions. In Symposium on Principles of Programming Languages, pages 188–201, Portland, Oregon, 1994. ACM Press.
[60] M. Tofte and J.-P. Talpin. Region-based memory management. Information and Computation, 132(2):109–176, Feb. 1997. Preliminary version appeared at “Symposium on Principles of Programming Languages, 1994”.
[61] A. P. Tolmach. Optimizing ML using a hierarchy of monadic types. In Workshop on Types in Compilation, pages 97–115, Kyoto, Mar. 1998. ACM Press.
[62] P. Wadler. The essence of functional programming. In Proceedings of the 19th Annual ACM Symposium on Principles of Programming Languages, Albuquerque, New Mexico, pages 1–14, Jan. 1992.
[63] P. Wadler. Monads for functional programming. In J. Jeuring and E. Meijer, editors, Advanced Functional Programming, volume 925 of LNCS. Springer-Verlag, 1995. (This is a revised version of [62].)
[64] P. Wadler. The marriage of effects and monads. In International Conference on Functional Programming, pages 63–74, Baltimore, Maryland, Sept. 1998. ACM Press.
[65] P. Wadler and P. Thiemann. The marriage of effects and monads. Transactions on Computational Logic, 4(1):1–32, Jan. 2003.
[66] A. K. Wright. Typing references by effect inference. In European Symposium on Programming, pages 473–491, Rennes, France, Feb. 1992. Springer-Verlag. Vol. 582 of Lecture Notes in Computer Science.
[67] A. K. Wright and M. Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, Nov. 1994.
