1. Introduction
We explore the possibility of using term graphs as a basis for an imperative language. We present a language based on this idea. The presented language has a very simple syntax and semantics. A complete description of the syntax and semantics takes about 13 pages, including some features that are not really necessary, such as a term rewriting facility. The language also provides arrays that are as efficient as arrays in imperative programming languages. The language has some interesting mathematical properties, which are explored.
1.1. Term Graphs
A term graph is a graphical structure in which the nodes represent terms. Common subterms can be represented only once, with shared structure. Terms are very basic and universal mathematical objects which can also be used to represent data structures in programming languages. If f is a function symbol and t1, ..., tn are terms then f(t1, ..., tn) is a term. In the term graph language presented here, terms do double duty. They can be expressions to be evaluated, such as gcd(x, y), or they can be data structures such as cons(a,cons(b,NIL)) as in LISP. The former make use of defined symbols and the latter make use of constructors, which can appear in data structures. Only constructors may appear in term graphs, which implies that terms represented by term graphs are ground (variable-free) constructor terms.
If there are cycles in the term graph then the terms that are represented can be infinite. This permits term graphs to represent data structures such as doubly linked lists which have cycles in them. Two dimensional arrays can be represented in term graphs using finite terms, as an array whose elements are arrays, or more precisely, as a term whose top-level subterms are also terms. Of course, all terms of arity one or more have top-level subterms which are terms, so two dimensional arrays do not require any extension to the language. One might also implement two dimensional arrays by one dimensional arrays, translating a pair of indices into a single index.
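The sharing and cycles described above can be sketched in a few lines of Python, using a hypothetical Node class (an illustration only, not ITGL's internal representation): each node carries a constructor symbol and a list of child nodes, shared subterms are represented by pointing at the same node, and a cycle yields an infinite term when unraveled.

```python
# Minimal sketch of term graph nodes (hypothetical, for illustration).
class Node:
    def __init__(self, symbol, children=None):
        self.symbol = symbol              # constructor symbol, never changed
        self.children = children or []    # child nodes; these may be reassigned

# The list cons(a, cons(b, NIL)); the NIL node could be shared by many lists.
nil = Node("NIL")
a_list = Node("cons", [Node("a"), Node("cons", [Node("b"), nil])])

# A cycle represents an infinite term: a node f whose only child is itself.
loop = Node("f")
loop.children = [loop]

def term_string(node, depth):
    """Unravel a (possibly cyclic) term graph into a string, up to a depth."""
    if depth == 0 or not node.children:
        return node.symbol
    args = ", ".join(term_string(c, depth - 1) for c in node.children)
    return f"{node.symbol}({args})"

print(term_string(a_list, 5))  # cons(a, cons(b, NIL))
print(term_string(loop, 3))    # f(f(f(f)))
```

The `term_string` function must bound its depth precisely because a cyclic graph denotes an infinite term.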
Languages based on term graphs are typically defined by term rewriting [BN98, DJ90, DP01]. The language presented here is unusual in this respect because it is based on term graphs but is an imperative programming language. Thus we call this language an imperative term graph language (ITGL). However, a facility for specifying and implementing term rewriting in ITGL is also presented. This gives the language a convenient pattern matching facility. For simplicity the language is untyped.
1.2. Imperative Languages and ITGL
Imperative languages map states of the machine to states of the machine while functional languages replace expressions by equivalent expressions. In ITGL the state of the machine is represented by a term graph and an environment, and statements in the language map such states to states. The environment essentially tells the values assigned to program variables. More precisely, the nodes in the term graph correspond roughly to storage locations (addresses) or registers in which the values of program variables are stored. The environment maps program variables to such nodes which tell where the value of the program variable is stored. These storage locations are made explicit in the semantics of ITGL. However, the nodes never appear explicitly in the language itself. Each such node has a constructor function symbol and a list of children, which are nodes. By tracing down these links and reading the function symbols one can construct a term from a node in the term graph. By making the storage locations explicit it is straightforward for example to represent the fact that two program variables are stored in the same location so that changing the value of one variable will also change the value of the other. There is no way to change the function symbol assigned to a node in the term graph but the children of the node can be changed. This corresponds to the fact that it is hard to change the name of an array in a conventional imperative language but one can easily assign values to elements of the array. One can change the value of a program variable by an assignment statement, which can cause the variable to point to a different storage location, but this will not affect the values of other variables stored at the original location. However, changing the children of the node will affect the values of all variables stored at that node.
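The aliasing behavior described above can be sketched as follows, using a hypothetical Node class and a plain dictionary as the environment (an illustration of the semantics, not ITGL's actual implementation): two program variables bound to the same node see changes to that node's children, while rebinding one variable leaves the other untouched.

```python
# Sketch of environments mapping program variables to term graph nodes.
class Node:
    def __init__(self, symbol, children=None):
        self.symbol = symbol
        self.children = children or []

zero, one = Node("0"), Node("1")
pair = Node("pair", [zero, zero])

env = {"x": pair, "y": pair}   # x and y are stored at the same node

# Changing a child of the shared node changes the term seen by both variables.
env["x"].children[0] = one
assert env["y"].children[0].symbol == "1"

# An assignment statement rebinds x to a different node; y is unaffected.
env["x"] = Node("pair", [zero, one])
env["x"].children[1] = zero
assert env["y"].children[1].symbol == "0"
```

Note that the function symbol of a node is never reassigned, matching the restriction in the text that only a node's children can be changed.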
The constructs one wants to have in imperative languages include assignment statements, conditional statements, iterative statements, procedure definitions, and procedure calls. Arithmetic expressions and other functional expressions are also essential. One also wants to have arrays and to be able to replace an element of an array. In ITGL arrays are terms and are treated like any other terms, and assignments to an element of an array are implemented as replacing a top-level subterm of a term in a term graph.
The formal semantics of ITGL is presented assuming termination, by defining the denotational semantics of this language and proving properties of it by induction on the length of the computation. This paper only considers the terminating case, but it should be possible to extend the semantics to nontermination using least fixpoint semantics and complete partial orderings.
In this language parameters are essentially passed by reference. Also, there are no global variables, so that all variables in a procedure have to be passed in as parameters or locally assigned.
Formalisms such as Hoare logics and abstract interpretations that enable proofs of correctness in other imperative languages should also work for this language. In general, a simple syntax and semantics facilitates the proofs of correctness of programs. In this regard the simplicity of the syntax and semantics of ITGL is an advantage. Simplicity also makes it easier to construct correct translations into other languages. A complicated language will necessitate a complicated translation. The fact that ITGL does not have interrupts or input-output also decreases its complexity. The simplicity of ITGL also means that the language is not likely to change in any essential way, so that a program that runs in ITGL should still run 50 years from now. This simplicity also makes the language easier to learn. Finally, the simplicity, together with an imperative style, should enable efficient implementations of the language. Of course, the simplicity may also restrict the flexibility of the language.
1.3. Imperative Languages Versus Functional Languages
Why was ITGL designed as an imperative language and not a functional language? The main reason was that defining it this way seemed to be simpler than defining it using term rewriting.
Both kinds of languages have their advantages. The typical machine architecture has a state which consists of the contents of memory and the contents of various registers, and instructions change the state of the machine. This architecture is most closely related to the imperative language design, which may imply that imperative languages can be more readily mapped to conventional machines and may have an efficiency advantage. Even with architectures with massive parallelism such as GPUs, one basically has a collection of von Neumann machines operating together. But pure functional languages have more mathematical elegance because their execution mechanism involves replacing terms by equal terms.
Both kinds of languages use arrays, which are needed for efficiency because arbitrary elements of the array can be accessed quickly. In ITGL arrays are considered as terms and thus inherit the semantics of terms which differs in some respects from the typical semantics of arrays. This is an interesting issue mathematically but we believe it will not have much impact on practical programs. In general, imperative languages with side effects do not need to copy arrays when an array element is changed. Pure functional languages need to copy arrays when an element of the array is changed, unless the array references are single threaded. By single threaded we mean that after an array is modified, the old versions of it will never again be referenced. This use of the term may be non-standard.
If array references are not single threaded, then no formalism can have efficient array operations and also avoid side effects. If array references are not single threaded, then there may be references to old versions of the array, so it will be necessary to copy the array in order to keep all of these versions available. It doesn’t matter whether one is using functional programming, monads [Wad95], continuation style parameter passing, or an imperative style; the same constraint still applies. Because pure functional programming avoids side effects, it cannot avoid array copying if the array references are not single threaded.
In ITGL, as in other imperative languages, array assignments are destructive which means that the array is modified in place and all references to the array change. In particular, in ITGL, if an array assignment is done inside a procedure, the term graph is changed and this change persists when the procedure is exited. In a pure functional language such side effects are not allowed because one can only replace expressions by equivalent expressions. This means that when replacing an element of an array in a pure functional language, one must copy the whole array if there are multiple references to it. This can cause considerable inefficiency. A straightforward copying of an array will cost linear time and storage. This copying can be made more efficient by storing the array as a binary tree, but there is still a logarithmic overhead in the worst case for each array assignment compared to imperative languages if the array is copied. If an array is stored as a binary tree, then one can construct a binary tree representing a copy of the array with one element changed by reusing much of the structure of the old array. This takes time proportional to the logarithm of the size of the array.
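The logarithmic-overhead copying mentioned above can be sketched with a hypothetical persistent array stored as a binary tree: updating one element copies only the path from the root to that element, so the old and new versions share all untouched subtrees. (This is an illustration of the technique, not part of ITGL.)

```python
# Persistent array as a binary tree: leaf = value, internal node = (left, right).
def build(values):
    """Build a balanced tree over a nonempty list of values."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return (build(values[:mid]), build(values[mid:]))

def get(tree, i, size):
    """Read element i of a tree covering `size` elements."""
    if size == 1:
        return tree
    half = size // 2
    return get(tree[0], i, half) if i < half else get(tree[1], i - half, size - half)

def update(tree, i, value, size):
    """Return a new tree; only O(log size) nodes are rebuilt."""
    if size == 1:
        return value
    half = size // 2
    if i < half:
        return (update(tree[0], i, value, half), tree[1])
    return (tree[0], update(tree[1], i - half, value, size - half))

old = build([10, 20, 30, 40])
new = update(old, 2, 99, 4)
assert get(old, 2, 4) == 30    # the old version is still intact
assert get(new, 2, 4) == 99
assert new[0] is old[0]        # the left subtree is shared, not copied
```

A destructive imperative update would instead modify the element in place in constant time, invalidating the old version; the tree scheme pays a logarithmic factor to keep both versions alive.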
It is possible to implement arrays efficiently in a pure functional language using monads if references to the arrays are single threaded. This issue is discussed in a paper by Wadler [Wad95]. For example, one can use state monads. A state monad [Mog91] operates on pairs (v, S) where v is a value and S is a state, and the state can be an array, but state monads are not as familiar to many programmers as conventional imperative arrays are.
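The state-monad idea can be sketched in Python (the helper names `unit`, `bind`, `get_elem`, and `set_elem` are hypothetical, chosen for this illustration): a stateful computation is a function from a state S to a pair (v, S), and `bind` threads the state through so that user code never mentions it explicitly.

```python
# Sketch of a state monad: a computation is a function state -> (value, state).
def unit(v):
    """Wrap a plain value as a computation that leaves the state alone."""
    return lambda s: (v, s)

def bind(m, f):
    """Run m, then feed its value to f, threading the state through."""
    def run(s):
        v, s2 = m(s)
        return f(v)(s2)
    return run

def set_elem(i, v):
    """Replace element i; the state (a tuple 'array') is rebuilt functionally."""
    return lambda s: (None, s[:i] + (v,) + s[i + 1:])

def get_elem(i):
    return lambda s: (s[i], s)

prog = bind(set_elem(1, 42), lambda _: get_elem(1))
value, final_state = prog((0, 0, 0))
assert value == 42 and final_state == (0, 42, 0)
```

If the compiler can verify that the state is single threaded, `set_elem` can safely be implemented as an in-place destructive update, which is how monadic arrays recover imperative efficiency.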
Similarly, functional languages are not as popular for applications or in academia. Perhaps the conventional model of instructions changing the state of a machine is easier for people to understand.
Continuation style programming [SS75] is another programming style, but it is unnatural for many programmers and is not popular in academia and industry. It may also be more prone to error.
1.4. Side Effects
ITGL is an imperative language. Imperative languages have side effects but these are not allowed in pure functional languages. What are the advantages of side effects? Are they necessary?
First, exactly what is a side effect? We will say that it is a change in the value of a global variable (or the state of a file or device), caused by a procedure or function but outside of its scope. In an imperative language this often happens when modifying an element of an array without choosing a new name for the modified array.
With side effects it is possible for two calls to f(x) in sequence to return different results, for example, if f is a counter that modifies the value of x. In a pure functional language two calls to f(x) in the same context should always return the same value. Also, in an imperative language it is easy to write an iterative loop to assign a value to each element of an array, while this may require recursion in a functional language, and even then there is a side effect unless a new name is chosen for the modified array each time. Sorting algorithms typically move elements around in an array, which is a side effect of the code if the array is not continually renamed. However, continually renaming the array requires passing it around from one procedure call to another repeatedly. Memoization (remembering a value so that it does not have to be recomputed) involves a side effect. This can happen, for example, in the computation of the Fibonacci sequence.
In a typical imperative language, to compute the nth element of the Fibonacci sequence one would have something like x[n] := f(n - 1) + f(n - 2) in the body of f, where the array x holds the values computed and x[n] is the nth Fibonacci number. The calls to f modify the array x without choosing a new name for it, so the change to the array is a side effect. In a pure functional language x could not be modified; the new value of the array x would have to be given a new name and passed back from the recursive calls to f. Then the updated x would have to be passed on to other calls to f. Returning the new value of the array each time and passing it on to the next call would make the code more complicated. This would also require copying the array unless the array were referenced in a single threaded manner. Of course, in a functional language one could compute the Fibonacci sequence efficiently by writing a function to return both F(n) and F(n - 1) at the same time, and using a simple recursion.
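The contrast can be sketched in Python (illustrative only; ITGL itself is not shown): the first version memoizes by mutating a global table, a side effect, while the second is pure and returns a pair instead of mutating anything.

```python
# Imperative style: filling in the global table x is the side effect.
x = {0: 0, 1: 1}

def fib_imperative(n):
    if n not in x:
        x[n] = fib_imperative(n - 1) + fib_imperative(n - 2)
    return x[n]

# Pure style: no mutation; each call returns the pair (F(n-1), F(n)).
def fib_pair(n):
    if n == 0:
        return (1, 0)        # (F(-1), F(0)), with F(-1) = 1
    a, b = fib_pair(n - 1)
    return (b, a + b)

assert fib_imperative(10) == 55
assert fib_pair(10) == (34, 55)
```

Both run in linear time, but the pure version achieves this by restructuring the recursion rather than by remembering previously computed values.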
Updating a database also has a side effect, unless one wants to pass back the entire modified and renamed database when making an update and pass the modified database on to others who may want to use it. In a concurrent system this may not even be possible.
Also, in graph algorithms it is often necessary to mark nodes concerning their status, whether they have been visited or not. The graph is typically stored in an array, so these updates involve assigning to an array element. This is a side effect which is acceptable in an imperative language but can only be avoided in a pure functional language by continually passing around the most recent version of the array. That would make the functions in the graph algorithm return the array along with possibly other information, making them more complicated. Making functions more complicated might impact correctness proofs. One could make the return of the array value somehow implicit in a functional language, but then different functions might operate on different arrays, so what they depend on and modify would have to be specified in some manner. It might also be difficult to interface functions that operate on different arrays. One can make this all uniform in a functional language by making all functions depend on and return the values of all global variables. But this is implicitly what imperative languages do, so then one has essentially made a functional language into an imperative language.
Explicitly passing an array or a state between functions in a functional language need not incur an extra cost, if the array or state is accessed in a single threaded manner; the array or state can simply be modified in place and renamed. However, to maintain correctness this approach requires a method to ensure that the old versions of the array or state are never accessed, increasing the burden on the programmer or compiler.
Pure functional languages have alternate means to perform many of the things that imperative languages do, and to do them efficiently, but many programmers find the imperative style of programming to be more natural. Another possibility is to have functional languages with some imperative features. This section was partially based on some queries to the AI system on the Google search engine in September, 2025.
1.5. What Is an Algorithm?
The ITGL language is related to the question, what is an algorithm? What do we mean by a particular algorithm such as insertion sort? Clearly one can write an insertion sort in many languages, so it’s not the same as a particular program. It’s also not the same as an algorithm in the sense of something that can be computed by a Turing machine, because there are many Turing computations that do not implement insertion sort. It has something to do with a specification of an input-output relation of a program, but it is more than that because an algorithm in the conventional sense also specifies how the computation proceeds (such as in Euclid’s algorithm for the greatest common divisor). Maybe it has something to do with a particular form of a proof of the final input-output relation using Hoare’s axioms. This question has been considered by various authors in the literature [BDG09, BFG+12, BG03a, BG03b, Gur00, Hil15, Mos01, Var12]. Algorithm texts frequently use pseudo-code to specify algorithms but without a precise semantics.
We consider algorithms whose input and output are abstract mathematical objects such as graphs, trees, or integers. This definition includes such algorithms as maximum flow, traveling salesman, and greatest common divisor algorithms which map abstract mathematical objects into other mathematical objects. This definition leaves out many algorithms such as memory management algorithms. We want to specify a language that can be used to give an abstract definition of algorithms on abstract mathematical objects, something like the pseudo-code in algorithm texts, but with a precise semantics. Because of its simple syntax and semantics and imperative style, ITGL is one candidate for such a language to be used to specify algorithms. In algorithm texts, algorithms are typically presented in an imperative style, so it is convenient to adopt an imperative formalism for describing abstract algorithms. We hope that such an abstract definition of an algorithm can be translated into particular programs in typical imperative programming languages. For this it helps to have similar programming constructs in the abstract language and common imperative languages to make the translation simple and natural. Then one can verify the abstract algorithm once and if the translations are correct, one will obtain verified algorithms in many typical programming languages. Translating ITGL into typical imperative languages should be facilitated because both ITGL and the target language are imperative and because of the simplicity of the semantics of ITGL. However, the use of ITGL for abstract definition of algorithms and the translation of such algorithms into typical imperative programming languages are not further discussed in this paper.
An insertion sort program and a greatest common divisor program in ITGL are given as examples of abstract specifications of an algorithm. The ITGL language has some features that are not needed for abstract specification of algorithms and so these would not have to be used in such specifications. For example, the rewrite rule facility and the caching mechanism would not be needed.
The question arises whether ITGL is the only language that can be used for abstractly specifying algorithms. In fact, it is not the only language, and may not even be the best language for this purpose. However, because algorithms are typically written in an imperative style, it is natural that a language for specifying algorithms should be imperative, with program variables, assignment statements, conditional statements, iterative statements, arrays with side effects, procedure definitions, and procedure calls. Also, it should be possible to implement the language efficiently, and the language should have a simple syntax and semantics. Because of its characteristics, ITGL is a good candidate for such a language.
1.6. Organization of This Paper
This paper is organized as follows. First the syntax of the language is presented, then a definition of the denotational semantics. The language has several different kinds of constructs: Imperative statements, functional expressions, and three kinds of procedures: imperative procedures, rewrite procedures, and compiled procedures. States in the language consist of an environment and a term graph. Functional expressions return a node in the term graph and a state. A garbage collection procedure for the term graph is not specified, and because of this one can prove that the size of the term graph never decreases. A notion of variable binding is defined and it is shown that all constructs preserve variable binding, namely, the environment maps program variables to nodes in the term graph. If a couple of kinds of statements in the language are not used then the statements and procedures are term dependent in the sense that terms in the resulting state are functions of the terms in the input state. If all kinds of statements in the language are used then the language is not term dependent but it is still isomorphism dependent.
Quite a bit of space is devoted to the issue of caching in the term graph: When can all copies of the same term be stored at the same node and when must they be stored in different nodes? Caching means that if two or more data structures are identical then only one representative needs to be stored, with multiple references to it. This saves space but one might want to have separate copies of a data structure in case one does not want destructive operations on one copy of a structure to affect the other copies. A systematic way of dealing with this issue is presented.
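The caching idea can be sketched as hash-consing, using a hypothetical `make_node` constructor (not ITGL's actual mechanism): structurally identical constructor terms are stored once and shared, while a term intended for destructive update would bypass the cache and get its own private node.

```python
# Sketch of caching (hash-consing) for constructor terms.
_cache = {}

def make_node(symbol, children=()):
    """Return the unique shared node for this symbol and tuple of children."""
    key = (symbol, tuple(id(c) for c in children))
    if key not in _cache:
        _cache[key] = (symbol, tuple(children))
    return _cache[key]

nil = make_node("NIL")
l1 = make_node("cons", [make_node("a"), nil])
l2 = make_node("cons", [make_node("a"), nil])
assert l1 is l2                            # identical terms share one node

private = ("cons", (make_node("a"), nil))  # uncached private copy
assert private is not l1
```

The space saving comes for free, but as the text notes, sharing makes destructive updates visible through every reference, which is exactly why some copies must deliberately be kept separate.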
We give a simple facility for input-output, but this is not the main focus of the language.
1.7. Previous Work
As for previous work in this area, the paper “Term Graph Rewriting” by Barendregt et al. [BvEG+87] relates graph rewriting to term rewriting and gives a correctness result. Their concept of a term graph is essentially identical to ours. It allows cycles and does not require that all nodes be reachable from a root node. They also have a notion of isomorphism between term graphs, but it is not the same as our notion. Their graphs are not restricted to constructors and may contain variables. The paper “Towards an Intermediate Language based on Graph Rewriting” by Barendregt et al. [BEG+87] gives a language Lean for specifying computations in terms of graph rewriting. The graphs they use are similar to ours, in that they contain no variables, may have cycles, and there is no requirement that all nodes are reachable from a root. Their graphs are used in a way similar to our term graph. A Lean program consists of a set of rewrite rules. The language has functions similar to our compiled procedures. The paper “Graph rewriting: An algebraic and logic approach” [COU90] considers graph rewrite rules and graph grammars and their relation to category theory. The formalism is more general than that of term graphs and term graph rewriting. “Graph rewriting: A bibliographical guide” by Courcelle [Cou95] gives a survey of various notions of graph rewriting. The paper “Jungle evaluation” [HKP88] considers jungles, which are acyclic hypergraphs in which it is not necessary for all the nodes to be reachable from the root. Such structures are manipulated by jungle rewrite rules, which generalize term rewrite rules. Such rules can handle non-left-linear rewrite rules and permit sharing of structure on the right-hand side of a rule to speed up term rewriting. The paper “Term Graph Rewriting” by Klop [Klo96] considers many different related systems. He considers term graphs as systems of recursion equations that may essentially involve cycles, and also discusses the lambda calculus. He considers operations that involve replacing a variable by its definition. The paper “Term Graph Rewriting” by Plump [Plu99] requires the term graphs to be acyclic. This paper is mainly concerned with the relation between term graphs and term rewriting. He formalizes term graphs using hyperedges which include a node and its children. Also, all nodes must be reachable from the root. The graphs are not restricted to constructors and may contain variables. Nodes may have more than one incoming edge. He also discusses isomorphisms. The paper “Rewriting on cyclic structures: Equivalence between the operational and the categorical description” [CG99] considers only acyclic rewrite rules but possibly cyclic term graphs, and relates their definition of term graph rewriting to category theory. They consider directed graph representations in which there can be multiple references to shared structures. The paper “Addressed Term Rewriting Systems: Syntax, Semantics, and Pragmatics” [DLLL05] makes global addresses explicit, and these correspond to memory addresses in an implementation. Their memory addresses are similar to the nodes in ITGL. In “Modeling Pointer Redirection as Cyclic Term-graph Rewriting” [DEP07] the authors discuss pointers in cyclic term graphs using the double pushout approach. Their term graphs include structures such as circular lists and doubly linked lists. Their proposal focuses on pointer rewriting. “Drag Rewriting” by Nachum Dershowitz, Jean-Pierre Jouannaud, and Fernando Orejas [DJO24] considers a very general model of graph rewriting. Drags permit the formalization of non-linear rewrite rules in graph rewriting, which has been a difficult issue to date. Drags are finite directed rooted labeled ordered multi-graphs. From the paper: “Drags, the alternative model considered here, are arbitrary directed graphs equipped with roots and sprouts that facilitate composition. Internal vertices, as in term graphs, are labeled by function symbols having a fixed arity. Sprouts are non-internal vertices without successors, labeled by variables. Our framework is able to faithfully encode term rewriting without restriction, hence finally solving this old question positively.” The paper “An algebraic definition for control structures” [Cou80] considers rational expressions, which are expressions denoting regular trees. These are the possibly infinite trees having a finite number of subtrees, and they are typically obtained by the unraveling of possibly cyclic finite graphs.
The paper “Transforming imperative programs into bisimilar logically constrained term rewrite systems via injective functions from configurations to terms” [NKK25] presents a translation from a simple imperative language into a logically constrained term rewriting system (LCTRS) and proves its correctness. An LCTRS is basically a conditional term rewriting system in which the condition is a formula from some underlying logical theory. The condition is interpreted using the underlying theory. This is a pleasant formalism and avoids some of the complexities of conditional term rewriting, in which the condition is evaluated by rewriting. A paper by Plaisted [Pla13] gives an abstract discussion of the idea of an imperative language for describing algorithms, such that these algorithms can then be translated into various imperative style programming languages. This paper does not consider or define any specific language. An earlier paper by Plaisted [Pla05] extends this proposal to define an abstract language and then translate programs into various common imperative languages. It considers a system for proving that such programs are correct. This paper also considers nonterminating computations using least fixpoints and denotational semantics. Some papers by Barnett and Plaisted [BP18, BP21, PB20] describe a rewrite-based imperative style language with arrays and side effects that also uses a term graph as the basic data structure. These papers are defined using orthogonal term rewriting systems and make use of term graphs with destructive array operations. The application of this language to abstract descriptions of algorithms is also mentioned. However, this system seems more complicated than the system proposed in the current paper. In his thesis [Tao23] Tao Tao describes a rewriting-based imperative style language with side effects and its implementation on a field-programmable gate array.
2. Syntax
T(F, X) is the set of terms over a set F of symbols and a denumerable set X of program variables. Terms may be infinite.
The letters r, s, and t denote terms (also called functional expressions).
The letters f, g, and h denote function symbols.
The letters x, y, and z are used for program variables and logical variables.
The letters i, j, k, m, and n denote program variables that are intended to have integer values and in general denote integers as usual.
Elements of F can be constructors or defined function symbols.
The functions top, arg, and arity are abstract functions on terms and also are function symbols in the language that operate on term graphs. So we write atop, aarg, and aarity for the abstract functions and top, arg, and arity for the corresponding function symbols in the language.
If t is a term then atop(t) is its top level function symbol and aarg(t, i) is the i-th argument of t. So regarding f(x, y) as a term, atop(f(x, y)) = f, aarg(f(x, y), 1) = x, and aarg(f(x, y), 2) = y. The arity of a function symbol is the number of arguments it takes, so aarity(f) = 2 for this example.
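The abstract functions can be illustrated on a simple nested-tuple encoding of terms (a hypothetical representation chosen for this sketch, not the one used in ITGL's semantics): a term is either a constant symbol or a pair of a symbol and an argument tuple.

```python
# Sketch of atop, aarg, and aarity on a nested-tuple term representation.
def atop(t):
    """Top-level function symbol of a term."""
    return t[0] if isinstance(t, tuple) else t

def aarg(t, i):
    """The i-th argument of a term (1-indexed, as in the text)."""
    return t[1][i - 1]

def aarity_of(t):
    """Number of arguments of the term's top symbol."""
    return len(t[1]) if isinstance(t, tuple) else 0

t = ("f", ("x", "y"))          # the term f(x, y)
assert atop(t) == "f"
assert aarg(t, 1) == "x" and aarg(t, 2) == "y"
assert aarity_of(t) == 2
```

In the text aarity is applied to the function symbol itself; here `aarity_of` takes the whole term, which is equivalent for terms whose top symbol has a fixed arity.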
The statements and expressions in this language consist of procedure definitions, functional expressions which are terms, and imperative statements which can be assignment statements, conditional statements, and iterative statements. We give some idea of the semantics of some of the syntactic constructs while presenting the syntax.
2.1. Programs
A program is a term (functional expression) followed by a double semicolon, followed by some number of procedure definitions, each followed by a double semicolon. There is a period at the end of the program. It is intended that the evaluation of the functional expression or of any procedure in the program will only cause execution of procedures in the program.
If the programmer forgets to give a procedure definition for a function symbol, then that symbol will be considered as a constructor.
2.2. Syntax of Functional Expressions (Terms)
A term (functional expression) is a function symbol followed by a list of terms (functional expressions). A program variable (a variable) is also a functional expression.
2.3. Constructors
A constructor is a function symbol that is not defined in an imperative procedure definition or in a rewrite rule procedure definition or defined as a compiled function. If a function symbol is intended to be an imperative procedure but the procedure definition is forgotten, that symbol will be considered as a constructor. Among the constructors, there are constant symbols for undefined variables and for nontermination. Though we do not explicitly consider nontermination, we reserve a symbol for it. This symbol might be useful for defining a least fixed point semantics. There is also another constant symbol for error handling. There are also special symbols that are technically constructors but which may be evaluated in different ways than other constructors.
Among the special symbols there is a function symbol arg such that, roughly speaking, arg(t, i) evaluates to the i-th argument of a term t of the form f(t1, ..., tn), where f is a constructor. (More precisely, t evaluates to a variable that is bound to a constructor term of the form f(t1, ..., tn).) There is a function symbol top such that, roughly speaking, top(t) for a term t of the form f(t1, ..., tn) evaluates to a constant corresponding to the constructor f. For distinct constructors f and g, the corresponding constants are distinct. Also, roughly speaking, arity applied to such a constant evaluates to n for a constructor f of arity n. A more precise semantics for these symbols is given later. There is also an equality function symbol equal.top which on evaluation tests if the top symbols of two constructor terms are the same. There is a function symbol equal.node whose value on evaluation depends on the state and will be explained later. The function symbol “copy” can only appear in a term of the form copy(x) where x is a program variable. So a term copy(x) cannot appear as a proper subterm of another term. Integers and Booleans are considered as constructors and as individual constants.
2.4. Arity Declarations
Arities are declared by how a function symbol is used, so if f(s, t) appears in the program then f has arity two. It is an error to use the same function symbol with two different arities. We assume f_n is a function symbol with arity n, distinct from f and from f_m for any m different from n. So f_n(t1, ..., tn) is a valid term. More precisely, f_t is a permitted notation, where t evaluates to an integer. This permits essentially arrays of a dimension that is specified by the input. If t does not evaluate to a nonnegative integer then this is an error. One can write f_t(s) to indicate a term where all the arguments of f_t are the term s, evaluated one at a time from left to right. A term g(t1, ..., tm) is in error if g does not already have an arity specified by a previous appearance in a term, unless g is of the form f_s for some function symbol f and term s. We assume there is a way to declare function symbols and their arities when necessary.
2.5. Procedure Definitions
A procedure definition can be an imperative function (procedure) definition, a rewrite rule procedure definition, or a compiled function (procedure) definition.
2.6. Syntax of Imperative Function Definitions
An imperative function definition is a function symbol followed by a list of formal parameters (program variables), followed by an imperative statement, a functional expression, and a double semicolon, as follows:

procedure f (x_1, …, x_n) {P} t ;;

Here f is a defined function symbol, the x_i are formal parameters, P is an imperative statement, and t is a functional expression giving the value returned. Formal parameters of procedures are program variables and must be distinct. Formal parameters can be assigned into but their modified values will not be transferred back to the calling point. Program variables that are not formal parameters can appear in imperative statements and functional expressions in a procedure definition.
2.7. Rewrite Rule Procedure Definitions
A rewrite rule procedure definition is of the form

rewrite f { r_1 → s_1 … r_k → s_k } ;;

where f is a defined function symbol and the r_i and s_i are terms. The expressions r_i → s_i are called rewrite rules. The r_i must be left linear (each variable occurs at most once). Also, all variables in s_i must appear also in r_i. The terms r_i must be of the form f(t_1, …, t_n) where the t_j are finite constructor terms that may include variables. The terms s_i must be finite (non cyclic). The left linearity restriction makes the implementation more efficient because it is not necessary to test for the equality of two terms when mapping formal to actual parameters.
2.8. Compiled Function Definitions
A compiled function definition is of the form

compiled f (x_1, …, x_n) → y where A ;;

Here f is a defined function symbol, the x_i are program variables, and y is a program variable. Also A is an assertion in some logical theory giving the required relation between the x_i and the output y; for example, A could be y = x_1 + x_2 where x_1 and x_2 are integers. It is required that for all finite constructor terms t_1, …, t_n there is at most one constructor term t such that A, with the t_i substituted for the x_i and t substituted for y, is a theorem of the logical theory.

The instances of such a compiled function definition are expressions of the form f(t_1, …, t_n) → t where the t_i and t are finite ground (variable free) terms composed of constructors and the assertion A holds, that is, A with the t_i substituted for the x_i and t substituted for y is a theorem of the appropriate logical theory. There may be infinitely many such instances.
The intention is that compiled functions would be implemented in some other language. For example, addition could be defined in this language using sequences of zeroes and ones to represent binary numbers, but it would be cumbersome. A compiled function for addition could take two constant symbols representing integers and compute their sum as another constant symbol more efficiently than this.
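A compiled function for addition of this kind can be sketched as follows. The sketch is hypothetical (the function name and the error encoding are assumptions); it implements only the relation y = x_1 + x_2 from the defining assertion, returning the unique y when one exists.

```python
# Hypothetical sketch of a compiled addition function: two integer
# constructor constants map to their sum constant, outside the language.

def compiled_add(x1, x2):
    """Return the unique y with y = x1 + x2, or an error value."""
    if not (isinstance(x1, int) and isinstance(x2, int)):
        return "error"    # no unique y satisfies the assertion
    return x1 + x2        # the unique y with y = x1 + x2

print(compiled_add(2, 3))       # 5
print(compiled_add(2, "NIL"))   # error
```

This is far more efficient than representing binary numbers as constructor terms and defining addition by rewrite rules.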
2.9. Imperative Statements
Imperative statements and programs are indicated by the letters P and Q. An imperative statement can be an assignment statement, a conditional statement, an iterative statement, or a concatenation of two imperative statements. Semicolon indicates concatenation.
An assignment statement can be a simple assignment statement of the form x ← t where x is a program variable and t is a functional expression (a term), a multiple assignment statement of the form (x_1, …, x_n) ← t where the x_i are program variables, an argument replacement statement of the form x[i] ← t where x is a program variable, or a copy statement of the form x ← copy(y) where x and y are program variables. A copy statement is syntactically a special case of a simple assignment statement. The copy function symbol can only appear in a statement of the form x ← copy(y) for program variables x and y.
2.10. Iterative Statements
An iterative statement can be a while statement or a for statement. A while statement is of the form

while r do {P}

where r is a functional expression and P is an imperative statement. A "for" statement has the form

for i = m step k until n do {P}

where P is an imperative statement and i, m, k, and n are program variables.
2.11. Conditional Statements
A conditional statement has the form

if t then {P_1} else {P_2}

where t is a functional expression and P_1 and P_2 are imperative statements. The else part may be omitted.
2.12. Sequence
If P and Q are imperative statements then so is P; Q.
2.13. Grammar
We now give a grammar in the extended context free style for the language.
<program> ::= <fcnl expr> ;; <proc def>*.
<fcnl expr> ::= <fcn symbol><fcnl expr>* |<vbl>
<fcn symbol> ::= <defined symbol>|<constructor>
<proc def> ::= <imperative proc def>|<rewr proc def>|<compiled fcn def>
<imper stmt> ::= <assg stmt>|<cond stmt>|<iterative stmt>|<imper stmt> ; <imper stmt>
<assg stmt> ::= <vbl>←<fcnl expr>| (<vbl>*) ←<fcnl expr>|<vbl>[<int>] ←<fcnl expr>|<vbl>← copy (<vbl>) %distinct vbls in <vbl>*%
<imperative proc def> ::= procedure <defined symbol> (<vbl>*) {<imper stmt>}<fcnl expr> ;; %distinct vbls in <vbl>*%
<cond stmt> ::= if <fcnl expr> then {<imper stmt>} else {<imper stmt>}
<cond stmt> ::= if <fcnl expr> then {<imper stmt>}
<iterative stmt> ::= while <fcnl expr> do {<imper stmt>}| for <vbl> = <vbl> step <vbl> until <vbl> do {<imper stmt>}
<rewr proc def> ::= rewrite <defined symbol>{<rewrite rule>* } ;;
<rewrite rule> ::= <fcnl expr>→<fcnl expr> %rules are left linear, all variables on rhs appear in lhs, all fcnl exprs have the procedure name at the top level%
<compiled fcn def> ::= compiled <defined symbol>(<vbl>*) →<vbl> where <logical expr> ;;
%vbls in <vbl>* distinct, logical expr may have free variables in <vbl>* but no other free variables%
3. Semantics
In specifying the semantics of the language, it is necessary to distinguish between symbols that are a part of the language, and other auxiliary mathematical functions that are used to define the semantics of the language.
3.1. Term Graphs
The language makes use of term graphs to store data. Term graphs are indicated by the letters G and H. A term graph is a set of triples (node, function symbol, list of children nodes) where there is at most one triple for any node. If G is a term graph then the set of nodes of G is the set of nodes appearing in the first position of its triples. Nodes can be indicated by the letters u and v. The symbol f is the label of node v if a triple (v, f, (v_1, …, v_n)) is in G, and f must be a constructor symbol. The list (v_1, …, v_n) is the list of children of v. Now nodes(G) is the set of nodes in G, label(u, G) is the label of node u in G, arity(u, G) is the number of children of u and args(u, G) is the list of children. Also arg(i, u, G) is the node which is the ith child of u. These functions are easy to confuse with similar functions on terms such as atop, aarity, and aarg.
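A term graph of this kind can be sketched directly. This is a hypothetical model (the dict encoding, node names, and function signatures are assumptions for the sketch), including a cyclic node of the sort that represents an infinite term.

```python
# Hypothetical sketch: a term graph as a dict from node to
# (symbol, children tuple), with the access functions of Section 3.1.

G = {
    "v": ("cons", ("u", "v")),   # cyclic: v represents cons(a, cons(a, ...))
    "u": ("a", ()),
}

def label(u, G):  return G[u][0]
def args(u, G):   return G[u][1]
def arity(u, G):  return len(G[u][1])
def arg(i, u, G): return G[u][1][i - 1]   # children are 1-indexed

print(label("v", G))   # cons
print(arity("v", G))   # 2
print(arg(2, "v", G))  # v  (the cycle)
```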
Given a term graph G and a vertex v in G, define term(v, G) to be the possibly infinite term defined by:

If (v, f, (v_1, …, v_n)) is in G then term(v, G) is f(term(v_1, G), …, term(v_n, G)), so that atop(term(v, G)) = f and aarg(i, term(v, G)) = term(v_i, G). In this formalism term(v, G) does not contain any variables. Therefore, for all v in G, term(v, G) is a ground constructor term, possibly infinite if G has cycles. It is possible that two distinct nodes u and v satisfy term(u, G) ≡ term(v, G) where ≡ means syntactic identity. Because terms are used also as arrays, it is possible to have more than one copy of an array with the same function symbol f.

For example, if G = {(v, f, (u, u)), (u, a, ())} then term(v, G) = f(a, a). If G = {(v, f, (u, w)), (u, a, ()), (w, a, ())} then term(v, G) = f(a, a) as well, although u and w are distinct nodes. If G = {(v, cons, (u, v)), (u, a, ())} then term(v, G) is the infinite term cons(a, cons(a, …)).
The symbols ⊥ (undefined) and error are constructors, so these can be the labels of nodes in a term graph. We assume that all term graphs have a node with label ⊥.
Define Replace(u, i, w, G) as G with the triple (u, f, (v_1, …, v_i, …, v_n)) (for node u in G) replaced by (u, f, (v_1, …, v_{i-1}, w, v_{i+1}, …, v_n)), where w is a node in G. This operation is used to define assignment to an element of an array (argument replacement). If u is not a node of G or i is not between 1 and arity(u, G) then Replace(u, i, w, G) = G.

Suppose H = Replace(u, i, w, G). Then term(u, H) is term(u, G) with the ith argument replaced by term(w, H), provided u is not reachable from any of its other children v_j, j ≠ i.
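The Replace operation can be sketched as follows. This is a hypothetical model (dict encoding and names are assumptions); note that Replace never changes the set of nodes, only the children list of one node.

```python
# Hypothetical sketch of Replace(u, i, w, G) on a dict-encoded term graph.

def replace(u, i, w, G):
    if u not in G or w not in G:
        return G                     # error cases leave G unchanged
    f, kids = G[u]
    if not 1 <= i <= len(kids):
        return G
    H = dict(G)                      # the set of nodes never changes
    H[u] = (f, kids[:i - 1] + (w,) + kids[i:])
    return H

G = {"u": ("f", ("a", "b")), "a": ("0", ()), "b": ("1", ())}
H = replace("u", 2, "a", G)
print(H["u"])   # ('f', ('a', 'a'))
print(G["u"])   # ('f', ('a', 'b'))  -- the original graph is unchanged
```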
There are also functions make.term, new.term and cache.term that are auxiliary functions used to define the semantics of the language. Define make.term(f, (v_1, …, v_n), G) to be a pair (v, G′) where v is a node and G′ is G together with the triple (v, f, (v_1, …, v_n)). Depending on the implementation, the node v can be a new node, not present in G, or else it can be an existing node v of G if (v, f, (v_1, …, v_n)) already is an element of G. Define new.term(f, (v_1, …, v_n), G) to be a pair (v, G′) where v is a new node not present in G and G′ is G together with the triple (v, f, (v_1, …, v_n)). The operation new.term is needed for an explicit copy operation. A related operation cache.term will be described later. The functions make.term and new.term are the only functions that can change the set of nodes in the term graph and they can only do this by adding a node to the set of nodes. This implies that the set of nodes in the term graph never gets smaller and may only continually get larger as a program executes.
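The difference between the two allocation operations can be sketched as follows. This is one possible implementation strategy, not the paper's (node naming and the reuse scan are assumptions): new.term always allocates a fresh node, while make.term may reuse an existing node carrying the same entry.

```python
# Hypothetical sketch of new.term and make.term on a dict-encoded graph.
import itertools
fresh = itertools.count()

def new_term(f, kids, G):
    v = f"n{next(fresh)}"            # a node not present in G
    H = dict(G); H[v] = (f, tuple(kids))
    return v, H

def make_term(f, kids, G):
    for v, entry in G.items():       # may reuse an identical existing node
        if entry == (f, tuple(kids)):
            return v, G
    return new_term(f, kids, G)

G = {"a": ("0", ())}
v1, G = make_term("s", ("a",), G)
v2, G = make_term("s", ("a",), G)
print(v1 == v2)   # True: shared structure
print(len(G))     # 2: nodes are only ever added, never removed
```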
It would be possible to do garbage collection on the term graph, which could cause the set of nodes in the term graph to get smaller.
3.2. Environments
An environment is a function from program variables to nodes in a term graph G. It is required that undefined variables map to a node in G with label ⊥. Environments map from all of the denumerably infinite set of program variables to nodes in G. Only finitely many of them will be defined and these are the only ones that need to be explicitly stored.
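Such a total-but-finitely-stored environment can be sketched as a finite map with a default. This is a hypothetical model (class name and the undefined-node name are assumptions).

```python
# Hypothetical sketch: an environment as a finite map with a default.
# Undefined variables all map to the single node labeled with the
# undefined value, so the function is total but finitely stored.

UNDEF_NODE = "v_undef"   # the node labeled with the undefined value

class Env:
    def __init__(self):
        self.binding = {}            # only defined variables stored
    def __call__(self, x):
        return self.binding.get(x, UNDEF_NODE)
    def bind(self, x, v):
        self.binding[x] = v

e = Env()
e.bind("x", "n3")
print(e("x"))   # n3
print(e("y"))   # v_undef
```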
3.3. States
A state consists of a term graph G and an environment e that maps program variables to nodes in G. States are indicated by S and T. Also, env(S) is the environment of a state S and graph(S) is the term graph of S, so that env(S) = e and graph(S) = G. The function nodes is extended to states by nodes(S) = nodes(graph(S)).

To show that a pair (G, e) is a state it has to be shown that e maps program variables to nodes in G. A pair in which, for some variable x, e(x) is not a node of G is said to be non variable binding; otherwise it is variable binding. All but finitely many variables will be undefined at any time and so they will map to the node in G with label ⊥.

The function “term” is extended to states by term(x, S) = term(env(S)(x), graph(S)) for program variables x. However, we often prefer the more complex notation term(env(S)(x), graph(S)) because it makes clear exactly how the term of a program variable is computed.
Sometimes in specifying the semantics it is necessary to modify the environment of a state; this can be done using the function fix.env, where fix.env(e, S) is defined to be the state with term graph graph(S) and environment e. The function fix.env is an auxiliary function and is not in the syntax of the language.

We say that two environments e and e′ conflict if e(x) ≠ e′(x) for some variable x.

Program variables x have a term value and a node value in a state S. If S is (G, e) then the term value of x in S is term(x, S), which is term(e(x), G), and the node value is e(x).

The function equal.node (a node equality test) tests whether the nodes to which its two arguments evaluate are the same. This function is in the syntax of the language.
3.4. Undefined Variables
Now e(x) = v for some v with label ⊥ for undefined variables x. Formal parameters have initial values as specified by the call to the procedure. Also the undefined environment is defined to be the environment e such that for all program variables x, e(x) is the node with label ⊥. A more detailed semantics for procedures is given later.

We don’t specify how ⊥ and error are handled.
3.5. Denotational Semantics (Assuming Termination)
We define the denotational semantics [[P]] of statements P in the language by induction on the length of a computation. We also do proofs of properties of the language by a similar induction. Note that the notation for the denotational semantics used here differs from that in the book [NK14] that includes Isabelle proofs of its results. However, that language does not have arrays or procedures.
Our semantics uses innermost left to right evaluation for functional expressions (terms).
First we give an informal description of the denotational semantics, then a more formal description.
If t is a functional expression (a term) then semantically it maps from states to (node,state) pairs. During the execution (evaluation) of a functional expression, the state can be modified by evaluating functions in the term and by creating structure in the term graph to store the result computed by the term. Other structure besides this may be added to the graph during the evaluation of the expression. The Replace operation also modifies existing structure in the term graph. In this way the functions (procedures) in the term can modify the term graph.
The original environment is restored after the evaluation of a functional expression, though other environments may be created for other functions during the evaluation.
(states → nodes × states) is the type of the semantics of functional expressions t; the environment does not change, and if (v, T) = [[t]](S) then v is a node in the graph of state T.

If P is an imperative statement then [[P]] maps from states to states. The environment and graph may change. (states → states) is the type of the semantics of imperative statements P.
3.6. Length of a Computation Sequence
We define the length of a computation sequence and use it to do proofs by induction of properties of the computation, assuming termination. To prove that something is true of a statement P on a state S we assume that it is true for all pairs that are smaller than (P, S) in the ordering and show that it is true for (P, S). Assuming termination, this shows that the property is true of all pairs (P, S).

We use the notation ℓ[P, S] for the length of the computation sequence of a statement or expression P operating on a state S. ℓ[P, S] is a nonnegative integer value that decreases with each computation in a terminating computation. We define ℓ[P, S] along with the semantics for each kind of statement P. For auxiliary functions used to define the semantics we can define ℓ[E] for an expression E as the length of the computation it represents. Hopefully the context will make this clear. Another way to formalize this is not to require ℓ to have integer values but just abstractly to put a partial ordering on the pairs (P, S), where (P′, S′) < (P, S) if the computation of (P′, S′) is a strict subcomputation of the computation for (P, S). Then we can say that the computation for (P, S) is terminating if there are no infinite descending sequences starting from the initial pair (P, S). This works by König’s Lemma because each (P, S) spawns only finitely many other pairs (P′, S′) such that (P′, S′) < (P, S). The advantage of the partial ordering formalism is that it makes sense even for infinite computations and possibly can help to define a denotational semantics in that case. However, defining ℓ[P, S] as an integer value is simpler for terminating computations.
3.7. Semantics of Imperative Statements
Now we define the semantics of imperative statements.
3.7.1. Assignment Statements (Note This Includes Simple Assignment Statements, Multiple Assignment Statements, Copy Statements, and Argument Replacements)
For an environment e define the updated environment e[x ↦ v] by e[x ↦ v](x) = v and e[x ↦ v](y) = e(y) for y ≠ x.

(Simple assignment statement)

[[x ← t]](S) = fix.env(env(S)[x ↦ v], T) where (v, T) = [[t]](S).

ℓ[x ← t, S] = ℓ[t, S] + 1.

For the partial order version, (t, S) < (x ← t, S).

At the level of an assignment statement the environment of T is the same as the environment of S by Theorem 2 which will be proved later. However env(S) has to be modified to reflect the new assignment to x.

Now the sequence x ← f(t_1, …, t_n); y ← f(t_1, …, t_n) makes two different arrays unless the term is in the set to be discussed later. But x ← f(t_1, …, t_n); y ← x just makes one array pointed to by both x and y. In the former case an argument replacement statement y[i] ← s (defined below) will not change the term of x, but in the latter case it will.
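The aliasing distinction can be sketched directly on a dict-encoded term graph. The sketch is hypothetical (node names and the simplified replace function are assumptions): building the same term twice gives two arrays, while y ← x shares one node, so an argument replacement through y affects x only in the shared case.

```python
# Hypothetical sketch of the aliasing discussion above.

def replace(u, i, w, G):             # simplified Replace of Section 3.1
    f, kids = G[u]
    H = dict(G); H[u] = (f, kids[:i - 1] + (w,) + kids[i:])
    return H

# x <- f(0); y <- f(0): two distinct nodes with identical terms
G = {"n1": ("f", ("z",)), "n2": ("f", ("z",)), "z": ("0", ()), "o": ("1", ())}
env = {"x": "n1", "y": "n2"}
H = replace(env["y"], 1, "o", G)
print(H[env["x"]])    # ('f', ('z',)): x is unchanged

# x <- f(0); y <- x: one node shared by both variables
env2 = {"x": "n1", "y": "n1"}
H2 = replace(env2["y"], 1, "o", G)
print(H2[env2["x"]])  # ('f', ('o',)): x changed too
```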
3.7.2. Semantics of (x_1, …, x_n) ← t (Multiple Assignment Statement)

[[(x_1, …, x_n) ← t]](S) = fix.env(env(S)[x_1 ↦ arg(1, v, graph(T))] … [x_n ↦ arg(n, v, graph(T))], T) where (v, T) = [[t]](S).

ℓ[(x_1, …, x_n) ← t, S] = ℓ[t, S] + 1.

For the partial order version, (t, S) < ((x_1, …, x_n) ← t, S).

The semantics of arg will be defined later. There is no need to define a tupling operation because one can use f(t_1, …, t_n) for that where f is a constructor, and then use the arg function to extract the t_i. The arity function can be used to determine the number of arguments of f.
3.7.3. Copy Statement
If (v, T) = [[y]](S) then

[[x ← copy(y)]](S) = fix.env(env(S)[x ↦ v′], T′)

for some v′ and T′ such that env(T′) = env(T) and

(v′, graph(T′)) = new.term(label(v, graph(T)), args(v, graph(T)), graph(T)).

No partial order statement is needed here.
In the copy statement, the variable x is assigned a node distinct from v but having an identical term.
3.7.4. Argument Replacement
This statement is useful for implementing arrays efficiently.
Let (v, T) = [[t]](S) and suppose env(S)(A) = u. Let T be (G′, e′) and let H = Replace(u, i, v, G′). Then [[A[i] ← t]](S) = (H, e′). Here A is a program variable.

We informally speak of this statement as replacing the ith argument of term(A, S) with term(v, T). We say that the argument replacement operation is of type (term(A, S), i, term(v, T)).

ℓ[A[i] ← t, S] = ℓ[t, S] + 1. For the partial order version, (t, S) < (A[i] ← t, S).
3.7.5. Conditional Statements
[[if t then {P_1} else {P_2}]](S) =

if term(v, T) = ⊥ then T else

if term(v, T) = true then [[P_1]](T) else [[P_2]](T)

where (v, T) = [[t]](S).

ℓ[if t then {P_1} else {P_2}, S] = ℓ[t, S] + (if term(v, T) = ⊥ then 0 else if term(v, T) = true then ℓ[P_1, T] else ℓ[P_2, T]) + 1, where (v, T) = [[t]](S).

For the partial order version, (t, S) < (if t then {P_1} else {P_2}, S); also if term(v, T) = true then (P_1, T) < (if t then {P_1} else {P_2}, S), and if term(v, T) ≠ true then (P_2, T) < (if t then {P_1} else {P_2}, S). From now on the partial order version will be omitted.
In a conditional statement, first t is evaluated. If it returns undefined then the state that results from the evaluation of t is returned and neither nor is executed. If t returns true then is executed else is executed.
If the second part of the statement is omitted the semantics is simpler.
[[if t then {P}]](S) =

if term(v, T) = ⊥ then T else

if term(v, T) = true then [[P]](T) else T

where (v, T) = [[t]](S).

ℓ[if t then {P}, S]: similar to ℓ[if t then {P_1} else {P_2}, S].
3.7.6. Iterative Statements
[[while t do {P}]](S) =

if term(v, T) = ⊥ then T else

if term(v, T) ≠ true then T else [[while t do {P}]]([[P]](T))

where (v, T) = [[t]](S).

ℓ[while t do {P}, S] = ℓ[t, S] + (if term(v, T) ≠ true then 0 else ℓ[P, T] + ℓ[while t do {P}, [[P]](T)]) + 1, where (v, T) = [[t]](S).

Now term(v, T) is essentially the value resulting from evaluating t in state S and T is the state resulting from the execution of t. If t evaluates to undefined or anything other than true, the loop exits in state T. Otherwise the loop iterates on the state resulting from the execution of P in state T.
for i = m step k until n do {P}: Equivalent to the following:

i ← m;

while i ≤ n do {P; i ← i + k}
One might define until in a similar style.
3.7.7. Sequence of Statements
For two imperative statements P and Q define [[P; Q]](S) as [[Q]]([[P]](S)).

ℓ[P; Q, S] = ℓ[P, S] + ℓ[Q, [[P]](S)] + 1.
3.8. Nontermination
To handle nontermination it would be necessary to make use of the symbol ⊥ and possibly some form of denotational semantics using complete partially ordered sets. Note that no imperative statement changes the set of nodes in the term graph.
3.9. Evaluating Functional Expressions
For functional expressions in general, we assume leftmost innermost evaluation, so that when a term is evaluated, the top-level subterms will evaluate to nodes of G that represent ground constructor terms. Computing [[P]](S), where P is a functional expression and S is a state with environment e and term graph G, is done as follows.

If P is a variable x then [[x]](S) = (e(x), S) and ℓ[x, S] = 1.

If P is a term f(t_1, …, t_n) where the t_i are terms, then let (v_1, S_1) = [[t_1]](S), let (v_2, S_2) = [[t_2]](S_1) where v_2 is a node in S_2, let (v_3, S_3) = [[t_3]](S_2), …, and let (v_n, S_n) = [[t_n]](S_{n-1}). All the S_i have the same environment and all the v_i are nodes of S_n. The environment is needed to evaluate the t_i because they can contain program variables from the calling procedure.

If any of the [[t_i]] fail to terminate then the whole expression does too and the value is ⊥; of course, it may not be possible to compute this value because of nontermination. We do not specify how ⊥ or error are handled if some [[t_i]] returns them.

Then [[f(t_1, …, t_n)]](S) = finish.function(f(v_1, …, v_n), S_n) where the v_i and S_n are as above and finish.function remains to be defined. So finish.function assumes the arguments of f have been evaluated left to right to obtain the nodes v_1, …, v_n in the term graph and to obtain the state S_n. Then finish.function applies f to these nodes and to the resulting state S_n in a way that depends on what kind of a function f is. Finish.function returns a (node, state) pair. The notation f(v_1, …, v_n) is convenient but the nodes v_i cannot actually appear as arguments to f in a functional expression in the language. The nodes v_i may appear as children of a node with label f eventually in the term graph. Also ℓ[f(t_1, …, t_n), S] = ℓ[t_1, S] + … + ℓ[t_n, S_{n-1}] + ℓ[finish.function(f(v_1, …, v_n), S_n)]. We want to compute finish.function(f(v_1, …, v_n), S), where S is the state S_n with env(S) = e, for various kinds of function symbols f. First it is necessary to define some auxiliary functions for rewriting.
None of these functional expressions directly change the term graph.
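The left-to-right, state-threading shape of this evaluation scheme can be sketched as follows. The sketch is hypothetical (the term encoding and the toy finish_function are assumptions): arguments are evaluated leftmost innermost, each passing its resulting state to the next, and the collected nodes are handed to a dispatcher.

```python
# Hypothetical sketch of Section 3.9: evaluating f(t1, ..., tn).

def evaluate(term, env, state, finish_function):
    """term is a variable name (str) or (f, subterms...)."""
    if isinstance(term, str):        # a program variable
        return env[term], state
    f, subterms = term[0], term[1:]
    nodes = []
    for t in subterms:               # leftmost innermost, threading state
        v, state = evaluate(t, env, state, finish_function)
        nodes.append(v)
    return finish_function(f, nodes, state)

# A toy finish_function: every symbol is a constructor; the "state" is
# just a graph dict, and a fresh node is allocated for the result.
def ff(f, nodes, G):
    v = f"n{len(G)}"
    H = dict(G); H[v] = (f, tuple(nodes))
    return v, H

v, G = evaluate(("pair", "x", "x"), {"x": "n0"}, {"n0": ("a", ())}, ff)
print(G[v])   # ('pair', ('n0', 'n0'))
```

In the language itself finish_function also dispatches on rewriting definitions, compiled functions, procedures, and the special symbols, as described in the following sections.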
3.10. Rewriting Semantics
The use of term graphs makes pattern matching convenient. Therefore this language provides for pattern matching in a rewriting facility. The rewriting facility is not needed for Turing equivalence. The rewriting facility does one rewrite at the top level and then does recursive evaluation. The subterms are first evaluated left to right to normal forms. Recall that all normal forms are constructor terms; even if an implicit type error occurs, the normal form is error, which is also a constructor. This language does not have an explicit type system. This paper does not have a formal discussion of term rewriting systems. There is an extensive literature on this topic; for example, see [BN98, DJ90, DP01]. The evaluation of functions defined by rewriting is performed by the function graph.rewrite defined as follows. How this relates to finish.function is explained below.
Suppose the function symbol f is defined by rewriting in a statement of the form

rewrite f { r_1 → s_1 … r_k → s_k } ;;

in which the syntactic restrictions given earlier are obeyed. In particular the top symbols of the r_i must all be f, the r_i are linear, and all variables in s_i appear also in r_i.
Let E represent the expression f(v_1, …, v_n). Then graph.rewrite(E, S) returns a (node, state) pair and is defined as

if match.list(r_1, E, graph(S)) succeeds then [[s_1]](fix.env(match.list(r_1, E, graph(S)), S)) else if

match.list(r_2, E, graph(S)) succeeds then [[s_2]](fix.env(match.list(r_2, E, graph(S)), S)) else … else if

match.list(r_k, E, graph(S)) succeeds then [[s_k]](fix.env(match.list(r_k, E, graph(S)), S)) else the pair (v, S) where v is the node labeled error.
In this last case there is an implicit type error.
Also as for the termination measure, ℓ[graph.rewrite(E, S)] =

if match.list(r_1, E, graph(S)) succeeds then 1 + ℓ[s_1, fix.env(match.list(r_1, E, graph(S)), S)] else

if match.list(r_2, E, graph(S)) succeeds then 1 + ℓ[s_2, fix.env(match.list(r_2, E, graph(S)), S)] else … else

if match.list(r_k, E, graph(S)) succeeds then 1 + ℓ[s_k, fix.env(match.list(r_k, E, graph(S)), S)] else 1.
Match.list returns an environment (a substitution) and is defined as follows, assuming r is f(r_1, …, r_n) and E is g(u_1, …, u_m):

match.list(r, E, G) =

if f ≠ g then fail else if n = m then

(if match(r_1, u_1, G) = fail then fail

else if match(r_2, u_2, G) = fail then fail else …

else if match(r_n, u_n, G) = fail then fail

else the union match(r_1, u_1, G) ∪ … ∪ match(r_n, u_n, G)).

match(r, v, G) is defined as follows:

if r is a variable x then the environment binding x to v [i.e. match(r, v, G) = {x ↦ v}] else match.list(r, f(u_1, …, u_m), G) where (v, f, (u_1, …, u_m)) is in G.
None of these functions directly change the term graph. However evaluating [[s_i]](fix.env(match.list(r_i, E, G), S)) may change the term graph.

The basic idea for functions defined by a sequence of rewrite rules is to extract the terms term(v_i, G) from v_1, …, v_n, do a top level rewrite on the term f(term(v_1, G), …, term(v_n, G)) using the first applicable rewrite rule for f from the list r_1 → s_1, …, r_k → s_k, and evaluate the resulting term recursively. The v_i are obtained from the terms t_i as indicated above by evaluating the t_i left to right. Here we are not concerned with confluence issues. The function graph.rewrite performs this rewrite and makes use of match.list to get a substitution (an environment) matching the left hand side r of a rewrite rule to f(term(v_1, G), …, term(v_n, G)). In evaluating match.list, r cannot be a variable because the left hand side of a rewrite rule cannot be a variable. The function match.list does not modify G and does not make use of the current environment e. In the routine match.list we make use of the routine match to match a term r to a node v in G. Here r is a subterm of the left-hand side of a rewrite rule. This routine match is similar to match.list but also permits the term r to be a variable. These definitions assume that r is left-linear so that the variables in the different r_i are distinct. Because of left linearity there should be no conflicts in the union of environments in match or match.list. Match (and therefore match.list) terminates even if G has cycles because we assume r is a finite cycle-free term (rewrite rules are finite).
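The matching routine can be sketched as follows. The sketch is hypothetical (the "?"-prefix convention for pattern variables and the None-for-fail encoding are assumptions): a finite left-linear pattern is matched against a node of a dict-encoded term graph, producing a substitution.

```python
# Hypothetical sketch of match/match.list on a dict-encoded term graph.

def match(r, v, G):
    """r: pattern, a variable "?x" or (f, subpatterns...); v: node of G."""
    if isinstance(r, str):
        return {r: v}                # the variable case
    f, kids = G[v]
    if r[0] != f or len(r) - 1 != len(kids):
        return None                  # fail
    subst = {}
    for ri, vi in zip(r[1:], kids):
        m = match(ri, vi, G)
        if m is None:
            return None
        subst.update(m)              # no conflicts, by left linearity
    return subst

G = {"v": ("cons", ("u", "w")), "u": ("a", ()), "w": ("NIL", ())}
print(match(("cons", "?h", "?t"), "v", G))   # {'?h': 'u', '?t': 'w'}
print(match(("NIL",), "v", G))               # None
```

Because the pattern is finite and cycle-free, the recursion terminates even when the graph has cycles.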
3.11. Evaluation of Finish.Function
The evaluation of finish.function depends on what kind of a function is being specified.
3.11.1. Compiled Functions
If the function symbol f appears in a definition of the form

compiled f (x_1, …, x_n) → y where A ;;

then as mentioned earlier the instances of such a compiled function are expressions of the form f(t_1, …, t_n) → t where the t_i and t are finite ground (variable free) constructor terms and the assertion A holds, that is, A with the t_i substituted for the x_i and t substituted for y is a theorem of the appropriate logical theory. These instances can be considered as ground rewrite rules. Then the semantics are the same as if the compiled function were defined by the possibly infinite list of such rewrite rules. Alternatively, suppose t_i = term(v_i, graph(S)). If the t_i are all finite constructor terms and there is one and only one constructor term t such that A with the t_i substituted for the x_i and t substituted for y is a theorem of the relevant logical theory, then finish.function(f(v_1, …, v_n), S) evaluates the term t in S, returning a node representing t together with the resulting state. Otherwise, finish.function(f(v_1, …, v_n), S) returns the node labeled error together with S. Also as for the termination measure, ℓ[finish.function(f(v_1, …, v_n), S)] = 1.
For all non-logical function symbols in the underlying theory there is a corresponding compiled function. For example, there is a compiled function

compiled +(x_1, x_2) → y where y = x_1 + x_2 ;;

The occurrence of + on the left is considered as an infix defined symbol in ITGL and the occurrence on the right is considered as a symbol of the underlying theory.
3.11.2. Constructors
If f is a constructor then finish.function(f(v_1, …, v_n), S) = (v, T) where (v, graph(T)) = make.term(f, (v_1, …, v_n), graph(S)) (as defined earlier) and ℓ[finish.function(f(v_1, …, v_n), S)] = 1. Thus env(T) = e.
This is the place where the set of nodes in the term graph increases. This can happen in the function make.term, new.term, or cache.term.
3.11.3. Rewriting
If f is defined by rewriting then finish.function(f(v_1, …, v_n), S) = graph.rewrite(f(v_1, …, v_n), S). Note that the environment is not needed for graph.rewrite because the v_i all represent ground terms and the variable case was covered above. So here also the environment is unchanged and ℓ[finish.function(f(v_1, …, v_n), S)] = ℓ[graph.rewrite(f(v_1, …, v_n), S)] as given earlier.
3.11.4. Procedures
Suppose f is an imperative procedure defined by procedure f (x_1, …, x_n) {B} t ;; where the x_i are the formal parameters.

Then finish.function(f(v_1, …, v_n), S) is defined as follows, where e = env(S): Let e_1 be defined by e_1(x_i) = v_i for the formal parameters x_i and e_1(x) = the node labeled ⊥ for other program variables x. So no other values can be passed in to the procedure except by the formal parameters. Therefore free variables in the procedure do not pass any values in.

Let S_1 be fix.env(e_1, S).

Let S_2 be [[B]](S_1) and note that B may change both the environment and the term graph.

Let (v, S_3) be [[t]](S_2) and recall that [[t]] does not change the environment.

Then finish.function(f(v_1, …, v_n), S) = (v, fix.env(e, S_3)).

Later we will show that fix.env(e, S_3) is variable binding because e only refers to vertices in graph(S_3).

Thus in this case also the environment of the resulting state is e, so if program variables are changed during the evaluation of B they will be restored. Also ℓ[finish.function(f(v_1, …, v_n), S)] = ℓ[B, S_1] + ℓ[t, S_2] + 1.

None of these functions directly modify G, but the evaluation of B and t and of the arguments in finish.function may modify G.
3.11.5. Special Symbols
We now define some special defined function symbols. For these the constants true, false, ⊥, error, and integers i are considered as constructor functions with arity zero. Although top, equal.top, equal.node, arg, and arity are technically constructors, they will never appear in the term graph because during their evaluation by finish.function they are removed. The function symbol copy also cannot appear in the term graph though it is technically a constructor.
finish.function(top(v), S) = finish.function(c_f, S), where c_f is the constant naming f, if label(v, graph(S)) = f and f is not an individual constant.

If f is an individual constant and label(v, graph(S)) = f then finish.function(top(v), S) = finish.function(f, S).

finish.function(equal.top(v_1, v_2), S) = if (label(v_1, graph(S)) = label(v_2, graph(S))) then finish.function(true, S) else finish.function(false, S).

finish.function(equal.node(v_1, v_2), S) = if (v_1 = v_2) then finish.function(true, S) else finish.function(false, S).

finish.function(arg(v, u), S) = (arg(i, v, graph(S)), S) if label(u, graph(S)) is an integer i with 1 ≤ i ≤ arity(v, graph(S)).

finish.function(arg(v, u), S) = finish.function(error, S) if label(u, graph(S)) is not an integer i with 1 ≤ i ≤ arity(v, graph(S)).

finish.function(arity(v), S) = finish.function(n, S) if arity(v, graph(S)) = n. We are assuming that integers are individual constants here.

In all these cases ℓ[finish.function] = 1.
3.11.6. Array Initialization
It turns out that in an array initialization the repeated argument term need not return the same value on each evaluation, even though the environment is restored after each functional expression. This is because of the effect of argument replacement.
Consider this sequence:
procedure d
procedure
Suppose g and h are constructors (because they are not defined). Then when d is called it will call g. The first argument of g will return which will be arg or 1. The second argument of g is but now x is bound to because of the argument replacement statement in procedure f so the second argument of g will evaluate to 2. If instead were called its arguments would evaluate to 1, 2, and 3, respectively.
5. Proofs of Properties
5.1. Term Graph Only Gets Larger
We now show that in the execution of a program the term graph only increases in size. The basic idea is that the only functions that directly change the set of nodes in the term graph are make.term and new.term, which are called by finish.function and by statements of the form x ← copy(y).
Theorem 1 (The Subset Property). If P is a functional expression and (v, T) = [[P]](S) then nodes(S) ⊆ nodes(T). Also, if P is an imperative statement and T = [[P]](S) then nodes(S) ⊆ nodes(T). Recall that nodes(S) is defined as nodes(graph(S)).
Proof. We can assume by induction that both parts of the theorem are true on pairs smaller than (P, S) in the ordering and show they are true for the pair (P, S). We give a few examples.

Suppose P is the simple assignment statement x ← t where t is a functional expression. Then [[P]](S) = fix.env(env(S)[x ↦ v], T) where (v, T) = [[t]](S). By induction, nodes(S) ⊆ nodes(T). Because fix.env does not change the set of nodes, nodes(fix.env(env(S)[x ↦ v], T)) = nodes(T). Therefore nodes(S) ⊆ nodes([[P]](S)).

The proof for argument replacement is similar. Suppose P is A[i] ← t. Let (v, T) = [[t]](S). Then by the definition of argument replacement, if T is (G′, e′) then [[P]](S) = (Replace(u, i, v, G′), e′) where u = env(S)(A). Now nodes(Replace(u, i, v, G′)) = nodes(G′) by definition of Replace, and nodes(T) = nodes(G′) by definition of nodes on states, so nodes([[P]](S)) = nodes(T), and reasoning as above nodes(S) ⊆ nodes(T), so nodes(S) ⊆ nodes([[P]](S)).

Suppose P is the copy statement x ← copy(y). Let (v, T) = [[y]](S); then [[P]](S) = fix.env(env(S)[x ↦ v′], T′) for some v′ and T′ such that (v′, graph(T′)) = new.term(label(v, graph(T)), args(v, graph(T)), graph(T)). Now new.term returns a pair in which v′ is a new node not present in graph(T) and graph(T′) is graph(T) with one triple added. So nodes(T) ⊆ nodes(T′). Since nodes(S) ⊆ nodes(T), nodes(S) ⊆ nodes([[P]](S)).

In general for imperative statements P and Q, if T = [[P; Q]](S) then T = [[Q]](U) where U = [[P]](S). Assuming by induction that nodes(S) ⊆ nodes(U) and nodes(U) ⊆ nodes(T) one obtains that nodes(S) ⊆ nodes(T). Many imperative statements can be expressed as compositions in this way so that the subset property can be proved by induction.
As for functional statements, let (v, T) = [[f(t_1, …, t_n)]](S) = finish.function(f(v_1, …, v_n), S_n) where the v_i and S_i are as in Section 3.9 and obtained by evaluation of t_1 through t_n in order. It follows by induction on properties of the [[t_i]] that nodes(S) ⊆ nodes(S_n). We want to show that nodes(S) ⊆ nodes(T). For this it suffices to show that nodes(S_n) ⊆ nodes(T) where (v, T) = finish.function(f(v_1, …, v_n), S_n).
If f is a constructor then finish.function will call make.term which may add a node to nodes or leave nodes unchanged. In either case the desired result follows.
If f is a procedure definition then it can be expressed as a composition of simpler statements for which the subset property can be assumed by induction. By simple reasoning one obtains nodes(S_n) ⊆ nodes(T).
If f is a procedure defined by rewriting then in the evaluation a rewrite rule is chosen resulting in a functional expression to evaluate and the result can be assumed by induction.
If f is a compiled function then finish.function evaluates some functional expression t for which the desired subset result can be assumed by induction.
This completes the proof sketch. □
5.2. Functional Expressions Do Not Change the Environment
Theorem 2. If P is a functional expression and (v, T) = [[P]](S) then env(T) = env(S).
Proof. The only place where the environment is directly changed is in a simple assignment statement, a copy statement, and on entering an imperative procedure. Assignment statements and copy statements only occur in imperative procedures. At the end of an imperative procedure the original environment is restored. □
5.3. Variable Binding
Definition 1.
An imperative statement P preserves variable binding if the following is true: For all states S, if S is variable binding and T = [[P]](S) then T is also variable binding. Also a functional expression P preserves variable binding if the following is true: If S is variable binding and (v, T) = [[P]](S) then T is also variable binding.
Lemma 1. If P and Q are imperative statements that preserve variable binding then P; Q also preserves variable binding.

Proof. Let T = [[P; Q]](S) and U = [[P]](S), so that T = [[Q]](U). Suppose S is variable binding. Because P preserves variable binding, U is also variable binding. Because Q preserves variable binding, T is also variable binding. □
Theorem 3. If P is an imperative statement then P preserves variable binding. Also, if P is a functional expression, then P preserves variable binding.
Proof. Using the subset property. The environment may be changed in a simple assignment statement x ← t. However only the binding of x is changed and it is changed to a node v in the term graph: the evaluation of t returns the pair (v, T) with v in graph(T), so variable binding is preserved.

The environment may be changed in a copy statement x ← copy(y), but then x is bound to a node v′ where v′ is a new node in the term graph G′ such that term(v′, G′) = term(env(S)(y), G′). So because v′ is in G′ variable binding is preserved. The environment is changed on entering a procedure so that each formal parameter x_i is bound to the node v_i obtained by evaluating the corresponding actual parameter t_i. Evaluation of succeeding actual parameters only makes the set of nodes larger, so at the end of the evaluation of the actual parameters each v_i is still a node of the term graph, and variable binding is preserved. For an imperative procedure definition, the original environment is put back at the end. This preserves variable binding. In particular, we recall the semantics of an imperative procedure definition:
S is variable binding at the start of finish.function(f(v_1, …, v_n), S).

Let S_1 be fix.env(e_1, S).

Let S_2 be [[B]](S_1) and note that B may change both the environment and the term graph.

Let (v, S_3) be [[t]](S_2) and recall that [[t]] does not change the environment. Then finish.function(f(v_1, …, v_n), S) = (v, fix.env(e, S_3)).

Now we want to show that the final state is variable binding. First, fix.env(e_1, S) is variable binding because e_1 maps the formal parameters to nodes of the term graph at the calling point. By induction we can assume S_2 is variable binding and then again by induction S_3 is variable binding and v is in the nodes of S_3. Now nodes(S) ⊆ nodes(S_3) by repeated applications of the subset property and induction. Because S is variable binding and nodes(S) ⊆ nodes(S_3), fix.env(e, S_3) is also variable binding and v is in the nodes of fix.env(e, S_3). □
The fact that ITGL preserves variable binding means that there will be no dangling pointers.
5.4. Semantics Depends Not Only on Terms
Consider this program:
procedure
. [This is part of the procedure definition and indicates the value to be returned]
Now if both occurrences are at the same node then this program will return one result when the procedure is called, and if the occurrences are at different nodes of the term graph then it will return another. The effect of an argument replacement operation depends not only on the terms but on how they are stored. We next discuss more fully when the result of a program depends only on the terms, and, if not, what else it depends on.
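To make the phenomenon concrete, here is a minimal sketch (the Python class, function names, and example terms are illustrative assumptions, not the paper's notation) of a term graph with a destructive argument replacement, showing how the result depends on whether two occurrences of a subterm share a node:

```python
class Node:
    def __init__(self, label, args=None):
        self.label = label      # constructor symbol
        self.args = args or []  # child nodes

def term(v):
    # Read back the (finite) term represented by node v as a string.
    if not v.args:
        return v.label
    return v.label + "(" + ", ".join(term(a) for a in v.args) + ")"

def replace_arg(v, i, w):
    # Argument replacement: destructively make w the i-th child of v.
    v.args[i] = w

# Both children of `shared` are the SAME node g, so replacing inside g
# changes both occurrences; in `unshared` they are distinct nodes.
g = Node("g", [Node("a")])
shared = Node("f", [g, g])
unshared = Node("f", [Node("g", [Node("a")]), Node("g", [Node("a")])])

replace_arg(shared.args[0], 0, Node("b"))
replace_arg(unshared.args[0], 0, Node("b"))

print(term(shared))    # f(g(b), g(b))
print(term(unshared))  # f(g(b), g(a))
```

The two graphs represent the same terms before the replacement, yet yield different terms afterwards, which is exactly the dependence on storage described above.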
5.5. Dependence on Terms if No Argument Replacement or Equal.Node Operations
For each imperative statement or functional expression P, its semantics gives a function of the state.
Under suitable assumptions including no argument replacement statements and no test for equality of nodes in a program we want to show the following:
Definition 2.
We say that the functional expression P and the function it denotes are term dependent if, for all states on which P terminates, the term of the returned node and the terms of the variables x in the resulting state are functions of the terms of the variables x in the starting state. More precisely, for any two starting states whose variables have equal terms, the two evaluations of P return nodes with equal terms and leave the variables with equal terms. This means that the value of the functional expression depends only on the terms of the variables in it.
Similarly, we say that the imperative statement P and the function it denotes are term dependent if, for all starting and resulting states, the terms of the variables x in the resulting state are functions of the terms of the variables x in the starting state.
We will show that this is true if there are no equal.node tests and no argument replacement statements in the program in which P appears. Also one has to be careful about make.term, new.term, and cache.term, as will be explained.
Recall that we extend the function term to states: the term of a variable in a state is the term of the node bound to it in the state's term graph.
Theorem 4. Suppose functional expression t is evaluated twice in a program P which is term dependent and contains no argument replacement statements. Suppose that the first evaluation is in a state S and the second is in a later state S′ with the same environment. Then the nodes returned by the two evaluations represent the same term.
Proof. By the subset property, the nodes of S are included among the nodes of S′. Also, for all nodes v of S, the children of v in S′ are the same as the children of v in S because the program contains no argument replacement statements. Therefore for all variables x, the term of x is the same in S and in S′. Then by induction on the term structure of t, using the term dependence property, both evaluations of t yield the same result, so the returned nodes represent the same term. □
This implies something about the evaluation of a term whose arguments are all the same term t, if the program is term dependent and contains no argument replacement statements. In particular, all evaluations of t will yield the same result: the successive evaluations of t yield states whose node sets are increasing by the subset property, so by the theorem all evaluations return nodes representing the same term. This essentially means that all arguments will be identical after evaluation of the arguments.
5.6. Isomorphism Equivalence Classes
Even if there are argument replacement statements or tests for equality of nodes, we can say something about how the output of a program depends on the input, using the idea of the isomorphism equivalence class of a state.
Definition 3.
An isomorphism between term graphs G and G′ is a one-to-one onto mapping ϕ from the nodes of G to the nodes of G′ such that for all v in G, v and ϕ(v) have the same label and the same arity, and if v₁, …, vₙ are the children of v in G then ϕ(v₁), …, ϕ(vₙ) are the children of ϕ(v) in G′. In this case we write G ≅ G′. Two graphs are isomorphic and in the same isomorphism equivalence class if there is an isomorphism between them.
Note that the composition of two isomorphisms is an isomorphism, the identity mapping is an isomorphism, and the inverse of an isomorphism is an isomorphism. The idea of isomorphism equivalence is that it specifies which terms and subterms are stored at the same node.
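For term graphs all of whose nodes are reachable from the environment, an isomorphism, if it exists, is forced by simultaneous traversal from the variable bindings. The following sketch (the dict-based graph representation and function name are assumptions of this sketch, not the paper's definitions) decides state isomorphism in this reachable case:

```python
def state_iso(env1, G1, env2, G2):
    # G1, G2: dict node -> (label, tuple of child nodes). Returns the node
    # mapping phi, or None if the states are not isomorphic.
    if set(env1) != set(env2):
        return None
    phi = {}
    def walk(v, w):
        if v in phi:
            return phi[v] == w          # sharing must correspond exactly
        f, kids = G1[v]
        g, kids2 = G2[w]
        if f != g or len(kids) != len(kids2):
            return False
        phi[v] = w                      # set before recursing: handles cycles
        return all(walk(a, b) for a, b in zip(kids, kids2))
    for x in env1:
        if not walk(env1[x], env2[x]):
            return None
    # phi must be a bijection on all nodes (all assumed reachable)
    if len(set(phi.values())) != len(phi) or len(phi) != len(G1) or len(G1) != len(G2):
        return None
    return phi

G1 = {0: ("c", ()), 1: ("f", (0, 0))}
G2 = {7: ("c", ()), 9: ("f", (7, 7))}
print(state_iso({"x": 1}, G1, {"x": 9}, G2))   # {1: 9, 0: 7}
```

A graph that stores the two occurrences of c at distinct nodes would be rejected, because the forced mapping would not be one-to-one, matching the definition above.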
Definition 4.
ϕ is an isomorphism from a state S to a state S′ if ϕ is an isomorphism from the term graph of S to that of S′ and also, for all variables x, ϕ maps the node bound to x in S to the node bound to x in S′. We write S ≅ S′. If there is an isomorphism from S to S′ then we say that the states S and S′ are isomorphic. Also, for a state S, the set of states that are isomorphic to it is called the isomorphism equivalence class of S.
Now, at any time all but finitely many program variables will be undefined. For undefined variables x the bindings in the two states are both undefined, and the isomorphism condition is taken to hold trivially for them.
Note that if ϕ is an isomorphism from S to S′ then for all program variables x, the term of x in S equals the term of x in S′.
Definition 5. ϕ is an isomorphism from a pair (v, S) to a pair (v′, S′) if ϕ(v) = v′ and also ϕ is an isomorphism from S to S′. We say that (v, S) and (v′, S′) are isomorphic. This implies that term(v) in S equals term(v′) in S′. Also, the set of (node, state) pairs that are isomorphic to (v, S) is called the isomorphism equivalence class of (v, S).
Definition 6. ϕ is an isomorphism from a pair (v, G) with v in G to (v′, G′) with v′ in G′ if ϕ is an isomorphism from G to G′ and also ϕ(v) = v′.
We will show that isomorphic states map to isomorphic states (by an imperative statement) and isomorphic states map to isomorphic (node, state) pairs (by a functional expression) even if there are equal.node tests and argument replacement statements in the program.
5.7. Isomorphism (Equivalence Class) Dependence
Definition 7.
A function F from states to states is equivalence class (e.c.) or isomorphism dependent if there is a function F* on isomorphism equivalence classes such that the isomorphism equivalence class of F(S) is F* applied to the isomorphism equivalence class of S. The same definition works for functions from states to (node, state) pairs, so this definition applies both to imperative statements and functional expressions. If the function denoted by an imperative statement or functional expression P is e.c. dependent then we say that P is e.c. or isomorphism dependent.
Suppose F is such a function for some P. If F is isomorphism dependent then the isomorphism class of the result of a program is a function only of the isomorphism class of the input, and doesn’t directly depend on the names of the nodes.
Why does this matter? It means that if one is only concerned about the isomorphism class of the output then one need only know the isomorphism class of the input. Also one can write a specification of the program that only specifies how the output class depends on the input class, which may simplify things.
Definition 8. Define S ≅ S′ to mean that there is an isomorphism from S to S′ (or from S′ to S). Recall that the states S and S′ are then said to be isomorphic. Also G ≅ G′ if there is an isomorphism from graph G to graph G′.
Also (v₁, …, vₙ, S) ≅ (w₁, …, wₙ, T) if there is an isomorphism from S to T that also maps vᵢ to wᵢ for all i. Similarly define this for graphs G and H.
Note that S ≅ T implies that the term of x in S equals the term of x in T for all program variables x.
Proposition 1. A function F from states to states is isomorphism dependent if for all states S and T, S ≅ T implies F(S) ≅ F(T). A function F from states to (node, state) pairs is isomorphism dependent if for all states S and T, S ≅ T implies F(S) ≅ F(T) as (node, state) pairs.
In fact we will show that if ϕ is an isomorphism from S to T then there is an isomorphism ϕ′ from F(S) to F(T) that extends ϕ, in the sense that if ϕ(v) = w for a node v in the term graph of S then v is also in the term graph of F(S) and ϕ′(v) = w also. We know that v will also be in the term graph of F(S) by Theorem 1.
Definition 9.
Such a function F satisfies the extension property if F is isomorphism dependent and in addition, for all states S and T, if ϕ is an isomorphism from S to T then there is an isomorphism from F(S) to F(T) which extends ϕ.
5.8. Term Dependence
Definition 10.
We can define the same notation for terms. S ≅ S′ in the sense of term equivalence means that for all variables x, the term of x in S equals the term of x in S′. Then S and S′ are said to be term equivalent. Also (v₁, …, vₙ, S) ≅ (w₁, …, wₙ, S′) in the sense of term dependence means that S ≅ S′ and that term(vᵢ) in S equals term(wᵢ) in S′ for all i. This applies in particular to (node, state) pairs that result from the execution of a functional expression. We can also define the term equivalence class of a state as the set of states that are term equivalent to it, and define term dependence by requiring a function F* such that the term equivalence class of F(S) is F* applied to the term equivalence class of S. This definition of term dependence also extends to (node, state) pairs as before.
Proposition 2. A function F from states to states is term dependent if for all states S and T, S ≅ T in the sense of term equivalence implies F(S) ≅ F(T). A function F from states to (node, state) pairs is term dependent if for all states S and T, S ≅ T implies F(S) ≅ F(T) as (node, state) pairs.
As before, if the function denoted by P is term dependent then we also say that P is term dependent. For two states S and S′, if S ≅ S′ in the sense of isomorphism then S ≅ S′ in the sense of term equivalence. Thus if two states are isomorphic then they are term equivalent. The converse is not true. However, term dependence does not imply isomorphism dependence. The functions make.term and cache.term are term dependent, but depending on the implementation they may not be isomorphism dependent. For example, if their implementation depends on the names of the nodes, then they may not be isomorphism dependent even if they are deterministic. Isomorphism dependence does not imply term dependence. The function equal.node and argument replacement statements are isomorphism dependent but not term dependent. We distinguish the two senses of ≅ by context, or with subscripts when necessary.
The proofs for isomorphism dependence and term dependence are generally so similar that we do both together in most cases. Of course the proof for term dependence assumes the absence of argument replacement statements and node equality tests. Both proofs are done by induction on the measure that defines the length of a terminating computation. That is, we assume that the theorems are true for all F and S whose computations have smaller measure, and prove that the theorem is true for F and S. For terminating computations, this shows that the theorem is true for all F and S.
5.9. Functional Expressions: Term and Isomorphism Dependence
All functions make use of finish.function so to show that functional expressions are term dependent or isomorphism dependent we first show that the prelude to finish.function is term dependent or isomorphism dependent, and then it only remains to show that finish.function itself is term or isomorphism dependent.
5.10. Finish.Function: Term and Isomorphism Dependence
Here are the details of the proof of term dependence for finish.function. For this we define finish.function in a different but equivalent manner to that given in Section 3.9. The basic idea is that pre.function computes and saves the nodes that represent the values of the terms which are arguments to a function symbol f. This is done in a left to right manner. Also the state S resulting from the computation of these terms is saved.
Definition 11.
Define the prelude pre.function to finish.function as follows. If t₁, …, tₙ are the argument terms, evaluate them from left to right, letting each evaluation return a node vᵢ together with a new state, starting each evaluation in the state produced by the previous one. Let S be the final state so produced. All the intermediate states have the same environment, and all the vᵢ are nodes of S. Then the value of pre.function is the tuple (v₁, …, vₙ, S).
Proposition 3. If (v₁, …, vₙ, S) = pre.function then the result of the evaluation is finish.function applied to this tuple, where the definition of finish.function depends on f.
Proof. By the discussion of finish.function in Section 3.9. □
Lemma 2. For now letting ≅ refer to term equivalence, if S ≅ S′ and (v₁, …, vₙ, T) = pre.function evaluated in S and (v′₁, …, v′ₙ, T′) = pre.function evaluated in S′, then (v₁, …, vₙ, T) ≅ (v′₁, …, v′ₙ, T′). Also, if ≅ is taken in the sense of isomorphism instead, then the isomorphism for each prefix of the tuple is an extension of the isomorphism for the previous prefix.
Proof. Assuming the evaluations of the tᵢ are term dependent, each successive evaluation starts in term equivalent states and so yields term equivalent results; putting all these together, the resulting tuples are term equivalent. A similar argument works for isomorphism dependence using the fact that the evaluations of the tᵢ are isomorphism dependent, which can be assumed by induction. Also by induction on the terms, and assuming that the evaluations satisfy the extension property, we can assume that each isomorphism is an extension of the last one, so the second part of the lemma holds as well. □
Corollary 1. Suppose that equivalent pre.function tuples yield equivalent results of finish.function. Then the evaluation of the function application is term dependent. Similarly for isomorphism dependence.
Proof. Suppose S ≅ S′, and let the two pre.function tuples be computed in S and S′. By the lemma, the tuples are equivalent. We are assuming that equivalent tuples yield equivalent results of finish.function, so the two results of finish.function are equivalent. By Proposition 3, these are the results of the evaluation, so the evaluation is term dependent. Similarly for isomorphism dependence. □
This corollary simplifies proofs of properties of functions in the language. This approach works for both term dependence and isomorphism dependence. These properties need to be shown separately for f being a constructor, a defined procedure, a compiled procedure, and a procedure defined by rewriting. Also it is necessary to look at special functions.
Corollary 2. Suppose that if ϕ is an isomorphism between pre.function tuples then there is an isomorphism between the corresponding results of finish.function that extends ϕ. Then the evaluation satisfies the extension property.
Proof. By the lemma and using the hypothesis about finish.function. □
6. Constructors: Isomorphism Dependence: Make.Term, New.Term, Cache.Term
6.1. Problems
Now if f is a constructor then, to compute a term with top symbol f, finish.function makes use of new.term, make.term, or cache.term. In particular, in this case finish.function returns the (node, state) pair produced by make.term, and make.term may call new.term or cache.term. Recall that the functions new.term, make.term, and cache.term are not in the language but are auxiliary functions used to define the semantics of the language. These functions do not cause problems for term dependence assuming no argument replacement or node equality tests, but finish.function may not be isomorphism dependent, depending on their implementation. So the problem here is the possible nondeterminism of these functions; different implementations starting from the same state may have different outcomes and result in states that are not isomorphism equivalent.
Recall the definition of new.term from Section 3.1, where the arguments are nodes, G is a term graph, and f is a constructor. For nodes in G and a constructor f, new.term is isomorphism dependent, but this is not always true for make.term and cache.term.
A problem with always calling new.term is that if all calls from finish.function for constructors are to new.term instead of make.term and cache.term, then this implementation can be highly storage inefficient, because each call creates a new node in the term graph, a node with n pointers if the label of the node has n children. We want to find a way to preserve isomorphism dependence with a more storage economical strategy. Garbage collection on the term graph would also help to reduce storage usage. Also, in a loop whose index goes from 1 to 10,000, all the integers from 1 to 10,000 will be stored as terms in the term graph. For each such integer i there will be a node in the term graph having label i. So some kind of garbage collection would be helpful. We do not consider this aspect of storage economization in this paper.
Now make.term returns a pair (v, G′) where v is a node labelled f with the given children and G′ is the resulting graph. Here f is a constructor. Depending on the implementation, the node v can be a new node, not present in G, or else it can be an existing node of G if a node with this label and these children is already an element of G. Also new.term returns a pair (v, G′) where v is a new node not present in G. Cache.term chooses v to be an existing node of G if such a node already exists in G; if more than one such node exists it chooses one of them arbitrarily.
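As a hedged sketch of these auxiliary functions (the dict-based graph representation, the fresh-name counter, and the `use_cache` flag are assumptions of this sketch, not the paper's definitions):

```python
import itertools

_fresh = itertools.count()   # illustrative source of fresh node names

def new_term(f, children, G):
    # Always allocate a fresh node labelled f with the given children.
    v = next(_fresh)
    G[v] = (f, tuple(children))
    return v, G

def cache_term(f, children, G):
    # Reuse an existing node with the same label and children if one
    # exists; otherwise behave like new_term.
    for v, (g, kids) in G.items():
        if g == f and kids == tuple(children):
            return v, G
    return new_term(f, children, G)

def make_term(f, children, G, use_cache=True):
    # Implementation-dependent choice between the two strategies.
    return cache_term(f, children, G) if use_cache else new_term(f, children, G)

G = {}
v1, G = cache_term("c", [], G)
v2, G = cache_term("c", [], G)   # reuses the existing node
w, G = new_term("c", [], G)      # always a fresh node
print(v1 == v2, v1 == w)         # True False
```

The example shows the nondeterminism discussed below: two successive constructions of the constant c yield one node under caching but two under fresh allocation.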
If c is a zero-ary individual constant, then starting from an empty graph, two calls to make.term for c in succession can produce either a graph with one node or a graph with two nodes, depending on whether make.term is implemented as new.term or cache.term. These graphs are not isomorphic. Suppose S is a state in which distinct variables x and y are bound to distinct nodes with the same term. Then a call to cache.term can produce non-isomorphic results: a result in which the two bindings are identified is not isomorphic to one in which they are not, because an isomorphism must map the node bound to a variable in one state to the node bound to the same variable in the other. However, the states that can be produced by a call to new.term on any state S are all isomorphic.
If cache.term is modified so that all nodes with the same label and same children are made identical, then cache.term will map isomorphic graphs to isomorphic graphs. It also will be storage efficient and will produce the same set of terms as if new.term were called, even though the resulting graph will not always be isomorphic to one produced by new.term. However, if cache.term is implemented this way, a call on two variables bound to nodes with the same label and children will identify the two nodes. This may have unintended consequences, because now replacing an argument of one variable's term using an argument replacement statement will do the same to the other. For function symbols with large arity a call to cache.term can be expensive because it will take time proportional to the arity of the function.
6.2. Solutions
Now make.term can call either cache.term or new.term. We want to find a systematic way of deciding when to do cache.term and when to do new.term, and it should be storage efficient. We also want to guarantee that isomorphism dependence is preserved. To this end we define term(v) for a node v in G labelled f with children v₁, …, vₙ to be the term f(term(v₁), …, term(vₙ)). Then we specify a set of finite ground constructor terms that is closed under the subterm relation, and when finish.function is called for a constructor f, if the term being constructed is in the set then cache.term is called; otherwise new.term is called. We also specify that the set contains all constant (zero arity) constructor symbols. The set would typically contain small terms or terms whose top symbol has small arity. This saves some storage but still is term dependent and isomorphism dependent. If the set is properly defined, in a manner to be described, it avoids the need to copy a lot of structure when an argument replacement is done on a variable x whose term is not in the set. A possible way to define the set is to have a set of function symbols that can appear in its terms, and a term is only in the set if it is finite and all function symbols in it are in this set of symbols.
It helps to hash terms in the set to detect duplicates. If two terms have the same hash value then a more precise test can be used to check whether they are identical. Finite terms can be hashed in a variety of ways. As an example of a hash function on finite terms that depends, without much extra work, on the hash values of the children, we can define a hash function on terms as follows:
Choose a prime p and choose, for each function symbol f, an integer code between 1 and p − 1. Let the hash value of a term be computed from the code of its top symbol and the hash values of its children, reduced mod p, where the hash values of the children are defined similarly in a recursive manner. This is not necessarily the best hash function but it works. Because of the subterm condition on the set, we can assume that the hash values of the children have already been computed when the hash of a term is evaluated. Suppose cache.term is called during the evaluation of finish.function for a constructor f. Caching requires one to know whether a node w whose term is identical to the term being constructed already exists in G. We can assume that there are no duplicate nodes for its subterms, because subterms are evaluated and cached before terms. If there is a node w whose term is syntactically equal to the term being constructed, then the children of w will be the nodes returned for the subterms. To find such a node w if one exists, one only has to look at the nodes whose hash value equals that of the term being constructed, and for each such node check that its function symbol is f and that its children are the nodes returned for the subterms. It is not necessary to examine the entire term. And clearly if the function symbol and the children match then the terms are equal. The point is that there should not be two nodes in G representing the same term of the set; instead there should be only one such node. This is for the purpose of economizing the use of storage as much as possible. This kind of a hash function requires time linear in the number of children of a node, because each child at least has to be examined, and the test for having the same children may also take time proportional to the number of children, which has to be made for each candidate node. Therefore it is a good idea to restrict the set so that its terms are either of small arity or do not need to have hash values recomputed often, which can happen when argument replacement is done.
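For illustration only, here is one concrete instantiation of such a recursive hash (the modulus, the per-symbol codes, and the combining formula are assumptions of this sketch, not the paper's exact choices):

```python
P = 1_000_003                            # a prime modulus (assumption)
SYM_CODE = {"c": 17, "f": 31, "g": 47}   # integer code per function symbol

def term_hash(t):
    # t is (symbol, (child terms...)); the children's hashes are combined
    # with the symbol's code mod P, mirroring the recursive scheme above.
    f, kids = t
    h = SYM_CODE[f]
    for k in kids:
        h = (h * 131 + term_hash(k)) % P
    return h

c = ("c", ())
t1 = ("f", (c, ("g", (c,))))
t2 = ("f", (("c", ()), ("g", (("c", ()),))))   # syntactically equal to t1
print(term_hash(t1) == term_hash(t2))          # True
```

In a real implementation the children's hash values would be stored with the graph nodes rather than recomputed, as the subterm condition in the text allows.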
This approach works for finite terms but not for infinite terms. When argument replacement is done, a finite term may become infinite and an infinite term may become finite. It is also possible for a finite term to remain finite while its hash value changes. If the term of a variable is infinite this does not necessarily lead to nontermination, because such terms are not evaluated; recall this statement about how variables are evaluated:
If P is a variable x then evaluating P simply returns the node bound to x, and the length measure of the evaluation is 1.
The only terms that are evaluated, leading to a call to finish.function, are terms that appear in a program. These terms may have constructors in them, but they are all finite. If the evaluation of a finite term leads to an infinite constructor term in the prelude to finish.function, this term is not further evaluated. Although they are not evaluated, such infinite constructor terms can be accessed by finding their top-level function symbol with top, or their arguments with arg, or by the function equal.top, and they can be modified by the argument replacement operation.
First some terminology that is helpful when considering infinite terms.
Definition 12.
If v is a node of G with children v₁, …, vₙ then the vᵢ are called children of v and v is called a parent of the vᵢ in G. Note that a node can have more than one parent. The ancestor relation is defined in this way: a node is an ancestor of itself, and if v is a parent of w in G then v is an ancestor of w in G. Also the ancestor relation is transitive. A cycle in G is defined as a sequence of nodes v₁, …, vₙ where n > 1, each vᵢ₊₁ is a child of vᵢ in G, and vₙ = v₁. It is possible for a node to be a parent of itself; this is also a cycle in G.
If there is a cycle in G containing a node v of G then term(v) is infinite. It is fine to create infinite terms; they just cannot be in the set because it would entail a lot of effort to hash them. We say a node v of a graph G or of a state is in the set if term(v) is in the set. The following information (the status of v) will be stored with each node v in G: whether term(v) is in the set, the hash value of term(v), and the set of parents of v in G. The hash value and the membership can either be given or set to UNKNOWN.
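The status computation needs to know whether a node represents an infinite term, that is, whether some node reachable from it lies on a cycle. A depth-first sketch (the dict-based graph representation is an assumption of this sketch) is:

```python
def is_infinite(v, G):
    # term(v) is infinite iff some node reachable from v lies on a cycle.
    # G: dict node -> (label, tuple of child nodes).
    GREY, BLACK = 1, 2
    colour = {}
    def dfs(u):
        colour[u] = GREY                    # u is on the current path
        for w in G[u][1]:
            if colour.get(w) == GREY:       # back edge: a reachable cycle
                return True
            if colour.get(w) != BLACK and dfs(w):
                return True
        colour[u] = BLACK                   # finished, no cycle through u
        return False
    return dfs(v)

# ones = cons(1, ones) is cyclic (infinite); fin = cons(1, nil) is finite.
G = {0: ("cons", (1, 0)), 1: ("1", ()), 2: ("cons", (1, 3)), 3: ("nil", ())}
print(is_infinite(0, G), is_infinite(2, G))   # True False
```

A circular list such as the doubly linked lists mentioned in the introduction is detected by the back edge to a grey node.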
In an argument replacement operation, an argument of the term of one node is replaced by the term of another. In all cases, if the replacing term is infinite or not in the set, then the modified node will not be in the set, and none of its ancestors will be in the set after this operation. Also, in all cases, if the operation creates a cycle through the modified node, the resulting terms will be infinite, so that none of the ancestors will be in the set after this operation. Therefore the status of all the ancestors will have to be updated. Whenever an argument replacement is done with a replacing term in the set, and no cycle is created, the membership of the modified term in the set needs to be recomputed in some manner, depending on how the set is defined. This recomputation can often be done without searching through the entire term structure. In such cases all four possibilities of the old and new terms being or not being in the set are theoretically possible. The decision about membership in these cases depends on how the set is defined.
A possible way to define the set is to specify a set of constructor function symbols and to say that a ground term t is in the set if it is finite and all function symbols in t are in the specified set of symbols. Another possibility is to define the set as the set of all ground constructor terms of bounded depth, where the depth of a constant symbol is zero and the depth of a term is one plus the maximum depth of its arguments. In this case the depth of term(v) should be added to the status of v and updated as necessary.
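Under the depth-bounded definition, membership in the set is a simple recursive check; the particular bound below is an assumption of this sketch:

```python
MAX_DEPTH = 3    # illustrative bound

def depth(t):
    # Depth of a constant is 0; depth of f(t1,...,tn) is 1 + max depth ti.
    f, kids = t
    return 0 if not kids else 1 + max(depth(k) for k in kids)

def in_set(t):
    # Membership test for the depth-bounded caching set.
    return depth(t) <= MAX_DEPTH

a = ("a", ())
shallow = ("f", (a, ("g", (a,))))                    # depth 2
deep = ("f", (("f", (("f", (("f", (a,)),)),)),))     # depth 4
print(in_set(shallow), in_set(deep))                 # True False
```

Storing the computed depth with each node, as the text suggests, lets membership be rechecked after an argument replacement without re-traversing the whole term.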
If terms in the set are defined to be of bounded depth and no cycle is created, it is possible that the old terms are in the set but the new one is not, if the replacing term has depth greater than the depth of the subterm it is replacing. It is possible also that the old term is not in the set but the new one is, if the replacing term has depth smaller than the depth of the subterm it is replacing. Even if both the old and new terms are in the set, it is possible that the hash values of the modified term and of its ancestors will need to be recomputed. If the set is defined as the set of all finite terms over a specified set of function symbols, and no cycle is created, then the new term is in the set if both the old term and the replacing term are. However, the hash values of ancestors may need to be recomputed.
One may say that with all of this discussion, the semantics of the language is not very simple. But the complexities arise from mathematical properties of the language and also from constraints imposed by the nature of mathematics itself, not from arbitrary decisions about the form of the language.
6.3. Isomorphism Dependence
Now, after considering efficiency issues for make.term, new.term, and cache.term, we turn to isomorphism dependence, assuming caching is done using a set of terms as described above. As mentioned earlier, if f is a constructor then finish.function makes use of new.term, make.term, or cache.term: finish.function returns the pair produced by make.term, and make.term may call new.term or cache.term.
Now we show that isomorphic inputs give isomorphic results of make.term when f is a constructor. This will be helpful in showing isomorphism dependence of finish.function for constructors.
Suppose G and G′ are isomorphic graphs via an isomorphism ϕ mapping the argument nodes of one call of make.term to those of the other. To show that make.term is isomorphism dependent, we want to show how to extend this isomorphism to one between the results of the two calls. The assumption implies that the corresponding argument nodes have equal terms, so the constructed terms are equal, and thus both of these terms or neither will be in the set. Thus both calls will be implemented by cache.term (if the terms are in the set) or both by new.term (if they are not). Suppose the two calls return nodes v and v′. In either case, ϕ can be extended to map v to v′ and the resulting graphs will still be isomorphic. If new.term is used then both v and v′ are new nodes in G and G′, respectively, so ϕ can be extended to an isomorphism with ϕ(v) = v′. If cache.term is used then recall that the set is closed under subterms and that the subterms will have been generated and cached already. The top-level subterms of the two constructed terms are equal, and all are in the set. The children of v will be the argument nodes of the first call and the children of v′ will be those of the second. Suppose there is an existing node w in G with this label and these children. A little thought shows that such a w exists in G iff the corresponding node ϕ(w) exists in G′, because G and G′ are isomorphic. Then, because caching is done and all nodes with identical terms of the set cache to the same node, v will be chosen to be w and v′ will be chosen to be ϕ(w). So the existing isomorphism does not need to be modified, and the results of make.term are isomorphic. Otherwise cache.term operates the same as new.term, so the comments for new.term apply here and the results of make.term are isomorphic.
The isomorphism for the results of make.term is an extension of the isomorphism for G ≅ G′. That is, for nodes v and w of G and G′, if ϕ(v) = w before the call then ϕ(v) = w after it. As an example of something that is not isomorphism or term dependent, suppose there were a function symbol node.symbol which, when evaluated, would find the node v bound to its argument and then construct a constant symbol that depends on (the name of) v, on which, for example, a make.term might be done. This would not be isomorphism or term dependent. Another example that is not isomorphism or term dependent is a function that would tell whether one node is less than another in some alphabetic ordering on node names.
6.4. Term and Isomorphism Dependence
Now that we have done the proof for pre.function and have shown how to handle new.term, cache.term, and make.term, we turn to the rest of the proof concerning term and isomorphism dependence. Recall Propositions 2 and 1:
Proposition 4. A function F is isomorphism dependent if for all states S and T, S ≅ T implies F(S) ≅ F(T).
Proposition 5. A function F is term dependent if for all states S and T, S ≅ T in the sense of term equivalence implies F(S) ≅ F(T).
In more detail, we have the following:
If for an imperative statement P, for all states S and T, S ≅ T implies that the resulting states are isomorphic, then P is isomorphism dependent.
Similarly, if for a functional expression P, for all states S and T, S ≅ T implies that the resulting (node, state) pairs are isomorphic, then P is isomorphism dependent.
The same definitions hold for term dependence, with ≅ referring to term equivalence and not isomorphism classes.
6.5. Cases to Consider: Functional Expressions: Term and Isomorphism Dependence
We want to prove the following:
Theorem 5. If F is a functional expression in ITGL then F is term dependent if there are no argument replacement or node equality statements in the program in which F occurs. Also, F is isomorphism dependent even if there are argument replacement statements or node equality statements in the program.
The proof is done separately for each kind of statement.
For functional expressions it is necessary to look at the alternate possibilities depending on the kind of function we are dealing with. Note that all of these use finish.function. First we consider rewrite procedures.
6.5.1. Rewrite Procedures
Recall that graph.rewrite is defined as
if match.list then (match.list else
if match.list then (match.list else … else
if match.list then (match.list else
where = make.term.
Suppose T = graph.rewrite applied in S and T′ = graph.rewrite applied in S′, with S ≅ S′. By Corollary 1 concerning pre.function we know that if this implies T ≅ T′ then graph.rewrite is term (isomorphism) dependent. Now match.list does not change the graph and only depends on the terms of the arguments and the nodes in the graph, so it operates identically on S and S′. These terms are identical for S and S′ regardless of whether we are dealing with isomorphism or term dependence. So the same clause is selected in both cases and match.list returns the same result, which we call E. We can assume by induction that the selected right-hand side is isomorphism (term) dependent, thus T ≅ T′. The same argument works for term dependence.
6.5.2. Compiled Functions
Consider a compiled function definition. For each collection of ground terms instantiating the formal parameters there is a unique ground term instantiating the result in the execution of a compiled function. So the result depends only on the terms instantiating the formal parameters. Arguing as for rewrite functions, compiled functions are both isomorphism and term dependent.
6.5.3. Constructors
By Lemma 2 about pre.function it is only necessary to consider the evaluation of finish.function. If f is a constructor then finish.function returns the pair produced by make.term, consisting of a node v as specified in the discussion of make.term together with the resulting state.
Suppose S and S′ are two states with S ≅ S′. By Lemma 2 we can assume the pre.function tuples for the two evaluations of finish.function are isomorphic. Suppose ϕ is an isomorphism between them. Then let ϕ′ be ϕ extended to map the node returned by the first call of make.term to the node returned by the second. Such a ϕ′ exists by the discussion of make.term, cache.term, and new.term. Then ϕ′ demonstrates that the two results of finish.function are isomorphic. Thus for a constructor f, the evaluation is isomorphism dependent. The argument for term dependence is essentially the same, noting that the returned nodes have equal terms. Note again that the final isomorphism extends the initial one; this is a general property of all of these proofs of isomorphism dependence.
6.5.4. Variables
For a variable x, evaluation returns the node bound to x without changing the state, which is easily shown to be term and isomorphism dependent.
6.5.5. Special Functions
The function arg is easily shown to be isomorphism and term dependent: if two terms are equal, their corresponding arguments are equal, and if an isomorphism maps node v to w then it also maps corresponding arguments of v to those of w. Similarly the defined function top is isomorphism and term dependent, assuming that constant symbols are in the set. The function equal.top, which tests whether two nodes have the same top symbol, is easily isomorphism and term dependent. The function equal.node, which tests whether two nodes are equal, is isomorphism dependent but not term dependent unless the terms of both arguments are in the set.
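The contrast between equal.node and term equality can be sketched as follows (the Python class and function names are stand-ins for the language's functions, assumed for this sketch):

```python
class Node:
    def __init__(self, label, args=()):
        self.label, self.args = label, tuple(args)

def term(v):
    # The term represented by a node, as a nested tuple.
    return (v.label, tuple(term(a) for a in v.args))

def equal_node(v, w):
    # equal.node tests identity of nodes, not equality of terms.
    return v is w

a = Node("a")
env1 = {"x": a, "y": a}                    # x and y share one node
env2 = {"x": Node("a"), "y": Node("a")}    # equal terms, distinct nodes

# env1 and env2 are term equivalent, yet equal.node distinguishes them,
# which is why equal.node is isomorphism dependent but not term dependent.
print(term(env1["x"]) == term(env1["y"]),  # True
      term(env2["x"]) == term(env2["y"]),  # True
      equal_node(env1["x"], env1["y"]),    # True
      equal_node(env2["x"], env2["y"]))    # False
```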
6.5.6. Imperative Procedures
The only remaining case to consider is that of an imperative procedure definition. For this the following theorem is helpful:
Theorem 6. Suppose S ≅ T via isomorphism ϕ. Then there is an isomorphism ϕ′ that extends ϕ such that F(S) ≅ F(T) via ϕ′. This means that if ϕ(v) = w then ϕ′(v) = w also. This holds for both imperative statements and functional expressions.
Proof. This can be verified by looking at each kind of statement and verifying that the property is preserved under composition. The only place where the isomorphism directly changes is when make.term is executed, that is, during the evaluation of finish.function for a constructor. In this case a new node may be added to the term graph and the isomorphism may be extended to this new node. Alternatively, the term graph and isomorphism may be unchanged. Therefore, during the execution of any statement, the only changes to the isomorphism and term graph will be during the evaluations of make.term, so a sequence of such operations will result in an isomorphism that is either identical to the original or that extends it. □
Now we consider the case of an imperative procedure definition. Suppose f is a procedure whose definition has body B, return expression t, and formal parameters x1, …, xn. Then, with notation as in Proposition 3, the call is evaluated by finish.function. Let (v, U) be the result of finish.function. Recall that v and U are defined as follows:
Let e be the environment binding each formal parameter xi to the corresponding argument node and undefined for other program variables x. Let S1 be the state with env(S1) = e and the term graph of S. Let S2 be [[B]]S1, and note that B may change the term graph. Let (v1, S3) be [[t]]S2, and recall that evaluating t does not change env. Finally, let v be v1 and let U in this case be S3 with the original environment of S restored.
Now we can assume inductively that B and t are term and isomorphism dependent. By the way S1 is defined, it is obtained from S by a function which is isomorphism and term dependent. By the comments about B and t, U is also obtained by a function which is isomorphism and term dependent. Putting the environment back in does not change this. This follows by Theorem 6 concerning extensions of the isomorphism and Theorem 1 concerning nodes of the term graph. Therefore the evaluation of a procedure call is term and isomorphism dependent under suitable assumptions. (Term dependence requires that argument replacement and node equality statements not be used.)
6.6. Imperative Statements: Term and Isomorphism Dependence
We want to prove the following result:
Theorem 7. If F is an imperative statement then F is term dependent, provided there are no argument replacement statements or node equality statements in the program P in which F occurs. If F is an imperative statement then F is isomorphism dependent even if there are argument replacement statements or node equality statements in P.
Again, the proof is done separately for each kind of statement. Recall that ≡ denotes term equivalence and ≅ denotes isomorphism equivalence.
6.6.1. Simple Assignment Statements
For an environment e recall that e[x ↦ v] is defined by e[x ↦ v](x) = v and e[x ↦ v](y) = e(y) for y ≠ x. Also, [[x := t]]S is obtained by computing (v, U) = [[t]]S and applying fix.env to U with its environment updated to env(U)[x ↦ v].
By Theorem 2, the environment of [[t]]S is the same as that of S. The nodes mentioned in this original environment are still in the term graph because the set of nodes in the term graph never becomes smaller, by Theorem 1. Further, for isomorphism dependence, the isomorphism condition for environments is still satisfied by the current isomorphism by the extension property of Theorem 6.
Note that the value v returned by finish.function is a node in the term graph which, if necessary, is added to the graph before returning.
Consider two states S and S′, with environments e and e′ respectively. Suppose S ≡ S′ (respectively S ≅ S′). Let T be [[x := t]]S and T′ be [[x := t]]S′. We want to show that T and T′ are equivalent in both senses. Let (v, U) be [[t]]S and (v′, U′) be [[t]]S′. Now the term graph of T is that of U and the term graph of T′ is that of U′. Also the graph is not further modified after computing v and v′, so v is a node of T and v′ is a node of T′. Thus env(T) = env(U)[x ↦ v] and env(T′) = env(U′)[x ↦ v′].
First consider term equivalence. Suppose S and S′ are term equivalent, so term(e(z)) = term(e′(z)) for all program variables z. Then U and U′ will be term equivalent by induction. So v and v′ represent the same term; thus term(v) = term(v′). Then env(T)(x) = v and env(T′)(x) = v′. So term(env(T)(x)) = term(v) = term(v′) = term(env(T′)(x)), and for other variables y, because of fix.env, term(env(T)(y)) = term(env(U)(y)) = term(env(U′)(y)) = term(env(T′)(y)), so T ≡ T′.
Now consider isomorphism equivalence. Suppose S and S′ are isomorphic. Then we can assume for the induction proof that [[t]]S and [[t]]S′ are isomorphic. So U ≅ U′, say by isomorphism φ, with v in U, v′ in U′, and φ(v) = v′. We show that T ≅ T′ by isomorphism φ. For this we need that for all program variables y, φ(env(T)(y)) = env(T′)(y). We know that U ≅ U′ and φ maps v to v′. For x, env(T)(x) = v and env(T′)(x) = v′ and so φ(env(T)(x)) = env(T′)(x). For y ≠ x, φ(env(T)(y)) = φ(env(U)(y)) = env(U′)(y) = env(T′)(y) because U ≅ U′ by φ. So T ≅ T′.
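The isomorphism condition used in this proof — a map φ with φ(env(T)(y)) = env(T′)(y) for every program variable y, preserving labels and the child relation — can be checked by a parallel traversal. A minimal sketch, with all names hypothetical and states simplified to environments over a toy Node class:

```python
class Node:
    """A term-graph node: a label plus child nodes (possibly shared)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def find_isomorphism(env1, env2):
    """Try to build a map phi (keyed by node identity) such that
    phi(env1[y]) = env2[y] for every program variable y, labels match,
    and phi commutes with the child relation. Returns phi or None."""
    if set(env1) != set(env2):
        return None
    phi, used = {}, set()
    def match(u, v):
        if id(u) in phi:          # already mapped: the images must agree
            return phi[id(u)] is v
        if id(v) in used:         # phi must be injective
            return False
        if u.label != v.label or len(u.children) != len(v.children):
            return False
        phi[id(u)] = v            # map u before recursing (handles cycles)
        used.add(id(v))
        return all(match(c, d) for c, d in zip(u.children, v.children))
    if all(match(env1[y], env2[y]) for y in env1):
        return phi
    return None

# Sharing must correspond: here y aliases x's first child in both states.
a1 = Node("a"); s1 = {"x": Node("cons", [a1, Node("nil")]), "y": a1}
a2 = Node("a"); s2 = {"x": Node("cons", [a2, Node("nil")]), "y": a2}
assert find_isomorphism(s1, s2) is not None

# Same terms but different sharing: no isomorphism exists.
s3 = {"x": Node("cons", [Node("a"), Node("nil")]), "y": Node("a")}
assert find_isomorphism(s1, s3) is None
```

The failing case shows that isomorphism equivalence is finer than term equivalence: s1 and s3 bind the same terms but differ in sharing.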
6.6.2. Argument Replacement Statements
Now consider the argument replacement statement, which replaces the i-th argument of the node bound to x by another node, for isomorphism dependence, not term dependence. Recall the semantics: Replace(G, u, i, w) is G with the i-th argument of node u in G replaced by w. Suppose S ≅ S′, and let T and T′ be the results of executing the statement in S and S′ respectively.
Suppose S ≅ S′. We want to show that T ≅ T′. We assume by induction that evaluating the parts of the statement is isomorphism dependent, so there is an isomorphism φ between the two graphs G and G′ obtained just before the replacement. Suppose u is the node bound to x in the first state and u′ is the node bound to x in the second; then φ(u) = u′. So the label of u in G is the same as the label of u′ in G′. Similarly let w and w′ be the replacement nodes, with φ(w) = w′. Now G1 is G with the i-th child of u replaced by w and G1′ is G′ with the i-th child of u′ replaced by w′. Suppose u has children v1, …, vn in G and u′ has children v1′, …, vn′ in G′. Because φ is an isomorphism, φ(vj) = vj′ for all j. Also φ(w) = w′. Then u has children v1, …, w, …, vn in G1 and u′ has children v1′, …, w′, …, vn′ in G1′. So φ maps each child of u in G1 to the corresponding child of u′ in G1′, so the conditions for an isomorphism are satisfied and φ is an isomorphism from G1 to G1′. The only nodes whose children have changed from G to G1 and from G′ to G1′ are u and u′. This completes the proof. Note that the same φ maps G to G′ and also maps G1 to G1′ because the names of the nodes have not changed, just one of the child pointers of u and u′.
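Why argument replacement is isomorphism dependent but not term dependent can be seen concretely: two term-equivalent states with different sharing diverge once a child pointer is overwritten in place. A small sketch under these assumptions (hypothetical names; the replacement statement is modelled by a direct child update):

```python
class Node:
    """A term-graph node whose children can be updated in place."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def term(n):
    """The (finite) term represented by node n, as a nested tuple."""
    return (n.label,) + tuple(term(c) for c in n.children)

def replace_arg(u, i, w):
    """Replace(G, u, i, w): overwrite the i-th child of u in place (1-based)."""
    u.children[i - 1] = w

# State 1: x and y are bound to the same node f(a) (shared).
u1 = Node("f", [Node("a")])
env1 = {"x": u1, "y": u1}

# State 2: term equivalent to state 1, but x and y are distinct nodes.
env2 = {"x": Node("f", [Node("a")]), "y": Node("f", [Node("a")])}
assert term(env1["y"]) == term(env2["y"]) == ("f", ("a",))

# Replace the first argument of x's node by b in both states.
replace_arg(env1["x"], 1, Node("b"))
replace_arg(env2["x"], 1, Node("b"))

assert term(env1["y"]) == ("f", ("b",))  # y changed too: it shares x's node
assert term(env2["y"]) == ("f", ("a",))  # y unchanged here: the two states
                                         # are no longer term equivalent
```

Because the two starting states were not isomorphic (their sharing differs), this divergence is consistent with argument replacement still being isomorphism dependent.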
6.6.3. New.Term, Cache.Term, and Make.Term
Finish.function has been handled in Section 5.10. Using Corollary 1 about pre.function, we can show that evaluating a functional expression f(t1, …, tn) where f is a constructor preserves term dependence and isomorphism dependence. If there are no argument replacement statements or equality tests of nodes then all operations are term dependent. For this, new.term, cache.term, and make.term are not a problem.
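The behaviour these proofs rely on — make.term either returns an existing node with the given label and children or allocates a fresh one, so an isomorphism can always be extended to the result — can be sketched as a hash-consing table. All names here are hypothetical Python stand-ins:

```python
class Node:
    """An immutable term-graph node."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = tuple(children)

class Graph:
    """A term graph with a cache from (label, child identities) to nodes."""
    def __init__(self):
        self.nodes = []
        self.cache = {}

    def new_term(self, label, children):
        """new.term: always allocate a fresh node."""
        n = Node(label, children)
        self.nodes.append(n)
        return n

    def make_term(self, label, children):
        """make.term: return the cached node for this label and these
        children if one exists (cache.term), else allocate and cache one."""
        key = (label, tuple(id(c) for c in children))
        if key not in self.cache:
            self.cache[key] = self.new_term(label, children)
        return self.cache[key]

g = Graph()
nil = g.make_term("nil", [])
a = g.make_term("a", [])
c1 = g.make_term("cons", [a, nil])
c2 = g.make_term("cons", [a, nil])
assert c1 is c2            # the cached node is reused, never duplicated,
assert len(g.nodes) == 3   # so no duplicate node is ever created
```

Because make_term never creates two nodes with the same label and children, corresponding calls in two isomorphic graphs yield nodes to which the isomorphism extends uniquely, which is the kind of fact the extension argument of Theorem 6 relies on.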
6.6.4. Conditional Statements
Recall that

[[if t then F1 else F2]]S =
  if term(v) = ⊥ then U
  else if term(v) = true then [[F1]]U
  else [[F2]]U

where (v, U) = [[t]]S.
Suppose S ≡ S′ (respectively S ≅ S′); we want to show that [[if t then F1 else F2]]S and [[if t then F1 else F2]]S′ are equivalent.
Let T be [[if t then F1 else F2]]S and T′ be [[if t then F1 else F2]]S′. We know by the lemma about pre.function that if (v, U) = [[t]]S and (v′, U′) = [[t]]S′ then U ≡ U′ (respectively U ≅ U′), assuming that t is term (isomorphism) dependent. This implies that term(v) = term(v′) (in both senses). So if term(v) = ⊥ then T = U and T′ = U′ and so T and T′ are equivalent. If term(v) = true then T = [[F1]]U and T′ = [[F1]]U′; assuming F1 is term (isomorphism) dependent, T and T′ are equivalent. If term(v) ≠ true then T = [[F2]]U and T′ = [[F2]]U′, so assuming F2 is term (isomorphism) dependent, T and T′ are equivalent. In all cases T ≡ T′ (respectively T ≅ T′).
6.6.5. Iterative Statements
Recall that [[while t do F]]S = if term(v) = ⊥ then U else if term(v) ≠ true then U else [[while t do F]]([[F]]U), where (v, U) = [[t]]S.
Let T be [[while t do F]]S and T′ be [[while t do F]]S′. Let (v, U) be [[t]]S and (v′, U′) be [[t]]S′. By the lemma, U ≡ U′ (U ≅ U′) if t is term (isomorphism) dependent. Now either way term(v) = term(v′). If term(v) = ⊥ then T = U and T′ = U′ and so T and T′ are equivalent. If term(v) ≠ true then T = U and T′ = U′ and so T and T′ are equivalent. Otherwise T = [[while t do F]]([[F]]U) and T′ = [[while t do F]]([[F]]U′). By induction we can assume [[F]]U and [[F]]U′ are equivalent and that [[while t do F]] is term (isomorphism) dependent on [[F]]U and [[F]]U′. Thus T and T′ are equivalent in this case also.
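The unfolding used in this argument — test the condition, stop if it is undefined or not true, otherwise run the body and recurse on the resulting state — can be sketched as an interpreter clause. This is a hypothetical model, with states simplified to environments and the undefined value modelled by None:

```python
def eval_while(cond, body, state):
    """[[while t do F]]S: evaluate the condition t to (v, U); if term(v)
    is undefined (None here) or not true, return U; otherwise execute
    the body F and recurse on the resulting state."""
    v, state = cond(state)      # the condition may change the state
    if v is not True:           # covers both the undefined and false cases
        return state
    return eval_while(cond, body, body(state))

# Hypothetical example: count x down to 0.
cond = lambda s: (s["x"] > 0, s)         # t: the test x > 0
body = lambda s: {**s, "x": s["x"] - 1}  # F: x := x - 1
assert eval_while(cond, body, {"x": 3}) == {"x": 0}
```

The recursion mirrors the equation above exactly: each unfolding performs one evaluation of t and, when the test is true, one execution of F, which is why the equivalence proof can proceed by induction on the number of unfoldings.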
6.6.6. Composition of Statements
If two statements are term (isomorphism) dependent then so is their composition.
6.6.7. Multiple Assignment Statements
As for multiple assignment statements, these can be expressed in terms of other statements, so the proof reduces to the proofs for those statements and for compositions of imperative statements.
6.6.8. Copy Statements
If the term of y is cached then the copy statement behaves the same as a simple assignment, because a new node will be created with the same term as y and caching will then identify it with the existing node again. Otherwise suppose S ≡ S′ (S ≅ S′), let T and T′ be the results of the copy statement in S and S′, and let v and v′ be the new nodes created in T and T′ respectively. Then env(T)(x) = v and env(T′)(x) = v′, with term(v) = term(env(S)(y)) and term(v′) = term(env(S′)(y)). As for term dependence, T ≡ T′ because term(env(T)(x)) = term(v) = term(env(S)(y)) = term(env(S′)(y)) = term(env(T′)(x)). As for isomorphism dependence, let φ be an isomorphism from S to S′ with φ(env(S)(y)) = env(S′)(y). Then extend φ to an isomorphism from T to T′ with φ(v) = v′.
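On one plausible reading of the copy statement — a fresh node with the same label and (shared) children as y's node — its effect can be sketched as follows; all names are hypothetical:

```python
class Node:
    """A term-graph node: a label plus child nodes (possibly shared)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def term(n):
    """The (finite) term represented by node n, as a nested tuple."""
    return (n.label,) + tuple(term(c) for c in n.children)

def copy_stmt(env, x, y):
    """x := copy(y): bind x to a fresh node with the same label and the
    same (shared) children as y's node, so term(x) = term(y) afterwards."""
    src = env[y]
    fresh = Node(src.label, src.children)  # children shared, not copied
    return {**env, x: fresh}

env = {"y": Node("cons", [Node("a"), Node("nil")])}
env2 = copy_stmt(env, "x", "y")
assert term(env2["x"]) == term(env2["y"])  # same term as y
assert env2["x"] is not env2["y"]          # but a distinct node
```

The distinct-node assertion is what separates copy from simple assignment: the two variables no longer alias, which matters for later argument replacement.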