XML
===

<:XML:> is an <:IntermediateLanguage:>, translated from <:CoreML:> by
<:Defunctorize:>, optimized by <:XMLSimplify:>, and translated by
<:Monomorphise:> to <:SXML:>.

== Description ==

<:XML:> is polymorphic and higher-order, with flat patterns. Every
<:XML:> expression is annotated with its type. Polymorphic
generalization is made explicit through type variables annotating
`val` and `fun` declarations. Polymorphic instantiation is made
explicit by specifying type arguments at variable references. <:XML:>
patterns cannot be nested and cannot contain wildcards, constraints,
flexible records, or layering.

== Implementation ==

* <!ViewGitFile(mlton,master,mlton/xml/xml.sig)>
* <!ViewGitFile(mlton,master,mlton/xml/xml.fun)>
* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.sig)>
* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.fun)>

== Type Checking ==

<:XML:> also has a type checker, used for debugging. At present, the
type checker is also the best specification of the type system of
<:XML:>. If you need more details, the type checker
(<!ViewGitFile(mlton,master,mlton/xml/type-check.sig)>,
<!ViewGitFile(mlton,master,mlton/xml/type-check.fun)>) is pretty short.

Since the type checker does not affect the output of the compiler
(unless it reports an error), it can be turned off. The type checker
recursively descends the program, checking that the type annotating
each node is the same as the type synthesized from the types of the
expression's subnodes.
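The recursive descent described above can be illustrated with a small model. The following Python sketch is not MLton's type checker: the node classes, the tuple encoding of arrow types, and the dictionary environment are all assumptions made for illustration. It only shows the idea of comparing each node's annotation against the type synthesized from its subnodes.

```python
from dataclasses import dataclass


class TypeCheckError(Exception):
    """Raised when an annotation disagrees with the synthesized type."""


# Hypothetical node classes; every node carries its annotated type in `ty`.
# Types are modeled as strings ("int") or tuples ("arrow", arg, result).
@dataclass(frozen=True)
class Const:
    value: int
    ty: object


@dataclass(frozen=True)
class Var:
    name: str
    ty: object


@dataclass(frozen=True)
class Lam:
    param: str
    param_ty: object
    body: object
    ty: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object
    ty: object


def synthesize(node, env):
    """Compute a node's type from the (already checked) types of its subnodes."""
    if isinstance(node, Const):
        return "int"
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, Lam):
        body_ty = check(node.body, {**env, node.param: node.param_ty})
        return ("arrow", node.param_ty, body_ty)
    if isinstance(node, App):
        fn_ty = check(node.fn, env)
        arg_ty = check(node.arg, env)
        is_arrow = isinstance(fn_ty, tuple) and fn_ty[0] == "arrow"
        if not (is_arrow and fn_ty[1] == arg_ty):
            raise TypeCheckError(f"bad application: {fn_ty} applied to {arg_ty}")
        return fn_ty[2]
    raise TypeCheckError(f"unknown node: {node!r}")


def check(node, env):
    """Check a node recursively; return its type if the annotation matches."""
    synthesized = synthesize(node, env)
    if node.ty != synthesized:
        raise TypeCheckError(f"annotated {node.ty}, synthesized {synthesized}")
    return synthesized


# An identity function on int, applied to a constant.
identity = Lam("x", "int", Var("x", "int"), ("arrow", "int", "int"))
good = App(identity, Const(1, "int"), "int")
bad = App(identity, Const(1, "int"), "bool")  # wrong annotation on the App node
```

In this model, `check(good, {})` returns the type of the application, while `check(bad, {})` raises `TypeCheckError` because the annotation on the application node differs from the synthesized result type.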

== Details and Notes ==

<:XML:> uses the same atoms as <:CoreML:>, hence all identifiers
(constructors, variables, etc.) are unique and can have properties
attached to them. <:XML:> also has a simplifier (<:XMLShrink:>),
which implements a reduction system.

=== Types ===

<:XML:> types are either type variables or applications of n-ary type
constructors. There are many utility functions for constructing and
destructing types involving built-in type constructors.

A type scheme binds a list of type variables in a type. The only
interesting operation on type schemes is the application of a type
scheme to a list of types, which performs a simultaneous substitution
of the type arguments for the bound type variables of the scheme. For
the purposes of type checking, it is necessary to know the type scheme
of variables, constructors, and primitives. This is done by
associating the scheme with the identifier using its property list.
This approach is used instead of the more traditional environment
approach for reasons of speed.
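To make scheme application concrete, here is a minimal Python sketch of simultaneous substitution. The classes `TyVar`, `TyCon`, and `Scheme` and the function `apply_scheme` are hypothetical names chosen for this illustration and do not mirror MLton's actual signatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TyVar:
    name: str


@dataclass(frozen=True)
class TyCon:
    con: str          # name of an n-ary type constructor, e.g. "*" or "list"
    args: tuple = ()  # argument types


@dataclass(frozen=True)
class Scheme:
    tyvars: tuple     # bound type variables
    ty: object        # the type in which they are bound


def apply_scheme(scheme, type_args):
    """Instantiate a scheme by substituting all bound variables at once."""
    if len(type_args) != len(scheme.tyvars):
        raise ValueError("wrong number of type arguments")
    subst = dict(zip(scheme.tyvars, type_args))

    def go(ty):
        if isinstance(ty, TyVar):
            # Substituted types are not traversed again, which is what
            # makes the substitution simultaneous rather than sequential.
            return subst.get(ty, ty)
        return TyCon(ty.con, tuple(go(arg) for arg in ty.args))

    return go(scheme.ty)


a, b = TyVar("a"), TyVar("b")
pair = Scheme((a, b), TyCon("*", (a, b)))

# Swapping the variables only works with simultaneous substitution.
swapped = apply_scheme(pair, (b, a))
int_list = apply_scheme(Scheme((a,), TyCon("list", (a,))), (TyCon("int"),))
```

A sequential substitution (first `a := b`, then `b := a`) would turn the pair type into `a * a`; the single-pass version yields `b * a` as intended.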

=== XmlTree ===

Before defining `XML`, the signature for the language <:XML:>, we need to
define an auxiliary signature `XML_TREE` that contains the datatype
declarations for the expression trees of <:XML:>. This is done solely
for the purpose of modularity: it allows the simplifier and type
checker to be defined by separate functors (which take a structure
matching `XML_TREE`). Then, `XML` is defined as the signature for a
module containing the expression trees, the simplifier, and the type
checker.

Both constructors and variables can have type schemes, hence both
constructor and variable references specify the instance of the scheme
at the point of reference. An instance is specified with a vector of
types, which corresponds to the type variables in the scheme.

<:XML:> patterns are flat (i.e., not nested). A pattern is a
constructor with an optional argument variable. Patterns only occur
in `case` expressions. To evaluate a case expression, compare the
test value sequentially against each pattern. For the first pattern
that matches, destruct the value if necessary to bind the pattern
variables and evaluate the corresponding expression. If no pattern
matches, evaluate the default. All patterns of a case expression are
of the same variant of `Pat.t`, although this is not enforced by ML's
type system. The type checker, however, does enforce this. Because
tuple patterns are irrefutable, there will only ever be one tuple
pattern in a case expression and there will be no default.
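The evaluation rule for `case` over flat constructor patterns can be sketched as follows. This Python model is only illustrative: `ConValue`, `ConPat`, and `eval_case` are hypothetical names, rule bodies are modeled as functions from an environment to a result, and tuple patterns are left out.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ConValue:
    con: str         # constructor name
    arg: Any = None  # carried argument, if the constructor has one


@dataclass(frozen=True)
class ConPat:
    con: str                   # constructor to match
    var: Optional[str] = None  # optional argument variable (never a nested pattern)


def eval_case(test, rules, default=None, env=None):
    """Try each flat pattern in order; fall back to the default if none matches."""
    env = dict(env or {})
    for pat, body in rules:
        if pat.con == test.con:
            if pat.var is not None:
                env[pat.var] = test.arg  # destruct the value and bind the variable
            return body(env)
    if default is None:
        raise RuntimeError("no pattern matched and there is no default")
    return default(env)


rules = [
    (ConPat("SOME", "x"), lambda env: env["x"] + 1),
    (ConPat("NONE"), lambda env: 0),
]
some_result = eval_case(ConValue("SOME", 41), rules)
none_result = eval_case(ConValue("NONE"), rules)
default_result = eval_case(ConValue("OTHER"), rules, default=lambda env: -1)
```

Because a pattern is at most one constructor applied to one variable, matching reduces to a single constructor comparison per rule, with no recursive descent into the value.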

<:XML:> contains value, exception, and mutually recursive function
declarations. There are no free type variables in <:XML:>. All type
variables are explicitly bound at either a value or function
declaration. At some point in the future, exception declarations may
go away, and exceptions may be represented with a single datatype
containing a `unit ref` component to implement generativity.

<:XML:> expressions are like those of <:CoreML:>, with the following
exceptions. There are no record expressions. After type inference,
all records (some of which may have originally been tuples in the
source) are converted to tuples, because once flexible record patterns
have been resolved, tuple labels are superfluous. Tuple components
are ordered based on the field ordering relation. <:XML:> eta expands
primitives and constructors so that they are always fully applied.
Hence, the only kind of value of arrow type is a lambda. This
property is useful for flow analysis and later in code generation.
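The conversion of records to tuples can be pictured with a short Python sketch. The ordering used here (numeric labels first, in numeric order, then the remaining labels alphabetically) is an assumption standing in for the field ordering relation, and `field_key`, `record_to_tuple`, and `select_index` are hypothetical helpers, not MLton functions.

```python
def field_key(label):
    """Sort key for record labels (assumed ordering, see the text above)."""
    if label.isdigit():
        return (0, int(label), "")
    return (1, 0, label)


def record_to_tuple(record):
    """Drop the labels, keeping the components in field order."""
    return tuple(record[label] for label in sorted(record, key=field_key))


def select_index(labels, label):
    """Position that a field selection refers to after the conversion."""
    return sorted(labels, key=field_key).index(label)


point = record_to_tuple({"y": 2, "x": 1})
triple = record_to_tuple({"2": "b", "10": "c", "1": "a"})
y_index = select_index(["y", "x"], "y")
```

Once every record type has a fixed component order, a field selection such as `#y` can be compiled to a positional tuple selection, which is why the labels are no longer needed.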

An <:XML:> program is a list of toplevel datatype declarations and a
body expression. Because datatype declarations are not generative,
the defunctorizer can safely move them to toplevel.