XML
===

<:XML:> is an <:IntermediateLanguage:>, translated from <:CoreML:> by
<:Defunctorize:>, optimized by <:XMLSimplify:>, and translated by
<:Monomorphise:> to <:SXML:>.

== Description ==

<:XML:> is polymorphic and higher-order, with flat patterns. Every
<:XML:> expression is annotated with its type. Polymorphic
generalization is made explicit through type variables annotating
`val` and `fun` declarations. Polymorphic instantiation is made
explicit by specifying type arguments at variable references. <:XML:>
patterns cannot be nested and cannot contain wildcards, constraints,
flexible records, or layering.

== Implementation ==

* <!ViewGitFile(mlton,master,mlton/xml/xml.sig)>
* <!ViewGitFile(mlton,master,mlton/xml/xml.fun)>
* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.sig)>
* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.fun)>

== Type Checking ==

<:XML:> also has a type checker, used for debugging. At present, the
type checker is also the best specification of the type system of
<:XML:>. If you need more details, the type checker
(<!ViewGitFile(mlton,master,mlton/xml/type-check.sig)>,
<!ViewGitFile(mlton,master,mlton/xml/type-check.fun)>) is pretty short.

Since the type checker does not affect the output of the compiler
(unless it reports an error), it can be turned off. The type checker
recursively descends the program, checking that the type annotating
each node is the same as the type synthesized from the types of the
expression's subnodes.
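
The recursive descent described above can be illustrated with a minimal sketch. It is written in Python rather than SML, it covers only a tiny monomorphic fragment, and every name in it (`IntT`, `Arrow`, `Const`, `Var`, `Lam`, `App`, `check`) is hypothetical; it is not the code in `type-check.fun`.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical miniature types; XML's real types are richer.
@dataclass(frozen=True)
class IntT:
    """The type of integer constants."""


@dataclass(frozen=True)
class Arrow:
    """A function type from arg to res."""
    arg: "Type"
    res: "Type"


Type = Union[IntT, Arrow]


# Every expression node carries its own type annotation in the field `ty`.
@dataclass(frozen=True)
class Const:
    value: int
    ty: Type


@dataclass(frozen=True)
class Var:
    name: str
    ty: Type


@dataclass(frozen=True)
class Lam:
    param: str
    param_ty: Type
    body: "Exp"
    ty: Type


@dataclass(frozen=True)
class App:
    fn: "Exp"
    arg: "Exp"
    ty: Type


Exp = Union[Const, Var, Lam, App]


def check(e, env):
    """Synthesize a type from e's subnodes and compare it with e's annotation."""
    if isinstance(e, Const):
        synthesized = IntT()
    elif isinstance(e, Var):
        synthesized = env[e.name]
    elif isinstance(e, Lam):
        body_ty = check(e.body, {**env, e.param: e.param_ty})
        synthesized = Arrow(e.param_ty, body_ty)
    elif isinstance(e, App):
        fn_ty = check(e.fn, env)
        arg_ty = check(e.arg, env)
        if not isinstance(fn_ty, Arrow) or fn_ty.arg != arg_ty:
            raise TypeError("ill-typed application")
        synthesized = fn_ty.res
    else:
        raise TypeError("unknown node")
    if synthesized != e.ty:
        raise TypeError(f"annotation {e.ty} differs from synthesized {synthesized}")
    return synthesized


int_t = IntT()
identity = Lam("x", int_t, Var("x", int_t), Arrow(int_t, int_t))
well_typed = App(identity, Const(1, int_t), int_t)
# Same application, but annotated with the wrong result type.
ill_typed = App(identity, Const(1, int_t), Arrow(int_t, int_t))
```

Calling `check(well_typed, {})` returns `IntT()`, while `check(ill_typed, {})` raises `TypeError`. Because the check only reads the tree, skipping it cannot change what the compiler produces.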

== Details and Notes ==

<:XML:> uses the same atoms as <:CoreML:>, hence all identifiers
(constructors, variables, etc.) are unique and can have properties
attached to them. <:XML:> also has a simplifier (<:XMLShrink:>),
which implements a reduction system.

=== Types ===

<:XML:> types are either type variables or applications of n-ary type
constructors. There are many utility functions for constructing and
destructing types involving built-in type constructors.

A type scheme binds a list of type variables in a type. The only
interesting operation on type schemes is the application of a type
scheme to a list of types, which performs a simultaneous substitution
of the type arguments for the bound type variables of the scheme. For
the purposes of type checking, it is necessary to know the type scheme
of variables, constructors, and primitives. This is done by
associating the scheme with the identifier using its property list.
This approach is used instead of the more traditional environment
approach for reasons of speed.
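
As an illustration of scheme application, here is a minimal sketch in Python (not MLton's SML). The names `TyVar`, `TyCon`, `Scheme`, `substitute`, `apply_scheme`, `Ident`, and the `"scheme"` property key are all hypothetical, and the property list is modeled as a plain per-identifier dictionary.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class TyVar:
    """A type variable."""
    name: str


@dataclass(frozen=True)
class TyCon:
    """An n-ary type constructor applied to n argument types."""
    tycon: str
    args: Tuple["Type", ...] = ()


Type = Union[TyVar, TyCon]


@dataclass(frozen=True)
class Scheme:
    """A type with a list of bound type variables."""
    tyvars: Tuple[TyVar, ...]
    ty: Type


def substitute(ty, mapping):
    """Replace type variables by their images in one pass (simultaneously)."""
    if isinstance(ty, TyVar):
        return mapping.get(ty, ty)
    return TyCon(ty.tycon, tuple(substitute(a, mapping) for a in ty.args))


def apply_scheme(scheme, targs):
    """Instantiate a scheme with one type argument per bound variable."""
    if len(targs) != len(scheme.tyvars):
        raise ValueError("wrong number of type arguments")
    return substitute(scheme.ty, dict(zip(scheme.tyvars, targs)))


class Ident:
    """Hypothetical identifier with a property list (here a plain dict)."""

    def __init__(self, name):
        self.name = name
        self.plist = {}


a, b = TyVar("a"), TyVar("b")
int_t, bool_t = TyCon("int"), TyCon("bool")

pair = Ident("pair")
# The scheme lives on the identifier itself, not in a separate environment.
pair.plist["scheme"] = Scheme((a, b), TyCon("*", (a, b)))

instance = apply_scheme(pair.plist["scheme"], (int_t, bool_t))
# Swapping a and b only works because the substitution is simultaneous.
swapped = apply_scheme(pair.plist["scheme"], (b, a))
```

Here `instance` is the type `int * bool` and `swapped` is `b * a`; a sequential substitution would have collapsed the latter to `a * a` or `b * b`. Looking up the scheme is a direct access on the identifier, which is the speed argument made above.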

=== XmlTree ===

Before defining `XML`, the signature for language <:XML:>, we need to
define an auxiliary signature `XML_TREE`, which contains the datatype
declarations for the expression trees of <:XML:>. This is done solely
for the purpose of modularity -- it allows the simplifier and type
checker to be defined by separate functors (which take a structure
matching `XML_TREE`). Then, `XML` is defined as the signature for a
module containing the expression trees, the simplifier, and the type
checker.

Both constructors and variables can have type schemes, hence both
constructor and variable references specify the instance of the scheme
at the point of reference. An instance is specified with a vector of
types, which corresponds to the type variables in the scheme.

<:XML:> patterns are flat (i.e., not nested). A pattern is a
constructor with an optional argument variable. Patterns only occur
in `case` expressions. To evaluate a case expression, compare the
test value sequentially against each pattern. For the first pattern
that matches, destruct the value if necessary to bind the pattern
variables and evaluate the corresponding expression. If no pattern
matches, evaluate the default. All patterns of a case expression are
of the same variant of `Pat.t`, although this is not enforced by ML's
type system. The type checker, however, does enforce this. Because
tuple patterns are irrefutable, there will only ever be one tuple
pattern in a case expression and there will be no default.
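
The evaluation rule for `case` can be sketched as follows. This is an illustrative Python model, not MLton code: `ConValue`, `ConPat`, `TuplePat`, `patterns_same_variant`, and `eval_case` are hypothetical names, and rule bodies are modeled as Python functions that receive the bindings made by the pattern.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class ConValue:
    """A value built by a constructor, with an optional argument."""
    con: str
    arg: Any = None


@dataclass(frozen=True)
class ConPat:
    """Flat pattern: a constructor with an optional argument variable."""
    con: str
    arg_var: Optional[str] = None


@dataclass(frozen=True)
class TuplePat:
    """Flat pattern: one variable per tuple component; always matches."""
    components: Tuple[str, ...]


def patterns_same_variant(rules):
    """The property the type checker enforces: one pattern variant per case."""
    return len({type(pat) for pat, _ in rules}) <= 1


def eval_case(test, rules, default=None):
    """Try each rule in order; fall back to the default if none matches."""
    for pat, body in rules:
        if isinstance(pat, TuplePat):
            # Irrefutable: bind every component and evaluate the body.
            return body(dict(zip(pat.components, test)))
        if isinstance(test, ConValue) and test.con == pat.con:
            # Destruct the value only if the pattern names the argument.
            bindings = {} if pat.arg_var is None else {pat.arg_var: test.arg}
            return body(bindings)
    if default is None:
        raise ValueError("no pattern matched and there is no default")
    return default()


option_rules = [
    (ConPat("NONE"), lambda env: 0),
    (ConPat("SOME", "x"), lambda env: env["x"] + 1),
]
from_some = eval_case(ConValue("SOME", 3), option_rules)
from_default = eval_case(ConValue("OTHER"), option_rules, default=lambda: -1)
from_tuple = eval_case((1, 2), [(TuplePat(("a", "b")), lambda env: env["a"] + env["b"])])
```

In this sketch the tuple rule returns on the first iteration, which mirrors why a case over a tuple needs exactly one pattern and no default.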

<:XML:> contains value, exception, and mutually recursive function
declarations. There are no free type variables in <:XML:>. All type
variables are explicitly bound at either a value or function
declaration. At some point in the future, exception declarations may
go away, and exceptions may be represented with a single datatype
containing a `unit ref` component to implement genericity.

<:XML:> expressions are like those of <:CoreML:>, with the following
exceptions. There are no record expressions. After type inference,
all records (some of which may have originally been tuples in the
source) are converted to tuples, because once flexible record patterns
have been resolved, tuple labels are superfluous. Tuple components
are ordered based on the field ordering relation. <:XML:> eta expands
primitives and constructors so that they are always fully applied.
Hence, the only kind of value of arrow type is a lambda. This
property is useful for flow analysis and later in code generation.
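
Two of these transformations can be sketched in Python (not MLton's SML). All names are hypothetical. The field ordering used here (numeric labels by value, then symbolic labels alphabetically) is an assumption made for illustration; the text above only says that a field ordering relation exists. The multi-parameter `Lam` is also a simplification chosen to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Any, Tuple


def field_key(label):
    """Sort key for record labels.

    ASSUMPTION: numeric labels come first, ordered by value, followed by
    symbolic labels in alphabetical order. MLton's actual relation may differ.
    """
    if label.isdigit():
        return (0, int(label), "")
    return (1, 0, label)


def record_to_tuple(record):
    """Drop the labels, keeping the components in field order."""
    return tuple(record[label] for label in sorted(record, key=field_key))


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class PrimApp:
    """A fully applied primitive."""
    prim: str
    args: Tuple[Var, ...]


@dataclass(frozen=True)
class Lam:
    """A lambda; several parameters are allowed here only for brevity."""
    params: Tuple[str, ...]
    body: Any


def eta_expand(prim, arity):
    """Wrap a bare primitive reference in a lambda that applies it fully."""
    params = tuple(f"x{i}" for i in range(arity))
    return Lam(params, PrimApp(prim, tuple(Var(p) for p in params)))


point = record_to_tuple({"y": 2, "x": 1})
triple = record_to_tuple({"2": "b", "10": "c", "1": "a"})
add_fn = eta_expand("add", 2)
```

With these definitions `point` is `(1, 2)` and `triple` is `("a", "b", "c")`. The value `add_fn` is a `Lam` whose body is a `PrimApp` on all of its parameters, so a primitive never appears as a first-class value on its own.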

An <:XML:> program is a list of toplevel datatype declarations and a
body expression. Because datatype declarations are not generative,
the defunctorizer can safely move them to toplevel.