Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / src / Monomorphise.adoc
CommitLineData
7f918cf1
CE
1Monomorphise
2============
3
4<:Monomorphise:> is a translation pass from the <:XML:>
5<:IntermediateLanguage:> to the <:SXML:> <:IntermediateLanguage:>.
6
7== Description ==
8
9Monomorphisation eliminates polymorphic values and datatype
10declarations by duplicating them for each type at which they are used.
11
12Consider the following <:XML:> program.
13[source,sml]
14----
15datatype 'a t = T of 'a
16fun 'a f (x: 'a) = T x
17val a = f 1
18val b = f 2
19val z = f (3, 4)
20----
21
22The result of monomorphising this program is the following <:SXML:> program:
23[source,sml]
24----
25datatype t1 = T1 of int
26datatype t2 = T2 of int * int
27fun f1 (x: int) = T1 x
28fun f2 (x: int * int) = T2 x
29val a = f1 1
30val b = f1 2
31val z = f2 (3, 4)
32----
33
34== Implementation ==
35
36* <!ViewGitFile(mlton,master,mlton/xml/monomorphise.sig)>
37* <!ViewGitFile(mlton,master,mlton/xml/monomorphise.fun)>
38
39== Details and Notes ==
40
41The monomorphiser works by making one pass over the entire program.
42On the way down, it creates a cache for each variable declared in a
43polymorphic declaration that maps a lists of type arguments to a new
44variable name. At a variable reference, it consults the cache (based
45on the types the variable is applied to). If there is already an
46entry in the cache, it is used. If not, a new entry is created. On
47the way up, the monomorphiser duplicates a variable declaration for
48each entry in the cache.
49
50As with variables, the monomorphiser records all of the type at which
51constructors are used. After the entire program is processed, the
52monomorphiser duplicates each datatype declaration and its associated
53constructors.
54
55The monomorphiser duplicates all of the functions declared in a
56`fun` declaration as a unit. Consider the following program
57[source,sml]
58----
59fun 'a f (x: 'a) = g x
60and g (y: 'a) = f y
61val a = f 13
62val b = g 14
63val c = f (1, 2)
64----
65
66and its monomorphisation
67
68[source,sml]
69----
70fun f1 (x: int) = g1 x
71and g1 (y: int) = f1 y
72fun f2 (x : int * int) = g2 x
73and g2 (y : int * int) = f2 y
74val a = f1 13
75val b = g1 14
76val c = f2 (1, 2)
77----
78
79== Pathological datatype declarations ==
80
81SML allows a pathological polymorphic datatype declaration in which
82recursive uses of the defined type constructor are applied to
83different type arguments than the definition. This has been
84disallowed by others on type theoretic grounds. A canonical example
85is the following.
86[source,sml]
87----
88datatype 'a t = A of 'a | B of ('a * 'a) t
89val z : int t = B (B (A ((1, 2), (3, 4))))
90----
91
92The presence of the recursion in the datatype declaration might appear
93to cause the need for the monomorphiser to create an infinite number
94of types. However, due to the absence of polymorphic recursion in
95SML, there are in fact only a finite number of instances of such types
96in any given program. The monomorphiser translates the above program
97to the following one.
98[source,sml]
99----
100datatype t1 = B1 of t2
101datatype t2 = B2 of t3
102datatype t3 = A3 of (int * int) * (int * int)
103val z : int t = B1 (B2 (A3 ((1, 2), (3, 4))))
104----
105
106It is crucial that the monomorphiser be allowed to drop unused
107constructors from datatype declarations in order for the translation
108to terminate.