Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / src / PolymorphicEquality.adoc
CommitLineData
7f918cf1
CE
1PolymorphicEquality
2===================
3:toc:
4
5Polymorphic equality is a built-in function in
6<:StandardML:Standard ML> that compares two values of the same type
7for equality. It is specified as
8
9[source,sml]
10----
11val = : ''a * ''a -> bool
12----
13
14The `''a` in the specification are
15<:EqualityTypeVariable:equality type variables>, and indicate that
16polymorphic equality can only be applied to values of an
17<:EqualityType:equality type>. It is not allowed in SML to rebind
18`=`, so a programmer is guaranteed that `=` always denotes polymorphic
19equality.
20
21
22== Equality of ground types ==
23
24Ground types like `char`, `int`, and `word` may be compared (to values
25of the same type). For example, `13 = 14` is type correct and yields
26`false`.
27
28
29== Equality of reals ==
30
31The one ground type that can not be compared is `real`. So,
32`13.0 = 14.0` is not type correct. One can use `Real.==` to compare
33reals for equality, but beware that this has different algebraic
34properties than polymorphic equality.
35
36See http://standardml.org/Basis/real.html for a discussion of why
37`real` is not an equality type.
38
39
40== Equality of functions ==
41
42Comparison of functions is not allowed.
43
44
45== Equality of immutable types ==
46
47Polymorphic equality can be used on <:Immutable:immutable> values like
48tuples, records, lists, and vectors. For example,
49
50----
51(1, 2, 3) = (4, 5, 6)
52----
53
54is a type-correct expression yielding `false`, while
55
56----
57[1, 2, 3] = [1, 2, 3]
58----
59
60is type correct and yields `true`.
61
62Equality on immutable values is computed by structure, which means
63that values are compared by recursively descending the data structure
64until ground types are reached, at which point the ground types are
65compared with primitive equality tests (like comparison of
66characters). So, the expression
67
68----
69[1, 2, 3] = [1, 1 + 1, 1 + 1 + 1]
70----
71
72is guaranteed to yield `true`, even though the lists may occupy
73different locations in memory.
74
75Because of structural equality, immutable values can only be compared
76if their components can be compared. For example, `[1, 2, 3]` can be
77compared, but `[1.0, 2.0, 3.0]` can not. The SML type system uses
78<:EqualityType:equality types> to ensure that structural equality is
79only applied to valid values.
80
81
82== Equality of mutable values ==
83
84In contrast to immutable values, polymorphic equality of
85<:Mutable:mutable> values (like ref cells and arrays) is performed by
86pointer comparison, not by structure. So, the expression
87
88----
89ref 13 = ref 13
90----
91
92is guaranteed to yield `false`, even though the ref cells hold the
93same contents.
94
95Because equality of mutable values is not structural, arrays and refs
96can be compared _even if their components are not equality types_.
97Hence, the following expression is type correct (and yields true).
98
99[source,sml]
100----
101let
102 val r = ref 13.0
103in
104 r = r
105end
106----
107
108
109== Equality of datatypes ==
110
111Polymorphic equality of datatypes is structural. Two values of the
112same datatype are equal if they are of the same <:Variant:variant> and
113if the <:Variant:variant>'s arguments are equal (recursively). So,
114with the datatype
115
116[source,sml]
117----
118datatype t = A | B of t
119----
120
121then `B (B A) = B A` is type correct and yields `false`, while `A = A`
122and `B A = B A` yield `true`.
123
124As polymorphic equality descends two values to compare them, it uses
125pointer equality whenever it reaches a mutable value. So, with the
126datatype
127
128[source,sml]
129----
130datatype t = A of int ref | ...
131----
132
133then `A (ref 13) = A (ref 13)` is type correct and yields `false`,
134because the pointer equality on the two ref cells yields `false`.
135
136One weakness of the SML type system is that datatypes do not inherit
137the special property of the `ref` and `array` type constructors that
138allows them to be compared regardless of their component type. For
139example, after declaring
140
141[source,sml]
142----
143datatype 'a t = A of 'a ref
144----
145
146one might expect to be able to compare two values of type `real t`,
147because pointer comparison on a ref cell would suffice.
148Unfortunately, the type system can only express that a user-defined
149datatype <:AdmitsEquality:admits equality> or not. In this case, `t`
150admits equality, which means that `int t` can be compared but that
151`real t` can not. We can confirm this with the program
152
153[source,sml]
154----
155datatype 'a t = A of 'a ref
156fun f (x: real t, y: real t) = x = y
157----
158
159on which MLton reports the following error.
160
161----
162Error: z.sml 2.32-2.36.
163 Function applied to incorrect argument.
164 expects: [<equality>] t * [<equality>] t
165 but got: [real] t * [real] t
166 in: = (x, y)
167----
168
169
170== Implementation ==
171
172Polymorphic equality is implemented by recursively descending the two
173values being compared, stopping as soon as they are determined to be
174unequal, or exploring the entire values to determine that they are
175equal. Hence, polymorphic equality can take time proportional to the
176size of the smaller value.
177
178MLton uses some optimizations to improve performance.
179
180* When computing structural equality, first do a pointer comparison.
181If the comparison yields `true`, then stop and return `true`, since
182the structural comparison is guaranteed to do so. If the pointer
183comparison fails, then recursively descend the values.
184
185* If a datatype is an enum (e.g. `datatype t = A | B | C`), then a
186single comparison suffices to compare values of the datatype. No case
187dispatch is required to determine whether the two values are of the
188same <:Variant:variant>.
189
190* When comparing a known constant non-value-carrying
191<:Variant:variant>, use a single comparison. For example, the
192following code will compile into a single comparison for `A = x`.
193+
194[source,sml]
195----
196datatype t = A | B | C of ...
197fun f x = ... if A = x then ...
198----
199
200* When comparing a small constant `IntInf.int` to another
201`IntInf.int`, use a single comparison against the constant. No case
202dispatch is required.
203
204
205== Also see ==
206
207* <:AdmitsEquality:>
208* <:EqualityType:>
209* <:EqualityTypeVariable:>