Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / src / PolymorphicEquality.adoc
1 PolymorphicEquality
2 ===================
3 :toc:
4
5 Polymorphic equality is a built-in function in
6 <:StandardML:Standard ML> that compares two values of the same type
7 for equality. It is specified as
8
9 [source,sml]
10 ----
11 val = : ''a * ''a -> bool
12 ----
13
14 The `''a` in the specification are
15 <:EqualityTypeVariable:equality type variables>, and indicate that
16 polymorphic equality can only be applied to values of an
17 <:EqualityType:equality type>. It is not allowed in SML to rebind
18 `=`, so a programmer is guaranteed that `=` always denotes polymorphic
19 equality.
20
21
22 == Equality of ground types ==
23
24 Ground types like `char`, `int`, and `word` may be compared (to values
25 of the same type). For example, `13 = 14` is type correct and yields
26 `false`.
27
28
29 == Equality of reals ==
30
31 The one ground type that can not be compared is `real`. So,
32 `13.0 = 14.0` is not type correct. One can use `Real.==` to compare
33 reals for equality, but beware that this has different algebraic
34 properties than polymorphic equality.
35
36 See http://standardml.org/Basis/real.html for a discussion of why
37 `real` is not an equality type.
38
39
40 == Equality of functions ==
41
42 Comparison of functions is not allowed.
43
44
45 == Equality of immutable types ==
46
47 Polymorphic equality can be used on <:Immutable:immutable> values like
48 tuples, records, lists, and vectors. For example,
49
50 ----
51 (1, 2, 3) = (4, 5, 6)
52 ----
53
54 is a type-correct expression yielding `false`, while
55
56 ----
57 [1, 2, 3] = [1, 2, 3]
58 ----
59
60 is type correct and yields `true`.
61
62 Equality on immutable values is computed by structure, which means
63 that values are compared by recursively descending the data structure
64 until ground types are reached, at which point the ground types are
65 compared with primitive equality tests (like comparison of
66 characters). So, the expression
67
68 ----
69 [1, 2, 3] = [1, 1 + 1, 1 + 1 + 1]
70 ----
71
72 is guaranteed to yield `true`, even though the lists may occupy
73 different locations in memory.
74
75 Because of structural equality, immutable values can only be compared
76 if their components can be compared. For example, `[1, 2, 3]` can be
77 compared, but `[1.0, 2.0, 3.0]` can not. The SML type system uses
78 <:EqualityType:equality types> to ensure that structural equality is
79 only applied to valid values.
80
81
82 == Equality of mutable values ==
83
84 In contrast to immutable values, polymorphic equality of
85 <:Mutable:mutable> values (like ref cells and arrays) is performed by
86 pointer comparison, not by structure. So, the expression
87
88 ----
89 ref 13 = ref 13
90 ----
91
92 is guaranteed to yield `false`, even though the ref cells hold the
93 same contents.
94
95 Because equality of mutable values is not structural, arrays and refs
96 can be compared _even if their components are not equality types_.
97 Hence, the following expression is type correct (and yields true).
98
99 [source,sml]
100 ----
101 let
102 val r = ref 13.0
103 in
104 r = r
105 end
106 ----
107
108
109 == Equality of datatypes ==
110
111 Polymorphic equality of datatypes is structural. Two values of the
112 same datatype are equal if they are of the same <:Variant:variant> and
113 if the <:Variant:variant>'s arguments are equal (recursively). So,
114 with the datatype
115
116 [source,sml]
117 ----
118 datatype t = A | B of t
119 ----
120
121 then `B (B A) = B A` is type correct and yields `false`, while `A = A`
122 and `B A = B A` yield `true`.
123
124 As polymorphic equality descends two values to compare them, it uses
125 pointer equality whenever it reaches a mutable value. So, with the
126 datatype
127
128 [source,sml]
129 ----
130 datatype t = A of int ref | ...
131 ----
132
133 then `A (ref 13) = A (ref 13)` is type correct and yields `false`,
134 because the pointer equality on the two ref cells yields `false`.
135
136 One weakness of the SML type system is that datatypes do not inherit
137 the special property of the `ref` and `array` type constructors that
138 allows them to be compared regardless of their component type. For
139 example, after declaring
140
141 [source,sml]
142 ----
143 datatype 'a t = A of 'a ref
144 ----
145
146 one might expect to be able to compare two values of type `real t`,
147 because pointer comparison on a ref cell would suffice.
148 Unfortunately, the type system can only express that a user-defined
149 datatype <:AdmitsEquality:admits equality> or not. In this case, `t`
150 admits equality, which means that `int t` can be compared but that
151 `real t` can not. We can confirm this with the program
152
153 [source,sml]
154 ----
155 datatype 'a t = A of 'a ref
156 fun f (x: real t, y: real t) = x = y
157 ----
158
159 on which MLton reports the following error.
160
161 ----
162 Error: z.sml 2.32-2.36.
163 Function applied to incorrect argument.
164 expects: [<equality>] t * [<equality>] t
165 but got: [real] t * [real] t
166 in: = (x, y)
167 ----
168
169
170 == Implementation ==
171
172 Polymorphic equality is implemented by recursively descending the two
173 values being compared, stopping as soon as they are determined to be
174 unequal, or exploring the entire values to determine that they are
175 equal. Hence, polymorphic equality can take time proportional to the
176 size of the smaller value.
177
178 MLton uses some optimizations to improve performance.
179
180 * When computing structural equality, first do a pointer comparison.
181 If the comparison yields `true`, then stop and return `true`, since
182 the structural comparison is guaranteed to do so. If the pointer
183 comparison fails, then recursively descend the values.
184
185 * If a datatype is an enum (e.g. `datatype t = A | B | C`), then a
186 single comparison suffices to compare values of the datatype. No case
187 dispatch is required to determine whether the two values are of the
188 same <:Variant:variant>.
189
190 * When comparing a known constant non-value-carrying
191 <:Variant:variant>, use a single comparison. For example, the
192 following code will compile into a single comparison for `A = x`.
193 +
194 [source,sml]
195 ----
196 datatype t = A | B | C of ...
197 fun f x = ... if A = x then ...
198 ----
199
200 * When comparing a small constant `IntInf.int` to another
201 `IntInf.int`, use a single comparison against the constant. No case
202 dispatch is required.
203
204
205 == Also see ==
206
207 * <:AdmitsEquality:>
208 * <:EqualityType:>
209 * <:EqualityTypeVariable:>