StandardMLGotchas
=================
:toc:

This page contains brief explanations of some recurring sources of
confusion and problems that SML newbies encounter.

Many confusions about the syntax of SML seem to arise from the use of
an interactive REPL (Read-Eval-Print Loop) while trying to learn the
basics of the language. While writing your first SML programs, you
should keep the source code of your programs in a form that is
accepted by an SML compiler as a whole.

== The `and` keyword ==

It is a common mistake to misuse the `and` keyword or to not know how
to introduce mutually recursive definitions. The purpose of the `and`
keyword is to introduce mutually recursive definitions of functions
and datatypes. For example,

[source,sml]
----
fun isEven 0w0 = true
  | isEven 0w1 = false
  | isEven n = isOdd (n-0w1)
and isOdd 0w0 = false
  | isOdd 0w1 = true
  | isOdd n = isEven (n-0w1)
----

and

[source,sml]
----
datatype decl = VAL of id * pat * expr
             (* | ... *)
     and expr = LET of decl * expr
             (* | ... *)
----

You can also use `and` as a shorthand to combine several bindings of
other kinds (for example, `val` or `type` bindings) into a single
declaration, but it is not necessary.

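As a minimal sketch of `and` outside of recursive definitions (the
names `x`, `y`, `point`, and `label` are illustrative and not part of
the original page), the bindings joined by `and` are made
simultaneously, so no right-hand side sees the names being introduced:

[source,sml]
----
(* Simultaneous value bindings: both right-hand sides are evaluated
   before either name is bound. *)
val x = 1
and y = 2

(* Because the bindings are simultaneous, this swaps x and y. *)
val x = y
and y = x

(* Simultaneous type abbreviations. *)
type point = int * int
and label = string
----
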
== Constructed patterns ==

It is a common mistake to forget to parenthesize constructed patterns
in `fun` bindings. Consider the following invalid definition:

[source,sml]
----
fun length nil = 0
  | length h :: t = 1 + length t
----

The pattern `h :: t` needs to be parenthesized:

[source,sml]
----
fun length nil = 0
  | length (h :: t) = 1 + length t
----

The parentheses are needed, because a `fun` definition may have
multiple consecutive constructed patterns through currying.

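To illustrate the currying point, here is a small sketch of a curried
function that takes two list arguments. The function name `zipWith`
and the test value `sums` are hypothetical and chosen only for this
example. Each curried argument is a separate pattern, so each
constructed pattern carries its own parentheses:

[source,sml]
----
(* Two consecutive constructed patterns, one per curried argument. *)
fun zipWith f (x :: xs) (y :: ys) = f (x, y) :: zipWith f xs ys
  | zipWith _ _ _ = []

(* Pairwise sums of two integer lists. *)
val sums = zipWith op + [1, 2, 3] [10, 20, 30]
----
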
The same applies to nonfix constructors. For example, the parentheses
in

[source,sml]
----
fun valOf NONE = raise Option
  | valOf (SOME x) = x
----

are required. However, the outermost constructed pattern in a `fn` or
`case` expression need not be parenthesized, because in those cases
there is always just one constructed pattern. So, both

[source,sml]
----
val valOf = fn NONE => raise Option
             | SOME x => x
----

and

[source,sml]
----
fun valOf x = case x of
                 NONE => raise Option
               | SOME x => x
----

are fine.

== Declarations and expressions ==

It is a common mistake to confuse expressions and declarations.
Normally an SML source file should only contain declarations. The
following are declarations:

[source,sml]
----
datatype dt = ...
fun f ... = ...
functor Fn (...) = ...
infix ...
infixr ...
local ... in ... end
nonfix ...
open ...
signature SIG = ...
structure Struct = ...
type t = ...
val v = ...
----

Note that

[source,sml]
----
let ... in ... end
----

isn't a declaration.

To specify a side-effecting computation in a source file, you can write:

[source,sml]
----
val () = ...
----

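As a concrete sketch (the name `total` is illustrative), a `let`
expression becomes acceptable at the top level of a file once it is
the right-hand side of a `val` declaration, and a side effect is run
by binding its `unit` result to the pattern `()`:

[source,sml]
----
(* The let expression is the right-hand side of a val declaration. *)
val total =
   let
      val a = 1
      val b = 2
   in
      a + b
   end

(* The print call returns unit, which matches the pattern (). *)
val () = print (Int.toString total ^ "\n")
----
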

== Equality types ==

SML has a fairly intricate built-in notion of equality. See
<:EqualityType:> and <:EqualityTypeVariable:> for a thorough
discussion.


== Nested cases ==

It is a common mistake to write nested case expressions without the
necessary parentheses. See <:UnresolvedBugs:> for a discussion.

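A minimal sketch of the problem, using a hypothetical `color` datatype
and `describe` function: a `case` expression extends as far to the
right as possible, so without the parentheses around the inner `case`,
the final `false` rule would be parsed as a rule of the inner `case`
and rejected by the type checker.

[source,sml]
----
datatype color = RED | GREEN

fun describe (flag, color) =
   case flag of
      true => (case color of
                  RED => "on/red"
                | GREEN => "on/green")
    | false => "off"
----
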

== (op *) ==

It used to be a common mistake to parenthesize `op *` as `(op *)`.
Before SML'97, `*)` was considered a comment terminator in SML and
caused a syntax error. At the time of writing, <:SMLNJ:SML/NJ> still
rejects the code. An extra space may be used for portability:
`(op * )`. However, parenthesizing `op` is redundant, even though it
is a widely used convention.

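Both portable spellings can be sketched as follows (the names
`product` and `product'` are illustrative); they denote the same
function:

[source,sml]
----
(* Unparenthesized: op * is already an atomic expression. *)
val product = foldl op * 1 [1, 2, 3, 4]

(* Parenthesized, with a space before the closing parenthesis so that
   the two characters are never read as a comment terminator. *)
val product' = foldl (op * ) 1 [1, 2, 3, 4]
----
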

== Overloading ==

A number of standard operators (`+`, `-`, `~`, `*`, `<`, `>`, ...) and
numeric constants are overloaded for some of the numeric types (`int`,
`real`, `word`). It is a common surprise that definitions using
overloaded operators such as

[source,sml]
----
fun min (x, y) = if y < x then y else x
----

are not overloaded themselves. SML doesn't really support
(user-defined) overloading or other forms of ad hoc polymorphism. In
cases such as the above where the context doesn't resolve the
overloading, expressions using overloaded operators or constants get
assigned a default type. The above definition gets the type

[source,sml]
----
val min : int * int -> int
----

See <:Overloading:> and <:TypeIndexedValues:> for further discussion.

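When another numeric type is wanted, a type annotation supplies the
context that resolves the overloading. The following sketch assumes a
separate function per type; the names `minReal`, `a`, and `b` are
hypothetical:

[source,sml]
----
(* No annotation: < defaults to int, giving int * int -> int. *)
fun min (x, y) = if y < x then y else x

(* Annotating one argument resolves < at real: real * real -> real. *)
fun minReal (x : real, y) = if y < x then y else x

val a = min (3, 4)
val b = minReal (3.5, 2.25)
----
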

== Semicolons ==

It is a common mistake to use redundant semicolons in SML code. This
is probably caused by the fact that in an SML REPL, a semicolon (and
enter) is used to signal the REPL that it should evaluate the
preceding chunk of code as a unit. In SML source files, semicolons
are really needed in only two places. Namely, in expressions of the
form

[source,sml]
----
(exp ; ... ; exp)
----

and

[source,sml]
----
let ... in exp ; ... ; exp end
----

Note that semicolons act as expression (or declaration) separators
rather than as terminators.

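A short sketch of both forms (the names `three` and `six` are
illustrative). Note that no semicolons appear after the declarations,
and that the last expression of each sequence is not followed by one:

[source,sml]
----
(* A parenthesized sequence: its value is that of the last expression. *)
val three = (print "one\n"; print "two\n"; 3)

(* The body of a let may be a sequence without extra parentheses. *)
val six =
   let
      val n = three
   in
      print "doubling\n";
      n * 2
   end
----
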

== Stale bindings ==

{empty}


== Unresolved records ==

{empty}


== Value restriction ==

See <:ValueRestriction:>.


== Type Variable Scope ==

See <:TypeVariableScope:>.