Import Upstream version 20180207
[hcoop/debian/mlton.git] / doc / guide / src / TipsForWritingConciseSML.adoc
CommitLineData
7f918cf1
CE
1TipsForWritingConciseSML
2========================
3
4SML is a rich enough language that there are often several ways to
5express things. This page contains miscellaneous tips (ideas not
6rules) for writing concise SML. The metric that we are interested in
7here is the number of tokens or words (rather than the number of
8lines, for example).
9
10== Datatypes in Signatures ==
11
12A seemingly frequent source of repetition in SML is that of datatype
13definitions in signatures and structures. Actually, it isn't
14repetition at all. A datatype specification in a signature, such as,
15
16[source,sml]
17----
18signature EXP = sig
19 datatype exp = Fn of id * exp | App of exp * exp | Var of id
20end
21----
22
23is just a specification of a datatype that may be matched by multiple
24(albeit identical) datatype declarations. For example, in
25
26[source,sml]
27----
28structure AnExp : EXP = struct
29 datatype exp = Fn of id * exp | App of exp * exp | Var of id
30end
31
32structure AnotherExp : EXP = struct
33 datatype exp = Fn of id * exp | App of exp * exp | Var of id
34end
35----
36
37the types `AnExp.exp` and `AnotherExp.exp` are two distinct types. If
38such <:GenerativeDatatype:generativity> isn't desired or needed, you
39can avoid the repetition:
40
41[source,sml]
42----
43structure Exp = struct
44 datatype exp = Fn of id * exp | App of exp * exp | Var of id
45end
46
47signature EXP = sig
48 datatype exp = datatype Exp.exp
49end
50
51structure Exp : EXP = struct
52 open Exp
53end
54----
55
56Keep in mind that this isn't semantically equivalent to the original.
57
58
59== Clausal Function Definitions ==
60
61The syntax of clausal function definitions is rather repetitive. For
62example,
63
64[source,sml]
65----
66fun isSome NONE = false
67 | isSome (SOME _) = true
68----
69
70is more verbose than
71
72[source,sml]
73----
74val isSome =
75 fn NONE => false
76 | SOME _ => true
77----
78
79For recursive functions the break-even point is one clause higher. For example,
80
81[source,sml]
82----
83fun fib 0 = 0
84 | fib 1 = 1
85 | fib n = fib (n-1) + fib (n-2)
86----
87
88isn't less verbose than
89
90[source,sml]
91----
92val rec fib =
93 fn 0 => 0
94 | 1 => 1
95 | n => fib (n-1) + fib (n-2)
96----
97
98It is quite often the case that a curried function primarily examines
99just one of its arguments. Such functions can be written particularly
100concisely by making the examined argument last. For example, instead
101of
102
103[source,sml]
104----
105fun eval (Fn (v, b)) env => ...
106 | eval (App (f, a) env => ...
107 | eval (Var v) env => ...
108----
109
110consider writing
111
112[source,sml]
113----
114fun eval env =
115 fn Fn (v, b) => ...
116 | App (f, a) => ...
117 | Var v => ...
118----
119
120
121== Parentheses ==
122
123It is a good idea to avoid using lots of irritating superfluous
124parentheses. An important rule to know is that prefix function
125application in SML has higher precedence than any infix operator. For
126example, the outer parentheses in
127
128[source,sml]
129----
130(square (5 + 1)) + (square (5 * 2))
131----
132
133are superfluous.
134
135People trained in other languages often use superfluous parentheses in
136a number of places. In particular, the parentheses in the following
137examples are practically always superfluous and are best avoided:
138
139[source,sml]
140----
141if (condition) then ... else ...
142while (condition) do ...
143----
144
145The same basically applies to case expressions:
146
147[source,sml]
148----
149case (expression) of ...
150----
151
152It is not uncommon to match a tuple of two or more values:
153
154[source,sml]
155----
156case (a, b) of
157 (A1, B1) => ...
158 | (A2, B2) => ...
159----
160
161Such case expressions can be written more concisely with an
162<:ProductType:infix product constructor>:
163
164[source,sml]
165----
166case a & b of
167 A1 & B1 => ...
168 | A2 & B2 => ...
169----
170
171
172== Conditionals ==
173
174Repeated sequences of conditionals such as
175
176[source,sml]
177----
178if x < y then ...
179else if x = y then ...
180else ...
181----
182
183can often be written more concisely as case expressions such as
184
185[source,sml]
186----
187case Int.compare (x, y) of
188 LESS => ...
189 | EQUAL => ...
190 | GREATER => ...
191----
192
193For a custom comparison, you would then define an appropriate datatype
194and a reification function. An alternative to using datatypes is to
195use dispatch functions
196
197[source,sml]
198----
199comparing (x, y)
200{lt = fn () => ...,
201 eq = fn () => ...,
202 gt = fn () => ...}
203----
204
205where
206
207[source,sml]
208----
209fun comparing (x, y) {lt, eq, gt} =
210 (case Int.compare (x, y) of
211 LESS => lt
212 | EQUAL => eq
213 | GREATER => gt) ()
214----
215
216An advantage is that no datatype definition is needed. A disadvantage
217is that you can't combine multiple dispatch results easily.
218
219
220== Command-Query Fusion ==
221
222Many are familiar with the
223http://en.wikipedia.org/wiki/Command-Query_Separation[Command-Query
224Separation Principle]. Adhering to the principle, a signature for an
225imperative stack might contain specifications
226
227[source,sml]
228----
229val isEmpty : 'a t -> bool
230val pop : 'a t -> 'a
231----
232
233and use of a stack would look like
234
235[source,sml]
236----
237if isEmpty stack
238then ... pop stack ...
239else ...
240----
241
242or, when the element needs to be named,
243
244[source,sml]
245----
246if isEmpty stack
247then let val elem = pop stack in ... end
248else ...
249----
250
251For efficiency, correctness, and conciseness, it is often better to
252combine the query and command and return the result as an option:
253
254[source,sml]
255----
256val pop : 'a t -> 'a option
257----
258
259A use of a stack would then look like this:
260
261[source,sml]
262----
263case pop stack of
264 NONE => ...
265 | SOME elem => ...
266----