Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | \documentclass[12pt]{article} |
2 | \usepackage{alltt,epsfig,html,latexsym,longtable,makeidx,moreverb} | |
3 | ||
4 | \setlength\topmargin{-0.5in} | |
5 | \setlength\textheight{8.5in} | |
6 | \setlength\textwidth{7.0in} | |
7 | \setlength\oddsidemargin{-0.3in} | |
8 | \setlength\evensidemargin{-0.3in} | |
9 | \hyphenation{} | |
10 | \title{{\mlton} SML Style Guide} | |
11 | \author{Stephen Weeks} | |
12 | \date{\today} | |
13 | \include{macros} | |
14 | \makeindex | |
15 | ||
16 | \begin{document} | |
17 | ||
18 | \maketitle | |
19 | \input{abstract} | |
20 | ||
21 | % conventions chosen so that inertia is towards modularity and reuse | |
22 | % not to type fewer characters | |
23 | ||
24 | \sec{High-level structure}{high-level-structure} | |
25 | ||
26 | Code is structured in {\mlton} so that signatures are closed. Thus, in | |
27 | {\mlton}, one would never write the following. | |
28 | \begin{verbatim} | |
29 | signature SIG = | |
30 | sig | |
31 | val f: Foo.t -> int | |
32 | end | |
33 | \end{verbatim} | |
34 | Instead, one would write the following. | |
35 | \begin{verbatim} | |
36 | signature SIG = | |
37 | sig | |
38 | structure Foo: FOO | |
39 | ||
40 | val f: Foo.t -> int | |
41 | end | |
42 | \end{verbatim} | |
43 | The benefit of this approach is that one can first understand the | |
44 | specifications (i.e. signatures) of all of the modules in {\mlton} before having | |
45 | to look at any implementations (i.e. structures or functors). That is, the | |
46 | signatures are self-contained. | |
47 | ||
48 | We deviate from this only in allowing references to top level types (like {\tt | |
49 | int}), basis library modules, and {\mlton} library modules. So, the following | |
50 | signature is fine, because structure {\tt Regexp} is part of the {\mlton} | |
51 | library. | |
52 | \begin{verbatim} | |
53 | signature SIG = | |
54 | sig | |
55 | val f: Regexp.t -> int | |
56 | end | |
57 | \end{verbatim} | |
58 | ||
59 | We also use signatures to express (some of) the dependencies between modules. | |
60 | For every module {\tt Foo}, we write two signatures in a file named {\tt | |
61 | foo.sig}. The signature {\tt FOO} specifies what is implemented by {\tt Foo}. | |
62 | The signature {\tt FOO\_STRUCTS} specifies the modules that are needed in order | |
63 | to specify {\tt Foo}, but that are not implemented by {\tt Foo}. As an example, | |
64 | consider {\mlton}'s closure conversion pass (in {\tt mlton/closure-convert}), | |
65 | which converts from {\tt Sxml}, {\mlton}'s higher-order simply-typed | |
66 | intermediate language, to {\tt Cps}, {\mlton}'s first-order simply-typed | |
67 | intermediate language. The file {\tt closure-convert.sig} contains the | |
68 | following. | |
69 | \begin{verbatim} | |
70 | signature CLOSURE_CONVERT_STRUCTS = | |
71 | sig | |
72 | structure Sxml: SXML | |
73 | structure Cps: CPS | |
74 | sharing Sxml.Atoms = Cps.Atoms | |
75 | end | |
76 | ||
77 | signature CLOSURE_CONVERT = | |
78 | sig | |
79 | include CLOSURE_CONVERT_STRUCTS | |
80 | ||
81 | val closureConvert: Sxml.Program.t -> Cps.Program.t | |
82 | end | |
83 | \end{verbatim} | |
84 | These signatures say that the {\tt ClosureConvert} module implements a function | |
85 | {\tt closureConvert} that transforms an {\tt Sxml} program into a {\tt Cps} | |
86 | program. They also say that {\tt ClosureConvert} does not implement {\tt Sxml} | |
87 | or {\tt Cps}. Rather, it expects some other modules to implement these and for | |
88 | them to be provided to {\tt ClosureConvert}. The sharing constraint expresses | |
89 | that the ILs must share some basic atoms, like constants, variables, and | |
90 | primitives. | |
91 | ||
92 | Given the two signatures that specify a module, the module definition always has | |
93 | the same structure. A module {\tt Foo} is implemented in a file named {\tt | |
94 | foo.fun}, which defines a functor named {\tt Foo} that takes as an argument a | |
95 | structure matching {\tt FOO\_STRUCTS} and returns as a result a structure | |
96 | matching {\tt FOO}. For example, {\tt closure-convert.fun} contains the | |
97 | following. | |
98 | \begin{verbatim} | |
99 | functor ClosureConvert (S: CLOSURE_CONVERT_STRUCTS): CLOSURE_CONVERT = | |
100 | struct | |
101 | ||
102 | open S | |
103 | ||
104 | fun closureConvert ... | |
105 | ||
106 | end | |
107 | \end{verbatim} | |
108 | Although the signatures for {\tt ClosureConvert} express the dependence | |
109 | on the {\tt Sxml} and {\tt Cps} ILs, they do not express the | |
110 | dependence on other modules that are only used internally to closure | |
111 | conversion. For example, closure conversion uses an auxiliary module {\tt | |
112 | AbstractValue} as part of its higher-order control-flow analysis. Because {\tt | |
113 | AbstractValue} is only used internally to closure conversion, it does not appear | |
114 | in the signatures that specify closure conversion. So, helper functors (like | |
115 | {\tt AbstractValue}) are analogous to helper functions in that they are not | |
116 | visible to clients. | |
117 | ||
118 | We do not put helper functors lexically in scope because SML only allows top | |
119 | level functor definitions and, more importantly, because files would become | |
120 | unmanageably large. Instead, helper functors get their own {\tt .sig} and {\tt | |
121 | .fun} file, which follow exactly the convention above. | |
122 | ||
123 | \section{General conventions} | |
124 | ||
125 | \begin{itemize} | |
126 | \item A line of code never exceeds 80 columns. | |
127 | \item Use alphabetical order wherever possible. | |
128 | \begin{itemize} | |
129 | \item record field names | |
130 | \item datatype constructors | |
131 | \item value specs in signatures | |
132 | \item file lists in CM files | |
133 | \item export lists in CM files | |
134 | \end{itemize} | |
135 | \end{itemize} | |
136 | ||
137 | %------------------------------------------------------ | |
138 | % Signature conventions | |
139 | %------------------------------------------------------ | |
140 | ||
141 | \sec{Signatures}{signature-conventions} | |
142 | ||
143 | We now enumerate the conventions we follow in writing signatures. | |
144 | ||
145 | \begin{enumerate} | |
146 | ||
147 | \item | |
148 | Signature identifiers are in all capitals, using ``\_'' to | |
149 | separate words. | |
150 | ||
151 | \item | |
152 | A signature typically contains a single type specification that defines a type | |
153 | constructor {\tt t}, which is the type of interest in the specification. For | |
154 | oexample, here are signature fragments for integers, lists, and maps. | |
155 | \begin{verbatim} | |
156 | signature INTEGER = | |
157 | sig | |
158 | type t | |
159 | ||
160 | val + : t * t -> t | |
161 | ... | |
162 | end | |
163 | ||
164 | signature LIST = | |
165 | sig | |
166 | type 'a t | |
167 | ||
168 | val map: 'a t * ('a -> 'b) -> 'b t | |
169 | ... | |
170 | end | |
171 | ||
172 | signature MAP | |
173 | sig | |
174 | type ('a, 'b) t | |
175 | ||
176 | val extend: ('a, 'b) t * 'a * 'b -> ('a, 'b) t | |
177 | ... | |
178 | end | |
179 | \end{verbatim} | |
180 | Although at first it might appear confusing to name every type {\tt t}, in fact | |
181 | there is never ambiguity, because at any point in the program there is at most | |
182 | one unqualified {\tt t} in scope, and all other types will be named with long | |
183 | identifiers (like {\tt Int.t} or {\tt Int.t List.t}). For example, the code for | |
184 | a function {\tt foo} within the {\tt Map} module might look like the following. | |
185 | \begin{verbatim} | |
186 | fun foo (l: 'a List.t, n: Int.t): ('a, Int.t) t = ... | |
187 | \end{verbatim} | |
188 | ||
189 | In practice, for pervasive types like {\tt int}, {\tt 'a list}, we often use the | |
190 | standard pervasive name instead of the {\tt t} name. | |
191 | ||
192 | \item Signatures should not contain free types or structures, other than | |
193 | pervasives, basis library modules, or {\mlton} library modules. This was | |
194 | explained in \secref{high-level-structure}. | |
195 | ||
196 | \item | |
197 | If additional abstract types (other than pervasive types) are needed to specify | |
198 | operations, they are included as substructures of the signature, and have a | |
199 | signature in their own right. For example, the following signature is good. | |
200 | ||
201 | \begin{verbatim} | |
202 | signature FOO = | |
203 | sig | |
204 | structure Var: VAR | |
205 | ||
206 | type t | |
207 | val fromVar: Var.t -> t | |
208 | val toVar: t -> Var.t | |
209 | end | |
210 | \end{verbatim} | |
211 | ||
212 | \item | |
213 | Signatures do not use substructures or multiple structures to group different | |
214 | operations on the same type. This makes you waste energy remembering where the | |
215 | operations are. For exmample, the following signature is bad. | |
216 | ||
217 | \begin{verbatim} | |
218 | signature REAL = | |
219 | sig | |
220 | type t | |
221 | ||
222 | val + : t * t -> t | |
223 | ||
224 | structure Trig: | |
225 | sig | |
226 | val sin: t -> t | |
227 | val cos: t -> t | |
228 | end | |
229 | end | |
230 | \end{verbatim} | |
231 | ||
232 | \item | |
233 | Signatures usually should not contain datatypes. This exposes the | |
234 | implementation of what should be an abstract type. For example, the following | |
235 | signature is bad. | |
236 | \begin{verbatim} | |
237 | signature COMPLEX = | |
238 | sig | |
239 | datatype t = T of real * real | |
240 | end | |
241 | \end{verbatim} | |
242 | A common exception to this rule is abstract syntax trees. | |
243 | ||
244 | \item | |
245 | Use structure sharing to express type sharing. For example, in {\tt | |
246 | closure-convert.sig}, a single structure sharing equation expresses a number of | |
247 | type sharing equations. | |
248 | ||
249 | \end{enumerate} | |
250 | ||
251 | %------------------------------------------------------ | |
252 | % Value specifications | |
253 | %------------------------------------------------------ | |
254 | ||
255 | \subsec{Value specifications}{val-specs} | |
256 | ||
257 | Here are the conventions that we use for individual value specifications in | |
258 | signatures. Of course, many of these conventions directly impact the way in | |
259 | which we write the core language expressions that implement the specifications. | |
260 | ||
261 | \begin{enumerate} | |
262 | ||
263 | \item | |
264 | In a datatype specification, if there is a single constructor, then that | |
265 | constructor is called {\tt T}. | |
266 | \begin{verbatim} | |
267 | datatype t = T of int | |
268 | \end{verbatim} | |
269 | ||
270 | \item | |
271 | In a datatype specification, if a constructor carries multiple values of the | |
272 | same type, use a record to name them to avoid confusion. | |
273 | \begin{verbatim} | |
274 | datatype t = T of {length: int, start: int} | |
275 | \end{verbatim} | |
276 | ||
277 | \item | |
278 | Identifiers begin with and use small letters, using capital letters to separate | |
279 | words. | |
280 | \begin{verbatim} | |
281 | val helloWorld: unit -> unit | |
282 | \end{verbatim} | |
283 | ||
284 | \item | |
285 | There is no space before the colon, and a single space after it. In the case of | |
286 | operators (like {\tt +}), there is a space before the colon to avoid lexing the | |
287 | colon as part of the operator. | |
288 | ||
289 | \item | |
290 | Pass multiple arguments as tuple, not curried. | |
291 | \begin{verbatim} | |
292 | val eval: Exp.t * Env.t -> Val.t | |
293 | \end{verbatim} | |
294 | ||
295 | \item | |
296 | Currying is only used when there staging of a computation, i.e., if | |
297 | precomputation is done on one of the arguments. | |
298 | \begin{verbatim} | |
299 | val match: Regexp.t -> string -> bool | |
300 | \end{verbatim} | |
301 | ||
302 | \item | |
303 | Functions which take a single element of the abstract type of a signature take | |
304 | the element as the first argument, and auxiliary arguments after. | |
305 | \begin{verbatim} | |
306 | val push: t * int -> unit | |
307 | val map: 'a t * ('a -> 'b) -> 'b t | |
308 | \end{verbatim} | |
309 | ||
310 | \item | |
311 | $n$-ary operations take the $n$ elements first, and auxilary arguments after. | |
312 | \begin{verbatim} | |
313 | val merge: 'a t * 'a t * ('a * 'a -> 'b) -> 'b t | |
314 | \end{verbatim} | |
315 | ||
316 | \item | |
317 | If two arguments to a function are of the same type, and the operation is not | |
318 | commutative, pass them using a record. This names the arguments and ensures | |
319 | they are not confused. Exceptions are the standard numerical and algebraic | |
320 | operators. | |
321 | \begin{verbatim} | |
322 | val fromTo: {start: int, step: int, stop: int} -> int list | |
323 | val substring: t * {length: int, start: int} -> t | |
324 | val - : t * t -> t | |
325 | \end{verbatim} | |
326 | ||
327 | \item | |
328 | Field names in record types are written in alphabetical order. | |
329 | ||
330 | \item | |
331 | Return multiple results as a tuple, or as a record if there is the potential for | |
332 | confusion. | |
333 | \begin{verbatim} | |
334 | val parse: string -> t * string | |
335 | val quotRem: t * t -> t * t | |
336 | val partition: 'a t * ('a -> bool) -> {no: 'a t, yes: 'a t} | |
337 | \end{verbatim} | |
338 | ||
339 | \item | |
340 | If a function returns multiple results, at least two of which are of the same | |
341 | type, and the name of the function does not clearly indicate which result is | |
342 | which, use a record to name the results. | |
343 | \begin{verbatim} | |
344 | val vars: t -> {frees : Vars.t, bound : Vars.t} | |
345 | val partition: 'a t * ('a -> bool) -> {yes : 'a t, no : 'a t} | |
346 | \end{verbatim} | |
347 | ||
348 | \item | |
349 | Use the same names and argument orders for similar functions in different | |
350 | signatures. This is especially common in the {\mlton} library. | |
351 | \begin{verbatim} | |
352 | val < : t * t -> bool | |
353 | val equals: t * t -> bool | |
354 | val forall: 'a t * ('a -> bool) -> bool | |
355 | \end{verbatim} | |
356 | ||
357 | \item | |
358 | Use {\tt is}, {\tt are}, {\tt can}, etc. to name predicates. One exception is | |
359 | {\tt equals}. | |
360 | \begin{verbatim} | |
361 | val isEven: int -> bool | |
362 | val canRead: t -> bool | |
363 | \end{verbatim} | |
364 | ||
365 | \end{enumerate} | |
366 | ||
367 | %------------------------------------------------------ | |
368 | % Signature example | |
369 | %------------------------------------------------------ | |
370 | ||
371 | \subsection{Example} | |
372 | ||
373 | Here is the complete specification of a simple interpreter. This demonstrates | |
374 | the {\tt t}-convention, the closed-signature convention, and the use of sharing | |
375 | constraints. | |
376 | ||
377 | \begin{verbatim} | |
378 | signature VAR = | |
379 | sig | |
380 | type t | |
381 | end | |
382 | ||
383 | signature EXP = | |
384 | sig | |
385 | structure Var: VAR | |
386 | ||
387 | datatype t = | |
388 | Var of Var.t | |
389 | | Lam of Var.t * t | |
390 | | App of t * t | |
391 | end | |
392 | ||
393 | signature VAL = | |
394 | sig | |
395 | structure Var: VAR | |
396 | ||
397 | type t | |
398 | ||
399 | val var: Var.t -> t | |
400 | val lam: Var.t * t -> t | |
401 | val app: t * t -> t | |
402 | end | |
403 | ||
404 | signature INTERP = | |
405 | sig | |
406 | structure Exp: EXP | |
407 | structure Val: VAL | |
408 | sharing Exp.Var = Val.Var | |
409 | ||
410 | val eval: Exp.t -> Val.t | |
411 | end | |
412 | ||
413 | signature ENV = | |
414 | sig | |
415 | structure Var: VAR | |
416 | ||
417 | type 'a t | |
418 | ||
419 | val lookup: 'a t * Var.t -> 'a | |
420 | val extend: 'a t * Var.t * 'a -> 'a t | |
421 | end | |
422 | \end{verbatim} | |
423 | ||
424 | %------------------------------------------------------ | |
425 | % Functors and structures | |
426 | %------------------------------------------------------ | |
427 | ||
428 | \section{Functors and structures} | |
429 | We now enumerate the conventions we follow in writing functors and structures. | |
430 | There is some repetition with \secref{high-level-structure}. | |
431 | ||
432 | \begin{enumerate} | |
433 | ||
434 | \item | |
435 | Functor identifiers begin with capital letters, use mixed case, and use capital | |
436 | letters to separate words. | |
437 | ||
438 | \item | |
439 | Functor definitions look like the following. | |
440 | \begin{verbatim} | |
441 | functor Foo (S: FOO_STRUCTS): FOO = | |
442 | struct | |
443 | ||
444 | open S | |
445 | ||
446 | ... | |
447 | ||
448 | end | |
449 | \end{verbatim} | |
450 | ||
451 | \item | |
452 | The name of the functor is the same as the name of the signature describing the | |
453 | structure it produces. | |
454 | ||
455 | \item | |
456 | The functor result is constrained by a signature. | |
457 | ||
458 | \item | |
459 | A functor takes as arguments any structures that occur in the signature of the | |
460 | result that it does not implement. | |
461 | ||
462 | \item | |
463 | Structure identifiers begin with capital letters, and use capital letters to | |
464 | separate words. | |
465 | ||
466 | \item | |
467 | The name of the structure is the same as the name of the functor that produces | |
468 | it. | |
469 | ||
470 | \item | |
471 | A structure definition looks like one of the following. | |
472 | \begin{verbatim} | |
473 | structure Foo = Foo (S) | |
474 | ||
475 | structure Foo = | |
476 | struct | |
477 | ... | |
478 | end | |
479 | \end{verbatim} | |
480 | ||
481 | \item | |
482 | Avoid the use of {\tt open} except within tightly constrained scopes. The use | |
483 | of {\tt open} makes it hard to look at code later and understand where things | |
484 | come from. | |
485 | ||
486 | \end{enumerate} | |
487 | ||
488 | %------------------------------------------------------ | |
489 | % Core expressions | |
490 | %------------------------------------------------------ | |
491 | ||
492 | \section{Core expressions} | |
493 | ||
494 | We now enumerate the conventions we follow in writing core expressions. We do | |
495 | not repeat the conventions of \secref{val-spec}, although many of them apply | |
496 | here. | |
497 | \begin{enumerate} | |
498 | ||
499 | \item | |
500 | Tuples are written with spaces after commas, like {\tt (a, b, c)}. | |
501 | ||
502 | \item | |
503 | Records are written with spaces on both sides of equals and with spaces after | |
504 | commas, like {\tt \{bar = 1, foo = 2\}}. | |
505 | ||
506 | \item | |
507 | Record field names are written in alphabetical order, both in expressions and | |
508 | types. | |
509 | ||
510 | \item | |
511 | Function application is written with a space between the function and the | |
512 | argument. If there is one untupled argument, it looks like {\tt f x}. If there | |
513 | is a tupleg argument, it looks like {\tt f (x, y, z)}. | |
514 | ||
515 | \item | |
516 | When you want to mix declarations with side-effecting statements, use a | |
517 | declaration like {\tt val \_ = sideEffectingProcedure()}. | |
518 | ||
519 | \item | |
520 | In sequence expressions {\tt (e1; e2)} that span multiple lines, place the | |
521 | semicolon at the beginning of lines. | |
522 | \begin{verbatim} | |
523 | (e1 | |
524 | ; e2 | |
525 | ; e3) | |
526 | \end{verbatim} | |
527 | ||
528 | \item | |
529 | Never write nonexhaustive matches. Always handle the default case and raise an | |
530 | error message. Your error message will be better than the compiler's. Also, if | |
531 | you have lots of uncaught cases, then you are probably not using the type system | |
532 | in a strong enough way - your types are not expressing as much as they could. | |
533 | ||
534 | \item | |
535 | Never use the syntax for declaring functions that repeats the function name. | |
536 | Use {\tt case} or {\tt fn} instead. That is, do not write the following. | |
537 | \begin{verbatim} | |
538 | fun f 0 = 1 | |
539 | | f n = n + 1 | |
540 | \end{verbatim} | |
541 | Instead, write the following. | |
542 | \begin{verbatim} | |
543 | val f = | |
544 | fn 0 => 1 | |
545 | | n => n + 1 | |
546 | \end{verbatim} | |
547 | Or, write the following. | |
548 | \begin{verbatim} | |
549 | fun f n = | |
550 | case n of | |
551 | 0 => 1 | |
552 | | _ => n + 1 | |
553 | \end{verbatim} | |
554 | ||
555 | \end{enumerate} | |
556 | ||
557 | \bibliographystyle{alpha} | |
558 | \bibliography{bib} | |
559 | \end{document} |