\documentclass[12pt]{article}
\usepackage{alltt,epsfig,html,latexsym,longtable,makeidx,moreverb}

\setlength\topmargin{-0.5in}
\setlength\textheight{8.5in}
\setlength\textwidth{7.0in}
\setlength\oddsidemargin{-0.3in}
\setlength\evensidemargin{-0.3in}

\title{{\mlton} SML Style Guide}
\author{Stephen Weeks}

% conventions chosen so that inertia is towards modularity and reuse
% not to type fewer characters

\sec{High-level structure}{high-level-structure}

Code is structured in {\mlton} so that signatures are closed. Thus, in
{\mlton}, one would never write the following.

Instead, one would write the following.

The benefit of this approach is that one can first understand the
specifications (i.e., signatures) of all of the modules in {\mlton} before having
to look at any implementations (i.e., structures or functors). That is, the
signatures are self-contained.

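
As a hedged sketch of what ``closed'' means here, consider the following. The
names {\tt VAR}, {\tt ENV\_OPEN}, and {\tt ENV} are hypothetical and do not come
from {\mlton}. The commented-out signature refers to a structure {\tt Var} that
it does not declare; the last signature declares {\tt Var} as a substructure and
so can be read on its own.
\begin{verbatim}
(* Hypothetical signature for variables. *)
signature VAR =
   sig
      type t
      val toString: t -> string
   end

(* Not closed: Var is free, so this elaborates only if a structure
 * named Var happens to be in scope where the signature is defined.
 *
 * signature ENV_OPEN =
 *    sig
 *       type t
 *       val lookup: t * Var.t -> int
 *    end
 *)

(* Closed: the dependence on Var is declared as a substructure. *)
signature ENV =
   sig
      structure Var: VAR

      type t

      val lookup: t * Var.t -> int
   end
\end{verbatim}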
We deviate from this only in allowing references to top level types (like
{\tt int}), basis library modules, and {\mlton} library modules. So, the following
signature is fine, because structure {\tt Regexp} is part of the {\mlton}
library.
\begin{verbatim}
      val f: Regexp.t -> int
\end{verbatim}
We also use signatures to express (some of) the dependencies between modules.
For every module {\tt Foo}, we write two signatures in a file named
{\tt foo.sig}. The signature {\tt FOO} specifies what is implemented by {\tt Foo}.
The signature {\tt FOO\_STRUCTS} specifies the modules that are needed in order
to specify {\tt Foo}, but that are not implemented by {\tt Foo}. As an example,
consider {\mlton}'s closure conversion pass (in {\tt mlton/closure-convert}),
which converts from {\tt Sxml}, {\mlton}'s higher-order simply-typed
intermediate language, to {\tt Cps}, {\mlton}'s first-order simply-typed
intermediate language. The file {\tt closure-convert.sig} contains the
following.
\begin{verbatim}
signature CLOSURE_CONVERT_STRUCTS =
   sig
      structure Sxml: SXML
      structure Cps: CPS
      sharing Sxml.Atoms = Cps.Atoms
   end

signature CLOSURE_CONVERT =
   sig
      include CLOSURE_CONVERT_STRUCTS

      val closureConvert: Sxml.Program.t -> Cps.Program.t
   end
\end{verbatim}
These signatures say that the {\tt ClosureConvert} module implements a function
{\tt closureConvert} that transforms an {\tt Sxml} program into a {\tt Cps}
program. They also say that {\tt ClosureConvert} does not implement {\tt Sxml}
or {\tt Cps}. Rather, it expects some other modules to implement these and for
them to be provided to {\tt ClosureConvert}. The sharing constraint expresses
that the ILs must share some basic atoms, such as constants and variables.

Given the two signatures that specify a module, the module definition always has
the same structure. A module {\tt Foo} is implemented in a file named
{\tt foo.fun}, which defines a functor named {\tt Foo} that takes as an argument a
structure matching {\tt FOO\_STRUCTS} and returns as a result a structure
matching {\tt FOO}. For example, {\tt closure-convert.fun} contains the
following.
\begin{verbatim}
functor ClosureConvert (S: CLOSURE_CONVERT_STRUCTS): CLOSURE_CONVERT =
struct

fun closureConvert ...

end
\end{verbatim}
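
The convention can be sketched end to end as follows. All names here
({\tt ELT}, {\tt PRINTER\_STRUCTS}, {\tt PRINTER}, {\tt Printer}, {\tt IntElt})
are hypothetical and chosen only for illustration; under the convention above,
the two {\tt PRINTER} signatures would live in {\tt printer.sig} and the functor
in {\tt printer.fun}.
\begin{verbatim}
(* Hypothetical signature of the module that Printer depends on. *)
signature ELT =
   sig
      type t
      val toString: t -> string
   end

(* What Printer needs but does not implement. *)
signature PRINTER_STRUCTS =
   sig
      structure Elt: ELT
   end

(* What Printer implements. *)
signature PRINTER =
   sig
      include PRINTER_STRUCTS

      val layout: Elt.t list -> string
   end

functor Printer (S: PRINTER_STRUCTS): PRINTER =
struct

(* Re-export the argument structures required by PRINTER. *)
open S

fun layout (l: Elt.t list): string =
   String.concatWith ", " (List.map Elt.toString l)

end

(* One possible instantiation. *)
structure IntElt: ELT =
   struct
      type t = int
      val toString = Int.toString
   end

structure IntPrinter = Printer (structure Elt = IntElt)

val s = IntPrinter.layout [1, 2, 3]
\end{verbatim}
The functor never names a concrete element module, so a different structure
matching {\tt ELT} can be supplied without touching {\tt Printer}.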
Although the signatures for {\tt ClosureConvert} express the dependence
on the {\tt Sxml} and {\tt Cps} ILs, they do not express the
dependence on other modules that are only used internally to closure
conversion. For example, closure conversion uses an auxiliary module
{\tt AbstractValue} as part of its higher-order control-flow analysis. Because
{\tt AbstractValue} is only used internally to closure conversion, it does not appear
in the signatures that specify closure conversion. So, helper functors (like
{\tt AbstractValue}) are analogous to helper functions in that they are not
exposed in the signatures of the module that uses them.

We do not put helper functors lexically in scope because SML only allows top
level functor definitions and, more importantly, because files would become
unmanageably large. Instead, helper functors get their own {\tt .sig} and
{\tt .fun} files, which follow exactly the convention above.

\section{General conventions}

\begin{enumerate}
\item A line of code never exceeds 80 columns.
\item Use alphabetical order wherever possible.
\begin{itemize}
\item record field names
\item datatype constructors
\item value specs in signatures
\item file lists in CM files
\item export lists in CM files
\end{itemize}
\end{enumerate}

%------------------------------------------------------
% Signature conventions
%------------------------------------------------------

\sec{Signatures}{signature-conventions}

We now enumerate the conventions we follow in writing signatures.

Signature identifiers are in all capitals, using ``\_'' to separate words.

A signature typically contains a single type specification that defines a type
constructor {\tt t}, which is the type of interest in the specification. For
example, here are signature fragments for integers, lists, and maps.
\begin{verbatim}
      val map: 'a t * ('a -> 'b) -> 'b t

      val extend: ('a, 'b) t * 'a * 'b -> ('a, 'b) t
\end{verbatim}
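
For context, such fragments might sit in signatures like the following. This is
only a sketch: the signature names {\tt INTEGER}, {\tt LIST}, and {\tt MAP} and
the {\tt +} specification are assumptions made for illustration, while the
{\tt map} and {\tt extend} specifications are the ones shown above.
\begin{verbatim}
signature INTEGER =
   sig
      type t

      val + : t * t -> t
   end

signature LIST =
   sig
      type 'a t

      val map: 'a t * ('a -> 'b) -> 'b t
   end

signature MAP =
   sig
      type ('a, 'b) t

      val extend: ('a, 'b) t * 'a * 'b -> ('a, 'b) t
   end
\end{verbatim}
In each signature the type of interest is {\tt t}; only its arity changes.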
Although at first it might appear confusing to name every type {\tt t}, in fact
there is never ambiguity, because at any point in the program there is at most
one unqualified {\tt t} in scope, and all other types will be named with long
identifiers (like {\tt Int.t} or {\tt Int.t List.t}). For example, the code for
a function {\tt foo} within the {\tt Map} module might look like the following.
\begin{verbatim}
fun foo (l: 'a List.t, n: Int.t): ('a, Int.t) t = ...
\end{verbatim}
In practice, for pervasive types like {\tt int} and {\tt 'a list}, we often use the
standard pervasive name instead of the {\tt t} name.

Signatures should not contain free types or structures, other than
pervasives, basis library modules, or {\mlton} library modules. This was
explained in \secref{high-level-structure}.

If additional abstract types (other than pervasive types) are needed to specify
operations, they are included as substructures of the signature, and have a
signature in their own right. For example, the following signature is good.
\begin{verbatim}
      structure Var: VAR

      type t

      val fromVar: Var.t -> t
      val toVar: t -> Var.t
\end{verbatim}
Signatures do not use substructures or multiple structures to group different
operations on the same type. Such grouping makes you waste energy remembering
where the operations are. For example, the following signature is bad.

Signatures usually should not contain datatypes. A datatype exposes the
implementation of what should be an abstract type. For example, the following
signature is bad.
\begin{verbatim}
      datatype t = T of real * real
\end{verbatim}
A common exception to this rule is abstract syntax trees.

Use structure sharing to express type sharing. For example, in
{\tt closure-convert.sig}, a single structure sharing equation expresses a number of
type sharing equations.

%------------------------------------------------------
% Value specifications
%------------------------------------------------------

\subsec{Value specifications}{val-specs}

Here are the conventions that we use for individual value specifications in
signatures. Of course, many of these conventions directly impact the way in
which we write the core language expressions that implement the specifications.

In a datatype specification, if there is a single constructor, then that
constructor is called {\tt T}.
\begin{verbatim}
datatype t = T of int
\end{verbatim}
In a datatype specification, if a constructor carries multiple values of the
same type, use a record to name them to avoid confusion.
\begin{verbatim}
datatype t = T of {length: int, start: int}
\end{verbatim}
Identifiers begin with and use small letters, using capital letters to separate
words.
\begin{verbatim}
val helloWorld: unit -> unit
\end{verbatim}
There is no space before the colon, and a single space after it. In the case of
operators (like {\tt +}), there is a space before the colon to avoid lexing the
colon as part of the operator.

Pass multiple arguments as a tuple, not curried.
\begin{verbatim}
val eval: Exp.t * Env.t -> Val.t
\end{verbatim}
Currying is only used when there is staging of a computation, i.e., if
precomputation is done on one of the arguments.
\begin{verbatim}
val match: Regexp.t -> string -> bool
\end{verbatim}
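
A minimal sketch of staging, using a hypothetical function {\tt normalize} that
is not part of {\mlton}: the sum of the weights is computed once, when the first
argument is supplied, and the returned function reuses it on every call.
\begin{verbatim}
(* Hypothetical example.  Its specification would read
 *    val normalize: real list -> real -> real
 *)
fun normalize (weights: real list): real -> real =
   let
      (* Precomputation: runs once per list of weights. *)
      val total = List.foldl (op +) 0.0 weights
   in
      fn w => w / total
   end

(* The list is summed here, not at each later application. *)
val f = normalize [1.0, 3.0]
val a = f 1.0
val b = f 3.0
\end{verbatim}
Had {\tt normalize} taken a tuple, the sum would be recomputed on every call,
which is the situation the curried form is reserved for.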
Functions that take a single element of the abstract type of a signature take
the element as the first argument, and auxiliary arguments after.
\begin{verbatim}
val push: t * int -> unit
val map: 'a t * ('a -> 'b) -> 'b t
\end{verbatim}
$n$-ary operations take the $n$ elements first, and auxiliary arguments after.
\begin{verbatim}
val merge: 'a t * 'a t * ('a * 'a -> 'b) -> 'b t
\end{verbatim}
If two arguments to a function are of the same type, and the operation is not
commutative, pass them using a record. This names the arguments and ensures
they are not confused. Exceptions are the standard numerical and algebraic
operations.
\begin{verbatim}
val fromTo: {start: int, step: int, stop: int} -> int list
val substring: t * {length: int, start: int} -> t
\end{verbatim}
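
To show how a record argument reads at the definition and at the call site,
here is one possible implementation of {\tt fromTo}. The semantics are an
assumption made for this sketch (an exclusive {\tt stop} and a positive
{\tt step}); the specification above does not fix them.
\begin{verbatim}
(* Assumed semantics: start, start + step, ... while less than stop. *)
fun fromTo {start: int, step: int, stop: int}: int list =
   if step <= 0
      then raise Fail "fromTo: step must be positive"
   else if start >= stop
      then []
   else start :: fromTo {start = start + step, step = step, stop = stop}

(* The field names make the roles of the three integers explicit. *)
val l = fromTo {start = 0, step = 2, stop = 7}
\end{verbatim}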
Field names in record types are written in alphabetical order.

Return multiple results as a tuple, or as a record if there is the potential for
confusion.
\begin{verbatim}
val parse: string -> t * string
val quotRem: t * t -> t * t
val partition: 'a t * ('a -> bool) -> {no: 'a t, yes: 'a t}
\end{verbatim}
If a function returns multiple results, at least two of which are of the same
type, and the name of the function does not clearly indicate which result is
which, use a record to name the results.
\begin{verbatim}
val vars: t -> {bound: Vars.t, frees: Vars.t}
val partition: 'a t * ('a -> bool) -> {no: 'a t, yes: 'a t}
\end{verbatim}
Use the same names and argument orders for similar functions in different
signatures. This is especially common in the {\mlton} library.
\begin{verbatim}
val < : t * t -> bool
val equals: t * t -> bool
val forall: 'a t * ('a -> bool) -> bool
\end{verbatim}
Use {\tt is}, {\tt are}, {\tt can}, etc. to name predicates. One exception is
{\tt equals}.
\begin{verbatim}
val isEven: int -> bool
val canRead: t -> bool
\end{verbatim}
%------------------------------------------------------
%------------------------------------------------------

Here is the complete specification of a simple interpreter. This demonstrates
the {\tt t}-convention, the closed-signature convention, and the use of sharing
constraints.
\begin{verbatim}
      val lam: Var.t * t -> t

      sharing Exp.Var = Val.Var

      val eval: Exp.t -> Val.t

      val lookup: 'a t * Var.t -> 'a
      val extend: 'a t * Var.t * 'a -> 'a t
\end{verbatim}
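
One way the specifications above could fit together is sketched below. The
signature names ({\tt VAR}, {\tt EXP}, {\tt VAL}, {\tt INTERP\_STRUCTS},
{\tt INTERP}, {\tt ENV}) and the {\tt equals}, {\tt app}, and {\tt var}
specifications are assumptions made for illustration; only {\tt lam}, the
sharing equation, {\tt eval}, {\tt lookup}, and {\tt extend} are taken from the
text.
\begin{verbatim}
signature VAR =
   sig
      type t

      val equals: t * t -> bool
   end

signature EXP =
   sig
      structure Var: VAR

      type t

      val app: t * t -> t
      val lam: Var.t * t -> t
      val var: Var.t -> t
   end

signature VAL =
   sig
      structure Var: VAR

      type t
   end

signature INTERP_STRUCTS =
   sig
      structure Exp: EXP
      structure Val: VAL
      sharing Exp.Var = Val.Var
   end

signature INTERP =
   sig
      include INTERP_STRUCTS

      val eval: Exp.t -> Val.t
   end

signature ENV =
   sig
      structure Var: VAR

      type 'a t

      val extend: 'a t * Var.t * 'a -> 'a t
      val lookup: 'a t * Var.t -> 'a
   end
\end{verbatim}
Every signature mentions only its own substructures, so each one is closed in
the sense of \secref{high-level-structure}.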
%------------------------------------------------------
% Functors and structures
%------------------------------------------------------

\section{Functors and structures}
We now enumerate the conventions we follow in writing functors and structures.
There is some repetition with \secref{high-level-structure}.

Functor identifiers begin with capital letters, use mixed case, and use capital
letters to separate words.

Functor definitions look like the following.
\begin{verbatim}
functor Foo (S: FOO_STRUCTS): FOO =
\end{verbatim}
The name of the functor is the same as the name of the signature describing the
structure it produces, up to capitalization: functor {\tt Foo} produces a
structure matching {\tt FOO}.

The functor result is constrained by a signature.

A functor takes as arguments any structures that occur in the signature of the
result that it does not implement.

Structure identifiers begin with capital letters, and use capital letters to
separate words.

The name of the structure is the same as the name of the functor that produces
it.

A structure definition looks like one of the following.
\begin{verbatim}
structure Foo = Foo (S)
\end{verbatim}
Avoid the use of {\tt open} except within tightly constrained scopes. The use
of {\tt open} makes it hard to look at code later and understand where things
come from.

%------------------------------------------------------
%------------------------------------------------------

\section{Core expressions}

We now enumerate the conventions we follow in writing core expressions. We do
not repeat the conventions of \secref{val-specs}, although many of them apply
here.

Tuples are written with spaces after commas, like {\tt (a, b, c)}.

Records are written with spaces on both sides of equals and with spaces after
commas, like {\tt \{bar = 1, foo = 2\}}.

Record field names are written in alphabetical order, both in expressions and
in types.

Function application is written with a space between the function and the
argument. If there is one untupled argument, it looks like {\tt f x}. If there
is a tupled argument, it looks like {\tt f (x, y, z)}.

When you want to mix declarations with side-effecting statements, use a
declaration like {\tt val \_ = sideEffectingProcedure ()}.

In sequence expressions {\tt (e1; e2)} that span multiple lines, place the
semicolon at the beginning of lines.

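
A small sketch of the semicolon placement, using a hypothetical procedure
{\tt report}. The exact indentation is an assumption; the convention above only
asks that each continuation line start with the semicolon.
\begin{verbatim}
(* Hypothetical example: prints "name: n" followed by a newline. *)
fun report (name: string, n: int): unit =
   (print name
    ; print ": "
    ; print (Int.toString n)
    ; print "\n")

val _ = report ("count", 3)
\end{verbatim}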
Never write nonexhaustive matches. Always handle the default case and raise an
error with a descriptive message. Your error message will be better than the
compiler's. Also, if you have lots of uncaught cases, then you are probably not
using the type system in a strong enough way; your types are not expressing as
much as they could.

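
For instance, a hypothetical function {\tt first} on lists can handle the empty
list explicitly instead of leaving the match nonexhaustive. The function name
and the message text are assumptions made for this sketch.
\begin{verbatim}
(* Every case is covered; the failing case names the function and the
 * reason, which the compiler-generated Match exception would not do. *)
fun first (l: int list): int =
   case l of
      [] => raise Fail "first: empty list"
    | x :: _ => x

val x = first [4, 5]
\end{verbatim}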
Never use the syntax for declaring functions that repeats the function name.
Use {\tt case} or {\tt fn} instead. That is, do not write the following.

Instead, write the following.

Or, write the following.

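
The three forms can be sketched with a hypothetical list-length function; the
names {\tt len} and {\tt len'} are chosen here only for illustration. The
clausal form, which repeats the function name, is left in a comment.
\begin{verbatim}
(* Clausal form (avoided):
 *
 * fun len [] = 0
 *   | len (_ :: l) = 1 + len l
 *)

(* Using case. *)
fun len (l: 'a list): int =
   case l of
      [] => 0
    | _ :: l => 1 + len l

(* Using fn. *)
val rec len' =
   fn [] => 0
    | _ :: l => 1 + len' l
\end{verbatim}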
\bibliographystyle{alpha}