1 MLton Guide ({mlton-version})
2 =============================
8 This is the guide for MLton, an open-source, whole-program, optimizing Standard ML compiler.
10 This guide was generated automatically from the MLton website, available online at http://mlton.org. It is up to date for MLton {mlton-version}.
16 :mlton-guide-page: Home
23 MLton is an open-source, whole-program, optimizing
24 <:StandardML:Standard ML> compiler.
28 * 20180207: Please try out our latest release, <:Release20180207:MLton 20180207>.
30 * 20140730: http://www.cs.rit.edu/%7emtf[Matthew Fluet] and
31 http://www.cse.buffalo.edu/%7elziarek[Lukasz Ziarek] have been
32 awarded an http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=12810[NSF
33 CISE Research Infrastructure (CRI)] grant titled "Positioning MLton
for Next-Generation Programming Languages Research;" read the award
abstracts
(http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405770[Award{nbsp}#1405770]
and
http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405614[Award{nbsp}#1405614]).
43 * Read about MLton's <:Features:>.
44 * Look at <:Documentation:>.
45 * See some <:Users:> of MLton.
46 * https://sourceforge.net/projects/mlton/files/mlton/20180207[Download] MLton.
47 * Meet the MLton <:Developers:>.
48 * Get involved with MLton <:Development:>.
49 * User-maintained <:FAQ:>.
54 :mlton-guide-page: AdamGoode
59 * I maintain the Fedora package of MLton, in https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora].
60 * I have contributed some patches for Makefiles and PDF documentation building.
64 :mlton-guide-page: AdmitsEquality
69 A <:TypeConstructor:> admits equality if whenever it is applied to
70 equality types, the result is an <:EqualityType:>. This notion enables
71 one to determine whether a type constructor application yields an
72 equality type solely from the application, without looking at the
73 definition of the type constructor. It helps to ensure that
74 <:PolymorphicEquality:> is only applied to sensible values.
76 The definition of admits equality depends on whether the type
77 constructor was declared by a `type` definition or a
78 `datatype` declaration.
81 == Type definitions ==
For a type definition

[source,sml]
----
type ('a1, ..., 'an) t = ...
----

type constructor `t` admits equality if the right-hand side of the
definition is an equality type after replacing `'a1`, ...,
`'an` by equality types (it doesn't matter which equality types
are chosen).

95 For a nullary type definition, this amounts to the right-hand side
being an equality type. For example, after the definition

[source,sml]
----
type t = bool * int
----

type constructor `t` admits equality because `bool * int` is
an equality type. On the other hand, after the definition

[source,sml]
----
type t = bool * int * real
----

type constructor `t` does not admit equality, because `real`
is not an equality type.

For another example, after the definition

[source,sml]
----
type 'a t = bool * 'a
----

type constructor `t` admits equality because `bool * int`
is an equality type (we could have chosen any equality type other than
`int`).

On the other hand, after the definition

[source,sml]
----
type 'a t = real * 'a
----

type constructor `t` does not admit equality because
`real * int` is not an equality type.

We can check that a type constructor admits equality using an
`eqtype` specification.

[source,sml]
----
structure Ok: sig eqtype 'a t end =
   struct
      type 'a t = bool * 'a
   end
----

[source,sml]
----
structure Bad: sig eqtype 'a t end =
   struct
      type 'a t = real * int * 'a
   end
----

On `structure Bad`, MLton reports the following error.

----
Error: z.sml 1.16-1.34.
  Type in structure disagrees with signature (admits equality): t.
    structure: type 'a t = [real] * _ * _
    defn at: z.sml 3.15-3.15
    signature: [eqtype] 'a t
    spec at: z.sml 1.30-1.30
----

The `structure:` section provides an explanation of why the type
did not admit equality, highlighting the problematic component
(`real`).

169 == Datatype declarations ==
For a type constructor declared by a datatype declaration to admit
equality, every <:Variant:variant> of the datatype must admit equality. For
example, the following datatype admits equality because `bool` and
`char * int` are equality types.

[source,sml]
----
datatype t = A of bool | B of char * int
----

Nullary constructors trivially admit equality, so the following
datatype admits equality.

[source,sml]
----
datatype t = A | B | C
----

For a parameterized datatype constructor to admit equality, we
consider each <:Variant:variant> as a type definition, and require that the
definition admit equality. For example, for the datatype

[source,sml]
----
datatype 'a t = A of bool * 'a | B of 'a
----

the type definitions

[source,sml]
----
type 'a tA = bool * 'a
type 'a tB = 'a
----

both admit equality. Thus, type constructor `t` admits equality.

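As a small illustration of what this buys in practice (the value names below are hypothetical, chosen only for this sketch), `=` can be used at any instance of `t` whose argument is itself an equality type:

[source,sml]
----
datatype 'a t = A of bool * 'a | B of 'a

(* int t and string t are equality types, so = type-checks here. *)
val sameInts = (A (true, 1) = A (true, 1))
val sameStrings = (B "x" = B "y")

(* real is not an equality type, so real t is not one either.
   Uncommenting the next line should be rejected by the type checker. *)
(* val sameReals = (B 1.0 = B 1.0) *)
----

Note that admitting equality is a property of the type constructor, while being an equality type is a property of each particular instance.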
On the other hand, the following datatype does not admit equality.

[source,sml]
----
datatype 'a t = A of bool * 'a | B of real * 'a
----

As with type definitions, we can check using an `eqtype`
specification.

[source,sml]
----
structure Bad: sig eqtype 'a t end =
   struct
      datatype 'a t = A of bool * 'a | B of real * 'a
   end
----

MLton reports the following error.

----
Error: z.sml 1.16-1.34.
  Type in structure disagrees with signature (admits equality): t.
    structure: datatype 'a t = B of [real] * _ | ...
    defn at: z.sml 3.19-3.19
    signature: [eqtype] 'a t
    spec at: z.sml 1.30-1.30
----

MLton indicates the problematic constructor (`B`), as well as
the problematic component of the constructor's argument.

241 === Recursive datatypes ===
A recursive datatype like

[source,sml]
----
datatype t = A | B of int * t
----

introduces a new problem, since in order to decide whether `t`
admits equality, we need to know for the `B` <:Variant:variant> whether
`t` admits equality. The <:DefinitionOfStandardML:Definition>
answers this question by requiring a type constructor to admit
equality if it is consistent to do so. So, in our above example, if
we assume that `t` admits equality, then the <:Variant:variant>
`B of int * t` admits equality. Then, since the `A` <:Variant:variant>
trivially admits equality, so does the type constructor `t`.
Thus, it was consistent to assume that `t` admits equality, and
so, `t` does admit equality.

On the other hand, in the following declaration

[source,sml]
----
datatype t = A | B of real * t
----

if we assume that `t` admits equality, then the `B` <:Variant:variant>
still does not admit equality, because `real` is not an equality type.
Hence, the type constructor `t` does not admit equality, and our
assumption was inconsistent. Therefore, `t` does not admit equality.

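A sketch of the practical consequence: values of the first recursive datatype can be compared with `=`, while for the second a comparison has to be written by hand. The structure names `IntChain` and `RealChain` and the function `equals` are hypothetical, and using `Real.==` for the `real` components is just one possible choice.

[source,sml]
----
structure IntChain =
   struct
      datatype t = A | B of int * t
      (* t admits equality, so = works on whole values. *)
      val same = (B (1, B (2, A)) = B (1, B (2, A)))
   end

structure RealChain =
   struct
      datatype t = A | B of real * t
      (* t does not admit equality; compare structurally by hand,
         using Real.== (IEEE equality) on the real components. *)
      fun equals (A, A) = true
        | equals (B (x, t1), B (y, t2)) =
             Real.== (x, y) andalso equals (t1, t2)
        | equals _ = false
   end
----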
The same kind of reasoning applies to mutually recursive datatypes as
well. For example, the following defines both `t` and `u` to
admit equality.

[source,sml]
----
datatype t = A | B of u
and u = C | D of t
----

But the following defines neither `t` nor `u` to admit
equality.

[source,sml]
----
datatype t = A | B of u * real
and u = C | D of t
----

As always, we can check whether a type admits equality using an
`eqtype` specification.

[source,sml]
----
structure Bad: sig eqtype t eqtype u end =
   struct
      datatype t = A | B of u * real
      and u = C | D of t
   end
----

MLton reports the following error.

----
Error: z.sml 1.16-1.40.
  Type in structure disagrees with signature (admits equality): t.
    structure: datatype t = B of [_str.u] * [real] | ...
    defn at: z.sml 3.16-3.16
    signature: [eqtype] t
    spec at: z.sml 1.27-1.27
Error: z.sml 1.16-1.40.
  Type in structure disagrees with signature (admits equality): u.
    structure: datatype u = D of [_str.t] | ...
    defn at: z.sml 4.11-4.11
    signature: [eqtype] u
    spec at: z.sml 1.36-1.36
----

323 :mlton-guide-page: Alice
328 http://www.ps.uni-saarland.de/alice[Alice ML] is an extension of SML with
329 concurrency, dynamic typing, components, distribution, and constraint
334 :mlton-guide-page: AllocateRegisters
335 [[AllocateRegisters]]
339 <:AllocateRegisters:> is an analysis pass for the <:RSSA:>
340 <:IntermediateLanguage:>, invoked from <:ToMachine:>.
344 Computes an allocation of <:RSSA:> variables as <:Machine:> register
349 * <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.sig)>
350 * <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.fun)>
352 == Details and Notes ==
358 :mlton-guide-page: AndreiFormiga
363 I'm a graduate student just back in academia. I study concurrent and parallel systems, with a great deal of interest in programming languages (theory, design, implementation). I happen to like functional languages.
365 I use the nickname tautologico on #sml and my email is andrei DOT formiga AT gmail DOT com.
369 :mlton-guide-page: ArrayLiteral
374 <:StandardML:Standard ML> does not have a syntax for array literals or
vector literals. The only way to write down an array is like

[source,sml]
----
Array.fromList [w, x, y, z]
----

381 No SML compiler produces efficient code for the above expression. The
382 generated code allocates a list and then converts it to an array. To
383 alleviate this, one could write down the same array using
384 `Array.tabulate`, or even using `Array.array` and `Array.update`, but
385 that is syntactically unwieldy.
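To make the trade-off concrete, here is a sketch of the three ways of writing the same four-element array. The element values and the names `viaList`, `viaTabulate`, and `viaUpdate` are placeholders chosen for this example.

[source,sml]
----
val (w, x, y, z) = (1, 2, 3, 4)

(* Concise, but builds an intermediate list first. *)
val viaList = Array.fromList [w, x, y, z]

(* No intermediate list, but the elements are buried in a case analysis. *)
val viaTabulate =
   Array.tabulate (4, fn 0 => w | 1 => x | 2 => y | _ => z)

(* Allocate once with a sample element, then fill the other slots in place. *)
val viaUpdate =
   let
      val a = Array.array (4, w)
      val () = Array.update (a, 1, x)
      val () = Array.update (a, 2, y)
      val () = Array.update (a, 3, z)
   in
      a
   end
----

All three produce arrays with the same contents; they differ only in how much intermediate allocation and syntactic noise they involve.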
Fortunately, using <:Fold:>, it is possible to define constants `A`
and +`+ so that one can write down an array like:

[source,sml]
----
A `w `x `y `z $
----

This is as syntactically concise as the `fromList` expression.
Furthermore, MLton, at least, will generate efficient code, as if
one had written down a use of `Array.array` followed by four uses of
`Array.update`.

Along with `A` and +`+, one can define a constant `V` that makes
it possible to define vector literals with the same syntax, e.g.,

[source,sml]
----
V `w `x `y `z $
----

405 Note that the same element indicator, +`+, serves for both array
406 and vector literals. Of course, the `$` is the end-of-arguments
407 marker always used with <:Fold:>. The only difference between an
408 array literal and vector literal is the `A` or `V` at the beginning.
410 Here is the implementation of `A`, `V`, and +`+. We place them
411 in a structure and use signature abstraction to hide the type of the
412 accumulator. See <:Fold:> for more on this technique.
418 val A: ('a z, 'a z, 'a array, 'd) Fold.t
419 val V: ('a z, 'a z, 'a vector, 'd) Fold.t
420 val ` : ('a, 'a z, 'a z, 'b, 'c, 'd) Fold.step1
423 type 'a z = int * 'a option * ('a array -> unit)
432 Array.tabulate (0, fn _ => raise Fail "array0")
435 val a = Array.array (n, x)
442 val V = fn z => Fold.post (A, Array.vector) z
447 (fn (x, (i, opt, fill)) =>
450 fn a => (Array.update (a, i, x); fill a)))
455 The idea of the code is for the fold to accumulate a count of the
456 number of elements, a sample element, and a function that fills in all
457 the elements. When the fold is complete, the finishing function
458 allocates the array, applies the fill function, and returns the array.
459 The only difference between `A` and `V` is at the very end; `A` just
460 returns the array, while `V` converts it to a vector using
461 post-composition, which is further described on the <:Fold:> page.
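The description above can be turned into a self-contained sketch that compiles on its own. The `Fold` structure and `$` below are minimal stand-ins assumed for this example (only `fold`, `post`, and `step1` are provided); the `Literal` structure is written to match the signature and the accumulator type shown above, following the prose description rather than any authoritative listing.

[source,sml]
----
(* Minimal stand-in for the parts of Fold used here (an assumption of this
   sketch; see the Fold page for the real definitions). *)
fun $ (a, f) = f a

structure Fold =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
         ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step

      fun fold (a, f) g = g (a, f)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

structure Literal:>
   sig
      type 'a z
      val A: ('a z, 'a z, 'a array, 'd) Fold.t
      val V: ('a z, 'a z, 'a vector, 'd) Fold.t
      val ` : ('a, 'a z, 'a z, 'b, 'c, 'd) Fold.step1
   end =
   struct
      (* element count, a sample element, and a function that fills the array *)
      type 'a z = int * 'a option * ('a array -> unit)

      val A =
         fn z =>
         Fold.fold
         ((0, NONE, ignore),
          fn (n, opt, fill) =>
          case opt of
             NONE => Array.tabulate (0, fn _ => raise Fail "array0")
           | SOME x =>
                let
                   val a = Array.array (n, x)
                   val () = fill a
                in
                   a
                end)
         z

      val V = fn z => Fold.post (A, Array.vector) z

      val ` =
         fn z =>
         Fold.step1
         (fn (x, (i, opt, fill)) =>
          (i + 1,
           SOME x,
           fn a => (Array.update (a, i, x); fill a)))
         z
   end

open Literal
val a = A `1 `2 `3 $      (* an int array of length 3 *)
val v = V `10 `20 `30 $   (* an int vector of length 3 *)
----

Each +`+ step records the index at which its element belongs, so the order in which the accumulated fill functions run does not matter.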
465 :mlton-guide-page: AST
470 <:AST:> is the <:IntermediateLanguage:> produced by the <:FrontEnd:>
471 and translated by <:Elaborate:> to <:CoreML:>.
475 The abstract syntax tree produced by the <:FrontEnd:>.
479 * <!ViewGitFile(mlton,master,mlton/ast/ast-programs.sig)>
480 * <!ViewGitFile(mlton,master,mlton/ast/ast-programs.fun)>
481 * <!ViewGitFile(mlton,master,mlton/ast/ast-modules.sig)>
482 * <!ViewGitFile(mlton,master,mlton/ast/ast-modules.fun)>
483 * <!ViewGitFile(mlton,master,mlton/ast/ast-core.sig)>
484 * <!ViewGitFile(mlton,master,mlton/ast/ast-core.fun)>
485 * <!ViewGitDir(mlton,master,mlton/ast)>
The <:AST:> <:IntermediateLanguage:> has no independent type
checker. Type inference is performed on an AST program as part of
<:Elaborate:>.

493 == Details and Notes ==
495 === Source locations ===
497 MLton makes use of a relatively clean method for annotating the
498 abstract syntax tree with source location information. Every source
499 program phrase is "wrapped" with the `WRAPPED` interface:
503 sys::[./bin/InclGitFile.py mlton master mlton/control/wrapped.sig 8:19]
506 The key idea is that `node'` is the type of an unannotated syntax
507 phrase and `obj` is the type of its annotated counterpart. In the
508 implementation, every `node'` is annotated with a `Region.t`
509 (<!ViewGitFile(mlton,master,mlton/control/region.sig)>,
510 <!ViewGitFile(mlton,master,mlton/control/region.sml)>), which describes the
511 syntax phrase's left source position and right source position, where
512 `SourcePos.t` (<!ViewGitFile(mlton,master,mlton/control/source-pos.sig)>,
513 <!ViewGitFile(mlton,master,mlton/control/source-pos.sml)>) denotes a
514 particular file, line, and column. A typical use of the `WRAPPED`
515 interface is illustrated by the following code:
519 sys::[./bin/InclGitFile.py mlton master mlton/ast/ast-core.sig 46:65]
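
As a rough illustration of the wrapping idea, the following sketch defines a much-simplified interface and a toy pattern syntax. All definitions here are hypothetical stand-ins: MLton's actual `SourcePos`, `Region`, and `WRAPPED` contain more than is shown, and `Pat` is not the compiler's pattern type.

[source,sml]
----
(* Hypothetical stand-ins for MLton's SourcePos and Region structures. *)
structure SourcePos =
   struct
      type t = {file: string, line: int, column: int}
   end

structure Region =
   struct
      type t = {left: SourcePos.t, right: SourcePos.t}
   end

(* Simplified version of the wrapping interface. *)
signature WRAPPED =
   sig
      type node'
      type obj
      val makeRegion: node' * Region.t -> obj
      val node: obj -> node'
      val region: obj -> Region.t
   end

(* A toy pattern syntax: every sub-pattern is an annotated value of type t,
   so a pattern with an unannotated sub-phrase cannot be constructed. *)
structure Pat =
   struct
      datatype node = Wild | App of {constr: t, argument: t}
      and t = T of node * Region.t
      type node' = node
      type obj = t
      fun makeRegion (n, r) = T (n, r)
      fun node (T (n, _)) = n
      fun region (T (_, r)) = r
   end

(* Check that Pat matches the interface. *)
structure PatWrapped: WRAPPED = Pat
----
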
Thus, AST nodes are cleanly separated from source locations. By way
of contrast, consider the approach taken by <:SMLNJ:SML/NJ> (and also
by the <:CKitLibrary:CKit Library>). Each datatype denoting a syntax
phrase dedicates a special constructor for annotating source
locations:

[source,sml]
----
datatype pat = WildPat                             (* empty pattern *)
             | AppPat of {constr:pat,argument:pat} (* application *)
             | MarkPat of pat * region             (* mark a pattern *)
----

534 The main drawback of this approach is that static type checking is not
535 sufficient to guarantee that the AST emitted from the front-end is
540 :mlton-guide-page: BasisLibrary
545 The <:StandardML:Standard ML> Basis Library is a collection of modules
546 dealing with basic types, input/output, OS interfaces, and simple
547 datatypes. It is intended as a portable library usable across all
548 implementations of SML. For the official online version of the Basis
549 Library specification, see http://www.standardml.org/Basis.
550 <!Cite(GansnerReppy04, The Standard ML Basis Library)> is a book
551 version that includes all of the online version and more. For a
552 reverse chronological list of changes to the specification, see
553 http://www.standardml.org/Basis/history.html.
555 MLton implements all of the required portions of the Basis Library.
556 MLton also implements many of the optional structures. You can obtain
557 a complete and current list of what's available using
558 `mlton -show-basis` (see <:ShowBasis:>). By default, MLton makes the
559 Basis Library available to user programs. You can also
560 <:MLBasisAvailableLibraries:access the Basis Library> from
561 <:MLBasis: ML Basis> files.
563 Below is a complete list of what MLton implements.
565 == Top-level types and constructors ==
569 `datatype bool = false | true`
577 ++datatype 'a list = nil | {two-colons} of ('a * 'a list)++
579 `datatype 'a option = NONE | SOME of 'a`
581 `datatype order = EQUAL | GREATER | LESS`
585 `datatype 'a ref = ref of 'a`
597 == Top-level exception constructors ==
623 == Top-level values ==
625 MLton does not implement the optional top-level value
626 `use: string -> unit`, which conflicts with whole-program
627 compilation because it allows new code to be loaded dynamically.
629 MLton implements all other top-level values:
671 == Overloaded identifiers ==
686 == Top-level signatures ==
814 == Top-level structures ==
816 `structure Array: ARRAY`
818 `structure Array2: ARRAY2`
820 `structure ArraySlice: ARRAY_SLICE`
822 `structure BinIO: BIN_IO`
824 `structure BinPrimIO: PRIM_IO`
826 `structure Bool: BOOL`
828 `structure BoolArray: MONO_ARRAY`
830 `structure BoolArray2: MONO_ARRAY2`
832 `structure BoolArraySlice: MONO_ARRAY_SLICE`
834 `structure BoolVector: MONO_VECTOR`
836 `structure BoolVectorSlice: MONO_VECTOR_SLICE`
838 `structure Byte: BYTE`
840 `structure Char: CHAR`
842 * `Char` characters correspond to ISO-8859-1. The `Char` functions do not depend on locale.
844 `structure CharArray: MONO_ARRAY`
846 `structure CharArray2: MONO_ARRAY2`
848 `structure CharArraySlice: MONO_ARRAY_SLICE`
850 `structure CharVector: MONO_VECTOR`
852 `structure CharVectorSlice: MONO_VECTOR_SLICE`
854 `structure CommandLine: COMMAND_LINE`
856 `structure Date: DATE`
858 * `Date.fromString` and `Date.scan` accept a space in addition to a zero for the first character of the day of the month. The Basis Library specification only allows a zero.
860 `structure FixedInt: INTEGER`
862 `structure General: GENERAL`
864 `structure GenericSock: GENERIC_SOCK`
866 `structure IEEEReal: IEEE_REAL`
868 `structure INetSock: INET_SOCK`
872 `structure Int: INTEGER`
874 `structure Int1: INTEGER`
876 `structure Int2: INTEGER`
878 `structure Int3: INTEGER`
880 `structure Int4: INTEGER`
884 `structure Int31: INTEGER`
886 `structure Int32: INTEGER`
888 `structure Int64: INTEGER`
890 `structure IntArray: MONO_ARRAY`
892 `structure IntArray2: MONO_ARRAY2`
894 `structure IntArraySlice: MONO_ARRAY_SLICE`
896 `structure IntVector: MONO_VECTOR`
898 `structure IntVectorSlice: MONO_VECTOR_SLICE`
900 `structure Int8: INTEGER`
902 `structure Int8Array: MONO_ARRAY`
904 `structure Int8Array2: MONO_ARRAY2`
906 `structure Int8ArraySlice: MONO_ARRAY_SLICE`
908 `structure Int8Vector: MONO_VECTOR`
910 `structure Int8VectorSlice: MONO_VECTOR_SLICE`
912 `structure Int16: INTEGER`
914 `structure Int16Array: MONO_ARRAY`
916 `structure Int16Array2: MONO_ARRAY2`
918 `structure Int16ArraySlice: MONO_ARRAY_SLICE`
920 `structure Int16Vector: MONO_VECTOR`
922 `structure Int16VectorSlice: MONO_VECTOR_SLICE`
924 `structure Int32: INTEGER`
926 `structure Int32Array: MONO_ARRAY`
928 `structure Int32Array2: MONO_ARRAY2`
930 `structure Int32ArraySlice: MONO_ARRAY_SLICE`
932 `structure Int32Vector: MONO_VECTOR`
934 `structure Int32VectorSlice: MONO_VECTOR_SLICE`
936 `structure Int64Array: MONO_ARRAY`
938 `structure Int64Array2: MONO_ARRAY2`
940 `structure Int64ArraySlice: MONO_ARRAY_SLICE`
942 `structure Int64Vector: MONO_VECTOR`
944 `structure Int64VectorSlice: MONO_VECTOR_SLICE`
946 `structure IntInf: INT_INF`
948 `structure LargeInt: INTEGER`
950 `structure LargeIntArray: MONO_ARRAY`
952 `structure LargeIntArray2: MONO_ARRAY2`
954 `structure LargeIntArraySlice: MONO_ARRAY_SLICE`
956 `structure LargeIntVector: MONO_VECTOR`
958 `structure LargeIntVectorSlice: MONO_VECTOR_SLICE`
960 `structure LargeReal: REAL`
962 `structure LargeRealArray: MONO_ARRAY`
964 `structure LargeRealArray2: MONO_ARRAY2`
966 `structure LargeRealArraySlice: MONO_ARRAY_SLICE`
968 `structure LargeRealVector: MONO_VECTOR`
970 `structure LargeRealVectorSlice: MONO_VECTOR_SLICE`
972 `structure LargeWord: WORD`
974 `structure LargeWordArray: MONO_ARRAY`
976 `structure LargeWordArray2: MONO_ARRAY2`
978 `structure LargeWordArraySlice: MONO_ARRAY_SLICE`
980 `structure LargeWordVector: MONO_VECTOR`
982 `structure LargeWordVectorSlice: MONO_VECTOR_SLICE`
984 `structure List: LIST`
986 `structure ListPair: LIST_PAIR`
988 `structure Math: MATH`
990 `structure NetHostDB: NET_HOST_DB`
992 `structure NetProtDB: NET_PROT_DB`
994 `structure NetServDB: NET_SERV_DB`
998 `structure Option: OPTION`
1000 `structure PackReal32Big: PACK_REAL`
1002 `structure PackReal32Little: PACK_REAL`
1004 `structure PackReal64Big: PACK_REAL`
1006 `structure PackReal64Little: PACK_REAL`
1008 `structure PackRealBig: PACK_REAL`
1010 `structure PackRealLittle: PACK_REAL`
1012 `structure PackWord16Big: PACK_WORD`
1014 `structure PackWord16Little: PACK_WORD`
1016 `structure PackWord32Big: PACK_WORD`
1018 `structure PackWord32Little: PACK_WORD`
1020 `structure PackWord64Big: PACK_WORD`
1022 `structure PackWord64Little: PACK_WORD`
1024 `structure Position: INTEGER`
1026 `structure Posix: POSIX`
1028 `structure Real: REAL`
1030 `structure RealArray: MONO_ARRAY`
1032 `structure RealArray2: MONO_ARRAY2`
1034 `structure RealArraySlice: MONO_ARRAY_SLICE`
1036 `structure RealVector: MONO_VECTOR`
1038 `structure RealVectorSlice: MONO_VECTOR_SLICE`
1040 `structure Real32: REAL`
1042 `structure Real32Array: MONO_ARRAY`
1044 `structure Real32Array2: MONO_ARRAY2`
1046 `structure Real32ArraySlice: MONO_ARRAY_SLICE`
1048 `structure Real32Vector: MONO_VECTOR`
1050 `structure Real32VectorSlice: MONO_VECTOR_SLICE`
1052 `structure Real64: REAL`
1054 `structure Real64Array: MONO_ARRAY`
1056 `structure Real64Array2: MONO_ARRAY2`
1058 `structure Real64ArraySlice: MONO_ARRAY_SLICE`
1060 `structure Real64Vector: MONO_VECTOR`
1062 `structure Real64VectorSlice: MONO_VECTOR_SLICE`
1064 `structure Socket: SOCKET`
1066 * The Basis Library specification requires functions like
1067 `Socket.sendVec` to raise an exception if they fail. However, on some
1068 platforms, sending to a socket that hasn't yet been connected causes a
1069 `SIGPIPE` signal, which invokes the default signal handler for
1070 `SIGPIPE` and causes the program to terminate. If you want the
1071 exception to be raised, you can ignore `SIGPIPE` by adding the
1072 following to your program.
+
[source,sml]
----
val () = MLton.Signal.setHandler (Posix.Signal.pipe, MLton.Signal.Handler.ignore)
----
1083 `structure String: STRING`
1085 * The `String` functions do not depend on locale.
1087 `structure StringCvt: STRING_CVT`
1089 `structure Substring: SUBSTRING`
1091 `structure SysWord: WORD`
1093 `structure Text: TEXT`
1095 `structure TextIO: TEXT_IO`
1097 `structure TextPrimIO: PRIM_IO`
1099 `structure Time: TIME`
1101 `structure Timer: TIMER`
1103 `structure Unix: UNIX`
1105 `structure UnixSock: UNIX_SOCK`
1107 `structure Vector: VECTOR`
1109 `structure VectorSlice: VECTOR_SLICE`
1111 `structure Word: WORD`
1113 `structure Word1: WORD`
1115 `structure Word2: WORD`
1117 `structure Word3: WORD`
1119 `structure Word4: WORD`
1123 `structure Word31: WORD`
1125 `structure Word32: WORD`
1127 `structure Word64: WORD`
1129 `structure WordArray: MONO_ARRAY`
1131 `structure WordArray2: MONO_ARRAY2`
1133 `structure WordArraySlice: MONO_ARRAY_SLICE`
1135 `structure WordVectorSlice: MONO_VECTOR_SLICE`
1137 `structure WordVector: MONO_VECTOR`
1139 `structure Word8Array: MONO_ARRAY`
1141 `structure Word8Array2: MONO_ARRAY2`
1143 `structure Word8ArraySlice: MONO_ARRAY_SLICE`
1145 `structure Word8Vector: MONO_VECTOR`
1147 `structure Word8VectorSlice: MONO_VECTOR_SLICE`
1149 `structure Word16Array: MONO_ARRAY`
1151 `structure Word16Array2: MONO_ARRAY2`
1153 `structure Word16ArraySlice: MONO_ARRAY_SLICE`
1155 `structure Word16Vector: MONO_VECTOR`
1157 `structure Word16VectorSlice: MONO_VECTOR_SLICE`
1159 `structure Word32Array: MONO_ARRAY`
1161 `structure Word32Array2: MONO_ARRAY2`
1163 `structure Word32ArraySlice: MONO_ARRAY_SLICE`
1165 `structure Word32Vector: MONO_VECTOR`
1167 `structure Word32VectorSlice: MONO_VECTOR_SLICE`
1169 `structure Word64Array: MONO_ARRAY`
1171 `structure Word64Array2: MONO_ARRAY2`
1173 `structure Word64ArraySlice: MONO_ARRAY_SLICE`
1175 `structure Word64Vector: MONO_VECTOR`
1177 `structure Word64VectorSlice: MONO_VECTOR_SLICE`
1179 == Top-level functors ==
1187 * MLton's `StreamIO` functor takes structures `ArraySlice` and
1188 `VectorSlice` in addition to the arguments specified in the Basis
1189 Library specification.
1191 == Type equivalences ==
1193 The following types are equivalent.
1195 FixedInt = Int64.int
1196 LargeInt = IntInf.int
1197 LargeReal.real = Real64.real
1198 LargeWord = Word64.word
1201 The default `int`, `real`, and `word` types may be set by the
1202 ++-default-type __type__++ <:CompileTimeOptions: compile-time option>.
1203 By default, the following types are equivalent:
1205 int = Int.int = Int32.int
1206 real = Real.real = Real64.real
1207 word = Word.word = Word32.word
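
One way to confirm these equivalences for a given build is to ask the type checker directly. The sketch below assumes MLton with its default `-default-type` settings; the binding names are arbitrary, and each binding compiles only if the two types involved are the same.

[source,sml]
----
(* Default types *)
val a: Int32.int = (1: int)
val b: Real64.real = (1.0: real)
val c: Word32.word = (0w1: word)

(* Fixed and large types *)
val d: Int64.int = (1: FixedInt.int)
val e: IntInf.int = (1: LargeInt.int)
val f: Real64.real = (1.0: LargeReal.real)
val g: Word64.word = (0w1: LargeWord.word)
----

Compiling with a different `-default-type` setting would be expected to make some of the first three bindings fail to type-check.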
1210 == Real and Math functions ==
1212 The `Real`, `Real32`, and `Real64` modules are implemented
1213 using the `C` math library, so the SML functions will reflect the
1214 behavior of the underlying library function. We have made some effort
1215 to unify the differences between the math libraries on different
1216 platforms, and in particular to handle exceptional cases according to
1217 the Basis Library specification. However, there will be differences
1218 due to different numerical algorithms and cases we may have missed.
1219 Please submit a <:Bug:bug report> if you encounter an error in
1220 the handling of an exceptional case.
1222 On x86, real arithmetic is implemented internally using 80 bits of
1223 precision. Using higher precision for intermediate results in
1224 computations can lead to different results than if all the computation
1225 is done at 32 or 64 bits. If you require strict IEEE compliance, you
1226 can compile with `-ieee-fp true`, which will cause intermediate
1227 results to be stored after each operation. This may cause a
1228 substantial performance penalty.
1232 :mlton-guide-page: Bug
1237 To report a bug, please send mail to
1238 mailto:mlton-devel@mlton.org[`mlton-devel@mlton.org`]. Please include
1239 the complete SML program that caused the problem and a log of a
1240 compile of the program with `-verbose 2`. For large programs (over
1241 256K), please send an email containing the discussion text and a link
1244 There are some <:UnresolvedBugs:> that we don't plan to fix.
1246 We also maintain a list of bugs found with each release.
1256 :mlton-guide-page: Bugs20041109
1261 Here are the known bugs in <:Release20041109:MLton 20041109>, listed
1262 in reverse chronological order of date reported.
1265 `MLton.Finalizable.touch` doesn't necessarily keep values alive
1266 long enough. Our SVN has a patch to the compiler. You must rebuild
1267 the compiler in order for the patch to take effect.
1269 Thanks to Florian Weimer for reporting this bug.
1272 A bug in an optimization pass may incorrectly transform a program
1273 to flatten ref cells into their containing data structure, yielding a
1274 type-error in the transformed program. Our CVS has a
1275 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.35&r2=1.37[patch]
1276 to the compiler. You must rebuild the compiler in order for the
1277 patch to take effect.
1279 Thanks to <:VesaKarvonen:> for reporting this bug.
1282 A bug in the front end mistakenly allows unary constructors to be
1283 used without an argument in patterns. For example, the following
1284 program is accepted, and triggers a large internal error.
1288 fun f x = case x of SOME => true | _ => false
1291 We have fixed the problem in our CVS.
1293 Thanks to William Lovas for reporting this bug.
1296 A bug in `Posix.IO.{getlk,setlk,setlkw}` causes a link-time error:
`undefined reference to Posix_IO_FLock_typ`. Our CVS has a
1299 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/posix/primitive.sml.diff?r1=1.34&r2=1.35[patch]
1300 to the Basis Library implementation.
1302 Thanks to Adam Chlipala for reporting this bug.
1305 A bug can cause programs compiled with `-profile alloc` to
1306 segfault. Our CVS has a
1307 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/ssa-to-rssa.fun.diff?r1=1.106&r2=1.107[patch]
1308 to the compiler. You must rebuild the compiler in order for the
1309 patch to take effect.
1311 Thanks to John Reppy for reporting this bug.
1314 A bug in an optimization pass may incorrectly flatten ref cells
1315 into their containing data structure, breaking the sharing between
1316 the cells. Our CVS has a
1317 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.32&r2=1.33[patch]
1318 to the compiler. You must rebuild the compiler in order for the
1319 patch to take effect.
1321 Thanks to Paul Govereau for reporting this bug.
1324 Some arrays or vectors, such as `(char * char) vector`, are
1325 incorrectly implemented, and will conflate the first and second
1326 components of each element. Our CVS has a
1327 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/packed-representation.fun.diff?r1=1.32&r2=1.33[patch]
1328 to the compiler. You must rebuild the compiler in order for the
1329 patch to take effect.
1331 Thanks to Scott Cruzen for reporting this bug.
1334 `Socket.Ctl.getLINGER` and `Socket.Ctl.setLINGER`
mistakenly raise `Subscript`. Our CVS has a
1337 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/socket.sml.diff?r1=1.14&r2=1.15[patch]
1338 to the Basis Library implementation.
1340 Thanks to Ray Racine for reporting the bug.
1343 <:ConcurrentML: CML> `Mailbox.send` makes a call in the wrong atomic context.
1344 Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/lib/cml/core-cml/mailbox.sml.diff?r1=1.3&r2=1.4[patch]
1345 to the CML implementation.
1348 `OS.Path.joinDirFile` and `OS.Path.toString` did not
1349 raise `InvalidArc` when they were supposed to. They now do.
1350 Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/system/path.sml.diff?r1=1.8&r2=1.11[patch]
1351 to the Basis Library implementation.
1353 Thanks to Andreas Rossberg for reporting the bug.
1356 The front end incorrectly disallows sequences of expressions
1357 (separated by semicolons) after a topdec has already been processed.
1358 For example, the following is incorrectly rejected.
1367 We have fixed the problem in our CVS.
1369 Thanks to Andreas Rossberg for reporting the bug.
1372 The front end incorrectly disallows expansive `val`
1373 declarations that bind a type variable that doesn't occur in the
1374 type of the value being bound. For example, the following is
1375 incorrectly rejected.
1379 val 'a x = let exception E of 'a in () end
1382 We have fixed the problem in our CVS.
1384 Thanks to Andreas Rossberg for reporting this bug.
1387 The x86 codegen fails to account for the possibility that a 64-bit
1388 move could interfere with itself (as simulated by 32-bit moves). We
1389 have fixed the problem in our CVS.
1391 Thanks to Scott Cruzen for reporting this bug.
1394 `NetHostDB.scan` and `NetHostDB.fromString` incorrectly
1395 raise an exception on internet addresses whose last component is a
zero, e.g., `0.0.0.0`. Our CVS has a
1397 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/net-host-db.sml.diff?r1=1.12&r2=1.13[patch] to the Basis Library implementation.
1399 Thanks to Scott Cruzen for reporting this bug.
1402 `StreamIO.inputLine` has an off-by-one error causing it to drop
1403 the first character after a newline in some situations. Our CVS has a
http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/io/stream-io.fun.diff?r1=text&tr1=1.29&r2=text&tr2=1.30&diff_format=h[patch]
1405 to the Basis Library implementation.
1407 Thanks to Scott Cruzen for reporting this bug.
1410 `BinIO.getInstream` and `TextIO.getInstream` are
1411 implemented incorrectly. This also impacts the behavior of
1412 `BinIO.scanStream` and `TextIO.scanStream`. If you (directly
1413 or indirectly) realize a `TextIO.StreamIO.instream` and do not
1414 (directly or indirectly) call `TextIO.setInstream` with a derived
stream, you may lose input data. We have fixed the problem in our
CVS.
1418 Thanks to <:WesleyTerpstra:> for reporting this bug.
`Posix.ProcEnv.setpgid` doesn't work. If you compile a program
that uses it, you will get a link-time error

----
undefined reference to `Posix_ProcEnv_setpgid'
----

The bug is due to `Posix_ProcEnv_setpgid` being omitted from the
MLton runtime. We fixed the problem in our CVS by adding the
following definition to `runtime/Posix/ProcEnv/ProcEnv.c`

[source,c]
----
Int Posix_ProcEnv_setpgid (Pid p, Gid g) {
        return setpgid (p, g);
}
----

1439 Thanks to Tom Murphy for reporting this bug.
1443 :mlton-guide-page: Bugs20051202
1448 Here are the known bugs in <:Release20051202:MLton 20051202>, listed
1449 in reverse chronological order of date reported.
1452 Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.fmt:VAL[++Real__<N>__.fmt++], http://www.standardml.org/Basis/real.html#SIG:REAL.fromString:VAL[++Real__<N>__.fromString++], http://www.standardml.org/Basis/real.html#SIG:REAL.scan:VAL[++Real__<N>__.scan++], and http://www.standardml.org/Basis/real.html#SIG:REAL.toString:VAL[++Real__<N>__.toString++] functions of the <:BasisLibrary:Basis Library> implementation. These functions were using `TO_NEAREST` semantics, but should obey the current rounding mode. (Only ++Real__<N>__.fmt StringCvt.EXACT++, ++Real__<N>__.fromDecimal++, and ++Real__<N>__.toDecimal++ are specified to override the current rounding mode with `TO_NEAREST` semantics.)
1454 Thanks to Sean McLaughlin for the bug report.
1456 Fixed by revision <!ViewSVNRev(5827)>.
1459 Bug in the treatment of floating-point operations. Floating-point operations depend on the current rounding mode, but were being treated as pure.
1461 Thanks to Sean McLaughlin for the bug report.
1463 Fixed by revision <!ViewSVNRev(5794)>.
1466 Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.toInt:VAL[++Real32.toInt++] function of the <:BasisLibrary:Basis Library> implementation could lead to incorrect results when applied to a `Real32.real` value numerically close to `valOf(Int.maxInt)`.
1468 Fixed by revision <!ViewSVNRev(5764)>.
1471 The http://www.standardml.org/Basis/socket.html[++Socket++] structure of the <:BasisLibrary:Basis Library> implementation used `andb` rather than `orb` to unmarshal socket options (for ++Socket.Ctl.get__<OPT>__++ functions).
1473 Thanks to Anders Petersson for the bug report and patch.
1475 Fixed by revision <!ViewSVNRev(5735)>.
1478 Bug in the http://www.standardml.org/Basis/date.html[++Date++] structure of the <:BasisLibrary:Basis Library> implementation yielded some functions that would erroneously raise `Date` when applied to a year before 1900.
1480 Thanks to Joe Hurd for the bug report.
1482 Fixed by revision <!ViewSVNRev(5732)>.
1485 Bug in monomorphisation pass could exhibit the error `Type error: type mismatch`.
1487 Thanks to Vesa Karvonen for the bug report.
1489 Fixed by revision <!ViewSVNRev(5731)>.
1492 The http://www.standardml.org/Basis/pack-float.html#SIG:PACK_REAL.toBytes:VAL[++PackReal__<N>__.toBytes++] function in the <:BasisLibrary:Basis Library> implementation incorrectly shared (and mutated) the result vector.
1494 Thanks to Eric McCorkle for the bug report and patch.
1496 Fixed by revision <!ViewSVNRev(5281)>.
1499 Bug in elaboration of FFI forms. Using unary FFI types (e.g., `array`, `ref`, `vector`) in places where `MLton.Pointer.t` was required would lead to an internal error `TypeError`.
1501 Fixed by revision <!ViewSVNRev(4890)>.
1504 The http://www.standardml.org/Basis/mono-vector.html[++MONO_VECTOR++] signature of the <:BasisLibrary:Basis Library> implementation incorrectly omits the specification of `find`.
1506 Fixed by revision <!ViewSVNRev(4707)>.
1509 The optimizer reports an internal error (`TypeError`) when an imported C function is called but not used.
1511 Thanks to "jq" for the bug report.
1513 Fixed by revision <!ViewSVNRev(4690)>.
1516 Bug in pass to flatten data structures.
1518 Thanks to Joe Hurd for the bug report.
1520 Fixed by revision <!ViewSVNRev(4662)>.
1523 The native codegen's implementation of the C-calling convention failed to widen 16-bit arguments to 32 bits.
1525 Fixed by revision <!ViewSVNRev(4631)>.
1528 The http://www.standardml.org/Basis/pack-float.html[++PACK_REAL++] structures of the <:BasisLibrary:Basis Library> implementation used byte, rather than element, indexing.
1530 Fixed by revision <!ViewSVNRev(4411)>.
1533 `MLton.share` could cause a segmentation fault.
1535 Fixed by revision <!ViewSVNRev(4400)>.
1538 The SSA simplifier could eliminate an irredundant test.
1540 Fixed by revision <!ViewSVNRev(4370)>.
1543 A program with a very large number of functors could exhibit the error `ElaborateEnv.functorClosure: firstTycons`.
1545 Fixed by revision <!ViewSVNRev(4344)>.
1549 :mlton-guide-page: Bugs20070826
1554 Here are the known bugs in <:Release20070826:MLton 20070826>, listed
1555 in reverse chronological order of date reported.
1558 Bug in the mark-compact garbage collector where the C library's `memcpy` was used to move objects during the compaction phase; this could lead to heap corruption and segmentation faults with newer versions of gcc and/or glibc, which assume that src and dst in a `memcpy` do not overlap.
1560 Fixed by revision <!ViewSVNRev(7461)>.
1563 Bug in elaboration of `datatype` declarations with `withtype` bindings.
1565 Fixed by revision <!ViewSVNRev(7434)>.
1568 Performance bug in <:RefFlatten:> optimization pass.
1570 Thanks to Reactive Systems for the bug report.
1572 Fixed by revision <!ViewSVNRev(7379)>.
1575 Performance bug in <:SimplifyTypes:> optimization pass.
1577 Thanks to Reactive Systems for the bug report.
1579 Fixed by revisions <!ViewSVNRev(7377)> and <!ViewSVNRev(7378)>.
1582 Bug in amd64 codegen register allocation of indirect C calls.
1584 Thanks to David Hansel for the bug report.
1586 Fixed by revision <!ViewSVNRev(7368)>.
1589 Bug in `IntInf.scan` and `IntInf.fromString` where leading spaces were only accepted if the stream had an explicit sign character.
1591 Thanks to David Hansel for the bug report.
1593 Fixed by revisions <!ViewSVNRev(7227)> and <!ViewSVNRev(7230)>.
1596 Bug in `IntInf.~>>` that could cause a `glibc` assertion.
1598 Fixed by revisions <!ViewSVNRev(7083)>, <!ViewSVNRev(7084)>, and <!ViewSVNRev(7085)>.
1601 Bug in the return type of `MLton.Process.reap`.
1603 Thanks to Risto Saarelma for the bug report.
1605 Fixed by revision <!ViewSVNRev(7029)>.
1608 Bug in `MLton.size` and `MLton.share` when tracing the current stack.
1610 Fixed by revisions <!ViewSVNRev(6978)>, <!ViewSVNRev(6981)>, <!ViewSVNRev(6988)>, <!ViewSVNRev(6989)>, and <!ViewSVNRev(6990)>.
1613 Bug in nested `_export`/`_import` functions.
1615 Fixed by revision <!ViewSVNRev(6919)>.
1618 Bug in the name mangling of `_import`-ed functions with the `stdcall` convention.
1620 Thanks to Lars Bergstrom for the bug report.
1622 Fixed by revision <!ViewSVNRev(6672)>.
1625 Bug in Windows code to page the heap to disk when unable to grow the heap to a desired size.
1627 Thanks to Sami Evangelista for the bug report.
1629 Fixed by revisions <!ViewSVNRev(6600)> and <!ViewSVNRev(6624)>.
1632 Bug in \*NIX code to page the heap to disk when unable to grow the heap to a desired size.
1634 Thanks to Nicolas Bertolotti for the bug report and patch.
1636 Fixed by revisions <!ViewSVNRev(6596)> and <!ViewSVNRev(6600)>.
1639 Space-safety bug in pass to <:RefFlatten: flatten refs> into containing data structure.
1641 Thanks to Daniel Spoonhower for the bug report and initial diagnosis and patch.
1643 Fixed by revision <!ViewSVNRev(6395)>.
1646 Bug in the frontend that rejected `op longvid` patterns and expressions.
1648 Thanks to Florian Weimer for the bug report.
1650 Fixed by revision <!ViewSVNRev(6347)>.
1653 Bug in the http://www.standardml.org/Basis/imperative-io.html#SIG:IMPERATIVE_IO.canInput:VAL[`IMPERATIVE_IO.canInput`] function of the <:BasisLibrary:Basis Library> implementation.
1655 Thanks to Ville Laurikari for the bug report.
1657 Fixed by revision <!ViewSVNRev(6261)>.
1660 Bug in algebraic simplification of real primitives. http://www.standardml.org/Basis/real.html#SIG:REAL.\|@LTE\|:VAL[++REAL__<N>__.\<=(x, x)++] is `false` when `x` is NaN.
1662 Fixed by revision <!ViewSVNRev(6242)>.
1665 Bug in the FFI visible representation of `Int16.int ref` (and references of other primitive types smaller than 32-bits) on big-endian platforms.
1667 Thanks to Dave Herman for the bug report.
1669 Fixed by revision <!ViewSVNRev(6267)>.
1672 Bug in type inference of flexible records. This would later cause the compiler to raise the `TypeError` exception.
1674 Thanks to Wesley Terpstra for the bug report.
1676 Fixed by revision <!ViewSVNRev(6229)>.
1679 Bug in cross-compilation of `gdtoa` library.
1681 Thanks to Wesley Terpstra for the bug report and patch.
1683 Fixed by revision <!ViewSVNRev(6620)>.
1686 Bug in pass to <:RefFlatten: flatten refs> into containing data structure.
1688 Thanks to Ruy Ley-Wild for the bug report.
1690 Fixed by revision <!ViewSVNRev(6191)>.
1693 Bug in the handling of weak pointers by the mark-compact garbage collector.
1695 Thanks to Sean McLaughlin for the bug report and Florian Weimer for the initial diagnosis.
1697 Fixed by revision <!ViewSVNRev(6183)>.
1700 Bug in the elaboration of structures with signature constraints. This would later cause the compiler to raise the `TypeError` exception.
1702 Thanks to Vesa Karvonen for the bug report.
1704 Fixed by revision <!ViewSVNRev(6046)>.
1707 Bug in the interaction of `_export`-ed functions and signal handlers.
1709 Thanks to Sean McLaughlin for the bug report.
1711 Fixed by revision <!ViewSVNRev(6013)>.
1714 Bug in the implementation of `_export`-ed functions using the `char` type, leading to a linker error.
1716 Thanks to Katsuhiro Ueno for the bug report.
1718 Fixed by revision <!ViewSVNRev(5999)>.
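
The `memcpy` entry in this list rests on a C library rule: `memcpy` has undefined behavior when its source and destination ranges overlap, whereas `memmove` is specified to handle overlap. The following is a minimal sketch of that distinction, not MLton's collector code; `slide_down` and the flat byte buffer are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: slide `len` live bytes that start at offset `src`
 * down to offset `dst` inside a single heap buffer, as a compacting
 * collector does when it closes gaps.
 * If src - dst < len the two ranges overlap, so memmove is required;
 * memcpy would be undefined behavior in that case. */
void slide_down(unsigned char *heap, size_t dst, size_t src, size_t len)
{
    assert(dst <= src);   /* compaction only moves objects toward lower addresses */
    memmove(heap + dst, heap + src, len);
}
```

Sliding six bytes down by two positions overlaps four of them, which is exactly the situation in which a `memcpy` that copies in an unexpected order can corrupt the data.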
1722 :mlton-guide-page: Bugs20100608
1727 Here are the known bugs in <:Release20100608:MLton 20100608>, listed
1728 in reverse chronological order of date reported.
1731 Bugs in `REAL.signBit`, `REAL.copySign`, and `REAL.toDecimal`/`REAL.fromDecimal`.
1733 Thanks to Phil Clayton for the bug report and examples.
1735 Fixed by revisions <!ViewSVNRev(7571)>, <!ViewSVNRev(7572)>, and <!ViewSVNRev(7573)>.
1738 Bug in elaboration of type variables with and without equality status.
1740 Thanks to Rob Simmons for the bug report and examples.
1742 Fixed by revision <!ViewSVNRev(7565)>.
1745 Bug in <:Redundant:redundant> <:SSA:> optimization.
1747 Thanks to Lars Magnusson for the bug report and example.
1749 Fixed by revision <!ViewSVNRev(7561)>.
1752 Bug in <:SSA:>/<:SSA2:> <:Shrink:shrinker> that could erroneously turn a non-tail function call with a `Bug` transfer as its continuation into a tail function call.
1754 Thanks to Lars Bergstrom for the bug report.
1756 Fixed by revision <!ViewSVNRev(7546)>.
1759 Bug in translation from <:SSA2:> to <:RSSA:> with `case` expressions over non-primitive-sized words.
1761 Fixed by revision <!ViewSVNRev(7544)>.
1764 Bug with <:SSA:>/<:SSA2:> type checking of case expressions over words.
1766 Fixed by revision <!ViewSVNRev(7542)>.
1769 Bug with treatment of `as`-patterns, which should not allow the redefinition of constructor status.
1771 Thanks to Michael Norrish for the bug report.
1773 Fixed by revision <!ViewSVNRev(7530)>.
1776 Bug with treatment of `nan` in <:CommonSubexp:common subexpression elimination> <:SSA:> optimization.
1778 Thanks to Alexandre Hamez for the bug report.
1780 Fixed by revision <!ViewSVNRev(7503)>.
1783 Bug in translation from <:SSA2:> to <:RSSA:> with weak pointers.
1785 Thanks to Alexandre Hamez for the bug report.
1787 Fixed by revision <!ViewSVNRev(7502)>.
1790 Bug in amd64 codegen calling convention for varargs C calls.
1792 Thanks to <:HenryCejtin:> for the bug report and <:WesleyTerpstra:> for the initial diagnosis.
1794 Fixed by revision <!ViewSVNRev(7501)>.
1797 Bug in comment-handling in lexer for <:MLYacc:>'s input language.
1799 Thanks to Michael Norrish for the bug report and patch.
1801 Fixed by revision <!ViewSVNRev(7500)>.
1804 Bug in elaboration of function clauses with different numbers of arguments that would raise an uncaught `Subscript` exception.
1806 Fixed by revision <!ViewSVNRev(75497)>.
1810 :mlton-guide-page: Bugs20130715
1815 Here are the known bugs in <:Release20130715:MLton 20130715>, listed
1816 in reverse chronological order of date reported.
1819 Bug with simultaneous `sharing` of multiple structures.
1821 Fixed by commit <!ViewGitCommit(mlton,9cb5164f6)>.
1824 Minor bug with exception replication.
1826 Fixed by commit <!ViewGitCommit(mlton,1c89c42f6)>.
1829 Minor bug erroneously accepting symbolic identifiers for strid, sigid, and fctid
1830 and erroneously accepting symbolic identifiers before `.` in long identifiers.
1832 Fixed by commit <!ViewGitCommit(mlton,9a56be647)>.
1835 Minor bug in precedence parsing of function clauses.
1837 Fixed by commit <!ViewGitCommit(mlton,1a6d25ec9)>.
1840 Performance bug in creation of worker threads to service calls of `_export`-ed
1843 Thanks to Bernard Berthomieu for the bug report.
1845 Fixed by commit <!ViewGitCommit(mlton,97c2bdf1d)>.
1848 Bug in `MLton.IntInf.fromRep` that could yield values that violate the `IntInf`
1849 representation invariants.
1851 Thanks to Rob Simmons for the bug report.
1853 Fixed by commit <!ViewGitCommit(mlton,3add91eda)>.
1856 Bug in equality status of some arrays, vectors, and slices in Basis Library
1859 Fixed by commit <!ViewGitCommit(mlton,a7ed9cbf1)>.
1863 :mlton-guide-page: Bugs20180207
1868 Here are the known bugs in <:Release20180207:MLton 20180207>, listed
1869 in reverse chronological order of date reported.
1873 :mlton-guide-page: CallGraph
1878 For easier visualization of <:Profiling:profiling> data, `mlprof` can
1879 create a call graph of the program in dot format, from which you can
1880 use the http://www.research.att.com/sw/tools/graphviz/[graphviz]
1881 software package to create a PostScript or PNG graph. For example,
1883 mlprof -call-graph foo.dot foo mlmon.out
1885 will create `foo.dot` with a complete call graph. For each source
1886 function, there will be one node in the graph that contains the
1887 function name (and source position with `-show-line true`), as
1888 well as the percentage of ticks. If you want to create a call graph
1889 for your program without any profiling data, you can simply call
1890 `mlprof` without any `mlmon.out` files, as in
1892 mlprof -call-graph foo.dot foo
1895 Because SML has higher-order functions, the call graph is dependent
1896 on MLton's analysis of which functions call each other. This analysis
1897 depends on many implementation details and might display spurious
1898 edges that a human could conclude are impossible. However, in
1899 practice, the call graphs tend to be very accurate.
1901 Because call graphs can get big, `mlprof` provides the `-keep` option
1902 to specify the nodes that you would like to see. This option also
1903 controls which functions appear in the table that `mlprof` prints.
1904 The argument to `-keep` is an expression describing a set of source
1905 functions (i.e. graph nodes). The expression _e_ should be of the
1910 * ++(and __e ...__)++
1916 * ++(thresh __x__)++
1917 * ++(thresh-gc __x__)++
1918 * ++(thresh-stack __x__)++
1921 In the grammar, ++all++ denotes the set of all nodes. ++"__s__"++ is
1922 a regular expression denoting the set of functions whose name
1923 (followed by a space and the source position) has a prefix matching
1924 the regexp. The `and`, `not`, and `or` expressions denote
1925 intersection, complement, and union, respectively. The `pred` and
1926 `succ` expressions add the set of immediate predecessors or successors
1927 to their argument, respectively. The `from` and `to` expressions
1928 denote the set of nodes that have paths from or to the set of nodes
1929 denoted by their arguments, respectively. Finally, `thresh`,
1930 `thresh-gc`, and `thresh-stack` denote the set of nodes whose
1931 percentage of ticks, gc ticks, or stack ticks, respectively, is
1932 greater than or equal to the real number _x_.
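
The set operations described above can be sketched over a small graph with bitmask sets. This is an illustration of the semantics only, not `mlprof`'s implementation; the `graph` type and the `set_*` names are hypothetical, and treating `from` as including its argument nodes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 32

/* A set of graph nodes as a bitmask: bit i set means node i is in the set. */
typedef uint32_t node_set;

/* Hypothetical call graph: succs[i] is the set of nodes that node i calls. */
typedef struct {
    int n;                        /* number of nodes, at most MAX_NODES */
    node_set succs[MAX_NODES];
} graph;

/* all: the set of all nodes. */
node_set set_all(const graph *g)
{
    assert(g->n >= 0 && g->n <= MAX_NODES);
    return g->n == MAX_NODES ? UINT32_MAX : (UINT32_C(1) << g->n) - 1;
}

/* and / or / not: intersection, union, and complement. */
node_set set_and(node_set a, node_set b) { return a & b; }
node_set set_or(node_set a, node_set b) { return a | b; }
node_set set_not(const graph *g, node_set a) { return set_all(g) & ~a; }

/* succ: the argument plus its immediate successors. */
node_set set_succ(const graph *g, node_set s)
{
    node_set out = s;
    for (int i = 0; i < g->n; i++)
        if (s & (UINT32_C(1) << i))
            out |= g->succs[i];
    return out;
}

/* from: every node reachable from s (s itself included, by assumption). */
node_set set_from(const graph *g, node_set s)
{
    node_set cur = s, next;
    while ((next = set_succ(g, cur)) != cur)
        cur = next;
    return cur;
}
```

`pred` and `to` would be the same two functions applied to the graph with its edges reversed.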
1934 For example, if you want to see the entire call graph for a program,
1935 you can use `-keep all` (this is the default). If you want to see
1936 all nodes reachable from function `foo` in your program, you would
1937 use `-keep '(from "foo")'`. Or, if you want to see all the
1938 functions defined in subdirectory `bar` of your project that used
1939 at least 1% of the ticks, you would use
1941 -keep '(and ".*/bar/" (thresh 1.0))'
1943 To see all functions with ticks above a threshold, you can also use
1944 `-thresh x`, which is an abbreviation for `-keep '(thresh x)'`. You
1945 cannot use multiple `-keep` arguments or both `-keep` and `-thresh`.
1946 When you use `-keep` to display a subset of the functions, `mlprof`
1947 will add dashed edges to the call graph to indicate a path in the
1948 original call graph from one function to another.
1950 When compiling with `-profile-stack true`, you can use `mlprof -gray
1951 true` to make the nodes darker or lighter depending on whether their
1952 stack percentage is higher or lower.
1954 MLton's optimizer may duplicate source functions for any of a number
1955 of reasons (functor duplication, monomorphisation, polyvariance,
1956 inlining). By default, all duplicates of a function are treated as
1957 one. If you would like to treat the duplicates separately, you can
1958 use ++mlprof -split __regexp__++, which will cause all duplicates of
1959 functions whose name has a prefix matching the regular expression to
1960 be treated separately. This can be especially useful for higher-order
1961 utility functions like `General.o`.
1965 Technically speaking, `mlprof` produces a call-stack graph rather than
1966 a call graph, because it describes the set of possible call stacks.
1967 The difference is in how tail calls are displayed. For example, if `f`
1968 nontail calls `g` and `g` tail calls `h`, then the call-stack graph
1969 has edges from `f` to `g` and `f` to `h`, while the call graph has
1970 edges from `f` to `g` and `g` to `h`. That is, a tail call from `g`
1971 to `h` removes `g` from the call stack and replaces it with `h`.
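
One way to make the distinction concrete is to derive call-stack edges from call edges that are marked as tail or nontail. The sketch below is an illustration under that model, not `mlprof`'s algorithm; the `call` record and `call_stack_graph` are hypothetical names, and it ignores tail calls made by a function that has nothing beneath it on the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 16

/* One call site: `caller` calls `callee`, possibly in tail position. */
typedef struct {
    int caller;
    int callee;
    bool is_tail;
} call;

/* Sets stack_edge[a][b] to true when b can sit directly above a on the stack. */
void call_stack_graph(const call *calls, size_t ncalls,
                      bool stack_edge[MAX_FUNCS][MAX_FUNCS])
{
    for (int a = 0; a < MAX_FUNCS; a++)
        for (int b = 0; b < MAX_FUNCS; b++)
            stack_edge[a][b] = false;

    bool changed = true;
    while (changed) {                       /* iterate to a fixed point */
        changed = false;
        for (size_t i = 0; i < ncalls; i++) {
            int g = calls[i].caller, h = calls[i].callee;
            assert(g >= 0 && g < MAX_FUNCS && h >= 0 && h < MAX_FUNCS);
            if (!calls[i].is_tail) {
                /* A nontail call pushes h on top of g. */
                if (!stack_edge[g][h]) { stack_edge[g][h] = true; changed = true; }
            } else {
                /* A tail call replaces g with h above whatever was below g. */
                for (int p = 0; p < MAX_FUNCS; p++)
                    if (stack_edge[p][g] && !stack_edge[p][h]) {
                        stack_edge[p][h] = true;
                        changed = true;
                    }
            }
        }
    }
}
```

With `f`, `g`, and `h` numbered 0, 1, and 2, the calls `f -> g` (nontail) and `g -> h` (tail) yield the edges `f -> g` and `f -> h`, matching the description above.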
1975 :mlton-guide-page: CallingFromCToSML
1976 [[CallingFromCToSML]]
1980 MLton's <:ForeignFunctionInterface:> allows programs to _export_ SML
1981 functions to be called from C. Suppose you would like to export from SML
1982 a function of type `real * char -> int` as the C function `foo`.
1983 MLton extends the syntax of SML to allow expressions like the
1986 _export "foo": (real * char -> int) -> unit;
1988 The above expression exports a C function named `foo`, with
1992 Int32 foo (Real64 x0, Char x1);
1994 The `_export` expression denotes a function of type
1995 `(real * char -> int) -> unit` that when called with a function
1996 `f`, arranges for the exported `foo` function to call `f`
1997 when `foo` is called. So, for example, the following exports and
2001 val e = _export "foo": (real * char -> int) -> unit;
2002 val _ = e (fn (x, c) => 13 + Real.floor x + Char.ord c)
2005 The general form of an `_export` expression is
2007 _export "C function name" attr... : cFuncTy -> unit;
2009 The type and the semicolon are not optional. As with `_import`, a
2010 sequence of attributes may follow the function name.
2012 MLton's `-export-header` option generates a C header file with
2013 prototypes for all of the functions exported from SML. Include this
2014 header file in your C files to type check calls to functions exported
2015 from SML. This header file includes ++typedef++s for the
2016 <:ForeignFunctionInterfaceTypes: types that can be passed between SML and C>.
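
From the C side, an exported function is called like any other C function. The sketch below shows a caller for the `foo` prototype given earlier. The three typedefs and the body of `foo` are stand-ins so that the fragment is self-contained: in a real build the typedefs come from the header produced by `-export-header` and `foo` is provided by the MLton-compiled program. The mapping of `Char` to `unsigned char` is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for typedefs that the generated header would supply. */
typedef int32_t Int32;
typedef double Real64;
typedef unsigned char Char;

/* Stand-in for the exported function. It mirrors
 *   fn (x, c) => 13 + Real.floor x + Char.ord c
 * so the caller below has something to link against. */
Int32 foo(Real64 x0, Char x1)
{
    /* The cast below is only defined for values that fit in Int32. */
    assert(x0 > -2147483648.0 && x0 < 2147483647.0);
    Int32 t = (Int32)x0;            /* truncates toward zero */
    if ((Real64)t > x0)
        t--;                        /* adjust to floor for negative non-integers */
    return 13 + t + (Int32)x1;
}

/* Ordinary C code calling the exported function. */
Int32 call_foo_example(void)
{
    return foo(2.5, 'a');           /* 13 + 2 + 97 in ASCII */
}
```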
2021 Suppose that `export.sml` is
2025 sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/export.sml]
2028 Note that the `reentrant` attribute is used for `_import`-ing the
2029 C functions that will call the `_export`-ed SML functions.
2031 Create the header file with `-export-header`.
2033 % mlton -default-ann 'allowFFI true' \
2034 -export-header export.h \
2039 `export.h` now contains the following C prototypes.
2041 Int8 f (Int32 x0, Real64 x1, Int8 x2);
2042 Pointer f2 (Word8 x0);
2048 Use `export.h` in a C program, `ffi-export.c`, as follows.
2052 sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-export.c]
2055 Compile `ffi-export.c` and `export.sml`.
2057 % gcc -c ffi-export.c
2058 % mlton -default-ann 'allowFFI true' \
2059 export.sml ffi-export.o
2062 Finally, run `export`.
2073 * <!RawGitFile(mlton,master,doc/examples/ffi/export.sml)>
2074 * <!RawGitFile(mlton,master,doc/examples/ffi/ffi-export.c)>
2078 :mlton-guide-page: CallingFromSMLToC
2079 [[CallingFromSMLToC]]
2083 MLton's <:ForeignFunctionInterface:> allows an SML program to _import_
2084 C functions. Suppose you would like to import from C a function with
2085 the following prototype:
2088 int foo (double d, char c);
2090 MLton extends the syntax of SML to allow expressions like the following:
2092 _import "foo": real * char -> int;
2094 This expression denotes a function of type `real * char -> int` whose
2095 behavior is implemented by calling the C function whose name is `foo`.
2096 Thinking in terms of C, imagine that there are C variables `d` of type
2097 `double`, `c` of type `unsigned char`, and `i` of type `int`. Then,
2098 the C statement `i = foo (d, c)` is executed and `i` is returned.
2100 The general form of an `_import` expression is:
2102 _import "C function name" attr... : cFuncTy;
2104 The type and the semicolon are not optional.
2106 The function name is followed by a (possibly empty) sequence of
2107 attributes, analogous to C `__attribute__` specifiers.
2112 `import.sml` imports the C function `ffi` and the C variable `FFI_INT`
2117 sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/import.sml]
2124 sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-import.c]
2127 Compile and run the program.
2129 % mlton -default-ann 'allowFFI true' -export-header export.h import.sml ffi-import.c
2137 * <!RawGitFile(mlton,master,doc/examples/ffi/import.sml)>
2138 * <!RawGitFile(mlton,master,doc/examples/ffi/ffi-import.c)>
2143 * <:CallingFromSMLToCFunctionPointer:>
2147 :mlton-guide-page: CallingFromSMLToCFunctionPointer
2148 [[CallingFromSMLToCFunctionPointer]]
2149 CallingFromSMLToCFunctionPointer
2150 ================================
2152 Just as MLton can <:CallingFromSMLToC:directly call C functions>, it
2153 is possible to make indirect function calls; that is, function calls
2154 through a function pointer. MLton extends the syntax of SML to allow
2155 expressions like the following:
2157 _import * : MLton.Pointer.t -> real * char -> int;
2159 This expression denotes a function of type
2162 MLton.Pointer.t -> real * char -> int
2164 whose behavior is implemented by calling the C function at the address
2165 denoted by the `MLton.Pointer.t` argument, and supplying the C
2166 function two arguments, a `double` and a `char`. The C function
2167 pointer may be obtained, for example, by the dynamic linking loader
2168 (`dlopen`, `dlsym`, ...).
2170 The general form of an indirect `_import` expression is:
2172 _import * attr... : cPtrTy -> cFuncTy;
2174 The type and the semicolon are not optional.
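
In C terms, an indirect call of this kind takes an untyped address, treats it as a function pointer of the declared type, and calls through it. The sketch below illustrates that idea only; `foo_fn`, `sample_target`, and `call_indirect` are hypothetical names, and the mapping of `real * char -> int` to `int (*)(double, char)` is an assumption based on the prototype used for direct imports. Converting a `void *` to a function pointer is not guaranteed by ISO C, but POSIX relies on it for `dlsym`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed C-side type for an SML function type real * char -> int. */
typedef int (*foo_fn)(double, char);

/* A hypothetical target; in practice the address might come from dlsym. */
int sample_target(double d, char c)
{
    return (int)d + (c == 'x' ? 1 : 0);
}

/* Call the function stored at `addr` with the two arguments. */
int call_indirect(void *addr, double d, char c)
{
    assert(addr != NULL);
    foo_fn f = (foo_fn)addr;   /* reinterpret the address as a typed function pointer */
    return f(d, c);
}
```

Nothing checks that the address really refers to a function of the declared type, so a mismatch leads to undefined behavior rather than a type error.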
2179 This example uses `dlopen` and friends (imported using normal
2180 `_import`) to dynamically load the math library (`libm`) and call the
2181 `cos` function. Suppose `iimport.sml` contains the following.
2185 sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/iimport.sml]
2188 Compile and run `iimport.sml`.
2190 % mlton -default-ann 'allowFFI true' \
2191 -target-link-opt linux -ldl \
2192 -target-link-opt solaris -ldl \
2195 Math.cos(2.0) = ~0.416146836547
2196 libm.so::cos(2.0) = ~0.416146836547
2199 This example also shows the `-target-link-opt` option, which uses the
2200 switch when linking only when on the specified platform. Compile with
2201 `-verbose 1` to see in more detail what's being passed to `gcc`.
2205 * <!RawGitFile(mlton,master,doc/examples/ffi/iimport.sml)>
2209 :mlton-guide-page: CCodegen
2214 The <:CCodegen:> is a <:Codegen:code generator> that translates the
2215 <:Machine:> <:IntermediateLanguage:> to C, which is further optimized
2216 and compiled to native object code by `gcc` (or another C compiler).
2218 == Implementation ==
2220 * <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.sig)>
2221 * <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.fun)>
2223 == Details and Notes ==
2225 The <:CCodegen:> is the original <:Codegen:code generator> for MLton.
2229 :mlton-guide-page: Changelog
2234 * <!ViewGitFile(mlton,master,CHANGELOG.adoc)>
2237 sys::[./bin/InclGitFile.py mlton master CHANGELOG.adoc]
2242 :mlton-guide-page: ChrisClearwater
2251 :mlton-guide-page: Chunkify
2256 <:Chunkify:> is an analysis pass for the <:RSSA:>
2257 <:IntermediateLanguage:>, invoked from <:ToMachine:>.
2261 It partitions all the labels (function and block) in an <:RSSA:>
2262 program into disjoint sets, referred to as chunks.
2264 == Implementation ==
2266 * <!ViewGitFile(mlton,master,mlton/backend/chunkify.sig)>
2267 * <!ViewGitFile(mlton,master,mlton/backend/chunkify.fun)>
2269 == Details and Notes ==
2271 Breaking large <:RSSA:> functions into chunks is necessary for
2272 reasonable compile times with the <:CCodegen:> and the <:LLVMCodegen:>.
2276 :mlton-guide-page: CKitLibrary
2281 The http://www.smlnj.org/doc/ckit[ckit Library] is a C front end
2282 written in SML that translates C source code (after preprocessing)
2283 into abstract syntax represented as a set of SML datatypes. The ckit
2284 Library is distributed with SML/NJ. Due to differences between SML/NJ
2285 and MLton, this library will not work out-of-the box with MLton.
2287 As of 20180119, MLton includes a port of the ckit Library synchronized
2288 with SML/NJ version 110.82.
2292 * You can import the ckit Library into an MLB file with:
2296 |MLB file|Description
2297 |`$(SML_LIB)/ckit-lib/ckit-lib.mlb`|
2300 * If you are porting a project from SML/NJ's <:CompilationManager:> to
2301 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
2302 following map is included by default:
2306 $ckit-lib.cm $(SML_LIB)/ckit-lib
2307 $ckit-lib.cm/ckit-lib.cm $(SML_LIB)/ckit-lib/ckit-lib.mlb
2310 This will automatically convert a `$/ckit-lib.cm` import in an input
2311 `.cm` file into a `$(SML_LIB)/ckit-lib/ckit-lib.mlb` import in the
2316 The following changes were made to the ckit Library, in addition to
2317 deriving the `.mlb` file from the `.cm` file:
2319 * `ast/pp/pp-ast-adornment-sig.sml` (modified): Rewrote use of `signature` in `local`.
2320 * `ast/pp/pp-ast-ext-sig.sml` (modified): Rewrote use of `signature` in `local`.
2321 * `ast/type-util-sig.sml` (modified): Rewrote use of `signature` in `local`.
2322 * `parser/parse-tree-sig.sml` (modified): Rewrote use of (sequential) `withtype` in signature.
2323 * `parser/parse-tree.sml` (modified): Rewrote use of (sequential) `withtype`.
2327 * <!ViewGitFile(mlton,master,lib/ckit-lib/ckit.patch)>
2331 :mlton-guide-page: Closure
2336 A closure is a data structure that is the run-time representation of a
2340 == Typical Implementation ==
2342 In a typical implementation, a closure consists of a _code pointer_
2343 (indicating what the function does) and an _environment_ containing
2344 the values of the free variables of the function. For example, in the
2356 the closure for `fn y => x + y` contains a pointer to a piece of code
2357 that knows to take its argument and add the value of `x` to it, plus
2358 the environment recording the value of `x` as `5`.
2360 To call a function, the code pointer is extracted and jumped to,
2361 passing in some agreed upon location the environment and the argument.
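
The code-pointer-plus-environment layout can be sketched in C for the `fn y => x + y` example. This illustrates the typical scheme only, not any particular compiler's output; the names `closure`, `add_env`, `make_adder`, and `apply_closure` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Environment for `fn y => x + y`: the value of its one free variable. */
typedef struct {
    int x;
} add_env;

/* A closure: what the function does, plus the values it closes over. */
typedef struct {
    int (*code)(const void *env, int arg);
    const void *env;
} closure;

/* The code for `fn y => x + y`: fetch x from the environment and add y. */
static int add_x_code(const void *env, int y)
{
    const add_env *e = env;
    return e->x + y;
}

closure make_adder(const add_env *env)
{
    closure c = { add_x_code, env };
    return c;
}

/* Calling a closure: extract the code pointer and pass it the
 * environment together with the argument. */
int apply_closure(closure c, int arg)
{
    assert(c.code != NULL);
    return c.code(c.env, arg);
}
```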
2364 == MLton's Implementation ==
2366 MLton does not implement closures traditionally. Instead, based on
2367 whole-program higher-order control-flow analysis, MLton represents a
2368 function as an element of a sum type, where the variant indicates
2369 which function it is and carries the free variables as arguments. See
2370 <:ClosureConvert:> and <!Cite(CejtinEtAl00)> for details.
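
The sum-type representation can be sketched in C as a tagged union with one variant per function that may flow to a call site, and a `switch` in place of an indirect jump. This is a hand-written illustration of the idea, not MLton's generated code; the two variants and all names are hypothetical.

```c
#include <assert.h>

/* One tag per source function that can reach this call site. */
typedef enum { FN_ADD_X, FN_NEGATE } fn_tag;

/* The variant says which function it is and carries its free variables. */
typedef struct {
    fn_tag tag;
    union {
        struct { int x; } add_x;   /* free variables of `fn y => x + y` */
        /* `fn y => ~y` has no free variables, so it needs no payload */
    } fv;
} fn_value;

fn_value make_add_x(int x)
{
    fn_value f;
    f.tag = FN_ADD_X;
    f.fv.add_x.x = x;
    return f;
}

fn_value make_negate(void)
{
    fn_value f;
    f.tag = FN_NEGATE;
    f.fv.add_x.x = 0;              /* unused; initialized for tidiness */
    return f;
}

/* Application is a case analysis on the variant; no code pointer is stored. */
int apply_fn(fn_value f, int arg)
{
    switch (f.tag) {
    case FN_ADD_X:  return f.fv.add_x.x + arg;
    case FN_NEGATE: return -arg;
    }
    assert(0 && "unknown function tag");
    return 0;
}
```

Because the set of variants at each call site comes from a whole-program analysis, the dispatch only has to cover the functions that can actually reach that site.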
2374 :mlton-guide-page: ClosureConvert
2379 <:ClosureConvert:> is a translation pass from the <:SXML:>
2380 <:IntermediateLanguage:> to the <:SSA:> <:IntermediateLanguage:>.
2384 It converts an <:SXML:> program into an <:SSA:> program.
2386 <:Defunctionalization:> is the technique used to eliminate
2387 <:Closure:>s (see <!Cite(CejtinEtAl00)>).
2389 Uses <:Globalize:> and <:LambdaFree:> analyses.
2391 == Implementation ==
2393 * <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.sig)>
2394 * <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.fun)>
2396 == Details and Notes ==
2402 :mlton-guide-page: CMinusMinus
2407 http://cminusminus.org[C--] is a portable assembly language intended
2408 to make it easy for compilers for different high-level languages to
2409 share the same backend. An experimental version of MLton has been
2410 made to generate C--.
2412 * http://www.mlton.org/pipermail/mlton/2005-March/026850.html
2420 :mlton-guide-page: Codegen
2425 <:Codegen:> is a translation pass from the <:Machine:>
2426 <:IntermediateLanguage:> to one or more compilation units that can be
2427 compiled to native object code by an external tool.
2429 == Implementation ==
2431 * <!ViewGitDir(mlton,master,mlton/codegen)>
2433 == Details and Notes ==
2435 The following <:Codegen:codegens> are implemented:
2444 :mlton-guide-page: CombineConversions
2445 [[CombineConversions]]
2449 <:CombineConversions:> is an optimization pass for the <:SSA:>
2450 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2454 This pass looks for and simplifies nested calls to (signed)
2455 extension/truncation.
2457 == Implementation ==
2459 * <!ViewGitFile(mlton,master,mlton/ssa/combine-conversions.fun)>
2461 == Details and Notes ==
2463 It processes each block in DFS order (visiting definitions before uses):
2465 * If the statement is not a `PrimApp` with `Word_extdToWord`, skip it.
2466 * After processing a conversion, it tags the `Var` for subsequent use.
2467 * When inspecting a conversion, check if the `Var` operand is also the
2468 result of a conversion. If it is, try to combine the two operations.
2469 Repeatedly simplify until hitting either a non-conversion `Var` or a
2470 case where the conversion cannot be simplified.
2472 The optimization rules are very simple:
2475 x2 = Word_extdToWord (W1, W2, {signed=s1}) x1
2476 x3 = Word_extdToWord (W2, W3, {signed=s2}) x2
2479 * If `W1 = W2`, then there are no conversions before `x1`.
2481 This is guaranteed because `W2 = W3` will always trigger optimization.
2483 * Case `W1 <= W3 <= W2`:
2486 x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
2489 * Case `W1 < W2 < W3 AND ((NOT s1) OR s2)`:
2492 x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
2495 * Case `W1 = W2 < W3`:
2497 unoptimized, because there are no conversions past `W1` and `x2 = x1`
2499 * Case `W3 <= W2 <= W1 OR W3 <= W1 <= W2`:
2502 x3 = Word_extdToWord (W1, W3, {signed=_}) x1
2505 because `W3 <= W1 && W3 <= W2`, just clip `x1`
2507 * Case `W2 < W1 <= W3 OR W2 < W3 <= W1`:
2509 unoptimized, because `W2 < W1 && W2 < W3`, so the first conversion has a truncation effect
2511 * Case `W1 < W2 < W3 AND (s1 AND (NOT s2))`:
2513 unoptimized, because each conversion affects the result separately
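
The case analysis above can be checked against a small model of the conversion. The sketch below is written for illustration and is not taken from `combine-conversions.fun`; `extd`, `combine`, and the `combined` record are hypothetical names, and widths are assumed to be between 1 and 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keep the low w bits of x. */
static uint64_t low_bits(uint64_t x, unsigned w)
{
    return w >= 64 ? x : x & ((UINT64_C(1) << w) - 1);
}

/* Model of Word_extdToWord (from, to, {signed}): truncate when narrowing,
 * sign- or zero-extend when widening. */
uint64_t extd(unsigned from, unsigned to, bool is_signed, uint64_t x)
{
    assert(from >= 1 && from <= 64 && to >= 1 && to <= 64);
    x = low_bits(x, from);
    if (to <= from)
        return low_bits(x, to);
    if (is_signed && ((x >> (from - 1)) & 1))
        x |= low_bits(~UINT64_C(0), to) & ~low_bits(~UINT64_C(0), from);
    return x;
}

/* Outcome of trying to merge W1->W2 (signed=s1) followed by W2->W3 (signed=s2). */
typedef struct {
    bool ok;          /* true if a single W1->W3 conversion is equivalent */
    bool is_signed;   /* signedness to use for that single conversion */
} combined;

combined combine(unsigned w1, unsigned w2, unsigned w3, bool s1, bool s2)
{
    combined no = { false, false };
    combined keep_s1 = { true, s1 };
    combined truncate = { true, false };   /* signedness is irrelevant when narrowing */

    if (w1 <= w3 && w3 <= w2)
        return keep_s1;                    /* widen, then narrow back part of the way */
    if (w1 < w2 && w2 < w3 && (!s1 || s2))
        return keep_s1;                    /* two compatible widenings */
    if ((w3 <= w2 && w2 <= w1) || (w3 <= w1 && w1 <= w2))
        return truncate;                   /* result is just the low W3 bits of x1 */
    return no;                             /* remaining cases are left unoptimized */
}
```

For example, sign-extending `0x80` from 8 to 16 bits and then zero-extending to 32 bits gives `0xFF80`, which neither a signed nor an unsigned 8-to-32 conversion produces, so that pair is left alone.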
2517 :mlton-guide-page: CommonArg
2522 <:CommonArg:> is an optimization pass for the <:SSA:>
2523 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2527 It optimizes instances of `Goto` transfers that pass the same
2528 arguments to the same label; e.g.
2544 This code can be simplified to:
2560 which saves a number of resources: time of setting up the arguments
2561 for the jump to `L_3`, space (either stack or pseudo-registers) for
2562 the arguments of `L_3`, etc. It may also expose some other
2563 optimizations, if more information is known about `x` or `y`.
2565 == Implementation ==
2567 * <!ViewGitFile(mlton,master,mlton/ssa/common-arg.fun)>
2569 == Details and Notes ==
2571 Three analyses were originally proposed to drive the optimization
2572 transformation. Only the _Dominator Analysis_ is currently
2573 implemented. (Implementations of the other analyses are available in
2574 the <:Sources:repository history>.)
2576 === Syntactic Analysis ===
2578 The simplest analysis I could think of maintains
2580 varInfo: Var.t -> Var.t option list ref
2582 initialized to `[]`.
2584 * For each variable `v` bound in a `Statement.t` or in the
2585 `Function.t` args, then `List.push(varInfo v, NONE)`.
2586 * For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the
2587 formals of `L`, then `List.push(varInfo ai, SOME xi)`.
2588 * For each block argument `a` used in an unknown context (e.g.,
2589 arguments of blocks used as continuations, handlers, arith success,
2590 runtime return, or case switch labels), then
2591 `List.push(varInfo a, NONE)`.
2593 Now, any block argument `a` such that `varInfo a = xs`, where all of
2594 the elements of `xs` are equal to `SOME x`, can be optimized by
2595 setting `a = x` at the beginning of the block and dropping the
2596 argument from `Goto` transfers.
2598 That takes care of the example above. We can clearly do slightly
2599 better, by changing the transformation criteria to the following: any
2600 block argument `a` such that `varInfo a = xs`, where all of the elements
2601 of `xs` are equal to `SOME x` _or_ are equal to `SOME a`, can be
2602 optimized by setting `a = x` at the beginning of the block and
2603 dropping the argument from `Goto` transfers. This optimizes a case
2615 true => L_4 | false => L_5
2622 where a common argument is passed to a loop (and is invariant through
2623 the loop). Of course, the <:LoopInvariant:> optimization pass would
2624 normally introduce a local loop and essentially reduce this to the
2625 first example, but I have seen this in practice, which suggests that
2626 some optimizations after <:LoopInvariant:> do enough simplifications
2627 to introduce (new) loop invariant arguments.
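The refined criterion can be illustrated with a small, self-contained sketch. It is a toy model, not the code in `common-arg.fun`: variables are plain strings, and the record fields `bound`, `gotos`, and `unknown`, as well as the functions `analyze` and `commonArg`, are names assumed here for illustration only.

[source,sml]
----
(* Toy model: variables are strings; this is not MLton's Var.t. *)
type var = string

(* One Goto transfer, paired with the formals of its target label. *)
type goto = {formals: var list, actuals: var list}

(* Build varInfo as an association list.  NONE marks an unknown source;
 * SOME x records that x was passed as the actual for this formal. *)
fun analyze {bound: var list, gotos: goto list, unknown: var list}
   : (var * var option list) list =
   let
      val info: (var * var option list ref) list ref = ref []
      fun varInfo (v: var): var option list ref =
         case List.find (fn (v', _) => v = v') (!info) of
            SOME (_, r) => r
          | NONE =>
               let
                  val r: var option list ref = ref []
               in
                  info := (v, r) :: !info
                  ; r
               end
      fun push (v, x) =
         let val r = varInfo v in r := x :: !r end
      val () = List.app (fn v => push (v, NONE)) bound
      val () =
         List.app
         (fn {formals, actuals} =>
          ListPair.app (fn (a, x) => push (a, SOME x)) (formals, actuals))
         gotos
      val () = List.app (fn a => push (a, NONE)) unknown
   in
      List.map (fn (v, r) => (v, !r)) (!info)
   end

(* Refined criterion: every entry is SOME x for one fixed x, or SOME a. *)
fun commonArg (a: var, xs: var option list): var option =
   let
      fun loop (xs, acc) =
         case xs of
            [] => acc
          | NONE :: _ => NONE
          | SOME x :: xs =>
               if x = a
                  then loop (xs, acc)
               else (case acc of
                        NONE => loop (xs, SOME x)
                      | SOME y => if x = y then loop (xs, acc) else NONE)
   in
      loop (xs, NONE)
   end

(* Loop example: the entry passes x to L_3 (a); the loop passes a back. *)
val info =
   analyze {bound = ["x"],
            gotos = [{formals = ["a"], actuals = ["x"]},
                     {formals = ["a"], actuals = ["a"]}],
            unknown = []}

(* Pairs (a, x) meaning "replace block argument a by x". *)
val replacements =
   List.mapPartial
   (fn (a, xs) => Option.map (fn x => (a, x)) (commonArg (a, xs)))
   info
----

In this sketch `replacements` evaluates to `[("a", "x")]`: the loop-invariant argument `a` can be replaced by `x`, while `x` itself has an unknown source and is left alone.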
2629 === Fixpoint Analysis ===
2631 However, the above analysis and transformation doesn't cover the cases
2632 where eliminating one common argument exposes the opportunity to
2633 eliminate other common arguments. For example:
2651 One pass of analysis and transformation would eliminate the argument
2652 to `L_3` and rewrite the `L_5(a)` transfer to `L_5 (x)`, thereby
2653 exposing the opportunity to eliminate the common argument to `L_5`.
2655 The interdependency between the arguments to `L_3` and `L_5` suggests
2656 performing some sort of fixed-point analysis. This analysis is
2657 relatively simple; maintain
2659 varInfo: Var.t -> VarLattice.t
2663 VarLattice.t ~=~ Bot | Point of Var.t | Top
2665 (but is implemented by the <:FlatLattice:> functor with a `lessThan`
2666 list and `value ref` under the hood), initialized to `Bot`.
2668 * For each variable `v` bound in a `Statement.t` or in the
2669 `Function.t` args, then `VarLattice.<= (Point v, varInfo v)`.
2670 * For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the
2671 formals of `L`, then `VarLattice.<= (varInfo xi, varInfo ai)`.
2672 * For each block argument `a` used in an unknown context, then
2673 `VarLattice.<= (Point a, varInfo a)`.
2675 Now, any block argument `a` such that `varInfo a = Point x` can be
2676 optimized by setting `a = x` at the beginning of the block and
2677 dropping the argument from `Goto` transfers.
2679 Now, with the last example, we introduce the ordering constraints:
2681 varInfo x <= varInfo a
2682 varInfo a <= varInfo b
2683 varInfo x <= varInfo b
2686 Assuming that `varInfo x = Point x`, then we get `varInfo a = Point x`
2687 and `varInfo b = Point x`, and we optimize the example as desired.
2689 But, that is a rather weak assumption. It's quite possible for
2690 `varInfo x = Top`. For example, consider:
2716 Now `varInfo x = varInfo a = varInfo b = Top`. What went wrong here?
2717 When `varInfo x` went to `Top`, it got propagated all the way through
2718 to `a` and `b`, and prevented the elimination of any common arguments.
2719 What we'd like to do instead is when `varInfo x` goes to `Top`,
2720 propagate on `Point x` -- we have no hope of eliminating `x`, but if
2721 we hold `x` constant, then we have a chance of eliminating arguments
2722 for which `x` is passed as an actual.
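Both outcomes described in this section can be reproduced with a naive solver. The following is only a sketch: it uses an association list in place of the <:FlatLattice:> functor, variables are strings, and the names `lat`, `join`, `lookup`, `raiseTo`, and `solve` are assumptions made for this illustration. The second scenario (a variable `x` that is itself a block argument fed by two distinct variables `y` and `z`) is likewise an assumed example.

[source,sml]
----
(* Flat lattice over variable names (strings stand in for Var.t). *)
datatype lat = Bot | Point of string | Top

fun join (Bot, l) = l
  | join (l, Bot) = l
  | join (Point x, Point y) = if x = y then Point x else Top
  | join _ = Top

type env = (string * lat) list

fun lookup (env: env, v: string): lat =
   case List.find (fn (v', _) => v = v') env of
      SOME (_, l) => l
    | NONE => Bot

(* Enforce  l <= varInfo v  by raising varInfo v to the join. *)
fun raiseTo (env: env, v: string, l: lat): env =
   (v, join (lookup (env, v), l))
   :: List.filter (fn (v', _) => v <> v') env

(* points: variables v with  Point v <= varInfo v
 * edges:  pairs (x, a) with  varInfo x <= varInfo a *)
fun solve {points: string list, edges: (string * string) list}: env =
   let
      val init =
         List.foldl (fn (v, env) => raiseTo (env, v, Point v)) [] points
      fun step env =
         List.foldl
         (fn ((x, a), env) => raiseTo (env, a, lookup (env, x)))
         env edges
      fun stable (e1, e2) =
         List.all (fn (_, a) => lookup (e1, a) = lookup (e2, a)) edges
      fun fix env =
         let
            val env' = step env
         in
            if stable (env, env') then env' else fix env'
         end
   in
      fix init
   end

val edges = [("x", "a"), ("a", "b"), ("x", "b")]

(* x is bound by a statement: a and b both resolve to Point x. *)
val good = solve {points = ["x"], edges = edges}

(* x is a block argument fed by y and z: Top reaches a and b. *)
val bad =
   solve {points = ["y", "z"],
          edges = [("y", "x"), ("z", "x")] @ edges}
----

With `good`, both `a` and `b` map to `Point "x"`. With `bad`, `x` goes to `Top` and drags `a` and `b` along with it, even though both are still only ever passed `x`.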
2724 === Dominator Analysis ===
2726 Does anyone see where this is going yet? Pausing for a little
2727 thought, <:MatthewFluet:> realized that he had once before tried
2728 proposing this kind of "fix" to a fixed-point analysis -- when we were
2729 first investigating the <:Contify:> optimization in light of John
2730 Reppy's CWS paper. Of course, that "fix" failed because it defined a
2731 non-monotonic function and one couldn't take the fixed point. But,
2732 <:StephenWeeks:> suggested a dominator based approach, and we were
2733 able to show that, indeed, the dominator analysis subsumed both the
2734 previous call based analysis and the cont based analysis. And, a
2735 moment's reflection reveals further parallels: when
2736 `varInfo: Var.t -> Var.t option list ref`, we have something analogous
2737 to the call analysis, and when `varInfo: Var.t -> VarLattice.t`, we
2738 have something analogous to the cont analysis. Maybe there is
2739 something analogous to the dominator approach (and therefore superior
2740 to the previous analyses).
2742 And this turns out to be the case. Construct the graph `G` as follows:
2744 nodes(G) = {Root} U Var.t
2745 edges(G) = {Root -> v | v bound in a Statement.t or
2746 in the Function.t args} U
2747 {xi -> ai | L(x1, ..., xn) transfer where (a1, ..., an)
2748 are the formals of L} U
2749 {Root -> a | a is a block argument used in an unknown context}
2752 Let `idom(x)` be the immediate dominator of `x` in `G` with root
2753 `Root`. Now, any block argument `a` such that `idom(a) = x <> Root` can
2754 be optimized by setting `a = x` at the beginning of the block and
2755 dropping the argument from `Goto` transfers.
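To make the construction concrete, here is a small sketch that computes dominator sets by naive iteration and then reads off `idom`. It is not how `common-arg.fun` computes dominators; nodes are strings, and the names `dominators`, `idom`, `get`, `mem`, and `inter` are assumed for this illustration. The example graph models a block argument `x` that receives two distinct variables `y` and `z`, and block arguments `a` and `b` that only ever receive `x` (directly, or via `a`).

[source,sml]
----
type node = string

fun mem (x: node, xs: node list): bool = List.exists (fn y => x = y) xs
fun inter (xs, ys) = List.filter (fn x => mem (x, ys)) xs

type doms = (node * node list) list

fun get (env: doms, n: node): node list =
   case List.find (fn (m, _) => m = n) env of
      SOME (_, ds) => ds
    | NONE => []

(* Iterate  dom(n) = {n} U (intersection of dom(p) over predecessors p)
 * from the full node set down to the greatest fixed point. *)
fun dominators (root: node, nodes: node list, edges: (node * node) list)
   : doms =
   let
      fun preds n =
         List.mapPartial (fn (s, d) => if d = n then SOME s else NONE) edges
      fun step (env: doms): doms =
         List.map
         (fn (n, ds) =>
          if n = root
             then (n, ds)
          else
             let
                val common =
                   List.foldl (fn (p, acc) => inter (acc, get (env, p)))
                   nodes (preds n)
             in
                (n, n :: List.filter (fn d => d <> n) common)
             end)
         env
      fun fix env =
         let
            val env' = step env
         in
            if env = env' then env' else fix env'
         end
   in
      fix (List.map (fn n => (n, if n = root then [root] else nodes)) nodes)
   end

(* idom(n) is the strict dominator of n that is itself dominated by
 * every strict dominator of n. *)
fun idom (env: doms, n: node): node option =
   let
      val strict = List.filter (fn d => d <> n) (get (env, n))
   in
      List.find
      (fn d => List.all (fn d' => mem (d', get (env, d))) strict)
      strict
   end

val nodes = ["Root", "y", "z", "x", "a", "b"]
val edges =
   [("Root", "y"), ("Root", "z"), ("y", "x"), ("z", "x"),
    ("x", "a"), ("a", "b"), ("x", "b")]
val env = dominators ("Root", nodes, edges)
----

Here `idom (env, "x")` is `SOME "Root"`, so `x` must stay an argument, while `idom (env, "a")` and `idom (env, "b")` are both `SOME "x"`, so `a` and `b` can be replaced by `x`. This is the case in which the fixpoint analysis gives up because `varInfo x` goes to `Top`.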
2757 Furthermore, experimental evidence suggests (and we are confident that
2758 a formal presentation could prove) that the dominator analysis
2759 subsumes the "syntactic" and "fixpoint" based analyses in this context
2760 as well and that the dominator analysis gets "everything" in one go.
2762 === Final Thoughts ===
2764 I must admit, I was rather surprised at this progression and final
2765 result. At the outset, I never would have thought of a connection
2766 between <:Contify:> and <:CommonArg:> optimizations. They would seem
2767 to be two completely different optimizations. Although, this may not
2768 really be the case. As one of the reviewers of the ICFP paper said:
2770 I understand that such a form of CPS might be convenient in some
2771 cases, but when we're talking about analyzing code to detect that some
2772 continuation is constant, I think it makes a lot more sense to make
2773 all the continuation arguments completely explicit.
2775 I believe that making all the continuation arguments explicit will
2776 show that the optimization can be generalized to eliminating constant
2777 arguments, whether continuations or not.
2780 What I think the common argument optimization shows is that the
2781 dominator analysis does slightly better than the reviewer puts it: we
2782 find more than just constant continuations, we find common
2783 continuations. And I think this is further justified by the fact that
2784 I have observed common argument eliminate some `env_X` arguments which
2785 would appear to correspond to determining that while the closure being
2786 executed isn't constant it is at least the same as the closure being
2789 At first, I was curious whether or not we had missed a bigger picture
2790 with the dominator analysis. When we wrote the contification paper, I
2791 assumed that the dominator analysis was a specialized solution to a
2792 specialized problem; we never suggested that it was a technique suited
2793 to a larger class of analyses. After initially finding a connection
2794 between <:Contify:> and <:CommonArg:> (and thinking that the only
2795 connection was the technique), I wondered if the dominator technique
2796 really was applicable to a larger class of analyses. That is still a
2797 question, but after writing up the above, I'm suspecting that the
2798 "real story" is that the dominator analysis is a solution to the
2799 common argument optimization, and that the <:Contify:> optimization is
2800 specializing <:CommonArg:> to the case of continuation arguments (with
2801 a different transformation at the end). (Note, a whole-program,
2802 inter-procedural common argument analysis doesn't really make sense
2803 (in our <:SSA:> <:IntermediateLanguage:>), because the only way of
2804 passing values between functions is as arguments. (Unless of course
2805 in the case that the common argument is also a constant argument, in
2806 which case <:ConstantPropagation:> could lift it to a global.) The
2807 inter-procedural <:Contify:> optimization works out because there we
2808 move the function to the argument.)
2810 Anyways, it's still unclear to me whether or not the dominator based
2811 approach solves other kinds of problems.
2813 === Phase Ordering ===
2815 On the downside, the optimization doesn't have a huge impact on
2816 runtime, although it does predictably save some code size. I stuck
2817 it in the optimization sequence after <:Flatten:> and (the third round
2818 of) <:LocalFlatten:>, since it seems to me that we could have cases
2819 where some components of a tuple used as an argument are common, but
2820 the whole tuple isn't. I think it makes sense to add it after
2821 <:IntroduceLoops:> and <:LoopInvariant:> (even though <:CommonArg:>
2822 gets some things that <:LoopInvariant:> gets, it doesn't get all of
2823 them). I also think that it makes sense to add it before
2824 <:CommonSubexp:>, since identifying variables could expose more common
2825 subexpressions. I would think a similar thought applies to
2830 :mlton-guide-page: CommonBlock
2835 <:CommonBlock:> is an optimization pass for the <:SSA:>
2836 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2840 It eliminates equivalent blocks in a <:SSA:> function. The
2841 equivalence criterion requires blocks to have no arguments or
2842 statements and transfer via `Raise`, `Return`, or `Goto` of a single
2845 == Implementation ==
2847 * <!ViewGitFile(mlton,master,mlton/ssa/common-block.fun)>
2849 == Details and Notes ==
2872 to the <:SSA:> function.
2895 to the <:SSA:> function.
2918 to the <:SSA:> function.
2920 The <:Shrink:> pass rewrites all uses of `L_X` to `L_Y'` and drops `L_X`.
2922 For example, all uncaught `Overflow` exceptions in a <:SSA:> function
2923 share the same raising block.
2927 :mlton-guide-page: CommonSubexp
2932 <:CommonSubexp:> is an optimization pass for the <:SSA:>
2933 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2937 It eliminates instances of common subexpressions.
2939 == Implementation ==
2941 * <!ViewGitFile(mlton,master,mlton/ssa/common-subexp.fun)>
2943 == Details and Notes ==
2945 In addition to getting the usual sorts of things like
2950 (w + 0wx1) + (w + 0wx1)
2956 let val w' = w + 0wx1 in w' + w' end
2959 it also gets things like
2964 val a = Array_uninit n
2965 val b = Array_length a
2971 val a = Array_uninit n
2975 `Arith` transfers are handled specially. The _result_ of an `Arith`
2976 transfer can be used in _common_ `Arith` transfers that it dominates:
2981 val l = (n + m) + (n + m)
2983 val k = (l + n) + ((l + m) handle Overflow => ((l + m)
2984 handle Overflow => l + n))
2987 is rewritten so that `(n + m)` is computed exactly once, as are
2988 `(l + n)` and `(l + m)`.
2992 :mlton-guide-page: CompilationManager
2993 [[CompilationManager]]
2997 The http://www.smlnj.org/doc/CM/index.html[Compilation Manager] (CM) is SML/NJ's mechanism for supporting programming-in-the-very-large.
2999 == Porting SML/NJ CM files to MLton ==
3001 To help in porting CM files to MLton, the MLton source distribution
3002 includes the sources for a utility, `cm2mlb`, that will print an
3003 <:MLBasis: ML Basis> file with essentially the same semantics as the
3004 CM file -- handling the full syntax of CM supported by your installed
3005 SML/NJ version and correctly handling export filters. When `cm2mlb`
3006 encounters a `.cm` import, it attempts to convert it to a
3007 corresponding `.mlb` import. CM anchored paths are translated to
3008 paths according to a default configuration file
3009 (<!ViewGitFile(mlton,master,util/cm2mlb/cm2mlb-map)>). For example,
3010 the default configuration includes
3012 # Standard ML Basis Library
3013 $SMLNJ-BASIS $(SML_LIB)/basis
3014 $basis.cm $(SML_LIB)/basis
3015 $basis.cm/basis.cm $(SML_LIB)/basis/basis.mlb
3017 to ensure that a `$/basis.cm` import is translated to a
3018 `$(SML_LIB)/basis/basis.mlb` import. See `util/cm2mlb` for details.
3019 Building `cm2mlb` requires that you have already installed a recent
3024 :mlton-guide-page: CompilerOverview
3025 [[CompilerOverview]]
3029 The following table shows the overall structure of the compiler.
3030 <:IntermediateLanguage:>s are shown in the center column. The names
3031 of compiler passes are listed in the left and right columns.
3033 [align="center",width="50%",cols="^,^,^"]
3035 3+^| *Compiler Overview*
3036 | _Translation Passes_ | _<:IntermediateLanguage:>_ | _Optimization Passes_
3041 | | <:CoreML:> | <:CoreMLSimplify:>
3042 | <:Defunctorize:> | |
3043 | | <:XML:> | <:XMLSimplify:>
3044 | <:Monomorphise:> | |
3045 | | <:SXML:> | <:SXMLSimplify:>
3046 | <:ClosureConvert:> | |
3047 | | <:SSA:> | <:SSASimplify:>
3049 | | <:SSA2:> | <:SSA2Simplify:>
3051 | | <:RSSA:> | <:RSSASimplify:>
3057 The `Compile` functor (<!ViewGitFile(mlton,master,mlton/main/compile.sig)>,
3058 <!ViewGitFile(mlton,master,mlton/main/compile.fun)>), controls the
3059 high-level view of the compiler passes, from <:FrontEnd:> to code
3064 :mlton-guide-page: CompilerPassTemplate
3065 [[CompilerPassTemplate]]
3066 CompilerPassTemplate
3067 ====================
3069 An analysis pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>.
3070 An implementation pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>.
3071 An optimization pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>.
3072 A rewrite pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>.
3073 A translation pass from the <:ZZA:> <:IntermediateLanguage:> to the <:ZZB:> <:IntermediateLanguage:>.
3077 A short description of the pass.
3079 == Implementation ==
3081 * <!ViewGitFile(mlton,master,mlton/ZZZ.fun)>
3083 == Details and Notes ==
3085 Relevant details and notes.
3089 :mlton-guide-page: CompileTimeOptions
3090 [[CompileTimeOptions]]
3094 MLton's compile-time options control the name of the output file, the
3095 verbosity of compile-time messages, and whether or not certain
3096 optimizations are performed. They also can specify which intermediate
3097 files are saved and can stop the compilation process early, at some
3098 intermediate pass, in which case compilation can be resumed by passing
3099 the generated files to MLton. MLton uses the input file suffix to
3100 determine the type of input program. The possibilities are `.c`,
3101 `.mlb`, `.o`, `.s`, and `.sml`.
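Some of the options documented on this page change behaviour that a program can observe directly, which makes them easy to experiment with. The following sketch (the file name `options.sml` and the function `countdown` are assumptions) prints the width of the top-level `word` and `int` types, which `-default-type` controls, and the recorded history of a raised exception, which should be empty unless the program is compiled with `-const 'Exn.keepHistory true'`.

[source,sml]
----
(* options.sml *)
exception Oops

fun countdown (n: int): int =
   if n = 0 then raise Oops else 1 + countdown (n - 1)

(* Affected by, e.g., -default-type word64 *)
val () = print (concat ["word size: ", Int.toString Word.wordSize, "\n"])

(* Affected by, e.g., -default-type intinf *)
val precision: string =
   case Int.precision of
      NONE => "unbounded"
    | SOME p => Int.toString p
val () = print (concat ["int precision: ", precision, "\n"])

(* Affected by -const 'Exn.keepHistory true' *)
val () =
   ignore (countdown 3)
   handle e =>
      List.app
      (fn s => print (concat ["  ", s, "\n"]))
      (MLton.Exn.history e)
----

Compiling the same file once with plain `mlton options.sml` and once with, for example, `mlton -default-type word64 -const 'Exn.keepHistory true' options.sml`, and comparing the output of the two executables, is a quick way to check what a given option does on a particular installation.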
3103 With no arguments, MLton prints the version number and exits. For a
3104 usage message, run MLton with an invalid switch, e.g. `mlton -z`. In
3105 the explanation below and in the usage message, for flags that take a
3106 number of choices (e.g. `{true|false}`), the first value listed is the
3114 Aligns objects in memory by the specified alignment (+4+ or +8+).
3115 The default varies depending on architecture.
3117 * ++-as-opt __option__++
3119 Pass _option_ to `gcc` when compiling assembler code. If you wish to
3120 pass an option to the assembler, you must use `gcc`'s `-Wa,` syntax.
3122 * ++-cc-opt __option__++
3124 Pass _option_ to `gcc` when compiling C code.
3126 * ++-codegen {native|amd64|c|llvm|x86}++
3128 Generate native object code via amd64 assembly, C code, LLVM code, or
3129 x86 code. With `-codegen native` (`-codegen amd64` or
3130 `-codegen x86`), MLton typically compiles more quickly and generates
3133 * ++-const __name__ __value__++
3135 Set the value of a compile-time constant. Here is a list of
3136 available constants, their default values, and what they control.
3138 ** ++Exn.keepHistory {false|true}++
3140 Enable `MLton.Exn.history`. See <:MLtonExn:> for details. There is a
3141 performance cost to setting this to `true`, both in memory usage of
3142 exceptions and in run time, because of additional work that must be
3143 performed at each exception construction, raise, and handle.
3145 * ++-default-ann __ann__++
3147 Specify default <:MLBasisAnnotations:ML Basis annotations>. For
3148 example, `-default-ann 'warnUnused true'` causes unused variable
3149 warnings to be enabled by default. A default is overridden by the
3150 corresponding annotation in an ML Basis file.
3152 * ++-default-type __type__++
3154 Specify the default binding for a primitive type. For example,
3155 `-default-type word64` causes the top-level type `word` and the
3156 top-level structure `Word` in the <:BasisLibrary:Basis Library> to be
3157 equal to `Word64.word` and `Word64:WORD`, respectively. Similarly,
3158 `-default-type intinf` causes the top-level type `int` and the
3159 top-level structure `Int` in the <:BasisLibrary:Basis Library> to be
3160 equal to `IntInf.int` and `IntInf:INTEGER`, respectively.
3162 * ++-disable-ann __ann__++
3164 Ignore the specified <:MLBasisAnnotations:ML Basis annotation> in
3165 every ML Basis file. For example, to see _all_ match and unused
3166 warnings, compile with
3169 -default-ann 'warnUnused true'
3170 -disable-ann forceUsed
3171 -disable-ann nonexhaustiveMatch
3172 -disable-ann redundantMatch
3173 -disable-ann warnUnused
3176 * ++-export-header __file__++
3178 Write C prototypes to _file_ for all of the functions in the program
3179 <:CallingFromCToSML:exported from SML to C>.
3181 * ++-ieee-fp {false|true}++
3183 Cause the x86 native code generator to be pedantic about following the
3184 IEEE floating point standard. By default, it is not, because of the
3185 performance cost. This only has an effect with `-codegen x86`.
3189 Set the inlining threshold used in the optimizer. The threshold is an
3190 approximate measure of code size of a procedure. The default is
3195 Save intermediate files. If no `-keep` argument is given, then only
3196 the output file is saved.
3200 | `g` | generated `.c` and `.s` files passed to `gcc` and generated `.ll` files passed to `llvm-as`
3201 | `o` | object (`.o`) files
3204 * ++-link-opt __option__++
3206 Pass _option_ to `gcc` when linking. You can use this to specify
3207 library search paths, e.g. `-link-opt -Lpath`, and libraries to link
3208 with, e.g., `-link-opt -lfoo`, or even both at the same time,
3209 e.g. `-link-opt '-Lpath -lfoo'`. If you wish to pass an option to the
3210 linker, you must use `gcc`'s `-Wl,` syntax, e.g.,
3211 `-link-opt '-Wl,--export-dynamic'`.
3213 * ++-llvm-as-opt __option__++
3215 Pass _option_ to `llvm-as` when assembling (`.ll` to `.bc`) LLVM code.
3217 * ++-llvm-llc-opt __option__++
3219 Pass _option_ to `llc` when compiling (`.bc` to `.o`) LLVM code.
3221 * ++-llvm-opt-opt __option__++
3223 Pass _option_ to `opt` when optimizing (`.bc` to `.bc`) LLVM code.
3225 * ++-mlb-path-map __file__++
3227 Use _file_ as an <:MLBasisPathMap:ML Basis path map> to define
3228 additional MLB path variables. Multiple uses of `-mlb-path-map` and
3229 `-mlb-path-var` are allowed, with variable definitions in later path
3230 maps taking precedence over earlier ones.
3232 * ++-mlb-path-var __name__ __value__++
3234 Define an additional MLB path variable. Multiple uses of
3235 `-mlb-path-map` and `-mlb-path-var` are allowed, with variable
3236 definitions in later path maps taking precedence over earlier ones.
3238 * ++-output __file__++
3240 Specify the name of the final output file. The default name is the
3241 input file name with its suffix removed and an appropriate, possibly
3242 empty, suffix added.
3244 * ++-profile {no|alloc|count|time}++
3246 Produce an executable that gathers <:Profiling: profiling> data. When
3247 such an executable is run, it produces an `mlmon.out` file.
3249 * ++-profile-branch {false|true}++
3251 If `true`, the profiler will separately gather profiling data for each
3252 branch of a function definition, `case` expression, and `if`
3255 * ++-profile-stack {false|true}++
3257 If `true`, the executable will gather profiling data for all functions
3258 on the stack, not just the currently executing function. See
3259 <:ProfilingTheStack:>.
3261 * ++-profile-val {false|true}++
3263 If `true`, the profiler will separately gather profiling data for each
3264 (expansive) `val` declaration.
3266 * ++-runtime __arg__++
3268 Pass _arg_ to the runtime system via `@MLton`. See
3269 <:RunTimeOptions:>. The argument will be processed before other
3270 `@MLton` command line switches. Multiple uses of `-runtime` are
3271 allowed, and will pass all the arguments in order. If the same
3272 runtime switch occurs more than once, then the last setting will take
3273 effect. There is no need to supply the leading `@MLton` or the
3274 trailing `--`; these will be supplied automatically.
3276 An argument to `-runtime` may contain spaces, which will cause the
3277 argument to be treated as a sequence of words by the runtime. For
3278 example, the command line:
3281 mlton -runtime 'ram-slop 0.4' foo.sml
3284 will cause `foo` to run as if it had been called like:
3287 foo @MLton ram-slop 0.4 --
3290 An executable created with `-runtime stop` doesn't process any
3291 `@MLton` arguments. This is useful to create an executable, e.g.,
3292 `echo`, that must treat `@MLton` like any other command-line argument.
3295 % mlton -runtime stop echo.sml
3300 * ++-show-basis __file__++
3302 Pretty print to _file_ the basis defined by the input program. See
3305 * ++-show-def-use __file__++
3307 Output def-use information to _file_. Each identifier that is defined
3308 appears on a line, followed on subsequent lines by the position of
3311 * ++-stop {f|g|o|tc}++
3313 Specify when to stop.
3317 | `f` | list of files on stdout (only makes sense when input is `foo.mlb`)
3318 | `g` | generated `.c` and `.s` files
3319 | `o` | object (`.o`) files
3320 | `tc` | after type checking
3323 If you compile with `-stop g` or `-stop o`, you can resume compilation
3324 by running MLton on the generated `.c` and `.s` or `.o` files.
3326 * ++-target {self|__...__}++
3328 Generate an executable that runs on the specified platform. The
3329 default is `self`, which means to compile for the machine that MLton
3330 is running on. To use any other target, you must first install a
3331 <:CrossCompiling: cross compiler>.
3333 * ++-target-as-opt __target__ __option__++
3335 Like `-as-opt`, this passes _option_ to `gcc` when compiling
3336 assembler code, except it only passes _option_ when the target
3337 architecture, operating system, or arch-os pair is _target_.
3339 * ++-target-cc-opt __target__ __option__++
3341 Like `-cc-opt`, this passes _option_ to `gcc` when compiling C code,
3342 except it only passes _option_ when the target architecture, operating
3343 system, or arch-os pair is _target_.
3345 * ++-target-link-opt __target__ __option__++
3347 Like `-link-opt`, this passes _option_ to `gcc` when linking, except
3348 it only passes _option_ when the target architecture, operating
3349 system, or arch-os pair is _target_.
3351 * ++-verbose {0|1|2|3}++
3353 How verbose to be about what passes are running. The default is `0`.
3358 | `1` | calls to compiler, assembler, and linker
3359 | `2` | 1, plus intermediate compiler passes
3360 | `3` | 2, plus some data structure sizes
3365 :mlton-guide-page: CompilingWithSMLNJ
3366 [[CompilingWithSMLNJ]]
3370 You can compile MLton with <:SMLNJ:SML/NJ>; however, the resulting
3371 compiler will run much more slowly than MLton compiled by itself. We
3372 don't recommend using SML/NJ as a means of
3373 <:PortingMLton:porting MLton> to a new platform or bootstrapping on a
3376 If you do want to build MLton with SML/NJ, it is best to have a binary
3377 MLton package installed. If you don't, here are some issues you may
3378 encounter when you run `make smlnj-mlton`.
3380 You will get (many copies of) the error messages:
3383 /bin/sh: mlton: command not found
3389 make[2]: mlton: Command not found
3392 The `Makefile` calls `mlton` to determine dependencies, and can
3393 proceed in spite of this error.
3395 If you don't have an `mllex` executable, you will get the error
3399 mllex: Command not found
3402 Building MLton requires `mllex` and `mlyacc` executables, which are
3403 distributed with a binary package of MLton. The easiest solution is
3404 to copy the front-end lexer/parser files from a different machine
3405 (`ml.grm.sml`, `ml.grm.sig`, `ml.lex.sml`, `mlb.grm.sig`,
3410 :mlton-guide-page: ConcurrentML
3415 http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
3416 library based on synchronous message passing. MLton has an initial
3417 port of CML from SML/NJ, but is missing a thread-safe wrapper around
3418 the Basis Library and event-based equivalents to `IO` and `OS`
3421 All of the core CML functionality is present.
3426 structure SyncVar: SYNC_VAR
3427 structure Mailbox: MAILBOX
3428 structure Multicast: MULTICAST
3429 structure SimpleRPC: SIMPLE_RPC
3430 structure RunCML: RUN_CML
3433 The `RUN_CML` signature is minimal.
3439 val isRunning: unit -> bool
3440 val doit: (unit -> unit) * Time.time option -> OS.Process.status
3441 val shutdown: OS.Process.status -> 'a
3445 MLton's `RunCML` structure does not include all of the cleanup and
3446 logging operations of SML/NJ's `RunCML` structure. However, the
3447 implementation does include the `CML.timeOutEvt` and `CML.atTimeEvt`
3448 functions, and a preemptive scheduler that knows to sleep when there
3449 are no ready threads and some threads blocked on time events.
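As a minimal sketch of how these pieces fit together, the program below starts the scheduler with `RunCML.doit`, spawns one thread, passes a value over a channel, and ends with `RunCML.shutdown`. The file name `ping.sml` and the function `main` are assumptions; the file is expected to be listed in an `.mlb` file after `$(SML_LIB)/basis/basis.mlb` and `$(SML_LIB)/cml/cml.mlb`.

[source,sml]
----
(* ping.sml *)
fun main (): unit =
   let
      (* A synchronous channel carrying integers. *)
      val ch: int CML.chan = CML.channel ()
      (* The spawned thread blocks in send until main performs recv. *)
      val _ = CML.spawn (fn () => CML.send (ch, 42))
      val n = CML.recv ch
      val () = print (concat ["received ", Int.toString n, "\n"])
   in
      (* End the CML run; doit returns the status given here. *)
      RunCML.shutdown OS.Process.success
   end

(* NONE leaves the choice of time quantum to the implementation. *)
val status: OS.Process.status = RunCML.doit (main, NONE)

val () =
   if OS.Process.isSuccess status
      then ()
   else print "CML run did not finish successfully\n"
----

Calling `RunCML.shutdown` explicitly at the end of `main` is a conservative choice in this sketch; it avoids relying on what the scheduler does when the initial thread simply returns.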
3451 Because MLton does not wrap the Basis Library for CML, the "right" way
3452 to call a Basis Library function that is stateful is to wrap the call
3453 with `MLton.Thread.atomically`.
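For instance, a printing helper along these lines could be shared by several CML threads. This is only a sketch; `atomicPrint` is a hypothetical name and not part of the CML library.

[source,sml]
----
(* Run a stateful Basis Library call inside a critical section, so that
 * no thread switch can occur while it is in progress. *)
fun atomicPrint (s: string): unit =
   MLton.Thread.atomically (fn () => TextIO.print s)
----

Keeping the wrapped call short matters here, since other threads cannot be scheduled until the critical section is left.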
3457 * You can import the CML Library into an MLB file with:
3461 |MLB file|Description
3462 |`$(SML_LIB)/cml/cml.mlb`|
3465 * If you are porting a project from SML/NJ's <:CompilationManager:> to
3466 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
3467 following map is included by default:
3472 $cml/cml.cm $(SML_LIB)/cml/cml.mlb
3475 This will automatically convert a `$cml/cml.cm` import in an input `.cm` file into a `$(SML_LIB)/cml/cml.mlb` import in the output `.mlb` file.
3479 * <:ConcurrentMLImplementation:>
3484 :mlton-guide-page: ConcurrentMLImplementation
3485 [[ConcurrentMLImplementation]]
3486 ConcurrentMLImplementation
3487 ==========================
3489 Here are some notes on MLton's implementation of <:ConcurrentML:>.
3491 Concurrent ML was originally implemented for SML/NJ. It was ported to
3492 MLton in the summer of 2004. The main difference between the
3493 implementations is that SML/NJ uses continuations to implement CML
3494 threads, while MLton uses its underlying <:MLtonThread:thread>
3495 package. Presently, MLton's threads are a little more heavyweight
3496 than SML/NJ's continuations, but it's pretty clear that there is some
3497 fat there that could be trimmed.
3499 The implementation of CML in SML/NJ is built upon the first-class
3500 continuations of the `SMLofNJ.Cont` module.
3504 val callcc: ('a cont -> 'a) -> 'a
3505 val isolate: ('a -> unit) -> 'a cont
3506 val throw: 'a cont -> 'a -> 'b
3509 The implementation of CML in MLton is built upon the first-class
3510 threads of the <:MLtonThread:> module.
3514 val new: ('a -> unit) -> 'a t
3515 val prepare: 'a t * 'a -> Runnable.t
3516 val switch: ('a t -> Runnable.t) -> 'a
3519 The port is relatively straightforward, because CML always throws to a
3520 continuation at most once. Hence, an "abstract" implementation of
3521 CML could be built upon first-class one-shot continuations, which map
3522 equally well to SML/NJ's continuations and MLton's threads.
3524 The "essence" of the port is to transform:
3526 callcc (fn k => ... throw k' v')
3530 switch (fn t => ... prepare (t', v'))
3532 which suffices for the vast majority of the CML implementation.
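The correspondence can be seen in a minimal example that is not taken from the CML sources. In SML/NJ terms, `callcc (fn k => throw k 42)` captures the current continuation and immediately resumes it with `42`. A sketch of the same one-shot pattern with `MLton.Thread` (the abbreviation `T` and the binding `x` are assumptions) captures the current thread and hands control to a fresh thread whose only job is to switch back:

[source,sml]
----
structure T = MLton.Thread

val x: int =
   T.switch
   (fn (t: int T.t) =>
    (* t is the current thread, paused and waiting for an int. *)
    T.prepare
    (T.new (fn () =>
            (* Resume t with 42; this helper thread is never resumed. *)
            T.switch (fn _ => T.prepare (t, 42))),
     ()))

val () = print (concat [Int.toString x, "\n"])
----

Each `switch` here is performed by a different thread, and neither function passed to `switch` performs a `switch` itself, which is the property the naive translation loses in the case of blocking on multiple base events.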
3534 There was only one complicated transformation: blocking multiple base
3535 events. In SML/NJ CML, the representation of base events is given by:
3538 datatype 'a event_status
3539 = ENABLED of {prio: int, doFn: unit -> 'a}
3541 transId: trans_id ref,
3542 cleanUp: unit -> unit,
3545 type 'a base_evt = unit -> 'a event_status
3548 When synchronizing on a set of base events, which are all blocked, we
3549 must invoke each `BLOCKED` function with the same `transId` and
3550 `cleanUp` (the `transId` is (checked and) set to `CANCEL` by the
3551 `cleanUp` function, which is invoked by the first enabled event; this
3552 "fizzles" every other event in the synchronization group that later
3553 becomes enabled). However, each `BLOCKED` function is implemented by
3554 a callcc, so that when the event is enabled, it throws back to the
3555 point of synchronization. Hence, the `next` function (which doesn't
3556 return) is invoked by the `BLOCKED` function to escape the callcc and
3557 continue in the thread performing the synchronization. In SML/NJ this
3558 is implemented as follows:
3561 fun ext ([], blockFns) = callcc (fn k => let
3563 val (transId, setFlg) = mkFlg()
3564 fun log [] = S.atomicDispatch ()
3565 | log (blockFn:: r) =
3569 next = fn () => log r
3572 log blockFns; error "[log]"
3575 (Note that `S.atomicDispatch` invokes the continuation of the next
3576 continuation on the ready queue.) This doesn't map well to the MLton
3577 thread model. Although it follows the
3579 callcc (fn k => ... throw k v)
3581 model, the fact that `blockFn` will also attempt to do
3583 callcc (fn k' => ... next ())
3585 means that the naive transformation will result in nested `switch`-es.
3587 We need to think a little more about what this code is trying to do.
3588 Essentially, each `blockFn` wants to capture this continuation, hold
3589 on to it until the event is enabled, and continue with `next`; when the
3590 event is enabled, before invoking the continuation and returning to
3591 the synchronization point, the `cleanUp` and other event specific
3592 operations are performed.
3594 To accomplish the same effect in the MLton thread implementation, we
3598 datatype 'a status =
3599 ENABLED of {prio: int, doitFn: unit -> 'a}
3600 | BLOCKED of {transId: trans_id,
3601 cleanUp: unit -> unit,
3602 next: unit -> rdy_thread} -> 'a
3604 type 'a base = unit -> 'a status
3606 fun ext ([], blockFns): 'a =
3608 (fn (t: 'a S.thread) =>
3610 val (transId, cleanUp) = TransID.mkFlg ()
3611 fun log blockFns: S.rdy_thread =
3614 | blockFn::blockFns =>
3618 val () = S.atomicBegin ()
3619 val x = blockFn {transId = transId,
3621 next = fn () => log blockFns}
3622 in S.switch(fn _ => S.prepVal (t, x))
3629 To avoid the nested `switch`-es, I run the `blockFn` in its own
3630 thread, whose only purpose is to return to the synchronization point.
3631 This corresponds to the `throw (blockFn {...})` in the SML/NJ
3632 implementation. I'm worried that this implementation might be a
3633 little expensive, starting a new thread for each blocked event (though
3634 only when there are multiple blocked events in a synchronization group).
3635 But, I don't see another way of implementing this behavior in the
3638 Note that another way of thinking about what is going on is to
3639 consider each `blockFn` as prepending a different set of actions to
3640 the thread `t`. It might be possible to give a
3641 `MLton.Thread.unsafePrepend`.
3644 fun unsafePrepend (T r: 'a t, f: 'b -> 'a): 'b t =
3648 Dead => raise Fail "prepend to a Dead thread"
3649 | New g => New (g o f)
3650 | Paused (g, t) => Paused (fn h => g (f o h), t)
3655 I have commented out the `r := Dead`, which would allow multiple
3656 prepends to the same thread (i.e., not destroying the original thread
3657 in the process). Of course, only one of the threads could be run: if
3658 the original thread were in the `Paused` state, then multiple threads
3659 would share the underlying runtime/primitive thread. Now, this
3660 matches the "one-shot" nature of CML continuations/threads, but I'm
3661 not comfortable with extending `MLton.Thread` with such an unsafe
3664 Other than this complication with blocking multiple base events, the
3665 port was quite routine. (As a very pleasant surprise, the CML
3666 implementation in SML/NJ doesn't use any SML/NJ-isms.) There is a
3667 slight difference in the way in which critical sections are handled in
3668 SML/NJ and MLton; since `MLton.Thread.switch` _always_ leaves a
3669 critical section, it is sometimes necessary to add additional
3670 `atomicBegin`-s/`atomicEnd`-s to ensure that we remain in a critical
3671 section after a thread switch.
3673 While looking at virtually every file in the core CML implementation,
3674 I took the liberty of simplifying things where it seemed possible; in
3675 terms of style, the implementation is about half-way between Reppy's
3676 original and MLton's.
3678 Some changes of note:
3680 * `util/` contains all pertinent data-structures: (functional and
3681 imperative) queues, (functional) priority queues. Hence, it should be
3682 easier to switch in more efficient or real-time implementations.
3684 * `core-cml/scheduler.sml`: in both implementations, this is where
3685 most of the interesting action takes place. I've made the connection
3686 between `MLton.Thread.t`-s and `ThreadId.thread_id`-s more abstract
3687 than it is in the SML/NJ implementation, and encapsulated all of the
3688 `MLton.Thread` operations in this module.
3690 * eliminated all of the "by hand" inlining
3693 == Future Extensions ==
3695 The CML documentation says the following:
3699 CML.joinEvt: thread_id -> unit event
3704 creates an event value for synchronizing on the termination of the
3705 thread with the ID tid. There are three ways that a thread may
3706 terminate: the function that was passed to spawn (or spawnc) may
3707 return; it may call the exit function, or it may have an uncaught
3708 exception. Note that `joinEvt` does not distinguish between these
3709 cases; it also does not become enabled if the named thread deadlocks
3710 (even if it is garbage collected).
I believe that `MLton.Finalizable` might be able to relax that
3714 last restriction. Upon the creation of a `'a Scheduler.thread`, we
3715 could attach a finalizer to the underlying `'a MLton.Thread.t` that
3716 enables the `joinEvt` (in the associated `ThreadID.thread_id`) when
3717 the `'a MLton.Thread.t` becomes unreachable.
3719 I don't know why CML doesn't have
3721 CML.kill: thread_id -> unit
3723 which has a fairly simple implementation -- setting a kill flag in the
3724 `thread_id` and adjusting the scheduler to discard any killed threads
3725 that it takes off the ready queue. The fairness of the scheduler
3726 ensures that a killed thread will eventually be discarded. The
semantics are a little murky for blocked threads that are killed,
3728 though. For example, consider a thread blocked on `SyncVar.mTake mv`
3729 and a thread blocked on `SyncVar.mGet mv`. If the first thread is
3730 killed while blocked, and a third thread does `SyncVar.mPut (mv, x)`,
3731 then we might expect that we'll enable the second thread, and never
3732 the first. But, when only the ready queue is able to discard killed
3733 threads, then the `SyncVar.mPut` could enable the first thread
3734 (putting it on the ready queue, from which it will be discarded) and
3735 leave the second thread blocked. We could solve this by adjusting the
`TransID.trans_id` types and the "cleaner" functions to look for both
3737 canceled transactions and transactions on killed threads.
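
The ready-queue half of this idea can be sketched in a few lines. This is a toy model under stated assumptions, not the CML scheduler: the `thread` record and the functions `enqueue`, `kill`, `next`, and `runAll` are hypothetical names, and threads are plain `unit -> unit` functions rather than `MLton.Thread.t` values. It shows only that a killed thread is discarded when it is taken off the ready queue; it does not address the blocked-thread problem described above.

```sml
(* Toy model: each thread carries a kill flag that is checked only
 * when the thread is taken off the ready queue. *)
type thread = {id: int, killed: bool ref, run: unit -> unit}

(* Ready queue as a list held in a ref; oldest entry first. *)
val ready: thread list ref = ref []

fun enqueue (t: thread): unit = ready := !ready @ [t]

(* Hypothetical analogue of CML.kill: it only sets the flag. *)
fun kill (t: thread): unit = #killed t := true

(* Dequeue the next live thread, discarding killed ones on the way. *)
fun next (): thread option =
   case !ready of
      [] => NONE
    | t :: rest =>
         (ready := rest
          ; if !(#killed t) then next () else SOME t)

(* Run threads until the ready queue is empty. *)
fun runAll (): unit =
   case next () of
      NONE => ()
    | SOME t => (#run t (); runAll ())

(* Usage: three threads that record their ids; the second is killed. *)
val log: int list ref = ref []
fun mk (i: int): thread =
   {id = i, killed = ref false, run = fn () => log := i :: !log}
val t1 = mk 1 and t2 = mk 2 and t3 = mk 3
val () = List.app enqueue [t1, t2, t3]
val () = kill t2
val () = runAll ()
(* rev (!log) is [1, 3]: thread 2 was dropped without running. *)
```

Because the flag is inspected only in `next`, a killed thread that is still sitting on some other queue is untouched until something moves it to the ready queue, which is exactly the source of the `mTake`/`mGet` anomaly.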
3739 John Reppy says that <!Cite(MarlowEtAl01)> and <!Cite(FlattFindler04)>
3740 explain why `CML.kill` would be a bad idea.
3742 Between `CML.timeOutEvt` and `CML.kill`, one could give an efficient
3743 solution to the recent `comp.lang.ml` post about terminating a
3744 function that doesn't complete in a given time.
fun timeOut (f: unit -> 'a, t: Time.time): 'a option =
   let
      val iv = SyncVar.iVar ()
      val tid = CML.spawn (fn () => SyncVar.iPut (iv, f ()))
   in
      CML.select
      [CML.wrap (CML.timeOutEvt t, fn () => (CML.kill tid; NONE)),
       CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)]
   end
3761 There are some CML related posts on the MLton mailing list:
3763 * http://www.mlton.org/pipermail/mlton/2004-May/
3765 that discuss concerns that SML/NJ's implementation is not space
3766 efficient, because multi-shot continuations can be held indefinitely
3767 on event queues. MLton is better off because of the one-shot nature
3768 -- when an event enables a thread, all other copies of the thread
3769 waiting in other event queues get turned into dead threads (of zero
3774 :mlton-guide-page: ConstantPropagation
3775 [[ConstantPropagation]]
3779 <:ConstantPropagation:> is an optimization pass for the <:SSA:>
3780 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
3784 This is whole-program constant propagation, even through data
3785 structures. It also performs globalization of (small) values computed
3790 == Implementation ==
3792 * <!ViewGitFile(mlton,master,mlton/ssa/constant-propagation.fun)>
3794 == Details and Notes ==
3800 :mlton-guide-page: Contact
3807 There are three mailing lists available.
3809 * mailto:MLton-user@mlton.org[`MLton-user@mlton.org`]
3811 MLton user community discussion
3814 * https://lists.sourceforge.net/lists/listinfo/mlton-user[subscribe]
3815 https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-user[archive (SourceForge; current)],
3816 http://www.mlton.org/pipermail/mlton-user/[archive (PiperMail; through 201110)]
3819 * mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]
3821 MLton developer community discussion
3824 * https://lists.sourceforge.net/lists/listinfo/mlton-devel[subscribe]
3825 https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-devel[archive (SourceForge; current)],
3826 http://www.mlton.org/pipermail/mlton-devel/[archive (PiperMail; through 201110)]
3829 * mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`]
3834 * https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe]
3835 * https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive (SourceForge; current)],
3836 http://www.mlton.org/pipermail/mlton-commit/[archive (PiperMail; through 201110)]
3840 === Mailing list policies ===
* The mailing lists are unmoderated. However, they are
configured to discard all spam, to hold all non-subscriber posts
for moderation, to accept all subscriber posts, and to require
administrator approval of subscription requests. Please contact
3846 mailto:matthew.fluet@gmail.com[Matthew Fluet] if it appears that your
3847 messages are being discarded as spam.
3849 * Large messages (over 256K) should not be sent. Rather, please send
3850 an email containing the discussion text and a link to any large files.
* Very active mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`] list
3854 members who might otherwise be expected to provide a fast response
3855 should send a message when they will be offline for more than a few
3856 days. The convention is to put
3857 "++__userid__ offline until __date__++" in the subject line to make it
3861 * Discussions started on the mailing lists should stay on the mailing
3862 lists. Private replies may be bounced to the mailing list for the
3863 benefit of those following the discussion.
3865 * Discussions started on
3866 mailto:MLton-user@mlton.org[`MLton-user@mlton.org`] may be migrated to
3867 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], particularly
3868 when the discussion shifts from how to use MLton to how to modify
3869 MLton (e.g., to fix a bug identified by the initial discussion).
3873 * Some MLton developers and users are in channel `#sml` on http://freenode.net.
3877 :mlton-guide-page: Contify
3882 <:Contify:> is an optimization pass for the <:SSA:>
3883 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
3887 Contification is a compiler optimization that turns a function that
3888 always returns to the same place into a continuation. This exposes
3889 control-flow information that is required by many optimizations,
3890 including traditional loop optimizations.
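
As a source-level illustration (a sketch only; the pass itself works on <:SSA:> functions, and whether a given function is contified also depends on what earlier passes have done), consider a local function whose only non-tail call site is a single point in its caller. The names `doubleSum` and `loop` are made up for this example.

```sml
fun doubleSum (n: int): int =
   let
      (* `loop` calls itself only in tail position, and its one
       * outside call is the binding of `total` below.  Every
       * activation of `loop` therefore returns to the same place,
       * which is the property contification looks for. *)
      fun loop (i, acc) =
         if i > n then acc else loop (i + 1, acc + i)
      val total = loop (1, 0)
   in
      total * 2
   end

val twenty = doubleSum 4   (* (1 + 2 + 3 + 4) * 2 *)
```

Once such a function is a continuation inside its caller, the recursion is visible as an ordinary loop in the caller's control-flow graph.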
3892 == Implementation ==
3894 * <!ViewGitFile(mlton,master,mlton/ssa/contify.fun)>
3896 == Details and Notes ==
3898 See <!Cite(FluetWeeks01, Contification Using Dominators)>. The
3899 intermediate language described in that paper has since evolved to the
3900 <:SSA:> <:IntermediateLanguage:>; hence, the complication described in
3901 Section 6.1 is no longer relevant.
3905 :mlton-guide-page: CoreML
3910 <:CoreML:Core ML> is an <:IntermediateLanguage:>, translated from
3911 <:AST:> by <:Elaborate:>, optimized by <:CoreMLSimplify:>, and
3912 translated by <:Defunctorize:> to <:XML:>.
3916 <:CoreML:> is polymorphic, higher-order, and has nested patterns.
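
For orientation, here is a small Standard ML fragment that exercises all three properties at the source level. It is only an illustration: the concrete form of <:CoreML:> terms differs from source syntax, and `mapPairs` is a made-up name.

```sml
(* Polymorphic (works for any 'a and 'b) and higher-order (takes the
 * function f as an argument). *)
fun mapPairs (f: 'a -> 'b) (xs: ('a * 'a) list): ('b * 'b) list =
   case xs of
      [] => []
      (* Nested pattern: a tuple pattern inside a list-cons pattern. *)
    | (x, y) :: rest => (f x, f y) :: mapPairs f rest

val lengths = mapPairs String.size [("a", "bc"), ("def", "")]
```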
3918 == Implementation ==
3920 * <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.sig)>
3921 * <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.fun)>
3925 The <:CoreML:> <:IntermediateLanguage:> has no independent type
3928 == Details and Notes ==
3934 :mlton-guide-page: CoreMLSimplify
3939 The single optimization pass for the <:CoreML:>
3940 <:IntermediateLanguage:> is controlled by the `Compile` functor
3941 (<!ViewGitFile(mlton,master,mlton/main/compile.fun)>).
3943 The following optimization pass is implemented:
3949 :mlton-guide-page: Credits
MLton was designed and implemented by <:HenryCejtin:>,
<:MatthewFluet:>, <:SureshJagannathan:>, and <:StephenWeeks:>.
3957 * <:HenryCejtin:> wrote the `IntInf` implementation, the original
3958 profiler, the original man pages, the `.spec` files for the RPMs,
3959 and lots of little hacks to speed stuff up.
3961 * <:MatthewFluet:> implemented the X86 and AMD64 native code generators,
3962 ported `mlprof` to work with the native code generator, did a lot
3963 of work on the SSA optimizer, both adding new optimizations and
3964 improving or porting existing optimizations, updated the
3965 <:BasisLibrary:Basis Library> implementation, ported
3966 <:ConcurrentML:> and <:MLNLFFI:ML-NLFFI> to MLton, implemented the
3967 <:MLBasis: ML Basis system>, ported MLton to 64-bit platforms,
3968 and currently leads the project.
3970 * <:SureshJagannathan:> implemented some early inlining and uncurrying
3973 * <:StephenWeeks:> implemented most of the original version of MLton, and
3974 continues to keep his fingers in most every part.
3976 Many people have helped us over the years. Here is an alphabetical
3979 * <:JesperLouisAndersen:> sent several patches to improve the runtime on
3980 FreeBSD and ported MLton to run on NetBSD and OpenBSD.
3982 * <:JohnnyAndersen:> implemented `BinIO`, modified MLton so it could
3983 cross compile to MinGW, and provided useful discussion about
3986 * Alexander Abushkevich extended support for OpenBSD.
3988 * Ross Bayer added the `-keep ast` compile-time option and experimented with
3989 porting the build system to CMake.
3991 * Kevin Bradley added initial support for <:SuccessorML:> features.
* Bryan Camp added `-disable-pass _regex_` and `-enable-pass _regex_` compile
3994 options to generalize `-drop-pass _regex_` and added `Array_copyArray` and
3995 `Array_copyVector` primitives.
3997 * Jason Carr added a parser combinator library and a parser for the <:SXML:>
3998 IR, extended compilation to start with a `.sxml` file, and experimented with
3999 alternate control-flow analyses for <:ClosureConvert: closure conversion>.
4001 * Christopher Cramer contributed support for additional
4002 `Posix.ProcEnv.sysconf` variables, performance improvements for
4003 `String.concatWith`, and Debian packaging.
4006 http://www.polyspace.com/[PolySpace Technologies] provided many bug
4007 fixes and runtime system improvements, code to help the Sparc/Solaris
4008 port, and funded a number of improvements to MLton.
4010 * Armando Doval updated `mlnlffigen` to warn and skip functions with
4011 `struct`/`union` arguments.
4013 * Martin Elsman provided helpful discussions in the development of
4014 the <:MLBasis:ML Basis system>.
4016 * Brent Fulgham ported MLton most of the way to MinGW.
4018 * <:AdamGoode:> provided a script to build the PDF MLton Guide and
4020 https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
4023 * Simon Helsen provided bug reports, suggestions, and helpful
4026 * Joe Hurd provided useful discussion and feedback on source-level
4029 * <:VesaKarvonen:> contributed `esml-mode.el` and `esml-mlb-mode.el` (see <:Emacs:>),
4030 contributed patches for improving match warnings,
4031 contributed `esml-du-mlton.el` and extended def-use output to include types of variable definitions (see <:EmacsDefUseMode:>), and
4032 improved constant folding of floating-point operations.
4034 * Richard Kelsey provided helpful discussions.
4036 * Ville Laurikari ported MLton to IA64/HPUX, HPPA/HPUX, PowerPC/AIX, PowerPC64/AIX.
4038 * Brian Leibig implemented the <:LLVMCodegen:>.
4040 * Geoffrey Mainland helped with FreeBSD packaging.
4042 * Eric McCorkle ported MLton to Intel Mac.
4044 * <:TomMurphy:> wrote the original version of `MLton.Syslog` as part
4045 of his `mlftpd` project, and has sent many useful bug reports and
4048 * Michael Neumann helped to patch the runtime to compile under
4051 * Barak Pearlmutter built the original
4052 http://packages.debian.org/mlton[Debian package] for MLton, and
4053 helped us to take over the process.
4055 * Filip Pizlo ported MLton to (PowerPC) Darwin.
4057 * Vedant Raiththa extended the <:ForeignFunctionInterface:> with support for
4058 `pure` and `impure` attributes to `_import`.
4060 * Krishna Ravikumar added initial support for vector expressions and the
4061 `Vector_vector` primitive.
4063 * John Reppy assisted in porting MLton to Intel Mac.
4065 * Sam Rushing ported MLton to FreeBSD.
4067 * Rob Simmons refactored the array and vector implementation in the
<:BasisLibrary:Basis Library> into a primitive implementation (using
4069 `SeqInt.int` for indexing) and a wrapper implementation (using the default
4070 `Int.int` for indexing).
4072 * Jeffrey Mark Siskind provided helpful discussions and inspiration
4073 with his Stalin Scheme compiler.
4075 * Matthew Surawski added <:LoopUnroll:> and <:LoopUnswitch:> SSA optimizations.
4077 * <:WesleyTerpstra:> added support for `MLton.Process.create`, made
4078 a number of contributions to the <:ForeignFunctionInterface:>,
4079 contributed a number of runtime system patches,
4080 added support for compiling to a <:LibrarySupport:C library>,
ported MLton to http://mingw.org[MinGW] and all http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] supported architectures with <:CrossCompiling:cross-compiling> support,
and maintains the http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] and http://mingw.org[MinGW] packages.
4084 * Maksim Yegorov added rudimentary support for `./configure` and other
4085 improvements to the build system and implemented the <:ShareZeroVec:> SSA
4088 * Luke Ziarek assisted in porting MLton to (PowerPC) Darwin.
4090 We have also benefited from other software development tools and
4091 used code from other sources.
4093 * MLton was developed using
4094 <:SMLNJ:Standard ML of New Jersey> and the
4095 <:CompilationManager:Compilation Manager (CM)>
4097 * MLton's lexer (`mlton/frontend/ml.lex`), parser
4098 (`mlton/frontend/ml.grm`), and precedence-parser
4099 (`mlton/elaborate/precedence-parse.fun`) are modified versions of
4102 * The MLton <:BasisLibrary:Basis Library> implementation of
4103 conversions between binary and decimal representations of reals uses
4104 David Gay's http://www.netlib.org/fp/[gdtoa] library.
4106 * The MLton <:BasisLibrary:Basis Library> implementation uses
modified versions of portions of the SML/NJ Basis Library
4108 implementation modules `OS.IO`, `Posix.IO`, `Process`,
4111 * The MLton <:BasisLibrary:Basis Library> implementation uses
4112 modified versions of portions of the <:MLKit:ML Kit> Version 4.1.4
4113 Basis Library implementation modules `Path`, `Time`, and
4116 * Many of the benchmarks come from the SML/NJ benchmark suite.
4118 * Many of the regression tests come from the ML Kit Version 4.1.4
4119 distribution, which borrowed them from the
4120 http://www.dina.kvl.dk/%7Esestoft/mosml.html[Moscow ML] distribution.
4122 * MLton uses the http://www.gnu.org/software/gmp/gmp.html[GNU multiprecision library] for its implementation of `IntInf`.
4124 * MLton's implementation of <:MLLex: mllex>, <:MLYacc: mlyacc>,
4125 the <:CKitLibrary:ckit Library>,
4126 the <:MLLPTLibrary:ML-LPT Library>,
4127 the <:MLRISCLibrary:MLRISC Library>,
4128 the <:SMLNJLibrary:SML/NJ Library>,
4129 <:ConcurrentML:Concurrent ML>,
4130 mlnlffigen and <:MLNLFFI:ML-NLFFI>
4131 are modified versions of code from SML/NJ.
4135 :mlton-guide-page: CrossCompiling
4140 MLton's `-target` flag directs MLton to cross compile an application
4141 for another platform. By default, MLton is only able to compile for
4142 the machine it is running on. In order to use MLton as a cross
4143 compiler, you need to do two things.
4145 1. Install the GCC cross-compiler tools on the host so that GCC can
4146 compile to the target.
4148 2. Cross compile the MLton runtime system to build the runtime
4149 libraries for the target.
4151 To make the terminology clear, we refer to the _host_ as the machine
4152 MLton is running on and the _target_ as the machine that MLton is
4155 To build a GCC cross-compiler toolset on the host, you can use the
4156 script `bin/build-cross-gcc`, available in the MLton sources, as a
4157 template. The value of the `target` variable in that script is
4158 important, since that is what you will pass to MLton's `-target` flag.
4159 Once you have the toolset built, you should be able to test it by
4160 cross compiling a simple hello world program on your host machine.
4162 % gcc -b i386-pc-cygwin -o hello-world hello-world.c
4165 You should now be able to run `hello-world` on the target machine, in
4166 this case, a Cygwin machine.
4168 Next, you must cross compile the MLton runtime system and inform MLton
4169 of the availability of the new target. The script `bin/add-cross`
4170 from the MLton sources will help you do this. Please read the
4171 comments at the top of the script. Here is a sample run adding a
4172 Solaris cross compiler.
4174 % add-cross sparc-sun-solaris sun blade
4176 Building print-constants executable.
4177 Running print-constants on blade.
4180 Running `add-cross` uses `ssh` to compile the runtime on the target
4181 machine and to create `print-constants`, which prints out all of the
4182 constants that MLton needs in order to implement the
4183 <:BasisLibrary:Basis Library>. The script runs `print-constants` on
4184 the target machine (`blade` in this case), and saves the output.
4186 Once you have done all this, you should be able to cross compile SML
4187 applications. For example,
4189 mlton -target i386-pc-cygwin hello-world.sml
4191 will create `hello-world`, which you should be able to run from a
4192 Cygwin shell on your Windows machine.
4195 == Cross-compiling alternatives ==
Building and maintaining `gcc` cross compilers is complex. You may
4198 find it simpler to use `mlton -keep g` to generate the files on the
4199 host, then copy the files to the target, and then use `gcc` or `mlton`
4200 on the target to compile the files.
4204 :mlton-guide-page: CVS
4209 http://www.gnu.org/software/cvs/[CVS] (Concurrent Versions System) is
4210 a version control system. The MLton project used CVS to maintain its
4211 <:Sources:source code>, but switched to <:Subversion:> on 20050730.
4213 Here are some online CVS resources.
4215 * http://cvsbook.red-bean.com/[Open Source Development with CVS]
4219 :mlton-guide-page: DeadCode
4224 <:DeadCode:> is an optimization pass for the <:CoreML:>
4225 <:IntermediateLanguage:>, invoked from <:CoreMLSimplify:>.
4229 This pass eliminates declarations from the
4230 <:BasisLibrary:Basis Library> not needed by the user program.
4232 == Implementation ==
4234 * <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.sig)>
4235 * <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.fun)>
4237 == Details and Notes ==
To compile small programs rapidly, a dead-code elimination pass is
run to eliminate as much of the Basis Library as possible. The
dead-code elimination algorithm used is not safe in
4242 general, and only works because the Basis Library implementation has
4246 * it performs no I/O
4248 The dead code elimination includes the minimal set of
4249 declarations from the Basis Library so that there are no free
4250 variables in the user program (or remaining Basis Library
4251 implementation). It has a special hack to include all
4252 bindings of the form:
4258 There is an <:MLBasisAnnotations:ML Basis annotation>,
4259 `deadCode true`, that governs which code is subject to this unsafe
4260 dead-code elimination.
4264 :mlton-guide-page: DeepFlatten
4269 <:DeepFlatten:> is an optimization pass for the <:SSA2:>
4270 <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.
4274 This pass flattens into mutable fields of objects and into vectors.
For example, an `(int * int) ref` is represented by a two-word
4277 object, and an `(int * int) array` contains pairs of `int`-s,
4278 rather than pointers to pairs of `int`-s.
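
The following fragment shows the kind of values the description refers to. It is a sketch for orientation only: the flattened layout is chosen by the compiler and cannot be observed from Standard ML, so the code behaves the same whether or not the pass runs. The names `cell` and `pairs` are made up for the example.

```sml
(* A mutable cell holding a pair: a candidate for a two-word object
 * instead of a one-word cell pointing at a separate pair. *)
val cell: (int * int) ref = ref (1, 2)
val () = cell := (3, 4)
val (a, b) = !cell

(* An array of pairs: a candidate for storing the pairs inline. *)
val pairs: (int * int) array = Array.tabulate (3, fn i => (i, i * i))
val sumOfSquares = Array.foldl (fn ((_, sq), acc) => sq + acc) 0 pairs
```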
4280 == Implementation ==
4282 * <!ViewGitFile(mlton,master,mlton/ssa/deep-flatten.fun)>
4284 == Details and Notes ==
4286 There are some performance issues with the deep flatten pass, where it
4287 consumes an excessive amount of memory.
4289 * http://www.mlton.org/pipermail/mlton/2005-April/026990.html
4290 * http://www.mlton.org/pipermail/mlton-user/2010-June/001626.html
4291 * http://www.mlton.org/pipermail/mlton/2010-December/030876.html
4293 A number of applications require compilation with
4294 `-disable-pass deepFlatten` to avoid exceeding available memory. It is
4295 often asked whether the deep flatten pass usually has a significant
4296 impact on performance. The standard benchmark suite was run with and
4297 without the deep flatten pass enabled when the pass was first
4300 * http://www.mlton.org/pipermail/mlton/2004-August/025760.html
4302 The conclusion is that it does not have a significant impact.
4303 However, these are micro benchmarks; other applications may derive
4304 greater benefit from the pass.
4308 :mlton-guide-page: DefineTypeBeforeUse
4309 [[DefineTypeBeforeUse]]
4313 <:StandardML:Standard ML> requires types to be defined before they are
4314 used. Because of type inference, the use of a type can be implicit;
4315 hence, this requirement is more subtle than it might appear. For
4316 example, the following program is not type correct, because the type
4317 of `r` is `t option ref`, but `t` is defined after `r`.
4323 val () = r := SOME A
4326 MLton reports the following error, indicating that the type defined on
4327 line 2 is used on line 1.
4330 Error: z.sml 3.10-3.20.
4331 Function applied to incorrect argument.
4332 expects: _ * [???] option
4333 but got: _ * [t] option
4335 note: type would escape its scope: t
4336 escape from: z.sml 2.10-2.10
4337 escape to: z.sml 1.1-1.16
4338 Warning: z.sml 1.5-1.5.
4339 Type of variable was not inferred and could not be generalized: r.
4340 type: ??? option ref
4341 in: val r = ref NONE
4344 While the above example is benign, the following example shows how to
4345 cast an integer to a function by (implicitly) using a type before it
4346 is defined. In the example, the ref cell `r` is of type
4347 `t option ref`, where `t` is defined _after_ `r`, as a parameter to
4356 val () = r := SOME x
4357 fun get () = valOf (!r)
4359 structure S1 = F (type t = unit -> unit
4360 val x = fn () => ())
4361 structure S2 = F (type t = int
4363 val () = S1.get () ()
4366 MLton reports the following error.
4369 Warning: z.sml 1.5-1.5.
4370 Type of variable was not inferred and could not be generalized: r.
4371 type: ??? option ref
4372 in: val r = ref NONE
4373 Error: z.sml 5.16-5.26.
4374 Function applied to incorrect argument.
4375 expects: _ * [???] option
4376 but got: _ * [t] option
4378 note: type would escape its scope: t
4379 escape from: z.sml 2.17-2.17
4380 escape to: z.sml 1.1-1.16
4381 Warning: z.sml 6.11-6.13.
4382 Type of variable was not inferred and could not be generalized: get.
4384 in: fun get () = (valOf (! r))
4385 Error: z.sml 12.10-12.18.
4386 Function not of arrow type.
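
One way to make a program of this shape type correct is to create the ref cell where `t` is already in scope, that is, inside the functor body. The following is a sketch of that repair, not part of the original example; the value `13` passed to the second application is an assumption chosen for illustration.

```sml
functor F (type t
           val x: t) =
   struct
      (* `t` is defined before it is used here, so the cell has type
       * `t option ref`, and each application of F gets its own cell. *)
      val r: t option ref = ref NONE
      val () = r := SOME x
      fun get () = valOf (!r)
   end

structure S1 = F (type t = unit -> unit
                  val x = fn () => ())
structure S2 = F (type t = int
                  val x = 13)

(* S1's cell still holds a function, so this application is safe. *)
val () = S1.get () ()
val thirteen = S2.get ()
```

Because the two applications no longer share a cell, the integer stored by `S2` can never be read back as a function through `S1`.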
4393 :mlton-guide-page: DefinitionOfStandardML
4394 [[DefinitionOfStandardML]]
4395 DefinitionOfStandardML
4396 ======================
4398 <!Cite(MilnerEtAl97, The Definition of Standard ML (Revised))> is a
4399 terse and formal specification of <:StandardML:Standard ML>'s syntax
4400 and semantics. The language specified by this book is often referred
4401 to as SML 97. You can check its syntax
4402 http://www.mpi-sws.org/~rossberg/sml.html[grammar] online (thanks to
4405 <!Cite(MilnerEtAl90, The Definition of Standard ML)> is an older
4406 version of the definition, published in 1990. The accompanying
4407 <!Cite(MilnerTofte91, Commentary)> introduces and explains the notation
4408 and approach. The same notation is used in the SML 97 definition, so it
4409 is worth keeping the older definition and its commentary at hand if you
4410 intend a close study of the definition.
4414 :mlton-guide-page: Defunctorize
4419 <:Defunctorize:> is a translation pass from the <:CoreML:>
4420 <:IntermediateLanguage:> to the <:XML:> <:IntermediateLanguage:>.
4424 This pass converts a <:CoreML:> program to an <:XML:> program by
4429 * polymorphic `val` dec expansion
4430 * `datatype` lifting (to the top-level)
4432 == Implementation ==
4434 * <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.sig)>
4435 * <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.fun)>
4437 == Details and Notes ==
4439 This pass is grossly misnamed and does not perform defunctorization.
4441 === Datatype Lifting ===
4443 This pass moves all `datatype` declarations to the top level.
4445 <:StandardML:Standard ML> `datatype` declarations can contain type
4446 variables that are not bound in the declaration itself. For example,
4447 the following program is valid.
4452 datatype 'b t = T of 'a * 'b
4453 val y: int t = T (x, 1)
Unfortunately, the `datatype` declaration cannot be immediately moved
4460 to the top level, because that would leave `'a` free.
4463 datatype 'b t = T of 'a * 'b
4466 val y: int t = T (x, 1)
4472 In order to safely move `datatype`s, this pass must close them, as
4473 well as add any free type variables as extra arguments to the type
4474 constructor. For example, the above program would be translated to
4478 datatype ('a, 'b) t = T of 'a * 'b
val y: ('a, int) t = T (x, 1)
4487 == Historical Notes ==
4489 The <:Defunctorize:> pass originally eliminated
4490 <:StandardML:Standard ML> functors by duplicating their body at each
4491 application. These duties have been adopted by the <:Elaborate:>
4496 :mlton-guide-page: Developers
4501 Here is a picture of the MLton team at a meeting in Chicago in August
4502 2003. From left to right we have:
4504 [align="center",frame="none",cols="^"]
4506 |<:StephenWeeks:> -- <:MatthewFluet:> -- <:HenryCejtin:> -- <:SureshJagannathan:>
4509 image::Developers.attachments/team.jpg[align="center"]
4511 Also see the <:Credits:> for a list of specific contributions.
4514 == Developers list ==
4516 A number of people read the developers mailing list,
4517 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], and make
4518 contributions there. Here's a list of those who have a page here.
4521 * <:JesperLouisAndersen:>
4522 * <:JohnnyAndersen:>
4523 * <:MichaelNorrish:>
4526 * <:WesleyTerpstra:>
4531 :mlton-guide-page: Development
4536 This page is the central point for MLton development.
4538 * Access the <:Sources:>.
4539 * Check the current <!ViewGitFile(mlton,master,CHANGELOG.adoc)> or recent https://github.com/MLton/mlton/commits/master[commits].
4540 * Open https://github.com/MLton/mlton/issues[Issues].
4541 * Ideas for <:Projects:> to improve MLton.
* <:Developers:> who are or have been involved in the project.
4543 // * Help maintain and improve the <:WebSite:>.
4547 * <:CompilerOverview:>
4548 * <:CompilingWithSMLNJ:>
4549 * <:CrossCompiling:>
4553 * <:ReleaseChecklist:>
4558 :mlton-guide-page: Documentation
4563 Documentation is available on the following topics.
4565 * <:StandardML:Standard ML>
4566 ** <:BasisLibrary:Basis Library>
4567 ** <:Libraries: Additional libraries>
4568 * <:Installation:Installing MLton>
4570 ** <:ForeignFunctionInterface: Foreign function interface (FFI)>
** <:ManualPage: Manual page> (<:CompileTimeOptions:compile-time options>, <:RunTimeOptions:run-time options>)
4572 ** <:MLBasis: ML Basis system>
4573 ** <:MLtonStructure: MLton structure>
4574 ** <:PlatformSpecificNotes: Platform-specific notes>
4575 ** <:Profiling: Profiling>
4576 ** <:TypeChecking: Type checking>
4577 ** Help for porting from <:SMLNJ:SML/NJ> to MLton.
4587 ** <:MLLex:> (<!Attachment(Documentation,mllex.pdf)>)
4588 ** <:MLYacc:> (<!Attachment(Documentation,mlyacc.pdf)>)
4589 ** <:MLNLFFIGen:> (<!Attachment(Documentation,mlyacc.pdf)>)
4594 :mlton-guide-page: Drawbacks
4599 MLton has several drawbacks due to its use of whole-program
4602 * Large compile-time memory requirement.
4604 Because MLton performs whole-program analysis and optimization,
4605 compilation requires a large amount of memory. For example, compiling
4606 MLton (over 140K lines) requires at least 512M RAM.
4608 * Long compile times.
4610 Whole-program compilation can take a long time. For example,
4611 compiling MLton (over 140K lines) on a 1.6GHz machine takes five to
4614 * No interactive top level.
4616 Because of whole-program compilation, MLton does not provide an
4617 interactive top level. In particular, it does not implement the
4618 optional <:BasisLibrary:Basis Library> function `use`.
4622 :mlton-guide-page: Eclipse
4627 http://eclipse.org/[Eclipse] is an open, extensible IDE.
4629 http://www.cse.iitd.ernet.in/%7Ecsu02132/mldev/[ML-Dev] is a plug-in
4630 for Eclipse, based on <:SMLNJ:SML/NJ>.
4632 There has been some talk on the MLton mailing list about adding
4633 support to Eclipse for MLton/SML, and in particular, using
4634 http://eclipsefp.sourceforge.net/. We are unaware of any progress
4639 :mlton-guide-page: Elaborate
4644 <:Elaborate:> is a translation pass from the <:AST:>
4645 <:IntermediateLanguage:> to the <:CoreML:> <:IntermediateLanguage:>.
4649 This pass performs type inference and type checking according to the
4650 <:DefinitionOfStandardML:Definition>. It also defunctorizes the
4651 program, eliminating all module-level constructs.
4653 == Implementation ==
4655 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.sig)>
4656 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.fun)>
4657 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.sig)>
4658 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.fun)>
4659 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.sig)>
4660 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.fun)>
4661 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.sig)>
4662 * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.fun)>
4663 * <!ViewGitDir(mlton,master,mlton/elaborate)>
4665 == Details and Notes ==
4667 At the modules level, the <:Elaborate:> pass:
4669 * elaborates signatures with interfaces (see
4670 <!ViewGitFile(mlton,master,mlton/elaborate/interface.sig)> and
4671 <!ViewGitFile(mlton,master,mlton/elaborate/interface.fun)>)
4673 The main trick is to use disjoint sets to efficiently handle sharing
4674 of tycons and of structures and then to copy signatures as dags rather
4677 * checks functors at the point of definition, using functor summaries
4678 to speed up checking of functor applications.
4680 When a functor is first type checked, we keep track of the dummy
4681 argument structure and the dummy result structure, as well as all the
4682 tycons that were created while elaborating the body. Then, if we
4683 later need to type check an application of the functor (as opposed to
4684 defunctorize an application), we pair up tycons in the dummy argument
4685 structure with the actual argument structure and then replace the
4686 dummy tycons with the actual tycons in the dummy result structure,
4687 yielding the actual result structure. We also generate new tycons for
4688 all the tycons that we created while originally elaborating the body.
4690 * handles opaque signature constraints.
4692 This is implemented by building a dummy structure realized from the
4693 signature, just as we would for a functor argument when type checking
4694 a functor. The dummy structure contains exactly the type information
4695 that is in the signature, which is what opacity requires. We then
4696 replace the variables (and constructors) in the dummy structure with
4697 the corresponding variables (and constructors) from the actual
4698 structure so that the translation to <:CoreML:> uses the right stuff.
4699 For each tycon in the dummy structure, we keep track of the
4700 corresponding type structure in the actual structure. This is used
4701 when producing the <:CoreML:> types (see `expandOpaque` in
4702 <!ViewGitFile(mlton,master,mlton/elaborate/type-env.sig)> and
4703 <!ViewGitFile(mlton,master,mlton/elaborate/type-env.fun)>).
4705 Then, within each `structure` or `functor` body, for each declaration
4706 (`<dec>` in the <:StandardML:Standard ML> grammar), the <:Elaborate:>
4707 pass does three steps:
4710 1. <:ScopeInference:>
4712 ** <:PrecedenceParse:>
4713 ** `_{ex,im}port` expansion
4714 ** profiling insertion
4716 3. Overloaded {constant, function, record pattern} resolution
4719 === Defunctorization ===
4721 The <:Elaborate:> pass performs a number of duties historically
4722 assigned to the <:Defunctorize:> pass.
4724 As part of the <:Elaborate:> pass, all module level constructs
4725 (`open`, `signature`, `structure`, `functor`, long identifiers) are
4726 removed. This works because the <:Elaborate:> pass assigns a unique
4727 name to every type and variable in the program. This also allows the
4728 <:Elaborate:> pass to eliminate `local` declarations, which are purely
4729 for namespace management.
4734 Here are a number of examples of elaboration.
4736 * All variables bound in `val` declarations are renamed.
4749 * All variables in `fun` declarations are renamed.
4758 fun f_0 x_0 = g_0 x_0
4759 and g_0 y_0 = f_0 y_0
4762 * Type abbreviations are removed, and the abbreviation is expanded
4763 wherever it is used.
4767 type 'a u = int * 'a
4768 type 'b t = 'b u * real
4769 fun f (x : bool t) = x
4773 fun f_0 (x_0 : (int * bool) * real) = x_0
4776 * Exception declarations create a new constructor and rename the type.
4781 exception E of t * real
4785 exception E_0 of int * real
4788 * The type and value constructors in datatype declarations are renamed.
4792 datatype t = A of int | B of real * t
4796 datatype t_0 = A_0 of int | B_0 of real * t_0
4799 * Local declarations are moved to the top-level. The environment
4800 keeps track of the variables in scope.
4818 * Structure declarations are eliminated, with all declarations moved
4819 to the top level. Long identifiers are renamed.
4836 * Open declarations are eliminated.
4857 * Functor declarations are eliminated, and the body of a functor is
4858 duplicated wherever the functor is applied.
4862 functor F(val x : int) =
4866 structure F1 = F(val x = 13)
4867 structure F2 = F(val x = 14)
4879 * Signature constraints are eliminated. Note that signatures do
4880 affect how subsequent variables are renamed.
4905 :mlton-guide-page: Emacs
4912 There are a few Emacs modes for SML.
4915 ** http://www.xemacs.org/Documentation/packages/html/sml-mode_3.html
4916 ** http://www.smlnj.org/doc/Emacs/sml-mode.html
4917 ** http://www.iro.umontreal.ca/%7Emonnier/elisp/
4919 * <!ViewGitFile(mlton,master,ide/emacs/mlton.el)> contains the Emacs Lisp that <:StephenWeeks:> uses to interact with MLton (in addition to using `sml-mode`).
4921 * http://primate.net/%7Eitz/mindent.tar, developed by Ian Zimmerman, who writes:
4924 Unlike the widespread `sml-mode.el` it doesn't try to indent code
4925 based on ML syntax. I gradually got skeptical about this approach
4926 after writing the initial indentation support for caml mode and
4927 watching it bloat insanely as the language added new features. Also,
4928 any such attempts that I know of impose a particular coding style, or
4929 at best a choice among a limited set of styles, which I now oppose.
4930 Instead my mode is based on a generic package which provides manual
4931 bindable commands for common indentation operations (example: indent
4932 the current line under the n-th occurrence of a particular character
4933 in the previous non-blank line).
4938 There is a mode for editing <:MLBasis: ML Basis> files.
4940 * <!ViewGitFile(mlton,master,ide/emacs/esml-mlb-mode.el)> (plus other files)
4942 == Definitions and uses ==
4944 There is a mode that supports the precise def-use information that
4945 MLton can output. It highlights definitions and uses and provides
4946 commands for navigation (e.g., `jump-to-def`, `jump-to-next`,
4947 `list-all-refs`). It can be handy, for example, for navigating in the
4948 MLton compiler source code. See <:EmacsDefUseMode:> for further
4951 == Building on the background ==
4953 Tired of manually starting/stopping/restarting builds after editing
4954 files? Now you don't have to. See <:EmacsBgBuildMode:> for further
4957 == Error messages ==
4959 MLton's error messages are not among those that the Emacs `next-error`
4960 parser natively understands. The easiest way to fix this is to add
4961 the following to your `.emacs` to teach Emacs to recognize MLton's
4967 (add-to-list 'compilation-error-regexp-alist 'mlton)
4968 (add-to-list 'compilation-error-regexp-alist-alist
4970 "^[[:space:]]*\\(\\(?:\\(Error\\)\\|\\(Warning\\)\\|\\(\\(?:\\(?:defn\\|spec\\) at\\)\\|\\(?:escape \\(?:from\\|to\\)\\)\\|\\(?:scoped at\\)\\)\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\(?:-\\([0-9]+\\)\\.\\([0-9]+\\)\\)?\\.?\\)$"
4971 5 (6 . 8) (7 . 9) (3 . 4) 1))
4976 :mlton-guide-page: EmacsBgBuildMode
4977 [[EmacsBgBuildMode]]
4981 Do you really want to think about starting a build of your project?
4982 What if you had a personal assistant that would restart a build of your
4983 project whenever you save any file belonging to that project? The
4984 bg-build mode does just that. Just save the file, a compile is
4985 started (silently!), you can continue working without even thinking
4986 about starting a build, and if there are errors, you are notified
4987 (with a message), and can then jump to errors.
4989 This mode is not specific to MLton per se, but is particularly useful
4990 for working with MLton due to the longer compile times. By the time
4991 you start wondering about possible errors, the build is already on the
4994 == Functionality and Features ==
4996 * Each time a file is saved, and after a user configurable delay
4997 period has been exhausted, a build is started silently in the
4999 * When the build is finished, a status indicator (message) is
5000 displayed non-intrusively.
5001 * At any time, you can switch to a build process buffer where all the
5002 messages from the build are shown.
5003 * Optionally highlights (error/warning) message locations in (source
5004 code) buffers after a finished build.
5005 * After a build has finished, you can jump to locations of warnings
5006 and errors from the build process buffer or by using the `first-error`
5007 and `next-error` commands.
5008 * When a build fails, bg-build mode can optionally execute a user
5009 specified command. By default, bg-build mode executes `first-error`.
5010 * When starting a build of a particular project, a possible previous
5011 live build of the same project is interrupted first.
5012 * A project configuration file specifies the commands required to
5014 * Multiple projects can be loaded into bg-build mode and bg-build mode
5015 can build a given maximum number of projects concurrently.
5016 * Supports both http://www.gnu.org/software/emacs/[Gnu Emacs] and
5017 http://www.xemacs.org[XEmacs].
5022 There is no package for the mode at the moment. To install the mode you
5023 need to fetch the Emacs Lisp, `*.el`, files from the MLton repository:
5024 <!ViewGitDir(mlton,master,ide/emacs)>.
5029 The easiest way to load the mode is to first tell Emacs where to find the
5030 files. For example, add
5034 (add-to-list 'load-path (file-truename "path-to-the-el-files"))
5037 to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably also want
5038 to start the mode automatically by adding
5042 (require 'bg-build-mode)
5046 to your Emacs init file. Once the mode is activated, you should see
5047 the `BGB` indicator on the mode line.
5050 === MLton and Compilation-Mode ===
5052 At the time of writing, neither Gnu Emacs nor XEmacs contains an error
5053 regexp that would match MLton's messages.
5055 If you use Gnu Emacs, insert the following code into your `.emacs` file:
5061 'compilation-error-regexp-alist
5062 '("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$"
5066 If you use XEmacs, insert the following code into your `init.el` file:
5072 'compilation-error-regexp-alist-alist
5074 ("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$"
5076 (compilation-build-compilation-error-regexp-alist)
5081 Typically projects are built (or compiled) using a tool like http://www.gnu.org/software/make/[`make`],
5082 but the details vary. The bg-build mode needs a project configuration file to
5083 know how to build your project. A project configuration file basically contains
5084 an Emacs Lisp expression calling a function named `bg-build` that returns a
5085 project object. A simple example of a project configuration file would be the
5086 (<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/example/smlbot/Build.bgb)>)
5087 file used with smlbot:
5091 sys::[./bin/InclGitFile.py mltonlib master com/ssh/async/unstable/example/smlbot/Build.bgb 5:]
5094 The `bg-build` function takes a number of keyword arguments:
5096 * `:name` specifies the name of the project. This can be any
5097 expression that evaluates to a string or to a nullary function that
5100 * `:shell` specifies a shell command to execute. This can be any
5101 expression that evaluates to a string, a list of strings, or to a
5102 nullary function returning a list of strings.
5104 * `:build?` specifies a predicate to determine whether the project
5105 should be built after some files have been modified. The predicate is
5106 given a list of filenames and should return a non-nil value when the
5107 project should be built and nil otherwise.
5109 All of the keyword arguments, except `:shell`, are optional and can be left out.
5111 Note the use of the `nice` command above. It means that the background
5112 build process is given a lower priority by the system process
5113 scheduler. Assuming your machine has enough memory, using `nice`
5114 ensures that your computer remains responsive. (You probably won't
5115 even notice when a build is started.)
5117 Once you have written a project file for bg-build mode, use the
5118 `bg-build-add-project` command to load the project file for bg-build
5119 mode. The bg-build mode can also optionally load recent project files
5120 automatically at startup.
5122 After the project file has been loaded and bg-build mode activated,
5123 each time you save a file in Emacs, the bg-build mode tries to build
5126 The `bg-build-status` command creates a buffer that displays some
5127 status information on builds and allows you to manage projects (start
5128 builds explicitly, remove a project from bg-build, ...) as well as
5129 visit buffers created by bg-build. Notice the count of started
5130 builds. At the end of the day it can be in the hundreds or thousands.
5131 Imagine the number of times you've been relieved of starting a build
5136 :mlton-guide-page: EmacsDefUseMode
5141 MLton provides an <:CompileTimeOptions:option>,
5142 ++-show-def-use __file__++, to output precise (giving exact source
5143 locations) and accurate (including all uses and no false data)
5144 whole-program def-use information to a file. Unlike typical tags
5145 facilities, the information includes local variables and distinguishes
5146 between different definitions even when they have the same name. The
5147 def-use Emacs mode uses the information to provide navigation support,
5148 which can be particularly useful while reading SML programs compiled
5149 with MLton (such as the MLton compiler itself).
5152 == Screen Capture ==
5154 Note the highlighting and the type displayed in the minibuffer.
5156 image::EmacsDefUseMode.attachments/def-use-capture.png[align="center"]
5161 * Highlights definitions and uses. Different colors for definitions, unused definitions, and uses.
5162 * Shows types (with highlighting) of variable definitions in the minibuffer.
5163 * Navigation: `jump-to-def`, `jump-to-next`, and `jump-to-prev`. These work precisely (no searching involved).
5164 * Can list, visit and mark all references to a definition (within a program).
5165 * Automatically reloads updated def-use files.
5166 * Automatically loads previously used def-use files at startup.
5167 * Supports both http://www.gnu.org/software/emacs/[Gnu Emacs] and http://www.xemacs.org[XEmacs].
5172 There is no separate package for the def-use mode although the mode
5173 has been relatively stable for some time already. To install the mode
5174 you need to get the Emacs Lisp, `*.el`, files from MLton's repository:
5175 <!ViewGitDir(mlton,master,ide/emacs)>. The easiest way to get the files
5176 is to use <:Git:> to access MLton's <:Sources:sources>.
5179 If you only want the Emacs Lisp files, you can use the following
5182 svn co svn://mlton.org/mlton/trunk/ide/emacs mlton-emacs-ide
5188 The easiest way to load def-use mode is to first tell Emacs where to
5189 find the files. For example, add
5193 (add-to-list 'load-path (file-truename "path-to-the-el-files"))
5196 to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably
5197 also want to start `def-use-mode` automatically by adding
5201 (require 'esml-du-mlton)
5205 to your Emacs init file. Once the def-use mode is activated, you
5206 should see the `DU` indicator on the mode line.
5210 To use def-use mode one typically first sets up the program's makefile
5211 or build script so that the def-use information is saved each time the
5212 program is compiled. In addition to the ++-show-def-use __file__++
5213 option, the ++-prefer-abs-paths true++ expert option is required.
5214 Note that the time it takes to save the information is small (compared
5215 to type-checking), so it is recommended to simply add the options to
5216 the MLton invocation that compiles the program. However, it is only
5217 necessary to type check the program (or library), so one can specify
5218 the ++-stop tc++ option. For example, if you have a program
5219 defined by an MLB file named `my-prg.mlb`, you can save the def-use
5220 information to the file `my-prg.du` by invoking MLton as:
5223 mlton -prefer-abs-paths true -show-def-use my-prg.du -stop tc my-prg.mlb
5226 Finally, one needs to tell the mode where to find the def-use
5227 information. This is done with the `esml-du-mlton` command. For
5228 example, to load the `my-prg.du` file, one would type:
5231 M-x esml-du-mlton my-prg.du
5234 After doing all of the above, find an SML file covered by the
5235 previously saved and loaded def-use information, and place the cursor
5236 at some variable (definition or use, it doesn't matter). You should
5237 see the variable being highlighted. (Note that specifications in
5238 signatures do not define variables.)
5240 You might also want to set up and use the
5241 <:EmacsBgBuildMode:Bg-Build mode> to start builds automatically.
5246 `-show-def-use` output was extended to include types of variable
5247 definitions in revision <!ViewSVNRev(6333)>. To get good type names, the
5248 types must be in scope at the end of the program. If you are using the
5249 <:MLBasis:ML Basis> system, this means that the root MLB-file for your
5250 application should not wrap the libraries used in the application inside
5251 `local ... in ... end`, because that would remove them from the scope before
5252 the end of the program.
5256 :mlton-guide-page: Enscript
5261 http://www.gnu.org/s/enscript/[GNU Enscript] converts ASCII files to
5262 PostScript, HTML, and other output languages, applying language
5263 sensitive highlighting (similar to <:Emacs:>'s font lock mode). Here
5264 are a few _states_ files for highlighting <:StandardML: Standard ML>.
5266 * <!ViewGitFile(mlton,master,ide/enscript/sml_simple.st)> -- Provides highlighting of keywords, string and character constants, and (nested) comments.
5271 (* Comments (* can be nested *) *)
5272 structure S = struct
5273 val x = (1, 2, "three")
5278 * <!ViewGitFile(mlton,master,ide/enscript/sml_verbose.st)> -- Supersedes
5279 the above, adding highlighting of numeric constants. Due to the
5280 limited parsing available, numeric record labels are highlighted as
5281 numeric constants, in all contexts. Likewise, a binding precedence
5282 separated from `infix` or `infixr` by a newline is highlighted as a
5283 numeric constant and a numeric record label selector separated from
5284 `#` by a newline is highlighted as a numeric constant.
5289 structure S = struct
5290 (* These look good *)
5291 val x = (1, 2, "three")
5294 (* Although these look bad (not all the numbers are constants), *
5295 * they never occur in practice, as they are equivalent to the above. *)
5296 val x = {1 = 1, 3 = "three", 2 = 2}
5303 * <!ViewGitFile(mlton,master,ide/enscript/sml_fancy.st)> -- Supersedes the
5304 above, adding highlighting of type and constructor bindings,
5305 highlighting of explicit binding of type variables at `val` and `fun`
5306 declarations, and separate highlighting of core and modules level
5307 keywords. Due to the limited parsing available, it is assumed that
5308 the input is a syntactically correct, top-level declaration.
5313 structure S = struct
5314 val x = (1, 2, "three")
5315 datatype 'a t = T of 'a
5317 withtype v = {left: int t, right: int t}
5318 exception E1 of int and E2
5319 fun 'a id (x: 'a) : 'a = x
5321 (* Although this looks bad (the explicitly bound type variable 'a is *
5322 * not highlighted), it is unlikely to occur in practice. *)
5324 'a id = fn (x : 'a) => x
5329 * <!ViewGitFile(mlton,master,ide/enscript/sml_gaudy.st)> -- Supersedes the
5330 above, adding highlighting of type annotations, in both expressions
5331 and signatures. Due to the limited parsing available, it is assumed
5332 that the input is a syntactically correct, top-level declaration.
5340 val f : t * int -> int
5342 structure S : S = struct
5343 datatype t = T of int
5345 fun f (T x, i : int) : int = x + i
5346 fun 'a id (x: 'a) : 'a = x
5351 == Install and use ==
5353 * Version 1.6.3 of http://people.ssh.com/mtr/genscript[GNU Enscript]
5354 ** Copy all files to `/usr/share/enscript/hl/` or `.enscript/` in your home directory.
5355 ** Invoke `enscript` with `--highlight=sml_simple` (or `--highlight=sml_verbose` or `--highlight=sml_fancy` or `--highlight=sml_gaudy`).
5357 * Version 1.6.1 of http://people.ssh.com/mtr/genscript[GNU Enscript]
5358 ** Append <!ViewGitFile(mlton,master,ide/enscript/sml_all.st)> to `/usr/share/enscript/enscript.st`
5359 ** Invoke `enscript` with `--pretty-print=sml_simple` (or `--pretty-print=sml_verbose` or `--pretty-print=sml_fancy` or `--pretty-print=sml_gaudy`).
5363 Comments and suggestions should be directed to <:MatthewFluet:>.
5367 :mlton-guide-page: EqualityType
5372 An equality type is a type to which <:PolymorphicEquality:> can be
5373 applied. The <:DefinitionOfStandardML:Definition> and the
5374 <:BasisLibrary:Basis Library> precisely spell out which types are
5377 * `bool`, `char`, `IntInf.int`, ++Int__<N>__.int++, `string`, and ++Word__<N>__.word++ are equality types.
5379 * for any `t`, both `t array` and `t ref` are equality types.
5381 * if `t` is an equality type, then `t list` and `t vector` are equality types.
5383 * if `t1`, ..., `tn` are equality types, then `t1 * ... * tn` and `{l1: t1, ..., ln: tn}` are equality types.
5385 * if `t1`, ..., `tn` are equality types and `t` <:AdmitsEquality:>, then `(t1, ..., tn) t` is an equality type.
5387 To check that a type t is an equality type, use the following idiom.
5390 structure S: sig eqtype t end =
5396 Notably, `exn` and `real` are not equality types. Neither is `t1 -> t2`, for any `t1` and `t2`.
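As a concrete instance of the idiom above, the following sketch checks one type that passes and shows one that fails. The structure names `S` and `Bad` and the chosen types are illustrative assumptions, not part of MLton.

[source,sml]
----
(* Accepted: int * string list is built only from equality types. *)
structure S: sig eqtype t end =
   struct
      type t = int * string list
   end

(* Rejected if uncommented: real is not an equality type, so the
 * structure does not match the eqtype specification.
structure Bad: sig eqtype t end =
   struct
      type t = int * real
   end
 *)
----

Because the check happens entirely at compile time, the structure can be discarded once the program type checks.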
5398 Equality on arrays and ref cells is by identity, not structure.
5399 For example, `ref 13 = ref 13` is `false`.
5400 On the other hand, equality for lists, strings, and vectors is by
5401 structure, not identity. For example, the following equalities hold.
5405 val _ = [1, 2, 3] = 1 :: [2, 3]
5406 val _ = "foo" = concat ["f", "o", "o"]
5407 val _ = Vector.fromList [1, 2, 3] = Vector.tabulate (3, fn i => i + 1)
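The difference between identity and structure can be seen side by side in a minimal sketch; all variable names here are chosen for illustration only.

[source,sml]
----
val r = ref 13
val s = r                                  (* s is an alias for the same cell *)
val sameCell = r = s                       (* true: same identity *)
val freshCell = r = ref 13                 (* false: equal contents, distinct cells *)

val a = Array.fromList [1, 2, 3]
val freshArray = a = Array.fromList [1, 2, 3]   (* false: distinct arrays *)

(* Array.vector copies the contents into an immutable vector, and
 * vectors are compared by structure. *)
val sameContents = Array.vector a = Vector.fromList [1, 2, 3]   (* true *)
----

To compare two arrays or refs by contents, compare the contents explicitly, for example `!r = !s` or the extracted vectors as above.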
5412 :mlton-guide-page: EqualityTypeVariable
5413 [[EqualityTypeVariable]]
5414 EqualityTypeVariable
5415 ====================
5417 An equality type variable is a type variable that starts with two or
5418 more primes, as in `''a` or `''b`. The canonical use of equality type
5419 variables is in specifying the type of the <:PolymorphicEquality:>
5420 function, which is `''a * ''a -> bool`. Equality type variables
5421 ensure that polymorphic equality is only used on
5422 <:EqualityType:equality types>, by requiring that at every use of a
5423 polymorphic value, equality type variables are instantiated by
5426 For example, the following program is type correct because polymorphic
5427 equality is applied to variables of type `''a`.
5431 fun f (x: ''a, y: ''a): bool = x = y
5434 On the other hand, the following program is not type correct, because
5435 polymorphic equality is applied to variables of type `'a`, which is
5436 not an equality type.
5440 fun f (x: 'a, y: 'a): bool = x = y
5443 MLton reports the following error, indicating that polymorphic
5444 equality expects equality types, but didn't get them.
5447 Error: z.sml 1.30-1.34.
5448 Function applied to incorrect argument.
5449 expects: [<equality>] * [<equality>]
5450 but got: ['a] * ['a]
5454 As an example of using such a function that requires equality types,
5455 suppose that `f` has polymorphic type `''a -> unit`. Then, `f 13` is
5456 type correct because `int` is an equality type. On the other hand,
5457 `f 13.0` and `f (fn x => x)` are not type correct, because `real` and
5458 arrow types are not equality types. We can test these facts with the
5459 following short programs. First, we verify that such an `f` can be
5460 applied to integers.
5464 functor Ok (val f: ''a -> unit): sig end =
5471 We can do better, and verify that such an `f` can be applied to
5476 functor Ok (val f: ''a -> unit): sig end =
5478 fun g (x: int) = f x
5482 Even better, we don't need to introduce a dummy function name; we can
5483 use a type constraint.
5487 functor Ok (val f: ''a -> unit): sig end =
5489 val _ = f: int -> unit
5493 Even better, we can use a signature constraint.
5497 functor Ok (S: sig val f: ''a -> unit end):
5498 sig val f: int -> unit end = S
5501 This functor concisely verifies that a function of polymorphic type
5502 `''a -> unit` can be safely used as a function of type `int -> unit`.
5504 As above, we can verify that such an `f` cannot be used at
5509 functor Bad (S: sig val f: ''a -> unit end):
5510 sig val f: real -> unit end = S
5512 functor Bad (S: sig val f: ''a -> unit end):
5513 sig val f: ('a -> 'a) -> unit end = S
5516 MLton reports the following errors.
5519 Error: z.sml 2.4-2.30.
5520 Variable in structure disagrees with signature (type): f.
5521 structure: val f: [<equality>] -> _
5522 defn at: z.sml 1.25-1.25
5523 signature: val f: [real] -> _
5524 spec at: z.sml 2.12-2.12
5525 Error: z.sml 5.4-5.36.
5526 Variable in structure disagrees with signature (type): f.
5527 structure: val f: [<equality>] -> _
5528 defn at: z.sml 4.25-4.25
5529 signature: val f: [_ -> _] -> _
5530 spec at: z.sml 5.12-5.12
5534 == Equality type variables in type and datatype declarations ==
5536 Equality type variables can be used in type and datatype declarations;
5537 however they play no special role. For example,
5541 type 'a t = 'a * int
5544 is completely identical to
5548 type ''a t = ''a * int
5551 In particular, such a definition does _not_ require that `t` only be
5552 applied to equality types.
5558 datatype 'a t = A | B of 'a
5561 is completely identical to
5565 datatype ''a t = A | B of ''a
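A short sketch makes the point concrete: type constructors declared with equality type variables can still be applied to non-equality types. The names `pair`, `opt`, `x`, and `y` are hypothetical and chosen to avoid clashing with the declarations above.

[source,sml]
----
type ''a pair = ''a * int
(* Accepted although real is not an equality type. *)
val x: real pair = (1.0, 2)

datatype ''a opt = None | Some of ''a
(* Accepted although arrow types are not equality types. *)
val y: (int -> int) opt = Some (fn n => n + 1)
----

Only uses of polymorphic values, such as applications of polymorphic equality, are constrained by equality type variables.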
5570 :mlton-guide-page: EtaExpansion
5575 Eta expansion is a simple syntactic change used to work around the
5576 <:ValueRestriction:> in <:StandardML:Standard ML>.
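As a minimal sketch of the workaround (the names `revAll` and `revAllEta` are hypothetical): a partial application is not a syntactic value, so its type is not generalized, whereas wrapping it in a `fn` expression restores polymorphism.

[source,sml]
----
(* Not generalized: the right-hand side is an application, so under
 * the value restriction revAll cannot be used at two different types.
val revAll = List.map List.rev
 *)

(* Eta-expanded: the right-hand side is a fn expression, which is a
 * syntactic value, so revAllEta gets the polymorphic type
 * 'a list list -> 'a list list. *)
val revAllEta = fn z => List.map List.rev z

val ints = revAllEta [[1, 2], [3]]
val strs = revAllEta [["a", "b"]]
----

The cost is that `List.map List.rev` is re-evaluated on every call, which is harmless here because that evaluation has no side effects.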
5578 The eta expansion of an expression `e` is the expression
5579 `fn z => e z`, where `z` does not occur in `e`. This only
5580 makes sense if `e` denotes a function, i.e. is of arrow type. Eta
5581 expansion delays the evaluation of `e` until the function is
5582 applied, and will re-evaluate `e` each time the function is
5585 The name "eta expansion" comes from the eta-conversion rule of the
5586 <:LambdaCalculus:lambda calculus>. Expansion refers to the
5587 directionality of the equivalence being used, namely taking `e` to
5588 `fn z => e z` rather than `fn z => e z` to `e` (eta
5593 :mlton-guide-page: eXene
5598 http://people.cs.uchicago.edu/%7Ejhr/eXene/index.html[eXene] is a
5599 multi-threaded X Window System toolkit written in <:ConcurrentML:>.
5601 There is a group at K-State working toward
5602 http://www.cis.ksu.edu/%7Estough/eXene/[eXene 2.0].
5606 :mlton-guide-page: FAQ
5611 Feel free to ask questions and to update answers by editing this page.
5612 Since we try to make as much information as possible available on the
5613 web site and we like to avoid duplication, many of the answers are
5614 simply links to a web page that answers the question.
5616 == How do you pronounce MLton? ==
5620 == What SML software has been ported to MLton? ==
5624 == What graphical libraries are available for MLton? ==
5628 == How does MLton's performance compare to other SML compilers and to other languages? ==
5630 MLton has <:Performance:excellent performance>.
5632 == Does MLton treat monomorphic arrays and vectors specially? ==
5634 MLton implements monomorphic arrays and vectors (e.g. `BoolArray`,
5635 `Word8Vector`) exactly as instantiations of their polymorphic
5636 counterpart (e.g. `bool array`, `Word8.word vector`). Thus, there is
5637 no need to use the monomorphic versions except when required to
5638 interface with the <:BasisLibrary:Basis Library> or for portability
5639 with other SML implementations.
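A small sketch of what this means in practice, assuming that MLton also exposes the corresponding type equality to user code (other SML implementations may reject the second binding):

[source,sml]
----
val v: Word8Vector.vector = Word8Vector.fromList [0w1, 0w2, 0w3]

(* Assumption: accepted by MLton because Word8Vector.vector and
 * Word8.word vector are the same type; not portable SML. *)
val w: Word8.word vector = v

val n = Vector.length w
----

Code that must remain portable should keep using the monomorphic structures and avoid relying on this equality.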
5641 == Why do I get a Segfault/Bus error in a program that uses `IntInf`/`LargeInt` to calculate numbers with several hundred thousand digits? ==
5645 == How can I decrease compile-time memory usage? ==
5647 * Compile with `-verbose 3` to find out if the problem is due to an
5648 SSA optimization pass. If so, compile with ++-disable-pass __pass__++ to
5651 * Compile with `@MLton hash-cons 0.5 --`, which will instruct the
5652 runtime to hash cons the heap every other GC.
5654 * Compile with `-polyvariance false`, which is an undocumented option
5655 that causes less code duplication.
5657 Also, please <:Contact:> us to let us know the problem to help us
5658 better understand MLton's limitations.
5660 == How portable is SML code across SML compilers? ==
5662 <:StandardMLPortability:>
5666 :mlton-guide-page: Features
5671 MLton has the following features.
5675 * Runs on a variety of platforms.
5677 ** <:RunningOnARM:ARM>:
5678 *** <:RunningOnLinux:Linux> (Debian)
5680 ** <:RunningOnAlpha:Alpha>:
5681 *** <:RunningOnLinux:Linux> (Debian)
5683 ** <:RunningOnAMD64:AMD64>:
5684 *** <:RunningOnDarwin:Darwin> (Mac OS X)
5685 *** <:RunningOnFreeBSD:FreeBSD>
5686 *** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...)
5687 *** <:RunningOnOpenBSD:OpenBSD>
5688 *** <:RunningOnSolaris:Solaris> (10 and above)
5690 ** <:RunningOnHPPA:HPPA>:
5691 *** <:RunningOnHPUX:HPUX> (11.11 and above)
5692 *** <:RunningOnLinux:Linux> (Debian)
5694 ** <:RunningOnIA64:IA64>:
5695 *** <:RunningOnHPUX:HPUX> (11.11 and above)
5696 *** <:RunningOnLinux:Linux> (Debian)
5698 ** <:RunningOnPowerPC:PowerPC>:
5699 *** <:RunningOnAIX:AIX> (5.2 and above)
5700 *** <:RunningOnDarwin:Darwin> (Mac OS X)
5701 *** <:RunningOnLinux:Linux> (Debian, Fedora, ...)
5703 ** <:RunningOnPowerPC64:PowerPC64>:
5704 *** <:RunningOnAIX:AIX> (5.2 and above)
5706 ** <:RunningOnS390:S390>
5707 *** <:RunningOnLinux:Linux> (Debian)
5709 ** <:RunningOnSparc:Sparc>
5710 *** <:RunningOnLinux:Linux> (Debian)
5711 *** <:RunningOnSolaris:Solaris> (8 and above)
5713 ** <:RunningOnX86:X86>:
5714 *** <:RunningOnCygwin:Cygwin>/Windows
5715 *** <:RunningOnDarwin:Darwin> (Mac OS X)
5716 *** <:RunningOnFreeBSD:FreeBSD>
5717 *** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...)
5718 *** <:RunningOnMinGW:MinGW>/Windows
5719 *** <:RunningOnNetBSD:NetBSD>
5720 *** <:RunningOnOpenBSD:OpenBSD>
5721 *** <:RunningOnSolaris:Solaris> (10 and above)
5725 * Supports the full SML 97 language as given in <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>.
5727 If there is a program that is valid according to the
5728 <:DefinitionOfStandardML:Definition> that is rejected by MLton, or a
5729 program that is invalid according to the
5730 <:DefinitionOfStandardML:Definition> that is accepted by MLton, it is
5731 a bug. For a list of known bugs, see <:UnresolvedBugs:>.
5733 * A complete implementation of the <:BasisLibrary:Basis Library>.
5735 MLton's implementation matches the latest <:BasisLibrary:Basis Library>
5736 http://www.standardml.org/Basis[specification], and includes a
5737 complete implementation of all the required modules, as well as many
5738 of the optional modules.
5740 * Generates standalone executables.
5742 No additional code or libraries are necessary in order to run an
5743 executable, except for the standard shared libraries. MLton can also
5744 generate statically linked executables.
5746 * Compiles large programs.
5748 MLton is sufficiently efficient and robust that it can compile large
5749 programs, including itself (over 190K lines). The distributed version
5750 of MLton was compiled by MLton.
5752 * Support for large amounts of memory (up to 4G on 32-bit systems; more on 64-bit systems).
5754 * Support for large array lengths (up to 2^31^-1 on 32-bit systems; up to 2^63^-1 on 64-bit systems).
5756 * Support for large files, using 64-bit file positions.
5760 * Executables have <:Performance:excellent running times>.
5762 * Generates small executables.
5764 MLton takes advantage of whole-program compilation to perform very
5765 aggressive dead-code elimination, which often leads to smaller
5766 executables than with other SML compilers.
5768 * Untagged and unboxed native integers, reals, and words.
5770 In MLton, integers and words are 8 bits, 16 bits, 32 bits, and 64 bits
5771 and arithmetic does not have any overhead due to tagging or boxing.
5772 Also, reals (32-bit and 64-bit) are stored unboxed, avoiding any
5773 overhead due to boxing.
5775 * Unboxed native arrays.
5777 In MLton, an array (or vector) of integers, reals, or words uses the
5778 natural C-like representation. This is fast and supports easy
5779 exchange of data with C. Monomorphic arrays (and vectors) use the
5780 same C-like representations as their polymorphic counterparts.
5782 * Multiple <:GarbageCollection:garbage collection> strategies.
5784 * Fast arbitrary precision arithmetic (`IntInf`) based on <:GnuMP:>.
5786 For `IntInf` intensive programs, MLton can be an order of magnitude or
5787 more faster than Poly/ML or SML/NJ.
5791 * Source-level <:Profiling:> of both time and allocation.
5792 * <:MLLex:> lexer generator
5793 * <:MLYacc:> parser generator
5794 * <:MLNLFFIGen:> foreign-function-interface generator
5798 * A simple and fast C <:ForeignFunctionInterface:> that supports calling from SML to C and from C to SML.
5800 * The <:MLBasis:ML Basis system> for programming in the very large, separate delivery of library sources, and more.
5802 * A number of extension libraries that provide useful functionality
5803 that cannot be implemented with the <:BasisLibrary:Basis Library>.
5804 See below for an overview and <:MLtonStructure:> for details.
5806 ** <:MLtonCont:continuations>
5808 MLton supports continuations via `callcc` and `throw`.
5810 ** <:MLtonFinalizable:finalization>
5812 MLton supports finalizable values of arbitrary type.
5814 ** <:MLtonItimer:interval timers>
5816 MLton supports the functionality of the C `setitimer` function.
5818 ** <:MLtonRandom:random numbers>
5820 MLton has functions similar to the C `rand` and `srand` functions, as well as support for access to `/dev/random` and `/dev/urandom`.
5822 ** <:MLtonRlimit:resource limits>
5824 MLton has functions similar to the C `getrlimit` and `setrlimit` functions.
5826 ** <:MLtonRusage:resource usage>
5828 MLton supports a subset of the functionality of the C `getrusage` function.
5830 ** <:MLtonSignal:signal handlers>
5832 MLton supports signal handlers written in SML. Signal handlers run in
5833 a separate MLton thread, and have access to the thread that was
5834 interrupted by the signal. Signal handlers can be used in conjunction
5835 with threads to implement preemptive multitasking.
5837 ** <:MLtonStructure:size primitive>
5839 MLton includes a primitive that returns the size (in bytes) of any
5840 object. This can be useful in understanding the space behavior of a
5843 ** <:MLtonSyslog:system logging>
5845 MLton has a complete interface to the C `syslog` function.
5847 ** <:MLtonThread:threads>
5849 MLton has support for its own threads, upon which either preemptive or
5850 non-preemptive multitasking can be implemented. MLton also has
5851 support for <:ConcurrentML:Concurrent ML> (CML).
5853 ** <:MLtonWeak:weak pointers>
5855 MLton supports weak pointers, which allow the garbage collector to
5856 reclaim objects that it would otherwise be forced to keep. Weak
5857 pointers are also used to provide finalization.
5859 ** <:MLtonWorld:world save and restore>
5861 MLton has a facility for saving the entire state of a computation to a
5862 file and restarting it later. This facility can be used for staging
5863 and for checkpointing computations. It can even be used from within
signal handlers, allowing interrupt-driven checkpointing.
5868 :mlton-guide-page: FirstClassPolymorphism
5869 [[FirstClassPolymorphism]]
5870 FirstClassPolymorphism
5871 ======================
5873 First-class polymorphism is the ability to treat polymorphic functions
5874 just like other values: pass them as arguments, store them in data
5875 structures, etc. Although <:StandardML:Standard ML> does have
5876 polymorphic functions, it does not support first-class polymorphism.
5878 For example, the following declares and uses the polymorphic function
5887 If SML supported first-class polymorphism, we could write the
5891 fun useId id = (id 13; id "foo")
5894 However, this does not type check. MLton reports the following error.
5896 Error: z.sml 1.24-1.31.
5897 Function applied to incorrect argument.
The error arises because MLton infers from `id 13` that `id`
accepts an integer argument, whereas `id "foo"` passes a string.
5905 Using explicit types sheds some light on the problem.
5908 fun useId (id: 'a -> 'a) = (id 13; id "foo")
5911 On this, MLton reports the following errors.
5913 Error: z.sml 1.29-1.33.
5914 Function applied to incorrect argument.
5918 Error: z.sml 1.36-1.43.
5919 Function applied to incorrect argument.
5925 The errors arise because the argument `id` is _not_ polymorphic;
5926 rather, it is monomorphic, with type `'a -> 'a`. It is perfectly
5927 valid to apply `id` to a value of type `'a`, as in the following
5930 fun useId (id: 'a -> 'a, x: 'a) = id x (* type correct *)
5933 So, what is the difference between the type specification on `id` in
5934 the following two declarations?
5937 val id: 'a -> 'a = fn x => x
5938 fun useId (id: 'a -> 'a) = (id 13; id "foo")
5941 While the type specifications on `id` look identical, they mean
5942 different things. The difference can be made clearer by explicitly
5943 <:TypeVariableScope:scoping the type variables>.
5946 val 'a id: 'a -> 'a = fn x => x
5947 fun 'a useId (id: 'a -> 'a) = (id 13; id "foo") (* type error *)
5950 In `val 'a id`, the type variable scoping means that for any `'a`,
5951 `id` has type `'a -> 'a`. Hence, `id` can be applied to arguments of
5952 type `int`, `real`, etc. Similarly, in `fun 'a useId`, the scoping
5953 means that `useId` is a polymorphic function that for any `'a` takes a
5954 function of type `'a -> 'a` and does something. Thus, `useId` could
5955 be applied to a function of type `int -> int`, `real -> real`, etc.
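
The contrast between the two scopings can be tried out in a small sketch. The names `n`, `s`, `applyTo`, `m`, and `t` are illustrative and do not appear in the text above.

```sml
(* 'a is scoped at the val: id is polymorphic and usable at many types. *)
val 'a id: 'a -> 'a = fn x => x
val n = id 13          (* instantiates 'a to int *)
val s = id "foo"       (* instantiates 'a to string *)

(* 'a is scoped at the fun: inside the body, the argument id is
 * monomorphic at one 'a, so it may only be applied to values of
 * that same 'a. *)
fun 'a applyTo (id: 'a -> 'a, x: 'a): 'a = id x

(* Each call to applyTo picks a single instance of 'a. *)
val m = applyTo (fn i => i + 1, 41)
val t = applyTo (fn str => str ^ "!", "foo")
```

Polymorphism is available to the callers of `applyTo`, not to its body, which is why a body such as `(id 13; id "foo")` is rejected.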
5957 One could imagine an extension of SML that allowed scoping of type
5958 variables at places other than `fun` or `val` declarations, as in the
5961 fun useId (id: ('a).'a -> 'a) = (id 13; id "foo") (* not SML *)
5964 Such an extension would need to be thought through very carefully, as
5965 it could cause significant complications with <:TypeInference:>,
possibly even undecidability.
5970 :mlton-guide-page: Fixpoints
5975 This page discusses a framework that makes it possible to compute
5976 fixpoints over arbitrary products of abstract types. The code is from
5977 an Extended Basis library
5978 (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>).
5980 First the signature of the framework
5981 (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/generic/tie.sig)>):
5984 sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/generic/tie.sig 6:]
5987 `fix` is a <:TypeIndexedValues:type-indexed> function. The type-index
5988 parameter to `fix` is called a "witness". To compute fixpoints over
5989 products, one uses the +*`+ operator to combine witnesses. To provide
5990 a fixpoint combinator for an abstract type, one implements a witness
5991 providing a thunk whose instantiation allocates a fresh, mutable proxy
and a procedure for updating the proxy with the solution. Naturally,
this means that not every way of computing a fixpoint of a
particular type can be expressed in the framework. The `pure`
5995 combinator is a generalization of `tier`. The `iso` combinator is
5996 provided for reusing existing witnesses.
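
The proxy idea can be illustrated in isolation. The following is only a sketch for the special case of functions; `fixFn` is a hypothetical helper and is not part of the `Tie` signature or the library's implementation.

```sml
(* Hypothetical helper: a fixpoint combinator for functions only. *)
fun fixFn (f: ('a -> 'b) -> ('a -> 'b)): 'a -> 'b =
   let
      exception Fix
      (* Fresh mutable proxy; calling it before it is tied raises Fix. *)
      val proxy: ('a -> 'b) ref = ref (fn _ => raise Fix)
      (* Eta-expanded so that the proxy is dereferenced at call time. *)
      val viaProxy = fn x => !proxy x
      val solution = f viaProxy
   in
      (* Update the proxy with the solution, tying the knot. *)
      proxy := solution
      ; solution
   end

(* Recursion without `fun` or `val rec`: `self` goes through the proxy. *)
val fact = fixFn (fn self => fn n => if n = 0 then 1 else n * self (n - 1))
val r = fact 5
```

The restriction mentioned above shows up here as well: if `f` calls its argument while the fixpoint is still being constructed, the sketch raises `Fix`.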
5998 Note that instead of using an infix operator, we could alternatively
5999 employ an interface using <:Fold:>. Also, witnesses are eta-expanded
6000 to work around the <:ValueRestriction:value restriction>, while
6001 maintaining abstraction.
6003 Here is the implementation
6004 (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/generic/tie.sml)>):
6007 sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/generic/tie.sml 6:]
6010 Let's then take a look at a couple of additional examples.
6012 Here is a naive implementation of lazy promises:
6015 structure Promise :> sig
6017 val lazy : 'a Thunk.t -> 'a t
6018 val force : 'a t -> 'a
6023 | THUNK of 'a Thunk.t
6025 type 'a t = 'a t' Ref.t
6026 fun lazy f = ref (THUNK f)
6030 | THUNK f => (t := VALUE (f ()) handle e => t := EXN e ; force t)
6032 fun Y ? = Tie.tier (fn () => let
6033 val r = lazy (raising Fix.Fix)
6040 An example use of our naive lazy promises is to implement equally naive
6044 structure Stream :> sig
6046 val cons : 'a * 'a t -> 'a t
6047 val get : 'a t -> ('a * 'a t) Option.t
6050 datatype 'a t = IN of ('a * 'a t) Option.t Promise.t
6051 fun cons (x, xs) = IN (Promise.lazy (fn () => SOME (x, xs)))
6052 fun get (IN p) = Promise.force p
6053 fun Y ? = Tie.iso Promise.Y (fn IN p => p, IN) ?
6057 Note that above we make use of the `iso` combinator. Here is a finite
6058 representation of an infinite stream of ones:
6065 fix Y (fn ones => cons (1, ones))
6071 :mlton-guide-page: Flatten
6076 <:Flatten:> is an optimization pass for the <:SSA:>
6077 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
6081 This pass flattens arguments to <:SSA:> constructors, blocks, and
6084 If a tuple is explicitly available at all uses of a function
6085 (resp. block), then:
6087 * The formals and call sites are changed so that the components of the
6090 * The tuple is reconstructed at the beginning of the body of the
6091 function (resp. block).
6093 Similarly, if a tuple is explicitly available at all uses of a
6096 * The constructor argument datatype is changed to flatten the tuple
6099 * The tuple is passed flat at each `ConApp`.
6101 * The tuple is reconstructed at each `Case` transfer target.
6103 == Implementation ==
6105 * <!ViewGitFile(mlton,master,mlton/ssa/flatten.fun)>
6107 == Details and Notes ==
6113 :mlton-guide-page: Fold
6118 This page describes a technique that enables convenient syntax for a
6119 number of language features that are not explicitly supported by
6120 <:StandardML:Standard ML>, including: variable number of arguments,
6121 <:OptionalArguments:optional arguments and labeled arguments>,
6122 <:ArrayLiteral:array and vector literals>,
6123 <:FunctionalRecordUpdate:functional record update>,
6124 and (seemingly) dependently typed functions like <:Printf:printf> and scanf.
6126 The key idea to _fold_ is to define functions `fold`, `step0`,
6127 and `$` such that the following equation holds.
6131 fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6132 = f (hn (... (h2 (h1 a))))
The name `fold` was chosen because this is like a traditional list fold,
6136 where `a` is the _base element_, and each _step function_,
6137 `step0 hi`, corresponds to one element of the list and does one
6138 step of the fold. The name `$` is chosen to mean "end of
6139 arguments" from its common use in regular-expression syntax.
6141 Unlike the usual list fold in which the same function is used to step
6142 over each element in the list, this fold allows the step functions to
6143 be different from each other, and even to be of different types. Also
6144 unlike the usual list fold, this fold includes a "finishing
6145 function", `f`, that is applied to the result of the fold. The
6146 presence of the finishing function may seem odd because there is no
6147 analogy in list fold. However, the finishing function is essential;
6148 without it, there would be no way for the folder to perform an
6149 arbitrary computation after processing all the arguments. The
6150 examples below will make this clear.
6152 The functions `fold`, `step0`, and `$` are easy to
6161 fun fold (a, f) g = g (a, f)
6162 fun step0 h (a, f) = fold (h a, f)
6166 We've placed `fold` and `step0` in the `Fold` structure
6167 but left `$` at the toplevel because it is convenient in code to
6168 always have `$` in scope. We've also defined the identity
6169 function, `id`, at the toplevel since we use it so frequently.
6171 Plugging in the definitions, it is easy to verify the equation from
6176 fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6177 = step0 h1 (a, f) (step0 h2) ... (step0 hn) $
6178 = fold (h1 a, f) (step0 h2) ... (step0 hn) $
6179 = step0 h2 (h1 a, f) ... (step0 hn) $
6180 = fold (h2 (h1 a), f) ... (step0 hn) $
6182 = fold (hn (... (h2 (h1 a))), f) $
6183 = $ (hn (... (h2 (h1 a))), f)
6184 = f (hn (... (h2 (h1 a))))
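
The equation can also be checked on a concrete instance. In this sketch `$` is defined as `fun $ (a, f) = f a`, which is what the last step of the derivation implies; the particular `h1`, `h2`, and `f` are arbitrary choices made for illustration.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* Steps of different types: int -> int, then int -> string. *)
val h1 = fn x => x + 1
val h2 = Int.toString
val f = fn s => s ^ "!"

val lhs = Fold.fold (1, f) (Fold.step0 h1) (Fold.step0 h2) $
val rhs = f (h2 (h1 1))
```

Both sides evaluate to `"2!"`, and the accumulator changes type from `int` to `string` along the way.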
6188 == Example: variable number of arguments ==
6190 The simplest example of fold is accepting a variable number of
6191 (curried) arguments. We'll define a function `f` and argument
6192 `a` such that all of the following expressions are valid.
6200 f a a a ... a a a $ (* as many a's as we want *)
6203 Off-hand it may appear impossible that all of the above expressions
6204 are type correct SML -- how can a function `f` accept a variable
6205 number of curried arguments? What could the type of `f` be?
6206 We'll have more to say later on how type checking works. For now,
6207 once we have supplied the definitions below, you can check that the
6208 expressions are type correct by feeding them to your favorite SML
6211 It is simple to define `f` and `a`. We define `f` as a
6212 folder whose base element is `()` and whose finish function does
6213 nothing. We define `a` as the step function that does nothing.
6214 The only trickiness is that we must <:EtaExpansion:eta expand> the
6215 definition of `f` and `a` to work around the ValueRestriction;
6216 we frequently use eta expansion for this purpose without mention.
6223 val f = fn z => Fold.fold (base, finish) z
6224 val a = fn z => Fold.step0 step z
6227 One can easily apply the fold equation to verify by hand that `f`
6228 applied to any number of `a`'s evaluates to `()`.
6233 = finish (step (... (step base)))
6234 = finish (step (... ()))
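
A self-contained sketch of this example follows. The definitions of `base`, `finish`, and `step` are assumed here to be the obvious do-nothing ones described above, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val base = ()
fun finish () = ()
fun step () = ()
val f = fn z => Fold.fold (base, finish) z
val a = fn z => Fold.step0 step z

(* Every one of these type checks and evaluates to (). *)
val () = f $
val () = f a $
val () = f a a $
val () = f a a a a a $
```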
6241 == Example: variable-argument sum ==
6243 Let's look at an example that computes something: a variable-argument
6244 function `sum` and a stepper `a` such that
6248 sum (a i1) (a i2) ... (a im) $ = i1 + i2 + ... + im
6251 The idea is simple -- the folder starts with a base accumulator of
6252 `0` and the stepper adds each element to the accumulator, `s`,
6253 which the folder simply returns at the end.
6257 val sum = fn z => Fold.fold (0, fn s => s) z
6258 fun a i = Fold.step0 (fn s => i + s)
6261 Using the fold equation, one can verify the following.
6265 sum (a 1) (a 2) (a 3) $ = 6
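
The following sketch packages the pieces so that the equation can be run directly; the bindings `total` and `empty` are illustrative names, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val sum = fn z => Fold.fold (0, fn s => s) z
fun a i = Fold.step0 (fn s => i + s)

val total = sum (a 1) (a 2) (a 3) $   (* 1 + 2 + 3 *)
val empty = sum $                     (* no steps: the base accumulator *)
```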
6271 It is sometimes syntactically convenient to omit the parentheses
6272 around the steps in a fold. This is easily done by defining a new
6273 function, `step1`, as follows.
6280 fun step1 h (a, f) b = fold (h (b, a), f)
6284 From the definition of `step1`, we have the following
6289 fold (a, f) (step1 h) b
6291 = fold (h (b, a), f)
6294 Using the above equivalence, we can compute the following equation for
6299 fold (a, f) (step1 h1) b1 (step1 h2) b2 ... (step1 hn) bn $
6300 = fold (h1 (b1, a), f) (step1 h2) b2 ... (step1 hn) bn $
6301 = fold (h2 (b2, h1 (b1, a)), f) ... (step1 hn) bn $
6302 = fold (hn (bn, ... (h2 (b2, h1 (b1, a)))), f) $
6303 = f (hn (bn, ... (h2 (b2, h1 (b1, a)))))
6306 Here is an example using `step1` to define a variable-argument
6307 product function, `prod`, with a convenient syntax.
6311 val prod = fn z => Fold.fold (1, fn p => p) z
6312 val ` = fn z => Fold.step1 (fn (i, p) => i * p) z
6315 The functions `prod` and +`+ satisfy the following equation.
6318 prod `i1 `i2 ... `im $ = i1 * i2 * ... * im
6321 Note that in SML, +`i1+ is two different tokens, +`+ and
6322 `i1`. We often use +`+ for an instance of a `step1` function
6323 because of its syntactic unobtrusiveness and because no space is
6324 required to separate it from an alphanumeric token.
Also note that there are no parentheses around the steps. That is,
6327 the following expression is not the same as the above one (in fact, it
6328 is not type correct).
6332 prod (`i1) (`i2) ... (`im) $
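
As a runnable sketch of the `step1` version, the block below repeats the definitions it needs; `p` and `unit1` are illustrative names, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

val prod = fn z => Fold.fold (1, fn p => p) z
val ` = fn z => Fold.step1 (fn (i, p) => i * p) z

val p = prod `2 `3 `4 $     (* 2 * 3 * 4 *)
val unit1 = prod $          (* empty product: the base accumulator *)
```

Each +`+ first receives the pending accumulator and finisher from the folder and only then consumes the argument that follows it, which is why the steps must not be parenthesized.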
6336 == Example: list literals ==
6338 SML already has a syntax for list literals, e.g. `[w, x, y, z]`.
6339 However, using fold, we can define our own syntax.
6343 val list = fn z => Fold.fold ([], rev) z
6344 val ` = fn z => Fold.step1 (op ::) z
6347 The idea is that the folder starts out with the empty list, the steps
6348 accumulate the elements into a list, and then the finishing function
6349 reverses the list at the end.
6351 With these definitions one can write a list like:
6358 While the example is not practically useful, it does demonstrate the
6359 need for the finishing function to be incorporated in `fold`.
6360 Without a finishing function, every use of `list` would need to be
6361 wrapped in `rev`, as follows.
6365 rev (list `w `x `y `z $)
6368 The finishing function allows us to incorporate the reversal into the
definition of `list`, and to treat `list` as a truly
variable-argument function, performing an arbitrary computation after receiving
6371 all of its arguments.
6373 See <:ArrayLiteral:> for a similar use of `fold` that provides a
6374 syntax for array and vector literals, which are not built in to SML.
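
The list-literal folder can be exercised with the following sketch. It repeats the `Fold` definitions so that it stands alone, assumes `$` is `fun $ (a, f) = f a`, and uses the illustrative names `xs`, `ys`, and `rawOrder`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

val list = fn z => Fold.fold ([], rev) z
val ` = fn z => Fold.step1 (op ::) z

val xs = list `1 `2 `3 $
val ys = list `"w" `"x" `"y" `"z" $

(* Without the finisher, the accumulated list comes out reversed. *)
val rawOrder = Fold.fold ([], fn l => l) `1 `2 `3 $
```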
6379 Just as `fold` is analogous to a fold left, in which the functions
6380 are applied to the accumulator left-to-right, we can define a variant
6381 of `fold` that is analogous to a fold right, in which the
6382 functions are applied to the accumulator right-to-left. That is, we
6383 can define functions `foldr` and `step0` such that the
6384 following equation holds.
6388 foldr (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6389 = f (h1 (h2 (... (hn a))))
6392 The implementation of fold right is easy, using fold. The idea is for
6393 the fold to start with `f` and for each step to precompose the
6394 next `hi`. Then, the finisher applies the composed function to
6395 the base value, `a`. Here is the code.
6401 fun foldr (a, f) = Fold.fold (f, fn g => g a)
6402 fun step0 h = Fold.step0 (fn g => g o h)
6406 Verifying the fold-right equation is straightforward, using the
6411 foldr (a, f) (Foldr.step0 h1) (Foldr.step0 h2) ... (Foldr.step0 hn) $
6412 = fold (f, fn g => g a)
6413 (Fold.step0 (fn g => g o h1))
6414 (Fold.step0 (fn g => g o h2))
6416 (Fold.step0 (fn g => g o hn)) $
6418 ((fn g => g o hn) (... ((fn g => g o h2) ((fn g => g o h1) f))))
6420 ((fn g => g o hn) (... ((fn g => g o h2) (f o h1))))
6421 = (fn g => g a) ((fn g => g o hn) (... (f o h1 o h2)))
6422 = (fn g => g a) (f o h1 o h2 o ... o hn)
6423 = (f o h1 o h2 o ... o hn) a
6424 = f (h1 (h2 (... (hn a))))
6427 One can also define the fold-right analogue of `step1`.
6434 fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a)))
6439 == Example: list literals via fold right ==
6441 Revisiting the list literal example from earlier, we can use fold
6442 right to define a syntax for list literals that doesn't do a reversal.
6446 val list = fn z => Foldr.foldr ([], fn l => l) z
6447 val ` = fn z => Foldr.step1 (op ::) z
6450 As before, with these definitions, one can write a list like:
6457 The difference between the fold-left and fold-right approaches is that
6458 the fold-right approach does not have to reverse the list at the end,
6459 since it accumulates the elements in the correct order. In practice,
6460 MLton will simplify away all of the intermediate function composition,
so the fold-right approach will be more efficient.
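
Here is a self-contained sketch of the fold-right list literal. It assumes `$` is `fun $ (a, f) = f a`, and the binding `xs` is an illustrative name.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end
structure Foldr =
   struct
      (* The accumulator is a composed function; the finisher applies it. *)
      fun foldr (a, f) = Fold.fold (f, fn g => g a)
      fun step0 h = Fold.step0 (fn g => g o h)
      fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a)))
   end

val list = fn z => Foldr.foldr ([], fn l => l) z
val ` = fn z => Foldr.step1 (op ::) z

(* Builds (cons 1 o cons 2 o cons 3) [] with no call to rev. *)
val xs = list `1 `2 `3 $
```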
6464 == Mixing steppers ==
6466 All of the examples so far have used the same step function throughout
6467 a fold. This need not be the case. For example, consider the
6472 val n = fn z => Fold.fold (0, fn i => i) z
val I = fn z => Fold.step0 (fn i => i * 2 + 1) z
val O = fn z => Fold.step0 (fn i => i * 2) z
6477 Here we have one folder, `n`, that can be used with two different
6478 steppers, `I` and `O`. By using the fold equation, one can
6479 verify the following equations.
6490 That is, we've defined a syntax for writing binary integer constants.
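
A runnable sketch of the binary syntax follows. In it, `I` appends a one bit and `O` appends a zero bit; `$` is assumed to be `fun $ (a, f) = f a`, and `five` and `fourteen` are illustrative names.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val n = fn z => Fold.fold (0, fn i => i) z
val I = fn z => Fold.step0 (fn i => i * 2 + 1) z   (* append bit 1 *)
val O = fn z => Fold.step0 (fn i => i * 2) z       (* append bit 0 *)

val five = n I O I $        (* binary 101 *)
val fourteen = n I I I O $  (* binary 1110 *)
```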
6492 Not only can one use different instances of `step0` in the same
6493 fold, one can also intermix uses of `step0` and `step1`. For
6494 example, consider the following.
6498 val n = fn z => Fold.fold (0, fn i => i) z
val O = fn z => Fold.step0 (fn n => n * 8) z
6500 val ` = fn z => Fold.step1 (fn (i, n) => n * 8 + i) z
6503 Using the straightforward generalization of the fold equation to mixed
6504 steppers, one can verify the following equations.
6513 That is, we've defined a syntax for writing octal integer constants,
6514 with a special syntax, `O`, for the zero digit (admittedly
6515 contrived, since one could just write +`0+ instead of `O`).
6517 See <:NumericLiteral:> for a practical extension of this approach that
6518 supports numeric constants in any base and of any type.
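
The mixed use of `step0` and `step1` can be checked with the sketch below, which assumes `$` is `fun $ (a, f) = f a`; the bindings `x` and `y` are illustrative names.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

val n = fn z => Fold.fold (0, fn i => i) z
val O = fn z => Fold.step0 (fn n => n * 8) z            (* the zero digit *)
val ` = fn z => Fold.step1 (fn (i, n) => n * 8 + i) z   (* any other digit *)

val x = n `1 `7 O $    (* octal 170 *)
val y = n `1 O `3 $    (* octal 103 *)
```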
6521 == (Seemingly) dependent types ==
6523 A normal list fold always returns the same type no matter what
6524 elements are in the list or how long the list is. Variable-argument
6525 fold is more powerful, because the result type can vary based both on
6526 the arguments that are passed and on their number. This can provide
6527 the illusion of dependent types.
6529 For example, consider the following.
6533 val f = fn z => Fold.fold ((), id) z
6534 val a = fn z => Fold.step0 (fn () => "hello") z
6535 val b = fn z => Fold.step0 (fn () => 13) z
6536 val c = fn z => Fold.step0 (fn () => (1, 2)) z
6539 Using the fold equation, one can verify the following equations.
6543 f a $ = "hello": string
6545 f c $ = (1, 2): int * int
6548 That is, `f` returns a value of a different type depending on
6549 whether it is applied to argument `a`, argument `b`, or
6552 The following example shows how the type of a fold can depend on the
6553 number of arguments.
6557 val grow = fn z => Fold.fold ([], fn l => l) z
6558 val a = fn z => Fold.step0 (fn x => [x]) z
6561 Using the fold equation, one can verify the following equations.
6565 grow $ = []: 'a list
6566 grow a $ = [[]]: 'a list list
6567 grow a a $ = [[[]]]: 'a list list list
Clearly, the result type of a call to the variable-argument `grow`
6571 function depends on the number of arguments that are passed.
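
Both effects can be observed in one sketch. To keep the two examples in a single block, the `grow` stepper is renamed `wrap` here (the text calls it `a`), and the element type is pinned to `int` by annotation so that the top-level bindings do not run into the value restriction; `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* The result type depends on which argument is passed. *)
val f = fn z => Fold.fold ((), id) z
val a = fn z => Fold.step0 (fn () => "hello") z
val b = fn z => Fold.step0 (fn () => 13) z
val c = fn z => Fold.step0 (fn () => (1, 2)) z
val s: string = f a $
val i: int = f b $
val p: int * int = f c $

(* The result type depends on how many arguments are passed. *)
val grow = fn z => Fold.fold ([], fn l => l) z
val wrap = fn z => Fold.step0 (fn x => [x]) z
val g0: int list = grow $
val g1: int list list = grow wrap $
val g2: int list list list = grow wrap wrap $
```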
6573 As a reminder, this is well-typed SML. You can check it out in any
6577 == (Seemingly) dependently-typed functional results ==
6579 Fold is especially useful when it returns a curried function whose
6580 arity depends on the number of arguments. For example, consider the
6585 val makeSum = fn z => Fold.fold (id, fn f => f 0) z
6586 val I = fn z => Fold.step0 (fn f => fn i => fn x => f (x + i)) z
6589 The `makeSum` folder constructs a function whose arity depends on
6590 the number of `I` arguments and that adds together all of its
6591 arguments. For example,
6592 `makeSum I $` is of type `int -> int` and
6593 `makeSum I I $` is of type `int -> int -> int`.
One can use the fold equation to verify that `makeSum` works
6596 correctly. For example, one can easily check by hand the following
6602 makeSum I I $ 1 2 = 3
6603 makeSum I I I $ 1 2 3 = 6
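
The following sketch makes the varying arity visible through type annotations. It assumes `$` is `fun $ (a, f) = f a`, and the names `sum1`, `sum2`, and `sum3` are illustrative.

```sml
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val makeSum = fn z => Fold.fold (id, fn f => f 0) z
val I = fn z => Fold.step0 (fn f => fn i => fn x => f (x + i)) z

(* One curried int argument per I. *)
val sum1: int -> int = makeSum I $
val sum2: int -> int -> int = makeSum I I $
val sum3: int -> int -> int -> int = makeSum I I I $
```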
6606 Returning a function becomes especially interesting when there are
6607 steppers of different types. For example, the following `makeSum`
6608 folder constructs functions that sum integers and reals.
6612 val makeSum = fn z => Foldr.foldr (id, fn f => f 0.0) z
6613 val I = fn z => Foldr.step0 (fn f => fn x => fn i => f (x + real i)) z
6614 val R = fn z => Foldr.step0 (fn f => fn x: real => fn r => f (x + r)) z
6617 With these definitions, `makeSum I R $` is of type
6618 `int -> real -> real` and `makeSum R I I $` is of type
6619 `real -> int -> int -> real`. One can use the foldr equation to
6620 check the following equations.
6625 makeSum I R $ 1 2.5 = 3.5
6626 makeSum R I I $ 1.5 2 3 = 6.5
6629 We used `foldr` instead of `fold` for this so that the order
6630 in which the specifiers `I` and `R` appear is the same as the
6631 order in which the arguments appear. Had we used `fold`, things
6632 would have been reversed.
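
A self-contained sketch of the mixed integer and real version is given below. It assumes `$` is `fun $ (a, f) = f a`; the names `s1`, `s2`, and the helper `close` (a tolerance comparison, since `real` is not an equality type) are illustrative.

```sml
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end
structure Foldr =
   struct
      fun foldr (a, f) = Fold.fold (f, fn g => g a)
      fun step0 h = Fold.step0 (fn g => g o h)
   end

val makeSum = fn z => Foldr.foldr (id, fn f => f 0.0) z
val I = fn z => Foldr.step0 (fn f => fn x => fn i => f (x + real i)) z
val R = fn z => Foldr.step0 (fn f => fn x: real => fn r => f (x + r)) z

(* The specifiers appear in the same order as the arguments. *)
val s1: real = makeSum I R $ 1 2.5
val s2: real = makeSum R I I $ 1.5 2 3

(* Hypothetical helper for comparing reals up to a small tolerance. *)
fun close (x: real, y: real) = Real.abs (x - y) < 1E~9
```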
6634 An extension of this idea is sufficient to define <:Printf:>-like
6638 == An idiom for combining steps ==
6640 It is sometimes useful to combine a number of steps together and name
6641 them as a single step. As a simple example, suppose that one often
sees an integer followed by a real in the `makeSum` example above.
6643 One can define a new _compound step_ `IR` as follows.
6647 val IR = fn u => Fold.fold u I R
6650 With this definition in place, one can verify the following.
6654 makeSum IR IR $ 1 2.2 3 4.4 = 10.6
6657 In general, one can combine steps `s1`, `s2`, ... `sn` as
6661 fn u => Fold.fold u s1 s2 ... sn
6664 The following calculation shows why a compound step behaves as the
6665 composition of its constituent steps.
6669 fold u (fn u => fold u s1 s2 ... sn)
6670 = (fn u => fold u s1 s2 ... sn) u
6671 = fold u s1 s2 ... sn
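
The same idiom works for any steppers. As a smaller sketch that needs only `Fold`, the block below combines two binary-digit steppers into a compound step; the name `IO` is hypothetical, `I` appends a one bit, `O` appends a zero bit, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val n = fn z => Fold.fold (0, fn i => i) z
val I = fn z => Fold.step0 (fn i => i * 2 + 1) z
val O = fn z => Fold.step0 (fn i => i * 2) z

(* Compound step: behaves like I followed by O. *)
val IO = fn u => Fold.fold u I O

val viaCompound = n IO IO $    (* binary 1010 *)
val viaPlain = n I O I O $
```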
6675 == Post composition ==
6677 Suppose we already have a function defined via fold,
6678 `w = fold (a, f)`, and we would like to construct a new fold
6679 function that is like `w`, but applies `g` to the result
6680 produced by `w`. This is similar to function composition, but we
6681 can't just do `g o w`, because we don't want to use `g` until
6682 `w` has been applied to all of its arguments and received the
6683 end-of-arguments terminator `$`.
6685 More precisely, we want to define a post-composition function
6686 `post` that satisfies the following equation.
6690 post (w, g) s1 ... sn $ = g (w s1 ... sn $)
6693 Here is the definition of `post`.
6700 fun post (w, g) s = w (fn (a, h) => s (a, g o h))
6704 The following calculations show that `post` satisfies the desired
6705 equation, where `w = fold (a, f)`.
6710 = w (fn (a, h) => s (a, g o h))
6711 = fold (a, f) (fn (a, h) => s (a, g o h))
6712 = (fn (a, h) => s (a, g o h)) (a, f)
6717 Now, suppose `si = step0 hi` for `i` from `1` to `n`.
6721 post (w, g) s1 s2 ... sn $
6722 = fold (a, g o f) s1 s2 ... sn $
6723 = (g o f) (hn (... (h1 a)))
6724 = g (f (hn (... (h1 a))))
6725 = g (fold (a, f) s1 ... sn $)
6729 For a practical example of post composition, see <:ArrayLiteral:>.
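
As a small sketch of the `post` equation, the block below post-composes `length` onto the list-literal folder. The folder name `len` and the bindings `k` and `xs` are illustrative, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
   end

val list = fn z => Fold.fold ([], rev) z
val ` = fn z => Fold.step1 (op ::) z

(* Like list, but applies length to the finished list. *)
val len = fn z => Fold.post (list, length) z

val k = len `10 `20 `30 $
val xs = list `10 `20 `30 $
```

Here `k` equals `length xs`, as the equation `post (w, g) s1 ... sn $ = g (w s1 ... sn $)` predicts.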
6734 We now define a peculiar-looking function, `lift0`, that is,
6735 equationally speaking, equivalent to the identity function on a step
6740 fun lift0 s (a, f) = fold (fold (a, id) s $, f)
6743 Using the definitions, we can prove the following equation.
6747 fold (a, f) (lift0 (step0 h)) = fold (a, f) (step0 h)
6754 fold (a, f) (lift0 (step0 h))
6755 = lift0 (step0 h) (a, f)
6756 = fold (fold (a, id) (step0 h) $, f)
6757 = fold (step0 h (a, id) $, f)
6758 = fold (fold (h a, id) $, f)
6759 = fold ($ (h a, id), f)
6760 = fold (id (h a), f)
6763 = fold (a, f) (step0 h)
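
The equation can be spot-checked on one instance. In this sketch the accumulator, finisher, and step are arbitrary illustrative choices, and `$` is assumed to be `fun $ (a, f) = f a`.

```sml
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      (* Runs the step in a nested fold whose finisher is id. *)
      fun lift0 s (a, f) = fold (fold (a, id) s $, f)
   end

val inc = fn x => x + 1
val plain = Fold.fold (1, Int.toString) (Fold.step0 inc) $
val lifted = Fold.fold (1, Int.toString) (Fold.lift0 (Fold.step0 inc)) $
```

Both `plain` and `lifted` evaluate to `"2"`.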
6766 If `lift0` is the identity, then why even define it? The answer
6767 lies in the typing of fold expressions, which we have, until now, left
6773 Perhaps the most surprising aspect of fold is that it can be checked
6774 by the SML type system. The types involved in fold expressions are
6775 complex; fortunately type inference is able to deduce them.
6776 Nevertheless, it is instructive to study the types of fold functions
6777 and steppers. More importantly, it is essential to understand the
6778 typing aspects of fold in order to write down signatures of functions
6779 defined using fold and step.
6781 Here is the `FOLD` signature, and a recapitulation of the entire
6782 `Fold` structure, with additional type annotations.
6788 type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6789 type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6790 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6791 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6792 type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6793 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6795 val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t
6796 val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0
6797 -> ('a1, 'a2, 'b, 'c, 'd) step0
6798 val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2)
6799 -> ('a, 'b, 'c2, 'd) t
6800 val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0
6801 val step1: ('a11 * 'a12 -> 'a2)
6802 -> ('a11, 'a12, 'a2, 'b, 'c, 'd) step1
6805 structure Fold:> FOLD =
6807 type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6809 type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6811 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6812 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6814 type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6815 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6817 fun fold (a: 'a, f: 'b -> 'c)
6818 (g: ('a, 'b, 'c, 'd) step): 'd =
6821 fun step0 (h: 'a1 -> 'a2)
6822 (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6825 fun step1 (h: 'a11 * 'a12 -> 'a2)
6826 (a12: 'a12, f: 'b -> 'c)
6827 (a11: 'a11): ('a2, 'b, 'c, 'd) t =
6828 fold (h (a11, a12), f)
6830 fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0)
6831 (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6832 fold (fold (a, id) s $, f)
6834 fun post (w: ('a, 'b, 'c1, 'd) t,
6836 (s: ('a, 'b, 'c2, 'd) step): 'd =
6837 w (fn (a, h) => s (a, g o h))
6841 That's a lot to swallow, so let's walk through it one step at a time.
6842 First, we have the definition of type `Fold.step`.
6846 type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6849 As a fold proceeds over its arguments, it maintains two things: the
6850 accumulator, of type `'a`, and the finishing function, of type
6851 `'b -> 'c`. Each step in the fold is a function that takes those
two pieces (i.e. `'a * ('b -> 'c)`) and does something to them
6853 (i.e. produces `'d`). The result type of the step is completely
6854 left open to be filled in by type inference, as it is an arrow type
6855 that is capable of consuming the rest of the arguments to the fold.
6857 A folder, of type `Fold.t`, is a function that consumes a single
6862 type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6865 Expanding out the type, we have:
6869 type ('a, 'b, 'c, 'd) t = ('a * ('b -> 'c) -> 'd) -> 'd
6872 This shows that the only thing a folder does is to hand its
6873 accumulator (`'a`) and finisher (`'b -> 'c`) to the next step
6874 (`'a * ('b -> 'c) -> 'd`). If SML had <:FirstClassPolymorphism:first-class polymorphism>,
6875 we would write the fold type as follows.
6879 type ('a, 'b, 'c) t = Forall 'd . ('a, 'b, 'c, 'd) step -> 'd
This type definition shows that a folder has nothing to do with
the rest of the fold; it deals only with the next step.
6885 We now can understand the type of `fold`, which takes the initial
6886 value of the accumulator and the finishing function, and constructs a
6887 folder, i.e. a function awaiting the next step.
6891 val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t
6892 fun fold (a: 'a, f: 'b -> 'c)
6893 (g: ('a, 'b, 'c, 'd) step): 'd =
6897 Continuing on, we have the type of step functions.
6901 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6902 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6905 Expanding out the type a bit gives:
6909 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6910 'a1 * ('b -> 'c) -> ('a2, 'b, 'c, 'd) t
6913 So, a step function takes the accumulator (`'a1`) and finishing
6914 function (`'b -> 'c`), which will be passed to it by the previous
6915 folder, and transforms them to a new folder. This new folder has a
6916 new accumulator (`'a2`) and the same finishing function.
6918 Again, imagining that SML had <:FirstClassPolymorphism:first-class polymorphism> makes the type
6923 type ('a1, 'a2) step0 =
6924 Forall ('b, 'c) . ('a1, 'b, 'c, ('a2, 'b, 'c) t) step
6927 Thus, in essence, a `step0` function is a wrapper around a
6928 function of type `'a1 -> 'a2`, which is exactly what the
6929 definition of `step0` does.
6933 val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0
6934 fun step0 (h: 'a1 -> 'a2)
6935 (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6939 It is not much beyond `step0` to understand `step1`.
6943 type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6944 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6947 A `step1` function takes the accumulator (`'a12`) and finisher
6948 (`'b -> 'c`) passed to it by the previous folder and transforms
6949 them into a function that consumes the next argument (`'a11`) and
6950 produces a folder that will continue the fold with a new accumulator
6951 (`'a2`) and the same finisher.
6955 fun step1 (h: 'a11 * 'a12 -> 'a2)
6956 (a12: 'a12, f: 'b -> 'c)
6957 (a11: 'a11): ('a2, 'b, 'c, 'd) t =
6958 fold (h (a11, a12), f)
6961 With <:FirstClassPolymorphism:first-class polymorphism>, a `step1` function is more clearly
6962 seen as a wrapper around a binary function of type
6963 `'a11 * 'a12 -> 'a2`.
6967 type ('a11, 'a12, 'a2) step1 =
6968 Forall ('b, 'c) . ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c) t) step
6971 The type of `post` is clear: it takes a folder with a finishing
6972 function that produces type `'c1`, and a function of type
6973 `'c1 -> 'c2` to postcompose onto the folder. It returns a new
6974 folder with a finishing function that produces type `'c2`.
6978 val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2)
6979 -> ('a, 'b, 'c2, 'd) t
6980 fun post (w: ('a, 'b, 'c1, 'd) t,
6982 (s: ('a, 'b, 'c2, 'd) step): 'd =
6983 w (fn (a, h) => s (a, g o h))
6986 We will return to `lift0` after an example.
6989 == An example typing ==
6991 Let's type check our simplest example, a variable-argument fold.
6992 Recall that we have a folder `f` and a stepper `a` defined as
6997 val f = fn z => Fold.fold ((), fn () => ()) z
6998 val a = fn z => Fold.step0 (fn () => ()) z
7001 Since the accumulator and finisher are uninteresting, we'll use some
7002 abbreviations to simplify things.
7006 type 'd step = (unit, unit, unit, 'd) Fold.step
7007 type 'd fold = 'd step -> 'd
7010 With these abbreviations, `f` and `a` have the following polymorphic
7019 Suppose we want to type check
7026 As a reminder, the fully parenthesized expression is
7032 The observation that we will use repeatedly is that for any type
7033 `z`, if `f: z fold` and `s: z step`, then `f s: z`.
7049 Applying the observation again, we must have
7053 f a a: unit fold fold
7057 Applying the observation two more times leads to the following type
7062 f: unit fold fold fold fold a: unit fold fold fold step
7063 f a: unit fold fold fold a: unit fold fold step
7064 f a a: unit fold fold a: unit fold step
7065 f a a a: unit fold $: unit step
7069 So, each application is a fold that consumes the next step, producing
7070 a fold whose type has one fewer level of `fold` nesting.
7072 One can expand some of the type definitions in `f` to see that it is
7073 indeed a function that takes four curried arguments, each one a step
7078 f: unit fold fold fold step
7079 -> unit fold fold step
7085 This example shows why we must eta expand uses of `fold` and `step0`
7086 to work around the value restriction and make folders and steppers
7087 polymorphic. The type of a fold function like `f` depends on the
7088 number of arguments, and so will vary from use to use. Similarly,
7089 each occurrence of an argument like `a` has a different type,
7090 depending on the number of remaining arguments.
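The point about polymorphism can be checked with a short sketch. It reuses the `f`, `a`, and type abbreviations from this section; the name `partial` is an assumption introduced only to hold an intermediate result.

```sml
fun $ (a, f) = f a

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* The abbreviations used in the derivation above, written out. *)
type 'd step = unit * (unit -> unit) -> 'd
type 'd fold = 'd step -> 'd

(* Eta expansion makes both bindings syntactic values, hence polymorphic. *)
val f = fn z => Fold.fold ((), fn () => ()) z
val a = fn z => Fold.step0 (fn () => ()) z

(* The same f and a are used at a different type in each line. *)
val () = f $
val () = f a $
val () = f a a a $

(* After three steps only the terminator remains: f a a a: unit fold. *)
val partial: unit fold = f a a a
val () = partial $
```

Without the eta expansion, the value restriction would make `f` and `a` monomorphic, and the three uses above could not all type check.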
7092 This example also shows that the type of a folder, when fully
7093 expanded, is exponential in the number of arguments: there are as many
7094 nested occurrences of the `fold` type constructor as there are
7095 arguments, and each occurrence duplicates its type argument. One can
7096 observe this exponential behavior in a type checker that doesn't share
7097 enough of the representation of types (e.g. one that represents types
7098 as trees rather than directed acyclic graphs).
7100 Generalizing this type derivation to uses of fold where the
7101 accumulator and finisher are more interesting is straightforward. One
7102 simply includes the type of the accumulator, which may change, for
7103 each step, and the type of the finisher, which doesn't change from
7109 The lack of <:FirstClassPolymorphism:first-class polymorphism> in SML
7110 causes problems if one wants to use a step in a first-class way.
7111 Consider the following `double` function, which takes a step, `s`, and
7112 produces a composite step that does `s` twice.
7116 fun double s = fn u => Fold.fold u s s
7119 The definition of `double` is not type correct. The problem is that
7120 the type of a step depends on the number of remaining arguments, but
7121 the parameter `s` is not polymorphic, and so cannot be used in
7122 two different positions.
7124 Fortunately, we can define a function, `lift0`, that takes a monotyped
7125 step function and _lifts_ it into a polymorphic step function. This
7126 is apparent in the type of `lift0`.
7130 val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0
7131 -> ('a1, 'a2, 'b, 'c, 'd) step0
7132 fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0)
7133 (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
7134 fold (fold (a, id) s $, f)
7137 The following definition of `double` uses `lift0`, appropriately eta
7138 wrapped, to fix the problem.
7144 val s = fn z => Fold.lift0 s z
7146 fn u => Fold.fold u s s
7150 With that definition of `double` in place, we can use it as in the
7155 val f = fn z => Fold.fold ((), fn () => ()) z
7156 val a = fn z => Fold.step0 (fn () => ()) z
7157 val a2 = fn z => double a z
7158 val () = f a a2 a a2 $
7161 Of course, we must eta wrap the call to `double` in order to use its
7162 result, which is a step function, polymorphically.
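The following sketch puts the pieces together with an accumulator that makes the effect of `double` observable. The counting folder `count` and the steppers `inc` and `inc2` are assumptions made for the example; `lift0` and `double` follow the definitions in this section, with types omitted.

```sml
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      (* Run the monotyped step to completion, then restart the fold. *)
      fun lift0 s (a, f) = fold (fold (a, id) s $, f)
   end

fun double s =
   let
      (* The lifted, eta-wrapped step is let-bound and so polymorphic. *)
      val s = fn z => Fold.lift0 s z
   in
      fn u => Fold.fold u s s
   end

(* A folder that counts how many times a step has run. *)
val count = fn z => Fold.fold (0, fn n => n) z
val inc = fn z => Fold.step0 (fn n => n + 1) z
val inc2 = fn z => double inc z

(* 1 + 2 + 1 + 2 steps in total. *)
val six = count inc inc2 inc inc2 $
```

Note that `double` as written requires the step to leave the accumulator type unchanged, since the lifted step is applied twice in a row.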
7165 == Hiding the type of the accumulator ==
7167 For clarity and to avoid mistakes, it can be useful to hide the type
7168 of the accumulator in a fold. Reworking the simple variable-argument
7169 example to do this leads to the following.
7176 val f: (ac, ac, unit, 'd) Fold.t
7177 val s: (ac, ac, 'b, 'c, 'd) Fold.step0
7181 val f = fn z => Fold.fold ((), fn () => ()) z
7182 val s = fn z => Fold.step0 (fn () => ()) z
7186 The idea is to name the accumulator type and use opaque signature
7187 matching to make it abstract. This can prevent improper manipulation
7188 of the accumulator by client code and ensure invariants that the
7189 folder and stepper would like to maintain.
7191 For a practical example of this technique, see <:ArrayLiteral:>.
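A self-contained sketch of the same idea, with an accumulator that actually carries information, might look as follows. The signature name `COUNTER`, the structure name `Counter`, and the counting behavior are assumptions made for the example; the `Fold` types are the ones defined earlier on this page.

```sml
fun $ (a, f) = f a

structure Fold =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a1, 'a2, 'b, 'c, 'd) step0 =
         ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

signature COUNTER =
   sig
      type ac   (* abstract: clients cannot see that it is an int *)
      val f: (ac, ac, int, 'd) Fold.t
      val s: (ac, ac, 'b, 'c, 'd) Fold.step0
   end

(* Opaque matching (:>) hides the representation of ac. *)
structure Counter:> COUNTER =
   struct
      type ac = int
      val f = fn z => Fold.fold (0, fn n => n) z
      val s = fn z => Fold.step0 (fn n => n + 1) z
   end

val three = Counter.f Counter.s Counter.s Counter.s $
```

Because `ac` is abstract, client code can only thread the accumulator through `Counter.s`; it cannot, for example, pass its own step that resets the count.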
7196 Fold has a number of practical applications. Here are some of them.
7200 * <:FunctionalRecordUpdate:>
7201 * <:NumericLiteral:>
7202 * <:OptionalArguments:>
7204 * <:VariableArityPolymorphism:>
7206 There are a number of related techniques. Here are some of them.
7209 * <:TypeIndexedValues:>
7213 :mlton-guide-page: Fold01N
7218 A common use pattern of <:Fold:> is to define a variable-arity
7219 function that combines multiple arguments together using a binary
7220 function. It is slightly tricky to do this directly using fold,
7221 because of the special treatment required for the case of zero or one
7222 argument. Here is a structure, `Fold01N`, that solves the problem
7223 once and for all, and eases the definition of such functions.
7229 fun fold {finish, start, zero} =
7230 Fold.fold ((id, finish, fn () => zero, start),
7231 fn (finish, _, p, _) => finish (p ()))
7233 fun step0 {combine, input} =
7234 Fold.step0 (fn (_, finish, _, f) => (finish, finish, fn () => f input,
7238 fn x' => combine (f input, x')))
7240 fun step1 {combine} z input =
7241 step0 {combine = combine, input = input} z
7245 If one has a value `zero`, and functions `start`, `c`, and `finish`,
7246 then one can define a variable-arity function `f` and stepper
7247 +`+ as follows.
7250 val f = fn z => Fold01N.fold {finish = finish, start = start, zero = zero} z
7251 val ` = fn z => Fold01N.step1 {combine = c} z
7254 One can then use the fold equation to prove the following equations.
7258 f `a1 $ = finish (start a1)
7259 f `a1 `a2 $ = finish (c (start a1, a2))
7260 f `a1 `a2 `a3 $ = finish (c (c (start a1, a2), a3))
7264 For an example of `Fold01N`, see <:VariableArityPolymorphism:>.
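As a concrete instance of these equations, consider a variable-arity function that joins strings with a separator. The name `join`, the bracket and comma formatting, and the choice of `"[]"` for `zero` are assumptions made for the example. The accumulator tuple built in `step0` is written out in full here; it is inferred from the equations above and from the types given in the next section.

```sml
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

structure Fold01N =
   struct
      fun fold {finish, start, zero} =
         Fold.fold ((id, finish, fn () => zero, start),
                    fn (finish, _, p, _) => finish (p ()))

      fun step0 {combine, input} =
         Fold.step0 (fn (_, finish, _, f) =>
                     (finish,
                      finish,
                      fn () => f input,
                      fn x' => combine (f input, x')))

      fun step1 {combine} z input =
         step0 {combine = combine, input = input} z
   end

val join =
   fn z => Fold01N.fold {finish = fn s => "[" ^ s ^ "]",
                         start = fn s => s,
                         zero = "[]"} z
val ` = fn z => Fold01N.step1 {combine = fn (acc, s) => acc ^ ", " ^ s} z

val none = join $                   (* zero *)
val one = join `"a" $               (* finish (start "a") *)
val three = join `"a" `"b" `"c" $   (* finish (c (c (start "a", "b"), "c")) *)
```

The zero-argument case returns `zero` directly, without passing through `finish`, which is why `zero` is given here in already-bracketed form.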
7267 == Typing Fold01N ==
7269 Here is the signature for `Fold01N`. We use a trick to avoid having
7270 to duplicate the definition of some rather complex types in both the
7271 signature and the structure. We first define the types in a
7272 structure. Then, we define them via type re-definitions in the
7273 signature, and via `open` in the full structure.
7278 type ('input, 'accum1, 'accum2, 'answer, 'zero,
7279 'a, 'b, 'c, 'd, 'e) t =
7281 * ('accum2 -> 'answer)
7283 * ('input -> 'accum1),
7284 ('a -> 'b) * 'c * (unit -> 'a) * 'd,
7288 type ('input1, 'accum1, 'input2, 'accum2,
7289 'a, 'b, 'c, 'd, 'e, 'f) step0 =
7290 ('a * 'b * 'c * ('input1 -> 'accum1),
7291 'b * 'b * (unit -> 'accum1) * ('input2 -> 'accum2),
7292 'd, 'e, 'f) Fold.step0
7294 type ('accum1, 'input, 'accum2,
7295 'a, 'b, 'c, 'd, 'e, 'f, 'g) step1 =
7297 'b * 'c * 'd * ('a -> 'accum1),
7298 'c * 'c * (unit -> 'accum1) * ('input -> 'accum2),
7299 'e, 'f, 'g) Fold.step1
7302 signature FOLD_01N =
7304 type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) t =
7305 ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.t
7306 type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step0 =
7307 ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step0
7308 type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step1 =
7309 ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step1
7312 {finish: 'accum2 -> 'answer,
7313 start: 'input -> 'accum1,
7315 -> ('input, 'accum1, 'accum2, 'answer, 'zero,
7316 'a, 'b, 'c, 'd, 'e) t
7319 {combine: 'accum1 * 'input2 -> 'accum2,
7321 -> ('input1, 'accum1, 'input2, 'accum2,
7322 'a, 'b, 'c, 'd, 'e, 'f) step0
7325 {combine: 'accum1 * 'input -> 'accum2}
7326 -> ('accum1, 'input, 'accum2,
7327 'a, 'b, 'c, 'd, 'e, 'f, 'g) step1
7330 structure Fold01N: FOLD_01N =
7334 fun fold {finish, start, zero} =
7335 Fold.fold ((id, finish, fn () => zero, start),
7336 fn (finish, _, p, _) => finish (p ()))
7338 fun step0 {combine, input} =
7339 Fold.step0 (fn (_, finish, _, f) => (finish, finish, fn () => f input,
7343 fn x' => combine (f input, x')))
7345 fun step1 {combine} z input =
7346 step0 {combine = combine, input = input} z
7352 :mlton-guide-page: ForeignFunctionInterface
7353 [[ForeignFunctionInterface]]
7354 ForeignFunctionInterface
7355 ========================
7357 MLton's foreign function interface (FFI) extends Standard ML and makes
7358 it easy to take the address of C global objects, access C global
7359 variables, call from SML to C, and call from C to SML. MLton also
7360 provides <:MLNLFFI:ML-NLFFI>, which is a higher-level FFI for calling
7361 C functions and manipulating C data from SML.
7364 * <:ForeignFunctionInterfaceTypes:Foreign Function Interface Types>
7365 * <:ForeignFunctionInterfaceSyntax:Foreign Function Interface Syntax>
7367 == Importing Code into SML ==
7368 * <:CallingFromSMLToC:Calling From SML To C>
7369 * <:CallingFromSMLToCFunctionPointer:Calling From SML To C Function Pointer>
7371 == Exporting Code from SML ==
7372 * <:CallingFromCToSML:Calling From C To SML>
7374 == Building System Libraries ==
7375 * <:LibrarySupport:Library Support>
7379 :mlton-guide-page: ForeignFunctionInterfaceSyntax
7380 [[ForeignFunctionInterfaceSyntax]]
7381 ForeignFunctionInterfaceSyntax
7382 ==============================
7384 MLton extends the syntax of SML with expressions that enable a
7385 <:ForeignFunctionInterface:> to C. The following description of the
7386 syntax uses some abbreviations.
7390 | C base type | _cBaseTy_ | <:ForeignFunctionInterfaceTypes: Foreign Function Interface types>
7391 | C argument type | _cArgTy_ | _cBaseTy_~1~ `*` ... `*` _cBaseTy_~n~ or `unit`
7392 | C return type | _cRetTy_ | _cBaseTy_ or `unit`
7393 | C function type | _cFuncTy_ | _cArgTy_ `->` _cRetTy_
7394 | C pointer type | _cPtrTy_ | `MLton.Pointer.t`
7397 The type annotation and the semicolon are not optional in the syntax
7398 of <:ForeignFunctionInterface:> expressions. However, the type is
7399 lexed, parsed, and elaborated as an SML type, so any type (including
7400 type abbreviations) may be used, so long as it elaborates to a type of
7407 _address "CFunctionOrVariableName" attr... : cPtrTy;
7410 Denotes the address of the C function or variable.
7412 `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7414 * `external` : import with external symbol scope (see <:LibrarySupport:>) (default).
7415 * `private` : import with private symbol scope (see <:LibrarySupport:>).
7416 * `public` : import with public symbol scope (see <:LibrarySupport:>).
7418 See <:MLtonPointer: MLtonPointer> for functions that manipulate C pointers.
7424 _symbol "CVariableName" attr... : (unit -> cBaseTy) * (cBaseTy -> unit);
7427 Denotes the _getter_ and _setter_ for a C variable. The __cBaseTy__s must be identical.
7430 `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7432 * `alloc` : allocate storage (and export a symbol) for the C variable.
7433 * `external` : import or export with external symbol scope (see <:LibrarySupport:>) (default if not `alloc`).
7434 * `private` : import or export with private symbol scope (see <:LibrarySupport:>).
7435 * `public` : import or export with public symbol scope (see <:LibrarySupport:>) (default if `alloc`).
7439 _symbol * : cPtrTy -> (unit -> cBaseTy) * (cBaseTy -> unit);
7442 Denotes the _getter_ and _setter_ for a C pointer to a variable.
7443 The __cBaseTy__s must be identical.
7449 _import "CFunctionName" attr... : cFuncTy;
7452 Denotes an SML function whose behavior is implemented by calling the C
7453 function. See <:CallingFromSMLToC: Calling from SML to C> for more details.
7456 `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7458 * `cdecl` : call with the `cdecl` calling convention (default).
7459 * `external` : import with external symbol scope (see <:LibrarySupport:>) (default).
7460 * `impure`: assert that the function depends upon state and/or performs side effects (default).
7461 * `private` : import with private symbol scope (see <:LibrarySupport:>).
7462 * `public` : import with public symbol scope (see <:LibrarySupport:>).
7463 * `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>)
7464 * `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function.
7465 * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
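As a hedged sketch of how an `_import` expression might be used, suppose a C file (here called `add.c`) defines a function `add_ints`. Both names are hypothetical and are not part of MLton or of this guide. The example also assumes that FFI syntax has been enabled, for instance with the `allowFFI` annotation.

```sml
(* Assumption: add.c, compiled and linked with this program, contains
 *
 *    #include <stdint.h>
 *    int32_t add_ints (int32_t a, int32_t b) { return a + b; }
 *
 * The file name and the function name are hypothetical. *)

(* Int32.int corresponds to int32_t in the type mapping; `public` is used
 * because the symbol is defined in the same executable. *)
val addInts = _import "add_ints" public: Int32.int * Int32.int -> Int32.int;

val () = print (Int32.toString (addInts (2, 3)) ^ "\n")
```

Under these assumptions, a build along the lines of `mlton -default-ann 'allowFFI true' main.sml add.c` (with `main.sml` a hypothetical file holding the SML code) should link the two parts together. Since `add_ints` has no side effects, it would also be a reasonable candidate for the `pure` attribute.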
7469 _import * attr... : cPtrTy -> cFuncTy;
7472 Denotes an SML function whose behavior is implemented by calling a C
7473 function through a C function pointer.
7475 `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7477 * `cdecl` : call with the `cdecl` calling convention (default).
7478 * `impure`: assert that the function depends upon state and/or performs side effects (default).
7479 * `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>)
7480 * `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function.
7481 * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
7484 <:CallingFromSMLToCFunctionPointer: Calling from SML to C function pointer>
7491 _export "CFunctionName" attr... : cFuncTy -> unit;
7494 Exports a C function with the name `CFunctionName` that can be used to
7495 call an SML function of the type _cFuncTy_. When the function denoted
7496 by the export expression is applied to an SML function `f`, subsequent
7497 C calls to `CFunctionName` will call `f`. It is an error to call
7498 `CFunctionName` before the export has been applied. The export may be
7499 applied more than once, with each application replacing any previous
7500 definition of `CFunctionName`.
7502 `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7504 * `cdecl` : call with the `cdecl` calling convention (default).
7505 * `private` : export with private symbol scope (see <:LibrarySupport:>).
7506 * `public` : export with public symbol scope (see <:LibrarySupport:>) (default).
7507 * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
7509 See <:CallingFromCToSML: Calling from C to SML> for more details.
7513 :mlton-guide-page: ForeignFunctionInterfaceTypes
7514 [[ForeignFunctionInterfaceTypes]]
7515 ForeignFunctionInterfaceTypes
7516 =============================
7518 MLton's <:ForeignFunctionInterface:> only allows values of certain SML
7519 types to be passed between SML and C. The following types are
7520 allowed: `bool`, `char`, `int`, `real`, `word`. All of the different
7521 sizes of (fixed-sized) integers, reals, and words are supported as
7522 well: `Int8.int`, `Int16.int`, `Int32.int`, `Int64.int`,
7523 `Real32.real`, `Real64.real`, `Word8.word`, `Word16.word`,
7524 `Word32.word`, `Word64.word`. There is a special type,
7525 `MLton.Pointer.t`, for passing C pointers -- see <:MLtonPointer:> for details.
7528 Arrays, refs, and vectors of the above types are also allowed.
7529 Because in MLton monomorphic arrays and vectors are exactly the same
7530 as their polymorphic counterparts, these are also allowed. Hence,
7531 `string`, `char vector`, and `CharVector.vector` are also allowed.
7532 Strings are not null terminated, unless you manually do so from the
7535 Unfortunately, passing tuples or datatypes is not allowed because that
7536 would interfere with representation optimizations.
7538 The C header file that `-export-header` generates includes
7539 ++typedef++s for the C types corresponding to the SML types. Here is
7540 the mapping between SML types and C types.
7544 | SML type | C typedef | C type | Note
7545 | `array` | `Pointer` | `unsigned char *` |
7546 | `bool` | `Bool` | `int32_t` |
7547 | `char` | `Char8` | `uint8_t` |
7548 | `Int8.int` | `Int8` | `int8_t` |
7549 | `Int16.int` | `Int16` | `int16_t` |
7550 | `Int32.int` | `Int32` | `int32_t` |
7551 | `Int64.int` | `Int64` | `int64_t` |
7552 | `int` | `Int32` | `int32_t` | <:#Default:(default)>
7553 | `MLton.Pointer.t` | `Pointer` | `unsigned char *` |
7554 | `Real32.real` | `Real32` | `float` |
7555 | `Real64.real` | `Real64` | `double` |
7556 | `real` | `Real64` | `double` | <:#Default:(default)>
7557 | `ref` | `Pointer` | `unsigned char *` |
7558 | `string` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)>
7559 | `vector` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)>
7560 | `Word8.word` | `Word8` | `uint8_t` |
7561 | `Word16.word` | `Word16` | `uint16_t` |
7562 | `Word32.word` | `Word32` | `uint32_t` |
7563 | `Word64.word` | `Word64` | `uint64_t` |
7564 | `word` | `Word32` | `uint32_t` | <:#Default:(default)>
7567 <!Anchor(Default)>Note (default): The default `int`, `real`, and
7568 `word` types may be set by the ++-default-type __type__++
7569 <:CompileTimeOptions: compiler option>. The given C typedef and C
7570 types correspond to the default behavior.
7572 <!Anchor(ReadOnly)>Note (read only): Because MLton assumes that
7573 vectors and strings are read-only (and will perform optimizations
7574 that, for instance, cause them to share space), you must not modify
7575 the data pointed to by the `unsigned char *` in C code.
7577 Although the C type of an array, ref, or vector is always `Pointer`,
7578 in reality, the object has the natural C representation. Your C code
7579 should cast to the appropriate C type if you want to keep the C
7580 compiler from complaining.
7582 When calling an <:CallingFromSMLToC: imported C function from SML>
7583 that returns an array, ref, or vector result or when calling an
7584 <:CallingFromCToSML: exported SML function from C> that takes an
7585 array, ref, or string argument, then the object must be an ML object
7586 allocated on the ML heap. (Although an array, ref, or vector object
7587 has the natural C representation, the object also has an additional
7588 header used by the SML runtime system.)
7590 In addition, there is an <:MLBasis:> file, `$(SML_LIB)/basis/c-types.mlb`,
7591 which provides structure aliases for various C types:
7594 | C type | Structure | Signature
7595 | `char` | `C_Char` | `INTEGER`
7596 | `signed char` | `C_SChar` | `INTEGER`
7597 | `unsigned char` | `C_UChar` | `WORD`
7598 | `short` | `C_Short` | `INTEGER`
7599 | `signed short` | `C_SShort` | `INTEGER`
7600 | `unsigned short` | `C_UShort` | `WORD`
7601 | `int` | `C_Int` | `INTEGER`
7602 | `signed int` | `C_SInt` | `INTEGER`
7603 | `unsigned int` | `C_UInt` | `WORD`
7604 | `long` | `C_Long` | `INTEGER`
7605 | `signed long` | `C_SLong` | `INTEGER`
7606 | `unsigned long` | `C_ULong` | `WORD`
7607 | `long long` | `C_LongLong` | `INTEGER`
7608 | `signed long long` | `C_SLongLong` | `INTEGER`
7609 | `unsigned long long` | `C_ULongLong` | `WORD`
7610 | `float` | `C_Float` | `REAL`
7611 | `double` | `C_Double` | `REAL`
7612 | `size_t` | `C_Size` | `WORD`
7613 | `ptrdiff_t` | `C_Ptrdiff` | `INTEGER`
7614 | `intmax_t` | `C_Intmax` | `INTEGER`
7615 | `uintmax_t` | `C_UIntmax` | `WORD`
7616 | `intptr_t` | `C_Intptr` | `INTEGER`
7617 | `uintptr_t` | `C_UIntptr` | `WORD`
7618 | `void *` | `C_Pointer` | `WORD`
7621 These aliases depend on the configuration of the C compiler for the
7622 target architecture, and are independent of the configuration of MLton
7623 (including the ++-default-type __type__++
7624 <:CompileTimeOptions: compiler option>).
7628 :mlton-guide-page: ForLoops
7633 A `for`-loop is typically used to iterate over a range of consecutive
7634 integers that denote indices of some sort. For example, in <:OCaml:>
7635 a `for`-loop takes either the form
7637 for <name> = <lower> to <upper> do <body> done
7641 for <name> = <upper> downto <lower> do <body> done
7644 Some languages provide considerably more flexible `for`-loop or
7645 `foreach`-constructs.
7647 A bit surprisingly, <:StandardML:Standard ML> provides special syntax
7648 for `while`-loops, but not for `for`-loops. Indeed, in SML, many uses
7649 of `for`-loops are better expressed using `app`, `foldl`/`foldr`,
7650 `map` and many other higher-order functions provided by the
7651 <:BasisLibrary:Basis Library> for manipulating lists, vectors and
7652 arrays. However, the Basis Library does not provide a function for
7653 iterating over a range of integer values. Fortunately, it is very
7657 == A fairly simple design ==
7659 The following implementation imitates both the syntax and semantics of
7660 the OCaml `for`-loop.
7664 datatype for = to of int * int
7665 | downto of int * int
7671 (fn f => let fun loop lo = if lo > up then ()
7672 else (f lo; loop (lo+1))
7675 (fn f => let fun loop up = if up < lo then ()
7676 else (f up; loop (up-1))
7685 (fn i => print (Int.toString i))
7688 would print `123456789` and
7693 (fn i => print (Int.toString i))
7696 would print `987654321`.
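For reference, a self-contained sketch of this inclusive design is given below. The helper `digits`, which collects the visited indices into a string instead of printing them, is an assumption added so that the results can be inspected.

```sml
datatype for = to of int * int
             | downto of int * int

infix to downto

(* Both bounds are inclusive, as in the OCaml for-loop. *)
val for =
   fn lo to up =>
      (fn f => let fun loop lo = if lo > up then ()
                                 else (f lo; loop (lo+1))
               in loop lo end)
    | up downto lo =>
      (fn f => let fun loop up = if up < lo then ()
                                 else (f up; loop (up-1))
               in loop up end)

(* Hypothetical helper: concatenate the decimal form of each index. *)
fun digits range =
   let
      val buf = ref ""
   in
      for range (fn i => buf := !buf ^ Int.toString i)
      ; !buf
   end

val ascending = digits (1 to 9)
val descending = digits (9 downto 1)
val empty = digits (5 to 4)
```

Here `to` and `downto` are ordinary infix datatype constructors, so `for` is just a function defined by pattern matching on the range.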
7698 Straightforward formatting of nested loops
7709 is fairly readable, but tends to cause the body of the loop to be
7710 indented quite deeply.
7715 The above design has an annoying feature. In practice, the upper
7716 bound of the iterated range is almost always excluded and most loops
7717 would subtract one from the upper bound:
7722 for (n-1 downto 0) ...
7725 It is probably better to break convention and exclude the upper bound
7726 by default, because it leads to more concise code and becomes
7727 idiomatic with very little practice. The iterator combinators
7728 described below exclude the upper bound by default.
7731 == Iterator combinators ==
7733 While the simple `for`-function described in the previous section is
7734 probably good enough for many uses, it is a bit cumbersome when one
7735 needs to iterate over a Cartesian product. One might also want to
7736 iterate over more than just consecutive integers. It turns out that
7737 one can provide a library of iterator combinators that allow one to
7738 implement iterators more flexibly.
7740 Since the types of the combinators may be a bit difficult to infer
7741 from their implementations, let's first take a look at a signature of
7742 the iterator combinator library:
7748 type 'a t = ('a -> unit) -> unit
7750 val return : 'a -> 'a t
7751 val >>= : 'a t * ('a -> 'b t) -> 'b t
7755 val to : int * int -> int t
7756 val downto : int * int -> int t
7758 val inList : 'a list -> 'a t
7759 val inVector : 'a vector -> 'a t
7760 val inArray : 'a array -> 'a t
7762 val using : ('a, 'b) StringCvt.reader -> 'b -> 'a t
7764 val when : 'a t * ('a -> bool) -> 'a t
7765 val by : 'a t * ('a -> 'b) -> 'b t
7766 val @@ : 'a t * 'a t -> 'a t
7767 val ** : 'a t * 'b t -> ('a, 'b) product t
7773 Several of the above combinators are meant to be used as infix
7774 operators. Here is a set of suitable infix declarations:
7783 A few notes are in order:
7785 * The `'a t` type constructor with the `return` and `>>=` operators forms a monad.
7787 * The `to` and `downto` combinators will omit the upper bound of the range.
7789 * `for` is the identity function. It is purely for syntactic sugar and is not strictly required.
7791 * The `@@` combinator produces an iterator for the concatenation of the given iterators.
7793 * The `**` combinator produces an iterator for the Cartesian product of the given iterators.
7794 ** See <:ProductType:> for the type constructor `('a, 'b) product` used in the type of the iterator produced by `**`.
7796 * The `using` combinator allows one to iterate over slices, streams and many other kinds of sequences.
7798 * `when` is the filtering combinator. The name `when` is inspired by <:OCaml:>'s guard clauses.
7800 * `by` is the mapping combinator.
7802 The implementation of the `ITER`-signature below makes use of the
7803 following basic combinators:
7808 fun flip f x y = f y x
7810 fun opt fno fso = fn NONE => fno () | SOME ? => fso ?
7814 Here is an implementation of the `ITER`-signature:
7818 structure Iter :> ITER =
7820 type 'a t = ('a -> unit) -> unit
7823 fun (iA >>= a2iB) f = iA (flip a2iB f)
7827 fun (l to u) f = let fun `l = if l<u then (f l; `(l+1)) else () in `l end
7828 fun (u downto l) f = let fun `u = if u>l then (f (u-1); `(u-1)) else () in `u end
7830 fun inList ? = flip List.app ?
7831 fun inVector ? = flip Vector.app ?
7832 fun inArray ? = flip Array.app ?
7834 fun using get s f = let fun `s = opt (const ()) (fn (x, s) => (f x; `s)) (get s) in `s end
7836 fun (iA when p) f = iA (fn a => if p a then f a else ())
7837 fun (iA by g) f = iA (f o g)
7838 fun (iA @@ iB) f = (iA f : unit; iB f)
7839 fun (iA ** iB) f = iA (fn a => iB (fn b => f (a & b)))
7845 Note that some of the above combinators (e.g. `**`) could be expressed
7846 in terms of the other combinators, most notably `return` and `>>=`.
7847 Another implementation issue worth mentioning is that `downto` is
7848 written specifically to avoid computing `l-1`, which could cause an
7851 To use the above combinators the `Iter`-structure needs to be opened
7858 and one usually also wants to declare the infix status of the
7859 operators as shown earlier.
7861 Here is an example that illustrates some of the features:
7865 for (0 to 10 when (fn x => x mod 3 <> 0) ** inList ["a", "b"] ** 2 downto 1 by real)
7867 print ("("^Int.toString x^", \""^y^"\", "^Real.toString z^")\n"))
7870 Using the `Iter` combinators one can easily produce more complicated
7871 iterators. For example, here is an iterator over a "triangle":
7875 fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j)))
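To experiment with these combinators without printing, one can collect the iterated values into a list. The sketch below repeats a subset of the combinators without the signature. The helper `toList` and the particular infix precedences are assumptions made for the example, as is the local definition of the product type.

```sml
datatype ('a, 'b) product = & of 'a * 'b

(* Assumed fixities: ranges bind tightest, then filters and maps. *)
infix 2 to downto
infix 1 when by
infix 0 >>= ** &

fun flip f x y = f y x

structure Iter =
   struct
      fun return x f = f x
      fun (iA >>= a2iB) f = iA (flip a2iB f)
      (* Both range combinators exclude the upper bound. *)
      fun (l to u) f =
         let fun loop l = if l < u then (f l; loop (l+1)) else ()
         in loop l end
      fun (u downto l) f =
         let fun loop u = if u > l then (f (u-1); loop (u-1)) else ()
         in loop u end
      fun inList xs = flip List.app xs
      fun (iA when p) f = iA (fn a => if p a then f a else ())
      fun (iA by g) f = iA (f o g)
      fun (iA ** iB) f = iA (fn a => iB (fn b => f (a & b)))
   end

open Iter

(* Hypothetical helper: run an iterator and collect what it visits. *)
fun toList iter =
   let
      val acc = ref []
   in
      iter (fn x => acc := x :: !acc)
      ; rev (!acc)
   end

fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j)))

val pairs = toList (triangle (0, 3))
val tens = toList (0 to 7 when (fn x => x mod 2 = 0) by (fn x => x * 10))
val countdown = toList (3 downto 0)
val product = toList (0 to 2 ** inList ["a", "b"])
```

Note that `3 downto 0` visits `2`, `1`, `0`: the upper bound is excluded in both directions.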
7880 :mlton-guide-page: FrontEnd
7885 <:FrontEnd:> is a translation pass from source to the <:AST:>
7886 <:IntermediateLanguage:>.
7890 This pass performs lexing and parsing to produce an abstract syntax
7893 == Implementation ==
7895 * <!ViewGitFile(mlton,master,mlton/front-end/front-end.sig)>
7896 * <!ViewGitFile(mlton,master,mlton/front-end/front-end.fun)>
7898 == Details and Notes ==
7900 The lexer is produced by <:MLLex:> from
7901 <!ViewGitFile(mlton,master,mlton/front-end/ml.lex)>.
7903 The parser is produced by <:MLYacc:> from
7904 <!ViewGitFile(mlton,master,mlton/front-end/ml.grm)>.
7906 The specifications for the lexer and parser were originally taken from
7907 <:SMLNJ: SML/NJ> (version 109.32), but have been heavily modified
7912 :mlton-guide-page: FSharp
7917 http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/[F#]
7918 is a functional programming language developed at Microsoft Research.
7919 F# was partly inspired by the <:OCaml:OCaml> language and shares some
7920 common core constructs with it. F# is integrated with Visual Studio
7921 2010 as a first-class language.
7925 :mlton-guide-page: FunctionalRecordUpdate
7926 [[FunctionalRecordUpdate]]
7927 FunctionalRecordUpdate
7928 ======================
7930 Functional record update is the copying of a record while replacing
7931 the values of some of the fields. <:StandardML:Standard ML> does not
7932 have explicit syntax for functional record update. We will show below
7933 how to implement functional record update in SML, with a little
7936 As an example, the functional update of the record
7940 {a = 13, b = 14, c = 15}
7943 with `c = 16` yields a new record
7947 {a = 13, b = 14, c = 16}
7950 Functional record update also makes sense with multiple simultaneous
7951 updates. For example, the functional update of the record above with
7952 `a = 18, c = 19` yields a new record
7956 {a = 18, b = 14, c = 19}
7960 One could easily imagine an extension of SML that supports
7961 functional record update. For example
7965 e with {a = 16, b = 17}
7968 would create a copy of the record denoted by `e` with field `a`
7969 replaced with `16` and `b` replaced with `17`.
7971 Since there is no such syntax in SML, we now show how to implement
7972 functional record update directly. We first give a simple
7973 implementation that has a number of problems. We then give an
7974 advanced implementation, that, while complex underneath, is a reusable
7975 library that admits simple use.
7978 == Simple implementation ==
7980 To support functional record update on the record type
7984 {a: 'a, b: 'b, c: 'c}
7987 first, define an update function for each component.
7991 fun withA ({a = _, b, c}, a) = {a = a, b = b, c = c}
7992 fun withB ({a, b = _, c}, b) = {a = a, b = b, c = c}
7993 fun withC ({a, b, c = _}, c) = {a = a, b = b, c = c}
7996 Then, one can express `e with {a = 16, b = 17}` as
8000 withB (withA (e, 16), 17)
8007 infix withA withB withC
8010 the syntax is almost as concise as a language extension.
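Put together, the simple approach can be tried out as follows. The record value `e` and the result names are illustrative.

```sml
fun withA ({a = _, b, c}, a) = {a = a, b = b, c = c}
fun withB ({a, b = _, c}, b) = {a = a, b = b, c = c}
fun withC ({a, b, c = _}, c) = {a = a, b = b, c = c}

(* Default fixity: precedence 0, left associative. *)
infix withA withB withC

val e = {a = 13, b = 14, c = 15}

(* Parsed as (e withA 16) withB 17. *)
val e' = e withA 16 withB 17

(* An update may also change the type of a field. *)
val e'' = e withC "fifteen"
```

Each `with` function rebuilds the whole record, which is where the quadratic amount of boilerplate discussed next comes from.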
8017 This approach suffers from the fact that the amount of boilerplate
8018 code is quadratic in the number of record fields. Furthermore,
8019 changing, adding, or deleting a field requires time proportional to
8020 the number of fields (because each ++with__<L>__++ function must be
8021 changed). It is also annoying to have to define a ++with__<L>__++
8022 function, possibly with a fixity declaration, for each field.
8024 Fortunately, there is a solution to these problems.
8027 == Advanced implementation ==
8029 Using <:Fold:> one can define a family of ++makeUpdate__<N>__++
8030 functions and a single _update_ operator `set` so that one can define a
8031 functional record update function for any record type simply by
8032 specifying a (trivial) isomorphism between that type and function
8033 argument list. For example, suppose that we would like to do
8034 functional record update on records with fields `a` and `b`. Then one
8035 defines a function `updateAB` as follows.
8042 fun from v1 v2 = {a = v1, b = v2}
8043 fun to f {a = v1, b = v2} = f v1 v2
8045 makeUpdate2 (from, from, to)
8050 The functions `from` (think _from function arguments_) and `to` (think
8051 _to function arguments_) specify an isomorphism between `a`,`b`
8052 records and function arguments. There is a second use of `from` to
8053 work around the lack of
8054 <:FirstClassPolymorphism:first-class polymorphism> in SML.
8056 With the definition of `updateAB` in place, the following expressions
8061 updateAB {a = 13, b = "hello"} (set#b "goodbye") $
8062 updateAB {a = 13.5, b = true} (set#b false) (set#a 12.5) $
8065 As another example, suppose that we would like to do functional record
8066 update on records with fields `b`, `c`, and `d`. Then one defines a
8067 function `updateBCD` as follows.
8074 fun from v1 v2 v3 = {b = v1, c = v2, d = v3}
8075 fun to f {b = v1, c = v2, d = v3} = f v1 v2 v3
8077 makeUpdate3 (from, from, to)
8082 With the definition of `updateBCD` in place, the following expression
8087 updateBCD {b = 1, c = 2, d = 3} (set#c 4) (set#c 5) $
8090 Note that not all fields need be updated and that the same field may
8091 be updated multiple times. Further note that the same `set` operator
8092 is used for all update functions (in the above, for both `updateAB`
8095 In general, to define a functional-record-update function on records
8096 with fields `f1`, `f2`, ..., `fN`, use the following template.
8103 fun from v1 v2 ... vn = {f1 = v1, f2 = v2, ..., fn = vn}
8104 fun to f {f1 = v1, f2 = v2, ..., fn = vn} = f v1 v2 ... vn
8106 makeUpdateN (from, from, to)
8111 With this, one can update a record as follows.
8115 update {f1 = v1, ..., fn = vn} (set#fi1 vi1) ... (set#fim vim) $
8119 == The `FunctionalRecordUpdate` structure ==
8121 Here is the implementation of functional record update.
8125 structure FunctionalRecordUpdate =
8128 fun next g (f, z) x = g (f x, z)
8129 fun f1 (f, z) x = f (z x)
8130 fun f2 z = next f1 z
8131 fun f3 z = next f2 z
8134 fun c1 from = c0 from f1
8135 fun c2 from = c1 from f2
8136 fun c3 from = c2 from f3
8138 fun makeUpdate cX (from, from', to) record =
8140 fun ops () = cX from'
8141 fun vars f = to f record
8143 Fold.fold ((vars, ops), fn (vars, _) => vars from)
8146 fun makeUpdate0 z = makeUpdate c0 z
8147 fun makeUpdate1 z = makeUpdate c1 z
8148 fun makeUpdate2 z = makeUpdate c2 z
8149 fun makeUpdate3 z = makeUpdate c3 z
8151 fun upd z = Fold.step2 (fn (s, f, (vars, ops)) => (fn out => vars (s (ops ()) (out, f)), ops)) z
8152 fun set z = Fold.step2 (fn (s, v, (vars, ops)) => (fn out => vars (s (ops ()) (out, fn _ => v)), ops)) z
8157 The idea of `makeUpdate` is to build a record of functions which can
8158 replace the contents of one argument out of a list of arguments. The
8159 functions ++f__<X>__++ replace the 0th, 1st, ... argument with their
8160 argument `z`. The ++c__<X>__++ functions pass the first __X__ `f`
8161 functions to the record constructor.
8163 The `#field` notation of Standard ML allows us to select the map
8164 function which replaces the corresponding argument. By converting the
8165 record to an argument list, feeding that list through the selected map
8166 function and piping the list into the record constructor, functional
8167 record update is achieved.
8172 With MLton, the efficiency of this approach is as good as one would
8173 expect with the special syntax. Namely a sequence of updates will be
8174 optimized into a single record construction that copies the unchanged
8175 fields and fills in the changed fields with their new values.
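
To make that claim concrete, the following sketch shows the kind of direct record construction one would write by hand for the `updateBCD` example above. It only illustrates the intended result and is not output taken from the compiler; the names `r` and `r'` are introduced here for the example.

[source,sml]
----
val r = {b = 1, c = 2, d = 3}

(* Hand-written equivalent of
 *   updateBCD r (set#c 4) (set#c 5) $
 * The unchanged fields b and d are copied, and c takes the value of
 * the last update. *)
val r' = {b = #b r, c = 5, d = #d r}
----
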
8177 Before Sep 14, 2009, this page advocated an alternative implementation
8178 of <:FunctionalRecordUpdate:>. However, the old structure caused
8179 exponentially increasing compile times. We advise you to switch to
8185 Functional record update can be used to implement labelled
8186 <:OptionalArguments:optional arguments>.
8190 :mlton-guide-page: fxp
8195 http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/[fxp] is an XML
8196 parser written in Standard ML.
8199 http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/mlton.html[patch]
8200 to compile with MLton.
8204 :mlton-guide-page: GarbageCollection
8205 [[GarbageCollection]]
8209 For a good introduction and overview to garbage collection, see
8212 MLton's garbage collector uses copying, mark-compact, and generational
8213 collection, automatically switching between them at run time based on
8214 the amount of live data relative to the amount of RAM. The runtime
8215 system tries to keep the heap within RAM if at all possible.
8217 MLton's copying collector is a simple, two-space, breadth-first,
8218 Cheney-style collector. The design for the generational and
8219 mark-compact GC is based on <!Cite(Sansom91)>.
8223 * http://www.mlton.org/pipermail/mlton/2002-May/012420.html
8225 object layout and header word design
8233 :mlton-guide-page: GenerativeDatatype
8234 [[GenerativeDatatype]]
8238 In <:StandardML:Standard ML>, datatype declarations are said to be
8239 _generative_, because each time a datatype declaration is evaluated,
8240 it yields a new type. Thus, any attempt to mix the types will lead to
8241 a type error at compile-time. The following program, which does not
8242 type check, demonstrates this.
8252 val _: S1.t -> S2.t = fn x => x
8255 Generativity also means that two different datatype declarations
8256 define different types, even if they define identical constructors.
8257 The following program does not type check due to this.
8265 val _ = if true then a1 else a2
8270 * <:GenerativeException:>
8274 :mlton-guide-page: GenerativeException
8275 [[GenerativeException]]
8279 In <:StandardML:Standard ML>, exception declarations are said to be
8280 _generative_, because each time an exception declaration is evaluated,
8281 it yields a new exception.
8283 The following program demonstrates the generativity of exceptions.
8289 fun isE1 (e: exn): bool =
8295 fun isE2 (e: exn): bool =
8299 fun pb (b: bool): unit =
8300 print (concat [Bool.toString b, "\n"])
8301 val () = (pb (isE1 e1)
8307 In the above program, two different exception declarations declare an
8308 exception `E` and a corresponding function that returns `true` only on
8309 that exception. Although declared by syntactically identical
8310 exception declarations, `e1` and `e2` are different exceptions. The
8311 program, when run, prints `true`, `false`, `false`, `true`.
8313 A slight modification of the above program shows that even a single
8314 exception declaration yields a new exception each time it is
8319 fun f (): exn * (exn -> bool) =
8323 (E, fn E => true | _ => false)
8325 val (e1, isE1) = f ()
8326 val (e2, isE2) = f ()
8327 fun pb (b: bool): unit =
8328 print (concat [Bool.toString b, "\n"])
8329 val () = (pb (isE1 e1)
8335 Each call to `f` yields a new exception and a function that returns
8336 `true` only on that exception. The program, when run, prints `true`,
8337 `false`, `false`, `true`.
8342 Exception generativity is required for type safety. Consider the
8343 following valid SML program.
8347 fun f (): ('a -> exn) * (exn -> 'a) =
8351 (E, fn E x => x | _ => raise Fail "f")
8353 fun cast (a: 'a): 'b =
8355 val (make: 'a -> exn, _) = f ()
8356 val (_, get: exn -> 'b) = f ()
8360 val _ = ((cast 13): int -> int) 14
8363 If exceptions weren't generative, then each call `f ()` would yield
8364 the same exception constructor `E`. Then, our `cast` function could
8365 use `make: 'a -> exn` to convert any value into an exception and then
8366 `get: exn -> 'b` to convert that exception to a value of arbitrary
8367 type. If `cast` worked, then we could cast an integer as a function
8368 and apply it. Of course, because of generative exceptions, this program
8374 The `exn` type is effectively a <:UniversalType:universal type>.
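
A minimal sketch of how generativity makes `exn` usable as a universal type is given below. The function name `embed` and the bindings that follow are hypothetical names chosen for this illustration. The construction mirrors the function `f` from the type-safety example above, except that the projection returns an `option` instead of raising `Fail`.

[source,sml]
----
(* Each call to embed evaluates the exception declaration anew, so the
 * injection/projection pair it returns only matches its own values. *)
fun embed (): ('a -> exn) * (exn -> 'a option) =
   let
      exception E of 'a
   in
      (E, fn E x => SOME x | _ => NONE)
   end

val (intToExn, exnToInt) = embed (): (int -> exn) * (exn -> int option)
val (strToExn, exnToStr) = embed (): (string -> exn) * (exn -> string option)

(* Projecting with the matching pair recovers the value. *)
val roundTrip = exnToInt (intToExn 13)

(* Projecting a value injected by a different pair yields NONE,
 * because the two calls to embed created different exceptions. *)
val mismatch = exnToInt (strToExn "hi")
----

Here `roundTrip` is `SOME 13` and `mismatch` is `NONE`, which is exactly the behavior that keeps the `cast` function above from succeeding.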
8379 * <:GenerativeDatatype:>
8383 :mlton-guide-page: Git
8388 http://git-scm.com/[Git] is a distributed version control system. The
8389 MLton project currently uses Git to maintain its
8390 <:Sources:source code>.
8392 Here are some online Git resources.
8394 * http://git-scm.com/docs[Reference Manual]
8395 * http://git-scm.com/book[ProGit, by Scott Chacon]
8399 :mlton-guide-page: Glade
8404 http://glade.gnome.org/features.html[Glade] is a tool for generating
8405 Gtk user interfaces.
8407 <:WesleyTerpstra:> is working on a Glade->mGTK converter.
8409 * http://www.mlton.org/pipermail/mlton/2004-December/016865.html
8413 :mlton-guide-page: Globalize
8418 <:Globalize:> is an analysis pass for the <:SXML:>
8419 <:IntermediateLanguage:>, invoked from <:ClosureConvert:>.
8423 This pass marks values that are constant, allowing <:ClosureConvert:>
8424 to move them out to the top level so they are only evaluated once and
8425 do not appear in closures.
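
As a rough source-level picture of the effect (the pass itself works on <:SXML:>, not on source programs), consider the sketch below. The function names and the list are made up for illustration, and whether a particular value is actually marked depends on the analysis.

[source,sml]
----
(* Before: the constant list is bound inside makeLookup, so, read
 * naively, it would be rebuilt on every call and captured by the
 * closure that is returned. *)
fun makeLookup () =
   let
      val defaults = [10, 20, 30]
   in
      fn i => List.nth (defaults, i)
   end

(* After (conceptually): the constant is bound once at the top level
 * and the returned closure no longer needs to capture it. *)
val defaults' = [10, 20, 30]
fun makeLookup' () = fn i => List.nth (defaults', i)
----
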
8427 == Implementation ==
8429 * <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.sig)>
8430 * <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.fun)>
8432 == Details and Notes ==
8438 :mlton-guide-page: GnuMP
8443 The http://gmplib.org[GnuMP] library (GNU Multiple Precision
8444 arithmetic library) is a library for arbitrary precision integer
8445 arithmetic. MLton uses the GnuMP library to implement the
8446 <:BasisLibrary: Basis Library> `IntInf` module.
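
For readers unfamiliar with `IntInf`, here is a small sketch of the kind of computation that ends up in the GnuMP library. It uses only standard Basis Library functions; the particular numbers are arbitrary.

[source,sml]
----
(* 10^1000 does not fit in any fixed-size integer type. *)
val big: IntInf.int = IntInf.pow (IntInf.fromInt 10, 1000)

(* Converting to a string walks the whole multi-limb representation;
 * 10^1000 is written as a 1 followed by 1000 zeros. *)
val digits = String.size (IntInf.toString big)

val () = print (Int.toString digits ^ "\n")
----
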
8450 * There is a known problem with the GnuMP library (prior to version
8451 4.2.x), where it requires a lot of stack space for some computations,
8452 e.g., `IntInf.toString` of a million-digit number. If you run with
8453 stack size limited, you may see a segfault in such programs. This
8454 problem is mentioned in the http://gmplib.org/#FAQ[GnuMP FAQ], where
8455 they describe two solutions.
8457 ** Increase (or unlimit) your stack space. From your program, use
8458 `setrlimit`, or from the shell, use `ulimit`.
8460 ** Configure and rebuild `libgmp` with `--disable-alloca`, which will
8461 cause it to allocate temporaries using `malloc` instead of on the
8464 * On some platforms, the GnuMP library may be configured to use one of
8465 multiple ABIs (Application Binary Interfaces). For example, on some
8466 32-bit architectures, GnuMP may be configured to represent a limb as
8467 either a 32-bit `long` or as a 64-bit `long long`. Similarly, GnuMP
8468 may be configured to use specific CPU features.
8470 In order to efficiently use the GnuMP library, MLton represents an
8471 `IntInf.int` value in a manner compatible with the GnuMP library's
8472 representation of a limb. Hence, it is important that MLton and the
8473 GnuMP library agree upon the representation of a limb.
8475 ** When using a source package of MLton, building will detect the
8476 GnuMP library's representation of a limb.
8478 ** When using a binary package of MLton that is dynamically linked
8479 against the GnuMP library, the build machine and the install machine
8480 must have the GnuMP library configured with the same representation of
8481 a limb. (On the other hand, the build machine need not have the GnuMP
8482 library configured with CPU features compatible with the install
8485 ** When using a binary package of MLton that is statically linked
8486 against the GnuMP library, the build machine and the install machine
8487 need not have the GnuMP library configured with the same
8488 representation of a limb. (On the other hand, the build machine must
8489 have the GnuMP library configured with CPU features compatible with
8490 the install machine.)
8492 However, MLton will be configured with the representation of a limb
8493 from the GnuMP library of the build machine. Executables produced by
8494 MLton will be incompatible with the GnuMP library of the install
8495 machine. To _reconfigure_ MLton with the representation of a limb
8496 from the GnuMP library of the install machine, one must edit:
8499 /usr/lib/mlton/self/sizes
8508 entry so that `??` corresponds to the bytes in a limb; and, one must edit:
8511 /usr/lib/mlton/sml/basis/config/c/arch-os/c-types.sml
8518 structure C_MPLimb = struct open Word?? type t = word end
8519 functor C_MPLimb_ChooseWordN (A: CHOOSE_WORDN_ARG) = ChooseWordN_Word?? (A)
8522 entries so that `??` corresponds to the bits in a limb.
8526 :mlton-guide-page: GoogleSummerOfCode2013
8527 [[GoogleSummerOfCode2013]]
8528 Google Summer of Code (2013)
8529 ============================
8533 The following developers have agreed to serve as mentors for the 2013 Google Summer of Code:
8535 * http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8536 * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8537 * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
8541 === Implement a Partial Redundancy Elimination (PRE) Optimization ===
8543 Partial redundancy elimination (PRE) is a program transformation that
8544 removes operations that are redundant on some, but not necessarily all,
8545 paths through the program. PRE can subsume both common subexpression
8546 elimination and loop-invariant code motion, and is therefore a
8547 potentially powerful optimization. However, a naïve
8548 implementation of PRE on a program in static single assignment (SSA)
8549 form is unlikely to be effective. This project aims to adapt and
8550 implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA
8551 intermediate language.
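
To illustrate what "redundant on some, but not necessarily all, paths" means, here is a small source-level sketch. The functions `original` and `transformed` are hypothetical and written by hand; an actual PRE pass would operate on MLton's SSA intermediate language rather than on SML source.

[source,sml]
----
(* a + b is computed twice when c is true, but only once when c is
 * false: the second occurrence is partially redundant. *)
fun original (c: bool, a: int, b: int): int =
   let
      val x = if c then a + b else 0
   in
      x + (a + b)
   end

(* After the transformation, a + b is computed exactly once on every
 * path: it is inserted on the path that lacked it and reused after
 * the join. *)
fun transformed (c: bool, a: int, b: int): int =
   let
      val (x, t) =
         if c
            then let val t = a + b in (t, t) end
         else (0, a + b)
   in
      x + t
   end
----
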
8555 * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking
8556 * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
8557 * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
8558 * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
8561 Recommended Skills: SML programming experience; some middle-end compiler experience
8564 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8567 === Design and Implement a Heap Profiler ===
8569 A heap profile is a description of the space usage of a program. A
8570 heap profile is concerned with the allocation, retention, and
8571 deallocation (via garbage collection) of heap data during the
8572 execution of a program. A heap profile can be used to diagnose
8573 performance problems in a functional program that arise from space
8574 leaks. This project aims to design and implement a heap profiler for
8575 MLton compiled programs.
8579 * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
8580 * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo
8581 * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo
8582 * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
8585 Recommended Skills: C and SML programming experience; some experience with UI and visualization
8588 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8591 === Garbage Collector Improvements ===
8593 The garbage collector plays a significant role in the performance of
8594 functional languages. Garbage collect too often, and program
8595 performance suffers due to the excessive time spent in the garbage
8596 collector. Garbage collect not often enough, and program performance
8597 suffers due to the excessive space used by the uncollected garbage.
8598 One particular issue is ensuring that a program utilizing a garbage
8599 collector "plays nice" with other processes on the system, by not
8600 using too much or too little physical memory. While there are some
8601 reasonable theoretical results about garbage collections with heaps of
8602 fixed size, there seems to be insufficient work that really looks
8603 carefully at the question of dynamically resizing the heap in response
8604 to the live data demands of the application and, similarly, in
8605 response to the behavior of the operating system and other processes.
8606 This project aims to investigate improvements to the memory behavior of
8607 MLton compiled programs through better tuning of the garbage
8612 * http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
8613 * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
8614 * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
8615 * http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger
8616 * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
8619 Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
8622 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8625 === Implement Successor{nbsp}ML Language Features ===
8627 Any programming language, including Standard{nbsp}ML, can be improved.
8628 The community has identified a number of modest extensions and
8629 revisions to the Standard{nbsp}ML programming language that would
8630 likely prove useful in practice. This project aims to implement these
8631 language features in the MLton compiler.
8635 * http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML]
8636 * http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)]
8637 * http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel
8640 Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
8643 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8646 === Implement Source-level Debugging ===
8648 Debugging is a fact of programming life. Unfortunately, most SML
8649 implementations (including MLton) provide little to no source-level
8650 debugging support. This project aims to add basic to intermediate
8651 source-level debugging support to the MLton compiler. MLton already
8652 supports source-level profiling, which can be used to attribute bytes
8653 allocated or time spent in source functions. It should be relatively
8654 straightforward to leverage this source-level information into basic
8655 source-level debugging support, with the ability to set/unset
8656 breakpoints and step through declarations and functions. It may be
8657 possible to also provide intermediate source-level debugging support,
8658 with the ability to inspect in-scope variables of basic types (e.g.,
8659 types compatible with MLton's foreign function interface).
8663 * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
8664 * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
8665 * http://dwarfstd.org/[DWARF Debugging Standard]
8666 * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
8669 Recommended Skills: SML programming experience; some compiler experience
8672 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8675 === SIMD Primitives ===
8677 Most modern processors offer some direct support for SIMD (Single
8678 Instruction, Multiple Data) operations, such as Intel's MMX/SSE
8679 instructions, AMD's 3DNow! instructions, and IBM's AltiVec. Such
8680 instructions are particularly useful for multimedia, scientific, and
8681 cryptographic applications. This project aims to add preliminary
8682 support for vector data and vector operations to the MLton compiler.
8683 Ideally, after surveying SIMD instruction sets and SIMD support in
8684 other compilers, a core set of SIMD primitives with broad architecture
8685 and compiler support can be identified. After adding SIMD primitives
8686 to the core compiler and carrying them through to the various
8687 backends, there will be opportunities to design and implement an SML
8688 library that exposes the primitives to the SML programmer as well as
8689 opportunities to design and implement auto-vectorization
8694 * http://en.wikipedia.org/wiki/SIMD[SIMD]
8695 * http://gcc.gnu.org/projects/tree-ssa/vectorization.html[Auto-vectorization in GCC]
8696 * http://llvm.org/docs/Vectorizers.html[Auto-vectorization in LLVM]
8699 Recommended Skills: SML programming experience; some compiler experience; some computer architecture experience
8702 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8705 === RTOS Support ===
8707 This project entails porting the MLton compiler to RTOSs such as
8708 RTEMS, RT Linux, and FreeRTOS. The project will include modifications
8709 to the MLton build and configuration process. Students will need to
8710 extend the MLton configuration process for each of the RTOSs. The
8711 MLton compilation process will need to be extended to invoke the C
8712 cross compilers the RTOSs provide for embedded support. Test scripts
8713 for validation will be necessary and these will need to be run in
8714 emulators for supported architectures.
8716 Recommended Skills: C programming experience; some scripting experience
8719 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8722 === Region Based Memory Management ===
8724 Region based memory management is an alternative automatic memory
8725 management scheme to garbage collection. Regions can be inferred by
8726 the compiler (e.g., Cyclone and MLKit) or provided to the programmer
8727 through a library. Since many students do not have extensive
8728 experience with compilers, we plan on adopting the latter approach.
8729 Creating a viable region based memory solution requires the removal of
8730 the GC and changes to the allocator. Additionally, write barriers
8731 will be necessary to ensure that a reference between two ML objects is never
8732 established if the left hand side of the assignment has a longer
8733 lifetime than the right hand side. Students will need to come up with
8734 an appropriate interface for creating, entering, and exiting regions
8735 (examples include RTSJ scoped memory and SCJ scoped memory).
8744 Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
8747 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8750 === Integration of Multi-MLton ===
8752 http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime
8753 environment that targets scalable multicore platforms. It is an
8754 extension of MLton. It combines new language abstractions and
8755 associated compiler analyses for expressing and implementing various
8756 kinds of fine-grained parallelism (safe futures, speculation,
8757 transactions, etc.), along with a sophisticated runtime system tuned
8758 to efficiently handle large numbers of lightweight threads. The core
8759 stable features of MultiMLton will need to be integrated with the
8760 latest MLton public release. Certain experimental features, such as
8761 support for the Intel SCC and distributed runtime, will be omitted.
8762 This project requires students to understand the delta between the
8763 MultiMLton code base and the MLton code base. Students will need to
8764 create build and configuration scripts for MLton to enable MultiMLton
8769 * http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications]
8772 Recommended Skills: SML programming experience; C programming experience; some compiler experience
8775 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8780 :mlton-guide-page: GoogleSummerOfCode2014
8781 [[GoogleSummerOfCode2014]]
8782 Google Summer of Code (2014)
8783 ============================
8787 The following developers have agreed to serve as mentors for the 2014 Google Summer of Code:
8789 * http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8790 * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8791 * http://people.cs.uchicago.edu/~jhr/[John Reppy]
8792 * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan]
8794 * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
8799 === Implement a Partial Redundancy Elimination (PRE) Optimization ===
8801 Partial redundancy elimination (PRE) is a program transformation that
8802 removes operations that are redundant on some, but not necessarily all,
8803 paths through the program. PRE can subsume both common subexpression
8804 elimination and loop-invariant code motion, and is therefore a
8805 potentially powerful optimization. However, a naïve
8806 implementation of PRE on a program in static single assignment (SSA)
8807 form is unlikely to be effective. This project aims to adapt and
8808 implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA
8809 intermediate language.
8813 * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking
8814 * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
8815 * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
8816 * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
8819 Recommended Skills: SML programming experience; some middle-end compiler experience
8822 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8825 === Design and Implement a Heap Profiler ===
8827 A heap profile is a description of the space usage of a program. A
8828 heap profile is concerned with the allocation, retention, and
8829 deallocation (via garbage collection) of heap data during the
8830 execution of a program. A heap profile can be used to diagnose
8831 performance problems in a functional program that arise from space
8832 leaks. This project aims to design and implement a heap profiler for
8833 MLton compiled programs.
8837 * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
8838 * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo
8839 * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo
8840 * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
8843 Recommended Skills: C and SML programming experience; some experience with UI and visualization
8846 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8849 === Garbage Collector Improvements ===
8851 The garbage collector plays a significant role in the performance of
8852 functional languages. Garbage collect too often, and program
8853 performance suffers due to the excessive time spent in the garbage
8854 collector. Garbage collect not often enough, and program performance
8855 suffers due to the excessive space used by the uncollected garbage.
8856 One particular issue is ensuring that a program utilizing a garbage
8857 collector "plays nice" with other processes on the system, by not
8858 using too much or too little physical memory. While there are some
8859 reasonable theoretical results about garbage collections with heaps of
8860 fixed size, there seems to be insufficient work that really looks
8861 carefully at the question of dynamically resizing the heap in response
8862 to the live data demands of the application and, similarly, in
8863 response to the behavior of the operating system and other processes.
8864 This project aims to investigate improvements to the memory behavior of
8865 MLton compiled programs through better tuning of the garbage
8870 * http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
8871 * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
8872 * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
8873 * http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger
8874 * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
8877 Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
8880 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8883 === Implement Successor{nbsp}ML Language Features ===
8885 Any programming language, including Standard{nbsp}ML, can be improved.
8886 The community has identified a number of modest extensions and
8887 revisions to the Standard{nbsp}ML programming language that would
8888 likely prove useful in practice. This project aims to implement these
8889 language features in the MLton compiler.
8893 * http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML]
8894 * http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)]
8895 * http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel
8898 Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
8901 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8904 === Implement Source-level Debugging ===
8906 Debugging is a fact of programming life. Unfortunately, most SML
8907 implementations (including MLton) provide little to no source-level
8908 debugging support. This project aims to add basic to intermediate
8909 source-level debugging support to the MLton compiler. MLton already
8910 supports source-level profiling, which can be used to attribute bytes
8911 allocated or time spent in source functions. It should be relatively
8912 straightforward to leverage this source-level information into basic
8913 source-level debugging support, with the ability to set/unset
8914 breakpoints and step through declarations and functions. It may be
8915 possible to also provide intermediate source-level debugging support,
8916 with the ability to inspect in-scope variables of basic types (e.g.,
8917 types compatible with MLton's foreign function interface).
8921 * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
8922 * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
8923 * http://dwarfstd.org/[DWARF Debugging Standard]
8924 * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
8927 Recommended Skills: SML programming experience; some compiler experience
8930 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8933 === Region Based Memory Management ===
8935 Region based memory management is an alternative automatic memory
8936 management scheme to garbage collection. Regions can be inferred by
8937 the compiler (e.g., Cyclone and MLKit) or provided to the programmer
8938 through a library. Since many students do not have extensive
8939 experience with compilers, we plan on adopting the latter approach.
8940 Creating a viable region based memory solution requires the removal of
8941 the GC and changes to the allocator. Additionally, write barriers
8942 will be necessary to ensure that a reference between two ML objects is never
8943 established if the left hand side of the assignment has a longer
8944 lifetime than the right hand side. Students will need to come up with
8945 an appropriate interface for creating, entering, and exiting regions
8946 (examples include RTSJ scoped memory and SCJ scoped memory).
8955 Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
8958 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8961 === Integration of Multi-MLton ===
8963 http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime
8964 environment that targets scalable multicore platforms. It is an
8965 extension of MLton. It combines new language abstractions and
8966 associated compiler analyses for expressing and implementing various
8967 kinds of fine-grained parallelism (safe futures, speculation,
8968 transactions, etc.), along with a sophisticated runtime system tuned
8969 to efficiently handle large numbers of lightweight threads. The core
8970 stable features of MultiMLton will need to be integrated with the
8971 latest MLton public release. Certain experimental features, such as
8972 support for the Intel SCC and distributed runtime, will be omitted.
8973 This project requires students to understand the delta between the
8974 MultiMLton code base and the MLton code base. Students will need to
8975 create build and configuration scripts for MLton to enable MultiMLton
8980 * http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications]
8983 Recommended Skills: SML programming experience; C programming experience; some compiler experience
8986 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8989 === Concurrent{nbsp}ML Improvements ===
8991 http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
8992 library based on synchronous message passing. MLton has a partial
8993 implementation of the CML message-passing primitives, but its use in
8994 real-world applications has been stymied by the lack of completeness
8995 and thread-safe I/O libraries. This project would aim to flesh out
8996 the CML implementation in MLton to be fully compatible with the
8997 "official" version distributed as part of SML/NJ. Furthermore, time
8998 permitting, runtime system support could be added to allow use of
8999 modern OS features, such as asynchronous I/O, in the implementation of
9000 CML's system interfaces.
9004 * http://cml.cs.uchicago.edu/
9005 * http://mlton.org/ConcurrentML
9006 * http://mlton.org/ConcurrentMLImplementation
9009 Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience
9012 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9013 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9017 === SML3d Development ===
9019 The SML3d Project is a collection of libraries to support 3D graphics
9020 programming using Standard ML and the http://opengl.org/[OpenGL]
9021 graphics API. It currently requires the MLton implementation of SML
9022 and is supported on Linux, Mac OS X, and Microsoft Windows. There is
9023 also support for http://www.khronos.org/opencl/[OpenCL]. This project
9024 aims to continue development of the SML3d Project.
9028 * http://sml3d.cs.uchicago.edu/
9031 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9036 :mlton-guide-page: GoogleSummerOfCode2015
9037 [[GoogleSummerOfCode2015]]
9038 Google Summer of Code (2015)
9039 ============================
9043 The following developers have agreed to serve as mentors for the 2015 Google Summer of Code:
9045 * http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9046 * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9048 * http://people.cs.uchicago.edu/~jhr/[John Reppy]
9049 * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan]
9050 * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
9056 === Implement a Partial Redundancy Elimination (PRE) Optimization ===
9058 Partial redundancy elimination (PRE) is a program transformation that
9059 removes operations that are redundant on some, but not necessarily all,
9060 paths through the program. PRE can subsume both common subexpression
9061 elimination and loop-invariant code motion, and is therefore a
9062 potentially powerful optimization. However, a naïve implementation of
9063 PRE on a program in static single assignment (SSA) form is unlikely to
9064 be effective. This project aims to adapt and implement the GVN-PRE
9065 algorithm of Thomas VanDrunen in MLton's SSA intermediate language.
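
As a small illustration of what "partially redundant" means, the following sketch shows a source-level before/after pair. It is only a hand-written example under the assumption that `a + b` stands in for an arbitrary pure operation; it is not the GVN-PRE algorithm, and the function names `original` and `transformed` are made up for this sketch.

```sml
(* Before: a + b is computed twice on the path where c is true,
   and once on the path where c is false. *)
fun original (a : int, b : int, c : bool) : int =
  let
    val x = if c then a + b else 0
    val y = a + b   (* redundant only on the path where c is true *)
  in
    x * y
  end

(* After: the computation is inserted on the path that lacked it,
   so every path computes a + b exactly once and reuses the result. *)
fun transformed (a : int, b : int, c : bool) : int =
  let
    val (x, t) =
      if c
        then let val t = a + b in (t, t) end
        else (0, a + b)
  in
    x * t
  end
```

Both functions return the same result for every input; the transformed version simply avoids the second addition when `c` is true.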
9069 * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
9070 * http://www.cs.purdue.edu/research/technical_reports/2003/TR%2003-032.pdf[Corner-cases in Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
9071 * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
9072 * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based Partial Redundancy Elimination for Static Single Assignment Form]; Thomas VanDrunen and Antony L. Hosking
9073 * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial Redundancy Elimination in SSA Form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
9076 Recommended Skills: SML programming experience; some middle-end compiler experience
9078 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9081 === Design and Implement a Heap Profiler ===
9083 A heap profile is a description of the space usage of a program. A
9084 heap profile is concerned with the allocation, retention, and
9085 deallocation (via garbage collection) of heap data during the
9086 execution of a program. A heap profile can be used to diagnose
9087 performance problems in a functional program that arise from space
9088 leaks. This project aims to design and implement a heap profiler for
9089 MLton compiled programs.
9093 * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
9094 * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo
9095 * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo
9096 * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
9099 Recommended Skills: C and SML programming experience; some experience with UI and visualization
9102 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9105 === Garbage Collector Improvements ===
9107 The garbage collector plays a significant role in the performance of
9108 functional languages. Garbage collect too often, and program
9109 performance suffers due to the excessive time spent in the garbage
9110 collector. Garbage collect not often enough, and program performance
9111 suffers due to the excessive space used by the uncollected
9112 garbage. One particular issue is ensuring that a program utilizing a
9113 garbage collector "plays nice" with other processes on the system, by
9114 not using too much or too little physical memory. While there are some
9115 reasonable theoretical results about garbage collections with heaps of
9116 fixed size, there seems to be insufficient work that really looks
9117 carefully at the question of dynamically resizing the heap in response
9118 to the live data demands of the application and, similarly, in
9119 response to the behavior of the operating system and other
9120 processes. This project aims to investigate improvements to the memory
9121 behavior of MLton compiled programs through better tuning of the
9126 * http://gchandbook.org/[The Garbage Collection Handbook: The Art of Automatic Memory Management]; Richard Jones, Antony Hosking, Eliot Moss
9127 * http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020[Dual-Mode Garbage Collection]; Patrick Sansom
9128 * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic Heap Sizing: Taking Real Memory into Account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
9129 * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
9130 * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
9131 * http://portal.acm.org/citation.cfm?doid=1806651.1806669[The Economics of Garbage Collection]; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján
9132 * http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
9133 * http://portal.acm.org/citation.cfm?doid=2555670.2466481[Control Theory for Principled Heap Sizing]; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones
9136 Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
9139 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9142 === Heap-allocated Activation Records ===
9144 Activation records (a.k.a., stack frames) are traditionally allocated
9145 on a stack. This naturally corresponds to the call-return pattern of
9146 function invocation. However, there are some disadvantages to
9147 stack-allocated activation records. In a functional programming
9148 language, functions may be deeply recursive, resulting in call stacks
9149 that are much larger than typically supported by the operating system;
9150 hence, a functional programming language implementation will typically
9151 store its stack in its heap. Furthermore, a functional programming
9152 language implementation must handle and recover from stack overflow,
9153 by allocating a larger stack (again, in its heap) and copying
9154 activation records from the old stack to the new stack. In the
9155 presence of threads, stacks must be allocated in a heap and, in the
9156 presence of a garbage collector, should be garbage collected when
9157 unreachable. While heap-allocated activation records avoid many of
9158 these disadvantages, they have not been widely implemented. This
9159 project aims to implement and evaluate heap-allocated activation
9160 records in the MLton compiler.
9164 * http://journals.cambridge.org/action/displayAbstract?aid=1295104[Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures]; Andrew W. Appel and Zhong Shao
9165 * http://portal.acm.org/citation.cfm?doid=182590.156783[Space-efficient closure representations]; Zhong Shao and Andrew W. Appel
9166 * http://portal.acm.org/citation.cfm?doid=93548.93554[Representing control in the presence of first-class continuations]; R. Hieb, R. Kent Dybvig, and Carl Bruggeman
9169 Recommended Skills: SML programming experience; some middle- and back-end compiler experience
9172 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9175 === Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML ===
9178 http://en.wikipedia.org/wiki/IEEE_754-2008[IEEE Standard for Floating-Point Arithmetic (IEEE 754)]
9179 is the de facto representation for floating-point computation.
9180 However, it is a _binary_ (base 2) representation of floating-point
9181 values, while many applications call for input and output of
9182 floating-point values in _decimal_ (base 10) representation. The
9183 _decimal-to-binary_ conversion problem takes a decimal floating-point
9184 representation (e.g., a string like +"0.1"+) and returns the best
9185 binary floating-point representation of that number. The
9186 _binary-to-decimal_ conversion problem takes a binary floating-point
9187 representation and returns a decimal floating-point representation
9188 using the smallest number of digits that allow the decimal
9189 floating-point representation to be converted to the original binary
9190 floating-point representation. For both conversion routines, "best"
9191 is dependent upon the current floating-point rounding mode.
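
The following sketch illustrates why these conversions matter, using only Basis Library functions. It assumes that `Real` is a 64-bit IEEE 754 binary type (as it is by default in MLton); the value names are illustrative.

```sml
(* 0.1, 0.2 and 0.3 have no exact binary representation, so the
   sum of the nearest doubles differs from the nearest double to 0.3. *)
val sum = 0.1 + 0.2
val sumIsExact = Real.== (sum, 0.3)

(* Decimal-to-binary: parse a decimal string at run time. *)
val parsed = Real.fromString "0.1"
val parsedMatchesLiteral =
  case parsed of
    SOME r => Real.== (r, 0.1)
  | NONE => false

(* Binary-to-decimal and back: 17 significant digits are enough to
   identify a double uniquely, so this is expected to be true when
   both conversions are correctly rounded. *)
val roundTrips =
  case Real.fromString (Real.fmt (StringCvt.GEN (SOME 17)) sum) of
    SOME r => Real.== (r, sum)
  | NONE => false
```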
9193 MLton uses David Gay's
9194 http://www.netlib.org/fp/gdtoa.tgz[gdtoa library] for floating-point
9195 conversions. While this is an excellent library, it generalizes the
9196 decimal-to-binary and binary-to-decimal conversion routines beyond
9197 what is required by the
9198 http://standardml.org/Basis/[Standard ML Basis Library] and induces an
9199 external dependency on the compiler. Native implementations of these
9200 conversion routines in Standard ML would obviate the dependency on the
9201 +gdtoa+ library, while also being able to take advantage of Standard
9202 ML features in the implementation (e.g., the published algorithms
9203 often require use of infinite precision arithmetic, which is provided
9204 by the +IntInf+ structure in Standard ML, but is provided in an ad hoc
9205 fashion in the +gdtoa+ library).
9207 This project aims to develop a native implementation of the conversion
9208 routines in Standard ML.
9212 * http://dl.acm.org/citation.cfm?doid=103162.103163[What every computer scientist should know about floating-point arithmetic]; David Goldberg
9213 * http://dl.acm.org/citation.cfm?doid=93542.93559[How to print floating-point numbers accurately]; Guy L. Steele, Jr. and Jon L. White
9214 * http://dl.acm.org/citation.cfm?doid=93542.93557[How to read floating point numbers accurately]; William D. Clinger
9215 * http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz[Correctly Rounded Binary-Decimal and Decimal-Binary Conversions]; David Gay
9216 * http://dl.acm.org/citation.cfm?doid=249069.231397[Printing floating-point numbers quickly and accurately]; Robert G. Burger and R. Kent Dybvig
9217 * http://dl.acm.org/citation.cfm?doid=1806596.1806623[Printing floating-point numbers quickly and accurately with integers]; Florian Loitsch
9220 Recommended Skills: SML programming experience; algorithm design and implementation
9223 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9226 === Implement Source-level Debugging ===
9228 Debugging is a fact of programming life. Unfortunately, most SML
9229 implementations (including MLton) provide little to no source-level
9230 debugging support. This project aims to add basic to intermediate
9231 source-level debugging support to the MLton compiler. MLton already
9232 supports source-level profiling, which can be used to attribute bytes
9233 allocated or time spent in source functions. It should be relatively
9234 straightforward to leverage this source-level information into basic
9235 source-level debugging support, with the ability to set/unset
9236 breakpoints and step through declarations and functions. It may be
9237 possible to also provide intermediate source-level debugging support,
9238 with the ability to inspect in-scope variables of basic types (e.g.,
9239 types compatible with MLton's foreign function interface).
9243 * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
9244 * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
9245 * http://dwarfstd.org/[DWARF Debugging Standard]
9246 * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
9249 Recommended Skills: SML programming experience; some compiler experience
9252 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9255 === Region Based Memory Management ===
9257 Region based memory management is an alternative automatic memory
9258 management scheme to garbage collection. Regions can be inferred by
9259 the compiler (e.g., Cyclone and MLKit) or provided to the programmer
9260 through a library. Since many students do not have extensive
9261 experience with compilers, we plan on adopting the latter approach.
9262 Creating a viable region based memory solution requires the removal of
9263 the GC and changes to the allocator. Additionally, write barriers
9264 will be necessary to ensure a reference between two ML objects is never
9265 established if the left hand side of the assignment has a longer
9266 lifetime than the right hand side. Students will need to come up with
9267 an appropriate interface for creating, entering, and exiting regions
9268 (examples include RTSJ scoped memory and SCJ scoped memory).
9277 Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
9280 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9283 === Adding Real-Time Capabilities ===
9285 This project focuses on exposing real-time APIs from a real-time OS
9286 kernel at the SML level. This will require mapping the current MLton
9287 (or http://multimlton.cs.purdue.edu[MultiMLton]) threading framework
9288 to real-time threads that the RTOS provides. This will include
9289 associating priorities with MLton threads and building priority based
9290 scheduling algorithms. Additionally, periodic, aperiodic,
9291 and sporadic tasks should be supported. A real-time SML library will
9292 need to be created to provide a forward facing interface for
9293 programmers. Stretch goals include reworking the MLton +atomic+
9294 statement and associated synchronization primitives built on top of
9295 the MLton +atomic+ statement.
9297 Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required
9300 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9303 === Real-Time Garbage Collection ===
9305 This project focuses on modifications to the MLton GC to support
9306 real-time garbage collection. We will model the real-time GC on the
9307 Schism RTGC. The first task will be to create a fixed size runtime
9308 object representation. Large structures will need to be represented
9309 as linked lists of fixed-size objects. Arrays and vectors will be
9310 transformed into dense trees. Compaction and copying can therefore be
9311 removed from the GC algorithms that MLton currently supports. Lastly,
9312 the GC will be made concurrent, allowing for the execution of the GC
9313 threads as the lowest priority task in the system. Stretch goals
9314 include a priority aware mechanism for the GC to signal to real-time
9315 ML threads that it needs to scan their stack and identification of
9316 places where the stack is shallow to bound priority inversion during
9319 Recommended Skills: C programming experience; garbage collector experience a plus but not required
9322 Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9326 === Concurrent{nbsp}ML Improvements ===
9328 http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
9329 library based on synchronous message passing. MLton has a partial
9330 implementation of the CML message-passing primitives, but its use in
9331 real-world applications has been stymied by the lack of completeness
9332 and thread-safe I/O libraries. This project would aim to flesh out
9333 the CML implementation in MLton to be fully compatible with the
9334 "official" version distributed as part of SML/NJ. Furthermore, time
9335 permitting, runtime system support could be added to allow use of
9336 modern OS features, such as asynchronous I/O, in the implementation of
9337 CML's system interfaces.
9341 * http://cml.cs.uchicago.edu/
9342 * http://mlton.org/ConcurrentML
9343 * http://mlton.org/ConcurrentMLImplementation
9346 Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience
9348 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9349 Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9353 === SML3d Development ===
9355 The SML3d Project is a collection of libraries to support 3D graphics
9356 programming using Standard ML and the http://opengl.org/[OpenGL]
9357 graphics API. It currently requires the MLton implementation of SML
9358 and is supported on Linux, Mac OS X, and Microsoft Windows. There is
9359 also support for http://www.khronos.org/opencl/[OpenCL]. This project
9360 aims to continue development of the SML3d Project.
9364 * http://sml3d.cs.uchicago.edu/
9367 Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9372 :mlton-guide-page: HaMLet
9377 http://www.mpi-sws.org/~rossberg/hamlet/[HaMLet] is a
9378 <:StandardMLImplementations:Standard ML implementation>. It is
9379 intended as a reference implementation of
9380 <:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and
9381 not for serious practical work.
9385 :mlton-guide-page: HenryCejtin
9390 I was one of the original developers of Mathematica (actually employee #1).
9391 My background is a combination of mathematics and computer science.
9392 Currently I am doing various things in Chicago.
9396 :mlton-guide-page: History
9401 In April 1997, Stephen Weeks wrote a defunctorizer for Standard ML and
9402 integrated it with SML/NJ. The defunctorizer used SML/NJ's visible
9403 compiler and operated on the `Ast` intermediate representation
9404 produced by the SML/NJ front end. Experiments showed that
9405 defunctorization gave a speedup of up to six times over separate
9406 compilation and up to two times over batch compilation without functor
9409 In August 1997, we began development of an independent compiler for
9410 SML. At the time the compiler was called `smlc`. By October, we had
9411 a working monomorphiser. By November, we added a polyvariant
9412 higher-order control-flow analysis. At that point, MLton was about
9413 10,000 lines of code.
9415 Over the next year and a half, `smlc` morphed into a full-fledged
9416 compiler for SML. It was renamed MLton, and first released in March
9419 From the start, MLton has been driven by whole-program optimization
9420 and an emphasis on performance. Also from the start, MLton has had a
9421 fast C FFI and `IntInf` based on the GNU multiprecision library. At
9422 its first release, MLton was 48,006 lines.
9424 Between March 1999 and January 2002, MLton grew to 102,541 lines,
9425 as we added a native code generator, mllex, mlyacc, a profiler, many
9426 optimizations, and many libraries including threads and signal
9429 During 2002, MLton grew to 112,204 lines and we had releases in April
9430 and September. We added support for cross compilation and used this
9431 to enable MLton to run on Cygwin/Windows and FreeBSD. We also made
9432 improvements to the garbage collector, so that it now works with large
9433 arrays and up to 4G of memory and so that it automatically uses
9434 copying, mark-compact, or generational collection depending on heap
9435 usage and RAM size. We also continued improvements to the optimizer
9438 During 2003, MLton grew to 122,299 lines and we had releases in March
9439 and July. We extended the profiler to support source-level profiling
9440 of time and allocation and to display call graphs. We completed the
9441 Basis Library implementation, and added new MLton-specific libraries
9442 for weak pointers and finalization. We extended the FFI to allow
9443 callbacks from C to SML. We added support for the Sparc/Solaris
9444 platform, and made many improvements to the C code generator.
9448 :mlton-guide-page: HowProfilingWorks
9449 [[HowProfilingWorks]]
9453 Here's how <:Profiling:> works. If profiling is on, the front end
9454 (elaborator) inserts `Enter` and `Leave` statements into the source
9455 program for function entry and exit. For example,
9458 fun f n = if n = 0 then 0 else 1 + f (n - 1)
9466 val res = (if n = 0 then 0 else 1 + f (n - 1))
9467 handle e => (Leave "f"; raise e)
9474 Actually there is a bit more information than just the source function
9475 name; there is also lexical nesting and file position.
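
To see the effect of the instrumentation, the following sketch simulates it at source level. Here `Enter` and `Leave` are ordinary functions invented for this illustration that append to an event log; in the compiler they are statements of the intermediate language, not user-visible functions.

```sml
(* Hypothetical stand-ins for the profiling statements: each one
   records an event so that the nesting can be inspected afterwards. *)
val events : string list ref = ref []
fun Enter name = events := ("Enter " ^ name) :: !events
fun Leave name = events := ("Leave " ^ name) :: !events

(* The instrumented function: every entry is matched by a Leave,
   both on normal return and when an exception propagates. *)
fun f n =
  let
    val () = Enter "f"
    val res = (if n = 0 then 0 else 1 + f (n - 1))
              handle e => (Leave "f"; raise e)
    val () = Leave "f"
  in
    res
  end

val result = f 2
val trace = List.rev (!events)
```

After evaluating `f 2`, `trace` holds three `Enter f` events followed by three `Leave f` events, reflecting the three nested activations.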
9477 Most of the middle of the compiler ignores, but preserves, `Enter` and
9478 `Leave`. However, so that profiling preserves tail calls, the
9479 <:Shrink:SSA shrinker> has an optimization that notices when the only
9480 operations that cause a call to be a nontail call are profiling
9481 operations, and if so, moves them before the call, turning it into a
9482 tail call. If you observe a program that has a tail call that appears
9483 to be turned into a nontail when compiled with profiling, please
9484 <:Bug:report a bug>.
9486 There is the `checkProf` function in
9487 <!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>, which checks that
9488 the `Enter`/`Leave` statements match up.
9490 In the backend, just before translating to the <:Machine: Machine IL>,
9491 the profiler uses the `Enter`/`Leave` statements to infer the "local"
9492 portion of the control stack at each program point. The profiler then
9493 removes the ++Enter++s/++Leave++s and inserts different information
9494 depending on which kind of profiling is happening. For time profiling
9495 (with the <:AMD64Codegen:> and <:X86Codegen:>), the profiler inserts labels that cover the
9496 code (i.e. each statement has a unique label in its basic block that
9497 prefixes it) and associates each label with the local control stack.
9498 For time profiling (with the <:CCodegen:> and <:LLVMCodegen:>), the profiler
9499 inserts code that sets a global field that records the local control
9500 stack. For allocation profiling, the profiler inserts calls to a C
9501 function that will maintain byte counts. With stack profiling, the
9502 profiler also inserts a call to a C function at each nontail call in
9503 order to maintain information at runtime about what SML functions are
9506 At run time, the profiler associates counters (either clock ticks or
9507 byte counts) with source functions. When the program finishes, the
9508 profiler writes the counts out to the `mlmon.out` file. Then,
9509 `mlprof` uses source information stored in the executable to
9510 associate the counts in the `mlmon.out` file with source
9513 For time profiling, the profiler catches the `SIGPROF` signal 100
9514 times per second and increments the appropriate counter, determined by
9515 looking at the label prefixing the current program counter and mapping
9516 that to the current source function.
9520 There may be a few missed clock ticks or bytes allocated at the very
9521 end of the program after the data is written.
9523 Profiling has not been tested with signals or threads. In particular,
9524 stack profiling may behave strangely.
9528 :mlton-guide-page: Identifier
9533 In <:StandardML:Standard ML>, there are syntactically two kinds of
9536 * Alphanumeric: starts with a letter or prime (`'`) and is followed by letters, digits, primes and underbars (`_`).
9538 Examples: `abc`, `ABC123`, `Abc_123`, `'a`.
9540 * Symbolic: a sequence of the following
9543 ! % & $ # + - / : < = > ? @ \ ~ ` ^ | *
9546 Examples: `+=`, `<=`, `>>`, `$`.
9548 With the exception of `=`, reserved words can not be identifiers.
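
The two syntactic kinds can be seen side by side in a short sketch. The fixity and the meaning given to `+=` and `$` below are arbitrary choices made for this example.

```sml
(* Alphanumeric identifiers. *)
val abc = 1
val ABC123 = 2
val Abc_123 = 3
fun identity (x : 'a) : 'a = x   (* 'a starts with a prime: a type variable *)

(* Symbolic identifiers. *)
infix 6 +=
fun x += y = x + y
val $ = 100

(* Both kinds used together; += is left associative at precedence 6. *)
val total = abc += ABC123 += Abc_123 += $
```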
9550 There are a number of different classes of identifiers, some of which
9551 have additional syntactic rules.
9553 * Identifiers not starting with a prime.
9554 ** value identifier (includes variables and constructors)
9556 ** structure identifier
9557 ** signature identifier
9558 ** functor identifier
9559 * Identifiers starting with a prime.
9561 * Identifiers not starting with a prime and numeric labels (`1`, `2`, ...).
9566 :mlton-guide-page: Immutable
9571 Immutable means not <:Mutable:mutable> and is an adjective meaning
9572 "can not be modified". Most values in <:StandardML:Standard ML> are
9573 immutable. For example, constants, tuples, records, lists, and
9574 vectors are all immutable.
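
A brief sketch of the distinction, with illustrative names: operations on immutable values build new values and leave the originals untouched, whereas a `ref` cell can be updated in place.

```sml
(* Lists are immutable: consing builds a new list and xs is unchanged. *)
val xs = [1, 2, 3]
val ys = 0 :: xs

(* Records are immutable: a changed field requires building a new record. *)
val point = {x = 1, y = 2}
val moved = {x = #x point + 10, y = #y point}

(* A ref cell is mutable: its contents can be replaced. *)
val counter = ref 0
val () = counter := !counter + 1
```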
9578 :mlton-guide-page: ImperativeTypeVariable
9579 [[ImperativeTypeVariable]]
9580 ImperativeTypeVariable
9581 ======================
9583 In <:StandardML:Standard ML>, an imperative type variable is a type
9584 variable whose second character is a digit, as in `'1a` or
9585 `'2b`. Imperative type variables were used as an alternative to
9586 the <:ValueRestriction:> in an earlier version of SML, but no longer play
9587 a role. They are treated exactly as other type variables.
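
For instance, the following sketch (with made-up function names) uses `'1a` and `'2b` exactly as one would use `'a` and `'b`; the functions are fully polymorphic.

```sml
(* '1a and '2b behave like any other type variables. *)
fun id (x : '1a) : '1a = x
fun pair (x : '1a, y : '2b) : '1a * '2b = (x, y)

val n = id 3
val s = id "three"
val p = pair (n, s)
```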
9591 :mlton-guide-page: ImplementExceptions
9592 [[ImplementExceptions]]
9596 <:ImplementExceptions:> is a pass for the <:SXML:>
9597 <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
9601 This pass implements exceptions.
9603 == Implementation ==
9605 * <!ViewGitFile(mlton,master,mlton/xml/implement-exceptions.fun)>
9607 == Details and Notes ==
9613 :mlton-guide-page: ImplementHandlers
9614 [[ImplementHandlers]]
9618 <:ImplementHandlers:> is a pass for the <:RSSA:>
9619 <:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
9623 This pass implements the (threaded) exception handler stack.
9625 == Implementation ==
9627 * <!ViewGitFile(mlton,master,mlton/backend/implement-handlers.fun)>
9629 == Details and Notes ==
9635 :mlton-guide-page: ImplementProfiling
9636 [[ImplementProfiling]]
9640 <:ImplementProfiling:> is a pass for the <:RSSA:>
9641 <:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
9645 This pass implements profiling.
9647 == Implementation ==
9649 * <!ViewGitFile(mlton,master,mlton/backend/implement-profiling.fun)>
9651 == Details and Notes ==
9653 See <:HowProfilingWorks:>.
9657 :mlton-guide-page: ImplementSuffix
9662 <:ImplementSuffix:> is a pass for the <:SXML:>
9663 <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
9667 This pass implements the `TopLevel_setSuffix` primitive, which
9668 installs a function to exit the program.
9670 == Implementation ==
9672 * <!ViewGitFile(mlton,master,mlton/xml/implement-suffix.fun)>
9674 == Details and Notes ==
9676 <:ImplementSuffix:> works by introducing a new `ref` cell to contain
9677 the function of type `unit -> unit` that should be called on program
9680 * The following code (appropriately alpha-converted) is added at the beginning of the <:SXML:> program:
9688 "toplevel suffix not installed"
9694 val topLevelSuffixCell =
9703 TopLevel_setSuffix (f_0)
9711 Ref_assign (topLevelSuffixCell, f_0)
9714 * The following code (appropriately alpha-converted) is appended to the end of the <:SXML:> program:
9719 Ref_deref (topLevelSuffixCell)
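
The idea can be pictured with an ordinary source-level analogue. This is only a sketch: `setSuffix` and `runSuffix` are hypothetical functions standing in for the `TopLevel_setSuffix` primitive and for the code placed at the end of the program, and the real pass works on <:SXML:> rather than on source code.

```sml
(* The cell initially holds a function that fails, so that exiting
   without an installed suffix is detected. *)
val topLevelSuffixCell : (unit -> unit) ref =
  ref (fn () => raise Fail "toplevel suffix not installed")

(* Stand-in for TopLevel_setSuffix: store the function in the cell. *)
fun setSuffix (f : unit -> unit) : unit = topLevelSuffixCell := f

(* Stand-in for the code at the end of the program: fetch and call it. *)
fun runSuffix () : unit = (!topLevelSuffixCell) ()

(* Usage: install a suffix, then run it. *)
val exited = ref false
val () = setSuffix (fn () => exited := true)
val () = runSuffix ()
```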
9728 :mlton-guide-page: InfixingOperators
9729 [[InfixingOperators]]
9733 Fixity specifications are not part of signatures in
9734 <:StandardML:Standard ML>. When one wants to use a module that
9735 provides functions designed to be used as infix operators there are
9736 several obvious alternatives:
9738 * Use only prefix applications. Unfortunately there are situations
9739 where infix applications lead to considerably more readable code.
9741 * Make the fixity declarations at the top-level. This may lead to
9742 collisions and may be unsustainable in a large project. Pollution of
9743 the top-level should be avoided.
9745 * Make the fixity declarations at each scope where you want to use
9746 infix applications. The duplication becomes inconvenient if the
9747 operators are widely used. Duplication of code should be avoided.
9749 * Use non-standard extensions, such as the <:MLBasis: ML Basis system>
9750 to control the scope of fixity declarations. This has the obvious
9751 drawback of reduced portability.
9753 * Reuse existing infix operator symbols (`^`, `+`, `-`, ...). This
9754 can be convenient when the standard operators aren't needed in the
9755 same scope with the new operators. On the other hand, one is limited
9756 to the standard operator symbols and the code may appear confusing.
9758 None of the obvious alternatives is best in every case. The following
9759 describes a slightly less obvious alternative that can sometimes be
9760 useful. The idea is to approximate Haskell's special syntax for
9761 treating any identifier enclosed in grave accents (backquotes) as an
9762 infix operator. In Haskell, instead of writing the prefix application
9763 `f x y` one can write the infix application ++x `f` y++.
9766 == Infixing operators ==
9768 Let's first take a look at the definitions of the operators:
9772 infix 3 <\ fun x <\ f = fn y => f (x, y) (* Left section *)
9773 infix 3 \> fun f \> y = f y (* Left application *)
9774 infixr 3 /> fun f /> y = fn x => f (x, y) (* Right section *)
9775 infixr 3 </ fun x </ f = f x (* Right application *)
9777 infix 2 o (* See motivation below *)
9781 The left and right sectioning operators, `<\` and `/>`, are useful in
9782 SML for partial application of infix operators.
9783 <!Cite(Paulson96, ML For the Working Programmer)> describes curried
9784 functions `secl` and `secr` for the same purpose on pages 179-181.
9792 is a function for subtracting `y` from a list of integers and
9796 List.exists (x <\ op=)
9799 is a function for testing whether a list contains an `x`.
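
A self-contained sketch of both kinds of section, repeating the operator definitions from above; the names `subtractTwo` and `containsThree` are chosen for this example only.

```sml
infix  3 <\     fun x <\ f = fn y => f (x, y)     (* Left section  *)
infixr 3 />     fun f /> y = fn x => f (x, y)     (* Right section *)

(* Right section: fix the second argument of op-, giving fn x => x - 2. *)
val subtractTwo = List.map (op- /> 2)

(* Left section: fix the first argument of op=, giving fn y => 3 = y. *)
val containsThree = List.exists (3 <\ op=)

val shifted = subtractTwo [5, 6, 7]
val found = containsThree [1, 2, 3]
val missing = containsThree [4, 5]
```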
9801 Together with the left and right application operators, `\>` and `</`,
9802 the sectioning operators provide a way to treat any binary function
9803 (i.e. a function whose domain is a pair) as an infix operator. In
9807 x0 <\f1\> x1 <\f2\> x2 ... <\fN\> xN = fN (... f2 (f1 (x0, x1), x2) ..., xN)
9813 xN </fN/> ... x2 </f2/> x1 </f1/> x0 = fN (xN, ... f2 (x2, f1 (x1, x0)) ...)
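
The two equations can be checked on a small case. The sketch below uses a hypothetical non-commutative function `sub`, so that the grouping is visible in the result.

```sml
infix  3 <\     fun x <\ f = fn y => f (x, y)     (* Left section      *)
infix  3 \>     fun f \> y = f y                  (* Left application  *)
infixr 3 />     fun f /> y = fn x => f (x, y)     (* Right section     *)
infixr 3 </     fun x </ f = f x                  (* Right application *)

fun sub (a, b) = a - b

(* Left associative: sub (sub (10, 3), 2) *)
val left = 10 <\sub\> 3 <\sub\> 2

(* Right associative: sub (2, sub (3, 10)) *)
val right = 2 </sub/> 3 </sub/> 10
```

Here `left` evaluates to `5` and `right` to `9`.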
9819 As a fairly realistic example, consider providing a function for sequencing
9824 structure Order (* ... *) =
9827 val orWhenEq = fn (EQUAL, th) => th ()
9828 | (other, _) => other
9832 Using `orWhenEq` and the infixing operators, one can write a
9833 `compare` function for triples as
9837 fun compare (fad, fbe, fcf) ((a, b, c), (d, e, f)) =
9838 fad (a, d) <\Order.orWhenEq\> `fbe (b, e) <\Order.orWhenEq\> `fcf (c, f)
9841 where +`+ is defined as
9845 fun `f x = fn () => f x
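
Putting these pieces together gives a program that can be compiled as is. This is a sketch: the `Order` structure is reduced to the single `orWhenEq` binding, and the test values at the end are made up for illustration.

```sml
infix 3 <\     fun x <\ f = fn y => f (x, y)     (* Left section     *)
infix 3 \>     fun f \> y = f y                  (* Left application *)

structure Order =
   struct
      (* Fall through to the thunk only when the first comparison is EQUAL. *)
      val orWhenEq = fn (EQUAL, th) => th ()
                      | (other, _) => other
   end

(* Delay an application: `f x is fn () => f x. *)
fun `f x = fn () => f x

fun compare (fad, fbe, fcf) ((a, b, c), (d, e, f)) =
    fad (a, d) <\Order.orWhenEq\> `fbe (b, e) <\Order.orWhenEq\> `fcf (c, f)

val cmp = compare (Int.compare, String.compare, Int.compare)
val r1 = cmp ((1, "a", 2), (1, "a", 3))     (* decided by the third component *)
val r2 = cmp ((1, "b", 2), (1, "a", 3))     (* decided by the second component *)
```

The thunks matter: in `r2` the third comparison is never evaluated, because the second one already returns a result other than `EQUAL`.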
9848 Although `orWhenEq` can be convenient (try rewriting the above without
9849 it), it is probably not useful enough to be defined at the top level
as an infix operator. Fortunately, we can use the infixing operators and do not have to.
Another fairly realistic example is to use the infixing operators with
the technique described on the <:Printf:> page. Assuming that you have
a `Printf` module binding `printf`, +`+, and formatting combinators
named `int` and `string`, you could write
let open Printf in
   printf (`"Here's an int "<\int\>" and a string "<\string\>".") 13 "foo" end
without having to duplicate the fixity declarations. Alternatively, you could write
9869 P.printf (P.`"Here's an int "<\P.int\>" and a string "<\P.string\>".") 13 "foo"
assuming you have made the binding
9876 structure P = Printf
9880 == Application and piping operators ==
9882 The left and right application operators may also provide some notational
9883 convenience on their own. In general,
9886 f \> x1 \> ... \> xN = f x1 ... xN
9892 xN </ ... </ x1 </ f = f x1 ... xN
9895 If nothing else, both of them can eliminate parentheses. For example,
9899 foo (1 + 2) = foo \> 1 + 2
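
A short sketch of both application operators follows. The curried function `add3` is a hypothetical example and the operator definitions are repeated so that the fragment stands alone.

```sml
infix  3 \>     fun f \> y = f y     (* Left application  *)
infixr 3 </     fun x </ f = f x     (* Right application *)

fun add3 a b c = a + b + c

val viaLeft  = add3 \> 1 \> 2 \> 3          (* add3 1 2 3 *)
val viaRight = 3 </ 2 </ 1 </ add3          (* add3 1 2 3 *)

(* + binds tighter (precedence 6) than \> (precedence 3). *)
val noParens = Int.toString \> 1 + 2        (* Int.toString (1 + 2) *)
```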
9902 The left and right application operators are related to operators
9903 that could be described as the right and left piping operators:
9907 infix 1 >| val op>| = op</ (* Left pipe *)
9908 infixr 1 |< val op|< = op\> (* Right pipe *)
9911 As you can see, the left and right piping operators, `>|` and `|<`,
9912 are the same as the right and left application operators,
9913 respectively, except the associativities are reversed and the binding
9914 strength is lower. They are useful for piping data through a sequence
9915 of operations. In general,
9918 x >| f1 >| ... >| fN = fN (... (f1 x) ...) = (fN o ... o f1) x
9924 fN |< ... |< f1 |< x = fN (... (f1 x) ...) = (fN o ... o f1) x
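
The following sketch pipes a list through two operations in each direction. The values `total` and `text` are illustrative; the operator definitions are repeated from above so that the fragment compiles on its own.

```sml
infix  3 \>     fun f \> y = f y                 (* Left application  *)
infixr 3 </     fun x </ f = f x                 (* Right application *)
infix  1 >|     val op>| = op</                  (* Left pipe  *)
infixr 1 |<     val op|< = op\>                  (* Right pipe *)

(* Data flows left to right: double every element, then sum. *)
val total = [1, 2, 3] >| List.map (fn x => x * 2) >| List.foldl op+ 0

(* Data flows right to left: convert every element, then concatenate. *)
val text = String.concat |< List.map Int.toString |< [1, 2, 3]
```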
The right piping operator, `|<`, is provided by the Haskell prelude as
`$`. It can be convenient in continuation-passing style (CPS).
9930 A use for the left piping operator is with parsing combinators. In a
9931 strict language, like SML, eta-reduction is generally unsafe. Using
the left piping operator, parsing functions can be formatted conveniently as
9937 fun parsingFunc input =
where `||` is assumed to be a combinator provided by the parsing combinator library.
9947 == About precedences ==
9949 You probably noticed that we redefined the
9950 <:OperatorPrecedence:precedences> of the function composition operator
9951 `o` and the assignment operator `:=`. Doing so is not strictly
9952 necessary, but can be convenient and should be relatively
safe. Consider the following motivating examples from
<:WesleyTerpstra: Wesley W. Terpstra> relying on the redefined precedences:
9959 Word8.fromInt o Char.ord o s <\String.sub
9960 (* Combining sectioning and composition *)
9962 x := s <\String.sub\> i
9963 (* Assigning the result of an infixed application *)
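
Both examples can be tried with the sketch below. It assumes the fixity declarations `infix 2 o` and `infix 0 :=` discussed in this section; the string `s` and the reference `x` are made-up test data.

```sml
infix 3 <\     fun x <\ f = fn y => f (x, y)     (* Left section     *)
infix 3 \>     fun f \> y = f y                  (* Left application *)
infix 2 o
infix 0 :=

val s = "abc"

(* Parsed as Word8.fromInt o Char.ord o (s <\String.sub). *)
val byteAt = Word8.fromInt o Char.ord o s <\String.sub

(* Parsed as x := (s <\String.sub\> 1). *)
val x = ref #"?"
val () = x := s <\String.sub\> 1
```

With the Basis Library defaults, `o` and `:=` have precedence 3, the same as `<\` and `\>`, so neither expression would parse as intended without parentheses.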
9966 In imperative languages, assignment usually has the lowest precedence
9967 (ignoring statement separators). The precedence of `:=` in the
9968 <:BasisLibrary: Basis Library> is perhaps unnecessarily high, because
an expression of the form `r := x` always returns unit, which makes
little sense to combine with anything. Dropping `:=` to the lowest
precedence level makes it behave more like assignment in other imperative
languages.
9974 The case for `o` is different. With the exception of `before` and
9975 `:=`, it doesn't seem to make much sense to use `o` with any of the
9976 operators defined by the <:BasisLibrary: Basis Library> in an
9977 unparenthesized expression. This is simply because none of the other
9978 operators deal with functions. It would seem that the precedence of
9979 `o` could be chosen completely arbitrarily from the set `{1, ..., 9}`
9980 without having any adverse effects with respect to other infix
9981 operators defined by the <:BasisLibrary: Basis Library>.
9984 == Design of the symbols ==
9986 The closest approximation of Haskell's ++x `f` y++ syntax
9987 achievable in Standard ML would probably be something like
9988 ++x `f^ y++, but `^` is already used for string
9989 concatenation by the <:BasisLibrary: Basis Library>. Other
9990 combinations of the characters +`+ and `^` would be
9991 possible, but none seems clearly the best visually. The symbols `<\`,
9992 `\>`, `</`, and `/>` are reasonably concise and have a certain
self-documenting appearance and symmetry, which can make them easier to
remember. As the names suggest, the symbols of the piping operators `>|`
9995 and `|<` are inspired by Unix shell pipelines.
10004 :mlton-guide-page: Inline
10009 <:Inline:> is an optimization pass for the <:SSA:>
10010 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10014 This pass inlines <:SSA:> functions using a size-based metric.
10016 == Implementation ==
10018 * <!ViewGitFile(mlton,master,mlton/ssa/inline.sig)>
10019 * <!ViewGitFile(mlton,master,mlton/ssa/inline.fun)>
10021 == Details and Notes ==
10023 The <:Inline:> pass can be invoked to use one of three metrics:
10025 * `NonRecursive(product, small)` -- inline any function satisfying `(numCalls - 1) * (size - small) <= product`, where `numCalls` is the static number of calls to the function and `size` is the size of the function.
10026 * `Leaf(size)` -- inline any leaf function smaller than `size`
10027 * `LeafNoLoop(size)` -- inline any leaf function without loops smaller than `size`
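
To make the `NonRecursive` inequality concrete, here is a minimal sketch of the test as a predicate. The record types, the function name `shouldInline`, and the parameter values are assumptions chosen for illustration; they are not taken from MLton's implementation.

```sml
(* Hypothetical record types; MLton's own representation may differ. *)
type nonRecursive = {product: int, small: int}
type funcInfo = {numCalls: int, size: int}

fun shouldInline ({product, small}: nonRecursive) ({numCalls, size}: funcInfo) =
   (numCalls - 1) * (size - small) <= product

(* Arbitrary example parameters. *)
val metric = {product = 320, small = 60}

val a = shouldInline metric {numCalls = 1, size = 5000}   (* called once: 0 <= 320 *)
val b = shouldInline metric {numCalls = 2, size = 50}     (* below small: ~10 <= 320 *)
val c = shouldInline metric {numCalls = 5, size = 200}    (* 4 * 140 = 560 > 320 *)
```

The first two cases show the two ways the inequality is trivially satisfied: a function with a single call site, and a function no larger than `small`.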
10031 :mlton-guide-page: InsertLimitChecks
10032 [[InsertLimitChecks]]
10036 <:InsertLimitChecks:> is a pass for the <:RSSA:>
10037 <:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
10041 This pass inserts limit checks.
10043 == Implementation ==
10045 * <!ViewGitFile(mlton,master,mlton/backend/limit-check.fun)>
10047 == Details and Notes ==
10053 :mlton-guide-page: InsertSignalChecks
10054 [[InsertSignalChecks]]
10058 <:InsertSignalChecks:> is a pass for the <:RSSA:>
10059 <:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
10063 This pass inserts signal checks.
10065 == Implementation ==
10067 * <!ViewGitFile(mlton,master,mlton/backend/limit-check.fun)>
10069 == Details and Notes ==
10075 :mlton-guide-page: Installation
MLton runs on a variety of platforms and is distributed in both source and binary form.
10083 A `.tgz` or `.tbz` binary package can be extracted at any location, yielding
10084 `README.adoc` (this file), `CHANGELOG.adoc`, `LICENSE`, `Makefile`, `bin/`,
10085 `lib/`, and `share/`. The compiler and tools can be executed in-place (e.g.,
A small set of `Makefile` variables can be used to customize the binary package via `make update`:
10091 * `CC`: Specify C compiler. Can be used for alternative tools (e.g.,
10092 `CC=clang` or `CC=gcc-7`).
10093 * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include
10094 and library paths, if not on default search paths. (If `WITH_GMP_DIR` is
10095 set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and
10096 `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.)
10102 $ make CC=clang WITH_GMP_DIR=/opt/gmp update
10105 On typical platforms, installing MLton (after optionally performing
10106 `make update`) to `/usr/local` can be accomplished via:
10113 A small set of `Makefile` variables can be used to customize the installation:
10115 * `PREFIX`: Specify the installation prefix.
10116 * `CC`: Specify C compiler. Can be used for alternative tools (e.g.,
10117 `CC=clang` or `CC=gcc-7`).
10118 * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include
10119 and library paths, if not on default search paths. (If `WITH_GMP_DIR` is
10120 set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and
10121 `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.)
10127 $ make PREFIX=/opt/mlton install
10130 Installation of MLton creates the following files and directories.
10132 * ++__prefix__/bin/mllex++
10134 The <:MLLex:> lexer generator.
10136 * ++__prefix__/bin/mlnlffigen++
10138 The <:MLNLFFI:ML-NLFFI> tool.
10140 * ++__prefix__/bin/mlprof++
10142 A <:Profiling:> tool.
10144 * ++__prefix__/bin/mlton++
A script to call the compiler. This script may be moved anywhere;
however, it makes use of files in ++__prefix__/lib/mlton++.
10149 * ++__prefix__/bin/mlyacc++
10151 The <:MLYacc:> parser generator.
10153 * ++__prefix__/lib/mlton++
10155 Directory containing libraries and include files needed during compilation.
10157 * ++__prefix__/share/man/man1/{mllex,mlnlffigen,mlprof,mlton,mlyacc}.1++
10161 * ++__prefix__/share/doc/mlton++
Directory containing the user guide for MLton, mllex, and mlyacc, as
well as example SML programs (in the `examples` directory), and license
information.
10168 == Hello, World! ==
10170 Once you have installed MLton, create a file called `hello-world.sml`
10171 with the following contents.
10174 print "Hello, world!\n";
10177 Now create an executable, `hello-world`, with the following command.
10179 mlton hello-world.sml
10182 You can now run `hello-world` to verify that it works. There are more
10183 small examples in ++__prefix__/share/doc/mlton/examples++.
10186 == Installation on Cygwin ==
10188 When installing the Cygwin `tgz`, you should use Cygwin's `bash` and
10189 `tar`. The use of an archiving tool that is not aware of Cygwin's
10190 mounts will put the files in the wrong place.
10194 :mlton-guide-page: IntermediateLanguage
10195 [[IntermediateLanguage]]
10196 IntermediateLanguage
10197 ====================
MLton uses a number of intermediate languages in translating from the input source program to low-level code. Here they are, in the order in which the program is translated through them.
10201 * <:AST:>. Pretty close to the source.
10202 * <:CoreML:>. Explicitly typed, no module constructs.
10203 * <:XML:>. Polymorphic, <:HigherOrder:>.
10204 * <:SXML:>. SimplyTyped, <:HigherOrder:>.
10205 * <:SSA:>. SimplyTyped, <:FirstOrder:>.
10206 * <:SSA2:>. SimplyTyped, <:FirstOrder:>.
10207 * <:RSSA:>. Explicit data representations.
10208 * <:Machine:>. Untyped register transfer language.
10212 :mlton-guide-page: IntroduceLoops
10217 <:IntroduceLoops:> is an optimization pass for the <:SSA:>
10218 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10222 This pass rewrites any <:SSA:> function that calls itself in tail
10223 position into one with a local loop and no self tail calls.
An <:SSA:> function like
10227 fun F (arg_0, arg_1) = L_0 ()
10236 fun F (arg_0', arg_1') = loopS_0 ()
10238 loop_0 (arg_0', arg_1')
10239 loop_0 (arg_0, arg_1)
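
The pass operates on <:SSA:>, but the idea can be sketched at the source level. The following is only an analogy in Standard ML, with a hypothetical function `sumTo`; it is not the output of the pass.

```sml
(* Before: sumTo calls itself in tail position. *)
fun sumTo (n, acc) =
   if n = 0 then acc else sumTo (n - 1, acc + n)

(* After, by analogy: the self tail call becomes a jump to a local loop,
 * and the function itself is entered only once per external call. *)
fun sumTo' (n, acc) =
   let
      fun loop (n, acc) =
         if n = 0 then acc else loop (n - 1, acc + n)
   in
      loop (n, acc)
   end
```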
10248 == Implementation ==
10250 * <!ViewGitFile(mlton,master,mlton/ssa/introduce-loops.fun)>
10252 == Details and Notes ==
10258 :mlton-guide-page: JesperLouisAndersen
10259 [[JesperLouisAndersen]]
10260 JesperLouisAndersen
10261 ===================
Jesper Louis Andersen is an undergraduate student at DIKU, the Department of Computer Science at the University of Copenhagen. His contributions to MLton are few, though he ported MLton to the NetBSD and OpenBSD platforms.

His general interests in computer science are compiler theory, language theory, algorithms and data structures, and programming. His assets are his general knowledge of UNIX systems, system administration, and operating system kernels, NetBSD in particular.

He was employed by the university as a system administrator for two years, which has set him back somewhat in his studies. Currently he is trying to learn mathematics (real analysis, general topology, complex functional analysis, and algebra).
10270 == Projects using MLton ==
10272 === A register allocator ===
For internal use at a compiler course at DIKU. It is written in the literate programming style and implements the _Iterated Register Coalescing_ algorithm by Lal George and Andrew Appel (http://citeseer.ist.psu.edu/george96iterated.html). The project is unfinished: most of the basic parts of the algorithm are done, but the interface to the students' (simple) datatype takes some conversion.
10275 === A configuration management system in SML ===
At this time, only loose plans exist for this. The plan is to build a configuration management system on the principles of the OpenCM system; see http://www.opencm.org/docs.html. The basic idea is to unify "naming" and "identity" into one by uniquely identifying all objects managed in the repository by the use of cryptographic checksums. This mantra guides the rest of the system, providing integrity, accessibility and confidentiality.
10280 :mlton-guide-page: JohnnyAndersen
10285 Johnny Andersen (aka Anoq of the Sun)
10287 Here is a picture in front of the academy building
10288 at the University of Athens, Greece, taken in September 2003.
10290 image::JohnnyAndersen.attachments/anoq.jpg[align="center"]
10294 :mlton-guide-page: KnownCase
10299 <:KnownCase:> is an optimization pass for the <:SSA:>
10300 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10304 This pass duplicates and simplifies `Case` transfers when the
10305 constructor of the scrutinee is known.
10309 For example, the program
10317 val _ = 1 + last [2, 3, 4, 5, 6, 7]
10320 gives rise to the <:SSA:> function
10323 fun last_0 (x_142) = loopS_1 ()
10328 nil_1 => L_73 | ::_0 => L_74
10331 L_74 (x_145, x_144)
10333 nil_1 => L_75 | _ => L_76
10340 which is simplified to
10343 fun last_0 (x_142) = loopS_1 ()
10346 nil_1 => L_73 | ::_0 => L_118
10349 L_118 (x_230, x_229)
10350 L_74 (x_230, x_229, x_142)
10351 L_74 (x_145, x_144, x_232)
10353 nil_1 => L_75 | ::_0 => L_114
10356 L_114 (x_227, x_226)
10357 L_74 (x_227, x_226, x_145)
10360 == Implementation ==
10362 * <!ViewGitFile(mlton,master,mlton/ssa/known-case.fun)>
10364 == Details and Notes ==
One interesting aspect of <:KnownCase:> is that it often has the
10367 effect of unrolling list traversals by one iteration, moving the
10368 `nil`/`::` check to the end of the loop, rather than the beginning.
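
The unrolling effect can be pictured at the source level. The sketch below is a hand-written analogy, not compiler output: `last` is written in the spirit of the example program, and `lastUnrolled` is a hypothetical rewrite that mimics the shape of the optimized loop.

```sml
(* A source-level list traversal. *)
fun last l =
   case l of
      [] => raise Fail "last"
    | [x] => x
    | _ :: l' => last l'

(* Hand-written analogy of the optimized loop: after the first nil/:: test,
 * every iteration already knows its argument is a cons cell, and the test
 * for the next cell happens at the end of the iteration. *)
fun lastUnrolled l =
   let
      fun loop (x, rest) =
         case rest of
            [] => x
          | y :: rest' => loop (y, rest')
   in
      case l of
         [] => raise Fail "last"
       | x :: rest => loop (x, rest)
   end
```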
10372 :mlton-guide-page: LambdaCalculus
10377 The http://en.wikipedia.org/wiki/Lambda_calculus[lambda calculus] is
10378 the formal system underlying <:StandardML:Standard ML>.
10382 :mlton-guide-page: LambdaFree
10387 <:LambdaFree:> is an analysis pass for the <:SXML:>
10388 <:IntermediateLanguage:>, invoked from <:ClosureConvert:>.
10392 This pass descends the entire <:SXML:> program and attaches a property
10393 to each `Lambda` `PrimExp.t` in the program. Then, you can use
10394 `lambdaFree` and `lambdaRec` to get free variables of that `Lambda`.
10396 == Implementation ==
10398 * <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.sig)>
10399 * <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.fun)>
10401 == Details and Notes ==
10403 For `Lambda`-s bound in a `Fun` dec, `lambdaFree` gives the union of
10404 the frees of the entire group of mutually recursive functions. Hence,
10405 `lambdaFree` for every `Lambda` in a single `Fun` dec is the same.
Furthermore, for a `Lambda` bound in a `Fun` dec, `lambdaRec` gives
the list of other functions bound in the same dec defining that `Lambda`.
10412 val rec f = fn x => ... y ... g ... f ...
10413 and g = fn z => ... f ... w ...
10417 lambdaFree(fn x =>) = [y, w]
10418 lambdaFree(fn z =>) = [y, w]
10419 lambdaRec(fn x =>) = [g, f]
10420 lambdaRec(fn z =>) = [f]
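
The way `lambdaFree` treats a group of mutually recursive functions can be sketched on a toy lambda language. Everything below (the `exp` datatype and the functions `free` and `groupFree`) is a simplified assumption made for illustration and does not reflect the actual <:SXML:> data structures.

```sml
(* A toy lambda language; MLton's SXML is considerably richer. *)
datatype exp =
   Var of string
 | App of exp * exp
 | Lam of string * exp

fun member (x: string, xs) = List.exists (fn y => y = x) xs
fun union (xs, ys) = xs @ List.filter (fn y => not (member (y, xs))) ys

(* Free variables of a single expression. *)
fun free (Var x) = [x]
  | free (App (e1, e2)) = union (free e1, free e2)
  | free (Lam (x, body)) = List.filter (fn y => y <> x) (free body)

(* Free variables of a group of mutually recursive bindings:
 * the union over all bodies, minus the names bound by the group. *)
fun groupFree (decs: (string * exp) list) =
   let
      val names = List.map #1 decs
      val all = List.foldl (fn ((_, lam), acc) => union (acc, free lam)) [] decs
   in
      List.filter (fn v => not (member (v, names))) all
   end

(* val rec f = fn x => y g f  and g = fn z => f w *)
val decs =
   [("f", Lam ("x", App (App (Var "y", Var "g"), Var "f"))),
    ("g", Lam ("z", App (Var "f", Var "w")))]

val frees = groupFree decs
```

As in the example above, both functions in the group share the same set of free variables, here `y` and `w`.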
10425 :mlton-guide-page: LanguageChanges
10426 [[LanguageChanges]]
10430 We are sometimes asked to modify MLton to change the language it
10431 compiles. In short, we are conservative about making such changes.
10432 There are a number of reasons for this.
10434 * <:DefinitionOfStandardML:The Definition of Standard ML> is an
10435 extremely high standard of specification. The value of the Definition
10436 would be significantly diluted by changes that are not specified at an
10437 equally high level, and the dilution increases with the complexity of
10438 the language change and its interaction with other language features.
10440 * The SML community is small and there are a number of
10441 <:StandardMLImplementations:SML implementations>. Without an
10442 agreed-upon standard, it becomes very difficult to port programs
10443 between compilers, and the community would be balkanized.
10445 * Our main goal is to enable programmers to be as effective as
10446 possible with MLton/SML. There are a number of improvements other
10447 than language changes that we could spend our time on that would
10448 provide more benefit to programmers.
10450 * The more the language that MLton compiles changes over time, the
10451 more difficult it is to use MLton as a stable platform for serious
10452 program development.
10454 Despite these drawbacks, we have extended SML in a couple of cases.
10456 * <:ForeignFunctionInterface: Foreign function interface>
10457 * <:MLBasis: ML Basis system>
10458 * <:SuccessorML: Successor ML features>
10460 We allow these language extensions because they provide functionality
10461 that is impossible to achieve without them or have non-trivial
10462 community support. The Definition does not define a foreign function
10463 interface. So, we must either extend the language or greatly restrict
10464 the class of programs that can be written. Similarly, the Definition
10465 does not provide a mechanism for namespace control at the module
10466 level, making it impossible to deliver packaged libraries and have a
10467 hope of users using them without name clashes. The ML Basis system
10468 addresses this problem. We have also provided a formal specification
10469 of the ML Basis system at the level of the Definition.
10473 * http://www.mlton.org/pipermail/mlton/2004-August/016165.html
10474 * http://www.mlton.org/pipermail/mlton-user/2004-December/000320.html
10478 :mlton-guide-page: Lazy
10483 In a lazy (or non-strict) language, the arguments to a function are
10484 not evaluated before calling the function. Instead, the arguments are
10485 suspended and only evaluated by the function if needed.
10487 <:StandardML:Standard ML> is an eager (or strict) language, not a lazy
10488 language. However, it is easy to delay evaluation of an expression in
10489 SML by creating a _thunk_, which is a nullary function. In SML, a
10490 thunk is written `fn () => e`. Another essential feature of laziness
10491 is _memoization_, meaning that once a suspended argument is evaluated,
10492 subsequent references look up the value. We can express this in SML
10493 with a function that maps a thunk to a memoized thunk.
10499 val lazy: (unit -> 'a) -> unit -> 'a
10503 This is easy to implement in SML.
10507 structure Lazy: LAZY =
10509 fun lazy (th: unit -> 'a): unit -> 'a =
10511 datatype 'a lazy_result = Unevaluated of (unit -> 'a)
10515 val r = ref (Unevaluated th)
10519 Unevaluated th => let
10521 handle x => (r := Failed x; raise x)
10522 val () = r := Evaluated a
10527 | Failed x => raise x
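
For reference, here is a complete, compilable sketch of this approach together with a small usage example. It assumes a `LAZY` signature containing just the `lazy` specification shown above. The datatype is declared at the structure level rather than inside `lazy`, which is a choice of this sketch. The names `count` and `answer` are illustrative.

```sml
signature LAZY =
   sig
      val lazy: (unit -> 'a) -> unit -> 'a
   end

structure Lazy: LAZY =
   struct
      datatype 'a lazy_result =
         Unevaluated of (unit -> 'a)
       | Evaluated of 'a
       | Failed of exn

      fun lazy (th: unit -> 'a): unit -> 'a =
         let
            val r = ref (Unevaluated th)
         in
            fn () =>
               case !r of
                  Unevaluated th =>
                     let
                        (* Remember a failure so that it is re-raised later. *)
                        val a = th ()
                           handle x => (r := Failed x; raise x)
                        val () = r := Evaluated a
                     in
                        a
                     end
                | Evaluated a => a
                | Failed x => raise x
         end
   end

(* Usage: the thunk body runs once, however often the result is forced. *)
val count = ref 0
val answer = Lazy.lazy (fn () => (count := !count + 1; 42))
val first = answer ()
val second = answer ()
```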
10534 :mlton-guide-page: Libraries
In theory, every strictly conforming Standard ML program should run on
MLton. However, large SML projects often use implementation-specific
features, so some "porting" is required. Here is a partial list of
10542 software that is known to run on MLton.
10544 * Utility libraries:
10545 ** <:SMLNJLibrary:> - distributed with MLton
10546 ** <:MLtonLibraryProject:> - various libraries located on the MLton subversion repository
** <!ViewGitDir(mlton,master,lib/mlton)> - the internal MLton utility library, which we hope to clean up and make more accessible someday
10548 ** http://github.com/seanmcl/sml-ext[sml-ext], a grab bag of libraries for MLton and other SML implementations (by Sean McLaughlin)
10549 ** http://tom7misc.cvs.sourceforge.net/tom7misc/sml-lib/[sml-lib], a grab bag of libraries for MLton and other SML implementations (by <:TomMurphy:>)
10550 * Scanner generators:
10551 ** <:MLLPTLibrary:> - distributed with MLton
10552 ** <:MLLex:> - distributed with MLton
10554 * Parser generators:
10556 ** <:MLLPTLibrary:> - distributed with MLton
10557 ** <:MLYacc:> - distributed with MLton
10558 * Concurrency: <:ConcurrentML:> - distributed with MLton
10563 ** <:CKitLibrary:> - distributed with MLton
10564 ** <:MLRISCLibrary:> - distributed with MLton
10565 ** <:MLNLFFI:ML-NLFFI> - distributed with MLton
10566 ** <:Swerve:>, an HTTP server
10567 ** <:fxp:>, an XML parser
10569 == Ports in progress ==
10571 <:Contact:> us for details on any of these.
10573 * <:MLDoc:> http://people.cs.uchicago.edu/%7Ejhr/tools/ml-doc.html
10578 More projects using MLton can be seen on the <:Users:> page.
10580 == Software for SML implementations other than MLton ==
10583 ** Moscow ML: http://www.dina.kvl.dk/%7Esestoft/mosmllib/Postgres.html
10584 ** SML/NJ NLFFI: http://smlweb.sourceforge.net/smlsql/
10586 ** ML Kit: http://www.smlserver.org[SMLserver] (a plugin for AOLserver)
10587 ** Moscow ML: http://ellemose.dina.kvl.dk/%7Esestoft/msp/index.msp[ML Server Pages] (support for PHP-style CGI scripting)
10588 ** SML/NJ: http://smlweb.sourceforge.net/[smlweb]
10592 :mlton-guide-page: LibrarySupport
10597 MLton supports both linking to and creating system-level libraries.
10598 While Standard ML libraries should be designed with the <:MLBasis:> system to work with other Standard ML programs,
10599 system-level library support allows MLton to create libraries for use by other programming languages.
10600 Even more importantly, system-level library support allows MLton to access libraries from other languages.
10601 This article will explain how to use libraries portably with MLton.
10605 A Dynamic Shared Object (DSO) is a piece of executable code written in a format understood by the operating system.
10606 Executable programs and dynamic libraries are the two most common examples of a DSO.
10607 They are called shared because if they are used more than once, they are only loaded once into main memory.
10608 For example, if you start two instances of your web browser (an executable), there may be two processes running, but the program code of the executable is only loaded once.
10609 A dynamic library, for example a graphical toolkit, might be used by several different executable programs, each possibly running multiple times.
Nevertheless, the dynamic library is only loaded once and its program code is shared between all of the processes.
10612 In addition to program code, DSOs contain a table of textual strings called symbols.
10613 These are used in order to make the DSO do something useful, like execute.
For example, on Linux the symbol `_start` refers to the point in the program code where the operating system should start executing the program.
10615 Dynamic libraries generally provide many symbols, corresponding to functions which can be called and variables which can be read or written.
10616 Symbols can be used by the DSO itself, or by other DSOs which require services.
10618 When a DSO creates a symbol, this is called 'exporting'.
10619 If a DSO needs to use a symbol, this is called 'importing'.
10620 A DSO might need to use symbols defined within itself or perhaps from another DSO.
10621 In both cases, it is importing that symbol, but the scope of the import differs.
10622 Similarly, a DSO might export a symbol for use only within itself, or it might export a symbol for use by other DSOs.
10623 Some symbols are resolved at compile time by the linker (those used within the DSO) and some are resolved at runtime by the dynamic link loader (symbols accessed between DSOs).
10625 == Symbols in MLton ==
10627 Symbols in MLton are both imported and exported via the <:ForeignFunctionInterface:>.
10628 The notation `_import "symbolname"` imports functions, `_symbol "symbolname"` imports variables, and `_address "symbolname"` imports an address.
10629 To create and export a symbol, `_export "symbolname"` creates a function symbol and `_symbol "symbolname" 'alloc'` creates and exports a variable.
10630 For details of the syntax and restrictions on the supported FFI types, read the <:ForeignFunctionInterface:> page.
10631 In this discussion it only matters that every FFI use is either an import or an export.
10633 When exporting a symbol, MLton supports controlling the export scope.
10634 If the symbol should only be used within the same DSO, that symbol has '`private`' scope.
10635 Conversely, if the symbol should also be available to other DSOs the symbol has '`public`' scope.
10636 Generally, one should have as few public exports as possible.
10637 Since they are public, other DSOs will come to depend on them, limiting your ability to change them.
10638 You specify the export scope in MLton by putting `private` or `public` after the symbol's name in an FFI directive.
For example: `_export "foo" private: int->int;` or `_export "bar" public: int->int;`.
10641 For technical reasons, the linker and loader on various platforms need to know the scope of a symbol being imported.
10642 If the symbol is exported by the same DSO, use `public` or `private` as appropriate.
10643 If the symbol is exported by a different DSO, then the scope '`external`' should be used to import it.
10644 Within a DSO, all references to a symbol must use the same scope.
10645 MLton will check this at compile time, reporting: `symbol "foo" redeclared as public (previously external)`. This may cause linker errors.
10646 However, MLton can only check usage within Standard ML.
10647 All objects being linked into a resulting DSO must agree, and it is the programmer's responsibility to ensure this.
10649 Summary of symbol scopes:
10651 * `private`: used for symbols exported within a DSO only for use within that DSO
10652 * `public`: used for symbols exported within a DSO that may also be used outside that DSO
10653 * `external`: used for importing symbols from another DSO
10654 * All uses of a symbol within a DSO (both imports and exports) must agree on the symbol scope
10656 == Output Formats ==
10658 MLton can create executables (`-format executable`) and dynamic shared libraries (`-format library`).
10659 To link a shared library, use `-link-opt -l<dso_name>`.
10660 The default output format is executable.
10662 MLton can also create archives.
10663 An archive is not a DSO, but it does have a collection of symbols.
10664 When an archive is linked into a DSO, it is completely absorbed.
10665 Other objects being compiled into the DSO should refer to the public symbols in the archive as public, since they are still in the same DSO.
10666 However, in the interest of modular programming, private symbols in an archive cannot be used outside of that archive, even within the same DSO.
10668 Although both executables and libraries are DSOs, some implementation details differ on some platforms.
For this reason, MLton can create two types of archives.
10670 A normal archive (`-format archive`) is appropriate for linking into an executable.
10671 Conversely, a libarchive (`-format libarchive`) should be used if it will be linked into a dynamic library.
10673 When MLton does not create an executable, it creates two special symbols.
10674 The symbol `libname_open` is a function which must be called before any other symbols are accessed.
The `libname` is controlled by the `-libname` compile option and defaults to the name of the output, with any `lib` prefix stripped (e.g., `foo` -> `foo`, `libfoo` -> `foo`).
10676 The symbol `libname_close` is a function which should be called to clean up memory once done.
10678 Summary of `-format` options:
10680 * `executable`: create an executable (a DSO)
10681 * `library`: create a dynamic shared library (a DSO)
10682 * `archive`: create an archive of symbols (not a DSO) that can be linked into an executable
10683 * `libarchive`: create an archive of symbols (not a DSO) that can be linked into a library
10687 * `-libname x`: controls the name of the special `_open` and `_close` functions.
10690 == Interfacing with C ==
10692 MLton can generate a C header file.
10693 When the output format is not an executable, it creates one by default named `libname.h`.
10694 This can be overridden with `-export-header foo.h`.
10695 This header file should be included by any C files using the exported Standard ML symbols.
10697 If C is being linked with Standard ML into the same output archive or DSO,
10698 then the C code should `#define PART_OF_LIBNAME` before it includes the header file.
10699 This ensures that the C code is using the symbols with correct scope.
10700 Any symbols exported from C should also be marked using the `PRIVATE`/`PUBLIC`/`EXTERNAL` macros defined in the Standard ML export header.
10701 The declared C scope on exported C symbols should match the import scope used in Standard ML.
10706 #define PART_OF_FOO
10709 PUBLIC int cFoo() {
10716 val () = _export "smlFoo" private: unit -> int; (fn () => 5)
10717 val cFoo = _import "cFoo" public: unit -> int;
10721 == Operating-system specific details ==
10723 On Windows, `libarchive` and `archive` are the same.
10724 However, depending on this will lead to portability problems.
10725 Windows is also especially sensitive to mixups of '`public`' and '`external`'.
If an archive is linked, make sure its symbols are imported as `public`.
If a DLL is linked, make sure its symbols are imported as `external`.
Using `external` instead of `public` will result in link errors stating that `__imp__foo is undefined`.
10729 Using `public` instead of `external` will result in inconsistent function pointer addresses and failure to update the imported variables.
10731 On Linux, `libarchive` and `archive` are different.
10732 Libarchives are quite rare, but necessary if creating a library from an archive.
10733 It is common for a library to provide both an archive and a dynamic library on this platform.
10734 The linker will pick one or the other, usually preferring the dynamic library.
10735 While a quirk of the operating system allows external import to work for both archives and libraries,
10736 portable projects should not depend on this behaviour.
10737 On other systems it can matter how the library is linked (static or dynamic).
10741 :mlton-guide-page: License
10747 In order to allow the maximum freedom for the future use of the
10748 content in this web site, we require that contributions to the web
site be dedicated to the public domain. That means that you can only
add works that are already in the public domain, or works on which you
hold the copyright and that you agree to dedicate to the public domain.
10754 By contributing to this web site, you agree to dedicate your
10755 contribution to the public domain.
10759 As of 20050812, MLton software is licensed under the BSD-style license
10760 below. By contributing code to the project, you agree to release the
10761 code under this license. Contributors can retain copyright to their
10762 contributions by asserting copyright in their code. Contributors may
10763 also add to the list of copyright holders in
10764 `doc/license/MLton-LICENSE`, which appears below.
10768 sys::[./bin/InclGitFile.py mlton master doc/license/MLton-LICENSE]
10773 :mlton-guide-page: LineDirective
10778 To aid in the debugging of code produced by program generators such
10779 as http://www.eecs.harvard.edu/%7Enr/noweb/[Noweb], MLton supports
10780 comments with line directives of the form
10785 Here, _l_ and _c_ are sequences of decimal digits and _f_ is the
10786 source file. The first character of a source file has the position
10787 1.1. A line directive causes the front end to believe that the
10788 character following the right parenthesis is at the line and column of
10789 the specified file. A line directive only affects the reporting of
10790 error messages and does not affect program semantics (except for
10791 functions like `MLton.Exn.history` that report source file positions).
10792 Syntactically invalid line directives are ignored. To prevent
10793 incompatibilities with SML, the file name may not contain the
10794 character sequence `*)`.
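
As a small illustration, a generator that copies a declaration from line 42 of a literate source might emit the fragment below. The file name `parser.nw`, the position `42.1`, and the function `head` are hypothetical, and the sketch assumes the directive is written as `(*#line l.c "f"*)`. Because the directive is an ordinary SML comment, the code compiles either way; with the directive in place, the nonexhaustive-match warning for `head` should be reported against `parser.nw` rather than against the generated file.

[source,sml]
----
(* The `f` of `fun` is the character following the right parenthesis,
 * so the front end treats it as line 42, column 1 of parser.nw. *)
(*#line 42.1 "parser.nw"*)fun head (x :: _) = x

val first = head [1, 2, 3]
----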
10798 :mlton-guide-page: LLVM
10803 The http://www.llvm.org/[LLVM Project] is a collection of modular and
10804 reusable compiler and toolchain technologies.
10806 MLton supports code generation via LLVM (`-codegen llvm`); see
10815 :mlton-guide-page: LLVMCodegen
10820 The <:LLVMCodegen:> is a <:Codegen:code generator> that translates the
10821 <:Machine:> <:IntermediateLanguage:> to <:LLVM:> assembly, which is
10822 further optimized and compiled to native object code by the <:LLVM:>
10825 It requires <:LLVM:> version 3.7 or greater to be installed.
10827 In benchmarks performed on the <:RunningOnAMD64:AMD64> architecture,
10828 code size with this generator is usually slightly smaller than either
10829 the <:AMD64Codegen:native> or the <:CCodegen:C> code generators. Compile
10830 time is worse than <:AMD64Codegen:native>, but slightly better than
10831 <:CCodegen:C>. Run time is often better than either <:AMD64Codegen:native>
10834 == Implementation ==
10836 * <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.sig)>
10837 * <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.fun)>
10839 == Details and Notes ==
10841 The <:LLVMCodegen:> was initially developed by Brian Leibig (see
10842 <!Cite(Leibig13,An LLVM Back-end for MLton)>).
10846 :mlton-guide-page: LocalFlatten
10851 <:LocalFlatten:> is an optimization pass for the <:SSA:>
10852 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10856 This pass flattens arguments to <:SSA:> blocks.
10858 A block argument is flattened as long as it only flows to selects and
10859 there is some tuple constructed in this function that flows to it.
10861 == Implementation ==
10863 * <!ViewGitFile(mlton,master,mlton/ssa/local-flatten.fun)>
10865 == Details and Notes ==
10871 :mlton-guide-page: LocalRef
10876 <:LocalRef:> is an optimization pass for the <:SSA:>
10877 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10881 This pass optimizes `ref` cells local to an <:SSA:> function:
10883 * global `ref`-s only used in one function are moved to the function
10885 * `ref`-s only created, read from, and written to (i.e., don't escape)
10886 are converted into function local variables
10888 Uses <:Multi:> and <:Restore:>.
10890 == Implementation ==
10892 * <!ViewGitFile(mlton,master,mlton/ssa/local-ref.fun)>
10894 == Details and Notes ==
10896 Moving a global `ref` requires the <:Multi:> analysis, because a
10897 global `ref` can only be moved into a function that is executed at
10900 Conversion of non-escaping `ref`-s is structured in three phases:
10902 * analysis -- a variable `r = Ref_ref x` escapes if
10903 ** `r` is used in any context besides `Ref_assign (r, _)` or `Ref_deref r`
10904 ** all uses of `r` reachable from a (direct or indirect) call to `Thread_copyCurrent` are of the same flavor (either `Ref_assign` or `Ref_deref`); this also requires the <:Multi:> analysis.
10909 ** rewrites `r = Ref_ref x` to `r = x`
10910 ** rewrites `_ = Ref_assign (r, y)` to `r = y`
10911 ** rewrites `z = Ref_deref r` to `z = r`
10914 Note that the resulting program violates the SSA condition.
10916 * <:Restore:> -- restore the SSA condition.
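
The effect of converting a non-escaping `ref` can be pictured at the source level. The following is only an analogy under stated assumptions: the pass works on <:SSA:> functions, not on SML source, and the names `sumTo` and `sumTo'` are invented for the example. In `sumTo` the cell `acc` is only created, read, and written, so it could be replaced by an ordinary variable, which is what `sumTo'` does by hand.

[source,sml]
----
(* Before: `acc` is a ref cell that never leaves sumTo. *)
fun sumTo (n: int) : int =
   let
      val acc = ref 0
      fun loop i =
         if i > n then () else (acc := !acc + i; loop (i + 1))
   in
      loop 1; !acc
   end

(* After (hand-written equivalent): the cell becomes a loop variable. *)
fun sumTo' (n: int) : int =
   let
      fun loop (i, acc) =
         if i > n then acc else loop (i + 1, acc + i)
   in
      loop (1, 0)
   end
----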
10920 :mlton-guide-page: Logo
10925 ifdef::basebackend-html[]
10926 image::Logo.attachments/mlton.svg[align="center",height="128",width="128"]
10928 ifdef::basebackend-docbook[]
10929 image::Logo.attachments/mlton-128.pdf[align="center"]
10934 * <!Attachment(Logo,mlton.svg)>
10935 * <!Attachment(Logo,mlton-1024.png)>
10936 * <!Attachment(Logo,mlton-512.png)>
10937 * <!Attachment(Logo,mlton-256.png)>
10938 * <!Attachment(Logo,mlton-128.png)>
10939 * <!Attachment(Logo,mlton-64.png)>
10940 * <!Attachment(Logo,mlton-32.png)>
10944 :mlton-guide-page: LoopInvariant
10949 <:LoopInvariant:> is an optimization pass for the <:SSA:>
10950 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10954 This pass removes loop invariant arguments to local loops.
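
A source-level sketch of the idea follows. It is an illustration only: the pass operates on <:SSA:> blocks, and the functions `countUp` and `countUp'` are hypothetical. In `countUp` the argument `step` is passed to every iteration unchanged; in `countUp'` it has been dropped from the loop and the body refers to the enclosing binding instead.

[source,sml]
----
(* Before: `step` is threaded through the loop but never changes. *)
fun countUp (limit: int, step: int) : int =
   let
      fun loop (i, step, acc) =
         if i >= limit then acc else loop (i + step, step, acc + 1)
   in
      loop (0, step, 0)
   end

(* After: the invariant argument is removed from the loop. *)
fun countUp' (limit: int, step: int) : int =
   let
      fun loop (i, acc) =
         if i >= limit then acc else loop (i + step, acc + 1)
   in
      loop (0, 0)
   end
----

Both versions assume a positive `step`; otherwise the loop does not terminate.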
10976 == Implementation ==
10978 * <!ViewGitFile(mlton,master,mlton/ssa/loop-invariant.fun)>
10980 == Details and Notes ==
10986 :mlton-guide-page: LoopUnroll
10991 <:LoopUnroll:> is an optimization pass for the <:SSA:> <:IntermediateLanguage:>,
10992 invoked from <:SSASimplify:>.
10996 A simple loop unrolling optimization.
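
For readers unfamiliar with the technique, here is a hand-written sketch of unrolling by a factor of two. It shows the general transformation, not the code this pass emits or the heuristics it applies; `sumArr` and `sumArrUnrolled` are hypothetical names.

[source,sml]
----
(* Original loop: one element per iteration. *)
fun sumArr (a: int array) : int =
   let
      val n = Array.length a
      fun loop (i, acc) =
         if i >= n then acc else loop (i + 1, acc + Array.sub (a, i))
   in
      loop (0, 0)
   end

(* Unrolled by two: two elements per iteration, plus a fix-up step
 * for a possible leftover element when the length is odd. *)
fun sumArrUnrolled (a: int array) : int =
   let
      val n = Array.length a
      fun loop (i, acc) =
         if i + 1 < n
            then loop (i + 2, acc + Array.sub (a, i) + Array.sub (a, i + 1))
         else if i < n
            then acc + Array.sub (a, i)
         else acc
   in
      loop (0, 0)
   end
----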
10998 == Implementation ==
11000 * <!ViewGitFile(mlton,master,mlton/ssa/loop-unroll.fun)>
11002 == Details and Notes ==
11008 :mlton-guide-page: LoopUnswitch
11013 <:LoopUnswitch:> is an optimization pass for the <:SSA:>
11014 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
11018 A simple loop unswitching optimization.
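
As a sketch of the general technique (not of the code this pass produces), unswitching moves a test whose outcome cannot change during the loop out of the loop body, at the cost of duplicating the loop. The functions `mapSign` and `mapSign'` are hypothetical.

[source,sml]
----
(* Before: `negate` is tested on every iteration. *)
fun mapSign (negate: bool, xs: int list) : int list =
   let
      fun loop ([], acc) = rev acc
        | loop (x :: rest, acc) =
             loop (rest, (if negate then ~x else x) :: acc)
   in
      loop (xs, [])
   end

(* After: the test is made once, and each branch gets its own loop. *)
fun mapSign' (negate: bool, xs: int list) : int list =
   let
      fun loopNeg ([], acc) = rev acc
        | loopNeg (x :: rest, acc) = loopNeg (rest, ~x :: acc)
      fun loopPos ([], acc) = rev acc
        | loopPos (x :: rest, acc) = loopPos (rest, x :: acc)
   in
      if negate then loopNeg (xs, []) else loopPos (xs, [])
   end
----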
11020 == Implementation ==
11022 * <!ViewGitFile(mlton,master,mlton/ssa/loop-unswitch.fun)>
11024 == Details and Notes ==
11030 :mlton-guide-page: Machine
11035 <:Machine:> is an <:IntermediateLanguage:>, translated from <:RSSA:>
11036 by <:ToMachine:> and used as input by the <:Codegen:>.
11040 <:Machine:> is an <:Untyped:> <:IntermediateLanguage:>, corresponding
11041 to an abstract register machine.
11043 == Implementation ==
11045 * <!ViewGitFile(mlton,master,mlton/backend/machine.sig)>
11046 * <!ViewGitFile(mlton,master,mlton/backend/machine.fun)>
11048 == Type Checking ==
11050 The <:Machine:> <:IntermediateLanguage:> has a primitive type checker
11051 (<!ViewGitFile(mlton,master,mlton/backend/machine.sig)>,
11052 <!ViewGitFile(mlton,master,mlton/backend/machine.fun)>), which only checks
11053 some liveness properties.
11055 == Details and Notes ==
11057 The runtime structure sets some constants according to the
11058 configuration files on the target architecture and OS.
11062 :mlton-guide-page: ManualPage
11067 MLton is run from the command line with a collection of options
11068 followed by a file name and a list of files to compile, assemble, and
11072 mlton [option ...] file.{c|mlb|o|sml} [file.{c|o|s|S} ...]
11075 The simplest case is to run `mlton foo.sml`, where `foo.sml` contains
11076 a valid SML program, in which case MLton compiles the program to
11077 produce an executable `foo`. Since MLton does not support separate
11078 compilation, the program must be the entire program you wish to
11079 compile. However, the program may refer to signatures and structures
11080 defined in the <:BasisLibrary:Basis Library>.
11082 Larger programs, spanning many files, can be compiled with the
11083 <:MLBasis:ML Basis system>. In this case, `mlton foo.mlb` will
11084 compile the complete SML program described by the basis `foo.mlb`,
11085 which may specify both SML files and additional bases.
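
For instance, a two-file program might be described by a basis file like the one sketched below. The file names `util.sml` and `main.sml` are hypothetical; the first line imports the Basis Library. Running `mlton foo.mlb` would then compile both source files, in the order listed, as one program.

----
(* foo.mlb *)
$(SML_LIB)/basis/basis.mlb
util.sml
main.sml
----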
11089 * <:CompileTimeOptions:>
11090 * <:RunTimeOptions:>
11094 :mlton-guide-page: MatchCompilation
11095 [[MatchCompilation]]
11099 Match compilation is the process of translating an SML match into a
11100 nested tree (or dag) of simple case expressions and tests.
11102 MLton's match compiler is described <:MatchCompile:here>.
11104 == Match compilation in other compilers ==
11106 * <!Cite(BaudinetMacQueen85)>
11107 * <!Cite(Leroy90)>, pages 60-69.
11108 * <!Cite(Sestoft96)>
11109 * <!Cite(ScottRamsey00)>
11113 :mlton-guide-page: MatchCompile
11118 <:MatchCompile:> is a translation pass, agnostic in the
11119 <:IntermediateLanguage:>s between which it translates.
11123 <:MatchCompilation:Match compilation> converts a case expression with
11124 nested patterns into a case expression with flat patterns.
11126 == Implementation ==
11128 * <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.sig)>
11129 * <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.fun)>
11131 == Details and Notes ==
11136 {caseType: Type.t, (* type of entire expression *)
11137 cases: (NestedPat.t * ((Var.t -> Var.t) -> Exp.t)) vector,
11138 conTycon: Con.t -> Tycon.t,
11142 tyconCons: Tycon.t -> {con: Con.t, hasArg: bool} vector}
11143 -> Exp.t * (unit -> ((Layout.t * {isOnlyExns: bool}) vector) vector)
11146 `matchCompile` is complicated by the desire for modularity between the
11147 match compiler and its caller. Its caller is responsible for building
11148 the right hand side of a rule `p => e`. On the other hand, the match
11149 compiler is responsible for destructing the test and binding new
11150 variables to the components. In order to connect the new variables
11151 created by the match compiler with the variables in the pattern `p`,
11152 the match compiler passes an environment back to its caller that maps
11153 each variable in `p` to the corresponding variable introduced by the
11156 The match compiler builds a tree of n-way case expressions by working
11157 from outside to inside and left to right in the patterns. For example,
11162 | (C2 b, C3 c) => e2
11174 C2 b' => (case x2 of
11176 | C3 c' => f2(b',c')
11177 | _ => raise Match)
11180 | _ => raise Match))
11184 Here you can see the necessity of abstracting out the right hand sides
11185 of the cases in order to avoid code duplication. Right hand sides are
11186 always abstracted. The simplifier cleans things up. You can also see
11187 the new (primed) variables introduced by the match compiler and how
11188 the renaming works. Finally, you can see how the match compiler
11189 introduces the necessary default clauses in order to make a match
11190 exhaustive, i.e., cover all the cases.
11192 The match compiler uses `numCons` and `tyconCons` to determine
11193 the exhaustivity of matches against constructors.
11197 :mlton-guide-page: MatthewFluet
11203 mailto:matthew.fluet@gmail.com[matthew.fluet@gmail.com]
11205 http://www.cs.rit.edu/%7Emtf
11207 is an Assistant Professor at the http://www.rit.edu[Rochester Institute of Technology].
11211 Current MLton projects:
11213 * general maintenance
11214 * release new version
11218 Misc. and underspecified TODOs:
11220 * understand <:RefFlatten:> and <:DeepFlatten:>
11221 ** http://www.mlton.org/pipermail/mlton/2005-April/026990.html
11222 ** http://www.mlton.org/pipermail/mlton/2007-November/030056.html
11223 ** http://www.mlton.org/pipermail/mlton/2008-April/030250.html
11224 ** http://www.mlton.org/pipermail/mlton/2008-July/030279.html
11225 ** http://www.mlton.org/pipermail/mlton/2008-August/030312.html
11226 ** http://www.mlton.org/pipermail/mlton/2008-September/030360.html
11227 ** http://www.mlton.org/pipermail/mlton-user/2009-June/001542.html
11228 * `MSG_DONTWAIT` isn't Posix
11229 * coordinate w/ Dan Spoonhower and Lukasz Ziarek and Armand Navabi on multi-threaded
11230 ** http://www.mlton.org/pipermail/mlton/2008-March/030214.html
11231 * Intel Research bug: `no tyconRep property` (company won't release sample code)
11232 ** http://www.mlton.org/pipermail/mlton-user/2008-March/001358.html
11233 * treatment of real constants
11234 ** http://www.mlton.org/pipermail/mlton/2008-May/030262.html
11235 ** http://www.mlton.org/pipermail/mlton/2008-June/030271.html
11236 * representation of `bool` and `_bool` in <:ForeignFunctionInterface:>
11237 ** http://www.mlton.org/pipermail/mlton/2008-May/030264.html
11238 * http://www.icfpcontest.org
11239 ** John Reppy claims that "It looks like the card-marking overhead that one incurs when using generational collection swamps the benefits of generational collection."
11240 * page to disk policy / single heap
11241 ** http://www.mlton.org/pipermail/mlton/2008-June/030278.html
11242 ** http://www.mlton.org/pipermail/mlton/2008-August/030318.html
11243 * `MLton.GC.pack` doesn't keep a small heap if a garbage collection occurs before `MLton.GC.unpack`.
11244 ** It might be preferable for `MLton.GC.pack` to be implemented as a (new) `MLton.GC.Ratios.setLive 1.1` followed by `MLton.GC.collect ()` and for `MLton.GC.unpack` to be implemented as `MLton.GC.Ratios.setLive 8.0` followed by `MLton.GC.collect ()`.
11245 * The `static struct GC_objectType objectTypes[] =` array includes many duplicates. Objects of distinct source type, but equivalent representations (in terms of size, bytes of non-pointers, number of pointers) can share the objectType index.
11246 * PolySpace bug: <:Redundant:> optimization (company won't release sample code)
11247 ** http://www.mlton.org/pipermail/mlton/2008-September/030355.html
11248 * treatment of exception raised during <:BasisLibrary:> evaluation
11249 ** http://www.mlton.org/pipermail/mlton/2008-December/030501.html
11250 ** http://www.mlton.org/pipermail/mlton/2008-December/030502.html
11251 ** http://www.mlton.org/pipermail/mlton/2008-December/030503.html
11253 ** http://www.mlton.org/pipermail/mlton-user/2009-January/001506.html
11254 ** http://www.mlton.org/pipermail/mlton/2009-January/030506.html
11255 * Implement more 64bit primops in x86 codegen
11256 ** http://www.mlton.org/pipermail/mlton/2009-January/030507.html
11257 * Enrich path-map file syntax:
11258 ** http://www.mlton.org/pipermail/mlton/2008-September/030348.html
11259 ** http://www.mlton.org/pipermail/mlton-user/2009-January/001507.html
11260 * PolySpace bug: crash during Cheney-copy collection
11261 ** http://www.mlton.org/pipermail/mlton/2009-February/030513.html
11262 * eliminate `-build-constants`
11263 ** all `_const`-s are known by `runtime/gen/basis-ffi.def`
11264 ** generate `gen-constants.c` from `basis-ffi.def`
11265 ** generate `constants` from `gen-constants.c` and `libmlton.a`
11266 ** similar to `gen-sizes.c` and `sizes`
11267 * eliminate "Windows hacks" for Cygwin from `Path` module
11268 ** http://www.mlton.org/pipermail/mlton/2009-July/030606.html
11269 * extend IL type checkers to check for empty property lists
11270 * make (unsafe) `IntInf` conversions into primitives
11271 ** http://www.mlton.org/pipermail/mlton/2009-July/030622.html
11275 :mlton-guide-page: mGTK
11280 http://mgtk.sourceforge.net/[mGTK] is a wrapper for
11281 http://www.gtk.org/[GTK+], a GUI toolkit.
11283 We recommend using mGTK 0.93, which is not listed on their home page,
11284 but is available at the
11285 http://sourceforge.net/project/showfiles.php?group_id=23226&package_id=16523[file
11286 release page]. To test it, after unpacking, do `cd examples; make
11287 mlton`, after which you should be able to run the many examples
11288 (`signup-mlton`, `listview-mlton`, ...).
11296 :mlton-guide-page: MichaelNorrish
11301 I am a researcher at http://nicta.com.au[NICTA], with a web-page http://web.rsise.anu.edu.au/%7Emichaeln/[here].
11303 I'm interested in MLton because of the chance that it might be a good vehicle for future implementations of the http://hol.sf.net[HOL] theorem-proving system. It's beginning to look as if one route forward will be to embed an SML interpreter into a MLton-compiled executable. I don't know if an extensible interpreter of the kind we're looking for already exists.
11307 :mlton-guide-page: MikeThomas
11312 Here is a picture at home in Brisbane, Queensland, Australia, taken in January 2004.
11314 image::MikeThomas.attachments/picture.jpg[align="center"]
11318 :mlton-guide-page: ML
11323 ML stands for _meta language_. ML was originally designed in the
11324 1970s as a programming language to assist theorem proving in the logic
11325 LCF. In the 1980s, ML split into two variants,
11326 <:StandardML:Standard ML> and <:OCaml:>, both of which are still used
11331 :mlton-guide-page: MLAntlr
11336 http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLAntlr] is a
11337 parser generator for <:StandardML:Standard ML>.
11346 :mlton-guide-page: MLBasis
11351 The ML Basis system extends <:StandardML:Standard ML> to support
11352 programming-in-the-very-large, namespace management at the module
11353 level, separate delivery of library sources, and more. While Standard
11354 ML modules are a sophisticated language for programming-in-the-large,
11355 it is difficult, if not impossible, to accomplish a number of routine
11356 namespace management operations when a program draws upon multiple
11357 libraries provided by different vendors.
11359 The ML Basis system is a simple, yet powerful, approach that builds
11360 upon the programmer's intuitive notion (and
11361 <:DefinitionOfStandardML: The Definition of Standard ML (Revised)>'s
11362 formal notion) of the top-level environment (a _basis_). The system
11363 is designed as a natural extension of <:StandardML: Standard ML>; the
11364 formal specification of the ML Basis system
11365 (<!Attachment(MLBasis,mlb-formal.pdf)>) is given in the style
11368 Here are some of the key features of the ML Basis system:
11370 1. Explicit file order: The order of files (and, hence, the order of
11371 evaluation) in the program is explicit. The ML Basis system's
11372 semantics are structured in such a way that for any well-formed
11373 project, there will be exactly one possible interpretation of the
11374 project's syntax, static semantics, and dynamic semantics.
11376 2. Implicit dependencies: A source file (corresponding to an SML
11377 top-level declaration) is elaborated in the environment described by
11378 preceding declarations. It is not necessary to explicitly list the
11379 dependencies of a file.
11381 3. Scoping and renaming: The ML Basis system provides mechanisms for
11382 limiting the scope of (i.e., hiding) and renaming identifiers.
11384 4. No naming convention for finding the file that defines a module.
11385 To import a module, its defining file must appear in some ML Basis
11390 * <:MLBasisSyntaxAndSemantics:>
11391 * <:MLBasisExamples:>
11392 * <:MLBasisPathMap:>
11393 * <:MLBasisAnnotations:>
11394 * <:MLBasisAvailableLibraries:>
11398 :mlton-guide-page: MLBasisAnnotationExamples
11399 [[MLBasisAnnotationExamples]]
11400 MLBasisAnnotationExamples
11401 =========================
11403 Here are some example uses of <:MLBasisAnnotations:>.
11405 == Eliminate spurious warnings in automatically generated code ==
11407 Programs that automatically generate source code can often produce
11408 nonexhaustive patterns, relying on invariants of the generated code to
11409 ensure that the pattern matchings never fail. A programmer may wish
11410 to elide the nonexhaustive warnings from this code, in order that
11411 legitimate warnings are not missed in a flurry of false positives. To
11412 do so, the programmer simply annotates the generated code with the
11413 `nonexhaustiveBind ignore` and `nonexhaustiveMatch ignore`
11418 $(GEN_ROOT)/gen-lib.mlb
11421 "nonexhaustiveBind ignore"
11422 "nonexhaustiveMatch ignore"
11433 == Deliver a library ==
11435 Standard ML libraries can be delivered via `.mlb` files. Authors of
11436 such libraries should strive to be mindful of the ways in which
11437 programmers may choose to compile their programs. For example,
11438 although the defaults for `sequenceNonUnit` and `warnUnused` are
11439 `ignore` and `false`, periodically compiling with these annotations
11440 defaulted to `warn` and `true` can help uncover likely bugs. However,
11441 a programmer is unlikely to be interested in unused modules from an
11442 imported library, and the behavior of `sequenceNonUnit error` may be
11443 incompatible with some libraries. Hence, a library author may choose
11444 to deliver a library as follows:
11448 "nonexhaustiveBind warn" "nonexhaustiveMatch warn"
11449 "redundantBind warn" "redundantMatch warn"
11450 "sequenceNonUnit warn"
11451 "warnUnused true" "forceUsed"
11468 The annotations `nonexhaustiveBind warn`, `redundantBind warn`,
11469 `nonexhaustiveMatch warn`, `redundantMatch warn`, and `sequenceNonUnit
11470 warn` have the obvious effect on elaboration. The annotations
11471 `warnUnused true` and `forceUsed` work in conjunction -- warning on
11472 any identifiers that do not contribute to the exported modules, and
11473 preventing warnings on exported modules that are not used in the
11474 remainder of the program. Many of the
11475 <:MLBasisAvailableLibraries:available libraries> are delivered with
11480 :mlton-guide-page: MLBasisAnnotations
11481 [[MLBasisAnnotations]]
11485 <:MLBasis:ML Basis> annotations control options that affect the
11486 elaboration of SML source files. Conceptually, a basis file is
11487 elaborated in a default annotation environment (just as it is
11488 elaborated in an empty basis). The declaration
11489 ++ann++{nbsp}++"++__ann__++"++{nbsp}++in++{nbsp}__basdec__{nbsp}++end++
11490 merges the annotation _ann_ with the "current" annotation environment
11491 for the elaboration of _basdec_. To allow for future expansion,
11492 ++"++__ann__++"++ is lexed as a single SML string constant. To
11493 conveniently specify multiple annotations, the following derived form
11497 +ann+ ++"++__ann__++"++ (++"++__ann__++"++ )^\+^ +in+ _basdec_ +end+
11499 +ann+ ++"++__ann__++"++ +in+ +ann+ (++"++__ann__++"++)^\+^ +in+ _basdec_ +end+ +end+
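
A basis file using both forms might look like the following sketch. The source file names are hypothetical; the annotations themselves are among those described on this page. The second declaration uses the derived form with two annotations, which is equivalent to nesting two `ann` declarations.

----
$(SML_LIB)/basis/basis.mlb

(* a single annotation *)
ann "allowFFI true" in
   ffi-bindings.sml
end

(* derived form: several annotations at once *)
ann "nonexhaustiveMatch error" "redundantMatch error" in
   main.sml
end
----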
11502 Here are the available annotations. In the explanation below, for
11503 annotations that take an argument, the first value listed is the
11506 * +allowFFI {false|true}+
11508 If `true`, allow `_address`, `_export`, `_import`, and `_symbol`
11509 expressions to appear in source files. See
11510 <:ForeignFunctionInterface:>.
11512 * +allowSuccessorML {false|true}+
11515 Allow or disallow all of the <:SuccessorML:> features. This is a
11516 proxy for all of the following annotations.
11518 ** +allowDoDecls {false|true}+
11520 If `true`, allow a +do _exp_+ declaration form.
11522 ** +allowExtendedConsts {false|true}+
11525 Allow or disallow all of the extended constants features. This is a
11526 proxy for all of the following annotations.
11528 *** +allowExtendedNumConsts {false|true}+
11530 If `true`, allow extended numeric constants.
11532 *** +allowExtendedTextConsts {false|true}+
11534 If `true`, allow extended text constants.
11537 ** +allowLineComments {false|true}+
11539 If `true`, allow line comments beginning with the token ++(*)++.
11541 ** +allowOptBar {false|true}+
11543 If `true`, allow a bar to appear before the first match rule of a
11544 `case`, `fn`, or `handle` expression, allow a bar to appear before the
11545 first function-value binding of a `fun` declaration, and allow a bar
11546 to appear before the first constructor binding or description of a
11547 `datatype` declaration or specification.
11549 ** +allowOptSemicolon {false|true}+
11551 If `true`, allow a semicolon to appear after the last expression in a
11552 sequence expression or `let` body.
11554 ** +allowOrPats {false|true}+
11556 If `true`, allow disjunctive (a.k.a., "or") patterns of the form
11559 ** +allowRecordPunExps {false|true}+
11561 If `true`, allow record punning expressions.
11563 ** +allowSigWithtype {false|true}+
11565 If `true`, allow `withtype` to modify a `datatype` specification in a
11568 ** +allowVectorExpsAndPats {false|true}+
11571 Allow or disallow vector expressions and vector patterns. This is a
11572 proxy for all of the following annotations.
11574 *** +allowVectorExps {false|true}+
11576 If `true`, allow vector expressions.
11578 *** +allowVectorPats {false|true}+
11580 If `true`, allow vector patterns.
11586 Force all identifiers in the basis denoted by the body of the `ann` to
11587 be considered used; use in conjunction with `warnUnused true`.
11589 * +nonexhaustiveBind {warn|error|ignore}+
11591 If `error` or `warn`, report nonexhaustive patterns in `val`
11592 declarations (i.e., pattern-match failures that raise the `Bind`
11593 exception). An error will abort a compile, while a warning will not.
11595 * +nonexhaustiveExnBind {default|ignore}+
11597 If `ignore`, suppress errors and warnings about nonexhaustive matches
11598 in `val` declarations that arise solely from unmatched exceptions.
11599 If `default`, follow the behavior of `nonexhaustiveBind`.
11601 * +nonexhaustiveExnMatch {default|ignore}+
11603 If `ignore`, suppress errors and warnings about nonexhaustive matches
11604 in `fn` expressions, `case` expressions, and `fun` declarations that
11605 arise solely from unmatched exceptions. If `default`, follow the
11606 behavior of `nonexhaustiveMatch`.
11608 * +nonexhaustiveExnRaise {ignore|default}+
11610 If `ignore`, suppress errors and warnings about nonexhaustive matches
11611 in `handle` expressions that arise solely from unmatched exceptions.
11612 If `default`, follow the behavior of `nonexhaustiveRaise`.
11614 * +nonexhaustiveMatch {warn|error|ignore}+
11616 If `error` or `warn`, report nonexhaustive patterns in `fn`
11617 expressions, `case` expressions, and `fun` declarations (i.e.,
11618 pattern-match failures that raise the `Match` exception). An error
11619 will abort a compile, while a warning will not.
11621 * +nonexhaustiveRaise {ignore|warn|error}+
11623 If `error` or `warn`, report nonexhaustive patterns in `handle`
11624 expressions (i.e., pattern-match failures that implicitly (re)raise
11625 the unmatched exception). An error will abort a compile, while a
11628 * +redundantBind {warn|error|ignore}+
11630 If `error` or `warn`, report redundant patterns in `val` declarations.
11631 An error will abort a compile, while a warning will not.
11633 * +redundantMatch {warn|error|ignore}+
11635 If `error` or `warn`, report redundant patterns in `fn` expressions,
11636 `case` expressions, and `fun` declarations. An error will abort a
11637 compile, while a warning will not.
11639 * +redundantRaise {warn|error|ignore}+
11641 If `error` or `warn`, report redundant patterns in `handle`
11642 expressions. An error will abort a compile, while a warning will not.
11644 * +resolveScope {strdec|dec|topdec|program}+
11646 Used to control the scope at which overload constraints are resolved
11647 to default types (if not otherwise resolved by type inference) and the
11648 scope at which unresolved flexible record constraints are reported.
11650 The syntactic-class argument means to perform resolution checks at the
11651 smallest enclosing syntactic form of the given class. The default
11652 behavior is to resolve at the smallest enclosing _strdec_ (which is
11653 equivalent to the largest enclosing _dec_). Other useful behaviors
11654 are to resolve at the smallest enclosing _topdec_ (which is equivalent
11655 to the largest enclosing _strdec_) and at the smallest enclosing
11656 _program_ (which corresponds to a single `.sml` file and does not
11657 correspond to the whole `.mlb` program).
11659 * +sequenceNonUnit {ignore|error|warn}+
11661 If `error` or `warn`, report when `e1` is not of type `unit` in the
11662 sequence expression `(e1; e2)`. This can be helpful in detecting
11663 curried applications that are mistakenly not fully applied. To
11664 silence spurious messages, you can use `ignore e1`.
11666 * +valrecConstr {warn|error|ignore}+
11668 If `error` or `warn`, report when a `val rec` (or `fun`) declaration
11669 redefines an identifier that previously had constructor status. An
11670 error will abort a compile, while a warning will not.
11672 * +warnUnused {false|true}+
11674 Report unused identifiers.
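
To make the `sequenceNonUnit` case concrete, here is a small sketch of the kind of mistake it is meant to catch. The function names are invented for the example. The code is valid SML under the default setting; with `sequenceNonUnit warn` (or `error`), the sequence expression in `bad` would be reported, while `good` and `silenced` would not.

[source,sml]
----
fun log (prefix: string) (msg: string) : unit = print (prefix ^ msg ^ "\n")

(* Mistake: `log "warning: "` is a partial application of type
 * string -> unit, so nothing is printed. *)
fun bad () : int = (log "warning: "; 1)

(* Intended: the curried function is fully applied. *)
fun good () : int = (log "warning: " "disk full"; 1)

(* Deliberately discarding a non-unit value. *)
fun silenced () : int = (ignore (log "warning: "); 1)
----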
11678 * <:MLBasisAnnotationExamples:>
11679 * <:WarnUnusedAnomalies:>
11683 :mlton-guide-page: MLBasisAvailableLibraries
11684 [[MLBasisAvailableLibraries]]
11685 MLBasisAvailableLibraries
11686 =========================
11688 MLton comes with the following <:MLBasis:ML Basis> files available.
11690 * `$(SML_LIB)/basis/basis.mlb`
11692 The <:BasisLibrary:Basis Library>.
11694 * `$(SML_LIB)/basis/basis-1997.mlb`
11696 The (deprecated) 1997 version of the <:BasisLibrary:Basis Library>.
11698 * `$(SML_LIB)/basis/mlton.mlb`
11700 The <:MLtonStructure:MLton> structure and signatures.
11702 * `$(SML_LIB)/basis/c-types.mlb`
11704 Various structure aliases useful as <:ForeignFunctionInterfaceTypes:>.
11706 * `$(SML_LIB)/basis/unsafe.mlb`
11708 The <:UnsafeStructure:Unsafe> structure and signature.
11710 * `$(SML_LIB)/basis/sml-nj.mlb`
11712 The <:SMLofNJStructure:SMLofNJ> structure and signature.
11714 * `$(SML_LIB)/mlyacc-lib/mlyacc-lib.mlb`
11716 Modules used by parsers built with <:MLYacc:>.
11718 * `$(SML_LIB)/cml/cml.mlb`
11720 <:ConcurrentML:>, a library for message-passing concurrency.
11722 * `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb`
11724 <:MLNLFFI:ML-NLFFI>, a library for foreign function interfaces.
11726 * `$(SML_LIB)/mlrisc-lib/...`
11728 <:MLRISCLibrary:>, a library for retargetable and optimizing compiler back ends.
11730 * `$(SML_LIB)/smlnj-lib/...`
11732 <:SMLNJLibrary:>, a collection of libraries distributed with SML/NJ.
11734 * `$(SML_LIB)/ckit-lib/ckit-lib.mlb`
11736 <:CKitLibrary:>, a library for C source code.
11738 * `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb`
11740 <:MLLPTLibrary:>, a support library for the <:MLULex:> scanner generator and the <:MLAntlr:> parser generator.
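
As a sketch, a program that uses the Basis Library, the `MLton` structure, and Concurrent ML could import them as follows; `main.sml` is a hypothetical source file.

----
$(SML_LIB)/basis/basis.mlb
$(SML_LIB)/basis/mlton.mlb
$(SML_LIB)/cml/cml.mlb
main.sml
----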
11743 == Basis fragments ==
11745 There are a number of specialized ML Basis files for importing
11746 fragments of the <:BasisLibrary: Basis Library> that cannot be
11747 expressed within SML.
11749 * `$(SML_LIB)/basis/pervasive-types.mlb`
11751 The top-level types and constructors of the Basis Library.
11753 * `$(SML_LIB)/basis/pervasive-exns.mlb`
11755 The top-level exception constructors of the Basis Library.
11757 * `$(SML_LIB)/basis/pervasive-vals.mlb`
11759 The top-level values of the Basis Library, without infix status.
11761 * `$(SML_LIB)/basis/overloads.mlb`
11763 The top-level overloaded values of the Basis Library, without infix status.
11765 * `$(SML_LIB)/basis/equal.mlb`
11767 The polymorphic equality `=` and inequality `<>` values, without infix status.
11769 * `$(SML_LIB)/basis/infixes.mlb`
11771 The infix declarations of the Basis Library.
11773 * `$(SML_LIB)/basis/pervasive.mlb`
11775 The entire top-level value and type environment of the Basis Library, with infix status. This is the same as importing the above six MLB files.
11779 :mlton-guide-page: MLBasisExamples
11780 [[MLBasisExamples]]
11784 Here are some example uses of <:MLBasis:ML Basis> files.
11787 == Complete program ==
11789 Suppose your complete program consists of the files `file1.sml`, ...,
11790 `filen.sml`, which depend upon libraries `lib1.mlb`, ..., `libm.mlb`.
11793 (* import libraries *)
11798 (* program files *)
11804 The bases denoted by `lib1.mlb`, ..., `libm.mlb` are merged (bindings
11805 of names in later bases take precedence over bindings of the same name
11806 in earlier bases), producing a basis in which `file1.sml`, ...,
11807 `filen.sml` are elaborated, adding additional bindings to the basis.
11810 == Export filter ==
11812 Suppose you only want to export certain structures, signatures, and
11813 functors from a collection of files.
11821 (* export filter here *)
11827 While `file1.sml`, ..., `filen.sml` may declare top-level identifiers
11828 in addition to `F` and `S`, such names are not accessible to programs
11829 and libraries that import this `.mlb`.
11832 == Export filter with renaming ==
11834 Suppose you want an export filter, but want to rename one of the
11843 (* export filter, with renaming, here *)
11849 Note that `functor F` is an abbreviation for `functor F = F`, which
11850 simply exports an identifier under the same name.
11853 == Import filter ==
11855 Suppose you only want to import a functor `F` from one library and a
11856 structure `S` from another library.
11862 (* import filter here *)
11868 (* import filter here *)
11877 == Import filter with renaming ==
11879 Suppose you want to import a structure `S` from one library and
11880 another structure `S` from another library.
11886 (* import filter, with renaming, here *)
11892 (* import filter, with renaming, here *)
11903 Since the Modules level of SML is the natural means for organizing
11904 program and library components, MLB files provide convenient syntax
11905 for renaming Modules level identifiers (in fact, renaming of functor
11906 identifiers provides a mechanism that is not available in SML).
11907 However, please note that `.mlb` files elaborate to full bases
11908 including top-level types and values (including infix status), in
11909 addition to structures, signatures, and functors. For example,
11910 suppose you wished to extend the <:BasisLibrary:Basis Library> with an
11911 `('a, 'b) either` datatype corresponding to a disjoint sum; the type
11912 and some operations should be available at the top-level;
11913 additionally, a signature and structure provide the complete
11916 We could use the following files.
11921 signature EITHER_GLOBAL =
11923 datatype ('a, 'b) either = Left of 'a | Right of 'b
11924 val & : ('a -> 'c) * ('b -> 'c) -> ('a, 'b) either -> 'c
11925 val && : ('a -> 'c) * ('b -> 'd) -> ('a, 'b) either -> ('c, 'd) either
11930 include EITHER_GLOBAL
11931 val isLeft : ('a, 'b) either -> bool
11932 val isRight : ('a, 'b) either -> bool
11940 structure Either : EITHER =
11942 datatype ('a, 'b) either = Left of 'a | Right of 'b
11943 fun f & g = fn x =>
11944 case x of Left z => f z | Right z => g z
11945 fun f && g = (Left o f) & (Right o g)
11946 fun isLeft x = ((fn _ => true) & (fn _ => false)) x
11947 fun isRight x = (not o isLeft) x
11950 structure EitherGlobal : EITHER_GLOBAL = Either
11953 `either-infixes.sml`
11969 (* import Basis Library *)
11970 $(SML_LIB)/basis/basis.mlb
11980 A client that imports `either.mlb` will have access to neither
11981 `EITHER_GLOBAL` nor `EitherGlobal`, but will have access to the type
11982 `either` and the values `&` and `&&` (with infix status) in the
11983 top-level environment. Note that `either-infixes.sml` is outside the
11984 scope of the local, because we want the infixes available in the
11985 implementation of the library and to clients of the library.
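
To illustrate how a client might use the top-level bindings, here is a small self-contained sketch. It repeats the datatype and the two operators so that it can be run on its own; the fixity `infixr 3` is an assumption about the contents of `either-infixes.sml`, and `describe`, `normalize`, `shown`, and `sized` are hypothetical client names.

```sml
infixr 3 & &&   (* assumed fixity; the library declares it in either-infixes.sml *)

datatype ('a, 'b) either = Left of 'a | Right of 'b

(* f & g: case analysis, applying f to a Left and g to a Right *)
fun f & g = fn x => case x of Left z => f z | Right z => g z

(* f && g: map over both sides, preserving the tag *)
fun f && g = (Left o f) & (Right o g)

(* Client code: render either an int or a string as a string. *)
val describe : (int, string) either -> string =
   Int.toString & (fn s => s)

(* Client code: increment a Left, take the length of a Right. *)
val normalize : (int, string) either -> (int, int) either =
   (fn n => n + 1) && String.size

val shown = describe (Left 3)
val sized = normalize (Right "abc")
```

In a real client, the first four declarations would come from importing `either.mlb` rather than being repeated.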
11989 :mlton-guide-page: MLBasisPathMap
11994 An <:MLBasis:ML Basis> _path map_ describes a map from ML Basis path
11995 variables (of the form `$(VAR)`) to file system paths. ML Basis path
11996 variables provide a flexible way to refer to libraries while allowing
11997 them to be moved without changing their clients.
11999 The format of an `mlb-path-map` file is a sequence of lines; each line
12000 consists of two white-space-delimited tokens. The first token is a
12001 path variable `VAR` and the second token is the path to which the
12002 variable is mapped. The path may include path variables, which are
12003 recursively expanded.
12005 The mapping from path variables to paths is initialized by the compiler.
12006 Additional path maps can be specified with `-mlb-path-map` and
12007 individual path variable mappings can be specified with
12008 `-mlb-path-var` (see <:CompileTimeOptions:>). Configuration files are
12009 processed from first to last and from top to bottom; later mappings
12010 take precedence over earlier mappings.
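
For illustration, a path map file in this format might look like the following sketch. The variable names and paths are hypothetical; the second line shows a path that itself uses a path variable.

```
MY_LIBS /home/user/sml-libs
MY_UTIL $(MY_LIBS)/util
```

Assuming this file is saved as `my-path-map`, it could be supplied with `-mlb-path-map my-path-map`, and a single mapping could be given with `-mlb-path-var 'MY_UTIL /opt/util'`.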
12012 The compiler and system-wide configuration file make the following
12013 path variables available.
12015 [options="header",cols="^25%,<75%"]
12017 |MLB path variable|Description
12018 |`SML_LIB`|path to system-wide libraries, usually `/usr/lib/mlton/sml`
12019 |`TARGET_ARCH`|string representation of target architecture
12020 |`TARGET_OS`|string representation of target operating system
12021 |`DEFAULT_INT`|binding for default int, usually `int32`
12022 |`DEFAULT_WORD`|binding for default word, usually `word32`
12023 |`DEFAULT_REAL`|binding for default real, usually `real64`
12028 :mlton-guide-page: MLBasisSyntaxAndSemantics
12029 [[MLBasisSyntaxAndSemantics]]
12030 MLBasisSyntaxAndSemantics
12031 =========================
12033 An <:MLBasis:ML Basis> (MLB) file should have the `.mlb` suffix and
12034 should contain a basis declaration.
12038 A basis declaration (_basdec_) must be one of the following forms.
12040 * +basis+ _basid_ +=+ _basexp_ (+and+ _basid_ +=+ _basexp_)^*^
12041 * +open+ _basid~1~_ ... _basid~n~_
12042 * +local+ _basdec_ +in+ _basdec_ +end+
12043 * _basdec_ [+;+] _basdec_
12044 * +structure+ _strid_ [+=+ _strid_] (+and+ _strid_[+=+ _strid_])^*^
12045 * +signature+ _sigid_ [+=+ _sigid_] (+and+ _sigid_ [+=+ _sigid_])^*^
12046 * +functor+ _funid_ [+=+ _funid_] (+and+ _funid_ [+=+ _funid_])^*^
12047 * __path__++.sml++, __path__++.sig++, or __path__++.fun++
12049 * +ann+ ++"++_ann_++"++ +in+ _basdec_ +end+
12051 A basis expression (_basexp_) must be one of the following forms.
12053 * +bas+ _basdec_ +end+
12055 * +let+ _basdec_ +in+ _basexp_ +end+
12057 Nested SML-style comments (enclosed with `(*` and `*)`) are ignored
12058 (but <:LineDirective:>s are recognized).
12060 Paths can be relative or absolute. Relative paths are relative to the
12061 directory containing the MLB file. Paths may include path variables
12062 and are expanded according to a <:MLBasisPathMap:path map>. Unquoted
12063 paths may include alpha-numeric characters and the symbols "`-`" and
12064 "`_`", along with the arc separator "`/`" and extension separator
12065 "`.`". More complicated paths, including paths with spaces, may be
12066 included by quoting the path with `"`. A quoted path is lexed as an
12067 SML string constant.
12069 <:MLBasisAnnotations:Annotations> allow a library author to
12070 control options that affect the elaboration of SML source files.
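
The following sketch combines several of the forms above in one MLB file. All file and module names (`util.sig`, `util.sml`, `UTIL`, `Util`, `HelperImpl`, and the quoted path) are hypothetical; `warnUnused true` is used only as an example annotation.

```
(* bind a basis to a name, then open it where needed *)
basis Base = bas $(SML_LIB)/basis/basis.mlb end

local
  open Base
  ann "warnUnused true" in
    util.sig
    util.sml
  end
  (* a path containing a space must be quoted *)
  "extra files/helper.sml"
in
  signature UTIL
  structure Util
  (* export HelperImpl under the name Helper *)
  structure Helper = HelperImpl
end
```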
12074 There is a <!Attachment(MLBasis,mlb-formal.pdf,formal semantics)> for
12075 ML Basis files in the style of the
12076 <:DefinitionOfStandardML:Definition>. Here, we give an informal
12079 An SML structure is a collection of types, values, and other
12080 structures. Similarly, a basis is a collection, but of more kinds of
12081 objects: types, values, structures, fixities, signatures, functors,
12084 A basis declaration denotes a basis. A structure, signature, or
12085 functor declaration denotes a basis containing the corresponding
12086 module. Sequencing of basis declarations merges bases, with later
12087 definitions taking precedence over earlier ones, just like sequencing
12088 of SML declarations. Local declarations provide name hiding, just
12089 like SML local declarations. A reference to an SML source file causes
12090 the file to be elaborated in the basis extant at the point of
12091 reference. A reference to an MLB file causes the basis denoted by
12092 that MLB file to be imported -- the basis at the point of reference
12093 does _not_ affect the imported basis.
12095 Basis expressions and basis identifiers allow binding a basis to a
12098 An MLB file is elaborated starting in an empty basis. Each MLB file
12099 is elaborated and evaluated only once, with the result being cached.
12100 Subsequent references use the cached value. Thus, any observable
12101 effects due to evaluation are not duplicated if the MLB file is
12102 referred to multiple times.
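
As a small sketch of these rules, consider a hypothetical `app.mlb` that refers to `lib.mlb` directly and also through `a.mlb`. All file names here are assumptions made for illustration.

```
(* app.mlb *)
$(SML_LIB)/basis/basis.mlb
lib.mlb    (* elaborated and evaluated here, starting from an empty basis *)
a.mlb      (* if a.mlb also refers to lib.mlb, the cached basis is reused *)
main.sml   (* elaborated in the basis built up by the lines above *)
```

Because `lib.mlb` starts from an empty basis, it has to import the Basis Library itself if its sources need it; the reference to `basis.mlb` in `app.mlb` does not carry over into it.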
12106 :mlton-guide-page: MLj
12111 http://www.dcs.ed.ac.uk/home/mlj/[MLj] is a
12112 <:StandardMLImplementations:Standard ML implementation> that targets
12113 Java bytecode. It is no longer maintained. It has morphed into
12118 * <!Cite(BentonEtAl98)>
12119 * <!Cite(BentonKennedy99)>
12123 :mlton-guide-page: MLKit
12128 The http://sourceforge.net/apps/mediawiki/mlkit[ML Kit] is a
12129 <:StandardMLImplementations:Standard ML implementation>.
12133 * <:DefinitionOfStandardML:SML'97>
12134 ** including most of the latest <:BasisLibrary:Basis Library>
12135 http://www.standardml.org/Basis[specification],
12136 * <:MLBasis:ML Basis> files
12137 ** and separate compilation,
12138 * <:Regions:Region-Based Memory Management>
12139 ** and <:GarbageCollection:garbage collection>,
12140 * Multiple backends, including
12143 ** JavaScript (see http://www.itu.dk/people/mael/smltojs/[SMLtoJs]).
12145 At the time of writing, MLKit does not support:
12147 * concurrent programming / threads,
12148 * calling from C to SML.
12152 :mlton-guide-page: MLLex
12157 <:MLLex:> is a lexical analyzer generator for <:StandardML:Standard ML>
12158 modeled after the Lex lexical analyzer generator.
12160 A version of MLLex, ported from the <:SMLNJ:SML/NJ> sources, is
12161 distributed with MLton.
12165 MLLex takes as input the lex language as defined in the ML-Lex manual,
12166 and outputs a lexical analyzer in SML.
12168 == Implementation ==
12170 * <!ViewGitFile(mlton,master,mllex/lexgen.sml)>
12171 * <!ViewGitFile(mlton,master,mllex/main.sml)>
12172 * <!ViewGitFile(mlton,master,mllex/call-main.sml)>
12174 == Details and Notes ==
12176 There are 3 main passes in the MLLex tool:
12178 * Source parsing. In this pass, lex source programs are parsed into internal representations. The core part of this pass is a hand-written lexer and an LL(1) parser. The output of this pass is a record of user code, rules (along with start states) and actions. (MLLex definitions are dropped at this point.)
12179 * DFA construction. In this pass, a DFA is constructed by the algorithm of H. Yamada et al.
12180 * Output. In this pass, the generated DFA is written out as a transition table, along with a table-driven algorithm, to an SML file.
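
To make the idea of a transition table with a table-driven algorithm concrete, here is a minimal hand-written sketch. It is not the code that MLLex generates; the states, the character classes, and the names `table`, `accepting`, `classOf`, and `accepts` are all assumptions chosen for illustration. The DFA accepts non-empty strings of decimal digits.

```sml
(* States: 0 = start, 1 = reading digits (accepting), 2 = dead. *)
(* Character classes: 0 = digit, 1 = anything else. *)
val table : int vector vector =
   Vector.fromList (map Vector.fromList [[1, 2], [1, 2], [2, 2]])

val accepting : bool vector = Vector.fromList [false, true, false]

fun classOf (c: char): int = if Char.isDigit c then 0 else 1

(* Run the DFA over the whole string and report whether it ends in an
   accepting state. *)
fun accepts (s: string): bool =
   let
      fun step (c, state) = Vector.sub (Vector.sub (table, state), classOf c)
      val final = CharVector.foldl step 0 s
   in
      Vector.sub (accepting, final)
   end
```

The driver never inspects the rules themselves; changing the language only means changing the tables.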
12184 * <!Attachment(Documentation,mllex.pdf)>
12186 * <!Cite(AppelEtAl94)>
12191 :mlton-guide-page: MLLPTLibrary
12197 http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[ML-LPT Library]
12198 is a support library for the <:MLULex:> scanner generator and the
12199 <:MLAntlr:> parser generator. The ML-LPT Library is distributed with
12202 As of 20180119, MLton includes the ML-LPT Library synchronized with
12203 SML/NJ version 110.82.
12207 * You can import the ML-LPT Library into an MLB file with:
12211 |MLB file|Description
12212 |`$(SML_LIB)/mllpt-lib/mllpt-lib.mlb`|
12215 * If you are porting a project from SML/NJ's <:CompilationManager:> to
12216 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12217 following map is included by default:
12221 $ml-lpt-lib.cm $(SML_LIB)/mllpt-lib
12222 $ml-lpt-lib.cm/ml-lpt-lib.cm $(SML_LIB)/mllpt-lib/mllpt-lib.mlb
12225 This will automatically convert a `$/ml-lpt-lib.cm` import in an input
12226 `.cm` file into a `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb` import in the
12227 output `.mlb` file.
12235 * <!ViewGitFile(mlton,master,lib/mllpt-lib/ml-lpt.patch)>
12239 :mlton-guide-page: MLmon
12244 An `mlmon.out` file records dynamic <:Profiling:profiling> counts.
12248 An `mlmon.out` file is a text file with a sequence of lines.
12250 * The string "`MLton prof`".
12252 * The string "`alloc`", "`count`", or "`time`", depending on the kind
12253 of profiling information, corresponding to the command-line argument
12254 supplied to `mlton -profile`.
12256 * The string "`current`" or "`stack`" depending on whether profiling
12257 data was gathered for only the current function (the top of the stack)
12258 or for all functions on the stack. This corresponds to whether the
12259 executable was compiled with `-profile-stack false` or `-profile-stack
12262 * The magic number of the executable.
12264 * The number of non-gc ticks, followed by a space, then the number of
12267 * The number of (split) functions for which data is recorded.
12269 * A line for each (split) function with counts. Each line contains an
12270 integer count of the number of ticks while the function was current.
12271 In addition, if stack data was gathered (`-profile-stack true`), then
12272 the line contains two additional tick counts:
12274 ** the number of ticks while the function was on the stack.
12275 ** the number of ticks while the function was on the stack and a GC
12278 * The number of (master) functions for which data is recorded.
12280 * A line for each (master) function with counts. The lines have the
12281 same format and meaning as with split-function counts.
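
As a hedged sketch of how the first three lines described above could be checked, the following function takes the lines of an `mlmon.out` file (already split and with newlines removed) and classifies the header. The datatype and function names are hypothetical, and the sketch deliberately ignores everything after the third line.

```sml
datatype kind = Alloc | Count | Time
datatype style = Current | Stack

fun parseKind "alloc" = SOME Alloc
  | parseKind "count" = SOME Count
  | parseKind "time" = SOME Time
  | parseKind _ = NONE

fun parseStyle "current" = SOME Current
  | parseStyle "stack" = SOME Stack
  | parseStyle _ = NONE

(* Returns the profiling kind and style, or NONE if the header is malformed. *)
fun parseHeader (tag :: k :: s :: _) =
      if tag <> "MLton prof"
         then NONE
      else (case (parseKind k, parseStyle s) of
               (SOME k', SOME s') => SOME (k', s')
             | _ => NONE)
  | parseHeader _ = NONE
```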
12285 :mlton-guide-page: MLNLFFI
12290 <!Cite(Blume01, ML-NLFFI)> is the no-longer-foreign-function interface
12293 As of 20050212, MLton has an initial port of ML-NLFFI from SML/NJ to
12294 MLton. All of the ML-NLFFI functionality is present.
12296 Additionally, MLton has an initial port of the
12297 <:MLNLFFIGen:mlnlffigen> tool from SML/NJ to MLton. Due to low-level
12298 details, the code generated by SML/NJ's `ml-nlffigen` is not
12299 compatible with MLton, and vice-versa. However, the generated code
12300 has the same interface, so portable client code can be written.
12301 MLton's `mlnlffigen` does not currently support C functions with
12302 `struct` or `union` arguments.
12306 * You can import the ML-NLFFI Library into an MLB file with
12310 |MLB file|Description
12311 |`$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb`|
12314 * If you are porting a project from SML/NJ's <:CompilationManager:> to
12315 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12316 following maps are included by default:
12320 $c $(SML_LIB)/mlnlffi-lib
12321 $c/c.cm $(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb
12324 This will automatically convert a `$/c.cm` import in an input `.cm`
12325 file into a `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb` import in the
12326 output `.mlb` file.
12331 * <:MLNLFFIImplementation:>
12336 :mlton-guide-page: MLNLFFIGen
12341 `mlnlffigen` generates a <:MLNLFFI:> binding from a collection of `.c`
12342 files. It is based on the <:CKitLibrary:>, which is primarily designed
12343 to handle standardized C and thus does not understand many (any?)
12344 compiler extensions; however, it attempts to recover from errors when
12345 seeing unrecognized definitions.
12347 In order to work around common gcc extensions, it may be useful to add
12348 `-cppopt` options to the command line; for example
12349 `-cppopt '-D__extension__'` may be occasionally useful. Fortunately,
12350 most portable libraries largely avoid the use of these types of
12351 extensions in header files.
12353 `mlnlffigen` will normally not generate bindings for `#included`
12354 files; see `-match` and `-allSU` if this is desirable.
12358 :mlton-guide-page: MLNLFFIImplementation
12359 [[MLNLFFIImplementation]]
12360 MLNLFFIImplementation
12361 =====================
12363 MLton's implementation(s) of the <:MLNLFFI:> library differs from the
12364 SML/NJ implementation in two important ways:
12366 * MLton cannot utilize the `Unsafe.cast` "cheat" described in Section
12367 3.7 of <!Cite(Blume01)>. (MLton's representation of
12368 <:Closure:closures> and
12369 <:PackedRepresentation:aggressive representation> optimizations make
12370 an `Unsafe.cast` even more "unsafe" than in other implementations.)
12373 We have considered two solutions:
12375 ** One solution is to utilize an additional type parameter (as
12376 described in Section 3.7 of <!Cite(Blume01)>):
12383 type ('t, 'f, 'c) obj
12384 eqtype ('t, 'f, 'c) obj'
12387 eqtype ('o, 'f) ptr'
12399 The rule for `('t, 'f, 'c) obj`, `('t, 'f, 'c) ptr`, and also `('t, 'f)
12400 T.typ` is that whenever `F fptr` occurs within the instantiation of
12401 `'t`, then `'f` must be instantiated to `F`. In all other cases, `'f`
12402 will be instantiated to `unit`.
12405 (In the actual MLton implementation, an abstract type `naf`
12406 (not-a-function) is used instead of `unit`.)
12408 While this means that type-annotated programs may not type-check under
12409 both the SML/NJ implementation and the MLton implementation, this
12410 should not be a problem in practice. Tools, like `ml-nlffigen`, which
12411 are necessarily implementation dependent (in order to make
12412 <:CallingFromSMLToCFunctionPointer:calls through a C function
12413 pointer>), may be easily extended to emit the additional type
12414 parameter. Client code which uses such generated glue-code (e.g.,
12415 Section 1 of <!Cite(Blume01)>) need rarely write type-annotations,
12416 thanks to the magic of type inference.
12419 ** The above implementation suffers from two disadvantages.
12422 First, it changes the MLNLFFI Library interface, meaning that the same
12423 program may not type-check under both the SML/NJ implementation and
12424 the MLton implementation (though, in light of type inference and the
12425 richer `MLRep` structure provided by MLton, this point is mostly
12428 Second, it appears to unnecessarily duplicate type information. For
12429 example, an external C variable of type `int (* f[3])(int)` (that is,
12430 an array of three function pointers), would be represented by the SML
12431 type `(((sint -> sint) fptr, dec dg3) arr, sint -> sint, rw) obj`.
12432 One might well ask why the `'f` instantiation (`sint -> sint` in this
12433 case) cannot be _extracted_ from the `'t` instantiation
12434 (`((sint -> sint) fptr, dec dg3) arr` in this case), obviating the
12435 need for a separate _function-type_ type argument. There are a number
12436 of components to a complete answer to this question. Foremost is the
12437 fact that <:StandardML: Standard ML> supports neither (general)
12438 type-level functions nor intensional polymorphism.
12440 A more direct answer for MLNLFFI is that in the SML/NJ implementation,
12441 the definitions of the types `('t, 'c) obj` and `('t, 'c) ptr` are made
12442 in such a way that the type variables `'t` and `'c` are <:PhantomType:
12443 phantom> (not contributing to the run-time representation of an
12444 `('t, 'c) obj` or `('t, 'c) ptr` value), despite the fact that the
12445 types `((sint -> sint) fptr, rw) ptr` and
12446 `((double -> double) fptr, rw) ptr` necessarily carry distinct (and
12447 type incompatible) run-time (C-)type information (RTTI), corresponding
12448 to the different calling conventions of the two C functions. The
12449 `Unsafe.cast` "cheat" overcomes the type incompatibility without
12450 introducing a new type variable (as in the first solution above).
12452 Hence, the reason that the _function-type_ type cannot be extracted from
12453 the `'t` type variable instantiation is that the type of the
12454 representation of RTTI doesn't even _see_ the (phantom) `'t` type
12455 variable. The solution which presents itself is to give up on the
12456 phantomness of the `'t` type variable, making it available to the
12457 representation of RTTI.
12459 This is not without some small drawbacks. Because many of the types
12460 used to instantiate `'t` carry more structure than is strictly
12461 necessary for `'t`'s RTTI, it is sometimes necessary to wrap and
12462 unwrap RTTI to accommodate the additional structure. (In the other
12463 implementations, the corresponding operations can pass along the RTTI
12464 unchanged.) However, these coercions contribute minuscule overhead;
12465 in fact, in a majority of cases, MLton's optimizations will completely
12466 eliminate the RTTI from the final program.
12469 The implementation distributed with MLton uses the second solution.
12471 Bonus question: Why can't one use a <:UniversalType: universal type>
12472 to eliminate the use of `Unsafe.cast`?
12477 * MLton (in both of the above implementations) provides a richer
12478 `MLRep` structure, utilizing ++Int__<N>__++ and ++Word__<N>__++
12484 structure MLRep = struct
12487 structure Signed = Int8
12488 structure Unsigned = Word8
12489 (* word-style bit-operations on integers... *)
12490 structure SignedBitops = IntBitOps(structure I = Signed
12491 structure W = Unsigned)
12495 structure Signed = Int16
12496 structure Unsigned = Word16
12497 (* word-style bit-operations on integers... *)
12498 structure SignedBitops = IntBitOps(structure I = Signed
12499 structure W = Unsigned)
12503 structure Signed = Int32
12504 structure Unsigned = Word32
12505 (* word-style bit-operations on integers... *)
12506 structure SignedBitops = IntBitOps(structure I = Signed
12507 structure W = Unsigned)
12511 structure Signed = Int32
12512 structure Unsigned = Word32
12513 (* word-style bit-operations on integers... *)
12514 structure SignedBitops = IntBitOps(structure I = Signed
12515 structure W = Unsigned)
12517 structure LongLong =
12519 structure Signed = Int64
12520 structure Unsigned = Word64
12521 (* word-style bit-operations on integers... *)
12522 structure SignedBitops = IntBitOps(structure I = Signed
12523 structure W = Unsigned)
12525 structure Float = Real32
12526 structure Double = Real64
12530 This would appear to be a better interface, even when an
12531 implementation must choose `Int32` and `Word32` as the representation
12532 for smaller C-types.
12537 :mlton-guide-page: MLRISCLibrary
12542 The http://www.cs.nyu.edu/leunga/www/MLRISC/Doc/html/index.html[MLRISC
12543 Library] is a framework for retargetable and optimizing compiler back
12544 ends. The MLRISC Library is distributed with SML/NJ. Due to
12545 differences between SML/NJ and MLton, this library will not work
12546 out of the box with MLton.
12548 As of 20180119, MLton includes a port of the MLRISC Library
12549 synchronized with SML/NJ version 110.82.
12553 * You can import a sub-library of the MLRISC Library into an MLB file with:
12557 |MLB file|Description
12558 |`$(SML_LIB)/mlrisc-lib/mlb/ALPHA.mlb`|The ALPHA backend
12559 |`$(SML_LIB)/mlrisc-lib/mlb/AMD64.mlb`|The AMD64 backend
12560 |`$(SML_LIB)/mlrisc-lib/mlb/AMD64-Peephole.mlb`|The AMD64 peephole optimizer
12561 |`$(SML_LIB)/mlrisc-lib/mlb/CCall.mlb`|
12562 |`$(SML_LIB)/mlrisc-lib/mlb/CCall-sparc.mlb`|
12563 |`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86-64.mlb`|
12564 |`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86.mlb`|
12565 |`$(SML_LIB)/mlrisc-lib/mlb/Control.mlb`|
12566 |`$(SML_LIB)/mlrisc-lib/mlb/Graphs.mlb`|
12567 |`$(SML_LIB)/mlrisc-lib/mlb/HPPA.mlb`|The HPPA backend
12568 |`$(SML_LIB)/mlrisc-lib/mlb/IA32.mlb`|The IA32 backend
12569 |`$(SML_LIB)/mlrisc-lib/mlb/IA32-Peephole.mlb`|The IA32 peephole optimizer
12570 |`$(SML_LIB)/mlrisc-lib/mlb/Lib.mlb`|
12571 |`$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb`|
12572 |`$(SML_LIB)/mlrisc-lib/mlb/MLTREE.mlb`|
12573 |`$(SML_LIB)/mlrisc-lib/mlb/Peephole.mlb`|
12574 |`$(SML_LIB)/mlrisc-lib/mlb/PPC.mlb`|The PPC backend
12575 |`$(SML_LIB)/mlrisc-lib/mlb/RA.mlb`|
12576 |`$(SML_LIB)/mlrisc-lib/mlb/SPARC.mlb`|The Sparc backend
12577 |`$(SML_LIB)/mlrisc-lib/mlb/StagedAlloc.mlb`|
12578 |`$(SML_LIB)/mlrisc-lib/mlb/Visual.mlb`|
12581 * If you are porting a project from SML/NJ's <:CompilationManager:> to
12582 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12583 following map is included by default:
12587 $SMLNJ-MLRISC $(SML_LIB)/mlrisc-lib/mlb
12590 This will automatically convert a `$SMLNJ-MLRISC/MLRISC.cm` import in
12591 an input `.cm` file into a `$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb`
12592 import in the output `.mlb` file.
12596 The following changes were made to the MLRISC Library, in addition to
12597 deriving the `.mlb` files from the `.cm` files:
12599 * eliminate sequential `withtype` expansions: Most could be rewritten as a sequence of type definitions and datatype definitions.
12600 * eliminate higher-order functors: Every higher-order functor definition and application could be uncurried in the obvious way.
12601 * eliminate `where <str> = <str>`: Quite painful to expand out all the flexible types in the respective structures. Furthermore, many of the implied type equalities aren't needed, but it's too hard to pick out the right ones.
12602 * `library/array-noneq.sml` (added, not exported): Implements `signature ARRAY_NONEQ`, similar to `signature ARRAY` from the <:BasisLibrary:Basis Library>, but replacing the latter's `eqtype 'a array = 'a array` and `type 'a vector = 'a Vector.vector` with `type 'a array` and `type 'a vector`. Thus, array-like containers may match `ARRAY_NONEQ`, whereas only the pervasive `'a array` container may match `ARRAY`. (SML/NJ's implementation of `signature ARRAY` omits the type realizations.)
12603 * `library/dynamic-array.sml` and `library/hash-array.sml` (modified): Replace `include ARRAY` with `include ARRAY_NONEQ`; see above.
12607 * <!ViewGitFile(mlton,master,lib/mlrisc-lib/MLRISC.patch)>
12611 :mlton-guide-page: MLtonArray
12618 signature MLTON_ARRAY =
12620 val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a array * 'b
12624 * `unfoldi (n, b, f)`
12626 constructs an array _a_ of length `n`, whose elements _a~i~_ are
12627 determined by the equations __b~0~ = b__ and
12628 __(a~i~, b~i+1~) = f (i, b~i~)__.
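
The equations can be read as a loop that threads the state `b` through successive calls to `f`. The following portable model is a sketch of that reading, not MLton's implementation; it assumes `n >= 0` and builds the array from an intermediate list for simplicity. The names `loop`, `fibs`, and `finalState` are hypothetical.

```sml
(* Portable model of unfoldi following the equations above. *)
fun unfoldi (n: int, b: 'b, f: int * 'b -> 'a * 'b): 'a array * 'b =
   let
      (* acc holds a_0 ... a_(i-1) in reverse order; b is b_i *)
      fun loop (i, b, acc) =
         if i >= n
            then (Array.fromList (rev acc), b)
         else let
                 val (a, b') = f (i, b)
              in
                 loop (i + 1, b', a :: acc)
              end
   in
      loop (0, b, [])
   end

(* The first five Fibonacci numbers, with the pair (current, next) as state. *)
val (fibs, finalState) =
   unfoldi (5, (0, 1), fn (_, (x, y)) => (x, (y, x + y)))
```

With MLton itself, the same call can be made through `MLton.Array.unfoldi`.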
12632 :mlton-guide-page: MLtonBinIO
12639 signature MLTON_BIN_IO = MLTON_IO
12646 :mlton-guide-page: MLtonCont
12653 signature MLTON_CONT =
12657 val callcc: ('a t -> 'a) -> 'a
12658 val isolate: ('a -> unit) -> 'a t
12659 val prepend: 'a t * ('b -> 'a) -> 'b t
12660 val throw: 'a t * 'a -> 'b
12661 val throw': 'a t * (unit -> 'a) -> 'b
12667 the type of continuations that expect a value of type `'a`.
12671 applies `f` to the current continuation. This copies the entire
12672 stack; hence, `callcc` takes time proportional to the size of the
12677 creates a continuation that evaluates `f` in an empty context. This
12678 is a constant time operation, and yields a constant size stack.
12682 composes a function `f` with a continuation `k` to create a
12683 continuation that first does `f` and then does `k`. This is a
12684 constant time operation.
12688 throws value `v` to continuation `k`. This copies the entire stack of
12689 `k`; hence, `throw` takes time proportional to the size of this stack.
12693 a generalization of throw that evaluates `th ()` in the context of
12694 `k`. Thus, for example, if `th ()` raises an exception or captures
12695 another continuation, it will see `k`, not the current continuation.
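
As an illustration of `callcc` and `throw` working together, the following sketch uses a continuation for an early exit from a list traversal. It requires MLton, since it uses the `MLton.Cont` structure; `findFirst` is a hypothetical name, and an ordinary recursive function or an exception would usually be a cheaper way to get the same result, given that `callcc` copies the stack.

```sml
(* Return the first element satisfying p, leaving the traversal early by
   throwing to the continuation captured at the call to callcc. *)
fun findFirst (p: 'a -> bool) (xs: 'a list): 'a option =
   MLton.Cont.callcc (fn k =>
      (List.app (fn x => if p x
                            then MLton.Cont.throw (k, SOME x)
                         else ()) xs
       ; NONE))
```

If no element satisfies `p`, the traversal finishes normally and `NONE` is returned without any throw.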
12700 * <:MLtonContIsolateImplementation:>
12704 :mlton-guide-page: MLtonContIsolateImplementation
12705 [[MLtonContIsolateImplementation]]
12706 MLtonContIsolateImplementation
12707 ==============================
12709 As noted before, it is fairly easy to get the operational behavior of `isolate` with just `callcc` and `throw`, but establishing the right space behavior is trickier. Here, we show how to start from the obvious, but inefficient, implementation of `isolate` using only `callcc` and `throw`, and 'derive' an equivalent, but more efficient, implementation of `isolate` using MLton's primitive stack capture and copy operations. This isn't a formal derivation, as we are not formally showing the equivalence of the programs (though I believe that they are all equivalent, modulo the space behavior).
12711 Here is a direct implementation of isolate using only `callcc` and `throw`:
12715 val isolate: ('a -> unit) -> 'a t =
12716 fn (f: 'a -> unit) =>
12720 val x = callcc (fn k2 => throw (k1, k2))
12721 val _ = (f x ; Exit.topLevelSuffix ())
12722 handle exn => MLtonExn.topLevelHandler exn
12724 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12729 We use the standard nested `callcc` trick to return a continuation that is ready to receive an argument, execute the isolated function, and exit the program. Both `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program.
12731 Throwing to an isolated function will execute the function in a 'semantically' empty context, in the sense that we never re-execute the 'original' continuation of the call to isolate (i.e., the context that was in place at the time `isolate` was called). However, we assume that the compiler isn't able to recognize that the 'original' continuation is unused; for example, while we (the programmer) know that `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program, the compiler may only see opaque calls to unknown foreign-functions. So, that original continuation (in its entirety) is part of the continuation returned by `isolate` and throwing to the continuation returned by `isolate` will execute `f x` (with the exit wrapper) in the context of that original continuation. Thus, the garbage collector will retain everything reachable from that original continuation during the evaluation of `f x`, even though it is 'semantically' garbage.
12733 Note that this space-leak is independent of the implementation of continuations (it arises in both MLton's stack copying implementation of continuations and would arise in SML/NJ's CPS-translation implementation); we are only assuming that the implementation can't 'see' the program termination, and so must retain the original continuation (and anything reachable from it).
12735 So, we need an 'empty' continuation in which to execute `f x`. (No surprise there, as that is the written description of `isolate`.) To do this, we capture a top-level continuation and throw to that in order to execute `f x`:
12740 val base: (unit -> unit) t =
12744 val th = callcc (fn k2 => throw (k1, k2))
12745 val _ = (th () ; Exit.topLevelSuffix ())
12746 handle exn => MLtonExn.topLevelHandler exn
12748 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12751 val isolate: ('a -> unit) -> 'a t =
12752 fn (f: 'a -> unit) =>
12756 val x = callcc (fn k2 => throw (k1, k2))
12758 throw (base, fn () => f x)
12764 We presume that `base` is evaluated 'early' in the program. There is a subtlety here, because one needs to believe that this `base` continuation (which technically corresponds to the entire rest of the program evaluation) 'works' as an empty context; in particular, we want it to be the case that executing `f x` in the `base` context retains less space than executing `f x` in the context in place at the call to `isolate` (as occurred in the previous implementation of `isolate`). This isn't particularly easy to believe if one takes a normal substitution-based operational semantics, because it seems that the context captured and bound to `base` is arbitrarily large. However, this context is mostly unevaluated code; the only heap-allocated values that are reachable from it are those that were evaluated before the evaluation of `base` (and used in the program after the evaluation of `base`). Assuming that `base` is evaluated 'early' in the program, we conclude that there are few heap-allocated values reachable from its continuation. In contrast, the previous implementation of `isolate` could capture a context that has many heap-allocated values reachable from it (because we could evaluate `isolate f` 'late' in the program and 'deep' in a call stack), which would all remain reachable during the evaluation of
12765 `f x`. [We'll return to this point later, as it is taking a slightly MLton-esque view of the evaluation of a program, and may not apply as strongly to other implementations (e.g., SML/NJ).]
12767 Now, once we throw to `base` and begin executing `f x`, only the heap-allocated values reachable from `f` and `x` and the few heap-allocated values reachable from `base` are retained by the garbage collector. So, it seems that `base` 'works' as an empty context.
12769 But, what about the continuation returned from `isolate f`? Note that the continuation returned by `isolate` is one that receives an argument `x` and then
12770 throws to `base` to evaluate `f x`. If we used a CPS-translation implementation (and assume sufficient beta-contractions to eliminate administrative redexes), then the original continuation passed to `isolate` (i.e., the continuation bound to `k1`) will not be free in the continuation returned by `isolate f`. Rather, the only free variables in the continuation returned by `isolate f` will be `base` and `f`, so the only heap-allocated values reachable from the continuation returned by `isolate f` will be those values reachable from `base` (assumed to be few) and those values reachable from `f` (necessary in order to execute `f` at some later point).
12772 But, MLton doesn't use a CPS-translation implementation. Rather, at each call to `callcc` in the body of `isolate`, MLton will copy the current execution stack. Thus, `k2` (the continuation returned by `isolate f`) will include the execution stack at the time of the call to `isolate f` -- that is, it will include the 'original' continuation of the call to `isolate f`. Thus, the heap-allocated values reachable from the continuation returned by `isolate f` will include those values reachable from `base`, those values reachable from `f`, and those values reachable from the original continuation of the call to `isolate f`. So, just holding on to the continuation returned by `isolate f` will retain all of the heap-allocated values live at the time `isolate f` was called. This leaks space, since, 'semantically', the
12773 continuation returned by `isolate f` only needs the heap-allocated values reachable from `f` (and `base`).
12775 In practice, this probably isn't a significant issue. A common use of `isolate` is to implement `abort`:
12778 fun abort th = throw (isolate th, ())
12781 The continuation returned by `isolate th` is dead immediately after being thrown to -- the continuation isn't retained, so neither is the 'semantic'
12782 garbage it would have retained.
12784 But, it is easy enough to 'move' onto the 'empty' context `base` the capturing of the context that we want to be returned by `isolate f`:
12789 val base: (unit -> unit) t =
12793 val th = callcc (fn k2 => throw (k1, k2))
12794 val _ = (th () ; Exit.topLevelSuffix ())
12795 handle exn => MLtonExn.topLevelHandler exn
12797 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12800 val isolate: ('a -> unit) -> 'a t =
12801 fn (f: 'a -> unit) =>
12804 throw (base, fn () =>
12806 val x = callcc (fn k2 => throw (k1, k2))
12808 throw (base, fn () => f x)
This implementation now has the right space behavior; the continuation returned by `isolate f` will only retain the heap-allocated values reachable from `f` and from `base`. (Technically, the continuation will retain two copies of the stack that was in place at the time `base` was evaluated, but we are assuming that that stack is small.)
12816 One minor inefficiency of this implementation (given MLton's implementation of continuations) is that every `callcc` and `throw` entails copying a stack (albeit, some of them are small). We can avoid this in the evaluation of `base` by using a reference cell, because `base` is evaluated at the top-level:
12821 val base: (unit -> unit) option t =
12823 val baseRef: (unit -> unit) option t option ref = ref NONE
val th = callcc (fn k => (baseRef := SOME k; NONE))
12827 NONE => (case !baseRef of
12828 NONE => raise Fail "MLton.Cont.isolate: missing base"
12829 | SOME base => base)
12831 val _ = (th () ; Exit.topLevelSuffix ())
12832 handle exn => MLtonExn.topLevelHandler exn
raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12839 val isolate: ('a -> unit) -> 'a t =
12840 fn (f: 'a -> unit) =>
12843 throw (base, SOME (fn () =>
12845 val x = callcc (fn k2 => throw (k1, k2))
12847 throw (base, SOME (fn () => f x))
12853 Now, to evaluate `base`, we only copy the stack once (instead of 3 times). Because we don't have a dummy continuation around to initialize the reference cell, the reference cell holds a continuation `option`. To distinguish between the original evaluation of `base` (when we want to return the continuation) and the subsequent evaluations of `base` (when we want to evaluate a thunk), we capture a `(unit -> unit) option` continuation.
12855 This seems to be as far as we can go without exploiting the concrete implementation of continuations in <:MLtonCont:>. Examining the implementation, we note that the type of
12856 continuations is given by
12859 type 'a t = (unit -> 'a) -> unit
12862 and the implementation of `throw` is given by
12865 fun ('a, 'b) throw' (k: 'a t, v: unit -> 'a): 'b =
12866 (k v; raise Fail "MLton.Cont.throw': return from continuation")
12868 fun ('a, 'b) throw (k: 'a t, v: 'a): 'b = throw' (k, fn () => v)
12872 Suffice to say, a continuation is simply a function that accepts a thunk to yield the thrown value and the body of the function performs the actual throw. Using this knowledge, we can create a dummy continuation to initialize `baseRef` and greatly simplify the body of `isolate`:
12877 val base: (unit -> unit) option t =
12879 val baseRef: (unit -> unit) option t ref =
12880 ref (fn _ => raise Fail "MLton.Cont.isolate: missing base")
12881 val th = callcc (fn k => (baseRef := k; NONE))
12886 val _ = (th () ; Exit.topLevelSuffix ())
12887 handle exn => MLtonExn.topLevelHandler exn
raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12894 val isolate: ('a -> unit) -> 'a t =
12895 fn (f: 'a -> unit) =>
12896 fn (v: unit -> 'a) =>
12897 throw (base, SOME (f o v))
12902 Note that this implementation of `isolate` makes it clear that the continuation returned by `isolate f` only retains the heap-allocated values reachable from `f` and `base`. It also retains only one copy of the stack that was in place at the time `base` was evaluated. Finally, it completely avoids making any copies of the stack that is in place at the time `isolate f` is evaluated; indeed, `isolate f` is a constant-time operation.
12904 Next, suppose we limited ourselves to capturing `unit` continuations with `callcc`. We can't pass the thunk to be evaluated in the 'empty' context directly, but we can use a reference cell.
12909 val thRef: (unit -> unit) option ref = ref NONE
12912 val baseRef: unit t ref =
12913 ref (fn _ => raise Fail "MLton.Cont.isolate: missing base")
12914 val () = callcc (fn k => baseRef := k)
12920 val _ = thRef := NONE
12921 val _ = (th () ; Exit.topLevelSuffix ())
12922 handle exn => MLtonExn.topLevelHandler exn
12924 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12928 val isolate: ('a -> unit) -> 'a t =
12929 fn (f: 'a -> unit) =>
12930 fn (v: unit -> 'a) =>
12932 val () = thRef := SOME (f o v)
Note that it is important to set `thRef` to `NONE` before evaluating the thunk, so that the garbage collector doesn't retain all the heap-allocated values reachable from `f` and `v` during the evaluation of `f (v ())`. This is because `thRef` is still live during the evaluation of the thunk; in particular, it was allocated before the evaluation of `base` (and used after), and so is retained by the continuation on which the thunk is evaluated.
12942 This implementation can be easily adapted to use MLton's primitive stack copying operations.
12947 val thRef: (unit -> unit) option ref = ref NONE
12948 val base: Thread.preThread =
12950 val () = Thread.copyCurrent ()
12953 NONE => Thread.savedPre ()
12956 val () = thRef := NONE
12957 val _ = (th () ; Exit.topLevelSuffix ())
12958 handle exn => MLtonExn.topLevelHandler exn
12960 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12964 val isolate: ('a -> unit) -> 'a t =
12965 fn (f: 'a -> unit) =>
12966 fn (v: unit -> 'a) =>
12968 val () = thRef := SOME (f o v)
12969 val new = Thread.copy base
12971 Thread.switchTo new
12977 In essence, `Thread.copyCurrent` copies the current execution stack and stores it in an implicit reference cell in the runtime system, which is fetchable with `Thread.savedPre`. When we are ready to throw to the isolated function, `Thread.copy` copies the saved execution stack (because the stack is modified in place during execution, we need to retain a pristine copy in case the isolated function itself throws to other isolated functions) and `Thread.switchTo` abandons the current execution stack, installing the newly copied execution stack.
12979 The actual implementation of `MLton.Cont.isolate` simply adds some `Thread.atomicBegin` and `Thread.atomicEnd` commands, which effectively protect the global `thRef` and accommodate the fact that `Thread.switchTo` does an implicit `Thread.atomicEnd` (used for leaving a signal handler thread).
12984 val thRef: (unit -> unit) option ref = ref NONE
12985 val base: Thread.preThread =
12987 val () = Thread.copyCurrent ()
12990 NONE => Thread.savedPre ()
12993 val () = thRef := NONE
val _ = Thread.atomicEnd () (* Match 1 *)
12995 val _ = (th () ; Exit.topLevelSuffix ())
12996 handle exn => MLtonExn.topLevelHandler exn
12998 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
13002 val isolate: ('a -> unit) -> 'a t =
13003 fn (f: 'a -> unit) =>
13004 fn (v: unit -> 'a) =>
val _ = Thread.atomicBegin () (* Match 1 *)
val () = thRef := SOME (f o v)
val new = Thread.copy base
val _ = Thread.atomicBegin () (* Match 2 *)
13011 Thread.switchTo new (* Match 2 *)
13017 It is perhaps interesting to note that the above implementation was originally 'derived' by specializing implementations of the <:MLtonThread:> `new`, `prepare`, and `switch` functions as if their only use was in the following implementation of `isolate`:
13021 val isolate: ('a -> unit) -> 'a t =
13022 fn (f: 'a -> unit) =>
13023 fn (v: unit -> 'a) =>
13025 val th = (f (v ()) ; Exit.topLevelSuffix ())
13026 handle exn => MLtonExn.topLevelHandler exn
13027 val t = MLton.Thread.prepare (MLton.Thread.new th, ())
13029 MLton.Thread.switch (fn _ => t)
13034 It was pleasant to discover that it could equally well be 'derived' starting from the `callcc` and `throw` implementation.
As a final comment, we noted that the degree to which the context of `base` could be considered 'empty' (i.e., retaining few heap-allocated values) depended upon a slightly MLton-esque view. In particular, MLton does not heap allocate executable code. So, although the `base` context keeps a lot of unevaluated code 'live', such code is not heap allocated. In a system like SML/NJ, which does heap allocate executable code, one might want it to be the case that after throwing to an isolated function, the garbage collector retains only the code necessary to evaluate the function, and not any code that was necessary to evaluate the `base` context.
13040 :mlton-guide-page: MLtonCross
The Debian package MLton-Cross adds various targets to MLton. In
combination with the Emdebian project, this allows a Debian system to
compile SML files to other architectures.
13049 Currently, these targets are supported:
13051 * _Windows (MinGW)_
13052 ** -target i586-mingw32msvc (mlton-target-i586-mingw32msvc)
** -target amd64-mingw32msvc (mlton-target-amd64-mingw32msvc)
13055 ** -target alpha-linux-gnu (mlton-target-alpha-linux-gnu)
13056 ** -target arm-linux-gnueabi (mlton-target-arm-linux-gnueabi)
13057 ** -target hppa-linux-gnu (mlton-target-hppa-linux-gnu)
13058 ** -target i486-linux-gnu (mlton-target-i486-linux-gnu)
13059 ** -target ia64-linux-gnu (mlton-target-ia64-linux-gnu)
13060 ** -target mips-linux-gnu (mlton-target-mips-linux-gnu)
13061 ** -target mipsel-linux-gnu (mlton-target-mipsel-linux-gnu)
13062 ** -target powerpc-linux-gnu (mlton-target-powerpc-linux-gnu)
13063 ** -target s390-linux-gnu (mlton-target-s390-linux-gnu)
13064 ** -target sparc-linux-gnu (mlton-target-sparc-linux-gnu)
13065 ** -target x86-64-linux-gnu (mlton-target-x86-64-linux-gnu)
MLton-Cross is kept in sync with the current MLton release.
13072 * <!Attachment(MLtonCross,mlton-cross_20100608.orig.tar.gz)>
13076 :mlton-guide-page: MLtonExn
13083 signature MLTON_EXN =
13085 val addExnMessager: (exn -> string option) -> unit
13086 val history: exn -> string list
13088 val defaultTopLevelHandler: exn -> 'a
13089 val getTopLevelHandler: unit -> (exn -> unit)
13090 val setTopLevelHandler: (exn -> unit) -> unit
13091 val topLevelHandler: exn -> 'a
13095 * `addExnMessager f`
13097 adds `f` as a pretty-printer to be used by `General.exnMessage` for
13098 converting exceptions to strings. Messagers are tried in order from
13099 most recently added to least recently added.
returns the call stack at the point that `e` was first raised. Each
13104 element of the list is a file position. The elements are in reverse
13105 chronological order, i.e. the function called last is at the front of
13108 `history e` will return `[]` unless the program is compiled with
13109 `-const 'Exn.keepHistory true'`.
13111 * `defaultTopLevelHandler e`
behaves as the default top level handler; that is, it prints
out the unhandled exception message for `e` and exits.
13116 * `getTopLevelHandler ()`
13118 get the top level handler.
13120 * `setTopLevelHandler f`
13122 set the top level handler to the function `f`. The function `f`
13123 should not raise an exception or return normally.
13125 * `topLevelHandler e`
13127 behaves as if the top level handler received the exception `e`.
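
The following is a minimal sketch of how a messager and a custom top level handler might be installed. The exception `Parse` and the message formats are illustrative, not part of the library.

[source,sml]
----
(* A hypothetical application exception. *)
exception Parse of {line: int, msg: string}

(* Messager: SOME text for exceptions we recognize, NONE to defer to others. *)
val () =
   MLton.Exn.addExnMessager
   (fn Parse {line, msg} =>
          SOME (concat ["Parse error at line ", Int.toString line, ": ", msg])
     | _ => NONE)

(* General.exnMessage consults the registered messagers. *)
val text = General.exnMessage (Parse {line = 3, msg = "unexpected token"})
val () = print (text ^ "\n")

(* Handler: report on stderr and exit; it neither returns nor raises. *)
val () =
   MLton.Exn.setTopLevelHandler
   (fn e =>
       (TextIO.output (TextIO.stdErr, "fatal: " ^ General.exnMessage e ^ "\n")
        ; OS.Process.exit OS.Process.failure))
----

The handler ends with `OS.Process.exit`, in keeping with the requirement that it should not return normally.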
13131 :mlton-guide-page: MLtonFinalizable
13132 [[MLtonFinalizable]]
13138 signature MLTON_FINALIZABLE =
13142 val addFinalizer: 'a t * ('a -> unit) -> unit
13143 val finalizeBefore: 'a t * 'b t -> unit
13144 val new: 'a -> 'a t
13145 val touch: 'a t -> unit
13146 val withValue: 'a t * ('a -> 'b) -> 'b
13150 A _finalizable_ value is a container to which finalizers can be
13151 attached. A container holds a value, which is reachable as long as
13152 the container itself is reachable. A _finalizer_ is a function that
13153 runs at some point after garbage collection determines that the
13154 container to which it is attached has become
13155 <:Reachability:unreachable>. A finalizer is treated like a signal
13156 handler, in that it runs asynchronously in a separate thread, with
13157 signals blocked, and will not interrupt a critical section (see
13160 * `addFinalizer (v, f)`
13162 adds `f` as a finalizer to `v`. This means that sometime after the
13163 last call to `withValue` on `v` completes and `v` becomes unreachable,
13164 `f` will be called with the value of `v`.
13166 * `finalizeBefore (v1, v2)`
13168 ensures that `v1` will be finalized before `v2`. A cycle of values
13169 `v` = `v1`, ..., `vn` = `v` with `finalizeBefore (vi, vi+1)` will
13170 result in none of the `vi` being finalized.
13174 creates a new finalizable value, `v`, with value `x`. The finalizers
13175 of `v` will run sometime after the last call to `withValue` on `v`
13176 when the garbage collector determines that `v` is unreachable.
13180 ensures that `v`'s finalizers will not run before the call to `touch`.
13182 * `withValue (v, f)`
13184 returns the result of applying `f` to the value of `v` and ensures
13185 that `v`'s finalizers will not run before `f` completes. The call to
13186 `f` is a nontail call.
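
A minimal sketch of the interface, independent of any FFI: the container `v`, the counter it holds, and the text printed by the finalizer are illustrative. When the finalizer runs is up to the garbage collector, so the sketch makes no claim about the order of the output.

[source,sml]
----
structure F = MLton.Finalizable

(* Wrap a mutable counter in a finalizable container. *)
val v: int ref F.t = F.new (ref 0)

(* Called with the contained value at some point after v is unreachable. *)
val () =
   F.addFinalizer
   (v, fn r => print ("finalizing, count = " ^ Int.toString (!r) ^ "\n"))

(* withValue keeps v's finalizers from running until the function returns. *)
val n = F.withValue (v, fn r => (r := !r + 1; !r))

val () = print ("n = " ^ Int.toString n ^ "\n")
----
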
13191 Suppose that `finalizable.sml` contains the following:
13194 sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/finalizable.sml]
13197 Suppose that `cons.c` contains the following.
13200 sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/cons.c]
13203 We can compile these to create an executable with
13205 % mlton -default-ann 'allowFFI true' finalizable.sml cons.c
13208 Running this executable will create output like the following.
13211 0x08072890 = listSing (2)
13212 0x080728a0 = listCons (2)
13213 0x080728b0 = listCons (2)
13214 0x080728c0 = listCons (2)
13215 0x080728d0 = listCons (2)
13216 0x080728e0 = listCons (2)
13217 0x080728f0 = listCons (2)
13220 listFree (0x080728f0)
13221 listFree (0x080728e0)
13222 listFree (0x080728d0)
13223 listFree (0x080728c0)
13224 listFree (0x080728b0)
13225 listFree (0x080728a0)
13226 listFree (0x08072890)
13230 == Synchronous Finalizers ==
13232 Finalizers in MLton are asynchronous. That is, they run at an
13233 unspecified time, interrupting the user program. It is also possible,
13234 and sometimes useful, to have synchronous finalizers, where the user
13235 program explicitly decides when to run enabled finalizers. We have
13236 considered this in MLton, and it seems possible, but there are some
13237 unresolved design issues. See the thread at
13239 * http://www.mlton.org/pipermail/mlton/2004-September/016570.html
13247 :mlton-guide-page: MLtonGC
13254 signature MLTON_GC =
13256 val collect: unit -> unit
13257 val pack: unit -> unit
13258 val setMessages: bool -> unit
13259 val setSummary: bool -> unit
13260 val unpack: unit -> unit
13261 structure Statistics :
13263 val bytesAllocated: unit -> IntInf.int
13264 val lastBytesLive: unit -> IntInf.int
13265 val numCopyingGCs: unit -> IntInf.int
13266 val numMarkCompactGCs: unit -> IntInf.int
13267 val numMinorGCs: unit -> IntInf.int
13268 val maxBytesLive: unit -> IntInf.int
13275 causes a garbage collection to occur.
13279 shrinks the heap as much as possible so that other processes can use
13284 controls whether diagnostic messages are printed at the beginning and
13285 end of each garbage collection. It is the same as the `gc-messages`
13286 runtime system option.
13290 controls whether a summary of garbage collection statistics is printed
13291 upon termination of the program. It is the same as the `gc-summary`
13292 runtime system option.
13296 resizes a packed heap to the size desired by the runtime.
13298 * `Statistics.bytesAllocated ()`
13300 returns bytes allocated (as of the most recent garbage collection).
13302 * `Statistics.lastBytesLive ()`
13304 returns bytes live (as of the most recent garbage collection).
13306 * `Statistics.numCopyingGCs ()`
13308 returns number of (major) copying garbage collections performed (as of
13309 the most recent garbage collection).
13311 * `Statistics.numMarkCompactGCs ()`
13313 returns number of (major) mark-compact garbage collections performed
13314 (as of the most recent garbage collection).
13316 * `Statistics.numMinorGCs ()`
13318 returns number of minor garbage collections performed (as of the most
13319 recent garbage collection).
13321 * `Statistics.maxBytesLive ()`
13323 returns maximum bytes live (as of the most recent garbage collection).
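
As a rough sketch of how these functions might be combined, the following program forces a collection and prints two of the statistics before and after. The helper `report` and the list `big` are illustrative, and the numbers printed depend on the platform and runtime options.

[source,sml]
----
structure GC = MLton.GC
structure Stats = MLton.GC.Statistics

(* Print two statistics, both as of the most recent collection. *)
fun report (label: string): unit =
   print (concat [label,
                  ": allocated=", IntInf.toString (Stats.bytesAllocated ()),
                  " live=", IntInf.toString (Stats.lastBytesLive ()),
                  "\n"])

val () = report "before"

(* Allocate something sizable, then force a collection. *)
val big = List.tabulate (100000, fn i => i)
val () = GC.collect ()

val () = report "after"

(* Use big afterwards so that it is live across the collection. *)
val () = print ("length = " ^ Int.toString (length big) ^ "\n")
----
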
13327 :mlton-guide-page: MLtonIntInf
13334 signature MLTON_INT_INF =
13336 type t = IntInf.int
13338 val areSmall: t * t -> bool
13339 val gcd: t * t -> t
13340 val isSmall: t -> bool
13342 structure BigWord : WORD
13343 structure SmallInt : INTEGER
13345 Big of BigWord.word vector
13346 | Small of SmallInt.int
13348 val fromRep : rep -> t option
13352 MLton represents an arbitrary precision integer either as an unboxed
13353 word with the bottom bit set to 1 and the top bits representing a
13354 small signed integer, or as a pointer to a vector of words, where the
13355 first word indicates the sign and the rest are the limbs of a
13356 <:GnuMP:> big integer.
13360 the same as type `IntInf.int`.
13362 * `areSmall (a, b)`
13364 returns true iff both `a` and `b` are small.
13368 uses the <:GnuMP:GnuMP's> fast gcd implementation.
13372 returns true iff `a` is small.
13376 representation of a big `IntInf.int` as a vector of words; on 32-bit
13377 platforms, `BigWord` is likely to be equivalent to `Word32`, and on
13378 64-bit platforms, `BigWord` is likely to be equivalent to `Word64`.
13380 * `SmallInt : INTEGER`
13382 representation of a small `IntInf.int` as a signed integer; on 32-bit
13383 platforms, `SmallInt` is likely to be equivalent to `Int32`, and on
13384 64-bit platforms, `SmallInt` is likely to be equivalent to `Int64`.
13388 the underlying representation of an `IntInf.int`.
13392 returns the underlying representation of `i`.
13396 converts from the underlying representation back to an `IntInf.int`.
13397 If `fromRep r` is given anything besides the valid result of `rep i`
13398 for some `i`, this function call will return `NONE`.
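
The following sketch inspects the representation of two integers. The helper `describe` and the chosen values are illustrative; the number of words in a big representation depends on the platform, so the sketch prints it rather than assuming it.

[source,sml]
----
structure II = MLton.IntInf

val small: IntInf.int = 42
val big: IntInf.int = IntInf.pow (2, 200)

(* Summarize the underlying representation of an IntInf.int. *)
fun describe (i: IntInf.int): string =
   case II.rep i of
      II.Small n => "small: " ^ II.SmallInt.toString n
    | II.Big ws => "big: " ^ Int.toString (Vector.length ws) ^ " words"

val () = print (describe small ^ "\n")
val () = print (describe big ^ "\n")

(* A valid representation converts back to the same integer. *)
val roundTrip = II.fromRep (II.rep big) = SOME big

val g = II.gcd (12, 18)
----
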
13402 :mlton-guide-page: MLtonIO
13409 signature MLTON_IO =
13414 val inFd: instream -> Posix.IO.file_desc
13415 val mkstemp: string -> string * outstream
13416 val mkstemps: {prefix: string, suffix: string} -> string * outstream
13417 val newIn: Posix.IO.file_desc * string -> instream
13418 val newOut: Posix.IO.file_desc * string -> outstream
13419 val outFd: outstream -> Posix.IO.file_desc
13420 val tempPrefix: string -> string
13426 returns the file descriptor corresponding to `ins`.
like the C `mkstemp` function, generates and opens a temporary file
13433 * `mkstemps {prefix, suffix}`
13435 like `mkstemp`, except it has both a prefix and suffix.
13437 * `newIn (fd, name)`
13439 creates a new instream from file descriptor `fd`, with `name` used in
13440 any `Io` exceptions later raised.
13442 * `newOut (fd, name)`
13444 creates a new outstream from file descriptor `fd`, with `name` used in
13445 any `Io` exceptions later raised.
13449 returns the file descriptor corresponding to `out`.
13453 adds a suitable system or user specific prefix (directory) for temp
13458 :mlton-guide-page: MLtonItimer
13465 signature MLTON_ITIMER =
13472 val set: t * {interval: Time.time, value: Time.time} -> unit
13473 val signal: t -> Posix.Signal.signal
13477 * `set (t, {interval, value})`
13479 sets the interval timer (using `setitimer`) specified by `t` to the
13480 given `interval` and `value`.
13484 returns the signal corresponding to `t`.
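
A minimal sketch of a one-shot timer follows. It assumes the `Real` constructor of `MLton.Itimer.t` and the `setHandler` and `Handler.simple` functions of <:MLtonSignal:>, none of which are described on this page; the flag `fired` and the polling loop are illustrative.

[source,sml]
----
structure I = MLton.Itimer
structure S = MLton.Signal

val fired = ref false

(* Install a handler for the signal that the real-time timer delivers. *)
val () = S.setHandler (I.signal I.Real, S.Handler.simple (fn () => fired := true))

(* First expiry after one second; a zero interval means no repetition. *)
val () = I.set (I.Real, {interval = Time.zeroTime, value = Time.fromSeconds 1})

(* Poll until the handler has run. *)
fun wait (): unit =
   if !fired
      then ()
   else (OS.Process.sleep (Time.fromMilliseconds 50); wait ())

val () = wait ()
val () = print "timer fired\n"
----
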
13488 :mlton-guide-page: MLtonLibraryProject
13489 [[MLtonLibraryProject]]
13490 MLtonLibraryProject
13491 ===================
13493 We have a https://github.com/MLton/mltonlib[MLton Library repository]
13494 that is intended to collect libraries.
13497 https://github.com/MLton/mltonlib
13500 Libraries are kept in the `master` branch, and are grouped according
13501 to domain name, in the Java package style. For example,
13502 <:VesaKarvonen:>, who works at `ssh.com`, has been putting code at:
13505 https://github.com/MLton/mltonlib/tree/master/com/ssh
13508 <:StephenWeeks:>, owning `sweeks.com`, has been putting code at:
13511 https://github.com/MLton/mltonlib/tree/master/com/sweeks
13514 A "library" is a subdirectory of some such directory. For example,
13515 Stephen's basis-library replacement library is at
13518 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic
13521 We use "transparent per-library branching" to handle library
13522 versioning. Each library has an "unstable" subdirectory in which work
13523 happens. When one is happy with a library, one tags it by copying it
13524 to a stable version directory. Stable libraries are immutable -- when
13525 one refers to a stable library, one always gets exactly the same code.
13526 No one has actually made a stable library yet, but, when I'm ready to
13527 tag my library, I was thinking that I would do something like copying
13530 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/unstable
13536 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/v1
13539 So far, libraries in the MLton repository have been licensed under
13540 MLton's <:License:>. We haven't decided on whether that will be a
13541 requirement to be in the repository or not. For the sake of
13542 simplicity (a single license) and encouraging widest use of code,
13543 contributors are encouraged to use that license. But it may be too
13544 strict to require it.
13546 If someone wants to contribute a new library to our repository or to
13547 work on an old one, they can make a pull request. If people want to
13548 work in their own repository, they can do so -- that's the point of
13549 using domain names to prevent clashes. The idea is that a user should
13550 be able to bring library collections in from many different
13551 repositories without problems. And those libraries could even work
13554 At some point we may want to settle on an <:MLBasisPathMap:> variable
13555 for the root of the library project. Or, we could reuse `SML_LIB`,
13556 and migrate what we currently keep there into the library
13561 :mlton-guide-page: MLtonMonoArray
13568 signature MLTON_MONO_ARRAY =
13572 val fromPoly: elem array -> t
13573 val toPoly: t -> elem array
13579 type of monomorphic array
13583 type of array elements
13587 type cast a polymorphic array to its monomorphic counterpart; the
13588 argument and result arrays share the same identity
13592 type cast a monomorphic array to its polymorphic counterpart; the
13593 argument and result arrays share the same identity
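
As a small sketch of the shared identity, assuming the `MLton.Word8Array` instance of this signature from <:MLtonStructure:>, an update made through the polymorphic array is visible through the monomorphic one:

[source,sml]
----
(* A polymorphic array of bytes. *)
val poly: Word8.word array = Array.array (4, 0w0)

(* Cast, not copy: mono and poly are the same array. *)
val mono: Word8Array.array = MLton.Word8Array.fromPoly poly

(* Update through one view, read through the other. *)
val () = Array.update (poly, 0, 0w255)
val b = Word8Array.sub (mono, 0)

val () = print (Word8.toString b ^ "\n")
----
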
13597 :mlton-guide-page: MLtonMonoVector
13598 [[MLtonMonoVector]]
13604 signature MLTON_MONO_VECTOR =
13608 val fromPoly: elem vector -> t
13609 val toPoly: t -> elem vector
13615 type of monomorphic vector
13619 type of vector elements
13623 type cast a polymorphic vector to its monomorphic counterpart; in
13624 MLton, this is a constant-time operation
13628 type cast a monomorphic vector to its polymorphic counterpart; in
13629 MLton, this is a constant-time operation
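
A short sketch, assuming the `MLton.CharVector` instance of this signature from <:MLtonStructure:>. Since `CharVector.vector` is `string`, the cast turns a `char vector` into a string and back.

[source,sml]
----
(* A polymorphic vector of characters. *)
val chars: char vector = Vector.fromList [#"S", #"M", #"L"]

(* View it as a CharVector.vector, i.e. a string. *)
val s: string = MLton.CharVector.fromPoly chars

(* And back again. *)
val chars': char vector = MLton.CharVector.toPoly s

val () = print (s ^ "\n")
----
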
13633 :mlton-guide-page: MLtonPlatform
13640 signature MLTON_PLATFORM =
13644 datatype t = Alpha | AMD64 | ARM | ARM64 | HPPA | IA64 | m68k
13645 | MIPS | PowerPC | PowerPC64 | S390 | Sparc | X86
13647 val fromString: string -> t option
13649 val toString: t -> string
13654 datatype t = AIX | Cygwin | Darwin | FreeBSD | Hurd | HPUX
13655 | Linux | MinGW | NetBSD | OpenBSD | Solaris
13657 val fromString: string -> t option
13659 val toString: t -> string
13664 * `datatype Arch.t`
13666 processor architectures
13668 * `Arch.fromString a`
13670 converts from string to architecture. Case insensitive.
13674 the architecture for which the program is compiled.
13678 string for architecture.
13686 converts from string to operating system. Case insensitive.
13690 the operating system for which the program is compiled.
13694 string for operating system.
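
The following sketch reports the platform a program was compiled for and branches on it. The value names `isWindows` and `sameArch` are illustrative, and treating `Cygwin` and `MinGW` as "Windows" is a choice made for the example.

[source,sml]
----
structure P = MLton.Platform

val () =
   print (concat ["arch: ", P.Arch.toString P.Arch.host, "\n",
                  "os: ", P.OS.toString P.OS.host, "\n"])

(* Branch on the operating system the program was compiled for. *)
val isWindows =
   case P.OS.host of
      P.OS.Cygwin => true
    | P.OS.MinGW => true
    | _ => false

(* toString followed by fromString is expected to give back the same value. *)
val sameArch = P.Arch.fromString (P.Arch.toString P.Arch.host) = SOME P.Arch.host
----
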
13698 :mlton-guide-page: MLtonPointer
13705 signature MLTON_POINTER =
13709 val add: t * word -> t
13710 val compare: t * t -> order
13711 val diff: t * t -> word
13712 val getInt8: t * int -> Int8.int
13713 val getInt16: t * int -> Int16.int
13714 val getInt32: t * int -> Int32.int
13715 val getInt64: t * int -> Int64.int
13716 val getPointer: t * int -> t
13717 val getReal32: t * int -> Real32.real
13718 val getReal64: t * int -> Real64.real
13719 val getWord8: t * int -> Word8.word
13720 val getWord16: t * int -> Word16.word
13721 val getWord32: t * int -> Word32.word
13722 val getWord64: t * int -> Word64.word
13724 val setInt8: t * int * Int8.int -> unit
13725 val setInt16: t * int * Int16.int -> unit
13726 val setInt32: t * int * Int32.int -> unit
13727 val setInt64: t * int * Int64.int -> unit
13728 val setPointer: t * int * t -> unit
13729 val setReal32: t * int * Real32.real -> unit
13730 val setReal64: t * int * Real64.real -> unit
13731 val setWord8: t * int * Word8.word -> unit
13732 val setWord16: t * int * Word16.word -> unit
13733 val setWord32: t * int * Word32.word -> unit
13734 val setWord64: t * int * Word64.word -> unit
13735 val sizeofPointer: word
13736 val sub: t * word -> t
13742 the type of pointers, i.e. machine addresses.
returns the pointer `w` bytes after `p`. Does not check for
13749 * `compare (p1, p2)`
13751 compares the pointer `p1` to the pointer `p2` (as addresses).
13755 returns the number of bytes `w` such that `add (p2, w) = p1`. Does
13756 not check for overflow.
13758 * ++get__<X>__ (p, i)++
13760 returns the object stored at index i of the array of _X_ objects
13761 pointed to by `p`. For example, `getWord32 (p, 7)` returns the 32-bit
13762 word stored 28 bytes beyond `p`.
13766 the null pointer, i.e. 0.
13768 * ++set__<X>__ (p, i, v)++
13770 assigns `v` to the object stored at index i of the array of _X_
13771 objects pointed to by `p`. For example, `setWord32 (p, 7, w)` stores
13772 the 32-bit word `w` at the address 28 bytes beyond `p`.
13776 size, in bytes, of a pointer.
13780 returns the pointer `w` bytes before `p`. Does not check for
13785 :mlton-guide-page: MLtonProcEnv
13792 signature MLTON_PROC_ENV =
13796 val setenv: {name: string, value: string} -> unit
13797 val setgroups: gid list -> unit
13801 * `setenv {name, value}`
13803 like the C `setenv` function. Does not require `name` or `value` to
13804 be null terminated.
13808 like the C `setgroups` function.
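
A minimal sketch of `setenv`; the variable name `MLTON_GUIDE_DEMO` is made up for the example. The new value is read back with the standard `OS.Process.getEnv`.

[source,sml]
----
(* Set a variable in the environment of the current process. *)
val () = MLton.ProcEnv.setenv {name = "MLTON_GUIDE_DEMO", value = "42"}

(* Read it back through the Basis Library. *)
val v: string option = OS.Process.getEnv "MLTON_GUIDE_DEMO"

val () =
   print (case v of
             NONE => "unset\n"
           | SOME s => "MLTON_GUIDE_DEMO=" ^ s ^ "\n")
----
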
13812 :mlton-guide-page: MLtonProcess
13819 signature MLTON_PROCESS =
13823 val spawn: {args: string list, path: string} -> pid
13824 val spawne: {args: string list, env: string list, path: string} -> pid
13825 val spawnp: {args: string list, file: string} -> pid
13827 type ('stdin, 'stdout, 'stderr) t
13836 exception MisuseOfForget
13837 exception DoublyRedirected
13841 type ('use, 'dir) t
13843 val binIn: (BinIO.instream, input) t -> BinIO.instream
13844 val binOut: (BinIO.outstream, output) t -> BinIO.outstream
13845 val fd: (Posix.FileSys.file_desc, 'dir) t -> Posix.FileSys.file_desc
13846 val remember: (any, 'dir) t -> ('use, 'dir) t
13847 val textIn: (TextIO.instream, input) t -> TextIO.instream
13848 val textOut: (TextIO.outstream, output) t -> TextIO.outstream
13853 type ('use, 'dir) t
13855 val child: (chain, 'dir) Child.t -> (none, 'dir) t
13856 val fd: Posix.FileSys.file_desc -> (none, 'dir) t
13857 val file: string -> (none, 'dir) t
13858 val forget: ('use, 'dir) t -> (any, 'dir) t
13859 val null: (none, 'dir) t
13860 val pipe: ('use, 'dir) t
13861 val self: (none, 'dir) t
13865 {args: string list,
13866 env: string list option,
13868 stderr: ('stderr, output) Param.t,
13869 stdin: ('stdin, input) Param.t,
13870 stdout: ('stdout, output) Param.t}
13871 -> ('stdin, 'stdout, 'stderr) t
13872 val getStderr: ('stdin, 'stdout, 'stderr) t -> ('stderr, input) Child.t
13873 val getStdin: ('stdin, 'stdout, 'stderr) t -> ('stdin, output) Child.t
13874 val getStdout: ('stdin, 'stdout, 'stderr) t -> ('stdout, input) Child.t
13875 val kill: ('stdin, 'stdout, 'stderr) t * Posix.Signal.signal -> unit
13876 val reap: ('stdin, 'stdout, 'stderr) t -> Posix.Process.exit_status
13883 The `spawn` functions provide an alternative to the
13884 `fork`/`exec` idiom that is typically used to create a new
13885 process. On most platforms, the `spawn` functions are simple
13886 wrappers around `fork`/`exec`. However, under Windows, the
13887 `spawn` functions are primitive. All `spawn` functions return
13888 the process id of the spawned process. They differ in how the
13889 executable is found and the environment that it uses.
13891 * `spawn {args, path}`
13893 starts a new process running the executable specified by `path`
13894 with the arguments `args`. Like `Posix.Process.exec`.
13896 * `spawne {args, env, path}`
13898 starts a new process running the executable specified by `path` with
13899 the arguments `args` and environment `env`. Like
13900 `Posix.Process.exece`.
13902 * `spawnp {args, file}`
13904 search the `PATH` environment variable for an executable named `file`,
13905 and start a new process running that executable with the arguments
13906 `args`. Like `Posix.Process.execp`.
`MLton.Process.create` provides functionality similar to
`Unix.executeInEnv`, but provides more control over the input,
output, and error streams. In addition, `create` works on all
platforms, including Cygwin and MinGW (Windows) where `Posix.fork` is
unavailable. For greatest portability, programs should still use the
standard `Unix.execute`, `Unix.executeInEnv`, and `OS.Process.system`.
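
As a rough sketch of `create`, the helper below runs a child with its standard output connected to a pipe and collects what the child prints. The helper name `runEcho` and the path `/bin/echo` are assumptions for illustration, as is the reading that `args` excludes the program name (as with `Unix.execute`).

[source,sml]
----
structure P = MLton.Process

(* Run /bin/echo with one argument; return its output and exit status. *)
fun runEcho (msg: string): string * Posix.Process.exit_status =
   let
      val child =
         P.create {args = [msg],
                   env = NONE,              (* inherit the environment *)
                   path = "/bin/echo",
                   stderr = P.Param.self,   (* share the parent's stderr *)
                   stdin = P.Param.null,    (* child sees EOF or an error *)
                   stdout = P.Param.pipe}   (* parent reads child's stdout *)
      (* Bind the child's stdout handle to a text input stream. *)
      val ins = P.Child.textIn (P.getStdout child)
      val output = TextIO.inputAll ins
      val () = TextIO.closeIn ins
   in
      (output, P.reap child)
   end

val (output, status) = runEcho "hello from child"
val () = print output
----

Using the same `stdout` handle with `Child.binIn` instead would be rejected by the type checker, since the handle's use has already been fixed to `TextIO.instream`.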
13918 The following types and sub-structures are used by the `create`
13919 function. They provide static type checking of correct stream usage.
13923 * `('use, 'dir) Child.t`
13925 This represents a handle to one of a child's standard streams. The
13926 `'dir` is viewed with respect to the parent. Thus a `('a, input)
13927 Child.t` handle means that the parent may input the output from the
13930 * `Child.{bin,text}{In,Out} h`
13932 These functions take a handle and bind it to a stream of the named
13933 type. The type system will detect attempts to reverse the direction
13934 of a stream or to use the same stream in multiple, incompatible ways.
13938 This function behaves like the other `Child.*` functions; it opens a
13939 stream. However, it does not enforce that you read or write from the
13940 handle. If you use the descriptor in an inappropriate direction, the
13941 behavior is undefined. Furthermore, this function may potentially be
13942 unavailable on future MLton host platforms.
13944 * `Child.remember h`
13946 This function takes a stream of use `any` and resets the use of the
13947 stream so that the stream may be used by `Child.*`. An `any` stream
13948 may have had use `none` or `'use` prior to calling `Param.forget`. If
13949 the stream was `none` and is used, `MisuseOfForget` is raised.
13953 * `('use, 'dir) Param.t`
13955 This is a handle to an input/output source and will be passed to the
13956 created child process. The `'dir` is relative to the child process.
13957 Input means that the child process will read from this stream.
13961 Connect the stream of the new child process to the stream of a
13962 previously created child process. A single child stream should be
13963 connected to only one child process or else `DoublyRedirected` will be
13968 This creates a stream from the provided file descriptor which will be
13969 closed when `create` is called. This function may not be available on
13970 future MLton host platforms.
13974 This hides the type of the actual parameter as `any`. This is useful
13975 if you are implementing an application which conditionally attaches
13976 the child process to files or pipes. However, you must ensure that
13977 your use after `Child.remember` matches the original type.
13981 Open the given file and connect it to the child process. Note that the
13982 file will be opened only when `create` is called. So any exceptions
13983 will be raised there and not by this function. If used for `input`,
13984 the file is opened read-only. If used for `output`, the file is opened
13989 In some situations, the child process should have its output
13990 discarded. The `null` param when passed as `stdout` or `stderr` does
13991 this. When used for `stdin`, the child process will either receive
13992 `EOF` or a failure condition if it attempts to read from `stdin`.
13996 This will connect the input/output of the child process to a pipe
13997 which the parent process holds. This may later form the input to one
13998 of the `Child.*` functions and/or the `Param.child` function.
14002 This will connect the input/output of the child process to the
14003 corresponding stream of the parent process.
14007 * `type ('stdin, 'stdout, 'stderr) t`
14009 represents a handle to a child process. The type arguments capture
14010 how the named stream of the child process may be used.
14014 bypasses the type system in situations where an application does not
want it to enforce correct usage. See `Child.remember` and
14020 means that the child process's stream was connected via a pipe to the
14021 parent process. The parent process may pass this pipe in turn to
14022 another child, thus chaining them together.
14024 * `type input, output`
14026 record the direction that a stream flows. They are used as a part of
`Param.t` and `Child.t` and are detailed there.
means that the child process's stream may not be used by the parent
14032 process. This happens when the child process is connected directly to
14035 The types `BinIO.instream`, `BinIO.outstream`, `TextIO.instream`,
14036 `TextIO.outstream`, and `Posix.FileSys.file_desc` are also valid types
14037 with which to instantiate child streams.
14039 * `exception MisuseOfForget`
14041 may be raised if `Child.remember` and `Param.forget` are used to
14042 bypass the normal type checking. This exception will only be raised
14043 in cases where the `forget` mechanism allows a misuse that would be
14044 impossible with the type-safe versions.
14046 * `exception DoublyRedirected`
14048 raised if a stream connected to a child process is redirected to two
separate child processes. It is safe, though bad style, to use a
14050 `Child.t` with the same `Child.*` function repeatedly.
14052 * `create {args, path, env, stderr, stdin, stdout}`
14054 starts a child process with the given command-line `args` (excluding
14055 the program name). `path` should be an absolute path to the executable
14056 run in the new child process; relative paths work, but are less
14057 robust. Optionally, the environment may be overridden with `env`
14058 where each string element has the form `"key=value"`. The `std*`
14059 options must be provided by the `Param.*` functions documented above.
14061 Processes which are `create`-d must be either `reap`-ed or `kill`-ed.
14063 * `getStd{in,out,err} proc`
14065 gets a handle to the specified stream. These should be used by the
14066 `Child.*` functions. Failure to use a stream connected via pipe to a
child process may result in run-time deadlock and elicits a compiler
14070 * `kill (proc, sig)`
14072 terminates the child process immediately. The signal may or may not
14073 mean anything depending on the host platform. A good value is
14074 `Posix.Signal.term`.
waits for the child process to terminate and returns its exit status.
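
The lifecycle described above (create a child, read its piped output,
then reap it) can be sketched as follows. This is only an illustration:
it assumes a POSIX host on which `/bin/echo` exists, and the path and
argument are placeholders rather than anything the library requires.

[source,sml]
----
open MLton.Process

(* Assumption: /bin/echo exists on this host. *)
val p =
   create {args = ["hello"],
           env = NONE,
           path = "/bin/echo",
           stderr = Param.self,   (* share the parent's stderr *)
           stdin = Param.null,    (* the child gets no input *)
           stdout = Param.pipe}   (* the parent holds the read end *)

(* Bind the piped stdout to a text stream and drain it. *)
val ins = Child.textIn (getStdout p)
val output = TextIO.inputAll ins
val () = TextIO.closeIn ins

(* Every created process must be reaped (or killed). *)
val status = reap p
val () = print output
----

Only one pipe connects the parent and the child here, so the pipe graph
cannot contain a cycle.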
14081 == Important usage notes ==
14083 When building an application with many pipes between child processes,
14084 it is important to ensure that there are no cycles in the undirected
pipe graph. If this property is not maintained, deadlocks become a
serious potential bug that may appear only under difficult-to-reproduce
conditions.
14089 The danger lies in that most operating systems implement pipes with a
14090 fixed buffer size. If process A has two output pipes which process B
14091 reads, it can happen that process A blocks writing to pipe 2 because
14092 it is full while process B blocks reading from pipe 1 because it is
14093 empty. This same situation can happen with any undirected cycle formed
14094 between processes (vertexes) and pipes (undirected edges) in the
14097 It is possible to make this safe using low-level I/O primitives for
polling. However, these primitives are not very portable and are
difficult to use properly. A far better approach is to make sure you
14100 never create a cycle in the first place.
For these reasons, `Unix.executeInEnv` is a very dangerous
14103 function. Be careful when using it to ensure that the child process
14104 only operates on either `stdin` or `stdout`, but not both.
14107 == Example use of MLton.Process.create ==
14109 The following example program launches the `ipconfig` utility, pipes
14110 its output through `grep`, and then reads the result back into the
14117 create {args = [ "/all" ],
14119 path = "C:\\WINDOWS\\system32\\ipconfig.exe",
14120 stderr = Param.self,
14121 stdin = Param.null,
14122 stdout = Param.pipe}
14124 create {args = [ "IP-Ad" ],
14126 path = "C:\\msys\\bin\\grep.exe",
14127 stderr = Param.self,
14128 stdin = Param.child (getStdout p),
14129 stdout = Param.pipe}
14131 case TextIO.inputLine h of
14133 | SOME s => (print ("'" ^ s ^ "'\n"); suck h)
14135 val () = suck (Child.textIn (getStdout q))
14140 :mlton-guide-page: MLtonProfile
14147 signature MLTON_PROFILE =
14153 val equals: t * t -> bool
14154 val free: t -> unit
14155 val malloc: unit -> t
14156 val write: t * string -> unit
14160 val withData: Data.t * (unit -> 'a) -> 'a
14164 `MLton.Profile` provides <:Profiling:> control from within the
14165 program, allowing you to profile individual portions of your
14166 program. With `MLton.Profile`, you can create many units of profiling
14167 data (essentially, mappings from functions to counts) during a run of
14168 a program, switch between them while the program is running, and
14169 output multiple `mlmon.out` files.
14173 a compile-time constant that is false only when compiling `-profile no`.
14177 the type of a unit of profiling data. In order to most efficiently
14178 execute non-profiled programs, when compiling `-profile no` (the
14179 default), `Data.t` is equivalent to `unit ref`.
14181 * `Data.equals (x, y)`
returns true if `x` and `y` are the same unit of profiling data.
14187 frees the memory associated with the unit of profiling data `x`. It
14188 is an error to free the current unit of profiling data or to free a
14189 previously freed unit of profiling data. When compiling
14190 `-profile no`, `Data.free x` is a no-op.
14194 returns a new unit of profiling data. Each unit of profiling data is
14195 allocated from the process address space (but is _not_ in the MLton
14196 heap) and consumes memory proportional to the number of source
14197 functions. When compiling `-profile no`, `Data.malloc ()` is
14198 equivalent to allocating a new `unit ref`.
14202 writes the accumulated ticks in the unit of profiling data `x` to file
14203 `f`. It is an error to write a previously freed unit of profiling
14204 data. When compiling `-profile no`, `write (x, f)` is a no-op. A
14205 profiled program will always write the current unit of profiling data
14206 at program exit to a file named `mlmon.out`.
14208 * `withData (d, f)`
14210 runs `f` with `d` as the unit of profiling data, and returns the
14211 result of `f` after restoring the current unit of profiling data.
14212 When compiling `-profile no`, `withData (d, f)` is equivalent to
14218 Here is an example, taken from the `examples/profiling` directory,
14219 showing how to profile the executions of the `fib` and `tak` functions
14220 separately. Suppose that `fib-tak.sml` contains the following.
14223 structure Profile = MLton.Profile
14225 val fibData = Profile.Data.malloc ()
14226 val takData = Profile.Data.malloc ()
14228 fun wrap (f, d) x =
14229 Profile.withData (d, fn () => f x)
14234 | n => fib (n - 1) + fib (n - 2)
14235 val fib = wrap (fib, fibData)
14240 else tak (tak (x - 1, y, z),
14243 val tak = wrap (tak, takData)
14247 | n => (fib 38; f (n-1))
14252 | n => (tak (18,12,6); g (n-1))
14255 fun done (data, file) =
14256 (Profile.Data.write (data, file)
14257 ; Profile.Data.free data)
14259 val _ = done (fibData, "mlmon.fib.out")
14260 val _ = done (takData, "mlmon.tak.out")
14263 Compile and run the program.
14265 % mlton -profile time fib-tak.sml
14269 Separately display the profiling data for `fib`
14271 % mlprof fib-tak mlmon.fib.out
14272 5.77 seconds of CPU time (0.00 seconds GC)
14280 % mlprof fib-tak mlmon.tak.out
14281 0.68 seconds of CPU time (0.00 seconds GC)
14287 Combine the data for `fib` and `tak` by calling `mlprof`
14288 with multiple `mlmon.out` files.
14290 % mlprof fib-tak mlmon.fib.out mlmon.tak.out mlmon.out
14291 6.45 seconds of CPU time (0.00 seconds GC)
14301 :mlton-guide-page: MLtonRandom
14308 signature MLTON_RANDOM =
14310 val alphaNumChar: unit -> char
14311 val alphaNumString: int -> string
14312 val rand: unit -> word
14313 val seed: unit -> word option
14314 val srand: word -> unit
14315 val useed: unit -> word option
14319 * `alphaNumChar ()`
14321 returns a random alphanumeric character.
14323 * `alphaNumString n`
14325 returns a string of length `n` of random alphanumeric characters.
14329 returns the next pseudo-random number.
14333 returns a random word from `/dev/random`. Useful as an arg to
`srand`. If `/dev/random` cannot be read from, `seed ()` returns
14335 `NONE`. A call to `seed` may block until enough random bits are
14340 sets the seed used by `rand` to `w`.
14344 returns a random word from `/dev/urandom`. Useful as an arg to
`srand`. If `/dev/urandom` cannot be read from, `useed ()` returns
14346 `NONE`. A call to `useed` will never block -- it will instead return
14347 lower quality random bits.
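
A typical use is to seed the generator once and then draw values. The
following is a minimal sketch; the clock-based fallback seed and the
token length of 16 are illustrative choices, not part of the library.

[source,sml]
----
structure Random = MLton.Random

(* Prefer the non-blocking source; fall back to the clock if
 * /dev/urandom cannot be read (fallback chosen for this sketch). *)
val seed =
   case Random.useed () of
      SOME w => w
    | NONE => Word.fromLargeInt (Time.toMilliseconds (Time.now ()))

val () = Random.srand seed

(* A 16-character token, e.g. for a temporary file name. *)
val token = Random.alphaNumString 16
val () = print (token ^ "\n")
----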
14351 :mlton-guide-page: MLtonReal
14358 signature MLTON_REAL =
14362 val fromWord: word -> t
14363 val fromLargeWord: LargeWord.word -> t
14364 val toWord: IEEEReal.rounding_mode -> t -> word
14365 val toLargeWord: IEEEReal.rounding_mode -> t -> LargeWord.word
14371 the type of reals. For `MLton.LargeReal` this is `LargeReal.real`,
14372 for `MLton.Real` this is `Real.real`, for `MLton.Real32` this is
14373 `Real32.real`, for `MLton.Real64` this is `Real64.real`.
14376 * `fromLargeWord w`
14378 convert the word `w` to a real value. If the value of `w` is larger
14379 than (the appropriate) `REAL.maxFinite`, then infinity is returned.
14380 If `w` cannot be exactly represented as a real value, then the current
14381 rounding mode is used to determine the resulting value.
14384 * `toLargeWord mode r`
14386 convert the argument `r` to a word type using the specified rounding
14387 mode. They raise `Overflow` if the result is not representable, in
14388 particular, if `r` is an infinity. They raise `Domain` if `r` is NaN.
14390 * `MLton.Real32.castFromWord w`
14391 * `MLton.Real64.castFromWord w`
14393 convert the argument `w` to a real type as a bit-wise cast.
14395 * `MLton.Real32.castToWord r`
14396 * `MLton.Real64.castToWord r`
14398 convert the argument `r` to a word type as a bit-wise cast.
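
The difference between the numeric conversions and the bit-wise casts
can be sketched as follows. The constants are chosen only for
illustration; the bit pattern of `1.0` assumes the usual IEEE 754
double-precision representation.

[source,sml]
----
(* Numeric conversion: the word 42 becomes the real 42.0. *)
val r = MLton.Real.fromWord 0w42

(* Rounding is explicit when converting back to a word. *)
val down = MLton.Real.toWord IEEEReal.TO_ZERO 2.75
val up = MLton.Real.toWord IEEEReal.TO_POSINF 2.75

(* Bit-wise cast: the IEEE 754 representation of 1.0 as a 64-bit word. *)
val bits = MLton.Real64.castToWord 1.0
val one = MLton.Real64.castFromWord bits

val () = print (Word64.toString bits ^ "\n")
----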
14402 :mlton-guide-page: MLtonRlimit
14409 signature MLTON_RLIMIT =
14411 structure RLim : sig
14413 val castFromSysWord: SysWord.word -> t
14414 val castToSysWord: t -> SysWord.word
14417 val infinity: RLim.t
14421 val coreFileSize: t (* CORE max core file size *)
14422 val cpuTime: t (* CPU CPU time in seconds *)
14423 val dataSize: t (* DATA max data size *)
val fileSize: t (* FSIZE max file size *)
14425 val numFiles: t (* NOFILE max number of open files *)
14426 val lockedInMemorySize: t (* MEMLOCK max locked address space *)
14427 val numProcesses: t (* NPROC max number of processes *)
14428 val residentSetSize: t (* RSS max resident set size *)
14429 val stackSize: t (* STACK max stack size *)
14430 val virtualMemorySize: t (* AS virtual memory limit *)
val get: t -> {hard: RLim.t, soft: RLim.t}
val set: t * {hard: RLim.t, soft: RLim.t} -> unit
14437 `MLton.Rlimit` provides a wrapper around the C `getrlimit` and
14438 `setrlimit` functions.
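
For instance, a program might disable core dumps by lowering the soft
core-file limit to zero while leaving the hard limit untouched. The
following is a sketch under the assumption that the `RLim` conversion
functions from the signature above are available.

[source,sml]
----
structure R = MLton.Rlimit

(* Keep the hard limit, lower the soft limit to zero. *)
val {hard, soft = _} = R.get R.coreFileSize
val () =
   R.set (R.coreFileSize,
          {hard = hard, soft = R.RLim.castFromSysWord 0w0})

(* Read the limit back to confirm the change. *)
val {soft, ...} = R.get R.coreFileSize
val () =
   print ("soft core limit: "
          ^ SysWord.fmt StringCvt.DEC (R.RLim.castToSysWord soft)
          ^ "\n")
----

Lowering a soft limit does not normally require privileges; raising a
hard limit usually does.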
14442 the type of resource limits.
14446 indicates that a resource is unlimited.
14450 the types of resources that can be inspected and modified.
14454 returns the current hard and soft limits for resource `r`. May raise
14457 * `set (r, {hard, soft})`
14459 sets the hard and soft limits for resource `r`. May raise
14464 :mlton-guide-page: MLtonRusage
14471 signature MLTON_RUSAGE =
14473 type t = {utime: Time.time, (* user time *)
14474 stime: Time.time} (* system time *)
14476 val measureGC: bool -> unit
14477 val rusage: unit -> {children: t, gc: t, self: t}
14483 corresponds to a subset of the C `struct rusage`.
14487 controls whether garbage collection time is separately measured during
14488 program execution. This affects the behavior of both `rusage` and
14489 `Timer.checkCPUTimes`, both of which will return gc times of zero with
14490 `measureGC false`. Garbage collection time is always measured when
14491 either `gc-messages` or `gc-summary` is given as a
14492 <:RunTimeOptions:runtime system option>.
14496 corresponds to the C `getrusage` function. It returns the resource
14497 usage of the exited children, the garbage collector, and the process
14498 itself. The `self` component includes the usage of the `gc`
14499 component, regardless of whether `measureGC` is `true` or `false`. If
14500 `rusage` is used in a program, either directly, or indirectly via the
14501 `Timer` structure, then `measureGC true` is automatically called at
the start of the program (it can still be disabled by user code later).
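
As an illustration, the user time consumed by a computation can be
estimated by sampling `rusage` before and after it. The `fib` workload
below is a stand-in chosen for this sketch.

[source,sml]
----
fun fib n = if n < 2 then n else fib (n - 1) + fib (n - 2)

val {self = {utime = u0, ...}, ...} = MLton.Rusage.rusage ()
val n = fib 30
val {self = {utime = u1, ...}, gc = {utime = g1, ...}, ...} =
   MLton.Rusage.rusage ()

val () = print ("fib 30 = " ^ Int.toString n ^ "\n")
val () = print ("user time: " ^ Time.toString (Time.- (u1, u0)) ^ "s\n")
val () = print ("gc user time so far: " ^ Time.toString g1 ^ "s\n")
----

Because `self` includes the `gc` component, the difference reported
here also counts any garbage collection triggered by the workload.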
14506 :mlton-guide-page: MLtonSignal
14513 signature MLTON_SIGNAL =
14515 type t = Posix.Signal.signal
14523 val handler: (Thread.Runnable.t -> Thread.Runnable.t) -> t
14525 val isDefault: t -> bool
14526 val isIgnore: t -> bool
14527 val simple: (unit -> unit) -> t
14535 val allBut: signal list -> t
14536 val block: t -> unit
14537 val getBlocked: unit -> t
14538 val isMember: t * signal -> bool
14540 val setBlocked: t -> unit
14541 val some: signal list -> t
14542 val unblock: t -> unit
14545 val getHandler: t -> Handler.t
14546 val handled: unit -> Mask.t
14548 val restart: bool ref
14549 val setHandler: t * Handler.t -> unit
14550 val suspend: Mask.t -> unit
Signal handlers are functions from (runnable) threads to (runnable)
14556 threads. When a signal arrives, the corresponding signal handler is
14557 invoked, its argument being the thread that was interrupted by the
14558 signal. The signal handler runs asynchronously, in its own thread.
14559 The signal handler returns the thread that it would like to resume
14560 execution (this is often the thread that it was passed). It is an
14561 error for a signal handler to raise an exception that is not handled
14562 within the signal handler itself.
14564 A signal handler is never invoked while the running thread is in a
14565 critical section (see <:MLtonThread:>). Invoking a signal handler
14566 implicitly enters a critical section and the normal return of a signal
14567 handler implicitly exits the critical section; hence, a signal handler
14568 is never interrupted by another signal handler.
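
A minimal sketch of installing a handler and of temporarily blocking a
signal is shown here. The choice of `SIGUSR1` and the counter are
assumptions made for illustration; the handler runs asynchronously, so
the sketch makes no claim about when the counter is updated.

[source,sml]
----
structure Signal = MLton.Signal

val usr1 = Posix.Signal.usr1
val hits = ref 0

(* A simple handler runs a thunk and resumes the interrupted thread. *)
val () =
   Signal.setHandler
   (usr1, Signal.Handler.simple (fn () => hits := !hits + 1))

val installed = not (Signal.Handler.isDefault (Signal.getHandler usr1))

(* Block SIGUSR1 around a region that must not be interrupted by it. *)
val mask = Signal.Mask.some [usr1]
val () = Signal.Mask.block mask
val () = print "in protected region\n"
val () = Signal.Mask.unblock mask
----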
14572 the type of signals.
14576 the type of signal handlers.
14578 * `Handler.default`
14580 handles the signal with the default action.
14582 * `Handler.handler f`
14584 returns a handler `h` such that when a signal `s` is handled by `h`,
14585 `f` will be passed the thread that was interrupted by `s` and should
14586 return the thread that will resume execution.
14590 is a handler that will ignore the signal.
14592 * `Handler.isDefault`
14594 returns true if the handler is the default handler.
14596 * `Handler.isIgnore`
14598 returns true if the handler is the ignore handler.
14600 * `Handler.simple f`
14602 returns a handler that executes `f ()` and does not switch threads.
14606 the type of signal masks, which are sets of blocked signals.
14610 a mask of all signals.
14614 a mask of all signals except for those in `l`.
14618 blocks all signals in `m`.
14620 * `Mask.getBlocked ()`
14622 gets the signal mask `m`, i.e. a signal is blocked if and only if it
14625 * `Mask.isMember (m, s)`
14627 returns true if the signal `s` is in `m`.
14631 a mask of no signals.
14633 * `Mask.setBlocked m`
14635 sets the signal mask to `m`, i.e. a signal is blocked if and only if
14640 a mask of the signals in `l`.
14644 unblocks all signals in `m`.
14648 returns the current handler for signal `s`.
14652 returns the signal mask `m` corresponding to the currently handled
14653 signals; i.e., a signal is handled if and only if it is in `m`.
14657 `SIGPROF`, the profiling signal.
14661 dynamically determines the behavior of interrupted system calls; when
14662 `true`, interrupted system calls are restarted; when `false`,
14663 interrupted system calls raise `OS.SysError`.
14665 * `setHandler (s, h)`
14667 sets the handler for signal `s` to `h`.
14671 temporarily sets the signal mask to `m` and suspends until an unmasked
14672 signal is received and handled, at which point `suspend` resets the
14677 `SIGVTALRM`, the signal for virtual timers.
14680 == Interruptible System Calls ==
14682 Signal handling interacts in a non-trivial way with those functions in
14683 the <:BasisLibrary:Basis Library> that correspond directly to
14684 interruptible system calls (a subset of those functions that may raise
14685 `OS.SysError`). The desire is that these functions should have
14686 predictable semantics. The principal concerns are:
14688 1. System calls that are interrupted by signals should, by default, be
14689 restarted; the alternative is to raise
14693 OS.SysError (Posix.Error.errorMsg Posix.Error.intr,
14694 SOME Posix.Error.intr)
14697 This behavior is determined dynamically by the value of `Signal.restart`.
14699 2. Signal handlers should always get a chance to run (when outside a
14700 critical region). If a system call is interrupted by a signal, then
14701 the signal handler will run before the call is restarted or
14702 `OS.SysError` is raised; that is, before the `Signal.restart` check.
14704 3. A system call that must be restarted while in a critical section
14705 will be restarted with the handled signals blocked (and the previously
14706 blocked signals remembered). This encourages the system call to
14707 complete, allowing the program to make progress towards leaving the
14708 critical section where the signal can be handled. If the system call
completes, the set of blocked signals is restored to those previously
14714 :mlton-guide-page: MLtonStructure
14719 The `MLton` structure contains a lot of functionality that is not
14720 available in the <:BasisLibrary:Basis Library>. As a warning,
14721 please keep in mind that the `MLton` structure and its
14722 substructures do change from release to release of MLton.
14728 val eq: 'a * 'a -> bool
14729 val equal: 'a * 'a -> bool
14730 val hash: 'a -> Word32.word
14732 val share: 'a -> unit
14733 val shareAll: unit -> unit
14734 val size: 'a -> int
14736 structure Array: MLTON_ARRAY
14737 structure BinIO: MLTON_BIN_IO
14738 structure CharArray: MLTON_MONO_ARRAY where type t = CharArray.array
14739 where type elem = CharArray.elem
14740 structure CharVector: MLTON_MONO_VECTOR where type t = CharVector.vector
14741 where type elem = CharVector.elem
14742 structure Cont: MLTON_CONT
14743 structure Exn: MLTON_EXN
14744 structure Finalizable: MLTON_FINALIZABLE
14745 structure GC: MLTON_GC
14746 structure IntInf: MLTON_INT_INF
14747 structure Itimer: MLTON_ITIMER
14748 structure LargeReal: MLTON_REAL where type t = LargeReal.real
14749 structure LargeWord: MLTON_WORD where type t = LargeWord.word
14750 structure Platform: MLTON_PLATFORM
14751 structure Pointer: MLTON_POINTER
14752 structure ProcEnv: MLTON_PROC_ENV
14753 structure Process: MLTON_PROCESS
14754 structure Profile: MLTON_PROFILE
14755 structure Random: MLTON_RANDOM
14756 structure Real: MLTON_REAL where type t = Real.real
14757 structure Real32: sig
14759 val castFromWord: Word32.word -> t
14760 val castToWord: t -> Word32.word
14761 end where type t = Real32.real
14762 structure Real64: sig
14764 val castFromWord: Word64.word -> t
14765 val castToWord: t -> Word64.word
14766 end where type t = Real64.real
14767 structure Rlimit: MLTON_RLIMIT
14768 structure Rusage: MLTON_RUSAGE
14769 structure Signal: MLTON_SIGNAL
14770 structure Syslog: MLTON_SYSLOG
14771 structure TextIO: MLTON_TEXT_IO
14772 structure Thread: MLTON_THREAD
14773 structure Vector: MLTON_VECTOR
14774 structure Weak: MLTON_WEAK
14775 structure Word: MLTON_WORD where type t = Word.word
14776 structure Word8: MLTON_WORD where type t = Word8.word
14777 structure Word16: MLTON_WORD where type t = Word16.word
14778 structure Word32: MLTON_WORD where type t = Word32.word
14779 structure Word64: MLTON_WORD where type t = Word64.word
14780 structure Word8Array: MLTON_MONO_ARRAY where type t = Word8Array.array
14781 where type elem = Word8Array.elem
14782 structure Word8Vector: MLTON_MONO_VECTOR where type t = Word8Vector.vector
14783 where type elem = Word8Vector.elem
14784 structure World: MLTON_WORLD
14789 == Substructures ==
14795 * <:MLtonFinalizable:>
14800 * <:MLtonMonoArray:>
14801 * <:MLtonMonoVector:>
14802 * <:MLtonPlatform:>
14823 returns true if `x` and `y` are equal as pointers. For simple types
like `char`, `int`, and `word`, this is the same as equality. For
14825 arrays, datatypes, strings, tuples, and vectors, this is a simple
14826 pointer equality. The semantics is a bit murky.
14830 returns true if `x` and `y` are structurally equal. For equality
14831 types, this is the same as <:PolymorphicEquality:>. For other types,
14832 it is a conservative approximation of equivalence.
14836 returns a structural hash of `x`. The hash function is consistent
between executions of the same program, but may not be consistent
14838 between different programs.
14842 is always `true` in a MLton implementation, and is always `false` in a
14843 stub implementation.
14847 maximizes sharing in the heap for the object graph reachable from `x`.
14851 maximizes sharing in the heap by sharing space for equivalent
14852 immutable objects. A call to `shareAll` performs a major garbage
14853 collection, and takes time proportional to the size of the heap.
14857 returns the amount of heap space (in bytes) taken by the value of `x`,
14858 including all objects reachable from `x` by following pointers. It
14859 takes time proportional to the size of `x`. See below for an example.
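
The difference between pointer equality and structural equality can be
sketched as follows. The lists are illustrative; the comparison of
`xs` with `ys` by `MLton.eq` is deliberately omitted because its result
depends on how the compiler happens to represent the two values.

[source,sml]
----
val xs = [1, 2, 3]
val ys = List.map (fn x => x) xs   (* same contents, built separately *)

(* A value is always pointer-equal to itself. *)
val samePointer = MLton.eq (xs, xs)

(* Structural equality compares contents. *)
val sameStructure = MLton.equal (xs, ys)

(* Structural hashes are expected to agree for structurally equal values. *)
val sameHash = MLton.hash xs = MLton.hash ys

val () = print ("size of xs: " ^ Int.toString (MLton.size xs) ^ " bytes\n")
----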
14862 == <!Anchor(size)>Example of `MLton.size` ==
14864 This example, `size.sml`, demonstrates the application of `MLton.size`
14865 to many different kinds of objects.
14868 sys::[./bin/InclGitFile.py mlton master doc/examples/size/size.sml]
14871 Compile and run as usual.
14875 The size of an int list of length 4 is 48 bytes.
14876 The size of a string of length 10 is 24 bytes.
14877 The size of an int array of length 10 is 52 bytes.
14878 The size of a double array of length 10 is 92 bytes.
14879 The size of an array of length 10 of 2-ples of ints is 92 bytes.
14880 The size of a useless function is 0 bytes.
14881 The size of a continuation option ref is 4544 bytes.
14883 The size of a continuation option ref is 8 bytes.
14886 Note that sizes are dependent upon the target platform and compiler
14891 :mlton-guide-page: MLtonSyslog
14898 signature MLTON_SYSLOG =
14902 val CONS : openflag
14903 val NDELAY : openflag
14904 val NOWAIT : openflag
14905 val ODELAY : openflag
14906 val PERROR : openflag
14911 val AUTHPRIV : facility
14912 val CRON : facility
14913 val DAEMON : facility
14914 val KERN : facility
14915 val LOCAL0 : facility
14916 val LOCAL1 : facility
14917 val LOCAL2 : facility
14918 val LOCAL3 : facility
14919 val LOCAL4 : facility
14920 val LOCAL5 : facility
14921 val LOCAL6 : facility
14922 val LOCAL7 : facility
14924 val MAIL : facility
14925 val NEWS : facility
14926 val SYSLOG : facility
14927 val USER : facility
14928 val UUCP : facility
14932 val EMERG : loglevel
14933 val ALERT : loglevel
14934 val CRIT : loglevel
14936 val WARNING : loglevel
14937 val NOTICE : loglevel
14938 val INFO : loglevel
14939 val DEBUG : loglevel
14941 val closelog: unit -> unit
14942 val log: loglevel * string -> unit
14943 val openlog: string * openflag list * facility -> unit
14947 `MLton.Syslog` is a complete interface to the system logging
14948 facilities. See `man 3 syslog` for more details.
14952 closes the connection to the system logger.
14956 logs message `s` at a loglevel `l`.
14958 * `openlog (name, flags, facility)`
14960 opens a connection to the system logger. `name` will be prefixed to
14961 each message, and is typically set to the program name.
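
A typical session opens the log once, writes messages, and closes the
connection at exit. In this sketch the program name, the messages, and
the choice of `NDELAY` and `USER` are illustrative assumptions.

[source,sml]
----
structure Syslog = MLton.Syslog

(* Log two messages under the given program name. *)
fun logEvents (name: string): unit =
   (Syslog.openlog (name, [Syslog.NDELAY], Syslog.USER)
    ; Syslog.log (Syslog.INFO, "service started")
    ; Syslog.log (Syslog.WARNING, "disk space low")
    ; Syslog.closelog ())

val () = logEvents "example"
----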
14965 :mlton-guide-page: MLtonTextIO
14972 signature MLTON_TEXT_IO = MLTON_IO
14979 :mlton-guide-page: MLtonThread
14986 signature MLTON_THREAD =
14988 structure AtomicState:
14990 datatype t = NonAtomic | Atomic of int
14993 val atomically: (unit -> 'a) -> 'a
14994 val atomicBegin: unit -> unit
14995 val atomicEnd: unit -> unit
14996 val atomicState: unit -> AtomicState.t
14998 structure Runnable:
15005 val atomicSwitch: ('a t -> Runnable.t) -> 'a
15006 val new: ('a -> unit) -> 'a t
15007 val prepend: 'a t * ('b -> 'a) -> 'b t
15008 val prepare: 'a t * 'a -> Runnable.t
15009 val switch: ('a t -> Runnable.t) -> 'a
15013 `MLton.Thread` provides access to MLton's user-level thread
15014 implementation (i.e. not OS-level threads). Threads are lightweight
15015 data structures that represent a paused computation. Runnable threads
15016 are threads that will begin or continue computing when `switch`-ed to.
15017 `MLton.Thread` does not include a default scheduling mechanism, but it
15018 can be used to implement both preemptive and non-preemptive threads.
15020 * `type AtomicState.t`
15022 the type of atomic states.
15027 runs `f` in a critical section.
15031 begins a critical section.
15035 ends a critical section.
15039 returns the current atomic state.
15041 * `type Runnable.t`
15043 the type of threads that can be resumed.
15047 the type of threads that expect a value of type `'a`.
15051 like `switch`, but assumes an atomic calling context. Upon
15052 `switch`-ing back to the current thread, an implicit `atomicEnd` is
15057 creates a new thread that, when run, applies `f` to the value given to
the thread. `f` must terminate by `switch`-ing to another thread or
15059 exiting the process.
15063 creates a new thread (destroying `t` in the process) that first
15064 applies `f` to the value given to the thread and then continues with
15065 `t`. This is a constant time operation.
15069 prepares a new runnable thread (destroying `t` in the process) that
15070 will evaluate `t` on `v`.
15074 applies `f` to the current thread to get `rt`, and then start running
15075 thread `rt`. It is an error for `f` to perform another `switch`. `f`
15076 is guaranteed to run atomically.
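
The interplay of `new`, `prepare`, and `switch` can be sketched with a
single round trip: the main thread switches to a fresh thread, which
immediately switches back and hands over a value. The value `42` and
the names are chosen for this sketch only.

[source,sml]
----
structure T = MLton.Thread

val result =
   T.switch
   (fn (main: int T.t) =>
    (* `main` is the paused current thread; it expects an int. *)
    T.prepare
    (T.new (fn () =>
            (* The new thread ends by switching back to `main`. *)
            T.switch (fn _ => T.prepare (main, 42))),
     ()))

val () = print (Int.toString result ^ "\n")
----

The function passed to the outer `switch` only builds a runnable thread
and does not itself call `switch`; the inner `switch` runs later, inside
the new thread.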
15079 == Example of non-preemptive threads ==
15083 sys::[./bin/InclGitFile.py mlton master doc/examples/thread/non-preemptive-threads.sml]
15087 == Example of preemptive threads ==
15091 sys::[./bin/InclGitFile.py mlton master doc/examples/thread/preemptive-threads.sml]
15096 :mlton-guide-page: MLtonVector
15103 signature MLTON_VECTOR =
15105 val create: int -> {done: unit -> 'a vector,
15107 update: int * 'a -> unit}
15108 val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a vector * 'b
initiates the construction of a vector _v_ of length `n`, returning
15115 functions to manipulate the vector. The `done` function may be called
15116 to return the created vector; it is an error to call `done` before all
15117 entries have been initialized; it is an error to call `done` after
15118 having called `done`. The `sub` function may be called to return an
15119 initialized vector entry; it is not an error to call `sub` after
15120 having called `done`. The `update` function may be called to
15121 initialize a vector entry; it is an error to call `update` after
15122 having called `done`. One must initialize vector entries in order
15123 from lowest to highest; that is, before calling `update (i, x)`, one
15124 must have already called `update (j, x)` for all `j` in `[0, i)`. The
15125 `done`, `sub`, and `update` functions are all constant-time
15128 * `unfoldi (n, b, f)`
15130 constructs a vector _v_ of length `n`, whose elements __v~i~__ are
15131 determined by the equations __b~0~ = b__ and
15132 __(v~i~, b~i+1~) = f (i, b~i~)__.
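
Both functions can be sketched on small examples. The helper
`fromList` and the Fibonacci state are illustrations written for this
page, not part of `MLton.Vector`.

[source,sml]
----
(* unfoldi threads a state through the construction: here the state
 * is a pair of consecutive Fibonacci numbers. *)
val (fibs, _) =
   MLton.Vector.unfoldi (8, (0, 1), fn (_, (a, b)) => (a, (b, a + b)))

(* create exposes the same construction imperatively; entries must be
 * initialized from index 0 upwards before calling done. *)
fun fromList (xs: 'a list): 'a vector =
   let
      val {done, update, ...} = MLton.Vector.create (length xs)
      val _ = List.foldl (fn (x, i) => (update (i, x); i + 1)) 0 xs
   in
      done ()
   end

val abc = fromList ["a", "b", "c"]
----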
15136 :mlton-guide-page: MLtonWeak
15143 signature MLTON_WEAK =
15147 val get: 'a t -> 'a option
15148 val new: 'a -> 'a t
15152 A weak pointer is a pointer to an object that is nulled if the object
15153 becomes <:Reachability:unreachable> due to garbage collection. The
15154 weak pointer does not itself cause the object it points to be retained
15155 by the garbage collector -- only other strong pointers can do that.
15156 For objects that are not allocated in the heap, like integers, a weak
15157 pointer will always be nulled. So, if `w: int Weak.t`, then
15158 `Weak.get w = NONE`.
the type of weak pointers to objects of type `'a`.
15166 returns `NONE` if the object pointed to by `w` no longer exists.
15167 Otherwise, returns `SOME` of the object pointed to by `w`.
15171 returns a weak pointer to `x`.
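
The behaviour described above can be sketched as follows. The sketch
keeps a strong reference alive across the call to `get`; the values are
illustrative.

[source,sml]
----
structure Weak = MLton.Weak

val strong = ref 13
val w = Weak.new strong

(* While `strong` is still reachable, the weak pointer can be followed. *)
val seen =
   case Weak.get w of
      SOME r => !r
    | NONE => ~1

(* Use `strong` afterwards so that it stays reachable across the get. *)
val () = strong := !strong + 1

(* Integers are not heap allocated, so this weak pointer is nulled. *)
val unboxed = Weak.get (Weak.new 5)
----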
15175 :mlton-guide-page: MLtonWord
15182 signature MLTON_WORD =
15187 val rol: t * word -> t
15188 val ror: t * word -> t
15194 the type of words. For `MLton.LargeWord` this is `LargeWord.word`,
15195 for `MLton.Word` this is `Word.word`, for `MLton.Word8` this is
15196 `Word8.word`, for `MLton.Word16` this is `Word16.word`, for
15197 `MLton.Word32` this is `Word32.word`, for `MLton.Word64` this is
15206 rotates left (circular).
15210 rotates right (circular).
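
Unlike a shift, a rotation moves the bits that leave one end of the
word back in at the other end. A small sketch with illustrative
constants:

[source,sml]
----
(* The top bit wraps around to the bottom: 0x80000001 becomes 0x3. *)
val a = MLton.Word32.rol (0wx80000001, 0w1)

(* Rotating right by the same amount undoes the rotation. *)
val b = MLton.Word32.ror (a, 0w1)

(* Swapping the two nibbles of a byte: 0x81 becomes 0x18. *)
val c = MLton.Word8.rol (0wx81, 0w4)

val () = print (Word32.toString a ^ " " ^ Word8.toString c ^ "\n")
----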
15214 :mlton-guide-page: MLtonWorld
15221 signature MLTON_WORLD =
15223 datatype status = Clone | Original
15225 val load: string -> 'a
15226 val save: string -> status
15227 val saveThread: string * Thread.Runnable.t -> unit
15231 * `datatype status`
15233 specifies whether a world is original or restarted (a clone).
15237 loads the saved computation from file `f`.
15241 saves the entire state of the computation to the file `f`. The
15242 computation can then be restarted at a later time using `World.load`
15243 or the `load-world` <:RunTimeOptions:runtime option>. The call to
15244 `save` in the original computation returns `Original` and the call in
15245 the restarted world returns `Clone`.
15247 * `saveThread (f, rt)`
15249 saves the entire state of the computation to the file `f` that will
15250 resume with thread `rt` upon restart.
15256 Executables that save and load worlds are incompatible with
15257 http://en.wikipedia.org/wiki/Address_space_layout_randomization[address space layout randomization (ASLR)]
15258 of the executable (though, not of shared libraries). The state of a
15259 computation includes addresses into the code and data segments of the
15260 executable (e.g., static runtime-system data, return addresses); such
15261 addresses are invalid when interpreted by the executable loaded at a
15262 different base address.
15264 Executables that save and load worlds should be compiled with an
15265 option to suppress the generation of position-independent executables.
15267 * <:RunningOnDarwin:Darwin 11 (Mac OS X Lion) and higher> : `-link-opt -fno-PIE`
15272 Suppose that `save-world.sml` contains the following.
15275 sys::[./bin/InclGitFile.py mlton master doc/examples/save-world/save-world.sml]
15278 Then, if we compile `save-world.sml` and run it, the `Original`
15279 branch will execute, and a file named `world` will be created.
15281 % mlton save-world.sml
15286 We can then load `world` using the `load-world`
<:RunTimeOptions:runtime option>.
15289 % ./save-world @MLton load-world world --
15295 :mlton-guide-page: MLULex
15300 http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLULex] is a
15301 scanner generator for <:StandardML:Standard ML>.
15307 * <!Cite(OwensEtAl09)>
15311 :mlton-guide-page: MLYacc
15316 <:MLYacc:> is a parser generator for <:StandardML:Standard ML> modeled
15317 after the Yacc parser generator.
15319 A version of MLYacc, ported from the <:SMLNJ:SML/NJ> sources, is
15320 distributed with MLton.
15324 * <!Attachment(Documentation,mlyacc.pdf)>
15326 * <!Cite(TarditiAppel00)>
15331 :mlton-guide-page: Monomorphise
15336 <:Monomorphise:> is a translation pass from the <:XML:>
15337 <:IntermediateLanguage:> to the <:SXML:> <:IntermediateLanguage:>.
15341 Monomorphisation eliminates polymorphic values and datatype
15342 declarations by duplicating them for each type at which they are used.
15344 Consider the following <:XML:> program.
15347 datatype 'a t = T of 'a
15348 fun 'a f (x: 'a) = T x
15354 The result of monomorphising this program is the following <:SXML:> program:
15357 datatype t1 = T1 of int
15358 datatype t2 = T2 of int * int
15359 fun f1 (x: int) = T1 x
15360 fun f2 (x: int * int) = T2 x
15366 == Implementation ==
15368 * <!ViewGitFile(mlton,master,mlton/xml/monomorphise.sig)>
15369 * <!ViewGitFile(mlton,master,mlton/xml/monomorphise.fun)>
15371 == Details and Notes ==
15373 The monomorphiser works by making one pass over the entire program.
15374 On the way down, it creates a cache for each variable declared in a
15375 polymorphic declaration that maps a list of type arguments to a new
15376 variable name. At a variable reference, it consults the cache (based
15377 on the types the variable is applied to). If there is already an
15378 entry in the cache, it is used. If not, a new entry is created. On
15379 the way up, the monomorphiser duplicates a variable declaration for
15380 each entry in the cache.
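The consult-or-create step performed at each variable reference can be pictured with a toy model. The sketch below is not MLton's implementation: it represents type arguments as strings, and the names `cache`, `newCache`, and `instance` are hypothetical.

```sml
(* Toy model: one cache per polymorphic variable, mapping a list of
   type arguments (strings, for illustration only) to a fresh name. *)
type cache = {base : string, entries : (string list * string) list ref}

fun newCache (base : string) : cache = {base = base, entries = ref []}

(* Return the monomorphic name for this instantiation, creating a
   new cache entry the first time these type arguments are seen. *)
fun instance ({base, entries} : cache) (tyArgs : string list) : string =
   case List.find (fn (args, _) => args = tyArgs) (!entries) of
      SOME (_, name) => name
    | NONE =>
         let
            val name = base ^ Int.toString (length (!entries) + 1)
         in
            entries := (tyArgs, name) :: !entries
            ; name
         end

val f = newCache "f"
val a = instance f ["int"]          (* new entry: "f1" *)
val b = instance f ["int * int"]    (* new entry: "f2" *)
val c = instance f ["int"]          (* cache hit: "f1" again *)
```

After the whole program has been visited, each entry in such a cache corresponds to one duplicated declaration.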
15382 As with variables, the monomorphiser records all of the types at which
15383 constructors are used. After the entire program is processed, the
15384 monomorphiser duplicates each datatype declaration and its associated
15387 The monomorphiser duplicates all of the functions declared in a
15388 `fun` declaration as a unit. Consider the following program
15391 fun 'a f (x: 'a) = g x
15392 and g (y: 'a) = f y
15398 and its monomorphisation
15402 fun f1 (x: int) = g1 x
15403 and g1 (y: int) = f1 y
15404 fun f2 (x : int * int) = g2 x
15405 and g2 (y : int * int) = f2 y
15411 == Pathological datatype declarations ==
15413 SML allows a pathological polymorphic datatype declaration in which
15414 recursive uses of the defined type constructor are applied to
15415 different type arguments than the definition. This has been
15416 disallowed by others on type theoretic grounds. A canonical example
15420 datatype 'a t = A of 'a | B of ('a * 'a) t
15421 val z : int t = B (B (A ((1, 2), (3, 4))))
15424 The presence of the recursion in the datatype declaration might appear
15425 to cause the need for the monomorphiser to create an infinite number
15426 of types. However, due to the absence of polymorphic recursion in
15427 SML, there are in fact only a finite number of instances of such types
15428 in any given program. The monomorphiser translates the above program
15429 to the following one.
15432 datatype t1 = B1 of t2
15433 datatype t2 = B2 of t3
15434 datatype t3 = A3 of (int * int) * (int * int)
15435 val z : t1 = B1 (B2 (A3 ((1, 2), (3, 4))))
15438 It is crucial that the monomorphiser be allowed to drop unused
15439 constructors from datatype declarations in order for the translation
15444 :mlton-guide-page: MoscowML
15449 http://mosml.org[Moscow ML] is a
15450 <:StandardMLImplementations:Standard ML implementation>. It is a
15451 byte-code compiler, so it compiles code quickly, but the code runs
15452 slowly. See <:Performance:>.
15456 :mlton-guide-page: Multi
15461 <:Multi:> is an analysis pass for the <:SSA:>
15462 <:IntermediateLanguage:>, invoked from <:ConstantPropagation:> and
15467 This pass analyzes the control flow of an <:SSA:> program to determine
15468 which <:SSA:> functions and blocks might be executed more than once or
15469 by more than one thread. It also determines when a program uses
15470 threads and when functions and blocks directly or indirectly invoke
15471 `Thread_copyCurrent`.
15473 == Implementation ==
15475 * <!ViewGitFile(mlton,master,mlton/ssa/multi.sig)>
15476 * <!ViewGitFile(mlton,master,mlton/ssa/multi.fun)>
15478 == Details and Notes ==
15484 :mlton-guide-page: Mutable
15489 Mutable is an adjective meaning "can be modified". In
15490 <:StandardML:Standard ML>, ref cells and arrays are mutable, while all
15491 other values are <:Immutable:immutable>.
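A minimal sketch of the distinction, using only Basis Library functions: a ref cell and an array are updated in place, while "updating" a vector produces a new vector and leaves the original untouched.

```sml
val r = ref 0
val () = r := !r + 1                (* r now holds 1 *)

val a = Array.fromList [1, 2, 3]
val () = Array.update (a, 0, 10)    (* a itself now starts with 10 *)

val v = Vector.fromList [1, 2, 3]
val v' = Vector.update (v, 0, 10)   (* a new vector; v is unchanged *)
```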
15495 :mlton-guide-page: NeedsReview
15500 This page documents some patches and bug fixes that need additional review by experienced developers:
15502 * Bug in transparent signature match:
15503 ** What is an 'original' interface and why does the equivalence of original interfaces imply the equivalence of the actual interfaces?
15504 ** http://www.mlton.org/pipermail/mlton/2007-September/029991.html
15505 ** http://www.mlton.org/pipermail/mlton/2007-September/029995.html
15506 ** SVN Revision: <!ViewSVNRev(6046)>
15508 * Bug in <:DeepFlatten:> pass:
15509 ** Should we allow argument to `Weak_new` to be flattened?
15510 ** SVN Revision: <!ViewSVNRev(6189)> (regression test demonstrating bug)
15511 ** SVN Revision: <!ViewSVNRev(6191)>
15515 :mlton-guide-page: NumericLiteral
15520 Numeric literals in <:StandardML:Standard ML> can be written in either
15521 decimal or hexadecimal notation. Sometimes it can be convenient to
15522 write numbers down in other bases. Fortunately, using <:Fold:>, it is
15523 possible to define a concise syntax for numeric literals that allows
15524 one to write numeric constants in any base and of various types
15525 (`int`, `IntInf.int`, `word`, and more).
15527 We will define constants `I`, `II`, `W`, and +`+ so
15533 denotes `123:int` in base 10, while
15538 denotes `19:IntInf.int` in base 8, and
15543 denotes `0w13: word`.
15551 fun make (op *, op +, i2x) iBase =
15553 val xBase = i2x iBase
15558 if 0 <= i andalso i < iBase then
15562 ["Num: ", Int.toString i,
15565 Int.toString iBase])),
15569 fun I ? = make (op *, op +, id) ?
15570 fun II ? = make (op *, op +, IntInf.fromInt) ?
15571 fun W ? = make (op *, op +, Word.fromInt) ?
15573 fun ` ? = Fold.step1 (fn (i, (x, step)) =>
15574 (step (i, x), step)) ?
15590 The idea is for the fold to start with zero and to construct the
15591 result one digit at a time, with each stepper multiplying the previous
15592 result by the base and adding the next digit. The code is abstracted
15593 in two different ways for extra generality. First, the `make`
15594 function abstracts over the various primitive operations (addition,
15595 multiplication, etc.) that are needed to construct a number. This
15596 allows the same code to be shared for constants `I`, `II`, `W` used to
15597 write down the various numeric types. It also allows users to add new
15598 constants for additional numeric types, by supplying the necessary
15601 Second, the step function, +`+, is abstracted over the actual
15602 construction operation, which is created by `make`, and passed along the
15603 fold. This allows the same constant, +`+, to be used for all
15604 numeric types. The alternative approach, having a different step
15605 function for each numeric type, would be more painful to use.
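Stripped of the `Fold` machinery, the digit accumulation is an ordinary left fold. The following sketch uses a plain list of digits and a hypothetical helper `fromDigits`, which is not part of the `Num` structure. It is restricted to `int` and shows only the arithmetic and the digit check.

```sml
(* Start with zero; for each digit, multiply the running result by
   the base and add the digit, rejecting digits outside the base. *)
fun fromDigits (base : int) (digits : int list) : int =
   List.foldl
      (fn (d, acc) =>
          if 0 <= d andalso d < base then
             acc * base + d
          else
             raise Fail (concat ["Num: ", Int.toString d,
                                 " is not a valid digit in base ",
                                 Int.toString base]))
      0
      digits

val x = fromDigits 10 [1, 2, 3]     (* 123 *)
val y = fromDigits 8 [2, 3]         (* 19 *)
val z = fromDigits 2 [1, 1, 0, 1]   (* 13 *)
```

The fold-based version has the same structure, but takes the digits as a variable number of curried arguments instead of a list.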
15607 On the surface, it appears that the code checks the digits dynamically
15608 to ensure they are valid for the base. However, MLton will simplify
15609 everything away at compile time, leaving just the final numeric
15614 :mlton-guide-page: ObjectOrientedProgramming
15615 [[ObjectOrientedProgramming]]
15616 ObjectOrientedProgramming
15617 =========================
15619 <:StandardML:Standard ML> does not have explicit support for
15620 object-oriented programming. Here are some papers that show how to
15621 express certain object-oriented concepts in SML.
15623 * <!Cite(Berthomieu00, OO Programming styles in ML)>
15625 * <!Cite(ThorupTofte94, Object-oriented programming and Standard ML)>
15627 * <!Cite(LarsenNiss04, mGTK: An SML binding of Gtk+)>
15629 * <!Cite(FluetPucella06, Phantom Types and Subtyping)>
15631 The question of OO programming in SML comes up every now and then.
15632 The following discusses a simple object-oriented (OO) programming
15633 technique in Standard ML. The reader is assumed to be able to read
15639 SML doesn't provide subtyping, but it does provide parametric
15640 polymorphism, which can be used to encode some forms of subtyping.
15641 Most articles on OO programming in SML concentrate on such encoding
15642 techniques. While those techniques are interesting -- and it is
15643 recommended to read such articles -- and sometimes useful, it seems
15644 that basically all OO gurus agree that (deep) subtyping (or
15645 inheritance) hierarchies aren't as practical as they were thought to
15646 be in the early OO days. "Good", flexible, "OO" designs tend to have
15653 - - -+-------+-------+- - -
15659 and deep inheritance hierarchies
15673 tend to be signs of design mistakes. There are good underlying
15674 reasons for this, but a thorough discussion is not in the scope of
15675 this article. However, the point is that perhaps the encoding of
15676 subtyping is not as important as one might believe. In the following
15677 we ignore subtyping and rather concentrate on a very simple and basic
15678 dynamic dispatch technique.
15681 == Dynamic Dispatch Using a Recursive Record of Functions ==
15683 Quite simply, the basic idea is to implement a "virtual function
15684 table" using a record that is wrapped inside a (possibly recursive)
15685 datatype. Let's first take a look at a simple concrete example.
15687 Consider the following Java interface:
15690 public interface Counter {
15696 We can translate the `Counter` interface to SML as follows:
15700 datatype counter = Counter of {inc : unit -> unit, get : unit -> int}
15703 Each value of type `counter` can be thought of as an object that
15704 responds to two messages `inc` and `get`. To actually send messages
15705 to a counter, it is useful to define auxiliary functions
15710 fun mk m (Counter t) = m t ()
15717 that basically extract the "function table" `t` from a counter object
15718 and then select the specified method `m` from the table.
15720 Let's then implement a simple function that increments a counter until a
15721 given maximum is reached:
15725 fun incUpto counter max = while cGet counter < max do cInc counter
15728 You can easily verify that the above code compiles even without any
15729 concrete implementation of a counter, thus it is clear that it doesn't
15730 depend on a particular counter implementation.
15732 Let's then implement a couple of counters. First consider the
15733 following Java class implementing the `Counter` interface given earlier.
15736 public class BasicCounter implements Counter {
15738 public BasicCounter(int initialCnt) { this.cnt = initialCnt; }
15739 public void inc() { this.cnt += 1; }
15740 public int get() { return this.cnt; }
15744 We can translate the above to SML as follows:
15748 fun newBasicCounter initialCnt = let
15749 val cnt = ref initialCnt
15751 Counter {inc = fn () => cnt := !cnt + 1,
15752 get = fn () => !cnt}
15756 The SML function `newBasicCounter` can be described as a constructor
15757 function for counter objects of the `BasicCounter` "class". We can
15758 also have other counter implementations. Here is the constructor for
15759 a counter decorator that logs messages:
15763 fun newLoggedCounter counter =
15764 Counter {inc = fn () => (print "inc\n" ; cInc counter),
15765 get = fn () => (print "get\n" ; cGet counter)}
15768 The `incUpto` function works just as well with objects of either
15773 val aCounter = newBasicCounter 0
15774 val () = incUpto aCounter 5
15775 val () = print (Int.toString (cGet aCounter) ^"\n")
15777 val aCounter = newLoggedCounter (newBasicCounter 0)
15778 val () = incUpto aCounter 5
15779 val () = print (Int.toString (cGet aCounter) ^"\n")
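For reference, the pieces of the counter example can be assembled into one self-contained program. This is a sketch put together from the fragments in this section. The bindings `basic` and `logged` are illustrative names.

```sml
datatype counter = Counter of {inc : unit -> unit, get : unit -> int}

(* Message-sending helpers: extract the function table, select a
   method from it, and call the method. *)
local
   fun mk m (Counter t) = m t ()
in
   val cGet = mk #get
   val cInc = mk #inc
end

fun incUpto counter max = while cGet counter < max do cInc counter

(* Constructor for the BasicCounter "class"; cnt is private state. *)
fun newBasicCounter initialCnt =
   let
      val cnt = ref initialCnt
   in
      Counter {inc = fn () => cnt := !cnt + 1,
               get = fn () => !cnt}
   end

(* Decorator: logs each message, then forwards it. *)
fun newLoggedCounter counter =
   Counter {inc = fn () => (print "inc\n" ; cInc counter),
            get = fn () => (print "get\n" ; cGet counter)}

val basic = newBasicCounter 0
val () = incUpto basic 5
val logged = newLoggedCounter (newBasicCounter 3)
val () = incUpto logged 5
```

Both counters end at 5. `incUpto` never learns which implementation it was given.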
15782 In general, a dynamic dispatch interface is represented as a record
15783 type wrapped inside a datatype. Each field of the record corresponds
15784 to a public method or field of the object:
15788 datatype interface =
15789 Interface of {method : t1 -> t2,
15790 immutableField : t,
15791 mutableField : t ref}
15794 The reason for wrapping the record inside a datatype is that records,
15795 in SML, cannot be recursive. However, SML datatypes can be
15796 recursive. A record wrapped in a datatype can contain fields that
15797 contain the datatype. For example, an interface such as `Cloneable`
15801 datatype cloneable = Cloneable of {clone : unit -> cloneable}
15804 can be represented using recursive datatypes.
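As a concrete sketch of such a recursive interface, the following hypothetical `ccounter` type gives a counter a `clone` method that returns another object of the same interface type. The type and the constructor function `newCCounter` are illustrative names only.

```sml
datatype ccounter =
   CCounter of {inc : unit -> unit,
                get : unit -> int,
                clone : unit -> ccounter}

fun newCCounter initialCnt =
   let
      val cnt = ref initialCnt
   in
      CCounter {inc = fn () => cnt := !cnt + 1,
                get = fn () => !cnt,
                (* a clone starts from the current value but owns
                   its own private state *)
                clone = fn () => newCCounter (!cnt)}
   end

val CCounter original = newCCounter 1
val CCounter copy = #clone original ()
val () = #inc original ()   (* affects only the original *)
```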
15806 As in OO languages, interfaces are abstract and cannot be
15807 instantiated to produce objects. To be able to instantiate objects,
15808 the constructors of a concrete class are needed. In SML, we can
15809 implement constructors as simple functions from arbitrary arguments to
15810 values of the interface type. Such a constructor function can
15811 encapsulate arbitrary private state and functions using lexical
15812 closure. It is also easy to share implementations of methods between
15813 two or more constructors.
15815 While the `Counter` example is rather trivial, it should not be
15816 difficult to see that this technique does not require a huge
15817 amount of extra verbiage and is more than usable in practice.
15820 == SML Modules and Dynamic Dispatch ==
15822 One might wonder about how SML modules and the dynamic dispatch
15823 technique work together. Let's investigate! Let's use a simple
15824 dispenser framework as a concrete example. (Note that this isn't
15825 intended to be an introduction to the SML module system.)
15827 === Programming with SML Modules ===
15829 Using SML signatures we can specify abstract data types (ADTs) such as
15830 dispensers. Here is a signature for an "abstract" functional (as
15831 opposed to imperative) dispenser:
15835 signature ABSTRACT_DISPENSER = sig
15837 val isEmpty : 'a t -> bool
15838 val push : 'a * 'a t -> 'a t
15839 val pop : 'a t -> ('a * 'a t) option
15843 The term "abstract" in the name of the signature refers to the fact that
15844 the signature gives no way to instantiate a dispenser. It has nothing to
15845 do with the concept of abstract data types.
15847 Using SML functors we can write "generic" algorithms that manipulate
15848 dispensers of an unknown type. Here are a couple of very simple
15853 functor DispenserAlgs (D : ABSTRACT_DISPENSER) = struct
15856 fun pushAll (xs, d) = foldl push d xs
15859 fun lp (xs, NONE) = rev xs
15860 | lp (xs, SOME (x, d)) = lp (x::xs, pop d)
15865 fun cp (from, to) = pushAll (popAll from, to)
15869 As one can easily verify, the above compiles even without any concrete
15870 dispenser structure. Functors essentially provide a form of static
15871 dispatch that one can use to break compile-time dependencies.
15873 We can also give a signature for a concrete dispenser
15877 signature DISPENSER = sig
15878 include ABSTRACT_DISPENSER
15883 and write any number of concrete structures implementing the signature.
15884 For example, we could implement stacks
15888 structure Stack :> DISPENSER = struct
15889 type 'a t = 'a list
15893 val pop = List.getItem
15901 structure Queue :> DISPENSER = struct
15902 datatype 'a t = T of 'a list * 'a list
15903 val empty = T ([], [])
15904 val isEmpty = fn T ([], _) => true | _ => false
15905 val normalize = fn ([], ys) => (rev ys, []) | q => q
15906 fun push (y, T (xs, ys)) = T (normalize (xs, y::ys))
15907 val pop = fn (T (x::xs, ys)) => SOME (x, T (normalize (xs, ys))) | _ => NONE
15911 One can now write code that uses either the `Stack` or the `Queue`
15912 dispenser. One can also instantiate the previously defined functor to
15913 create functions for manipulating dispensers of a type:
15917 structure S = DispenserAlgs (Stack)
15918 val [4,3,2,1] = S.popAll (S.pushAll ([1,2,3,4], Stack.empty))
15920 structure Q = DispenserAlgs (Queue)
15921 val [1,2,3,4] = Q.popAll (Q.pushAll ([1,2,3,4], Queue.empty))
15924 There is no dynamic dispatch involved at the module level in SML. An
15925 attempt to do dynamic dispatch
15929 val q = Q.push (1, Stack.empty)
15932 will give a type error.
15934 === Combining SML Modules and Dynamic Dispatch ===
15936 Let's then combine SML modules and the dynamic dispatch technique
15937 introduced in this article. First we define an interface for
15942 structure Dispenser = struct
15944 I of {isEmpty : unit -> bool,
15946 pop : unit -> ('a * 'a t) option}
15948 fun O m (I t) = m t
15950 fun isEmpty t = O#isEmpty t ()
15951 fun push (v, t) = O#push t v
15952 fun pop t = O#pop t ()
15956 The `Dispenser` module, which we can think of as an interface for
15957 dispensers, implements the `ABSTRACT_DISPENSER` signature using
15958 the dynamic dispatch technique, but we leave the signature ascription
15961 Then we define a `DispenserClass` functor that makes a "class" out of
15962 a given dispenser module:
15966 functor DispenserClass (D : DISPENSER) : DISPENSER = struct
15970 I {isEmpty = fn () => D.isEmpty d,
15971 push = fn x => make (D.push (x, d)),
15975 | SOME (x, d) => SOME (x, make d)}
15978 I {isEmpty = fn () => true,
15979 push = fn x => make (D.push (x, D.empty)),
15980 pop = fn () => NONE}
15984 Finally we seal the `Dispenser` module:
15988 structure Dispenser : ABSTRACT_DISPENSER = Dispenser
15991 This isn't necessary for type safety, because the unsealed `Dispenser`
15992 module does not allow one to break encapsulation, but it makes sure that
15993 only the `DispenserClass` functor can create dispenser classes
15994 (because the constructor `Dispenser.I` is no longer accessible).
15996 Using the `DispenserClass` functor we can turn any concrete dispenser
15997 module into a dispenser class:
16001 structure StackClass = DispenserClass (Stack)
16002 structure QueueClass = DispenserClass (Queue)
16005 Each dispenser class implements the same dynamic dispatch interface
16006 and the `ABSTRACT_DISPENSER` signature.
16008 Because the dynamic dispatch `Dispenser` module implements the
16009 `ABSTRACT_DISPENSER` signature, we can use it to instantiate the
16010 `DispenserAlgs` functor:
16014 structure D = DispenserAlgs (Dispenser)
16017 The resulting `D` module, like the `Dispenser` module, works with
16018 any dispenser class and uses dynamic dispatch:
16022 val [4, 3, 2, 1] = D.popAll (D.pushAll ([1, 2, 3, 4], StackClass.empty))
16023 val [1, 2, 3, 4] = D.popAll (D.pushAll ([1, 2, 3, 4], QueueClass.empty))
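The whole dispenser development fits in one self-contained program. The following is a sketch assembled from the fragments of this section, in the order in which they must be declared. In particular, `DispenserClass` has to be defined before `Dispenser` is rebound with the `ABSTRACT_DISPENSER` signature, because it needs the constructor `I`. The bindings `viaStack` and `viaQueue` are illustrative names.

```sml
signature ABSTRACT_DISPENSER = sig
   type 'a t
   val isEmpty : 'a t -> bool
   val push : 'a * 'a t -> 'a t
   val pop : 'a t -> ('a * 'a t) option
end

signature DISPENSER = sig
   include ABSTRACT_DISPENSER
   val empty : 'a t
end

functor DispenserAlgs (D : ABSTRACT_DISPENSER) = struct
   open D
   fun pushAll (xs, d) = foldl push d xs
   fun popAll d =
      let
         fun lp (xs, NONE) = rev xs
           | lp (xs, SOME (x, d)) = lp (x :: xs, pop d)
      in
         lp ([], pop d)
      end
   fun cp (from, to) = pushAll (popAll from, to)
end

structure Stack :> DISPENSER = struct
   type 'a t = 'a list
   val empty = []
   val isEmpty = null
   val push = op ::
   val pop = List.getItem
end

structure Queue :> DISPENSER = struct
   datatype 'a t = T of 'a list * 'a list
   val empty = T ([], [])
   val isEmpty = fn T ([], _) => true | _ => false
   val normalize = fn ([], ys) => (rev ys, []) | q => q
   fun push (y, T (xs, ys)) = T (normalize (xs, y :: ys))
   val pop = fn (T (x :: xs, ys)) => SOME (x, T (normalize (xs, ys))) | _ => NONE
end

(* The dynamic dispatch interface: a record of closures. *)
structure Dispenser = struct
   datatype 'a t =
      I of {isEmpty : unit -> bool,
            push : 'a -> 'a t,
            pop : unit -> ('a * 'a t) option}
   fun O m (I t) = m t
   fun isEmpty t = O #isEmpty t ()
   fun push (v, t) = O #push t v
   fun pop t = O #pop t ()
end

(* Wrap a concrete dispenser module as a "class" of such objects. *)
functor DispenserClass (D : DISPENSER) : DISPENSER = struct
   open Dispenser
   fun make d =
      I {isEmpty = fn () => D.isEmpty d,
         push = fn x => make (D.push (x, d)),
         pop = fn () =>
                  case D.pop d of
                     NONE => NONE
                   | SOME (x, d) => SOME (x, make d)}
   val empty =
      I {isEmpty = fn () => true,
         push = fn x => make (D.push (x, D.empty)),
         pop = fn () => NONE}
end

(* Hide the constructor I from here on. *)
structure Dispenser : ABSTRACT_DISPENSER = Dispenser

structure StackClass = DispenserClass (Stack)
structure QueueClass = DispenserClass (Queue)
structure D = DispenserAlgs (Dispenser)

(* The same D functions serve both classes; the order of the
   popped elements is decided at run time by the object. *)
val viaStack = D.popAll (D.pushAll ([1, 2, 3, 4], StackClass.empty))
val viaQueue = D.popAll (D.pushAll ([1, 2, 3, 4], QueueClass.empty))
```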
16028 :mlton-guide-page: OCaml
16033 http://caml.inria.fr/[OCaml] is a variant of <:ML:> and is similar to
16034 <:StandardML:Standard ML>.
16036 == OCaml and SML ==
16038 Here's a comparison of some aspects of the OCaml and SML languages.
16040 * Standard ML has a formal <:DefinitionOfStandardML:Definition>, while
16041 OCaml is specified by its lone implementation and informal
16044 * Standard ML has a number of <:StandardMLImplementations:compilers>,
16045 while OCaml has only one.
16047 * OCaml has built-in support for object-oriented programming, while
16048 Standard ML does not (however, see <:ObjectOrientedProgramming:>).
16050 * Andreas Rossberg has a
16051 http://www.mpi-sws.org/%7Erossberg/sml-vs-ocaml.html[side-by-side
16052 comparison] of the syntax of SML and OCaml.
16054 * Adam Chlipala has a
16055 http://adam.chlipala.net/mlcomp[point-by-point comparison] of OCaml
16058 == OCaml and MLton ==
16060 Here's a comparison of some aspects of OCaml and MLton.
16064 ** Both OCaml and MLton have excellent performance.
16066 ** MLton performs extensive <:WholeProgramOptimization:>, which can
16067 provide substantial improvements in large, modular programs.
16069 ** MLton uses native types, like 32-bit integers, without any penalty
16070 due to tagging or boxing. OCaml uses 31-bit integers with a penalty
16071 due to tagging, and 32-bit integers with a penalty due to boxing.
16073 ** MLton uses native types, like 64-bit floats, without any penalty
16074 due to boxing. OCaml, in some situations, boxes 64-bit floats.
16076 ** MLton represents arrays of all types unboxed. In OCaml, only
16077 arrays of 64-bit floats are unboxed, and then only when it is
16078 syntactically apparent.
16080 ** MLton represents records compactly by reordering and packing the
16083 ** In MLton, polymorphic and monomorphic code have the same
16084 performance. In OCaml, polymorphism can introduce a performance
16087 ** In MLton, module boundaries have no impact on performance. In
16088 OCaml, moving code between modules can cause a performance penalty.
16090 ** MLton's <:ForeignFunctionInterface:> is simpler than OCaml's.
16094 ** OCaml has a debugger, while MLton does not.
16096 ** OCaml supports separate compilation, while MLton does not.
16098 ** OCaml compiles faster than MLton.
16100 ** MLton supports profiling of both time and allocation.
16104 ** OCaml has more available libraries.
16108 ** OCaml has a larger community than MLton.
16110 ** MLton has a very responsive
16111 http://www.mlton.org/mailman/listinfo/mlton[developer list].
16115 :mlton-guide-page: OpenGL
16120 There are at least two interfaces to OpenGL for MLton/SML, both of
16121 which should be considered alpha quality.
16123 * <:MikeThomas:> built a low-level interface, directly translating
16124 many of the functions, covering GL, GLU, and GLUT. This is available
16125 in the MLton <:Sources:>:
16126 <!ViewGitDir(mltonlib,master,org/mlton/mike/opengl)>. The code
16127 contains a number of small, standard OpenGL examples translated to
16130 * <:ChrisClearwater:> has written at least an interface to GL, and
16132 ** http://mlton.org/pipermail/mlton/2005-January/026669.html
16134 <:Contact:> us for more information or an update on the status of
16139 :mlton-guide-page: OperatorPrecedence
16140 [[OperatorPrecedence]]
16144 <:StandardML:Standard ML> has a built-in notion of precedence for
16145 certain symbols. Every program that includes the
16146 <:BasisLibrary:Basis Library> automatically gets the following infix
16147 declarations. A higher number indicates higher precedence.
16151 infix 7 * / mod div
16152 infix 6 + - ^
16153 infixr 5 :: @
16154 infix 4 = <> > >= < <=
16155 infix 3 := o
16156 infix 0 before
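A few bindings illustrate how the Basis Library's precedence levels and associativity determine parsing. The names `a`, `b`, `c`, `d`, and `r` are arbitrary, and each comment states the levels involved as declared by the Basis Library.

```sml
val a = 1 + 2 * 3              (* * (7) over + (6): 1 + (2 * 3) *)
val b = 1 :: 2 :: [3] @ [4]    (* infixr 5: 1 :: (2 :: ([3] @ [4])) *)
val c = "a" ^ "b" = "ab"       (* ^ (6) over = (4): ("a" ^ "b") = "ab" *)
val d = 10 - 3 - 2             (* infix is left associative: (10 - 3) - 2 *)
val r = ref 0
val () = r := !r + 1           (* + (6) over := (3): r := ((!r) + 1) *)
```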
16161 :mlton-guide-page: OptionalArguments
16162 [[OptionalArguments]]
16166 <:StandardML:Standard ML> does not have built-in support for optional
16167 arguments. Nevertheless, using <:Fold:>, it is easy to define
16168 functions that take optional arguments.
16170 For example, suppose that we have the following definition of a
16176 concat [Int.toString i, ", ", Real.toString r, ", ", s]
16179 Using the `OptionalArg` structure described below, we can define a
16180 function `f'`, an optionalized version of `f`, that takes 0, 1, 2, or
16181 3 arguments. Embedded within `f'` will be default values for `i`,
16182 `r`, and `s`. If `f'` gets no arguments, then all the defaults are
16183 used. If `f'` gets one argument, then that will be used for `i`. Two
16184 arguments will be used for `i` and `r` respectively. Three arguments
16185 will override all default values. Calls to `f'` will look like the
16193 f' `2 `3.0 `"four" $
16196 The optional argument indicator, +`+, is not special syntax ---
16197 it is a normal SML value, defined in the `OptionalArg` structure
16200 Here is the definition of `f'` using the `OptionalArg` structure, in
16201 particular, `OptionalArg.make` and `OptionalArg.D`.
16207 let open OptionalArg in
16208 make (D 1) (D 2.0) (D "three") $
16209 end (fn i & r & s => f (i, r, s))
16213 The definition of `f'` is eta expanded as with all uses of fold. A
16214 call to `OptionalArg.make` is supplied with a variable number of
16215 defaults (in this case, three), the end-of-arguments terminator, `$`,
16216 and the function to run, taking its arguments as an n-ary
16217 <:ProductType:product>. In this case, the function simply converts
16218 the product to an ordinary tuple and calls `f`. Often, the function
16219 body will simply be written directly.
16221 In general, the definition of an optional-argument function looks like
16228 let open OptionalArg in
16229 make (D <default1>) (D <default2>) ... (D <defaultn>) $
16230 end (fn x1 & x2 & ... & xn =>
16231 <function code goes here>)
16235 Here is the definition of `OptionalArg`.
16239 structure OptionalArg =
16244 ((id, fn (f, x) => f x),
16245 fn (d, r) => fn func =>
16246 Fold.fold ((id, d ()), fn (f, d) =>
16248 val d & () = r (id, f d)
16254 fun D d = Fold.step0 (fn (f, r) =>
16255 (fn ds => f (d & ds),
16256 fn (f, a & b) => r (fn x => f a & x, b)))
16260 Fold.step1 (fn (x, (f, _ & d)) => (fn d => f (x & d), d))
16265 `OptionalArg.make` uses a nested fold. The first `fold` accumulates
16266 the default values in a product, associated to the right, and a
16267 reversal function that converts a product (of the same arity as the
16268 number of defaults) from right associativity to left associativity.
16269 The accumulated defaults are used by the second fold, which recurs
16270 over the product, replacing the appropriate component as it encounters
16271 optional arguments. The second fold also constructs a "fill"
16272 function, `f`, that is used to reconstruct the product once the
16273 end-of-arguments is reached. Finally, the finisher reconstructs the
16274 product and uses the reversal function to convert the product from
16275 right associative to left associative, at which point it is passed to
16276 the user-supplied function.
16278 Much of the complexity comes from the fact that while recurring over a
16279 product from left to right, one wants it to be right-associative,
16287 but the user function in the end wants the product to be left
16288 associative, so that the product argument pattern can be written
16289 without parentheses (since `&` is left associative).
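The associativity point can be checked in isolation. The sketch below declares a product type with an infix constructor `&`, which is assumed here to match the <:ProductType:> definition, and shows that an unparenthesized pattern matches the left-nested shape. The names `p` and `viaProduct` are illustrative.

```sml
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Left associative: this is (1 & 2.0) & "three". *)
val p = 1 & 2.0 & "three"

fun f (i, r, s) = concat [Int.toString i, ", ", Real.toString r, ", ", s]

(* The pattern needs no parentheses because it nests the same way. *)
fun viaProduct (i & r & s) = f (i, r, s)

(* Writing the nesting explicitly matches the same value. *)
val (i & r) & s = p
```

A right-nested product such as `1 & (2.0 & "three")` has a different type and would not match the pattern in `viaProduct`.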
16292 == Labelled optional arguments ==
16294 In addition to the positional optional arguments described above, it
16295 is sometimes useful to have labelled optional arguments. These allow
16296 one to define a function, `f`, with defaults, say `a` and `b`. Then,
16297 a caller of `f` can supply values for `a` and `b` by name. If no
16298 value is supplied then the default is used.
16300 Labelled optional arguments are a simple extension of
16301 <:FunctionalRecordUpdate:> using post composition. Suppose, for
16302 example, that one wants a function `f` with labelled optional
16303 arguments `a` and `b` with default values `0` and `0.0` respectively.
16304 If one has a functional-record-update function `updateAB` for records
16305 with `a` and `b` fields, then one can define `f` in the following way.
16312 (updateAB {a = 0, b = 0.0},
16313 fn {a, b} => print (concat [Int.toString a, " ",
16314 Real.toString b, "\n"]))
16318 The idea is that `f` is the post composition (using `Fold.post`) of
16319 the actual code for the function with a functional-record updater that
16320 starts with the defaults.
16322 Here are some example calls to `f`.
16326 val () = f (U#a 13) $
16327 val () = f (U#a 13) (U#b 17.5) $
16328 val () = f (U#b 17.5) (U#a 13) $
16331 Notice that a caller can supply neither of the arguments, either of
16332 the arguments, or both of the arguments, and in either order. All
16333 that matters is that the arguments be labelled correctly (and of the
16334 right type, of course).
16336 Here is another example.
16343 (updateBCD {b = 0, c = 0.0, d = "<>"},
16345 print (concat [Int.toString b, " ",
16346 Real.toString c, " ",
16351 Here are some example calls.
16356 val () = f (U#d "goodbye") $
16357 val () = f (U#d "hello") (U#b 17) (U#c 19.3) $
16362 :mlton-guide-page: Overloading
16367 In <:StandardML:Standard ML>, constants (like `13`, `0w13`, `13.0`)
16368 are overloaded, meaning that they can denote a constant of the
16369 appropriate type as determined by context. SML defines the
16370 overloading classes _Int_, _Real_, and _Word_, which denote the sets
16371 of types that integer, real, and word constants may take on. In
16372 MLton, these are defined as follows.
16376 | _Int_ | `Int2.int`, `Int3.int`, ... `Int32.int`, `Int64.int`, `Int.int`, `IntInf.int`, `LargeInt.int`, `FixedInt.int`, `Position.int`
16377 | _Real_ | `Real32.real`, `Real64.real`, `Real.real`, `LargeReal.real`
16378 | _Word_ | `Word2.word`, `Word3.word`, ... `Word32.word`, `Word64.word`, `Word.word`, `LargeWord.word`, `SysWord.word`
16381 The <:DefinitionOfStandardML:Definition> allows flexibility in how
16382 much context is used to resolve overloading. It says that the context
16383 is _no larger than the smallest enclosing structure-level
16384 declaration_, but that _an implementation may require that a smaller
16385 context determines the type_. MLton uses the largest possible context
16386 allowed by SML in resolving overloading. If the type of a constant is
16387 not determined by context, then it takes on a default type. In MLton,
16388 these are defined as follows.
16392 | _Int_ | `Int.int`
16393 | _Real_ | `Real.real`
16394 | _Word_ | `Word.word`
16397 Other implementations may use a smaller context or different default
16402 * http://www.standardml.org/Basis/top-level-chapter.html[discussion of overloading in the Basis Library]
16406 * The following program is rejected.
16419 The smallest enclosing structure declaration for `0w0` is
16420 `val x = 0w0`. Hence, `0w0` receives the default type for words,
16421 which is `Word.word`.
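When a constant must not take the default type, an explicit annotation supplies the context inside the declaration itself. The structure below is an illustrative sketch. It assumes that the implementation provides `Word8`, an optional Basis Library structure that MLton supports.

```sml
structure S :
   sig
      val x : Word8.word
   end =
   struct
      (* The annotation resolves 0w0 within this declaration, so the
         default Word.word is never chosen. *)
      val x : Word8.word = 0w0
   end

(* Context from a function parameter works the same way. *)
fun next (w : Word8.word) = w + 0w1
```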
16425 :mlton-guide-page: PackedRepresentation
16426 [[PackedRepresentation]]
16427 PackedRepresentation
16428 ====================
16430 <:PackedRepresentation:> is an analysis pass for the <:SSA2:>
16431 <:IntermediateLanguage:>, invoked from <:ToRSSA:>.
16435 This pass analyzes an <:SSA2:> program to compute a packed
16436 representation for each object.
16438 == Implementation ==
16440 * <!ViewGitFile(mlton,master,mlton/backend/representation.sig)>
16441 * <!ViewGitFile(mlton,master,mlton/backend/packed-representation.fun)>
16443 == Details and Notes ==
16445 The pass has a special case to make sure that `true` is represented as `1` and
16446 `false` is represented as `0`.
16450 :mlton-guide-page: ParallelMove
16455 <:ParallelMove:> is a rewrite pass, agnostic in the
16456 <:IntermediateLanguage:> which it produces.
16460 This function computes a sequence of individual moves to effect a
16461 parallel move (with possibly overlapping froms and tos).
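The problem can be illustrated with a small sketch. This is not MLton's implementation: registers are modelled as strings, and the names `move`, `sequentialise`, and `temp` are hypothetical. The sketch assumes that all destinations are distinct and that the scratch register `temp` appears in no move.

```sml
type move = {src : string, dst : string}

(* Order the moves so that no register is overwritten while a
   pending move still needs to read it. *)
fun sequentialise (temp : string) (moves : move list) : move list =
   let
      fun isSource (r, ms : move list) =
         List.exists (fn ({src, ...} : move) => src = r) ms
      fun loop ([] : move list, acc : move list) = rev acc
        | loop (pending, acc) =
             case List.partition
                     (fn ({dst, ...} : move) => not (isSource (dst, pending)))
                     pending of
                ([], {src, dst} :: rest) =>
                   (* Every destination is still read by some pending
                      move, so only cycles remain.  Save one source in
                      temp and redirect its readers to temp. *)
                   let
                      fun redirect ({src = s, dst = d} : move) =
                         {src = if s = src then temp else s, dst = d}
                   in
                      loop (map redirect ({src = src, dst = dst} :: rest),
                            {src = src, dst = temp} :: acc)
                   end
              | ([], []) => rev acc
              | (ready, blocked) =>
                   (* Destinations that nobody reads can be written now. *)
                   loop (blocked, List.revAppend (ready, acc))
   in
      (* Moves from a register to itself need no code. *)
      loop (List.filter (fn {src, dst} => src <> dst) moves, [])
   end

(* Swapping a and b: b is saved to tmp, a moves to b, tmp moves to a. *)
val swap = sequentialise "tmp" [{src = "b", dst = "a"}, {src = "a", dst = "b"}]
```

Each cycle costs one extra move through the scratch register. Moves that are not part of a cycle are emitted without any extra move.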
16463 == Implementation ==
16465 * <!ViewGitFile(mlton,master,mlton/backend/parallel-move.sig)>
16466 * <!ViewGitFile(mlton,master,mlton/backend/parallel-move.fun)>
16468 == Details and Notes ==
16474 :mlton-guide-page: Performance
16479 This page compares the performance of a number of SML compilers on a
16480 range of benchmarks.
16482 The following SML compiler versions are compared.
16484 * <:Home:MLton> 20171211 (git 79d4a623c)
16485 * <:MLKit:ML Kit> 4.3.12 (20171210)
16486 * <:MoscowML:Moscow ML> 2.10.1 ++ (git f529b33bb, 20170711)
16487 * <:PolyML:Poly/ML> 5.7.2 Testing (git 5.7.1-35-gcb73407a)
16488 * <:SMLNJ:SML/NJ> 110.81 (20170501)
16490 There are tables for <:#RunTime:run time>, <:#CodeSize:code size>, and
16491 <:#CompileTime:compile time>.
16496 All benchmarks were compiled and run on a 2.6 GHz Core i7-5600U with 16 GB of
16497 RAM. The benchmarks were compiled with the default settings for all
16498 the compilers, except for Moscow ML, which was passed the
16499 `-orthodox -standalone -toplevel` switches. The Poly/ML executables
16500 were produced using `polyc`.
16501 The SML/NJ executables were produced by wrapping the entire program in
16502 a `local` declaration whose body performs an `SMLofNJ.exportFn`.
16504 For more details, or if you want to run the benchmarks yourself,
16505 please see the <!ViewGitDir(mlton,master,benchmark)> directory of our
16508 All of the benchmarks are available for download from this page. Some
16509 of the benchmarks were obtained from the SML/NJ benchmark suite. Some
16510 of the benchmarks expect certain input files to exist in the
16511 <!ViewGitDir(mlton,master,benchmark/tests/DATA)> subdirectory.
16513 * <!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/hamlet-input.sml)>
16514 * <!RawGitFile(mlton,master,benchmark/tests/ray.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ray)>
16515 * <!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/chess.gml)>
16516 * <!RawGitFile(mlton,master,benchmark/tests/vliw.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ndotprod.s)>
16519 == <!Anchor(RunTime)>Run-time ratio ==
16521 The following table gives the ratio of the run time of each benchmark
16522 when compiled by another compiler to the run time when compiled by
16523 MLton. That is, the larger the number, the slower the generated code
16524 runs. A number larger than one indicates that the corresponding
16525 compiler produces code that runs more slowly than MLton. A * in an
16526 entry means the compiler failed to compile the benchmark or that the
16527 benchmark failed to run.
16529 [options="header",cols="<2,5*<1"]
16531 |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16532 |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|1.00|10.11|19.36|2.98|1.24
16533 |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|1.00|*|7.87|1.22|1.75
16534 |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|1.00|30.79|*|10.94|9.08
16535 |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|1.00|6.51|40.42|2.34|2.32
16536 |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|1.00|0.97|*|0.60|*
16537 |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|1.00|0.50|11.50|0.42|0.42
16538 |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|1.00|7.35|81.51|4.03|1.19
16539 |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|1.00|1.41|10.94|1.25|1.17
16540 |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|1.00|7.19|68.33|5.28|13.16
16541 |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1.00|4.97|22.85|1.58|*
16542 |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|1.00|4.99|57.84|3.34|4.67
16543 |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|1.00|*|18.43|3.18|3.06
16544 |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|1.00|2.76|7.94|3.19|*
16545 |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|1.00|1.80|20.19|0.89|1.50
16546 |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|1.00|5.10|11.06|1.15|1.27
16547 |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|1.00|3.50|25.52|1.33|1.28
16548 |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|1.00|29.40|183.02|7.41|15.19
16549 |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|1.00|95.18|*|32.61|47.47
16550 |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|1.00|1.42|*|0.74|3.24
16551 |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|1.00|1.83|8.45|0.84|*
16552 |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|1.00|4.03|12.42|1.70|2.25
16553 |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|1.00|3.73|57.44|2.05|3.22
16554 |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|1.00|3.96|*|1.73|1.20
16555 |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|1.00|6.26|30.85|7.82|5.99
16556 |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|1.00|9.37|44.78|2.18|2.15
16557 |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|1.00|*|*|2.79|3.59
16558 |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|1.00|5.68|165.56|3.92|37.52
16559 |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|1.00|12.05|25.08|8.73|1.75
16560 |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|1.00|*|*|2.11|3.33
16561 |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|1.00|2.95|24.03|3.67|1.93
16562 |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|1.00|*|*|1.04|*
16563 |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|1.00|1.88|28.01|0.70|2.67
16564 |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|1.00|1.58|23.57|0.90|1.04
16565 |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|1.00|1.69|15.90|1.57|2.01
16566 |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|1.00|*|*|*|2.07
16567 |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|1.00|2.19|66.76|3.27|1.48
16568 |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|1.00|*|19.43|1.08|1.03
16569 |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|1.00|13.85|*|1.80|12.48
16570 |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|1.00|*|*|*|13.92
16571 |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|1.00|7.88|68.85|9.39|68.80
16572 |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|1.00|2.46|15.39|1.43|1.55
16573 |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|1.00|6.00|*|29.25|9.54
16574 |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|1.00|80.43|*|19.45|8.71
16575 |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|1.00|4.62|35.56|1.68|9.97
16576 |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|1.00|*|*|*|1.60
16580 Note: for SML/NJ, the
16581 <!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>
16582 benchmark was killed after running for over 51,000 seconds.
16585 == <!Anchor(CodeSize)>Code size ==
16587 The following table gives the code size of each benchmark in bytes.
16588 The size for MLton and the ML Kit is the sum of text and data for the
16589 standalone executable as reported by `size`. The size for Moscow
16590 ML is the size in bytes of the executable `a.out`. The size for
16591 Poly/ML is the difference in size of the database before the session
16592 start and after the commit. The size for SML/NJ is the size of the
16593 heap file created by `exportFn` and does not include the size of
16594 the SML/NJ runtime system (approximately 100K). A * in an entry means
16595 that the compiler failed to compile the benchmark.
16597 [options="header",cols="<2,5*<1"]
16599 |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16600 |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|180,788|810,267|199,503|148,120|402,480
16601 |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|250,246|*|248,018|196,984|496,664
16602 |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|122,422|225,274|*|106,088|406,560
16603 |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|151,878|250,126|187,048|144,032|428,136
16604 |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|223,073|827,483|*|272,664|*
16605 |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|122,350|87,586|181,415|106,072|380,928
16606 |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|145,008|237,230|186,228|131,400|418,896
16607 |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|122,310|87,402|181,312|106,088|380,928
16608 |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|121,958|104,102|181,464|106,072|394,256
16609 |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1,503,849|2,280,691|407,219|2,249,504|*
16610 |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|122,078|89,346|181,470|106,088|381,952
16611 |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|193,145|*|192,659|161,080|400,408
16612 |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|308,296|826,819|213,128|268,272|*
16613 |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|141,862|721,419|186,463|118,552|384,024
16614 |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|211,086|782,667|188,908|198,408|409,624
16615 |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|122,086|700,075|183,037|106,104|386,048
16616 |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|124,398|280,006|184,328|110,232|416,784
16617 |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|150,497|271,794|*|122,624|399,416
16618 |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|123,846|100,858|181,542|106,136|381,960
16619 |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|678,920|1,233,587|263,721|576,728|*
16620 |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|846,779|1,432,283|297,108|777,664|985,304
16621 |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|124,126|229,078|184,440|114,584|392,232
16622 |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|298,038|507,186|*|475,808|456,744
16623 |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|157,973|699,003|181,680|118,800|380,928
16624 |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|156,401|201,138|183,438|110,456|385,072
16625 |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|126,486|106,166|*|106,088|393,256
16626 |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|150,174|265,694|190,088|184,536|414,760
16627 |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|260,863|736,795|195,064|198,976|512,160
16628 |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|384,905|*|*|446,424|623,824
16629 |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|365,578|895,139|197,765|1,051,952|708,696
16630 |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|286,474|*|*|262,616|547,984
16631 |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|119,102|140,626|183,249|106,088|390,160
16632 |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|122,110|87,890|181,369|106,072|381,952
16633 |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|122,246|87,402|181,349|106,088|376,832
16634 |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|186,545|*|*|*|421,984
16635 |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|163,033|722,571|188,634|126,984|393,264
16636 |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|235,449|*|195,401|184,816|478,296
16637 |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|123,790|104,398|*|106,200|394,256
16638 |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|123,846|*|*|*|405,552
16639 |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|122,982|104,614|181,534|106,072|394,256
16640 |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|538,074|1,182,851|249,884|580,792|749,752
16641 |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|186,152|699,459|191,347|127,200|386,048
16642 |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|196,232|700,131|191,539|127,232|387,072
16643 |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|230,433|128,354|186,322|127,048|390,184
16644 |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|156,902|*|*|*|453,768
16648 == <!Anchor(CompileTime)>Compile time ==
16650 The following table gives the compile time of each benchmark in
16651 seconds. A * in an entry means that the compiler failed to compile the benchmark.
16654 [options="header",cols="<2,5*<1"]
16656 |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16657 |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|2.70|0.89|0.15|0.29|0.20
16658 |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|2.87|*|0.14|0.20|0.41
16659 |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|2.21|0.24|*|0.07|0.05
16660 |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|2.28|0.34|0.04|0.11|0.21
16661 |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|2.93|1.01|*|0.27|*
16662 |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|2.23|0.20|0.01|0.07|0.04
16663 |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|2.35|0.28|0.03|0.09|0.10
16664 |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|2.16|0.19|0.01|0.07|0.04
16665 |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|2.16|0.20|0.01|0.07|0.04
16666 |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|12.28|19.25|23.75|6.44|*
16667 |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|2.14|0.20|0.01|0.08|0.04
16668 |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|2.48|*|0.08|0.14|0.23
16669 |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|3.31|0.75|0.15|0.22|*
16670 |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|2.25|0.32|0.03|0.09|0.10
16671 |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|2.72|0.57|0.07|0.17|0.21
16672 |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|2.14|0.24|0.01|0.07|0.04
16673 |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|2.14|0.24|0.01|0.08|0.05
16674 |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|2.31|0.39|*|0.12|0.27
16675 |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|2.15|0.21|0.01|0.07|0.04
16676 |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|7.07|4.53|2.05|0.80|*
16677 |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|6.78|4.76|1.20|1.65|4.78
16678 |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|2.14|0.28|0.02|0.08|0.07
16679 |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|3.96|2.12|*|0.37|0.49
16680 |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|2.30|0.22|0.01|0.07|0.04
16681 |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|2.26|0.20|0.01|0.07|0.04
16682 |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|2.12|0.22|*|9.83|12.55
16683 |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|2.59|0.47|0.07|0.16|0.24
16684 |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|2.95|0.46|0.05|0.17|0.14
16685 |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|3.93|*|*|0.45|0.74
16686 |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|3.42|1.23|0.30|0.32|0.53
16687 |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|3.23|*|*|0.15|0.32
16688 |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|2.25|0.28|0.01|0.08|0.05
16689 |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|2.24|0.21|0.01|0.08|0.05
16690 |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|2.23|0.20|0.01|0.08|0.05
16691 |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|2.73|*|*|*|0.44
16692 |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|2.42|0.38|0.05|0.11|0.11
16693 |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|2.93|*|0.10|0.27|0.31
16694 |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|2.23|0.22|*|0.07|0.04
16695 |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|2.18|*|*|*|0.04
16696 |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|2.23|0.22|0.01|0.08|0.05
16697 |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|5.25|2.93|0.63|0.94|1.85
16698 |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|2.46|0.24|0.01|0.08|0.05
16699 |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|2.61|0.25|0.01|0.08|0.05
16700 |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|2.99|0.35|0.03|0.09|0.11
16701 |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|2.31|*|*|*|0.11
16706 :mlton-guide-page: PhantomType
16711 A phantom type is a type that has no run-time representation, but is
16712 used to force the type checker to ensure invariants at compile time.
16713 This is done by augmenting a type with additional arguments (phantom
16714 type variables) and expressing constraints by choosing specific types
16715 to instantiate those phantom type variables in the types of values.
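As a small illustration, the following sketch uses a phantom type variable to keep lengths in different units apart. The `LENGTH` signature, the `Length` structure, and the `meters` and `feet` tags are hypothetical names chosen for this example; they do not come from the Basis Library or from MLton.

```sml
signature LENGTH =
   sig
      type 'u t               (* 'u is a phantom type variable *)
      type meters             (* tag types: never inspected at run time *)
      type feet
      val meters : real -> meters t
      val feet : real -> feet t
      val add : 'u t * 'u t -> 'u t
      val toReal : 'u t -> real
   end

structure Length :> LENGTH =
   struct
      type 'u t = real        (* the representation ignores 'u *)
      type meters = unit
      type feet = unit
      fun meters x = x
      fun feet x = x
      fun add (x, y) : real = x + y
      fun toReal x = x
   end

val total = Length.add (Length.meters 1.0, Length.meters 2.5)

(* Rejected by the type checker, because meters t and feet t differ:
   Length.add (Length.meters 1.0, Length.feet 3.0) *)
```

The opaque ascription `:>` is what keeps `meters t` and `feet t` distinct outside the structure, even though both are plain reals inside it.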
16722 * <!Cite(FluetPucella06)>
16724 * socket module in <:BasisLibrary:Basis Library>
16728 :mlton-guide-page: PlatformSpecificNotes
16729 [[PlatformSpecificNotes]]
16730 PlatformSpecificNotes
16731 =====================
16733 Here are notes about using MLton on the following platforms.
16735 == Operating Systems ==
16737 * <:RunningOnAIX:AIX>
16738 * <:RunningOnCygwin:Cygwin>
16739 * <:RunningOnDarwin:Darwin>
16740 * <:RunningOnFreeBSD:FreeBSD>
16741 * <:RunningOnHPUX:HPUX>
16742 * <:RunningOnLinux:Linux>
16743 * <:RunningOnMinGW:MinGW>
16744 * <:RunningOnNetBSD:NetBSD>
16745 * <:RunningOnOpenBSD:OpenBSD>
16746 * <:RunningOnSolaris:Solaris>
16748 == Architectures ==
16750 * <:RunningOnAMD64:AMD64>
16751 * <:RunningOnHPPA:HPPA>
16752 * <:RunningOnPowerPC:PowerPC>
16753 * <:RunningOnPowerPC64:PowerPC64>
16754 * <:RunningOnSparc:Sparc>
16755 * <:RunningOnX86:X86>
16763 :mlton-guide-page: PolyEqual
16768 <:PolyEqual:> is an optimization pass for the <:SSA:>
16769 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
16773 This pass implements polymorphic equality.
16775 == Implementation ==
16777 * <!ViewGitFile(mlton,master,mlton/ssa/poly-equal.fun)>
16779 == Details and Notes ==
16781 For each datatype, tycon, and vector type, it builds an equality
16782 function and translates calls to `MLton_equal` into calls to that function.
16785 Also generates calls to `Word_equal`.
16787 For tuples, it does the equality test inline; i.e., it does not create
16788 a separate equality function for each tuple type.
16790 All equality functions are created only if necessary, i.e., if
16791 equality is actually used at a type.
16795 * for datatypes that are enumerations, do not build a case dispatch,
16796 just use `MLton_eq`, as the backend will represent these as ints
16798 * deep equality always does an `MLton_eq` test first
16800 * If one argument to `=` is a constant and the type will get
16801 translated to an `IntOrPointer`, then just use `eq` instead of the
16802 full equality. This is important for implementing code like the
16803 following efficiently:
16806 if x = 0 ... (* where x is of type IntInf.int *)
16809 * Also convert pointer equality on scalar types to type specific
16814 :mlton-guide-page: PolyHash
16819 <:PolyHash:> is an optimization pass for the <:SSA:>
16820 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
16824 This pass implements polymorphic, structural hashing.
16826 == Implementation ==
16828 * <!ViewGitFile(mlton,master,mlton/ssa/poly-hash.fun)>
16830 == Details and Notes ==
16832 For each datatype, tycon, and vector type, it builds a hashing
16833 function and translates calls to `MLton_hash` into calls to that function.
16836 For tuples, it does the hashing inline; i.e., it does not create
16837 a separate hashing function for each tuple type.
16839 All hashing functions are created only if necessary, i.e., if
16840 hashing is actually used at a type.
16844 :mlton-guide-page: PolyML
16849 http://www.polyml.org/[Poly/ML] is a
16850 <:StandardMLImplementations:Standard ML implementation>.
16854 * <!Cite(Matthews95)>
16858 :mlton-guide-page: PolymorphicEquality
16859 [[PolymorphicEquality]]
16860 PolymorphicEquality
16861 ===================
16863 Polymorphic equality is a built-in function in
16864 <:StandardML:Standard ML> that compares two values of the same type
16865 for equality. It is specified as
16869 val = : ''a * ''a -> bool
16872 The `''a` in the specification is an
16873 <:EqualityTypeVariable:equality type variable>, and indicates that
16874 polymorphic equality can only be applied to values of an
16875 <:EqualityType:equality type>. It is not allowed in SML to rebind
16876 `=`, so a programmer is guaranteed that `=` always denotes polymorphic equality.
16880 == Equality of ground types ==
16882 Ground types like `char`, `int`, and `word` may be compared (to values
16883 of the same type). For example, `13 = 14` is type correct and yields `false`.
16887 == Equality of reals ==
16889 The one ground type that cannot be compared is `real`. So,
16890 `13.0 = 14.0` is not type correct. One can use `Real.==` to compare
16891 reals for equality, but beware that this has different algebraic
16892 properties than polymorphic equality.
16894 See http://standardml.org/Basis/real.html for a discussion of why
16895 `real` is not an equality type.
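The following sketch shows two of the ways in which `Real.==` differs from an ordinary equality; the binding names are chosen only for illustration.

```sml
val nan = 0.0 / 0.0                       (* real division yields NaN here *)

val sameValue = Real.== (13.0, 13.0)      (* true *)
val nanVsNan = Real.== (nan, nan)         (* false: NaN is unequal to itself *)
val zeros = Real.== (0.0, ~0.0)           (* true: the sign of zero is ignored *)
```

So `Real.==` is not reflexive, and it identifies two values (the signed zeros) that are distinguishable by other operations.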
16898 == Equality of functions ==
16900 Comparison of functions is not allowed.
16903 == Equality of immutable types ==
16905 Polymorphic equality can be used on <:Immutable:immutable> values like
16906 tuples, records, lists, and vectors. For example,
16909 (1, 2, 3) = (4, 5, 6)
16912 is a type-correct expression yielding `false`, while
16915 [1, 2, 3] = [1, 2, 3]
16918 is type correct and yields `true`.
16920 Equality on immutable values is computed by structure, which means
16921 that values are compared by recursively descending the data structure
16922 until ground types are reached, at which point the ground types are
16923 compared with primitive equality tests (like comparison of
16924 characters). So, the expression
16927 [1, 2, 3] = [1, 1 + 1, 1 + 1 + 1]
16930 is guaranteed to yield `true`, even though the lists may occupy
16931 different locations in memory.
16933 Because of structural equality, immutable values can only be compared
16934 if their components can be compared. For example, `[1, 2, 3]` can be
16935 compared, but `[1.0, 2.0, 3.0]` cannot. The SML type system uses
16936 <:EqualityType:equality types> to ensure that structural equality is
16937 only applied to valid values.
16940 == Equality of mutable values ==
16942 In contrast to immutable values, polymorphic equality of
16943 <:Mutable:mutable> values (like ref cells and arrays) is performed by
16944 pointer comparison, not by structure. So, an expression such as `ref 13 = ref 13`
16950 is guaranteed to yield `false`, even though the ref cells hold the same contents.
16953 Because equality of mutable values is not structural, arrays and refs
16954 can be compared _even if their components are not equality types_.
16955 Hence, an expression such as `let val r = ref 13.0 in r = r end` is type correct (and yields `true`).
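A few comparisons along these lines, as a sketch with illustrative binding names, using `real` components to stress that the component type need not be an equality type:

```sml
val r = ref 13.0
val alias = r                       (* a second name for the same cell *)
val other = ref 13.0                (* a distinct cell holding an equal value *)

val sameCell = (r = alias)          (* true *)
val distinctCells = (r = other)     (* false *)

val a = Array.array (3, 0.0)
val b = Array.array (3, 0.0)
val sameArray = (a = a)             (* true *)
val distinctArrays = (a = b)        (* false *)
```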
16967 == Equality of datatypes ==
16969 Polymorphic equality of datatypes is structural. Two values of the
16970 same datatype are equal if they are of the same <:Variant:variant> and
16971 if the <:Variant:variant>'s arguments are equal (recursively). So, with the datatype
16976 datatype t = A | B of t
16979 then `B (B A) = B A` is type correct and yields `false`, while `A = A`
16980 and `B A = B A` yield `true`.
16982 As polymorphic equality descends two values to compare them, it uses
16983 pointer equality whenever it reaches a mutable value. So, with the datatype
16988 datatype t = A of int ref | ...
16991 then `A (ref 13) = A (ref 13)` is type correct and yields `false`,
16992 because the pointer equality on the two ref cells yields `false`.
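The two behaviors can be seen side by side in the sketch below. The datatype `u`, its constructor `C`, and the binding names are hypothetical and chosen to avoid clashing with the declarations above.

```sml
datatype t = A | B of t
val deeper = (B (B A) = B A)            (* false: different structure *)
val same = (B A = B A)                  (* true: same structure *)

datatype u = C of int ref
val cell = ref 13
val shared = (C cell = C cell)          (* true: both sides hold the same cell *)
val fresh = (C (ref 13) = C (ref 13))   (* false: two distinct cells *)
```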
16994 One weakness of the SML type system is that datatypes do not inherit
16995 the special property of the `ref` and `array` type constructors that
16996 allows them to be compared regardless of their component type. For
16997 example, after declaring
17001 datatype 'a t = A of 'a ref
17004 one might expect to be able to compare two values of type `real t`,
17005 because pointer comparison on a ref cell would suffice.
17006 Unfortunately, the type system can only express that a user-defined
17007 datatype <:AdmitsEquality:admits equality> or not. In this case, `t`
17008 admits equality, which means that `int t` can be compared but that
17009 `real t` cannot. We can confirm this with the program
17013 datatype 'a t = A of 'a ref
17014 fun f (x: real t, y: real t) = x = y
17017 on which MLton reports the following error.
17020 Error: z.sml 2.32-2.36.
17021 Function applied to incorrect argument.
17022 expects: [<equality>] t * [<equality>] t
17023 but got: [real] t * [real] t
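One way around this limitation, sketched here under the assumption that the constructor is visible, is to pattern match and compare the ref cells themselves, since `'a ref` is an equality type for every `'a`. The helper name `sameCell` is hypothetical.

```sml
datatype 'a t = A of 'a ref

(* Compare the underlying ref cells directly; this type checks at any 'a. *)
fun sameCell (A r, A r') = (r = r')

val x = A (ref 13.0)
val y = A (ref 13.0)
val xx = sameCell (x, x)    (* true *)
val xy = sameCell (x, y)    (* false *)
```

Here `sameCell` has type `'a t * 'a t -> bool`, so it applies to `real t` even though `=` does not.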
17028 == Implementation ==
17030 Polymorphic equality is implemented by recursively descending the two
17031 values being compared, stopping as soon as they are determined to be
17032 unequal, or exploring the entire values to determine that they are
17033 equal. Hence, polymorphic equality can take time proportional to the
17034 size of the smaller value.
17036 MLton uses some optimizations to improve performance.
17038 * When computing structural equality, first do a pointer comparison.
17039 If the comparison yields `true`, then stop and return `true`, since
17040 the structural comparison is guaranteed to do so. If the pointer
17041 comparison fails, then recursively descend the values.
17043 * If a datatype is an enum (e.g. `datatype t = A | B | C`), then a
17044 single comparison suffices to compare values of the datatype. No case
17045 dispatch is required to determine whether the two values are of the
17046 same <:Variant:variant>.
17048 * When comparing a known constant non-value-carrying
17049 <:Variant:variant>, use a single comparison. For example, the
17050 following code will compile into a single comparison for `A = x`.
17054 datatype t = A | B | C of ...
17055 fun f x = ... if A = x then ...
17058 * When comparing a small constant `IntInf.int` to another
17059 `IntInf.int`, use a single comparison against the constant. No case
17060 dispatch is required.
17065 * <:AdmitsEquality:>
17067 * <:EqualityTypeVariable:>
17071 :mlton-guide-page: Polyvariance
17076 Polyvariance is an optimization pass for the <:SXML:>
17077 <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
17081 This pass duplicates a higher-order, `let` bound function at each
17082 variable reference, if the cost is smaller than some threshold.
17084 == Implementation ==
17086 * <!ViewGitFile(mlton,master,mlton/xml/polyvariance.fun)>
17088 == Details and Notes ==
17094 :mlton-guide-page: Poplog
17099 http://www.cs.bham.ac.uk/research/poplog/poplog.info.html[POPLOG] is a
17100 development environment that includes implementations of a number of
17101 languages, including <:StandardML:Standard ML>.
17103 While POPLOG is actively developed, the <:ML:> support predates
17104 <:DefinitionOfStandardML:SML'97>, and there is no support for the
17105 <:BasisLibrary:Basis Library>
17106 http://www.standardml.org/Basis[specification].
17110 * http://www.cs.bham.ac.uk/research/poplog/doc/pmlhelp/mlinpop[Mixed-language programming in ML and Pop-11].
17114 :mlton-guide-page: PortingMLton
17119 Porting MLton to a new target platform (architecture or OS) involves
17120 the following steps.
17122 1. Make the necessary changes to the scripts, runtime system,
17123 <:BasisLibrary: Basis Library> implementation, and compiler.
17125 2. Get the regressions working using a cross compiler.
17127 3. <:CrossCompiling: Cross compile> MLton and bootstrap on the target.
17129 MLton has a native code generator only for AMD64 and X86, so, if you
17130 are porting to another architecture, you must use the C code
17131 generator. These notes do not cover building a new native code generator.
17134 Some of the following steps will not be necessary if MLton already
17135 supports the architecture or operating system you are porting to.
17138 == What code to change ==
17143 * In `bin/platform`, add new cases to define `$HOST_OS` and `$HOST_ARCH`.
17149 The goal of this step is to be able to successfully run `make` in the
17150 `runtime` directory on the target machine.
17152 * In `platform.h`, add a new case to include `platform/<arch>.h` and `platform/<os>.h`.
17154 * In `platform/<arch>.h`:
17155 ** define `MLton_Platform_Arch_host`.
17157 * In `platform/<os>.h`:
17158 ** include platform-specific includes.
17159 ** define `MLton_Platform_OS_host`.
17160 ** define all of the `HAS_*` macros.
17162 * In `platform/<os>.c` implement any platform-dependent functions that the runtime needs.
17164 * Add rounding mode control to `basis/Real/IEEEReal.c` for the new arch (if not `HAS_FEROUND`)
17166 * Compile and install the <:GnuMP:>. This varies from platform to platform. In `platform/<os>.h`, you need to include the appropriate `gmp.h`.
17169 * Basis Library implementation (`basis-library/*`)
17172 * In `primitive/prim-mlton.sml`:
17173 ** Add a new variant to the `MLton.Platform.Arch.t` datatype.
17174 ** modify the constants that define `MLton.Platform.Arch.host` to match with `MLton_Platform_Arch_host`, as set in `runtime/platform/<arch>.h`.
17175 ** Add a new variant to the `MLton.Platform.OS.t` datatype.
17176 ** modify the constants that define `MLton.Platform.OS.host` to match with `MLton_Platform_OS_host`, as set in `runtime/platform/<os>.h`.
17178 * In `mlton/platform.{sig,sml}` add a new variant.
17180 * In `sml-nj/sml-nj.sml`, modify `getOSKind`.
17182 * Look at all the uses of `MLton.Platform` in the Basis Library implementation and see if you need to do anything special. You might use the following command to see where to look.
17185 find basis-library -type f | xargs grep 'MLton\.Platform'
17188 If in doubt, leave the code alone and wait to see what happens when you run the regression tests.
17194 * In `lib/stubs/mlton-stubs/platform.sig` add any new variants, as was done in the Basis Library.
17196 * In `lib/stubs/mlton-stubs/mlton.sml` add any new variants in `MLton.Platform`, as was done in the Basis Library.
17199 The string used to identify a particular architecture or operating
17200 system must be the same (except for possibly case of letters) in the
17201 scripts, runtime, Basis Library implementation, and compiler (stubs).
17202 In `mlton/main/main.fun`, MLton itself uses the conversions to and from strings:
17205 MLton.Platform.{Arch,OS}.{from,to}String
17208 If there is a mismatch, you may see the error message
17209 `strange arch` or `strange os`.
17212 == Running the regressions with a cross compiler ==
17214 When porting to a new platform, it is always best to get all (or as
17215 many as possible) of the regressions working before moving to a self
17216 compile. It is easiest to do this by modifying and rebuilding the
17217 compiler on a working machine and then running the regressions with a
17218 cross compiler. It is not easy to build a gcc cross compiler, so we
17219 recommend generating the C and assembly on a working machine (using
17220 MLton's `-target` and `-stop g` flags), copying the generated files to
17221 the target machine, then compiling and linking there.
17223 1. Remake the compiler on a working machine.
17225 2. Use `bin/add-cross` to add support for the new target. In particular, this should create `build/lib/mlton/targets/<target>/` with the necessary platform-specific cross-compilation information.
17227 3. Run the regression tests with the cross-compiler. To cross-compile all the tests, do
17230 bin/regression -cross <target>
17233 This will create all the executables. Then, copy `bin/regression` and
17234 the `regression` directory to the target machine, and do
17237 bin/regression -run-only <target>
17240 This should run all the tests.
17242 Repeat this step, interleaved with appropriate compiler modifications,
17243 until all the regressions pass.
17248 Once you've got all the regressions working, you can build MLton for
17249 the new target. As with the regressions, the idea for bootstrapping
17250 is to generate the C and assembly on a working machine, copy it to the
17251 target machine, and then compile and link there. Here's the sequence of steps.
17254 1. On a working machine, with the newly rebuilt compiler, in the `mlton` directory, do:
17257 mlton -stop g -target <target> mlton.mlb
17260 2. Copy to the target machine.
17262 3. On the target machine, move the libraries to the right place. That is, in `build/lib/mlton/targets`, do:
17269 Also make sure you have all the header files in `build/lib/mlton/include`. You can copy them from a host machine that has run `make runtime`.
17271 4. On the target machine, compile and link MLton. That is, in the `mlton` directory, do something like:
17274 gcc -c -Ibuild/lib/mlton/include -Ibuild/lib/mlton/targets/self/include -O1 -w mlton/mlton.*.[cs]
17275 gcc -o build/lib/mlton/mlton-compile \
17276 -Lbuild/lib/mlton/targets/self \
17279 -lmlton -lgmp -lgdtoa -lm
17282 5. At this point, MLton should be working and you can finish the rest of a usual make on the target machine.
17285 make basis-no-check script mlbpathmap constants libraries tools
17288 6. Making the last tool, mlyacc, will fail, because mlyacc cannot bootstrap its own yacc.grm.* files. On the host machine, run `make -C mlyacc src/yacc.grm.sml`. Then copy both files to the target machine, and compile mlyacc, making sure to supply the path to your newly compiled mllex: `make -C mlyacc MLLEX=mllex/mllex`.
17290 There are other details to get right, like making sure that the tools
17291 directories were clean so that the tools are rebuilt on the new
17292 platform, but hopefully this structure works. Once you've got a
17293 compiler on the target machine, you should test it by running all the
17294 regressions normally (i.e. without the `-cross` flag) and by running a
17295 couple rounds of self compiles.
17300 The above description is based on the following emails sent to the
17303 * http://www.mlton.org/pipermail/mlton/2002-October/013110.html
17304 * http://www.mlton.org/pipermail/mlton/2004-July/016029.html
17308 :mlton-guide-page: PrecedenceParse
17309 [[PrecedenceParse]]
17313 <:PrecedenceParse:> is an analysis/rewrite pass for the <:AST:>
17314 <:IntermediateLanguage:>, invoked from <:Elaborate:>.
17318 This pass rewrites <:AST:> function clauses, expressions, and patterns
17319 to resolve <:OperatorPrecedence:>.
17321 == Implementation ==
17323 * <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.sig)>
17324 * <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.fun)>
17326 == Details and Notes ==
17332 :mlton-guide-page: Printf
17337 Programmers coming from C or Java often ask if
17338 <:StandardML:Standard ML> has a `printf` function. It does not.
17339 However, it is possible to implement your own version with only a few
17342 Here is a definition for `printf` and `fprintf`, along with format
17343 specifiers for booleans, integers, and reals.
17349 fun $ (_, f) = f (fn p => p ()) ignore
17350 fun fprintf out f = f (out, id)
17351 val printf = fn z => fprintf TextIO.stdOut z
17352 fun one ((out, f), make) g =
17356 r (fn () => (p (); TextIO.output (out, s))))))
17357 fun ` x s = one (x, fn f => f s)
17358 fun spec to x = one (x, fn f => f o to)
17359 val B = fn z => spec Bool.toString z
17360 val I = fn z => spec Int.toString z
17361 val R = fn z => spec Real.toString z
17365 Here's an example use.
17369 val () = printf `"Int="I`" Bool="B`" Real="R`"\n" $ 1 false 2.0
17372 This prints the following.
17375 Int=1 Bool=false Real=2.0
17378 In general, a use of `printf` looks like
17381 printf <spec1> ... <specn> $ <arg1> ... <argm>
17384 where each `<speci>` is either a specifier like `B`, `I`, or `R`, or
17385 is an inline string, like ++`"foo"++. A backtick (+`+)
17386 must precede each inline string. Each `<argi>` must be of the
17387 appropriate type for the corresponding specifier.
17389 SML `printf` is more powerful than its C counterpart in a number of
17390 ways. In particular, the function produced by `printf` is a perfectly
17391 ordinary SML function, and can be passed around, used multiple times,
17396 val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $
17401 The definition of `printf` is even careful to not print anything until
17402 it is fully applied. So, examples like the following will work as
17406 val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $ 13
17411 It is also easy to define new format specifiers. For example, suppose
17412 we wanted format specifiers for characters and strings.
17415 val C = fn z => spec Char.toString z
17416 val S = fn z => spec (fn s => s) z
17419 One can define format specifiers for more complex types, e.g. pairs of
17426 concat ["(", Int.toString i, ", ", Int.toString j, ")"])
17430 Here's an example use.
17433 val () = printf `"Test "I2`" a string "S`"\n" $ (1, 2) "hello"
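As a further illustration, here is a self-contained sketch that repeats the core definitions from this page, adds a specifier for integer lists, and uses `fprintf` to write to standard error. The helper `id` and the specifier name `IL` are assumptions made for this example; only `fprintf`, `spec`, `I`, and the other definitions come from the text above.

```sml
(* Assumption: id is the identity function that fprintf refers to. *)
fun id x = x

structure Printf =
   struct
      fun $ (_, f) = f (fn p => p ()) ignore
      fun fprintf out f = f (out, id)
      val printf = fn z => fprintf TextIO.stdOut z
      fun one ((out, f), make) g =
         g (out, fn r =>
            f (fn p =>
               make (fn s =>
                     r (fn () => (p (); TextIO.output (out, s))))))
      fun ` x s = one (x, fn f => f s)
      fun spec to x = one (x, fn f => f o to)
      val I = fn z => spec Int.toString z
   end

open Printf

(* Hypothetical specifier: formats an int list as [1, 2, 3]. *)
val IL =
   fn z =>
   spec (fn l => "[" ^ String.concatWith ", " (List.map Int.toString l) ^ "]")
   z

(* Writes count=3 items=[1, 2, 3] and a newline to standard error. *)
val () = fprintf TextIO.stdErr `"count="I`" items="IL`"\n" $ 3 [1, 2, 3]
```

Because `fprintf` takes the output stream as its first argument, the same format can be reused with any `TextIO.outstream`.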
17437 == Printf via <:Fold:> ==
17439 `printf` is best viewed as a special case of variable-argument
17440 <:Fold:> that inductively builds a function as it processes its
17441 arguments. Here is the definition of a `Printf` structure in terms of
17442 fold. The structure is equivalent to the above one, except that it
17443 uses the standard `$` instead of a specialized one.
17450 Fold.fold ((out, id), fn (_, f) => f (fn p => p ()) ignore)
17452 val printf = fn z => fprintf TextIO.stdOut z
17454 fun one ((out, f), make) =
17458 r (fn () => (p (); TextIO.output (out, s))))))
17461 fn z => Fold.step1 (fn (s, x) => one (x, fn f => f s)) z
17463 fun spec to = Fold.step0 (fn x => one (x, fn f => f o to))
17465 val B = fn z => spec Bool.toString z
17466 val I = fn z => spec Int.toString z
17467 val R = fn z => spec Real.toString z
17471 Viewing `printf` as a fold opens up a number of possibilities. For
17472 example, one can name parts of format strings using the fold idiom for
17473 naming sequences of steps.
17476 val IB = fn u => Fold.fold u `"Int="I`" Bool="B
17477 val () = printf IB`" "IB`"\n" $ 1 true 3 false
17480 One can even parametrize over partial format strings.
17483 fun XB X = fn u => Fold.fold u `"X="X`" Bool="B
17484 val () = printf (XB I)`" "(XB R)`"\n" $ 1 true 2.0 false
17491 * <!Cite(Danvy98, Functional Unparsing)>
17495 :mlton-guide-page: PrintfGentle
17500 This page provides a gentle introduction and derivation of <:Printf:>,
17501 with sections and arrangement more suitable to a talk.
17506 SML does not have `printf`. Could we define it ourselves?
17510 val () = printf ("here's an int %d and a real %f.\n", 13, 17.0)
17511 val () = printf ("here's three values (%d, %f, %f).\n", 13, 17.0, 19.0)
17514 What could the type of `printf` be?
17516 This obviously can't work, because SML functions take a fixed number
17517 of arguments. Actually they take one argument, but if that's a tuple,
17518 it can only have a fixed number of components.
17521 == From tupling to currying ==
17523 What about currying to get around the typing problem?
17527 val () = printf "here's an int %d and a real %f.\n" 13 17.0
17528 val () = printf "here's three values (%d, %f, %f).\n" 13 17.0 19.0
17531 That fails for a similar reason. We need two types for `printf`.
17534 val printf: string -> int -> real -> unit
17535 val printf: string -> int -> real -> real -> unit
17538 This can't work, because `printf` can only have one type. SML doesn't
17539 support programmer-defined overloading.
17542 == Overloading and dependent types ==
17544 Even without worrying about number of arguments, there is another
17545 problem. The type of `printf` depends on the format string.
17549 val () = printf "here's an int %d and a real %f.\n" 13 17.0
17550 val () = printf "here's a real %f and an int %d.\n" 17.0 13
17556 val printf: string -> int -> real -> unit
17557 val printf: string -> real -> int -> unit
Again, this can't possibly work because SML doesn't have
17561 overloading, and types can't depend on values.
17564 == Idea: express type information in the format string ==
17566 If we express type information in the format string, then different
17567 uses of `printf` can have different types.
17571 type 'a t (* the type of format strings *)
17572 val printf: 'a t -> 'a
17574 val fs1: (int -> real -> unit) t = "here's an int "D" and a real "F".\n"
17575 val fs2: (int -> real -> real -> unit) t =
17576 "here's three values ("D", "F", "F").\n"
17577 val () = printf fs1 13 17.0
17578 val () = printf fs2 13 17.0 19.0
17581 Now, our two calls to `printf` type check, because the format
17582 string specializes `printf` to the appropriate type.
17585 == The types of format characters ==
17587 What should the type of format characters `D` and `F` be? Each format
17588 character requires an additional argument of the appropriate type to
17589 be supplied to `printf`.
Idea: guess the final type that `printf` will need for the format
string, and verify it with each format character.
17596 type ('a, 'b) t (* 'a = rest of type to verify, 'b = final type *)
17597 val ` : string -> ('a, 'a) t (* guess the type, which must be verified *)
17598 val D: (int -> 'a, 'b) t * string -> ('a, 'b) t (* consume an int *)
17599 val F: (real -> 'a, 'b) t * string -> ('a, 'b) t (* consume a real *)
17600 val printf: (unit, 'a) t -> 'a
17603 Don't worry. In the end, type inference will guess and verify for us.
17606 == Understanding guess and verify ==
17608 Now, let's build up a format string and a specialized `printf`.
17613 val f0 = `"here's an int "
17614 val f1 = f0 D " and a real "
17615 val f2 = f1 F ".\n"
17619 These definitions yield the following types.
17623 val f0: (int -> real -> unit, int -> real -> unit) t
17624 val f1: (real -> unit, int -> real -> unit) t
17625 val f2: (unit, int -> real -> unit) t
17626 val p: int -> real -> unit
17629 So, `p` is a specialized `printf` function. We could use it as
17639 == Type checking this using a functor ==
17646 val ` : string -> ('a, 'a) t
17647 val D: (int -> 'a, 'b) t * string -> ('a, 'b) t
17648 val F: (real -> 'a, 'b) t * string -> ('a, 'b) t
17649 val printf: (unit, 'a) t -> 'a
17652 functor Test (P: PRINTF) =
17657 val () = printf (`"here's an int "D" and a real "F".\n") 13 17.0
17658 val () = printf (`"here's three values ("D", "F ", "F").\n") 13 17.0 19.0
17663 == Implementing `Printf` ==
17665 Think of a format character as a formatter transformer. It takes the
17666 formatter for the part of the format string before it and transforms
it into a new formatter that first does the left-hand bit, then does
17668 its bit, then continues on with the rest of the format string.
17672 structure Printf: PRINTF =
17674 datatype ('a, 'b) t = T of (unit -> 'a) -> 'b
17676 fun printf (T f) = f (fn () => ())
17678 fun ` s = T (fn a => (print s; a ()))
17681 T (fn g => f (fn () => fn i =>
17682 (print (Int.toString i); print s; g ())))
17685 T (fn g => f (fn () => fn i =>
17686 (print (Real.toString i); print s; g ())))
17691 == Testing printf ==
17695 structure Z = Test (Printf)
17699 == User-definable formats ==
17701 The definition of the format characters is pretty much the same.
17702 Within the `Printf` structure we can define a format character
17707 val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t =
17708 fn toString => fn (T f, s) =>
17709 T (fn th => f (fn () => fn a => (print (toString a); print s ; th ())))
17710 val D = fn z => newFormat Int.toString z
17711 val F = fn z => newFormat Real.toString z
17715 == A core `Printf` ==
17717 We can now have a very small `PRINTF` signature, and define all
17718 the format strings externally to the core module.
17725 val ` : string -> ('a, 'a) t
17726 val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
17727 val printf: (unit, 'a) t -> 'a
17730 structure Printf: PRINTF =
17732 datatype ('a, 'b) t = T of (unit -> 'a) -> 'b
17734 fun printf (T f) = f (fn () => ())
17736 fun ` s = T (fn a => (print s; a ()))
17738 fun newFormat toString (T f, s) =
17740 f (fn () => fn a =>
17741 (print (toString a)
17748 == Extending to fprintf ==
One can implement `fprintf` by threading the outstream through all the
17758 val ` : string -> ('a, 'a) t
17759 val fprintf: (unit, 'a) t * TextIO.outstream -> 'a
17760 val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
17761 val printf: (unit, 'a) t -> 'a
17764 structure Printf: PRINTF =
17766 type out = TextIO.outstream
17767 val output = TextIO.output
17769 datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b
17771 fun fprintf (T f, out) = f (fn _ => ()) out
17773 fun printf t = fprintf (t, TextIO.stdOut)
17775 fun ` s = T (fn a => fn out => (output (out, s); a out))
17777 fun newFormat toString (T f, s) =
17779 f (fn out => fn a =>
17780 (output (out, toString a)
* Lesson: instead of using dependent types for a function, express the
dependency in the type of the argument.
17792 * If `printf` is partially applied, it will do the printing then and
17793 there. Perhaps this could be fixed with some kind of terminator.
17795 A syntactic or argument terminator is not necessary. A formatter can
17796 either be eager (as above) or lazy (as below). A lazy formatter
17797 accumulates enough state to print the entire string. The simplest
17798 lazy formatter concatenates the strings as they become available:
17802 structure PrintfLazyConcat: PRINTF =
17804 datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b
17806 fun printf (T f) = f print ""
17808 fun ` s = T (fn th => fn s' => th (s' ^ s))
17810 fun newFormat toString (T f, s) =
17812 f (fn s' => fn a =>
17813 th (s' ^ toString a ^ s)))
17817 It is somewhat more efficient to accumulate the strings as a list:
17821 structure PrintfLazyList: PRINTF =
17823 datatype ('a, 'b) t = T of (string list -> 'a) -> string list -> 'b
17825 fun printf (T f) = f (List.app print o List.rev) []
17827 fun ` s = T (fn th => fn ss => th (s::ss))
17829 fun newFormat toString (T f, s) =
17831 f (fn ss => fn a =>
17832 th (s::toString a::ss)))
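Because a lazy formatter accumulates the whole string before emitting it, the same construction can return the string instead of printing it. The following is a minimal sketch under that assumption. The names `SprintfSketch`, `sprintf`, `D`, and `B` are chosen for this example and are not part of the `PRINTF` signature.

```sml
structure SprintfSketch =
   struct
      (* The accumulated string is threaded through the continuation. *)
      datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b

      (* Finish with the identity continuation, so the result is the string. *)
      fun sprintf (T f) = f (fn s => s) ""

      fun ` s = T (fn th => fn s' => th (s' ^ s))

      fun newFormat toString (T f, s) =
         T (fn th =>
            f (fn s' => fn a =>
               th (s' ^ toString a ^ s)))
   end

open SprintfSketch

val D = fn z => newFormat Int.toString z
val B = fn z => newFormat Bool.toString z
infix D B

(* Partial application has no visible effect; it only returns a function. *)
val fmt: int -> bool -> string = sprintf (`"int="D" bool="B".")
val s = fmt 13 true   (* "int=13 bool=true." *)
val () = print (s ^ "\n")
```

Since nothing is output until the caller decides what to do with the string, the partial-application concern from the eager version does not arise here.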
17840 * <!Cite(Danvy98, Functional Unparsing)>
17844 :mlton-guide-page: ProductType
17849 <:StandardML:Standard ML> has special syntax for products (tuples). A
17850 product type is written as
17855 and a product pattern is written as
17861 In most situations the syntax is quite convenient. However, there are
17862 situations where the syntax is cumbersome. There are also situations
17863 in which it is useful to construct and destruct n-ary products
17864 inductively, especially when using <:Fold:>.
17866 In such situations, it is useful to have a binary product datatype
17867 with an infix constructor defined as follows.
17870 datatype ('a, 'b) product = & of 'a * 'b
17874 With these definitions, one can write an n-ary product as a nested
17875 binary product quite conveniently.
17881 Because of left associativity, this is the same as
17884 (((x1 & x2) & ...) & xn)
17887 Because `&` is a constructor, the syntax can also be used for
17890 The symbol `&` is inspired by the Curry-Howard isomorphism: the proof
17891 of a conjunction `(A & B)` is a pair of proofs `(a, b)`.
17894 == Example: parser combinators ==
17896 A typical parser combinator library provides a combinator that has a
17900 'a parser * 'b parser -> ('a * 'b) parser
17902 and produces a parser for the concatenation of two parsers. When more
than two parsers are concatenated, the result of the combined parser
17904 is a nested structure of pairs
17907 (...((p1, p2), p3)..., pN)
17909 which is somewhat cumbersome.
17911 By using a product type, the type of the concatenation combinator then
17915 'a parser * 'b parser -> ('a, 'b) product parser
17917 While this doesn't stop the nesting, it makes the pattern significantly
17918 easier to write. Instead of
17921 (...((p1, p2), p3)..., pN)
17923 the pattern is written as
17926 p1 & p2 & p3 & ... & pN
17928 which is considerably more concise.
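To make this concrete, here is a small sketch of a parser combinator library that pairs results with `&`. The parser representation and the names `item`, `mapP`, `--`, `digit`, `slash`, and `dayMonth` are all assumptions made for this example, not part of any particular library.

```sml
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Hypothetical parser type: consume a prefix of the input, return a result. *)
type 'a parser = char list -> ('a * char list) option

(* Parse one character satisfying pred. *)
fun item (pred: char -> bool): char parser =
   fn [] => NONE
    | c :: cs => if pred c then SOME (c, cs) else NONE

(* Apply f to the result of a parser. *)
fun mapP (f: 'a -> 'b) (p: 'a parser): 'b parser =
   fn cs => Option.map (fn (a, cs') => (f a, cs')) (p cs)

(* Concatenation: run p, then q, and combine the results with &. *)
infix 1 --
fun p -- q =
   fn cs =>
   case p cs of
      NONE => NONE
    | SOME (a, cs') =>
         (case q cs' of
             NONE => NONE
           | SOME (b, cs'') => SOME (a & b, cs''))

val digit = mapP (fn c => ord c - ord #"0") (item Char.isDigit)
val slash = item (fn c => c = #"/")

(* The result pattern is written flat even though the product is nested. *)
val dayMonth =
   mapP (fn (d1 & d2 & _ & m1 & m2) => (10 * d1 + d2, 10 * m1 + m2))
        (digit -- digit -- slash -- digit -- digit)

val result = dayMonth (String.explode "25/12")   (* SOME ((25, 12), []) *)
```

Both `&` and `--` are left associative, so the nesting of the pattern matches the nesting of the parsed result without any explicit parentheses.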
17933 * <:VariableArityPolymorphism:>
17938 :mlton-guide-page: Profiling
17943 With MLton and `mlprof`, you can profile your program to find out
17944 bytes allocated, execution counts, or time spent in each function. To
profile your program, compile with ++-profile __kind__++, where _kind_
17946 is one of `alloc`, `count`, or `time`. Then, run the executable,
17947 which will write an `mlmon.out` file when it finishes. You can then
17948 run `mlprof` on the executable and the `mlmon.out` file to see the
17951 Here are the three kinds of profiling that MLton supports.
17953 * <:ProfilingAllocation:>
17954 * <:ProfilingCounts:>
17955 * <:ProfilingTime:>
17959 * <:CallGraph:>s to visualize profiling data.
17960 * <:HowProfilingWorks:>
17962 * <:MLtonProfile:> to selectively profile parts of your program.
17963 * <:ProfilingTheStack:>
17968 :mlton-guide-page: ProfilingAllocation
17969 [[ProfilingAllocation]]
17970 ProfilingAllocation
17971 ===================
17973 With MLton and `mlprof`, you can <:Profiling:profile> your program to
17974 find out how many bytes each function allocates. To do so, compile
17975 your program with `-profile alloc`. For example, suppose that
17976 `list-rev.sml` is the following.
17980 sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml]
17983 Compile and run `list-rev` as follows.
17985 % mlton -profile alloc list-rev.sml
17987 % mlprof -show-line true list-rev mlmon.out
17988 6,030,136 bytes allocated (108,336 bytes by GC)
17990 ----------------------- -----
17991 append list-rev.sml: 1 97.6%
17994 rev list-rev.sml: 6 0.2%
17997 The data shows that most of the allocation is done by the `append`
17998 function defined on line 1 of `list-rev.sml`. The table also shows
17999 how special functions like `gc` and `main` are handled: they are
18000 printed with surrounding brackets. C functions are displayed
18001 similarly. In this example, the allocation done by the garbage
18002 collector is due to stack growth, which is usually the case.
18004 The run-time performance impact of allocation profiling is noticeable,
18005 because it inserts additional C calls for object allocation.
18007 Compile with `-profile alloc -profile-branch true` to find out how
18008 much allocation is done in each branch of a function; see
18009 <:ProfilingCounts:> for more details on `-profile-branch`.
18013 :mlton-guide-page: ProfilingCounts
18014 [[ProfilingCounts]]
18018 With MLton and `mlprof`, you can <:Profiling:profile> your program to
18019 find out how many times each function is called and how many times
18020 each branch is taken. To do so, compile your program with
18021 `-profile count -profile-branch true`. For example, suppose that
18022 `tak.sml` contains the following.
18026 sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml]
18029 Compile with count profiling and run the program.
18031 % mlton -profile count -profile-branch true tak.sml
18035 Display the profiling data, along with raw counts and file positions.
18037 % mlprof -raw true -show-line true tak mlmon.out
18040 --------------------------------- ----- -------------
18041 Tak.tak1.tak2 tak.sml: 5 38.2% (238,530,000)
18042 Tak.tak1.tak2.<true> tak.sml: 7 27.5% (171,510,000)
18043 Tak.tak1 tak.sml: 3 10.7% (67,025,000)
18044 Tak.tak1.<true> tak.sml: 14 10.7% (67,025,000)
18045 Tak.tak1.tak2.<false> tak.sml: 9 10.7% (67,020,000)
18046 Tak.tak1.<false> tak.sml: 16 2.0% (12,490,000)
18047 f tak.sml: 23 0.0% (5,001)
18048 f.<branch> tak.sml: 25 0.0% (5,000)
18049 f.<branch> tak.sml: 23 0.0% (1)
18050 uncalled tak.sml: 29 0.0% (0)
18051 f.<branch> tak.sml: 24 0.0% (0)
18054 Branches are displayed with lexical nesting followed by `<branch>`
18055 where the function name would normally be, or `<true>` or `<false>`
18056 for if-expressions. It is best to run `mlprof` with `-show-line true`
18057 to help identify the branch.
18059 One use of `-profile count` is as a code-coverage tool, to help find
18060 code in your program that hasn't been tested. For this reason,
18061 `mlprof` displays functions and branches even if they have a count of
18062 zero. As the above output shows, the branch on line 24 was never
18063 taken and the function defined on line 29 was never called. To see
18064 zero counts, it is best to run `mlprof` with `-raw true`, since some
18065 code (e.g. the branch on line 23 above) will show up with `0.0%` but
18066 may still have been executed and hence have a nonzero raw count.
18070 :mlton-guide-page: ProfilingTheStack
18071 [[ProfilingTheStack]]
18075 For all forms of <:Profiling:>, you can gather counts for all
18076 functions on the stack, not just the currently executing function. To
18077 do so, compile your program with `-profile-stack true`. For example,
18078 suppose that `list-rev.sml` contains the following.
18082 sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml]
18085 Compile with stack profiling and then run the program.
18087 % mlton -profile alloc -profile-stack true list-rev.sml
18091 Display the profiling data.
18093 % mlprof -show-line true list-rev mlmon.out
18094 6,030,136 bytes allocated (108,336 bytes by GC)
18095 function cur stack GC
18096 ----------------------- ----- ----- ----
18097 append list-rev.sml: 1 97.6% 97.6% 1.4%
18098 <gc> 1.8% 0.0% 1.8%
18099 <main> 0.4% 98.2% 1.8%
18100 rev list-rev.sml: 6 0.2% 97.6% 1.8%
18103 In the above table, we see that `rev`, defined on line 6 of
18104 `list-rev.sml`, is only responsible for 0.2% of the allocation, but is
18105 on the stack while 97.6% of the allocation is done by the user program
18106 and while 1.8% of the allocation is done by the garbage collector.
18108 The run-time performance impact of `-profile-stack true` can be
18109 noticeable since there is some extra bookkeeping at every nontail call
18114 :mlton-guide-page: ProfilingTime
18119 With MLton and `mlprof`, you can <:Profiling:profile> your program to
18120 find out how much time is spent in each function over an entire run of
18121 the program. To do so, compile your program with `-profile time`.
18122 For example, suppose that `tak.sml` contains the following.
18126 sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml]
18129 Compile with time profiling and run the program.
18131 % mlton -profile time tak.sml
18135 Display the profiling data.
18137 % mlprof tak mlmon.out
18138 6.00 seconds of CPU time (0.00 seconds GC)
18140 ------------- -----
18141 Tak.tak1.tak2 75.8%
18145 This example shows how `mlprof` indicates lexical nesting: as a
18146 sequence of period-separated names indicating the structures and
18147 functions in which a function definition is nested. The profiling
18148 data shows that roughly three-quarters of the time is spent in the
18149 `Tak.tak1.tak2` function, while the rest is spent in `Tak.tak1`.
18151 Display raw counts in addition to percentages with `-raw true`.
18153 % mlprof -raw true tak mlmon.out
18154 6.00 seconds of CPU time (0.00 seconds GC)
18156 ------------- ----- -------
18157 Tak.tak1.tak2 75.8% (4.55s)
18158 Tak.tak1 24.2% (1.45s)
18161 Display the file name and line number for each function in addition to
18162 its name with `-show-line true`.
18164 % mlprof -show-line true tak mlmon.out
18165 6.00 seconds of CPU time (0.00 seconds GC)
18167 ------------------------- -----
18168 Tak.tak1.tak2 tak.sml: 5 75.8%
18169 Tak.tak1 tak.sml: 3 24.2%
18172 Time profiling is designed to have a very small performance impact.
18173 However, in some cases there will be a run-time performance cost,
18174 which may perturb the results. There is more likely to be an impact
18175 with `-codegen c` than `-codegen native`.
18177 You can also compile with `-profile time -profile-branch true` to find
18178 out how much time is spent in each branch of a function; see
18179 <:ProfilingCounts:> for more details on `-profile-branch`.
18184 With `-profile time`, use of the following in your program will cause
18185 a run-time error, since they would interfere with the profiler signal
18188 * `MLton.Itimer.set (MLton.Itimer.Prof, ...)`
18189 * `MLton.Signal.setHandler (MLton.Signal.prof, ...)`
18191 Also, because of the random sampling used to implement `-profile
18192 time`, it is best to have a long running program (at least tens of
18193 seconds) in order to get reasonable time
18197 :mlton-guide-page: Projects
18202 We have lots of ideas for projects to improve MLton, many of which we
18203 do not have time to implement, or at least haven't started on yet.
18204 Here is a list of some of those improvements, ranging from the easy (1
18205 week) to the difficult (several months). If you have any interest in
18206 working on one of these, or some other improvement to MLton not listed
18207 here, please send mail to
18208 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
18210 * Port to new platform: Windows (native, not Cygwin or MinGW), ...
18211 * Source-level debugger
18213 * Interfaces to libraries: OpenGL, Gtk+, D-BUS, ...
18214 * More libraries written in SML (see <!ViewGitProj(mltonlib)>)
18215 * Additional constant types: `structure Real80: REAL`, ...
18216 * An IDE (possibly integrated with <:Eclipse:>)
18217 * Port MLRISC and use for code generation
18219 ** Improved closure representation
18221 Right now, MLton's closure conversion algorithm uses a simple flat closure to represent each function.
18223 *** http://www.mlton.org/pipermail/mlton/2003-October/024570.html
18224 *** http://www.mlton.org/pipermail/mlton-user/2007-July/001150.html
18225 *** <!Cite(ShaoAppel94)>
18226 ** Elimination of array bounds checks in loops
18227 ** Elimination of overflow checks on array index computations
18228 ** Common-subexpression elimination of repeated array subscripts
18229 ** Loop-invariant code motion, especially for tuple selects
18230 ** Partial redundancy elimination
18231 *** http://www.mlton.org/pipermail/mlton/2006-April/028598.html
18232 ** Loop unrolling, especially for small loops
18233 ** Auto-vectorization, for MMX/SSE/3DNow!/AltiVec (see the http://gcc.gnu.org/projects/tree-ssa/vectorization.html[work done on GCC])
18234 ** Optimize `MLton_eq`: pointer equality is necessarily false when one of the arguments is freshly allocated in the block
18236 ** Uncaught exception analysis
18240 :mlton-guide-page: Pronounce
18245 Here is <!Attachment(Pronounce,pronounce-mlton.mp3,how "MLton" sounds)>.
18247 "MLton" is pronounced in two syllables, with stress on the first
18248 syllable. The first syllable sounds like the word _mill_ (as in
18249 "steel mill"), the second like the word _tin_ (as in "cookie tin").
18253 :mlton-guide-page: PropertyList
18258 A property list is a dictionary-like data structure into which
18259 properties (name-value pairs) can be inserted and from which
18260 properties can be looked up by name. The term comes from the Lisp
18261 language, where every symbol has a property list for storing
18262 information, and where the names are typically symbols and keys can be
18265 Here is an SML signature for property lists such that for any type of
18266 value a new property can be dynamically created to manipulate that
18267 type of value in a property list.
18271 signature PROPERTY_LIST =
18276 val newProperty: unit -> {add: t * 'a -> unit,
18277 peek: t -> 'a option}
18281 Here is a functor demonstrating the use of property lists. It first
18282 creates a property list, then two new properties (of different types),
18283 and adds a value to the list for each property.
18287 functor Test (P: PROPERTY_LIST) =
18291 val {add = addInt: P.t * int -> unit, peek = peekInt} = P.newProperty ()
18292 val {add = addReal: P.t * real -> unit, peek = peekReal} = P.newProperty ()
18294 val () = addInt (pl, 13)
18295 val () = addReal (pl, 17.0)
18296 val s1 = Int.toString (valOf (peekInt pl))
18297 val s2 = Real.toString (valOf (peekReal pl))
18298 val () = print (concat [s1, " ", s2, "\n"])
Applied to an appropriate implementation of `PROPERTY_LIST`, the `Test`
18303 functor will produce the following output.
18310 == Implementation ==
18312 Because property lists can hold values of any type, their
18313 implementation requires a <:UniversalType:>. Given that, a property
18314 list is simply a list of elements of the universal type. Adding a
18315 property adds to the front of the list, and looking up a property
18320 functor PropertyList (U: UNIVERSAL_TYPE): PROPERTY_LIST =
18322 datatype t = T of U.t list ref
18324 fun new () = T (ref [])
18326 fun 'a newProperty () =
18328 val (inject, out) = U.embed ()
18329 fun add (T r, a: 'a): unit = r := inject a :: (!r)
18331 Option.map (valOf o out) (List.find (isSome o out) (!r))
18333 {add = add, peek = peek}
18339 If `U: UNIVERSAL_TYPE`, then we can test our code as follows.
18343 structure Z = Test (PropertyList (U))
18346 Of course, a serious implementation of property lists would have to
18347 handle duplicate insertions of the same property, as well as the
18348 removal of elements in order to avoid space leaks.
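One way to address both points is sketched below. It is self-contained, so it builds its own universal type from `exn` rather than relying on a given `U`. The structure names `Univ` and `PL` and the extra `remove` field are assumptions made for this example; MLton's own property lists are implemented differently.

```sml
(* Assumed universal type: exn is extensible, so each call to embed
   declares a fresh exception constructor carrying an 'a. *)
structure Univ =
   struct
      type t = exn
      fun 'a embed (): ('a -> t) * (t -> 'a option) =
         let
            exception E of 'a
         in
            (E, fn E a => SOME a | _ => NONE)
         end
   end

structure PL =
   struct
      datatype t = T of Univ.t list ref

      fun new () = T (ref [])

      fun 'a newProperty () =
         let
            val (inject, out) = Univ.embed ()
            (* Keep only the elements that do not belong to this property. *)
            fun others l = List.filter (not o isSome o out) l
            (* Replace any earlier value instead of stacking duplicates. *)
            fun add (T r, a: 'a): unit = r := inject a :: others (!r)
            fun peek (T r): 'a option =
               case List.mapPartial out (!r) of
                  [] => NONE
                | a :: _ => SOME a
            (* Drop the value so that it no longer occupies space. *)
            fun remove (T r): unit = r := others (!r)
         in
            {add = add, peek = peek, remove = remove}
         end
   end

val pl = PL.new ()
val {add = addInt: PL.t * int -> unit, peek = peekInt, remove = removeInt} =
   PL.newProperty ()
val () = addInt (pl, 13)
val () = addInt (pl, 14)    (* replaces 13 *)
val a = peekInt pl          (* SOME 14 *)
val () = removeInt pl
val b = peekInt pl          (* NONE *)
```

Each `add` and `remove` scans the whole list, so this sketch trades speed for simplicity.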
18352 * MLton relies heavily on property lists for attaching information to
18353 syntax tree nodes in its intermediate languages. See
18354 <!ViewGitFile(mlton,master,lib/mlton/basic/property-list.sig)> and
18355 <!ViewGitFile(mlton,master,lib/mlton/basic/property-list.fun)>.
18357 * The <:MLRISCLibrary:> <!Cite(LeungGeorge99, uses property lists
18362 :mlton-guide-page: Pygments
18367 http://pygments.org/[Pygments] is a generic syntax highlighter. Here is a _lexer_ for highlighting
18368 <:StandardML: Standard ML>.
18370 * <!ViewGitDir(mlton,master,ide/pygments/sml_lexer)> -- Provides highlighting of keywords, special constants, and (nested) comments.
18372 == Install and use ==
* Check out all files and install as a http://pygments.org/[Pygments] plugin.
18376 $ git clone https://github.com/MLton/mlton.git mlton
18377 $ cd mlton/ide/pygments
18378 $ python setup.py install
18381 * Invoke `pygmentize` with `-l sml`.
18385 Comments and suggestions should be directed to <:MatthewFluet:>.
18389 :mlton-guide-page: RayRacine
18394 Using SML in some _Semantic Web_ stuff. Anyone interested in
18395 similar, please contact me. GreyLensman on #sml on IRC or rracine at
18396 this domain adelphia with a dot here net.
18398 Current areas of coding.
18400 . Pretty solid, high performance Rete implementation - base functionality is complete.
18401 . N3 parser - mostly complete
18402 . RDF parser based on fxg - not started.
18403 . Swerve HTTP server - 1/2 done.
18404 . SPARQL implementation - not started.
. Persistent engine based on Berkeley DB - not started.
18406 . Native implementation of Postgresql protocol - underway, ways to go.
18407 . I also have a small change to the MLton compiler to add ++PackWord__<N>__++ - changes compile but needs some more work, clean-up and unit tests.
18411 :mlton-guide-page: Reachability
18416 Reachability is a notion dealing with the graph of heap objects
18417 maintained at runtime. Nodes in the graph are heap objects and edges
18418 correspond to the pointers between heap objects. As the program runs,
18419 it allocates new objects (adds nodes to the graph), and those new
18420 objects can contain pointers to other objects (new edges in the
18421 graph). If the program uses mutable objects (refs or arrays), it can
18422 also change edges in the graph.
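The graph view can be modeled directly. The sketch below is only an illustration and not how MLton's runtime represents the heap: nodes are array indices, the (hypothetical) `edges` array holds each node's outgoing pointers, and `reachable` marks every node that can be reached from a list of starting nodes.

```sml
(* Hypothetical heap model: node i points to the nodes in Array.sub (edges, i). *)
fun reachable (edges: int list array, roots: int list): bool array =
   let
      val marked = Array.array (Array.length edges, false)
      fun visit i =
         if Array.sub (marked, i)
            then ()
         else (Array.update (marked, i, true)
               ; List.app visit (Array.sub (edges, i)))
   in
      List.app visit roots
      ; marked
   end

(* Node 0 -> 1 -> 2; node 3 points to 0, but nothing points to 3. *)
val edges = Array.fromList [[1], [2], [], [0]]
val marks1 = reachable (edges, [0])   (* nodes 0, 1, 2 marked; 3 not *)

(* Mutation changes an edge: node 1 no longer points to node 2. *)
val () = Array.update (edges, 1, [])
val marks2 = reachable (edges, [0])   (* only nodes 0 and 1 marked *)
```

After the update, node 2 still exists in the array but can no longer be reached from node 0.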
18424 At any time, the program has access to some finite set of _root_
18425 nodes, and can only ever access nodes that are reachable by following
18426 edges from these root nodes. Nodes that are _unreachable_ can be
18431 * <:MLtonFinalizable:>
18436 :mlton-guide-page: Redundant
18441 <:Redundant:> is an optimization pass for the <:SSA:>
18442 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
18446 The redundant SSA optimization eliminates redundant function and label
18447 arguments; an argument of a function or label is redundant if it is
18448 always the same as another argument of the same function or label.
18449 The analysis finds an equivalence relation on the arguments of a
18450 function or label, such that all arguments in an equivalence class are
18451 redundant with respect to the other arguments in the equivalence
18452 class; the transformation selects one representative of each
18453 equivalence class and drops the binding occurrence of
18454 non-representative variables and renames use occurrences of the
18455 non-representative variables to the representative variable. The
18456 analysis finds the equivalence classes via a fixed-point analysis.
18457 Each vector of arguments to a function or label is initialized to
18458 equivalence classes that equate all arguments of the same type; one
18459 could start with an equivalence class that equates all arguments, but
18460 arguments of different type cannot be redundant. Variables bound in
18461 statements are initialized to singleton equivalence classes. The
18462 fixed-point analysis repeatedly refines these equivalence classes on
18463 the formals by the equivalence classes of the actuals.
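The refinement can be sketched for a single function as follows. This is a simplified model and not the code in `redundant.fun`: it ignores types, handles one function at a time, and names other variables by integers. The names `actual`, `Formal`, `Var`, and `refine` are assumptions made for this example.

```sml
(* An actual argument is either a formal of the same function (as in a
   recursive call) or some other variable, identified here by an int. *)
datatype actual = Formal of int | Var of int

(* refine (n, calls) maps each of the n formals to a class id; formals
   with the same id receive equivalent actuals at every call. *)
fun refine (n: int, calls: actual vector list): int vector =
   let
      val cls = ref (Vector.tabulate (n, fn _ => 0))   (* one coarse class *)
      fun key a =
         case a of
            Formal k => (0, Vector.sub (!cls, k))   (* class of that formal *)
          | Var v => (1, v)                         (* singleton class *)
      (* A formal's old class together with the classes of its actuals. *)
      fun shape i =
         (Vector.sub (!cls, i),
          List.map (fn c => key (Vector.sub (c, i))) calls)
      (* Number distinct shapes in order of first appearance. *)
      fun number (s, (seen, ids)) =
         case List.find (fn (s', _) => s' = s) seen of
            SOME (_, id) => (seen, id :: ids)
          | NONE => ((s, length seen) :: seen, length seen :: ids)
      (* Repeat until no class is split any further. *)
      fun loop count =
         let
            val (seen, ids) =
               List.foldl number ([], []) (List.tabulate (n, shape))
         in
            cls := Vector.fromList (List.rev ids)
            ; if length seen = count then () else loop (length seen)
         end
   in
      loop (if n = 0 then 0 else 1)
      ; !cls
   end

(* f (a, b, c) is called as f (x7, x7, x8) and recursively as f (a, b, c). *)
val calls = [Vector.fromList [Var 7, Var 7, Var 8],
             Vector.fromList [Formal 0, Formal 1, Formal 2]]
val classes = refine (3, calls)   (* classes 0, 0, 1: a and b are redundant *)
```

Starting from one coarse class and only ever splitting mirrors the description above; the iteration matters when a recursive call passes formals in a different order.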
18465 == Implementation ==
18467 * <!ViewGitFile(mlton,master,mlton/ssa/redundant.fun)>
18469 == Details and Notes ==
<:Redundant:> was added because the output of the
<:ClosureConvert:> pass sometimes passed the environment record, or
components of it, around in several places. That may have
18474 been more relevant with polyvariant analyses (which are long gone).
18475 But it still seems possibly relevant, especially with more aggressive
18476 flattening, which should reveal some fields in nested closure records
18477 that are redundant.
18481 :mlton-guide-page: RedundantTests
18486 <:RedundantTests:> is an optimization pass for the <:SSA:>
18487 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
18491 This pass simplifies conditionals whose results are implied by a
18492 previous conditional test.
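
As a source-level illustration only (the pass itself works on SSA, and whether it catches this exact shape depends on how the comparisons are lowered), the two hypothetical functions below show a conditional whose outcome is implied by an earlier test, and the simpler form it could be reduced to.

[source,sml]
----
(* Before: once `x < lo` has failed, `x >= lo` is known to hold, so the
   inner test and its else-branch carry no information. *)
fun clampBefore (lo : int, x : int) =
  if x < lo
    then lo
    else if x >= lo then x else raise Fail "unreachable"

(* After: the implied test has been removed. *)
fun clampAfter (lo : int, x : int) =
  if x < lo then lo else x

(* Both versions agree on every input tried here. *)
val agree =
  List.all (fn x => clampBefore (0, x) = clampAfter (0, x)) [~2, ~1, 0, 1, 2]
----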
18494 == Implementation ==
18496 * <!ViewGitFile(mlton,master,mlton/ssa/redundant-tests.fun)>
18498 == Details and Notes ==
18500 An additional test will sometimes eliminate the overflow test when
18501 adding or subtracting 1. In particular, it will eliminate it in the
18512 :mlton-guide-page: References
18544 == <!Anchor(AAA)>A ==
18546 * <!Anchor(AcarEtAl06)>
18547 http://www.umut-acar.org/publications/pldi2006.pdf[An Experimental Analysis of Self-Adjusting Computation]
18548 Umut Acar, Guy Blelloch, Matthias Blume, and Kanat Tangwongsan.
18551 * <!Anchor(Appel92)>
18552 http://us.cambridge.org/titles/catalogue.asp?isbn=0521416957[Compiling with Continuations]
18553 (http://www.addall.com/New/submitNew.cgi?query=0-521-41695-7&type=ISBN&location=10000&state=&dispCurr=USD[addall]).
18556 Cambridge University Press, 1992.
18558 * <!Anchor(Appel93)>
18559 http://www.cs.princeton.edu/research/techreps/TR-364-92[A Critique of Standard ML].
18563 * <!Anchor(Appel98)>
18564 http://us.cambridge.org/titles/catalogue.asp?isbn=0521582741[Modern Compiler Implementation in ML]
18565 (http://www.addall.com/New/submitNew.cgi?query=0-521-58274-1&type=ISBN&location=10000&state=&dispCurr=USD[addall]).
18568 Cambridge University Press, 1998.
18570 * <!Anchor(AppelJim97)>
18571 http://ncstrl.cs.princeton.edu/expand.php?id=TR-556-97[Shrinking Lambda Expressions in Linear Time]
18572 Andrew Appel and Trevor Jim.
18575 * <!Anchor(AppelEtAl94)>
18576 http://www.smlnj.org/doc/ML-Lex/manual.html[A lexical analyzer generator for Standard ML. Version 1.6.0]
18577 Andrew W. Appel, James S. Mattson, and David R. Tarditi. 1994
18579 == <!Anchor(BBB)>B ==
18581 * <!Anchor(BaudinetMacQueen85)>
18582 http://www.classes.cs.uchicago.edu/archive/2011/spring/22620-1/papers/macqueen-baudinet85.pdf[Tree Pattern Matching for ML].
18583 Marianne Baudinet, David MacQueen. 1985.
18586 Describes the match compiler used in an early version of
18590 * <!Anchor(BentonEtAl98)>
18591 http://research.microsoft.com/en-us/um/people/nick/icfp98.pdf[Compiling Standard ML to Java Bytecodes].
18592 Nick Benton, Andrew Kennedy, and George Russell.
18595 * <!Anchor(BentonKennedy99)>
18596 http://research.microsoft.com/en-us/um/people/nick/SMLJavaInterop.pdf[Interlanguage Working Without Tears: Blending SML with Java].
18597 Nick Benton and Andrew Kennedy.
18600 * <!Anchor(BentonKennedy01)>
18601 http://research.microsoft.com/en-us/um/people/akenn/sml/ExceptionalSyntax.pdf[Exceptional Syntax].
18602 Nick Benton and Andrew Kennedy.
18605 * <!Anchor(BentonEtAl04)>
18606 http://research.microsoft.com/en-us/um/people/nick/p53-Benton.pdf[Adventures in Interoperability: The SML.NET Experience].
18607 Nick Benton, Andrew Kennedy, and Claudio Russo.
18610 * <!Anchor(BentonEtAl04_2)>
18611 http://research.microsoft.com/en-us/um/people/nick/shrinking.pdf[Shrinking Reductions in SML.NET].
18612 Nick Benton, Andrew Kennedy, Sam Lindley and Claudio Russo.
18616 Describes a linear-time implementation of an
18617 <!Cite(AppelJim97,Appel-Jim shrinker)>, using a mutable IL, and shows
18618 that it yields nice speedups in SML.NET's compile times. There are
18619 also benchmarks showing that SML.NET when compiled by MLton runs
18620 roughly five times faster than when compiled by SML/NJ.
18623 * <!Anchor(Benton05)>
18624 http://research.microsoft.com/en-us/um/people/nick/benton03.pdf[Embedded Interpreters].
18628 * <!Anchor(Berry91)>
18629 http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-148/ECS-LFCS-91-148.pdf[The Edinburgh SML Library].
18631 University of Edinburgh Technical Report ECS-LFCS-91-148, 1991.
18633 * <!Anchor(BerryEtAl93)>
18634 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.7958&rep=rep1&type=ps[A semantics for ML concurrency primitives].
18635 Dave Berry, Robin Milner, and David N. Turner.
18638 * <!Anchor(Berry93)>
18639 http://journals.cambridge.org/abstract_S0956796800000873[Lessons From the Design of a Standard ML Library].
18643 * <!Anchor(Bertelsen98)>
18644 http://www.petermb.dk/sml2jvm.ps.gz[Compiling SML to Java Bytecode].
18646 Master's Thesis, 1998.
18648 * <!Anchor(Berthomieu00)>
18649 http://homepages.laas.fr/bernard/oo/ooml.html[OO Programming styles in ML].
18650 Bernard Berthomieu.
18651 LAAS Report #2000111, 2000.
18653 * <!Anchor(Blume01)>
18654 http://people.cs.uchicago.edu/~blume/papers/nlffi-entcs.pdf[No-Longer-Foreign: Teaching an ML compiler to speak C "natively"].
18658 * <!Anchor(Blume01_02)>
18659 http://people.cs.uchicago.edu/~blume/pgraph/proposal.pdf[Portable library descriptions for Standard ML].
18660 Matthias Blume. 2001.
18662 * <!Anchor(Boehm03)>
18663 http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html[Destructors, Finalizers, and Synchronization].
18668 Discusses a number of issues in the design of finalizers. Many of the
18669 design choices are consistent with <:MLtonFinalizable:>.
18672 == <!Anchor(CCC)>C ==
18674 * <!Anchor(CejtinEtAl00)>
18675 http://www.cs.purdue.edu/homes/suresh/papers/icfp99.ps.gz[Flow-directed Closure Conversion for Typed Languages].
18676 Henry Cejtin, Suresh Jagannathan, and Stephen Weeks.
18680 Describes MLton's closure-conversion algorithm, which translates from
18681 its simply-typed higher-order intermediate language to its
18682 simply-typed first-order intermediate language.
18685 * <!Anchor(ChengBlelloch01)>
18686 http://www.cs.cmu.edu/afs/cs/project/pscico/pscico/papers/gc01/pldi-final.pdf[A Parallel, Real-Time Garbage Collector].
18687 Perry Cheng and Guy E. Blelloch.
18690 * <!Anchor(Claessen00)>
18691 http://users.eecs.northwestern.edu/~robby/courses/395-495-2009-fall/quick.pdf[QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs].
18692 Koen Claessen and John Hughes.
18695 * <!Anchor(Clinger98)>
18696 http://www.cesura17.net/~will/Professional/Research/Papers/tail.pdf[Proper Tail Recursion and Space Efficiency].
18697 William D. Clinger.
18700 * <!Anchor(CooperMorrisett90)>
18701 http://www.eecs.harvard.edu/~greg/papers/jgmorris-mlthreads.ps[Adding Threads to Standard ML].
18702 Eric C. Cooper and J. Gregory Morrisett.
18703 CMU Technical Report CMU-CS-90-186, 1990.
18705 * <!Anchor(CouttsEtAl07)>
18706 http://metagraph.org/papers/stream_fusion.pdf[Stream Fusion: From Lists to Streams to Nothing at All].
18707 Duncan Coutts, Roman Leshchinskiy, and Don Stewart.
18708 Submitted for publication. April 2007.
18710 == <!Anchor(DDD)>D ==
18712 * <!Anchor(DamasMilner82)>
18713 http://groups.csail.mit.edu/pag/6.883/readings/p207-damas.pdf[Principal Type-Schemes for Functional Programs].
18714 Luis Damas and Robin Milner.
18717 * <!Anchor(Danvy98)>
18718 http://www.brics.dk/RS/98/12[Functional Unparsing].
18720 BRICS Technical Report RS 98-12, 1998.
18722 * <!Anchor(Deboer05)>
18723 http://alleystoughton.us/eXene/dusty-thesis.pdf[Exhancements to eXene].
18725 Master of Science Thesis, 2005.
18728 Describes ways to improve widget concurrency, handling of input focus,
18729 X resources and selections.
18732 * <!Anchor(DoligezLeroy93)>
18733 http://cristal.inria.fr/~doligez/publications/doligez-leroy-popl-1993.pdf[A Concurrent, Generational Garbage Collector for a Multithreaded Implementation of ML].
18734 Damien Doligez and Xavier Leroy.
18737 * <!Anchor(Dreyer07)>
18738 http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf[Modular Type Classes].
18739 Derek Dreyer, Robert Harper, Manuel M.T. Chakravarty, Gabriele Keller.
18740 University of Chicago Technical Report TR-2007-02, 2006.
18742 * <!Anchor(DreyerBlume07)>
18743 http://www.mpi-sws.org/~dreyer/papers/infmod/main-long.pdf[Principal Type Schemes for Modular Programs].
18744 Derek Dreyer and Matthias Blume.
18747 * <!Anchor(Dubois95)>
18748 ftp://ftp.inria.fr/INRIA/Projects/cristal/Francois.Rouaix/generics.dvi.Z[Extensional Polymorphism].
18749 Catherine Dubois, Francois Rouaix, and Pierre Weis.
18753 An extension of ML that allows the definition of ad-hoc polymorphic
18754 functions by inspecting the type of their argument.
18757 == <!Anchor(EEE)>E ==
18759 * <!Anchor(Elsman03)>
18760 http://www.elsman.com/tldi03.pdf[Garbage Collection Safety for Region-based Memory Management].
18764 * <!Anchor(Elsman04)>
18765 http://www.elsman.com/ITU-TR-2004-43.pdf[Type-Specialized Serialization with Sharing].
18766 Martin Elsman. University of Copenhagen. IT University Technical
18767 Report TR-2004-43, 2004.
18769 == <!Anchor(FFF)>F ==
18771 * <!Anchor(FelleisenFreidman98)>
18772 http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=4787[The Little MLer]
18773 (http://www3.addall.com/New/submitNew.cgi?query=026256114X&type=ISBN[addall]).
18775 Matthias Felleisen and Dan Friedman.
18776 The MIT Press, 1998.
18778 * <!Anchor(FlattFindler04)>
18779 http://www.cs.utah.edu/plt/kill-safe/[Kill-Safe Synchronization Abstractions].
18780 Matthew Flatt and Robert Bruce Findler.
18783 * <!Anchor(FluetWeeks01)>
18784 http://www.cs.rit.edu/~mtf/research/contification[Contification Using Dominators].
18785 Matthew Fluet and Stephen Weeks.
18789 Describes contification, a generalization of tail-recursion
18790 elimination that is an optimization operating on MLton's static single
18791 assignment (SSA) intermediate language.
18794 * <!Anchor(FluetPucella06)>
18795 http://www.cs.rit.edu/~mtf/research/phantom-subtyping/jfp06/jfp06.pdf[Phantom Types and Subtyping].
18796 Matthew Fluet and Riccardo Pucella.
18799 * <!Anchor(Furuse01)>
18800 http://jfla.inria.fr/2001/actes/07-furuse.ps[Generic Polymorphism in ML].
18805 The formalism behind G'CAML, which has an approach to ad-hoc
18806 polymorphism based on <!Cite(Dubois95)>, the differences being in how
18807 type checking works and an improved compilation approach for typecase
18808 that does the matching at compile time, not run time.
18811 == <!Anchor(GGG)>G ==
18813 * <!Anchor(GansnerReppy93)>
18814 http://alleystoughton.us/eXene/1993-trends.pdf[A Multi-Threaded Higher-order User Interface Toolkit].
18815 Emden R. Gansner and John H. Reppy.
18816 User Interface Software, 1993.
18818 * <!Anchor(GansnerReppy04)>
18819 http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/standard-ml-basis-library[The Standard ML Basis Library].
18820 (http://www3.addall.com/New/submitNew.cgi?query=9780521794787&type=ISBN[addall])
18821 ISBN 9780521794787.
18822 Emden R. Gansner and John H. Reppy.
18823 Cambridge University Press, 2004.
18826 An introduction and overview of the <:BasisLibrary:Basis Library>,
18827 followed by a detailed description of each module. The module
18828 descriptions are also available
18829 http://www.standardml.org/Basis[online].
18832 * <!Anchor(GrossmanEtAl02)>
18833 http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf[Region-based Memory Management in Cyclone].
18834 Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling
18835 Wang, and James Cheney.
18838 == <!Anchor(HHH)>H ==
18840 * <!Anchor(HallenbergEtAl02)>
18841 http://www.itu.dk/people/tofte/publ/pldi2002.pdf[Combining Region Inference and Garbage Collection].
18842 Niels Hallenberg, Martin Elsman, and Mads Tofte.
18845 * <!Anchor(HansenRichel99)>
18846 http://www.it.dtu.dk/introSML[Introduction to Programming Using SML]
18847 (http://www3.addall.com/New/submitNew.cgi?query=0201398206&type=ISBN[addall]).
18849 Michael R. Hansen, Hans Rischel.
18850 Addison-Wesley, 1999.
18852 * <!Anchor(Harper11)>
18853 http://www.cs.cmu.edu/~rwh/smlbook/book.pdf[Programming in Standard ML].
18856 * <!Anchor(HarperEtAl93)>
18857 http://www.cs.cmu.edu/~rwh/papers/callcc/jfp.pdf[Typing First-Class Continuations in ML].
18858 Robert Harper, Bruce F. Duba, and David MacQueen.
18861 * <!Anchor(HarperMitchell92)>
18862 http://www.cs.cmu.edu/~rwh/papers/xml/toplas93.pdf[On the Type Structure of Standard ML].
18863 Robert Harper and John C. Mitchell.
18866 * <!Anchor(HauserBenson04)>
18867 http://doi.ieeecomputersociety.org/10.1109/CSD.2004.1309122[On the Practicality and Desirability of Highly-concurrent, Mostly-functional Programming].
18868 Carl H. Hauser and David B. Benson.
18872 Describes the use of <:ConcurrentML: Concurrent ML> in implementing
18873 the Ped text editor. Argues that using large numbers of threads and
18874 message passing style is a practical and effective way of
18875 modularizing a program.
18878 * <!Anchor(HeckmanWilhelm97)>
18879 http://rw4.cs.uni-sb.de/~heckmann/abstracts/neuform.html[A Functional Description of TeX's Formula Layout].
18880 Reinhold Heckmann and Reinhard Wilhelm.
18883 * <!Anchor(HicksEtAl03)>
18884 http://wwwold.cs.umd.edu/Library/TRs/CS-TR-4514/CS-TR-4514.pdf[Safe and Flexible Memory Management in Cyclone].
18885 Mike Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim.
18886 University of Maryland Technical Report CS-TR-4514, 2003.
18888 * <!Anchor(Hurd04)>
18889 http://www.gilith.com/research/talks/tphols2004.pdf[Compiling HOL4 to Native Code].
18894 Describes a port of HOL from Moscow ML to MLton, the difficulties
18895 encountered in compiling large programs, and the speedups achieved
18899 == <!Anchor(III)>I ==
18903 == <!Anchor(JJJ)>J ==
18905 * <!Anchor(Jones99)>
18906 http://www.cs.kent.ac.uk/people/staff/rej/gcbook[Garbage Collection: Algorithms for Automatic Memory Management]
18907 (http://www3.addall.com/New/submitNew.cgi?query=0471941484&type=ISBN[addall]).
18910 John Wiley & Sons, 1999.
18912 == <!Anchor(KKK)>K ==
18914 * <!Anchor(Kahrs93)>
18915 http://kar.kent.ac.uk/21122/[Mistakes and Ambiguities in the Definition of Standard ML].
18917 University of Edinburgh Technical Report ECS-LFCS-93-257, 1993.
18920 Describes a number of problems with the
18921 <!Cite(MilnerEtAl90,1990 Definition)>, many of which were fixed in the
18922 <!Cite(MilnerEtAl97,1997 Definition)>.
18924 Also see the http://www.cs.kent.ac.uk/~smk/errors-new.ps.Z[addenda]
18928 * <!Anchor(Karvonen07)>
18929 http://dl.acm.org/citation.cfm?doid=1292535.1292547[Generics for the Working ML'er].
18931 <:#ML:> 2007. http://research.microsoft.com/~crusso/ml2007/slides/ml08rp-karvonen-slides.pdf[Slides] from the presentation are also available.
18933 * <!Anchor(Kennedy04)>
18934 http://research.microsoft.com/~akenn/fun/picklercombinators.pdf[Pickler Combinators].
18938 * <!Anchor(KoserEtAl03)>
18939 http://www.litech.org/~vaughan/pdf/dpcool2003.pdf[sml2java: A Source To Source Translator].
18940 Justin Koser, Haakon Larsen, Jeffrey A. Vaughan.
18943 == <!Anchor(LLL)>L ==
18945 * <!Anchor(Lang99)>
18946 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.7130&rep=rep1&type=ps[Faster Algorithms for Finding Minimal Consistent DFAs].
18949 * <!Anchor(LarsenNiss04)>
18950 http://usenix.org/publications/library/proceedings/usenix04/tech/freenix/full_papers/larsen/larsen.pdf[mGTK: An SML binding of Gtk+].
18951 Ken Larsen and Henning Niss.
18952 USENIX Annual Technical Conference, 2004.
18954 * <!Anchor(Leibig13)>
18955 http://www.cs.rit.edu/~bal6053/msproject/[An LLVM Back-end for MLton].
18957 MS Project Report, 2013.
18960 Describes MLton's <:LLVMCodegen:>.
18963 * <!Anchor(Leroy90)>
18964 http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-ZINC.html[The ZINC Experiment: an Economical Implementation of the ML Language].
18966 Technical report 117, INRIA, 1990.
18969 A detailed explanation of the design and implementation of a bytecode
18970 compiler and interpreter for ML with a machine model aimed at
18971 efficient implementation.
18974 * <!Anchor(Leroy93)>
18975 http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-poly-par-nom.html[Polymorphism by Name for References and Continuations].
18979 * <!Anchor(LeungGeorge99)>
18980 http://www.cs.nyu.edu/leunga/my-papers/annotations.ps[MLRISC Annotations].
18981 Allen Leung and Lal George. 1999.
18983 == <!Anchor(MMM)>M ==
18985 * <!Anchor(MarlowEtAl01)>
18986 http://community.haskell.org/~simonmar/papers/async.pdf[Asynchronous Exceptions in Haskell].
18987 Simon Marlow, Simon Peyton Jones, Andy Moran and John Reppy.
18991 An asynchronous exception is a signal that one thread can send to
18992 another, and is useful for the receiving thread to treat as an
18993 exception so that it can clean up locks or other state relevant to its
18997 * <!Anchor(MacQueenEtAl84)>
18998 http://homepages.inf.ed.ac.uk/gdp/publications/Ideal_model.pdf[An Ideal Model for Recursive Polymorphic Types].
18999 David MacQueen, Gordon Plotkin, Ravi Sethi.
19002 * <!Anchor(Matthews91)>
19003 http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-174[A Distributed Concurrent Implementation of Standard ML].
19005 University of Edinburgh Technical Report ECS-LFCS-91-174, 1991.
19007 * <!Anchor(Matthews95)>
19008 http://www.lfcs.inf.ed.ac.uk/reports/95/ECS-LFCS-95-335[Papers on Poly/ML].
19009 David C. J. Matthews.
19010 University of Edinburgh Technical Report ECS-LFCS-95-335, 1995.
19012 * http://www.lfcs.inf.ed.ac.uk/reports/97/ECS-LFCS-97-375[That About Wraps it Up: Using FIX to Handle Errors Without Exceptions, and Other Programming Tricks].
19014 University of Edinburgh Technical Report ECS-LFCS-97-375, 1997.
19016 * <!Anchor(MeierNorgaard93)>
19017 A Just-In-Time Backend for Moscow ML 2.00 in SML.
19018 Bjarke Meier, Kristian Nørgaard.
19019 Master's Thesis, 2003.
19022 A just-in-time compiler using GNU Lightning, showing a speedup of up
19023 to four times over Moscow ML's usual bytecode interpreter.
19025 The full report is only available in
19026 http://www.itu.dk/stud/speciale/bmkn/fundanemt/download/report[Danish].
19029 * <!Anchor(Milner78)>
19030 http://courses.engr.illinois.edu/cs421/sp2013/project/milner-polymorphism.pdf[A Theory of Type Polymorphism in Programming].
19032 Journal of Computer and System Sciences, 1978.
19034 * <!Anchor(Milner82)>
19035 http://homepages.inf.ed.ac.uk/dts/fps/papers/evolved.dvi.gz[How ML Evolved].
19037 Polymorphism--The ML/LCF/Hope Newsletter, 1983.
19039 * <!Anchor(MilnerTofte91)>
19040 http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[Commentary on Standard ML]
19041 (http://www3.addall.com/New/submitNew.cgi?query=0262631377&type=ISBN[addall])
19043 Robin Milner and Mads Tofte.
19044 The MIT Press, 1991.
19047 Introduces and explains the notation and approach used in
19048 <!Cite(MilnerEtAl90,The Definition of Standard ML)>.
19051 * <!Anchor(MilnerEtAl90)>
19052 http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[The Definition of Standard ML].
19053 (http://www3.addall.com/New/submitNew.cgi?query=0262631326&type=ISBN[addall])
19055 Robin Milner, Mads Tofte, and Robert Harper.
19056 The MIT Press, 1990.
19059 Superseded by <!Cite(MilnerEtAl97,The Definition of Standard ML (Revised))>.
19060 Accompanied by the <!Cite(MilnerTofte91,Commentary on Standard ML)>.
19063 * <!Anchor(MilnerEtAl97)>
19064 http://mitpress.mit.edu/books/definition-standard-ml[The Definition of Standard ML (Revised)].
19065 (http://www3.addall.com/New/submitNew.cgi?query=0262631814&type=ISBN[addall])
19067 Robin Milner, Mads Tofte, Robert Harper, and David MacQueen.
19068 The MIT Press, 1997.
19071 A terse and formal specification of Standard ML's syntax and
19072 semantics. Supersedes <!Cite(MilnerEtAl90,The Definition of Standard ML)>.
19075 * <!Anchor(ML2000)>
19076 http://flint.cs.yale.edu/flint/publications/ml2000.html[Principles and a Preliminary Design for ML2000].
19077 The ML2000 working group, 1999.
19079 * <!Anchor(Morentsen99)>
19080 http://daimi.au.dk/CPnets/workshop99/papers/Mortensen.pdf[Automatic Code Generation from Coloured Petri Nets for an Access Control System].
19081 Kjeld H. Mortensen.
19082 Workshop on Practical Use of Coloured Petri Nets and Design/CPN, 1999.
19084 * <!Anchor(MorrisettTolmach93)>
19085 http://web.cecs.pdx.edu/~apt/ppopp93.ps[Procs and Locks: a Portable Multiprocessing Platform for Standard ML of New Jersey].
19086 J{empty}. Gregory Morrisett and Andrew Tolmach.
19089 * <!Anchor(Murphy06)>
19090 http://www.cs.cmu.edu/~tom7/papers/grid-ml06.pdf[ML Grid Programming with ConCert].
19094 == <!Anchor(NNN)>N ==
19096 * <!Anchor(Neumann99)>
19097 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.9485&rep=rep1&type=ps[fxp - Processing Structured Documents in SML].
19099 Scottish Functional Programming Workshop, 1999.
19102 Describes http://atseidl2.informatik.tu-muenchen.de/~berlea/Fxp[fxp],
19103 an XML parser implemented in Standard ML.
19106 * <!Anchor(Neumann99Thesis)>
19107 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.8108&rep=rep1&type=ps[Parsing and Querying XML Documents in SML].
19109 Doctoral Thesis, 1999.
19111 * <!Anchor(NguyenOhori06)>
19112 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/NguyenOhoriPPDP06.pdf[Compiling ML Polymorphism with Explicit Layout Bitmap].
19113 Huu-Duc Nguyen and Atsushi Ohori.
19116 == <!Anchor(OOO)>O ==
19118 * <!Anchor(Okasaki99)>
19119 http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/purely-functional-data-structures[Purely Functional Data Structures].
19120 ISBN 9780521663502.
19122 Cambridge University Press, 1999.
19124 * <!Anchor(Ohori89)>
19125 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/fpca89.pdf[A Simple Semantics for ML Polymorphism].
19129 * <!Anchor(Ohori95)>
19130 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/toplas95.pdf[A Polymorphic Record Calculus and Its Compilation].
19134 * <!Anchor(OhoriTakamizawa97)>
19135 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/jlsc97.pdf[An Unboxed Operational Semantics for ML Polymorphism].
19136 Atsushi Ohori and Tomonobu Takamizawa.
19139 * <!Anchor(Ohori99)>
19140 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/ic98.pdf[Type-Directed Specialization of Polymorphism].
19144 * <!Anchor(OwensEtAl09)>
19145 http://www.mpi-sws.org/~turon/re-deriv.pdf[Regular-expression derivatives reexamined].
19146 Scott Owens, John Reppy, and Aaron Turon.
19149 == <!Anchor(PPP)>P ==
19151 * <!Anchor(Paulson96)>
19152 http://www.cambridge.org/co/academic/subjects/computer-science/programming-languages-and-applied-logic/ml-working-programmer-2nd-edition[ML For the Working Programmer]
19153 (http://www3.addall.com/New/submitNew.cgi?query=052156543X&type=ISBN[addall])
19156 Cambridge University Press, 1996.
19158 * <!Anchor(PetterssonEtAl02)>
19159 http://user.it.uu.se/~kostis/Papers/flops02_22.ps.gz[The HiPE/x86 Erlang Compiler: System Description and Performance Evaluation].
19160 Mikael Pettersson, Konstantinos Sagonas, and Erik Johansson.
19164 Describes a native x86 Erlang compiler and a comparison of many
19165 different native x86 compilers (including MLton) and their register
19166 usage and call stack implementations.
19169 * <!Anchor(Price09)>
19170 http://rogerprice.org/#UG[User's Guide to ML-Lex and ML-Yacc]
19173 * <!Anchor(Pucella98)>
19174 http://arxiv.org/abs/cs.PL/0405080[Reactive Programming in Standard ML].
19175 Riccardo R. Pucella. 1998.
19178 == <!Anchor(QQQ)>Q ==
19182 == <!Anchor(RRR)>R ==
19184 * <!Anchor(Ramsey90)>
19185 https://www.cs.princeton.edu/research/techreps/TR-262-90[Concurrent Programming in ML].
19187 Princeton University Technical Report CS-TR-262-90, 1990.
19189 * <!Anchor(Ramsey11)>
19190 http://www.cs.tufts.edu/~nr/pubs/embedj-abstract.html[Embedding an Interpreted Language Using Higher-Order Functions and Types].
19194 * <!Anchor(RamseyFisherGovereau05)>
19195 http://www.cs.tufts.edu/~nr/pubs/els-abstract.html[An Expressive Language of Signatures].
19196 Norman Ramsey, Kathleen Fisher, and Paul Govereau.
19199 * <!Anchor(RedwineRamsey04)>
19200 http://www.cs.tufts.edu/~nr/pubs/widen-abstract.html[Widening Integer Arithmetic].
19201 Kevin Redwine and Norman Ramsey.
19205 Describes a method to implement numeric types and operations (like
19206 `Int31` or `Word17`) for sizes smaller than that provided by the
19210 * <!Anchor(Reppy88)>
19211 Synchronous Operations as First-Class Values.
19215 * <!Anchor(Reppy07)>
19216 http://www.cambridge.org/co/academic/subjects/computer-science/distributed-networked-and-mobile-computing/concurrent-programming-ml[Concurrent Programming in ML]
19217 (http://www3.addall.com/New/submitNew.cgi?query=9780521714723&type=ISBN[addall]).
19218 ISBN 9780521714723.
19220 Cambridge University Press, 2007.
19223 Describes <:ConcurrentML:>.
19226 * <!Anchor(Reynolds98)>
19227 https://users-cs.au.dk/hosc/local/HOSC-11-4-pp355-361.pdf[Definitional Interpreters Revisited].
19231 * <!Anchor(Reynolds98_2)>
19232 https://users-cs.au.dk/hosc/local/HOSC-11-4-pp363-397.pdf[Definitional Interpreters for Higher-Order Programming Languages]
19236 * <!Anchor(Rossberg01)>
19237 http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20Defects%20in%20the%20Revised%20Definition%20of%20Standard%20ML%20%5B2007-01-22%20Update%5D.pdf[Defects in the Revised Definition of Standard ML].
19238 Andreas Rossberg. 2001.
19240 == <!Anchor(SSS)>S ==
19242 * <!Anchor(Sansom91)>
19243 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.1020&rep=rep1&type=ps[Dual-Mode Garbage Collection].
19245 Workshop on the Parallel Implementation of Functional Languages, 1991.
19247 * <!Anchor(ScottRamsey00)>
19248 http://www.cs.tufts.edu/~nr/pubs/match-abstract.html[When Do Match-Compilation Heuristics Matter].
19249 Kevin Scott and Norman Ramsey.
19250 University of Virginia Technical Report CS-2000-13, 2000.
19253 Modified SML/NJ to experimentally compare a number of
19254 match-compilation heuristics and showed that choice of heuristic
19255 usually does not significantly affect code size or run time.
19258 * <!Anchor(Sestoft96)>
19259 http://www.itu.dk/~sestoft/papers/match.ps.gz[ML Pattern Match Compilation and Partial Evaluation].
19261 Partial Evaluation, 1996.
19264 Describes the derivation of the match compiler used in
19265 <:MoscowML:Moscow ML>.
19268 * <!Anchor(ShaoAppel94)>
19269 http://flint.cs.yale.edu/flint/publications/closure.html[Space-Efficient Closure Representations].
19270 Zhong Shao and Andrew W. Appel.
19273 * <!Anchor(Shipman02)>
19274 <!Attachment(References,Shipman02.pdf,Unix System Programming with Standard ML)>.
19275 Anthony L. Shipman.
19279 Includes a description of the <:Swerve:> HTTP server written in SML.
19282 * <!Anchor(Signoles03)>
19283 Calcul Statique des Applications de Modules Parametres.
19288 Describes a http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=382[defunctorizer]
19289 for OCaml, and compares it to existing defunctorizers, including MLton.
19292 * <!Anchor(SittampalamEtAl04)>
19293 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.1349&rep=rep1&type=ps[Incremental Execution of Transformation Specifications].
19294 Ganesh Sittampalam, Oege de Moor, and Ken Friis Larsen.
19298 Mentions a port from Moscow ML to MLton of
19299 http://www.itu.dk/research/muddy/[MuDDY], an SML wrapper around the
19300 http://sourceforge.net/projects/buddy[BuDDY] BDD package.
19303 * <!Anchor(SwaseyEtAl06)>
19304 http://www.cs.cmu.edu/~tom7/papers/smlsc2-ml06.pdf[A Separate Compilation Extension to Standard ML].
19305 David Swasey, Tom Murphy VII, Karl Crary and Robert Harper.
19308 == <!Anchor(TTT)>T ==
19310 * <!Anchor(TarditiAppel00)>
19311 http://www.smlnj.org/doc/ML-Yacc/index.html[ML-Yacc User's Manual. Version 2.4]
19312 David R. Tarditi and Andrew W. Appel. 2000.
19314 * <!Anchor(TarditiEtAl90)>
19315 http://research.microsoft.com/pubs/68738/loplas-sml2c.ps[No Assembly Required: Compiling Standard ML to C].
19316 David Tarditi, Peter Lee, and Anurag Acharya. 1990.
19318 * <!Anchor(ThorupTofte94)>
19319 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.5372&rep=rep1&type=ps[Object-oriented programming and Standard ML].
19320 Lars Thorup and Mads Tofte.
19323 * <!Anchor(Tofte90)>
19324 Type Inference for Polymorphic References.
19328 * <!Anchor(Tofte96)>
19329 http://www.itu.dk/courses/FDP/E2004/Tofte-1996-Essentials_of_SML_Modules.pdf[Essentials of Standard ML Modules].
19332 * <!Anchor(Tofte09)>
19333 http://www.itu.dk/people/tofte/publ/tips.pdf[Tips for Computer Scientists on Standard ML (Revised)].
19336 * <!Anchor(TolmachAppel95)>
19337 http://web.cecs.pdx.edu/~apt/jfp95.ps[A Debugger for Standard ML].
19338 Andrew Tolmach and Andrew W. Appel.
19341 * <!Anchor(Tolmach97)>
19342 http://web.cecs.pdx.edu/~apt/tic97.ps[Combining Closure Conversion with Closure Analysis using Algebraic Types].
19347 Describes a closure-conversion algorithm for a monomorphic IL. The
19348 algorithm uses a unification-based flow analysis followed by
19349 defunctionalization and is similar to the approach used in MLton
19350 (<!Cite(CejtinEtAl00)>).
19353 * <!Anchor(TolmachOliva98)>
19354 http://web.cecs.pdx.edu/~apt/jfp98.ps[From ML to Ada: Strongly-typed Language Interoperability via Source Translation].
19355 Andrew Tolmach and Dino Oliva.
19359 Describes a compiler for RML, a core SML-like language. The compiler
19360 is similar in structure to MLton, using monomorphisation,
19361 defunctionalization, and optimization on a first-order IL.
19364 == <!Anchor(UUU)>U ==
19366 * <!Anchor(Ullman98)>
19367 http://www-db.stanford.edu/~ullman/emlp.html[Elements of ML Programming]
19368 (http://www3.addall.com/New/submitNew.cgi?query=0137903871&type=ISBN[addall]).
19371 Prentice-Hall, 1998.
19373 == <!Anchor(VVV)>V ==
19377 == <!Anchor(WWW)>W ==
19379 * <!Anchor(Wand84)>
19380 http://portal.acm.org/citation.cfm?id=800527[A Types-as-Sets Semantics for Milner-Style Polymorphism].
19384 * <!Anchor(Wang01)>
19385 http://ncstrl.cs.princeton.edu/expand.php?id=TR-640-01[Managing Memory with Types].
19390 Chapter 6 describes an implementation of a type-preserving garbage
19391 collector for MLton.
19394 * <!Anchor(WangAppel01)>
19395 http://www.cs.princeton.edu/~appel/papers/typegc.pdf[Type-Preserving Garbage Collectors].
19396 Daniel C. Wang and Andrew W. Appel.
19400 Shows how to modify MLton to generate a strongly-typed garbage
19401 collector as part of a program.
19404 * <!Anchor(WangMurphy02)>
19405 http://www.cs.cmu.edu/~tom7/papers/wang-murphy-recursion.pdf[Programming With Recursion Schemes].
19406 Daniel C. Wang and Tom Murphy VII.
19409 Describes a programming technique for data abstraction, along with
19410 benchmarks of MLton and other SML compilers.
19413 * <!Anchor(Weeks06)>
19414 <!Attachment(References,060916-mlton.pdf,Whole-Program Compilation in MLton)>.
19418 * <!Anchor(Wright95)>
19419 http://homepages.inf.ed.ac.uk/dts/fps/papers/wright.ps.gz[Simple Imperative Polymorphism].
19421 <:#LASC:>, 8(4):343-355, 1995.
19424 The origin of the <:ValueRestriction:>.
19427 == <!Anchor(XXX)>X ==
19431 == <!Anchor(YYY)>Y ==
19433 * <!Anchor(Yang98)>
19434 http://cs.nyu.edu/zheyang/papers/YangZ\--ICFP98.html[Encoding Types in ML-like Languages].
19438 == <!Anchor(ZZZ)>Z ==
19440 * <!Anchor(ZiarekEtAl06)>
19441 http://www.cs.purdue.edu/homes/lziarek/icfp06.pdf[Stabilizers: A Modular Checkpointing Abstraction for Concurrent Functional Programs].
19442 Lukasz Ziarek, Philip Schatz, and Suresh Jagannathan.
19445 * <!Anchor(ZiarekEtAl08)>
19446 http://www.cse.buffalo.edu/~lziarek/hosc.pdf[Flattening tuples in an SSA intermediate representation].
19447 Lukasz Ziarek, Stephen Weeks, and Suresh Jagannathan.
19451 == Abbreviations ==
19453 * <!Anchor(ACSD)> ACSD = International Conference on Application of Concurrency to System Design
19454 * <!Anchor(BABEL)> BABEL = Workshop on multi-language infrastructure and interoperability
19455 * <!Anchor(CC)> CC = International Conference on Compiler Construction
19456 * <!Anchor(DPCOOL)> DPCOOL = Workshop on Declarative Programming in the Context of OO Languages
19457 * <!Anchor(ESOP)> ESOP = European Symposium on Programming
19458 * <!Anchor(FLOPS)> FLOPS = Symposium on Functional and Logic Programming
19459 * <!Anchor(FPCA)> FPCA = Conference on Functional Programming Languages and Computer Architecture
19460 * <!Anchor(HOSC)> HOSC = Higher-Order and Symbolic Computation
19461 * <!Anchor(IC)> IC = Information and Computation
19462 * <!Anchor(ICCL)> ICCL = IEEE International Conference on Computer Languages
19463 * <!Anchor(ICFP)> ICFP = International Conference on Functional Programming
19464 * <!Anchor(IFL)> IFL = International Workshop on Implementation and Application of Functional Languages
19465 * <!Anchor(IVME)> IVME = Workshop on Interpreters, Virtual Machines and Emulators
19466 * <!Anchor(JFLA)> JFLA = Journees Francophones des Langages Applicatifs
19467 * <!Anchor(JFP)> JFP = Journal of Functional Programming
19468 * <!Anchor(LASC)> LASC = Lisp and Symbolic Computation
19469 * <!Anchor(LFP)> LFP = Lisp and Functional Programming
19470 * <!Anchor(ML)> ML = Workshop on ML
19471 * <!Anchor(PLDI)> PLDI = Conference on Programming Language Design and Implementation
19472 * <!Anchor(POPL)> POPL = Symposium on Principles of Programming Languages
19473 * <!Anchor(PPDP)> PPDP = International Conference on Principles and Practice of Declarative Programming
19474 * <!Anchor(PPoPP)> PPoPP = Principles and Practice of Parallel Programming
19475 * <!Anchor(TCS)> TCS = IFIP International Conference on Theoretical Computer Science
19476 * <!Anchor(TIC)> TIC = Types in Compilation
19477 * <!Anchor(TLDI)> TLDI = Workshop on Types in Language Design and Implementation
19478 * <!Anchor(TOPLAS)> TOPLAS = Transactions on Programming Languages and Systems
19479 * <!Anchor(TPHOLs)> TPHOLs = International Conference on Theorem Proving in Higher Order Logics
19483 :mlton-guide-page: RefFlatten
19488 <:RefFlatten:> is an optimization pass for the <:SSA2:>
19489 <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.
19493 This pass flattens a `ref` cell into its containing object.
19494 The idea is to replace, where possible, a type like
19504 where the `[m]` indicates a mutable field of a tuple.
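As a small source-level illustration, consider a tuple that contains a `ref` cell. This is only a sketch: the names `counter`, `new`, and `bump` are hypothetical and do not come from the MLton sources, and whether the pass actually flattens the `ref` here depends on the analysis the pass performs. Without flattening, the tuple holds a pointer to a separately allocated `ref` object; with flattening, the `int ref` component could instead be stored as a mutable field of the tuple.

[source,sml]
----
(* A tuple holding a mutable counter and an immutable weight. *)
type counter = int ref * real

(* Allocate the ref and its containing tuple together. *)
fun new (w: real): counter = (ref 0, w)

(* Each get and set goes through the ref cell. Flattening would
   remove the extra indirection without changing the result. *)
fun bump ((r, _): counter): int = (r := !r + 1; !r)

val c = new 1.5
val n1 = bump c
val n2 = bump c
----

The observable behavior is the same either way; only the heap layout and the number of indirections would differ.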
19506 == Implementation ==
19508 * <!ViewGitFile(mlton,master,mlton/ssa/ref-flatten.fun)>
19510 == Details and Notes ==
The savings are twofold. We avoid an extra heap-allocated
object for the `ref`, which in the above case saves two words. We
also save the time and code for the extra indirection at each get and
set. There are lots of useful data structures (singly-linked and
doubly-linked lists, union-find, Fibonacci heaps, ...) for which I
believe we are paying through the nose right now because of the
absence of ref flattening.
19520 The idea is to compute for each occurrence of a `ref` type in the
19521 program whether or not that `ref` can be represented as an offset of
19522 some object (constructor or tuple). As before, a unification-based
19523 whole-program with deep abstract values makes sure the analysis is
19526 The only syntactic part of the analysis that remains is the part that
19527 checks that for a variable bound to a value constructed by `Ref_ref`:
19529 * the object allocation is in the same block. This is pretty
19530 draconian, and it would be nice to generalize it some day to allow
19531 flattening as long as the `ref` allocation and object allocation "line
19532 up one-to-one" in the same loop-free chunk of code.
* updates occur in the same block (and hence it is safe-for-space
because the containing object is still alive). It would be nice to
relax this to allow updates as long as it can be proved that the
containing object is still alive.
Flattening of `unit ref`s is prevented.
<:RefFlatten:> is safe for space. The idea is to prevent a `ref`
from being flattened into an object that has a component of unbounded
size (other than possibly the `ref` itself) unless we can prove that,
at each point where the `ref` is live, the containing object is live too.
19545 I used a pretty simple approximation to liveness.
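The mutable list cell below is a sketch of the kind of data structure mentioned above, in which a `ref` is allocated together with its containing object. The datatype and function names are hypothetical, and nothing here asserts that MLton flattens this particular program; it only shows the source-level shape that the pass looks at.

[source,sml]
----
(* Mutable singly-linked list: the `next` ref lives inside each Cons. *)
datatype cell = Nil | Cons of int * cell ref

(* The ref and the Cons object that contains it are allocated together. *)
fun fromList [] = Nil
  | fromList (x :: xs) = Cons (x, ref (fromList xs))

(* Destructively append ys to xs by updating the last `next` ref;
   does nothing when xs is Nil. *)
fun appendTo (Nil, _) = ()
  | appendTo (Cons (_, next), ys) =
      (case !next of
          Nil => next := ys
        | rest => appendTo (rest, ys))

fun toList Nil = []
  | toList (Cons (x, next)) = x :: toList (!next)

val xs = fromList [1, 2]
val () = appendTo (xs, fromList [3])
val result = toList xs
----

If the `cell ref` were flattened, each `Cons` would carry its `next` field directly as a mutable field instead of pointing to a separate `ref` object.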
19549 :mlton-guide-page: Regions
19554 In region-based memory management, the heap is divided into a
19555 collection of regions into which objects are allocated. At compile
19556 time, either in the source program or through automatic inference,
19557 allocation points are annotated with the region in which the
19558 allocation will occur. Typically, although not always, the regions
19559 are allocated and deallocated according to a stack discipline.
19561 MLton does not use region-based memory management; it uses traditional
19562 <:GarbageCollection:>. We have considered integrating regions with
19563 MLton, but in our opinion it is far from clear that regions would
19564 provide MLton with improved performance, while they would certainly
19565 add a lot of complexity to the compiler and complicate reasoning about
19566 and achieving <:SpaceSafety:>. Region-based memory management and
19567 garbage collection have different strengths and weaknesses; it's
19568 pretty easy to come up with programs that do significantly better
under regions than under GC, and vice versa. We believe that common
SML idioms tend to work better under GC than under regions.
19573 One common argument for regions is that the region operations can all
19574 be done in (approximately) constant time; therefore, you eliminate GC
19575 pause times, leading to a real-time GC. However, because of space
19576 safety concerns (see below), we believe that region-based memory
19577 management for SML must also include a traditional garbage collector.
Hence, to achieve real-time memory management for MLton/SML, we
believe that it would be both easier and more efficient to implement a
traditional real-time garbage collector than it would be to implement
a region system.
19583 == Regions, the ML Kit, and space safety ==
19585 The <:MLKit:ML Kit> pioneered the use of regions for compiling
19586 Standard ML. The ML Kit maintains a stack of regions at run time. At
19587 compile time, it uses region inference to decide when data can be
19588 allocated in a stack-like manner, assigning it to an appropriate
19589 region. The ML Kit has put a lot of effort into improving the
19590 supporting analyses and representations of regions, which are all
19591 necessary to improve the performance.
Unfortunately, under a pure stack-based region system, space leaks are
inevitable in theory, and costly in practice. Data for which region
inference cannot determine the lifetime is moved into the "global
region", whose lifetime is the entire program. There are two ways in
which region inference will place an object in the global region.
* When the inference is too conservative, that is, when the data is
used in a stack-like manner but the region inference can't figure it
out.
* When data is not used in a stack-like manner. In this case,
correctness requires region inference to place the object in the
global region.
19606 This global region is a source of space leaks. No matter what region
19607 system you use, there are some programs such that the global region
19608 must exist, and its size will grow to an unbounded multiple of the
live data size. For these programs one must have a GC to achieve
space safety.
19612 To solve this problem, the ML Kit has undergone work to combine
19613 garbage collection with region-based memory management.
19614 <!Cite(HallenbergEtAl02)> and <!Cite(Elsman03)> describe the addition
19615 of a garbage collector to the ML Kit's region-based system. These
19616 papers provide convincing evidence for space leaks in the global
19617 region. They show a number of benchmarks where the memory usage of
19618 the program running with just regions is a large multiple (2, 10, 50,
19619 even 150) of the program running with regions plus GC.
19621 These papers also give some numbers to show the ML Kit with just
19622 regions does better than either a system with just GC or a combined
19623 system. Unfortunately, a pure region system isn't practical because
19624 of the lack of space safety. And the other performance numbers are
19625 not so convincing, because they compare to an old version of SML/NJ
19626 and not at all with MLton. It would be interesting to see a
19627 comparison with a more serious collector.
19629 == Regions, Garbage Collection, and Cyclone ==
19631 One possibility is to take Cyclone's approach, and provide both
19632 region-based memory management and garbage collection, but at the
19633 programmer's option (<!Cite(GrossmanEtAl02)>, <!Cite(HicksEtAl03)>).
19635 One might ask whether we might do the same thing -- i.e., provide a
`MLton.Regions` structure with explicit region-based memory
19637 management operations, so that the programmer could use them when
19638 appropriate. <:MatthewFluet:> has thought about this question
19640 * http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html
19642 Unfortunately, his conclusion is that the SML type system is too weak
19643 to support this option, although there might be a "poor-man's" version
19644 with dynamic checks.
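To make the idea of a "poor-man's" version with dynamic checks concrete, here is a minimal sketch in plain SML. Everything in it is hypothetical: the `Region` structure, its signature, and the `Dead` exception are invented for illustration and are not part of MLton or of the work linked above. It also does not manage memory at all; it only models the lifetime check, using a flag that is cleared when the region's scope ends.

[source,sml]
----
(* Hypothetical sketch: NOT an actual MLton structure. *)
structure Region :>
sig
  type region
  type 'a rptr
  exception Dead
  val withRegion : (region -> 'b) -> 'b
  val alloc : region * 'a -> 'a rptr
  val get : 'a rptr -> 'a
end =
struct
  type region = bool ref          (* true while the region is live *)
  type 'a rptr = region * 'a
  exception Dead

  (* Run f with a fresh region; mark the region dead on exit,
     whether f returns normally or raises. *)
  fun withRegion f =
    let
      val r = ref true
      val result = f r handle e => (r := false; raise e)
    in
      r := false; result
    end

  (* Both operations check dynamically that the region is still live. *)
  fun alloc (r, x) = if !r then (r, x) else raise Dead
  fun get (r, x) = if !r then x else raise Dead
end

(* A pointer used inside its region works as expected. *)
val ok = Region.withRegion (fn r => Region.get (Region.alloc (r, 42)))

(* A pointer that escapes its region type-checks, but fails when used. *)
val leaked = Region.withRegion (fn r => Region.alloc (r, 42))
val caught = (Region.get leaked; false) handle Region.Dead => true
----

Because the SML type system does not stop `leaked` from escaping the scope of its region, the error is caught only at run time, when `get` finds the region's flag cleared.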
19648 :mlton-guide-page: Release20041109
19649 [[Release20041109]]
19653 This is an archived public release of MLton, version 20041109.
19655 == Changes since the last public release ==
19658 ** x86: FreeBSD 5.x, OpenBSD
19659 ** PowerPC: Darwin (MacOSX)
19660 * Support for the <:MLBasis: ML Basis system>, a new mechanism supporting programming in the very large, separate delivery of library sources, and more.
19661 * Support for dynamic libraries.
19662 * Support for <:ConcurrentML:> (CML).
19663 * New structures: `Int2`, `Int3`, ..., `Int31` and `Word2`, `Word3`, ..., `Word31`.
19664 * Front-end bug fixes and improvements.
19665 * A new form of profiling with ++-profile count++, which can be used to test code coverage.
19666 * A bytecode generator, available via ++-codegen bytecode++.
19667 * Representation improvements:
19668 ** Tuples and datatypes are packed to decrease space usage.
19669 ** Ref cells may be unboxed into their containing object.
19670 ** Arrays of tuples may represent the tuples unboxed.
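The fixed-width structures listed above can be exercised as follows. This is a minimal sketch that assumes a MLton release providing `Word7` and `Int5` at the top level; other SML compilers generally do not offer these structures, and the value names are chosen only for illustration.

[source,sml]
----
(* Word7 arithmetic is modulo 2^7 = 128. *)
val bits = Word7.wordSize
val wrapped = Word7.toInt (Word7.fromInt 200)
val zero = Word7.toInt (Word7.+ (Word7.fromInt 127, Word7.fromInt 1))

(* Int5 covers -16 .. 15; converting a value outside that range
   raises Overflow. *)
val top = Int5.maxInt
val overflowed = (Int5.fromInt 16; false) handle Overflow => true
----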
19672 For a complete list of changes and bug fixes since 20040227, see the
19673 <!RawGitFile(mlton,on-20041109-release,doc/changelog)>.
19681 :mlton-guide-page: Release20051202
19682 [[Release20051202]]
19686 This is an archived public release of MLton, version 20051202.
19688 == Changes since the last public release ==
19690 * The <:License:MLton license> is now BSD-style instead of the GPL.
19691 * New platforms: <:RunningOnMinGW:X86/MinGW> and HPPA/Linux.
19692 * Improved and expanded documentation, based on the MLton wiki.
** Improved exception history.
19695 ** <:CompileTimeOptions:Command-line switches>.
19696 *** Added: ++-as-opt++, ++-mlb-path-map++, ++-target-as-opt++, ++-target-cc-opt++.
19697 *** Removed: ++-native++, ++-sequence-unit++, ++-warn-match++, ++-warn-unused++.
19699 ** <:ForeignFunctionInterface:FFI> syntax changes and extensions.
19700 *** Added: `_symbol`.
19701 *** Changed: `_export`, `_import`.
19702 *** Removed: `_ffi`.
19703 ** <:MLBasisAnnotations:ML Basis annotations>.
19704 *** Added: `allowFFI`, `nonexhaustiveExnMatch`, `nonexhaustiveMatch`, `redundantMatch`, `sequenceNonUnit`.
19705 *** Deprecated: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`.
19708 *** Added: `Int1`, `Word1`.
19709 ** <:MLtonStructure:MLton structure>.
19710 *** Added: `Process.create`, `ProcEnv.setgroups`, `Rusage.measureGC`, `Socket.fdToSock`, `Socket.Ctl.getError`.
19711 *** Changed: `MLton.Platform.Arch`.
19712 ** Other libraries.
19713 *** Added: <:CKitLibrary:ckit>, <:MLNLFFI:ML-NLFFI library>, <:SMLNJLibrary:SML/NJ library>.
19715 ** Updates of `mllex` and `mlyacc` from SML/NJ.
19716 ** Added <:MLNLFFI:mlnlffigen>.
19717 ** <:Profiling:> supports better inclusion/exclusion of code.
19719 For a complete list of changes and bug fixes since
19720 <:Release20041109:>, see the
19721 <!RawGitFile(mlton,on-20051202-release,doc/changelog)> and
19724 == 20051202 binary packages ==
19727 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-cygwin.tgz[Cygwin] 1.5.18-1
19728 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-freebsd.tbz[FreeBSD] 5.4
19730 *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.deb[Debian] sid
19731 *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.stable.deb[Debian] stable (Sarge)
19732 *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386.rpm[RedHat] 7.1-9.3 FC1-FC4
19733 *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-linux.tgz[tgz] for other distributions (glibc 2.3)
19734 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-mingw.tgz[MinGW]
19735 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-netbsd.tgz[NetBSD] 2.0.2
19736 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-openbsd.tgz[OpenBSD] 3.7
19738 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.powerpc-darwin.tgz[Darwin] 7.9.0 (Mac OS X)
19740 ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.sparc-solaris.tgz[Solaris] 8
19742 == 20051202 source packages ==
19744 * http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.tgz[source tgz]
19745 * Debian http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.dsc[dsc], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.diff.gz[diff.gz], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202.orig.tar.gz[orig.tar.gz]
19746 * RedHat http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.rpm[source rpm]
19748 == Packages available at other sites ==
19750 * http://packages.debian.org/cgi-bin/search_packages.pl?searchon=names&version=all&exact=1&keywords=mlton[Debian]
19751 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19752 * Fedora Core http://fedoraproject.org/extras/4/i386/repodata/repoview/mlton-0-20051202-8.fc4.html[4] http://fedoraproject.org/extras/5/i386/repodata/repoview/mlton-0-20051202-8.fc5.html[5]
19753 * http://packages.ubuntu.com/dapper/devel/mlton[Ubuntu]
19758 * http://www.mlton.org/guide/20051202/[MLton Guide (20051202)].
19760 A snapshot of the MLton wiki at the time of release.
19764 :mlton-guide-page: Release20070826
19765 [[Release20070826]]
19769 This is an archived public release of MLton, version 20070826.
19771 == Changes since the last public release ==
19774 ** <:RunningOnAMD64:AMD64>/<:RunningOnLinux:Linux>, <:RunningOnAMD64:AMD64>/<:RunningOnFreeBSD:FreeBSD>
19775 ** <:RunningOnHPPA:HPPA>/<:RunningOnHPUX:HPUX>
19776 ** <:RunningOnPowerPC:PowerPC>/<:RunningOnAIX:AIX>
19777 ** <:RunningOnX86:X86>/<:RunningOnDarwin:Darwin (Mac OS X)>
19779 ** Support for 64-bit platforms.
19780 *** Native amd64 codegen.
19781 ** <:CompileTimeOptions:Compile-time options>.
19782 *** Added: ++-codegen amd64++, ++-codegen x86++, ++-default-type __type__++, ++-profile-val {false|true}++.
19783 *** Changed: ++-stop f++ (file listing now includes `.mlb` files).
19784 ** Bytecode codegen.
19785 *** Support for exception history.
19786 *** Support for profiling.
19788 *** <:MLBasisAnnotations:ML Basis annotations>.
19789 **** Removed: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`.
19791 ** <:BasisLibrary:Basis Library>.
19792 *** Added: `PackWord16Big`, `PackWord16Little`, `PackWord64Big`, `PackWord64Little`.
19793 *** Bug fixes: see <!RawGitFile(mlton,on-20070826-release,doc/changelog)>.
19794 ** <:MLtonStructure:MLton structure>.
*** Added: `MLTON_MONO_ARRAY`, `MLTON_MONO_VECTOR`, `MLTON_REAL`, `MLton.BinIO.tempPrefix`, `MLton.CharArray`, `MLton.CharVector`, `MLton.Exn.defaultTopLevelHandler`, `MLton.Exn.getTopLevelHandler`, `MLton.Exn.setTopLevelHandler`, `MLton.IntInf.BigWord`, `MLton.IntInf.SmallInt`, `MLton.LargeReal`, `MLton.LargeWord`, `MLton.Real`, `MLton.Real32`, `MLton.Real64`, `MLton.Rlimit.Rlim`, `MLton.TextIO.tempPrefix`, `MLton.Vector.create`, `MLton.Word.bswap`, `MLton.Word8.bswap`, `MLton.Word16`, `MLton.Word32`, `MLton.Word64`, `MLton.Word8Array`, `MLton.Word8Vector`.
19796 *** Changed: `MLton.Array.unfoldi`, `MLton.IntInf.rep`, `MLton.Rlimit`, `MLton.Vector.unfoldi`.
19797 *** Deprecated: `MLton.Socket`.
19798 ** Other libraries.
19799 *** Added: <:MLRISCLibrary:MLRISC library>.
19800 *** Updated: <:CKitLibrary:ckit library>, <:SMLNJLibrary:SML/NJ library>.
19803 For a complete list of changes and bug fixes since
19804 <:Release20051202:>, see the
19805 <!RawGitFile(mlton,on-20070826-release,doc/changelog)> and
19808 == 20070826 binary packages ==
19811 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.amd64-linux.tgz[Linux], glibc 2.3
19813 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.hppa-hpux1100.tgz[HPUX] 11.00 and above, statically linked against <:GnuMP:>
19815 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-aix51.tgz[AIX] 5.1 and above, statically linked against <:GnuMP:>
19816 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-static.tgz[Darwin] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19817 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-macports.tgz[Darwin] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19819 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.sparc-solaris8.tgz[Solaris] 8 and above, statically linked against <:GnuMP:>
19821 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-cygwin.tgz[Cygwin] 1.5.24-2
19822 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19823 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.dmg[Darwin (.dmg)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19824 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19825 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.dmg[Darwin (.dmg)] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19826 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-freebsd.tgz[FreeBSD]
19827 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.tgz[Linux], glibc 2.3
19828 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.glibc213.gmp-static.tgz[Linux], glibc 2.1, statically linked against <:GnuMP:>
19829 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-dll.tgz[MinGW], dynamically linked against <:GnuMP:> (requires `libgmp-3.dll`)
19830 ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-static.tgz[MinGW], statically linked against <:GnuMP:>
19832 == 20070826 source packages ==
19834 * http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.src.tgz[source tgz]
19836 * Debian http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.dsc[dsc],
19837 http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.diff.gz[diff.gz],
19838 http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826.orig.tar.gz[orig.tar.gz]
19840 == Packages available at other sites ==
* http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
19843 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19844 * https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora]
19845 * http://packages.ubuntu.com/cgi-bin/search_packages.pl?keywords=mlton&searchon=names&version=all&release=all[Ubuntu]
19850 * http://www.mlton.org/guide/20070826/[MLton Guide (20070826)].
19852 A snapshot of the MLton wiki at the time of release.
19856 :mlton-guide-page: Release20100608
19857 [[Release20100608]]
19861 This is an archived public release of MLton, version 20100608.
19863 == Changes since the last public release ==
19866 ** <:RunningOnAMD64:AMD64>/<:RunningOnDarwin:Darwin> (Mac OS X Snow Leopard)
19867 ** <:RunningOnIA64:IA64>/<:RunningOnHPUX:HPUX>
19868 ** <:RunningOnPowerPC64:PowerPC64>/<:RunningOnAIX:AIX>
19870 ** <:CompileTimeOptions:Command-line switches>.
19871 *** Added: ++-mlb-path-var __<name> <value>__++
19872 *** Removed: ++-keep sml++, ++-stop sml++
19873 ** Improved constant folding of floating-point operations.
19874 ** Experimental: Support for compiling to a C library; see <:LibrarySupport: documentation>.
19875 ** Extended ++-show-def-use __output__++ to include types of variable definitions.
19876 ** Deprecated features (to be removed in a future release)
19877 *** Bytecode codegen: The bytecode codegen has not seen significant use and it is not well understood by any of the active developers.
19878 *** Support for `.cm` files as input: The ML Basis system provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers.
19879 ** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19881 ** <:RunTimeOptions:@MLton switches>.
19882 *** Added: ++may-page-heap {false|true}++
** ++may-page-heap++: By default, MLton will not page the heap to disk when unable to grow the heap to accommodate an allocation. (Previously, paging the heap to disk was the default, with no means to disable it, which raised security and least-surprise issues.)
19884 ** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19886 ** Allow numeric characters in <:MLBasis:ML Basis> path variables.
19888 ** <:BasisLibrary:Basis Library>.
19889 *** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19890 ** <:MLtonStructure:MLton structure>.
19891 *** Added: `MLton.equal`, `MLton.hash`, `MLton.Cont.isolate`, `MLton.GC.Statistics`, `MLton.Pointer.sizeofPointer`, `MLton.Socket.Address.toVector`
19893 *** Deprecated: `MLton.Socket`
19894 ** <:UnsafeStructure:Unsafe structure>.
19895 *** Added versions of all of the monomorphic array and vector structures.
19896 ** Other libraries.
19897 *** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library>.
19900 *** Eliminated top-level `type int = Int.int` in output.
19901 *** Include `(*#line line:col "file.lex" *)` directives in output.
19902 *** Added `%posint` command, to set the `yypos` type and allow the lexing of multi-gigabyte files.
19904 *** Added command-line switches `-linkage archive` and `-linkage shared`.
19905 *** Deprecated command-line switch `-linkage static`.
19906 *** Added support for <:RunningOnIA64:IA64> and <:RunningOnHPPA:HPPA> targets.
19908 *** Eliminated top-level `type int = Int.int` in output.
19909 *** Include `(*#line line:col "file.grm" *)` directives in output.
19911 For a complete list of changes and bug fixes since <:Release20070826:>, see the
19912 <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19913 and <:Bugs20070826:>.
19915 == 20100608 binary packages ==
19917 * AMD64 (aka "x86-64" or "x64")
19918 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19919 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
19920 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.tgz[Linux], glibc 2.11
19921 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.static.tgz[Linux], statically linked
19922 ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer
19924 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-cygwin.tgz[Cygwin] 1.7.5
19925 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19926 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
19927 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.tgz[Linux], glibc 2.11
19928 ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.static.tgz[Linux], statically linked
19929 ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer
19931 == 20100608 source packages ==
19933 * http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608.src.tgz[mlton-20100608.src.tgz]
19935 == Packages available at other sites ==
* http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
19938 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19939 * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
* http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu]
19945 * http://www.mlton.org/guide/20100608/[MLton Guide (20100608)].
19947 A snapshot of the MLton wiki at the time of release.
19951 :mlton-guide-page: Release20130715
19952 [[Release20130715]]
19956 This is an archived public release of MLton, version 20130715.
19958 == Changes since the last public release ==
19960 // * New platforms.
19963 ** Cosmetic improvements to type-error messages.
19964 ** Removed features:
19965 *** Bytecode codegen: The bytecode codegen had not seen significant use and it was not well understood by any of the active developers.
19966 *** Support for `.cm` files as input: The <:MLBasis:ML Basis system> provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers.
19967 ** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19969 ** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19971 ** Interpret `(*#line line:col "file" *)` directives as relative file names.
19972 ** <:MLBasisAnnotations:ML Basis annotations>.
19973 *** Added: `resolveScope`
19975 ** <:BasisLibrary:Basis Library>.
19976 *** Improved performance of `String.concatWith`.
19977 *** Use bit operations for `REAL.class` and other low-level operations.
19978 *** Support additional variables with `Posix.ProcEnv.sysconf`.
19979 *** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19980 ** <:MLtonStructure:MLton structure>.
19981 *** Removed: `MLton.Socket`
19982 ** Other libraries.
19983 *** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library>
19984 *** Added: <:MLLPTLibrary:MLLPT library>
19987 *** Generate `(*#line line:col "file.lex" *)` directives with simple (relative) file names, rather than absolute paths.
19989 *** Generate `(*#line line:col "file.grm" *)` directives with simple (relative) file names, rather than absolute paths.
19990 *** Fixed bug in comment-handling in lexer.
19992 For a complete list of changes and bug fixes since
19993 <:Release20100608:>, see the
19994 <!RawGitFile(mlton,on-20130715-release,doc/changelog)> and
19997 == 20130715 binary packages ==
19999 * AMD64 (aka "x86-64" or "x64")
20000 ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
20001 ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
20002 ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.tgz[Linux], glibc 2.15
20003 // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.static.tgz[Linux], statically linked
20004 // ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer
20006 // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-cygwin.tgz[Cygwin] 1.7.5
20007 ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.tgz[Linux], glibc 2.15
20008 // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.static.tgz[Linux], statically linked
20009 // ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer
20011 == 20130715 source packages ==
20013 * http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715.src.tgz[mlton-20130715.src.tgz]
20015 == Downstream packages ==
* http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
20018 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
20019 * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
* http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu]
20025 * http://www.mlton.org/guide/20130715/[MLton Guide (20130715)].
20027 A snapshot of the MLton website at the time of release.
20031 :mlton-guide-page: Release20180207
20032 [[Release20180207]]
20036 Here you can download the latest public release of MLton, version 20180207.
20038 == Changes since the last public release ==
20041 ** Added an experimental LLVM codegen (`-codegen llvm`); requires LLVM tools
20042 (`llvm-as`, `opt`, `llc`) version ≥ 3.7.
20043 ** Made many substantial cosmetic improvements to front-end diagnostic
20044 messages, especially with respect to source location regions, type inference
20045 for `fun` and `val rec` declarations, signature constraints applied to a
20046 structure, `sharing type` specifications and `where type` signature
20047 expressions, type constructor or type variable escaping scope, and
20048 nonexhaustive pattern matching.
20049 ** Fixed minor bugs with exception replication, precedence parsing of function
20050 clauses, and simultaneous `sharing` of multiple structures.
20051 ** Made compilation deterministic (eliminate output executable name from
20052 compile-time specified `@MLton` runtime arguments; deterministically generate
20053 magic constant for executable).
20054 ** Updated `-show-basis` (recursively expand structures in environments,
20055 displaying components with long identifiers; append `(* @ region *)`
20056 annotations to items shown in environment).
20057 ** Forced amd64 codegen to generate PIC on amd64-linux targets.
20059 ** Added `gc-summary-file file` runtime option.
20060 ** Reorganized runtime support for `IntInf` operations so that programs that
20061 do not use `IntInf` compile to executables with no residual dependency on GMP.
20062 ** Changed heap representation to store forwarding pointer for an object in
20063 the object header (rather than in the object data and setting the header to a
20066 ** Added support for selected SuccessorML features; see
20067 http://mlton.org/SuccessorML for details.
20068 ** Added `(*#showBasis "file" *)` directive; see
20069 http://mlton.org/ShowBasisDirective for details.
20071 *** Added `pure`, `impure`, and `reentrant` attributes to `_import`. An
20072 unattributed `_import` is treated as `impure`. A `pure` `_import` may be
20073 subject to more aggressive optimizations (common subexpression elimination,
20074 dead-code elimination). An `_import`-ed C function that (directly or
20075 indirectly) calls an `_export`-ed SML function should be attributed `reentrant`.
20077 ** ML Basis annotations.
20078 *** Added `allowSuccessorML {false|true}` to enable all SuccessorML features
20079 and other annotations to enable specific SuccessorML features; see
20080 http://mlton.org/SuccessorML for details.
20081 *** Split `nonexhaustiveMatch {warn|error|ignore}` and `redundantMatch
20082 {warn|error|ignore}` into `nonexhaustiveMatch` and `redundantMatch`
20083 (controls diagnostics for `case` expressions, `fn` expressions, and `fun`
20084 declarations (which may raise `Match` on failure)) and `nonexhaustiveBind`
20085 and `redundantBind` (controls diagnostics for `val` declarations (which may
20086 raise `Bind` on failure)).
20087 *** Added `valrecConstr {warn|error|ignore}` to report when a `val rec` (or
20088 `fun`) declaration redefines an identifier that previously had constructor status.
20092 *** Improved performance of `Array.copy`, `Array.copyVec`, `Vector.append`,
20093 `String.^`, `String.concat`, `String.concatWith`, and other related
20094 functions by using `memmove` rather than element-by-element constructions.
20095 ** `Unsafe` structure.
20096 *** Added unsafe operations for array uninitialization and raw arrays; see
20097 https://github.com/MLton/mlton/pull/207 for details.
20098 ** Other libraries.
20099 *** Updated: ckit library, MLLPT library, MLRISC library, SML/NJ library
20102 *** Updated to warn and skip (rather than abort) when encountering functions
20103 with `struct`/`union` argument or return type.
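The split between `nonexhaustiveMatch` and `nonexhaustiveBind` noted above mirrors the two exceptions that a failed pattern match can raise. The following is a minimal sketch in plain Standard ML; the names `head`, `fromMatch`, and `fromBind` are illustrative only, and a compiler is expected to emit nonexhaustive-pattern diagnostics for both declarations unless the corresponding annotation is set to `ignore`.

```sml
(* Nonexhaustive `fun`: applying it to [] raises Match. *)
fun head (x :: _) = x

val fromMatch = (head [] : int) handle Match => ~1

(* Nonexhaustive `val`: binding [] to the pattern raises Bind. *)
val fromBind =
   let
      val x :: _ = ([] : int list)
   in
      x
   end
   handle Bind => ~2
```

Here `fromMatch` evaluates to `~1` and `fromBind` to `~2`, which shows that the two kinds of declarations fail with different exceptions and can therefore sensibly be controlled by different annotations.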
20105 For a complete list of changes and bug fixes since
20106 <:Release20130715:>, see the
20107 <!ViewGitFile(mlton,on-20180207-release,CHANGELOG.adoc)> and
20110 == 20180207 binary packages ==
20112 * AMD64 (aka "x86-64" or "x64")
20113 ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-homebrew.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), dynamically linked against <:GnuMP:> in `/usr/local/lib` (suitable for https://brew.sh/[Homebrew] install of <:GnuMP:>)
20114 ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
20115 ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-linux.tgz[Linux], glibc 2.23
20116 // ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer
20118 // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-cygwin.tgz[Cygwin] 1.7.5
20119 // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.tgz[Linux], glibc 2.23
20120 // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.static.tgz[Linux], statically linked
20121 // ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer
20123 == 20180207 source packages ==
20125 * https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207.src.tgz[mlton-20180207.src.tgz]
20130 * http://www.mlton.org/guide/20180207/[MLton Guide (20180207)].
20132 A snapshot of the MLton website at the time of release.
20136 :mlton-guide-page: ReleaseChecklist
20137 [[ReleaseChecklist]]
20141 == Advance preparation for release ==
20143 * Update `./CHANGELOG.adoc`.
20144 ** Write entries for missing notable commits.
20145 ** Write summary of changes from previous release.
20146 ** Update with estimated release date.
20147 * Update `./README.adoc`.
20148 ** Check features and description.
20149 * Update `man/{mlton,mlprof}.1`.
20150 ** Check compile-time and run-time options in `man/mlton.1`.
20151 ** Check options in `man/mlprof.1`.
20152 ** Update with estimated release date.
20153 * Update `doc/guide`.
20154 // ** Check <:OrphanedPages:> and <:WantedPages:>.
20155 ** Synchronize <:Features:> page with `./README.adoc`.
20156 ** Update <:Credits:> page with acknowledgements.
20157 ** Create *ReleaseYYYYMM??* page (i.e., forthcoming release) based on *ReleaseXXXXLLCC* (i.e., previous release).
20158 *** Update summary from `./CHANGELOG.adoc`.
20159 *** Update links to estimated release date.
20160 ** Create *BugsYYYYMM??* page based on *BugsXXXXLLCC*.
20161 *** Update links to estimated release date.
20162 ** Spell check pages.
20163 * Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>.
20165 == Prepare sources for tagging ==
20167 * Update `./CHANGELOG.adoc`.
20168 ** Update with proper release date.
20169 * Update `man/{mlton,mlprof}.1`.
20170 ** Update with proper release date.
20171 * Update `doc/guide`.
20172 ** Rename *ReleaseYYYYMM??* to *ReleaseYYYYMMDD* with proper release date.
20173 *** Update links with proper release date.
20174 ** Rename *BugsYYYYMM??* to *BugsYYYYMMDD* with proper release date.
20175 *** Update links with proper release date.
20176 ** Update *ReleaseXXXXLLCC*.
20177 *** Change intro to "`This is an archived public release of MLton, version XXXXLLCC.`"
20178 ** Update <:Home:> with note of new release.
20179 *** Change `What's new?` text to `Please try out our new release, <:ReleaseYYYYMMDD:MLton YYYYMMDD>`.
20180 *** Update `Download` link with proper release date.
20181 ** Update <:Releases:> with new release.
20182 * Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>.
20189 git clone http://github.com/MLton/mlton mlton.git
20191 git checkout master
20192 git tag -a -m "Tagging YYYYMMDD release" on-YYYYMMDD-release master
20193 git push origin on-YYYYMMDD-release
20198 === SourceForge FRS ===
20200 * Create *YYYYMMDD* directory:
20203 sftp user@frs.sourceforge.net:/home/frs/project/mlton/mlton
20204 sftp> mkdir YYYYMMDD
20208 === Source release ===
20210 * Create `mlton-YYYYMMDD.src.tgz`:
20213 git clone http://github.com/MLton/mlton mlton
20215 git checkout on-YYYYMMDD-release
20216 make MLTON_VERSION=YYYYMMDD source-release
20223 wget https://github.com/MLton/mlton/archive/on-YYYYMMDD-release.tar.gz
20224 tar xzvf on-YYYYMMDD-release.tar.gz
20225 cd mlton-on-YYYYMMDD-release
20226 make MLTON_VERSION=YYYYMMDD source-release
20230 * Upload `mlton-YYYYMMDD.src.tgz`:
20233 scp mlton-YYYYMMDD.src.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/
20236 * Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD.src.tgz` link.
20238 === Binary releases ===
20240 * Build and create `mlton-YYYYMMDD-1.ARCH-OS.tgz`:
20243 wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz
20244 tar xzvf mlton-YYYYMMDD.src.tgz
20246 make binary-release
20250 * Upload `mlton-YYYYMMDD-1.ARCH-OS.tgz`:
20253 scp mlton-YYYYMMDD-1.ARCH-OS.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/
20256 * Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD-1.ARCH-OS.tgz` link.
20260 * `guide/YYYYMMDD` gets a copy of `doc/guide/localhost`.
20264 wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz
20265 tar xzvf mlton-YYYYMMDD.src.tgz
20268 cp -prf localhost YYYYMMDD
20269 tar czvf guide-YYYYMMDD.tgz YYYYMMDD
20270 rsync -avzP --delete -e ssh YYYYMMDD user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/
20271 rsync -avzP --delete -e ssh guide-YYYYMMDD.tgz user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/
20274 == Announce release ==
20276 * Mail announcement to:
20277 ** mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]
20278 ** mailto:MLton-user@mlton.org[`MLton-user@mlton.org`]
20282 * Generate new <:Performance:> numbers.
20286 :mlton-guide-page: Releases
20291 Public releases of MLton:
20293 * <:Release20180207:>
20294 * <:Release20130715:>
20295 * <:Release20100608:>
20296 * <:Release20070826:>
20297 * <:Release20051202:>
20298 * <:Release20041109:>
20316 :mlton-guide-page: RemoveUnused
20321 <:RemoveUnused:> is an optimization pass for both the <:SSA:> and
20322 <:SSA2:> <:IntermediateLanguage:>s, invoked from <:SSASimplify:> and
20327 This pass aggressively removes unused:
20330 * datatype constructors
20331 * datatype constructor arguments
20333 * function arguments
20337 * statements (variable bindings)
20338 * handlers from non-tail calls (mayRaise analysis)
20339 * continuations from non-tail calls (mayReturn analysis)
20341 == Implementation ==
20343 * <!ViewGitFile(mlton,master,mlton/ssa/remove-unused.fun)>
20344 * <!ViewGitFile(mlton,master,mlton/ssa/remove-unused2.fun)>
20346 == Details and Notes ==
20352 :mlton-guide-page: Restore
20357 <:Restore:> is a rewrite pass for the <:SSA:> and <:SSA2:>
20358 <:IntermediateLanguage:>s, invoked from <:KnownCase:> and
20363 This pass restores the SSA condition for a violating <:SSA:> or
20364 <:SSA2:> program; the program must satisfy:
20366 Every path from the root to a use of a variable (excluding globals)
20367 passes through a def of that variable.
20370 == Implementation ==
20372 * <!ViewGitFile(mlton,master,mlton/ssa/restore.sig)>
20373 * <!ViewGitFile(mlton,master,mlton/ssa/restore.fun)>
20374 * <!ViewGitFile(mlton,master,mlton/ssa/restore2.sig)>
20375 * <!ViewGitFile(mlton,master,mlton/ssa/restore2.fun)>
20377 == Details and Notes ==
20379 Based primarily on Section 19.1 of <!Cite(Appel98, Modern Compiler
20380 Implementation in ML)>.
20382 The main deviation is the calculation of liveness of the violating
20383 variables, which is used to predicate the insertion of phi arguments.
20384 This is due to the algorithm's bias towards imperative languages, for
20385 which it makes the assumption that all variables are defined in the
20386 start block and all variables are "used" at exit.
20388 This is "optimized" for restoration of functions with small numbers of
20389 violating variables -- use bool vectors to represent sets of violating
20392 Also, we use a `Promise.t` to suspend part of the dominance frontier
20397 :mlton-guide-page: ReturnStatement
20398 [[ReturnStatement]]
20402 Programmers coming from languages that have a `return` statement, such
20403 as C, Java, and Python, often ask how one can translate functions that
20404 return early into SML. This page briefly describes a number of ways
20405 to translate uses of `return` to SML.
20407 == Conditional iterator function ==
20409 A conditional iterator function, such as
20410 http://www.standardml.org/Basis/list.html#SIG:LIST.find:VAL[`List.find`],
20411 http://www.standardml.org/Basis/list.html#SIG:LIST.exists:VAL[`List.exists`],
20413 and http://www.standardml.org/Basis/list.html#SIG:LIST.all:VAL[`List.all`]
20414 is probably what you want in most cases. Unfortunately, it might be
20415 the case that the particular conditional iteration pattern that you
20416 want isn't provided for your data structure. Usually the best
20417 alternative in such a case is to implement the desired iteration
20418 pattern as a higher-order function. For example, to implement a
20419 `find` function for arrays (which already exists as
20420 http://www.standardml.org/Basis/array.html#SIG:ARRAY.findi:VAL[`Array.find`])
fun find predicate array = let
   fun loop i =
      if i = Array.length array then
         NONE
      else if predicate (Array.sub (array, i)) then
         SOME (Array.sub (array, i))
      else loop (i + 1)
in loop 0 end
20438 Of course, this technique, while probably the most common case in
20439 practice, applies only if you are essentially iterating over some data structure.
20442 == Escape handler ==
20444 Probably the most direct way to translate code using `return`
20445 statements is to basically implement `return` using exception
20446 handling. The mechanism can be packaged into a reusable module with
20448 (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/control/exit.sig)>):
20451 sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/control/exit.sig 6:]
20454 (<!Cite(HarperEtAl93, Typing First-Class Continuations in ML)>
20455 discusses the typing of a related construct.) The implementation
20456 (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/control/exit.sml)>)
20457 is straightforward:
20460 sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/control/exit.sml 6:]
20463 Here is an example of how one could implement a `find` function given
20467 fun appToFind (app : ('a -> unit) -> 'b -> unit)
20468 (predicate : 'a -> bool)
20473 if predicate x then
20481 In the above, as soon as the expression `predicate x` evaluates to
20482 `true` the `app` invocation is terminated.
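For readers without the mltonlib sources at hand, the idea can be sketched in a self-contained way. The helper `withReturn` below is hypothetical and is not the `Exit` module referenced above; it only assumes that a locally declared exception can carry the result out of the enclosing computation.

```sml
(* Hypothetical helper, not the mltonlib Exit module: run `body` with an
 * escape function; calling the escape function aborts `body` and makes
 * its argument the result of `withReturn`. *)
fun withReturn (body : ('a -> 'b) -> 'a) : 'a =
   let
      exception Return of 'a
   in
      body (fn result => raise Return result)
      handle Return result => result
   end

(* Turn any `app`-style iterator into a `find`. *)
fun appToFind (app : ('a -> unit) -> 'b -> unit)
              (predicate : 'a -> bool)
              (data : 'b) : 'a option =
   withReturn
      (fn return =>
          (app (fn x => if predicate x then return (SOME x) else ()) data
           ; NONE))

val found = appToFind List.app (fn x => x > 3) [1, 2, 4, 5]
val missing = appToFind List.app (fn x => x > 9) [1, 2, 4, 5]
```

With these definitions `found` is `SOME 4` and `missing` is `NONE`. Because the exception is declared inside `withReturn`, each call gets a fresh exception, so nested uses do not intercept one another's results.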
20485 == Continuation-passing Style (CPS) ==
20487 A general way to implement complex control patterns is to use
20488 http://en.wikipedia.org/wiki/Continuation-passing_style[CPS]. In CPS,
20489 instead of returning normally, functions invoke a function passed as
20490 an argument. In general, multiple continuation functions may be
20491 passed as arguments and the ordinary return continuation may also be
20492 used. As an example, here is a function that finds the leftmost
20493 element of a binary tree satisfying a given predicate:
datatype 'a tree = LEAF | BRANCH of 'a tree * 'a * 'a tree

fun find predicate = let
   fun recurse continue =
      fn LEAF => continue ()
       | BRANCH (lhs, elem, rhs) =>
         recurse
            (fn () =>
                if predicate elem then SOME elem
                else recurse continue rhs)
            lhs
in
   recurse (fn () => NONE)
end
20515 Note that the above function returns as soon as the leftmost element
20516 satisfying the predicate is found.
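The remark that several continuation functions may be passed can be illustrated with a small sketch over lists. The function `findK` and its parameter names are illustrative assumptions, not part of any library: one continuation handles success and the other handles failure, so the caller decides what each outcome produces.

```sml
(* Illustrative CPS search with two continuations: `onFound` receives the
 * first matching element, `onMissing` is called if there is none. *)
fun findK predicate onFound onMissing =
   fn [] => onMissing ()
    | x :: xs =>
      if predicate x then
         onFound x
      else
         findK predicate onFound onMissing xs

val hit = findK (fn x => x > 2) (fn x => x * 10) (fn () => ~1) [1, 2, 3, 4]
val miss = findK (fn x => x > 9) (fn x => x * 10) (fn () => ~1) [1, 2, 3, 4]
```

Here `hit` is `30` and `miss` is `~1`. No `option` value is built along the way; the result type is whatever the two continuations agree on.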
20520 :mlton-guide-page: RSSA
20525 <:RSSA:> is an <:IntermediateLanguage:>, translated from <:SSA2:> by
20526 <:ToRSSA:>, optimized by <:RSSASimplify:>, and translated by
20527 <:ToMachine:> to <:Machine:>.
20531 <:RSSA:> is an <:IntermediateLanguage:> that makes representation
20532 decisions explicit.
20534 == Implementation ==
20536 * <!ViewGitFile(mlton,master,mlton/backend/rssa.sig)>
20537 * <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)>
20539 == Type Checking ==
20541 The new type language is aimed at expressing bit-level control over
20542 layout and associated packing of data representations. There are
20543 singleton types that denote constants, other atomic types for things
20544 like integers and reals, and arbitrary sum types and sequence (tuple)
20545 types. The big change to the type system is that type checking is now
20546 based on subtyping, not type equality. So, for example, the singleton
20547 type `0xFFFFEEBB` whose only inhabitant is the eponymous constant is a
20548 subtype of the type `Word32`.
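To make the difference between equality-based and subtyping-based checking concrete, here is a toy model. It is only a sketch under simplifying assumptions: the datatype `ty` and the function `isSubtype` are invented for illustration and are not MLton's actual RSSA type definitions.

```sml
(* Toy model only; these are not MLton's RSSA type definitions. *)
datatype ty =
   Singleton of Word32.word   (* type inhabited by exactly one constant *)
 | Word32Ty                   (* type of all 32-bit words *)

(* isSubtype (t1, t2) holds when every inhabitant of t1 inhabits t2. *)
fun isSubtype (Singleton a, Singleton b) = a = b
  | isSubtype (Singleton _, Word32Ty) = true
  | isSubtype (Word32Ty, Word32Ty) = true
  | isSubtype (Word32Ty, Singleton _) = false

val constantFitsWord = isSubtype (Singleton 0wxFFFFEEBB, Word32Ty)
val wordFitsConstant = isSubtype (Word32Ty, Singleton 0wxFFFFEEBB)
```

In this model `constantFitsWord` is `true` and `wordFitsConstant` is `false`: a checker based on type equality would reject both, whereas a subtyping check accepts the constant wherever a `Word32` is expected.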
20550 == Details and Notes ==
20552 SSA is an abbreviation for Static Single Assignment. The <:RSSA:>
20553 <:IntermediateLanguage:> is a variant of SSA.
20557 :mlton-guide-page: RSSAShrink
20562 <:RSSAShrink:> is an optimization pass for the <:RSSA:>
20563 <:IntermediateLanguage:>.
20567 This pass implements a whole family of compile-time reductions, like:
20569 * constant folding, copy propagation
20570 * inline the `Goto` to a block with a unique predecessor
20572 == Implementation ==
20574 * <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)>
20576 == Details and Notes ==
20582 :mlton-guide-page: RSSASimplify
20587 The optimization passes for the <:RSSA:> <:IntermediateLanguage:> are
20588 collected and controlled by the `Backend` functor
20589 (<!ViewGitFile(mlton,master,mlton/backend/backend.sig)>,
20590 <!ViewGitFile(mlton,master,mlton/backend/backend.fun)>).
20592 The following optimization pass is implemented:
20596 The following implementation passes are implemented:
20598 * <:ImplementHandlers:>
20599 * <:ImplementProfiling:>
20600 * <:InsertLimitChecks:>
20601 * <:InsertSignalChecks:>
20603 The optimization passes can be controlled from the command-line by the options:
20605 * `-diag-pass <pass>` -- keep diagnostic info for pass
20606 * `-drop-pass <pass>` -- omit optimization pass
20607 * `-keep-pass <pass>` -- keep the results of pass
20611 :mlton-guide-page: RunningOnAIX
20616 MLton runs fine on AIX.
20620 * <:RunningOnPowerPC:>
20621 * <:RunningOnPowerPC64:>
20625 :mlton-guide-page: RunningOnAlpha
20630 MLton runs fine on the Alpha architecture.
20634 * When compiling for Alpha, MLton doesn't support native code
20635 generation (`-codegen native`). Hence, performance is not as good as
20636 it might be and compile times are longer. Also, the quality of code
20637 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20638 You can change this by calling MLton with `-cc-opt -O2`.
20640 * When compiling for Alpha, MLton uses `-align 8` by default.
20644 :mlton-guide-page: RunningOnAMD64
20649 MLton runs fine on the AMD64 (aka "x86-64" or "x64") architecture.
20653 * When compiling for AMD64, MLton targets the 64-bit ABI.
20655 * On AMD64, MLton supports native code generation (`-codegen native` or `-codegen amd64`).
20657 * When compiling for AMD64, MLton uses `-align 8` by default. Using
20658 `-align 4` may be incompatible with optimized builds of the <:GnuMP:>
20659 library, which assume 8-byte alignment. (See the thread at
20660 http://www.mlton.org/pipermail/mlton/2009-October/030674.html for more
20665 :mlton-guide-page: RunningOnARM
20670 MLton runs fine on the ARM architecture.
20674 * When compiling for ARM, MLton doesn't support native code generation
20675 (`-codegen native`). Hence, performance is not as good as it might be
20676 and compile times are longer. Also, the quality of code generated by
20677 `gcc` is important. By default, MLton calls `gcc -O1`. You can
20678 change this by calling MLton with `-cc-opt -O2`.
20682 :mlton-guide-page: RunningOnCygwin
20683 [[RunningOnCygwin]]
20687 MLton runs on the http://www.cygwin.com/[Cygwin] emulation layer,
20688 which provides a Posix-like environment while running on Windows. To
20689 run MLton with Cygwin, you must first install Cygwin on your Windows
20690 machine. To do this, visit the Cygwin site from your Windows machine
20691 and run their `setup.exe` script. Then, you can unpack the MLton
20692 binary `tgz` in your Cygwin environment.
20694 To run MLton cross-compiled executables on Windows, you must install
20695 the Cygwin `dll` on the Windows machine.
20699 * Time profiling is disabled.
20701 * Cygwin's `mmap` emulation is less than perfect. Sometimes it
20702 interacts badly with `Posix.Process.fork`.
20704 * The <!RawGitFile(mlton,master,regression/socket.sml)> regression
20705 test fails. We suspect this is not a bug and is simply due to our
20706 test relying on a certain behavior when connecting to a socket that
20707 has not yet accepted, which is handled differently on Cygwin than
20708 other platforms. Any help in understanding and resolving this issue
20713 * <:RunningOnMinGW:RunningOnMinGW>
20717 :mlton-guide-page: RunningOnDarwin
20718 [[RunningOnDarwin]]
20722 MLton runs fine on Darwin (and on Mac OS X).
20726 * MLton requires the <:GnuMP:> library, which is available via
20727 http://www.finkproject.org[Fink], http://www.macports.com[MacPorts],
20728 or http://mxcl.github.io/homebrew/[Homebrew].
20730 * For Intel-based Macs, MLton targets the <:RunningOnAMD64:AMD64
20731 architecture> on Darwin 10 (Mac OS X Snow Leopard) and higher and
20732 targets the <:RunningOnX86:x86 architecture> on Darwin 8 (Mac OS X
20733 Tiger) and Darwin 9 (Mac OS X Leopard).
20737 * Executables that save and load worlds on Darwin 11 (Mac OS X Lion)
20738 and higher should be compiled with `-link-opt -fno-PIE`; see
20739 <:MLtonWorld:> for more details.
20741 * <:ProfilingTime:> may give inaccurate results on multi-processor
20742 machines. The `SIGPROF` signal, used to sample the profiled program,
20743 is supposed to be delivered 100 times a second (i.e., at 10000us
20744 intervals), but there can be delays of over 1 minute between the
20745 delivery of consecutive `SIGPROF` signals. A more complete
20746 description may be found
20747 http://lists.apple.com/archives/Unix-porting/2007/Aug/msg00000.html[here]
20749 and http://lists.apple.com/archives/Darwin-dev/2007/Aug/msg00045.html[here].
20753 * <:RunningOnAMD64:>
20754 * <:RunningOnPowerPC:>
20759 :mlton-guide-page: RunningOnFreeBSD
20760 [[RunningOnFreeBSD]]
20764 MLton runs fine on http://www.freebsd.org/[FreeBSD].
20768 * MLton is available as a http://www.freebsd.org/[FreeBSD]
20769 http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[port].
20773 * Executables often run more slowly than on a comparable Linux
20774 machine. We conjecture that part of this is due to the costs of heap
20775 resizing and kernel zeroing of pages. Any help in solving the problem
20776 would be appreciated.
20778 * FreeBSD defaults to a datasize limit of 512M, even if you have more
20779 than that amount of memory in the computer. Hence, your MLton process
20780 will be limited in the amount of memory it has. To fix this problem,
20781 turn up the datasize and the default datasize available to a process:
20782 Edit `/boot/loader.conf` to set the limits. For example, the setting
20785 kern.maxdsiz="671088640"
20786 kern.dfldsiz="671088640"
20787 kern.maxssiz="134217728"
20790 will give a process a maximum data size of 640M, a default data size
20791 of 640M, and a maximum stack size of 128M.
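The byte counts in such settings are simply megabyte values multiplied out. As a minimal sketch, assuming a megabyte means 2^20 bytes, the following Standard ML fragment recomputes the three values; the helper names `megabytes` and `setting` are illustrative only.

```sml
(* Convert a size in megabytes (2^20 bytes) to bytes. *)
fun megabytes n = n * 1024 * 1024

(* Format one loader.conf-style line, e.g. kern.maxdsiz="671088640". *)
fun setting (name, mb) = name ^ "=\"" ^ Int.toString (megabytes mb) ^ "\""

val lines =
   map setting
       [("kern.maxdsiz", 640), ("kern.dfldsiz", 640), ("kern.maxssiz", 128)]
```

Evaluating `lines` yields the three strings `kern.maxdsiz="671088640"`, `kern.dfldsiz="671088640"`, and `kern.maxssiz="134217728"`, so other limits can be derived the same way by changing the megabyte values.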
20795 :mlton-guide-page: RunningOnHPPA
20800 MLton runs fine on the HPPA architecture.
20804 * When compiling for HPPA, MLton targets the 32-bit HPPA architecture.
20806 * When compiling for HPPA, MLton doesn't support native code
20807 generation (`-codegen native`). Hence, performance is not as good as
20808 it might be and compile times are longer. Also, the quality of code
20809 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20810 You can change this by calling MLton with `-cc-opt -O2`.
20812 * When compiling for HPPA, MLton uses `-align 8` by default. While
20813 this speeds up reals, it also may increase object sizes. If your
20814 program does not make significant use of reals, you might see a
20815 speedup with `-align 4`.
20819 :mlton-guide-page: RunningOnHPUX
20824 MLton runs fine on HPUX.
20828 * <:RunningOnHPPA:>
20832 :mlton-guide-page: RunningOnIA64
20837 MLton runs fine on the IA64 architecture.
20841 * When compiling for IA64, MLton targets the 64-bit ABI.
20843 * When compiling for IA64, MLton doesn't support native code
20844 generation (`-codegen native`). Hence, performance is not as good as
20845 it might be and compile times are longer. Also, the quality of code
20846 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20847 You can change this by calling MLton with `-cc-opt -O2`.
20849 * When compiling for IA64, MLton uses `-align 8` by default.
20851 * On the IA64, the <:GnuMP:> library supports multiple ABIs. See the
20852 <:GnuMP:> page for more details.
20856 :mlton-guide-page: RunningOnLinux
20861 MLton runs fine on Linux.
20865 :mlton-guide-page: RunningOnMinGW
20870 MLton runs on http://mingw.org[MinGW], a library for porting Unix
20871 applications to Windows. Some library functionality is missing or
20876 * To compile MLton on MinGW:
20877 ** The <:GnuMP:> library is required.
20878 ** The Bash shell is required. If you are using a prebuilt MSYS, you
20879 probably want to symlink `bash` to `sh`.
20883 * Many functions are unimplemented and will `raise SysErr`.
20884 ** `MLton.Itimer.set`
20885 ** `MLton.ProcEnv.setgroups`
20886 ** `MLton.Process.kill`
20887 ** `MLton.Process.reap`
20888 ** `MLton.World.load`
20889 ** `OS.FileSys.readLink`
20891 ** `OS.Process.terminate`
20892 ** `Posix.FileSys.chown`
20893 ** `Posix.FileSys.fchown`
20894 ** `Posix.FileSys.fpathconf`
20895 ** `Posix.FileSys.link`
20896 ** `Posix.FileSys.mkfifo`
20897 ** `Posix.FileSys.pathconf`
20898 ** `Posix.FileSys.readlink`
20899 ** `Posix.FileSys.symlink`
20900 ** `Posix.IO.dupfd`
20901 ** `Posix.IO.getfd`
20902 ** `Posix.IO.getfl`
20903 ** `Posix.IO.getlk`
20904 ** `Posix.IO.setfd`
20905 ** `Posix.IO.setfl`
20906 ** `Posix.IO.setlkw`
20907 ** `Posix.IO.setlk`
20908 ** `Posix.ProcEnv.ctermid`
20909 ** `Posix.ProcEnv.getegid`
20910 ** `Posix.ProcEnv.geteuid`
20911 ** `Posix.ProcEnv.getgid`
20912 ** `Posix.ProcEnv.getgroups`
20913 ** `Posix.ProcEnv.getlogin`
20914 ** `Posix.ProcEnv.getpgrp`
20915 ** `Posix.ProcEnv.getpid`
20916 ** `Posix.ProcEnv.getppid`
20917 ** `Posix.ProcEnv.getuid`
20918 ** `Posix.ProcEnv.setgid`
20919 ** `Posix.ProcEnv.setpgid`
20920 ** `Posix.ProcEnv.setsid`
20921 ** `Posix.ProcEnv.setuid`
20922 ** `Posix.ProcEnv.sysconf`
20923 ** `Posix.ProcEnv.times`
20924 ** `Posix.ProcEnv.ttyname`
20925 ** `Posix.Process.exece`
20926 ** `Posix.Process.execp`
20927 ** `Posix.Process.exit`
20928 ** `Posix.Process.fork`
20929 ** `Posix.Process.kill`
20930 ** `Posix.Process.pause`
20931 ** `Posix.Process.waitpid_nh`
20932 ** `Posix.Process.waitpid`
20933 ** `Posix.SysDB.getgrgid`
20934 ** `Posix.SysDB.getgrnam`
20935 ** `Posix.SysDB.getpwuid`
20936 ** `Posix.TTY.TC.drain`
20937 ** `Posix.TTY.TC.flow`
20938 ** `Posix.TTY.TC.flush`
20939 ** `Posix.TTY.TC.getattr`
20940 ** `Posix.TTY.TC.getpgrp`
20941 ** `Posix.TTY.TC.sendbreak`
20942 ** `Posix.TTY.TC.setattr`
20943 ** `Posix.TTY.TC.setpgrp`
20946 ** `UnixSock.fromAddr`
20947 ** `UnixSock.toAddr`
20951 :mlton-guide-page: RunningOnNetBSD
20952 [[RunningOnNetBSD]]
20956 MLton runs fine on http://www.netbsd.org/[NetBSD].
20958 == Installing the correct packages for NetBSD ==
20960 NetBSD installs third-party packages through a mechanism known as
20961 pkgsrc. This is a tree of Makefiles that, when invoked, download the
20962 source code, build a package, and install it on the system. In order
20963 to run MLton on NetBSD, you will have to install several packages for
20972 In order to get graphical call-graphs of profiling information, you
20973 will need the additional package
20975 * `graphics/graphviz`
20977 To build the documentation for MLton, you will need the additional
20980 * `textproc/asciidoc`.
20982 == Tips for compiling and using MLton on NetBSD ==
20984 MLton can be a memory hog on computers with little memory. While
20985 640MB of RAM ought to be enough to self-compile MLton, one might want
20986 to tune the NetBSD VM subsystem in order to succeed. The
20987 notes presented here are what <:JesperLouisAndersen:> uses for
20988 compiling MLton on his laptop.
20990 === The NetBSD VM subsystem ===
20992 NetBSD uses a VM subsystem named
20993 http://www.ccrc.wustl.edu/pub/chuck/tech/uvm/[UVM].
20994 http://www.selonen.org/arto/netbsd/vm_tune.html[Tuning the VM system]
20995 can be done via the `sysctl(8)`-interface with the "VM" MIB set.
20997 === Tuning the NetBSD VM subsystem for MLton ===
20999 MLton uses a lot of anonymous pages when it is running. Thus, we will
21000 need to tune up the default of 80 for anonymous pages. Setting
21003 sysctl -w vm.anonmax=95
21004 sysctl -w vm.anonmin=50
21005 sysctl -w vm.filemin=2
21006 sysctl -w vm.execmin=2
21007 sysctl -w vm.filemax=4
21008 sysctl -w vm.execmax=4
21011 makes it less likely for the VM system to swap out anonymous pages.
21012 For a full explanation of the above flags, see the documentation.
21014 The result is that my laptop goes from a MLton compile where it swaps
21015 a lot to a MLton compile with no swapping.
21019 :mlton-guide-page: RunningOnOpenBSD
21020 [[RunningOnOpenBSD]]
21024 MLton runs fine on http://www.openbsd.org/[OpenBSD].
21028 * The <!RawGitFile(mlton,master,regression/socket.sml)> regression
21029 test fails. We suspect this is not a bug and is simply due to our
21030 test relying on a certain behavior when connecting to a socket that
21031 has not yet accepted, which is handled differently on OpenBSD than
21032 other platforms. Any help in understanding and resolving this issue
21037 :mlton-guide-page: RunningOnPowerPC
21038 [[RunningOnPowerPC]]
21042 MLton runs fine on the PowerPC architecture.
21046 * When compiling for PowerPC, MLton targets the 32-bit PowerPC
21049 * When compiling for PowerPC, MLton doesn't support native code
21050 generation (`-codegen native`). Hence, performance is not as good as
21051 it might be and compile times are longer. Also, the quality of code
21052 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21053 You can change this by calling MLton with `-cc-opt -O2`.
21055 * On the PowerPC, the <:GnuMP:> library supports multiple ABIs. See
21056 the <:GnuMP:> page for more details.
21060 :mlton-guide-page: RunningOnPowerPC64
21061 [[RunningOnPowerPC64]]
21065 MLton runs fine on the PowerPC64 architecture.
21069 * When compiling for PowerPC64, MLton targets the 64-bit PowerPC
21072 * When compiling for PowerPC64, MLton doesn't support native code
21073 generation (`-codegen native`). Hence, performance is not as good as
21074 it might be and compile times are longer. Also, the quality of code
21075 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21076 You can change this by calling MLton with `-cc-opt -O2`.
21078 * On the PowerPC64, the <:GnuMP:> library supports multiple ABIs. See
21079 the <:GnuMP:> page for more details.
21083 :mlton-guide-page: RunningOnS390
21088 MLton runs fine on the S390 architecture.
21092 * When compiling for S390, MLton doesn't support native code
21093 generation (`-codegen native`). Hence, performance is not as good as
21094 it might be and compile times are longer. Also, the quality of code
21095 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21096 You can change this by calling MLton with `-cc-opt -O2`.
21100 :mlton-guide-page: RunningOnSolaris
21101 [[RunningOnSolaris]]
21105 MLton runs fine on Solaris.
21109 * You must install the `binutils`, `gcc`, and `make` packages. You
21110 can find out how to get these at
21111 http://www.sunfreeware.com[sunfreeware.com].
21113 * Making the documentation requires that you install `latex` and
21114 `dvips`, which are available in the `tetex` package.
21118 * Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow
21119 as to be impractical (many hours on a 500MHz UltraSparc). For this
21120 reason, we strongly recommend building with a
21121 <:CrossCompiling:cross compiler>.
21125 * <:RunningOnAMD64:>
21126 * <:RunningOnSparc:>
21131 :mlton-guide-page: RunningOnSparc
21136 MLton runs fine on the Sparc architecture.
21140 * When compiling for Sparc, MLton targets the 32-bit Sparc
21141 architecture (i.e., Sparc V8).
21143 * When compiling for Sparc, MLton doesn't support native code
21144 generation (`-codegen native`). Hence, performance is not as good as
21145 it might be and compile times are longer. Also, the quality of code
21146 generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21147 You can change this by calling MLton with `-cc-opt -O2`. We have seen
21148 this speed up some programs by as much as 30%, especially those
21149 involving floating point; however, it can also more than double
21152 * When compiling for Sparc, MLton uses `-align 8` by default. While
21153 this speeds up reals, it also may increase object sizes. If your
21154 program does not make significant use of reals, you might see a
21155 speedup with `-align 4`.
21159 * Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow
21160 as to be impractical (many hours on a 500MHz UltraSparc). For this
21161 reason, we strongly recommend building with a
21162 <:CrossCompiling:cross compiler>.
21166 * <:RunningOnSolaris:>
21170 :mlton-guide-page: RunningOnX86
21175 MLton runs fine on the x86 architecture.
21179 * On x86, MLton supports native code generation (`-codegen native` or
21184 :mlton-guide-page: RunTimeOptions
21189 Executables produced by MLton take command line arguments that control
21190 the runtime system. These arguments are optional, and occur before
21191 the executable's usual arguments. To use these options, the first
21192 argument to the executable must be `@MLton`. The optional arguments
21193 then follow, must be terminated by `--`, and are followed by any
21194 arguments to the program. The optional arguments are _not_ made
21195 available to the SML program via `CommandLine.arguments`. For
21196 example, a valid call to `hello-world` is:
21199 hello-world @MLton gc-summary fixed-heap 10k -- a b c
21202 In the above example,
21203 `CommandLine.arguments () = ["a", "b", "c"]`.
A sequence of `@MLton` argument groups is also allowed, as in:
21208 hello-world @MLton gc-summary -- @MLton fixed-heap 10k -- a b c
21211 Run-time options can also control MLton, as in
21214 mlton @MLton fixed-heap 0.5g -- foo.sml
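The argument convention described above can be modeled in a few lines of SML. This is only a sketch of the documented behavior, not the runtime's actual implementation; `stripRuntimeArgs` is a hypothetical name, and returning `NONE` for a group that is missing its terminating `--` is an assumption of the sketch.

```sml
(* Hypothetical model of how leading "@MLton ... --" groups are consumed
 * before the program sees its arguments.  Returns the arguments that would
 * be visible via CommandLine.arguments, or NONE if a group is not
 * terminated by "--" (an assumption of this sketch).
 *)
fun stripRuntimeArgs (args: string list): string list option =
   case args of
      "@MLton" :: rest =>
         let
            (* Skip runtime options up to and including the next "--",
             * then look for a further "@MLton" group.
             *)
            fun skip [] = NONE
              | skip ("--" :: rest') = stripRuntimeArgs rest'
              | skip (_ :: rest') = skip rest'
         in
            skip rest
         end
    | _ => SOME args
```

Applied to the argument lists of the two `hello-world` calls above, the sketch yields `SOME ["a", "b", "c"]` in both cases.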
21220 * ++fixed-heap __x__{k|K|m|M|g|G}++
Use a fixed-size heap of size _x_, where _x_ is a real number and the
21223 trailing letter indicates its units.
21227 | `k` or `K` | 1024
21228 | `m` or `M` | 1,048,576
21229 | `g` or `G` | 1,073,741,824
21232 A value of `0` means to use almost all the RAM present on the machine.
21234 The heap size used by `fixed-heap` includes all memory allocated by
21235 SML code, including memory for the stack (or stacks, if there are
21236 multiple threads). It does not, however, include any memory used for
21237 code itself or memory used by C globals, the C stack, or malloc.
21241 Print a message at the start and end of every garbage collection.
21245 Print a summary of garbage collection statistics upon program
21246 termination to standard error.
21248 * ++gc-summary-file __file__++
21250 Print a summary of garbage collection statistics upon program
21251 termination to the file specified by _file_.
21253 * ++load-world __world__++
21255 Restart the computation with the file specified by _world_, which must
21256 have been created by a call to `MLton.World.save` by the same
21257 executable. See <:MLtonWorld:>.
21259 * ++max-heap __x__{k|K|m|M|g|G}++
21261 Run the computation with an automatically resized heap that is never
21262 larger than _x_, where _x_ is a real number and the trailing letter
21263 indicates the units as with `fixed-heap`. The heap size for
21264 `max-heap` is accounted for as with `fixed-heap`.
21266 * ++may-page-heap {false|true}++
21268 Enable paging the heap to disk when unable to grow the heap to a
21271 * ++no-load-world++
21273 Disable `load-world`. This can be used as an argument to the compiler
21274 via `-runtime no-load-world` to create executables that will not load
21275 a world. This may be useful to ensure that set-uid executables do not
21276 load some strange world.
21278 * ++ram-slop __x__++
21280 Multiply _x_ by the amount of RAM on the machine to obtain what the
21281 runtime views as the amount of RAM it can use. Typically _x_ is less
21282 than 1, and is used to account for space used by other programs
21283 running on the same machine.
21287 Causes the runtime to stop processing `@MLton` arguments once the next
21288 `--` is reached. This can be used as an argument to the compiler via
`-runtime stop` to create executables that don't process any `@MLton` arguments.
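To make the size notation used by `fixed-heap` and `max-heap` concrete, here is a minimal sketch that converts a string such as `10k` or `0.5g` to a number of bytes, using the multipliers from the table above. The function names are hypothetical and this is not the runtime's own parser; in particular, the sketch only handles the suffixed form and returns `NONE` for anything else.

```sml
(* Multiplier for a unit suffix, following the table for fixed-heap. *)
fun unitMultiplier (c: char): real option =
   case Char.toLower c of
      #"k" => SOME 1024.0
    | #"m" => SOME 1048576.0
    | #"g" => SOME 1073741824.0
    | _ => NONE

(* Convert "x{k|K|m|M|g|G}" to bytes, truncating any fractional byte.
 * LargeInt is used because the result may not fit in a 32-bit Int.
 *)
fun sizeToBytes (s: string): LargeInt.int option =
   if size s < 2
      then NONE
   else
      let
         val digits = String.substring (s, 0, size s - 1)
         val suffix = String.sub (s, size s - 1)
      in
         case (Real.fromString digits, unitMultiplier suffix) of
            (SOME x, SOME m) =>
               if x < 0.0
                  then NONE
               else SOME (Real.toLargeInt IEEEReal.TO_ZERO (x * m))
          | _ => NONE
      end
```

For example, `sizeToBytes "10k"` evaluates to `SOME 10240` and `sizeToBytes "0.5g"` to `SOME 536870912`.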
21294 :mlton-guide-page: ScopeInference
21299 Scope inference is an analysis/rewrite pass for the <:AST:>
21300 <:IntermediateLanguage:>, invoked from <:Elaborate:>.
21304 This pass adds free type variables to the `val` or `fun`
21305 declaration where they are implicitly scoped.
21307 == Implementation ==
21309 <!ViewGitFile(mlton,master,mlton/elaborate/scope.sig)>
21310 <!ViewGitFile(mlton,master,mlton/elaborate/scope.fun)>
21312 == Details and Notes ==
Scope inference determines, for each type variable, the declaration
21315 where it is bound. Scope inference is a direct implementation of the
21316 specification given in section 4.6 of the
21317 <:DefinitionOfStandardML: Definition>. Recall that a free occurrence
21318 of a type variable `'a` in a declaration `d` is _unguarded_
21319 in `d` if `'a` is not part of a smaller declaration. A type
21320 variable `'a` is implicitly scoped at `d` if `'a` is
21321 unguarded in `d` and `'a` does not occur unguarded in any
21322 declaration containing `d`.
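A small example may help with the definition of implicit scoping. The names `pair` and `id` are chosen only for illustration.

```sml
(* 'a occurs unguarded in the inner `val id` declaration.  It does not
 * occur unguarded in the enclosing `val pair` declaration, because there
 * it is part of a smaller declaration.  Hence 'a is implicitly scoped at
 * `val id`, as if one had written `val 'a id: 'a -> 'a = fn z => z`,
 * and `id` may be used at two different types.
 *)
val pair =
   let
      val id: 'a -> 'a = fn z => z
   in
      (id 1, id "one")
   end

(* By contrast, in
 *
 *   val bad =
 *      (let val id: 'a -> 'a = fn z => z in (id 1, id "one") end,
 *       fn z => (z: 'a))
 *
 * 'a also occurs unguarded in `val bad` itself, so it is scoped at the
 * outer declaration.  Then `id` cannot be generalized over 'a and the
 * declaration is rejected by the type checker.
 *)
```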
21324 The first pass of scope inference walks down the tree and renames all
21325 explicitly bound type variables in order to avoid name collisions. It
21326 then walks up the tree and adds to each declaration the set of
21327 unguarded type variables occurring in that declaration. At this
21328 point, if declaration `d` contains an unguarded type variable
21329 `'a` and the immediately containing declaration does not contain
21330 `'a`, then `'a` is implicitly scoped at `d`. The final
pass walks down the tree, leaving `'a` at the declaration where
21332 it is scoped and removing it from all enclosed declarations.
21336 :mlton-guide-page: SelfCompiling
21341 If you want to compile MLton, you must first get the <:Sources:>. You
21342 can compile with either MLton or SML/NJ, but we strongly recommend
21343 using MLton, since it generates a much faster and more robust
21346 == Compiling with MLton ==
21348 To compile with MLton, you need the binary versions of `mlton`,
21349 `mllex`, and `mlyacc` that come with the MLton binary package. To be
21350 safe, you should use the same version of MLton that you are building.
21351 However, older versions may work, as long as they don't go back too
21352 far. To build MLton, run `make` from within the root directory of the
21353 sources. This will build MLton first with the already installed
21354 binary version of MLton and will then rebuild MLton with itself.
21356 First, the `Makefile` calls `mllex` and `mlyacc` to build the lexer
21357 and parser, and then calls `mlton` to compile itself. When making
MLton using another version, the `Makefile` automatically uses
21359 `mlton-stubs.mlb`, which will put in enough stubs to emulate the
21360 `structure MLton`. Once MLton is built, the `Makefile` will rebuild
21361 MLton with itself, this time using `mlton.mlb` and the real
21362 `structure MLton` from the <:BasisLibrary:Basis Library>. This second round
21363 of compilation is essential in order to achieve a fast and robust
21366 Compiling MLton requires at least 1GB of RAM for 32-bit platforms (2GB is
21367 preferable) and at least 2GB RAM for 64-bit platforms (4GB is preferable).
21368 If your machine has less RAM, self-compilation will
21369 likely fail, or at least take a very long time due to paging. Even if
21370 you have enough memory, there simply may not be enough available, due
21371 to memory consumed by other processes. In this case, you may see an
21372 `Out of memory` message, or self-compilation may become extremely
21373 slow. The only fix is to make sure that enough memory is available.
21375 === Possible Errors ===
* The C compiler may not be able to find the <:GnuMP:> header file,
`gmp.h`, leading to an error like the following.
21381 cenv.h:49:18: fatal error: gmp.h: No such file or directory
21384 The solution is to install (or build) GnuMP on your machine. If you
install it at a location not on the default search path, then run
21386 ++make WITH_GMP_INC_DIR=__/path/to/gmp/include__ WITH_GMP_LIB_DIR=__/path/to/gmp/lib__++.
* The following errors indicate that a binary version of MLton could
21389 not be found in your path.
21392 /bin/sh: mlton: command not found
21396 make[2]: mlton: Command not found
21399 You need to have `mlton` in your path to build MLton from source.
At various points during the build, the `Makefile`s look for `mlton` in
your path and in `src/build/bin`. It is OK if
21403 the latter doesn't exist when the build starts; it is the target being
21404 built. Failure to find a `mlton` in your path will abort the build.
21407 == Compiling with SML/NJ ==
21409 To compile with SML/NJ, run `make bootstrap-smlnj` from within the
21410 root directory of the sources. You must use a recent version of
21411 SML/NJ. First, the `Makefile` calls `ml-lex` and `ml-yacc` to build
21412 the lexer and parser. Then, it calls SML/NJ with the appropriate
21413 `sources.cm` file. Once MLton is built with SML/NJ, the `Makefile`
will rebuild MLton with this SML/NJ-built MLton and then will rebuild
MLton with the MLton-built MLton. Building with SML/NJ takes
significant time (particularly during the "`parseAndElaborate`" phase
when the SML/NJ-built MLton is compiling MLton). Unless you are doing
21418 compiler development and need rapid recompilation, we recommend
21419 compiling with MLton.
21423 :mlton-guide-page: Serialization
21428 <:StandardML:Standard ML> does not have built-in support for
21429 serialization. Here are papers that describe user-level approaches:
21431 * <!Cite(Elsman04)>
21432 * <!Cite(Kennedy04)>
21434 The MLton repository also contains an experimental generic programming
21436 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that
21437 includes a pickling (serialization) generic (see
21438 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pickle.sig)>).
21442 :mlton-guide-page: ShareZeroVec
21447 <:ShareZeroVec:> is an optimization pass for the <:SSA:>
21448 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
21452 An SSA optimization to share zero-length vectors.
21454 From <!ViewGitCommit(mlton,be8c5f576)>, which replaced the use of the
21455 `Array_array0Const` primitive in the Basis Library implementation with a
21456 (nullary) `Vector_vector` primitive:
21460 The original motivation for the `Array_array0Const` primitive was to share the
21461 heap space required for zero-length vectors among all vectors (of a given type).
21462 It was claimed that this optimization is important, e.g., in a self-compile,
21463 where vectors are used for lots of syntax tree elements and many of those
21464 vectors are empty. See:
21465 http://www.mlton.org/pipermail/mlton-devel/2002-February/021523.html
21467 Curiously, the full effect of this optimization has been missing for quite some
21468 time (perhaps since the port of <:ConstantPropagation:> to the SSA IL). While
21469 <:ConstantPropagation:> has "globalized" the nullary application of the
21470 `Array_array0Const` primitive, it also simultaneously transformed it to an
21471 application of the `Array_uninit` (previously, the `Array_array`) primitive to
21472 the zero constant. The hash-consing of globals, meant to create exactly one
21473 global for each distinct constant, treats `Array_uninit` primitives as unequal
21474 (appropriately, since `Array_uninit` allocates an array with identity (though
the identity may be suppressed by a subsequent `Array_toVector`)), hence each
21476 distinct `Array_array0Const` primitive in the program remained as distinct
21477 globals. The limited amount of inlining prior to <:ConstantPropagation:> meant
21478 that there were typically fewer than a dozen "copies" of the same empty vector
21479 in a program for a given type.
21481 As a "functional" primitive, a nullary `Vector_vector` is globalized by
21482 ClosureConvert, but is further recognized by ConstantPropagation and hash-consed
21483 into a unique instance for each type.
21486 However, a single, shared, global `Vector_vector ()` inhibits the
21487 coercion-based optimizations of `Useless`. For example, consider the
21492 val n = valOf (Int.fromString (hd (CommandLine.arguments ())))
21494 val v1 = Vector.tabulate (n, fn i =>
21495 let val w = Word16.fromInt i
21496 in (w - 0wx1, w, w + 0wx1 + w)
21498 val v2 = Vector.map (fn (w1, w2, w3) => (w1, 0wx2 * w2, 0wx3 * w3)) v1
21499 val v3 = VectorSlice.vector (VectorSlice.slice (v1, 1, SOME (n - 2)))
21500 val ans1 = Vector.foldl (fn ((w1,w2,w3),w) => w + w1 + w2 + w3) 0wx0 v1
21501 val ans2 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v2
21502 val ans3 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v3
21504 val _ = print (concat ["ans1 = ", Word16.toString ans1, " ",
21505 "ans2 = ", Word16.toString ans2, " ",
21506 "ans3 = ", Word16.toString ans3, "\n"])
21509 We would like `v2` and `v3` to be optimized from
21510 `(word16 * word16 * word16) vector` to `word16 vector` because only
21511 the 2nd component of the elements is needed to compute the answer.
21513 With `Array_array0Const`, each distinct occurrence of
21514 `Array_array0Const((word16 * word16 * word16))` arising from
21515 polyvariance and inlining remained a distinct
21516 `Array_uninit((word16 * word16 * word16)) (0x0)` global, which
21517 resulted in distinct occurrences for the
21518 `val v1 = Vector.tabulate ...` and for the
21519 `val v2 = Vector.map ...`. The latter could be optimized to
21520 `Array_uninit(word16) (0x0)` by `Useless`, because its result only
21521 flows to places requiring the 2nd component of the elements.
21523 With `Vector_vector ()`, the distinct occurrences of
21524 `Vector_vector((word16 * word16 * word16)) ()` arising from
21525 polyvariance are globalized during `ClosureConvert`, those global
21526 references may be further duplicated by inlining, but the distinct
21527 occurrences of `Vector_vector((word16 * word16 * word16)) ()` are
21528 merged to a single occurrence. Because this result flows to places
21529 requiring all three components of the elements, it remains
21530 `Vector_vector((word16 * word16 * word16)) ()` after
21531 `Useless`. Furthermore, because one cannot (in constant time) coerce a
21532 `(word16 * word16 * word16) vector` to a `word16 vector`, the `v2`
21533 value remains of type `(word16 * word16 * word16) vector`.
21535 One option would be to drop the 0-element vector "optimization"
21536 entirely. This costs some space (no sharing of empty vectors) and
21537 some time (allocation and garbage collection of empty vectors).
21539 Another option would be to reinstate the `Array_array0Const` primitive
21540 and associated `ConstantPropagation` treatment. But, the semantics
21541 and purpose of `Array_array0Const` was poorly understood, resulting in
21544 The <:ShareZeroVec:> pass pursues a different approach: perform the 0-element
21545 vector "optimization" as a separate optimization, after
21546 `ConstantPropagation` and `Useless`. A trivial static analysis is
21547 used to match `val v: t vector = Array_toVector(t) (a)` with
corresponding `val a: t array = Array_uninit(t) (l)` and the latter are
21550 `val a: t array = if 0 = l then zeroArr_[t] else Array_uninit(t) (l)`
21551 with a single global `val zeroArr_[t] = Array_uninit(t) (0)` created
21552 for each distinct type (after coercion-based optimizations).
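The rewrite operates on SSA primitives, but its effect can be illustrated with a source-level analogy. The following is only a sketch at a single, fixed element type; `zeroVec` and `tabulateShared` are hypothetical names, and `zeroVec` merely plays the role of the per-type `zeroArr_[t]` global.

```sml
(* One shared empty vector for the element type int. *)
val zeroVec: int vector = Vector.fromList []

(* Allocate only when the requested length is nonzero; otherwise reuse the
 * shared empty vector, mirroring `if 0 = l then zeroArr_[t] else ...`.
 *)
fun tabulateShared (n: int, f: int -> int): int vector =
   if 0 = n
      then zeroVec
   else Vector.tabulate (n, f)
```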
21554 One disadvantage of this approach, compared to the `Vector_vector(t) ()`
21555 approach, is that `Array_toVector` is applied each time a vector
21556 is created, even if it is being applied to the `zeroArr_[t]`
21557 zero-length array. (Although, this was the behavior of the
21558 `Array_array0Const` approach.) This updates the object header each
21559 time, whereas the `Vector_vector(t) ()` approach would have updated
21560 the object header once, when the global was created, and the
21561 `zeroVec_[t]` global and the `Array_toVector` result would flow to the
21564 It would be possible to properly share zero-length vectors, but doing
21565 so is a more sophisticated analysis and transformation, because there
21566 can be arbitrary code between the
21567 `val a: t array = Array_uninit(t) (l)` and the corresponding
`val v: t vector = Array_toVector(t) (a)`, although, in practice,
21569 nothing happens when a zero-length vector is created. It may be best
21570 to pursue a more general "array to vector" optimization that
21571 transforms creations of static-length vectors (e.g., all the
21572 `Vector.new<N>` functions) into `Vector_vector` primitives (some of
21573 which could be globalized).
21575 == Implementation ==
21577 * <!ViewGitFile(mlton,master,mlton/ssa/share-zero-vec.fun)>
21579 == Details and Notes ==
21585 :mlton-guide-page: ShowBasis
21590 MLton has a flag, `-show-basis <file>`, that causes MLton to pretty
21591 print to _file_ the basis defined by the input program. For example,
21592 if `foo.sml` contains
21597 then `mlton -show-basis foo.basis foo.sml` will create `foo.basis`
21598 with the following contents.
21603 If you only want to see the basis and do not wish to compile the
21604 program, you can call MLton with `-stop tc`.
21606 == Displaying signatures ==
When displaying signatures, MLton prefixes types defined in the
signature with `_sig.` to distinguish them from types defined in the
21610 environment. For example,
21616 val x: t * int -> unit
21624 val x: _sig.t * int -> unit
21628 Notice that `int` occurs without the `_sig.` prefix.
21630 MLton also uses a canonical name for each type in the signature, and
21631 that name is used everywhere for that type, no matter what the input
21632 signature looked like. For example:
21654 Canonical names are always relative to the "top" of the signature,
21655 even when used in nested substructures. For example:
21687 == Displaying structures ==
21689 When displaying structures, MLton uses signature constraints wherever
21690 possible, combined with `where type` clauses to specify the meanings
21691 of the types defined within the signature. For example:
21704 structure S2:> SIG = S
21716 where type t = S2.t
21721 :mlton-guide-page: ShowBasisDirective
21722 [[ShowBasisDirective]]
21726 A comment of the form `(*#showBasis "<file>"*)` is recognized as a directive to
21727 save the current basis (i.e., environment) to `<file>` (in the same format as
21728 the `-show-basis <file>` <:CompileTimeOptions: compile-time option>). The
21729 `<file>` is interpreted relative to the source file in which it appears. The
21730 comment is lexed as a distinct token and is parsed as a structure-level
21731 declaration. [Note that treating the directive as a top-level declaration would
21732 prohibit using it inside a functor body, which would make the feature
21733 significantly less useful in the context of the MLton compiler sources (with its
21734 nearly fully functorial style).]
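As an illustration of the placement described above, the directive can sit between declarations inside a functor body. The functor, its members, and the file name `f-body.basis` are hypothetical; to a compiler that does not recognize the directive, it is an ordinary comment.

```sml
functor F (val n: int) =
struct
   val double = 2 * n
   (* The basis saved here would include `n` and `double`,
    * but not `quadruple`, which is declared afterwards.
    *)
   (*#showBasis "f-body.basis"*)
   val quadruple = 2 * double
end

structure S = F (val n = 21)
```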
21736 This feature is meant to facilitate auto-completion via
21737 https://github.com/MatthewFluet/company-mlton[`company-mlton`] and similar
21742 :mlton-guide-page: ShowProf
21747 If an executable is compiled for <:Profiling:profiling>, then it
21748 accepts a special command-line runtime system argument, `show-prof`,
21749 that outputs information about the source functions that are profiled.
21750 Normally, this information is used by `mlprof`. This page documents
21751 the `show-prof` output format, and is intended for those working on
21752 the profiler internals.
21754 The `show-prof` output is ASCII, and consists of a sequence of lines.
21756 * The magic number of the executable.
21757 * The number of source names in the executable.
21758 * A line for each source name giving the name of the function, a tab,
21759 the filename of the file containing the function, a colon, a space,
21760 and the line number that the function starts on in that file.
21761 * The number of (split) source functions.
21762 * A line for each (split) source function, where each line consists of
21763 a source-name index (into the array of source names) and a successors
21764 index (into the array of split-source sequences, defined below).
21765 * The number of split-source sequences.
21766 * A line for each split-source sequence, where each line is a space
21767 separated list of (split) source functions.
21769 The latter two arrays, split sources and split-source sequences,
21770 define a directed graph, which is the call-graph of the program.
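As a sketch of how a tool might read one of the source-name lines described above, the following function splits a line of the form `<function>`, tab, `<file>`, colon, space, `<line number>`. The function name and the sample line in the usage note are hypothetical, and the sketch assumes the format is exactly as stated in the list.

```sml
(* Parse "<function>\t<file>: <line>" into its three components.
 * Returns NONE if the line does not have that shape.
 *)
fun parseSourceName (s: string) =
   case String.fields (fn c => c = #"\t") s of
      [name, location] =>
         let
            (* digits is the longest all-digit suffix of location;
             * prefix is everything before it and should end in ": ".
             *)
            val (prefix, digits) =
               Substring.splitr Char.isDigit (Substring.full location)
         in
            if Substring.isSuffix ": " prefix
               then
                  Option.map
                  (fn n => {name = name,
                            file = Substring.string (Substring.trimr 2 prefix),
                            line = n})
                  (Int.fromString (Substring.string digits))
            else NONE
         end
    | _ => NONE
```

For example, `parseSourceName "main\tfoo.sml: 12"` evaluates to `SOME {name = "main", file = "foo.sml", line = 12}`.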
21774 :mlton-guide-page: Shrink
21779 <:Shrink:> is a rewrite pass for the <:SSA:> and <:SSA2:>
21780 <:IntermediateLanguage:>s, invoked from every optimization pass (see
21781 <:SSASimplify:> and <:SSA2Simplify:>).
21785 This pass implements a whole family of compile-time reductions, like:
21787 * `#1(a, b)` => `a`
21788 * `case C x of C y => e` => `let y = x in e`
21789 * constant folding, copy propagation
21791 * tuple reconstruction elimination
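To give a feel for this kind of reduction, here is a toy simplifier over a tiny expression language. It is not MLton's implementation and does not use the SSA data structures; the `exp` datatype and the `shrink` function are invented for this sketch, which covers only tuple selection and constant folding of addition.

```sml
datatype exp =
   Var of string
 | Const of int
 | Tuple of exp list
 | Select of int * exp   (* Select (i, e) models #i e, with i counted from 1 *)
 | Add of exp * exp

(* Bottom-up rewriting.  Dropping the unselected tuple components is sound
 * here only because these expressions have no side effects.
 *)
fun shrink (e: exp): exp =
   case e of
      Tuple es => Tuple (map shrink es)
    | Select (i, e1) =>
         (case shrink e1 of
             Tuple es =>
                if 1 <= i andalso i <= length es
                   then List.nth (es, i - 1)      (* #1(a, b) => a *)
                else Select (i, Tuple es)
           | e1' => Select (i, e1'))
    | Add (e1, e2) =>
         (case (shrink e1, shrink e2) of
             (Const m, Const n) => Const (m + n)  (* constant folding *)
           | (e1', e2') => Add (e1', e2'))
    | Var _ => e
    | Const _ => e
```

For example, `shrink (Select (1, Tuple [Var "a", Var "b"]))` evaluates to `Var "a"`.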
21793 == Implementation ==
21795 * <!ViewGitFile(mlton,master,mlton/ssa/shrink.sig)>
21796 * <!ViewGitFile(mlton,master,mlton/ssa/shrink.fun)>
21797 * <!ViewGitFile(mlton,master,mlton/ssa/shrink2.sig)>
21798 * <!ViewGitFile(mlton,master,mlton/ssa/shrink2.fun)>
21800 == Details and Notes ==
21802 The <:Shrink:> pass is run after every <:SSA:> and <:SSA2:>
21805 The <:Shrink:> implementation also includes functions to eliminate
unreachable blocks from an <:SSA:> or <:SSA2:> program or function.
The <:Shrink:> pass is not guaranteed to eliminate all unreachable
21808 blocks. Doing so would unduly complicate the implementation, and it
21809 is almost always the case that all unreachable blocks are eliminated.
21810 However, a small number of optimization passes require that the input
21811 have no unreachable blocks (essentially, when the analysis works on
21812 the control flow graph and the rewrite iterates on the vector of
21813 blocks). These passes explicitly call `eliminateDeadBlocks`.
21815 The <:Shrink:> pass has a special case to turn a non-tail call where
21816 the continuation and handler only do `Profile` statements into a tail
21817 call where the `Profile` statements precede the tail call.
21821 :mlton-guide-page: SimplifyTypes
21826 <:SimplifyTypes:> is an optimization pass for the <:SSA:>
21827 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
21831 This pass computes a "cardinality" of each datatype, which is an
21832 abstraction of the number of values of the datatype.
21834 * `Zero` means the datatype has no values (except for bottom).
21835 * `One` means the datatype has one value (except for bottom).
21836 * `Many` means the datatype has many values.
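One natural way to compute such an abstraction for non-recursive datatypes is with a saturating sum and product. This is a sketch under that assumption, not a description of the pass's actual algorithm; all names are hypothetical, and recursive datatypes would additionally need a fixed-point iteration, which the sketch omits.

```sml
datatype card = Zero | One | Many

(* Saturating sum: a value is built with one constructor or another. *)
fun plus (Zero, c) = c
  | plus (c, Zero) = c
  | plus _ = Many

(* Saturating product: a constructor needs a value for every argument. *)
fun times (Zero, _) = Zero
  | times (_, Zero) = Zero
  | times (One, One) = One
  | times _ = Many

(* A constructor is described by the cardinalities of its arguments. *)
fun conCard (args: card list): card = foldl times One args

(* A datatype is described by the list of its constructors. *)
fun datatypeCard (cons: card list list): card =
   foldl plus Zero (map conCard cons)
```

Under this model, a datatype with two nullary constructors has `datatypeCard [[], []] = Many`, while a datatype with a single constructor whose arguments all have cardinality `One` has cardinality `One`.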
21838 This pass removes all datatypes whose cardinality is `Zero` or `One`
21841 * components of tuples
21845 which are such datatypes.
21847 This pass marks constructors as one of:
21849 * `Useless`: it never appears in a `ConApp`.
21850 * `Transparent`: it is the only variant in its datatype and its argument type does not contain any uses of `array` or `vector`.
21851 * `Useful`: otherwise
21853 This pass also removes `Useless` and `Transparent` constructors.
21855 == Implementation ==
21857 * <!ViewGitFile(mlton,master,mlton/ssa/simplify-types.fun)>
21859 == Details and Notes ==
21861 This pass must happen before polymorphic equality is implemented because
21863 * it will make polymorphic equality faster because some types are simpler
21864 * it removes uses of polymorphic equality that must return true
21866 We must keep track of `Transparent` constructors whose argument type
21867 uses `array` because of datatypes like the following:
21870 datatype t = T of t array
21873 Such a datatype has `Cardinality.Many`, but we cannot eliminate the
21874 datatype and replace the lhs by the rhs, i.e. we must keep the
21875 circularity around.
The same care is needed for `vector`.
21879 Also, to eliminate as many `Transparent` constructors as possible, for
21880 something like the following,
21883 datatype t = T of u array
21884 and u = U of t vector
21886 we (arbitrarily) expand one of the datatypes first. The result will
21890 datatype u = U of u array array
21892 where all uses of `t` are replaced by `u array`.
21896 :mlton-guide-page: SML3d
21901 The http://sml3d.cs.uchicago.edu/[SML3d Project] is a collection of
21902 libraries to support 3D graphics programming using Standard ML and the
21903 http://www.opengl.org/[OpenGL] graphics API. It currently requires the
21904 MLton implementation of SML and is supported on Linux, Mac OS X, and
21905 Microsoft Windows. There is also support for
21906 http://www.khronos.org/opencl/[OpenCL].
21910 :mlton-guide-page: SMLNET
21915 http://www.cl.cam.ac.uk/research/tsg/SMLNET[SML.NET] is a
21916 <:StandardMLImplementations:Standard ML implementation> that
21917 targets the .NET Common Language Runtime.
21919 SML.NET is based on the <:MLj:MLj> compiler.
21923 * <!Cite(BentonEtAl04)>
21927 :mlton-guide-page: SMLNJ
21932 http://www.smlnj.org/[SML/NJ] is a
21933 <:StandardMLImplementations:Standard ML implementation>. It is a
21934 native code compiler that runs on a variety of platforms and has a
21935 number of libraries and tools.
21937 We maintain a list of SML/NJ's <:SMLNJDeviations:deviations> from
21938 <:DefinitionOfStandardML:The Definition of Standard ML>.
21940 MLton has support for some features of SML/NJ in order to ease porting
21941 between MLton and SML/NJ.
21943 * <:CompilationManager:> (CM)
21944 * <:LineDirective:>s
21945 * <:SMLofNJStructure:>
21946 * <:UnsafeStructure:>
21950 :mlton-guide-page: SMLNJDeviations
21951 [[SMLNJDeviations]]
21955 Here are some deviations of <:SMLNJ:SML/NJ> from
21956 <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>.
21957 Some of these are documented in the
21958 http://www.smlnj.org/doc/Conversion/index.html[SML '97 Conversion Guide].
21959 Since MLton does not deviate from the Definition, you should look here
21960 if you are having trouble porting a program from MLton to SML/NJ or
21961 vice versa. If you discover other deviations of SML/NJ that aren't
21962 listed here, please send mail to
21963 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
21965 * SML/NJ allows spaces in long identifiers, as in `S . x`. Section
21966 2.5 of the Definition implies that `S . x` should be treated as three
21967 separate lexical items.
21969 * SML/NJ allows `op` to appear in `val` specifications:
21973 signature FOO = sig
21974 val op + : int * int -> int
21978 The grammar on page 14 of the Definition does not allow it. Recent
21979 versions of SML/NJ do give a warning.
21988 as an unmatched close comment.
21990 * SML/NJ allows `=` to be rebound by the declaration:
21997 This is explicitly forbidden on page 5 of the Definition. Recent
21998 versions of SML/NJ do give a warning.
22000 * SML/NJ allows rebinding `true`, `false`, `nil`, `::`, and `ref` by
22012 This is explicitly forbidden on page 9 of the Definition.
22014 * SML/NJ extends the syntax of the language to allow vector
22015 expressions and patterns like the following:
22023 MLton supports vector expressions and patterns with the <:SuccessorML#VectorExpsAndPats:`allowVectorExpsAndPats`> <:MLBasisAnnotations:ML Basis annotation>.
22025 * SML/NJ extends the syntax of the language to allow _or patterns_
22026 like the following:
22030 datatype foo = Foo of int | Bar of int
22031 val (Foo x | Bar x) = Foo 13
22034 MLton supports or patterns with the <:SuccessorML#OrPats:`allowOrPats`> <:MLBasisAnnotations:ML Basis annotation>.
22036 * SML/NJ allows higher-order functors, that is, functors can be
22037 components of structures and can be passed as functor arguments and
22038 returned as functor results. As a consequence, SML/NJ allows
22039 abbreviated functor definitions, as in the following:
22048 functor F (structure A: S): S =
22056 * SML/NJ extends the syntax of the language to allow `functor` and
22057 `signature` declarations to occur within the scope of `local` and
22058 `structure` declarations.
22060 * SML/NJ allows duplicate type specifications in signatures when the
22061 duplicates are introduced by `include`, as in the following:
22082 This is disallowed by rule 77 of the Definition.
22084 * SML/NJ allows sharing constraints between type abbreviations in
22085 signatures, as in the following:
22097 These are disallowed by rule 78 of the Definition. Recent versions of
22098 SML/NJ correctly disallow sharing constraints between type
22099 abbreviations in signatures.
22101 * SML/NJ disallows multiple `where type` specifications of the same
type name, as in the following:
22114 This is allowed by rule 64 of the Definition.
22116 * SML/NJ allows `and` in `sharing` specs in signatures, as in
22130 * SML/NJ does not expand the `withtype` derived form as described by
22131 the Definition. According to page 55 of the Definition, the type
22132 bindings of a `withtype` declaration are substituted simultaneously in
22133 the connected datatype. Consider the following program.
22145 According to the Definition, it should be expanded to the following.
22157 However, SML/NJ expands `withtype` bindings sequentially, meaning that
22158 earlier bindings are expanded within later ones. Hence, the above
22159 program is expanded to the following.
22171 * SML/NJ allows `withtype` specifications in signatures.
22173 MLton supports `withtype` specifications in signatures with the <:SuccessorML#SigWithtype:`allowSigWithtype`> <:MLBasisAnnotations:ML Basis annotation>.
22175 * SML/NJ allows a `where` structure specification that is similar to a
22176 `where type` specification. For example:
22180 structure S = struct type t = int end
22183 structure T : sig type t end
22187 This is equivalent to:
22191 structure S = struct type t = int end
22194 structure T : sig type t end
22195 end where type T.t = S.t
22198 SML/NJ also allows a definitional structure specification that is
22199 similar to a definitional type specification. For example:
22203 structure S = struct type t = int end
22206 structure T : sig type t end = S
22210 This is equivalent to the previous examples and to:
22214 structure S = struct type t = int end
22217 structure T : sig type t end where type t = S.t
22221 * SML/NJ disallows binding non-datatypes with datatype replication.
22222 For example, it rejects the following program that should be allowed
22223 according to the Definition.
22227 type ('a, 'b) t = 'a * 'b
22228 datatype u = datatype t
22231 This idiom can be useful when one wants to rename a type without
22232 rewriting all the type arguments. For example, the above would have
22233 to be written in SML/NJ as follows.
22237 type ('a, 'b) t = 'a * 'b
22238 type ('a, 'b) u = ('a, 'b) t
22241 * SML/NJ disallows sharing a structure with one of its substructures.
22242 For example, SML/NJ disallows the following.
22251 structure T: sig type t end
22257 This signature is allowed by the Definition.
22259 * SML/NJ disallows polymorphic generalization of refutable
22260 patterns. For example, SML/NJ disallows the following.
22265 val _ = (1 :: x, "one" :: x)
22268 Recent versions of SML/NJ correctly allow polymorphic generalization
22269 of refutable patterns.
22271 * SML/NJ uses an overly restrictive context for type inference. For
22272 example, SML/NJ rejects both of the following.
22278 val z = (fn x => x) []
22279 val y = z :: [true] :: nil
22285 structure S : sig val z : bool list end =
22287 val z = (fn x => x) []
22291 These structures are allowed by the Definition.
22293 == Deviations from the Basis Library Specification ==
22295 Here are some deviations of SML/NJ from the <:BasisLibrary:Basis Library>
22296 http://www.standardml.org/Basis[specification].
22298 * SML/NJ exposes the equality of the `vector` type in structures such
22299 as `Word8Vector` that abstractly match `MONO_VECTOR`, which says
22300 `type vector`, not `eqtype vector`. So, for example, SML/NJ accepts
22301 the following program:
22305 fun f (v: Word8Vector.vector) = v = v
22308 * SML/NJ exposes the equality property of the type `status` in
22309 `OS.Process`. This means that programs which directly compare two
22310 values of type `status` will work with SML/NJ but not MLton.
22312 * Under SML/NJ on Windows, `OS.Path.validVolume` incorrectly considers
22313 absolute empty volumes to be valid. In other words, when the
22318 OS.Path.validVolume { isAbs = true, vol = "" }
22321 is evaluated by SML/NJ on Windows, the result is `true`. MLton, on
22322 the other hand, correctly follows the Basis Library Specification,
22323 which states that on Windows, `OS.Path.validVolume` should return
22324 `false` whenever `isAbs = true` and `vol = ""`.
22326 This incorrect behavior causes other `OS.Path` functions to behave
22327 differently. For example, when the expression
22331 OS.Path.toString (OS.Path.fromString "\\usr\\local")
22334 is evaluated by SML/NJ on Windows, the result is `"\\usr\\local"`,
22335 whereas under MLton on Windows, evaluating this expression (correctly)
22336 causes an `OS.Path.Path` exception to be raised.
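
For code that must compile under both MLton and SML/NJ, the two equality-related deviations listed above can be sidestepped by never applying `=` to the affected types. The following is a minimal sketch that uses only Basis Library functions; the names `sameBytes` and `succeeded` are hypothetical helpers introduced here for illustration.

```sml
(* Compare two Word8Vector.vector values without relying on `=`,
 * which MONO_VECTOR does not promise for the `vector` type. *)
fun sameBytes (a : Word8Vector.vector, b : Word8Vector.vector) : bool =
   Word8Vector.collate Word8.compare (a, b) = EQUAL

(* Inspect an OS.Process.status without comparing statuses with `=`. *)
fun succeeded (st : OS.Process.status) : bool =
   OS.Process.isSuccess st
```

Because `order` is an ordinary datatype, comparing the result of `collate` with `EQUAL` is portable even though the vectors themselves are not compared directly.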
22340 :mlton-guide-page: SMLNJLibrary
22345 The http://www.smlnj.org/doc/smlnj-lib/index.html[SML/NJ Library] is a
22346 collection of libraries that are distributed with SML/NJ. Due to
22347 differences between SML/NJ and MLton, these libraries will not work
out of the box with MLton.
22350 As of 20180119, MLton includes a port of the SML/NJ Library
22351 synchronized with SML/NJ version 110.82.
22355 * You can import a sub-library of the SML/NJ Library into an MLB file with:
22359 |MLB file|Description
|`$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb`|Various utility modules, including collections, simple formatting, ...
22361 |`$(SML_LIB)/smlnj-lib/Controls/controls-lib.mlb`|A library for managing control flags in an application.
22362 |`$(SML_LIB)/smlnj-lib/HashCons/hash-cons-lib.mlb`|Support for implementing hash-consed data structures.
22363 |`$(SML_LIB)/smlnj-lib/HTML/html-lib.mlb`|HTML 3.2 parsing and pretty-printing library.
22364 |`$(SML_LIB)/smlnj-lib/HTML4/html4-lib.mlb`|HTML 4.01 parsing and pretty-printing library.
22365 |`$(SML_LIB)/smlnj-lib/INet/inet-lib.mlb`|Networking utilities; supported on both Unix and Windows systems.
22366 |`$(SML_LIB)/smlnj-lib/JSON/json-lib.mlb`|JavaScript Object Notation (JSON) reading and writing library.
22367 |`$(SML_LIB)/smlnj-lib/PP/pp-lib.mlb`|Pretty-printing library.
22368 |`$(SML_LIB)/smlnj-lib/Reactive/reactive-lib.mlb`|Reactive scripting library.
22369 |`$(SML_LIB)/smlnj-lib/RegExp/regexp-lib.mlb`|Regular expression library.
22370 |`$(SML_LIB)/smlnj-lib/SExp/sexp-lib.mlb`|S-expression library.
22371 |`$(SML_LIB)/smlnj-lib/Unix/unix-lib.mlb`|Utilities for Unix-based operating systems.
22372 |`$(SML_LIB)/smlnj-lib/XML/xml-lib.mlb`|XML library.
22375 * If you are porting a project from SML/NJ's <:CompilationManager:> to
22376 MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
22377 following maps are included by default:
22381 $SMLNJ-LIB $(SML_LIB)/smlnj-lib
22382 $smlnj-lib.cm $(SML_LIB)/smlnj-lib/Util
22383 $controls-lib.cm $(SML_LIB)/smlnj-lib/Controls
22384 $hash-cons-lib.cm $(SML_LIB)/smlnj-lib/HashCons
22385 $html-lib.cm $(SML_LIB)/smlnj-lib/HTML
22386 $html4-lib.cm $(SML_LIB)/smlnj-lib/HTML4
22387 $inet-lib.cm $(SML_LIB)/smlnj-lib/INet
22388 $json-lib.cm $(SML_LIB)/smlnj-lib/JSON
22389 $pp-lib.cm $(SML_LIB)/smlnj-lib/PP
22390 $reactive-lib.cm $(SML_LIB)/smlnj-lib/Reactive
22391 $regexp-lib.cm $(SML_LIB)/smlnj-lib/RegExp
22392 $sexp-lib.cm $(SML_LIB)/smlnj-lib/SExp
22393 $unix-lib.cm $(SML_LIB)/smlnj-lib/Unix
22394 $xml-lib.cm $(SML_LIB)/smlnj-lib/XML
22397 This will automatically convert a `$/smlnj-lib.cm` import in an input
22398 `.cm` file into a `$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb` import in
22399 the output `.mlb` file.
22403 The following changes were made to the SML/NJ Library, in addition to
22404 deriving the `.mlb` files from the `.cm` files:
22406 * `HTML4/pp-init.sml` (added): Implements `structure PrettyPrint` using the SML/NJ PP Library. This implementation is taken from the SML/NJ compiler source, since the SML/NJ HTML4 Library used the `structure PrettyPrint` provided by the SML/NJ compiler itself.
22407 * `Util/base64.sml` (modified): Rewrote use of `Unsafe.CharVector.create` and `Unsafe.CharVector.update`; MLton assumes that vectors are immutable.
22408 * `Util/engine.mlton.sml` (added, not exported): Implements `structure Engine`, providing time-limited, resumable computations using <:MLtonThread:>, <:MLtonSignal:>, and <:MLtonItimer:>.
22409 * `Util/graph-scc-fn.sml` (modified): Rewrote use of `where` structure specification.
22410 * `Util/redblack-map-fn.sml` (modified): Rewrote use of `where` structure specification.
22411 * `Util/redblack-set-fn.sml` (modified): Rewrote use of `where` structure specification.
22412 * `Util/time-limit.mlb` (added): Exports `structure TimeLimit`, which is _not_ exported by `smlnj-lib.mlb`. Since MLton is very conservative in the presence of threads and signals, program performance may be adversely affected by unnecessarily including `structure TimeLimit`.
22413 * `Util/time-limit.mlton.sml` (added): Implements `structure TimeLimit` using `structure Engine`. The SML/NJ implementation of `structure TimeLimit` uses SML/NJ's first-class continuations, signals, and interval timer.
22417 * <!ViewGitFile(mlton,master,lib/smlnj-lib/smlnj-lib.patch)>
22421 :mlton-guide-page: SMLofNJStructure
22422 [[SMLofNJStructure]]
22428 signature SML_OF_NJ =
22433 val callcc: ('a cont -> 'a) -> 'a
22434 val isolate: ('a -> unit) -> 'a cont
22435 val throw: 'a cont -> 'a -> 'b
22440 datatype os_kind = BEOS | MACOS | OS2 | UNIX | WIN32
22442 val getHostArch: unit -> string
22443 val getOSKind: unit -> os_kind
22444 val getOSName: unit -> string
22447 val exnHistory: exn -> string list
22448 val exportFn: string * (string * string list -> OS.Process.status) -> unit
22449 val exportML: string -> bool
22450 val getAllArgs: unit -> string list
22451 val getArgs: unit -> string list
22452 val getCmdName: unit -> string
22456 `SMLofNJ` implements a subset of the structure of the same name
22457 provided in <:SMLNJ:Standard ML of New Jersey>. It is included to
22458 make it easier to port programs between the two systems. The
semantics of these functions may differ from those in SML/NJ.
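
As one illustration of the continuation functions in the signature above, `callcc` and `throw` can implement an early exit from an iteration. This is only a sketch: `findFirst` is a hypothetical helper, and it assumes the continuation functions are reached as `SMLofNJ.Cont.callcc` and `SMLofNJ.Cont.throw`, as they are in SML/NJ.

```sml
(* Return the first element satisfying p, leaving List.app early
 * by throwing to the continuation captured by callcc. *)
fun findFirst (p : 'a -> bool) (xs : 'a list) : 'a option =
   SMLofNJ.Cont.callcc
   (fn k =>
       (List.app
        (fn x => if p x then SMLofNJ.Cont.throw k (SOME x) else ())
        xs
        ; NONE))

val firstEven = findFirst (fn n => n mod 2 = 0) [1, 3, 4, 6]
```

If no element satisfies the predicate, `List.app` runs to completion and the body of `callcc` returns `NONE` normally.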
22463 implements continuations.
22465 * `SysInfo.getHostArch ()`
22467 returns the string for the architecture.
22469 * `SysInfo.getOSKind`
22471 returns the OS kind.
22473 * `SysInfo.getOSName ()`
22475 returns the string for the host.
22479 the same as `MLton.Exn.history`.
22483 the same as `CommandLine.name ()`.
22487 the same as `CommandLine.arguments ()`.
22491 the same as `getCmdName()::getArgs()`.
22495 saves the state of the computation to a file that will apply `f` to
22496 the command-line arguments upon restart.
saves the state of the computation to file `f` and continues. Returns
22501 `true` in the restarted computation and `false` in the continuing
22506 :mlton-guide-page: SMLSharp
22511 http://www.pllab.riec.tohoku.ac.jp/smlsharp/[SML#] is an
22512 <:StandardMLImplementations:implementation> of an extension of SML.
22515 http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Tools[generally useful SML tools]
22516 including a pretty printer generator, a document generator, and a
22517 regression testing framework, and
22518 http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Library%2FScripting[scripting library].
22522 :mlton-guide-page: Sources
22527 We maintain our sources with <:Git:>. You can
22528 https://github.com/MLton/mlton/[view them on the web] or access
22529 them with a git client.
22531 Anonymous read-only access is available via
22533 https://github.com/MLton/mlton.git
22537 git://github.com/MLton/mlton.git
22543 All commits are sent to
22544 mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`]
22545 (https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe],
22546 https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive],
22547 http://www.mlton.org/pipermail/mlton-commit/[archive]) which is a
22548 read-only mailing list for commit emails. Discussion should go to
22549 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
22552 If the first line of a commit log message begins with "++MAIL{nbsp} ++",
22553 then the commit message will be sent with the subject as the rest of
22554 that first line, and will also be sent to
22555 mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
22561 See <!ViewGitFile(mlton,master,CHANGELOG.adoc)> for a list of
22562 changes and bug fixes.
22567 Prior to 20130308, we used <:Subversion:>.
22571 Prior to 20050730, we used <:CVS:>.
22575 :mlton-guide-page: SpaceSafety
22580 Informally, space safety is a property of a language implementation
22581 that asymptotically bounds the space used by a running program.
22585 * Chapter 12 of <!Cite(Appel92)>
22586 * <!Cite(Clinger98)>
22590 :mlton-guide-page: SSA
22595 <:SSA:> is an <:IntermediateLanguage:>, translated from <:SXML:> by
22596 <:ClosureConvert:>, optimized by <:SSASimplify:>, and translated by
22597 <:ToSSA2:> to <:SSA2:>.
22601 <:SSA:> is a <:FirstOrder:>, <:SimplyTyped:> <:IntermediateLanguage:>.
22602 It is the main <:IntermediateLanguage:> used for optimizations.
22604 An <:SSA:> program consists of a collection of datatype declarations,
22605 a sequence of global statements, and a collection of functions, along
22606 with a distinguished "main" function. Each function consists of a
22607 collection of basic blocks, where each basic block is a sequence of
22608 statements ending with some control transfer.
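
The shape described above can be pictured with a small set of SML types. This is a hypothetical, heavily simplified model written only for illustration; every name in `MiniSsa` is an assumption, and MLton's actual definitions (see the implementation files listed below) are considerably richer.

```sml
(* A hypothetical, simplified model; NOT MLton's actual SSA definition. *)
structure MiniSsa =
   struct
      type var = string
      type label = string
      type func = string

      datatype exp =
         Const of int
       | Var of var
       | PrimApp of string * var list

      (* A statement binds a variable to the value of an expression. *)
      type statement = {var: var, exp: exp}

      (* Every basic block ends with exactly one control transfer. *)
      datatype transfer =
         Goto of label * var list
       | Return of var list
       | Call of func * var list * label

      type block = {label: label, args: var list,
                    statements: statement list, transfer: transfer}
      type function = {name: func, args: var list,
                       start: label, blocks: block list}
      type program = {datatypes: (string * string list) list,
                      globals: statement list,
                      functions: function list,
                      main: func}

      (* Count the global statements plus the statements in all blocks. *)
      fun numStatements ({globals, functions, ...} : program) : int =
         List.foldl
         (fn (f : function, n) =>
             List.foldl
             (fn (b : block, m) => m + length (#statements b))
             n
             (#blocks f))
         (length globals)
         functions
   end

val tiny : MiniSsa.program =
   {datatypes = [("bool", ["false", "true"])],
    globals = [{var = "one", exp = MiniSsa.Const 1}],
    functions =
       [{name = "main", args = [], start = "L_0",
         blocks =
            [{label = "L_0", args = [],
              statements =
                 [{var = "x", exp = MiniSsa.PrimApp ("add", ["one", "one"])}],
              transfer = MiniSsa.Return ["x"]}]}],
    main = "main"}

val count = MiniSsa.numStatements tiny
```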
22610 == Implementation ==
22612 * <!ViewGitFile(mlton,master,mlton/ssa/ssa.sig)>
22613 * <!ViewGitFile(mlton,master,mlton/ssa/ssa.fun)>
22614 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.sig)>
22615 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.fun)>
22617 == Type Checking ==
22619 Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check.sig)>,
<!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>) of an <:SSA:> program
22621 verifies the following:
22623 * no duplicate definitions (tycons, cons, vars, labels, funcs)
22624 * no out of scope references (tycons, cons, vars, labels, funcs)
22625 * variable definitions dominate variable uses
22626 * case transfers are exhaustive and irredundant
22627 * `Enter`/`Leave` profile statements match
22628 * "traditional" well-typedness
22630 == Details and Notes ==
22632 SSA is an abbreviation for Static Single Assignment.
22634 For some initial design discussion, see the thread at:
22636 * http://mlton.org/pipermail/mlton/2001-August/019689.html
22638 For some retrospectives, see the threads at:
22640 * http://mlton.org/pipermail/mlton/2003-January/023054.html
22641 * http://mlton.org/pipermail/mlton/2007-February/029597.html
22645 :mlton-guide-page: SSA2
22650 <:SSA2:> is an <:IntermediateLanguage:>, translated from <:SSA:> by
22651 <:ToSSA2:>, optimized by <:SSA2Simplify:>, and translated by
22652 <:ToRSSA:> to <:RSSA:>.
22656 <:SSA2:> is a <:FirstOrder:>, <:SimplyTyped:>
22657 <:IntermediateLanguage:>, a slight variant of the <:SSA:>
<:IntermediateLanguage:>.
22660 Like <:SSA:>, an <:SSA2:> program consists of a collection of datatype
22661 declarations, a sequence of global statements, and a collection of
22662 functions, along with a distinguished "main" function. Each function
22663 consists of a collection of basic blocks, where each basic block is a
22664 sequence of statements ending with some control transfer.
22666 Unlike <:SSA:>, <:SSA2:> includes mutable fields in objects and makes
22667 the vector type constructor n-ary instead of unary. This allows
22668 optimizations like <:RefFlatten:> and <:DeepFlatten:> to be expressed.
22670 == Implementation ==
22672 * <!ViewGitFile(mlton,master,mlton/ssa/ssa2.sig)>
22673 * <!ViewGitFile(mlton,master,mlton/ssa/ssa2.fun)>
22674 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.sig)>
22675 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.fun)>
22677 == Type Checking ==
22679 Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check2.sig)>,
<!ViewGitFile(mlton,master,mlton/ssa/type-check2.fun)>) of an <:SSA2:>
22681 program verifies the following:
22683 * no duplicate definitions (tycons, cons, vars, labels, funcs)
22684 * no out of scope references (tycons, cons, vars, labels, funcs)
22685 * variable definitions dominate variable uses
22686 * case transfers are exhaustive and irredundant
22687 * `Enter`/`Leave` profile statements match
22688 * "traditional" well-typedness
22690 == Details and Notes ==
22692 SSA is an abbreviation for Static Single Assignment.
22696 :mlton-guide-page: SSA2Simplify
22701 The optimization passes for the <:SSA2:> <:IntermediateLanguage:> are
22702 collected and controlled by the `Simplify2` functor
22703 (<!ViewGitFile(mlton,master,mlton/ssa/simplify2.sig)>,
22704 <!ViewGitFile(mlton,master,mlton/ssa/simplify2.fun)>).
22706 The following optimization passes are implemented:
22713 There are additional analysis and rewrite passes that augment many of the other optimization passes:
The optimization passes can be controlled from the command-line by the options:
22720 * `-diag-pass <pass>` -- keep diagnostic info for pass
22721 * `-disable-pass <pass>` -- skip optimization pass (if normally performed)
22722 * `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
22723 * `-keep-pass <pass>` -- keep the results of pass
22724 * `-loop-passes <n>` -- loop optimization passes
* `-ssa2-passes <passes>` -- ssa2 optimization passes
22729 :mlton-guide-page: SSASimplify
22734 The optimization passes for the <:SSA:> <:IntermediateLanguage:> are
22735 collected and controlled by the `Simplify` functor
22736 (<!ViewGitFile(mlton,master,mlton/ssa/simplify.sig)>,
22737 <!ViewGitFile(mlton,master,mlton/ssa/simplify.fun)>).
22739 The following optimization passes are implemented:
22741 * <:CombineConversions:>
22745 * <:ConstantPropagation:>
22749 * <:IntroduceLoops:>
22753 * <:LoopInvariant:>
22757 * <:RedundantTests:>
22760 * <:SimplifyTypes:>
22763 The following implementation passes are implemented:
22768 There are additional analysis and rewrite passes that augment many of the other optimization passes:
22774 The optimization passes can be controlled from the command-line by the options:
22776 * `-diag-pass <pass>` -- keep diagnostic info for pass
22777 * `-disable-pass <pass>` -- skip optimization pass (if normally performed)
22778 * `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
22779 * `-keep-pass <pass>` -- keep the results of pass
22780 * `-loop-passes <n>` -- loop optimization passes
22781 * `-ssa-passes <passes>` -- ssa optimization passes
22785 :mlton-guide-page: Stabilizers
* Stabilizers currently require the MLton sources; this should be fixed by the next release
22796 * Stabilizers are released under the MLton License
22800 * Download and build a source copy of MLton
22801 * Extract the tar.gz file attached to this page
* Some examples are provided in the "examples/" subdirectory; more examples will be added to this page in the following week
22804 == Bug reports / Suggestions ==
22806 * Please send any errors you encounter to schatzp and lziarek at cs.purdue.edu
22807 * We are looking to expand the usability of stabilizers
22808 * Please send any suggestions and desired functionality to the above email addresses
* This is an alpha release. We expect to have another release with added functionality soon
22813 * More documentation, such as signatures and descriptions of functionality, will be forthcoming
22816 == Documentation ==
22824 val stable: ('a -> 'b) -> ('a -> 'b)
22825 val stabilize: unit -> 'a
22827 val stableCP: (('a -> 'b) * (unit -> unit)) ->
22828 (('a -> 'b) * checkpoint)
22829 val stabilizeCP: checkpoint -> unit
22831 val unmonitoredAssign: ('a ref * 'a) -> unit
22832 val monitoredAssign: ('a ref * 'a) -> unit
22837 `Stable` provides functions to manage stable sections.
22839 * `type checkpoint`
22841 handle used to stabilize contexts other than the current one.
22845 returns a function identical to `f` that will execute within a stable section.
22849 unrolls the effects made up to the current context to at least the
22850 nearest enclosing _stable_ section. These effects may have propagated
22851 to other threads, so all affected threads are returned to a globally
22852 consistent previous state. The return is undefined because control
cannot resume after `stabilize` is called.
22855 * `stableCP (f, comp)`
22857 returns a function `f'` and checkpoint tag `cp`. Function `f'` is
22858 identical to `f` but when applied will execute within a stable
22859 section. `comp` will be executed if `f'` is later stabilized. `cp`
22860 is used by `stabilizeCP` to stabilize a given checkpoint.
same as `stabilize` except that the (possibly current) checkpoint to
22865 stabilize is provided.
22867 * `unmonitoredAssign (r, v)`
22869 standard assignment (`:=`). The version of CML distributed rebinds
22870 `:=` to a monitored version so interesting effects can be recorded.
22872 * `monitoredAssign (r, v)`
22874 the assignment operator that should be used in programs that use
22875 stabilizers. `:=` is rebound to this by including CML.
22879 * <!Attachment(Stabilizers,stabilizers_alpha_2006-10-09.tar.gz)>
22883 * <!Cite(ZiarekEtAl06)>
22887 :mlton-guide-page: StandardML
22892 Standard ML (SML) is a programming language that combines excellent
22893 support for rapid prototyping, modularity, and development of large
22894 programs, with performance approaching that of C.
22896 == SML Resources ==
22898 * <:StandardMLTutorials:Tutorials>
22899 * <:StandardMLBooks:Books>
22900 * <:StandardMLImplementations:Implementations>
22901 // * http://google.com/coop/cse?cx=014714656471597805969%3Afzuz7eybmcy[SML web search] from Google Co-op
22903 == Aspects of SML ==
22905 * <:DefineTypeBeforeUse:>
22907 * <:EqualityTypeVariable:>
22908 * <:GenerativeDatatype:>
22909 * <:GenerativeException:>
22911 * <:OperatorPrecedence:>
22913 * <:PolymorphicEquality:>
22914 * <:TypeVariableScope:>
22915 * <:ValueRestriction:>
22921 * <:FunctionalRecordUpdate:>
22922 * <:InfixingOperators:>
22924 * <:ObjectOrientedProgramming:>
22925 * <:OptionalArguments:>
22928 * <:ReturnStatement:>
22929 * <:Serialization:>
22930 * <:StandardMLGotchas:>
22932 * <:TipsForWritingConciseSML:>
22933 * <:UniversalType:>
22935 == Programming in SML ==
22943 * <:StandardMLHistory: History of SML>
22946 == Related Languages ==
22954 :mlton-guide-page: StandardMLBooks
22955 [[StandardMLBooks]]
22959 == Introductory Books ==
22961 * <!Cite(Ullman98, Elements of ML Programming)>
22963 * <!Cite(Paulson96, ML For the Working Programmer)>
22965 * <!Cite(HansenRichel99, Introduction to Programming using SML)>
22967 * <!Cite(FelleisenFreidman98, The Little MLer)>
22971 * <!Cite(Shipman02, Unix System Programming with Standard ML)>
22973 == Reference Books ==
22975 * <!Cite(GansnerReppy04, The Standard ML Basis Library)>
22977 * <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>
22979 == Related Topics ==
22981 * <!Cite(Reppy07, Concurrent Programming in ML)>
22983 * <!Cite(Okasaki99, Purely Functional Data Structures)>
22987 :mlton-guide-page: StandardMLGotchas
22988 [[StandardMLGotchas]]
22992 This page contains brief explanations of some recurring sources of
22993 confusion and problems that SML newbies encounter.
22995 Many confusions about the syntax of SML seem to arise from the use of
an interactive REPL (Read-Eval-Print Loop) while trying to learn the
22997 basics of the language. While writing your first SML programs, you
22998 should keep the source code of your programs in a form that is
22999 accepted by an SML compiler as a whole.
23001 == The `and` keyword ==
23003 It is a common mistake to misuse the `and` keyword or to not know how
23004 to introduce mutually recursive definitions. The purpose of the `and`
23005 keyword is to introduce mutually recursive definitions of functions
23006 and datatypes. For example,
23010 fun isEven 0w0 = true
23011 | isEven 0w1 = false
23012 | isEven n = isOdd (n-0w1)
23013 and isOdd 0w0 = false
23015 | isOdd n = isEven (n-0w1)
23022 datatype decl = VAL of id * pat * expr
23024 and expr = LET of decl * expr
23028 You can also use `and` as a shorthand in a couple of other places, but
23029 it is not necessary.
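
One such other place is a `val` declaration, where `and` binds several patterns simultaneously: every right-hand side is evaluated before any of the new names comes into scope. A small sketch (the names `a` and `b` are arbitrary):

```sml
val a = 1
val b = 2

(* Both right-hand sides refer to the earlier `a` and `b`,
 * so this declaration swaps the two values. *)
val a = b and b = a
```

Writing the two bindings as separate `val` declarations would instead make the second one see the new `a`, which is why the shorthand is rarely needed outside of tricks like this one.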
23031 == Constructed patterns ==
23033 It is a common mistake to forget to parenthesize constructed patterns
23034 in `fun` bindings. Consider the following invalid definition:
23039 | length h :: t = 1 + length t
23042 The pattern `h :: t` needs to be parenthesized:
23047 | length (h :: t) = 1 + length t
23050 The parentheses are needed, because a `fun` definition may have
23051 multiple consecutive constructed patterns through currying.
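
To illustrate, here is a curried function whose first clause takes two constructed patterns in a row; without the parentheses there would be no way to tell where one argument pattern ends and the next begins. The name `zipWith` is just an example.

```sml
(* Each parenthesized pattern is one curried argument. *)
fun zipWith f (x :: xs) (y :: ys) = f (x, y) :: zipWith f xs ys
  | zipWith _ _ _ = []

val sums = zipWith op + [1, 2, 3] [10, 20]
```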
23053 The same applies to nonfix constructors. For example, the parentheses
23058 fun valOf NONE = raise Option
23059 | valOf (SOME x) = x
23062 are required. However, the outermost constructed pattern in a `fn` or
23063 `case` expression need not be parenthesized, because in those cases
23064 there is always just one constructed pattern. So, both
23068 val valOf = fn NONE => raise Option
23076 fun valOf x = case x of
23077 NONE => raise Option
23083 == Declarations and expressions ==
23085 It is a common mistake to confuse expressions and declarations.
23086 Normally an SML source file should only contain declarations. The
23087 following are declarations:
23093 functor Fn (...) = ...
23096 local ... in ... end
23099 signature SIG = ...
23100 structure Struct = ...
23112 isn't a declaration.
23114 To specify a side-effecting computation in a source file, you can write:
23122 == Equality types ==
23124 SML has a fairly intricate built-in notion of equality. See
23125 <:EqualityType:> and <:EqualityTypeVariable:> for a thorough
23131 It is a common mistake to write nested case expressions without the
23132 necessary parentheses. See <:UnresolvedBugs:> for a discussion.
23137 It used to be a common mistake to parenthesize `op *` as `(op *)`.
23138 Before SML'97, `*)` was considered a comment terminator in SML and
23139 caused a syntax error. At the time of writing, <:SMLNJ:SML/NJ> still
23140 rejects the code. An extra space may be used for portability:
23141 `(op * )`. However, parenthesizing `op` is redundant, even though it
23142 is a widely used convention.
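
For instance, both spellings below pass multiplication as a function value; the second uses the extra space so that the comment terminator never appears. This is an illustrative sketch with arbitrary names.

```sml
(* Unparenthesized form. *)
val product1 = foldl op * 1 [1, 2, 3, 4]

(* Parenthesized form with a space before the closing parenthesis. *)
val product2 = foldl (op * ) 1 [1, 2, 3, 4]
```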
23147 A number of standard operators (`+`, `-`, `~`, `*`, `<`, `>`, ...) and
23148 numeric constants are overloaded for some of the numeric types (`int`,
23149 `real`, `word`). It is a common surprise that definitions using
23150 overloaded operators such as
23154 fun min (x, y) = if y < x then y else x
23157 are not overloaded themselves. SML doesn't really support
23158 (user-defined) overloading or other forms of ad hoc polymorphism. In
23159 cases such as the above where the context doesn't resolve the
23160 overloading, expressions using overloaded operators or constants get
23161 assigned a default type. The above definition gets the type
23165 val min : int * int -> int
23168 See <:Overloading:> and <:TypeIndexedValues:> for further discussion.
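
When another numeric type is wanted, a single type annotation is enough for the context to resolve the overloading. A minimal sketch (the name `minReal` is hypothetical):

```sml
(* Annotating one argument selects the `real` instance of `<`. *)
fun minReal (x : real, y) = if y < x then y else x

val smaller = minReal (2.5, 1.5)
```

The resulting function has type `real * real -> real`; it is still monomorphic, so a separate definition is needed for each numeric type.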
23173 It is a common mistake to use redundant semicolons in SML code. This
23174 is probably caused by the fact that in an SML REPL, a semicolon (and
23175 enter) is used to signal the REPL that it should evaluate the
23176 preceding chunk of code as a unit. In SML source files, semicolons
23177 are really needed in only two places. Namely, in expressions of the
23189 let ... in exp ; ... ; exp end
23192 Note that semicolons act as expression (or declaration) separators
23193 rather than as terminators.
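
In other words, a semicolon goes only between the expressions of a sequence, never after the last one. A small sketch with arbitrary names:

```sml
(* Parenthesized sequence: the value is that of the last expression. *)
val three = (print "computing\n"; 1 + 2)

(* Sequence in the body of a `let`. *)
val four =
   let
      val x = 3
   in
      print "still computing\n"; x + 1
   end
```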
23196 == Stale bindings ==
23201 == Unresolved records ==
23206 == Value restriction ==
23208 See <:ValueRestriction:>.
23211 == Type Variable Scope ==
23213 See <:TypeVariableScope:>.
23217 :mlton-guide-page: StandardMLHistory
23218 [[StandardMLHistory]]
23222 <:StandardML:Standard ML> grew out of <:ML:> in the early 1980s.
23224 For an excellent overview of SML's history, see Appendix F of the
23225 <:DefinitionOfStandardML:Definition>.
For an overview of its history before 1982, see <!Cite(Milner82, How
23232 :mlton-guide-page: StandardMLImplementations
23233 [[StandardMLImplementations]]
23234 StandardMLImplementations
23235 =========================
23237 There are a number of implementations of <:StandardML:Standard ML>,
23238 from interpreters, to byte-code compilers, to incremental compilers,
23239 to whole-program compilers.
23241 * <:Alice:Alice ML>
23245 * <:MoscowML:Moscow ML>
23246 * <:PolyML:Poly/ML>
23249 * <:SMLNET:SML.NET>
23252 == Not Actively Maintained ==
23254 * http://www.dcs.ed.ac.uk/home/edml/[Edinburgh ML]
23258 * http://www.cs.cornell.edu/Info/People/jgm/til.tar.Z[TIL]
23262 :mlton-guide-page: StandardMLPortability
23263 [[StandardMLPortability]]
23264 StandardMLPortability
23265 =====================
23267 Technically, SML'97 as defined in the
23268 <:DefinitionOfStandardML:Definition>
23269 requires only a minimal initial basis, which, while including the
23270 types `int`, `real`, `char`, and `string`, need have
23271 no operations on those base types. Hence, the only observable output
23272 of an SML'97 program is termination or raising an exception. Most SML
23273 compilers should agree there, to the degree each agrees with the
23274 Definition. See <:UnresolvedBugs:> for MLton's very few corner cases.
23276 Realistically, a program needs to make use of the
23277 <:BasisLibrary:Basis Library>.
23278 Within the Basis Library, there are numerous places where the behavior
23279 is implementation dependent. For a trivial example:
23283 val _ = valOf (Int.maxInt)
23287 may either raise the `Option` exception (if
`Int.maxInt = NONE`) or may terminate normally. The default
Int/Real/Word sizes are the biggest implementation-dependent aspect;
23290 so, one implementation may raise `Overflow` while another can
23291 accommodate the result. Also, maximum array and vector lengths are
23292 implementation dependent. Interfacing with the operating system is a
23293 bit murky, and implementations surely differ in handling of errors
23298 :mlton-guide-page: StandardMLTutorials
23299 [[StandardMLTutorials]]
23300 StandardMLTutorials
23301 ===================
23303 * http://www.dcs.napier.ac.uk/course-notes/sml/manual.html[A Gentle Introduction to ML].
23306 * http://www.dcs.ed.ac.uk/home/stg/NOTES/[Programming in Standard ML '97: An Online Tutorial].
23309 * <!Cite(Harper11, Programming in Standard ML)>.
23312 * <!Cite(Tofte96, Essentials of Standard ML Modules)>.
23315 * <!Cite(Tofte09, Tips for Computer Scientists on Standard ML (Revised))>.
23320 :mlton-guide-page: StaticSum
23325 While SML makes it impossible to write functions whose types would
depend on the values of their arguments, or so-called dependently
23327 typed functions, it is possible, and arguably commonplace, to write
23328 functions whose types depend on the types of their arguments. Indeed,
23329 the types of parametrically polymorphic functions like `map` and
23330 `foldl` can be said to depend on the types of their arguments. What
23331 is less commonplace, however, is to write functions whose behavior
23332 would depend on the types of their arguments. Nevertheless, there are
23333 several techniques for writing such functions.
23334 <:TypeIndexedValues:Type-indexed values> and <:Fold:fold> are two such
23335 techniques. This page presents another such technique dubbed static
23339 == Ordinary Sums ==
23341 Consider the sum type as defined below:
23344 structure Sum = struct
23345 datatype ('a, 'b) t = INL of 'a | INR of 'b
23349 While a generic sum type such as defined above is very useful, it has
23350 a number of limitations. As an example, we could write the function
23351 `out` to extract the value from a sum as follows:
23354 fun out (s : ('a, 'a) Sum.t) : 'a =
23360 As can be seen from the type of `out`, it is limited in the sense that
23361 it requires both variants of the sum to have the same type. So, `out`
23362 cannot be used to extract the value of a sum of two different types,
23363 such as the type `(int, real) Sum.t`. As another example of a
23364 limitation, consider the following attempt at a `succ` function:
23367 fun succ (s : (int, real) Sum.t) : ??? =
23369 of Sum.INL i => i + 1
23370 | Sum.INR r => Real.nextAfter (r, Real.posInf)
23373 The above definition of `succ` cannot be typed, because there is no
23374 type for the codomain within SML.
23379 Interestingly, it is possible to define values `inL`, `inR`, and
23380 `match` that satisfy the laws
23382 match (inL x) (f, g) = f x
23383 match (inR x) (f, g) = g x
and do not suffer from the same limitations. The definitions are
23386 actually quite trivial:
23389 structure StaticSum = struct
23390 fun inL x (f, _) = f x
23391 fun inR x (_, g) = g x
23396 Now, given the `succ` function defined as
23402 fn r => Real.nextAfter (r, Real.posInf))
23407 succ (StaticSum.inL 1) = 2
23408 succ (StaticSum.inR Real.maxFinite) = Real.posInf
23411 To better understand how this works, consider the following signature
23415 structure StaticSum :> sig
23416 type ('dL, 'cL, 'dR, 'cR, 'c) t
23417 val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t
23418 val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t
23419 val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c
23421 type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c
23426 Above, `'d` stands for domain and `'c` for codomain. The key
23427 difference between an ordinary sum type, like `(int, real) Sum.t`, and
23428 a static sum type, like `(int, real, real, int, real) StaticSum.t`, is
23429 that the ordinary sum type says nothing about the type of the result
23430 of deconstructing a sum while the static sum type specifies the type.
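
The pieces on this page can be assembled into one self-contained program. The following is a sketch built from the definitions shown here; it assumes that `match` is simply the identity function on the representation, which is the simplest definition satisfying the laws stated earlier.

```sml
structure StaticSum :> sig
   type ('dL, 'cL, 'dR, 'cR, 'c) t
   val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t
   val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t
   val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c
end = struct
   (* A static sum is a function awaiting the two handlers. *)
   type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x   (* assumption: identity on the representation *)
end

(* The result type follows the side of the sum that was injected. *)
fun succ s =
   StaticSum.match s (fn i => i + 1,
                      fn r => Real.nextAfter (r, Real.posInf))

val two = succ (StaticSum.inL 1)        (* an int *)
val above = succ (StaticSum.inR 1.0)    (* a real *)
```

The injections are applied directly inside the calls to `succ` so that all type variables are determined at the point of use, which avoids running into the value restriction.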
23432 With the sealed static sum module, we get the type
23435 val succ : (int, int, real, real, 'a) StaticSum.t -> 'a
23437 for the previously defined `succ` function. The type specifies that
23438 `succ` maps a left `int` to an `int` and a right `real` to a `real`.
23439 For example, the type of `StaticSum.inL 1` is
23440 `(int, 'cL, 'dR, 'cR, 'cL) StaticSum.t`. Unifying this with the
23441 argument type of `succ` gives the type `(int, int, real, real, int)
23442 StaticSum.t -> int`.
23444 The `out` function is quite useful on its own. Here is how it can be
23448 structure StaticSum = struct
23450 val out : ('a, 'a, 'b, 'b, 'c) t -> 'c =
23451 fn s => match s (fn x => x, fn x => x)
23455 Due to the value restriction, lack of first class polymorphism and
23456 polymorphic recursion, the usefulness and convenience of static sums
23457 is somewhat limited in SML. So, don't throw away the ordinary sum
23458 type just yet. Static sums can nevertheless be quite useful.
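
As a small illustration of that usefulness, the same `out` function can return values of different types depending on which injection was used. This sketch uses the unsealed structure and again assumes that `match` is the identity function.

```sml
structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x   (* assumption: identity on the representation *)
   (* Extract the injected value, whatever its side. *)
   fun out s = match s (fn x => x, fn x => x)
end

val n : int = StaticSum.out (StaticSum.inL 1)
val s : string = StaticSum.out (StaticSum.inR "one")
```

Contrast this with the `out` function for ordinary sums, which required both variants to have the same type.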
23461 === Example: Send and Receive with Argument Type Dependent Result Types ===
23463 In some situations it would seem useful to define functions whose
23464 result type would depend on some of the arguments. Traditionally such
23465 functions have been thought to be impossible in SML and the solution
23466 has been to define multiple functions. For example, the
23467 http://www.standardml.org/Basis/socket.html[`Socket` structure] of the
23468 Basis library defines 16 `send` and 16 `recv` functions. In contrast,
23470 (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sig)>) of the
23471 Basic library designed by Stephen Weeks defines only a single `send`
23472 and a single `receive` and the result types of the functions depend on
23473 their arguments. The implementation
23474 (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sml)>) uses
23475 static sums (with a slightly different signature:
23476 <!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/static-sum.sig)>).
23479 === Example: Picking Monad Results ===
23481 Suppose that we need to write a parser that accepts a pair of integers
23482 and returns their sum given a monadic parsing combinator library. A
23483 part of the signature of such a library could look like this
23486 signature PARSING = sig
23489 val lparen : unit t
23490 val rparen : unit t
23495 where the `MONAD` signature could be defined as
23498 signature MONAD = sig
23500 val return : 'a -> 'a t
23501 val >>= : 'a t * ('a -> 'b t) -> 'b t
23506 The straightforward, but tedious, way to write the desired parser is:
23509 val p = lparen >>= (fn _ =>
23513 rparen >>= (fn _ =>
23514 return (x + y))))))
23517 In Haskell, the parser could be written using the `do` notation
23518 considerably less verbosely as:
23521 p = do { lparen ; x <- int ; comma ; y <- int ; rparen ; return $ x + y }
23524 SML doesn't provide a `do` notation, so we need another solution.
23526 Suppose we had a "pick" notation for monads that would allow
23527 us to write the parser as
23530 val p = `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y)
23532 using four auxiliary combinators: +`+, `\`, `^`, and `@`.
23536 * +`p+ means that the result of `p` is dropped,
23537 * `\p` means that the result of `p` is taken,
23538 * `p ^ q` means that the results of `p` and `q` are taken as a product, and
23539 * `p @ a` means that the results of `p` are passed to the function `a` and the result of `a` is returned.
23541 The difficulty is in implementing the concatenation combinator `^`.
23542 The type of the result of the concatenation depends on the types of
23545 Using static sums and the <:ProductType:product type>, the pick
23546 notation for monads can be implemented as follows:
23549 functor MkMonadPick (include MONAD) = let
23553 fun `a = inL (a >>= (fn _ => return ()))
23555 fun a @ f = out a >>= (return o f)
23557 (match b o match a)
23559 (fn b => inL (a >>= (fn _ => b)),
23560 fn b => inR (a >>= (fn _ => b))),
23562 (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))),
23563 fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b))))))
23568 The above implementation is inefficient, however. It uses many more
23569 bind operations, `>>=`, than necessary. That can be solved with an
23570 additional level of abstraction:
23573 functor MkMonadPick (include MONAD) = let
23577 fun `a = inL (fn b => a >>= (fn _ => b ()))
23578 fun \a = inR (fn b => a >>= b)
23579 fun a @ f = out a (return o f)
23581 (match b o match a)
23582 (fn a => (fn b => inL (fn c => a (fn () => b c)),
23583 fn b => inR (fn c => a (fn () => b c))),
23584 fn a => (fn b => inR (fn c => a (fn a => b (fn () => c a))),
23585 fn b => inR (fn c => a (fn a => b (fn b => c (a & b))))))
23590 After instantiating and opening either of the above monad pick
23591 implementations, the previously given definition of `p` can be
23592 compiled and results in a parser whose result is of type `int`. Here
23593 is a functor to test the theory:
23596 functor Test (Arg : PARSING) = struct
23598 structure Pick = MkMonadPick (Arg)
23602 `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y)
23610 There are a number of related techniques. Here are some of them.
23613 * <:TypeIndexedValues:>
23617 :mlton-guide-page: StephenWeeks
23622 I live in the New York City area and work at http://janestcapital.com[Jane Street Capital].
23624 My http://sweeks.com/[home page].
23626 You can email me at sweeks@sweeks.com.
23630 :mlton-guide-page: StyleGuide
23635 These conventions are chosen so that inertia is towards modularity, code reuse and finding bugs early, _not_ to save typing.
23637 * <:SyntacticConventions:>
23641 :mlton-guide-page: Subversion
23646 http://subversion.apache.org/[Subversion] is a version control system.
23647 The MLton project used Subversion to maintain its
23648 <:Sources:source code>, but switched to <:Git:> on 20130308.
23650 Here are some online Subversion resources.
23652 * http://svnbook.red-bean.com[Version Control with Subversion]
23656 :mlton-guide-page: SuccessorML
23661 The purpose of http://sml-family.org/successor-ml/[successor ML], or
23662 sML for short, is to provide a vehicle for the continued evolution of
23663 ML, using Standard ML as a starting point. The intention is for
23664 successor ML to be a living, evolving dialect of ML that is responsive
23665 to community needs and advances in language design, implementation,
23668 == SuccessorML Features in MLton ==
23670 The following SuccessorML features have been implemented in MLton.
23671 The features are disabled by default, and may be enabled utilizing the
23672 feature's corresponding <:MLBasisAnnotations:ML Basis annotation>
23673 which is listed directly after the feature name. In addition, the
23674 +allowSuccessorML {false|true}+ annotation can be used to
23675 simultaneously enable all of the features.
23677 * <!Anchor(DoDecls)>
23678 `do` Declarations: +allowDoDecls {false|true}+
23680 Allow a +do _exp_+ declaration form, which evaluates _exp_ for its
23681 side effects. The following example uses a `do` declaration:
23685 do print "Hello world.\n"
23688 and is equivalent to:
23692 val () = print "Hello world.\n"
23695 * <!Anchor(ExtendedConsts)>
23696 Extended Constants: +allowExtendedConsts {false|true}+
23699 Allow or disallow all of the extended constants features. This is a
23700 proxy for all of the following annotations.
23702 ** <!Anchor(ExtendedNumConsts)>
23703 Extended Numeric Constants: +allowExtendedNumConsts {false|true}+
23705 Allow underscores as a separator in numeric constants and allow binary
23706 integer and word constants.
23708 Underscores in a numeric constant must occur between digits and
23709 consecutive underscores are allowed.
23711 Binary integer constants use the prefix +0b+ and binary word constants
23712 use the prefix +0wb+.
23714 The following example uses extended numeric constants (although it may
23715 be incorrectly syntax highlighted):
23720 val nb = ~0b10_10_10
23722 val i = 4__327__829
23723 val r = 6.022_140_9e23
23726 ** <!Anchor(ExtendedTextConsts)> Extended Text Constants: +allowExtendedTextConsts {false|true}+
23728 Allow characters with integer codes ≥ 128 and ≤ 247 that
23729 correspond to syntactically well-formed UTF-8 byte sequences in text
23733 and allow `\Uxxxxxxxx` numeric escapes in text constants.
23736 Any 1, 2, 3, or 4 byte sequence that can be properly decoded to a
23737 binary number according to the UTF-8 encoding/decoding scheme is
23738 allowed in a text constant (but invalid sequences are not explicitly
23739 rejected) and denotes the corresponding sequence of characters with
23740 integer codes ≥ 128 and ≤ 247. This feature enables "UTF-8
23741 convenience" (but not comprehensive Unicode support); in particular,
23742 it allows one to copy text from a browser and paste it into a string
23743 constant in an editor and, furthermore, if the string is printed to a
23744 terminal, then it will (typically) appear as the original text. The
23745 following example uses UTF-8 byte sequences:
23749 val s1 : String.string = "\240\159\130\161"
23750 val s2 : String.string = "🂡"
23751 val _ = print ("s1 --> " ^ s1 ^ "\n")
23752 val _ = print ("s2 --> " ^ s2 ^ "\n")
23753 val _ = print ("String.size s1 --> " ^ Int.toString (String.size s1) ^ "\n")
23754 val _ = print ("String.size s2 --> " ^ Int.toString (String.size s2) ^ "\n")
23755 val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n")
23758 and, when compiled and executed, will display:
23763 String.size s1 --> 4
23764 String.size s2 --> 4
23768 Note that the `String.string` type corresponds to any sequence of
23769 8-bit values, including invalid UTF-8 sequences; hence the string
23770 constant `"\192"` (a UTF-8 leading byte with no UTF-8 continuation
23771 byte) is valid. Similarly, the `Char.char` type corresponds to a
23772 single 8-bit value; hence the char constant `#"α"` is not valid, as
23773 the text constant `"α"` denotes a sequence of two 8-bit values.
23776 A `\Uxxxxxxxx` numeric escape denotes a single character with the
23777 hexadecimal integer code `xxxxxxxx`. Such numeric escapes are not
23778 necessary for the `String.string` and `Char.char` types, since
23779 characters in such text constants must have integer codes ≤ 255 and
23780 the `\ddd` and `\uxxxx` numeric escapes suffice. However, the
23781 `\Uxxxxxxxx` numeric escapes are useful for the `WideString.string`
23782 and `WideChar.char` types, since characters in such text constants may
23783 have integer codes ≤ 2^32^-1. The following uses a `\Uxxxxxxxx`
23784 numeric escape (although it may be incorrectly syntax highlighted):
23788 val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *)
23789 val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n")
23792 and, when compiled and executed, will display:
23795 WideString.size s1 --> 1
23798 Note that the `WideString.string` type corresponds to any sequence of
23799 32-bit values, including invalid Unicode code points; hence, the
23800 string constants `"\U001F0000"` and `"\U40000000"` are valid (but the
23801 corresponding integer codes are not valid Unicode code points).
23802 Similarly, the `WideChar.char` type corresponds to a single 32-bit
23805 Finally, note that a UTF-8 byte sequence in a `WideString.string` or
23806 `WideChar.char` text constant does not denote a single 32-bit value,
23807 but rather a sequence of 32-bit values ≥ 128 and ≤ 247. The
23808 following example uses both UTF-8 byte sequences and `\Uxxxxxxxx`
23809 numeric escapes (although it may be incorrectly syntax highlighted):
23813 val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *)
23814 val s2 : WideString.string = "🂡"
23815 val s3 : WideString.string = "\U000000F0\U0000009F\U00000082\U000000A1"
23816 val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n")
23817 val _ = print ("WideString.size s2 --> " ^ Int.toString (WideString.size s2) ^ "\n")
23818 val _ = print ("WideString.size s3 --> " ^ Int.toString (WideString.size s3) ^ "\n")
23819 val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n")
23820 val _ = print ("s2 = s3 --> " ^ Bool.toString (s2 = s3) ^ "\n")
23823 and, when compiled and executed, will display:
23826 WideString.size s1 --> 1
23827 WideString.size s2 --> 4
23828 WideString.size s3 --> 4
23835 * <!Anchor(LineComments)>
23836 Line Comments: +allowLineComments {false|true}+
23838 Allow line comments beginning with the token ++(*)++. The following
23839 example uses a line comment:
23843 (*) This is a line comment
23846 Line comments properly nest within block comments. The following
23847 example uses line comments nested within block comments:
23852 val x = 4 (*) This is a line comment
23856 val y = 5 (*) This is a line comment *)
23860 * <!Anchor(OptBar)>
23861 Optional Pattern Bars: +allowOptBar {false|true}+
23863 Allow a bar to appear before the first match rule of a `case`, `fn`,
23864 or `handle` expression, allow a bar to appear before the first
23865 function-value binding of a `fun` declaration, and allow a bar to
23866 appear before the first constructor binding or description of a
23867 `datatype` declaration or specification. The following example uses
23868 leading bars in a `datatype` declaration, a `fun` declaration, and a
23887 By eliminating the special case of the first element, this feature
23888 allows for simpler refactoring (e.g., sorting the lines of the
23889 `datatype` declaration's constructor bindings to put the constructors
23890 in alphabetical order).
23892 * <!Anchor(OptSemicolon)>
23893 Optional Semicolons: +allowOptSemicolon {false|true}+
23895 Allow a semicolon to appear after the last expression in a sequence or
23896 `let`-body expression. The following example uses a trailing
23897 semicolon in the body of a `let` expression:
23910 By eliminating the special case of the last element, this feature
23911 allows for simpler refactoring.
23913 * <!Anchor(OrPats)>
23914 Disjunctive (Or) Patterns: +allowOrPats {false|true}+
23916 Allow disjunctive (a.k.a., "or") patterns of the form +_pat~1~_ |
23917 _pat~2~_+, which matches a value that matches either +_pat~1~_+ or
23918 +_pat~2~_+. Disjunctive patterns have lower precedence than `as`
23919 patterns and constraint patterns, much as `orelse` expressions have
23920 lower precedence than `andalso` expressions and constraint
23921 expressions. Both sub-patterns of a disjunctive pattern must bind the
23922 same variables with the same types. The following example uses
23923 disjunctive patterns:
23927 datatype t = A of int | B of int | C of int | D of int * int | E of int * int
23931 A x | B x | C x => x + 1
23932 | D (x, _) | E (_, x) => x * 2
23935 * <!Anchor(RecordPunExps)>
23936 Record Punning Expressions: +allowRecordPunExps {false|true}+
23938 Allow record punning expressions, whereby an identifier +_vid_+ as an
23939 expression row in a record expression denotes the expression row
23940 +_vid_ = _vid_+ (i.e., treating a label as a variable). The following
23941 example uses record punning expressions (and also record punning
23947 case r of {a, b, c} => {a, b = b + 1, c}
23950 and is equivalent to:
23955 case r of {a = a, b = b, c = c} => {a = a, b = b + 1, c = c}
23958 * <!Anchor(SigWithtype)>
23959 `withtype` in Signatures: +allowSigWithtype {false|true}+
23961 Allow `withtype` to modify a `datatype` specification in a signature.
23962 The following example uses `withtype` in a signature (and also
23963 `withtype` in a declaration):
23969 datatype 'a u = Nil | Cons of 'a * 'a t
23970 withtype 'a t = unit -> 'a u
23972 structure Stream : STREAM =
23974 datatype 'a u = Nil | Cons of 'a * 'a t
23975 withtype 'a t = unit -> 'a u
23979 and is equivalent to:
23985 datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
23986 type 'a t = unit -> 'a u
23988 structure Stream : STREAM =
23990 datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
23991 type 'a t = unit -> 'a u
23995 * <!Anchor(VectorExpsAndPats)>
23996 Vector Expressions and Patterns: +allowVectorExpsAndPats {false|true}+
23999 Allow or disallow vector expressions and vector patterns. This is a
24000 proxy for all of the following annotations.
24002 ** <!Anchor(VectorExps)>
24003 Vector Expressions: +allowVectorExps {false|true}+
24005 Allow vector expressions of the form +#[_exp~0~_, _exp~1~_, ..., _exp~n-1~_]+ (where _n ≥ 0_). The expression has type +_τ_ vector+ when each expression _exp~i~_ has type +_τ_+.
24007 ** <!Anchor(VectorPats)>
24008 Vector Patterns: +allowVectorPats {false|true}+
24010 Allow vector patterns of the form +#[_pat~0~_, _pat~1~_, ..., _pat~n-1~_]+ (where _n ≥ 0_). The pattern matches values of type +_τ_ vector+ when each pattern _pat~i~_ matches values of type +_τ_+.
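
A short sketch of both forms, assuming the file is compiled with `allowVectorExpsAndPats true` in effect; `primes` and `describe` are hypothetical names chosen for illustration.

[source,sml]
----
(* Assumes the allowVectorExpsAndPats annotation is enabled. *)
val primes : int vector = #[2, 3, 5, 7]

(* Each vector pattern matches only vectors of exactly that length. *)
fun describe (v : int vector) : string =
   case v of
      #[] => "empty"
    | #[x] => "singleton " ^ Int.toString x
    | #[x, y] => "pair " ^ Int.toString (x + y)
    | _ => "length " ^ Int.toString (Vector.length v)

val d0 = describe #[]
val d1 = describe #[42]
val d4 = describe primes
----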
24015 :mlton-guide-page: SureshJagannathan
24016 [[SureshJagannathan]]
24020 I am an Associate Professor at the http://www.cs.purdue.edu/[Department of Computer Science] at Purdue University.
24021 My research focus is in programming language design and implementation, concurrency,
24022 and distributed systems. I am interested in various aspects of MLton, mostly related to (in no particular order): (1) control-flow analysis, (2) representation
24023 strategies (e.g., flattening), (3) IR formats, and (4) extensions for distributed programming.
24026 Please see my http://www.cs.purdue.edu/homes/suresh/index.html[Home page] for more details.
24030 :mlton-guide-page: Swerve
24035 http://ftp.sun.ac.za/ftp/mirrorsites/ocaml/Systems_programming/book/c3253.html[Swerve]
24036 is an HTTP server written in SML, originally developed with SML/NJ.
24037 <:RayRacine:> ported Swerve to MLton in January 2005.
24039 <!Attachment(Swerve,swerve.tar.bz2,Download)> the port.
24041 Excerpt from the included `README`:
24043 Total testing of this port consisted of a successful compile, startup,
24044 and serving one html page with one gif image. Given that the original
24045 code was throughly designed and implemented in a thoughtful manner and
24046 I expect it is quite usable modulo a few minor bugs introduced by my
24050 Swerve is described in <!Cite(Shipman02)>.
24054 :mlton-guide-page: SXML
24059 <:SXML:> is an <:IntermediateLanguage:>, translated from <:XML:> by
24060 <:Monomorphise:>, optimized by <:SXMLSimplify:>, and translated by
24061 <:ClosureConvert:> to <:SSA:>.
24065 SXML is a simply-typed version of <:XML:>.
24067 == Implementation ==
24069 * <!ViewGitFile(mlton,master,mlton/xml/sxml.sig)>
24070 * <!ViewGitFile(mlton,master,mlton/xml/sxml.fun)>
24071 * <!ViewGitFile(mlton,master,mlton/xml/sxml-tree.sig)>
24073 == Type Checking ==
24075 <:SXML:> shares the type checker for <:XML:>.
24077 == Details and Notes ==
24079 There are only two differences between <:XML:> and <:SXML:>. First,
24080 <:SXML:> `val`, `fun`, and `datatype` declarations always have an
24081 empty list of type variables. Second, <:SXML:> variable references
24082 always have an empty list of type arguments. Constructor uses can
24083 only have a nonempty list of type arguments if the constructor is a
24086 Although we could rely on the type system to enforce these constraints
24087 by parameterizing the <:XML:> signature, <:StephenWeeks:> did so in a
24088 previous version of the compiler, and the software engineering gains
24089 were not worth the effort.
24093 :mlton-guide-page: SXMLShrink
24098 SXMLShrink is an optimization pass for the <:SXML:>
24099 <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
24103 This pass performs optimizations based on a reduction system.
24105 == Implementation ==
24107 * <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)>
24108 * <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)>
24110 == Details and Notes ==
24112 <:SXML:> shares the <:XMLShrink:> simplifier.
24116 :mlton-guide-page: SXMLSimplify
24121 The optimization passes for the <:SXML:> <:IntermediateLanguage:> are
24122 collected and controlled by the `SxmlSimplify` functor
24123 (<!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.sig)>,
24124 <!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.fun)>).
24126 The following optimization passes are implemented:
24131 The following implementation passes are implemented:
24133 * <:ImplementExceptions:>
24134 * <:ImplementSuffix:>
24136 The following optimization passes are not implemented, but might prove useful:
24141 The optimization passes can be controlled from the command-line by the options
24143 * `-diag-pass <pass>` -- keep diagnostic info for pass
24144 * `-disable-pass <pass>` -- skip optimization pass (if normally performed)
24145 * `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
24146 * `-keep-pass <pass>` -- keep the results of pass
24147 * `-sxml-passes <passes>` -- sxml optimization passes
24151 :mlton-guide-page: SyntacticConventions
24152 [[SyntacticConventions]]
24153 SyntacticConventions
24154 ====================
24156 Here are a number of syntactic conventions useful for programming in
24162 * A line of code never exceeds 80 columns.
24164 * Only split a syntactic entity across multiple lines if it doesn't fit on one line within 80 columns.
24166 * Use alphabetical order wherever possible.
24168 * Avoid redundant parentheses.
24170 * When using `:`, there is no space before the colon, and a single space after it.
24175 * Variables, record labels and type constructors begin with and use
24176 small letters, using capital letters to separate words.
24184 * Variables that represent collections of objects (lists, arrays,
24185 vectors, ...) are often suffixed with an `s`.
24193 * Constructors, structure identifiers, and functor identifiers begin
24194 with a capital letter.
24202 * Signature identifiers are in all capitals, using `_` to separate
24214 * Alphabetize record labels. In a record type, there are spaces after
24215 colons and commas, but not before colons or commas, or at the
24216 delimiters `{` and `}`.
24220 {bar: int, foo: int}
24223 * Only split a record type across multiple lines if it doesn't fit on
24224 one line. If a record type must be split over multiple lines, put one
24235 * In a tuple type, there are spaces before and after each `*`.
24242 * Only split a tuple type across multiple lines if it doesn't fit on
24243 one line. In a tuple type split over multiple lines, there is one
24244 type per line, and the `*`-s go at the beginning of the lines.
24253 It may also be useful to parenthesize to make the grouping more
24263 * In an arrow type split over multiple lines, put the arrow at the
24264 beginning of its line.
24272 It may also be useful to parenthesize to make the grouping more
24281 * Avoid redundant parentheses.
24283 * Arrow types associate to the right, so write
24297 * Type constructor application associates to the left, so write
24311 * Type constructor application binds more tightly than a tuple type,
24316 int list * bool list
24323 (int list) * (bool list)
24326 * Tuple types bind more tightly than arrow types, so write
24337 (int * bool) -> real
24343 * A core expression or declaration split over multiple lines does not
24344 contain any blank lines.
24346 * A record field selector has no space between the `#` and the record
24362 * A tuple has a space after each comma, but not before, and not at the
24363 delimiters `(` and `)`.
24370 * A tuple split over multiple lines has one element per line, and the
24371 commas go at the end of the lines.
24380 * A list has a space after each comma, but not before, and not at the
24381 delimiters `[` and `]`.
24388 * A list split over multiple lines has one element per line, and the
24389 commas go at the end of the lines.
24398 * A record has spaces before and after `=`, a space after each comma,
24399 but not before, and not at the delimiters `{` and `}`. Field names
24400 appear in alphabetical order.
24404 {bar = 13, foo = true}
24407 * A sequence expression has a space after each semicolon, but not before.
24414 * A sequence expression split over multiple lines has one expression
24415 per line, and the semicolons at the beginning of lines. Lisp and
24416 Scheme programmers may find this hard to read at first.
24425 _Rationale_: this makes it easy to visually spot the beginning of each
24426 expression, which becomes more valuable as the expressions themselves
24427 are split across multiple lines.
24429 * An application expression has a space between the function and the
24430 argument. There are no parens unless the argument is a tuple (in
24431 which case the parens are really part of the tuple, not the
24440 * Avoid redundant parentheses. Application associates to the left, so
24455 * Infix operators have a space before and after the operator.
24463 * Avoid redundant parentheses. Use <:OperatorPrecedence:>. So, write
24477 * An `andalso` expression split over multiple lines has the `andalso`
24478 at the beginning of subsequent lines.
24487 * A `case` expression is indented as follows
24497 * A `datatype`'s constructors are alphabetized.
24501 datatype t = A | B | C
24504 * A `datatype` declaration has a space before and after each `|`.
24508 datatype t = A | B of int | C
24511 * A `datatype` split over multiple lines has one constructor per line,
24512 with the `|` at the beginning of lines and the constructors beginning
24513 3 columns to the right of the `datatype`.
24523 * A `fun` declaration may start its body on the subsequent line,
24536 * An `if` expression is indented as follows.
24545 * A sequence of `if`-`then`-`else`-s is indented as follows.
24558 * A `let` expression has the `let`, `in`, and `end` on their own
24559 lines, starting in the same column. Declarations and the body are
24572 * A `local` declaration has the `local`, `in`, and `end` on their own
24573 lines, starting in the same column. Declarations are indented 3
24585 * An `orelse` expression split over multiple lines has the `orelse` at
24586 the beginning of subsequent lines.
24595 * A `val` declaration has a space before and after the `=`.
24602 * A `val` declaration can start the expression on the subsequent line,
24608 if e1 then e2 else e3
24614 * A `signature` declaration is indented as follows.
24624 _Exception_: a signature declaration in a file to itself can omit the
24625 indentation to save horizontal space.
24637 In this case, there should be a blank line after the `sig` and before
24640 * A `val` specification has a space after the colon, but not before.
24647 _Exception_: in the case of operators (like `+`), there is a space
24648 before the colon to avoid lexing the colon as part of the operator.
24655 * Alphabetize specifications in signatures.
24668 * A `structure` declaration has a space on both sides of the `=`.
24672 structure Foo = Bar
24675 * A `structure` declaration split over multiple lines is indented as
24686 _Exception_: a structure declaration in a file to itself can omit the
24687 indentation to save horizontal space.
24699 In this case, there should be a blank line after the `struct` and
24702 * Declarations in a `struct` are separated by blank lines.
24721 * A `functor` declaration has spaces after each `:` (or `:>`) but not
24722 before, and a space before and after the `=`. It is indented as
24727 functor Foo (S: FOO_ARG): FOO =
24733 _Exception_: a functor declaration in a file to itself can omit the
24734 indentation to save horizontal space.
24738 functor Foo (S: FOO_ARG): FOO =
24746 In this case, there should be a blank line after the `struct`
24747 and before the `end`.
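
To see several of these conventions working together, here is a small sketch; the names `COUNTER`, `Counter`, `shape`, `area`, and `total` are hypothetical and serve only to illustrate the layout rules listed above.

[source,sml]
----
(* Signature identifier in capitals; specifications alphabetized,
 * with a space after each colon but not before. *)
signature COUNTER =
   sig
      type t

      val get: t -> int
      val new: unit -> t
      val tick: t -> unit
   end

(* Declarations in a struct are separated by blank lines. *)
structure Counter: COUNTER =
   struct
      type t = int ref

      fun get c = !c

      fun new () = ref 0

      fun tick c = c := !c + 1
   end

(* One constructor per line, bars at the start, labels alphabetized. *)
datatype shape =
   Circle of real
 | Rect of {height: real, width: real}

fun area s =
   case s of
      Circle r => Math.pi * r * r
    | Rect {height, width} => height * width

(* let, in, and end on their own lines, in the same column. *)
val total =
   let
      val c = Counter.new ()
      val () = Counter.tick c
      val () = Counter.tick c
   in
      Counter.get c
   end
----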
24751 :mlton-guide-page: Talk
24756 == The MLton Standard ML Compiler ==
24758 *Henry Cejtin, Matthew Fluet, Suresh Jagannathan, Stephen Weeks*
24768 ||<:TalkStandardML: Next>
24773 :mlton-guide-page: TalkDiveIn
24780 * to <:Development:>
24781 * to <:Documentation:>
24792 |<:TalkMLtonHistory: Prev>|
24797 :mlton-guide-page: TalkFolkLore
24804 * Defunctorization and monomorphisation are feasible
24805 * Global control-flow analysis is feasible
24806 * Early closure conversion is feasible
24816 |<:TalkWholeProgram: Prev>|<:TalkMLtonFeatures: Next>
24821 :mlton-guide-page: TalkFromSMLTo
24826 == From Standard ML to S-T F-O IL ==
24828 * What issues arise when translating from Standard ML into an intermediate language?
24838 |<:TalkMLtonApproach: Prev>|<:TalkHowModules: Next>
24843 :mlton-guide-page: TalkHowHigherOrder
24844 [[TalkHowHigherOrder]]
24848 == Higher-order Functions ==
24850 * How does one represent SML's higher-order functions?
24851 * MLton's answer: defunctionalize
24856 See <:ClosureConvert:>.
24865 |<:TalkMLtonApproach: Prev>|<:TalkWholeProgram: Next>
24870 :mlton-guide-page: TalkHowModules
24877 * How does one represent SML's modules?
24878 * MLton's answer: defunctorize
24893 |<:TalkFromSMLTo: Prev>|<:TalkHowPolymorphism: Next>
24898 :mlton-guide-page: TalkHowPolymorphism
24899 [[TalkHowPolymorphism]]
24900 TalkHowPolymorphism
24901 ===================
24905 * How does one represent SML's polymorphism?
24906 * MLton's answer: monomorphise
24911 See <:Monomorphise:>.
24921 |<:TalkHowModules: Prev>|<:TalkHowHigherOrder: Next>
24926 :mlton-guide-page: TalkMLtonApproach
24927 [[TalkMLtonApproach]]
24931 == MLton's Approach ==
24933 * whole-program optimization using a simply-typed, first-order intermediate language
24934 * ensures programs are not penalized for exploiting abstraction and modularity
24944 |<:TalkStandardML: Prev>|<:TalkFromSMLTo: Next>
24949 :mlton-guide-page: TalkMLtonFeatures
24950 [[TalkMLtonFeatures]]
24954 == MLton Features ==
24956 * Supports full Standard ML language and Basis Library
24957 * Generates standalone executables
24959 ** Foreign function interface (SML to C, C to SML)
24960 ** ML Basis system for programming in the very large
24961 ** Extension libraries
24976 |<:TalkFolkLore: Prev>|<:TalkMLtonHistory: Next>
24981 :mlton-guide-page: TalkMLtonHistory
24982 [[TalkMLtonHistory]]
24986 == MLton History ==
24990 | April 1997 | Stephen Weeks wrote a defunctorizer for SML/NJ
24991 | Aug. 1997 | Begin independent compiler (`smlc`)
24992 | Oct. 1997 | Monomorphiser
24993 | Nov. 1997 | Polyvariant higher-order control-flow analysis (10,000 lines)
24994 | March 1999 | First release of MLton (48,006 lines)
24995 | Jan. 2002 | MLton at 102,541 lines
24996 | Jan. 2003 | MLton at 112,204 lines
24997 | Jan. 2004 | MLton at 122,299 lines
24998 | Nov. 2004 | MLton at 141,311 lines
25014 |<:TalkMLtonFeatures: Prev>|<:TalkDiveIn: Next>
25019 :mlton-guide-page: TalkStandardML
25026 * a high-level language makes
25027 ** a programmer's life easier
25028 ** a compiler writer's life harder
25030 * perceived overheads of features discourage their use
25031 ** higher-order functions
25032 ** polymorphic datatypes
25033 ** separate modules
25038 Also see <:StandardML:Standard ML>.
25048 |<:Talk: Prev>|<:TalkMLtonApproach: Next>
25053 :mlton-guide-page: TalkTemplate
25072 |<:ZZZPrev: Prev>|<:ZZZNext: Next>
25077 :mlton-guide-page: TalkWholeProgram
25078 [[TalkWholeProgram]]
25082 == Whole Program Compiler ==
25084 * Each of these techniques requires whole-program analysis
25085 * But, additional benefits:
25086 ** eliminate (some) variability in programming styles
25087 ** specialize representations
25088 ** simplifies and improves runtime system
25098 |<:TalkHowHigherOrder: Prev>|<:TalkFolkLore: Next>
25103 :mlton-guide-page: TILT
25108 http://www.cs.cornell.edu/home/jgm/tilt.html[TILT] is a
25109 <:StandardMLImplementations:Standard ML implementation>.
25113 :mlton-guide-page: TipsForWritingConciseSML
25114 [[TipsForWritingConciseSML]]
25115 TipsForWritingConciseSML
25116 ========================
25118 SML is a rich enough language that there are often several ways to
25119 express things. This page contains miscellaneous tips (ideas not
25120 rules) for writing concise SML. The metric that we are interested in
25121 here is the number of tokens or words (rather than the number of
25122 lines, for example).
25124 == Datatypes in Signatures ==
25126 A seemingly frequent source of repetition in SML is that of datatype
25127 definitions in signatures and structures. Actually, it isn't
25128 repetition at all. A datatype specification in a signature, such as,
25132 signature EXP = sig
25133 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25137 is just a specification of a datatype that may be matched by multiple
25138 (albeit identical) datatype declarations. For example, in
25142 structure AnExp : EXP = struct
25143 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25146 structure AnotherExp : EXP = struct
25147 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25151 the types `AnExp.exp` and `AnotherExp.exp` are two distinct types. If
25152 such <:GenerativeDatatype:generativity> isn't desired or needed, you
25153 can avoid the repetition:
25157 structure Exp = struct
25158 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25161 signature EXP = sig
25162 datatype exp = datatype Exp.exp
25165 structure Exp : EXP = struct
25170 Keep in mind that this isn't semantically equivalent to the original.
25173 == Clausal Function Definitions ==
25175 The syntax of clausal function definitions is rather repetitive. For
25180 fun isSome NONE = false
25181 | isSome (SOME _) = true
25184 is more verbose than
25193 For recursive functions the break-even point is one clause higher. For example,
25199 | fib n = fib (n-1) + fib (n-2)
25202 isn't less verbose than
25209 | n => fib (n-1) + fib (n-2)
25212 It is quite often the case that a curried function primarily examines
25213 just one of its arguments. Such functions can be written particularly
25214 concisely by making the examined argument last. For example, instead
25219 fun eval (Fn (v, b)) env = ...
25220 | eval (App (f, a)) env = ...
25221 | eval (Var v) env = ...
25229 fn Fn (v, b) => ...
25230 | App (f, a) => ...
25237 It is a good idea to avoid using lots of irritating superfluous
25238 parentheses. An important rule to know is that prefix function
25239 application in SML has higher precedence than any infix operator. For
25240 example, the outer parentheses in
25244 (square (5 + 1)) + (square (5 * 2))
25249 People trained in other languages often use superfluous parentheses in
25250 a number of places. In particular, the parentheses in the following
25251 examples are practically always superfluous and are best avoided:
25255 if (condition) then ... else ...
25256 while (condition) do ...
25259 The same basically applies to case expressions:
25263 case (expression) of ...
25266 It is not uncommon to match a tuple of two or more values:
25275 Such case expressions can be written more concisely with an
25276 <:ProductType:infix product constructor>:
25288 Repeated sequences of conditionals such as
25293 else if x = y then ...
25297 can often be written more concisely as case expressions such as
25301 case Int.compare (x, y) of
25307 For a custom comparison, you would then define an appropriate datatype
25308 and a reification function. An alternative to using datatypes is to
25309 use dispatch functions
25314 {lt = fn () => ...,
25323 fun comparing (x, y) {lt, eq, gt} =
25324 (case Int.compare (x, y) of
25327 | GREATER => gt) ()
25330 An advantage is that no datatype definition is needed. A disadvantage
25331 is that you can't combine multiple dispatch results easily.
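As a complete, runnable sketch of the dispatch-function style, the following defines `comparing` as above and uses it in a function named `describe`, which is a hypothetical name chosen for this illustration:

```sml
(* Select one of three thunks according to the comparison, then run it. *)
fun comparing (x, y) {lt, eq, gt} =
   (case Int.compare (x, y) of
       LESS => lt
     | EQUAL => eq
     | GREATER => gt) ()

(* Hypothetical client: no datatype is needed for the three outcomes. *)
fun describe (x, y) =
   comparing (x, y)
   {lt = fn () => "less",
    eq = fn () => "equal",
    gt = fn () => "greater"}
```

Because each branch is a thunk, only the selected branch is evaluated.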
25334 == Command-Query Fusion ==
25336 Many are familiar with the
25337 http://en.wikipedia.org/wiki/Command-Query_Separation[Command-Query
25338 Separation Principle]. Adhering to the principle, a signature for an
25339 imperative stack might contain specifications
25343 val isEmpty : 'a t -> bool
25344 val pop : 'a t -> 'a
25347 and use of a stack would look like
25352 then ... pop stack ...
25356 or, when the element needs to be named,
25361 then let val elem = pop stack in ... end
25365 For efficiency, correctness, and conciseness, it is often better to
25366 combine the query and command and return the result as an option:
25370 val pop : 'a t -> 'a option
25373 A use of a stack would then look like this:
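For a self-contained sketch of the fused interface, consider the following. The list-backed `Stack` structure and the `drain` function are assumptions made for illustration and do not come from any library:

```sml
structure Stack :> sig
   type 'a t
   val new : unit -> 'a t
   val push : 'a t * 'a -> unit
   val pop : 'a t -> 'a option
end = struct
   type 'a t = 'a list ref
   fun new () = ref []
   fun push (s, x) = s := x :: !s
   (* Query and command fused: emptiness is reported through the option. *)
   fun pop s =
      case !s of
         [] => NONE
       | x :: xs => (s := xs ; SOME x)
end

(* Pops every element; no separate isEmpty test is needed. *)
fun drain stack =
   case Stack.pop stack of
      NONE => []
    | SOME elem => elem :: drain stack

val s : int Stack.t = Stack.new ()
val () = (Stack.push (s, 1) ; Stack.push (s, 2))
val drained = drain s
```

A client cannot pop an empty stack by mistake, because the element is only available in the `SOME` branch.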
25384 :mlton-guide-page: ToMachine
25389 <:ToMachine:> is a translation pass from the <:RSSA:>
25390 <:IntermediateLanguage:> to the <:Machine:> <:IntermediateLanguage:>.
25394 This pass converts an <:RSSA:> program into a <:Machine:> program.
25396 It uses <:AllocateRegisters:>, <:Chunkify:>, and <:ParallelMove:>.
25398 == Implementation ==
25400 * <!ViewGitFile(mlton,master,mlton/backend/backend.sig)>
25401 * <!ViewGitFile(mlton,master,mlton/backend/backend.fun)>
25403 == Details and Notes ==
25405 Because the MLton runtime system is shared by all <:Codegen:codegens>, it is most
25406 convenient to decide on stack layout _before_ any <:Codegen:codegen> takes over.
25407 In particular, we compute all the stack frame info for each <:RSSA:>
25408 function, including stack size, <:GarbageCollection:garbage collector>
25409 masks for each frame, etc. To do so, the <:Machine:>
25410 <:IntermediateLanguage:> imagines an abstract machine with an infinite
25411 number of (pseudo-)registers of every size. A liveness analysis
25412 determines, for each variable, whether or not it is live across a
25413 point where the runtime system might take over (for example, any
25414 garbage collection point) or a non-tail call to another <:RSSA:>
25415 function. Those that are live go on the stack, while those that
25416 aren't live go into pseudo-registers. From this information, we know
25417 all we need to about each stack frame. On the downside, nothing
25418 further on is allowed to change this stack info; it is set in stone.
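The placement rule can be illustrated with a toy model. This is not MLton's implementation: the `location` datatype, the `allocate` function, and the representation of liveness as a boolean flag are all assumptions made for the sketch.

```sml
(* Toy model only; these names are hypothetical, not MLton's. *)
datatype location = StackSlot of int | PseudoReg of int

(* Each variable is paired with a flag saying whether it is live across
 * a point where the runtime might take over or across a non-tail call.
 * Live variables get consecutive stack slots; the others get
 * consecutive pseudo-registers. *)
fun allocate vars =
   let
      fun go ([], _, _, acc) = rev acc
        | go ((v, true) :: rest, slot, reg, acc) =
             go (rest, slot + 1, reg, (v, StackSlot slot) :: acc)
        | go ((v, false) :: rest, slot, reg, acc) =
             go (rest, slot, reg + 1, (v, PseudoReg reg) :: acc)
   in
      go (vars, 0, 0, [])
   end

val placement = allocate [("x", true), ("y", false), ("z", true)]
```

In this model the number of stack slots handed out is the frame size, which is why the layout can be fixed before any codegen runs.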
25422 :mlton-guide-page: TomMurphy
25427 Tom Murphy VII is a longtime MLton user and occasional contributor. He works on programming languages for his PhD work at Carnegie Mellon in Pittsburgh, USA. <:AdamGoode:> lives on the same floor of Wean Hall.
25429 http://tom7.org[Home page]
25433 :mlton-guide-page: ToRSSA
25438 <:ToRSSA:> is a translation pass from the <:SSA2:>
25439 <:IntermediateLanguage:> to the <:RSSA:> <:IntermediateLanguage:>.
25443 This pass converts an <:SSA2:> program into an <:RSSA:> program.
25445 It uses <:PackedRepresentation:>.
25447 == Implementation ==
25449 * <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.sig)>
25450 * <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.fun)>
25452 == Details and Notes ==
25458 :mlton-guide-page: ToSSA2
25463 <:ToSSA2:> is a translation pass from the <:SSA:>
25464 <:IntermediateLanguage:> to the <:SSA2:> <:IntermediateLanguage:>.
25468 This pass is a simple conversion from a <:SSA:> program into a
25471 The only interesting portions of the translation are:
25473 * an <:SSA:> `ref` type becomes an object with a single mutable field
25474 * `array`, `vector`, and `ref` are eliminated in favor of select and updates
25475 * `Case` transfers separate discrimination and constructor argument selects
25477 == Implementation ==
25479 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.sig)>
25480 * <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.fun)>
25482 == Details and Notes ==
25488 :mlton-guide-page: TypeChecking
25493 MLton's type checker follows the <:DefinitionOfStandardML:Definition>
25494 closely, so you may find differences between MLton and other SML
25495 compilers that do not follow the Definition so closely. In
25496 particular, SML/NJ has many deviations from the Definition -- please
25497 see <:SMLNJDeviations:> for those that we are aware of.
25499 In some respects MLton's type checker is more powerful than other SML
25500 compilers, so there are programs that MLton accepts that are rejected
25501 by some other SML compilers. These kinds of programs fall into a few
25504 * MLton resolves flexible record patterns using a larger context than
25505 many other SML compilers. For example, MLton accepts the
25511 val _ = f {x = 13, y = "foo"}
25514 * MLton uses as large a context as possible to resolve the type of
25515 variables constrained by the value restriction to be monotypes. For
25516 example, MLton accepts the following.
25525 val f = (fn x => x) (fn y => y)
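When a program must also compile with SML implementations that resolve flexible record patterns using less context, the record type can be pinned down at the pattern itself. The following sketch uses an explicit type constraint; the field names match the flexible record example above and are otherwise arbitrary:

```sml
(* The constraint fixes the full record type, so no surrounding
 * context is needed to resolve the "..." in the pattern. *)
fun f ({x, ...} : {x: int, y: string}) = x

val thirteen = f {x = 13, y = "foo"}
```

The cost is that `f` is now tied to one record type, whereas the unconstrained version leaves the choice to the context in which `f` is used.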
25530 == Type error messages ==
25532 To aid in the understanding of type errors, MLton's type checker
25533 displays type errors differently than other SML compilers. In
25534 particular, when two types are different, it is important for the
25535 programmer to easily understand why they are different. So, MLton
25536 displays only the differences between two types that don't match,
25537 using underscores for the parts that match. For example, if a
25538 function expects `real * int` but gets `real * real`, the type error
25539 message would look like
25543 but got: _ * [real]
25546 As another aid to spotting differences, MLton places brackets `[]`
25547 around the parts of the types that don't match. A common situation is
25548 when a function receives a different number of arguments than it
25549 expects, in which case you might see an error like
25552 expects: [int * real]
25553 but got: [int * real * string]
25556 The brackets make it easy to see that the problem is that the tuples
25557 have different numbers of components -- not that the components don't
25558 match. Contrast that with a case where a function receives the right
25559 number of arguments, but in the wrong order, in which case you might
25563 expects: [int] * [real]
25564 but got: [real] * [int]
25567 Here the brackets make it easy to see that the components do not match.
25569 We appreciate feedback on any type error messages that you find
25570 confusing, or suggestions you may have for improvements to error
25574 == The shortest/most-recent rule for type names ==
25576 In a type error message, MLton often has a number of choices in
25577 deciding what name to use for a type. For example, in the following
25578 type-incorrect program
25587 MLton reports the error message
25590 Error: z.sml 3.9-3.15.
25591 Function applied to incorrect argument.
25597 MLton could have reported `expects: [int]` instead of `expects: [t]`.
25598 However, MLton uses the shortest/most-recent rule in order to decide
25599 what type name to display. This rule means that, at the point of the
25600 error, MLton first looks for the shortest name for a type in terms of
25601 number of structure identifiers (e.g. `foobar` is shorter than `A.t`).
25602 Next, if there are multiple names of the same length, then MLton uses
25603 the most recently defined name. It is this tiebreaker that causes
25604 MLton to prefer `t` to `int` in the above example.
25606 In signature matching, most recently defined is not taken to include
25607 all of the definitions introduced by the structure (since the matching
25608 takes place outside the structure and before it is defined). For
25609 example, in the following type-incorrect program
25623 MLton reports the error message
25626 Error: z.sml 2.4-4.6.
25627 Variable in structure disagrees with signature (type): x.
25628 structure: val x: [string]
25629 defn at: z.sml 7.11-7.11
25630 signature: val x: [int]
25631 spec at: z.sml 3.11-3.11
25634 If there is a type that only exists inside the structure being
25635 matched, then the prefix `_str.` is used. For example, in the
25636 following type-incorrect program
25650 MLton reports the error message
25653 Error: z.sml 2.4-4.6.
25654 Variable in structure disagrees with signature (type): x.
25655 structure: val x: [_str.t]
25656 defn at: z.sml 7.11-7.11
25657 signature: val x: [int]
25658 spec at: z.sml 3.11-3.11
25661 in which the `[_str.t]` refers to the type defined in the structure.
25665 :mlton-guide-page: TypeConstructor
25666 [[TypeConstructor]]
25670 In <:StandardML:Standard ML>, a type constructor is a function from
25671 types to types. Type constructors can be _nullary_, meaning that
25672 they take no arguments, as in `char`, `int`, and `real`.
25673 Type constructors can be _unary_, meaning that they take one
25674 argument, as in `array`, `list`, and `vector`. A program
25675 can define a new type constructor in two ways: a `type` definition
25676 or a `datatype` declaration. User-defined type constructors
25677 can take any number of arguments.
25681 datatype t = T of int * real (* 0 arguments *)
25682 type 'a t = 'a * int (* 1 argument *)
25683 datatype ('a, 'b) t = A | B of 'a * 'b (* 2 arguments *)
25684 type ('a, 'b, 'c) t = 'a * ('b -> 'c) (* 3 arguments *)
25687 Here are the syntax rules for type constructor application.
25689 * Type constructor application is written in postfix. So, one writes
25690 `int list`, not `list int`.
25692 * Unary type constructors drop the parens, so one writes
25693 `int list`, not `(int) list`.
25695 * Nullary type constructors drop the argument entirely, so one writes
25696 `int`, not `() int`.
25698 * N-ary type constructors use tuple notation; for example,
25701 * Type constructor application associates to the left. So,
25702 `int ref list` is the same as `(int ref) list`.
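The application rules above can be checked with a few value declarations. The type constructor `assoc` is a hypothetical name introduced only for this sketch:

```sml
(* A user-defined type constructor taking two arguments. *)
type ('a, 'b) assoc = ('a * 'b) list

val n : int = 1                                  (* nullary: no argument *)
val xs : int list = [1, 2, 3]                    (* unary: postfix, no parens *)
val table : (string, int) assoc = [("one", 1)]   (* binary: tuple notation *)

(* Application associates to the left, so these two types coincide. *)
val cells : int ref list = [ref 0, ref 1]
val cells' : (int ref) list = cells
```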
25706 :mlton-guide-page: TypeIndexedValues
25707 [[TypeIndexedValues]]
25711 <:StandardML:Standard ML> does not support ad hoc polymorphism. This
25712 presents a challenge to programmers. The problem is that at first
25713 glance there seems to be no practical way to implement something like
25714 a function for converting a value of any type to a string or a
25715 function for computing a hash value for a value of any type.
25716 Fortunately there are ways to implement type-indexed values in SML as
25717 discussed in <!Cite(Yang98)>. Various articles such as
25718 <!Cite(Danvy98)>, <!Cite(Ramsey11)>, <!Cite(Elsman04)>,
25719 <!Cite(Kennedy04)>, and <!Cite(Benton05)> also contain examples of
25720 type-indexed values.
25722 *NOTE:* The following example uses an early (and
25723 somewhat broken) variation of the basic technique used in an
25724 experimental generic programming library (see
25725 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that can
25726 be found in the MLton repository. The generic programming library
25727 also includes a more advanced generic pretty printing function (see
25728 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pretty.sig)>).
25730 == Example: Converting any SML value to (roughly) SML syntax ==
25732 Consider the problem of converting any SML value to a textual
25733 presentation that matches the syntax of SML as closely as possible.
25734 One solution is a type-indexed function that maps a given type to a
25735 function that maps any value (of the type) to its textual
25736 presentation. A type-indexed function like this can be useful for a
25737 variety of purposes. For example, one could use it to show debugging
25738 information. We'll call this function "`show`".
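Before looking at the full design, the core idea can be seen in a deliberately minimal sketch. It is much simpler than the `show` developed on this page: the structure name `MiniShow` and the `point` datatype are assumptions for illustration, the type-index is just the conversion function itself, and there is no support for sums, records, recursion, or cyclic data.

```sml
structure MiniShow = struct
   (* A type-index for 'a is simply a function that shows an 'a. *)
   type 'a t = 'a -> string

   fun show (t : 'a t) (x : 'a) : string = t x

   (* base types *)
   val int : int t = Int.toString
   val bool : bool t = Bool.toString

   (* type constructors map type-indices to type-indices *)
   fun list (t : 'a t) : 'a list t =
      fn xs => "[" ^ String.concatWith ", " (map t xs) ^ "]"
   fun pair (a : 'a t, b : 'b t) : ('a * 'b) t =
      fn (x, y) => "(" ^ a x ^ ", " ^ b y ^ ")"

   (* user-defined types: convert to an already supported type *)
   fun inj (f : 'a -> 'b) (t : 'b t) : 'a t = t o f
end

datatype point = Point of int * int

val point =
   let open MiniShow in inj (fn Point p => p) (pair (int, int)) end

val shownList = let open MiniShow in show (list int) [1, 2, 3] end
val shownPoint = MiniShow.show point (Point (1, 2))
```

The value-level expression `list int` mirrors the type `int list`, which is what makes the function type-indexed.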
25740 We'll do a fairly complete implementation of `show`. We do not
25741 distinguish infix and nonfix constructors, but that is not an
25742 intrinsic property of SML datatypes. We also don't reconstruct a type
25743 name for the value, although it would be particularly useful for
25744 functional values. To reconstruct type names, some changes would be
25745 needed and the reader is encouraged to consider how to do that. A
25746 more realistic implementation would use some pretty printing
25747 combinators to compute a layout for the result. This should be a
25748 relatively easy change (given a suitable pretty printing library).
25749 Cyclic values (through references and arrays) do not have a standard
25750 textual presentation and it is impossible to convert arbitrary
25751 functional values (within SML) to a meaningful textual presentation.
25752 Finally, it would also make sense to show sharing of references and
25753 arrays. We'll leave these improvements to an actual library
25756 The following code uses the <:Fixpoints:fixpoint framework> and other
25757 utilities from an Extended Basis library (see
25758 <!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>).
25762 Let's consider the design of the `SHOW` signature:
25767 signature SHOW = sig
25768 type 'a t (* complete type-index *)
25769 type 'a s (* incomplete sum *)
25770 type ('a, 'k) p (* incomplete product *)
25771 type u (* tuple or unlabelled product *)
25772 type l (* record or labelled product *)
25774 val show : 'a t -> 'a -> string
25776 (* user-defined types *)
25777 val inj : ('a -> 'b) -> 'b t -> 'a t
25779 (* tuples and records *)
25780 val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
25782 val U : 'a t -> ('a, u) p
25783 val L : string -> 'a t -> ('a, l) p
25785 val tuple : ('a, u) p -> 'a t
25786 val record : ('a, l) p -> 'a t
25789 val + : 'a s * 'b s -> (('a, 'b) sum) s
25791 val C0 : string -> unit s
25792 val C1 : string -> 'a t -> 'a s
25794 val data : 'a s -> 'a t
25800 val regExn : (exn -> ('a * 'a s) option) -> unit
25802 (* some built-in type constructors *)
25803 val refc : 'a t -> 'a ref t
25804 val array : 'a t -> 'a array t
25805 val list : 'a t -> 'a list t
25806 val vector : 'a t -> 'a vector t
25807 val --> : 'a t * 'b t -> ('a -> 'b) t
25809 (* some built-in base types *)
25810 val string : string t
25820 While some details are shaped by the specific requirements of `show`,
25821 there are a number of (design) patterns that translate to other
25822 type-indexed values. The former kind of details are mostly shaped by
25823 the syntax of SML values that `show` is designed to produce. To this
25824 end, abstract types and phantom types are used to distinguish
25825 incomplete record, tuple, and datatype type-indices from each other
25826 and from complete type-indices. Also, names of record labels and
25827 datatype constructors need to be provided by the user.
25829 ==== Arbitrary user-defined datatypes ====
25831 Perhaps the most important pattern is how the design supports
25832 arbitrary user-defined datatypes. A number of combinators together
25833 conspire to provide the functionality. First of all, to support new
25834 user-defined types, a combinator taking a conversion function to a
25835 previously supported type is provided:
25838 val inj : ('a -> 'b) -> 'b t -> 'a t
25841 An injection function is sufficient in this case, but in the general
25842 case, an embedding with injection and projection functions may be
25845 To support products (tuples and records) a product combinator is
25849 val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
25851 The second (phantom) type variable `'k` is there to distinguish
25852 between labelled and unlabelled products and the type `p`
25853 distinguishes incomplete products from complete type-indices of type
25854 `t`. Most type-indexed values do not need to make such distinctions.
25856 To support sums (datatypes) a sum combinator is provided:
25859 val + : 'a s * 'b s -> (('a, 'b) sum) s
25861 Again, the purpose of the type `s` is to distinguish incomplete sums
25862 from complete type-indices of type `t`, which usually isn't necessary.
25864 Finally, to support recursive datatypes, including sets of mutually
25865 recursive datatypes, a <:Fixpoints:fixpoint tier> is provided:
25871 Together these combinators (with the more domain specific combinators
25872 `U`, `L`, `tuple`, `record`, `C0`, `C1`, and `data`) enable one to
25873 encode a type-index for any user-defined datatype.
25875 ==== Exceptions ====
25877 The `exn` type in SML is a <:UniversalType:universal type> into which
25878 all types can be embedded. SML also allows a program to generate new
25879 exception variants at run-time. Thus a mechanism is required to register
25880 handlers for particular variants:
25884 val regExn : (exn -> ('a * 'a s) option) -> unit
25887 The universal `exn` type-index then makes use of the registered
25888 handlers. The above particular form of handler, which converts an
25889 exception value to a value of some type and a type-index for that type
25890 (essentially an existential type) is designed to make it convenient to
25891 write handlers. To write a handler, one can conveniently reuse
25892 existing type-indices:
25895 exception Int of int
25900 val () = regExn (fn Int v => SOME (v, C1"Int" int)
25905 Note that a single handler may actually handle an arbitrary number of
25906 different exceptions.
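The registration mechanism itself can be sketched independently of the rest of `Show`. In the simplified model below, handlers return a string directly instead of a value paired with a type-index, and `showExn` and the two exceptions are hypothetical names. It is a sketch of the idea, not the implementation given later on this page.

```sml
(* Simplified registry: each handler recognizes some exception variants. *)
val handlers : (exn -> string option) list ref = ref []

fun regExn h = handlers := h :: !handlers

(* Try the handlers in turn; fall back to the exception's name. *)
fun showExn e =
   let
      fun lp [] = "<exn:" ^ General.exnName e ^ ">"
        | lp (h :: hs) =
             (case h e of
                 SOME s => s
               | NONE => lp hs)
   in
      lp (!handlers)
   end

exception Int of int
exception Pair of int * int

(* A single handler may cover several exceptions. *)
val () =
   regExn (fn Int v => SOME ("Int " ^ Int.toString v)
            | Pair (a, b) =>
                 SOME ("Pair (" ^ Int.toString a ^ ", " ^ Int.toString b ^ ")")
            | _ => NONE)
```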
25908 ==== Other types ====
25910 Some built-in and standard types typically require special treatment
25911 due to their special nature. The most important of these are arrays
25912 and references, because cyclic data (ignoring closures) and observable
25913 sharing can only be constructed through them.
25915 When arrow types are really supported, unlike in this case, they
25916 usually need special treatment due to the contravariance of arguments.
25918 Lists and vectors require special treatment in the case of `show`,
25919 because of their special syntax. This isn't usually the case.
25921 The set of base types to support also needs to be considered unless
25922 one exports an interface for constructing type-indices for entirely
25927 Before going to the implementation, let's look at some examples. For
25928 the following examples, we'll assume a structure binding
25929 `Show :> SHOW`. If you want to try the examples immediately, just
25930 skip forward to the implementation.
25932 To use `show`, one first needs a type-index, which is then given to
25933 `show`. To show a list of integers, one would use the type-index
25934 `list int`, which has the type `int list Show.t`:
25938 let open Show in show (list int) end
25942 Likewise, to show a list of lists of characters, one would use the
25943 type-index `list (list char)`, which has the type `char list list
25947 val "[[#\"a\", #\"b\", #\"c\"], []]" =
25948 let open Show in show (list (list char)) end
25949 [[#"a", #"b", #"c"], []]
25952 Handling standard types is not particularly interesting. It is more
25953 interesting to see how user-defined types can be handled. Although
25954 the `option` datatype is a standard type, it requires no special
25955 support, so we can treat it as a user-defined type. Options can be
25956 encoded easily using a sum:
25962 inj (fn NONE => INL ()
25964 (data (C0"NONE" + C1"SOME" t))
25968 let open Show in show (option int) end
25972 Readers new to type-indexed values might want to type annotate each
25973 subexpression of the above example as an exercise. (Use a compiler to
25974 check your annotations.)
25976 Using a product, user specified records can also be encoded easily:
25982 inj (fn {a, b, c} => a & b & c)
25983 (record (L"a" (option int) *
25988 val "{a = SOME 1, b = 3.0, c = false}" =
25989 let open Show in show abc end
25990 {a = SOME 1, b = 3.0, c = false}
25993 As you can see, both of the above use `inj` to inject user-defined
25994 types to the general purpose sum and product types.
25996 Of particular interest is whether recursive datatypes and cyclic data
25997 can be handled. For example, how does one write a type-index for a
25998 recursive datatype such as a cyclic graph?
26001 datatype 'a graph = VTX of 'a * 'a graph list ref
26002 fun arcs (VTX (_, r)) = r
26005 Using the `Show` combinators, we could first write a new type-index
26006 combinator for `graph`:
26012 fix Y (fn graph_a =>
26013 inj (fn VTX (x, y) => x & y)
26016 U (refc (list graph_a)))))))
26020 To show a graph with integer labels
26024 val a = VTX (1, ref [])
26025 val b = VTX (2, ref [])
26026 val c = VTX (3, ref [])
26027 val d = VTX (4, ref [])
26028 val e = VTX (5, ref [])
26029 val f = VTX (6, ref [])
26040 we could then simply write
26043 val "VTX (1, ref [VTX (2, ref [VTX (3, ref [VTX (1, %0), \
26044 \VTX (6, ref [VTX (5, ref [VTX (4, ref [VTX (6, %3)])])] as %3)]), \
26045 \VTX (5, ref [VTX (4, ref [VTX (6, ref [VTX (5, %2)])])] as %2)]), \
26046 \VTX (4, ref [VTX (6, ref [VTX (5, ref [VTX (4, %1)])])] as %1)] as %0)" =
26047 let open Show in show (graph int) end
26051 There is a subtle gotcha with cyclic data. Consider the following code:
26054 exception ExnArray of exn array
26059 regExn (fn ExnArray a =>
26060 SOME (a, C1"ExnArray" (array exn))
26065 val a = Array.fromList [Empty]
26067 Array.update (a, 0, ExnArray a) ; a
26071 Although the above looks innocent enough, the evaluation of
26074 val "[|ExnArray %0|] as %0" =
26075 let open Show in show (array exn) end
26078 goes into an infinite loop. To avoid this problem, the type-index
26079 `array exn` must be evaluated only once, as in the following:
26082 val array_exn = let open Show in array exn end
26084 exception ExnArray of exn array
26089 regExn (fn ExnArray a =>
26090 SOME (a, C1"ExnArray" array_exn)
26095 val a = Array.fromList [Empty]
26097 Array.update (a, 0, ExnArray a) ; a
26100 val "[|ExnArray %0|] as %0" =
26101 let open Show in show array_exn end
26105 Cyclic data (excluding closures) in Standard ML can only be
26106 constructed imperatively through arrays and references (combined with
26107 exceptions or recursive datatypes). Before recursing to a reference
26108 or an array, one needs to check whether that reference or array has
26109 already been seen. When `refc` or `array` is called with a
26110 type-index, a new cyclicity checker is instantiated.
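The check can be sketched in isolation. The following is a simplified model rather than the `cyclic` function used in the implementation: it handles one fixed datatype (`node`, a hypothetical name), keeps the references seen so far in a list, and prints `<cycle>` instead of numbered back-references.

```sml
datatype node = Node of int * node list ref

fun showNode node =
   let
      (* seen: the references on the path from the root to this node *)
      fun go seen (Node (x, r)) =
         if List.exists (fn r' => r' = r) seen
            then "<cycle>"
         else "Node (" ^ Int.toString x ^ ", ref ["
              ^ String.concatWith ", " (map (go (r :: seen)) (!r))
              ^ "])"
   in
      go [] node
   end

(* Build a node whose reference points back to the node itself. *)
val r : node list ref = ref []
val n = Node (1, r)
val () = r := [n]

val shown = showNode n
```

The comparison `r' = r` is reference equality, so a reference counts as seen only if it is the very same cell, not merely one with equal contents.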
26112 == Implementation ==
26116 structure SmlSyntax = struct
26118 structure CV = CharVector and C = Char
26120 val isSym = Char.contains "!%&$#+-/:<=>?@\\~`^|*"
26122 fun isSymId s = 0 < size s andalso CV.all isSym s
26124 fun isAlphaNumId s =
26126 andalso C.isAlpha (CV.sub (s, 0))
26127 andalso CV.all (fn c => C.isAlphaNum c
26133 andalso #"0" <> CV.sub (s, 0)
26134 andalso CV.all C.isDigit s
26136 fun isId s = isAlphaNumId s orelse isSymId s
26138 fun isLongId s = List.all isId (String.fields (#"." <\ op =) s)
26140 fun isLabel s = isId s orelse isNumLabel s
26144 structure Show :> SHOW = struct
26145 datatype 'a t = IN of exn list * 'a -> bool * string
26147 type ('a, 'k) p = 'a t
26151 fun show (IN t) x = #2 (t ([], x))
26153 (* user-defined types *)
26154 fun inj inj (IN b) = IN (b o Pair.map (id, inj))
26157 fun surround pre suf (_, s) = (false, concat [pre, s, suf])
26158 fun parenthesize x = if #1 x then surround "(" ")" x else x
26159 fun construct tag =
26160 (fn (_, s) => (true, concat [tag, " ", s])) o parenthesize
26161 fun check p m s = if p s then () else raise Fail (m^s)
26163 (* tuples and records *)
26164 fun (IN l) * (IN r) =
26165 IN (fn (rs, a & b) =>
26166 (false, concat [#2 (l (rs, a)),
26171 fun L l = (check SmlSyntax.isLabel "Invalid label: " l
26172 ; fn IN t => IN (surround (l^" = ") "" o t))
26174 fun tuple (IN t) = IN (surround "(" ")" o t)
26175 fun record (IN t) = IN (surround "{" "}" o t)
26178 fun (IN l) + (IN r) = IN (fn (rs, INL a) => l (rs, a)
26179 | (rs, INR b) => r (rs, b))
26181 fun C0 c = (check SmlSyntax.isId "Invalid constructor: " c
26182 ; IN (const (false, c)))
26183 fun C1 c (IN t) = (check SmlSyntax.isId "Invalid constructor: " c
26184 ; IN (construct c o t))
26188 fun Y ? = Tie.iso Tie.function (fn IN x => x, IN) ?
26192 val handlers = ref ([] : (exn -> unit t option) list)
26194 val exn = IN (fn (rs, e) => let
26196 C0(concat ["<exn:",
26203 val IN f = lp (!handlers)
26208 handlers := (Option.map
26215 (* some built-in type constructors *)
26217 fun cyclic (IN t) = let
26218 exception E of ''a * bool ref
26220 IN (fn (rs, v : ''a) => let
26221 val idx = Int.toString o length
26222 fun lp (E (v', c)::rs) =
26223 if v' <> v then lp rs
26224 else (c := false ; (false, "%"^idx rs))
26225 | lp (_::rs) = lp rs
26228 val r = t (E (v, c)::rs, v)
26231 else surround "" (" as %"^idx rs) r
26238 fun aggregate pre suf toList (IN t) =
26239 IN (surround pre suf o
26244 (map (#2 o curry t rs)
26247 fun refc ? = (cyclic o inj ! o C1"ref") ?
26248 fun array ? = (cyclic o aggregate "[|" "|]" (Array.foldr op:: [])) ?
26249 fun list ? = aggregate "[" "]" id ?
26250 fun vector ? = aggregate "#[" "]" (Vector.foldr op:: []) ?
26253 fun (IN _) --> (IN _) = IN (const (false, "<fn>"))
26255 (* some built-in base types *)
26257 fun mk toS = (fn x => (false, x)) o toS o (fn (_, x) => x)
26260 IN (surround "\"" "\"" o mk (String.translate Char.toString))
26261 val unit = IN (mk (fn () => "()"))
26262 val bool = IN (mk Bool.toString)
26263 val char = IN (surround "#\"" "\"" o mk Char.toString)
26264 val int = IN (mk Int.toString)
26265 val word = IN (surround "0wx" "" o mk Word.toString)
26266 val real = IN (mk Real.toString)
26271 (* Handlers for standard top-level exceptions *)
26274 fun E0 name = SOME ((), C0 name)
26276 regExn (fn Bind => E0"Bind"
26279 | Domain => E0"Domain"
26280 | Empty => E0"Empty"
26281 | Match => E0"Match"
26282 | Option => E0"Option"
26283 | Overflow => E0"Overflow"
26286 | Subscript => E0"Subscript"
26288 ; regExn (fn Fail s => SOME (s, C1"Fail" string)
26296 There are a number of related techniques. Here are some of them.
26303 :mlton-guide-page: TypeVariableScope
26304 [[TypeVariableScope]]
26308 In <:StandardML:Standard ML>, every type variable is _scoped_ (or
26309 bound) at a particular point in the program. A type variable can be
26310 either implicitly scoped or explicitly scoped. For example, `'a` is
26311 implicitly scoped in
26315 val id: 'a -> 'a = fn x => x
26318 and is implicitly scoped in
26322 val id = fn x: 'a => x
26325 On the other hand, `'a` is explicitly scoped in
26329 val 'a id: 'a -> 'a = fn x => x
26332 and is explicitly scoped in
26336 val 'a id = fn x: 'a => x
26339 A type variable can be scoped at a `val` or `fun` declaration. An SML
26340 type checker performs scope inference on each top-level declaration to
26341 determine the scope of each implicitly scoped type variable. After
26342 scope inference, every type variable is scoped at exactly one
26343 enclosing `val` or `fun` declaration. Scope inference shows that the
26344 first and second example above are equivalent to the third and fourth
26345 example, respectively.
26347 Section 4.6 of the <:DefinitionOfStandardML:Definition> specifies
26348 precisely the scope of an implicitly scoped type variable. A free
26349 occurrence of a type variable `'a` in a declaration `d` is said to be
26350 _unguarded_ in `d` if `'a` is not part of a smaller declaration. A
26351 type variable `'a` is implicitly scoped at `d` if `'a` is unguarded in
26352 `d` and `'a` does not occur unguarded in any declaration containing
26356 == Scope inference examples ==
26362 val id: 'a -> 'a = fn x => x
26365 `'a` is unguarded in `val id` and does not occur unguarded in any
26366 containing declaration. Hence, `'a` is scoped at `val id` and the
26367 declaration is equivalent to the following.
26371 val 'a id: 'a -> 'a = fn x => x
26378 val f = fn x => let exception E of 'a in E x end
26381 `'a` is unguarded in `val f` and does not occur unguarded in any
26382 containing declaration. Hence, `'a` is scoped at `val f` and the
26383 declaration is equivalent to the following.
26387 val 'a f = fn x => let exception E of 'a in E x end
26390 * In this example (taken from the <:DefinitionOfStandardML:Definition>),
26394 val x: int -> int = let val id: 'a -> 'a = fn z => z in id id end
26397 `'a` occurs unguarded in `val id`, but not in `val x`. Hence, `'a` is
26398 implicitly scoped at `val id`, and the declaration is equivalent to
26403 val x: int -> int = let val 'a id: 'a -> 'a = fn z => z in id id end
26411 val f = (fn x: 'a => x) (fn y => y)
26414 `'a` occurs unguarded in `val f` and does not occur unguarded in any
26415 containing declaration. Hence, `'a` is implicitly scoped at `val f`,
26416 and the declaration is equivalent to the following.
26420 val 'a f = (fn x: 'a => x) (fn y => y)
26423 This does not type check due to the <:ValueRestriction:>.
26431 fun g (y: 'a) = if true then x else y
26437 `'a` occurs unguarded in `fun g`, not in `fun f`. Hence, `'a` is
26438 implicitly scoped at `fun g`, and the declaration is equivalent to
26444 fun 'a g (y: 'a) = if true then x else y
26450 This fails to type check because `x` and `y` must have the same type,
26451 but the `x` occurs outside the scope of the type variable `'a`. MLton
26452 reports the following error.
26455 Error: z.sml 3.21-3.41.
26456 Then and else branches disagree.
26459 in: if true then x else y
26460 note: type would escape its scope: 'a
26461 escape to: z.sml 1.1-6.5
26464 This problem could be fixed either by adding an explicit type
26465 constraint, as in `fun f (x: 'a)`, or by explicitly scoping `'a`, as
26466 in `fun 'a f x = ...`.
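As a sketch of the second fix, explicitly scoping `'a` at `f` makes the occurrence of `'a` in `g` refer to the outer binding, so `x` and `y` may share it:

```sml
(* 'a is bound at f, so g is monomorphic in 'a and may use x at type 'a. *)
fun 'a f x =
   let
      fun g (y: 'a) = if true then x else y
   in
      g x
   end

val one = f 1
val a = f "a"
```

Here `f` remains polymorphic, with type `'a -> 'a`, while `g` is no longer generalized over `'a`.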
26469 == Restrictions on type variable scope ==
26471 It is not allowed to scope a type variable within a declaration in
26472 which it is already in scope (see the last restriction listed on page
26473 9 of the <:DefinitionOfStandardML:Definition>). For example, the
26474 following program is invalid.
26480 fun 'a g (y: 'a) = y
26486 MLton reports the following error.
26489 Error: z.sml 3.11-3.12.
26490 Type variable scoped at an outer declaration: 'a.
26491 scoped at: z.sml 1.1-6.6
26494 This is an error even if the scoping is implicit. That is, the
26495 following program is invalid as well.
26501 fun 'a g (y: 'a) = y
26509 :mlton-guide-page: Unicode
26514 == Support in The Definition of Standard ML ==
26516 There is no real support for Unicode in the
26517 <:DefinitionOfStandardML:Definition>; there are only a few throw-away
26518 sentences along the lines of "the characters with numbers 0 to 127
26519 coincide with the ASCII character set."
26521 == Support in The Standard ML Basis Library ==
26523 Neither is there real support for Unicode in the <:BasisLibrary:Basis
26524 Library>. The general consensus (which includes the opinions of the
26525 editors of the Basis Library) is that the `WideChar` and `WideString`
26526 structures are insufficient for the purposes of Unicode. There is no
26527 `LargeChar` structure, which in itself is a deficiency, since a
26528 programmer can not program against the largest supported character
26531 == Current Support in MLton ==
26533 MLton, as a minor extension over the Definition, supports UTF-8 byte
26534 sequences in text constants. This feature enables "UTF-8 convenience"
26535 (but not comprehensive Unicode support); in particular, it allows one
26536 to copy text from a browser and paste it into a string constant in an
26537 editor and, furthermore, if the string is printed to a terminal, then
26538 it will (typically) appear as the original text. See the
26539 <:SuccessorML#ExtendedTextConsts:extended text constants feature of
26540 Successor ML> for more details.
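
To see what "UTF-8 byte sequences" means for an ordinary `string`, here is a small sketch that spells the bytes out with standard decimal escapes, so it does not depend on the extension itself. The names `lambda`, `n`, and `bytes` are hypothetical.

[source,sml]
----
(* The Greek letter lambda (U+03BB) is encoded in UTF-8 as the two
   bytes 0xCE 0xBB, that is, 206 and 187 in decimal. *)
val lambda = "\206\187"

(* size counts 8-bit characters (bytes), not Unicode code points. *)
val n = size lambda

(* The individual bytes of the string. *)
val bytes = map Char.ord (explode lambda)
----

A string constant written with the literal character in a UTF-8 encoded source file would denote the same two bytes, which is why printing it to a UTF-8 terminal typically shows the original text.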
26542 MLton, also as a minor extension over the Definition, supports
26543 `\Uxxxxxxxx` numeric escapes in text constants and has preliminary
26544 internal support for 16- and 32-bit characters and strings.
26546 MLton provides `WideChar` and `WideString` structures, corresponding
26547 to 32-bit characters and strings, respectively.
26549 == Questions and Discussions ==
26551 There are periodic flurries of questions and discussion about Unicode
26552 in MLton/SML. In December 2004, there was a discussion that led to
26553 some seemingly sound design decisions. The discussion started at:
26555 * http://www.mlton.org/pipermail/mlton/2004-December/026396.html
26557 There is a good summary of points at:
26559 * http://www.mlton.org/pipermail/mlton/2004-December/026440.html
26561 In November 2005, there was a followup discussion and the beginning of
26564 * http://www.mlton.org/pipermail/mlton/2005-November/028300.html
26568 The <:fxp:> XML parser has some support for dealing with Unicode
26573 :mlton-guide-page: UniversalType
26578 A universal type is a type into which all other types can be embedded.
26579 Here's a <:StandardML:Standard ML> signature for a universal type.
26583 signature UNIVERSAL_TYPE =
26587 val embed: unit -> ('a -> t) * (t -> 'a option)
26591 The idea is that `type t` is the universal type and that each call to
26592 `embed` returns a new pair of functions `(inject, project)`, where
26593 `inject` embeds a value into the universal type and `project` extracts
26594 the value from the universal type. A pair `(inject, project)`
26595 returned by `embed` works together in that `project u` will return
26596 `SOME v` if and only if `u` was created by `inject v`. If `u` was
26597 created by a different function `inject'`, then `project` returns
26600 Here's an example embedding integers and reals into a universal type.
26604 functor Test (U: UNIVERSAL_TYPE): sig end =
26606 val (intIn: int -> U.t, intOut) = U.embed ()
26607 val r: U.t ref = ref (intIn 13)
26609 case intOut (!r) of
26611 | SOME i => Int.toString i
26612 val (realIn: real -> U.t, realOut) = U.embed ()
26613 val () = r := realIn 13.0
26615 case intOut (!r) of
26617 | SOME i => Int.toString i
26619 case realOut (!r) of
26621 | SOME x => Real.toString x
26622 val () = print (concat [s1, " ", s2, " ", s3, "\n"])
26626 Applying `Test` to an appropriate implementation will print
26632 Note that two different calls to embed on the same type return
26633 different embeddings.
26635 Standard ML does not have explicit support for universal types;
26636 however, there are at least two ways to implement them.
26639 == Implementation Using Exceptions ==
26641 While the intended use of SML exceptions is for exception handling, an
26642 accidental feature of their design is that the `exn` type is a
26643 universal type. The implementation relies on being able to declare
26644 exceptions locally to a function and on the fact that exceptions are
26645 <:GenerativeException:generative>.
26649 structure U:> UNIVERSAL_TYPE =
26656 fun project (e: t): 'a option =
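
For reference, here is a self-contained sketch of the exception-based approach, including the signature and a short usage example. The bindings after the structure (`intIn'`, `a`, `b`, and so on) are hypothetical names added for illustration.

[source,sml]
----
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

structure U:> UNIVERSAL_TYPE =
   struct
      type t = exn

      fun 'a embed () =
         let
            (* Exceptions are generative: each call to embed
               evaluates this declaration anew and so creates
               a fresh constructor E. *)
            exception E of 'a
            fun project (e: t): 'a option =
               case e of
                  E a => SOME a
                | _ => NONE
         in
            (E, project)
         end
   end

(* Two embeddings of the same type are distinct. *)
val (intIn: int -> U.t, intOut) = U.embed ()
val (intIn': int -> U.t, intOut') = U.embed ()
val a = intOut (intIn 13)    (* SOME 13 *)
val b = intOut' (intIn 13)   (* NONE *)
----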
26667 == Implementation Using Functions and References ==
26671 structure U:> UNIVERSAL_TYPE =
26673 datatype t = T of {clear: unit -> unit,
26674 store: unit -> unit}
26678 val r: 'a option ref = ref NONE
26679 fun inject (a: 'a): t =
26680 T {clear = fn () => r := NONE,
26681 store = fn () => r := SOME a}
26682 fun project (T {clear, store}): 'a option =
26696 Note that due to the use of a shared ref cell, the above
26697 implementation is not thread safe.
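
The following self-contained sketch spells out the approach with functions and references in full. The body of `project` and the usage bindings at the end are written for illustration and should be read as one plausible completion, not as the definitive code.

[source,sml]
----
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

structure U:> UNIVERSAL_TYPE =
   struct
      datatype t = T of {clear: unit -> unit,
                         store: unit -> unit}

      fun 'a embed () =
         let
            (* One ref cell per embedding. *)
            val r: 'a option ref = ref NONE
            fun inject (a: 'a): t =
               T {clear = fn () => r := NONE,
                  store = fn () => r := SOME a}
            fun project (T {clear, store}): 'a option =
               let
                  (* store writes to the cell of whichever embedding
                     created the value; we then read our own cell. *)
                  val () = store ()
                  val res = !r
                  (* clear empties that cell again, so the value is
                     not kept alive longer than necessary. *)
                  val () = clear ()
               in
                  res
               end
         in
            (inject, project)
         end
   end

val (strIn: string -> U.t, strOut) = U.embed ()
val (intIn: int -> U.t, intOut) = U.embed ()
val u = strIn "foo"
val a = strOut u   (* SOME "foo" *)
val b = intOut u   (* NONE *)
----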
26699 One could try to simplify the above implementation by eliminating the
26700 `clear` function, making `type t = unit -> unit`.
26704 structure U:> UNIVERSAL_TYPE =
26706 type t = unit -> unit
26710 val r: 'a option ref = ref NONE
26711 fun inject (a: 'a): t = fn () => r := SOME a
26712 fun project (f: t): 'a option = (r := NONE; f (); !r)
26719 While correct, this approach keeps the contents of the ref cell alive
26720 longer than necessary, which could cause a space leak. The problem is
26721 in `project`, where the call to `f` stores some value in some ref cell
26722 `r'`. Perhaps `r'` is the same ref cell as `r`, but perhaps not. If
26723 we do not clear `r'` before returning from `project`, then `r'` will
26724 keep the value alive, even though it is useless.
26729 * <:PropertyList:>: Lisp-style property lists implemented with a universal type
26733 :mlton-guide-page: UnresolvedBugs
26738 Here are the places where MLton deviates from
26739 <:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and
26740 the <:BasisLibrary:Basis Library>. In general, MLton complies with
26741 the <:DefinitionOfStandardML:Definition> quite closely, typically much
26742 more closely than other SML compilers (see, e.g., our list of
26743 <:SMLNJDeviations:SML/NJ's deviations>). In fact, the four deviations
26744 listed here are the only known deviations, and we have no immediate
26745 plans to fix them. If you find a deviation not listed here, please
26748 We don't plan to fix these bugs because the first (parsing nested
26749 cases) has historically never been accepted by any SML compiler, the
26750 second clearly indicates a problem in the
26751 <:DefinitionOfStandardML:Definition>, and the remaining ones are difficult
26752 to resolve in the context of MLton's implementation of Standard ML (and
26753 unlikely to be problematic in practice).
26755 * MLton does not correctly parse case expressions nested within other
26756 matches. For example, the following fails.
26767 To do this in a program, simply parenthesize the case expression.
26769 Allowing such expressions, although compliant with the Definition,
26770 would be a mistake, since using parentheses is clearer and no SML
26771 compiler has ever allowed them. Furthermore, implementing this would
26772 require serious yacc grammar rewriting followed by postprocessing.
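+
As a sketch of the parenthesized form, consider a two-clause function whose first clause ends in a `case` expression. The function `f` and its clauses are hypothetical, chosen only to show where the parentheses go.
+
[source,sml]
----
fun f 0 y =
      (* The parentheses end the case expression, so the next
         clause is parsed as a clause of f. *)
      (case y of
          1 => 2
        | _ => 3)
  | f _ y = 4
----
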
26774 * MLton does not raise the `Bind` exception at run time when
26775 evaluating `val rec` (and `fun`) declarations that redefine
26776 identifiers that previously had constructor status. (By default,
26777 MLton does warn at compile time about `val rec` (and `fun`)
26778 declarations that redefine identifiers that previously had
26779 constructor status; see the `valrecConstr` <:MLBasisAnnotations:ML
26780 Basis annotation>.) For example, the Definition requires the
26781 following program to type check, but also (bizarrely) requires it to
26782 raise the `Bind` exception
26786 val rec NONE = fn () => ()
26789 The Definition's behavior is obviously an error, a mismatch between
26790 the static semantics (rule 26) and the dynamic semantics (rule 126).
26791 Given the comments on rule 26 in the Definition, it seems clear that
26792 the authors meant for `val rec` to allow an identifier's constructor
26793 status to be overridden both statically and dynamically. Hence, MLton
26794 and most SML compilers follow rule 26, but do not follow rule 126.
26796 * MLton does not hide the equality aspect of types declared in
26797 `abstype` declarations. So, MLton accepts programs like the following,
26798 while the Definition rejects them.
26802 abstype t = T with end
26803 val _ = fn (t1, t2 : t) => t1 = t2
26805 abstype t = T with val a = T end
26809 One consequence of this choice is that MLton accepts the following
26810 program, in accordance with the Definition.
26814 abstype t = T with val eq = op = end
26815 val _ = fn (t1, t2 : t) => eq (t1, t2)
26818 Other implementations will typically reject this program, because they
26819 make an early choice for the type of `eq` to be `''a * ''a -> bool`
26820 instead of `t * t -> bool`. The choice is understandable, since the
26821 Definition accepts the following program.
26825 abstype t = T with val eq = op = end
26830 * MLton (re-)type checks each functor definition at every
26831 corresponding functor application (the compilation technique of
26832 defunctorization). One consequence of this implementation is that
26833 MLton accepts the following program, while the Definition rejects
26838 functor F (X: sig type t end) = struct
26841 structure A = F (struct type t = int end)
26842 structure B = F (struct type t = bool end)
26847 On the other hand, other implementations will typically reject the
26848 following program, while MLton and the Definition accept it.
26852 functor F (X: sig type t end) = struct
26855 structure A = F (struct type t = int end)
26856 structure B = F (struct type t = bool end)
26861 See <!Cite(DreyerBlume07)> for more details.
26865 :mlton-guide-page: UnsafeStructure
26866 [[UnsafeStructure]]
26870 This module is a subset of the `Unsafe` module provided by SML/NJ,
26871 with a few extra operations for `PackWord` and `PackReal`.
26875 signature UNSAFE_MONO_ARRAY =
26880 val create: int -> array
26881 val sub: array * int -> elem
26882 val update: array * int * elem -> unit
26885 signature UNSAFE_MONO_VECTOR =
26890 val sub: vector * int -> elem
26897 val create: int * 'a -> 'a array
26898 val sub: 'a array * int -> 'a
26899 val update: 'a array * int * 'a -> unit
26901 structure CharArray: UNSAFE_MONO_ARRAY
26902 structure CharVector: UNSAFE_MONO_VECTOR
26903 structure IntArray: UNSAFE_MONO_ARRAY
26904 structure IntVector: UNSAFE_MONO_VECTOR
26905 structure Int8Array: UNSAFE_MONO_ARRAY
26906 structure Int8Vector: UNSAFE_MONO_VECTOR
26907 structure Int16Array: UNSAFE_MONO_ARRAY
26908 structure Int16Vector: UNSAFE_MONO_VECTOR
26909 structure Int32Array: UNSAFE_MONO_ARRAY
26910 structure Int32Vector: UNSAFE_MONO_VECTOR
26911 structure Int64Array: UNSAFE_MONO_ARRAY
26912 structure Int64Vector: UNSAFE_MONO_VECTOR
26913 structure IntInfArray: UNSAFE_MONO_ARRAY
26914 structure IntInfVector: UNSAFE_MONO_VECTOR
26915 structure LargeIntArray: UNSAFE_MONO_ARRAY
26916 structure LargeIntVector: UNSAFE_MONO_VECTOR
26917 structure LargeRealArray: UNSAFE_MONO_ARRAY
26918 structure LargeRealVector: UNSAFE_MONO_VECTOR
26919 structure LargeWordArray: UNSAFE_MONO_ARRAY
26920 structure LargeWordVector: UNSAFE_MONO_VECTOR
26921 structure RealArray: UNSAFE_MONO_ARRAY
26922 structure RealVector: UNSAFE_MONO_VECTOR
26923 structure Real32Array: UNSAFE_MONO_ARRAY
26924 structure Real32Vector: UNSAFE_MONO_VECTOR
26925 structure Real64Array: UNSAFE_MONO_ARRAY
26928 val sub: 'a vector * int -> 'a
26930 structure Word8Array: UNSAFE_MONO_ARRAY
26931 structure Word8Vector: UNSAFE_MONO_VECTOR
26932 structure Word16Array: UNSAFE_MONO_ARRAY
26933 structure Word16Vector: UNSAFE_MONO_VECTOR
26934 structure Word32Array: UNSAFE_MONO_ARRAY
26935 structure Word32Vector: UNSAFE_MONO_VECTOR
26936 structure Word64Array: UNSAFE_MONO_ARRAY
26937 structure Word64Vector: UNSAFE_MONO_VECTOR
26939 structure PackReal32Big : PACK_REAL
26940 structure PackReal32Little : PACK_REAL
26941 structure PackReal64Big : PACK_REAL
26942 structure PackReal64Little : PACK_REAL
26943 structure PackRealBig : PACK_REAL
26944 structure PackRealLittle : PACK_REAL
26945 structure PackWord16Big : PACK_WORD
26946 structure PackWord16Little : PACK_WORD
26947 structure PackWord32Big : PACK_WORD
26948 structure PackWord32Little : PACK_WORD
26949 structure PackWord64Big : PACK_WORD
26950 structure PackWord64Little : PACK_WORD
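
As a hedged usage sketch, the following sums an array using unchecked indexing. It assumes that the signature above is bound to a top-level structure named `Unsafe` and that the polymorphic `create`, `sub`, and `update` operations live in a substructure `Unsafe.Array`; the function `sum` is hypothetical.

[source,sml]
----
(* Sum an int array without per-access bounds checks.  The loop
   itself guarantees 0 <= i < n, which is what makes the unchecked
   access acceptable here. *)
fun sum (a: int array): int =
   let
      val n = Array.length a
      fun loop (i, acc) =
         if i >= n
            then acc
         else loop (i + 1, acc + Unsafe.Array.sub (a, i))
   in
      loop (0, 0)
   end

val total = sum (Array.fromList [1, 2, 3])
----

Because these operations skip bounds checks, an out-of-range index leads to undefined behavior instead of a `Subscript` exception.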
26956 :mlton-guide-page: Useless
26961 <:Useless:> is an optimization pass for the <:SSA:>
26962 <:IntermediateLanguage:>, invoked from <:SSASimplify:>.
26968 * removes components of tuples that are constants (use unification)
26969 * removes function arguments that are constants
26970 * builds some kind of dependence graph where
26971 ** a value of ground type is useful if it is an arg to a primitive
26972 ** a tuple is useful if it contains a useful component
26973 ** a constructor is useful if it contains a useful component or is used in a `Case` transfer
26975 If a useful tuple is coerced to another useful tuple, then all of
26976 their components must agree (exactly). It is trivial to convert a
26977 useful value to a useless one.
26979 == Implementation ==
26981 * <!ViewGitFile(mlton,master,mlton/ssa/useless.fun)>
26983 == Details and Notes ==
26985 It is also trivial to convert a useful tuple to one of its useful
26986 components -- but this seems hard.
26988 Suppose that you have a `ref`/`array`/`vector` that is useful, but the
26989 components aren't -- then the components are converted to type `unit`,
26990 and any primitive args must be as well.
26992 Unify all handler arguments so that `raise`/`handle` has a consistent
26993 calling convention.
26997 :mlton-guide-page: Users
27002 Here is a list of companies, projects, and courses that use or have
27003 used MLton. If you use MLton and are not here, please add your
27004 project with a brief description and a link. Thanks.
27008 * http://www.hardcoreprocessing.com/[Hardcore Processing] uses MLton as a http://www.hardcoreprocessing.com/Freeware/MLTonWin32.html[crosscompiler from Linux to Windows] for graphics and game software.
27009 ** http://www.cex3d.net/[CEX3D Converter], a conversion program for 3D objects.
27010 ** http://www.hardcoreprocessing.com/company/showreel/index.html[Interactive Showreel], which contains a crossplatform GUI-toolkit and a realtime renderer for a subset of RenderMan written in Standard ML.
27011 ** various http://www.hardcoreprocessing.com/entertainment/index.html[games]
27012 * http://www.mathworks.com/products/polyspace/[MathWorks/PolySpace Technologies] builds their product that detects runtime errors in embedded systems based on abstract interpretation.
27013 // * http://www.sourcelight.com/[Sourcelight Technologies] uses MLton internally for prototyping and for processing databases as part of their system that makes personalized movie recommen
27014 * http://www.reactive-systems.com/[Reactive Systems] uses MLton to build Reactis, a model-based testing and validation package used in the automotive and aerospace industries.
27018 * http://www-ia.hiof.no/%7Erolando/adate_intro.html[ADATE], Automatic Design of Algorithms Through Evolution, a system for automatic programming, i.e., inductive inference of algorithms. ADATE can automatically generate non-trivial and novel algorithms written in Standard ML.
27019 * http://types.bu.edu/reports/Dim+Wes+Mul+Tur+Wel+Con:TIC-2000-LNCS.html[CIL], a compiler for SML based on intersection and union types.
27020 * http://www.cs.cmu.edu/%7Econcert/[ConCert], a project investigating certified code for grid computing.
27021 * http://hcoop.sourceforge.net/[Cooperative Internet hosting tools]
27022 // * http://www.eecs.harvard.edu/%7Estein/[DesynchFS], a programming model and distributed file system for large clusters
27023 * http://www.fantasy-coders.de/projects/gh/[Guugelhupf], a simple search engine.
27024 * http://www.mpi-sws.org/%7Erossberg/hamlet/[HaMLet], a model implementation of Standard ML.
27025 * http://code.google.com/p/kepler-code/[KeplerCode], independent verification of the computational aspects of proofs of the Kepler conjecture and the Dodecahedral conjecture.
27026 * http://www.gilith.com/research/metis/[Metis], a first-order prover (used in the http://hol.sourceforge.net/[HOL4 theorem prover] and the http://isabelle.in.tum.de/[Isabelle theorem prover]).
27027 * http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/mlftpd/[mlftpd], an ftp daemon written in SML. <:TomMurphy:> is also working on http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/[replacements for standard network services] in SML. He also uses MLton to build his entries (http://www.cs.cmu.edu/%7Etom7/icfp2001/[2001], http://www.cs.cmu.edu/%7Etom7/icfp2002/[2002], http://www.cs.cmu.edu/%7Etom7/icfp2004/[2004], http://www.cs.cmu.edu/%7Etom7/icfp2005/[2005]) in the annual ICFP programming contest.
27028 * http://www.informatik.uni-freiburg.de/proglang/research/software/mlope/[MLOPE], an offline partial evaluator for Standard ML.
27029 * http://www.ida.liu.se/%7Epelab/rml/[RML], a system for developing, compiling, debugging, and teaching structural operational semantics (SOS) and natural semantics specifications.
27030 * http://www.macs.hw.ac.uk/ultra/skalpel/index.html[Skalpel], a type-error slicer for SML
27031 // * http://alleystoughton.us/smlnjtrans/[SMLNJtrans], a program for generating SML/NJ transcripts in LaTeX.
27032 * http://www.cs.cmu.edu/%7Etom7/ssapre/[SSA PRE], an implementation of Partial Redundancy Elimination for MLton.
27033 * <:Stabilizers:>, a modular checkpointing abstraction for concurrent functional programs.
27034 * http://ttic.uchicago.edu/%7Epl/sa-sml/[Self-Adjusting SML], self-adjusting computation, a model of computing where programs can automatically adjust to changes to their data.
27035 * http://faculty.ist.unomaha.edu/winter/ShiftLab/TL_web/TL_index.html[TL System], providing general-purpose support for rewrite-based transformation over elements belonging to a (user-defined) domain language.
27036 * http://projects.laas.fr/tina/[Tina] (Time Petri net Analyzer)
27037 * http://www.twelf.org/[Twelf] an implementation of the LF logical framework.
27038 * http://www.cs.indiana.edu/%7Errnewton/wavescope/[WaveScript/WaveScope], a sensor network project; the WaveScript compiler can generate SML (MLton) code.
27042 * http://www.eecs.harvard.edu/%7Enr/cs152/[Harvard CS-152], undergraduate programming languages.
27043 * http://www.ia-stud.hiof.no/%7Erolando/PL/[Høgskolen i Østfold IAI30202], programming languages.
27047 :mlton-guide-page: Utilities
27052 This page is a collection of basic utilities used in the examples on
27055 * <:InfixingOperators:>, and
27058 for longer discussions on some of these utilities.
27062 (* Operator precedence table *)
27063 infix 8 * / div mod (* +1 from Basis Library *)
27064 infix 7 + - ^ (* +1 from Basis Library *)
27065 infixr 6 :: @ (* +1 from Basis Library *)
27066 infix 5 = <> > >= < <= (* +1 from Basis Library *)
27072 infix 1 := (* -2 from Basis Library *)
27075 (* Some basic combinators *)
27077 fun cross (f, g) (x, y) = (f x, g y)
27078 fun curry f x y = f (x, y)
27079 fun fail e _ = raise e
27083 datatype ('a, 'b) product = & of 'a * 'b
27086 datatype ('a, 'b) sum = INL of 'a | INR of 'b
27088 (* Some type shorthands *)
27089 type 'a uop = 'a -> 'a
27090 type 'a fix = 'a uop -> 'a
27091 type 'a thunk = unit -> 'a
27092 type 'a effect = 'a -> unit
27093 type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a)
27095 (* Infixing, sectioning, and application operators *)
27096 fun x <\ f = fn y => f (x, y)
27098 fun f /> y = fn x => f (x, y)
27101 (* Piping operators *)
27108 :mlton-guide-page: ValueRestriction
27109 [[ValueRestriction]]
27113 The value restriction is a rule that governs when type inference is
27114 allowed to polymorphically generalize a value declaration. In short,
27115 the value restriction says that generalization can only occur if the
27116 right-hand side of the declaration is syntactically a value. For
27122 val _ = (f "foo"; f 13)
27125 the expression `fn x => x` is syntactically a value, so `f` has
27126 polymorphic type `'a -> 'a` and both calls to `f` type check. On the
27131 val f = let in fn x => x end
27132 val _ = (f "foo"; f 13)
27135 the expression `let in fn x => x end` is not syntactically a value
27136 and so `f` can either have type `int -> int` or `string -> string`,
27137 but not `'a -> 'a`. Hence, the program does not type check.
27139 <:DefinitionOfStandardML:The Definition of Standard ML> spells out
27140 precisely which expressions are syntactic values (it refers to such
27141 expressions as _non-expansive_). An expression is a value if it is of
27142 one of the following forms.
27144 * a constant (`13`, `"foo"`, `13.0`, ...)
27145 * a variable (`x`, `y`, ...)
27146 * a function (`fn x => e`)
27147 * the application of a constructor other than `ref` to a value (`Foo v`)
27148 * a type constrained value (`v: t`)
27149 * a tuple in which each field is a value `(v1, v2, ...)`
27150 * a record in which each field is a value `{l1 = v1, l2 = v2, ...}`
27151 * a list in which each element is a value `[v1, v2, ...]`
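
The forms above can be sketched as follows. The datatype constructor `Foo` and the names `a` through `e` and `g` are hypothetical; each right-hand side is non-expansive, so each declaration is generalized and can be used at more than one type.

[source,sml]
----
datatype 'a t = Foo of 'a

val a = fn x => x                      (* a function *)
val b = Foo (fn x => x)                (* constructor applied to a value *)
val c = (fn x => x, nil)               (* tuple of values *)
val d = {id = fn x => x, empty = nil}  (* record of values *)
val e = [fn x => x]                    (* list of values *)

(* A variable is a value, so g is generalized as well. *)
val Foo g = b

(* Each polymorphic binding is used at both int and string. *)
val _ = (a 1, a "one")
val _ = (g 1, g "one")
val _ = (#1 c 1, #1 c "one")
val _ = (#id d 1, #id d "one")
val _ = (hd e 1, hd e "one")
----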
27154 == Why the value restriction exists ==
27156 The value restriction prevents a ref cell (or an array) from holding
27157 values of different types, which would allow a value of one type to be
27158 cast to another and hence would break type safety. If the restriction
27159 were not in place, the following program would type check.
27163 val r: 'a option ref = ref NONE
27164 val r1: string option ref = r
27165 val r2: int option ref = r
27166 val () = r1 := SOME "foo"
27167 val v: int = valOf (!r2)
27170 The first line violates the value restriction because `ref NONE` is
27171 not a value. All other lines are type correct. By its last line, the
27172 program has cast the string `"foo"` to an integer. This breaks type
27173 safety, because now we can add a string to an integer with an
27174 expression like `v + 13`. We could even be more devious, by adding
27175 the following two lines, which allow us to treat the string `"foo"`
27180 val r3: (int -> int) option ref = r
27181 val v: int -> int = valOf (!r3)
27184 Eliminating the explicit `ref` does nothing to fix the problem. For
27185 example, we could replace the declaration of `r` with the following.
27189 val f: unit -> 'a option ref = fn () => ref NONE
27190 val r: 'a option ref = f ()
27193 The declaration of `f` is well typed, while the declaration of `r`
27194 violates the value restriction because `f ()` is not a value.
27197 == Unnecessarily rejected programs ==
27199 Unfortunately, the value restriction rejects some programs that could
27204 val id: 'a -> 'a = fn x => x
27205 val f: 'a -> 'a = id id
27208 The type constraint on `f` requires `f` to be polymorphic, which is
27209 disallowed because `id id` is not a value. MLton reports the
27210 following type error.
27213 Error: z.sml 2.5-2.5.
27214 Type of variable cannot be generalized in expansive declaration: f.
27216 in: val 'a f: ('a -> 'a) = id id
27219 MLton indicates the inability to make `f` polymorphic by saying that
27220 the type of `f` cannot be generalized (made polymorphic) its
27221 declaration is expansive (not a value). MLton doesn't explicitly
27222 mention the value restriction, but that is the reason. If we leave
27223 the type constraint off of `f`
27227 val id: 'a -> 'a = fn x => x
27231 then the program succeeds; however, MLton gives us the following
27235 Warning: z.sml 2.5-2.5.
27236 Type of variable was not inferred and could not be generalized: f.
27241 This warning indicates that MLton couldn't polymorphically generalize
27242 `f`, nor was there enough context using `f` to determine its type.
27243 This in itself is not a type error, but it is a hint that something
27244 is wrong with our program. Using `f` provides enough context to
27245 eliminate the warning.
27249 val id: 'a -> 'a = fn x => x
27254 But attempting to use `f` as a polymorphic function will fail.
27258 val id: 'a -> 'a = fn x => x
27265 Error: z.sml 4.9-4.15.
27266 Function applied to incorrect argument.
27273 == Alternatives to the value restriction ==
27275 There would be nothing wrong with treating `f` as polymorphic in
27279 val id: 'a -> 'a = fn x => x
27283 One might think that the value restriction could be relaxed, and that
27284 only types involving `ref` should be disallowed. Unfortunately, the
27285 following example shows that even the type `'a -> 'a` can cause
27286 problems. If this program were allowed, then we could cast an integer
27287 to a string (or any other type).
27293 val r: 'a option ref = ref NONE
27298 val () = r := SOME x
27309 The previous version of Standard ML took a different approach
27310 (<!Cite(MilnerEtAl90)>, <!Cite(Tofte90)>, <:ImperativeTypeVariable:>)
27311 than the value restriction. It encoded information in the type system
27312 about when ref cells would be created, and used this to prevent a ref
27313 cell from holding multiple types. Although it allowed more programs
27314 to be type checked, this approach had significant drawbacks. First,
27315 it was significantly more complex, both for implementers and for
27316 programmers. Second, it had an unfortunate interaction with
27317 modularity, because information about ref usage was exposed in module
27318 signatures. This either prevented the use of references for
27319 implementing a signature, or required information that one would like
27320 to keep hidden to propagate across modules.
27322 In the early nineties, Andrew Wright studied about 250,000 lines of
27323 existing SML code and discovered that it did not make significant use
27324 of the extended typing ability, and proposed the value restriction as
27325 a simpler alternative (<!Cite(Wright95)>). This was adopted in the
27326 revised <:DefinitionOfStandardML:Definition>.
27329 == Working with the value restriction ==
27331 One technique that works with the value restriction is
27332 <:EtaExpansion:>. We can use eta expansion to make our `id id`
27333 example type check as follows.
27337 val id: 'a -> 'a = fn x => x
27338 val f: 'a -> 'a = fn z => (id id) z
27341 This solution means that the computation (in this case `id id`) will
27342 be performed each time `f` is applied, instead of just once when `f`
27343 is declared. In this case, that is not a problem, but it could be if
27344 the declaration of `f` performs substantial computation or creates a
27345 shared data structure.
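
To make the repeated evaluation visible, here is a small sketch in which the eta-expanded computation has a side effect. The names `calls` and `mk` are hypothetical; `mk ()` stands in for any expansive expression that produces a function.

[source,sml]
----
val calls = ref 0

(* Counts how often it runs, then returns the identity function. *)
fun mk () = (calls := !calls + 1; fn x => x)

(* Eta expansion makes the right-hand side a value, so f is
   polymorphic, but mk () now runs at every application of f. *)
val f: 'a -> 'a = fn z => mk () z

val _ = f 1
val _ = f "two"
val n = !calls   (* mk ran once per application of f: 2 *)
----

Without the eta expansion, `mk ()` would run once, but `f` could not be generalized.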
27347 Another technique that sometimes works is to move a monomorphic
27348 computation prior to a (would-be) polymorphic declaration so that the
27349 expression is a value. Consider the following program, which fails
27350 due to the value restriction.
27354 datatype 'a t = A of string | B of 'a
27355 val x: 'a t = A (if true then "yes" else "no")
27358 It is easy to rewrite this program as
27362 datatype 'a t = A of string | B of 'a
27364 val s = if true then "yes" else "no"
27370 The following example (taken from <!Cite(Wright95)>) creates a ref
27371 cell to count the number of times a function is called.
27375 val count: ('a -> 'a) -> ('a -> 'a) * (unit -> int) =
27380 (fn x => (r := 1 + !r; f x), fn () => !r)
27382 val id: 'a -> 'a = fn x => x
27383 val (countId: 'a -> 'a, numCalls) = count id
27386 The example does not type check, due to the value restriction.
27387 However, it is easy to rewrite the program, staging the ref cell
27388 creation before the polymorphic code.
27392 datatype t = T of int ref
27393 val count1: unit -> t = fn () => T (ref 0)
27394 val count2: t * ('a -> 'a) -> (unit -> int) * ('a -> 'a) =
27395 fn (T r, f) => (fn () => !r, fn x => (r := 1 + !r; f x))
27396 val id: 'a -> 'a = fn x => x
27398 val countId: 'a -> 'a = fn z => #2 (count2 (t, id)) z
27399 val numCalls = #1 (count2 (t, id))
27402 Of course, one can hide the constructor `T` inside a `local` or behind
27408 * <:ImperativeTypeVariable:>
27412 :mlton-guide-page: VariableArityPolymorphism
27413 [[VariableArityPolymorphism]]
27414 VariableArityPolymorphism
27415 =========================
27417 <:StandardML:Standard ML> programmers often face the problem of how to
27418 provide a variable-arity polymorphic function. For example, suppose
27419 one is defining a combinator library, e.g. for parsing or pickling.
27420 The signature for such a library might look something like the
27425 signature COMBINATOR =
27431 val string: string t
27433 val tuple2: 'a1 t * 'a2 t -> ('a1 * 'a2) t
27434 val tuple3: 'a1 t * 'a2 t * 'a3 t -> ('a1 * 'a2 * 'a3) t
27435 val tuple4: 'a1 t * 'a2 t * 'a3 t * 'a4 t
27436 -> ('a1 * 'a2 * 'a3 * 'a4) t
27441 The question is how to define a variable-arity tuple combinator.
27442 Traditionally, the only way to take a variable number of arguments in
27443 SML is to put the arguments in a list (or vector) and pass that. So,
27444 one might define a tuple combinator with the following signature.
27447 val tupleN: 'a list -> 'a list t
27450 The problem with this approach is that as soon as one places values in
27451 a list, they must all have the same type. So, programmers often take
27452 an alternative approach, and define a family of `tuple<N>` functions,
27453 as we see in the `COMBINATOR` signature above.
27455 The family-of-functions approach is ugly for many reasons. First, it
27456 clutters the signature with a number of functions when there should
27457 really only be one. Second, it is _closed_, in that there are a fixed
27458 number of tuple combinators in the interface, and should a client need
27459 a combinator for a large tuple, he is out of luck. Third, this
27460 approach often requires a lot of duplicate code in the implementation
27461 of the combinators.
27463 Fortunately, using <:Fold01N:> and <:ProductType:products>, one can
27464 provide an interface and implementation that solves all these
27465 problems. Here is a simple pickling module that converts values to
27469 structure Pickler =
27471 type 'a t = 'a -> string
27473 val unit = fn () => ""
27475 val int = Int.toString
27477 val real = Real.toString
27481 type 'a accum = 'a * string list -> string list
27486 {finish = fn ps => fn x => concat (rev (ps (x, []))),
27487 start = fn p => fn (x, l) => p x :: l,
27494 {combine = (fn (p, p') => fn (x & x', l) => p' x' :: "," :: p (x, l))}
27499 If one has `n` picklers of types
27502 val p1: a1 Pickler.t
27503 val p2: a2 Pickler.t
27505 val pn: an Pickler.t
27507 then one can construct a pickler for n-ary products as follows.
27510 tuple `p1 `p2 ... `pn $ : (a1 & a2 & ... & an) Pickler.t
27513 For example, with `Pickler` in scope, one can prove the following
27518 "1" = tuple `int $ 1
27519 "1,2.0" = tuple `int `real $ (1 & 2.0)
27520 "1,2.0,three" = tuple `int `real `string $ (1 & 2.0 & "three")
27523 Here is the signature for `Pickler`. It shows why the `accum` type is
27527 signature PICKLER =
27533 val string: string t
27537 val ` : ('a accum, 'b t, ('a, 'b) prod accum,
27538 'z1, 'z2, 'z3, 'z4, 'z5, 'z6, 'z7) Fold01N.step1
27539 val tuple: ('a t, 'a accum, 'b accum, 'b t, unit t,
27540 'z1, 'z2, 'z3, 'z4, 'z5) Fold01N.t
27543 structure Pickler: PICKLER = Pickler
27548 :mlton-guide-page: Variant
27553 A _variant_ is an arm of a datatype declaration. For example, the
27558 datatype t = A | B of int | C of real
27561 has three variants: `A`, `B`, and `C`.
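
As a brief illustration, a `case` expression typically has one rule per variant. The function `toString` below is a hypothetical example over the datatype above.

[source,sml]
----
datatype t = A | B of int | C of real

(* One rule for each of the three variants. *)
fun toString (v: t): string =
   case v of
      A => "A"
    | B i => "B " ^ Int.toString i
    | C r => "C " ^ Real.toString r
----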
27565 :mlton-guide-page: VesaKarvonen
27570 Vesa Karvonen is a student at the http://www.cs.helsinki.fi/index.en.html[University of Helsinki].
27571 His interests lie in programming techniques that allow complex programs to be expressed
27572 clearly and concisely, and in the design and implementation of programming languages.
27574 image::VesaKarvonen.attachments/vesa-in-mlton-t-shirt.jpg[align="center"]
27576 Things he'd like to see for SML and hopes to be able to contribute towards:
27578 * A practical tool for documenting libraries. Preferably one that is
27579 based on extracting the documentation from source code comments.
27581 * A good IDE. Possibly an enhanced SML mode (`esml-mode`) for Emacs.
27582 Google for http://www.google.com/search?&q=SLIME+video[SLIME video] to
27583 get an idea of what he'd like to see. Some specific notes:
27586 * show type at point
27587 * robust, consistent indentation
27588 * show documentation
27589 * jump to definition (see <:EmacsDefUseMode:>)
27592 <:EmacsBgBuildMode:> has also been written for working with MLton.
27594 * Documented and cataloged libraries. Perhaps something like
27595 http://www.boost.org[Boost], but for SML libraries. Here is a partial
27596 list of libraries, tools, and frameworks Vesa is or has been working
27600 * Asynchronous Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/README)>)
27601 * Extended Basis Library (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>)
27602 * Generic Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>)
27603 * Pretty Printing Library (<!ViewGitFile(mltonlib,master,com/ssh/prettier/unstable/README)>)
27604 * Random Generator Library (<!ViewGitFile(mltonlib,master,com/ssh/random/unstable/README)>)
27605 * RPC (Remote Procedure Call) Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/rpc-lib/unstable/README)>)
27606 * http://www.libsdl.org/[SDL] Binding (<!ViewGitFile(mltonlib,master,org/mlton/vesak/sdl/unstable/README)>)
27607 * Unit Testing Library (<!ViewGitFile(mltonlib,master,com/ssh/unit-test/unstable/README)>)
27608 * Use Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/use-lib/unstable/README)>)
27609 * Windows Library (<!ViewGitFile(mltonlib,master,com/ssh/windows/unstable/README)>)
27611 Note that most of these libraries have been ported to several <:StandardMLImplementations:SML implementations>.
27615 :mlton-guide-page: WarnUnusedAnomalies
27616 [[WarnUnusedAnomalies]]
27617 WarnUnusedAnomalies
27618 ===================
27620 The `warnUnused` <:MLBasisAnnotations:MLBasis annotation> can be used
27621 to report unused identifiers. This can be useful for catching bugs
27622 and for code maintenance (e.g., eliminating dead code). However, the
27623 `warnUnused` annotation can sometimes behave in counter-intuitive
27624 ways. This page gives some of the anomalies that have been reported.
27626 * Functions whose only uses are recursive uses within their bodies are
27627 not warned as unused:
27632 fun foo () = foo () : unit
27633 val bar = let fun baz () = baz () : unit in baz end
27639 Warning: z.sml 3.5.
27640 Unused variable: bar.
27643 * Components of an actual functor argument that are necessary to match
27644 the functor argument signature but are unused in the body of the
27645 functor are warned as unused:
27649 functor Warning (type t val x : t) = struct
27652 structure X = Warning (type t = int val x = 1)
27656 Warning: z.sml 4.29.
27661 * No component of a functor result is warned as unused. In the
27662 following, the only uses of `f2` are to match the functor argument
27663 signatures of `functor G` and `functor H` and there are no uses of
27668 functor F(structure X : sig type t end) = struct
27670 fun f1 (_ : X.t) = ()
27671 fun f2 (_ : X.t) = ()
27674 functor G(structure Y : sig
27680 fun g (x : Y.t) = Y.f1 x
27682 functor H(structure Y : sig
27688 fun h (x : Y.t) = Y.f1 x
27690 functor Z() = struct
27691 structure S = F(structure X = struct type t = unit end)
27692 structure SG = G(structure Y = S)
27693 structure SH = H(structure Y = S)
27705 :mlton-guide-page: WesleyTerpstra
27710 Wesley W. Terpstra is a PhD student at the Technische Universität Darmstadt (Germany).
27714 * Distributed systems (P2P)
27715 * Number theory (Error-correcting codes)
27717 My interest in SML is centered on the fact that the language is able to directly express ideas from number theory which are important for my work. Modules and Functors seem to be a very natural basis for implementing many algebraic structures. MLton provides an ideal platform for actual implementation as it is fast and has unboxed words.
27719 Things I would like from MLton in the future:
27721 * Some better optimization of mathematical expressions
27722 * IPv6 and multicast support
27723 * A complete GUI toolkit like mGTK
27724 * More supported platforms so that applications written under MLton have a wider audience
27728 :mlton-guide-page: WholeProgramOptimization
27729 [[WholeProgramOptimization]]
27730 WholeProgramOptimization
27731 ========================
27733 Whole-program optimization is a compilation technique in which
27734 optimizations operate over the entire program. This allows the
27735 compiler many optimization opportunities that are not available when
27736 analyzing modules separately (as with separate compilation).
27738 Most of MLton's optimizations are whole-program optimizations.
27739 Because MLton compiles the whole program at once, it can perform
27740 optimization across module boundaries. As a consequence, MLton often
27741 reduces or eliminates the run-time penalty that arises with separate
27742 compilation of SML features such as functors, modules, polymorphism,
27743 and higher-order functions. MLton takes advantage of having the
27744 entire program to perform transformations such as: defunctorization,
27745 monomorphisation, higher-order control-flow analysis, inlining,
27746 unboxing, argument flattening, redundant-argument removal, constant
27747 folding, and representation selection. Whole-program compilation is
27748 an integral part of the design of MLton and is not likely to change.
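
The following sketch shows the kind of source program these transformations target. The names `Counter`, `ByTwo`, and `twice` are hypothetical, and the comments describe what a whole-program compiler could conceptually do. They are not a claim about the exact code MLton emits.

[source,sml]
----
(* A functor and a polymorphic, higher-order function: features that
 * can carry a run-time cost under separate compilation. *)
functor Counter (val step: int) =
struct
   fun next (n: int): int = n + step
end

structure ByTwo = Counter (val step = 2)

fun twice (f: 'a -> 'a) (x: 'a): 'a = f (f x)

(* With the whole program available, a compiler can, conceptually,
 * expand the functor application, specialize `twice` to type int,
 * inline ByTwo.next, and fold the constants. *)
val four = twice ByTwo.next 0
----

Because every use of `Counter` and `twice` is visible at compile time, none of these abstractions needs a run-time representation of its own.
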
27752 :mlton-guide-page: WishList
27757 This page is mainly for recording recurring feature requests. If you
27758 have a new feature request, you probably want to query interest on one
27759 of the <:Contact:mailing lists> first.
27761 Please be aware of MLton's policy on
27762 <:LanguageChanges:language changes>. Nonetheless, we hope to provide
27763 support for some of the "immediate" <:SuccessorML:> proposals in a
27767 == Support for link options in ML Basis files ==
27769 Introduce a mechanism to specify link options in <:MLBasis:ML Basis>
27770 files. For example, generalizing a bit, an ML Basis declaration of the
27777 could be introduced whose semantics would be the same (as closely as
27778 possible) as if the option string were specified on the compiler
27781 The main motivation for this is that a MLton library that would
27782 introduce bindings (through <:ForeignFunctionInterface:FFI>) to an
27783 external library could be packaged conveniently as a single MLB file.
27784 For example, to link with library `foo` the MLB file would simply
27788 option "-link-opt -lfoo"
27791 Similar feature requests have been discussed previously on the mailing lists:
27793 * http://www.mlton.org/pipermail/mlton/2004-July/025553.html
27794 * http://www.mlton.org/pipermail/mlton/2005-January/026648.html
27798 :mlton-guide-page: XML
27803 <:XML:> is an <:IntermediateLanguage:>, translated from <:CoreML:> by
27804 <:Defunctorize:>, optimized by <:XMLSimplify:>, and translated by
27805 <:Monomorphise:> to <:SXML:>.
27809 <:XML:> is polymorphic, higher-order, with flat patterns. Every
27810 <:XML:> expression is annotated with its type. Polymorphic
27811 generalization is made explicit through type variables annotating
27812 `val` and `fun` declarations. Polymorphic instantiation is made
27813 explicit by specifying type arguments at variable references. <:XML:>
27814 patterns cannot be nested and cannot contain wildcards, constraints,
27815 flexible records, or layering.
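
To make "flat patterns" concrete, here is a sketch written in ordinary source-level SML rather than in XML itself. The function names are hypothetical. The second function is a hand-written flat equivalent of the first and is not actual compiler output.

[source,sml]
----
(* Source-level SML with a nested pattern. *)
fun sumNested (p: (int * int) option): int =
   case p of
      SOME (a, b) => a + b
    | NONE => 0

(* A flat equivalent: each pattern is a constructor with at most one
 * argument variable, and the tuple is taken apart separately. *)
fun sumFlat (p: (int * int) option): int =
   case p of
      NONE => 0
    | SOME t =>
         let
            val a = #1 t
            val b = #2 t
         in
            a + b
         end
----
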
27817 == Implementation ==
27819 * <!ViewGitFile(mlton,master,mlton/xml/xml.sig)>
27820 * <!ViewGitFile(mlton,master,mlton/xml/xml.fun)>
27821 * <!ViewGitFile(mlton,master,mlton/xml/xml-tree.sig)>
27822 * <!ViewGitFile(mlton,master,mlton/xml/xml-tree.fun)>
27824 == Type Checking ==
27826 <:XML:> also has a type checker, used for debugging. At present, the
27827 type checker is also the best specification of the type system of
27828 <:XML:>. If you need more details, the type checker
27829 (<!ViewGitFile(mlton,master,mlton/xml/type-check.sig)>,
27830 <!ViewGitFile(mlton,master,mlton/xml/type-check.fun)>) is pretty short.
27832 Since the type checker does not affect the output of the compiler
27833 (unless it reports an error), it can be turned off. The type checker
27834 recursively descends the program, checking that the type annotating
27835 each node is the same as the type synthesized from the types of the
27836 expression's subnodes.
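
The recursive descent described here can be sketched on a toy language. Everything in this example (the `ty` and `exp` datatypes, `annotation`, `check`, and `TypeError`) is hypothetical and far smaller than the real checker in `type-check.fun`. It only illustrates comparing each node's annotation with the type synthesized from its subnodes.

[source,sml]
----
datatype ty = Int | Bool

(* Every node carries the type it is annotated with. *)
datatype exp =
   IntConst of int * ty
 | BoolConst of bool * ty
 | If of exp * exp * exp * ty

exception TypeError

fun annotation (e: exp): ty =
   case e of
      IntConst (_, t) => t
    | BoolConst (_, t) => t
    | If (_, _, _, t) => t

(* Check the subnodes first, synthesize a type from their types, and
 * require that it equals the annotation on the node itself. *)
fun check (e: exp): ty =
   let
      val synthesized =
         case e of
            IntConst _ => Int
          | BoolConst _ => Bool
          | If (c, a, b, _) =>
               let
                  val tc = check c
                  val ta = check a
                  val tb = check b
               in
                  if tc = Bool andalso ta = tb
                     then ta
                  else raise TypeError
               end
   in
      if synthesized = annotation e
         then synthesized
      else raise TypeError
   end
----

A well-annotated tree is returned with its type, and a mismatched annotation such as `IntConst (1, Bool)` raises `TypeError`.
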
27838 == Details and Notes ==
27840 <:XML:> uses the same atoms as <:CoreML:>, hence all identifiers
27841 (constructors, variables, etc.) are unique and can have properties
27842 attached to them. Finally, <:XML:> has a simplifier (<:XMLShrink:>),
27843 which implements a reduction system.
27847 <:XML:> types are either type variables or applications of n-ary type
27848 constructors. There are many utility functions for constructing and
27849 destructing types involving built-in type constructors.
27851 A type scheme binds a list of type variables in a type. The only
27852 interesting operation on type schemes is the application of a type
27853 scheme to a list of types, which performs a simultaneous substitution
27854 of the type arguments for the bound type variables of the scheme. For
27855 the purposes of type checking, it is necessary to know the type scheme
27856 of variables, constructors, and primitives. This is done by
27857 associating the scheme with the identifier using its property list.
27858 This approach is used instead of the more traditional environment
27859 approach for reasons of speed.
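
A minimal sketch of type-scheme application follows. The representation (`Var`, `Con`, the `scheme` record, `apply`, and `Arity`) is hypothetical and much simpler than the one in the compiler. In particular, it keeps schemes as plain values and does not model property lists.

[source,sml]
----
(* Types are type variables or applications of n-ary type constructors. *)
datatype ty =
   Var of string
 | Con of string * ty list

(* A scheme binds a list of type variables in a type. *)
type scheme = {tyvars: string list, body: ty}

exception Arity

(* Apply a scheme to a list of types: substitute all arguments for the
 * bound variables at once.  Substituted types are not traversed
 * again, which is what makes the substitution simultaneous. *)
fun apply ({tyvars, body}: scheme, args: ty list): ty =
   let
      val pairs =
         ListPair.zipEq (tyvars, args)
         handle ListPair.UnequalLengths => raise Arity
      fun subst t =
         case t of
            Var a =>
               (case List.find (fn (b, _) => a = b) pairs of
                   SOME (_, t') => t'
                 | NONE => t)
          | Con (c, ts) => Con (c, List.map subst ts)
   in
      subst body
   end

(* The scheme "forall a. a list" instantiated at int. *)
val listScheme: scheme = {tyvars = ["a"], body = Con ("list", [Var "a"])}
val intList = apply (listScheme, [Con ("int", [])])
----
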
27863 Before defining `XML`, the signature for language <:XML:>, we need to
27864 define an auxiliary signature `XML_TREE`, that contains the datatype
27865 declarations for the expression trees of <:XML:>. This is done solely
27866 for the purpose of modularity -- it allows the simplifier and type
27867 checker to be defined by separate functors (which take a structure
27868 matching `XML_TREE`). Then, `Xml` is defined as the signature for a
27869 module containing the expression trees, the simplifier, and the type
27872 Both constructors and variables can have type schemes, hence both
27873 constructor and variable references specify the instance of the scheme
27874 at the point of reference. An instance is specified with a vector of
27875 types, which corresponds to the type variables in the scheme.
27877 <:XML:> patterns are flat (i.e., not nested). A pattern is a
27878 constructor with an optional argument variable. Patterns only occur
27879 in `case` expressions. To evaluate a case expression, compare the
27880 test value sequentially against each pattern. For the first pattern
27881 that matches, destruct the value if necessary to bind the pattern
27882 variables and evaluate the corresponding expression. If no pattern
27883 matches, evaluate the default. All patterns of a case statement are
27884 of the same variant of `Pat.t`, although this is not enforced by ML's
27885 type system. The type checker, however, does enforce this. Because
27886 tuple patterns are irrefutable, there will only ever be one tuple
27887 pattern in a case expression and there will be no default.
27889 <:XML:> contains value, exception, and mutually recursive function
27890 declarations. There are no free type variables in <:XML:>. All type
27891 variables are explicitly bound at either a value or function
27892 declaration. At some point in the future, exception declarations may
27893 go away, and exceptions may be represented with a single datatype
27894 containing a `unit ref` component to implement genericity.
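
One possible reading of this remark can be sketched as follows. The names `exnRep`, `declare`, and `same` are hypothetical, and this is not how MLton currently represents exceptions. The idea is that each exception declaration allocates a fresh `unit ref`, and two exception values match only if they share that cell.

[source,sml]
----
(* A single datatype standing in for all exceptions.  The `id` cell
 * gives each declaration its own identity. *)
datatype exnRep = Exn of {id: unit ref, name: string}

(* Each call models evaluating one exception declaration: it
 * allocates a fresh ref cell. *)
fun declare (name: string): exnRep = Exn {id = ref (), name = name}

(* Matching compares cell identity, not names. *)
fun same (Exn {id = a, ...}, Exn {id = b, ...}): bool = (a = b)

val e1 = declare "E"
val e2 = declare "E"

val sameDeclaration = same (e1, e1)       (* true *)
val differentDeclaration = same (e1, e2)  (* false *)
----

Equality on `ref` values is identity, so two declarations with the same name remain distinct.
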
27896 <:XML:> expressions are like those of <:CoreML:>, with the following
27897 exceptions. There are no record expressions. After type inference,
27898 all records (some of which may have originally been tuples in the
27899 source) are converted to tuples, because once flexible record patterns
27900 have been resolved, tuple labels are superfluous. Tuple components
27901 are ordered based on the field ordering relation. <:XML:> eta expands
27902 primitives and constructors so that they are always fully applied.
27903 Hence, the only kind of value of arrow type is a lambda. This
27904 property is useful for flow analysis and later in code generation.
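
Eta expansion of a constructor can be illustrated at the source level. This is a sketch in ordinary SML, not XML output, and the binding names are hypothetical.

[source,sml]
----
(* Source: the constructor SOME is passed as a first-class value. *)
val wrapped = List.map SOME [1, 2, 3]

(* Eta-expanded equivalent: SOME is always applied to an argument,
 * and the only value of arrow type is the lambda. *)
val wrapped' = List.map (fn x => SOME x) [1, 2, 3]
----

Both bindings evaluate to the same list. The second form is the one in which every constructor occurrence is fully applied.
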
27906 An <:XML:> program is a list of toplevel datatype declarations and a
27907 body expression. Because datatype declarations are not generative,
27908 the defunctorizer can safely move them to toplevel.
27912 :mlton-guide-page: XMLShrink
27917 XMLShrink is an optimization pass for the <:XML:>
27918 <:IntermediateLanguage:>, invoked from <:XMLSimplify:>.
27922 This pass performs optimizations based on a reduction system.
27924 == Implementation ==
27926 * <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)>
27927 * <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)>
27929 == Details and Notes ==
27931 The simplifier is based on <!Cite(AppelJim97, Shrinking Lambda
27932 Expressions in Linear Time)>.
27934 The source program may contain functions that are only called once, or
27935 not even called at all. Match compilation introduces many such
27936 functions. In order to reduce the program size, speed up later
27937 phases, and improve the flow analysis, a source to source simplifier
27938 is run on <:XML:> after type inference and match compilation.
27940 The simplifier implements the reductions shown below. The reductions
27941 eliminate unnecessary declarations (see the side constraint in the
27942 figure), applications where the function is immediate, and case
27943 statements where the test is immediate. Declarations can be
27944 eliminated only when the expression is nonexpansive (see Section 4.7
27945 of the <:DefinitionOfStandardML: Definition>), which is a syntactic
27946 condition that ensures that the expression has no effects
27947 (assignments, raises, or nontermination). The reductions on case
27948 statements do not show the other irrelevant cases that may exist. The
27949 reductions were chosen so that they were strongly normalizing and so
27950 that they never increased tree size.
27967 if `e1` is a constant or variable or if `e1` is nonexpansive and `x` occurs zero or one time in `e2`
28001 if `e1` is nonexpansive
28009 case let d in e end of p1 => e1 ...
28016 let d in case e of p1 => e1 ... end
28025 case C e1 of C x => e2
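
As a hedged illustration, two of these reductions can be written as before-and-after pairs in source-level SML. The binding names are hypothetical, and the real pass works on <:XML:> expression trees rather than on source text.

[source,sml]
----
(* Declaration elimination: the bound expression is a constant, so
 * its uses can be replaced and the declaration dropped. *)
val orig1 = let val x = 1 in x + x end
val reduced1 = 1 + 1

(* Case on an immediate constructor application: the matching rule is
 * selected at compile time and the argument is bound directly. *)
val orig2 = case SOME 3 of SOME x => x + 1 | NONE => 0
val reduced2 = let val x = 3 in x + 1 end
----

In both pairs the reduced form computes the same value as the original.
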
28038 :mlton-guide-page: XMLSimplify
28043 The optimization passes for the <:XML:> <:IntermediateLanguage:> are
28044 collected and controlled by the `XmlSimplify` functor
28045 (<!ViewGitFile(mlton,master,mlton/xml/xml-simplify.sig)>,
28046 <!ViewGitFile(mlton,master,mlton/xml/xml-simplify.fun)>).
28048 The following optimization passes are implemented:
28050 * <:XMLSimplifyTypes:>
28053 The optimization passes can be controlled from the command-line by the options
28055 * `-diag-pass <pass>` -- keep diagnostic info for pass
28056 * `-disable-pass <pass>` -- skip optimization pass (if normally performed)
28057 * `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
28058 * `-keep-pass <pass>` -- keep the results of pass
28059 * `-xml-passes <passes>` -- xml optimization passes
28063 :mlton-guide-page: XMLSimplifyTypes
28064 [[XMLSimplifyTypes]]
28068 <:XMLSimplifyTypes:> is an optimization pass for the <:XML:>
28069 <:IntermediateLanguage:>, invoked from <:XMLSimplify:>.
28073 This pass simplifies types in an <:XML:> program, eliminating all
28074 unused type arguments.
28076 == Implementation ==
28078 * <!ViewGitFile(mlton,master,mlton/xml/simplify-types.sig)>
28079 * <!ViewGitFile(mlton,master,mlton/xml/simplify-types.fun)>
28081 == Details and Notes ==
28083 It first computes a simple fixpoint on all the `datatype` declarations
28084 to determine which `datatype` `tycon` args are actually used. Then it
28085 does a single pass over the program to determine which polymorphic
28086 declaration type variables are used, and rewrites types to eliminate
28087 unused type arguments.
28089 This pass should eliminate any spurious duplication that the
28090 <:Monomorphise:> pass might perform due to phantom types.
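
A small source-level example of a phantom type shows the kind of duplication in question. The names `tagged`, `untag`, `asBool`, and `asString` are hypothetical.

[source,sml]
----
(* 'a is a phantom: it does not occur in the constructor's argument. *)
datatype 'a tagged = Tagged of int

fun untag (t: 'a tagged): int =
   case t of
      Tagged n => n

(* Two instances that differ only in the phantom argument. *)
val asBool: bool tagged = Tagged 1
val asString: string tagged = Tagged 2

val total = untag asBool + untag asString
----

If the unused argument `'a` were kept, monomorphisation could produce separate copies of `tagged` and `untag` for `bool` and for `string`. Once the argument is eliminated, a single copy suffices.
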
28094 :mlton-guide-page: Zone
28099 <:Zone:> is an optimization pass for the <:SSA2:>
28100 <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.
28104 This pass breaks large <:SSA2:> functions into zones, which are
28105 connected subgraphs of the dominator tree. For each zone, at the node
28106 that dominates the zone (the "zone root"), it places a tuple
28107 collecting all of the live variables at that node. It replaces any
28108 variables used in that zone with offsets from the tuple. The goal is
28109 to decrease the liveness information in large <:SSA:> functions.
28111 == Implementation ==
28113 * <!ViewGitFile(mlton,master,mlton/ssa/zone.fun)>
28115 == Details and Notes ==
28117 Compute strongly-connected components to avoid putting tuple constructions
28120 There are two (expert) flags that govern the use of this pass:
28122 * `-max-function-size <n>`
28123 * `-zone-cut-depth <n>`
28125 Zone splitting only applies when the number of basic blocks in a
28126 function is greater than the `n` given to `-max-function-size`. The
28127 `n` used to cut the dominator tree is set by `-zone-cut-depth`.
28129 There is currently no attempt to be safe-for-space. That is, the
28130 tuples are not restricted to containing only "small" values.
28132 In the `HOL` program, the particular problem is the main function,
28133 which has 161,783 blocks and 257,519 variables -- the product of those
28134 two numbers being about 41 billion. Now, we're not likely going to
28135 need that much space since we use a sparse representation. But even
28136 1/100th would really hurt. And of course this rules out bit vectors.
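
The arithmetic can be checked directly. This sketch uses `IntInf.int` because the product may not fit in a default-width `int`. The binding names are hypothetical, and the reading of the product as one entry per (block, variable) pair in a dense representation is an assumption.

[source,sml]
----
val blocks: IntInf.int = 161783
val variables: IntInf.int = 257519

(* One entry per (block, variable) pair in a dense representation. *)
val product = blocks * variables   (* 41662196377 *)

(* Even one hundredth of that is over 400 million entries. *)
val hundredth = product div 100    (* 416621963 *)
----
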
28140 :mlton-guide-page: ZZZOrphanedPages
28141 [[ZZZOrphanedPages]]
28145 The contents of these pages have been moved to other pages.
28147 These templates are used by other pages.
28149 * <:CompilerPassTemplate:>