1MLton Guide ({mlton-version})
2=============================
3:toc:
4:mlton-guide-page!:
5
6[abstract]
7--
8This is the guide for MLton, an open-source, whole-program, optimizing Standard ML compiler.
9
10This guide was generated automatically from the MLton website, available online at http://mlton.org. It is up to date for MLton {mlton-version}.
11--
12
13
14:leveloffset: 1
15
16:mlton-guide-page: Home
17[[Home]]
18MLton
19=====
20
21== What is MLton? ==
22
23MLton is an open-source, whole-program, optimizing
24<:StandardML:Standard ML> compiler.
25
26== What's new? ==
27
28* 20180207: Please try out our latest release, <:Release20180207:MLton 20180207>.
29
30* 20140730: http://www.cs.rit.edu/%7emtf[Matthew Fluet] and
31 http://www.cse.buffalo.edu/%7elziarek[Lukasz Ziarek] have been
32 awarded an http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=12810[NSF
33 CISE Research Infrastructure (CRI)] grant titled "Positioning MLton
34 for Next-Generation Programming Languages Research;" read the award
35 abstracts
36 (http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405770[Award{nbsp}#1405770]
37 and
38 http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405614[Award{nbsp}#1405614])
39 for more details.
40
41== Next steps ==
42
43* Read about MLton's <:Features:>.
44* Look at <:Documentation:>.
45* See some <:Users:> of MLton.
46* https://sourceforge.net/projects/mlton/files/mlton/20180207[Download] MLton.
47* Meet the MLton <:Developers:>.
48* Get involved with MLton <:Development:>.
49* User-maintained <:FAQ:>.
50* <:Contact:> us.
51
52<<<
53
54:mlton-guide-page: AdamGoode
55[[AdamGoode]]
56AdamGoode
57=========
58
59 * I maintain the Fedora package of MLton, in https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora].
60 * I have contributed some patches for Makefiles and PDF documentation building.
61
62<<<
63
64:mlton-guide-page: AdmitsEquality
65[[AdmitsEquality]]
66AdmitsEquality
67==============
68
69A <:TypeConstructor:> admits equality if whenever it is applied to
70equality types, the result is an <:EqualityType:>. This notion enables
71one to determine whether a type constructor application yields an
72equality type solely from the application, without looking at the
73definition of the type constructor. It helps to ensure that
74<:PolymorphicEquality:> is only applied to sensible values.
75
76The definition of admits equality depends on whether the type
77constructor was declared by a `type` definition or a
78`datatype` declaration.
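
As a small illustration of deciding equality from the application alone, the sketch below uses only the built-in `list` type constructor, which admits equality. The value names are chosen here for illustration.

[source,sml]
----
(* `list` admits equality and `int` is an equality type,
 * so `int list` is an equality type and `=` may be used. *)
val sameInts = ([1, 2, 3] = [1, 2, 3])

(* `real` is not an equality type, so `real list` is not one either;
 * uncommenting the next line yields a type error.
 * val sameReals = ([1.0] = [1.0])
 *)
----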
79
80
81== Type definitions ==
82
For a type definition
84
85[source,sml]
86----
87type ('a1, ..., 'an) t = ...
88----
89
90type constructor `t` admits equality if the right-hand side of the
91definition is an equality type after replacing `'a1`, ...,
92`'an` by equality types (it doesn't matter which equality types
93are chosen).
94
95For a nullary type definition, this amounts to the right-hand side
96being an equality type. For example, after the definition
97
98[source,sml]
99----
100type t = bool * int
101----
102
103type constructor `t` admits equality because `bool * int` is
104an equality type. On the other hand, after the definition
105
106[source,sml]
107----
108type t = bool * int * real
109----
110
111type constructor `t` does not admit equality, because `real`
112is not an equality type.
113
114For another example, after the definition
115
116[source,sml]
117----
118type 'a t = bool * 'a
119----
120
121type constructor `t` admits equality because `bool * int`
122is an equality type (we could have chosen any equality type other than
123`int`).
124
125On the other hand, after the definition
126
127[source,sml]
128----
129type 'a t = real * 'a
130----
131
type constructor `t` does not admit equality because
`real * int` is not an equality type.
134
135We can check that a type constructor admits equality using an
136`eqtype` specification.
137
138[source,sml]
139----
140structure Ok: sig eqtype 'a t end =
141 struct
142 type 'a t = bool * 'a
143 end
144----
145
146[source,sml]
147----
148structure Bad: sig eqtype 'a t end =
149 struct
150 type 'a t = real * int * 'a
151 end
152----
153
154On `structure Bad`, MLton reports the following error.
155----
156Error: z.sml 1.16-1.34.
157 Type in structure disagrees with signature (admits equality): t.
158 structure: type 'a t = [real] * _ * _
159 defn at: z.sml 3.15-3.15
160 signature: [eqtype] 'a t
161 spec at: z.sml 1.30-1.30
162----
163
164The `structure:` section provides an explanation of why the type
165did not admit equality, highlighting the problematic component
166(`real`).
167
168
169== Datatype declarations ==
170
171For a type constructor declared by a datatype declaration to admit
172equality, every <:Variant:variant> of the datatype must admit equality. For
173example, the following datatype admits equality because `bool` and
174`char * int` are equality types.
175
176[source,sml]
177----
178datatype t = A of bool | B of char * int
179----
180
181Nullary constructors trivially admit equality, so that the following
182datatype admits equality.
183
184[source,sml]
185----
186datatype t = A | B | C
187----
188
189For a parameterized datatype constructor to admit equality, we
190consider each <:Variant:variant> as a type definition, and require that the
191definition admit equality. For example, for the datatype
192
193[source,sml]
194----
195datatype 'a t = A of bool * 'a | B of 'a
196----
197
198the type definitions
199
200[source,sml]
201----
202type 'a tA = bool * 'a
203type 'a tB = 'a
204----
205
206both admit equality. Thus, type constructor `t` admits equality.
207
208On the other hand, the following datatype does not admit equality.
209
210[source,sml]
211----
212datatype 'a t = A of bool * 'a | B of real * 'a
213----
214
215As with type definitions, we can check using an `eqtype`
216specification.
217
218[source,sml]
219----
220structure Bad: sig eqtype 'a t end =
221 struct
222 datatype 'a t = A of bool * 'a | B of real * 'a
223 end
224----
225
226MLton reports the following error.
227
228----
229Error: z.sml 1.16-1.34.
230 Type in structure disagrees with signature (admits equality): t.
231 structure: datatype 'a t = B of [real] * _ | ...
232 defn at: z.sml 3.19-3.19
233 signature: [eqtype] 'a t
234 spec at: z.sml 1.30-1.30
235----
236
237MLton indicates the problematic constructor (`B`), as well as
238the problematic component of the constructor's argument.
239
240
241=== Recursive datatypes ===
242
243A recursive datatype like
244
245[source,sml]
246----
247datatype t = A | B of int * t
248----
249
250introduces a new problem, since in order to decide whether `t`
251admits equality, we need to know for the `B` <:Variant:variant> whether
252`t` admits equality. The <:DefinitionOfStandardML:Definition>
253answers this question by requiring a type constructor to admit
254equality if it is consistent to do so. So, in our above example, if
255we assume that `t` admits equality, then the <:Variant:variant>
256`B of int * t` admits equality. Then, since the `A` <:Variant:variant>
257trivially admits equality, so does the type constructor `t`.
258Thus, it was consistent to assume that `t` admits equality, and
259so, `t` does admit equality.
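
Because `t` admits equality, values of type `t` can be compared with `=`, as in this brief sketch (the value names are illustrative only).

[source,sml]
----
datatype t = A | B of int * t

val x = B (1, B (2, A))
val y = B (1, B (2, A))

(* Equality on a recursive datatype is structural. *)
val equal = (x = y)
val different = (x = B (1, A))
----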
260
261On the other hand, in the following declaration
262
263[source,sml]
264----
265datatype t = A | B of real * t
266----
267
even if we assume that `t` admits equality, the `B` <:Variant:variant>
does not admit equality, because `real` is not an equality type.
Hence, the type constructor `t` does not admit equality, and our
assumption was inconsistent. Therefore, `t` does not admit equality.
272
273The same kind of reasoning applies to mutually recursive datatypes as
274well. For example, the following defines both `t` and `u` to
275admit equality.
276
277[source,sml]
278----
279datatype t = A | B of u
280and u = C | D of t
281----
282
283But the following defines neither `t` nor `u` to admit
284equality.
285
286[source,sml]
287----
288datatype t = A | B of u * real
289and u = C | D of t
290----
291
292As always, we can check whether a type admits equality using an
293`eqtype` specification.
294
295[source,sml]
296----
297structure Bad: sig eqtype t eqtype u end =
298 struct
299 datatype t = A | B of u * real
300 and u = C | D of t
301 end
302----
303
304MLton reports the following error.
305
306----
307Error: z.sml 1.16-1.40.
308 Type in structure disagrees with signature (admits equality): t.
309 structure: datatype t = B of [_str.u] * [real] | ...
310 defn at: z.sml 3.16-3.16
311 signature: [eqtype] t
312 spec at: z.sml 1.27-1.27
313Error: z.sml 1.16-1.40.
314 Type in structure disagrees with signature (admits equality): u.
315 structure: datatype u = D of [_str.t] | ...
316 defn at: z.sml 4.11-4.11
317 signature: [eqtype] u
318 spec at: z.sml 1.36-1.36
319----
320
321<<<
322
323:mlton-guide-page: Alice
324[[Alice]]
325Alice
326=====
327
328http://www.ps.uni-saarland.de/alice[Alice ML] is an extension of SML with
329concurrency, dynamic typing, components, distribution, and constraint
330solving.
331
332<<<
333
334:mlton-guide-page: AllocateRegisters
335[[AllocateRegisters]]
336AllocateRegisters
337=================
338
339<:AllocateRegisters:> is an analysis pass for the <:RSSA:>
340<:IntermediateLanguage:>, invoked from <:ToMachine:>.
341
342== Description ==
343
344Computes an allocation of <:RSSA:> variables as <:Machine:> register
345or stack operands.
346
347== Implementation ==
348
349* <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.sig)>
350* <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.fun)>
351
352== Details and Notes ==
353
354{empty}
355
356<<<
357
358:mlton-guide-page: AndreiFormiga
359[[AndreiFormiga]]
360AndreiFormiga
361=============
362
363I'm a graduate student just back in academia. I study concurrent and parallel systems, with a great deal of interest in programming languages (theory, design, implementation). I happen to like functional languages.
364
365I use the nickname tautologico on #sml and my email is andrei DOT formiga AT gmail DOT com.
366
367<<<
368
369:mlton-guide-page: ArrayLiteral
370[[ArrayLiteral]]
371ArrayLiteral
372============
373
<:StandardML:Standard ML> does not have a syntax for array literals or
vector literals. The most direct way to write down an array is with an
expression like
[source,sml]
----
Array.fromList [w, x, y, z]
----
380
381No SML compiler produces efficient code for the above expression. The
382generated code allocates a list and then converts it to an array. To
383alleviate this, one could write down the same array using
384`Array.tabulate`, or even using `Array.array` and `Array.update`, but
385that is syntactically unwieldy.
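
For comparison, here is a sketch of the three spellings mentioned above, building the same four-element array. The element values and the binding names (`viaFromList`, `viaTabulate`, `viaUpdate`) are assumptions made for the example.

[source,sml]
----
val (w, x, y, z) = (1, 2, 3, 4)

(* Allocates an intermediate list, then copies it into an array. *)
val viaFromList = Array.fromList [w, x, y, z]

(* No intermediate list, but the elements are selected by index. *)
val viaTabulate = Array.tabulate (4, fn 0 => w | 1 => x | 2 => y | _ => z)

(* Allocate with a sample element, then overwrite the remaining slots. *)
val viaUpdate =
   let
      val a = Array.array (4, w)
      val () = Array.update (a, 1, x)
      val () = Array.update (a, 2, y)
      val () = Array.update (a, 3, z)
   in
      a
   end
----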
386
Fortunately, using <:Fold:>, it is possible to define constants `A`
and +&grave;+ so that one can write down an array like:
389[source,sml]
390----
391A `w `x `y `z $
392----
This is as syntactically concise as the `fromList` expression.
Furthermore, MLton, at least, will generate code as efficient as if
one had written down a use of `Array.array` followed by four uses of
`Array.update`.
397
398Along with `A` and +&grave;+, one can define a constant `V` that makes
399it possible to define vector literals with the same syntax, e.g.,
400[source,sml]
401----
402V `w `x `y `z $
403----
404
405Note that the same element indicator, +&grave;+, serves for both array
406and vector literals. Of course, the `$` is the end-of-arguments
407marker always used with <:Fold:>. The only difference between an
408array literal and vector literal is the `A` or `V` at the beginning.
409
410Here is the implementation of `A`, `V`, and +&grave;+. We place them
411in a structure and use signature abstraction to hide the type of the
412accumulator. See <:Fold:> for more on this technique.
[source,sml]
----
structure Literal:>
   sig
      type 'a z
      val A: ('a z, 'a z, 'a array, 'd) Fold.t
      val V: ('a z, 'a z, 'a vector, 'd) Fold.t
      val ` : ('a, 'a z, 'a z, 'b, 'c, 'd) Fold.step1
   end =
   struct
      type 'a z = int * 'a option * ('a array -> unit)

      val A =
         fn z =>
         Fold.fold
         ((0, NONE, ignore),
          fn (n, opt, fill) =>
          case opt of
             NONE =>
                Array.tabulate (0, fn _ => raise Fail "array0")
           | SOME x =>
                let
                   val a = Array.array (n, x)
                   val () = fill a
                in
                   a
                end)
         z

      val V = fn z => Fold.post (A, Array.vector) z

      val ` =
         fn z =>
         Fold.step1
         (fn (x, (i, opt, fill)) =>
          (i + 1,
           SOME x,
           fn a => (Array.update (a, i, x); fill a)))
         z
   end
----
454
455The idea of the code is for the fold to accumulate a count of the
456number of elements, a sample element, and a function that fills in all
457the elements. When the fold is complete, the finishing function
458allocates the array, applies the fill function, and returns the array.
459The only difference between `A` and `V` is at the very end; `A` just
460returns the array, while `V` converts it to a vector using
461post-composition, which is further described on the <:Fold:> page.
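
The accumulator idea can be tried in isolation, without <:Fold:>. The following is only a sketch under that simplification: the names `acc`, `empty`, `push`, and `finish` are hypothetical and are not part of the `Literal` structure.

[source,sml]
----
(* Count of elements, a sample element, and a function that fills the array. *)
type 'a acc = int * 'a option * ('a array -> unit)

val empty: 'a acc = (0, NONE, ignore)

(* Record one more element: bump the count, remember the element as the
 * sample, and extend the fill function to store it at index i. *)
fun push (x, (i, _, fill): 'a acc): 'a acc =
   (i + 1, SOME x, fn a => (Array.update (a, i, x); fill a))

(* Allocate the array from the count and sample, then fill it in. *)
fun finish ((n, opt, fill): 'a acc): 'a array =
   case opt of
      NONE => Array.fromList []
    | SOME x =>
         let
            val a = Array.array (n, x)
            val () = fill a
         in
            a
         end

val arr = finish (push (30, push (20, push (10, empty))))
----

Here the elements are pushed by explicit nested calls; in the `Literal` structure the same steps are driven by `Fold.step1`, which is what allows the juxtaposition syntax.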
462
463<<<
464
465:mlton-guide-page: AST
466[[AST]]
467AST
468===
469
470<:AST:> is the <:IntermediateLanguage:> produced by the <:FrontEnd:>
471and translated by <:Elaborate:> to <:CoreML:>.
472
473== Description ==
474
475The abstract syntax tree produced by the <:FrontEnd:>.
476
477== Implementation ==
478
479* <!ViewGitFile(mlton,master,mlton/ast/ast-programs.sig)>
480* <!ViewGitFile(mlton,master,mlton/ast/ast-programs.fun)>
481* <!ViewGitFile(mlton,master,mlton/ast/ast-modules.sig)>
482* <!ViewGitFile(mlton,master,mlton/ast/ast-modules.fun)>
483* <!ViewGitFile(mlton,master,mlton/ast/ast-core.sig)>
484* <!ViewGitFile(mlton,master,mlton/ast/ast-core.fun)>
485* <!ViewGitDir(mlton,master,mlton/ast)>
486
487== Type Checking ==
488
489The <:AST:> <:IntermediateLanguage:> has no independent type
490checker. Type inference is performed on an AST program as part of
491<:Elaborate:>.
492
493== Details and Notes ==
494
495=== Source locations ===
496
497MLton makes use of a relatively clean method for annotating the
498abstract syntax tree with source location information. Every source
499program phrase is "wrapped" with the `WRAPPED` interface:
500
501[source,sml]
502----
503sys::[./bin/InclGitFile.py mlton master mlton/control/wrapped.sig 8:19]
504----
505
506The key idea is that `node'` is the type of an unannotated syntax
507phrase and `obj` is the type of its annotated counterpart. In the
508implementation, every `node'` is annotated with a `Region.t`
509(<!ViewGitFile(mlton,master,mlton/control/region.sig)>,
510<!ViewGitFile(mlton,master,mlton/control/region.sml)>), which describes the
511syntax phrase's left source position and right source position, where
512`SourcePos.t` (<!ViewGitFile(mlton,master,mlton/control/source-pos.sig)>,
513<!ViewGitFile(mlton,master,mlton/control/source-pos.sml)>) denotes a
514particular file, line, and column. A typical use of the `WRAPPED`
515interface is illustrated by the following code:
516
517[source,sml]
518----
519sys::[./bin/InclGitFile.py mlton master mlton/ast/ast-core.sig 46:65]
520----
521
522Thus, AST nodes are cleanly separated from source locations. By way
523of contrast, consider the approach taken by <:SMLNJ:SML/NJ> (and also
524by the <:CKitLibrary:CKit Library>). Each datatype denoting a syntax
525phrase dedicates a special constructor for annotating source
526locations:
[source,sml]
----
datatype pat = WildPat                             (* empty pattern *)
             | AppPat of {constr:pat,argument:pat} (* application *)
             | MarkPat of pat * region             (* mark a pattern *)
----
533
534The main drawback of this approach is that static type checking is not
535sufficient to guarantee that the AST emitted from the front-end is
536properly annotated.
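
To make the roles of `node'` and `obj` concrete, here is a much-simplified sketch of the wrapping idea. It is not the contents of `wrapped.sig`: the signature name `WRAPPED_SKETCH`, the `region` record, the `patNode` datatype, and the functions `make`, `node`, and `region` are all assumptions introduced for illustration.

[source,sml]
----
(* Hypothetical stand-in for Region.t: left and right (line, column) pairs. *)
type region = {left: int * int, right: int * int}

signature WRAPPED_SKETCH =
   sig
      type node'   (* unannotated syntax phrase *)
      type obj     (* annotated counterpart *)
      val make: node' * region -> obj
      val node: obj -> node'
      val region: obj -> region
   end

(* A toy syntax phrase; it carries no location information itself. *)
datatype patNode = Wild | Var of string

structure Pat:> WRAPPED_SKETCH where type node' = patNode =
   struct
      type node' = patNode
      datatype obj = T of {node: node', region: region}
      fun make (n, r) = T {node = n, region = r}
      fun node (T {node, ...}) = node
      fun region (T {region, ...}) = region
   end

val p = Pat.make (Var "x", {left = (1, 5), right = (1, 6)})
----

Since `Pat.obj` is abstract and can only be built with `make`, which demands a region, the type checker rules out an unannotated phrase in this sketch.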
537
538<<<
539
540:mlton-guide-page: BasisLibrary
541[[BasisLibrary]]
542BasisLibrary
543============
544
545The <:StandardML:Standard ML> Basis Library is a collection of modules
546dealing with basic types, input/output, OS interfaces, and simple
547datatypes. It is intended as a portable library usable across all
548implementations of SML. For the official online version of the Basis
549Library specification, see http://www.standardml.org/Basis.
550<!Cite(GansnerReppy04, The Standard ML Basis Library)> is a book
551version that includes all of the online version and more. For a
552reverse chronological list of changes to the specification, see
553http://www.standardml.org/Basis/history.html.
554
555MLton implements all of the required portions of the Basis Library.
556MLton also implements many of the optional structures. You can obtain
557a complete and current list of what's available using
558`mlton -show-basis` (see <:ShowBasis:>). By default, MLton makes the
559Basis Library available to user programs. You can also
560<:MLBasisAvailableLibraries:access the Basis Library> from
561<:MLBasis: ML Basis> files.
562
563Below is a complete list of what MLton implements.
564
565== Top-level types and constructors ==
566
567`eqtype 'a array`
568
569`datatype bool = false | true`
570
571`eqtype char`
572
573`type exn`
574
575`eqtype int`
576
577++datatype 'a list = nil | {two-colons} of ('a * 'a list)++
578
579`datatype 'a option = NONE | SOME of 'a`
580
581`datatype order = EQUAL | GREATER | LESS`
582
583`type real`
584
585`datatype 'a ref = ref of 'a`
586
587`eqtype string`
588
589`type substring`
590
591`eqtype unit`
592
593`eqtype 'a vector`
594
595`eqtype word`
596
597== Top-level exception constructors ==
598
599`Bind`
600
601`Chr`
602
603`Div`
604
605`Domain`
606
607`Empty`
608
609`Fail of string`
610
611`Match`
612
613`Option`
614
615`Overflow`
616
617`Size`
618
619`Span`
620
621`Subscript`
622
623== Top-level values ==
624
625MLton does not implement the optional top-level value
626`use: string -> unit`, which conflicts with whole-program
627compilation because it allows new code to be loaded dynamically.
628
629MLton implements all other top-level values:
630
631`!`,
632`:=`,
633`<>`,
634`=`,
635`@`,
636`^`,
637`app`,
638`before`,
639`ceil`,
640`chr`,
641`concat`,
642`exnMessage`,
643`exnName`,
644`explode`,
645`floor`,
646`foldl`,
647`foldr`,
648`getOpt`,
649`hd`,
650`ignore`,
651`implode`,
652`isSome`,
653`length`,
654`map`,
655`not`,
656`null`,
657`o`,
658`ord`,
659`print`,
660`real`,
661`rev`,
662`round`,
663`size`,
664`str`,
665`substring`,
666`tl`,
667`trunc`,
668`valOf`,
669`vector`
670
671== Overloaded identifiers ==
672
673`*`,
674`+`,
675`-`,
676`/`,
677`<`,
678`<=`,
679`>`,
680`>=`,
681`~`,
682`abs`,
683`div`,
684`mod`
685
686== Top-level signatures ==
687
688`ARRAY`
689
690`ARRAY2`
691
692`ARRAY_SLICE`
693
694`BIN_IO`
695
696`BIT_FLAGS`
697
698`BOOL`
699
700`BYTE`
701
702`CHAR`
703
704`COMMAND_LINE`
705
706`DATE`
707
708`GENERAL`
709
710`GENERIC_SOCK`
711
712`IEEE_REAL`
713
714`IMPERATIVE_IO`
715
716`INET_SOCK`
717
718`INTEGER`
719
720`INT_INF`
721
722`IO`
723
724`LIST`
725
726`LIST_PAIR`
727
728`MATH`
729
730`MONO_ARRAY`
731
732`MONO_ARRAY2`
733
734`MONO_ARRAY_SLICE`
735
736`MONO_VECTOR`
737
738`MONO_VECTOR_SLICE`
739
740`NET_HOST_DB`
741
742`NET_PROT_DB`
743
744`NET_SERV_DB`
745
746`OPTION`
747
748`OS`
749
750`OS_FILE_SYS`
751
752`OS_IO`
753
754`OS_PATH`
755
756`OS_PROCESS`
757
758`PACK_REAL`
759
760`PACK_WORD`
761
762`POSIX`
763
764`POSIX_ERROR`
765
766`POSIX_FILE_SYS`
767
768`POSIX_IO`
769
770`POSIX_PROCESS`
771
772`POSIX_PROC_ENV`
773
774`POSIX_SIGNAL`
775
776`POSIX_SYS_DB`
777
778`POSIX_TTY`
779
780`PRIM_IO`
781
782`REAL`
783
784`SOCKET`
785
786`STREAM_IO`
787
788`STRING`
789
790`STRING_CVT`
791
792`SUBSTRING`
793
794`TEXT`
795
796`TEXT_IO`
797
798`TEXT_STREAM_IO`
799
800`TIME`
801
802`TIMER`
803
804`UNIX`
805
806`UNIX_SOCK`
807
808`VECTOR`
809
810`VECTOR_SLICE`
811
812`WORD`
813
814== Top-level structures ==
815
816`structure Array: ARRAY`
817
818`structure Array2: ARRAY2`
819
820`structure ArraySlice: ARRAY_SLICE`
821
822`structure BinIO: BIN_IO`
823
824`structure BinPrimIO: PRIM_IO`
825
826`structure Bool: BOOL`
827
828`structure BoolArray: MONO_ARRAY`
829
830`structure BoolArray2: MONO_ARRAY2`
831
832`structure BoolArraySlice: MONO_ARRAY_SLICE`
833
834`structure BoolVector: MONO_VECTOR`
835
836`structure BoolVectorSlice: MONO_VECTOR_SLICE`
837
838`structure Byte: BYTE`
839
840`structure Char: CHAR`
841
842* `Char` characters correspond to ISO-8859-1. The `Char` functions do not depend on locale.
843
844`structure CharArray: MONO_ARRAY`
845
846`structure CharArray2: MONO_ARRAY2`
847
848`structure CharArraySlice: MONO_ARRAY_SLICE`
849
850`structure CharVector: MONO_VECTOR`
851
852`structure CharVectorSlice: MONO_VECTOR_SLICE`
853
854`structure CommandLine: COMMAND_LINE`
855
856`structure Date: DATE`
857
858* `Date.fromString` and `Date.scan` accept a space in addition to a zero for the first character of the day of the month. The Basis Library specification only allows a zero.
859
860`structure FixedInt: INTEGER`
861
862`structure General: GENERAL`
863
864`structure GenericSock: GENERIC_SOCK`
865
866`structure IEEEReal: IEEE_REAL`
867
868`structure INetSock: INET_SOCK`
869
870`structure IO: IO`
871
872`structure Int: INTEGER`
873
874`structure Int1: INTEGER`
875
876`structure Int2: INTEGER`
877
878`structure Int3: INTEGER`
879
880`structure Int4: INTEGER`
881
882...
883
884`structure Int31: INTEGER`
885
886`structure Int32: INTEGER`
887
888`structure Int64: INTEGER`
889
890`structure IntArray: MONO_ARRAY`
891
892`structure IntArray2: MONO_ARRAY2`
893
894`structure IntArraySlice: MONO_ARRAY_SLICE`
895
896`structure IntVector: MONO_VECTOR`
897
898`structure IntVectorSlice: MONO_VECTOR_SLICE`
899
900`structure Int8: INTEGER`
901
902`structure Int8Array: MONO_ARRAY`
903
904`structure Int8Array2: MONO_ARRAY2`
905
906`structure Int8ArraySlice: MONO_ARRAY_SLICE`
907
908`structure Int8Vector: MONO_VECTOR`
909
910`structure Int8VectorSlice: MONO_VECTOR_SLICE`
911
912`structure Int16: INTEGER`
913
914`structure Int16Array: MONO_ARRAY`
915
916`structure Int16Array2: MONO_ARRAY2`
917
918`structure Int16ArraySlice: MONO_ARRAY_SLICE`
919
920`structure Int16Vector: MONO_VECTOR`
921
922`structure Int16VectorSlice: MONO_VECTOR_SLICE`
923
926`structure Int32Array: MONO_ARRAY`
927
928`structure Int32Array2: MONO_ARRAY2`
929
930`structure Int32ArraySlice: MONO_ARRAY_SLICE`
931
932`structure Int32Vector: MONO_VECTOR`
933
934`structure Int32VectorSlice: MONO_VECTOR_SLICE`
935
936`structure Int64Array: MONO_ARRAY`
937
938`structure Int64Array2: MONO_ARRAY2`
939
940`structure Int64ArraySlice: MONO_ARRAY_SLICE`
941
942`structure Int64Vector: MONO_VECTOR`
943
944`structure Int64VectorSlice: MONO_VECTOR_SLICE`
945
946`structure IntInf: INT_INF`
947
948`structure LargeInt: INTEGER`
949
950`structure LargeIntArray: MONO_ARRAY`
951
952`structure LargeIntArray2: MONO_ARRAY2`
953
954`structure LargeIntArraySlice: MONO_ARRAY_SLICE`
955
956`structure LargeIntVector: MONO_VECTOR`
957
958`structure LargeIntVectorSlice: MONO_VECTOR_SLICE`
959
960`structure LargeReal: REAL`
961
962`structure LargeRealArray: MONO_ARRAY`
963
964`structure LargeRealArray2: MONO_ARRAY2`
965
966`structure LargeRealArraySlice: MONO_ARRAY_SLICE`
967
968`structure LargeRealVector: MONO_VECTOR`
969
970`structure LargeRealVectorSlice: MONO_VECTOR_SLICE`
971
972`structure LargeWord: WORD`
973
974`structure LargeWordArray: MONO_ARRAY`
975
976`structure LargeWordArray2: MONO_ARRAY2`
977
978`structure LargeWordArraySlice: MONO_ARRAY_SLICE`
979
980`structure LargeWordVector: MONO_VECTOR`
981
982`structure LargeWordVectorSlice: MONO_VECTOR_SLICE`
983
984`structure List: LIST`
985
986`structure ListPair: LIST_PAIR`
987
988`structure Math: MATH`
989
990`structure NetHostDB: NET_HOST_DB`
991
992`structure NetProtDB: NET_PROT_DB`
993
994`structure NetServDB: NET_SERV_DB`
995
996`structure OS: OS`
997
998`structure Option: OPTION`
999
1000`structure PackReal32Big: PACK_REAL`
1001
1002`structure PackReal32Little: PACK_REAL`
1003
1004`structure PackReal64Big: PACK_REAL`
1005
1006`structure PackReal64Little: PACK_REAL`
1007
1008`structure PackRealBig: PACK_REAL`
1009
1010`structure PackRealLittle: PACK_REAL`
1011
1012`structure PackWord16Big: PACK_WORD`
1013
1014`structure PackWord16Little: PACK_WORD`
1015
1016`structure PackWord32Big: PACK_WORD`
1017
1018`structure PackWord32Little: PACK_WORD`
1019
1020`structure PackWord64Big: PACK_WORD`
1021
1022`structure PackWord64Little: PACK_WORD`
1023
1024`structure Position: INTEGER`
1025
1026`structure Posix: POSIX`
1027
1028`structure Real: REAL`
1029
1030`structure RealArray: MONO_ARRAY`
1031
1032`structure RealArray2: MONO_ARRAY2`
1033
1034`structure RealArraySlice: MONO_ARRAY_SLICE`
1035
1036`structure RealVector: MONO_VECTOR`
1037
1038`structure RealVectorSlice: MONO_VECTOR_SLICE`
1039
1040`structure Real32: REAL`
1041
1042`structure Real32Array: MONO_ARRAY`
1043
1044`structure Real32Array2: MONO_ARRAY2`
1045
1046`structure Real32ArraySlice: MONO_ARRAY_SLICE`
1047
1048`structure Real32Vector: MONO_VECTOR`
1049
1050`structure Real32VectorSlice: MONO_VECTOR_SLICE`
1051
1052`structure Real64: REAL`
1053
1054`structure Real64Array: MONO_ARRAY`
1055
1056`structure Real64Array2: MONO_ARRAY2`
1057
1058`structure Real64ArraySlice: MONO_ARRAY_SLICE`
1059
1060`structure Real64Vector: MONO_VECTOR`
1061
1062`structure Real64VectorSlice: MONO_VECTOR_SLICE`
1063
1064`structure Socket: SOCKET`
1065
1066* The Basis Library specification requires functions like
1067`Socket.sendVec` to raise an exception if they fail. However, on some
1068platforms, sending to a socket that hasn't yet been connected causes a
1069`SIGPIPE` signal, which invokes the default signal handler for
1070`SIGPIPE` and causes the program to terminate. If you want the
1071exception to be raised, you can ignore `SIGPIPE` by adding the
1072following to your program.
1073+
[source,sml]
----
let
   open MLton.Signal
in
   setHandler (Posix.Signal.pipe, Handler.ignore)
end
----
1082
1083`structure String: STRING`
1084
1085* The `String` functions do not depend on locale.
1086
1087`structure StringCvt: STRING_CVT`
1088
1089`structure Substring: SUBSTRING`
1090
1091`structure SysWord: WORD`
1092
1093`structure Text: TEXT`
1094
1095`structure TextIO: TEXT_IO`
1096
1097`structure TextPrimIO: PRIM_IO`
1098
1099`structure Time: TIME`
1100
1101`structure Timer: TIMER`
1102
1103`structure Unix: UNIX`
1104
1105`structure UnixSock: UNIX_SOCK`
1106
1107`structure Vector: VECTOR`
1108
1109`structure VectorSlice: VECTOR_SLICE`
1110
1111`structure Word: WORD`
1112
1113`structure Word1: WORD`
1114
1115`structure Word2: WORD`
1116
1117`structure Word3: WORD`
1118
1119`structure Word4: WORD`
1120
1121...
1122
1123`structure Word31: WORD`
1124
1125`structure Word32: WORD`
1126
1127`structure Word64: WORD`
1128
1129`structure WordArray: MONO_ARRAY`
1130
1131`structure WordArray2: MONO_ARRAY2`
1132
1133`structure WordArraySlice: MONO_ARRAY_SLICE`
1134
1135`structure WordVectorSlice: MONO_VECTOR_SLICE`
1136
1137`structure WordVector: MONO_VECTOR`
1138
1139`structure Word8Array: MONO_ARRAY`
1140
1141`structure Word8Array2: MONO_ARRAY2`
1142
1143`structure Word8ArraySlice: MONO_ARRAY_SLICE`
1144
1145`structure Word8Vector: MONO_VECTOR`
1146
1147`structure Word8VectorSlice: MONO_VECTOR_SLICE`
1148
1149`structure Word16Array: MONO_ARRAY`
1150
1151`structure Word16Array2: MONO_ARRAY2`
1152
1153`structure Word16ArraySlice: MONO_ARRAY_SLICE`
1154
1155`structure Word16Vector: MONO_VECTOR`
1156
1157`structure Word16VectorSlice: MONO_VECTOR_SLICE`
1158
1159`structure Word32Array: MONO_ARRAY`
1160
1161`structure Word32Array2: MONO_ARRAY2`
1162
1163`structure Word32ArraySlice: MONO_ARRAY_SLICE`
1164
1165`structure Word32Vector: MONO_VECTOR`
1166
1167`structure Word32VectorSlice: MONO_VECTOR_SLICE`
1168
1169`structure Word64Array: MONO_ARRAY`
1170
1171`structure Word64Array2: MONO_ARRAY2`
1172
1173`structure Word64ArraySlice: MONO_ARRAY_SLICE`
1174
1175`structure Word64Vector: MONO_VECTOR`
1176
1177`structure Word64VectorSlice: MONO_VECTOR_SLICE`
1178
1179== Top-level functors ==
1180
1181`ImperativeIO`
1182
1183`PrimIO`
1184
1185`StreamIO`
1186
1187* MLton's `StreamIO` functor takes structures `ArraySlice` and
1188`VectorSlice` in addition to the arguments specified in the Basis
1189Library specification.
1190
1191== Type equivalences ==
1192
The following types are equivalent.
----
FixedInt.int = Int64.int
LargeInt.int = IntInf.int
LargeReal.real = Real64.real
LargeWord.word = Word64.word
----
1200
1201The default `int`, `real`, and `word` types may be set by the
1202++-default-type __type__++ <:CompileTimeOptions: compile-time option>.
1203By default, the following types are equivalent:
1204----
1205int = Int.int = Int32.int
1206real = Real.real = Real64.real
1207word = Word.word = Word32.word
1208----
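
The default equivalences can be observed directly in a program. The sketch below assumes MLton with no `-default-type` option given; the binding names are illustrative, and the program would be rejected or behave differently under other defaults.

[source,sml]
----
(* These bindings type check only because, with MLton's defaults,
 * int = Int32.int, real = Real64.real, and word = Word32.word. *)
val i: Int32.int = (42: int)
val r: Real64.real = (3.5: real)
val w: Word32.word = (0wxFF: word)

(* The sizes reported by the Basis Library reflect the same defaults. *)
val intBits = Int.precision
val wordBits = Word.wordSize
----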
1209
1210== Real and Math functions ==
1211
1212The `Real`, `Real32`, and `Real64` modules are implemented
1213using the `C` math library, so the SML functions will reflect the
1214behavior of the underlying library function. We have made some effort
1215to unify the differences between the math libraries on different
1216platforms, and in particular to handle exceptional cases according to
1217the Basis Library specification. However, there will be differences
1218due to different numerical algorithms and cases we may have missed.
1219Please submit a <:Bug:bug report> if you encounter an error in
1220the handling of an exceptional case.
1221
1222On x86, real arithmetic is implemented internally using 80 bits of
1223precision. Using higher precision for intermediate results in
1224computations can lead to different results than if all the computation
1225is done at 32 or 64 bits. If you require strict IEEE compliance, you
1226can compile with `-ieee-fp true`, which will cause intermediate
1227results to be stored after each operation. This may cause a
1228substantial performance penalty.
1229
1230<<<
1231
1232:mlton-guide-page: Bug
1233[[Bug]]
1234Bug
1235===
1236
1237To report a bug, please send mail to
1238mailto:mlton-devel@mlton.org[`mlton-devel@mlton.org`]. Please include
1239the complete SML program that caused the problem and a log of a
1240compile of the program with `-verbose 2`. For large programs (over
1241256K), please send an email containing the discussion text and a link
1242to any large files.
1243
1244There are some <:UnresolvedBugs:> that we don't plan to fix.
1245
1246We also maintain a list of bugs found with each release.
1247
1248* <:Bugs20130715:>
1249* <:Bugs20100608:>
1250* <:Bugs20070826:>
1251* <:Bugs20051202:>
1252* <:Bugs20041109:>
1253
1254<<<
1255
1256:mlton-guide-page: Bugs20041109
1257[[Bugs20041109]]
1258Bugs20041109
1259============
1260
1261Here are the known bugs in <:Release20041109:MLton 20041109>, listed
1262in reverse chronological order of date reported.
1263
1264* <!Anchor(bug17)>
1265 `MLton.Finalizable.touch` doesn't necessarily keep values alive
1266 long enough. Our SVN has a patch to the compiler. You must rebuild
1267 the compiler in order for the patch to take effect.
1268+
1269Thanks to Florian Weimer for reporting this bug.
1270
1271* <!Anchor(bug16)>
1272 A bug in an optimization pass may incorrectly transform a program
1273 to flatten ref cells into their containing data structure, yielding a
1274 type-error in the transformed program. Our CVS has a
1275 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.35&r2=1.37[patch]
1276 to the compiler. You must rebuild the compiler in order for the
1277 patch to take effect.
1278+
1279Thanks to <:VesaKarvonen:> for reporting this bug.
1280
1281* <!Anchor(bug15)>
1282 A bug in the front end mistakenly allows unary constructors to be
1283 used without an argument in patterns. For example, the following
1284 program is accepted, and triggers a large internal error.
1285+
1286[source,sml]
1287----
1288fun f x = case x of SOME => true | _ => false
1289----
1290+
1291We have fixed the problem in our CVS.
1292+
1293Thanks to William Lovas for reporting this bug.
1294
1295* <!Anchor(bug14)>
 A bug in `Posix.IO.{getlk,setlk,setlkw}` causes a link-time error:
 `undefined reference to Posix_IO_FLock_typ`.
 Our CVS has a
1299 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/posix/primitive.sml.diff?r1=1.34&r2=1.35[patch]
1300 to the Basis Library implementation.
1301+
1302Thanks to Adam Chlipala for reporting this bug.
1303
1304* <!Anchor(bug13)>
1305 A bug can cause programs compiled with `-profile alloc` to
1306 segfault. Our CVS has a
1307 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/ssa-to-rssa.fun.diff?r1=1.106&r2=1.107[patch]
1308 to the compiler. You must rebuild the compiler in order for the
1309 patch to take effect.
1310+
1311Thanks to John Reppy for reporting this bug.
1312
1313* <!Anchor(bug12)>
1314 A bug in an optimization pass may incorrectly flatten ref cells
1315 into their containing data structure, breaking the sharing between
1316 the cells. Our CVS has a
1317 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.32&r2=1.33[patch]
1318 to the compiler. You must rebuild the compiler in order for the
1319 patch to take effect.
1320+
1321Thanks to Paul Govereau for reporting this bug.
1322
1323* <!Anchor(bug11)>
1324 Some arrays or vectors, such as `(char * char) vector`, are
1325 incorrectly implemented, and will conflate the first and second
1326 components of each element. Our CVS has a
1327 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/packed-representation.fun.diff?r1=1.32&r2=1.33[patch]
1328 to the compiler. You must rebuild the compiler in order for the
1329 patch to take effect.
1330+
1331Thanks to Scott Cruzen for reporting this bug.
1332
1333* <!Anchor(bug10)>
1334 `Socket.Ctl.getLINGER` and `Socket.Ctl.setLINGER`
1335 mistakenly raise `Subscript`.
1336 Our CVS has a
1337 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/socket.sml.diff?r1=1.14&r2=1.15[patch]
1338 to the Basis Library implementation.
1339+
1340Thanks to Ray Racine for reporting the bug.
1341
1342* <!Anchor(bug09)>
1343 <:ConcurrentML: CML> `Mailbox.send` makes a call in the wrong atomic context.
1344 Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/lib/cml/core-cml/mailbox.sml.diff?r1=1.3&r2=1.4[patch]
1345 to the CML implementation.
1346
1347* <!Anchor(bug08)>
1348 `OS.Path.joinDirFile` and `OS.Path.toString` did not
1349 raise `InvalidArc` when they were supposed to. They now do.
1350 Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/system/path.sml.diff?r1=1.8&r2=1.11[patch]
1351 to the Basis Library implementation.
1352+
1353Thanks to Andreas Rossberg for reporting the bug.
1354
1355* <!Anchor(bug07)>
1356 The front end incorrectly disallows sequences of expressions
1357 (separated by semicolons) after a topdec has already been processed.
1358 For example, the following is incorrectly rejected.
1359+
1360[source,sml]
1361----
1362val x = 0;
1363ignore x;
1364ignore x;
1365----
1366+
1367We have fixed the problem in our CVS.
1368+
1369Thanks to Andreas Rossberg for reporting the bug.
1370
1371* <!Anchor(bug06)>
1372 The front end incorrectly disallows expansive `val`
1373 declarations that bind a type variable that doesn't occur in the
1374 type of the value being bound. For example, the following is
1375 incorrectly rejected.
1376+
1377[source,sml]
1378----
1379val 'a x = let exception E of 'a in () end
1380----
1381+
1382We have fixed the problem in our CVS.
1383+
1384Thanks to Andreas Rossberg for reporting this bug.
1385
1386* <!Anchor(bug05)>
1387 The x86 codegen fails to account for the possibility that a 64-bit
1388 move could interfere with itself (as simulated by 32-bit moves). We
1389 have fixed the problem in our CVS.
1390+
1391Thanks to Scott Cruzen for reporting this bug.
1392
1393* <!Anchor(bug04)>
1394 `NetHostDB.scan` and `NetHostDB.fromString` incorrectly
1395 raise an exception on internet addresses whose last component is a
 zero, e.g., `0.0.0.0`. Our CVS has a
1397 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/net-host-db.sml.diff?r1=1.12&r2=1.13[patch] to the Basis Library implementation.
1398+
1399Thanks to Scott Cruzen for reporting this bug.
1400
1401* <!Anchor(bug03)>
1402 `StreamIO.inputLine` has an off-by-one error causing it to drop
1403 the first character after a newline in some situations. Our CVS has a
 http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/io/stream-io.fun.diff?r1=text&tr1=1.29&r2=text&tr2=1.30&diff_format=h[patch]
1405 to the Basis Library implementation.
1406+
1407Thanks to Scott Cruzen for reporting this bug.
1408
1409* <!Anchor(bug02)>
1410 `BinIO.getInstream` and `TextIO.getInstream` are
1411 implemented incorrectly. This also impacts the behavior of
1412 `BinIO.scanStream` and `TextIO.scanStream`. If you (directly
1413 or indirectly) realize a `TextIO.StreamIO.instream` and do not
1414 (directly or indirectly) call `TextIO.setInstream` with a derived
1415 stream, you may lose input data. We have fixed the problem in our
1416 CVS.
1417+
1418Thanks to <:WesleyTerpstra:> for reporting this bug.
1419
1420* <!Anchor(bug01)>
1421 `Posix.ProcEnv.setpgid` doesn't work. If you compile a program
 that uses it, you will get a link-time error:
1423+
1424----
1425undefined reference to `Posix_ProcEnv_setpgid'
1426----
1427+
The bug is due to `Posix_ProcEnv_setpgid` being omitted from the
 MLton runtime. We fixed the problem in our CVS by adding the
 following definition to `runtime/Posix/ProcEnv/ProcEnv.c`:
1431+
1432[source,c]
1433----
1434Int Posix_ProcEnv_setpgid (Pid p, Gid g) {
1435 return setpgid (p, g);
1436}
1437----
1438+
1439Thanks to Tom Murphy for reporting this bug.
1440
1441<<<
1442
1443:mlton-guide-page: Bugs20051202
1444[[Bugs20051202]]
1445Bugs20051202
1446============
1447
1448Here are the known bugs in <:Release20051202:MLton 20051202>, listed
1449in reverse chronological order of date reported.
1450
1451* <!Anchor(bug16)>
1452Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.fmt:VAL[++Real__<N>__.fmt++], http://www.standardml.org/Basis/real.html#SIG:REAL.fromString:VAL[++Real__<N>__.fromString++], http://www.standardml.org/Basis/real.html#SIG:REAL.scan:VAL[++Real__<N>__.scan++], and http://www.standardml.org/Basis/real.html#SIG:REAL.toString:VAL[++Real__<N>__.toString++] functions of the <:BasisLibrary:Basis Library> implementation. These functions were using `TO_NEAREST` semantics, but should obey the current rounding mode. (Only ++Real__<N>__.fmt StringCvt.EXACT++, ++Real__<N>__.fromDecimal++, and ++Real__<N>__.toDecimal++ are specified to override the current rounding mode with `TO_NEAREST` semantics.)
1453+
1454Thanks to Sean McLaughlin for the bug report.
1455+
1456Fixed by revision <!ViewSVNRev(5827)>.
1457
1458* <!Anchor(bug15)>
1459Bug in the treatment of floating-point operations. Floating-point operations depend on the current rounding mode, but were being treated as pure.
1460+
1461Thanks to Sean McLaughlin for the bug report.
1462+
1463Fixed by revision <!ViewSVNRev(5794)>.
1464
1465* <!Anchor(bug14)>
Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.toInt:VAL[++Real32.toInt++] function of the <:BasisLibrary:Basis Library> implementation could lead to incorrect results when applied to a `Real32.real` value numerically close to `valOf(Int.maxInt)`.
1467+
1468Fixed by revision <!ViewSVNRev(5764)>.
1469
1470* <!Anchor(bug13)>
1471The http://www.standardml.org/Basis/socket.html[++Socket++] structure of the <:BasisLibrary:Basis Library> implementation used `andb` rather than `orb` to unmarshal socket options (for ++Socket.Ctl.get__<OPT>__++ functions).
1472+
1473Thanks to Anders Petersson for the bug report and patch.
1474+
1475Fixed by revision <!ViewSVNRev(5735)>.
1476
1477* <!Anchor(bug12)>
1478Bug in the http://www.standardml.org/Basis/date.html[++Date++] structure of the <:BasisLibrary:Basis Library> implementation yielded some functions that would erroneously raise `Date` when applied to a year before 1900.
1479+
1480Thanks to Joe Hurd for the bug report.
1481+
1482Fixed by revision <!ViewSVNRev(5732)>.
1483
1484* <!Anchor(bug11)>
1485Bug in monomorphisation pass could exhibit the error `Type error: type mismatch`.
1486+
1487Thanks to Vesa Karvonen for the bug report.
1488+
1489Fixed by revision <!ViewSVNRev(5731)>.
1490
1491* <!Anchor(bug10)>
1492The http://www.standardml.org/Basis/pack-float.html#SIG:PACK_REAL.toBytes:VAL[++PackReal__<N>__.toBytes++] function in the <:BasisLibrary:Basis Library> implementation incorrectly shared (and mutated) the result vector.
1493+
1494Thanks to Eric McCorkle for the bug report and patch.
1495+
1496Fixed by revision <!ViewSVNRev(5281)>.
1497
1498* <!Anchor(bug09)>
Bug in elaboration of FFI forms. Using unary FFI types (e.g., `array`, `ref`, `vector`) in places where `MLton.Pointer.t` was required would lead to an internal error `TypeError`.
1500+
1501Fixed by revision <!ViewSVNRev(4890)>.
1502
1503* <!Anchor(bug08)>
1504The http://www.standardml.org/Basis/mono-vector.html[++MONO_VECTOR++] signature of the <:BasisLibrary:Basis Library> implementation incorrectly omits the specification of `find`.
1505+
1506Fixed by revision <!ViewSVNRev(4707)>.
1507
1508* <!Anchor(bug07)>
1509The optimizer reports an internal error (`TypeError`) when an imported C function is called but not used.
1510+
1511Thanks to "jq" for the bug report.
1512+
1513Fixed by revision <!ViewSVNRev(4690)>.
1514
1515* <!Anchor(bug06)>
1516Bug in pass to flatten data structures.
1517+
1518Thanks to Joe Hurd for the bug report.
1519+
1520Fixed by revision <!ViewSVNRev(4662)>.
1521
1522* <!Anchor(bug05)>
The native codegen's implementation of the C-calling convention failed to widen 16-bit arguments to 32 bits.
1524+
1525Fixed by revision <!ViewSVNRev(4631)>.
1526
1527* <!Anchor(bug04)>
1528The http://www.standardml.org/Basis/pack-float.html[++PACK_REAL++] structures of the <:BasisLibrary:Basis Library> implementation used byte, rather than element, indexing.
1529+
1530Fixed by revision <!ViewSVNRev(4411)>.
1531
1532* <!Anchor(bug03)>
1533`MLton.share` could cause a segmentation fault.
1534+
1535Fixed by revision <!ViewSVNRev(4400)>.
1536
1537* <!Anchor(bug02)>
1538The SSA simplifier could eliminate an irredundant test.
1539+
1540Fixed by revision <!ViewSVNRev(4370)>.
1541
1542* <!Anchor(bug01)>
1543A program with a very large number of functors could exhibit the error `ElaborateEnv.functorClosure: firstTycons`.
1544+
1545Fixed by revision <!ViewSVNRev(4344)>.
1546
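The difference between byte and element indexing in bug04 can be sketched in C. This is an illustration under stated assumptions, not the Basis Library code: `sub_real64` is a hypothetical helper, the buffer is assumed to hold 8-byte reals in the host's byte order, and element `i` is taken to start at byte offset `8 * i`.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read element i (not byte i) from a packed
   buffer of doubles stored in the host's byte order. */
double sub_real64 (const unsigned char *bytes, size_t len, size_t i)
{
  double result;
  size_t offset = i * sizeof (double);  /* element index -> byte offset */

  assert (offset + sizeof (double) <= len);
  memcpy (&result, bytes + offset, sizeof (double));
  return result;
}
----

Using `i` directly as the byte offset would read a value that straddles two elements for every `i` other than 0.
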
1547<<<
1548
1549:mlton-guide-page: Bugs20070826
1550[[Bugs20070826]]
1551Bugs20070826
1552============
1553
1554Here are the known bugs in <:Release20070826:MLton 20070826>, listed
1555in reverse chronological order of date reported.
1556
1557* <!Anchor(bug25)>
1558Bug in the mark-compact garbage collector where the C library's `memcpy` was used to move objects during the compaction phase; this could lead to heap corruption and segmentation faults with newer versions of gcc and/or glibc, which assume that src and dst in a `memcpy` do not overlap.
1559+
1560Fixed by revision <!ViewSVNRev(7461)>.
1561
1562* <!Anchor(bug24)>
1563Bug in elaboration of `datatype` declarations with `withtype` bindings.
1564+
1565Fixed by revision <!ViewSVNRev(7434)>.
1566
1567* <!Anchor(bug23)>
1568Performance bug in <:RefFlatten:> optimization pass.
1569+
1570Thanks to Reactive Systems for the bug report.
1571+
1572Fixed by revision <!ViewSVNRev(7379)>.
1573
1574* <!Anchor(bug22)>
1575Performance bug in <:SimplifyTypes:> optimization pass.
1576+
1577Thanks to Reactive Systems for the bug report.
1578+
1579Fixed by revisions <!ViewSVNRev(7377)> and <!ViewSVNRev(7378)>.
1580
1581* <!Anchor(bug21)>
1582Bug in amd64 codegen register allocation of indirect C calls.
1583+
1584Thanks to David Hansel for the bug report.
1585+
1586Fixed by revision <!ViewSVNRev(7368)>.
1587
1588* <!Anchor(bug20)>
1589Bug in `IntInf.scan` and `IntInf.fromString` where leading spaces were only accepted if the stream had an explicit sign character.
1590+
1591Thanks to David Hansel for the bug report.
1592+
1593Fixed by revisions <!ViewSVNRev(7227)> and <!ViewSVNRev(7230)>.
1594
1595* <!Anchor(bug19)>
1596Bug in `IntInf.~>>` that could cause a `glibc` assertion.
1597+
1598Fixed by revisions <!ViewSVNRev(7083)>, <!ViewSVNRev(7084)>, and <!ViewSVNRev(7085)>.
1599
1600* <!Anchor(bug18)>
1601Bug in the return type of `MLton.Process.reap`.
1602+
1603Thanks to Risto Saarelma for the bug report.
1604+
1605Fixed by revision <!ViewSVNRev(7029)>.
1606
1607* <!Anchor(bug17)>
1608Bug in `MLton.size` and `MLton.share` when tracing the current stack.
1609+
1610Fixed by revisions <!ViewSVNRev(6978)>, <!ViewSVNRev(6981)>, <!ViewSVNRev(6988)>, <!ViewSVNRev(6989)>, and <!ViewSVNRev(6990)>.
1611
1612* <!Anchor(bug16)>
1613Bug in nested `_export`/`_import` functions.
1614+
1615Fixed by revision <!ViewSVNRev(6919)>.
1616
1617* <!Anchor(bug15)>
1618Bug in the name mangling of `_import`-ed functions with the `stdcall` convention.
1619+
1620Thanks to Lars Bergstrom for the bug report.
1621+
1622Fixed by revision <!ViewSVNRev(6672)>.
1623
1624* <!Anchor(bug14)>
1625Bug in Windows code to page the heap to disk when unable to grow the heap to a desired size.
1626+
1627Thanks to Sami Evangelista for the bug report.
1628+
1629Fixed by revisions <!ViewSVNRev(6600)> and <!ViewSVNRev(6624)>.
1630
1631* <!Anchor(bug13)>
1632Bug in \*NIX code to page the heap to disk when unable to grow the heap to a desired size.
1633+
1634Thanks to Nicolas Bertolotti for the bug report and patch.
1635+
1636Fixed by revisions <!ViewSVNRev(6596)> and <!ViewSVNRev(6600)>.
1637
1638* <!Anchor(bug12)>
1639Space-safety bug in pass to <:RefFlatten: flatten refs> into containing data structure.
1640+
1641Thanks to Daniel Spoonhower for the bug report and initial diagnosis and patch.
1642+
1643Fixed by revision <!ViewSVNRev(6395)>.
1644
1645* <!Anchor(bug11)>
1646Bug in the frontend that rejected `op longvid` patterns and expressions.
1647+
1648Thanks to Florian Weimer for the bug report.
1649+
1650Fixed by revision <!ViewSVNRev(6347)>.
1651
1652* <!Anchor(bug10)>
1653Bug in the http://www.standardml.org/Basis/imperative-io.html#SIG:IMPERATIVE_IO.canInput:VAL[`IMPERATIVE_IO.canInput`] function of the <:BasisLibrary:Basis Library> implementation.
1654+
1655Thanks to Ville Laurikari for the bug report.
1656+
1657Fixed by revision <!ViewSVNRev(6261)>.
1658
1659* <!Anchor(bug09)>
1660Bug in algebraic simplification of real primitives. http://www.standardml.org/Basis/real.html#SIG:REAL.\|@LTE\|:VAL[++REAL__<N>__.\<=(x, x)++] is `false` when `x` is NaN.
1661+
1662Fixed by revision <!ViewSVNRev(6242)>.
1663
1664* <!Anchor(bug08)>
Bug in the FFI visible representation of `Int16.int ref` (and references of other primitive types smaller than 32 bits) on big-endian platforms.
1666+
1667Thanks to Dave Herman for the bug report.
1668+
1669Fixed by revision <!ViewSVNRev(6267)>.
1670
1671* <!Anchor(bug07)>
1672Bug in type inference of flexible records. This would later cause the compiler to raise the `TypeError` exception.
1673+
1674Thanks to Wesley Terpstra for the bug report.
1675+
1676Fixed by revision <!ViewSVNRev(6229)>.
1677
1678* <!Anchor(bug06)>
1679Bug in cross-compilation of `gdtoa` library.
1680+
1681Thanks to Wesley Terpstra for the bug report and patch.
1682+
1683Fixed by revision <!ViewSVNRev(6620)>.
1684
1685* <!Anchor(bug05)>
1686Bug in pass to <:RefFlatten: flatten refs> into containing data structure.
1687+
1688Thanks to Ruy Ley-Wild for the bug report.
1689+
1690Fixed by revision <!ViewSVNRev(6191)>.
1691
1692* <!Anchor(bug04)>
1693Bug in the handling of weak pointers by the mark-compact garbage collector.
1694+
1695Thanks to Sean McLaughlin for the bug report and Florian Weimer for the initial diagnosis.
1696+
1697Fixed by revision <!ViewSVNRev(6183)>.
1698
1699* <!Anchor(bug03)>
1700Bug in the elaboration of structures with signature constraints. This would later cause the compiler to raise the `TypeError` exception.
1701+
1702Thanks to Vesa Karvonen for the bug report.
1703+
1704Fixed by revision <!ViewSVNRev(6046)>.
1705
1706* <!Anchor(bug02)>
1707Bug in the interaction of `_export`-ed functions and signal handlers.
1708+
1709Thanks to Sean McLaughlin for the bug report.
1710+
1711Fixed by revision <!ViewSVNRev(6013)>.
1712
1713* <!Anchor(bug01)>
1714Bug in the implementation of `_export`-ed functions using the `char` type, leading to a linker error.
1715+
1716Thanks to Katsuhiro Ueno for the bug report.
1717+
1718Fixed by revision <!ViewSVNRev(5999)>.
1719
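The failure mode in bug25 can be reproduced outside of MLton. The sketch below is not the collector's code: `slide_down` is a hypothetical helper that moves a block toward the start of a buffer, as a compaction phase does. Because the source and destination ranges may overlap, it uses `memmove`, which is defined for overlapping regions; `memcpy` has undefined behavior in that case.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: slide len bytes from offset from down to offset
   to within the same buffer (to <= from). The ranges may overlap, so
   memmove is required; memcpy would be undefined behavior here. */
void slide_down (unsigned char *heap, size_t heap_size,
                 size_t from, size_t to, size_t len)
{
  assert (to <= from);
  assert (len <= heap_size && from <= heap_size - len);
  memmove (heap + to, heap + from, len);
}
----

For example, sliding six bytes from offset 2 to offset 0 overlaps in four bytes, which is exactly the situation where an optimized `memcpy` may corrupt the data.
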
1720<<<
1721
1722:mlton-guide-page: Bugs20100608
1723[[Bugs20100608]]
1724Bugs20100608
1725============
1726
1727Here are the known bugs in <:Release20100608:MLton 20100608>, listed
1728in reverse chronological order of date reported.
1729
1730* <!Anchor(bug11)>
1731Bugs in `REAL.signBit`, `REAL.copySign`, and `REAL.toDecimal`/`REAL.fromDecimal`.
1732+
1733Thanks to Phil Clayton for the bug report and examples.
1734+
1735Fixed by revisions <!ViewSVNRev(7571)>, <!ViewSVNRev(7572)>, and <!ViewSVNRev(7573)>.
1736
1737* <!Anchor(bug10)>
1738Bug in elaboration of type variables with and without equality status.
1739+
1740Thanks to Rob Simmons for the bug report and examples.
1741+
1742Fixed by revision <!ViewSVNRev(7565)>.
1743
1744* <!Anchor(bug09)>
1745Bug in <:Redundant:redundant> <:SSA:> optimization.
1746+
1747Thanks to Lars Magnusson for the bug report and example.
1748+
1749Fixed by revision <!ViewSVNRev(7561)>.
1750
1751* <!Anchor(bug08)>
1752Bug in <:SSA:>/<:SSA2:> <:Shrink:shrinker> that could erroneously turn a non-tail function call with a `Bug` transfer as its continuation into a tail function call.
1753+
1754Thanks to Lars Bergstrom for the bug report.
1755+
1756Fixed by revision <!ViewSVNRev(7546)>.
1757
1758* <!Anchor(bug07)>
1759Bug in translation from <:SSA2:> to <:RSSA:> with `case` expressions over non-primitive-sized words.
1760+
1761Fixed by revision <!ViewSVNRev(7544)>.
1762
1763* <!Anchor(bug06)>
1764Bug with <:SSA:>/<:SSA2:> type checking of case expressions over words.
1765+
1766Fixed by revision <!ViewSVNRev(7542)>.
1767
1768* <!Anchor(bug05)>
1769Bug with treatment of `as`-patterns, which should not allow the redefinition of constructor status.
1770+
1771Thanks to Michael Norrish for the bug report.
1772+
1773Fixed by revision <!ViewSVNRev(7530)>.
1774
1775* <!Anchor(bug04)>
1776Bug with treatment of `nan` in <:CommonSubexp:common subexpression elimination> <:SSA:> optimization.
1777+
1778Thanks to Alexandre Hamez for the bug report.
1779+
1780Fixed by revision <!ViewSVNRev(7503)>.
1781
1782* <!Anchor(bug03)>
1783Bug in translation from <:SSA2:> to <:RSSA:> with weak pointers.
1784+
1785Thanks to Alexandre Hamez for the bug report.
1786+
1787Fixed by revision <!ViewSVNRev(7502)>.
1788
1789* <!Anchor(bug02)>
1790Bug in amd64 codegen calling convention for varargs C calls.
1791+
1792Thanks to <:HenryCejtin:> for the bug report and <:WesleyTerpstra:> for the initial diagnosis.
1793+
1794Fixed by revision <!ViewSVNRev(7501)>.
1795
1796* <!Anchor(bug01)>
1797Bug in comment-handling in lexer for <:MLYacc:>'s input language.
1798+
1799Thanks to Michael Norrish for the bug report and patch.
1800+
1801Fixed by revision <!ViewSVNRev(7500)>.
1802
1803* <!Anchor(bug00)>
1804Bug in elaboration of function clauses with different numbers of arguments that would raise an uncaught `Subscript` exception.
1805+
1806Fixed by revision <!ViewSVNRev(75497)>.
1807
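The entry for bug04 above gives no detail beyond `nan` and common subexpression elimination, so the following is background rather than a reconstruction of the defect. Under IEEE 754, every ordered comparison involving NaN is false, so an optimizer cannot assume that `x <= x` or `x == x` holds for a floating-point `x`. A minimal C sketch, with illustrative function names:

[source,c]
----
#include <assert.h>
#include <math.h>

/* Returns nonzero when x <= x holds; this is false only for NaN. */
int le_self (double x)
{
  return x <= x;
}

/* Self-check of the IEEE 754 behavior described above; aborts if the
   platform (or aggressive floating-point flags) disagrees. */
void check_nan_comparisons (void)
{
  double nan_value = NAN;

  assert (le_self (1.0));
  assert (!le_self (nan_value));
  assert (nan_value != nan_value);
}
----
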
1808<<<
1809
1810:mlton-guide-page: Bugs20130715
1811[[Bugs20130715]]
1812Bugs20130715
1813============
1814
1815Here are the known bugs in <:Release20130715:MLton 20130715>, listed
1816in reverse chronological order of date reported.
1817
1818* <!Anchor(bug06)>
1819Bug with simultaneous `sharing` of multiple structures.
1820+
1821Fixed by commit <!ViewGitCommit(mlton,9cb5164f6)>.
1822
1823* <!Anchor(bug05)>
1824Minor bug with exception replication.
1825+
1826Fixed by commit <!ViewGitCommit(mlton,1c89c42f6)>.
1827
1828* <!Anchor(bug04)>
1829Minor bug erroneously accepting symbolic identifiers for strid, sigid, and fctid
1830and erroneously accepting symbolic identifiers before `.` in long identifiers.
1831+
1832Fixed by commit <!ViewGitCommit(mlton,9a56be647)>.
1833
1834* <!Anchor(bug03)>
1835Minor bug in precedence parsing of function clauses.
1836+
1837Fixed by commit <!ViewGitCommit(mlton,1a6d25ec9)>.
1838
1839* <!Anchor(bug02)>
1840Performance bug in creation of worker threads to service calls of `_export`-ed
1841functions.
1842+
1843Thanks to Bernard Berthomieu for the bug report.
1844+
1845Fixed by commit <!ViewGitCommit(mlton,97c2bdf1d)>.
1846
1847* <!Anchor(bug01)>
1848Bug in `MLton.IntInf.fromRep` that could yield values that violate the `IntInf`
1849representation invariants.
1850+
1851Thanks to Rob Simmons for the bug report.
1852+
1853Fixed by commit <!ViewGitCommit(mlton,3add91eda)>.
1854
1855* <!Anchor(bug00)>
1856Bug in equality status of some arrays, vectors, and slices in Basis Library
1857implementation.
1858+
1859Fixed by commit <!ViewGitCommit(mlton,a7ed9cbf1)>.
1860
1861<<<
1862
1863:mlton-guide-page: Bugs20180207
1864[[Bugs20180207]]
1865Bugs20180207
1866============
1867
1868Here are the known bugs in <:Release20180207:MLton 20180207>, listed
1869in reverse chronological order of date reported.
1870
1871<<<
1872
1873:mlton-guide-page: CallGraph
1874[[CallGraph]]
1875CallGraph
1876=========
1877
1878For easier visualization of <:Profiling:profiling> data, `mlprof` can
1879create a call graph of the program in dot format, from which you can
1880use the http://www.research.att.com/sw/tools/graphviz/[graphviz]
1881software package to create a PostScript or PNG graph. For example,
1882----
1883mlprof -call-graph foo.dot foo mlmon.out
1884----
1885will create `foo.dot` with a complete call graph. For each source
1886function, there will be one node in the graph that contains the
1887function name (and source position with `-show-line true`), as
1888well as the percentage of ticks. If you want to create a call graph
1889for your program without any profiling data, you can simply call
1890`mlprof` without any `mlmon.out` files, as in
1891----
1892mlprof -call-graph foo.dot foo
1893----
1894
Because SML has higher-order functions, the call graph is dependent
1896on MLton's analysis of which functions call each other. This analysis
1897depends on many implementation details and might display spurious
1898edges that a human could conclude are impossible. However, in
1899practice, the call graphs tend to be very accurate.
1900
1901Because call graphs can get big, `mlprof` provides the `-keep` option
1902to specify the nodes that you would like to see. This option also
1903controls which functions appear in the table that `mlprof` prints.
1904The argument to `-keep` is an expression describing a set of source
1905functions (i.e. graph nodes). The expression _e_ should be of the
1906following form.
1907
1908* ++all++
1909* ++"__s__"++
1910* ++(and __e ...__)++
1911* ++(from __e__)++
1912* ++(not __e__)++
1913* ++(or __e__)++
1914* ++(pred __e__)++
1915* ++(succ __e__)++
1916* ++(thresh __x__)++
1917* ++(thresh-gc __x__)++
1918* ++(thresh-stack __x__)++
1919* ++(to __e__)++
1920
1921In the grammar, ++all++ denotes the set of all nodes. ++"__s__"++ is
1922a regular expression denoting the set of functions whose name
1923(followed by a space and the source position) has a prefix matching
1924the regexp. The `and`, `not`, and `or` expressions denote
1925intersection, complement, and union, respectively. The `pred` and
1926`succ` expressions add the set of immediate predecessors or successors
1927to their argument, respectively. The `from` and `to` expressions
1928denote the set of nodes that have paths from or to the set of nodes
1929denoted by their arguments, respectively. Finally, `thresh`,
1930`thresh-gc`, and `thresh-stack` denote the set of nodes whose
1931percentage of ticks, gc ticks, or stack ticks, respectively, is
1932greater than or equal to the real number _x_.
1933
1934For example, if you want to see the entire call graph for a program,
1935you can use `-keep all` (this is the default). If you want to see
1936all nodes reachable from function `foo` in your program, you would
1937use `-keep '(from "foo")'`. Or, if you want to see all the
1938functions defined in subdirectory `bar` of your project that used
1939at least 1% of the ticks, you would use
1940----
1941-keep '(and ".*/bar/" (thresh 1.0))'
1942----
1943To see all functions with ticks above a threshold, you can also use
1944`-thresh x`, which is an abbreviation for `-keep '(thresh x)'`. You
cannot use multiple `-keep` arguments or both `-keep` and `-thresh`.
1946When you use `-keep` to display a subset of the functions, `mlprof`
1947will add dashed edges to the call graph to indicate a path in the
1948original call graph from one function to another.
1949
1950When compiling with `-profile-stack true`, you can use `mlprof -gray
1951true` to make the nodes darker or lighter depending on whether their
1952stack percentage is higher or lower.
1953
1954MLton's optimizer may duplicate source functions for any of a number
1955of reasons (functor duplication, monomorphisation, polyvariance,
1956inlining). By default, all duplicates of a function are treated as
1957one. If you would like to treat the duplicates separately, you can
1958use ++mlprof -split __regexp__++, which will cause all duplicates of
1959functions whose name has a prefix matching the regular expression to
1960be treated separately. This can be especially useful for higher-order
1961utility functions like `General.o`.
1962
1963== Caveats ==
1964
1965Technically speaking, `mlprof` produces a call-stack graph rather than
1966a call graph, because it describes the set of possible call stacks.
The difference is in how tail calls are displayed. For example, if `f`
1968nontail calls `g` and `g` tail calls `h`, then the call-stack graph
1969has edges from `f` to `g` and `f` to `h`, while the call graph has
1970edges from `f` to `g` and `g` to `h`. That is, a tail call from `g`
1971to `h` removes `g` from the call stack and replaces it with `h`.
1972
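The distinction can be made concrete with a small model. The sketch below is illustrative only and does not reflect `mlprof`'s data structures; `struct graphs`, `graphs_init`, and `record_call` are hypothetical names. It simulates a call stack and records both kinds of edges, where a tail call replaces the caller's frame before the callee's frame is pushed.

[source,c]
----
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { MAX_FUNCS = 8, MAX_DEPTH = 32 };

/* Illustrative model: functions are numbered 0 .. MAX_FUNCS - 1. */
struct graphs {
  bool call_edge[MAX_FUNCS][MAX_FUNCS];   /* caller -> callee */
  bool stack_edge[MAX_FUNCS][MAX_FUNCS];  /* frame below -> frame above */
  int stack[MAX_DEPTH];
  int depth;
};

/* Start with entry as the only frame on the stack. */
void graphs_init (struct graphs *g, int entry)
{
  assert (entry >= 0 && entry < MAX_FUNCS);
  memset (g, 0, sizeof *g);
  g->stack[0] = entry;
  g->depth = 1;
}

/* The function on top of the stack calls callee. A tail call first
   pops the caller's frame, so the new frame sits on the caller's
   caller. */
void record_call (struct graphs *g, int callee, bool is_tail)
{
  assert (g->depth >= 1 && g->depth < MAX_DEPTH);
  assert (callee >= 0 && callee < MAX_FUNCS);

  int caller = g->stack[g->depth - 1];
  g->call_edge[caller][callee] = true;
  if (is_tail)
    g->depth--;                           /* caller's frame is replaced */
  if (g->depth > 0)
    g->stack_edge[g->stack[g->depth - 1]][callee] = true;
  g->stack[g->depth++] = callee;
}
----

Numbering `f`, `g`, and `h` as 0, 1, and 2, a nontail call to `g` followed by a tail call to `h` leaves call edges from `f` to `g` and from `g` to `h`, and call-stack edges from `f` to `g` and from `f` to `h`.
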
1973<<<
1974
1975:mlton-guide-page: CallingFromCToSML
1976[[CallingFromCToSML]]
1977CallingFromCToSML
1978=================
1979
1980MLton's <:ForeignFunctionInterface:> allows programs to _export_ SML
functions to be called from C. Suppose you would like to export from SML
1982a function of type `real * char -> int` as the C function `foo`.
1983MLton extends the syntax of SML to allow expressions like the
1984following:
1985----
1986_export "foo": (real * char -> int) -> unit;
1987----
1988The above expression exports a C function named `foo`, with
1989prototype
1990[source,c]
1991----
1992Int32 foo (Real64 x0, Char x1);
1993----
1994The `_export` expression denotes a function of type
1995`(real * char -> int) -> unit` that when called with a function
1996`f`, arranges for the exported `foo` function to call `f`
1997when `foo` is called. So, for example, the following exports and
1998defines `foo`.
1999[source,sml]
2000----
2001val e = _export "foo": (real * char -> int) -> unit;
2002val _ = e (fn (x, c) => 13 + Real.floor x + Char.ord c)
2003----
2004
2005The general form of an `_export` expression is
2006----
2007_export "C function name" attr... : cFuncTy -> unit;
2008----
2009The type and the semicolon are not optional. As with `_import`, a
2010sequence of attributes may follow the function name.
2011
2012MLton's `-export-header` option generates a C header file with
2013prototypes for all of the functions exported from SML. Include this
2014header file in your C files to type check calls to functions exported
2015from SML. This header file includes ++typedef++s for the
2016<:ForeignFunctionInterfaceTypes: types that can be passed between SML and C>.
2017
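From the C side, an exported function is called like any other C function. The following sketch is written to compile on its own, so it makes two assumptions that a real build would not need: the ++typedef++s are stand-ins for those in the generated header, and the definition of `foo` is a C stand-in that mimics the SML function from the earlier example (`13 + Real.floor x + Char.ord c`). `call_foo` is a hypothetical caller.

[source,c]
----
#include <assert.h>
#include <stdint.h>

/* Stand-ins for typedefs that the generated header would provide;
   these definitions are assumptions. */
typedef int32_t Int32;
typedef double Real64;
typedef unsigned char Char;

/* Prototype as it would appear in the generated header. */
Int32 foo (Real64 x0, Char x1);

/* Stand-in for the SML side so that this file links on its own. In a
   real build the SML program supplies foo and this is removed. */
Int32 foo (Real64 x0, Char x1)
{
  assert (x0 > -2147483648.0 && x0 < 2147483647.0);

  Int32 floor_x = (Int32) x0;       /* truncates toward zero */
  if ((Real64) floor_x > x0)
    floor_x--;                      /* adjust for negative non-integers */
  return 13 + floor_x + (Int32) x1;
}

/* C code calls the exported function like any other C function. */
Int32 call_foo (void)
{
  return foo (2.5, 'a');
}
----

In an actual program, include the header produced by `-export-header` instead of declaring the types and prototype by hand, so that the C compiler checks each call against the exported type.
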
2018
2019== Example ==
2020
2021Suppose that `export.sml` is
2022
2023[source,sml]
2024----
2025sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/export.sml]
2026----
2027
Note that the `reentrant` attribute is used for `_import`-ing the
2029C functions that will call the `_export`-ed SML functions.
2030
2031Create the header file with `-export-header`.
2032----
2033% mlton -default-ann 'allowFFI true' \
2034 -export-header export.h \
2035 -stop tc \
2036 export.sml
2037----
2038
2039`export.h` now contains the following C prototypes.
2040----
2041Int8 f (Int32 x0, Real64 x1, Int8 x2);
2042Pointer f2 (Word8 x0);
2043void f3 ();
2044void f4 (Int32 x0);
2045extern Int32 zzz;
2046----
2047
2048Use `export.h` in a C program, `ffi-export.c`, as follows.
2049
2050[source,c]
2051----
2052sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-export.c]
2053----
2054
2055Compile `ffi-export.c` and `export.sml`.
2056----
2057% gcc -c ffi-export.c
2058% mlton -default-ann 'allowFFI true' \
2059 export.sml ffi-export.o
2060----
2061
2062Finally, run `export`.
2063----
2064% ./export
2065g starting
2066...
2067g4 (0)
2068success
2069----
2070
2071
2072== Download ==
2073* <!RawGitFile(mlton,master,doc/examples/ffi/export.sml)>
2074* <!RawGitFile(mlton,master,doc/examples/ffi/ffi-export.c)>
2075
2076<<<
2077
2078:mlton-guide-page: CallingFromSMLToC
2079[[CallingFromSMLToC]]
2080CallingFromSMLToC
2081=================
2082
2083MLton's <:ForeignFunctionInterface:> allows an SML program to _import_
2084C functions. Suppose you would like to import from C a function with
2085the following prototype:
2086[source,c]
2087----
2088int foo (double d, char c);
2089----
2090MLton extends the syntax of SML to allow expressions like the following:
2091----
2092_import "foo": real * char -> int;
2093----
2094This expression denotes a function of type `real * char -> int` whose
2095behavior is implemented by calling the C function whose name is `foo`.
2096Thinking in terms of C, imagine that there are C variables `d` of type
2097`double`, `c` of type `unsigned char`, and `i` of type `int`. Then,
2098the C statement `i = foo (d, c)` is executed and `i` is returned.
2099
2100The general form of an `_import` expression is:
2101----
2102_import "C function name" attr... : cFuncTy;
2103----
2104The type and the semicolon are not optional.
2105
2106The function name is followed by a (possibly empty) sequence of
2107attributes, analogous to C `__attribute__` specifiers.
2108
2109
2110== Example ==
2111
2112`import.sml` imports the C function `ffi` and the C variable `FFI_INT`
2113as follows.
2114
2115[source,sml]
2116----
2117sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/import.sml]
2118----
2119
2120`ffi-import.c` is
2121
2122[source,c]
2123----
2124sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-import.c]
2125----
2126
2127Compile and run the program.
2128----
2129% mlton -default-ann 'allowFFI true' -export-header export.h import.sml ffi-import.c
2130% ./import
213113
2132success
2133----
2134
2135
2136== Download ==
2137* <!RawGitFile(mlton,master,doc/examples/ffi/import.sml)>
2138* <!RawGitFile(mlton,master,doc/examples/ffi/ffi-import.c)>
2139
2140
2141== Next Steps ==
2142
2143* <:CallingFromSMLToCFunctionPointer:>
2144
2145<<<
2146
2147:mlton-guide-page: CallingFromSMLToCFunctionPointer
2148[[CallingFromSMLToCFunctionPointer]]
2149CallingFromSMLToCFunctionPointer
2150================================
2151
2152Just as MLton can <:CallingFromSMLToC:directly call C functions>, it
2153is possible to make indirect function calls; that is, function calls
2154through a function pointer. MLton extends the syntax of SML to allow
2155expressions like the following:
2156----
2157_import * : MLton.Pointer.t -> real * char -> int;
2158----
2159This expression denotes a function of type
2160[source,sml]
2161----
2162MLton.Pointer.t -> real * char -> int
2163----
2164whose behavior is implemented by calling the C function at the address
2165denoted by the `MLton.Pointer.t` argument, and supplying the C
function two arguments, a `double` and an `unsigned char`. The C function
2167pointer may be obtained, for example, by the dynamic linking loader
2168(`dlopen`, `dlsym`, ...).
2169
2170The general form of an indirect `_import` expression is:
2171----
2172_import * attr... : cPtrTy -> cFuncTy;
2173----
2174The type and the semicolon are not optional.
2175
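As a minimal sketch of the indirect form (the names `callAt` and
`applyAt` are hypothetical and not part of MLton), the following binds
an indirect `_import` once and wraps it in an ordinary SML function.
It assumes that the pointer really addresses a C function of type
`int (*)(double, unsigned char)` and that the program is compiled with
`-default-ann 'allowFFI true'`.

[source,sml]
----
(* Bind the indirect import; the first argument is the function pointer. *)
val callAt = _import * : MLton.Pointer.t -> real * char -> int;

(* Hypothetical wrapper: refuse to call through a null pointer. *)
fun applyAt (fptr : MLton.Pointer.t, d : real, c : char) : int option =
   if fptr = MLton.Pointer.null
      then NONE
   else SOME (callAt fptr (d, c))
----
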
2176
2177== Example ==
2178
2179This example uses `dlopen` and friends (imported using normal
2180`_import`) to dynamically load the math library (`libm`) and call the
2181`cos` function. Suppose `iimport.sml` contains the following.
2182
2183[source,sml]
2184----
2185sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/iimport.sml]
2186----
2187
2188Compile and run `iimport.sml`.
2189----
2190% mlton -default-ann 'allowFFI true' \
2191 -target-link-opt linux -ldl \
2192 -target-link-opt solaris -ldl \
2193 iimport.sml
% ./iimport
2195 Math.cos(2.0) = ~0.416146836547
2196libm.so::cos(2.0) = ~0.416146836547
2197----
2198
This example also shows the `-target-link-opt` option, which passes the
given switch to the linker only when compiling for the specified target
platform. Compile with `-verbose 1` to see in more detail what is being
passed to `gcc`.
2202
2203
2204== Download ==
2205* <!RawGitFile(mlton,master,doc/examples/ffi/iimport.sml)>
2206
2207<<<
2208
2209:mlton-guide-page: CCodegen
2210[[CCodegen]]
2211CCodegen
2212========
2213
2214The <:CCodegen:> is a <:Codegen:code generator> that translates the
2215<:Machine:> <:IntermediateLanguage:> to C, which is further optimized
2216and compiled to native object code by `gcc` (or another C compiler).
2217
2218== Implementation ==
2219
2220* <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.sig)>
2221* <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.fun)>
2222
2223== Details and Notes ==
2224
2225The <:CCodegen:> is the original <:Codegen:code generator> for MLton.
2226
2227<<<
2228
2229:mlton-guide-page: Changelog
2230[[Changelog]]
2231Changelog
2232=========
2233
2234* <!ViewGitFile(mlton,master,CHANGELOG.adoc)>
2235
2236----
2237sys::[./bin/InclGitFile.py mlton master CHANGELOG.adoc]
2238----
2239
2240<<<
2241
2242:mlton-guide-page: ChrisClearwater
2243[[ChrisClearwater]]
2244ChrisClearwater
2245===============
2246
2247{empty}
2248
2249<<<
2250
2251:mlton-guide-page: Chunkify
2252[[Chunkify]]
2253Chunkify
2254========
2255
2256<:Chunkify:> is an analysis pass for the <:RSSA:>
2257<:IntermediateLanguage:>, invoked from <:ToMachine:>.
2258
2259== Description ==
2260
2261It partitions all the labels (function and block) in an <:RSSA:>
2262program into disjoint sets, referred to as chunks.
2263
2264== Implementation ==
2265
2266* <!ViewGitFile(mlton,master,mlton/backend/chunkify.sig)>
2267* <!ViewGitFile(mlton,master,mlton/backend/chunkify.fun)>
2268
2269== Details and Notes ==
2270
2271Breaking large <:RSSA:> functions into chunks is necessary for
2272reasonable compile times with the <:CCodegen:> and the <:LLVMCodegen:>.
2273
2274<<<
2275
2276:mlton-guide-page: CKitLibrary
2277[[CKitLibrary]]
2278CKitLibrary
2279===========
2280
2281The http://www.smlnj.org/doc/ckit[ckit Library] is a C front end
2282written in SML that translates C source code (after preprocessing)
2283into abstract syntax represented as a set of SML datatypes. The ckit
2284Library is distributed with SML/NJ. Due to differences between SML/NJ
and MLton, this library will not work out of the box with MLton.
2286
2287As of 20180119, MLton includes a port of the ckit Library synchronized
2288with SML/NJ version 110.82.
2289
2290== Usage ==
2291
2292* You can import the ckit Library into an MLB file with:
2293+
2294[options="header"]
2295|=====
2296|MLB file|Description
2297|`$(SML_LIB)/ckit-lib/ckit-lib.mlb`|
2298|=====
2299
2300* If you are porting a project from SML/NJ's <:CompilationManager:> to
2301MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
2302following map is included by default:
2303+
2304----
2305# ckit Library
2306$ckit-lib.cm $(SML_LIB)/ckit-lib
2307$ckit-lib.cm/ckit-lib.cm $(SML_LIB)/ckit-lib/ckit-lib.mlb
2308----
2309+
2310This will automatically convert a `$/ckit-lib.cm` import in an input
2311`.cm` file into a `$(SML_LIB)/ckit-lib/ckit-lib.mlb` import in the
2312output `.mlb` file.
2313
2314== Details ==
2315
2316The following changes were made to the ckit Library, in addition to
2317deriving the `.mlb` file from the `.cm` file:
2318
2319* `ast/pp/pp-ast-adornment-sig.sml` (modified): Rewrote use of `signature` in `local`.
2320* `ast/pp/pp-ast-ext-sig.sml` (modified): Rewrote use of `signature` in `local`.
2321* `ast/type-util-sig.sml` (modified): Rewrote use of `signature` in `local`.
2322* `parser/parse-tree-sig.sml` (modified): Rewrote use of (sequential) `withtype` in signature.
2323* `parser/parse-tree.sml` (modified): Rewrote use of (sequential) `withtype`.
2324
2325== Patch ==
2326
2327* <!ViewGitFile(mlton,master,lib/ckit-lib/ckit.patch)>
2328
2329<<<
2330
2331:mlton-guide-page: Closure
2332[[Closure]]
2333Closure
2334=======
2335
2336A closure is a data structure that is the run-time representation of a
2337function.
2338
2339
2340== Typical Implementation ==
2341
2342In a typical implementation, a closure consists of a _code pointer_
2343(indicating what the function does) and an _environment_ containing
2344the values of the free variables of the function. For example, in the
2345expression
2346
2347[source,sml]
2348----
2349let
2350 val x = 5
2351in
2352 fn y => x + y
2353end
2354----
2355
2356the closure for `fn y => x + y` contains a pointer to a piece of code
2357that knows to take its argument and add the value of `x` to it, plus
2358the environment recording the value of `x` as `5`.
2359
2360To call a function, the code pointer is extracted and jumped to,
2361passing in some agreed upon location the environment and the argument.
2362
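The typical scheme can be mimicked in SML itself. The following is
only an illustrative sketch: the names `env`, `closure`, `addCode`,
and `call` are made up for this example, and a real implementation
would work with raw code addresses rather than SML function values.

[source,sml]
----
(* Environment: the values of the free variables (here just x). *)
type env = {x : int}

(* A closure pairs a code pointer with its environment. *)
type closure = {code : env * int -> int, env : env}

(* The code for `fn y => x + y`, with x fetched from the environment. *)
fun addCode ({x} : env, y : int) : int = x + y

val clo : closure = {code = addCode, env = {x = 5}}

(* Calling a closure: extract the code, then pass it the environment
   and the argument. *)
fun call ({code, env} : closure, arg : int) : int = code (env, arg)

val eight = call (clo, 3)  (* 5 + 3 *)
----
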
2363
2364== MLton's Implementation ==
2365
2366MLton does not implement closures traditionally. Instead, based on
2367whole-program higher-order control-flow analysis, MLton represents a
2368function as an element of a sum type, where the variant indicates
2369which function it is and carries the free variables as arguments. See
2370<:ClosureConvert:> and <!Cite(CejtinEtAl00)> for details.
2371
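A hand-written sketch of the same idea follows. It assumes,
hypothetically, that the control-flow analysis found exactly two
functions reaching a given call site; the datatype `lam` and the
function `apply` are invented here and are not MLton's actual output.

[source,sml]
----
(* Original higher-order code. *)
fun mkAdder (x : int) = fn y => x + y
fun mkScaler (k : int) = fn y => k * y

(* First-order version: one variant per function, carrying its free
   variables. *)
datatype lam = Adder of int | Scaler of int

(* A call becomes a case dispatch on the variant. *)
fun apply (Adder x, y) = x + y
  | apply (Scaler k, y) = k * y

val a = mkAdder 5 3         (* 8, via an ordinary closure *)
val b = apply (Adder 5, 3)  (* 8, via the sum-type representation *)
----
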
2372<<<
2373
2374:mlton-guide-page: ClosureConvert
2375[[ClosureConvert]]
2376ClosureConvert
2377==============
2378
2379<:ClosureConvert:> is a translation pass from the <:SXML:>
2380<:IntermediateLanguage:> to the <:SSA:> <:IntermediateLanguage:>.
2381
2382== Description ==
2383
2384It converts an <:SXML:> program into an <:SSA:> program.
2385
2386<:Defunctionalization:> is the technique used to eliminate
2387<:Closure:>s (see <!Cite(CejtinEtAl00)>).
2388
2389Uses <:Globalize:> and <:LambdaFree:> analyses.
2390
2391== Implementation ==
2392
2393* <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.sig)>
2394* <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.fun)>
2395
2396== Details and Notes ==
2397
2398{empty}
2399
2400<<<
2401
2402:mlton-guide-page: CMinusMinus
2403[[CMinusMinus]]
2404CMinusMinus
2405===========
2406
2407http://cminusminus.org[C--] is a portable assembly language intended
2408to make it easy for compilers for different high-level languages to
2409share the same backend. An experimental version of MLton has been
2410made to generate C--.
2411
2412* http://www.mlton.org/pipermail/mlton/2005-March/026850.html
2413
2414== Also see ==
2415
2416 * <:LLVM:>
2417
2418<<<
2419
2420:mlton-guide-page: Codegen
2421[[Codegen]]
2422Codegen
2423=======
2424
2425<:Codegen:> is a translation pass from the <:Machine:>
2426<:IntermediateLanguage:> to one or more compilation units that can be
2427compiled to native object code by an external tool.
2428
2429== Implementation ==
2430
2431* <!ViewGitDir(mlton,master,mlton/codegen)>
2432
2433== Details and Notes ==
2434
2435The following <:Codegen:codegens> are implemented:
2436
2437* <:AMD64Codegen:>
2438* <:CCodegen:>
2439* <:LLVMCodegen:>
2440* <:X86Codegen:>
2441
2442<<<
2443
2444:mlton-guide-page: CombineConversions
2445[[CombineConversions]]
2446CombineConversions
2447==================
2448
2449<:CombineConversions:> is an optimization pass for the <:SSA:>
2450<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2451
2452== Description ==
2453
2454This pass looks for and simplifies nested calls to (signed)
2455extension/truncation.
2456
2457== Implementation ==
2458
2459* <!ViewGitFile(mlton,master,mlton/ssa/combine-conversions.fun)>
2460
2461== Details and Notes ==
2462
It processes each block in DFS order (visiting definitions before uses):
2464
2465* If the statement is not a `PrimApp` with `Word_extdToWord`, skip it.
2466* After processing a conversion, it tags the `Var` for subsequent use.
2467* When inspecting a conversion, check if the `Var` operand is also the
2468result of a conversion. If it is, try to combine the two operations.
2469Repeatedly simplify until hitting either a non-conversion `Var` or a
2470case where the conversion cannot be simplified.
2471
2472The optimization rules are very simple:
2473----
2474x1 = ...
2475x2 = Word_extdToWord (W1, W2, {signed=s1}) x1
2476x3 = Word_extdToWord (W2, W3, {signed=s2}) x2
2477----
2478
* If `W1 = W2`, then there are no conversions before `x1`.
2480+
2481This is guaranteed because `W2 = W3` will always trigger optimization.
2482
2483* Case `W1 <= W3 <= W2`:
2484+
2485----
2486x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
2487----
2488
2489* Case `W1 < W2 < W3 AND ((NOT s1) OR s2)`:
2490+
2491----
2492x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
2493----
2494
2495* Case `W1 = W2 < W3`:
2496+
2497unoptimized, because there are no conversions past `W1` and `x2 = x1`
2498
2499* Case `W3 <= W2 <= W1 OR W3 <= W1 <= W2`:
2500+
2501----
x3 = Word_extdToWord (W1, W3, {signed=_}) x1
2503----
2504+
2505because `W3 <= W1 && W3 <= W2`, just clip `x1`
2506
2507* Case `W2 < W1 <= W3 OR W2 < W3 <= W1`:
2508+
2509unoptimized, because `W2 < W1 && W2 < W3`, has truncation effect
2510
2511* Case `W1 < W2 < W3 AND (s1 AND (NOT s2))`:
2512+
2513unoptimized, because each conversion affects the result separately
2514
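The case analysis above can be summarized as a small decision
function. This is a sketch only: the names `result`, `Combined`,
`Unoptimized`, and `combine` are hypothetical, widths are plain
integers, and the actual pass works on <:SSA:> `PrimApp` statements
rather than on records.

[source,sml]
----
datatype result =
   Combined of {from : int, to : int, signed : bool}
 | Unoptimized

fun combine {w1 : int, w2 : int, w3 : int, s1 : bool, s2 : bool} : result =
   if w1 <= w3 andalso w3 <= w2
      (* extend, then truncate to a width no narrower than the start *)
      then Combined {from = w1, to = w3, signed = s1}
   else if w1 < w2 andalso w2 < w3 andalso (not s1 orelse s2)
      (* two extensions whose signedness composes *)
      then Combined {from = w1, to = w3, signed = s1}
   else if (w3 <= w2 andalso w2 <= w1) orelse (w3 <= w1 andalso w1 <= w2)
      (* the result is no wider than x1: a single truncation, for which
         the signedness flag is irrelevant (false is an arbitrary choice) *)
      then Combined {from = w1, to = w3, signed = false}
   else Unoptimized

val r1 = combine {w1 = 8, w2 = 32, w3 = 16, s1 = true, s2 = false}
(* Combined {from = 8, to = 16, signed = true} *)

val r2 = combine {w1 = 8, w2 = 16, w3 = 32, s1 = true, s2 = false}
(* Unoptimized: a signed extension followed by an unsigned one *)
----
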
2515<<<
2516
2517:mlton-guide-page: CommonArg
2518[[CommonArg]]
2519CommonArg
2520=========
2521
2522<:CommonArg:> is an optimization pass for the <:SSA:>
2523<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2524
2525== Description ==
2526
2527It optimizes instances of `Goto` transfers that pass the same
2528arguments to the same label; e.g.
2529----
2530L_1 ()
2531 ...
2532 z1 = ?
2533 ...
2534 L_3 (x, y, z1)
2535L_2 ()
2536 ...
2537 z2 = ?
2538 ...
2539 L_3 (x, y, z2)
2540L_3 (a, b, c)
2541 ...
2542----
2543
2544This code can be simplified to:
2545----
2546L_1 ()
2547 ...
2548 z1 = ?
2549 ...
2550 L_3 (z1)
2551L_2 ()
2552 ...
2553 z2 = ?
2554 ...
2555 L_3 (z2)
2556L_3 (c)
2557 a = x
2558 b = y
2559----
2560which saves a number of resources: time of setting up the arguments
2561for the jump to `L_3`, space (either stack or pseudo-registers) for
2562the arguments of `L_3`, etc. It may also expose some other
2563optimizations, if more information is known about `x` or `y`.
2564
2565== Implementation ==
2566
2567* <!ViewGitFile(mlton,master,mlton/ssa/common-arg.fun)>
2568
2569== Details and Notes ==
2570
2571Three analyses were originally proposed to drive the optimization
2572transformation. Only the _Dominator Analysis_ is currently
2573implemented. (Implementations of the other analyses are available in
2574the <:Sources:repository history>.)
2575
2576=== Syntactic Analysis ===
2577
2578The simplest analysis I could think of maintains
2579----
2580varInfo: Var.t -> Var.t option list ref
2581----
2582initialized to `[]`.
2583
2584* For each variable `v` bound in a `Statement.t` or in the
2585`Function.t` args, then `List.push(varInfo v, NONE)`.
2586* For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the
2587formals of `L`, then `List.push(varInfo ai, SOME xi)`.
2588* For each block argument a used in an unknown context (e.g.,
2589arguments of blocks used as continuations, handlers, arith success,
2590runtime return, or case switch labels), then
2591`List.push(varInfo a, NONE)`.
2592
2593Now, any block argument `a` such that `varInfo a = xs`, where all of
2594the elements of `xs` are equal to `SOME x`, can be optimized by
2595setting `a = x` at the beginning of the block and dropping the
2596argument from `Goto` transfers.
2597
2598That takes care of the example above. We can clearly do slightly
2599better, by changing the transformation criteria to the following: any
block argument `a` such that `varInfo a = xs`, where all of the elements
2601of `xs` are equal to `SOME x` _or_ are equal to `SOME a`, can be
2602optimized by setting `a = x` at the beginning of the block and
2603dropping the argument from `Goto` transfers. This optimizes a case
2604like:
2605----
2606L_1 ()
2607 ... z1 = ? ...
2608 L_3 (x, y, z1)
2609L_2 ()
2610 ... z2 = ? ...
2611 L_3(x, y, z2)
2612L_3 (a, b, c)
2613 ... w = ? ...
2614 case w of
2615 true => L_4 | false => L_5
2616L_4 ()
2617 ...
2618 L_3 (a, b, w)
2619L_5 ()
2620 ...
2621----
2622where a common argument is passed to a loop (and is invariant through
2623the loop). Of course, the <:LoopInvariant:> optimization pass would
2624normally introduce a local loop and essentially reduce this to the
2625first example, but I have seen this in practice, which suggests that
2626some optimizations after <:LoopInvariant:> do enough simplifications
2627to introduce (new) loop invariant arguments.
2628
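A sketch of this analysis follows, using an association list in place
of the `Var.t -> Var.t option list ref` property. Variables are plain
strings, and `push`, `lookup`, `analyze`, and `common` are
hypothetical helper names rather than MLton functions. The `common`
function implements the refined criterion, in which entries equal to
`SOME a` are ignored.

[source,sml]
----
type var = string
type info = (var * var option list) list

(* Record one more entry for variable v. *)
fun push (info : info, v : var, entry : var option) : info =
   case List.partition (fn (k, _) => k = v) info of
      ((_, es) :: _, rest) => (v, entry :: es) :: rest
    | ([], rest) => (v, [entry]) :: rest

fun lookup (info : info, v : var) : var option list =
   case List.find (fn (k, _) => k = v) info of
      SOME (_, es) => es
    | NONE => []

(* bound:   variables bound by statements or function arguments
   passes:  (actual, formal) pairs taken from Goto transfers
   unknown: block arguments used in an unknown context *)
fun analyze {bound : var list, passes : (var * var) list,
             unknown : var list} : info =
   let
      val info = foldl (fn (v, i) => push (i, v, NONE)) [] bound
      val info = foldl (fn ((x, a), i) => push (i, a, SOME x)) info passes
   in
      foldl (fn (a, i) => push (i, a, NONE)) info unknown
   end

(* SOME x if block argument a can be replaced by x. *)
fun common (info : info, a : var) : var option =
   case List.filter (fn e => e <> SOME a) (lookup (info, a)) of
      SOME x :: rest =>
         if List.all (fn e => e = SOME x) rest then SOME x else NONE
    | _ => NONE

(* First example: L_3 (a, b, c) is reached with (x, y, z1) and (x, y, z2). *)
val info =
   analyze {bound = ["x", "y", "z1", "z2"],
            passes = [("x", "a"), ("y", "b"), ("z1", "c"),
                      ("x", "a"), ("y", "b"), ("z2", "c")],
            unknown = []}
val a' = common (info, "a")  (* SOME "x" *)
val c' = common (info, "c")  (* NONE *)
----
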
2629=== Fixpoint Analysis ===
2630
2631However, the above analysis and transformation doesn't cover the cases
2632where eliminating one common argument exposes the opportunity to
2633eliminate other common arguments. For example:
2634----
2635L_1 ()
2636 ...
2637 L_3 (x)
2638L_2 ()
2639 ...
2640 L_3 (x)
2641L_3 (a)
2642 ...
2643 L_5 (a)
2644L_4 ()
2645 ...
2646 L_5 (x)
2647L_5 (b)
2648 ...
2649----
2650
2651One pass of analysis and transformation would eliminate the argument
2652to `L_3` and rewrite the `L_5(a)` transfer to `L_5 (x)`, thereby
2653exposing the opportunity to eliminate the common argument to `L_5`.
2654
The interdependency between the arguments to `L_3` and `L_5` suggests
2656performing some sort of fixed-point analysis. This analysis is
2657relatively simple; maintain
2658----
2659varInfo: Var.t -> VarLattice.t
2660----
2661{empty}where
2662----
2663VarLattice.t ~=~ Bot | Point of Var.t | Top
2664----
2665(but is implemented by the <:FlatLattice:> functor with a `lessThan`
2666list and `value ref` under the hood), initialized to `Bot`.
2667
2668* For each variable `v` bound in a `Statement.t` or in the
`Function.t` args, then `VarLattice.<= (Point v, varInfo v)`.
* For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the
formals of `L`, then `VarLattice.<= (varInfo xi, varInfo ai)`.
* For each block argument `a` used in an unknown context, then
2673`VarLattice.<= (Point a, varInfo a)`.
2674
Now, any block argument `a` such that `varInfo a = Point x` can be
2676optimized by setting `a = x` at the beginning of the block and
2677dropping the argument from `Goto` transfers.
2678
2679Now, with the last example, we introduce the ordering constraints:
2680----
2681varInfo x <= varInfo a
2682varInfo a <= varInfo b
2683varInfo x <= varInfo b
2684----
2685
2686Assuming that `varInfo x = Point x`, then we get `varInfo a = Point x`
2687and `varInfo b = Point x`, and we optimize the example as desired.
2688
2689But, that is a rather weak assumption. It's quite possible for
2690`varInfo x = Top`. For example, consider:
2691----
2692G_1 ()
2693 ... n = 1 ...
2694 L_0 (n)
2695G_2 ()
2696 ... m = 2 ...
2697 L_0 (m)
2698L_0 (x)
2699 ...
2700L_1 ()
2701 ...
2702 L_3 (x)
2703L_2 ()
2704 ...
2705 L_3 (x)
2706L_3 (a)
2707 ...
2708 L_5(a)
2709L_4 ()
2710 ...
2711 L_5(x)
2712L_5 (b)
2713 ...
2714----
2715
2716Now `varInfo x = varInfo a = varInfo b = Top`. What went wrong here?
2717When `varInfo x` went to `Top`, it got propagated all the way through
2718to `a` and `b`, and prevented the elimination of any common arguments.
2719What we'd like to do instead is when `varInfo x` goes to `Top`,
2720propagate on `Point x` -- we have no hope of eliminating `x`, but if
2721we hold `x` constant, then we have a chance of eliminating arguments
2722for which `x` is passed as an actual.
2723
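Both outcomes, the successful one and the propagation of `Top`, can
be reproduced with a small solver. This is a sketch under simplifying
assumptions: variables are strings, the lattice is a plain datatype
rather than the <:FlatLattice:> functor, and `join`, `get`, `raiseTo`,
and `solve` are hypothetical names.

[source,sml]
----
datatype lat = Bot | Point of string | Top

fun join (Bot, l) = l
  | join (l, Bot) = l
  | join (Point x, Point y) = if x = y then Point x else Top
  | join _ = Top

type env = (string * lat) list

fun get (env : env, v : string) : lat =
   case List.find (fn (k, _) => k = v) env of
      SOME (_, l) => l
    | NONE => Bot

(* Enforce  l <= varInfo v  by joining l into the current value. *)
fun raiseTo (env : env, v : string, l : lat) : env =
   (v, join (get (env, v), l)) :: List.filter (fn (k, _) => k <> v) env

(* points: variables v with the constraint  Point v <= varInfo v
   edges:  (x, a) pairs with the constraint  varInfo x <= varInfo a *)
fun solve {points : string list, edges : (string * string) list} : env =
   let
      val init = foldl (fn (v, e) => raiseTo (e, v, Point v)) [] points
      fun step env =
         foldl (fn ((x, a), e) => raiseTo (e, a, get (e, x))) env edges
      (* Only edge targets can change during a step. *)
      fun stable (e, e') =
         List.all (fn (_, a) => get (e, a) = get (e', a)) edges
      fun loop env =
         let val env' = step env
         in if stable (env, env') then env' else loop env'
         end
   in
      loop init
   end

(* First example: x is bound once, so a and b collapse to Point x. *)
val e1 = solve {points = ["x"],
                edges = [("x", "a"), ("a", "b"), ("x", "b")]}
val a1 = get (e1, "a")  (* Point "x" *)
val b1 = get (e1, "b")  (* Point "x" *)

(* Second example: x receives both n and m, so Top spreads to a and b. *)
val e2 = solve {points = ["n", "m"],
                edges = [("n", "x"), ("m", "x"),
                         ("x", "a"), ("a", "b"), ("x", "b")]}
val b2 = get (e2, "b")  (* Top *)
----
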
2724=== Dominator Analysis ===
2725
2726Does anyone see where this is going yet? Pausing for a little
2727thought, <:MatthewFluet:> realized that he had once before tried
2728proposing this kind of "fix" to a fixed-point analysis -- when we were
2729first investigating the <:Contify:> optimization in light of John
2730Reppy's CWS paper. Of course, that "fix" failed because it defined a
2731non-monotonic function and one couldn't take the fixed point. But,
2732<:StephenWeeks:> suggested a dominator based approach, and we were
2733able to show that, indeed, the dominator analysis subsumed both the
2734previous call based analysis and the cont based analysis. And, a
2735moment's reflection reveals further parallels: when
2736`varInfo: Var.t -> Var.t option list ref`, we have something analogous
2737to the call analysis, and when `varInfo: Var.t -> VarLattice.t`, we
2738have something analogous to the cont analysis. Maybe there is
2739something analogous to the dominator approach (and therefore superior
2740to the previous analyses).
2741
2742And this turns out to be the case. Construct the graph `G` as follows:
2743----
2744nodes(G) = {Root} U Var.t
2745edges(G) = {Root -> v | v bound in a Statement.t or
2746 in the Function.t args} U
2747 {xi -> ai | L(x1, ..., xn) transfer where (a1, ..., an)
2748 are the formals of L} U
2749 {Root -> a | a is a block argument used in an unknown context}
2750----
2751
2752Let `idom(x)` be the immediate dominator of `x` in `G` with root
`Root`. Now, any block argument `a` such that `idom(a) = x <> Root` can
2754be optimized by setting `a = x` at the beginning of the block and
2755dropping the argument from `Goto` transfers.
2756
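The construction can be tried out with a deliberately naive dominator
computation that iterates dominator sets to a fixed point. This is a
sketch, not the algorithm used by the pass itself: nodes are strings,
and `mem`, `get`, `dominators`, and `idom` are hypothetical names. The
graph encodes the last example of the fixpoint section, where `x` is
itself a join point.

[source,sml]
----
type node = string
val root : node = "Root"

fun mem (x : node, xs : node list) = List.exists (fn y => y = x) xs

fun get (env : (node * node list) list, n : node) : node list =
   case List.find (fn (k, _) => k = n) env of
      SOME (_, ds) => ds
    | NONE => []

(* dom(Root) = {Root};  dom(n) = {n} U intersection of dom(p) over preds p.
   Every set is kept as a sublist of nodes, so list equality is set equality. *)
fun dominators (nodes : node list, edges : (node * node) list) =
   let
      fun preds n =
         List.mapPartial (fn (p, s) => if s = n then SOME p else NONE) edges
      fun step env =
         map (fn (n, ds) =>
                if n = root then (n, ds)
                else
                   let
                      val meet =
                         foldl (fn (p, acc) =>
                                  List.filter (fn d => mem (d, get (env, p))) acc)
                               nodes (preds n)
                   in
                      (n, List.filter (fn d => d = n orelse mem (d, meet)) nodes)
                   end)
             env
      fun loop env =
         let val env' = step env
         in if env' = env then env else loop env'
         end
   in
      loop (map (fn n => (n, if n = root then [root] else nodes)) nodes)
   end

(* The immediate dominator d of n satisfies dom(d) = dom(n) \ {n}. *)
fun idom (env, n : node) : node option =
   let
      val strict = List.filter (fn d => d <> n) (get (env, n))
   in
      List.find (fn d => length (get (env, d)) = length strict) strict
   end

val nodes = ["Root", "n", "m", "x", "a", "b"]
val edges = [("Root", "n"), ("Root", "m"),  (* n and m are bound by statements *)
             ("n", "x"), ("m", "x"),        (* L_0 (n) and L_0 (m) *)
             ("x", "a"),                    (* L_3 (x) *)
             ("a", "b"), ("x", "b")]        (* L_5 (a) and L_5 (x) *)
val env = dominators (nodes, edges)
val ia = idom (env, "a")  (* SOME "x": a can be replaced by x *)
val ib = idom (env, "b")  (* SOME "x": b can be replaced by x *)
val ix = idom (env, "x")  (* SOME "Root": x itself must stay *)
----
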
2757Furthermore, experimental evidence suggests (and we are confident that
2758a formal presentation could prove) that the dominator analysis
2759subsumes the "syntactic" and "fixpoint" based analyses in this context
2760as well and that the dominator analysis gets "everything" in one go.
2761
2762=== Final Thoughts ===
2763
2764I must admit, I was rather surprised at this progression and final
2765result. At the outset, I never would have thought of a connection
2766between <:Contify:> and <:CommonArg:> optimizations. They would seem
2767to be two completely different optimizations. Although, this may not
2768really be the case. As one of the reviewers of the ICFP paper said:
2769____
2770I understand that such a form of CPS might be convenient in some
2771cases, but when we're talking about analyzing code to detect that some
2772continuation is constant, I think it makes a lot more sense to make
2773all the continuation arguments completely explicit.
2774
2775I believe that making all the continuation arguments explicit will
2776show that the optimization can be generalized to eliminating constant
2777arguments, whether continuations or not.
2778____
2779
2780What I think the common argument optimization shows is that the
2781dominator analysis does slightly better than the reviewer puts it: we
2782find more than just constant continuations, we find common
2783continuations. And I think this is further justified by the fact that
2784I have observed common argument eliminate some `env_X` arguments which
2785would appear to correspond to determining that while the closure being
2786executed isn't constant it is at least the same as the closure being
2787passed elsewhere.
2788
2789At first, I was curious whether or not we had missed a bigger picture
2790with the dominator analysis. When we wrote the contification paper, I
2791assumed that the dominator analysis was a specialized solution to a
2792specialized problem; we never suggested that it was a technique suited
2793to a larger class of analyses. After initially finding a connection
2794between <:Contify:> and <:CommonArg:> (and thinking that the only
2795connection was the technique), I wondered if the dominator technique
2796really was applicable to a larger class of analyses. That is still a
2797question, but after writing up the above, I'm suspecting that the
2798"real story" is that the dominator analysis is a solution to the
2799common argument optimization, and that the <:Contify:> optimization is
2800specializing <:CommonArg:> to the case of continuation arguments (with
2801a different transformation at the end). (Note, a whole-program,
2802inter-procedural common argument analysis doesn't really make sense
2803(in our <:SSA:> <:IntermediateLanguage:>), because the only way of
2804passing values between functions is as arguments. (Unless of course
2805in the case that the common argument is also a constant argument, in
2806which case <:ConstantPropagation:> could lift it to a global.) The
2807inter-procedural <:Contify:> optimization works out because there we
2808move the function to the argument.)
2809
2810Anyways, it's still unclear to me whether or not the dominator based
2811approach solves other kinds of problems.
2812
2813=== Phase Ordering ===
2814
2815On the downside, the optimization doesn't have a huge impact on
runtime, although it does predictably save some code size. I stuck
2817it in the optimization sequence after <:Flatten:> and (the third round
2818of) <:LocalFlatten:>, since it seems to me that we could have cases
2819where some components of a tuple used as an argument are common, but
2820the whole tuple isn't. I think it makes sense to add it after
2821<:IntroduceLoops:> and <:LoopInvariant:> (even though <:CommonArg:>
gets some things that <:LoopInvariant:> gets, it doesn't get all of
2823them). I also think that it makes sense to add it before
2824<:CommonSubexp:>, since identifying variables could expose more common
2825subexpressions. I would think a similar thought applies to
2826<:RedundantTests:>.
2827
2828<<<
2829
2830:mlton-guide-page: CommonBlock
2831[[CommonBlock]]
2832CommonBlock
2833===========
2834
2835<:CommonBlock:> is an optimization pass for the <:SSA:>
2836<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2837
2838== Description ==
2839
2840It eliminates equivalent blocks in a <:SSA:> function. The
2841equivalence criteria requires blocks to have no arguments or
2842statements and transfer via `Raise`, `Return`, or `Goto` of a single
2843global variable.
2844
2845== Implementation ==
2846
2847* <!ViewGitFile(mlton,master,mlton/ssa/common-block.fun)>
2848
2849== Details and Notes ==
2850
2851* Rewrites
2852+
2853----
2854L_X ()
2855 raise (global_Y)
2856----
2857+
2858to
2859+
2860----
2861L_X ()
2862 L_Y' ()
2863----
2864+
2865and adds
2866+
2867----
2868L_Y' ()
2869 raise (global_Y)
2870----
2871+
2872to the <:SSA:> function.
2873
2874* Rewrites
2875+
2876----
2877L_X ()
2878 return (global_Y)
2879----
2880+
2881to
2882+
2883----
2884L_X ()
2885 L_Y' ()
2886----
2887+
2888and adds
2889+
2890----
2891L_Y' ()
2892 return (global_Y)
2893----
2894+
2895to the <:SSA:> function.
2896
2897* Rewrites
2898+
2899----
2900L_X ()
2901 L_Z (global_Y)
2902----
2903+
2904to
2905+
2906----
2907L_X ()
2908 L_Y' ()
2909----
2910+
2911and adds
2912+
2913----
2914L_Y' ()
2915 L_Z (global_Y)
2916----
2917+
2918to the <:SSA:> function.
2919
2920The <:Shrink:> pass rewrites all uses of `L_X` to `L_Y'` and drops `L_X`.
2921
For example, all uncaught `Overflow` exceptions in an <:SSA:> function
2923share the same raising block.
2924
2925<<<
2926
2927:mlton-guide-page: CommonSubexp
2928[[CommonSubexp]]
2929CommonSubexp
2930============
2931
2932<:CommonSubexp:> is an optimization pass for the <:SSA:>
2933<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
2934
2935== Description ==
2936
2937It eliminates instances of common subexpressions.
2938
2939== Implementation ==
2940
2941* <!ViewGitFile(mlton,master,mlton/ssa/common-subexp.fun)>
2942
2943== Details and Notes ==
2944
2945In addition to getting the usual sorts of things like
2946
2947* {empty}
2948+
2949----
2950(w + 0wx1) + (w + 0wx1)
2951----
2952+
2953rewritten to
2954+
2955----
2956let val w' = w + 0wx1 in w' + w' end
2957----
2958
2959it also gets things like
2960
2961* {empty}
2962+
2963----
2964val a = Array_uninit n
2965val b = Array_length a
2966----
2967+
2968rewritten to
2969+
2970----
2971val a = Array_uninit n
2972val b = n
2973----
2974
2975`Arith` transfers are handled specially. The _result_ of an `Arith`
2976transfer can be used in _common_ `Arith` transfers that it dominates:
2977
2978* {empty}
2979+
2980----
2981val l = (n + m) + (n + m)
2982
2983val k = (l + n) + ((l + m) handle Overflow => ((l + m)
2984 handle Overflow => l + n))
2985----
2986+
2987is rewritten so that `(n + m)` is computed exactly once, as are
2988`(l + n)` and `(l + m)`.
2989
2990<<<
2991
2992:mlton-guide-page: CompilationManager
2993[[CompilationManager]]
2994CompilationManager
2995==================
2996
2997The http://www.smlnj.org/doc/CM/index.html[Compilation Manager] (CM) is SML/NJ's mechanism for supporting programming-in-the-very-large.
2998
2999== Porting SML/NJ CM files to MLton ==
3000
3001To help in porting CM files to MLton, the MLton source distribution
3002includes the sources for a utility, `cm2mlb`, that will print an
3003<:MLBasis: ML Basis> file with essentially the same semantics as the
3004CM file -- handling the full syntax of CM supported by your installed
3005SML/NJ version and correctly handling export filters. When `cm2mlb`
3006encounters a `.cm` import, it attempts to convert it to a
3007corresponding `.mlb` import. CM anchored paths are translated to
3008paths according to a default configuration file
3009(<!ViewGitFile(mlton,master,util/cm2mlb/cm2mlb-map)>). For example,
3010the default configuration includes
3011----
3012# Standard ML Basis Library
3013$SMLNJ-BASIS $(SML_LIB)/basis
3014$basis.cm $(SML_LIB)/basis
3015$basis.cm/basis.cm $(SML_LIB)/basis/basis.mlb
3016----
3017to ensure that a `$/basis.cm` import is translated to a
3018`$(SML_LIB)/basis/basis.mlb` import. See `util/cm2mlb` for details.
3019Building `cm2mlb` requires that you have already installed a recent
3020version of SML/NJ.
3021
3022<<<
3023
3024:mlton-guide-page: CompilerOverview
3025[[CompilerOverview]]
3026CompilerOverview
3027================
3028
3029The following table shows the overall structure of the compiler.
3030<:IntermediateLanguage:>s are shown in the center column. The names
3031of compiler passes are listed in the left and right columns.
3032
[align="center",width="50%",cols="^,^,^"]
3034|====
30353+^| *Compiler Overview*
3036| _Translation Passes_ | _<:IntermediateLanguage:>_ | _Optimization Passes_
3037| | Source |
3038| <:FrontEnd:> | |
3039| | <:AST:> |
3040| <:Elaborate:> | |
3041| | <:CoreML:> | <:CoreMLSimplify:>
3042| <:Defunctorize:> | |
3043| | <:XML:> | <:XMLSimplify:>
3044| <:Monomorphise:> | |
3045| | <:SXML:> | <:SXMLSimplify:>
3046| <:ClosureConvert:> | |
3047| | <:SSA:> | <:SSASimplify:>
3048| <:ToSSA2:> | |
3049| | <:SSA2:> | <:SSA2Simplify:>
3050| <:ToRSSA:> | |
3051| | <:RSSA:> | <:RSSASimplify:>
3052| <:ToMachine:> | |
3053| | <:Machine:> |
3054| <:Codegen:> | |
3055|====
3056
3057The `Compile` functor (<!ViewGitFile(mlton,master,mlton/main/compile.sig)>,
<!ViewGitFile(mlton,master,mlton/main/compile.fun)>) controls the
3059high-level view of the compiler passes, from <:FrontEnd:> to code
3060generation.
3061
3062<<<
3063
3064:mlton-guide-page: CompilerPassTemplate
3065[[CompilerPassTemplate]]
3066CompilerPassTemplate
3067====================
3068
3069An analysis pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>.
3070An implementation pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>.
3071An optimization pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>.
3072A rewrite pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>.
3073A translation pass from the <:ZZA:> <:IntermediateLanguage:> to the <:ZZB:> <:IntermediateLanguage:>.
3074
3075== Description ==
3076
3077A short description of the pass.
3078
3079== Implementation ==
3080
3081* <!ViewGitFile(mlton,master,mlton/ZZZ.fun)>
3082
3083== Details and Notes ==
3084
3085Relevant details and notes.
3086
3087<<<
3088
3089:mlton-guide-page: CompileTimeOptions
3090[[CompileTimeOptions]]
3091CompileTimeOptions
3092==================
3093
3094MLton's compile-time options control the name of the output file, the
3095verbosity of compile-time messages, and whether or not certain
3096optimizations are performed. They also can specify which intermediate
3097files are saved and can stop the compilation process early, at some
3098intermediate pass, in which case compilation can be resumed by passing
3099the generated files to MLton. MLton uses the input file suffix to
3100determine the type of input program. The possibilities are `.c`,
3101`.mlb`, `.o`, `.s`, and `.sml`.
3102
3103With no arguments, MLton prints the version number and exits. For a
3104usage message, run MLton with an invalid switch, e.g. `mlton -z`. In
3105the explanation below and in the usage message, for flags that take a
3106number of choices (e.g. `{true|false}`), the first value listed is the
3107default.
3108
3109
3110== Options ==
3111
3112* ++-align __n__++
3113+
Aligns objects in memory by the specified alignment (+4+ or +8+).
3115The default varies depending on architecture.
3116
3117* ++-as-opt __option__++
3118+
3119Pass _option_ to `gcc` when compiling assembler code. If you wish to
3120pass an option to the assembler, you must use `gcc`'s `-Wa,` syntax.
3121
3122* ++-cc-opt __option__++
3123+
3124Pass _option_ to `gcc` when compiling C code.
3125
3126* ++-codegen {native|amd64|c|llvm|x86}++
3127+
Generate native object code via amd64 assembly, C code, LLVM code, or
x86 assembly. With `-codegen native` (`-codegen amd64` or
`-codegen x86`), MLton typically compiles more quickly and generates
better code.
3132
3133* ++-const __name__ __value__++
3134+
3135Set the value of a compile-time constant. Here is a list of
3136available constants, their default values, and what they control.
3137+
3138** ++Exn.keepHistory {false|true}++
3139+
3140Enable `MLton.Exn.history`. See <:MLtonExn:> for details. There is a
3141performance cost to setting this to `true`, both in memory usage of
3142exceptions and in run time, because of additional work that must be
3143performed at each exception construction, raise, and handle.
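+
A minimal sketch of its use, assuming the program is compiled with
`-const 'Exn.keepHistory true'` (the function name `f` is arbitrary);
without that setting the history list is expected to be empty:
+
[source,sml]
----
fun f () : unit = raise Fail "boom"

(* Print each recorded source position, one per line. *)
val () =
   f () handle e =>
      List.app (fn s => print (s ^ "\n")) (MLton.Exn.history e)
----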
3144
3145* ++-default-ann __ann__++
3146+
3147Specify default <:MLBasisAnnotations:ML Basis annotations>. For
3148example, `-default-ann 'warnUnused true'` causes unused variable
3149warnings to be enabled by default. A default is overridden by the
3150corresponding annotation in an ML Basis file.
3151
3152* ++-default-type __type__++
3153+
3154Specify the default binding for a primitive type. For example,
3155`-default-type word64` causes the top-level type `word` and the
3156top-level structure `Word` in the <:BasisLibrary:Basis Library> to be
3157equal to `Word64.word` and `Word64:WORD`, respectively. Similarly,
3158`-default-type intinf` causes the top-level type `int` and the
3159top-level structure `Int` in the <:BasisLibrary:Basis Library> to be
3160equal to `IntInf.int` and `IntInf:INTEGER`, respectively.
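+
As an illustration, the following one-line program (a hypothetical
`size.sml`; the file name is only for this example) makes the effect
visible: compiled normally it reports MLton's default word size, and
compiled with `-default-type word64` it should report `64`.
+
[source,sml]
----
(* Word is the top-level structure rebound by -default-type wordN. *)
val () = print (Int.toString Word.wordSize ^ "\n")
----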
3161
3162* ++-disable-ann __ann__++
3163+
3164Ignore the specified <:MLBasisAnnotations:ML Basis annotation> in
3165every ML Basis file. For example, to see _all_ match and unused
3166warnings, compile with
3167+
3168----
3169-default-ann 'warnUnused true'
3170-disable-ann forceUsed
3171-disable-ann nonexhaustiveMatch
3172-disable-ann redundantMatch
3173-disable-ann warnUnused
3174----
3175
3176* ++-export-header __file__++
3177+
3178Write C prototypes to _file_ for all of the functions in the program
3179<:CallingFromCToSML:exported from SML to C>.
3180
3181* ++-ieee-fp {false|true}++
3182+
3183Cause the x86 native code generator to be pedantic about following the
3184IEEE floating point standard. By default, it is not, because of the
3185performance cost. This only has an effect with `-codegen x86`.
3186
3187* ++-inline __n__++
3188+
3189Set the inlining threshold used in the optimizer. The threshold is an
3190approximate measure of code size of a procedure. The default is
3191`320`.
3192
3193* ++-keep {g|o}++
3194+
3195Save intermediate files. If no `-keep` argument is given, then only
3196the output file is saved.
3197+
3198[cols="^25%,<75%"]
3199|====
3200| `g` | generated `.c` and `.s` files passed to `gcc` and generated `.ll` files passed to `llvm-as`
3201| `o` | object (`.o`) files
3202|====
3203
3204* ++-link-opt __option__++
3205+
3206Pass _option_ to `gcc` when linking. You can use this to specify
library search paths, e.g., `-link-opt -Lpath`, and libraries to link
with, e.g., `-link-opt -lfoo`, or even both at the same time,
e.g., `-link-opt '-Lpath -lfoo'`. If you wish to pass an option to the
3210linker, you must use `gcc`'s `-Wl,` syntax, e.g.,
3211`-link-opt '-Wl,--export-dynamic'`.
3212
3213* ++-llvm-as-opt __option__++
3214+
3215Pass _option_ to `llvm-as` when assembling (`.ll` to `.bc`) LLVM code.
3216
3217* ++-llvm-llc-opt __option__++
3218+
3219Pass _option_ to `llc` when compiling (`.bc` to `.o`) LLVM code.
3220
3221* ++-llvm-opt-opt __option__++
3222+
3223Pass _option_ to `opt` when optimizing (`.bc` to `.bc`) LLVM code.
3224
3225* ++-mlb-path-map __file__++
3226+
3227Use _file_ as an <:MLBasisPathMap:ML Basis path map> to define
3228additional MLB path variables. Multiple uses of `-mlb-path-map` and
3229`-mlb-path-var` are allowed, with variable definitions in later path
3230maps taking precedence over earlier ones.
3231
3232* ++-mlb-path-var __name__ __value__++
3233+
Define an additional MLB path variable. Multiple uses of
`-mlb-path-map` and `-mlb-path-var` are allowed, with later variable
definitions taking precedence over earlier ones.
3237
3238* ++-output __file__++
3239+
3240Specify the name of the final output file. The default name is the
3241input file name with its suffix removed and an appropriate, possibly
3242empty, suffix added.
3243
3244* ++-profile {no|alloc|count|time}++
3245+
3246Produce an executable that gathers <:Profiling: profiling> data. When
3247such an executable is run, it produces an `mlmon.out` file.
3248
3249* ++-profile-branch {false|true}++
3250+
3251If true, the profiler will separately gather profiling data for each
3252branch of a function definition, `case` expression, and `if`
3253expression.
3254
3255* ++-profile-stack {false|true}++
3256+
3257If `true`, the executable will gather profiling data for all functions
3258on the stack, not just the currently executing function. See
3259<:ProfilingTheStack:>.
3260
3261* ++-profile-val {false|true}++
3262+
3263If `true`, the profiler will separately gather profiling data for each
3264(expansive) `val` declaration.
3265
3266* ++-runtime __arg__++
3267+
3268Pass argument to the runtime system via `@MLton`. See
3269<:RunTimeOptions:>. The argument will be processed before other
3270`@MLton` command line switches. Multiple uses of `-runtime` are
3271allowed, and will pass all the arguments in order. If the same
3272runtime switch occurs more than once, then the last setting will take
3273effect. There is no need to supply the leading `@MLton` or the
3274trailing `--`; these will be supplied automatically.
3275+
3276An argument to `-runtime` may contain spaces, which will cause the
3277argument to be treated as a sequence of words by the runtime. For
example, the command line:
3279+
3280----
3281mlton -runtime 'ram-slop 0.4' foo.sml
3282----
3283+
3284will cause `foo` to run as if it had been called like:
3285+
3286----
3287foo @MLton ram-slop 0.4 --
3288----
3289+
3290An executable created with `-runtime stop` doesn't process any
3291`@MLton` arguments. This is useful to create an executable, e.g.,
3292`echo`, that must treat `@MLton` like any other command-line argument.
3293+
3294----
3295% mlton -runtime stop echo.sml
3296% echo @MLton --
3297@MLton --
3298----
3299
3300* ++-show-basis __file__++
3301+
3302Pretty print to _file_ the basis defined by the input program. See
3303<:ShowBasis:>.
3304
3305* ++-show-def-use __file__++
3306+
3307Output def-use information to _file_. Each identifier that is defined
3308appears on a line, followed on subsequent lines by the position of
3309each use.
3310
3311* ++-stop {f|g|o|tc}++
3312+
3313Specify when to stop.
3314+
3315[cols="^25%,<75%"]
3316|====
3317| `f` | list of files on stdout (only makes sense when input is `foo.mlb`)
3318| `g` | generated `.c` and `.s` files
3319| `o` | object (`.o`) files
3320| `tc` | after type checking
3321|====
3322+
3323If you compile with `-stop g` or `-stop o`, you can resume compilation
3324by running MLton on the generated `.c` and `.s` or `.o` files.
3325
3326* ++-target {self|__...__}++
3327+
3328Generate an executable that runs on the specified platform. The
3329default is `self`, which means to compile for the machine that MLton
3330is running on. To use any other target, you must first install a
3331<:CrossCompiling: cross compiler>.
3332
3333* ++-target-as-opt __target__ __option__++
3334+
Like `-as-opt`, this passes _option_ to `gcc` when compiling
3336assembler code, except it only passes _option_ when the target
3337architecture, operating system, or arch-os pair is _target_.
3338
3339* ++-target-cc-opt __target__ __option__++
3340+
3341Like `-cc-opt`, this passes _option_ to `gcc` when compiling C code,
3342except it only passes _option_ when the target architecture, operating
3343system, or arch-os pair is _target_.
3344
3345* ++-target-link-opt __target__ __option__++
3346+
3347Like `-link-opt`, this passes _option_ to `gcc` when linking, except
3348it only passes _option_ when the target architecture, operating
3349system, or arch-os pair is _target_.
3350
3351* ++-verbose {0|1|2|3}++
3352+
3353How verbose to be about what passes are running. The default is `0`.
3354+
3355[cols="^25%,<75%"]
3356|====
3357| `0` | silent
3358| `1` | calls to compiler, assembler, and linker
3359| `2` | 1, plus intermediate compiler passes
3360| `3` | 2, plus some data structure sizes
3361|====
3362
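As an illustration of `-default-type`, here is a minimal sketch of a
program that reports how the top-level `int` and `word` types are
bound. The file name `sizes.sml` is hypothetical. The program uses
only `Int.precision` and `Word.wordSize` from the Basis Library, so
the values it prints depend on the `-default-type` flags given at
compile time (for example, `-default-type word64` or
`-default-type intinf`).

[source,sml]
----
(* sizes.sml (hypothetical name): report the default int and word sizes. *)
val intBits: string =
   case Int.precision of
      NONE => "arbitrary precision"
    | SOME n => Int.toString n ^ " bits"

val wordBits: string = Int.toString Word.wordSize ^ " bits"

val () = print ("int:  " ^ intBits ^ "\n")
val () = print ("word: " ^ wordBits ^ "\n")
----
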
3363<<<
3364
3365:mlton-guide-page: CompilingWithSMLNJ
3366[[CompilingWithSMLNJ]]
3367CompilingWithSMLNJ
3368==================
3369
You can compile MLton with <:SMLNJ:SML/NJ>; however, the resulting
3371compiler will run much more slowly than MLton compiled by itself. We
3372don't recommend using SML/NJ as a means of
3373<:PortingMLton:porting MLton> to a new platform or bootstrapping on a
3374new platform.
3375
3376If you do want to build MLton with SML/NJ, it is best to have a binary
3377MLton package installed. If you don't, here are some issues you may
3378encounter when you run `make smlnj-mlton`.
3379
3380You will get (many copies of) the error messages:
3381
3382----
3383/bin/sh: mlton: command not found
3384----
3385
3386and
3387
3388----
3389make[2]: mlton: Command not found
3390----
3391
3392The `Makefile` calls `mlton` to determine dependencies, and can
3393proceed in spite of this error.
3394
3395If you don't have an `mllex` executable, you will get the error
3396message:
3397
3398----
3399mllex: Command not found
3400----
3401
3402Building MLton requires `mllex` and `mlyacc` executables, which are
3403distributed with a binary package of MLton. The easiest solution is
3404to copy the front-end lexer/parser files from a different machine
3405(`ml.grm.sml`, `ml.grm.sig`, `ml.lex.sml`, `mlb.grm.sig`,
3406`mlb.grm.sml`).
3407
3408<<<
3409
3410:mlton-guide-page: ConcurrentML
3411[[ConcurrentML]]
3412ConcurrentML
3413============
3414
3415http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
3416library based on synchronous message passing. MLton has an initial
3417port of CML from SML/NJ, but is missing a thread-safe wrapper around
3418the Basis Library and event-based equivalents to `IO` and `OS`
3419functions.
3420
3421All of the core CML functionality is present.
3422
3423[source,sml]
3424----
3425structure CML: CML
3426structure SyncVar: SYNC_VAR
3427structure Mailbox: MAILBOX
3428structure Multicast: MULTICAST
3429structure SimpleRPC: SIMPLE_RPC
3430structure RunCML: RUN_CML
3431----
3432
3433The `RUN_CML` signature is minimal.
3434
3435[source,sml]
3436----
3437signature RUN_CML =
3438 sig
3439 val isRunning: unit -> bool
3440 val doit: (unit -> unit) * Time.time option -> OS.Process.status
3441 val shutdown: OS.Process.status -> 'a
3442 end
3443----
3444
3445MLton's `RunCML` structure does not include all of the cleanup and
3446logging operations of SML/NJ's `RunCML` structure. However, the
3447implementation does include the `CML.timeOutEvt` and `CML.atTimeEvt`
functions, and a preemptive scheduler that knows to sleep when there
are no ready threads and some threads are blocked on time events.
3450
3451Because MLton does not wrap the Basis Library for CML, the "right" way
3452to call a Basis Library function that is stateful is to wrap the call
3453with `MLton.Thread.atomically`.
3454
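A minimal sketch of that wrapping pattern follows. The names
`atomicPrint`, `counter`, and `nextId` are made up for illustration;
the only MLton-specific call is `MLton.Thread.atomically`, which runs
its argument inside a critical section.

[source,sml]
----
(* Wrap a stateful Basis Library call so that it runs inside a
   critical section. *)
fun atomicPrint (s: string): unit =
   MLton.Thread.atomically (fn () => TextIO.print s)

(* The same pattern applied to an update of shared mutable state. *)
val counter: int ref = ref 0

fun nextId (): int =
   MLton.Thread.atomically
   (fn () => (counter := !counter + 1; !counter))
----

The wrapped function should be short, since the scheduler is not
expected to switch threads while it runs.
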
3455== Usage ==
3456
3457* You can import the CML Library into an MLB file with:
3458+
3459[options="header"]
|====
|MLB file|Description
|`$(SML_LIB)/cml/cml.mlb`|
|====
3464
3465* If you are porting a project from SML/NJ's <:CompilationManager:> to
3466MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
3467following map is included by default:
3468+
3469----
3470# CML Library
3471$cml $(SML_LIB)/cml
3472$cml/cml.cm $(SML_LIB)/cml/cml.mlb
3473----
3474+
3475This will automatically convert a `$cml/cml.cm` import in an input `.cm` file into a `$(SML_LIB)/cml/cml.mlb` import in the output `.mlb` file.
3476
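As a rough sketch of what a small CML program looks like, the
following passes one integer from a worker thread to the main thread
over a channel. It assumes a hypothetical `hello-cml.mlb` that lists
`$(SML_LIB)/basis/basis.mlb`, `$(SML_LIB)/cml/cml.mlb`, and this
source file; the names `exchange` and `main` are illustrative.

[source,sml]
----
(* A worker thread sends one integer; the caller receives it. *)
fun exchange (): int =
   let
      val ch: int CML.chan = CML.channel ()
      val _ = CML.spawn (fn () => CML.send (ch, 42))
   in
      CML.recv ch
   end

(* Print the received value, then shut the CML scheduler down
   explicitly rather than relying on the scheduler to notice that
   no threads remain. *)
fun main (): unit =
   (print (Int.toString (exchange ()) ^ "\n")
    ; RunCML.shutdown OS.Process.success)

val _ = RunCML.doit (main, NONE)
----
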
3477== Also see ==
3478
3479* <:ConcurrentMLImplementation:>
3480* <:eXene:>
3481
3482<<<
3483
3484:mlton-guide-page: ConcurrentMLImplementation
3485[[ConcurrentMLImplementation]]
3486ConcurrentMLImplementation
3487==========================
3488
3489Here are some notes on MLton's implementation of <:ConcurrentML:>.
3490
3491Concurrent ML was originally implemented for SML/NJ. It was ported to
3492MLton in the summer of 2004. The main difference between the
3493implementations is that SML/NJ uses continuations to implement CML
3494threads, while MLton uses its underlying <:MLtonThread:thread>
3495package. Presently, MLton's threads are a little more heavyweight
3496than SML/NJ's continuations, but it's pretty clear that there is some
3497fat there that could be trimmed.
3498
3499The implementation of CML in SML/NJ is built upon the first-class
3500continuations of the `SMLofNJ.Cont` module.
3501[source,sml]
3502----
3503type 'a cont
3504val callcc: ('a cont -> 'a) -> 'a
3505val isolate: ('a -> unit) -> 'a cont
3506val throw: 'a cont -> 'a -> 'b
3507----
3508
3509The implementation of CML in MLton is built upon the first-class
3510threads of the <:MLtonThread:> module.
3511[source,sml]
3512----
3513type 'a t
3514val new: ('a -> unit) -> 'a t
3515val prepare: 'a t * 'a -> Runnable.t
3516val switch: ('a t -> Runnable.t) -> 'a
3517----
3518
3519The port is relatively straightforward, because CML always throws to a
3520continuation at most once. Hence, an "abstract" implementation of
3521CML could be built upon first-class one-shot continuations, which map
3522equally well to SML/NJ's continuations and MLton's threads.
3523
3524The "essence" of the port is to transform:
3525----
3526callcc (fn k => ... throw k' v')
3527----
3528{empty}to
3529----
3530switch (fn t => ... prepare (t', v'))
3531----
3532which suffices for the vast majority of the CML implementation.
3533
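To make the correspondence concrete, here is a small sketch that uses
only the `MLton.Thread` operations listed above. It plays the role of
`callcc (fn k => throw k 42)`: a helper thread is created whose only
job is to switch back to the captured thread with a value. The names
`T` and `answer` are illustrative.

[source,sml]
----
structure T = MLton.Thread

(* switch captures the current thread as cur and transfers control to
   the helper thread; the helper immediately switches back to cur,
   handing it 42 as the result of the outer switch. *)
val answer: int =
   T.switch
   (fn (cur: int T.t) =>
       T.prepare
       (T.new (fn () => T.switch (fn _ => T.prepare (cur, 42))),
        ()))

val () = print (Int.toString answer ^ "\n")
----
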
3534There was only one complicated transformation: blocking multiple base
3535events. In SML/NJ CML, the representation of base events is given by:
3536[source,sml]
3537----
3538datatype 'a event_status
3539 = ENABLED of {prio: int, doFn: unit -> 'a}
3540 | BLOCKED of {
3541 transId: trans_id ref,
3542 cleanUp: unit -> unit,
3543 next: unit -> unit
3544 } -> 'a
3545type 'a base_evt = unit -> 'a event_status
3546----
3547
3548When synchronizing on a set of base events, which are all blocked, we
3549must invoke each `BLOCKED` function with the same `transId` and
3550`cleanUp` (the `transId` is (checked and) set to `CANCEL` by the
3551`cleanUp` function, which is invoked by the first enabled event; this
3552"fizzles" every other event in the synchronization group that later
3553becomes enabled). However, each `BLOCKED` function is implemented by
3554a callcc, so that when the event is enabled, it throws back to the
point of synchronization. Hence, the `next` function (which doesn't
3556return) is invoked by the `BLOCKED` function to escape the callcc and
3557continue in the thread performing the synchronization. In SML/NJ this
3558is implemented as follows:
3559[source,sml]
3560----
3561fun ext ([], blockFns) = callcc (fn k => let
3562 val throw = throw k
3563 val (transId, setFlg) = mkFlg()
3564 fun log [] = S.atomicDispatch ()
3565 | log (blockFn:: r) =
3566 throw (blockFn {
3567 transId = transId,
3568 cleanUp = setFlg,
3569 next = fn () => log r
3570 })
3571 in
3572 log blockFns; error "[log]"
3573 end)
3574----
(Note that `S.atomicDispatch` invokes the continuation of the next
thread on the ready queue.) This doesn't map well to the MLton
thread model. Although it follows the
3578----
3579callcc (fn k => ... throw k v)
3580----
3581model, the fact that `blockFn` will also attempt to do
3582----
3583callcc (fn k' => ... next ())
3584----
3585means that the naive transformation will result in nested `switch`-es.
3586
3587We need to think a little more about what this code is trying to do.
3588Essentially, each `blockFn` wants to capture this continuation, hold
on to it until the event is enabled, and continue with `next`; when the
3590event is enabled, before invoking the continuation and returning to
3591the synchronization point, the `cleanUp` and other event specific
3592operations are performed.
3593
3594To accomplish the same effect in the MLton thread implementation, we
3595have the following:
3596[source,sml]
3597----
3598datatype 'a status =
3599 ENABLED of {prio: int, doitFn: unit -> 'a}
3600 | BLOCKED of {transId: trans_id,
3601 cleanUp: unit -> unit,
3602 next: unit -> rdy_thread} -> 'a
3603
3604type 'a base = unit -> 'a status
3605
3606fun ext ([], blockFns): 'a =
3607 S.atomicSwitch
3608 (fn (t: 'a S.thread) =>
3609 let
3610 val (transId, cleanUp) = TransID.mkFlg ()
3611 fun log blockFns: S.rdy_thread =
3612 case blockFns of
3613 [] => S.next ()
3614 | blockFn::blockFns =>
3615 (S.prep o S.new)
3616 (fn _ => fn () =>
3617 let
3618 val () = S.atomicBegin ()
3619 val x = blockFn {transId = transId,
3620 cleanUp = cleanUp,
3621 next = fn () => log blockFns}
3622 in S.switch(fn _ => S.prepVal (t, x))
3623 end)
3624 in
3625 log blockFns
3626 end)
3627----
3628
To avoid the nested `switch`-es, I run the `blockFn` in its own
3630thread, whose only purpose is to return to the synchronization point.
3631This corresponds to the `throw (blockFn {...})` in the SML/NJ
3632implementation. I'm worried that this implementation might be a
little expensive, starting a new thread for each blocked event (though
only when there are multiple blocked events in a synchronization group).
3635But, I don't see another way of implementing this behavior in the
3636MLton thread model.
3637
3638Note that another way of thinking about what is going on is to
3639consider each `blockFn` as prepending a different set of actions to
3640the thread `t`. It might be possible to give a
3641`MLton.Thread.unsafePrepend`.
3642[source,sml]
3643----
3644fun unsafePrepend (T r: 'a t, f: 'b -> 'a): 'b t =
3645 let
3646 val t =
3647 case !r of
3648 Dead => raise Fail "prepend to a Dead thread"
3649 | New g => New (g o f)
3650 | Paused (g, t) => Paused (fn h => g (f o h), t)
3651 in (* r := Dead; *)
3652 T (ref t)
3653 end
3654----
3655I have commented out the `r := Dead`, which would allow multiple
3656prepends to the same thread (i.e., not destroying the original thread
3657in the process). Of course, only one of the threads could be run: if
3658the original thread were in the `Paused` state, then multiple threads
3659would share the underlying runtime/primitive thread. Now, this
3660matches the "one-shot" nature of CML continuations/threads, but I'm
3661not comfortable with extending `MLton.Thread` with such an unsafe
3662operation.
3663
3664Other than this complication with blocking multiple base events, the
3665port was quite routine. (As a very pleasant surprise, the CML
3666implementation in SML/NJ doesn't use any SML/NJ-isms.) There is a
3667slight difference in the way in which critical sections are handled in
3668SML/NJ and MLton; since `MLton.Thread.switch` _always_ leaves a
3669critical section, it is sometimes necessary to add additional
3670`atomicBegin`-s/`atomicEnd`-s to ensure that we remain in a critical
3671section after a thread switch.
3672
3673While looking at virtually every file in the core CML implementation,
3674I took the liberty of simplifying things where it seemed possible; in
3675terms of style, the implementation is about half-way between Reppy's
3676original and MLton's.
3677
3678Some changes of note:
3679
3680* `util/` contains all pertinent data-structures: (functional and
3681imperative) queues, (functional) priority queues. Hence, it should be
3682easier to switch in more efficient or real-time implementations.
3683
3684* `core-cml/scheduler.sml`: in both implementations, this is where
3685most of the interesting action takes place. I've made the connection
3686between `MLton.Thread.t`-s and `ThreadId.thread_id`-s more abstract
3687than it is in the SML/NJ implementation, and encapsulated all of the
3688`MLton.Thread` operations in this module.
3689
3690* eliminated all of the "by hand" inlining
3691
3692
3693== Future Extensions ==
3694
3695The CML documentation says the following:
3696____
3697
3698----
3699CML.joinEvt: thread_id -> unit event
3700----
3701
3702* `joinEvt tid`
3703+
3704creates an event value for synchronizing on the termination of the
3705thread with the ID tid. There are three ways that a thread may
3706terminate: the function that was passed to spawn (or spawnc) may
3707return; it may call the exit function, or it may have an uncaught
3708exception. Note that `joinEvt` does not distinguish between these
3709cases; it also does not become enabled if the named thread deadlocks
3710(even if it is garbage collected).
3711____
3712
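For reference, a minimal sketch of how `joinEvt` is used with the
existing API: spawn a thread and block until it terminates. The name
`spawnAndJoin` is made up for illustration, and the function must be
called from within a running CML program (e.g., under `RunCML.doit`).

[source,sml]
----
(* Spawn f as a CML thread and wait for it to terminate, whether by
   returning, by calling exit, or by an uncaught exception. *)
fun spawnAndJoin (f: unit -> unit): unit =
   let
      val tid = CML.spawn f
   in
      CML.sync (CML.joinEvt tid)
   end
----
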
I believe that `MLton.Finalizable` might be able to relax that
3714last restriction. Upon the creation of a `'a Scheduler.thread`, we
3715could attach a finalizer to the underlying `'a MLton.Thread.t` that
3716enables the `joinEvt` (in the associated `ThreadID.thread_id`) when
3717the `'a MLton.Thread.t` becomes unreachable.
3718
3719I don't know why CML doesn't have
3720----
3721CML.kill: thread_id -> unit
3722----
3723which has a fairly simple implementation -- setting a kill flag in the
3724`thread_id` and adjusting the scheduler to discard any killed threads
3725that it takes off the ready queue. The fairness of the scheduler
3726ensures that a killed thread will eventually be discarded. The
semantics are a little murky for blocked threads that are killed,
3728though. For example, consider a thread blocked on `SyncVar.mTake mv`
3729and a thread blocked on `SyncVar.mGet mv`. If the first thread is
3730killed while blocked, and a third thread does `SyncVar.mPut (mv, x)`,
3731then we might expect that we'll enable the second thread, and never
3732the first. But, when only the ready queue is able to discard killed
3733threads, then the `SyncVar.mPut` could enable the first thread
3734(putting it on the ready queue, from which it will be discarded) and
3735leave the second thread blocked. We could solve this by adjusting the
`TransID.trans_id` types and the "cleaner" functions to look for both
3737canceled transactions and transactions on killed threads.
3738
3739John Reppy says that <!Cite(MarlowEtAl01)> and <!Cite(FlattFindler04)>
3740explain why `CML.kill` would be a bad idea.
3741
3742Between `CML.timeOutEvt` and `CML.kill`, one could give an efficient
3743solution to the recent `comp.lang.ml` post about terminating a
3744function that doesn't complete in a given time.
3745[source,sml]
3746----
3747 fun timeOut (f: unit -> 'a, t: Time.time): 'a option =
3748 let
3749 val iv = SyncVar.iVar ()
3750 val tid = CML.spawn (fn () => SyncVar.iPut (iv, f ()))
3751 in
3752 CML.select
3753 [CML.wrap (CML.timeOutEvt t, fn () => (CML.kill tid; NONE)),
3754 CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)]
3755 end
3756----
3757
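Since `CML.kill` does not exist, a variant that uses only the existing
API simply drops the kill. This is a sketch under that assumption,
not an equivalent: on timeout the worker thread is not stopped, it
keeps running and its result is discarded. The name `timeOutNoKill`
is made up for illustration.

[source,sml]
----
(* Like timeOut above, but without CML.kill: on timeout the caller
   gets NONE while the worker continues to run in the background. *)
fun timeOutNoKill (f: unit -> 'a, t: Time.time): 'a option =
   let
      val iv = SyncVar.iVar ()
      val _ = CML.spawn (fn () => SyncVar.iPut (iv, f ()))
   in
      CML.select
      [CML.wrap (CML.timeOutEvt t, fn () => NONE),
       CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)]
   end
----
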
3758
3759== Space Safety ==
3760
3761There are some CML related posts on the MLton mailing list:
3762
3763* http://www.mlton.org/pipermail/mlton/2004-May/
3764
3765that discuss concerns that SML/NJ's implementation is not space
3766efficient, because multi-shot continuations can be held indefinitely
3767on event queues. MLton is better off because of the one-shot nature
3768-- when an event enables a thread, all other copies of the thread
3769waiting in other event queues get turned into dead threads (of zero
3770size).
3771
3772<<<
3773
3774:mlton-guide-page: ConstantPropagation
3775[[ConstantPropagation]]
3776ConstantPropagation
3777===================
3778
3779<:ConstantPropagation:> is an optimization pass for the <:SSA:>
3780<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
3781
3782== Description ==
3783
3784This is whole-program constant propagation, even through data
3785structures. It also performs globalization of (small) values computed
3786once.
3787
3788Uses <:Multi:>.
3789
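As a source-level illustration of "constant propagation through data
structures", consider the sketch below. The types and names are made
up, and it makes no claim about what the pass does with this exact
program; it only shows the kind of fact a whole-program analysis can
discover: every `Config` value ever built has `scale = 10`.

[source,sml]
----
datatype config = Config of {scale: int, label: string}

(* The only Config value constructed anywhere in the program. *)
val cfg = Config {scale = 10, label = "demo"}

(* Because scale is 10 in every Config, the field read below could in
   principle be replaced by the constant 10. *)
fun scaled (Config {scale, ...}, x: int): int = scale * x

val y = scaled (cfg, 4)
val () = print (Int.toString y ^ "\n")
----
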
3790== Implementation ==
3791
3792* <!ViewGitFile(mlton,master,mlton/ssa/constant-propagation.fun)>
3793
3794== Details and Notes ==
3795
3796{empty}
3797
3798<<<
3799
3800:mlton-guide-page: Contact
3801[[Contact]]
3802Contact
3803=======
3804
3805== Mailing lists ==
3806
3807There are three mailing lists available.
3808
3809* mailto:MLton-user@mlton.org[`MLton-user@mlton.org`]
3810+
3811MLton user community discussion
3812+
3813--
3814* https://lists.sourceforge.net/lists/listinfo/mlton-user[subscribe]
* https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-user[archive (SourceForge; current)],
3816http://www.mlton.org/pipermail/mlton-user/[archive (PiperMail; through 201110)]
3817--
3818
3819* mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]
3820+
3821MLton developer community discussion
3822+
3823--
3824* https://lists.sourceforge.net/lists/listinfo/mlton-devel[subscribe]
* https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-devel[archive (SourceForge; current)],
3826http://www.mlton.org/pipermail/mlton-devel/[archive (PiperMail; through 201110)]
3827--
3828
3829* mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`]
3830+
3831MLton code commits
3832+
3833--
3834* https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe]
3835* https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive (SourceForge; current)],
3836http://www.mlton.org/pipermail/mlton-commit/[archive (PiperMail; through 201110)]
3837--
3838
3839
3840=== Mailing list policies ===
3841
3842* Both mailing lists are unmoderated. However, the mailing lists are
3843configured to discard all spam, to hold all non-subscriber posts
3844for moderation, to accept all subscriber posts, and to admin approve
3845subscription requests. Please contact
3846mailto:matthew.fluet@gmail.com[Matthew Fluet] if it appears that your
3847messages are being discarded as spam.
3848
3849* Large messages (over 256K) should not be sent. Rather, please send
3850an email containing the discussion text and a link to any large files.
3851
3852/////
3853* Very active mailto:MLton-devel@mlton.org[`MLton@mlton.org`] list
3854members who might otherwise be expected to provide a fast response
3855should send a message when they will be offline for more than a few
3856days. The convention is to put
3857"++__userid__ offline until __date__++" in the subject line to make it
3858easy to scan.
3859/////
3860
3861* Discussions started on the mailing lists should stay on the mailing
3862lists. Private replies may be bounced to the mailing list for the
3863benefit of those following the discussion.
3864
3865* Discussions started on
3866mailto:MLton-user@mlton.org[`MLton-user@mlton.org`] may be migrated to
3867mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], particularly
3868when the discussion shifts from how to use MLton to how to modify
3869MLton (e.g., to fix a bug identified by the initial discussion).
3870
3871== IRC ==
3872
3873* Some MLton developers and users are in channel `#sml` on http://freenode.net.
3874
3875<<<
3876
3877:mlton-guide-page: Contify
3878[[Contify]]
3879Contify
3880=======
3881
3882<:Contify:> is an optimization pass for the <:SSA:>
3883<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
3884
3885== Description ==
3886
3887Contification is a compiler optimization that turns a function that
3888always returns to the same place into a continuation. This exposes
3889control-flow information that is required by many optimizations,
3890including traditional loop optimizations.
3891
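A source-level sketch of a function that fits this description: the
local function `loop` is entered from a single call site, and its
remaining calls are tail calls to itself, so every return from `loop`
goes to the same place. The example is illustrative only; whether the
pass contifies a given function depends on the <:SSA:> program that
earlier passes produce.

[source,sml]
----
fun sum (xs: int list): int =
   let
      (* loop is called from outside itself exactly once, below; its
         other calls are self tail calls. *)
      fun loop (ys: int list, acc: int): int =
         case ys of
            [] => acc
          | y :: ys' => loop (ys', acc + y)
   in
      loop (xs, 0)
   end

val total = sum [1, 2, 3, 4]
----
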
3892== Implementation ==
3893
3894* <!ViewGitFile(mlton,master,mlton/ssa/contify.fun)>
3895
3896== Details and Notes ==
3897
3898See <!Cite(FluetWeeks01, Contification Using Dominators)>. The
3899intermediate language described in that paper has since evolved to the
3900<:SSA:> <:IntermediateLanguage:>; hence, the complication described in
3901Section 6.1 is no longer relevant.
3902
3903<<<
3904
3905:mlton-guide-page: CoreML
3906[[CoreML]]
3907CoreML
3908======
3909
3910<:CoreML:Core ML> is an <:IntermediateLanguage:>, translated from
3911<:AST:> by <:Elaborate:>, optimized by <:CoreMLSimplify:>, and
3912translated by <:Defunctorize:> to <:XML:>.
3913
3914== Description ==
3915
3916<:CoreML:> is polymorphic, higher-order, and has nested patterns.
3917
3918== Implementation ==
3919
3920* <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.sig)>
3921* <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.fun)>
3922
3923== Type Checking ==
3924
3925The <:CoreML:> <:IntermediateLanguage:> has no independent type
3926checker.
3927
3928== Details and Notes ==
3929
3930{empty}
3931
3932<<<
3933
3934:mlton-guide-page: CoreMLSimplify
3935[[CoreMLSimplify]]
3936CoreMLSimplify
3937==============
3938
3939The single optimization pass for the <:CoreML:>
3940<:IntermediateLanguage:> is controlled by the `Compile` functor
3941(<!ViewGitFile(mlton,master,mlton/main/compile.fun)>).
3942
3943The following optimization pass is implemented:
3944
3945* <:DeadCode:>
3946
3947<<<
3948
3949:mlton-guide-page: Credits
3950[[Credits]]
3951Credits
3952=======
3953
MLton was designed and implemented by <:HenryCejtin:>,
<:MatthewFluet:>, <:SureshJagannathan:>, and <:StephenWeeks:>.
3956
3957 * <:HenryCejtin:> wrote the `IntInf` implementation, the original
3958 profiler, the original man pages, the `.spec` files for the RPMs,
3959 and lots of little hacks to speed stuff up.
3960
3961 * <:MatthewFluet:> implemented the X86 and AMD64 native code generators,
3962 ported `mlprof` to work with the native code generator, did a lot
3963 of work on the SSA optimizer, both adding new optimizations and
3964 improving or porting existing optimizations, updated the
3965 <:BasisLibrary:Basis Library> implementation, ported
3966 <:ConcurrentML:> and <:MLNLFFI:ML-NLFFI> to MLton, implemented the
3967 <:MLBasis: ML Basis system>, ported MLton to 64-bit platforms,
3968 and currently leads the project.
3969
3970 * <:SureshJagannathan:> implemented some early inlining and uncurrying
3971 optimizations.
3972
3973 * <:StephenWeeks:> implemented most of the original version of MLton, and
3974 continues to keep his fingers in most every part.
3975
3976Many people have helped us over the years. Here is an alphabetical
3977list.
3978
3979 * <:JesperLouisAndersen:> sent several patches to improve the runtime on
3980 FreeBSD and ported MLton to run on NetBSD and OpenBSD.
3981
3982 * <:JohnnyAndersen:> implemented `BinIO`, modified MLton so it could
3983 cross compile to MinGW, and provided useful discussion about
3984 cross-compilation.
3985
3986 * Alexander Abushkevich extended support for OpenBSD.
3987
3988 * Ross Bayer added the `-keep ast` compile-time option and experimented with
3989 porting the build system to CMake.
3990
3991 * Kevin Bradley added initial support for <:SuccessorML:> features.
3992
 * Bryan Camp added `-disable-pass _regex_` and `-enable-pass _regex_` compile
3994 options to generalize `-drop-pass _regex_` and added `Array_copyArray` and
3995 `Array_copyVector` primitives.
3996
3997 * Jason Carr added a parser combinator library and a parser for the <:SXML:>
3998 IR, extended compilation to start with a `.sxml` file, and experimented with
3999 alternate control-flow analyses for <:ClosureConvert: closure conversion>.
4000
4001 * Christopher Cramer contributed support for additional
4002 `Posix.ProcEnv.sysconf` variables, performance improvements for
4003 `String.concatWith`, and Debian packaging.
4004
4005 * Alain Deutsch and
4006 http://www.polyspace.com/[PolySpace Technologies] provided many bug
4007 fixes and runtime system improvements, code to help the Sparc/Solaris
4008 port, and funded a number of improvements to MLton.
4009
4010 * Armando Doval updated `mlnlffigen` to warn and skip functions with
4011 `struct`/`union` arguments.
4012
4013 * Martin Elsman provided helpful discussions in the development of
4014 the <:MLBasis:ML Basis system>.
4015
4016 * Brent Fulgham ported MLton most of the way to MinGW.
4017
4018 * <:AdamGoode:> provided a script to build the PDF MLton Guide and
4019 maintains the
4020 https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
4021 packages.
4022
4023 * Simon Helsen provided bug reports, suggestions, and helpful
4024 discussions.
4025
4026 * Joe Hurd provided useful discussion and feedback on source-level
4027 profiling.
4028
4029 * <:VesaKarvonen:> contributed `esml-mode.el` and `esml-mlb-mode.el` (see <:Emacs:>),
4030 contributed patches for improving match warnings,
4031 contributed `esml-du-mlton.el` and extended def-use output to include types of variable definitions (see <:EmacsDefUseMode:>), and
4032 improved constant folding of floating-point operations.
4033
4034 * Richard Kelsey provided helpful discussions.
4035
4036 * Ville Laurikari ported MLton to IA64/HPUX, HPPA/HPUX, PowerPC/AIX, PowerPC64/AIX.
4037
4038 * Brian Leibig implemented the <:LLVMCodegen:>.
4039
4040 * Geoffrey Mainland helped with FreeBSD packaging.
4041
4042 * Eric McCorkle ported MLton to Intel Mac.
4043
4044 * <:TomMurphy:> wrote the original version of `MLton.Syslog` as part
4045 of his `mlftpd` project, and has sent many useful bug reports and
4046 suggestions.
4047
4048 * Michael Neumann helped to patch the runtime to compile under
4049 FreeBSD.
4050
4051 * Barak Pearlmutter built the original
4052 http://packages.debian.org/mlton[Debian package] for MLton, and
4053 helped us to take over the process.
4054
4055 * Filip Pizlo ported MLton to (PowerPC) Darwin.
4056
4057 * Vedant Raiththa extended the <:ForeignFunctionInterface:> with support for
4058 `pure` and `impure` attributes to `_import`.
4059
4060 * Krishna Ravikumar added initial support for vector expressions and the
4061 `Vector_vector` primitive.
4062
4063 * John Reppy assisted in porting MLton to Intel Mac.
4064
4065 * Sam Rushing ported MLton to FreeBSD.
4066
4067 * Rob Simmons refactored the array and vector implementation in the
4068 <:BasisLibrary:Basis Library> into a primitive implementation (using
4069 `SeqInt.int` for indexing) and a wrapper implementation (using the default
4070 `Int.int` for indexing).
4071
4072 * Jeffrey Mark Siskind provided helpful discussions and inspiration
4073 with his Stalin Scheme compiler.
4074
4075 * Matthew Surawski added <:LoopUnroll:> and <:LoopUnswitch:> SSA optimizations.
4076
4077 * <:WesleyTerpstra:> added support for `MLton.Process.create`, made
4078 a number of contributions to the <:ForeignFunctionInterface:>,
4079 contributed a number of runtime system patches,
4080 added support for compiling to a <:LibrarySupport:C library>,
4081 ported MLton to http://mingw.org[MinGW] and all http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] supported architectures with <:CrossCompiling:cross-compiling> support,
4082 and maintains the http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] and http://mingw.org[MinGW] packages.
4083
4084 * Maksim Yegorov added rudimentary support for `./configure` and other
4085 improvements to the build system and implemented the <:ShareZeroVec:> SSA
4086 optimization.
4087
4088 * Luke Ziarek assisted in porting MLton to (PowerPC) Darwin.
4089
4090We have also benefited from other software development tools and
4091used code from other sources.
4092
4093 * MLton was developed using
4094 <:SMLNJ:Standard ML of New Jersey> and the
4095 <:CompilationManager:Compilation Manager (CM)>.
4096
4097 * MLton's lexer (`mlton/frontend/ml.lex`), parser
4098 (`mlton/frontend/ml.grm`), and precedence-parser
4099 (`mlton/elaborate/precedence-parse.fun`) are modified versions of
4100 code from SML/NJ.
4101
4102 * The MLton <:BasisLibrary:Basis Library> implementation of
4103 conversions between binary and decimal representations of reals uses
4104 David Gay's http://www.netlib.org/fp/[gdtoa] library.
4105
4106 * The MLton <:BasisLibrary:Basis Library> implementation uses
4107 modified versions of portions of the SML/NJ Basis Library
4108 implementation modules `OS.IO`, `Posix.IO`, `Process`,
4109 and `Unix`.
4110
4111 * The MLton <:BasisLibrary:Basis Library> implementation uses
4112 modified versions of portions of the <:MLKit:ML Kit> Version 4.1.4
4113 Basis Library implementation modules `Path`, `Time`, and
4114 `Date`.
4115
4116 * Many of the benchmarks come from the SML/NJ benchmark suite.
4117
4118 * Many of the regression tests come from the ML Kit Version 4.1.4
4119 distribution, which borrowed them from the
4120 http://www.dina.kvl.dk/%7Esestoft/mosml.html[Moscow ML] distribution.
4121
4122 * MLton uses the http://www.gnu.org/software/gmp/gmp.html[GNU multiprecision library] for its implementation of `IntInf`.
4123
4124 * MLton's implementation of <:MLLex: mllex>, <:MLYacc: mlyacc>,
4125 the <:CKitLibrary:ckit Library>,
4126 the <:MLLPTLibrary:ML-LPT Library>,
4127 the <:MLRISCLibrary:MLRISC Library>,
4128 the <:SMLNJLibrary:SML/NJ Library>,
4129 <:ConcurrentML:Concurrent ML>,
4130 mlnlffigen and <:MLNLFFI:ML-NLFFI>
4131 are modified versions of code from SML/NJ.
4132
4133<<<
4134
4135:mlton-guide-page: CrossCompiling
4136[[CrossCompiling]]
4137CrossCompiling
4138==============
4139
4140MLton's `-target` flag directs MLton to cross compile an application
4141for another platform. By default, MLton is only able to compile for
4142the machine it is running on. In order to use MLton as a cross
4143compiler, you need to do two things.
4144
41451. Install the GCC cross-compiler tools on the host so that GCC can
4146compile to the target.
4147
41482. Cross compile the MLton runtime system to build the runtime
4149libraries for the target.
4150
4151To make the terminology clear, we refer to the _host_ as the machine
4152MLton is running on and the _target_ as the machine that MLton is
4153compiling for.
4154
4155To build a GCC cross-compiler toolset on the host, you can use the
4156script `bin/build-cross-gcc`, available in the MLton sources, as a
4157template. The value of the `target` variable in that script is
4158important, since that is what you will pass to MLton's `-target` flag.
4159Once you have the toolset built, you should be able to test it by
4160cross compiling a simple hello world program on your host machine.
4161----
4162% gcc -b i386-pc-cygwin -o hello-world hello-world.c
4163----
4164
4165You should now be able to run `hello-world` on the target machine, in
4166this case, a Cygwin machine.
4167
4168Next, you must cross compile the MLton runtime system and inform MLton
4169of the availability of the new target. The script `bin/add-cross`
4170from the MLton sources will help you do this. Please read the
4171comments at the top of the script. Here is a sample run adding a
4172Solaris cross compiler.
4173----
4174% add-cross sparc-sun-solaris sun blade
4175Making runtime.
4176Building print-constants executable.
4177Running print-constants on blade.
4178----
4179
4180Running `add-cross` uses `ssh` to compile the runtime on the target
4181machine and to create `print-constants`, which prints out all of the
4182constants that MLton needs in order to implement the
4183<:BasisLibrary:Basis Library>. The script runs `print-constants` on
4184the target machine (`blade` in this case), and saves the output.
4185
4186Once you have done all this, you should be able to cross compile SML
4187applications. For example,
4188----
4189mlton -target i386-pc-cygwin hello-world.sml
4190----
4191will create `hello-world`, which you should be able to run from a
4192Cygwin shell on your Windows machine.
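
The `hello-world.sml` file itself is not shown on this page. A minimal stand-in, written here purely as an assumption about what such a file might contain, could be:

[source,sml]
----
(* hello-world.sml: hypothetical contents; any complete SML program works. *)
val greeting = "Hello, world!\n"
val () = print greeting
----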
4193
4194
4195== Cross-compiling alternatives ==
4196
4197Building and maintaining `gcc` cross compilers is complex. You may
4198find it simpler to use `mlton -keep g` to generate the files on the
4199host, then copy the files to the target, and then use `gcc` or `mlton`
4200on the target to compile the files.
4201
4202<<<
4203
4204:mlton-guide-page: CVS
4205[[CVS]]
4206CVS
4207===
4208
4209http://www.gnu.org/software/cvs/[CVS] (Concurrent Versions System) is
4210a version control system. The MLton project used CVS to maintain its
4211<:Sources:source code>, but switched to <:Subversion:> on 20050730.
4212
4213Here are some online CVS resources.
4214
4215* http://cvsbook.red-bean.com/[Open Source Development with CVS]
4216
4217<<<
4218
4219:mlton-guide-page: DeadCode
4220[[DeadCode]]
4221DeadCode
4222========
4223
4224<:DeadCode:> is an optimization pass for the <:CoreML:>
4225<:IntermediateLanguage:>, invoked from <:CoreMLSimplify:>.
4226
4227== Description ==
4228
4229This pass eliminates declarations from the
4230<:BasisLibrary:Basis Library> not needed by the user program.
4231
4232== Implementation ==
4233
4234* <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.sig)>
4235* <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.fun)>
4236
4237== Details and Notes ==
4238
4239In order to compile small programs rapidly, a pass of dead code
4240elimination is run in order to eliminate as much of the Basis Library
4241as possible. The dead code elimination algorithm used is not safe in
4242general, and only works because the Basis Library implementation has
4243special properties:
4244
4245* it terminates
4246* it performs no I/O
4247
4248The dead code elimination includes the minimal set of
4249declarations from the Basis Library so that there are no free
4250variables in the user program (or remaining Basis Library
4251implementation). It has a special hack to include all
4252bindings of the form:
4253[source,sml]
4254----
4255 val _ = ...
4256----
4257
4258There is an <:MLBasisAnnotations:ML Basis annotation>,
4259`deadCode true`, that governs which code is subject to this unsafe
4260dead-code elimination.
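
As a rough illustration of which declarations survive, consider the following sketch. It assumes the first four declarations belong to code compiled under `deadCode true`; the names `initialized`, `unused`, and `double` are hypothetical and do not come from the Basis Library implementation.

[source,sml]
----
(* Hypothetical library fragment, assumed to be subject to `deadCode true`. *)
val initialized = ref false
fun unused (x: int) = x + 1   (* referenced nowhere: eligible for removal *)
fun double (x: int) = x * 2   (* referenced by the program below: kept *)
val _ = initialized := true   (* `val _ = ...` form: always included *)

(* User program: mentions `double`, so `double` may not be dropped. *)
val result = double 21
----

Because the `val _ = ...` binding is always included, `initialized` is also kept: it is a free variable of a retained declaration.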
4261
4262<<<
4263
4264:mlton-guide-page: DeepFlatten
4265[[DeepFlatten]]
4266DeepFlatten
4267===========
4268
4269<:DeepFlatten:> is an optimization pass for the <:SSA2:>
4270<:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.
4271
4272== Description ==
4273
4274This pass flattens into mutable fields of objects and into vectors.
4275
4276For example, an `(int * int) ref` is represented by a 2 word
4277object, and an `(int * int) array` contains pairs of `int`-s,
4278rather than pointers to pairs of `int`-s.
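
The two cases above correspond to source-level values such as the following. This is only a sketch of programs whose representation the pass affects; the flattening itself is not visible in SML source, and the variable names are illustrative.

[source,sml]
----
(* A ref cell holding a pair: after flattening, a single 2-word object. *)
val r: (int * int) ref = ref (1, 2)
val () = r := (3, 4)
val (x, y) = !r

(* An array of pairs: after flattening, the pairs are stored inline. *)
val a: (int * int) array = Array.tabulate (3, fn i => (i, i * i))
val (i, sq) = Array.sub (a, 2)
----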
4279
4280== Implementation ==
4281
4282* <!ViewGitFile(mlton,master,mlton/ssa/deep-flatten.fun)>
4283
4284== Details and Notes ==
4285
4286There are some performance issues with the deep flatten pass, where it
4287consumes an excessive amount of memory.
4288
4289* http://www.mlton.org/pipermail/mlton/2005-April/026990.html
4290* http://www.mlton.org/pipermail/mlton-user/2010-June/001626.html
4291* http://www.mlton.org/pipermail/mlton/2010-December/030876.html
4292
4293A number of applications require compilation with
4294`-disable-pass deepFlatten` to avoid exceeding available memory. It is
4295often asked whether the deep flatten pass usually has a significant
4296impact on performance. The standard benchmark suite was run with and
4297without the deep flatten pass enabled when the pass was first
4298introduced:
4299
4300* http://www.mlton.org/pipermail/mlton/2004-August/025760.html
4301
4302The conclusion is that it does not have a significant impact.
4303However, these are micro benchmarks; other applications may derive
4304greater benefit from the pass.
4305
4306<<<
4307
4308:mlton-guide-page: DefineTypeBeforeUse
4309[[DefineTypeBeforeUse]]
4310DefineTypeBeforeUse
4311===================
4312
4313<:StandardML:Standard ML> requires types to be defined before they are
4314used. Because of type inference, the use of a type can be implicit;
4315hence, this requirement is more subtle than it might appear. For
4316example, the following program is not type correct, because the type
4317of `r` is `t option ref`, but `t` is defined after `r`.
4318
4319[source,sml]
4320----
4321val r = ref NONE
4322datatype t = A | B
4323val () = r := SOME A
4324----
4325
4326MLton reports the following error, indicating that the type defined on
4327line 2 is used on line 1.
4328
4329----
4330Error: z.sml 3.10-3.20.
4331 Function applied to incorrect argument.
4332 expects: _ * [???] option
4333 but got: _ * [t] option
4334 in: := (r, SOME A)
4335 note: type would escape its scope: t
4336 escape from: z.sml 2.10-2.10
4337 escape to: z.sml 1.1-1.16
4338Warning: z.sml 1.5-1.5.
4339 Type of variable was not inferred and could not be generalized: r.
4340 type: ??? option ref
4341 in: val r = ref NONE
4342----
4343
4344While the above example is benign, the following example shows how to
4345cast an integer to a function by (implicitly) using a type before it
4346is defined. In the example, the ref cell `r` is of type
4347`t option ref`, where `t` is defined _after_ `r`, as a parameter to
4348functor `F`.
4349
4350[source,sml]
4351----
4352val r = ref NONE
4353functor F (type t
4354 val x: t) =
4355 struct
4356 val () = r := SOME x
4357 fun get () = valOf (!r)
4358 end
4359structure S1 = F (type t = unit -> unit
4360 val x = fn () => ())
4361structure S2 = F (type t = int
4362 val x = 13)
4363val () = S1.get () ()
4364----
4365
4366MLton reports the following error.
4367
4368----
4369Warning: z.sml 1.5-1.5.
4370 Type of variable was not inferred and could not be generalized: r.
4371 type: ??? option ref
4372 in: val r = ref NONE
4373Error: z.sml 5.16-5.26.
4374 Function applied to incorrect argument.
4375 expects: _ * [???] option
4376 but got: _ * [t] option
4377 in: := (r, SOME x)
4378 note: type would escape its scope: t
4379 escape from: z.sml 2.17-2.17
4380 escape to: z.sml 1.1-1.16
4381Warning: z.sml 6.11-6.13.
4382 Type of variable was not inferred and could not be generalized: get.
4383 type: unit -> ???
4384 in: fun get () = (valOf (! r))
4385Error: z.sml 12.10-12.18.
4386 Function not of arrow type.
4387 function: [unit]
4388 in: (S1.get ()) ()
4389----
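
Both programs are rejected because `r` is bound before the type it must hold. A minimal sketch of one possible repair for each, assuming the intent was simply to store a value of type `t`, is to bind the ref cell only after `t` is in scope. The explicit type annotations and the `thirteen` binding are additions made for illustration.

[source,sml]
----
(* First program: declare the datatype before the ref cell. *)
datatype t = A | B
val r: t option ref = ref NONE
val () = r := SOME A

(* Second program: create the ref cell inside the functor body, so each
 * application of F gets its own cell at its own type t. *)
functor F (type t
           val x: t) =
   struct
      val r: t option ref = ref NONE
      val () = r := SOME x
      fun get () = valOf (!r)
   end
structure S1 = F (type t = unit -> unit
                  val x = fn () => ())
structure S2 = F (type t = int
                  val x = 13)
val () = S1.get () ()
val thirteen = S2.get ()
----

With a separate cell per functor application, `S1.get` can only return the function stored by `S1`, so the cast from integer to function is no longer expressible.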
4390
4391<<<
4392
4393:mlton-guide-page: DefinitionOfStandardML
4394[[DefinitionOfStandardML]]
4395DefinitionOfStandardML
4396======================
4397
4398<!Cite(MilnerEtAl97, The Definition of Standard ML (Revised))> is a
4399terse and formal specification of <:StandardML:Standard ML>'s syntax
4400and semantics. The language specified by this book is often referred
4401to as SML 97. You can check its syntax
4402http://www.mpi-sws.org/~rossberg/sml.html[grammar] online (thanks to
4403Andreas Rossberg).
4404
4405<!Cite(MilnerEtAl90, The Definition of Standard ML)> is an older
4406version of the definition, published in 1990. The accompanying
4407<!Cite(MilnerTofte91, Commentary)> introduces and explains the notation
4408and approach. The same notation is used in the SML 97 definition, so it
4409is worth keeping the older definition and its commentary at hand if you
4410intend a close study of the definition.
4411
4412<<<
4413
4414:mlton-guide-page: Defunctorize
4415[[Defunctorize]]
4416Defunctorize
4417============
4418
4419<:Defunctorize:> is a translation pass from the <:CoreML:>
4420<:IntermediateLanguage:> to the <:XML:> <:IntermediateLanguage:>.
4421
4422== Description ==
4423
4424This pass converts a <:CoreML:> program to an <:XML:> program by
4425performing:
4426
4427* linearization
4428* <:MatchCompile:>
4429* polymorphic `val` dec expansion
4430* `datatype` lifting (to the top-level)
4431
4432== Implementation ==
4433
4434* <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.sig)>
4435* <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.fun)>
4436
4437== Details and Notes ==
4438
4439This pass is grossly misnamed and does not perform defunctorization.
4440
4441=== Datatype Lifting ===
4442
4443This pass moves all `datatype` declarations to the top level.
4444
4445<:StandardML:Standard ML> `datatype` declarations can contain type
4446variables that are not bound in the declaration itself. For example,
4447the following program is valid.
4448[source,sml]
4449----
4450fun 'a f (x: 'a) =
4451 let
4452 datatype 'b t = T of 'a * 'b
4453 val y: int t = T (x, 1)
4454 in
4455 13
4456 end
4457----
4458
4459Unfortunately, the `datatype` declaration cannot be immediately moved
4460to the top level, because that would leave `'a` free.
4461[source,sml]
4462----
4463datatype 'b t = T of 'a * 'b
4464fun 'a f (x: 'a) =
4465 let
4466 val y: int t = T (x, 1)
4467 in
4468 13
4469 end
4470----
4471
4472In order to safely move `datatype`s, this pass must close them, as
4473well as add any free type variables as extra arguments to the type
4474constructor. For example, the above program would be translated to
4475the following.
4476[source,sml]
4477----
4478datatype ('a, 'b) t = T of 'a * 'b
4479fun 'a f (x: 'a) =
4480 let
4481 val y: ('a, int) t = T (x, 1)
4482 in
4483 13
4484 end
4485----
4486
4487== Historical Notes ==
4488
4489The <:Defunctorize:> pass originally eliminated
4490<:StandardML:Standard ML> functors by duplicating their body at each
4491application. These duties have been adopted by the <:Elaborate:>
4492pass.
4493
4494<<<
4495
4496:mlton-guide-page: Developers
4497[[Developers]]
4498Developers
4499==========
4500
4501Here is a picture of the MLton team at a meeting in Chicago in August
45022003. From left to right we have:
4503
4504[align="center",frame="none",cols="^"]
4505|=====
4506|<:StephenWeeks:> -- <:MatthewFluet:> -- <:HenryCejtin:> -- <:SureshJagannathan:>
4507|=====
4508
4509image::Developers.attachments/team.jpg[align="center"]
4510
4511Also see the <:Credits:> for a list of specific contributions.
4512
4513
4514== Developers list ==
4515
4516A number of people read the developers mailing list,
4517mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], and make
4518contributions there. Here's a list of those who have a page here.
4519
4520* <:AndreiFormiga:>
4521* <:JesperLouisAndersen:>
4522* <:JohnnyAndersen:>
4523* <:MichaelNorrish:>
4524* <:MikeThomas:>
4525* <:RayRacine:>
4526* <:WesleyTerpstra:>
4527* <:VesaKarvonen:>
4528
4529<<<
4530
4531:mlton-guide-page: Development
4532[[Development]]
4533Development
4534===========
4535
4536This page is the central point for MLton development.
4537
4538* Access the <:Sources:>.
4539* Check the current <!ViewGitFile(mlton,master,CHANGELOG.adoc)> or recent https://github.com/MLton/mlton/commits/master[commits].
4540* Open https://github.com/MLton/mlton/issues[Issues].
4541* Ideas for <:Projects:> to improve MLton.
4542* <:Developers:> that are or have been involved in the project.
4543// * Help maintain and improve the <:WebSite:>.
4544
4545== Notes ==
4546
4547* <:CompilerOverview:>
4548* <:CompilingWithSMLNJ:>
4549* <:CrossCompiling:>
4550* <:License:>
4551* <:NeedsReview:>
4552* <:PortingMLton:>
4553* <:ReleaseChecklist:>
4554* <:SelfCompiling:>
4555
4556<<<
4557
4558:mlton-guide-page: Documentation
4559[[Documentation]]
4560Documentation
4561=============
4562
4563Documentation is available on the following topics.
4564
4565* <:StandardML:Standard ML>
4566** <:BasisLibrary:Basis Library>
4567** <:Libraries: Additional libraries>
4568* <:Installation:Installing MLton>
4569* Using MLton
4570** <:ForeignFunctionInterface: Foreign function interface (FFI)>
4571** <:ManualPage: Manual page> (<:CompileTimeOptions:compile-time options>, <:RunTimeOptions:run-time options>)
4572** <:MLBasis: ML Basis system>
4573** <:MLtonStructure: MLton structure>
4574** <:PlatformSpecificNotes: Platform-specific notes>
4575** <:Profiling: Profiling>
4576** <:TypeChecking: Type checking>
4577** Help for porting from <:SMLNJ:SML/NJ> to MLton.
4578* About MLton
4579** <:Credits:>
4580** <:Drawbacks:>
4581** <:Features:>
4582** <:History:>
4583** <:License:>
4584** <:Talk:>
4585** <:WishList:>
4586* Tools
4587** <:MLLex:> (<!Attachment(Documentation,mllex.pdf)>)
4588** <:MLYacc:> (<!Attachment(Documentation,mlyacc.pdf)>)
4589** <:MLNLFFIGen:> (<!Attachment(Documentation,mlyacc.pdf)>)
4590* <:References:>
4591
4592<<<
4593
4594:mlton-guide-page: Drawbacks
4595[[Drawbacks]]
4596Drawbacks
4597=========
4598
4599MLton has several drawbacks due to its use of whole-program
4600compilation.
4601
4602* Large compile-time memory requirement.
4603+
4604Because MLton performs whole-program analysis and optimization,
4605compilation requires a large amount of memory. For example, compiling
4606MLton (over 140K lines) requires at least 512M RAM.
4607
4608* Long compile times.
4609+
4610Whole-program compilation can take a long time. For example,
4611compiling MLton (over 140K lines) on a 1.6GHz machine takes five to
4612ten minutes.
4613
4614* No interactive top level.
4615+
4616Because of whole-program compilation, MLton does not provide an
4617interactive top level. In particular, it does not implement the
4618optional <:BasisLibrary:Basis Library> function `use`.
4619
4620<<<
4621
4622:mlton-guide-page: Eclipse
4623[[Eclipse]]
4624Eclipse
4625=======
4626
4627http://eclipse.org/[Eclipse] is an open, extensible IDE.
4628
4629http://www.cse.iitd.ernet.in/%7Ecsu02132/mldev/[ML-Dev] is a plug-in
4630for Eclipse, based on <:SMLNJ:SML/NJ>.
4631
4632There has been some talk on the MLton mailing list about adding
4633support to Eclipse for MLton/SML, and in particular, using
4634http://eclipsefp.sourceforge.net/. We are unaware of any progress
4635along those lines.
4636
4637<<<
4638
4639:mlton-guide-page: Elaborate
4640[[Elaborate]]
4641Elaborate
4642=========
4643
4644<:Elaborate:> is a translation pass from the <:AST:>
4645<:IntermediateLanguage:> to the <:CoreML:> <:IntermediateLanguage:>.
4646
4647== Description ==
4648
4649This pass performs type inference and type checking according to the
4650<:DefinitionOfStandardML:Definition>. It also defunctorizes the
4651program, eliminating all module-level constructs.
4652
4653== Implementation ==
4654
4655* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.sig)>
4656* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.fun)>
4657* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.sig)>
4658* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.fun)>
4659* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.sig)>
4660* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.fun)>
4661* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.sig)>
4662* <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.fun)>
4663* <!ViewGitDir(mlton,master,mlton/elaborate)>
4664
4665== Details and Notes ==
4666
4667At the modules level, the <:Elaborate:> pass:
4668
4669* elaborates signatures with interfaces (see
4670<!ViewGitFile(mlton,master,mlton/elaborate/interface.sig)> and
4671<!ViewGitFile(mlton,master,mlton/elaborate/interface.fun)>)
4672+
4673The main trick is to use disjoint sets to efficiently handle sharing
4674of tycons and of structures and then to copy signatures as dags rather
4675than as trees.
4676
4677* checks functors at the point of definition, using functor summaries
4678to speed up checking of functor applications.
4679+
4680When a functor is first type checked, we keep track of the dummy
4681argument structure and the dummy result structure, as well as all the
4682tycons that were created while elaborating the body. Then, if we
4683later need to type check an application of the functor (as opposed to
4684defunctorize an application), we pair up tycons in the dummy argument
4685structure with the actual argument structure and then replace the
4686dummy tycons with the actual tycons in the dummy result structure,
4687yielding the actual result structure. We also generate new tycons for
4688all the tycons that we created while originally elaborating the body.
4689
4690* handles opaque signature constraints.
4691+
4692This is implemented by building a dummy structure realized from the
4693signature, just as we would for a functor argument when type checking
4694a functor. The dummy structure contains exactly the type information
4695that is in the signature, which is what opacity requires. We then
4696replace the variables (and constructors) in the dummy structure with
4697the corresponding variables (and constructors) from the actual
4698structure so that the translation to <:CoreML:> uses the right stuff.
4699For each tycon in the dummy structure, we keep track of the
4700corresponding type structure in the actual structure. This is used
4701when producing the <:CoreML:> types (see `expandOpaque` in
4702<!ViewGitFile(mlton,master,mlton/elaborate/type-env.sig)> and
4703<!ViewGitFile(mlton,master,mlton/elaborate/type-env.fun)>).
4704+
4705Then, within each `structure` or `functor` body, for each declaration
4706(`<dec>` in the <:StandardML:Standard ML> grammar), the <:Elaborate:>
4707pass does three steps:
4708+
4709--
47101. <:ScopeInference:>
47112. {empty}
4712** <:PrecedenceParse:>
4713** `_{ex,im}port` expansion
4714** profiling insertion
4715** unification
47163. Overloaded {constant, function, record pattern} resolution
4717--
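
To make the opaque-constraint case concrete, here is a small source-level sketch. The signature `COUNTER` and structure `Counter` are hypothetical names chosen for illustration; the comments relate the program to the description above and are not a trace of the compiler's actual data structures.

[source,sml]
----
signature COUNTER =
   sig
      type t
      val zero: t
      val next: t -> t
      val toInt: t -> int
   end

(* Opaque constraint: outside the structure, Counter.t is abstract, which
 * matches a dummy structure carrying exactly the signature's type
 * information. *)
structure Counter :> COUNTER =
   struct
      type t = int
      val zero = 0
      fun next n = n + 1
      fun toInt n = n
   end

(* Clients can only go through the signature's operations. *)
val two = Counter.toInt (Counter.next (Counter.next Counter.zero))

(* Writing Counter.zero + 1 here would be rejected by the type checker,
 * because Counter.t is not known to be int at this point. *)
----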
4718
4719=== Defunctorization ===
4720
4721The <:Elaborate:> pass performs a number of duties historically
4722assigned to the <:Defunctorize:> pass.
4723
4724As part of the <:Elaborate:> pass, all module level constructs
4725(`open`, `signature`, `structure`, `functor`, long identifiers) are
4726removed. This works because the <:Elaborate:> pass assigns a unique
4727name to every type and variable in the program. This also allows the
4728<:Elaborate:> pass to eliminate `local` declarations, which are purely
4729for namespace management.
4730
4731
4732== Examples ==
4733
4734Here are a number of examples of elaboration.
4735
4736* All variables bound in `val` declarations are renamed.
4737+
4738[source,sml]
4739----
4740val x = 13
4741val y = x
4742----
4743+
4744----
4745val x_0 = 13
4746val y_0 = x_0
4747----
4748
4749* All variables in `fun` declarations are renamed.
4750+
4751[source,sml]
4752----
4753fun f x = g x
4754and g y = f y
4755----
4756+
4757----
4758fun f_0 x_0 = g_0 x_0
4759and g_0 y_0 = f_0 y_0
4760----
4761
4762* Type abbreviations are removed, and the abbreviation is expanded
4763wherever it is used.
4764+
4765[source,sml]
4766----
4767type 'a u = int * 'a
4768type 'b t = 'b u * real
4769fun f (x : bool t) = x
4770----
4771+
4772----
4773fun f_0 (x_0 : (int * bool) * real) = x_0
4774----
4775
4776* Exception declarations create a new constructor and rename the type.
4777+
4778[source,sml]
4779----
4780type t = int
4781exception E of t * real
4782----
4783+
4784----
4785exception E_0 of int * real
4786----
4787
4788* The type and value constructors in datatype declarations are renamed.
4789+
4790[source,sml]
4791----
4792datatype t = A of int | B of real * t
4793----
4794+
4795----
4796datatype t_0 = A_0 of int | B_0 of real * t_0
4797----
4798
4799* Local declarations are moved to the top-level. The environment
4800keeps track of the variables in scope.
4801+
4802[source,sml]
4803----
4804val x = 13
4805local val x = 14
4806in val y = x
4807end
4808val z = x
4809----
4810+
4811----
4812val x_0 = 13
4813val x_1 = 14
4814val y_0 = x_1
4815val z_0 = x_0
4816----
4817
4818* Structure declarations are eliminated, with all declarations moved
4819to the top level. Long identifiers are renamed.
4820+
4821[source,sml]
4822----
4823structure S =
4824 struct
4825 type t = int
4826 val x : t = 13
4827 end
4828val y : S.t = S.x
4829----
4830+
4831----
4832val x_0 : int = 13
4833val y_0 : int = x_0
4834----
4835
4836* Open declarations are eliminated.
4837+
4838[source,sml]
4839----
4840val x = 13
4841val y = 14
4842structure S =
4843 struct
4844 val x = 15
4845 end
4846open S
4847val z = x + y
4848----
4849+
4850----
4851val x_0 = 13
4852val y_0 = 14
4853val x_1 = 15
4854val z_0 = x_1 + y_0
4855----
4856
4857* Functor declarations are eliminated, and the body of a functor is
4858duplicated wherever the functor is applied.
4859+
4860[source,sml]
4861----
4862functor F(val x : int) =
4863 struct
4864 val y = x
4865 end
4866structure F1 = F(val x = 13)
4867structure F2 = F(val x = 14)
4868val z = F1.y + F2.y
4869----
4870+
4871----
4872val x_0 = 13
4873val y_0 = x_0
4874val x_1 = 14
4875val y_1 = x_1
4876val z_0 = y_0 + y_1
4877----
4878
4879* Signature constraints are eliminated. Note that signatures do
4880affect how subsequent variables are renamed.
4881+
4882[source,sml]
4883----
4884val y = 13
4885structure S : sig
4886 val x : int
4887 end =
4888 struct
4889 val x = 14
4890 val y = x
4891 end
4892open S
4893val z = x + y
4894----
4895+
4896----
4897val y_0 = 13
4898val x_0 = 14
4899val y_1 = x_0
4900val z_0 = x_0 + y_0
4901----
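
The effect of the signature constraint in the last example can be checked by running the source program directly. In this sketch the `check` binding is an addition for illustration: since the signature hides `S.y`, the `y` in `z` is the outer one, and `z` evaluates to `14 + 13`.

[source,sml]
----
val y = 13
structure S : sig
                 val x : int
              end =
   struct
      val x = 14
      val y = x   (* not exported by the signature *)
   end
open S            (* brings only x into scope *)
val z = x + y     (* x is S.x (14); y is the outer y (13) *)
val check = (z = 27)
----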
4902
4903<<<
4904
4905:mlton-guide-page: Emacs
4906[[Emacs]]
4907Emacs
4908=====
4909
4910== SML modes ==
4911
4912There are a few Emacs modes for SML.
4913
4914* `sml-mode`
4915** http://www.xemacs.org/Documentation/packages/html/sml-mode_3.html
4916** http://www.smlnj.org/doc/Emacs/sml-mode.html
4917** http://www.iro.umontreal.ca/%7Emonnier/elisp/
4918
4919* <!ViewGitFile(mlton,master,ide/emacs/mlton.el)> contains the Emacs lisp that <:StephenWeeks:> uses to interact with MLton (in addition to using `sml-mode`).
4920
4921* http://primate.net/%7Eitz/mindent.tar, developed by Ian Zimmerman, who writes:
4922+
4923_____
4924Unlike the widespread `sml-mode.el` it doesn't try to indent code
4925based on ML syntax. I gradually got skeptical about this approach
4926after writing the initial indentation support for caml mode and
4927watching it bloat insanely as the language added new features. Also,
4928any such attempts that I know of impose a particular coding style, or
4929at best a choice among a limited set of styles, which I now oppose.
4930Instead my mode is based on a generic package which provides manual
4931bindable commands for common indentation operations (example: indent
4932the current line under the n-th occurrence of a particular character
4933in the previous non-blank line).
4934_____
4935
4936== MLB modes ==
4937
4938There is a mode for editing <:MLBasis: ML Basis> files.
4939
4940* <!ViewGitFile(mlton,master,ide/emacs/esml-mlb-mode.el)> (plus other files)
4941
4942== Definitions and uses ==
4943
4944There is a mode that supports the precise def-use information that
4945MLton can output. It highlights definitions and uses and provides
4946commands for navigation (e.g., `jump-to-def`, `jump-to-next`,
4947`list-all-refs`). It can be handy, for example, for navigating in the
4948MLton compiler source code. See <:EmacsDefUseMode:> for further
4949information.
4950
4951== Building in the background ==
4952
4953Tired of manually starting/stopping/restarting builds after editing
4954files? Now you don't have to. See <:EmacsBgBuildMode:> for further
4955information.
4956
4957== Error messages ==
4958
4959MLton's error messages are not among those that the Emacs `next-error`
4960parser natively understands. The easiest way to fix this is to add
4961the following to your `.emacs` to teach Emacs to recognize MLton's
4962error messages.
4963
4964[source,cl]
4965----
4966(require 'compile)
4967(add-to-list 'compilation-error-regexp-alist 'mlton)
4968(add-to-list 'compilation-error-regexp-alist-alist
4969 '(mlton
4970 "^[[:space:]]*\\(\\(?:\\(Error\\)\\|\\(Warning\\)\\|\\(\\(?:\\(?:defn\\|spec\\) at\\)\\|\\(?:escape \\(?:from\\|to\\)\\)\\|\\(?:scoped at\\)\\)\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\(?:-\\([0-9]+\\)\\.\\([0-9]+\\)\\)?\\.?\\)$"
4971 5 (6 . 8) (7 . 9) (3 . 4) 1))
4972----
4973
4974<<<
4975
4976:mlton-guide-page: EmacsBgBuildMode
4977[[EmacsBgBuildMode]]
4978EmacsBgBuildMode
4979================
4980
4981Do you really want to think about starting a build of your project?
4982What if you had a personal slave that would restart a build of your
4983project whenever you save any file belonging to that project? The
4984bg-build mode does just that. Just save the file: a compile is
4985started (silently!), and you can continue working without even thinking
4986about starting a build. If there are errors, you are notified
4987(with a message) and can then jump to them.
4988
4989This mode is not specific to MLton per se, but is particularly useful
4990for working with MLton due to the longer compile times. By the time
4991you start wondering about possible errors, the build is already under
4992way.
4993
4994== Functionality and Features ==
4995
4996* Each time a file is saved, and after a user-configurable delay
4997period has elapsed, a build is started silently in the
4998background.
4999* When the build is finished, a status indicator (message) is
5000displayed non-intrusively.
5001* At any time, you can switch to a build process buffer where all the
5002messages from the build are shown.
5003* Optionally highlights (error/warning) message locations in (source
5004code) buffers after a finished build.
5005* After a build has finished, you can jump to locations of warnings
5006and errors from the build process buffer or by using the `first-error`
5007and `next-error` commands.
5008* When a build fails, bg-build mode can optionally execute a
5009user-specified command. By default, bg-build mode executes `first-error`.
5010* When starting a build of a particular project, a possible previous
5011live build of the same project is interrupted first.
5012* A project configuration file specifies the commands required to
5013build a project.
5014* Multiple projects can be loaded into bg-build mode and bg-build mode
5015can build a given maximum number of projects concurrently.
5016* Supports both http://www.gnu.org/software/emacs/[GNU Emacs] and
5017http://www.xemacs.org[XEmacs].
5018
5019
5020== Download ==
5021
5022There is no package for the mode at the moment. To install the mode you
5023need to fetch the Emacs Lisp, `*.el`, files from the MLton repository:
5024<!ViewGitDir(mlton,master,ide/emacs)>.
5025
5026
5027== Setup ==
5028
5029The easiest way to load the mode is to first tell Emacs where to find the
5030files. For example, add
5031
5032[source,cl]
5033----
(add-to-list 'load-path (file-truename "path-to-the-el-files"))
----

to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably also want
to start the mode automatically by adding

[source,cl]
----
(require 'bg-build-mode)
(bg-build-mode)
5044----
5045
5046to your Emacs init file. Once the mode is activated, you should see
5047the `BGB` indicator on the mode line.
5048
5049
5050=== MLton and Compilation-Mode ===
5051
At the time of writing, neither GNU Emacs nor XEmacs contains an error
regexp that would match MLton's messages.

If you use GNU Emacs, insert the following code into your `.emacs` file:

[source,cl]
----
(require 'compile)
(add-to-list
 'compilation-error-regexp-alist
 '("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$"
   2 3 4))
----

If you use XEmacs, insert the following code into your `init.el` file:

[source,cl]
----
(require 'compile)
(add-to-list
 'compilation-error-regexp-alist-alist
 '(mlton
   ("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$"
    2 3 4)))
(compilation-build-compilation-error-regexp-alist)
5077----
5078
5079== Usage ==
5080
5081Typically projects are built (or compiled) using a tool like http://www.gnu.org/software/make/[`make`],
5082but the details vary. The bg-build mode needs a project configuration file to
5083know how to build your project. A project configuration file basically contains
5084an Emacs Lisp expression calling a function named `bg-build` that returns a
5085project object. A simple example of a project configuration file would be the
5086(<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/example/smlbot/Build.bgb)>)
5087file used with smlbot:
5088
5089[source,cl]
5090----
5091sys::[./bin/InclGitFile.py mltonlib master com/ssh/async/unstable/example/smlbot/Build.bgb 5:]
5092----
5093
5094The `bg-build` function takes a number of keyword arguments:
5095
5096* `:name` specifies the name of the project. This can be any
5097expression that evaluates to a string or to a nullary function that
5098returns a string.
5099
5100* `:shell` specifies a shell command to execute. This can be any
5101expression that evaluates to a string, a list of strings, or to a
5102nullary function returning a list of strings.
5103
5104* `:build?` specifies a predicate to determine whether the project
5105should be built after some files have been modified. The predicate is
5106given a list of filenames and should return a non-nil value when the
5107project should be built and nil otherwise.
5108
5109All of the keyword arguments, except `:shell`, are optional and can be left out.
5110
Note the use of the `nice` command above. It means that the background
build process is given a lower priority by the system process
scheduler. Assuming your machine has enough memory, using `nice`
helps keep your computer responsive. (You probably won't
even notice when a build is started.)

Once you have written a project file for bg-build mode, use the
`bg-build-add-project` command to load it. The bg-build mode can also
optionally load recent project files automatically at startup.
5121
5122After the project file has been loaded and bg-build mode activated,
5123each time you save a file in Emacs, the bg-build mode tries to build
5124your project.
5125
5126The `bg-build-status` command creates a buffer that displays some
5127status information on builds and allows you to manage projects (start
5128builds explicitly, remove a project from bg-build, ...) as well as
5129visit buffers created by bg-build. Notice the count of started
5130builds. At the end of the day it can be in the hundreds or thousands.
5131Imagine the number of times you've been relieved of starting a build
5132explicitly!
5133
5134<<<
5135
5136:mlton-guide-page: EmacsDefUseMode
5137[[EmacsDefUseMode]]
5138EmacsDefUseMode
5139===============
5140
5141MLton provides an <:CompileTimeOptions:option>,
5142++-show-def-use __file__++, to output precise (giving exact source
5143locations) and accurate (including all uses and no false data)
5144whole-program def-use information to a file. Unlike typical tags
5145facilities, the information includes local variables and distinguishes
5146between different definitions even when they have the same name. The
5147def-use Emacs mode uses the information to provide navigation support,
5148which can be particularly useful while reading SML programs compiled
5149with MLton (such as the MLton compiler itself).
5150
5151
5152== Screen Capture ==
5153
5154Note the highlighting and the type displayed in the minibuffer.
5155
5156image::EmacsDefUseMode.attachments/def-use-capture.png[align="center"]
5157
5158
5159== Features ==
5160
5161* Highlights definitions and uses. Different colors for definitions, unused definitions, and uses.
5162* Shows types (with highlighting) of variable definitions in the minibuffer.
5163* Navigation: `jump-to-def`, `jump-to-next`, and `jump-to-prev`. These work precisely (no searching involved).
5164* Can list, visit and mark all references to a definition (within a program).
5165* Automatically reloads updated def-use files.
5166* Automatically loads previously used def-use files at startup.
* Supports both http://www.gnu.org/software/emacs/[GNU Emacs] and http://www.xemacs.org[XEmacs].
5168
5169
5170== Download ==
5171
5172There is no separate package for the def-use mode although the mode
5173has been relatively stable for some time already. To install the mode
5174you need to get the Emacs Lisp, `*.el`, files from MLton's repository:
5175<!ViewGitDir(mlton,master,ide/emacs)>. The easiest way to get the files
5176is to use <:Git:> to access MLton's <:Sources:sources>.
5177
5178/////
5179If you only want the Emacs lisp files, you can use the following
5180command:
5181----
5182svn co svn://mlton.org/mlton/trunk/ide/emacs mlton-emacs-ide
5183----
5184/////
5185
5186== Setup ==
5187
5188The easiest way to load def-use mode is to first tell Emacs where to
5189find the files. For example, add
5190
5191[source,cl]
5192----
(add-to-list 'load-path (file-truename "path-to-the-el-files"))
----

to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably
also want to start `def-use-mode` automatically by adding

[source,cl]
----
(require 'esml-du-mlton)
(def-use-mode)
5203----
5204
5205to your Emacs init file. Once the def-use mode is activated, you
5206should see the `DU` indicator on the mode line.
5207
5208== Usage ==
5209
To use def-use mode one typically first sets up the program's makefile
or build script so that the def-use information is saved each time the
program is compiled. In addition to the ++-show-def-use __file__++
option, the ++-prefer-abs-paths true++ expert option is required.
Note that the time it takes to save the information is small (compared
to type-checking), so it is recommended to simply add the options to
the MLton invocation that compiles the program. Alternatively, because
producing the information only requires type checking the program (or
library), one can specify the ++-stop tc++ option. For example, if you
have a program defined by an MLB file named `my-prg.mlb`, you can save
the def-use information to the file `my-prg.du` by invoking MLton as:

----
mlton -prefer-abs-paths true -show-def-use my-prg.du -stop tc my-prg.mlb
----

Finally, one needs to tell the mode where to find the def-use
information. This is done with the `esml-du-mlton` command. For
example, to load the `my-prg.du` file, one would type:

----
M-x esml-du-mlton my-prg.du
5232----
5233
5234After doing all of the above, find an SML file covered by the
5235previously saved and loaded def-use information, and place the cursor
5236at some variable (definition or use, it doesn't matter). You should
5237see the variable being highlighted. (Note that specifications in
5238signatures do not define variables.)
5239
You might also want to set up and use the
<:EmacsBgBuildMode:Bg-Build mode> to start builds automatically.
5242
5243
5244== Types ==
5245
5246`-show-def-use` output was extended to include types of variable
5247definitions in revision <!ViewSVNRev(6333)>. To get good type names, the
5248types must be in scope at the end of the program. If you are using the
5249<:MLBasis:ML Basis> system, this means that the root MLB-file for your
5250application should not wrap the libraries used in the application inside
5251`local ... in ... end`, because that would remove them from the scope before
5252the end of the program.
5253
5254<<<
5255
5256:mlton-guide-page: Enscript
5257[[Enscript]]
5258Enscript
5259========
5260
5261http://www.gnu.org/s/enscript/[GNU Enscript] converts ASCII files to
5262PostScript, HTML, and other output languages, applying language
5263sensitive highlighting (similar to <:Emacs:>'s font lock mode). Here
5264are a few _states_ files for highlighting <:StandardML: Standard ML>.
5265
5266* <!ViewGitFile(mlton,master,ide/enscript/sml_simple.st)> -- Provides highlighting of keywords, string and character constants, and (nested) comments.
5267/////
5268+
5269[source,sml]
5270----
(* Comments (* can be nested *) *)
structure S = struct
   val x = (1, 2, "three")
end
5275----
5276/////
5277
5278* <!ViewGitFile(mlton,master,ide/enscript/sml_verbose.st)> -- Supersedes
5279the above, adding highlighting of numeric constants. Due to the
5280limited parsing available, numeric record labels are highlighted as
5281numeric constants, in all contexts. Likewise, a binding precedence
5282separated from `infix` or `infixr` by a newline is highlighted as a
5283numeric constant and a numeric record label selector separated from
5284`#` by a newline is highlighted as a numeric constant.
5285/////
5286+
5287[source,sml]
5288----
structure S = struct
   (* These look good *)
   val x = (1, 2, "three")
   val z = #2 x

   (* Although these look bad (not all the numbers are constants),       *
    * they never occur in practice, as they are equivalent to the above. *)
   val x = {1 = 1, 3 = "three", 2 = 2}
   val z = #
           2 x
end
5300----
5301/////
5302
5303* <!ViewGitFile(mlton,master,ide/enscript/sml_fancy.st)> -- Supersedes the
5304above, adding highlighting of type and constructor bindings,
5305highlighting of explicit binding of type variables at `val` and `fun`
5306declarations, and separate highlighting of core and modules level
5307keywords. Due to the limited parsing available, it is assumed that
5308the input is a syntactically correct, top-level declaration.
5309/////
5310+
5311[source,sml]
5312----
structure S = struct
   val x = (1, 2, "three")
   datatype 'a t = T of 'a
        and u = U of v * v
   withtype v = {left: int t, right: int t}
   exception E1 of int and E2
   fun 'a id (x: 'a) : 'a = x

   (* Although this looks bad (the explicitly bound type variable 'a is *
    * not highlighted), it is unlikely to occur in practice.            *)
   val
      'a id = fn (x : 'a) => x
end
5326----
5327/////
5328
5329* <!ViewGitFile(mlton,master,ide/enscript/sml_gaudy.st)> -- Supersedes the
5330above, adding highlighting of type annotations, in both expressions
5331and signatures. Due to the limited parsing available, it is assumed
5332that the input is a syntactically correct, top-level declaration.
5333/////
5334+
5335[source,sml]
5336----
signature S = sig
   type t
   val x : t
   val f : t * int -> int
end
structure S : S = struct
   datatype t = T of int
   val x : t = T 0
   fun f (T x, i : int) : int = x + i
   fun 'a id (x: 'a) : 'a = x
end
5348----
5349/////
5350
5351== Install and use ==
5352
5353* Version 1.6.3 of http://people.ssh.com/mtr/genscript[GNU Enscript]
5354** Copy all files to `/usr/share/enscript/hl/` or `.enscript/` in your home directory.
5355** Invoke `enscript` with `--highlight=sml_simple` (or `--highlight=sml_verbose` or `--highlight=sml_fancy` or `--highlight=sml_gaudy`).
5356
5357* Version 1.6.1 of http://people.ssh.com/mtr/genscript[GNU Enscript]
5358** Append <!ViewGitFile(mlton,master,ide/enscript/sml_all.st)> to `/usr/share/enscript/enscript.st`
5359** Invoke `enscript` with `--pretty-print=sml_simple` (or `--pretty-print=sml_verbose` or `--pretty-print=sml_fancy` or `--pretty-print=sml_gaudy`).
5360
5361== Feedback ==
5362
5363Comments and suggestions should be directed to <:MatthewFluet:>.
5364
5365<<<
5366
5367:mlton-guide-page: EqualityType
5368[[EqualityType]]
5369EqualityType
5370============
5371
5372An equality type is a type to which <:PolymorphicEquality:> can be
5373applied. The <:DefinitionOfStandardML:Definition> and the
5374<:BasisLibrary:Basis Library> precisely spell out which types are
5375equality types.
5376
5377* `bool`, `char`, `IntInf.int`, ++Int__<N>__.int++, `string`, and ++Word__<N>__.word++ are equality types.
5378
5379* for any `t`, both `t array` and `t ref` are equality types.
5380
5381* if `t` is an equality type, then `t list`, and `t vector` are equality types.
5382
5383* if `t1`, ..., `tn` are equality types, then `t1 * ... * tn` and `{l1: t1, ..., ln: tn}` are equality types.
5384
5385* if `t1`, ..., `tn` are equality types and `t` <:AdmitsEquality:>, then `(t1, ..., tn) t` is an equality type.
5386
5387To check that a type t is an equality type, use the following idiom.
5388[source,sml]
5389----
structure S: sig eqtype t end =
   struct
      type t = ...
   end
5394----
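
As a hedged illustration of the idiom, the following sketch instantiates it for a few concrete types. The structure names (`CheckPair`, `CheckRef`, `CheckTree`, `CheckReal`) and the `tree` datatype are hypothetical and chosen only for this example; each structure is accepted exactly when its `t` is an equality type according to the rules above.

[source,sml]
----
(* Accepted: products and lists of equality types are equality types. *)
structure CheckPair: sig eqtype t end =
   struct
      type t = int * string list
   end

(* Accepted: `t ref` is an equality type for any `t`, even a function type. *)
structure CheckRef: sig eqtype t end =
   struct
      type t = (int -> int) ref
   end

(* Accepted: this datatype admits equality, and `int` is an equality type. *)
datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
structure CheckTree: sig eqtype t end =
   struct
      type t = int tree
   end

(* Rejected if uncommented, because `real` is not an equality type.
structure CheckReal: sig eqtype t end =
   struct
      type t = real
   end
*)
----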
5395
5396Notably, `exn` and `real` are not equality types. Neither is `t1 -> t2`, for any `t1` and `t2`.
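
Because `real` is not an equality type, an expression such as `1.0 = 1.0` is rejected by the type checker. As a small sketch, the Basis Library functions `Real.==` and `Real.compare` can be used instead. The names `same`, `ordering`, `close`, and `near`, and the tolerance `1E~9`, are assumptions made for this example only.

[source,sml]
----
(* IEEE equality on reals; `1.0 = 1.0` would not type check. *)
val same = Real.== (1.0, 1.0)

(* `Real.compare` returns an `order`, which is itself an equality type. *)
val ordering = Real.compare (1.0, 2.0)

(* Comparing up to a tolerance is often what is wanted for computed values. *)
fun close (x: real, y: real): bool = Real.abs (x - y) < 1E~9
val near = close (0.1 + 0.2, 0.3)
----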
5397
5398Equality on arrays and ref cells is by identity, not structure.
5399For example, `ref 13 = ref 13` is `false`.
5400On the other hand, equality for lists, strings, and vectors is by
5401structure, not identity. For example, the following equalities hold.
5402
5403[source,sml]
5404----
val _ = [1, 2, 3] = 1 :: [2, 3]
val _ = "foo" = concat ["f", "o", "o"]
val _ = Vector.fromList [1, 2, 3] = Vector.tabulate (3, fn i => i + 1)
5408----
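
The contrast between identity and structure can be sketched as follows. All value names are hypothetical; `Array.vector` is the Basis Library function that copies an array into an immutable vector.

[source,sml]
----
val r = ref 13
val sameCell = r = r                    (* true: one cell compared with itself *)
val distinctCells = ref 13 = ref 13     (* false: two separate cells *)

val a = Array.fromList [1, 2, 3]
val b = Array.fromList [1, 2, 3]
val sameArray = a = a                   (* true: identity *)
val distinctArrays = a = b              (* false, despite equal contents *)

(* Taking immutable snapshots switches to structural comparison. *)
val sameContents = Array.vector a = Array.vector b   (* true *)
----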
5409
5410<<<
5411
5412:mlton-guide-page: EqualityTypeVariable
5413[[EqualityTypeVariable]]
5414EqualityTypeVariable
5415====================
5416
5417An equality type variable is a type variable that starts with two or
5418more primes, as in `''a` or `''b`. The canonical use of equality type
5419variables is in specifying the type of the <:PolymorphicEquality:>
5420function, which is `''a * ''a -> bool`. Equality type variables
5421ensure that polymorphic equality is only used on
5422<:EqualityType:equality types>, by requiring that at every use of a
5423polymorphic value, equality type variables are instantiated by
5424equality types.
5425
5426For example, the following program is type correct because polymorphic
5427equality is applied to variables of type `''a`.
5428
5429[source,sml]
5430----
fun f (x: ''a, y: ''a): bool = x = y
5432----
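
A function inherits the equality requirement from the operations it uses. As a sketch, the hypothetical `member` function below applies `=` to list elements, so its element type must be an equality type variable, and each call instantiates that variable with an equality type.

[source,sml]
----
(* `member` uses `=` on list elements, so its type is ''a * ''a list -> bool. *)
fun member (x: ''a, ys: ''a list): bool =
   List.exists (fn y => y = x) ys

val hasTwo = member (2, [1, 2, 3])               (* instantiated at int *)
val hasZ = member ("z", ["x", "y"])              (* instantiated at string *)
val hasPair = member ((1, #"a"), [(1, #"a")])    (* instantiated at int * char *)
----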
5433
5434On the other hand, the following program is not type correct, because
5435polymorphic equality is applied to variables of type `'a`, which is
5436not an equality type.
5437
5438[source,sml]
5439----
fun f (x: 'a, y: 'a): bool = x = y
----

MLton reports the following error, indicating that polymorphic
equality expects equality types, but didn't get them.

----
Error: z.sml 1.30-1.34.
  Function applied to incorrect argument.
    expects: [<equality>] * [<equality>]
    but got: ['a] * ['a]
    in: = (x, y)
5452----
5453
5454As an example of using such a function that requires equality types,
5455suppose that `f` has polymorphic type `''a -> unit`. Then, `f 13` is
5456type correct because `int` is an equality type. On the other hand,
5457`f 13.0` and `f (fn x => x)` are not type correct, because `real` and
5458arrow types are not equality types. We can test these facts with the
5459following short programs. First, we verify that such an `f` can be
5460applied to integers.
5461
5462[source,sml]
5463----
functor Ok (val f: ''a -> unit): sig end =
   struct
      val () = f 13
      val () = f 14
   end
----

We can do better, and verify that such an `f` can be applied to
any integer.

[source,sml]
----
functor Ok (val f: ''a -> unit): sig end =
   struct
      fun g (x: int) = f x
   end
----

Even better, we don't need to introduce a dummy function name; we can
use a type constraint.

[source,sml]
----
functor Ok (val f: ''a -> unit): sig end =
   struct
      val _ = f: int -> unit
   end
5491----
5492
5493Even better, we can use a signature constraint.
5494
5495[source,sml]
5496----
functor Ok (S: sig val f: ''a -> unit end):
   sig val f: int -> unit end = S
----

This functor concisely verifies that a function of polymorphic type
`''a -> unit` can be safely used as a function of type `int -> unit`.

As above, we can verify that such an `f` cannot be used at
non-equality types.

[source,sml]
----
functor Bad (S: sig val f: ''a -> unit end):
   sig val f: real -> unit end = S

functor Bad (S: sig val f: ''a -> unit end):
   sig val f: ('a -> 'a) -> unit end = S
----

MLton reports the following errors.

----
Error: z.sml 2.4-2.30.
  Variable in structure disagrees with signature (type): f.
    structure: val f: [<equality>] -> _
      defn at: z.sml 1.25-1.25
    signature: val f: [real] -> _
      spec at: z.sml 2.12-2.12
Error: z.sml 5.4-5.36.
  Variable in structure disagrees with signature (type): f.
    structure: val f: [<equality>] -> _
      defn at: z.sml 4.25-4.25
    signature: val f: [_ -> _] -> _
      spec at: z.sml 5.12-5.12
5531----
5532
5533
5534== Equality type variables in type and datatype declarations ==
5535
5536Equality type variables can be used in type and datatype declarations;
5537however they play no special role. For example,
5538
5539[source,sml]
5540----
type 'a t = 'a * int
----

is completely identical to

[source,sml]
----
type ''a t = ''a * int
----

In particular, such a definition does _not_ require that `t` only be
applied to equality types.

Similarly,

[source,sml]
----
datatype 'a t = A | B of 'a
----

is completely identical to

[source,sml]
----
datatype ''a t = A | B of ''a
5566----
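
To illustrate (a sketch; the datatype name `u` and the values below are hypothetical), a datatype declared with an equality type variable can still be applied to `real` or to a function type. Only the use of `=` on the resulting values is restricted, and that restriction follows from the usual rules for equality types rather than from the declaration.

[source,sml]
----
datatype ''a u = A | B of ''a

val r: real u = B 1.0                        (* accepted: `u` applied to `real` *)
val f: (int -> int) u = B (fn x => x + 1)    (* accepted: `u` applied to an arrow type *)
val i: int u = B 13

val intsEqual = i = B 13                     (* `int u` is an equality type *)
(* val realsEqual = r = B 1.0   would be rejected: `real u` is not an equality type *)
----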
5567
5568<<<
5569
5570:mlton-guide-page: EtaExpansion
5571[[EtaExpansion]]
5572EtaExpansion
5573============
5574
5575Eta expansion is a simple syntactic change used to work around the
5576<:ValueRestriction:> in <:StandardML:Standard ML>.
5577
5578The eta expansion of an expression `e` is the expression
5579`fn z => e z`, where `z` does not occur in `e`. This only
5580makes sense if `e` denotes a function, i.e. is of arrow type. Eta
5581expansion delays the evaluation of `e` until the function is
5582applied, and will re-evaluate `e` each time the function is
5583applied.
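
Both effects can be seen in a small sketch. The names `compose`, `idTwice`, `mkSucc`, and `evaluations` are hypothetical and exist only for this example.

[source,sml]
----
fun compose f g x = f (g x)
fun id x = x

(* `compose id id` is an application, not a syntactic value, so under the
 * value restriction the binding below would not be polymorphic:
 *    val idTwice = compose id id
 * Eta-expanding turns the right-hand side into a `fn` expression, which is
 * a value, so `idTwice` receives the polymorphic type 'a -> 'a. *)
val idTwice = fn z => compose id id z
val thirteen = idTwice 13
val foo = idTwice "foo"

(* Eta expansion also changes when `e` is evaluated. *)
val evaluations = ref 0
fun mkSucc () = (evaluations := !evaluations + 1; fn x => x + 1)

val once = mkSucc ()              (* `mkSucc ()` is evaluated here, once *)
val each = fn z => mkSucc () z    (* `mkSucc ()` is evaluated at every call *)

val _ = (once 1, once 2)          (* no further evaluations *)
val afterOnce = !evaluations      (* 1 *)
val _ = (each 1, each 2)          (* two more evaluations *)
val afterEach = !evaluations      (* 3 *)
----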
5584
5585The name "eta expansion" comes from the eta-conversion rule of the
5586<:LambdaCalculus:lambda calculus>. Expansion refers to the
5587directionality of the equivalence being used, namely taking `e` to
5588`fn z => e z` rather than `fn z => e z` to `e` (eta
5589contraction).
5590
5591<<<
5592
5593:mlton-guide-page: eXene
5594[[eXene]]
5595eXene
5596=====
5597
5598http://people.cs.uchicago.edu/%7Ejhr/eXene/index.html[eXene] is a
5599multi-threaded X Window System toolkit written in <:ConcurrentML:>.
5600
5601There is a group at K-State working toward
5602http://www.cis.ksu.edu/%7Estough/eXene/[eXene 2.0].
5603
5604<<<
5605
5606:mlton-guide-page: FAQ
5607[[FAQ]]
5608FAQ
5609===
5610
5611Feel free to ask questions and to update answers by editing this page.
5612Since we try to make as much information as possible available on the
5613web site and we like to avoid duplication, many of the answers are
5614simply links to a web page that answers the question.
5615
5616== How do you pronounce MLton? ==
5617
5618<:Pronounce:>
5619
5620== What SML software has been ported to MLton? ==
5621
5622<:Libraries:>
5623
5624== What graphical libraries are available for MLton? ==
5625
5626<:Libraries:>
5627
5628== How does MLton's performance compare to other SML compilers and to other languages? ==
5629
5630MLton has <:Performance:excellent performance>.
5631
5632== Does MLton treat monomorphic arrays and vectors specially? ==
5633
5634MLton implements monomorphic arrays and vectors (e.g. `BoolArray`,
5635`Word8Vector`) exactly as instantiations of their polymorphic
5636counterpart (e.g. `bool array`, `Word8.word vector`). Thus, there is
5637no need to use the monomorphic versions except when required to
5638interface with the <:BasisLibrary:Basis Library> or for portability
5639with other SML implementations.
5640
5641== Why do I get a Segfault/Bus error in a program that uses `IntInf`/`LargeInt` to calculate numbers with several hundred thousand digits? ==
5642
5643<:GnuMP:>
5644
5645== How can I decrease compile-time memory usage? ==
5646
5647* Compile with `-verbose 3` to find out if the problem is due to an
5648SSA optimization pass. If so, compile with ++-disable-pass __pass__++ to
5649skip that pass.
5650
5651* Compile with `@MLton hash-cons 0.5 --`, which will instruct the
5652runtime to hash cons the heap every other GC.
5653
5654* Compile with `-polyvariance false`, which is an undocumented option
5655that causes less code duplication.
5656
5657Also, please <:Contact:> us to let us know the problem to help us
5658better understand MLton's limitations.
5659
5660== How portable is SML code across SML compilers? ==
5661
5662<:StandardMLPortability:>
5663
5664<<<
5665
5666:mlton-guide-page: Features
5667[[Features]]
5668Features
5669========
5670
5671MLton has the following features.
5672
5673== Portability ==
5674
5675* Runs on a variety of platforms.
5676
5677** <:RunningOnARM:ARM>:
5678*** <:RunningOnLinux:Linux> (Debian)
5679
5680** <:RunningOnAlpha:Alpha>:
5681*** <:RunningOnLinux:Linux> (Debian)
5682
5683** <:RunningOnAMD64:AMD64>:
5684*** <:RunningOnDarwin:Darwin> (Mac OS X)
5685*** <:RunningOnFreeBSD:FreeBSD>
5686*** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...)
5687*** <:RunningOnOpenBSD:OpenBSD>
5688*** <:RunningOnSolaris:Solaris> (10 and above)
5689
5690** <:RunningOnHPPA:HPPA>:
5691*** <:RunningOnHPUX:HPUX> (11.11 and above)
5692*** <:RunningOnLinux:Linux> (Debian)
5693
5694** <:RunningOnIA64:IA64>:
5695*** <:RunningOnHPUX:HPUX> (11.11 and above)
5696*** <:RunningOnLinux:Linux> (Debian)
5697
5698** <:RunningOnPowerPC:PowerPC>:
5699*** <:RunningOnAIX:AIX> (5.2 and above)
5700*** <:RunningOnDarwin:Darwin> (Mac OS X)
5701*** <:RunningOnLinux:Linux> (Debian, Fedora, ...)
5702
5703** <:RunningOnPowerPC64:PowerPC64>:
5704*** <:RunningOnAIX:AIX> (5.2 and above)
5705
** <:RunningOnS390:S390>:
*** <:RunningOnLinux:Linux> (Debian)

** <:RunningOnSparc:Sparc>:
5710*** <:RunningOnLinux:Linux> (Debian)
5711*** <:RunningOnSolaris:Solaris> (8 and above)
5712
5713** <:RunningOnX86:X86>:
5714*** <:RunningOnCygwin:Cygwin>/Windows
5715*** <:RunningOnDarwin:Darwin> (Mac OS X)
5716*** <:RunningOnFreeBSD:FreeBSD>
5717*** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...)
5718*** <:RunningOnMinGW:MinGW>/Windows
5719*** <:RunningOnNetBSD:NetBSD>
5720*** <:RunningOnOpenBSD:OpenBSD>
5721*** <:RunningOnSolaris:Solaris> (10 and above)
5722
5723== Robustness ==
5724
5725* Supports the full SML 97 language as given in <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>.
5726+
5727If there is a program that is valid according to the
5728<:DefinitionOfStandardML:Definition> that is rejected by MLton, or a
5729program that is invalid according to the
5730<:DefinitionOfStandardML:Definition> that is accepted by MLton, it is
5731a bug. For a list of known bugs, see <:UnresolvedBugs:>.
5732
5733* A complete implementation of the <:BasisLibrary:Basis Library>.
5734+
MLton's implementation matches the latest <:BasisLibrary:Basis Library>
http://www.standardml.org/Basis[specification], and includes a
complete implementation of all the required modules, as well as many
of the optional modules.
5739
5740* Generates standalone executables.
5741+
5742No additional code or libraries are necessary in order to run an
5743executable, except for the standard shared libraries. MLton can also
5744generate statically linked executables.
5745
5746* Compiles large programs.
5747+
5748MLton is sufficiently efficient and robust that it can compile large
5749programs, including itself (over 190K lines). The distributed version
5750of MLton was compiled by MLton.
5751
5752* Support for large amounts of memory (up to 4G on 32-bit systems; more on 64-bit systems).
5753
5754* Support for large array lengths (up to 2^31^-1 on 32-bit systems; up to 2^63^-1 on 64-bit systems).
5755
5756* Support for large files, using 64-bit file positions.
5757
5758== Performance ==
5759
5760* Executables have <:Performance:excellent running times>.
5761
5762* Generates small executables.
5763+
5764MLton takes advantage of whole-program compilation to perform very
5765aggressive dead-code elimination, which often leads to smaller
5766executables than with other SML compilers.
5767
5768* Untagged and unboxed native integers, reals, and words.
5769+
5770In MLton, integers and words are 8 bits, 16 bits, 32 bits, and 64 bits
5771and arithmetic does not have any overhead due to tagging or boxing.
5772Also, reals (32-bit and 64-bit) are stored unboxed, avoiding any
5773overhead due to boxing.
5774
5775* Unboxed native arrays.
5776+
5777In MLton, an array (or vector) of integers, reals, or words uses the
5778natural C-like representation. This is fast and supports easy
5779exchange of data with C. Monomorphic arrays (and vectors) use the
5780same C-like representations as their polymorphic counterparts.
5781
5782* Multiple <:GarbageCollection:garbage collection> strategies.
5783
5784* Fast arbitrary precision arithmetic (`IntInf`) based on <:GnuMP:>.
5785+
5786For `IntInf` intensive programs, MLton can be an order of magnitude or
5787more faster than Poly/ML or SML/NJ.
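
The fixed-width arithmetic and `IntInf` support listed above can be exercised with a short sketch. It assumes the `Word8`, `Int64`, and `IntInf` structures are available, as they are in MLton; the value and function names are hypothetical.

[source,sml]
----
(* Fixed-width words wrap around on overflow. *)
val wrapped: Word8.word = Word8.+ (0w255, 0w1)          (* 0w0 *)

(* Fixed-width integers raise Overflow instead of wrapping. *)
val maxInt64: Int64.int = 9223372036854775807
val overflowed = (ignore (maxInt64 + 1); false) handle Overflow => true

(* IntInf provides arbitrary precision, so large factorials are exact. *)
fun fact (n: IntInf.int): IntInf.int =
   if n <= 1 then 1 else n * fact (n - 1)
val fact25 = IntInf.toString (fact 25)   (* "15511210043330985984000000" *)
----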
5788
5789== Tools ==
5790
5791* Source-level <:Profiling:> of both time and allocation.
5792* <:MLLex:> lexer generator
5793* <:MLYacc:> parser generator
5794* <:MLNLFFIGen:> foreign-function-interface generator
5795
5796== Extensions ==
5797
5798* A simple and fast C <:ForeignFunctionInterface:> that supports calling from SML to C and from C to SML.
5799
5800* The <:MLBasis:ML Basis system> for programming in the very large, separate delivery of library sources, and more.
5801
5802* A number of extension libraries that provide useful functionality
5803that cannot be implemented with the <:BasisLibrary:Basis Library>.
5804See below for an overview and <:MLtonStructure:> for details.
5805
5806** <:MLtonCont:continuations>
5807+
5808MLton supports continuations via `callcc` and `throw`.
5809
5810** <:MLtonFinalizable:finalization>
5811+
5812MLton supports finalizable values of arbitrary type.
5813
5814** <:MLtonItimer:interval timers>
5815+
5816MLton supports the functionality of the C `setitimer` function.
5817
5818** <:MLtonRandom:random numbers>
5819+
5820MLton has functions similar to the C `rand` and `srand` functions, as well as support for access to `/dev/random` and `/dev/urandom`.
5821
5822** <:MLtonRlimit:resource limits>
5823+
5824MLton has functions similar to the C `getrlimit` and `setrlimit` functions.
5825
5826** <:MLtonRusage:resource usage>
5827+
5828MLton supports a subset of the functionality of the C `getrusage` function.
5829
5830** <:MLtonSignal:signal handlers>
5831+
5832MLton supports signal handlers written in SML. Signal handlers run in
5833a separate MLton thread, and have access to the thread that was
5834interrupted by the signal. Signal handlers can be used in conjunction
5835with threads to implement preemptive multitasking.
5836
5837** <:MLtonStructure:size primitive>
5838+
5839MLton includes a primitive that returns the size (in bytes) of any
5840object. This can be useful in understanding the space behavior of a
5841program.
5842
5843** <:MLtonSyslog:system logging>
5844+
5845MLton has a complete interface to the C `syslog` function.
5846
5847** <:MLtonThread:threads>
5848+
5849MLton has support for its own threads, upon which either preemptive or
5850non-preemptive multitasking can be implemented. MLton also has
5851support for <:ConcurrentML:Concurrent ML> (CML).
5852
5853** <:MLtonWeak:weak pointers>
5854+
5855MLton supports weak pointers, which allow the garbage collector to
5856reclaim objects that it would otherwise be forced to keep. Weak
5857pointers are also used to provide finalization.
5858
5859** <:MLtonWorld:world save and restore>
5860+
5861MLton has a facility for saving the entire state of a computation to a
5862file and restarting it later. This facility can be used for staging
5863and for checkpointing computations. It can even be used from within
5864signal handlers, allowing interrupt driven checkpointing.
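
As one small example of these extension libraries, the following sketch uses `callcc` and `throw` from the `MLton.Cont` structure. It is MLton-specific and will not compile with other SML implementations; the value names are hypothetical.

[source,sml]
----
(* Escape from a computation: the pending `1 + ...` is discarded and
 * 41 becomes the result of `callcc`. *)
val escaped: int =
   MLton.Cont.callcc (fn k => 1 + MLton.Cont.throw (k, 41))

(* If the continuation is never thrown to, `callcc` returns normally. *)
val normal: int = MLton.Cont.callcc (fn _ => 42)
----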
5865
5866<<<
5867
5868:mlton-guide-page: FirstClassPolymorphism
5869[[FirstClassPolymorphism]]
5870FirstClassPolymorphism
5871======================
5872
5873First-class polymorphism is the ability to treat polymorphic functions
5874just like other values: pass them as arguments, store them in data
5875structures, etc. Although <:StandardML:Standard ML> does have
5876polymorphic functions, it does not support first-class polymorphism.
5877
5878For example, the following declares and uses the polymorphic function
5879`id`.
5880[source,sml]
5881----
val id = fn x => x
val _ = id 13
val _ = id "foo"
----

If SML supported first-class polymorphism, we could write the
following.
[source,sml]
----
fun useId id = (id 13; id "foo")
----

However, this does not type check. MLton reports the following error.
----
Error: z.sml 1.24-1.31.
  Function applied to incorrect argument.
    expects: [int]
    but got: [string]
    in: id "foo"
----

The error arises because MLton infers from `id 13` that `id`
accepts an integer argument, whereas `id "foo"` passes a string.

Using explicit types sheds some light on the problem.
[source,sml]
----
fun useId (id: 'a -> 'a) = (id 13; id "foo")
----

On this, MLton reports the following errors.
----
Error: z.sml 1.29-1.33.
  Function applied to incorrect argument.
    expects: ['a]
    but got: [int]
    in: id 13
Error: z.sml 1.36-1.43.
  Function applied to incorrect argument.
    expects: ['a]
    but got: [string]
    in: id "foo"
5923----
5924
The errors arise because the argument `id` is _not_ polymorphic;
rather, it is monomorphic, with type `'a -> 'a`. It is perfectly
valid to apply `id` to a value of type `'a`, as in the following.
[source,sml]
----
fun useId (id: 'a -> 'a, x: 'a) = id x  (* type correct *)
----

So, what is the difference between the type specification on `id` in
the following two declarations?
[source,sml]
----
val id: 'a -> 'a = fn x => x
fun useId (id: 'a -> 'a) = (id 13; id "foo")
----

While the type specifications on `id` look identical, they mean
different things. The difference can be made clearer by explicitly
<:TypeVariableScope:scoping the type variables>.
[source,sml]
----
val 'a id: 'a -> 'a = fn x => x
fun 'a useId (id: 'a -> 'a) = (id 13; id "foo")  (* type error *)
5948----
5949
5950In `val 'a id`, the type variable scoping means that for any `'a`,
5951`id` has type `'a -> 'a`. Hence, `id` can be applied to arguments of
5952type `int`, `real`, etc. Similarly, in `fun 'a useId`, the scoping
5953means that `useId` is a polymorphic function that for any `'a` takes a
5954function of type `'a -> 'a` and does something. Thus, `useId` could
5955be applied to a function of type `int -> int`, `real -> real`, etc.
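
As a small check of this reading, the following sketch uses the
`val`-bound `id` at two different types and applies the type-correct,
two-argument variant of `useId` at two instantiations of `'a`. The
names `n`, `s`, `m`, and `r` are illustrative only.

[source,sml]
----
val 'a id: 'a -> 'a = fn x => x
val n = id 13        (* 'a instantiated to int *)
val s = id "foo"     (* 'a instantiated to string *)

fun 'a useId (id: 'a -> 'a, x: 'a) = id x
val m = useId (fn i => i + 1, 41)       (* 'a instantiated to int *)
val r = useId (fn x => x * 2.0, 1.5)    (* 'a instantiated to real *)
----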
5956
5957One could imagine an extension of SML that allowed scoping of type
5958variables at places other than `fun` or `val` declarations, as in the
5959following.
5960----
5961fun useId (id: ('a).'a -> 'a) = (id 13; id "foo") (* not SML *)
5962----
5963
5964Such an extension would need to be thought through very carefully, as
5965it could cause significant complications with <:TypeInference:>,
possibly even undecidability.
5967
5968<<<
5969
5970:mlton-guide-page: Fixpoints
5971[[Fixpoints]]
5972Fixpoints
5973=========
5974
5975This page discusses a framework that makes it possible to compute
5976fixpoints over arbitrary products of abstract types. The code is from
5977an Extended Basis library
5978(<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>).
5979
5980First the signature of the framework
5981(<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/generic/tie.sig)>):
5982[source,sml]
5983----
5984sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/generic/tie.sig 6:]
5985----
5986
5987`fix` is a <:TypeIndexedValues:type-indexed> function. The type-index
5988parameter to `fix` is called a "witness". To compute fixpoints over
5989products, one uses the +*&grave;+ operator to combine witnesses. To provide
5990a fixpoint combinator for an abstract type, one implements a witness
5991providing a thunk whose instantiation allocates a fresh, mutable proxy
and a procedure for updating the proxy with the solution. Naturally,
this means that the framework cannot express every way of computing a
fixpoint of a particular type. The `pure`
5995combinator is a generalization of `tier`. The `iso` combinator is
5996provided for reusing existing witnesses.
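
To make the proxy idea concrete, here is a self-contained miniature
for function types only. It is a sketch under stated assumptions, not
the `Tie` module itself: the name `fixFn` and its structure are
hypothetical and do not come from the Extended Basis library.

[source,sml]
----
(* Hypothetical miniature; fixFn is not part of the Extended Basis library. *)
fun fixFn (f: ('a -> 'b) -> 'a -> 'b): 'a -> 'b =
   let
      (* Fresh, mutable proxy; it fails if used before the knot is tied. *)
      val proxy = ref (fn _ => raise Fail "fixFn: premature use")
      (* Stand-in handed to f; it defers to whatever the proxy holds. *)
      fun standIn x = !proxy x
      val solution = f standIn
   in
      proxy := solution   (* update the proxy with the solution *)
      ; solution
   end

val fact = fixFn (fn fact => fn n => if n = 0 then 1 else n * fact (n - 1))
val x = fact 5
----

The recursive calls inside `fact` go through the proxy, which by then
holds the finished function.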
5997
5998Note that instead of using an infix operator, we could alternatively
5999employ an interface using <:Fold:>. Also, witnesses are eta-expanded
6000to work around the <:ValueRestriction:value restriction>, while
6001maintaining abstraction.
6002
6003Here is the implementation
6004(<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/generic/tie.sml)>):
6005[source,sml]
6006----
6007sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/generic/tie.sml 6:]
6008----
6009
6010Let's then take a look at a couple of additional examples.
6011
6012Here is a naive implementation of lazy promises:
6013[source,sml]
6014----
6015structure Promise :> sig
6016 type 'a t
6017 val lazy : 'a Thunk.t -> 'a t
6018 val force : 'a t -> 'a
6019 val Y : 'a t Tie.t
6020end = struct
6021 datatype 'a t' =
6022 EXN of exn
6023 | THUNK of 'a Thunk.t
6024 | VALUE of 'a
6025 type 'a t = 'a t' Ref.t
6026 fun lazy f = ref (THUNK f)
6027 fun force t =
6028 case !t
6029 of EXN e => raise e
6030 | THUNK f => (t := VALUE (f ()) handle e => t := EXN e ; force t)
6031 | VALUE v => v
6032 fun Y ? = Tie.tier (fn () => let
6033 val r = lazy (raising Fix.Fix)
6034 in
6035 (r, r <\ op := o !)
6036 end) ?
6037end
6038----
6039
6040An example use of our naive lazy promises is to implement equally naive
6041lazy streams:
6042[source,sml]
6043----
6044structure Stream :> sig
6045 type 'a t
6046 val cons : 'a * 'a t -> 'a t
6047 val get : 'a t -> ('a * 'a t) Option.t
6048 val Y : 'a t Tie.t
6049end = struct
6050 datatype 'a t = IN of ('a * 'a t) Option.t Promise.t
6051 fun cons (x, xs) = IN (Promise.lazy (fn () => SOME (x, xs)))
6052 fun get (IN p) = Promise.force p
6053 fun Y ? = Tie.iso Promise.Y (fn IN p => p, IN) ?
6054end
6055----
6056
6057Note that above we make use of the `iso` combinator. Here is a finite
6058representation of an infinite stream of ones:
6059
6060[source,sml]
6061----
6062val ones = let
6063 open Tie Stream
6064in
6065 fix Y (fn ones => cons (1, ones))
6066end
6067----
6068
6069<<<
6070
6071:mlton-guide-page: Flatten
6072[[Flatten]]
6073Flatten
6074=======
6075
6076<:Flatten:> is an optimization pass for the <:SSA:>
6077<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
6078
6079== Description ==
6080
6081This pass flattens arguments to <:SSA:> constructors, blocks, and
6082functions.
6083
6084If a tuple is explicitly available at all uses of a function
6085(resp. block), then:
6086
6087* The formals and call sites are changed so that the components of the
6088tuple are passed.
6089
6090* The tuple is reconstructed at the beginning of the body of the
6091function (resp. block).
6092
6093Similarly, if a tuple is explicitly available at all uses of a
6094constructor, then:
6095
6096* The constructor argument datatype is changed to flatten the tuple
6097type.
6098
6099* The tuple is passed flat at each `ConApp`.
6100
6101* The tuple is reconstructed at each `Case` transfer target.
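
The effect on functions can be pictured with a source-level analogy.
This is only an illustration: the real pass operates on the <:SSA:>
<:IntermediateLanguage:>, not on SML source, and the names `norm1`,
`norm1'`, `d1`, and `d2` are hypothetical.

[source,sml]
----
(* Before: every call site builds the tuple explicitly. *)
fun norm1 (p: int * int) = abs (#1 p) + abs (#2 p)
val d1 = norm1 (3, ~4)

(* After, conceptually: the components are passed separately and the
   tuple is rebuilt at the start of the body. *)
fun norm1' (x: int) (y: int) =
   let
      val p = (x, y)
   in
      abs (#1 p) + abs (#2 p)
   end
val d2 = norm1' 3 ~4
----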
6102
6103== Implementation ==
6104
6105* <!ViewGitFile(mlton,master,mlton/ssa/flatten.fun)>
6106
6107== Details and Notes ==
6108
6109{empty}
6110
6111<<<
6112
6113:mlton-guide-page: Fold
6114[[Fold]]
6115Fold
6116====
6117
6118This page describes a technique that enables convenient syntax for a
6119number of language features that are not explicitly supported by
6120<:StandardML:Standard ML>, including: variable number of arguments,
6121<:OptionalArguments:optional arguments and labeled arguments>,
6122<:ArrayLiteral:array and vector literals>,
6123<:FunctionalRecordUpdate:functional record update>,
6124and (seemingly) dependently typed functions like <:Printf:printf> and scanf.
6125
6126The key idea to _fold_ is to define functions `fold`, `step0`,
6127and `$` such that the following equation holds.
6128
6129[source,sml]
6130----
6131fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6132= f (hn (... (h2 (h1 a))))
6133----
6134
The name `fold` is used because this is like a traditional list fold,
6136where `a` is the _base element_, and each _step function_,
6137`step0 hi`, corresponds to one element of the list and does one
6138step of the fold. The name `$` is chosen to mean "end of
6139arguments" from its common use in regular-expression syntax.
6140
6141Unlike the usual list fold in which the same function is used to step
6142over each element in the list, this fold allows the step functions to
6143be different from each other, and even to be of different types. Also
6144unlike the usual list fold, this fold includes a "finishing
6145function", `f`, that is applied to the result of the fold. The
6146presence of the finishing function may seem odd because there is no
analogue in list fold. However, the finishing function is essential;
6148without it, there would be no way for the folder to perform an
6149arbitrary computation after processing all the arguments. The
6150examples below will make this clear.
6151
6152The functions `fold`, `step0`, and `$` are easy to
6153define.
6154
6155[source,sml]
6156----
6157fun $ (a, f) = f a
6158fun id x = x
6159structure Fold =
6160 struct
6161 fun fold (a, f) g = g (a, f)
6162 fun step0 h (a, f) = fold (h a, f)
6163 end
6164----
6165
6166We've placed `fold` and `step0` in the `Fold` structure
6167but left `$` at the toplevel because it is convenient in code to
6168always have `$` in scope. We've also defined the identity
6169function, `id`, at the toplevel since we use it so frequently.
6170
6171Plugging in the definitions, it is easy to verify the equation from
6172above.
6173
6174[source,sml]
6175----
6176fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6177= step0 h1 (a, f) (step0 h2) ... (step0 hn) $
6178= fold (h1 a, f) (step0 h2) ... (step0 hn) $
6179= step0 h2 (h1 a, f) ... (step0 hn) $
6180= fold (h2 (h1 a), f) ... (step0 hn) $
6181...
6182= fold (hn (... (h2 (h1 a))), f) $
6183= $ (hn (... (h2 (h1 a))), f)
6184= f (hn (... (h2 (h1 a))))
6185----
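
For a concrete instance of the equation, the following self-contained
sketch repeats the definitions and evaluates two small folds. The
names `r1` and `r2` are illustrative only. In the second fold the two
steps have different types (`string -> int` and `int -> int`).

[source,sml]
----
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* Int.toString ((1 + 1) * 10) *)
val r1 = Fold.fold (1, Int.toString)
            (Fold.step0 (fn x => x + 1))
            (Fold.step0 (fn x => x * 10)) $

(* (String.size "abc" + 1) * 2 *)
val r2 = Fold.fold ("abc", fn n => n * 2)
            (Fold.step0 String.size)
            (Fold.step0 (fn n => n + 1)) $
----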
6186
6187
6188== Example: variable number of arguments ==
6189
6190The simplest example of fold is accepting a variable number of
6191(curried) arguments. We'll define a function `f` and argument
6192`a` such that all of the following expressions are valid.
6193
6194[source,sml]
6195----
6196f $
6197f a $
6198f a a $
6199f a a a $
6200f a a a ... a a a $ (* as many a's as we want *)
6201----
6202
6203Off-hand it may appear impossible that all of the above expressions
6204are type correct SML -- how can a function `f` accept a variable
6205number of curried arguments? What could the type of `f` be?
6206We'll have more to say later on how type checking works. For now,
6207once we have supplied the definitions below, you can check that the
6208expressions are type correct by feeding them to your favorite SML
6209implementation.
6210
6211It is simple to define `f` and `a`. We define `f` as a
6212folder whose base element is `()` and whose finish function does
6213nothing. We define `a` as the step function that does nothing.
The only trickiness is that we must <:EtaExpansion:eta expand> the
definitions of `f` and `a` to work around the <:ValueRestriction:value restriction>;
6216we frequently use eta expansion for this purpose without mention.
6217
6218[source,sml]
6219----
6220val base = ()
6221fun finish () = ()
6222fun step () = ()
6223val f = fn z => Fold.fold (base, finish) z
6224val a = fn z => Fold.step0 step z
6225----
6226
6227One can easily apply the fold equation to verify by hand that `f`
6228applied to any number of `a`'s evaluates to `()`.
6229
6230[source,sml]
6231----
6232f a ... a $
6233= finish (step (... (step base)))
6234= finish (step (... ()))
6235...
6236= finish ()
6237= ()
6238----
6239
6240
6241== Example: variable-argument sum ==
6242
6243Let's look at an example that computes something: a variable-argument
6244function `sum` and a stepper `a` such that
6245
6246[source,sml]
6247----
6248sum (a i1) (a i2) ... (a im) $ = i1 + i2 + ... + im
6249----
6250
6251The idea is simple -- the folder starts with a base accumulator of
6252`0` and the stepper adds each element to the accumulator, `s`,
6253which the folder simply returns at the end.
6254
6255[source,sml]
6256----
6257val sum = fn z => Fold.fold (0, fn s => s) z
6258fun a i = Fold.step0 (fn s => i + s)
6259----
6260
6261Using the fold equation, one can verify the following.
6262
6263[source,sml]
6264----
6265sum (a 1) (a 2) (a 3) $ = 6
6266----
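
The finishing function of `sum` is the identity. A variant in which
the finisher does real work is a variable-argument average: the
accumulator carries a running sum and a count, and the finisher
divides them. This is a sketch; the names `avg` and `m` are
hypothetical, and returning `0.0` for no arguments is an arbitrary
choice made here.

[source,sml]
----
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* Accumulator is (sum, count); the finisher computes the mean. *)
val avg =
   fn z => Fold.fold ((0.0, 0),
                      fn (s, n) => if n = 0 then 0.0 else s / real n) z
fun a (x: real) = Fold.step0 (fn (s, n) => (s + x, n + 1))

val m = avg (a 1.0) (a 2.0) (a 6.0) $
----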
6267
6268
6269== Step1 ==
6270
6271It is sometimes syntactically convenient to omit the parentheses
6272around the steps in a fold. This is easily done by defining a new
6273function, `step1`, as follows.
6274
6275[source,sml]
6276----
6277structure Fold =
6278 struct
6279 open Fold
6280 fun step1 h (a, f) b = fold (h (b, a), f)
6281 end
6282----
6283
6284From the definition of `step1`, we have the following
6285equivalence.
6286
6287[source,sml]
6288----
6289fold (a, f) (step1 h) b
6290= step1 h (a, f) b
6291= fold (h (b, a), f)
6292----
6293
6294Using the above equivalence, we can compute the following equation for
6295`step1`.
6296
6297[source,sml]
6298----
6299fold (a, f) (step1 h1) b1 (step1 h2) b2 ... (step1 hn) bn $
6300= fold (h1 (b1, a), f) (step1 h2) b2 ... (step1 hn) bn $
6301= fold (h2 (b2, h1 (b1, a)), f) ... (step1 hn) bn $
6302= fold (hn (bn, ... (h2 (b2, h1 (b1, a)))), f) $
6303= f (hn (bn, ... (h2 (b2, h1 (b1, a)))))
6304----
6305
6306Here is an example using `step1` to define a variable-argument
6307product function, `prod`, with a convenient syntax.
6308
6309[source,sml]
6310----
6311val prod = fn z => Fold.fold (1, fn p => p) z
6312val ` = fn z => Fold.step1 (fn (i, p) => i * p) z
6313----
6314
6315The functions `prod` and +&grave;+ satisfy the following equation.
6316[source,sml]
6317----
6318prod `i1 `i2 ... `im $ = i1 * i2 * ... * im
6319----
6320
6321Note that in SML, +&grave;i1+ is two different tokens, +&grave;+ and
6322`i1`. We often use +&grave;+ for an instance of a `step1` function
6323because of its syntactic unobtrusiveness and because no space is
6324required to separate it from an alphanumeric token.
6325
Also note that there are no parentheses around the steps. That is,
6327the following expression is not the same as the above one (in fact, it
6328is not type correct).
6329
6330[source,sml]
6331----
6332prod (`i1) (`i2) ... (`im) $
6333----
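
Here is a self-contained run of `prod` with concrete numbers. The
names `p0` and `p3` are illustrative only.

[source,sml]
----
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

val prod = fn z => Fold.fold (1, fn p => p) z
val ` = fn z => Fold.step1 (fn (i, p) => i * p) z

val p0 = prod $             (* no arguments: the base element *)
val p3 = prod `2 `3 `7 $    (* 2 * 3 * 7 *)
----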
6334
6335
6336== Example: list literals ==
6337
6338SML already has a syntax for list literals, e.g. `[w, x, y, z]`.
6339However, using fold, we can define our own syntax.
6340
6341[source,sml]
6342----
6343val list = fn z => Fold.fold ([], rev) z
6344val ` = fn z => Fold.step1 (op ::) z
6345----
6346
6347The idea is that the folder starts out with the empty list, the steps
6348accumulate the elements into a list, and then the finishing function
6349reverses the list at the end.
6350
6351With these definitions one can write a list like:
6352
6353[source,sml]
6354----
6355list `w `x `y `z $
6356----
6357
6358While the example is not practically useful, it does demonstrate the
6359need for the finishing function to be incorporated in `fold`.
6360Without a finishing function, every use of `list` would need to be
6361wrapped in `rev`, as follows.
6362
6363[source,sml]
6364----
6365rev (list `w `x `y `z $)
6366----
6367
6368The finishing function allows us to incorporate the reversal into the
6369definition of `list`, and to treat `list` as a truly variable
6370argument function, performing an arbitrary computation after receiving
6371all of its arguments.
6372
6373See <:ArrayLiteral:> for a similar use of `fold` that provides a
6374syntax for array and vector literals, which are not built in to SML.
6375
6376
6377== Fold right ==
6378
6379Just as `fold` is analogous to a fold left, in which the functions
6380are applied to the accumulator left-to-right, we can define a variant
6381of `fold` that is analogous to a fold right, in which the
6382functions are applied to the accumulator right-to-left. That is, we
6383can define functions `foldr` and `step0` such that the
6384following equation holds.
6385
6386[source,sml]
6387----
6388foldr (a, f) (step0 h1) (step0 h2) ... (step0 hn) $
6389= f (h1 (h2 (... (hn a))))
6390----
6391
6392The implementation of fold right is easy, using fold. The idea is for
6393the fold to start with `f` and for each step to precompose the
6394next `hi`. Then, the finisher applies the composed function to
6395the base value, `a`. Here is the code.
6396
6397[source,sml]
6398----
6399structure Foldr =
6400 struct
6401 fun foldr (a, f) = Fold.fold (f, fn g => g a)
6402 fun step0 h = Fold.step0 (fn g => g o h)
6403 end
6404----
6405
6406Verifying the fold-right equation is straightforward, using the
6407fold-left equation.
6408
6409[source,sml]
6410----
6411foldr (a, f) (Foldr.step0 h1) (Foldr.step0 h2) ... (Foldr.step0 hn) $
6412= fold (f, fn g => g a)
6413 (Fold.step0 (fn g => g o h1))
6414 (Fold.step0 (fn g => g o h2))
6415 ...
6416 (Fold.step0 (fn g => g o hn)) $
6417= (fn g => g a)
6418 ((fn g => g o hn) (... ((fn g => g o h2) ((fn g => g o h1) f))))
6419= (fn g => g a)
6420 ((fn g => g o hn) (... ((fn g => g o h2) (f o h1))))
6421= (fn g => g a) ((fn g => g o hn) (... (f o h1 o h2)))
6422= (fn g => g a) (f o h1 o h2 o ... o hn)
6423= (f o h1 o h2 o ... o hn) a
6424= f (h1 (h2 (... (hn a))))
6425----
6426
6427One can also define the fold-right analogue of `step1`.
6428
6429[source,sml]
6430----
6431structure Foldr =
6432 struct
6433 open Foldr
6434 fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a)))
6435 end
6436----
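
The difference in application order becomes visible with a step
function that is not commutative, such as string concatenation. In
the following sketch the names `catl`, `catr`, `L`, and `R` are
hypothetical. The same arguments yield reversed results under the two
folds.

[source,sml]
----
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end
structure Foldr =
   struct
      fun foldr (a, f) = Fold.fold (f, fn g => g a)
      fun step0 h = Fold.step0 (fn g => g o h)
      fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a)))
   end

val catl = fn z => Fold.fold ("", id) z
val catr = fn z => Foldr.foldr ("", id) z
val L = fn z => Fold.step1 (op ^) z
val R = fn z => Foldr.step1 (op ^) z

val left = catl L "a" L "b" L "c" $    (* "c" ^ ("b" ^ ("a" ^ "")) *)
val right = catr R "a" R "b" R "c" $   (* "a" ^ ("b" ^ ("c" ^ "")) *)
----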
6437
6438
6439== Example: list literals via fold right ==
6440
6441Revisiting the list literal example from earlier, we can use fold
6442right to define a syntax for list literals that doesn't do a reversal.
6443
6444[source,sml]
6445----
6446val list = fn z => Foldr.foldr ([], fn l => l) z
6447val ` = fn z => Foldr.step1 (op ::) z
6448----
6449
6450As before, with these definitions, one can write a list like:
6451
6452[source,sml]
6453----
6454list `w `x `y `z $
6455----
6456
6457The difference between the fold-left and fold-right approaches is that
6458the fold-right approach does not have to reverse the list at the end,
6459since it accumulates the elements in the correct order. In practice,
6460MLton will simplify away all of the intermediate function composition,
so the fold-right approach will be more efficient.
6462
6463
6464== Mixing steppers ==
6465
6466All of the examples so far have used the same step function throughout
6467a fold. This need not be the case. For example, consider the
6468following.
6469
6470[source,sml]
6471----
6472val n = fn z => Fold.fold (0, fn i => i) z
val I = fn z => Fold.step0 (fn i => i * 2 + 1) z
val O = fn z => Fold.step0 (fn i => i * 2) z
6475----
6476
6477Here we have one folder, `n`, that can be used with two different
6478steppers, `I` and `O`. By using the fold equation, one can
6479verify the following equations.
6480
6481[source,sml]
6482----
6483n O $ = 0
6484n I $ = 1
6485n I O $ = 2
6486n I O I $ = 5
6487n I I I O $ = 14
6488----
6489
6490That is, we've defined a syntax for writing binary integer constants.
6491
6492Not only can one use different instances of `step0` in the same
6493fold, one can also intermix uses of `step0` and `step1`. For
6494example, consider the following.
6495
6496[source,sml]
6497----
6498val n = fn z => Fold.fold (0, fn i => i) z
val O = fn z => Fold.step0 (fn n => n * 8) z
6500val ` = fn z => Fold.step1 (fn (i, n) => n * 8 + i) z
6501----
6502
6503Using the straightforward generalization of the fold equation to mixed
6504steppers, one can verify the following equations.
6505
6506[source,sml]
6507----
n O $ = 0
6509n `3 O $ = 24
6510n `1 O `7 $ = 71
6511----
6512
6513That is, we've defined a syntax for writing octal integer constants,
6514with a special syntax, `O`, for the zero digit (admittedly
6515contrived, since one could just write +&grave;0+ instead of `O`).
6516
6517See <:NumericLiteral:> for a practical extension of this approach that
6518supports numeric constants in any base and of any type.
6519
6520
6521== (Seemingly) dependent types ==
6522
6523A normal list fold always returns the same type no matter what
6524elements are in the list or how long the list is. Variable-argument
6525fold is more powerful, because the result type can vary based both on
6526the arguments that are passed and on their number. This can provide
6527the illusion of dependent types.
6528
6529For example, consider the following.
6530
6531[source,sml]
6532----
6533val f = fn z => Fold.fold ((), id) z
6534val a = fn z => Fold.step0 (fn () => "hello") z
6535val b = fn z => Fold.step0 (fn () => 13) z
6536val c = fn z => Fold.step0 (fn () => (1, 2)) z
6537----
6538
6539Using the fold equation, one can verify the following equations.
6540
6541[source,sml]
6542----
6543f a $ = "hello": string
6544f b $ = 13: int
6545f c $ = (1, 2): int * int
6546----
6547
6548That is, `f` returns a value of a different type depending on
6549whether it is applied to argument `a`, argument `b`, or
6550argument `c`.
6551
6552The following example shows how the type of a fold can depend on the
6553number of arguments.
6554
6555[source,sml]
6556----
6557val grow = fn z => Fold.fold ([], fn l => l) z
6558val a = fn z => Fold.step0 (fn x => [x]) z
6559----
6560
6561Using the fold equation, one can verify the following equations.
6562
6563[source,sml]
6564----
6565grow $ = []: 'a list
6566grow a $ = [[]]: 'a list list
6567grow a a $ = [[[]]]: 'a list list list
6568----
6569
6570Clearly, the result type of a call to the variable argument `grow`
6571function depends on the number of arguments that are passed.
6572
6573As a reminder, this is well-typed SML. You can check it out in any
6574implementation.
6575
6576
6577== (Seemingly) dependently-typed functional results ==
6578
6579Fold is especially useful when it returns a curried function whose
6580arity depends on the number of arguments. For example, consider the
6581following.
6582
6583[source,sml]
6584----
6585val makeSum = fn z => Fold.fold (id, fn f => f 0) z
6586val I = fn z => Fold.step0 (fn f => fn i => fn x => f (x + i)) z
6587----
6588
6589The `makeSum` folder constructs a function whose arity depends on
6590the number of `I` arguments and that adds together all of its
6591arguments. For example,
6592`makeSum I $` is of type `int -> int` and
6593`makeSum I I $` is of type `int -> int -> int`.
6594
One can use the fold equation to verify that `makeSum` works
6596correctly. For example, one can easily check by hand the following
6597equations.
6598
6599[source,sml]
6600----
6601makeSum I $ 1 = 1
6602makeSum I I $ 1 2 = 3
6603makeSum I I I $ 1 2 3 = 6
6604----
6605
6606Returning a function becomes especially interesting when there are
6607steppers of different types. For example, the following `makeSum`
6608folder constructs functions that sum integers and reals.
6609
6610[source,sml]
6611----
6612val makeSum = fn z => Foldr.foldr (id, fn f => f 0.0) z
6613val I = fn z => Foldr.step0 (fn f => fn x => fn i => f (x + real i)) z
6614val R = fn z => Foldr.step0 (fn f => fn x: real => fn r => f (x + r)) z
6615----
6616
6617With these definitions, `makeSum I R $` is of type
6618`int -> real -> real` and `makeSum R I I $` is of type
6619`real -> int -> int -> real`. One can use the foldr equation to
6620check the following equations.
6621
6622[source,sml]
6623----
6624makeSum I $ 1 = 1.0
6625makeSum I R $ 1 2.5 = 3.5
6626makeSum R I I $ 1.5 2 3 = 6.5
6627----
6628
6629We used `foldr` instead of `fold` for this so that the order
6630in which the specifiers `I` and `R` appear is the same as the
6631order in which the arguments appear. Had we used `fold`, things
6632would have been reversed.
6633
6634An extension of this idea is sufficient to define <:Printf:>-like
6635functions in SML.
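
As a rough illustration of that extension, here is a minimal
formatter built directly on `fold`. It is a sketch under stated
assumptions and not the implementation described on the <:Printf:>
page: the names `format`, `D`, `S`, and `msg` and the
continuation-passing accumulator are choices made here. The
accumulator is a function that, given a continuation for the finished
string, returns a curried function awaiting the remaining arguments.

[source,sml]
----
fun $ (a, f) = f a
fun id x = x
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

(* Start with the empty string; finish by returning the built string. *)
val format = fn z => Fold.fold (fn k => k "", fn acc => acc id) z

(* Literal text: append s to the output; no argument is consumed. *)
val ` =
   fn z => Fold.step1 (fn (s, acc) => fn k =>
                          acc (fn pre => k (pre ^ s))) z

(* Integer hole: the resulting function takes one more int argument. *)
val D =
   fn z => Fold.step0 (fn acc => fn k =>
                          acc (fn pre => fn i => k (pre ^ Int.toString i))) z

(* String hole: the resulting function takes one more string argument. *)
val S =
   fn z => Fold.step0 (fn acc => fn k =>
                          acc (fn pre => fn s => k (pre ^ s))) z

val msg = format `"x = " D `", name = " S $ 42 "foo"
----

Here `format ... $` has type `int -> string -> string`, determined by
the sequence of `D` and `S` specifiers.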
6636
6637
6638== An idiom for combining steps ==
6639
6640It is sometimes useful to combine a number of steps together and name
6641them as a single step. As a simple example, suppose that one often
sees an integer followed by a real in the `makeSum` example above.
6643One can define a new _compound step_ `IR` as follows.
6644
6645[source,sml]
6646----
6647val IR = fn u => Fold.fold u I R
6648----
6649
6650With this definition in place, one can verify the following.
6651
6652[source,sml]
6653----
6654makeSum IR IR $ 1 2.2 3 4.4 = 10.6
6655----
6656
6657In general, one can combine steps `s1`, `s2`, ... `sn` as
6658
6659[source,sml]
6660----
6661fn u => Fold.fold u s1 s2 ... sn
6662----
6663
6664The following calculation shows why a compound step behaves as the
6665composition of its constituent steps.
6666
6667[source,sml]
6668----
6669fold u (fn u => fold u s1 s2 ... sn)
6670= (fn u => fold u s1 s2 ... sn) u
6671= fold u s1 s2 ... sn
6672----
6673
6674
6675== Post composition ==
6676
6677Suppose we already have a function defined via fold,
6678`w = fold (a, f)`, and we would like to construct a new fold
6679function that is like `w`, but applies `g` to the result
6680produced by `w`. This is similar to function composition, but we
6681can't just do `g o w`, because we don't want to use `g` until
6682`w` has been applied to all of its arguments and received the
6683end-of-arguments terminator `$`.
6684
6685More precisely, we want to define a post-composition function
6686`post` that satisfies the following equation.
6687
6688[source,sml]
6689----
6690post (w, g) s1 ... sn $ = g (w s1 ... sn $)
6691----
6692
6693Here is the definition of `post`.
6694
6695[source,sml]
6696----
6697structure Fold =
6698 struct
6699 open Fold
6700 fun post (w, g) s = w (fn (a, h) => s (a, g o h))
6701 end
6702----
6703
6704The following calculations show that `post` satisfies the desired
6705equation, where `w = fold (a, f)`.
6706
6707[source,sml]
6708----
6709post (w, g) s
6710= w (fn (a, h) => s (a, g o h))
6711= fold (a, f) (fn (a, h) => s (a, g o h))
6712= (fn (a, h) => s (a, g o h)) (a, f)
6713= s (a, g o f)
6714= fold (a, g o f) s
6715----
6716
6717Now, suppose `si = step0 hi` for `i` from `1` to `n`.
6718
6719[source,sml]
6720----
6721post (w, g) s1 s2 ... sn $
6722= fold (a, g o f) s1 s2 ... sn $
6723= (g o f) (hn (... (h1 a)))
6724= g (f (hn (... (h1 a))))
6725= g (fold (a, f) s1 ... sn $)
6726= g (w s1 ... sn $)
6727----
6728
6729For a practical example of post composition, see <:ArrayLiteral:>.
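
A small self-contained illustration, reusing the variable-argument
`sum` from earlier: post-composing `Int.toString` gives a folder that
returns the sum as a string. The names `sumString` and `s` are
hypothetical.

[source,sml]
----
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
   end

val sum = fn z => Fold.fold (0, fn s => s) z
fun a i = Fold.step0 (fn s => i + s)

(* Like sum, but the final result is converted to a string. *)
val sumString = fn z => Fold.post (sum, Int.toString) z

val s = sumString (a 1) (a 2) (a 3) $
----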
6730
6731
6732== Lift ==
6733
6734We now define a peculiar-looking function, `lift0`, that is,
6735equationally speaking, equivalent to the identity function on a step
6736function.
6737
6738[source,sml]
6739----
6740fun lift0 s (a, f) = fold (fold (a, id) s $, f)
6741----
6742
6743Using the definitions, we can prove the following equation.
6744
6745[source,sml]
6746----
6747fold (a, f) (lift0 (step0 h)) = fold (a, f) (step0 h)
6748----
6749
6750Here is the proof.
6751
6752[source,sml]
6753----
6754fold (a, f) (lift0 (step0 h))
6755= lift0 (step0 h) (a, f)
6756= fold (fold (a, id) (step0 h) $, f)
6757= fold (step0 h (a, id) $, f)
6758= fold (fold (h a, id) $, f)
6759= fold ($ (h a, id), f)
6760= fold (id (h a), f)
6761= fold (h a, f)
6762= step0 h (a, f)
6763= fold (a, f) (step0 h)
6764----
6765
6766If `lift0` is the identity, then why even define it? The answer
6767lies in the typing of fold expressions, which we have, until now, left
6768unexplained.
6769
6770
6771== Typing ==
6772
6773Perhaps the most surprising aspect of fold is that it can be checked
6774by the SML type system. The types involved in fold expressions are
6775complex; fortunately type inference is able to deduce them.
6776Nevertheless, it is instructive to study the types of fold functions
6777and steppers. More importantly, it is essential to understand the
6778typing aspects of fold in order to write down signatures of functions
6779defined using fold and step.
6780
6781Here is the `FOLD` signature, and a recapitulation of the entire
6782`Fold` structure, with additional type annotations.
6783
6784[source,sml]
6785----
6786signature FOLD =
6787 sig
6788 type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6789 type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6790 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6791 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6792 type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6793 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6794
6795 val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t
6796 val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0
6797 -> ('a1, 'a2, 'b, 'c, 'd) step0
6798 val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2)
6799 -> ('a, 'b, 'c2, 'd) t
6800 val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0
6801 val step1: ('a11 * 'a12 -> 'a2)
6802 -> ('a11, 'a12, 'a2, 'b, 'c, 'd) step1
6803 end
6804
6805structure Fold:> FOLD =
6806 struct
6807 type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6808
6809 type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6810
6811 type ('a1, 'a2, 'b, 'c, 'd) step0 =
6812 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6813
6814 type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6815 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6816
6817 fun fold (a: 'a, f: 'b -> 'c)
6818 (g: ('a, 'b, 'c, 'd) step): 'd =
6819 g (a, f)
6820
6821 fun step0 (h: 'a1 -> 'a2)
6822 (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6823 fold (h a1, f)
6824
6825 fun step1 (h: 'a11 * 'a12 -> 'a2)
6826 (a12: 'a12, f: 'b -> 'c)
6827 (a11: 'a11): ('a2, 'b, 'c, 'd) t =
6828 fold (h (a11, a12), f)
6829
6830 fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0)
6831 (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6832 fold (fold (a, id) s $, f)
6833
6834 fun post (w: ('a, 'b, 'c1, 'd) t,
6835 g: 'c1 -> 'c2)
6836 (s: ('a, 'b, 'c2, 'd) step): 'd =
6837 w (fn (a, h) => s (a, g o h))
6838 end
6839----
6840
6841That's a lot to swallow, so let's walk through it one step at a time.
6842First, we have the definition of type `Fold.step`.
6843
6844[source,sml]
6845----
6846type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
6847----
6848
6849As a fold proceeds over its arguments, it maintains two things: the
6850accumulator, of type `'a`, and the finishing function, of type
6851`'b -> 'c`. Each step in the fold is a function that takes those
two pieces (i.e. `'a * ('b -> 'c)`) and does something to them
6853(i.e. produces `'d`). The result type of the step is completely
6854left open to be filled in by type inference, as it is an arrow type
6855that is capable of consuming the rest of the arguments to the fold.
6856
6857A folder, of type `Fold.t`, is a function that consumes a single
6858step.
6859
6860[source,sml]
6861----
6862type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
6863----
6864
6865Expanding out the type, we have:
6866
6867[source,sml]
6868----
6869type ('a, 'b, 'c, 'd) t = ('a * ('b -> 'c) -> 'd) -> 'd
6870----
6871
6872This shows that the only thing a folder does is to hand its
6873accumulator (`'a`) and finisher (`'b -> 'c`) to the next step
6874(`'a * ('b -> 'c) -> 'd`). If SML had <:FirstClassPolymorphism:first-class polymorphism>,
6875we would write the fold type as follows.
6876
6877[source,sml]
6878----
6879type ('a, 'b, 'c) t = Forall 'd . ('a, 'b, 'c, 'd) step -> 'd
6880----
6881
This type definition shows that a folder has nothing to do with
the rest of the fold; it deals only with the next step.
6884
6885We now can understand the type of `fold`, which takes the initial
6886value of the accumulator and the finishing function, and constructs a
6887folder, i.e. a function awaiting the next step.
6888
6889[source,sml]
6890----
6891val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t
6892fun fold (a: 'a, f: 'b -> 'c)
6893 (g: ('a, 'b, 'c, 'd) step): 'd =
6894 g (a, f)
6895----
6896
6897Continuing on, we have the type of step functions.
6898
6899[source,sml]
6900----
6901type ('a1, 'a2, 'b, 'c, 'd) step0 =
6902 ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
6903----
6904
6905Expanding out the type a bit gives:
6906
6907[source,sml]
6908----
6909type ('a1, 'a2, 'b, 'c, 'd) step0 =
6910 'a1 * ('b -> 'c) -> ('a2, 'b, 'c, 'd) t
6911----
6912
6913So, a step function takes the accumulator (`'a1`) and finishing
6914function (`'b -> 'c`), which will be passed to it by the previous
6915folder, and transforms them to a new folder. This new folder has a
6916new accumulator (`'a2`) and the same finishing function.
6917
6918Again, imagining that SML had <:FirstClassPolymorphism:first-class polymorphism> makes the type
6919clearer.
6920
6921[source,sml]
6922----
6923type ('a1, 'a2) step0 =
6924 Forall ('b, 'c) . ('a1, 'b, 'c, ('a2, 'b, 'c) t) step
6925----
6926
6927Thus, in essence, a `step0` function is a wrapper around a
6928function of type `'a1 -> 'a2`, which is exactly what the
6929definition of `step0` does.
6930
6931[source,sml]
6932----
6933val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0
6934fun step0 (h: 'a1 -> 'a2)
6935 (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
6936 fold (h a1, f)
6937----
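
With the types of `fold` and `step0` in hand, one can write a
signature for the variable-argument `sum` example. The following is a
sketch: the signature names `FOLD0` and `SUM`, the structure `Sum`,
and the value `six` are hypothetical, and `FOLD0` is a cut-down
version of `FOLD` that keeps the block self-contained.

[source,sml]
----
fun $ (a, f) = f a

signature FOLD0 =
   sig
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a1, 'a2, 'b, 'c, 'd) step0 =
         ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
      val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t
      val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0
   end

structure Fold:> FOLD0 =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a1, 'a2, 'b, 'c, 'd) step0 =
         ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

signature SUM =
   sig
      (* A folder with an int accumulator and an int result. *)
      val sum: (int, int, int, 'd) Fold.t
      (* A stepper that maps an int accumulator to an int accumulator. *)
      val a: int -> (int, int, 'b, 'c, 'd) Fold.step0
   end

structure Sum:> SUM =
   struct
      val sum = fn z => Fold.fold (0, fn s => s) z
      fun a i = Fold.step0 (fn s => i + s)
   end

val six = Sum.sum (Sum.a 1) (Sum.a 2) (Sum.a 3) $
----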
6938
6939It is not much beyond `step0` to understand `step1`.
6940
6941[source,sml]
6942----
6943type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 =
6944 ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step
6945----
6946
6947A `step1` function takes the accumulator (`'a12`) and finisher
6948(`'b -> 'c`) passed to it by the previous folder and transforms
6949them into a function that consumes the next argument (`'a11`) and
6950produces a folder that will continue the fold with a new accumulator
6951(`'a2`) and the same finisher.
6952
6953[source,sml]
6954----
6955fun step1 (h: 'a11 * 'a12 -> 'a2)
6956 (a12: 'a12, f: 'b -> 'c)
6957 (a11: 'a11): ('a2, 'b, 'c, 'd) t =
6958 fold (h (a11, a12), f)
6959----
6960
6961With <:FirstClassPolymorphism:first-class polymorphism>, a `step1` function is more clearly
6962seen as a wrapper around a binary function of type
6963`'a11 * 'a12 -> 'a2`.
6964
6965[source,sml]
6966----
6967type ('a11, 'a12, 'a2) step1 =
6968 Forall ('b, 'c) . ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c) t) step
6969----
6970
6971The type of `post` is clear: it takes a folder with a finishing
6972function that produces type `'c1`, and a function of type
6973`'c1 -> 'c2` to postcompose onto the folder. It returns a new
6974folder with a finishing function that produces type `'c2`.
6975
6976[source,sml]
6977----
val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2)
          -> ('a, 'b, 'c2, 'd) t
fun post (w: ('a, 'b, 'c1, 'd) t,
          g: 'c1 -> 'c2)
         (s: ('a, 'b, 'c2, 'd) step): 'd =
   w (fn (a, h) => s (a, g o h))
6984----
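
To see `post` at work, here is a small sketch that postcomposes `Int.toString` onto a summing folder. The folder `sum`, the stepper `a`, and the name `sumString` are illustrative assumptions, and the `Fold` definitions are restated so that the fragment stands alone.

```sml
fun $ (a, f) = f a

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

val sum = fn z => Fold.fold (0, fn s => s) z
val a = fn z => Fold.step1 (fn (x, s) => x + s) z

(* Same folder, but the finisher's int result is converted to a string. *)
val sumString = fn z => Fold.post (sum, Int.toString) z

val three = sumString a 1 a 2 $
```

The steppers are unchanged; only the finishing function of the folder differs.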
6985
6986We will return to `lift0` after an example.
6987
6988
6989== An example typing ==
6990
6991Let's type check our simplest example, a variable-argument fold.
6992Recall that we have a folder `f` and a stepper `a` defined as
6993follows.
6994
6995[source,sml]
6996----
6997val f = fn z => Fold.fold ((), fn () => ()) z
6998val a = fn z => Fold.step0 (fn () => ()) z
6999----
7000
7001Since the accumulator and finisher are uninteresting, we'll use some
7002abbreviations to simplify things.
7003
7004[source,sml]
7005----
7006type 'd step = (unit, unit, unit, 'd) Fold.step
7007type 'd fold = 'd step -> 'd
7008----
7009
7010With these abbreviations, `f` and `a` have the following polymorphic
7011types.
7012
7013[source,sml]
7014----
7015f: 'd fold
7016a: 'd step
7017----
7018
7019Suppose we want to type check
7020
7021[source,sml]
7022----
7023f a a a $: unit
7024----
7025
As a reminder, the fully parenthesized expression is

[source,sml]
----
((((f a) a) a) $)
----
7031
7032The observation that we will use repeatedly is that for any type
7033`z`, if `f: z fold` and `s: z step`, then `f s: z`.
7034So, if we want
7035
7036[source,sml]
7037----
7038(f a a a) $: unit
7039----
7040
7041then we must have
7042
7043[source,sml]
7044----
7045f a a a: unit fold
7046$: unit step
7047----
7048
7049Applying the observation again, we must have
7050
7051[source,sml]
7052----
7053f a a: unit fold fold
7054a: unit fold step
7055----
7056
7057Applying the observation two more times leads to the following type
7058derivation.
7059
7060[source,sml]
7061----
f: unit fold fold fold fold     a: unit fold fold fold step
f a: unit fold fold fold        a: unit fold fold step
f a a: unit fold fold           a: unit fold step
f a a a: unit fold              $: unit step
f a a a $: unit
7067----
7068
7069So, each application is a fold that consumes the next step, producing
7070a fold of one smaller type.
7071
7072One can expand some of the type definitions in `f` to see that it is
7073indeed a function that takes four curried arguments, each one a step
7074function.
7075
7076[source,sml]
7077----
f: unit fold fold fold step
   -> unit fold fold step
   -> unit fold step
   -> unit step
   -> unit
7083----
7084
7085This example shows why we must eta expand uses of `fold` and `step0`
7086to work around the value restriction and make folders and steppers
7087polymorphic. The type of a fold function like `f` depends on the
7088number of arguments, and so will vary from use to use. Similarly,
7089each occurrence of an argument like `a` has a different type,
7090depending on the number of remaining arguments.
7091
7092This example also shows that the type of a folder, when fully
7093expanded, is exponential in the number of arguments: there are as many
7094nested occurrences of the `fold` type constructor as there are
7095arguments, and each occurrence duplicates its type argument. One can
7096observe this exponential behavior in a type checker that doesn't share
7097enough of the representation of types (e.g. one that represents types
7098as trees rather than directed acyclic graphs).
7099
7100Generalizing this type derivation to uses of fold where the
7101accumulator and finisher are more interesting is straightforward. One
7102simply includes the type of the accumulator, which may change, for
7103each step, and the type of the finisher, which doesn't change from
7104step to step.
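
The derivation can be checked mechanically by annotating each subexpression with the type it was assigned. The following sketch assumes the `Fold` definitions used on this page (restated here, with only the pieces needed) and should type check only if the instantiations above are right.

```sml
fun $ (a, f) = f a

structure Fold =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val f = fn z => Fold.fold ((), fn () => ()) z
val a = fn z => Fold.step0 (fn () => ()) z

type 'd step = (unit, unit, unit, 'd) Fold.step
type 'd fold = 'd step -> 'd

(* Each occurrence of f, a, and $ is used at a different instance. *)
val r : unit =
   (f : unit fold fold fold fold)
   (a : unit fold fold fold step)
   (a : unit fold fold step)
   (a : unit fold step)
   ($ : unit step)

(* The same f and a also work with other numbers of arguments. *)
val () = f $
val () = f a $
val () = f a a a a a $
```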
7105
7106
7107== Typing lift ==
7108
7109The lack of <:FirstClassPolymorphism:first-class polymorphism> in SML
7110causes problems if one wants to use a step in a first-class way.
7111Consider the following `double` function, which takes a step, `s`, and
7112produces a composite step that does `s` twice.
7113
7114[source,sml]
7115----
7116fun double s = fn u => Fold.fold u s s
7117----
7118
The definition of `double` is not type correct. The problem is that
the type of a step depends on the number of remaining arguments, but
the parameter `s` is not polymorphic and so cannot be used in
two different positions.
7123
7124Fortunately, we can define a function, `lift0`, that takes a monotyped
7125step function and _lifts_ it into a polymorphic step function. This
7126is apparent in the type of `lift0`.
7127
7128[source,sml]
7129----
val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0
           -> ('a1, 'a2, 'b, 'c, 'd) step0
fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0)
          (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t =
   fold (fold (a, id) s $, f)
7135----
7136
7137The following definition of `double` uses `lift0`, appropriately eta
7138wrapped, to fix the problem.
7139
7140[source,sml]
7141----
fun double s =
   let
      val s = fn z => Fold.lift0 s z
   in
      fn u => Fold.fold u s s
   end
7148----
7149
7150With that definition of `double` in place, we can use it as in the
7151following example.
7152
7153[source,sml]
7154----
7155val f = fn z => Fold.fold ((), fn () => ()) z
7156val a = fn z => Fold.step0 (fn () => ()) z
7157val a2 = fn z => double a z
7158val () = f a a2 a a2 $
7159----
7160
Of course, we must eta wrap the call to `double` in order to use its
result, which is a step function, polymorphically.
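
Because the steps in the example above do nothing observable, it can help to repeat the experiment with a counting accumulator. The sketch below is an illustration under assumed names (`count`, `inc`, `inc2`), with the needed `Fold` definitions restated; it shows that a doubled step really is applied twice.

```sml
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun lift0 s (a, f) = fold (fold (a, id) s $, f)
   end

(* Folder that starts at 0 and returns the final count. *)
val count = fn z => Fold.fold (0, fn n => n) z

(* Stepper that increments the accumulator. *)
val inc = fn z => Fold.step0 (fn n => n + 1) z

fun double s =
   let
      (* Re-polymorphize the monotyped parameter s. *)
      val s = fn z => Fold.lift0 s z
   in
      fn u => Fold.fold u s s
   end

val inc2 = fn z => double inc z

(* 1 + 2 + 1 + 2 increments in total. *)
val n = count inc inc2 inc inc2 $
```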
7163
7164
7165== Hiding the type of the accumulator ==
7166
7167For clarity and to avoid mistakes, it can be useful to hide the type
7168of the accumulator in a fold. Reworking the simple variable-argument
7169example to do this leads to the following.
7170
7171[source,sml]
7172----
structure S:>
   sig
      type ac
      val f: (ac, ac, unit, 'd) Fold.t
      val s: (ac, ac, 'b, 'c, 'd) Fold.step0
   end =
   struct
      type ac = unit
      val f = fn z => Fold.fold ((), fn () => ()) z
      val s = fn z => Fold.step0 (fn () => ()) z
   end
7184----
7185
7186The idea is to name the accumulator type and use opaque signature
7187matching to make it abstract. This can prevent improper manipulation
7188of the accumulator by client code and ensure invariants that the
7189folder and stepper would like to maintain.
7190
7191For a practical example of this technique, see <:ArrayLiteral:>.
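
The same pattern can be sketched with an accumulator that carries information. In the following illustration, `Counter`, `count`, and `tick` are hypothetical names, and the `Fold` types and functions are restated so the fragment is self-contained. Clients can chain steps and read the final count, but they cannot construct or inspect a value of type `ac` themselves.

```sml
fun $ (a, f) = f a

structure Fold =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a1, 'a2, 'b, 'c, 'd) step0 =
         ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step

      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

structure Counter:>
   sig
      type ac
      val count: (ac, ac, int, 'd) Fold.t
      val tick: (ac, ac, 'b, 'c, 'd) Fold.step0
   end =
   struct
      type ac = int   (* hidden from clients by opaque matching *)
      val count = fn z => Fold.fold (0, fn n => n) z
      val tick = fn z => Fold.step0 (fn n => n + 1) z
   end

val three = Counter.count Counter.tick Counter.tick Counter.tick $
```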
7192
7193
7194== Also see ==
7195
7196Fold has a number of practical applications. Here are some of them.
7197
7198* <:ArrayLiteral:>
7199* <:Fold01N:>
7200* <:FunctionalRecordUpdate:>
7201* <:NumericLiteral:>
7202* <:OptionalArguments:>
7203* <:Printf:>
7204* <:VariableArityPolymorphism:>
7205
7206There are a number of related techniques. Here are some of them.
7207
7208* <:StaticSum:>
7209* <:TypeIndexedValues:>
7210
7211<<<
7212
7213:mlton-guide-page: Fold01N
7214[[Fold01N]]
7215Fold01N
7216=======
7217
7218A common use pattern of <:Fold:> is to define a variable-arity
7219function that combines multiple arguments together using a binary
7220function. It is slightly tricky to do this directly using fold,
7221because of the special treatment required for the case of zero or one
7222argument. Here is a structure, `Fold01N`, that solves the problem
7223once and for all, and eases the definition of such functions.
7224
7225[source,sml]
7226----
structure Fold01N =
   struct
      fun fold {finish, start, zero} =
         Fold.fold ((id, finish, fn () => zero, start),
                    fn (finish, _, p, _) => finish (p ()))

      fun step0 {combine, input} =
         Fold.step0 (fn (_, finish, _, f) =>
                     (finish,
                      finish,
                      fn () => f input,
                      fn x' => combine (f input, x')))

      fun step1 {combine} z input =
         step0 {combine = combine, input = input} z
   end
7243----
7244
7245If one has a value `zero`, and functions `start`, `c`, and `finish`,
7246then one can define a variable-arity function `f` and stepper
7247+&grave;+ as follows.
7248[source,sml]
7249----
7250val f = fn z => Fold01N.fold {finish = finish, start = start, zero = zero} z
7251val ` = fn z => Fold01N.step1 {combine = c} z
7252----
7253
7254One can then use the fold equation to prove the following equations.
7255[source,sml]
7256----
7257f $ = zero
7258f `a1 $ = finish (start a1)
7259f `a1 `a2 $ = finish (c (start a1, a2))
7260f `a1 `a2 `a3 $ = finish (c (c (start a1, a2), a3))
7261...
7262----
7263
7264For an example of `Fold01N`, see <:VariableArityPolymorphism:>.
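
As a small illustration of the equations above, the following sketch joins strings into a bracketed list. The names `join` and `a`, and the particular choices of `zero`, `start`, `combine`, and `finish`, are assumptions made for the example; the supporting definitions are restated so the fragment stands alone.

```sml
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

structure Fold01N =
   struct
      fun fold {finish, start, zero} =
         Fold.fold ((id, finish, fn () => zero, start),
                    fn (finish, _, p, _) => finish (p ()))

      fun step0 {combine, input} =
         Fold.step0 (fn (_, finish, _, f) =>
                     (finish,
                      finish,
                      fn () => f input,
                      fn x' => combine (f input, x')))

      fun step1 {combine} z input =
         step0 {combine = combine, input = input} z
   end

val join =
   fn z => Fold01N.fold {finish = fn s => "[" ^ s ^ "]",
                         start = fn s => s,
                         zero = "[]"} z
val a = fn z => Fold01N.step1 {combine = fn (acc, s) => acc ^ ", " ^ s} z

val r0 = join $                      (* zero *)
val r1 = join a "x" $                (* finish (start "x") *)
val r3 = join a "x" a "y" a "z" $    (* finish (c (c (start "x", "y"), "z")) *)
```

Note that `zero` is returned as is in the zero-argument case; `finish` is applied only when there is at least one argument.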
7265
7266
7267== Typing Fold01N ==
7268
7269Here is the signature for `Fold01N`. We use a trick to avoid having
7270to duplicate the definition of some rather complex types in both the
7271signature and the structure. We first define the types in a
7272structure. Then, we define them via type re-definitions in the
7273signature, and via `open` in the full structure.
7274[source,sml]
7275----
structure Fold01N =
   struct
      type ('input, 'accum1, 'accum2, 'answer, 'zero,
            'a, 'b, 'c, 'd, 'e) t =
         (('zero -> 'zero)
          * ('accum2 -> 'answer)
          * (unit -> 'zero)
          * ('input -> 'accum1),
          ('a -> 'b) * 'c * (unit -> 'a) * 'd,
          'b,
          'e) Fold.t

      type ('input1, 'accum1, 'input2, 'accum2,
            'a, 'b, 'c, 'd, 'e, 'f) step0 =
         ('a * 'b * 'c * ('input1 -> 'accum1),
          'b * 'b * (unit -> 'accum1) * ('input2 -> 'accum2),
          'd, 'e, 'f) Fold.step0

      type ('accum1, 'input, 'accum2,
            'a, 'b, 'c, 'd, 'e, 'f, 'g) step1 =
         ('a,
          'b * 'c * 'd * ('a -> 'accum1),
          'c * 'c * (unit -> 'accum1) * ('input -> 'accum2),
          'e, 'f, 'g) Fold.step1
   end

signature FOLD_01N =
   sig
      type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) t =
         ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.t
      type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step0 =
         ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step0
      type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step1 =
         ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step1

      val fold:
         {finish: 'accum2 -> 'answer,
          start: 'input -> 'accum1,
          zero: 'zero}
         -> ('input, 'accum1, 'accum2, 'answer, 'zero,
             'a, 'b, 'c, 'd, 'e) t

      val step0:
         {combine: 'accum1 * 'input2 -> 'accum2,
          input: 'input1}
         -> ('input1, 'accum1, 'input2, 'accum2,
             'a, 'b, 'c, 'd, 'e, 'f) step0

      val step1:
         {combine: 'accum1 * 'input -> 'accum2}
         -> ('accum1, 'input, 'accum2,
             'a, 'b, 'c, 'd, 'e, 'f, 'g) step1
   end

structure Fold01N: FOLD_01N =
   struct
      open Fold01N

      fun fold {finish, start, zero} =
         Fold.fold ((id, finish, fn () => zero, start),
                    fn (finish, _, p, _) => finish (p ()))

      fun step0 {combine, input} =
         Fold.step0 (fn (_, finish, _, f) =>
                     (finish,
                      finish,
                      fn () => f input,
                      fn x' => combine (f input, x')))

      fun step1 {combine} z input =
         step0 {combine = combine, input = input} z
   end
7348----
7349
7350<<<
7351
7352:mlton-guide-page: ForeignFunctionInterface
7353[[ForeignFunctionInterface]]
7354ForeignFunctionInterface
7355========================
7356
7357MLton's foreign function interface (FFI) extends Standard ML and makes
7358it easy to take the address of C global objects, access C global
7359variables, call from SML to C, and call from C to SML. MLton also
7360provides <:MLNLFFI:ML-NLFFI>, which is a higher-level FFI for calling
7361C functions and manipulating C data from SML.
7362
7363== Overview ==
7364* <:ForeignFunctionInterfaceTypes:Foreign Function Interface Types>
7365* <:ForeignFunctionInterfaceSyntax:Foreign Function Interface Syntax>
7366
7367== Importing Code into SML ==
7368* <:CallingFromSMLToC:Calling From SML To C>
7369* <:CallingFromSMLToCFunctionPointer:Calling From SML To C Function Pointer>
7370
7371== Exporting Code from SML ==
7372* <:CallingFromCToSML:Calling From C To SML>
7373
7374== Building System Libraries ==
7375* <:LibrarySupport:Library Support>
7376
7377<<<
7378
7379:mlton-guide-page: ForeignFunctionInterfaceSyntax
7380[[ForeignFunctionInterfaceSyntax]]
7381ForeignFunctionInterfaceSyntax
7382==============================
7383
7384MLton extends the syntax of SML with expressions that enable a
7385<:ForeignFunctionInterface:> to C. The following description of the
7386syntax uses some abbreviations.
7387
7388[options="header"]
7389|====
7390| C base type | _cBaseTy_ | <:ForeignFunctionInterfaceTypes: Foreign Function Interface types>
7391| C argument type | _cArgTy_ | _cBaseTy_~1~ `*` ... `*` _cBaseTy_~n~ or `unit`
7392| C return type | _cRetTy_ | _cBaseTy_ or `unit`
7393| C function type | _cFuncTy_ | _cArgTy_ `->` _cRetTy_
7394| C pointer type | _cPtrTy_ | `MLton.Pointer.t`
7395|====
7396
7397The type annotation and the semicolon are not optional in the syntax
7398of <:ForeignFunctionInterface:> expressions. However, the type is
7399lexed, parsed, and elaborated as an SML type, so any type (including
7400type abbreviations) may be used, so long as it elaborates to a type of
7401the correct form.
7402
7403
7404== Address ==
7405
7406----
7407_address "CFunctionOrVariableName" attr... : cPtrTy;
7408----
7409
7410Denotes the address of the C function or variable.
7411
7412`attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7413
7414* `external` : import with external symbol scope (see <:LibrarySupport:>) (default).
7415* `private` : import with private symbol scope (see <:LibrarySupport:>).
7416* `public` : import with public symbol scope (see <:LibrarySupport:>).
7417
7418See <:MLtonPointer: MLtonPointer> for functions that manipulate C pointers.
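
A minimal sketch of an `_address` expression follows. The C variable name `counter` is hypothetical (it assumes the program is linked with C code that defines `int32_t counter;`), and the code assumes compilation with FFI syntax enabled, for example via `-default-ann 'allowFFI true'`.

```sml
(* Hypothetical: the linked C code is assumed to define `int32_t counter;`. *)
val counterAddr: MLton.Pointer.t =
   _address "counter" public: MLton.Pointer.t;

(* Read the 32-bit integer stored at index 0 from that address. *)
val current: Int32.int = MLton.Pointer.getInt32 (counterAddr, 0)
```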
7419
7420
7421== Symbol ==
7422
7423----
7424_symbol "CVariableName" attr... : (unit -> cBaseTy) * (cBaseTy -> unit);
7425----
7426
7427Denotes the _getter_ and _setter_ for a C variable. The __cBaseTy__s
7428must be identical.
7429
7430`attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7431
7432* `alloc` : allocate storage (and export a symbol) for the C variable.
7433* `external` : import or export with external symbol scope (see <:LibrarySupport:>) (default if not `alloc`).
7434* `private` : import or export with private symbol scope (see <:LibrarySupport:>).
7435* `public` : import or export with public symbol scope (see <:LibrarySupport:>) (default if `alloc`).
7436
7437
7438----
7439_symbol * : cPtrTy -> (unit -> cBaseTy) * (cBaseTy -> unit);
7440----
7441
7442Denotes the _getter_ and _setter_ for a C pointer to a variable.
7443The __cBaseTy__s must be identical.
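
As an illustration of the first form, the sketch below binds a getter and a setter for a C variable. The name `count` is hypothetical; the `alloc` attribute asks MLton to allocate the storage, so no C definition is assumed. FFI syntax is assumed to be enabled (for example with `-default-ann 'allowFFI true'`).

```sml
(* Hypothetical variable name; `alloc` makes MLton allocate the storage. *)
val (getCount, setCount) =
   _symbol "count" alloc: (unit -> Int32.int) * (Int32.int -> unit);

(* Write through the setter, then read the value back through the getter. *)
val () = setCount 5
val five = getCount ()
```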
7444
7445
7446== Import ==
7447
7448----
7449_import "CFunctionName" attr... : cFuncTy;
7450----
7451
7452Denotes an SML function whose behavior is implemented by calling the C
7453function. See <:CallingFromSMLToC: Calling from SML to C> for more
7454details.
7455
7456`attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7457
7458* `cdecl` : call with the `cdecl` calling convention (default).
7459* `external` : import with external symbol scope (see <:LibrarySupport:>) (default).
7460* `impure`: assert that the function depends upon state and/or performs side effects (default).
7461* `private` : import with private symbol scope (see <:LibrarySupport:>).
7462* `public` : import with public symbol scope (see <:LibrarySupport:>).
7463* `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>)
7464* `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function.
7465* `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
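
For example, a function from the C math library might be imported as follows. This is only a sketch: it assumes that the platform's `cos` is available at link time with external symbol scope, that FFI syntax is enabled (for example with `-default-ann 'allowFFI true'`), and the SML name `cCos` is arbitrary.

```sml
(* `pure` asserts that cos neither depends on state nor has side effects. *)
val cCos = _import "cos" pure: Real64.real -> Real64.real;

val one = cCos 0.0
```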
7466
7467
7468----
7469_import * attr... : cPtrTy -> cFuncTy;
7470----
7471
7472Denotes an SML function whose behavior is implemented by calling a C
7473function through a C function pointer.
7474
7475`attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7476
7477* `cdecl` : call with the `cdecl` calling convention (default).
7478* `impure`: assert that the function depends upon state and/or performs side effects (default).
7479* `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>)
7480* `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function.
7481* `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
7482
7483See
7484<:CallingFromSMLToCFunctionPointer: Calling from SML to C function pointer>
7485for more details.
7486
7487
7488== Export ==
7489
7490----
7491_export "CFunctionName" attr... : cFuncTy -> unit;
7492----
7493
7494Exports a C function with the name `CFunctionName` that can be used to
7495call an SML function of the type _cFuncTy_. When the function denoted
7496by the export expression is applied to an SML function `f`, subsequent
7497C calls to `CFunctionName` will call `f`. It is an error to call
7498`CFunctionName` before the export has been applied. The export may be
7499applied more than once, with each application replacing any previous
7500definition of `CFunctionName`.
7501
7502`attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized:
7503
7504* `cdecl` : call with the `cdecl` calling convention (default).
7505* `private` : export with private symbol scope (see <:LibrarySupport:>).
7506* `public` : export with public symbol scope (see <:LibrarySupport:>) (default).
7507* `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW).
7508
7509See <:CallingFromCToSML: Calling from C to SML> for more details.
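
A minimal sketch of an export follows. The C-visible name `sml_add` is hypothetical, and the code assumes that FFI syntax is enabled (for example with `-default-ann 'allowFFI true'`).

```sml
(* Evaluating the export expression yields a function that installs
 * an SML implementation under the C name. *)
val exportAdd =
   _export "sml_add": (Int32.int * Int32.int -> Int32.int) -> unit;

(* After this application, C calls to sml_add reach the SML function. *)
val () = exportAdd (fn (x, y) => x + y)
```

Applying `exportAdd` again with a different function would replace the definition, as described above.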
7510
7511<<<
7512
7513:mlton-guide-page: ForeignFunctionInterfaceTypes
7514[[ForeignFunctionInterfaceTypes]]
7515ForeignFunctionInterfaceTypes
7516=============================
7517
7518MLton's <:ForeignFunctionInterface:> only allows values of certain SML
7519types to be passed between SML and C. The following types are
7520allowed: `bool`, `char`, `int`, `real`, `word`. All of the different
7521sizes of (fixed-sized) integers, reals, and words are supported as
7522well: `Int8.int`, `Int16.int`, `Int32.int`, `Int64.int`,
7523`Real32.real`, `Real64.real`, `Word8.word`, `Word16.word`,
7524`Word32.word`, `Word64.word`. There is a special type,
7525`MLton.Pointer.t`, for passing C pointers -- see <:MLtonPointer:> for
7526details.
7527
Arrays, refs, and vectors of the above types are also allowed.
Because in MLton monomorphic arrays and vectors are exactly the same
as their polymorphic counterparts, these are also allowed. Hence,
`string`, `char vector`, and `CharVector.vector` are also allowed.
Strings are not null terminated unless you append the terminating
null character yourself on the SML side.
7534
7535Unfortunately, passing tuples or datatypes is not allowed because that
7536would interfere with representation optimizations.
7537
7538The C header file that `-export-header` generates includes
7539++typedef++s for the C types corresponding to the SML types. Here is
7540the mapping between SML types and C types.
7541
7542[options="header"]
7543|====
7544| SML type | C typedef | C type | Note
7545| `array` | `Pointer` | `unsigned char *` |
7546| `bool` | `Bool` | `int32_t` |
7547| `char` | `Char8` | `uint8_t` |
7548| `Int8.int` | `Int8` | `int8_t` |
7549| `Int16.int` | `Int16` | `int16_t` |
7550| `Int32.int` | `Int32` | `int32_t` |
7551| `Int64.int` | `Int64` | `int64_t` |
7552| `int` | `Int32` | `int32_t` | <:#Default:(default)>
7553| `MLton.Pointer.t` | `Pointer` | `unsigned char *` |
7554| `Real32.real` | `Real32` | `float` |
7555| `Real64.real` | `Real64` | `double` |
7556| `real` | `Real64` | `double` | <:#Default:(default)>
7557| `ref` | `Pointer` | `unsigned char *` |
7558| `string` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)>
7559| `vector` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)>
7560| `Word8.word` | `Word8` | `uint8_t` |
7561| `Word16.word` | `Word16` | `uint16_t` |
7562| `Word32.word` | `Word32` | `uint32_t` |
7563| `Word64.word` | `Word64` | `uint64_t` |
7564| `word` | `Word32` | `uint32_t` | <:#Default:(default)>
7565|====
7566
7567<!Anchor(Default)>Note (default): The default `int`, `real`, and
7568`word` types may be set by the ++-default-type __type__++
7569<:CompileTimeOptions: compiler option>. The given C typedef and C
7570types correspond to the default behavior.
7571
7572<!Anchor(ReadOnly)>Note (read only): Because MLton assumes that
7573vectors and strings are read-only (and will perform optimizations
7574that, for instance, cause them to share space), you must not modify
7575the data pointed to by the `unsigned char *` in C code.
7576
7577Although the C type of an array, ref, or vector is always `Pointer`,
7578in reality, the object has the natural C representation. Your C code
7579should cast to the appropriate C type if you want to keep the C
7580compiler from complaining.
7581
7582When calling an <:CallingFromSMLToC: imported C function from SML>
7583that returns an array, ref, or vector result or when calling an
7584<:CallingFromCToSML: exported SML function from C> that takes an
7585array, ref, or string argument, then the object must be an ML object
7586allocated on the ML heap. (Although an array, ref, or vector object
7587has the natural C representation, the object also has an additional
7588header used by the SML runtime system.)
7589
7590In addition, there is an <:MLBasis:> file, `$(SML_LIB)/basis/c-types.mlb`,
7591which provides structure aliases for various C types:
7592
7593|====
7594| C type | Structure | Signature
7595| `char` | `C_Char` | `INTEGER`
7596| `signed char` | `C_SChar` | `INTEGER`
7597| `unsigned char` | `C_UChar` | `WORD`
7598| `short` | `C_Short` | `INTEGER`
7599| `signed short` | `C_SShort` | `INTEGER`
7600| `unsigned short` | `C_UShort` | `WORD`
7601| `int` | `C_Int` | `INTEGER`
7602| `signed int` | `C_SInt` | `INTEGER`
7603| `unsigned int` | `C_UInt` | `WORD`
7604| `long` | `C_Long` | `INTEGER`
7605| `signed long` | `C_SLong` | `INTEGER`
7606| `unsigned long` | `C_ULong` | `WORD`
7607| `long long` | `C_LongLong` | `INTEGER`
7608| `signed long long` | `C_SLongLong` | `INTEGER`
7609| `unsigned long long` | `C_ULongLong` | `WORD`
7610| `float` | `C_Float` | `REAL`
7611| `double` | `C_Double` | `REAL`
7612| `size_t` | `C_Size` | `WORD`
7613| `ptrdiff_t` | `C_Ptrdiff` | `INTEGER`
7614| `intmax_t` | `C_Intmax` | `INTEGER`
7615| `uintmax_t` | `C_UIntmax` | `WORD`
7616| `intptr_t` | `C_Intptr` | `INTEGER`
7617| `uintptr_t` | `C_UIntptr` | `WORD`
7618| `void *` | `C_Pointer` | `WORD`
7619|====
7620
7621These aliases depend on the configuration of the C compiler for the
7622target architecture, and are independent of the configuration of MLton
7623(including the ++-default-type __type__++
7624<:CompileTimeOptions: compiler option>).
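
The following sketch combines two points from this page: SML strings are not null terminated, and the `C_Size` alias tracks the C compiler's `size_t`. It assumes that the program's MLB file lists `$(SML_LIB)/basis/c-types.mlb`, that FFI syntax is enabled, and that the C library's `strlen` is available at link time; the name `cLength` is illustrative.

```sml
(* strlen expects a NUL-terminated buffer and returns a size_t. *)
val strlen = _import "strlen": string -> C_Size.word;

(* Append the terminator on the SML side before handing the string to C. *)
fun cLength (s: string): int =
   C_Size.toInt (strlen (s ^ "\000"))
```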
7625
7626<<<
7627
7628:mlton-guide-page: ForLoops
7629[[ForLoops]]
7630ForLoops
7631========
7632
7633A `for`-loop is typically used to iterate over a range of consecutive
7634integers that denote indices of some sort. For example, in <:OCaml:>
7635a `for`-loop takes either the form
7636----
7637for <name> = <lower> to <upper> do <body> done
7638----
7639or the form
7640----
7641for <name> = <upper> downto <lower> do <body> done
7642----
7643
7644Some languages provide considerably more flexible `for`-loop or
7645`foreach`-constructs.
7646
7647A bit surprisingly, <:StandardML:Standard ML> provides special syntax
7648for `while`-loops, but not for `for`-loops. Indeed, in SML, many uses
7649of `for`-loops are better expressed using `app`, `foldl`/`foldr`,
7650`map` and many other higher-order functions provided by the
7651<:BasisLibrary:Basis Library> for manipulating lists, vectors and
7652arrays. However, the Basis Library does not provide a function for
7653iterating over a range of integer values. Fortunately, it is very
7654easy to write one.
7655
7656
7657== A fairly simple design ==
7658
7659The following implementation imitates both the syntax and semantics of
7660the OCaml `for`-loop.
7661
7662[source,sml]
7663----
datatype for = to of int * int
             | downto of int * int

infix to downto

val for =
    fn lo to up =>
       (fn f => let fun loop lo = if lo > up then ()
                                  else (f lo; loop (lo+1))
                in loop lo end)
     | up downto lo =>
       (fn f => let fun loop up = if up < lo then ()
                                  else (f up; loop (up-1))
                in loop up end)
7678----
7679
7680For example,
7681
7682[source,sml]
7683----
7684for (1 to 9)
7685 (fn i => print (Int.toString i))
7686----
7687
7688would print `123456789` and
7689
7690[source,sml]
7691----
7692for (9 downto 1)
7693 (fn i => print (Int.toString i))
7694----
7695
7696would print `987654321`.
7697
7698Straightforward formatting of nested loops
7699
7700[source,sml]
7701----
for (a to b)
    (fn i =>
        for (c to d)
            (fn j =>
                ...))
7707----
7708
7709is fairly readable, but tends to cause the body of the loop to be
7710indented quite deeply.
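
Since the loop body returns `unit`, results have to be communicated through side effects. The following sketch repeats the definition of `for` from above and uses a nested loop to collect index pairs in a `ref` cell; the names `pairs` and `result` are illustrative.

```sml
datatype for = to of int * int
             | downto of int * int

infix to downto

val for =
    fn lo to up =>
       (fn f => let fun loop lo = if lo > up then ()
                                  else (f lo; loop (lo+1))
                in loop lo end)
     | up downto lo =>
       (fn f => let fun loop up = if up < lo then ()
                                  else (f up; loop (up-1))
                in loop up end)

(* Accumulate visited index pairs, most recent first. *)
val pairs : (int * int) list ref = ref []

val () =
   for (1 to 2)
       (fn i =>
           for (3 downto 2)
               (fn j => pairs := (i, j) :: !pairs))

(* Both bounds are inclusive in this design. *)
val result = rev (!pairs)
```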
7711
7712
7713== Off-by-one ==
7714
7715The above design has an annoying feature. In practice, the upper
7716bound of the iterated range is almost always excluded and most loops
7717would subtract one from the upper bound:
7718
7719[source,sml]
7720----
7721for (0 to n-1) ...
7722for (n-1 downto 0) ...
7723----
7724
7725It is probably better to break convention and exclude the upper bound
7726by default, because it leads to more concise code and becomes
7727idiomatic with very little practice. The iterator combinators
7728described below exclude the upper bound by default.
7729
7730
7731== Iterator combinators ==
7732
7733While the simple `for`-function described in the previous section is
7734probably good enough for many uses, it is a bit cumbersome when one
7735needs to iterate over a Cartesian product. One might also want to
7736iterate over more than just consecutive integers. It turns out that
7737one can provide a library of iterator combinators that allow one to
7738implement iterators more flexibly.
7739
7740Since the types of the combinators may be a bit difficult to infer
7741from their implementations, let's first take a look at a signature of
7742the iterator combinator library:
7743
7744[source,sml]
7745----
signature ITER =
   sig
      type 'a t = ('a -> unit) -> unit

      val return : 'a -> 'a t
      val >>= : 'a t * ('a -> 'b t) -> 'b t

      val none : 'a t

      val to : int * int -> int t
      val downto : int * int -> int t

      val inList : 'a list -> 'a t
      val inVector : 'a vector -> 'a t
      val inArray : 'a array -> 'a t

      val using : ('a, 'b) StringCvt.reader -> 'b -> 'a t

      val when : 'a t * ('a -> bool) -> 'a t
      val by : 'a t * ('a -> 'b) -> 'b t
      val @@ : 'a t * 'a t -> 'a t
      val ** : 'a t * 'b t -> ('a, 'b) product t

      val for : 'a -> 'a
   end
7771----
7772
7773Several of the above combinators are meant to be used as infix
7774operators. Here is a set of suitable infix declarations:
7775
7776[source,sml]
7777----
7778infix 2 to downto
7779infix 1 @@ when by
7780infix 0 >>= **
7781----
7782
7783A few notes are in order:
7784
7785* The `'a t` type constructor with the `return` and `>>=` operators forms a monad.
7786
7787* The `to` and `downto` combinators will omit the upper bound of the range.
7788
7789* `for` is the identity function. It is purely for syntactic sugar and is not strictly required.
7790
7791* The `@@` combinator produces an iterator for the concatenation of the given iterators.
7792
7793* The `**` combinator produces an iterator for the Cartesian product of the given iterators.
7794** See <:ProductType:> for the type constructor `('a, 'b) product` used in the type of the iterator produced by `**`.
7795
7796* The `using` combinator allows one to iterate over slices, streams and many other kinds of sequences.
7797
7798* `when` is the filtering combinator. The name `when` is inspired by <:OCaml:>'s guard clauses.
7799
7800* `by` is the mapping combinator.
7801
7802The below implementation of the `ITER`-signature makes use of the
7803following basic combinators:
7804
7805[source,sml]
7806----
7807fun const x _ = x
7808fun flip f x y = f y x
7809fun id x = x
7810fun opt fno fso = fn NONE => fno () | SOME ? => fso ?
7811fun pass x f = f x
7812----
7813
Here is an implementation of the `ITER`-signature:
7815
7816[source,sml]
7817----
structure Iter :> ITER =
   struct
      type 'a t = ('a -> unit) -> unit

      val return = pass
      fun (iA >>= a2iB) f = iA (flip a2iB f)

      val none = ignore

      fun (l to u) f = let fun `l = if l<u then (f l; `(l+1)) else () in `l end
      fun (u downto l) f = let fun `u = if u>l then (f (u-1); `(u-1)) else () in `u end

      fun inList ? = flip List.app ?
      fun inVector ? = flip Vector.app ?
      fun inArray ? = flip Array.app ?

      fun using get s f = let fun `s = opt (const ()) (fn (x, s) => (f x; `s)) (get s) in `s end

      fun (iA when p) f = iA (fn a => if p a then f a else ())
      fun (iA by g) f = iA (f o g)
      fun (iA @@ iB) f = (iA f : unit; iB f)
      fun (iA ** iB) f = iA (fn a => iB (fn b => f (a & b)))

      val for = id
   end
7843----
7844
7845Note that some of the above combinators (e.g. `**`) could be expressed
7846in terms of the other combinators, most notably `return` and `>>=`.
7847Another implementation issue worth mentioning is that `downto` is
7848written specifically to avoid computing `l-1`, which could cause an
7849`Overflow`.
7850
7851To use the above combinators the `Iter`-structure needs to be opened
7852
7853[source,sml]
7854----
7855open Iter
7856----
7857
7858and one usually also wants to declare the infix status of the
7859operators as shown earlier.
7860
7861Here is an example that illustrates some of the features:
7862
7863[source,sml]
7864----
7865for (0 to 10 when (fn x => x mod 3 <> 0) ** inList ["a", "b"] ** 2 downto 1 by real)
7866 (fn x & y & z =>
7867 print ("("^Int.toString x^", \""^y^"\", "^Real.toString z^")\n"))
7868----
7869
7870Using the `Iter` combinators one can easily produce more complicated
7871iterators. For example, here is an iterator over a "triangle":
7872
7873[source,sml]
7874----
7875fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j)))
7876----
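
To make the behavior of these iterators concrete, here is a trimmed, self-contained sketch. It keeps only the combinators needed for the demonstration, renames the local loop functions, omits the signature ascription, and assumes the `product` datatype with the infix constructor `&` described on the <:ProductType:> page. The `ref` cells and the names `evens` and `triPairs` are illustrative.

```sml
datatype ('a, 'b) product = & of 'a * 'b
infix &

infix 2 to downto
infix 1 when by
infix 0 >>= **

fun flip f x y = f y x
fun pass x f = f x

structure Iter =
   struct
      val return = pass
      fun (iA >>= a2iB) f = iA (flip a2iB f)

      (* Both ranges exclude the upper bound u. *)
      fun (l to u) f =
         let fun loop l = if l < u then (f l; loop (l + 1)) else ()
         in loop l end
      fun (u downto l) f =
         let fun loop u = if u > l then (f (u - 1); loop (u - 1)) else ()
         in loop u end

      fun inList xs = flip List.app xs

      fun (iA when p) f = iA (fn a => if p a then f a else ())
      fun (iA by g) f = iA (f o g)
      fun (iA ** iB) f = iA (fn a => iB (fn b => f (a & b)))

      fun for x = x
   end

open Iter

(* Filtered range crossed with a list. *)
val out : (int * string) list ref = ref []
val () =
   for (0 to 4 when (fn x => x mod 2 = 0) ** inList ["a", "b"])
       (fn x & s => out := (x, s) :: !out)
val evens = rev (!out)

(* The "triangle" iterator, built from return and >>=. *)
fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j)))

val tri : (int * int) list ref = ref []
val () = for (triangle (0, 3)) (fn p => tri := p :: !tri)
val triPairs = rev (!tri)
```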
7877
7878<<<
7879
7880:mlton-guide-page: FrontEnd
7881[[FrontEnd]]
7882FrontEnd
7883========
7884
7885<:FrontEnd:> is a translation pass from source to the <:AST:>
7886<:IntermediateLanguage:>.
7887
7888== Description ==
7889
7890This pass performs lexing and parsing to produce an abstract syntax
7891tree.
7892
7893== Implementation ==
7894
7895* <!ViewGitFile(mlton,master,mlton/front-end/front-end.sig)>
7896* <!ViewGitFile(mlton,master,mlton/front-end/front-end.fun)>
7897
7898== Details and Notes ==
7899
7900The lexer is produced by <:MLLex:> from
7901<!ViewGitFile(mlton,master,mlton/front-end/ml.lex)>.
7902
7903The parser is produced by <:MLYacc:> from
7904<!ViewGitFile(mlton,master,mlton/front-end/ml.grm)>.
7905
7906The specifications for the lexer and parser were originally taken from
7907<:SMLNJ: SML/NJ> (version 109.32), but have been heavily modified
7908since then.
7909
7910<<<
7911
7912:mlton-guide-page: FSharp
7913[[FSharp]]
7914FSharp
7915======
7916
http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/[F#]
is a functional programming language developed at Microsoft Research.
F# was partly inspired by the <:OCaml:OCaml> language and shares some
common core constructs with it. F# is integrated with Visual Studio
2010 as a first-class language.
7922
7923<<<
7924
7925:mlton-guide-page: FunctionalRecordUpdate
7926[[FunctionalRecordUpdate]]
7927FunctionalRecordUpdate
7928======================
7929
7930Functional record update is the copying of a record while replacing
7931the values of some of the fields. <:StandardML:Standard ML> does not
7932have explicit syntax for functional record update. We will show below
7933how to implement functional record update in SML, with a little
7934boilerplate code.
7935
7936As an example, the functional update of the record
7937
7938[source,sml]
7939----
7940{a = 13, b = 14, c = 15}
7941----
7942
7943with `c = 16` yields a new record
7944
7945[source,sml]
7946----
7947{a = 13, b = 14, c = 16}
7948----
7949
7950Functional record update also makes sense with multiple simultaneous
7951updates. For example, the functional update of the record above with
7952`a = 18, c = 19` yields a new record
7953
7954[source,sml]
7955----
7956{a = 18, b = 14, c = 19}
7957----
7958
7959
One could easily imagine an extension of SML that supports
functional record update. For example,
7962
7963[source,sml]
7964----
7965e with {a = 16, b = 17}
7966----
7967
7968would create a copy of the record denoted by `e` with field `a`
7969replaced with `16` and `b` replaced with `17`.
7970
7971Since there is no such syntax in SML, we now show how to implement
7972functional record update directly. We first give a simple
7973implementation that has a number of problems. We then give an
7974advanced implementation, that, while complex underneath, is a reusable
7975library that admits simple use.
7976
7977
7978== Simple implementation ==
7979
7980To support functional record update on the record type
7981
7982[source,sml]
7983----
7984{a: 'a, b: 'b, c: 'c}
7985----
7986
7987first, define an update function for each component.
7988
7989[source,sml]
7990----
7991fun withA ({a = _, b, c}, a) = {a = a, b = b, c = c}
7992fun withB ({a, b = _, c}, b) = {a = a, b = b, c = c}
7993fun withC ({a, b, c = _}, c) = {a = a, b = b, c = c}
7994----
7995
7996Then, one can express `e with {a = 16, b = 17}` as
7997
7998[source,sml]
7999----
8000withB (withA (e, 16), 17)
8001----
8002
8003With infix notation
8004
8005[source,sml]
8006----
8007infix withA withB withC
8008----
8009
8010the syntax is almost as concise as a language extension.
8011
8012[source,sml]
8013----
8014e withA 16 withB 17
8015----
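
Putting the pieces together, the following is a minimal, self-contained sketch of the simple approach. The concrete record `e` and the printing code are illustrative additions, not part of the original example.

[source,sml]
----
(* One update function per field, as defined above. *)
fun withA ({a = _, b, c}, a) = {a = a, b = b, c = c}
fun withB ({a, b = _, c}, b) = {a = a, b = b, c = c}
fun withC ({a, b, c = _}, c) = {a = a, b = b, c = c}

(* Default fixity: precedence 0, left associative. *)
infix withA withB withC

(* Illustrative record; any record with fields a, b, and c works. *)
val e = {a = 13, b = 14, c = 15}

(* Parsed as (e withA 16) withB 17; field c is copied unchanged. *)
val e' = e withA 16 withB 17

val () =
   print (concat [Int.toString (#a e'), " ",
                  Int.toString (#b e'), " ",
                  Int.toString (#c e'), "\n"])
----

Because the operators are left associative, updates are applied from left to right, so a later update of the same field wins.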
8016
8017This approach suffers from the fact that the amount of boilerplate
8018code is quadratic in the number of record fields. Furthermore,
8019changing, adding, or deleting a field requires time proportional to
8020the number of fields (because each ++with__<L>__++ function must be
8021changed). It is also annoying to have to define a ++with__<L>__++
8022function, possibly with a fixity declaration, for each field.
8023
8024Fortunately, there is a solution to these problems.
8025
8026
8027== Advanced implementation ==
8028
Using <:Fold:> one can define a family of ++makeUpdate__<N>__++
functions and a single _update_ operator `set` so that one can define a
functional record update function for any record type simply by
specifying a (trivial) isomorphism between that type and a function
argument list. For example, suppose that we would like to do
8034functional record update on records with fields `a` and `b`. Then one
8035defines a function `updateAB` as follows.
8036
8037[source,sml]
8038----
8039val updateAB =
8040 fn z =>
8041 let
8042 fun from v1 v2 = {a = v1, b = v2}
8043 fun to f {a = v1, b = v2} = f v1 v2
8044 in
8045 makeUpdate2 (from, from, to)
8046 end
8047 z
8048----
8049
The functions `from` (think _from function arguments_) and `to` (think
_to function arguments_) specify an isomorphism between `a`,`b`
8052records and function arguments. There is a second use of `from` to
8053work around the lack of
8054<:FirstClassPolymorphism:first-class polymorphism> in SML.
8055
8056With the definition of `updateAB` in place, the following expressions
8057are valid.
8058
8059[source,sml]
8060----
8061updateAB {a = 13, b = "hello"} (set#b "goodbye") $
8062updateAB {a = 13.5, b = true} (set#b false) (set#a 12.5) $
8063----
8064
8065As another example, suppose that we would like to do functional record
8066update on records with fields `b`, `c`, and `d`. Then one defines a
8067function `updateBCD` as follows.
8068
8069[source,sml]
8070----
8071val updateBCD =
8072 fn z =>
8073 let
8074 fun from v1 v2 v3 = {b = v1, c = v2, d = v3}
8075 fun to f {b = v1, c = v2, d = v3} = f v1 v2 v3
8076 in
8077 makeUpdate3 (from, from, to)
8078 end
8079 z
8080----
8081
8082With the definition of `updateBCD` in place, the following expression
8083is valid.
8084
8085[source,sml]
8086----
8087updateBCD {b = 1, c = 2, d = 3} (set#c 4) (set#c 5) $
8088----
8089
8090Note that not all fields need be updated and that the same field may
8091be updated multiple times. Further note that the same `set` operator
8092is used for all update functions (in the above, for both `updateAB`
8093and `updateBCD`).
8094
8095In general, to define a functional-record-update function on records
8096with fields `f1`, `f2`, ..., `fN`, use the following template.
8097
8098[source,sml]
8099----
8100val update =
8101 fn z =>
8102 let
8103 fun from v1 v2 ... vn = {f1 = v1, f2 = v2, ..., fn = vn}
 fun to f {f1 = v1, f2 = v2, ..., fn = vn} = f v1 v2 ... vn
8105 in
8106 makeUpdateN (from, from, to)
8107 end
8108 z
8109----
8110
8111With this, one can update a record as follows.
8112
8113[source,sml]
8114----
8115update {f1 = v1, ..., fn = vn} (set#fi1 vi1) ... (set#fim vim) $
8116----
8117
8118
8119== The `FunctionalRecordUpdate` structure ==
8120
8121Here is the implementation of functional record update.
8122
8123[source,sml]
8124----
8125structure FunctionalRecordUpdate =
8126 struct
8127 local
8128 fun next g (f, z) x = g (f x, z)
8129 fun f1 (f, z) x = f (z x)
8130 fun f2 z = next f1 z
8131 fun f3 z = next f2 z
8132
8133 fun c0 from = from
8134 fun c1 from = c0 from f1
8135 fun c2 from = c1 from f2
8136 fun c3 from = c2 from f3
8137
8138 fun makeUpdate cX (from, from', to) record =
8139 let
8140 fun ops () = cX from'
8141 fun vars f = to f record
8142 in
8143 Fold.fold ((vars, ops), fn (vars, _) => vars from)
8144 end
8145 in
8146 fun makeUpdate0 z = makeUpdate c0 z
8147 fun makeUpdate1 z = makeUpdate c1 z
8148 fun makeUpdate2 z = makeUpdate c2 z
8149 fun makeUpdate3 z = makeUpdate c3 z
8150
8151 fun upd z = Fold.step2 (fn (s, f, (vars, ops)) => (fn out => vars (s (ops ()) (out, f)), ops)) z
8152 fun set z = Fold.step2 (fn (s, v, (vars, ops)) => (fn out => vars (s (ops ()) (out, fn _ => v)), ops)) z
8153 end
8154 end
8155----
8156
8157The idea of `makeUpdate` is to build a record of functions which can
8158replace the contents of one argument out of a list of arguments. The
functions ++f__<X>__++ replace the 0th, 1st, ... argument with the
result of applying their argument `z` to it. The ++c__<X>__++ functions
pass the first __X__ `f` functions to the record constructor.
8162
8163The `#field` notation of Standard ML allows us to select the map
8164function which replaces the corresponding argument. By converting the
8165record to an argument list, feeding that list through the selected map
8166function and piping the list into the record constructor, functional
8167record update is achieved.
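
To see the pieces working together, here is a self-contained sketch that can be compiled on its own. The `Fold` structure and `$` below are minimal stand-ins written for this example, not the library from the <:Fold:> page. In particular, the argument order of `step2` is an assumption chosen so that the parenthesized form `(set#b v)` used on this page is itself a step. The `FunctionalRecordUpdate` structure is repeated from above so that the example is complete.

[source,sml]
----
(* Stand-ins for the Fold definitions (assumed for this sketch). *)
fun $ (a, f) = f a
structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      (* Assumed order: the step's own arguments come first, so that
       * (set#b v) is a function still waiting for the fold state. *)
      fun step2 h b c (a, f) = fold (h (b, c, a), f)
   end

structure FunctionalRecordUpdate =
   struct
      local
         fun next g (f, z) x = g (f x, z)
         fun f1 (f, z) x = f (z x)
         fun f2 z = next f1 z
         fun f3 z = next f2 z

         fun c0 from = from
         fun c1 from = c0 from f1
         fun c2 from = c1 from f2
         fun c3 from = c2 from f3

         fun makeUpdate cX (from, from', to) record =
            let
               fun ops () = cX from'
               fun vars f = to f record
            in
               Fold.fold ((vars, ops), fn (vars, _) => vars from)
            end
      in
         fun makeUpdate0 z = makeUpdate c0 z
         fun makeUpdate1 z = makeUpdate c1 z
         fun makeUpdate2 z = makeUpdate c2 z
         fun makeUpdate3 z = makeUpdate c3 z

         fun upd z = Fold.step2 (fn (s, f, (vars, ops)) => (fn out => vars (s (ops ()) (out, f)), ops)) z
         fun set z = Fold.step2 (fn (s, v, (vars, ops)) => (fn out => vars (s (ops ()) (out, fn _ => v)), ops)) z
      end
   end
open FunctionalRecordUpdate

val updateAB =
   fn z =>
   let
      fun from v1 v2 = {a = v1, b = v2}
      fun to f {a = v1, b = v2} = f v1 v2
   in
      makeUpdate2 (from, from, to)
   end
   z

(* set replaces a field with a new value. *)
val r1 = updateAB {a = 13, b = "hello"} (set#b "goodbye") $
val r2 = updateAB {a = 13.5, b = true} (set#b false) (set#a 12.5) $
(* upd applies a function to the old value of a field. *)
val r3 = updateAB {a = 1, b = 2} (upd#a (fn x => x + 10)) $

val () = print (concat [Int.toString (#a r1), " ", #b r1, "\n"])
----

Tracing `r1` by hand: `#b` selects `f2` from the record built by `c2`, and `f2 (from, fn _ => "goodbye")` behaves like `from` except that its second argument is replaced, so applying it to the fields of the original record yields `{a = 13, b = "goodbye"}`.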
8168
8169
8170== Efficiency ==
8171
8172With MLton, the efficiency of this approach is as good as one would
8173expect with the special syntax. Namely a sequence of updates will be
8174optimized into a single record construction that copies the unchanged
8175fields and fills in the changed fields with their new values.
8176
8177Before Sep 14, 2009, this page advocated an alternative implementation
8178of <:FunctionalRecordUpdate:>. However, the old structure caused
8179exponentially increasing compile times. We advise you to switch to
8180the newer version.
8181
8182
8183== Applications ==
8184
8185Functional record update can be used to implement labelled
8186<:OptionalArguments:optional arguments>.
8187
8188<<<
8189
8190:mlton-guide-page: fxp
8191[[fxp]]
8192fxp
8193===
8194
8195http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/[fxp] is an XML
8196parser written in Standard ML.
8197
8198It has a
8199http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/mlton.html[patch]
8200to compile with MLton.
8201
8202<<<
8203
8204:mlton-guide-page: GarbageCollection
8205[[GarbageCollection]]
8206GarbageCollection
8207=================
8208
For a good introduction to and overview of garbage collection, see
8210<!Cite(Jones99)>.
8211
8212MLton's garbage collector uses copying, mark-compact, and generational
8213collection, automatically switching between them at run time based on
8214the amount of live data relative to the amount of RAM. The runtime
8215system tries to keep the heap within RAM if at all possible.
8216
8217MLton's copying collector is a simple, two-space, breadth-first,
8218Cheney-style collector. The design for the generational and
8219mark-compact GC is based on <!Cite(Sansom91)>.
8220
8221== Design notes ==
8222
8223* http://www.mlton.org/pipermail/mlton/2002-May/012420.html
8224+
8225object layout and header word design
8226
8227== Also see ==
8228
8229 * <:Regions:>
8230
8231<<<
8232
8233:mlton-guide-page: GenerativeDatatype
8234[[GenerativeDatatype]]
8235GenerativeDatatype
8236==================
8237
8238In <:StandardML:Standard ML>, datatype declarations are said to be
8239_generative_, because each time a datatype declaration is evaluated,
8240it yields a new type. Thus, any attempt to mix the types will lead to
8241a type error at compile-time. The following program, which does not
8242type check, demonstrates this.
8243
8244[source,sml]
8245----
8246functor F () =
8247 struct
8248 datatype t = T
8249 end
8250structure S1 = F ()
8251structure S2 = F ()
8252val _: S1.t -> S2.t = fn x => x
8253----
8254
8255Generativity also means that two different datatype declarations
8256define different types, even if they define identical constructors.
8257The following program does not type check due to this.
8258
8259[source,sml]
8260----
8261datatype t = A | B
8262val a1 = A
8263datatype t = A | B
8264val a2 = A
8265val _ = if true then a1 else a2
8266----
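
By contrast, datatype replication does not generate a new type; it only rebinds an existing one. The following sketch, which does type check, reuses the functor `F` from the first example; the structure `S3` and the names `id` and `ok` are illustrative.

[source,sml]
----
functor F () =
   struct
      datatype t = T
   end
structure S1 = F ()
structure S2 = F ()

(* Replication: S3.t is the same type as S1.t, and S3.T is the same
 * constructor as S1.T. *)
structure S3 =
   struct
      datatype t = datatype S1.t
   end

(* Accepted, because S1.t and S3.t are one type. *)
val id: S1.t -> S3.t = fn x => x

(* Still rejected if uncommented, because S2.t is a distinct type:
 * val _: S1.t -> S2.t = fn x => x
 *)
val ok = case id S1.T of S3.T => true
----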
8267
8268== Also see ==
8269
8270 * <:GenerativeException:>
8271
8272<<<
8273
8274:mlton-guide-page: GenerativeException
8275[[GenerativeException]]
8276GenerativeException
8277===================
8278
8279In <:StandardML:Standard ML>, exception declarations are said to be
8280_generative_, because each time an exception declaration is evaluated,
8281it yields a new exception.
8282
8283The following program demonstrates the generativity of exceptions.
8284
8285[source,sml]
8286----
8287exception E
8288val e1 = E
8289fun isE1 (e: exn): bool =
8290 case e of
8291 E => true
8292 | _ => false
8293exception E
8294val e2 = E
8295fun isE2 (e: exn): bool =
8296 case e of
8297 E => true
8298 | _ => false
8299fun pb (b: bool): unit =
8300 print (concat [Bool.toString b, "\n"])
8301val () = (pb (isE1 e1)
 ; pb (isE1 e2)
8303 ; pb (isE2 e1)
8304 ; pb (isE2 e2))
8305----
8306
8307In the above program, two different exception declarations declare an
8308exception `E` and a corresponding function that returns `true` only on
8309that exception. Although declared by syntactically identical
8310exception declarations, `e1` and `e2` are different exceptions. The
8311program, when run, prints `true`, `false`, `false`, `true`.
8312
8313A slight modification of the above program shows that even a single
8314exception declaration yields a new exception each time it is
8315evaluated.
8316
8317[source,sml]
8318----
8319fun f (): exn * (exn -> bool) =
8320 let
8321 exception E
8322 in
8323 (E, fn E => true | _ => false)
8324 end
8325val (e1, isE1) = f ()
8326val (e2, isE2) = f ()
8327fun pb (b: bool): unit =
8328 print (concat [Bool.toString b, "\n"])
8329val () = (pb (isE1 e1)
8330 ; pb (isE1 e2)
8331 ; pb (isE2 e1)
8332 ; pb (isE2 e2))
8333----
8334
8335Each call to `f` yields a new exception and a function that returns
8336`true` only on that exception. The program, when run, prints `true`,
8337`false`, `false`, `true`.
8338
8339
8340== Type Safety ==
8341
8342Exception generativity is required for type safety. Consider the
8343following valid SML program.
8344
8345[source,sml]
8346----
8347fun f (): ('a -> exn) * (exn -> 'a) =
8348 let
8349 exception E of 'a
8350 in
8351 (E, fn E x => x | _ => raise Fail "f")
8352 end
8353fun cast (a: 'a): 'b =
8354 let
8355 val (make: 'a -> exn, _) = f ()
8356 val (_, get: exn -> 'b) = f ()
8357 in
8358 get (make a)
8359 end
8360val _ = ((cast 13): int -> int) 14
8361----
8362
8363If exceptions weren't generative, then each call `f ()` would yield
8364the same exception constructor `E`. Then, our `cast` function could
8365use `make: 'a -> exn` to convert any value into an exception and then
8366`get: exn -> 'b` to convert that exception to a value of arbitrary
type. If `cast` worked, then we could cast an integer to a function
and apply it. Of course, because of generative exceptions, this program
8369raises `Fail "f"`.
8370
8371
8372== Applications ==
8373
8374The `exn` type is effectively a <:UniversalType:universal type>.
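
A minimal sketch of that idea follows. The function name `embed` and the value names are chosen for this example. Each call to `embed` evaluates a fresh exception declaration, so the projection it returns recognizes only values built by the matching injection.

[source,sml]
----
(* Returns an injection into exn and a partial projection out of it. *)
fun 'a embed (): ('a -> exn) * (exn -> 'a option) =
   let
      exception E of 'a
   in
      (E, fn E x => SOME x | _ => NONE)
   end

val (ofInt, toInt) = embed () : (int -> exn) * (exn -> int option)
val (ofString, toStr) = embed () : (string -> exn) * (exn -> string option)

(* Values of different types stored together in one exn list. *)
val xs: exn list = [ofInt 13, ofString "foo"]

(* Each projection picks out only the values of its own embedding. *)
val ints = List.mapPartial toInt xs
val strs = List.mapPartial toStr xs
----

Generativity is what makes this safe: calling `embed ()` a second time at type `int` would produce a projection that returns `NONE` on values created by `ofInt`.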
8375
8376
8377== Also see ==
8378
8379 * <:GenerativeDatatype:>
8380
8381<<<
8382
8383:mlton-guide-page: Git
8384[[Git]]
8385Git
8386===
8387
8388http://git-scm.com/[Git] is a distributed version control system. The
8389MLton project currently uses Git to maintain its
8390<:Sources:source code>.
8391
8392Here are some online Git resources.
8393
8394* http://git-scm.com/docs[Reference Manual]
8395* http://git-scm.com/book[ProGit, by Scott Chacon]
8396
8397<<<
8398
8399:mlton-guide-page: Glade
8400[[Glade]]
8401Glade
8402=====
8403
8404http://glade.gnome.org/features.html[Glade] is a tool for generating
8405Gtk user interfaces.
8406
8407<:WesleyTerpstra:> is working on a Glade->mGTK converter.
8408
8409* http://www.mlton.org/pipermail/mlton/2004-December/016865.html
8410
8411<<<
8412
8413:mlton-guide-page: Globalize
8414[[Globalize]]
8415Globalize
8416=========
8417
8418<:Globalize:> is an analysis pass for the <:SXML:>
8419<:IntermediateLanguage:>, invoked from <:ClosureConvert:>.
8420
8421== Description ==
8422
8423This pass marks values that are constant, allowing <:ClosureConvert:>
8424to move them out to the top level so they are only evaluated once and
8425do not appear in closures.
8426
8427== Implementation ==
8428
8429* <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.sig)>
8430* <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.fun)>
8431
8432== Details and Notes ==
8433
8434{empty}
8435
8436<<<
8437
8438:mlton-guide-page: GnuMP
8439[[GnuMP]]
8440GnuMP
8441=====
8442
8443The http://gmplib.org[GnuMP] library (GNU Multiple Precision
8444arithmetic library) is a library for arbitrary precision integer
8445arithmetic. MLton uses the GnuMP library to implement the
8446<:BasisLibrary: Basis Library> `IntInf` module.
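
As a small illustration of the kind of code that ends up calling into GnuMP, the following sketch computes a factorial that does not fit in a 64-bit word. The function `fact` is written for this example; only `IntInf` operations from the Basis Library are used.

[source,sml]
----
(* With the IntInf.int annotation, the literals and the arithmetic
 * operators resolve to arbitrary precision integers. *)
fun fact (n: IntInf.int): IntInf.int =
   if n <= 0 then 1 else n * fact (n - 1)

val f30 = fact 30

val () = print (IntInf.toString f30 ^ "\n")
----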
8447
8448== Known issues ==
8449
8450* There is a known problem with the GnuMP library (prior to version
84514.2.x), where it requires a lot of stack space for some computations,
8452e.g. `IntInf.toString` of a million digit number. If you run with
8453stack size limited, you may see a segfault in such programs. This
8454problem is mentioned in the http://gmplib.org/#FAQ[GnuMP FAQ], where
8455they describe two solutions.
8456
8457** Increase (or unlimit) your stack space. From your program, use
8458`setrlimit`, or from the shell, use `ulimit`.
8459
8460** Configure and rebuild `libgmp` with `--disable-alloca`, which will
8461cause it to allocate temporaries using `malloc` instead of on the
8462stack.
8463
8464* On some platforms, the GnuMP library may be configured to use one of
8465multiple ABIs (Application Binary Interfaces). For example, on some
846632-bit architectures, GnuMP may be configured to represent a limb as
8467either a 32-bit `long` or as a 64-bit `long long`. Similarly, GnuMP
8468may be configured to use specific CPU features.
8469+
8470In order to efficiently use the GnuMP library, MLton represents an
8471`IntInf.int` value in a manner compatible with the GnuMP library's
8472representation of a limb. Hence, it is important that MLton and the
8473GnuMP library agree upon the representation of a limb.
8474
8475** When using a source package of MLton, building will detect the
8476GnuMP library's representation of a limb.
8477
8478** When using a binary package of MLton that is dynamically linked
8479against the GnuMP library, the build machine and the install machine
8480must have the GnuMP library configured with the same representation of
8481a limb. (On the other hand, the build machine need not have the GnuMP
8482library configured with CPU features compatible with the install
8483machine.)
8484
8485** When using a binary package of MLton that is statically linked
8486against the GnuMP library, the build machine and the install machine
8487need not have the GnuMP library configured with the same
8488representation of a limb. (On the other hand, the build machine must
8489have the GnuMP library configured with CPU features compatible with
8490the install machine.)
8491+
8492However, MLton will be configured with the representation of a limb
8493from the GnuMP library of the build machine. Executables produced by
8494MLton will be incompatible with the GnuMP library of the install
8495machine. To _reconfigure_ MLton with the representation of a limb
8496from the GnuMP library of the install machine, one must edit:
8497+
8498----
8499/usr/lib/mlton/self/sizes
8500----
8501+
8502changing the
8503+
8504----
8505mplimb = ??
8506----
8507+
8508entry so that `??` corresponds to the bytes in a limb; and, one must edit:
8509+
8510----
8511/usr/lib/mlton/sml/basis/config/c/arch-os/c-types.sml
8512----
8513+
8514changing the
8515+
8516----
8517(* from "gmp.h" *)
8518structure C_MPLimb = struct open Word?? type t = word end
8519functor C_MPLimb_ChooseWordN (A: CHOOSE_WORDN_ARG) = ChooseWordN_Word?? (A)
8520----
8521+
8522entries so that `??` corresponds to the bits in a limb.
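
As a worked instance of the substitution described above, suppose the GnuMP library on the install machine uses a 64-bit (8-byte) limb. The limb size is an assumption of this example; check the installed `gmp.h` for the actual value. The `sizes` entry would then read

----
mplimb = 8
----

and the `c-types.sml` entries would read

[source,sml]
----
(* from "gmp.h" *)
structure C_MPLimb = struct open Word64 type t = word end
functor C_MPLimb_ChooseWordN (A: CHOOSE_WORDN_ARG) = ChooseWordN_Word64 (A)
----

Note that the first file counts bytes while the second counts bits, so the two `??` placeholders take different values for the same limb.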
8523
8524<<<
8525
8526:mlton-guide-page: GoogleSummerOfCode2013
8527[[GoogleSummerOfCode2013]]
8528Google Summer of Code (2013)
8529============================
8530
8531== Mentors ==
8532
8533The following developers have agreed to serve as mentors for the 2013 Google Summer of Code:
8534
8535* http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8536* http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8537* http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
8538
8539== Ideas List ==
8540
8541=== Implement a Partial Redundancy Elimination (PRE) Optimization ===
8542
8543Partial redundancy elimination (PRE) is a program transformation that
removes operations that are redundant on some, but not necessarily all,
paths through the program. PRE can subsume both common subexpression
8546elimination and loop-invariant code motion, and is therefore a
8547potentially powerful optimization. However, a na&iuml;ve
8548implementation of PRE on a program in static single assignment (SSA)
8549form is unlikely to be effective. This project aims to adapt and
8550implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA
8551intermediate language.
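
To make "redundant on some, but not necessarily all, paths" concrete, here is a hand-applied, source-level sketch of the transformation. It is only an illustration of the general idea, not the SSAPRE algorithm and not how MLton's SSA passes are written; the function names are invented for this example.

[source,sml]
----
(* Before: a + b is evaluated twice when p is true and once otherwise,
 * so the second evaluation is redundant on only one of the two paths. *)
fun original (p: bool, a: int, b: int): int =
   let
      val x = if p then a + b else 0
      val y = a + b
   in
      x * y
   end

(* After: the computation is inserted on the path that lacked it, which
 * makes the later use fully redundant; it is replaced by the saved value. *)
fun transformed (p: bool, a: int, b: int): int =
   let
      val (x, t) =
         if p
            then let val t = a + b in (t, t) end
         else (0, a + b)
      val y = t
   in
      x * y
   end
----

After the transformation, `a + b` is evaluated exactly once on every path, while the result of the function is unchanged.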
8552
8553Background:
8554--
8555* http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking
8556* http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
8557* http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
8558* http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
8559--
8560
8561Recommended Skills: SML programming experience; some middle-end compiler experience
8562
8563/////
8564Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8565/////
8566
8567=== Design and Implement a Heap Profiler ===
8568
8569A heap profile is a description of the space usage of a program. A
8570heap profile is concerned with the allocation, retention, and
8571deallocation (via garbage collection) of heap data during the
8572execution of a program. A heap profile can be used to diagnose
8573performance problems in a functional program that arise from space
8574leaks. This project aims to design and implement a heap profiler for
8575MLton compiled programs.
8576
8577Background:
8578--
8579* http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
8580* http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas R&ouml;jemo
8581* http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas R&ouml;jemo
8582* http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
8583--
8584
8585Recommended Skills: C and SML programming experience; some experience with UI and visualization
8586
8587/////
8588Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8589/////
8590
8591=== Garbage Collector Improvements ===
8592
8593The garbage collector plays a significant role in the performance of
8594functional languages. Garbage collect too often, and program
8595performance suffers due to the excessive time spent in the garbage
8596collector. Garbage collect not often enough, and program performance
8597suffers due to the excessive space used by the uncollected garbage.
8598One particular issue is ensuring that a program utilizing a garbage
8599collector "plays nice" with other processes on the system, by not
8600using too much or too little physical memory. While there are some
reasonable theoretical results about garbage collection with heaps of
8602fixed size, there seems to be insufficient work that really looks
8603carefully at the question of dynamically resizing the heap in response
8604to the live data demands of the application and, similarly, in
8605response to the behavior of the operating system and other processes.
8606This project aims to investigate improvements to the memory behavior of
8607MLton compiled programs through better tuning of the garbage
8608collector.
8609
8610Background:
8611--
8612* http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
8613* http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
8614* http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
8615* http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger
8616* http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
8617--
8618
8619Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
8620
8621/////
8622Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8623/////
8624
8625=== Implement Successor{nbsp}ML Language Features ===
8626
8627Any programming language, including Standard{nbsp}ML, can be improved.
8628The community has identified a number of modest extensions and
8629revisions to the Standard{nbsp}ML programming language that would
8630likely prove useful in practice. This project aims to implement these
8631language features in the MLton compiler.
8632
8633Background:
8634--
8635* http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML]
8636* http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)]
8637* http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel
8638--
8639
8640Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
8641
8642/////
8643Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8644/////
8645
8646=== Implement Source-level Debugging ===
8647
8648Debugging is a fact of programming life. Unfortunately, most SML
8649implementations (including MLton) provide little to no source-level
8650debugging support. This project aims to add basic to intermediate
8651source-level debugging support to the MLton compiler. MLton already
8652supports source-level profiling, which can be used to attribute bytes
8653allocated or time spent in source functions. It should be relatively
8654straightforward to leverage this source-level information into basic
8655source-level debugging support, with the ability to set/unset
8656breakpoints and step through declarations and functions. It may be
8657possible to also provide intermediate source-level debugging support,
8658with the ability to inspect in-scope variables of basic types (e.g.,
8659types compatible with MLton's foreign function interface).
8660
8661Background:
8662--
8663* http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
8664* http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
8665* http://dwarfstd.org/[DWARF Debugging Standard]
8666* http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
8667--
8668
8669Recommended Skills: SML programming experience; some compiler experience
8670
8671/////
8672Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8673/////
8674
8675=== SIMD Primitives ===
8676
8677Most modern processors offer some direct support for SIMD (Single
8678Instruction, Multiple Data) operations, such as Intel's MMX/SSE
8679instructions, AMD's 3DNow! instructions, and IBM's AltiVec. Such
8680instructions are particularly useful for multimedia, scientific, and
8681cryptographic applications. This project aims to add preliminary
8682support for vector data and vector operations to the MLton compiler.
8683Ideally, after surveying SIMD instruction sets and SIMD support in
8684other compilers, a core set of SIMD primitives with broad architecture
8685and compiler support can be identified. After adding SIMD primitives
8686to the core compiler and carrying them through to the various
8687backends, there will be opportunities to design and implement an SML
8688library that exposes the primitives to the SML programmer as well as
8689opportunities to design and implement auto-vectorization
8690optimizations.
8691
8692Background:
8693--
8694* http://en.wikipedia.org/wiki/SIMD[SIMD]
8695* http://gcc.gnu.org/projects/tree-ssa/vectorization.html[Auto-vectorization in GCC]
8696* http://llvm.org/docs/Vectorizers.html[Auto-vectorization in LLVM]
8697--
8698
8699Recommended Skills: SML programming experience; some compiler experience; some computer architecture experience
8700
8701/////
8702Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8703/////
8704
8705=== RTOS Support ===
8706
This project entails porting the MLton compiler to RTOSs such as
RTEMS, RT Linux, and FreeRTOS. The project will include modifications
8709to the MLton build and configuration process. Students will need to
8710extend the MLton configuration process for each of the RTOSs. The
8711MLton compilation process will need to be extended to invoke the C
8712cross compilers the RTOSs provide for embedded support. Test scripts
8713for validation will be necessary and these will need to be run in
8714emulators for supported architectures.
8715
8716Recommended Skills: C programming experience; some scripting experience
8717
8718/////
8719Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8720/////
8721
8722=== Region Based Memory Management ===
8723
8724Region based memory management is an alternative automatic memory
8725management scheme to garbage collection. Regions can be inferred by
8726the compiler (e.g., Cyclone and MLKit) or provided to the programmer
through a library. Since many students do not have extensive
experience with compilers, we plan on adopting the latter approach.
Creating a viable region based memory solution requires the removal of
the GC and changes to the allocator. Additionally, write barriers
will be necessary to ensure that a reference between two ML objects is
never established if the left hand side of the assignment has a longer
8733lifetime than the right hand side. Students will need to come up with
8734an appropriate interface for creating, entering, and exiting regions
8735(examples include RTSJ scoped memory and SCJ scoped memory).
8736
8737Background:
8738--
8739* Cyclone
8740* MLKit
8741* RTSJ + SCJ scopes
8742--
8743
8744Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
8745
8746/////
8747Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8748/////
8749
=== Integration of MultiMLton ===
8751
8752http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime
8753environment that targets scalable multicore platforms. It is an
8754extension of MLton. It combines new language abstractions and
8755associated compiler analyses for expressing and implementing various
8756kinds of fine-grained parallelism (safe futures, speculation,
8757transactions, etc.), along with a sophisticated runtime system tuned
8758to efficiently handle large numbers of lightweight threads. The core
8759stable features of MultiMLton will need to be integrated with the
latest MLton public release. Certain experimental features, such as
support for the Intel SCC and the distributed runtime, will be omitted.
8762This project requires students to understand the delta between the
8763MultiMLton code base and the MLton code base. Students will need to
8764create build and configuration scripts for MLton to enable MultiMLton
8765features.
8766
Background:
8768--
8769* http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications]
8770--
8771
8772Recommended Skills: SML programming experience; C programming experience; some compiler experience
8773
8774/////
8775Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8776/////
8777
8778<<<
8779
8780:mlton-guide-page: GoogleSummerOfCode2014
8781[[GoogleSummerOfCode2014]]
8782Google Summer of Code (2014)
8783============================
8784
8785== Mentors ==
8786
8787The following developers have agreed to serve as mentors for the 2014 Google Summer of Code:
8788
8789* http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8790* http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8791* http://people.cs.uchicago.edu/~jhr/[John Reppy]
8792* http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan]
8793/////
8794* http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
8795/////
8796
8797== Ideas List ==
8798
8799=== Implement a Partial Redundancy Elimination (PRE) Optimization ===
8800
8801Partial redundancy elimination (PRE) is a program transformation that
removes operations that are redundant on some, but not necessarily all,
paths through the program. PRE can subsume both common subexpression
8804elimination and loop-invariant code motion, and is therefore a
8805potentially powerful optimization. However, a na&iuml;ve
8806implementation of PRE on a program in static single assignment (SSA)
8807form is unlikely to be effective. This project aims to adapt and
8808implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA
8809intermediate language.
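
As a minimal illustration of what "partially redundant" means, consider
a sum that is computed on only one path and then again after the join.
The names `original` and `transformed` are invented for this sketch, and
the rewrite is done by hand at the source level; it is not the SSAPRE
algorithm and does not operate on MLton's SSA form.

[source,sml]
----
(* a + b is computed in the "then" branch and again after the join, so
   the second computation is redundant only when flag is true. *)
fun original (a, b, flag) =
   let
      val x = if flag then a + b else 0
      val y = a + b
   in
      x + y
   end

(* Hand-applied PRE: insert the computation on the path that lacked it
   and reuse the result, so every path computes a + b exactly once. *)
fun transformed (a, b, flag) =
   let
      val (x, t) =
         if flag
            then let val t = a + b in (t, t) end
         else (0, a + b)
   in
      x + t
   end
----

In `original`, the sum is evaluated twice when `flag` is true; in
`transformed`, it is evaluated once on either path, and both functions
return the same result.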
8810
8811Background:
8812--
8813* http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking
8814* http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
8815* http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
8816* http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
8817--
8818
8819Recommended Skills: SML programming experience; some middle-end compiler experience
8820
8821/////
8822Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8823/////
8824
8825=== Design and Implement a Heap Profiler ===
8826
8827A heap profile is a description of the space usage of a program. A
8828heap profile is concerned with the allocation, retention, and
8829deallocation (via garbage collection) of heap data during the
8830execution of a program. A heap profile can be used to diagnose
8831performance problems in a functional program that arise from space
8832leaks. This project aims to design and implement a heap profiler for
8833MLton compiled programs.
8834
8835Background:
8836--
8837* http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
8838* http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas R&ouml;jemo
8839* http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas R&ouml;jemo
8840* http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
8841--
8842
8843Recommended Skills: C and SML programming experience; some experience with UI and visualization
8844
8845/////
8846Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8847/////
8848
8849=== Garbage Collector Improvements ===
8850
8851The garbage collector plays a significant role in the performance of
8852functional languages. Garbage collect too often, and program
8853performance suffers due to the excessive time spent in the garbage
8854collector. Garbage collect not often enough, and program performance
8855suffers due to the excessive space used by the uncollected garbage.
8856One particular issue is ensuring that a program utilizing a garbage
8857collector "plays nice" with other processes on the system, by not
8858using too much or too little physical memory. While there are some
8859reasonable theoretical results about garbage collections with heaps of
8860fixed size, there seems to be insufficient work that really looks
8861carefully at the question of dynamically resizing the heap in response
8862to the live data demands of the application and, similarly, in
8863response to the behavior of the operating system and other processes.
8864This project aims to investigate improvements to the memory behavior of
8865MLton compiled programs through better tuning of the garbage
8866collector.
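
One way to picture the policy question is as a function from the live
data size and the physical memory size to a target heap size. The
sketch below is purely illustrative: the name `desiredHeapSize`, the
growth factor of 2, and the cap at half of RAM are assumptions chosen
for the example, not MLton's actual resizing policy.

[source,sml]
----
(* Hypothetical policy: aim for a heap twice the size of the live data,
   but leave half of physical memory to other processes.  The heap is
   never smaller than the live data itself. *)
fun desiredHeapSize {liveBytes : IntInf.int, ramBytes : IntInf.int} : IntInf.int =
   let
      val growthFactor : IntInf.int = 2   (* assumption *)
      val ramCap = ramBytes div 2         (* assumption *)
      val wanted = liveBytes * growthFactor
   in
      IntInf.max (liveBytes, IntInf.min (wanted, ramCap))
   end
----

A real policy would also have to react to feedback from the operating
system, such as paging activity, which this sketch ignores.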
8867
8868Background:
8869--
8870* http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
8871* http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
8872* http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
8873* http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger
8874* http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
8875--
8876
8877Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
8878
8879/////
8880Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8881/////
8882
8883=== Implement Successor{nbsp}ML Language Features ===
8884
8885Any programming language, including Standard{nbsp}ML, can be improved.
8886The community has identified a number of modest extensions and
8887revisions to the Standard{nbsp}ML programming language that would
8888likely prove useful in practice. This project aims to implement these
8889language features in the MLton compiler.
8890
8891Background:
8892--
8893* http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML]
8894* http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)]
8895* http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel
8896--
8897
8898Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers)
8899
8900/////
8901Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8902/////
8903
8904=== Implement Source-level Debugging ===
8905
8906Debugging is a fact of programming life. Unfortunately, most SML
8907implementations (including MLton) provide little to no source-level
8908debugging support. This project aims to add basic to intermediate
8909source-level debugging support to the MLton compiler. MLton already
8910supports source-level profiling, which can be used to attribute bytes
8911allocated or time spent in source functions. It should be relatively
8912straightforward to leverage this source-level information into basic
8913source-level debugging support, with the ability to set/unset
8914breakpoints and step through declarations and functions. It may be
8915possible to also provide intermediate source-level debugging support,
8916with the ability to inspect in-scope variables of basic types (e.g.,
8917types compatible with MLton's foreign function interface).
8918
8919Background:
8920--
8921* http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
8922* http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
8923* http://dwarfstd.org/[DWARF Debugging Standard]
8924* http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
8925--
8926
8927Recommended Skills: SML programming experience; some compiler experience
8928
8929/////
8930Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
8931/////
8932
8933=== Region Based Memory Management ===
8934
8935Region based memory management is an alternative automatic memory
8936management scheme to garbage collection. Regions can be inferred by
8937the compiler (e.g., Cyclone and MLKit) or provided to the programmer
8938through a library. Since many students do not have extensive
experience with compilers, we plan on adopting the latter approach.
8940Creating a viable region based memory solution requires the removal of
8941the GC and changes to the allocator. Additionally, write barriers
will be necessary to ensure that a reference between two ML objects is never
8943established if the left hand side of the assignment has a longer
8944lifetime than the right hand side. Students will need to come up with
8945an appropriate interface for creating, entering, and exiting regions
8946(examples include RTSJ scoped memory and SCJ scoped memory).
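
To make the interface question concrete, here is one possible shape for
such a library. Everything in it is hypothetical: the signature name
`REGION`, its operations, and the `ToyRegion` structure are invented for
this sketch, and the toy implementation only tracks nesting without
managing any memory.

[source,sml]
----
signature REGION =
sig
   type region
   (* Create a fresh, empty region. *)
   val create : unit -> region
   (* Run the function with the region entered; the region is exited
      when the function returns or raises. *)
   val enter : region * (unit -> 'a) -> 'a
   (* Number of regions currently entered; 0 means none is active. *)
   val depth : unit -> int
end

(* Toy model: counts nesting only and performs no allocation. *)
structure ToyRegion :> REGION =
struct
   type region = {id : int}
   val nextId = ref 0
   val current = ref 0
   fun create () = {id = (nextId := !nextId + 1; !nextId)}
   fun enter (_ : region, f) =
      let
         val () = current := !current + 1
         val result = f () handle e => (current := !current - 1; raise e)
         val () = current := !current - 1
      in
         result
      end
   fun depth () = !current
end
----

Tying exit to the return of the function passed to `enter` mirrors the
scoped style of the RTSJ and SCJ examples; an interface with separate
enter and exit calls would be another option.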
8947
8948Background:
8949--
8950* Cyclone
8951* MLKit
8952* RTSJ + SCJ scopes
8953--
8954
8955Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
8956
8957/////
8958Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8959/////
8960
8961=== Integration of Multi-MLton ===
8962
8963http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime
8964environment that targets scalable multicore platforms. It is an
8965extension of MLton. It combines new language abstractions and
8966associated compiler analyses for expressing and implementing various
8967kinds of fine-grained parallelism (safe futures, speculation,
8968transactions, etc.), along with a sophisticated runtime system tuned
8969to efficiently handle large numbers of lightweight threads. The core
8970stable features of MultiMLton will need to be integrated with the
8971latest MLton public release. Certain experimental features, such as
support for the Intel SCC and distributed runtime, will be omitted.
8973This project requires students to understand the delta between the
8974MultiMLton code base and the MLton code base. Students will need to
8975create build and configuration scripts for MLton to enable MultiMLton
8976features.
8977
Background:
8979--
8980* http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications]
8981--
8982
8983Recommended Skills: SML programming experience; C programming experience; some compiler experience
8984
8985/////
8986Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
8987/////
8988
8989=== Concurrent{nbsp}ML Improvements ===
8990
8991http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
8992library based on synchronous message passing. MLton has a partial
8993implementation of the CML message-passing primitives, but its use in
real-world applications has been stymied by its incompleteness and by
the lack of thread-safe I/O libraries. This project would aim to flesh out
8996the CML implementation in MLton to be fully compatible with the
8997"official" version distributed as part of SML/NJ. Furthermore, time
8998permitting, runtime system support could be added to allow use of
8999modern OS features, such as asynchronous I/O, in the implementation of
9000CML's system interfaces.
9001
Background:
9003--
9004* http://cml.cs.uchicago.edu/
9005* http://mlton.org/ConcurrentML
9006* http://mlton.org/ConcurrentMLImplementation
9007--
9008
9009Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience
9010
9011/////
9012Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9013Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9014/////
9015
9016/////
9017=== SML3d Development ===
9018
9019The SML3d Project is a collection of libraries to support 3D graphics
9020programming using Standard ML and the http://opengl.org/[OpenGL]
9021graphics API. It currently requires the MLton implementation of SML
9022and is supported on Linux, Mac OS X, and Microsoft Windows. There is
9023also support for http://www.khronos.org/opencl/[OpenCL]. This project
9024aims to continue development of the SML3d Project.
9025
9026Background
9027--
9028* http://sml3d.cs.uchicago.edu/
9029--
9030
9031Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9032/////
9033
9034<<<
9035
9036:mlton-guide-page: GoogleSummerOfCode2015
9037[[GoogleSummerOfCode2015]]
9038Google Summer of Code (2015)
9039============================
9040
9041== Mentors ==
9042
9043The following developers have agreed to serve as mentors for the 2015 Google Summer of Code:
9044
9045* http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9046* http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9047/////
9048* http://people.cs.uchicago.edu/~jhr/[John Reppy]
9049* http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan]
9050* http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan]
9051/////
9052
9053== Ideas List ==
9054
9055/////
9056=== Implement a Partial Redundancy Elimination (PRE) Optimization ===
9057
9058Partial redundancy elimination (PRE) is a program transformation that
removes operations that are redundant on some, but not necessarily all,
paths through the program. PRE can subsume both common subexpression
9061elimination and loop-invariant code motion, and is therefore a
9062potentially powerful optimization. However, a naïve implementation of
9063PRE on a program in static single assignment (SSA) form is unlikely to
9064be effective. This project aims to adapt and implement the GVN-PRE
9065algorithm of Thomas VanDrunen in MLton's SSA intermediate language.
9066
9067Background:
9068--
9069* http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen
9070* http://www.cs.purdue.edu/research/technical_reports/2003/TR%2003-032.pdf[Corner-cases in Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
9071* http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking
9072* http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based Partial Redundancy Elimination for Static Single Assignment Form]; Thomas VanDrunen and Antony L. Hosking
9073* http://portal.acm.org/citation.cfm?doid=319301.319348[Partial Redundancy Elimination in SSA Form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow
9074--
9075
9076Recommended Skills: SML programming experience; some middle-end compiler experience
9077
9078Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9079/////
9080
9081=== Design and Implement a Heap Profiler ===
9082
9083A heap profile is a description of the space usage of a program. A
9084heap profile is concerned with the allocation, retention, and
9085deallocation (via garbage collection) of heap data during the
9086execution of a program. A heap profile can be used to diagnose
9087performance problems in a functional program that arise from space
9088leaks. This project aims to design and implement a heap profiler for
9089MLton compiled programs.
9090
9091Background:
9092--
9093* http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones
9094* http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas R&ouml;jemo
9095* http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas R&ouml;jemo
9096* http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling
9097--
9098
9099Recommended Skills: C and SML programming experience; some experience with UI and visualization
9100
9101/////
9102Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9103/////
9104
9105=== Garbage Collector Improvements ===
9106
9107The garbage collector plays a significant role in the performance of
9108functional languages. Garbage collect too often, and program
9109performance suffers due to the excessive time spent in the garbage
9110collector. Garbage collect not often enough, and program performance
9111suffers due to the excessive space used by the uncollected
9112garbage. One particular issue is ensuring that a program utilizing a
9113garbage collector "plays nice" with other processes on the system, by
9114not using too much or too little physical memory. While there are some
9115reasonable theoretical results about garbage collections with heaps of
9116fixed size, there seems to be insufficient work that really looks
9117carefully at the question of dynamically resizing the heap in response
9118to the live data demands of the application and, similarly, in
9119response to the behavior of the operating system and other
9120processes. This project aims to investigate improvements to the memory
9121behavior of MLton compiled programs through better tuning of the
9122garbage collector.
9123
9124Background:
9125--
9126* http://gchandbook.org/[The Garbage Collection Handbook: The Art of Automatic Memory Management]; Richard Jones, Antony Hosking, Eliot Moss
9127* http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020[Dual-Mode Garbage Collection]; Patrick Sansom
9128* http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic Heap Sizing: Taking Real Memory into Account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss
9129* http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham
9130* http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski
9131* http://portal.acm.org/citation.cfm?doid=1806651.1806669[The Economics of Garbage Collection]; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján
9132* http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews
9133* http://portal.acm.org/citation.cfm?doid=2555670.2466481[Control Theory for Principled Heap Sizing]; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones
9134--
9135
9136Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience
9137
9138/////
9139Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9140/////
9141
9142=== Heap-allocated Activation Records ===
9143
9144Activation records (a.k.a., stack frames) are traditionally allocated
9145on a stack. This naturally corresponds to the call-return pattern of
9146function invocation. However, there are some disadvantages to
9147stack-allocated activation records. In a functional programming
9148language, functions may be deeply recursive, resulting in call stacks
9149that are much larger than typically supported by the operating system;
9150hence, a functional programming language implementation will typically
9151store its stack in its heap. Furthermore, a functional programming
9152language implementation must handle and recover from stack overflow,
9153by allocating a larger stack (again, in its heap) and copying
9154activation records from the old stack to the new stack. In the
9155presence of threads, stacks must be allocated in a heap and, in the
9156presence of a garbage collector, should be garbage collected when
9157unreachable. While heap-allocated activation records avoid many of
9158these disadvantages, they have not been widely implemented. This
9159project aims to implement and evaluate heap-allocated activation
9160records in the MLton compiler.
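
A source-level analogy may help to picture the idea. In the sketch
below (the names +fact+, +factK+, and +factCps+ are invented for
illustration), the continuation-passing version keeps each pending
multiplication in a closure rather than in a stack frame. This is only
an analogy written by hand; it is not how MLton represents activation
records, nor the design this project would necessarily adopt.

[source,sml]
----
(* Direct style: each pending multiplication waits in a stack frame. *)
fun fact n = if n = 0 then 1 else n * fact (n - 1)

(* Continuation-passing style: each pending multiplication is captured
   in the closure k, and every call is a tail call. *)
fun factK (n, k : int -> int) =
   if n = 0 then k 1 else factK (n - 1, fn r => k (n * r))

fun factCps n = factK (n, fn r => r)
----

The chain of closures built by +factK+ plays the role of the call
stack: it grows one link per call and becomes unreachable as soon as
the final continuation has run.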
9161
9162Background:
9163--
9164* http://journals.cambridge.org/action/displayAbstract?aid=1295104[Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures]; Andrew W. Appel and Zhong Shao
9165* http://portal.acm.org/citation.cfm?doid=182590.156783[Space-efficient closure representations]; Zhong Shao and Andrew W. Appel
9166* http://portal.acm.org/citation.cfm?doid=93548.93554[Representing control in the presence of first-class continuations]; R. Hieb, R. Kent Dybvig, and Carl Bruggeman
9167--
9168
9169Recommended Skills: SML programming experience; some middle- and back-end compiler experience
9170
9171/////
9172Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9173/////
9174
9175=== Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML ===
9176
9177The
9178http://en.wikipedia.org/wiki/IEEE_754-2008[IEEE Standard for Floating-Point Arithmetic (IEEE 754)]
9179is the de facto representation for floating-point computation.
9180However, it is a _binary_ (base 2) representation of floating-point
9181values, while many applications call for input and output of
9182floating-point values in _decimal_ (base 10) representation. The
9183_decimal-to-binary_ conversion problem takes a decimal floating-point
9184representation (e.g., a string like +"0.1"+) and returns the best
9185binary floating-point representation of that number. The
9186_binary-to-decimal_ conversion problem takes a binary floating-point
9187representation and returns a decimal floating-point representation
9188using the smallest number of digits that allow the decimal
9189floating-point representation to be converted to the original binary
9190floating-point representation. For both conversion routines, "best"
9191is dependent upon the current floating-point rounding mode.
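
The two conversion problems can be seen from the Basis Library side
with +Real.fromString+ and +Real.fmt+. The sketch below is a small
round-trip check, assuming that +Real.fmt StringCvt.EXACT+ produces a
string that reads back to the same value, as the Basis Library
describes; the value names are invented for the example.

[source,sml]
----
(* Decimal-to-binary: parse a decimal string into a binary real. *)
val tenth : real = valOf (Real.fromString "0.1")

(* Binary-to-decimal: print the real with enough digits to read it back. *)
val printed : string = Real.fmt StringCvt.EXACT tenth

(* Converting the printed form back should give the original value. *)
val roundTrips : bool =
   case Real.fromString printed of
      SOME r => Real.== (r, tenth)
    | NONE => false
----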
9192
9193MLton uses David Gay's
9194http://www.netlib.org/fp/gdtoa.tgz[gdtoa library] for floating-point
conversions. While this is an excellent library, it generalizes the
9196decimal-to-binary and binary-to-decimal conversion routines beyond
9197what is required by the
9198http://standardml.org/Basis/[Standard ML Basis Library] and induces an
9199external dependency on the compiler. Native implementations of these
9200conversion routines in Standard ML would obviate the dependency on the
9201+gdtoa+ library, while also being able to take advantage of Standard
9202ML features in the implementation (e.g., the published algorithms
9203often require use of infinite precision arithmetic, which is provided
9204by the +IntInf+ structure in Standard ML, but is provided in an ad hoc
fashion in the +gdtoa+ library).
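
As a small example of why exact integer arithmetic is convenient here,
the decimal expansion of the reciprocal of a power of two can be
written out digit for digit with +IntInf+. The function name
+exactPow2Neg+ is invented for this sketch; it is not taken from any of
the published algorithms and is not a conversion routine by itself.

[source,sml]
----
(* Exact decimal expansion of 1 / 2^k for k >= 1.
   Since 1 / 2^k = 5^k / 10^k, the fractional digits are the digits of
   5^k, padded with leading zeros to k places. *)
fun exactPow2Neg (k : int) : string =
   let
      val digits = IntInf.toString (IntInf.pow (5, k))
   in
      "0." ^ StringCvt.padLeft #"0" k digits
   end
----

For instance, +exactPow2Neg 3+ evaluates to +"0.125"+. With fixed-width
integers the power of five would overflow for moderately large +k+,
which is where +IntInf+ is needed.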
9206
9207This project aims to develop a native implementation of the conversion
9208routines in Standard ML.
9209
9210Background:
9211--
9212* http://dl.acm.org/citation.cfm?doid=103162.103163[What every computer scientist should know about floating-point arithmetic]; David Goldberg
9213* http://dl.acm.org/citation.cfm?doid=93542.93559[How to print floating-point numbers accurately]; Guy L. Steele, Jr. and Jon L. White
9214* http://dl.acm.org/citation.cfm?doid=93542.93557[How to read floating point numbers accurately]; William D. Clinger
9215* http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz[Correctly Rounded Binary-Decimal and Decimal-Binary Conversions]; David Gay
9216* http://dl.acm.org/citation.cfm?doid=249069.231397[Printing floating-point numbers quickly and accurately]; Robert G. Burger and R. Kent Dybvig
9217* http://dl.acm.org/citation.cfm?doid=1806596.1806623[Printing floating-point numbers quickly and accurately with integers]; Florian Loitsch
9218--
9219
9220Recommended Skills: SML programming experience; algorithm design and implementation
9221
9222/////
9223Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9224/////
9225
9226=== Implement Source-level Debugging ===
9227
9228Debugging is a fact of programming life. Unfortunately, most SML
9229implementations (including MLton) provide little to no source-level
9230debugging support. This project aims to add basic to intermediate
9231source-level debugging support to the MLton compiler. MLton already
9232supports source-level profiling, which can be used to attribute bytes
9233allocated or time spent in source functions. It should be relatively
9234straightforward to leverage this source-level information into basic
9235source-level debugging support, with the ability to set/unset
9236breakpoints and step through declarations and functions. It may be
9237possible to also provide intermediate source-level debugging support,
9238with the ability to inspect in-scope variables of basic types (e.g.,
9239types compatible with MLton's foreign function interface).
9240
9241Background:
9242--
9243* http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works]
9244* http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types]
9245* http://dwarfstd.org/[DWARF Debugging Standard]
9246* http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format]
9247--
9248
9249Recommended Skills: SML programming experience; some compiler experience
9250
9251/////
9252Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9253/////
9254
9255=== Region Based Memory Management ===
9256
9257Region based memory management is an alternative automatic memory
9258management scheme to garbage collection. Regions can be inferred by
9259the compiler (e.g., Cyclone and MLKit) or provided to the programmer
9260through a library. Since many students do not have extensive
experience with compilers, we plan on adopting the latter approach.
9262Creating a viable region based memory solution requires the removal of
9263the GC and changes to the allocator. Additionally, write barriers
will be necessary to ensure that a reference between two ML objects is never
9265established if the left hand side of the assignment has a longer
9266lifetime than the right hand side. Students will need to come up with
9267an appropriate interface for creating, entering, and exiting regions
9268(examples include RTSJ scoped memory and SCJ scoped memory).
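
The write-barrier rule can be modelled in a few lines. In the sketch
below, every name (+scoped+, +cell+, +newCell+, +assign+,
+IllegalReference+) is hypothetical, and region lifetimes are
approximated by a nesting depth in which depth 0 is the outermost,
longest-lived region. A real barrier would live in the runtime rather
than in SML code.

[source,sml]
----
exception IllegalReference

(* A value tagged with the nesting depth of the region that holds it. *)
type 'a scoped = {depth : int, value : 'a}

(* A mutable cell that lives in the region at the given depth. *)
type 'a cell = {depth : int, contents : 'a scoped option ref}

fun newCell (depth : int) : 'a cell = {depth = depth, contents = ref NONE}

(* Write barrier: refuse to store a value into a cell that outlives it,
   that is, a cell whose region is nested less deeply than the value's. *)
fun assign (lhs : 'a cell, rhs : 'a scoped) : unit =
   if #depth lhs < #depth rhs
      then raise IllegalReference
   else #contents lhs := SOME rhs
----

Storing an outer value into an inner cell is allowed, because the value
outlives the cell; the reverse direction raises +IllegalReference+.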
9269
9270Background:
9271--
9272* Cyclone
9273* MLKit
9274* RTSJ + SCJ scopes
9275--
9276
9277Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience
9278
9279/////
9280Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9281/////
9282
9283=== Adding Real-Time Capabilities ===
9284
9285This project focuses on exposing real-time APIs from a real-time OS
9286kernel at the SML level. This will require mapping the current MLton
9287(or http://multimlton.cs.purdue.edu[MultiMLton]) threading framework
9288to real-time threads that the RTOS provides. This will include
9289associating priorities with MLton threads and building priority based
scheduling algorithms. Additionally, periodic, aperiodic, and
sporadic tasks should be supported. A real-time SML library will
9292need to be created to provide a forward facing interface for
9293programmers. Stretch goals include reworking the MLton +atomic+
9294statement and associated synchronization primitives built on top of
9295the MLton +atomic+ statement.
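
A forward-facing library might describe tasks roughly as follows. This
is a sketch under stated assumptions: the types +release+ and +task+,
the function +pickNext+, and the convention that a larger number means
a higher priority are all invented here, and nothing in it touches a
real RTOS or the MLton thread library.

[source,sml]
----
(* How a task is released for execution. *)
datatype release =
   Periodic of {periodMs : int}
 | Sporadic of {minInterarrivalMs : int}
 | Aperiodic

type task = {name : string, priority : int, release : release}

(* Fixed-priority selection: return the ready task with the highest
   priority, or NONE if nothing is ready.  Ties keep the earlier task. *)
fun pickNext (ready : task list) : task option =
   List.foldl
      (fn (t : task, NONE) => SOME t
        | (t : task, SOME (best : task)) =>
             if #priority t > #priority best then SOME t else SOME best)
      NONE
      ready
----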
9296
9297Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required
9298
9299/////
9300Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9301/////
9302
9303=== Real-Time Garbage Collection ===
9304
9305This project focuses on modifications to the MLton GC to support
9306real-time garbage collection. We will model the real-time GC on the
Schism RTGC. The first task will be to create a fixed-size runtime
object representation. Large structures will need to be represented
as linked lists of fixed-size objects. Arrays and vectors will be
transformed into dense trees. Compaction and copying can therefore be
9311removed from the GC algorithms that MLton currently supports. Lastly,
9312the GC will be made concurrent, allowing for the execution of the GC
9313threads as the lowest priority task in the system. Stretch goals
9314include a priority aware mechanism for the GC to signal to real-time
9315ML threads that it needs to scan their stack and identification of
9316places where the stack is shallow to bound priority inversion during
9317this procedure.
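
The fixed-size representation can be pictured with a two-level model in
SML. The sketch is a simplification made for illustration: the names
+fragmentSize+, +fragment+, and +sub+ and the fragment size of 4 are
assumptions, the tree has only two levels, and the real work would be
done on object layouts in the C runtime.

[source,sml]
----
(* Number of elements per fixed-size fragment (assumed for the example). *)
val fragmentSize = 4

(* Split a sequence into fragments of at most fragmentSize elements and
   collect the fragments in a spine. *)
fun fragment (xs : 'a list) : 'a vector vector =
   let
      fun go ([], acc) = List.rev acc
        | go (ys, acc) =
             let
                val n = Int.min (fragmentSize, List.length ys)
             in
                go (List.drop (ys, n),
                    Vector.fromList (List.take (ys, n)) :: acc)
             end
   in
      Vector.fromList (go (xs, []))
   end

(* Indexing goes through the spine first and then into one fragment. *)
fun sub (frags : 'a vector vector, i : int) : 'a =
   Vector.sub (Vector.sub (frags, i div fragmentSize), i mod fragmentSize)
----

Because every fragment has the same maximum size, freeing and reusing
fragments cannot leave gaps of awkward sizes, which is the property
that lets a collector of this kind avoid compaction.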
9318
9319Recommended Skills: C programming experience; garbage collector experience a plus but not required
9320
9321/////
9322Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek]
9323/////
9324
9325/////
9326=== Concurrent{nbsp}ML Improvements ===
9327
9328http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency
9329library based on synchronous message passing. MLton has a partial
9330implementation of the CML message-passing primitives, but its use in
9331real-world applications has been stymied by the lack of completeness
9332and thread-safe I/O libraries. This project would aim to flesh out
9333the CML implementation in MLton to be fully compatible with the
9334"official" version distributed as part of SML/NJ. Furthermore, time
9335permitting, runtime system support could be added to allow use of
9336modern OS features, such as asynchronous I/O, in the implementation of
9337CML's system interfaces.
9338
9339Background
9340--
9341* http://cml.cs.uchicago.edu/
9342* http://mlton.org/ConcurrentML
9343* http://mlton.org/ConcurrentMLImplementation
9344--
9345
9346Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience
9347
9348Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9349Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet]
9350/////
9351
9352/////
9353=== SML3d Development ===
9354
9355The SML3d Project is a collection of libraries to support 3D graphics
9356programming using Standard ML and the http://opengl.org/[OpenGL]
9357graphics API. It currently requires the MLton implementation of SML
9358and is supported on Linux, Mac OS X, and Microsoft Windows. There is
9359also support for http://www.khronos.org/opencl/[OpenCL]. This project
9360aims to continue development of the SML3d Project.
9361
9362Background
9363--
9364* http://sml3d.cs.uchicago.edu/
9365--
9366
9367Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy]
9368/////
9369
9370<<<
9371
9372:mlton-guide-page: HaMLet
9373[[HaMLet]]
9374HaMLet
9375======
9376
9377http://www.mpi-sws.org/~rossberg/hamlet/[HaMLet] is a
9378<:StandardMLImplementations:Standard ML implementation>. It is
intended as a reference implementation of
9380<:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and
9381not for serious practical work.
9382
9383<<<
9384
9385:mlton-guide-page: HenryCejtin
9386[[HenryCejtin]]
9387HenryCejtin
9388===========
9389
9390I was one of the original developers of Mathematica (actually employee #1).
9391My background is a combination of mathematics and computer science.
9392Currently I am doing various things in Chicago.
9393
9394<<<
9395
9396:mlton-guide-page: History
9397[[History]]
9398History
9399=======
9400
9401In April 1997, Stephen Weeks wrote a defunctorizer for Standard ML and
9402integrated it with SML/NJ. The defunctorizer used SML/NJ's visible
9403compiler and operated on the `Ast` intermediate representation
9404produced by the SML/NJ front end. Experiments showed that
9405defunctorization gave a speedup of up to six times over separate
9406compilation and up to two times over batch compilation without functor
9407expansion.
9408
9409In August 1997, we began development of an independent compiler for
9410SML. At the time the compiler was called `smlc`. By October, we had
9411a working monomorphiser. By November, we added a polyvariant
9412higher-order control-flow analysis. At that point, MLton was about
941310,000 lines of code.
9414
Over the next year and a half, `smlc` morphed into a full-fledged
9416compiler for SML. It was renamed MLton, and first released in March
94171999.
9418
9419From the start, MLton has been driven by whole-program optimization
9420and an emphasis on performance. Also from the start, MLton has had a
9421fast C FFI and `IntInf` based on the GNU multiprecision library. At
9422its first release, MLton was 48,006 lines.
9423
Between March 1999 and January 2002, MLton grew to 102,541 lines,
9425as we added a native code generator, mllex, mlyacc, a profiler, many
9426optimizations, and many libraries including threads and signal
9427handling.
9428
9429During 2002, MLton grew to 112,204 lines and we had releases in April
9430and September. We added support for cross compilation and used this
9431to enable MLton to run on Cygwin/Windows and FreeBSD. We also made
9432improvements to the garbage collector, so that it now works with large
9433arrays and up to 4G of memory and so that it automatically uses
9434copying, mark-compact, or generational collection depending on heap
9435usage and RAM size. We also continued improvements to the optimizer
9436and libraries.
9437
9438During 2003, MLton grew to 122,299 lines and we had releases in March
9439and July. We extended the profiler to support source-level profiling
9440of time and allocation and to display call graphs. We completed the
9441Basis Library implementation, and added new MLton-specific libraries
9442for weak pointers and finalization. We extended the FFI to allow
9443callbacks from C to SML. We added support for the Sparc/Solaris
9444platform, and made many improvements to the C code generator.
9445
9446<<<
9447
9448:mlton-guide-page: HowProfilingWorks
9449[[HowProfilingWorks]]
9450HowProfilingWorks
9451=================
9452
9453Here's how <:Profiling:> works. If profiling is on, the front end
9454(elaborator) inserts `Enter` and `Leave` statements into the source
9455program for function entry and exit. For example,
9456[source,sml]
9457----
9458fun f n = if n = 0 then 0 else 1 + f (n - 1)
9459----
9460becomes
9461[source,sml]
9462----
9463fun f n =
9464 let
9465 val () = Enter "f"
9466 val res = (if n = 0 then 0 else 1 + f (n - 1))
9467 handle e => (Leave "f"; raise e)
9468 val () = Leave "f"
9469 in
9470 res
9471 end
9472----
9473
9474Actually there is a bit more information than just the source function
9475name; there is also lexical nesting and file position.
9476
9477Most of the middle of the compiler ignores, but preserves, `Enter` and
9478`Leave`. However, so that profiling preserves tail calls, the
9479<:Shrink:SSA shrinker> has an optimization that notices when the only
9480operations that cause a call to be a nontail call are profiling
9481operations, and if so, moves them before the call, turning it into a
tail call. If you observe a program that has a tail call that appears
to be turned into a nontail call when compiled with profiling, please
<:Bug:report a bug>.
9485
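The tail-call point can be illustrated at the source level. The
following is only a sketch: `Enter` and `Leave` are compiler-internal
statements, so the functions `Enter` and `Leave` below, the `depth`
and `maxDepth` counters, and both `count` functions are hypothetical
stand-ins, not what MLton generates. With `Leave` after the recursive
call, the call is a nontail call and open profiling entries pile up;
with `Leave` moved before the call, each entry is closed before the
next one opens.

[source,sml]
----
(* Hypothetical stand-ins for the compiler-internal Enter/Leave statements:
 * they track how many profiled functions are currently open. *)
val depth = ref 0
val maxDepth = ref 0
fun Enter (_ : string) =
   (depth := !depth + 1
    ; maxDepth := Int.max (!maxDepth, !depth))
fun Leave (_ : string) = depth := !depth - 1

(* Leave after the call: the recursive call is a nontail call. *)
fun countNaive n =
   let
      val () = Enter "count"
      val res = if n = 0 then 0 else countNaive (n - 1)
      val () = Leave "count"
   in
      res
   end

(* Leave moved before the call: the recursive call is a tail call again. *)
fun countMoved n =
   (Enter "count"
    ; if n = 0
         then (Leave "count"; 0)
      else (Leave "count"; countMoved (n - 1)))

val _ = countNaive 5
val naiveMax = !maxDepth        (* six entries were open at once *)
val () = (depth := 0; maxDepth := 0)
val _ = countMoved 5
val movedMax = !maxDepth        (* only one entry is ever open *)
----
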
9486There is the `checkProf` function in
9487<!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>, which checks that
9488the `Enter`/`Leave` statements match up.
9489
9490In the backend, just before translating to the <:Machine: Machine IL>,
9491the profiler uses the `Enter`/`Leave` statements to infer the "local"
9492portion of the control stack at each program point. The profiler then
9493removes the ++Enter++s/++Leave++s and inserts different information
9494depending on which kind of profiling is happening. For time profiling
9495(with the <:AMD64Codegen:> and <:X86Codegen:>), the profiler inserts labels that cover the
9496code (i.e. each statement has a unique label in its basic block that
9497prefixes it) and associates each label with the local control stack.
9498For time profiling (with the <:CCodegen:> and <:LLVMCodegen:>), the profiler
9499inserts code that sets a global field that records the local control
9500stack. For allocation profiling, the profiler inserts calls to a C
9501function that will maintain byte counts. With stack profiling, the
9502profiler also inserts a call to a C function at each nontail call in
9503order to maintain information at runtime about what SML functions are
9504on the stack.
9505
9506At run time, the profiler associates counters (either clock ticks or
9507byte counts) with source functions. When the program finishes, the
9508profiler writes the counts out to the `mlmon.out` file. Then,
9509`mlprof` uses source information stored in the executable to
9510associate the counts in the `mlmon.out` file with source
9511functions.
9512
9513For time profiling, the profiler catches the `SIGPROF` signal 100
9514times per second and increments the appropriate counter, determined by
9515looking at the label prefixing the current program counter and mapping
9516that to the current source function.
9517
9518== Caveats ==
9519
9520There may be a few missed clock ticks or bytes allocated at the very
9521end of the program after the data is written.
9522
9523Profiling has not been tested with signals or threads. In particular,
9524stack profiling may behave strangely.
9525
9526<<<
9527
9528:mlton-guide-page: Identifier
9529[[Identifier]]
9530Identifier
9531==========
9532
9533In <:StandardML:Standard ML>, there are syntactically two kinds of
9534identifiers.
9535
9536* Alphanumeric: starts with a letter or prime (`'`) and is followed by letters, digits, primes and underbars (`_`).
9537+
9538Examples: `abc`, `ABC123`, `Abc_123`, `'a`.
9539
9540* Symbolic: a sequence of the following
9541+
9542----
 ! % & $ # + - / : < = > ? @ \ ~ ` ^ | *
9544----
9545+
9546Examples: `+=`, `<=`, `>>`, `$`.
9547
9548With the exception of `=`, reserved words can not be identifiers.
9549
9550There are a number of different classes of identifiers, some of which
9551have additional syntactic rules.
9552
9553* Identifiers not starting with a prime.
9554** value identifier (includes variables and constructors)
9555** type constructor
9556** structure identifier
9557** signature identifier
9558** functor identifier
9559* Identifiers starting with a prime.
9560** type variable
9561* Identifiers not starting with a prime and numeric labels (`1`, `2`, ...).
9562** record label
9563
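As a rough illustration of these classes, the declarations below use
each kind of identifier once. All names (`+=`, `box`, `BOX`, `MkBox`,
`Box`, and so on) are invented for this sketch.

[source,sml]
----
(* Symbolic value identifier, given infix status. *)
infix 3 +=
fun r += n = r := !r + n

(* Type variable 'a, type constructor box, record labels contents and tag. *)
type 'a box = {contents : 'a, tag : string}

(* Signature identifier BOX, functor identifier MkBox, structure identifier Box. *)
signature BOX =
   sig
      val make : 'a -> 'a box
   end
functor MkBox (val tag : string) : BOX =
   struct
      fun make x = {contents = x, tag = tag}
   end
structure Box = MkBox (val tag = "demo")

(* Numeric record labels: a record with labels 1 and 2 is a pair. *)
val pair = {1 = "abc", 2 = 123}

val counter = ref 0
val () = counter += 5
val b = Box.make 1.5
----
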
9564<<<
9565
9566:mlton-guide-page: Immutable
9567[[Immutable]]
9568Immutable
9569=========
9570
9571Immutable means not <:Mutable:mutable> and is an adjective meaning
9572"can not be modified". Most values in <:StandardML:Standard ML> are
9573immutable. For example, constants, tuples, records, lists, and
9574vectors are all immutable.
9575
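A small sketch of what this means in practice; the names `v`, `v'`,
and `r` are chosen only for illustration. An operation such as
`Vector.update` from the Basis Library returns a new vector and leaves
its argument unchanged, whereas a `ref` cell is mutable and is updated
in place.

[source,sml]
----
val v = Vector.fromList [1, 2, 3]
val v' = Vector.update (v, 0, 42)   (* a new vector; v is unchanged *)

val r = ref 1
val () = r := 2                     (* the cell r itself now holds 2 *)
----
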
9576<<<
9577
9578:mlton-guide-page: ImperativeTypeVariable
9579[[ImperativeTypeVariable]]
9580ImperativeTypeVariable
9581======================
9582
9583In <:StandardML:Standard ML>, an imperative type variable is a type
9584variable whose second character is a digit, as in `'1a` or
9585`'2b`. Imperative type variables were used as an alternative to
9586the <:ValueRestriction:> in an earlier version of SML, but no longer play
9587a role. They are treated exactly as other type variables.
9588
9589<<<
9590
9591:mlton-guide-page: ImplementExceptions
9592[[ImplementExceptions]]
9593ImplementExceptions
9594===================
9595
9596<:ImplementExceptions:> is a pass for the <:SXML:>
9597<:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
9598
9599== Description ==
9600
9601This pass implements exceptions.
9602
9603== Implementation ==
9604
9605* <!ViewGitFile(mlton,master,mlton/xml/implement-exceptions.fun)>
9606
9607== Details and Notes ==
9608
9609{empty}
9610
9611<<<
9612
9613:mlton-guide-page: ImplementHandlers
9614[[ImplementHandlers]]
9615ImplementHandlers
9616=================
9617
9618<:ImplementHandlers:> is a pass for the <:RSSA:>
9619<:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
9620
9621== Description ==
9622
9623This pass implements the (threaded) exception handler stack.
9624
9625== Implementation ==
9626
9627* <!ViewGitFile(mlton,master,mlton/backend/implement-handlers.fun)>
9628
9629== Details and Notes ==
9630
9631{empty}
9632
9633<<<
9634
9635:mlton-guide-page: ImplementProfiling
9636[[ImplementProfiling]]
9637ImplementProfiling
9638==================
9639
9640<:ImplementProfiling:> is a pass for the <:RSSA:>
9641<:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
9642
9643== Description ==
9644
9645This pass implements profiling.
9646
9647== Implementation ==
9648
9649* <!ViewGitFile(mlton,master,mlton/backend/implement-profiling.fun)>
9650
9651== Details and Notes ==
9652
9653See <:HowProfilingWorks:>.
9654
9655<<<
9656
9657:mlton-guide-page: ImplementSuffix
9658[[ImplementSuffix]]
9659ImplementSuffix
9660===============
9661
9662<:ImplementSuffix:> is a pass for the <:SXML:>
9663<:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
9664
9665== Description ==
9666
9667This pass implements the `TopLevel_setSuffix` primitive, which
9668installs a function to exit the program.
9669
9670== Implementation ==
9671
9672* <!ViewGitFile(mlton,master,mlton/xml/implement-suffix.fun)>
9673
9674== Details and Notes ==
9675
9676<:ImplementSuffix:> works by introducing a new `ref` cell to contain
9677the function of type `unit -> unit` that should be called on program
9678exit.
9679
* The following code (appropriately alpha-converted) is prepended to the <:SXML:> program:
9681+
9682[source,sml]
9683----
9684val z_0 =
9685 fn a_0 =>
9686 let
9687 val x_0 =
9688 "toplevel suffix not installed"
9689 val x_1 =
9690 MLton_bug (x_0)
9691 in
9692 x_1
9693 end
9694val topLevelSuffixCell =
9695 Ref_ref (z_0)
9696----
9697
9698* Any occurrence of
9699+
9700[source,sml]
9701----
9702val x_0 =
9703 TopLevel_setSuffix (f_0)
9704----
9705+
9706is rewritten to
9707+
9708[source,sml]
9709----
9710val x_0 =
9711 Ref_assign (topLevelSuffixCell, f_0)
9712----
9713
9714* The following code (appropriately alpha-converted) is appended to the end of the <:SXML:> program:
9715+
9716[source,sml]
9717----
9718val f_0 =
9719 Ref_deref (topLevelSuffixCell)
9720val z_0 =
9721 ()
9722val x_0 =
9723 f_0 z_0
9724----
9725
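At the source level, the effect of the rewrite is roughly that of the
sketch below. It is an analogy only: `setSuffix`, `exited`, and the
use of `Fail` in place of `MLton_bug` are assumptions made for
illustration, and the real pass works on <:SXML:> primitives rather
than on source code.

[source,sml]
----
(* Placed before the program: a cell holding a default suffix that
 * reports the error. Fail stands in for MLton_bug here. *)
val topLevelSuffixCell : (unit -> unit) ref =
   ref (fn () => raise Fail "toplevel suffix not installed")

(* Each TopLevel_setSuffix (f) becomes an assignment to the cell. *)
fun setSuffix f = topLevelSuffixCell := f

(* The program proper installs its exit function. *)
val exited = ref false
val () = setSuffix (fn () => exited := true)

(* Placed after the program: fetch the installed function and call it. *)
val f = !topLevelSuffixCell
val () = f ()
----
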
9726<<<
9727
9728:mlton-guide-page: InfixingOperators
9729[[InfixingOperators]]
9730InfixingOperators
9731=================
9732
9733Fixity specifications are not part of signatures in
9734<:StandardML:Standard ML>. When one wants to use a module that
9735provides functions designed to be used as infix operators there are
9736several obvious alternatives:
9737
9738* Use only prefix applications. Unfortunately there are situations
9739where infix applications lead to considerably more readable code.
9740
9741* Make the fixity declarations at the top-level. This may lead to
9742collisions and may be unsustainable in a large project. Pollution of
9743the top-level should be avoided.
9744
9745* Make the fixity declarations at each scope where you want to use
9746infix applications. The duplication becomes inconvenient if the
9747operators are widely used. Duplication of code should be avoided.
9748
9749* Use non-standard extensions, such as the <:MLBasis: ML Basis system>
9750to control the scope of fixity declarations. This has the obvious
9751drawback of reduced portability.
9752
9753* Reuse existing infix operator symbols (`^`, `+`, `-`, ...). This
9754can be convenient when the standard operators aren't needed in the
9755same scope with the new operators. On the other hand, one is limited
9756to the standard operator symbols and the code may appear confusing.
9757
9758None of the obvious alternatives is best in every case. The following
9759describes a slightly less obvious alternative that can sometimes be
9760useful. The idea is to approximate Haskell's special syntax for
9761treating any identifier enclosed in grave accents (backquotes) as an
9762infix operator. In Haskell, instead of writing the prefix application
9763`f x y` one can write the infix application ++x &grave;f&grave; y++.
9764
9765
9766== Infixing operators ==
9767
9768Let's first take a look at the definitions of the operators:
9769
9770[source,sml]
9771----
9772infix 3 <\ fun x <\ f = fn y => f (x, y) (* Left section *)
9773infix 3 \> fun f \> y = f y (* Left application *)
9774infixr 3 /> fun f /> y = fn x => f (x, y) (* Right section *)
9775infixr 3 </ fun x </ f = f x (* Right application *)
9776
9777infix 2 o (* See motivation below *)
9778infix 0 :=
9779----
9780
9781The left and right sectioning operators, `<\` and `/>`, are useful in
9782SML for partial application of infix operators.
9783<!Cite(Paulson96, ML For the Working Programmer)> describes curried
9784functions `secl` and `secr` for the same purpose on pages 179-181.
9785For example,
9786
9787[source,sml]
9788----
9789List.map (op- /> y)
9790----
9791
is a function for subtracting `y` from each element of a list of integers and
9793
9794[source,sml]
9795----
9796List.exists (x <\ op=)
9797----
9798
9799is a function for testing whether a list contains an `x`.
9800
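Both fragments can be tried as written once the sectioning operators
are in scope. In the sketch below, `y`, `x`, `subtractY`, and
`containsX` are illustrative names.

[source,sml]
----
infix 3 <\ fun x <\ f = fn y => f (x, y)   (* Left section *)
infixr 3 /> fun f /> y = fn x => f (x, y)  (* Right section *)

val y = 10
val subtractY = List.map (op- /> y)
val differences = subtractY [11, 12, 13]   (* each element minus y *)

val x = 3
val containsX = List.exists (x <\ op=)
val found = containsX [1, 2, 3]
val missing = containsX [4, 5, 6]
----
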
9801Together with the left and right application operators, `\>` and `</`,
9802the sectioning operators provide a way to treat any binary function
9803(i.e. a function whose domain is a pair) as an infix operator. In
9804general,
9805
9806----
9807x0 <\f1\> x1 <\f2\> x2 ... <\fN\> xN = fN (... f2 (f1 (x0, x1), x2) ..., xN)
9808----
9809
9810and
9811
9812----
9813xN </fN/> ... x2 </f2/> x1 </f1/> x0 = fN (xN, ... f2 (x2, f1 (x1, x0)) ...)
9814----
9815
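To make the two equations concrete, here is a small sketch using a
non-commutative function, so that the grouping is visible in the
result. The helper `minus` is a hypothetical name introduced for the
example.

[source,sml]
----
infix 3 <\ fun x <\ f = fn y => f (x, y)   (* Left section *)
infix 3 \> fun f \> y = f y                (* Left application *)
infixr 3 /> fun f /> y = fn x => f (x, y)  (* Right section *)
infixr 3 </ fun x </ f = f x               (* Right application *)

fun minus (a, b) = a - b

(* Groups to the left: minus (minus (10, 3), 2) *)
val leftward = 10 <\minus\> 3 <\minus\> 2

(* Groups to the right: minus (10, minus (3, 2)) *)
val rightward = 10 </minus/> 3 </minus/> 2
----
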
9816
9817=== Examples ===
9818
9819As a fairly realistic example, consider providing a function for sequencing
9820comparisons:
9821
9822[source,sml]
9823----
9824structure Order (* ... *) =
9825 struct
9826 (* ... *)
9827 val orWhenEq = fn (EQUAL, th) => th ()
9828 | (other, _) => other
9829 (* ... *)
9830 end
9831----
9832Using `orWhenEq` and the infixing operators, one can write a
9833`compare` function for triples as
9834
9835[source,sml]
9836----
9837fun compare (fad, fbe, fcf) ((a, b, c), (d, e, f)) =
9838 fad (a, d) <\Order.orWhenEq\> `fbe (b, e) <\Order.orWhenEq\> `fcf (c, f)
9839----
9840
9841where +&grave;+ is defined as
9842
9843[source,sml]
9844----
9845fun `f x = fn () => f x
9846----
9847
9848Although `orWhenEq` can be convenient (try rewriting the above without
9849it), it is probably not useful enough to be defined at the top level
9850as an infix operator. Fortunately we can use the infixing operators
9851and don't have to.
9852
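Put together, the pieces above form a complete program. The sketch
below repeats the definitions from this page and adds a hypothetical
use, `compareTriple`, which instantiates `compare` with `Int.compare`
and `String.compare` from the Basis Library.

[source,sml]
----
infix 3 <\ fun x <\ f = fn y => f (x, y)   (* Left section *)
infix 3 \> fun f \> y = f y                (* Left application *)

structure Order =
   struct
      val orWhenEq = fn (EQUAL, th) => th ()
                      | (other, _) => other
   end

(* Delays the application f x until the thunk is called. *)
fun `f x = fn () => f x

fun compare (fad, fbe, fcf) ((a, b, c), (d, e, f)) =
   fad (a, d) <\Order.orWhenEq\> `fbe (b, e) <\Order.orWhenEq\> `fcf (c, f)

(* Hypothetical instance for triples of type int * string * int. *)
val compareTriple = compare (Int.compare, String.compare, Int.compare)

val decidedByThird = compareTriple ((1, "a", 2), (1, "a", 3))
val decidedBySecond = compareTriple ((1, "b", 0), (1, "a", 9))
val allEqual = compareTriple ((7, "x", 7), (7, "x", 7))
----
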
9853Another fairly realistic example would be to use the infixing operators with
9854the technique described on the <:Printf:> page. Assuming that you would have
9855a `Printf` module binding `printf`, +&grave;+, and formatting combinators
9856named `int` and `string`, you could write
9857
9858[source,sml]
9859----
9860let open Printf in
9861 printf (`"Here's an int "<\int\>" and a string "<\string\>".") 13 "foo" end
9862----
9863
9864without having to duplicate the fixity declarations. Alternatively, you could
9865write
9866
9867[source,sml]
9868----
9869P.printf (P.`"Here's an int "<\P.int\>" and a string "<\P.string\>".") 13 "foo"
9870----
9871
assuming you have made the binding
9873
9874[source,sml]
9875----
9876structure P = Printf
9877----
9878
9879
9880== Application and piping operators ==
9881
9882The left and right application operators may also provide some notational
9883convenience on their own. In general,
9884
9885----
9886f \> x1 \> ... \> xN = f x1 ... xN
9887----
9888
9889and
9890
9891----
9892xN </ ... </ x1 </ f = f x1 ... xN
9893----
9894
9895If nothing else, both of them can eliminate parentheses. For example,
9896
9897[source,sml]
9898----
9899foo (1 + 2) = foo \> 1 + 2
9900----
9901
9902The left and right application operators are related to operators
9903that could be described as the right and left piping operators:
9904
9905[source,sml]
9906----
9907infix 1 >| val op>| = op</ (* Left pipe *)
9908infixr 1 |< val op|< = op\> (* Right pipe *)
9909----
9910
9911As you can see, the left and right piping operators, `>|` and `|<`,
9912are the same as the right and left application operators,
9913respectively, except the associativities are reversed and the binding
9914strength is lower. They are useful for piping data through a sequence
9915of operations. In general,
9916
9917----
9918x >| f1 >| ... >| fN = fN (... (f1 x) ...) = (fN o ... o f1) x
9919----
9920
9921and
9922
9923----
9924fN |< ... |< f1 |< x = fN (... (f1 x) ...) = (fN o ... o f1) x
9925----
9926
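A short sketch of both piping operators in use; the values `total`
and `digits` are illustrative only.

[source,sml]
----
infix 3 \> fun f \> y = f y         (* Left application *)
infixr 3 </ fun x </ f = f x        (* Right application *)
infix 1 >| val op>| = op</          (* Left pipe *)
infixr 1 |< val op|< = op\>         (* Right pipe *)

(* Data flows left to right through the functions. *)
val total =
   [1, 2, 3] >| List.map (fn n => n * 2) >| List.foldl op+ 0

(* Data flows right to left, as with function composition. *)
val digits = Int.toString |< String.size |< "hello"
----
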
9927The right piping operator, `|<`, is provided by the Haskell prelude as
`$`. It can be convenient in continuation-passing style (CPS).
9929
9930A use for the left piping operator is with parsing combinators. In a
9931strict language, like SML, eta-reduction is generally unsafe. Using
9932the left piping operator, parsing functions can be formatted
9933conveniently as
9934
9935[source,sml]
9936----
9937fun parsingFunc input =
9938 input >| (* ... *)
9939 || (* ... *)
9940 || (* ... *)
9941----
9942
9943where `||` is supposed to be a combinator provided by the parsing combinator
9944library.
9945
9946
9947== About precedences ==
9948
9949You probably noticed that we redefined the
9950<:OperatorPrecedence:precedences> of the function composition operator
9951`o` and the assignment operator `:=`. Doing so is not strictly
9952necessary, but can be convenient and should be relatively
9953safe. Consider the following motivating examples from
9954<:WesleyTerpstra: Wesley W. Terpstra> relying on the redefined
9955precedences:
9956
9957[source,sml]
9958----
9959Word8.fromInt o Char.ord o s <\String.sub
9960(* Combining sectioning and composition *)
9961
9962x := s <\String.sub\> i
9963(* Assigning the result of an infixed application *)
9964----
9965
9966In imperative languages, assignment usually has the lowest precedence
9967(ignoring statement separators). The precedence of `:=` in the
9968<:BasisLibrary: Basis Library> is perhaps unnecessarily high, because
9969an expression of the form `r := x` always returns a unit, which makes
9970little sense to combine with anything. Dropping `:=` to the lowest
9971precedence level makes it behave more like in other imperative
9972languages.
9973
9974The case for `o` is different. With the exception of `before` and
9975`:=`, it doesn't seem to make much sense to use `o` with any of the
9976operators defined by the <:BasisLibrary: Basis Library> in an
9977unparenthesized expression. This is simply because none of the other
9978operators deal with functions. It would seem that the precedence of
9979`o` could be chosen completely arbitrarily from the set `{1, ..., 9}`
9980without having any adverse effects with respect to other infix
9981operators defined by the <:BasisLibrary: Basis Library>.
9982
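The two motivating examples can be checked with concrete values. In
this sketch, `s`, `byteAt`, `firstByte`, and `x` are illustrative
names, and the fixity declarations for `o` and `:=` are the redefined
ones from this page.

[source,sml]
----
infix 3 <\ fun x <\ f = fn y => f (x, y)   (* Left section *)
infix 3 \> fun f \> y = f y                (* Left application *)
infix 2 o
infix 0 :=

val s = "MLton"

(* Parsed as Word8.fromInt o Char.ord o (s <\String.sub) *)
val byteAt = Word8.fromInt o Char.ord o s <\String.sub
val firstByte = byteAt 0

(* Parsed as x := (s <\String.sub\> 1) *)
val x = ref #"?"
val () = x := s <\String.sub\> 1
----
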
9983
9984== Design of the symbols ==
9985
9986The closest approximation of Haskell's ++x &grave;f&grave; y++ syntax
9987achievable in Standard ML would probably be something like
9988++x &grave;f^ y++, but `^` is already used for string
9989concatenation by the <:BasisLibrary: Basis Library>. Other
9990combinations of the characters +&grave;+ and `^` would be
9991possible, but none seems clearly the best visually. The symbols `<\`,
9992`\>`, `</`, and `/>` are reasonably concise and have a certain
9993self-documenting appearance and symmetry, which can help to remember
9994them. As the names suggest, the symbols of the piping operators `>|`
9995and `|<` are inspired by Unix shell pipelines.
9996
9997
9998== Also see ==
9999
10000 * <:Utilities:>
10001
10002<<<
10003
10004:mlton-guide-page: Inline
10005[[Inline]]
10006Inline
10007======
10008
10009<:Inline:> is an optimization pass for the <:SSA:>
10010<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10011
10012== Description ==
10013
10014This pass inlines <:SSA:> functions using a size-based metric.
10015
10016== Implementation ==
10017
10018* <!ViewGitFile(mlton,master,mlton/ssa/inline.sig)>
10019* <!ViewGitFile(mlton,master,mlton/ssa/inline.fun)>
10020
10021== Details and Notes ==
10022
10023The <:Inline:> pass can be invoked to use one of three metrics:
10024
10025* `NonRecursive(product, small)` -- inline any function satisfying `(numCalls - 1) * (size - small) <= product`, where `numCalls` is the static number of calls to the function and `size` is the size of the function.
10026* `Leaf(size)` -- inline any leaf function smaller than `size`
10027* `LeafNoLoop(size)` -- inline any leaf function without loops smaller than `size`
10028
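As a worked reading of the `NonRecursive` condition, the sketch below
evaluates the inequality for a few call counts. The function
`inlineNonRecursive` and the parameter values (`product = 320`,
`small = 60`) are hypothetical and chosen only to make the arithmetic
easy to follow; they are not taken from MLton's source or defaults.

[source,sml]
----
(* Hypothetical helper: the NonRecursive inequality, written out. *)
fun inlineNonRecursive {product, small} {numCalls, size} =
   (numCalls - 1) * (size - small) <= product

val metric = {product = 320, small = 60}

(* One call site: the left-hand side is 0, so the function qualifies. *)
val once = inlineNonRecursive metric {numCalls = 1, size = 1000}
(* Five call sites of a size-100 function: 4 * 40 = 160 <= 320. *)
val five = inlineNonRecursive metric {numCalls = 5, size = 100}
(* Ten call sites of the same function: 9 * 40 = 360 > 320. *)
val ten = inlineNonRecursive metric {numCalls = 10, size = 100}
----
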
10029<<<
10030
10031:mlton-guide-page: InsertLimitChecks
10032[[InsertLimitChecks]]
10033InsertLimitChecks
10034=================
10035
10036<:InsertLimitChecks:> is a pass for the <:RSSA:>
10037<:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
10038
10039== Description ==
10040
10041This pass inserts limit checks.
10042
10043== Implementation ==
10044
10045* <!ViewGitFile(mlton,master,mlton/backend/limit-check.fun)>
10046
10047== Details and Notes ==
10048
10049{empty}
10050
10051<<<
10052
10053:mlton-guide-page: InsertSignalChecks
10054[[InsertSignalChecks]]
10055InsertSignalChecks
10056==================
10057
10058<:InsertSignalChecks:> is a pass for the <:RSSA:>
10059<:IntermediateLanguage:>, invoked from <:RSSASimplify:>.
10060
10061== Description ==
10062
10063This pass inserts signal checks.
10064
10065== Implementation ==
10066
10067* <!ViewGitFile(mlton,master,mlton/backend/limit-check.fun)>
10068
10069== Details and Notes ==
10070
10071{empty}
10072
10073<<<
10074
10075:mlton-guide-page: Installation
10076[[Installation]]
10077Installation
10078============
10079
10080MLton runs on a variety of platforms and is distributed in both source and
10081binary form.
10082
10083A `.tgz` or `.tbz` binary package can be extracted at any location, yielding
10084`README.adoc` (this file), `CHANGELOG.adoc`, `LICENSE`, `Makefile`, `bin/`,
10085`lib/`, and `share/`. The compiler and tools can be executed in-place (e.g.,
10086`./bin/mlton`).
10087
10088A small set of `Makefile` variables can be used to customize the binary package
10089via `make update`:
10090
10091 * `CC`: Specify C compiler. Can be used for alternative tools (e.g.,
10092 `CC=clang` or `CC=gcc-7`).
10093 * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include
10094 and library paths, if not on default search paths. (If `WITH_GMP_DIR` is
10095 set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and
10096 `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.)
10097
10098For example:
10099
[source,shell]
10101----
10102$ make CC=clang WITH_GMP_DIR=/opt/gmp update
10103----
10104
10105On typical platforms, installing MLton (after optionally performing
10106`make update`) to `/usr/local` can be accomplished via:
10107
[source,shell]
10109----
10110$ make install
10111----
10112
10113A small set of `Makefile` variables can be used to customize the installation:
10114
10115 * `PREFIX`: Specify the installation prefix.
10116 * `CC`: Specify C compiler. Can be used for alternative tools (e.g.,
10117 `CC=clang` or `CC=gcc-7`).
10118 * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include
10119 and library paths, if not on default search paths. (If `WITH_GMP_DIR` is
10120 set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and
10121 `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.)
10122
10123For example:
10124
[source,shell]
10126----
10127$ make PREFIX=/opt/mlton install
10128----
10129
10130Installation of MLton creates the following files and directories.
10131
10132* ++__prefix__/bin/mllex++
10133+
10134The <:MLLex:> lexer generator.
10135
10136* ++__prefix__/bin/mlnlffigen++
10137+
10138The <:MLNLFFI:ML-NLFFI> tool.
10139
10140* ++__prefix__/bin/mlprof++
10141+
10142A <:Profiling:> tool.
10143
10144* ++__prefix__/bin/mlton++
10145+
10146A script to call the compiler. This script may be moved anywhere,
10147however, it makes use of files in ++__prefix__/lib/mlton++.
10148
10149* ++__prefix__/bin/mlyacc++
10150+
10151The <:MLYacc:> parser generator.
10152
10153* ++__prefix__/lib/mlton++
10154+
10155Directory containing libraries and include files needed during compilation.
10156
10157* ++__prefix__/share/man/man1/{mllex,mlnlffigen,mlprof,mlton,mlyacc}.1++
10158+
10159Man pages.
10160
10161* ++__prefix__/share/doc/mlton++
10162+
10163Directory containing the user guide for MLton, mllex, and mlyacc, as
10164well as example SML programs (in the `examples` directory), and license
10165information.
10166
10167
10168== Hello, World! ==
10169
10170Once you have installed MLton, create a file called `hello-world.sml`
10171with the following contents.
10172
10173----
10174print "Hello, world!\n";
10175----
10176
10177Now create an executable, `hello-world`, with the following command.
10178----
10179mlton hello-world.sml
10180----
10181
10182You can now run `hello-world` to verify that it works. There are more
10183small examples in ++__prefix__/share/doc/mlton/examples++.
10184
10185
10186== Installation on Cygwin ==
10187
10188When installing the Cygwin `tgz`, you should use Cygwin's `bash` and
10189`tar`. The use of an archiving tool that is not aware of Cygwin's
10190mounts will put the files in the wrong place.
10191
10192<<<
10193
10194:mlton-guide-page: IntermediateLanguage
10195[[IntermediateLanguage]]
10196IntermediateLanguage
10197====================
10198
MLton uses a number of intermediate languages in translating from the input source program to low-level code. Here they are, in the order in which the program is translated into them.
10200
10201 * <:AST:>. Pretty close to the source.
10202 * <:CoreML:>. Explicitly typed, no module constructs.
10203 * <:XML:>. Polymorphic, <:HigherOrder:>.
10204 * <:SXML:>. SimplyTyped, <:HigherOrder:>.
10205 * <:SSA:>. SimplyTyped, <:FirstOrder:>.
10206 * <:SSA2:>. SimplyTyped, <:FirstOrder:>.
10207 * <:RSSA:>. Explicit data representations.
10208 * <:Machine:>. Untyped register transfer language.
10209
10210<<<
10211
10212:mlton-guide-page: IntroduceLoops
10213[[IntroduceLoops]]
10214IntroduceLoops
10215==============
10216
10217<:IntroduceLoops:> is an optimization pass for the <:SSA:>
10218<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10219
10220== Description ==
10221
10222This pass rewrites any <:SSA:> function that calls itself in tail
10223position into one with a local loop and no self tail calls.
10224
10225A <:SSA:> function like
10226----
10227fun F (arg_0, arg_1) = L_0 ()
10228 ...
10229 L_16 (x_0)
10230 ...
10231 F (z_0, z_1) Tail
10232 ...
10233----
10234becomes
10235----
10236fun F (arg_0', arg_1') = loopS_0 ()
10237 loopS_0 ()
10238 loop_0 (arg_0', arg_1')
10239 loop_0 (arg_0, arg_1)
10240 L_0 ()
10241 ...
10242 L_16 (x_0)
10243 ...
10244 loop_0 (z_0, z_1)
10245 ...
10246----
10247
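A loose source-level analogy may help, with the caveat that the pass
operates on <:SSA:> functions and not on SML source. The functions
`sumTo` and `sumToLoop` are hypothetical: the first calls itself in
tail position, and the second has the shape the pass produces, an
entry that jumps to a local loop.

[source,sml]
----
(* Before: the function calls itself in tail position. *)
fun sumTo (n, acc) =
   if n = 0 then acc else sumTo (n - 1, acc + n)

(* After, by analogy: the self tail call becomes a jump to a local loop. *)
fun sumToLoop (n', acc') =
   let
      fun loop (n, acc) =
         if n = 0 then acc else loop (n - 1, acc + n)
   in
      loop (n', acc')
   end
----
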
10248== Implementation ==
10249
10250* <!ViewGitFile(mlton,master,mlton/ssa/introduce-loops.fun)>
10251
10252== Details and Notes ==
10253
10254{empty}
10255
10256<<<
10257
10258:mlton-guide-page: JesperLouisAndersen
10259[[JesperLouisAndersen]]
10260JesperLouisAndersen
10261===================
10262
Jesper Louis Andersen is an undergraduate student at DIKU, the Department of Computer Science at the University of Copenhagen. His contributions to MLton are few, though he did port MLton to the NetBSD and OpenBSD platforms.
10264
His general interests in computer science are compiler theory, language theory, algorithms and data structures, and programming. His assets are his general knowledge of UNIX systems, system administration, and operating system kernels, NetBSD in particular.
10266
10267He was employed by the university as a system administrator for 2 years, which has set him back somewhat in his studies. Currently he is trying to learn mathematics (real analysis, general topology, complex functional analysis and algebra).
10268
10269
10270== Projects using MLton ==
10271
10272=== A register allocator ===
For internal use at a compiler course at DIKU. It is written in the literate programming style and implements the _Iterated Register Coalescing_ algorithm by Lal George and Andrew Appel http://citeseer.ist.psu.edu/george96iterated.html. The project is unfinished. Most of the basic parts of the algorithm are done, but the interface to the students' (simple) datatype takes some conversion.
10274
10275=== A configuration management system in SML ===
At this time, only loose plans exist for this. The plan is to build a configuration management system on the principles of the OpenCM system, see http://www.opencm.org/docs.html. The basic idea is to unify "naming" and "identity" into one by uniquely identifying all objects managed in the repository by the use of cryptographic checksums. This mantra guides the rest of the system, providing integrity, accessibility and confidentiality.
10277
10278<<<
10279
10280:mlton-guide-page: JohnnyAndersen
10281[[JohnnyAndersen]]
10282JohnnyAndersen
10283==============
10284
10285Johnny Andersen (aka Anoq of the Sun)
10286
10287Here is a picture in front of the academy building
10288at the University of Athens, Greece, taken in September 2003.
10289
10290image::JohnnyAndersen.attachments/anoq.jpg[align="center"]
10291
10292<<<
10293
10294:mlton-guide-page: KnownCase
10295[[KnownCase]]
10296KnownCase
10297=========
10298
10299<:KnownCase:> is an optimization pass for the <:SSA:>
10300<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10301
10302== Description ==
10303
10304This pass duplicates and simplifies `Case` transfers when the
10305constructor of the scrutinee is known.
10306
10307Uses <:Restore:>.
10308
10309For example, the program
10310[source,sml]
10311----
10312val rec last =
10313 fn [] => 0
10314 | [x] => x
10315 | _ :: l => last l
10316
10317val _ = 1 + last [2, 3, 4, 5, 6, 7]
10318----
10319
10320gives rise to the <:SSA:> function
10321
10322----
10323fun last_0 (x_142) = loopS_1 ()
10324 loopS_1 ()
10325 loop_11 (x_142)
10326 loop_11 (x_143)
10327 case x_143 of
10328 nil_1 => L_73 | ::_0 => L_74
10329 L_73 ()
10330 return global_5
10331 L_74 (x_145, x_144)
10332 case x_145 of
10333 nil_1 => L_75 | _ => L_76
10334 L_75 ()
10335 return x_144
10336 L_76 ()
10337 loop_11 (x_145)
10338----
10339
10340which is simplified to
10341
10342----
10343fun last_0 (x_142) = loopS_1 ()
10344 loopS_1 ()
10345 case x_142 of
10346 nil_1 => L_73 | ::_0 => L_118
10347 L_73 ()
10348 return global_5
10349 L_118 (x_230, x_229)
10350 L_74 (x_230, x_229, x_142)
10351 L_74 (x_145, x_144, x_232)
10352 case x_145 of
10353 nil_1 => L_75 | ::_0 => L_114
10354 L_75 ()
10355 return x_144
10356 L_114 (x_227, x_226)
10357 L_74 (x_227, x_226, x_145)
10358----
10359
10360== Implementation ==
10361
10362* <!ViewGitFile(mlton,master,mlton/ssa/known-case.fun)>
10363
10364== Details and Notes ==
10365
One interesting aspect of <:KnownCase:> is that it often has the
10367effect of unrolling list traversals by one iteration, moving the
10368`nil`/`::` check to the end of the loop, rather than the beginning.
10369
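Read back into SML, the simplified <:SSA:> function corresponds
roughly to the hand-written version below. The names `lastUnrolled`
and `loop` are hypothetical; the sketch only mirrors the control flow,
with the emptiness test done once up front and then once per iteration
on the tail.

[source,sml]
----
fun lastUnrolled xs =
   case xs of
      [] => 0                          (* L_73 *)
    | x :: l =>
         let
            (* Corresponds to L_74: the head is known, only the tail is tested. *)
            fun loop (x, l) =
               case l of
                  [] => x              (* L_75 *)
                | y :: l' => loop (y, l')
         in
            loop (x, l)
         end

val result = 1 + lastUnrolled [2, 3, 4, 5, 6, 7]
----
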
10370<<<
10371
10372:mlton-guide-page: LambdaCalculus
10373[[LambdaCalculus]]
10374LambdaCalculus
10375==============
10376
10377The http://en.wikipedia.org/wiki/Lambda_calculus[lambda calculus] is
10378the formal system underlying <:StandardML:Standard ML>.
10379
10380<<<
10381
10382:mlton-guide-page: LambdaFree
10383[[LambdaFree]]
10384LambdaFree
10385==========
10386
10387<:LambdaFree:> is an analysis pass for the <:SXML:>
10388<:IntermediateLanguage:>, invoked from <:ClosureConvert:>.
10389
10390== Description ==
10391
10392This pass descends the entire <:SXML:> program and attaches a property
10393to each `Lambda` `PrimExp.t` in the program. Then, you can use
10394`lambdaFree` and `lambdaRec` to get free variables of that `Lambda`.
10395
10396== Implementation ==
10397
10398* <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.sig)>
10399* <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.fun)>
10400
10401== Details and Notes ==
10402
10403For `Lambda`-s bound in a `Fun` dec, `lambdaFree` gives the union of
10404the frees of the entire group of mutually recursive functions. Hence,
10405`lambdaFree` for every `Lambda` in a single `Fun` dec is the same.
10406Furthermore, for a `Lambda` bound in a `Fun` dec, `lambdaRec` gives
10407the list of other functions bound in the same dec defining that
10408`Lambda`.
10409
10410For example:
10411----
10412val rec f = fn x => ... y ... g ... f ...
10413and g = fn z => ... f ... w ...
10414----
10415
10416----
10417lambdaFree(fn x =>) = [y, w]
10418lambdaFree(fn z =>) = [y, w]
10419lambdaRec(fn x =>) = [g, f]
10420lambdaRec(fn z =>) = [f]
10421----
10422
10423<<<
10424
10425:mlton-guide-page: LanguageChanges
10426[[LanguageChanges]]
10427LanguageChanges
10428===============
10429
10430We are sometimes asked to modify MLton to change the language it
10431compiles. In short, we are conservative about making such changes.
10432There are a number of reasons for this.
10433
10434* <:DefinitionOfStandardML:The Definition of Standard ML> is an
10435extremely high standard of specification. The value of the Definition
10436would be significantly diluted by changes that are not specified at an
10437equally high level, and the dilution increases with the complexity of
10438the language change and its interaction with other language features.
10439
10440* The SML community is small and there are a number of
10441<:StandardMLImplementations:SML implementations>. Without an
10442agreed-upon standard, it becomes very difficult to port programs
10443between compilers, and the community would be balkanized.
10444
10445* Our main goal is to enable programmers to be as effective as
10446possible with MLton/SML. There are a number of improvements other
10447than language changes that we could spend our time on that would
10448provide more benefit to programmers.
10449
10450* The more the language that MLton compiles changes over time, the
10451more difficult it is to use MLton as a stable platform for serious
10452program development.
10453
10454Despite these drawbacks, we have extended SML in a couple of cases.
10455
10456* <:ForeignFunctionInterface: Foreign function interface>
10457* <:MLBasis: ML Basis system>
10458* <:SuccessorML: Successor ML features>
10459
10460We allow these language extensions because they provide functionality
10461that is impossible to achieve without them or have non-trivial
10462community support. The Definition does not define a foreign function
10463interface. So, we must either extend the language or greatly restrict
10464the class of programs that can be written. Similarly, the Definition
10465does not provide a mechanism for namespace control at the module
10466level, making it impossible to deliver packaged libraries and have a
10467hope of users using them without name clashes. The ML Basis system
10468addresses this problem. We have also provided a formal specification
10469of the ML Basis system at the level of the Definition.
10470
10471== Also see ==
10472
10473* http://www.mlton.org/pipermail/mlton/2004-August/016165.html
10474* http://www.mlton.org/pipermail/mlton-user/2004-December/000320.html
10475
10476<<<
10477
10478:mlton-guide-page: Lazy
10479[[Lazy]]
10480Lazy
10481====
10482
10483In a lazy (or non-strict) language, the arguments to a function are
10484not evaluated before calling the function. Instead, the arguments are
10485suspended and only evaluated by the function if needed.
10486
10487<:StandardML:Standard ML> is an eager (or strict) language, not a lazy
10488language. However, it is easy to delay evaluation of an expression in
10489SML by creating a _thunk_, which is a nullary function. In SML, a
10490thunk is written `fn () => e`. Another essential feature of laziness
10491is _memoization_, meaning that once a suspended argument is evaluated,
10492subsequent references look up the value. We can express this in SML
10493with a function that maps a thunk to a memoized thunk.
10494
10495[source,sml]
10496----
10497signature LAZY =
10498 sig
10499 val lazy: (unit -> 'a) -> unit -> 'a
10500 end
10501----
10502
10503This is easy to implement in SML.
10504
10505[source,sml]
10506----
10507structure Lazy: LAZY =
10508 struct
10509 fun lazy (th: unit -> 'a): unit -> 'a =
10510 let
10511 datatype 'a lazy_result = Unevaluated of (unit -> 'a)
10512 | Evaluated of 'a
10513 | Failed of exn
10514
10515 val r = ref (Unevaluated th)
10516 in
10517 fn () =>
10518 case !r of
10519 Unevaluated th => let
10520 val a = th ()
10521 handle x => (r := Failed x; raise x)
10522 val () = r := Evaluated a
10523 in
10524 a
10525 end
10526 | Evaluated a => a
10527 | Failed x => raise x
10528 end
10529 end
10530----
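
A short usage sketch, assuming the `Lazy` structure above is in scope. The counters `evaluations` and `attempts` are hypothetical names added only to make the number of thunk evaluations observable.

[source,sml]
----
val evaluations = ref 0

(* A thunk with a visible side effect, so evaluations can be counted. *)
val answer: unit -> int =
   Lazy.lazy (fn () => (evaluations := !evaluations + 1; 6 * 7))

val a = answer ()   (* runs the underlying thunk *)
val b = answer ()   (* returns the cached value *)

val attempts = ref 0

(* A thunk that fails: the exception is cached as well. *)
val failing: unit -> int =
   Lazy.lazy (fn () => (attempts := !attempts + 1; raise Fail "boom"))

val failed1 = (failing (); false) handle Fail _ => true
val failed2 = (failing (); false) handle Fail _ => true
----

After these declarations, `!evaluations` and `!attempts` are both `1`: the second call of each memoized thunk reuses the stored result or re-raises the stored exception without running the underlying thunk again.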
10531
10532<<<
10533
10534:mlton-guide-page: Libraries
10535[[Libraries]]
10536Libraries
10537=========
10538
10539In theory every strictly conforming Standard ML program should run on
10540MLton. However, large SML projects often use implementation-specific
10541features, so some "porting" is required. Here is a partial list of
10542software that is known to run on MLton.
10543
10544* Utility libraries:
10545** <:SMLNJLibrary:> - distributed with MLton
10546** <:MLtonLibraryProject:> - various libraries located on the MLton subversion repository
10547** <!ViewGitDir(mlton,master,lib/mlton)> - the internal MLton utility library, which we hope to clean up and make more accessible someday
10548** http://github.com/seanmcl/sml-ext[sml-ext], a grab bag of libraries for MLton and other SML implementations (by Sean McLaughlin)
10549** http://tom7misc.cvs.sourceforge.net/tom7misc/sml-lib/[sml-lib], a grab bag of libraries for MLton and other SML implementations (by <:TomMurphy:>)
10550* Scanner generators:
10551** <:MLLPTLibrary:> - distributed with MLton
10552** <:MLLex:> - distributed with MLton
10553** <:MLULex:>
10554* Parser generators:
10555** <:MLAntlr:>
10556** <:MLLPTLibrary:> - distributed with MLton
10557** <:MLYacc:> - distributed with MLton
10558* Concurrency: <:ConcurrentML:> - distributed with MLton
10559* Graphics
10560** <:SML3d:>
10561** <:mGTK:>
10562* Misc. libraries:
10563** <:CKitLibrary:> - distributed with MLton
10564** <:MLRISCLibrary:> - distributed with MLton
10565** <:MLNLFFI:ML-NLFFI> - distributed with MLton
10566** <:Swerve:>, an HTTP server
10567** <:fxp:>, an XML parser
10568
10569== Ports in progress ==
10570
10571<:Contact:> us for details on any of these.
10572
10573* <:MLDoc:> http://people.cs.uchicago.edu/%7Ejhr/tools/ml-doc.html
10574* <:Unicode:>
10575
10576== More ==
10577
10578More projects using MLton can be seen on the <:Users:> page.
10579
10580== Software for SML implementations other than MLton ==
10581
10582* PostgreSQL
10583** Moscow ML: http://www.dina.kvl.dk/%7Esestoft/mosmllib/Postgres.html
10584** SML/NJ NLFFI: http://smlweb.sourceforge.net/smlsql/
10585* Web:
10586** ML Kit: http://www.smlserver.org[SMLserver] (a plugin for AOLserver)
10587** Moscow ML: http://ellemose.dina.kvl.dk/%7Esestoft/msp/index.msp[ML Server Pages] (support for PHP-style CGI scripting)
10588** SML/NJ: http://smlweb.sourceforge.net/[smlweb]
10589
10590<<<
10591
10592:mlton-guide-page: LibrarySupport
10593[[LibrarySupport]]
10594LibrarySupport
10595==============
10596
10597MLton supports both linking to and creating system-level libraries.
10598While Standard ML libraries should be designed with the <:MLBasis:> system to work with other Standard ML programs,
10599system-level library support allows MLton to create libraries for use by other programming languages.
10600Even more importantly, system-level library support allows MLton to access libraries from other languages.
10601This article will explain how to use libraries portably with MLton.
10602
10603== The Basics ==
10604
10605A Dynamic Shared Object (DSO) is a piece of executable code written in a format understood by the operating system.
10606Executable programs and dynamic libraries are the two most common examples of a DSO.
10607They are called shared because if they are used more than once, they are only loaded once into main memory.
10608For example, if you start two instances of your web browser (an executable), there may be two processes running, but the program code of the executable is only loaded once.
10609A dynamic library, for example a graphical toolkit, might be used by several different executable programs, each possibly running multiple times.
10610Nevertheless, the dynamic library is only loaded once and its program code is shared between all of the processes.
10611
10612In addition to program code, DSOs contain a table of textual strings called symbols.
10613These are used in order to make the DSO do something useful, like execute.
10614For example, on Linux the symbol `_start` refers to the point in the program code where the operating system should start executing the program.
10615Dynamic libraries generally provide many symbols, corresponding to functions which can be called and variables which can be read or written.
10616Symbols can be used by the DSO itself, or by other DSOs which require services.
10617
10618When a DSO creates a symbol, this is called 'exporting'.
10619If a DSO needs to use a symbol, this is called 'importing'.
10620A DSO might need to use symbols defined within itself or perhaps from another DSO.
10621In both cases, it is importing that symbol, but the scope of the import differs.
10622Similarly, a DSO might export a symbol for use only within itself, or it might export a symbol for use by other DSOs.
10623Some symbols are resolved at compile time by the linker (those used within the DSO) and some are resolved at runtime by the dynamic link loader (symbols accessed between DSOs).
10624
10625== Symbols in MLton ==
10626
10627Symbols in MLton are both imported and exported via the <:ForeignFunctionInterface:>.
10628The notation `_import "symbolname"` imports functions, `_symbol "symbolname"` imports variables, and `_address "symbolname"` imports an address.
10629To create and export a symbol, `_export "symbolname"` creates a function symbol and `_symbol "symbolname" 'alloc'` creates and exports a variable.
10630For details of the syntax and restrictions on the supported FFI types, read the <:ForeignFunctionInterface:> page.
10631In this discussion it only matters that every FFI use is either an import or an export.
10632
10633When exporting a symbol, MLton supports controlling the export scope.
10634If the symbol should only be used within the same DSO, that symbol has '`private`' scope.
10635Conversely, if the symbol should also be available to other DSOs the symbol has '`public`' scope.
10636Generally, one should have as few public exports as possible.
10637Since they are public, other DSOs will come to depend on them, limiting your ability to change them.
10638You specify the export scope in MLton by putting `private` or `public` after the symbol's name in an FFI directive.
10639For example: `_export "foo" private: int->int;` or `_export "bar" public: int->int;`.
10640
10641For technical reasons, the linker and loader on various platforms need to know the scope of a symbol being imported.
10642If the symbol is exported by the same DSO, use `public` or `private` as appropriate.
10643If the symbol is exported by a different DSO, then the scope '`external`' should be used to import it.
10644Within a DSO, all references to a symbol must use the same scope.
10645MLton will check this at compile time, reporting: `symbol "foo" redeclared as public (previously external)`. This may cause linker errors.
10646However, MLton can only check usage within Standard ML.
10647All objects being linked into a resulting DSO must agree, and it is the programmer's responsibility to ensure this.
10648
10649Summary of symbol scopes:
10650
10651* `private`: used for symbols exported within a DSO only for use within that DSO
10652* `public`: used for symbols exported within a DSO that may also be used outside that DSO
10653* `external`: used for importing symbols from another DSO
10654* All uses of a symbol within a DSO (both imports and exports) must agree on the symbol scope
10655
10656== Output Formats ==
10657
10658MLton can create executables (`-format executable`) and dynamic shared libraries (`-format library`).
10659To link a shared library, use `-link-opt -l<dso_name>`.
10660The default output format is executable.
10661
10662MLton can also create archives.
10663An archive is not a DSO, but it does have a collection of symbols.
10664When an archive is linked into a DSO, it is completely absorbed.
10665Other objects being compiled into the DSO should refer to the public symbols in the archive as public, since they are still in the same DSO.
10666However, in the interest of modular programming, private symbols in an archive cannot be used outside of that archive, even within the same DSO.
10667
10668Although both executables and libraries are DSOs, some implementation details differ on some platforms.
10669For this reason, MLton can create two types of archives.
10670A normal archive (`-format archive`) is appropriate for linking into an executable.
10671Conversely, a libarchive (`-format libarchive`) should be used if it will be linked into a dynamic library.
10672
10673When MLton does not create an executable, it creates two special symbols.
10674The symbol `libname_open` is a function which must be called before any other symbols are accessed.
10675The `libname` is controlled by the `-libname` compile option and defaults to the name of the output, with any leading `lib` stripped (e.g., `foo` -> `foo`, `libfoo` -> `foo`).
10676The symbol `libname_close` is a function which should be called to clean up memory once done.
10677
10678Summary of `-format` options:
10679
10680* `executable`: create an executable (a DSO)
10681* `library`: create a dynamic shared library (a DSO)
10682* `archive`: create an archive of symbols (not a DSO) that can be linked into an executable
10683* `libarchive`: create an archive of symbols (not a DSO) that can be linked into a library
10684
10685Related options:
10686
10687* `-libname x`: controls the name of the special `_open` and `_close` functions.
10688
10689
10690== Interfacing with C ==
10691
10692MLton can generate a C header file.
10693When the output format is not an executable, it creates one by default named `libname.h`.
10694This can be overridden with `-export-header foo.h`.
10695This header file should be included by any C files using the exported Standard ML symbols.
10696
10697If C is being linked with Standard ML into the same output archive or DSO,
10698then the C code should `#define PART_OF_LIBNAME` before it includes the header file.
10699This ensures that the C code is using the symbols with correct scope.
10700Any symbols exported from C should also be marked using the `PRIVATE`/`PUBLIC`/`EXTERNAL` macros defined in the Standard ML export header.
10701The declared C scope on exported C symbols should match the import scope used in Standard ML.
10702
10703An example:
10704[source,c]
10705----
10706#define PART_OF_FOO
10707#include "foo.h"
10708
10709PUBLIC int cFoo() {
10710 return smlFoo();
10711}
10712----
10713
10714[source,sml]
10715----
10716val () = _export "smlFoo" private: unit -> int; (fn () => 5)
10717val cFoo = _import "cFoo" public: unit -> int;
10718----
10719
10720
10721== Operating-system specific details ==
10722
10723On Windows, `libarchive` and `archive` are the same.
10724However, depending on this will lead to portability problems.
10725Windows is also especially sensitive to mixups of '`public`' and '`external`'.
10726If an archive is linked, make sure its symbols are imported as `public`.
10727If a DLL is linked, make sure its symbols are imported as `external`.
10728Using `external` instead of `public` will result in link errors that `__imp__foo is undefined`.
10729Using `public` instead of `external` will result in inconsistent function pointer addresses and failure to update the imported variables.
10730
10731On Linux, `libarchive` and `archive` are different.
10732Libarchives are quite rare, but necessary if creating a library from an archive.
10733It is common for a library to provide both an archive and a dynamic library on this platform.
10734The linker will pick one or the other, usually preferring the dynamic library.
10735While a quirk of the operating system allows external import to work for both archives and libraries,
10736portable projects should not depend on this behaviour.
10737On other systems it can matter how the library is linked (static or dynamic).
10738
10739<<<
10740
10741:mlton-guide-page: License
10742[[License]]
10743License
10744=======
10745
10746== Web Site ==
10747In order to allow the maximum freedom for the future use of the
10748content in this web site, we require that contributions to the web
10749site be dedicated to the public domain. That means that you can only
10750add works that are already in the public domain, or works on which you
10751hold the copyright and which you agree to dedicate to the
10752public domain.
10753
10754By contributing to this web site, you agree to dedicate your
10755contribution to the public domain.
10756
10757== Software ==
10758
10759As of 20050812, MLton software is licensed under the BSD-style license
10760below. By contributing code to the project, you agree to release the
10761code under this license. Contributors can retain copyright to their
10762contributions by asserting copyright in their code. Contributors may
10763also add to the list of copyright holders in
10764`doc/license/MLton-LICENSE`, which appears below.
10765
10766[source,text]
10767----
10768sys::[./bin/InclGitFile.py mlton master doc/license/MLton-LICENSE]
10769----
10770
10771<<<
10772
10773:mlton-guide-page: LineDirective
10774[[LineDirective]]
10775LineDirective
10776=============
10777
10778To aid in the debugging of code produced by program generators such
10779as http://www.eecs.harvard.edu/%7Enr/noweb/[Noweb], MLton supports
10780comments with line directives of the form
10781[source,sml]
10782----
10783(*#line l.c "f"*)
10784----
10785Here, _l_ and _c_ are sequences of decimal digits and _f_ is the
10786source file. The first character of a source file has the position
107871.1. A line directive causes the front end to believe that the
10788character following the right parenthesis is at the line and column of
10789the specified file. A line directive only affects the reporting of
10790error messages and does not affect program semantics (except for
10791functions like `MLton.Exn.history` that report source file positions).
10792Syntactically invalid line directives are ignored. To prevent
10793incompatibilities with SML, the file name may not contain the
10794character sequence `*)`.
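
As an illustration, a generated file might contain a directive like the one below. The file name `parser.nw` and the position `27.1` are made-up values; a compiler that does not recognize line directives would simply treat this as an ordinary comment.

[source,sml]
----
(*#line 27.1 "parser.nw"*)val answer: int = 6 * 7
----

Because the directive describes the character immediately after the closing parenthesis, the declaration is placed on the same line: an error in `val answer` would then be reported at line 27, column 1 of `parser.nw` rather than at its position in the generated file.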
10795
10796<<<
10797
10798:mlton-guide-page: LLVM
10799[[LLVM]]
10800LLVM
10801====
10802
10803The http://www.llvm.org/[LLVM Project] is a collection of modular and
10804reusable compiler and toolchain technologies.
10805
10806MLton supports code generation via LLVM (`-codegen llvm`); see
10807<:LLVMCodegen:>.
10808
10809== Also see ==
10810
10811* <:CMinusMinus:>
10812
10813<<<
10814
10815:mlton-guide-page: LLVMCodegen
10816[[LLVMCodegen]]
10817LLVMCodegen
10818===========
10819
10820The <:LLVMCodegen:> is a <:Codegen:code generator> that translates the
10821<:Machine:> <:IntermediateLanguage:> to <:LLVM:> assembly, which is
10822further optimized and compiled to native object code by the <:LLVM:>
10823toolchain.
10824
10825It requires <:LLVM:> version 3.7 or greater to be installed.
10826
10827In benchmarks performed on the <:RunningOnAMD64:AMD64> architecture,
10828code size with this generator is usually slightly smaller than either
10829the <:AMD64Codegen:native> or the <:CCodegen:C> code generators. Compile
10830time is worse than <:AMD64Codegen:native>, but slightly better than
10831<:CCodegen:C>. Run time is often better than either <:AMD64Codegen:native>
10832or <:CCodegen:C>.
10833
10834== Implementation ==
10835
10836* <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.sig)>
10837* <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.fun)>
10838
10839== Details and Notes ==
10840
10841The <:LLVMCodegen:> was initially developed by Brian Leibig (see
10842<!Cite(Leibig13,An LLVM Back-end for MLton)>).
10843
10844<<<
10845
10846:mlton-guide-page: LocalFlatten
10847[[LocalFlatten]]
10848LocalFlatten
10849============
10850
10851<:LocalFlatten:> is an optimization pass for the <:SSA:>
10852<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10853
10854== Description ==
10855
10856This pass flattens arguments to <:SSA:> blocks.
10857
10858A block argument is flattened as long as it only flows to selects and
10859there is some tuple constructed in this function that flows to it.
10860
10861== Implementation ==
10862
10863* <!ViewGitFile(mlton,master,mlton/ssa/local-flatten.fun)>
10864
10865== Details and Notes ==
10866
10867{empty}
10868
10869<<<
10870
10871:mlton-guide-page: LocalRef
10872[[LocalRef]]
10873LocalRef
10874========
10875
10876<:LocalRef:> is an optimization pass for the <:SSA:>
10877<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10878
10879== Description ==
10880
10881This pass optimizes `ref` cells local to an <:SSA:> function:
10882
10883* global `ref`-s only used in one function are moved to the function
10884
10885* `ref`-s only created, read from, and written to (i.e., don't escape)
10886are converted into function local variables
10887
10888Uses <:Multi:> and <:Restore:>.
10889
10890== Implementation ==
10891
10892* <!ViewGitFile(mlton,master,mlton/ssa/local-ref.fun)>
10893
10894== Details and Notes ==
10895
10896Moving a global `ref` requires the <:Multi:> analysis, because a
10897global `ref` can only be moved into a function that is executed at
10898most once.
10899
10900Conversion of non-escaping `ref`-s is structured in three phases:
10901
10902* analysis -- a variable `r = Ref_ref x` escapes if
10903** `r` is used in any context besides `Ref_assign (r, _)` or `Ref_deref r`
10904** all uses of `r` reachable from a (direct or indirect) call to `Thread_copyCurrent` are of the same flavor (either `Ref_assign` or `Ref_deref`); this also requires the <:Multi:> analysis.
10905
10906* transformation
10907+
10908--
10909** rewrites `r = Ref_ref x` to `r = x`
10910** rewrites `_ = Ref_assign (r, y)` to `r = y`
10911** rewrites `z = Ref_deref r` to `z = r`
10912--
10913+
10914Note that the resulting program violates the SSA condition.
10915
10916* <:Restore:> -- restore the SSA condition.
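
A source-level sketch of the kind of code this conversion targets. The pass itself works on <:SSA:>, not on SML source, and whether it fires for a given program depends on what earlier passes produce; the functions `sumTo` and `sumTo'` are hypothetical names used only for illustration.

[source,sml]
----
(* Both refs are created, read, and written inside sumTo and never escape. *)
fun sumTo (n: int): int =
   let
      val i = ref 0
      val acc = ref 0
   in
      while !i < n do
         (acc := !acc + !i;
          i := !i + 1);
      !acc
   end

(* Conceptually, the converted code behaves like a loop over plain
 * variables, with no heap-allocated cells. *)
fun sumTo' (n: int): int =
   let
      fun loop (i, acc) = if i < n then loop (i + 1, acc + i) else acc
   in
      loop (0, 0)
   end
----

Both functions compute the same result; the second form shows, at the source level, what replacing `Ref_ref`, `Ref_assign`, and `Ref_deref` by ordinary variable bindings amounts to.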
10917
10918<<<
10919
10920:mlton-guide-page: Logo
10921[[Logo]]
10922Logo
10923====
10924
10925ifdef::basebackend-html[]
10926image::Logo.attachments/mlton.svg[align="center",height="128",width="128"]
10927endif::[]
10928ifdef::basebackend-docbook[]
10929image::Logo.attachments/mlton-128.pdf[align="center"]
10930endif::[]
10931
10932== Files ==
10933
10934* <!Attachment(Logo,mlton.svg)>
10935* <!Attachment(Logo,mlton-1024.png)>
10936* <!Attachment(Logo,mlton-512.png)>
10937* <!Attachment(Logo,mlton-256.png)>
10938* <!Attachment(Logo,mlton-128.png)>
10939* <!Attachment(Logo,mlton-64.png)>
10940* <!Attachment(Logo,mlton-32.png)>
10941
10942<<<
10943
10944:mlton-guide-page: LoopInvariant
10945[[LoopInvariant]]
10946LoopInvariant
10947=============
10948
10949<:LoopInvariant:> is an optimization pass for the <:SSA:>
10950<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
10951
10952== Description ==
10953
10954This pass removes loop invariant arguments to local loops.
10955
10956----
10957 loop (x, y)
10958 ...
10959 ...
10960 loop (x, z)
10961 ...
10962----
10963
10964becomes
10965
10966----
10967 loop' (x, y)
10968 loop (y)
10969 loop (y)
10970 ...
10971 ...
10972 loop (z)
10973 ...
10974----
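
The same idea, written as a source-level analogy in SML. The actual pass operates on <:SSA:> blocks; `sumA` and `sumB` are hypothetical names.

[source,sml]
----
(* Before: x is passed unchanged on every iteration of the loop. *)
fun sumA (x: int, n: int): int =
   let
      fun loop (x, n, acc) =
         if n = 0 then acc else loop (x, n - 1, acc + x)
   in
      loop (x, n, 0)
   end

(* After: the invariant argument is dropped and x is taken from the
 * enclosing scope instead. *)
fun sumB (x: int, n: int): int =
   let
      fun loop (n, acc) =
         if n = 0 then acc else loop (n - 1, acc + x)
   in
      loop (n, 0)
   end
----

Both versions compute `x * n` by repeated addition; only the number of arguments carried around the loop differs.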
10975
10976== Implementation ==
10977
10978* <!ViewGitFile(mlton,master,mlton/ssa/loop-invariant.fun)>
10979
10980== Details and Notes ==
10981
10982{empty}
10983
10984<<<
10985
10986:mlton-guide-page: LoopUnroll
10987[[LoopUnroll]]
10988LoopUnroll
10989==========
10990
10991<:LoopUnroll:> is an optimization pass for the <:SSA:> <:IntermediateLanguage:>,
10992invoked from <:SSASimplify:>.
10993
10994== Description ==
10995
10996A simple loop unrolling optimization.
10997
10998== Implementation ==
10999
11000* <!ViewGitFile(mlton,master,mlton/ssa/loop-unroll.fun)>
11001
11002== Details and Notes ==
11003
11004{empty}
11005
11006<<<
11007
11008:mlton-guide-page: LoopUnswitch
11009[[LoopUnswitch]]
11010LoopUnswitch
11011============
11012
11013<:LoopUnswitch:> is an optimization pass for the <:SSA:>
11014<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
11015
11016== Description ==
11017
11018A simple loop unswitching optimization.
11019
11020== Implementation ==
11021
11022* <!ViewGitFile(mlton,master,mlton/ssa/loop-unswitch.fun)>
11023
11024== Details and Notes ==
11025
11026{empty}
11027
11028<<<
11029
11030:mlton-guide-page: Machine
11031[[Machine]]
11032Machine
11033=======
11034
11035<:Machine:> is an <:IntermediateLanguage:>, translated from <:RSSA:>
11036by <:ToMachine:> and used as input by the <:Codegen:>.
11037
11038== Description ==
11039
11040<:Machine:> is an <:Untyped:> <:IntermediateLanguage:>, corresponding
11041to an abstract register machine.
11042
11043== Implementation ==
11044
11045* <!ViewGitFile(mlton,master,mlton/backend/machine.sig)>
11046* <!ViewGitFile(mlton,master,mlton/backend/machine.fun)>
11047
11048== Type Checking ==
11049
11050The <:Machine:> <:IntermediateLanguage:> has a primitive type checker
11051(<!ViewGitFile(mlton,master,mlton/backend/machine.sig)>,
11052<!ViewGitFile(mlton,master,mlton/backend/machine.fun)>), which only checks
11053some liveness properties.
11054
11055== Details and Notes ==
11056
11057The runtime structure sets some constants according to the
11058configuration files on the target architecture and OS.
11059
11060<<<
11061
11062:mlton-guide-page: ManualPage
11063[[ManualPage]]
11064ManualPage
11065==========
11066
11067MLton is run from the command line with a collection of options
11068followed by a file name and a list of files to compile, assemble, and
11069link with.
11070
11071----
11072mlton [option ...] file.{c|mlb|o|sml} [file.{c|o|s|S} ...]
11073----
11074
11075The simplest case is to run `mlton foo.sml`, where `foo.sml` contains
11076a valid SML program, in which case MLton compiles the program to
11077produce an executable `foo`. Since MLton does not support separate
11078compilation, the program must be the entire program you wish to
11079compile. However, the program may refer to signatures and structures
11080defined in the <:BasisLibrary:Basis Library>.
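
For instance, a minimal `foo.sml` could look like the sketch below. The function `greeting` is a hypothetical name, and the program uses only `print` and string concatenation from the Basis Library.

[source,sml]
----
(* foo.sml: a complete program in a single file. *)
fun greeting (name: string): string = "Hello, " ^ name ^ "!\n"

val () = print (greeting "world")
----

Running `mlton foo.sml` on this file should produce an executable `foo` that prints the greeting.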
11081
11082Larger programs, spanning many files, can be compiled with the
11083<:MLBasis:ML Basis system>. In this case, `mlton foo.mlb` will
11084compile the complete SML program described by the basis `foo.mlb`,
11085which may specify both SML files and additional bases.
11086
11087== Next Steps ==
11088
11089* <:CompileTimeOptions:>
11090* <:RunTimeOptions:>
11091
11092<<<
11093
11094:mlton-guide-page: MatchCompilation
11095[[MatchCompilation]]
11096MatchCompilation
11097================
11098
11099Match compilation is the process of translating an SML match into a
11100nested tree (or dag) of simple case expressions and tests.
11101
11102MLton's match compiler is described <:MatchCompile:here>.
11103
11104== Match compilation in other compilers ==
11105
11106* <!Cite(BaudinetMacQueen85)>
11107* <!Cite(Leroy90)>, pages 60-69.
11108* <!Cite(Sestoft96)>
11109* <!Cite(ScottRamsey00)>
11110
11111<<<
11112
11113:mlton-guide-page: MatchCompile
11114[[MatchCompile]]
11115MatchCompile
11116============
11117
11118<:MatchCompile:> is a translation pass, agnostic in the
11119<:IntermediateLanguage:>s between which it translates.
11120
11121== Description ==
11122
11123<:MatchCompilation:Match compilation> converts a case expression with
11124nested patterns into a case expression with flat patterns.
11125
11126== Implementation ==
11127
11128* <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.sig)>
11129* <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.fun)>
11130
11131== Details and Notes ==
11132
11133[source,sml]
11134----
11135val matchCompile:
11136 {caseType: Type.t, (* type of entire expression *)
11137 cases: (NestedPat.t * ((Var.t -> Var.t) -> Exp.t)) vector,
11138 conTycon: Con.t -> Tycon.t,
11139 region: Region.t,
11140 test: Var.t,
11141 testType: Type.t,
11142 tyconCons: Tycon.t -> {con: Con.t, hasArg: bool} vector}
11143 -> Exp.t * (unit -> ((Layout.t * {isOnlyExns: bool}) vector) vector)
11144----
11145
11146`matchCompile` is complicated by the desire for modularity between the
11147match compiler and its caller. Its caller is responsible for building
11148the right hand side of a rule `p => e`. On the other hand, the match
11149compiler is responsible for destructing the test and binding new
11150variables to the components. In order to connect the new variables
11151created by the match compiler with the variables in the pattern `p`,
11152the match compiler passes an environment back to its caller that maps
11153each variable in `p` to the corresponding variable introduced by the
11154match compiler.
11155
11156The match compiler builds a tree of n-way case expressions by working
11157from outside to inside and left to right in the patterns. For example,
11158[source,sml]
11159----
11160case x of
11161 (_, C1 a) => e1
11162| (C2 b, C3 c) => e2
11163----
11164is translated to
11165[source,sml]
11166----
11167let
11168 fun f1 a = e1
11169 fun f2 (b, c) = e2
11170in
11171 case x of
11172 (x1, x2) =>
11173 (case x1 of
11174 C2 b' => (case x2 of
11175 C1 a' => f1 a'
11176 | C3 c' => f2(b',c')
11177 | _ => raise Match)
11178 | _ => (case x2 of
11179 C1 a_ => f1 a_
11180 | _ => raise Match))
11181end
11182----
11183
11184Here you can see the necessity of abstracting out the right hand sides
11185of the cases in order to avoid code duplication. Right hand sides are
11186always abstracted. The simplifier cleans things up. You can also see
11187the new (primed) variables introduced by the match compiler and how
11188the renaming works. Finally, you can see how the match compiler
11189introduces the necessary default clauses in order to make a match
11190exhaustive, i.e. cover all the cases.
11191
11192The match compiler uses `conTycon` and `tyconCons` to determine
11193the exhaustivity of matches against constructors.
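
To make the translation concrete, here is a runnable instance of the example above. The datatype `t` and the right hand sides `a + 1` and `b * c` are arbitrary stand-ins chosen for illustration; the nested version is deliberately non-exhaustive, so a compiler may warn about it.

[source,sml]
----
datatype t = C1 of int | C2 of int | C3 of int

(* Nested-pattern version. *)
fun nested (x: t * t): int =
   case x of
      (_, C1 a) => a + 1
    | (C2 b, C3 c) => b * c

(* Flat version, following the shape of the translation above. *)
fun flat (x: t * t): int =
   let
      fun f1 a = a + 1
      fun f2 (b, c) = b * c
   in
      case x of
         (x1, x2) =>
            (case x1 of
                C2 b' => (case x2 of
                             C1 a' => f1 a'
                           | C3 c' => f2 (b', c')
                           | _ => raise Match)
              | _ => (case x2 of
                         C1 a_ => f1 a_
                       | _ => raise Match))
   end
----

Both functions agree on every input: for example, `(C3 0, C1 4)` is handled by the first rule in each, `(C2 2, C3 3)` by the second, and `(C1 0, C3 1)` raises `Match` in both.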
11194
11195<<<
11196
11197:mlton-guide-page: MatthewFluet
11198[[MatthewFluet]]
11199MatthewFluet
11200============
11201
11202Matthew Fluet (
11203mailto:matthew.fluet@gmail.com[matthew.fluet@gmail.com]
11204,
11205http://www.cs.rit.edu/%7Emtf
11206)
11207is an Assistant Professor at the http://www.rit.edu[Rochester Institute of Technology].
11208
11209''''
11210
11211Current MLton projects:
11212
11213* general maintenance
11214* release new version
11215
11216''''
11217
11218Misc. and underspecified TODOs:
11219
11220* understand <:RefFlatten:> and <:DeepFlatten:>
11221** http://www.mlton.org/pipermail/mlton/2005-April/026990.html
11222** http://www.mlton.org/pipermail/mlton/2007-November/030056.html
11223** http://www.mlton.org/pipermail/mlton/2008-April/030250.html
11224** http://www.mlton.org/pipermail/mlton/2008-July/030279.html
11225** http://www.mlton.org/pipermail/mlton/2008-August/030312.html
11226** http://www.mlton.org/pipermail/mlton/2008-September/030360.html
11227** http://www.mlton.org/pipermail/mlton-user/2009-June/001542.html
11228* `MSG_DONTWAIT` isn't Posix
11229* coordinate w/ Dan Spoonhower and Lukasz Ziarek and Armand Navabi on multi-threaded
11230** http://www.mlton.org/pipermail/mlton/2008-March/030214.html
11231* Intel Research bug: `no tyconRep property` (company won't release sample code)
11232** http://www.mlton.org/pipermail/mlton-user/2008-March/001358.html
11233* treatment of real constants
11234** http://www.mlton.org/pipermail/mlton/2008-May/030262.html
11235** http://www.mlton.org/pipermail/mlton/2008-June/030271.html
11236* representation of `bool` and `_bool` in <:ForeignFunctionInterface:>
11237** http://www.mlton.org/pipermail/mlton/2008-May/030264.html
11238* http://www.icfpcontest.org
11239** John Reppy claims that "It looks like the card-marking overhead that one incurs when using generational collection swamps the benefits of generational collection."
11240* page to disk policy / single heap
11241** http://www.mlton.org/pipermail/mlton/2008-June/030278.html
11242** http://www.mlton.org/pipermail/mlton/2008-August/030318.html
11243* `MLton.GC.pack` doesn't keep a small heap if a garbage collection occurs before `MLton.GC.unpack`.
11244** It might be preferable for `MLton.GC.pack` to be implemented as a (new) `MLton.GC.Ratios.setLive 1.1` followed by `MLton.GC.collect ()` and for `MLton.GC.unpack` to be implemented as `MLton.GC.Ratios.setLive 8.0` followed by `MLton.GC.collect ()`.
11245* The `static struct GC_objectType objectTypes[] =` array includes many duplicates. Objects of distinct source type, but equivalent representations (in terms of size, bytes non-pointers, number pointers) can share the objectType index.
11246* PolySpace bug: <:Redundant:> optimization (company won't release sample code)
11247** http://www.mlton.org/pipermail/mlton/2008-September/030355.html
11248* treatment of exception raised during <:BasisLibrary:> evaluation
11249** http://www.mlton.org/pipermail/mlton/2008-December/030501.html
11250** http://www.mlton.org/pipermail/mlton/2008-December/030502.html
11251** http://www.mlton.org/pipermail/mlton/2008-December/030503.html
11252* Use `memcpy`
11253** http://www.mlton.org/pipermail/mlton-user/2009-January/001506.html
11254** http://www.mlton.org/pipermail/mlton/2009-January/030506.html
11255* Implement more 64bit primops in x86 codegen
11256** http://www.mlton.org/pipermail/mlton/2009-January/030507.html
11257* Enrich path-map file syntax:
11258** http://www.mlton.org/pipermail/mlton/2008-September/030348.html
11259** http://www.mlton.org/pipermail/mlton-user/2009-January/001507.html
11260* PolySpace bug: crash during Cheney-copy collection
11261** http://www.mlton.org/pipermail/mlton/2009-February/030513.html
11262* eliminate `-build-constants`
11263** all `_const`-s are known by `runtime/gen/basis-ffi.def`
11264** generate `gen-constants.c` from `basis-ffi.def`
11265** generate `constants` from `gen-constants.c` and `libmlton.a`
11266** similar to `gen-sizes.c` and `sizes`
11267* eliminate "Windows hacks" for Cygwin from `Path` module
11268** http://www.mlton.org/pipermail/mlton/2009-July/030606.html
11269* extend IL type checkers to check for empty property lists
11270* make (unsafe) `IntInf` conversions into primitives
11271** http://www.mlton.org/pipermail/mlton/2009-July/030622.html
11272
11273<<<
11274
11275:mlton-guide-page: mGTK
11276[[mGTK]]
11277mGTK
11278====
11279
11280http://mgtk.sourceforge.net/[mGTK] is a wrapper for
11281http://www.gtk.org/[GTK+], a GUI toolkit.
11282
11283We recommend using mGTK 0.93, which is not listed on their home page,
11284but is available at the
11285http://sourceforge.net/project/showfiles.php?group_id=23226&package_id=16523[file
11286release page]. To test it, after unpacking, do `cd examples; make
11287mlton`, after which you should be able to run the many examples
11288(`signup-mlton`, `listview-mlton`, ...).
11289
11290== Also see ==
11291
11292* <:Glade:>
11293
11294<<<
11295
11296:mlton-guide-page: MichaelNorrish
11297[[MichaelNorrish]]
11298MichaelNorrish
11299==============
11300
11301I am a researcher at http://nicta.com.au[NICTA], with a web-page http://web.rsise.anu.edu.au/%7Emichaeln/[here].
11302
11303I'm interested in MLton because of the chance that it might be a good vehicle for future implementations of the http://hol.sf.net[HOL] theorem-proving system. It's beginning to look as if one route forward will be to embed an SML interpreter into a MLton-compiled executable. I don't know if an extensible interpreter of the kind we're looking for already exists.
11304
11305<<<
11306
11307:mlton-guide-page: MikeThomas
11308[[MikeThomas]]
11309MikeThomas
11310==========
11311
11312Here is a picture at home in Brisbane, Queensland, Australia, taken in January 2004.
11313
11314image::MikeThomas.attachments/picture.jpg[align="center"]
11315
11316<<<
11317
11318:mlton-guide-page: ML
11319[[ML]]
11320ML
11321==
11322
11323ML stands for _meta language_. ML was originally designed in the
113241970s as a programming language to assist theorem proving in the logic
11325LCF. In the 1980s, ML split into two variants,
11326<:StandardML:Standard ML> and <:OCaml:>, both of which are still used
11327today.
11328
11329<<<
11330
11331:mlton-guide-page: MLAntlr
11332[[MLAntlr]]
11333MLAntlr
11334=======
11335
11336http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLAntlr] is a
11337parser generator for <:StandardML:Standard ML>.
11338
11339== Also see ==
11340
11341* <:MLULex:>
11342* <:MLLPTLibrary:>
11343
11344<<<
11345
11346:mlton-guide-page: MLBasis
11347[[MLBasis]]
11348MLBasis
11349=======
11350
11351The ML Basis system extends <:StandardML:Standard ML> to support
11352programming-in-the-very-large, namespace management at the module
11353level, separate delivery of library sources, and more. While Standard
11354ML modules are a sophisticated language for programming-in-the-large,
11355it is difficult, if not impossible, to accomplish a number of routine
11356namespace management operations when a program draws upon multiple
11357libraries provided by different vendors.
11358
11359The ML Basis system is a simple, yet powerful, approach that builds
11360upon the programmer's intuitive notion (and
11361<:DefinitionOfStandardML: The Definition of Standard ML (Revised)>'s
11362formal notion) of the top-level environment (a _basis_). The system
11363is designed as a natural extension of <:StandardML: Standard ML>; the
11364formal specification of the ML Basis system
11365(<!Attachment(MLBasis,mlb-formal.pdf)>) is given in the style
11366of the Definition.
11367
11368Here are some of the key features of the ML Basis system:
11369
113701. Explicit file order: The order of files (and, hence, the order of
11371evaluation) in the program is explicit. The ML Basis system's
11372semantics are structured in such a way that for any well-formed
11373project, there will be exactly one possible interpretation of the
11374project's syntax, static semantics, and dynamic semantics.
11375
113762. Implicit dependencies: A source file (corresponding to an SML
11377top-level declaration) is elaborated in the environment described by
11378preceding declarations. It is not necessary to explicitly list the
11379dependencies of a file.
11380
113813. Scoping and renaming: The ML Basis system provides mechanisms for
limiting the scope of (i.e., hiding) and renaming identifiers.
11383
113844. No naming convention for finding the file that defines a module.
11385To import a module, its defining file must appear in some ML Basis
11386file.
11387
11388== Next steps ==
11389
11390* <:MLBasisSyntaxAndSemantics:>
11391* <:MLBasisExamples:>
11392* <:MLBasisPathMap:>
11393* <:MLBasisAnnotations:>
11394* <:MLBasisAvailableLibraries:>
11395
11396<<<
11397
11398:mlton-guide-page: MLBasisAnnotationExamples
11399[[MLBasisAnnotationExamples]]
11400MLBasisAnnotationExamples
11401=========================
11402
11403Here are some example uses of <:MLBasisAnnotations:>.
11404
11405== Eliminate spurious warnings in automatically generated code ==
11406
11407Programs that automatically generate source code can often produce
11408nonexhaustive patterns, relying on invariants of the generated code to
11409ensure that the pattern matchings never fail. A programmer may wish
11410to elide the nonexhaustive warnings from this code, in order that
11411legitimate warnings are not missed in a flurry of false positives. To
11412do so, the programmer simply annotates the generated code with the
11413`nonexhaustiveBind ignore` and `nonexhaustiveMatch ignore`
11414annotations:
11415
11416----
11417local
11418 $(GEN_ROOT)/gen-lib.mlb
11419
11420 ann
11421 "nonexhaustiveBind ignore"
11422 "nonexhaustiveMatch ignore"
11423 in
11424 foo.gen.sml
11425 end
11426in
11427 signature FOO
11428 structure Foo
11429end
11430----
11431
11432
11433== Deliver a library ==
11434
11435Standard ML libraries can be delivered via `.mlb` files. Authors of
11436such libraries should strive to be mindful of the ways in which
11437programmers may choose to compile their programs. For example,
11438although the defaults for `sequenceNonUnit` and `warnUnused` are
11439`ignore` and `false`, periodically compiling with these annotations
11440defaulted to `warn` and `true` can help uncover likely bugs. However,
11441a programmer is unlikely to be interested in unused modules from an
11442imported library, and the behavior of `sequenceNonUnit error` may be
11443incompatible with some libraries. Hence, a library author may choose
11444to deliver a library as follows:
11445
11446----
11447ann
11448 "nonexhaustiveBind warn" "nonexhaustiveMatch warn"
11449 "redundantBind warn" "redundantMatch warn"
11450 "sequenceNonUnit warn"
11451 "warnUnused true" "forceUsed"
11452in
11453 local
11454 file1.sml
11455 ...
11456 filen.sml
11457 in
11458 functor F1
11459 ...
11460 signature S1
11461 ...
11462 structure SN
11463 ...
11464 end
11465end
11466----
11467
11468The annotations `nonexhaustiveBind warn`, `redundantBind warn`,
11469`nonexhaustiveMatch warn`, `redundantMatch warn`, and `sequenceNonUnit
11470warn` have the obvious effect on elaboration. The annotations
11471`warnUnused true` and `forceUsed` work in conjunction -- warning on
11472any identifiers that do not contribute to the exported modules, and
11473preventing warnings on exported modules that are not used in the
11474remainder of the program. Many of the
11475<:MLBasisAvailableLibraries:available libraries> are delivered with
11476these annotations.
11477
11478<<<
11479
11480:mlton-guide-page: MLBasisAnnotations
11481[[MLBasisAnnotations]]
11482MLBasisAnnotations
11483==================
11484
11485<:MLBasis:ML Basis> annotations control options that affect the
11486elaboration of SML source files. Conceptually, a basis file is
11487elaborated in a default annotation environment (just as it is
11488elaborated in an empty basis). The declaration
11489++ann++{nbsp}++"++__ann__++"++{nbsp}++in++{nbsp}__basdec__{nbsp}++end++
11490merges the annotation _ann_ with the "current" annotation environment
11491for the elaboration of _basdec_. To allow for future expansion,
11492++"++__ann__++"++ is lexed as a single SML string constant. To
11493conveniently specify multiple annotations, the following derived form
11494is provided:
11495
11496****
11497+ann+ ++"++__ann__++"++ (++"++__ann__++"++ )^\+^ +in+ _basdec_ +end+
11498=>
11499+ann+ ++"++__ann__++"++ +in+ +ann+ (++"++__ann__++"++)^\+^ +in+ _basdec_ +end+ +end+
11500****
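
As a worked instance of the derived form, a declaration that lists two annotations is shorthand for two nested single-annotation declarations. The sketch below uses annotations described on this page; `foo.sml` is a hypothetical file name, and the two spellings are alternatives, so a real `.mlb` file would contain only one of them.

----
(* derived form: several annotations in one ann *)
ann
   "nonexhaustiveMatch warn"
   "sequenceNonUnit warn"
in
   foo.sml (* hypothetical source file *)
end

(* equivalent expansion into nested single-annotation declarations *)
ann "nonexhaustiveMatch warn" in
   ann "sequenceNonUnit warn" in
      foo.sml
   end
end
----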
11501
11502Here are the available annotations. In the explanation below, for
11503annotations that take an argument, the first value listed is the
11504default.
11505
11506* +allowFFI {false|true}+
11507+
11508If `true`, allow `_address`, `_export`, `_import`, and `_symbol`
11509expressions to appear in source files. See
11510<:ForeignFunctionInterface:>.
11511
11512* +allowSuccessorML {false|true}+
11513+
11514--
11515Allow or disallow all of the <:SuccessorML:> features. This is a
11516proxy for all of the following annotations.
11517
11518** +allowDoDecls {false|true}+
11519+
11520If `true`, allow a +do _exp_+ declaration form.
11521
11522** +allowExtendedConsts {false|true}+
11523+
11524--
11525Allow or disallow all of the extended constants features. This is a
11526proxy for all of the following annotations.
11527
11528*** +allowExtendedNumConsts {false|true}+
11529+
11530If `true`, allow extended numeric constants.
11531
11532*** +allowExtendedTextConsts {false|true}+
11533+
11534If `true`, allow extended text constants.
11535--
11536
11537** +allowLineComments {false|true}+
11538+
11539If `true`, allow line comments beginning with the token ++(*)++.
11540
11541** +allowOptBar {false|true}+
11542+
11543If `true`, allow a bar to appear before the first match rule of a
11544`case`, `fn`, or `handle` expression, allow a bar to appear before the
11545first function-value binding of a `fun` declaration, and allow a bar
11546to appear before the first constructor binding or description of a
11547`datatype` declaration or specification.
11548
11549** +allowOptSemicolon {false|true}+
11550+
If `true`, allow a semicolon to appear after the last expression in a
11552sequence expression or `let` body.
11553
11554** +allowOrPats {false|true}+
11555+
If `true`, allow disjunctive (a.k.a., "or") patterns of the form
11557+_pat_ | _pat_+.
11558
11559** +allowRecordPunExps {false|true}+
11560+
If `true`, allow record punning expressions.
11562
11563** +allowSigWithtype {false|true}+
11564+
If `true`, allow `withtype` to modify a `datatype` specification in a
11566signature.
11567
11568** +allowVectorExpsAndPats {false|true}+
11569+
11570--
11571Allow or disallow vector expressions and vector patterns. This is a
11572proxy for all of the following annotations.
11573
11574*** +allowVectorExps {false|true}+
11575+
11576If `true`, allow vector expressions.
11577
11578*** +allowVectorPats {false|true}+
11579+
11580If `true`, allow vector patterns.
11581--
11582--
11583
11584* +forceUsed+
11585+
11586Force all identifiers in the basis denoted by the body of the `ann` to
11587be considered used; use in conjunction with `warnUnused true`.
11588
11589* +nonexhaustiveBind {warn|error|ignore}+
11590+
11591If `error` or `warn`, report nonexhaustive patterns in `val`
11592declarations (i.e., pattern-match failures that raise the `Bind`
11593exception). An error will abort a compile, while a warning will not.
11594
11595* +nonexhaustiveExnBind {default|ignore}+
11596+
11597If `ignore`, suppress errors and warnings about nonexhaustive matches
11598in `val` declarations that arise solely from unmatched exceptions.
11599If `default`, follow the behavior of `nonexhaustiveBind`.
11600
11601* +nonexhaustiveExnMatch {default|ignore}+
11602+
11603If `ignore`, suppress errors and warnings about nonexhaustive matches
11604in `fn` expressions, `case` expressions, and `fun` declarations that
11605arise solely from unmatched exceptions. If `default`, follow the
11606behavior of `nonexhaustiveMatch`.
11607
11608* +nonexhaustiveExnRaise {ignore|default}+
11609+
11610If `ignore`, suppress errors and warnings about nonexhaustive matches
11611in `handle` expressions that arise solely from unmatched exceptions.
11612If `default`, follow the behavior of `nonexhaustiveRaise`.
11613
11614* +nonexhaustiveMatch {warn|error|ignore}+
11615+
11616If `error` or `warn`, report nonexhaustive patterns in `fn`
11617expressions, `case` expressions, and `fun` declarations (i.e.,
11618pattern-match failures that raise the `Match` exception). An error
11619will abort a compile, while a warning will not.
11620
11621* +nonexhaustiveRaise {ignore|warn|error}+
11622+
11623If `error` or `warn`, report nonexhaustive patterns in `handle`
11624expressions (i.e., pattern-match failures that implicitly (re)raise
11625the unmatched exception). An error will abort a compile, while a
11626warning will not.
11627
11628* +redundantBind {warn|error|ignore}+
11629+
11630If `error` or `warn`, report redundant patterns in `val` declarations.
11631An error will abort a compile, while a warning will not.
11632
11633* +redundantMatch {warn|error|ignore}+
11634+
11635If `error` or `warn`, report redundant patterns in `fn` expressions,
11636`case` expressions, and `fun` declarations. An error will abort a
11637compile, while a warning will not.
11638
11639* +redundantRaise {warn|error|ignore}+
11640+
11641If `error` or `warn`, report redundant patterns in `handle`
11642expressions. An error will abort a compile, while a warning will not.
11643
11644* +resolveScope {strdec|dec|topdec|program}+
11645+
11646Used to control the scope at which overload constraints are resolved
11647to default types (if not otherwise resolved by type inference) and the
11648scope at which unresolved flexible record constraints are reported.
11649+
11650The syntactic-class argument means to perform resolution checks at the
11651smallest enclosing syntactic form of the given class. The default
11652behavior is to resolve at the smallest enclosing _strdec_ (which is
11653equivalent to the largest enclosing _dec_). Other useful behaviors
11654are to resolve at the smallest enclosing _topdec_ (which is equivalent
11655to the largest enclosing _strdec_) and at the smallest enclosing
11656_program_ (which corresponds to a single `.sml` file and does not
11657correspond to the whole `.mlb` program).
11658
11659* +sequenceNonUnit {ignore|error|warn}+
11660+
11661If `error` or `warn`, report when `e1` is not of type `unit` in the
11662sequence expression `(e1; e2)`. This can be helpful in detecting
11663curried applications that are mistakenly not fully applied. To
11664silence spurious messages, you can use `ignore e1`.
11665
11666* +valrecConstr {warn|error|ignore}+
11667+
11668If `error` or `warn`, report when a `val rec` (or `fun`) declaration
11669redefines an identifier that previously had constructor status. An
11670error will abort a compile, while a warning will not.
11671
11672* +warnUnused {false|true}+
11673+
11674Report unused identifiers.
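
To make a few of the diagnostics above concrete, the following sketch collects small fragments of the kind that the corresponding annotations report on when set to `warn` or `error`. The function and value names are made up for illustration, and the wording of the compiler's messages is not shown.

[source,sml]
----
(* nonexhaustiveMatch: the case omits NONE; a failed match raises Match. *)
fun getOrFail (opt : int option) : int =
   case opt of
      SOME n => n

(* nonexhaustiveBind: the pattern omits the empty list; failure raises Bind. *)
val first :: _ = [1, 2, 3]

(* sequenceNonUnit: log is curried, and the first component of the
 * sequence applies it to only one argument, so that component has
 * type string -> unit rather than unit. *)
fun log (prefix : string) (msg : string) : unit =
   print (prefix ^ msg ^ "\n")
val () = (log "info: "; ())
----

With the defaults listed above, the first two fragments would be reported as warnings and the last would be accepted silently; enclosing the file in an `ann` declaration with `"sequenceNonUnit warn"` would report the last one as well.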
11675
11676== Next Steps ==
11677
11678 * <:MLBasisAnnotationExamples:>
11679 * <:WarnUnusedAnomalies:>
11680
11681<<<
11682
11683:mlton-guide-page: MLBasisAvailableLibraries
11684[[MLBasisAvailableLibraries]]
11685MLBasisAvailableLibraries
11686=========================
11687
11688MLton comes with the following <:MLBasis:ML Basis> files available.
11689
11690* `$(SML_LIB)/basis/basis.mlb`
11691+
11692The <:BasisLibrary:Basis Library>.
11693
11694* `$(SML_LIB)/basis/basis-1997.mlb`
11695+
11696The (deprecated) 1997 version of the <:BasisLibrary:Basis Library>.
11697
11698* `$(SML_LIB)/basis/mlton.mlb`
11699+
11700The <:MLtonStructure:MLton> structure and signatures.
11701
11702* `$(SML_LIB)/basis/c-types.mlb`
11703+
11704Various structure aliases useful as <:ForeignFunctionInterfaceTypes:>.
11705
11706* `$(SML_LIB)/basis/unsafe.mlb`
11707+
11708The <:UnsafeStructure:Unsafe> structure and signature.
11709
11710* `$(SML_LIB)/basis/sml-nj.mlb`
11711+
11712The <:SMLofNJStructure:SMLofNJ> structure and signature.
11713
11714* `$(SML_LIB)/mlyacc-lib/mlyacc-lib.mlb`
11715+
11716Modules used by parsers built with <:MLYacc:>.
11717
11718* `$(SML_LIB)/cml/cml.mlb`
11719+
11720<:ConcurrentML:>, a library for message-passing concurrency.
11721
11722* `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb`
11723+
11724<:MLNLFFI:ML-NLFFI>, a library for foreign function interfaces.
11725
11726* `$(SML_LIB)/mlrisc-lib/...`
11727+
11728<:MLRISCLibrary:>, a library for retargetable and optimizing compiler back ends.
11729
11730* `$(SML_LIB)/smlnj-lib/...`
11731+
11732<:SMLNJLibrary:>, a collection of libraries distributed with SML/NJ.
11733
11734* `$(SML_LIB)/ckit-lib/ckit-lib.mlb`
11735+
11736<:CKitLibrary:>, a library for C source code.
11737
11738* `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb`
11739+
11740<:MLLPTLibrary:>, a support library for the <:MLULex:> scanner generator and the <:MLAntlr:> parser generator.
11741
11742
11743== Basis fragments ==
11744
11745There are a number of specialized ML Basis files for importing
fragments of the <:BasisLibrary: Basis Library> that cannot be
11747expressed within SML.
11748
11749* `$(SML_LIB)/basis/pervasive-types.mlb`
11750+
11751The top-level types and constructors of the Basis Library.
11752
11753* `$(SML_LIB)/basis/pervasive-exns.mlb`
11754+
11755The top-level exception constructors of the Basis Library.
11756
11757* `$(SML_LIB)/basis/pervasive-vals.mlb`
11758+
11759The top-level values of the Basis Library, without infix status.
11760
11761* `$(SML_LIB)/basis/overloads.mlb`
11762+
11763The top-level overloaded values of the Basis Library, without infix status.
11764
11765* `$(SML_LIB)/basis/equal.mlb`
11766+
11767The polymorphic equality `=` and inequality `<>` values, without infix status.
11768
11769* `$(SML_LIB)/basis/infixes.mlb`
11770+
11771The infix declarations of the Basis Library.
11772
11773* `$(SML_LIB)/basis/pervasive.mlb`
11774+
11775The entire top-level value and type environment of the Basis Library, with infix status. This is the same as importing the above six MLB files.
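
As a rough illustration of how the fragments compose, the sketch below imports only part of the top-level environment. `prog.sml` is a hypothetical file; under these imports it would see the top-level types, constructors, exception constructors, and infix declarations, but not top-level values such as `print`.

----
$(SML_LIB)/basis/pervasive-types.mlb
$(SML_LIB)/basis/pervasive-exns.mlb
$(SML_LIB)/basis/infixes.mlb
prog.sml (* hypothetical source file *)
----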
11776
11777<<<
11778
11779:mlton-guide-page: MLBasisExamples
11780[[MLBasisExamples]]
11781MLBasisExamples
11782===============
11783
11784Here are some example uses of <:MLBasis:ML Basis> files.
11785
11786
11787== Complete program ==
11788
11789Suppose your complete program consists of the files `file1.sml`, ...,
11790`filen.sml`, which depend upon libraries `lib1.mlb`, ..., `libm.mlb`.
11791
11792----
11793(* import libraries *)
11794lib1.mlb
11795...
11796libm.mlb
11797
11798(* program files *)
11799file1.sml
11800...
11801filen.sml
11802----
11803
11804The bases denoted by `lib1.mlb`, ..., `libm.mlb` are merged (bindings
11805of names in later bases take precedence over bindings of the same name
11806in earlier bases), producing a basis in which `file1.sml`, ...,
11807`filen.sml` are elaborated, adding additional bindings to the basis.
11808
11809
11810== Export filter ==
11811
11812Suppose you only want to export certain structures, signatures, and
11813functors from a collection of files.
11814
11815----
11816local
11817 file1.sml
11818 ...
11819 filen.sml
11820in
11821 (* export filter here *)
11822 functor F
11823 structure S
11824end
11825----
11826
11827While `file1.sml`, ..., `filen.sml` may declare top-level identifiers
11828in addition to `F` and `S`, such names are not accessible to programs
11829and libraries that import this `.mlb`.
11830
11831
11832== Export filter with renaming ==
11833
11834Suppose you want an export filter, but want to rename one of the
11835modules.
11836
11837----
11838local
11839 file1.sml
11840 ...
11841 filen.sml
11842in
11843 (* export filter, with renaming, here *)
11844 functor F
11845 structure S' = S
11846end
11847----
11848
11849Note that `functor F` is an abbreviation for `functor F = F`, which
11850simply exports an identifier under the same name.
11851
11852
11853== Import filter ==
11854
11855Suppose you only want to import a functor `F` from one library and a
11856structure `S` from another library.
11857
11858----
11859local
11860 lib1.mlb
11861in
11862 (* import filter here *)
11863 functor F
11864end
11865local
11866 lib2.mlb
11867in
11868 (* import filter here *)
11869 structure S
11870end
11871file1.sml
11872...
11873filen.sml
11874----
11875
11876
11877== Import filter with renaming ==
11878
11879Suppose you want to import a structure `S` from one library and
11880another structure `S` from another library.
11881
11882----
11883local
11884 lib1.mlb
11885in
11886 (* import filter, with renaming, here *)
11887 structure S1 = S
11888end
11889local
11890 lib2.mlb
11891in
11892 (* import filter, with renaming, here *)
11893 structure S2 = S
11894end
11895file1.sml
11896...
11897filen.sml
11898----
11899
11900
11901== Full Basis ==
11902
11903Since the Modules level of SML is the natural means for organizing
11904program and library components, MLB files provide convenient syntax
11905for renaming Modules level identifiers (in fact, renaming of functor
11906identifiers provides a mechanism that is not available in SML).
11907However, please note that `.mlb` files elaborate to full bases
11908including top-level types and values (including infix status), in
11909addition to structures, signatures, and functors. For example,
11910suppose you wished to extend the <:BasisLibrary:Basis Library> with an
11911`('a, 'b) either` datatype corresponding to a disjoint sum; the type
11912and some operations should be available at the top-level;
11913additionally, a signature and structure provide the complete
11914interface.
11915
11916We could use the following files.
11917
11918`either-sigs.sml`
11919[source,sml]
11920----
11921signature EITHER_GLOBAL =
11922 sig
11923 datatype ('a, 'b) either = Left of 'a | Right of 'b
11924 val & : ('a -> 'c) * ('b -> 'c) -> ('a, 'b) either -> 'c
11925 val && : ('a -> 'c) * ('b -> 'd) -> ('a, 'b) either -> ('c, 'd) either
11926 end
11927
11928signature EITHER =
11929 sig
11930 include EITHER_GLOBAL
11931 val isLeft : ('a, 'b) either -> bool
11932 val isRight : ('a, 'b) either -> bool
11933 ...
11934 end
11935----
11936
11937`either-strs.sml`
11938[source,sml]
11939----
11940structure Either : EITHER =
11941 struct
11942 datatype ('a, 'b) either = Left of 'a | Right of 'b
11943 fun f & g = fn x =>
11944 case x of Left z => f z | Right z => g z
11945 fun f && g = (Left o f) & (Right o g)
11946 fun isLeft x = ((fn _ => true) & (fn _ => false)) x
11947 fun isRight x = (not o isLeft) x
11948 ...
11949 end
11950structure EitherGlobal : EITHER_GLOBAL = Either
11951----
11952
11953`either-infixes.sml`
11954[source,sml]
11955----
11956infixr 3 & &&
11957----
11958
11959`either-open.sml`
11960[source,sml]
11961----
11962open EitherGlobal
11963----
11964
11965`either.mlb`
11966----
11967either-infixes.sml
11968local
11969 (* import Basis Library *)
11970 $(SML_LIB)/basis/basis.mlb
11971 either-sigs.sml
11972 either-strs.sml
11973in
11974 signature EITHER
11975 structure Either
11976 either-open.sml
11977end
11978----
11979
11980A client that imports `either.mlb` will have access to neither
11981`EITHER_GLOBAL` nor `EitherGlobal`, but will have access to the type
11982`either` and the values `&` and `&&` (with infix status) in the
11983top-level environment. Note that `either-infixes.sml` is outside the
11984scope of the local, because we want the infixes available in the
11985implementation of the library and to clients of the library.
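
A hypothetical client illustrates this. The sketch assumes a `client.mlb` that lists `$(SML_LIB)/basis/basis.mlb`, then `either.mlb`, then `client.sml`; the file names and the `describe` function are made up for illustration, and the fragment is not meant to compile on its own.

[source,sml]
----
(* client.sml (hypothetical): elaborated after either.mlb has been imported. *)

(* & can be written infix here because either-infixes.sml is outside the local. *)
val describe : (int, string) either -> string =
   Int.toString & (fn s => s)

val () = print (describe (Left 42) ^ "\n")
val () = print (describe (Right "none") ^ "\n")

(* The structure exported by the filter is available as well. *)
val _ : bool = Either.isLeft (Left 1 : (int, string) either)
----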
11986
11987<<<
11988
11989:mlton-guide-page: MLBasisPathMap
11990[[MLBasisPathMap]]
11991MLBasisPathMap
11992==============
11993
11994An <:MLBasis:ML Basis> _path map_ describes a map from ML Basis path
11995variables (of the form `$(VAR)`) to file system paths. ML Basis path
11996variables provide a flexible way to refer to libraries while allowing
11997them to be moved without changing their clients.
11998
11999The format of an `mlb-path-map` file is a sequence of lines; each line
12000consists of two, white-space delimited tokens. The first token is a
12001path variable `VAR` and the second token is the path to which the
12002variable is mapped. The path may include path variables, which are
12003recursively expanded.
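
For instance, a path map file might look like the following sketch. The variable names `MY_ROOT` and `MY_LIB` and the paths are hypothetical; the second line shows a mapping whose path mentions another path variable. The listing contains no comments because the format described above consists only of two-token lines.

----
MY_ROOT /home/user/sml
MY_LIB $(MY_ROOT)/my-lib
----

An MLB file could then refer to `$(MY_LIB)/my-lib.mlb`, and the library could later be moved by editing only the path map.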
12004
12005The mapping from path variables to paths is initialized by the compiler.
12006Additional path maps can be specified with `-mlb-path-map` and
12007individual path variable mappings can be specified with
12008`-mlb-path-var` (see <:CompileTimeOptions:>). Configuration files are
processed from first to last and from top to bottom; later mappings
12010take precedence over earlier mappings.
12011
The compiler and the system-wide configuration file make the following
12013path variables available.
12014
12015[options="header",cols="^25%,<75%"]
12016|====
12017|MLB path variable|Description
12018|`SML_LIB`|path to system-wide libraries, usually `/usr/lib/mlton/sml`
12019|`TARGET_ARCH`|string representation of target architecture
12020|`TARGET_OS`|string representation of target operating system
12021|`DEFAULT_INT`|binding for default int, usually `int32`
12022|`DEFAULT_WORD`|binding for default word, usually `word32`
12023|`DEFAULT_REAL`|binding for default real, usually `real64`
12024|====
12025
12026<<<
12027
12028:mlton-guide-page: MLBasisSyntaxAndSemantics
12029[[MLBasisSyntaxAndSemantics]]
12030MLBasisSyntaxAndSemantics
12031=========================
12032
12033An <:MLBasis:ML Basis> (MLB) file should have the `.mlb` suffix and
12034should contain a basis declaration.
12035
12036== Syntax ==
12037
12038A basis declaration (_basdec_) must be one of the following forms.
12039
12040* +basis+ _basid_ +=+ _basexp_ (+and+ _basid_ +=+ _basexp_)^*^
12041* +open+ _basid~1~_ ... _basid~n~_
12042* +local+ _basdec_ +in+ _basdec_ +end+
12043* _basdec_ [+;+] _basdec_
* +structure+ _strid_ [+=+ _strid_] (+and+ _strid_ [+=+ _strid_])^*^
12045* +signature+ _sigid_ [+=+ _sigid_] (+and+ _sigid_ [+=+ _sigid_])^*^
12046* +functor+ _funid_ [+=+ _funid_] (+and+ _funid_ [+=+ _funid_])^*^
12047* __path__++.sml++, __path__++.sig++, or __path__++.fun++
12048* __path__++.mlb++
12049* +ann+ ++"++_ann_++"++ +in+ _basdec_ +end+
12050
A basis expression (_basexp_) must be one of the following forms.
12052
12053* +bas+ _basdec_ +end+
12054* _basid_
12055* +let+ _basdec_ +in+ _basexp_ +end+
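
As a sketch of how these forms combine, the fragment below binds two bases to names and then opens them. The basis identifiers `B` and `U` and the files `util.sml` and `main.sml` are hypothetical.

----
(* bind the Basis Library to the basis identifier B *)
basis B = bas $(SML_LIB)/basis/basis.mlb end

(* U holds only the bindings of util.sml, which is elaborated with B opened *)
basis U =
   let
      open B
   in
      bas
         util.sml
      end
   end

(* make both bases visible to main.sml *)
open B U
main.sml
----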
12056
12057Nested SML-style comments (enclosed with `(*` and `*)`) are ignored
12058(but <:LineDirective:>s are recognized).
12059
12060Paths can be relative or absolute. Relative paths are relative to the
12061directory containing the MLB file. Paths may include path variables
12062and are expanded according to a <:MLBasisPathMap:path map>. Unquoted
12063paths may include alpha-numeric characters and the symbols "`-`" and
12064"`_`", along with the arc separator "`/`" and extension separator
12065"`.`". More complicated paths, including paths with spaces, may be
12066included by quoting the path with `"`. A quoted path is lexed as an
12067SML string constant.
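
The following sketch shows the three kinds of path side by side; apart from `basis.mlb`, all file and directory names are hypothetical.

----
(* path variable, expanded according to the path map *)
$(SML_LIB)/basis/basis.mlb

(* relative to the directory containing this MLB file *)
src/util.sml

(* quoted because the path contains spaces *)
"third party/my lib.sml"
----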
12068
12069<:MLBasisAnnotations:Annotations> allow a library author to
12070control options that affect the elaboration of SML source files.
12071
12072== Semantics ==
12073
12074There is a <!Attachment(MLBasis,mlb-formal.pdf,formal semantics)> for
12075ML Basis files in the style of the
12076<:DefinitionOfStandardML:Definition>. Here, we give an informal
12077explanation.
12078
12079An SML structure is a collection of types, values, and other
12080structures. Similarly, a basis is a collection, but of more kinds of
12081objects: types, values, structures, fixities, signatures, functors,
12082and other bases.
12083
12084A basis declaration denotes a basis. A structure, signature, or
12085functor declaration denotes a basis containing the corresponding
12086module. Sequencing of basis declarations merges bases, with later
12087definitions taking precedence over earlier ones, just like sequencing
12088of SML declarations. Local declarations provide name hiding, just
12089like SML local declarations. A reference to an SML source file causes
12090the file to be elaborated in the basis extant at the point of
12091reference. A reference to an MLB file causes the basis denoted by
12092that MLB file to be imported -- the basis at the point of reference
12093does _not_ affect the imported basis.
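
One practical consequence of the last point is that an MLB file cannot rely on its importer for the Basis Library. In the sketch below (two hypothetical files, `main.mlb` and `lib.mlb`, shown in one listing), `lib.mlb` has to import `basis.mlb` itself even though `main.mlb` already did so before referring to it.

----
(* main.mlb *)
$(SML_LIB)/basis/basis.mlb
lib.mlb    (* not affected by the basis.mlb import above *)
main.sml   (* elaborated with the Basis Library and the exports of lib.mlb *)

(* lib.mlb *)
local
   $(SML_LIB)/basis/basis.mlb
in
   lib.sml (* elaborated with the Basis Library only *)
end
----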
12094
12095Basis expressions and basis identifiers allow binding a basis to a
12096name.
12097
12098An MLB file is elaborated starting in an empty basis. Each MLB file
12099is elaborated and evaluated only once, with the result being cached.
12100Subsequent references use the cached value. Thus, any observable
12101effects due to evaluation are not duplicated if the MLB file is
12102referred to multiple times.
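
To illustrate, suppose a hypothetical `init.mlb` lists a file `init.sml` whose top-level declaration prints a message, and a hypothetical `main.mlb` refers to `init.mlb` twice. Under the caching rule above, the message would be printed once.

----
(* init.mlb: imports the Basis Library, then elaborates init.sml,
 * whose top-level declaration prints a message *)
$(SML_LIB)/basis/basis.mlb
init.sml

(* main.mlb: refers to init.mlb twice *)
init.mlb
init.mlb   (* uses the cached basis; init.sml is not evaluated again *)
main.sml
----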
12103
12104<<<
12105
12106:mlton-guide-page: MLj
12107[[MLj]]
12108MLj
12109===
12110
12111http://www.dcs.ed.ac.uk/home/mlj/[MLj] is a
12112<:StandardMLImplementations:Standard ML implementation> that targets
12113Java bytecode. It is no longer maintained. It has morphed into
12114<:SMLNET:SML.NET>.
12115
12116== Also see ==
12117
12118* <!Cite(BentonEtAl98)>
12119* <!Cite(BentonKennedy99)>
12120
12121<<<
12122
12123:mlton-guide-page: MLKit
12124[[MLKit]]
12125MLKit
12126=====
12127
12128The http://sourceforge.net/apps/mediawiki/mlkit[ML Kit] is a
12129<:StandardMLImplementations:Standard ML implementation>.
12130
12131MLKit supports:
12132
12133* <:DefinitionOfStandardML:SML'97>
12134** including most of the latest <:BasisLibrary:Basis Library>
12135http://www.standardml.org/Basis[specification],
12136* <:MLBasis:ML Basis> files
12137** and separate compilation,
12138* <:Regions:Region-Based Memory Management>
12139** and <:GarbageCollection:garbage collection>,
12140* Multiple backends, including
12141** native x86,
12142** bytecode, and
12143** JavaScript (see http://www.itu.dk/people/mael/smltojs/[SMLtoJs]).
12144
12145At the time of writing, MLKit does not support:
12146
12147* concurrent programming / threads,
12148* calling from C to SML.
12149
12150<<<
12151
12152:mlton-guide-page: MLLex
12153[[MLLex]]
12154MLLex
12155=====
12156
12157<:MLLex:> is a lexical analyzer generator for <:StandardML:Standard ML>
12158modeled after the Lex lexical analyzer generator.
12159
12160A version of MLLex, ported from the <:SMLNJ:SML/NJ> sources, is
12161distributed with MLton.
12162
12163== Description ==
12164
12165MLLex takes as input the lex language as defined in the ML-Lex manual,
12166and outputs a lexical analyzer in SML.
12167
12168== Implementation ==
12169
12170* <!ViewGitFile(mlton,master,mllex/lexgen.sml)>
12171* <!ViewGitFile(mlton,master,mllex/main.sml)>
12172* <!ViewGitFile(mlton,master,mllex/call-main.sml)>
12173
12174== Details and Notes ==
12175
12176There are 3 main passes in the MLLex tool:
12177
* Source parsing. In this pass, the lex source program is parsed into an internal representation. The core of this pass is a hand-written lexer and an LL(1) parser. The output of this pass is a record of user code, rules (along with start states), and actions. (MLLex definitions are discarded.)
* DFA construction. In this pass, a DFA is constructed by the algorithm of H. Yamada et al.
12180* Output. In this pass, the generated DFA is written out as a transition table, along with a table-driven algorithm, to an SML file.
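
The idea behind the output pass can be illustrated with a small hand-written example. The following is only a sketch of a table-driven matcher, assuming a hypothetical two-state DFA that accepts one or more decimal digits. The names `class`, `trans`, `accepting`, and `longestMatch` are illustrative and do not reflect the code that MLLex actually emits.

[source,sml]
----
(* Character classes: 0 = digit, 1 = anything else. *)
fun class (c: char): int = if Char.isDigit c then 0 else 1

(* trans: state x class -> next state; ~1 means "no transition".
 * State 0 is the start state; state 1 means "seen at least one digit". *)
val trans: int vector vector =
   Vector.fromList [Vector.fromList [1, ~1],
                    Vector.fromList [1, ~1]]

val accepting: bool vector = Vector.fromList [false, true]

(* Length of the longest prefix of s, starting at index i, that the DFA
 * accepts, or NONE if no prefix is accepted. *)
fun longestMatch (s: string, i: int): int option =
   let
      fun loop (state, j, last) =
         let
            (* Remember the most recent accepting position. *)
            val last =
               if Vector.sub (accepting, state) then SOME (j - i) else last
         in
            if j >= String.size s
               then last
            else
               let
                  val next =
                     Vector.sub (Vector.sub (trans, state),
                                 class (String.sub (s, j)))
               in
                  if next < 0 then last else loop (next, j + 1, last)
               end
         end
   in
      loop (0, i, NONE)
   end
----

The driver loop is independent of the particular automaton; only the two tables change from one lexer specification to another, which is what makes a table-driven output format convenient to generate.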
12181
12182== Also see ==
12183
12184* <!Attachment(Documentation,mllex.pdf)>
12185* <:MLYacc:>
12186* <!Cite(AppelEtAl94)>
12187* <!Cite(Price09)>
12188
12189<<<
12190
12191:mlton-guide-page: MLLPTLibrary
12192[[MLLPTLibrary]]
12193MLLPTLibrary
12194============
12195
12196The
12197http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[ML-LPT Library]
12198is a support library for the <:MLULex:> scanner generator and the
12199<:MLAntlr:> parser generator. The ML-LPT Library is distributed with
12200SML/NJ.
12201
12202As of 20180119, MLton includes the ML-LPT Library synchronized with
12203SML/NJ version 110.82.
12204
12205== Usage ==
12206
12207* You can import the ML-LPT Library into an MLB file with:
12208+
12209[options="header"]
12210|=====
12211|MLB file|Description
12212|`$(SML_LIB)/mllpt-lib/mllpt-lib.mlb`|
12213|=====
12214
12215* If you are porting a project from SML/NJ's <:CompilationManager:> to
12216MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12217following map is included by default:
12218+
12219----
12220# MLLPT Library
12221$ml-lpt-lib.cm $(SML_LIB)/mllpt-lib
12222$ml-lpt-lib.cm/ml-lpt-lib.cm $(SML_LIB)/mllpt-lib/mllpt-lib.mlb
12223----
12224+
This will automatically convert a `$/ml-lpt-lib.cm` import in an input
12226`.cm` file into a `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb` import in the
12227output `.mlb` file.
12228
12229== Details ==
12230
12231{empty}
12232
12233== Patch ==
12234
12235* <!ViewGitFile(mlton,master,lib/mllpt-lib/ml-lpt.patch)>
12236
12237<<<
12238
12239:mlton-guide-page: MLmon
12240[[MLmon]]
12241MLmon
12242=====
12243
12244An `mlmon.out` file records dynamic <:Profiling:profiling> counts.
12245
12246== File format ==
12247
12248An `mlmon.out` file is a text file with a sequence of lines.
12249
12250* The string "`MLton prof`".
12251
12252* The string "`alloc`", "`count`", or "`time`", depending on the kind
12253of profiling information, corresponding to the command-line argument
12254supplied to `mlton -profile`.
12255
12256* The string "`current`" or "`stack`" depending on whether profiling
12257data was gathered for only the current function (the top of the stack)
12258or for all functions on the stack. This corresponds to whether the
12259executable was compiled with `-profile-stack false` or `-profile-stack
12260true`.
12261
12262* The magic number of the executable.
12263
* The number of non-GC ticks, followed by a space, then the number of
GC ticks.
12266
12267* The number of (split) functions for which data is recorded.
12268
12269* A line for each (split) function with counts. Each line contains an
12270integer count of the number of ticks while the function was current.
12271In addition, if stack data was gathered (`-profile-stack true`), then
12272the line contains two additional tick counts:
12273
12274** the number of ticks while the function was on the stack.
12275** the number of ticks while the function was on the stack and a GC
12276 was performed.
12277
12278* The number of (master) functions for which data is recorded.
12279
12280* A line for each (master) function with counts. The lines have the
12281same format and meaning as with split-function counts.
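
The leading lines of the format can be read with a few lines of SML. The following is a minimal sketch, assuming the file has already been split into lines without trailing newlines. The `kind` and `style` datatypes, the `header` record, and `parseHeader` are hypothetical names introduced for illustration and are not part of MLton. The magic number is kept as a string because its exact form is not described here.

[source,sml]
----
datatype kind = Alloc | Count | Time
datatype style = Current | Stack

type header = {kind: kind, style: style, magic: string,
               nonGCTicks: LargeInt.int, gcTicks: LargeInt.int}

(* Parse the first five lines of an mlmon.out file; the remaining
 * lines (function counts) are ignored by this sketch. *)
fun parseHeader (lines: string list): header option =
   case lines of
      "MLton prof" :: kind :: style :: magic :: ticks :: _ =>
         let
            val kind =
               case kind of
                  "alloc" => SOME Alloc
                | "count" => SOME Count
                | "time" => SOME Time
                | _ => NONE
            val style =
               case style of
                  "current" => SOME Current
                | "stack" => SOME Stack
                | _ => NONE
            (* Two tick counts separated by a space. *)
            val ticks =
               List.map LargeInt.fromString (String.tokens Char.isSpace ticks)
         in
            case (kind, style, ticks) of
               (SOME k, SOME s, [SOME n, SOME g]) =>
                  SOME {kind = k, style = s, magic = magic,
                        nonGCTicks = n, gcTicks = g}
             | _ => NONE
         end
    | _ => NONE
----

Returning `NONE` for any malformed header keeps the sketch total; a real tool would likely report which line failed to parse.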
12282
12283<<<
12284
12285:mlton-guide-page: MLNLFFI
12286[[MLNLFFI]]
12287MLNLFFI
12288=======
12289
12290<!Cite(Blume01, ML-NLFFI)> is the no-longer-foreign-function interface
12291library for SML.
12292
12293As of 20050212, MLton has an initial port of ML-NLFFI from SML/NJ to
12294MLton. All of the ML-NLFFI functionality is present.
12295
12296Additionally, MLton has an initial port of the
12297<:MLNLFFIGen:mlnlffigen> tool from SML/NJ to MLton. Due to low-level
12298details, the code generated by SML/NJ's `ml-nlffigen` is not
12299compatible with MLton, and vice-versa. However, the generated code
12300has the same interface, so portable client code can be written.
12301MLton's `mlnlffigen` does not currently support C functions with
12302`struct` or `union` arguments.
12303
12304== Usage ==
12305
* You can import the ML-NLFFI Library into an MLB file with:
12307+
12308[options="header"]
12309|=====
12310|MLB file|Description
12311|`$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb`|
12312|=====
12313
12314* If you are porting a project from SML/NJ's <:CompilationManager:> to
12315MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12316following maps are included by default:
12317+
12318----
12319# MLNLFFI Library
12320$c $(SML_LIB)/mlnlffi-lib
12321$c/c.cm $(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb
12322----
12323+
12324This will automatically convert a `$/c.cm` import in an input `.cm`
12325file into a `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb` import in the
12326output `.mlb` file.
12327
12328== Also see ==
12329
12330* <!Cite(Blume01)>
12331* <:MLNLFFIImplementation:>
12332* <:MLNLFFIGen:>
12333
12334<<<
12335
12336:mlton-guide-page: MLNLFFIGen
12337[[MLNLFFIGen]]
12338MLNLFFIGen
12339==========
12340
12341`mlnlffigen` generates a <:MLNLFFI:> binding from a collection of `.c`
12342files. It is based on the <:CKitLibrary:>, which is primarily designed
12343to handle standardized C and thus does not understand many (any?)
12344compiler extensions; however, it attempts to recover from errors when
12345seeing unrecognized definitions.
12346
12347In order to work around common gcc extensions, it may be useful to add
12348`-cppopt` options to the command line; for example
12349`-cppopt '-D__extension__'` may be occasionally useful. Fortunately,
12350most portable libraries largely avoid the use of these types of
12351extensions in header files.
12352
12353`mlnlffigen` will normally not generate bindings for `#included`
12354files; see `-match` and `-allSU` if this is desirable.
12355
12356<<<
12357
12358:mlton-guide-page: MLNLFFIImplementation
12359[[MLNLFFIImplementation]]
12360MLNLFFIImplementation
12361=====================
12362
12363MLton's implementation(s) of the <:MLNLFFI:> library differs from the
12364SML/NJ implementation in two important ways:
12365
12366* MLton cannot utilize the `Unsafe.cast` "cheat" described in Section
123673.7 of <!Cite(Blume01)>. (MLton's representation of
12368<:Closure:closures> and
12369<:PackedRepresentation:aggressive representation> optimizations make
12370an `Unsafe.cast` even more "unsafe" than in other implementations.)
12371+
12372--
12373We have considered two solutions:
12374
12375** One solution is to utilize an additional type parameter (as
12376described in Section 3.7 of <!Cite(Blume01)>):
12377+
12378--
12379__________
12380[source,sml]
12381----
12382signature C = sig
12383 type ('t, 'f, 'c) obj
12384 eqtype ('t, 'f, 'c) obj'
12385 ...
12386 type ('o, 'f) ptr
12387 eqtype ('o, 'f) ptr'
12388 ...
12389 type 'f fptr
 type 'f fptr'
12391 ...
12392 structure T : sig
12393 type ('t, 'f) typ
12394 ...
12395 end
12396end
12397----
12398
The rule for `('t, 'f, 'c) obj`, `('t, 'f, 'c) ptr`, and also `('t, 'f)
12400T.typ` is that whenever `F fptr` occurs within the instantiation of
12401`'t`, then `'f` must be instantiated to `F`. In all other cases, `'f`
12402will be instantiated to `unit`.
12403__________
12404
12405(In the actual MLton implementation, an abstract type `naf`
12406(not-a-function) is used instead of `unit`.)
12407
12408While this means that type-annotated programs may not type-check under
12409both the SML/NJ implementation and the MLton implementation, this
12410should not be a problem in practice. Tools, like `ml-nlffigen`, which
12411are necessarily implementation dependent (in order to make
12412<:CallingFromSMLToCFunctionPointer:calls through a C function
12413pointer>), may be easily extended to emit the additional type
12414parameter. Client code which uses such generated glue-code (e.g.,
12415Section 1 of <!Cite(Blume01)>) need rarely write type-annotations,
12416thanks to the magic of type inference.
12417--
12418
12419** The above implementation suffers from two disadvantages.
12420+
12421--
12422First, it changes the MLNLFFI Library interface, meaning that the same
12423program may not type-check under both the SML/NJ implementation and
12424the MLton implementation (though, in light of type inference and the
12425richer `MLRep` structure provided by MLton, this point is mostly
12426moot).
12427
12428Second, it appears to unnecessarily duplicate type information. For
12429example, an external C variable of type `int (* f[3])(int)` (that is,
12430an array of three function pointers), would be represented by the SML
12431type `(((sint -> sint) fptr, dec dg3) arr, sint -> sint, rw) obj`.
12432One might well ask why the `'f` instantiation (`sint -> sint` in this
12433case) cannot be _extracted_ from the `'t` instantiation
12434(`((sint -> sint) fptr, dec dg3) arr` in this case), obviating the
12435need for a separate _function-type_ type argument. There are a number
of components to a complete answer to this question. Foremost is the
12437fact that <:StandardML: Standard ML> supports neither (general)
12438type-level functions nor intensional polymorphism.
12439
A more direct answer for MLNLFFI is that in the SML/NJ implementation,
the definitions of the types `('t, 'c) obj` and `('t, 'c) ptr` are made
12442in such a way that the type variables `'t` and `'c` are <:PhantomType:
12443phantom> (not contributing to the run-time representation of an
12444`('t, 'c) obj` or `('t, 'c) ptr` value), despite the fact that the
12445types `((sint -> sint) fptr, rw) ptr` and
12446`((double -> double) fptr, rw) ptr` necessarily carry distinct (and
12447type incompatible) run-time (C-)type information (RTTI), corresponding
12448to the different calling conventions of the two C functions. The
12449`Unsafe.cast` "cheat" overcomes the type incompatibility without
12450introducing a new type variable (as in the first solution above).
12451
Hence, the reason that the _function-type_ type cannot be extracted from
12453the `'t` type variable instantiation is that the type of the
12454representation of RTTI doesn't even _see_ the (phantom) `'t` type
12455variable. The solution which presents itself is to give up on the
12456phantomness of the `'t` type variable, making it available to the
12457representation of RTTI.
12458
12459This is not without some small drawbacks. Because many of the types
12460used to instantiate `'t` carry more structure than is strictly
12461necessary for `'t`'s RTTI, it is sometimes necessary to wrap and
12462unwrap RTTI to accommodate the additional structure. (In the other
12463implementations, the corresponding operations can pass along the RTTI
12464unchanged.) However, these coercions contribute minuscule overhead;
12465in fact, in a majority of cases, MLton's optimizations will completely
12466eliminate the RTTI from the final program.
12467--
12468
12469The implementation distributed with MLton uses the second solution.
12470
12471Bonus question: Why can't one use a <:UniversalType: universal type>
12472to eliminate the use of `Unsafe.cast`?
12473
12474** Answer: ???
12475--
12476
12477* MLton (in both of the above implementations) provides a richer
12478`MLRep` structure, utilizing ++Int__<N>__++ and ++Word__<N>__++
12479structures.
12480+
12481--
[source,sml]
----
structure MLRep = struct
   structure Char =
      struct
         structure Signed = Int8
         structure Unsigned = Word8
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Short =
      struct
         structure Signed = Int16
         structure Unsigned = Word16
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Int =
      struct
         structure Signed = Int32
         structure Unsigned = Word32
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Long =
      struct
         structure Signed = Int32
         structure Unsigned = Word32
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure LongLong =
      struct
         structure Signed = Int64
         structure Unsigned = Word64
         (* word-style bit-operations on integers... *)
         structure SignedBitops = IntBitOps(structure I = Signed
                                            structure W = Unsigned)
      end
   structure Float = Real32
   structure Double = Real64
end
----
12529
12530This would appear to be a better interface, even when an
12531implementation must choose `Int32` and `Word32` as the representation
12532for smaller C-types.
12533--
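
The comment "word-style bit-operations on integers" can be made concrete with a small example. The following is a minimal sketch, assuming only the Basis Library `Int32` and `Word32` structures, of how a bitwise operation on signed integers can be routed through the matching unsigned word type. The function `andbSigned` is a hypothetical helper and is not the library's `IntBitOps` functor, whose actual interface is not shown here.

[source,sml]
----
(* Bitwise "and" on two signed 32-bit integers, computed on their
 * two's-complement representations as 32-bit words. *)
fun andbSigned (a: Int32.int, b: Int32.int): Int32.int =
   let
      (* Signed integer -> word with the same bit pattern. *)
      val toW = Word32.fromLargeInt o Int32.toLarge
      (* Word -> signed integer, sign-extending the top bit. *)
      val fromW = Int32.fromLarge o Word32.toLargeIntX
   in
      fromW (Word32.andb (toW a, toW b))
   end
----

Pairing each `Signed` structure with an `Unsigned` structure of the same width is what makes this kind of round trip well defined for every C integer type.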
12534
12535<<<
12536
12537:mlton-guide-page: MLRISCLibrary
12538[[MLRISCLibrary]]
12539MLRISCLibrary
12540=============
12541
12542The http://www.cs.nyu.edu/leunga/www/MLRISC/Doc/html/index.html[MLRISC
12543Library] is a framework for retargetable and optimizing compiler back
12544ends. The MLRISC Library is distributed with SML/NJ. Due to
12545differences between SML/NJ and MLton, this library will not work
out of the box with MLton.
12547
12548As of 20180119, MLton includes a port of the MLRISC Library
12549synchronized with SML/NJ version 110.82.
12550
12551== Usage ==
12552
12553* You can import a sub-library of the MLRISC Library into an MLB file with:
12554+
[options="header"]
|=====
|MLB file|Description
|`$(SML_LIB)/mlrisc-lib/mlb/ALPHA.mlb`|The ALPHA backend
|`$(SML_LIB)/mlrisc-lib/mlb/AMD64.mlb`|The AMD64 backend
|`$(SML_LIB)/mlrisc-lib/mlb/AMD64-Peephole.mlb`|The AMD64 peephole optimizer
|`$(SML_LIB)/mlrisc-lib/mlb/CCall.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/CCall-sparc.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86-64.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/Control.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/Graphs.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/HPPA.mlb`|The HPPA backend
|`$(SML_LIB)/mlrisc-lib/mlb/IA32.mlb`|The IA32 backend
|`$(SML_LIB)/mlrisc-lib/mlb/IA32-Peephole.mlb`|The IA32 peephole optimizer
|`$(SML_LIB)/mlrisc-lib/mlb/Lib.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/MLTREE.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/Peephole.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/PPC.mlb`|The PPC backend
|`$(SML_LIB)/mlrisc-lib/mlb/RA.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/SPARC.mlb`|The Sparc backend
|`$(SML_LIB)/mlrisc-lib/mlb/StagedAlloc.mlb`|
|`$(SML_LIB)/mlrisc-lib/mlb/Visual.mlb`|
|=====
12580
12581* If you are porting a project from SML/NJ's <:CompilationManager:> to
12582MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
12583following map is included by default:
12584+
12585----
12586# MLRISC Library
12587$SMLNJ-MLRISC $(SML_LIB)/mlrisc-lib/mlb
12588----
12589+
12590This will automatically convert a `$SMLNJ-MLRISC/MLRISC.cm` import in
12591an input `.cm` file into a `$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb`
12592import in the output `.mlb` file.
12593
12594== Details ==
12595
12596The following changes were made to the MLRISC Library, in addition to
12597deriving the `.mlb` files from the `.cm` files:
12598
12599* eliminate sequential `withtype` expansions: Most could be rewritten as a sequence of type definitions and datatype definitions.
12600* eliminate higher-order functors: Every higher-order functor definition and application could be uncurried in the obvious way.
12601* eliminate `where <str> = <str>`: Quite painful to expand out all the flexible types in the respective structures. Furthermore, many of the implied type equalities aren't needed, but it's too hard to pick out the right ones.
* `library/array-noneq.sml` (added, not exported): Implements `signature ARRAY_NONEQ`, similar to `signature ARRAY` from the <:BasisLibrary:Basis Library>, but replacing the latter's `eqtype 'a array = 'a array` and `type 'a vector = 'a Vector.vector` with `type 'a array` and `type 'a vector`. Thus, array-like containers may match `ARRAY_NONEQ`, whereas only the pervasive `'a array` container may match `ARRAY`. (SML/NJ's implementation of `signature ARRAY` omits the type realizations.)
* `library/dynamic-array.sml` and `library/hash-array.sml` (modified): Replace `include ARRAY` with `include ARRAY_NONEQ`; see above.
12604
12605== Patch ==
12606
12607* <!ViewGitFile(mlton,master,lib/mlrisc-lib/MLRISC.patch)>
12608
12609<<<
12610
12611:mlton-guide-page: MLtonArray
12612[[MLtonArray]]
12613MLtonArray
12614==========
12615
12616[source,sml]
12617----
12618signature MLTON_ARRAY =
12619 sig
12620 val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a array * 'b
12621 end
12622----
12623
12624* `unfoldi (n, b, f)`
12625+
12626constructs an array _a_ of length `n`, whose elements _a~i~_ are
12627determined by the equations __b~0~ = b__ and
12628__(a~i~, b~i+1~) = f (i, b~i~)__.
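
To make the equations concrete, the following is a minimal sketch of a portable function with the same type, written against the Basis Library only. It is an illustrative reference definition under the assumption that `f` is applied in order of increasing index; it is not MLton's implementation of `MLton.Array.unfoldi`. The names `fibs` and `rest` are hypothetical.

[source,sml]
----
(* Thread a state of type 'b through the construction of the array. *)
fun unfoldi (n: int, b: 'b, f: int * 'b -> 'a * 'b): 'a array * 'b =
   let
      val state = ref b
      (* Array.tabulate defines elements in order of increasing index. *)
      val a =
         Array.tabulate
         (n, fn i =>
          let
             val (x, b') = f (i, !state)
          in
             state := b'
             ; x
          end)
   in
      (a, !state)
   end

(* The first six Fibonacci numbers; the state is the next two values. *)
val (fibs, rest) =
   unfoldi (6, (0, 1), fn (_, (x, y)) => (x, (y, x + y)))
----

Here `fibs` holds the elements __a~0~__ through __a~5~__, and `rest` is the final state __b~6~__, which can be used to continue the sequence.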
12629
12630<<<
12631
12632:mlton-guide-page: MLtonBinIO
12633[[MLtonBinIO]]
12634MLtonBinIO
12635==========
12636
12637[source,sml]
12638----
12639signature MLTON_BIN_IO = MLTON_IO
12640----
12641
12642See <:MLtonIO:>.
12643
12644<<<
12645
12646:mlton-guide-page: MLtonCont
12647[[MLtonCont]]
12648MLtonCont
12649=========
12650
12651[source,sml]
12652----
12653signature MLTON_CONT =
12654 sig
12655 type 'a t
12656
12657 val callcc: ('a t -> 'a) -> 'a
12658 val isolate: ('a -> unit) -> 'a t
12659 val prepend: 'a t * ('b -> 'a) -> 'b t
12660 val throw: 'a t * 'a -> 'b
12661 val throw': 'a t * (unit -> 'a) -> 'b
12662 end
12663----
12664
12665* `type 'a t`
12666+
12667the type of continuations that expect a value of type `'a`.
12668
12669* `callcc f`
12670+
12671applies `f` to the current continuation. This copies the entire
12672stack; hence, `callcc` takes time proportional to the size of the
12673current stack.
12674
12675* `isolate f`
12676+
12677creates a continuation that evaluates `f` in an empty context. This
12678is a constant time operation, and yields a constant size stack.
12679
12680* `prepend (k, f)`
12681+
12682composes a function `f` with a continuation `k` to create a
12683continuation that first does `f` and then does `k`. This is a
12684constant time operation.
12685
12686* `throw (k, v)`
12687+
12688throws value `v` to continuation `k`. This copies the entire stack of
12689`k`; hence, `throw` takes time proportional to the size of this stack.
12690
12691* `throw' (k, th)`
12692+
12693a generalization of throw that evaluates `th ()` in the context of
12694`k`. Thus, for example, if `th ()` raises an exception or captures
12695another continuation, it will see `k`, not the current continuation.
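
As an illustration of `callcc` and `throw`, the following is a minimal sketch of an early exit from a list traversal. It assumes the program is compiled with MLton, so that the `MLton.Cont` structure is available. The function `firstNegative` is a hypothetical example and is not part of the library.

[source,sml]
----
structure C = MLton.Cont

(* Return the first negative element of xs, if any, abandoning the
 * rest of the traversal by throwing to the captured continuation. *)
fun firstNegative (xs: int list): int option =
   C.callcc
   (fn k =>
    (List.app (fn x => if x < 0 then C.throw (k, SOME x) else ()) xs
     ; NONE))
----

Because `callcc` copies the stack, this style is best kept for cases where ordinary exceptions do not fit; for a simple early exit like this one, an exception would usually be cheaper.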
12696
12697
12698== Also see ==
12699
12700* <:MLtonContIsolateImplementation:>
12701
12702<<<
12703
12704:mlton-guide-page: MLtonContIsolateImplementation
12705[[MLtonContIsolateImplementation]]
12706MLtonContIsolateImplementation
12707==============================
12708
12709As noted before, it is fairly easy to get the operational behavior of `isolate` with just `callcc` and `throw`, but establishing the right space behavior is trickier. Here, we show how to start from the obvious, but inefficient, implementation of `isolate` using only `callcc` and `throw`, and 'derive' an equivalent, but more efficient, implementation of `isolate` using MLton's primitive stack capture and copy operations. This isn't a formal derivation, as we are not formally showing the equivalence of the programs (though I believe that they are all equivalent, modulo the space behavior).
12710
12711Here is a direct implementation of isolate using only `callcc` and `throw`:
12712
12713[source,sml]
12714----
12715val isolate: ('a -> unit) -> 'a t =
12716 fn (f: 'a -> unit) =>
12717 callcc
12718 (fn k1 =>
12719 let
12720 val x = callcc (fn k2 => throw (k1, k2))
12721 val _ = (f x ; Exit.topLevelSuffix ())
12722 handle exn => MLtonExn.topLevelHandler exn
12723 in
12724 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12725 end)
12726----
12727
12728
12729We use the standard nested `callcc` trick to return a continuation that is ready to receive an argument, execute the isolated function, and exit the program. Both `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program.
12730
12731Throwing to an isolated function will execute the function in a 'semantically' empty context, in the sense that we never re-execute the 'original' continuation of the call to isolate (i.e., the context that was in place at the time `isolate` was called). However, we assume that the compiler isn't able to recognize that the 'original' continuation is unused; for example, while we (the programmer) know that `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program, the compiler may only see opaque calls to unknown foreign-functions. So, that original continuation (in its entirety) is part of the continuation returned by `isolate` and throwing to the continuation returned by `isolate` will execute `f x` (with the exit wrapper) in the context of that original continuation. Thus, the garbage collector will retain everything reachable from that original continuation during the evaluation of `f x`, even though it is 'semantically' garbage.
12732
Note that this space-leak is independent of the implementation of continuations (it arises in MLton's stack-copying implementation of continuations and would also arise in SML/NJ's CPS-translation implementation); we are only assuming that the implementation can't 'see' the program termination, and so must retain the original continuation (and anything reachable from it).
12734
12735So, we need an 'empty' continuation in which to execute `f x`. (No surprise there, as that is the written description of `isolate`.) To do this, we capture a top-level continuation and throw to that in order to execute `f x`:
12736
12737[source,sml]
12738----
12739local
12740val base: (unit -> unit) t =
12741 callcc
12742 (fn k1 =>
12743 let
12744 val th = callcc (fn k2 => throw (k1, k2))
12745 val _ = (th () ; Exit.topLevelSuffix ())
12746 handle exn => MLtonExn.topLevelHandler exn
12747 in
12748 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12749 end)
12750in
12751val isolate: ('a -> unit) -> 'a t =
12752 fn (f: 'a -> unit) =>
12753 callcc
12754 (fn k1 =>
12755 let
12756 val x = callcc (fn k2 => throw (k1, k2))
12757 in
12758 throw (base, fn () => f x)
12759 end)
12760end
12761----
12762
12763
12764We presume that `base` is evaluated 'early' in the program. There is a subtlety here, because one needs to believe that this `base` continuation (which technically corresponds to the entire rest of the program evaluation) 'works' as an empty context; in particular, we want it to be the case that executing `f x` in the `base` context retains less space than executing `f x` in the context in place at the call to `isolate` (as occurred in the previous implementation of `isolate`). This isn't particularly easy to believe if one takes a normal substitution-based operational semantics, because it seems that the context captured and bound to `base` is arbitrarily large. However, this context is mostly unevaluated code; the only heap-allocated values that are reachable from it are those that were evaluated before the evaluation of `base` (and used in the program after the evaluation of `base`). Assuming that `base` is evaluated 'early' in the program, we conclude that there are few heap-allocated values reachable from its continuation. In contrast, the previous implementation of `isolate` could capture a context that has many heap-allocated values reachable from it (because we could evaluate `isolate f` 'late' in the program and 'deep' in a call stack), which would all remain reachable during the evaluation of
12765`f x`. [We'll return to this point later, as it is taking a slightly MLton-esque view of the evaluation of a program, and may not apply as strongly to other implementations (e.g., SML/NJ).]
12766
12767Now, once we throw to `base` and begin executing `f x`, only the heap-allocated values reachable from `f` and `x` and the few heap-allocated values reachable from `base` are retained by the garbage collector. So, it seems that `base` 'works' as an empty context.
12768
12769But, what about the continuation returned from `isolate f`? Note that the continuation returned by `isolate` is one that receives an argument `x` and then
12770throws to `base` to evaluate `f x`. If we used a CPS-translation implementation (and assume sufficient beta-contractions to eliminate administrative redexes), then the original continuation passed to `isolate` (i.e., the continuation bound to `k1`) will not be free in the continuation returned by `isolate f`. Rather, the only free variables in the continuation returned by `isolate f` will be `base` and `f`, so the only heap-allocated values reachable from the continuation returned by `isolate f` will be those values reachable from `base` (assumed to be few) and those values reachable from `f` (necessary in order to execute `f` at some later point).
12771
But, MLton doesn't use a CPS-translation implementation. Rather, at each call to `callcc` in the body of `isolate`, MLton will copy the current execution stack. Thus, `k2` (the continuation returned by `isolate f`) will include the execution stack at the time of the call to `isolate f`; that is, it will include the 'original' continuation of the call to `isolate f`. Thus, the heap-allocated values reachable from the continuation returned by `isolate f` will include those values reachable from `base`, those values reachable from `f`, and those values reachable from the original continuation of the call to `isolate f`. So, just holding on to the continuation returned by `isolate f` will retain all of the heap-allocated values live at the time `isolate f` was called. This leaks space, since, 'semantically', the
continuation returned by `isolate f` only needs the heap-allocated values reachable from `f` (and `base`).
12774
In practice, this probably isn't a significant issue. A common use of `isolate` is to implement `abort`:
12776[source,sml]
12777----
12778fun abort th = throw (isolate th, ())
12779----
12780
12781The continuation returned by `isolate th` is dead immediately after being thrown to -- the continuation isn't retained, so neither is the 'semantic'
12782garbage it would have retained.
12783
12784But, it is easy enough to 'move' onto the 'empty' context `base` the capturing of the context that we want to be returned by `isolate f`:
12785
12786[source,sml]
12787----
12788local
12789val base: (unit -> unit) t =
12790 callcc
12791 (fn k1 =>
12792 let
12793 val th = callcc (fn k2 => throw (k1, k2))
12794 val _ = (th () ; Exit.topLevelSuffix ())
12795 handle exn => MLtonExn.topLevelHandler exn
12796 in
12797 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12798 end)
12799in
12800val isolate: ('a -> unit) -> 'a t =
12801 fn (f: 'a -> unit) =>
12802 callcc
12803 (fn k1 =>
12804 throw (base, fn () =>
12805 let
12806 val x = callcc (fn k2 => throw (k1, k2))
12807 in
12808 throw (base, fn () => f x)
12809 end))
12810end
12811----
12812
12813
This implementation now has the right space behavior; the continuation returned by `isolate f` will only retain the heap-allocated values reachable from `f` and from `base`. (Technically, the continuation will retain two copies of the stack that was in place at the time `base` was evaluated, but we are assuming that that stack is small.)
12815
12816One minor inefficiency of this implementation (given MLton's implementation of continuations) is that every `callcc` and `throw` entails copying a stack (albeit, some of them are small). We can avoid this in the evaluation of `base` by using a reference cell, because `base` is evaluated at the top-level:
12817
[source,sml]
----
local
val base: (unit -> unit) option t =
   let
      val baseRef: (unit -> unit) option t option ref = ref NONE
      (* Record the captured continuation in baseRef. *)
      val th = callcc (fn k => (baseRef := SOME k; NONE))
   in
      case th of
         NONE => (case !baseRef of
                     NONE => raise Fail "MLton.Cont.isolate: missing base"
                   | SOME base => base)
       | SOME th => let
                       val _ = (th () ; Exit.topLevelSuffix ())
                               handle exn => MLtonExn.topLevelHandler exn
                    in
                       raise Fail "MLton.Cont.isolate: return from (wrapped) func"
                    end
   end
in
val isolate: ('a -> unit) -> 'a t =
   fn (f: 'a -> unit) =>
   callcc
   (fn k1 =>
    throw (base, SOME (fn () =>
                       let
                          val x = callcc (fn k2 => throw (k1, k2))
                       in
                          throw (base, SOME (fn () => f x))
                       end)))
end
----
12851
12852
12853Now, to evaluate `base`, we only copy the stack once (instead of 3 times). Because we don't have a dummy continuation around to initialize the reference cell, the reference cell holds a continuation `option`. To distinguish between the original evaluation of `base` (when we want to return the continuation) and the subsequent evaluations of `base` (when we want to evaluate a thunk), we capture a `(unit -> unit) option` continuation.
12854
12855This seems to be as far as we can go without exploiting the concrete implementation of continuations in <:MLtonCont:>. Examining the implementation, we note that the type of
12856continuations is given by
12857[source,sml]
12858----
12859type 'a t = (unit -> 'a) -> unit
12860----
12861
12862and the implementation of `throw` is given by
12863[source,sml]
12864----
12865fun ('a, 'b) throw' (k: 'a t, v: unit -> 'a): 'b =
12866 (k v; raise Fail "MLton.Cont.throw': return from continuation")
12867
12868fun ('a, 'b) throw (k: 'a t, v: 'a): 'b = throw' (k, fn () => v)
12869----
12870
12871
12872Suffice to say, a continuation is simply a function that accepts a thunk to yield the thrown value and the body of the function performs the actual throw. Using this knowledge, we can create a dummy continuation to initialize `baseRef` and greatly simplify the body of `isolate`:
12873
[source,sml]
----
local
val base: (unit -> unit) option t =
   let
      val baseRef: (unit -> unit) option t ref =
         ref (fn _ => raise Fail "MLton.Cont.isolate: missing base")
      val th = callcc (fn k => (baseRef := k; NONE))
   in
      case th of
         NONE => !baseRef
       | SOME th => let
                       val _ = (th () ; Exit.topLevelSuffix ())
                               handle exn => MLtonExn.topLevelHandler exn
                    in
                       raise Fail "MLton.Cont.isolate: return from (wrapped) func"
                    end
   end
in
val isolate: ('a -> unit) -> 'a t =
   fn (f: 'a -> unit) =>
   fn (v: unit -> 'a) =>
   throw (base, SOME (f o v))
end
----
12900
12901
12902Note that this implementation of `isolate` makes it clear that the continuation returned by `isolate f` only retains the heap-allocated values reachable from `f` and `base`. It also retains only one copy of the stack that was in place at the time `base` was evaluated. Finally, it completely avoids making any copies of the stack that is in place at the time `isolate f` is evaluated; indeed, `isolate f` is a constant-time operation.
12903
12904Next, suppose we limited ourselves to capturing `unit` continuations with `callcc`. We can't pass the thunk to be evaluated in the 'empty' context directly, but we can use a reference cell.
12905
12906[source,sml]
12907----
12908local
12909val thRef: (unit -> unit) option ref = ref NONE
12910val base: unit t =
12911 let
12912 val baseRef: unit t ref =
12913 ref (fn _ => raise Fail "MLton.Cont.isolate: missing base")
12914 val () = callcc (fn k => baseRef := k)
12915 in
12916 case !thRef of
12917 NONE => !baseRef
12918 | SOME th =>
12919 let
12920 val _ = thRef := NONE
12921 val _ = (th () ; Exit.topLevelSuffix ())
12922 handle exn => MLtonExn.topLevelHandler exn
12923 in
12924 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12925 end
12926 end
12927in
12928val isolate: ('a -> unit) -> 'a t =
12929 fn (f: 'a -> unit) =>
12930 fn (v: unit -> 'a) =>
12931 let
12932 val () = thRef := SOME (f o v)
12933 in
12934 throw (base, ())
12935 end
12936end
12937----
12938
12939
Note that it is important to set `thRef` to `NONE` before evaluating the thunk, so that the garbage collector doesn't retain all the heap-allocated values reachable from `f` and `v` during the evaluation of `f (v ())`. This is because `thRef` is still live during the evaluation of the thunk; in particular, it was allocated before the evaluation of `base` (and used after), and so is retained by the continuation on which the thunk is evaluated.
12941
12942This implementation can be easily adapted to use MLton's primitive stack copying operations.
12943
12944[source,sml]
12945----
12946local
12947val thRef: (unit -> unit) option ref = ref NONE
12948val base: Thread.preThread =
12949 let
12950 val () = Thread.copyCurrent ()
12951 in
12952 case !thRef of
12953 NONE => Thread.savedPre ()
12954 | SOME th =>
12955 let
12956 val () = thRef := NONE
12957 val _ = (th () ; Exit.topLevelSuffix ())
12958 handle exn => MLtonExn.topLevelHandler exn
12959 in
12960 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12961 end
12962 end
12963in
12964val isolate: ('a -> unit) -> 'a t =
12965 fn (f: 'a -> unit) =>
12966 fn (v: unit -> 'a) =>
12967 let
12968 val () = thRef := SOME (f o v)
12969 val new = Thread.copy base
12970 in
12971 Thread.switchTo new
12972 end
12973end
12974----
12975
12976
12977In essence, `Thread.copyCurrent` copies the current execution stack and stores it in an implicit reference cell in the runtime system, which is fetchable with `Thread.savedPre`. When we are ready to throw to the isolated function, `Thread.copy` copies the saved execution stack (because the stack is modified in place during execution, we need to retain a pristine copy in case the isolated function itself throws to other isolated functions) and `Thread.switchTo` abandons the current execution stack, installing the newly copied execution stack.
12978
12979The actual implementation of `MLton.Cont.isolate` simply adds some `Thread.atomicBegin` and `Thread.atomicEnd` commands, which effectively protect the global `thRef` and accommodate the fact that `Thread.switchTo` does an implicit `Thread.atomicEnd` (used for leaving a signal handler thread).
12980
12981[source,sml]
12982----
12983local
12984val thRef: (unit -> unit) option ref = ref NONE
12985val base: Thread.preThread =
12986 let
12987 val () = Thread.copyCurrent ()
12988 in
12989 case !thRef of
12990 NONE => Thread.savedPre ()
12991 | SOME th =>
12992 let
12993 val () = thRef := NONE
12994 val _ = MLton.atomicEnd (* Match 1 *)
12995 val _ = (th () ; Exit.topLevelSuffix ())
12996 handle exn => MLtonExn.topLevelHandler exn
12997 in
12998 raise Fail "MLton.Cont.isolate: return from (wrapped) func"
12999 end
13000 end
13001in
13002val isolate: ('a -> unit) -> 'a t =
13003 fn (f: 'a -> unit) =>
13004 fn (v: unit -> 'a) =>
13005 let
13006 val _ = MLton.atomicBegin (* Match 1 *)
13007 val () = thRef := SOME (f o v)
13008 val new = Thread.copy base
13009 val _ = MLton.atomicBegin (* Match 2 *)
13010 in
13011 Thread.switchTo new (* Match 2 *)
13012 end
13013end
13014----
13015
13016
13017It is perhaps interesting to note that the above implementation was originally 'derived' by specializing implementations of the <:MLtonThread:> `new`, `prepare`, and `switch` functions as if their only use was in the following implementation of `isolate`:
13018
13019[source,sml]
13020----
val isolate: ('a -> unit) -> 'a t =
  fn (f: 'a -> unit) =>
  fn (v: unit -> 'a) =>
  let
     (* th must be a thunk so that it runs on the new thread,
        not in the current context *)
     val th = fn () =>
              (f (v ()) ; Exit.topLevelSuffix ())
              handle exn => MLtonExn.topLevelHandler exn
     val t = MLton.Thread.prepare (MLton.Thread.new th, ())
  in
     MLton.Thread.switch (fn _ => t)
  end
13031----
13032
13033
13034It was pleasant to discover that it could equally well be 'derived' starting from the `callcc` and `throw` implementation.
13035
As a final comment, we noted that the degree to which the context of `base` could be considered 'empty' (i.e., retaining few heap-allocated values) depended upon a slightly MLton-esque view. In particular, MLton does not heap allocate executable code. So, although the `base` context keeps a lot of unevaluated code 'live', such code is not heap allocated. In a system like SML/NJ, which does heap allocate executable code, one might want it to be the case that after throwing to an isolated function, the garbage collector retains only the code necessary to evaluate the function, and not any code that was necessary to evaluate the `base` context.
13037
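As a usage sketch of the finished function, the following shows how a client might call `MLton.Cont.isolate` and throw to the result. The names `k` and `run` are illustrative and do not appear in the MLton sources.

[source,sml]
----
(* Sketch only: `k` and `run` are illustrative names. *)
val k: string MLton.Cont.t =
   MLton.Cont.isolate (fn s => print s)

(* Throwing to `k` abandons the current stack, evaluates the isolated
   function on the 'empty' context, and then runs the top-level suffix,
   so control never returns to the caller of `run`. *)
fun run (): unit =
   MLton.Cont.throw (k, "running in the isolated context\n")
----

Calling `run ()` should print the message and then end the program, since the wrapped function is followed by `Exit.topLevelSuffix ()` in each of the implementations above.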
13038<<<
13039
13040:mlton-guide-page: MLtonCross
13041[[MLtonCross]]
13042MLtonCross
13043==========
13044
The Debian package MLton-Cross adds various targets to MLton. In
combination with the Emdebian project, this allows a Debian system to
compile SML files for other architectures.
13048
13049Currently, these targets are supported:
13050
13051* _Windows (MinGW)_
13052** -target i586-mingw32msvc (mlton-target-i586-mingw32msvc)
** -target amd64-mingw32msvc (mlton-target-amd64-mingw32msvc)
13054* _Linux (Debian)_
13055** -target alpha-linux-gnu (mlton-target-alpha-linux-gnu)
13056** -target arm-linux-gnueabi (mlton-target-arm-linux-gnueabi)
13057** -target hppa-linux-gnu (mlton-target-hppa-linux-gnu)
13058** -target i486-linux-gnu (mlton-target-i486-linux-gnu)
13059** -target ia64-linux-gnu (mlton-target-ia64-linux-gnu)
13060** -target mips-linux-gnu (mlton-target-mips-linux-gnu)
13061** -target mipsel-linux-gnu (mlton-target-mipsel-linux-gnu)
13062** -target powerpc-linux-gnu (mlton-target-powerpc-linux-gnu)
13063** -target s390-linux-gnu (mlton-target-s390-linux-gnu)
13064** -target sparc-linux-gnu (mlton-target-sparc-linux-gnu)
13065** -target x86-64-linux-gnu (mlton-target-x86-64-linux-gnu)
13066
13067
13068== Download ==
13069
MLton-Cross is kept in sync with the current MLton release.
13071
13072* <!Attachment(MLtonCross,mlton-cross_20100608.orig.tar.gz)>
13073
13074<<<
13075
13076:mlton-guide-page: MLtonExn
13077[[MLtonExn]]
13078MLtonExn
13079========
13080
13081[source,sml]
13082----
13083signature MLTON_EXN =
13084 sig
13085 val addExnMessager: (exn -> string option) -> unit
13086 val history: exn -> string list
13087
13088 val defaultTopLevelHandler: exn -> 'a
13089 val getTopLevelHandler: unit -> (exn -> unit)
13090 val setTopLevelHandler: (exn -> unit) -> unit
13091 val topLevelHandler: exn -> 'a
13092 end
13093----
13094
13095* `addExnMessager f`
13096+
13097adds `f` as a pretty-printer to be used by `General.exnMessage` for
13098converting exceptions to strings. Messagers are tried in order from
13099most recently added to least recently added.
13100
13101* `history e`
13102+
returns the call stack at the point that `e` was first raised. Each
13104element of the list is a file position. The elements are in reverse
13105chronological order, i.e. the function called last is at the front of
13106the list.
13107+
13108`history e` will return `[]` unless the program is compiled with
13109`-const 'Exn.keepHistory true'`.
13110
13111* `defaultTopLevelHandler e`
13112+
13113function that behaves as the default top level handler; that is, print
13114out the unhandled exception message for `e` and exit.
13115
13116* `getTopLevelHandler ()`
13117+
13118get the top level handler.
13119
13120* `setTopLevelHandler f`
13121+
13122set the top level handler to the function `f`. The function `f`
13123should not raise an exception or return normally.
13124
13125* `topLevelHandler e`
13126+
13127behaves as if the top level handler received the exception `e`.
13128
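The following is a minimal sketch of how these functions might be combined. The exception `Oops` and the helper `printHistory` are hypothetical names introduced for illustration.

[source,sml]
----
(* Hypothetical exception used only for this example. *)
exception Oops of int

(* Register a messager; it is consulted by General.exnMessage. *)
val () =
   MLton.Exn.addExnMessager
   (fn Oops n => SOME ("Oops " ^ Int.toString n)
     | _ => NONE)

val msg = General.exnMessage (Oops 3)

(* Print the recorded call stack of an exception, one position per line.
   The list is empty unless the program is compiled with
   -const 'Exn.keepHistory true'. *)
fun printHistory (e: exn): unit =
   List.app (fn pos => print (pos ^ "\n")) (MLton.Exn.history e)

(* Install a handler that reports the exception and exits, so that it
   neither raises nor returns normally. *)
val () =
   MLton.Exn.setTopLevelHandler
   (fn e =>
    (TextIO.output (TextIO.stdErr,
                    "fatal: " ^ General.exnMessage e ^ "\n")
     ; OS.Process.exit OS.Process.failure))
----

The handler ends with `OS.Process.exit` because the function passed to `setTopLevelHandler` should not return normally.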
13129<<<
13130
13131:mlton-guide-page: MLtonFinalizable
13132[[MLtonFinalizable]]
13133MLtonFinalizable
13134================
13135
13136[source,sml]
13137----
13138signature MLTON_FINALIZABLE =
13139 sig
13140 type 'a t
13141
13142 val addFinalizer: 'a t * ('a -> unit) -> unit
13143 val finalizeBefore: 'a t * 'b t -> unit
13144 val new: 'a -> 'a t
13145 val touch: 'a t -> unit
13146 val withValue: 'a t * ('a -> 'b) -> 'b
13147 end
13148----
13149
13150A _finalizable_ value is a container to which finalizers can be
13151attached. A container holds a value, which is reachable as long as
13152the container itself is reachable. A _finalizer_ is a function that
13153runs at some point after garbage collection determines that the
13154container to which it is attached has become
13155<:Reachability:unreachable>. A finalizer is treated like a signal
13156handler, in that it runs asynchronously in a separate thread, with
13157signals blocked, and will not interrupt a critical section (see
13158<:MLtonThread:>).
13159
13160* `addFinalizer (v, f)`
13161+
13162adds `f` as a finalizer to `v`. This means that sometime after the
13163last call to `withValue` on `v` completes and `v` becomes unreachable,
13164`f` will be called with the value of `v`.
13165
13166* `finalizeBefore (v1, v2)`
13167+
13168ensures that `v1` will be finalized before `v2`. A cycle of values
13169`v` = `v1`, ..., `vn` = `v` with `finalizeBefore (vi, vi+1)` will
13170result in none of the `vi` being finalized.
13171
13172* `new x`
13173+
13174creates a new finalizable value, `v`, with value `x`. The finalizers
13175of `v` will run sometime after the last call to `withValue` on `v`
13176when the garbage collector determines that `v` is unreachable.
13177
13178* `touch v`
13179+
13180ensures that `v`'s finalizers will not run before the call to `touch`.
13181
13182* `withValue (v, f)`
13183+
13184returns the result of applying `f` to the value of `v` and ensures
13185that `v`'s finalizers will not run before `f` completes. The call to
13186`f` is a nontail call.
13187
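As a small sketch of the interface in pure SML (no FFI), the following creates a finalizable value, attaches a finalizer, and reads the value through `withValue`. The names `v` and `doubled` are illustrative.

[source,sml]
----
structure F = MLton.Finalizable

(* Wrap an ordinary value in a finalizable container. *)
val v: int F.t = F.new 42

(* The finalizer runs at some unspecified point after `v` becomes
   unreachable; when (or whether) it runs before exit is not specified. *)
val () =
   F.addFinalizer
   (v, fn n => print ("finalizing " ^ Int.toString n ^ "\n"))

(* Access the contents; the finalizers cannot run before this completes. *)
val doubled = F.withValue (v, fn n => 2 * n)
----

Reading the value only through `withValue` keeps the container reachable for the duration of the access, which is what prevents a premature finalization.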
13188
13189== Example ==
13190
13191Suppose that `finalizable.sml` contains the following:
13192[source,sml]
13193----
13194sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/finalizable.sml]
13195----
13196
13197Suppose that `cons.c` contains the following.
13198[source,c]
13199----
13200sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/cons.c]
13201----
13202
13203We can compile these to create an executable with
13204----
13205% mlton -default-ann 'allowFFI true' finalizable.sml cons.c
13206----
13207
13208Running this executable will create output like the following.
13209----
13210% finalizable
132110x08072890 = listSing (2)
132120x080728a0 = listCons (2)
132130x080728b0 = listCons (2)
132140x080728c0 = listCons (2)
132150x080728d0 = listCons (2)
132160x080728e0 = listCons (2)
132170x080728f0 = listCons (2)
13218listSum
13219listSum(l) = 14
13220listFree (0x080728f0)
13221listFree (0x080728e0)
13222listFree (0x080728d0)
13223listFree (0x080728c0)
13224listFree (0x080728b0)
13225listFree (0x080728a0)
13226listFree (0x08072890)
13227----
13228
13229
13230== Synchronous Finalizers ==
13231
13232Finalizers in MLton are asynchronous. That is, they run at an
13233unspecified time, interrupting the user program. It is also possible,
13234and sometimes useful, to have synchronous finalizers, where the user
13235program explicitly decides when to run enabled finalizers. We have
13236considered this in MLton, and it seems possible, but there are some
13237unresolved design issues. See the thread at
13238
13239* http://www.mlton.org/pipermail/mlton/2004-September/016570.html
13240
13241== Also see ==
13242
13243* <!Cite(Boehm03)>
13244
13245<<<
13246
13247:mlton-guide-page: MLtonGC
13248[[MLtonGC]]
13249MLtonGC
13250=======
13251
13252[source,sml]
13253----
13254signature MLTON_GC =
13255 sig
13256 val collect: unit -> unit
13257 val pack: unit -> unit
13258 val setMessages: bool -> unit
13259 val setSummary: bool -> unit
13260 val unpack: unit -> unit
13261 structure Statistics :
13262 sig
13263 val bytesAllocated: unit -> IntInf.int
13264 val lastBytesLive: unit -> IntInf.int
13265 val numCopyingGCs: unit -> IntInf.int
13266 val numMarkCompactGCs: unit -> IntInf.int
13267 val numMinorGCs: unit -> IntInf.int
13268 val maxBytesLive: unit -> IntInf.int
13269 end
13270 end
13271----
13272
13273* `collect ()`
13274+
13275causes a garbage collection to occur.
13276
13277* `pack ()`
13278+
13279shrinks the heap as much as possible so that other processes can use
13280available RAM.
13281
13282* `setMessages b`
13283+
13284controls whether diagnostic messages are printed at the beginning and
13285end of each garbage collection. It is the same as the `gc-messages`
13286runtime system option.
13287
13288* `setSummary b`
13289+
13290controls whether a summary of garbage collection statistics is printed
13291upon termination of the program. It is the same as the `gc-summary`
13292runtime system option.
13293
13294* `unpack ()`
13295+
13296resizes a packed heap to the size desired by the runtime.
13297
13298* `Statistics.bytesAllocated ()`
13299+
13300returns bytes allocated (as of the most recent garbage collection).
13301
13302* `Statistics.lastBytesLive ()`
13303+
13304returns bytes live (as of the most recent garbage collection).
13305
13306* `Statistics.numCopyingGCs ()`
13307+
13308returns number of (major) copying garbage collections performed (as of
13309the most recent garbage collection).
13310
13311* `Statistics.numMarkCompactGCs ()`
13312+
13313returns number of (major) mark-compact garbage collections performed
13314(as of the most recent garbage collection).
13315
13316* `Statistics.numMinorGCs ()`
13317+
13318returns number of minor garbage collections performed (as of the most
13319recent garbage collection).
13320
13321* `Statistics.maxBytesLive ()`
13322+
13323returns maximum bytes live (as of the most recent garbage collection).
13324
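A minimal sketch of querying the statistics after forcing a collection might look as follows; the names `live` and `allocated` are illustrative.

[source,sml]
----
structure S = MLton.GC.Statistics

(* Force a collection so that the statistics are up to date. *)
val () = MLton.GC.collect ()

val live: IntInf.int = S.lastBytesLive ()
val allocated: IntInf.int = S.bytesAllocated ()

val () =
   print (concat ["live bytes: ", IntInf.toString live,
                  ", allocated bytes: ", IntInf.toString allocated, "\n"])
----

Because the statistics are reported as of the most recent garbage collection, calling `collect ()` first is one way to get current numbers.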
13325<<<
13326
13327:mlton-guide-page: MLtonIntInf
13328[[MLtonIntInf]]
13329MLtonIntInf
13330===========
13331
13332[source,sml]
13333----
13334signature MLTON_INT_INF =
13335 sig
13336 type t = IntInf.int
13337
13338 val areSmall: t * t -> bool
13339 val gcd: t * t -> t
13340 val isSmall: t -> bool
13341
13342 structure BigWord : WORD
13343 structure SmallInt : INTEGER
13344 datatype rep =
13345 Big of BigWord.word vector
13346 | Small of SmallInt.int
13347 val rep: t -> rep
13348 val fromRep : rep -> t option
13349 end
13350----
13351
13352MLton represents an arbitrary precision integer either as an unboxed
13353word with the bottom bit set to 1 and the top bits representing a
13354small signed integer, or as a pointer to a vector of words, where the
13355first word indicates the sign and the rest are the limbs of a
13356<:GnuMP:> big integer.
13357
13358* `type t`
13359+
13360the same as type `IntInf.int`.
13361
13362* `areSmall (a, b)`
13363+
13364returns true iff both `a` and `b` are small.
13365
13366* `gcd (a, b)`
13367+
uses <:GnuMP:GnuMP's> fast gcd implementation.
13369
13370* `isSmall a`
13371+
13372returns true iff `a` is small.
13373
13374* `BigWord : WORD`
13375+
13376representation of a big `IntInf.int` as a vector of words; on 32-bit
13377platforms, `BigWord` is likely to be equivalent to `Word32`, and on
1337864-bit platforms, `BigWord` is likely to be equivalent to `Word64`.
13379
13380* `SmallInt : INTEGER`
13381+
13382representation of a small `IntInf.int` as a signed integer; on 32-bit
13383platforms, `SmallInt` is likely to be equivalent to `Int32`, and on
1338464-bit platforms, `SmallInt` is likely to be equivalent to `Int64`.
13385
13386* `datatype rep`
13387+
13388the underlying representation of an `IntInf.int`.
13389
13390* `rep i`
13391+
13392returns the underlying representation of `i`.
13393
13394* `fromRep r`
13395+
13396converts from the underlying representation back to an `IntInf.int`.
13397If `fromRep r` is given anything besides the valid result of `rep i`
13398for some `i`, this function call will return `NONE`.
13399
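The following sketch exercises the functions above on one small and one large integer. The names `big`, `describe`, and `roundTrip` are illustrative, and the number of words reported for a big integer depends on the platform.

[source,sml]
----
structure I = MLton.IntInf

val big: IntInf.int = IntInf.pow (2, 200)

val smallIsSmall = I.isSmall 5       (* fits in an unboxed word *)
val bigIsSmall = I.isSmall big       (* 2^200 needs a vector of words *)
val bothSmall = I.areSmall (5, big)
val g = I.gcd (12, 18)

(* Inspect the underlying representation. *)
fun describe (i: IntInf.int): string =
   case I.rep i of
      I.Big v => "big, " ^ Int.toString (Vector.length v) ^ " words"
    | I.Small _ => "small"

(* A representation obtained from `rep` converts back to the same integer. *)
val roundTrip = I.fromRep (I.rep big)
----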
13400<<<
13401
13402:mlton-guide-page: MLtonIO
13403[[MLtonIO]]
13404MLtonIO
13405=======
13406
13407[source,sml]
13408----
13409signature MLTON_IO =
13410 sig
13411 type instream
13412 type outstream
13413
13414 val inFd: instream -> Posix.IO.file_desc
13415 val mkstemp: string -> string * outstream
13416 val mkstemps: {prefix: string, suffix: string} -> string * outstream
13417 val newIn: Posix.IO.file_desc * string -> instream
13418 val newOut: Posix.IO.file_desc * string -> outstream
13419 val outFd: outstream -> Posix.IO.file_desc
13420 val tempPrefix: string -> string
13421 end
13422----
13423
13424* `inFd ins`
13425+
13426returns the file descriptor corresponding to `ins`.
13427
13428* `mkstemp s`
13429+
like the C `mkstemp` function, generates and opens a temporary file
13431with prefix `s`.
13432
13433* `mkstemps {prefix, suffix}`
13434+
13435like `mkstemp`, except it has both a prefix and suffix.
13436
13437* `newIn (fd, name)`
13438+
13439creates a new instream from file descriptor `fd`, with `name` used in
13440any `Io` exceptions later raised.
13441
13442* `newOut (fd, name)`
13443+
13444creates a new outstream from file descriptor `fd`, with `name` used in
13445any `Io` exceptions later raised.
13446
13447* `outFd out`
13448+
13449returns the file descriptor corresponding to `out`.
13450
13451* `tempPrefix s`
13452+
13453adds a suitable system or user specific prefix (directory) for temp
13454files.
13455
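As a sketch, assuming the text instance of this signature is available as `MLton.TextIO`, a temporary file can be created, written, read back, and removed as follows. The prefix string and the names `path`, `out`, and `contents` are illustrative.

[source,sml]
----
(* Build a prefix in the system's temporary directory, then create
   and open a fresh file whose name starts with that prefix. *)
val prefix = MLton.TextIO.tempPrefix "mlton-guide-example"
val (path, out) = MLton.TextIO.mkstemp prefix

val () = TextIO.output (out, "hello\n")
val () = TextIO.closeOut out

(* Read the file back through the ordinary Basis Library interface. *)
val ins = TextIO.openIn path
val contents = TextIO.inputAll ins
val () = TextIO.closeIn ins

(* The temporary file is not removed automatically. *)
val () = OS.FileSys.remove path
----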
13456<<<
13457
13458:mlton-guide-page: MLtonItimer
13459[[MLtonItimer]]
13460MLtonItimer
13461===========
13462
13463[source,sml]
13464----
13465signature MLTON_ITIMER =
13466 sig
13467 datatype t =
13468 Prof
13469 | Real
13470 | Virtual
13471
13472 val set: t * {interval: Time.time, value: Time.time} -> unit
13473 val signal: t -> Posix.Signal.signal
13474 end
13475----
13476
13477* `set (t, {interval, value})`
13478+
13479sets the interval timer (using `setitimer`) specified by `t` to the
13480given `interval` and `value`.
13481
13482* `signal t`
13483+
13484returns the signal corresponding to `t`.
13485
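A small sketch of the two functions follows. It assumes a POSIX system, where the real-time interval timer is expected to deliver `SIGALRM`; the names `sigReal` and `isAlarm` are illustrative.

[source,sml]
----
structure T = MLton.Itimer

(* Look up the signal that the real-time timer delivers. *)
val sigReal: Posix.Signal.signal = T.signal T.Real
val isAlarm = (sigReal = Posix.Signal.alrm)

(* As with setitimer, a zero `value` disarms the timer. *)
val () =
   T.set (T.Real, {interval = Time.zeroTime, value = Time.zeroTime})
----

To arm the timer, pass a nonzero `value` (time until the first expiration) and, for a periodic timer, a nonzero `interval`. A handler for the corresponding signal should be installed first, since the default action for an unhandled timer signal typically terminates the process.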
13486<<<
13487
13488:mlton-guide-page: MLtonLibraryProject
13489[[MLtonLibraryProject]]
13490MLtonLibraryProject
13491===================
13492
13493We have a https://github.com/MLton/mltonlib[MLton Library repository]
13494that is intended to collect libraries.
13495
13496=====
13497 https://github.com/MLton/mltonlib
13498=====
13499
13500Libraries are kept in the `master` branch, and are grouped according
13501to domain name, in the Java package style. For example,
13502<:VesaKarvonen:>, who works at `ssh.com`, has been putting code at:
13503
13504=====
13505 https://github.com/MLton/mltonlib/tree/master/com/ssh
13506=====
13507
13508<:StephenWeeks:>, owning `sweeks.com`, has been putting code at:
13509
13510=====
13511 https://github.com/MLton/mltonlib/tree/master/com/sweeks
13512=====
13513
13514A "library" is a subdirectory of some such directory. For example,
13515Stephen's basis-library replacement library is at
13516
13517=====
13518 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic
13519=====
13520
13521We use "transparent per-library branching" to handle library
13522versioning. Each library has an "unstable" subdirectory in which work
13523happens. When one is happy with a library, one tags it by copying it
13524to a stable version directory. Stable libraries are immutable -- when
13525one refers to a stable library, one always gets exactly the same code.
13526No one has actually made a stable library yet, but, when I'm ready to
13527tag my library, I was thinking that I would do something like copying
13528
13529=====
13530 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/unstable
13531=====
13532
13533to
13534
13535=====
13536 https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/v1
13537=====
13538
13539So far, libraries in the MLton repository have been licensed under
13540MLton's <:License:>. We haven't decided on whether that will be a
13541requirement to be in the repository or not. For the sake of
13542simplicity (a single license) and encouraging widest use of code,
13543contributors are encouraged to use that license. But it may be too
13544strict to require it.
13545
13546If someone wants to contribute a new library to our repository or to
13547work on an old one, they can make a pull request. If people want to
13548work in their own repository, they can do so -- that's the point of
13549using domain names to prevent clashes. The idea is that a user should
13550be able to bring library collections in from many different
13551repositories without problems. And those libraries could even work
13552with each other.
13553
13554At some point we may want to settle on an <:MLBasisPathMap:> variable
13555for the root of the library project. Or, we could reuse `SML_LIB`,
13556and migrate what we currently keep there into the library
13557infrastructure.
13558
13559<<<
13560
13561:mlton-guide-page: MLtonMonoArray
13562[[MLtonMonoArray]]
13563MLtonMonoArray
13564==============
13565
13566[source,sml]
13567----
13568signature MLTON_MONO_ARRAY =
13569 sig
13570 type t
13571 type elem
13572 val fromPoly: elem array -> t
13573 val toPoly: t -> elem array
13574 end
13575----
13576
13577* `type t`
13578+
13579type of monomorphic array
13580
13581* `type elem`
13582+
13583type of array elements
13584
13585* `fromPoly a`
13586+
13587type cast a polymorphic array to its monomorphic counterpart; the
13588argument and result arrays share the same identity
13589
13590* `toPoly a`
13591+
13592type cast a monomorphic array to its polymorphic counterpart; the
13593argument and result arrays share the same identity
13594
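The shared identity can be seen in a short sketch, assuming the `char` instance of this signature is available as `MLton.CharArray`. The names `poly`, `mono`, and `c` are illustrative.

[source,sml]
----
val poly: char array = Array.fromList [#"a", #"b"]

(* Cast to the monomorphic type; no elements are copied. *)
val mono: CharArray.array = MLton.CharArray.fromPoly poly

(* An update through one view is visible through the other. *)
val () = CharArray.update (mono, 0, #"z")
val c = Array.sub (poly, 0)
----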
13595<<<
13596
13597:mlton-guide-page: MLtonMonoVector
13598[[MLtonMonoVector]]
13599MLtonMonoVector
13600===============
13601
13602[source,sml]
13603----
13604signature MLTON_MONO_VECTOR =
13605 sig
13606 type t
13607 type elem
13608 val fromPoly: elem vector -> t
13609 val toPoly: t -> elem vector
13610 end
13611----
13612
13613* `type t`
13614+
13615type of monomorphic vector
13616
13617* `type elem`
13618+
13619type of vector elements
13620
13621* `fromPoly v`
13622+
13623type cast a polymorphic vector to its monomorphic counterpart; in
13624MLton, this is a constant-time operation
13625
13626* `toPoly v`
13627+
13628type cast a monomorphic vector to its polymorphic counterpart; in
13629MLton, this is a constant-time operation
13630
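As a brief sketch, assuming the `char` instance of this signature is available as `MLton.CharVector`, a polymorphic vector of characters can be cast to a string and back. The names `poly`, `mono`, and `back` are illustrative.

[source,sml]
----
val poly: char vector = Vector.fromList [#"h", #"i"]

(* CharVector.vector is the same type as string. *)
val mono: string = MLton.CharVector.fromPoly poly

val back: char vector = MLton.CharVector.toPoly mono
----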
13631<<<
13632
13633:mlton-guide-page: MLtonPlatform
13634[[MLtonPlatform]]
13635MLtonPlatform
13636=============
13637
13638[source,sml]
13639----
13640signature MLTON_PLATFORM =
13641 sig
13642 structure Arch:
13643 sig
13644 datatype t = Alpha | AMD64 | ARM | ARM64 | HPPA | IA64 | m68k
13645 | MIPS | PowerPC | PowerPC64 | S390 | Sparc | X86
13646
13647 val fromString: string -> t option
13648 val host: t
13649 val toString: t -> string
13650 end
13651
13652 structure OS:
13653 sig
13654 datatype t = AIX | Cygwin | Darwin | FreeBSD | Hurd | HPUX
13655 | Linux | MinGW | NetBSD | OpenBSD | Solaris
13656
13657 val fromString: string -> t option
13658 val host: t
13659 val toString: t -> string
13660 end
13661 end
13662----
13663
13664* `datatype Arch.t`
13665+
13666processor architectures
13667
13668* `Arch.fromString a`
13669+
13670converts from string to architecture. Case insensitive.
13671
13672* `Arch.host`
13673+
13674the architecture for which the program is compiled.
13675
13676* `Arch.toString`
13677+
13678string for architecture.
13679
13680* `datatype OS.t`
13681+
13682operating systems
13683
13684* `OS.fromString`
13685+
13686converts from string to operating system. Case insensitive.
13687
13688* `OS.host`
13689+
13690the operating system for which the program is compiled.
13691
13692* `OS.toString`
13693+
13694string for operating system.
13695
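The following sketch shows one way a program might branch on the platform it was compiled for. The names `describeHost` and `isWindows` are illustrative.

[source,sml]
----
structure Arch = MLton.Platform.Arch
structure OS' = MLton.Platform.OS

(* Human-readable description of the compilation target. *)
val describeHost: string =
   concat [Arch.toString Arch.host, "-", OS'.toString OS'.host]

(* Branch on the operating system. *)
val isWindows: bool =
   case OS'.host of
      OS'.Cygwin => true
    | OS'.MinGW => true
    | _ => false

(* Parsing is case insensitive; unknown names yield NONE. *)
val parsed = Arch.fromString (Arch.toString Arch.host)
val unknown = Arch.fromString "no-such-architecture"
----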
13696<<<
13697
13698:mlton-guide-page: MLtonPointer
13699[[MLtonPointer]]
13700MLtonPointer
13701============
13702
13703[source,sml]
13704----
13705signature MLTON_POINTER =
13706 sig
13707 eqtype t
13708
13709 val add: t * word -> t
13710 val compare: t * t -> order
13711 val diff: t * t -> word
13712 val getInt8: t * int -> Int8.int
13713 val getInt16: t * int -> Int16.int
13714 val getInt32: t * int -> Int32.int
13715 val getInt64: t * int -> Int64.int
13716 val getPointer: t * int -> t
13717 val getReal32: t * int -> Real32.real
13718 val getReal64: t * int -> Real64.real
13719 val getWord8: t * int -> Word8.word
13720 val getWord16: t * int -> Word16.word
13721 val getWord32: t * int -> Word32.word
13722 val getWord64: t * int -> Word64.word
13723 val null: t
13724 val setInt8: t * int * Int8.int -> unit
13725 val setInt16: t * int * Int16.int -> unit
13726 val setInt32: t * int * Int32.int -> unit
13727 val setInt64: t * int * Int64.int -> unit
13728 val setPointer: t * int * t -> unit
13729 val setReal32: t * int * Real32.real -> unit
13730 val setReal64: t * int * Real64.real -> unit
13731 val setWord8: t * int * Word8.word -> unit
13732 val setWord16: t * int * Word16.word -> unit
13733 val setWord32: t * int * Word32.word -> unit
13734 val setWord64: t * int * Word64.word -> unit
13735 val sizeofPointer: word
13736 val sub: t * word -> t
13737 end
13738----
13739
13740* `eqtype t`
13741+
13742the type of pointers, i.e. machine addresses.
13743
13744* `add (p, w)`
13745+
returns the pointer `w` bytes after `p`. Does not check for
13747overflow.
13748
13749* `compare (p1, p2)`
13750+
13751compares the pointer `p1` to the pointer `p2` (as addresses).
13752
13753* `diff (p1, p2)`
13754+
13755returns the number of bytes `w` such that `add (p2, w) = p1`. Does
13756not check for overflow.
13757
13758* ++get__<X>__ (p, i)++
13759+
returns the object stored at index `i` of the array of _X_ objects
13761pointed to by `p`. For example, `getWord32 (p, 7)` returns the 32-bit
13762word stored 28 bytes beyond `p`.
13763
13764* `null`
13765+
13766the null pointer, i.e. 0.
13767
13768* ++set__<X>__ (p, i, v)++
13769+
assigns `v` to the object stored at index `i` of the array of _X_
13771objects pointed to by `p`. For example, `setWord32 (p, 7, w)` stores
13772the 32-bit word `w` at the address 28 bytes beyond `p`.
13773
13774* `sizeofPointer`
13775+
13776size, in bytes, of a pointer.
13777
13778* `sub (p, w)`
13779+
13780returns the pointer `w` bytes before `p`. Does not check for
13781overflow.
13782
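The arithmetic functions can be illustrated without dereferencing anything, as in the sketch below. The names `p`, `q`, and `byteOffset` are illustrative; `byteOffset` merely restates the indexing rule used by the `get` and `set` families.

[source,sml]
----
structure P = MLton.Pointer

(* Pointer arithmetic is in bytes and is never checked. *)
val p: P.t = P.add (P.null, 0w8)
val q: P.t = P.sub (p, 0w8)

val distance: word = P.diff (p, P.null)
val ordering: order = P.compare (P.null, p)

(* Index i of an array of objects of the given size (in bytes) lies
   i * size bytes beyond the base pointer. *)
fun byteOffset {index: int, size: int}: int = index * size

(* getWord32 (p, 7) therefore reads 7 * 4 bytes beyond p. *)
val word32Offset = byteOffset {index = 7, size = 4}
----

None of these calls reads or writes memory; the `get` and `set` functions do, and should only be applied to pointers obtained from valid foreign allocations.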
13783<<<
13784
13785:mlton-guide-page: MLtonProcEnv
13786[[MLtonProcEnv]]
13787MLtonProcEnv
13788============
13789
13790[source,sml]
13791----
13792signature MLTON_PROC_ENV =
13793 sig
13794 type gid
13795
13796 val setenv: {name: string, value: string} -> unit
13797 val setgroups: gid list -> unit
13798 end
13799----
13800
13801* `setenv {name, value}`
13802+
13803like the C `setenv` function. Does not require `name` or `value` to
13804be null terminated.
13805
13806* `setgroups grps`
13807+
13808like the C `setgroups` function.
13809
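A minimal sketch of `setenv`, read back through the Basis Library, follows. The variable name `GREETING` is arbitrary.

[source,sml]
----
(* Set a variable in the environment of the current process. *)
val () = MLton.ProcEnv.setenv {name = "GREETING", value = "hello"}

(* The new binding is visible to later lookups and to child processes. *)
val greeting: string option = OS.Process.getEnv "GREETING"
----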
13810<<<
13811
13812:mlton-guide-page: MLtonProcess
13813[[MLtonProcess]]
13814MLtonProcess
13815============
13816
13817[source,sml]
13818----
13819signature MLTON_PROCESS =
13820 sig
13821 type pid
13822
13823 val spawn: {args: string list, path: string} -> pid
13824 val spawne: {args: string list, env: string list, path: string} -> pid
13825 val spawnp: {args: string list, file: string} -> pid
13826
13827 type ('stdin, 'stdout, 'stderr) t
13828
13829 type input
13830 type output
13831
13832 type none
13833 type chain
13834 type any
13835
13836 exception MisuseOfForget
13837 exception DoublyRedirected
13838
13839 structure Child:
13840 sig
13841 type ('use, 'dir) t
13842
13843 val binIn: (BinIO.instream, input) t -> BinIO.instream
13844 val binOut: (BinIO.outstream, output) t -> BinIO.outstream
13845 val fd: (Posix.FileSys.file_desc, 'dir) t -> Posix.FileSys.file_desc
13846 val remember: (any, 'dir) t -> ('use, 'dir) t
13847 val textIn: (TextIO.instream, input) t -> TextIO.instream
13848 val textOut: (TextIO.outstream, output) t -> TextIO.outstream
13849 end
13850
13851 structure Param:
13852 sig
13853 type ('use, 'dir) t
13854
13855 val child: (chain, 'dir) Child.t -> (none, 'dir) t
13856 val fd: Posix.FileSys.file_desc -> (none, 'dir) t
13857 val file: string -> (none, 'dir) t
13858 val forget: ('use, 'dir) t -> (any, 'dir) t
13859 val null: (none, 'dir) t
13860 val pipe: ('use, 'dir) t
13861 val self: (none, 'dir) t
13862 end
13863
13864 val create:
13865 {args: string list,
13866 env: string list option,
13867 path: string,
13868 stderr: ('stderr, output) Param.t,
13869 stdin: ('stdin, input) Param.t,
13870 stdout: ('stdout, output) Param.t}
13871 -> ('stdin, 'stdout, 'stderr) t
13872 val getStderr: ('stdin, 'stdout, 'stderr) t -> ('stderr, input) Child.t
13873 val getStdin: ('stdin, 'stdout, 'stderr) t -> ('stdin, output) Child.t
13874 val getStdout: ('stdin, 'stdout, 'stderr) t -> ('stdout, input) Child.t
13875 val kill: ('stdin, 'stdout, 'stderr) t * Posix.Signal.signal -> unit
13876 val reap: ('stdin, 'stdout, 'stderr) t -> Posix.Process.exit_status
13877 end
13878----
13879
13880
13881== Spawn ==
13882
13883The `spawn` functions provide an alternative to the
13884`fork`/`exec` idiom that is typically used to create a new
13885process. On most platforms, the `spawn` functions are simple
13886wrappers around `fork`/`exec`. However, under Windows, the
13887`spawn` functions are primitive. All `spawn` functions return
13888the process id of the spawned process. They differ in how the
13889executable is found and the environment that it uses.
13890
13891* `spawn {args, path}`
13892+
13893starts a new process running the executable specified by `path`
13894with the arguments `args`. Like `Posix.Process.exec`.
13895
13896* `spawne {args, env, path}`
13897+
13898starts a new process running the executable specified by `path` with
13899the arguments `args` and environment `env`. Like
13900`Posix.Process.exece`.
13901
13902* `spawnp {args, file}`
13903+
13904search the `PATH` environment variable for an executable named `file`,
13905and start a new process running that executable with the arguments
13906`args`. Like `Posix.Process.execp`.
13907
13908
13909== Create ==
13910
`MLton.Process.create` provides functionality similar to
`Unix.executeInEnv`, but provides more control over the input,
output, and error streams. In addition, `create` works on all
platforms, including Cygwin and MinGW (Windows) where `Posix.fork` is
unavailable. For greatest portability, programs should still use the
standard `Unix.execute`, `Unix.executeInEnv`, and `OS.Process.system`.
13917
13918The following types and sub-structures are used by the `create`
13919function. They provide static type checking of correct stream usage.
13920
13921=== Child ===
13922
13923* `('use, 'dir) Child.t`
13924+
13925This represents a handle to one of a child's standard streams. The
13926`'dir` is viewed with respect to the parent. Thus a `('a, input)
13927Child.t` handle means that the parent may input the output from the
13928child.
13929
13930* `Child.{bin,text}{In,Out} h`
13931+
13932These functions take a handle and bind it to a stream of the named
13933type. The type system will detect attempts to reverse the direction
13934of a stream or to use the same stream in multiple, incompatible ways.
13935
13936* `Child.fd h`
13937+
13938This function behaves like the other `Child.*` functions; it opens a
13939stream. However, it does not enforce that you read or write from the
13940handle. If you use the descriptor in an inappropriate direction, the
13941behavior is undefined. Furthermore, this function may potentially be
13942unavailable on future MLton host platforms.
13943
13944* `Child.remember h`
13945+
13946This function takes a stream of use `any` and resets the use of the
13947stream so that the stream may be used by `Child.*`. An `any` stream
13948may have had use `none` or `'use` prior to calling `Param.forget`. If
13949the stream was `none` and is used, `MisuseOfForget` is raised.
13950
13951=== Param ===
13952
13953* `('use, 'dir) Param.t`
13954+
13955This is a handle to an input/output source and will be passed to the
13956created child process. The `'dir` is relative to the child process.
13957Input means that the child process will read from this stream.
13958
13959* `Param.child h`
13960+
13961Connect the stream of the new child process to the stream of a
13962previously created child process. A single child stream should be
13963connected to only one child process or else `DoublyRedirected` will be
13964raised.
13965
13966* `Param.fd fd`
13967+
13968This creates a stream from the provided file descriptor which will be
13969closed when `create` is called. This function may not be available on
13970future MLton host platforms.
13971
13972* `Param.forget h`
13973+
13974This hides the type of the actual parameter as `any`. This is useful
13975if you are implementing an application which conditionally attaches
13976the child process to files or pipes. However, you must ensure that
13977your use after `Child.remember` matches the original type.
13978
13979* `Param.file s`
13980+
13981Open the given file and connect it to the child process. Note that the
13982file will be opened only when `create` is called. So any exceptions
13983will be raised there and not by this function. If used for `input`,
13984the file is opened read-only. If used for `output`, the file is opened
13985read-write.
13986
13987* `Param.null`
13988+
13989In some situations, the child process should have its output
13990discarded. The `null` param when passed as `stdout` or `stderr` does
13991this. When used for `stdin`, the child process will either receive
13992`EOF` or a failure condition if it attempts to read from `stdin`.
13993
13994* `Param.pipe`
13995+
13996This will connect the input/output of the child process to a pipe
13997which the parent process holds. This may later form the input to one
13998of the `Child.*` functions and/or the `Param.child` function.
13999
14000* `Param.self`
14001+
14002This will connect the input/output of the child process to the
14003corresponding stream of the parent process.
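
As a rough illustration of how `Param.*` values are handed to `create`,
the following sketch sends a child's standard output to a file. The
program path `/bin/ls` and the file name `listing.txt` are assumptions
made for the example; it also assumes that the file can be created or
opened for writing.

[source,sml]
----
open MLton.Process

(* Assumed for illustration: /bin/ls exists and listing.txt is writable. *)
val p =
   create {args = ["-l"],
           env = NONE,
           path = "/bin/ls",
           stderr = Param.self,                (* share the parent's stderr *)
           stdin = Param.null,                 (* the child reads nothing *)
           stdout = Param.file "listing.txt"}  (* opened when create runs *)

(* Every created process must be reaped (or killed). *)
val _ = reap p
----

Because every stream here is connected directly to some source, each
has use `none`, and the parent holds no handle that it must drain.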
14004
14005=== Process ===
14006
14007* `type ('stdin, 'stdout, 'stderr) t`
14008+
14009represents a handle to a child process. The type arguments capture
14010how the named stream of the child process may be used.
14011
14012* `type any`
14013+
bypasses the type system in situations where an application does not
want it to enforce correct usage. See `Child.remember` and
`Param.forget`.
14017
14018* `type chain`
14019+
14020means that the child process's stream was connected via a pipe to the
14021parent process. The parent process may pass this pipe in turn to
14022another child, thus chaining them together.
14023
14024* `type input, output`
14025+
record the direction that a stream flows. They are used as a part of
`Param.t` and `Child.t` and are detailed there.
14028
14029* `type none`
14030+
means that the child process's stream may not be used by the parent
process. This happens when the child process is connected directly to
some source.
14034+
14035The types `BinIO.instream`, `BinIO.outstream`, `TextIO.instream`,
14036`TextIO.outstream`, and `Posix.FileSys.file_desc` are also valid types
14037with which to instantiate child streams.
14038
14039* `exception MisuseOfForget`
14040+
14041may be raised if `Child.remember` and `Param.forget` are used to
14042bypass the normal type checking. This exception will only be raised
14043in cases where the `forget` mechanism allows a misuse that would be
14044impossible with the type-safe versions.
14045
14046* `exception DoublyRedirected`
14047+
raised if a stream connected to a child process is redirected to two
separate child processes. It is safe, though bad style, to use a
`Child.t` with the same `Child.*` function repeatedly.
14051
14052* `create {args, path, env, stderr, stdin, stdout}`
14053+
14054starts a child process with the given command-line `args` (excluding
14055the program name). `path` should be an absolute path to the executable
14056run in the new child process; relative paths work, but are less
14057robust. Optionally, the environment may be overridden with `env`
14058where each string element has the form `"key=value"`. The `std*`
14059options must be provided by the `Param.*` functions documented above.
14060+
14061Processes which are `create`-d must be either `reap`-ed or `kill`-ed.
14062
14063* `getStd{in,out,err} proc`
14064+
14065gets a handle to the specified stream. These should be used by the
14066`Child.*` functions. Failure to use a stream connected via pipe to a
14067child process may result in runtime dead-lock and elicits a compiler
14068warning.
14069
14070* `kill (proc, sig)`
14071+
14072terminates the child process immediately. The signal may or may not
14073mean anything depending on the host platform. A good value is
14074`Posix.Signal.term`.
14075
14076* `reap proc`
14077+
waits for the child process to terminate and returns its exit status.
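
The life cycle described above (create, use the piped stream, reap)
might be sketched as follows. The helper name `runEcho` is
hypothetical, and the sketch assumes a POSIX host that provides
`/bin/echo`.

[source,sml]
----
(* Assumed for illustration: a POSIX host that provides /bin/echo. *)
fun runEcho (): string * Posix.Process.exit_status =
   let
      open MLton.Process
      val p =
         create {args = ["hello"],
                 env = NONE,
                 path = "/bin/echo",
                 stderr = Param.self,
                 stdin = Param.null,
                 stdout = Param.pipe}
      (* Bind the child's stdout to a text stream and drain it before
         reaping, so the child never blocks on a full pipe. *)
      val ins = Child.textIn (getStdout p)
      val output = TextIO.inputAll ins
      val () = TextIO.closeIn ins
   in
      (output, reap p)
   end

val () =
   case runEcho () of
      (output, Posix.Process.W_EXITED) => print ("child printed: " ^ output)
    | _ => print "child did not exit normally\n"
----

Only one pipe connects parent and child here, so the pipe graph has no
cycle.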
14079
14080
14081== Important usage notes ==
14082
14083When building an application with many pipes between child processes,
14084it is important to ensure that there are no cycles in the undirected
14085pipe graph. If this property is not maintained, deadlocks are a very
14086serious potential bug which may only appear under difficult to
14087reproduce conditions.
14088
14089The danger lies in that most operating systems implement pipes with a
14090fixed buffer size. If process A has two output pipes which process B
14091reads, it can happen that process A blocks writing to pipe 2 because
14092it is full while process B blocks reading from pipe 1 because it is
14093empty. This same situation can happen with any undirected cycle formed
14094between processes (vertexes) and pipes (undirected edges) in the
14095graph.
14096
It is possible to make this safe using low-level I/O primitives for
polling. However, these primitives are not very portable and are
difficult to use properly. A far better approach is to make sure you
never create a cycle in the first place.
14101
For these reasons, `Unix.executeInEnv` is a very dangerous
function. Be careful when using it to ensure that the child process
only operates on either `stdin` or `stdout`, but not both.
14105
14106
14107== Example use of MLton.Process.create ==
14108
14109The following example program launches the `ipconfig` utility, pipes
14110its output through `grep`, and then reads the result back into the
14111program.
14112
[source,sml]
----
open MLton.Process
val p =
   create {args = [ "/all" ],
           env = NONE,
           path = "C:\\WINDOWS\\system32\\ipconfig.exe",
           stderr = Param.self,
           stdin = Param.null,
           stdout = Param.pipe}
val q =
   create {args = [ "IP-Ad" ],
           env = NONE,
           path = "C:\\msys\\bin\\grep.exe",
           stderr = Param.self,
           stdin = Param.child (getStdout p),
           stdout = Param.pipe}
fun suck h =
   case TextIO.inputLine h of
      NONE => ()
    | SOME s => (print ("'" ^ s ^ "'\n"); suck h)

val () = suck (Child.textIn (getStdout q))
----
14137
14138<<<
14139
14140:mlton-guide-page: MLtonProfile
14141[[MLtonProfile]]
14142MLtonProfile
14143============
14144
14145[source,sml]
14146----
14147signature MLTON_PROFILE =
14148 sig
14149 structure Data:
14150 sig
14151 type t
14152
14153 val equals: t * t -> bool
14154 val free: t -> unit
14155 val malloc: unit -> t
14156 val write: t * string -> unit
14157 end
14158
14159 val isOn: bool
14160 val withData: Data.t * (unit -> 'a) -> 'a
14161 end
14162----
14163
14164`MLton.Profile` provides <:Profiling:> control from within the
14165program, allowing you to profile individual portions of your
14166program. With `MLton.Profile`, you can create many units of profiling
14167data (essentially, mappings from functions to counts) during a run of
14168a program, switch between them while the program is running, and
14169output multiple `mlmon.out` files.
14170
14171* `isOn`
14172+
14173a compile-time constant that is false only when compiling `-profile no`.
14174
14175* `type Data.t`
14176+
14177the type of a unit of profiling data. In order to most efficiently
14178execute non-profiled programs, when compiling `-profile no` (the
14179default), `Data.t` is equivalent to `unit ref`.
14180
14181* `Data.equals (x, y)`
14182+
returns true if `x` and `y` are the same unit of profiling data.
14184
14185* `Data.free x`
14186+
14187frees the memory associated with the unit of profiling data `x`. It
14188is an error to free the current unit of profiling data or to free a
14189previously freed unit of profiling data. When compiling
14190`-profile no`, `Data.free x` is a no-op.
14191
14192* `Data.malloc ()`
14193+
14194returns a new unit of profiling data. Each unit of profiling data is
14195allocated from the process address space (but is _not_ in the MLton
14196heap) and consumes memory proportional to the number of source
14197functions. When compiling `-profile no`, `Data.malloc ()` is
14198equivalent to allocating a new `unit ref`.
14199
* `Data.write (x, f)`
+
writes the accumulated ticks in the unit of profiling data `x` to file
`f`. It is an error to write a previously freed unit of profiling
data. When compiling `-profile no`, `Data.write (x, f)` is a no-op. A
profiled program will always write the current unit of profiling data
at program exit to a file named `mlmon.out`.
14207
14208* `withData (d, f)`
14209+
14210runs `f` with `d` as the unit of profiling data, and returns the
14211result of `f` after restoring the current unit of profiling data.
14212When compiling `-profile no`, `withData (d, f)` is equivalent to
14213`f ()`.
14214
14215
14216== Example ==
14217
14218Here is an example, taken from the `examples/profiling` directory,
14219showing how to profile the executions of the `fib` and `tak` functions
14220separately. Suppose that `fib-tak.sml` contains the following.
[source,sml]
----
structure Profile = MLton.Profile

val fibData = Profile.Data.malloc ()
val takData = Profile.Data.malloc ()

fun wrap (f, d) x =
   Profile.withData (d, fn () => f x)

val rec fib =
   fn 0 => 0
    | 1 => 1
    | n => fib (n - 1) + fib (n - 2)
val fib = wrap (fib, fibData)

fun tak (x,y,z) =
   if not (y < x)
      then z
   else tak (tak (x - 1, y, z),
             tak (y - 1, z, x),
             tak (z - 1, x, y))
val tak = wrap (tak, takData)

val rec f =
   fn 0 => ()
    | n => (fib 38; f (n-1))
val _ = f 2

val rec g =
   fn 0 => ()
    | n => (tak (18,12,6); g (n-1))
val _ = g 500

fun done (data, file) =
   (Profile.Data.write (data, file)
    ; Profile.Data.free data)

val _ = done (fibData, "mlmon.fib.out")
val _ = done (takData, "mlmon.tak.out")
----
14262
Compile and run the program.
----
% mlton -profile time fib-tak.sml
% ./fib-tak
----

Separately display the profiling data for `fib`
----
% mlprof fib-tak mlmon.fib.out
5.77 seconds of CPU time (0.00 seconds GC)
function     cur
--------- -----
fib       96.9%
<unknown>  3.1%
----
and for `tak`
----
% mlprof fib-tak mlmon.tak.out
0.68 seconds of CPU time (0.00 seconds GC)
function   cur
-------- ------
tak      100.0%
----

Combine the data for `fib` and `tak` by calling `mlprof`
with multiple `mlmon.out` files.
----
% mlprof fib-tak mlmon.fib.out mlmon.tak.out mlmon.out
6.45 seconds of CPU time (0.00 seconds GC)
function     cur
--------- -----
fib       86.7%
tak       10.5%
<unknown>  2.8%
----
14298
14299<<<
14300
14301:mlton-guide-page: MLtonRandom
14302[[MLtonRandom]]
14303MLtonRandom
14304===========
14305
14306[source,sml]
14307----
14308signature MLTON_RANDOM =
14309 sig
14310 val alphaNumChar: unit -> char
14311 val alphaNumString: int -> string
14312 val rand: unit -> word
14313 val seed: unit -> word option
14314 val srand: word -> unit
14315 val useed: unit -> word option
14316 end
14317----
14318
14319* `alphaNumChar ()`
14320+
14321returns a random alphanumeric character.
14322
14323* `alphaNumString n`
14324+
14325returns a string of length `n` of random alphanumeric characters.
14326
14327* `rand ()`
14328+
14329returns the next pseudo-random number.
14330
14331* `seed ()`
14332+
returns a random word from `/dev/random`. Useful as an argument to
`srand`. If `/dev/random` cannot be read, `seed ()` returns
`NONE`. A call to `seed` may block until enough random bits are
available.
14337
14338* `srand w`
14339+
14340sets the seed used by `rand` to `w`.
14341
14342* `useed ()`
14343+
returns a random word from `/dev/urandom`. Useful as an argument to
`srand`. If `/dev/urandom` cannot be read, `useed ()` returns
`NONE`. A call to `useed` will never block; it may instead return
lower-quality random bits.
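
A typical use is to seed the generator once and then draw values. The
following is a minimal sketch; the fallback seed `0w42` is an arbitrary
value chosen for the example.

[source,sml]
----
structure Random = MLton.Random

(* Prefer the non-blocking source; fall back to a fixed seed (an
   arbitrary value chosen for this example) if it cannot be read. *)
val () =
   case Random.useed () of
      SOME w => Random.srand w
    | NONE => Random.srand 0w42

val token = Random.alphaNumString 16
val () = print (token ^ "\n")
----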
14348
14349<<<
14350
14351:mlton-guide-page: MLtonReal
14352[[MLtonReal]]
14353MLtonReal
14354=========
14355
14356[source,sml]
14357----
14358signature MLTON_REAL =
14359 sig
14360 type t
14361
14362 val fromWord: word -> t
14363 val fromLargeWord: LargeWord.word -> t
14364 val toWord: IEEEReal.rounding_mode -> t -> word
14365 val toLargeWord: IEEEReal.rounding_mode -> t -> LargeWord.word
14366 end
14367----
14368
14369* `type t`
14370+
14371the type of reals. For `MLton.LargeReal` this is `LargeReal.real`,
14372for `MLton.Real` this is `Real.real`, for `MLton.Real32` this is
14373`Real32.real`, for `MLton.Real64` this is `Real64.real`.
14374
14375* `fromWord w`
14376* `fromLargeWord w`
14377+
14378convert the word `w` to a real value. If the value of `w` is larger
14379than (the appropriate) `REAL.maxFinite`, then infinity is returned.
14380If `w` cannot be exactly represented as a real value, then the current
14381rounding mode is used to determine the resulting value.
14382
14383* `toWord mode r`
14384* `toLargeWord mode r`
14385+
14386convert the argument `r` to a word type using the specified rounding
14387mode. They raise `Overflow` if the result is not representable, in
14388particular, if `r` is an infinity. They raise `Domain` if `r` is NaN.
14389
14390* `MLton.Real32.castFromWord w`
14391* `MLton.Real64.castFromWord w`
14392+
14393convert the argument `w` to a real type as a bit-wise cast.
14394
14395* `MLton.Real32.castToWord r`
14396* `MLton.Real64.castToWord r`
14397+
14398convert the argument `r` to a word type as a bit-wise cast.
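
The difference between the numeric conversions and the bit-wise casts
can be seen in a small sketch. It assumes the default configuration,
in which real literals can be used at type `Real64.real`.

[source,sml]
----
(* Numeric conversion: the rounding mode decides how the fraction is dropped. *)
val down = MLton.Real.toWord IEEEReal.TO_ZERO 2.75     (* 0w2 *)
val up = MLton.Real.toWord IEEEReal.TO_POSINF 2.75     (* 0w3 *)

(* Bit-wise cast: the IEEE 754 double 1.0 has the bit pattern
   0x3FF0000000000000; casting back recovers the same real. *)
val bits = MLton.Real64.castToWord 1.0
val back = MLton.Real64.castFromWord bits

val () = print (Word.toString down ^ " " ^ Word.toString up ^ "\n")
val () = print (Word64.toString bits ^ "\n")
----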
14399
14400<<<
14401
14402:mlton-guide-page: MLtonRlimit
14403[[MLtonRlimit]]
14404MLtonRlimit
14405===========
14406
[source,sml]
----
signature MLTON_RLIMIT =
   sig
      structure RLim : sig
                          type t
                          val castFromSysWord: SysWord.word -> t
                          val castToSysWord: t -> SysWord.word
                       end

      val infinity: RLim.t

      type t

      val coreFileSize: t        (* CORE    max core file size *)
      val cpuTime: t             (* CPU     CPU time in seconds *)
      val dataSize: t            (* DATA    max data size *)
      val fileSize: t            (* FSIZE   Maximum filesize *)
      val numFiles: t            (* NOFILE  max number of open files *)
      val lockedInMemorySize: t  (* MEMLOCK max locked address space *)
      val numProcesses: t        (* NPROC   max number of processes *)
      val residentSetSize: t     (* RSS     max resident set size *)
      val stackSize: t           (* STACK   max stack size *)
      val virtualMemorySize: t   (* AS      virtual memory limit *)

      val get: t -> {hard: RLim.t, soft: RLim.t}
      val set: t * {hard: RLim.t, soft: RLim.t} -> unit
   end
----
14436
14437`MLton.Rlimit` provides a wrapper around the C `getrlimit` and
14438`setrlimit` functions.
14439
* `type RLim.t`
14441+
14442the type of resource limits.
14443
14444* `infinity`
14445+
14446indicates that a resource is unlimited.
14447
14448* `type t`
14449+
14450the types of resources that can be inspected and modified.
14451
14452* `get r`
14453+
14454returns the current hard and soft limits for resource `r`. May raise
14455`OS.SysErr`.
14456
14457* `set (r, {hard, soft})`
14458+
14459sets the hard and soft limits for resource `r`. May raise
14460`OS.SysErr`.
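
The following sketch reads the limit on open files and raises the soft
limit to the hard limit. The helper `show` is hypothetical and exists
only to print a limit in decimal.

[source,sml]
----
structure Rlimit = MLton.Rlimit

(* Hypothetical helper: render a limit, special-casing "unlimited". *)
fun show (lim: Rlimit.RLim.t): string =
   let
      val w = Rlimit.RLim.castToSysWord lim
   in
      if w = Rlimit.RLim.castToSysWord Rlimit.infinity
         then "unlimited"
      else SysWord.fmt StringCvt.DEC w
   end

val {hard, soft} = Rlimit.get Rlimit.numFiles
val () = print ("open files: soft=" ^ show soft ^ " hard=" ^ show hard ^ "\n")

(* Raise the soft limit to the hard limit; may raise OS.SysErr. *)
val () = Rlimit.set (Rlimit.numFiles, {hard = hard, soft = hard})
----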
14461
14462<<<
14463
14464:mlton-guide-page: MLtonRusage
14465[[MLtonRusage]]
14466MLtonRusage
14467===========
14468
14469[source,sml]
14470----
14471signature MLTON_RUSAGE =
14472 sig
14473 type t = {utime: Time.time, (* user time *)
14474 stime: Time.time} (* system time *)
14475
14476 val measureGC: bool -> unit
14477 val rusage: unit -> {children: t, gc: t, self: t}
14478 end
14479----
14480
14481* `type t`
14482+
14483corresponds to a subset of the C `struct rusage`.
14484
14485* `measureGC b`
14486+
14487controls whether garbage collection time is separately measured during
14488program execution. This affects the behavior of both `rusage` and
14489`Timer.checkCPUTimes`, both of which will return gc times of zero with
14490`measureGC false`. Garbage collection time is always measured when
14491either `gc-messages` or `gc-summary` is given as a
14492<:RunTimeOptions:runtime system option>.
14493
14494* `rusage ()`
14495+
14496corresponds to the C `getrusage` function. It returns the resource
14497usage of the exited children, the garbage collector, and the process
14498itself. The `self` component includes the usage of the `gc`
14499component, regardless of whether `measureGC` is `true` or `false`. If
14500`rusage` is used in a program, either directly, or indirectly via the
14501`Timer` structure, then `measureGC true` is automatically called at
the start of the program (it can still be disabled by user code later).
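
A minimal sketch of reading the usage figures follows. The summation
is throwaway work included only so that there is something to measure.

[source,sml]
----
structure Rusage = MLton.Rusage

val () = Rusage.measureGC true

(* Throwaway work so that there is something to measure. *)
val total = List.foldl (op +) 0 (List.tabulate (100000, fn i => i mod 7))

val {self, gc, children = _} = Rusage.rusage ()
val () =
   print (concat ["total: ", Int.toString total, "\n",
                  "user: ", Time.toString (#utime self), "s, ",
                  "system: ", Time.toString (#stime self), "s, ",
                  "gc (user): ", Time.toString (#utime gc), "s\n"])
----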
14503
14504<<<
14505
14506:mlton-guide-page: MLtonSignal
14507[[MLtonSignal]]
14508MLtonSignal
14509===========
14510
14511[source,sml]
14512----
14513signature MLTON_SIGNAL =
14514 sig
14515 type t = Posix.Signal.signal
14516 type signal = t
14517
14518 structure Handler:
14519 sig
14520 type t
14521
14522 val default: t
14523 val handler: (Thread.Runnable.t -> Thread.Runnable.t) -> t
14524 val ignore: t
14525 val isDefault: t -> bool
14526 val isIgnore: t -> bool
14527 val simple: (unit -> unit) -> t
14528 end
14529
14530 structure Mask:
14531 sig
14532 type t
14533
14534 val all: t
14535 val allBut: signal list -> t
14536 val block: t -> unit
14537 val getBlocked: unit -> t
14538 val isMember: t * signal -> bool
14539 val none: t
14540 val setBlocked: t -> unit
14541 val some: signal list -> t
14542 val unblock: t -> unit
14543 end
14544
14545 val getHandler: t -> Handler.t
14546 val handled: unit -> Mask.t
14547 val prof: t
14548 val restart: bool ref
14549 val setHandler: t * Handler.t -> unit
14550 val suspend: Mask.t -> unit
14551 val vtalrm: t
14552 end
14553----
14554
Signal handlers are functions from (runnable) threads to (runnable)
14556threads. When a signal arrives, the corresponding signal handler is
14557invoked, its argument being the thread that was interrupted by the
14558signal. The signal handler runs asynchronously, in its own thread.
14559The signal handler returns the thread that it would like to resume
14560execution (this is often the thread that it was passed). It is an
14561error for a signal handler to raise an exception that is not handled
14562within the signal handler itself.
14563
14564A signal handler is never invoked while the running thread is in a
14565critical section (see <:MLtonThread:>). Invoking a signal handler
14566implicitly enters a critical section and the normal return of a signal
14567handler implicitly exits the critical section; hence, a signal handler
14568is never interrupted by another signal handler.
14569
14570* `type t`
14571+
14572the type of signals.
14573
14574* `type Handler.t`
14575+
14576the type of signal handlers.
14577
14578* `Handler.default`
14579+
14580handles the signal with the default action.
14581
14582* `Handler.handler f`
14583+
14584returns a handler `h` such that when a signal `s` is handled by `h`,
14585`f` will be passed the thread that was interrupted by `s` and should
14586return the thread that will resume execution.
14587
14588* `Handler.ignore`
14589+
14590is a handler that will ignore the signal.
14591
14592* `Handler.isDefault`
14593+
14594returns true if the handler is the default handler.
14595
14596* `Handler.isIgnore`
14597+
14598returns true if the handler is the ignore handler.
14599
14600* `Handler.simple f`
14601+
14602returns a handler that executes `f ()` and does not switch threads.
14603
14604* `type Mask.t`
14605+
14606the type of signal masks, which are sets of blocked signals.
14607
14608* `Mask.all`
14609+
14610a mask of all signals.
14611
14612* `Mask.allBut l`
14613+
14614a mask of all signals except for those in `l`.
14615
14616* `Mask.block m`
14617+
14618blocks all signals in `m`.
14619
14620* `Mask.getBlocked ()`
14621+
14622gets the signal mask `m`, i.e. a signal is blocked if and only if it
14623is in `m`.
14624
14625* `Mask.isMember (m, s)`
14626+
14627returns true if the signal `s` is in `m`.
14628
14629* `Mask.none`
14630+
14631a mask of no signals.
14632
14633* `Mask.setBlocked m`
14634+
14635sets the signal mask to `m`, i.e. a signal is blocked if and only if
14636it is in `m`.
14637
14638* `Mask.some l`
14639+
14640a mask of the signals in `l`.
14641
14642* `Mask.unblock m`
14643+
14644unblocks all signals in `m`.
14645
14646* `getHandler s`
14647+
14648returns the current handler for signal `s`.
14649
14650* `handled ()`
14651+
14652returns the signal mask `m` corresponding to the currently handled
14653signals; i.e., a signal is handled if and only if it is in `m`.
14654
14655* `prof`
14656+
14657`SIGPROF`, the profiling signal.
14658
14659* `restart`
14660+
dynamically determines the behavior of interrupted system calls; when
`true`, interrupted system calls are restarted; when `false`,
interrupted system calls raise `OS.SysErr`.
14664
14665* `setHandler (s, h)`
14666+
14667sets the handler for signal `s` to `h`.
14668
14669* `suspend m`
14670+
14671temporarily sets the signal mask to `m` and suspends until an unmasked
14672signal is received and handled, at which point `suspend` resets the
14673mask and returns.
14674
14675* `vtalrm`
14676+
14677`SIGVTALRM`, the signal for virtual timers.
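
As an illustration of the handler and mask operations above, the
following sketch installs a simple handler for `SIGINT` and defers the
signal around a region of code. The flag `interrupted` is a
hypothetical name introduced for the example.

[source,sml]
----
structure Signal = MLton.Signal

val interrupted = ref false

(* A simple handler runs its function and does not switch threads. *)
val () =
   Signal.setHandler
   (Posix.Signal.int, Signal.Handler.simple (fn () => interrupted := true))

(* Defer SIGINT around a sensitive region, then allow it again. *)
val sigint = Signal.Mask.some [Posix.Signal.int]
val () = Signal.Mask.block sigint
val () = print "SIGINT is deferred here\n"
val () = Signal.Mask.unblock sigint

val () =
   if Signal.Handler.isDefault (Signal.getHandler Posix.Signal.int)
      then print "default handler\n"
   else print "custom handler installed\n"
----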
14678
14679
14680== Interruptible System Calls ==
14681
Signal handling interacts in a non-trivial way with those functions in
the <:BasisLibrary:Basis Library> that correspond directly to
interruptible system calls (a subset of those functions that may raise
`OS.SysErr`). The desire is that these functions should have
predictable semantics. The principal concerns are:

1. System calls that are interrupted by signals should, by default, be
restarted; the alternative is to raise
+
[source,sml]
----
OS.SysErr (Posix.Error.errorMsg Posix.Error.intr,
           SOME Posix.Error.intr)
----
+
This behavior is determined dynamically by the value of `Signal.restart`.

2. Signal handlers should always get a chance to run (when outside a
critical region). If a system call is interrupted by a signal, then
the signal handler will run before the call is restarted or
`OS.SysErr` is raised; that is, before the `Signal.restart` check.

3. A system call that must be restarted while in a critical section
will be restarted with the handled signals blocked (and the previously
blocked signals remembered). This encourages the system call to
complete, allowing the program to make progress towards leaving the
critical section where the signal can be handled. If the system call
completes, the set of blocked signals is restored to those previously
blocked.
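
When `Signal.restart` is set to `false`, the program has to deal with
the interruption itself. One way to do so is sketched below. The
helper `retryOnIntr` and the function `flaky`, which simulates an
interrupted call, are hypothetical names introduced for the example.

[source,sml]
----
(* Hypothetical helper: rerun f whenever it fails with EINTR. *)
fun retryOnIntr (f: unit -> 'a): 'a =
   f ()
   handle e as OS.SysErr (_, SOME err) =>
      if err = Posix.Error.intr
         then retryOnIntr f
      else raise e

(* Ask for interrupted system calls to raise instead of restarting. *)
val () = MLton.Signal.restart := false

(* Simulate a call that is interrupted twice before it succeeds. *)
val attempts = ref 0
fun flaky () =
   (attempts := !attempts + 1
    ; if !attempts < 3
         then raise OS.SysErr ("interrupted", SOME Posix.Error.intr)
      else "done")

val result = retryOnIntr flaky
val () = print (result ^ " after " ^ Int.toString (!attempts) ^ " attempts\n")
----

A real program would pass a function that performs an interruptible
call, such as one of the `Posix.IO` operations, in place of `flaky`.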
14711
14712<<<
14713
14714:mlton-guide-page: MLtonStructure
14715[[MLtonStructure]]
14716MLtonStructure
14717==============
14718
14719The `MLton` structure contains a lot of functionality that is not
14720available in the <:BasisLibrary:Basis Library>. As a warning,
14721please keep in mind that the `MLton` structure and its
14722substructures do change from release to release of MLton.
14723
14724[source,sml]
14725----
14726structure MLton:
14727 sig
14728 val eq: 'a * 'a -> bool
14729 val equal: 'a * 'a -> bool
14730 val hash: 'a -> Word32.word
14731 val isMLton: bool
14732 val share: 'a -> unit
14733 val shareAll: unit -> unit
14734 val size: 'a -> int
14735
14736 structure Array: MLTON_ARRAY
14737 structure BinIO: MLTON_BIN_IO
14738 structure CharArray: MLTON_MONO_ARRAY where type t = CharArray.array
14739 where type elem = CharArray.elem
14740 structure CharVector: MLTON_MONO_VECTOR where type t = CharVector.vector
14741 where type elem = CharVector.elem
14742 structure Cont: MLTON_CONT
14743 structure Exn: MLTON_EXN
14744 structure Finalizable: MLTON_FINALIZABLE
14745 structure GC: MLTON_GC
14746 structure IntInf: MLTON_INT_INF
14747 structure Itimer: MLTON_ITIMER
14748 structure LargeReal: MLTON_REAL where type t = LargeReal.real
14749 structure LargeWord: MLTON_WORD where type t = LargeWord.word
14750 structure Platform: MLTON_PLATFORM
14751 structure Pointer: MLTON_POINTER
14752 structure ProcEnv: MLTON_PROC_ENV
14753 structure Process: MLTON_PROCESS
14754 structure Profile: MLTON_PROFILE
14755 structure Random: MLTON_RANDOM
14756 structure Real: MLTON_REAL where type t = Real.real
14757 structure Real32: sig
14758 include MLTON_REAL
14759 val castFromWord: Word32.word -> t
14760 val castToWord: t -> Word32.word
14761 end where type t = Real32.real
14762 structure Real64: sig
14763 include MLTON_REAL
14764 val castFromWord: Word64.word -> t
14765 val castToWord: t -> Word64.word
14766 end where type t = Real64.real
14767 structure Rlimit: MLTON_RLIMIT
14768 structure Rusage: MLTON_RUSAGE
14769 structure Signal: MLTON_SIGNAL
14770 structure Syslog: MLTON_SYSLOG
14771 structure TextIO: MLTON_TEXT_IO
14772 structure Thread: MLTON_THREAD
14773 structure Vector: MLTON_VECTOR
14774 structure Weak: MLTON_WEAK
14775 structure Word: MLTON_WORD where type t = Word.word
14776 structure Word8: MLTON_WORD where type t = Word8.word
14777 structure Word16: MLTON_WORD where type t = Word16.word
14778 structure Word32: MLTON_WORD where type t = Word32.word
14779 structure Word64: MLTON_WORD where type t = Word64.word
14780 structure Word8Array: MLTON_MONO_ARRAY where type t = Word8Array.array
14781 where type elem = Word8Array.elem
14782 structure Word8Vector: MLTON_MONO_VECTOR where type t = Word8Vector.vector
14783 where type elem = Word8Vector.elem
14784 structure World: MLTON_WORLD
14785 end
14786----
14787
14788
14789== Substructures ==
14790
* <:MLtonArray:>
* <:MLtonBinIO:>
* <:MLtonCont:>
* <:MLtonExn:>
* <:MLtonFinalizable:>
* <:MLtonGC:>
* <:MLtonIntInf:>
* <:MLtonIO:>
* <:MLtonItimer:>
* <:MLtonMonoArray:>
* <:MLtonMonoVector:>
* <:MLtonPlatform:>
* <:MLtonPointer:>
* <:MLtonProcEnv:>
* <:MLtonProcess:>
* <:MLtonProfile:>
* <:MLtonRandom:>
* <:MLtonReal:>
* <:MLtonRlimit:>
* <:MLtonRusage:>
* <:MLtonSignal:>
* <:MLtonSyslog:>
* <:MLtonTextIO:>
* <:MLtonThread:>
* <:MLtonVector:>
* <:MLtonWeak:>
* <:MLtonWord:>
* <:MLtonWorld:>
14818
14819== Values ==
14820
14821* `eq (x, y)`
14822+
returns true if `x` and `y` are equal as pointers. For simple types
like `char`, `int`, and `word`, this is the same as equality. For
arrays, datatypes, strings, tuples, and vectors, this is a simple
pointer equality. The semantics is a bit murky.
14827
14828* `equal (x, y)`
14829+
14830returns true if `x` and `y` are structurally equal. For equality
14831types, this is the same as <:PolymorphicEquality:>. For other types,
14832it is a conservative approximation of equivalence.
14833
14834* `hash x`
14835+
returns a structural hash of `x`. The hash function is consistent
between executions of the same program, but may not be consistent
between different programs.
14839
14840* `isMLton`
14841+
14842is always `true` in a MLton implementation, and is always `false` in a
14843stub implementation.
14844
14845* `share x`
14846+
14847maximizes sharing in the heap for the object graph reachable from `x`.
14848
14849* `shareAll ()`
14850+
14851maximizes sharing in the heap by sharing space for equivalent
14852immutable objects. A call to `shareAll` performs a major garbage
14853collection, and takes time proportional to the size of the heap.
14854
14855* `size x`
14856+
14857returns the amount of heap space (in bytes) taken by the value of `x`,
14858including all objects reachable from `x` by following pointers. It
14859takes time proportional to the size of `x`. See below for an example.
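
The remaining values can be tried out in the same spirit. The sketch
below compares two separately built lists; it deliberately makes no
claim about the result of `eq`, which depends on how the compiler
represents and shares the values.

[source,sml]
----
val xs = [1, 2, 3]
val ys = List.map (fn x => x) xs   (* same contents, built separately *)

(* Structural equality agrees with = on equality types. *)
val () = print ("equal: " ^ Bool.toString (MLton.equal (xs, ys)) ^ "\n")

(* Pointer equality depends on representation and sharing, so no
   particular answer is promised here. *)
val () = print ("eq: " ^ Bool.toString (MLton.eq (xs, ys)) ^ "\n")

(* The hash is structural, so equal structures hash alike in one run. *)
val () = print ("hash: " ^ Word32.toString (MLton.hash xs) ^ "\n")

val () = print ("size: " ^ Int.toString (MLton.size xs) ^ " bytes\n")
----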
14860
14861
14862== <!Anchor(size)>Example of `MLton.size` ==
14863
14864This example, `size.sml`, demonstrates the application of `MLton.size`
14865to many different kinds of objects.
14866[source,sml]
14867----
14868sys::[./bin/InclGitFile.py mlton master doc/examples/size/size.sml]
14869----
14870
Compile and run as usual.
----
% mlton size.sml
% ./size
The size of an int list of length 4 is 48 bytes.
The size of a string of length 10 is 24 bytes.
The size of an int array of length 10 is 52 bytes.
The size of a double array of length 10 is 92 bytes.
The size of an array of length 10 of 2-ples of ints is 92 bytes.
The size of a useless function is 0 bytes.
The size of a continuation option ref is 4544 bytes.
13
The size of a continuation option ref is 8 bytes.
----
14885
14886Note that sizes are dependent upon the target platform and compiler
14887optimizations.
14888
14889<<<
14890
14891:mlton-guide-page: MLtonSyslog
14892[[MLtonSyslog]]
14893MLtonSyslog
14894===========
14895
14896[source,sml]
14897----
14898signature MLTON_SYSLOG =
14899 sig
14900 type openflag
14901
14902 val CONS : openflag
14903 val NDELAY : openflag
14904 val NOWAIT : openflag
14905 val ODELAY : openflag
14906 val PERROR : openflag
14907 val PID : openflag
14908
14909 type facility
14910
14911 val AUTHPRIV : facility
14912 val CRON : facility
14913 val DAEMON : facility
14914 val KERN : facility
14915 val LOCAL0 : facility
14916 val LOCAL1 : facility
14917 val LOCAL2 : facility
14918 val LOCAL3 : facility
14919 val LOCAL4 : facility
14920 val LOCAL5 : facility
14921 val LOCAL6 : facility
14922 val LOCAL7 : facility
14923 val LPR : facility
14924 val MAIL : facility
14925 val NEWS : facility
14926 val SYSLOG : facility
14927 val USER : facility
14928 val UUCP : facility
14929
14930 type loglevel
14931
14932 val EMERG : loglevel
14933 val ALERT : loglevel
14934 val CRIT : loglevel
14935 val ERR : loglevel
14936 val WARNING : loglevel
14937 val NOTICE : loglevel
14938 val INFO : loglevel
14939 val DEBUG : loglevel
14940
14941 val closelog: unit -> unit
14942 val log: loglevel * string -> unit
14943 val openlog: string * openflag list * facility -> unit
14944 end
14945----
14946
14947`MLton.Syslog` is a complete interface to the system logging
14948facilities. See `man 3 syslog` for more details.
14949
14950* `closelog ()`
14951+
14952closes the connection to the system logger.
14953
14954* `log (l, s)`
14955+
14956logs message `s` at a loglevel `l`.
14957
14958* `openlog (name, flags, facility)`
14959+
14960opens a connection to the system logger. `name` will be prefixed to
14961each message, and is typically set to the program name.
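
A minimal logging session might look like the following sketch. The
program name `myprog` and the message texts are placeholders.

[source,sml]
----
structure Syslog = MLton.Syslog

(* "myprog" is a placeholder identifier for this example. *)
val () = Syslog.openlog ("myprog", [Syslog.PID, Syslog.CONS], Syslog.USER)
val () = Syslog.log (Syslog.INFO, "service started")
val () = Syslog.log (Syslog.WARNING, "disk space is low")
val () = Syslog.closelog ()
----

Where the messages end up depends on how the host's system logger is
configured.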
14962
14963<<<
14964
14965:mlton-guide-page: MLtonTextIO
14966[[MLtonTextIO]]
14967MLtonTextIO
14968===========
14969
14970[source,sml]
14971----
14972signature MLTON_TEXT_IO = MLTON_IO
14973----
14974
14975See <:MLtonIO:>.
14976
14977<<<
14978
14979:mlton-guide-page: MLtonThread
14980[[MLtonThread]]
14981MLtonThread
14982===========
14983
14984[source,sml]
14985----
signature MLTON_THREAD =
   sig
      structure AtomicState:
         sig
            datatype t = NonAtomic | Atomic of int
         end

      val atomically: (unit -> 'a) -> 'a
      val atomicBegin: unit -> unit
      val atomicEnd: unit -> unit
      val atomicState: unit -> AtomicState.t

      structure Runnable:
         sig
            type t
         end

      type 'a t

      val atomicSwitch: ('a t -> Runnable.t) -> 'a
      val new: ('a -> unit) -> 'a t
      val prepend: 'a t * ('b -> 'a) -> 'b t
      val prepare: 'a t * 'a -> Runnable.t
      val switch: ('a t -> Runnable.t) -> 'a
   end
15011----
15012
15013`MLton.Thread` provides access to MLton's user-level thread
15014implementation (i.e. not OS-level threads). Threads are lightweight
15015data structures that represent a paused computation. Runnable threads
15016are threads that will begin or continue computing when `switch`-ed to.
15017`MLton.Thread` does not include a default scheduling mechanism, but it
15018can be used to implement both preemptive and non-preemptive threads.
15019
15020* `type AtomicState.t`
15021+
15022the type of atomic states.
15023
15024
15025* `atomically f`
15026+
15027runs `f` in a critical section.
15028
15029* `atomicBegin ()`
15030+
15031begins a critical section.
15032
15033* `atomicEnd ()`
15034+
15035ends a critical section.
15036
15037* `atomicState ()`
15038+
15039returns the current atomic state.
15040
15041* `type Runnable.t`
15042+
15043the type of threads that can be resumed.
15044
15045* `type 'a t`
15046+
15047the type of threads that expect a value of type `'a`.
15048
15049* `atomicSwitch f`
15050+
15051like `switch`, but assumes an atomic calling context. Upon
15052`switch`-ing back to the current thread, an implicit `atomicEnd` is
15053performed.
15054
15055* `new f`
15056+
creates a new thread that, when run, applies `f` to the value given to
the thread. `f` must terminate by `switch`-ing to another thread or
exiting the process.
15060
15061* `prepend (t, f)`
15062+
15063creates a new thread (destroying `t` in the process) that first
15064applies `f` to the value given to the thread and then continues with
15065`t`. This is a constant time operation.
15066
15067* `prepare (t, v)`
15068+
15069prepares a new runnable thread (destroying `t` in the process) that
15070will evaluate `t` on `v`.
15071
15072* `switch f`
15073+
applies `f` to the current thread to get `rt`, and then starts running
thread `rt`. It is an error for `f` to perform another `switch`. `f`
is guaranteed to run atomically.
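
The following is a small sketch, separate from the examples distributed with MLton, of how `new`, `prepare`, and `switch` combine: the main thread hands control to a child thread, which records a message and switches back. The names `trace`, `note`, `main`, and `child` are made up for the illustration.

[source,sml]
----
structure T = MLton.Thread

(* Record the order in which the pieces of code run. *)
val trace: string list ref = ref []
fun note (s: string): unit = trace := s :: !trace

(* Where the child should return to; filled in just before switching. *)
val main: T.Runnable.t option ref = ref NONE

(* The child terminates by switching to another thread, as `new` requires. *)
val child: unit T.t =
   T.new (fn () =>
          (note "child"
           ; T.switch (fn _ => valOf (!main))))

val () = note "before"
val () =
   T.switch (fn cur =>
             (main := SOME (T.prepare (cur, ()))  (* pause the main thread *)
              ; T.prepare (child, ())))           (* and run the child *)
val () = note "after"

val () = print (String.concatWith ", " (rev (!trace)) ^ "\n")
----

Neither function passed to `switch` performs a further `switch`; each only returns the runnable thread to continue with.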
15077
15078
15079== Example of non-preemptive threads ==
15080
15081[source,sml]
15082----
15083sys::[./bin/InclGitFile.py mlton master doc/examples/thread/non-preemptive-threads.sml]
15084----
15085
15086
15087== Example of preemptive threads ==
15088
15089[source,sml]
15090----
15091sys::[./bin/InclGitFile.py mlton master doc/examples/thread/preemptive-threads.sml]
15092----
15093
15094<<<
15095
15096:mlton-guide-page: MLtonVector
15097[[MLtonVector]]
15098MLtonVector
15099===========
15100
15101[source,sml]
15102----
signature MLTON_VECTOR =
   sig
      val create: int -> {done: unit -> 'a vector,
                          sub: int -> 'a,
                          update: int * 'a -> unit}
      val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a vector * 'b
   end
15110----
15111
15112* `create n`
15113+
initiates the construction of a vector _v_ of length `n`, returning
15115functions to manipulate the vector. The `done` function may be called
15116to return the created vector; it is an error to call `done` before all
15117entries have been initialized; it is an error to call `done` after
15118having called `done`. The `sub` function may be called to return an
15119initialized vector entry; it is not an error to call `sub` after
15120having called `done`. The `update` function may be called to
15121initialize a vector entry; it is an error to call `update` after
15122having called `done`. One must initialize vector entries in order
from lowest to highest; that is, before calling `update (i, x)`, one
must have already called `update (j, y)` for every `j` in `[0, i)`. The
15125`done`, `sub`, and `update` functions are all constant-time
15126operations.
15127
15128* `unfoldi (n, b, f)`
15129+
15130constructs a vector _v_ of length `n`, whose elements __v~i~__ are
15131determined by the equations __b~0~ = b__ and
15132__(v~i~, b~i+1~) = f (i, b~i~)__.
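
As an illustration of both functions, here is a brief sketch. The helper `squares` and the Fibonacci example are assumptions chosen for the demonstration. `create` is used inside a function so that the element type is fixed by the calls to `update`.

[source,sml]
----
(* Build the vector [0, 1, 4, ...] by initializing entries from lowest to
 * highest and calling `done` exactly once, after the last entry is set. *)
fun squares (n: int): int vector =
   let
      val {done, update, ...} = MLton.Vector.create n
      fun loop i =
         if i >= n
            then ()
         else (update (i, i * i); loop (i + 1))
   in
      loop 0; done ()
   end

(* Thread a state (a pair of consecutive Fibonacci numbers) through the
 * construction: each step yields one element and the next state. *)
val (fibs, _) =
   MLton.Vector.unfoldi (6, (0, 1), fn (_, (a, b)) => (a, (b, a + b)))
----

With these definitions, `squares 4` is the vector of `0`, `1`, `4`, `9`, and `fibs` holds the first six Fibonacci numbers.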
15133
15134<<<
15135
15136:mlton-guide-page: MLtonWeak
15137[[MLtonWeak]]
15138MLtonWeak
15139=========
15140
15141[source,sml]
15142----
signature MLTON_WEAK =
   sig
      type 'a t

      val get: 'a t -> 'a option
      val new: 'a -> 'a t
   end
15150----
15151
15152A weak pointer is a pointer to an object that is nulled if the object
15153becomes <:Reachability:unreachable> due to garbage collection. The
15154weak pointer does not itself cause the object it points to be retained
15155by the garbage collector -- only other strong pointers can do that.
15156For objects that are not allocated in the heap, like integers, a weak
15157pointer will always be nulled. So, if `w: int Weak.t`, then
15158`Weak.get w = NONE`.
15159
15160* `type 'a t`
15161+
the type of weak pointers to objects of type `'a`.
15163
15164* `get w`
15165+
15166returns `NONE` if the object pointed to by `w` no longer exists.
15167Otherwise, returns `SOME` of the object pointed to by `w`.
15168
15169* `new x`
15170+
15171returns a weak pointer to `x`.
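
A short sketch of the two cases described above, under the assumption that `r` stays strongly reachable because it is read again at the end of the program; the names are illustrative.

[source,sml]
----
structure Weak = MLton.Weak

(* A ref cell lives in the heap. While the strong reference r is live,
 * the weak pointer can still be followed. *)
val r: int ref = ref 13
val w: int ref Weak.t = Weak.new r
val stillThere: bool = isSome (Weak.get w)

(* An integer is not heap allocated, so, as noted above, the weak pointer
 * is already nulled. *)
val wi: int Weak.t = Weak.new 13
val gone: bool = not (isSome (Weak.get wi))

(* Reading r here keeps it reachable up to this point. *)
val () = print (Int.toString (!r) ^ "\n")
----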
15172
15173<<<
15174
15175:mlton-guide-page: MLtonWord
15176[[MLtonWord]]
15177MLtonWord
15178=========
15179
15180[source,sml]
15181----
signature MLTON_WORD =
   sig
      type t

      val bswap: t -> t
      val rol: t * word -> t
      val ror: t * word -> t
   end
15190----
15191
15192* `type t`
15193+
15194the type of words. For `MLton.LargeWord` this is `LargeWord.word`,
15195for `MLton.Word` this is `Word.word`, for `MLton.Word8` this is
15196`Word8.word`, for `MLton.Word16` this is `Word16.word`, for
15197`MLton.Word32` this is `Word32.word`, for `MLton.Word64` this is
15198`Word64.word`.
15199
* `bswap w`
+
reverses the order of the bytes in `w` (byte swap).

* `rol (w, w')`
+
rotates `w` left by `w'` bits (circular).

* `ror (w, w')`
+
rotates `w` right by `w'` bits (circular).
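
For instance, with 32-bit words the operations behave as in this sketch; the constants are arbitrary illustrations.

[source,sml]
----
(* Byte swap: the four bytes 11 22 33 44 come back as 44 33 22 11. *)
val swapped: Word32.word = MLton.Word32.bswap 0wx11223344

(* Rotation is circular: the bit shifted out on one side re-enters on
 * the other. *)
val left: Word32.word = MLton.Word32.rol (0wx80000001, 0w1)
val right: Word32.word = MLton.Word32.ror (0wx00000003, 0w1)

val () =
   List.app (fn w => print (Word32.fmt StringCvt.HEX w ^ "\n"))
   [swapped, left, right]
----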
15211
15212<<<
15213
15214:mlton-guide-page: MLtonWorld
15215[[MLtonWorld]]
15216MLtonWorld
15217==========
15218
15219[source,sml]
15220----
signature MLTON_WORLD =
   sig
      datatype status = Clone | Original

      val load: string -> 'a
      val save: string -> status
      val saveThread: string * Thread.Runnable.t -> unit
   end
15229----
15230
15231* `datatype status`
15232+
15233specifies whether a world is original or restarted (a clone).
15234
15235* `load f`
15236+
15237loads the saved computation from file `f`.
15238
15239* `save f`
15240+
15241saves the entire state of the computation to the file `f`. The
15242computation can then be restarted at a later time using `World.load`
15243or the `load-world` <:RunTimeOptions:runtime option>. The call to
15244`save` in the original computation returns `Original` and the call in
15245the restarted world returns `Clone`.
15246
15247* `saveThread (f, rt)`
15248+
15249saves the entire state of the computation to the file `f` that will
15250resume with thread `rt` upon restart.
15251
15252
15253== Notes ==
15254
15255<!Anchor(ASLR)>
15256Executables that save and load worlds are incompatible with
15257http://en.wikipedia.org/wiki/Address_space_layout_randomization[address space layout randomization (ASLR)]
of the executable (though not of shared libraries). The state of a
15259computation includes addresses into the code and data segments of the
15260executable (e.g., static runtime-system data, return addresses); such
15261addresses are invalid when interpreted by the executable loaded at a
15262different base address.
15263
15264Executables that save and load worlds should be compiled with an
15265option to suppress the generation of position-independent executables.
15266
15267* <:RunningOnDarwin:Darwin 11 (Mac OS X Lion) and higher> : `-link-opt -fno-PIE`
15268
15269
15270== Example ==
15271
15272Suppose that `save-world.sml` contains the following.
15273[source,sml]
15274----
15275sys::[./bin/InclGitFile.py mlton master doc/examples/save-world/save-world.sml]
15276----
15277
15278Then, if we compile `save-world.sml` and run it, the `Original`
15279branch will execute, and a file named `world` will be created.
15280----
15281% mlton save-world.sml
15282% ./save-world
15283I am the original
15284----
15285
We can then load `world` using the `load-world`
<:RunTimeOptions:runtime option>.
15288----
15289% ./save-world @MLton load-world world --
15290I am the clone
15291----
15292
15293<<<
15294
15295:mlton-guide-page: MLULex
15296[[MLULex]]
15297MLULex
15298======
15299
15300http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLULex] is a
15301scanner generator for <:StandardML:Standard ML>.
15302
15303== Also see ==
15304
15305* <:MLAntlr:>
15306* <:MLLPTLibrary:>
15307* <!Cite(OwensEtAl09)>
15308
15309<<<
15310
15311:mlton-guide-page: MLYacc
15312[[MLYacc]]
15313MLYacc
15314======
15315
15316<:MLYacc:> is a parser generator for <:StandardML:Standard ML> modeled
15317after the Yacc parser generator.
15318
15319A version of MLYacc, ported from the <:SMLNJ:SML/NJ> sources, is
15320distributed with MLton.
15321
15322== Also see ==
15323
15324* <!Attachment(Documentation,mlyacc.pdf)>
15325* <:MLLex:>
15326* <!Cite(TarditiAppel00)>
15327* <!Cite(Price09)>
15328
15329<<<
15330
15331:mlton-guide-page: Monomorphise
15332[[Monomorphise]]
15333Monomorphise
15334============
15335
15336<:Monomorphise:> is a translation pass from the <:XML:>
15337<:IntermediateLanguage:> to the <:SXML:> <:IntermediateLanguage:>.
15338
15339== Description ==
15340
15341Monomorphisation eliminates polymorphic values and datatype
15342declarations by duplicating them for each type at which they are used.
15343
15344Consider the following <:XML:> program.
15345[source,sml]
15346----
15347datatype 'a t = T of 'a
15348fun 'a f (x: 'a) = T x
15349val a = f 1
15350val b = f 2
15351val z = f (3, 4)
15352----
15353
15354The result of monomorphising this program is the following <:SXML:> program:
15355[source,sml]
15356----
15357datatype t1 = T1 of int
15358datatype t2 = T2 of int * int
15359fun f1 (x: int) = T1 x
15360fun f2 (x: int * int) = T2 x
15361val a = f1 1
15362val b = f1 2
15363val z = f2 (3, 4)
15364----
15365
15366== Implementation ==
15367
15368* <!ViewGitFile(mlton,master,mlton/xml/monomorphise.sig)>
15369* <!ViewGitFile(mlton,master,mlton/xml/monomorphise.fun)>
15370
15371== Details and Notes ==
15372
15373The monomorphiser works by making one pass over the entire program.
15374On the way down, it creates a cache for each variable declared in a
polymorphic declaration that maps a list of type arguments to a new
15376variable name. At a variable reference, it consults the cache (based
15377on the types the variable is applied to). If there is already an
15378entry in the cache, it is used. If not, a new entry is created. On
15379the way up, the monomorphiser duplicates a variable declaration for
15380each entry in the cache.
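
The per-variable cache can be pictured with the following toy sketch. It is not the code in `monomorphise.fun`: types are represented as strings, and `makeCache`, `lookup`, and `instances` are hypothetical names used only to illustrate the lookup-or-create behaviour.

[source,sml]
----
(* Toy model: a list of type arguments is a list of strings. *)
type tyargs = string list

(* One cache per polymorphic variable. It maps type arguments to the name
 * of the corresponding monomorphic copy, creating names on demand. *)
fun makeCache (base: string) =
   let
      val entries: (tyargs * string) list ref = ref []
      fun lookup (targs: tyargs): string =
         case List.find (fn (ts, _) => ts = targs) (!entries) of
            SOME (_, name) => name
          | NONE =>
               let
                  val name = base ^ Int.toString (length (!entries) + 1)
               in
                  entries := (targs, name) :: !entries
                  ; name
               end
      (* The instances to duplicate "on the way up", in creation order. *)
      fun instances () = rev (!entries)
   in
      {lookup = lookup, instances = instances}
   end

(* The three uses of f in the example program above. *)
val {lookup, instances} = makeCache "f"
val a = lookup ["int"]
val b = lookup ["int"]
val z = lookup ["int * int"]
----

The two uses at `int` share one entry, so only two copies of `f` are needed, matching `f1` and `f2` in the monomorphised program.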
15381
As with variables, the monomorphiser records all of the types at which
15383constructors are used. After the entire program is processed, the
15384monomorphiser duplicates each datatype declaration and its associated
15385constructors.
15386
15387The monomorphiser duplicates all of the functions declared in a
15388`fun` declaration as a unit. Consider the following program
15389[source,sml]
15390----
15391fun 'a f (x: 'a) = g x
15392and g (y: 'a) = f y
15393val a = f 13
15394val b = g 14
15395val c = f (1, 2)
15396----
15397
15398and its monomorphisation
15399
15400[source,sml]
15401----
15402fun f1 (x: int) = g1 x
15403and g1 (y: int) = f1 y
15404fun f2 (x : int * int) = g2 x
15405and g2 (y : int * int) = f2 y
15406val a = f1 13
15407val b = g1 14
15408val c = f2 (1, 2)
15409----
15410
15411== Pathological datatype declarations ==
15412
15413SML allows a pathological polymorphic datatype declaration in which
15414recursive uses of the defined type constructor are applied to
15415different type arguments than the definition. This has been
15416disallowed by others on type theoretic grounds. A canonical example
15417is the following.
15418[source,sml]
15419----
15420datatype 'a t = A of 'a | B of ('a * 'a) t
15421val z : int t = B (B (A ((1, 2), (3, 4))))
15422----
15423
15424The presence of the recursion in the datatype declaration might appear
15425to cause the need for the monomorphiser to create an infinite number
15426of types. However, due to the absence of polymorphic recursion in
15427SML, there are in fact only a finite number of instances of such types
15428in any given program. The monomorphiser translates the above program
15429to the following one.
15430[source,sml]
15431----
15432datatype t1 = B1 of t2
15433datatype t2 = B2 of t3
15434datatype t3 = A3 of (int * int) * (int * int)
val z : t1 = B1 (B2 (A3 ((1, 2), (3, 4))))
15436----
15437
15438It is crucial that the monomorphiser be allowed to drop unused
15439constructors from datatype declarations in order for the translation
15440to terminate.
15441
15442<<<
15443
15444:mlton-guide-page: MoscowML
15445[[MoscowML]]
15446MoscowML
15447========
15448
15449http://mosml.org[Moscow ML] is a
15450<:StandardMLImplementations:Standard ML implementation>. It is a
15451byte-code compiler, so it compiles code quickly, but the code runs
15452slowly. See <:Performance:>.
15453
15454<<<
15455
15456:mlton-guide-page: Multi
15457[[Multi]]
15458Multi
15459=====
15460
15461<:Multi:> is an analysis pass for the <:SSA:>
15462<:IntermediateLanguage:>, invoked from <:ConstantPropagation:> and
15463<:LocalRef:>.
15464
15465== Description ==
15466
This pass analyzes the control flow of an <:SSA:> program to determine
15468which <:SSA:> functions and blocks might be executed more than once or
15469by more than one thread. It also determines when a program uses
15470threads and when functions and blocks directly or indirectly invoke
15471`Thread_copyCurrent`.
15472
15473== Implementation ==
15474
15475* <!ViewGitFile(mlton,master,mlton/ssa/multi.sig)>
15476* <!ViewGitFile(mlton,master,mlton/ssa/multi.fun)>
15477
15478== Details and Notes ==
15479
15480{empty}
15481
15482<<<
15483
15484:mlton-guide-page: Mutable
15485[[Mutable]]
15486Mutable
15487=======
15488
15489Mutable is an adjective meaning "can be modified". In
15490<:StandardML:Standard ML>, ref cells and arrays are mutable, while all
15491other values are <:Immutable:immutable>.
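
A brief sketch of the distinction, using only Basis Library operations; the variable names are illustrative.

[source,sml]
----
(* Ref cells and arrays can be modified in place. *)
val r: int ref = ref 0
val () = r := !r + 1

val a: int array = Array.array (3, 0)
val () = Array.update (a, 1, 42)

(* A vector is immutable: Vector.update returns a new vector and leaves
 * the original untouched. *)
val v: int vector = Vector.fromList [1, 2, 3]
val v' = Vector.update (v, 1, 42)
----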
15492
15493<<<
15494
15495:mlton-guide-page: NeedsReview
15496[[NeedsReview]]
15497NeedsReview
15498===========
15499
15500This page documents some patches and bug fixes that need additional review by experienced developers:
15501
15502* Bug in transparent signature match:
** What is an 'original' interface and why does the equivalence of original interfaces imply the equivalence of the actual interfaces?
15504** http://www.mlton.org/pipermail/mlton/2007-September/029991.html
15505** http://www.mlton.org/pipermail/mlton/2007-September/029995.html
15506** SVN Revision: <!ViewSVNRev(6046)>
15507
15508* Bug in <:DeepFlatten:> pass:
15509** Should we allow argument to `Weak_new` to be flattened?
15510** SVN Revision: <!ViewSVNRev(6189)> (regression test demonstrating bug)
15511** SVN Revision: <!ViewSVNRev(6191)>
15512
15513<<<
15514
15515:mlton-guide-page: NumericLiteral
15516[[NumericLiteral]]
15517NumericLiteral
15518==============
15519
15520Numeric literals in <:StandardML:Standard ML> can be written in either
15521decimal or hexadecimal notation. Sometimes it can be convenient to
15522write numbers down in other bases. Fortunately, using <:Fold:>, it is
15523possible to define a concise syntax for numeric literals that allows
15524one to write numeric constants in any base and of various types
15525(`int`, `IntInf.int`, `word`, and more).
15526
15527We will define constants `I`, `II`, `W`, and +`+ so
15528that, for example,
15529[source,sml]
15530----
15531I 10 `1`2`3 $
15532----
15533denotes `123:int` in base 10, while
15534[source,sml]
15535----
15536II 8 `2`3 $
15537----
15538denotes `19:IntInf.int` in base 8, and
15539[source,sml]
15540----
15541W 2 `1`1`0`1 $
15542----
15543denotes `0w13: word`.
15544
15545Here is the code.
15546
15547[source,sml]
15548----
structure Num =
   struct
      fun make (op *, op +, i2x) iBase =
          let
             val xBase = i2x iBase
          in
             Fold.fold
                ((i2x 0,
                  fn (i, x) =>
                     if 0 <= i andalso i < iBase then
                        x * xBase + i2x i
                     else
                        raise Fail (concat
                                       ["Num: ", Int.toString i,
                                        " is not a valid\
                                        \ digit in base ",
                                        Int.toString iBase])),
                 fst)
          end

      fun I  ? = make (op *, op +, id) ?
      fun II ? = make (op *, op +, IntInf.fromInt) ?
      fun W  ? = make (op *, op +, Word.fromInt) ?

      fun ` ? = Fold.step1 (fn (i, (x, step)) =>
                               (step (i, x), step)) ?

      val a = 10
      val b = 11
      val c = 12
      val d = 13
      val e = 14
      val f = 15
   end
15583----
15584where
15585[source,sml]
15586----
15587fun fst (x, _) = x
15588----
15589
The idea is for the fold to start with zero and to construct the
result one digit at a time, with each stepper multiplying the previous
result by the base and adding the next digit. The code is abstracted
in two different ways for extra generality. First, the `make`
function abstracts over the various primitive operations (addition,
multiplication, etc.) that are needed to construct a number. This
allows the same code to be shared for the constants `I`, `II`, and `W`
used to write down the various numeric types. It also allows users to
add new constants for additional numeric types, by supplying the
necessary arguments to `make`.
15600
Second, the step function, +&grave;+, is abstracted over the actual
construction operation, which is created by `make`, and passed along the
15603fold. This allows the same constant, +&grave;+, to be used for all
15604numeric types. The alternative approach, having a different step
15605function for each numeric type, would be more painful to use.
15606
15607On the surface, it appears that the code checks the digits dynamically
15608to ensure they are valid for the base. However, MLton will simplify
15609everything away at compile time, leaving just the final numeric
15610constant.
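
The digit-by-digit accumulation can also be sketched without <:Fold:>, over an ordinary list of digits. The function `fromDigits` is a hypothetical stand-in used only to show the arithmetic; it is not part of the `Num` structure and does not offer the literal-like syntax.

[source,sml]
----
(* Start from zero; for each digit, multiply the running result by the
 * base and add the digit, rejecting digits that are out of range. *)
fun fromDigits (base: int) (digits: int list): int =
   List.foldl
   (fn (d, acc) =>
    if 0 <= d andalso d < base
       then acc * base + d
    else raise Fail (concat ["fromDigits: ", Int.toString d,
                             " is not a valid digit in base ",
                             Int.toString base]))
   0
   digits

val x = fromDigits 10 [1, 2, 3]    (* 123 *)
val y = fromDigits 8 [2, 3]        (* 2 * 8 + 3 = 19 *)
val z = fromDigits 2 [1, 1, 0, 1]  (* 8 + 4 + 0 + 1 = 13 *)
----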
15611
15612<<<
15613
15614:mlton-guide-page: ObjectOrientedProgramming
15615[[ObjectOrientedProgramming]]
15616ObjectOrientedProgramming
15617=========================
15618
15619<:StandardML:Standard ML> does not have explicit support for
15620object-oriented programming. Here are some papers that show how to
15621express certain object-oriented concepts in SML.
15622
15623* <!Cite(Berthomieu00, OO Programming styles in ML)>
15624
15625* <!Cite(ThorupTofte94, Object-oriented programming and Standard ML)>
15626
15627* <!Cite(LarsenNiss04, mGTK: An SML binding of Gtk+)>
15628
15629* <!Cite(FluetPucella06, Phantom Types and Subtyping)>
15630
15631The question of OO programming in SML comes up every now and then.
15632The following discusses a simple object-oriented (OO) programming
15633technique in Standard ML. The reader is assumed to be able to read
15634Java and SML code.
15635
15636
15637== Motivation ==
15638
15639SML doesn't provide subtyping, but it does provide parametric
15640polymorphism, which can be used to encode some forms of subtyping.
15641Most articles on OO programming in SML concentrate on such encoding
15642techniques. While those techniques are interesting -- and it is
15643recommended to read such articles -- and sometimes useful, it seems
15644that basically all OO gurus agree that (deep) subtyping (or
15645inheritance) hierarchies aren't as practical as they were thought to
15646be in the early OO days. "Good", flexible, "OO" designs tend to have
15647a flat structure
15648
15649----
      Interface
          ^
          |
- - -+-------+-------+- - -
     |       |       |
   ImplA   ImplB   ImplC
15656----
15657
15658
15659and deep inheritance hierarchies
15660
15661----
ClassA
  ^
  |
ClassB
  ^
  |
ClassC
  ^
  |
15671----
15672
15673tend to be signs of design mistakes. There are good underlying
15674reasons for this, but a thorough discussion is not in the scope of
15675this article. However, the point is that perhaps the encoding of
15676subtyping is not as important as one might believe. In the following
15677we ignore subtyping and rather concentrate on a very simple and basic
15678dynamic dispatch technique.
15679
15680
15681== Dynamic Dispatch Using a Recursive Record of Functions ==
15682
15683Quite simply, the basic idea is to implement a "virtual function
15684table" using a record that is wrapped inside a (possibly recursive)
15685datatype. Let's first take a look at a simple concrete example.
15686
15687Consider the following Java interface:
15688
15689----
public interface Counter {
  public void inc();
  public int get();
}
15694----
15695
15696We can translate the `Counter` interface to SML as follows:
15697
15698[source,sml]
15699----
15700datatype counter = Counter of {inc : unit -> unit, get : unit -> int}
15701----
15702
15703Each value of type `counter` can be thought of as an object that
15704responds to two messages `inc` and `get`. To actually send messages
15705to a counter, it is useful to define auxiliary functions
15706
15707[source,sml]
15708----
local
   fun mk m (Counter t) = m t ()
in
   val cGet = mk#get
   val cInc = mk#inc
end
15715----
15716
15717that basically extract the "function table" `t` from a counter object
15718and then select the specified method `m` from the table.
15719
15720Let's then implement a simple function that increments a counter until a
15721given maximum is reached:
15722
15723[source,sml]
15724----
15725fun incUpto counter max = while cGet counter < max do cInc counter
15726----
15727
15728You can easily verify that the above code compiles even without any
15729concrete implementation of a counter, thus it is clear that it doesn't
15730depend on a particular counter implementation.
15731
15732Let's then implement a couple of counters. First consider the
15733following Java class implementing the `Counter` interface given earlier.
15734
15735----
public class BasicCounter implements Counter {
  private int cnt;
  public BasicCounter(int initialCnt) { this.cnt = initialCnt; }
  public void inc() { this.cnt += 1; }
  public int get() { return this.cnt; }
}
15742----
15743
15744We can translate the above to SML as follows:
15745
15746[source,sml]
15747----
fun newBasicCounter initialCnt = let
      val cnt = ref initialCnt
   in
      Counter {inc = fn () => cnt := !cnt + 1,
               get = fn () => !cnt}
   end
15754----
15755
15756The SML function `newBasicCounter` can be described as a constructor
15757function for counter objects of the `BasicCounter` "class". We can
15758also have other counter implementations. Here is the constructor for
15759a counter decorator that logs messages:
15760
15761[source,sml]
15762----
fun newLoggedCounter counter =
   Counter {inc = fn () => (print "inc\n" ; cInc counter),
            get = fn () => (print "get\n" ; cGet counter)}
15766----
15767
15768The `incUpto` function works just as well with objects of either
15769class:
15770
15771[source,sml]
15772----
15773val aCounter = newBasicCounter 0
15774val () = incUpto aCounter 5
15775val () = print (Int.toString (cGet aCounter) ^"\n")
15776
15777val aCounter = newLoggedCounter (newBasicCounter 0)
15778val () = incUpto aCounter 5
15779val () = print (Int.toString (cGet aCounter) ^"\n")
15780----
15781
15782In general, a dynamic dispatch interface is represented as a record
15783type wrapped inside a datatype. Each field of the record corresponds
15784to a public method or field of the object:
15785
15786[source,sml]
15787----
datatype interface =
   Interface of {method : t1 -> t2,
                 immutableField : t,
                 mutableField : t ref}
15792----
15793
15794The reason for wrapping the record inside a datatype is that records,
in SML, cannot be recursive. However, SML datatypes can be
15796recursive. A record wrapped in a datatype can contain fields that
15797contain the datatype. For example, an interface such as `Cloneable`
15798
15799[source,sml]
15800----
15801datatype cloneable = Cloneable of {clone : unit -> cloneable}
15802----
15803
15804can be represented using recursive datatypes.
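
To make the recursion concrete, here is a sketch of a counter whose interface includes a `clone` method returning another object of the same interface type. The names `ccounter`, `CCounter`, and `newCCounter` are invented for this illustration.

[source,sml]
----
(* The record refers to the datatype being defined, which is only
 * possible because the record is wrapped in the datatype. *)
datatype ccounter =
   CCounter of {inc : unit -> unit,
                get : unit -> int,
                clone : unit -> ccounter}

fun newCCounter initialCnt = let
      val cnt = ref initialCnt
   in
      CCounter {inc = fn () => cnt := !cnt + 1,
                get = fn () => !cnt,
                (* A clone starts from the current count but has its
                 * own private state. *)
                clone = fn () => newCCounter (!cnt)}
   end

val CCounter c = newCCounter 0
val () = #inc c ()
val CCounter c' = #clone c ()
val () = #inc c ()
----

After these declarations the original counter has been incremented twice, while the clone still holds the count it was created with.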
15805
As in OO languages, interfaces are abstract and cannot be
15807instantiated to produce objects. To be able to instantiate objects,
15808the constructors of a concrete class are needed. In SML, we can
15809implement constructors as simple functions from arbitrary arguments to
15810values of the interface type. Such a constructor function can
15811encapsulate arbitrary private state and functions using lexical
15812closure. It is also easy to share implementations of methods between
15813two or more constructors.
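
As a sketch of such sharing, the constructors below build their method tables from the same two helper functions, closed over private state. The helpers `incBy` and `getFrom` and the constructors `newStepCounter` and `newEvenCounter` are hypothetical. The `counter` datatype is repeated so that the fragment stands alone.

[source,sml]
----
datatype counter = Counter of {inc : unit -> unit, get : unit -> int}

(* Method implementations shared by several constructors. *)
fun incBy step cnt () = cnt := !cnt + step
fun getFrom cnt () = !cnt

(* The private state `cnt` is visible only to the closures below. *)
fun newStepCounter (initialCnt, step) = let
      val cnt = ref initialCnt
   in
      Counter {inc = incBy step cnt, get = getFrom cnt}
   end

fun newBasicCounter initialCnt = newStepCounter (initialCnt, 1)
fun newEvenCounter () = newStepCounter (0, 2)

val Counter even = newEvenCounter ()
val () = (#inc even (); #inc even ())
----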
15814
15815While the `Counter` example is rather trivial, it should not be
15816difficult to see that this technique quite simply doesn't require a huge
15817amount of extra verbiage and is more than usable in practice.
15818
15819
15820== SML Modules and Dynamic Dispatch ==
15821
15822One might wonder about how SML modules and the dynamic dispatch
15823technique work together. Let's investigate! Let's use a simple
15824dispenser framework as a concrete example. (Note that this isn't
15825intended to be an introduction to the SML module system.)
15826
15827=== Programming with SML Modules ===
15828
15829Using SML signatures we can specify abstract data types (ADTs) such as
15830dispensers. Here is a signature for an "abstract" functional (as
15831opposed to imperative) dispenser:
15832
15833[source,sml]
15834----
signature ABSTRACT_DISPENSER = sig
   type 'a t
   val isEmpty : 'a t -> bool
   val push : 'a * 'a t -> 'a t
   val pop : 'a t -> ('a * 'a t) option
end
15841----
15842
15843The term "abstract" in the name of the signature refers to the fact that
15844the signature gives no way to instantiate a dispenser. It has nothing to
15845do with the concept of abstract data types.
15846
15847Using SML functors we can write "generic" algorithms that manipulate
15848dispensers of an unknown type. Here are a couple of very simple
15849algorithms:
15850
15851[source,sml]
15852----
functor DispenserAlgs (D : ABSTRACT_DISPENSER) = struct
   open D

   fun pushAll (xs, d) = foldl push d xs

   fun popAll d = let
         fun lp (xs, NONE) = rev xs
           | lp (xs, SOME (x, d)) = lp (x::xs, pop d)
      in
         lp ([], pop d)
      end

   fun cp (from, to) = pushAll (popAll from, to)
end
15867----
15868
15869As one can easily verify, the above compiles even without any concrete
dispenser structure. Functors essentially provide a form of static
15871dispatch that one can use to break compile-time dependencies.
15872
15873We can also give a signature for a concrete dispenser
15874
15875[source,sml]
15876----
signature DISPENSER = sig
   include ABSTRACT_DISPENSER
   val empty : 'a t
end
15881----
15882
15883and write any number of concrete structures implementing the signature.
15884For example, we could implement stacks
15885
15886[source,sml]
15887----
structure Stack :> DISPENSER = struct
   type 'a t = 'a list
   val empty = []
   val isEmpty = null
   val push = op ::
   val pop = List.getItem
end
15895----
15896
15897and queues
15898
15899[source,sml]
15900----
structure Queue :> DISPENSER = struct
   datatype 'a t = T of 'a list * 'a list
   val empty = T ([], [])
   val isEmpty = fn T ([], _) => true | _ => false
   val normalize = fn ([], ys) => (rev ys, []) | q => q
   fun push (y, T (xs, ys)) = T (normalize (xs, y::ys))
   val pop = fn (T (x::xs, ys)) => SOME (x, T (normalize (xs, ys))) | _ => NONE
end
15909----
15910
15911One can now write code that uses either the `Stack` or the `Queue`
15912dispenser. One can also instantiate the previously defined functor to
15913create functions for manipulating dispensers of a type:
15914
15915[source,sml]
15916----
15917structure S = DispenserAlgs (Stack)
15918val [4,3,2,1] = S.popAll (S.pushAll ([1,2,3,4], Stack.empty))
15919
15920structure Q = DispenserAlgs (Queue)
15921val [1,2,3,4] = Q.popAll (Q.pushAll ([1,2,3,4], Queue.empty))
15922----
15923
15924There is no dynamic dispatch involved at the module level in SML. An
15925attempt to do dynamic dispatch
15926
15927[source,sml]
15928----
15929val q = Q.push (1, Stack.empty)
15930----
15931
15932will give a type error.
15933
15934=== Combining SML Modules and Dynamic Dispatch ===
15935
15936Let's then combine SML modules and the dynamic dispatch technique
15937introduced in this article. First we define an interface for
15938dispensers:
15939
15940[source,sml]
15941----
structure Dispenser = struct
   datatype 'a t =
      I of {isEmpty : unit -> bool,
            push : 'a -> 'a t,
            pop : unit -> ('a * 'a t) option}

   fun O m (I t) = m t

   fun isEmpty t = O#isEmpty t ()
   fun push (v, t) = O#push t v
   fun pop t = O#pop t ()
end
15954----
15955
15956The `Dispenser` module, which we can think of as an interface for
15957dispensers, implements the `ABSTRACT_DISPENSER` signature using
15958the dynamic dispatch technique, but we leave the signature ascription
15959until later.
15960
15961Then we define a `DispenserClass` functor that makes a "class" out of
15962a given dispenser module:
15963
15964[source,sml]
15965----
functor DispenserClass (D : DISPENSER) : DISPENSER = struct
   open Dispenser

   fun make d =
       I {isEmpty = fn () => D.isEmpty d,
          push = fn x => make (D.push (x, d)),
          pop = fn () =>
                   case D.pop d of
                      NONE => NONE
                    | SOME (x, d) => SOME (x, make d)}

   val empty =
       I {isEmpty = fn () => true,
          push = fn x => make (D.push (x, D.empty)),
          pop = fn () => NONE}
end
15982----
15983
15984Finally we seal the `Dispenser` module:
15985
15986[source,sml]
15987----
15988structure Dispenser : ABSTRACT_DISPENSER = Dispenser
15989----
15990
15991This isn't necessary for type safety, because the unsealed `Dispenser`
module does not allow one to break encapsulation, but it makes sure that
15993only the `DispenserClass` functor can create dispenser classes
15994(because the constructor `Dispenser.I` is no longer accessible).
15995
15996Using the `DispenserClass` functor we can turn any concrete dispenser
15997module into a dispenser class:
15998
15999[source,sml]
16000----
16001structure StackClass = DispenserClass (Stack)
16002structure QueueClass = DispenserClass (Queue)
16003----
16004
Each dispenser class implements the same dynamic dispatch interface
and the `ABSTRACT_DISPENSER` signature.

Because the dynamic dispatch `Dispenser` module implements the
`ABSTRACT_DISPENSER` signature, we can use it to instantiate the
`DispenserAlgs` functor:
16011
16012[source,sml]
16013----
16014structure D = DispenserAlgs (Dispenser)
16015----
16016
16017The resulting `D` module, like the `Dispenser` module, works with
16018any dispenser class and uses dynamic dispatch:
16019
16020[source,sml]
16021----
16022val [4, 3, 2, 1] = D.popAll (D.pushAll ([1, 2, 3, 4], StackClass.empty))
16023val [1, 2, 3, 4] = D.popAll (D.pushAll ([1, 2, 3, 4], QueueClass.empty))
16024----
16025
16026<<<
16027
16028:mlton-guide-page: OCaml
16029[[OCaml]]
16030OCaml
16031=====
16032
16033http://caml.inria.fr/[OCaml] is a variant of <:ML:> and is similar to
16034<:StandardML:Standard ML>.
16035
16036== OCaml and SML ==
16037
16038Here's a comparison of some aspects of the OCaml and SML languages.
16039
16040* Standard ML has a formal <:DefinitionOfStandardML:Definition>, while
16041OCaml is specified by its lone implementation and informal
16042documentation.
16043
16044* Standard ML has a number of <:StandardMLImplementations:compilers>,
16045while OCaml has only one.
16046
16047* OCaml has built-in support for object-oriented programming, while
16048Standard ML does not (however, see <:ObjectOrientedProgramming:>).
16049
16050* Andreas Rossberg has a
16051http://www.mpi-sws.org/%7Erossberg/sml-vs-ocaml.html[side-by-side
16052comparison] of the syntax of SML and OCaml.
16053
16054* Adam Chlipala has a
16055http://adam.chlipala.net/mlcomp[point-by-point comparison] of OCaml
16056and SML.
16057
16058== OCaml and MLton ==
16059
16060Here's a comparison of some aspects of OCaml and MLton.
16061
16062* Performance
16063
16064** Both OCaml and MLton have excellent performance.
16065
16066** MLton performs extensive <:WholeProgramOptimization:>, which can
16067provide substantial improvements in large, modular programs.
16068
16069** MLton uses native types, like 32-bit integers, without any penalty
16070due to tagging or boxing. OCaml uses 31-bit integers with a penalty
16071due to tagging, and 32-bit integers with a penalty due to boxing.
16072
16073** MLton uses native types, like 64-bit floats, without any penalty
16074due to boxing. OCaml, in some situations, boxes 64-bit floats.
16075
16076** MLton represents arrays of all types unboxed. In OCaml, only
16077arrays of 64-bit floats are unboxed, and then only when it is
16078syntactically apparent.
16079
16080** MLton represents records compactly by reordering and packing the
16081fields.
16082
16083** In MLton, polymorphic and monomorphic code have the same
16084performance. In OCaml, polymorphism can introduce a performance
16085penalty.
16086
16087** In MLton, module boundaries have no impact on performance. In
16088OCaml, moving code between modules can cause a performance penalty.
16089
16090** MLton's <:ForeignFunctionInterface:> is simpler than OCaml's.
16091
16092* Tools
16093
16094** OCaml has a debugger, while MLton does not.
16095
16096** OCaml supports separate compilation, while MLton does not.
16097
16098** OCaml compiles faster than MLton.
16099
16100** MLton supports profiling of both time and allocation.
16101
16102* Libraries
16103
16104** OCaml has more available libraries.
16105
16106* Community
16107
16108** OCaml has a larger community than MLton.
16109
16110** MLton has a very responsive
16111 http://www.mlton.org/mailman/listinfo/mlton[developer list].
16112
16113<<<
16114
16115:mlton-guide-page: OpenGL
16116[[OpenGL]]
16117OpenGL
16118======
16119
16120There are at least two interfaces to OpenGL for MLton/SML, both of
16121which should be considered alpha quality.
16122
16123* <:MikeThomas:> built a low-level interface, directly translating
16124many of the functions, covering GL, GLU, and GLUT. This is available
16125in the MLton <:Sources:>:
16126<!ViewGitDir(mltonlib,master,org/mlton/mike/opengl)>. The code
16127contains a number of small, standard OpenGL examples translated to
16128SML.
16129
16130* <:ChrisClearwater:> has written at least an interface to GL, and
16131possibly more. See
16132** http://mlton.org/pipermail/mlton/2005-January/026669.html
16133
16134<:Contact:> us for more information or an update on the status of
16135these projects.
16136
16137<<<
16138
16139:mlton-guide-page: OperatorPrecedence
16140[[OperatorPrecedence]]
16141OperatorPrecedence
16142==================
16143
<:StandardML:Standard ML> has a built-in notion of precedence for
certain symbols. Every program that includes the
<:BasisLibrary:Basis Library> automatically gets the following infix
declarations. A higher number indicates higher precedence.
16148
16149[source,sml]
16150----
16151infix 7 * / mod div
16152infix 6 + - ^
16153infixr 5 :: @
16154infix 4 = <> > >= < <=
16155infix 3 := o
16156infix 0 before
16157----
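
As a brief illustration of how these declarations affect parsing, the
following bindings use only Basis Library operators. The constant
pattern on the left of each binding states the value that the
expression is expected to have under the precedences above; the
bindings exist purely for illustration.

[source,sml]
----
(* `*` (7) binds tighter than `+` (6). *)
val 7 = 1 + 2 * 3

(* Operators declared with `infix` associate to the left. *)
val 5 = 10 - 3 - 2

(* `::` and `@` are declared with `infixr`, so they associate to the
 * right, and `+` (6) binds tighter than `::` (5). *)
val [3, 3, 4] = 1 + 2 :: 3 :: [] @ [4]

(* `^` (6) binds tighter than `=` (4). *)
val true = "a" ^ "b" = "ab"

(* `+` (6) binds tighter than `:=` (3). *)
val r = ref 0
val () = r := 1 + 2
----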
16158
16159<<<
16160
16161:mlton-guide-page: OptionalArguments
16162[[OptionalArguments]]
16163OptionalArguments
16164=================
16165
16166<:StandardML:Standard ML> does not have built-in support for optional
16167arguments. Nevertheless, using <:Fold:>, it is easy to define
16168functions that take optional arguments.
16169
16170For example, suppose that we have the following definition of a
16171function `f`.
16172
16173[source,sml]
16174----
16175fun f (i, r, s) =
16176 concat [Int.toString i, ", ", Real.toString r, ", ", s]
16177----
16178
16179Using the `OptionalArg` structure described below, we can define a
16180function `f'`, an optionalized version of `f`, that takes 0, 1, 2, or
161813 arguments. Embedded within `f'` will be default values for `i`,
16182`r`, and `s`. If `f'` gets no arguments, then all the defaults are
16183used. If `f'` gets one argument, then that will be used for `i`. Two
16184arguments will be used for `i` and `r` respectively. Three arguments
16185will override all default values. Calls to `f'` will look like the
16186following.
16187
16188[source,sml]
16189----
16190f' $
16191f' `2 $
16192f' `2 `3.0 $
16193f' `2 `3.0 `"four" $
16194----
16195
16196The optional argument indicator, +&grave;+, is not special syntax ---
16197it is a normal SML value, defined in the `OptionalArg` structure
16198below.
16199
16200Here is the definition of `f'` using the `OptionalArg` structure, in
16201particular, `OptionalArg.make` and `OptionalArg.D`.
16202
16203[source,sml]
16204----
val f' =
   fn z =>
   let open OptionalArg in
      make (D 1) (D 2.0) (D "three") $
   end (fn i & r & s => f (i, r, s))
   z
16211----
16212
16213The definition of `f'` is eta expanded as with all uses of fold. A
16214call to `OptionalArg.make` is supplied with a variable number of
16215defaults (in this case, three), the end-of-arguments terminator, `$`,
16216and the function to run, taking its arguments as an n-ary
16217<:ProductType:product>. In this case, the function simply converts
16218the product to an ordinary tuple and calls `f`. Often, the function
16219body will simply be written directly.
16220
16221In general, the definition of an optional-argument function looks like
16222the following.
16223
16224[source,sml]
16225----
val f =
   fn z =>
   let open OptionalArg in
      make (D <default1>) (D <default2>) ... (D <defaultn>) $
   end (fn x1 & x2 & ... & xn =>
        <function code goes here>)
   z
16233----
16234
16235Here is the definition of `OptionalArg`.
16236
16237[source,sml]
16238----
structure OptionalArg =
   struct
      val make =
         fn z =>
         Fold.fold
         ((id, fn (f, x) => f x),
          fn (d, r) => fn func =>
          Fold.fold ((id, d ()), fn (f, d) =>
                     let
                        val d & () = r (id, f d)
                     in
                        func d
                     end))
         z

      fun D d = Fold.step0 (fn (f, r) =>
                            (fn ds => f (d & ds),
                             fn (f, a & b) => r (fn x => f a & x, b)))

      val ` =
         fn z =>
         Fold.step1 (fn (x, (f, _ & d)) => (fn d => f (x & d), d))
         z
   end
16263----
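
The definition above assumes that `id`, `$`, `&`, and `Fold` are
already in scope; they are introduced elsewhere in this guide (see
<:Fold:> and <:ProductType:>). For convenience, a minimal sketch of
those helpers follows. It is reconstructed for illustration and omits
the signature and type annotations, so those pages remain the
authoritative versions.

[source,sml]
----
(* The identity function. *)
fun id x = x

(* The end-of-arguments terminator: it applies the finisher `f` to the
 * accumulated state `a`. *)
fun $ (a, f) = f a

(* Binary products. `infix &` makes `&` left associative. *)
datatype ('a, 'b) product = & of 'a * 'b
infix &

structure Fold =
   struct
      (* Start a fold with initial state `a` and finisher `f`. *)
      fun fold (a, f) g = g (a, f)
      (* Post-compose the finisher of the fold `w` with `g`. *)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
      (* A step that takes no argument. *)
      fun step0 h (a, f) = fold (h a, f)
      (* A step that takes one argument `b`. *)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end
----

With these helpers declared before `OptionalArg`, `f`, and `f'`, the
calls shown earlier only need the optional argument indicator in scope
at the call site, for example via `open OptionalArg`.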
16264
16265`OptionalArg.make` uses a nested fold. The first `fold` accumulates
16266the default values in a product, associated to the right, and a
16267reversal function that converts a product (of the same arity as the
16268number of defaults) from right associativity to left associativity.
16269The accumulated defaults are used by the second fold, which recurs
16270over the product, replacing the appropriate component as it encounters
16271optional arguments. The second fold also constructs a "fill"
16272function, `f`, that is used to reconstruct the product once the
16273end-of-arguments is reached. Finally, the finisher reconstructs the
16274product and uses the reversal function to convert the product from
16275right associative to left associative, at which point it is passed to
16276the user-supplied function.
16277
16278Much of the complexity comes from the fact that while recurring over a
16279product from left to right, one wants it to be right-associative,
16280e.g., look like
16281
16282[source,sml]
16283----
16284a & (b & (c & d))
16285----
16286
16287but the user function in the end wants the product to be left
16288associative, so that the product argument pattern can be written
16289without parentheses (since `&` is left associative).
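
The difference between the two associations can be seen in a small,
self-contained sketch. The product datatype is repeated here so that
the block stands on its own, and `toLeft3` is a hypothetical helper,
fixed at arity three, that plays the role of the reversal function.

[source,sml]
----
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Without parentheses, `&` groups to the left. *)
val (i & r) & s = 1 & 2.0 & "three"

(* The right-associated form must be parenthesized explicitly. *)
val i' & (r' & s') = 1 & (2.0 & "three")

(* Hypothetical helper: re-associate a three-element product from
 * right to left. *)
fun toLeft3 (a & (b & c)) = a & b & c

(* The result can be matched with an unparenthesized pattern. *)
val x & y & z = toLeft3 (1 & (2.0 & "three"))
----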
16290
16291
16292== Labelled optional arguments ==
16293
16294In addition to the positional optional arguments described above, it
16295is sometimes useful to have labelled optional arguments. These allow
16296one to define a function, `f`, with defaults, say `a` and `b`. Then,
16297a caller of `f` can supply values for `a` and `b` by name. If no
16298value is supplied then the default is used.
16299
16300Labelled optional arguments are a simple extension of
16301<:FunctionalRecordUpdate:> using post composition. Suppose, for
16302example, that one wants a function `f` with labelled optional
16303arguments `a` and `b` with default values `0` and `0.0` respectively.
16304If one has a functional-record-update function `updateAB` for records
16305with `a` and `b` fields, then one can define `f` in the following way.
16306
16307[source,sml]
16308----
val f =
   fn z =>
   Fold.post
   (updateAB {a = 0, b = 0.0},
    fn {a, b} => print (concat [Int.toString a, " ",
                                Real.toString b, "\n"]))
   z
16316----
16317
16318The idea is that `f` is the post composition (using `Fold.post`) of
16319the actual code for the function with a functional-record updater that
16320starts with the defaults.
16321
Here are some example calls to `f`.

16323[source,sml]
16324----
16325val () = f $
16326val () = f (U#a 13) $
16327val () = f (U#a 13) (U#b 17.5) $
16328val () = f (U#b 17.5) (U#a 13) $
16329----
16330
Notice that a caller can supply neither of the arguments, either of
the arguments, or both of the arguments, and in either order. All
that matters is that the arguments be labelled correctly (and of the
right type, of course).
16335
16336Here is another example.
16337
16338[source,sml]
16339----
val f =
   fn z =>
   Fold.post
   (updateBCD {b = 0, c = 0.0, d = "<>"},
    fn {b, c, d} =>
    print (concat [Int.toString b, " ",
                   Real.toString c, " ",
                   d, "\n"]))
   z
16349----
16350
16351Here are some example calls.
16352
16353[source,sml]
16354----
16355val () = f $
16356val () = f (U#d "goodbye") $
16357val () = f (U#d "hello") (U#b 17) (U#c 19.3) $
16358----
16359
16360<<<
16361
16362:mlton-guide-page: Overloading
16363[[Overloading]]
16364Overloading
16365===========
16366
16367In <:StandardML:Standard ML>, constants (like `13`, `0w13`, `13.0`)
16368are overloaded, meaning that they can denote a constant of the
16369appropriate type as determined by context. SML defines the
16370overloading classes _Int_, _Real_, and _Word_, which denote the sets
16371of types that integer, real, and word constants may take on. In
16372MLton, these are defined as follows.
16373
16374[cols="^25%,<75%"]
16375|=====
16376| _Int_ | `Int2.int`, `Int3.int`, ... `Int32.int`, `Int64.int`, `Int.int`, `IntInf.int`, `LargeInt.int`, `FixedInt.int`, `Position.int`
16377| _Real_ | `Real32.real`, `Real64.real`, `Real.real`, `LargeReal.real`
16378| _Word_ | `Word2.word`, `Word3.word`, ... `Word32.word`, `Word64.word`, `Word.word`, `LargeWord.word`, `SysWord.word`
16379|=====
16380
16381The <:DefinitionOfStandardML:Definition> allows flexibility in how
16382much context is used to resolve overloading. It says that the context
16383is _no larger than the smallest enclosing structure-level
16384declaration_, but that _an implementation may require that a smaller
16385context determines the type_. MLton uses the largest possible context
16386allowed by SML in resolving overloading. If the type of a constant is
16387not determined by context, then it takes on a default type. In MLton,
16388these are defined as follows.
16389
16390[cols="^25%,<75%"]
16391|=====
16392| _Int_ | `Int.int`
16393| _Real_ | `Real.real`
16394| _Word_ | `Word.word`
16395|=====
16396
16397Other implementations may use a smaller context or different default
16398types.
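
As a small illustration of the two tables above, the following
declarations show constants whose type is fixed by the context and
constants that fall back to the default type. This is a sketch written
with MLton in mind; other implementations may not provide all of these
structures.

[source,sml]
----
(* A type constraint determines the type of each constant. *)
val i64: Int64.int = 13
val big: IntInf.int = 13
val w8: Word8.word = 0w13
val r32: Real32.real = 13.0

(* With no constraint from the context, the defaults apply. *)
val i = 13      (* Int.int *)
val w = 0w13    (* Word.word *)
val r = 13.0    (* Real.real *)

(* The context need not be a constraint on the constant itself: here
 * the annotation on `w` determines the type of `0w1`. *)
fun inc8 (w: Word8.word) = w + 0w1
----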
16399
16400== Also see ==
16401
16402 * http://www.standardml.org/Basis/top-level-chapter.html[discussion of overloading in the Basis Library]
16403
16404== Examples ==
16405
16406 * The following program is rejected.
16407+
16408[source,sml]
16409----
structure S:
   sig
      val x: Word8.word
   end =
   struct
      val x = 0w0
   end
16417----
16418+
The smallest enclosing structure-level declaration for `0w0` is
`val x = 0w0`. Hence, `0w0` receives the default type for words,
which is `Word.word`, and this does not match the `Word8.word`
required by the signature.
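
One way to make a program like this acceptable, sketched here as an
illustration rather than as the only option, is to constrain the
constant within the declaration itself, so that the context determines
its type before any default is applied.

[source,sml]
----
structure S:
   sig
      val x: Word8.word
   end =
   struct
      (* The constraint is part of the declaration that contains the
       * constant, so 0w0 is resolved to Word8.word. *)
      val x: Word8.word = 0w0
   end
----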
16422
16423<<<
16424
16425:mlton-guide-page: PackedRepresentation
16426[[PackedRepresentation]]
16427PackedRepresentation
16428====================
16429
16430<:PackedRepresentation:> is an analysis pass for the <:SSA2:>
16431<:IntermediateLanguage:>, invoked from <:ToRSSA:>.
16432
16433== Description ==
16434
This pass analyzes an <:SSA2:> program to compute a packed
representation for each object.
16437
16438== Implementation ==
16439
16440* <!ViewGitFile(mlton,master,mlton/backend/representation.sig)>
16441* <!ViewGitFile(mlton,master,mlton/backend/packed-representation.fun)>
16442
16443== Details and Notes ==
16444
16445Has a special case to make sure that `true` is represented as `1` and
16446`false` is represented as `0`.
16447
16448<<<
16449
16450:mlton-guide-page: ParallelMove
16451[[ParallelMove]]
16452ParallelMove
16453============
16454
<:ParallelMove:> is a rewrite pass that is agnostic to the
<:IntermediateLanguage:> it produces.
16457
16458== Description ==
16459
This function computes a sequence of individual moves that has the
same effect as a parallel move, in which the sources (froms) and
destinations (tos) may overlap.
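
The problem can be sketched in a few lines of SML. The following is
not the code in `parallel-move.fun`; it is a simplified illustration
that models registers as strings, assumes all destinations are
distinct, and assumes the caller supplies a scratch register `temp`
that does not occur in the moves. The names `move` and
`sequentialize` are hypothetical.

[source,sml]
----
(* A move copies the contents of register `src` into register `dst`. *)
type move = {src: string, dst: string}

(* Returns moves that, executed one after another, have the same
 * effect as executing all of `moves` at once. *)
fun sequentialize (moves: move list, temp: string): move list =
   let
      fun loop (pending: move list, acc: move list) =
         case pending of
            [] => List.rev acc
          | {src = saved, dst = _} :: _ =>
               let
                  (* A move is safe to emit once no pending move still
                   * needs the old contents of its destination. *)
                  fun isRead r =
                     List.exists (fn {src, dst = _} => src = r) pending
                  val (ready, blocked) =
                     List.partition
                     (fn {src = _, dst} => not (isRead dst)) pending
               in
                  case ready of
                     m :: ready' => loop (ready' @ blocked, m :: acc)
                   | [] =>
                        (* Only cycles remain: save one source in
                         * `temp` and redirect its readers there. *)
                        let
                           fun redirect {src, dst} =
                              {src = if src = saved then temp else src,
                               dst = dst}
                        in
                           loop (List.map redirect pending,
                                 {src = saved, dst = temp} :: acc)
                        end
               end
   in
      (* Moves from a register to itself are no-ops. *)
      loop (List.filter (fn {src, dst} => src <> dst) moves, [])
   end

(* Swapping a and b goes through the scratch register:
 * tmp := a, a := b, b := tmp. *)
val swap =
   sequentialize ([{src = "a", dst = "b"}, {src = "b", dst = "a"}], "tmp")
----

Emitting one ready move at a time keeps the sketch easy to check; a
production implementation would likely avoid rescanning the pending
list on every step.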
16462
16463== Implementation ==
16464
16465* <!ViewGitFile(mlton,master,mlton/backend/parallel-move.sig)>
16466* <!ViewGitFile(mlton,master,mlton/backend/parallel-move.fun)>
16467
16468== Details and Notes ==
16469
16470{empty}
16471
16472<<<
16473
16474:mlton-guide-page: Performance
16475[[Performance]]
16476Performance
16477===========
16478
16479This page compares the performance of a number of SML compilers on a
16480range of benchmarks.
16481
The following SML compiler versions are compared.
16483
16484* <:Home:MLton> 20171211 (git 79d4a623c)
16485* <:MLKit:ML Kit> 4.3.12 (20171210)
16486* <:MoscowML:Moscow ML> 2.10.1 ++ (git f529b33bb, 20170711)
16487* <:PolyML:Poly/ML> 5.7.2 Testing (git 5.7.1-35-gcb73407a)
16488* <:SMLNJ:SML/NJ> 110.81 (20170501)
16489
16490There are tables for <:#RunTime:run time>, <:#CodeSize:code size>, and
16491<:#CompileTime:compile time>.
16492
16493
16494== Setup ==
16495
All benchmarks were compiled and run on a 2.6 GHz Core i7-5600U with 16 GB of
RAM. The benchmarks were compiled with the default settings for all
16498the compilers, except for Moscow ML, which was passed the
16499`-orthodox -standalone -toplevel` switches. The Poly/ML executables
16500were produced using `polyc`.
16501The SML/NJ executables were produced by wrapping the entire program in
16502a `local` declaration whose body performs an `SMLofNJ.exportFn`.
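
The SML/NJ wrapping described above might look roughly like the
following. This is a sketch for SML/NJ only, not the exact text
generated by the benchmark driver: the structure name `Main`, the
function `doit`, and the heap name `"benchmark"` are assumptions made
for illustration.

[source,sml]
----
local
   (* The entire benchmark program is placed here. `Main.doit` is a
    * stand-in for whatever entry point the program defines. *)
   structure Main =
      struct
         fun doit () = print "benchmark body goes here\n"
      end
in
   (* exportFn writes a heap image named after its first argument; the
    * given function runs when the SML/NJ runtime loads that image. *)
   val _ =
      SMLofNJ.exportFn
      ("benchmark",
       fn (_: string, _: string list) =>
          (Main.doit (); OS.Process.success))
end
----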
16503
16504For more details, or if you want to run the benchmarks yourself,
16505please see the <!ViewGitDir(mlton,master,benchmark)> directory of our
16506<:Sources:>.
16507
16508All of the benchmarks are available for download from this page. Some
16509of the benchmarks were obtained from the SML/NJ benchmark suite. Some
16510of the benchmarks expect certain input files to exist in the
16511<!ViewGitDir(mlton,master,benchmark/tests/DATA)> subdirectory.
16512
16513* <!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/hamlet-input.sml)>
16514* <!RawGitFile(mlton,master,benchmark/tests/ray.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ray)>
16515* <!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/chess.gml)>
16516* <!RawGitFile(mlton,master,benchmark/tests/vliw.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ndotprod.s)>
16517
16518
16519== <!Anchor(RunTime)>Run-time ratio ==
16520
16521The following table gives the ratio of the run time of each benchmark
16522when compiled by another compiler to the run time when compiled by
16523MLton. That is, the larger the number, the slower the generated code
16524runs. A number larger than one indicates that the corresponding
16525compiler produces code that runs more slowly than MLton. A * in an
16526entry means the compiler failed to compile the benchmark or that the
16527benchmark failed to run.
16528
16529[options="header",cols="<2,5*<1"]
16530|====
16531|benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16532|<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|1.00|10.11|19.36|2.98|1.24
16533|<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|1.00|*|7.87|1.22|1.75
16534|<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|1.00|30.79|*|10.94|9.08
16535|<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|1.00|6.51|40.42|2.34|2.32
16536|<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|1.00|0.97|*|0.60|*
16537|<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|1.00|0.50|11.50|0.42|0.42
16538|<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|1.00|7.35|81.51|4.03|1.19
16539|<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|1.00|1.41|10.94|1.25|1.17
16540|<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|1.00|7.19|68.33|5.28|13.16
16541|<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1.00|4.97|22.85|1.58|*
16542|<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|1.00|4.99|57.84|3.34|4.67
16543|<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|1.00|*|18.43|3.18|3.06
16544|<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|1.00|2.76|7.94|3.19|*
16545|<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|1.00|1.80|20.19|0.89|1.50
16546|<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|1.00|5.10|11.06|1.15|1.27
16547|<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|1.00|3.50|25.52|1.33|1.28
16548|<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|1.00|29.40|183.02|7.41|15.19
16549|<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|1.00|95.18|*|32.61|47.47
16550|<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|1.00|1.42|*|0.74|3.24
16551|<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|1.00|1.83|8.45|0.84|*
16552|<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|1.00|4.03|12.42|1.70|2.25
16553|<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|1.00|3.73|57.44|2.05|3.22
16554|<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|1.00|3.96|*|1.73|1.20
16555|<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|1.00|6.26|30.85|7.82|5.99
16556|<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|1.00|9.37|44.78|2.18|2.15
16557|<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|1.00|*|*|2.79|3.59
16558|<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|1.00|5.68|165.56|3.92|37.52
16559|<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|1.00|12.05|25.08|8.73|1.75
16560|<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|1.00|*|*|2.11|3.33
16561|<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|1.00|2.95|24.03|3.67|1.93
16562|<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|1.00|*|*|1.04|*
16563|<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|1.00|1.88|28.01|0.70|2.67
16564|<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|1.00|1.58|23.57|0.90|1.04
16565|<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|1.00|1.69|15.90|1.57|2.01
16566|<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|1.00|*|*|*|2.07
16567|<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|1.00|2.19|66.76|3.27|1.48
16568|<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|1.00|*|19.43|1.08|1.03
16569|<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|1.00|13.85|*|1.80|12.48
16570|<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|1.00|*|*|*|13.92
16571|<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|1.00|7.88|68.85|9.39|68.80
16572|<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|1.00|2.46|15.39|1.43|1.55
16573|<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|1.00|6.00|*|29.25|9.54
16574|<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|1.00|80.43|*|19.45|8.71
16575|<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|1.00|4.62|35.56|1.68|9.97
16576|<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|1.00|*|*|*|1.60
16577|====
16578
16579<!Anchor(SNFNote)>
16580Note: for SML/NJ, the
16581<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>
16582benchmark was killed after running for over 51,000 seconds.
16583
16584
16585== <!Anchor(CodeSize)>Code size ==
16586
16587The following table gives the code size of each benchmark in bytes.
16588The size for MLton and the ML Kit is the sum of text and data for the
16589standalone executable as reported by `size`. The size for Moscow
16590ML is the size in bytes of the executable `a.out`. The size for
16591Poly/ML is the difference in size of the database before the session
16592start and after the commit. The size for SML/NJ is the size of the
16593heap file created by `exportFn` and does not include the size of
16594the SML/NJ runtime system (approximately 100K). A * in an entry means
16595that the compiler failed to compile the benchmark.
16596
16597[options="header",cols="<2,5*<1"]
16598|====
16599|benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16600|<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|180,788|810,267|199,503|148,120|402,480
16601|<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|250,246|*|248,018|196,984|496,664
16602|<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|122,422|225,274|*|106,088|406,560
16603|<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|151,878|250,126|187,048|144,032|428,136
16604|<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|223,073|827,483|*|272,664|*
16605|<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|122,350|87,586|181,415|106,072|380,928
16606|<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|145,008|237,230|186,228|131,400|418,896
16607|<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|122,310|87,402|181,312|106,088|380,928
16608|<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|121,958|104,102|181,464|106,072|394,256
16609|<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1,503,849|2,280,691|407,219|2,249,504|*
16610|<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|122,078|89,346|181,470|106,088|381,952
16611|<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|193,145|*|192,659|161,080|400,408
16612|<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|308,296|826,819|213,128|268,272|*
16613|<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|141,862|721,419|186,463|118,552|384,024
16614|<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|211,086|782,667|188,908|198,408|409,624
16615|<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|122,086|700,075|183,037|106,104|386,048
16616|<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|124,398|280,006|184,328|110,232|416,784
16617|<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|150,497|271,794|*|122,624|399,416
16618|<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|123,846|100,858|181,542|106,136|381,960
16619|<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|678,920|1,233,587|263,721|576,728|*
16620|<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|846,779|1,432,283|297,108|777,664|985,304
16621|<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|124,126|229,078|184,440|114,584|392,232
16622|<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|298,038|507,186|*|475,808|456,744
16623|<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|157,973|699,003|181,680|118,800|380,928
16624|<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|156,401|201,138|183,438|110,456|385,072
16625|<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|126,486|106,166|*|106,088|393,256
16626|<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|150,174|265,694|190,088|184,536|414,760
16627|<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|260,863|736,795|195,064|198,976|512,160
16628|<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|384,905|*|*|446,424|623,824
16629|<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|365,578|895,139|197,765|1,051,952|708,696
16630|<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|286,474|*|*|262,616|547,984
16631|<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|119,102|140,626|183,249|106,088|390,160
16632|<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|122,110|87,890|181,369|106,072|381,952
16633|<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|122,246|87,402|181,349|106,088|376,832
16634|<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|186,545|*|*|*|421,984
16635|<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|163,033|722,571|188,634|126,984|393,264
16636|<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|235,449|*|195,401|184,816|478,296
16637|<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|123,790|104,398|*|106,200|394,256
16638|<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|123,846|*|*|*|405,552
16639|<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|122,982|104,614|181,534|106,072|394,256
16640|<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|538,074|1,182,851|249,884|580,792|749,752
16641|<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|186,152|699,459|191,347|127,200|386,048
16642|<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|196,232|700,131|191,539|127,232|387,072
16643|<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|230,433|128,354|186,322|127,048|390,184
16644|<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|156,902|*|*|*|453,768
16645|====
16646
16647
16648== <!Anchor(CompileTime)>Compile time ==
16649
16650The following table gives the compile time of each benchmark in
16651seconds. A * in an entry means that the compiler failed to compile
16652the benchmark.
16653
16654[options="header",cols="<2,5*<1"]
16655|====
16656|benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ
16657|<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|2.70|0.89|0.15|0.29|0.20
16658|<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|2.87|*|0.14|0.20|0.41
16659|<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|2.21|0.24|*|0.07|0.05
16660|<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|2.28|0.34|0.04|0.11|0.21
16661|<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|2.93|1.01|*|0.27|*
16662|<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|2.23|0.20|0.01|0.07|0.04
16663|<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|2.35|0.28|0.03|0.09|0.10
16664|<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|2.16|0.19|0.01|0.07|0.04
16665|<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|2.16|0.20|0.01|0.07|0.04
16666|<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|12.28|19.25|23.75|6.44|*
16667|<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|2.14|0.20|0.01|0.08|0.04
16668|<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|2.48|*|0.08|0.14|0.23
16669|<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|3.31|0.75|0.15|0.22|*
16670|<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|2.25|0.32|0.03|0.09|0.10
16671|<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|2.72|0.57|0.07|0.17|0.21
16672|<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|2.14|0.24|0.01|0.07|0.04
16673|<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|2.14|0.24|0.01|0.08|0.05
16674|<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|2.31|0.39|*|0.12|0.27
16675|<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|2.15|0.21|0.01|0.07|0.04
16676|<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|7.07|4.53|2.05|0.80|*
16677|<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|6.78|4.76|1.20|1.65|4.78
16678|<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|2.14|0.28|0.02|0.08|0.07
16679|<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|3.96|2.12|*|0.37|0.49
16680|<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|2.30|0.22|0.01|0.07|0.04
16681|<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|2.26|0.20|0.01|0.07|0.04
16682|<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|2.12|0.22|*|9.83|12.55
16683|<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|2.59|0.47|0.07|0.16|0.24
16684|<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|2.95|0.46|0.05|0.17|0.14
16685|<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|3.93|*|*|0.45|0.74
16686|<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|3.42|1.23|0.30|0.32|0.53
16687|<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|3.23|*|*|0.15|0.32
16688|<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|2.25|0.28|0.01|0.08|0.05
16689|<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|2.24|0.21|0.01|0.08|0.05
16690|<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|2.23|0.20|0.01|0.08|0.05
16691|<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|2.73|*|*|*|0.44
16692|<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|2.42|0.38|0.05|0.11|0.11
16693|<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|2.93|*|0.10|0.27|0.31
16694|<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|2.23|0.22|*|0.07|0.04
16695|<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|2.18|*|*|*|0.04
16696|<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|2.23|0.22|0.01|0.08|0.05
16697|<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|5.25|2.93|0.63|0.94|1.85
16698|<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|2.46|0.24|0.01|0.08|0.05
16699|<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|2.61|0.25|0.01|0.08|0.05
16700|<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|2.99|0.35|0.03|0.09|0.11
16701|<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|2.31|*|*|*|0.11
16702|====
16703
16704<<<
16705
16706:mlton-guide-page: PhantomType
16707[[PhantomType]]
16708PhantomType
16709===========
16710
A phantom type is a type that has no run-time representation, but is
used to force the type checker to ensure invariants at compile time.
This is done by augmenting a type with additional arguments (phantom
type variables) and expressing constraints by choosing phantom types
to instantiate the phantom type variables in the types of values.
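
A minimal sketch of the idea, using units of length as the invariant,
is shown below. The structure `Length` and all of its members are
hypothetical names chosen for illustration; the examples cited below
are considerably richer.

[source,sml]
----
structure Length :>
   sig
      type 'u t          (* 'u is a phantom type variable *)
      type meters        (* phantom types, used only as arguments to t *)
      type feet
      val meters: real -> meters t
      val feet: real -> feet t
      val add: 'u t * 'u t -> 'u t
      val toReal: 'u t -> real
   end =
   struct
      type 'u t = real   (* the representation ignores 'u *)
      type meters = unit
      type feet = unit
      fun meters (r: real) = r
      fun feet (r: real) = r
      fun add (x: real, y: real) = x + y
      fun toReal (r: real) = r
   end

(* Adding two lengths in the same unit is accepted. *)
val total = Length.add (Length.meters 1.5, Length.meters 2.5)
val totalAsReal = Length.toReal total

(* Mixing units is rejected by the type checker, because `meters t`
 * and `feet t` are different types:
 *
 *   val bad = Length.add (Length.meters 1.0, Length.feet 3.0)
 *)
----

Because of the opaque signature ascription, `meters` and `feet` are
distinct abstract types outside the structure, even though every
length is represented as a plain `real` at run time.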
16716
16717== Also see ==
16718
16719* <!Cite(Blume01)>
16720** dimensions
16721** C type system
16722* <!Cite(FluetPucella06)>
16723** subtyping
16724* socket module in <:BasisLibrary:Basis Library>
16725
16726<<<
16727
16728:mlton-guide-page: PlatformSpecificNotes
16729[[PlatformSpecificNotes]]
16730PlatformSpecificNotes
16731=====================
16732
16733Here are notes about using MLton on the following platforms.
16734
16735== Operating Systems ==
16736
16737* <:RunningOnAIX:AIX>
16738* <:RunningOnCygwin:Cygwin>
16739* <:RunningOnDarwin:Darwin>
16740* <:RunningOnFreeBSD:FreeBSD>
16741* <:RunningOnHPUX:HPUX>
16742* <:RunningOnLinux:Linux>
16743* <:RunningOnMinGW:MinGW>
16744* <:RunningOnNetBSD:NetBSD>
16745* <:RunningOnOpenBSD:OpenBSD>
16746* <:RunningOnSolaris:Solaris>
16747
16748== Architectures ==
16749
16750* <:RunningOnAMD64:AMD64>
16751* <:RunningOnHPPA:HPPA>
16752* <:RunningOnPowerPC:PowerPC>
16753* <:RunningOnPowerPC64:PowerPC64>
16754* <:RunningOnSparc:Sparc>
16755* <:RunningOnX86:X86>
16756
16757== Also see ==
16758
16759* <:PortingMLton:>
16760
16761<<<
16762
16763:mlton-guide-page: PolyEqual
16764[[PolyEqual]]
16765PolyEqual
16766=========
16767
16768<:PolyEqual:> is an optimization pass for the <:SSA:>
16769<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
16770
16771== Description ==
16772
16773This pass implements polymorphic equality.
16774
16775== Implementation ==
16776
16777* <!ViewGitFile(mlton,master,mlton/ssa/poly-equal.fun)>
16778
16779== Details and Notes ==
16780
For each datatype, tycon, and vector type, it builds an equality
16782function and translates calls to `MLton_equal` into calls to that
16783function.
16784
16785Also generates calls to `Word_equal`.
16786
16787For tuples, it does the equality test inline; i.e., it does not create
16788a separate equality function for each tuple type.
16789
16790All equality functions are created only if necessary, i.e., if
16791equality is actually used at a type.
16792
16793Optimizations:
16794
16795* for datatypes that are enumerations, do not build a case dispatch,
16796just use `MLton_eq`, as the backend will represent these as ints
16797
16798* deep equality always does an `MLton_eq` test first
16799
16800* If one argument to `=` is a constant and the type will get
16801translated to an `IntOrPointer`, then just use `eq` instead of the
16802full equality. This is important for implementing code like the
16803following efficiently:
16804+
16805----
16806if x = 0 ... (* where x is of type IntInf.int *)
16807----
16808
* Also convert pointer equality on scalar types to type-specific
primitives.
16811
16812<<<
16813
16814:mlton-guide-page: PolyHash
16815[[PolyHash]]
16816PolyHash
16817========
16818
16819<:PolyHash:> is an optimization pass for the <:SSA:>
16820<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
16821
16822== Description ==
16823
16824This pass implements polymorphic, structural hashing.
16825
16826== Implementation ==
16827
16828* <!ViewGitFile(mlton,master,mlton/ssa/poly-hash.fun)>
16829
16830== Details and Notes ==
16831
For each datatype, tycon, and vector type, it builds a hash
function and translates calls to `MLton_hash` into calls to that
function.

For tuples, it does the hashing inline; i.e., it does not create
a separate hash function for each tuple type.

All hash functions are created only if necessary, i.e., if
hashing is actually used at a type.
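
A small sketch of structural hashing as seen from user code. It assumes the MLton-specific function `MLton.hash: 'a -> Word32.word`, so it compiles only with MLton, and it assumes that structurally equal values hash alike.

[source,sml]
----
(* MLton-specific: MLton.hash is not part of the Basis Library. *)
val h1 = MLton.hash [1, 2, 3]
val h2 = MLton.hash [1, 1 + 1, 1 + 1 + 1]

(* The two lists are structurally equal, so their hashes should agree. *)
val sameHash = h1 = h2
----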
16841
16842<<<
16843
16844:mlton-guide-page: PolyML
16845[[PolyML]]
16846PolyML
16847======
16848
16849http://www.polyml.org/[Poly/ML] is a
16850<:StandardMLImplementations:Standard ML implementation>.
16851
16852== Also see ==
16853
16854 * <!Cite(Matthews95)>
16855
16856<<<
16857
16858:mlton-guide-page: PolymorphicEquality
16859[[PolymorphicEquality]]
16860PolymorphicEquality
16861===================
16862
16863Polymorphic equality is a built-in function in
16864<:StandardML:Standard ML> that compares two values of the same type
16865for equality. It is specified as
16866
16867[source,sml]
16868----
16869val = : ''a * ''a -> bool
16870----
16871
16872The `''a` in the specification are
16873<:EqualityTypeVariable:equality type variables>, and indicate that
16874polymorphic equality can only be applied to values of an
16875<:EqualityType:equality type>. It is not allowed in SML to rebind
16876`=`, so a programmer is guaranteed that `=` always denotes polymorphic
16877equality.
16878
16879
16880== Equality of ground types ==
16881
16882Ground types like `char`, `int`, and `word` may be compared (to values
16883of the same type). For example, `13 = 14` is type correct and yields
16884`false`.
16885
16886
16887== Equality of reals ==
16888
The one ground type that cannot be compared is `real`. So,
16890`13.0 = 14.0` is not type correct. One can use `Real.==` to compare
16891reals for equality, but beware that this has different algebraic
16892properties than polymorphic equality.
16893
16894See http://standardml.org/Basis/real.html for a discussion of why
16895`real` is not an equality type.
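
The difference in algebraic properties can be seen in a short sketch that uses only Basis Library functions (the value names are illustrative). `Real.==` is not reflexive, because a NaN is unequal to itself, and it equates two zeros that other operations can tell apart.

[source,sml]
----
val nan = 0.0 / 0.0                        (* real division yields NaN, no exception *)
val negZero = Real.copySign (0.0, ~1.0)    (* negative zero *)

val nanEqualsItself = Real.== (nan, nan)   (* false: not reflexive *)
val zerosEqual = Real.== (0.0, negZero)    (* true: signs of zeros are ignored *)
val zerosDistinguishable =
   Real.signBit 0.0 <> Real.signBit negZero   (* true *)
----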
16896
16897
16898== Equality of functions ==
16899
16900Comparison of functions is not allowed.
16901
16902
16903== Equality of immutable types ==
16904
16905Polymorphic equality can be used on <:Immutable:immutable> values like
16906tuples, records, lists, and vectors. For example,
16907
[source,sml]
----
16909(1, 2, 3) = (4, 5, 6)
16910----
16911
16912is a type-correct expression yielding `false`, while
16913
[source,sml]
----
16915[1, 2, 3] = [1, 2, 3]
16916----
16917
16918is type correct and yields `true`.
16919
16920Equality on immutable values is computed by structure, which means
16921that values are compared by recursively descending the data structure
16922until ground types are reached, at which point the ground types are
16923compared with primitive equality tests (like comparison of
16924characters). So, the expression
16925
[source,sml]
----
16927[1, 2, 3] = [1, 1 + 1, 1 + 1 + 1]
16928----
16929
16930is guaranteed to yield `true`, even though the lists may occupy
16931different locations in memory.
16932
16933Because of structural equality, immutable values can only be compared
16934if their components can be compared. For example, `[1, 2, 3]` can be
compared, but `[1.0, 2.0, 3.0]` cannot. The SML type system uses
16936<:EqualityType:equality types> to ensure that structural equality is
16937only applied to valid values.
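
A sketch of how the restriction surfaces in a polymorphic function (`member` is an illustrative name): the equality type variable `''a` lets the function use `=` and limits its callers to equality types.

[source,sml]
----
(* ''a restricts member to types that admit equality. *)
fun member (x: ''a, ys: ''a list): bool =
   List.exists (fn y => y = x) ys

val hasTwo = member (2, [1, 2, 3])                       (* true *)
val hasPair = member ((1, "b"), [(1, "a"), (2, "b")])    (* false *)
(* member (2.0, [1.0, 2.0]) is rejected: real is not an equality type. *)
----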
16938
16939
16940== Equality of mutable values ==
16941
16942In contrast to immutable values, polymorphic equality of
16943<:Mutable:mutable> values (like ref cells and arrays) is performed by
16944pointer comparison, not by structure. So, the expression
16945
[source,sml]
----
16947ref 13 = ref 13
16948----
16949
16950is guaranteed to yield `false`, even though the ref cells hold the
16951same contents.
16952
16953Because equality of mutable values is not structural, arrays and refs
16954can be compared _even if their components are not equality types_.
16955Hence, the following expression is type correct (and yields true).
16956
16957[source,sml]
16958----
16959let
16960 val r = ref 13.0
16961in
16962 r = r
16963end
16964----
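
The same holds for arrays. In the following sketch (value names are illustrative), two arrays with equal contents are unequal under `=`. To compare contents, one can first convert to an immutable vector, which is compared by structure.

[source,sml]
----
val a = Array.fromList [1, 2, 3]
val b = Array.fromList [1, 2, 3]

val sameArray = a = b                               (* false: distinct arrays *)
val sameContents = Array.vector a = Array.vector b  (* true: structural *)
----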
16965
16966
16967== Equality of datatypes ==
16968
16969Polymorphic equality of datatypes is structural. Two values of the
16970same datatype are equal if they are of the same <:Variant:variant> and
16971if the <:Variant:variant>'s arguments are equal (recursively). So,
16972with the datatype
16973
16974[source,sml]
16975----
16976datatype t = A | B of t
16977----
16978
16979then `B (B A) = B A` is type correct and yields `false`, while `A = A`
16980and `B A = B A` yield `true`.
16981
16982As polymorphic equality descends two values to compare them, it uses
16983pointer equality whenever it reaches a mutable value. So, with the
16984datatype
16985
16986[source,sml]
16987----
16988datatype t = A of int ref | ...
16989----
16990
16991then `A (ref 13) = A (ref 13)` is type correct and yields `false`,
16992because the pointer equality on the two ref cells yields `false`.
16993
16994One weakness of the SML type system is that datatypes do not inherit
16995the special property of the `ref` and `array` type constructors that
16996allows them to be compared regardless of their component type. For
16997example, after declaring
16998
16999[source,sml]
17000----
17001datatype 'a t = A of 'a ref
17002----
17003
17004one might expect to be able to compare two values of type `real t`,
17005because pointer comparison on a ref cell would suffice.
17006Unfortunately, the type system can only express that a user-defined
17007datatype <:AdmitsEquality:admits equality> or not. In this case, `t`
17008admits equality, which means that `int t` can be compared but that
`real t` cannot. We can confirm this with the program
17010
17011[source,sml]
17012----
17013datatype 'a t = A of 'a ref
17014fun f (x: real t, y: real t) = x = y
17015----
17016
17017on which MLton reports the following error.
17018
17019----
17020Error: z.sml 2.32-2.36.
17021 Function applied to incorrect argument.
17022 expects: [<equality>] t * [<equality>] t
17023 but got: [real] t * [real] t
17024 in: = (x, y)
17025----
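
One possible workaround, sketched here with the hypothetical name `same`, is to write the comparison by hand. Pattern matching exposes the ref cell, and `'a ref` can be compared for any `'a`, so the function needs no equality constraint on the component type.

[source,sml]
----
datatype 'a t = A of 'a ref

(* Compares the underlying ref cells by pointer; works for any 'a. *)
fun same (A r: 'a t, A r': 'a t): bool = r = r'

val r = ref 13.0
val sameCell = same (A r, A r)                  (* true *)
val distinctCells = same (A r, A (ref 13.0))    (* false *)
----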
17026
17027
17028== Implementation ==
17029
17030Polymorphic equality is implemented by recursively descending the two
17031values being compared, stopping as soon as they are determined to be
17032unequal, or exploring the entire values to determine that they are
17033equal. Hence, polymorphic equality can take time proportional to the
17034size of the smaller value.
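
As an illustration only, the recursive descent for the datatype `t = A | B of t` can be written by hand as follows. The name `equalT` is hypothetical, and this is a source-level analogue, not the code MLton generates.

[source,sml]
----
datatype t = A | B of t

(* Stops at the first mismatching variant; otherwise descends. *)
fun equalT (A, A) = true
  | equalT (B x, B y) = equalT (x, y)
  | equalT _ = false

(* The hand-written function agrees with polymorphic equality. *)
val agrees =
   List.all (fn (x, y) => equalT (x, y) = (x = y))
            [(A, A), (B A, B A), (B (B A), B A), (A, B A)]
----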
17035
17036MLton uses some optimizations to improve performance.
17037
17038* When computing structural equality, first do a pointer comparison.
17039If the comparison yields `true`, then stop and return `true`, since
17040the structural comparison is guaranteed to do so. If the pointer
17041comparison fails, then recursively descend the values.
17042
17043* If a datatype is an enum (e.g. `datatype t = A | B | C`), then a
17044single comparison suffices to compare values of the datatype. No case
17045dispatch is required to determine whether the two values are of the
17046same <:Variant:variant>.
17047
17048* When comparing a known constant non-value-carrying
17049<:Variant:variant>, use a single comparison. For example, the
17050following code will compile into a single comparison for `A = x`.
17051+
17052[source,sml]
17053----
17054datatype t = A | B | C of ...
17055fun f x = ... if A = x then ...
17056----
17057
17058* When comparing a small constant `IntInf.int` to another
17059`IntInf.int`, use a single comparison against the constant. No case
17060dispatch is required.
17061
17062
17063== Also see ==
17064
17065* <:AdmitsEquality:>
17066* <:EqualityType:>
17067* <:EqualityTypeVariable:>
17068
17069<<<
17070
17071:mlton-guide-page: Polyvariance
17072[[Polyvariance]]
17073Polyvariance
17074============
17075
17076Polyvariance is an optimization pass for the <:SXML:>
17077<:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
17078
17079== Description ==
17080
This pass duplicates a higher-order, `let`-bound function at each
17082variable reference, if the cost is smaller than some threshold.
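
The effect can be pictured at the source level, with the caveat that the pass operates on <:SXML:>, not on source code, and that whether it fires depends on its cost threshold. In this sketch (all names are illustrative), `twice` is higher-order and `let`-bound, and it is referenced twice; duplication gives each reference its own copy, so each copy sees a single functional argument.

[source,sml]
----
(* Before: one function shared by two references. *)
val shared =
   let
      fun twice f x = f (f x)
   in
      (twice (fn n => n + 1) 0, twice (fn n => n * 2) 1)
   end

(* After (conceptually): one copy per reference. *)
val duplicated =
   let
      fun twice1 f x = f (f x)
      fun twice2 f x = f (f x)
   in
      (twice1 (fn n => n + 1) 0, twice2 (fn n => n * 2) 1)
   end
----

Both expressions evaluate to the same pair; only the structure of the program changes.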
17083
17084== Implementation ==
17085
17086* <!ViewGitFile(mlton,master,mlton/xml/polyvariance.fun)>
17087
17088== Details and Notes ==
17089
17090{empty}
17091
17092<<<
17093
17094:mlton-guide-page: Poplog
17095[[Poplog]]
17096Poplog
17097======
17098
17099http://www.cs.bham.ac.uk/research/poplog/poplog.info.html[POPLOG] is a
17100development environment that includes implementations of a number of
17101languages, including <:StandardML:Standard ML>.
17102
17103While POPLOG is actively developed, the <:ML:> support predates
17104<:DefinitionOfStandardML:SML'97>, and there is no support for the
17105<:BasisLibrary:Basis Library>
17106http://www.standardml.org/Basis[specification].
17107
17108== Also see ==
17109
17110 * http://www.cs.bham.ac.uk/research/poplog/doc/pmlhelp/mlinpop[Mixed-language programming in ML and Pop-11].
17111
17112<<<
17113
17114:mlton-guide-page: PortingMLton
17115[[PortingMLton]]
17116PortingMLton
17117============
17118
17119Porting MLton to a new target platform (architecture or OS) involves
17120the following steps.
17121
171221. Make the necessary changes to the scripts, runtime system,
17123<:BasisLibrary: Basis Library> implementation, and compiler.
17124
171252. Get the regressions working using a cross compiler.
17126
171273. <:CrossCompiling: Cross compile> MLton and bootstrap on the target.
17128
17129MLton has a native code generator only for AMD64 and X86, so, if you
17130are porting to another architecture, you must use the C code
17131generator. These notes do not cover building a new native code
17132generator.
17133
17134Some of the following steps will not be necessary if MLton already
17135supports the architecture or operating system you are porting to.
17136
17137
17138== What code to change ==
17139
17140* Scripts.
17141+
17142--
17143* In `bin/platform`, add new cases to define `$HOST_OS` and `$HOST_ARCH`.
17144--
17145
17146* Runtime system.
17147+
17148--
17149The goal of this step is to be able to successfully run `make` in the
17150`runtime` directory on the target machine.
17151
17152* In `platform.h`, add a new case to include `platform/<arch>.h` and `platform/<os>.h`.
17153
17154* In `platform/<arch>.h`:
17155** define `MLton_Platform_Arch_host`.
17156
17157* In `platform/<os>.h`:
17158** include platform-specific includes.
17159** define `MLton_Platform_OS_host`.
17160** define all of the `HAS_*` macros.
17161
17162* In `platform/<os>.c` implement any platform-dependent functions that the runtime needs.
17163
17164* Add rounding mode control to `basis/Real/IEEEReal.c` for the new arch (if not `HAS_FEROUND`)
17165
17166* Compile and install the <:GnuMP:>. This varies from platform to platform. In `platform/<os>.h`, you need to include the appropriate `gmp.h`.
17167--
17168
17169* Basis Library implementation (`basis-library/*`)
17170+
17171--
17172* In `primitive/prim-mlton.sml`:
17173** Add a new variant to the `MLton.Platform.Arch.t` datatype.
17174** modify the constants that define `MLton.Platform.Arch.host` to match with `MLton_Platform_Arch_host`, as set in `runtime/platform/<arch>.h`.
17175** Add a new variant to the `MLton.Platform.OS.t` datatype.
17176** modify the constants that define `MLton.Platform.OS.host` to match with `MLton_Platform_OS_host`, as set in `runtime/platform/<os>.h`.
17177
17178* In `mlton/platform.{sig,sml}` add a new variant.
17179
17180* In `sml-nj/sml-nj.sml`, modify `getOSKind`.
17181
17182* Look at all the uses of `MLton.Platform` in the Basis Library implementation and see if you need to do anything special. You might use the following command to see where to look.
17183+
17184----
17185find basis-library -type f | xargs grep 'MLton\.Platform'
17186----
17187+
17188If in doubt, leave the code alone and wait to see what happens when you run the regression tests.
17189--
17190
17191* Compiler.
17192+
17193--
17194* In `lib/stubs/mlton-stubs/platform.sig` add any new variants, as was done in the Basis Library.
17195
17196* In `lib/stubs/mlton-stubs/mlton.sml` add any new variants in `MLton.Platform`, as was done in the Basis Library.
17197--
17198
17199The string used to identify a particular architecture or operating
17200system must be the same (except for possibly case of letters) in the
17201scripts, runtime, Basis Library implementation, and compiler (stubs).
17202In `mlton/main/main.fun`, MLton itself uses the conversions to and
17203from strings:
17204----
17205MLton.Platform.{Arch,OS}.{from,to}String
17206----
17207
If there is a mismatch, you may see the error message
17209`strange arch` or `strange os`.
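
A quick sanity check, once a compiler for the new platform runs, is to round-trip the host platform through these conversions. This sketch compiles only with MLton, since `MLton.Platform` is not part of the Basis Library; the structure aliases `PArch` and `POS` and the value names are illustrative.

[source,sml]
----
structure PArch = MLton.Platform.Arch
structure POS = MLton.Platform.OS

val archName = PArch.toString PArch.host
val osName = POS.toString POS.host

(* fromString should recognize exactly the strings that toString produces. *)
val archRoundTrips = PArch.fromString archName = SOME PArch.host
val osRoundTrips = POS.fromString osName = SOME POS.host

val () = print (concat ["arch=", archName, " os=", osName, "\n"])
----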
17210
17211
17212== Running the regressions with a cross compiler ==
17213
17214When porting to a new platform, it is always best to get all (or as
17215many as possible) of the regressions working before moving to a self
17216compile. It is easiest to do this by modifying and rebuilding the
17217compiler on a working machine and then running the regressions with a
17218cross compiler. It is not easy to build a gcc cross compiler, so we
17219recommend generating the C and assembly on a working machine (using
MLton's `-target` and `-stop g` flags), copying the generated files to
17221the target machine, then compiling and linking there.
17222
172231. Remake the compiler on a working machine.
17224
172252. Use `bin/add-cross` to add support for the new target. In particular, this should create `build/lib/mlton/targets/<target>/` with the platform-specific necessary cross-compilation information.
17226
172273. Run the regression tests with the cross-compiler. To cross-compile all the tests, do
17228+
17229----
17230bin/regression -cross <target>
17231----
17232+
17233This will create all the executables. Then, copy `bin/regression` and
17234the `regression` directory to the target machine, and do
17235+
17236----
17237bin/regression -run-only <target>
17238----
17239+
17240This should run all the tests.
17241
17242Repeat this step, interleaved with appropriate compiler modifications,
17243until all the regressions pass.
17244
17245
17246== Bootstrap ==
17247
17248Once you've got all the regressions working, you can build MLton for
17249the new target. As with the regressions, the idea for bootstrapping
17250is to generate the C and assembly on a working machine, copy it to the
17251target machine, and then compile and link there. Here's the sequence
17252of steps.
17253
172541. On a working machine, with the newly rebuilt compiler, in the `mlton` directory, do:
17255+
17256----
17257mlton -stop g -target <target> mlton.mlb
17258----
17259
172602. Copy to the target machine.
17261
172623. On the target machine, move the libraries to the right place. That is, in `build/lib/mlton/targets`, do:
17263+
17264----
17265rm -rf self
17266mv <target> self
17267----
17268+
Also make sure you have all the header files in `build/lib/mlton/include`. You can copy them from a host machine that has run `make runtime`.
17270
172714. On the target machine, compile and link MLton. That is, in the mlton directory, do something like:
17272+
17273----
17274gcc -c -Ibuild/lib/mlton/include -Ibuild/lib/mlton/targets/self/include -O1 -w mlton/mlton.*.[cs]
17275gcc -o build/lib/mlton/mlton-compile \
17276 -Lbuild/lib/mlton/targets/self \
17277 -L/usr/local/lib \
17278 mlton.*.o \
17279 -lmlton -lgmp -lgdtoa -lm
17280----
17281
172825. At this point, MLton should be working and you can finish the rest of a usual make on the target machine.
17283+
17284----
17285make basis-no-check script mlbpathmap constants libraries tools
17286----
17287
6. Making the last tool, mlyacc, will fail, because mlyacc cannot bootstrap its own yacc.grm.* files. On the host machine, run `make -C mlyacc src/yacc.grm.sml`. Then copy both files to the target machine, and compile mlyacc, making sure to supply the path to your newly compiled mllex: `make -C mlyacc MLLEX=mllex/mllex`.
17289
17290There are other details to get right, like making sure that the tools
17291directories were clean so that the tools are rebuilt on the new
17292platform, but hopefully this structure works. Once you've got a
17293compiler on the target machine, you should test it by running all the
17294regressions normally (i.e. without the `-cross` flag) and by running a
couple of rounds of self compiles.
17296
17297
17298== Also see ==
17299
17300The above description is based on the following emails sent to the
17301MLton list.
17302
17303* http://www.mlton.org/pipermail/mlton/2002-October/013110.html
17304* http://www.mlton.org/pipermail/mlton/2004-July/016029.html
17305
17306<<<
17307
17308:mlton-guide-page: PrecedenceParse
17309[[PrecedenceParse]]
17310PrecedenceParse
17311===============
17312
17313<:PrecedenceParse:> is an analysis/rewrite pass for the <:AST:>
17314<:IntermediateLanguage:>, invoked from <:Elaborate:>.
17315
17316== Description ==
17317
17318This pass rewrites <:AST:> function clauses, expressions, and patterns
17319to resolve <:OperatorPrecedence:>.
17320
17321== Implementation ==
17322
17323* <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.sig)>
17324* <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.fun)>
17325
17326== Details and Notes ==
17327
17328{empty}
17329
17330<<<
17331
17332:mlton-guide-page: Printf
17333[[Printf]]
17334Printf
17335======
17336
17337Programmers coming from C or Java often ask if
17338<:StandardML:Standard ML> has a `printf` function. It does not.
17339However, it is possible to implement your own version with only a few
17340lines of code.
17341
17342Here is a definition for `printf` and `fprintf`, along with format
17343specifiers for booleans, integers, and reals.
17344
17345[source,sml]
17346----
structure Printf =
   struct
      (* Identity function, used by fprintf; not defined in the Basis Library. *)
      fun id x = x
      fun $ (_, f) = f (fn p => p ()) ignore
      fun fprintf out f = f (out, id)
      val printf = fn z => fprintf TextIO.stdOut z
      fun one ((out, f), make) g =
         g (out, fn r =>
            f (fn p =>
               make (fn s =>
                     r (fn () => (p (); TextIO.output (out, s))))))
      fun ` x s = one (x, fn f => f s)
      fun spec to x = one (x, fn f => f o to)
      val B = fn z => spec Bool.toString z
      val I = fn z => spec Int.toString z
      val R = fn z => spec Real.toString z
   end
17363----
17364
17365Here's an example use.
17366
17367[source,sml]
17368----
17369val () = printf `"Int="I`" Bool="B`" Real="R`"\n" $ 1 false 2.0
17370----
17371
17372This prints the following.
17373
17374----
17375Int=1 Bool=false Real=2.0
17376----
17377
17378In general, a use of `printf` looks like
17379
17380----
17381printf <spec1> ... <specn> $ <arg1> ... <argm>
17382----
17383
17384where each `<speci>` is either a specifier like `B`, `I`, or `R`, or
17385is an inline string, like ++&grave;"foo"++. A backtick (+&grave;+)
17386must precede each inline string. Each `<argi>` must be of the
17387appropriate type for the corresponding specifier.
17388
17389SML `printf` is more powerful than its C counterpart in a number of
17390ways. In particular, the function produced by `printf` is a perfectly
17391ordinary SML function, and can be passed around, used multiple times,
17392etc. For example:
17393
17394[source,sml]
17395----
17396val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $
17397val () = f 1 true
17398val () = f 2 false
17399----
17400
17401The definition of `printf` is even careful to not print anything until
17402it is fully applied. So, examples like the following will work as
17403expected.
17404
[source,sml]
----
17406val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $ 13
17407val () = f true
17408val () = f false
17409----
17410
17411It is also easy to define new format specifiers. For example, suppose
17412we wanted format specifiers for characters and strings.
17413
[source,sml]
----
17415val C = fn z => spec Char.toString z
17416val S = fn z => spec (fn s => s) z
17417----
17418
17419One can define format specifiers for more complex types, e.g. pairs of
17420integers.
17421
[source,sml]
----
17423val I2 =
17424 fn z =>
17425 spec (fn (i, j) =>
17426 concat ["(", Int.toString i, ", ", Int.toString j, ")"])
17427 z
17428----
17429
17430Here's an example use.
17431
[source,sml]
----
17433val () = printf `"Test "I2`" a string "S`"\n" $ (1, 2) "hello"
17434----
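
A sketch of `fprintf` in use. `fprintf` takes the output stream as its first argument and otherwise reads like `printf`. The block repeats the `Printf` structure so that it is self-contained, and it assumes an identity function `id`, defined here because the Basis Library does not provide one. The temporary file and the value names are illustrative.

[source,sml]
----
structure Printf =
   struct
      fun id x = x
      fun $ (_, f) = f (fn p => p ()) ignore
      fun fprintf out f = f (out, id)
      fun one ((out, f), make) g =
         g (out, fn r =>
            f (fn p =>
               make (fn s =>
                     r (fn () => (p (); TextIO.output (out, s))))))
      fun ` x s = one (x, fn f => f s)
      fun spec to x = one (x, fn f => f o to)
      val B = fn z => spec Bool.toString z
      val I = fn z => spec Int.toString z
   end
open Printf

(* Write one formatted line to a temporary file. *)
val path = OS.FileSys.tmpName ()
val out = TextIO.openOut path
val () = fprintf out `"Int="I`" Bool="B`"\n" $ 7 true
val () = TextIO.closeOut out

(* Read the file back to see what was written. *)
val ins = TextIO.openIn path
val written = TextIO.inputAll ins
val () = TextIO.closeIn ins
val () = OS.FileSys.remove path
----

After this runs, `written` holds the line `Int=7 Bool=true` followed by a newline.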
17435
17436
17437== Printf via <:Fold:> ==
17438
17439`printf` is best viewed as a special case of variable-argument
17440<:Fold:> that inductively builds a function as it processes its
17441arguments. Here is the definition of a `Printf` structure in terms of
17442fold. The structure is equivalent to the above one, except that it
17443uses the standard `$` instead of a specialized one.
17444
17445[source,sml]
17446----
17447structure Printf =
17448 struct
17449 fun fprintf out =
17450 Fold.fold ((out, id), fn (_, f) => f (fn p => p ()) ignore)
17451
17452 val printf = fn z => fprintf TextIO.stdOut z
17453
17454 fun one ((out, f), make) =
17455 (out, fn r =>
17456 f (fn p =>
17457 make (fn s =>
17458 r (fn () => (p (); TextIO.output (out, s))))))
17459
17460 val ` =
17461 fn z => Fold.step1 (fn (s, x) => one (x, fn f => f s)) z
17462
17463 fun spec to = Fold.step0 (fn x => one (x, fn f => f o to))
17464
17465 val B = fn z => spec Bool.toString z
17466 val I = fn z => spec Int.toString z
17467 val R = fn z => spec Real.toString z
17468 end
17469----
17470
17471Viewing `printf` as a fold opens up a number of possibilities. For
17472example, one can name parts of format strings using the fold idiom for
17473naming sequences of steps.
17474
[source,sml]
----
17476val IB = fn u => Fold.fold u `"Int="I`" Bool="B
17477val () = printf IB`" "IB`"\n" $ 1 true 3 false
17478----
17479
17480One can even parametrize over partial format strings.
17481
[source,sml]
----
17483fun XB X = fn u => Fold.fold u `"X="X`" Bool="B
17484val () = printf (XB I)`" "(XB R)`"\n" $ 1 true 2.0 false
17485----
17486
17487
17488== Also see ==
17489
17490* <:PrintfGentle:>
17491* <!Cite(Danvy98, Functional Unparsing)>
17492
17493<<<
17494
17495:mlton-guide-page: PrintfGentle
17496[[PrintfGentle]]
17497PrintfGentle
17498============
17499
This page provides a gentle introduction to, and derivation of, <:Printf:>,
with sections and arrangement more suitable to a talk.
17502
17503
17504== Introduction ==
17505
17506SML does not have `printf`. Could we define it ourselves?
17507
17508[source,sml]
17509----
17510val () = printf ("here's an int %d and a real %f.\n", 13, 17.0)
17511val () = printf ("here's three values (%d, %f, %f).\n", 13, 17.0, 19.0)
17512----
17513
17514What could the type of `printf` be?
17515
17516This obviously can't work, because SML functions take a fixed number
17517of arguments. Actually they take one argument, but if that's a tuple,
17518it can only have a fixed number of components.
17519
17520
17521== From tupling to currying ==
17522
17523What about currying to get around the typing problem?
17524
17525[source,sml]
17526----
17527val () = printf "here's an int %d and a real %f.\n" 13 17.0
17528val () = printf "here's three values (%d, %f, %f).\n" 13 17.0 19.0
17529----
17530
17531That fails for a similar reason. We need two types for `printf`.
17532
[source,sml]
----
17534val printf: string -> int -> real -> unit
17535val printf: string -> int -> real -> real -> unit
17536----
17537
17538This can't work, because `printf` can only have one type. SML doesn't
17539support programmer-defined overloading.
17540
17541
17542== Overloading and dependent types ==
17543
17544Even without worrying about number of arguments, there is another
17545problem. The type of `printf` depends on the format string.
17546
17547[source,sml]
17548----
17549val () = printf "here's an int %d and a real %f.\n" 13 17.0
17550val () = printf "here's a real %f and an int %d.\n" 17.0 13
17551----
17552
17553Now we need
17554
[source,sml]
----
17556val printf: string -> int -> real -> unit
17557val printf: string -> real -> int -> unit
17558----
17559
Again, this can't possibly work, because SML doesn't have
17561overloading, and types can't depend on values.
17562
17563
17564== Idea: express type information in the format string ==
17565
17566If we express type information in the format string, then different
17567uses of `printf` can have different types.
17568
17569[source,sml]
17570----
17571type 'a t (* the type of format strings *)
17572val printf: 'a t -> 'a
17573infix D F
17574val fs1: (int -> real -> unit) t = "here's an int "D" and a real "F".\n"
17575val fs2: (int -> real -> real -> unit) t =
17576 "here's three values ("D", "F", "F").\n"
17577val () = printf fs1 13 17.0
17578val () = printf fs2 13 17.0 19.0
17579----
17580
17581Now, our two calls to `printf` type check, because the format
17582string specializes `printf` to the appropriate type.
17583
17584
17585== The types of format characters ==
17586
17587What should the type of format characters `D` and `F` be? Each format
17588character requires an additional argument of the appropriate type to
17589be supplied to `printf`.
17590
Idea: guess the final type that will be needed for `printf` applied to
the format string, and verify it with each format character.
17593
17594[source,sml]
17595----
17596type ('a, 'b) t (* 'a = rest of type to verify, 'b = final type *)
17597val ` : string -> ('a, 'a) t (* guess the type, which must be verified *)
17598val D: (int -> 'a, 'b) t * string -> ('a, 'b) t (* consume an int *)
17599val F: (real -> 'a, 'b) t * string -> ('a, 'b) t (* consume a real *)
17600val printf: (unit, 'a) t -> 'a
17601----
17602
17603Don't worry. In the end, type inference will guess and verify for us.
17604
17605
17606== Understanding guess and verify ==
17607
17608Now, let's build up a format string and a specialized `printf`.
17609
17610[source,sml]
17611----
17612infix D F
17613val f0 = `"here's an int "
17614val f1 = f0 D " and a real "
17615val f2 = f1 F ".\n"
17616val p = printf f2
17617----
17618
17619These definitions yield the following types.
17620
17621[source,sml]
17622----
17623val f0: (int -> real -> unit, int -> real -> unit) t
17624val f1: (real -> unit, int -> real -> unit) t
17625val f2: (unit, int -> real -> unit) t
17626val p: int -> real -> unit
17627----
17628
So, `p` is a specialized `printf` function. We could use it as
follows.
17631
17632[source,sml]
17633----
17634val () = p 13 17.0
17635val () = p 14 19.0
17636----
17637
17638
17639== Type checking this using a functor ==
17640
17641[source,sml]
17642----
17643signature PRINTF =
17644 sig
17645 type ('a, 'b) t
17646 val ` : string -> ('a, 'a) t
17647 val D: (int -> 'a, 'b) t * string -> ('a, 'b) t
17648 val F: (real -> 'a, 'b) t * string -> ('a, 'b) t
17649 val printf: (unit, 'a) t -> 'a
17650 end
17651
17652functor Test (P: PRINTF) =
17653 struct
17654 open P
17655 infix D F
17656
17657 val () = printf (`"here's an int "D" and a real "F".\n") 13 17.0
17658 val () = printf (`"here's three values ("D", "F ", "F").\n") 13 17.0 19.0
17659 end
17660----
17661
17662
17663== Implementing `Printf` ==
17664
17665Think of a format character as a formatter transformer. It takes the
17666formatter for the part of the format string before it and transforms
17667it into a new formatter that first does the left hand bit, then does
17668its bit, then continues on with the rest of the format string.
17669
17670[source,sml]
17671----
17672structure Printf: PRINTF =
17673 struct
17674 datatype ('a, 'b) t = T of (unit -> 'a) -> 'b
17675
17676 fun printf (T f) = f (fn () => ())
17677
17678 fun ` s = T (fn a => (print s; a ()))
17679
17680 fun D (T f, s) =
17681 T (fn g => f (fn () => fn i =>
17682 (print (Int.toString i); print s; g ())))
17683
17684 fun F (T f, s) =
17685 T (fn g => f (fn () => fn i =>
17686 (print (Real.toString i); print s; g ())))
17687 end
17688----
17689
17690
17691== Testing printf ==
17692
17693[source,sml]
17694----
17695structure Z = Test (Printf)
17696----
17697
17698
17699== User-definable formats ==
17700
17701The definition of the format characters is pretty much the same.
17702Within the `Printf` structure we can define a format character
17703generator.
17704
17705[source,sml]
17706----
17707val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t =
17708 fn toString => fn (T f, s) =>
17709 T (fn th => f (fn () => fn a => (print (toString a); print s ; th ())))
17710val D = fn z => newFormat Int.toString z
17711val F = fn z => newFormat Real.toString z
17712----
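
To try `newFormat` in a form whose result can be inspected, here is a self-contained sketch of a string-returning variant. The structure `SPrintf`, the function `sprintf`, and the format characters `B` and `L` are hypothetical names, not defined elsewhere on this page; the formatter accumulates a string and returns it instead of printing.

[source,sml]
----
structure SPrintf =
   struct
      datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b

      (* Finish with the identity, returning the accumulated string. *)
      fun sprintf (T f) = f (fn s => s) ""

      fun ` s = T (fn k => fn acc => k (acc ^ s))

      fun newFormat toString (T f, s) =
         T (fn k => f (fn acc => fn a => k (acc ^ toString a ^ s)))

      val D = fn z => newFormat Int.toString z
      val B = fn z => newFormat Bool.toString z
      (* A user-defined format character for lists of integers. *)
      val L =
         fn z =>
         newFormat
         (fn xs => "[" ^ String.concatWith ", " (map Int.toString xs) ^ "]")
         z
   end

open SPrintf
infix D B L

val line = sprintf (`"n="D", ok="B", xs="L".") 42 true [1, 2, 3]
----

Here `line` is `"n=42, ok=true, xs=[1, 2, 3]."`. The `infix` declaration comes after the definitions, because an identifier that is already infix would need `op` in a `val` binding.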
17713
17714
17715== A core `Printf` ==
17716
We can now have a very small `PRINTF` signature, and define all
the format characters externally to the core module.
17719
17720[source,sml]
17721----
17722signature PRINTF =
17723 sig
17724 type ('a, 'b) t
17725 val ` : string -> ('a, 'a) t
17726 val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
17727 val printf: (unit, 'a) t -> 'a
17728 end
17729
17730structure Printf: PRINTF =
17731 struct
17732 datatype ('a, 'b) t = T of (unit -> 'a) -> 'b
17733
17734 fun printf (T f) = f (fn () => ())
17735
17736 fun ` s = T (fn a => (print s; a ()))
17737
17738 fun newFormat toString (T f, s) =
17739 T (fn th =>
17740 f (fn () => fn a =>
17741 (print (toString a)
17742 ; print s
17743 ; th ())))
17744 end
17745----
17746
17747
17748== Extending to fprintf ==
17749
17750One can implement fprintf by threading the outstream through all the
17751transformers.
17752
17753[source,sml]
17754----
17755signature PRINTF =
17756 sig
17757 type ('a, 'b) t
17758 val ` : string -> ('a, 'a) t
17759 val fprintf: (unit, 'a) t * TextIO.outstream -> 'a
17760 val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t
17761 val printf: (unit, 'a) t -> 'a
17762 end
17763
17764structure Printf: PRINTF =
17765 struct
17766 type out = TextIO.outstream
17767 val output = TextIO.output
17768
17769 datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b
17770
17771 fun fprintf (T f, out) = f (fn _ => ()) out
17772
17773 fun printf t = fprintf (t, TextIO.stdOut)
17774
17775 fun ` s = T (fn a => fn out => (output (out, s); a out))
17776
17777 fun newFormat toString (T f, s) =
17778 T (fn g =>
17779 f (fn out => fn a =>
17780 (output (out, toString a)
17781 ; output (out, s)
17782 ; g out)))
17783 end
17784----
17785
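The following sketch exercises `fprintf` by writing to a temporary
file and reading the text back. It repeats the structure above
(without `printf`) so that it stands alone; the temporary file and the
names `file`, `stream`, and `contents` are assumptions of the example.

[source,sml]
----
structure Printf =
   struct
      type out = TextIO.outstream
      val output = TextIO.output

      datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b

      fun fprintf (T f, out) = f (fn _ => ()) out

      fun ` s = T (fn a => fn out => (output (out, s); a out))

      fun newFormat toString (T f, s) =
         T (fn g =>
            f (fn out => fn a =>
               (output (out, toString a)
                ; output (out, s)
                ; g out)))
   end

open Printf
val D = fn z => newFormat Int.toString z
infix D

(* Write formatted text to a temporary file. *)
val file = OS.FileSys.tmpName ()
val stream = TextIO.openOut file
val () = fprintf (`"Count="D"\n", stream) 3
val () = TextIO.closeOut stream

(* Read the text back: contents is "Count=3\n". *)
val ins = TextIO.openIn file
val contents = TextIO.inputAll ins
val () = TextIO.closeIn ins
val () = OS.FileSys.remove file
----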
17786
17787== Notes ==
17788
* Lesson: instead of using dependent types for a function, express the
dependency in the type of the argument.
17791
17792* If `printf` is partially applied, it will do the printing then and
17793there. Perhaps this could be fixed with some kind of terminator.
17794+
17795A syntactic or argument terminator is not necessary. A formatter can
17796either be eager (as above) or lazy (as below). A lazy formatter
17797accumulates enough state to print the entire string. The simplest
17798lazy formatter concatenates the strings as they become available:
17799+
17800[source,sml]
17801----
17802structure PrintfLazyConcat: PRINTF =
17803 struct
17804 datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b
17805
17806 fun printf (T f) = f print ""
17807
17808 fun ` s = T (fn th => fn s' => th (s' ^ s))
17809
17810 fun newFormat toString (T f, s) =
17811 T (fn th =>
17812 f (fn s' => fn a =>
17813 th (s' ^ toString a ^ s)))
17814 end
17815----
17816+
17817It is somewhat more efficient to accumulate the strings as a list:
17818+
17819[source,sml]
17820----
17821structure PrintfLazyList: PRINTF =
17822 struct
17823 datatype ('a, 'b) t = T of (string list -> 'a) -> string list -> 'b
17824
17825 fun printf (T f) = f (List.app print o List.rev) []
17826
17827 fun ` s = T (fn th => fn ss => th (s::ss))
17828
17829 fun newFormat toString (T f, s) =
17830 T (fn th =>
17831 f (fn ss => fn a =>
17832 th (s::toString a::ss)))
17833 end
17834----
17835
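To see the difference that laziness makes, here is a sketch of a
hypothetical `sprintf` built on the concatenating representation: it
returns the accumulated string instead of printing it. The structure
name `Sprintf` and the `B` format are assumptions made for this
example.

[source,sml]
----
structure Sprintf =
   struct
      datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b

      (* Like the lazy printf, but return the text instead of printing. *)
      fun sprintf (T f) = f (fn s => s) ""

      fun ` s = T (fn th => fn s' => th (s' ^ s))

      fun newFormat toString (T f, s) =
         T (fn th =>
            f (fn s' => fn a =>
               th (s' ^ toString a ^ s)))
   end

open Sprintf
val D = fn z => newFormat Int.toString z
val B = fn z => newFormat Bool.toString z
infix D B

(* Partial application produces no output; it only builds a function. *)
val render = sprintf (`"Int="D" Bool="B"\n")

(* Only once all arguments are supplied is the string assembled. *)
val line = render 1 false   (* "Int=1 Bool=false\n" *)
----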
17836
17837== Also see ==
17838
17839* <:Printf:>
17840* <!Cite(Danvy98, Functional Unparsing)>
17841
17842<<<
17843
17844:mlton-guide-page: ProductType
17845[[ProductType]]
17846ProductType
17847===========
17848
17849<:StandardML:Standard ML> has special syntax for products (tuples). A
17850product type is written as
17851[source,sml]
17852----
17853t1 * t2 * ... * tN
17854----
17855and a product pattern is written as
17856[source,sml]
17857----
17858(p1, p2, ..., pN)
17859----
17860
17861In most situations the syntax is quite convenient. However, there are
17862situations where the syntax is cumbersome. There are also situations
17863in which it is useful to construct and destruct n-ary products
17864inductively, especially when using <:Fold:>.
17865
17866In such situations, it is useful to have a binary product datatype
17867with an infix constructor defined as follows.
17868[source,sml]
17869----
17870datatype ('a, 'b) product = & of 'a * 'b
17871infix &
17872----
17873
17874With these definitions, one can write an n-ary product as a nested
17875binary product quite conveniently.
17876[source,sml]
17877----
17878x1 & x2 & ... & xn
17879----
17880
17881Because of left associativity, this is the same as
17882[source,sml]
17883----
17884(((x1 & x2) & ...) & xn)
17885----
17886
17887Because `&` is a constructor, the syntax can also be used for
17888patterns.
17889
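For instance, a function can destructure a three-component product
with a single pattern. This is a minimal sketch; the function
`describe` and its fields are invented for illustration.

[source,sml]
----
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* The pattern parses as ((name & age) & admin), matching the value. *)
fun describe (name & age & admin) =
   concat [name, " ", Int.toString age, " ", Bool.toString admin]

val s = describe ("Ada" & 36 & true)   (* "Ada 36 true" *)
----
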
17890The symbol `&` is inspired by the Curry-Howard isomorphism: the proof
17891of a conjunction `(A & B)` is a pair of proofs `(a, b)`.
17892
17893
17894== Example: parser combinators ==
17895
A typical parser combinator library provides a combinator that has a
type of the form
17898[source,sml]
17899----
17900'a parser * 'b parser -> ('a * 'b) parser
17901----
and produces a parser for the concatenation of two parsers. When more
than two parsers are concatenated, the result of the combined parser
is a nested structure of pairs
17905[source,sml]
17906----
17907(...((p1, p2), p3)..., pN)
17908----
17909which is somewhat cumbersome.
17910
17911By using a product type, the type of the concatenation combinator then
17912becomes
17913[source,sml]
17914----
17915'a parser * 'b parser -> ('a, 'b) product parser
17916----
17917While this doesn't stop the nesting, it makes the pattern significantly
17918easier to write. Instead of
17919[source,sml]
17920----
17921(...((p1, p2), p3)..., pN)
17922----
17923the pattern is written as
17924[source,sml]
17925----
17926p1 & p2 & p3 & ... & pN
17927----
17928which is considerably more concise.
17929
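The idea can be sketched with a deliberately tiny parser type. The
`parser` type, the `sat` primitive, and the `>>` combinator below are
hypothetical and do not correspond to any particular library.

[source,sml]
----
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Hypothetical parser type: consume a prefix of a character list. *)
type 'a parser = char list -> ('a * char list) option

(* Accept one character satisfying the predicate p. *)
fun sat (p: char -> bool): char parser =
   fn [] => NONE
    | c :: cs => if p c then SOME (c, cs) else NONE

(* Concatenation combinator that returns a product instead of a pair. *)
infix 1 >>
fun (pa: 'a parser) >> (pb: 'b parser): ('a, 'b) product parser =
   fn cs =>
      case pa cs of
         NONE => NONE
       | SOME (a, cs') =>
            (case pb cs' of
                NONE => NONE
              | SOME (b, cs'') => SOME (a & b, cs''))

val three = sat Char.isAlpha >> sat Char.isDigit >> sat Char.isAlpha

(* The result is matched with a flat-looking pattern. *)
val result =
   case three (String.explode "a1b!") of
      SOME (x & y & z, _) => SOME (String.implode [x, y, z])
    | NONE => NONE
----

Here `result` is `SOME "a1b"`. Both `>>` and `&` associate to the
left, so the shape of the pattern mirrors the shape of the parser
expression.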
17930
17931== Also see ==
17932
17933* <:VariableArityPolymorphism:>
17934* <:Utilities:>
17935
17936<<<
17937
17938:mlton-guide-page: Profiling
17939[[Profiling]]
17940Profiling
17941=========
17942
17943With MLton and `mlprof`, you can profile your program to find out
17944bytes allocated, execution counts, or time spent in each function. To
profile your program, compile with ++-profile __kind__++, where _kind_
17946is one of `alloc`, `count`, or `time`. Then, run the executable,
17947which will write an `mlmon.out` file when it finishes. You can then
17948run `mlprof` on the executable and the `mlmon.out` file to see the
17949performance data.
17950
17951Here are the three kinds of profiling that MLton supports.
17952
17953* <:ProfilingAllocation:>
17954* <:ProfilingCounts:>
17955* <:ProfilingTime:>
17956
17957== Next steps ==
17958
17959* <:CallGraph:>s to visualize profiling data.
17960* <:HowProfilingWorks:>
17961* <:MLmon:>
17962* <:MLtonProfile:> to selectively profile parts of your program.
17963* <:ProfilingTheStack:>
17964* <:ShowProf:>
17965
17966<<<
17967
17968:mlton-guide-page: ProfilingAllocation
17969[[ProfilingAllocation]]
17970ProfilingAllocation
17971===================
17972
17973With MLton and `mlprof`, you can <:Profiling:profile> your program to
17974find out how many bytes each function allocates. To do so, compile
17975your program with `-profile alloc`. For example, suppose that
17976`list-rev.sml` is the following.
17977
17978[source,sml]
17979----
17980sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml]
17981----
17982
17983Compile and run `list-rev` as follows.
17984----
17985% mlton -profile alloc list-rev.sml
17986% ./list-rev
17987% mlprof -show-line true list-rev mlmon.out
179886,030,136 bytes allocated (108,336 bytes by GC)
17989 function cur
17990----------------------- -----
17991append list-rev.sml: 1 97.6%
17992<gc> 1.8%
17993<main> 0.4%
17994rev list-rev.sml: 6 0.2%
17995----
17996
17997The data shows that most of the allocation is done by the `append`
17998function defined on line 1 of `list-rev.sml`. The table also shows
17999how special functions like `gc` and `main` are handled: they are
18000printed with surrounding brackets. C functions are displayed
18001similarly. In this example, the allocation done by the garbage
18002collector is due to stack growth, which is usually the case.
18003
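A program of roughly the following shape would produce a profile like
the one above. This is an illustrative sketch and not necessarily the
exact contents of `list-rev.sml`; it is laid out so that `append`
starts on line 1 and `rev` on line 6, as in the table.

[source,sml]
----
fun append (l1, l2) =
   case l1 of
      [] => l2
    | x :: l1 => x :: append (l1, l2)

fun rev l =
   case l of
      [] => []
    | x :: l => append (rev l, [x])

val l = List.tabulate (1000, fn i => i)
val _ = 1 + hd (rev l)
----

Each call to `append` allocates one cons cell per element of its first
argument, and `rev` calls `append` once per element, so nearly all of
the allocation is attributed to `append` rather than to `rev` itself.
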
18004The run-time performance impact of allocation profiling is noticeable,
18005because it inserts additional C calls for object allocation.
18006
18007Compile with `-profile alloc -profile-branch true` to find out how
18008much allocation is done in each branch of a function; see
18009<:ProfilingCounts:> for more details on `-profile-branch`.
18010
18011<<<
18012
18013:mlton-guide-page: ProfilingCounts
18014[[ProfilingCounts]]
18015ProfilingCounts
18016===============
18017
18018With MLton and `mlprof`, you can <:Profiling:profile> your program to
18019find out how many times each function is called and how many times
18020each branch is taken. To do so, compile your program with
18021`-profile count -profile-branch true`. For example, suppose that
18022`tak.sml` contains the following.
18023
18024[source,sml]
18025----
18026sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml]
18027----
18028
18029Compile with count profiling and run the program.
18030----
18031% mlton -profile count -profile-branch true tak.sml
18032% ./tak
18033----
18034
18035Display the profiling data, along with raw counts and file positions.
18036----
18037% mlprof -raw true -show-line true tak mlmon.out
18038623,610,002 ticks
18039 function cur raw
18040--------------------------------- ----- -------------
18041Tak.tak1.tak2 tak.sml: 5 38.2% (238,530,000)
18042Tak.tak1.tak2.<true> tak.sml: 7 27.5% (171,510,000)
18043Tak.tak1 tak.sml: 3 10.7% (67,025,000)
18044Tak.tak1.<true> tak.sml: 14 10.7% (67,025,000)
18045Tak.tak1.tak2.<false> tak.sml: 9 10.7% (67,020,000)
18046Tak.tak1.<false> tak.sml: 16 2.0% (12,490,000)
18047f tak.sml: 23 0.0% (5,001)
18048f.<branch> tak.sml: 25 0.0% (5,000)
18049f.<branch> tak.sml: 23 0.0% (1)
18050uncalled tak.sml: 29 0.0% (0)
18051f.<branch> tak.sml: 24 0.0% (0)
18052----
18053
18054Branches are displayed with lexical nesting followed by `<branch>`
18055where the function name would normally be, or `<true>` or `<false>`
18056for if-expressions. It is best to run `mlprof` with `-show-line true`
18057to help identify the branch.
18058
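To relate the names in the table to source code, consider a program of
the following shape. It is a sketch written for illustration, with a
much smaller iteration count; the line numbers and counts in the table
refer to the actual `tak.sml`, which may differ.

[source,sml]
----
structure Tak =
   struct
      fun tak1 (x, y, z) =
         let
            (* Nested inside tak1, hence reported as Tak.tak1.tak2. *)
            fun tak2 (x, y, z) =
               if y >= x
                  then z                        (* a <true> branch *)
               else tak1 (tak2 (x - 1, y, z),   (* a <false> branch *)
                          tak2 (y - 1, z, x),
                          tak2 (z - 1, x, y))
         in
            if y >= x
               then z
            else tak1 (tak2 (x - 1, y, z),
                       tak2 (y - 1, z, x),
                       tak2 (z - 1, x, y))
         end
   end

(* Each rule of this match is a separate <branch> of f. *)
val rec f =
   fn 0 => ()
    | ~1 => print "this branch is not taken\n"
    | n => (ignore (Tak.tak1 (18, 12, 6)); f (n - 1))

val () = f 5

(* Never called, so it would be reported with a count of zero. *)
fun uncalled () = ()
----
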
18059One use of `-profile count` is as a code-coverage tool, to help find
18060code in your program that hasn't been tested. For this reason,
18061`mlprof` displays functions and branches even if they have a count of
18062zero. As the above output shows, the branch on line 24 was never
18063taken and the function defined on line 29 was never called. To see
18064zero counts, it is best to run `mlprof` with `-raw true`, since some
18065code (e.g. the branch on line 23 above) will show up with `0.0%` but
18066may still have been executed and hence have a nonzero raw count.
18067
18068<<<
18069
18070:mlton-guide-page: ProfilingTheStack
18071[[ProfilingTheStack]]
18072ProfilingTheStack
18073=================
18074
18075For all forms of <:Profiling:>, you can gather counts for all
18076functions on the stack, not just the currently executing function. To
18077do so, compile your program with `-profile-stack true`. For example,
18078suppose that `list-rev.sml` contains the following.
18079
18080[source,sml]
18081----
18082sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml]
18083----
18084
18085Compile with stack profiling and then run the program.
18086----
18087% mlton -profile alloc -profile-stack true list-rev.sml
18088% ./list-rev
18089----
18090
18091Display the profiling data.
18092----
18093% mlprof -show-line true list-rev mlmon.out
180946,030,136 bytes allocated (108,336 bytes by GC)
18095 function cur stack GC
18096----------------------- ----- ----- ----
18097append list-rev.sml: 1 97.6% 97.6% 1.4%
18098<gc> 1.8% 0.0% 1.8%
18099<main> 0.4% 98.2% 1.8%
18100rev list-rev.sml: 6 0.2% 97.6% 1.8%
18101----
18102
18103In the above table, we see that `rev`, defined on line 6 of
18104`list-rev.sml`, is only responsible for 0.2% of the allocation, but is
18105on the stack while 97.6% of the allocation is done by the user program
18106and while 1.8% of the allocation is done by the garbage collector.
18107
18108The run-time performance impact of `-profile-stack true` can be
18109noticeable since there is some extra bookkeeping at every nontail call
18110and return.
18111
18112<<<
18113
18114:mlton-guide-page: ProfilingTime
18115[[ProfilingTime]]
18116ProfilingTime
18117=============
18118
18119With MLton and `mlprof`, you can <:Profiling:profile> your program to
18120find out how much time is spent in each function over an entire run of
18121the program. To do so, compile your program with `-profile time`.
18122For example, suppose that `tak.sml` contains the following.
18123
18124[source,sml]
18125----
18126sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml]
18127----
18128
18129Compile with time profiling and run the program.
18130----
18131% mlton -profile time tak.sml
18132% ./tak
18133----
18134
18135Display the profiling data.
18136----
18137% mlprof tak mlmon.out
181386.00 seconds of CPU time (0.00 seconds GC)
18139function cur
18140------------- -----
18141Tak.tak1.tak2 75.8%
18142Tak.tak1 24.2%
18143----
18144
18145This example shows how `mlprof` indicates lexical nesting: as a
18146sequence of period-separated names indicating the structures and
18147functions in which a function definition is nested. The profiling
18148data shows that roughly three-quarters of the time is spent in the
18149`Tak.tak1.tak2` function, while the rest is spent in `Tak.tak1`.
18150
18151Display raw counts in addition to percentages with `-raw true`.
18152----
18153% mlprof -raw true tak mlmon.out
181546.00 seconds of CPU time (0.00 seconds GC)
18155 function cur raw
18156------------- ----- -------
18157Tak.tak1.tak2 75.8% (4.55s)
18158Tak.tak1 24.2% (1.45s)
18159----
18160
18161Display the file name and line number for each function in addition to
18162its name with `-show-line true`.
18163----
18164% mlprof -show-line true tak mlmon.out
181656.00 seconds of CPU time (0.00 seconds GC)
18166 function cur
18167------------------------- -----
18168Tak.tak1.tak2 tak.sml: 5 75.8%
18169Tak.tak1 tak.sml: 3 24.2%
18170----
18171
18172Time profiling is designed to have a very small performance impact.
18173However, in some cases there will be a run-time performance cost,
18174which may perturb the results. There is more likely to be an impact
18175with `-codegen c` than `-codegen native`.
18176
18177You can also compile with `-profile time -profile-branch true` to find
18178out how much time is spent in each branch of a function; see
18179<:ProfilingCounts:> for more details on `-profile-branch`.
18180
18181
18182== Caveats ==
18183
18184With `-profile time`, use of the following in your program will cause
18185a run-time error, since they would interfere with the profiler signal
18186handler.
18187
18188* `MLton.Itimer.set (MLton.Itimer.Prof, ...)`
18189* `MLton.Signal.setHandler (MLton.Signal.prof, ...)`
18190
Also, because of the random sampling used to implement `-profile
time`, it is best to have a long-running program (at least tens of
seconds) in order to get reasonable timing data.
18194
18195<<<
18196
18197:mlton-guide-page: Projects
18198[[Projects]]
18199Projects
18200========
18201
18202We have lots of ideas for projects to improve MLton, many of which we
18203do not have time to implement, or at least haven't started on yet.
18204Here is a list of some of those improvements, ranging from the easy (1
18205week) to the difficult (several months). If you have any interest in
18206working on one of these, or some other improvement to MLton not listed
18207here, please send mail to
18208mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
18209
18210* Port to new platform: Windows (native, not Cygwin or MinGW), ...
18211* Source-level debugger
18212* Heap profiler
18213* Interfaces to libraries: OpenGL, Gtk+, D-BUS, ...
18214* More libraries written in SML (see <!ViewGitProj(mltonlib)>)
18215* Additional constant types: `structure Real80: REAL`, ...
18216* An IDE (possibly integrated with <:Eclipse:>)
18217* Port MLRISC and use for code generation
18218* Optimizations
18219** Improved closure representation
18220+
18221Right now, MLton's closure conversion algorithm uses a simple flat closure to represent each function.
18222+
18223*** http://www.mlton.org/pipermail/mlton/2003-October/024570.html
18224*** http://www.mlton.org/pipermail/mlton-user/2007-July/001150.html
18225*** <!Cite(ShaoAppel94)>
18226** Elimination of array bounds checks in loops
18227** Elimination of overflow checks on array index computations
18228** Common-subexpression elimination of repeated array subscripts
18229** Loop-invariant code motion, especially for tuple selects
18230** Partial redundancy elimination
18231*** http://www.mlton.org/pipermail/mlton/2006-April/028598.html
18232** Loop unrolling, especially for small loops
18233** Auto-vectorization, for MMX/SSE/3DNow!/AltiVec (see the http://gcc.gnu.org/projects/tree-ssa/vectorization.html[work done on GCC])
18234** Optimize `MLton_eq`: pointer equality is necessarily false when one of the arguments is freshly allocated in the block
18235* Analyses
18236** Uncaught exception analysis
18237
18238<<<
18239
18240:mlton-guide-page: Pronounce
18241[[Pronounce]]
18242Pronounce
18243=========
18244
18245Here is <!Attachment(Pronounce,pronounce-mlton.mp3,how "MLton" sounds)>.
18246
18247"MLton" is pronounced in two syllables, with stress on the first
18248syllable. The first syllable sounds like the word _mill_ (as in
18249"steel mill"), the second like the word _tin_ (as in "cookie tin").
18250
18251<<<
18252
18253:mlton-guide-page: PropertyList
18254[[PropertyList]]
18255PropertyList
18256============
18257
18258A property list is a dictionary-like data structure into which
18259properties (name-value pairs) can be inserted and from which
18260properties can be looked up by name. The term comes from the Lisp
18261language, where every symbol has a property list for storing
18262information, and where the names are typically symbols and keys can be
18263any type of value.
18264
18265Here is an SML signature for property lists such that for any type of
18266value a new property can be dynamically created to manipulate that
18267type of value in a property list.
18268
18269[source,sml]
18270----
18271signature PROPERTY_LIST =
18272 sig
18273 type t
18274
18275 val new: unit -> t
18276 val newProperty: unit -> {add: t * 'a -> unit,
18277 peek: t -> 'a option}
18278 end
18279----
18280
18281Here is a functor demonstrating the use of property lists. It first
18282creates a property list, then two new properties (of different types),
18283and adds a value to the list for each property.
18284
18285[source,sml]
18286----
18287functor Test (P: PROPERTY_LIST) =
18288 struct
18289 val pl = P.new ()
18290
18291 val {add = addInt: P.t * int -> unit, peek = peekInt} = P.newProperty ()
18292 val {add = addReal: P.t * real -> unit, peek = peekReal} = P.newProperty ()
18293
18294 val () = addInt (pl, 13)
18295 val () = addReal (pl, 17.0)
18296 val s1 = Int.toString (valOf (peekInt pl))
18297 val s2 = Real.toString (valOf (peekReal pl))
18298 val () = print (concat [s1, " ", s2, "\n"])
18299 end
18300----
18301
Applied to an appropriate implementation of `PROPERTY_LIST`, the `Test`
18303functor will produce the following output.
18304
18305----
1830613 17.0
18307----
18308
18309
18310== Implementation ==
18311
18312Because property lists can hold values of any type, their
18313implementation requires a <:UniversalType:>. Given that, a property
18314list is simply a list of elements of the universal type. Adding a
18315property adds to the front of the list, and looking up a property
18316scans the list.
18317
18318[source,sml]
18319----
18320functor PropertyList (U: UNIVERSAL_TYPE): PROPERTY_LIST =
18321 struct
18322 datatype t = T of U.t list ref
18323
18324 fun new () = T (ref [])
18325
18326 fun 'a newProperty () =
18327 let
18328 val (inject, out) = U.embed ()
18329 fun add (T r, a: 'a): unit = r := inject a :: (!r)
18330 fun peek (T r) =
18331 Option.map (valOf o out) (List.find (isSome o out) (!r))
18332 in
18333 {add = add, peek = peek}
18334 end
18335 end
18336----
18337
18338
18339If `U: UNIVERSAL_TYPE`, then we can test our code as follows.
18340
18341[source,sml]
18342----
18343structure Z = Test (PropertyList (U))
18344----
18345
18346Of course, a serious implementation of property lists would have to
18347handle duplicate insertions of the same property, as well as the
18348removal of elements in order to avoid space leaks.
18349
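The pieces can be put together in one self-contained sketch. The
`UNIVERSAL_TYPE` signature and the exception-based structure `U` below
are assumptions made for this example (one common way to implement a
universal type); the functor is the one given above. The example also
shows what happens on a duplicate insertion: the newer value shadows
the older one, which nevertheless stays in the list.

[source,sml]
----
(* Assumed signature and exception-based universal type. *)
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

structure U:> UNIVERSAL_TYPE =
   struct
      type t = exn

      fun 'a embed () =
         let
            exception E of 'a
            fun project (e: t): 'a option =
               case e of
                  E a => SOME a
                | _ => NONE
         in
            (E, project)
         end
   end

signature PROPERTY_LIST =
   sig
      type t
      val new: unit -> t
      val newProperty: unit -> {add: t * 'a -> unit, peek: t -> 'a option}
   end

functor PropertyList (U: UNIVERSAL_TYPE): PROPERTY_LIST =
   struct
      datatype t = T of U.t list ref

      fun new () = T (ref [])

      fun 'a newProperty () =
         let
            val (inject, out) = U.embed ()
            fun add (T r, a: 'a): unit = r := inject a :: (!r)
            fun peek (T r) =
               Option.map (valOf o out) (List.find (isSome o out) (!r))
         in
            {add = add, peek = peek}
         end
   end

structure P = PropertyList (U)

val pl = P.new ()
val {add = addInt: P.t * int -> unit, peek = peekInt} = P.newProperty ()
val {add = addStr: P.t * string -> unit, peek = peekStr} = P.newProperty ()

val initial = peekInt pl         (* NONE: nothing has been added yet *)
val () = addInt (pl, 13)
val () = addStr (pl, "thirteen")
val () = addInt (pl, 14)         (* shadows 13, which stays in the list *)
val latest = peekInt pl          (* SOME 14 *)
val name = peekStr pl            (* SOME "thirteen" *)
----
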
18350== Also see ==
18351
18352* MLton relies heavily on property lists for attaching information to
18353syntax tree nodes in its intermediate languages. See
18354<!ViewGitFile(mlton,master,lib/mlton/basic/property-list.sig)> and
18355<!ViewGitFile(mlton,master,lib/mlton/basic/property-list.fun)>.
18356
18357* The <:MLRISCLibrary:> <!Cite(LeungGeorge99, uses property lists
18358extensively)>.
18359
18360<<<
18361
18362:mlton-guide-page: Pygments
18363[[Pygments]]
18364Pygments
18365========
18366
18367http://pygments.org/[Pygments] is a generic syntax highlighter. Here is a _lexer_ for highlighting
18368<:StandardML: Standard ML>.
18369
18370* <!ViewGitDir(mlton,master,ide/pygments/sml_lexer)> -- Provides highlighting of keywords, special constants, and (nested) comments.
18371
18372== Install and use ==
* Check out all files and install as a http://pygments.org/[Pygments] plugin.
18374+
18375----
18376$ git clone https://github.com/MLton/mlton.git mlton
18377$ cd mlton/ide/pygments
18378$ python setup.py install
18379----
18380
18381* Invoke `pygmentize` with `-l sml`.
18382
18383== Feedback ==
18384
18385Comments and suggestions should be directed to <:MatthewFluet:>.
18386
18387<<<
18388
18389:mlton-guide-page: RayRacine
18390[[RayRacine]]
18391RayRacine
18392=========
18393
18394Using SML in some _Semantic Web_ stuff. Anyone interested in
18395similar, please contact me. GreyLensman on #sml on IRC or rracine at
18396this domain adelphia with a dot here net.
18397
18398Current areas of coding.
18399
18400. Pretty solid, high performance Rete implementation - base functionality is complete.
18401. N3 parser - mostly complete
18402. RDF parser based on fxg - not started.
18403. Swerve HTTP server - 1/2 done.
18404. SPARQL implementation - not started.
. Persistent engine based on Berkeley DB - not started.
. Native implementation of PostgreSQL protocol - underway, ways to go.
18407. I also have a small change to the MLton compiler to add ++PackWord__<N>__++ - changes compile but needs some more work, clean-up and unit tests.
18408
18409<<<
18410
18411:mlton-guide-page: Reachability
18412[[Reachability]]
18413Reachability
18414============
18415
18416Reachability is a notion dealing with the graph of heap objects
18417maintained at runtime. Nodes in the graph are heap objects and edges
18418correspond to the pointers between heap objects. As the program runs,
18419it allocates new objects (adds nodes to the graph), and those new
18420objects can contain pointers to other objects (new edges in the
18421graph). If the program uses mutable objects (refs or arrays), it can
18422also change edges in the graph.
18423
18424At any time, the program has access to some finite set of _root_
18425nodes, and can only ever access nodes that are reachable by following
18426edges from these root nodes. Nodes that are _unreachable_ can be
18427garbage collected.
18428
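The notion can be made concrete with a small model. In the sketch
below, heap objects are modeled as integer ids and the heap as an
association list of outgoing pointers; this representation and the
names `heap`, `pointers`, and `reachable` are assumptions of the
example and say nothing about how MLton's runtime represents objects.

[source,sml]
----
(* Each entry lists the objects that a given object points to. *)
type heap = (int * int list) list

fun pointers (h: heap, n: int): int list =
   case List.find (fn (m, _) => m = n) h of
      SOME (_, ns) => ns
    | NONE => []

fun member (n: int, ns: int list): bool = List.exists (fn m => m = n) ns

(* Depth-first traversal collecting every object reachable from the roots. *)
fun reachable (h: heap, roots: int list): int list =
   let
      fun visit (n, seen) =
         if member (n, seen)
            then seen
         else List.foldl visit (n :: seen) (pointers (h, n))
   in
      List.rev (List.foldl visit [] roots)
   end

val h: heap = [(1, [2]), (2, [3]), (3, [1]), (4, [5]), (5, [])]

val live = reachable (h, [1])   (* [1, 2, 3] *)
val garbage = List.filter (fn (n, _) => not (member (n, live))) h
----

Objects 4 and 5 are garbage even though 4 points to 5: what matters is
whether a path exists from a root, not whether some object holds a
pointer.
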
18429== Also see ==
18430
18431 * <:MLtonFinalizable:>
18432 * <:MLtonWeak:>
18433
18434<<<
18435
18436:mlton-guide-page: Redundant
18437[[Redundant]]
18438Redundant
18439=========
18440
18441<:Redundant:> is an optimization pass for the <:SSA:>
18442<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
18443
18444== Description ==
18445
18446The redundant SSA optimization eliminates redundant function and label
18447arguments; an argument of a function or label is redundant if it is
18448always the same as another argument of the same function or label.
18449The analysis finds an equivalence relation on the arguments of a
18450function or label, such that all arguments in an equivalence class are
18451redundant with respect to the other arguments in the equivalence
18452class; the transformation selects one representative of each
18453equivalence class and drops the binding occurrence of
18454non-representative variables and renames use occurrences of the
18455non-representative variables to the representative variable. The
18456analysis finds the equivalence classes via a fixed-point analysis.
18457Each vector of arguments to a function or label is initialized to
18458equivalence classes that equate all arguments of the same type; one
18459could start with an equivalence class that equates all arguments, but
18460arguments of different type cannot be redundant. Variables bound in
18461statements are initialized to singleton equivalence classes. The
18462fixed-point analysis repeatedly refines these equivalence classes on
18463the formals by the equivalence classes of the actuals.
18464
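The effect can be illustrated at the source level, although the pass
itself operates on SSA functions and labels rather than on SML source.
In this hypothetical example, `x` and `y` receive the same value at
every call, so they fall into one equivalence class and `y` can be
dropped.

[source,sml]
----
(* Before: x and y are always passed the same value. *)
fun sumTwice (i, x, y, acc) =
   if i = 0
      then acc
   else sumTwice (i - 1, x, y, acc + x + y)

val v = 3
val r1 = sumTwice (10, v, v, 0)

(* After, conceptually: the binding of y is dropped and its uses are
   renamed to the representative x. *)
fun sumTwice' (i, x, acc) =
   if i = 0
      then acc
   else sumTwice' (i - 1, x, acc + x + x)

val r2 = sumTwice' (10, v, 0)
----

Both `r1` and `r2` evaluate to 60.
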
18465== Implementation ==
18466
18467* <!ViewGitFile(mlton,master,mlton/ssa/redundant.fun)>
18468
18469== Details and Notes ==
18470
<:Redundant:> was originally added because of output from the
<:ClosureConvert:> pass in which the environment record, or
18473components of it, were passed around in several places. That may have
18474been more relevant with polyvariant analyses (which are long gone).
18475But it still seems possibly relevant, especially with more aggressive
18476flattening, which should reveal some fields in nested closure records
18477that are redundant.
18478
18479<<<
18480
18481:mlton-guide-page: RedundantTests
18482[[RedundantTests]]
18483RedundantTests
18484==============
18485
18486<:RedundantTests:> is an optimization pass for the <:SSA:>
18487<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
18488
18489== Description ==
18490
18491This pass simplifies conditionals whose results are implied by a
18492previous conditional test.
18493
18494== Implementation ==
18495
18496* <!ViewGitFile(mlton,master,mlton/ssa/redundant-tests.fun)>
18497
18498== Details and Notes ==
18499
18500An additional test will sometimes eliminate the overflow test when
18501adding or subtracting 1. In particular, it will eliminate it in the
18502following cases:
18503[source,sml]
18504----
18505if x < y
18506 then ... x + 1 ...
18507else ... y - 1 ...
18508----
18509
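As a source-level illustration (the pass itself works on the SSA
intermediate language), the hypothetical functions below show a
conditional whose result is implied by an enclosing test, and an
increment whose overflow check is unnecessary because `x < y` bounds
`x` below the largest integer.

[source,sml]
----
fun classify (x: int, y: int): string =
   if x < y
      then
         (* x < y is already known here, so the inner test is implied. *)
         (if x < y then "less" else "unreachable")
   else "not less"

fun next (x: int, y: int): int =
   if x < y
      then x + 1   (* x < y, so x + 1 cannot overflow *)
   else x
----
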
18510<<<
18511
18512:mlton-guide-page: References
18513[[References]]
18514References
18515==========
18516
18517<:#AAA:A>
18518<:#BBB:B>
18519<:#CCC:C>
18520<:#DDD:D>
18521<:#EEE:E>
18522<:#FFF:F>
18523<:#GGG:G>
18524<:#HHH:H>
18525<:#III:I>
18526<:#JJJ:J>
18527<:#KKK:K>
18528<:#LLL:L>
18529<:#MMM:M>
18530<:#NNN:N>
18531<:#OOO:O>
18532<:#PPP:P>
18533<:#QQQ:Q>
18534<:#RRR:R>
18535<:#SSS:S>
18536<:#TTT:T>
18537<:#UUU:U>
18538<:#VVV:V>
18539<:#WWW:W>
18540<:#XXX:X>
18541<:#YYY:Y>
18542<:#ZZZ:Z>
18543
18544== <!Anchor(AAA)>A ==
18545
18546 * <!Anchor(AcarEtAl06)>
 http://www.umut-acar.org/publications/pldi2006.pdf[An Experimental Analysis of Self-Adjusting Computation].
18548 Umut Acar, Guy Blelloch, Matthias Blume, and Kanat Tangwongsan.
18549 <:#PLDI:> 2006.
18550
18551 * <!Anchor(Appel92)>
18552 http://us.cambridge.org/titles/catalogue.asp?isbn=0521416957[Compiling with Continuations]
18553 (http://www.addall.com/New/submitNew.cgi?query=0-521-41695-7&type=ISBN&location=10000&state=&dispCurr=USD[addall]).
18554 ISBN 0521416957.
18555 Andrew W. Appel.
18556 Cambridge University Press, 1992.
18557
18558 * <!Anchor(Appel93)>
18559 http://www.cs.princeton.edu/research/techreps/TR-364-92[A Critique of Standard ML].
18560 Andrew W. Appel.
18561 <:#JFP:> 1993.
18562
18563 * <!Anchor(Appel98)>
18564 http://us.cambridge.org/titles/catalogue.asp?isbn=0521582741[Modern Compiler Implementation in ML]
18565 (http://www.addall.com/New/submitNew.cgi?query=0-521-58274-1&type=ISBN&location=10000&state=&dispCurr=USD[addall]).
 ISBN 0521582741.
18567 Andrew W. Appel.
18568 Cambridge University Press, 1998.
18569
18570 * <!Anchor(AppelJim97)>
 http://ncstrl.cs.princeton.edu/expand.php?id=TR-556-97[Shrinking Lambda Expressions in Linear Time].
18572 Andrew Appel and Trevor Jim.
18573 <:#JFP:> 1997.
18574
18575 * <!Anchor(AppelEtAl94)>
 http://www.smlnj.org/doc/ML-Lex/manual.html[A lexical analyzer generator for Standard ML. Version 1.6.0].
 Andrew W. Appel, James S. Mattson, and David R. Tarditi. 1994.
18578
18579== <!Anchor(BBB)>B ==
18580
18581 * <!Anchor(BaudinetMacQueen85)>
18582 http://www.classes.cs.uchicago.edu/archive/2011/spring/22620-1/papers/macqueen-baudinet85.pdf[Tree Pattern Matching for ML].
18583 Marianne Baudinet, David MacQueen. 1985.
18584+
18585____
18586Describes the match compiler used in an early version of
18587<:SMLNJ:SML/NJ>.
18588____
18589
18590 * <!Anchor(BentonEtAl98)>
18591 http://research.microsoft.com/en-us/um/people/nick/icfp98.pdf[Compiling Standard ML to Java Bytecodes].
18592 Nick Benton, Andrew Kennedy, and George Russell.
18593 <:#ICFP:> 1998.
18594
18595 * <!Anchor(BentonKennedy99)>
18596 http://research.microsoft.com/en-us/um/people/nick/SMLJavaInterop.pdf[Interlanguage Working Without Tears: Blending SML with Java].
18597 Nick Benton and Andrew Kennedy.
18598 <:#ICFP:> 1999.
18599
18600 * <!Anchor(BentonKennedy01)>
18601 http://research.microsoft.com/en-us/um/people/akenn/sml/ExceptionalSyntax.pdf[Exceptional Syntax].
18602 Nick Benton and Andrew Kennedy.
18603 <:#JFP:> 2001.
18604
18605 * <!Anchor(BentonEtAl04)>
18606 http://research.microsoft.com/en-us/um/people/nick/p53-Benton.pdf[Adventures in Interoperability: The SML.NET Experience].
18607 Nick Benton, Andrew Kennedy, and Claudio Russo.
18608 <:#PPDP:> 2004.
18609
18610 * <!Anchor(BentonEtAl04_2)>
18611 http://research.microsoft.com/en-us/um/people/nick/shrinking.pdf[Shrinking Reductions in SML.NET].
18612 Nick Benton, Andrew Kennedy, Sam Lindley and Claudio Russo.
18613 <:#IFL:> 2004.
18614+
18615____
18616Describes a linear-time implementation of an
18617<!Cite(AppelJim97,Appel-Jim shrinker)>, using a mutable IL, and shows
18618that it yields nice speedups in SML.NET's compile times. There are
18619also benchmarks showing that SML.NET when compiled by MLton runs
18620roughly five times faster than when compiled by SML/NJ.
18621____
18622
18623 * <!Anchor(Benton05)>
18624 http://research.microsoft.com/en-us/um/people/nick/benton03.pdf[Embedded Interpreters].
18625 Nick Benton.
18626 <:#JFP:> 2005.
18627
18628 * <!Anchor(Berry91)>
18629 http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-148/ECS-LFCS-91-148.pdf[The Edinburgh SML Library].
18630 Dave Berry.
18631 University of Edinburgh Technical Report ECS-LFCS-91-148, 1991.
18632
18633 * <!Anchor(BerryEtAl93)>
18634 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.7958&rep=rep1&type=ps[A semantics for ML concurrency primitives].
18635 Dave Berry, Robin Milner, and David N. Turner.
18636 <:#POPL:> 1992.
18637
18638 * <!Anchor(Berry93)>
18639 http://journals.cambridge.org/abstract_S0956796800000873[Lessons From the Design of a Standard ML Library].
18640 Dave Berry.
18641 <:#JFP:> 1993.
18642
18643 * <!Anchor(Bertelsen98)>
18644 http://www.petermb.dk/sml2jvm.ps.gz[Compiling SML to Java Bytecode].
18645 Peter Bertelsen.
18646 Master's Thesis, 1998.
18647
18648 * <!Anchor(Berthomieu00)>
18649 http://homepages.laas.fr/bernard/oo/ooml.html[OO Programming styles in ML].
18650 Bernard Berthomieu.
18651 LAAS Report #2000111, 2000.
18652
18653 * <!Anchor(Blume01)>
18654 http://people.cs.uchicago.edu/~blume/papers/nlffi-entcs.pdf[No-Longer-Foreign: Teaching an ML compiler to speak C "natively"].
18655 Matthias Blume.
18656 <:#BABEL:> 2001.
18657
18658 * <!Anchor(Blume01_02)>
18659 http://people.cs.uchicago.edu/~blume/pgraph/proposal.pdf[Portable library descriptions for Standard ML].
18660 Matthias Blume. 2001.
18661
18662 * <!Anchor(Boehm03)>
18663 http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html[Destructors, Finalizers, and Synchronization].
18664 Hans Boehm.
18665 <:#POPL:> 2003.
18666+
18667____
18668Discusses a number of issues in the design of finalizers. Many of the
18669design choices are consistent with <:MLtonFinalizable:>.
18670____
18671
18672== <!Anchor(CCC)>C ==
18673
18674 * <!Anchor(CejtinEtAl00)>
18675 http://www.cs.purdue.edu/homes/suresh/papers/icfp99.ps.gz[Flow-directed Closure Conversion for Typed Languages].
18676 Henry Cejtin, Suresh Jagannathan, and Stephen Weeks.
18677 <:#ESOP:> 2000.
18678+
18679____
18680Describes MLton's closure-conversion algorithm, which translates from
18681its simply-typed higher-order intermediate language to its
18682simply-typed first-order intermediate language.
18683____
18684
18685 * <!Anchor(ChengBlelloch01)>
18686 http://www.cs.cmu.edu/afs/cs/project/pscico/pscico/papers/gc01/pldi-final.pdf[A Parallel, Real-Time Garbage Collector].
18687 Perry Cheng and Guy E. Blelloch.
18688 <:#PLDI:> 2001.
18689
18690 * <!Anchor(Claessen00)>
18691 http://users.eecs.northwestern.edu/~robby/courses/395-495-2009-fall/quick.pdf[QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs].
18692 Koen Claessen and John Hughes.
18693 <:#ICFP:> 2000.
18694
18695 * <!Anchor(Clinger98)>
18696 http://www.cesura17.net/~will/Professional/Research/Papers/tail.pdf[Proper Tail Recursion and Space Efficiency].
18697 William D. Clinger.
18698 <:#PLDI:> 1998.
18699
18700 * <!Anchor(CooperMorrisett90)>
18701 http://www.eecs.harvard.edu/~greg/papers/jgmorris-mlthreads.ps[Adding Threads to Standard ML].
18702 Eric C. Cooper and J. Gregory Morrisett.
18703 CMU Technical Report CMU-CS-90-186, 1990.
18704
18705 * <!Anchor(CouttsEtAl07)>
18706 http://metagraph.org/papers/stream_fusion.pdf[Stream Fusion: From Lists to Streams to Nothing at All].
18707 Duncan Coutts, Roman Leshchinskiy, and Don Stewart.
18708 Submitted for publication. April 2007.
18709
18710== <!Anchor(DDD)>D ==
18711
18712 * <!Anchor(DamasMilner82)>
18713 http://groups.csail.mit.edu/pag/6.883/readings/p207-damas.pdf[Principal Type-Schemes for Functional Programs].
18714 Luis Damas and Robin Milner.
18715 <:#POPL:> 1982.
18716
18717 * <!Anchor(Danvy98)>
18718 http://www.brics.dk/RS/98/12[Functional Unparsing].
18719 Olivier Danvy.
18720 BRICS Technical Report RS 98-12, 1998.
18721
18722 * <!Anchor(Deboer05)>
18723 http://alleystoughton.us/eXene/dusty-thesis.pdf[Exhancements to eXene].
18724 Dustin B. deBoer.
18725 Master of Science Thesis, 2005.
18726+
18727____
18728Describes ways to improve widget concurrency, handling of input focus,
18729X resources and selections.
18730____
18731
18732 * <!Anchor(DoligezLeroy93)>
18733 http://cristal.inria.fr/~doligez/publications/doligez-leroy-popl-1993.pdf[A Concurrent, Generational Garbage Collector for a Multithreaded Implementation of ML].
18734 Damien Doligez and Xavier Leroy.
18735 <:#POPL:> 1993.
18736
18737 * <!Anchor(Dreyer07)>
18738 http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf[Modular Type Classes].
18739 Derek Dreyer, Robert Harper, Manuel M.T. Chakravarty, Gabriele Keller.
18740 University of Chicago Technical Report TR-2007-02, 2006.
18741
18742 * <!Anchor(DreyerBlume07)>
18743 http://www.mpi-sws.org/~dreyer/papers/infmod/main-long.pdf[Principal Type Schemes for Modular Programs].
18744 Derek Dreyer and Matthias Blume.
18745 <:#ESOP:> 2007.
18746
18747 * <!Anchor(Dubois95)>
18748 ftp://ftp.inria.fr/INRIA/Projects/cristal/Francois.Rouaix/generics.dvi.Z[Extensional Polymorphism].
 Catherine Dubois, Francois Rouaix, and Pierre Weis.
18750 <:#POPL:> 1995.
18751+
18752____
18753An extension of ML that allows the definition of ad-hoc polymorphic
18754functions by inspecting the type of their argument.
18755____
18756
18757== <!Anchor(EEE)>E ==
18758
18759 * <!Anchor(Elsman03)>
18760 http://www.elsman.com/tldi03.pdf[Garbage Collection Safety for Region-based Memory Management].
18761 Martin Elsman.
18762 <:#TLDI:> 2003.
18763
18764 * <!Anchor(Elsman04)>
18765 http://www.elsman.com/ITU-TR-2004-43.pdf[Type-Specialized Serialization with Sharing].
18766 Martin Elsman. University of Copenhagen. IT University Technical
18767 Report TR-2004-43, 2004.
18768
18769== <!Anchor(FFF)>F ==
18770
18771 * <!Anchor(FelleisenFreidman98)>
18772 http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=4787[The Little MLer]
18773 (http://www3.addall.com/New/submitNew.cgi?query=026256114X&type=ISBN[addall]).
18774 ISBN 026256114X.
 Matthias Felleisen and Dan Friedman.
18776 The MIT Press, 1998.
18777
18778 * <!Anchor(FlattFindler04)>
18779 http://www.cs.utah.edu/plt/kill-safe/[Kill-Safe Synchronization Abstractions].
18780 Matthew Flatt and Robert Bruce Findler.
18781 <:#PLDI:> 2004.
18782
18783 * <!Anchor(FluetWeeks01)>
18784 http://www.cs.rit.edu/~mtf/research/contification[Contification Using Dominators].
18785 Matthew Fluet and Stephen Weeks.
18786 <:#ICFP:> 2001.
18787+
18788____
18789Describes contification, a generalization of tail-recursion
18790elimination that is an optimization operating on MLton's static single
18791assignment (SSA) intermediate language.
18792____
18793
18794 * <!Anchor(FluetPucella06)>
18795 http://www.cs.rit.edu/~mtf/research/phantom-subtyping/jfp06/jfp06.pdf[Phantom Types and Subtyping].
18796 Matthew Fluet and Riccardo Pucella.
18797 <:#JFP:> 2006.
18798
18799 * <!Anchor(Furuse01)>
18800 http://jfla.inria.fr/2001/actes/07-furuse.ps[Generic Polymorphism in ML].
18801 J{empty}. Furuse.
18802 <:#JFLA:> 2001.
18803+
18804____
18805The formalism behind G'CAML, which has an approach to ad-hoc
18806polymorphism based on <!Cite(Dubois95)>, the differences being in how
type checking works and an improved compilation approach for typecase
18808that does the matching at compile time, not run time.
18809____
18810
18811== <!Anchor(GGG)>G ==
18812
18813 * <!Anchor(GansnerReppy93)>
18814 http://alleystoughton.us/eXene/1993-trends.pdf[A Multi-Threaded Higher-order User Interface Toolkit].
18815 Emden R. Gansner and John H. Reppy.
18816 User Interface Software, 1993.
18817
18818 * <!Anchor(GansnerReppy04)>
 http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/standard-ml-basis-library[The Standard ML Basis Library]
 (http://www3.addall.com/New/submitNew.cgi?query=9780521794787&type=ISBN[addall]).
18821 ISBN 9780521794787.
18822 Emden R. Gansner and John H. Reppy.
18823 Cambridge University Press, 2004.
18824+
18825____
18826An introduction and overview of the <:BasisLibrary:Basis Library>,
18827followed by a detailed description of each module. The module
18828descriptions are also available
18829http://www.standardml.org/Basis[online].
18830____
18831
18832 * <!Anchor(GrossmanEtAl02)>
18833 http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf[Region-based Memory Management in Cyclone].
18834 Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling
18835 Wang, and James Cheney.
18836 <:#PLDI:> 2002.
18837
18838== <!Anchor(HHH)>H ==
18839
18840 * <!Anchor(HallenbergEtAl02)>
18841 http://www.itu.dk/people/tofte/publ/pldi2002.pdf[Combining Region Inference and Garbage Collection].
18842 Niels Hallenberg, Martin Elsman, and Mads Tofte.
18843 <:#PLDI:> 2002.
18844
18845 * <!Anchor(HansenRichel99)>
18846 http://www.it.dtu.dk/introSML[Introduction to Programming Using SML]
18847 (http://www3.addall.com/New/submitNew.cgi?query=0201398206&type=ISBN[addall]).
18848 ISBN 0201398206.
18849 Michael R. Hansen, Hans Rischel.
18850 Addison-Wesley, 1999.
18851
18852 * <!Anchor(Harper11)>
18853 http://www.cs.cmu.edu/~rwh/smlbook/book.pdf[Programming in Standard ML].
18854 Robert Harper.
18855
18856 * <!Anchor(HarperEtAl93)>
18857 http://www.cs.cmu.edu/~rwh/papers/callcc/jfp.pdf[Typing First-Class Continuations in ML].
18858 Robert Harper, Bruce F. Duba, and David MacQueen.
18859 <:#JFP:> 1993.
18860
18861 * <!Anchor(HarperMitchell92)>
18862 http://www.cs.cmu.edu/~rwh/papers/xml/toplas93.pdf[On the Type Structure of Standard ML].
18863 Robert Harper and John C. Mitchell.
18864 <:#TOPLAS:> 1992.
18865
18866 * <!Anchor(HauserBenson04)>
18867 http://doi.ieeecomputersociety.org/10.1109/CSD.2004.1309122[On the Practicality and Desirability of Highly-concurrent, Mostly-functional Programming].
18868 Carl H. Hauser and David B. Benson.
18869 <:#ACSD:> 2004.
18870+
18871____
18872Describes the use of <:ConcurrentML: Concurrent ML> in implementing
18873the Ped text editor. Argues that using large numbers of threads and
18874message passing style is a practical and effective way of
18875modularizing a program.
18876____
18877
18878 * <!Anchor(HeckmanWilhelm97)>
18879 http://rw4.cs.uni-sb.de/~heckmann/abstracts/neuform.html[A Functional Description of TeX's Formula Layout].
18880 Reinhold Heckmann and Reinhard Wilhelm.
18881 <:#JFP:> 1997.
18882
18883 * <!Anchor(HicksEtAl03)>
18884 http://wwwold.cs.umd.edu/Library/TRs/CS-TR-4514/CS-TR-4514.pdf[Safe and Flexible Memory Management in Cyclone].
18885 Mike Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim.
18886 University of Maryland Technical Report CS-TR-4514, 2003.
18887
18888 * <!Anchor(Hurd04)>
18889 http://www.gilith.com/research/talks/tphols2004.pdf[Compiling HOL4 to Native Code].
18890 Joe Hurd.
18891 <:#TPHOLs:> 2004.
18892+
18893____
18894Describes a port of HOL from Moscow ML to MLton, the difficulties
18895encountered in compiling large programs, and the speedups achieved
18896(roughly 10x).
18897____
18898
18899== <!Anchor(III)>I ==
18900
18901{empty}
18902
18903== <!Anchor(JJJ)>J ==
18904
18905 * <!Anchor(Jones99)>
18906 http://www.cs.kent.ac.uk/people/staff/rej/gcbook[Garbage Collection: Algorithms for Automatic Memory Management]
18907 (http://www3.addall.com/New/submitNew.cgi?query=0471941484&type=ISBN[addall]).
18908 ISBN 0471941484.
18909 Richard Jones.
18910 John Wiley & Sons, 1999.
18911
18912== <!Anchor(KKK)>K ==
18913
18914 * <!Anchor(Kahrs93)>
18915 http://kar.kent.ac.uk/21122/[Mistakes and Ambiguities in the Definition of Standard ML].
18916 Stefan Kahrs.
18917 University of Edinburgh Technical Report ECS-LFCS-93-257, 1993.
18918+
18919____
18920Describes a number of problems with the
18921<!Cite(MilnerEtAl90,1990 Definition)>, many of which were fixed in the
18922<!Cite(MilnerEtAl97,1997 Definition)>.
18923
18924Also see the http://www.cs.kent.ac.uk/~smk/errors-new.ps.Z[addenda]
18925published in 1996.
18926____
18927
18928 * <!Anchor(Karvonen07)>
18929 http://dl.acm.org/citation.cfm?doid=1292535.1292547[Generics for the Working ML'er].
18930 Vesa Karvonen.
18931 <:#ML:> 2007. http://research.microsoft.com/~crusso/ml2007/slides/ml08rp-karvonen-slides.pdf[Slides] from the presentation are also available.
18932
18933 * <!Anchor(Kennedy04)>
18934 http://research.microsoft.com/~akenn/fun/picklercombinators.pdf[Pickler Combinators].
18935 Andrew Kennedy.
18936 <:#JFP:> 2004.
18937
18938 * <!Anchor(KoserEtAl03)>
18939 http://www.litech.org/~vaughan/pdf/dpcool2003.pdf[sml2java: A Source To Source Translator].
18940 Justin Koser, Haakon Larsen, Jeffrey A. Vaughan.
18941 <:#DPCOOL:> 2003.
18942
18943== <!Anchor(LLL)>L ==
18944
18945 * <!Anchor(Lang99)>
18946 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.7130&rep=rep1&type=ps[Faster Algorithms for Finding Minimal Consistent DFAs].
18947 Kevin Lang. 1999.
18948
18949 * <!Anchor(LarsenNiss04)>
18950 http://usenix.org/publications/library/proceedings/usenix04/tech/freenix/full_papers/larsen/larsen.pdf[mGTK: An SML binding of Gtk+].
18951 Ken Larsen and Henning Niss.
18952 USENIX Annual Technical Conference, 2004.
18953
18954 * <!Anchor(Leibig13)>
18955 http://www.cs.rit.edu/~bal6053/msproject/[An LLVM Back-end for MLton].
18956 Brian Leibig.
18957 MS Project Report, 2013.
18958+
18959____
18960Describes MLton's <:LLVMCodegen:>.
18961____
18962
18963 * <!Anchor(Leroy90)>
18964 http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-ZINC.html[The ZINC Experiment: an Economical Implementation of the ML Language].
18965 Xavier Leroy.
18966 Technical report 117, INRIA, 1990.
18967+
18968____
18969A detailed explanation of the design and implementation of a bytecode
18970compiler and interpreter for ML with a machine model aimed at
18971efficient implementation.
18972____
18973
18974 * <!Anchor(Leroy93)>
18975 http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-poly-par-nom.html[Polymorphism by Name for References and Continuations].
18976 Xavier Leroy.
18977 <:#POPL:> 1993.
18978
18979 * <!Anchor(LeungGeorge99)>
18980 http://www.cs.nyu.edu/leunga/my-papers/annotations.ps[MLRISC Annotations].
18981 Allen Leung and Lal George. 1999.
18982
18983== <!Anchor(MMM)>M ==
18984
18985 * <!Anchor(MarlowEtAl01)>
18986 http://community.haskell.org/~simonmar/papers/async.pdf[Asynchronous Exceptions in Haskell].
18987 Simon Marlow, Simon Peyton Jones, Andy Moran and John Reppy.
18988 <:#PLDI:> 2001.
18989+
18990____
18991An asynchronous exception is a signal that one thread can send to
18992another, and is useful for the receiving thread to treat as an
18993exception so that it can clean up locks or other state relevant to its
18994current context.
18995____
18996
18997 * <!Anchor(MacQueenEtAl84)>
18998 http://homepages.inf.ed.ac.uk/gdp/publications/Ideal_model.pdf[An Ideal Model for Recursive Polymorphic Types].
18999 David MacQueen, Gordon Plotkin, Ravi Sethi.
19000 <:#POPL:> 1984.
19001
19002 * <!Anchor(Matthews91)>
19003 http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-174[A Distributed Concurrent Implementation of Standard ML].
19004 David Matthews.
19005 University of Edinburgh Technical Report ECS-LFCS-91-174, 1991.
19006
19007 * <!Anchor(Matthews95)>
19008 http://www.lfcs.inf.ed.ac.uk/reports/95/ECS-LFCS-95-335[Papers on Poly/ML].
19009 David C. J. Matthews.
19010 University of Edinburgh Technical Report ECS-LFCS-95-335, 1995.
19011
19012 * http://www.lfcs.inf.ed.ac.uk/reports/97/ECS-LFCS-97-375[That About Wraps it Up: Using FIX to Handle Errors Without Exceptions, and Other Programming Tricks].
19013 Bruce J. McAdam.
19014 University of Edinburgh Technical Report ECS-LFCS-97-375, 1997.
19015
19016 * <!Anchor(MeierNorgaard93)>
19017 A Just-In-Time Backend for Moscow ML 2.00 in SML.
19018 Bjarke Meier, Kristian Nørgaard.
 Master's Thesis, 2003.
19020+
19021____
19022A just-in-time compiler using GNU Lightning, showing a speedup of up
19023to four times over Moscow ML's usual bytecode interpreter.
19024
19025The full report is only available in
19026http://www.itu.dk/stud/speciale/bmkn/fundanemt/download/report[Danish].
19027____
19028
19029 * <!Anchor(Milner78)>
19030 http://courses.engr.illinois.edu/cs421/sp2013/project/milner-polymorphism.pdf[A Theory of Type Polymorphism in Programming].
19031 Robin Milner.
19032 Journal of Computer and System Sciences, 1978.
19033
19034 * <!Anchor(Milner82)>
19035 http://homepages.inf.ed.ac.uk/dts/fps/papers/evolved.dvi.gz[How ML Evolved].
19036 Robin Milner.
19037 Polymorphism--The ML/LCF/Hope Newsletter, 1983.
19038
19039 * <!Anchor(MilnerTofte91)>
19040 http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[Commentary on Standard ML]
 (http://www3.addall.com/New/submitNew.cgi?query=0262631377&type=ISBN[addall]).
19042 ISBN 0262631377.
19043 Robin Milner and Mads Tofte.
19044 The MIT Press, 1991.
19045+
19046____
19047Introduces and explains the notation and approach used in
19048<!Cite(MilnerEtAl90,The Definition of Standard ML)>.
19049____
19050
19051 * <!Anchor(MilnerEtAl90)>
 http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[The Definition of Standard ML]
 (http://www3.addall.com/New/submitNew.cgi?query=0262631326&type=ISBN[addall]).
19054 ISBN 0262631326.
19055 Robin Milner, Mads Tofte, and Robert Harper.
19056 The MIT Press, 1990.
19057+
19058____
19059Superseded by <!Cite(MilnerEtAl97,The Definition of Standard ML (Revised))>.
19060Accompanied by the <!Cite(MilnerTofte91,Commentary on Standard ML)>.
19061____
19062
19063 * <!Anchor(MilnerEtAl97)>
 http://mitpress.mit.edu/books/definition-standard-ml[The Definition of Standard ML (Revised)]
 (http://www3.addall.com/New/submitNew.cgi?query=0262631814&type=ISBN[addall]).
19066 ISBN 0262631814.
19067 Robin Milner, Mads Tofte, Robert Harper, and David MacQueen.
19068 The MIT Press, 1997.
19069+
19070____
19071A terse and formal specification of Standard ML's syntax and
19072semantics. Supersedes <!Cite(MilnerEtAl90,The Definition of Standard ML)>.
19073____
19074
19075 * <!Anchor(ML2000)>
19076 http://flint.cs.yale.edu/flint/publications/ml2000.html[Principles and a Preliminary Design for ML2000].
19077 The ML2000 working group, 1999.
19078
19079 * <!Anchor(Morentsen99)>
19080 http://daimi.au.dk/CPnets/workshop99/papers/Mortensen.pdf[Automatic Code Generation from Coloured Petri Nets for an Access Control System].
19081 Kjeld H. Mortensen.
19082 Workshop on Practical Use of Coloured Petri Nets and Design/CPN, 1999.
19083
19084 * <!Anchor(MorrisettTolmach93)>
19085 http://web.cecs.pdx.edu/~apt/ppopp93.ps[Procs and Locks: a Portable Multiprocessing Platform for Standard ML of New Jersey].
19086 J{empty}. Gregory Morrisett and Andrew Tolmach.
19087 <:#PPoPP:> 1993.
19088
19089 * <!Anchor(Murphy06)>
19090 http://www.cs.cmu.edu/~tom7/papers/grid-ml06.pdf[ML Grid Programming with ConCert].
19091 Tom Murphy VII.
19092 <:#ML:> 2006.
19093
19094== <!Anchor(NNN)>N ==
19095
19096 * <!Anchor(Neumann99)>
19097 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.9485&rep=rep1&type=ps[fxp - Processing Structured Documents in SML].
19098 Andreas Neumann.
19099 Scottish Functional Programming Workshop, 1999.
19100+
19101____
19102Describes http://atseidl2.informatik.tu-muenchen.de/~berlea/Fxp[fxp],
19103an XML parser implemented in Standard ML.
19104____
19105
19106 * <!Anchor(Neumann99Thesis)>
19107 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.8108&rep=rep1&type=ps[Parsing and Querying XML Documents in SML].
19108 Andreas Neumann.
19109 Doctoral Thesis, 1999.
19110
19111 * <!Anchor(NguyenOhori06)>
19112 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/NguyenOhoriPPDP06.pdf[Compiling ML Polymorphism with Explicit Layout Bitmap].
19113 Huu-Duc Nguyen and Atsushi Ohori.
19114 <:#PPDP:> 2006.
19115
19116== <!Anchor(OOO)>O ==
19117
19118 * <!Anchor(Okasaki99)>
19119http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/purely-functional-data-structures[Purely Functional Data Structures].
19120 ISBN 9780521663502.
19121 Chris Okasaki.
19122 Cambridge University Press, 1999.
19123
19124 * <!Anchor(Ohori89)>
19125 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/fpca89.pdf[A Simple Semantics for ML Polymorphism].
19126 Atsushi Ohori.
19127 <:#FPCA:> 1989.
19128
19129 * <!Anchor(Ohori95)>
19130 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/toplas95.pdf[A Polymorphic Record Calculus and Its Compilation].
19131 Atsushi Ohori.
19132 <:#TOPLAS:> 1995.
19133
19134 * <!Anchor(OhoriTakamizawa97)>
19135 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/jlsc97.pdf[An Unboxed Operational Semantics for ML Polymorphism].
19136 Atsushi Ohori and Tomonobu Takamizawa.
19137 <:#LASC:> 1997.
19138
19139 * <!Anchor(Ohori99)>
19140 http://www.pllab.riec.tohoku.ac.jp/~ohori/research/ic98.pdf[Type-Directed Specialization of Polymorphism].
19141 Atsushi Ohori.
19142 <:#IC:> 1999.
19143
19144 * <!Anchor(OwensEtAl09)>
19145 http://www.mpi-sws.org/~turon/re-deriv.pdf[Regular-expression derivatives reexamined].
19146 Scott Owens, John Reppy, and Aaron Turon.
19147 <:#JFP:> 2009.
19148
19149== <!Anchor(PPP)>P ==
19150
19151 * <!Anchor(Paulson96)>
19152 http://www.cambridge.org/co/academic/subjects/computer-science/programming-languages-and-applied-logic/ml-working-programmer-2nd-edition[ML For the Working Programmer]
19153 (http://www3.addall.com/New/submitNew.cgi?query=052156543X&type=ISBN[addall])
19154 ISBN 052156543X.
19155 Larry C. Paulson.
19156 Cambridge University Press, 1996.
19157
19158 * <!Anchor(PetterssonEtAl02)>
19159 http://user.it.uu.se/~kostis/Papers/flops02_22.ps.gz[The HiPE/x86 Erlang Compiler: System Description and Performance Evaluation].
19160 Mikael Pettersson, Konstantinos Sagonas, and Erik Johansson.
19161 <:#FLOPS:> 2002.
19162+
19163____
19164Describes a native x86 Erlang compiler and a comparison of many
19165different native x86 compilers (including MLton) and their register
19166usage and call stack implementations.
19167____
19168
19169 * <!Anchor(Price09)>
 http://rogerprice.org/#UG[User's Guide to ML-Lex and ML-Yacc].
19171 Roger Price. 2009.
19172
19173 * <!Anchor(Pucella98)>
19174 http://arxiv.org/abs/cs.PL/0405080[Reactive Programming in Standard ML].
 Riccardo R. Pucella.
 <:#ICCL:> 1998.
19177
19178== <!Anchor(QQQ)>Q ==
19179
19180{empty}
19181
19182== <!Anchor(RRR)>R ==
19183
19184 * <!Anchor(Ramsey90)>
19185 https://www.cs.princeton.edu/research/techreps/TR-262-90[Concurrent Programming in ML].
19186 Norman Ramsey.
19187 Princeton University Technical Report CS-TR-262-90, 1990.
19188
19189 * <!Anchor(Ramsey11)>
19190 http://www.cs.tufts.edu/~nr/pubs/embedj-abstract.html[Embedding an Interpreted Language Using Higher-Order Functions and Types].
19191 Norman Ramsey.
19192 <:#JFP:> 2011.
19193
19194 * <!Anchor(RamseyFisherGovereau05)>
19195 http://www.cs.tufts.edu/~nr/pubs/els-abstract.html[An Expressive Language of Signatures].
19196 Norman Ramsey, Kathleen Fisher, and Paul Govereau.
19197 <:#ICFP:> 2005.
19198
19199 * <!Anchor(RedwineRamsey04)>
19200 http://www.cs.tufts.edu/~nr/pubs/widen-abstract.html[Widening Integer Arithmetic].
19201 Kevin Redwine and Norman Ramsey.
19202 <:#CC:> 2004.
19203+
19204____
19205Describes a method to implement numeric types and operations (like
19206`Int31` or `Word17`) for sizes smaller than that provided by the
19207processor.
19208____
19209
19210 * <!Anchor(Reppy88)>
19211 Synchronous Operations as First-Class Values.
19212 John Reppy.
19213 <:#PLDI:> 1988.
19214
19215 * <!Anchor(Reppy07)>
19216 http://www.cambridge.org/co/academic/subjects/computer-science/distributed-networked-and-mobile-computing/concurrent-programming-ml[Concurrent Programming in ML]
19217 (http://www3.addall.com/New/submitNew.cgi?query=9780521714723&type=ISBN[addall]).
19218 ISBN 9780521714723.
19219 John Reppy.
19220 Cambridge University Press, 2007.
19221+
19222____
19223Describes <:ConcurrentML:>.
19224____
19225
19226 * <!Anchor(Reynolds98)>
19227 https://users-cs.au.dk/hosc/local/HOSC-11-4-pp355-361.pdf[Definitional Interpreters Revisited].
19228 John C. Reynolds.
19229 <:#HOSC:> 1998.
19230
19231 * <!Anchor(Reynolds98_2)>
 https://users-cs.au.dk/hosc/local/HOSC-11-4-pp363-397.pdf[Definitional Interpreters for Higher-Order Programming Languages].
19233 John C. Reynolds.
19234 <:#HOSC:> 1998.
19235
19236 * <!Anchor(Rossberg01)>
19237 http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20Defects%20in%20the%20Revised%20Definition%20of%20Standard%20ML%20%5B2007-01-22%20Update%5D.pdf[Defects in the Revised Definition of Standard ML].
19238 Andreas Rossberg. 2001.
19239
19240== <!Anchor(SSS)>S ==
19241
19242 * <!Anchor(Sansom91)>
19243 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.1020&rep=rep1&type=ps[Dual-Mode Garbage Collection].
19244 Patrick M. Sansom.
19245 Workshop on the Parallel Implementation of Functional Languages, 1991.
19246
19247 * <!Anchor(ScottRamsey00)>
 http://www.cs.tufts.edu/~nr/pubs/match-abstract.html[When Do Match-Compilation Heuristics Matter?].
19249 Kevin Scott and Norman Ramsey.
19250 University of Virginia Technical Report CS-2000-13, 2000.
19251+
19252____
19253Modified SML/NJ to experimentally compare a number of
19254match-compilation heuristics and showed that choice of heuristic
19255usually does not significantly affect code size or run time.
19256____
19257
19258 * <!Anchor(Sestoft96)>
19259 http://www.itu.dk/~sestoft/papers/match.ps.gz[ML Pattern Match Compilation and Partial Evaluation].
19260 Peter Sestoft.
19261 Partial Evaluation, 1996.
19262+
19263____
19264Describes the derivation of the match compiler used in
19265<:MoscowML:Moscow ML>.
19266____
19267
19268 * <!Anchor(ShaoAppel94)>
19269 http://flint.cs.yale.edu/flint/publications/closure.html[Space-Efficient Closure Representations].
19270 Zhong Shao and Andrew W. Appel.
19271 <:#LFP:> 1994.
19272
19273 * <!Anchor(Shipman02)>
19274 <!Attachment(References,Shipman02.pdf,Unix System Programming with Standard ML)>.
19275 Anthony L. Shipman.
19276 2002.
19277+
19278____
19279Includes a description of the <:Swerve:> HTTP server written in SML.
19280____
19281
19282 * <!Anchor(Signoles03)>
19283 Calcul Statique des Applications de Modules Parametres.
19284 Julien Signoles.
19285 <:#JFLA:> 2003.
19286+
19287____
19288Describes a http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=382[defunctorizer]
19289for OCaml, and compares it to existing defunctorizers, including MLton.
19290____
19291
19292 * <!Anchor(SittampalamEtAl04)>
19293 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.1349&rep=rep1&type=ps[Incremental Execution of Transformation Specifications].
19294 Ganesh Sittampalam, Oege de Moor, and Ken Friis Larsen.
19295 <:#POPL:> 2004.
19296+
19297____
19298Mentions a port from Moscow ML to MLton of
19299http://www.itu.dk/research/muddy/[MuDDY], an SML wrapper around the
19300http://sourceforge.net/projects/buddy[BuDDY] BDD package.
19301____
19302
19303 * <!Anchor(SwaseyEtAl06)>
19304 http://www.cs.cmu.edu/~tom7/papers/smlsc2-ml06.pdf[A Separate Compilation Extension to Standard ML].
19305 David Swasey, Tom Murphy VII, Karl Crary and Robert Harper.
19306 <:#ML:> 2006.
19307
19308== <!Anchor(TTT)>T ==
19309
19310 * <!Anchor(TarditiAppel00)>
 http://www.smlnj.org/doc/ML-Yacc/index.html[ML-Yacc User's Manual. Version 2.4].
19312 David R. Tarditi and Andrew W. Appel. 2000.
19313
19314 * <!Anchor(TarditiEtAl90)>
19315 http://research.microsoft.com/pubs/68738/loplas-sml2c.ps[No Assembly Required: Compiling Standard ML to C].
19316 David Tarditi, Peter Lee, and Anurag Acharya. 1990.
19317
19318 * <!Anchor(ThorupTofte94)>
19319 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.5372&rep=rep1&type=ps[Object-oriented programming and Standard ML].
19320 Lars Thorup and Mads Tofte.
19321 <:#ML:>, 1994.
19322
19323 * <!Anchor(Tofte90)>
19324 Type Inference for Polymorphic References.
19325 Mads Tofte.
19326 <:#IC:> 1990.
19327
19328 * <!Anchor(Tofte96)>
19329 http://www.itu.dk/courses/FDP/E2004/Tofte-1996-Essentials_of_SML_Modules.pdf[Essentials of Standard ML Modules].
19330 Mads Tofte.
19331
19332 * <!Anchor(Tofte09)>
19333 http://www.itu.dk/people/tofte/publ/tips.pdf[Tips for Computer Scientists on Standard ML (Revised)].
19334 Mads Tofte.
19335
19336 * <!Anchor(TolmachAppel95)>
19337 http://web.cecs.pdx.edu/~apt/jfp95.ps[A Debugger for Standard ML].
19338 Andrew Tolmach and Andrew W. Appel.
19339 <:#JFP:> 1995.
19340
19341 * <!Anchor(Tolmach97)>
19342 http://web.cecs.pdx.edu/~apt/tic97.ps[Combining Closure Conversion with Closure Analysis using Algebraic Types].
19343 Andrew Tolmach.
19344 <:#TIC:> 1997.
19345+
19346____
19347Describes a closure-conversion algorithm for a monomorphic IL. The
19348algorithm uses a unification-based flow analysis followed by
19349defunctionalization and is similar to the approach used in MLton
19350(<!Cite(CejtinEtAl00)>).
19351____
19352
19353 * <!Anchor(TolmachOliva98)>
19354 http://web.cecs.pdx.edu/~apt/jfp98.ps[From ML to Ada: Strongly-typed Language Interoperability via Source Translation].
19355 Andrew Tolmach and Dino Oliva.
19356 <:#JFP:> 1998.
19357+
19358____
19359Describes a compiler for RML, a core SML-like language. The compiler
19360is similar in structure to MLton, using monomorphisation,
19361defunctionalization, and optimization on a first-order IL.
19362____
19363
19364== <!Anchor(UUU)>U ==
19365
19366 * <!Anchor(Ullman98)>
19367 http://www-db.stanford.edu/~ullman/emlp.html[Elements of ML Programming]
19368 (http://www3.addall.com/New/submitNew.cgi?query=0137903871&type=ISBN[addall]).
19369 ISBN 0137903871.
19370 Jeffrey D. Ullman.
19371 Prentice-Hall, 1998.
19372
19373== <!Anchor(VVV)>V ==
19374
19375{empty}
19376
19377== <!Anchor(WWW)>W ==
19378
19379 * <!Anchor(Wand84)>
19380 http://portal.acm.org/citation.cfm?id=800527[A Types-as-Sets Semantics for Milner-Style Polymorphism].
19381 Mitchell Wand.
19382 <:#POPL:> 1984.
19383
19384 * <!Anchor(Wang01)>
19385 http://ncstrl.cs.princeton.edu/expand.php?id=TR-640-01[Managing Memory with Types].
19386 Daniel C. Wang.
19387 PhD Thesis.
19388+
19389____
19390Chapter 6 describes an implementation of a type-preserving garbage
19391collector for MLton.
19392____
19393
19394 * <!Anchor(WangAppel01)>
19395 http://www.cs.princeton.edu/~appel/papers/typegc.pdf[Type-Preserving Garbage Collectors].
19396 Daniel C. Wang and Andrew W. Appel.
19397 <:#POPL:> 2001.
19398+
19399____
19400Shows how to modify MLton to generate a strongly-typed garbage
19401collector as part of a program.
19402____
19403
19404 * <!Anchor(WangMurphy02)>
19405 http://www.cs.cmu.edu/~tom7/papers/wang-murphy-recursion.pdf[Programming With Recursion Schemes].
19406 Daniel C. Wang and Tom Murphy VII.
19407+
19408____
19409Describes a programming technique for data abstraction, along with
19410benchmarks of MLton and other SML compilers.
19411____
19412
19413 * <!Anchor(Weeks06)>
19414 <!Attachment(References,060916-mlton.pdf,Whole-Program Compilation in MLton)>.
19415 Stephen Weeks.
19416 <:#ML:> 2006.
19417
19418 * <!Anchor(Wright95)>
19419 http://homepages.inf.ed.ac.uk/dts/fps/papers/wright.ps.gz[Simple Imperative Polymorphism].
19420 Andrew Wright.
19421 <:#LASC:>, 8(4):343-355, 1995.
19422+
19423____
19424The origin of the <:ValueRestriction:>.
19425____
19426
19427== <!Anchor(XXX)>X ==
19428
19429{empty}
19430
19431== <!Anchor(YYY)>Y ==
19432
19433 * <!Anchor(Yang98)>
19434 http://cs.nyu.edu/zheyang/papers/YangZ\--ICFP98.html[Encoding Types in ML-like Languages].
19435 Zhe Yang.
19436 <:#ICFP:> 1998.
19437
19438== <!Anchor(ZZZ)>Z ==
19439
19440 * <!Anchor(ZiarekEtAl06)>
19441 http://www.cs.purdue.edu/homes/lziarek/icfp06.pdf[Stabilizers: A Modular Checkpointing Abstraction for Concurrent Functional Programs].
19442 Lukasz Ziarek, Philip Schatz, and Suresh Jagannathan.
19443 <:#ICFP:> 2006.
19444
19445 * <!Anchor(ZiarekEtAl08)>
19446 http://www.cse.buffalo.edu/~lziarek/hosc.pdf[Flattening tuples in an SSA intermediate representation].
19447 Lukasz Ziarek, Stephen Weeks, and Suresh Jagannathan.
19448 <:#HOSC:> 2008.
19449
19450
19451== Abbreviations ==
19452
19453* <!Anchor(ACSD)> ACSD = International Conference on Application of Concurrency to System Design
19454* <!Anchor(BABEL)> BABEL = Workshop on multi-language infrastructure and interoperability
19455* <!Anchor(CC)> CC = International Conference on Compiler Construction
19456* <!Anchor(DPCOOL)> DPCOOL = Workshop on Declarative Programming in the Context of OO Languages
19457* <!Anchor(ESOP)> ESOP = European Symposium on Programming
19458* <!Anchor(FLOPS)> FLOPS = Symposium on Functional and Logic Programming
19459* <!Anchor(FPCA)> FPCA = Conference on Functional Programming Languages and Computer Architecture
19460* <!Anchor(HOSC)> HOSC = Higher-Order and Symbolic Computation
19461* <!Anchor(IC)> IC = Information and Computation
19462* <!Anchor(ICCL)> ICCL = IEEE International Conference on Computer Languages
19463* <!Anchor(ICFP)> ICFP = International Conference on Functional Programming
19464* <!Anchor(IFL)> IFL = International Workshop on Implementation and Application of Functional Languages
19465* <!Anchor(IVME)> IVME = Workshop on Interpreters, Virtual Machines and Emulators
19466* <!Anchor(JFLA)> JFLA = Journees Francophones des Langages Applicatifs
19467* <!Anchor(JFP)> JFP = Journal of Functional Programming
19468* <!Anchor(LASC)> LASC = Lisp and Symbolic Computation
19469* <!Anchor(LFP)> LFP = Lisp and Functional Programming
19470* <!Anchor(ML)> ML = Workshop on ML
19471* <!Anchor(PLDI)> PLDI = Conference on Programming Language Design and Implementation
19472* <!Anchor(POPL)> POPL = Symposium on Principles of Programming Languages
19473* <!Anchor(PPDP)> PPDP = International Conference on Principles and Practice of Declarative Programming
19474* <!Anchor(PPoPP)> PPoPP = Principles and Practice of Parallel Programming
19475* <!Anchor(TCS)> TCS = IFIP International Conference on Theoretical Computer Science
19476* <!Anchor(TIC)> TIC = Types in Compilation
19477* <!Anchor(TLDI)> TLDI = Workshop on Types in Language Design and Implementation
19478* <!Anchor(TOPLAS)> TOPLAS = Transactions on Programming Languages and Systems
19479* <!Anchor(TPHOLs)> TPHOLs = International Conference on Theorem Proving in Higher Order Logics
19480
19481<<<
19482
19483:mlton-guide-page: RefFlatten
19484[[RefFlatten]]
19485RefFlatten
19486==========
19487
19488<:RefFlatten:> is an optimization pass for the <:SSA2:>
19489<:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.
19490
19491== Description ==
19492
19493This pass flattens a `ref` cell into its containing object.
19494The idea is to replace, where possible, a type like
19495----
19496(int ref * real)
19497----
19498
19499with a type like
19500----
19501(int[m] * real)
19502----
19503
19504where the `[m]` indicates a mutable field of a tuple.
19505
19506== Implementation ==
19507
19508* <!ViewGitFile(mlton,master,mlton/ssa/ref-flatten.fun)>
19509
19510== Details and Notes ==
19511
19512The savings is obvious, I hope. We avoid an extra heap-allocated
19513object for the `ref`, which in the above case saves two words. We
19514also save the time and code for the extra indirection at each get and
19515set. There are lots of useful data structures (singly-linked and
19516doubly-linked lists, union-find, Fibonacci heaps, ...) for which I believe
19517we are paying through the nose right now because of the absence of ref
19518flattening.
19519
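To make the cost concrete, here is a small sketch of the kind of data structure meant above: a mutable singly-linked list whose nodes each carry a `ref`. The type and function names are assumptions made for this illustration and do not come from the MLton sources. Without flattening, each `Cons` holds a pointer to a separate heap-allocated `ref` object; with flattening, the tail could be stored as a mutable field of the `Cons` object itself. Whether MLton flattens this particular `ref` depends on the analysis described below.

[source,sml]
----
(* Hypothetical example: a mutable singly-linked list. *)
datatype 'a node = Nil | Cons of 'a * 'a node ref

(* The `ref` and the `Cons` object are allocated together, so the
 * `ref` is a natural candidate for becoming a mutable field of `Cons`. *)
fun cons (x, tl) = Cons (x, ref tl)

(* Destructively replace the tail of a node; a no-op on `Nil`. *)
fun setTail (Cons (_, r), tl) = r := tl
  | setTail (Nil, _) = ()

(* Count the nodes, following one `ref` indirection per node. *)
fun len Nil = 0
  | len (Cons (_, r)) = 1 + len (!r)
----
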
19520The idea is to compute for each occurrence of a `ref` type in the
19521program whether or not that `ref` can be represented as an offset of
19522some object (constructor or tuple). As before, a unification-based
19523whole-program analysis with deep abstract values keeps the results
19524consistent.
19525
19526The only syntactic part of the analysis that remains is the part that
19527checks that for a variable bound to a value constructed by `Ref_ref`:
19528
19529* the object allocation is in the same block. This is pretty
19530draconian, and it would be nice to generalize it some day to allow
19531flattening as long as the `ref` allocation and object allocation "line
19532up one-to-one" in the same loop-free chunk of code.
19533
19534* updates occur in the same block (and hence it is safe-for-space
19535because the containing object is still alive). It would be nice to
19536relax this to allow updates as long as it can be proved that the
19537container is live.
19538
19539Prevent flattening of `unit ref`-s.
19540
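The following source-level sketch illustrates the allocation condition; all names in it are hypothetical. It is only an approximation, since the pass works on <:SSA2:> after inlining and other optimizations, where the block structure may differ from what the source suggests. The last case follows from the semantics of `ref` rather than from anything stated above: a `ref` that is stored in two objects cannot be represented as a field of both.

[source,sml]
----
(* Hypothetical example: names and types are assumptions for this sketch. *)

(* The `ref` and the tuple that contains it are allocated side by side. *)
fun mkCounter (name : string) = (name, ref 0)

(* The `ref` is allocated by the caller, away from the tuple. *)
fun wrap (name : string, r : int ref) = (name, r)

(* One `ref` placed in two tuples: it cannot live inside both. *)
val shared = ref 0
val a = wrap ("a", shared)
val b = wrap ("b", shared)

fun bump (_, r) = r := !r + 1

(* An update through `a` must remain visible through `b`. *)
val () = bump a
----
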
19541<:RefFlatten:> is safe for space. The idea is to prevent a `ref`
19542being flattened into an object that has a component of unbounded size
19543(other than possibly the `ref` itself) unless we can prove that, at
19544each point where the `ref` is live, the containing object is live too.
19545I used a pretty simple approximation to liveness.
19546
19547<<<
19548
19549:mlton-guide-page: Regions
19550[[Regions]]
19551Regions
19552=======
19553
19554In region-based memory management, the heap is divided into a
19555collection of regions into which objects are allocated. At compile
19556time, either in the source program or through automatic inference,
19557allocation points are annotated with the region in which the
19558allocation will occur. Typically, although not always, the regions
19559are allocated and deallocated according to a stack discipline.
19560
19561MLton does not use region-based memory management; it uses traditional
19562<:GarbageCollection:>. We have considered integrating regions with
19563MLton, but in our opinion it is far from clear that regions would
19564provide MLton with improved performance, while they would certainly
19565add a lot of complexity to the compiler and complicate reasoning about
19566and achieving <:SpaceSafety:>. Region-based memory management and
19567garbage collection have different strengths and weaknesses; it's
19568pretty easy to come up with programs that do significantly better
19569under regions than under GC, and vice versa. We believe that
19570common SML idioms tend to work better under GC than under
19571regions.
19572
19573One common argument for regions is that the region operations can all
19574be done in (approximately) constant time; therefore, you eliminate GC
19575pause times, leading to a real-time GC. However, because of space
19576safety concerns (see below), we believe that region-based memory
19577management for SML must also include a traditional garbage collector.
19578Hence, to achieve real-time memory management for MLton/SML, we
19579believe that it would be both easier and more efficient to implement a
19580traditional real-time garbage collector than it would be to implement
19581a region system.
19582
19583== Regions, the ML Kit, and space safety ==
19584
19585The <:MLKit:ML Kit> pioneered the use of regions for compiling
19586Standard ML. The ML Kit maintains a stack of regions at run time. At
19587compile time, it uses region inference to decide when data can be
19588allocated in a stack-like manner, assigning it to an appropriate
19589region. The ML Kit has put a lot of effort into improving the
19590supporting analyses and representations of regions, which are all
19591necessary to improve the performance.
19592
19593Unfortunately, under a pure stack-based region system, space leaks are
19594inevitable in theory, and costly in practice. Data for which region
19595inference cannot determine the lifetime is moved into the "global
19596region" whose lifetime is the entire program. There are two ways in
19597which region inference will place an object in the global region.
19598
19599* When the inference is too conservative, that is, when the data is
19600used in a stack-like manner but the region inference can't figure it
19601out.
19602
19603* When data is not used in a stack-like manner. In this case,
19604correctness requires region inference to place the object in the global region.
19605
19606This global region is a source of space leaks. No matter what region
19607system you use, there are some programs such that the global region
19608must exist, and its size will grow to an unbounded multiple of the
19609live data size. For these programs one must have a GC to achieve
19610space safety.
19611
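A minimal sketch of such a program, under the assumption of a pure stack-of-regions system: the names `cache`, `remember`, `forgetAll`, and `churn` are invented for this illustration, and it demonstrates the general argument rather than a measured ML Kit result. The list cells stored in the top-level `ref` outlive every function call, so they would have to live in a region that lasts as long as `cache` does. At most three cells are live at any time, yet the total allocation grows with `n`; a tracing collector can reclaim the dropped cells, while a region that is never deallocated cannot.

[source,sml]
----
(* Hypothetical example: a long-lived mutable cache. *)
val cache : string list ref = ref []

(* Each call allocates one list cell that escapes into `cache`. *)
fun remember s = cache := s :: !cache

(* Drop the current contents; the old cells are now garbage. *)
fun forgetAll () = cache := []

(* Fill and empty the cache `n` times: bounded live data,
 * unbounded total allocation. *)
fun churn 0 = ()
  | churn n = (List.app remember ["a", "b", "c"]; forgetAll (); churn (n - 1))

val () = churn 1000
----
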
19612To solve this problem, the ML Kit developers have worked to combine
19613garbage collection with region-based memory management.
19614<!Cite(HallenbergEtAl02)> and <!Cite(Elsman03)> describe the addition
19615of a garbage collector to the ML Kit's region-based system. These
19616papers provide convincing evidence for space leaks in the global
19617region. They show a number of benchmarks where the memory usage of
19618the program running with just regions is a large multiple (2, 10, 50,
19619even 150) of the program running with regions plus GC.
19620
19621These papers also give some numbers to show the ML Kit with just
19622regions does better than either a system with just GC or a combined
19623system. Unfortunately, a pure region system isn't practical because
19624of the lack of space safety. The other performance numbers are
19625less convincing, because they compare against an old version of SML/NJ
19626and not at all against MLton. It would be interesting to see a
19627comparison with a more serious collector.
19628
19629== Regions, Garbage Collection, and Cyclone ==
19630
19631One possibility is to take Cyclone's approach, and provide both
19632region-based memory management and garbage collection, but at the
19633programmer's option (<!Cite(GrossmanEtAl02)>, <!Cite(HicksEtAl03)>).
19634
19635One might ask whether we might do the same thing -- i.e., provide a
19636`MLton.Regions` structure with explicit region-based memory
19637management operations, so that the programmer could use them when
19638appropriate. <:MatthewFluet:> has thought about this question:
19639
19640* http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html
19641
19642Unfortunately, his conclusion is that the SML type system is too weak
19643to support this option, although there might be a "poor-man's" version
19644with dynamic checks.
19645
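To give a feel for what a version with dynamic checks might look like, here is a purely illustrative sketch. The `REGION` signature and every name in it are assumptions; it is neither the design from the page linked above nor an existing MLton structure. It models only the run-time liveness check (an access after the region has ended raises `Dead`) and does not manage memory at all.

[source,sml]
----
(* Hypothetical interface: regions with dynamically checked lifetimes. *)
signature REGION =
sig
  type region
  type 'a rref
  exception Dead
  val withRegion : (region -> 'a) -> 'a
  val alloc : region * 'a -> 'a rref
  val get : 'a rref -> 'a
  val set : 'a rref * 'a -> unit
end

structure Region :> REGION =
struct
  (* A region is just a flag that is true while the region is live. *)
  datatype region = R of bool ref
  type 'a rref = region * 'a ref
  exception Dead

  fun check (R live) = if !live then () else raise Dead

  (* Run `f` with a fresh region; mark the region dead on any exit. *)
  fun withRegion f =
    let
      val live = ref true
      val result = f (R live) handle e => (live := false; raise e)
    in
      live := false;
      result
    end

  fun alloc (r, x) = (check r; (r, ref x))
  fun get (r, cell) = (check r; !cell)
  fun set ((r, cell), v) = (check r; cell := v)
end

(* A reference that escapes its region type-checks, but fails when used. *)
val leaked = Region.withRegion (fn r => Region.alloc (r, 0))
val caught = (Region.get leaked; false) handle Region.Dead => true
----

The SML type system accepts `leaked` without complaint, which is exactly the weakness noted above: the escape is only detected when `Region.get` runs.
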
19646<<<
19647
19648:mlton-guide-page: Release20041109
19649[[Release20041109]]
19650Release20041109
19651===============
19652
19653This is an archived public release of MLton, version 20041109.
19654
19655== Changes since the last public release ==
19656
19657* New platforms:
19658** x86: FreeBSD 5.x, OpenBSD
19659** PowerPC: Darwin (MacOSX)
19660* Support for the <:MLBasis: ML Basis system>, a new mechanism supporting programming in the very large, separate delivery of library sources, and more.
19661* Support for dynamic libraries.
19662* Support for <:ConcurrentML:> (CML).
19663* New structures: `Int2`, `Int3`, ..., `Int31` and `Word2`, `Word3`, ..., `Word31`.
19664* Front-end bug fixes and improvements.
19665* A new form of profiling with ++-profile count++, which can be used to test code coverage.
19666* A bytecode generator, available via ++-codegen bytecode++.
19667* Representation improvements:
19668** Tuples and datatypes are packed to decrease space usage.
19669** Ref cells may be unboxed into their containing object.
19670** Arrays of tuples may represent the tuples unboxed.
19671
19672For a complete list of changes and bug fixes since 20040227, see the
19673<!RawGitFile(mlton,on-20041109-release,doc/changelog)>.
19674
19675== Also see ==
19676
19677* <:Bugs20041109:>
19678
19679<<<
19680
19681:mlton-guide-page: Release20051202
19682[[Release20051202]]
19683Release20051202
19684===============
19685
19686This is an archived public release of MLton, version 20051202.
19687
19688== Changes since the last public release ==
19689
19690* The <:License:MLton license> is now BSD-style instead of the GPL.
19691* New platforms: <:RunningOnMinGW:X86/MinGW> and HPPA/Linux.
19692* Improved and expanded documentation, based on the MLton wiki.
19693* Compiler.
19694** improved exception history.
19695** <:CompileTimeOptions:Command-line switches>.
19696*** Added: ++-as-opt++, ++-mlb-path-map++, ++-target-as-opt++, ++-target-cc-opt++.
19697*** Removed: ++-native++, ++-sequence-unit++, ++-warn-match++, ++-warn-unused++.
19698* Language.
19699** <:ForeignFunctionInterface:FFI> syntax changes and extensions.
19700*** Added: `_symbol`.
19701*** Changed: `_export`, `_import`.
19702*** Removed: `_ffi`.
19703** <:MLBasisAnnotations:ML Basis annotations>.
19704*** Added: `allowFFI`, `nonexhaustiveExnMatch`, `nonexhaustiveMatch`, `redundantMatch`, `sequenceNonUnit`.
19705*** Deprecated: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`.
19706* Libraries.
19707** Basis Library.
19708*** Added: `Int1`, `Word1`.
19709** <:MLtonStructure:MLton structure>.
19710*** Added: `Process.create`, `ProcEnv.setgroups`, `Rusage.measureGC`, `Socket.fdToSock`, `Socket.Ctl.getError`.
19711*** Changed: `MLton.Platform.Arch`.
19712** Other libraries.
19713*** Added: <:CKitLibrary:ckit>, <:MLNLFFI:ML-NLFFI library>, <:SMLNJLibrary:SML/NJ library>.
19714* Tools.
19715** Updates of `mllex` and `mlyacc` from SML/NJ.
19716** Added <:MLNLFFI:mlnlffigen>.
19717** <:Profiling:> supports better inclusion/exclusion of code.
19718
19719For a complete list of changes and bug fixes since
19720<:Release20041109:>, see the
19721<!RawGitFile(mlton,on-20051202-release,doc/changelog)> and
19722<:Bugs20041109:>.
19723
19724== 20051202 binary packages ==
19725
19726* x86
19727** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-cygwin.tgz[Cygwin] 1.5.18-1
19728** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-freebsd.tbz[FreeBSD] 5.4
19729** Linux
19730*** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.deb[Debian] sid
19731*** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.stable.deb[Debian] stable (Sarge)
19732*** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386.rpm[RedHat] 7.1-9.3 FC1-FC4
19733*** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-linux.tgz[tgz] for other distributions (glibc 2.3)
19734** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-mingw.tgz[MinGW]
19735** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-netbsd.tgz[NetBSD] 2.0.2
19736** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-openbsd.tgz[OpenBSD] 3.7
19737* PowerPC
19738** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.powerpc-darwin.tgz[Darwin] 7.9.0 (Mac OS X)
19739* Sparc
19740** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.sparc-solaris.tgz[Solaris] 8
19741
19742== 20051202 source packages ==
19743
19744* http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.tgz[source tgz]
19745* Debian http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.dsc[dsc], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.diff.gz[diff.gz], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202.orig.tar.gz[orig.tar.gz]
19746* RedHat http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.rpm[source rpm]
19747
19748== Packages available at other sites ==
19749
19750* http://packages.debian.org/cgi-bin/search_packages.pl?searchon=names&version=all&exact=1&keywords=mlton[Debian]
19751* http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19752* Fedora Core http://fedoraproject.org/extras/4/i386/repodata/repoview/mlton-0-20051202-8.fc4.html[4] http://fedoraproject.org/extras/5/i386/repodata/repoview/mlton-0-20051202-8.fc5.html[5]
19753* http://packages.ubuntu.com/dapper/devel/mlton[Ubuntu]
19754
19755== Also see ==
19756
19757* <:Bugs20051202:>
19758* http://www.mlton.org/guide/20051202/[MLton Guide (20051202)].
19759+
19760A snapshot of the MLton wiki at the time of release.
19761
19762<<<
19763
19764:mlton-guide-page: Release20070826
19765[[Release20070826]]
19766Release20070826
19767===============
19768
19769This is an archived public release of MLton, version 20070826.
19770
19771== Changes since the last public release ==
19772
19773* New platforms:
19774** <:RunningOnAMD64:AMD64>/<:RunningOnLinux:Linux>, <:RunningOnAMD64:AMD64>/<:RunningOnFreeBSD:FreeBSD>
19775** <:RunningOnHPPA:HPPA>/<:RunningOnHPUX:HPUX>
19776** <:RunningOnPowerPC:PowerPC>/<:RunningOnAIX:AIX>
19777** <:RunningOnX86:X86>/<:RunningOnDarwin:Darwin (Mac OS X)>
19778* Compiler.
19779** Support for 64-bit platforms.
19780*** Native amd64 codegen.
19781** <:CompileTimeOptions:Compile-time options>.
19782*** Added: ++-codegen amd64++, ++-codegen x86++, ++-default-type __type__++, ++-profile-val {false|true}++.
19783*** Changed: ++-stop f++ (file listing now includes `.mlb` files).
19784** Bytecode codegen.
19785*** Support for exception history.
19786*** Support for profiling.
19787* Language.
19788** <:MLBasisAnnotations:ML Basis annotations>.
19789*** Removed: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`.
19790* Libraries.
19791** <:BasisLibrary:Basis Library>.
19792*** Added: `PackWord16Big`, `PackWord16Little`, `PackWord64Big`, `PackWord64Little`.
19793*** Bug fixes: see <!RawGitFile(mlton,on-20070826-release,doc/changelog)>.
19794** <:MLtonStructure:MLton structure>.
19795*** Added: `MLTON_MONO_ARRAY`, `MLTON_MONO_VECTOR`, `MLTON_REAL`, `MLton.BinIO.tempPrefix`, `MLton.CharArray`, `MLton.CharVector`, `MLton.Exn.defaultTopLevelHandler`, `MLton.Exn.getTopLevelHandler`, `MLton.Exn.setTopLevelHandler`, `MLton.IntInf.BigWord`, `MLton.IntInf.SmallInt`, `MLton.LargeReal`, `MLton.LargeWord`, `MLton.Real`, `MLton.Real32`, `MLton.Real64`, `MLton.Rlimit.Rlim`, `MLton.TextIO.tempPrefix`, `MLton.Vector.create`, `MLton.Word.bswap`, `MLton.Word8.bswap`, `MLton.Word16`, `MLton.Word32`, `MLton.Word64`, `MLton.Word8Array`, `MLton.Word8Vector`.
19796*** Changed: `MLton.Array.unfoldi`, `MLton.IntInf.rep`, `MLton.Rlimit`, `MLton.Vector.unfoldi`.
19797*** Deprecated: `MLton.Socket`.
19798** Other libraries.
19799*** Added: <:MLRISCLibrary:MLRISC library>.
19800*** Updated: <:CKitLibrary:ckit library>, <:SMLNJLibrary:SML/NJ library>.
19801* Tools.
19802
19803For a complete list of changes and bug fixes since
19804<:Release20051202:>, see the
19805<!RawGitFile(mlton,on-20070826-release,doc/changelog)> and
19806<:Bugs20051202:>.
19807
19808== 20070826 binary packages ==
19809
19810* AMD64
19811** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.amd64-linux.tgz[Linux], glibc 2.3
19812* HPPA
19813** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.hppa-hpux1100.tgz[HPUX] 11.00 and above, statically linked against <:GnuMP:>
19814* PowerPC
19815** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-aix51.tgz[AIX] 5.1 and above, statically linked against <:GnuMP:>
19816** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-static.tgz[Darwin] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19817** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-macports.tgz[Darwin] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19818* Sparc
19819** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.sparc-solaris8.tgz[Solaris] 8 and above, statically linked against <:GnuMP:>
19820* X86
19821** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-cygwin.tgz[Cygwin] 1.5.24-2
19822** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19823** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.dmg[Darwin (.dmg)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19824** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19825** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.dmg[Darwin (.dmg)] 8.10 (Mac OS X), statically linked against <:GnuMP:>
19826** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-freebsd.tgz[FreeBSD]
19827** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.tgz[Linux], glibc 2.3
19828** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.glibc213.gmp-static.tgz[Linux], glibc 2.1, statically linked against <:GnuMP:>
19829** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-dll.tgz[MinGW], dynamically linked against <:GnuMP:> (requires `libgmp-3.dll`)
19830** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-static.tgz[MinGW], statically linked against <:GnuMP:>
19831
19832== 20070826 source packages ==
19833
19834 * http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.src.tgz[source tgz]
19835
19836 * Debian http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.dsc[dsc],
19837 http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.diff.gz[diff.gz],
19838 http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826.orig.tar.gz[orig.tar.gz]
19839
19840== Packages available at other sites ==
19841
19842* http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
19843* http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19844* https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora]
19845* http://packages.ubuntu.com/cgi-bin/search_packages.pl?keywords=mlton&searchon=names&version=all&release=all[Ubuntu]
19846
19847== Also see ==
19848
19849* <:Bugs20070826:>
19850* http://www.mlton.org/guide/20070826/[MLton Guide (20070826)].
19851+
19852A snapshot of the MLton wiki at the time of release.
19853
19854<<<
19855
19856:mlton-guide-page: Release20100608
19857[[Release20100608]]
19858Release20100608
19859===============
19860
19861This is an archived public release of MLton, version 20100608.
19862
19863== Changes since the last public release ==
19864
19865* New platforms.
19866** <:RunningOnAMD64:AMD64>/<:RunningOnDarwin:Darwin> (Mac OS X Snow Leopard)
19867** <:RunningOnIA64:IA64>/<:RunningOnHPUX:HPUX>
19868** <:RunningOnPowerPC64:PowerPC64>/<:RunningOnAIX:AIX>
19869* Compiler.
19870** <:CompileTimeOptions:Command-line switches>.
19871*** Added: ++-mlb-path-var __<name> <value>__++
19872*** Removed: ++-keep sml++, ++-stop sml++
19873** Improved constant folding of floating-point operations.
19874** Experimental: Support for compiling to a C library; see <:LibrarySupport: documentation>.
19875** Extended ++-show-def-use __output__++ to include types of variable definitions.
19876** Deprecated features (to be removed in a future release)
19877*** Bytecode codegen: The bytecode codegen has not seen significant use and it is not well understood by any of the active developers.
19878*** Support for `.cm` files as input: The ML Basis system provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers.
19879** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19880* Runtime.
19881** <:RunTimeOptions:@MLton switches>.
19882*** Added: ++may-page-heap {false|true}++
19883** ++may-page-heap++: By default, MLton will not page the heap to disk when unable to grow the heap to accommodate an allocation. (Previously, paging the heap to disk was the default, with no means to disable it, which raised security and least-surprise issues.)
19884** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19885* Language.
19886** Allow numeric characters in <:MLBasis:ML Basis> path variables.
19887* Libraries.
19888** <:BasisLibrary:Basis Library>.
19889*** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19890** <:MLtonStructure:MLton structure>.
19891*** Added: `MLton.equal`, `MLton.hash`, `MLton.Cont.isolate`, `MLton.GC.Statistics`, `MLton.Pointer.sizeofPointer`, `MLton.Socket.Address.toVector`
19892*** Changed:
19893*** Deprecated: `MLton.Socket`
19894** <:UnsafeStructure:Unsafe structure>.
19895*** Added versions of all of the monomorphic array and vector structures.
19896** Other libraries.
19897*** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library>.
19898* Tools.
19899** `mllex`
19900*** Eliminated top-level `type int = Int.int` in output.
19901*** Include `(*#line line:col "file.lex" *)` directives in output.
19902*** Added `%posint` command, to set the `yypos` type and allow the lexing of multi-gigabyte files.
19903** `mlnlffigen`
19904*** Added command-line switches `-linkage archive` and `-linkage shared`.
19905*** Deprecated command-line switch `-linkage static`.
19906*** Added support for <:RunningOnIA64:IA64> and <:RunningOnHPPA:HPPA> targets.
19907** `mlyacc`
19908*** Eliminated top-level `type int = Int.int` in output.
19909*** Include `(*#line line:col "file.grm" *)` directives in output.
19910
19911For a complete list of changes and bug fixes since <:Release20070826:>, see the
19912<!RawGitFile(mlton,on-20100608-release,doc/changelog)>
19913and <:Bugs20070826:>.
19914
19915== 20100608 binary packages ==
19916
19917* AMD64 (aka "x86-64" or "x64")
19918** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19919** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
19920** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.tgz[Linux], glibc 2.11
19921** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.static.tgz[Linux], statically linked
19922** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer
19923* X86
19924** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-cygwin.tgz[Cygwin] 1.7.5
19925** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
19926** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
19927** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.tgz[Linux], glibc 2.11
19928** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.static.tgz[Linux], statically linked
19929** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer
19930
19931== 20100608 source packages ==
19932
19933 * http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608.src.tgz[mlton-20100608.src.tgz]
19934
19935== Packages available at other sites ==
19936
19937 * http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
19938 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
19939 * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
19940 * http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu]
19941
19942== Also see ==
19943
19944* <:Bugs20100608:>
19945* http://www.mlton.org/guide/20100608/[MLton Guide (20100608)].
19946+
19947A snapshot of the MLton wiki at the time of release.
19948
19949<<<
19950
19951:mlton-guide-page: Release20130715
19952[[Release20130715]]
19953Release20130715
19954===============
19955
19956This is an archived public release of MLton, version 20130715.
19957
19958== Changes since the last public release ==
19959
19960// * New platforms.
19961// ** ???
19962* Compiler.
19963** Cosmetic improvements to type-error messages.
19964** Removed features:
19965*** Bytecode codegen: The bytecode codegen had not seen significant use and it was not well understood by any of the active developers.
19966*** Support for `.cm` files as input: The <:MLBasis:ML Basis system> provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers.
19967** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19968* Runtime.
19969** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19970* Language.
19971** Interpret `(*#line line:col "file" *)` directives as relative file names.
19972** <:MLBasisAnnotations:ML Basis annotations>.
19973*** Added: `resolveScope`
19974* Libraries.
19975** <:BasisLibrary:Basis Library>.
19976*** Improved performance of `String.concatWith`.
19977*** Use bit operations for `REAL.class` and other low-level operations.
19978*** Support additional variables with `Posix.ProcEnv.sysconf`.
19979*** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)>
19980** <:MLtonStructure:MLton structure>.
19981*** Removed: `MLton.Socket`
19982** Other libraries.
19983*** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library>
19984*** Added: <:MLLPTLibrary:MLLPT library>
19985* Tools.
19986** `mllex`
19987*** Generate `(*#line line:col "file.lex" *)` directives with simple (relative) file names, rather than absolute paths.
19988** `mlyacc`
19989*** Generate `(*#line line:col "file.grm" *)` directives with simple (relative) file names, rather than absolute paths.
19990*** Fixed bug in comment-handling in lexer.
19991
19992For a complete list of changes and bug fixes since
19993<:Release20100608:>, see the
19994<!RawGitFile(mlton,on-20130715-release,doc/changelog)> and
19995<:Bugs20100608:>.
19996
19997== 20130715 binary packages ==
19998
19999* AMD64 (aka "x86-64" or "x64")
20000** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>)
20001** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
20002** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.tgz[Linux], glibc 2.15
20003// ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.static.tgz[Linux], statically linked
20004// ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer
20005* X86
20006// ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-cygwin.tgz[Cygwin] 1.7.5
20007** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.tgz[Linux], glibc 2.15
20008// ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.static.tgz[Linux], statically linked
20009// ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer
20010
20011== 20130715 source packages ==
20012
20013 * http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715.src.tgz[mlton-20130715.src.tgz]
20014
20015== Downstream packages ==
20016
20017 * http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian]
20018 * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD]
20019 * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora]
20020 * http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu]
20021
20022== Also see ==
20023
20024* <:Bugs20130715:>
20025* http://www.mlton.org/guide/20130715/[MLton Guide (20130715)].
20026+
20027A snapshot of the MLton website at the time of release.
20028
20029<<<
20030
20031:mlton-guide-page: Release20180207
20032[[Release20180207]]
20033Release20180207
20034===============
20035
20036Here you can download the latest public release of MLton, version 20180207.
20037
20038== Changes since the last public release ==
20039
20040* Compiler.
20041 ** Added an experimental LLVM codegen (`-codegen llvm`); requires LLVM tools
20042 (`llvm-as`, `opt`, `llc`) version &ge; 3.7.
20043 ** Made many substantial cosmetic improvements to front-end diagnostic
20044 messages, especially with respect to source location regions, type inference
20045 for `fun` and `val rec` declarations, signature constraints applied to a
20046 structure, `sharing type` specifications and `where type` signature
20047 expressions, type constructor or type variable escaping scope, and
20048 nonexhaustive pattern matching.
20049 ** Fixed minor bugs with exception replication, precedence parsing of function
20050 clauses, and simultaneous `sharing` of multiple structures.
20051 ** Made compilation deterministic (eliminate output executable name from
20052 compile-time specified `@MLton` runtime arguments; deterministically generate
20053 magic constant for executable).
20054 ** Updated `-show-basis` (recursively expand structures in environments,
20055 displaying components with long identifiers; append `(* @ region *)`
20056 annotations to items shown in environment).
20057 ** Forced amd64 codegen to generate PIC on amd64-linux targets.
20058* Runtime.
20059 ** Added `gc-summary-file file` runtime option.
20060 ** Reorganized runtime support for `IntInf` operations so that programs that
20061 do not use `IntInf` compile to executables with no residual dependency on GMP.
20062 ** Changed heap representation to store forwarding pointer for an object in
20063 the object header (rather than in the object data and setting the header to a
20064 sentinel value).
20065* Language.
20066 ** Added support for selected SuccessorML features; see
20067 http://mlton.org/SuccessorML for details.
20068 ** Added `(*#showBasis "file" *)` directive; see
20069 http://mlton.org/ShowBasisDirective for details.
20070 ** FFI:
20071 *** Added `pure`, `impure`, and `reentrant` attributes to `_import`. An
20072 unattributed `_import` is treated as `impure`. A `pure` `_import` may be
20073 subject to more aggressive optimizations (common subexpression elimination,
20074 dead-code elimination). An `_import`-ed C function that (directly or
20075 indirectly) calls an `_export`-ed SML function should be attributed
20076 `reentrant`.
20077 ** ML Basis annotations.
20078 *** Added `allowSuccessorML {false|true}` to enable all SuccessorML features
20079 and other annotations to enable specific SuccessorML features; see
20080 http://mlton.org/SuccessorML for details.
 *** Split `nonexhaustiveMatch {warn|error|ignore}` and `redundantMatch
20082 {warn|error|ignore}` into `nonexhaustiveMatch` and `redundantMatch`
20083 (controls diagnostics for `case` expressions, `fn` expressions, and `fun`
20084 declarations (which may raise `Match` on failure)) and `nonexhaustiveBind`
20085 and `redundantBind` (controls diagnostics for `val` declarations (which may
20086 raise `Bind` on failure)).
20087 *** Added `valrecConstr {warn|error|ignore}` to report when a `val rec` (or
20088 `fun`) declaration redefines an identifier that previously had constructor
20089 status.
20090* Libraries.
20091 ** Basis Library.
20092 *** Improved performance of `Array.copy`, `Array.copyVec`, `Vector.append`,
20093 `String.^`, `String.concat`, `String.concatWith`, and other related
20094 functions by using `memmove` rather than element-by-element constructions.
20095 ** `Unsafe` structure.
20096 *** Added unsafe operations for array uninitialization and raw arrays; see
20097 https://github.com/MLton/mlton/pull/207 for details.
20098 ** Other libraries.
20099 *** Updated: ckit library, MLLPT library, MLRISC library, SML/NJ library
20100* Tools.
20101 ** mlnlffigen
20102 *** Updated to warn and skip (rather than abort) when encountering functions
20103 with `struct`/`union` argument or return type.
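
The `_import` attributes described under FFI above might be used as in the following minimal sketch. It assumes compilation with `-default-ann 'allowFFI true'` and linking against the C library; the choice of `abs` and `putchar` as import targets is purely illustrative.

[source,sml]
----
(* C's abs has no side effects and depends only on its argument,
 * so attributing it pure is sound; MLton may then eliminate
 * duplicate or unused calls. *)
val cAbs = _import "abs" pure: int -> int;

(* putchar writes to stdout, so it must not be pure.  Leaving the
 * attribute off would have the same effect as writing impure. *)
val cPutChar = _import "putchar" impure: int -> int;

(* A C function that calls back into an _export-ed SML function
 * would instead be attributed reentrant (not shown here). *)
val _ = cPutChar (cAbs ~65)
val _ = cPutChar 10
----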
20104
20105For a complete list of changes and bug fixes since
20106<:Release20130715:>, see the
20107<!ViewGitFile(mlton,on-20180207-release,CHANGELOG.adoc)> and
20108<:Bugs20130715:>.
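
The split between the `Match` and `Bind` annotations listed above mirrors the two run-time exceptions involved. The following is a minimal sketch; the function names are illustrative only.

[source,sml]
----
(* A nonexhaustive fun declaration: applying it to [] raises Match.
 * Its diagnostics fall under nonexhaustiveMatch. *)
fun head (x :: _) = x

(* A nonexhaustive val declaration: a list whose length is not 2
 * raises Bind.  Its diagnostics fall under nonexhaustiveBind. *)
fun firstOfPair xs = let val [a, _] = xs in a end

val fromMatch = (head [] : int) handle Match => ~1
val fromBind = (firstOfPair [7] : int) handle Bind => ~2
----

With the default settings both declarations compile with warnings. Wrapping a (hypothetical) `file.sml` in an ML Basis file as `ann "nonexhaustiveBind error" in file.sml end` should reject only the `val` declaration.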
20109
20110== 20180207 binary packages ==
20111
20112* AMD64 (aka "x86-64" or "x64")
20113** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-homebrew.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), dynamically linked against <:GnuMP:> in `/usr/local/lib` (suitable for https://brew.sh/[Homebrew] install of <:GnuMP:>)
20114** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables)
20115** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-linux.tgz[Linux], glibc 2.23
20116// ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer
20117// * X86
20118// ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-cygwin.tgz[Cygwin] 1.7.5
20119// ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.tgz[Linux], glibc 2.23
20120// ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.static.tgz[Linux], statically linked
20121// ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer
20122
20123== 20180207 source packages ==
20124
20125 * https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207.src.tgz[mlton-20180207.src.tgz]
20126
20127== Also see ==
20128
20129* <:Bugs20180207:>
20130* http://www.mlton.org/guide/20180207/[MLton Guide (20180207)].
20131+
20132A snapshot of the MLton website at the time of release.
20133
20134<<<
20135
20136:mlton-guide-page: ReleaseChecklist
20137[[ReleaseChecklist]]
20138ReleaseChecklist
20139================
20140
20141== Advance preparation for release ==
20142
20143* Update `./CHANGELOG.adoc`.
20144** Write entries for missing notable commits.
20145** Write summary of changes from previous release.
20146** Update with estimated release date.
20147* Update `./README.adoc`.
20148** Check features and description.
20149* Update `man/{mlton,mlprof}.1`.
20150** Check compile-time and run-time options in `man/mlton.1`.
20151** Check options in `man/mlprof.1`.
20152** Update with estimated release date.
20153* Update `doc/guide`.
20154// ** Check <:OrphanedPages:> and <:WantedPages:>.
20155** Synchronize <:Features:> page with `./README.adoc`.
20156** Update <:Credits:> page with acknowledgements.
20157** Create *ReleaseYYYYMM??* page (i.e., forthcoming release) based on *ReleaseXXXXLLCC* (i.e., previous release).
20158*** Update summary from `./CHANGELOG.adoc`.
20159*** Update links to estimated release date.
20160** Create *BugsYYYYMM??* page based on *BugsXXXXLLCC*.
20161*** Update links to estimated release date.
20162** Spell check pages.
20163* Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>.
20164
20165== Prepare sources for tagging ==
20166
20167* Update `./CHANGELOG.adoc`.
20168** Update with proper release date.
20169* Update `man/{mlton,mlprof}.1`.
20170** Update with proper release date.
20171* Update `doc/guide`.
20172** Rename *ReleaseYYYYMM??* to *ReleaseYYYYMMDD* with proper release date.
20173*** Update links with proper release date.
20174** Rename *BugsYYYYMM??* to *BugsYYYYMMDD* with proper release date.
20175*** Update links with proper release date.
20176** Update *ReleaseXXXXLLCC*.
20177*** Change intro to "`This is an archived public release of MLton, version XXXXLLCC.`"
20178** Update <:Home:> with note of new release.
20179*** Change `What's new?` text to `Please try out our new release, <:ReleaseYYYYMMDD:MLton YYYYMMDD>`.
20180*** Update `Download` link with proper release date.
20181** Update <:Releases:> with new release.
20182* Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>.
20183
20184== Tag sources ==
20185
20186* Shell commands:
20187+
----
git clone http://github.com/MLton/mlton mlton.git
cd mlton.git
git checkout master
git tag -a -m "Tagging YYYYMMDD release" on-YYYYMMDD-release master
git push origin on-YYYYMMDD-release
----
20195
20196== Packaging ==
20197
20198=== SourceForge FRS ===
20199
20200* Create *YYYYMMDD* directory:
20201+
-----
sftp user@frs.sourceforge.net:/home/frs/project/mlton/mlton
sftp> mkdir YYYYMMDD
sftp> quit
-----
20207
20208=== Source release ===
20209
20210* Create `mlton-YYYYMMDD.src.tgz`:
20211+
----
git clone http://github.com/MLton/mlton mlton
cd mlton
git checkout on-YYYYMMDD-release
make MLTON_VERSION=YYYYMMDD source-release
cd ..
----
20219+
20220or
20221+
----
wget https://github.com/MLton/mlton/archive/on-YYYYMMDD-release.tar.gz
tar xzvf on-YYYYMMDD-release.tar.gz
cd mlton-on-YYYYMMDD-release
make MLTON_VERSION=YYYYMMDD source-release
cd ..
----
20229
20230* Upload `mlton-YYYYMMDD.src.tgz`:
20231+
-----
scp mlton-YYYYMMDD.src.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/
-----
20235
20236* Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD.src.tgz` link.
20237
20238=== Binary releases ===
20239
20240* Build and create `mlton-YYYYMMDD-1.ARCH-OS.tgz`:
20241+
----
wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz
tar xzvf mlton-YYYYMMDD.src.tgz
cd mlton-YYYYMMDD
make binary-release
cd ..
----
20249
20250* Upload `mlton-YYYYMMDD-1.ARCH-OS.tgz`:
20251+
-----
scp mlton-YYYYMMDD-1.ARCH-OS.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/
-----
20255
20256* Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD-1.ARCH-OS.tgz` link.
20257
20258== Website ==
20259
20260* `guide/YYYYMMDD` gets a copy of `doc/guide/localhost`.
20261* Shell commands:
20262+
----
wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz
tar xzvf mlton-YYYYMMDD.src.tgz
cd mlton-YYYYMMDD
cd doc/guide
cp -prf localhost YYYYMMDD
tar czvf guide-YYYYMMDD.tgz YYYYMMDD
rsync -avzP --delete -e ssh YYYYMMDD user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/
rsync -avzP --delete -e ssh guide-YYYYMMDD.tgz user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/
----
20273
20274== Announce release ==
20275
20276* Mail announcement to:
20277** mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]
20278** mailto:MLton-user@mlton.org[`MLton-user@mlton.org`]
20279
20280== Misc. ==
20281
20282* Generate new <:Performance:> numbers.
20283
20284<<<
20285
20286:mlton-guide-page: Releases
20287[[Releases]]
20288Releases
20289========
20290
20291Public releases of MLton:
20292
20293* <:Release20180207:>
20294* <:Release20130715:>
20295* <:Release20100608:>
20296* <:Release20070826:>
20297* <:Release20051202:>
20298* <:Release20041109:>
20299* Release20040227
20300* Release20030716
20301* Release20030711
20302* Release20030312
20303* Release20020923
20304* Release20020410
20305* Release20011006
20306* Release20010806
20307* Release20010706
20308* Release20000906
20309* Release20000712
20310* Release19990712
20311* Release19990319
20312* Release19980826
20313
20314<<<
20315
20316:mlton-guide-page: RemoveUnused
20317[[RemoveUnused]]
20318RemoveUnused
20319============
20320
20321<:RemoveUnused:> is an optimization pass for both the <:SSA:> and
20322<:SSA2:> <:IntermediateLanguage:>s, invoked from <:SSASimplify:> and
20323<:SSA2Simplify:>.
20324
20325== Description ==
20326
20327This pass aggressively removes unused:
20328
20329* datatypes
20330* datatype constructors
20331* datatype constructor arguments
20332* functions
20333* function arguments
20334* function returns
20335* blocks
20336* block arguments
20337* statements (variable bindings)
20338* handlers from non-tail calls (mayRaise analysis)
20339* continuations from non-tail calls (mayReturn analysis)
20340
20341== Implementation ==
20342
20343* <!ViewGitFile(mlton,master,mlton/ssa/remove-unused.fun)>
20344* <!ViewGitFile(mlton,master,mlton/ssa/remove-unused2.fun)>
20345
20346== Details and Notes ==
20347
20348{empty}
20349
20350<<<
20351
20352:mlton-guide-page: Restore
20353[[Restore]]
20354Restore
20355=======
20356
20357<:Restore:> is a rewrite pass for the <:SSA:> and <:SSA2:>
20358<:IntermediateLanguage:>s, invoked from <:KnownCase:> and
20359<:LocalRef:>.
20360
20361== Description ==
20362
20363This pass restores the SSA condition for a violating <:SSA:> or
20364<:SSA2:> program; the program must satisfy:
20365____
20366Every path from the root to a use of a variable (excluding globals)
20367passes through a def of that variable.
20368____
20369
20370== Implementation ==
20371
20372* <!ViewGitFile(mlton,master,mlton/ssa/restore.sig)>
20373* <!ViewGitFile(mlton,master,mlton/ssa/restore.fun)>
20374* <!ViewGitFile(mlton,master,mlton/ssa/restore2.sig)>
20375* <!ViewGitFile(mlton,master,mlton/ssa/restore2.fun)>
20376
20377== Details and Notes ==
20378
20379Based primarily on Section 19.1 of <!Cite(Appel98, Modern Compiler
20380Implementation in ML)>.
20381
20382The main deviation is the calculation of liveness of the violating
20383variables, which is used to predicate the insertion of phi arguments.
20384This is due to the algorithm's bias towards imperative languages, for
20385which it makes the assumption that all variables are defined in the
20386start block and all variables are "used" at exit.
20387
This is "optimized" for restoring functions with small numbers of
violating variables: sets of violating variables are represented as
bool vectors.
20391
20392Also, we use a `Promise.t` to suspend part of the dominance frontier
20393computation.
20394
20395<<<
20396
20397:mlton-guide-page: ReturnStatement
20398[[ReturnStatement]]
20399ReturnStatement
20400===============
20401
20402Programmers coming from languages that have a `return` statement, such
20403as C, Java, and Python, often ask how one can translate functions that
20404return early into SML. This page briefly describes a number of ways
20405to translate uses of `return` to SML.
20406
20407== Conditional iterator function ==
20408
A conditional iterator function, such as
http://www.standardml.org/Basis/list.html#SIG:LIST.find:VAL[`List.find`],
http://www.standardml.org/Basis/list.html#SIG:LIST.exists:VAL[`List.exists`],
or
http://www.standardml.org/Basis/list.html#SIG:LIST.all:VAL[`List.all`],
is probably what you want in most cases. Unfortunately, the particular
conditional iteration pattern that you want might not be provided for
your data structure. Usually the best alternative in such a case is to
implement the desired iteration pattern as a higher-order function.
For example, to implement a `find` function for arrays (which already
exists as
http://www.standardml.org/Basis/array.html#SIG:ARRAY.findi:VAL[`Array.find`]),
one could write
20422
[source,sml]
----
(* Returns SOME of the first element satisfying predicate, or NONE. *)
fun find predicate array = let
   fun loop i =
      if i = Array.length array then
         NONE  (* reached the end without a match *)
      else if predicate (Array.sub (array, i)) then
         SOME (Array.sub (array, i))
      else
         loop (i+1)
in
   loop 0
end
----
20437
20438Of course, this technique, while probably the most common case in
20439practice, applies only if you are essentially iterating over some data
20440structure.
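
For comparison, the Basis Library iterators mentioned above can be used directly. A small sketch with illustrative data:

[source,sml]
----
val xs = [3, 8, 12, 5]

(* Each call stops at the first element that decides the answer. *)
val firstLarge = List.find (fn x => x > 7) xs      (* SOME 8 *)
val anyNegative = List.exists (fn x => x < 0) xs   (* false *)
val allPositive = List.all (fn x => x > 0) xs      (* true *)

(* The array version from the Basis Library behaves the same way. *)
val firstLargeInArray =
   Array.find (fn x => x > 7) (Array.fromList xs)  (* SOME 8 *)
----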
20441
20442== Escape handler ==
20443
The most direct way to translate code using `return` statements is to
implement `return` with exception handling. The mechanism can be
packaged into a reusable module with the signature
20448(<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/control/exit.sig)>):
20449[source,sml]
20450----
20451sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/control/exit.sig 6:]
20452----
20453
20454(<!Cite(HarperEtAl93, Typing First-Class Continuations in ML)>
20455discusses the typing of a related construct.) The implementation
20456(<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/control/exit.sml)>)
20457is straightforward:
20458[source,sml]
20459----
20460sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/control/exit.sml 6:]
20461----
20462
20463Here is an example of how one could implement a `find` function given
20464an `app` function:
[source,sml]
----
fun appToFind (app : ('a -> unit) -> 'b -> unit)
              (predicate : 'a -> bool)
              (data : 'b) =
   Exit.call
      (fn return =>
          (app (fn x =>
                   if predicate x then
                      return (SOME x)
                   else
                      ())
               data
         ; NONE))
----
20480
20481In the above, as soon as the expression `predicate x` evaluates to
20482`true` the `app` invocation is terminated.
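
To make the mechanism concrete, here is a minimal sketch of an `Exit` structure that provides only `call`. It is an assumption-laden stand-in, not the mltonlib implementation referenced above, which exposes a richer signature.

[source,sml]
----
structure Exit :> sig
   val call : (('a -> 'b) -> 'a) -> 'a
end = struct
   fun call (body : ('a -> 'b) -> 'a) : 'a = let
      (* A fresh exception per invocation, so nested calls to
       * Exit.call cannot intercept each other's escapes. *)
      exception Escape of 'a
   in
      body (fn value => raise Escape value)
      handle Escape value => value
   end
end

(* Stop the traversal at the first even element. *)
val firstEven =
   Exit.call
      (fn return =>
          (List.app (fn x => if x mod 2 = 0 then return (SOME x) else ())
                    [1, 3, 4, 6]
         ; NONE))
----

Here `firstEven` is `SOME 4`; the element `6` is never visited because raising the exception abandons the rest of `List.app`.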
20483
20484
20485== Continuation-passing Style (CPS) ==
20486
20487A general way to implement complex control patterns is to use
20488http://en.wikipedia.org/wiki/Continuation-passing_style[CPS]. In CPS,
20489instead of returning normally, functions invoke a function passed as
20490an argument. In general, multiple continuation functions may be
20491passed as arguments and the ordinary return continuation may also be
20492used. As an example, here is a function that finds the leftmost
20493element of a binary tree satisfying a given predicate:
[source,sml]
----
datatype 'a tree = LEAF | BRANCH of 'a tree * 'a * 'a tree

fun find predicate = let
   fun recurse continue =
      fn LEAF =>
         continue ()
       | BRANCH (lhs, elem, rhs) =>
         recurse
            (fn () =>
                if predicate elem then
                   SOME elem
                else
                   recurse continue rhs)
            lhs
in
   recurse (fn () => NONE)
end
----
20514
20515Note that the above function returns as soon as the leftmost element
20516satisfying the predicate is found.
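
The remark that multiple continuation functions may be passed can be illustrated with a separate, smaller sketch. The names `lookup`, `onFound`, and `onMissing` are illustrative and do not come from any library.

[source,sml]
----
(* Association-list lookup in CPS: exactly one of the two
 * continuations is invoked, and its result is the overall result. *)
fun lookup (key : string) (onFound : int -> 'r) (onMissing : unit -> 'r) =
   fn [] => onMissing ()
    | (k, v) :: rest =>
      if k = key then
         onFound v
      else
         lookup key onFound onMissing rest

val env = [("x", 1), ("y", 2)]

val found =
   lookup "y" (fn v => "found " ^ Int.toString v) (fn () => "missing") env
val missing =
   lookup "z" (fn v => "found " ^ Int.toString v) (fn () => "missing") env
----

The caller decides what "found" and "missing" mean, so the same traversal can return a string, an option, or raise an exception without being rewritten.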
20517
20518<<<
20519
20520:mlton-guide-page: RSSA
20521[[RSSA]]
20522RSSA
20523====
20524
20525<:RSSA:> is an <:IntermediateLanguage:>, translated from <:SSA2:> by
20526<:ToRSSA:>, optimized by <:RSSASimplify:>, and translated by
20527<:ToMachine:> to <:Machine:>.
20528
20529== Description ==
20530
<:RSSA:> is an <:IntermediateLanguage:> that makes representation
decisions explicit.
20533
20534== Implementation ==
20535
20536* <!ViewGitFile(mlton,master,mlton/backend/rssa.sig)>
20537* <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)>
20538
20539== Type Checking ==
20540
20541The new type language is aimed at expressing bit-level control over
20542layout and associated packing of data representations. There are
20543singleton types that denote constants, other atomic types for things
20544like integers and reals, and arbitrary sum types and sequence (tuple)
20545types. The big change to the type system is that type checking is now
20546based on subtyping, not type equality. So, for example, the singleton
20547type `0xFFFFEEBB` whose only inhabitant is the eponymous constant is a
20548subtype of the type `Word32`.
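
The subtyping relation in the example can be modelled with a toy sketch. The datatype and function below are hypothetical illustrations covering only the two types mentioned above; they are not the representation used in `rssa.fun`.

[source,sml]
----
(* Toy model of two RSSA-like types. *)
datatype ty =
   Constant of Word32.word   (* singleton type with one inhabitant *)
 | AnyWord32                 (* every 32-bit word *)

(* isSubtype (a, b) holds when every value of type a has type b. *)
fun isSubtype (Constant a, Constant b) = a = b
  | isSubtype (_, AnyWord32) = true
  | isSubtype (AnyWord32, Constant _) = false

val constantIsWord = isSubtype (Constant 0wxFFFFEEBB, AnyWord32)   (* true *)
val wordIsConstant = isSubtype (AnyWord32, Constant 0wxFFFFEEBB)   (* false *)
----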
20549
20550== Details and Notes ==
20551
20552SSA is an abbreviation for Static Single Assignment. The <:RSSA:>
20553<:IntermediateLanguage:> is a variant of SSA.
20554
20555<<<
20556
20557:mlton-guide-page: RSSAShrink
20558[[RSSAShrink]]
20559RSSAShrink
20560==========
20561
20562<:RSSAShrink:> is an optimization pass for the <:RSSA:>
20563<:IntermediateLanguage:>.
20564
20565== Description ==
20566
20567This pass implements a whole family of compile-time reductions, like:
20568
20569* constant folding, copy propagation
20570* inline the `Goto` to a block with a unique predecessor
20571
20572== Implementation ==
20573
20574* <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)>
20575
20576== Details and Notes ==
20577
20578{empty}
20579
20580<<<
20581
20582:mlton-guide-page: RSSASimplify
20583[[RSSASimplify]]
20584RSSASimplify
20585============
20586
20587The optimization passes for the <:RSSA:> <:IntermediateLanguage:> are
20588collected and controlled by the `Backend` functor
20589(<!ViewGitFile(mlton,master,mlton/backend/backend.sig)>,
20590<!ViewGitFile(mlton,master,mlton/backend/backend.fun)>).
20591
20592The following optimization pass is implemented:
20593
20594* <:RSSAShrink:>
20595
20596The following implementation passes are implemented:
20597
20598* <:ImplementHandlers:>
20599* <:ImplementProfiling:>
20600* <:InsertLimitChecks:>
20601* <:InsertSignalChecks:>
20602
20603The optimization passes can be controlled from the command-line by the options
20604
20605* `-diag-pass <pass>` -- keep diagnostic info for pass
20606* `-drop-pass <pass>` -- omit optimization pass
20607* `-keep-pass <pass>` -- keep the results of pass
20608
20609<<<
20610
20611:mlton-guide-page: RunningOnAIX
20612[[RunningOnAIX]]
20613RunningOnAIX
20614============
20615
20616MLton runs fine on AIX.
20617
20618== Also see ==
20619
20620* <:RunningOnPowerPC:>
20621* <:RunningOnPowerPC64:>
20622
20623<<<
20624
20625:mlton-guide-page: RunningOnAlpha
20626[[RunningOnAlpha]]
20627RunningOnAlpha
20628==============
20629
20630MLton runs fine on the Alpha architecture.
20631
20632== Notes ==
20633
20634* When compiling for Alpha, MLton doesn't support native code
20635generation (`-codegen native`). Hence, performance is not as good as
20636it might be and compile times are longer. Also, the quality of code
20637generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20638You can change this by calling MLton with `-cc-opt -O2`.
20639
20640* When compiling for Alpha, MLton uses `-align 8` by default.
20641
20642<<<
20643
20644:mlton-guide-page: RunningOnAMD64
20645[[RunningOnAMD64]]
20646RunningOnAMD64
20647==============
20648
20649MLton runs fine on the AMD64 (aka "x86-64" or "x64") architecture.
20650
20651== Notes ==
20652
20653* When compiling for AMD64, MLton targets the 64-bit ABI.
20654
20655* On AMD64, MLton supports native code generation (`-codegen native` or `-codegen amd64`).
20656
20657* When compiling for AMD64, MLton uses `-align 8` by default. Using
20658`-align 4` may be incompatible with optimized builds of the <:GnuMP:>
20659library, which assume 8-byte alignment. (See the thread at
20660http://www.mlton.org/pipermail/mlton/2009-October/030674.html for more
20661details.)
20662
20663<<<
20664
20665:mlton-guide-page: RunningOnARM
20666[[RunningOnARM]]
20667RunningOnARM
20668============
20669
20670MLton runs fine on the ARM architecture.
20671
20672== Notes ==
20673
20674* When compiling for ARM, MLton doesn't support native code generation
20675(`-codegen native`). Hence, performance is not as good as it might be
20676and compile times are longer. Also, the quality of code generated by
20677`gcc` is important. By default, MLton calls `gcc -O1`. You can
20678change this by calling MLton with `-cc-opt -O2`.
20679
20680<<<
20681
20682:mlton-guide-page: RunningOnCygwin
20683[[RunningOnCygwin]]
20684RunningOnCygwin
20685===============
20686
20687MLton runs on the http://www.cygwin.com/[Cygwin] emulation layer,
20688which provides a Posix-like environment while running on Windows. To
20689run MLton with Cygwin, you must first install Cygwin on your Windows
20690machine. To do this, visit the Cygwin site from your Windows machine
20691and run their `setup.exe` script. Then, you can unpack the MLton
20692binary `tgz` in your Cygwin environment.
20693
20694To run MLton cross-compiled executables on Windows, you must install
20695the Cygwin `dll` on the Windows machine.
20696
20697== Known issues ==
20698
20699* Time profiling is disabled.
20700
20701* Cygwin's `mmap` emulation is less than perfect. Sometimes it
20702interacts badly with `Posix.Process.fork`.
20703
20704* The <!RawGitFile(mlton,master,regression/socket.sml)> regression
20705test fails. We suspect this is not a bug and is simply due to our
20706test relying on a certain behavior when connecting to a socket that
20707has not yet accepted, which is handled differently on Cygwin than
20708other platforms. Any help in understanding and resolving this issue
20709is appreciated.
20710
20711== Also see ==
20712
20713* <:RunningOnMinGW:RunningOnMinGW>
20714
20715<<<
20716
20717:mlton-guide-page: RunningOnDarwin
20718[[RunningOnDarwin]]
20719RunningOnDarwin
20720===============
20721
20722MLton runs fine on Darwin (and on Mac OS X).
20723
20724== Notes ==
20725
* MLton requires the <:GnuMP:> library, which is available via
http://www.finkproject.org[Fink], http://www.macports.com[MacPorts],
or http://mxcl.github.io/homebrew/[Homebrew].
20729
20730* For Intel-based Macs, MLton targets the <:RunningOnAMD64:AMD64
20731architecture> on Darwin 10 (Mac OS X Snow Leopard) and higher and
20732targets the <:RunningOnX86:x86 architecture> on Darwin 8 (Mac OS X
20733Tiger) and Darwin 9 (Mac OS X Leopard).
20734
20735== Known issues ==
20736
* Executables that save and load worlds on Darwin 11 (Mac OS X Lion)
and higher should be compiled with `-link-opt -fno-PIE`; see
<:MLtonWorld:> for more details.
20740
20741* <:ProfilingTime:> may give inaccurate results on multi-processor
20742machines. The `SIGPROF` signal, used to sample the profiled program,
20743is supposed to be delivered 100 times a second (i.e., at 10000us
20744intervals), but there can be delays of over 1 minute between the
20745delivery of consecutive `SIGPROF` signals. A more complete
20746description may be found
20747http://lists.apple.com/archives/Unix-porting/2007/Aug/msg00000.html[here]
20748and
20749http://lists.apple.com/archives/Darwin-dev/2007/Aug/msg00045.html[here].
20750
20751== Also see ==
20752
20753* <:RunningOnAMD64:>
20754* <:RunningOnPowerPC:>
20755* <:RunningOnX86:>
20756
20757<<<
20758
20759:mlton-guide-page: RunningOnFreeBSD
20760[[RunningOnFreeBSD]]
20761RunningOnFreeBSD
20762================
20763
20764MLton runs fine on http://www.freebsd.org/[FreeBSD].
20765
20766== Notes ==
20767
20768* MLton is available as a http://www.freebsd.org/[FreeBSD]
20769http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[port].
20770
20771== Known issues ==
20772
* Executables often run more slowly than on a comparable Linux
machine. We conjecture that part of the slowdown comes from the cost
of heap resizing and kernel zeroing of pages. Any help in solving the
problem would be appreciated.
20777
20778* FreeBSD defaults to a datasize limit of 512M, even if you have more
20779than that amount of memory in the computer. Hence, your MLton process
20780will be limited in the amount of memory it has. To fix this problem,
20781turn up the datasize and the default datasize available to a process:
20782Edit `/boot/loader.conf` to set the limits. For example, the setting
20783+
----
kern.maxdsiz="671088640"
kern.dfldsiz="671088640"
kern.maxssiz="134217728"
----
20789+
20790will give a process 640M of datasize memory, default to 640M available
20791and set 128M of stack size memory.
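
The byte counts in the settings above are the megabyte figures multiplied by 1024 * 1024. A quick check, with illustrative variable names:

[source,sml]
----
val mebibyte = 1024 * 1024

val dataSize = 640 * mebibyte    (* for kern.maxdsiz and kern.dfldsiz *)
val stackSize = 128 * mebibyte   (* for kern.maxssiz *)

val () = print (Int.toString dataSize ^ "\n")    (* prints 671088640 *)
val () = print (Int.toString stackSize ^ "\n")   (* prints 134217728 *)
----

The same calculation gives the values to use for other limits, for example `1024 * mebibyte` for 1G.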
20792
20793<<<
20794
20795:mlton-guide-page: RunningOnHPPA
20796[[RunningOnHPPA]]
20797RunningOnHPPA
20798=============
20799
20800MLton runs fine on the HPPA architecture.
20801
20802== Notes ==
20803
20804* When compiling for HPPA, MLton targets the 32-bit HPPA architecture.
20805
20806* When compiling for HPPA, MLton doesn't support native code
20807generation (`-codegen native`). Hence, performance is not as good as
20808it might be and compile times are longer. Also, the quality of code
20809generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20810You can change this by calling MLton with `-cc-opt -O2`.
20811
20812* When compiling for HPPA, MLton uses `-align 8` by default. While
20813this speeds up reals, it also may increase object sizes. If your
20814program does not make significant use of reals, you might see a
20815speedup with `-align 4`.
20816
20817<<<
20818
20819:mlton-guide-page: RunningOnHPUX
20820[[RunningOnHPUX]]
20821RunningOnHPUX
20822=============
20823
20824MLton runs fine on HPUX.
20825
20826== Also see ==
20827
20828* <:RunningOnHPPA:>
20829
20830<<<
20831
20832:mlton-guide-page: RunningOnIA64
20833[[RunningOnIA64]]
20834RunningOnIA64
20835=============
20836
20837MLton runs fine on the IA64 architecture.
20838
20839== Notes ==
20840
20841* When compiling for IA64, MLton targets the 64-bit ABI.
20842
20843* When compiling for IA64, MLton doesn't support native code
20844generation (`-codegen native`). Hence, performance is not as good as
20845it might be and compile times are longer. Also, the quality of code
20846generated by `gcc` is important. By default, MLton calls `gcc -O1`.
20847You can change this by calling MLton with `-cc-opt -O2`.
20848
20849* When compiling for IA64, MLton uses `-align 8` by default.
20850
20851* On the IA64, the <:GnuMP:> library supports multiple ABIs. See the
20852<:GnuMP:> page for more details.
20853
20854<<<
20855
20856:mlton-guide-page: RunningOnLinux
20857[[RunningOnLinux]]
20858RunningOnLinux
20859==============
20860
20861MLton runs fine on Linux.
20862
20863<<<
20864
20865:mlton-guide-page: RunningOnMinGW
20866[[RunningOnMinGW]]
20867RunningOnMinGW
20868==============
20869
20870MLton runs on http://mingw.org[MinGW], a library for porting Unix
20871applications to Windows. Some library functionality is missing or
20872changed.
20873
20874== Notes ==
20875
20876* To compile MLton on MinGW:
20877** The <:GnuMP:> library is required.
20878** The Bash shell is required. If you are using a prebuilt MSYS, you
20879probably want to symlink `bash` to `sh`.
20880
20881== Known issues ==
20882
20883* Many functions are unimplemented and will `raise SysErr`.
20884** `MLton.Itimer.set`
20885** `MLton.ProcEnv.setgroups`
20886** `MLton.Process.kill`
20887** `MLton.Process.reap`
20888** `MLton.World.load`
20889** `OS.FileSys.readLink`
20890** `OS.IO.poll`
20891** `OS.Process.terminate`
20892** `Posix.FileSys.chown`
20893** `Posix.FileSys.fchown`
20894** `Posix.FileSys.fpathconf`
20895** `Posix.FileSys.link`
20896** `Posix.FileSys.mkfifo`
20897** `Posix.FileSys.pathconf`
20898** `Posix.FileSys.readlink`
20899** `Posix.FileSys.symlink`
20900** `Posix.IO.dupfd`
20901** `Posix.IO.getfd`
20902** `Posix.IO.getfl`
20903** `Posix.IO.getlk`
20904** `Posix.IO.setfd`
20905** `Posix.IO.setfl`
20906** `Posix.IO.setlkw`
20907** `Posix.IO.setlk`
20908** `Posix.ProcEnv.ctermid`
20909** `Posix.ProcEnv.getegid`
20910** `Posix.ProcEnv.geteuid`
20911** `Posix.ProcEnv.getgid`
20912** `Posix.ProcEnv.getgroups`
20913** `Posix.ProcEnv.getlogin`
20914** `Posix.ProcEnv.getpgrp`
20915** `Posix.ProcEnv.getpid`
20916** `Posix.ProcEnv.getppid`
20917** `Posix.ProcEnv.getuid`
20918** `Posix.ProcEnv.setgid`
20919** `Posix.ProcEnv.setpgid`
20920** `Posix.ProcEnv.setsid`
20921** `Posix.ProcEnv.setuid`
20922** `Posix.ProcEnv.sysconf`
20923** `Posix.ProcEnv.times`
20924** `Posix.ProcEnv.ttyname`
20925** `Posix.Process.exece`
20926** `Posix.Process.execp`
20927** `Posix.Process.exit`
20928** `Posix.Process.fork`
20929** `Posix.Process.kill`
20930** `Posix.Process.pause`
20931** `Posix.Process.waitpid_nh`
20932** `Posix.Process.waitpid`
20933** `Posix.SysDB.getgrgid`
20934** `Posix.SysDB.getgrnam`
20935** `Posix.SysDB.getpwuid`
20936** `Posix.TTY.TC.drain`
20937** `Posix.TTY.TC.flow`
20938** `Posix.TTY.TC.flush`
20939** `Posix.TTY.TC.getattr`
20940** `Posix.TTY.TC.getpgrp`
20941** `Posix.TTY.TC.sendbreak`
20942** `Posix.TTY.TC.setattr`
20943** `Posix.TTY.TC.setpgrp`
20944** `Unix.kill`
20945** `Unix.reap`
20946** `UnixSock.fromAddr`
20947** `UnixSock.toAddr`
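
Portable code can guard calls to the functions listed above by handling `OS.SysErr`. A minimal sketch, where `resolveLink` is a hypothetical helper name:

[source,sml]
----
(* Follow a symbolic link if possible; otherwise fall back to the
 * path itself.  On MinGW, OS.FileSys.readLink is unimplemented and
 * raises SysErr; elsewhere it also raises SysErr when the path is
 * not a symbolic link, so the fallback covers both cases. *)
fun resolveLink (path : string) : string =
   OS.FileSys.readLink path
   handle OS.SysErr _ => path
----

The same pattern applies to the other entries, although for many of them (such as `Posix.Process.fork`) there is no sensible fallback and the program has to be restructured instead.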
20948
20949<<<
20950
20951:mlton-guide-page: RunningOnNetBSD
20952[[RunningOnNetBSD]]
20953RunningOnNetBSD
20954===============
20955
20956MLton runs fine on http://www.netbsd.org/[NetBSD].
20957
20958== Installing the correct packages for NetBSD ==
20959
NetBSD installs third-party packages through a mechanism known as
pkgsrc: a tree of Makefiles that, when invoked, download the source
code, build a package, and install it on the system. To run MLton on
NetBSD, you must install the following packages:
20965
20966* `shells/bash`
20967
20968* `devel/gmp`
20969
20970* `devel/gmake`
20971
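The three required packages can be built from a pkgsrc checkout with a short loop. This is a sketch only: it assumes the pkgsrc tree lives in the conventional `/usr/pkgsrc` (override with `PKGSRCDIR`), and `install_pkgs` and `DRY_RUN` are hypothetical names introduced here. By default it only prints the commands; set `DRY_RUN=0` and run as root to actually build and install.

[source,sh]
----
#!/bin/sh
# Assumption: pkgsrc is checked out in /usr/pkgsrc unless PKGSRCDIR says otherwise.
PKGSRCDIR="${PKGSRCDIR:-/usr/pkgsrc}"

# install_pkgs CATEGORY/NAME...
# With DRY_RUN=1 (the default) print the commands; with DRY_RUN=0 run them.
install_pkgs() {
  for pkg in "$@"; do
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "cd $PKGSRCDIR/$pkg && make install clean"
    else
      (cd "$PKGSRCDIR/$pkg" && make install clean) || return 1
    fi
  done
}

# The packages MLton needs at run time, as listed above.
install_pkgs shells/bash devel/gmp devel/gmake
----
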
20972In order to get graphical call-graphs of profiling information, you
20973will need the additional package
20974
20975* `graphics/graphviz`
20976
To build the documentation for MLton, you will need the additional
package
20979
20980* `textproc/asciidoc`.
20981
20982== Tips for compiling and using MLton on NetBSD ==
20983
MLton can be a memory hog on computers with little memory. While
640MB of RAM ought to be enough to self-compile MLton, you may need
to tune the NetBSD VM subsystem in order to succeed. The notes
presented here are what <:JesperLouisAndersen:> uses for compiling
MLton on his laptop.
20989
20990=== The NetBSD VM subsystem ===
20991
20992NetBSD uses a VM subsystem named
20993http://www.ccrc.wustl.edu/pub/chuck/tech/uvm/[UVM].
20994http://www.selonen.org/arto/netbsd/vm_tune.html[Tuning the VM system]
20995can be done via the `sysctl(8)`-interface with the "VM" MIB set.
20996
20997=== Tuning the NetBSD VM subsystem for MLton ===
20998
MLton uses a lot of anonymous pages when it is running. Thus, we will
need to raise the default of 80 for anonymous pages. Setting
21001
21002----
21003sysctl -w vm.anonmax=95
21004sysctl -w vm.anonmin=50
21005sysctl -w vm.filemin=2
21006sysctl -w vm.execmin=2
21007sysctl -w vm.filemax=4
21008sysctl -w vm.execmax=4
21009----
21010
21011makes it less likely for the VM system to swap out anonymous pages.
21012For a full explanation of the above flags, see the documentation.
21013
The result is that the laptop mentioned above goes from a MLton compile
that swaps heavily to a MLton compile with no swapping.
21016
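The `sysctl -w` settings above last only until the next reboot. A minimal sketch for keeping them, assuming (as on a stock NetBSD install) that `/etc/sysctl.conf` is applied at boot and holds one `name=value` pair per line; `mlton_vm_settings` is a hypothetical helper name. As root, `mlton_vm_settings >> /etc/sysctl.conf` would append the fragment.

[source,sh]
----
#!/bin/sh
# Print the six VM settings from this page in sysctl.conf syntax.
# Assumption: /etc/sysctl.conf accepts plain name=value lines.
mlton_vm_settings() {
  cat <<'EOF'
vm.anonmax=95
vm.anonmin=50
vm.filemin=2
vm.execmin=2
vm.filemax=4
vm.execmax=4
EOF
}

# Show the fragment; redirect it wherever it is needed.
mlton_vm_settings
----
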
21017<<<
21018
21019:mlton-guide-page: RunningOnOpenBSD
21020[[RunningOnOpenBSD]]
21021RunningOnOpenBSD
21022================
21023
21024MLton runs fine on http://www.openbsd.org/[OpenBSD].
21025
21026== Known issues ==
21027
* The <!RawGitFile(mlton,master,regression/socket.sml)> regression
test fails. We suspect this is not a bug and is simply due to our
test relying on a certain behavior when connecting to a socket that
has not yet accepted, which is handled differently on OpenBSD than
on other platforms. Any help in understanding and resolving this issue
is appreciated.
21034
21035<<<
21036
21037:mlton-guide-page: RunningOnPowerPC
21038[[RunningOnPowerPC]]
21039RunningOnPowerPC
21040================
21041
21042MLton runs fine on the PowerPC architecture.
21043
21044== Notes ==
21045
21046* When compiling for PowerPC, MLton targets the 32-bit PowerPC
21047architecture.
21048
21049* When compiling for PowerPC, MLton doesn't support native code
21050generation (`-codegen native`). Hence, performance is not as good as
21051it might be and compile times are longer. Also, the quality of code
21052generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21053You can change this by calling MLton with `-cc-opt -O2`.
21054
21055* On the PowerPC, the <:GnuMP:> library supports multiple ABIs. See
21056the <:GnuMP:> page for more details.
21057
21058<<<
21059
21060:mlton-guide-page: RunningOnPowerPC64
21061[[RunningOnPowerPC64]]
21062RunningOnPowerPC64
21063==================
21064
21065MLton runs fine on the PowerPC64 architecture.
21066
21067== Notes ==
21068
21069* When compiling for PowerPC64, MLton targets the 64-bit PowerPC
21070architecture.
21071
21072* When compiling for PowerPC64, MLton doesn't support native code
21073generation (`-codegen native`). Hence, performance is not as good as
21074it might be and compile times are longer. Also, the quality of code
21075generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21076You can change this by calling MLton with `-cc-opt -O2`.
21077
21078* On the PowerPC64, the <:GnuMP:> library supports multiple ABIs. See
21079the <:GnuMP:> page for more details.
21080
21081<<<
21082
21083:mlton-guide-page: RunningOnS390
21084[[RunningOnS390]]
21085RunningOnS390
21086=============
21087
21088MLton runs fine on the S390 architecture.
21089
21090== Notes ==
21091
21092* When compiling for S390, MLton doesn't support native code
21093generation (`-codegen native`). Hence, performance is not as good as
21094it might be and compile times are longer. Also, the quality of code
21095generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21096You can change this by calling MLton with `-cc-opt -O2`.
21097
21098<<<
21099
21100:mlton-guide-page: RunningOnSolaris
21101[[RunningOnSolaris]]
21102RunningOnSolaris
21103================
21104
21105MLton runs fine on Solaris.
21106
21107== Notes ==
21108
21109* You must install the `binutils`, `gcc`, and `make` packages. You
21110can find out how to get these at
21111http://www.sunfreeware.com[sunfreeware.com].
21112
21113* Making the documentation requires that you install `latex` and
21114`dvips`, which are available in the `tetex` package.
21115
21116== Known issues ==
21117
21118* Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow
21119as to be impractical (many hours on a 500MHz UltraSparc). For this
21120reason, we strongly recommend building with a
21121<:CrossCompiling:cross compiler>.
21122
21123== Also see ==
21124
21125* <:RunningOnAMD64:>
21126* <:RunningOnSparc:>
21127* <:RunningOnX86:>
21128
21129<<<
21130
21131:mlton-guide-page: RunningOnSparc
21132[[RunningOnSparc]]
21133RunningOnSparc
21134==============
21135
21136MLton runs fine on the Sparc architecture.
21137
21138== Notes ==
21139
21140* When compiling for Sparc, MLton targets the 32-bit Sparc
21141architecture (i.e., Sparc V8).
21142
21143* When compiling for Sparc, MLton doesn't support native code
21144generation (`-codegen native`). Hence, performance is not as good as
21145it might be and compile times are longer. Also, the quality of code
21146generated by `gcc` is important. By default, MLton calls `gcc -O1`.
21147You can change this by calling MLton with `-cc-opt -O2`. We have seen
21148this speed up some programs by as much as 30%, especially those
21149involving floating point; however, it can also more than double
21150compile times.
21151
21152* When compiling for Sparc, MLton uses `-align 8` by default. While
21153this speeds up reals, it also may increase object sizes. If your
21154program does not make significant use of reals, you might see a
21155speedup with `-align 4`.
21156
21157== Known issues ==
21158
21159* Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow
21160as to be impractical (many hours on a 500MHz UltraSparc). For this
21161reason, we strongly recommend building with a
21162<:CrossCompiling:cross compiler>.
21163
21164== Also see ==
21165
21166* <:RunningOnSolaris:>
21167
21168<<<
21169
21170:mlton-guide-page: RunningOnX86
21171[[RunningOnX86]]
21172RunningOnX86
21173============
21174
21175MLton runs fine on the x86 architecture.
21176
21177== Notes ==
21178
21179* On x86, MLton supports native code generation (`-codegen native` or
21180`-codegen x86`).
21181
21182<<<
21183
21184:mlton-guide-page: RunTimeOptions
21185[[RunTimeOptions]]
21186RunTimeOptions
21187==============
21188
21189Executables produced by MLton take command line arguments that control
21190the runtime system. These arguments are optional, and occur before
21191the executable's usual arguments. To use these options, the first
21192argument to the executable must be `@MLton`. The optional arguments
21193then follow, must be terminated by `--`, and are followed by any
21194arguments to the program. The optional arguments are _not_ made
21195available to the SML program via `CommandLine.arguments`. For
21196example, a valid call to `hello-world` is:
21197
21198----
21199hello-world @MLton gc-summary fixed-heap 10k -- a b c
21200----
21201
21202In the above example,
21203`CommandLine.arguments () = ["a", "b", "c"]`.
21204
A sequence of `@MLton` argument groups is also allowed, as in:
21206
21207----
21208hello-world @MLton gc-summary -- @MLton fixed-heap 10k -- a b c
21209----
21210
Run-time options can also be passed to the MLton compiler itself, as in
21212
21213----
21214mlton @MLton fixed-heap 0.5g -- foo.sml
21215----
21216
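The splitting rule described above can be modeled in a few lines of shell. This is only a sketch of the documented behavior, not the runtime's implementation: `program_args` is a hypothetical helper, and it models neither the `stop` option nor whatever the runtime does when the terminating `--` is missing.

[source,sh]
----
#!/bin/sh
# program_args ARG...
# Print the arguments the SML program would see in CommandLine.arguments:
# while the next argument is @MLton, drop everything up to and including
# the following `--`.
program_args() {
  while [ "$#" -gt 0 ] && [ "$1" = "@MLton" ]; do
    shift
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
      shift
    done
    # Drop the terminating `--` if one was found.
    if [ "$#" -gt 0 ]; then
      shift
    fi
  done
  echo "$*"
}

# Both command lines from the examples above leave the same program arguments.
program_args @MLton gc-summary fixed-heap 10k -- a b c
program_args @MLton gc-summary -- @MLton fixed-heap 10k -- a b c
----
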
21217
21218== Options ==
21219
21220* ++fixed-heap __x__{k|K|m|M|g|G}++
21221+
21222Use a fixed size heap of size _x_, where _x_ is a real number and the
21223trailing letter indicates its units.
21224+
21225[cols="^25%,<75%"]
21226|====
21227| `k` or `K` | 1024
21228| `m` or `M` | 1,048,576
21229| `g` or `G` | 1,073,741,824
21230|====
21231+
21232A value of `0` means to use almost all the RAM present on the machine.
21233+
21234The heap size used by `fixed-heap` includes all memory allocated by
21235SML code, including memory for the stack (or stacks, if there are
21236multiple threads). It does not, however, include any memory used for
21237code itself or memory used by C globals, the C stack, or malloc.
21238
21239* ++gc-messages++
21240+
21241Print a message at the start and end of every garbage collection.
21242
21243* ++gc-summary++
21244+
21245Print a summary of garbage collection statistics upon program
21246termination to standard error.
21247
21248* ++gc-summary-file __file__++
21249+
21250Print a summary of garbage collection statistics upon program
21251termination to the file specified by _file_.
21252
21253* ++load-world __world__++
21254+
21255Restart the computation with the file specified by _world_, which must
21256have been created by a call to `MLton.World.save` by the same
21257executable. See <:MLtonWorld:>.
21258
21259* ++max-heap __x__{k|K|m|M|g|G}++
21260+
21261Run the computation with an automatically resized heap that is never
21262larger than _x_, where _x_ is a real number and the trailing letter
21263indicates the units as with `fixed-heap`. The heap size for
21264`max-heap` is accounted for as with `fixed-heap`.
21265
21266* ++may-page-heap {false|true}++
21267+
21268Enable paging the heap to disk when unable to grow the heap to a
21269desired size.
21270
21271* ++no-load-world++
21272+
21273Disable `load-world`. This can be used as an argument to the compiler
21274via `-runtime no-load-world` to create executables that will not load
21275a world. This may be useful to ensure that set-uid executables do not
21276load some strange world.
21277
21278* ++ram-slop __x__++
21279+
21280Multiply _x_ by the amount of RAM on the machine to obtain what the
21281runtime views as the amount of RAM it can use. Typically _x_ is less
21282than 1, and is used to account for space used by other programs
21283running on the same machine.
21284
21285* ++stop++
21286+
21287Causes the runtime to stop processing `@MLton` arguments once the next
21288`--` is reached. This can be used as an argument to the compiler via
21289`-runtime stop` to create executables that don't process any `@MLton`
21290arguments.
21291
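The unit suffixes accepted by `fixed-heap` and `max-heap` translate to byte counts as in the table above. The following sketch works through that arithmetic; `heap_bytes` is a hypothetical helper, and truncating a fractional result to a whole number of bytes is an assumption made here, not documented runtime behavior. A bare `0` (no unit) is not handled by this sketch.

[source,sh]
----
#!/bin/sh
# heap_bytes SIZE
# Convert a size of the form x{k|K|m|M|g|G}, where x is a real number,
# into a byte count using the multipliers from the table above.
heap_bytes() {
  awk -v s="$1" 'BEGIN {
    unit = substr(s, length(s), 1)
    num  = substr(s, 1, length(s) - 1)
    if (unit == "k" || unit == "K") mult = 1024
    else if (unit == "m" || unit == "M") mult = 1048576
    else if (unit == "g" || unit == "G") mult = 1073741824
    else { print "unrecognized unit in: " s > "/dev/stderr"; exit 1 }
    # Assumption: fractional byte counts are truncated.
    printf "%d\n", num * mult
  }'
}

heap_bytes 10k     # the size used in the hello-world example
heap_bytes 0.5g    # the size used in the mlton example
----
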
21292<<<
21293
21294:mlton-guide-page: ScopeInference
21295[[ScopeInference]]
21296ScopeInference
21297==============
21298
21299Scope inference is an analysis/rewrite pass for the <:AST:>
21300<:IntermediateLanguage:>, invoked from <:Elaborate:>.
21301
21302== Description ==
21303
21304This pass adds free type variables to the `val` or `fun`
21305declaration where they are implicitly scoped.
21306
21307== Implementation ==
21308
21309<!ViewGitFile(mlton,master,mlton/elaborate/scope.sig)>
21310<!ViewGitFile(mlton,master,mlton/elaborate/scope.fun)>
21311
21312== Details and Notes ==
21313
Scope inference determines, for each type variable, the declaration
where it is bound. Scope inference is a direct implementation of the
specification given in section 4.6 of the
<:DefinitionOfStandardML: Definition>. Recall that a free occurrence
of a type variable `'a` in a declaration `d` is _unguarded_
in `d` if `'a` is not part of a smaller declaration. A type
variable `'a` is implicitly scoped at `d` if `'a` is
unguarded in `d` and `'a` does not occur unguarded in any
declaration containing `d`.
21323
The first pass of scope inference walks down the tree and renames all
explicitly bound type variables in order to avoid name collisions. It
then walks up the tree and adds to each declaration the set of
unguarded type variables occurring in that declaration. At this
point, if declaration `d` contains an unguarded type variable
`'a` and the immediately containing declaration does not contain
`'a`, then `'a` is implicitly scoped at `d`. The final
pass walks down the tree, leaving `'a` at the declaration where
it is scoped and removing it from all enclosed declarations.
21333
21334<<<
21335
21336:mlton-guide-page: SelfCompiling
21337[[SelfCompiling]]
21338SelfCompiling
21339=============
21340
21341If you want to compile MLton, you must first get the <:Sources:>. You
21342can compile with either MLton or SML/NJ, but we strongly recommend
21343using MLton, since it generates a much faster and more robust
21344executable.
21345
21346== Compiling with MLton ==
21347
21348To compile with MLton, you need the binary versions of `mlton`,
21349`mllex`, and `mlyacc` that come with the MLton binary package. To be
21350safe, you should use the same version of MLton that you are building.
21351However, older versions may work, as long as they don't go back too
21352far. To build MLton, run `make` from within the root directory of the
21353sources. This will build MLton first with the already installed
21354binary version of MLton and will then rebuild MLton with itself.
21355
First, the `Makefile` calls `mllex` and `mlyacc` to build the lexer
and parser, and then calls `mlton` to compile itself. When making
MLton using another version, the `Makefile` automatically uses
`mlton-stubs.mlb`, which will put in enough stubs to emulate the
`structure MLton`. Once MLton is built, the `Makefile` will rebuild
MLton with itself, this time using `mlton.mlb` and the real
`structure MLton` from the <:BasisLibrary:Basis Library>. This second round
of compilation is essential in order to achieve a fast and robust
MLton.
21365
21366Compiling MLton requires at least 1GB of RAM for 32-bit platforms (2GB is
21367preferable) and at least 2GB RAM for 64-bit platforms (4GB is preferable).
21368If your machine has less RAM, self-compilation will
21369likely fail, or at least take a very long time due to paging. Even if
21370you have enough memory, there simply may not be enough available, due
21371to memory consumed by other processes. In this case, you may see an
21372`Out of memory` message, or self-compilation may become extremely
21373slow. The only fix is to make sure that enough memory is available.
21374
21375=== Possible Errors ===
21376
* The C compiler may not be able to find the <:GnuMP:> header file,
`gmp.h`, leading to an error like the following.
21379+
21380----
21381 cenv.h:49:18: fatal error: gmp.h: No such file or directory
21382----
21383+
The solution is to install (or build) GnuMP on your machine. If you
install it at a location not on the default search path, then run
++make WITH_GMP_INC_DIR=__/path/to/gmp/include__ WITH_GMP_LIB_DIR=__/path/to/gmp/lib__++.
21387
* The following errors indicate that a binary version of MLton could
not be found in your path.
21390+
21391----
21392/bin/sh: mlton: command not found
21393----
21394+
21395----
21396make[2]: mlton: Command not found
21397----
21398+
21399You need to have `mlton` in your path to build MLton from source.
21400+
21401During the build process, there are various times that the `Makefile`-s
21402look for a `mlton` in your path and in `src/build/bin`. It is OK if
21403the latter doesn't exist when the build starts; it is the target being
21404built. Failure to find a `mlton` in your path will abort the build.
21405
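Since a missing `mlton` aborts the build partway through, it can help to check for the bootstrap tools before running `make`. The following is a small sketch using only the POSIX `command -v` builtin; `missing_tools` is a hypothetical helper name, and the tool list is the one given under "Compiling with MLton".

[source,sh]
----
#!/bin/sh
# missing_tools TOOL...
# Print each tool that is not found on PATH; succeed only if none is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Check the binaries needed to bootstrap with MLton.
if missing=$(missing_tools mlton mllex mlyacc); then
  echo "bootstrap tools found; run make"
else
  echo "not on PATH:" $missing
fi
----
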
21406
21407== Compiling with SML/NJ ==
21408
To compile with SML/NJ, run `make bootstrap-smlnj` from within the
root directory of the sources. You must use a recent version of
SML/NJ. First, the `Makefile` calls `ml-lex` and `ml-yacc` to build
the lexer and parser. Then, it calls SML/NJ with the appropriate
`sources.cm` file. Once MLton is built with SML/NJ, the `Makefile`
will rebuild MLton with this SML/NJ-built MLton and then will rebuild
MLton with the MLton-built MLton. Building with SML/NJ takes
significant time (particularly during the "`parseAndElaborate`" phase
when the SML/NJ-built MLton is compiling MLton). Unless you are doing
compiler development and need rapid recompilation, we recommend
compiling with MLton.
21420
21421<<<
21422
21423:mlton-guide-page: Serialization
21424[[Serialization]]
21425Serialization
21426=============
21427
21428<:StandardML:Standard ML> does not have built-in support for
21429serialization. Here are papers that describe user-level approaches:
21430
21431* <!Cite(Elsman04)>
21432* <!Cite(Kennedy04)>
21433
21434The MLton repository also contains an experimental generic programming
21435library (see
21436<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that
21437includes a pickling (serialization) generic (see
21438<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pickle.sig)>).
21439
21440<<<
21441
21442:mlton-guide-page: ShareZeroVec
21443[[ShareZeroVec]]
21444ShareZeroVec
21445============
21446
21447<:ShareZeroVec:> is an optimization pass for the <:SSA:>
21448<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
21449
21450== Description ==
21451
21452An SSA optimization to share zero-length vectors.
21453
21454From <!ViewGitCommit(mlton,be8c5f576)>, which replaced the use of the
21455`Array_array0Const` primitive in the Basis Library implementation with a
21456(nullary) `Vector_vector` primitive:
21457
21458________
21459
21460The original motivation for the `Array_array0Const` primitive was to share the
21461heap space required for zero-length vectors among all vectors (of a given type).
21462It was claimed that this optimization is important, e.g., in a self-compile,
21463where vectors are used for lots of syntax tree elements and many of those
21464vectors are empty. See:
21465http://www.mlton.org/pipermail/mlton-devel/2002-February/021523.html
21466
Curiously, the full effect of this optimization has been missing for quite some
time (perhaps since the port of <:ConstantPropagation:> to the SSA IL). While
<:ConstantPropagation:> has "globalized" the nullary application of the
`Array_array0Const` primitive, it also simultaneously transformed it to an
application of the `Array_uninit` (previously, the `Array_array`) primitive to
the zero constant. The hash-consing of globals, meant to create exactly one
global for each distinct constant, treats `Array_uninit` primitives as unequal
(appropriately, since `Array_uninit` allocates an array with identity (though
the identity may be suppressed by a subsequent `Array_toVector`)), hence each
distinct `Array_array0Const` primitive in the program remained as distinct
globals. The limited amount of inlining prior to <:ConstantPropagation:> meant
that there were typically fewer than a dozen "copies" of the same empty vector
in a program for a given type.
21480
21481As a "functional" primitive, a nullary `Vector_vector` is globalized by
21482ClosureConvert, but is further recognized by ConstantPropagation and hash-consed
21483into a unique instance for each type.
21484________
21485
21486However, a single, shared, global `Vector_vector ()` inhibits the
21487coercion-based optimizations of `Useless`. For example, consider the
21488following program:
21489
21490[source,sml]
21491----
21492 val n = valOf (Int.fromString (hd (CommandLine.arguments ())))
21493
21494 val v1 = Vector.tabulate (n, fn i =>
21495 let val w = Word16.fromInt i
21496 in (w - 0wx1, w, w + 0wx1 + w)
21497 end)
21498 val v2 = Vector.map (fn (w1, w2, w3) => (w1, 0wx2 * w2, 0wx3 * w3)) v1
21499 val v3 = VectorSlice.vector (VectorSlice.slice (v1, 1, SOME (n - 2)))
21500 val ans1 = Vector.foldl (fn ((w1,w2,w3),w) => w + w1 + w2 + w3) 0wx0 v1
21501 val ans2 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v2
21502 val ans3 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v3
21503
21504 val _ = print (concat ["ans1 = ", Word16.toString ans1, " ",
21505 "ans2 = ", Word16.toString ans2, " ",
21506 "ans3 = ", Word16.toString ans3, "\n"])
21507----
21508
21509We would like `v2` and `v3` to be optimized from
21510`(word16 * word16 * word16) vector` to `word16 vector` because only
21511the 2nd component of the elements is needed to compute the answer.
21512
21513With `Array_array0Const`, each distinct occurrence of
21514`Array_array0Const((word16 * word16 * word16))` arising from
21515polyvariance and inlining remained a distinct
21516`Array_uninit((word16 * word16 * word16)) (0x0)` global, which
21517resulted in distinct occurrences for the
21518`val v1 = Vector.tabulate ...` and for the
21519`val v2 = Vector.map ...`. The latter could be optimized to
21520`Array_uninit(word16) (0x0)` by `Useless`, because its result only
21521flows to places requiring the 2nd component of the elements.
21522
21523With `Vector_vector ()`, the distinct occurrences of
21524`Vector_vector((word16 * word16 * word16)) ()` arising from
21525polyvariance are globalized during `ClosureConvert`, those global
21526references may be further duplicated by inlining, but the distinct
21527occurrences of `Vector_vector((word16 * word16 * word16)) ()` are
21528merged to a single occurrence. Because this result flows to places
21529requiring all three components of the elements, it remains
21530`Vector_vector((word16 * word16 * word16)) ()` after
21531`Useless`. Furthermore, because one cannot (in constant time) coerce a
21532`(word16 * word16 * word16) vector` to a `word16 vector`, the `v2`
21533value remains of type `(word16 * word16 * word16) vector`.
21534
21535One option would be to drop the 0-element vector "optimization"
21536entirely. This costs some space (no sharing of empty vectors) and
21537some time (allocation and garbage collection of empty vectors).
21538
Another option would be to reinstate the `Array_array0Const` primitive
and associated `ConstantPropagation` treatment. But the semantics
and purpose of `Array_array0Const` were poorly understood, resulting in
this break.
21543
The <:ShareZeroVec:> pass pursues a different approach: perform the 0-element
vector "optimization" as a separate optimization, after
`ConstantPropagation` and `Useless`. A trivial static analysis is
used to match `val v: t vector = Array_toVector(t) (a)` with the
corresponding `val a: t array = Array_uninit(t) (l)`, and the latter are
expanded to
`val a: t array = if 0 = l then zeroArr_[t] else Array_uninit(t) (l)`
with a single global `val zeroArr_[t] = Array_uninit(t) (0)` created
for each distinct type (after coercion-based optimizations).
21553
21554One disadvantage of this approach, compared to the `Vector_vector(t) ()`
21555approach, is that `Array_toVector` is applied each time a vector
21556is created, even if it is being applied to the `zeroArr_[t]`
21557zero-length array. (Although, this was the behavior of the
21558`Array_array0Const` approach.) This updates the object header each
21559time, whereas the `Vector_vector(t) ()` approach would have updated
21560the object header once, when the global was created, and the
21561`zeroVec_[t]` global and the `Array_toVector` result would flow to the
21562join point.
21563
It would be possible to properly share zero-length vectors, but doing
so requires a more sophisticated analysis and transformation, because there
can be arbitrary code between the
`val a: t array = Array_uninit(t) (l)` and the corresponding
`val v: t vector = Array_toVector(t) (a)`, although, in practice,
nothing happens when a zero-length vector is created. It may be best
to pursue a more general "array to vector" optimization that
transforms creations of static-length vectors (e.g., all the
`Vector.new<N>` functions) into `Vector_vector` primitives (some of
which could be globalized).
21574
21575== Implementation ==
21576
21577* <!ViewGitFile(mlton,master,mlton/ssa/share-zero-vec.fun)>
21578
21579== Details and Notes ==
21580
21581{empty}
21582
21583<<<
21584
21585:mlton-guide-page: ShowBasis
21586[[ShowBasis]]
21587ShowBasis
21588=========
21589
21590MLton has a flag, `-show-basis <file>`, that causes MLton to pretty
21591print to _file_ the basis defined by the input program. For example,
21592if `foo.sml` contains
21593[source,sml]
21594----
21595fun f x = x + 1
21596----
21597then `mlton -show-basis foo.basis foo.sml` will create `foo.basis`
21598with the following contents.
21599----
21600val f: int -> int
21601----
21602
21603If you only want to see the basis and do not wish to compile the
21604program, you can call MLton with `-stop tc`.
21605
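The two flags described above can be combined to dump the basis without producing an executable. The following sketch recreates the `foo.sml` example in a scratch directory; it assumes `mlton` is on the `PATH` and skips the compiler call otherwise. `workdir` is a name introduced here for illustration.

[source,sh]
----
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
printf 'fun f x = x + 1\n' > "$workdir/foo.sml"

if command -v mlton >/dev/null 2>&1; then
  # -stop tc: stop after type checking; -show-basis: write the basis to foo.basis.
  (cd "$workdir" && mlton -show-basis foo.basis -stop tc foo.sml && cat foo.basis)
else
  echo "mlton not found on PATH; skipping" >&2
fi
----
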
21606== Displaying signatures ==
21607
When displaying signatures, MLton prefixes types defined in the
signature with `_sig.` to distinguish them from types defined in the
environment. For example,
21611[source,sml]
21612----
21613signature SIG =
21614 sig
21615 type t
21616 val x: t * int -> unit
21617 end
21618----
21619is displayed as
21620----
21621signature SIG =
21622 sig
21623 type t
21624 val x: _sig.t * int -> unit
21625 end
21626----
21627
21628Notice that `int` occurs without the `_sig.` prefix.
21629
21630MLton also uses a canonical name for each type in the signature, and
21631that name is used everywhere for that type, no matter what the input
21632signature looked like. For example:
21633[source,sml]
21634----
21635signature SIG =
21636 sig
21637 type t
21638 type u = t
21639 val x: t
21640 val y: u
21641 end
21642----
21643is displayed as
21644----
21645signature SIG =
21646 sig
21647 type t
21648 type u = _sig.t
21649 val x: _sig.t
21650 val y: _sig.t
21651 end
21652----
21653
21654Canonical names are always relative to the "top" of the signature,
21655even when used in nested substructures. For example:
21656[source,sml]
21657----
21658signature S =
21659 sig
21660 type t
21661 val w: t
21662 structure U:
21663 sig
21664 type u
21665 val x: t
21666 val y: u
21667 end
21668 val z: U.u
21669 end
21670----
21671is displayed as
21672----
21673signature S =
21674 sig
21675 type t
21676 val w: _sig.t
21677 val z: _sig.U.u
21678 structure U:
21679 sig
21680 type u
21681 val x: _sig.t
21682 val y: _sig.U.u
21683 end
21684 end
21685----
21686
21687== Displaying structures ==
21688
21689When displaying structures, MLton uses signature constraints wherever
21690possible, combined with `where type` clauses to specify the meanings
21691of the types defined within the signature. For example:
21692[source,sml]
21693----
21694signature SIG =
21695 sig
21696 type t
21697 val x: t
21698 end
21699structure S: SIG =
21700 struct
21701 type t = int
21702 val x = 13
21703 end
21704structure S2:> SIG = S
21705----
21706is displayed as
21707----
21708signature SIG =
21709 sig
21710 type t
21711 val x: _sig.t
21712 end
21713structure S: SIG
21714 where type t = int
21715structure S2: SIG
21716 where type t = S2.t
21717----
21718
21719<<<
21720
21721:mlton-guide-page: ShowBasisDirective
21722[[ShowBasisDirective]]
21723ShowBasisDirective
21724==================
21725
21726A comment of the form `(*#showBasis "<file>"*)` is recognized as a directive to
21727save the current basis (i.e., environment) to `<file>` (in the same format as
21728the `-show-basis <file>` <:CompileTimeOptions: compile-time option>). The
21729`<file>` is interpreted relative to the source file in which it appears. The
21730comment is lexed as a distinct token and is parsed as a structure-level
21731declaration. [Note that treating the directive as a top-level declaration would
21732prohibit using it inside a functor body, which would make the feature
21733significantly less useful in the context of the MLton compiler sources (with its
21734nearly fully functorial style).]
21735
21736This feature is meant to facilitate auto-completion via
21737https://github.com/MatthewFluet/company-mlton[`company-mlton`] and similar
21738tools.
21739
21740<<<
21741
21742:mlton-guide-page: ShowProf
21743[[ShowProf]]
21744ShowProf
21745========
21746
21747If an executable is compiled for <:Profiling:profiling>, then it
21748accepts a special command-line runtime system argument, `show-prof`,
21749that outputs information about the source functions that are profiled.
21750Normally, this information is used by `mlprof`. This page documents
21751the `show-prof` output format, and is intended for those working on
21752the profiler internals.
21753
21754The `show-prof` output is ASCII, and consists of a sequence of lines.
21755
21756* The magic number of the executable.
21757* The number of source names in the executable.
21758* A line for each source name giving the name of the function, a tab,
21759the filename of the file containing the function, a colon, a space,
21760and the line number that the function starts on in that file.
21761* The number of (split) source functions.
21762* A line for each (split) source function, where each line consists of
21763a source-name index (into the array of source names) and a successors
21764index (into the array of split-source sequences, defined below).
21765* The number of split-source sequences.
21766* A line for each split-source sequence, where each line is a space
21767separated list of (split) source functions.
21768
21769The latter two arrays, split sources and split-source sequences,
21770define a directed graph, which is the call-graph of the program.
21771
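The line layout listed above is simple enough to walk with `awk`. The following is a sketch under stated assumptions: `show_prof_summary` is a hypothetical helper, the sample input is made up to match the described layout (it is not real profiler output), and the sketch assumes the data arrives on standard input, for example piped from an executable run with `@MLton show-prof --`.

[source,sh]
----
#!/bin/sh
# show_prof_summary
# Read show-prof output on stdin, list the source names, and print the
# three declared counts.
show_prof_summary() {
  awk '
    NR == 1          { magic = $0; next }          # magic number
    NR == 2          { n = $0 + 0; next }          # number of source names
    NR <= 2 + n      { split($0, field, "\t")      # name <tab> file: line
                       printf "name %s at %s\n", field[1], field[2]; next }
    NR == 3 + n      { m = $0 + 0; next }          # number of (split) source functions
    NR <= 3 + n + m  { next }                      # source-name index, successors index
    NR == 4 + n + m  { k = $0 + 0; next }          # number of split-source sequences
    END { printf "magic=%s names=%d functions=%d sequences=%d\n", magic, n, m, k }
  '
}

# Made-up sample: two source names, two split source functions, two sequences.
printf '12345\n2\nmain\tfoo.sml: 1\nf\tfoo.sml: 3\n2\n0 0\n1 1\n2\n1\n1\n' |
  show_prof_summary
----
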
21772<<<
21773
21774:mlton-guide-page: Shrink
21775[[Shrink]]
21776Shrink
21777======
21778
21779<:Shrink:> is a rewrite pass for the <:SSA:> and <:SSA2:>
21780<:IntermediateLanguage:>s, invoked from every optimization pass (see
21781<:SSASimplify:> and <:SSA2Simplify:>).
21782
21783== Description ==
21784
21785This pass implements a whole family of compile-time reductions, like:
21786
21787* `#1(a, b)` => `a`
21788* `case C x of C y => e` => `let y = x in e`
21789* constant folding, copy propagation
21790* eta blocks
21791* tuple reconstruction elimination
21792
21793== Implementation ==
21794
21795* <!ViewGitFile(mlton,master,mlton/ssa/shrink.sig)>
21796* <!ViewGitFile(mlton,master,mlton/ssa/shrink.fun)>
21797* <!ViewGitFile(mlton,master,mlton/ssa/shrink2.sig)>
21798* <!ViewGitFile(mlton,master,mlton/ssa/shrink2.fun)>
21799
21800== Details and Notes ==
21801
21802The <:Shrink:> pass is run after every <:SSA:> and <:SSA2:>
21803optimization pass.
21804
21805The <:Shrink:> implementation also includes functions to eliminate
21806unreachable blocks from a <:SSA:> or <:SSA2:> program or function.
21807The <:Shrink:> pass does not guarantee to eliminate all unreachable
21808blocks. Doing so would unduly complicate the implementation, and it
21809is almost always the case that all unreachable blocks are eliminated.
21810However, a small number of optimization passes require that the input
21811have no unreachable blocks (essentially, when the analysis works on
21812the control flow graph and the rewrite iterates on the vector of
21813blocks). These passes explicitly call `eliminateDeadBlocks`.
21814
21815The <:Shrink:> pass has a special case to turn a non-tail call where
21816the continuation and handler only do `Profile` statements into a tail
21817call where the `Profile` statements precede the tail call.
21818
21819<<<
21820
21821:mlton-guide-page: SimplifyTypes
21822[[SimplifyTypes]]
21823SimplifyTypes
21824=============
21825
21826<:SimplifyTypes:> is an optimization pass for the <:SSA:>
21827<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
21828
21829== Description ==
21830
21831This pass computes a "cardinality" of each datatype, which is an
21832abstraction of the number of values of the datatype.
21833
21834* `Zero` means the datatype has no values (except for bottom).
21835* `One` means the datatype has one value (except for bottom).
21836* `Many` means the datatype has many values.
21837
21838This pass removes all datatypes whose cardinality is `Zero` or `One`
21839and removes:
21840
21841* components of tuples
21842* function args
21843* constructor args
21844
21845which are such datatypes.
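
The cardinality abstraction can be sketched as a small toy model. This is not MLton's implementation: the names `card`, `product`, `sum`, and `datatypeCard` are assumptions made for illustration, and the sketch ignores recursive datatypes, which a real analysis would presumably have to handle by iterating to a fixed point.

[source,sml]
----
datatype card = Zero | One | Many

(* A tuple or constructor argument list has a value only if every
 * component has one. *)
fun product (Zero, _) = Zero
  | product (_, Zero) = Zero
  | product (One, One) = One
  | product _ = Many

(* The values of a datatype are the union of the values built by each
 * of its constructors. *)
fun sum (Zero, c) = c
  | sum (c, Zero) = c
  | sum _ = Many

(* Cardinality of a datatype, given for each constructor the
 * cardinalities of its argument types. *)
fun datatypeCard (constructors: card list list): card =
   List.foldl
   (fn (args, acc) => sum (List.foldl product One args, acc))
   Zero
   constructors
----

Under this model, a datatype with a single nullary constructor (`datatypeCard [[]]`) is `One`, a two-constructor enumeration (`datatypeCard [[], []]`) is `Many`, and a datatype whose only constructor takes a `Zero` argument is itself `Zero`.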
21846
21847This pass marks constructors as one of:
21848
21849* `Useless`: it never appears in a `ConApp`.
21850* `Transparent`: it is the only variant in its datatype and its argument type does not contain any uses of `array` or `vector`.
21851* `Useful`: otherwise
21852
21853This pass also removes `Useless` and `Transparent` constructors.
21854
21855== Implementation ==
21856
21857* <!ViewGitFile(mlton,master,mlton/ssa/simplify-types.fun)>
21858
21859== Details and Notes ==
21860
21861This pass must happen before polymorphic equality is implemented because
21862
21863* it will make polymorphic equality faster because some types are simpler
21864* it removes uses of polymorphic equality that must return true
21865
21866We must keep track of `Transparent` constructors whose argument type
21867uses `array` because of datatypes like the following:
21868[source,sml]
21869----
21870datatype t = T of t array
21871----
21872
21873Such a datatype has `Cardinality.Many`, but we cannot eliminate the
21874datatype and replace the lhs by the rhs, i.e. we must keep the
21875circularity around.
21876
The same care is needed for constructors whose argument type uses `vector`.
21878
21879Also, to eliminate as many `Transparent` constructors as possible, for
21880something like the following,
21881[source,sml]
21882----
21883datatype t = T of u array
21884 and u = U of t vector
21885----
21886we (arbitrarily) expand one of the datatypes first. The result will
21887be something like
21888[source,sml]
21889----
datatype u = U of u array vector
21891----
21892where all uses of `t` are replaced by `u array`.
21893
21894<<<
21895
21896:mlton-guide-page: SML3d
21897[[SML3d]]
21898SML3d
21899=====
21900
21901The http://sml3d.cs.uchicago.edu/[SML3d Project] is a collection of
21902libraries to support 3D graphics programming using Standard ML and the
21903http://www.opengl.org/[OpenGL] graphics API. It currently requires the
21904MLton implementation of SML and is supported on Linux, Mac OS X, and
21905Microsoft Windows. There is also support for
21906http://www.khronos.org/opencl/[OpenCL].
21907
21908<<<
21909
21910:mlton-guide-page: SMLNET
21911[[SMLNET]]
21912SMLNET
21913======
21914
21915http://www.cl.cam.ac.uk/research/tsg/SMLNET[SML.NET] is a
21916<:StandardMLImplementations:Standard ML implementation> that
21917targets the .NET Common Language Runtime.
21918
21919SML.NET is based on the <:MLj:MLj> compiler.
21920
21921== Also see ==
21922
21923* <!Cite(BentonEtAl04)>
21924
21925<<<
21926
21927:mlton-guide-page: SMLNJ
21928[[SMLNJ]]
21929SMLNJ
21930=====
21931
21932http://www.smlnj.org/[SML/NJ] is a
21933<:StandardMLImplementations:Standard ML implementation>. It is a
21934native code compiler that runs on a variety of platforms and has a
21935number of libraries and tools.
21936
21937We maintain a list of SML/NJ's <:SMLNJDeviations:deviations> from
21938<:DefinitionOfStandardML:The Definition of Standard ML>.
21939
21940MLton has support for some features of SML/NJ in order to ease porting
21941between MLton and SML/NJ.
21942
21943* <:CompilationManager:> (CM)
21944* <:LineDirective:>s
21945* <:SMLofNJStructure:>
21946* <:UnsafeStructure:>
21947
21948<<<
21949
21950:mlton-guide-page: SMLNJDeviations
21951[[SMLNJDeviations]]
21952SMLNJDeviations
21953===============
21954
21955Here are some deviations of <:SMLNJ:SML/NJ> from
21956<:DefinitionOfStandardML:The Definition of Standard ML (Revised)>.
21957Some of these are documented in the
21958http://www.smlnj.org/doc/Conversion/index.html[SML '97 Conversion Guide].
21959Since MLton does not deviate from the Definition, you should look here
21960if you are having trouble porting a program from MLton to SML/NJ or
21961vice versa. If you discover other deviations of SML/NJ that aren't
21962listed here, please send mail to
21963mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
21964
21965* SML/NJ allows spaces in long identifiers, as in `S . x`. Section
219662.5 of the Definition implies that `S . x` should be treated as three
21967separate lexical items.
21968
21969* SML/NJ allows `op` to appear in `val` specifications:
21970+
21971[source,sml]
21972----
21973signature FOO = sig
21974 val op + : int * int -> int
21975end
21976----
21977+
21978The grammar on page 14 of the Definition does not allow it. Recent
21979versions of SML/NJ do give a warning.
21980
21981* SML/NJ rejects
21982+
21983[source,sml]
21984----
21985(op *)
21986----
21987+
21988as an unmatched close comment.
21989
21990* SML/NJ allows `=` to be rebound by the declaration:
21991+
21992[source,sml]
21993----
21994val op = = 13
21995----
21996+
21997This is explicitly forbidden on page 5 of the Definition. Recent
21998versions of SML/NJ do give a warning.
21999
22000* SML/NJ allows rebinding `true`, `false`, `nil`, `::`, and `ref` by
22001the declarations:
22002+
22003[source,sml]
22004----
22005fun true () = ()
22006fun false () = ()
22007fun nil () = ()
22008fun op :: () = ()
22009fun ref () = ()
22010----
22011+
22012This is explicitly forbidden on page 9 of the Definition.
22013
22014* SML/NJ extends the syntax of the language to allow vector
22015expressions and patterns like the following:
22016+
22017[source,sml]
22018----
22019val v = #[1,2,3]
22020val #[x,y,z] = v
22021----
22022+
22023MLton supports vector expressions and patterns with the <:SuccessorML#VectorExpsAndPats:`allowVectorExpsAndPats`> <:MLBasisAnnotations:ML Basis annotation>.
22024
22025* SML/NJ extends the syntax of the language to allow _or patterns_
22026like the following:
22027+
22028[source,sml]
22029----
22030datatype foo = Foo of int | Bar of int
22031val (Foo x | Bar x) = Foo 13
22032----
22033+
22034MLton supports or patterns with the <:SuccessorML#OrPats:`allowOrPats`> <:MLBasisAnnotations:ML Basis annotation>.
22035
22036* SML/NJ allows higher-order functors, that is, functors can be
22037components of structures and can be passed as functor arguments and
22038returned as functor results. As a consequence, SML/NJ allows
22039abbreviated functor definitions, as in the following:
22040+
22041[source,sml]
22042----
22043signature S =
22044 sig
22045 type t
22046 val x: t
22047 end
22048functor F (structure A: S): S =
22049 struct
22050 type t = A.t * A.t
22051 val x = (A.x, A.x)
22052 end
22053functor G = F
22054----
22055
22056* SML/NJ extends the syntax of the language to allow `functor` and
22057`signature` declarations to occur within the scope of `local` and
22058`structure` declarations.
22059
22060* SML/NJ allows duplicate type specifications in signatures when the
22061duplicates are introduced by `include`, as in the following:
22062+
22063[source,sml]
22064----
22065signature SIG1 =
22066 sig
22067 type t
22068 type u
22069 end
22070signature SIG2 =
22071 sig
22072 type t
22073 type v
22074 end
22075signature SIG =
22076 sig
22077 include SIG1
22078 include SIG2
22079 end
22080----
22081+
22082This is disallowed by rule 77 of the Definition.
22083
22084* SML/NJ allows sharing constraints between type abbreviations in
22085signatures, as in the following:
22086+
22087[source,sml]
22088----
22089signature SIG =
22090 sig
22091 type t = int * int
22092 type u = int * int
22093 sharing type t = u
22094 end
22095----
22096+
22097These are disallowed by rule 78 of the Definition. Recent versions of
22098SML/NJ correctly disallow sharing constraints between type
22099abbreviations in signatures.
22100
22101* SML/NJ disallows multiple `where type` specifications of the same
type name, as in the following:
22103+
22104[source,sml]
22105----
22106signature S =
22107 sig
22108 type t
22109 type u = t
22110 end
22111 where type u = int
22112----
22113+
22114This is allowed by rule 64 of the Definition.
22115
* SML/NJ allows `and` in `sharing` specs in signatures, as in the following:
22117+
22118[source,sml]
22119----
22120signature S =
22121 sig
22122 type t
22123 type u
22124 type v
22125 sharing type t = u
22126 and type u = v
22127 end
22128----
22129
22130* SML/NJ does not expand the `withtype` derived form as described by
22131the Definition. According to page 55 of the Definition, the type
22132bindings of a `withtype` declaration are substituted simultaneously in
22133the connected datatype. Consider the following program.
22134+
22135[source,sml]
22136----
22137type u = real ;
22138datatype a =
22139 A of t
22140 | B of u
22141withtype u = int
22142and t = u
22143----
22144+
22145According to the Definition, it should be expanded to the following.
22146+
22147[source,sml]
22148----
22149type u = real ;
22150datatype a =
22151 A of u
22152 | B of int ;
22153type u = int
22154and t = u
22155----
22156+
22157However, SML/NJ expands `withtype` bindings sequentially, meaning that
22158earlier bindings are expanded within later ones. Hence, the above
22159program is expanded to the following.
22160+
22161[source,sml]
22162----
22163type u = real ;
22164datatype a =
22165 A of int
22166 | B of int ;
22167type u = int
22168type t = int
22169----
22170
22171* SML/NJ allows `withtype` specifications in signatures.
22172+
22173MLton supports `withtype` specifications in signatures with the <:SuccessorML#SigWithtype:`allowSigWithtype`> <:MLBasisAnnotations:ML Basis annotation>.
22174
22175* SML/NJ allows a `where` structure specification that is similar to a
22176`where type` specification. For example:
22177+
22178[source,sml]
22179----
22180structure S = struct type t = int end
22181signature SIG =
22182 sig
22183 structure T : sig type t end
22184 end where T = S
22185----
22186+
22187This is equivalent to:
22188+
22189[source,sml]
22190----
22191structure S = struct type t = int end
22192signature SIG =
22193 sig
22194 structure T : sig type t end
22195 end where type T.t = S.t
22196----
22197+
22198SML/NJ also allows a definitional structure specification that is
22199similar to a definitional type specification. For example:
22200+
22201[source,sml]
22202----
22203structure S = struct type t = int end
22204signature SIG =
22205 sig
22206 structure T : sig type t end = S
22207 end
22208----
22209+
22210This is equivalent to the previous examples and to:
22211+
22212[source,sml]
22213----
22214structure S = struct type t = int end
22215signature SIG =
22216 sig
22217 structure T : sig type t end where type t = S.t
22218 end
22219----
22220
22221* SML/NJ disallows binding non-datatypes with datatype replication.
22222For example, it rejects the following program that should be allowed
22223according to the Definition.
22224+
22225[source,sml]
22226----
22227type ('a, 'b) t = 'a * 'b
22228datatype u = datatype t
22229----
22230+
22231This idiom can be useful when one wants to rename a type without
22232rewriting all the type arguments. For example, the above would have
22233to be written in SML/NJ as follows.
22234+
22235[source,sml]
22236----
22237type ('a, 'b) t = 'a * 'b
22238type ('a, 'b) u = ('a, 'b) t
22239----
22240
22241* SML/NJ disallows sharing a structure with one of its substructures.
22242For example, SML/NJ disallows the following.
22243+
22244[source,sml]
22245----
22246signature SIG =
22247 sig
22248 structure S:
22249 sig
22250 type t
22251 structure T: sig type t end
22252 end
22253 sharing S = S.T
22254 end
22255----
22256+
22257This signature is allowed by the Definition.
22258
22259* SML/NJ disallows polymorphic generalization of refutable
22260patterns. For example, SML/NJ disallows the following.
22261+
22262[source,sml]
22263----
22264val [x] = [[]]
22265val _ = (1 :: x, "one" :: x)
22266----
22267+
22268Recent versions of SML/NJ correctly allow polymorphic generalization
22269of refutable patterns.
22270
22271* SML/NJ uses an overly restrictive context for type inference. For
22272example, SML/NJ rejects both of the following.
22273+
22274[source,sml]
22275----
22276structure S =
22277struct
22278 val z = (fn x => x) []
22279 val y = z :: [true] :: nil
22280end
22281----
22282+
22283[source,sml]
22284----
22285structure S : sig val z : bool list end =
22286struct
22287 val z = (fn x => x) []
22288end
22289----
22290+
22291These structures are allowed by the Definition.
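
Some of the SML/NJ-only forms listed above have straightforward rewrites that stay within the Definition. The following sketch shows two of them, for the abbreviated functor definition and for `and` in `sharing` specifications. The names `SHARED`, `I`, and `R` are hypothetical and exist only to make the example self-contained.

[source,sml]
----
signature S =
   sig
      type t
      val x: t
   end
functor F (structure A: S): S =
   struct
      type t = A.t * A.t
      val x = (A.x, A.x)
   end

(* Instead of the abbreviation `functor G = F`, restate the parameter
 * and apply F to it. *)
functor G (structure A: S): S = F (structure A = A)

(* Instead of `sharing type t = u and type u = v`, chain the equations
 * in a single sharing specification. *)
signature SHARED =
   sig
      type t
      type u
      type v
      sharing type t = u = v
   end

(* A small instantiation to show that G behaves like F. *)
structure I = struct type t = int val x = 1 end
structure R = G (structure A = I)
----

Both rewrites should be accepted by SML/NJ as well, so they are a reasonable choice for code that has to build with either compiler.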
22292
22293== Deviations from the Basis Library Specification ==
22294
22295Here are some deviations of SML/NJ from the <:BasisLibrary:Basis Library>
22296http://www.standardml.org/Basis[specification].
22297
22298* SML/NJ exposes the equality of the `vector` type in structures such
22299as `Word8Vector` that abstractly match `MONO_VECTOR`, which says
22300`type vector`, not `eqtype vector`. So, for example, SML/NJ accepts
22301the following program:
22302+
22303[source,sml]
22304----
22305fun f (v: Word8Vector.vector) = v = v
22306----
22307
22308* SML/NJ exposes the equality property of the type `status` in
22309`OS.Process`. This means that programs which directly compare two
22310values of type `status` will work with SML/NJ but not MLton.
22311
22312* Under SML/NJ on Windows, `OS.Path.validVolume` incorrectly considers
22313absolute empty volumes to be valid. In other words, when the
22314expression
22315+
22316[source,sml]
22317----
22318OS.Path.validVolume { isAbs = true, vol = "" }
22319----
22320+
22321is evaluated by SML/NJ on Windows, the result is `true`. MLton, on
22322the other hand, correctly follows the Basis Library Specification,
22323which states that on Windows, `OS.Path.validVolume` should return
22324`false` whenever `isAbs = true` and `vol = ""`.
22325+
22326This incorrect behavior causes other `OS.Path` functions to behave
22327differently. For example, when the expression
22328+
22329[source,sml]
22330----
22331OS.Path.toString (OS.Path.fromString "\\usr\\local")
22332----
22333+
22334is evaluated by SML/NJ on Windows, the result is `"\\usr\\local"`,
22335whereas under MLton on Windows, evaluating this expression (correctly)
22336causes an `OS.Path.Path` exception to be raised.
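
Code that has to build under both compilers can avoid relying on the exposed equality of `Word8Vector.vector` and `OS.Process.status`. The following is a minimal sketch using only Basis Library functions; the helper names `sameBytes` and `succeeded` are hypothetical.

[source,sml]
----
(* Compare byte vectors without assuming that Word8Vector.vector
 * admits equality. *)
fun sameBytes (a: Word8Vector.vector, b: Word8Vector.vector): bool =
   Word8Vector.collate Word8.compare (a, b) = EQUAL

(* Inspect an exit status without comparing two status values with `=`. *)
fun succeeded (st: OS.Process.status): bool = OS.Process.isSuccess st
----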
22337
22338<<<
22339
22340:mlton-guide-page: SMLNJLibrary
22341[[SMLNJLibrary]]
22342SMLNJLibrary
22343============
22344
22345The http://www.smlnj.org/doc/smlnj-lib/index.html[SML/NJ Library] is a
22346collection of libraries that are distributed with SML/NJ. Due to
22347differences between SML/NJ and MLton, these libraries will not work
out of the box with MLton.
22349
22350As of 20180119, MLton includes a port of the SML/NJ Library
22351synchronized with SML/NJ version 110.82.
22352
22353== Usage ==
22354
22355* You can import a sub-library of the SML/NJ Library into an MLB file with:
22356+
22357[options="header"]
22358|=====
22359|MLB file|Description
|`$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb`|Various utility modules, including collections, simple formatting, ...
22361|`$(SML_LIB)/smlnj-lib/Controls/controls-lib.mlb`|A library for managing control flags in an application.
22362|`$(SML_LIB)/smlnj-lib/HashCons/hash-cons-lib.mlb`|Support for implementing hash-consed data structures.
22363|`$(SML_LIB)/smlnj-lib/HTML/html-lib.mlb`|HTML 3.2 parsing and pretty-printing library.
22364|`$(SML_LIB)/smlnj-lib/HTML4/html4-lib.mlb`|HTML 4.01 parsing and pretty-printing library.
22365|`$(SML_LIB)/smlnj-lib/INet/inet-lib.mlb`|Networking utilities; supported on both Unix and Windows systems.
22366|`$(SML_LIB)/smlnj-lib/JSON/json-lib.mlb`|JavaScript Object Notation (JSON) reading and writing library.
22367|`$(SML_LIB)/smlnj-lib/PP/pp-lib.mlb`|Pretty-printing library.
22368|`$(SML_LIB)/smlnj-lib/Reactive/reactive-lib.mlb`|Reactive scripting library.
22369|`$(SML_LIB)/smlnj-lib/RegExp/regexp-lib.mlb`|Regular expression library.
22370|`$(SML_LIB)/smlnj-lib/SExp/sexp-lib.mlb`|S-expression library.
22371|`$(SML_LIB)/smlnj-lib/Unix/unix-lib.mlb`|Utilities for Unix-based operating systems.
22372|`$(SML_LIB)/smlnj-lib/XML/xml-lib.mlb`|XML library.
22373|=====
22374
22375* If you are porting a project from SML/NJ's <:CompilationManager:> to
22376MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the
22377following maps are included by default:
22378+
----
22380# SMLNJ Library
22381$SMLNJ-LIB $(SML_LIB)/smlnj-lib
22382$smlnj-lib.cm $(SML_LIB)/smlnj-lib/Util
22383$controls-lib.cm $(SML_LIB)/smlnj-lib/Controls
22384$hash-cons-lib.cm $(SML_LIB)/smlnj-lib/HashCons
22385$html-lib.cm $(SML_LIB)/smlnj-lib/HTML
22386$html4-lib.cm $(SML_LIB)/smlnj-lib/HTML4
22387$inet-lib.cm $(SML_LIB)/smlnj-lib/INet
22388$json-lib.cm $(SML_LIB)/smlnj-lib/JSON
22389$pp-lib.cm $(SML_LIB)/smlnj-lib/PP
22390$reactive-lib.cm $(SML_LIB)/smlnj-lib/Reactive
22391$regexp-lib.cm $(SML_LIB)/smlnj-lib/RegExp
22392$sexp-lib.cm $(SML_LIB)/smlnj-lib/SExp
22393$unix-lib.cm $(SML_LIB)/smlnj-lib/Unix
22394$xml-lib.cm $(SML_LIB)/smlnj-lib/XML
22395----
22396+
22397This will automatically convert a `$/smlnj-lib.cm` import in an input
22398`.cm` file into a `$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb` import in
22399the output `.mlb` file.
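
As a minimal sketch of using the utility sub-library, the following program sorts a list with `ListMergeSort` from the SML/NJ Library. The file names `sort-demo.mlb` and `sort-demo.sml` are hypothetical, and the MLB contents are shown in a comment because they are an assumption about how such a project would be laid out.

[source,sml]
----
(* Assumed contents of a hypothetical project file, sort-demo.mlb:
 *
 *   $(SML_LIB)/basis/basis.mlb
 *   $(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb
 *   sort-demo.sml
 *)

(* sort-demo.sml: ListMergeSort is provided by the Util sub-library. *)
val sorted = ListMergeSort.sort (op >) [3, 1, 2]
val () = print (String.concatWith " " (List.map Int.toString sorted) ^ "\n")
----

With those two files in place, the program would be compiled by passing the MLB file to the compiler, as in `mlton sort-demo.mlb`.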
22400
22401== Details ==
22402
22403The following changes were made to the SML/NJ Library, in addition to
22404deriving the `.mlb` files from the `.cm` files:
22405
22406* `HTML4/pp-init.sml` (added): Implements `structure PrettyPrint` using the SML/NJ PP Library. This implementation is taken from the SML/NJ compiler source, since the SML/NJ HTML4 Library used the `structure PrettyPrint` provided by the SML/NJ compiler itself.
22407* `Util/base64.sml` (modified): Rewrote use of `Unsafe.CharVector.create` and `Unsafe.CharVector.update`; MLton assumes that vectors are immutable.
22408* `Util/engine.mlton.sml` (added, not exported): Implements `structure Engine`, providing time-limited, resumable computations using <:MLtonThread:>, <:MLtonSignal:>, and <:MLtonItimer:>.
22409* `Util/graph-scc-fn.sml` (modified): Rewrote use of `where` structure specification.
22410* `Util/redblack-map-fn.sml` (modified): Rewrote use of `where` structure specification.
22411* `Util/redblack-set-fn.sml` (modified): Rewrote use of `where` structure specification.
22412* `Util/time-limit.mlb` (added): Exports `structure TimeLimit`, which is _not_ exported by `smlnj-lib.mlb`. Since MLton is very conservative in the presence of threads and signals, program performance may be adversely affected by unnecessarily including `structure TimeLimit`.
22413* `Util/time-limit.mlton.sml` (added): Implements `structure TimeLimit` using `structure Engine`. The SML/NJ implementation of `structure TimeLimit` uses SML/NJ's first-class continuations, signals, and interval timer.
22414
22415== Patch ==
22416
22417* <!ViewGitFile(mlton,master,lib/smlnj-lib/smlnj-lib.patch)>
22418
22419<<<
22420
22421:mlton-guide-page: SMLofNJStructure
22422[[SMLofNJStructure]]
22423SMLofNJStructure
22424================
22425
22426[source,sml]
22427----
22428signature SML_OF_NJ =
22429 sig
22430 structure Cont:
22431 sig
22432 type 'a cont
22433 val callcc: ('a cont -> 'a) -> 'a
22434 val isolate: ('a -> unit) -> 'a cont
22435 val throw: 'a cont -> 'a -> 'b
22436 end
22437 structure SysInfo:
22438 sig
22439 exception UNKNOWN
22440 datatype os_kind = BEOS | MACOS | OS2 | UNIX | WIN32
22441
22442 val getHostArch: unit -> string
22443 val getOSKind: unit -> os_kind
22444 val getOSName: unit -> string
22445 end
22446
22447 val exnHistory: exn -> string list
22448 val exportFn: string * (string * string list -> OS.Process.status) -> unit
22449 val exportML: string -> bool
22450 val getAllArgs: unit -> string list
22451 val getArgs: unit -> string list
22452 val getCmdName: unit -> string
22453 end
22454----
22455
22456`SMLofNJ` implements a subset of the structure of the same name
22457provided in <:SMLNJ:Standard ML of New Jersey>. It is included to
make it easier to port programs between the two systems. The
semantics of these functions may differ from those in SML/NJ.
22460
22461* `structure Cont`
22462+
22463implements continuations.
22464
22465* `SysInfo.getHostArch ()`
22466+
22467returns the string for the architecture.
22468
22469* `SysInfo.getOSKind`
22470+
22471returns the OS kind.
22472
22473* `SysInfo.getOSName ()`
22474+
22475returns the string for the host.
22476
22477* `exnHistory`
22478+
22479the same as `MLton.Exn.history`.
22480
22481* `getCmdName ()`
22482+
22483the same as `CommandLine.name ()`.
22484
22485* `getArgs ()`
22486+
22487the same as `CommandLine.arguments ()`.
22488
22489* `getAllArgs ()`
22490+
22491the same as `getCmdName()::getArgs()`.
22492
* `exportFn (file, f)`
+
saves the state of the computation to the file `file`; upon restart,
the saved computation applies `f` to the command-line arguments.
22497
22498* `exportML f`
22499+
saves the state of the computation to file `f` and continues. Returns
22501`true` in the restarted computation and `false` in the continuing
22502computation.
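
The following sketch exercises part of this interface under MLton. The function `firstNegative` and the value `sameArgs` are hypothetical names; the code uses only the operations listed in the signature above.

[source,sml]
----
(* Early exit from a list traversal using a captured continuation. *)
fun firstNegative (xs: int list): int option =
   SMLofNJ.Cont.callcc
   (fn k =>
    (List.app
     (fn x => if x < 0 then SMLofNJ.Cont.throw k (SOME x) else ())
     xs
     ; NONE))

(* getAllArgs is described above as the command name followed by the
 * remaining arguments. *)
val sameArgs =
   SMLofNJ.getAllArgs () = SMLofNJ.getCmdName () :: SMLofNJ.getArgs ()
----

Throwing to `k` abandons the rest of the traversal, so `firstNegative` stops at the first negative element instead of visiting the whole list.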
22503
22504<<<
22505
22506:mlton-guide-page: SMLSharp
22507[[SMLSharp]]
22508SMLSharp
22509========
22510
22511http://www.pllab.riec.tohoku.ac.jp/smlsharp/[SML#] is an
22512<:StandardMLImplementations:implementation> of an extension of SML.
22513
22514It includes some
22515http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Tools[generally useful SML tools]
22516including a pretty printer generator, a document generator, and a
regression testing framework, and a
http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Library%2FScripting[scripting library].
22519
22520<<<
22521
22522:mlton-guide-page: Sources
22523[[Sources]]
22524Sources
22525=======
22526
22527We maintain our sources with <:Git:>. You can
22528https://github.com/MLton/mlton/[view them on the web] or access
22529them with a git client.
22530
22531Anonymous read-only access is available via
22532----------
22533https://github.com/MLton/mlton.git
22534----------
22535or
22536----------
22537git://github.com/MLton/mlton.git
22538----------
22539
22540
22541== Commit email ==
22542
22543All commits are sent to
22544mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`]
22545(https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe],
22546https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive],
22547http://www.mlton.org/pipermail/mlton-commit/[archive]) which is a
22548read-only mailing list for commit emails. Discussion should go to
22549mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
22550
22551/////
22552If the first line of a commit log message begins with "++MAIL{nbsp} ++",
22553then the commit message will be sent with the subject as the rest of
22554that first line, and will also be sent to
22555mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`].
22556/////
22557
22558
22559== Changelog ==
22560
22561See <!ViewGitFile(mlton,master,CHANGELOG.adoc)> for a list of
22562changes and bug fixes.
22563
22564
22565== Subversion ==
22566
22567Prior to 20130308, we used <:Subversion:>.
22568
22569== CVS ==
22570
22571Prior to 20050730, we used <:CVS:>.
22572
22573<<<
22574
22575:mlton-guide-page: SpaceSafety
22576[[SpaceSafety]]
22577SpaceSafety
22578===========
22579
22580Informally, space safety is a property of a language implementation
22581that asymptotically bounds the space used by a running program.
22582
22583== Also see ==
22584
22585* Chapter 12 of <!Cite(Appel92)>
22586* <!Cite(Clinger98)>
22587
22588<<<
22589
22590:mlton-guide-page: SSA
22591[[SSA]]
22592SSA
22593===
22594
22595<:SSA:> is an <:IntermediateLanguage:>, translated from <:SXML:> by
22596<:ClosureConvert:>, optimized by <:SSASimplify:>, and translated by
22597<:ToSSA2:> to <:SSA2:>.
22598
22599== Description ==
22600
22601<:SSA:> is a <:FirstOrder:>, <:SimplyTyped:> <:IntermediateLanguage:>.
22602It is the main <:IntermediateLanguage:> used for optimizations.
22603
22604An <:SSA:> program consists of a collection of datatype declarations,
22605a sequence of global statements, and a collection of functions, along
22606with a distinguished "main" function. Each function consists of a
22607collection of basic blocks, where each basic block is a sequence of
22608statements ending with some control transfer.
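
The shape described above can be pictured with a deliberately simplified toy model. These types are assumptions for illustration and are far smaller than the real definitions in `ssa-tree.sig`: expressions are kept as plain text and only two kinds of control transfer are modeled.

[source,sml]
----
type var = string
type label = string

(* Statements and transfers are heavily simplified stand-ins. *)
datatype statement = Bind of var * string   (* variable = expression text *)
datatype transfer =
   Goto of label * var list                 (* jump to a block with arguments *)
 | Return of var list                       (* leave the function *)

type block = {label: label, args: var list,
              statements: statement list, transfer: transfer}
type func = {name: string, args: var list, start: label, blocks: block list}
type program = {datatypes: string list, globals: statement list,
                functions: func list, main: string}

(* Count the basic blocks of a whole program. *)
fun numBlocks (p: program): int =
   List.foldl (fn (f: func, n) => n + List.length (#blocks f)) 0 (#functions p)
----

Every block ends in exactly one `transfer`, which mirrors the requirement that a basic block is a sequence of statements ending with some control transfer.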
22609
22610== Implementation ==
22611
22612* <!ViewGitFile(mlton,master,mlton/ssa/ssa.sig)>
22613* <!ViewGitFile(mlton,master,mlton/ssa/ssa.fun)>
22614* <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.sig)>
22615* <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.fun)>
22616
22617== Type Checking ==
22618
22619Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check.sig)>,
<!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>) of an <:SSA:> program
22621verifies the following:
22622
22623* no duplicate definitions (tycons, cons, vars, labels, funcs)
22624* no out of scope references (tycons, cons, vars, labels, funcs)
22625* variable definitions dominate variable uses
22626* case transfers are exhaustive and irredundant
22627* `Enter`/`Leave` profile statements match
22628* "traditional" well-typedness
22629
22630== Details and Notes ==
22631
22632SSA is an abbreviation for Static Single Assignment.
22633
22634For some initial design discussion, see the thread at:
22635
22636* http://mlton.org/pipermail/mlton/2001-August/019689.html
22637
22638For some retrospectives, see the threads at:
22639
22640* http://mlton.org/pipermail/mlton/2003-January/023054.html
22641* http://mlton.org/pipermail/mlton/2007-February/029597.html
22642
22643<<<
22644
22645:mlton-guide-page: SSA2
22646[[SSA2]]
22647SSA2
22648====
22649
22650<:SSA2:> is an <:IntermediateLanguage:>, translated from <:SSA:> by
22651<:ToSSA2:>, optimized by <:SSA2Simplify:>, and translated by
22652<:ToRSSA:> to <:RSSA:>.
22653
22654== Description ==
22655
22656<:SSA2:> is a <:FirstOrder:>, <:SimplyTyped:>
22657<:IntermediateLanguage:>, a slight variant of the <:SSA:>
<:IntermediateLanguage:>.
22659
22660Like <:SSA:>, an <:SSA2:> program consists of a collection of datatype
22661declarations, a sequence of global statements, and a collection of
22662functions, along with a distinguished "main" function. Each function
22663consists of a collection of basic blocks, where each basic block is a
22664sequence of statements ending with some control transfer.
22665
22666Unlike <:SSA:>, <:SSA2:> includes mutable fields in objects and makes
22667the vector type constructor n-ary instead of unary. This allows
22668optimizations like <:RefFlatten:> and <:DeepFlatten:> to be expressed.
22669
22670== Implementation ==
22671
22672* <!ViewGitFile(mlton,master,mlton/ssa/ssa2.sig)>
22673* <!ViewGitFile(mlton,master,mlton/ssa/ssa2.fun)>
22674* <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.sig)>
22675* <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.fun)>
22676
22677== Type Checking ==
22678
22679Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check2.sig)>,
<!ViewGitFile(mlton,master,mlton/ssa/type-check2.fun)>) of an <:SSA2:>
22681program verifies the following:
22682
22683* no duplicate definitions (tycons, cons, vars, labels, funcs)
22684* no out of scope references (tycons, cons, vars, labels, funcs)
22685* variable definitions dominate variable uses
22686* case transfers are exhaustive and irredundant
22687* `Enter`/`Leave` profile statements match
22688* "traditional" well-typedness
22689
22690== Details and Notes ==
22691
22692SSA is an abbreviation for Static Single Assignment.
22693
22694<<<
22695
22696:mlton-guide-page: SSA2Simplify
22697[[SSA2Simplify]]
22698SSA2Simplify
22699============
22700
22701The optimization passes for the <:SSA2:> <:IntermediateLanguage:> are
22702collected and controlled by the `Simplify2` functor
22703(<!ViewGitFile(mlton,master,mlton/ssa/simplify2.sig)>,
22704<!ViewGitFile(mlton,master,mlton/ssa/simplify2.fun)>).
22705
22706The following optimization passes are implemented:
22707
22708* <:DeepFlatten:>
22709* <:RefFlatten:>
22710* <:RemoveUnused:>
22711* <:Zone:>
22712
22713There are additional analysis and rewrite passes that augment many of the other optimization passes:
22714
22715* <:Restore:>
22716* <:Shrink:>
22717
The optimization passes can be controlled from the command-line by the options:
22719
22720* `-diag-pass <pass>` -- keep diagnostic info for pass
22721* `-disable-pass <pass>` -- skip optimization pass (if normally performed)
22722* `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
22723* `-keep-pass <pass>` -- keep the results of pass
22724* `-loop-passes <n>` -- loop optimization passes
* `-ssa2-passes <passes>` -- ssa2 optimization passes
22726
22727<<<
22728
22729:mlton-guide-page: SSASimplify
22730[[SSASimplify]]
22731SSASimplify
22732===========
22733
22734The optimization passes for the <:SSA:> <:IntermediateLanguage:> are
22735collected and controlled by the `Simplify` functor
22736(<!ViewGitFile(mlton,master,mlton/ssa/simplify.sig)>,
22737<!ViewGitFile(mlton,master,mlton/ssa/simplify.fun)>).
22738
22739The following optimization passes are implemented:
22740
22741* <:CombineConversions:>
22742* <:CommonArg:>
22743* <:CommonBlock:>
22744* <:CommonSubexp:>
22745* <:ConstantPropagation:>
22746* <:Contify:>
22747* <:Flatten:>
22748* <:Inline:>
22749* <:IntroduceLoops:>
22750* <:KnownCase:>
22751* <:LocalFlatten:>
22752* <:LocalRef:>
22753* <:LoopInvariant:>
* <:LoopUnroll:>
22755* <:LoopUnswitch:>
22756* <:Redundant:>
22757* <:RedundantTests:>
22758* <:RemoveUnused:>
22759* <:ShareZeroVec:>
22760* <:SimplifyTypes:>
22761* <:Useless:>
22762
22763The following implementation passes are implemented:
22764
22765* <:PolyEqual:>
22766* <:PolyHash:>
22767
22768There are additional analysis and rewrite passes that augment many of the other optimization passes:
22769
22770* <:Multi:>
22771* <:Restore:>
22772* <:Shrink:>
22773
22774The optimization passes can be controlled from the command-line by the options:
22775
22776* `-diag-pass <pass>` -- keep diagnostic info for pass
22777* `-disable-pass <pass>` -- skip optimization pass (if normally performed)
22778* `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
22779* `-keep-pass <pass>` -- keep the results of pass
22780* `-loop-passes <n>` -- loop optimization passes
22781* `-ssa-passes <passes>` -- ssa optimization passes
22782
22783<<<
22784
22785:mlton-guide-page: Stabilizers
22786[[Stabilizers]]
22787Stabilizers
22788===========
22789
22790== Installation ==
22791
* Stabilizers currently require the MLton sources; this should be fixed by the next release.
22793
22794== License ==
22795
22796* Stabilizers are released under the MLton License
22797
22798== Instructions ==
22799
22800* Download and build a source copy of MLton
22801* Extract the tar.gz file attached to this page
* Some examples are provided in the "examples/" subdirectory; more examples will be added to this page in the following week.
22803
22804== Bug reports / Suggestions ==
22805
22806* Please send any errors you encounter to schatzp and lziarek at cs.purdue.edu
22807* We are looking to expand the usability of stabilizers
22808* Please send any suggestions and desired functionality to the above email addresses
22809
22810== Note ==
22811
* This is an alpha release. We expect to have another release with added functionality soon.
22813* More documentation, such as signatures and descriptions of functionality, will be forthcoming
22814
22815
22816== Documentation ==
22817
22818[source,sml]
22819----
22820signature STABLE =
22821 sig
22822 type checkpoint
22823
22824 val stable: ('a -> 'b) -> ('a -> 'b)
22825 val stabilize: unit -> 'a
22826
22827 val stableCP: (('a -> 'b) * (unit -> unit)) ->
22828 (('a -> 'b) * checkpoint)
22829 val stabilizeCP: checkpoint -> unit
22830
22831 val unmonitoredAssign: ('a ref * 'a) -> unit
22832 val monitoredAssign: ('a ref * 'a) -> unit
22833 end
22834----
22835
22836
22837`Stable` provides functions to manage stable sections.
22838
22839* `type checkpoint`
22840+
22841handle used to stabilize contexts other than the current one.
22842
22843* `stable f`
22844+
22845returns a function identical to `f` that will execute within a stable section.
22846
22847* `stabilize ()`
22848+
22849unrolls the effects made up to the current context to at least the
22850nearest enclosing _stable_ section. These effects may have propagated
22851to other threads, so all affected threads are returned to a globally
22852consistent previous state. The return is undefined because control
22853cannot resume after stabilize is called.
22854
22855* `stableCP (f, comp)`
22856+
22857returns a function `f'` and checkpoint tag `cp`. Function `f'` is
22858identical to `f` but when applied will execute within a stable
22859section. `comp` will be executed if `f'` is later stabilized. `cp`
22860is used by `stabilizeCP` to stabilize a given checkpoint.
22861
22862* `stabilizeCP cp`
22863+
22864same as stabilize except that the (possibly current) checkpoint to
22865stabilize is provided.
22866
22867* `unmonitoredAssign (r, v)`
22868+
standard assignment (`:=`). The distributed version of CML rebinds
22870`:=` to a monitored version so interesting effects can be recorded.
22871
22872* `monitoredAssign (r, v)`
22873+
22874the assignment operator that should be used in programs that use
22875stabilizers. `:=` is rebound to this by including CML.
22876
22877== Download ==
22878
22879* <!Attachment(Stabilizers,stabilizers_alpha_2006-10-09.tar.gz)>
22880
22881== Also see ==
22882
22883* <!Cite(ZiarekEtAl06)>
22884
22885<<<
22886
22887:mlton-guide-page: StandardML
22888[[StandardML]]
22889StandardML
22890==========
22891
22892Standard ML (SML) is a programming language that combines excellent
22893support for rapid prototyping, modularity, and development of large
22894programs, with performance approaching that of C.
22895
22896== SML Resources ==
22897
22898* <:StandardMLTutorials:Tutorials>
22899* <:StandardMLBooks:Books>
22900* <:StandardMLImplementations:Implementations>
22901// * http://google.com/coop/cse?cx=014714656471597805969%3Afzuz7eybmcy[SML web search] from Google Co-op
22902
22903== Aspects of SML ==
22904
22905* <:DefineTypeBeforeUse:>
22906* <:EqualityType:>
22907* <:EqualityTypeVariable:>
22908* <:GenerativeDatatype:>
22909* <:GenerativeException:>
22910* <:Identifier:>
22911* <:OperatorPrecedence:>
22912* <:Overloading:>
22913* <:PolymorphicEquality:>
22914* <:TypeVariableScope:>
22915* <:ValueRestriction:>
22916
22917== Using SML ==
22918
22919* <:Fixpoints:>
22920* <:ForLoops:>
22921* <:FunctionalRecordUpdate:>
22922* <:InfixingOperators:>
22923* <:Lazy:>
22924* <:ObjectOrientedProgramming:>
22925* <:OptionalArguments:>
22926* <:Printf:>
22927* <:PropertyList:>
22928* <:ReturnStatement:>
22929* <:Serialization:>
22930* <:StandardMLGotchas:>
22931* <:StyleGuide:>
22932* <:TipsForWritingConciseSML:>
22933* <:UniversalType:>
22934
22935== Programming in SML ==
22936
22937* <:Emacs:>
22938* <:Enscript:>
22939* <:Pygments:>
22940
22941== Notes ==
22942
22943* <:StandardMLHistory: History of SML>
22944* <:Regions:>
22945
22946== Related Languages ==
22947
22948* <:Alice:>
22949* <:FSharp:F#>
22950* <:OCaml:>
22951
22952<<<
22953
22954:mlton-guide-page: StandardMLBooks
22955[[StandardMLBooks]]
22956StandardMLBooks
22957===============
22958
22959== Introductory Books ==
22960
22961* <!Cite(Ullman98, Elements of ML Programming)>
22962
22963* <!Cite(Paulson96, ML For the Working Programmer)>
22964
22965* <!Cite(HansenRichel99, Introduction to Programming using SML)>
22966
22967* <!Cite(FelleisenFreidman98, The Little MLer)>
22968
22969== Applications ==
22970
22971* <!Cite(Shipman02, Unix System Programming with Standard ML)>
22972
22973== Reference Books ==
22974
22975* <!Cite(GansnerReppy04, The Standard ML Basis Library)>
22976
22977* <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>
22978
22979== Related Topics ==
22980
22981* <!Cite(Reppy07, Concurrent Programming in ML)>
22982
22983* <!Cite(Okasaki99, Purely Functional Data Structures)>
22984
22985<<<
22986
22987:mlton-guide-page: StandardMLGotchas
22988[[StandardMLGotchas]]
22989StandardMLGotchas
22990=================
22991
22992This page contains brief explanations of some recurring sources of
22993confusion and problems that SML newbies encounter.
22994
22995Many confusions about the syntax of SML seem to arise from the use of
an interactive REPL (Read-Eval-Print Loop) while trying to learn the
22997basics of the language. While writing your first SML programs, you
22998should keep the source code of your programs in a form that is
22999accepted by an SML compiler as a whole.
23000
23001== The `and` keyword ==
23002
23003It is a common mistake to misuse the `and` keyword or to not know how
23004to introduce mutually recursive definitions. The purpose of the `and`
23005keyword is to introduce mutually recursive definitions of functions
23006and datatypes. For example,
23007
23008[source,sml]
23009----
23010fun isEven 0w0 = true
23011 | isEven 0w1 = false
23012 | isEven n = isOdd (n-0w1)
23013and isOdd 0w0 = false
23014 | isOdd 0w1 = true
23015 | isOdd n = isEven (n-0w1)
23016----
23017
23018and
23019
23020[source,sml]
23021----
23022datatype decl = VAL of id * pat * expr
23023 (* | ... *)
23024 and expr = LET of decl * expr
23025 (* | ... *)
23026----
23027
23028You can also use `and` as a shorthand in a couple of other places, but
23029it is not necessary.
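
As a small illustration of the shorthand use (all names below are made up for this sketch), `and` can also join several `type` or `val` bindings into a single declaration. All right-hand sides of a `val ... and ...` declaration are evaluated before any of the names is bound, so the bindings cannot refer to each other:

[source,sml]
----
type name = string
and age = int

val a = 1
val b = 2

(* Both right-hand sides still see the old a and b,
 * so this declaration swaps the two values. *)
val a = b
and b = a

val person : name * age = ("Robin", a)
----

After these declarations `a` is `2` and `b` is `1`. Writing two separate `val` declarations would not swap the values, which is why the shorthand is rarely what one wants.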
23030
23031== Constructed patterns ==
23032
23033It is a common mistake to forget to parenthesize constructed patterns
23034in `fun` bindings. Consider the following invalid definition:
23035
23036[source,sml]
23037----
23038fun length nil = 0
23039 | length h :: t = 1 + length t
23040----
23041
23042The pattern `h :: t` needs to be parenthesized:
23043
23044[source,sml]
23045----
23046fun length nil = 0
23047 | length (h :: t) = 1 + length t
23048----
23049
23050The parentheses are needed, because a `fun` definition may have
23051multiple consecutive constructed patterns through currying.
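
For instance, a curried function can take two constructed patterns in a row, and each one must be parenthesized on its own. The `zipWith` function below is only an illustrative sketch and is not part of the Basis Library:

[source,sml]
----
(* Each list argument is matched by its own parenthesized pattern. *)
fun zipWith f (x :: xs) (y :: ys) = f (x, y) :: zipWith f xs ys
  | zipWith _ _ _ = []

val sums = zipWith op+ [1, 2, 3] [10, 20]
----

Here `sums` is `[11, 22]`, because the second clause ends the recursion as soon as either list is empty.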
23052
23053The same applies to nonfix constructors. For example, the parentheses
23054in
23055
23056[source,sml]
23057----
23058fun valOf NONE = raise Option
23059 | valOf (SOME x) = x
23060----
23061
23062are required. However, the outermost constructed pattern in a `fn` or
23063`case` expression need not be parenthesized, because in those cases
23064there is always just one constructed pattern. So, both
23065
23066[source,sml]
23067----
23068val valOf = fn NONE => raise Option
23069 | SOME x => x
23070----
23071
23072and
23073
23074[source,sml]
23075----
23076fun valOf x = case x of
23077 NONE => raise Option
23078 | SOME x => x
23079----
23080
23081are fine.
23082
23083== Declarations and expressions ==
23084
23085It is a common mistake to confuse expressions and declarations.
23086Normally an SML source file should only contain declarations. The
23087following are declarations:
23088
23089[source,sml]
23090----
23091datatype dt = ...
23092fun f ... = ...
23093functor Fn (...) = ...
23094infix ...
23095infixr ...
23096local ... in ... end
23097nonfix ...
23098open ...
23099signature SIG = ...
23100structure Struct = ...
23101type t = ...
23102val v = ...
23103----
23104
23105Note that
23106
23107[source,sml]
23108----
23109let ... in ... end
23110----
23111
23112isn't a declaration.
23113
23114To specify a side-effecting computation in a source file, you can write:
23115
23116[source,sml]
23117----
23118val () = ...
23119----
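
As a sketch of how this looks in a complete source file (the names are illustrative), a `let` expression becomes acceptable once it appears on the right-hand side of a `val` declaration, and a side effect is run by binding its `unit` result:

[source,sml]
----
val total =
   let
      val xs = [1, 2, 3]
   in
      foldl op+ 0 xs
   end

val () = print (Int.toString total ^ "\n")
----

Binding the result to `()` rather than to `_` also makes the compiler check that the expression really has type `unit`.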
23120
23121
23122== Equality types ==
23123
23124SML has a fairly intricate built-in notion of equality. See
23125<:EqualityType:> and <:EqualityTypeVariable:> for a thorough
23126discussion.
23127
23128
23129== Nested cases ==
23130
23131It is a common mistake to write nested case expressions without the
23132necessary parentheses. See <:UnresolvedBugs:> for a discussion.
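
A minimal sketch of the problem, using an illustrative function name: the rules of an inner `case` extend as far to the right as possible, so without the parentheses below the final `NONE` rule would be parsed as a third rule of the inner `case` rather than as the second rule of the outer one.

[source,sml]
----
fun classify (x : int option, y : int option) : string =
   case x of
      SOME _ =>
         (case y of
             SOME _ => "both"
           | NONE => "x only")
    | NONE => "no x"
----

The same precaution applies to `fn` and `handle` expressions nested inside a match.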
23133
23134
23135== (op *) ==
23136
23137It used to be a common mistake to parenthesize `op *` as `(op *)`.
23138Before SML'97, `*)` was considered a comment terminator in SML and
23139caused a syntax error. At the time of writing, <:SMLNJ:SML/NJ> still
23140rejects the code. An extra space may be used for portability:
23141`(op * )`. However, parenthesizing `op` is redundant, even though it
23142is a widely used convention.
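
The following sketch shows both portable spellings; the value names are illustrative:

[source,sml]
----
(* The space before the closing parenthesis keeps the lexer from
 * reading the asterisk and the parenthesis as one token. *)
val product = foldl (op * ) 1 [1, 2, 3, 4]

(* Without parentheses no special care is needed. *)
val times : int * int -> int = op *
val six = times (2, 3)
----

Here `product` is `24` and `six` is `6`.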
23143
23144
23145== Overloading ==
23146
23147A number of standard operators (`+`, `-`, `~`, `*`, `<`, `>`, ...) and
23148numeric constants are overloaded for some of the numeric types (`int`,
23149`real`, `word`). It is a common surprise that definitions using
23150overloaded operators such as
23151
23152[source,sml]
23153----
23154fun min (x, y) = if y < x then y else x
23155----
23156
23157are not overloaded themselves. SML doesn't really support
23158(user-defined) overloading or other forms of ad hoc polymorphism. In
23159cases such as the above where the context doesn't resolve the
23160overloading, expressions using overloaded operators or constants get
23161assigned a default type. The above definition gets the type
23162
23163[source,sml]
23164----
23165val min : int * int -> int
23166----
23167
23168See <:Overloading:> and <:TypeIndexedValues:> for further discussion.
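
A type annotation is enough to resolve the overloading differently. In the sketch below (the function names are illustrative), `minInt` receives the default type while `minReal` is resolved to `real` by the annotation on `x`:

[source,sml]
----
fun minInt (x, y) = if y < x then y else x
fun minReal (x : real, y) = if y < x then y else x

val i = minInt (3, 2)
val r = minReal (2.5, 1.5)
----

An application such as `minInt (2.5, 1.5)` would be rejected by the type checker, because `minInt` has type `int * int -> int`.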
23169
23170
23171== Semicolons ==
23172
23173It is a common mistake to use redundant semicolons in SML code. This
23174is probably caused by the fact that in an SML REPL, a semicolon (and
23175enter) is used to signal the REPL that it should evaluate the
23176preceding chunk of code as a unit. In SML source files, semicolons
23177are really needed in only two places. Namely, in expressions of the
23178form
23179
23180[source,sml]
23181----
23182(exp ; ... ; exp)
23183----
23184
23185and
23186
23187[source,sml]
23188----
23189let ... in exp ; ... ; exp end
23190----
23191
23192Note that semicolons act as expression (or declaration) separators
23193rather than as terminators.
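
The two forms can be sketched as follows (the value names are illustrative). Note that the declarations themselves are simply written one after another, with no semicolons between them:

[source,sml]
----
(* A parenthesized sequence evaluates to its last expression. *)
val three = (print "one\n" ; print "two\n" ; 3)

(* The body of a let expression may be a sequence without parentheses. *)
val four =
   let
      val x = three
   in
      print "three\n" ; x + 1
   end
----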
23194
23195
23196== Stale bindings ==
23197
23198{empty}
23199
23200
23201== Unresolved records ==
23202
23203{empty}
23204
23205
23206== Value restriction ==
23207
23208See <:ValueRestriction:>.
23209
23210
23211== Type Variable Scope ==
23212
23213See <:TypeVariableScope:>.
23214
23215<<<
23216
23217:mlton-guide-page: StandardMLHistory
23218[[StandardMLHistory]]
23219StandardMLHistory
23220=================
23221
23222<:StandardML:Standard ML> grew out of <:ML:> in the early 1980s.
23223
23224For an excellent overview of SML's history, see Appendix F of the
23225<:DefinitionOfStandardML:Definition>.
23226
For an overview of its history before 1982, see <!Cite(Milner82, How
23228ML Evolved)>.
23229
23230<<<
23231
23232:mlton-guide-page: StandardMLImplementations
23233[[StandardMLImplementations]]
23234StandardMLImplementations
23235=========================
23236
23237There are a number of implementations of <:StandardML:Standard ML>,
23238from interpreters, to byte-code compilers, to incremental compilers,
23239to whole-program compilers.
23240
23241* <:Alice:Alice ML>
23242* <:HaMLet:HaMLet>
23243* <:MLKit:ML Kit>
23244* <:Home:MLton>
23245* <:MoscowML:Moscow ML>
23246* <:PolyML:Poly/ML>
23247* <:SMLSharp:SML#>
23248* <:SMLNJ:SML/NJ>
23249* <:SMLNET:SML.NET>
23250* <:TILT:TILT>
23251
23252== Not Actively Maintained ==
23253
23254* http://www.dcs.ed.ac.uk/home/edml/[Edinburgh ML]
23255* <:MLj:MLj>
23256* MLWorks
23257* <:Poplog:>
23258* http://www.cs.cornell.edu/Info/People/jgm/til.tar.Z[TIL]
23259
23260<<<
23261
23262:mlton-guide-page: StandardMLPortability
23263[[StandardMLPortability]]
23264StandardMLPortability
23265=====================
23266
23267Technically, SML'97 as defined in the
23268<:DefinitionOfStandardML:Definition>
23269requires only a minimal initial basis, which, while including the
23270types `int`, `real`, `char`, and `string`, need have
23271no operations on those base types. Hence, the only observable output
23272of an SML'97 program is termination or raising an exception. Most SML
23273compilers should agree there, to the degree each agrees with the
23274Definition. See <:UnresolvedBugs:> for MLton's very few corner cases.
23275
23276Realistically, a program needs to make use of the
23277<:BasisLibrary:Basis Library>.
23278Within the Basis Library, there are numerous places where the behavior
23279is implementation dependent. For a trivial example:
23280
23281[source,sml]
23282----
23283val _ = valOf (Int.maxInt)
23284----
23285
23286
23287may either raise the `Option` exception (if
23288`Int.maxInt == NONE`) or may terminate normally. The default
23289Int/Real/Word sizes are the biggest implementation dependent aspect;
23290so, one implementation may raise `Overflow` while another can
23291accommodate the result. Also, maximum array and vector lengths are
23292implementation dependent. Interfacing with the operating system is a
23293bit murky, and implementations surely differ in handling of errors
23294there.
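
A program that wants to stay portable can inspect such implementation-dependent values instead of assuming them. The following is a minimal sketch (the name `intInfo` is illustrative) that handles both possible shapes of `Int.maxInt`:

[source,sml]
----
val intInfo =
   case Int.maxInt of
      NONE => "int has arbitrary precision"
    | SOME n => "Int.maxInt = " ^ Int.toString n

val () = print (intInfo ^ "\n")
----

The text that is printed depends on the implementation and its configuration, which is exactly the point of the check.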
23295
23296<<<
23297
23298:mlton-guide-page: StandardMLTutorials
23299[[StandardMLTutorials]]
23300StandardMLTutorials
23301===================
23302
23303* http://www.dcs.napier.ac.uk/course-notes/sml/manual.html[A Gentle Introduction to ML].
Andrew Cumming.
23305
23306* http://www.dcs.ed.ac.uk/home/stg/NOTES/[Programming in Standard ML '97: An Online Tutorial].
23307Stephen Gilmore.
23308
23309* <!Cite(Harper11, Programming in Standard ML)>.
23310Robert Harper.
23311
23312* <!Cite(Tofte96, Essentials of Standard ML Modules)>.
23313Mads Tofte.
23314
23315* <!Cite(Tofte09, Tips for Computer Scientists on Standard ML (Revised))>.
23316Mads Tofte.
23317
23318<<<
23319
23320:mlton-guide-page: StaticSum
23321[[StaticSum]]
23322StaticSum
23323=========
23324
23325While SML makes it impossible to write functions whose types would
depend on the values of their arguments, or so-called dependently
23327typed functions, it is possible, and arguably commonplace, to write
23328functions whose types depend on the types of their arguments. Indeed,
23329the types of parametrically polymorphic functions like `map` and
23330`foldl` can be said to depend on the types of their arguments. What
23331is less commonplace, however, is to write functions whose behavior
23332would depend on the types of their arguments. Nevertheless, there are
23333several techniques for writing such functions.
23334<:TypeIndexedValues:Type-indexed values> and <:Fold:fold> are two such
23335techniques. This page presents another such technique dubbed static
23336sums.
23337
23338
23339== Ordinary Sums ==
23340
23341Consider the sum type as defined below:
23342[source,sml]
23343----
23344structure Sum = struct
23345 datatype ('a, 'b) t = INL of 'a | INR of 'b
23346end
23347----
23348
23349While a generic sum type such as defined above is very useful, it has
23350a number of limitations. As an example, we could write the function
23351`out` to extract the value from a sum as follows:
23352[source,sml]
23353----
23354fun out (s : ('a, 'a) Sum.t) : 'a =
23355 case s
23356 of Sum.INL a => a
23357 | Sum.INR a => a
23358----
23359
23360As can be seen from the type of `out`, it is limited in the sense that
23361it requires both variants of the sum to have the same type. So, `out`
23362cannot be used to extract the value of a sum of two different types,
23363such as the type `(int, real) Sum.t`. As another example of a
23364limitation, consider the following attempt at a `succ` function:
23365[source,sml]
23366----
23367fun succ (s : (int, real) Sum.t) : ??? =
23368 case s
23369 of Sum.INL i => i + 1
23370 | Sum.INR r => Real.nextAfter (r, Real.posInf)
23371----
23372
23373The above definition of `succ` cannot be typed, because there is no
23374type for the codomain within SML.
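
One way to make such a function type check with ordinary sums, sketched here under the hypothetical name `succSum`, is to wrap the result in a sum again. The caller then has to `case` on the result, even when it statically knows which variant it passed in:

[source,sml]
----
structure Sum = struct
   datatype ('a, 'b) t = INL of 'a | INR of 'b
end

fun succSum (s : (int, real) Sum.t) : (int, real) Sum.t =
   case s
    of Sum.INL i => Sum.INL (i + 1)
     | Sum.INR r => Sum.INR (Real.nextAfter (r, Real.posInf))

(* The INR rule can never be taken here, but it still has to be written. *)
val two =
   case succSum (Sum.INL 1)
    of Sum.INL i => i
     | Sum.INR _ => raise Fail "unreachable for an INL argument"
----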
23375
23376
23377== Static Sums ==
23378
23379Interestingly, it is possible to define values `inL`, `inR`, and
23380`match` that satisfy the laws
23381----
23382match (inL x) (f, g) = f x
23383match (inR x) (f, g) = g x
23384----
and do not suffer from the same limitations. The definitions are
23386actually quite trivial:
23387[source,sml]
23388----
23389structure StaticSum = struct
23390 fun inL x (f, _) = f x
23391 fun inR x (_, g) = g x
23392 fun match x = x
23393end
23394----
23395
23396Now, given the `succ` function defined as
23397[source,sml]
23398----
23399fun succ s =
23400 StaticSum.match s
23401 (fn i => i + 1,
23402 fn r => Real.nextAfter (r, Real.posInf))
23403----
23404we get
23405[source,sml]
23406----
23407succ (StaticSum.inL 1) = 2
23408succ (StaticSum.inR Real.maxFinite) = Real.posInf
23409----
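
To see why the laws hold, note that `inL x` is simply a function that waits for a pair of handlers and applies the first one to `x`, while `match` is the identity. The sketch below repeats the definitions so that it is self-contained and spells out the reduction in a comment; the bindings `n` and `s` are illustrative.

[source,sml]
----
structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
end

(*   match (inL 1) (f, g)
 * = inL 1 (f, g)          since match x = x
 * = f 1                   by the definition of inL
 *)
val n : int =
   StaticSum.match (StaticSum.inL 1) (fn i => i + 1, Real.toString)

(* The same pair of handlers yields a string for a right value. *)
val s : string =
   StaticSum.match (StaticSum.inR 2.5) (fn i => i + 1, Real.toString)
----

Here `n` is `2` and `s` is `"2.5"`: the result type follows the handler that is selected, which an ordinary sum cannot express.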
23410
23411To better understand how this works, consider the following signature
23412for static sums:
23413[source,sml]
23414----
23415structure StaticSum :> sig
23416 type ('dL, 'cL, 'dR, 'cR, 'c) t
23417 val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t
23418 val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t
23419 val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c
23420end = struct
23421 type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c
23422 open StaticSum
23423end
23424----
23425
23426Above, `'d` stands for domain and `'c` for codomain. The key
23427difference between an ordinary sum type, like `(int, real) Sum.t`, and
23428a static sum type, like `(int, real, real, int, real) StaticSum.t`, is
23429that the ordinary sum type says nothing about the type of the result
23430of deconstructing a sum while the static sum type specifies the type.
23431
23432With the sealed static sum module, we get the type
23433[source,sml]
23434----
23435val succ : (int, int, real, real, 'a) StaticSum.t -> 'a
23436----
23437for the previously defined `succ` function. The type specifies that
23438`succ` maps a left `int` to an `int` and a right `real` to a `real`.
23439For example, the type of `StaticSum.inL 1` is
23440`(int, 'cL, 'dR, 'cR, 'cL) StaticSum.t`. Unifying this with the
23441argument type of `succ` gives the type `(int, int, real, real, int)
23442StaticSum.t -> int`.
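
The unification step can be checked by writing the instantiated types out by hand. The following sketch seals the module directly (instead of using `open StaticSum`) so that it is self-contained; the bindings `l`, `r`, `two`, and `next` are illustrative.

[source,sml]
----
structure StaticSum :> sig
   type ('dL, 'cL, 'dR, 'cR, 'c) t
   val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t
   val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t
   val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c
end = struct
   type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
end

fun succ s =
   StaticSum.match s
   (fn i => i + 1,
    fn r => Real.nextAfter (r, Real.posInf))

(* The last type argument records which codomain the value selects. *)
val l : (int, int, real, real, int) StaticSum.t = StaticSum.inL 1
val r : (int, int, real, real, real) StaticSum.t = StaticSum.inR 1.0

val two : int = succ l
val next : real = succ r
----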
23443
23444The `out` function is quite useful on its own. Here is how it can be
23445defined:
23446[source,sml]
23447----
23448structure StaticSum = struct
23449 open StaticSum
23450 val out : ('a, 'a, 'b, 'b, 'c) t -> 'c =
23451 fn s => match s (fn x => x, fn x => x)
23452end
23453----
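
Unlike the `out` function for ordinary sums shown earlier, the result type of this `out` follows the side that was injected. The sketch below uses the unsealed definitions, with `out` written without a type annotation, to keep the block self-contained; the bindings `one` and `text` are illustrative.

[source,sml]
----
structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
   fun out s = match s (fn x => x, fn x => x)
end

val one : int = StaticSum.out (StaticSum.inL 1)
val text : string = StaticSum.out (StaticSum.inR "right")
----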
23454
23455Due to the value restriction, lack of first class polymorphism and
23456polymorphic recursion, the usefulness and convenience of static sums
23457is somewhat limited in SML. So, don't throw away the ordinary sum
23458type just yet. Static sums can nevertheless be quite useful.
23459
23460
23461=== Example: Send and Receive with Argument Type Dependent Result Types ===
23462
23463In some situations it would seem useful to define functions whose
23464result type would depend on some of the arguments. Traditionally such
23465functions have been thought to be impossible in SML and the solution
23466has been to define multiple functions. For example, the
23467http://www.standardml.org/Basis/socket.html[`Socket` structure] of the
23468Basis library defines 16 `send` and 16 `recv` functions. In contrast,
23469the Net structure
23470(<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sig)>) of the
23471Basic library designed by Stephen Weeks defines only a single `send`
23472and a single `receive` and the result types of the functions depend on
23473their arguments. The implementation
23474(<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sml)>) uses
static sums (with a slightly different signature:
23476<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/static-sum.sig)>).
23477
23478
23479=== Example: Picking Monad Results ===
23480
23481Suppose that we need to write a parser that accepts a pair of integers
23482and returns their sum given a monadic parsing combinator library. A
part of the signature of such a library could look like this:
23484[source,sml]
23485----
23486signature PARSING = sig
23487 include MONAD
23488 val int : int t
23489 val lparen : unit t
23490 val rparen : unit t
23491 val comma : unit t
23492 (* ... *)
23493end
23494----
23495where the `MONAD` signature could be defined as
23496[source,sml]
23497----
23498signature MONAD = sig
23499 type 'a t
23500 val return : 'a -> 'a t
23501 val >>= : 'a t * ('a -> 'b t) -> 'b t
23502end
23503infix >>=
23504----
23505
23506The straightforward, but tedious, way to write the desired parser is:
23507[source,sml]
23508----
23509val p = lparen >>= (fn _ =>
23510 int >>= (fn x =>
23511 comma >>= (fn _ =>
23512 int >>= (fn y =>
23513 rparen >>= (fn _ =>
23514 return (x + y))))))
23515----
23516
23517In Haskell, the parser could be written using the `do` notation
23518considerably less verbosely as:
23519[source,haskell]
23520----
23521p = do { lparen ; x <- int ; comma ; y <- int ; rparen ; return $ x + y }
23522----
23523
23524SML doesn't provide a `do` notation, so we need another solution.
23525
Suppose we had a "pick" notation for monads that would allow
23527us to write the parser as
23528[source,sml]
23529----
23530val p = `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y)
23531----
23532using four auxiliary combinators: +&grave;+, `\`, `^`, and `@`.
23533
23534Roughly speaking
23535
23536* +&grave;p+ means that the result of `p` is dropped,
23537* `\p` means that the result of `p` is taken,
23538* `p ^ q` means that results of `p` and `q` are taken as a product, and
23539* `p @ a` means that the results of `p` are passed to the function `a` and that result is returned.
23540
23541The difficulty is in implementing the concatenation combinator `^`.
23542The type of the result of the concatenation depends on the types of
23543the arguments.
23544
23545Using static sums and the <:ProductType:product type>, the pick
23546notation for monads can be implemented as follows:
23547[source,sml]
23548----
23549functor MkMonadPick (include MONAD) = let
23550 open StaticSum
23551in
23552 struct
23553 fun `a = inL (a >>= (fn _ => return ()))
23554 val \ = inR
23555 fun a @ f = out a >>= (return o f)
23556 fun a ^ b =
23557 (match b o match a)
23558 (fn a =>
23559 (fn b => inL (a >>= (fn _ => b)),
23560 fn b => inR (a >>= (fn _ => b))),
23561 fn a =>
23562 (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))),
23563 fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b))))))
23564 end
23565end
23566----
23567
23568The above implementation is inefficient, however. It uses many more
23569bind operations, `>>=`, than necessary. That can be solved with an
23570additional level of abstraction:
23571[source,sml]
23572----
23573functor MkMonadPick (include MONAD) = let
23574 open StaticSum
23575in
23576 struct
23577 fun `a = inL (fn b => a >>= (fn _ => b ()))
23578 fun \a = inR (fn b => a >>= b)
23579 fun a @ f = out a (return o f)
23580 fun a ^ b =
23581 (match b o match a)
23582 (fn a => (fn b => inL (fn c => a (fn () => b c)),
23583 fn b => inR (fn c => a (fn () => b c))),
23584 fn a => (fn b => inR (fn c => a (fn a => b (fn () => c a))),
23585 fn b => inR (fn c => a (fn a => b (fn b => c (a & b))))))
23586 end
23587end
23588----
23589
23590After instantiating and opening either of the above monad pick
23591implementations, the previously given definition of `p` can be
23592compiled and results in a parser whose result is of type `int`. Here
23593is a functor to test the theory:
23594[source,sml]
23595----
23596functor Test (Arg : PARSING) = struct
23597 local
23598 structure Pick = MkMonadPick (Arg)
23599 open Pick Arg
23600 in
23601 val p : int t =
23602 `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y)
23603 end
23604end
23605----
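
To try the notation end to end, the sketch below instantiates the first `MkMonadPick` implementation with a toy parser monad over `char list`. The `Parser` structure, its `lit` helper, and the bindings `p` and `result` are assumptions made for this illustration; they are not part of any library. `StaticSum` is used in its unsealed form, with `out` written without a type annotation, and the earlier definitions are repeated so that the block is self-contained.

[source,sml]
----
datatype ('a, 'b) product = & of 'a * 'b
infix &

structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
   fun out s = match s (fn x => x, fn x => x)
end

signature MONAD = sig
   type 'a t
   val return : 'a -> 'a t
   val >>= : 'a t * ('a -> 'b t) -> 'b t
end
infix >>=

(* The first MkMonadPick implementation from above, repeated. *)
functor MkMonadPick (include MONAD) = let
   open StaticSum
in
   struct
      fun `a = inL (a >>= (fn _ => return ()))
      val \ = inR
      fun a @ f = out a >>= (return o f)
      fun a ^ b =
         (match b o match a)
            (fn a =>
                (fn b => inL (a >>= (fn _ => b)),
                 fn b => inR (a >>= (fn _ => b))),
             fn a =>
                (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))),
                 fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b))))))
   end
end

(* Hypothetical toy parser monad: a parser consumes a prefix of a
 * char list and returns a result together with the remaining input. *)
structure Parser = struct
   type 'a t = char list -> ('a * char list) option
   fun return x cs = SOME (x, cs)
   fun (p >>= f) cs =
      case p cs of
         NONE => NONE
       | SOME (x, rest) => f x rest
   (* lit c accepts exactly the character c and returns unit. *)
   fun lit c (x :: xs) = if x = c then SOME ((), xs) else NONE
     | lit _ [] = NONE
   val int : int t = Int.scan StringCvt.DEC List.getItem
   val lparen : unit t = lit #"("
   val rparen : unit t = lit #")"
   val comma : unit t = lit #","
end

structure Pick = MkMonadPick (Parser)

local
   open Pick Parser
in
   val p : int t =
      `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y)
end

val result = p (String.explode "(1,2)")
----

With this toy parser, `result` is `SOME (3, [])`: both integers are picked, the punctuation is dropped, and no input remains. Opening `Pick` shadows the usual string `^` and list `@` operators, so the `open` is kept inside a `local` declaration.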
23606
23607
23608== Also see ==
23609
23610There are a number of related techniques. Here are some of them.
23611
23612* <:Fold:>
23613* <:TypeIndexedValues:>
23614
23615<<<
23616
23617:mlton-guide-page: StephenWeeks
23618[[StephenWeeks]]
23619StephenWeeks
23620============
23621
23622I live in the New York City area and work at http://janestcapital.com[Jane Street Capital].
23623
23624My http://sweeks.com/[home page].
23625
23626You can email me at sweeks@sweeks.com.
23627
23628<<<
23629
23630:mlton-guide-page: StyleGuide
23631[[StyleGuide]]
23632StyleGuide
23633==========
23634
23635These conventions are chosen so that inertia is towards modularity, code reuse and finding bugs early, _not_ to save typing.
23636
23637* <:SyntacticConventions:>
23638
23639<<<
23640
23641:mlton-guide-page: Subversion
23642[[Subversion]]
23643Subversion
23644==========
23645
23646http://subversion.apache.org/[Subversion] is a version control system.
23647The MLton project used Subversion to maintain its
23648<:Sources:source code>, but switched to <:Git:> on 20130308.
23649
23650Here are some online Subversion resources.
23651
23652* http://svnbook.red-bean.com[Version Control with Subversion]
23653
23654<<<
23655
23656:mlton-guide-page: SuccessorML
23657[[SuccessorML]]
23658SuccessorML
23659===========
23660
23661The purpose of http://sml-family.org/successor-ml/[successor ML], or
23662sML for short, is to provide a vehicle for the continued evolution of
23663ML, using Standard ML as a starting point. The intention is for
23664successor ML to be a living, evolving dialect of ML that is responsive
23665to community needs and advances in language design, implementation,
23666and semantics.
23667
23668== SuccessorML Features in MLton ==
23669
23670The following SuccessorML features have been implemented in MLton.
23671The features are disabled by default, and may be enabled utilizing the
23672feature's corresponding <:MLBasisAnnotations:ML Basis annotation>
23673which is listed directly after the feature name. In addition, the
23674+allowSuccessorML {false|true}+ annotation can be used to
23675simultaneously enable all of the features.
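
As a sketch of how such an annotation is typically supplied (the file name `program.sml` is hypothetical), the features can be enabled for selected sources with an `ann` declaration in an ML Basis file:

----
$(SML_LIB)/basis/basis.mlb
ann "allowSuccessorML true" in
   program.sml
end
----

Alternatively, assuming the `-default-ann` compile-time option is used to change the default for all sources, the invocation would look like `mlton -default-ann 'allowSuccessorML true' program.sml`.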
23676
23677* <!Anchor(DoDecls)>
23678`do` Declarations: +allowDoDecls {false|true}+
23679+
23680Allow a +do _exp_+ declaration form, which evaluates _exp_ for its
23681side effects. The following example uses a `do` declaration:
23682+
23683[source,sml]
23684----
23685do print "Hello world.\n"
23686----
23687+
23688and is equivalent to:
23689+
23690[source,sml]
23691----
23692val () = print "Hello world.\n"
23693----
23694
23695* <!Anchor(ExtendedConsts)>
23696Extended Constants: +allowExtendedConsts {false|true}+
23697+
23698--
23699Allow or disallow all of the extended constants features. This is a
23700proxy for all of the following annotations.
23701
23702** <!Anchor(ExtendedNumConsts)>
23703Extended Numeric Constants: +allowExtendedNumConsts {false|true}+
23704+
23705Allow underscores as a separator in numeric constants and allow binary
23706integer and word constants.
23707+
23708Underscores in a numeric constant must occur between digits and
23709consecutive underscores are allowed.
23710+
23711Binary integer constants use the prefix +0b+ and binary word constants
23712use the prefix +0wb+.
23713+
23714The following example uses extended numeric constants (although it may
23715be incorrectly syntax highlighted):
23716+
23717[source,sml]
23718----
23719val pb = 0b10101
23720val nb = ~0b10_10_10
23721val wb = 0wb1010
23722val i = 4__327__829
23723val r = 6.022_140_9e23
23724----
23725
23726** <!Anchor(ExtendedTextConsts)> Extended Text Constants: +allowExtendedTextConsts {false|true}+
23727+
23728Allow characters with integer codes &ge; 128 and &le; 247 that
23729correspond to syntactically well-formed UTF-8 byte sequences in text
23730constants.
23731+
23732////
23733and allow `\Uxxxxxxxx` numeric escapes in text constants.
23734////
23735+
23736Any 1, 2, 3, or 4 byte sequence that can be properly decoded to a
23737binary number according to the UTF-8 encoding/decoding scheme is
23738allowed in a text constant (but invalid sequences are not explicitly
23739rejected) and denotes the corresponding sequence of characters with
23740integer codes &ge; 128 and &le; 247. This feature enables "UTF-8
23741convenience" (but not comprehensive Unicode support); in particular,
23742it allows one to copy text from a browser and paste it into a string
23743constant in an editor and, furthermore, if the string is printed to a
terminal, then it will (typically) appear as the original text. The
23745following example uses UTF-8 byte sequences:
23746+
23747[source,sml]
23748----
23749val s1 : String.string = "\240\159\130\161"
23750val s2 : String.string = "🂡"
23751val _ = print ("s1 --> " ^ s1 ^ "\n")
23752val _ = print ("s2 --> " ^ s2 ^ "\n")
23753val _ = print ("String.size s1 --> " ^ Int.toString (String.size s1) ^ "\n")
23754val _ = print ("String.size s2 --> " ^ Int.toString (String.size s2) ^ "\n")
23755val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n")
23756----
23757+
23758and, when compiled and executed, will display:
23759+
23760----
23761s1 --> 🂡
23762s2 --> 🂡
23763String.size s1 --> 4
23764String.size s2 --> 4
23765s1 = s2 --> true
23766----
23767+
23768Note that the `String.string` type corresponds to any sequence of
237698-bit values, including invalid UTF-8 sequences; hence the string
23770constant `"\192"` (a UTF-8 leading byte with no UTF-8 continuation
23771byte) is valid. Similarly, the `Char.char` type corresponds to a
23772single 8-bit value; hence the char constant `#"α"` is not valid, as
23773the text constant `"α"` denotes a sequence of two 8-bit values.
23774+
23775////
23776A `\Uxxxxxxxx` numeric escape denotes a single character with the
23777hexadecimal integer code `xxxxxxxx`. Such numeric escapes are not
23778necessary for the `String.string` and `Char.char` types, since
23779characters in such text constants must have integer codes &le; 255 and
23780the `\ddd` and `\uxxxx` numeric escapes suffice. However, the
23781`\Uxxxxxxxx` numeric escapes are useful for the `WideString.string`
23782and `WideChar.char` types, since characters in such text constants may
23783have integer codes &le; 2^32^-1. The following uses a `\Uxxxxxxxx`
23784numeric escape (although it may be incorrectly syntax highlighted):
23785+
23786[source,sml]
23787----
23788val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *)
23789val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n")
23790----
23791+
23792and, when compiled and executed, will display:
23793+
23794----
23795WideString.size s1 --> 1
23796----
23797+
23798Note that the `WideString.string` type corresponds to any sequence of
2379932-bit values, including invalid Unicode code points; hence, the
23800string constants `"\U001F0000"` and `"\U40000000"` are valid (but the
23801corresponding integer codes are not valid Unicode code points).
23802Similarly, the `WideChar.char` type corresponds to a single 32-bit
23803value.
23804+
23805Finally, note that a UTF-8 byte sequence in a `WideString.string` or
23806`WideChar.char` text constant does not denote a single 32-bit value,
23807but rather a sequence of 32-bit values &ge; 128 and &le; 247. The
23808following example uses both UTF-8 byte sequences and `\Uxxxxxxxx`
23809numeric escapes (although it may be incorrectly syntax highlighted):
23810+
23811[source,sml]
23812----
23813val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *)
23814val s2 : WideString.string = "🂡"
23815val s3 : WideString.string = "\U000000F0\U0000009F\U00000082\U000000A1"
23816val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n")
23817val _ = print ("WideString.size s2 --> " ^ Int.toString (WideString.size s2) ^ "\n")
23818val _ = print ("WideString.size s3 --> " ^ Int.toString (WideString.size s3) ^ "\n")
23819val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n")
23820val _ = print ("s2 = s3 --> " ^ Bool.toString (s2 = s3) ^ "\n")
23821----
23822+
23823and, when compiled and executed, will display:
23824+
23825----
23826WideString.size s1 --> 1
23827WideString.size s2 --> 4
23828WideString.size s3 --> 4
23829s1 = s2 --> false
23830s2 = s3 --> true
23831----
23832////
23833--
23834
23835* <!Anchor(LineComments)>
23836Line Comments: +allowLineComments {false|true}+
23837+
23838Allow line comments beginning with the token ++(*)++. The following
23839example uses a line comment:
23840+
23841[source,sml]
23842----
23843(*) This is a line comment
23844----
23845+
23846Line comments properly nest within block comments. The following
23847example uses line comments nested within block comments:
23848+
23849[source,sml]
23850----
23851(*
23852val x = 4 (*) This is a line comment
23853*)
23854
23855(*
23856val y = 5 (*) This is a line comment *)
23857*)
23858----
23859
23860* <!Anchor(OptBar)>
23861Optional Pattern Bars: +allowOptBar {false|true}+
23862+
23863Allow a bar to appear before the first match rule of a `case`, `fn`,
23864or `handle` expression, allow a bar to appear before the first
23865function-value binding of a `fun` declaration, and allow a bar to
23866appear before the first constructor binding or description of a
23867`datatype` declaration or specification. The following example uses
23868leading bars in a `datatype` declaration, a `fun` declaration, and a
23869`case` expression:
23870+
23871[source,sml]
23872----
23873datatype t =
23874 | C
23875 | B
23876 | A
23877
23878fun
23879 | f NONE = 0
23880 | f (SOME t) =
23881 (case t of
23882 | A => 1
23883 | B => 2
23884 | C => 3)
23885----
23886+
23887By eliminating the special case of the first element, this feature
23888allows for simpler refactoring (e.g., sorting the lines of the
23889`datatype` declaration's constructor bindings to put the constructors
23890in alphabetical order).
23891
23892* <!Anchor(OptSemicolon)>
23893Optional Semicolons: +allowOptSemicolon {false|true}+
23894+
23895Allow a semicolon to appear after the last expression in a sequence or
23896`let`-body expression. The following example uses a trailing
23897semicolon in the body of a `let` expression:
23898+
23899[source,sml]
23900----
23901fun h z =
23902 let
23903 val x = 3 * z
23904 in
23905 f x ;
23906 g x ;
23907 end
23908----
23909+
23910By eliminating the special case of the last element, this feature
23911allows for simpler refactoring.
23912
23913* <!Anchor(OrPats)>
23914Disjunctive (Or) Patterns: +allowOrPats {false|true}+
23915+
23916Allow disjunctive (a.k.a., "or") patterns of the form +_pat~1~_ |
23917_pat~2~_+, which matches a value that matches either +_pat~1~_+ or
23918+_pat~2~_+. Disjunctive patterns have lower precedence than `as`
23919patterns and constraint patterns, much as `orelse` expressions have
23920lower precedence than `andalso` expressions and constraint
23921expressions. Both sub-patterns of a disjunctive pattern must bind the
23922same variables with the same types. The following example uses
23923disjunctive patterns:
23924+
23925[source,sml]
23926----
23927datatype t = A of int | B of int | C of int | D of int * int | E of int * int
23928
23929fun f t =
23930 case t of
23931 A x | B x | C x => x + 1
23932 | D (x, _) | E (_, x) => x * 2
23933----
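+
As a rough illustration of the rule that both sub-patterns must bind the same variables, the function above can be read as shorthand for one match rule per alternative. The sketch below assumes the same datatype and uses the hypothetical name `fExpanded`; it is plain Standard ML and needs no annotation:
+
[source,sml]
----
datatype t = A of int | B of int | C of int | D of int * int | E of int * int

(* One rule per alternative of each disjunctive pattern. *)
fun fExpanded t =
   case t of
      A x => x + 1
    | B x => x + 1
    | C x => x + 1
    | D (x, _) => x * 2
    | E (_, x) => x * 2
----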
23934
23935* <!Anchor(RecordPunExps)>
23936Record Punning Expressions: +allowRecordPunExps {false|true}+
23937+
23938Allow record punning expressions, whereby an identifier +_vid_+ as an
23939expression row in a record expression denotes the expression row
23940+_vid_ = _vid_+ (i.e., treating a label as a variable). The following
23941example uses record punning expressions (and also record punning
23942patterns):
23943+
23944[source,sml]
23945----
23946fun incB r =
23947 case r of {a, b, c} => {a, b = b + 1, c}
23948----
23949+
23950and is equivalent to:
23951+
23952[source,sml]
23953----
23954fun incB r =
23955 case r of {a = a, b = b, c = c} => {a = a, b = b + 1, c = c}
23956----
23957
23958* <!Anchor(SigWithtype)>
23959`withtype` in Signatures: +allowSigWithtype {false|true}+
23960+
23961Allow `withtype` to modify a `datatype` specification in a signature.
23962The following example uses `withtype` in a signature (and also
23963`withtype` in a declaration):
23964+
23965[source,sml]
23966----
23967signature STREAM =
23968 sig
23969 datatype 'a u = Nil | Cons of 'a * 'a t
23970 withtype 'a t = unit -> 'a u
23971 end
23972structure Stream : STREAM =
23973 struct
23974 datatype 'a u = Nil | Cons of 'a * 'a t
23975 withtype 'a t = unit -> 'a u
23976 end
23977----
23978+
23979and is equivalent to:
23980+
23981[source,sml]
23982----
23983signature STREAM =
23984 sig
23985 datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
23986 type 'a t = unit -> 'a u
23987 end
23988structure Stream : STREAM =
23989 struct
23990 datatype 'a u = Nil | Cons of 'a * (unit -> 'a u)
23991 type 'a t = unit -> 'a u
23992 end
23993----
23994
23995* <!Anchor(VectorExpsAndPats)>
23996Vector Expressions and Patterns: +allowVectorExpsAndPats {false|true}+
23997+
23998--
23999Allow or disallow vector expressions and vector patterns. This is a
24000proxy for all of the following annotations.
24001
24002** <!Anchor(VectorExps)>
24003Vector Expressions: +allowVectorExps {false|true}+
24004+
24005Allow vector expressions of the form +#[_exp~0~_, _exp~1~_, ..., _exp~n-1~_]+ (where _n ≥ 0_). The expression has type +_τ_ vector+ when each expression _exp~i~_ has type +_τ_+.
24006
24007** <!Anchor(VectorPats)>
24008Vector Patterns: +allowVectorPats {false|true}+
24009+
24010Allow vector patterns of the form +#[_pat~0~_, _pat~1~_, ..., _pat~n-1~_]+ (where _n ≥ 0_). The pattern matches values of type +_τ_ vector+ when each pattern _pat~i~_ matches values of type +_τ_+.
24011--
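+
A minimal sketch of both forms, assuming the `allowVectorExpsAndPats` annotation (or both of the finer-grained annotations) is enabled for the file being compiled; the names `v` and `sum3` are made up for illustration:
+
[source,sml]
----
(* A vector expression of type int vector. *)
val v: int vector = #[1, 2, 3]

(* A vector pattern only matches vectors of exactly that length. *)
fun sum3 (w: int vector): int =
   case w of
      #[x, y, z] => x + y + z
    | _ => 0
----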
24012
24013<<<
24014
24015:mlton-guide-page: SureshJagannathan
24016[[SureshJagannathan]]
24017SureshJagannathan
24018=================
24019
24020I am an Associate Professor at the http://www.cs.purdue.edu/[Department of Computer Science] at Purdue University.
24021My research focus is in programming language design and implementation, concurrency,
and distributed systems. I am interested in various aspects of MLton, mostly related to (in no particular order): (1) control-flow analysis, (2) representation
24023strategies (e.g., flattening), (3) IR formats, and (4) extensions for distributed programming.
24024
24025
24026Please see my http://www.cs.purdue.edu/homes/suresh/index.html[Home page] for more details.
24027
24028<<<
24029
24030:mlton-guide-page: Swerve
24031[[Swerve]]
24032Swerve
24033======
24034
24035http://ftp.sun.ac.za/ftp/mirrorsites/ocaml/Systems_programming/book/c3253.html[Swerve]
24036is an HTTP server written in SML, originally developed with SML/NJ.
24037<:RayRacine:> ported Swerve to MLton in January 2005.
24038
24039<!Attachment(Swerve,swerve.tar.bz2,Download)> the port.
24040
24041Excerpt from the included `README`:
24042____
24043Total testing of this port consisted of a successful compile, startup,
24044and serving one html page with one gif image. Given that the original
24045code was throughly designed and implemented in a thoughtful manner and
24046I expect it is quite usable modulo a few minor bugs introduced by my
24047porting effort.
24048____
24049
24050Swerve is described in <!Cite(Shipman02)>.
24051
24052<<<
24053
24054:mlton-guide-page: SXML
24055[[SXML]]
24056SXML
24057====
24058
24059<:SXML:> is an <:IntermediateLanguage:>, translated from <:XML:> by
24060<:Monomorphise:>, optimized by <:SXMLSimplify:>, and translated by
24061<:ClosureConvert:> to <:SSA:>.
24062
24063== Description ==
24064
24065SXML is a simply-typed version of <:XML:>.
24066
24067== Implementation ==
24068
24069* <!ViewGitFile(mlton,master,mlton/xml/sxml.sig)>
24070* <!ViewGitFile(mlton,master,mlton/xml/sxml.fun)>
24071* <!ViewGitFile(mlton,master,mlton/xml/sxml-tree.sig)>
24072
24073== Type Checking ==
24074
24075<:SXML:> shares the type checker for <:XML:>.
24076
24077== Details and Notes ==
24078
24079There are only two differences between <:XML:> and <:SXML:>. First,
24080<:SXML:> `val`, `fun`, and `datatype` declarations always have an
24081empty list of type variables. Second, <:SXML:> variable references
always have an empty list of type arguments. Constructor uses can
24083only have a nonempty list of type arguments if the constructor is a
24084primitive.
24085
24086Although we could rely on the type system to enforce these constraints
24087by parameterizing the <:XML:> signature, <:StephenWeeks:> did so in a
24088previous version of the compiler, and the software engineering gains
24089were not worth the effort.
24090
24091<<<
24092
24093:mlton-guide-page: SXMLShrink
24094[[SXMLShrink]]
24095SXMLShrink
24096==========
24097
24098SXMLShrink is an optimization pass for the <:SXML:>
24099<:IntermediateLanguage:>, invoked from <:SXMLSimplify:>.
24100
24101== Description ==
24102
24103This pass performs optimizations based on a reduction system.
24104
24105== Implementation ==
24106
24107* <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)>
24108* <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)>
24109
24110== Details and Notes ==
24111
24112<:SXML:> shares the <:XMLShrink:> simplifier.
24113
24114<<<
24115
24116:mlton-guide-page: SXMLSimplify
24117[[SXMLSimplify]]
24118SXMLSimplify
24119============
24120
24121The optimization passes for the <:SXML:> <:IntermediateLanguage:> are
24122collected and controlled by the `SxmlSimplify` functor
24123(<!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.sig)>,
24124<!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.fun)>).
24125
24126The following optimization passes are implemented:
24127
24128* <:Polyvariance:>
24129* <:SXMLShrink:>
24130
24131The following implementation passes are implemented:
24132
24133* <:ImplementExceptions:>
24134* <:ImplementSuffix:>
24135
24136The following optimization passes are not implemented, but might prove useful:
24137
24138* <:Uncurry:>
24139* <:LambdaLift:>
24140
24141The optimization passes can be controlled from the command-line by the options
24142
24143* `-diag-pass <pass>` -- keep diagnostic info for pass
24144* `-disable-pass <pass>` -- skip optimization pass (if normally performed)
24145* `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
24146* `-keep-pass <pass>` -- keep the results of pass
24147* `-sxml-passes <passes>` -- sxml optimization passes
24148
24149<<<
24150
24151:mlton-guide-page: SyntacticConventions
24152[[SyntacticConventions]]
24153SyntacticConventions
24154====================
24155
24156Here are a number of syntactic conventions useful for programming in
24157SML.
24158
24159
24160== General ==
24161
24162* A line of code never exceeds 80 columns.
24163
24164* Only split a syntactic entity across multiple lines if it doesn't fit on one line within 80 columns.
24165
24166* Use alphabetical order wherever possible.
24167
24168* Avoid redundant parentheses.
24169
24170* When using `:`, there is no space before the colon, and a single space after it.
24171
24172
24173== Identifiers ==
24174
* Variables, record labels, and type constructors begin with a
lowercase letter and use capital letters only to separate words.
24177+
24178[source,sml]
24179----
24180cost
24181maxValue
24182----
24183
24184* Variables that represent collections of objects (lists, arrays,
24185vectors, ...) are often suffixed with an `s`.
24186+
24187[source,sml]
24188----
24189xs
24190employees
24191----
24192
24193* Constructors, structure identifiers, and functor identifiers begin
24194with a capital letter.
24195+
24196[source,sml]
24197----
24198Queue
24199LinkedList
24200----
24201
24202* Signature identifiers are in all capitals, using `_` to separate
24203words.
24204+
24205[source,sml]
24206----
24207LIST
24208BINARY_HEAP
24209----
24210
24211
24212== Types ==
24213
24214* Alphabetize record labels. In a record type, there are spaces after
24215colons and commas, but not before colons or commas, or at the
24216delimiters `{` and `}`.
24217+
24218[source,sml]
24219----
24220{bar: int, foo: int}
24221----
24222
24223* Only split a record type across multiple lines if it doesn't fit on
24224one line. If a record type must be split over multiple lines, put one
24225field per line.
24226+
24227[source,sml]
24228----
24229{bar: int,
24230 foo: real * real,
24231 zoo: bool}
24232----
24233
24234
24235* In a tuple type, there are spaces before and after each `*`.
24236+
24237[source,sml]
24238----
24239int * bool * real
24240----
24241
24242* Only split a tuple type across multiple lines if it doesn't fit on
24243one line. In a tuple type split over multiple lines, there is one
24244type per line, and the `*`-s go at the beginning of the lines.
24245+
24246[source,sml]
24247----
24248int
24249* bool
24250* real
24251----
24252+
24253It may also be useful to parenthesize to make the grouping more
24254apparent.
24255+
24256[source,sml]
24257----
24258(int
24259 * bool
24260 * real)
24261----
24262
24263* In an arrow type split over multiple lines, put the arrow at the
24264beginning of its line.
24265+
24266[source,sml]
24267----
24268int * real
24269-> bool
24270----
24271+
24272It may also be useful to parenthesize to make the grouping more
24273apparent.
24274+
24275[source,sml]
24276----
24277(int * real
24278 -> bool)
24279----
24280
24281* Avoid redundant parentheses.
24282
24283* Arrow types associate to the right, so write
24284+
24285[source,sml]
24286----
24287a -> b -> c
24288----
24289+
24290not
24291+
24292[source,sml]
24293----
24294a -> (b -> c)
24295----
24296
24297* Type constructor application associates to the left, so write
24298+
24299[source,sml]
24300----
24301int ref list
24302----
24303+
24304not
24305+
24306[source,sml]
24307----
24308(int ref) list
24309----
24310
24311* Type constructor application binds more tightly than a tuple type,
24312so write
24313+
24314[source,sml]
24315----
24316int list * bool list
24317----
24318+
24319not
24320+
24321[source,sml]
24322----
24323(int list) * (bool list)
24324----
24325
24326* Tuple types bind more tightly than arrow types, so write
24327+
24328[source,sml]
24329----
24330int * bool -> real
24331----
24332+
24333not
24334+
24335[source,sml]
24336----
24337(int * bool) -> real
24338----
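
Taken together, the precedence rules above mean that a type written without parentheses has a single reading. The following sketch (the names `t1`, `t2`, and `same` are hypothetical) typechecks only if the two spellings denote the same type:

[source,sml]
----
(* Written following the conventions above. *)
type t1 = int ref list * bool -> real

(* The same type with every grouping made explicit. *)
type t2 = (((int ref) list) * bool) -> real

(* The identity function can be given this type only if t1 = t2. *)
val same: t1 -> t2 = fn f => f
----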
24339
24340
24341== Core ==
24342
24343* A core expression or declaration split over multiple lines does not
24344contain any blank lines.
24345
24346* A record field selector has no space between the `#` and the record
24347label. So, write
24348+
24349[source,sml]
24350----
24351#foo
24352----
24353+
24354not
24355+
24356[source,sml]
24357----
24358# foo
24359----
24361
24362* A tuple has a space after each comma, but not before, and not at the
24363delimiters `(` and `)`.
24364+
24365[source,sml]
24366----
24367(e1, e2, e3)
24368----
24369
24370* A tuple split over multiple lines has one element per line, and the
24371commas go at the end of the lines.
24372+
24373[source,sml]
24374----
24375(e1,
24376 e2,
24377 e3)
24378----
24379
24380* A list has a space after each comma, but not before, and not at the
24381delimiters `[` and `]`.
24382+
24383[source,sml]
24384----
24385[e1, e2, e3]
24386----
24387
* A list split over multiple lines has one element per line, and the
commas go at the end of the lines.
24390+
24391[source,sml]
24392----
24393[e1,
24394 e2,
24395 e3]
24396----
24397
24398* A record has spaces before and after `=`, a space after each comma,
24399but not before, and not at the delimiters `{` and `}`. Field names
24400appear in alphabetical order.
24401+
24402[source,sml]
24403----
24404{bar = 13, foo = true}
24405----
24406
24407* A sequence expression has a space after each semicolon, but not before.
24408+
24409[source,sml]
24410----
24411(e1; e2; e3)
24412----
24413
* A sequence expression split over multiple lines has one expression
per line, and the semicolons go at the beginning of the lines. Lisp and
Scheme programmers may find this hard to read at first.
24417+
24418[source,sml]
24419----
24420(e1
24421 ; e2
24422 ; e3)
24423----
24424+
24425_Rationale_: this makes it easy to visually spot the beginning of each
24426expression, which becomes more valuable as the expressions themselves
24427are split across multiple lines.
24428
24429* An application expression has a space between the function and the
24430argument. There are no parens unless the argument is a tuple (in
24431which case the parens are really part of the tuple, not the
24432application).
24433+
24434[source,sml]
24435----
24436f a
24437f (a1, a2, a3)
24438----
24439
* Avoid redundant parentheses. Application associates to the left, so
write
24442+
24443[source,sml]
24444----
24445f a1 a2 a3
24446----
24447+
24448not
24449+
24450[source,sml]
24451----
24452((f a1) a2) a3
24453----
24454
24455* Infix operators have a space before and after the operator.
24456+
24457[source,sml]
24458----
24459x + y
24460x * y - z
24461----
24462
24463* Avoid redundant parentheses. Use <:OperatorPrecedence:>. So, write
24464+
24465[source,sml]
24466----
24467x + y * z
24468----
24469+
24470not
24471+
24472[source,sml]
24473----
24474x + (y * z)
24475----
24476
24477* An `andalso` expression split over multiple lines has the `andalso`
24478at the beginning of subsequent lines.
24479+
24480[source,sml]
24481----
24482e1
24483andalso e2
24484andalso e3
24485----
24486
24487* A `case` expression is indented as follows
24488+
[source,sml]
----
case e1 of
   p1 => e1
 | p2 => e2
 | p3 => e3
----
24496
24497* A `datatype`'s constructors are alphabetized.
24498+
24499[source,sml]
24500----
24501datatype t = A | B | C
24502----
24503
24504* A `datatype` declaration has a space before and after each `|`.
24505+
24506[source,sml]
24507----
24508datatype t = A | B of int | C
24509----
24510
24511* A `datatype` split over multiple lines has one constructor per line,
24512with the `|` at the beginning of lines and the constructors beginning
245133 columns to the right of the `datatype`.
24514+
[source,sml]
----
datatype t =
   A
 | B
 | C
----
24522
24523* A `fun` declaration may start its body on the subsequent line,
24524indented 3 spaces.
24525+
[source,sml]
----
fun f x y =
   let
      val z = x + y
   in
      z
   end
----
24535
24536* An `if` expression is indented as follows.
24537+
[source,sml]
----
if e1
   then e2
else e3
----
24544
24545* A sequence of `if`-`then`-`else`-s is indented as follows.
24546+
[source,sml]
----
if e1
   then e2
else if e3
   then e4
else if e5
   then e6
else e7
----
24557
24558* A `let` expression has the `let`, `in`, and `end` on their own
24559lines, starting in the same column. Declarations and the body are
24560indented 3 spaces.
24561+
[source,sml]
----
let
   val x = 13
   val y = 14
in
   x + y
end
----
24571
24572* A `local` declaration has the `local`, `in`, and `end` on their own
24573lines, starting in the same column. Declarations are indented 3
24574spaces.
24575+
[source,sml]
----
local
   val x = 13
in
   val y = x
end
----
24584
24585* An `orelse` expression split over multiple lines has the `orelse` at
24586the beginning of subsequent lines.
24587+
24588[source,sml]
24589----
24590e1
24591orelse e2
24592orelse e3
24593----
24594
24595* A `val` declaration has a space before and after the `=`.
24596+
24597[source,sml]
24598----
24599val p = e
24600----
24601
24602* A `val` declaration can start the expression on the subsequent line,
24603indented 3 spaces.
24604+
[source,sml]
----
val p =
   if e1 then e2 else e3
----
24610
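As a combined illustration, the following sketch applies several of the conventions above (spacing around `:` and `=`, the `let` layout, and a chain of `if`-`then`-`else`-s). The function `classify` is invented for the example:

[source,sml]
----
fun classify (xs: int list): string =
   let
      val n = length xs
      val total = foldl (op +) 0 xs
   in
      if n = 0
         then "empty"
      else if total < 0
         then "negative"
      else "nonnegative"
   end
----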
24611
24612== Signatures ==
24613
24614* A `signature` declaration is indented as follows.
24615+
[source,sml]
----
signature FOO =
   sig
      val x: int
   end
----
24623+
_Exception_: a signature declaration in a file by itself can omit the
indentation to save horizontal space.
24626+
24627[source,sml]
24628----
24629signature FOO =
24630sig
24631
24632val x: int
24633
24634end
24635----
24636+
24637In this case, there should be a blank line after the `sig` and before
24638the `end`.
24639
24640* A `val` specification has a space after the colon, but not before.
24641+
24642[source,sml]
24643----
24644val x: int
24645----
24646+
24647_Exception_: in the case of operators (like `+`), there is a space
24648before the colon to avoid lexing the colon as part of the operator.
24649+
24650[source,sml]
24651----
24652val + : t * t -> t
24653----
24654
24655* Alphabetize specifications in signatures.
24656+
[source,sml]
----
sig
   val x: int
   val y: bool
end
----
24664
24665
24666== Structures ==
24667
24668* A `structure` declaration has a space on both sides of the `=`.
24669+
24670[source,sml]
24671----
24672structure Foo = Bar
24673----
24674
24675* A `structure` declaration split over multiple lines is indented as
24676follows.
24677+
[source,sml]
----
structure S =
   struct
      val x = 13
   end
----
24685+
_Exception_: a structure declaration in a file by itself can omit the
indentation to save horizontal space.
24688+
24689[source,sml]
24690----
24691structure S =
24692struct
24693
24694val x = 13
24695
24696end
24697----
24698+
24699In this case, there should be a blank line after the `struct` and
24700before the `end`.
24701
24702* Declarations in a `struct` are separated by blank lines.
24703+
[source,sml]
----
struct
   val x =
      let
         val y = 13
      in
         y + 1
      end

   val z = 14
end
----
24717
24718
24719== Functors ==
24720
24721* A `functor` declaration has spaces after each `:` (or `:>`) but not
24722before, and a space before and after the `=`. It is indented as
24723follows.
24724+
[source,sml]
----
functor Foo (S: FOO_ARG): FOO =
   struct
      val x = S.x
   end
----
24732+
_Exception_: a functor declaration in a file by itself can omit the
indentation to save horizontal space.
24735+
24736[source,sml]
24737----
24738functor Foo (S: FOO_ARG): FOO =
24739struct
24740
24741val x = S.x
24742
24743end
24744----
24745+
24746In this case, there should be a blank line after the `struct`
24747and before the `end`.
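
Putting the signature, structure, and functor conventions together, a small sketch might read as follows. All names (`COUNTER`, `Counter`, `ByTwo`, `step`) are hypothetical and chosen only for illustration:

[source,sml]
----
signature COUNTER =
   sig
      type t
      val get: t -> int
      val new: int -> t
      val next: t -> t
   end

functor Counter (S: sig val step: int end): COUNTER =
   struct
      type t = int

      fun get c = c

      fun new start = start

      fun next c = c + S.step
   end

structure ByTwo = Counter (struct val step = 2 end)
----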
24748
24749<<<
24750
24751:mlton-guide-page: Talk
24752[[Talk]]
24753Talk
24754====
24755
24756== The MLton Standard ML Compiler ==
24757
24758*Henry Cejtin, Matthew Fluet, Suresh Jagannathan, Stephen Weeks*
24759
24760{nbsp} +
24761{nbsp} +
24762{nbsp} +
24763
24764'''
24765
24766[cols="<,>"]
24767|====
24768||<:TalkStandardML: Next>
24769|====
24770
24771<<<
24772
24773:mlton-guide-page: TalkDiveIn
24774[[TalkDiveIn]]
24775TalkDiveIn
24776==========
24777
24778== Dive In ==
24779
24780 * to <:Development:>
24781 * to <:Documentation:>
24782 * to <:Download:>
24783
24784{nbsp} +
24785{nbsp} +
24786{nbsp} +
24787
24788'''
24789
24790[cols="<,>"]
24791|====
24792|<:TalkMLtonHistory: Prev>|
24793|====
24794
24795<<<
24796
24797:mlton-guide-page: TalkFolkLore
24798[[TalkFolkLore]]
24799TalkFolkLore
24800============
24801
24802== Folk Lore ==
24803
24804 * Defunctorization and monomorphisation are feasible
24805 * Global control-flow analysis is feasible
24806 * Early closure conversion is feasible
24807
24808{nbsp} +
24809{nbsp} +
24810{nbsp} +
24811
24812'''
24813
24814[cols="<,>"]
24815|====
24816|<:TalkWholeProgram: Prev>|<:TalkMLtonFeatures: Next>
24817|====
24818
24819<<<
24820
24821:mlton-guide-page: TalkFromSMLTo
24822[[TalkFromSMLTo]]
24823TalkFromSMLTo
24824=============
24825
24826== From Standard ML to S-T F-O IL ==
24827
24828 * What issues arise when translating from Standard ML into an intermediate language?
24829
24830{nbsp} +
24831{nbsp} +
24832{nbsp} +
24833
24834'''
24835
24836[cols="<,>"]
24837|====
24838|<:TalkMLtonApproach: Prev>|<:TalkHowModules: Next>
24839|====
24840
24841<<<
24842
24843:mlton-guide-page: TalkHowHigherOrder
24844[[TalkHowHigherOrder]]
24845TalkHowHigherOrder
24846==================
24847
24848== Higher-order Functions ==
24849
24850 * How does one represent SML's higher-order functions?
24851 * MLton's answer: defunctionalize
24852
24853{nbsp} +
24854{nbsp} +
24855
24856See <:ClosureConvert:>.
24857
24858{nbsp} +
24859{nbsp} +
24860{nbsp} +
24861
24862'''
24863[cols="<,>"]
24864|====
24865|<:TalkMLtonApproach: Prev>|<:TalkWholeProgram: Next>
24866|====
24867
24868<<<
24869
24870:mlton-guide-page: TalkHowModules
24871[[TalkHowModules]]
24872TalkHowModules
24873==============
24874
24875== Modules ==
24876
24877 * How does one represent SML's modules?
24878 * MLton's answer: defunctorize
24879
24880{nbsp} +
24881{nbsp} +
24882
24883See <:Elaborate:>.
24884
24885{nbsp} +
24886{nbsp} +
24887{nbsp} +
24888
24889'''
24890
24891[cols="<,>"]
24892|====
24893|<:TalkFromSMLTo: Prev>|<:TalkHowPolymorphism: Next>
24894|====
24895
24896<<<
24897
24898:mlton-guide-page: TalkHowPolymorphism
24899[[TalkHowPolymorphism]]
24900TalkHowPolymorphism
24901===================
24902
24903== Polymorphism ==
24904
24905 * How does one represent SML's polymorphism?
24906 * MLton's answer: monomorphise
24907
24908{nbsp} +
24909{nbsp} +
24910
24911See <:Monomorphise:>.
24912
24913{nbsp} +
24914{nbsp} +
24915{nbsp} +
24916
24917'''
24918
24919[cols="<,>"]
24920|====
24921|<:TalkHowModules: Prev>|<:TalkHowHigherOrder: Next>
24922|====
24923
24924<<<
24925
24926:mlton-guide-page: TalkMLtonApproach
24927[[TalkMLtonApproach]]
24928TalkMLtonApproach
24929=================
24930
24931== MLton's Approach ==
24932
24933 * whole-program optimization using a simply-typed, first-order intermediate language
24934 * ensures programs are not penalized for exploiting abstraction and modularity
24935
24936{nbsp} +
24937{nbsp} +
24938{nbsp} +
24939
24940'''
24941
24942[cols="<,>"]
24943|====
24944|<:TalkStandardML: Prev>|<:TalkFromSMLTo: Next>
24945|====
24946
24947<<<
24948
24949:mlton-guide-page: TalkMLtonFeatures
24950[[TalkMLtonFeatures]]
24951TalkMLtonFeatures
24952=================
24953
24954== MLton Features ==
24955
24956 * Supports full Standard ML language and Basis Library
24957 * Generates standalone executables
24958 * Extensions
24959 ** Foreign function interface (SML to C, C to SML)
24960 ** ML Basis system for programming in the very large
24961 ** Extension libraries
24962
24963{nbsp} +
24964{nbsp} +
24965
24966See <:Features:>.
24967
24968{nbsp} +
24969{nbsp} +
24970{nbsp} +
24971
24972'''
24973
24974[cols="<,>"]
24975|====
24976|<:TalkFolkLore: Prev>|<:TalkMLtonHistory: Next>
24977|====
24978
24979<<<
24980
24981:mlton-guide-page: TalkMLtonHistory
24982[[TalkMLtonHistory]]
24983TalkMLtonHistory
24984================
24985
24986== MLton History ==
24987
24988[cols="<25%,<75%"]
24989|====
24990| April 1997 | Stephen Weeks wrote a defunctorizer for SML/NJ
24991| Aug. 1997 | Begin independent compiler (`smlc`)
24992| Oct. 1997 | Monomorphiser
24993| Nov. 1997 | Polyvariant higher-order control-flow analysis (10,000 lines)
24994| March 1999 | First release of MLton (48,006 lines)
24995| Jan. 2002 | MLton at 102,541 lines
24996| Jan. 2003 | MLton at 112,204 lines
24997| Jan. 2004 | MLton at 122,299 lines
24998| Nov. 2004 | MLton at 141,311 lines
24999|====
25000
25001{nbsp} +
25002{nbsp} +
25003
25004See <:History:>.
25005
25006{nbsp} +
25007{nbsp} +
25008{nbsp} +
25009
25010'''
25011
25012[cols="<,>"]
25013|====
25014|<:TalkMLtonFeatures: Prev>|<:TalkDiveIn: Next>
25015|====
25016
25017<<<
25018
25019:mlton-guide-page: TalkStandardML
25020[[TalkStandardML]]
25021TalkStandardML
25022==============
25023
25024== Standard ML ==
25025
25026 * a high-level language makes
25027 ** a programmer's life easier
25028 ** a compiler writer's life harder
25029
25030 * perceived overheads of features discourage their use
25031 ** higher-order functions
25032 ** polymorphic datatypes
25033 ** separate modules
25034
25035{nbsp} +
25036{nbsp} +
25037
25038Also see <:StandardML:Standard ML>.
25039
25040{nbsp} +
25041{nbsp} +
25042{nbsp} +
25043
25044'''
25045
25046[cols="<,>"]
25047|====
25048|<:Talk: Prev>|<:TalkMLtonApproach: Next>
25049|====
25050
25051<<<
25052
25053:mlton-guide-page: TalkTemplate
25054[[TalkTemplate]]
25055TalkTemplate
25056============
25057
25058== Title ==
25059
25060 * Bullet
25061 * Bullet
25062
25063
25064{nbsp} +
25065{nbsp} +
25066{nbsp} +
25067
25068'''
25069
25070[cols="<,>"]
25071|====
25072|<:ZZZPrev: Prev>|<:ZZZNext: Next>
25073|====
25074
25075<<<
25076
25077:mlton-guide-page: TalkWholeProgram
25078[[TalkWholeProgram]]
25079TalkWholeProgram
25080================
25081
25082== Whole Program Compiler ==
25083
25084 * Each of these techniques requires whole-program analysis
25085 * But, additional benefits:
25086 ** eliminate (some) variability in programming styles
25087 ** specialize representations
25088 ** simplifies and improves runtime system
25089
25090{nbsp} +
25091{nbsp} +
25092{nbsp} +
25093
25094'''
25095
25096[cols="<,>"]
25097|====
25098|<:TalkHowHigherOrder: Prev>|<:TalkFolkLore: Next>
25099|====
25100
25101<<<
25102
25103:mlton-guide-page: TILT
25104[[TILT]]
25105TILT
25106====
25107
25108http://www.cs.cornell.edu/home/jgm/tilt.html[TILT] is a
25109<:StandardMLImplementations:Standard ML implementation>.
25110
25111<<<
25112
25113:mlton-guide-page: TipsForWritingConciseSML
25114[[TipsForWritingConciseSML]]
25115TipsForWritingConciseSML
25116========================
25117
25118SML is a rich enough language that there are often several ways to
25119express things. This page contains miscellaneous tips (ideas not
25120rules) for writing concise SML. The metric that we are interested in
25121here is the number of tokens or words (rather than the number of
25122lines, for example).
25123
25124== Datatypes in Signatures ==
25125
25126A seemingly frequent source of repetition in SML is that of datatype
25127definitions in signatures and structures. Actually, it isn't
25128repetition at all. A datatype specification in a signature, such as,
25129
25130[source,sml]
25131----
25132signature EXP = sig
25133 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25134end
25135----
25136
25137is just a specification of a datatype that may be matched by multiple
25138(albeit identical) datatype declarations. For example, in
25139
25140[source,sml]
25141----
25142structure AnExp : EXP = struct
25143 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25144end
25145
25146structure AnotherExp : EXP = struct
25147 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25148end
25149----
25150
25151the types `AnExp.exp` and `AnotherExp.exp` are two distinct types. If
25152such <:GenerativeDatatype:generativity> isn't desired or needed, you
25153can avoid the repetition:
25154
25155[source,sml]
25156----
25157structure Exp = struct
25158 datatype exp = Fn of id * exp | App of exp * exp | Var of id
25159end
25160
25161signature EXP = sig
25162 datatype exp = datatype Exp.exp
25163end
25164
25165structure Exp : EXP = struct
25166 open Exp
25167end
25168----
25169
25170Keep in mind that this isn't semantically equivalent to the original.
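
To make the difference concrete, here is a small sketch of the replicated version. It assumes `id` is `string` (the text leaves `id` unspecified) and reuses the structure names `AnExp` and `AnotherExp` from the earlier example; because both structures replicate `Exp.exp`, their `exp` types coincide:

[source,sml]
----
type id = string  (* assumption: the text does not define id *)

structure Exp = struct
   datatype exp = Fn of id * exp | App of exp * exp | Var of id
end

signature EXP = sig
   datatype exp = datatype Exp.exp
end

structure AnExp: EXP = Exp
structure AnotherExp: EXP = Exp

(* Typechecks only because AnExp.exp and AnotherExp.exp are the same type. *)
val e: AnExp.exp = AnotherExp.Var "x"
----

With two separate `datatype` declarations, as in the generative version, the last binding would be rejected by the type checker.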
25171
25172
25173== Clausal Function Definitions ==
25174
25175The syntax of clausal function definitions is rather repetitive. For
25176example,
25177
25178[source,sml]
25179----
25180fun isSome NONE = false
25181 | isSome (SOME _) = true
25182----
25183
25184is more verbose than
25185
25186[source,sml]
25187----
25188val isSome =
25189 fn NONE => false
25190 | SOME _ => true
25191----
25192
25193For recursive functions the break-even point is one clause higher. For example,
25194
25195[source,sml]
25196----
25197fun fib 0 = 0
25198 | fib 1 = 1
25199 | fib n = fib (n-1) + fib (n-2)
25200----
25201
25202isn't less verbose than
25203
25204[source,sml]
25205----
25206val rec fib =
25207 fn 0 => 0
25208 | 1 => 1
25209 | n => fib (n-1) + fib (n-2)
25210----
25211
25212It is quite often the case that a curried function primarily examines
25213just one of its arguments. Such functions can be written particularly
25214concisely by making the examined argument last. For example, instead
25215of
25216
[source,sml]
----
fun eval (Fn (v, b)) env = ...
  | eval (App (f, a)) env = ...
  | eval (Var v) env = ...
----
25223
25224consider writing
25225
25226[source,sml]
25227----
25228fun eval env =
25229 fn Fn (v, b) => ...
25230 | App (f, a) => ...
25231 | Var v => ...
25232----
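
To make the second form concrete, here is a small self-contained sketch. The `value` datatype, the `Unbound` exception, the `extend` helper, and the representation of environments as functions are all assumptions made for this illustration; only the shape of `eval` comes from the text above.

[source,sml]
----
type id = string
datatype exp = Fn of id * exp | App of exp * exp | Var of id

(* Assumption: the only values are closures, and an environment is a
 * function from identifiers to values. *)
datatype value = Closure of id * exp * (id -> value)

exception Unbound of id

(* Extend an environment with one binding. *)
fun extend env (v, x) = fn v' => if v' = v then x else env v'

(* The examined argument comes last, so a single fn covers all clauses. *)
fun eval env =
 fn Fn (v, b) => Closure (v, b, env)
  | App (f, a) =>
    (case eval env f of
        Closure (v, b, env') =>
        eval (extend env' (v, eval env a)) b)
  | Var v => env v

(* The empty environment rejects every identifier. *)
val empty = fn v => raise Unbound v
----

For example, `eval empty (App (Fn ("x", Var "x"), Fn ("y", Var "y")))` yields a closure for `Fn ("y", Var "y")`.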
25233
25234
25235== Parentheses ==
25236
25237It is a good idea to avoid using lots of irritating superfluous
25238parentheses. An important rule to know is that prefix function
25239application in SML has higher precedence than any infix operator. For
25240example, the outer parentheses in
25241
25242[source,sml]
25243----
25244(square (5 + 1)) + (square (5 * 2))
25245----
25246
25247are superfluous.
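
A quick way to convince yourself of the precedence rule is the following sketch; `square` is defined here only for illustration:

[source,sml]
----
fun square x = x * x

(* Both bindings parse the same way, because function application
 * binds tighter than the infix operator +. *)
val withParens = (square (5 + 1)) + (square (5 * 2))
val withoutParens = square (5 + 1) + square (5 * 2)

(* The inner parentheses are not superfluous: this is (square 5) + 1. *)
val different = square 5 + 1
----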
25248
25249People trained in other languages often use superfluous parentheses in
25250a number of places. In particular, the parentheses in the following
25251examples are practically always superfluous and are best avoided:
25252
25253[source,sml]
25254----
25255if (condition) then ... else ...
25256while (condition) do ...
25257----
25258
25259The same basically applies to case expressions:
25260
25261[source,sml]
25262----
25263case (expression) of ...
25264----
25265
25266It is not uncommon to match a tuple of two or more values:
25267
25268[source,sml]
25269----
25270case (a, b) of
25271 (A1, B1) => ...
25272 | (A2, B2) => ...
25273----
25274
25275Such case expressions can be written more concisely with an
25276<:ProductType:infix product constructor>:
25277
25278[source,sml]
25279----
25280case a & b of
25281 A1 & B1 => ...
25282 | A2 & B2 => ...
25283----
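
The following sketch shows the idea in a self-contained form. The local `product` declaration stands in for the library type, and the function `addBoth` is hypothetical:

[source,sml]
----
(* Assumption: a local stand-in for the library's product type. *)
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Add two optional integers; the result is NONE unless both are present. *)
fun addBoth (a, b) =
   case a & b of
      SOME x & SOME y => SOME (x + y)
    | _ & _ => NONE
----

Because `&` is an ordinary infix constructor, the patterns need no surrounding parentheses.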
25284
25285
25286== Conditionals ==
25287
25288Repeated sequences of conditionals such as
25289
25290[source,sml]
25291----
25292if x < y then ...
25293else if x = y then ...
25294else ...
25295----
25296
25297can often be written more concisely as case expressions such as
25298
25299[source,sml]
25300----
25301case Int.compare (x, y) of
25302 LESS => ...
25303 | EQUAL => ...
25304 | GREATER => ...
25305----
25306
25307For a custom comparison, you would then define an appropriate datatype
25308and a reification function. An alternative to using datatypes is to
25309use dispatch functions
25310
25311[source,sml]
25312----
25313comparing (x, y)
25314{lt = fn () => ...,
25315 eq = fn () => ...,
25316 gt = fn () => ...}
25317----
25318
25319where
25320
25321[source,sml]
25322----
25323fun comparing (x, y) {lt, eq, gt} =
25324 (case Int.compare (x, y) of
25325 LESS => lt
25326 | EQUAL => eq
25327 | GREATER => gt) ()
25328----
25329
25330An advantage is that no datatype definition is needed. A disadvantage
25331is that you can't combine multiple dispatch results easily.
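
Put together, a complete use of the dispatch function might look like the following sketch, where `describe` is a hypothetical example function:

[source,sml]
----
fun comparing (x, y) {lt, eq, gt} =
   (case Int.compare (x, y) of
       LESS => lt
     | EQUAL => eq
     | GREATER => gt) ()

(* Only the selected thunk is evaluated. *)
fun describe (x, y) =
   comparing (x, y)
   {lt = fn () => "less",
    eq = fn () => "equal",
    gt = fn () => "greater"}
----

Passing thunks rather than plain values ensures that the branches not taken are never evaluated, just as with an ordinary conditional.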
25332
25333
25334== Command-Query Fusion ==
25335
25336Many are familiar with the
25337http://en.wikipedia.org/wiki/Command-Query_Separation[Command-Query
25338Separation Principle]. Adhering to the principle, a signature for an
25339imperative stack might contain specifications
25340
25341[source,sml]
25342----
25343val isEmpty : 'a t -> bool
25344val pop : 'a t -> 'a
25345----
25346
25347and use of a stack would look like
25348
25349[source,sml]
25350----
25351if isEmpty stack
25352then ... pop stack ...
25353else ...
25354----
25355
25356or, when the element needs to be named,
25357
25358[source,sml]
25359----
25360if isEmpty stack
25361then let val elem = pop stack in ... end
25362else ...
25363----
25364
25365For efficiency, correctness, and conciseness, it is often better to
25366combine the query and command and return the result as an option:
25367
25368[source,sml]
25369----
25370val pop : 'a t -> 'a option
25371----
25372
25373A use of a stack would then look like this:
25374
25375[source,sml]
25376----
25377case pop stack of
25378 NONE => ...
25379 | SOME elem => ...
25380----
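
As a minimal sketch of the fused interface, assume a stack represented as a reference to a list. The `Stack` structure, its `new` and `push` operations, and the `drain` function are all assumptions made for this example:

[source,sml]
----
structure Stack :> sig
   type 'a t
   val new : unit -> 'a t
   val push : 'a t -> 'a -> unit
   val pop : 'a t -> 'a option
end = struct
   type 'a t = 'a list ref
   fun new () = ref []
   fun push s x = s := x :: !s
   (* The emptiness test and the removal happen in one step. *)
   fun pop s =
      case !s of
         [] => NONE
       | x :: xs => (s := xs ; SOME x)
end

(* Remove and collect every element, most recently pushed first. *)
fun drain stack =
   case Stack.pop stack of
      NONE => []
    | SOME elem => elem :: drain stack
----

With this interface a client cannot call `pop` on an empty stack by mistake, because the empty case must be handled in the `case` expression.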
25381
25382<<<
25383
25384:mlton-guide-page: ToMachine
25385[[ToMachine]]
25386ToMachine
25387=========
25388
25389<:ToMachine:> is a translation pass from the <:RSSA:>
25390<:IntermediateLanguage:> to the <:Machine:> <:IntermediateLanguage:>.
25391
25392== Description ==
25393
This pass converts a <:RSSA:> program into a <:Machine:> program.
25395
25396It uses <:AllocateRegisters:>, <:Chunkify:>, and <:ParallelMove:>.
25397
25398== Implementation ==
25399
25400* <!ViewGitFile(mlton,master,mlton/backend/backend.sig)>
25401* <!ViewGitFile(mlton,master,mlton/backend/backend.fun)>
25402
25403== Details and Notes ==
25404
25405Because the MLton runtime system is shared by all <:Codegen:codegens>, it is most
25406convenient to decide on stack layout _before_ any <:Codegen:codegen> takes over.
25407In particular, we compute all the stack frame info for each <:RSSA:>
25408function, including stack size, <:GarbageCollection:garbage collector>
25409masks for each frame, etc. To do so, the <:Machine:>
25410<:IntermediateLanguage:> imagines an abstract machine with an infinite
25411number of (pseudo-)registers of every size. A liveness analysis
25412determines, for each variable, whether or not it is live across a
25413point where the runtime system might take over (for example, any
25414garbage collection point) or a non-tail call to another <:RSSA:>
25415function. Those that are live go on the stack, while those that
aren't live go into pseudo-registers. From this information, we know
all we need to know about each stack frame. On the downside, nothing
25418further on is allowed to change this stack info; it is set in stone.
25419
25420<<<
25421
25422:mlton-guide-page: TomMurphy
25423[[TomMurphy]]
25424TomMurphy
25425=========
25426
Tom Murphy VII is a long-time MLton user and occasional contributor. He works on programming languages for his PhD work at Carnegie Mellon in Pittsburgh, USA. <:AdamGoode:> lives on the same floor of Wean Hall.
25428
25429http://tom7.org[Home page]
25430
25431<<<
25432
25433:mlton-guide-page: ToRSSA
25434[[ToRSSA]]
25435ToRSSA
25436======
25437
25438<:ToRSSA:> is a translation pass from the <:SSA2:>
25439<:IntermediateLanguage:> to the <:RSSA:> <:IntermediateLanguage:>.
25440
25441== Description ==
25442
25443This pass converts a <:SSA2:> program into a <:RSSA:> program.
25444
25445It uses <:PackedRepresentation:>.
25446
25447== Implementation ==
25448
25449* <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.sig)>
25450* <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.fun)>
25451
25452== Details and Notes ==
25453
25454{empty}
25455
25456<<<
25457
25458:mlton-guide-page: ToSSA2
25459[[ToSSA2]]
25460ToSSA2
25461======
25462
25463<:ToSSA2:> is a translation pass from the <:SSA:>
25464<:IntermediateLanguage:> to the <:SSA2:> <:IntermediateLanguage:>.
25465
25466== Description ==
25467
25468This pass is a simple conversion from a <:SSA:> program into a
25469<:SSA2:> program.
25470
25471The only interesting portions of the translation are:
25472
25473* an <:SSA:> `ref` type becomes an object with a single mutable field
25474* `array`, `vector`, and `ref` are eliminated in favor of select and updates
25475* `Case` transfers separate discrimination and constructor argument selects
25476
25477== Implementation ==
25478
25479* <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.sig)>
25480* <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.fun)>
25481
25482== Details and Notes ==
25483
25484{empty}
25485
25486<<<
25487
25488:mlton-guide-page: TypeChecking
25489[[TypeChecking]]
25490TypeChecking
25491============
25492
25493MLton's type checker follows the <:DefinitionOfStandardML:Definition>
25494closely, so you may find differences between MLton and other SML
25495compilers that do not follow the Definition so closely. In
25496particular, SML/NJ has many deviations from the Definition -- please
25497see <:SMLNJDeviations:> for those that we are aware of.
25498
25499In some respects MLton's type checker is more powerful than other SML
25500compilers, so there are programs that MLton accepts that are rejected
25501by some other SML compilers. These kinds of programs fall into a few
25502simple categories.
25503
25504* MLton resolves flexible record patterns using a larger context than
25505many other SML compilers. For example, MLton accepts the
25506following.
25507+
25508[source,sml]
25509----
25510fun f {x, ...} = x
25511val _ = f {x = 13, y = "foo"}
25512----
25513
25514* MLton uses as large a context as possible to resolve the type of
25515variables constrained by the value restriction to be monotypes. For
25516example, MLton accepts the following.
25517+
25518[source,sml]
25519----
25520structure S:
25521 sig
25522 val f: int -> int
25523 end =
25524 struct
25525 val f = (fn x => x) (fn y => y)
25526 end
25527----
25528
25529
25530== Type error messages ==
25531
25532To aid in the understanding of type errors, MLton's type checker
25533displays type errors differently than other SML compilers. In
25534particular, when two types are different, it is important for the
25535programmer to easily understand why they are different. So, MLton
25536displays only the differences between two types that don't match,
25537using underscores for the parts that match. For example, if a
25538function expects `real * int` but gets `real * real`, the type error
25539message would look like
25540
25541----
25542expects: _ * [int]
25543but got: _ * [real]
25544----
25545
25546As another aid to spotting differences, MLton places brackets `[]`
25547around the parts of the types that don't match. A common situation is
25548when a function receives a different number of arguments than it
25549expects, in which case you might see an error like
25550
25551----
25552expects: [int * real]
25553but got: [int * real * string]
25554----
25555
25556The brackets make it easy to see that the problem is that the tuples
25557have different numbers of components -- not that the components don't
25558match. Contrast that with a case where a function receives the right
25559number of arguments, but in the wrong order, in which case you might
25560see an error like
25561
25562----
25563expects: [int] * [real]
25564but got: [real] * [int]
25565----
25566
25567Here the brackets make it easy to see that the components do not match.
25568
25569We appreciate feedback on any type error messages that you find
25570confusing, or suggestions you may have for improvements to error
25571messages.
25572
25573
25574== The shortest/most-recent rule for type names ==
25575
25576In a type error message, MLton often has a number of choices in
25577deciding what name to use for a type. For example, in the following
25578type-incorrect program
25579
25580[source,sml]
25581----
25582type t = int
25583fun f (x: t) = x
25584val _ = f "foo"
25585----
25586
25587MLton reports the error message
25588
25589----
25590Error: z.sml 3.9-3.15.
25591 Function applied to incorrect argument.
25592 expects: [t]
25593 but got: [string]
25594 in: f "foo"
25595----
25596
25597MLton could have reported `expects: [int]` instead of `expects: [t]`.
25598However, MLton uses the shortest/most-recent rule in order to decide
25599what type name to display. This rule means that, at the point of the
25600error, MLton first looks for the shortest name for a type in terms of
25601number of structure identifiers (e.g. `foobar` is shorter than `A.t`).
25602Next, if there are multiple names of the same length, then MLton uses
25603the most recently defined name. It is this tiebreaker that causes
25604MLton to prefer `t` to `int` in the above example.
25605
25606In signature matching, most recently defined is not taken to include
25607all of the definitions introduced by the structure (since the matching
25608takes place outside the structure and before it is defined). For
25609example, in the following type-incorrect program
25610
25611[source,sml]
25612----
25613structure S:
25614 sig
25615 val x: int
25616 end =
25617 struct
25618 type t = int
25619 val x = "foo"
25620 end
25621----
25622
25623MLton reports the error message
25624
25625----
25626Error: z.sml 2.4-4.6.
25627 Variable in structure disagrees with signature (type): x.
25628 structure: val x: [string]
25629 defn at: z.sml 7.11-7.11
25630 signature: val x: [int]
25631 spec at: z.sml 3.11-3.11
25632----
25633
25634If there is a type that only exists inside the structure being
25635matched, then the prefix `_str.` is used. For example, in the
25636following type-incorrect program
25637
25638[source,sml]
25639----
25640structure S:
25641 sig
25642 val x: int
25643 end =
25644 struct
25645 datatype t = T
25646 val x = T
25647 end
25648----
25649
25650MLton reports the error message
25651
25652----
25653Error: z.sml 2.4-4.6.
25654 Variable in structure disagrees with signature (type): x.
25655 structure: val x: [_str.t]
25656 defn at: z.sml 7.11-7.11
25657 signature: val x: [int]
25658 spec at: z.sml 3.11-3.11
25659----
25660
25661in which the `[_str.t]` refers to the type defined in the structure.
25662
25663<<<
25664
25665:mlton-guide-page: TypeConstructor
25666[[TypeConstructor]]
25667TypeConstructor
25668===============
25669
25670In <:StandardML:Standard ML>, a type constructor is a function from
25671types to types. Type constructors can be _nullary_, meaning that
25672they take no arguments, as in `char`, `int`, and `real`.
25673Type constructors can be _unary_, meaning that they take one
25674argument, as in `array`, `list`, and `vector`. A program
25675can define a new type constructor in two ways: a `type` definition
or a `datatype` declaration. User-defined type constructors can
take any number of arguments.
25678
25679[source,sml]
25680----
25681datatype t = T of int * real (* 0 arguments *)
25682type 'a t = 'a * int (* 1 argument *)
25683datatype ('a, 'b) t = A | B of 'a * 'b (* 2 arguments *)
25684type ('a, 'b, 'c) t = 'a * ('b -> 'c) (* 3 arguments *)
25685----
25686
25687Here are the syntax rules for type constructor application.
25688
25689 * Type constructor application is written in postfix. So, one writes
25690 `int list`, not `list int`.
25691
25692 * Unary type constructors drop the parens, so one writes
25693 `int list`, not `(int) list`.
25694
25695 * Nullary type constructors drop the argument entirely, so one writes
25696 `int`, not `() int`.
25697
25698 * N-ary type constructors use tuple notation; for example,
25699 `(int, real) t`.
25700
25701 * Type constructor application associates to the left. So,
25702 `int ref list` is the same as `(int ref) list`.
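
The rules above can be checked with a few type annotations. In this sketch the type constructors `pair` and `either` are hypothetical names introduced only for illustration:

[source,sml]
----
type 'a pair = 'a * 'a                               (* unary *)
datatype ('a, 'b) either = Left of 'a | Right of 'b  (* binary *)

val n : int = 42                          (* nullary: no argument *)
val xs : int list = [1, 2, 3]             (* unary: postfix, no parens *)
val p : int pair = (1, 2)
val e : (int, string) either = Left 1     (* n-ary: tuple notation *)

(* Application associates to the left, so these two types are the same. *)
val rs : int ref list = [ref 1, ref 2]
val rs' : (int ref) list = rs
----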
25703
25704<<<
25705
25706:mlton-guide-page: TypeIndexedValues
25707[[TypeIndexedValues]]
25708TypeIndexedValues
25709=================
25710
25711<:StandardML:Standard ML> does not support ad hoc polymorphism. This
25712presents a challenge to programmers. The problem is that at first
25713glance there seems to be no practical way to implement something like
25714a function for converting a value of any type to a string or a
25715function for computing a hash value for a value of any type.
25716Fortunately there are ways to implement type-indexed values in SML as
25717discussed in <!Cite(Yang98)>. Various articles such as
25718<!Cite(Danvy98)>, <!Cite(Ramsey11)>, <!Cite(Elsman04)>,
25719<!Cite(Kennedy04)>, and <!Cite(Benton05)> also contain examples of
25720type-indexed values.
25721
*NOTE:* The following example uses an early (and
somewhat broken) variation of the basic technique used in an
experimental generic programming library (see
<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that can
be found in the MLton repository. The generic programming library
25727also includes a more advanced generic pretty printing function (see
25728<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pretty.sig)>).
25729
25730== Example: Converting any SML value to (roughly) SML syntax ==
25731
25732Consider the problem of converting any SML value to a textual
25733presentation that matches the syntax of SML as closely as possible.
25734One solution is a type-indexed function that maps a given type to a
25735function that maps any value (of the type) to its textual
25736presentation. A type-indexed function like this can be useful for a
25737variety of purposes. For example, one could use it to show debugging
25738information. We'll call this function "`show`".
25739
25740We'll do a fairly complete implementation of `show`. We do not
25741distinguish infix and nonfix constructors, but that is not an
25742intrinsic property of SML datatypes. We also don't reconstruct a type
25743name for the value, although it would be particularly useful for
25744functional values. To reconstruct type names, some changes would be
25745needed and the reader is encouraged to consider how to do that. A
25746more realistic implementation would use some pretty printing
25747combinators to compute a layout for the result. This should be a
25748relatively easy change (given a suitable pretty printing library).
25749Cyclic values (through references and arrays) do not have a standard
25750textual presentation and it is impossible to convert arbitrary
25751functional values (within SML) to a meaningful textual presentation.
25752Finally, it would also make sense to show sharing of references and
25753arrays. We'll leave these improvements to an actual library
25754implementation.
25755
25756The following code uses the <:Fixpoints:fixpoint framework> and other
25757utilities from an Extended Basis library (see
25758<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>).
25759
25760=== Signature ===
25761
25762Let's consider the design of the `SHOW` signature:
25763[source,sml]
25764----
25765infixr -->
25766
25767signature SHOW = sig
25768 type 'a t (* complete type-index *)
25769 type 'a s (* incomplete sum *)
25770 type ('a, 'k) p (* incomplete product *)
25771 type u (* tuple or unlabelled product *)
25772 type l (* record or labelled product *)
25773
25774 val show : 'a t -> 'a -> string
25775
25776 (* user-defined types *)
25777 val inj : ('a -> 'b) -> 'b t -> 'a t
25778
25779 (* tuples and records *)
25780 val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
25781
25782 val U : 'a t -> ('a, u) p
25783 val L : string -> 'a t -> ('a, l) p
25784
25785 val tuple : ('a, u) p -> 'a t
25786 val record : ('a, l) p -> 'a t
25787
25788 (* datatypes *)
25789 val + : 'a s * 'b s -> (('a, 'b) sum) s
25790
25791 val C0 : string -> unit s
25792 val C1 : string -> 'a t -> 'a s
25793
25794 val data : 'a s -> 'a t
25795
25796 val Y : 'a t Tie.t
25797
25798 (* exceptions *)
25799 val exn : exn t
25800 val regExn : (exn -> ('a * 'a s) option) -> unit
25801
25802 (* some built-in type constructors *)
25803 val refc : 'a t -> 'a ref t
25804 val array : 'a t -> 'a array t
25805 val list : 'a t -> 'a list t
25806 val vector : 'a t -> 'a vector t
25807 val --> : 'a t * 'b t -> ('a -> 'b) t
25808
25809 (* some built-in base types *)
25810 val string : string t
25811 val unit : unit t
25812 val bool : bool t
25813 val char : char t
25814 val int : int t
25815 val word : word t
25816 val real : real t
25817end
25818----
25819
25820While some details are shaped by the specific requirements of `show`,
25821there are a number of (design) patterns that translate to other
25822type-indexed values. The former kind of details are mostly shaped by
25823the syntax of SML values that `show` is designed to produce. To this
25824end, abstract types and phantom types are used to distinguish
25825incomplete record, tuple, and datatype type-indices from each other
25826and from complete type-indices. Also, names of record labels and
25827datatype constructors need to be provided by the user.
25828
25829==== Arbitrary user-defined datatypes ====
25830
25831Perhaps the most important pattern is how the design supports
25832arbitrary user-defined datatypes. A number of combinators together
25833conspire to provide the functionality. First of all, to support new
25834user-defined types, a combinator taking a conversion function to a
25835previously supported type is provided:
25836[source,sml]
25837----
25838val inj : ('a -> 'b) -> 'b t -> 'a t
25839----
25840
25841An injection function is sufficient in this case, but in the general
25842case, an embedding with injection and projection functions may be
25843needed.
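
To see why an injection suffices, consider a deliberately simplified sketch in which a type-index is nothing more than a conversion function to `string`. The `MiniShow` structure and the `color` datatype are assumptions made for this illustration and are much cruder than the `Show` structure developed on this page:

[source,sml]
----
structure MiniShow = struct
   (* Assumption: a type-index is just a conversion function here. *)
   type 'a t = 'a -> string

   fun show (t : 'a t) : 'a -> string = t

   val int : int t = Int.toString
   val bool : bool t = Bool.toString

   fun list (t : 'a t) : 'a list t =
      fn xs => "[" ^ String.concatWith ", " (map t xs) ^ "]"

   (* Reuse the type-index for 'b at type 'a by converting first. *)
   fun inj (f : 'a -> 'b) (t : 'b t) : 'a t = t o f
end

datatype color = Red | Green | Blue

(* A user-defined type is supported by mapping it to a supported one. *)
val color : color MiniShow.t =
   MiniShow.inj (fn Red => 0 | Green => 1 | Blue => 2) MiniShow.int
----

In this sketch colors are shown as integers. The sum combinators described next are what allow `Show` to print constructor names instead.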
25844
25845To support products (tuples and records) a product combinator is
25846provided:
25847[source,sml]
25848----
25849val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
25850----
25851The second (phantom) type variable `'k` is there to distinguish
25852between labelled and unlabelled products and the type `p`
25853distinguishes incomplete products from complete type-indices of type
25854`t`. Most type-indexed values do not need to make such distinctions.
25855
25856To support sums (datatypes) a sum combinator is provided:
25857[source,sml]
25858----
25859val + : 'a s * 'b s -> (('a, 'b) sum) s
25860----
25861Again, the purpose of the type `s` is to distinguish incomplete sums
25862from complete type-indices of type `t`, which usually isn't necessary.
25863
25864Finally, to support recursive datatypes, including sets of mutually
25865recursive datatypes, a <:Fixpoints:fixpoint tier> is provided:
25866[source,sml]
25867----
25868val Y : 'a t Tie.t
25869----
25870
25871Together these combinators (with the more domain specific combinators
25872`U`, `L`, `tuple`, `record`, `C0`, `C1`, and `data`) enable one to
25873encode a type-index for any user-defined datatype.
25874
25875==== Exceptions ====
25876
25877The `exn` type in SML is a <:UniversalType:universal type> into which
25878all types can be embedded. SML also allows a program to generate new
25879exception variants at run-time. Thus a mechanism is required to register
25880handlers for particular variants:
25881[source,sml]
25882----
25883val exn : exn t
25884val regExn : (exn -> ('a * 'a s) option) -> unit
25885----
25886
25887The universal `exn` type-index then makes use of the registered
25888handlers. The above particular form of handler, which converts an
25889exception value to a value of some type and a type-index for that type
25890(essentially an existential type) is designed to make it convenient to
25891write handlers. To write a handler, one can conveniently reuse
25892existing type-indices:
25893[source,sml]
25894----
25895exception Int of int
25896
25897local
25898 open Show
25899in
25900 val () = regExn (fn Int v => SOME (v, C1"Int" int)
25901 | _ => NONE)
25902end
25903----
25904
25905Note that a single handler may actually handle an arbitrary number of
25906different exceptions.
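
The registration mechanism itself can be sketched independently of type-indices. The following simplified version is an assumption-laden illustration: handlers return a string directly rather than a value paired with a type-index, and `showExn` is a hypothetical name:

[source,sml]
----
(* Assumption: handlers produce the final string themselves. *)
val handlers : (exn -> string option) list ref = ref []

fun regExn h = handlers := h :: !handlers

(* Try each registered handler in turn; fall back to the exception name. *)
fun showExn e =
   let
      fun lp [] = "<exn:" ^ General.exnName e ^ ">"
        | lp (h :: hs) =
          (case h e of
              NONE => lp hs
            | SOME s => s)
   in
      lp (!handlers)
   end

exception Int of int

val () = regExn (fn Int v => SOME ("Int " ^ Int.toString v)
                  | _ => NONE)
----

Because handlers are consulted most recently registered first, a later registration can override an earlier one for the same exception.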
25907
25908==== Other types ====
25909
25910Some built-in and standard types typically require special treatment
25911due to their special nature. The most important of these are arrays
25912and references, because cyclic data (ignoring closures) and observable
25913sharing can only be constructed through them.
25914
25915When arrow types are really supported, unlike in this case, they
25916usually need special treatment due to the contravariance of arguments.
25917
25918Lists and vectors require special treatment in the case of `show`,
25919because of their special syntax. This isn't usually the case.
25920
25921The set of base types to support also needs to be considered unless
25922one exports an interface for constructing type-indices for entirely
25923new base types.
25924
25925== Usage ==
25926
25927Before going to the implementation, let's look at some examples. For
25928the following examples, we'll assume a structure binding
25929`Show :> SHOW`. If you want to try the examples immediately, just
25930skip forward to the implementation.
25931
25932To use `show`, one first needs a type-index, which is then given to
25933`show`. To show a list of integers, one would use the type-index
25934`list int`, which has the type `int list Show.t`:
25935[source,sml]
25936----
25937val "[3, 1, 4]" =
25938 let open Show in show (list int) end
25939 [3, 1, 4]
25940----
25941
25942Likewise, to show a list of lists of characters, one would use the
25943type-index `list (list char)`, which has the type `char list list
25944Show.t`:
25945[source,sml]
25946----
25947val "[[#\"a\", #\"b\", #\"c\"], []]" =
25948 let open Show in show (list (list char)) end
25949 [[#"a", #"b", #"c"], []]
25950----
25951
25952Handling standard types is not particularly interesting. It is more
25953interesting to see how user-defined types can be handled. Although
25954the `option` datatype is a standard type, it requires no special
25955support, so we can treat it as a user-defined type. Options can be
25956encoded easily using a sum:
25957[source,sml]
25958----
25959fun option t = let
25960 open Show
25961in
25962 inj (fn NONE => INL ()
25963 | SOME v => INR v)
25964 (data (C0"NONE" + C1"SOME" t))
25965end
25966
25967val "SOME 5" =
25968 let open Show in show (option int) end
25969 (SOME 5)
25970----
25971
25972Readers new to type-indexed values might want to type annotate each
25973subexpression of the above example as an exercise. (Use a compiler to
25974check your annotations.)
25975
Using a product, user-specified records can also be encoded easily:
25977[source,sml]
25978----
25979val abc = let
25980 open Show
25981in
25982 inj (fn {a, b, c} => a & b & c)
25983 (record (L"a" (option int) *
25984 L"b" real *
25985 L"c" bool))
25986end
25987
25988val "{a = SOME 1, b = 3.0, c = false}" =
25989 let open Show in show abc end
25990 {a = SOME 1, b = 3.0, c = false}
25991----
25992
25993As you can see, both of the above use `inj` to inject user-defined
25994types to the general purpose sum and product types.
25995
25996Of particular interest is whether recursive datatypes and cyclic data
25997can be handled. For example, how does one write a type-index for a
25998recursive datatype such as a cyclic graph?
25999[source,sml]
26000----
26001datatype 'a graph = VTX of 'a * 'a graph list ref
26002fun arcs (VTX (_, r)) = r
26003----
26004
26005Using the `Show` combinators, we could first write a new type-index
26006combinator for `graph`:
26007[source,sml]
26008----
26009fun graph a = let
26010 open Tie Show
26011in
26012 fix Y (fn graph_a =>
26013 inj (fn VTX (x, y) => x & y)
26014 (data (C1"VTX"
26015 (tuple (U a *
26016 U (refc (list graph_a)))))))
26017end
26018----
26019
26020To show a graph with integer labels
26021[source,sml]
26022----
26023val a_graph = let
26024 val a = VTX (1, ref [])
26025 val b = VTX (2, ref [])
26026 val c = VTX (3, ref [])
26027 val d = VTX (4, ref [])
26028 val e = VTX (5, ref [])
26029 val f = VTX (6, ref [])
26030in
26031 arcs a := [b, d]
26032 ; arcs b := [c, e]
26033 ; arcs c := [a, f]
26034 ; arcs d := [f]
26035 ; arcs e := [d]
26036 ; arcs f := [e]
26037 ; a
26038end
26039----
26040we could then simply write
26041[source,sml]
26042----
26043val "VTX (1, ref [VTX (2, ref [VTX (3, ref [VTX (1, %0), \
26044 \VTX (6, ref [VTX (5, ref [VTX (4, ref [VTX (6, %3)])])] as %3)]), \
26045 \VTX (5, ref [VTX (4, ref [VTX (6, ref [VTX (5, %2)])])] as %2)]), \
26046 \VTX (4, ref [VTX (6, ref [VTX (5, ref [VTX (4, %1)])])] as %1)] as %0)" =
26047 let open Show in show (graph int) end
26048 a_graph
26049----
26050
26051There is a subtle gotcha with cyclic data. Consider the following code:
26052[source,sml]
26053----
26054exception ExnArray of exn array
26055
26056val () = let
26057 open Show
26058in
26059 regExn (fn ExnArray a =>
26060 SOME (a, C1"ExnArray" (array exn))
26061 | _ => NONE)
26062end
26063
26064val a_cycle = let
26065 val a = Array.fromList [Empty]
26066in
26067 Array.update (a, 0, ExnArray a) ; a
26068end
26069----
26070
26071Although the above looks innocent enough, the evaluation of
26072[source,sml]
26073----
26074val "[|ExnArray %0|] as %0" =
26075 let open Show in show (array exn) end
26076 a_cycle
26077----
26078goes into an infinite loop. To avoid this problem, the type-index
26079`array exn` must be evaluated only once, as in the following:
26080[source,sml]
26081----
26082val array_exn = let open Show in array exn end
26083
26084exception ExnArray of exn array
26085
26086val () = let
26087 open Show
26088in
26089 regExn (fn ExnArray a =>
26090 SOME (a, C1"ExnArray" array_exn)
26091 | _ => NONE)
26092end
26093
26094val a_cycle = let
26095 val a = Array.fromList [Empty]
26096in
26097 Array.update (a, 0, ExnArray a) ; a
26098end
26099
26100val "[|ExnArray %0|] as %0" =
26101 let open Show in show array_exn end
26102 a_cycle
26103----
26104
26105Cyclic data (excluding closures) in Standard ML can only be
26106constructed imperatively through arrays and references (combined with
26107exceptions or recursive datatypes). Before recursing to a reference
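or an array, one needs to check whether that reference or array has
already been seen. When `refc` or `array` is called with a
type-index, a new cyclicity checker is instantiated. Each checker
recognizes only the values it has recorded itself, which is why the
type-index `array exn` in the example above must be evaluated only once.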
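
The check can be sketched in isolation by relying on the fact that references can be compared for identity. The `node` datatype, `showNode`, and the `<cycle>` marker below are assumptions made for this illustration; the implementation that follows uses a different encoding based on exceptions:

[source,sml]
----
datatype node = Node of int * node list ref

(* seen holds the references on the path from the root to this node. *)
fun showNode seen (Node (n, r)) =
   if List.exists (fn r' => r' = r) seen
   then "<cycle>"
   else concat ["Node (", Int.toString n, ", ref [",
                String.concatWith ", " (map (showNode (r :: seen)) (!r)),
                "])"]

(* A node whose only successor is itself. *)
val loop =
   let
      val r : node list ref = ref []
      val n = Node (1, r)
   in
      r := [n] ; n
   end
----

Without the `seen` list, `showNode` would not terminate on `loop`.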
26111
26112== Implementation ==
26113
26114[source,sml]
26115----
26116structure SmlSyntax = struct
26117 local
26118 structure CV = CharVector and C = Char
26119 in
26120 val isSym = Char.contains "!%&$#+-/:<=>?@\\~`^|*"
26121
26122 fun isSymId s = 0 < size s andalso CV.all isSym s
26123
26124 fun isAlphaNumId s =
26125 0 < size s
26126 andalso C.isAlpha (CV.sub (s, 0))
26127 andalso CV.all (fn c => C.isAlphaNum c
26128 orelse #"'" = c
26129 orelse #"_" = c) s
26130
26131 fun isNumLabel s =
26132 0 < size s
26133 andalso #"0" <> CV.sub (s, 0)
26134 andalso CV.all C.isDigit s
26135
26136 fun isId s = isAlphaNumId s orelse isSymId s
26137
26138 fun isLongId s = List.all isId (String.fields (#"." <\ op =) s)
26139
26140 fun isLabel s = isId s orelse isNumLabel s
26141 end
26142end
26143
26144structure Show :> SHOW = struct
26145 datatype 'a t = IN of exn list * 'a -> bool * string
26146 type 'a s = 'a t
26147 type ('a, 'k) p = 'a t
26148 type u = unit
26149 type l = unit
26150
26151 fun show (IN t) x = #2 (t ([], x))
26152
26153 (* user-defined types *)
26154 fun inj inj (IN b) = IN (b o Pair.map (id, inj))
26155
26156 local
26157 fun surround pre suf (_, s) = (false, concat [pre, s, suf])
26158 fun parenthesize x = if #1 x then surround "(" ")" x else x
26159 fun construct tag =
26160 (fn (_, s) => (true, concat [tag, " ", s])) o parenthesize
26161 fun check p m s = if p s then () else raise Fail (m^s)
26162 in
26163 (* tuples and records *)
26164 fun (IN l) * (IN r) =
26165 IN (fn (rs, a & b) =>
26166 (false, concat [#2 (l (rs, a)),
26167 ", ",
26168 #2 (r (rs, b))]))
26169
26170 val U = id
26171 fun L l = (check SmlSyntax.isLabel "Invalid label: " l
26172 ; fn IN t => IN (surround (l^" = ") "" o t))
26173
26174 fun tuple (IN t) = IN (surround "(" ")" o t)
26175 fun record (IN t) = IN (surround "{" "}" o t)
26176
26177 (* datatypes *)
26178 fun (IN l) + (IN r) = IN (fn (rs, INL a) => l (rs, a)
26179 | (rs, INR b) => r (rs, b))
26180
26181 fun C0 c = (check SmlSyntax.isId "Invalid constructor: " c
26182 ; IN (const (false, c)))
26183 fun C1 c (IN t) = (check SmlSyntax.isId "Invalid constructor: " c
26184 ; IN (construct c o t))
26185
26186 val data = id
26187
26188 fun Y ? = Tie.iso Tie.function (fn IN x => x, IN) ?
26189
26190 (* exceptions *)
26191 local
26192 val handlers = ref ([] : (exn -> unit t option) list)
26193 in
26194 val exn = IN (fn (rs, e) => let
26195 fun lp [] =
26196 C0(concat ["<exn:",
26197 General.exnName e,
26198 ">"])
26199 | lp (f::fs) =
26200 case f e
26201 of NONE => lp fs
26202 | SOME t => t
26203 val IN f = lp (!handlers)
26204 in
26205 f (rs, ())
26206 end)
26207 fun regExn f =
26208 handlers := (Option.map
26209 (fn (x, IN f) =>
26210 IN (fn (rs, ()) =>
26211 f (rs, x))) o f)
26212 :: !handlers
26213 end
26214
26215 (* some built-in type constructors *)
26216 local
26217 fun cyclic (IN t) = let
26218 exception E of ''a * bool ref
26219 in
26220 IN (fn (rs, v : ''a) => let
26221 val idx = Int.toString o length
26222 fun lp (E (v', c)::rs) =
26223 if v' <> v then lp rs
26224 else (c := false ; (false, "%"^idx rs))
26225 | lp (_::rs) = lp rs
26226 | lp [] = let
26227 val c = ref true
26228 val r = t (E (v, c)::rs, v)
26229 in
26230 if !c then r
26231 else surround "" (" as %"^idx rs) r
26232 end
26233 in
26234 lp rs
26235 end)
26236 end
26237
26238 fun aggregate pre suf toList (IN t) =
26239 IN (surround pre suf o
26240 (fn (rs, a) =>
26241 (false,
26242 String.concatWith
26243 ", "
26244 (map (#2 o curry t rs)
26245 (toList a)))))
26246 in
26247 fun refc ? = (cyclic o inj ! o C1"ref") ?
26248 fun array ? = (cyclic o aggregate "[|" "|]" (Array.foldr op:: [])) ?
26249 fun list ? = aggregate "[" "]" id ?
26250 fun vector ? = aggregate "#[" "]" (Vector.foldr op:: []) ?
26251 end
26252
26253 fun (IN _) --> (IN _) = IN (const (false, "<fn>"))
26254
26255 (* some built-in base types *)
26256 local
26257 fun mk toS = (fn x => (false, x)) o toS o (fn (_, x) => x)
26258 in
26259 val string =
26260 IN (surround "\"" "\"" o mk (String.translate Char.toString))
26261 val unit = IN (mk (fn () => "()"))
26262 val bool = IN (mk Bool.toString)
26263 val char = IN (surround "#\"" "\"" o mk Char.toString)
26264 val int = IN (mk Int.toString)
26265 val word = IN (surround "0wx" "" o mk Word.toString)
26266 val real = IN (mk Real.toString)
26267 end
26268 end
26269end
26270
26271(* Handlers for standard top-level exceptions *)
26272val () = let
26273 open Show
26274 fun E0 name = SOME ((), C0 name)
26275in
26276 regExn (fn Bind => E0"Bind"
26277 | Chr => E0"Chr"
26278 | Div => E0"Div"
26279 | Domain => E0"Domain"
26280 | Empty => E0"Empty"
26281 | Match => E0"Match"
26282 | Option => E0"Option"
26283 | Overflow => E0"Overflow"
26284 | Size => E0"Size"
26285 | Span => E0"Span"
26286 | Subscript => E0"Subscript"
26287 | _ => NONE)
26288 ; regExn (fn Fail s => SOME (s, C1"Fail" string)
26289 | _ => NONE)
26290end
26291----
26292
26293
26294== Also see ==
26295
26296There are a number of related techniques. Here are some of them.
26297
26298* <:Fold:>
26299* <:StaticSum:>
26300
26301<<<
26302
26303:mlton-guide-page: TypeVariableScope
26304[[TypeVariableScope]]
26305TypeVariableScope
26306=================
26307
26308In <:StandardML:Standard ML>, every type variable is _scoped_ (or
26309bound) at a particular point in the program. A type variable can be
26310either implicitly scoped or explicitly scoped. For example, `'a` is
26311implicitly scoped in
26312
26313[source,sml]
26314----
26315val id: 'a -> 'a = fn x => x
26316----
26317
26318and is implicitly scoped in
26319
26320[source,sml]
26321----
26322val id = fn x: 'a => x
26323----
26324
26325On the other hand, `'a` is explicitly scoped in
26326
26327[source,sml]
26328----
26329val 'a id: 'a -> 'a = fn x => x
26330----
26331
26332and is explicitly scoped in
26333
26334[source,sml]
26335----
26336val 'a id = fn x: 'a => x
26337----
26338
26339A type variable can be scoped at a `val` or `fun` declaration. An SML
26340type checker performs scope inference on each top-level declaration to
26341determine the scope of each implicitly scoped type variable. After
26342scope inference, every type variable is scoped at exactly one
enclosing `val` or `fun` declaration. Scope inference shows that the
first and second examples above are equivalent to the third and fourth
examples, respectively.
26346
26347Section 4.6 of the <:DefinitionOfStandardML:Definition> specifies
26348precisely the scope of an implicitly scoped type variable. A free
26349occurrence of a type variable `'a` in a declaration `d` is said to be
26350_unguarded_ in `d` if `'a` is not part of a smaller declaration. A
26351type variable `'a` is implicitly scoped at `d` if `'a` is unguarded in
26352`d` and `'a` does not occur unguarded in any declaration containing
26353`d`.
26354
26355
26356== Scope inference examples ==
26357
26358* In this example,
26359+
26360[source,sml]
26361----
26362val id: 'a -> 'a = fn x => x
26363----
26364+
26365`'a` is unguarded in `val id` and does not occur unguarded in any
26366containing declaration. Hence, `'a` is scoped at `val id` and the
26367declaration is equivalent to the following.
26368+
26369[source,sml]
26370----
26371val 'a id: 'a -> 'a = fn x => x
26372----
26373
26374* In this example,
26375+
26376[source,sml]
26377----
26378 val f = fn x => let exception E of 'a in E x end
26379----
26380+
26381`'a` is unguarded in `val f` and does not occur unguarded in any
26382containing declaration. Hence, `'a` is scoped at `val f` and the
26383declaration is equivalent to the following.
26384+
26385[source,sml]
26386----
26387val 'a f = fn x => let exception E of 'a in E x end
26388----
26389
26390* In this example (taken from the <:DefinitionOfStandardML:Definition>),
26391+
26392[source,sml]
26393----
26394val x: int -> int = let val id: 'a -> 'a = fn z => z in id id end
26395----
26396+
26397`'a` occurs unguarded in `val id`, but not in `val x`. Hence, `'a` is
26398implicitly scoped at `val id`, and the declaration is equivalent to
26399the following.
26400+
26401[source,sml]
26402----
26403val x: int -> int = let val 'a id: 'a -> 'a = fn z => z in id id end
26404----
26405
26406
26407* In this example,
26408+
26409[source,sml]
26410----
26411val f = (fn x: 'a => x) (fn y => y)
26412----
26413+
26414`'a` occurs unguarded in `val f` and does not occur unguarded in any
26415containing declaration. Hence, `'a` is implicitly scoped at `val f`,
26416and the declaration is equivalent to the following.
26417+
26418[source,sml]
26419----
26420val 'a f = (fn x: 'a => x) (fn y => y)
26421----
26422+
26423This does not type check due to the <:ValueRestriction:>.
26424
26425* In this example,
26426+
26427[source,sml]
26428----
26429fun f x =
26430 let
26431 fun g (y: 'a) = if true then x else y
26432 in
26433 g x
26434 end
26435----
26436+
26437`'a` occurs unguarded in `fun g`, not in `fun f`. Hence, `'a` is
26438implicitly scoped at `fun g`, and the declaration is equivalent to
26439+
26440[source,sml]
26441----
26442fun f x =
26443 let
26444 fun 'a g (y: 'a) = if true then x else y
26445 in
26446 g x
26447 end
26448----
26449+
26450This fails to type check because `x` and `y` must have the same type,
but `x` occurs outside the scope of the type variable `'a`. MLton
26452reports the following error.
26453+
26454----
26455Error: z.sml 3.21-3.41.
26456 Then and else branches disagree.
26457 then: [???]
26458 else: ['a]
26459 in: if true then x else y
26460 note: type would escape its scope: 'a
26461 escape to: z.sml 1.1-6.5
26462----
26463+
26464This problem could be fixed either by adding an explicit type
26465constraint, as in `fun f (x: 'a)`, or by explicitly scoping `'a`, as
26466in `fun 'a f x = ...`.
26467
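Both fixes mentioned in the last example can be sketched as follows.
The names `f1` and `f2` are introduced here only to keep the two
variants apart; they are not part of the original example.

[source,sml]
----
(* Fix 1: constrain x, so that 'a occurs unguarded in fun f1 and is
 * therefore implicitly scoped at the outer declaration. *)
fun f1 (x: 'a) =
   let
      fun g (y: 'a) = if true then x else y
   in
      g x
   end

(* Fix 2: scope 'a explicitly at the outer declaration. *)
fun 'a f2 x =
   let
      fun g (y: 'a) = if true then x else y
   in
      g x
   end
----

In both variants `'a` is scoped at the outer function, so `g` is
monomorphic in `'a` and `x` no longer escapes the scope of the type
variable.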
26468
26469== Restrictions on type variable scope ==
26470
26471It is not allowed to scope a type variable within a declaration in
26472which it is already in scope (see the last restriction listed on page
264739 of the <:DefinitionOfStandardML:Definition>). For example, the
26474following program is invalid.
26475
26476[source,sml]
26477----
26478fun 'a f (x: 'a) =
26479 let
26480 fun 'a g (y: 'a) = y
26481 in
26482 ()
26483 end
26484----
26485
26486MLton reports the following error.
26487
26488----
26489Error: z.sml 3.11-3.12.
26490 Type variable scoped at an outer declaration: 'a.
26491 scoped at: z.sml 1.1-6.6
26492----
26493
26494This is an error even if the scoping is implicit. That is, the
26495following program is invalid as well.
26496
26497[source,sml]
26498----
26499fun f (x: 'a) =
26500 let
26501 fun 'a g (y: 'a) = y
26502 in
26503 ()
26504 end
26505----
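
One way to repair either program, sketched here under the assumption
that the inner function really is meant to be polymorphic on its own,
is to give the inner declaration a type variable with a different name
(`'b` below is an arbitrary choice).

[source,sml]
----
fun 'a f (x: 'a) =
   let
      (* 'b is not yet in scope here, so scoping it at g is allowed. *)
      fun 'b g (y: 'b) = y
   in
      g x
   end
----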
26506
26507<<<
26508
26509:mlton-guide-page: Unicode
26510[[Unicode]]
26511Unicode
26512=======
26513
26514== Support in The Definition of Standard ML ==
26515
26516There is no real support for Unicode in the
26517<:DefinitionOfStandardML:Definition>; there are only a few throw-away
26518sentences along the lines of "the characters with numbers 0 to 127
26519coincide with the ASCII character set."
26520
26521== Support in The Standard ML Basis Library ==
26522
26523Neither is there real support for Unicode in the <:BasisLibrary:Basis
26524Library>. The general consensus (which includes the opinions of the
26525editors of the Basis Library) is that the `WideChar` and `WideString`
26526structures are insufficient for the purposes of Unicode. There is no
26527`LargeChar` structure, which in itself is a deficiency, since a
programmer cannot program against the largest supported character
26529size.
26530
26531== Current Support in MLton ==
26532
26533MLton, as a minor extension over the Definition, supports UTF-8 byte
26534sequences in text constants. This feature enables "UTF-8 convenience"
26535(but not comprehensive Unicode support); in particular, it allows one
26536to copy text from a browser and paste it into a string constant in an
editor and, furthermore, if the string is printed to a terminal, then
it will (typically) appear as the original text. See the
26539<:SuccessorML#ExtendedTextConsts:extended text constants feature of
26540Successor ML> for more details.
26541
26542MLton, also as a minor extension over the Definition, supports
26543`\Uxxxxxxxx` numeric escapes in text constants and has preliminary
26544internal support for 16- and 32-bit characters and strings.
26545
26546MLton provides `WideChar` and `WideString` structures, corresponding
26547to 32-bit characters and strings, respectively.
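
As a rough illustration of what "UTF-8 convenience" means for the
ordinary `string` type, the following sketch spells out the UTF-8
encoding of U+03BB (Greek small letter lambda) with standard decimal
escapes, so it does not depend on any extension. With extended text
constants enabled, writing the character directly inside the string
constant should denote the same two bytes; that equivalence is an
assumption based on the description above, not something checked here.

[source,sml]
----
(* U+03BB is encoded in UTF-8 as the bytes 0xCE 0xBB (206, 187). *)
val lambda = "\206\187"

(* size counts bytes (8-bit chars), not Unicode code points. *)
val n = size lambda

(* The individual byte values of the string. *)
val bytes = map Char.ord (explode lambda)

(* On a UTF-8 terminal this typically displays a single lambda. *)
val () = print (lambda ^ "\n")
----

Here `n` is `2` and `bytes` is `[206, 187]`: the string holds a byte
sequence, and operations such as `size` and `String.sub` work on bytes
rather than on characters.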
26548
26549== Questions and Discussions ==
26550
26551There are periodic flurries of questions and discussion about Unicode
26552in MLton/SML. In December 2004, there was a discussion that led to
26553some seemingly sound design decisions. The discussion started at:
26554
26555 * http://www.mlton.org/pipermail/mlton/2004-December/026396.html
26556
26557There is a good summary of points at:
26558
26559 * http://www.mlton.org/pipermail/mlton/2004-December/026440.html
26560
26561In November 2005, there was a followup discussion and the beginning of
26562some coding.
26563
26564 * http://www.mlton.org/pipermail/mlton/2005-November/028300.html
26565
26566== Also see ==
26567
26568The <:fxp:> XML parser has some support for dealing with Unicode
26569documents.
26570
26571<<<
26572
26573:mlton-guide-page: UniversalType
26574[[UniversalType]]
26575UniversalType
26576=============
26577
26578A universal type is a type into which all other types can be embedded.
26579Here's a <:StandardML:Standard ML> signature for a universal type.
26580
26581[source,sml]
26582----
26583signature UNIVERSAL_TYPE =
26584 sig
26585 type t
26586
26587 val embed: unit -> ('a -> t) * (t -> 'a option)
26588 end
26589----
26590
26591The idea is that `type t` is the universal type and that each call to
26592`embed` returns a new pair of functions `(inject, project)`, where
26593`inject` embeds a value into the universal type and `project` extracts
26594the value from the universal type. A pair `(inject, project)`
26595returned by `embed` works together in that `project u` will return
26596`SOME v` if and only if `u` was created by `inject v`. If `u` was
26597created by a different function `inject'`, then `project` returns
26598`NONE`.
26599
26600Here's an example embedding integers and reals into a universal type.
26601
26602[source,sml]
26603----
26604functor Test (U: UNIVERSAL_TYPE): sig end =
26605 struct
26606 val (intIn: int -> U.t, intOut) = U.embed ()
26607 val r: U.t ref = ref (intIn 13)
26608 val s1 =
26609 case intOut (!r) of
26610 NONE => "NONE"
26611 | SOME i => Int.toString i
26612 val (realIn: real -> U.t, realOut) = U.embed ()
26613 val () = r := realIn 13.0
26614 val s2 =
26615 case intOut (!r) of
26616 NONE => "NONE"
26617 | SOME i => Int.toString i
26618 val s3 =
26619 case realOut (!r) of
26620 NONE => "NONE"
26621 | SOME x => Real.toString x
26622 val () = print (concat [s1, " ", s2, " ", s3, "\n"])
26623 end
26624----
26625
26626Applying `Test` to an appropriate implementation will print
26627
26628----
2662913 NONE 13.0
26630----
26631
Note that two different calls to `embed` on the same type return
26633different embeddings.
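
A minimal, self-contained sketch of this point follows. It repeats the
signature and, so that it can run on its own, uses the exception-based
implementation described in the next section; the names `in1`, `out1`,
`in2`, and `out2` are illustrative only.

[source,sml]
----
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

structure U:> UNIVERSAL_TYPE =
   struct
      type t = exn
      fun 'a embed () =
         let
            exception E of 'a
            fun project (e: t): 'a option =
               case e of
                  E a => SOME a
                | _ => NONE
         in
            (E, project)
         end
   end

(* Two separate embeddings, both for int. *)
val (in1: int -> U.t, out1) = U.embed ()
val (in2: int -> U.t, out2) = U.embed ()

val u = in1 13
val a = out1 u   (* projection from the matching embedding *)
val b = out2 u   (* projection from the other embedding *)
----

Here `a` is `SOME 13` while `b` is `NONE`, even though both embeddings
are for `int`.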
26634
26635Standard ML does not have explicit support for universal types;
26636however, there are at least two ways to implement them.
26637
26638
26639== Implementation Using Exceptions ==
26640
26641While the intended use of SML exceptions is for exception handling, an
26642accidental feature of their design is that the `exn` type is a
26643universal type. The implementation relies on being able to declare
26644exceptions locally to a function and on the fact that exceptions are
26645<:GenerativeException:generative>.
26646
26647[source,sml]
26648----
26649structure U:> UNIVERSAL_TYPE =
26650 struct
26651 type t = exn
26652
26653 fun 'a embed () =
26654 let
26655 exception E of 'a
26656 fun project (e: t): 'a option =
26657 case e of
26658 E a => SOME a
26659 | _ => NONE
26660 in
26661 (E, project)
26662 end
26663 end
26664----
26665
26666
26667== Implementation Using Functions and References ==
26668
26669[source,sml]
26670----
26671structure U:> UNIVERSAL_TYPE =
26672 struct
26673 datatype t = T of {clear: unit -> unit,
26674 store: unit -> unit}
26675
26676 fun 'a embed () =
26677 let
26678 val r: 'a option ref = ref NONE
26679 fun inject (a: 'a): t =
26680 T {clear = fn () => r := NONE,
26681 store = fn () => r := SOME a}
26682 fun project (T {clear, store}): 'a option =
26683 let
26684 val () = store ()
26685 val res = !r
26686 val () = clear ()
26687 in
26688 res
26689 end
26690 in
26691 (inject, project)
26692 end
26693 end
26694----
26695
26696Note that due to the use of a shared ref cell, the above
26697implementation is not thread safe.
26698
26699One could try to simplify the above implementation by eliminating the
26700`clear` function, making `type t = unit -> unit`.
26701
26702[source,sml]
26703----
26704structure U:> UNIVERSAL_TYPE =
26705 struct
26706 type t = unit -> unit
26707
26708 fun 'a embed () =
26709 let
26710 val r: 'a option ref = ref NONE
26711 fun inject (a: 'a): t = fn () => r := SOME a
26712 fun project (f: t): 'a option = (r := NONE; f (); !r)
26713 in
26714 (inject, project)
26715 end
26716 end
26717----
26718
26719While correct, this approach keeps the contents of the ref cell alive
26720longer than necessary, which could cause a space leak. The problem is
26721in `project`, where the call to `f` stores some value in some ref cell
26722`r'`. Perhaps `r'` is the same ref cell as `r`, but perhaps not. If
26723we do not clear `r'` before returning from `project`, then `r'` will
26724keep the value alive, even though it is useless.
26725
26726
26727== Also see ==
26728
26729* <:PropertyList:>: Lisp-style property lists implemented with a universal type
26730
26731<<<
26732
26733:mlton-guide-page: UnresolvedBugs
26734[[UnresolvedBugs]]
26735UnresolvedBugs
26736==============
26737
26738Here are the places where MLton deviates from
26739<:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and
26740the <:BasisLibrary:Basis Library>. In general, MLton complies with
26741the <:DefinitionOfStandardML:Definition> quite closely, typically much
26742more closely than other SML compilers (see, e.g., our list of
26743<:SMLNJDeviations:SML/NJ's deviations>). In fact, the four deviations
26744listed here are the only known deviations, and we have no immediate
26745plans to fix them. If you find a deviation not listed here, please
26746report a <:Bug:>.
26747
26748We don't plan to fix these bugs because the first (parsing nested
26749cases) has historically never been accepted by any SML compiler, the
26750second clearly indicates a problem in the
<:DefinitionOfStandardML:Definition>, and the remaining two are difficult
to resolve in the context of MLton's implementation of Standard ML (and
unlikely to be problematic in practice).
26754
26755* MLton does not correctly parse case expressions nested within other
26756matches. For example, the following fails.
26757+
26758[source,sml]
26759----
26760fun f 0 y =
26761 case x of
26762 1 => 2
26763 | _ => 3
26764 | f _ y = 4
26765----
26766+
26767To do this in a program, simply parenthesize the case expression.
26768+
26769Allowing such expressions, although compliant with the Definition,
26770would be a mistake, since using parentheses is clearer and no SML
26771compiler has ever allowed them. Furthermore, implementing this would
26772require serious yacc grammar rewriting followed by postprocessing.
26773
26774* MLton does not raise the `Bind` exception at run time when
26775evaluating `val rec` (and `fun`) declarations that redefine
26776identifiers that previously had constructor status. (By default,
26777MLton does warn at compile time about `val rec` (and `fun`)
declarations that redefine identifiers that previously had
constructor status; see the `valrecConstr` <:MLBasisAnnotations:ML
Basis annotation>.) For example, the Definition requires the
following program to type check, but also (bizarrely) requires it to
raise the `Bind` exception.
26783+
26784[source,sml]
26785----
26786val rec NONE = fn () => ()
26787----
26788+
26789The Definition's behavior is obviously an error, a mismatch between
26790the static semantics (rule 26) and the dynamic semantics (rule 126).
26791Given the comments on rule 26 in the Definition, it seems clear that
26792the authors meant for `val rec` to allow an identifier's constructor
26793status to be overridden both statically and dynamically. Hence, MLton
26794and most SML compilers follow rule 26, but do not follow rule 126.
26795
26796* MLton does not hide the equality aspect of types declared in
26797`abstype` declarations. So, MLton accepts programs like the following,
26798while the Definition rejects them.
26799+
26800[source,sml]
26801----
26802abstype t = T with end
26803val _ = fn (t1, t2 : t) => t1 = t2
26804
26805abstype t = T with val a = T end
26806val _ = a = a
26807----
26808+
26809One consequence of this choice is that MLton accepts the following
26810program, in accordance with the Definition.
26811+
26812[source,sml]
26813----
26814abstype t = T with val eq = op = end
26815val _ = fn (t1, t2 : t) => eq (t1, t2)
26816----
26817+
26818Other implementations will typically reject this program, because they
26819make an early choice for the type of `eq` to be `''a * ''a -> bool`
26820instead of `t * t -> bool`. The choice is understandable, since the
26821Definition accepts the following program.
26822+
26823[source,sml]
26824----
26825abstype t = T with val eq = op = end
26826val _ = eq (1, 2)
26827----
26829
26830* MLton (re-)type checks each functor definition at every
26831corresponding functor application (the compilation technique of
26832defunctorization). One consequence of this implementation is that
26833MLton accepts the following program, while the Definition rejects
26834it.
26835+
26836[source,sml]
26837----
26838functor F (X: sig type t end) = struct
26839 val f = id id
26840end
26841structure A = F (struct type t = int end)
26842structure B = F (struct type t = bool end)
26843val _ = A.f 10
26844val _ = B.f "dude"
26845----
26846+
26847On the other hand, other implementations will typically reject the
26848following program, while MLton and the Definition accept it.
26849+
26850[source,sml]
26851----
26852functor F (X: sig type t end) = struct
26853 val f = id id
26854end
26855structure A = F (struct type t = int end)
26856structure B = F (struct type t = bool end)
26857val _ = A.f 10
26858val _ = B.f false
26859----
26860+
26861See <!Cite(DreyerBlume07)> for more details.
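
For the first deviation in the list above, the parenthesized workaround
might look like the following sketch. It scrutinizes the bound argument
`y` (rather than the `x` of the original fragment) purely so that the
function is complete and can be compiled on its own.

[source,sml]
----
fun f 0 y =
      (case y of
          1 => 2
        | _ => 3)
  | f _ y = 4
----

With the parentheses in place, the final clause is read as a clause of
`f` rather than as a further match rule of the `case` expression.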
26862
26863<<<
26864
26865:mlton-guide-page: UnsafeStructure
26866[[UnsafeStructure]]
26867UnsafeStructure
26868===============
26869
This module is a subset of the `Unsafe` module provided by SML/NJ,
with a few extra operations for `PackWord` and `PackReal`.
26872
26873[source,sml]
26874----
26875signature UNSAFE_MONO_ARRAY =
26876 sig
26877 type array
26878 type elem
26879
26880 val create: int -> array
26881 val sub: array * int -> elem
26882 val update: array * int * elem -> unit
26883 end
26884
26885signature UNSAFE_MONO_VECTOR =
26886 sig
26887 type elem
26888 type vector
26889
26890 val sub: vector * int -> elem
26891 end
26892
26893signature UNSAFE =
26894 sig
26895 structure Array:
26896 sig
26897 val create: int * 'a -> 'a array
26898 val sub: 'a array * int -> 'a
26899 val update: 'a array * int * 'a -> unit
26900 end
26901 structure CharArray: UNSAFE_MONO_ARRAY
26902 structure CharVector: UNSAFE_MONO_VECTOR
26903 structure IntArray: UNSAFE_MONO_ARRAY
26904 structure IntVector: UNSAFE_MONO_VECTOR
26905 structure Int8Array: UNSAFE_MONO_ARRAY
26906 structure Int8Vector: UNSAFE_MONO_VECTOR
26907 structure Int16Array: UNSAFE_MONO_ARRAY
26908 structure Int16Vector: UNSAFE_MONO_VECTOR
26909 structure Int32Array: UNSAFE_MONO_ARRAY
26910 structure Int32Vector: UNSAFE_MONO_VECTOR
26911 structure Int64Array: UNSAFE_MONO_ARRAY
26912 structure Int64Vector: UNSAFE_MONO_VECTOR
26913 structure IntInfArray: UNSAFE_MONO_ARRAY
26914 structure IntInfVector: UNSAFE_MONO_VECTOR
26915 structure LargeIntArray: UNSAFE_MONO_ARRAY
26916 structure LargeIntVector: UNSAFE_MONO_VECTOR
26917 structure LargeRealArray: UNSAFE_MONO_ARRAY
26918 structure LargeRealVector: UNSAFE_MONO_VECTOR
26919 structure LargeWordArray: UNSAFE_MONO_ARRAY
26920 structure LargeWordVector: UNSAFE_MONO_VECTOR
26921 structure RealArray: UNSAFE_MONO_ARRAY
26922 structure RealVector: UNSAFE_MONO_VECTOR
26923 structure Real32Array: UNSAFE_MONO_ARRAY
26924 structure Real32Vector: UNSAFE_MONO_VECTOR
26925 structure Real64Array: UNSAFE_MONO_ARRAY
26926 structure Vector:
26927 sig
26928 val sub: 'a vector * int -> 'a
26929 end
26930 structure Word8Array: UNSAFE_MONO_ARRAY
26931 structure Word8Vector: UNSAFE_MONO_VECTOR
26932 structure Word16Array: UNSAFE_MONO_ARRAY
26933 structure Word16Vector: UNSAFE_MONO_VECTOR
26934 structure Word32Array: UNSAFE_MONO_ARRAY
26935 structure Word32Vector: UNSAFE_MONO_VECTOR
26936 structure Word64Array: UNSAFE_MONO_ARRAY
26937 structure Word64Vector: UNSAFE_MONO_VECTOR
26938
26939 structure PackReal32Big : PACK_REAL
26940 structure PackReal32Little : PACK_REAL
26941 structure PackReal64Big : PACK_REAL
26942 structure PackReal64Little : PACK_REAL
26943 structure PackRealBig : PACK_REAL
26944 structure PackRealLittle : PACK_REAL
26945 structure PackWord16Big : PACK_WORD
26946 structure PackWord16Little : PACK_WORD
26947 structure PackWord32Big : PACK_WORD
26948 structure PackWord32Little : PACK_WORD
26949 structure PackWord64Big : PACK_WORD
26950 structure PackWord64Little : PACK_WORD
26951 end
26952----
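
As a hedged sketch of how such operations are typically used, the
function below sums an array with `Unsafe.Array.sub`. It assumes that
the `Unsafe` structure is in scope in the compilation environment and
that the unsafe operations skip the bounds check their safe
counterparts perform (the usual meaning of the name, not stated above);
the helper `sum` is hypothetical.

[source,sml]
----
(* The loop only visits indices 0 <= i < Array.length a, so the
 * unchecked access never reads out of bounds. *)
fun sum (a: int array): int =
   let
      val n = Array.length a
      fun loop (i, acc) =
         if i >= n
            then acc
         else loop (i + 1, acc + Unsafe.Array.sub (a, i))
   in
      loop (0, 0)
   end

val total = sum (Array.fromList [1, 2, 3, 4])
----

The caller, rather than the library, is responsible for establishing
that every index is valid; an invalid index leads to undefined behavior
instead of a `Subscript` exception.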
26953
26954<<<
26955
26956:mlton-guide-page: Useless
26957[[Useless]]
26958Useless
26959=======
26960
26961<:Useless:> is an optimization pass for the <:SSA:>
26962<:IntermediateLanguage:>, invoked from <:SSASimplify:>.
26963
26964== Description ==
26965
26966This pass:
26967
26968* removes components of tuples that are constants (use unification)
26969* removes function arguments that are constants
26970* builds some kind of dependence graph where
26971** a value of ground type is useful if it is an arg to a primitive
26972** a tuple is useful if it contains a useful component
26973** a constructor is useful if it contains a useful component or is used in a `Case` transfer
26974
26975If a useful tuple is coerced to another useful tuple, then all of
26976their components must agree (exactly). It is trivial to convert a
26977useful value to a useless one.
26978
26979== Implementation ==
26980
26981* <!ViewGitFile(mlton,master,mlton/ssa/useless.fun)>
26982
26983== Details and Notes ==
26984
26985It is also trivial to convert a useful tuple to one of its useful
26986components -- but this seems hard.
26987
26988Suppose that you have a `ref`/`array`/`vector` that is useful, but the
26989components aren't -- then the components are converted to type `unit`,
26990and any primitive args must be as well.
26991
26992Unify all handler arguments so that `raise`/`handle` has a consistent
26993calling convention.
26994
26995<<<
26996
26997:mlton-guide-page: Users
26998[[Users]]
26999Users
27000=====
27001
27002Here is a list of companies, projects, and courses that use or have
27003used MLton. If you use MLton and are not here, please add your
27004project with a brief description and a link. Thanks.
27005
27006== Companies ==
27007
27008* http://www.hardcoreprocessing.com/[Hardcore Processing] uses MLton as a http://www.hardcoreprocessing.com/Freeware/MLTonWin32.html[crosscompiler from Linux to Windows] for graphics and game software.
27009** http://www.cex3d.net/[CEX3D Converter], a conversion program for 3D objects.
27010** http://www.hardcoreprocessing.com/company/showreel/index.html[Interactive Showreel], which contains a crossplatform GUI-toolkit and a realtime renderer for a subset of RenderMan written in Standard ML.
27011** various http://www.hardcoreprocessing.com/entertainment/index.html[games]
27012* http://www.mathworks.com/products/polyspace/[MathWorks/PolySpace Technologies] builds their product that detects runtime errors in embedded systems based on abstract interpretation.
27013// * http://www.sourcelight.com/[Sourcelight Technologies] uses MLton internally for prototyping and for processing databases as part of their system that makes personalized movie recommen
27014* http://www.reactive-systems.com/[Reactive Systems] uses MLton to build Reactis, a model-based testing and validation package used in the automotive and aerospace industries.
27015
27016== Projects ==
27017
* http://www-ia.hiof.no/%7Erolando/adate_intro.html[ADATE], Automatic Design of Algorithms Through Evolution, a system for automatic programming, i.e., inductive inference of algorithms. ADATE can automatically generate non-trivial and novel algorithms written in Standard ML.
27019* http://types.bu.edu/reports/Dim+Wes+Mul+Tur+Wel+Con:TIC-2000-LNCS.html[CIL], a compiler for SML based on intersection and union types.
27020* http://www.cs.cmu.edu/%7Econcert/[ConCert], a project investigating certified code for grid computing.
27021* http://hcoop.sourceforge.net/[Cooperative Internet hosting tools]
27022// * http://www.eecs.harvard.edu/%7Estein/[DesynchFS], a programming model and distributed file system for large clusters
27023* http://www.fantasy-coders.de/projects/gh/[Guugelhupf], a simple search engine.
27024* http://www.mpi-sws.org/%7Erossberg/hamlet/[HaMLet], a model implementation of Standard ML.
27025* http://code.google.com/p/kepler-code/[KeplerCode], independent verification of the computational aspects of proofs of the Kepler conjecture and the Dodecahedral conjecture.
27026* http://www.gilith.com/research/metis/[Metis], a first-order prover (used in the http://hol.sourceforge.net/[HOL4 theorem prover] and the http://isabelle.in.tum.de/[Isabelle theorem prover]).
27027* http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/mlftpd/[mlftpd], an ftp daemon written in SML. <:TomMurphy:> is also working on http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/[replacements for standard network services] in SML. He also uses MLton to build his entries (http://www.cs.cmu.edu/%7Etom7/icfp2001/[2001], http://www.cs.cmu.edu/%7Etom7/icfp2002/[2002], http://www.cs.cmu.edu/%7Etom7/icfp2004/[2004], http://www.cs.cmu.edu/%7Etom7/icfp2005/[2005]) in the annual ICFP programming contest.
27028* http://www.informatik.uni-freiburg.de/proglang/research/software/mlope/[MLOPE], an offline partial evaluator for Standard ML.
* http://www.ida.liu.se/%7Epelab/rml/[RML], a system for developing, compiling, debugging, and teaching structural operational semantics (SOS) and natural semantics specifications.
27030* http://www.macs.hw.ac.uk/ultra/skalpel/index.html[Skalpel], a type-error slicer for SML
27031// * http://alleystoughton.us/smlnjtrans/[SMLNJtrans], a program for generating SML/NJ transcripts in LaTeX.
27032* http://www.cs.cmu.edu/%7Etom7/ssapre/[SSA PRE], an implementation of Partial Redundancy Elimination for MLton.
27033* <:Stabilizers:>, a modular checkpointing abstraction for concurrent functional programs.
27034* http://ttic.uchicago.edu/%7Epl/sa-sml/[Self-Adjusting SML], self-adjusting computation, a model of computing where programs can automatically adjust to changes to their data.
27035* http://faculty.ist.unomaha.edu/winter/ShiftLab/TL_web/TL_index.html[TL System], providing general-purpose support for rewrite-based transformation over elements belonging to a (user-defined) domain language.
27036* http://projects.laas.fr/tina/[Tina] (Time Petri net Analyzer)
* http://www.twelf.org/[Twelf], an implementation of the LF logical framework.
* http://www.cs.indiana.edu/%7Errnewton/wavescope/[WaveScript/WaveScope], a sensor network project; the WaveScript compiler can generate SML (MLton) code.
27039
27040== Courses ==
27041
27042* http://www.eecs.harvard.edu/%7Enr/cs152/[Harvard CS-152], undergraduate programming languages.
27043* http://www.ia-stud.hiof.no/%7Erolando/PL/[Høgskolen i Østfold IAI30202], programming languages.
27044
27045<<<
27046
27047:mlton-guide-page: Utilities
27048[[Utilities]]
27049Utilities
27050=========
27051
27052This page is a collection of basic utilities used in the examples on
27053various pages. See
27054
27055 * <:InfixingOperators:>, and
27056 * <:ProductType:>
27057
27058for longer discussions on some of these utilities.
27059
27060[source,sml]
27061----
27062(* Operator precedence table *)
27063infix 8 * / div mod (* +1 from Basis Library *)
27064infix 7 + - ^ (* +1 from Basis Library *)
27065infixr 6 :: @ (* +1 from Basis Library *)
27066infix 5 = <> > >= < <= (* +1 from Basis Library *)
27067infix 4 <\ \>
27068infixr 4 </ />
27069infix 3 o
27070infix 2 >|
27071infixr 2 |<
27072infix 1 := (* -2 from Basis Library *)
27073infix 0 before &
27074
27075(* Some basic combinators *)
27076fun const x _ = x
27077fun cross (f, g) (x, y) = (f x, g y)
27078fun curry f x y = f (x, y)
27079fun fail e _ = raise e
27080fun id x = x
27081
27082(* Product type *)
27083datatype ('a, 'b) product = & of 'a * 'b
27084
27085(* Sum type *)
27086datatype ('a, 'b) sum = INL of 'a | INR of 'b
27087
27088(* Some type shorthands *)
27089type 'a uop = 'a -> 'a
27090type 'a fix = 'a uop -> 'a
27091type 'a thunk = unit -> 'a
27092type 'a effect = 'a -> unit
27093type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a)
27094
27095(* Infixing, sectioning, and application operators *)
27096fun x <\ f = fn y => f (x, y)
27097fun f \> y = f y
27098fun f /> y = fn x => f (x, y)
27099fun x </ f = f x
27100
27101(* Piping operators *)
27102val op>| = op</
27103val op|< = op\>
27104----
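
The following sketch shows how the infixing, sectioning, and piping
operators might be used. To stay self-contained it repeats only the
fixity declarations and definitions it needs from the listing above;
the value names (`q`, `r`, `dec`, `from10`, `piped`, `applied`) are
illustrative.

[source,sml]
----
infix  4 <\ \>
infixr 4 </ />
infix  2 >|
infixr 2 |<

fun x <\ f = fn y => f (x, y)
fun f \> y = f y
fun f /> y = fn x => f (x, y)
fun x </ f = f x
val op>| = op</
val op|< = op\>

(* Infixing a binary function: both forms call f (x, y). *)
val q = 10 <\Int.div\> 3   (* Int.div (10, 3) *)
val r = 10 </Int.mod/> 3   (* Int.mod (10, 3) *)

(* Sectioning: fix one argument of a binary function. *)
val dec = op- /> 1         (* fn x => x - 1 *)
val from10 = 10 <\ op-     (* fn y => 10 - y *)

(* Piping: left-to-right and right-to-left application. *)
val piped = 5 >| dec >| Int.toString
val applied = Int.toString |< from10 |< 3
----

Here `q` is `3`, `r` is `1`, `piped` is `"4"`, and `applied` is `"7"`.
Because `<\` and `</` share a precedence level but associate in
opposite directions, they should not be mixed in one expression without
parentheses.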
27105
27106<<<
27107
27108:mlton-guide-page: ValueRestriction
27109[[ValueRestriction]]
27110ValueRestriction
27111================
27112
27113The value restriction is a rule that governs when type inference is
27114allowed to polymorphically generalize a value declaration. In short,
the value restriction says that generalization can only occur if the
right-hand side of the declaration is syntactically a value. For
27117example, in
27118
27119[source,sml]
27120----
27121val f = fn x => x
27122val _ = (f "foo"; f 13)
27123----
27124
27125the expression `fn x => x` is syntactically a value, so `f` has
27126polymorphic type `'a -> 'a` and both calls to `f` type check. On the
27127other hand, in
27128
27129[source,sml]
27130----
27131val f = let in fn x => x end
27132val _ = (f "foo"; f 13)
27133----
27134
the expression `let in fn x => x end` is not syntactically a value
27136and so `f` can either have type `int -> int` or `string -> string`,
27137but not `'a -> 'a`. Hence, the program does not type check.
27138
27139<:DefinitionOfStandardML:The Definition of Standard ML> spells out
27140precisely which expressions are syntactic values (it refers to such
27141expressions as _non-expansive_). An expression is a value if it is of
27142one of the following forms.
27143
27144* a constant (`13`, `"foo"`, `13.0`, ...)
27145* a variable (`x`, `y`, ...)
27146* a function (`fn x => e`)
27147* the application of a constructor other than `ref` to a value (`Foo v`)
27148* a type constrained value (`v: t`)
27149* a tuple in which each field is a value `(v1, v2, ...)`
27150* a record in which each field is a value `{l1 = v1, l2 = v2, ...}`
27151* a list in which each element is a value `[v1, v2, ...]`
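
As a rough illustration of these forms (the names are made up for this
sketch), each right-hand side in the following declarations is a
syntactic value, so each binding is generalized and can be used at
more than one type.

[source,sml]
----
(* Each right-hand side is built only from the value forms listed above. *)
val empty = []                       (* empty list: 'a list *)
val pair = ([], fn x => x)           (* tuple of values *)
val some = SOME (fn x => x)          (* constructor applied to a value *)
val rcd = {id = fn x => x, xs = []}  (* record of values *)

(* Because the bindings are polymorphic, all of these uses type check. *)
val ints = 1 :: empty
val strs = "a" :: empty
val n = #id rcd 13
val s = #id rcd "foo"
----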
27152
27153
27154== Why the value restriction exists ==
27155
27156The value restriction prevents a ref cell (or an array) from holding
27157values of different types, which would allow a value of one type to be
27158cast to another and hence would break type safety. If the restriction
27159were not in place, the following program would type check.
27160
27161[source,sml]
27162----
27163val r: 'a option ref = ref NONE
27164val r1: string option ref = r
27165val r2: int option ref = r
27166val () = r1 := SOME "foo"
27167val v: int = valOf (!r2)
27168----
27169
27170The first line violates the value restriction because `ref NONE` is
27171not a value. All other lines are type correct. By its last line, the
27172program has cast the string `"foo"` to an integer. This breaks type
27173safety, because now we can add a string to an integer with an
27174expression like `v + 13`. We could even be more devious, by adding
the following two lines, which allow us to treat the string `"foo"`
27176as a function.
27177
27178[source,sml]
27179----
27180val r3: (int -> int) option ref = r
27181val v: int -> int = valOf (!r3)
27182----
27183
27184Eliminating the explicit `ref` does nothing to fix the problem. For
27185example, we could replace the declaration of `r` with the following.
27186
27187[source,sml]
27188----
27189val f: unit -> 'a option ref = fn () => ref NONE
27190val r: 'a option ref = f ()
27191----
27192
27193The declaration of `f` is well typed, while the declaration of `r`
27194violates the value restriction because `f ()` is not a value.
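
By contrast, a ref cell at a single, fixed type is unproblematic. A
minimal sketch, assuming we only ever want to store integers in `r`:

[source,sml]
----
(* `ref NONE` is still expansive, but nothing needs to be generalized
   because the constraint fixes the type to int option ref. *)
val r: int option ref = ref NONE
val () = r := SOME 13
val v: int = valOf (!r)   (* v is 13 *)
----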
27195
27196
27197== Unnecessarily rejected programs ==
27198
27199Unfortunately, the value restriction rejects some programs that could
27200be accepted.
27201
27202[source,sml]
27203----
27204val id: 'a -> 'a = fn x => x
27205val f: 'a -> 'a = id id
27206----
27207
27208The type constraint on `f` requires `f` to be polymorphic, which is
27209disallowed because `id id` is not a value. MLton reports the
27210following type error.
27211
27212----
27213Error: z.sml 2.5-2.5.
27214 Type of variable cannot be generalized in expansive declaration: f.
27215 type: ['a] -> ['a]
27216 in: val 'a f: ('a -> 'a) = id id
27217----
27218
27219MLton indicates the inability to make `f` polymorphic by saying that
the type of `f` cannot be generalized (made polymorphic) because its
27221declaration is expansive (not a value). MLton doesn't explicitly
27222mention the value restriction, but that is the reason. If we leave
27223the type constraint off of `f`
27224
27225[source,sml]
27226----
27227val id: 'a -> 'a = fn x => x
27228val f = id id
27229----
27230
27231then the program succeeds; however, MLton gives us the following
27232warning.
27233
27234----
27235Warning: z.sml 2.5-2.5.
27236 Type of variable was not inferred and could not be generalized: f.
27237 type: ??? -> ???
27238 in: val f = id id
27239----
27240
27241This warning indicates that MLton couldn't polymorphically generalize
27242`f`, nor was there enough context using `f` to determine its type.
This in itself is not a type error, but it is a hint that something
27244is wrong with our program. Using `f` provides enough context to
27245eliminate the warning.
27246
27247[source,sml]
27248----
27249val id: 'a -> 'a = fn x => x
27250val f = id id
27251val _ = f 13
27252----
27253
27254But attempting to use `f` as a polymorphic function will fail.
27255
27256[source,sml]
27257----
27258val id: 'a -> 'a = fn x => x
27259val f = id id
27260val _ = f 13
27261val _ = f "foo"
27262----
27263
27264----
27265Error: z.sml 4.9-4.15.
27266 Function applied to incorrect argument.
27267 expects: [int]
27268 but got: [string]
27269 in: f "foo"
27270----
27271
27272
27273== Alternatives to the value restriction ==
27274
27275There would be nothing wrong with treating `f` as polymorphic in
27276
27277[source,sml]
27278----
27279val id: 'a -> 'a = fn x => x
27280val f = id id
27281----
27282
27283One might think that the value restriction could be relaxed, and that
27284only types involving `ref` should be disallowed. Unfortunately, the
27285following example shows that even the type `'a -> 'a` can cause
27286problems. If this program were allowed, then we could cast an integer
27287to a string (or any other type).
27288
27289[source,sml]
27290----
27291val f: 'a -> 'a =
27292 let
27293 val r: 'a option ref = ref NONE
27294 in
27295 fn x =>
27296 let
27297 val y = !r
27298 val () = r := SOME x
27299 in
27300 case y of
27301 NONE => x
27302 | SOME y => y
27303 end
27304 end
27305val _ = f 13
27306val _ = f "foo"
27307----
27308
27309The previous version of Standard ML took a different approach
27310(<!Cite(MilnerEtAl90)>, <!Cite(Tofte90)>, <:ImperativeTypeVariable:>)
27311than the value restriction. It encoded information in the type system
27312about when ref cells would be created, and used this to prevent a ref
27313cell from holding multiple types. Although it allowed more programs
27314to be type checked, this approach had significant drawbacks. First,
27315it was significantly more complex, both for implementers and for
programmers. Second, it had an unfortunate interaction with
modularity, because information about ref usage was exposed in module
27318signatures. This either prevented the use of references for
27319implementing a signature, or required information that one would like
27320to keep hidden to propagate across modules.
27321
27322In the early nineties, Andrew Wright studied about 250,000 lines of
27323existing SML code and discovered that it did not make significant use
27324of the extended typing ability, and proposed the value restriction as
27325a simpler alternative (<!Cite(Wright95)>). This was adopted in the
27326revised <:DefinitionOfStandardML:Definition>.
27327
27328
27329== Working with the value restriction ==
27330
27331One technique that works with the value restriction is
27332<:EtaExpansion:>. We can use eta expansion to make our `id id`
example type check as follows.
27334
27335[source,sml]
27336----
27337val id: 'a -> 'a = fn x => x
27338val f: 'a -> 'a = fn z => (id id) z
27339----
27340
27341This solution means that the computation (in this case `id id`) will
27342be performed each time `f` is applied, instead of just once when `f`
27343is declared. In this case, that is not a problem, but it could be if
27344the declaration of `f` performs substantial computation or creates a
27345shared data structure.
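
To make the repeated computation visible, here is a small sketch in
which a hypothetical function `mk` (not part of the example above)
counts how often it runs. It stands in for any computation that
returns a function.

[source,sml]
----
val calls = ref 0   (* int ref: monomorphic, so unaffected by the restriction *)

(* Hypothetical stand-in for an expensive computation returning a function. *)
fun mk () = (calls := !calls + 1; fn x => x)

(* Eta-expanded, so the right-hand side is a value and `f` is polymorphic ... *)
val f: 'a -> 'a = fn z => mk () z

(* ... but `mk ()` runs again on every application of `f`. *)
val _ = f 13
val _ = f "foo"
val n = !calls   (* n is 2 *)
----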
27346
27347Another technique that sometimes works is to move a monomorphic
27348computation prior to a (would-be) polymorphic declaration so that the
27349expression is a value. Consider the following program, which fails
27350due to the value restriction.
27351
27352[source,sml]
27353----
27354datatype 'a t = A of string | B of 'a
27355val x: 'a t = A (if true then "yes" else "no")
27356----
27357
27358It is easy to rewrite this program as
27359
27360[source,sml]
27361----
27362datatype 'a t = A of string | B of 'a
27363local
27364 val s = if true then "yes" else "no"
27365in
27366 val x: 'a t = A s
27367end
27368----
27369
27370The following example (taken from <!Cite(Wright95)>) creates a ref
27371cell to count the number of times a function is called.
27372
27373[source,sml]
27374----
27375val count: ('a -> 'a) -> ('a -> 'a) * (unit -> int) =
27376 fn f =>
27377 let
27378 val r = ref 0
27379 in
27380 (fn x => (r := 1 + !r; f x), fn () => !r)
27381 end
27382val id: 'a -> 'a = fn x => x
27383val (countId: 'a -> 'a, numCalls) = count id
27384----
27385
27386The example does not type check, due to the value restriction.
27387However, it is easy to rewrite the program, staging the ref cell
27388creation before the polymorphic code.
27389
27390[source,sml]
27391----
27392datatype t = T of int ref
27393val count1: unit -> t = fn () => T (ref 0)
27394val count2: t * ('a -> 'a) -> (unit -> int) * ('a -> 'a) =
27395 fn (T r, f) => (fn () => !r, fn x => (r := 1 + !r; f x))
27396val id: 'a -> 'a = fn x => x
27397val t = count1 ()
27398val countId: 'a -> 'a = fn z => #2 (count2 (t, id)) z
27399val numCalls = #1 (count2 (t, id))
27400----
27401
27402Of course, one can hide the constructor `T` inside a `local` or behind
27403a signature.
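
As a quick check of the staged version, the following sketch repeats
the declarations so that it stands alone and then uses `countId` at
two different types. The final binding `n` is added only for
illustration.

[source,sml]
----
datatype t = T of int ref
val count1: unit -> t = fn () => T (ref 0)
val count2: t * ('a -> 'a) -> (unit -> int) * ('a -> 'a) =
   fn (T r, f) => (fn () => !r, fn x => (r := 1 + !r; f x))
val id: 'a -> 'a = fn x => x
val t = count1 ()
val countId: 'a -> 'a = fn z => #2 (count2 (t, id)) z
val numCalls = #1 (count2 (t, id))

(* Both calls share the single ref cell created by `count1 ()`. *)
val _ = countId 13
val _ = countId "foo"
val n = numCalls ()   (* n is 2 *)
----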
27404
27405
27406== Also see ==
27407
27408* <:ImperativeTypeVariable:>
27409
27410<<<
27411
27412:mlton-guide-page: VariableArityPolymorphism
27413[[VariableArityPolymorphism]]
27414VariableArityPolymorphism
27415=========================
27416
27417<:StandardML:Standard ML> programmers often face the problem of how to
27418provide a variable-arity polymorphic function. For example, suppose
27419one is defining a combinator library, e.g. for parsing or pickling.
27420The signature for such a library might look something like the
27421following.
27422
27423[source,sml]
27424----
27425signature COMBINATOR =
27426 sig
27427 type 'a t
27428
27429 val int: int t
27430 val real: real t
27431 val string: string t
27432 val unit: unit t
27433 val tuple2: 'a1 t * 'a2 t -> ('a1 * 'a2) t
27434 val tuple3: 'a1 t * 'a2 t * 'a3 t -> ('a1 * 'a2 * 'a3) t
27435 val tuple4: 'a1 t * 'a2 t * 'a3 t * 'a4 t
27436 -> ('a1 * 'a2 * 'a3 * 'a4) t
27437 ...
27438 end
27439----
27440
27441The question is how to define a variable-arity tuple combinator.
27442Traditionally, the only way to take a variable number of arguments in
27443SML is to put the arguments in a list (or vector) and pass that. So,
27444one might define a tuple combinator with the following signature.
27445[source,sml]
27446----
27447val tupleN: 'a list -> 'a list t
27448----
27449
27450The problem with this approach is that as soon as one places values in
27451a list, they must all have the same type. So, programmers often take
27452an alternative approach, and define a family of `tuple<N>` functions,
27453as we see in the `COMBINATOR` signature above.
27454
27455The family-of-functions approach is ugly for many reasons. First, it
27456clutters the signature with a number of functions when there should
27457really only be one. Second, it is _closed_, in that there are a fixed
27458number of tuple combinators in the interface, and should a client need
27459a combinator for a large tuple, he is out of luck. Third, this
27460approach often requires a lot of duplicate code in the implementation
27461of the combinators.
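
To illustrate the duplication, here is a deliberately naive sketch of
the family-of-functions approach for a string pickler. The structure
name `NaivePickler` and its contents are assumptions made for this
example only.

[source,sml]
----
(* A deliberately naive structure, shown only to expose the duplication. *)
structure NaivePickler =
   struct
      type 'a t = 'a -> string

      val int: int t = Int.toString
      val string: string t = fn s => s

      (* Every arity needs its own, nearly identical, definition. *)
      fun tuple2 (p1: 'a t, p2: 'b t): ('a * 'b) t =
         fn (x1, x2) => concat [p1 x1, ",", p2 x2]
      fun tuple3 (p1: 'a t, p2: 'b t, p3: 'c t): ('a * 'b * 'c) t =
         fn (x1, x2, x3) => concat [p1 x1, ",", p2 x2, ",", p3 x3]
   end

val s2 = NaivePickler.tuple2 (NaivePickler.int, NaivePickler.string) (1, "two")
val s3 =
   NaivePickler.tuple3 (NaivePickler.int, NaivePickler.int, NaivePickler.string)
   (1, 2, "three")
----

A client that needs a 4-tuple has no option here other than editing
the structure to add yet another function.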
27462
27463Fortunately, using <:Fold01N:> and <:ProductType:products>, one can
27464provide an interface and implementation that solves all these
27465problems. Here is a simple pickling module that converts values to
27466strings.
27467[source,sml]
27468----
27469structure Pickler =
27470 struct
27471 type 'a t = 'a -> string
27472
27473 val unit = fn () => ""
27474
27475 val int = Int.toString
27476
27477 val real = Real.toString
27478
27479 val string = id
27480
27481 type 'a accum = 'a * string list -> string list
27482
27483 val tuple =
27484 fn z =>
27485 Fold01N.fold
27486 {finish = fn ps => fn x => concat (rev (ps (x, []))),
27487 start = fn p => fn (x, l) => p x :: l,
27488 zero = unit}
27489 z
27490
27491 val ` =
27492 fn z =>
27493 Fold01N.step1
27494 {combine = (fn (p, p') => fn (x & x', l) => p' x' :: "," :: p (x, l))}
27495 z
27496 end
27497----
27498
27499If one has `n` picklers of types
27500[source,sml]
27501----
27502val p1: a1 Pickler.t
27503val p2: a2 Pickler.t
27504...
27505val pn: an Pickler.t
27506----
27507then one can construct a pickler for n-ary products as follows.
27508[source,sml]
27509----
27510tuple `p1 `p2 ... `pn $ : (a1 & a2 & ... & an) Pickler.t
27511----
27512
27513For example, with `Pickler` in scope, one can prove the following
27514equations.
27515[source,sml]
27516----
27517"" = tuple $ ()
27518"1" = tuple `int $ 1
27519"1,2.0" = tuple `int `real $ (1 & 2.0)
27520"1,2.0,three" = tuple `int `real `string $ (1 & 2.0 & "three")
27521----
27522
27523Here is the signature for `Pickler`. It shows why the `accum` type is
27524useful.
27525[source,sml]
27526----
27527signature PICKLER =
27528 sig
27529 type 'a t
27530
27531 val int: int t
27532 val real: real t
27533 val string: string t
27534 val unit: unit t
27535
27536 type 'a accum
27537 val ` : ('a accum, 'b t, ('a, 'b) prod accum,
27538 'z1, 'z2, 'z3, 'z4, 'z5, 'z6, 'z7) Fold01N.step1
27539 val tuple: ('a t, 'a accum, 'b accum, 'b t, unit t,
27540 'z1, 'z2, 'z3, 'z4, 'z5) Fold01N.t
27541 end
27542
27543structure Pickler: PICKLER = Pickler
27544----
27545
27546<<<
27547
27548:mlton-guide-page: Variant
27549[[Variant]]
27550Variant
27551=======
27552
27553A _variant_ is an arm of a datatype declaration. For example, the
27554datatype
27555
27556[source,sml]
27557----
27558datatype t = A | B of int | C of real
27559----
27560
27561has three variants: `A`, `B`, and `C`.
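
A function over such a datatype typically has one `case` rule per
variant. The function `describe` below is a made-up example.

[source,sml]
----
datatype t = A | B of int | C of real

(* One rule for each of the three variants. *)
fun describe (v: t): string =
   case v of
      A => "A"
    | B n => "B carrying " ^ Int.toString n
    | C _ => "C"

val s = describe (B 13)   (* "B carrying 13" *)
----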
27562
27563<<<
27564
27565:mlton-guide-page: VesaKarvonen
27566[[VesaKarvonen]]
27567VesaKarvonen
27568============
27569
27570Vesa Karvonen is a student at the http://www.cs.helsinki.fi/index.en.html[University of Helsinki].
27571His interests lie in programming techniques that allow complex programs to be expressed
27572clearly and concisely and the design and implementation of programming languages.
27573
27574image::VesaKarvonen.attachments/vesa-in-mlton-t-shirt.jpg[align="center"]
27575
27576Things he'd like to see for SML and hopes to be able to contribute towards:
27577
27578* A practical tool for documenting libraries. Preferably one that is
27579based on extracting the documentation from source code comments.
27580
27581* A good IDE. Possibly an enhanced SML mode (`esml-mode`) for Emacs.
27582Google for http://www.google.com/search?&q=SLIME+video[SLIME video] to
27583get an idea of what he'd like to see. Some specific notes:
27584+
27585--
27586 * show type at point
27587 * robust, consistent indentation
27588 * show documentation
27589 * jump to definition (see <:EmacsDefUseMode:>)
27590--
27591+
27592<:EmacsBgBuildMode:> has also been written for working with MLton.
27593
27594* Documented and cataloged libraries. Perhaps something like
27595http://www.boost.org[Boost], but for SML libraries. Here is a partial
27596list of libraries, tools, and frameworks Vesa is or has been working
27597on:
27598+
27599--
27600 * Asynchronous Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/README)>)
27601 * Extended Basis Library (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>)
27602 * Generic Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>)
27603 * Pretty Printing Library (<!ViewGitFile(mltonlib,master,com/ssh/prettier/unstable/README)>)
27604 * Random Generator Library (<!ViewGitFile(mltonlib,master,com/ssh/random/unstable/README)>)
27605 * RPC (Remote Procedure Call) Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/rpc-lib/unstable/README)>)
27606 * http://www.libsdl.org/[SDL] Binding (<!ViewGitFile(mltonlib,master,org/mlton/vesak/sdl/unstable/README)>)
27607 * Unit Testing Library (<!ViewGitFile(mltonlib,master,com/ssh/unit-test/unstable/README)>)
27608 * Use Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/use-lib/unstable/README)>)
27609 * Windows Library (<!ViewGitFile(mltonlib,master,com/ssh/windows/unstable/README)>)
27610--
27611Note that most of these libraries have been ported to several <:StandardMLImplementations:SML implementations>.
27612
27613<<<
27614
27615:mlton-guide-page: WarnUnusedAnomalies
27616[[WarnUnusedAnomalies]]
27617WarnUnusedAnomalies
27618===================
27619
27620The `warnUnused` <:MLBasisAnnotations:MLBasis annotation> can be used
27621to report unused identifiers. This can be useful for catching bugs
27622and for code maintenance (e.g., eliminating dead code). However, the
27623`warnUnused` annotation can sometimes behave in counter-intuitive
27624ways. This page gives some of the anomalies that have been reported.
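
For reference, a minimal sketch of an ML Basis file that enables the
annotation for a single source file. The file name `z.sml` is an
assumption, chosen to match the warnings quoted on this page.

----
$(SML_LIB)/basis/basis.mlb
ann "warnUnused true" in
   z.sml
end
----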
27625
27626* Functions whose only uses are recursive uses within their bodies are
27627not warned as unused:
27628+
27629[source,sml]
27630----
27631local
27632fun foo () = foo () : unit
27633val bar = let fun baz () = baz () : unit in baz end
27634in
27635end
27636----
27637+
27638----
27639Warning: z.sml 3.5.
27640 Unused variable: bar.
27641----
27642
27643* Components of actual functor argument that are necessary to match
27644the functor argument signature but are unused in the body of the
27645functor are warned as unused:
27646+
27647[source,sml]
27648----
27649functor Warning (type t val x : t) = struct
27650 val y = x
27651end
27652structure X = Warning (type t = int val x = 1)
27653----
27654+
27655----
27656Warning: z.sml 4.29.
27657 Unused type: t.
27658----
27659
27660
27661* No component of a functor result is warned as unused. In the
27662following, the only uses of `f2` are to match the functor argument
27663signatures of `functor G` and `functor H` and there are no uses of
27664`z`:
27665+
27666[source,sml]
27667----
27668functor F(structure X : sig type t end) = struct
27669 type t = X.t
27670 fun f1 (_ : X.t) = ()
27671 fun f2 (_ : X.t) = ()
27672 val z = ()
27673end
27674functor G(structure Y : sig
27675 type t
27676 val f1 : t -> unit
27677 val f2 : t -> unit
27678 val z : unit
27679 end) = struct
27680 fun g (x : Y.t) = Y.f1 x
27681end
27682functor H(structure Y : sig
27683 type t
27684 val f1 : t -> unit
27685 val f2 : t -> unit
27686 val z : unit
27687 end) = struct
27688 fun h (x : Y.t) = Y.f1 x
27689end
27690functor Z() = struct
27691 structure S = F(structure X = struct type t = unit end)
27692 structure SG = G(structure Y = S)
27693 structure SH = H(structure Y = S)
27694end
27695structure U = Z()
27696val _ = U.SG.g ()
27697val _ = U.SH.h ()
27698----
27699+
27700----
27701----
27702
27703<<<
27704
27705:mlton-guide-page: WesleyTerpstra
27706[[WesleyTerpstra]]
27707WesleyTerpstra
27708==============
27709
Wesley W. Terpstra is a PhD student at the Technische Universität Darmstadt (Germany).
27711
27712Research interests
27713
27714* Distributed systems (P2P)
27715* Number theory (Error-correcting codes)
27716
My interest in SML is centered on the fact that the language is able to directly express ideas from number theory which are important for my work. Modules and Functors seem to be a very natural basis for implementing many algebraic structures. MLton provides an ideal platform for actual implementation as it is fast and has unboxed words.
27718
27719Things I would like from MLton in the future:
27720
27721* Some better optimization of mathematical expressions
27722* IPv6 and multicast support
27723* A complete GUI toolkit like mGTK
27724* More supported platforms so that applications written under MLton have a wider audience
27725
27726<<<
27727
27728:mlton-guide-page: WholeProgramOptimization
27729[[WholeProgramOptimization]]
27730WholeProgramOptimization
27731========================
27732
27733Whole-program optimization is a compilation technique in which
27734optimizations operate over the entire program. This allows the
27735compiler many optimization opportunities that are not available when
27736analyzing modules separately (as with separate compilation).
27737
27738Most of MLton's optimizations are whole-program optimizations.
27739Because MLton compiles the whole program at once, it can perform
27740optimization across module boundaries. As a consequence, MLton often
27741reduces or eliminates the run-time penalty that arises with separate
27742compilation of SML features such as functors, modules, polymorphism,
27743and higher-order functions. MLton takes advantage of having the
27744entire program to perform transformations such as: defunctorization,
27745monomorphisation, higher-order control-flow analysis, inlining,
27746unboxing, argument flattening, redundant-argument removal, constant
27747folding, and representation selection. Whole-program compilation is
27748an integral part of the design of MLton and is not likely to change.
27749
27750<<<
27751
27752:mlton-guide-page: WishList
27753[[WishList]]
27754WishList
27755========
27756
27757This page is mainly for recording recurring feature requests. If you
27758have a new feature request, you probably want to query interest on one
27759of the <:Contact:mailing lists> first.
27760
27761Please be aware of MLton's policy on
27762<:LanguageChanges:language changes>. Nonetheless, we hope to provide
27763support for some of the "immediate" <:SuccessorML:> proposals in a
27764future release.
27765
27766
27767== Support for link options in ML Basis files ==
27768
27769Introduce a mechanism to specify link options in <:MLBasis:ML Basis>
files. For example, generalizing a bit, an ML Basis declaration of the
27771form
27772
27773----
27774option "option"
27775----
27776
27777could be introduced whose semantics would be the same (as closely as
27778possible) as if the option string were specified on the compiler
27779command line.
27780
27781The main motivation for this is that a MLton library that would
27782introduce bindings (through <:ForeignFunctionInterface:FFI>) to an
27783external library could be packaged conveniently as a single MLB file.
27784For example, to link with library `foo` the MLB file would simply
27785contain:
27786
27787----
27788option "-link-opt -lfoo"
27789----
27790
27791Similar feature requests have been discussed previously on the mailing lists:
27792
27793* http://www.mlton.org/pipermail/mlton/2004-July/025553.html
27794* http://www.mlton.org/pipermail/mlton/2005-January/026648.html
27795
27796<<<
27797
27798:mlton-guide-page: XML
27799[[XML]]
27800XML
27801===
27802
27803<:XML:> is an <:IntermediateLanguage:>, translated from <:CoreML:> by
27804<:Defunctorize:>, optimized by <:XMLSimplify:>, and translated by
27805<:Monomorphise:> to <:SXML:>.
27806
27807== Description ==
27808
27809<:XML:> is polymorphic, higher-order, with flat patterns. Every
27810<:XML:> expression is annotated with its type. Polymorphic
27811generalization is made explicit through type variables annotating
27812`val` and `fun` declarations. Polymorphic instantiation is made
27813explicit by specifying type arguments at variable references. <:XML:>
27814patterns can not be nested and can not contain wildcards, constraints,
27815flexible records, or layering.
27816
27817== Implementation ==
27818
27819* <!ViewGitFile(mlton,master,mlton/xml/xml.sig)>
27820* <!ViewGitFile(mlton,master,mlton/xml/xml.fun)>
27821* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.sig)>
27822* <!ViewGitFile(mlton,master,mlton/xml/xml-tree.fun)>
27823
27824== Type Checking ==
27825
27826<:XML:> also has a type checker, used for debugging. At present, the
27827type checker is also the best specification of the type system of
27828<:XML:>. If you need more details, the type checker
27829(<!ViewGitFile(mlton,master,mlton/xml/type-check.sig)>,
27830<!ViewGitFile(mlton,master,mlton/xml/type-check.fun)>), is pretty short.
27831
27832Since the type checker does not affect the output of the compiler
27833(unless it reports an error), it can be turned off. The type checker
27834recursively descends the program, checking that the type annotating
27835each node is the same as the type synthesized from the types of the
expression's subnodes.
27837
27838== Details and Notes ==
27839
27840<:XML:> uses the same atoms as <:CoreML:>, hence all identifiers
27841(constructors, variables, etc.) are unique and can have properties
27842attached to them. Finally, <:XML:> has a simplifier (<:XMLShrink:>),
27843which implements a reduction system.
27844
27845=== Types ===
27846
27847<:XML:> types are either type variables or applications of n-ary type
27848constructors. There are many utility functions for constructing and
27849destructing types involving built-in type constructors.
27850
A type scheme binds a list of type variables in a type. The only
27852interesting operation on type schemes is the application of a type
27853scheme to a list of types, which performs a simultaneous substitution
27854of the type arguments for the bound type variables of the scheme. For
27855the purposes of type checking, it is necessary to know the type scheme
27856of variables, constructors, and primitives. This is done by
27857associating the scheme with the identifier using its property list.
27858This approach is used instead of the more traditional environment
27859approach for reasons of speed.
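
The following is a minimal sketch of applying a type scheme to a list
of types. The datatype `ty`, the record type `scheme`, and the
function `apply` are simplified, hypothetical definitions written for
this illustration; they are not MLton's own.

[source,sml]
----
(* Hypothetical, simplified types; MLton's own definitions differ. *)
datatype ty =
   Var of string              (* type variable *)
 | Con of string * ty list    (* n-ary type constructor application *)

type scheme = {tyvars: string list, ty: ty}

(* Simultaneously substitute `args` for the bound variables of the scheme.
   ListPair.zipEq raises UnequalLengths if the arities disagree. *)
fun apply ({tyvars, ty}: scheme, args: ty list): ty =
   let
      val env = ListPair.zipEq (tyvars, args)
      fun subst t =
         case t of
            Var a =>
               (case List.find (fn (b, _) => a = b) env of
                   SOME (_, u) => u
                 | NONE => t)
          | Con (c, ts) => Con (c, List.map subst ts)
   in
      subst ty
   end

(* The scheme "for all 'a, 'a list", applied to int. *)
val listScheme: scheme = {tyvars = ["a"], ty = Con ("list", [Var "a"])}
val intList = apply (listScheme, [Con ("int", [])])
----

The substituted types are not traversed again, which is what makes
the substitution simultaneous rather than sequential.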
27860
27861=== XmlTree ===
27862
27863Before defining `XML`, the signature for language <:XML:>, we need to
27864define an auxiliary signature `XML_TREE`, that contains the datatype
27865declarations for the expression trees of <:XML:>. This is done solely
27866for the purpose of modularity -- it allows the simplifier and type
27867checker to be defined by separate functors (which take a structure
matching `XML_TREE`). Then, `XML` is defined as the signature for a
27869module containing the expression trees, the simplifier, and the type
27870checker.
27871
27872Both constructors and variables can have type schemes, hence both
27873constructor and variable references specify the instance of the scheme
at the point of reference. An instance is specified with a vector of
27875types, which corresponds to the type variables in the scheme.
27876
27877<:XML:> patterns are flat (i.e. not nested). A pattern is a
27878constructor with an optional argument variable. Patterns only occur
27879in `case` expressions. To evaluate a case expression, compare the
27880test value sequentially against each pattern. For the first pattern
27881that matches, destruct the value if necessary to bind the pattern
27882variables and evaluate the corresponding expression. If no pattern
27883matches, evaluate the default. All patterns of a case statement are
27884of the same variant of `Pat.t`, although this is not enforced by ML's
27885type system. The type checker, however, does enforce this. Because
27886tuple patterns are irrefutable, there will only ever be one tuple
27887pattern in a case expression and there will be no default.
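
The evaluation rule for constructor patterns can be sketched as
follows. This is a hypothetical, simplified model (the names `value`,
`pat`, and `selectCase` are assumptions for this example); it does not
model tuple patterns and is not MLton's implementation.

[source,sml]
----
(* Hypothetical, simplified model; not MLton's actual datatypes. *)
datatype value = ConApp of string * value option
type pat = {con: string, arg: string option}

(* Returns the variable bindings and the selected branch, if any. *)
fun selectCase (ConApp (c, v): value,
                cases: (pat * 'a) list,
                default: 'a option): ((string * value) list * 'a) option =
   (* List.find compares the patterns sequentially and keeps the first match. *)
   case List.find (fn (p: pat, _) => #con p = c) cases of
      SOME (p: pat, body) =>
         let
            (* Destruct the value only when the pattern binds a variable. *)
            val binds =
               case (#arg p, v) of
                  (SOME x, SOME v') => [(x, v')]
                | _ => []
         in
            SOME (binds, body)
         end
    | NONE => Option.map (fn d => ([], d)) default

val cases = [({con = "NONE", arg = NONE}, "none branch"),
             ({con = "SOME", arg = SOME "x"}, "some branch")]
val r = selectCase (ConApp ("SOME", SOME (ConApp ("A", NONE))), cases, NONE)
----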
27888
27889<:XML:> contains value, exception, and mutually recursive function
27890declarations. There are no free type variables in <:XML:>. All type
27891variables are explicitly bound at either a value or function
27892declaration. At some point in the future, exception declarations may
27893go away, and exceptions may be represented with a single datatype
27894containing a `unit ref` component to implement genericity.
27895
27896<:XML:> expressions are like those of <:CoreML:>, with the following
exceptions. There are no record expressions. After type inference,
27898all records (some of which may have originally been tuples in the
27899source) are converted to tuples, because once flexible record patterns
27900have been resolved, tuple labels are superfluous. Tuple components
27901are ordered based on the field ordering relation. <:XML:> eta expands
primitives and constructors so that they are always fully applied.
27903Hence, the only kind of value of arrow type is a lambda. This
27904property is useful for flow analysis and later in code generation.
27905
27906An <:XML:> program is a list of toplevel datatype declarations and a
27907body expression. Because datatype declarations are not generative,
27908the defunctorizer can safely move them to toplevel.
27909
27910<<<
27911
27912:mlton-guide-page: XMLShrink
27913[[XMLShrink]]
27914XMLShrink
27915=========
27916
27917XMLShrink is an optimization pass for the <:XML:>
27918<:IntermediateLanguage:>, invoked from <:XMLSimplify:>.
27919
27920== Description ==
27921
27922This pass performs optimizations based on a reduction system.
27923
27924== Implementation ==
27925
27926* <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)>
27927* <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)>
27928
27929== Details and Notes ==
27930
27931The simplifier is based on <!Cite(AppelJim97, Shrinking Lambda
27932Expressions in Linear Time)>.
27933
27934The source program may contain functions that are only called once, or
27935not even called at all. Match compilation introduces many such
27936functions. In order to reduce the program size, speed up later
phases, and improve the flow analysis, a source-to-source simplifier
27938is run on <:XML:> after type inference and match compilation.
27939
27940The simplifier implements the reductions shown below. The reductions
27941eliminate unnecessary declarations (see the side constraint in the
27942figure), applications where the function is immediate, and case
27943statements where the test is immediate. Declarations can be
27944eliminated only when the expression is nonexpansive (see Section 4.7
of the <:DefinitionOfStandardML:Definition>), which is a syntactic
27946condition that ensures that the expression has no effects
27947(assignments, raises, or nontermination). The reductions on case
27948statements do not show the other irrelevant cases that may exist. The
27949reductions were chosen so that they were strongly normalizing and so
27950that they never increased tree size.
27951
27952* {empty}
27953+
27954--
27955[source,sml]
27956----
27957let x = e1 in e2
27958----
27959
27960reduces to
27961
27962[source,sml]
27963----
27964e2 [x -> e1]
27965----
27966
27967if `e1` is a constant or variable or if `e1` is nonexpansive and `x` occurs zero or one time in `e2`
27968--
27969
27970* {empty}
27971+
27972--
27973[source,sml]
27974----
27975(fn x => e1) e2
27976----
27977
27978reduces to
27979
27980[source,sml]
27981----
27982let x = e2 in e1
27983----
27984--
27985
27986* {empty}
27987+
27988--
27989[source,sml]
27990----
27991e1 handle e2
27992----
27993
27994reduces to
27995
27996[source,sml]
27997----
27998e1
27999----
28000
28001if `e1` is nonexpansive
28002--
28003
28004* {empty}
28005+
28006--
28007[source,sml]
28008----
28009case let d in e end of p1 => e1 ...
28010----
28011
28012reduces to
28013
28014[source,sml]
28015----
28016let d in case e of p1 => e1 ... end
28017----
28018--
28019
28020* {empty}
28021+
28022--
28023[source,sml]
28024----
28025case C e1 of C x => e2
28026----
28027
28028reduces to
28029
28030[source,sml]
28031----
28032let x = e1 in e2
28033----
28034--
28035
28036<<<
28037
28038:mlton-guide-page: XMLSimplify
28039[[XMLSimplify]]
28040XMLSimplify
28041===========
28042
28043The optimization passes for the <:XML:> <:IntermediateLanguage:> are
28044collected and controlled by the `XmlSimplify` functor
28045(<!ViewGitFile(mlton,master,mlton/xml/xml-simplify.sig)>,
28046<!ViewGitFile(mlton,master,mlton/xml/xml-simplify.fun)>).
28047
28048The following optimization passes are implemented:
28049
28050* <:XMLSimplifyTypes:>
28051* <:XMLShrink:>
28052
28053The optimization passes can be controlled from the command-line by the options
28054
28055* `-diag-pass <pass>` -- keep diagnostic info for pass
28056* `-disable-pass <pass>` -- skip optimization pass (if normally performed)
28057* `-enable-pass <pass>` -- perform optimization pass (if normally skipped)
28058* `-keep-pass <pass>` -- keep the results of pass
28059* `-xml-passes <passes>` -- xml optimization passes
28060
28061<<<
28062
28063:mlton-guide-page: XMLSimplifyTypes
28064[[XMLSimplifyTypes]]
28065XMLSimplifyTypes
28066================
28067
28068<:XMLSimplifyTypes:> is an optimization pass for the <:XML:>
28069<:IntermediateLanguage:>, invoked from <:XMLSimplify:>.
28070
28071== Description ==
28072
28073This pass simplifies types in an <:XML:> program, eliminating all
28074unused type arguments.
28075
28076== Implementation ==
28077
28078* <!ViewGitFile(mlton,master,mlton/xml/simplify-types.sig)>
28079* <!ViewGitFile(mlton,master,mlton/xml/simplify-types.fun)>
28080
28081== Details and Notes ==
28082
The pass first computes a simple fixpoint over all the `datatype`
declarations to determine which `datatype` `tycon` arguments are
actually used. It then makes a single pass over the program to
determine which type variables of polymorphic declarations are used,
and rewrites types to eliminate unused type arguments.

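The fixpoint step can be sketched with a small, self-contained model. This is not the code in `simplify-types.fun`: the `ty` and `decl` types and the `usedArgs` function are hypothetical, and the model assumes that any type constructor without a declaration in the list (for example `int`) uses all of its arguments.

[source,sml]
----
(* A simplified model; these types are hypothetical, not MLton's. *)
datatype ty =
   Var of int                 (* i-th type parameter of the enclosing datatype *)
 | Con of string * ty list    (* type constructor applied to arguments *)

type decl = {tycon: string, arity: int, conArgs: ty list}

(* For each declared tycon, compute which type parameters are used. *)
fun usedArgs (decls: decl list): (string * bool list) list =
   let
      val table =
         List.map (fn {tycon, arity, ...} => (tycon, Array.array (arity, false)))
                  decls
      fun lookup c =
         Option.map (fn (_, flags) => flags)
                    (List.find (fn (c', _) => c' = c) table)
      val changed = ref true
      fun visit used ty =
         case ty of
            Var i =>
               if Array.sub (used, i)
                  then ()
               else (Array.update (used, i, true); changed := true)
          | Con (c, args) =>
               (case lookup c of
                   (* Not a declared datatype: treat every argument as used. *)
                   NONE => List.app (visit used) args
                   (* Declared datatype: only look inside used positions. *)
                 | SOME flags =>
                      ignore
                      (List.foldl
                       (fn (arg, j) =>
                           (if Array.sub (flags, j) then visit used arg else ()
                            ; j + 1))
                       0 args))
      (* Re-scan all declarations until no flag changes. *)
      fun loop () =
         if !changed
            then (changed := false
                  ; List.app
                    (fn {tycon, conArgs, ...} =>
                        List.app (visit (valOf (lookup tycon))) conArgs)
                    decls
                  ; loop ())
         else ()
      val () = loop ()
   in
      List.map (fn (c, flags) => (c, Array.foldr (op ::) [] flags)) table
   end

(* Models:  datatype 'a t = T of int
 *          datatype ('a, 'b) u = U of 'a t * 'b
 *          datatype 'a l = Nil | Cons of 'a * 'a l *)
val example: decl list =
   [{tycon = "t", arity = 1, conArgs = [Con ("int", [])]},
    {tycon = "u", arity = 2, conArgs = [Con ("t", [Var 0]), Var 1]},
    {tycon = "l", arity = 1, conArgs = [Var 0, Con ("l", [Var 0])]}]

val result = usedArgs example
----

In this model the first parameter of `u` is reported as unused because it only flows into the unused parameter of `t`, which is why a fixpoint rather than a single scan is needed in general.
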
This pass should eliminate any spurious duplication that the
<:Monomorphise:> pass might perform due to phantom types.

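A small source-level example of a phantom type may help here. The names `tagged`, `Tagged`, `untag`, `flag`, and `name` are hypothetical; the sketch only shows the kind of program in which an unused type argument could otherwise lead to one copy of a datatype and its functions per instantiation.

[source,sml]
----
(* 'a is a phantom type parameter: it does not occur in the
 * argument of the constructor. *)
datatype 'a tagged = Tagged of int

(* Polymorphic in 'a, although 'a never affects the representation. *)
fun untag (Tagged n) = n

(* Two instantiations that differ only in the phantom argument. *)
val flag: bool tagged = Tagged 1
val name: string tagged = Tagged 2

val total = untag flag + untag name
----

Once the unused argument is eliminated, `bool tagged` and `string tagged` become the same type, so there is nothing left to distinguish the two uses of `untag`.
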
<<<

:mlton-guide-page: Zone
[[Zone]]
Zone
====

<:Zone:> is an optimization pass for the <:SSA2:>
<:IntermediateLanguage:>, invoked from <:SSA2Simplify:>.

== Description ==

This pass breaks large <:SSA2:> functions into zones, which are
connected subgraphs of the dominator tree. For each zone, at the node
that dominates the zone (the "zone root"), it places a tuple
collecting all of the live variables at that node. It then replaces
uses of those variables within the zone with offsets into the tuple.
The goal is to reduce the amount of liveness information in large
<:SSA:> functions.

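The effect of the rewriting can be pictured at the source level. The sketch below is only an analogy: the real pass works on <:SSA2:> blocks and variables rather than on Standard ML source, and the functions `direct` and `zoned` and the tuple name `live` are hypothetical.

[source,sml]
----
(* Before: a, b, and c are each live throughout the region that uses them. *)
fun direct (a: int, b: int, c: int): int =
   let
      val s = a + b
   in
      s * c
   end

(* After: at the "zone root" the live variables are collected into one
 * tuple; inside the zone, each use becomes a selection from that tuple,
 * so only `live` has to be tracked as live across the zone. *)
fun zoned (a: int, b: int, c: int): int =
   let
      val live = (a, b, c)
   in
      let
         val s = #1 live + #2 live
      in
         s * #3 live
      end
   end
----

Both functions compute the same result; the second trades extra selections for a single live variable in place of three.
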
== Implementation ==

* <!ViewGitFile(mlton,master,mlton/ssa/zone.fun)>

== Details and Notes ==

The pass computes strongly connected components to avoid putting
tuple constructions in loops.

There are two (expert) flags that govern the use of this pass:

* `-max-function-size <n>`
* `-zone-cut-depth <n>`

Zone splitting is applied only when the number of basic blocks in a
function is greater than the `n` given by `-max-function-size`. The
`n` used to cut the dominator tree is set by `-zone-cut-depth`.

There is currently no attempt to be safe-for-space. That is, the
tuples are not restricted to containing only "small" values.

In the `HOL` program, the particular problem is the main function,
which has 161,783 blocks and 257,519 variables -- the product of those
two numbers being about 41.7 billion. Now, we're not likely going to
need that much space since we use a sparse representation. But even
1/100th would really hurt. And of course this rules out bit vectors.

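The arithmetic behind these figures can be checked directly. The sketch below uses `IntInf` because the product does not fit in a 32-bit integer; the binding names are hypothetical, and the dense bit-matrix size assumes one bit per (block, variable) pair, which is an assumption made here for illustration.

[source,sml]
----
val blocks: IntInf.int = 161783
val variables: IntInf.int = 257519

(* One entry per (block, variable) pair. *)
val product = blocks * variables      (* 41662196377 *)

(* Even one hundredth of that is hundreds of millions of entries. *)
val hundredth = product div 100       (* 416621963 *)

(* Size in bytes of a dense bit matrix, assuming one bit per pair. *)
val denseBytes = product div 8        (* 5207774547, roughly 5.2 GB *)

val () = print (IntInf.toString product ^ "\n")
----
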
<<<

:mlton-guide-page: ZZZOrphanedPages
[[ZZZOrphanedPages]]
ZZZOrphanedPages
================

The contents of these pages have been moved to other pages.

These templates are used by other pages.

 * <:CompilerPassTemplate:>
 * <:TalkTemplate:>

<<<