Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | MLton Guide ({mlton-version}) |
2 | ============================= | |
3 | :toc: | |
4 | :mlton-guide-page!: | |
5 | ||
6 | [abstract] | |
7 | -- | |
8 | This is the guide for MLton, an open-source, whole-program, optimizing Standard ML compiler. | |
9 | ||
10 | This guide was generated automatically from the MLton website, available online at http://mlton.org. It is up to date for MLton {mlton-version}. | |
11 | -- | |
12 | ||
13 | ||
14 | :leveloffset: 1 | |
15 | ||
16 | :mlton-guide-page: Home | |
17 | [[Home]] | |
18 | MLton | |
19 | ===== | |
20 | ||
21 | == What is MLton? == | |
22 | ||
23 | MLton is an open-source, whole-program, optimizing | |
24 | <:StandardML:Standard ML> compiler. | |
25 | ||
26 | == What's new? == | |
27 | ||
28 | * 20180207: Please try out our latest release, <:Release20180207:MLton 20180207>. | |
29 | ||
30 | * 20140730: http://www.cs.rit.edu/%7emtf[Matthew Fluet] and | |
31 | http://www.cse.buffalo.edu/%7elziarek[Lukasz Ziarek] have been | |
32 | awarded an http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=12810[NSF | |
33 | CISE Research Infrastructure (CRI)] grant titled "Positioning MLton | |
34 | for Next-Generation Programming Languages Research"; read the award | |
35 | abstracts | |
36 | (http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405770[Award{nbsp}#1405770] | |
37 | and | |
38 | http://www.nsf.gov/awardsearch/showAward?AWD_ID=1405614[Award{nbsp}#1405614]) | |
39 | for more details. | |
40 | ||
41 | == Next steps == | |
42 | ||
43 | * Read about MLton's <:Features:>. | |
44 | * Look at <:Documentation:>. | |
45 | * See some <:Users:> of MLton. | |
46 | * https://sourceforge.net/projects/mlton/files/mlton/20180207[Download] MLton. | |
47 | * Meet the MLton <:Developers:>. | |
48 | * Get involved with MLton <:Development:>. | |
49 | * Consult the user-maintained <:FAQ:>. | |
50 | * <:Contact:> us. | |
51 | ||
52 | <<< | |
53 | ||
54 | :mlton-guide-page: AdamGoode | |
55 | [[AdamGoode]] | |
56 | AdamGoode | |
57 | ========= | |
58 | ||
59 | * I maintain the Fedora package of MLton, in https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora]. | |
60 | * I have contributed some patches for Makefiles and PDF documentation building. | |
61 | ||
62 | <<< | |
63 | ||
64 | :mlton-guide-page: AdmitsEquality | |
65 | [[AdmitsEquality]] | |
66 | AdmitsEquality | |
67 | ============== | |
68 | ||
69 | A <:TypeConstructor:> admits equality if whenever it is applied to | |
70 | equality types, the result is an <:EqualityType:>. This notion enables | |
71 | one to determine whether a type constructor application yields an | |
72 | equality type solely from the application, without looking at the | |
73 | definition of the type constructor. It helps to ensure that | |
74 | <:PolymorphicEquality:> is only applied to sensible values. | |
75 | ||
76 | The definition of admits equality depends on whether the type | |
77 | constructor was declared by a `type` definition or a | |
78 | `datatype` declaration. | |
79 | ||
80 | ||
81 | == Type definitions == | |
82 | ||
83 | For a type definition | |
84 | ||
85 | [source,sml] | |
86 | ---- | |
87 | type ('a1, ..., 'an) t = ... | |
88 | ---- | |
89 | ||
90 | type constructor `t` admits equality if the right-hand side of the | |
91 | definition is an equality type after replacing `'a1`, ..., | |
92 | `'an` by equality types (it doesn't matter which equality types | |
93 | are chosen). | |
94 | ||
95 | For a nullary type definition, this amounts to the right-hand side | |
96 | being an equality type. For example, after the definition | |
97 | ||
98 | [source,sml] | |
99 | ---- | |
100 | type t = bool * int | |
101 | ---- | |
102 | ||
103 | type constructor `t` admits equality because `bool * int` is | |
104 | an equality type. On the other hand, after the definition | |
105 | ||
106 | [source,sml] | |
107 | ---- | |
108 | type t = bool * int * real | |
109 | ---- | |
110 | ||
111 | type constructor `t` does not admit equality, because `real` | |
112 | is not an equality type. | |
113 | ||
114 | For another example, after the definition | |
115 | ||
116 | [source,sml] | |
117 | ---- | |
118 | type 'a t = bool * 'a | |
119 | ---- | |
120 | ||
121 | type constructor `t` admits equality because `bool * int` | |
122 | is an equality type (any equality type would do in place of | |
123 | `int`). | |
124 | ||
125 | On the other hand, after the definition | |
126 | ||
127 | [source,sml] | |
128 | ---- | |
129 | type 'a t = real * 'a | |
130 | ---- | |
131 | ||
132 | type constructor `t` does not admit equality because | |
133 | `real * int` is not an equality type. | |
134 | ||
135 | We can check that a type constructor admits equality using an | |
136 | `eqtype` specification. | |
137 | ||
138 | [source,sml] | |
139 | ---- | |
140 | structure Ok: sig eqtype 'a t end = | |
141 | struct | |
142 | type 'a t = bool * 'a | |
143 | end | |
144 | ---- | |
145 | ||
146 | [source,sml] | |
147 | ---- | |
148 | structure Bad: sig eqtype 'a t end = | |
149 | struct | |
150 | type 'a t = real * int * 'a | |
151 | end | |
152 | ---- | |
153 | ||
154 | On `structure Bad`, MLton reports the following error. | |
155 | ---- | |
156 | Error: z.sml 1.16-1.34. | |
157 | Type in structure disagrees with signature (admits equality): t. | |
158 | structure: type 'a t = [real] * _ * _ | |
159 | defn at: z.sml 3.15-3.15 | |
160 | signature: [eqtype] 'a t | |
161 | spec at: z.sml 1.30-1.30 | |
162 | ---- | |
163 | ||
164 | The `structure:` section provides an explanation of why the type | |
165 | did not admit equality, highlighting the problematic component | |
166 | (`real`). | |
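
Note that admitting equality is a property of the type constructor, not of every application of it. As a small sketch (the function name `same` and the bindings `yes` and `no` are chosen here purely for illustration), `int t` is an equality type, whereas `real t` is not, even though `t` admits equality.

[source,sml]
----
type 'a t = bool * 'a

(* int t is an equality type, so polymorphic equality applies. *)
fun same (x: int t, y: int t): bool = x = y

val yes = same ((true, 1), (true, 1))
val no = same ((true, 1), (false, 1))

(* By contrast, the following would be rejected by the type checker,
 * because real t = bool * real is not an equality type:
 *
 *   val bad = (true, 1.0) = (true, 1.0)
 *)
----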
167 | ||
168 | ||
169 | == Datatype declarations == | |
170 | ||
171 | For a type constructor declared by a datatype declaration to admit | |
172 | equality, every <:Variant:variant> of the datatype must admit equality. For | |
173 | example, the following datatype admits equality because `bool` and | |
174 | `char * int` are equality types. | |
175 | ||
176 | [source,sml] | |
177 | ---- | |
178 | datatype t = A of bool | B of char * int | |
179 | ---- | |
180 | ||
181 | Nullary constructors trivially admit equality, so the following | |
182 | datatype admits equality. | |
183 | ||
184 | [source,sml] | |
185 | ---- | |
186 | datatype t = A | B | C | |
187 | ---- | |
188 | ||
189 | For a parameterized datatype constructor to admit equality, we | |
190 | consider each <:Variant:variant> as a type definition, and require that the | |
191 | definition admit equality. For example, for the datatype | |
192 | ||
193 | [source,sml] | |
194 | ---- | |
195 | datatype 'a t = A of bool * 'a | B of 'a | |
196 | ---- | |
197 | ||
198 | the type definitions | |
199 | ||
200 | [source,sml] | |
201 | ---- | |
202 | type 'a tA = bool * 'a | |
203 | type 'a tB = 'a | |
204 | ---- | |
205 | ||
206 | both admit equality. Thus, type constructor `t` admits equality. | |
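
Because `t` admits equality, `int t` is an equality type and its values can be compared with `=`. A minimal sketch, in which the bindings `eq1` and `eq2` are illustrative names only:

[source,sml]
----
datatype 'a t = A of bool * 'a | B of 'a

(* Equality on a datatype is structural: same constructor, equal arguments. *)
val eq1 = A (true, 3) = A (true, 3)
val eq2 = A (true, 3) = B 3
----

Here `eq1` is `true` and `eq2` is `false`, since the two sides of the second comparison use different constructors.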
207 | ||
208 | On the other hand, the following datatype does not admit equality. | |
209 | ||
210 | [source,sml] | |
211 | ---- | |
212 | datatype 'a t = A of bool * 'a | B of real * 'a | |
213 | ---- | |
214 | ||
215 | As with type definitions, we can check using an `eqtype` | |
216 | specification. | |
217 | ||
218 | [source,sml] | |
219 | ---- | |
220 | structure Bad: sig eqtype 'a t end = | |
221 | struct | |
222 | datatype 'a t = A of bool * 'a | B of real * 'a | |
223 | end | |
224 | ---- | |
225 | ||
226 | MLton reports the following error. | |
227 | ||
228 | ---- | |
229 | Error: z.sml 1.16-1.34. | |
230 | Type in structure disagrees with signature (admits equality): t. | |
231 | structure: datatype 'a t = B of [real] * _ | ... | |
232 | defn at: z.sml 3.19-3.19 | |
233 | signature: [eqtype] 'a t | |
234 | spec at: z.sml 1.30-1.30 | |
235 | ---- | |
236 | ||
237 | MLton indicates the problematic constructor (`B`), as well as | |
238 | the problematic component of the constructor's argument. | |
239 | ||
240 | ||
241 | === Recursive datatypes === | |
242 | ||
243 | A recursive datatype like | |
244 | ||
245 | [source,sml] | |
246 | ---- | |
247 | datatype t = A | B of int * t | |
248 | ---- | |
249 | ||
250 | introduces a new problem, since in order to decide whether `t` | |
251 | admits equality, we need to know for the `B` <:Variant:variant> whether | |
252 | `t` admits equality. The <:DefinitionOfStandardML:Definition> | |
253 | answers this question by requiring a type constructor to admit | |
254 | equality if it is consistent to do so. So, in our above example, if | |
255 | we assume that `t` admits equality, then the <:Variant:variant> | |
256 | `B of int * t` admits equality. Then, since the `A` <:Variant:variant> | |
257 | trivially admits equality, so does the type constructor `t`. | |
258 | Thus, it was consistent to assume that `t` admits equality, and | |
259 | so, `t` does admit equality. | |
260 | ||
261 | On the other hand, in the following declaration | |
262 | ||
263 | [source,sml] | |
264 | ---- | |
265 | datatype t = A | B of real * t | |
266 | ---- | |
267 | ||
268 | if we assume that `t` admits equality, then the `B` <:Variant:variant> | |
269 | still does not admit equality, because `real` is not an equality type. | |
270 | Thus the assumption is inconsistent, and `t` | |
271 | does not admit equality. | |
272 | ||
273 | The same kind of reasoning applies to mutually recursive datatypes as | |
274 | well. For example, the following defines both `t` and `u` to | |
275 | admit equality. | |
276 | ||
277 | [source,sml] | |
278 | ---- | |
279 | datatype t = A | B of u | |
280 | and u = C | D of t | |
281 | ---- | |
282 | ||
283 | But the following defines neither `t` nor `u` to admit | |
284 | equality. | |
285 | ||
286 | [source,sml] | |
287 | ---- | |
288 | datatype t = A | B of u * real | |
289 | and u = C | D of t | |
290 | ---- | |
291 | ||
292 | As always, we can check whether a type admits equality using an | |
293 | `eqtype` specification. | |
294 | ||
295 | [source,sml] | |
296 | ---- | |
297 | structure Bad: sig eqtype t eqtype u end = | |
298 | struct | |
299 | datatype t = A | B of u * real | |
300 | and u = C | D of t | |
301 | end | |
302 | ---- | |
303 | ||
304 | MLton reports the following error. | |
305 | ||
306 | ---- | |
307 | Error: z.sml 1.16-1.40. | |
308 | Type in structure disagrees with signature (admits equality): t. | |
309 | structure: datatype t = B of [_str.u] * [real] | ... | |
310 | defn at: z.sml 3.16-3.16 | |
311 | signature: [eqtype] t | |
312 | spec at: z.sml 1.27-1.27 | |
313 | Error: z.sml 1.16-1.40. | |
314 | Type in structure disagrees with signature (admits equality): u. | |
315 | structure: datatype u = D of [_str.t] | ... | |
316 | defn at: z.sml 4.11-4.11 | |
317 | signature: [eqtype] u | |
318 | spec at: z.sml 1.36-1.36 | |
319 | ---- | |
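
For contrast, the earlier mutually recursive declaration without `real` should pass the same check. A minimal sketch, assuming only that the structure is named `Ok` for illustration:

[source,sml]
----
(* Both t and u admit equality, so both eqtype specifications are met. *)
structure Ok: sig eqtype t eqtype u end =
   struct
      datatype t = A | B of u
      and u = C | D of t
   end
----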
320 | ||
321 | <<< | |
322 | ||
323 | :mlton-guide-page: Alice | |
324 | [[Alice]] | |
325 | Alice | |
326 | ===== | |
327 | ||
328 | http://www.ps.uni-saarland.de/alice[Alice ML] is an extension of SML with | |
329 | concurrency, dynamic typing, components, distribution, and constraint | |
330 | solving. | |
331 | ||
332 | <<< | |
333 | ||
334 | :mlton-guide-page: AllocateRegisters | |
335 | [[AllocateRegisters]] | |
336 | AllocateRegisters | |
337 | ================= | |
338 | ||
339 | <:AllocateRegisters:> is an analysis pass for the <:RSSA:> | |
340 | <:IntermediateLanguage:>, invoked from <:ToMachine:>. | |
341 | ||
342 | == Description == | |
343 | ||
344 | Computes an allocation of <:RSSA:> variables as <:Machine:> register | |
345 | or stack operands. | |
346 | ||
347 | == Implementation == | |
348 | ||
349 | * <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.sig)> | |
350 | * <!ViewGitFile(mlton,master,mlton/backend/allocate-registers.fun)> | |
351 | ||
352 | == Details and Notes == | |
353 | ||
354 | {empty} | |
355 | ||
356 | <<< | |
357 | ||
358 | :mlton-guide-page: AndreiFormiga | |
359 | [[AndreiFormiga]] | |
360 | AndreiFormiga | |
361 | ============= | |
362 | ||
363 | I'm a graduate student just back in academia. I study concurrent and parallel systems, with a great deal of interest in programming languages (theory, design, implementation). I happen to like functional languages. | |
364 | ||
365 | I use the nickname tautologico on #sml and my email is andrei DOT formiga AT gmail DOT com. | |
366 | ||
367 | <<< | |
368 | ||
369 | :mlton-guide-page: ArrayLiteral | |
370 | [[ArrayLiteral]] | |
371 | ArrayLiteral | |
372 | ============ | |
373 | ||
374 | <:StandardML:Standard ML> does not have a syntax for array literals or | |
375 | vector literals. The only direct way to write down an array is with an expression like | |
376 | [source,sml] | |
377 | ---- | |
378 | Array.fromList [w, x, y, z] | |
379 | ---- | |
380 | ||
381 | No SML compiler produces efficient code for the above expression. The | |
382 | generated code allocates a list and then converts it to an array. To | |
383 | alleviate this, one could write down the same array using | |
384 | `Array.tabulate`, or even using `Array.array` and `Array.update`, but | |
385 | that is syntactically unwieldy. | |
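
To make the comparison concrete, here is a sketch of the two alternatives just mentioned. The element values bound to `w`, `x`, `y`, and `z` are hypothetical and chosen only so that the example is self-contained.

[source,sml]
----
(* Hypothetical element values, for illustration only. *)
val (w, x, y, z) = (10, 20, 30, 40)

(* Using Array.tabulate: the index selects the element. *)
val a1 = Array.tabulate (4, fn 0 => w | 1 => x | 2 => y | _ => z)

(* Using Array.array followed by Array.update. *)
val a2 =
   let
      val a = Array.array (4, w)
      val () = Array.update (a, 1, x)
      val () = Array.update (a, 2, y)
      val () = Array.update (a, 3, z)
   in
      a
   end
----

Both forms avoid the intermediate list, but each element must be paired with its index by hand.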
386 | ||
387 | Fortunately, using <:Fold:>, it is possible to define constants `A` | |
388 | and +`+ so that one can write down an array like: | |
389 | [source,sml] | |
390 | ---- | |
391 | A `w `x `y `z $ | |
392 | ---- | |
393 | This is as syntactically concise as the `fromList` expression. | |
394 | Furthermore, MLton, at least, will generate code as efficient as if | |
395 | one had written down a use of `Array.array` followed by four uses of | |
396 | `Array.update`. | |
397 | ||
398 | Along with `A` and +`+, one can define a constant `V` that makes | |
399 | it possible to define vector literals with the same syntax, e.g., | |
400 | [source,sml] | |
401 | ---- | |
402 | V `w `x `y `z $ | |
403 | ---- | |
404 | ||
405 | Note that the same element indicator, +`+, serves for both array | |
406 | and vector literals. Of course, the `$` is the end-of-arguments | |
407 | marker always used with <:Fold:>. The only difference between an | |
408 | array literal and vector literal is the `A` or `V` at the beginning. | |
409 | ||
410 | Here is the implementation of `A`, `V`, and +`+. We place them | |
411 | in a structure and use signature abstraction to hide the type of the | |
412 | accumulator. See <:Fold:> for more on this technique. | |
413 | [source,sml] | |
414 | ---- | |
415 | structure Literal:> | |
416 | sig | |
417 | type 'a z | |
418 | val A: ('a z, 'a z, 'a array, 'd) Fold.t | |
419 | val V: ('a z, 'a z, 'a vector, 'd) Fold.t | |
420 | val ` : ('a, 'a z, 'a z, 'b, 'c, 'd) Fold.step1 | |
421 | end = | |
422 | struct | |
423 | type 'a z = int * 'a option * ('a array -> unit) | |
424 | ||
425 | val A = | |
426 | fn z => | |
427 | Fold.fold | |
428 | ((0, NONE, ignore), | |
429 | fn (n, opt, fill) => | |
430 | case opt of | |
431 | NONE => | |
432 | Array.tabulate (0, fn _ => raise Fail "array0") | |
433 | | SOME x => | |
434 | let | |
435 | val a = Array.array (n, x) | |
436 | val () = fill a | |
437 | in | |
438 | a | |
439 | end) | |
440 | z | |
441 | ||
442 | val V = fn z => Fold.post (A, Array.vector) z | |
443 | ||
444 | val ` = | |
445 | fn z => | |
446 | Fold.step1 | |
447 | (fn (x, (i, opt, fill)) => | |
448 | (i + 1, | |
449 | SOME x, | |
450 | fn a => (Array.update (a, i, x); fill a))) | |
451 | z | |
452 | end | |
453 | ---- | |
454 | ||
455 | The idea of the code is for the fold to accumulate a count of the | |
456 | number of elements, a sample element, and a function that fills in all | |
457 | the elements. When the fold is complete, the finishing function | |
458 | allocates the array, applies the fill function, and returns the array. | |
459 | The only difference between `A` and `V` is at the very end; `A` just | |
460 | returns the array, while `V` converts it to a vector using | |
461 | post-composition, which is further described on the <:Fold:> page. | |
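
To see the accumulator at work without the <:Fold:> machinery, the following sketch threads it by hand. The names `empty`, `step`, and `finish` are hypothetical helpers introduced for illustration; they are not part of the implementation above.

[source,sml]
----
type 'a z = int * 'a option * ('a array -> unit)

(* Initial accumulator: no elements, no sample, nothing to fill. *)
val empty: int z = (0, NONE, ignore)

(* Record one more element: bump the count, remember x as the sample,
 * and extend the fill function to store x at index i. *)
fun step (x: 'a, (i, _, fill): 'a z): 'a z =
   (i + 1, SOME x, fn a => (Array.update (a, i, x); fill a))

(* Allocate the array from the sample, then fill in every element. *)
fun finish ((n, opt, fill): 'a z): 'a array =
   case opt of
      NONE => Array.fromList []
    | SOME x =>
         let
            val a = Array.array (n, x)
            val () = fill a
         in
            a
         end

(* Plays the role of: A `10 `20 `30 $ *)
val a = finish (step (30, step (20, step (10, empty))))
----

The sample element is needed only because `Array.array` requires an initial value; every slot is then overwritten by the fill function.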
462 | ||
463 | <<< | |
464 | ||
465 | :mlton-guide-page: AST | |
466 | [[AST]] | |
467 | AST | |
468 | === | |
469 | ||
470 | <:AST:> is the <:IntermediateLanguage:> produced by the <:FrontEnd:> | |
471 | and translated by <:Elaborate:> to <:CoreML:>. | |
472 | ||
473 | == Description == | |
474 | ||
475 | The abstract syntax tree produced by the <:FrontEnd:>. | |
476 | ||
477 | == Implementation == | |
478 | ||
479 | * <!ViewGitFile(mlton,master,mlton/ast/ast-programs.sig)> | |
480 | * <!ViewGitFile(mlton,master,mlton/ast/ast-programs.fun)> | |
481 | * <!ViewGitFile(mlton,master,mlton/ast/ast-modules.sig)> | |
482 | * <!ViewGitFile(mlton,master,mlton/ast/ast-modules.fun)> | |
483 | * <!ViewGitFile(mlton,master,mlton/ast/ast-core.sig)> | |
484 | * <!ViewGitFile(mlton,master,mlton/ast/ast-core.fun)> | |
485 | * <!ViewGitDir(mlton,master,mlton/ast)> | |
486 | ||
487 | == Type Checking == | |
488 | ||
489 | The <:AST:> <:IntermediateLanguage:> has no independent type | |
490 | checker. Type inference is performed on an AST program as part of | |
491 | <:Elaborate:>. | |
492 | ||
493 | == Details and Notes == | |
494 | ||
495 | === Source locations === | |
496 | ||
497 | MLton makes use of a relatively clean method for annotating the | |
498 | abstract syntax tree with source location information. Every source | |
499 | program phrase is "wrapped" with the `WRAPPED` interface: | |
500 | ||
501 | [source,sml] | |
502 | ---- | |
503 | sys::[./bin/InclGitFile.py mlton master mlton/control/wrapped.sig 8:19] | |
504 | ---- | |
505 | ||
506 | The key idea is that `node'` is the type of an unannotated syntax | |
507 | phrase and `obj` is the type of its annotated counterpart. In the | |
508 | implementation, every `node'` is annotated with a `Region.t` | |
509 | (<!ViewGitFile(mlton,master,mlton/control/region.sig)>, | |
510 | <!ViewGitFile(mlton,master,mlton/control/region.sml)>), which describes the | |
511 | syntax phrase's left source position and right source position, where | |
512 | `SourcePos.t` (<!ViewGitFile(mlton,master,mlton/control/source-pos.sig)>, | |
513 | <!ViewGitFile(mlton,master,mlton/control/source-pos.sml)>) denotes a | |
514 | particular file, line, and column. A typical use of the `WRAPPED` | |
515 | interface is illustrated by the following code: | |
516 | ||
517 | [source,sml] | |
518 | ---- | |
519 | sys::[./bin/InclGitFile.py mlton master mlton/ast/ast-core.sig 46:65] | |
520 | ---- | |
521 | ||
522 | Thus, AST nodes are cleanly separated from source locations. By way | |
523 | of contrast, consider the approach taken by <:SMLNJ:SML/NJ> (and also | |
524 | by the <:CKitLibrary:CKit Library>). Each datatype denoting a syntax | |
525 | phrase dedicates a special constructor for annotating source | |
526 | locations: | |
527 | [source,sml] | |
528 | ---- | |
529 | datatype pat = WildPat (* empty pattern *) | |
530 | | AppPat of {constr:pat,argument:pat} (* application *) | |
531 | | MarkPat of pat * region (* mark a pattern *) | |
532 | ---- | |
533 | ||
534 | The main drawback of this approach is that static type checking is not | |
535 | sufficient to guarantee that the AST emitted from the front-end is | |
536 | properly annotated. | |
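
The guarantee offered by the wrapping approach can be illustrated with a self-contained sketch. The names below (`region`, `wrapped`, `makeRegion`, `node`, `regionOf`, `patNode`) are simplified, hypothetical stand-ins, not MLton's actual `WRAPPED` interface or `Region.t`; the point is only that an unannotated sub-pattern cannot be constructed.

[source,sml]
----
(* Hypothetical stand-in for a source region: left and right offsets. *)
type region = {left: int, right: int}

(* A node of any kind paired with its region. *)
datatype 'a wrapped = T of {node: 'a, region: region}

fun makeRegion (n: 'a, r: region): 'a wrapped = T {node = n, region = r}
fun node (T {node, ...}) = node
fun regionOf (T {region, ...}) = region

(* Unannotated pattern phrases; every sub-pattern is a wrapped pat. *)
datatype patNode =
   Wild
 | App of {constr: pat, argument: pat}
withtype pat = patNode wrapped

val wild: pat = makeRegion (Wild, {left = 0, right = 1})
val app: pat =
   makeRegion (App {constr = wild, argument = wild}, {left = 0, right = 3})
----

Since `App` accepts only values of type `pat`, the type checker rejects any attempt to nest a bare `patNode`, which is the property that the marker-constructor style cannot enforce.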
537 | ||
538 | <<< | |
539 | ||
540 | :mlton-guide-page: BasisLibrary | |
541 | [[BasisLibrary]] | |
542 | BasisLibrary | |
543 | ============ | |
544 | ||
545 | The <:StandardML:Standard ML> Basis Library is a collection of modules | |
546 | dealing with basic types, input/output, OS interfaces, and simple | |
547 | datatypes. It is intended as a portable library usable across all | |
548 | implementations of SML. For the official online version of the Basis | |
549 | Library specification, see http://www.standardml.org/Basis. | |
550 | <!Cite(GansnerReppy04, The Standard ML Basis Library)> is a book | |
551 | version that includes all of the online version and more. For a | |
552 | reverse chronological list of changes to the specification, see | |
553 | http://www.standardml.org/Basis/history.html. | |
554 | ||
555 | MLton implements all of the required portions of the Basis Library. | |
556 | MLton also implements many of the optional structures. You can obtain | |
557 | a complete and current list of what's available using | |
558 | `mlton -show-basis` (see <:ShowBasis:>). By default, MLton makes the | |
559 | Basis Library available to user programs. You can also | |
560 | <:MLBasisAvailableLibraries:access the Basis Library> from | |
561 | <:MLBasis: ML Basis> files. | |
562 | ||
563 | Below is a complete list of what MLton implements. | |
564 | ||
565 | == Top-level types and constructors == | |
566 | ||
567 | `eqtype 'a array` | |
568 | ||
569 | `datatype bool = false | true` | |
570 | ||
571 | `eqtype char` | |
572 | ||
573 | `type exn` | |
574 | ||
575 | `eqtype int` | |
576 | ||
577 | ++datatype 'a list = nil | {two-colons} of ('a * 'a list)++ | |
578 | ||
579 | `datatype 'a option = NONE | SOME of 'a` | |
580 | ||
581 | `datatype order = EQUAL | GREATER | LESS` | |
582 | ||
583 | `type real` | |
584 | ||
585 | `datatype 'a ref = ref of 'a` | |
586 | ||
587 | `eqtype string` | |
588 | ||
589 | `type substring` | |
590 | ||
591 | `eqtype unit` | |
592 | ||
593 | `eqtype 'a vector` | |
594 | ||
595 | `eqtype word` | |
596 | ||
597 | == Top-level exception constructors == | |
598 | ||
599 | `Bind` | |
600 | ||
601 | `Chr` | |
602 | ||
603 | `Div` | |
604 | ||
605 | `Domain` | |
606 | ||
607 | `Empty` | |
608 | ||
609 | `Fail of string` | |
610 | ||
611 | `Match` | |
612 | ||
613 | `Option` | |
614 | ||
615 | `Overflow` | |
616 | ||
617 | `Size` | |
618 | ||
619 | `Span` | |
620 | ||
621 | `Subscript` | |
622 | ||
623 | == Top-level values == | |
624 | ||
625 | MLton does not implement the optional top-level value | |
626 | `use: string -> unit`, which conflicts with whole-program | |
627 | compilation because it allows new code to be loaded dynamically. | |
628 | ||
629 | MLton implements all other top-level values: | |
630 | ||
631 | `!`, | |
632 | `:=`, | |
633 | `<>`, | |
634 | `=`, | |
635 | `@`, | |
636 | `^`, | |
637 | `app`, | |
638 | `before`, | |
639 | `ceil`, | |
640 | `chr`, | |
641 | `concat`, | |
642 | `exnMessage`, | |
643 | `exnName`, | |
644 | `explode`, | |
645 | `floor`, | |
646 | `foldl`, | |
647 | `foldr`, | |
648 | `getOpt`, | |
649 | `hd`, | |
650 | `ignore`, | |
651 | `implode`, | |
652 | `isSome`, | |
653 | `length`, | |
654 | `map`, | |
655 | `not`, | |
656 | `null`, | |
657 | `o`, | |
658 | `ord`, | |
659 | `print`, | |
660 | `real`, | |
661 | `rev`, | |
662 | `round`, | |
663 | `size`, | |
664 | `str`, | |
665 | `substring`, | |
666 | `tl`, | |
667 | `trunc`, | |
668 | `valOf`, | |
669 | `vector` | |
670 | ||
671 | == Overloaded identifiers == | |
672 | ||
673 | `*`, | |
674 | `+`, | |
675 | `-`, | |
676 | `/`, | |
677 | `<`, | |
678 | `<=`, | |
679 | `>`, | |
680 | `>=`, | |
681 | `~`, | |
682 | `abs`, | |
683 | `div`, | |
684 | `mod` | |
685 | ||
686 | == Top-level signatures == | |
687 | ||
688 | `ARRAY` | |
689 | ||
690 | `ARRAY2` | |
691 | ||
692 | `ARRAY_SLICE` | |
693 | ||
694 | `BIN_IO` | |
695 | ||
696 | `BIT_FLAGS` | |
697 | ||
698 | `BOOL` | |
699 | ||
700 | `BYTE` | |
701 | ||
702 | `CHAR` | |
703 | ||
704 | `COMMAND_LINE` | |
705 | ||
706 | `DATE` | |
707 | ||
708 | `GENERAL` | |
709 | ||
710 | `GENERIC_SOCK` | |
711 | ||
712 | `IEEE_REAL` | |
713 | ||
714 | `IMPERATIVE_IO` | |
715 | ||
716 | `INET_SOCK` | |
717 | ||
718 | `INTEGER` | |
719 | ||
720 | `INT_INF` | |
721 | ||
722 | `IO` | |
723 | ||
724 | `LIST` | |
725 | ||
726 | `LIST_PAIR` | |
727 | ||
728 | `MATH` | |
729 | ||
730 | `MONO_ARRAY` | |
731 | ||
732 | `MONO_ARRAY2` | |
733 | ||
734 | `MONO_ARRAY_SLICE` | |
735 | ||
736 | `MONO_VECTOR` | |
737 | ||
738 | `MONO_VECTOR_SLICE` | |
739 | ||
740 | `NET_HOST_DB` | |
741 | ||
742 | `NET_PROT_DB` | |
743 | ||
744 | `NET_SERV_DB` | |
745 | ||
746 | `OPTION` | |
747 | ||
748 | `OS` | |
749 | ||
750 | `OS_FILE_SYS` | |
751 | ||
752 | `OS_IO` | |
753 | ||
754 | `OS_PATH` | |
755 | ||
756 | `OS_PROCESS` | |
757 | ||
758 | `PACK_REAL` | |
759 | ||
760 | `PACK_WORD` | |
761 | ||
762 | `POSIX` | |
763 | ||
764 | `POSIX_ERROR` | |
765 | ||
766 | `POSIX_FILE_SYS` | |
767 | ||
768 | `POSIX_IO` | |
769 | ||
770 | `POSIX_PROCESS` | |
771 | ||
772 | `POSIX_PROC_ENV` | |
773 | ||
774 | `POSIX_SIGNAL` | |
775 | ||
776 | `POSIX_SYS_DB` | |
777 | ||
778 | `POSIX_TTY` | |
779 | ||
780 | `PRIM_IO` | |
781 | ||
782 | `REAL` | |
783 | ||
784 | `SOCKET` | |
785 | ||
786 | `STREAM_IO` | |
787 | ||
788 | `STRING` | |
789 | ||
790 | `STRING_CVT` | |
791 | ||
792 | `SUBSTRING` | |
793 | ||
794 | `TEXT` | |
795 | ||
796 | `TEXT_IO` | |
797 | ||
798 | `TEXT_STREAM_IO` | |
799 | ||
800 | `TIME` | |
801 | ||
802 | `TIMER` | |
803 | ||
804 | `UNIX` | |
805 | ||
806 | `UNIX_SOCK` | |
807 | ||
808 | `VECTOR` | |
809 | ||
810 | `VECTOR_SLICE` | |
811 | ||
812 | `WORD` | |
813 | ||
814 | == Top-level structures == | |
815 | ||
816 | `structure Array: ARRAY` | |
817 | ||
818 | `structure Array2: ARRAY2` | |
819 | ||
820 | `structure ArraySlice: ARRAY_SLICE` | |
821 | ||
822 | `structure BinIO: BIN_IO` | |
823 | ||
824 | `structure BinPrimIO: PRIM_IO` | |
825 | ||
826 | `structure Bool: BOOL` | |
827 | ||
828 | `structure BoolArray: MONO_ARRAY` | |
829 | ||
830 | `structure BoolArray2: MONO_ARRAY2` | |
831 | ||
832 | `structure BoolArraySlice: MONO_ARRAY_SLICE` | |
833 | ||
834 | `structure BoolVector: MONO_VECTOR` | |
835 | ||
836 | `structure BoolVectorSlice: MONO_VECTOR_SLICE` | |
837 | ||
838 | `structure Byte: BYTE` | |
839 | ||
840 | `structure Char: CHAR` | |
841 | ||
842 | * `Char` characters correspond to ISO-8859-1. The `Char` functions do not depend on locale. | |
843 | ||
844 | `structure CharArray: MONO_ARRAY` | |
845 | ||
846 | `structure CharArray2: MONO_ARRAY2` | |
847 | ||
848 | `structure CharArraySlice: MONO_ARRAY_SLICE` | |
849 | ||
850 | `structure CharVector: MONO_VECTOR` | |
851 | ||
852 | `structure CharVectorSlice: MONO_VECTOR_SLICE` | |
853 | ||
854 | `structure CommandLine: COMMAND_LINE` | |
855 | ||
856 | `structure Date: DATE` | |
857 | ||
858 | * `Date.fromString` and `Date.scan` accept a space in addition to a zero for the first character of the day of the month. The Basis Library specification only allows a zero. | |
859 | ||
860 | `structure FixedInt: INTEGER` | |
861 | ||
862 | `structure General: GENERAL` | |
863 | ||
864 | `structure GenericSock: GENERIC_SOCK` | |
865 | ||
866 | `structure IEEEReal: IEEE_REAL` | |
867 | ||
868 | `structure INetSock: INET_SOCK` | |
869 | ||
870 | `structure IO: IO` | |
871 | ||
872 | `structure Int: INTEGER` | |
873 | ||
874 | `structure Int1: INTEGER` | |
875 | ||
876 | `structure Int2: INTEGER` | |
877 | ||
878 | `structure Int3: INTEGER` | |
879 | ||
880 | `structure Int4: INTEGER` | |
881 | ||
882 | ... | |
883 | ||
884 | `structure Int31: INTEGER` | |
885 | ||
886 | `structure Int32: INTEGER` | |
887 | ||
888 | `structure Int64: INTEGER` | |
889 | ||
890 | `structure IntArray: MONO_ARRAY` | |
891 | ||
892 | `structure IntArray2: MONO_ARRAY2` | |
893 | ||
894 | `structure IntArraySlice: MONO_ARRAY_SLICE` | |
895 | ||
896 | `structure IntVector: MONO_VECTOR` | |
897 | ||
898 | `structure IntVectorSlice: MONO_VECTOR_SLICE` | |
899 | ||
900 | `structure Int8: INTEGER` | |
901 | ||
902 | `structure Int8Array: MONO_ARRAY` | |
903 | ||
904 | `structure Int8Array2: MONO_ARRAY2` | |
905 | ||
906 | `structure Int8ArraySlice: MONO_ARRAY_SLICE` | |
907 | ||
908 | `structure Int8Vector: MONO_VECTOR` | |
909 | ||
910 | `structure Int8VectorSlice: MONO_VECTOR_SLICE` | |
911 | ||
912 | `structure Int16: INTEGER` | |
913 | ||
914 | `structure Int16Array: MONO_ARRAY` | |
915 | ||
916 | `structure Int16Array2: MONO_ARRAY2` | |
917 | ||
918 | `structure Int16ArraySlice: MONO_ARRAY_SLICE` | |
919 | ||
920 | `structure Int16Vector: MONO_VECTOR` | |
921 | ||
922 | `structure Int16VectorSlice: MONO_VECTOR_SLICE` | |
923 | ||
924 | `structure Int32: INTEGER` | |
925 | ||
926 | `structure Int32Array: MONO_ARRAY` | |
927 | ||
928 | `structure Int32Array2: MONO_ARRAY2` | |
929 | ||
930 | `structure Int32ArraySlice: MONO_ARRAY_SLICE` | |
931 | ||
932 | `structure Int32Vector: MONO_VECTOR` | |
933 | ||
934 | `structure Int32VectorSlice: MONO_VECTOR_SLICE` | |
935 | ||
936 | `structure Int64Array: MONO_ARRAY` | |
937 | ||
938 | `structure Int64Array2: MONO_ARRAY2` | |
939 | ||
940 | `structure Int64ArraySlice: MONO_ARRAY_SLICE` | |
941 | ||
942 | `structure Int64Vector: MONO_VECTOR` | |
943 | ||
944 | `structure Int64VectorSlice: MONO_VECTOR_SLICE` | |
945 | ||
946 | `structure IntInf: INT_INF` | |
947 | ||
948 | `structure LargeInt: INTEGER` | |
949 | ||
950 | `structure LargeIntArray: MONO_ARRAY` | |
951 | ||
952 | `structure LargeIntArray2: MONO_ARRAY2` | |
953 | ||
954 | `structure LargeIntArraySlice: MONO_ARRAY_SLICE` | |
955 | ||
956 | `structure LargeIntVector: MONO_VECTOR` | |
957 | ||
958 | `structure LargeIntVectorSlice: MONO_VECTOR_SLICE` | |
959 | ||
960 | `structure LargeReal: REAL` | |
961 | ||
962 | `structure LargeRealArray: MONO_ARRAY` | |
963 | ||
964 | `structure LargeRealArray2: MONO_ARRAY2` | |
965 | ||
966 | `structure LargeRealArraySlice: MONO_ARRAY_SLICE` | |
967 | ||
968 | `structure LargeRealVector: MONO_VECTOR` | |
969 | ||
970 | `structure LargeRealVectorSlice: MONO_VECTOR_SLICE` | |
971 | ||
972 | `structure LargeWord: WORD` | |
973 | ||
974 | `structure LargeWordArray: MONO_ARRAY` | |
975 | ||
976 | `structure LargeWordArray2: MONO_ARRAY2` | |
977 | ||
978 | `structure LargeWordArraySlice: MONO_ARRAY_SLICE` | |
979 | ||
980 | `structure LargeWordVector: MONO_VECTOR` | |
981 | ||
982 | `structure LargeWordVectorSlice: MONO_VECTOR_SLICE` | |
983 | ||
984 | `structure List: LIST` | |
985 | ||
986 | `structure ListPair: LIST_PAIR` | |
987 | ||
988 | `structure Math: MATH` | |
989 | ||
990 | `structure NetHostDB: NET_HOST_DB` | |
991 | ||
992 | `structure NetProtDB: NET_PROT_DB` | |
993 | ||
994 | `structure NetServDB: NET_SERV_DB` | |
995 | ||
996 | `structure OS: OS` | |
997 | ||
998 | `structure Option: OPTION` | |
999 | ||
1000 | `structure PackReal32Big: PACK_REAL` | |
1001 | ||
1002 | `structure PackReal32Little: PACK_REAL` | |
1003 | ||
1004 | `structure PackReal64Big: PACK_REAL` | |
1005 | ||
1006 | `structure PackReal64Little: PACK_REAL` | |
1007 | ||
1008 | `structure PackRealBig: PACK_REAL` | |
1009 | ||
1010 | `structure PackRealLittle: PACK_REAL` | |
1011 | ||
1012 | `structure PackWord16Big: PACK_WORD` | |
1013 | ||
1014 | `structure PackWord16Little: PACK_WORD` | |
1015 | ||
1016 | `structure PackWord32Big: PACK_WORD` | |
1017 | ||
1018 | `structure PackWord32Little: PACK_WORD` | |
1019 | ||
1020 | `structure PackWord64Big: PACK_WORD` | |
1021 | ||
1022 | `structure PackWord64Little: PACK_WORD` | |
1023 | ||
1024 | `structure Position: INTEGER` | |
1025 | ||
1026 | `structure Posix: POSIX` | |
1027 | ||
1028 | `structure Real: REAL` | |
1029 | ||
1030 | `structure RealArray: MONO_ARRAY` | |
1031 | ||
1032 | `structure RealArray2: MONO_ARRAY2` | |
1033 | ||
1034 | `structure RealArraySlice: MONO_ARRAY_SLICE` | |
1035 | ||
1036 | `structure RealVector: MONO_VECTOR` | |
1037 | ||
1038 | `structure RealVectorSlice: MONO_VECTOR_SLICE` | |
1039 | ||
1040 | `structure Real32: REAL` | |
1041 | ||
1042 | `structure Real32Array: MONO_ARRAY` | |
1043 | ||
1044 | `structure Real32Array2: MONO_ARRAY2` | |
1045 | ||
1046 | `structure Real32ArraySlice: MONO_ARRAY_SLICE` | |
1047 | ||
1048 | `structure Real32Vector: MONO_VECTOR` | |
1049 | ||
1050 | `structure Real32VectorSlice: MONO_VECTOR_SLICE` | |
1051 | ||
1052 | `structure Real64: REAL` | |
1053 | ||
1054 | `structure Real64Array: MONO_ARRAY` | |
1055 | ||
1056 | `structure Real64Array2: MONO_ARRAY2` | |
1057 | ||
1058 | `structure Real64ArraySlice: MONO_ARRAY_SLICE` | |
1059 | ||
1060 | `structure Real64Vector: MONO_VECTOR` | |
1061 | ||
1062 | `structure Real64VectorSlice: MONO_VECTOR_SLICE` | |
1063 | ||
1064 | `structure Socket: SOCKET` | |
1065 | ||
1066 | * The Basis Library specification requires functions like | |
1067 | `Socket.sendVec` to raise an exception if they fail. However, on some | |
1068 | platforms, sending to a socket that hasn't yet been connected causes a | |
1069 | `SIGPIPE` signal, which invokes the default signal handler for | |
1070 | `SIGPIPE` and causes the program to terminate. If you want the | |
1071 | exception to be raised, you can ignore `SIGPIPE` by adding the | |
1072 | following to your program. | |
1073 | + | |
1074 | [source,sml] | |
1075 | ---- | |
1076 | let | |
1077 | open MLton.Signal | |
1078 | in | |
1079 | setHandler (Posix.Signal.pipe, Handler.ignore) | |
1080 | end | |
1081 | ---- | |
1082 | ||
1083 | `structure String: STRING` | |
1084 | ||
1085 | * The `String` functions do not depend on locale. | |
1086 | ||
1087 | `structure StringCvt: STRING_CVT` | |
1088 | ||
1089 | `structure Substring: SUBSTRING` | |
1090 | ||
1091 | `structure SysWord: WORD` | |
1092 | ||
1093 | `structure Text: TEXT` | |
1094 | ||
1095 | `structure TextIO: TEXT_IO` | |
1096 | ||
1097 | `structure TextPrimIO: PRIM_IO` | |
1098 | ||
1099 | `structure Time: TIME` | |
1100 | ||
1101 | `structure Timer: TIMER` | |
1102 | ||
1103 | `structure Unix: UNIX` | |
1104 | ||
1105 | `structure UnixSock: UNIX_SOCK` | |
1106 | ||
1107 | `structure Vector: VECTOR` | |
1108 | ||
1109 | `structure VectorSlice: VECTOR_SLICE` | |
1110 | ||
1111 | `structure Word: WORD` | |
1112 | ||
1113 | `structure Word1: WORD` | |
1114 | ||
1115 | `structure Word2: WORD` | |
1116 | ||
1117 | `structure Word3: WORD` | |
1118 | ||
1119 | `structure Word4: WORD` | |
1120 | ||
1121 | ... | |
1122 | ||
1123 | `structure Word31: WORD` | |
1124 | ||
1125 | `structure Word32: WORD` | |
1126 | ||
1127 | `structure Word64: WORD` | |
1128 | ||
1129 | `structure WordArray: MONO_ARRAY` | |
1130 | ||
1131 | `structure WordArray2: MONO_ARRAY2` | |
1132 | ||
1133 | `structure WordArraySlice: MONO_ARRAY_SLICE` | |
1134 | ||
1135 | `structure WordVectorSlice: MONO_VECTOR_SLICE` | |
1136 | ||
1137 | `structure WordVector: MONO_VECTOR` | |
1138 | ||
1139 | `structure Word8Array: MONO_ARRAY` | |
1140 | ||
1141 | `structure Word8Array2: MONO_ARRAY2` | |
1142 | ||
1143 | `structure Word8ArraySlice: MONO_ARRAY_SLICE` | |
1144 | ||
1145 | `structure Word8Vector: MONO_VECTOR` | |
1146 | ||
1147 | `structure Word8VectorSlice: MONO_VECTOR_SLICE` | |
1148 | ||
1149 | `structure Word16Array: MONO_ARRAY` | |
1150 | ||
1151 | `structure Word16Array2: MONO_ARRAY2` | |
1152 | ||
1153 | `structure Word16ArraySlice: MONO_ARRAY_SLICE` | |
1154 | ||
1155 | `structure Word16Vector: MONO_VECTOR` | |
1156 | ||
1157 | `structure Word16VectorSlice: MONO_VECTOR_SLICE` | |
1158 | ||
1159 | `structure Word32Array: MONO_ARRAY` | |
1160 | ||
1161 | `structure Word32Array2: MONO_ARRAY2` | |
1162 | ||
1163 | `structure Word32ArraySlice: MONO_ARRAY_SLICE` | |
1164 | ||
1165 | `structure Word32Vector: MONO_VECTOR` | |
1166 | ||
1167 | `structure Word32VectorSlice: MONO_VECTOR_SLICE` | |
1168 | ||
1169 | `structure Word64Array: MONO_ARRAY` | |
1170 | ||
1171 | `structure Word64Array2: MONO_ARRAY2` | |
1172 | ||
1173 | `structure Word64ArraySlice: MONO_ARRAY_SLICE` | |
1174 | ||
1175 | `structure Word64Vector: MONO_VECTOR` | |
1176 | ||
1177 | `structure Word64VectorSlice: MONO_VECTOR_SLICE` | |
1178 | ||
1179 | == Top-level functors == | |
1180 | ||
1181 | `ImperativeIO` | |
1182 | ||
1183 | `PrimIO` | |
1184 | ||
1185 | `StreamIO` | |
1186 | ||
1187 | * MLton's `StreamIO` functor takes structures `ArraySlice` and | |
1188 | `VectorSlice` in addition to the arguments specified in the Basis | |
1189 | Library specification. | |
1190 | ||
1191 | == Type equivalences == | |
1192 | ||
1193 | The following types are equivalent. | |
1194 | ---- | |
1195 | FixedInt = Int64.int | |
1196 | LargeInt = IntInf.int | |
1197 | LargeReal.real = Real64.real | |
1198 | LargeWord = Word64.word | |
1199 | ---- | |
1200 | ||
1201 | The default `int`, `real`, and `word` types may be set by the | |
1202 | ++-default-type __type__++ <:CompileTimeOptions: compile-time option>. | |
1203 | By default, the following types are equivalent: | |
1204 | ---- | |
1205 | int = Int.int = Int32.int | |
1206 | real = Real.real = Real64.real | |
1207 | word = Word.word = Word32.word | |
1208 | ---- | |
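
As a minimal sketch of what these equivalences mean in practice, the declarations below should type check only because each pair of types is the same type. It assumes MLton's default settings (no `-default-type` override); the value names `a` through `h` are purely illustrative.

[source,sml]
----
(* Assumption: compiled with MLton's defaults, i.e. without -default-type. *)

(* Default types: int = Int.int = Int32.int, and likewise for real and word. *)
val a : Int.int = 42
val b : Int32.int = a
val c : Real64.real = (3.14 : real)
val d : Word32.word = (0wx2A : word)

(* Fixed/large types: each annotation names the equivalent concrete type. *)
val e : Int64.int = FixedInt.fromInt 1
val f : IntInf.int = LargeInt.fromInt 1
val g : Real64.real = LargeReal.fromInt 1
val h : Word64.word = LargeWord.fromInt 1
----

If the default `int` type is changed with `-default-type`, the binding of `b` would be expected to stop type checking, which makes a snippet like this a quick way to detect code that depends on the default.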
1209 | ||
1210 | == Real and Math functions == | |
1211 | ||
1212 | The `Real`, `Real32`, and `Real64` modules are implemented | |
1213 | using the `C` math library, so the SML functions will reflect the | |
1214 | behavior of the underlying library function. We have made some effort | |
1215 | to unify the differences between the math libraries on different | |
1216 | platforms, and in particular to handle exceptional cases according to | |
1217 | the Basis Library specification. However, there will be differences | |
1218 | due to different numerical algorithms and cases we may have missed. | |
1219 | Please submit a <:Bug:bug report> if you encounter an error in | |
1220 | the handling of an exceptional case. | |
1221 | ||
1222 | On x86, real arithmetic is implemented internally using 80 bits of | |
1223 | precision. Using higher precision for intermediate results in | |
1224 | computations can lead to different results than if all the computation | |
1225 | is done at 32 or 64 bits. If you require strict IEEE compliance, you | |
1226 | can compile with `-ieee-fp true`, which will cause intermediate | |
1227 | results to be stored after each operation. This may cause a | |
1228 | substantial performance penalty. | |
1229 | ||
1230 | <<< | |
1231 | ||
1232 | :mlton-guide-page: Bug | |
1233 | [[Bug]] | |
1234 | Bug | |
1235 | === | |
1236 | ||
1237 | To report a bug, please send mail to | |
1238 | mailto:mlton-devel@mlton.org[`mlton-devel@mlton.org`]. Please include | |
1239 | the complete SML program that caused the problem and a log of a | |
1240 | compile of the program with `-verbose 2`. For large programs (over | |
1241 | 256K), please send an email containing the discussion text and a link | |
1242 | to any large files. | |
1243 | ||
1244 | There are some <:UnresolvedBugs:> that we don't plan to fix. | |
1245 | ||
1246 | We also maintain a list of bugs found with each release. | |
1247 | ||
1248 | * <:Bugs20130715:> | |
1249 | * <:Bugs20100608:> | |
1250 | * <:Bugs20070826:> | |
1251 | * <:Bugs20051202:> | |
1252 | * <:Bugs20041109:> | |
1253 | ||
1254 | <<< | |
1255 | ||
1256 | :mlton-guide-page: Bugs20041109 | |
1257 | [[Bugs20041109]] | |
1258 | Bugs20041109 | |
1259 | ============ | |
1260 | ||
1261 | Here are the known bugs in <:Release20041109:MLton 20041109>, listed | |
1262 | in reverse chronological order of date reported. | |
1263 | ||
1264 | * <!Anchor(bug17)> | |
1265 | `MLton.Finalizable.touch` doesn't necessarily keep values alive | |
1266 | long enough. Our SVN has a patch to the compiler. You must rebuild | |
1267 | the compiler in order for the patch to take effect. | |
1268 | + | |
1269 | Thanks to Florian Weimer for reporting this bug. | |
1270 | ||
1271 | * <!Anchor(bug16)> | |
1272 | A bug in an optimization pass may incorrectly transform a program | |
1273 | to flatten ref cells into their containing data structure, yielding a | |
1274 | type-error in the transformed program. Our CVS has a | |
1275 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.35&r2=1.37[patch] | |
1276 | to the compiler. You must rebuild the compiler in order for the | |
1277 | patch to take effect. | |
1278 | + | |
1279 | Thanks to <:VesaKarvonen:> for reporting this bug. | |
1280 | ||
1281 | * <!Anchor(bug15)> | |
1282 | A bug in the front end mistakenly allows unary constructors to be | |
1283 | used without an argument in patterns. For example, the following | |
1284 | program is accepted, and triggers a large internal error. | |
1285 | + | |
1286 | [source,sml] | |
1287 | ---- | |
1288 | fun f x = case x of SOME => true | _ => false | |
1289 | ---- | |
1290 | + | |
1291 | We have fixed the problem in our CVS. | |
1292 | + | |
1293 | Thanks to William Lovas for reporting this bug. | |
1294 | ||
1295 | * <!Anchor(bug14)> | |
1296 | A bug in `Posix.IO.{getlk,setlk,setlkw}` causes a link-time error: | |
1297 | `undefined reference to Posix_IO_FLock_typ`. | |
1298 | Our CVS has a | |
1299 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/posix/primitive.sml.diff?r1=1.34&r2=1.35[patch] | |
1300 | to the Basis Library implementation. | |
1301 | + | |
1302 | Thanks to Adam Chlipala for reporting this bug. | |
1303 | ||
1304 | * <!Anchor(bug13)> | |
1305 | A bug can cause programs compiled with `-profile alloc` to | |
1306 | segfault. Our CVS has a | |
1307 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/ssa-to-rssa.fun.diff?r1=1.106&r2=1.107[patch] | |
1308 | to the compiler. You must rebuild the compiler in order for the | |
1309 | patch to take effect. | |
1310 | + | |
1311 | Thanks to John Reppy for reporting this bug. | |
1312 | ||
1313 | * <!Anchor(bug12)> | |
1314 | A bug in an optimization pass may incorrectly flatten ref cells | |
1315 | into their containing data structure, breaking the sharing between | |
1316 | the cells. Our CVS has a | |
1317 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/ssa/ref-flatten.fun.diff?r1=1.32&r2=1.33[patch] | |
1318 | to the compiler. You must rebuild the compiler in order for the | |
1319 | patch to take effect. | |
1320 | + | |
1321 | Thanks to Paul Govereau for reporting this bug. | |
1322 | ||
1323 | * <!Anchor(bug11)> | |
1324 | Some arrays or vectors, such as `(char * char) vector`, are | |
1325 | incorrectly implemented, and will conflate the first and second | |
1326 | components of each element. Our CVS has a | |
1327 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/mlton/backend/packed-representation.fun.diff?r1=1.32&r2=1.33[patch] | |
1328 | to the compiler. You must rebuild the compiler in order for the | |
1329 | patch to take effect. | |
1330 | + | |
1331 | Thanks to Scott Cruzen for reporting this bug. | |
1332 | ||
1333 | * <!Anchor(bug10)> | |
1334 | `Socket.Ctl.getLINGER` and `Socket.Ctl.setLINGER` | |
1335 | mistakenly raise `Subscript`. | |
1336 | Our CVS has a | |
1337 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/socket.sml.diff?r1=1.14&r2=1.15[patch] | |
1338 | to the Basis Library implementation. | |
1339 | + | |
1340 | Thanks to Ray Racine for reporting the bug. | |
1341 | ||
1342 | * <!Anchor(bug09)> | |
1343 | <:ConcurrentML: CML> `Mailbox.send` makes a call in the wrong atomic context. | |
1344 | Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/lib/cml/core-cml/mailbox.sml.diff?r1=1.3&r2=1.4[patch] | |
1345 | to the CML implementation. | |
1346 | ||
1347 | * <!Anchor(bug08)> | |
1348 | `OS.Path.joinDirFile` and `OS.Path.toString` did not | |
1349 | raise `InvalidArc` when they were supposed to. They now do. | |
1350 | Our CVS has a http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/system/path.sml.diff?r1=1.8&r2=1.11[patch] | |
1351 | to the Basis Library implementation. | |
1352 | + | |
1353 | Thanks to Andreas Rossberg for reporting the bug. | |
1354 | ||
1355 | * <!Anchor(bug07)> | |
1356 | The front end incorrectly disallows sequences of expressions | |
1357 | (separated by semicolons) after a topdec has already been processed. | |
1358 | For example, the following is incorrectly rejected. | |
1359 | + | |
1360 | [source,sml] | |
1361 | ---- | |
1362 | val x = 0; | |
1363 | ignore x; | |
1364 | ignore x; | |
1365 | ---- | |
1366 | + | |
1367 | We have fixed the problem in our CVS. | |
1368 | + | |
1369 | Thanks to Andreas Rossberg for reporting the bug. | |
1370 | ||
1371 | * <!Anchor(bug06)> | |
1372 | The front end incorrectly disallows expansive `val` | |
1373 | declarations that bind a type variable that doesn't occur in the | |
1374 | type of the value being bound. For example, the following is | |
1375 | incorrectly rejected. | |
1376 | + | |
1377 | [source,sml] | |
1378 | ---- | |
1379 | val 'a x = let exception E of 'a in () end | |
1380 | ---- | |
1381 | + | |
1382 | We have fixed the problem in our CVS. | |
1383 | + | |
1384 | Thanks to Andreas Rossberg for reporting this bug. | |
1385 | ||
1386 | * <!Anchor(bug05)> | |
1387 | The x86 codegen fails to account for the possibility that a 64-bit | |
1388 | move could interfere with itself (as simulated by 32-bit moves). We | |
1389 | have fixed the problem in our CVS. | |
1390 | + | |
1391 | Thanks to Scott Cruzen for reporting this bug. | |
1392 | ||
1393 | * <!Anchor(bug04)> | |
1394 | `NetHostDB.scan` and `NetHostDB.fromString` incorrectly | |
1395 | raise an exception on internet addresses whose last component is a | |
1396 | zero, e.g., `0.0.0.0`. Our CVS has a | |
1397 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/net/net-host-db.sml.diff?r1=1.12&r2=1.13[patch] to the Basis Library implementation. | |
1398 | + | |
1399 | Thanks to Scott Cruzen for reporting this bug. | |
1400 | ||
1401 | * <!Anchor(bug03)> | |
1402 | `StreamIO.inputLine` has an off-by-one error causing it to drop | |
1403 | the first character after a newline in some situations. Our CVS has a | |
1404 | http://mlton.org/cgi-bin/viewcvs.cgi/mlton/mlton/basis-library/io/stream-io.fun.diff?r1=text&tr1=1.29&r2=text&tr2=1.30&diff_format=h[patch] | |
1405 | to the Basis Library implementation. | |
1406 | + | |
1407 | Thanks to Scott Cruzen for reporting this bug. | |
1408 | ||
1409 | * <!Anchor(bug02)> | |
1410 | `BinIO.getInstream` and `TextIO.getInstream` are | |
1411 | implemented incorrectly. This also impacts the behavior of | |
1412 | `BinIO.scanStream` and `TextIO.scanStream`. If you (directly | |
1413 | or indirectly) realize a `TextIO.StreamIO.instream` and do not | |
1414 | (directly or indirectly) call `TextIO.setInstream` with a derived | |
1415 | stream, you may lose input data. We have fixed the problem in our | |
1416 | CVS. | |
1417 | + | |
1418 | Thanks to <:WesleyTerpstra:> for reporting this bug. | |
1419 | ||
1420 | * <!Anchor(bug01)> | |
1421 | `Posix.ProcEnv.setpgid` doesn't work. If you compile a program | |
1422 | that uses it, you will get a link-time error: | |
1423 | + | |
1424 | ---- | |
1425 | undefined reference to `Posix_ProcEnv_setpgid' | |
1426 | ---- | |
1427 | + | |
1428 | The bug is due to `Posix_ProcEnv_setpgid` being omitted from the | |
1429 | MLton runtime. We fixed the problem in our CVS by adding the | |
1430 | following definition to `runtime/Posix/ProcEnv/ProcEnv.c` | |
1431 | + | |
1432 | [source,c] | |
1433 | ---- | |
1434 | Int Posix_ProcEnv_setpgid (Pid p, Gid g) { | |
1435 | return setpgid (p, g); | |
1436 | } | |
1437 | ---- | |
1438 | + | |
1439 | Thanks to Tom Murphy for reporting this bug. | |
1440 | ||
1441 | <<< | |
1442 | ||
1443 | :mlton-guide-page: Bugs20051202 | |
1444 | [[Bugs20051202]] | |
1445 | Bugs20051202 | |
1446 | ============ | |
1447 | ||
1448 | Here are the known bugs in <:Release20051202:MLton 20051202>, listed | |
1449 | in reverse chronological order of date reported. | |
1450 | ||
1451 | * <!Anchor(bug16)> | |
1452 | Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.fmt:VAL[++Real__<N>__.fmt++], http://www.standardml.org/Basis/real.html#SIG:REAL.fromString:VAL[++Real__<N>__.fromString++], http://www.standardml.org/Basis/real.html#SIG:REAL.scan:VAL[++Real__<N>__.scan++], and http://www.standardml.org/Basis/real.html#SIG:REAL.toString:VAL[++Real__<N>__.toString++] functions of the <:BasisLibrary:Basis Library> implementation. These functions were using `TO_NEAREST` semantics, but should obey the current rounding mode. (Only ++Real__<N>__.fmt StringCvt.EXACT++, ++Real__<N>__.fromDecimal++, and ++Real__<N>__.toDecimal++ are specified to override the current rounding mode with `TO_NEAREST` semantics.) | |
1453 | + | |
1454 | Thanks to Sean McLaughlin for the bug report. | |
1455 | + | |
1456 | Fixed by revision <!ViewSVNRev(5827)>. | |
1457 | ||
1458 | * <!Anchor(bug15)> | |
1459 | Bug in the treatment of floating-point operations. Floating-point operations depend on the current rounding mode, but were being treated as pure. | |
1460 | + | |
1461 | Thanks to Sean McLaughlin for the bug report. | |
1462 | + | |
1463 | Fixed by revision <!ViewSVNRev(5794)>. | |
1464 | ||
1465 | * <!Anchor(bug14)> | |
1466 | Bug in the http://www.standardml.org/Basis/real.html#SIG:REAL.toInt:VAL[++Real32.toInt++] function of the <:BasisLibrary:Basis Library> implementation could lead to incorrect results when applied to a `Real32.real` value numerically close to `valOf(Int.maxInt)`. | |
1467 | + | |
1468 | Fixed by revision <!ViewSVNRev(5764)>. | |
1469 | ||
1470 | * <!Anchor(bug13)> | |
1471 | The http://www.standardml.org/Basis/socket.html[++Socket++] structure of the <:BasisLibrary:Basis Library> implementation used `andb` rather than `orb` to unmarshal socket options (for ++Socket.Ctl.get__<OPT>__++ functions). | |
1472 | + | |
1473 | Thanks to Anders Petersson for the bug report and patch. | |
1474 | + | |
1475 | Fixed by revision <!ViewSVNRev(5735)>. | |
1476 | ||
1477 | * <!Anchor(bug12)> | |
1478 | Bug in the http://www.standardml.org/Basis/date.html[++Date++] structure of the <:BasisLibrary:Basis Library> implementation yielded some functions that would erroneously raise `Date` when applied to a year before 1900. | |
1479 | + | |
1480 | Thanks to Joe Hurd for the bug report. | |
1481 | + | |
1482 | Fixed by revision <!ViewSVNRev(5732)>. | |
1483 | ||
1484 | * <!Anchor(bug11)> | |
1485 | Bug in monomorphisation pass could exhibit the error `Type error: type mismatch`. | |
1486 | + | |
1487 | Thanks to Vesa Karvonen for the bug report. | |
1488 | + | |
1489 | Fixed by revision <!ViewSVNRev(5731)>. | |
1490 | ||
1491 | * <!Anchor(bug10)> | |
1492 | The http://www.standardml.org/Basis/pack-float.html#SIG:PACK_REAL.toBytes:VAL[++PackReal__<N>__.toBytes++] function in the <:BasisLibrary:Basis Library> implementation incorrectly shared (and mutated) the result vector. | |
1493 | + | |
1494 | Thanks to Eric McCorkle for the bug report and patch. | |
1495 | + | |
1496 | Fixed by revision <!ViewSVNRev(5281)>. | |
1497 | ||
1498 | * <!Anchor(bug09)> | |
1499 | Bug in elaboration of FFI forms. Using a unary FFI type (e.g., `array`, `ref`, `vector`) in places where `MLton.Pointer.t` was required would lead to an internal error `TypeError`. | |
1500 | + | |
1501 | Fixed by revision <!ViewSVNRev(4890)>. | |
1502 | ||
1503 | * <!Anchor(bug08)> | |
1504 | The http://www.standardml.org/Basis/mono-vector.html[++MONO_VECTOR++] signature of the <:BasisLibrary:Basis Library> implementation incorrectly omits the specification of `find`. | |
1505 | + | |
1506 | Fixed by revision <!ViewSVNRev(4707)>. | |
1507 | ||
1508 | * <!Anchor(bug07)> | |
1509 | The optimizer reports an internal error (`TypeError`) when an imported C function is called but not used. | |
1510 | + | |
1511 | Thanks to "jq" for the bug report. | |
1512 | + | |
1513 | Fixed by revision <!ViewSVNRev(4690)>. | |
1514 | ||
1515 | * <!Anchor(bug06)> | |
1516 | Bug in pass to flatten data structures. | |
1517 | + | |
1518 | Thanks to Joe Hurd for the bug report. | |
1519 | + | |
1520 | Fixed by revision <!ViewSVNRev(4662)>. | |
1521 | ||
1522 | * <!Anchor(bug05)> | |
1523 | The native codegen's implementation of the C-calling convention failed to widen 16-bit arguments to 32 bits. | |
1524 | + | |
1525 | Fixed by revision <!ViewSVNRev(4631)>. | |
1526 | ||
1527 | * <!Anchor(bug04)> | |
1528 | The http://www.standardml.org/Basis/pack-float.html[++PACK_REAL++] structures of the <:BasisLibrary:Basis Library> implementation used byte, rather than element, indexing. | |
1529 | + | |
1530 | Fixed by revision <!ViewSVNRev(4411)>. | |
1531 | ||
1532 | * <!Anchor(bug03)> | |
1533 | `MLton.share` could cause a segmentation fault. | |
1534 | + | |
1535 | Fixed by revision <!ViewSVNRev(4400)>. | |
1536 | ||
1537 | * <!Anchor(bug02)> | |
1538 | The SSA simplifier could eliminate an irredundant test. | |
1539 | + | |
1540 | Fixed by revision <!ViewSVNRev(4370)>. | |
1541 | ||
1542 | * <!Anchor(bug01)> | |
1543 | A program with a very large number of functors could exhibit the error `ElaborateEnv.functorClosure: firstTycons`. | |
1544 | + | |
1545 | Fixed by revision <!ViewSVNRev(4344)>. | |
1546 | ||
1547 | <<< | |
1548 | ||
1549 | :mlton-guide-page: Bugs20070826 | |
1550 | [[Bugs20070826]] | |
1551 | Bugs20070826 | |
1552 | ============ | |
1553 | ||
1554 | Here are the known bugs in <:Release20070826:MLton 20070826>, listed | |
1555 | in reverse chronological order of date reported. | |
1556 | ||
1557 | * <!Anchor(bug25)> | |
1558 | Bug in the mark-compact garbage collector where the C library's `memcpy` was used to move objects during the compaction phase; this could lead to heap corruption and segmentation faults with newer versions of gcc and/or glibc, which assume that src and dst in a `memcpy` do not overlap. | |
1559 | + | |
1560 | Fixed by revision <!ViewSVNRev(7461)>. | |
1561 | ||
1562 | * <!Anchor(bug24)> | |
1563 | Bug in elaboration of `datatype` declarations with `withtype` bindings. | |
1564 | + | |
1565 | Fixed by revision <!ViewSVNRev(7434)>. | |
1566 | ||
1567 | * <!Anchor(bug23)> | |
1568 | Performance bug in <:RefFlatten:> optimization pass. | |
1569 | + | |
1570 | Thanks to Reactive Systems for the bug report. | |
1571 | + | |
1572 | Fixed by revision <!ViewSVNRev(7379)>. | |
1573 | ||
1574 | * <!Anchor(bug22)> | |
1575 | Performance bug in <:SimplifyTypes:> optimization pass. | |
1576 | + | |
1577 | Thanks to Reactive Systems for the bug report. | |
1578 | + | |
1579 | Fixed by revisions <!ViewSVNRev(7377)> and <!ViewSVNRev(7378)>. | |
1580 | ||
1581 | * <!Anchor(bug21)> | |
1582 | Bug in amd64 codegen register allocation of indirect C calls. | |
1583 | + | |
1584 | Thanks to David Hansel for the bug report. | |
1585 | + | |
1586 | Fixed by revision <!ViewSVNRev(7368)>. | |
1587 | ||
1588 | * <!Anchor(bug20)> | |
1589 | Bug in `IntInf.scan` and `IntInf.fromString` where leading spaces were only accepted if the stream had an explicit sign character. | |
1590 | + | |
1591 | Thanks to David Hansel for the bug report. | |
1592 | + | |
1593 | Fixed by revisions <!ViewSVNRev(7227)> and <!ViewSVNRev(7230)>. | |
1594 | ||
1595 | * <!Anchor(bug19)> | |
1596 | Bug in `IntInf.~>>` that could cause a `glibc` assertion. | |
1597 | + | |
1598 | Fixed by revisions <!ViewSVNRev(7083)>, <!ViewSVNRev(7084)>, and <!ViewSVNRev(7085)>. | |
1599 | ||
1600 | * <!Anchor(bug18)> | |
1601 | Bug in the return type of `MLton.Process.reap`. | |
1602 | + | |
1603 | Thanks to Risto Saarelma for the bug report. | |
1604 | + | |
1605 | Fixed by revision <!ViewSVNRev(7029)>. | |
1606 | ||
1607 | * <!Anchor(bug17)> | |
1608 | Bug in `MLton.size` and `MLton.share` when tracing the current stack. | |
1609 | + | |
1610 | Fixed by revisions <!ViewSVNRev(6978)>, <!ViewSVNRev(6981)>, <!ViewSVNRev(6988)>, <!ViewSVNRev(6989)>, and <!ViewSVNRev(6990)>. | |
1611 | ||
1612 | * <!Anchor(bug16)> | |
1613 | Bug in nested `_export`/`_import` functions. | |
1614 | + | |
1615 | Fixed by revision <!ViewSVNRev(6919)>. | |
1616 | ||
1617 | * <!Anchor(bug15)> | |
1618 | Bug in the name mangling of `_import`-ed functions with the `stdcall` convention. | |
1619 | + | |
1620 | Thanks to Lars Bergstrom for the bug report. | |
1621 | + | |
1622 | Fixed by revision <!ViewSVNRev(6672)>. | |
1623 | ||
1624 | * <!Anchor(bug14)> | |
1625 | Bug in Windows code to page the heap to disk when unable to grow the heap to a desired size. | |
1626 | + | |
1627 | Thanks to Sami Evangelista for the bug report. | |
1628 | + | |
1629 | Fixed by revisions <!ViewSVNRev(6600)> and <!ViewSVNRev(6624)>. | |
1630 | ||
1631 | * <!Anchor(bug13)> | |
1632 | Bug in \*NIX code to page the heap to disk when unable to grow the heap to a desired size. | |
1633 | + | |
1634 | Thanks to Nicolas Bertolotti for the bug report and patch. | |
1635 | + | |
1636 | Fixed by revisions <!ViewSVNRev(6596)> and <!ViewSVNRev(6600)>. | |
1637 | ||
1638 | * <!Anchor(bug12)> | |
1639 | Space-safety bug in pass to <:RefFlatten: flatten refs> into containing data structure. | |
1640 | + | |
1641 | Thanks to Daniel Spoonhower for the bug report and initial diagnosis and patch. | |
1642 | + | |
1643 | Fixed by revision <!ViewSVNRev(6395)>. | |
1644 | ||
1645 | * <!Anchor(bug11)> | |
1646 | Bug in the frontend that rejected `op longvid` patterns and expressions. | |
1647 | + | |
1648 | Thanks to Florian Weimer for the bug report. | |
1649 | + | |
1650 | Fixed by revision <!ViewSVNRev(6347)>. | |
1651 | ||
1652 | * <!Anchor(bug10)> | |
1653 | Bug in the http://www.standardml.org/Basis/imperative-io.html#SIG:IMPERATIVE_IO.canInput:VAL[`IMPERATIVE_IO.canInput`] function of the <:BasisLibrary:Basis Library> implementation. | |
1654 | + | |
1655 | Thanks to Ville Laurikari for the bug report. | |
1656 | + | |
1657 | Fixed by revision <!ViewSVNRev(6261)>. | |
1658 | ||
1659 | * <!Anchor(bug09)> | |
1660 | Bug in algebraic simplification of real primitives. http://www.standardml.org/Basis/real.html#SIG:REAL.\|@LTE\|:VAL[++REAL__<N>__.\<=(x, x)++] is `false` when `x` is NaN. | |
1661 | + | |
1662 | Fixed by revision <!ViewSVNRev(6242)>. | |
1663 | ||
1664 | * <!Anchor(bug08)> | |
1665 | Bug in the FFI-visible representation of `Int16.int ref` (and references of other primitive types smaller than 32 bits) on big-endian platforms. | |
1666 | + | |
1667 | Thanks to Dave Herman for the bug report. | |
1668 | + | |
1669 | Fixed by revision <!ViewSVNRev(6267)>. | |
1670 | ||
1671 | * <!Anchor(bug07)> | |
1672 | Bug in type inference of flexible records. This would later cause the compiler to raise the `TypeError` exception. | |
1673 | + | |
1674 | Thanks to Wesley Terpstra for the bug report. | |
1675 | + | |
1676 | Fixed by revision <!ViewSVNRev(6229)>. | |
1677 | ||
1678 | * <!Anchor(bug06)> | |
1679 | Bug in cross-compilation of `gdtoa` library. | |
1680 | + | |
1681 | Thanks to Wesley Terpstra for the bug report and patch. | |
1682 | + | |
1683 | Fixed by revision <!ViewSVNRev(6620)>. | |
1684 | ||
1685 | * <!Anchor(bug05)> | |
1686 | Bug in pass to <:RefFlatten: flatten refs> into containing data structure. | |
1687 | + | |
1688 | Thanks to Ruy Ley-Wild for the bug report. | |
1689 | + | |
1690 | Fixed by revision <!ViewSVNRev(6191)>. | |
1691 | ||
1692 | * <!Anchor(bug04)> | |
1693 | Bug in the handling of weak pointers by the mark-compact garbage collector. | |
1694 | + | |
1695 | Thanks to Sean McLaughlin for the bug report and Florian Weimer for the initial diagnosis. | |
1696 | + | |
1697 | Fixed by revision <!ViewSVNRev(6183)>. | |
1698 | ||
1699 | * <!Anchor(bug03)> | |
1700 | Bug in the elaboration of structures with signature constraints. This would later cause the compiler to raise the `TypeError` exception. | |
1701 | + | |
1702 | Thanks to Vesa Karvonen for the bug report. | |
1703 | + | |
1704 | Fixed by revision <!ViewSVNRev(6046)>. | |
1705 | ||
1706 | * <!Anchor(bug02)> | |
1707 | Bug in the interaction of `_export`-ed functions and signal handlers. | |
1708 | + | |
1709 | Thanks to Sean McLaughlin for the bug report. | |
1710 | + | |
1711 | Fixed by revision <!ViewSVNRev(6013)>. | |
1712 | ||
1713 | * <!Anchor(bug01)> | |
1714 | Bug in the implementation of `_export`-ed functions using the `char` type, leading to a linker error. | |
1715 | + | |
1716 | Thanks to Katsuhiro Ueno for the bug report. | |
1717 | + | |
1718 | Fixed by revision <!ViewSVNRev(5999)>. | |
1719 | ||
1720 | <<< | |
1721 | ||
1722 | :mlton-guide-page: Bugs20100608 | |
1723 | [[Bugs20100608]] | |
1724 | Bugs20100608 | |
1725 | ============ | |
1726 | ||
1727 | Here are the known bugs in <:Release20100608:MLton 20100608>, listed | |
1728 | in reverse chronological order of date reported. | |
1729 | ||
1730 | * <!Anchor(bug11)> | |
1731 | Bugs in `REAL.signBit`, `REAL.copySign`, and `REAL.toDecimal`/`REAL.fromDecimal`. | |
1732 | + | |
1733 | Thanks to Phil Clayton for the bug report and examples. | |
1734 | + | |
1735 | Fixed by revisions <!ViewSVNRev(7571)>, <!ViewSVNRev(7572)>, and <!ViewSVNRev(7573)>. | |
1736 | ||
1737 | * <!Anchor(bug10)> | |
1738 | Bug in elaboration of type variables with and without equality status. | |
1739 | + | |
1740 | Thanks to Rob Simmons for the bug report and examples. | |
1741 | + | |
1742 | Fixed by revision <!ViewSVNRev(7565)>. | |
1743 | ||
1744 | * <!Anchor(bug09)> | |
1745 | Bug in <:Redundant:redundant> <:SSA:> optimization. | |
1746 | + | |
1747 | Thanks to Lars Magnusson for the bug report and example. | |
1748 | + | |
1749 | Fixed by revision <!ViewSVNRev(7561)>. | |
1750 | ||
1751 | * <!Anchor(bug08)> | |
1752 | Bug in <:SSA:>/<:SSA2:> <:Shrink:shrinker> that could erroneously turn a non-tail function call with a `Bug` transfer as its continuation into a tail function call. | |
1753 | + | |
1754 | Thanks to Lars Bergstrom for the bug report. | |
1755 | + | |
1756 | Fixed by revision <!ViewSVNRev(7546)>. | |
1757 | ||
1758 | * <!Anchor(bug07)> | |
1759 | Bug in translation from <:SSA2:> to <:RSSA:> with `case` expressions over non-primitive-sized words. | |
1760 | + | |
1761 | Fixed by revision <!ViewSVNRev(7544)>. | |
1762 | ||
1763 | * <!Anchor(bug06)> | |
1764 | Bug with <:SSA:>/<:SSA2:> type checking of case expressions over words. | |
1765 | + | |
1766 | Fixed by revision <!ViewSVNRev(7542)>. | |
1767 | ||
1768 | * <!Anchor(bug05)> | |
1769 | Bug with treatment of `as`-patterns, which should not allow the redefinition of constructor status. | |
1770 | + | |
1771 | Thanks to Michael Norrish for the bug report. | |
1772 | + | |
1773 | Fixed by revision <!ViewSVNRev(7530)>. | |
1774 | ||
1775 | * <!Anchor(bug04)> | |
1776 | Bug with treatment of `nan` in <:CommonSubexp:common subexpression elimination> <:SSA:> optimization. | |
1777 | + | |
1778 | Thanks to Alexandre Hamez for the bug report. | |
1779 | + | |
1780 | Fixed by revision <!ViewSVNRev(7503)>. | |
1781 | ||
1782 | * <!Anchor(bug03)> | |
1783 | Bug in translation from <:SSA2:> to <:RSSA:> with weak pointers. | |
1784 | + | |
1785 | Thanks to Alexandre Hamez for the bug report. | |
1786 | + | |
1787 | Fixed by revision <!ViewSVNRev(7502)>. | |
1788 | ||
1789 | * <!Anchor(bug02)> | |
1790 | Bug in amd64 codegen calling convention for varargs C calls. | |
1791 | + | |
1792 | Thanks to <:HenryCejtin:> for the bug report and <:WesleyTerpstra:> for the initial diagnosis. | |
1793 | + | |
1794 | Fixed by revision <!ViewSVNRev(7501)>. | |
1795 | ||
1796 | * <!Anchor(bug01)> | |
1797 | Bug in comment-handling in lexer for <:MLYacc:>'s input language. | |
1798 | + | |
1799 | Thanks to Michael Norrish for the bug report and patch. | |
1800 | + | |
1801 | Fixed by revision <!ViewSVNRev(7500)>. | |
1802 | ||
1803 | * <!Anchor(bug00)> | |
1804 | Bug in elaboration of function clauses with different numbers of arguments that would raise an uncaught `Subscript` exception. | |
1805 | + | |
1806 | Fixed by revision <!ViewSVNRev(75497)>. | |
1807 | ||
1808 | <<< | |
1809 | ||
1810 | :mlton-guide-page: Bugs20130715 | |
1811 | [[Bugs20130715]] | |
1812 | Bugs20130715 | |
1813 | ============ | |
1814 | ||
1815 | Here are the known bugs in <:Release20130715:MLton 20130715>, listed | |
1816 | in reverse chronological order of date reported. | |
1817 | ||
1818 | * <!Anchor(bug06)> | |
1819 | Bug with simultaneous `sharing` of multiple structures. | |
1820 | + | |
1821 | Fixed by commit <!ViewGitCommit(mlton,9cb5164f6)>. | |
1822 | ||
1823 | * <!Anchor(bug05)> | |
1824 | Minor bug with exception replication. | |
1825 | + | |
1826 | Fixed by commit <!ViewGitCommit(mlton,1c89c42f6)>. | |
1827 | ||
1828 | * <!Anchor(bug04)> | |
1829 | Minor bug erroneously accepting symbolic identifiers for strid, sigid, and fctid | |
1830 | and erroneously accepting symbolic identifiers before `.` in long identifiers. | |
1831 | + | |
1832 | Fixed by commit <!ViewGitCommit(mlton,9a56be647)>. | |
1833 | ||
1834 | * <!Anchor(bug03)> | |
1835 | Minor bug in precedence parsing of function clauses. | |
1836 | + | |
1837 | Fixed by commit <!ViewGitCommit(mlton,1a6d25ec9)>. | |
1838 | ||
1839 | * <!Anchor(bug02)> | |
1840 | Performance bug in creation of worker threads to service calls of `_export`-ed | |
1841 | functions. | |
1842 | + | |
1843 | Thanks to Bernard Berthomieu for the bug report. | |
1844 | + | |
1845 | Fixed by commit <!ViewGitCommit(mlton,97c2bdf1d)>. | |
1846 | ||
1847 | * <!Anchor(bug01)> | |
1848 | Bug in `MLton.IntInf.fromRep` that could yield values that violate the `IntInf` | |
1849 | representation invariants. | |
1850 | + | |
1851 | Thanks to Rob Simmons for the bug report. | |
1852 | + | |
1853 | Fixed by commit <!ViewGitCommit(mlton,3add91eda)>. | |
1854 | ||
1855 | * <!Anchor(bug00)> | |
1856 | Bug in equality status of some arrays, vectors, and slices in Basis Library | |
1857 | implementation. | |
1858 | + | |
1859 | Fixed by commit <!ViewGitCommit(mlton,a7ed9cbf1)>. | |
1860 | ||
1861 | <<< | |
1862 | ||
1863 | :mlton-guide-page: Bugs20180207 | |
1864 | [[Bugs20180207]] | |
1865 | Bugs20180207 | |
1866 | ============ | |
1867 | ||
1868 | Here are the known bugs in <:Release20180207:MLton 20180207>, listed | |
1869 | in reverse chronological order of date reported. | |
1870 | ||
1871 | <<< | |
1872 | ||
1873 | :mlton-guide-page: CallGraph | |
1874 | [[CallGraph]] | |
1875 | CallGraph | |
1876 | ========= | |
1877 | ||
1878 | For easier visualization of <:Profiling:profiling> data, `mlprof` can | |
1879 | create a call graph of the program in dot format, from which you can | |
1880 | use the http://www.research.att.com/sw/tools/graphviz/[graphviz] | |
1881 | software package to create a PostScript or PNG graph. For example, | |
1882 | ---- | |
1883 | mlprof -call-graph foo.dot foo mlmon.out | |
1884 | ---- | |
1885 | will create `foo.dot` with a complete call graph. For each source | |
1886 | function, there will be one node in the graph that contains the | |
1887 | function name (and source position with `-show-line true`), as | |
1888 | well as the percentage of ticks. If you want to create a call graph | |
1889 | for your program without any profiling data, you can simply call | |
1890 | `mlprof` without any `mlmon.out` files, as in | |
1891 | ---- | |
1892 | mlprof -call-graph foo.dot foo | |
1893 | ---- | |
1894 | ||
1895 | Because SML has higher-order functions, the call graph is dependent | |
1896 | on MLton's analysis of which functions call each other. This analysis | |
1897 | depends on many implementation details and might display spurious | |
1898 | edges that a human could conclude are impossible. However, in | |
1899 | practice, the call graphs tend to be very accurate. | |
1900 | ||
1901 | Because call graphs can get big, `mlprof` provides the `-keep` option | |
1902 | to specify the nodes that you would like to see. This option also | |
1903 | controls which functions appear in the table that `mlprof` prints. | |
1904 | The argument to `-keep` is an expression describing a set of source | |
1905 | functions (i.e. graph nodes). The expression _e_ should be of the | |
1906 | following form. | |
1907 | ||
1908 | * ++all++ | |
1909 | * ++"__s__"++ | |
1910 | * ++(and __e ...__)++ | |
1911 | * ++(from __e__)++ | |
1912 | * ++(not __e__)++ | |
1913 | * ++(or __e__)++ | |
1914 | * ++(pred __e__)++ | |
1915 | * ++(succ __e__)++ | |
1916 | * ++(thresh __x__)++ | |
1917 | * ++(thresh-gc __x__)++ | |
1918 | * ++(thresh-stack __x__)++ | |
1919 | * ++(to __e__)++ | |
1920 | ||
1921 | In the grammar, ++all++ denotes the set of all nodes. ++"__s__"++ is | |
1922 | a regular expression denoting the set of functions whose name | |
1923 | (followed by a space and the source position) has a prefix matching | |
1924 | the regexp. The `and`, `not`, and `or` expressions denote | |
1925 | intersection, complement, and union, respectively. The `pred` and | |
1926 | `succ` expressions add the set of immediate predecessors or successors | |
1927 | to their argument, respectively. The `from` and `to` expressions | |
1928 | denote the set of nodes that have paths from or to the set of nodes | |
1929 | denoted by their arguments, respectively. Finally, `thresh`, | |
1930 | `thresh-gc`, and `thresh-stack` denote the set of nodes whose | |
1931 | percentage of ticks, gc ticks, or stack ticks, respectively, is | |
1932 | greater than or equal to the real number _x_. | |
1933 | ||
1934 | For example, if you want to see the entire call graph for a program, | |
1935 | you can use `-keep all` (this is the default). If you want to see | |
1936 | all nodes reachable from function `foo` in your program, you would | |
1937 | use `-keep '(from "foo")'`. Or, if you want to see all the | |
1938 | functions defined in subdirectory `bar` of your project that used | |
1939 | at least 1% of the ticks, you would use | |
1940 | ---- | |
1941 | -keep '(and ".*/bar/" (thresh 1.0))' | |
1942 | ---- | |
1943 | To see all functions with ticks above a threshold, you can also use | |
1944 | `-thresh x`, which is an abbreviation for `-keep '(thresh x)'`. You | |
1945 | cannot use multiple `-keep` arguments or both `-keep` and `-thresh`. | |
1946 | When you use `-keep` to display a subset of the functions, `mlprof` | |
1947 | will add dashed edges to the call graph to indicate a path in the | |
1948 | original call graph from one function to another. | |
1949 | ||
1950 | When compiling with `-profile-stack true`, you can use `mlprof -gray | |
1951 | true` to make the nodes darker or lighter depending on whether their | |
1952 | stack percentage is higher or lower. | |
1953 | ||
1954 | MLton's optimizer may duplicate source functions for any of a number | |
1955 | of reasons (functor duplication, monomorphisation, polyvariance, | |
1956 | inlining). By default, all duplicates of a function are treated as | |
1957 | one. If you would like to treat the duplicates separately, you can | |
1958 | use ++mlprof -split __regexp__++, which will cause all duplicates of | |
1959 | functions whose name has a prefix matching the regular expression to | |
1960 | be treated separately. This can be especially useful for higher-order | |
1961 | utility functions like `General.o`. | |
1962 | ||
1963 | == Caveats == | |
1964 | ||
1965 | Technically speaking, `mlprof` produces a call-stack graph rather than | |
1966 | a call graph, because it describes the set of possible call stacks. | |
1967 | The difference is in how tail calls are displayed. For example, if `f` | |
1968 | nontail calls `g` and `g` tail calls `h`, then the call-stack graph | |
1969 | has edges from `f` to `g` and `f` to `h`, while the call graph has | |
1970 | edges from `f` to `g` and `g` to `h`. That is, a tail call from `g` | |
1971 | to `h` removes `g` from the call stack and replaces it with `h`. | |
1972 | ||
1973 | <<< | |
1974 | ||
1975 | :mlton-guide-page: CallingFromCToSML | |
1976 | [[CallingFromCToSML]] | |
1977 | CallingFromCToSML | |
1978 | ================= | |
1979 | ||
1980 | MLton's <:ForeignFunctionInterface:> allows programs to _export_ SML | |
1981 | functions to be called from C. Suppose you would like to export from SML | |
1982 | a function of type `real * char -> int` as the C function `foo`. | |
1983 | MLton extends the syntax of SML to allow expressions like the | |
1984 | following: | |
1985 | ---- | |
1986 | _export "foo": (real * char -> int) -> unit; | |
1987 | ---- | |
1988 | The above expression exports a C function named `foo`, with | |
1989 | prototype | |
1990 | [source,c] | |
1991 | ---- | |
1992 | Int32 foo (Real64 x0, Char x1); | |
1993 | ---- | |
1994 | The `_export` expression denotes a function of type | |
1995 | `(real * char -> int) -> unit` that when called with a function | |
1996 | `f`, arranges for the exported `foo` function to call `f` | |
1997 | when `foo` is called. So, for example, the following exports and | |
1998 | defines `foo`. | |
1999 | [source,sml] | |
2000 | ---- | |
2001 | val e = _export "foo": (real * char -> int) -> unit; | |
2002 | val _ = e (fn (x, c) => 13 + Real.floor x + Char.ord c) | |
2003 | ---- | |
2004 | ||
2005 | The general form of an `_export` expression is | |
2006 | ---- | |
2007 | _export "C function name" attr... : cFuncTy -> unit; | |
2008 | ---- | |
2009 | The type and the semicolon are not optional. As with `_import`, a | |
2010 | sequence of attributes may follow the function name. | |
2011 | ||
2012 | MLton's `-export-header` option generates a C header file with | |
2013 | prototypes for all of the functions exported from SML. Include this | |
2014 | header file in your C files to type check calls to functions exported | |
2015 | from SML. This header file includes ++typedef++s for the | |
2016 | <:ForeignFunctionInterfaceTypes: types that can be passed between SML and C>. | |
2017 | ||
2018 | ||
2019 | == Example == | |
2020 | ||
2021 | Suppose that `export.sml` is | |
2022 | ||
2023 | [source,sml] | |
2024 | ---- | |
2025 | sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/export.sml] | |
2026 | ---- | |
2027 | ||
2028 | Note that the `reentrant` attribute is used for `_import`-ing the | |
2029 | C functions that will call the `_export`-ed SML functions. | |
2030 | ||
2031 | Create the header file with `-export-header`. | |
2032 | ---- | |
2033 | % mlton -default-ann 'allowFFI true' \ | |
2034 | -export-header export.h \ | |
2035 | -stop tc \ | |
2036 | export.sml | |
2037 | ---- | |
2038 | ||
2039 | `export.h` now contains the following C prototypes. | |
2040 | ---- | |
2041 | Int8 f (Int32 x0, Real64 x1, Int8 x2); | |
2042 | Pointer f2 (Word8 x0); | |
2043 | void f3 (); | |
2044 | void f4 (Int32 x0); | |
2045 | extern Int32 zzz; | |
2046 | ---- | |
2047 | ||
2048 | Use `export.h` in a C program, `ffi-export.c`, as follows. | |
2049 | ||
2050 | [source,c] | |
2051 | ---- | |
2052 | sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-export.c] | |
2053 | ---- | |
2054 | ||
2055 | Compile `ffi-export.c` and `export.sml`. | |
2056 | ---- | |
2057 | % gcc -c ffi-export.c | |
2058 | % mlton -default-ann 'allowFFI true' \ | |
2059 | export.sml ffi-export.o | |
2060 | ---- | |
2061 | ||
2062 | Finally, run `export`. | |
2063 | ---- | |
2064 | % ./export | |
2065 | g starting | |
2066 | ... | |
2067 | g4 (0) | |
2068 | success | |
2069 | ---- | |
2070 | ||
2071 | ||
2072 | == Download == | |
2073 | * <!RawGitFile(mlton,master,doc/examples/ffi/export.sml)> | |
2074 | * <!RawGitFile(mlton,master,doc/examples/ffi/ffi-export.c)> | |
2075 | ||
2076 | <<< | |
2077 | ||
2078 | :mlton-guide-page: CallingFromSMLToC | |
2079 | [[CallingFromSMLToC]] | |
2080 | CallingFromSMLToC | |
2081 | ================= | |
2082 | ||
2083 | MLton's <:ForeignFunctionInterface:> allows an SML program to _import_ | |
2084 | C functions. Suppose you would like to import from C a function with | |
2085 | the following prototype: | |
2086 | [source,c] | |
2087 | ---- | |
2088 | int foo (double d, char c); | |
2089 | ---- | |
2090 | MLton extends the syntax of SML to allow expressions like the following: | |
2091 | ---- | |
2092 | _import "foo": real * char -> int; | |
2093 | ---- | |
2094 | This expression denotes a function of type `real * char -> int` whose | |
2095 | behavior is implemented by calling the C function whose name is `foo`. | |
2096 | Thinking in terms of C, imagine that there are C variables `d` of type | |
2097 | `double`, `c` of type `unsigned char`, and `i` of type `int`. Then, | |
2098 | the C statement `i = foo (d, c)` is executed and `i` is returned. | |
2099 | ||
2100 | The general form of an `_import` expression is: | |
2101 | ---- | |
2102 | _import "C function name" attr... : cFuncTy; | |
2103 | ---- | |
2104 | The type and the semicolon are not optional. | |
2105 | ||
2106 | The function name is followed by a (possibly empty) sequence of | |
2107 | attributes, analogous to C `__attribute__` specifiers. | |
2108 | ||
2109 | ||
2110 | == Example == | |
2111 | ||
2112 | `import.sml` imports the C function `ffi` and the C variable `FFI_INT` | |
2113 | as follows. | |
2114 | ||
2115 | [source,sml] | |
2116 | ---- | |
2117 | sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/import.sml] | |
2118 | ---- | |
2119 | ||
2120 | `ffi-import.c` is | |
2121 | ||
2122 | [source,c] | |
2123 | ---- | |
2124 | sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/ffi-import.c] | |
2125 | ---- | |
2126 | ||
2127 | Compile and run the program. | |
2128 | ---- | |
2129 | % mlton -default-ann 'allowFFI true' -export-header export.h import.sml ffi-import.c | |
2130 | % ./import | |
2131 | 13 | |
2132 | success | |
2133 | ---- | |
2134 | ||
2135 | ||
2136 | == Download == | |
2137 | * <!RawGitFile(mlton,master,doc/examples/ffi/import.sml)> | |
2138 | * <!RawGitFile(mlton,master,doc/examples/ffi/ffi-import.c)> | |
2139 | ||
2140 | ||
2141 | == Next Steps == | |
2142 | ||
2143 | * <:CallingFromSMLToCFunctionPointer:> | |
2144 | ||
2145 | <<< | |
2146 | ||
2147 | :mlton-guide-page: CallingFromSMLToCFunctionPointer | |
2148 | [[CallingFromSMLToCFunctionPointer]] | |
2149 | CallingFromSMLToCFunctionPointer | |
2150 | ================================ | |
2151 | ||
2152 | Just as MLton can <:CallingFromSMLToC:directly call C functions>, it | |
2153 | is possible to make indirect function calls; that is, function calls | |
2154 | through a function pointer. MLton extends the syntax of SML to allow | |
2155 | expressions like the following: | |
2156 | ---- | |
2157 | _import * : MLton.Pointer.t -> real * char -> int; | |
2158 | ---- | |
2159 | This expression denotes a function of type | |
2160 | [source,sml] | |
2161 | ---- | |
2162 | MLton.Pointer.t -> real * char -> int | |
2163 | ---- | |
2164 | whose behavior is implemented by calling the C function at the address | |
2165 | denoted by the `MLton.Pointer.t` argument, and supplying the C | |
2166 | function two arguments, a `double` and a `char`. The C function | |
2167 | pointer may be obtained, for example, by the dynamic linking loader | |
2168 | (`dlopen`, `dlsym`, ...). | |
2169 | ||
2170 | The general form of an indirect `_import` expression is: | |
2171 | ---- | |
2172 | _import * attr... : cPtrTy -> cFuncTy; | |
2173 | ---- | |
2174 | The type and the semicolon are not optional. | |
2175 | ||
2176 | ||
2177 | == Example == | |
2178 | ||
2179 | This example uses `dlopen` and friends (imported using normal | |
2180 | `_import`) to dynamically load the math library (`libm`) and call the | |
2181 | `cos` function. Suppose `iimport.sml` contains the following. | |
2182 | ||
2183 | [source,sml] | |
2184 | ---- | |
2185 | sys::[./bin/InclGitFile.py mlton master doc/examples/ffi/iimport.sml] | |
2186 | ---- | |
2187 | ||
2188 | Compile and run `iimport.sml`. | |
2189 | ---- | |
2190 | % mlton -default-ann 'allowFFI true' \ | |
2191 | -target-link-opt linux -ldl \ | |
2192 | -target-link-opt solaris -ldl \ | |
2193 | iimport.sml | |
2194 | % ./iimport | |
2195 | Math.cos(2.0) = ~0.416146836547 | |
2196 | libm.so::cos(2.0) = ~0.416146836547 | |
2197 | ---- | |
2198 | ||
2199 | This example also shows the `-target-link-opt` option, which uses the | |
2200 | switch when linking only when on the specified platform. Compile with | |
2201 | `-verbose 1` to see in more detail what's being passed to `gcc`. | |
2202 | ||
2203 | ||
2204 | == Download == | |
2205 | * <!RawGitFile(mlton,master,doc/examples/ffi/iimport.sml)> | |
2206 | ||
2207 | <<< | |
2208 | ||
2209 | :mlton-guide-page: CCodegen | |
2210 | [[CCodegen]] | |
2211 | CCodegen | |
2212 | ======== | |
2213 | ||
2214 | The <:CCodegen:> is a <:Codegen:code generator> that translates the | |
2215 | <:Machine:> <:IntermediateLanguage:> to C, which is further optimized | |
2216 | and compiled to native object code by `gcc` (or another C compiler). | |
2217 | ||
2218 | == Implementation == | |
2219 | ||
2220 | * <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.sig)> | |
2221 | * <!ViewGitFile(mlton,master,mlton/codegen/c-codegen/c-codegen.fun)> | |
2222 | ||
2223 | == Details and Notes == | |
2224 | ||
2225 | The <:CCodegen:> is the original <:Codegen:code generator> for MLton. | |
2226 | ||
2227 | <<< | |
2228 | ||
2229 | :mlton-guide-page: Changelog | |
2230 | [[Changelog]] | |
2231 | Changelog | |
2232 | ========= | |
2233 | ||
2234 | * <!ViewGitFile(mlton,master,CHANGELOG.adoc)> | |
2235 | ||
2236 | ---- | |
2237 | sys::[./bin/InclGitFile.py mlton master CHANGELOG.adoc] | |
2238 | ---- | |
2239 | ||
2240 | <<< | |
2241 | ||
2242 | :mlton-guide-page: ChrisClearwater | |
2243 | [[ChrisClearwater]] | |
2244 | ChrisClearwater | |
2245 | =============== | |
2246 | ||
2247 | {empty} | |
2248 | ||
2249 | <<< | |
2250 | ||
2251 | :mlton-guide-page: Chunkify | |
2252 | [[Chunkify]] | |
2253 | Chunkify | |
2254 | ======== | |
2255 | ||
2256 | <:Chunkify:> is an analysis pass for the <:RSSA:> | |
2257 | <:IntermediateLanguage:>, invoked from <:ToMachine:>. | |
2258 | ||
2259 | == Description == | |
2260 | ||
2261 | It partitions all the labels (function and block) in an <:RSSA:> | |
2262 | program into disjoint sets, referred to as chunks. | |
2263 | ||
2264 | == Implementation == | |
2265 | ||
2266 | * <!ViewGitFile(mlton,master,mlton/backend/chunkify.sig)> | |
2267 | * <!ViewGitFile(mlton,master,mlton/backend/chunkify.fun)> | |
2268 | ||
2269 | == Details and Notes == | |
2270 | ||
2271 | Breaking large <:RSSA:> functions into chunks is necessary for | |
2272 | reasonable compile times with the <:CCodegen:> and the <:LLVMCodegen:>. | |
2273 | ||
2274 | <<< | |
2275 | ||
2276 | :mlton-guide-page: CKitLibrary | |
2277 | [[CKitLibrary]] | |
2278 | CKitLibrary | |
2279 | =========== | |
2280 | ||
2281 | The http://www.smlnj.org/doc/ckit[ckit Library] is a C front end | |
2282 | written in SML that translates C source code (after preprocessing) | |
2283 | into abstract syntax represented as a set of SML datatypes. The ckit | |
2284 | Library is distributed with SML/NJ. Due to differences between SML/NJ | |
2285 | and MLton, this library will not work out of the box with MLton. | |
2286 | ||
2287 | As of 20180119, MLton includes a port of the ckit Library synchronized | |
2288 | with SML/NJ version 110.82. | |
2289 | ||
2290 | == Usage == | |
2291 | ||
2292 | * You can import the ckit Library into an MLB file with: | |
2293 | + | |
2294 | [options="header"] | |
2295 | |===== | |
2296 | |MLB file|Description | |
2297 | |`$(SML_LIB)/ckit-lib/ckit-lib.mlb`| | |
2298 | |===== | |
2299 | ||
2300 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
2301 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
2302 | following map is included by default: | |
2303 | + | |
2304 | ---- | |
2305 | # ckit Library | |
2306 | $ckit-lib.cm $(SML_LIB)/ckit-lib | |
2307 | $ckit-lib.cm/ckit-lib.cm $(SML_LIB)/ckit-lib/ckit-lib.mlb | |
2308 | ---- | |
2309 | + | |
2310 | This will automatically convert a `$/ckit-lib.cm` import in an input | |
2311 | `.cm` file into a `$(SML_LIB)/ckit-lib/ckit-lib.mlb` import in the | |
2312 | output `.mlb` file. | |
2313 | ||
2314 | == Details == | |
2315 | ||
2316 | The following changes were made to the ckit Library, in addition to | |
2317 | deriving the `.mlb` file from the `.cm` file: | |
2318 | ||
2319 | * `ast/pp/pp-ast-adornment-sig.sml` (modified): Rewrote use of `signature` in `local`. | |
2320 | * `ast/pp/pp-ast-ext-sig.sml` (modified): Rewrote use of `signature` in `local`. | |
2321 | * `ast/type-util-sig.sml` (modified): Rewrote use of `signature` in `local`. | |
2322 | * `parser/parse-tree-sig.sml` (modified): Rewrote use of (sequential) `withtype` in signature. | |
2323 | * `parser/parse-tree.sml` (modified): Rewrote use of (sequential) `withtype`. | |
2324 | ||
2325 | == Patch == | |
2326 | ||
2327 | * <!ViewGitFile(mlton,master,lib/ckit-lib/ckit.patch)> | |
2328 | ||
2329 | <<< | |
2330 | ||
2331 | :mlton-guide-page: Closure | |
2332 | [[Closure]] | |
2333 | Closure | |
2334 | ======= | |
2335 | ||
2336 | A closure is a data structure that is the run-time representation of a | |
2337 | function. | |
2338 | ||
2339 | ||
2340 | == Typical Implementation == | |
2341 | ||
2342 | In a typical implementation, a closure consists of a _code pointer_ | |
2343 | (indicating what the function does) and an _environment_ containing | |
2344 | the values of the free variables of the function. For example, in the | |
2345 | expression | |
2346 | ||
2347 | [source,sml] | |
2348 | ---- | |
2349 | let | |
2350 | val x = 5 | |
2351 | in | |
2352 | fn y => x + y | |
2353 | end | |
2354 | ---- | |
2355 | ||
2356 | the closure for `fn y => x + y` contains a pointer to a piece of code | |
2357 | that knows to take its argument and add the value of `x` to it, plus | |
2358 | the environment recording the value of `x` as `5`. | |
2359 | ||
2360 | To call a function, the code pointer is extracted and jumped to, | |
2361 | passing the environment and the argument in some agreed-upon location. | |
2362 | ||
2363 | ||
2364 | == MLton's Implementation == | |
2365 | ||
2366 | MLton does not implement closures traditionally. Instead, based on | |
2367 | whole-program higher-order control-flow analysis, MLton represents a | |
2368 | function as an element of a sum type, where the variant indicates | |
2369 | which function it is and carries the free variables as arguments. See | |
2370 | <:ClosureConvert:> and <!Cite(CejtinEtAl00)> for details. | |
2371 | ||
2372 | <<< | |
2373 | ||
2374 | :mlton-guide-page: ClosureConvert | |
2375 | [[ClosureConvert]] | |
2376 | ClosureConvert | |
2377 | ============== | |
2378 | ||
2379 | <:ClosureConvert:> is a translation pass from the <:SXML:> | |
2380 | <:IntermediateLanguage:> to the <:SSA:> <:IntermediateLanguage:>. | |
2381 | ||
2382 | == Description == | |
2383 | ||
2384 | It converts an <:SXML:> program into an <:SSA:> program. | |
2385 | ||
2386 | <:Defunctionalization:> is the technique used to eliminate | |
2387 | <:Closure:>s (see <!Cite(CejtinEtAl00)>). | |
2388 | ||
2389 | Uses <:Globalize:> and <:LambdaFree:> analyses. | |
2390 | ||
2391 | == Implementation == | |
2392 | ||
2393 | * <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.sig)> | |
2394 | * <!ViewGitFile(mlton,master,mlton/closure-convert/closure-convert.fun)> | |
2395 | ||
2396 | == Details and Notes == | |
2397 | ||
2398 | {empty} | |
2399 | ||
2400 | <<< | |
2401 | ||
2402 | :mlton-guide-page: CMinusMinus | |
2403 | [[CMinusMinus]] | |
2404 | CMinusMinus | |
2405 | =========== | |
2406 | ||
2407 | http://cminusminus.org[C--] is a portable assembly language intended | |
2408 | to make it easy for compilers for different high-level languages to | |
2409 | share the same backend. An experimental version of MLton has been | |
2410 | made to generate C--. | |
2411 | ||
2412 | * http://www.mlton.org/pipermail/mlton/2005-March/026850.html | |
2413 | ||
2414 | == Also see == | |
2415 | ||
2416 | * <:LLVM:> | |
2417 | ||
2418 | <<< | |
2419 | ||
2420 | :mlton-guide-page: Codegen | |
2421 | [[Codegen]] | |
2422 | Codegen | |
2423 | ======= | |
2424 | ||
2425 | <:Codegen:> is a translation pass from the <:Machine:> | |
2426 | <:IntermediateLanguage:> to one or more compilation units that can be | |
2427 | compiled to native object code by an external tool. | |
2428 | ||
2429 | == Implementation == | |
2430 | ||
2431 | * <!ViewGitDir(mlton,master,mlton/codegen)> | |
2432 | ||
2433 | == Details and Notes == | |
2434 | ||
2435 | The following <:Codegen:codegens> are implemented: | |
2436 | ||
2437 | * <:AMD64Codegen:> | |
2438 | * <:CCodegen:> | |
2439 | * <:LLVMCodegen:> | |
2440 | * <:X86Codegen:> | |
2441 | ||
2442 | <<< | |
2443 | ||
2444 | :mlton-guide-page: CombineConversions | |
2445 | [[CombineConversions]] | |
2446 | CombineConversions | |
2447 | ================== | |
2448 | ||
2449 | <:CombineConversions:> is an optimization pass for the <:SSA:> | |
2450 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
2451 | ||
2452 | == Description == | |
2453 | ||
2454 | This pass looks for and simplifies nested calls to (signed) | |
2455 | extension/truncation. | |
2456 | ||
2457 | == Implementation == | |
2458 | ||
2459 | * <!ViewGitFile(mlton,master,mlton/ssa/combine-conversions.fun)> | |
2460 | ||
2461 | == Details and Notes == | |
2462 | ||
2463 | It processes each block in dfs order (visiting definitions before uses): | |
2464 | ||
2465 | * If the statement is not a `PrimApp` with `Word_extdToWord`, skip it. | |
2466 | * After processing a conversion, it tags the `Var` for subsequent use. | |
2467 | * When inspecting a conversion, check if the `Var` operand is also the | |
2468 | result of a conversion. If it is, try to combine the two operations. | |
2469 | Repeatedly simplify until hitting either a non-conversion `Var` or a | |
2470 | case where the conversion cannot be simplified. | |
2471 | ||
2472 | The optimization rules are very simple: | |
2473 | ---- | |
2474 | x1 = ... | |
2475 | x2 = Word_extdToWord (W1, W2, {signed=s1}) x1 | |
2476 | x3 = Word_extdToWord (W2, W3, {signed=s2}) x2 | |
2477 | ---- | |
2478 | ||
2479 | * If `W1 = W2`, then there are no conversions before `x1`. | |
2480 | + | |
2481 | This is guaranteed because `W2 = W3` will always trigger optimization. | |
2482 | ||
2483 | * Case `W1 <= W3 <= W2`: | |
2484 | + | |
2485 | ---- | |
2486 | x3 = Word_extdToWord (W1, W3, {signed=s1}) x1 | |
2487 | ---- | |
2488 | ||
2489 | * Case `W1 < W2 < W3 AND ((NOT s1) OR s2)`: | |
2490 | + | |
2491 | ---- | |
2492 | x3 = Word_extdToWord (W1, W3, {signed=s1}) x1 | |
2493 | ---- | |
2494 | ||
2495 | * Case `W1 = W2 < W3`: | |
2496 | + | |
2497 | unoptimized, because there are no conversions past `W1` and `x2 = x1` | |
2498 | ||
2499 | * Case `W3 <= W2 <= W1 OR W3 <= W1 <= W2`: | |
2500 | + | |
2501 | ---- | |
2502 | x3 = Word_extdToWord (W1, W3, {signed=s1}) x1 | |
2503 | ---- | |
2504 | + | |
2505 | because `W3 <= W1 && W3 <= W2`, just clip `x1` | |
2506 | ||
2507 | * Case `W2 < W1 <= W3 OR W2 < W3 <= W1`: | |
2508 | + | |
2509 | unoptimized, because `W2 < W1 && W2 < W3`, has truncation effect | |
2510 | ||
2511 | * Case `W1 < W2 < W3 AND (s1 AND (NOT s2))`: | |
2512 | + | |
2513 | unoptimized, because each conversion affects the result separately | |
2514 | ||
2515 | <<< | |
2516 | ||
2517 | :mlton-guide-page: CommonArg | |
2518 | [[CommonArg]] | |
2519 | CommonArg | |
2520 | ========= | |
2521 | ||
2522 | <:CommonArg:> is an optimization pass for the <:SSA:> | |
2523 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
2524 | ||
2525 | == Description == | |
2526 | ||
2527 | It optimizes instances of `Goto` transfers that pass the same | |
2528 | arguments to the same label; e.g. | |
2529 | ---- | |
2530 | L_1 () | |
2531 | ... | |
2532 | z1 = ? | |
2533 | ... | |
2534 | L_3 (x, y, z1) | |
2535 | L_2 () | |
2536 | ... | |
2537 | z2 = ? | |
2538 | ... | |
2539 | L_3 (x, y, z2) | |
2540 | L_3 (a, b, c) | |
2541 | ... | |
2542 | ---- | |
2543 | ||
2544 | This code can be simplified to: | |
2545 | ---- | |
2546 | L_1 () | |
2547 | ... | |
2548 | z1 = ? | |
2549 | ... | |
2550 | L_3 (z1) | |
2551 | L_2 () | |
2552 | ... | |
2553 | z2 = ? | |
2554 | ... | |
2555 | L_3 (z2) | |
2556 | L_3 (c) | |
2557 | a = x | |
2558 | b = y | |
2559 | ---- | |
2560 | which saves a number of resources: time of setting up the arguments | |
2561 | for the jump to `L_3`, space (either stack or pseudo-registers) for | |
2562 | the arguments of `L_3`, etc. It may also expose some other | |
2563 | optimizations, if more information is known about `x` or `y`. | |
2564 | ||
2565 | == Implementation == | |
2566 | ||
2567 | * <!ViewGitFile(mlton,master,mlton/ssa/common-arg.fun)> | |
2568 | ||
2569 | == Details and Notes == | |
2570 | ||
2571 | Three analyses were originally proposed to drive the optimization | |
2572 | transformation. Only the _Dominator Analysis_ is currently | |
2573 | implemented. (Implementations of the other analyses are available in | |
2574 | the <:Sources:repository history>.) | |
2575 | ||
2576 | === Syntactic Analysis === | |
2577 | ||
2578 | The simplest analysis I could think of maintains | |
2579 | ---- | |
2580 | varInfo: Var.t -> Var.t option list ref | |
2581 | ---- | |
2582 | initialized to `[]`. | |
2583 | ||
2584 | * For each variable `v` bound in a `Statement.t` or in the | |
2585 | `Function.t` args, then `List.push(varInfo v, NONE)`. | |
2586 | * For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the | |
2587 | formals of `L`, then `List.push(varInfo ai, SOME xi)`. | |
2588 | * For each block argument `a` used in an unknown context (e.g., | |
2589 | arguments of blocks used as continuations, handlers, arith success, | |
2590 | runtime return, or case switch labels), then | |
2591 | `List.push(varInfo a, NONE)`. | |
2592 | ||
2593 | Now, any block argument `a` such that `varInfo a = xs`, where all of | |
2594 | the elements of `xs` are equal to `SOME x`, can be optimized by | |
2595 | setting `a = x` at the beginning of the block and dropping the | |
2596 | argument from `Goto` transfers. | |
2597 | ||
2598 | That takes care of the example above. We can clearly do slightly | |
2599 | better, by changing the transformation criteria to the following: any | |
2600 | block argument `a` such that `varInfo a = xs`, where all of the elements | |
2601 | of `xs` are equal to `SOME x` _or_ are equal to `SOME a`, can be | |
2602 | optimized by setting `a = x` at the beginning of the block and | |
2603 | dropping the argument from `Goto` transfers. This optimizes a case | |
2604 | like: | |
2605 | ---- | |
2606 | L_1 () | |
2607 | ... z1 = ? ... | |
2608 | L_3 (x, y, z1) | |
2609 | L_2 () | |
2610 | ... z2 = ? ... | |
2611 | L_3 (x, y, z2) | |
2612 | L_3 (a, b, c) | |
2613 | ... w = ? ... | |
2614 | case w of | |
2615 | true => L_4 | false => L_5 | |
2616 | L_4 () | |
2617 | ... | |
2618 | L_3 (a, b, w) | |
2619 | L_5 () | |
2620 | ... | |
2621 | ---- | |
2622 | where a common argument is passed to a loop (and is invariant through | |
2623 | the loop). Of course, the <:LoopInvariant:> optimization pass would | |
2624 | normally introduce a local loop and essentially reduce this to the | |
2625 | first example, but I have seen this in practice, which suggests that | |
2626 | some optimizations after <:LoopInvariant:> do enough simplifications | |
2627 | to introduce (new) loop invariant arguments. | |
2628 | ||
2629 | === Fixpoint Analysis === | |
2630 | ||
2631 | However, the above analysis and transformation doesn't cover the cases | |
2632 | where eliminating one common argument exposes the opportunity to | |
2633 | eliminate other common arguments. For example: | |
2634 | ---- | |
2635 | L_1 () | |
2636 | ... | |
2637 | L_3 (x) | |
2638 | L_2 () | |
2639 | ... | |
2640 | L_3 (x) | |
2641 | L_3 (a) | |
2642 | ... | |
2643 | L_5 (a) | |
2644 | L_4 () | |
2645 | ... | |
2646 | L_5 (x) | |
2647 | L_5 (b) | |
2648 | ... | |
2649 | ---- | |
2650 | ||
2651 | One pass of analysis and transformation would eliminate the argument | |
2652 | to `L_3` and rewrite the `L_5 (a)` transfer to `L_5 (x)`, thereby | |
2653 | exposing the opportunity to eliminate the common argument to `L_5`. | |
2654 | ||
2655 | The interdependency between the arguments to `L_3` and `L_5` suggests | |
2656 | performing some sort of fixed-point analysis. This analysis is | |
2657 | relatively simple; maintain | |
2658 | ---- | |
2659 | varInfo: Var.t -> VarLattice.t | |
2660 | ---- | |
2661 | {empty}where | |
2662 | ---- | |
2663 | VarLattice.t ~=~ Bot | Point of Var.t | Top | |
2664 | ---- | |
2665 | (but is implemented by the <:FlatLattice:> functor with a `lessThan` | |
2666 | list and `value ref` under the hood), initialized to `Bot`. | |
2667 | ||
2668 | * For each variable `v` bound in a `Statement.t` or in the | |
2669 | `Function.t` args, then `VarLattice.<= (Point v, varInfo v)`. | |
2670 | * For each `L (x1, ..., xn)` transfer where `(a1, ..., an)` are the | |
2671 | formals of `L`, then `VarLattice.<= (varInfo xi, varInfo ai)`. | |
2672 | * For each block argument `a` used in an unknown context, then | |
2673 | `VarLattice.<= (Point a, varInfo a)`. | |
2674 | ||
2675 | Now, any block argument `a` such that `varInfo a = Point x` can be | |
2676 | optimized by setting `a = x` at the beginning of the block and | |
2677 | dropping the argument from `Goto` transfers. | |
2678 | ||
2679 | Now, with the last example, we introduce the ordering constraints: | |
2680 | ---- | |
2681 | varInfo x <= varInfo a | |
2682 | varInfo a <= varInfo b | |
2683 | varInfo x <= varInfo b | |
2684 | ---- | |
2685 | ||
2686 | Assuming that `varInfo x = Point x`, then we get `varInfo a = Point x` | |
2687 | and `varInfo b = Point x`, and we optimize the example as desired. | |
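
As a rough illustration (not the actual <:FlatLattice:> based implementation), the constraint solving described above can be sketched in Standard ML. Strings stand in for `Var.t`, and the names `elt`, `join`, `lhs`, `solve`, and the association-list representation of `varInfo` are assumptions made only for this sketch.

[source,sml]
----
(* Flat lattice over variable names (strings stand in for Var.t). *)
datatype elt = Bot | Point of string | Top

(* Least upper bound of two lattice elements. *)
fun join (Bot, e) = e
  | join (e, Bot) = e
  | join (Top, _) = Top
  | join (_, Top) = Top
  | join (Point x, Point y) = if x = y then Point x else Top

(* A constraint "lhs <= varInfo target". *)
datatype lhs = Const of elt | Info of string
type constr = lhs * string

(* varInfo as an association list; absent variables are Bot. *)
fun lookup (env, v) =
   case List.find (fn (v', _) => v = v') env of
      SOME (_, e) => e
    | NONE => Bot

fun update (env, v, e) =
   (v, e) :: List.filter (fn (v', _) => v <> v') env

(* Apply every constraint once; report whether anything changed. *)
fun step (constrs: constr list, env) =
   List.foldl
   (fn ((l, target), (env, changed)) =>
    let
       val old = lookup (env, target)
       val incoming =
          case l of
             Const e => e
           | Info v => lookup (env, v)
       val joined = join (old, incoming)
    in
       if joined = old
          then (env, changed)
       else (update (env, target, joined), true)
    end)
   (env, false)
   constrs

(* Iterate to the least fixed point; this terminates because each
 * variable can only move up the three-level lattice. *)
fun solve constrs =
   let
      fun loop env =
         case step (constrs, env) of
            (env', true) => loop env'
          | (env', false) => env'
   in
      loop []
   end

(* Constraints for the L_1 ... L_5 example, assuming x is bound
 * by a statement, so that Point x <= varInfo x. *)
val constrs =
   [(Const (Point "x"), "x"),
    (Info "x", "a"),
    (Info "a", "b"),
    (Info "x", "b")]
val env = solve constrs
val infoA = lookup (env, "a")
val infoB = lookup (env, "b")
----

With these constraints, `infoA` and `infoB` are both `Point "x"`, matching the result stated above.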
2688 | ||
2689 | But, that is a rather weak assumption. It's quite possible for | |
2690 | `varInfo x = Top`. For example, consider: | |
2691 | ---- | |
2692 | G_1 () | |
2693 | ... n = 1 ... | |
2694 | L_0 (n) | |
2695 | G_2 () | |
2696 | ... m = 2 ... | |
2697 | L_0 (m) | |
2698 | L_0 (x) | |
2699 | ... | |
2700 | L_1 () | |
2701 | ... | |
2702 | L_3 (x) | |
2703 | L_2 () | |
2704 | ... | |
2705 | L_3 (x) | |
2706 | L_3 (a) | |
2707 | ... | |
2708 | L_5 (a) | |
2709 | L_4 () | |
2710 | ... | |
2711 | L_5 (x) | |
2712 | L_5 (b) | |
2713 | ... | |
2714 | ---- | |
2715 | ||
2716 | Now `varInfo x = varInfo a = varInfo b = Top`. What went wrong here? | |
2717 | When `varInfo x` went to `Top`, it got propagated all the way through | |
2718 | to `a` and `b`, and prevented the elimination of any common arguments. | |
2719 | What we'd like to do instead is when `varInfo x` goes to `Top`, | |
2720 | propagate on `Point x` -- we have no hope of eliminating `x`, but if | |
2721 | we hold `x` constant, then we have a chance of eliminating arguments | |
2722 | for which `x` is passed as an actual. | |
2723 | ||
2724 | === Dominator Analysis === | |
2725 | ||
2726 | Does anyone see where this is going yet? Pausing for a little | |
2727 | thought, <:MatthewFluet:> realized that he had once before tried | |
2728 | proposing this kind of "fix" to a fixed-point analysis -- when we were | |
2729 | first investigating the <:Contify:> optimization in light of John | |
2730 | Reppy's CWS paper. Of course, that "fix" failed because it defined a | |
2731 | non-monotonic function and one couldn't take the fixed point. But, | |
2732 | <:StephenWeeks:> suggested a dominator based approach, and we were | |
2733 | able to show that, indeed, the dominator analysis subsumed both the | |
2734 | previous call based analysis and the cont based analysis. And, a | |
2735 | moment's reflection reveals further parallels: when | |
2736 | `varInfo: Var.t -> Var.t option list ref`, we have something analogous | |
2737 | to the call analysis, and when `varInfo: Var.t -> VarLattice.t`, we | |
2738 | have something analogous to the cont analysis. Maybe there is | |
2739 | something analogous to the dominator approach (and therefore superior | |
2740 | to the previous analyses). | |
2741 | ||
2742 | And this turns out to be the case. Construct the graph `G` as follows: | |
2743 | ---- | |
2744 | nodes(G) = {Root} U Var.t | |
2745 | edges(G) = {Root -> v | v bound in a Statement.t or | |
2746 | in the Function.t args} U | |
2747 | {xi -> ai | L(x1, ..., xn) transfer where (a1, ..., an) | |
2748 | are the formals of L} U | |
2749 | {Root -> a | a is a block argument used in an unknown context} | |
2750 | ---- | |
2751 | ||
2752 | Let `idom(x)` be the immediate dominator of `x` in `G` with root | |
2753 | `Root`. Now, any block argument `a` such that `idom(a) = x <> Root` can | |
2754 | be optimized by setting `a = x` at the beginning of the block and | |
2755 | dropping the argument from `Goto` transfers. | |
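
To make the construction concrete, here is a small sketch in Standard ML that builds `G` for the `G_1`/`G_2`/`L_0` example and computes immediate dominators. It uses the simple iterative set-based formulation of dominators rather than whatever algorithm the real pass uses. Strings stand in for `Var.t`, and the names `dominators` and `idom` and the list-as-set representation are assumptions made only for this sketch.

[source,sml]
----
(* Nodes are strings; "Root" is the distinguished root. *)
type node = string
type edge = node * node

fun mem (x: node, xs) = List.exists (fn y => x = y) xs
fun inter (xs, ys) = List.filter (fn x => mem (x, ys)) xs
fun add (x, xs) = if mem (x, xs) then xs else x :: xs

(* Dominator sets by the iterative data-flow formulation:
 *   dom(Root) = {Root}
 *   dom(n)    = {n} U (intersection of dom(p) over predecessors p)
 * Assumes every node is reachable from Root and nodes has no
 * duplicates.
 *)
fun dominators (nodes: node list, edges: edge list) =
   let
      fun preds n =
         List.map (fn (s, _) => s) (List.filter (fn (_, t) => t = n) edges)
      fun get (doms, n) =
         case List.find (fn (m, _) => m = n) doms of
            SOME (_, d) => d
          | NONE => nodes
      fun step doms =
         List.map
         (fn (n, old) =>
          if n = "Root"
             then (n, old)
          else
             let
                val meet =
                   List.foldl (fn (p, acc) => inter (acc, get (doms, p)))
                   nodes (preds n)
             in
                (n, add (n, meet))
             end)
         doms
      (* Sets only shrink, so equal sizes mean nothing changed. *)
      fun loop doms =
         let
            val doms' = step doms
            val same =
               ListPair.all
               (fn ((_, d), (_, d')) => List.length d = List.length d')
               (doms, doms')
         in
            if same then doms else loop doms'
         end
   in
      loop (List.map (fn n => (n, if n = "Root" then ["Root"] else nodes))
            nodes)
   end

(* The immediate dominator is the strict dominator with the
 * largest dominator set of its own. *)
fun idom (doms, n) =
   let
      fun get m =
         case List.find (fn (m', _) => m = m') doms of
            SOME (_, d) => d
          | NONE => []
      val strict = List.filter (fn d => d <> n) (get n)
      fun deeper (d, best) =
         if List.length (get d) > List.length (get best) then d else best
   in
      case strict of
         [] => NONE
       | d :: ds => SOME (List.foldl deeper d ds)
   end

(* Graph for the example: n and m are bound by statements,
 * x is the formal of L_0, a of L_3, and b of L_5. *)
val nodes = ["Root", "n", "m", "x", "a", "b"]
val edges =
   [("Root", "n"), ("Root", "m"),
    ("n", "x"), ("m", "x"),
    ("x", "a"), ("a", "b"), ("x", "b")]
val doms = dominators (nodes, edges)
val idomX = idom (doms, "x")
val idomA = idom (doms, "a")
val idomB = idom (doms, "b")
----

Here `idomX` is `SOME "Root"`, so `x` must remain an argument, while `idomA` and `idomB` are both `SOME "x"`, so `a` and `b` can be replaced by `x`. This is exactly the case where the fixpoint analysis above gave up with `Top`.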
2756 | ||
2757 | Furthermore, experimental evidence suggests (and we are confident that | |
2758 | a formal presentation could prove) that the dominator analysis | |
2759 | subsumes the "syntactic" and "fixpoint" based analyses in this context | |
2760 | as well and that the dominator analysis gets "everything" in one go. | |
2761 | ||
2762 | === Final Thoughts === | |
2763 | ||
2764 | I must admit, I was rather surprised at this progression and final | |
2765 | result. At the outset, I never would have thought of a connection | |
2766 | between <:Contify:> and <:CommonArg:> optimizations. They would seem | |
2767 | to be two completely different optimizations. Although, this may not | |
2768 | really be the case. As one of the reviewers of the ICFP paper said: | |
2769 | ____ | |
2770 | I understand that such a form of CPS might be convenient in some | |
2771 | cases, but when we're talking about analyzing code to detect that some | |
2772 | continuation is constant, I think it makes a lot more sense to make | |
2773 | all the continuation arguments completely explicit. | |
2774 | ||
2775 | I believe that making all the continuation arguments explicit will | |
2776 | show that the optimization can be generalized to eliminating constant | |
2777 | arguments, whether continuations or not. | |
2778 | ____ | |
2779 | ||
2780 | What I think the common argument optimization shows is that the | |
2781 | dominator analysis does slightly better than the reviewer puts it: we | |
2782 | find more than just constant continuations, we find common | |
2783 | continuations. And I think this is further justified by the fact that | |
2784 | I have observed <:CommonArg:> eliminate some `env_X` arguments, which | |
2785 | would appear to correspond to determining that while the closure being | |
2786 | executed isn't constant it is at least the same as the closure being | |
2787 | passed elsewhere. | |
2788 | ||
2789 | At first, I was curious whether or not we had missed a bigger picture | |
2790 | with the dominator analysis. When we wrote the contification paper, I | |
2791 | assumed that the dominator analysis was a specialized solution to a | |
2792 | specialized problem; we never suggested that it was a technique suited | |
2793 | to a larger class of analyses. After initially finding a connection | |
2794 | between <:Contify:> and <:CommonArg:> (and thinking that the only | |
2795 | connection was the technique), I wondered if the dominator technique | |
2796 | really was applicable to a larger class of analyses. That is still a | |
2797 | question, but after writing up the above, I'm suspecting that the | |
2798 | "real story" is that the dominator analysis is a solution to the | |
2799 | common argument optimization, and that the <:Contify:> optimization is | |
2800 | specializing <:CommonArg:> to the case of continuation arguments (with | |
2801 | a different transformation at the end). Note that a whole-program, | |
2802 | inter-procedural common argument analysis doesn't really make sense | |
2803 | in our <:SSA:> <:IntermediateLanguage:>, because the only way of | |
2804 | passing values between functions is as arguments (unless the common | |
2805 | argument is also a constant argument, in which case | |
2806 | <:ConstantPropagation:> could lift it to a global). The | |
2807 | inter-procedural <:Contify:> optimization works out because there we | |
2808 | move the function to the argument. | |
2809 | ||
2810 | Anyways, it's still unclear to me whether or not the dominator based | |
2811 | approach solves other kinds of problems. | |
2812 | ||
2813 | === Phase Ordering === | |
2814 | ||
2815 | On the downside, the optimization doesn't have a huge impact on | |
2816 | runtime, although it does predictably save some code size. I stuck | |
2817 | it in the optimization sequence after <:Flatten:> and (the third round | |
2818 | of) <:LocalFlatten:>, since it seems to me that we could have cases | |
2819 | where some components of a tuple used as an argument are common, but | |
2820 | the whole tuple isn't. I think it makes sense to add it after | |
2821 | <:IntroduceLoops:> and <:LoopInvariant:> (even though <:CommonArg:> | |
2822 | gets some things that <:LoopInvariant:> gets, it doesn't get all of | |
2823 | them). I also think that it makes sense to add it before | |
2824 | <:CommonSubexp:>, since identifying variables could expose more common | |
2825 | subexpressions. I would think a similar thought applies to | |
2826 | <:RedundantTests:>. | |
2827 | ||
2828 | <<< | |
2829 | ||
2830 | :mlton-guide-page: CommonBlock | |
2831 | [[CommonBlock]] | |
2832 | CommonBlock | |
2833 | =========== | |
2834 | ||
2835 | <:CommonBlock:> is an optimization pass for the <:SSA:> | |
2836 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
2837 | ||
2838 | == Description == | |
2839 | ||
2840 | It eliminates equivalent blocks in an <:SSA:> function. The | |
2841 | equivalence criterion requires blocks to have no arguments or | |
2842 | statements and transfer via `Raise`, `Return`, or `Goto` of a single | |
2843 | global variable. | |
2844 | ||
2845 | == Implementation == | |
2846 | ||
2847 | * <!ViewGitFile(mlton,master,mlton/ssa/common-block.fun)> | |
2848 | ||
2849 | == Details and Notes == | |
2850 | ||
2851 | * Rewrites | |
2852 | + | |
2853 | ---- | |
2854 | L_X () | |
2855 | raise (global_Y) | |
2856 | ---- | |
2857 | + | |
2858 | to | |
2859 | + | |
2860 | ---- | |
2861 | L_X () | |
2862 | L_Y' () | |
2863 | ---- | |
2864 | + | |
2865 | and adds | |
2866 | + | |
2867 | ---- | |
2868 | L_Y' () | |
2869 | raise (global_Y) | |
2870 | ---- | |
2871 | + | |
2872 | to the <:SSA:> function. | |
2873 | ||
2874 | * Rewrites | |
2875 | + | |
2876 | ---- | |
2877 | L_X () | |
2878 | return (global_Y) | |
2879 | ---- | |
2880 | + | |
2881 | to | |
2882 | + | |
2883 | ---- | |
2884 | L_X () | |
2885 | L_Y' () | |
2886 | ---- | |
2887 | + | |
2888 | and adds | |
2889 | + | |
2890 | ---- | |
2891 | L_Y' () | |
2892 | return (global_Y) | |
2893 | ---- | |
2894 | + | |
2895 | to the <:SSA:> function. | |
2896 | ||
2897 | * Rewrites | |
2898 | + | |
2899 | ---- | |
2900 | L_X () | |
2901 | L_Z (global_Y) | |
2902 | ---- | |
2903 | + | |
2904 | to | |
2905 | + | |
2906 | ---- | |
2907 | L_X () | |
2908 | L_Y' () | |
2909 | ---- | |
2910 | + | |
2911 | and adds | |
2912 | + | |
2913 | ---- | |
2914 | L_Y' () | |
2915 | L_Z (global_Y) | |
2916 | ---- | |
2917 | + | |
2918 | to the <:SSA:> function. | |
2919 | ||
2920 | The <:Shrink:> pass rewrites all uses of `L_X` to `L_Y'` and drops `L_X`. | |
2921 | ||
2922 | For example, all uncaught `Overflow` exceptions in an <:SSA:> function | |
2923 | share the same raising block. | |
2924 | ||
2925 | <<< | |
2926 | ||
2927 | :mlton-guide-page: CommonSubexp | |
2928 | [[CommonSubexp]] | |
2929 | CommonSubexp | |
2930 | ============ | |
2931 | ||
2932 | <:CommonSubexp:> is an optimization pass for the <:SSA:> | |
2933 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
2934 | ||
2935 | == Description == | |
2936 | ||
2937 | It eliminates instances of common subexpressions. | |
2938 | ||
2939 | == Implementation == | |
2940 | ||
2941 | * <!ViewGitFile(mlton,master,mlton/ssa/common-subexp.fun)> | |
2942 | ||
2943 | == Details and Notes == | |
2944 | ||
2945 | In addition to getting the usual sorts of things like | |
2946 | ||
2947 | * {empty} | |
2948 | + | |
2949 | ---- | |
2950 | (w + 0wx1) + (w + 0wx1) | |
2951 | ---- | |
2952 | + | |
2953 | rewritten to | |
2954 | + | |
2955 | ---- | |
2956 | let val w' = w + 0wx1 in w' + w' end | |
2957 | ---- | |
2958 | ||
2959 | it also gets things like | |
2960 | ||
2961 | * {empty} | |
2962 | + | |
2963 | ---- | |
2964 | val a = Array_uninit n | |
2965 | val b = Array_length a | |
2966 | ---- | |
2967 | + | |
2968 | rewritten to | |
2969 | + | |
2970 | ---- | |
2971 | val a = Array_uninit n | |
2972 | val b = n | |
2973 | ---- | |
2974 | ||
2975 | `Arith` transfers are handled specially. The _result_ of an `Arith` | |
2976 | transfer can be used in _common_ `Arith` transfers that it dominates: | |
2977 | ||
2978 | * {empty} | |
2979 | + | |
2980 | ---- | |
2981 | val l = (n + m) + (n + m) | |
2982 | ||
2983 | val k = (l + n) + ((l + m) handle Overflow => ((l + m) | |
2984 | handle Overflow => l + n)) | |
2985 | ---- | |
2986 | + | |
2987 | is rewritten so that `(n + m)` is computed exactly once, as are | |
2988 | `(l + n)` and `(l + m)`. | |
2989 | ||
2990 | <<< | |
2991 | ||
2992 | :mlton-guide-page: CompilationManager | |
2993 | [[CompilationManager]] | |
2994 | CompilationManager | |
2995 | ================== | |
2996 | ||
2997 | The http://www.smlnj.org/doc/CM/index.html[Compilation Manager] (CM) is SML/NJ's mechanism for supporting programming-in-the-very-large. | |
2998 | ||
2999 | == Porting SML/NJ CM files to MLton == | |
3000 | ||
3001 | To help in porting CM files to MLton, the MLton source distribution | |
3002 | includes the sources for a utility, `cm2mlb`, that will print an | |
3003 | <:MLBasis: ML Basis> file with essentially the same semantics as the | |
3004 | CM file -- handling the full syntax of CM supported by your installed | |
3005 | SML/NJ version and correctly handling export filters. When `cm2mlb` | |
3006 | encounters a `.cm` import, it attempts to convert it to a | |
3007 | corresponding `.mlb` import. CM anchored paths are translated to | |
3008 | paths according to a default configuration file | |
3009 | (<!ViewGitFile(mlton,master,util/cm2mlb/cm2mlb-map)>). For example, | |
3010 | the default configuration includes | |
3011 | ---- | |
3012 | # Standard ML Basis Library | |
3013 | $SMLNJ-BASIS $(SML_LIB)/basis | |
3014 | $basis.cm $(SML_LIB)/basis | |
3015 | $basis.cm/basis.cm $(SML_LIB)/basis/basis.mlb | |
3016 | ---- | |
3017 | to ensure that a `$/basis.cm` import is translated to a | |
3018 | `$(SML_LIB)/basis/basis.mlb` import. See `util/cm2mlb` for details. | |
3019 | Building `cm2mlb` requires that you have already installed a recent | |
3020 | version of SML/NJ. | |
3021 | ||
3022 | <<< | |
3023 | ||
3024 | :mlton-guide-page: CompilerOverview | |
3025 | [[CompilerOverview]] | |
3026 | CompilerOverview | |
3027 | ================ | |
3028 | ||
3029 | The following table shows the overall structure of the compiler. | |
3030 | <:IntermediateLanguage:>s are shown in the center column. The names | |
3031 | of compiler passes are listed in the left and right columns. | |
3032 | ||
3033 | [align="center",width="50%",cols="^,^,^"] | |
3034 | |==== | |
3035 | 3+^| *Compiler Overview* | |
3036 | | _Translation Passes_ | _<:IntermediateLanguage:>_ | _Optimization Passes_ | |
3037 | | | Source | | |
3038 | | <:FrontEnd:> | | | |
3039 | | | <:AST:> | | |
3040 | | <:Elaborate:> | | | |
3041 | | | <:CoreML:> | <:CoreMLSimplify:> | |
3042 | | <:Defunctorize:> | | | |
3043 | | | <:XML:> | <:XMLSimplify:> | |
3044 | | <:Monomorphise:> | | | |
3045 | | | <:SXML:> | <:SXMLSimplify:> | |
3046 | | <:ClosureConvert:> | | | |
3047 | | | <:SSA:> | <:SSASimplify:> | |
3048 | | <:ToSSA2:> | | | |
3049 | | | <:SSA2:> | <:SSA2Simplify:> | |
3050 | | <:ToRSSA:> | | | |
3051 | | | <:RSSA:> | <:RSSASimplify:> | |
3052 | | <:ToMachine:> | | | |
3053 | | | <:Machine:> | | |
3054 | | <:Codegen:> | | | |
3055 | |==== | |
3056 | ||
3057 | The `Compile` functor (<!ViewGitFile(mlton,master,mlton/main/compile.sig)>, | |
3058 | <!ViewGitFile(mlton,master,mlton/main/compile.fun)>) controls the | |
3059 | high-level view of the compiler passes, from <:FrontEnd:> to code | |
3060 | generation. | |
3061 | ||
3062 | <<< | |
3063 | ||
3064 | :mlton-guide-page: CompilerPassTemplate | |
3065 | [[CompilerPassTemplate]] | |
3066 | CompilerPassTemplate | |
3067 | ==================== | |
3068 | ||
3069 | An analysis pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>. | |
3070 | An implementation pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>. | |
3071 | An optimization pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZSimplify:>. | |
3072 | A rewrite pass for the <:ZZZ:> <:IntermediateLanguage:>, invoked from <:ZZZOtherPass:>. | |
3073 | A translation pass from the <:ZZA:> <:IntermediateLanguage:> to the <:ZZB:> <:IntermediateLanguage:>. | |
3074 | ||
3075 | == Description == | |
3076 | ||
3077 | A short description of the pass. | |
3078 | ||
3079 | == Implementation == | |
3080 | ||
3081 | * <!ViewGitFile(mlton,master,mlton/ZZZ.fun)> | |
3082 | ||
3083 | == Details and Notes == | |
3084 | ||
3085 | Relevant details and notes. | |
3086 | ||
3087 | <<< | |
3088 | ||
3089 | :mlton-guide-page: CompileTimeOptions | |
3090 | [[CompileTimeOptions]] | |
3091 | CompileTimeOptions | |
3092 | ================== | |
3093 | ||
3094 | MLton's compile-time options control the name of the output file, the | |
3095 | verbosity of compile-time messages, and whether or not certain | |
3096 | optimizations are performed. They also can specify which intermediate | |
3097 | files are saved and can stop the compilation process early, at some | |
3098 | intermediate pass, in which case compilation can be resumed by passing | |
3099 | the generated files to MLton. MLton uses the input file suffix to | |
3100 | determine the type of input program. The possibilities are `.c`, | |
3101 | `.mlb`, `.o`, `.s`, and `.sml`. | |
3102 | ||
3103 | With no arguments, MLton prints the version number and exits. For a | |
3104 | usage message, run MLton with an invalid switch, e.g. `mlton -z`. In | |
3105 | the explanation below and in the usage message, for flags that take a | |
3106 | number of choices (e.g. `{true|false}`), the first value listed is the | |
3107 | default. | |
3108 | ||
3109 | ||
3110 | == Options == | |
3111 | ||
3112 | * ++-align __n__++ | |
3113 | + | |
3114 | Align objects in memory by the specified alignment (+4+ or +8+). | |
3115 | The default varies depending on architecture. | |
3116 | ||
3117 | * ++-as-opt __option__++ | |
3118 | + | |
3119 | Pass _option_ to `gcc` when compiling assembler code. If you wish to | |
3120 | pass an option to the assembler, you must use `gcc`'s `-Wa,` syntax. | |
3121 | ||
3122 | * ++-cc-opt __option__++ | |
3123 | + | |
3124 | Pass _option_ to `gcc` when compiling C code. | |
3125 | ||
3126 | * ++-codegen {native|amd64|c|llvm|x86}++ | |
3127 | + | |
3128 | Generate native object code via amd64 assembly, C code, LLVM code, or | |
3129 | x86 assembly. With `-codegen native` (`-codegen amd64` or | |
3130 | `-codegen x86`), MLton typically compiles more quickly and generates | |
3131 | better code. | |
3132 | ||
3133 | * ++-const __name__ __value__++ | |
3134 | + | |
3135 | Set the value of a compile-time constant. Here is a list of | |
3136 | available constants, their default values, and what they control. | |
3137 | + | |
3138 | ** ++Exn.keepHistory {false|true}++ | |
3139 | + | |
3140 | Enable `MLton.Exn.history`. See <:MLtonExn:> for details. There is a | |
3141 | performance cost to setting this to `true`, both in memory usage of | |
3142 | exceptions and in run time, because of additional work that must be | |
3143 | performed at each exception construction, raise, and handle. | |
3144 | ||
3145 | * ++-default-ann __ann__++ | |
3146 | + | |
3147 | Specify default <:MLBasisAnnotations:ML Basis annotations>. For | |
3148 | example, `-default-ann 'warnUnused true'` causes unused variable | |
3149 | warnings to be enabled by default. A default is overridden by the | |
3150 | corresponding annotation in an ML Basis file. | |
3151 | ||
3152 | * ++-default-type __type__++ | |
3153 | + | |
3154 | Specify the default binding for a primitive type. For example, | |
3155 | `-default-type word64` causes the top-level type `word` and the | |
3156 | top-level structure `Word` in the <:BasisLibrary:Basis Library> to be | |
3157 | equal to `Word64.word` and `Word64:WORD`, respectively. Similarly, | |
3158 | `-default-type intinf` causes the top-level type `int` and the | |
3159 | top-level structure `Int` in the <:BasisLibrary:Basis Library> to be | |
3160 | equal to `IntInf.int` and `IntInf:INTEGER`, respectively. | |
3161 | ||
3162 | * ++-disable-ann __ann__++ | |
3163 | + | |
3164 | Ignore the specified <:MLBasisAnnotations:ML Basis annotation> in | |
3165 | every ML Basis file. For example, to see _all_ match and unused | |
3166 | warnings, compile with | |
3167 | + | |
3168 | ---- | |
3169 | -default-ann 'warnUnused true' | |
3170 | -disable-ann forceUsed | |
3171 | -disable-ann nonexhaustiveMatch | |
3172 | -disable-ann redundantMatch | |
3173 | -disable-ann warnUnused | |
3174 | ---- | |
3175 | ||
3176 | * ++-export-header __file__++ | |
3177 | + | |
3178 | Write C prototypes to _file_ for all of the functions in the program | |
3179 | <:CallingFromCToSML:exported from SML to C>. | |
3180 | ||
3181 | * ++-ieee-fp {false|true}++ | |
3182 | + | |
3183 | Cause the x86 native code generator to be pedantic about following the | |
3184 | IEEE floating point standard. By default, it is not, because of the | |
3185 | performance cost. This only has an effect with `-codegen x86`. | |
3186 | ||
3187 | * ++-inline __n__++ | |
3188 | + | |
3189 | Set the inlining threshold used in the optimizer. The threshold is an | |
3190 | approximate measure of code size of a procedure. The default is | |
3191 | `320`. | |
3192 | ||
3193 | * ++-keep {g|o}++ | |
3194 | + | |
3195 | Save intermediate files. If no `-keep` argument is given, then only | |
3196 | the output file is saved. | |
3197 | + | |
3198 | [cols="^25%,<75%"] | |
3199 | |==== | |
3200 | | `g` | generated `.c` and `.s` files passed to `gcc` and generated `.ll` files passed to `llvm-as` | |
3201 | | `o` | object (`.o`) files | |
3202 | |==== | |
3203 | ||
3204 | * ++-link-opt __option__++ | |
3205 | + | |
3206 | Pass _option_ to `gcc` when linking. You can use this to specify | |
3207 | library search paths, e.g. `-link-opt -Lpath`, and libraries to link | |
3208 | with, e.g., `-link-opt -lfoo`, or even both at the same time, | |
3209 | e.g. `-link-opt '-Lpath -lfoo'`. If you wish to pass an option to the | |
3210 | linker, you must use `gcc`'s `-Wl,` syntax, e.g., | |
3211 | `-link-opt '-Wl,--export-dynamic'`. | |
3212 | ||
3213 | * ++-llvm-as-opt __option__++ | |
3214 | + | |
3215 | Pass _option_ to `llvm-as` when assembling (`.ll` to `.bc`) LLVM code. | |
3216 | ||
3217 | * ++-llvm-llc-opt __option__++ | |
3218 | + | |
3219 | Pass _option_ to `llc` when compiling (`.bc` to `.o`) LLVM code. | |
3220 | ||
3221 | * ++-llvm-opt-opt __option__++ | |
3222 | + | |
3223 | Pass _option_ to `opt` when optimizing (`.bc` to `.bc`) LLVM code. | |
3224 | ||
3225 | * ++-mlb-path-map __file__++ | |
3226 | + | |
3227 | Use _file_ as an <:MLBasisPathMap:ML Basis path map> to define | |
3228 | additional MLB path variables. Multiple uses of `-mlb-path-map` and | |
3229 | `-mlb-path-var` are allowed, with variable definitions in later path | |
3230 | maps taking precedence over earlier ones. | |
3231 | ||
3232 | * ++-mlb-path-var __name__ __value__++ | |
3233 | + | |
3234 | Define an additional MLB path variable. Multiple uses of | |
3235 | `-mlb-path-map` and `-mlb-path-var` are allowed, with variable | |
3236 | definitions in later path maps taking precedence over earlier ones. | |
3237 | ||
3238 | * ++-output __file__++ | |
3239 | + | |
3240 | Specify the name of the final output file. The default name is the | |
3241 | input file name with its suffix removed and an appropriate, possibly | |
3242 | empty, suffix added. | |
3243 | ||
3244 | * ++-profile {no|alloc|count|time}++ | |
3245 | + | |
3246 | Produce an executable that gathers <:Profiling: profiling> data. When | |
3247 | such an executable is run, it produces an `mlmon.out` file. | |
3248 | ||
3249 | * ++-profile-branch {false|true}++ | |
3250 | + | |
3251 | If `true`, the profiler will separately gather profiling data for each | |
3252 | branch of a function definition, `case` expression, and `if` | |
3253 | expression. | |
3254 | ||
3255 | * ++-profile-stack {false|true}++ | |
3256 | + | |
3257 | If `true`, the executable will gather profiling data for all functions | |
3258 | on the stack, not just the currently executing function. See | |
3259 | <:ProfilingTheStack:>. | |
3260 | ||
3261 | * ++-profile-val {false|true}++ | |
3262 | + | |
3263 | If `true`, the profiler will separately gather profiling data for each | |
3264 | (expansive) `val` declaration. | |
3265 | ||
3266 | * ++-runtime __arg__++ | |
3267 | + | |
3268 | Pass argument to the runtime system via `@MLton`. See | |
3269 | <:RunTimeOptions:>. The argument will be processed before other | |
3270 | `@MLton` command line switches. Multiple uses of `-runtime` are | |
3271 | allowed, and will pass all the arguments in order. If the same | |
3272 | runtime switch occurs more than once, then the last setting will take | |
3273 | effect. There is no need to supply the leading `@MLton` or the | |
3274 | trailing `--`; these will be supplied automatically. | |
3275 | + | |
3276 | An argument to `-runtime` may contain spaces, which will cause the | |
3277 | argument to be treated as a sequence of words by the runtime. For | |
3278 | example, the command line: | |
3279 | + | |
3280 | ---- | |
3281 | mlton -runtime 'ram-slop 0.4' foo.sml | |
3282 | ---- | |
3283 | + | |
3284 | will cause `foo` to run as if it had been called like: | |
3285 | + | |
3286 | ---- | |
3287 | foo @MLton ram-slop 0.4 -- | |
3288 | ---- | |
3289 | + | |
3290 | An executable created with `-runtime stop` doesn't process any | |
3291 | `@MLton` arguments. This is useful to create an executable, e.g., | |
3292 | `echo`, that must treat `@MLton` like any other command-line argument. | |
3293 | + | |
3294 | ---- | |
3295 | % mlton -runtime stop echo.sml | |
3296 | % echo @MLton -- | |
3297 | @MLton -- | |
3298 | ---- | |
3299 | ||
3300 | * ++-show-basis __file__++ | |
3301 | + | |
3302 | Pretty print to _file_ the basis defined by the input program. See | |
3303 | <:ShowBasis:>. | |
3304 | ||
3305 | * ++-show-def-use __file__++ | |
3306 | + | |
3307 | Output def-use information to _file_. Each identifier that is defined | |
3308 | appears on a line, followed on subsequent lines by the position of | |
3309 | each use. | |
3310 | ||
3311 | * ++-stop {f|g|o|tc}++ | |
3312 | + | |
3313 | Specify when to stop. | |
3314 | + | |
3315 | [cols="^25%,<75%"] | |
3316 | |==== | |
3317 | | `f` | list of files on stdout (only makes sense when input is `foo.mlb`) | |
3318 | | `g` | generated `.c` and `.s` files | |
3319 | | `o` | object (`.o`) files | |
3320 | | `tc` | after type checking | |
3321 | |==== | |
3322 | + | |
3323 | If you compile with `-stop g` or `-stop o`, you can resume compilation | |
3324 | by running MLton on the generated `.c` and `.s` or `.o` files. | |
3325 | ||
3326 | * ++-target {self|__...__}++ | |
3327 | + | |
3328 | Generate an executable that runs on the specified platform. The | |
3329 | default is `self`, which means to compile for the machine that MLton | |
3330 | is running on. To use any other target, you must first install a | |
3331 | <:CrossCompiling: cross compiler>. | |
3332 | ||
3333 | * ++-target-as-opt __target__ __option__++ | |
3334 | + | |
3335 | Like `-as-opt`, this passes _option_ to `gcc` when compiling | |
3336 | assembler code, except it only passes _option_ when the target | |
3337 | architecture, operating system, or arch-os pair is _target_. | |
3338 | ||
3339 | * ++-target-cc-opt __target__ __option__++ | |
3340 | + | |
3341 | Like `-cc-opt`, this passes _option_ to `gcc` when compiling C code, | |
3342 | except it only passes _option_ when the target architecture, operating | |
3343 | system, or arch-os pair is _target_. | |
3344 | ||
3345 | * ++-target-link-opt __target__ __option__++ | |
3346 | + | |
3347 | Like `-link-opt`, this passes _option_ to `gcc` when linking, except | |
3348 | it only passes _option_ when the target architecture, operating | |
3349 | system, or arch-os pair is _target_. | |
3350 | ||
3351 | * ++-verbose {0|1|2|3}++ | |
3352 | + | |
3353 | How verbose to be about what passes are running. The default is `0`. | |
3354 | + | |
3355 | [cols="^25%,<75%"] | |
3356 | |==== | |
3357 | | `0` | silent | |
3358 | | `1` | calls to compiler, assembler, and linker | |
3359 | | `2` | 1, plus intermediate compiler passes | |
3360 | | `3` | 2, plus some data structure sizes | |
3361 | |==== | |
3362 | ||
3363 | <<< | |
3364 | ||
3365 | :mlton-guide-page: CompilingWithSMLNJ | |
3366 | [[CompilingWithSMLNJ]] | |
3367 | CompilingWithSMLNJ | |
3368 | ================== | |
3369 | ||
3370 | You can compile MLton with <:SMLNJ:SML/NJ>; however, the resulting | |
3371 | compiler will run much more slowly than MLton compiled by itself. We | |
3372 | don't recommend using SML/NJ as a means of | |
3373 | <:PortingMLton:porting MLton> to a new platform or bootstrapping on a | |
3374 | new platform. | |
3375 | ||
3376 | If you do want to build MLton with SML/NJ, it is best to have a binary | |
3377 | MLton package installed. If you don't, here are some issues you may | |
3378 | encounter when you run `make smlnj-mlton`. | |
3379 | ||
3380 | You will get (many copies of) the error messages: | |
3381 | ||
3382 | ---- | |
3383 | /bin/sh: mlton: command not found | |
3384 | ---- | |
3385 | ||
3386 | and | |
3387 | ||
3388 | ---- | |
3389 | make[2]: mlton: Command not found | |
3390 | ---- | |
3391 | ||
3392 | The `Makefile` calls `mlton` to determine dependencies, and can | |
3393 | proceed in spite of this error. | |
3394 | ||
3395 | If you don't have an `mllex` executable, you will get the error | |
3396 | message: | |
3397 | ||
3398 | ---- | |
3399 | mllex: Command not found | |
3400 | ---- | |
3401 | ||
3402 | Building MLton requires `mllex` and `mlyacc` executables, which are | |
3403 | distributed with a binary package of MLton. The easiest solution is | |
3404 | to copy the front-end lexer/parser files from a different machine | |
3405 | (`ml.grm.sml`, `ml.grm.sig`, `ml.lex.sml`, `mlb.grm.sig`, | |
3406 | `mlb.grm.sml`). | |
3407 | ||
3408 | <<< | |
3409 | ||
3410 | :mlton-guide-page: ConcurrentML | |
3411 | [[ConcurrentML]] | |
3412 | ConcurrentML | |
3413 | ============ | |
3414 | ||
3415 | http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency | |
3416 | library based on synchronous message passing. MLton has an initial | |
3417 | port of CML from SML/NJ, but is missing a thread-safe wrapper around | |
3418 | the Basis Library and event-based equivalents to `IO` and `OS` | |
3419 | functions. | |
3420 | ||
3421 | All of the core CML functionality is present. | |
3422 | ||
3423 | [source,sml] | |
3424 | ---- | |
3425 | structure CML: CML | |
3426 | structure SyncVar: SYNC_VAR | |
3427 | structure Mailbox: MAILBOX | |
3428 | structure Multicast: MULTICAST | |
3429 | structure SimpleRPC: SIMPLE_RPC | |
3430 | structure RunCML: RUN_CML | |
3431 | ---- | |
3432 | ||
3433 | The `RUN_CML` signature is minimal. | |
3434 | ||
3435 | [source,sml] | |
3436 | ---- | |
3437 | signature RUN_CML = | |
3438 | sig | |
3439 | val isRunning: unit -> bool | |
3440 | val doit: (unit -> unit) * Time.time option -> OS.Process.status | |
3441 | val shutdown: OS.Process.status -> 'a | |
3442 | end | |
3443 | ---- | |
3444 | ||
3445 | MLton's `RunCML` structure does not include all of the cleanup and | |
3446 | logging operations of SML/NJ's `RunCML` structure. However, the | |
3447 | implementation does include the `CML.timeOutEvt` and `CML.atTimeEvt` | |
3448 | functions, and a preemptive scheduler that knows to sleep when there | |
3449 | are no ready threads and some threads blocked on time events. | |
3450 | ||
3451 | Because MLton does not wrap the Basis Library for CML, the "right" way | |
3452 | to call a Basis Library function that is stateful is to wrap the call | |
3453 | with `MLton.Thread.atomically`. | |
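
As a minimal sketch of this advice, the following program wraps `print` (which updates the shared `TextIO.stdOut` stream) in `MLton.Thread.atomically`. The helper name `atomicPrint` and the `main` function are hypothetical, and the sketch assumes an MLB file that imports `$(SML_LIB)/basis/basis.mlb`, `$(SML_LIB)/basis/mlton.mlb`, and `$(SML_LIB)/cml/cml.mlb`.

[source,sml]
----
(* Hypothetical helper: run a stateful Basis Library call (here,
 * print) as an atomic section so that the CML scheduler cannot
 * preempt it part way through. *)
fun atomicPrint (s: string): unit =
   MLton.Thread.atomically (fn () => print s)

(* A CML program with two threads communicating over a channel. *)
fun main () =
   let
      val ch: string CML.chan = CML.channel ()
      val _ = CML.spawn (fn () => CML.send (ch, "hello from CML\n"))
   in
      atomicPrint (CML.recv ch)
   end

(* Start the CML scheduler; NONE leaves the time quantum unspecified. *)
val _ = RunCML.doit (main, NONE)
----

Keep the wrapped function short: while it runs, no other CML thread can be scheduled.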
3454 | ||
3455 | == Usage == | |
3456 | ||
3457 | * You can import the CML Library into an MLB file with: | |
3458 | + | |
3459 | [options="header"] | |
3460 | |==== | |
3461 | |MLB file|Description | |
3462 | |`$(SML_LIB)/cml/cml.mlb`| | |
3463 | |===== | |
3464 | ||
3465 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
3466 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
3467 | following map is included by default: | |
3468 | + | |
3469 | ---- | |
3470 | # CML Library | |
3471 | $cml $(SML_LIB)/cml | |
3472 | $cml/cml.cm $(SML_LIB)/cml/cml.mlb | |
3473 | ---- | |
3474 | + | |
3475 | This will automatically convert a `$cml/cml.cm` import in an input `.cm` file into a `$(SML_LIB)/cml/cml.mlb` import in the output `.mlb` file. | |
3476 | ||
3477 | == Also see == | |
3478 | ||
3479 | * <:ConcurrentMLImplementation:> | |
3480 | * <:eXene:> | |
3481 | ||
3482 | <<< | |
3483 | ||
3484 | :mlton-guide-page: ConcurrentMLImplementation | |
3485 | [[ConcurrentMLImplementation]] | |
3486 | ConcurrentMLImplementation | |
3487 | ========================== | |
3488 | ||
3489 | Here are some notes on MLton's implementation of <:ConcurrentML:>. | |
3490 | ||
3491 | Concurrent ML was originally implemented for SML/NJ. It was ported to | |
3492 | MLton in the summer of 2004. The main difference between the | |
3493 | implementations is that SML/NJ uses continuations to implement CML | |
3494 | threads, while MLton uses its underlying <:MLtonThread:thread> | |
3495 | package. Presently, MLton's threads are a little more heavyweight | |
3496 | than SML/NJ's continuations, but it's pretty clear that there is some | |
3497 | fat there that could be trimmed. | |
3498 | ||
3499 | The implementation of CML in SML/NJ is built upon the first-class | |
3500 | continuations of the `SMLofNJ.Cont` module. | |
3501 | [source,sml] | |
3502 | ---- | |
3503 | type 'a cont | |
3504 | val callcc: ('a cont -> 'a) -> 'a | |
3505 | val isolate: ('a -> unit) -> 'a cont | |
3506 | val throw: 'a cont -> 'a -> 'b | |
3507 | ---- | |
3508 | ||
3509 | The implementation of CML in MLton is built upon the first-class | |
3510 | threads of the <:MLtonThread:> module. | |
3511 | [source,sml] | |
3512 | ---- | |
3513 | type 'a t | |
3514 | val new: ('a -> unit) -> 'a t | |
3515 | val prepare: 'a t * 'a -> Runnable.t | |
3516 | val switch: ('a t -> Runnable.t) -> 'a | |
3517 | ---- | |
3518 | ||
3519 | The port is relatively straightforward, because CML always throws to a | |
3520 | continuation at most once. Hence, an "abstract" implementation of | |
3521 | CML could be built upon first-class one-shot continuations, which map | |
3522 | equally well to SML/NJ's continuations and MLton's threads. | |
3523 | ||
3524 | The "essence" of the port is to transform: | |
3525 | ---- | |
3526 | callcc (fn k => ... throw k' v') | |
3527 | ---- | |
3528 | {empty}to | |
3529 | ---- | |
3530 | switch (fn t => ... prepare (t', v')) | |
3531 | ---- | |
3532 | which suffices for the vast majority of the CML implementation. | |
3533 | ||
3534 | There was only one complicated transformation: blocking multiple base | |
3535 | events. In SML/NJ CML, the representation of base events is given by: | |
3536 | [source,sml] | |
3537 | ---- | |
3538 | datatype 'a event_status | |
3539 | = ENABLED of {prio: int, doFn: unit -> 'a} | |
3540 | | BLOCKED of { | |
3541 | transId: trans_id ref, | |
3542 | cleanUp: unit -> unit, | |
3543 | next: unit -> unit | |
3544 | } -> 'a | |
3545 | type 'a base_evt = unit -> 'a event_status | |
3546 | ---- | |
3547 | ||
3548 | When synchronizing on a set of base events, which are all blocked, we | |
3549 | must invoke each `BLOCKED` function with the same `transId` and | |
3550 | `cleanUp` (the `transId` is (checked and) set to `CANCEL` by the | |
3551 | `cleanUp` function, which is invoked by the first enabled event; this | |
3552 | "fizzles" every other event in the synchronization group that later | |
3553 | becomes enabled). However, each `BLOCKED` function is implemented by | |
3554 | a callcc, so that when the event is enabled, it throws back to the | |
3555 | point of synchronization. Hence, the `next` function (which doesn't | |
3556 | return) is invoked by the `BLOCKED` function to escape the callcc and | |
3557 | continue in the thread performing the synchronization. In SML/NJ this | |
3558 | is implemented as follows: | |
3559 | [source,sml] | |
3560 | ---- | |
3561 | fun ext ([], blockFns) = callcc (fn k => let | |
3562 | val throw = throw k | |
3563 | val (transId, setFlg) = mkFlg() | |
3564 | fun log [] = S.atomicDispatch () | |
3565 | | log (blockFn:: r) = | |
3566 | throw (blockFn { | |
3567 | transId = transId, | |
3568 | cleanUp = setFlg, | |
3569 | next = fn () => log r | |
3570 | }) | |
3571 | in | |
3572 | log blockFns; error "[log]" | |
3573 | end) | |
3574 | ---- | |
3575 | (Note that `S.atomicDispatch` invokes the next continuation on the | |
3576 | ready queue.) This doesn't map well to the MLton | |
3577 | thread model. Although it follows the | |
3578 | ---- | |
3579 | callcc (fn k => ... throw k v) | |
3580 | ---- | |
3581 | model, the fact that `blockFn` will also attempt to do | |
3582 | ---- | |
3583 | callcc (fn k' => ... next ()) | |
3584 | ---- | |
3585 | means that the naive transformation will result in nested `switch`-es. | |
3586 | ||
3587 | We need to think a little more about what this code is trying to do. | |
3588 | Essentially, each `blockFn` wants to capture this continuation, hold | |
3589 | on to it until the event is enabled, and continue with `next`; when the | |
3590 | event is enabled, before invoking the continuation and returning to | |
3591 | the synchronization point, the `cleanUp` and other event specific | |
3592 | operations are performed. | |
3593 | ||
3594 | To accomplish the same effect in the MLton thread implementation, we | |
3595 | have the following: | |
3596 | [source,sml] | |
3597 | ---- | |
3598 | datatype 'a status = | |
3599 | ENABLED of {prio: int, doitFn: unit -> 'a} | |
3600 | | BLOCKED of {transId: trans_id, | |
3601 | cleanUp: unit -> unit, | |
3602 | next: unit -> rdy_thread} -> 'a | |
3603 | ||
3604 | type 'a base = unit -> 'a status | |
3605 | ||
3606 | fun ext ([], blockFns): 'a = | |
3607 | S.atomicSwitch | |
3608 | (fn (t: 'a S.thread) => | |
3609 | let | |
3610 | val (transId, cleanUp) = TransID.mkFlg () | |
3611 | fun log blockFns: S.rdy_thread = | |
3612 | case blockFns of | |
3613 | [] => S.next () | |
3614 | | blockFn::blockFns => | |
3615 | (S.prep o S.new) | |
3616 | (fn _ => fn () => | |
3617 | let | |
3618 | val () = S.atomicBegin () | |
3619 | val x = blockFn {transId = transId, | |
3620 | cleanUp = cleanUp, | |
3621 | next = fn () => log blockFns} | |
3622 | in S.switch(fn _ => S.prepVal (t, x)) | |
3623 | end) | |
3624 | in | |
3625 | log blockFns | |
3626 | end) | |
3627 | ---- | |
3628 | ||
3629 | To avoid the nested `switch`-es, I run the `blockFn` in its own | |
3630 | thread, whose only purpose is to return to the synchronization point. | |
3631 | This corresponds to the `throw (blockFn {...})` in the SML/NJ | |
3632 | implementation. I'm worried that this implementation might be a | |
3633 | little expensive, starting a new thread for each blocked event (though | |
3634 | only when a synchronization group has multiple blocked events). | |
3635 | But, I don't see another way of implementing this behavior in the | |
3636 | MLton thread model. | |
3637 | ||
3638 | Note that another way of thinking about what is going on is to | |
3639 | consider each `blockFn` as prepending a different set of actions to | |
3640 | the thread `t`. It might be possible to provide a | |
3641 | `MLton.Thread.unsafePrepend` operation: | |
3642 | [source,sml] | |
3643 | ---- | |
3644 | fun unsafePrepend (T r: 'a t, f: 'b -> 'a): 'b t = | |
3645 | let | |
3646 | val t = | |
3647 | case !r of | |
3648 | Dead => raise Fail "prepend to a Dead thread" | |
3649 | | New g => New (g o f) | |
3650 | | Paused (g, t) => Paused (fn h => g (f o h), t) | |
3651 | in (* r := Dead; *) | |
3652 | T (ref t) | |
3653 | end | |
3654 | ---- | |
3655 | I have commented out the `r := Dead`; omitting it allows multiple | |
3656 | prepends to the same thread (i.e., not destroying the original thread | |
3657 | in the process). Of course, only one of the threads could be run: if | |
3658 | the original thread were in the `Paused` state, then multiple threads | |
3659 | would share the underlying runtime/primitive thread. Now, this | |
3660 | matches the "one-shot" nature of CML continuations/threads, but I'm | |
3661 | not comfortable with extending `MLton.Thread` with such an unsafe | |
3662 | operation. | |
3663 | ||
3664 | Other than this complication with blocking multiple base events, the | |
3665 | port was quite routine. (As a very pleasant surprise, the CML | |
3666 | implementation in SML/NJ doesn't use any SML/NJ-isms.) There is a | |
3667 | slight difference in the way in which critical sections are handled in | |
3668 | SML/NJ and MLton; since `MLton.Thread.switch` _always_ leaves a | |
3669 | critical section, it is sometimes necessary to add additional | |
3670 | `atomicBegin`-s/`atomicEnd`-s to ensure that we remain in a critical | |
3671 | section after a thread switch. | |
3672 | ||
3673 | While looking at virtually every file in the core CML implementation, | |
3674 | I took the liberty of simplifying things where it seemed possible; in | |
3675 | terms of style, the implementation is about half-way between Reppy's | |
3676 | original and MLton's. | |
3677 | ||
3678 | Some changes of note: | |
3679 | ||
3680 | * `util/` contains all pertinent data-structures: (functional and | |
3681 | imperative) queues, (functional) priority queues. Hence, it should be | |
3682 | easier to switch in more efficient or real-time implementations. | |
3683 | ||
3684 | * `core-cml/scheduler.sml`: in both implementations, this is where | |
3685 | most of the interesting action takes place. I've made the connection | |
3686 | between `MLton.Thread.t`-s and `ThreadID.thread_id`-s more abstract | |
3687 | than it is in the SML/NJ implementation, and encapsulated all of the | |
3688 | `MLton.Thread` operations in this module. | |
3689 | ||
3690 | * Eliminated all of the "by hand" inlining. | |
3691 | ||
3692 | ||
3693 | == Future Extensions == | |
3694 | ||
3695 | The CML documentation says the following: | |
3696 | ____ | |
3697 | ||
3698 | ---- | |
3699 | CML.joinEvt: thread_id -> unit event | |
3700 | ---- | |
3701 | ||
3702 | * `joinEvt tid` | |
3703 | + | |
3704 | creates an event value for synchronizing on the termination of the | |
3705 | thread with the ID tid. There are three ways that a thread may | |
3706 | terminate: the function that was passed to spawn (or spawnc) may | |
3707 | return; it may call the exit function, or it may have an uncaught | |
3708 | exception. Note that `joinEvt` does not distinguish between these | |
3709 | cases; it also does not become enabled if the named thread deadlocks | |
3710 | (even if it is garbage collected). | |
3711 | ____ | |
3712 | ||
3713 | I believe that `MLton.Finalizable` might be able to relax that | |
3714 | last restriction. Upon the creation of a `'a Scheduler.thread`, we | |
3715 | could attach a finalizer to the underlying `'a MLton.Thread.t` that | |
3716 | enables the `joinEvt` (in the associated `ThreadID.thread_id`) when | |
3717 | the `'a MLton.Thread.t` becomes unreachable. | |
3718 | ||
3719 | I don't know why CML doesn't have | |
3720 | ---- | |
3721 | CML.kill: thread_id -> unit | |
3722 | ---- | |
3723 | which has a fairly simple implementation -- setting a kill flag in the | |
3724 | `thread_id` and adjusting the scheduler to discard any killed threads | |
3725 | that it takes off the ready queue. The fairness of the scheduler | |
3726 | ensures that a killed thread will eventually be discarded. The | |
3727 | semantics are a little murky for blocked threads that are killed, | |
3728 | though. For example, consider a thread blocked on `SyncVar.mTake mv` | |
3729 | and a thread blocked on `SyncVar.mGet mv`. If the first thread is | |
3730 | killed while blocked, and a third thread does `SyncVar.mPut (mv, x)`, | |
3731 | then we might expect that we'll enable the second thread, and never | |
3732 | the first. But, when only the ready queue is able to discard killed | |
3733 | threads, then the `SyncVar.mPut` could enable the first thread | |
3734 | (putting it on the ready queue, from which it will be discarded) and | |
3735 | leave the second thread blocked. We could solve this by adjusting the | |
3736 | `TransID.trans_id` types and the "cleaner" functions to look for both | |
3737 | canceled transactions and transactions on killed threads. | |
3738 | ||
3739 | John Reppy says that <!Cite(MarlowEtAl01)> and <!Cite(FlattFindler04)> | |
3740 | explain why `CML.kill` would be a bad idea. | |
3741 | ||
3742 | Between `CML.timeOutEvt` and `CML.kill`, one could give an efficient | |
3743 | solution to the recent `comp.lang.ml` post about terminating a | |
3744 | function that doesn't complete in a given time. | |
3745 | [source,sml] | |
3746 | ---- | |
3747 | fun timeOut (f: unit -> 'a, t: Time.time): 'a option = | |
3748 | let | |
3749 | val iv = SyncVar.iVar () | |
3750 | val tid = CML.spawn (fn () => SyncVar.iPut (iv, f ())) | |
3751 | in | |
3752 | CML.select | |
3753 | [CML.wrap (CML.timeOutEvt t, fn () => (CML.kill tid; NONE)), | |
3754 | CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)] | |
3755 | end | |
3756 | ---- | |
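
Because `CML.kill` is hypothetical, the function above does not compile against the current library. A variant that uses only existing operations can be sketched as follows; the name `timeOutNoKill` is an assumption made for illustration. The trade-off is that the worker thread is abandoned rather than terminated, so it keeps running (and consuming resources) until `f` returns on its own.

[source,sml]
----
(* Like timeOut above, but without CML.kill: on timeout the result is NONE
 * and the worker is simply left to finish in the background. *)
fun timeOutNoKill (f: unit -> 'a, t: Time.time): 'a option =
   let
      val iv = SyncVar.iVar ()
      val _ = CML.spawn (fn () => SyncVar.iPut (iv, f ()))
   in
      CML.select
      [CML.wrap (CML.timeOutEvt t, fn () => NONE),
       CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)]
   end
----

Like the rest of CML, this must be called from within a running CML session (that is, inside `RunCML.doit`).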
3757 | ||
3758 | ||
3759 | == Space Safety == | |
3760 | ||
3761 | There are some CML-related posts on the MLton mailing list: | |
3762 | ||
3763 | * http://www.mlton.org/pipermail/mlton/2004-May/ | |
3764 | ||
3765 | that discuss concerns that SML/NJ's implementation is not space | |
3766 | efficient, because multi-shot continuations can be held indefinitely | |
3767 | on event queues. MLton is better off because of the one-shot nature | |
3768 | -- when an event enables a thread, all other copies of the thread | |
3769 | waiting in other event queues get turned into dead threads (of zero | |
3770 | size). | |
3771 | ||
3772 | <<< | |
3773 | ||
3774 | :mlton-guide-page: ConstantPropagation | |
3775 | [[ConstantPropagation]] | |
3776 | ConstantPropagation | |
3777 | =================== | |
3778 | ||
3779 | <:ConstantPropagation:> is an optimization pass for the <:SSA:> | |
3780 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
3781 | ||
3782 | == Description == | |
3783 | ||
3784 | This is whole-program constant propagation, even through data | |
3785 | structures. It also performs globalization of (small) values computed | |
3786 | once. | |
3787 | ||
3788 | Uses <:Multi:>. | |
3789 | ||
3790 | == Implementation == | |
3791 | ||
3792 | * <!ViewGitFile(mlton,master,mlton/ssa/constant-propagation.fun)> | |
3793 | ||
3794 | == Details and Notes == | |
3795 | ||
3796 | {empty} | |
3797 | ||
3798 | <<< | |
3799 | ||
3800 | :mlton-guide-page: Contact | |
3801 | [[Contact]] | |
3802 | Contact | |
3803 | ======= | |
3804 | ||
3805 | == Mailing lists == | |
3806 | ||
3807 | There are three mailing lists available. | |
3808 | ||
3809 | * mailto:MLton-user@mlton.org[`MLton-user@mlton.org`] | |
3810 | + | |
3811 | MLton user community discussion | |
3812 | + | |
3813 | -- | |
3814 | * https://lists.sourceforge.net/lists/listinfo/mlton-user[subscribe] | |
3815 | * https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-user[archive (SourceForge; current)], | |
3816 | http://www.mlton.org/pipermail/mlton-user/[archive (PiperMail; through 201110)] | |
3817 | -- | |
3818 | ||
3819 | * mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`] | |
3820 | + | |
3821 | MLton developer community discussion | |
3822 | + | |
3823 | -- | |
3824 | * https://lists.sourceforge.net/lists/listinfo/mlton-devel[subscribe] | |
3825 | * https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-devel[archive (SourceForge; current)], | |
3826 | http://www.mlton.org/pipermail/mlton-devel/[archive (PiperMail; through 201110)] | |
3827 | -- | |
3828 | ||
3829 | * mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`] | |
3830 | + | |
3831 | MLton code commits | |
3832 | + | |
3833 | -- | |
3834 | * https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe] | |
3835 | * https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive (SourceForge; current)], | |
3836 | http://www.mlton.org/pipermail/mlton-commit/[archive (PiperMail; through 201110)] | |
3837 | -- | |
3838 | ||
3839 | ||
3840 | === Mailing list policies === | |
3841 | ||
3842 | * The mailing lists are unmoderated. However, the mailing lists are | |
3843 | configured to discard all spam, to hold all non-subscriber posts | |
3844 | for moderation, to accept all subscriber posts, and to require admin | |
3845 | approval of subscription requests. Please contact | |
3846 | mailto:matthew.fluet@gmail.com[Matthew Fluet] if it appears that your | |
3847 | messages are being discarded as spam. | |
3848 | ||
3849 | * Large messages (over 256K) should not be sent. Rather, please send | |
3850 | an email containing the discussion text and a link to any large files. | |
3851 | ||
3852 | ///// | |
3853 | * Very active mailto:MLton-devel@mlton.org[`MLton@mlton.org`] list | |
3854 | members who might otherwise be expected to provide a fast response | |
3855 | should send a message when they will be offline for more than a few | |
3856 | days. The convention is to put | |
3857 | "++__userid__ offline until __date__++" in the subject line to make it | |
3858 | easy to scan. | |
3859 | ///// | |
3860 | ||
3861 | * Discussions started on the mailing lists should stay on the mailing | |
3862 | lists. Private replies may be bounced to the mailing list for the | |
3863 | benefit of those following the discussion. | |
3864 | ||
3865 | * Discussions started on | |
3866 | mailto:MLton-user@mlton.org[`MLton-user@mlton.org`] may be migrated to | |
3867 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], particularly | |
3868 | when the discussion shifts from how to use MLton to how to modify | |
3869 | MLton (e.g., to fix a bug identified by the initial discussion). | |
3870 | ||
3871 | == IRC == | |
3872 | ||
3873 | * Some MLton developers and users are in channel `#sml` on http://freenode.net. | |
3874 | ||
3875 | <<< | |
3876 | ||
3877 | :mlton-guide-page: Contify | |
3878 | [[Contify]] | |
3879 | Contify | |
3880 | ======= | |
3881 | ||
3882 | <:Contify:> is an optimization pass for the <:SSA:> | |
3883 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
3884 | ||
3885 | == Description == | |
3886 | ||
3887 | Contification is a compiler optimization that turns a function that | |
3888 | always returns to the same place into a continuation. This exposes | |
3889 | control-flow information that is required by many optimizations, | |
3890 | including traditional loop optimizations. | |
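
As a source-level illustration only (the pass itself operates on <:SSA:> functions, not on SML source), consider the hypothetical function below. Whether contification actually applies to a given program depends on the SSA form produced by earlier passes.

[source,sml]
----
(* `loop` is called from exactly one place outside its own body (the tail
 * call in `sum`), and its recursive calls are tail calls, so every
 * invocation of `loop` returns to the same place. A function of this
 * shape is a candidate for being turned into a continuation (in effect,
 * a loop) inside `sum`. *)
fun sum (xs: int list): int =
   let
      fun loop (ys: int list, acc: int): int =
         case ys of
            [] => acc
          | y :: ys' => loop (ys', acc + y)
   in
      loop (xs, 0)
   end
----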
3891 | ||
3892 | == Implementation == | |
3893 | ||
3894 | * <!ViewGitFile(mlton,master,mlton/ssa/contify.fun)> | |
3895 | ||
3896 | == Details and Notes == | |
3897 | ||
3898 | See <!Cite(FluetWeeks01, Contification Using Dominators)>. The | |
3899 | intermediate language described in that paper has since evolved to the | |
3900 | <:SSA:> <:IntermediateLanguage:>; hence, the complication described in | |
3901 | Section 6.1 is no longer relevant. | |
3902 | ||
3903 | <<< | |
3904 | ||
3905 | :mlton-guide-page: CoreML | |
3906 | [[CoreML]] | |
3907 | CoreML | |
3908 | ====== | |
3909 | ||
3910 | <:CoreML:Core ML> is an <:IntermediateLanguage:>, translated from | |
3911 | <:AST:> by <:Elaborate:>, optimized by <:CoreMLSimplify:>, and | |
3912 | translated by <:Defunctorize:> to <:XML:>. | |
3913 | ||
3914 | == Description == | |
3915 | ||
3916 | <:CoreML:> is polymorphic, higher-order, and has nested patterns. | |
3917 | ||
3918 | == Implementation == | |
3919 | ||
3920 | * <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.sig)> | |
3921 | * <!ViewGitFile(mlton,master,mlton/core-ml/core-ml.fun)> | |
3922 | ||
3923 | == Type Checking == | |
3924 | ||
3925 | The <:CoreML:> <:IntermediateLanguage:> has no independent type | |
3926 | checker. | |
3927 | ||
3928 | == Details and Notes == | |
3929 | ||
3930 | {empty} | |
3931 | ||
3932 | <<< | |
3933 | ||
3934 | :mlton-guide-page: CoreMLSimplify | |
3935 | [[CoreMLSimplify]] | |
3936 | CoreMLSimplify | |
3937 | ============== | |
3938 | ||
3939 | The single optimization pass for the <:CoreML:> | |
3940 | <:IntermediateLanguage:> is controlled by the `Compile` functor | |
3941 | (<!ViewGitFile(mlton,master,mlton/main/compile.fun)>). | |
3942 | ||
3943 | The following optimization pass is implemented: | |
3944 | ||
3945 | * <:DeadCode:> | |
3946 | ||
3947 | <<< | |
3948 | ||
3949 | :mlton-guide-page: Credits | |
3950 | [[Credits]] | |
3951 | Credits | |
3952 | ======= | |
3953 | ||
3954 | MLton was designed and implemented by <:HenryCejtin:>, | |
3955 | <:MatthewFluet:>, <:SureshJagannathan:>, and <:StephenWeeks:>. | |
3956 | ||
3957 | * <:HenryCejtin:> wrote the `IntInf` implementation, the original | |
3958 | profiler, the original man pages, the `.spec` files for the RPMs, | |
3959 | and lots of little hacks to speed stuff up. | |
3960 | ||
3961 | * <:MatthewFluet:> implemented the X86 and AMD64 native code generators, | |
3962 | ported `mlprof` to work with the native code generator, did a lot | |
3963 | of work on the SSA optimizer, both adding new optimizations and | |
3964 | improving or porting existing optimizations, updated the | |
3965 | <:BasisLibrary:Basis Library> implementation, ported | |
3966 | <:ConcurrentML:> and <:MLNLFFI:ML-NLFFI> to MLton, implemented the | |
3967 | <:MLBasis: ML Basis system>, ported MLton to 64-bit platforms, | |
3968 | and currently leads the project. | |
3969 | ||
3970 | * <:SureshJagannathan:> implemented some early inlining and uncurrying | |
3971 | optimizations. | |
3972 | ||
3973 | * <:StephenWeeks:> implemented most of the original version of MLton, and | |
3974 | continues to keep his fingers in most every part. | |
3975 | ||
3976 | Many people have helped us over the years. Here is an alphabetical | |
3977 | list. | |
3978 | ||
3979 | * <:JesperLouisAndersen:> sent several patches to improve the runtime on | |
3980 | FreeBSD and ported MLton to run on NetBSD and OpenBSD. | |
3981 | ||
3982 | * <:JohnnyAndersen:> implemented `BinIO`, modified MLton so it could | |
3983 | cross compile to MinGW, and provided useful discussion about | |
3984 | cross-compilation. | |
3985 | ||
3986 | * Alexander Abushkevich extended support for OpenBSD. | |
3987 | ||
3988 | * Ross Bayer added the `-keep ast` compile-time option and experimented with | |
3989 | porting the build system to CMake. | |
3990 | ||
3991 | * Kevin Bradley added initial support for <:SuccessorML:> features. | |
3992 | ||
3993 | * Bryan Camp added `-disable-pass _regex_` and `-enable-pass _regex_` compile | |
3994 | options to generalize `-drop-pass _regex_` and added `Array_copyArray` and | |
3995 | `Array_copyVector` primitives. | |
3996 | ||
3997 | * Jason Carr added a parser combinator library and a parser for the <:SXML:> | |
3998 | IR, extended compilation to start with a `.sxml` file, and experimented with | |
3999 | alternate control-flow analyses for <:ClosureConvert: closure conversion>. | |
4000 | ||
4001 | * Christopher Cramer contributed support for additional | |
4002 | `Posix.ProcEnv.sysconf` variables, performance improvements for | |
4003 | `String.concatWith`, and Debian packaging. | |
4004 | ||
4005 | * Alain Deutsch and | |
4006 | http://www.polyspace.com/[PolySpace Technologies] provided many bug | |
4007 | fixes and runtime system improvements, code to help the Sparc/Solaris | |
4008 | port, and funded a number of improvements to MLton. | |
4009 | ||
4010 | * Armando Doval updated `mlnlffigen` to warn and skip functions with | |
4011 | `struct`/`union` arguments. | |
4012 | ||
4013 | * Martin Elsman provided helpful discussions in the development of | |
4014 | the <:MLBasis:ML Basis system>. | |
4015 | ||
4016 | * Brent Fulgham ported MLton most of the way to MinGW. | |
4017 | ||
4018 | * <:AdamGoode:> provided a script to build the PDF MLton Guide and | |
4019 | maintains the | |
4020 | https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora] | |
4021 | packages. | |
4022 | ||
4023 | * Simon Helsen provided bug reports, suggestions, and helpful | |
4024 | discussions. | |
4025 | ||
4026 | * Joe Hurd provided useful discussion and feedback on source-level | |
4027 | profiling. | |
4028 | ||
4029 | * <:VesaKarvonen:> contributed `esml-mode.el` and `esml-mlb-mode.el` (see <:Emacs:>), | |
4030 | contributed patches for improving match warnings, | |
4031 | contributed `esml-du-mlton.el` and extended def-use output to include types of variable definitions (see <:EmacsDefUseMode:>), and | |
4032 | improved constant folding of floating-point operations. | |
4033 | ||
4034 | * Richard Kelsey provided helpful discussions. | |
4035 | ||
4036 | * Ville Laurikari ported MLton to IA64/HPUX, HPPA/HPUX, PowerPC/AIX, PowerPC64/AIX. | |
4037 | ||
4038 | * Brian Leibig implemented the <:LLVMCodegen:>. | |
4039 | ||
4040 | * Geoffrey Mainland helped with FreeBSD packaging. | |
4041 | ||
4042 | * Eric McCorkle ported MLton to Intel Mac. | |
4043 | ||
4044 | * <:TomMurphy:> wrote the original version of `MLton.Syslog` as part | |
4045 | of his `mlftpd` project, and has sent many useful bug reports and | |
4046 | suggestions. | |
4047 | ||
4048 | * Michael Neumann helped to patch the runtime to compile under | |
4049 | FreeBSD. | |
4050 | ||
4051 | * Barak Pearlmutter built the original | |
4052 | http://packages.debian.org/mlton[Debian package] for MLton, and | |
4053 | helped us to take over the process. | |
4054 | ||
4055 | * Filip Pizlo ported MLton to (PowerPC) Darwin. | |
4056 | ||
4057 | * Vedant Raiththa extended the <:ForeignFunctionInterface:> with support for | |
4058 | `pure` and `impure` attributes to `_import`. | |
4059 | ||
4060 | * Krishna Ravikumar added initial support for vector expressions and the | |
4061 | `Vector_vector` primitive. | |
4062 | ||
4063 | * John Reppy assisted in porting MLton to Intel Mac. | |
4064 | ||
4065 | * Sam Rushing ported MLton to FreeBSD. | |
4066 | ||
4067 | * Rob Simmons refactored the array and vector implementation in the | |
4068 | <:BasisLibrary:Basis Library> into a primitive implementation (using | |
4069 | `SeqInt.int` for indexing) and a wrapper implementation (using the default | |
4070 | `Int.int` for indexing). | |
4071 | ||
4072 | * Jeffrey Mark Siskind provided helpful discussions and inspiration | |
4073 | with his Stalin Scheme compiler. | |
4074 | ||
4075 | * Matthew Surawski added <:LoopUnroll:> and <:LoopUnswitch:> SSA optimizations. | |
4076 | ||
4077 | * <:WesleyTerpstra:> added support for `MLton.Process.create`, made | |
4078 | a number of contributions to the <:ForeignFunctionInterface:>, | |
4079 | contributed a number of runtime system patches, | |
4080 | added support for compiling to a <:LibrarySupport:C library>, | |
4081 | ported MLton to http://mingw.org[MinGW] and all http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] supported architectures with <:CrossCompiling:cross-compiling> support, | |
4082 | and maintains the http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] and http://mingw.org[MinGW] packages. | |
4083 | ||
4084 | * Maksim Yegorov added rudimentary support for `./configure` and other | |
4085 | improvements to the build system and implemented the <:ShareZeroVec:> SSA | |
4086 | optimization. | |
4087 | ||
4088 | * Luke Ziarek assisted in porting MLton to (PowerPC) Darwin. | |
4089 | ||
4090 | We have also benefited from other software development tools and | |
4091 | used code from other sources. | |
4092 | ||
4093 | * MLton was developed using | |
4094 | <:SMLNJ:Standard ML of New Jersey> and the | |
4095 | <:CompilationManager:Compilation Manager (CM)>. | |
4096 | ||
4097 | * MLton's lexer (`mlton/frontend/ml.lex`), parser | |
4098 | (`mlton/frontend/ml.grm`), and precedence-parser | |
4099 | (`mlton/elaborate/precedence-parse.fun`) are modified versions of | |
4100 | code from SML/NJ. | |
4101 | ||
4102 | * The MLton <:BasisLibrary:Basis Library> implementation of | |
4103 | conversions between binary and decimal representations of reals uses | |
4104 | David Gay's http://www.netlib.org/fp/[gdtoa] library. | |
4105 | ||
4106 | * The MLton <:BasisLibrary:Basis Library> implementation uses | |
4107 | modified versions of portions of the SML/NJ Basis Library | |
4108 | implementation modules `OS.IO`, `Posix.IO`, `Process`, | |
4109 | and `Unix`. | |
4110 | ||
4111 | * The MLton <:BasisLibrary:Basis Library> implementation uses | |
4112 | modified versions of portions of the <:MLKit:ML Kit> Version 4.1.4 | |
4113 | Basis Library implementation modules `Path`, `Time`, and | |
4114 | `Date`. | |
4115 | ||
4116 | * Many of the benchmarks come from the SML/NJ benchmark suite. | |
4117 | ||
4118 | * Many of the regression tests come from the ML Kit Version 4.1.4 | |
4119 | distribution, which borrowed them from the | |
4120 | http://www.dina.kvl.dk/%7Esestoft/mosml.html[Moscow ML] distribution. | |
4121 | ||
4122 | * MLton uses the http://www.gnu.org/software/gmp/gmp.html[GNU multiprecision library] for its implementation of `IntInf`. | |
4123 | ||
4124 | * MLton's implementation of <:MLLex: mllex>, <:MLYacc: mlyacc>, | |
4125 | the <:CKitLibrary:ckit Library>, | |
4126 | the <:MLLPTLibrary:ML-LPT Library>, | |
4127 | the <:MLRISCLibrary:MLRISC Library>, | |
4128 | the <:SMLNJLibrary:SML/NJ Library>, | |
4129 | <:ConcurrentML:Concurrent ML>, | |
4130 | mlnlffigen and <:MLNLFFI:ML-NLFFI> | |
4131 | are modified versions of code from SML/NJ. | |
4132 | ||
4133 | <<< | |
4134 | ||
4135 | :mlton-guide-page: CrossCompiling | |
4136 | [[CrossCompiling]] | |
4137 | CrossCompiling | |
4138 | ============== | |
4139 | ||
4140 | MLton's `-target` flag directs MLton to cross compile an application | |
4141 | for another platform. By default, MLton is only able to compile for | |
4142 | the machine it is running on. In order to use MLton as a cross | |
4143 | compiler, you need to do two things. | |
4144 | ||
4145 | 1. Install the GCC cross-compiler tools on the host so that GCC can | |
4146 | compile to the target. | |
4147 | ||
4148 | 2. Cross compile the MLton runtime system to build the runtime | |
4149 | libraries for the target. | |
4150 | ||
4151 | To make the terminology clear, we refer to the _host_ as the machine | |
4152 | MLton is running on and the _target_ as the machine that MLton is | |
4153 | compiling for. | |
4154 | ||
4155 | To build a GCC cross-compiler toolset on the host, you can use the | |
4156 | script `bin/build-cross-gcc`, available in the MLton sources, as a | |
4157 | template. The value of the `target` variable in that script is | |
4158 | important, since that is what you will pass to MLton's `-target` flag. | |
4159 | Once you have the toolset built, you should be able to test it by | |
4160 | cross compiling a simple hello world program on your host machine. | |
4161 | ---- | |
4162 | % gcc -b i386-pc-cygwin -o hello-world hello-world.c | |
4163 | ---- | |
4164 | ||
4165 | You should now be able to run `hello-world` on the target machine, in | |
4166 | this case, a Cygwin machine. | |
4167 | ||
4168 | Next, you must cross compile the MLton runtime system and inform MLton | |
4169 | of the availability of the new target. The script `bin/add-cross` | |
4170 | from the MLton sources will help you do this. Please read the | |
4171 | comments at the top of the script. Here is a sample run adding a | |
4172 | Solaris cross compiler. | |
4173 | ---- | |
4174 | % add-cross sparc-sun-solaris sun blade | |
4175 | Making runtime. | |
4176 | Building print-constants executable. | |
4177 | Running print-constants on blade. | |
4178 | ---- | |
4179 | ||
4180 | Running `add-cross` uses `ssh` to compile the runtime on the target | |
4181 | machine and to create `print-constants`, which prints out all of the | |
4182 | constants that MLton needs in order to implement the | |
4183 | <:BasisLibrary:Basis Library>. The script runs `print-constants` on | |
4184 | the target machine (`blade` in this case), and saves the output. | |
4185 | ||
4186 | Once you have done all this, you should be able to cross compile SML | |
4187 | applications. For example, | |
4188 | ---- | |
4189 | mlton -target i386-pc-cygwin hello-world.sml | |
4190 | ---- | |
4191 | will create `hello-world`, which you should be able to run from a | |
4192 | Cygwin shell on your Windows machine. | |
4193 | ||
4194 | ||
4195 | == Cross-compiling alternatives == | |
4196 | ||
4197 | Building and maintaining a cross-compiling `gcc` is complex. You may | |
4198 | find it simpler to use `mlton -keep g` to generate the files on the | |
4199 | host, then copy the files to the target, and then use `gcc` or `mlton` | |
4200 | on the target to compile the files. | |
4201 | ||
4202 | <<< | |
4203 | ||
4204 | :mlton-guide-page: CVS | |
4205 | [[CVS]] | |
4206 | CVS | |
4207 | === | |
4208 | ||
4209 | http://www.gnu.org/software/cvs/[CVS] (Concurrent Versions System) is | |
4210 | a version control system. The MLton project used CVS to maintain its | |
4211 | <:Sources:source code>, but switched to <:Subversion:> on 20050730. | |
4212 | ||
4213 | Here are some online CVS resources. | |
4214 | ||
4215 | * http://cvsbook.red-bean.com/[Open Source Development with CVS] | |
4216 | ||
4217 | <<< | |
4218 | ||
4219 | :mlton-guide-page: DeadCode | |
4220 | [[DeadCode]] | |
4221 | DeadCode | |
4222 | ======== | |
4223 | ||
4224 | <:DeadCode:> is an optimization pass for the <:CoreML:> | |
4225 | <:IntermediateLanguage:>, invoked from <:CoreMLSimplify:>. | |
4226 | ||
4227 | == Description == | |
4228 | ||
4229 | This pass eliminates declarations from the | |
4230 | <:BasisLibrary:Basis Library> not needed by the user program. | |
4231 | ||
4232 | == Implementation == | |
4233 | ||
4234 | * <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.sig)> | |
4235 | * <!ViewGitFile(mlton,master,mlton/core-ml/dead-code.fun)> | |
4236 | ||
4237 | == Details and Notes == | |
4238 | ||
4239 | To compile small programs rapidly, a pass of dead code | |
4240 | elimination is run to eliminate as much of the Basis Library | |
4241 | as possible. The dead code elimination algorithm used is not safe in | |
4242 | general, and only works because the Basis Library implementation has | |
4243 | special properties: | |
4244 | ||
4245 | * it terminates | |
4246 | * it performs no I/O | |
4247 | ||
4248 | The dead code elimination includes the minimal set of | |
4249 | declarations from the Basis Library so that there are no free | |
4250 | variables in the user program (or remaining Basis Library | |
4251 | implementation). It has a special hack to include all | |
4252 | bindings of the form: | |
4253 | [source,sml] | |
4254 | ---- | |
4255 | val _ = ... | |
4256 | ---- | |
4257 | ||
4258 | There is an <:MLBasisAnnotations:ML Basis annotation>, | |
4259 | `deadCode true`, that governs which code is subject to this unsafe | |
4260 | dead-code elimination. | |
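
To illustrate why this kind of elimination is not safe for arbitrary code, consider the following sketch. The program is hypothetical and not taken from the MLton sources. The binding `x` is never used, so a pass that simply drops unused declarations would remove it, but that would also remove the `print` on its right-hand side and change what the program outputs.

```sml
(* x is never used, but evaluating its right-hand side performs I/O. *)
val x = (print "initializing\n"; 13)

(* A binding of the form `val _ = ...`, which the pass always keeps. *)
val _ = print "done\n"
```

Basis Library declarations avoid this situation because they terminate and perform no I/O, which is what makes dropping the unused ones acceptable.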
4261 | ||
4262 | <<< | |
4263 | ||
4264 | :mlton-guide-page: DeepFlatten | |
4265 | [[DeepFlatten]] | |
4266 | DeepFlatten | |
4267 | =========== | |
4268 | ||
4269 | <:DeepFlatten:> is an optimization pass for the <:SSA2:> | |
4270 | <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>. | |
4271 | ||
4272 | == Description == | |
4273 | ||
4274 | This pass flattens tuples into mutable fields of objects and into vectors. | |
4275 | ||
4276 | For example, an `(int * int) ref` is represented by a two-word | |
4277 | object, and an `(int * int) array` contains pairs of `int`-s, | |
4278 | rather than pointers to pairs of `int`-s. | |
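
As a rough source-level illustration, the following sketch declares values of the two types mentioned above. The names `cell` and `pairs` are made up for this example. The flattening itself happens inside the compiler and is not visible in the source program.

```sml
(* A ref cell holding a pair: with flattening, both ints live in the cell's object. *)
val cell : (int * int) ref = ref (1, 2)
val () = cell := (3, 4)

(* An array of pairs: with flattening, elements are stored inline, not as pointers. *)
val pairs : (int * int) array = Array.tabulate (4, fn i => (i, i * i))
val (a, b) = Array.sub (pairs, 2)
val () = print (Int.toString a ^ " " ^ Int.toString b ^ "\n")
```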
4279 | ||
4280 | == Implementation == | |
4281 | ||
4282 | * <!ViewGitFile(mlton,master,mlton/ssa/deep-flatten.fun)> | |
4283 | ||
4284 | == Details and Notes == | |
4285 | ||
4286 | There are some performance issues with the deep flatten pass, where it | |
4287 | consumes an excessive amount of memory. | |
4288 | ||
4289 | * http://www.mlton.org/pipermail/mlton/2005-April/026990.html | |
4290 | * http://www.mlton.org/pipermail/mlton-user/2010-June/001626.html | |
4291 | * http://www.mlton.org/pipermail/mlton/2010-December/030876.html | |
4292 | ||
4293 | A number of applications require compilation with | |
4294 | `-disable-pass deepFlatten` to avoid exceeding available memory. It is | |
4295 | often asked whether the deep flatten pass usually has a significant | |
4296 | impact on performance. The standard benchmark suite was run with and | |
4297 | without the deep flatten pass enabled when the pass was first | |
4298 | introduced: | |
4299 | ||
4300 | * http://www.mlton.org/pipermail/mlton/2004-August/025760.html | |
4301 | ||
4302 | The conclusion is that it does not have a significant impact. | |
4303 | However, these are micro benchmarks; other applications may derive | |
4304 | greater benefit from the pass. | |
4305 | ||
4306 | <<< | |
4307 | ||
4308 | :mlton-guide-page: DefineTypeBeforeUse | |
4309 | [[DefineTypeBeforeUse]] | |
4310 | DefineTypeBeforeUse | |
4311 | =================== | |
4312 | ||
4313 | <:StandardML:Standard ML> requires types to be defined before they are | |
4314 | used. Because of type inference, the use of a type can be implicit; | |
4315 | hence, this requirement is more subtle than it might appear. For | |
4316 | example, the following program is not type correct, because the type | |
4317 | of `r` is `t option ref`, but `t` is defined after `r`. | |
4318 | ||
4319 | [source,sml] | |
4320 | ---- | |
4321 | val r = ref NONE | |
4322 | datatype t = A | B | |
4323 | val () = r := SOME A | |
4324 | ---- | |
4325 | ||
4326 | MLton reports the following error, indicating that the type defined on | |
4327 | line 2 is used on line 1. | |
4328 | ||
4329 | ---- | |
4330 | Error: z.sml 3.10-3.20. | |
4331 | Function applied to incorrect argument. | |
4332 | expects: _ * [???] option | |
4333 | but got: _ * [t] option | |
4334 | in: := (r, SOME A) | |
4335 | note: type would escape its scope: t | |
4336 | escape from: z.sml 2.10-2.10 | |
4337 | escape to: z.sml 1.1-1.16 | |
4338 | Warning: z.sml 1.5-1.5. | |
4339 | Type of variable was not inferred and could not be generalized: r. | |
4340 | type: ??? option ref | |
4341 | in: val r = ref NONE | |
4342 | ---- | |
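
One way to repair the program, shown here as a sketch rather than as part of the original guide, is to declare the datatype before the ref cell, so that `t` is in scope where `r` is defined. The type annotation on `r` is optional and is included only to make the intended type explicit.

```sml
datatype t = A | B
val r: t option ref = ref NONE
val () = r := SOME A
```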
4343 | ||
4344 | While the above example is benign, the following example shows how to | |
4345 | cast an integer to a function by (implicitly) using a type before it | |
4346 | is defined. In the example, the ref cell `r` is of type | |
4347 | `t option ref`, where `t` is defined _after_ `r`, as a parameter to | |
4348 | functor `F`. | |
4349 | ||
4350 | [source,sml] | |
4351 | ---- | |
4352 | val r = ref NONE | |
4353 | functor F (type t | |
4354 | val x: t) = | |
4355 | struct | |
4356 | val () = r := SOME x | |
4357 | fun get () = valOf (!r) | |
4358 | end | |
4359 | structure S1 = F (type t = unit -> unit | |
4360 | val x = fn () => ()) | |
4361 | structure S2 = F (type t = int | |
4362 | val x = 13) | |
4363 | val () = S1.get () () | |
4364 | ---- | |
4365 | ||
4366 | MLton reports the following errors and warnings. | |
4367 | ||
4368 | ---- | |
4369 | Warning: z.sml 1.5-1.5. | |
4370 | Type of variable was not inferred and could not be generalized: r. | |
4371 | type: ??? option ref | |
4372 | in: val r = ref NONE | |
4373 | Error: z.sml 5.16-5.26. | |
4374 | Function applied to incorrect argument. | |
4375 | expects: _ * [???] option | |
4376 | but got: _ * [t] option | |
4377 | in: := (r, SOME x) | |
4378 | note: type would escape its scope: t | |
4379 | escape from: z.sml 2.17-2.17 | |
4380 | escape to: z.sml 1.1-1.16 | |
4381 | Warning: z.sml 6.11-6.13. | |
4382 | Type of variable was not inferred and could not be generalized: get. | |
4383 | type: unit -> ??? | |
4384 | in: fun get () = (valOf (! r)) | |
4385 | Error: z.sml 12.10-12.18. | |
4386 | Function not of arrow type. | |
4387 | function: [unit] | |
4388 | in: (S1.get ()) () | |
4389 | ---- | |
4390 | ||
4391 | <<< | |
4392 | ||
4393 | :mlton-guide-page: DefinitionOfStandardML | |
4394 | [[DefinitionOfStandardML]] | |
4395 | DefinitionOfStandardML | |
4396 | ====================== | |
4397 | ||
4398 | <!Cite(MilnerEtAl97, The Definition of Standard ML (Revised))> is a | |
4399 | terse and formal specification of <:StandardML:Standard ML>'s syntax | |
4400 | and semantics. The language specified by this book is often referred | |
4401 | to as SML 97. You can check its syntax | |
4402 | http://www.mpi-sws.org/~rossberg/sml.html[grammar] online (thanks to | |
4403 | Andreas Rossberg). | |
4404 | ||
4405 | <!Cite(MilnerEtAl90, The Definition of Standard ML)> is an older | |
4406 | version of the definition, published in 1990. The accompanying | |
4407 | <!Cite(MilnerTofte91, Commentary)> introduces and explains the notation | |
4408 | and approach. The same notation is used in the SML 97 definition, so it | |
4409 | is worth keeping the older definition and its commentary at hand if you | |
4410 | intend a close study of the definition. | |
4411 | ||
4412 | <<< | |
4413 | ||
4414 | :mlton-guide-page: Defunctorize | |
4415 | [[Defunctorize]] | |
4416 | Defunctorize | |
4417 | ============ | |
4418 | ||
4419 | <:Defunctorize:> is a translation pass from the <:CoreML:> | |
4420 | <:IntermediateLanguage:> to the <:XML:> <:IntermediateLanguage:>. | |
4421 | ||
4422 | == Description == | |
4423 | ||
4424 | This pass converts a <:CoreML:> program to an <:XML:> program by | |
4425 | performing: | |
4426 | ||
4427 | * linearization | |
4428 | * <:MatchCompile:> | |
4429 | * polymorphic `val` dec expansion | |
4430 | * `datatype` lifting (to the top-level) | |
4431 | ||
4432 | == Implementation == | |
4433 | ||
4434 | * <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.sig)> | |
4435 | * <!ViewGitFile(mlton,master,mlton/defunctorize/defunctorize.fun)> | |
4436 | ||
4437 | == Details and Notes == | |
4438 | ||
4439 | This pass is grossly misnamed and does not perform defunctorization. | |
4440 | ||
4441 | === Datatype Lifting === | |
4442 | ||
4443 | This pass moves all `datatype` declarations to the top level. | |
4444 | ||
4445 | <:StandardML:Standard ML> `datatype` declarations can contain type | |
4446 | variables that are not bound in the declaration itself. For example, | |
4447 | the following program is valid. | |
4448 | [source,sml] | |
4449 | ---- | |
4450 | fun 'a f (x: 'a) = | |
4451 | let | |
4452 | datatype 'b t = T of 'a * 'b | |
4453 | val y: int t = T (x, 1) | |
4454 | in | |
4455 | 13 | |
4456 | end | |
4457 | ---- | |
4458 | ||
4459 | Unfortunately, the `datatype` declaration cannot be immediately moved | |
4460 | to the top level, because that would leave `'a` free. | |
4461 | [source,sml] | |
4462 | ---- | |
4463 | datatype 'b t = T of 'a * 'b | |
4464 | fun 'a f (x: 'a) = | |
4465 | let | |
4466 | val y: int t = T (x, 1) | |
4467 | in | |
4468 | 13 | |
4469 | end | |
4470 | ---- | |
4471 | ||
4472 | In order to safely move `datatype`s, this pass must close them, as | |
4473 | well as add any free type variables as extra arguments to the type | |
4474 | constructor. For example, the above program would be translated to | |
4475 | the following. | |
4476 | [source,sml] | |
4477 | ---- | |
4478 | datatype ('a, 'b) t = T of 'a * 'b | |
4479 | fun 'a f (x: 'a) = | |
4480 | let | |
4481 | val y: ('a, int) t = T (x, 1) | |
4482 | in | |
4483 | 13 | |
4484 | end | |
4485 | ---- | |
4486 | ||
4487 | == Historical Notes == | |
4488 | ||
4489 | The <:Defunctorize:> pass originally eliminated | |
4490 | <:StandardML:Standard ML> functors by duplicating their body at each | |
4491 | application. These duties have been adopted by the <:Elaborate:> | |
4492 | pass. | |
4493 | ||
4494 | <<< | |
4495 | ||
4496 | :mlton-guide-page: Developers | |
4497 | [[Developers]] | |
4498 | Developers | |
4499 | ========== | |
4500 | ||
4501 | Here is a picture of the MLton team at a meeting in Chicago in August | |
4502 | 2003. From left to right we have: | |
4503 | ||
4504 | [align="center",frame="none",cols="^"] | |
4505 | |===== | |
4506 | |<:StephenWeeks:> -- <:MatthewFluet:> -- <:HenryCejtin:> -- <:SureshJagannathan:> | |
4507 | |===== | |
4508 | ||
4509 | image::Developers.attachments/team.jpg[align="center"] | |
4510 | ||
4511 | Also see the <:Credits:> for a list of specific contributions. | |
4512 | ||
4513 | ||
4514 | == Developers list == | |
4515 | ||
4516 | A number of people read the developers mailing list, | |
4517 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`], and make | |
4518 | contributions there. Here's a list of those who have a page here. | |
4519 | ||
4520 | * <:AndreiFormiga:> | |
4521 | * <:JesperLouisAndersen:> | |
4522 | * <:JohnnyAndersen:> | |
4523 | * <:MichaelNorrish:> | |
4524 | * <:MikeThomas:> | |
4525 | * <:RayRacine:> | |
4526 | * <:WesleyTerpstra:> | |
4527 | * <:VesaKarvonen:> | |
4528 | ||
4529 | <<< | |
4530 | ||
4531 | :mlton-guide-page: Development | |
4532 | [[Development]] | |
4533 | Development | |
4534 | =========== | |
4535 | ||
4536 | This page is the central point for MLton development. | |
4537 | ||
4538 | * Access the <:Sources:>. | |
4539 | * Check the current <!ViewGitFile(mlton,master,CHANGELOG.adoc)> or recent https://github.com/MLton/mlton/commits/master[commits]. | |
4540 | * Open https://github.com/MLton/mlton/issues[Issues]. | |
4541 | * Ideas for <:Projects:> to improve MLton. | |
4542 | * <:Developers:> that are or have been involved in the project. | |
4543 | // * Help maintain and improve the <:WebSite:>. | |
4544 | ||
4545 | == Notes == | |
4546 | ||
4547 | * <:CompilerOverview:> | |
4548 | * <:CompilingWithSMLNJ:> | |
4549 | * <:CrossCompiling:> | |
4550 | * <:License:> | |
4551 | * <:NeedsReview:> | |
4552 | * <:PortingMLton:> | |
4553 | * <:ReleaseChecklist:> | |
4554 | * <:SelfCompiling:> | |
4555 | ||
4556 | <<< | |
4557 | ||
4558 | :mlton-guide-page: Documentation | |
4559 | [[Documentation]] | |
4560 | Documentation | |
4561 | ============= | |
4562 | ||
4563 | Documentation is available on the following topics. | |
4564 | ||
4565 | * <:StandardML:Standard ML> | |
4566 | ** <:BasisLibrary:Basis Library> | |
4567 | ** <:Libraries: Additional libraries> | |
4568 | * <:Installation:Installing MLton> | |
4569 | * Using MLton | |
4570 | ** <:ForeignFunctionInterface: Foreign function interface (FFI)> | |
4571 | ** <:ManualPage: Manual page> (<:CompileTimeOptions:compile-time options>, <:RunTimeOptions:run-time options>) | |
4572 | ** <:MLBasis: ML Basis system> | |
4573 | ** <:MLtonStructure: MLton structure> | |
4574 | ** <:PlatformSpecificNotes: Platform-specific notes> | |
4575 | ** <:Profiling: Profiling> | |
4576 | ** <:TypeChecking: Type checking> | |
4577 | ** Help for porting from <:SMLNJ:SML/NJ> to MLton. | |
4578 | * About MLton | |
4579 | ** <:Credits:> | |
4580 | ** <:Drawbacks:> | |
4581 | ** <:Features:> | |
4582 | ** <:History:> | |
4583 | ** <:License:> | |
4584 | ** <:Talk:> | |
4585 | ** <:WishList:> | |
4586 | * Tools | |
4587 | ** <:MLLex:> (<!Attachment(Documentation,mllex.pdf)>) | |
4588 | ** <:MLYacc:> (<!Attachment(Documentation,mlyacc.pdf)>) | |
4589 | ** <:MLNLFFIGen:> (<!Attachment(Documentation,mlyacc.pdf)>) | |
4590 | * <:References:> | |
4591 | ||
4592 | <<< | |
4593 | ||
4594 | :mlton-guide-page: Drawbacks | |
4595 | [[Drawbacks]] | |
4596 | Drawbacks | |
4597 | ========= | |
4598 | ||
4599 | MLton has several drawbacks due to its use of whole-program | |
4600 | compilation. | |
4601 | ||
4602 | * Large compile-time memory requirement. | |
4603 | + | |
4604 | Because MLton performs whole-program analysis and optimization, | |
4605 | compilation requires a large amount of memory. For example, compiling | |
4606 | MLton (over 140K lines) requires at least 512M RAM. | |
4607 | ||
4608 | * Long compile times. | |
4609 | + | |
4610 | Whole-program compilation can take a long time. For example, | |
4611 | compiling MLton (over 140K lines) on a 1.6GHz machine takes five to | |
4612 | ten minutes. | |
4613 | ||
4614 | * No interactive top level. | |
4615 | + | |
4616 | Because of whole-program compilation, MLton does not provide an | |
4617 | interactive top level. In particular, it does not implement the | |
4618 | optional <:BasisLibrary:Basis Library> function `use`. | |
4619 | ||
4620 | <<< | |
4621 | ||
4622 | :mlton-guide-page: Eclipse | |
4623 | [[Eclipse]] | |
4624 | Eclipse | |
4625 | ======= | |
4626 | ||
4627 | http://eclipse.org/[Eclipse] is an open, extensible IDE. | |
4628 | ||
4629 | http://www.cse.iitd.ernet.in/%7Ecsu02132/mldev/[ML-Dev] is a plug-in | |
4630 | for Eclipse, based on <:SMLNJ:SML/NJ>. | |
4631 | ||
4632 | There has been some talk on the MLton mailing list about adding | |
4633 | support to Eclipse for MLton/SML, and in particular, using | |
4634 | http://eclipsefp.sourceforge.net/. We are unaware of any progress | |
4635 | along those lines. | |
4636 | ||
4637 | <<< | |
4638 | ||
4639 | :mlton-guide-page: Elaborate | |
4640 | [[Elaborate]] | |
4641 | Elaborate | |
4642 | ========= | |
4643 | ||
4644 | <:Elaborate:> is a translation pass from the <:AST:> | |
4645 | <:IntermediateLanguage:> to the <:CoreML:> <:IntermediateLanguage:>. | |
4646 | ||
4647 | == Description == | |
4648 | ||
4649 | This pass performs type inference and type checking according to the | |
4650 | <:DefinitionOfStandardML:Definition>. It also defunctorizes the | |
4651 | program, eliminating all module-level constructs. | |
4652 | ||
4653 | == Implementation == | |
4654 | ||
4655 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.sig)> | |
4656 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate.fun)> | |
4657 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.sig)> | |
4658 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-env.fun)> | |
4659 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.sig)> | |
4660 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-modules.fun)> | |
4661 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.sig)> | |
4662 | * <!ViewGitFile(mlton,master,mlton/elaborate/elaborate-core.fun)> | |
4663 | * <!ViewGitDir(mlton,master,mlton/elaborate)> | |
4664 | ||
4665 | == Details and Notes == | |
4666 | ||
4667 | At the modules level, the <:Elaborate:> pass: | |
4668 | ||
4669 | * elaborates signatures with interfaces (see | |
4670 | <!ViewGitFile(mlton,master,mlton/elaborate/interface.sig)> and | |
4671 | <!ViewGitFile(mlton,master,mlton/elaborate/interface.fun)>) | |
4672 | + | |
4673 | The main trick is to use disjoint sets to efficiently handle sharing | |
4674 | of tycons and of structures and then to copy signatures as dags rather | |
4675 | than as trees. | |
4676 | ||
4677 | * checks functors at the point of definition, using functor summaries | |
4678 | to speed up checking of functor applications. | |
4679 | + | |
4680 | When a functor is first type checked, we keep track of the dummy | |
4681 | argument structure and the dummy result structure, as well as all the | |
4682 | tycons that were created while elaborating the body. Then, if we | |
4683 | later need to type check an application of the functor (as opposed to | |
4684 | defunctorizing an application), we pair up tycons in the dummy argument | |
4685 | structure with the actual argument structure and then replace the | |
4686 | dummy tycons with the actual tycons in the dummy result structure, | |
4687 | yielding the actual result structure. We also generate new tycons for | |
4688 | all the tycons that we created while originally elaborating the body. | |
4689 | ||
4690 | * handles opaque signature constraints. | |
4691 | + | |
4692 | This is implemented by building a dummy structure realized from the | |
4693 | signature, just as we would for a functor argument when type checking | |
4694 | a functor. The dummy structure contains exactly the type information | |
4695 | that is in the signature, which is what opacity requires. We then | |
4696 | replace the variables (and constructors) in the dummy structure with | |
4697 | the corresponding variables (and constructors) from the actual | |
4698 | structure so that the translation to <:CoreML:> uses the right stuff. | |
4699 | For each tycon in the dummy structure, we keep track of the | |
4700 | corresponding type structure in the actual structure. This is used | |
4701 | when producing the <:CoreML:> types (see `expandOpaque` in | |
4702 | <!ViewGitFile(mlton,master,mlton/elaborate/type-env.sig)> and | |
4703 | <!ViewGitFile(mlton,master,mlton/elaborate/type-env.fun)>). | |
4704 | + | |
4705 | Then, within each `structure` or `functor` body, for each declaration | |
4706 | (`<dec>` in the <:StandardML:Standard ML> grammar), the <:Elaborate:> | |
4707 | pass does three steps: | |
4708 | + | |
4709 | -- | |
4710 | 1. <:ScopeInference:> | |
4711 | 2. {empty} | |
4712 | ** <:PrecedenceParse:> | |
4713 | ** `_{ex,im}port` expansion | |
4714 | ** profiling insertion | |
4715 | ** unification | |
4716 | 3. Overloaded {constant, function, record pattern} resolution | |
4717 | -- | |
4718 | ||
4719 | === Defunctorization === | |
4720 | ||
4721 | The <:Elaborate:> pass performs a number of duties historically | |
4722 | assigned to the <:Defunctorize:> pass. | |
4723 | ||
4724 | As part of the <:Elaborate:> pass, all module level constructs | |
4725 | (`open`, `signature`, `structure`, `functor`, long identifiers) are | |
4726 | removed. This works because the <:Elaborate:> pass assigns a unique | |
4727 | name to every type and variable in the program. This also allows the | |
4728 | <:Elaborate:> pass to eliminate `local` declarations, which are purely | |
4729 | for namespace management. | |
4730 | ||
4731 | ||
4732 | == Examples == | |
4733 | ||
4734 | Here are a number of examples of elaboration. | |
4735 | ||
4736 | * All variables bound in `val` declarations are renamed. | |
4737 | + | |
4738 | [source,sml] | |
4739 | ---- | |
4740 | val x = 13 | |
4741 | val y = x | |
4742 | ---- | |
4743 | + | |
4744 | ---- | |
4745 | val x_0 = 13 | |
4746 | val y_0 = x_0 | |
4747 | ---- | |
4748 | ||
4749 | * All variables in `fun` declarations are renamed. | |
4750 | + | |
4751 | [source,sml] | |
4752 | ---- | |
4753 | fun f x = g x | |
4754 | and g y = f y | |
4755 | ---- | |
4756 | + | |
4757 | ---- | |
4758 | fun f_0 x_0 = g_0 x_0 | |
4759 | and g_0 y_0 = f_0 y_0 | |
4760 | ---- | |
4761 | ||
4762 | * Type abbreviations are removed, and the abbreviation is expanded | |
4763 | wherever it is used. | |
4764 | + | |
4765 | [source,sml] | |
4766 | ---- | |
4767 | type 'a u = int * 'a | |
4768 | type 'b t = 'b u * real | |
4769 | fun f (x : bool t) = x | |
4770 | ---- | |
4771 | + | |
4772 | ---- | |
4773 | fun f_0 (x_0 : (int * bool) * real) = x_0 | |
4774 | ---- | |
4775 | ||
4776 | * Exception declarations create a new constructor and rename the type. | |
4777 | + | |
4778 | [source,sml] | |
4779 | ---- | |
4780 | type t = int | |
4781 | exception E of t * real | |
4782 | ---- | |
4783 | + | |
4784 | ---- | |
4785 | exception E_0 of int * real | |
4786 | ---- | |
4787 | ||
4788 | * The type and value constructors in datatype declarations are renamed. | |
4789 | + | |
4790 | [source,sml] | |
4791 | ---- | |
4792 | datatype t = A of int | B of real * t | |
4793 | ---- | |
4794 | + | |
4795 | ---- | |
4796 | datatype t_0 = A_0 of int | B_0 of real * t_0 | |
4797 | ---- | |
4798 | ||
4799 | * Local declarations are moved to the top level. The environment | |
4800 | keeps track of the variables in scope. | |
4801 | + | |
4802 | [source,sml] | |
4803 | ---- | |
4804 | val x = 13 | |
4805 | local val x = 14 | |
4806 | in val y = x | |
4807 | end | |
4808 | val z = x | |
4809 | ---- | |
4810 | + | |
4811 | ---- | |
4812 | val x_0 = 13 | |
4813 | val x_1 = 14 | |
4814 | val y_0 = x_1 | |
4815 | val z_0 = x_0 | |
4816 | ---- | |
4817 | ||
4818 | * Structure declarations are eliminated, with all declarations moved | |
4819 | to the top level. Long identifiers are renamed. | |
4820 | + | |
4821 | [source,sml] | |
4822 | ---- | |
4823 | structure S = | |
4824 | struct | |
4825 | type t = int | |
4826 | val x : t = 13 | |
4827 | end | |
4828 | val y : S.t = S.x | |
4829 | ---- | |
4830 | + | |
4831 | ---- | |
4832 | val x_0 : int = 13 | |
4833 | val y_0 : int = x_0 | |
4834 | ---- | |
4835 | ||
4836 | * Open declarations are eliminated. | |
4837 | + | |
4838 | [source,sml] | |
4839 | ---- | |
4840 | val x = 13 | |
4841 | val y = 14 | |
4842 | structure S = | |
4843 | struct | |
4844 | val x = 15 | |
4845 | end | |
4846 | open S | |
4847 | val z = x + y | |
4848 | ---- | |
4849 | + | |
4850 | ---- | |
4851 | val x_0 = 13 | |
4852 | val y_0 = 14 | |
4853 | val x_1 = 15 | |
4854 | val z_0 = x_1 + y_0 | |
4855 | ---- | |
4856 | ||
4857 | * Functor declarations are eliminated, and the body of a functor is | |
4858 | duplicated wherever the functor is applied. | |
4859 | + | |
4860 | [source,sml] | |
4861 | ---- | |
4862 | functor F(val x : int) = | |
4863 | struct | |
4864 | val y = x | |
4865 | end | |
4866 | structure F1 = F(val x = 13) | |
4867 | structure F2 = F(val x = 14) | |
4868 | val z = F1.y + F2.y | |
4869 | ---- | |
4870 | + | |
4871 | ---- | |
4872 | val x_0 = 13 | |
4873 | val y_0 = x_0 | |
4874 | val x_1 = 14 | |
4875 | val y_1 = x_1 | |
4876 | val z_0 = y_0 + y_1 | |
4877 | ---- | |
4878 | ||
4879 | * Signature constraints are eliminated. Note that signatures do | |
4880 | affect how subsequent variables are renamed. | |
4881 | + | |
4882 | [source,sml] | |
4883 | ---- | |
4884 | val y = 13 | |
4885 | structure S : sig | |
4886 | val x : int | |
4887 | end = | |
4888 | struct | |
4889 | val x = 14 | |
4890 | val y = x | |
4891 | end | |
4892 | open S | |
4893 | val z = x + y | |
4894 | ---- | |
4895 | + | |
4896 | ---- | |
4897 | val y_0 = 13 | |
4898 | val x_0 = 14 | |
4899 | val y_1 = x_0 | |
4900 | val z_0 = x_0 + y_0 | |
4901 | ---- | |
4902 | ||
4903 | <<< | |
4904 | ||
4905 | :mlton-guide-page: Emacs | |
4906 | [[Emacs]] | |
4907 | Emacs | |
4908 | ===== | |
4909 | ||
4910 | == SML modes == | |
4911 | ||
4912 | There are a few Emacs modes for SML. | |
4913 | ||
4914 | * `sml-mode` | |
4915 | ** http://www.xemacs.org/Documentation/packages/html/sml-mode_3.html | |
4916 | ** http://www.smlnj.org/doc/Emacs/sml-mode.html | |
4917 | ** http://www.iro.umontreal.ca/%7Emonnier/elisp/ | |
4918 | ||
4919 | * <!ViewGitFile(mlton,master,ide/emacs/mlton.el)> contains the Emacs lisp that <:StephenWeeks:> uses to interact with MLton (in addition to using `sml-mode`). | |
4920 | ||
4921 | * http://primate.net/%7Eitz/mindent.tar, developed by Ian Zimmerman, who writes: | |
4922 | + | |
4923 | _____ | |
4924 | Unlike the widespread `sml-mode.el` it doesn't try to indent code | |
4925 | based on ML syntax. I gradually got skeptical about this approach | |
4926 | after writing the initial indentation support for caml mode and | |
4927 | watching it bloat insanely as the language added new features. Also, | |
4928 | any such attempts that I know of impose a particular coding style, or | |
4929 | at best a choice among a limited set of styles, which I now oppose. | |
4930 | Instead my mode is based on a generic package which provides manual | |
4931 | bindable commands for common indentation operations (example: indent | |
4932 | the current line under the n-th occurrence of a particular character | |
4933 | in the previous non-blank line). | |
4934 | _____ | |
4935 | ||
4936 | == MLB modes == | |
4937 | ||
4938 | There is a mode for editing <:MLBasis: ML Basis> files. | |
4939 | ||
4940 | * <!ViewGitFile(mlton,master,ide/emacs/esml-mlb-mode.el)> (plus other files) | |
4941 | ||
4942 | == Definitions and uses == | |
4943 | ||
4944 | There is a mode that supports the precise def-use information that | |
4945 | MLton can output. It highlights definitions and uses and provides | |
4946 | commands for navigation (e.g., `jump-to-def`, `jump-to-next`, | |
4947 | `list-all-refs`). It can be handy, for example, for navigating in the | |
4948 | MLton compiler source code. See <:EmacsDefUseMode:> for further | |
4949 | information. | |
4950 | ||
4951 | == Building in the background == | |
4952 | ||
4953 | Tired of manually starting/stopping/restarting builds after editing | |
4954 | files? Now you don't have to. See <:EmacsBgBuildMode:> for further | |
4955 | information. | |
4956 | ||
4957 | == Error messages == | |
4958 | ||
4959 | MLton's error messages are not among those that the Emacs `next-error` | |
4960 | parser natively understands. The easiest way to fix this is to add | |
4961 | the following to your `.emacs` to teach Emacs to recognize MLton's | |
4962 | error messages. | |
4963 | ||
4964 | [source,cl] | |
4965 | ---- | |
4966 | (require 'compile) | |
4967 | (add-to-list 'compilation-error-regexp-alist 'mlton) | |
4968 | (add-to-list 'compilation-error-regexp-alist-alist | |
4969 | '(mlton | |
4970 | "^[[:space:]]*\\(\\(?:\\(Error\\)\\|\\(Warning\\)\\|\\(\\(?:\\(?:defn\\|spec\\) at\\)\\|\\(?:escape \\(?:from\\|to\\)\\)\\|\\(?:scoped at\\)\\)\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\(?:-\\([0-9]+\\)\\.\\([0-9]+\\)\\)?\\.?\\)$" | |
4971 | 5 (6 . 8) (7 . 9) (3 . 4) 1)) | |
4972 | ---- | |
4973 | ||
4974 | <<< | |
4975 | ||
4976 | :mlton-guide-page: EmacsBgBuildMode | |
4977 | [[EmacsBgBuildMode]] | |
4978 | EmacsBgBuildMode | |
4979 | ================ | |
4980 | ||
4981 | Do you really want to think about starting a build of your project? | |
4982 | What if you had a personal assistant that would restart a build of your | |
4983 | project whenever you saved any file belonging to that project? The | |
4984 | bg-build mode does just that. When you save a file, a compile is | |
4985 | started (silently!), and you can continue working without even thinking | |
4986 | about starting a build. If there are errors, you are notified | |
4987 | (with a message) and can then jump to them. | |
4988 | ||
4989 | This mode is not specific to MLton per se, but is particularly useful | |
4990 | for working with MLton due to the longer compile times. By the time | |
4991 | you start wondering about possible errors, the build is already under | |
4992 | way. | |
4993 | ||
4994 | == Functionality and Features == | |
4995 | ||
4996 | * Each time a file is saved, and after a user configurable delay | |
4997 | period has been exhausted, a build is started silently in the | |
4998 | background. | |
4999 | * When the build is finished, a status indicator (message) is | |
5000 | displayed non-intrusively. | |
5001 | * At any time, you can switch to a build process buffer where all the | |
5002 | messages from the build are shown. | |
5003 | * Optionally highlights (error/warning) message locations in (source | |
5004 | code) buffers after a finished build. | |
5005 | * After a build has finished, you can jump to locations of warnings | |
5006 | and errors from the build process buffer or by using the `first-error` | |
5007 | and `next-error` commands. | |
5008 | * When a build fails, bg-build mode can optionally execute a user | |
5009 | specified command. By default, bg-build mode executes `first-error`. | |
5010 | * When starting a build of a particular project, a possible previous | |
5011 | live build of the same project is interrupted first. | |
5012 | * A project configuration file specifies the commands required to | |
5013 | build a project. | |
5014 | * Multiple projects can be loaded into bg-build mode and bg-build mode | |
5015 | can build a given maximum number of projects concurrently. | |
5016 | * Supports both http://www.gnu.org/software/emacs/[GNU Emacs] and | |
5017 | http://www.xemacs.org[XEmacs]. | |
5018 | ||
5019 | ||
5020 | == Download == | |
5021 | ||
5022 | There is no package for the mode at the moment. To install the mode you | |
5023 | need to fetch the Emacs Lisp, `*.el`, files from the MLton repository: | |
5024 | <!ViewGitDir(mlton,master,ide/emacs)>. | |
5025 | ||
5026 | ||
5027 | == Setup == | |
5028 | ||
5029 | The easiest way to load the mode is to first tell Emacs where to find the | |
5030 | files. For example, add | |
5031 | ||
5032 | [source,cl] | |
5033 | ---- | |
5034 | (add-to-list 'load-path (file-truename "path-to-the-el-files")) | |
5035 | ---- | |
5036 | ||
5037 | to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably also want | |
5038 | to start the mode automatically by adding | |
5039 | ||
5040 | [source,cl] | |
5041 | ---- | |
5042 | (require 'bg-build-mode) | |
5043 | (bg-build-mode) | |
5044 | ---- | |
5045 | ||
5046 | to your Emacs init file. Once the mode is activated, you should see | |
5047 | the `BGB` indicator on the mode line. | |
5048 | ||
5049 | ||
5050 | === MLton and Compilation-Mode === | |
5051 | ||
5052 | At the time of writing, neither GNU Emacs nor XEmacs contains an error | |
5053 | regexp that would match MLton's messages. | |
5054 | ||
5055 | If you use GNU Emacs, insert the following code into your `.emacs` file: | |
5056 | ||
5057 | [source,cl] | |
5058 | ---- | |
5059 | (require 'compile) | |
5060 | (add-to-list | |
5061 | 'compilation-error-regexp-alist | |
5062 | '("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$" | |
5063 | 2 3 4)) | |
5064 | ---- | |
5065 | ||
5066 | If you use XEmacs, insert the following code into your `init.el` file: | |
5067 | ||
5068 | [source,cl] | |
5069 | ---- | |
5070 | (require 'compile) | |
5071 | (add-to-list | |
5072 | 'compilation-error-regexp-alist-alist | |
5073 | '(mlton | |
5074 | ("^\\(Warning\\|Error\\): \\(.+\\) \\([0-9]+\\)\\.\\([0-9]+\\)\\.$" | |
5075 | 2 3 4))) | |
5076 | (compilation-build-compilation-error-regexp-alist) | |
5077 | ---- | |
5078 | ||
5079 | == Usage == | |
5080 | ||
5081 | Typically projects are built (or compiled) using a tool like http://www.gnu.org/software/make/[`make`], | |
5082 | but the details vary. The bg-build mode needs a project configuration file to | |
5083 | know how to build your project. A project configuration file basically contains | |
5084 | an Emacs Lisp expression calling a function named `bg-build` that returns a | |
5085 | project object. A simple example of a project configuration file would be the | |
5086 | (<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/example/smlbot/Build.bgb)>) | |
5087 | file used with smlbot: | |
5088 | ||
5089 | [source,cl] | |
5090 | ---- | |
5091 | sys::[./bin/InclGitFile.py mltonlib master com/ssh/async/unstable/example/smlbot/Build.bgb 5:] | |
5092 | ---- | |
5093 | ||
5094 | The `bg-build` function takes a number of keyword arguments: | |
5095 | ||
5096 | * `:name` specifies the name of the project. This can be any | |
5097 | expression that evaluates to a string or to a nullary function that | |
5098 | returns a string. | |
5099 | ||
5100 | * `:shell` specifies a shell command to execute. This can be any | |
5101 | expression that evaluates to a string, a list of strings, or to a | |
5102 | nullary function returning a list of strings. | |
5103 | ||
5104 | * `:build?` specifies a predicate to determine whether the project | |
5105 | should be built after some files have been modified. The predicate is | |
5106 | given a list of filenames and should return a non-nil value when the | |
5107 | project should be built and nil otherwise. | |
5108 | ||
5109 | All of the keyword arguments, except `:shell`, are optional and can be left out. | |
5110 | ||
5111 | Note the use of the `nice` command above. It means that the background | |
5112 | build process is given a lower priority by the system process | |
5113 | scheduler. Assuming your machine has enough memory, using nice | |
5114 | ensures that your computer remains responsive. (You probably won't | |
5115 | even notice when a build is started.) | |
5116 | ||
5117 | Once you have written a project file for bg-build mode, use the | |
5118 | `bg-build-add-project` command to load it into bg-build | |
5119 | mode. The bg-build mode can also optionally load recent project files | |
5120 | automatically at startup. | |
5121 | ||
5122 | After the project file has been loaded and bg-build mode activated, | |
5123 | each time you save a file in Emacs, the bg-build mode tries to build | |
5124 | your project. | |
5125 | ||
5126 | The `bg-build-status` command creates a buffer that displays some | |
5127 | status information on builds and allows you to manage projects (start | |
5128 | builds explicitly, remove a project from bg-build, ...) as well as | |
5129 | visit buffers created by bg-build. Notice the count of started | |
5130 | builds. At the end of the day it can be in the hundreds or thousands. | |
5131 | Imagine the number of times you've been relieved of starting a build | |
5132 | explicitly! | |
5133 | ||
5134 | <<< | |
5135 | ||
5136 | :mlton-guide-page: EmacsDefUseMode | |
5137 | [[EmacsDefUseMode]] | |
5138 | EmacsDefUseMode | |
5139 | =============== | |
5140 | ||
5141 | MLton provides an <:CompileTimeOptions:option>, | |
5142 | ++-show-def-use __file__++, to output precise (giving exact source | |
5143 | locations) and accurate (including all uses and no false data) | |
5144 | whole-program def-use information to a file. Unlike typical tags | |
5145 | facilities, the information includes local variables and distinguishes | |
5146 | between different definitions even when they have the same name. The | |
5147 | def-use Emacs mode uses the information to provide navigation support, | |
5148 | which can be particularly useful while reading SML programs compiled | |
5149 | with MLton (such as the MLton compiler itself). | |
5150 | ||
5151 | ||
5152 | == Screen Capture == | |
5153 | ||
5154 | Note the highlighting and the type displayed in the minibuffer. | |
5155 | ||
5156 | image::EmacsDefUseMode.attachments/def-use-capture.png[align="center"] | |
5157 | ||
5158 | ||
5159 | == Features == | |
5160 | ||
5161 | * Highlights definitions and uses. Different colors for definitions, unused definitions, and uses. | |
5162 | * Shows types (with highlighting) of variable definitions in the minibuffer. | |
5163 | * Navigation: `jump-to-def`, `jump-to-next`, and `jump-to-prev`. These work precisely (no searching involved). | |
5164 | * Can list, visit and mark all references to a definition (within a program). | |
5165 | * Automatically reloads updated def-use files. | |
5166 | * Automatically loads previously used def-use files at startup. | |
5167 | * Supports both http://www.gnu.org/software/emacs/[GNU Emacs] and http://www.xemacs.org[XEmacs]. | |
5168 | ||
5169 | ||
5170 | == Download == | |
5171 | ||
5172 | There is no separate package for the def-use mode although the mode | |
5173 | has been relatively stable for some time already. To install the mode | |
5174 | you need to get the Emacs Lisp, `*.el`, files from MLton's repository: | |
5175 | <!ViewGitDir(mlton,master,ide/emacs)>. The easiest way to get the files | |
5176 | is to use <:Git:> to access MLton's <:Sources:sources>. | |
5177 | ||
5178 | ///// | |
5179 | If you only want the Emacs lisp files, you can use the following | |
5180 | command: | |
5181 | ---- | |
5182 | svn co svn://mlton.org/mlton/trunk/ide/emacs mlton-emacs-ide | |
5183 | ---- | |
5184 | ///// | |
5185 | ||
5186 | == Setup == | |
5187 | ||
5188 | The easiest way to load def-use mode is to first tell Emacs where to | |
5189 | find the files. For example, add | |
5190 | ||
5191 | [source,cl] | |
5192 | ---- | |
5193 | (add-to-list 'load-path (file-truename "path-to-the-el-files")) | |
5194 | ---- | |
5195 | ||
5196 | to your `~/.emacs` or `~/.xemacs/init.el`. You'll probably | |
5197 | also want to start `def-use-mode` automatically by adding | |
5198 | ||
5199 | [source,cl] | |
5200 | ---- | |
5201 | (require 'esml-du-mlton) | |
5202 | (def-use-mode) | |
5203 | ---- | |
5204 | ||
5205 | to your Emacs init file. Once the def-use mode is activated, you | |
5206 | should see the `DU` indicator on the mode line. | |
5207 | ||
5208 | == Usage == | |
5209 | ||
5210 | To use def-use mode one typically first sets up the program's makefile | |
5211 | or build script so that the def-use information is saved each time the | |
5212 | program is compiled. In addition to the ++-show-def-use __file__++ | |
5213 | option, the ++-prefer-abs-paths true++ expert option is required. | |
5214 | Note that the time it takes to save the information is small (compared | |
5215 | to type-checking), so it is recommended to simply add the options to | |
5216 | the MLton invocation that compiles the program. However, it is only | |
5217 | necessary to type check the program (or library), so one can specify | |
5218 | the ++-stop tc++ option. For example, if you have a program | |
5219 | defined by an MLB file named `my-prg.mlb`, you can save the def-use | |
5220 | information to the file `my-prg.du` by invoking MLton as: | |
5221 | ||
5222 | ---- | |
5223 | mlton -prefer-abs-paths true -show-def-use my-prg.du -stop tc my-prg.mlb | |
5224 | ---- | |
5225 | ||
5226 | Finally, one needs to tell the mode where to find the def-use | |
5227 | information. This is done with the `esml-du-mlton` command. For | |
5228 | example, to load the `my-prg.du` file, one would type: | |
5229 | ||
5230 | ---- | |
5231 | M-x esml-du-mlton my-prg.du | |
5232 | ---- | |
5233 | ||
5234 | After doing all of the above, find an SML file covered by the | |
5235 | previously saved and loaded def-use information, and place the cursor | |
5236 | at some variable (definition or use, it doesn't matter). You should | |
5237 | see the variable being highlighted. (Note that specifications in | |
5238 | signatures do not define variables.) | |
5239 | ||
5240 | You might also want to set up and use the | |
5241 | <:EmacsBgBuildMode:Bg-Build mode> to start builds automatically. | |
5242 | ||
5243 | ||
5244 | == Types == | |
5245 | ||
5246 | `-show-def-use` output was extended to include types of variable | |
5247 | definitions in revision <!ViewSVNRev(6333)>. To get good type names, the | |
5248 | types must be in scope at the end of the program. If you are using the | |
5249 | <:MLBasis:ML Basis> system, this means that the root MLB-file for your | |
5250 | application should not wrap the libraries used in the application inside | |
5251 | `local ... in ... end`, because that would remove them from the scope before | |
5252 | the end of the program. | |
5253 | ||
5254 | <<< | |
5255 | ||
5256 | :mlton-guide-page: Enscript | |
5257 | [[Enscript]] | |
5258 | Enscript | |
5259 | ======== | |
5260 | ||
5261 | http://www.gnu.org/s/enscript/[GNU Enscript] converts ASCII files to | |
5262 | PostScript, HTML, and other output languages, applying language | |
5263 | sensitive highlighting (similar to <:Emacs:>'s font lock mode). Here | |
5264 | are a few _states_ files for highlighting <:StandardML:Standard ML>. | |
5265 | ||
5266 | * <!ViewGitFile(mlton,master,ide/enscript/sml_simple.st)> -- Provides highlighting of keywords, string and character constants, and (nested) comments. | |
5267 | ///// | |
5268 | + | |
5269 | [source,sml] | |
5270 | ---- | |
5271 | (* Comments (* can be nested *) *) | |
5272 | structure S = struct | |
5273 | val x = (1, 2, "three") | |
5274 | end | |
5275 | ---- | |
5276 | ///// | |
5277 | ||
5278 | * <!ViewGitFile(mlton,master,ide/enscript/sml_verbose.st)> -- Supersedes | |
5279 | the above, adding highlighting of numeric constants. Due to the | |
5280 | limited parsing available, numeric record labels are highlighted as | |
5281 | numeric constants, in all contexts. Likewise, a binding precedence | |
5282 | separated from `infix` or `infixr` by a newline is highlighted as a | |
5283 | numeric constant and a numeric record label selector separated from | |
5284 | `#` by a newline is highlighted as a numeric constant. | |
5285 | ///// | |
5286 | + | |
5287 | [source,sml] | |
5288 | ---- | |
5289 | structure S = struct | |
5290 | (* These look good *) | |
5291 | val x = (1, 2, "three") | |
5292 | val z = #2 x | |
5293 | ||
5294 | (* Although these look bad (not all the numbers are constants), * | |
5295 | * they never occur in practice, as they are equivalent to the above. *) | |
5296 | val x = {1 = 1, 3 = "three", 2 = 2} | |
5297 | val z = # | |
5298 | 2 x | |
5299 | end | |
5300 | ---- | |
5301 | ///// | |
5302 | ||
5303 | * <!ViewGitFile(mlton,master,ide/enscript/sml_fancy.st)> -- Supersedes the | |
5304 | above, adding highlighting of type and constructor bindings, | |
5305 | highlighting of explicit binding of type variables at `val` and `fun` | |
5306 | declarations, and separate highlighting of core and modules level | |
5307 | keywords. Due to the limited parsing available, it is assumed that | |
5308 | the input is a syntactically correct, top-level declaration. | |
5309 | ///// | |
5310 | + | |
5311 | [source,sml] | |
5312 | ---- | |
5313 | structure S = struct | |
5314 | val x = (1, 2, "three") | |
5315 | datatype 'a t = T of 'a | |
5316 | and u = U of v * v | |
5317 | withtype v = {left: int t, right: int t} | |
5318 | exception E1 of int and E2 | |
5319 | fun 'a id (x: 'a) : 'a = x | |
5320 | ||
5321 | (* Although this looks bad (the explicitly bound type variable 'a is * | |
5322 | * not highlighted), it is unlikely to occur in practice. *) | |
5323 | val | |
5324 | 'a id = fn (x : 'a) => x | |
5325 | end | |
5326 | ---- | |
5327 | ///// | |
5328 | ||
5329 | * <!ViewGitFile(mlton,master,ide/enscript/sml_gaudy.st)> -- Supersedes the | |
5330 | above, adding highlighting of type annotations, in both expressions | |
5331 | and signatures. Due to the limited parsing available, it is assumed | |
5332 | that the input is a syntactically correct, top-level declaration. | |
5333 | ///// | |
5334 | + | |
5335 | [source,sml] | |
5336 | ---- | |
5337 | signature S = sig | |
5338 | type t | |
5339 | val x : t | |
5340 | val f : t * int -> int | |
5341 | end | |
5342 | structure S : S = struct | |
5343 | datatype t = T of int | |
5344 | val x : t = T 0 | |
5345 | fun f (T x, i : int) : int = x + i | |
5346 | fun 'a id (x: 'a) : 'a = x | |
5347 | end | |
5348 | ---- | |
5349 | ///// | |
5350 | ||
5351 | == Install and use == | |
5352 | ||
5353 | * Version 1.6.3 of http://people.ssh.com/mtr/genscript[GNU Enscript] | |
5354 | ** Copy all files to `/usr/share/enscript/hl/` or `.enscript/` in your home directory. | |
5355 | ** Invoke `enscript` with `--highlight=sml_simple` (or `--highlight=sml_verbose` or `--highlight=sml_fancy` or `--highlight=sml_gaudy`). | |
5356 | ||
5357 | * Version 1.6.1 of http://people.ssh.com/mtr/genscript[GNU Enscript] | |
5358 | ** Append <!ViewGitFile(mlton,master,ide/enscript/sml_all.st)> to `/usr/share/enscript/enscript.st` | |
5359 | ** Invoke `enscript` with `--pretty-print=sml_simple` (or `--pretty-print=sml_verbose` or `--pretty-print=sml_fancy` or `--pretty-print=sml_gaudy`). | |
5360 | ||
5361 | == Feedback == | |
5362 | ||
5363 | Comments and suggestions should be directed to <:MatthewFluet:>. | |
5364 | ||
5365 | <<< | |
5366 | ||
5367 | :mlton-guide-page: EqualityType | |
5368 | [[EqualityType]] | |
5369 | EqualityType | |
5370 | ============ | |
5371 | ||
5372 | An equality type is a type to which <:PolymorphicEquality:> can be | |
5373 | applied. The <:DefinitionOfStandardML:Definition> and the | |
5374 | <:BasisLibrary:Basis Library> precisely spell out which types are | |
5375 | equality types. | |
5376 | ||
5377 | * `bool`, `char`, `IntInf.int`, ++Int__<N>__.int++, `string`, and ++Word__<N>__.word++ are equality types. | |
5378 | ||
5379 | * for any `t`, both `t array` and `t ref` are equality types. | |
5380 | ||
5381 | * if `t` is an equality type, then `t list` and `t vector` are equality types. | |
5382 | ||
5383 | * if `t1`, ..., `tn` are equality types, then `t1 * ... * tn` and `{l1: t1, ..., ln: tn}` are equality types. | |
5384 | ||
5385 | * if `t1`, ..., `tn` are equality types and `t` <:AdmitsEquality:>, then `(t1, ..., tn) t` is an equality type. | |
5386 | ||
5387 | To check that a type `t` is an equality type, use the following idiom. | |
5388 | [source,sml] | |
5389 | ---- | |
5390 | structure S: sig eqtype t end = | |
5391 | struct | |
5392 | type t = ... | |
5393 | end | |
5394 | ---- | |
5395 | ||
5396 | Notably, `exn` and `real` are not equality types. Neither is `t1 -> t2`, for any `t1` and `t2`. | |
5397 | ||
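As a concrete instance of the idiom above, the following sketch checks one type that passes and shows, commented out, one that would be rejected. The structure names `CheckOk` and `CheckBad` are made up for illustration.

[source,sml]
----
(* Accepted: int * string list is built only from equality types. *)
structure CheckOk: sig eqtype t end =
   struct
      type t = int * string list
   end

(* Rejected if uncommented: real is not an equality type.
structure CheckBad: sig eqtype t end =
   struct
      type t = real
   end
*)
----
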
5398 | Equality on arrays and ref cells is by identity, not structure. | |
5399 | For example, `ref 13 = ref 13` is `false`. | |
5400 | On the other hand, equality for lists, strings, and vectors is by | |
5401 | structure, not identity. For example, the following equalities hold. | |
5402 | ||
5403 | [source,sml] | |
5404 | ---- | |
5405 | val _ = [1, 2, 3] = 1 :: [2, 3] | |
5406 | val _ = "foo" = concat ["f", "o", "o"] | |
5407 | val _ = Vector.fromList [1, 2, 3] = Vector.tabulate (3, fn i => i + 1) | |
5408 | ---- | |
5409 | ||
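The identity/structure distinction can be sketched as follows. This is only an illustration; the binding names are arbitrary.

[source,sml]
----
val r = ref 13
val sameCell = r = r                  (* true: one cell compared with itself *)
val distinctCells = ref 13 = ref 13   (* false: two separately allocated cells *)
val sameContents = !r = !(ref 13)     (* true: compares the int contents *)

val a = Array.fromList [1, 2, 3]
val sameArray = a = a                               (* true: same array *)
val distinctArrays = a = Array.fromList [1, 2, 3]   (* false: different arrays *)

(* Converting to a vector switches to structural comparison. *)
val sameElements = Array.vector a = Vector.fromList [1, 2, 3]   (* true *)
----
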
5410 | <<< | |
5411 | ||
5412 | :mlton-guide-page: EqualityTypeVariable | |
5413 | [[EqualityTypeVariable]] | |
5414 | EqualityTypeVariable | |
5415 | ==================== | |
5416 | ||
5417 | An equality type variable is a type variable that starts with two or | |
5418 | more primes, as in `''a` or `''b`. The canonical use of equality type | |
5419 | variables is in specifying the type of the <:PolymorphicEquality:> | |
5420 | function, which is `''a * ''a -> bool`. Equality type variables | |
5421 | ensure that polymorphic equality is only used on | |
5422 | <:EqualityType:equality types>, by requiring that at every use of a | |
5423 | polymorphic value, equality type variables are instantiated by | |
5424 | equality types. | |
5425 | ||
5426 | For example, the following program is type correct because polymorphic | |
5427 | equality is applied to variables of type `''a`. | |
5428 | ||
5429 | [source,sml] | |
5430 | ---- | |
5431 | fun f (x: ''a, y: ''a): bool = x = y | |
5432 | ---- | |
5433 | ||
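As a further sketch, a list-membership function (the name `member` is chosen here only for illustration) can be used at any equality type, because each use instantiates `''a` with an equality type.

[source,sml]
----
fun member (x: ''a, ys: ''a list): bool =
   List.exists (fn y => x = y) ys

val t1 = member (2, [1, 2, 3])       (* ''a instantiated to int *)
val t2 = member ("b", ["a", "b"])    (* ''a instantiated to string *)
val t3 = member ([1], [[0], [1]])    (* ''a instantiated to int list *)

(* Rejected if uncommented: real is not an equality type.
val bad = member (1.0, [1.0, 2.0])
*)
----
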
5434 | On the other hand, the following program is not type correct, because | |
5435 | polymorphic equality is applied to variables of type `'a`, which is | |
5436 | not an equality type. | |
5437 | ||
5438 | [source,sml] | |
5439 | ---- | |
5440 | fun f (x: 'a, y: 'a): bool = x = y | |
5441 | ---- | |
5442 | ||
5443 | MLton reports the following error, indicating that polymorphic | |
5444 | equality expects equality types, but didn't get them. | |
5445 | ||
5446 | ---- | |
5447 | Error: z.sml 1.30-1.34. | |
5448 | Function applied to incorrect argument. | |
5449 | expects: [<equality>] * [<equality>] | |
5450 | but got: ['a] * ['a] | |
5451 | in: = (x, y) | |
5452 | ---- | |
5453 | ||
5454 | As an example of using such a function that requires equality types, | |
5455 | suppose that `f` has polymorphic type `''a -> unit`. Then, `f 13` is | |
5456 | type correct because `int` is an equality type. On the other hand, | |
5457 | `f 13.0` and `f (fn x => x)` are not type correct, because `real` and | |
5458 | arrow types are not equality types. We can test these facts with the | |
5459 | following short programs. First, we verify that such an `f` can be | |
5460 | applied to integers. | |
5461 | ||
5462 | [source,sml] | |
5463 | ---- | |
5464 | functor Ok (val f: ''a -> unit): sig end = | |
5465 | struct | |
5466 | val () = f 13 | |
5467 | val () = f 14 | |
5468 | end | |
5469 | ---- | |
5470 | ||
5471 | We can do better, and verify that such an `f` can be applied to | |
5472 | any integer. | |
5473 | ||
5474 | [source,sml] | |
5475 | ---- | |
5476 | functor Ok (val f: ''a -> unit): sig end = | |
5477 | struct | |
5478 | fun g (x: int) = f x | |
5479 | end | |
5480 | ---- | |
5481 | ||
5482 | Even better, we don't need to introduce a dummy function name; we can | |
5483 | use a type constraint. | |
5484 | ||
5485 | [source,sml] | |
5486 | ---- | |
5487 | functor Ok (val f: ''a -> unit): sig end = | |
5488 | struct | |
5489 | val _ = f: int -> unit | |
5490 | end | |
5491 | ---- | |
5492 | ||
5493 | Even better, we can use a signature constraint. | |
5494 | ||
5495 | [source,sml] | |
5496 | ---- | |
5497 | functor Ok (S: sig val f: ''a -> unit end): | |
5498 | sig val f: int -> unit end = S | |
5499 | ---- | |
5500 | ||
5501 | This functor concisely verifies that a function of polymorphic type | |
5502 | `''a -> unit` can be safely used as a function of type `int -> unit`. | |
5503 | ||
5504 | As above, we can verify that such an `f` cannot be used at | |
5505 | non-equality types. | |
5506 | ||
5507 | [source,sml] | |
5508 | ---- | |
5509 | functor Bad (S: sig val f: ''a -> unit end): | |
5510 | sig val f: real -> unit end = S | |
5511 | ||
5512 | functor Bad (S: sig val f: ''a -> unit end): | |
5513 | sig val f: ('a -> 'a) -> unit end = S | |
5514 | ---- | |
5515 | ||
5516 | MLton reports the following errors. | |
5517 | ||
5518 | ---- | |
5519 | Error: z.sml 2.4-2.30. | |
5520 | Variable in structure disagrees with signature (type): f. | |
5521 | structure: val f: [<equality>] -> _ | |
5522 | defn at: z.sml 1.25-1.25 | |
5523 | signature: val f: [real] -> _ | |
5524 | spec at: z.sml 2.12-2.12 | |
5525 | Error: z.sml 5.4-5.36. | |
5526 | Variable in structure disagrees with signature (type): f. | |
5527 | structure: val f: [<equality>] -> _ | |
5528 | defn at: z.sml 4.25-4.25 | |
5529 | signature: val f: [_ -> _] -> _ | |
5530 | spec at: z.sml 5.12-5.12 | |
5531 | ---- | |
5532 | ||
5533 | ||
5534 | == Equality type variables in type and datatype declarations == | |
5535 | ||
5536 | Equality type variables can be used in type and datatype declarations; | |
5537 | however, they play no special role. For example, | |
5538 | ||
5539 | [source,sml] | |
5540 | ---- | |
5541 | type 'a t = 'a * int | |
5542 | ---- | |
5543 | ||
5544 | is completely identical to | |
5545 | ||
5546 | [source,sml] | |
5547 | ---- | |
5548 | type ''a t = ''a * int | |
5549 | ---- | |
5550 | ||
5551 | In particular, such a definition does _not_ require that `t` only be | |
5552 | applied to equality types. | |
5553 | ||
5554 | Similarly, | |
5555 | ||
5556 | [source,sml] | |
5557 | ---- | |
5558 | datatype 'a t = A | B of 'a | |
5559 | ---- | |
5560 | ||
5561 | is completely identical to | |
5562 | ||
5563 | [source,sml] | |
5564 | ---- | |
5565 | datatype ''a t = A | B of ''a | |
5566 | ---- | |
5567 | ||
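A small sketch of what this means in practice: types declared with equality type variables can still be applied to non-equality types such as `real` and function types. The names `pair`, `opt`, `None`, and `Some` are hypothetical and chosen only for this illustration.

[source,sml]
----
type ''a pair = ''a * int
datatype ''a opt = None | Some of ''a

(* Both bindings apply the declared types to non-equality types. *)
val p: real pair = (1.0, 2)
val q: (int -> int) opt = Some (fn x => x + 1)
----
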
5568 | <<< | |
5569 | ||
5570 | :mlton-guide-page: EtaExpansion | |
5571 | [[EtaExpansion]] | |
5572 | EtaExpansion | |
5573 | ============ | |
5574 | ||
5575 | Eta expansion is a simple syntactic change used to work around the | |
5576 | <:ValueRestriction:> in <:StandardML:Standard ML>. | |
5577 | ||
5578 | The eta expansion of an expression `e` is the expression | |
5579 | `fn z => e z`, where `z` does not occur in `e`. This only | |
5580 | makes sense if `e` denotes a function, i.e. is of arrow type. Eta | |
5581 | expansion delays the evaluation of `e` until the function is | |
5582 | applied, and will re-evaluate `e` each time the function is | |
5583 | applied. | |
5584 | ||
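The following sketch illustrates both points: eta expansion turns an application into a syntactic value, so the binding can be generalized, and it delays and repeats the evaluation of the expanded expression. All names are chosen only for this illustration.

[source,sml]
----
val id = fn x => x

(* `id id` is an application, not a syntactic value, so under the value
 * restriction the following binding would not be generalized:
 *   val idid = id id
 *)

(* The eta expansion is a syntactic value and is generalized to 'a -> 'a. *)
val idid = fn z => id id z
val one = idid 1
val s = idid "one"

(* Eta expansion also delays and repeats evaluation. *)
val count = ref 0
fun mk () = (count := !count + 1; fn x => x + 1)
val f = mk ()              (* mk runs once, at this binding *)
val g = fn z => mk () z    (* mk runs at each call of g *)
val _ = g 1
val _ = g 2
val n = !count             (* 3: one call for f, two for g *)
----
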
5585 | The name "eta expansion" comes from the eta-conversion rule of the | |
5586 | <:LambdaCalculus:lambda calculus>. Expansion refers to the | |
5587 | directionality of the equivalence being used, namely taking `e` to | |
5588 | `fn z => e z` rather than `fn z => e z` to `e` (eta | |
5589 | contraction). | |
5590 | ||
5591 | <<< | |
5592 | ||
5593 | :mlton-guide-page: eXene | |
5594 | [[eXene]] | |
5595 | eXene | |
5596 | ===== | |
5597 | ||
5598 | http://people.cs.uchicago.edu/%7Ejhr/eXene/index.html[eXene] is a | |
5599 | multi-threaded X Window System toolkit written in <:ConcurrentML:>. | |
5600 | ||
5601 | There is a group at K-State working toward | |
5602 | http://www.cis.ksu.edu/%7Estough/eXene/[eXene 2.0]. | |
5603 | ||
5604 | <<< | |
5605 | ||
5606 | :mlton-guide-page: FAQ | |
5607 | [[FAQ]] | |
5608 | FAQ | |
5609 | === | |
5610 | ||
5611 | Feel free to ask questions and to update answers by editing this page. | |
5612 | Since we try to make as much information as possible available on the | |
5613 | web site and we like to avoid duplication, many of the answers are | |
5614 | simply links to a web page that answers the question. | |
5615 | ||
5616 | == How do you pronounce MLton? == | |
5617 | ||
5618 | <:Pronounce:> | |
5619 | ||
5620 | == What SML software has been ported to MLton? == | |
5621 | ||
5622 | <:Libraries:> | |
5623 | ||
5624 | == What graphical libraries are available for MLton? == | |
5625 | ||
5626 | <:Libraries:> | |
5627 | ||
5628 | == How does MLton's performance compare to other SML compilers and to other languages? == | |
5629 | ||
5630 | MLton has <:Performance:excellent performance>. | |
5631 | ||
5632 | == Does MLton treat monomorphic arrays and vectors specially? == | |
5633 | ||
5634 | MLton implements monomorphic arrays and vectors (e.g. `BoolArray`, | |
5635 | `Word8Vector`) exactly as instantiations of their polymorphic | |
5636 | counterpart (e.g. `bool array`, `Word8.word vector`). Thus, there is | |
5637 | no need to use the monomorphic versions except when required to | |
5638 | interface with the <:BasisLibrary:Basis Library> or for portability | |
5639 | with other SML implementations. | |
5640 | ||
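As a sketch of what this means for user code, the bindings below mix the monomorphic and polymorphic views of the same vector. They rely on MLton treating `Word8Vector.vector` and `Word8.word vector` as the same type, so other SML implementations may reject them.

[source,sml]
----
(* A Word8Vector.vector used directly as a Word8.word vector. *)
val v: Word8.word vector = Word8Vector.fromList [0w1, 0w2, 0w3]
val n = Vector.length v

(* The result of a polymorphic Vector function used as a Word8Vector.vector. *)
val w: Word8Vector.vector = Vector.map (fn x => x + 0w1) v
----
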
5641 | == Why do I get a Segfault/Bus error in a program that uses `IntInf`/`LargeInt` to calculate numbers with several hundred thousand digits? == | |
5642 | ||
5643 | <:GnuMP:> | |
5644 | ||
5645 | == How can I decrease compile-time memory usage? == | |
5646 | ||
5647 | * Compile with `-verbose 3` to find out if the problem is due to an | |
5648 | SSA optimization pass. If so, compile with ++-disable-pass __pass__++ to | |
5649 | skip that pass. | |
5650 | ||
5651 | * Compile with `@MLton hash-cons 0.5 --`, which will instruct the | |
5652 | runtime to hash cons the heap every other GC. | |
5653 | ||
5654 | * Compile with `-polyvariance false`, which is an undocumented option | |
5655 | that causes less code duplication. | |
5656 | ||
5657 | Also, please <:Contact:> us to let us know the problem to help us | |
5658 | better understand MLton's limitations. | |
5659 | ||
5660 | == How portable is SML code across SML compilers? == | |
5661 | ||
5662 | <:StandardMLPortability:> | |
5663 | ||
5664 | <<< | |
5665 | ||
5666 | :mlton-guide-page: Features | |
5667 | [[Features]] | |
5668 | Features | |
5669 | ======== | |
5670 | ||
5671 | MLton has the following features. | |
5672 | ||
5673 | == Portability == | |
5674 | ||
5675 | * Runs on a variety of platforms. | |
5676 | ||
5677 | ** <:RunningOnARM:ARM>: | |
5678 | *** <:RunningOnLinux:Linux> (Debian) | |
5679 | ||
5680 | ** <:RunningOnAlpha:Alpha>: | |
5681 | *** <:RunningOnLinux:Linux> (Debian) | |
5682 | ||
5683 | ** <:RunningOnAMD64:AMD64>: | |
5684 | *** <:RunningOnDarwin:Darwin> (Mac OS X) | |
5685 | *** <:RunningOnFreeBSD:FreeBSD> | |
5686 | *** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...) | |
5687 | *** <:RunningOnOpenBSD:OpenBSD> | |
5688 | *** <:RunningOnSolaris:Solaris> (10 and above) | |
5689 | ||
5690 | ** <:RunningOnHPPA:HPPA>: | |
5691 | *** <:RunningOnHPUX:HPUX> (11.11 and above) | |
5692 | *** <:RunningOnLinux:Linux> (Debian) | |
5693 | ||
5694 | ** <:RunningOnIA64:IA64>: | |
5695 | *** <:RunningOnHPUX:HPUX> (11.11 and above) | |
5696 | *** <:RunningOnLinux:Linux> (Debian) | |
5697 | ||
5698 | ** <:RunningOnPowerPC:PowerPC>: | |
5699 | *** <:RunningOnAIX:AIX> (5.2 and above) | |
5700 | *** <:RunningOnDarwin:Darwin> (Mac OS X) | |
5701 | *** <:RunningOnLinux:Linux> (Debian, Fedora, ...) | |
5702 | ||
5703 | ** <:RunningOnPowerPC64:PowerPC64>: | |
5704 | *** <:RunningOnAIX:AIX> (5.2 and above) | |
5705 | ||
5706 | ** <:RunningOnS390:S390>: | |
5707 | *** <:RunningOnLinux:Linux> (Debian) | |
5708 | ||
5709 | ** <:RunningOnSparc:Sparc>: | |
5710 | *** <:RunningOnLinux:Linux> (Debian) | |
5711 | *** <:RunningOnSolaris:Solaris> (8 and above) | |
5712 | ||
5713 | ** <:RunningOnX86:X86>: | |
5714 | *** <:RunningOnCygwin:Cygwin>/Windows | |
5715 | *** <:RunningOnDarwin:Darwin> (Mac OS X) | |
5716 | *** <:RunningOnFreeBSD:FreeBSD> | |
5717 | *** <:RunningOnLinux:Linux> (Debian, Fedora, Ubuntu, ...) | |
5718 | *** <:RunningOnMinGW:MinGW>/Windows | |
5719 | *** <:RunningOnNetBSD:NetBSD> | |
5720 | *** <:RunningOnOpenBSD:OpenBSD> | |
5721 | *** <:RunningOnSolaris:Solaris> (10 and above) | |
5722 | ||
5723 | == Robustness == | |
5724 | ||
5725 | * Supports the full SML 97 language as given in <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>. | |
5726 | + | |
5727 | If there is a program that is valid according to the | |
5728 | <:DefinitionOfStandardML:Definition> that is rejected by MLton, or a | |
5729 | program that is invalid according to the | |
5730 | <:DefinitionOfStandardML:Definition> that is accepted by MLton, it is | |
5731 | a bug. For a list of known bugs, see <:UnresolvedBugs:>. | |
5732 | ||
5733 | * A complete implementation of the <:BasisLibrary:Basis Library>. | |
5734 | + | |
5735 | MLton's implementation matches the latest <:BasisLibrary:Basis Library> | |
5736 | http://www.standardml.org/Basis[specification], and includes a | |
5737 | complete implementation of all the required modules, as well as many | |
5738 | of the optional modules. | |
5739 | ||
5740 | * Generates standalone executables. | |
5741 | + | |
5742 | No additional code or libraries are necessary in order to run an | |
5743 | executable, except for the standard shared libraries. MLton can also | |
5744 | generate statically linked executables. | |
5745 | ||
5746 | * Compiles large programs. | |
5747 | + | |
5748 | MLton is sufficiently efficient and robust that it can compile large | |
5749 | programs, including itself (over 190K lines). The distributed version | |
5750 | of MLton was compiled by MLton. | |
5751 | ||
5752 | * Support for large amounts of memory (up to 4G on 32-bit systems; more on 64-bit systems). | |
5753 | ||
5754 | * Support for large array lengths (up to 2^31^-1 on 32-bit systems; up to 2^63^-1 on 64-bit systems). | |
5755 | ||
5756 | * Support for large files, using 64-bit file positions. | |
5757 | ||
5758 | == Performance == | |
5759 | ||
5760 | * Executables have <:Performance:excellent running times>. | |
5761 | ||
5762 | * Generates small executables. | |
5763 | + | |
5764 | MLton takes advantage of whole-program compilation to perform very | |
5765 | aggressive dead-code elimination, which often leads to smaller | |
5766 | executables than with other SML compilers. | |
5767 | ||
5768 | * Untagged and unboxed native integers, reals, and words. | |
5769 | + | |
5770 | In MLton, integers and words are 8 bits, 16 bits, 32 bits, and 64 bits | |
5771 | and arithmetic does not have any overhead due to tagging or boxing. | |
5772 | Also, reals (32-bit and 64-bit) are stored unboxed, avoiding any | |
5773 | overhead due to boxing. | |
5774 | ||
5775 | * Unboxed native arrays. | |
5776 | + | |
5777 | In MLton, an array (or vector) of integers, reals, or words uses the | |
5778 | natural C-like representation. This is fast and supports easy | |
5779 | exchange of data with C. Monomorphic arrays (and vectors) use the | |
5780 | same C-like representations as their polymorphic counterparts. | |
5781 | ||
5782 | * Multiple <:GarbageCollection:garbage collection> strategies. | |
5783 | ||
5784 | * Fast arbitrary precision arithmetic (`IntInf`) based on <:GnuMP:>. | |
5785 | + | |
5786 | For `IntInf` intensive programs, MLton can be an order of magnitude or | |
5787 | more faster than Poly/ML or SML/NJ. | |
5788 | ||
5789 | == Tools == | |
5790 | ||
5791 | * Source-level <:Profiling:> of both time and allocation. | |
5792 | * <:MLLex:> lexer generator | |
5793 | * <:MLYacc:> parser generator | |
5794 | * <:MLNLFFIGen:> foreign-function-interface generator | |
5795 | ||
5796 | == Extensions == | |
5797 | ||
5798 | * A simple and fast C <:ForeignFunctionInterface:> that supports calling from SML to C and from C to SML. | |
5799 | ||
5800 | * The <:MLBasis:ML Basis system> for programming in the very large, separate delivery of library sources, and more. | |
5801 | ||
5802 | * A number of extension libraries that provide useful functionality | |
5803 | that cannot be implemented with the <:BasisLibrary:Basis Library>. | |
5804 | See below for an overview and <:MLtonStructure:> for details. | |
5805 | ||
5806 | ** <:MLtonCont:continuations> | |
5807 | + | |
5808 | MLton supports continuations via `callcc` and `throw`. | |
5809 | ||
5810 | ** <:MLtonFinalizable:finalization> | |
5811 | + | |
5812 | MLton supports finalizable values of arbitrary type. | |
5813 | ||
5814 | ** <:MLtonItimer:interval timers> | |
5815 | + | |
5816 | MLton supports the functionality of the C `setitimer` function. | |
5817 | ||
5818 | ** <:MLtonRandom:random numbers> | |
5819 | + | |
5820 | MLton has functions similar to the C `rand` and `srand` functions, as well as support for access to `/dev/random` and `/dev/urandom`. | |
5821 | ||
5822 | ** <:MLtonRlimit:resource limits> | |
5823 | + | |
5824 | MLton has functions similar to the C `getrlimit` and `setrlimit` functions. | |
5825 | ||
5826 | ** <:MLtonRusage:resource usage> | |
5827 | + | |
5828 | MLton supports a subset of the functionality of the C `getrusage` function. | |
5829 | ||
5830 | ** <:MLtonSignal:signal handlers> | |
5831 | + | |
5832 | MLton supports signal handlers written in SML. Signal handlers run in | |
5833 | a separate MLton thread, and have access to the thread that was | |
5834 | interrupted by the signal. Signal handlers can be used in conjunction | |
5835 | with threads to implement preemptive multitasking. | |
5836 | ||
5837 | ** <:MLtonStructure:size primitive> | |
5838 | + | |
5839 | MLton includes a primitive that returns the size (in bytes) of any | |
5840 | object. This can be useful in understanding the space behavior of a | |
5841 | program. | |
5842 | ||
5843 | ** <:MLtonSyslog:system logging> | |
5844 | + | |
5845 | MLton has a complete interface to the C `syslog` function. | |
5846 | ||
5847 | ** <:MLtonThread:threads> | |
5848 | + | |
5849 | MLton has support for its own threads, upon which either preemptive or | |
5850 | non-preemptive multitasking can be implemented. MLton also has | |
5851 | support for <:ConcurrentML:Concurrent ML> (CML). | |
5852 | ||
5853 | ** <:MLtonWeak:weak pointers> | |
5854 | + | |
5855 | MLton supports weak pointers, which allow the garbage collector to | |
5856 | reclaim objects that it would otherwise be forced to keep. Weak | |
5857 | pointers are also used to provide finalization. | |
5858 | ||
5859 | ** <:MLtonWorld:world save and restore> | |
5860 | + | |
5861 | MLton has a facility for saving the entire state of a computation to a | |
5862 | file and restarting it later. This facility can be used for staging | |
5863 | and for checkpointing computations. It can even be used from within | |
5864 | signal handlers, allowing interrupt driven checkpointing. | |
5865 | ||
5866 | <<< | |
5867 | ||
5868 | :mlton-guide-page: FirstClassPolymorphism | |
5869 | [[FirstClassPolymorphism]] | |
5870 | FirstClassPolymorphism | |
5871 | ====================== | |
5872 | ||
5873 | First-class polymorphism is the ability to treat polymorphic functions | |
5874 | just like other values: pass them as arguments, store them in data | |
5875 | structures, etc. Although <:StandardML:Standard ML> does have | |
5876 | polymorphic functions, it does not support first-class polymorphism. | |
5877 | ||
5878 | For example, the following declares and uses the polymorphic function | |
5879 | `id`. | |
5880 | [source,sml] | |
5881 | ---- | |
5882 | val id = fn x => x | |
5883 | val _ = id 13 | |
5884 | val _ = id "foo" | |
5885 | ---- | |
5886 | ||
5887 | If SML supported first-class polymorphism, we could write the | |
5888 | following. | |
5889 | [source,sml] | |
5890 | ---- | |
5891 | fun useId id = (id 13; id "foo") | |
5892 | ---- | |
5893 | ||
5894 | However, this does not type check. MLton reports the following error. | |
5895 | ---- | |
5896 | Error: z.sml 1.24-1.31. | |
5897 | Function applied to incorrect argument. | |
5898 | expects: [int] | |
5899 | but got: [string] | |
5900 | in: id "foo" | |
5901 | ---- | |
5902 | The error message arises because MLton infers from `id 13` that `id` | |
5903 | accepts an integer argument, but that `id "foo"` is passing a string. | |
5904 | ||
5905 | Using explicit types sheds some light on the problem. | |
5906 | [source,sml] | |
5907 | ---- | |
5908 | fun useId (id: 'a -> 'a) = (id 13; id "foo") | |
5909 | ---- | |
5910 | ||
5911 | On this, MLton reports the following errors. | |
5912 | ---- | |
5913 | Error: z.sml 1.29-1.33. | |
5914 | Function applied to incorrect argument. | |
5915 | expects: ['a] | |
5916 | but got: [int] | |
5917 | in: id 13 | |
5918 | Error: z.sml 1.36-1.43. | |
5919 | Function applied to incorrect argument. | |
5920 | expects: ['a] | |
5921 | but got: [string] | |
5922 | in: id "foo" | |
5923 | ---- | |
5924 | ||
5925 | The errors arise because the argument `id` is _not_ polymorphic; | |
5926 | rather, it is monomorphic, with type `'a -> 'a`. It is perfectly | |
5927 | valid to apply `id` to a value of type `'a`, as in the following. | |
5928 | [source,sml] | |
5929 | ---- | |
5930 | fun useId (id: 'a -> 'a, x: 'a) = id x (* type correct *) | |
5931 | ---- | |
5932 | ||
5933 | So, what is the difference between the type specification on `id` in | |
5934 | the following two declarations? | |
5935 | [source,sml] | |
5936 | ---- | |
5937 | val id: 'a -> 'a = fn x => x | |
5938 | fun useId (id: 'a -> 'a) = (id 13; id "foo") | |
5939 | ---- | |
5940 | ||
5941 | While the type specifications on `id` look identical, they mean | |
5942 | different things. The difference can be made clearer by explicitly | |
5943 | <:TypeVariableScope:scoping the type variables>. | |
5944 | [source,sml] | |
5945 | ---- | |
5946 | val 'a id: 'a -> 'a = fn x => x | |
5947 | fun 'a useId (id: 'a -> 'a) = (id 13; id "foo") (* type error *) | |
5948 | ---- | |
5949 | ||
5950 | In `val 'a id`, the type variable scoping means that for any `'a`, | |
5951 | `id` has type `'a -> 'a`. Hence, `id` can be applied to arguments of | |
5952 | type `int`, `real`, etc. Similarly, in `fun 'a useId`, the scoping | |
5953 | means that `useId` is a polymorphic function that for any `'a` takes a | |
5954 | function of type `'a -> 'a` and does something. Thus, `useId` could | |
5955 | be applied to a function of type `int -> int`, `real -> real`, etc. | |
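||
| As a small illustration of this point, the sketch below gives `useId` a | |
| well-typed body (it also takes the argument `x`, as in the earlier | |
| type-correct example) and applies it at two different instances of `'a`. | |
| The names `fourteen` and `fooBang` are ours, chosen only for illustration. | |
| [source,sml] | |
| ---- | |
| fun 'a useId (id: 'a -> 'a, x: 'a): 'a = id x | |
| (* The caller, not useId, picks the instance of 'a. *) | |
| val fourteen = useId (fn n => n + 1, 13) (* 'a = int *) | |
| val fooBang = useId (fn s => s ^ "!", "foo") (* 'a = string *) | |
| ---- | |
| Within the body of `useId`, however, `id` still has the single type | |
| `'a -> 'a`, so it cannot be applied to both an `int` and a `string`. | |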
5956 | ||
5957 | One could imagine an extension of SML that allowed scoping of type | |
5958 | variables at places other than `fun` or `val` declarations, as in the | |
5959 | following. | |
5960 | ---- | |
5961 | fun useId (id: ('a).'a -> 'a) = (id 13; id "foo") (* not SML *) | |
5962 | ---- | |
5963 | ||
5964 | Such an extension would need to be thought through very carefully, as | |
5965 | it could cause significant complications with <:TypeInference:>, | |
5966 | possibly even undecidability. | |
5967 | ||
5968 | <<< | |
5969 | ||
5970 | :mlton-guide-page: Fixpoints | |
5971 | [[Fixpoints]] | |
5972 | Fixpoints | |
5973 | ========= | |
5974 | ||
5975 | This page discusses a framework that makes it possible to compute | |
5976 | fixpoints over arbitrary products of abstract types. The code is from | |
5977 | an Extended Basis library | |
5978 | (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>). | |
5979 | ||
5980 | First the signature of the framework | |
5981 | (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/generic/tie.sig)>): | |
5982 | [source,sml] | |
5983 | ---- | |
5984 | sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/generic/tie.sig 6:] | |
5985 | ---- | |
5986 | ||
5987 | `fix` is a <:TypeIndexedValues:type-indexed> function. The type-index | |
5988 | parameter to `fix` is called a "witness". To compute fixpoints over | |
5989 | products, one uses the +*`+ operator to combine witnesses. To provide | |
5990 | a fixpoint combinator for an abstract type, one implements a witness | |
5991 | providing a thunk whose instantiation allocates a fresh, mutable proxy | |
5992 | and a procedure for updating the proxy with the solution. Naturally | |
5993 | this means that not all possible ways of computing a fixpoint of a | |
5994 | particular type are possible under the framework. The `pure` | |
5995 | combinator is a generalization of `tier`. The `iso` combinator is | |
5996 | provided for reusing existing witnesses. | |
5997 | ||
5998 | Note that instead of using an infix operator, we could alternatively | |
5999 | employ an interface using <:Fold:>. Also, witnesses are eta-expanded | |
6000 | to work around the <:ValueRestriction:value restriction>, while | |
6001 | maintaining abstraction. | |
6002 | ||
6003 | Here is the implementation | |
6004 | (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/generic/tie.sml)>): | |
6005 | [source,sml] | |
6006 | ---- | |
6007 | sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/generic/tie.sml 6:] | |
6008 | ---- | |
6009 | ||
6010 | Let's then take a look at a couple of additional examples. | |
6011 | ||
6012 | Here is a naive implementation of lazy promises: | |
6013 | [source,sml] | |
6014 | ---- | |
6015 | structure Promise :> sig | |
6016 | type 'a t | |
6017 | val lazy : 'a Thunk.t -> 'a t | |
6018 | val force : 'a t -> 'a | |
6019 | val Y : 'a t Tie.t | |
6020 | end = struct | |
6021 | datatype 'a t' = | |
6022 | EXN of exn | |
6023 | | THUNK of 'a Thunk.t | |
6024 | | VALUE of 'a | |
6025 | type 'a t = 'a t' Ref.t | |
6026 | fun lazy f = ref (THUNK f) | |
6027 | fun force t = | |
6028 | case !t | |
6029 | of EXN e => raise e | |
6030 | | THUNK f => (t := VALUE (f ()) handle e => t := EXN e ; force t) | |
6031 | | VALUE v => v | |
6032 | fun Y ? = Tie.tier (fn () => let | |
6033 | val r = lazy (raising Fix.Fix) | |
6034 | in | |
6035 | (r, r <\ op := o !) | |
6036 | end) ? | |
6037 | end | |
6038 | ---- | |
6039 | ||
6040 | An example use of our naive lazy promises is to implement equally naive | |
6041 | lazy streams: | |
6042 | [source,sml] | |
6043 | ---- | |
6044 | structure Stream :> sig | |
6045 | type 'a t | |
6046 | val cons : 'a * 'a t -> 'a t | |
6047 | val get : 'a t -> ('a * 'a t) Option.t | |
6048 | val Y : 'a t Tie.t | |
6049 | end = struct | |
6050 | datatype 'a t = IN of ('a * 'a t) Option.t Promise.t | |
6051 | fun cons (x, xs) = IN (Promise.lazy (fn () => SOME (x, xs))) | |
6052 | fun get (IN p) = Promise.force p | |
6053 | fun Y ? = Tie.iso Promise.Y (fn IN p => p, IN) ? | |
6054 | end | |
6055 | ---- | |
6056 | ||
6057 | Note that above we make use of the `iso` combinator. Here is a finite | |
6058 | representation of an infinite stream of ones: | |
6059 | ||
6060 | [source,sml] | |
6061 | ---- | |
6062 | val ones = let | |
6063 | open Tie Stream | |
6064 | in | |
6065 | fix Y (fn ones => cons (1, ones)) | |
6066 | end | |
6067 | ---- | |
6068 | ||
6069 | <<< | |
6070 | ||
6071 | :mlton-guide-page: Flatten | |
6072 | [[Flatten]] | |
6073 | Flatten | |
6074 | ======= | |
6075 | ||
6076 | <:Flatten:> is an optimization pass for the <:SSA:> | |
6077 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
6078 | ||
6079 | == Description == | |
6080 | ||
6081 | This pass flattens arguments to <:SSA:> constructors, blocks, and | |
6082 | functions. | |
6083 | ||
6084 | If a tuple is explicitly available at all uses of a function | |
6085 | (resp. block), then: | |
6086 | ||
6087 | * The formals and call sites are changed so that the components of the | |
6088 | tuple are passed. | |
6089 | ||
6090 | * The tuple is reconstructed at the beginning of the body of the | |
6091 | function (resp. block). | |
6092 | ||
6093 | Similarly, if a tuple is explicitly available at all uses of a | |
6094 | constructor, then: | |
6095 | ||
6096 | * The constructor argument datatype is changed to flatten the tuple | |
6097 | type. | |
6098 | ||
6099 | * The tuple is passed flat at each `ConApp`. | |
6100 | ||
6101 | * The tuple is reconstructed at each `Case` transfer target. | |
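||
| As a rough source-level analogy only, the function case can be pictured with | |
| the following sketch. The pass itself rewrites <:SSA:> programs, not SML | |
| source; the names `norm1`, `norm1'`, `r1`, and `r2` are hypothetical, and the | |
| curried function merely stands in for an SSA function with two formals. | |
||
| [source,sml] | |
| ---- | |
| (* Before: each call site explicitly builds the tuple it passes. *) | |
| fun norm1 (p: int * int) = abs (#1 p) + abs (#2 p) | |
| val r1 = norm1 (3, ~4) | |
| (* After: the components are passed separately, and the tuple is | |
| reconstructed at the beginning of the body. *) | |
| fun norm1' (x: int) (y: int) = | |
| let val p = (x, y) in abs (#1 p) + abs (#2 p) end | |
| val r2 = norm1' 3 (~4) | |
| ---- | |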
6102 | ||
6103 | == Implementation == | |
6104 | ||
6105 | * <!ViewGitFile(mlton,master,mlton/ssa/flatten.fun)> | |
6106 | ||
6107 | == Details and Notes == | |
6108 | ||
6109 | {empty} | |
6110 | ||
6111 | <<< | |
6112 | ||
6113 | :mlton-guide-page: Fold | |
6114 | [[Fold]] | |
6115 | Fold | |
6116 | ==== | |
6117 | ||
6118 | This page describes a technique that enables convenient syntax for a | |
6119 | number of language features that are not explicitly supported by | |
6120 | <:StandardML:Standard ML>, including: variable number of arguments, | |
6121 | <:OptionalArguments:optional arguments and labeled arguments>, | |
6122 | <:ArrayLiteral:array and vector literals>, | |
6123 | <:FunctionalRecordUpdate:functional record update>, | |
6124 | and (seemingly) dependently typed functions like <:Printf:printf> and scanf. | |
6125 | ||
6126 | The key idea to _fold_ is to define functions `fold`, `step0`, | |
6127 | and `$` such that the following equation holds. | |
6128 | ||
6129 | [source,sml] | |
6130 | ---- | |
6131 | fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $ | |
6132 | = f (hn (... (h2 (h1 a)))) | |
6133 | ---- | |
6134 | ||
6135 | The name `fold` is used because this is like a traditional list fold, | |
6136 | where `a` is the _base element_, and each _step function_, | |
6137 | `step0 hi`, corresponds to one element of the list and does one | |
6138 | step of the fold. The name `$` is chosen to mean "end of | |
6139 | arguments" from its common use in regular-expression syntax. | |
6140 | ||
6141 | Unlike the usual list fold in which the same function is used to step | |
6142 | over each element in the list, this fold allows the step functions to | |
6143 | be different from each other, and even to be of different types. Also | |
6144 | unlike the usual list fold, this fold includes a "finishing | |
6145 | function", `f`, that is applied to the result of the fold. The | |
6146 | presence of the finishing function may seem odd because there is no | |
6147 | analogue in list fold. However, the finishing function is essential; | |
6148 | without it, there would be no way for the folder to perform an | |
6149 | arbitrary computation after processing all the arguments. The | |
6150 | examples below will make this clear. | |
6151 | ||
6152 | The functions `fold`, `step0`, and `$` are easy to | |
6153 | define. | |
6154 | ||
6155 | [source,sml] | |
6156 | ---- | |
6157 | fun $ (a, f) = f a | |
6158 | fun id x = x | |
6159 | structure Fold = | |
6160 | struct | |
6161 | fun fold (a, f) g = g (a, f) | |
6162 | fun step0 h (a, f) = fold (h a, f) | |
6163 | end | |
6164 | ---- | |
6165 | ||
6166 | We've placed `fold` and `step0` in the `Fold` structure | |
6167 | but left `$` at the toplevel because it is convenient in code to | |
6168 | always have `$` in scope. We've also defined the identity | |
6169 | function, `id`, at the toplevel since we use it so frequently. | |
6170 | ||
6171 | Plugging in the definitions, it is easy to verify the equation from | |
6172 | above. | |
6173 | ||
6174 | [source,sml] | |
6175 | ---- | |
6176 | fold (a, f) (step0 h1) (step0 h2) ... (step0 hn) $ | |
6177 | = step0 h1 (a, f) (step0 h2) ... (step0 hn) $ | |
6178 | = fold (h1 a, f) (step0 h2) ... (step0 hn) $ | |
6179 | = step0 h2 (h1 a, f) ... (step0 hn) $ | |
6180 | = fold (h2 (h1 a), f) ... (step0 hn) $ | |
6181 | ... | |
6182 | = fold (hn (... (h2 (h1 a))), f) $ | |
6183 | = $ (hn (... (h2 (h1 a))), f) | |
6184 | = f (hn (... (h2 (h1 a)))) | |
6185 | ---- | |
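||
| As a concrete instance of the equation, the following self-contained sketch | |
| uses two steps. The base value, the step functions, and the name `thirty` | |
| are arbitrary choices made for illustration. | |
||
| [source,sml] | |
| ---- | |
| fun $ (a, f) = f a | |
| structure Fold = | |
| struct | |
| fun fold (a, f) g = g (a, f) | |
| fun step0 h (a, f) = fold (h a, f) | |
| end | |
| (* a = 1, h1 adds 2, h2 multiplies by 10, f = Int.toString *) | |
| val thirty = | |
| Fold.fold (1, Int.toString) | |
| (Fold.step0 (fn x => x + 2)) | |
| (Fold.step0 (fn x => x * 10)) $ | |
| (* thirty = Int.toString ((1 + 2) * 10) = "30" *) | |
| ---- | |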
6186 | ||
6187 | ||
6188 | == Example: variable number of arguments == | |
6189 | ||
6190 | The simplest example of fold is accepting a variable number of | |
6191 | (curried) arguments. We'll define a function `f` and argument | |
6192 | `a` such that all of the following expressions are valid. | |
6193 | ||
6194 | [source,sml] | |
6195 | ---- | |
6196 | f $ | |
6197 | f a $ | |
6198 | f a a $ | |
6199 | f a a a $ | |
6200 | f a a a ... a a a $ (* as many a's as we want *) | |
6201 | ---- | |
6202 | ||
6203 | Off-hand it may appear impossible that all of the above expressions | |
6204 | are type correct SML -- how can a function `f` accept a variable | |
6205 | number of curried arguments? What could the type of `f` be? | |
6206 | We'll have more to say later on how type checking works. For now, | |
6207 | once we have supplied the definitions below, you can check that the | |
6208 | expressions are type correct by feeding them to your favorite SML | |
6209 | implementation. | |
6210 | ||
6211 | It is simple to define `f` and `a`. We define `f` as a | |
6212 | folder whose base element is `()` and whose finish function does | |
6213 | nothing. We define `a` as the step function that does nothing. | |
6214 | The only trickiness is that we must <:EtaExpansion:eta expand> the | |
6215 | definition of `f` and `a` to work around the <:ValueRestriction:value restriction>; | |
6216 | we frequently use eta expansion for this purpose without mention. | |
6217 | ||
6218 | [source,sml] | |
6219 | ---- | |
6220 | val base = () | |
6221 | fun finish () = () | |
6222 | fun step () = () | |
6223 | val f = fn z => Fold.fold (base, finish) z | |
6224 | val a = fn z => Fold.step0 step z | |
6225 | ---- | |
6226 | ||
6227 | One can easily apply the fold equation to verify by hand that `f` | |
6228 | applied to any number of `a`'s evaluates to `()`. | |
6229 | ||
6230 | [source,sml] | |
6231 | ---- | |
6232 | f a ... a $ | |
6233 | = finish (step (... (step base))) | |
6234 | = finish (step (... ())) | |
6235 | ... | |
6236 | = finish () | |
6237 | = () | |
6238 | ---- | |
6239 | ||
6240 | ||
6241 | == Example: variable-argument sum == | |
6242 | ||
6243 | Let's look at an example that computes something: a variable-argument | |
6244 | function `sum` and a stepper `a` such that | |
6245 | ||
6246 | [source,sml] | |
6247 | ---- | |
6248 | sum (a i1) (a i2) ... (a im) $ = i1 + i2 + ... + im | |
6249 | ---- | |
6250 | ||
6251 | The idea is simple -- the folder starts with a base accumulator of | |
6252 | `0` and the stepper adds each element to the accumulator, `s`, | |
6253 | which the folder simply returns at the end. | |
6254 | ||
6255 | [source,sml] | |
6256 | ---- | |
6257 | val sum = fn z => Fold.fold (0, fn s => s) z | |
6258 | fun a i = Fold.step0 (fn s => i + s) | |
6259 | ---- | |
6260 | ||
6261 | Using the fold equation, one can verify the following. | |
6262 | ||
6263 | [source,sml] | |
6264 | ---- | |
6265 | sum (a 1) (a 2) (a 3) $ = 6 | |
6266 | ---- | |
6267 | ||
6268 | ||
6269 | == Step1 == | |
6270 | ||
6271 | It is sometimes syntactically convenient to omit the parentheses | |
6272 | around the steps in a fold. This is easily done by defining a new | |
6273 | function, `step1`, as follows. | |
6274 | ||
6275 | [source,sml] | |
6276 | ---- | |
6277 | structure Fold = | |
6278 | struct | |
6279 | open Fold | |
6280 | fun step1 h (a, f) b = fold (h (b, a), f) | |
6281 | end | |
6282 | ---- | |
6283 | ||
6284 | From the definition of `step1`, we have the following | |
6285 | equivalence. | |
6286 | ||
6287 | [source,sml] | |
6288 | ---- | |
6289 | fold (a, f) (step1 h) b | |
6290 | = step1 h (a, f) b | |
6291 | = fold (h (b, a), f) | |
6292 | ---- | |
6293 | ||
6294 | Using the above equivalence, we can compute the following equation for | |
6295 | `step1`. | |
6296 | ||
6297 | [source,sml] | |
6298 | ---- | |
6299 | fold (a, f) (step1 h1) b1 (step1 h2) b2 ... (step1 hn) bn $ | |
6300 | = fold (h1 (b1, a), f) (step1 h2) b2 ... (step1 hn) bn $ | |
6301 | = fold (h2 (b2, h1 (b1, a)), f) ... (step1 hn) bn $ | |
6302 | = fold (hn (bn, ... (h2 (b2, h1 (b1, a)))), f) $ | |
6303 | = f (hn (bn, ... (h2 (b2, h1 (b1, a))))) | |
6304 | ---- | |
6305 | ||
6306 | Here is an example using `step1` to define a variable-argument | |
6307 | product function, `prod`, with a convenient syntax. | |
6308 | ||
6309 | [source,sml] | |
6310 | ---- | |
6311 | val prod = fn z => Fold.fold (1, fn p => p) z | |
6312 | val ` = fn z => Fold.step1 (fn (i, p) => i * p) z | |
6313 | ---- | |
6314 | ||
6315 | The functions `prod` and +`+ satisfy the following equation. | |
6316 | [source,sml] | |
6317 | ---- | |
6318 | prod `i1 `i2 ... `im $ = i1 * i2 * ... * im | |
6319 | ---- | |
6320 | ||
6321 | Note that in SML, +`i1+ is two different tokens, +`+ and | |
6322 | `i1`. We often use +`+ for an instance of a `step1` function | |
6323 | because of its syntactic unobtrusiveness and because no space is | |
6324 | required to separate it from an alphanumeric token. | |
6325 | ||
6326 | Also note that there are no parentheses around the steps. That is, | |
6327 | the following expression is not the same as the above one (in fact, it | |
6328 | is not type correct). | |
6329 | ||
6330 | [source,sml] | |
6331 | ---- | |
6332 | prod (`i1) (`i2) ... (`im) $ | |
6333 | ---- | |
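||
| By contrast, the unparenthesized form is accepted. The following | |
| self-contained sketch repeats the definitions from above and evaluates two | |
| instances of the `prod` equation; the names `twentyFour` and `one` are ours. | |
||
| [source,sml] | |
| ---- | |
| fun $ (a, f) = f a | |
| structure Fold = | |
| struct | |
| fun fold (a, f) g = g (a, f) | |
| fun step1 h (a, f) b = fold (h (b, a), f) | |
| end | |
| val prod = fn z => Fold.fold (1, fn p => p) z | |
| val ` = fn z => Fold.step1 (fn (i, p) => i * p) z | |
| val twentyFour = prod `2 `3 `4 $ (* 2 * 3 * 4 = 24 *) | |
| val one = prod $ (* no steps: the base element, 1 *) | |
| ---- | |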
6334 | ||
6335 | ||
6336 | == Example: list literals == | |
6337 | ||
6338 | SML already has a syntax for list literals, e.g. `[w, x, y, z]`. | |
6339 | However, using fold, we can define our own syntax. | |
6340 | ||
6341 | [source,sml] | |
6342 | ---- | |
6343 | val list = fn z => Fold.fold ([], rev) z | |
6344 | val ` = fn z => Fold.step1 (op ::) z | |
6345 | ---- | |
6346 | ||
6347 | The idea is that the folder starts out with the empty list, the steps | |
6348 | accumulate the elements into a list, and then the finishing function | |
6349 | reverses the list at the end. | |
6350 | ||
6351 | With these definitions one can write a list like: | |
6352 | ||
6353 | [source,sml] | |
6354 | ---- | |
6355 | list `w `x `y `z $ | |
6356 | ---- | |
6357 | ||
6358 | While the example is not practically useful, it does demonstrate the | |
6359 | need for the finishing function to be incorporated in `fold`. | |
6360 | Without a finishing function, every use of `list` would need to be | |
6361 | wrapped in `rev`, as follows. | |
6362 | ||
6363 | [source,sml] | |
6364 | ---- | |
6365 | rev (list `w `x `y `z $) | |
6366 | ---- | |
6367 | ||
6368 | The finishing function allows us to incorporate the reversal into the | |
6369 | definition of `list`, and to treat `list` as a truly variable | |
6370 | argument function, performing an arbitrary computation after receiving | |
6371 | all of its arguments. | |
6372 | ||
6373 | See <:ArrayLiteral:> for a similar use of `fold` that provides a | |
6374 | syntax for array and vector literals, which are not built in to SML. | |
6375 | ||
6376 | ||
6377 | == Fold right == | |
6378 | ||
6379 | Just as `fold` is analogous to a fold left, in which the functions | |
6380 | are applied to the accumulator left-to-right, we can define a variant | |
6381 | of `fold` that is analogous to a fold right, in which the | |
6382 | functions are applied to the accumulator right-to-left. That is, we | |
6383 | can define functions `foldr` and `step0` such that the | |
6384 | following equation holds. | |
6385 | ||
6386 | [source,sml] | |
6387 | ---- | |
6388 | foldr (a, f) (step0 h1) (step0 h2) ... (step0 hn) $ | |
6389 | = f (h1 (h2 (... (hn a)))) | |
6390 | ---- | |
6391 | ||
6392 | The implementation of fold right is easy, using fold. The idea is for | |
6393 | the fold to start with `f` and for each step to precompose the | |
6394 | next `hi`. Then, the finisher applies the composed function to | |
6395 | the base value, `a`. Here is the code. | |
6396 | ||
6397 | [source,sml] | |
6398 | ---- | |
6399 | structure Foldr = | |
6400 | struct | |
6401 | fun foldr (a, f) = Fold.fold (f, fn g => g a) | |
6402 | fun step0 h = Fold.step0 (fn g => g o h) | |
6403 | end | |
6404 | ---- | |
6405 | ||
6406 | Verifying the fold-right equation is straightforward, using the | |
6407 | fold-left equation. | |
6408 | ||
6409 | [source,sml] | |
6410 | ---- | |
6411 | foldr (a, f) (Foldr.step0 h1) (Foldr.step0 h2) ... (Foldr.step0 hn) $ | |
6412 | = fold (f, fn g => g a) | |
6413 | (Fold.step0 (fn g => g o h1)) | |
6414 | (Fold.step0 (fn g => g o h2)) | |
6415 | ... | |
6416 | (Fold.step0 (fn g => g o hn)) $ | |
6417 | = (fn g => g a) | |
6418 | ((fn g => g o hn) (... ((fn g => g o h2) ((fn g => g o h1) f)))) | |
6419 | = (fn g => g a) | |
6420 | ((fn g => g o hn) (... ((fn g => g o h2) (f o h1)))) | |
6421 | = (fn g => g a) ((fn g => g o hn) (... (f o h1 o h2))) | |
6422 | = (fn g => g a) (f o h1 o h2 o ... o hn) | |
6423 | = (f o h1 o h2 o ... o hn) a | |
6424 | = f (h1 (h2 (... (hn a)))) | |
6425 | ---- | |
6426 | ||
6427 | One can also define the fold-right analogue of `step1`. | |
6428 | ||
6429 | [source,sml] | |
6430 | ---- | |
6431 | structure Foldr = | |
6432 | struct | |
6433 | open Foldr | |
6434 | fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a))) | |
6435 | end | |
6436 | ---- | |
6437 | ||
6438 | ||
6439 | == Example: list literals via fold right == | |
6440 | ||
6441 | Revisiting the list literal example from earlier, we can use fold | |
6442 | right to define a syntax for list literals that doesn't do a reversal. | |
6443 | ||
6444 | [source,sml] | |
6445 | ---- | |
6446 | val list = fn z => Foldr.foldr ([], fn l => l) z | |
6447 | val ` = fn z => Foldr.step1 (op ::) z | |
6448 | ---- | |
6449 | ||
6450 | As before, with these definitions, one can write a list like: | |
6451 | ||
6452 | [source,sml] | |
6453 | ---- | |
6454 | list `w `x `y `z $ | |
6455 | ---- | |
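||
| To see the order of accumulation concretely, here is a self-contained sketch | |
| that repeats the needed definitions and builds a three-element list. The | |
| name `xs` and the element values are chosen only for illustration. | |
||
| [source,sml] | |
| ---- | |
| fun $ (a, f) = f a | |
| structure Fold = | |
| struct | |
| fun fold (a, f) g = g (a, f) | |
| fun step1 h (a, f) b = fold (h (b, a), f) | |
| end | |
| structure Foldr = | |
| struct | |
| fun foldr (a, f) = Fold.fold (f, fn g => g a) | |
| fun step1 h = Fold.step1 (fn (b, g) => g o (fn a => h (b, a))) | |
| end | |
| val list = fn z => Foldr.foldr ([], fn l => l) z | |
| val ` = fn z => Foldr.step1 (op ::) z | |
| (* The steps accumulate the function | |
| (fn l => l) o (fn a => 1 :: a) o (fn a => 2 :: a) o (fn a => 3 :: a), | |
| which the finisher applies to []. *) | |
| val xs = list `1 `2 `3 $ | |
| ---- | |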
6456 | ||
6457 | The difference between the fold-left and fold-right approaches is that | |
6458 | the fold-right approach does not have to reverse the list at the end, | |
6459 | since it accumulates the elements in the correct order. In practice, | |
6460 | MLton will simplify away all of the intermediate function composition, | |
6461 | so the fold-right approach will be more efficient. | |
6462 | ||
6463 | ||
6464 | == Mixing steppers == | |
6465 | ||
6466 | All of the examples so far have used the same step function throughout | |
6467 | a fold. This need not be the case. For example, consider the | |
6468 | following. | |
6469 | ||
6470 | [source,sml] | |
6471 | ---- | |
6472 | val n = fn z => Fold.fold (0, fn i => i) z | |
6473 | val I = fn z => Fold.step0 (fn i => i * 2 + 1) z | |
6474 | val O = fn z => Fold.step0 (fn i => i * 2) z | |
6475 | ---- | |
6476 | ||
6477 | Here we have one folder, `n`, that can be used with two different | |
6478 | steppers, `I` and `O`. By using the fold equation, one can | |
6479 | verify the following equations. | |
6480 | ||
6481 | [source,sml] | |
6482 | ---- | |
6483 | n O $ = 0 | |
6484 | n I $ = 1 | |
6485 | n I O $ = 2 | |
6486 | n I O I $ = 5 | |
6487 | n I I I O $ = 14 | |
6488 | ---- | |
6489 | ||
6490 | That is, we've defined a syntax for writing binary integer constants. | |
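||
| Each step shifts the accumulator left by one bit; `I` appends a one bit and | |
| `O` appends a zero bit. As a self-contained sketch of one of the equations | |
| (the name `five` is ours), consider the following. | |
||
| [source,sml] | |
| ---- | |
| fun $ (a, f) = f a | |
| structure Fold = | |
| struct | |
| fun fold (a, f) g = g (a, f) | |
| fun step0 h (a, f) = fold (h a, f) | |
| end | |
| val n = fn z => Fold.fold (0, fn i => i) z | |
| val I = fn z => Fold.step0 (fn i => i * 2 + 1) z | |
| val O = fn z => Fold.step0 (fn i => i * 2) z | |
| (* n I O I $ = ((0 * 2 + 1) * 2) * 2 + 1 = 5, i.e. binary 101 *) | |
| val five = n I O I $ | |
| ---- | |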
6491 | ||
6492 | Not only can one use different instances of `step0` in the same | |
6493 | fold, one can also intermix uses of `step0` and `step1`. For | |
6494 | example, consider the following. | |
6495 | ||
6496 | [source,sml] | |
6497 | ---- | |
6498 | val n = fn z => Fold.fold (0, fn i => i) z | |
6499 | val O = fn z => Fold.step0 (fn n => n * 8) z | |
6500 | val ` = fn z => Fold.step1 (fn (i, n) => n * 8 + i) z | |
6501 | ---- | |
6502 | ||
6503 | Using the straightforward generalization of the fold equation to mixed | |
6504 | steppers, one can verify the following equations. | |
6505 | ||
6506 | [source,sml] | |
6507 | ---- | |
6508 | n O $ = 0 | |
6509 | n `3 O $ = 24 | |
6510 | n `1 O `7 $ = 71 | |
6511 | ---- | |
6512 | ||
6513 | That is, we've defined a syntax for writing octal integer constants, | |
6514 | with a special syntax, `O`, for the zero digit (admittedly | |
6515 | contrived, since one could just write +`0+ instead of `O`). | |
6516 | ||
6517 | See <:NumericLiteral:> for a practical extension of this approach that | |
6518 | supports numeric constants in any base and of any type. | |
6519 | ||
6520 | ||
6521 | == (Seemingly) dependent types == | |
6522 | ||
6523 | A normal list fold always returns the same type no matter what | |
6524 | elements are in the list or how long the list is. Variable-argument | |
6525 | fold is more powerful, because the result type can vary based both on | |
6526 | the arguments that are passed and on their number. This can provide | |
6527 | the illusion of dependent types. | |
6528 | ||
6529 | For example, consider the following. | |
6530 | ||
6531 | [source,sml] | |
6532 | ---- | |
6533 | val f = fn z => Fold.fold ((), id) z | |
6534 | val a = fn z => Fold.step0 (fn () => "hello") z | |
6535 | val b = fn z => Fold.step0 (fn () => 13) z | |
6536 | val c = fn z => Fold.step0 (fn () => (1, 2)) z | |
6537 | ---- | |
6538 | ||
6539 | Using the fold equation, one can verify the following equations. | |
6540 | ||
6541 | [source,sml] | |
6542 | ---- | |
6543 | f a $ = "hello": string | |
6544 | f b $ = 13: int | |
6545 | f c $ = (1, 2): int * int | |
6546 | ---- | |
6547 | ||
6548 | That is, `f` returns a value of a different type depending on | |
6549 | whether it is applied to argument `a`, argument `b`, or | |
6550 | argument `c`. | |
6551 | ||
6552 | The following example shows how the type of a fold can depend on the | |
6553 | number of arguments. | |
6554 | ||
6555 | [source,sml] | |
6556 | ---- | |
6557 | val grow = fn z => Fold.fold ([], fn l => l) z | |
6558 | val a = fn z => Fold.step0 (fn x => [x]) z | |
6559 | ---- | |
6560 | ||
6561 | Using the fold equation, one can verify the following equations. | |
6562 | ||
6563 | [source,sml] | |
6564 | ---- | |
6565 | grow $ = []: 'a list | |
6566 | grow a $ = [[]]: 'a list list | |
6567 | grow a a $ = [[[]]]: 'a list list list | |
6568 | ---- | |
6569 | ||
6570 | Clearly, the result type of a call to the variable argument `grow` | |
6571 | function depends on the number of arguments that are passed. | |
6572 | ||
6573 | As a reminder, this is well-typed SML. You can check it out in any | |
6574 | implementation. | |
6575 | ||
6576 | ||
6577 | == (Seemingly) dependently-typed functional results == | |
6578 | ||
6579 | Fold is especially useful when it returns a curried function whose | |
6580 | arity depends on the number of arguments. For example, consider the | |
6581 | following. | |
6582 | ||
6583 | [source,sml] | |
6584 | ---- | |
6585 | val makeSum = fn z => Fold.fold (id, fn f => f 0) z | |
6586 | val I = fn z => Fold.step0 (fn f => fn i => fn x => f (x + i)) z | |
6587 | ---- | |
6588 | ||
6589 | The `makeSum` folder constructs a function whose arity depends on | |
6590 | the number of `I` arguments and that adds together all of its | |
6591 | arguments. For example, | |
6592 | `makeSum I $` is of type `int -> int` and | |
6593 | `makeSum I I $` is of type `int -> int -> int`. | |
6594 | ||
6595 | One can use the fold equation to verify that `makeSum` works | |
6596 | correctly. For example, one can easily check by hand the following | |
6597 | equations. | |
6598 | ||
6599 | [source,sml] | |
6600 | ---- | |
6601 | makeSum I $ 1 = 1 | |
6602 | makeSum I I $ 1 2 = 3 | |
6603 | makeSum I I I $ 1 2 3 = 6 | |
6604 | ---- | |
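||
| To make the two-argument case concrete, the following self-contained sketch | |
| repeats the definitions and names the constructed function. The names | |
| `add2` and `three` are ours, and the comment traces the evaluation by hand. | |
||
| [source,sml] | |
| ---- | |
| fun $ (a, f) = f a | |
| fun id x = x | |
| structure Fold = | |
| struct | |
| fun fold (a, f) g = g (a, f) | |
| fun step0 h (a, f) = fold (h a, f) | |
| end | |
| val makeSum = fn z => Fold.fold (id, fn f => f 0) z | |
| val I = fn z => Fold.step0 (fn f => fn i => fn x => f (x + i)) z | |
| (* Write h for the function passed to step0. Then | |
| makeSum I I $ 1 2 | |
| = h (h id) 0 1 2 | |
| = h id (1 + 0) 2 | |
| = id (2 + 1) | |
| = 3 *) | |
| val add2: int -> int -> int = makeSum I I $ | |
| val three = add2 1 2 | |
| ---- | |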
6605 | ||
6606 | Returning a function becomes especially interesting when there are | |
6607 | steppers of different types. For example, the following `makeSum` | |
6608 | folder constructs functions that sum integers and reals. | |
6609 | ||
6610 | [source,sml] | |
6611 | ---- | |
6612 | val makeSum = fn z => Foldr.foldr (id, fn f => f 0.0) z | |
6613 | val I = fn z => Foldr.step0 (fn f => fn x => fn i => f (x + real i)) z | |
6614 | val R = fn z => Foldr.step0 (fn f => fn x: real => fn r => f (x + r)) z | |
6615 | ---- | |
6616 | ||
6617 | With these definitions, `makeSum I R $` is of type | |
6618 | `int -> real -> real` and `makeSum R I I $` is of type | |
6619 | `real -> int -> int -> real`. One can use the foldr equation to | |
6620 | check the following equations. | |
6621 | ||
6622 | [source,sml] | |
6623 | ---- | |
6624 | makeSum I $ 1 = 1.0 | |
6625 | makeSum I R $ 1 2.5 = 3.5 | |
6626 | makeSum R I I $ 1.5 2 3 = 6.5 | |
6627 | ---- | |
6628 | ||
6629 | We used `foldr` instead of `fold` for this so that the order | |
6630 | in which the specifiers `I` and `R` appear is the same as the | |
6631 | order in which the arguments appear. Had we used `fold`, things | |
6632 | would have been reversed. | |
6633 | ||
6634 | An extension of this idea is sufficient to define <:Printf:>-like | |
6635 | functions in SML. | |
6636 | ||
6637 | ||
6638 | == An idiom for combining steps == | |
6639 | ||
6640 | It is sometimes useful to combine a number of steps together and name | |
6641 | them as a single step. As a simple example, suppose that one often | |
6642 | sees an integer followed by a real in the `makeSum` example above. | |
6643 | One can define a new _compound step_ `IR` as follows. | |
6644 | ||
6645 | [source,sml] | |
6646 | ---- | |
6647 | val IR = fn u => Fold.fold u I R | |
6648 | ---- | |
6649 | ||
6650 | With this definition in place, one can verify the following. | |
6651 | ||
6652 | [source,sml] | |
6653 | ---- | |
6654 | makeSum IR IR $ 1 2.2 3 4.4 = 10.6 | |
6655 | ---- | |
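
As a hedged, self-contained check of the compound step, the sketch
below repeats the integer-and-real `makeSum` and adds `IR`. The
definition of `Foldr` in terms of `Fold` is an assumption here, since it
is given outside this section, as are the prelude definitions of `$` and
`id`. The helper `near` is hypothetical; it is used because `real` is
not an equality type and the sum need not be exactly representable.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* Assumed definition of Foldr in terms of Fold. *)
structure Foldr =
   struct
      fun foldr (c, f) = Fold.fold (id, fn g => f (g c))
      fun step0 h = Fold.step0 (fn g => g o h)
   end

val makeSum = fn z => Foldr.foldr (id, fn f => f 0.0) z
val I = fn z => Foldr.step0 (fn f => fn x => fn i => f (x + real i)) z
val R = fn z => Foldr.step0 (fn f => fn x: real => fn r => f (x + r)) z

(* The compound step: an integer followed by a real. *)
val IR = fn u => Fold.fold u I R

(* Hypothetical helper: compare reals within a small tolerance. *)
fun near (x: real, y: real) = Real.abs (x - y) < 1E~9

val sum = makeSum IR IR $ 1 2.2 3 4.4
val true = near (sum, 10.6)
----

Because `IR` is eta expanded, it is polymorphic in the same way as `I`
and `R`, and so can appear at any position in the argument list.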
6656 | ||
6657 | In general, one can combine steps `s1`, `s2`, ... `sn` as | |
6658 | ||
6659 | [source,sml] | |
6660 | ---- | |
6661 | fn u => Fold.fold u s1 s2 ... sn | |
6662 | ---- | |
6663 | ||
6664 | The following calculation shows why a compound step behaves as the | |
6665 | composition of its constituent steps. | |
6666 | ||
6667 | [source,sml] | |
6668 | ---- | |
6669 | fold u (fn u => fold u s1 s2 ... sn) | |
6670 | = (fn u => fold u s1 s2 ... sn) u | |
6671 | = fold u s1 s2 ... sn | |
6672 | ---- | |
6673 | ||
6674 | ||
6675 | == Post composition == | |
6676 | ||
6677 | Suppose we already have a function defined via fold, | |
6678 | `w = fold (a, f)`, and we would like to construct a new fold | |
6679 | function that is like `w`, but applies `g` to the result | |
6680 | produced by `w`. This is similar to function composition, but we | |
6681 | can't just do `g o w`, because we don't want to use `g` until | |
6682 | `w` has been applied to all of its arguments and received the | |
6683 | end-of-arguments terminator `$`. | |
6684 | ||
6685 | More precisely, we want to define a post-composition function | |
6686 | `post` that satisfies the following equation. | |
6687 | ||
6688 | [source,sml] | |
6689 | ---- | |
6690 | post (w, g) s1 ... sn $ = g (w s1 ... sn $) | |
6691 | ---- | |
6692 | ||
6693 | Here is the definition of `post`. | |
6694 | ||
6695 | [source,sml] | |
6696 | ---- | |
6697 | structure Fold = | |
6698 | struct | |
6699 | open Fold | |
6700 | fun post (w, g) s = w (fn (a, h) => s (a, g o h)) | |
6701 | end | |
6702 | ---- | |
6703 | ||
6704 | The following calculations show that `post` satisfies the desired | |
6705 | equation, where `w = fold (a, f)`. | |
6706 | ||
6707 | [source,sml] | |
6708 | ---- | |
6709 | post (w, g) s | |
6710 | = w (fn (a, h) => s (a, g o h)) | |
6711 | = fold (a, f) (fn (a, h) => s (a, g o h)) | |
6712 | = (fn (a, h) => s (a, g o h)) (a, f) | |
6713 | = s (a, g o f) | |
6714 | = fold (a, g o f) s | |
6715 | ---- | |
6716 | ||
6717 | Now, suppose `si = step0 hi` for `i` from `1` to `n`. | |
6718 | ||
6719 | [source,sml] | |
6720 | ---- | |
6721 | post (w, g) s1 s2 ... sn $ | |
6722 | = fold (a, g o f) s1 s2 ... sn $ | |
6723 | = (g o f) (hn (... (h1 a))) | |
6724 | = g (f (hn (... (h1 a)))) | |
6725 | = g (fold (a, f) s1 ... sn $) | |
6726 | = g (w s1 ... sn $) | |
6727 | ---- | |
6728 | ||
6729 | For a practical example of post composition, see <:ArrayLiteral:>. | |
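
For a smaller illustration, the following sketch post-composes
`Int.toString` onto a folder that counts its steps. The names `count`,
`tick`, and `countString` are hypothetical and chosen only for this
example; the prelude definitions of `$` and `id` are assumed and
repeated so that the block stands alone.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun post (w, g) s = w (fn (a, h) => s (a, g o h))
   end

(* Hypothetical folder that counts its steps. *)
val count = fn z => Fold.fold (0, id) z
val tick = fn z => Fold.step0 (fn n => n + 1) z

(* Post-compose Int.toString onto the finisher of count. *)
val countString = fn z => Fold.post (count, Int.toString) z

(* Each binding raises Bind if the two sides differ. *)
val 3 = count tick tick tick $
val "3" = countString tick tick tick $
val "0" = countString $
----

The last line is the `n = 0` instance of the equation for `post`:
`Int.toString` is applied only once the terminator `$` arrives.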
6730 | ||
6731 | ||
6732 | == Lift == | |
6733 | ||
6734 | We now define a peculiar-looking function, `lift0`, that is, | |
6735 | equationally speaking, equivalent to the identity function on a step | |
6736 | function. | |
6737 | ||
6738 | [source,sml] | |
6739 | ---- | |
6740 | fun lift0 s (a, f) = fold (fold (a, id) s $, f) | |
6741 | ---- | |
6742 | ||
6743 | Using the definitions, we can prove the following equation. | |
6744 | ||
6745 | [source,sml] | |
6746 | ---- | |
6747 | fold (a, f) (lift0 (step0 h)) = fold (a, f) (step0 h) | |
6748 | ---- | |
6749 | ||
6750 | Here is the proof. | |
6751 | ||
6752 | [source,sml] | |
6753 | ---- | |
6754 | fold (a, f) (lift0 (step0 h)) | |
6755 | = lift0 (step0 h) (a, f) | |
6756 | = fold (fold (a, id) (step0 h) $, f) | |
6757 | = fold (step0 h (a, id) $, f) | |
6758 | = fold (fold (h a, id) $, f) | |
6759 | = fold ($ (h a, id), f) | |
6760 | = fold (id (h a), f) | |
6761 | = fold (h a, f) | |
6762 | = step0 h (a, f) | |
6763 | = fold (a, f) (step0 h) | |
6764 | ---- | |
6765 | ||
6766 | If `lift0` is the identity, then why even define it? The answer | |
6767 | lies in the typing of fold expressions, which we have, until now, left | |
6768 | unexplained. | |
6769 | ||
6770 | ||
6771 | == Typing == | |
6772 | ||
6773 | Perhaps the most surprising aspect of fold is that it can be checked | |
6774 | by the SML type system. The types involved in fold expressions are | |
6775 | complex; fortunately type inference is able to deduce them. | |
6776 | Nevertheless, it is instructive to study the types of fold functions | |
6777 | and steppers. More importantly, it is essential to understand the | |
6778 | typing aspects of fold in order to write down signatures of functions | |
6779 | defined using fold and step. | |
6780 | ||
6781 | Here is the `FOLD` signature, and a recapitulation of the entire | |
6782 | `Fold` structure, with additional type annotations. | |
6783 | ||
6784 | [source,sml] | |
6785 | ---- | |
6786 | signature FOLD = | |
6787 | sig | |
6788 | type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd | |
6789 | type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd | |
6790 | type ('a1, 'a2, 'b, 'c, 'd) step0 = | |
6791 | ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step | |
6792 | type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 = | |
6793 | ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step | |
6794 | ||
6795 | val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t | |
6796 | val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0 | |
6797 | -> ('a1, 'a2, 'b, 'c, 'd) step0 | |
6798 | val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2) | |
6799 | -> ('a, 'b, 'c2, 'd) t | |
6800 | val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0 | |
6801 | val step1: ('a11 * 'a12 -> 'a2) | |
6802 | -> ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 | |
6803 | end | |
6804 | ||
6805 | structure Fold:> FOLD = | |
6806 | struct | |
6807 | type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd | |
6808 | ||
6809 | type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd | |
6810 | ||
6811 | type ('a1, 'a2, 'b, 'c, 'd) step0 = | |
6812 | ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step | |
6813 | ||
6814 | type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 = | |
6815 | ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step | |
6816 | ||
6817 | fun fold (a: 'a, f: 'b -> 'c) | |
6818 | (g: ('a, 'b, 'c, 'd) step): 'd = | |
6819 | g (a, f) | |
6820 | ||
6821 | fun step0 (h: 'a1 -> 'a2) | |
6822 | (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t = | |
6823 | fold (h a1, f) | |
6824 | ||
6825 | fun step1 (h: 'a11 * 'a12 -> 'a2) | |
6826 | (a12: 'a12, f: 'b -> 'c) | |
6827 | (a11: 'a11): ('a2, 'b, 'c, 'd) t = | |
6828 | fold (h (a11, a12), f) | |
6829 | ||
6830 | fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0) | |
6831 | (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t = | |
6832 | fold (fold (a, id) s $, f) | |
6833 | ||
6834 | fun post (w: ('a, 'b, 'c1, 'd) t, | |
6835 | g: 'c1 -> 'c2) | |
6836 | (s: ('a, 'b, 'c2, 'd) step): 'd = | |
6837 | w (fn (a, h) => s (a, g o h)) | |
6838 | end | |
6839 | ---- | |
6840 | ||
6841 | That's a lot to swallow, so let's walk through it one step at a time. | |
6842 | First, we have the definition of type `Fold.step`. | |
6843 | ||
6844 | [source,sml] | |
6845 | ---- | |
6846 | type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd | |
6847 | ---- | |
6848 | ||
6849 | As a fold proceeds over its arguments, it maintains two things: the | |
6850 | accumulator, of type `'a`, and the finishing function, of type | |
6851 | `'b -> 'c`. Each step in the fold is a function that takes those | |
6852 | two pieces (i.e. `'a * ('b -> 'c)`) and does something to them | |
6853 | (i.e. produces `'d`). The result type of the step is completely | |
6854 | left open to be filled in by type inference, as it is an arrow type | |
6855 | that is capable of consuming the rest of the arguments to the fold. | |
6856 | ||
6857 | A folder, of type `Fold.t`, is a function that consumes a single | |
6858 | step. | |
6859 | ||
6860 | [source,sml] | |
6861 | ---- | |
6862 | type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd | |
6863 | ---- | |
6864 | ||
6865 | Expanding out the type, we have: | |
6866 | ||
6867 | [source,sml] | |
6868 | ---- | |
6869 | type ('a, 'b, 'c, 'd) t = ('a * ('b -> 'c) -> 'd) -> 'd | |
6870 | ---- | |
6871 | ||
6872 | This shows that the only thing a folder does is to hand its | |
6873 | accumulator (`'a`) and finisher (`'b -> 'c`) to the next step | |
6874 | (`'a * ('b -> 'c) -> 'd`). If SML had <:FirstClassPolymorphism:first-class polymorphism>, | |
6875 | we would write the fold type as follows. | |
6876 | ||
6877 | [source,sml] | |
6878 | ---- | |
6879 | type ('a, 'b, 'c) t = Forall 'd . ('a, 'b, 'c, 'd) step -> 'd | |
6880 | ---- | |
6881 | ||
6882 | This type definition shows that a folder has nothing to do with | |
6883 | the rest of the fold; it only deals with the next step. | |
6884 | ||
6885 | We now can understand the type of `fold`, which takes the initial | |
6886 | value of the accumulator and the finishing function, and constructs a | |
6887 | folder, i.e. a function awaiting the next step. | |
6888 | ||
6889 | [source,sml] | |
6890 | ---- | |
6891 | val fold: 'a * ('b -> 'c) -> ('a, 'b, 'c, 'd) t | |
6892 | fun fold (a: 'a, f: 'b -> 'c) | |
6893 | (g: ('a, 'b, 'c, 'd) step): 'd = | |
6894 | g (a, f) | |
6895 | ---- | |
6896 | ||
6897 | Continuing on, we have the type of step functions. | |
6898 | ||
6899 | [source,sml] | |
6900 | ---- | |
6901 | type ('a1, 'a2, 'b, 'c, 'd) step0 = | |
6902 | ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step | |
6903 | ---- | |
6904 | ||
6905 | Expanding out the type a bit gives: | |
6906 | ||
6907 | [source,sml] | |
6908 | ---- | |
6909 | type ('a1, 'a2, 'b, 'c, 'd) step0 = | |
6910 | 'a1 * ('b -> 'c) -> ('a2, 'b, 'c, 'd) t | |
6911 | ---- | |
6912 | ||
6913 | So, a step function takes the accumulator (`'a1`) and finishing | |
6914 | function (`'b -> 'c`), which will be passed to it by the previous | |
6915 | folder, and transforms them to a new folder. This new folder has a | |
6916 | new accumulator (`'a2`) and the same finishing function. | |
6917 | ||
6918 | Again, imagining that SML had <:FirstClassPolymorphism:first-class polymorphism> makes the type | |
6919 | clearer. | |
6920 | ||
6921 | [source,sml] | |
6922 | ---- | |
6923 | type ('a1, 'a2) step0 = | |
6924 | Forall ('b, 'c) . ('a1, 'b, 'c, ('a2, 'b, 'c) t) step | |
6925 | ---- | |
6926 | ||
6927 | Thus, in essence, a `step0` function is a wrapper around a | |
6928 | function of type `'a1 -> 'a2`, which is exactly what the | |
6929 | definition of `step0` does. | |
6930 | ||
6931 | [source,sml] | |
6932 | ---- | |
6933 | val step0: ('a1 -> 'a2) -> ('a1, 'a2, 'b, 'c, 'd) step0 | |
6934 | fun step0 (h: 'a1 -> 'a2) | |
6935 | (a1: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t = | |
6936 | fold (h a1, f) | |
6937 | ---- | |
6938 | ||
6939 | It is not much beyond `step0` to understand `step1`. | |
6940 | ||
6941 | [source,sml] | |
6942 | ---- | |
6943 | type ('a11, 'a12, 'a2, 'b, 'c, 'd) step1 = | |
6944 | ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c, 'd) t) step | |
6945 | ---- | |
6946 | ||
6947 | A `step1` function takes the accumulator (`'a12`) and finisher | |
6948 | (`'b -> 'c`) passed to it by the previous folder and transforms | |
6949 | them into a function that consumes the next argument (`'a11`) and | |
6950 | produces a folder that will continue the fold with a new accumulator | |
6951 | (`'a2`) and the same finisher. | |
6952 | ||
6953 | [source,sml] | |
6954 | ---- | |
6955 | fun step1 (h: 'a11 * 'a12 -> 'a2) | |
6956 | (a12: 'a12, f: 'b -> 'c) | |
6957 | (a11: 'a11): ('a2, 'b, 'c, 'd) t = | |
6958 | fold (h (a11, a12), f) | |
6959 | ---- | |
6960 | ||
6961 | With <:FirstClassPolymorphism:first-class polymorphism>, a `step1` function is more clearly | |
6962 | seen as a wrapper around a binary function of type | |
6963 | `'a11 * 'a12 -> 'a2`. | |
6964 | ||
6965 | [source,sml] | |
6966 | ---- | |
6967 | type ('a11, 'a12, 'a2) step1 = | |
6968 | Forall ('b, 'c) . ('a12, 'b, 'c, 'a11 -> ('a2, 'b, 'c) t) step | |
6969 | ---- | |
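
As a hedged illustration of `step1` wrapping a binary function, the
sketch below wraps `op ::` to build a list from a variable number of
arguments. The folder `list` and the stepper named by a single backquote
are hypothetical names chosen for this example; the untyped definitions
of `$`, `fold`, and `step1` are repeated so that the block stands alone.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step1 h (a, f) b = fold (h (b, a), f)
   end

(* The accumulator is the list built so far, in reverse order;
 * the finisher rev restores the order of the arguments. *)
val list = fn z => Fold.fold ([], rev) z
val ` = fn z => Fold.step1 (op ::) z

(* Each binding raises Bind if the two sides differ. *)
val [1, 2, 3] = list `1 `2 `3 $
val ["a", "b"] = list `"a" `"b" $
----

Here `'a11` is the element type and both `'a12` and `'a2` are the
corresponding list type, so the accumulator type does not change from
step to step.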
6970 | ||
6971 | The type of `post` is clear: it takes a folder with a finishing | |
6972 | function that produces type `'c1`, and a function of type | |
6973 | `'c1 -> 'c2` to postcompose onto the folder. It returns a new | |
6974 | folder with a finishing function that produces type `'c2`. | |
6975 | ||
6976 | [source,sml] | |
6977 | ---- | |
6978 | val post: ('a, 'b, 'c1, 'd) t * ('c1 -> 'c2) | |
6979 | -> ('a, 'b, 'c2, 'd) t | |
6980 | fun post (w: ('a, 'b, 'c1, 'd) t, | |
6981 | g: 'c1 -> 'c2) | |
6982 | (s: ('a, 'b, 'c2, 'd) step): 'd = | |
6983 | w (fn (a, h) => s (a, g o h)) | |
6984 | ---- | |
6985 | ||
6986 | We will return to `lift0` after an example. | |
6987 | ||
6988 | ||
6989 | == An example typing == | |
6990 | ||
6991 | Let's type check our simplest example, a variable-argument fold. | |
6992 | Recall that we have a folder `f` and a stepper `a` defined as | |
6993 | follows. | |
6994 | ||
6995 | [source,sml] | |
6996 | ---- | |
6997 | val f = fn z => Fold.fold ((), fn () => ()) z | |
6998 | val a = fn z => Fold.step0 (fn () => ()) z | |
6999 | ---- | |
7000 | ||
7001 | Since the accumulator and finisher are uninteresting, we'll use some | |
7002 | abbreviations to simplify things. | |
7003 | ||
7004 | [source,sml] | |
7005 | ---- | |
7006 | type 'd step = (unit, unit, unit, 'd) Fold.step | |
7007 | type 'd fold = 'd step -> 'd | |
7008 | ---- | |
7009 | ||
7010 | With these abbreviations, `f` and `a` have the following polymorphic | |
7011 | types. | |
7012 | ||
7013 | [source,sml] | |
7014 | ---- | |
7015 | f: 'd fold | |
7016 | a: 'd fold step | |
7017 | ---- | |
7018 | ||
7019 | Suppose we want to type check | |
7020 | ||
7021 | [source,sml] | |
7022 | ---- | |
7023 | f a a a $: unit | |
7024 | ---- | |
7025 | ||
7026 | As a reminder, the fully parenthesized expression is | |
7027 | [source,sml] | |
7028 | ---- | |
7029 | (((f a) a) a) $ | |
7030 | ---- | |
7031 | ||
7032 | The observation that we will use repeatedly is that for any type | |
7033 | `z`, if `f: z fold` and `s: z step`, then `f s: z`. | |
7034 | So, if we want | |
7035 | ||
7036 | [source,sml] | |
7037 | ---- | |
7038 | (f a a a) $: unit | |
7039 | ---- | |
7040 | ||
7041 | then we must have | |
7042 | ||
7043 | [source,sml] | |
7044 | ---- | |
7045 | f a a a: unit fold | |
7046 | $: unit step | |
7047 | ---- | |
7048 | ||
7049 | Applying the observation again, we must have | |
7050 | ||
7051 | [source,sml] | |
7052 | ---- | |
7053 | f a a: unit fold fold | |
7054 | a: unit fold step | |
7055 | ---- | |
7056 | ||
7057 | Applying the observation two more times leads to the following type | |
7058 | derivation. | |
7059 | ||
7060 | [source,sml] | |
7061 | ---- | |
7062 | f: unit fold fold fold fold a: unit fold fold fold step | |
7063 | f a: unit fold fold fold a: unit fold fold step | |
7064 | f a a: unit fold fold a: unit fold step | |
7065 | f a a a: unit fold $: unit step | |
7066 | f a a a $: unit | |
7067 | ---- | |
7068 | ||
7069 | So, each application is a fold that consumes the next step, producing | |
7070 | a fold whose type has one fewer `fold` constructor. | |
7071 | ||
7072 | One can expand some of the type definitions in `f` to see that it is | |
7073 | indeed a function that takes four curried arguments, each one a step | |
7074 | function. | |
7075 | ||
7076 | [source,sml] | |
7077 | ---- | |
7078 | f: unit fold fold fold step | |
7079 | -> unit fold fold step | |
7080 | -> unit fold step | |
7081 | -> unit step | |
7082 | -> unit | |
7083 | ---- | |
7084 | ||
7085 | This example shows why we must eta expand uses of `fold` and `step0` | |
7086 | to work around the value restriction and make folders and steppers | |
7087 | polymorphic. The type of a fold function like `f` depends on the | |
7088 | number of arguments, and so will vary from use to use. Similarly, | |
7089 | each occurrence of an argument like `a` has a different type, | |
7090 | depending on the number of remaining arguments. | |
7091 | ||
7092 | This example also shows that the type of a folder, when fully | |
7093 | expanded, is exponential in the number of arguments: there are as many | |
7094 | nested occurrences of the `fold` type constructor as there are | |
7095 | arguments, and each occurrence duplicates its type argument. One can | |
7096 | observe this exponential behavior in a type checker that doesn't share | |
7097 | enough of the representation of types (e.g. one that represents types | |
7098 | as trees rather than directed acyclic graphs). | |
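
The derivation can be replayed by the type checker with explicit
ascriptions. The following sketch is self-contained under two
assumptions: it repeats the untyped definitions of `$`, `fold`, and
`step0`, and it writes the abbreviation `'d step` with `Fold.step`
expanded, because the untyped structure declares no types. The names
`f0` through `f3` are hypothetical.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

val f = fn z => Fold.fold ((), fn () => ()) z
val a = fn z => Fold.step0 (fn () => ()) z

(* The abbreviations from the text, with Fold.step written out. *)
type 'd step = unit * (unit -> unit) -> 'd
type 'd fold = 'd step -> 'd

(* Each line instantiates the polymorphic f, a, or $ at one type. *)
val f0: unit fold fold fold fold = f
val f1: unit fold fold fold = f0 (a: unit fold fold fold step)
val f2: unit fold fold = f1 (a: unit fold fold step)
val f3: unit fold = f2 (a: unit fold step)
val (): unit = f3 ($ : unit step)
----

The space in `$ : unit step` is needed because `$:` would otherwise lex
as a single symbolic identifier.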
7099 | ||
7100 | Generalizing this type derivation to uses of fold where the | |
7101 | accumulator and finisher are more interesting is straightforward. One | |
7102 | simply includes the type of the accumulator, which may change, for | |
7103 | each step, and the type of the finisher, which doesn't change from | |
7104 | step to step. | |
7105 | ||
7106 | ||
7107 | == Typing lift == | |
7108 | ||
7109 | The lack of <:FirstClassPolymorphism:first-class polymorphism> in SML | |
7110 | causes problems if one wants to use a step in a first-class way. | |
7111 | Consider the following `double` function, which takes a step, `s`, and | |
7112 | produces a composite step that does `s` twice. | |
7113 | ||
7114 | [source,sml] | |
7115 | ---- | |
7116 | fun double s = fn u => Fold.fold u s s | |
7117 | ---- | |
7118 | ||
7119 | The definition of `double` is not type correct. The problem is that | |
7120 | the type of a step depends on the number of remaining arguments, but | |
7121 | the parameter `s` is not polymorphic, and so cannot be used in | |
7122 | two different positions. | |
7123 | ||
7124 | Fortunately, we can define a function, `lift0`, that takes a monotyped | |
7125 | step function and _lifts_ it into a polymorphic step function. This | |
7126 | is apparent in the type of `lift0`. | |
7127 | ||
7128 | [source,sml] | |
7129 | ---- | |
7130 | val lift0: ('a1, 'a2, 'a2, 'a2, 'a2) step0 | |
7131 | -> ('a1, 'a2, 'b, 'c, 'd) step0 | |
7132 | fun lift0 (s: ('a1, 'a2, 'a2, 'a2, 'a2) step0) | |
7133 | (a: 'a1, f: 'b -> 'c): ('a2, 'b, 'c, 'd) t = | |
7134 | fold (fold (a, id) s $, f) | |
7135 | ---- | |
7136 | ||
7137 | The following definition of `double` uses `lift0`, appropriately eta | |
7138 | wrapped, to fix the problem. | |
7139 | ||
7140 | [source,sml] | |
7141 | ---- | |
7142 | fun double s = | |
7143 | let | |
7144 | val s = fn z => Fold.lift0 s z | |
7145 | in | |
7146 | fn u => Fold.fold u s s | |
7147 | end | |
7148 | ---- | |
7149 | ||
7150 | With that definition of `double` in place, we can use it as in the | |
7151 | following example. | |
7152 | ||
7153 | [source,sml] | |
7154 | ---- | |
7155 | val f = fn z => Fold.fold ((), fn () => ()) z | |
7156 | val a = fn z => Fold.step0 (fn () => ()) z | |
7157 | val a2 = fn z => double a z | |
7158 | val () = f a a2 a a2 $ | |
7159 | ---- | |
7160 | ||
7161 | Of course, we must eta wrap the call to `double` in order to use its | |
7162 | result, which is a step function, polymorphically. | |
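
Because the unit-valued example cannot show how many times a step ran,
the following sketch applies the same `double` to a hypothetical
counting folder (`count`, `tick`, and `tick2` are names invented for
this example). It assumes the untyped definitions of `$`, `id`, `fold`,
`step0`, and `lift0`, repeated so that the block stands alone.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
      fun lift0 s (a, f) = fold (fold (a, id) s $, f)
   end

fun double s =
   let
      (* Lifting makes the local s polymorphic in the rest of the fold. *)
      val s = fn z => Fold.lift0 s z
   in
      fn u => Fold.fold u s s
   end

(* Hypothetical counting folder, so that the effect of double is visible. *)
val count = fn z => Fold.fold (0, id) z
val tick = fn z => Fold.step0 (fn n => n + 1) z
val tick2 = fn z => double tick z

(* 1 + 2 + 1 + 2 steps in total; raises Bind if the count differs. *)
val 6 = count tick tick2 tick tick2 $
----

Note that `double` forces the step's input and output accumulator types
to coincide, since the same step is applied twice in a row.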
7163 | ||
7164 | ||
7165 | == Hiding the type of the accumulator == | |
7166 | ||
7167 | For clarity and to avoid mistakes, it can be useful to hide the type | |
7168 | of the accumulator in a fold. Reworking the simple variable-argument | |
7169 | example to do this leads to the following. | |
7170 | ||
7171 | [source,sml] | |
7172 | ---- | |
7173 | structure S:> | |
7174 | sig | |
7175 | type ac | |
7176 | val f: (ac, ac, unit, 'd) Fold.t | |
7177 | val s: (ac, ac, 'b, 'c, 'd) Fold.step0 | |
7178 | end = | |
7179 | struct | |
7180 | type ac = unit | |
7181 | val f = fn z => Fold.fold ((), fn () => ()) z | |
7182 | val s = fn z => Fold.step0 (fn () => ()) z | |
7183 | end | |
7184 | ---- | |
7185 | ||
7186 | The idea is to name the accumulator type and use opaque signature | |
7187 | matching to make it abstract. This can prevent improper manipulation | |
7188 | of the accumulator by client code and ensure invariants that the | |
7189 | folder and stepper would like to maintain. | |
7190 | ||
7191 | For a practical example of this technique, see <:ArrayLiteral:>. | |
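
The following variation is a sketch of the same technique with an
accumulator that carries information. The structure `Counter` and its
members `count` and `tick` are hypothetical; the `Fold` structure is
repeated here with only the type abbreviations that the signature
needs.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a

structure Fold =
   struct
      type ('a, 'b, 'c, 'd) step = 'a * ('b -> 'c) -> 'd
      type ('a, 'b, 'c, 'd) t = ('a, 'b, 'c, 'd) step -> 'd
      type ('a1, 'a2, 'b, 'c, 'd) step0 =
         ('a1, 'b, 'c, ('a2, 'b, 'c, 'd) t) step
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

(* Hypothetical counter whose accumulator type is abstract. *)
structure Counter:>
   sig
      type ac
      val count: (ac, ac, int, 'd) Fold.t
      val tick: (ac, ac, 'b, 'c, 'd) Fold.step0
   end =
   struct
      type ac = int
      val count = fn z => Fold.fold (0, fn n => n) z
      val tick = fn z => Fold.step0 (fn n => n + 1) z
   end

(* Raises Bind if the count differs. *)
val 2 = Counter.count Counter.tick Counter.tick $

(* A client-written step over int does not fit, because outside the
 * structure ac is not known to be int. The application below is
 * rejected by the type checker:
 *   val bad = fn z => Fold.step0 (fn n => n + 100) z
 *   val _ = Counter.count bad $
 *)
----

Only steps exported by `Counter` can touch the accumulator, so the
invariant that it equals the number of steps taken cannot be broken by
client code.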
7192 | ||
7193 | ||
7194 | == Also see == | |
7195 | ||
7196 | Fold has a number of practical applications. Here are some of them. | |
7197 | ||
7198 | * <:ArrayLiteral:> | |
7199 | * <:Fold01N:> | |
7200 | * <:FunctionalRecordUpdate:> | |
7201 | * <:NumericLiteral:> | |
7202 | * <:OptionalArguments:> | |
7203 | * <:Printf:> | |
7204 | * <:VariableArityPolymorphism:> | |
7205 | ||
7206 | There are a number of related techniques. Here are some of them. | |
7207 | ||
7208 | * <:StaticSum:> | |
7209 | * <:TypeIndexedValues:> | |
7210 | ||
7211 | <<< | |
7212 | ||
7213 | :mlton-guide-page: Fold01N | |
7214 | [[Fold01N]] | |
7215 | Fold01N | |
7216 | ======= | |
7217 | ||
7218 | A common use pattern of <:Fold:> is to define a variable-arity | |
7219 | function that combines multiple arguments together using a binary | |
7220 | function. It is slightly tricky to do this directly using fold, | |
7221 | because of the special treatment required for the case of zero or one | |
7222 | argument. Here is a structure, `Fold01N`, that solves the problem | |
7223 | once and for all, and eases the definition of such functions. | |
7224 | ||
7225 | [source,sml] | |
7226 | ---- | |
7227 | structure Fold01N = | |
7228 | struct | |
7229 | fun fold {finish, start, zero} = | |
7230 | Fold.fold ((id, finish, fn () => zero, start), | |
7231 | fn (finish, _, p, _) => finish (p ())) | |
7232 | ||
7233 | fun step0 {combine, input} = | |
7234 | Fold.step0 (fn (_, finish, _, f) => | |
7235 | (finish, | |
7236 | finish, | |
7237 | fn () => f input, | |
7238 | fn x' => combine (f input, x'))) | |
7239 | ||
7240 | fun step1 {combine} z input = | |
7241 | step0 {combine = combine, input = input} z | |
7242 | end | |
7243 | ---- | |
7244 | ||
7245 | If one has a value `zero`, and functions `start`, `c`, and `finish`, | |
7246 | then one can define a variable-arity function `f` and stepper | |
7247 | +`+ as follows. | |
7248 | [source,sml] | |
7249 | ---- | |
7250 | val f = fn z => Fold01N.fold {finish = finish, start = start, zero = zero} z | |
7251 | val ` = fn z => Fold01N.step1 {combine = c} z | |
7252 | ---- | |
7253 | ||
7254 | One can then use the fold equation to prove the following equations. | |
7255 | [source,sml] | |
7256 | ---- | |
7257 | f $ = zero | |
7258 | f `a1 $ = finish (start a1) | |
7259 | f `a1 `a2 $ = finish (c (start a1, a2)) | |
7260 | f `a1 `a2 `a3 $ = finish (c (c (start a1, a2), a3)) | |
7261 | ... | |
7262 | ---- | |
7263 | ||
7264 | For an example of `Fold01N`, see <:VariableArityPolymorphism:>. | |
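
As a small, hedged instance of these equations, the sketch below renders
its integer arguments as a bracketed, comma-separated string. The folder
`show` and the particular `finish`, `start`, `zero`, and `combine`
functions are hypothetical choices for illustration; the prelude
definitions of `$`, `id`, and the untyped `Fold` are assumed and
repeated so that the block stands alone.

[source,sml]
----
(* Assumed prelude, repeated so that this block stands alone. *)
fun $ (a, f) = f a
fun id x = x

structure Fold =
   struct
      fun fold (a, f) g = g (a, f)
      fun step0 h (a, f) = fold (h a, f)
   end

structure Fold01N =
   struct
      fun fold {finish, start, zero} =
         Fold.fold ((id, finish, fn () => zero, start),
                    fn (finish, _, p, _) => finish (p ()))

      fun step0 {combine, input} =
         Fold.step0 (fn (_, finish, _, f) =>
                     (finish,
                      finish,
                      fn () => f input,
                      fn x' => combine (f input, x')))

      fun step1 {combine} z input =
         step0 {combine = combine, input = input} z
   end

(* Hypothetical instance: zero is returned as is when there are no
 * arguments; otherwise finish wraps the combined string in brackets. *)
val show =
   fn z => Fold01N.fold {finish = fn s => "[" ^ s ^ "]",
                         start = Int.toString,
                         zero = "[]"} z
val ` =
   fn z => Fold01N.step1 {combine = fn (s, i) => s ^ ", " ^ Int.toString i} z

(* Each binding raises Bind if the two sides differ. *)
val "[]" = show $
val "[1]" = show `1 $
val "[1, 2, 3]" = show `1 `2 `3 $
----

The zero-argument case bypasses `finish` entirely, which is why `zero`
is given here already in its final form.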
7265 | ||
7266 | ||
7267 | == Typing Fold01N == | |
7268 | ||
7269 | Here is the signature for `Fold01N`. We use a trick to avoid having | |
7270 | to duplicate the definition of some rather complex types in both the | |
7271 | signature and the structure. We first define the types in a | |
7272 | structure. Then, we define them via type re-definitions in the | |
7273 | signature, and via `open` in the full structure. | |
7274 | [source,sml] | |
7275 | ---- | |
7276 | structure Fold01N = | |
7277 | struct | |
7278 | type ('input, 'accum1, 'accum2, 'answer, 'zero, | |
7279 | 'a, 'b, 'c, 'd, 'e) t = | |
7280 | (('zero -> 'zero) | |
7281 | * ('accum2 -> 'answer) | |
7282 | * (unit -> 'zero) | |
7283 | * ('input -> 'accum1), | |
7284 | ('a -> 'b) * 'c * (unit -> 'a) * 'd, | |
7285 | 'b, | |
7286 | 'e) Fold.t | |
7287 | ||
7288 | type ('input1, 'accum1, 'input2, 'accum2, | |
7289 | 'a, 'b, 'c, 'd, 'e, 'f) step0 = | |
7290 | ('a * 'b * 'c * ('input1 -> 'accum1), | |
7291 | 'b * 'b * (unit -> 'accum1) * ('input2 -> 'accum2), | |
7292 | 'd, 'e, 'f) Fold.step0 | |
7293 | ||
7294 | type ('accum1, 'input, 'accum2, | |
7295 | 'a, 'b, 'c, 'd, 'e, 'f, 'g) step1 = | |
7296 | ('a, | |
7297 | 'b * 'c * 'd * ('a -> 'accum1), | |
7298 | 'c * 'c * (unit -> 'accum1) * ('input -> 'accum2), | |
7299 | 'e, 'f, 'g) Fold.step1 | |
7300 | end | |
7301 | ||
7302 | signature FOLD_01N = | |
7303 | sig | |
7304 | type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) t = | |
7305 | ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.t | |
7306 | type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step0 = | |
7307 | ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step0 | |
7308 | type ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) step1 = | |
7309 | ('a, 'b, 'c, 'd, 'e, 'f, 'g, 'h, 'i, 'j) Fold01N.step1 | |
7310 | ||
7311 | val fold: | |
7312 | {finish: 'accum2 -> 'answer, | |
7313 | start: 'input -> 'accum1, | |
7314 | zero: 'zero} | |
7315 | -> ('input, 'accum1, 'accum2, 'answer, 'zero, | |
7316 | 'a, 'b, 'c, 'd, 'e) t | |
7317 | ||
7318 | val step0: | |
7319 | {combine: 'accum1 * 'input2 -> 'accum2, | |
7320 | input: 'input1} | |
7321 | -> ('input1, 'accum1, 'input2, 'accum2, | |
7322 | 'a, 'b, 'c, 'd, 'e, 'f) step0 | |
7323 | ||
7324 | val step1: | |
7325 | {combine: 'accum1 * 'input -> 'accum2} | |
7326 | -> ('accum1, 'input, 'accum2, | |
7327 | 'a, 'b, 'c, 'd, 'e, 'f, 'g) step1 | |
7328 | end | |
7329 | ||
7330 | structure Fold01N: FOLD_01N = | |
7331 | struct | |
7332 | open Fold01N | |
7333 | ||
7334 | fun fold {finish, start, zero} = | |
7335 | Fold.fold ((id, finish, fn () => zero, start), | |
7336 | fn (finish, _, p, _) => finish (p ())) | |
7337 | ||
7338 | fun step0 {combine, input} = | |
7339 | Fold.step0 (fn (_, finish, _, f) => | |
7340 | (finish, | |
7341 | finish, | |
7342 | fn () => f input, | |
7343 | fn x' => combine (f input, x'))) | |
7344 | ||
7345 | fun step1 {combine} z input = | |
7346 | step0 {combine = combine, input = input} z | |
7347 | end | |
7348 | ---- | |
7349 | ||
7350 | <<< | |
7351 | ||
7352 | :mlton-guide-page: ForeignFunctionInterface | |
7353 | [[ForeignFunctionInterface]] | |
7354 | ForeignFunctionInterface | |
7355 | ======================== | |
7356 | ||
7357 | MLton's foreign function interface (FFI) extends Standard ML and makes | |
7358 | it easy to take the address of C global objects, access C global | |
7359 | variables, call from SML to C, and call from C to SML. MLton also | |
7360 | provides <:MLNLFFI:ML-NLFFI>, which is a higher-level FFI for calling | |
7361 | C functions and manipulating C data from SML. | |
7362 | ||
7363 | == Overview == | |
7364 | * <:ForeignFunctionInterfaceTypes:Foreign Function Interface Types> | |
7365 | * <:ForeignFunctionInterfaceSyntax:Foreign Function Interface Syntax> | |
7366 | ||
7367 | == Importing Code into SML == | |
7368 | * <:CallingFromSMLToC:Calling From SML To C> | |
7369 | * <:CallingFromSMLToCFunctionPointer:Calling From SML To C Function Pointer> | |
7370 | ||
7371 | == Exporting Code from SML == | |
7372 | * <:CallingFromCToSML:Calling From C To SML> | |
7373 | ||
7374 | == Building System Libraries == | |
7375 | * <:LibrarySupport:Library Support> | |
7376 | ||
7377 | <<< | |
7378 | ||
7379 | :mlton-guide-page: ForeignFunctionInterfaceSyntax | |
7380 | [[ForeignFunctionInterfaceSyntax]] | |
7381 | ForeignFunctionInterfaceSyntax | |
7382 | ============================== | |
7383 | ||
7384 | MLton extends the syntax of SML with expressions that enable a | |
7385 | <:ForeignFunctionInterface:> to C. The following description of the | |
7386 | syntax uses some abbreviations. | |
7387 | ||
7388 | [options="header"] | |
7389 | |==== | |
7390 | | C base type | _cBaseTy_ | <:ForeignFunctionInterfaceTypes: Foreign Function Interface types> | |
7391 | | C argument type | _cArgTy_ | _cBaseTy_~1~ `*` ... `*` _cBaseTy_~n~ or `unit` | |
7392 | | C return type | _cRetTy_ | _cBaseTy_ or `unit` | |
7393 | | C function type | _cFuncTy_ | _cArgTy_ `->` _cRetTy_ | |
7394 | | C pointer type | _cPtrTy_ | `MLton.Pointer.t` | |
7395 | |==== | |
7396 | ||
7397 | The type annotation and the semicolon are not optional in the syntax | |
7398 | of <:ForeignFunctionInterface:> expressions. However, the type is | |
7399 | lexed, parsed, and elaborated as an SML type, so any type (including | |
7400 | type abbreviations) may be used, so long as it elaborates to a type of | |
7401 | the correct form. | |
7402 | ||
7403 | ||
7404 | == Address == | |
7405 | ||
7406 | ---- | |
7407 | _address "CFunctionOrVariableName" attr... : cPtrTy; | |
7408 | ---- | |
7409 | ||
7410 | Denotes the address of the C function or variable. | |
7411 | ||
7412 | `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized: | |
7413 | ||
7414 | * `external` : import with external symbol scope (see <:LibrarySupport:>) (default). | |
7415 | * `private` : import with private symbol scope (see <:LibrarySupport:>). | |
7416 | * `public` : import with public symbol scope (see <:LibrarySupport:>). | |
7417 | ||
7418 | See <:MLtonPointer: MLtonPointer> for functions that manipulate C pointers. | |
7419 | ||
7420 | ||
7421 | == Symbol == | |
7422 | ||
7423 | ---- | |
7424 | _symbol "CVariableName" attr... : (unit -> cBaseTy) * (cBaseTy -> unit); | |
7425 | ---- | |
7426 | ||
7427 | Denotes the _getter_ and _setter_ for a C variable. The __cBaseTy__s | |
7428 | must be identical. | |
7429 | ||
7430 | `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized: | |
7431 | ||
7432 | * `alloc` : allocate storage (and export a symbol) for the C variable. | |
7433 | * `external` : import or export with external symbol scope (see <:LibrarySupport:>) (default if not `alloc`). | |
7434 | * `private` : import or export with private symbol scope (see <:LibrarySupport:>). | |
7435 | * `public` : import or export with public symbol scope (see <:LibrarySupport:>) (default if `alloc`). | |
7436 | ||
7437 | ||
7438 | ---- | |
7439 | _symbol * : cPtrTy -> (unit -> cBaseTy) * (cBaseTy -> unit); | |
7440 | ---- | |
7441 | ||
7442 | Denotes the _getter_ and _setter_ for a C pointer to a variable. | |
7443 | The __cBaseTy__s must be identical. | |
7444 | ||
7445 | ||
7446 | == Import == | |
7447 | ||
7448 | ---- | |
7449 | _import "CFunctionName" attr... : cFuncTy; | |
7450 | ---- | |
7451 | ||
7452 | Denotes an SML function whose behavior is implemented by calling the C | |
7453 | function. See <:CallingFromSMLToC: Calling from SML to C> for more | |
7454 | details. | |
7455 | ||
7456 | `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized: | |
7457 | ||
7458 | * `cdecl` : call with the `cdecl` calling convention (default). | |
7459 | * `external` : import with external symbol scope (see <:LibrarySupport:>) (default). | |
7460 | * `impure`: assert that the function depends upon state and/or performs side effects (default). | |
7461 | * `private` : import with private symbol scope (see <:LibrarySupport:>). | |
7462 | * `public` : import with public symbol scope (see <:LibrarySupport:>). | |
7463 | * `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>). | |
7464 | * `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function. | |
7465 | * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW). | |
7466 | ||
7467 | ||
7468 | ---- | |
7469 | _import * attr... : cPtrTy -> cFuncTy; | |
7470 | ---- | |
7471 | ||
7472 | Denotes an SML function whose behavior is implemented by calling a C | |
7473 | function through a C function pointer. | |
7474 | ||
7475 | `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized: | |
7476 | ||
7477 | * `cdecl` : call with the `cdecl` calling convention (default). | |
7478 | * `impure`: assert that the function depends upon state and/or performs side effects (default). | |
7479 | * `pure`: assert that the function does not depend upon state or perform any side effects; such functions are subject to various optimizations (e.g., <:CommonSubexp:>, <:RemoveUnused:>). | |
7480 | * `reentrant`: assert that the function (directly or indirectly) calls an `_export`-ed SML function. | |
7481 | * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW). | |
7482 | ||
7483 | See | |
7484 | <:CallingFromSMLToCFunctionPointer: Calling from SML to C function pointer> | |
7485 | for more details. | |
7486 | ||
7487 | ||
7488 | == Export == | |
7489 | ||
7490 | ---- | |
7491 | _export "CFunctionName" attr... : cFuncTy -> unit; | |
7492 | ---- | |
7493 | ||
7494 | Exports a C function with the name `CFunctionName` that can be used to | |
7495 | call an SML function of the type _cFuncTy_. When the function denoted | |
7496 | by the export expression is applied to an SML function `f`, subsequent | |
7497 | C calls to `CFunctionName` will call `f`. It is an error to call | |
7498 | `CFunctionName` before the export has been applied. The export may be | |
7499 | applied more than once, with each application replacing any previous | |
7500 | definition of `CFunctionName`. | |
7501 | ||
7502 | `attr...` denotes a (possibly empty) sequence of attributes. The following attributes are recognized: | |
7503 | ||
7504 | * `cdecl` : call with the `cdecl` calling convention (default). | |
7505 | * `private` : export with private symbol scope (see <:LibrarySupport:>). | |
7506 | * `public` : export with public symbol scope (see <:LibrarySupport:>) (default). | |
7507 | * `stdcall` : call with the `stdcall` calling convention (ignored except on Cygwin and MinGW). | |
7508 | ||
7509 | See <:CallingFromCToSML: Calling from C to SML> for more details. | |
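
The export behavior described above can be sketched as follows. The C symbol name `sml_add` and the SML name `exportAdd` are hypothetical, and the code assumes FFI is enabled (for example with `-default-ann 'allowFFI true'`).

[source,sml]
----
(* The export expression denotes a registration function for sml_add. *)
val exportAdd =
   _export "sml_add" : (Int32.int * Int32.int -> Int32.int) -> unit;

(* Until this application runs, calling sml_add from C is an error. *)
val () = exportAdd (fn (x, y) => x + y)

(* A later application replaces the previous definition of sml_add. *)
val () = exportAdd (fn (x, y) => x + y + 1)
----

After the second application, C calls to `sml_add` reach the second function only.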
7510 | ||
7511 | <<< | |
7512 | ||
7513 | :mlton-guide-page: ForeignFunctionInterfaceTypes | |
7514 | [[ForeignFunctionInterfaceTypes]] | |
7515 | ForeignFunctionInterfaceTypes | |
7516 | ============================= | |
7517 | ||
7518 | MLton's <:ForeignFunctionInterface:> only allows values of certain SML | |
7519 | types to be passed between SML and C. The following types are | |
7520 | allowed: `bool`, `char`, `int`, `real`, `word`. All of the different | |
7521 | sizes of (fixed-sized) integers, reals, and words are supported as | |
7522 | well: `Int8.int`, `Int16.int`, `Int32.int`, `Int64.int`, | |
7523 | `Real32.real`, `Real64.real`, `Word8.word`, `Word16.word`, | |
7524 | `Word32.word`, `Word64.word`. There is a special type, | |
7525 | `MLton.Pointer.t`, for passing C pointers -- see <:MLtonPointer:> for | |
7526 | details. | |
7527 | ||
7528 | Arrays, refs, and vectors of the above types are also allowed. | |
7529 | Because MLton represents monomorphic arrays and vectors exactly like | |
7530 | their polymorphic counterparts, the monomorphic forms are allowed too. Hence, | |
7531 | `string`, `char vector`, and `CharVector.vector` are all allowed. | |
7532 | Strings are not null-terminated unless you terminate them manually on the | |
7533 | SML side. | |
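
A minimal sketch of terminating a string on the SML side before handing it to C. The helper name `toCString` is hypothetical. Note that a string which already contains a NUL character would still appear truncated to C code.

[source,sml]
----
(* Hypothetical helper: append a NUL character so that C code can treat
   the data as a conventional C string. *)
fun toCString (s : string) : string = s ^ "\000"

val hello = toCString "hello"
val () = print (Int.toString (size hello) ^ "\n")   (* prints 6 *)
----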
7534 | ||
7535 | Unfortunately, passing tuples or datatypes is not allowed because that | |
7536 | would interfere with representation optimizations. | |
7537 | ||
7538 | The C header file that `-export-header` generates includes | |
7539 | ++typedef++s for the C types corresponding to the SML types. Here is | |
7540 | the mapping between SML types and C types. | |
7541 | ||
7542 | [options="header"] | |
7543 | |==== | |
7544 | | SML type | C typedef | C type | Note | |
7545 | | `array` | `Pointer` | `unsigned char *` | | |
7546 | | `bool` | `Bool` | `int32_t` | | |
7547 | | `char` | `Char8` | `uint8_t` | | |
7548 | | `Int8.int` | `Int8` | `int8_t` | | |
7549 | | `Int16.int` | `Int16` | `int16_t` | | |
7550 | | `Int32.int` | `Int32` | `int32_t` | | |
7551 | | `Int64.int` | `Int64` | `int64_t` | | |
7552 | | `int` | `Int32` | `int32_t` | <:#Default:(default)> | |
7553 | | `MLton.Pointer.t` | `Pointer` | `unsigned char *` | | |
7554 | | `Real32.real` | `Real32` | `float` | | |
7555 | | `Real64.real` | `Real64` | `double` | | |
7556 | | `real` | `Real64` | `double` | <:#Default:(default)> | |
7557 | | `ref` | `Pointer` | `unsigned char *` | | |
7558 | | `string` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)> | |
7559 | | `vector` | `Pointer` | `unsigned char *` | <:#ReadOnly:(read only)> | |
7560 | | `Word8.word` | `Word8` | `uint8_t` | | |
7561 | | `Word16.word` | `Word16` | `uint16_t` | | |
7562 | | `Word32.word` | `Word32` | `uint32_t` | | |
7563 | | `Word64.word` | `Word64` | `uint64_t` | | |
7564 | | `word` | `Word32` | `uint32_t` | <:#Default:(default)> | |
7565 | |==== | |
7566 | ||
7567 | <!Anchor(Default)>Note (default): The default `int`, `real`, and | |
7568 | `word` types may be set by the ++-default-type __type__++ | |
7569 | <:CompileTimeOptions: compiler option>. The given C typedef and C | |
7570 | types correspond to the default behavior. | |
7571 | ||
7572 | <!Anchor(ReadOnly)>Note (read only): Because MLton assumes that | |
7573 | vectors and strings are read-only (and will perform optimizations | |
7574 | that, for instance, cause them to share space), you must not modify | |
7575 | the data pointed to by the `unsigned char *` in C code. | |
7576 | ||
7577 | Although the C type of an array, ref, or vector is always `Pointer`, | |
7578 | in reality, the object has the natural C representation. Your C code | |
7579 | should cast to the appropriate C type if you want to keep the C | |
7580 | compiler from complaining. | |
7581 | ||
7582 | When calling an <:CallingFromSMLToC: imported C function from SML> | |
7583 | that returns an array, ref, or vector result or when calling an | |
7584 | <:CallingFromCToSML: exported SML function from C> that takes an | |
7585 | array, ref, or string argument, then the object must be an ML object | |
7586 | allocated on the ML heap. (Although an array, ref, or vector object | |
7587 | has the natural C representation, the object also has an additional | |
7588 | header used by the SML runtime system.) | |
7589 | ||
7590 | In addition, there is an <:MLBasis:> file, `$(SML_LIB)/basis/c-types.mlb`, | |
7591 | which provides structure aliases for various C types: | |
7592 | ||
7593 | |==== | |
7594 | | C type | Structure | Signature | |
7595 | | `char` | `C_Char` | `INTEGER` | |
7596 | | `signed char` | `C_SChar` | `INTEGER` | |
7597 | | `unsigned char` | `C_UChar` | `WORD` | |
7598 | | `short` | `C_Short` | `INTEGER` | |
7599 | | `signed short` | `C_SShort` | `INTEGER` | |
7600 | | `unsigned short` | `C_UShort` | `WORD` | |
7601 | | `int` | `C_Int` | `INTEGER` | |
7602 | | `signed int` | `C_SInt` | `INTEGER` | |
7603 | | `unsigned int` | `C_UInt` | `WORD` | |
7604 | | `long` | `C_Long` | `INTEGER` | |
7605 | | `signed long` | `C_SLong` | `INTEGER` | |
7606 | | `unsigned long` | `C_ULong` | `WORD` | |
7607 | | `long long` | `C_LongLong` | `INTEGER` | |
7608 | | `signed long long` | `C_SLongLong` | `INTEGER` | |
7609 | | `unsigned long long` | `C_ULongLong` | `WORD` | |
7610 | | `float` | `C_Float` | `REAL` | |
7611 | | `double` | `C_Double` | `REAL` | |
7612 | | `size_t` | `C_Size` | `WORD` | |
7613 | | `ptrdiff_t` | `C_Ptrdiff` | `INTEGER` | |
7614 | | `intmax_t` | `C_Intmax` | `INTEGER` | |
7615 | | `uintmax_t` | `C_UIntmax` | `WORD` | |
7616 | | `intptr_t` | `C_Intptr` | `INTEGER` | |
7617 | | `uintptr_t` | `C_UIntptr` | `WORD` | |
7618 | | `void *` | `C_Pointer` | `WORD` | |
7619 | |==== | |
7620 | ||
7621 | These aliases depend on the configuration of the C compiler for the | |
7622 | target architecture, and are independent of the configuration of MLton | |
7623 | (including the ++-default-type __type__++ | |
7624 | <:CompileTimeOptions: compiler option>). | |
7625 | ||
7626 | <<< | |
7627 | ||
7628 | :mlton-guide-page: ForLoops | |
7629 | [[ForLoops]] | |
7630 | ForLoops | |
7631 | ======== | |
7632 | ||
7633 | A `for`-loop is typically used to iterate over a range of consecutive | |
7634 | integers that denote indices of some sort. For example, in <:OCaml:> | |
7635 | a `for`-loop takes either the form | |
7636 | ---- | |
7637 | for <name> = <lower> to <upper> do <body> done | |
7638 | ---- | |
7639 | or the form | |
7640 | ---- | |
7641 | for <name> = <upper> downto <lower> do <body> done | |
7642 | ---- | |
7643 | ||
7644 | Some languages provide considerably more flexible `for`-loop or | |
7645 | `foreach`-constructs. | |
7646 | ||
7647 | A bit surprisingly, <:StandardML:Standard ML> provides special syntax | |
7648 | for `while`-loops, but not for `for`-loops. Indeed, in SML, many uses | |
7649 | of `for`-loops are better expressed using `app`, `foldl`/`foldr`, | |
7650 | `map` and many other higher-order functions provided by the | |
7651 | <:BasisLibrary:Basis Library> for manipulating lists, vectors and | |
7652 | arrays. However, the Basis Library does not provide a function for | |
7653 | iterating over a range of integer values. Fortunately, it is very | |
7654 | easy to write one. | |
7655 | ||
7656 | ||
7657 | == A fairly simple design == | |
7658 | ||
7659 | The following implementation imitates both the syntax and semantics of | |
7660 | the OCaml `for`-loop. | |
7661 | ||
7662 | [source,sml] | |
7663 | ---- | |
7664 | datatype for = to of int * int | |
7665 | | downto of int * int | |
7666 | ||
7667 | infix to downto | |
7668 | ||
7669 | val for = | |
7670 | fn lo to up => | |
7671 | (fn f => let fun loop lo = if lo > up then () | |
7672 | else (f lo; loop (lo+1)) | |
7673 | in loop lo end) | |
7674 | | up downto lo => | |
7675 | (fn f => let fun loop up = if up < lo then () | |
7676 | else (f up; loop (up-1)) | |
7677 | in loop up end) | |
7678 | ---- | |
7679 | ||
7680 | For example, | |
7681 | ||
7682 | [source,sml] | |
7683 | ---- | |
7684 | for (1 to 9) | |
7685 | (fn i => print (Int.toString i)) | |
7686 | ---- | |
7687 | ||
7688 | would print `123456789` and | |
7689 | ||
7690 | [source,sml] | |
7691 | ---- | |
7692 | for (9 downto 1) | |
7693 | (fn i => print (Int.toString i)) | |
7694 | ---- | |
7695 | ||
7696 | would print `987654321`. | |
7697 | ||
7698 | Straightforward formatting of nested loops | |
7699 | ||
7700 | [source,sml] | |
7701 | ---- | |
7702 | for (a to b) | |
7703 | (fn i => | |
7704 | for (c to d) | |
7705 | (fn j => | |
7706 | ...)) | |
7707 | ---- | |
7708 | ||
7709 | is fairly readable, but tends to cause the body of the loop to be | |
7710 | indented quite deeply. | |
7711 | ||
7712 | ||
7713 | == Off-by-one == | |
7714 | ||
7715 | The above design has an annoying feature. In practice, the upper | |
7716 | bound of the iterated range is almost always excluded and most loops | |
7717 | would subtract one from the upper bound: | |
7718 | ||
7719 | [source,sml] | |
7720 | ---- | |
7721 | for (0 to n-1) ... | |
7722 | for (n-1 downto 0) ... | |
7723 | ---- | |
7724 | ||
7725 | It is probably better to break convention and exclude the upper bound | |
7726 | by default, because it leads to more concise code and becomes | |
7727 | idiomatic with very little practice. The iterator combinators | |
7728 | described below exclude the upper bound by default. | |
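
For illustration, here is a minimal sketch of an ascending loop that excludes the upper bound. The name `forUpTo` is hypothetical and is not part of the combinator library.

[source,sml]
----
(* Hypothetical helper: apply f to lo, lo+1, ..., hi-1; hi is excluded. *)
fun forUpTo (lo : int, hi : int) (f : int -> unit) : unit =
   let
      fun loop i = if i < hi then (f i; loop (i + 1)) else ()
   in
      loop lo
   end

(* Record the visited indices to show that 5 itself is not visited. *)
val visited = ref ([] : int list)
val () = forUpTo (0, 5) (fn i => visited := i :: !visited)
val indices = List.rev (!visited)   (* [0, 1, 2, 3, 4] *)
----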
7729 | ||
7730 | ||
7731 | == Iterator combinators == | |
7732 | ||
7733 | While the simple `for`-function described in the previous section is | |
7734 | probably good enough for many uses, it is a bit cumbersome when one | |
7735 | needs to iterate over a Cartesian product. One might also want to | |
7736 | iterate over more than just consecutive integers. It turns out that | |
7737 | one can provide a library of iterator combinators that allow one to | |
7738 | implement iterators more flexibly. | |
7739 | ||
7740 | Since the types of the combinators may be a bit difficult to infer | |
7741 | from their implementations, let's first take a look at a signature of | |
7742 | the iterator combinator library: | |
7743 | ||
7744 | [source,sml] | |
7745 | ---- | |
7746 | signature ITER = | |
7747 | sig | |
7748 | type 'a t = ('a -> unit) -> unit | |
7749 | ||
7750 | val return : 'a -> 'a t | |
7751 | val >>= : 'a t * ('a -> 'b t) -> 'b t | |
7752 | ||
7753 | val none : 'a t | |
7754 | ||
7755 | val to : int * int -> int t | |
7756 | val downto : int * int -> int t | |
7757 | ||
7758 | val inList : 'a list -> 'a t | |
7759 | val inVector : 'a vector -> 'a t | |
7760 | val inArray : 'a array -> 'a t | |
7761 | ||
7762 | val using : ('a, 'b) StringCvt.reader -> 'b -> 'a t | |
7763 | ||
7764 | val when : 'a t * ('a -> bool) -> 'a t | |
7765 | val by : 'a t * ('a -> 'b) -> 'b t | |
7766 | val @@ : 'a t * 'a t -> 'a t | |
7767 | val ** : 'a t * 'b t -> ('a, 'b) product t | |
7768 | ||
7769 | val for : 'a -> 'a | |
7770 | end | |
7771 | ---- | |
7772 | ||
7773 | Several of the above combinators are meant to be used as infix | |
7774 | operators. Here is a set of suitable infix declarations: | |
7775 | ||
7776 | [source,sml] | |
7777 | ---- | |
7778 | infix 2 to downto | |
7779 | infix 1 @@ when by | |
7780 | infix 0 >>= ** | |
7781 | ---- | |
7782 | ||
7783 | A few notes are in order: | |
7784 | ||
7785 | * The `'a t` type constructor with the `return` and `>>=` operators forms a monad. | |
7786 | ||
7787 | * The `to` and `downto` combinators will omit the upper bound of the range. | |
7788 | ||
7789 | * `for` is the identity function. It is purely for syntactic sugar and is not strictly required. | |
7790 | ||
7791 | * The `@@` combinator produces an iterator for the concatenation of the given iterators. | |
7792 | ||
7793 | * The `**` combinator produces an iterator for the Cartesian product of the given iterators. | |
7794 | ** See <:ProductType:> for the type constructor `('a, 'b) product` used in the type of the iterator produced by `**`. | |
7795 | ||
7796 | * The `using` combinator allows one to iterate over slices, streams and many other kinds of sequences. | |
7797 | ||
7798 | * `when` is the filtering combinator. The name `when` is inspired by <:OCaml:>'s guard clauses. | |
7799 | ||
7800 | * `by` is the mapping combinator. | |
7801 | ||
7802 | The implementation of the `ITER` signature below makes use of the | |
7803 | following basic combinators: | |
7804 | ||
7805 | [source,sml] | |
7806 | ---- | |
7807 | fun const x _ = x | |
7808 | fun flip f x y = f y x | |
7809 | fun id x = x | |
7810 | fun opt fno fso = fn NONE => fno () | SOME ? => fso ? | |
7811 | fun pass x f = f x | |
7812 | ---- | |
7813 | ||
7814 | Here is an implementation of the `ITER` signature: | |
7815 | ||
7816 | [source,sml] | |
7817 | ---- | |
7818 | structure Iter :> ITER = | |
7819 | struct | |
7820 | type 'a t = ('a -> unit) -> unit | |
7821 | ||
7822 | val return = pass | |
7823 | fun (iA >>= a2iB) f = iA (flip a2iB f) | |
7824 | ||
7825 | val none = ignore | |
7826 | ||
7827 | fun (l to u) f = let fun `l = if l<u then (f l; `(l+1)) else () in `l end | |
7828 | fun (u downto l) f = let fun `u = if u>l then (f (u-1); `(u-1)) else () in `u end | |
7829 | ||
7830 | fun inList ? = flip List.app ? | |
7831 | fun inVector ? = flip Vector.app ? | |
7832 | fun inArray ? = flip Array.app ? | |
7833 | ||
7834 | fun using get s f = let fun `s = opt (const ()) (fn (x, s) => (f x; `s)) (get s) in `s end | |
7835 | ||
7836 | fun (iA when p) f = iA (fn a => if p a then f a else ()) | |
7837 | fun (iA by g) f = iA (f o g) | |
7838 | fun (iA @@ iB) f = (iA f : unit; iB f) | |
7839 | fun (iA ** iB) f = iA (fn a => iB (fn b => f (a & b))) | |
7840 | ||
7841 | val for = id | |
7842 | end | |
7843 | ---- | |
7844 | ||
7845 | Note that some of the above combinators (e.g. `**`) could be expressed | |
7846 | in terms of the other combinators, most notably `return` and `>>=`. | |
7847 | Another implementation issue worth mentioning is that `downto` is | |
7848 | written specifically to avoid computing `l-1`, which could cause an | |
7849 | `Overflow`. | |
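
As a sketch of the first point, the Cartesian product can be written using only `return` and `>>=`. The block below is self-contained and redefines the few pieces it needs. The name `prod` is hypothetical and stands in for `**`.

[source,sml]
----
datatype ('a, 'b) product = & of 'a * 'b
infix 0 >>= &

type 'a t = ('a -> unit) -> unit

fun return x f = f x
fun (iA >>= a2iB) f = iA (fn a => a2iB a f)

(* Cartesian product written with the monad operations only. *)
fun prod (iA, iB) = iA >>= (fn a => iB >>= (fn b => return (a & b)))

fun inList xs f = List.app f xs

val acc = ref ([] : (int * string) list)
val () =
   prod (inList [1, 2], inList ["a", "b"])
        (fn i & s => acc := (i, s) :: !acc)
val pairs = List.rev (!acc)
(* [(1, "a"), (1, "b"), (2, "a"), (2, "b")] *)
----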
7850 | ||
7851 | To use the above combinators the `Iter`-structure needs to be opened | |
7852 | ||
7853 | [source,sml] | |
7854 | ---- | |
7855 | open Iter | |
7856 | ---- | |
7857 | ||
7858 | and one usually also wants to declare the infix status of the | |
7859 | operators as shown earlier. | |
7860 | ||
7861 | Here is an example that illustrates some of the features: | |
7862 | ||
7863 | [source,sml] | |
7864 | ---- | |
7865 | for (0 to 10 when (fn x => x mod 3 <> 0) ** inList ["a", "b"] ** 2 downto 1 by real) | |
7866 | (fn x & y & z => | |
7867 | print ("("^Int.toString x^", \""^y^"\", "^Real.toString z^")\n")) | |
7868 | ---- | |
7869 | ||
7870 | Using the `Iter` combinators one can easily produce more complicated | |
7871 | iterators. For example, here is an iterator over a "triangle": | |
7872 | ||
7873 | [source,sml] | |
7874 | ---- | |
7875 | fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j))) | |
7876 | ---- | |
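
A self-contained sketch of running the `triangle` iterator follows. It redefines `return`, `>>=`, and a half-open `to` locally rather than opening `Iter`, and collects the visited pairs in a list.

[source,sml]
----
infix 2 to
infix 0 >>=

fun return x f = f x
fun (iA >>= a2iB) f = iA (fn a => a2iB a f)

(* Half-open range: u is excluded, as in the Iter structure. *)
fun (l to u) f =
   let fun loop i = if i < u then (f i; loop (i + 1)) else ()
   in loop l end

fun triangle (l, u) = l to u >>= (fn i => i to u >>= (fn j => return (i, j)))

val acc = ref ([] : (int * int) list)
val () = triangle (0, 3) (fn p => acc := p :: !acc)
val pairs = List.rev (!acc)
(* [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)] *)
----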
7877 | ||
7878 | <<< | |
7879 | ||
7880 | :mlton-guide-page: FrontEnd | |
7881 | [[FrontEnd]] | |
7882 | FrontEnd | |
7883 | ======== | |
7884 | ||
7885 | <:FrontEnd:> is a translation pass from source to the <:AST:> | |
7886 | <:IntermediateLanguage:>. | |
7887 | ||
7888 | == Description == | |
7889 | ||
7890 | This pass performs lexing and parsing to produce an abstract syntax | |
7891 | tree. | |
7892 | ||
7893 | == Implementation == | |
7894 | ||
7895 | * <!ViewGitFile(mlton,master,mlton/front-end/front-end.sig)> | |
7896 | * <!ViewGitFile(mlton,master,mlton/front-end/front-end.fun)> | |
7897 | ||
7898 | == Details and Notes == | |
7899 | ||
7900 | The lexer is produced by <:MLLex:> from | |
7901 | <!ViewGitFile(mlton,master,mlton/front-end/ml.lex)>. | |
7902 | ||
7903 | The parser is produced by <:MLYacc:> from | |
7904 | <!ViewGitFile(mlton,master,mlton/front-end/ml.grm)>. | |
7905 | ||
7906 | The specifications for the lexer and parser were originally taken from | |
7907 | <:SMLNJ: SML/NJ> (version 109.32), but have been heavily modified | |
7908 | since then. | |
7909 | ||
7910 | <<< | |
7911 | ||
7912 | :mlton-guide-page: FSharp | |
7913 | [[FSharp]] | |
7914 | FSharp | |
7915 | ====== | |
7916 | ||
7917 | http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/[F#] | |
7918 | is a functional programming language developed at Microsoft Research. | |
7919 | F# was partly inspired by the <:OCaml:OCaml> language and shares some | |
7920 | common core constructs with it. F# is integrated with Visual Studio | |
7921 | 2010 as a first-class language. | |
7922 | ||
7923 | <<< | |
7924 | ||
7925 | :mlton-guide-page: FunctionalRecordUpdate | |
7926 | [[FunctionalRecordUpdate]] | |
7927 | FunctionalRecordUpdate | |
7928 | ====================== | |
7929 | ||
7930 | Functional record update is the copying of a record while replacing | |
7931 | the values of some of the fields. <:StandardML:Standard ML> does not | |
7932 | have explicit syntax for functional record update. We will show below | |
7933 | how to implement functional record update in SML, with a little | |
7934 | boilerplate code. | |
7935 | ||
7936 | As an example, the functional update of the record | |
7937 | ||
7938 | [source,sml] | |
7939 | ---- | |
7940 | {a = 13, b = 14, c = 15} | |
7941 | ---- | |
7942 | ||
7943 | with `c = 16` yields a new record | |
7944 | ||
7945 | [source,sml] | |
7946 | ---- | |
7947 | {a = 13, b = 14, c = 16} | |
7948 | ---- | |
7949 | ||
7950 | Functional record update also makes sense with multiple simultaneous | |
7951 | updates. For example, the functional update of the record above with | |
7952 | `a = 18, c = 19` yields a new record | |
7953 | ||
7954 | [source,sml] | |
7955 | ---- | |
7956 | {a = 18, b = 14, c = 19} | |
7957 | ---- | |
7958 | ||
7959 | ||
7960 | One could easily imagine an extension of SML that supports | |
7961 | functional record update. For example | |
7962 | ||
7963 | [source,sml] | |
7964 | ---- | |
7965 | e with {a = 16, b = 17} | |
7966 | ---- | |
7967 | ||
7968 | would create a copy of the record denoted by `e` with field `a` | |
7969 | replaced with `16` and `b` replaced with `17`. | |
7970 | ||
7971 | Since there is no such syntax in SML, we now show how to implement | |
7972 | functional record update directly. We first give a simple | |
7973 | implementation that has a number of problems. We then give an | |
7974 | advanced implementation, that, while complex underneath, is a reusable | |
7975 | library that admits simple use. | |
7976 | ||
7977 | ||
7978 | == Simple implementation == | |
7979 | ||
7980 | To support functional record update on the record type | |
7981 | ||
7982 | [source,sml] | |
7983 | ---- | |
7984 | {a: 'a, b: 'b, c: 'c} | |
7985 | ---- | |
7986 | ||
7987 | first, define an update function for each component. | |
7988 | ||
7989 | [source,sml] | |
7990 | ---- | |
7991 | fun withA ({a = _, b, c}, a) = {a = a, b = b, c = c} | |
7992 | fun withB ({a, b = _, c}, b) = {a = a, b = b, c = c} | |
7993 | fun withC ({a, b, c = _}, c) = {a = a, b = b, c = c} | |
7994 | ---- | |
7995 | ||
7996 | Then, one can express `e with {a = 16, b = 17}` as | |
7997 | ||
7998 | [source,sml] | |
7999 | ---- | |
8000 | withB (withA (e, 16), 17) | |
8001 | ---- | |
8002 | ||
8003 | With infix notation | |
8004 | ||
8005 | [source,sml] | |
8006 | ---- | |
8007 | infix withA withB withC | |
8008 | ---- | |
8009 | ||
8010 | the syntax is almost as concise as a language extension. | |
8011 | ||
8012 | [source,sml] | |
8013 | ---- | |
8014 | e withA 16 withB 17 | |
8015 | ---- | |
8016 | ||
8017 | This approach suffers from the fact that the amount of boilerplate | |
8018 | code is quadratic in the number of record fields. Furthermore, | |
8019 | changing, adding, or deleting a field requires time proportional to | |
8020 | the number of fields (because each ++with__<L>__++ function must be | |
8021 | changed). It is also annoying to have to define a ++with__<L>__++ | |
8022 | function, possibly with a fixity declaration, for each field. | |
8023 | ||
8024 | Fortunately, there is a solution to these problems. | |
8025 | ||
8026 | ||
8027 | == Advanced implementation == | |
8028 | ||
8029 | Using <:Fold:> one can define a family of ++makeUpdate__<N>__++ | |
8030 | functions and a single _update_ operator `set` so that one can define a | |
8031 | functional record update function for any record type simply by | |
8032 | specifying a (trivial) isomorphism between that type and a function | |
8033 | argument list. For example, suppose that we would like to do | |
8034 | functional record update on records with fields `a` and `b`. Then one | |
8035 | defines a function `updateAB` as follows. | |
8036 | ||
8037 | [source,sml] | |
8038 | ---- | |
8039 | val updateAB = | |
8040 | fn z => | |
8041 | let | |
8042 | fun from v1 v2 = {a = v1, b = v2} | |
8043 | fun to f {a = v1, b = v2} = f v1 v2 | |
8044 | in | |
8045 | makeUpdate2 (from, from, to) | |
8046 | end | |
8047 | z | |
8048 | ---- | |
8049 | ||
8050 | The functions `from` (think _from function arguments_) and `to` (think | |
8051 | _to function arguments_) specify an isomorphism between `a`,`b` | |
8052 | records and function arguments. There is a second use of `from` to | |
8053 | work around the lack of | |
8054 | <:FirstClassPolymorphism:first-class polymorphism> in SML. | |
8055 | ||
8056 | With the definition of `updateAB` in place, the following expressions | |
8057 | are valid. | |
8058 | ||
8059 | [source,sml] | |
8060 | ---- | |
8061 | updateAB {a = 13, b = "hello"} (set#b "goodbye") $ | |
8062 | updateAB {a = 13.5, b = true} (set#b false) (set#a 12.5) $ | |
8063 | ---- | |
8064 | ||
8065 | As another example, suppose that we would like to do functional record | |
8066 | update on records with fields `b`, `c`, and `d`. Then one defines a | |
8067 | function `updateBCD` as follows. | |
8068 | ||
8069 | [source,sml] | |
8070 | ---- | |
8071 | val updateBCD = | |
8072 | fn z => | |
8073 | let | |
8074 | fun from v1 v2 v3 = {b = v1, c = v2, d = v3} | |
8075 | fun to f {b = v1, c = v2, d = v3} = f v1 v2 v3 | |
8076 | in | |
8077 | makeUpdate3 (from, from, to) | |
8078 | end | |
8079 | z | |
8080 | ---- | |
8081 | ||
8082 | With the definition of `updateBCD` in place, the following expression | |
8083 | is valid. | |
8084 | ||
8085 | [source,sml] | |
8086 | ---- | |
8087 | updateBCD {b = 1, c = 2, d = 3} (set#c 4) (set#c 5) $ | |
8088 | ---- | |
8089 | ||
8090 | Note that not all fields need be updated and that the same field may | |
8091 | be updated multiple times. Further note that the same `set` operator | |
8092 | is used for all update functions (in the above, for both `updateAB` | |
8093 | and `updateBCD`). | |
8094 | ||
8095 | In general, to define a functional-record-update function on records | |
8096 | with fields `f1`, `f2`, ..., `fN`, use the following template. | |
8097 | ||
8098 | [source,sml] | |
8099 | ---- | |
8100 | val update = | |
8101 | fn z => | |
8102 | let | |
8103 | fun from v1 v2 ... vn = {f1 = v1, f2 = v2, ..., fn = vn} | |
8104 | fun to f {f1 = v1, f2 = v2, ..., fn = vn} = f v1 v2 ... vn | |
8105 | in | |
8106 | makeUpdateN (from, from, to) | |
8107 | end | |
8108 | z | |
8109 | ---- | |
8110 | ||
8111 | With this, one can update a record as follows. | |
8112 | ||
8113 | [source,sml] | |
8114 | ---- | |
8115 | update {f1 = v1, ..., fn = vn} (set#fi1 vi1) ... (set#fim vim) $ | |
8116 | ---- | |
8117 | ||
8118 | ||
8119 | == The `FunctionalRecordUpdate` structure == | |
8120 | ||
8121 | Here is the implementation of functional record update. | |
8122 | ||
8123 | [source,sml] | |
8124 | ---- | |
8125 | structure FunctionalRecordUpdate = | |
8126 | struct | |
8127 | local | |
8128 | fun next g (f, z) x = g (f x, z) | |
8129 | fun f1 (f, z) x = f (z x) | |
8130 | fun f2 z = next f1 z | |
8131 | fun f3 z = next f2 z | |
8132 | ||
8133 | fun c0 from = from | |
8134 | fun c1 from = c0 from f1 | |
8135 | fun c2 from = c1 from f2 | |
8136 | fun c3 from = c2 from f3 | |
8137 | ||
8138 | fun makeUpdate cX (from, from', to) record = | |
8139 | let | |
8140 | fun ops () = cX from' | |
8141 | fun vars f = to f record | |
8142 | in | |
8143 | Fold.fold ((vars, ops), fn (vars, _) => vars from) | |
8144 | end | |
8145 | in | |
8146 | fun makeUpdate0 z = makeUpdate c0 z | |
8147 | fun makeUpdate1 z = makeUpdate c1 z | |
8148 | fun makeUpdate2 z = makeUpdate c2 z | |
8149 | fun makeUpdate3 z = makeUpdate c3 z | |
8150 | ||
8151 | fun upd z = Fold.step2 (fn (s, f, (vars, ops)) => (fn out => vars (s (ops ()) (out, f)), ops)) z | |
8152 | fun set z = Fold.step2 (fn (s, v, (vars, ops)) => (fn out => vars (s (ops ()) (out, fn _ => v)), ops)) z | |
8153 | end | |
8154 | end | |
8155 | ---- | |
8156 | ||
8157 | The idea of `makeUpdate` is to build a record of functions which can | |
8158 | replace the contents of one argument out of a list of arguments. The | |
8159 | functions ++f__<X>__++ replace the 0th, 1st, ... argument with their | |
8160 | argument `z`. The ++c__<X>__++ functions pass the first __X__ `f` | |
8161 | functions to the record constructor. | |
8162 | ||
8163 | The `#field` notation of Standard ML allows us to select the map | |
8164 | function which replaces the corresponding argument. By converting the | |
8165 | record to an argument list, feeding that list through the selected map | |
8166 | function and piping the list into the record constructor, functional | |
8167 | record update is achieved. | |
8168 | ||
8169 | ||
8170 | == Efficiency == | |
8171 | ||
8172 | With MLton, the efficiency of this approach is as good as one would | |
8173 | expect with the special syntax. Namely a sequence of updates will be | |
8174 | optimized into a single record construction that copies the unchanged | |
8175 | fields and fills in the changed fields with their new values. | |
8176 | ||
8177 | Before Sep 14, 2009, this page advocated an alternative implementation | |
8178 | of <:FunctionalRecordUpdate:>. However, the old structure caused | |
8179 | exponentially increasing compile times. We advise you to switch to | |
8180 | the newer version. | |
8181 | ||
8182 | ||
8183 | == Applications == | |
8184 | ||
8185 | Functional record update can be used to implement labelled | |
8186 | <:OptionalArguments:optional arguments>. | |
8187 | ||
8188 | <<< | |
8189 | ||
8190 | :mlton-guide-page: fxp | |
8191 | [[fxp]] | |
8192 | fxp | |
8193 | === | |
8194 | ||
8195 | http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/[fxp] is an XML | |
8196 | parser written in Standard ML. | |
8197 | ||
8198 | It has a | |
8199 | http://atseidl2.informatik.tu-muenchen.de/%7Eberlea/Fxp/mlton.html[patch] | |
8200 | to compile with MLton. | |
8201 | ||
8202 | <<< | |
8203 | ||
8204 | :mlton-guide-page: GarbageCollection | |
8205 | [[GarbageCollection]] | |
8206 | GarbageCollection | |
8207 | ================= | |
8208 | ||
8209 | For a good introduction to and overview of garbage collection, see | |
8210 | <!Cite(Jones99)>. | |
8211 | ||
8212 | MLton's garbage collector uses copying, mark-compact, and generational | |
8213 | collection, automatically switching between them at run time based on | |
8214 | the amount of live data relative to the amount of RAM. The runtime | |
8215 | system tries to keep the heap within RAM if at all possible. | |
8216 | ||
8217 | MLton's copying collector is a simple, two-space, breadth-first, | |
8218 | Cheney-style collector. The design for the generational and | |
8219 | mark-compact GC is based on <!Cite(Sansom91)>. | |
8220 | ||
8221 | == Design notes == | |
8222 | ||
8223 | * http://www.mlton.org/pipermail/mlton/2002-May/012420.html | |
8224 | + | |
8225 | object layout and header word design | |
8226 | ||
8227 | == Also see == | |
8228 | ||
8229 | * <:Regions:> | |
8230 | ||
8231 | <<< | |
8232 | ||
8233 | :mlton-guide-page: GenerativeDatatype | |
8234 | [[GenerativeDatatype]] | |
8235 | GenerativeDatatype | |
8236 | ================== | |
8237 | ||
8238 | In <:StandardML:Standard ML>, datatype declarations are said to be | |
8239 | _generative_, because each time a datatype declaration is evaluated, | |
8240 | it yields a new type. Thus, any attempt to mix the types will lead to | |
8241 | a type error at compile-time. The following program, which does not | |
8242 | type check, demonstrates this. | |
8243 | ||
8244 | [source,sml] | |
8245 | ---- | |
8246 | functor F () = | |
8247 | struct | |
8248 | datatype t = T | |
8249 | end | |
8250 | structure S1 = F () | |
8251 | structure S2 = F () | |
8252 | val _: S1.t -> S2.t = fn x => x | |
8253 | ---- | |
8254 | ||
8255 | Generativity also means that two different datatype declarations | |
8256 | define different types, even if they define identical constructors. | |
8257 | The following program does not type check due to this. | |
8258 | ||
8259 | [source,sml] | |
8260 | ---- | |
8261 | datatype t = A | B | |
8262 | val a1 = A | |
8263 | datatype t = A | B | |
8264 | val a2 = A | |
8265 | val _ = if true then a1 else a2 | |
8266 | ---- | |
8267 | ||
8268 | == Also see == | |
8269 | ||
8270 | * <:GenerativeException:> | |
8271 | ||
8272 | <<< | |
8273 | ||
8274 | :mlton-guide-page: GenerativeException | |
8275 | [[GenerativeException]] | |
8276 | GenerativeException | |
8277 | =================== | |
8278 | ||
8279 | In <:StandardML:Standard ML>, exception declarations are said to be | |
8280 | _generative_, because each time an exception declaration is evaluated, | |
8281 | it yields a new exception. | |
8282 | ||
8283 | The following program demonstrates the generativity of exceptions. | |
8284 | ||
8285 | [source,sml] | |
8286 | ---- | |
8287 | exception E | |
8288 | val e1 = E | |
8289 | fun isE1 (e: exn): bool = | |
8290 | case e of | |
8291 | E => true | |
8292 | | _ => false | |
8293 | exception E | |
8294 | val e2 = E | |
8295 | fun isE2 (e: exn): bool = | |
8296 | case e of | |
8297 | E => true | |
8298 | | _ => false | |
8299 | fun pb (b: bool): unit = | |
8300 | print (concat [Bool.toString b, "\n"]) | |
8301 | val () = (pb (isE1 e1) | |
8302 | ; pb (isE1 e2) | |
8303 | ; pb (isE2 e1) | |
8304 | ; pb (isE2 e2)) | |
8305 | ---- | |
8306 | ||
8307 | In the above program, two different exception declarations declare an | |
8308 | exception `E` and a corresponding function that returns `true` only on | |
8309 | that exception. Although declared by syntactically identical | |
8310 | exception declarations, `e1` and `e2` are different exceptions. The | |
8311 | program, when run, prints `true`, `false`, `false`, `true`. | |
8312 | ||
8313 | A slight modification of the above program shows that even a single | |
8314 | exception declaration yields a new exception each time it is | |
8315 | evaluated. | |
8316 | ||
8317 | [source,sml] | |
8318 | ---- | |
8319 | fun f (): exn * (exn -> bool) = | |
8320 | let | |
8321 | exception E | |
8322 | in | |
8323 | (E, fn E => true | _ => false) | |
8324 | end | |
8325 | val (e1, isE1) = f () | |
8326 | val (e2, isE2) = f () | |
8327 | fun pb (b: bool): unit = | |
8328 | print (concat [Bool.toString b, "\n"]) | |
8329 | val () = (pb (isE1 e1) | |
8330 | ; pb (isE1 e2) | |
8331 | ; pb (isE2 e1) | |
8332 | ; pb (isE2 e2)) | |
8333 | ---- | |
8334 | ||
8335 | Each call to `f` yields a new exception and a function that returns | |
8336 | `true` only on that exception. The program, when run, prints `true`, | |
8337 | `false`, `false`, `true`. | |
8338 | ||
8339 | ||
8340 | == Type Safety == | |
8341 | ||
8342 | Exception generativity is required for type safety. Consider the | |
8343 | following valid SML program. | |
8344 | ||
8345 | [source,sml] | |
8346 | ---- | |
8347 | fun f (): ('a -> exn) * (exn -> 'a) = | |
8348 | let | |
8349 | exception E of 'a | |
8350 | in | |
8351 | (E, fn E x => x | _ => raise Fail "f") | |
8352 | end | |
8353 | fun cast (a: 'a): 'b = | |
8354 | let | |
8355 | val (make: 'a -> exn, _) = f () | |
8356 | val (_, get: exn -> 'b) = f () | |
8357 | in | |
8358 | get (make a) | |
8359 | end | |
8360 | val _ = ((cast 13): int -> int) 14 | |
8361 | ---- | |
8362 | ||
8363 | If exceptions weren't generative, then each call `f ()` would yield | |
8364 | the same exception constructor `E`. Then, our `cast` function could | |
8365 | use `make: 'a -> exn` to convert any value into an exception and then | |
8366 | `get: exn -> 'b` to convert that exception to a value of arbitrary | |
8367 | type. If `cast` worked, then we could cast an integer as a function | |
8368 | and apply it. Of course, because of generative exceptions, this program | |
8369 | raises `Fail "f"`. | |
8370 | ||
8371 | ||
8372 | == Applications == | |
8373 | ||
8374 | The `exn` type is effectively a <:UniversalType:universal type>. | |
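
As a hedged illustration (a sketch, not code taken from MLton's sources), a locally declared exception can supply an injection into `exn` and a matching projection for any chosen type. The names `embed`, `ofInt`, `toInt`, `ofStr`, and `toStr` are hypothetical.

[source,sml]
----
(* Each call to embed evaluates a fresh exception declaration, so the
 * returned projection recognizes only values built by the injection
 * returned from the same call. *)
fun embed (): ('a -> exn) * (exn -> 'a option) =
   let
      exception E of 'a
   in
      (E, fn E x => SOME x | _ => NONE)
   end
val (ofInt, toInt): (int -> exn) * (exn -> int option) = embed ()
val (ofStr, toStr): (string -> exn) * (exn -> string option) = embed ()
(* A heterogeneous list: every element has type exn. *)
val us: exn list = [ofInt 13, ofStr "foo"]
val () =
   List.app
   (fn u =>
    case (toInt u, toStr u) of
       (SOME i, _) => print (concat [Int.toString i, "\n"])
     | (_, SOME s) => print (concat [s, "\n"])
     | _ => ())
   us
----

When run, this sketch prints `13` followed by `foo`. Because the two calls to `embed` yield distinct exceptions, `toInt` returns `NONE` on the value built by `ofStr`, and vice versa.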
8375 | ||
8376 | ||
8377 | == Also see == | |
8378 | ||
8379 | * <:GenerativeDatatype:> | |
8380 | ||
8381 | <<< | |
8382 | ||
8383 | :mlton-guide-page: Git | |
8384 | [[Git]] | |
8385 | Git | |
8386 | === | |
8387 | ||
8388 | http://git-scm.com/[Git] is a distributed version control system. The | |
8389 | MLton project currently uses Git to maintain its | |
8390 | <:Sources:source code>. | |
8391 | ||
8392 | Here are some online Git resources. | |
8393 | ||
8394 | * http://git-scm.com/docs[Reference Manual] | |
8395 | * http://git-scm.com/book[ProGit, by Scott Chacon] | |
8396 | ||
8397 | <<< | |
8398 | ||
8399 | :mlton-guide-page: Glade | |
8400 | [[Glade]] | |
8401 | Glade | |
8402 | ===== | |
8403 | ||
8404 | http://glade.gnome.org/features.html[Glade] is a tool for generating | |
8405 | Gtk user interfaces. | |
8406 | ||
8407 | <:WesleyTerpstra:> is working on a Glade->mGTK converter. | |
8408 | ||
8409 | * http://www.mlton.org/pipermail/mlton/2004-December/016865.html | |
8410 | ||
8411 | <<< | |
8412 | ||
8413 | :mlton-guide-page: Globalize | |
8414 | [[Globalize]] | |
8415 | Globalize | |
8416 | ========= | |
8417 | ||
8418 | <:Globalize:> is an analysis pass for the <:SXML:> | |
8419 | <:IntermediateLanguage:>, invoked from <:ClosureConvert:>. | |
8420 | ||
8421 | == Description == | |
8422 | ||
8423 | This pass marks values that are constant, allowing <:ClosureConvert:> | |
8424 | to move them out to the top level so they are only evaluated once and | |
8425 | do not appear in closures. | |
8426 | ||
8427 | == Implementation == | |
8428 | ||
8429 | * <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.sig)> | |
8430 | * <!ViewGitFile(mlton,master,mlton/closure-convert/globalize.fun)> | |
8431 | ||
8432 | == Details and Notes == | |
8433 | ||
8434 | {empty} | |
8435 | ||
8436 | <<< | |
8437 | ||
8438 | :mlton-guide-page: GnuMP | |
8439 | [[GnuMP]] | |
8440 | GnuMP | |
8441 | ===== | |
8442 | ||
8443 | The http://gmplib.org[GnuMP] library (GNU Multiple Precision | |
8444 | arithmetic library) is a library for arbitrary precision integer | |
8445 | arithmetic. MLton uses the GnuMP library to implement the | |
8446 | <:BasisLibrary:Basis Library> `IntInf` module. | |
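
As a small illustration, the following sketch exercises `IntInf` with a value too large for a 64-bit integer. The function name `fact` is hypothetical and chosen only for this example.

[source,sml]
----
(* Factorial over arbitrary-precision integers. *)
fun fact (n: IntInf.int): IntInf.int =
   if n <= 1 then 1 else n * fact (n - 1)
(* 25! exceeds the range of a 64-bit integer, but not of IntInf.int. *)
val () = print (concat [IntInf.toString (fact 25), "\n"])
----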
8447 | ||
8448 | == Known issues == | |
8449 | ||
8450 | * There is a known problem with the GnuMP library (prior to version | |
8451 | 4.2.x), where it requires a lot of stack space for some computations, | |
8452 | e.g. `IntInf.toString` of a million-digit number. If you run with a | |
8453 | limited stack size, you may see a segfault in such programs. This | |
8454 | problem is mentioned in the http://gmplib.org/#FAQ[GnuMP FAQ], where | |
8455 | they describe two solutions. | |
8456 | ||
8457 | ** Increase (or unlimit) your stack space. From your program, use | |
8458 | `setrlimit`, or from the shell, use `ulimit`. | |
8459 | ||
8460 | ** Configure and rebuild `libgmp` with `--disable-alloca`, which will | |
8461 | cause it to allocate temporaries using `malloc` instead of on the | |
8462 | stack. | |
8463 | ||
8464 | * On some platforms, the GnuMP library may be configured to use one of | |
8465 | multiple ABIs (Application Binary Interfaces). For example, on some | |
8466 | 32-bit architectures, GnuMP may be configured to represent a limb as | |
8467 | either a 32-bit `long` or as a 64-bit `long long`. Similarly, GnuMP | |
8468 | may be configured to use specific CPU features. | |
8469 | + | |
8470 | In order to efficiently use the GnuMP library, MLton represents an | |
8471 | `IntInf.int` value in a manner compatible with the GnuMP library's | |
8472 | representation of a limb. Hence, it is important that MLton and the | |
8473 | GnuMP library agree upon the representation of a limb. | |
8474 | ||
8475 | ** When using a source package of MLton, building will detect the | |
8476 | GnuMP library's representation of a limb. | |
8477 | ||
8478 | ** When using a binary package of MLton that is dynamically linked | |
8479 | against the GnuMP library, the build machine and the install machine | |
8480 | must have the GnuMP library configured with the same representation of | |
8481 | a limb. (On the other hand, the build machine need not have the GnuMP | |
8482 | library configured with CPU features compatible with the install | |
8483 | machine.) | |
8484 | ||
8485 | ** When using a binary package of MLton that is statically linked | |
8486 | against the GnuMP library, the build machine and the install machine | |
8487 | need not have the GnuMP library configured with the same | |
8488 | representation of a limb. (On the other hand, the build machine must | |
8489 | have the GnuMP library configured with CPU features compatible with | |
8490 | the install machine.) | |
8491 | + | |
8492 | However, MLton will be configured with the representation of a limb | |
8493 | from the GnuMP library of the build machine. Executables produced by | |
8494 | MLton will be incompatible with the GnuMP library of the install | |
8495 | machine. To _reconfigure_ MLton with the representation of a limb | |
8496 | from the GnuMP library of the install machine, one must edit: | |
8497 | + | |
8498 | ---- | |
8499 | /usr/lib/mlton/self/sizes | |
8500 | ---- | |
8501 | + | |
8502 | changing the | |
8503 | + | |
8504 | ---- | |
8505 | mplimb = ?? | |
8506 | ---- | |
8507 | + | |
8508 | entry so that `??` corresponds to the number of bytes in a limb; one must also edit: | |
8509 | + | |
8510 | ---- | |
8511 | /usr/lib/mlton/sml/basis/config/c/arch-os/c-types.sml | |
8512 | ---- | |
8513 | + | |
8514 | changing the | |
8515 | + | |
8516 | ---- | |
8517 | (* from "gmp.h" *) | |
8518 | structure C_MPLimb = struct open Word?? type t = word end | |
8519 | functor C_MPLimb_ChooseWordN (A: CHOOSE_WORDN_ARG) = ChooseWordN_Word?? (A) | |
8520 | ---- | |
8521 | + | |
8522 | entries so that `??` corresponds to the number of bits in a limb. | |
8523 | ||
8524 | <<< | |
8525 | ||
8526 | :mlton-guide-page: GoogleSummerOfCode2013 | |
8527 | [[GoogleSummerOfCode2013]] | |
8528 | Google Summer of Code (2013) | |
8529 | ============================ | |
8530 | ||
8531 | == Mentors == | |
8532 | ||
8533 | The following developers have agreed to serve as mentors for the 2013 Google Summer of Code: | |
8534 | ||
8535 | * http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8536 | * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8537 | * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan] | |
8538 | ||
8539 | == Ideas List == | |
8540 | ||
8541 | === Implement a Partial Redundancy Elimination (PRE) Optimization === | |
8542 | ||
8543 | Partial redundancy elimination (PRE) is a program transformation that | |
8544 | removes operations that are redundant on some, but not necessarily all, | |
8545 | paths through the program. PRE can subsume both common subexpression | |
8546 | elimination and loop-invariant code motion, and is therefore a | |
8547 | potentially powerful optimization. However, a naïve | |
8548 | implementation of PRE on a program in static single assignment (SSA) | |
8549 | form is unlikely to be effective. This project aims to adapt and | |
8550 | implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA | |
8551 | intermediate language. | |
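
To make the idea of a partial redundancy concrete, here is a hand-written source-level sketch. It is not the SSAPRE algorithm and does not operate on MLton's SSA intermediate language; the function names `original` and `transformed` are hypothetical.

[source,sml]
----
(* Before: on the path where p is true, a + b is computed twice;
 * on the path where p is false, it is computed once. The second
 * computation is therefore only partially redundant. *)
fun original (a: int, b: int, p: bool): int =
   let
      val x = if p then a + b else 0
   in
      x + (a + b)
   end

(* After: a + b is inserted on the path that lacked it, so the later
 * computation becomes fully redundant and is replaced by t. *)
fun transformed (a: int, b: int, p: bool): int =
   let
      val (x, t) =
         if p
            then let val t = a + b in (t, t) end
         else (0, a + b)
   in
      x + t
   end
----

Both functions return the same result for every input, but `transformed` evaluates `a + b` exactly once on each path.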
8552 | ||
8553 | Background: | |
8554 | -- | |
8555 | * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking | |
8556 | * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen | |
8557 | * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
8558 | * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow | |
8559 | -- | |
8560 | ||
8561 | Recommended Skills: SML programming experience; some middle-end compiler experience | |
8562 | ||
8563 | ///// | |
8564 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8565 | ///// | |
8566 | ||
8567 | === Design and Implement a Heap Profiler === | |
8568 | ||
8569 | A heap profile is a description of the space usage of a program. A | |
8570 | heap profile is concerned with the allocation, retention, and | |
8571 | deallocation (via garbage collection) of heap data during the | |
8572 | execution of a program. A heap profile can be used to diagnose | |
8573 | performance problems in a functional program that arise from space | |
8574 | leaks. This project aims to design and implement a heap profiler for | |
8575 | MLton compiled programs. | |
8576 | ||
8577 | Background: | |
8578 | -- | |
8579 | * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones | |
8580 | * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo | |
8581 | * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo | |
8582 | * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling | |
8583 | -- | |
8584 | ||
8585 | Recommended Skills: C and SML programming experience; some experience with UI and visualization | |
8586 | ||
8587 | ///// | |
8588 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8589 | ///// | |
8590 | ||
8591 | === Garbage Collector Improvements === | |
8592 | ||
8593 | The garbage collector plays a significant role in the performance of | |
8594 | functional languages. Garbage collect too often, and program | |
8595 | performance suffers due to the excessive time spent in the garbage | |
8596 | collector. Garbage collect not often enough, and program performance | |
8597 | suffers due to the excessive space used by the uncollected garbage. | |
8598 | One particular issue is ensuring that a program utilizing a garbage | |
8599 | collector "plays nice" with other processes on the system, by not | |
8600 | using too much or too little physical memory. While there are some | |
8601 | reasonable theoretical results about garbage collection with heaps of | |
8602 | fixed size, there seems to be insufficient work that really looks | |
8603 | carefully at the question of dynamically resizing the heap in response | |
8604 | to the live data demands of the application and, similarly, in | |
8605 | response to the behavior of the operating system and other processes. | |
8606 | This project aims to investigate improvements to the memory behavior of | |
8607 | MLton compiled programs through better tuning of the garbage | |
8608 | collector. | |
8609 | ||
8610 | Background: | |
8611 | -- | |
8612 | * http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews | |
8613 | * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski | |
8614 | * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham | |
8615 | * http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger | |
8616 | * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss | |
8617 | -- | |
8618 | ||
8619 | Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience | |
8620 | ||
8621 | ///// | |
8622 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8623 | ///// | |
8624 | ||
8625 | === Implement Successor{nbsp}ML Language Features === | |
8626 | ||
8627 | Any programming language, including Standard{nbsp}ML, can be improved. | |
8628 | The community has identified a number of modest extensions and | |
8629 | revisions to the Standard{nbsp}ML programming language that would | |
8630 | likely prove useful in practice. This project aims to implement these | |
8631 | language features in the MLton compiler. | |
8632 | ||
8633 | Background: | |
8634 | -- | |
8635 | * http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML] | |
8636 | * http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)] | |
8637 | * http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel | |
8638 | -- | |
8639 | ||
8640 | Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers) | |
8641 | ||
8642 | ///// | |
8643 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8644 | ///// | |
8645 | ||
8646 | === Implement Source-level Debugging === | |
8647 | ||
8648 | Debugging is a fact of programming life. Unfortunately, most SML | |
8649 | implementations (including MLton) provide little to no source-level | |
8650 | debugging support. This project aims to add basic to intermediate | |
8651 | source-level debugging support to the MLton compiler. MLton already | |
8652 | supports source-level profiling, which can be used to attribute bytes | |
8653 | allocated or time spent in source functions. It should be relatively | |
8654 | straightforward to leverage this source-level information into basic | |
8655 | source-level debugging support, with the ability to set/unset | |
8656 | breakpoints and step through declarations and functions. It may be | |
8657 | possible to also provide intermediate source-level debugging support, | |
8658 | with the ability to inspect in-scope variables of basic types (e.g., | |
8659 | types compatible with MLton's foreign function interface). | |
8660 | ||
8661 | Background: | |
8662 | -- | |
8663 | * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works] | |
8664 | * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types] | |
8665 | * http://dwarfstd.org/[DWARF Debugging Standard] | |
8666 | * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format] | |
8667 | -- | |
8668 | ||
8669 | Recommended Skills: SML programming experience; some compiler experience | |
8670 | ||
8671 | ///// | |
8672 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8673 | ///// | |
8674 | ||
8675 | === SIMD Primitives === | |
8676 | ||
8677 | Most modern processors offer some direct support for SIMD (Single | |
8678 | Instruction, Multiple Data) operations, such as Intel's MMX/SSE | |
8679 | instructions, AMD's 3DNow! instructions, and IBM's AltiVec. Such | |
8680 | instructions are particularly useful for multimedia, scientific, and | |
8681 | cryptographic applications. This project aims to add preliminary | |
8682 | support for vector data and vector operations to the MLton compiler. | |
8683 | Ideally, after surveying SIMD instruction sets and SIMD support in | |
8684 | other compilers, a core set of SIMD primitives with broad architecture | |
8685 | and compiler support can be identified. After adding SIMD primitives | |
8686 | to the core compiler and carrying them through to the various | |
8687 | backends, there will be opportunities to design and implement an SML | |
8688 | library that exposes the primitives to the SML programmer as well as | |
8689 | opportunities to design and implement auto-vectorization | |
8690 | optimizations. | |
8691 | ||
8692 | Background: | |
8693 | -- | |
8694 | * http://en.wikipedia.org/wiki/SIMD[SIMD] | |
8695 | * http://gcc.gnu.org/projects/tree-ssa/vectorization.html[Auto-vectorization in GCC] | |
8696 | * http://llvm.org/docs/Vectorizers.html[Auto-vectorization in LLVM] | |
8697 | -- | |
8698 | ||
8699 | Recommended Skills: SML programming experience; some compiler experience; some computer architecture experience | |
8700 | ||
8701 | ///// | |
8702 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8703 | ///// | |
8704 | ||
8705 | === RTOS Support === | |
8706 | ||
8707 | This project entails porting the MLton compiler to real-time operating systems (RTOSs) such as | |
8708 | RTEMS, RT Linux, and FreeRTOS. The project will include modifications | |
8709 | to the MLton build and configuration process. Students will need to | |
8710 | extend the MLton configuration process for each of the RTOSs. The | |
8711 | MLton compilation process will need to be extended to invoke the C | |
8712 | cross compilers the RTOSs provide for embedded support. Test scripts | |
8713 | for validation will be necessary and these will need to be run in | |
8714 | emulators for supported architectures. | |
8715 | ||
8716 | Recommended Skills: C programming experience; some scripting experience | |
8717 | ||
8718 | ///// | |
8719 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8720 | ///// | |
8721 | ||
8722 | === Region Based Memory Management === | |
8723 | ||
8724 | Region based memory management is an alternative automatic memory | |
8725 | management scheme to garbage collection. Regions can be inferred by | |
8726 | the compiler (e.g., Cyclone and MLKit) or provided to the programmer | |
8727 | through a library. Since many students do not have extensive | |
8728 | experience with compilers, we plan on adopting the latter approach. | |
8729 | Creating a viable region based memory solution requires the removal of | |
8730 | the GC and changes to the allocator. Additionally, write barriers | |
8731 | will be necessary to ensure that a reference between two ML objects is never | |
8732 | established if the left hand side of the assignment has a longer | |
8733 | lifetime than the right hand side. Students will need to come up with | |
8734 | an appropriate interface for creating, entering, and exiting regions | |
8735 | (examples include RTSJ scoped memory and SCJ scoped memory). | |
8736 | ||
8737 | Background: | |
8738 | -- | |
8739 | * Cyclone | |
8740 | * MLKit | |
8741 | * RTSJ + SCJ scopes | |
8742 | -- | |
8743 | ||
8744 | Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience | |
8745 | ||
8746 | ///// | |
8747 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8748 | ///// | |
8749 | ||
8750 | === Integration of MultiMLton === | |
8751 | ||
8752 | http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime | |
8753 | environment that targets scalable multicore platforms. It is an | |
8754 | extension of MLton. It combines new language abstractions and | |
8755 | associated compiler analyses for expressing and implementing various | |
8756 | kinds of fine-grained parallelism (safe futures, speculation, | |
8757 | transactions, etc.), along with a sophisticated runtime system tuned | |
8758 | to efficiently handle large numbers of lightweight threads. The core | |
8759 | stable features of MultiMLton will need to be integrated with the | |
8760 | latest MLton public release. Certain experimental features, such as | |
8761 | support for the Intel SCC and the distributed runtime, will be omitted. | |
8762 | This project requires students to understand the delta between the | |
8763 | MultiMLton code base and the MLton code base. Students will need to | |
8764 | create build and configuration scripts for MLton to enable MultiMLton | |
8765 | features. | |
8766 | ||
8767 | Background: | |
8768 | -- | |
8769 | * http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications] | |
8770 | -- | |
8771 | ||
8772 | Recommended Skills: SML programming experience; C programming experience; some compiler experience | |
8773 | ||
8774 | ///// | |
8775 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8776 | ///// | |
8777 | ||
8778 | <<< | |
8779 | ||
8780 | :mlton-guide-page: GoogleSummerOfCode2014 | |
8781 | [[GoogleSummerOfCode2014]] | |
8782 | Google Summer of Code (2014) | |
8783 | ============================ | |
8784 | ||
8785 | == Mentors == | |
8786 | ||
8787 | The following developers have agreed to serve as mentors for the 2014 Google Summer of Code: | |
8788 | ||
8789 | * http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8790 | * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8791 | * http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
8792 | * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan] | |
8793 | ///// | |
8794 | * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan] | |
8795 | ///// | |
8796 | ||
8797 | == Ideas List == | |
8798 | ||
8799 | === Implement a Partial Redundancy Elimination (PRE) Optimization === | |
8800 | ||
8801 | Partial redundancy elimination (PRE) is a program transformation that | |
8802 | removes operations that are redundant on some, but not necessarily all, | |
8803 | paths through the program. PRE can subsume both common subexpression | |
8804 | elimination and loop-invariant code motion, and is therefore a | |
8805 | potentially powerful optimization. However, a naïve | |
8806 | implementation of PRE on a program in static single assignment (SSA) | |
8807 | form is unlikely to be effective. This project aims to adapt and | |
8808 | implement the SSAPRE algorithm(s) of Thomas VanDrunen in MLton's SSA | |
8809 | intermediate language. | |
8810 | ||
8811 | Background: | |
8812 | -- | |
8813 | * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based partial redundancy elimination for static single assignment form]; Thomas VanDrunen and Antony L. Hosking | |
8814 | * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen | |
8815 | * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
8816 | * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial redundancy elimination in SSA form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow | |
8817 | -- | |
8818 | ||
8819 | Recommended Skills: SML programming experience; some middle-end compiler experience | |
8820 | ||
8821 | ///// | |
8822 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8823 | ///// | |
8824 | ||
8825 | === Design and Implement a Heap Profiler === | |
8826 | ||
8827 | A heap profile is a description of the space usage of a program. A | |
8828 | heap profile is concerned with the allocation, retention, and | |
8829 | deallocation (via garbage collection) of heap data during the | |
8830 | execution of a program. A heap profile can be used to diagnose | |
8831 | performance problems in a functional program that arise from space | |
8832 | leaks. This project aims to design and implement a heap profiler for | |
8833 | MLton compiled programs. | |
8834 | ||
8835 | Background: | |
8836 | -- | |
8837 | * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones | |
8838 | * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo | |
8839 | * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo | |
8840 | * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling | |
8841 | -- | |
8842 | ||
8843 | Recommended Skills: C and SML programming experience; some experience with UI and visualization | |
8844 | ||
8845 | ///// | |
8846 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8847 | ///// | |
8848 | ||
8849 | === Garbage Collector Improvements === | |
8850 | ||
8851 | The garbage collector plays a significant role in the performance of | |
8852 | functional languages. Garbage collect too often, and program | |
8853 | performance suffers due to the excessive time spent in the garbage | |
8854 | collector. Garbage collect not often enough, and program performance | |
8855 | suffers due to the excessive space used by the uncollected garbage. | |
8856 | One particular issue is ensuring that a program utilizing a garbage | |
8857 | collector "plays nice" with other processes on the system, by not | |
8858 | using too much or too little physical memory. While there are some | |
8859 | reasonable theoretical results about garbage collection with heaps of | |
8860 | fixed size, there seems to be insufficient work that really looks | |
8861 | carefully at the question of dynamically resizing the heap in response | |
8862 | to the live data demands of the application and, similarly, in | |
8863 | response to the behavior of the operating system and other processes. | |
8864 | This project aims to investigate improvements to the memory behavior of | |
8865 | MLton compiled programs through better tuning of the garbage | |
8866 | collector. | |
8867 | ||
8868 | Background: | |
8869 | -- | |
8870 | * http://www.dcs.gla.ac.uk/%7Ewhited/papers/automated_heap_sizing.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews | |
8871 | * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski | |
8872 | * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling garbage collection and heap growth to reduce the execution time of Java applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham | |
8873 | * http://portal.acm.org/citation.cfm?doid=1065010.1065028[Garbage collection without paging]; Matthew Hertz, Yi Feng, and Emery D. Berger | |
8874 | * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic heap sizing: taking real memory into account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss | |
8875 | -- | |
8876 | ||
8877 | Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience | |
8878 | ||
8879 | ///// | |
8880 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8881 | ///// | |
8882 | ||
8883 | === Implement Successor{nbsp}ML Language Features === | |
8884 | ||
8885 | Any programming language, including Standard{nbsp}ML, can be improved. | |
8886 | The community has identified a number of modest extensions and | |
8887 | revisions to the Standard{nbsp}ML programming language that would | |
8888 | likely prove useful in practice. This project aims to implement these | |
8889 | language features in the MLton compiler. | |
8890 | ||
8891 | Background: | |
8892 | -- | |
8893 | * http://successor-ml.org/index.php?title=Main_Page[Successor{nbsp}ML] | |
8894 | * http://www.mpi-sws.org/%7Erossberg/hamlet/index.html#successor-ml[HaMLet (Successor{nbsp}ML)] | |
8895 | * http://journals.cambridge.org/action/displayAbstract?aid=1322628[A critique of Standard{nbsp}ML]; Andrew W. Appel | |
8896 | -- | |
8897 | ||
8898 | Recommended Skills: SML programming experience; some front-end compiler experience (i.e., scanners and parsers) | |
8899 | ||
8900 | ///// | |
8901 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8902 | ///// | |
8903 | ||
8904 | === Implement Source-level Debugging === | |
8905 | ||
8906 | Debugging is a fact of programming life. Unfortunately, most SML | |
8907 | implementations (including MLton) provide little to no source-level | |
8908 | debugging support. This project aims to add basic to intermediate | |
8909 | source-level debugging support to the MLton compiler. MLton already | |
8910 | supports source-level profiling, which can be used to attribute bytes | |
8911 | allocated or time spent in source functions. It should be relatively | |
8912 | straightforward to leverage this source-level information into basic | |
8913 | source-level debugging support, with the ability to set/unset | |
8914 | breakpoints and step through declarations and functions. It may be | |
8915 | possible to also provide intermediate source-level debugging support, | |
8916 | with the ability to inspect in-scope variables of basic types (e.g., | |
8917 | types compatible with MLton's foreign function interface). | |
8918 | ||
8919 | Background: | |
8920 | -- | |
8921 | * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works] | |
8922 | * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types] | |
8923 | * http://dwarfstd.org/[DWARF Debugging Standard] | |
8924 | * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format] | |
8925 | -- | |
8926 | ||
8927 | Recommended Skills: SML programming experience; some compiler experience | |
8928 | ||
8929 | ///// | |
8930 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
8931 | ///// | |
8932 | ||
8933 | === Region Based Memory Management === | |
8934 | ||
8935 | Region based memory management is an alternative automatic memory | |
8936 | management scheme to garbage collection. Regions can be inferred by | |
8937 | the compiler (e.g., Cyclone and MLKit) or provided to the programmer | |
8938 | through a library. Since many students do not have extensive | |
8939 | experience with compilers, we plan on adopting the latter approach. | |
8940 | Creating a viable region based memory solution requires the removal of | |
8941 | the GC and changes to the allocator. Additionally, write barriers | |
8942 | will be necessary to ensure that a reference between two ML objects is never | |
8943 | established if the left hand side of the assignment has a longer | |
8944 | lifetime than the right hand side. Students will need to come up with | |
8945 | an appropriate interface for creating, entering, and exiting regions | |
8946 | (examples include RTSJ scoped memory and SCJ scoped memory). | |
8947 | ||
8948 | Background: | |
8949 | -- | |
8950 | * Cyclone | |
8951 | * MLKit | |
8952 | * RTSJ + SCJ scopes | |
8953 | -- | |
8954 | ||
8955 | Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience | |
8956 | ||
8957 | ///// | |
8958 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8959 | ///// | |
8960 | ||
8961 | === Integration of Multi-MLton === | |
8962 | ||
8963 | http://multimlton.cs.purdue.edu[MultiMLton] is a compiler and runtime | |
8964 | environment that targets scalable multicore platforms. It is an | |
8965 | extension of MLton. It combines new language abstractions and | |
8966 | associated compiler analyses for expressing and implementing various | |
8967 | kinds of fine-grained parallelism (safe futures, speculation, | |
8968 | transactions, etc.), along with a sophisticated runtime system tuned | |
8969 | to efficiently handle large numbers of lightweight threads. The core | |
8970 | stable features of MultiMLton will need to be integrated with the | |
8971 | latest MLton public release. Certain experimental features, such as | |
8972 | support for the Intel SCC and the distributed runtime, will be omitted. | |
8973 | This project requires students to understand the delta between the | |
8974 | MultiMLton code base and the MLton code base. Students will need to | |
8975 | create build and configuration scripts for MLton to enable MultiMLton | |
8976 | features. | |
8977 | ||
8978 | Background: | |
8979 | -- | |
8980 | * http://multimlton.cs.purdue.edu/mML/Publications.html[MultiMLton -- Publications] | |
8981 | -- | |
8982 | ||
8983 | Recommended Skills: SML programming experience; C programming experience; some compiler experience | |
8984 | ||
8985 | ///// | |
8986 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
8987 | ///// | |
8988 | ||
8989 | === Concurrent{nbsp}ML Improvements === | |
8990 | ||
8991 | http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency | |
8992 | library based on synchronous message passing. MLton has a partial | |
8993 | implementation of the CML message-passing primitives, but its use in | |
8994 | real-world applications has been stymied by the lack of completeness | |
8995 | and thread-safe I/O libraries. This project would aim to flesh out | |
8996 | the CML implementation in MLton to be fully compatible with the | |
8997 | "official" version distributed as part of SML/NJ. Furthermore, time | |
8998 | permitting, runtime system support could be added to allow use of | |
8999 | modern OS features, such as asynchronous I/O, in the implementation of | |
9000 | CML's system interfaces. | |
9001 | ||
9002 | Background: | |
9003 | -- | |
9004 | * http://cml.cs.uchicago.edu/ | |
9005 | * http://mlton.org/ConcurrentML | |
9006 | * http://mlton.org/ConcurrentMLImplementation | |
9007 | -- | |
9008 | ||
9009 | Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience | |
9010 | ||
9011 | ///// | |
9012 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
9013 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9014 | ///// | |
9015 | ||
9016 | ///// | |
9017 | === SML3d Development === | |
9018 | ||
9019 | The SML3d Project is a collection of libraries to support 3D graphics | |
9020 | programming using Standard ML and the http://opengl.org/[OpenGL] | |
9021 | graphics API. It currently requires the MLton implementation of SML | |
9022 | and is supported on Linux, Mac OS X, and Microsoft Windows. There is | |
9023 | also support for http://www.khronos.org/opencl/[OpenCL]. This project | |
9024 | aims to continue development of the SML3d Project. | |
9025 | ||
9026 | Background | |
9027 | -- | |
9028 | * http://sml3d.cs.uchicago.edu/ | |
9029 | -- | |
9030 | ||
9031 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
9032 | ///// | |
9033 | ||
9034 | <<< | |
9035 | ||
9036 | :mlton-guide-page: GoogleSummerOfCode2015 | |
9037 | [[GoogleSummerOfCode2015]] | |
9038 | Google Summer of Code (2015) | |
9039 | ============================ | |
9040 | ||
9041 | == Mentors == | |
9042 | ||
9043 | The following developers have agreed to serve as mentors for the 2015 Google Summer of Code: | |
9044 | ||
9045 | * http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9046 | * http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
9047 | ///// | |
9048 | * http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
9049 | * http://www.cs.purdue.edu/homes/chandras[KC Sivaramakrishnan] | |
9050 | * http://www.cs.purdue.edu/homes/suresh/[Suresh Jagannathan] | |
9051 | ///// | |
9052 | ||
9053 | == Ideas List == | |
9054 | ||
9055 | ///// | |
9056 | === Implement a Partial Redundancy Elimination (PRE) Optimization === | |
9057 | ||
9058 | Partial redundancy elimination (PRE) is a program transformation that | |
9059 | removes operations that are redundant on some, but not necessarily all, | |
9060 | paths through the program. PRE can subsume both common subexpression | |
9061 | elimination and loop-invariant code motion, and is therefore a | |
9062 | potentially powerful optimization. However, a naïve implementation of | |
9063 | PRE on a program in static single assignment (SSA) form is unlikely to | |
9064 | be effective. This project aims to adapt and implement the GVN-PRE | |
9065 | algorithm of Thomas VanDrunen in MLton's SSA intermediate language. | |
9066 | ||
9067 | Background: | |
9068 | -- | |
9069 | * http://cs.wheaton.edu/%7Etvandrun/writings/thesis.pdf[Partial Redundancy Elimination for Global Value Numbering]; Thomas VanDrunen | |
9070 | * http://www.cs.purdue.edu/research/technical_reports/2003/TR%2003-032.pdf[Corner-cases in Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
9071 | * http://www.springerlink.com/content/w06m3cw453nphm1u/[Value-Based Partial Redundancy Elimination]; Thomas VanDrunen and Antony L. Hosking | |
9072 | * http://onlinelibrary.wiley.com/doi/10.1002/spe.618/abstract[Anticipation-based Partial Redundancy Elimination for Static Single Assignment Form]; Thomas VanDrunen and Antony L. Hosking | |
9073 | * http://portal.acm.org/citation.cfm?doid=319301.319348[Partial Redundancy Elimination in SSA Form]; Robert Kennedy, Sun Chan, Shin-Ming Liu, Raymond Lo, Peng Tu, and Fred Chow | |
9074 | -- | |
9075 | ||
9076 | Recommended Skills: SML programming experience; some middle-end compiler experience | |
9077 | ||
9078 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9079 | ///// | |
9080 | ||
9081 | === Design and Implement a Heap Profiler === | |
9082 | ||
9083 | A heap profile is a description of the space usage of a program. A | |
9084 | heap profile is concerned with the allocation, retention, and | |
9085 | deallocation (via garbage collection) of heap data during the | |
9086 | execution of a program. A heap profile can be used to diagnose | |
9087 | performance problems in a functional program that arise from space | |
9088 | leaks. This project aims to design and implement a heap profiler for | |
9089 | MLton compiled programs. | |
9090 | ||
9091 | Background: | |
9092 | -- | |
9093 | * http://portal.acm.org/citation.cfm?doid=583854.582451[GCspy: an adaptable heap visualisation framework]; Tony Printezis and Richard Jones | |
9094 | * http://journals.cambridge.org/action/displayAbstract?aid=1349892[New dimensions in heap profiling]; Colin Runciman and Niklas Röjemo | |
9095 | * http://www.springerlink.com/content/710501660722gw37/[Heap profiling for space efficiency]; Colin Runciman and Niklas Röjemo | |
9096 | * http://journals.cambridge.org/action/displayAbstract?aid=1323096[Heap profiling of lazy functional programs]; Colin Runciman and David Wakeling | |
9097 | -- | |
9098 | ||
9099 | Recommended Skills: C and SML programming experience; some experience with UI and visualization | |
9100 | ||
9101 | ///// | |
9102 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9103 | ///// | |
9104 | ||
9105 | === Garbage Collector Improvements === | |
9106 | ||
9107 | The garbage collector plays a significant role in the performance of | |
9108 | functional languages. Garbage collect too often, and program | |
9109 | performance suffers due to the excessive time spent in the garbage | |
9110 | collector. Garbage collect not often enough, and program performance | |
9111 | suffers due to the excessive space used by the uncollected | |
9112 | garbage. One particular issue is ensuring that a program utilizing a | |
9113 | garbage collector "plays nice" with other processes on the system, by | |
9114 | not using too much or too little physical memory. While there are some | |
9115 | reasonable theoretical results about garbage collection with heaps of | |
9116 | fixed size, there seems to be insufficient work that really looks | |
9117 | carefully at the question of dynamically resizing the heap in response | |
9118 | to the live data demands of the application and, similarly, in | |
9119 | response to the behavior of the operating system and other | |
9120 | processes. This project aims to investigate improvements to the memory | |
9121 | behavior of MLton compiled programs through better tuning of the | |
9122 | garbage collector. | |
9123 | ||
9124 | Background: | |
9125 | -- | |
9126 | * http://gchandbook.org/[The Garbage Collection Handbook: The Art of Automatic Memory Management]; Richard Jones, Antony Hosking, Eliot Moss | |
9127 | * http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.1020[Dual-Mode Garbage Collection]; Patrick Sansom | |
9128 | * http://portal.acm.org/citation.cfm?doid=1029873.1029881[Automatic Heap Sizing: Taking Real Memory into Account]; Ting Yang, Matthew Hertz, Emery D. Berger, Scott F. Kaplan, and J. Eliot B. Moss | |
9129 | * http://portal.acm.org/citation.cfm?doid=1152649.1152652[Controlling Garbage Collection and Heap Growth to Reduce the Execution Time of Java Applications]; Tim Brecht, Eshrat Arjomandi, Chang Li, and Hang Pham | |
9130 | * http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4145125[Isla Vista Heap Sizing: Using Feedback to Avoid Paging]; Chris Grzegorczyk, Sunil Soman, Chandra Krintz, and Rich Wolski | |
9131 | * http://portal.acm.org/citation.cfm?doid=1806651.1806669[The Economics of Garbage Collection]; Jeremy Singer, Richard E. Jones, Gavin Brown, and Mikel Luján | |
9132 | * http://www.dcs.gla.ac.uk/%7Ejsinger/pdfs/tfp12.pdf[Automated Heap Sizing in the Poly/ML Runtime (Position Paper)]; David White, Jeremy Singer, Jonathan Aitken, and David Matthews | |
9133 | * http://portal.acm.org/citation.cfm?doid=2555670.2466481[Control Theory for Principled Heap Sizing]; David R. White, Jeremy Singer, Jonathan M. Aitken, and Richard E. Jones | |
9134 | -- | |
9135 | ||
9136 | Recommended Skills: C programming experience; some operating systems and/or systems programming experience; some compiler and garbage collector experience | |
9137 | ||
9138 | ///// | |
9139 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9140 | ///// | |
9141 | ||
9142 | === Heap-allocated Activation Records === | |
9143 | ||
9144 | Activation records (a.k.a., stack frames) are traditionally allocated | |
9145 | on a stack. This naturally corresponds to the call-return pattern of | |
9146 | function invocation. However, there are some disadvantages to | |
9147 | stack-allocated activation records. In a functional programming | |
9148 | language, functions may be deeply recursive, resulting in call stacks | |
9149 | that are much larger than typically supported by the operating system; | |
9150 | hence, a functional programming language implementation will typically | |
9151 | store its stack in its heap. Furthermore, a functional programming | |
9152 | language implementation must handle and recover from stack overflow, | |
9153 | by allocating a larger stack (again, in its heap) and copying | |
9154 | activation records from the old stack to the new stack. In the | |
9155 | presence of threads, stacks must be allocated in a heap and, in the | |
9156 | presence of a garbage collector, should be garbage collected when | |
9157 | unreachable. While heap-allocated activation records avoid many of | |
9158 | these disadvantages, they have not been widely implemented. This | |
9159 | project aims to implement and evaluate heap-allocated activation | |
9160 | records in the MLton compiler. | |
9161 | ||
9162 | Background: | |
9163 | -- | |
9164 | * http://journals.cambridge.org/action/displayAbstract?aid=1295104[Empirical and Analytic Study of Stack Versus Heap Cost for Languages with Closures]; Andrew W. Appel and Zhong Shao | |
9165 | * http://portal.acm.org/citation.cfm?doid=182590.156783[Space-efficient closure representations]; Zhong Shao and Andrew W. Appel | |
9166 | * http://portal.acm.org/citation.cfm?doid=93548.93554[Representing control in the presence of first-class continuations]; R. Hieb, R. Kent Dybvig, and Carl Bruggeman | |
9167 | -- | |
9168 | ||
9169 | Recommended Skills: SML programming experience; some middle- and back-end compiler experience | |
9170 | ||
9171 | ///// | |
9172 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9173 | ///// | |
9174 | ||
9175 | === Correctly Rounded Floating-point Binary-to-Decimal and Decimal-to-Binary Conversion Routines in Standard ML === | |
9176 | ||
9177 | The | |
9178 | http://en.wikipedia.org/wiki/IEEE_754-2008[IEEE Standard for Floating-Point Arithmetic (IEEE 754)] | |
9179 | is the de facto representation for floating-point computation. | |
9180 | However, it is a _binary_ (base 2) representation of floating-point | |
9181 | values, while many applications call for input and output of | |
9182 | floating-point values in _decimal_ (base 10) representation. The | |
9183 | _decimal-to-binary_ conversion problem takes a decimal floating-point | |
9184 | representation (e.g., a string like +"0.1"+) and returns the best | |
9185 | binary floating-point representation of that number. The | |
9186 | _binary-to-decimal_ conversion problem takes a binary floating-point | |
9187 | representation and returns a decimal floating-point representation | |
9188 | using the smallest number of digits that allow the decimal | |
9189 | floating-point representation to be converted to the original binary | |
9190 | floating-point representation. For both conversion routines, "best" | |
9191 | is dependent upon the current floating-point rounding mode. | |
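
The binary-to-decimal definition above can be made concrete with a brute-force search. The following is only a sketch, not one of the published algorithms: it assumes that the Basis Library's `Real.fmt` and `Real.fromString` are themselves correctly rounded, it ignores the rounding mode, and the names `roundTrips` and `shortest` are hypothetical.

[source,sml]
----
(* Does the decimal string s convert back to exactly the real r? *)
fun roundTrips (r : real, s : string) : bool =
   case Real.fromString s of
      SOME r' => Real.== (r, r')
    | NONE => false

(* Try 0, 1, 2, ... digits after the decimal point in scientific
   notation and return the first rendering that converts back to r.
   Gives up after 17 digits, which also covers NaN (never equal to
   itself). *)
fun shortest (r : real) : string option =
   let
      fun go n =
         if n > 17
            then NONE
         else
            let
               val s = Real.fmt (StringCvt.SCI (SOME n)) r
            in
               if roundTrips (r, s) then SOME s else go (n + 1)
            end
   in
      go 0
   end
----

A value such as `0.1` is expected to round-trip with very few digits, while `1.0 / 3.0` needs many more. A native implementation would compute the shortest digit string directly rather than by repeated formatting.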
9192 | ||
9193 | MLton uses David Gay's | |
9194 | http://www.netlib.org/fp/gdtoa.tgz[gdtoa library] for floating-point | |
9195 | conversions. While this is an excellent library, it generalizes the | |
9196 | decimal-to-binary and binary-to-decimal conversion routines beyond | |
9197 | what is required by the | |
9198 | http://standardml.org/Basis/[Standard ML Basis Library] and induces an | |
9199 | external dependency on the compiler. Native implementations of these | |
9200 | conversion routines in Standard ML would obviate the dependency on the | |
9201 | +gdtoa+ library, while also being able to take advantage of Standard | |
9202 | ML features in the implementation (e.g., the published algorithms | |
9203 | often require use of infinite precision arithmetic, which is provided | |
9204 | by the +IntInf+ structure in Standard ML, but is provided in an ad hoc | |
9205 | fashion in the +gdtoa+ library). | |
9206 | ||
9207 | This project aims to develop a native implementation of the conversion | |
9208 | routines in Standard ML. | |
9209 | ||
9210 | Background: | |
9211 | -- | |
9212 | * http://dl.acm.org/citation.cfm?doid=103162.103163[What every computer scientist should know about floating-point arithmetic]; David Goldberg | |
9213 | * http://dl.acm.org/citation.cfm?doid=93542.93559[How to print floating-point numbers accurately]; Guy L. Steele, Jr. and Jon L. White | |
9214 | * http://dl.acm.org/citation.cfm?doid=93542.93557[How to read floating point numbers accurately]; William D. Clinger | |
9215 | * http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz[Correctly Rounded Binary-Decimal and Decimal-Binary Conversions]; David Gay | |
9216 | * http://dl.acm.org/citation.cfm?doid=249069.231397[Printing floating-point numbers quickly and accurately]; Robert G. Burger and R. Kent Dybvig | |
9217 | * http://dl.acm.org/citation.cfm?doid=1806596.1806623[Printing floating-point numbers quickly and accurately with integers]; Florian Loitsch | |
9218 | -- | |
9219 | ||
9220 | Recommended Skills: SML programming experience; algorithm design and implementation | |
9221 | ||
9222 | ///// | |
9223 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9224 | ///// | |
9225 | ||
9226 | === Implement Source-level Debugging === | |
9227 | ||
9228 | Debugging is a fact of programming life. Unfortunately, most SML | |
9229 | implementations (including MLton) provide little to no source-level | |
9230 | debugging support. This project aims to add basic to intermediate | |
9231 | source-level debugging support to the MLton compiler. MLton already | |
9232 | supports source-level profiling, which can be used to attribute bytes | |
9233 | allocated or time spent in source functions. It should be relatively | |
9234 | straightforward to leverage this source-level information into basic | |
9235 | source-level debugging support, with the ability to set/unset | |
9236 | breakpoints and step through declarations and functions. It may be | |
9237 | possible to also provide intermediate source-level debugging support, | |
9238 | with the ability to inspect in-scope variables of basic types (e.g., | |
9239 | types compatible with MLton's foreign function interface). | |
9240 | ||
9241 | Background: | |
9242 | -- | |
9243 | * http://mlton.org/HowProfilingWorks[MLton -- How Profiling Works] | |
9244 | * http://mlton.org/ForeignFunctionInterfaceTypes[MLton -- Foreign Function Interface Types] | |
9245 | * http://dwarfstd.org/[DWARF Debugging Standard] | |
9246 | * http://sourceware.org/gdb/current/onlinedocs/stabs/index.html[STABS Debugging Format] | |
9247 | -- | |
9248 | ||
9249 | Recommended Skills: SML programming experience; some compiler experience | |
9250 | ||
9251 | ///// | |
9252 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9253 | ///// | |
9254 | ||
9255 | === Region Based Memory Management === | |
9256 | ||
9257 | Region based memory management is an alternative automatic memory | |
9258 | management scheme to garbage collection. Regions can be inferred by | |
9259 | the compiler (e.g., Cyclone and MLKit) or provided to the programmer | |
9260 | through a library. Since many students do not have extensive | |
9261 | experience with compilers, we plan on adopting the latter approach. | |
9262 | Creating a viable region based memory solution requires the removal of | |
9263 | the GC and changes to the allocator. Additionally, write barriers | |
9264 | will be necessary to ensure that a reference between two ML objects is never | |
9265 | established if the left hand side of the assignment has a longer | |
9266 | lifetime than the right hand side. Students will need to come up with | |
9267 | an appropriate interface for creating, entering, and exiting regions | |
9268 | (examples include RTSJ scoped memory and SCJ scoped memory). | |
9269 | ||
9270 | Background: | |
9271 | -- | |
9272 | * Cyclone | |
9273 | * MLKit | |
9274 | * RTSJ + SCJ scopes | |
9275 | -- | |
9276 | ||
9277 | Recommended Skills: SML programming experience; C programming experience; some compiler and garbage collector experience | |
9278 | ||
9279 | ///// | |
9280 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
9281 | ///// | |
9282 | ||
9283 | === Adding Real-Time Capabilities === | |
9284 | ||
9285 | This project focuses on exposing real-time APIs from a real-time OS | |
9286 | kernel at the SML level. This will require mapping the current MLton | |
9287 | (or http://multimlton.cs.purdue.edu[MultiMLton]) threading framework | |
9288 | to real-time threads that the RTOS provides. This will include | |
9289 | associating priorities with MLton threads and building priority based | |
9290 | scheduling algorithms. Additionally, periodic, aperiodic, | |
9291 | and sporadic tasks should be supported. A real-time SML library will | |
9292 | need to be created to provide a forward facing interface for | |
9293 | programmers. Stretch goals include reworking the MLton +atomic+ | |
9294 | statement and associated synchronization primitives built on top of | |
9295 | the MLton +atomic+ statement. | |
9296 | ||
9297 | Recommended Skills: SML programming experience; C programming experience; real-time experience a plus but not required | |
9298 | ||
9299 | ///// | |
9300 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
9301 | ///// | |
9302 | ||
9303 | === Real-Time Garbage Collection === | |
9304 | ||
9305 | This project focuses on modifications to the MLton GC to support | |
9306 | real-time garbage collection. We will model the real-time GC on the | |
9307 | Schism RTGC. The first task will be to create a fixed size runtime | |
9308 | object representation. Large structures will need to be represented | |
9309 | as linked lists of fixed-size objects. Arrays and vectors will be | |
9310 | transformed into dense trees. Compaction and copying can therefore be | |
9311 | removed from the GC algorithms that MLton currently supports. Lastly, | |
9312 | the GC will be made concurrent, allowing for the execution of the GC | |
9313 | threads as the lowest priority task in the system. Stretch goals | |
9314 | include a priority aware mechanism for the GC to signal to real-time | |
9315 | ML threads that it needs to scan their stack and identification of | |
9316 | places where the stack is shallow to bound priority inversion during | |
9317 | this procedure. | |
9318 | ||
9319 | Recommended Skills: C programming experience; garbage collector experience a plus but not required | |
9320 | ||
9321 | ///// | |
9322 | Mentor: http://www.cse.buffalo.edu/%7Elziarek/[Lukasz (Luke) Ziarek] | |
9323 | ///// | |
9324 | ||
9325 | ///// | |
9326 | === Concurrent{nbsp}ML Improvements === | |
9327 | ||
9328 | http://cml.cs.uchicago.edu/[Concurrent ML] is an SML concurrency | |
9329 | library based on synchronous message passing. MLton has a partial | |
9330 | implementation of the CML message-passing primitives, but its use in | |
9331 | real-world applications has been stymied by the lack of completeness | |
9332 | and thread-safe I/O libraries. This project would aim to flesh out | |
9333 | the CML implementation in MLton to be fully compatible with the | |
9334 | "official" version distributed as part of SML/NJ. Furthermore, time | |
9335 | permitting, runtime system support could be added to allow use of | |
9336 | modern OS features, such as asynchronous I/O, in the implementation of | |
9337 | CML's system interfaces. | |
9338 | ||
9339 | Background | |
9340 | -- | |
9341 | * http://cml.cs.uchicago.edu/ | |
9342 | * http://mlton.org/ConcurrentML | |
9343 | * http://mlton.org/ConcurrentMLImplementation | |
9344 | -- | |
9345 | ||
9346 | Recommended Skills: SML programming experience; knowledge of concurrent programming; some operating systems and/or systems programming experience | |
9347 | ||
9348 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
9349 | Mentor: http://www.cs.rit.edu/%7Emtf[Matthew Fluet] | |
9350 | ///// | |
9351 | ||
9352 | ///// | |
9353 | === SML3d Development === | |
9354 | ||
9355 | The SML3d Project is a collection of libraries to support 3D graphics | |
9356 | programming using Standard ML and the http://opengl.org/[OpenGL] | |
9357 | graphics API. It currently requires the MLton implementation of SML | |
9358 | and is supported on Linux, Mac OS X, and Microsoft Windows. There is | |
9359 | also support for http://www.khronos.org/opencl/[OpenCL]. This project | |
9360 | aims to continue development of the SML3d Project. | |
9361 | ||
9362 | Background | |
9363 | -- | |
9364 | * http://sml3d.cs.uchicago.edu/ | |
9365 | -- | |
9366 | ||
9367 | Mentor: http://people.cs.uchicago.edu/~jhr/[John Reppy] | |
9368 | ///// | |
9369 | ||
9370 | <<< | |
9371 | ||
9372 | :mlton-guide-page: HaMLet | |
9373 | [[HaMLet]] | |
9374 | HaMLet | |
9375 | ====== | |
9376 | ||
9377 | http://www.mpi-sws.org/~rossberg/hamlet/[HaMLet] is a | |
9378 | <:StandardMLImplementations:Standard ML implementation>. It is | |
9379 | intended as a reference implementation of | |
9380 | <:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and | |
9381 | not for serious practical work. | |
9382 | ||
9383 | <<< | |
9384 | ||
9385 | :mlton-guide-page: HenryCejtin | |
9386 | [[HenryCejtin]] | |
9387 | HenryCejtin | |
9388 | =========== | |
9389 | ||
9390 | I was one of the original developers of Mathematica (actually employee #1). | |
9391 | My background is a combination of mathematics and computer science. | |
9392 | Currently I am doing various things in Chicago. | |
9393 | ||
9394 | <<< | |
9395 | ||
9396 | :mlton-guide-page: History | |
9397 | [[History]] | |
9398 | History | |
9399 | ======= | |
9400 | ||
9401 | In April 1997, Stephen Weeks wrote a defunctorizer for Standard ML and | |
9402 | integrated it with SML/NJ. The defunctorizer used SML/NJ's visible | |
9403 | compiler and operated on the `Ast` intermediate representation | |
9404 | produced by the SML/NJ front end. Experiments showed that | |
9405 | defunctorization gave a speedup of up to six times over separate | |
9406 | compilation and up to two times over batch compilation without functor | |
9407 | expansion. | |
9408 | ||
9409 | In August 1997, we began development of an independent compiler for | |
9410 | SML. At the time the compiler was called `smlc`. By October, we had | |
9411 | a working monomorphiser. By November, we added a polyvariant | |
9412 | higher-order control-flow analysis. At that point, MLton was about | |
9413 | 10,000 lines of code. | |
9414 | ||
9415 | Over the next year and a half, `smlc` morphed into a full-fledged | |
9416 | compiler for SML. It was renamed MLton, and first released in March | |
9417 | 1999. | |
9418 | ||
9419 | From the start, MLton has been driven by whole-program optimization | |
9420 | and an emphasis on performance. Also from the start, MLton has had a | |
9421 | fast C FFI and `IntInf` based on the GNU multiprecision library. At | |
9422 | its first release, MLton was 48,006 lines. | |
9423 | ||
9424 | Between March 1999 and January 2002, MLton grew to 102,541 lines, | |
9425 | as we added a native code generator, mllex, mlyacc, a profiler, many | |
9426 | optimizations, and many libraries including threads and signal | |
9427 | handling. | |
9428 | ||
9429 | During 2002, MLton grew to 112,204 lines and we had releases in April | |
9430 | and September. We added support for cross compilation and used this | |
9431 | to enable MLton to run on Cygwin/Windows and FreeBSD. We also made | |
9432 | improvements to the garbage collector, so that it now works with large | |
9433 | arrays and up to 4G of memory and so that it automatically uses | |
9434 | copying, mark-compact, or generational collection depending on heap | |
9435 | usage and RAM size. We also continued improvements to the optimizer | |
9436 | and libraries. | |
9437 | ||
9438 | During 2003, MLton grew to 122,299 lines and we had releases in March | |
9439 | and July. We extended the profiler to support source-level profiling | |
9440 | of time and allocation and to display call graphs. We completed the | |
9441 | Basis Library implementation, and added new MLton-specific libraries | |
9442 | for weak pointers and finalization. We extended the FFI to allow | |
9443 | callbacks from C to SML. We added support for the Sparc/Solaris | |
9444 | platform, and made many improvements to the C code generator. | |
9445 | ||
9446 | <<< | |
9447 | ||
9448 | :mlton-guide-page: HowProfilingWorks | |
9449 | [[HowProfilingWorks]] | |
9450 | HowProfilingWorks | |
9451 | ================= | |
9452 | ||
9453 | Here's how <:Profiling:> works. If profiling is on, the front end | |
9454 | (elaborator) inserts `Enter` and `Leave` statements into the source | |
9455 | program for function entry and exit. For example, | |
9456 | [source,sml] | |
9457 | ---- | |
9458 | fun f n = if n = 0 then 0 else 1 + f (n - 1) | |
9459 | ---- | |
9460 | becomes | |
9461 | [source,sml] | |
9462 | ---- | |
9463 | fun f n = | |
9464 | let | |
9465 | val () = Enter "f" | |
9466 | val res = (if n = 0 then 0 else 1 + f (n - 1)) | |
9467 | handle e => (Leave "f"; raise e) | |
9468 | val () = Leave "f" | |
9469 | in | |
9470 | res | |
9471 | end | |
9472 | ---- | |
9473 | ||
9474 | Actually there is a bit more information than just the source function | |
9475 | name; there is also lexical nesting and file position. | |
9476 | ||
9477 | Most of the middle of the compiler ignores, but preserves, `Enter` and | |
9478 | `Leave`. However, so that profiling preserves tail calls, the | |
9479 | <:Shrink:SSA shrinker> has an optimization that notices when the only | |
9480 | operations that cause a call to be a nontail call are profiling | |
9481 | operations, and if so, moves them before the call, turning it into a | |
9482 | tail call. If you observe a program that has a tail call that appears | |
9483 | to be turned into a nontail call when compiled with profiling, please | |
9484 | <:Bug:report a bug>. | |
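
To illustrate the tail-call issue, here is a source-level sketch. The real transformation happens on the SSA IL, not on SML source; `Enter` and `Leave` below are hypothetical logging stand-ins, and the exception handler from the earlier example is omitted for brevity.

[source,sml]
----
(* Hypothetical stand-ins for the profiling statements: they only log. *)
val trace : string list ref = ref []
fun Enter name = trace := ("Enter " ^ name) :: !trace
fun Leave name = trace := ("Leave " ^ name) :: !trace

(* Naive instrumentation: Leave follows the recursive call, so the
   call is no longer in tail position. *)
fun sumNaive (n, acc) =
   let
      val () = Enter "sum"
      val res = if n = 0 then acc else sumNaive (n - 1, acc + n)
      val () = Leave "sum"
   in
      res
   end

(* After moving the profiling operation: Leave precedes the call,
   which is a tail call again. *)
fun sumTail (n, acc) =
   let
      val () = Enter "sum"
   in
      if n = 0
         then (Leave "sum"; acc)
      else (Leave "sum"; sumTail (n - 1, acc + n))
   end
----

Both functions compute the same result; only the second runs in constant stack space.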
9485 | ||
9486 | There is the `checkProf` function in | |
9487 | <!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>, which checks that | |
9488 | the `Enter`/`Leave` statements match up. | |
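
The matching property can be pictured as a stack discipline over a straight-line sequence of statements. This is a simplified illustration and not the actual `checkProf`, which works over the SSA program rather than a flat list; the `stmt` datatype and the `balanced` function are hypothetical.

[source,sml]
----
(* Hypothetical model of profiling statements. *)
datatype stmt = Enter of string | Leave of string

(* True if every Leave matches the most recent unmatched Enter and no
   Enter is left open at the end. *)
fun balanced (stmts : stmt list) : bool =
   let
      fun go ([], stack) = null stack
        | go (Enter f :: rest, stack) = go (rest, f :: stack)
        | go (Leave f :: rest, g :: stack) = f = g andalso go (rest, stack)
        | go (Leave _ :: _, []) = false
   in
      go (stmts, [])
   end
----

For instance, `[Enter "f", Enter "g", Leave "g", Leave "f"]` is balanced, while `[Enter "f", Leave "g"]` is not.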
9489 | ||
9490 | In the backend, just before translating to the <:Machine: Machine IL>, | |
9491 | the profiler uses the `Enter`/`Leave` statements to infer the "local" | |
9492 | portion of the control stack at each program point. The profiler then | |
9493 | removes the ++Enter++s/++Leave++s and inserts different information | |
9494 | depending on which kind of profiling is happening. For time profiling | |
9495 | (with the <:AMD64Codegen:> and <:X86Codegen:>), the profiler inserts labels that cover the | |
9496 | code (i.e., each statement has a unique label in its basic block that | |
9497 | prefixes it) and associates each label with the local control stack. | |
9498 | For time profiling (with the <:CCodegen:> and <:LLVMCodegen:>), the profiler | |
9499 | inserts code that sets a global field that records the local control | |
9500 | stack. For allocation profiling, the profiler inserts calls to a C | |
9501 | function that will maintain byte counts. With stack profiling, the | |
9502 | profiler also inserts a call to a C function at each nontail call in | |
9503 | order to maintain information at runtime about what SML functions are | |
9504 | on the stack. | |
9505 | ||
9506 | At run time, the profiler associates counters (either clock ticks or | |
9507 | byte counts) with source functions. When the program finishes, the | |
9508 | profiler writes the counts out to the `mlmon.out` file. Then, | |
9509 | `mlprof` uses source information stored in the executable to | |
9510 | associate the counts in the `mlmon.out` file with source | |
9511 | functions. | |
9512 | ||
9513 | For time profiling, the profiler catches the `SIGPROF` signal 100 | |
9514 | times per second and increments the appropriate counter, determined by | |
9515 | looking at the label prefixing the current program counter and mapping | |
9516 | that to the current source function. | |
9517 | ||
9518 | == Caveats == | |
9519 | ||
9520 | There may be a few missed clock ticks or bytes allocated at the very | |
9521 | end of the program after the data is written. | |
9522 | ||
9523 | Profiling has not been tested with signals or threads. In particular, | |
9524 | stack profiling may behave strangely. | |
9525 | ||
9526 | <<< | |
9527 | ||
9528 | :mlton-guide-page: Identifier | |
9529 | [[Identifier]] | |
9530 | Identifier | |
9531 | ========== | |
9532 | ||
9533 | In <:StandardML:Standard ML>, there are syntactically two kinds of | |
9534 | identifiers. | |
9535 | ||
9536 | * Alphanumeric: starts with a letter or prime (`'`) and is followed by letters, digits, primes and underbars (`_`). | |
9537 | + | |
9538 | Examples: `abc`, `ABC123`, `Abc_123`, `'a`. | |
9539 | ||
9540 | * Symbolic: a sequence of the following | |
9541 | + | |
9542 | ---- | |
9543 | ! % & $ # + - / : < = > ? @ \ ~ ` ^ | * | |
9544 | ---- | |
9545 | + | |
9546 | Examples: `+=`, `<=`, `>>`, `$`. | |
9547 | ||
9548 | With the exception of `=`, reserved words cannot be identifiers. | |
9549 | ||
9550 | There are a number of different classes of identifiers, some of which | |
9551 | have additional syntactic rules. | |
9552 | ||
9553 | * Identifiers not starting with a prime. | |
9554 | ** value identifier (includes variables and constructors) | |
9555 | ** type constructor | |
9556 | ** structure identifier | |
9557 | ** signature identifier | |
9558 | ** functor identifier | |
9559 | * Identifiers starting with a prime. | |
9560 | ** type variable | |
9561 | * Identifiers not starting with a prime and numeric labels (`1`, `2`, ...). | |
9562 | ** record label | |
9563 | ||
9564 | <<< | |
9565 | ||
9566 | :mlton-guide-page: Immutable | |
9567 | [[Immutable]] | |
9568 | Immutable | |
9569 | ========= | |
9570 | ||
9571 | Immutable is the opposite of <:Mutable:mutable>: an adjective meaning | |
9572 | "cannot be modified". Most values in <:StandardML:Standard ML> are | |
9573 | immutable. For example, constants, tuples, records, lists, and | |
9574 | vectors are all immutable. | |
9575 | ||
9576 | <<< | |
9577 | ||
9578 | :mlton-guide-page: ImperativeTypeVariable | |
9579 | [[ImperativeTypeVariable]] | |
9580 | ImperativeTypeVariable | |
9581 | ====================== | |
9582 | ||
9583 | In <:StandardML:Standard ML>, an imperative type variable is a type | |
9584 | variable whose second character is a digit, as in `'1a` or | |
9585 | `'2b`. Imperative type variables were used as an alternative to | |
9586 | the <:ValueRestriction:> in an earlier version of SML, but no longer play | |
9587 | a role. They are treated exactly as other type variables. | |
9588 | ||
9589 | <<< | |
9590 | ||
9591 | :mlton-guide-page: ImplementExceptions | |
9592 | [[ImplementExceptions]] | |
9593 | ImplementExceptions | |
9594 | =================== | |
9595 | ||
9596 | <:ImplementExceptions:> is a pass for the <:SXML:> | |
9597 | <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>. | |
9598 | ||
9599 | == Description == | |
9600 | ||
9601 | This pass implements exceptions. | |
9602 | ||
9603 | == Implementation == | |
9604 | ||
9605 | * <!ViewGitFile(mlton,master,mlton/xml/implement-exceptions.fun)> | |
9606 | ||
9607 | == Details and Notes == | |
9608 | ||
9609 | {empty} | |
9610 | ||
9611 | <<< | |
9612 | ||
9613 | :mlton-guide-page: ImplementHandlers | |
9614 | [[ImplementHandlers]] | |
9615 | ImplementHandlers | |
9616 | ================= | |
9617 | ||
9618 | <:ImplementHandlers:> is a pass for the <:RSSA:> | |
9619 | <:IntermediateLanguage:>, invoked from <:RSSASimplify:>. | |
9620 | ||
9621 | == Description == | |
9622 | ||
9623 | This pass implements the (threaded) exception handler stack. | |
9624 | ||
9625 | == Implementation == | |
9626 | ||
9627 | * <!ViewGitFile(mlton,master,mlton/backend/implement-handlers.fun)> | |
9628 | ||
9629 | == Details and Notes == | |
9630 | ||
9631 | {empty} | |
9632 | ||
9633 | <<< | |
9634 | ||
9635 | :mlton-guide-page: ImplementProfiling | |
9636 | [[ImplementProfiling]] | |
9637 | ImplementProfiling | |
9638 | ================== | |
9639 | ||
9640 | <:ImplementProfiling:> is a pass for the <:RSSA:> | |
9641 | <:IntermediateLanguage:>, invoked from <:RSSASimplify:>. | |
9642 | ||
9643 | == Description == | |
9644 | ||
9645 | This pass implements profiling. | |
9646 | ||
9647 | == Implementation == | |
9648 | ||
9649 | * <!ViewGitFile(mlton,master,mlton/backend/implement-profiling.fun)> | |
9650 | ||
9651 | == Details and Notes == | |
9652 | ||
9653 | See <:HowProfilingWorks:>. | |
9654 | ||
9655 | <<< | |
9656 | ||
9657 | :mlton-guide-page: ImplementSuffix | |
9658 | [[ImplementSuffix]] | |
9659 | ImplementSuffix | |
9660 | =============== | |
9661 | ||
9662 | <:ImplementSuffix:> is a pass for the <:SXML:> | |
9663 | <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>. | |
9664 | ||
9665 | == Description == | |
9666 | ||
9667 | This pass implements the `TopLevel_setSuffix` primitive, which | |
9668 | installs a function to exit the program. | |
9669 | ||
9670 | == Implementation == | |
9671 | ||
9672 | * <!ViewGitFile(mlton,master,mlton/xml/implement-suffix.fun)> | |
9673 | ||
9674 | == Details and Notes == | |
9675 | ||
9676 | <:ImplementSuffix:> works by introducing a new `ref` cell to contain | |
9677 | the function of type `unit -> unit` that should be called on program | |
9678 | exit. | |
9679 | ||
9680 | * The following code (appropriately alpha-converted) is prepended to the <:SXML:> program: | |
9681 | + | |
9682 | [source,sml] | |
9683 | ---- | |
9684 | val z_0 = | |
9685 | fn a_0 => | |
9686 | let | |
9687 | val x_0 = | |
9688 | "toplevel suffix not installed" | |
9689 | val x_1 = | |
9690 | MLton_bug (x_0) | |
9691 | in | |
9692 | x_1 | |
9693 | end | |
9694 | val topLevelSuffixCell = | |
9695 | Ref_ref (z_0) | |
9696 | ---- | |
9697 | ||
9698 | * Any occurrence of | |
9699 | + | |
9700 | [source,sml] | |
9701 | ---- | |
9702 | val x_0 = | |
9703 | TopLevel_setSuffix (f_0) | |
9704 | ---- | |
9705 | + | |
9706 | is rewritten to | |
9707 | + | |
9708 | [source,sml] | |
9709 | ---- | |
9710 | val x_0 = | |
9711 | Ref_assign (topLevelSuffixCell, f_0) | |
9712 | ---- | |
9713 | ||
9714 | * The following code (appropriately alpha-converted) is appended to the end of the <:SXML:> program: | |
9715 | + | |
9716 | [source,sml] | |
9717 | ---- | |
9718 | val f_0 = | |
9719 | Ref_deref (topLevelSuffixCell) | |
9720 | val z_0 = | |
9721 | () | |
9722 | val x_0 = | |
9723 | f_0 z_0 | |
9724 | ---- | |
9725 | ||
9726 | <<< | |
9727 | ||
9728 | :mlton-guide-page: InfixingOperators | |
9729 | [[InfixingOperators]] | |
9730 | InfixingOperators | |
9731 | ================= | |
9732 | ||
9733 | Fixity specifications are not part of signatures in | |
9734 | <:StandardML:Standard ML>. When one wants to use a module that | |
9735 | provides functions designed to be used as infix operators there are | |
9736 | several obvious alternatives: | |
9737 | ||
9738 | * Use only prefix applications. Unfortunately there are situations | |
9739 | where infix applications lead to considerably more readable code. | |
9740 | ||
9741 | * Make the fixity declarations at the top-level. This may lead to | |
9742 | collisions and may be unsustainable in a large project. Pollution of | |
9743 | the top-level should be avoided. | |
9744 | ||
9745 | * Make the fixity declarations at each scope where you want to use | |
9746 | infix applications. The duplication becomes inconvenient if the | |
9747 | operators are widely used. Duplication of code should be avoided. | |
9748 | ||
9749 | * Use non-standard extensions, such as the <:MLBasis: ML Basis system> | |
9750 | to control the scope of fixity declarations. This has the obvious | |
9751 | drawback of reduced portability. | |
9752 | ||
9753 | * Reuse existing infix operator symbols (`^`, `+`, `-`, ...). This | |
9754 | can be convenient when the standard operators aren't needed in the | |
9755 | same scope with the new operators. On the other hand, one is limited | |
9756 | to the standard operator symbols and the code may appear confusing. | |
9757 | ||
9758 | None of the obvious alternatives is best in every case. The following | |
9759 | describes a slightly less obvious alternative that can sometimes be | |
9760 | useful. The idea is to approximate Haskell's special syntax for | |
9761 | treating any identifier enclosed in grave accents (backquotes) as an | |
9762 | infix operator. In Haskell, instead of writing the prefix application | |
9763 | `f x y` one can write the infix application ++x `f` y++. | |
9764 | ||
9765 | ||
9766 | == Infixing operators == | |
9767 | ||
9768 | Let's first take a look at the definitions of the operators: | |
9769 | ||
9770 | [source,sml] | |
9771 | ---- | |
9772 | infix 3 <\ fun x <\ f = fn y => f (x, y) (* Left section *) | |
9773 | infix 3 \> fun f \> y = f y (* Left application *) | |
9774 | infixr 3 /> fun f /> y = fn x => f (x, y) (* Right section *) | |
9775 | infixr 3 </ fun x </ f = f x (* Right application *) | |
9776 | ||
9777 | infix 2 o (* See motivation below *) | |
9778 | infix 0 := | |
9779 | ---- | |
9780 | ||
9781 | The left and right sectioning operators, `<\` and `/>`, are useful in | |
9782 | SML for partial application of infix operators. | |
9783 | <!Cite(Paulson96, ML For the Working Programmer)> describes curried | |
9784 | functions `secl` and `secr` for the same purpose on pages 179-181. | |
9785 | For example, | |
9786 | ||
9787 | [source,sml] | |
9788 | ---- | |
9789 | List.map (op- /> y) | |
9790 | ---- | |
9791 | ||
9792 | is a function for subtracting `y` from a list of integers and | |
9793 | ||
9794 | [source,sml] | |
9795 | ---- | |
9796 | List.exists (x <\ op=) | |
9797 | ---- | |
9798 | ||
9799 | is a function for testing whether a list contains an `x`. | |
9800 | ||
9801 | Together with the left and right application operators, `\>` and `</`, | |
9802 | the sectioning operators provide a way to treat any binary function | |
9803 | (i.e. a function whose domain is a pair) as an infix operator. In | |
9804 | general, | |
9805 | ||
9806 | ---- | |
9807 | x0 <\f1\> x1 <\f2\> x2 ... <\fN\> xN = fN (... f2 (f1 (x0, x1), x2) ..., xN) | |
9808 | ---- | |
9809 | ||
9810 | and | |
9811 | ||
9812 | ---- | |
9813 | xN </fN/> ... x2 </f2/> x1 </f1/> x0 = fN (xN, ... f2 (x2, f1 (x1, x0)) ...) | |
9814 | ---- | |
9815 | ||
9816 | ||
9817 | === Examples === | |
9818 | ||
9819 | As a fairly realistic example, consider providing a function for sequencing | |
9820 | comparisons: | |
9821 | ||
9822 | [source,sml] | |
9823 | ---- | |
9824 | structure Order (* ... *) = | |
9825 | struct | |
9826 | (* ... *) | |
9827 | val orWhenEq = fn (EQUAL, th) => th () | |
9828 | | (other, _) => other | |
9829 | (* ... *) | |
9830 | end | |
9831 | ---- | |
9832 | Using `orWhenEq` and the infixing operators, one can write a | |
9833 | `compare` function for triples as | |
9834 | ||
9835 | [source,sml] | |
9836 | ---- | |
9837 | fun compare (fad, fbe, fcf) ((a, b, c), (d, e, f)) = | |
9838 | fad (a, d) <\Order.orWhenEq\> `fbe (b, e) <\Order.orWhenEq\> `fcf (c, f) | |
9839 | ---- | |
9840 | ||
9841 | where +`+ is defined as | |
9842 | ||
9843 | [source,sml] | |
9844 | ---- | |
9845 | fun `f x = fn () => f x | |
9846 | ---- | |
9847 | ||
9848 | Although `orWhenEq` can be convenient (try rewriting the above without | |
9849 | it), it is probably not useful enough to be defined at the top level | |
9850 | as an infix operator. Fortunately we can use the infixing operators | |
9851 | and don't have to. | |
9852 | ||
9853 | Another fairly realistic example would be to use the infixing operators with | |
9854 | the technique described on the <:Printf:> page. Assuming that you would have | |
9855 | a `Printf` module binding `printf`, +`+, and formatting combinators | |
9856 | named `int` and `string`, you could write | |
9857 | ||
9858 | [source,sml] | |
9859 | ---- | |
9860 | let open Printf in | |
9861 | printf (`"Here's an int "<\int\>" and a string "<\string\>".") 13 "foo" end | |
9862 | ---- | |
9863 | ||
9864 | without having to duplicate the fixity declarations. Alternatively, you could | |
9865 | write | |
9866 | ||
9867 | [source,sml] | |
9868 | ---- | |
9869 | P.printf (P.`"Here's an int "<\P.int\>" and a string "<\P.string\>".") 13 "foo" | |
9870 | ---- | |
9871 | ||
9872 | assuming you have made the binding | |
9873 | ||
9874 | [source,sml] | |
9875 | ---- | |
9876 | structure P = Printf | |
9877 | ---- | |
9878 | ||
9879 | ||
9880 | == Application and piping operators == | |
9881 | ||
9882 | The left and right application operators may also provide some notational | |
9883 | convenience on their own. In general, | |
9884 | ||
9885 | ---- | |
9886 | f \> x1 \> ... \> xN = f x1 ... xN | |
9887 | ---- | |
9888 | ||
9889 | and | |
9890 | ||
9891 | ---- | |
9892 | xN </ ... </ x1 </ f = f x1 ... xN | |
9893 | ---- | |
9894 | ||
9895 | If nothing else, both of them can eliminate parentheses. For example, | |
9896 | ||
9897 | [source,sml] | |
9898 | ---- | |
9899 | foo (1 + 2) = foo \> 1 + 2 | |
9900 | ---- | |
9901 | ||
9902 | The left and right application operators are related to operators | |
9903 | that could be described as the right and left piping operators: | |
9904 | ||
9905 | [source,sml] | |
9906 | ---- | |
9907 | infix 1 >| val op>| = op</ (* Left pipe *) | |
9908 | infixr 1 |< val op|< = op\> (* Right pipe *) | |
9909 | ---- | |
9910 | ||
9911 | As you can see, the left and right piping operators, `>|` and `|<`, | |
9912 | are the same as the right and left application operators, | |
9913 | respectively, except the associativities are reversed and the binding | |
9914 | strength is lower. They are useful for piping data through a sequence | |
9915 | of operations. In general, | |
9916 | ||
9917 | ---- | |
9918 | x >| f1 >| ... >| fN = fN (... (f1 x) ...) = (fN o ... o f1) x | |
9919 | ---- | |
9920 | ||
9921 | and | |
9922 | ||
9923 | ---- | |
9924 | fN |< ... |< f1 |< x = fN (... (f1 x) ...) = (fN o ... o f1) x | |
9925 | ---- | |
9926 | ||
9927 | The right piping operator, `|<`, is provided by the Haskell prelude as | |
9928 | `$`. It can be convenient in CPS or continuation passing style. | |
9929 | ||
9930 | A use for the left piping operator is with parsing combinators. In a | |
9931 | strict language, like SML, eta-reduction is generally unsafe. Using | |
9932 | the left piping operator, parsing functions can be formatted | |
9933 | conveniently as | |
9934 | ||
9935 | [source,sml] | |
9936 | ---- | |
9937 | fun parsingFunc input = | |
9938 | input >| (* ... *) | |
9939 | || (* ... *) | |
9940 | || (* ... *) | |
9941 | ---- | |
9942 | ||
9943 | where `||` is supposed to be a combinator provided by the parsing combinator | |
9944 | library. | |
9945 | ||
9946 | ||
9947 | == About precedences == | |
9948 | ||
9949 | You probably noticed that we redefined the | |
9950 | <:OperatorPrecedence:precedences> of the function composition operator | |
9951 | `o` and the assignment operator `:=`. Doing so is not strictly | |
9952 | necessary, but can be convenient and should be relatively | |
9953 | safe. Consider the following motivating examples from | |
9954 | <:WesleyTerpstra: Wesley W. Terpstra> relying on the redefined | |
9955 | precedences: | |
9956 | ||
9957 | [source,sml] | |
9958 | ---- | |
9959 | Word8.fromInt o Char.ord o s <\String.sub | |
9960 | (* Combining sectioning and composition *) | |
9961 | ||
9962 | x := s <\String.sub\> i | |
9963 | (* Assigning the result of an infixed application *) | |
9964 | ---- | |
9965 | ||
9966 | In imperative languages, assignment usually has the lowest precedence | |
9967 | (ignoring statement separators). The precedence of `:=` in the | |
9968 | <:BasisLibrary: Basis Library> is perhaps unnecessarily high, because | |
9969 | an expression of the form `r := x` always returns a unit, which makes | |
9970 | little sense to combine with anything. Dropping `:=` to the lowest | |
9971 | precedence level makes it behave more like in other imperative | |
9972 | languages. | |
9973 | ||
9974 | The case for `o` is different. With the exception of `before` and | |
9975 | `:=`, it doesn't seem to make much sense to use `o` with any of the | |
9976 | operators defined by the <:BasisLibrary: Basis Library> in an | |
9977 | unparenthesized expression. This is simply because none of the other | |
9978 | operators deal with functions. It would seem that the precedence of | |
9979 | `o` could be chosen completely arbitrarily from the set `{1, ..., 9}` | |
9980 | without having any adverse effects with respect to other infix | |
9981 | operators defined by the <:BasisLibrary: Basis Library>. | |
9982 | ||
9983 | ||
9984 | == Design of the symbols == | |
9985 | ||
9986 | The closest approximation of Haskell's ++x `f` y++ syntax | |
9987 | achievable in Standard ML would probably be something like | |
9988 | ++x `f^ y++, but `^` is already used for string | |
9989 | concatenation by the <:BasisLibrary: Basis Library>. Other | |
9990 | combinations of the characters +`+ and `^` would be | |
9991 | possible, but none seems clearly the best visually. The symbols `<\`, | |
9992 | `\>`, `</`, and `/>` are reasonably concise and have a certain | |
9993 | self-documenting appearance and symmetry, which can help to remember | |
9994 | them. As the names suggest, the symbols of the piping operators `>|` | |
9995 | and `|<` are inspired by Unix shell pipelines. | |
9996 | ||
9997 | ||
9998 | == Also see == | |
9999 | ||
10000 | * <:Utilities:> | |
10001 | ||
10002 | <<< | |
10003 | ||
10004 | :mlton-guide-page: Inline | |
10005 | [[Inline]] | |
10006 | Inline | |
10007 | ====== | |
10008 | ||
10009 | <:Inline:> is an optimization pass for the <:SSA:> | |
10010 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10011 | ||
10012 | == Description == | |
10013 | ||
10014 | This pass inlines <:SSA:> functions using a size-based metric. | |
10015 | ||
10016 | == Implementation == | |
10017 | ||
10018 | * <!ViewGitFile(mlton,master,mlton/ssa/inline.sig)> | |
10019 | * <!ViewGitFile(mlton,master,mlton/ssa/inline.fun)> | |
10020 | ||
10021 | == Details and Notes == | |
10022 | ||
10023 | The <:Inline:> pass can be invoked to use one of three metrics: | |
10024 | ||
10025 | * `NonRecursive(product, small)` -- inline any function satisfying `(numCalls - 1) * (size - small) <= product`, where `numCalls` is the static number of calls to the function and `size` is the size of the function. | |
10026 | * `Leaf(size)` -- inline any leaf function smaller than `size` | |
10027 | * `LeafNoLoop(size)` -- inline any leaf function without loops smaller than `size` | |
10028 | ||
10029 | <<< | |
10030 | ||
10031 | :mlton-guide-page: InsertLimitChecks | |
10032 | [[InsertLimitChecks]] | |
10033 | InsertLimitChecks | |
10034 | ================= | |
10035 | ||
10036 | <:InsertLimitChecks:> is a pass for the <:RSSA:> | |
10037 | <:IntermediateLanguage:>, invoked from <:RSSASimplify:>. | |
10038 | ||
10039 | == Description == | |
10040 | ||
10041 | This pass inserts limit checks. | |
10042 | ||
10043 | == Implementation == | |
10044 | ||
10045 | * <!ViewGitFile(mlton,master,mlton/backend/limit-check.fun)> | |
10046 | ||
10047 | == Details and Notes == | |
10048 | ||
10049 | {empty} | |
10050 | ||
10051 | <<< | |
10052 | ||
10053 | :mlton-guide-page: InsertSignalChecks | |
10054 | [[InsertSignalChecks]] | |
10055 | InsertSignalChecks | |
10056 | ================== | |
10057 | ||
10058 | <:InsertSignalChecks:> is a pass for the <:RSSA:> | |
10059 | <:IntermediateLanguage:>, invoked from <:RSSASimplify:>. | |
10060 | ||
10061 | == Description == | |
10062 | ||
10063 | This pass inserts signal checks. | |
10064 | ||
10065 | == Implementation == | |
10066 | ||
10067 | * <!ViewGitFile(mlton,master,mlton/backend/signal-check.fun)> | |
10068 | ||
10069 | == Details and Notes == | |
10070 | ||
10071 | {empty} | |
10072 | ||
10073 | <<< | |
10074 | ||
10075 | :mlton-guide-page: Installation | |
10076 | [[Installation]] | |
10077 | Installation | |
10078 | ============ | |
10079 | ||
10080 | MLton runs on a variety of platforms and is distributed in both source and | |
10081 | binary form. | |
10082 | ||
10083 | A `.tgz` or `.tbz` binary package can be extracted at any location, yielding | |
10084 | `README.adoc` (this file), `CHANGELOG.adoc`, `LICENSE`, `Makefile`, `bin/`, | |
10085 | `lib/`, and `share/`. The compiler and tools can be executed in-place (e.g., | |
10086 | `./bin/mlton`). | |
10087 | ||
10088 | A small set of `Makefile` variables can be used to customize the binary package | |
10089 | via `make update`: | |
10090 | ||
10091 | * `CC`: Specify C compiler. Can be used for alternative tools (e.g., | |
10092 | `CC=clang` or `CC=gcc-7`). | |
10093 | * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include | |
10094 | and library paths, if not on default search paths. (If `WITH_GMP_DIR` is | |
10095 | set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and | |
10096 | `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.) | |
10097 | ||
10098 | For example: | |
10099 | ||
10100 | [source,shell] | |
10101 | ---- | |
10102 | $ make CC=clang WITH_GMP_DIR=/opt/gmp update | |
10103 | ---- | |
10104 | ||
10105 | On typical platforms, installing MLton (after optionally performing | |
10106 | `make update`) to `/usr/local` can be accomplished via: | |
10107 | ||
10108 | [source,shell] | |
10109 | ---- | |
10110 | $ make install | |
10111 | ---- | |
10112 | ||
10113 | A small set of `Makefile` variables can be used to customize the installation: | |
10114 | ||
10115 | * `PREFIX`: Specify the installation prefix. | |
10116 | * `CC`: Specify C compiler. Can be used for alternative tools (e.g., | |
10117 | `CC=clang` or `CC=gcc-7`). | |
10118 | * `WITH_GMP_DIR`, `WITH_GMP_INC_DIR`, `WITH_GMP_LIB_DIR`: Specify GMP include | |
10119 | and library paths, if not on default search paths. (If `WITH_GMP_DIR` is | |
10120 | set, then `WITH_GMP_INC_DIR` defaults to `$(WITH_GMP_DIR)/include` and | |
10121 | `WITH_GMP_LIB_DIR` defaults to `$(WITH_GMP_DIR)/lib`.) | |
10122 | ||
10123 | For example: | |
10124 | ||
10125 | [source,shell] | |
10126 | ---- | |
10127 | $ make PREFIX=/opt/mlton install | |
10128 | ---- | |
10129 | ||
10130 | Installation of MLton creates the following files and directories. | |
10131 | ||
10132 | * ++__prefix__/bin/mllex++ | |
10133 | + | |
10134 | The <:MLLex:> lexer generator. | |
10135 | ||
10136 | * ++__prefix__/bin/mlnlffigen++ | |
10137 | + | |
10138 | The <:MLNLFFI:ML-NLFFI> tool. | |
10139 | ||
10140 | * ++__prefix__/bin/mlprof++ | |
10141 | + | |
10142 | A <:Profiling:> tool. | |
10143 | ||
10144 | * ++__prefix__/bin/mlton++ | |
10145 | + | |
10146 | A script to call the compiler. This script may be moved anywhere, | |
10147 | however, it makes use of files in ++__prefix__/lib/mlton++. | |
10148 | ||
10149 | * ++__prefix__/bin/mlyacc++ | |
10150 | + | |
10151 | The <:MLYacc:> parser generator. | |
10152 | ||
10153 | * ++__prefix__/lib/mlton++ | |
10154 | + | |
10155 | Directory containing libraries and include files needed during compilation. | |
10156 | ||
10157 | * ++__prefix__/share/man/man1/{mllex,mlnlffigen,mlprof,mlton,mlyacc}.1++ | |
10158 | + | |
10159 | Man pages. | |
10160 | ||
10161 | * ++__prefix__/share/doc/mlton++ | |
10162 | + | |
10163 | Directory containing the user guide for MLton, mllex, and mlyacc, as | |
10164 | well as example SML programs (in the `examples` directory), and license | |
10165 | information. | |
10166 | ||
10167 | ||
10168 | == Hello, World! == | |
10169 | ||
10170 | Once you have installed MLton, create a file called `hello-world.sml` | |
10171 | with the following contents. | |
10172 | ||
10173 | ---- | |
10174 | print "Hello, world!\n"; | |
10175 | ---- | |
10176 | ||
10177 | Now create an executable, `hello-world`, with the following command. | |
10178 | ---- | |
10179 | mlton hello-world.sml | |
10180 | ---- | |
10181 | ||
10182 | You can now run `hello-world` to verify that it works. There are more | |
10183 | small examples in ++__prefix__/share/doc/mlton/examples++. | |
10184 | ||
10185 | ||
10186 | == Installation on Cygwin == | |
10187 | ||
10188 | When installing the Cygwin `tgz`, you should use Cygwin's `bash` and | |
10189 | `tar`. The use of an archiving tool that is not aware of Cygwin's | |
10190 | mounts will put the files in the wrong place. | |
10191 | ||
10192 | <<< | |
10193 | ||
10194 | :mlton-guide-page: IntermediateLanguage | |
10195 | [[IntermediateLanguage]] | |
10196 | IntermediateLanguage | |
10197 | ==================== | |
10198 | ||
10199 | MLton uses a number of intermediate languages in translating from the input source program to low-level code. They are listed here in the order in which the program is translated through them. | |
10200 | ||
10201 | * <:AST:>. Pretty close to the source. | |
10202 | * <:CoreML:>. Explicitly typed, no module constructs. | |
10203 | * <:XML:>. Polymorphic, <:HigherOrder:>. | |
10204 | * <:SXML:>. SimplyTyped, <:HigherOrder:>. | |
10205 | * <:SSA:>. SimplyTyped, <:FirstOrder:>. | |
10206 | * <:SSA2:>. SimplyTyped, <:FirstOrder:>. | |
10207 | * <:RSSA:>. Explicit data representations. | |
10208 | * <:Machine:>. Untyped register transfer language. | |
10209 | ||
10210 | <<< | |
10211 | ||
10212 | :mlton-guide-page: IntroduceLoops | |
10213 | [[IntroduceLoops]] | |
10214 | IntroduceLoops | |
10215 | ============== | |
10216 | ||
10217 | <:IntroduceLoops:> is an optimization pass for the <:SSA:> | |
10218 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10219 | ||
10220 | == Description == | |
10221 | ||
10222 | This pass rewrites any <:SSA:> function that calls itself in tail | |
10223 | position into one with a local loop and no self tail calls. | |
10224 | ||
10225 | An <:SSA:> function like | |
10226 | ---- | |
10227 | fun F (arg_0, arg_1) = L_0 () | |
10228 | ... | |
10229 | L_16 (x_0) | |
10230 | ... | |
10231 | F (z_0, z_1) Tail | |
10232 | ... | |
10233 | ---- | |
10234 | becomes | |
10235 | ---- | |
10236 | fun F (arg_0', arg_1') = loopS_0 () | |
10237 | loopS_0 () | |
10238 | loop_0 (arg_0', arg_1') | |
10239 | loop_0 (arg_0, arg_1) | |
10240 | L_0 () | |
10241 | ... | |
10242 | L_16 (x_0) | |
10243 | ... | |
10244 | loop_0 (z_0, z_1) | |
10245 | ... | |
10246 | ---- | |
10247 | ||
10248 | == Implementation == | |
10249 | ||
10250 | * <!ViewGitFile(mlton,master,mlton/ssa/introduce-loops.fun)> | |
10251 | ||
10252 | == Details and Notes == | |
10253 | ||
10254 | {empty} | |
10255 | ||
10256 | <<< | |
10257 | ||
10258 | :mlton-guide-page: JesperLouisAndersen | |
10259 | [[JesperLouisAndersen]] | |
10260 | JesperLouisAndersen | |
10261 | =================== | |
10262 | ||
10263 | Jesper Louis Andersen is an undergraduate student at DIKU, the Department of Computer Science at the University of Copenhagen. His contributions to MLton are few, though he did port MLton to the NetBSD and OpenBSD platforms. | |
10264 | ||
10265 | His general interests in computer science are compiler theory, language theory, algorithms and data structures, and programming. His assets are his general knowledge of UNIX systems, system administration, and operating system kernels, NetBSD in particular. | |
10266 | ||
10267 | He was employed by the university as a system administrator for 2 years, which has set him back somewhat in his studies. Currently he is trying to learn mathematics (real analysis, general topology, complex functional analysis and algebra). | |
10268 | ||
10269 | ||
10270 | == Projects using MLton == | |
10271 | ||
10272 | === A register allocator === | |
10273 | For internal use at a compiler course at DIKU. It is written in the literate programming style and implements the _Iterated Register Coalescing_ algorithm by Lal George and Andrew Appel http://citeseer.ist.psu.edu/george96iterated.html. The project is unfinished: most of the basic parts of the algorithm are done, but the interface to the students' (simple) datatype takes some conversion. | |
10274 | ||
10275 | === A configuration management system in SML === | |
10276 | At this time, only loose plans exist for this. The plan is to build a Configuration Management system on the principles of the OpenCM system, see http://www.opencm.org/docs.html. The basic idea is to unify "naming" and "identity" into one by uniquely identifying all objects managed in the repository by the use of cryptographic checksums. This mantra guides the rest of the system, providing integrity, accessibility and confidentiality. | |
10277 | ||
10278 | <<< | |
10279 | ||
10280 | :mlton-guide-page: JohnnyAndersen | |
10281 | [[JohnnyAndersen]] | |
10282 | JohnnyAndersen | |
10283 | ============== | |
10284 | ||
10285 | Johnny Andersen (aka Anoq of the Sun) | |
10286 | ||
10287 | Here is a picture in front of the academy building | |
10288 | at the University of Athens, Greece, taken in September 2003. | |
10289 | ||
10290 | image::JohnnyAndersen.attachments/anoq.jpg[align="center"] | |
10291 | ||
10292 | <<< | |
10293 | ||
10294 | :mlton-guide-page: KnownCase | |
10295 | [[KnownCase]] | |
10296 | KnownCase | |
10297 | ========= | |
10298 | ||
10299 | <:KnownCase:> is an optimization pass for the <:SSA:> | |
10300 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10301 | ||
10302 | == Description == | |
10303 | ||
10304 | This pass duplicates and simplifies `Case` transfers when the | |
10305 | constructor of the scrutinee is known. | |
10306 | ||
10307 | Uses <:Restore:>. | |
10308 | ||
10309 | For example, the program | |
10310 | [source,sml] | |
10311 | ---- | |
10312 | val rec last = | |
10313 | fn [] => 0 | |
10314 | | [x] => x | |
10315 | | _ :: l => last l | |
10316 | ||
10317 | val _ = 1 + last [2, 3, 4, 5, 6, 7] | |
10318 | ---- | |
10319 | ||
10320 | gives rise to the <:SSA:> function | |
10321 | ||
10322 | ---- | |
10323 | fun last_0 (x_142) = loopS_1 () | |
10324 | loopS_1 () | |
10325 | loop_11 (x_142) | |
10326 | loop_11 (x_143) | |
10327 | case x_143 of | |
10328 | nil_1 => L_73 | ::_0 => L_74 | |
10329 | L_73 () | |
10330 | return global_5 | |
10331 | L_74 (x_145, x_144) | |
10332 | case x_145 of | |
10333 | nil_1 => L_75 | _ => L_76 | |
10334 | L_75 () | |
10335 | return x_144 | |
10336 | L_76 () | |
10337 | loop_11 (x_145) | |
10338 | ---- | |
10339 | ||
10340 | which is simplified to | |
10341 | ||
10342 | ---- | |
10343 | fun last_0 (x_142) = loopS_1 () | |
10344 | loopS_1 () | |
10345 | case x_142 of | |
10346 | nil_1 => L_73 | ::_0 => L_118 | |
10347 | L_73 () | |
10348 | return global_5 | |
10349 | L_118 (x_230, x_229) | |
10350 | L_74 (x_230, x_229, x_142) | |
10351 | L_74 (x_145, x_144, x_232) | |
10352 | case x_145 of | |
10353 | nil_1 => L_75 | ::_0 => L_114 | |
10354 | L_75 () | |
10355 | return x_144 | |
10356 | L_114 (x_227, x_226) | |
10357 | L_74 (x_227, x_226, x_145) | |
10358 | ---- | |
10359 | ||
10360 | == Implementation == | |
10361 | ||
10362 | * <!ViewGitFile(mlton,master,mlton/ssa/known-case.fun)> | |
10363 | ||
10364 | == Details and Notes == | |
10365 | ||
10366 | One interesting aspect of <:KnownCase:> is that it often has the | |
10367 | effect of unrolling list traversals by one iteration, moving the | |
10368 | `nil`/`::` check to the end of the loop, rather than the beginning. | |
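
At the source level, the transformed loop behaves roughly like the following hand-written version of `last`. This is only an illustrative sketch: the helper name `loop` is made up here, and MLton performs the transformation on <:SSA:> blocks, not on SML source.

[source,sml]
----
(* Hand-unrolled analogue of the simplified SSA above: the first
 * nil/:: test is peeled out of the loop, and each iteration then
 * tests the tail of the list it has already taken apart. *)
fun last [] = 0
  | last (x :: l) = loop (x, l)
and loop (x, []) = x
  | loop (_, y :: l) = loop (y, l)

val _ = 1 + last [2, 3, 4, 5, 6, 7]
----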
10369 | ||
10370 | <<< | |
10371 | ||
10372 | :mlton-guide-page: LambdaCalculus | |
10373 | [[LambdaCalculus]] | |
10374 | LambdaCalculus | |
10375 | ============== | |
10376 | ||
10377 | The http://en.wikipedia.org/wiki/Lambda_calculus[lambda calculus] is | |
10378 | the formal system underlying <:StandardML:Standard ML>. | |
10379 | ||
10380 | <<< | |
10381 | ||
10382 | :mlton-guide-page: LambdaFree | |
10383 | [[LambdaFree]] | |
10384 | LambdaFree | |
10385 | ========== | |
10386 | ||
10387 | <:LambdaFree:> is an analysis pass for the <:SXML:> | |
10388 | <:IntermediateLanguage:>, invoked from <:ClosureConvert:>. | |
10389 | ||
10390 | == Description == | |
10391 | ||
10392 | This pass descends the entire <:SXML:> program and attaches a property | |
10393 | to each `Lambda` `PrimExp.t` in the program. Then, you can use | |
10394 | `lambdaFree` and `lambdaRec` to get free variables of that `Lambda`. | |
10395 | ||
10396 | == Implementation == | |
10397 | ||
10398 | * <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.sig)> | |
10399 | * <!ViewGitFile(mlton,master,mlton/closure-convert/lambda-free.fun)> | |
10400 | ||
10401 | == Details and Notes == | |
10402 | ||
10403 | For `Lambda`-s bound in a `Fun` dec, `lambdaFree` gives the union of | |
10404 | the frees of the entire group of mutually recursive functions. Hence, | |
10405 | `lambdaFree` for every `Lambda` in a single `Fun` dec is the same. | |
10406 | Furthermore, for a `Lambda` bound in a `Fun` dec, `lambdaRec` gives | |
10407 | the list of functions bound in the same dec that the `Lambda` refers | |
10408 | to (possibly including the function it defines). | |
10409 | ||
10410 | For example: | |
10411 | ---- | |
10412 | val rec f = fn x => ... y ... g ... f ... | |
10413 | and g = fn z => ... f ... w ... | |
10414 | ---- | |
10415 | ||
10416 | ---- | |
10417 | lambdaFree(fn x =>) = [y, w] | |
10418 | lambdaFree(fn z =>) = [y, w] | |
10419 | lambdaRec(fn x =>) = [g, f] | |
10420 | lambdaRec(fn z =>) = [f] | |
10421 | ---- | |
10422 | ||
10423 | <<< | |
10424 | ||
10425 | :mlton-guide-page: LanguageChanges | |
10426 | [[LanguageChanges]] | |
10427 | LanguageChanges | |
10428 | =============== | |
10429 | ||
10430 | We are sometimes asked to modify MLton to change the language it | |
10431 | compiles. In short, we are conservative about making such changes. | |
10432 | There are a number of reasons for this. | |
10433 | ||
10434 | * <:DefinitionOfStandardML:The Definition of Standard ML> is an | |
10435 | extremely high standard of specification. The value of the Definition | |
10436 | would be significantly diluted by changes that are not specified at an | |
10437 | equally high level, and the dilution increases with the complexity of | |
10438 | the language change and its interaction with other language features. | |
10439 | ||
10440 | * The SML community is small and there are a number of | |
10441 | <:StandardMLImplementations:SML implementations>. Without an | |
10442 | agreed-upon standard, it becomes very difficult to port programs | |
10443 | between compilers, and the community would be balkanized. | |
10444 | ||
10445 | * Our main goal is to enable programmers to be as effective as | |
10446 | possible with MLton/SML. There are a number of improvements other | |
10447 | than language changes that we could spend our time on that would | |
10448 | provide more benefit to programmers. | |
10449 | ||
10450 | * The more the language that MLton compiles changes over time, the | |
10451 | more difficult it is to use MLton as a stable platform for serious | |
10452 | program development. | |
10453 | ||
10454 | Despite these drawbacks, we have extended SML in a few cases. | |
10455 | ||
10456 | * <:ForeignFunctionInterface: Foreign function interface> | |
10457 | * <:MLBasis: ML Basis system> | |
10458 | * <:SuccessorML: Successor ML features> | |
10459 | ||
10460 | We allow these language extensions because they provide functionality | |
10461 | that is impossible to achieve without them or have non-trivial | |
10462 | community support. The Definition does not define a foreign function | |
10463 | interface. So, we must either extend the language or greatly restrict | |
10464 | the class of programs that can be written. Similarly, the Definition | |
10465 | does not provide a mechanism for namespace control at the module | |
10466 | level, making it impossible to deliver packaged libraries and have a | |
10467 | hope of users using them without name clashes. The ML Basis system | |
10468 | addresses this problem. We have also provided a formal specification | |
10469 | of the ML Basis system at the level of the Definition. | |
10470 | ||
10471 | == Also see == | |
10472 | ||
10473 | * http://www.mlton.org/pipermail/mlton/2004-August/016165.html | |
10474 | * http://www.mlton.org/pipermail/mlton-user/2004-December/000320.html | |
10475 | ||
10476 | <<< | |
10477 | ||
10478 | :mlton-guide-page: Lazy | |
10479 | [[Lazy]] | |
10480 | Lazy | |
10481 | ==== | |
10482 | ||
10483 | In a lazy (or non-strict) language, the arguments to a function are | |
10484 | not evaluated before calling the function. Instead, the arguments are | |
10485 | suspended and only evaluated by the function if needed. | |
10486 | ||
10487 | <:StandardML:Standard ML> is an eager (or strict) language, not a lazy | |
10488 | language. However, it is easy to delay evaluation of an expression in | |
10489 | SML by creating a _thunk_, which is a nullary function. In SML, a | |
10490 | thunk is written `fn () => e`. Another essential feature of laziness | |
10491 | is _memoization_, meaning that once a suspended argument is evaluated, | |
10492 | subsequent references look up the value. We can express this in SML | |
10493 | with a function that maps a thunk to a memoized thunk. | |
10494 | ||
10495 | [source,sml] | |
10496 | ---- | |
10497 | signature LAZY = | |
10498 | sig | |
10499 | val lazy: (unit -> 'a) -> unit -> 'a | |
10500 | end | |
10501 | ---- | |
10502 | ||
10503 | This is easy to implement in SML. | |
10504 | ||
10505 | [source,sml] | |
10506 | ---- | |
10507 | structure Lazy: LAZY = | |
10508 | struct | |
10509 | fun lazy (th: unit -> 'a): unit -> 'a = | |
10510 | let | |
10511 | datatype 'a lazy_result = Unevaluated of (unit -> 'a) | |
10512 | | Evaluated of 'a | |
10513 | | Failed of exn | |
10514 | ||
10515 | val r = ref (Unevaluated th) | |
10516 | in | |
10517 | fn () => | |
10518 | case !r of | |
10519 | Unevaluated th => let | |
10520 | val a = th () | |
10521 | handle x => (r := Failed x; raise x) | |
10522 | val () = r := Evaluated a | |
10523 | in | |
10524 | a | |
10525 | end | |
10526 | | Evaluated a => a | |
10527 | | Failed x => raise x | |
10528 | end | |
10529 | end | |
10530 | ---- | |
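
A short usage sketch, assuming the `Lazy` structure above is in scope. The names `count` and `expensive` are made up for this example; the counter only serves to make the single evaluation observable.

[source,sml]
----
val count = ref 0

(* A thunk with a visible side effect, wrapped so that it is memoized. *)
val expensive: unit -> int =
   Lazy.lazy (fn () => (count := !count + 1; 6 * 7))

val a = expensive ()   (* runs the thunk; count becomes 1 *)
val b = expensive ()   (* returns the cached result; count stays 1 *)

val () = print (Int.toString a ^ " " ^ Int.toString b ^ " "
                ^ Int.toString (!count) ^ "\n")
----

If the thunk raises an exception instead, the `Failed` case ensures that later calls re-raise the same exception without running the thunk again.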
10531 | ||
10532 | <<< | |
10533 | ||
10534 | :mlton-guide-page: Libraries | |
10535 | [[Libraries]] | |
10536 | Libraries | |
10537 | ========= | |
10538 | ||
10539 | In theory every strictly conforming Standard ML program should run on | |
10540 | MLton. However, large SML projects often use implementation-specific | |
10541 | features, so some "porting" is required. Here is a partial list of | |
10542 | software that is known to run on MLton. | |
10543 | ||
10544 | * Utility libraries: | |
10545 | ** <:SMLNJLibrary:> - distributed with MLton | |
10546 | ** <:MLtonLibraryProject:> - various libraries located on the MLton subversion repository | |
10547 | ** <!ViewGitDir(mlton,master,lib/mlton)> - the internal MLton utility library, which we hope to clean up and make more accessible someday | |
10548 | ** http://github.com/seanmcl/sml-ext[sml-ext], a grab bag of libraries for MLton and other SML implementations (by Sean McLaughlin) | |
10549 | ** http://tom7misc.cvs.sourceforge.net/tom7misc/sml-lib/[sml-lib], a grab bag of libraries for MLton and other SML implementations (by <:TomMurphy:>) | |
10550 | * Scanner generators: | |
10551 | ** <:MLLPTLibrary:> - distributed with MLton | |
10552 | ** <:MLLex:> - distributed with MLton | |
10553 | ** <:MLULex:> | |
10554 | * Parser generators: | |
10555 | ** <:MLAntlr:> | |
10556 | ** <:MLLPTLibrary:> - distributed with MLton | |
10557 | ** <:MLYacc:> - distributed with MLton | |
10558 | * Concurrency: <:ConcurrentML:> - distributed with MLton | |
10559 | * Graphics: | |
10560 | ** <:SML3d:> | |
10561 | ** <:mGTK:> | |
10562 | * Misc. libraries: | |
10563 | ** <:CKitLibrary:> - distributed with MLton | |
10564 | ** <:MLRISCLibrary:> - distributed with MLton | |
10565 | ** <:MLNLFFI:ML-NLFFI> - distributed with MLton | |
10566 | ** <:Swerve:>, an HTTP server | |
10567 | ** <:fxp:>, an XML parser | |
10568 | ||
10569 | == Ports in progress == | |
10570 | ||
10571 | <:Contact:> us for details on any of these. | |
10572 | ||
10573 | * <:MLDoc:> http://people.cs.uchicago.edu/%7Ejhr/tools/ml-doc.html | |
10574 | * <:Unicode:> | |
10575 | ||
10576 | == More == | |
10577 | ||
10578 | More projects using MLton can be seen on the <:Users:> page. | |
10579 | ||
10580 | == Software for SML implementations other than MLton == | |
10581 | ||
10582 | * PostgreSQL: | |
10583 | ** Moscow ML: http://www.dina.kvl.dk/%7Esestoft/mosmllib/Postgres.html | |
10584 | ** SML/NJ NLFFI: http://smlweb.sourceforge.net/smlsql/ | |
10585 | * Web: | |
10586 | ** ML Kit: http://www.smlserver.org[SMLserver] (a plugin for AOLserver) | |
10587 | ** Moscow ML: http://ellemose.dina.kvl.dk/%7Esestoft/msp/index.msp[ML Server Pages] (support for PHP-style CGI scripting) | |
10588 | ** SML/NJ: http://smlweb.sourceforge.net/[smlweb] | |
10589 | ||
10590 | <<< | |
10591 | ||
10592 | :mlton-guide-page: LibrarySupport | |
10593 | [[LibrarySupport]] | |
10594 | LibrarySupport | |
10595 | ============== | |
10596 | ||
10597 | MLton supports both linking to and creating system-level libraries. | |
10598 | While Standard ML libraries should be designed with the <:MLBasis:> system to work with other Standard ML programs, | |
10599 | system-level library support allows MLton to create libraries for use by other programming languages. | |
10600 | Even more importantly, system-level library support allows MLton to access libraries from other languages. | |
10601 | This article will explain how to use libraries portably with MLton. | |
10602 | ||
10603 | == The Basics == | |
10604 | ||
10605 | A Dynamic Shared Object (DSO) is a piece of executable code written in a format understood by the operating system. | |
10606 | Executable programs and dynamic libraries are the two most common examples of a DSO. | |
10607 | They are called shared because if they are used more than once, they are only loaded once into main memory. | |
10608 | For example, if you start two instances of your web browser (an executable), there may be two processes running, but the program code of the executable is only loaded once. | |
10609 | A dynamic library, for example a graphical toolkit, might be used by several different executable programs, each possibly running multiple times. | |
10610 | Nevertheless, the dynamic library is only loaded once and its program code is shared between all of the processes. | |
10611 | ||
10612 | In addition to program code, DSOs contain a table of textual strings called symbols. | |
10613 | These are used in order to make the DSO do something useful, like execute. | |
10614 | For example, on Linux the symbol `_start` refers to the point in the program code where the operating system should start executing the program. | |
10615 | Dynamic libraries generally provide many symbols, corresponding to functions which can be called and variables which can be read or written. | |
10616 | Symbols can be used by the DSO itself, or by other DSOs which require services. | |
10617 | ||
10618 | When a DSO creates a symbol, this is called 'exporting'. | |
10619 | If a DSO needs to use a symbol, this is called 'importing'. | |
10620 | A DSO might need to use symbols defined within itself or perhaps from another DSO. | |
10621 | In both cases, it is importing that symbol, but the scope of the import differs. | |
10622 | Similarly, a DSO might export a symbol for use only within itself, or it might export a symbol for use by other DSOs. | |
10623 | Some symbols are resolved at compile time by the linker (those used within the DSO) and some are resolved at runtime by the dynamic link loader (symbols accessed between DSOs). | |
10624 | ||
10625 | == Symbols in MLton == | |
10626 | ||
10627 | Symbols in MLton are both imported and exported via the <:ForeignFunctionInterface:>. | |
10628 | The notation `_import "symbolname"` imports functions, `_symbol "symbolname"` imports variables, and `_address "symbolname"` imports an address. | |
10629 | To create and export a symbol, `_export "symbolname"` creates a function symbol and `_symbol "symbolname" 'alloc'` creates and exports a variable. | |
10630 | For details of the syntax and restrictions on the supported FFI types, read the <:ForeignFunctionInterface:> page. | |
10631 | In this discussion it only matters that every FFI use is either an import or an export. | |
10632 | ||
10633 | When exporting a symbol, MLton supports controlling the export scope. | |
10634 | If the symbol should only be used within the same DSO, that symbol has '`private`' scope. | |
10635 | Conversely, if the symbol should also be available to other DSOs the symbol has '`public`' scope. | |
10636 | Generally, one should have as few public exports as possible. | |
10637 | Since they are public, other DSOs will come to depend on them, limiting your ability to change them. | |
10638 | You specify the export scope in MLton by putting `private` or `public` after the symbol's name in an FFI directive. | |
10639 | For example: `_export "foo" private: int->int;` or `_export "bar" public: int->int;`. | |
10640 | ||
10641 | For technical reasons, the linker and loader on various platforms need to know the scope of a symbol being imported. | |
10642 | If the symbol is exported by the same DSO, use `public` or `private` as appropriate. | |
10643 | If the symbol is exported by a different DSO, then the scope '`external`' should be used to import it. | |
10644 | Within a DSO, all references to a symbol must use the same scope. | |
10645 | MLton checks this at compile time and reports a mismatch as `symbol "foo" redeclared as public (previously external)`; such a mismatch may cause linker errors. | |
10646 | However, MLton can only check usage within Standard ML. | |
10647 | All objects being linked into a resulting DSO must agree, and it is the programmer's responsibility to ensure this. | |
10648 | ||
10649 | Summary of symbol scopes: | |
10650 | ||
10651 | * `private`: used for symbols exported within a DSO only for use within that DSO | |
10652 | * `public`: used for symbols exported within a DSO that may also be used outside that DSO | |
10653 | * `external`: used for importing symbols from another DSO | |
10654 | * All uses of a symbol within a DSO (both imports and exports) must agree on the symbol scope | |
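
As a small sketch of these scopes in use, the following program imports `cos` from the C math library (assumed here to live in a separate shared library, hence `external`) and exports a function for use only within the same DSO. The name `smlDouble` is hypothetical, and the program is assumed to be compiled with `-default-ann 'allowFFI true'`.

[source,sml]
----
(* Imported from another DSO (the C math library), so the scope is external. *)
val cCos = _import "cos" external: real -> real;

(* Exported for use only within this DSO, so the scope is private.
 * smlDouble is a made-up name chosen for this example. *)
val () = _export "smlDouble" private: int -> int; (fn x => 2 * x)

val () = print (Real.toString (cCos 0.0) ^ "\n")
----

If the math library were instead linked in as an archive, the import would have to be declared `public`, as described above.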
10655 | ||
10656 | == Output Formats == | |
10657 | ||
10658 | MLton can create executables (`-format executable`) and dynamic shared libraries (`-format library`). | |
10659 | To link a shared library, use `-link-opt -l<dso_name>`. | |
10660 | The default output format is executable. | |
10661 | ||
10662 | MLton can also create archives. | |
10663 | An archive is not a DSO, but it does have a collection of symbols. | |
10664 | When an archive is linked into a DSO, it is completely absorbed. | |
10665 | Other objects being compiled into the DSO should refer to the public symbols in the archive as public, since they are still in the same DSO. | |
10666 | However, in the interest of modular programming, private symbols in an archive cannot be used outside of that archive, even within the same DSO. | |
10667 | ||
10668 | Although both executables and libraries are DSOs, some implementation details differ on some platforms. | |
10669 | For this reason, MLton can create two types of archives. | |
10670 | A normal archive (`-format archive`) is appropriate for linking into an executable. | |
10671 | Conversely, a libarchive (`-format libarchive`) should be used if it will be linked into a dynamic library. | |
10672 | ||
10673 | When MLton does not create an executable, it creates two special symbols. | |
10674 | The symbol `libname_open` is a function which must be called before any other symbols are accessed. | |
10675 | The `libname` is controlled by the `-libname` compile option and defaults to the name of the output, with any leading `lib` stripped (e.g., `foo` -> `foo`, `libfoo` -> `foo`). | |
10676 | The symbol `libname_close` is a function which should be called to clean up memory once done. | |
10677 | ||
10678 | Summary of `-format` options: | |
10679 | ||
10680 | * `executable`: create an executable (a DSO) | |
10681 | * `library`: create a dynamic shared library (a DSO) | |
10682 | * `archive`: create an archive of symbols (not a DSO) that can be linked into an executable | |
10683 | * `libarchive`: create an archive of symbols (not a DSO) that can be linked into a library | |
10684 | ||
10685 | Related options: | |
10686 | ||
10687 | * `-libname x`: controls the name of the special `_open` and `_close` functions. | |
10688 | ||
10689 | ||
10690 | == Interfacing with C == | |
10691 | ||
10692 | MLton can generate a C header file. | |
10693 | When the output format is not an executable, it creates one by default named `libname.h`. | |
10694 | This can be overridden with `-export-header foo.h`. | |
10695 | This header file should be included by any C files using the exported Standard ML symbols. | |
10696 | ||
10697 | If C is being linked with Standard ML into the same output archive or DSO, | |
10698 | then the C code should `#define PART_OF_LIBNAME` before it includes the header file. | |
10699 | This ensures that the C code is using the symbols with correct scope. | |
10700 | Any symbols exported from C should also be marked using the `PRIVATE`/`PUBLIC`/`EXTERNAL` macros defined in the Standard ML export header. | |
10701 | The declared C scope on exported C symbols should match the import scope used in Standard ML. | |
10702 | ||
10703 | An example: | |
10704 | [source,c] | |
10705 | ---- | |
10706 | #define PART_OF_FOO | |
10707 | #include "foo.h" | |
10708 | ||
10709 | PUBLIC int cFoo() { | |
10710 | return smlFoo(); | |
10711 | } | |
10712 | ---- | |
10713 | ||
10714 | [source,sml] | |
10715 | ---- | |
10716 | val () = _export "smlFoo" private: unit -> int; (fn () => 5) | |
10717 | val cFoo = _import "cFoo" public: unit -> int; | |
10718 | ---- | |
10719 | ||
10720 | ||
10721 | == Operating-system specific details == | |
10722 | ||
10723 | On Windows, `libarchive` and `archive` are the same. | |
10724 | However, depending on this will lead to portability problems. | |
10725 | Windows is also especially sensitive to mixups of '`public`' and '`external`'. | |
10726 | If an archive is linked, make sure its symbols are imported as `public`. | |
10727 | If a DLL is linked, make sure its symbols are imported as `external`. | |
10728 | Using `external` instead of `public` will result in link errors that `__imp__foo is undefined`. | |
10729 | Using `public` instead of `external` will result in inconsistent function pointer addresses and failure to update the imported variables. | |
10730 | ||
10731 | On Linux, `libarchive` and `archive` are different. | |
10732 | Libarchives are quite rare, but necessary if creating a library from an archive. | |
10733 | It is common for a library to provide both an archive and a dynamic library on this platform. | |
10734 | The linker will pick one or the other, usually preferring the dynamic library. | |
10735 | While a quirk of the operating system allows external import to work for both archives and libraries, | |
10736 | portable projects should not depend on this behaviour. | |
10737 | On other systems it can matter how the library is linked (static or dynamic). | |
10738 | ||
10739 | <<< | |
10740 | ||
10741 | :mlton-guide-page: License | |
10742 | [[License]] | |
10743 | License | |
10744 | ======= | |
10745 | ||
10746 | == Web Site == | |
10747 | In order to allow the maximum freedom for the future use of the | |
10748 | content in this web site, we require that contributions to the web | |
10749 | site be dedicated to the public domain. That means that you can only | |
10750 | add works that are already in the public domain or works on which you | |
10751 | hold the copyright and that you agree to dedicate to the public | |
10752 | domain. | |
10753 | ||
10754 | By contributing to this web site, you agree to dedicate your | |
10755 | contribution to the public domain. | |
10756 | ||
10757 | == Software == | |
10758 | ||
10759 | As of 20050812, MLton software is licensed under the BSD-style license | |
10760 | below. By contributing code to the project, you agree to release the | |
10761 | code under this license. Contributors can retain copyright to their | |
10762 | contributions by asserting copyright in their code. Contributors may | |
10763 | also add to the list of copyright holders in | |
10764 | `doc/license/MLton-LICENSE`, which appears below. | |
10765 | ||
10766 | [source,text] | |
10767 | ---- | |
10768 | sys::[./bin/InclGitFile.py mlton master doc/license/MLton-LICENSE] | |
10769 | ---- | |
10770 | ||
10771 | <<< | |
10772 | ||
10773 | :mlton-guide-page: LineDirective | |
10774 | [[LineDirective]] | |
10775 | LineDirective | |
10776 | ============= | |
10777 | ||
10778 | To aid in the debugging of code produced by program generators such | |
10779 | as http://www.eecs.harvard.edu/%7Enr/noweb/[Noweb], MLton supports | |
10780 | comments with line directives of the form | |
10781 | [source,sml] | |
10782 | ---- | |
10783 | (*#line l.c "f"*) | |
10784 | ---- | |
10785 | Here, _l_ and _c_ are sequences of decimal digits and _f_ is the | |
10786 | source file. The first character of a source file has the position | |
10787 | 1.1. A line directive causes the front end to believe that the | |
10788 | character following the right parenthesis is at the line and column of | |
10789 | the specified file. A line directive only affects the reporting of | |
10790 | error messages and does not affect program semantics (except for | |
10791 | functions like `MLton.Exn.history` that report source file positions). | |
10792 | Syntactically invalid line directives are ignored. To prevent | |
10793 | incompatibilities with SML, the file name may not contain the | |
10794 | character sequence `*)`. | |
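
A minimal sketch of a generated file that uses a line directive. The file name `parser.nw` and the position `120.1` are made up for this example. The directive does not change what the program computes; it only means that positions after the closing `*)` are reported relative to `parser.nw`, starting at line 120, column 1.

[source,sml]
----
(*#line 120.1 "parser.nw"*)
val answer = 6 * 7
val () = print (Int.toString answer ^ "\n")
----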
10795 | ||
10796 | <<< | |
10797 | ||
10798 | :mlton-guide-page: LLVM | |
10799 | [[LLVM]] | |
10800 | LLVM | |
10801 | ==== | |
10802 | ||
10803 | The http://www.llvm.org/[LLVM Project] is a collection of modular and | |
10804 | reusable compiler and toolchain technologies. | |
10805 | ||
10806 | MLton supports code generation via LLVM (`-codegen llvm`); see | |
10807 | <:LLVMCodegen:>. | |
10808 | ||
10809 | == Also see == | |
10810 | ||
10811 | * <:CMinusMinus:> | |
10812 | ||
10813 | <<< | |
10814 | ||
10815 | :mlton-guide-page: LLVMCodegen | |
10816 | [[LLVMCodegen]] | |
10817 | LLVMCodegen | |
10818 | =========== | |
10819 | ||
10820 | The <:LLVMCodegen:> is a <:Codegen:code generator> that translates the | |
10821 | <:Machine:> <:IntermediateLanguage:> to <:LLVM:> assembly, which is | |
10822 | further optimized and compiled to native object code by the <:LLVM:> | |
10823 | toolchain. | |
10824 | ||
10825 | It requires <:LLVM:> version 3.7 or greater to be installed. | |
10826 | ||
10827 | In benchmarks performed on the <:RunningOnAMD64:AMD64> architecture, | |
10828 | code size with this generator is usually slightly smaller than either | |
10829 | the <:AMD64Codegen:native> or the <:CCodegen:C> code generators. Compile | |
10830 | time is worse than <:AMD64Codegen:native>, but slightly better than | |
10831 | <:CCodegen:C>. Run time is often better than either <:AMD64Codegen:native> | |
10832 | or <:CCodegen:C>. | |
10833 | ||
10834 | == Implementation == | |
10835 | ||
10836 | * <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.sig)> | |
10837 | * <!ViewGitFile(mlton,master,mlton/codegen/llvm-codegen/llvm-codegen.fun)> | |
10838 | ||
10839 | == Details and Notes == | |
10840 | ||
10841 | The <:LLVMCodegen:> was initially developed by Brian Leibig (see | |
10842 | <!Cite(Leibig13,An LLVM Back-end for MLton)>). | |
10843 | ||
10844 | <<< | |
10845 | ||
10846 | :mlton-guide-page: LocalFlatten | |
10847 | [[LocalFlatten]] | |
10848 | LocalFlatten | |
10849 | ============ | |
10850 | ||
10851 | <:LocalFlatten:> is an optimization pass for the <:SSA:> | |
10852 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10853 | ||
10854 | == Description == | |
10855 | ||
10856 | This pass flattens arguments to <:SSA:> blocks. | |
10857 | ||
10858 | A block argument is flattened as long as it only flows to selects and | |
10859 | there is some tuple constructed in this function that flows to it. | |
10860 | ||
10861 | == Implementation == | |
10862 | ||
10863 | * <!ViewGitFile(mlton,master,mlton/ssa/local-flatten.fun)> | |
10864 | ||
10865 | == Details and Notes == | |
10866 | ||
10867 | {empty} | |
10868 | ||
10869 | <<< | |
10870 | ||
10871 | :mlton-guide-page: LocalRef | |
10872 | [[LocalRef]] | |
10873 | LocalRef | |
10874 | ======== | |
10875 | ||
10876 | <:LocalRef:> is an optimization pass for the <:SSA:> | |
10877 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10878 | ||
10879 | == Description == | |
10880 | ||
10881 | This pass optimizes `ref` cells local to an <:SSA:> function: | |
10882 | ||
10883 | * global `ref`-s only used in one function are moved to the function | |
10884 | ||
10885 | * `ref`-s only created, read from, and written to (i.e., don't escape) | |
10886 | are converted into function local variables | |
10887 | ||
10888 | Uses <:Multi:> and <:Restore:>. | |
10889 | ||
10890 | == Implementation == | |
10891 | ||
10892 | * <!ViewGitFile(mlton,master,mlton/ssa/local-ref.fun)> | |
10893 | ||
10894 | == Details and Notes == | |
10895 | ||
10896 | Moving a global `ref` requires the <:Multi:> analysis, because a | |
10897 | global `ref` can only be moved into a function that is executed at | |
10898 | most once. | |
10899 | ||
10900 | Conversion of non-escaping `ref`-s is structured in three phases: | |
10901 | ||
10902 | * analysis -- a variable `r = Ref_ref x` escapes if | |
10903 | ** `r` is used in any context besides `Ref_assign (r, _)` or `Ref_deref r` | |
10904 | ** all uses of `r` reachable from a (direct or indirect) call to `Thread_copyCurrent` are of the same flavor (either `Ref_assign` or `Ref_deref`); this also requires the <:Multi:> analysis. | |
10905 | ||
10906 | * transformation | |
10907 | + | |
10908 | -- | |
10909 | ** rewrites `r = Ref_ref x` to `r = x` | |
10910 | ** rewrites `_ = Ref_assign (r, y)` to `r = y` | |
10911 | ** rewrites `z = Ref_deref r` to `z = r` | |
10912 | -- | |
10913 | + | |
10914 | Note that the resulting program violates the SSA condition. | |
10915 | ||
10916 | * <:Restore:> -- restore the SSA condition. | |
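
As a source-level illustration, both `ref` cells in the function below are only created, read from, and written to inside the function, so they are candidates for conversion into function-local variables. This is a sketch: the name `sumTo` is made up, and whether the pass applies to a given program depends on the <:SSA:> code that earlier passes produce.

[source,sml]
----
fun sumTo (n: int): int =
   let
      (* Neither ref is stored, returned, or passed to another function. *)
      val i = ref 0
      val acc = ref 0
   in
      while !i <= n do (acc := !acc + !i; i := !i + 1);
      !acc
   end

val total = sumTo 10
----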
10917 | ||
10918 | <<< | |
10919 | ||
10920 | :mlton-guide-page: Logo | |
10921 | [[Logo]] | |
10922 | Logo | |
10923 | ==== | |
10924 | ||
10925 | ifdef::basebackend-html[] | |
10926 | image::Logo.attachments/mlton.svg[align="center",height="128",width="128"] | |
10927 | endif::[] | |
10928 | ifdef::basebackend-docbook[] | |
10929 | image::Logo.attachments/mlton-128.pdf[align="center"] | |
10930 | endif::[] | |
10931 | ||
10932 | == Files == | |
10933 | ||
10934 | * <!Attachment(Logo,mlton.svg)> | |
10935 | * <!Attachment(Logo,mlton-1024.png)> | |
10936 | * <!Attachment(Logo,mlton-512.png)> | |
10937 | * <!Attachment(Logo,mlton-256.png)> | |
10938 | * <!Attachment(Logo,mlton-128.png)> | |
10939 | * <!Attachment(Logo,mlton-64.png)> | |
10940 | * <!Attachment(Logo,mlton-32.png)> | |
10941 | ||
10942 | <<< | |
10943 | ||
10944 | :mlton-guide-page: LoopInvariant | |
10945 | [[LoopInvariant]] | |
10946 | LoopInvariant | |
10947 | ============= | |
10948 | ||
10949 | <:LoopInvariant:> is an optimization pass for the <:SSA:> | |
10950 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
10951 | ||
10952 | == Description == | |
10953 | ||
10954 | This pass removes loop invariant arguments to local loops. | |
10955 | ||
10956 | ---- | |
10957 | loop (x, y) | |
10958 | ... | |
10959 | ... | |
10960 | loop (x, z) | |
10961 | ... | |
10962 | ---- | |
10963 | ||
10964 | becomes | |
10965 | ||
10966 | ---- | |
10967 | loop' (x, y) | |
10968 | loop (y) | |
10969 | loop (y) | |
10970 | ... | |
10971 | ... | |
10972 | loop (z) | |
10973 | ... | |
10974 | ---- | |
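
A source-level sketch of a loop with this shape: the first argument of `loop` is passed unchanged on every recursive call, so it plays the role of `x` in the schematic above. The names `count` and `loop` are made up for this example, and the actual <:SSA:> code that MLton produces for it may differ.

[source,sml]
----
fun count (x: int, l: int list): int =
   let
      (* x is loop invariant; only l and n change between iterations. *)
      fun loop (x, l, n) =
         case l of
            [] => n
          | y :: l' => loop (x, l', if y = x then n + 1 else n)
   in
      loop (x, l, 0)
   end

val two = count (2, [1, 2, 2, 3])
----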
10975 | ||
10976 | == Implementation == | |
10977 | ||
10978 | * <!ViewGitFile(mlton,master,mlton/ssa/loop-invariant.fun)> | |
10979 | ||
10980 | == Details and Notes == | |
10981 | ||
10982 | {empty} | |
10983 | ||
10984 | <<< | |
10985 | ||
10986 | :mlton-guide-page: LoopUnroll | |
10987 | [[LoopUnroll]] | |
10988 | LoopUnroll | |
10989 | ========== | |
10990 | ||
10991 | <:LoopUnroll:> is an optimization pass for the <:SSA:> <:IntermediateLanguage:>, | |
10992 | invoked from <:SSASimplify:>. | |
10993 | ||
10994 | == Description == | |
10995 | ||
10996 | A simple loop unrolling optimization. | |
10997 | ||
10998 | == Implementation == | |
10999 | ||
11000 | * <!ViewGitFile(mlton,master,mlton/ssa/loop-unroll.fun)> | |
11001 | ||
11002 | == Details and Notes == | |
11003 | ||
11004 | {empty} | |
11005 | ||
11006 | <<< | |
11007 | ||
11008 | :mlton-guide-page: LoopUnswitch | |
11009 | [[LoopUnswitch]] | |
11010 | LoopUnswitch | |
11011 | ============ | |
11012 | ||
11013 | <:LoopUnswitch:> is an optimization pass for the <:SSA:> | |
11014 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
11015 | ||
11016 | == Description == | |
11017 | ||
11018 | A simple loop unswitching optimization. | |
11019 | ||
11020 | == Implementation == | |
11021 | ||
11022 | * <!ViewGitFile(mlton,master,mlton/ssa/loop-unswitch.fun)> | |
11023 | ||
11024 | == Details and Notes == | |
11025 | ||
11026 | {empty} | |
11027 | ||
11028 | <<< | |
11029 | ||
11030 | :mlton-guide-page: Machine | |
11031 | [[Machine]] | |
11032 | Machine | |
11033 | ======= | |
11034 | ||
11035 | <:Machine:> is an <:IntermediateLanguage:>, translated from <:RSSA:> | |
11036 | by <:ToMachine:> and used as input by the <:Codegen:>. | |
11037 | ||
11038 | == Description == | |
11039 | ||
11040 | <:Machine:> is an <:Untyped:> <:IntermediateLanguage:>, corresponding | |
11041 | to an abstract register machine. | |
11042 | ||
11043 | == Implementation == | |
11044 | ||
11045 | * <!ViewGitFile(mlton,master,mlton/backend/machine.sig)> | |
11046 | * <!ViewGitFile(mlton,master,mlton/backend/machine.fun)> | |
11047 | ||
11048 | == Type Checking == | |
11049 | ||
11050 | The <:Machine:> <:IntermediateLanguage:> has a primitive type checker | |
11051 | (<!ViewGitFile(mlton,master,mlton/backend/machine.sig)>, | |
11052 | <!ViewGitFile(mlton,master,mlton/backend/machine.fun)>), which only checks | |
11053 | some liveness properties. | |
11054 | ||
11055 | == Details and Notes == | |
11056 | ||
11057 | The runtime structure sets some constants according to the | |
11058 | configuration files on the target architecture and OS. | |
11059 | ||
11060 | <<< | |
11061 | ||
11062 | :mlton-guide-page: ManualPage | |
11063 | [[ManualPage]] | |
11064 | ManualPage | |
11065 | ========== | |
11066 | ||
11067 | MLton is run from the command line with a collection of options | |
11068 | followed by a file name and a list of files to compile, assemble, and | |
11069 | link with. | |
11070 | ||
11071 | ---- | |
11072 | mlton [option ...] file.{c|mlb|o|sml} [file.{c|o|s|S} ...] | |
11073 | ---- | |
11074 | ||
11075 | The simplest case is to run `mlton foo.sml`, where `foo.sml` contains | |
11076 | a valid SML program, in which case MLton compiles the program to | |
11077 | produce an executable `foo`. Since MLton does not support separate | |
11078 | compilation, the program must be the entire program you wish to | |
11079 | compile. However, the program may refer to signatures and structures | |
11080 | defined in the <:BasisLibrary:Basis Library>. | |
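| ||
| As a minimal sketch, a complete `foo.sml` could be as small as the | |
| following. The message text is only an illustration; `print` comes | |
| from the top-level environment of the Basis Library. | |
| ||
| [source,sml] | |
| ---- | |
| (* foo.sml: a whole program in a single file *) | |
| val () = print "Hello, world!\n" | |
| ---- | |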
11081 | ||
11082 | Larger programs, spanning many files, can be compiled with the | |
11083 | <:MLBasis:ML Basis system>. In this case, `mlton foo.mlb` will | |
11084 | compile the complete SML program described by the basis `foo.mlb`, | |
11085 | which may specify both SML files and additional bases. | |
11086 | ||
11087 | == Next Steps == | |
11088 | ||
11089 | * <:CompileTimeOptions:> | |
11090 | * <:RunTimeOptions:> | |
11091 | ||
11092 | <<< | |
11093 | ||
11094 | :mlton-guide-page: MatchCompilation | |
11095 | [[MatchCompilation]] | |
11096 | MatchCompilation | |
11097 | ================ | |
11098 | ||
11099 | Match compilation is the process of translating an SML match into a | |
11100 | nested tree (or dag) of simple case expressions and tests. | |
11101 | ||
11102 | MLton's match compiler is described <:MatchCompile:here>. | |
11103 | ||
11104 | == Match compilation in other compilers == | |
11105 | ||
11106 | * <!Cite(BaudinetMacQueen85)> | |
11107 | * <!Cite(Leroy90)>, pages 60-69. | |
11108 | * <!Cite(Sestoft96)> | |
11109 | * <!Cite(ScottRamsey00)> | |
11110 | ||
11111 | <<< | |
11112 | ||
11113 | :mlton-guide-page: MatchCompile | |
11114 | [[MatchCompile]] | |
11115 | MatchCompile | |
11116 | ============ | |
11117 | ||
11118 | <:MatchCompile:> is a translation pass, agnostic in the | |
11119 | <:IntermediateLanguage:>s between which it translates. | |
11120 | ||
11121 | == Description == | |
11122 | ||
11123 | <:MatchCompilation:Match compilation> converts a case expression with | |
11124 | nested patterns into a case expression with flat patterns. | |
11125 | ||
11126 | == Implementation == | |
11127 | ||
11128 | * <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.sig)> | |
11129 | * <!ViewGitFile(mlton,master,mlton/match-compile/match-compile.fun)> | |
11130 | ||
11131 | == Details and Notes == | |
11132 | ||
11133 | [source,sml] | |
11134 | ---- | |
11135 | val matchCompile: | |
11136 | {caseType: Type.t, (* type of entire expression *) | |
11137 | cases: (NestedPat.t * ((Var.t -> Var.t) -> Exp.t)) vector, | |
11138 | conTycon: Con.t -> Tycon.t, | |
11139 | region: Region.t, | |
11140 | test: Var.t, | |
11141 | testType: Type.t, | |
11142 | tyconCons: Tycon.t -> {con: Con.t, hasArg: bool} vector} | |
11143 | -> Exp.t * (unit -> ((Layout.t * {isOnlyExns: bool}) vector) vector) | |
11144 | ---- | |
11145 | ||
11146 | `matchCompile` is complicated by the desire for modularity between the | |
11147 | match compiler and its caller. Its caller is responsible for building | |
11148 | the right hand side of a rule `p => e`. On the other hand, the match | |
11149 | compiler is responsible for destructing the test and binding new | |
11150 | variables to the components. In order to connect the new variables | |
11151 | created by the match compiler with the variables in the pattern `p`, | |
11152 | the match compiler passes an environment back to its caller that maps | |
11153 | each variable in `p` to the corresponding variable introduced by the | |
11154 | match compiler. | |
11155 | ||
11156 | The match compiler builds a tree of n-way case expressions by working | |
11157 | from outside to inside and left to right in the patterns. For example, | |
11158 | [source,sml] | |
11159 | ---- | |
11160 | case x of | |
11161 | (_, C1 a) => e1 | |
11162 | | (C2 b, C3 c) => e2 | |
11163 | ---- | |
11164 | is translated to | |
11165 | [source,sml] | |
11166 | ---- | |
11167 | let | |
11168 | fun f1 a = e1 | |
11169 | fun f2 (b, c) = e2 | |
11170 | in | |
11171 | case x of | |
11172 | (x1, x2) => | |
11173 | (case x1 of | |
11174 | C2 b' => (case x2 of | |
11175 | C1 a' => f1 a' | |
11176 | | C3 c' => f2(b',c') | |
11177 | | _ => raise Match) | |
11178 | | _ => (case x2 of | |
11179 | C1 a_ => f1 a_ | |
11180 | | _ => raise Match)) | |
11181 | end | |
11182 | ---- | |
11183 | ||
11184 | Here you can see the necessity of abstracting out the right hand sides | |
11185 | of the cases in order to avoid code duplication. Right hand sides are | |
11186 | always abstracted. The simplifier cleans things up. You can also see | |
11187 | the new (primed) variables introduced by the match compiler and how | |
11188 | the renaming works. Finally, you can see how the match compiler | |
11189 | introduces the necessary default clauses in order to make a match | |
11190 | exhaustive, i.e. cover all the cases. | |
11191 | ||
11192 | The match compiler uses `conTycon` and `tyconCons` to determine | |
11193 | the exhaustivity of matches against constructors. | |
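| ||
| To make the translation above concrete, here is a hand-written sketch | |
| that can be compiled as ordinary SML. The datatype `t`, the functions | |
| `nested` and `flat`, and the integer right hand sides are assumptions | |
| chosen for illustration; this is not the match compiler's actual output. | |
| ||
| [source,sml] | |
| ---- | |
| datatype t = C1 of int | C2 of int | C3 of int | |
| ||
| (* Nested patterns, as a programmer would write them. *) | |
| fun nested (x: t * t): int = | |
|    case x of | |
|       (_, C1 a) => a | |
|     | (C2 b, C3 c) => b + c | |
|     | _ => raise Match | |
| ||
| (* The same function using only flat patterns: one constructor test | |
|    per case expression, with the right hand sides abstracted as | |
|    functions so that no right hand side is duplicated. *) | |
| fun flat (x: t * t): int = | |
|    let | |
|       fun f1 a = a | |
|       fun f2 (b, c) = b + c | |
|    in | |
|       case x of | |
|          (x1, x2) => | |
|             (case x1 of | |
|                 C2 b' => (case x2 of | |
|                              C1 a' => f1 a' | |
|                            | C3 c' => f2 (b', c') | |
|                            | _ => raise Match) | |
|               | _ => (case x2 of | |
|                          C1 a' => f1 a' | |
|                        | _ => raise Match)) | |
|    end | |
| ||
| (* Both versions agree on this input. *) | |
| val () = print (Int.toString (nested (C2 1, C3 2)) ^ "\n") | |
| val () = print (Int.toString (flat (C2 1, C3 2)) ^ "\n") | |
| ---- | |
| ||
| Note that `f1` is called from two branches of `flat`; abstracting it | |
| once is what keeps `e1` from being copied into both. | |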
11194 | ||
11195 | <<< | |
11196 | ||
11197 | :mlton-guide-page: MatthewFluet | |
11198 | [[MatthewFluet]] | |
11199 | MatthewFluet | |
11200 | ============ | |
11201 | ||
11202 | Matthew Fluet ( | |
11203 | mailto:matthew.fluet@gmail.com[matthew.fluet@gmail.com] | |
11204 | , | |
11205 | http://www.cs.rit.edu/%7Emtf | |
11206 | ) | |
11207 | is an Assistant Professor at the http://www.rit.edu[Rochester Institute of Technology]. | |
11208 | ||
11209 | '''' | |
11210 | ||
11211 | Current MLton projects: | |
11212 | ||
11213 | * general maintenance | |
11214 | * release new version | |
11215 | ||
11216 | '''' | |
11217 | ||
11218 | Misc. and underspecified TODOs: | |
11219 | ||
11220 | * understand <:RefFlatten:> and <:DeepFlatten:> | |
11221 | ** http://www.mlton.org/pipermail/mlton/2005-April/026990.html | |
11222 | ** http://www.mlton.org/pipermail/mlton/2007-November/030056.html | |
11223 | ** http://www.mlton.org/pipermail/mlton/2008-April/030250.html | |
11224 | ** http://www.mlton.org/pipermail/mlton/2008-July/030279.html | |
11225 | ** http://www.mlton.org/pipermail/mlton/2008-August/030312.html | |
11226 | ** http://www.mlton.org/pipermail/mlton/2008-September/030360.html | |
11227 | ** http://www.mlton.org/pipermail/mlton-user/2009-June/001542.html | |
11228 | * `MSG_DONTWAIT` isn't Posix | |
11229 | * coordinate w/ Dan Spoonhower and Lukasz Ziarek and Armand Navabi on multi-threaded | |
11230 | ** http://www.mlton.org/pipermail/mlton/2008-March/030214.html | |
11231 | * Intel Research bug: `no tyconRep property` (company won't release sample code) | |
11232 | ** http://www.mlton.org/pipermail/mlton-user/2008-March/001358.html | |
11233 | * treatment of real constants | |
11234 | ** http://www.mlton.org/pipermail/mlton/2008-May/030262.html | |
11235 | ** http://www.mlton.org/pipermail/mlton/2008-June/030271.html | |
11236 | * representation of `bool` and `_bool` in <:ForeignFunctionInterface:> | |
11237 | ** http://www.mlton.org/pipermail/mlton/2008-May/030264.html | |
11238 | * http://www.icfpcontest.org | |
11239 | ** John Reppy claims that "It looks like the card-marking overhead that one incurs when using generational collection swamps the benefits of generational collection." | |
11240 | * page to disk policy / single heap | |
11241 | ** http://www.mlton.org/pipermail/mlton/2008-June/030278.html | |
11242 | ** http://www.mlton.org/pipermail/mlton/2008-August/030318.html | |
11243 | * `MLton.GC.pack` doesn't keep a small heap if a garbage collection occurs before `MLton.GC.unpack`. | |
11244 | ** It might be preferable for `MLton.GC.pack` to be implemented as a (new) `MLton.GC.Ratios.setLive 1.1` followed by `MLton.GC.collect ()` and for `MLton.GC.unpack` to be implemented as `MLton.GC.Ratios.setLive 8.0` followed by `MLton.GC.collect ()`. | |
11245 | * The `static struct GC_objectType objectTypes[] =` array includes many duplicates. Objects of distinct source type, but equivalent representations (in terms of size, bytes non-pointers, number pointers) can share the objectType index. | |
11246 | * PolySpace bug: <:Redundant:> optimization (company won't release sample code) | |
11247 | ** http://www.mlton.org/pipermail/mlton/2008-September/030355.html | |
11248 | * treatment of exception raised during <:BasisLibrary:> evaluation | |
11249 | ** http://www.mlton.org/pipermail/mlton/2008-December/030501.html | |
11250 | ** http://www.mlton.org/pipermail/mlton/2008-December/030502.html | |
11251 | ** http://www.mlton.org/pipermail/mlton/2008-December/030503.html | |
11252 | * Use `memcpy` | |
11253 | ** http://www.mlton.org/pipermail/mlton-user/2009-January/001506.html | |
11254 | ** http://www.mlton.org/pipermail/mlton/2009-January/030506.html | |
11255 | * Implement more 64bit primops in x86 codegen | |
11256 | ** http://www.mlton.org/pipermail/mlton/2009-January/030507.html | |
11257 | * Enrich path-map file syntax: | |
11258 | ** http://www.mlton.org/pipermail/mlton/2008-September/030348.html | |
11259 | ** http://www.mlton.org/pipermail/mlton-user/2009-January/001507.html | |
11260 | * PolySpace bug: crash during Cheney-copy collection | |
11261 | ** http://www.mlton.org/pipermail/mlton/2009-February/030513.html | |
11262 | * eliminate `-build-constants` | |
11263 | ** all `_const`-s are known by `runtime/gen/basis-ffi.def` | |
11264 | ** generate `gen-constants.c` from `basis-ffi.def` | |
11265 | ** generate `constants` from `gen-constants.c` and `libmlton.a` | |
11266 | ** similar to `gen-sizes.c` and `sizes` | |
11267 | * eliminate "Windows hacks" for Cygwin from `Path` module | |
11268 | ** http://www.mlton.org/pipermail/mlton/2009-July/030606.html | |
11269 | * extend IL type checkers to check for empty property lists | |
11270 | * make (unsafe) `IntInf` conversions into primitives | |
11271 | ** http://www.mlton.org/pipermail/mlton/2009-July/030622.html | |
11272 | ||
11273 | <<< | |
11274 | ||
11275 | :mlton-guide-page: mGTK | |
11276 | [[mGTK]] | |
11277 | mGTK | |
11278 | ==== | |
11279 | ||
11280 | http://mgtk.sourceforge.net/[mGTK] is a wrapper for | |
11281 | http://www.gtk.org/[GTK+], a GUI toolkit. | |
11282 | ||
11283 | We recommend using mGTK 0.93, which is not listed on their home page, | |
11284 | but is available at the | |
11285 | http://sourceforge.net/project/showfiles.php?group_id=23226&package_id=16523[file | |
11286 | release page]. To test it, after unpacking, do `cd examples; make | |
11287 | mlton`, after which you should be able to run the many examples | |
11288 | (`signup-mlton`, `listview-mlton`, ...). | |
11289 | ||
11290 | == Also see == | |
11291 | ||
11292 | * <:Glade:> | |
11293 | ||
11294 | <<< | |
11295 | ||
11296 | :mlton-guide-page: MichaelNorrish | |
11297 | [[MichaelNorrish]] | |
11298 | MichaelNorrish | |
11299 | ============== | |
11300 | ||
11301 | I am a researcher at http://nicta.com.au[NICTA], with a web-page http://web.rsise.anu.edu.au/%7Emichaeln/[here]. | |
11302 | ||
11303 | I'm interested in MLton because of the chance that it might be a good vehicle for future implementations of the http://hol.sf.net[HOL] theorem-proving system. It's beginning to look as if one route forward will be to embed an SML interpreter into a MLton-compiled executable. I don't know if an extensible interpreter of the kind we're looking for already exists. | |
11304 | ||
11305 | <<< | |
11306 | ||
11307 | :mlton-guide-page: MikeThomas | |
11308 | [[MikeThomas]] | |
11309 | MikeThomas | |
11310 | ========== | |
11311 | ||
11312 | Here is a picture at home in Brisbane, Queensland, Australia, taken in January 2004. | |
11313 | ||
11314 | image::MikeThomas.attachments/picture.jpg[align="center"] | |
11315 | ||
11316 | <<< | |
11317 | ||
11318 | :mlton-guide-page: ML | |
11319 | [[ML]] | |
11320 | ML | |
11321 | == | |
11322 | ||
11323 | ML stands for _meta language_. ML was originally designed in the | |
11324 | 1970s as a programming language to assist theorem proving in the logic | |
11325 | LCF. In the 1980s, ML split into two variants, | |
11326 | <:StandardML:Standard ML> and <:OCaml:>, both of which are still used | |
11327 | today. | |
11328 | ||
11329 | <<< | |
11330 | ||
11331 | :mlton-guide-page: MLAntlr | |
11332 | [[MLAntlr]] | |
11333 | MLAntlr | |
11334 | ======= | |
11335 | ||
11336 | http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLAntlr] is a | |
11337 | parser generator for <:StandardML:Standard ML>. | |
11338 | ||
11339 | == Also see == | |
11340 | ||
11341 | * <:MLULex:> | |
11342 | * <:MLLPTLibrary:> | |
11343 | ||
11344 | <<< | |
11345 | ||
11346 | :mlton-guide-page: MLBasis | |
11347 | [[MLBasis]] | |
11348 | MLBasis | |
11349 | ======= | |
11350 | ||
11351 | The ML Basis system extends <:StandardML:Standard ML> to support | |
11352 | programming-in-the-very-large, namespace management at the module | |
11353 | level, separate delivery of library sources, and more. While Standard | |
11354 | ML modules are a sophisticated language for programming-in-the-large, | |
11355 | it is difficult, if not impossible, to accomplish a number of routine | |
11356 | namespace management operations when a program draws upon multiple | |
11357 | libraries provided by different vendors. | |
11358 | ||
11359 | The ML Basis system is a simple, yet powerful, approach that builds | |
11360 | upon the programmer's intuitive notion (and | |
11361 | <:DefinitionOfStandardML: The Definition of Standard ML (Revised)>'s | |
11362 | formal notion) of the top-level environment (a _basis_). The system | |
11363 | is designed as a natural extension of <:StandardML: Standard ML>; the | |
11364 | formal specification of the ML Basis system | |
11365 | (<!Attachment(MLBasis,mlb-formal.pdf)>) is given in the style | |
11366 | of the Definition. | |
11367 | ||
11368 | Here are some of the key features of the ML Basis system: | |
11369 | ||
11370 | 1. Explicit file order: The order of files (and, hence, the order of | |
11371 | evaluation) in the program is explicit. The ML Basis system's | |
11372 | semantics are structured in such a way that for any well-formed | |
11373 | project, there will be exactly one possible interpretation of the | |
11374 | project's syntax, static semantics, and dynamic semantics. | |
11375 | ||
11376 | 2. Implicit dependencies: A source file (corresponding to an SML | |
11377 | top-level declaration) is elaborated in the environment described by | |
11378 | preceding declarations. It is not necessary to explicitly list the | |
11379 | dependencies of a file. | |
11380 | ||
11381 | 3. Scoping and renaming: The ML Basis system provides mechanisms for | |
11382 | limiting the scope of (i.e., hiding) and renaming identifiers. | |
11383 | ||
11384 | 4. No naming convention for finding the file that defines a module. | |
11385 | To import a module, its defining file must appear in some ML Basis | |
11386 | file. | |
11387 | ||
11388 | == Next steps == | |
11389 | ||
11390 | * <:MLBasisSyntaxAndSemantics:> | |
11391 | * <:MLBasisExamples:> | |
11392 | * <:MLBasisPathMap:> | |
11393 | * <:MLBasisAnnotations:> | |
11394 | * <:MLBasisAvailableLibraries:> | |
11395 | ||
11396 | <<< | |
11397 | ||
11398 | :mlton-guide-page: MLBasisAnnotationExamples | |
11399 | [[MLBasisAnnotationExamples]] | |
11400 | MLBasisAnnotationExamples | |
11401 | ========================= | |
11402 | ||
11403 | Here are some example uses of <:MLBasisAnnotations:>. | |
11404 | ||
11405 | == Eliminate spurious warnings in automatically generated code == | |
11406 | ||
11407 | Programs that automatically generate source code can often produce | |
11408 | nonexhaustive patterns, relying on invariants of the generated code to | |
11409 | ensure that the pattern matchings never fail. A programmer may wish | |
11410 | to elide the nonexhaustive warnings from this code, in order that | |
11411 | legitimate warnings are not missed in a flurry of false positives. To | |
11412 | do so, the programmer simply annotates the generated code with the | |
11413 | `nonexhaustiveBind ignore` and `nonexhaustiveMatch ignore` | |
11414 | annotations: | |
11415 | ||
11416 | ---- | |
11417 | local | |
11418 | $(GEN_ROOT)/gen-lib.mlb | |
11419 | ||
11420 | ann | |
11421 | "nonexhaustiveBind ignore" | |
11422 | "nonexhaustiveMatch ignore" | |
11423 | in | |
11424 | foo.gen.sml | |
11425 | end | |
11426 | in | |
11427 | signature FOO | |
11428 | structure Foo | |
11429 | end | |
11430 | ---- | |
11431 | ||
11432 | ||
11433 | == Deliver a library == | |
11434 | ||
11435 | Standard ML libraries can be delivered via `.mlb` files. Authors of | |
11436 | such libraries should strive to be mindful of the ways in which | |
11437 | programmers may choose to compile their programs. For example, | |
11438 | although the defaults for `sequenceNonUnit` and `warnUnused` are | |
11439 | `ignore` and `false`, periodically compiling with these annotations | |
11440 | defaulted to `warn` and `true` can help uncover likely bugs. However, | |
11441 | a programmer is unlikely to be interested in unused modules from an | |
11442 | imported library, and the behavior of `sequenceNonUnit error` may be | |
11443 | incompatible with some libraries. Hence, a library author may choose | |
11444 | to deliver a library as follows: | |
11445 | ||
11446 | ---- | |
11447 | ann | |
11448 | "nonexhaustiveBind warn" "nonexhaustiveMatch warn" | |
11449 | "redundantBind warn" "redundantMatch warn" | |
11450 | "sequenceNonUnit warn" | |
11451 | "warnUnused true" "forceUsed" | |
11452 | in | |
11453 | local | |
11454 | file1.sml | |
11455 | ... | |
11456 | filen.sml | |
11457 | in | |
11458 | functor F1 | |
11459 | ... | |
11460 | signature S1 | |
11461 | ... | |
11462 | structure SN | |
11463 | ... | |
11464 | end | |
11465 | end | |
11466 | ---- | |
11467 | ||
11468 | The annotations `nonexhaustiveBind warn`, `redundantBind warn`, | |
11469 | `nonexhaustiveMatch warn`, `redundantMatch warn`, and `sequenceNonUnit | |
11470 | warn` have the obvious effect on elaboration. The annotations | |
11471 | `warnUnused true` and `forceUsed` work in conjunction -- warning on | |
11472 | any identifiers that do not contribute to the exported modules, and | |
11473 | preventing warnings on exported modules that are not used in the | |
11474 | remainder of the program. Many of the | |
11475 | <:MLBasisAvailableLibraries:available libraries> are delivered with | |
11476 | these annotations. | |
11477 | ||
11478 | <<< | |
11479 | ||
11480 | :mlton-guide-page: MLBasisAnnotations | |
11481 | [[MLBasisAnnotations]] | |
11482 | MLBasisAnnotations | |
11483 | ================== | |
11484 | ||
11485 | <:MLBasis:ML Basis> annotations control options that affect the | |
11486 | elaboration of SML source files. Conceptually, a basis file is | |
11487 | elaborated in a default annotation environment (just as it is | |
11488 | elaborated in an empty basis). The declaration | |
11489 | ++ann++{nbsp}++"++__ann__++"++{nbsp}++in++{nbsp}__basdec__{nbsp}++end++ | |
11490 | merges the annotation _ann_ with the "current" annotation environment | |
11491 | for the elaboration of _basdec_. To allow for future expansion, | |
11492 | ++"++__ann__++"++ is lexed as a single SML string constant. To | |
11493 | conveniently specify multiple annotations, the following derived form | |
11494 | is provided: | |
11495 | ||
11496 | **** | |
11497 | +ann+ ++"++__ann__++"++ (++"++__ann__++"++ )^\+^ +in+ _basdec_ +end+ | |
11498 | => | |
11499 | +ann+ ++"++__ann__++"++ +in+ +ann+ (++"++__ann__++"++)^\+^ +in+ _basdec_ +end+ +end+ | |
11500 | **** | |
11501 | ||
11502 | Here are the available annotations. In the explanation below, for | |
11503 | annotations that take an argument, the first value listed is the | |
11504 | default. | |
11505 | ||
11506 | * +allowFFI {false|true}+ | |
11507 | + | |
11508 | If `true`, allow `_address`, `_export`, `_import`, and `_symbol` | |
11509 | expressions to appear in source files. See | |
11510 | <:ForeignFunctionInterface:>. | |
11511 | ||
11512 | * +allowSuccessorML {false|true}+ | |
11513 | + | |
11514 | -- | |
11515 | Allow or disallow all of the <:SuccessorML:> features. This is a | |
11516 | proxy for all of the following annotations. | |
11517 | ||
11518 | ** +allowDoDecls {false|true}+ | |
11519 | + | |
11520 | If `true`, allow a +do _exp_+ declaration form. | |
11521 | ||
11522 | ** +allowExtendedConsts {false|true}+ | |
11523 | + | |
11524 | -- | |
11525 | Allow or disallow all of the extended constants features. This is a | |
11526 | proxy for all of the following annotations. | |
11527 | ||
11528 | *** +allowExtendedNumConsts {false|true}+ | |
11529 | + | |
11530 | If `true`, allow extended numeric constants. | |
11531 | ||
11532 | *** +allowExtendedTextConsts {false|true}+ | |
11533 | + | |
11534 | If `true`, allow extended text constants. | |
11535 | -- | |
11536 | ||
11537 | ** +allowLineComments {false|true}+ | |
11538 | + | |
11539 | If `true`, allow line comments beginning with the token ++(*)++. | |
11540 | ||
11541 | ** +allowOptBar {false|true}+ | |
11542 | + | |
11543 | If `true`, allow a bar to appear before the first match rule of a | |
11544 | `case`, `fn`, or `handle` expression, allow a bar to appear before the | |
11545 | first function-value binding of a `fun` declaration, and allow a bar | |
11546 | to appear before the first constructor binding or description of a | |
11547 | `datatype` declaration or specification. | |
11548 | ||
11549 | ** +allowOptSemicolon {false|true}+ | |
11550 | + | |
11551 | If `true`, allow a semicolon to appear after the last expression in a | |
11552 | sequence expression or `let` body. | |
11553 | ||
11554 | ** +allowOrPats {false|true}+ | |
11555 | + | |
11556 | If `true`, allow disjunctive (a.k.a., "or") patterns of the form | |
11557 | +_pat_ | _pat_+. | |
11558 | ||
11559 | ** +allowRecordPunExps {false|true}+ | |
11560 | + | |
11561 | If `true`, allow record punning expressions. | |
11562 | ||
11563 | ** +allowSigWithtype {false|true}+ | |
11564 | + | |
11565 | If `true`, allow `withtype` to modify a `datatype` specification in a | |
11566 | signature. | |
11567 | ||
11568 | ** +allowVectorExpsAndPats {false|true}+ | |
11569 | + | |
11570 | -- | |
11571 | Allow or disallow vector expressions and vector patterns. This is a | |
11572 | proxy for all of the following annotations. | |
11573 | ||
11574 | *** +allowVectorExps {false|true}+ | |
11575 | + | |
11576 | If `true`, allow vector expressions. | |
11577 | ||
11578 | *** +allowVectorPats {false|true}+ | |
11579 | + | |
11580 | If `true`, allow vector patterns. | |
11581 | -- | |
11582 | -- | |
11583 | ||
11584 | * +forceUsed+ | |
11585 | + | |
11586 | Force all identifiers in the basis denoted by the body of the `ann` to | |
11587 | be considered used; use in conjunction with `warnUnused true`. | |
11588 | ||
11589 | * +nonexhaustiveBind {warn|error|ignore}+ | |
11590 | + | |
11591 | If `error` or `warn`, report nonexhaustive patterns in `val` | |
11592 | declarations (i.e., pattern-match failures that raise the `Bind` | |
11593 | exception). An error will abort a compile, while a warning will not. | |
11594 | ||
11595 | * +nonexhaustiveExnBind {default|ignore}+ | |
11596 | + | |
11597 | If `ignore`, suppress errors and warnings about nonexhaustive matches | |
11598 | in `val` declarations that arise solely from unmatched exceptions. | |
11599 | If `default`, follow the behavior of `nonexhaustiveBind`. | |
11600 | ||
11601 | * +nonexhaustiveExnMatch {default|ignore}+ | |
11602 | + | |
11603 | If `ignore`, suppress errors and warnings about nonexhaustive matches | |
11604 | in `fn` expressions, `case` expressions, and `fun` declarations that | |
11605 | arise solely from unmatched exceptions. If `default`, follow the | |
11606 | behavior of `nonexhaustiveMatch`. | |
11607 | ||
11608 | * +nonexhaustiveExnRaise {ignore|default}+ | |
11609 | + | |
11610 | If `ignore`, suppress errors and warnings about nonexhaustive matches | |
11611 | in `handle` expressions that arise solely from unmatched exceptions. | |
11612 | If `default`, follow the behavior of `nonexhaustiveRaise`. | |
11613 | ||
11614 | * +nonexhaustiveMatch {warn|error|ignore}+ | |
11615 | + | |
11616 | If `error` or `warn`, report nonexhaustive patterns in `fn` | |
11617 | expressions, `case` expressions, and `fun` declarations (i.e., | |
11618 | pattern-match failures that raise the `Match` exception). An error | |
11619 | will abort a compile, while a warning will not. | |
11620 | ||
11621 | * +nonexhaustiveRaise {ignore|warn|error}+ | |
11622 | + | |
11623 | If `error` or `warn`, report nonexhaustive patterns in `handle` | |
11624 | expressions (i.e., pattern-match failures that implicitly (re)raise | |
11625 | the unmatched exception). An error will abort a compile, while a | |
11626 | warning will not. | |
11627 | ||
11628 | * +redundantBind {warn|error|ignore}+ | |
11629 | + | |
11630 | If `error` or `warn`, report redundant patterns in `val` declarations. | |
11631 | An error will abort a compile, while a warning will not. | |
11632 | ||
11633 | * +redundantMatch {warn|error|ignore}+ | |
11634 | + | |
11635 | If `error` or `warn`, report redundant patterns in `fn` expressions, | |
11636 | `case` expressions, and `fun` declarations. An error will abort a | |
11637 | compile, while a warning will not. | |
11638 | ||
11639 | * +redundantRaise {warn|error|ignore}+ | |
11640 | + | |
11641 | If `error` or `warn`, report redundant patterns in `handle` | |
11642 | expressions. An error will abort a compile, while a warning will not. | |
11643 | ||
11644 | * +resolveScope {strdec|dec|topdec|program}+ | |
11645 | + | |
11646 | Used to control the scope at which overload constraints are resolved | |
11647 | to default types (if not otherwise resolved by type inference) and the | |
11648 | scope at which unresolved flexible record constraints are reported. | |
11649 | + | |
11650 | The syntactic-class argument means to perform resolution checks at the | |
11651 | smallest enclosing syntactic form of the given class. The default | |
11652 | behavior is to resolve at the smallest enclosing _strdec_ (which is | |
11653 | equivalent to the largest enclosing _dec_). Other useful behaviors | |
11654 | are to resolve at the smallest enclosing _topdec_ (which is equivalent | |
11655 | to the largest enclosing _strdec_) and at the smallest enclosing | |
11656 | _program_ (which corresponds to a single `.sml` file and does not | |
11657 | correspond to the whole `.mlb` program). | |
11658 | ||
11659 | * +sequenceNonUnit {ignore|error|warn}+ | |
11660 | + | |
11661 | If `error` or `warn`, report when `e1` is not of type `unit` in the | |
11662 | sequence expression `(e1; e2)`. This can be helpful in detecting | |
11663 | curried applications that are mistakenly not fully applied. To | |
11664 | silence spurious messages, you can use `ignore e1`. | |
11665 | ||
11666 | * +valrecConstr {warn|error|ignore}+ | |
11667 | + | |
11668 | If `error` or `warn`, report when a `val rec` (or `fun`) declaration | |
11669 | redefines an identifier that previously had constructor status. An | |
11670 | error will abort a compile, while a warning will not. | |
11671 | ||
11672 | * +warnUnused {false|true}+ | |
11673 | + | |
11674 | If `true`, report unused identifiers. | |
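| ||
| As an illustration of the kind of mistake `sequenceNonUnit warn` is | |
| meant to catch, the SML sketch below forgets the second argument of a | |
| curried function. The function `log` and its messages are hypothetical, | |
| and the exact wording of MLton's diagnostic is not shown here. | |
| ||
| [source,sml] | |
| ---- | |
| (* A hypothetical curried logging function. *) | |
| fun log (prefix: string) (msg: string): unit = | |
|    print (prefix ^ msg ^ "\n") | |
| ||
| (* Bug: `log "info: "` is only partially applied, so the first | |
|    expression of the sequence has type string -> unit, not unit. | |
|    With the default `sequenceNonUnit ignore` this is accepted silently. *) | |
| val () = | |
|    (log "info: "; | |
|     print "done\n") | |
| ||
| (* Fixed: the first expression is fully applied and has type unit. *) | |
| val () = | |
|    (log "info: " "starting"; | |
|     print "done\n") | |
| ---- | |
| ||
| When a non-unit result really is meant to be discarded, wrapping the | |
| expression as `ignore e1` states that intent explicitly. | |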
11675 | ||
11676 | == Next Steps == | |
11677 | ||
11678 | * <:MLBasisAnnotationExamples:> | |
11679 | * <:WarnUnusedAnomalies:> | |
11680 | ||
11681 | <<< | |
11682 | ||
11683 | :mlton-guide-page: MLBasisAvailableLibraries | |
11684 | [[MLBasisAvailableLibraries]] | |
11685 | MLBasisAvailableLibraries | |
11686 | ========================= | |
11687 | ||
11688 | MLton comes with the following <:MLBasis:ML Basis> files available. | |
11689 | ||
11690 | * `$(SML_LIB)/basis/basis.mlb` | |
11691 | + | |
11692 | The <:BasisLibrary:Basis Library>. | |
11693 | ||
11694 | * `$(SML_LIB)/basis/basis-1997.mlb` | |
11695 | + | |
11696 | The (deprecated) 1997 version of the <:BasisLibrary:Basis Library>. | |
11697 | ||
11698 | * `$(SML_LIB)/basis/mlton.mlb` | |
11699 | + | |
11700 | The <:MLtonStructure:MLton> structure and signatures. | |
11701 | ||
11702 | * `$(SML_LIB)/basis/c-types.mlb` | |
11703 | + | |
11704 | Various structure aliases useful as <:ForeignFunctionInterfaceTypes:>. | |
11705 | ||
11706 | * `$(SML_LIB)/basis/unsafe.mlb` | |
11707 | + | |
11708 | The <:UnsafeStructure:Unsafe> structure and signature. | |
11709 | ||
11710 | * `$(SML_LIB)/basis/sml-nj.mlb` | |
11711 | + | |
11712 | The <:SMLofNJStructure:SMLofNJ> structure and signature. | |
11713 | ||
11714 | * `$(SML_LIB)/mlyacc-lib/mlyacc-lib.mlb` | |
11715 | + | |
11716 | Modules used by parsers built with <:MLYacc:>. | |
11717 | ||
11718 | * `$(SML_LIB)/cml/cml.mlb` | |
11719 | + | |
11720 | <:ConcurrentML:>, a library for message-passing concurrency. | |
11721 | ||
11722 | * `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb` | |
11723 | + | |
11724 | <:MLNLFFI:ML-NLFFI>, a library for foreign function interfaces. | |
11725 | ||
11726 | * `$(SML_LIB)/mlrisc-lib/...` | |
11727 | + | |
11728 | <:MLRISCLibrary:>, a library for retargetable and optimizing compiler back ends. | |
11729 | ||
11730 | * `$(SML_LIB)/smlnj-lib/...` | |
11731 | + | |
11732 | <:SMLNJLibrary:>, a collection of libraries distributed with SML/NJ. | |
11733 | ||
11734 | * `$(SML_LIB)/ckit-lib/ckit-lib.mlb` | |
11735 | + | |
11736 | <:CKitLibrary:>, a library for C source code. | |
11737 | ||
11738 | * `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb` | |
11739 | + | |
11740 | <:MLLPTLibrary:>, a support library for the <:MLULex:> scanner generator and the <:MLAntlr:> parser generator. | |
11741 | ||
11742 | ||
11743 | == Basis fragments == | |
11744 | ||
11745 | There are a number of specialized ML Basis files for importing | |
11746 | fragments of the <:BasisLibrary: Basis Library> that cannot be | |
11747 | expressed within SML. | |
11748 | ||
11749 | * `$(SML_LIB)/basis/pervasive-types.mlb` | |
11750 | + | |
11751 | The top-level types and constructors of the Basis Library. | |
11752 | ||
11753 | * `$(SML_LIB)/basis/pervasive-exns.mlb` | |
11754 | + | |
11755 | The top-level exception constructors of the Basis Library. | |
11756 | ||
11757 | * `$(SML_LIB)/basis/pervasive-vals.mlb` | |
11758 | + | |
11759 | The top-level values of the Basis Library, without infix status. | |
11760 | ||
11761 | * `$(SML_LIB)/basis/overloads.mlb` | |
11762 | + | |
11763 | The top-level overloaded values of the Basis Library, without infix status. | |
11764 | ||
11765 | * `$(SML_LIB)/basis/equal.mlb` | |
11766 | + | |
11767 | The polymorphic equality `=` and inequality `<>` values, without infix status. | |
11768 | ||
11769 | * `$(SML_LIB)/basis/infixes.mlb` | |
11770 | + | |
11771 | The infix declarations of the Basis Library. | |
11772 | ||
11773 | * `$(SML_LIB)/basis/pervasive.mlb` | |
11774 | + | |
11775 | The entire top-level value and type environment of the Basis Library, with infix status. This is the same as importing the above six MLB files. | |
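| ||
| The phrase "without infix status" can be illustrated in plain SML. The | |
| sketch below uses a `nonfix` directive to imitate an environment in | |
| which `+` is bound as a value but has no fixity, and then restores the | |
| usual fixity. It only imitates the effect within a single file; it does | |
| not import any of the ML Basis files listed above, and the names | |
| `three` and `four` are illustrative. | |
| ||
| [source,sml] | |
| ---- | |
| (* Remove the infix status of +; the value binding is unchanged. *) | |
| nonfix + | |
| val three = + (1, 2)        (* + is now applied like any other function *) | |
| ||
| (* Restore the fixity that the Basis Library gives to +. *) | |
| infix 6 + | |
| val four = 2 + 2 | |
| ||
| val () = print (Int.toString (three + four) ^ "\n") | |
| ---- | |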
11776 | ||
11777 | <<< | |
11778 | ||
11779 | :mlton-guide-page: MLBasisExamples | |
11780 | [[MLBasisExamples]] | |
11781 | MLBasisExamples | |
11782 | =============== | |
11783 | ||
11784 | Here are some example uses of <:MLBasis:ML Basis> files. | |
11785 | ||
11786 | ||
11787 | == Complete program == | |
11788 | ||
11789 | Suppose your complete program consists of the files `file1.sml`, ..., | |
11790 | `filen.sml`, which depend upon libraries `lib1.mlb`, ..., `libm.mlb`. | |
11791 | ||
11792 | ---- | |
11793 | (* import libraries *) | |
11794 | lib1.mlb | |
11795 | ... | |
11796 | libm.mlb | |
11797 | ||
11798 | (* program files *) | |
11799 | file1.sml | |
11800 | ... | |
11801 | filen.sml | |
11802 | ---- | |
11803 | ||
11804 | The bases denoted by `lib1.mlb`, ..., `libm.mlb` are merged (bindings | |
11805 | of names in later bases take precedence over bindings of the same name | |
11806 | in earlier bases), producing a basis in which `file1.sml`, ..., | |
11807 | `filen.sml` are elaborated, adding additional bindings to the basis. | |
11808 | ||
11809 | ||
11810 | == Export filter == | |
11811 | ||
11812 | Suppose you only want to export certain structures, signatures, and | |
11813 | functors from a collection of files. | |
11814 | ||
11815 | ---- | |
11816 | local | |
11817 | file1.sml | |
11818 | ... | |
11819 | filen.sml | |
11820 | in | |
11821 | (* export filter here *) | |
11822 | functor F | |
11823 | structure S | |
11824 | end | |
11825 | ---- | |
11826 | ||
11827 | While `file1.sml`, ..., `filen.sml` may declare top-level identifiers | |
11828 | in addition to `F` and `S`, such names are not accessible to programs | |
11829 | and libraries that import this `.mlb`. | |
11830 | ||
11831 | ||
11832 | == Export filter with renaming == | |
11833 | ||
11834 | Suppose you want an export filter, but want to rename one of the | |
11835 | modules. | |
11836 | ||
11837 | ---- | |
11838 | local | |
11839 | file1.sml | |
11840 | ... | |
11841 | filen.sml | |
11842 | in | |
11843 | (* export filter, with renaming, here *) | |
11844 | functor F | |
11845 | structure S' = S | |
11846 | end | |
11847 | ---- | |
11848 | ||
11849 | Note that `functor F` is an abbreviation for `functor F = F`, which | |
11850 | simply exports an identifier under the same name. | |
11851 | ||
11852 | ||
11853 | == Import filter == | |
11854 | ||
11855 | Suppose you only want to import a functor `F` from one library and a | |
11856 | structure `S` from another library. | |
11857 | ||
11858 | ---- | |
11859 | local | |
11860 | lib1.mlb | |
11861 | in | |
11862 | (* import filter here *) | |
11863 | functor F | |
11864 | end | |
11865 | local | |
11866 | lib2.mlb | |
11867 | in | |
11868 | (* import filter here *) | |
11869 | structure S | |
11870 | end | |
11871 | file1.sml | |
11872 | ... | |
11873 | filen.sml | |
11874 | ---- | |
11875 | ||
11876 | ||
11877 | == Import filter with renaming == | |
11878 | ||
11879 | Suppose you want to import a structure `S` from one library and | |
11880 | another structure `S` from another library. | |
11881 | ||
11882 | ---- | |
11883 | local | |
11884 | lib1.mlb | |
11885 | in | |
11886 | (* import filter, with renaming, here *) | |
11887 | structure S1 = S | |
11888 | end | |
11889 | local | |
11890 | lib2.mlb | |
11891 | in | |
11892 | (* import filter, with renaming, here *) | |
11893 | structure S2 = S | |
11894 | end | |
11895 | file1.sml | |
11896 | ... | |
11897 | filen.sml | |
11898 | ---- | |
11899 | ||
11900 | ||
11901 | == Full Basis == | |
11902 | ||
11903 | Since the Modules level of SML is the natural means for organizing | |
11904 | program and library components, MLB files provide convenient syntax | |
11905 | for renaming Modules level identifiers (in fact, renaming of functor | |
11906 | identifiers provides a mechanism that is not available in SML). | |
11907 | However, please note that `.mlb` files elaborate to full bases | |
11908 | including top-level types and values (including infix status), in | |
11909 | addition to structures, signatures, and functors. For example, | |
11910 | suppose you wished to extend the <:BasisLibrary:Basis Library> with an | |
11911 | `('a, 'b) either` datatype corresponding to a disjoint sum; the type | |
11912 | and some operations should be available at the top-level; | |
11913 | additionally, a signature and structure provide the complete | |
11914 | interface. | |
11915 | ||
11916 | We could use the following files. | |
11917 | ||
11918 | `either-sigs.sml` | |
11919 | [source,sml] | |
11920 | ---- | |
11921 | signature EITHER_GLOBAL = | |
11922 | sig | |
11923 | datatype ('a, 'b) either = Left of 'a | Right of 'b | |
11924 | val & : ('a -> 'c) * ('b -> 'c) -> ('a, 'b) either -> 'c | |
11925 | val && : ('a -> 'c) * ('b -> 'd) -> ('a, 'b) either -> ('c, 'd) either | |
11926 | end | |
11927 | ||
11928 | signature EITHER = | |
11929 | sig | |
11930 | include EITHER_GLOBAL | |
11931 | val isLeft : ('a, 'b) either -> bool | |
11932 | val isRight : ('a, 'b) either -> bool | |
11933 | ... | |
11934 | end | |
11935 | ---- | |
11936 | ||
11937 | `either-strs.sml` | |
11938 | [source,sml] | |
11939 | ---- | |
11940 | structure Either : EITHER = | |
11941 | struct | |
11942 | datatype ('a, 'b) either = Left of 'a | Right of 'b | |
11943 | fun f & g = fn x => | |
11944 | case x of Left z => f z | Right z => g z | |
11945 | fun f && g = (Left o f) & (Right o g) | |
11946 | fun isLeft x = ((fn _ => true) & (fn _ => false)) x | |
11947 | fun isRight x = (not o isLeft) x | |
11948 | ... | |
11949 | end | |
11950 | structure EitherGlobal : EITHER_GLOBAL = Either | |
11951 | ---- | |
11952 | ||
11953 | `either-infixes.sml` | |
11954 | [source,sml] | |
11955 | ---- | |
11956 | infixr 3 & && | |
11957 | ---- | |
11958 | ||
11959 | `either-open.sml` | |
11960 | [source,sml] | |
11961 | ---- | |
11962 | open EitherGlobal | |
11963 | ---- | |
11964 | ||
11965 | `either.mlb` | |
11966 | ---- | |
11967 | either-infixes.sml | |
11968 | local | |
11969 | (* import Basis Library *) | |
11970 | $(SML_LIB)/basis/basis.mlb | |
11971 | either-sigs.sml | |
11972 | either-strs.sml | |
11973 | in | |
11974 | signature EITHER | |
11975 | structure Either | |
11976 | either-open.sml | |
11977 | end | |
11978 | ---- | |
11979 | ||
11980 | A client that imports `either.mlb` will have access to neither | |
11981 | `EITHER_GLOBAL` nor `EitherGlobal`, but will have access to the type | |
11982 | `either` and the values `&` and `&&` (with infix status) in the | |
11983 | top-level environment. Note that `either-infixes.sml` is outside the | |
11984 | scope of the local, because we want the infixes available in the | |
11985 | implementation of the library and to clients of the library. | |
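As a rough illustration of what such a client might look like, the following stand-alone sketch inlines the relevant declarations so that it compiles as a single file; with `either.mlb` imported, they would already be in scope. The function `describe` and the sample list are hypothetical and are not part of the library.

```sml
(* Stand-alone sketch: these four declarations mirror either-infixes.sml
   and either-strs.sml.  A real client would get them from either.mlb. *)
infixr 3 & &&

datatype ('a, 'b) either = Left of 'a | Right of 'b

fun f & g = fn x => case x of Left z => f z | Right z => g z
fun f && g = (Left o f) & (Right o g)

(* Hypothetical client code: render either branch as a string. *)
fun describe e = (Int.toString & (fn s => s)) e

val samples : (int, string) either list = [Left 1, Right "two"]
val () = List.app (fn e => print (describe e ^ "\n")) samples
```

Because `&` and `&&` have infix status, the client can write `f & g` directly, without `op`.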
11986 | ||
11987 | <<< | |
11988 | ||
11989 | :mlton-guide-page: MLBasisPathMap | |
11990 | [[MLBasisPathMap]] | |
11991 | MLBasisPathMap | |
11992 | ============== | |
11993 | ||
11994 | An <:MLBasis:ML Basis> _path map_ describes a map from ML Basis path | |
11995 | variables (of the form `$(VAR)`) to file system paths. ML Basis path | |
11996 | variables provide a flexible way to refer to libraries while allowing | |
11997 | them to be moved without changing their clients. | |
11998 | ||
11999 | The format of an `mlb-path-map` file is a sequence of lines; each line | |
12000 | consists of two, white-space delimited tokens. The first token is a | |
12001 | path variable `VAR` and the second token is the path to which the | |
12002 | variable is mapped. The path may include path variables, which are | |
12003 | recursively expanded. | |
12004 | ||
12005 | The mapping from path variables to paths is initialized by the compiler. | |
12006 | Additional path maps can be specified with `-mlb-path-map` and | |
12007 | individual path variable mappings can be specified with | |
12008 | `-mlb-path-var` (see <:CompileTimeOptions:>). Configuration files are | |
12009 | processed from first to last and from top to bottom; later mappings | |
12010 | take precedence over earlier mappings. | |
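The expansion and precedence rules described above can be modelled with a small SML sketch. It is only an illustration and not MLton's implementation: the names `lookup` and `expand`, the variable `MY_LIB`, and the sample paths are hypothetical, and the sketch assumes that replacement text is expanded when a variable is used rather than when it is defined.

```sml
(* A path map as an association list.  Entries appear in processing
   order, so a later entry overrides an earlier one for the same name. *)
type pathMap = (string * string) list

(* Look up a variable, preferring the last matching entry. *)
fun lookup (m : pathMap) var =
  List.foldl (fn ((v, p), acc) => if v = var then SOME p else acc) NONE m

(* Expand every $(VAR) in a path, recursively expanding the replacement.
   Unknown variables raise Fail; cyclic definitions are not detected. *)
fun expand (m : pathMap) (path : string) : string =
  let
    fun go [] acc = String.implode (List.rev acc)
      | go (#"$" :: #"(" :: rest) acc =
          let
            fun split ([], _) = raise Fail "unterminated $("
              | split (#")" :: tl, name) = (String.implode (List.rev name), tl)
              | split (c :: tl, name) = split (tl, c :: name)
            val (var, tl) = split (rest, [])
            val value =
              case lookup m var of
                  SOME p => expand m p
                | NONE => raise Fail ("undefined path variable: " ^ var)
          in
            go tl (List.revAppend (String.explode value, acc))
          end
      | go (c :: rest) acc = go rest (c :: acc)
  in
    go (String.explode path) []
  end

(* Hypothetical mappings: the second SML_LIB entry takes precedence. *)
val m = [("SML_LIB", "/usr/lib/mlton/sml"),
         ("MY_LIB", "$(SML_LIB)/my-lib"),
         ("SML_LIB", "/opt/mlton/sml")]

val () = print (expand m "$(MY_LIB)/my-lib.mlb" ^ "\n")
```

Under these assumptions the sketch prints `/opt/mlton/sml/my-lib/my-lib.mlb`, because the later `SML_LIB` mapping wins.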
12011 | ||
12012 | The compiler and system-wide configuration file make the following | |
12013 | path variables available. | |
12014 | ||
12015 | [options="header",cols="^25%,<75%"] | |
12016 | |==== | |
12017 | |MLB path variable|Description | |
12018 | |`SML_LIB`|path to system-wide libraries, usually `/usr/lib/mlton/sml` | |
12019 | |`TARGET_ARCH`|string representation of target architecture | |
12020 | |`TARGET_OS`|string representation of target operating system | |
12021 | |`DEFAULT_INT`|binding for default int, usually `int32` | |
12022 | |`DEFAULT_WORD`|binding for default word, usually `word32` | |
12023 | |`DEFAULT_REAL`|binding for default real, usually `real64` | |
12024 | |==== | |
12025 | ||
12026 | <<< | |
12027 | ||
12028 | :mlton-guide-page: MLBasisSyntaxAndSemantics | |
12029 | [[MLBasisSyntaxAndSemantics]] | |
12030 | MLBasisSyntaxAndSemantics | |
12031 | ========================= | |
12032 | ||
12033 | An <:MLBasis:ML Basis> (MLB) file should have the `.mlb` suffix and | |
12034 | should contain a basis declaration. | |
12035 | ||
12036 | == Syntax == | |
12037 | ||
12038 | A basis declaration (_basdec_) must be one of the following forms. | |
12039 | ||
12040 | * +basis+ _basid_ +=+ _basexp_ (+and+ _basid_ +=+ _basexp_)^*^ | |
12041 | * +open+ _basid~1~_ ... _basid~n~_ | |
12042 | * +local+ _basdec_ +in+ _basdec_ +end+ | |
12043 | * _basdec_ [+;+] _basdec_ | |
12044 | * +structure+ _strid_ [+=+ _strid_] (+and+ _strid_ [+=+ _strid_])^*^ | |
12045 | * +signature+ _sigid_ [+=+ _sigid_] (+and+ _sigid_ [+=+ _sigid_])^*^ | |
12046 | * +functor+ _funid_ [+=+ _funid_] (+and+ _funid_ [+=+ _funid_])^*^ | |
12047 | * __path__++.sml++, __path__++.sig++, or __path__++.fun++ | |
12048 | * __path__++.mlb++ | |
12049 | * +ann+ ++"++_ann_++"++ +in+ _basdec_ +end+ | |
12050 | ||
12051 | A basis expression (_basexp_) must be one of the following forms. | |
12052 | ||
12053 | * +bas+ _basdec_ +end+ | |
12054 | * _basid_ | |
12055 | * +let+ _basdec_ +in+ _basexp_ +end+ | |
12056 | ||
12057 | Nested SML-style comments (enclosed with `(*` and `*)`) are ignored | |
12058 | (but <:LineDirective:>s are recognized). | |
12059 | ||
12060 | Paths can be relative or absolute. Relative paths are relative to the | |
12061 | directory containing the MLB file. Paths may include path variables | |
12062 | and are expanded according to a <:MLBasisPathMap:path map>. Unquoted | |
12063 | paths may include alpha-numeric characters and the symbols "`-`" and | |
12064 | "`_`", along with the arc separator "`/`" and extension separator | |
12065 | "`.`". More complicated paths, including paths with spaces, may be | |
12066 | included by quoting the path with `"`. A quoted path is lexed as an | |
12067 | SML string constant. | |
12068 | ||
12069 | <:MLBasisAnnotations:Annotations> allow a library author to | |
12070 | control options that affect the elaboration of SML source files. | |
12071 | ||
12072 | == Semantics == | |
12073 | ||
12074 | There is a <!Attachment(MLBasis,mlb-formal.pdf,formal semantics)> for | |
12075 | ML Basis files in the style of the | |
12076 | <:DefinitionOfStandardML:Definition>. Here, we give an informal | |
12077 | explanation. | |
12078 | ||
12079 | An SML structure is a collection of types, values, and other | |
12080 | structures. Similarly, a basis is a collection, but of more kinds of | |
12081 | objects: types, values, structures, fixities, signatures, functors, | |
12082 | and other bases. | |
12083 | ||
12084 | A basis declaration denotes a basis. A structure, signature, or | |
12085 | functor declaration denotes a basis containing the corresponding | |
12086 | module. Sequencing of basis declarations merges bases, with later | |
12087 | definitions taking precedence over earlier ones, just like sequencing | |
12088 | of SML declarations. Local declarations provide name hiding, just | |
12089 | like SML local declarations. A reference to an SML source file causes | |
12090 | the file to be elaborated in the basis extant at the point of | |
12091 | reference. A reference to an MLB file causes the basis denoted by | |
12092 | that MLB file to be imported -- the basis at the point of reference | |
12093 | does _not_ affect the imported basis. | |
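The two SML analogies drawn above (later definitions taking precedence, and `local` providing name hiding) can be seen in a few lines of ordinary SML. This is only a sketch of the SML-level behaviour that the MLB constructs mirror; the identifiers are hypothetical.

```sml
(* Sequencing: the later binding of x takes precedence over the earlier. *)
val x = 1
val x = 2

(* local: hidden is usable between `in` and `end`, but is not exported. *)
local
  val hidden = 10
in
  val visible = hidden + x
end

val () = print (Int.toString visible ^ "\n")
```

After these declarations `visible` is bound to 12, while `hidden` is no longer in scope. An MLB `local ... in ... end` filters the bindings of whole files and libraries in the same way.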
12094 | ||
12095 | Basis expressions and basis identifiers allow binding a basis to a | |
12096 | name. | |
12097 | ||
12098 | An MLB file is elaborated starting in an empty basis. Each MLB file | |
12099 | is elaborated and evaluated only once, with the result being cached. | |
12100 | Subsequent references use the cached value. Thus, any observable | |
12101 | effects due to evaluation are not duplicated if the MLB file is | |
12102 | referred to multiple times. | |
12103 | ||
12104 | <<< | |
12105 | ||
12106 | :mlton-guide-page: MLj | |
12107 | [[MLj]] | |
12108 | MLj | |
12109 | === | |
12110 | ||
12111 | http://www.dcs.ed.ac.uk/home/mlj/[MLj] is a | |
12112 | <:StandardMLImplementations:Standard ML implementation> that targets | |
12113 | Java bytecode. It is no longer maintained. It has morphed into | |
12114 | <:SMLNET:SML.NET>. | |
12115 | ||
12116 | == Also see == | |
12117 | ||
12118 | * <!Cite(BentonEtAl98)> | |
12119 | * <!Cite(BentonKennedy99)> | |
12120 | ||
12121 | <<< | |
12122 | ||
12123 | :mlton-guide-page: MLKit | |
12124 | [[MLKit]] | |
12125 | MLKit | |
12126 | ===== | |
12127 | ||
12128 | The http://sourceforge.net/apps/mediawiki/mlkit[ML Kit] is a | |
12129 | <:StandardMLImplementations:Standard ML implementation>. | |
12130 | ||
12131 | MLKit supports: | |
12132 | ||
12133 | * <:DefinitionOfStandardML:SML'97> | |
12134 | ** including most of the latest <:BasisLibrary:Basis Library> | |
12135 | http://www.standardml.org/Basis[specification], | |
12136 | * <:MLBasis:ML Basis> files | |
12137 | ** and separate compilation, | |
12138 | * <:Regions:Region-Based Memory Management> | |
12139 | ** and <:GarbageCollection:garbage collection>, | |
12140 | * Multiple backends, including | |
12141 | ** native x86, | |
12142 | ** bytecode, and | |
12143 | ** JavaScript (see http://www.itu.dk/people/mael/smltojs/[SMLtoJs]). | |
12144 | ||
12145 | At the time of writing, MLKit does not support: | |
12146 | ||
12147 | * concurrent programming / threads, | |
12148 | * calling from C to SML. | |
12149 | ||
12150 | <<< | |
12151 | ||
12152 | :mlton-guide-page: MLLex | |
12153 | [[MLLex]] | |
12154 | MLLex | |
12155 | ===== | |
12156 | ||
12157 | <:MLLex:> is a lexical analyzer generator for <:StandardML:Standard ML> | |
12158 | modeled after the Lex lexical analyzer generator. | |
12159 | ||
12160 | A version of MLLex, ported from the <:SMLNJ:SML/NJ> sources, is | |
12161 | distributed with MLton. | |
12162 | ||
12163 | == Description == | |
12164 | ||
12165 | MLLex takes as input the lex language as defined in the ML-Lex manual, | |
12166 | and outputs a lexical analyzer in SML. | |
12167 | ||
12168 | == Implementation == | |
12169 | ||
12170 | * <!ViewGitFile(mlton,master,mllex/lexgen.sml)> | |
12171 | * <!ViewGitFile(mlton,master,mllex/main.sml)> | |
12172 | * <!ViewGitFile(mlton,master,mllex/call-main.sml)> | |
12173 | ||
12174 | == Details and Notes == | |
12175 | ||
12176 | There are 3 main passes in the MLLex tool: | |
12177 | ||
12178 | * Source parsing. In this pass, the lex source program is parsed into an internal representation. The core of this pass is a hand-written lexer and an LL(1) parser. The output of this pass is a record of user code, rules (along with start states), and actions. (MLLex definitions are removed at this stage.) | |
12179 | * DFA construction. In this pass, a DFA is constructed by the algorithm of H. Yamada et al. | |
12180 | * Output. In this pass, the generated DFA is written out as a transition table, along with a table-driven algorithm, to an SML file. | |
12181 | ||
12182 | == Also see == | |
12183 | ||
12184 | * <!Attachment(Documentation,mllex.pdf)> | |
12185 | * <:MLYacc:> | |
12186 | * <!Cite(AppelEtAl94)> | |
12187 | * <!Cite(Price09)> | |
12188 | ||
12189 | <<< | |
12190 | ||
12191 | :mlton-guide-page: MLLPTLibrary | |
12192 | [[MLLPTLibrary]] | |
12193 | MLLPTLibrary | |
12194 | ============ | |
12195 | ||
12196 | The | |
12197 | http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[ML-LPT Library] | |
12198 | is a support library for the <:MLULex:> scanner generator and the | |
12199 | <:MLAntlr:> parser generator. The ML-LPT Library is distributed with | |
12200 | SML/NJ. | |
12201 | ||
12202 | As of 20180119, MLton includes the ML-LPT Library synchronized with | |
12203 | SML/NJ version 110.82. | |
12204 | ||
12205 | == Usage == | |
12206 | ||
12207 | * You can import the ML-LPT Library into an MLB file with: | |
12208 | + | |
12209 | [options="header"] | |
12210 | |===== | |
12211 | |MLB file|Description | |
12212 | |`$(SML_LIB)/mllpt-lib/mllpt-lib.mlb`| | |
12213 | |===== | |
12214 | ||
12215 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
12216 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
12217 | following maps are included by default: | |
12218 | + | |
12219 | ---- | |
12220 | # MLLPT Library | |
12221 | $ml-lpt-lib.cm $(SML_LIB)/mllpt-lib | |
12222 | $ml-lpt-lib.cm/ml-lpt-lib.cm $(SML_LIB)/mllpt-lib/mllpt-lib.mlb | |
12223 | ---- | |
12224 | + | |
12225 | This will automatically convert a `$/ml-lpt-lib.cm` import in an input | |
12226 | `.cm` file into a `$(SML_LIB)/mllpt-lib/mllpt-lib.mlb` import in the | |
12227 | output `.mlb` file. | |
12228 | ||
12229 | == Details == | |
12230 | ||
12231 | {empty} | |
12232 | ||
12233 | == Patch == | |
12234 | ||
12235 | * <!ViewGitFile(mlton,master,lib/mllpt-lib/ml-lpt.patch)> | |
12236 | ||
12237 | <<< | |
12238 | ||
12239 | :mlton-guide-page: MLmon | |
12240 | [[MLmon]] | |
12241 | MLmon | |
12242 | ===== | |
12243 | ||
12244 | An `mlmon.out` file records dynamic <:Profiling:profiling> counts. | |
12245 | ||
12246 | == File format == | |
12247 | ||
12248 | An `mlmon.out` file is a text file with a sequence of lines. | |
12249 | ||
12250 | * The string "`MLton prof`". | |
12251 | ||
12252 | * The string "`alloc`", "`count`", or "`time`", depending on the kind | |
12253 | of profiling information, corresponding to the command-line argument | |
12254 | supplied to `mlton -profile`. | |
12255 | ||
12256 | * The string "`current`" or "`stack`" depending on whether profiling | |
12257 | data was gathered for only the current function (the top of the stack) | |
12258 | or for all functions on the stack. This corresponds to whether the | |
12259 | executable was compiled with `-profile-stack false` or `-profile-stack | |
12260 | true`. | |
12261 | ||
12262 | * The magic number of the executable. | |
12263 | ||
12264 | * The number of non-GC ticks, followed by a space, then the number of | |
12265 | GC ticks. | |
12266 | ||
12267 | * The number of (split) functions for which data is recorded. | |
12268 | ||
12269 | * A line for each (split) function with counts. Each line contains an | |
12270 | integer count of the number of ticks while the function was current. | |
12271 | In addition, if stack data was gathered (`-profile-stack true`), then | |
12272 | the line contains two additional tick counts: | |
12273 | ||
12274 | ** the number of ticks while the function was on the stack. | |
12275 | ** the number of ticks while the function was on the stack and a GC | |
12276 | was performed. | |
12277 | ||
12278 | * The number of (master) functions for which data is recorded. | |
12279 | ||
12280 | * A line for each (master) function with counts. The lines have the | |
12281 | same format and meaning as with split-function counts. | |
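The first five lines of the format described above could be read with a sketch along these lines. It is not the code used by MLton's own tools. It assumes that each list element is one line without its trailing newline, it keeps the magic number as an uninterpreted string, and the sample values are made up.

```sml
(* Sketch of a reader for the first five lines of an mlmon.out file. *)
datatype kind = Alloc | Count | Time
datatype style = Current | Stack

type header =
  {kind : kind, style : style, magic : string,
   nonGcTicks : LargeInt.int, gcTicks : LargeInt.int}

(* Returns the parsed header together with the remaining, unparsed lines. *)
fun parseHeader (l1 :: l2 :: l3 :: l4 :: l5 :: rest) : header * string list =
      let
        val () =
          if l1 = "MLton prof" then () else raise Fail "not an mlmon.out file"
        val kind =
          case l2 of
              "alloc" => Alloc
            | "count" => Count
            | "time" => Time
            | _ => raise Fail "unknown profiling kind"
        val style =
          case l3 of
              "current" => Current
            | "stack" => Stack
            | _ => raise Fail "unknown profiling style"
        val (nonGc, gc) =
          case List.map LargeInt.fromString (String.tokens Char.isSpace l5) of
              [SOME a, SOME b] => (a, b)
            | _ => raise Fail "malformed tick counts"
      in
        ({kind = kind, style = style, magic = l4,
          nonGcTicks = nonGc, gcTicks = gc}, rest)
      end
  | parseHeader _ = raise Fail "truncated header"

(* Made-up sample contents, for illustration only. *)
val (h, rest) =
  parseHeader ["MLton prof", "time", "current", "12345", "90 10", "0"]
val () = print (LargeInt.toString (#nonGcTicks h + #gcTicks h) ^ "\n")
```

The per-function count lines are deliberately left in `rest`, since their layout depends on whether stack data was gathered.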
12282 | ||
12283 | <<< | |
12284 | ||
12285 | :mlton-guide-page: MLNLFFI | |
12286 | [[MLNLFFI]] | |
12287 | MLNLFFI | |
12288 | ======= | |
12289 | ||
12290 | <!Cite(Blume01, ML-NLFFI)> is the no-longer-foreign-function interface | |
12291 | library for SML. | |
12292 | ||
12293 | As of 20050212, MLton has an initial port of ML-NLFFI from SML/NJ to | |
12294 | MLton. All of the ML-NLFFI functionality is present. | |
12295 | ||
12296 | Additionally, MLton has an initial port of the | |
12297 | <:MLNLFFIGen:mlnlffigen> tool from SML/NJ to MLton. Due to low-level | |
12298 | details, the code generated by SML/NJ's `ml-nlffigen` is not | |
12299 | compatible with MLton, and vice-versa. However, the generated code | |
12300 | has the same interface, so portable client code can be written. | |
12301 | MLton's `mlnlffigen` does not currently support C functions with | |
12302 | `struct` or `union` arguments. | |
12303 | ||
12304 | == Usage == | |
12305 | ||
12306 | * You can import the ML-NLFFI Library into an MLB file with | |
12307 | + | |
12308 | [options="header"] | |
12309 | |===== | |
12310 | |MLB file|Description | |
12311 | |`$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb`| | |
12312 | |===== | |
12313 | ||
12314 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
12315 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
12316 | following maps are included by default: | |
12317 | + | |
12318 | ---- | |
12319 | # MLNLFFI Library | |
12320 | $c $(SML_LIB)/mlnlffi-lib | |
12321 | $c/c.cm $(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb | |
12322 | ---- | |
12323 | + | |
12324 | This will automatically convert a `$/c.cm` import in an input `.cm` | |
12325 | file into a `$(SML_LIB)/mlnlffi-lib/mlnlffi-lib.mlb` import in the | |
12326 | output `.mlb` file. | |
12327 | ||
12328 | == Also see == | |
12329 | ||
12330 | * <!Cite(Blume01)> | |
12331 | * <:MLNLFFIImplementation:> | |
12332 | * <:MLNLFFIGen:> | |
12333 | ||
12334 | <<< | |
12335 | ||
12336 | :mlton-guide-page: MLNLFFIGen | |
12337 | [[MLNLFFIGen]] | |
12338 | MLNLFFIGen | |
12339 | ========== | |
12340 | ||
12341 | `mlnlffigen` generates a <:MLNLFFI:> binding from a collection of `.c` | |
12342 | files. It is based on the <:CKitLibrary:>, which is primarily designed | |
12343 | to handle standardized C and thus does not understand many (any?) | |
12344 | compiler extensions; however, it attempts to recover from errors when | |
12345 | seeing unrecognized definitions. | |
12346 | ||
12347 | In order to work around common gcc extensions, it may be useful to add | |
12348 | `-cppopt` options to the command line; for example | |
12349 | `-cppopt '-D__extension__'` may be occasionally useful. Fortunately, | |
12350 | most portable libraries largely avoid the use of these types of | |
12351 | extensions in header files. | |
12352 | ||
12353 | `mlnlffigen` will normally not generate bindings for `#included` | |
12354 | files; see `-match` and `-allSU` if this is desirable. | |
12355 | ||
12356 | <<< | |
12357 | ||
12358 | :mlton-guide-page: MLNLFFIImplementation | |
12359 | [[MLNLFFIImplementation]] | |
12360 | MLNLFFIImplementation | |
12361 | ===================== | |
12362 | ||
12363 | MLton's implementation(s) of the <:MLNLFFI:> library differs from the | |
12364 | SML/NJ implementation in two important ways: | |
12365 | ||
12366 | * MLton cannot utilize the `Unsafe.cast` "cheat" described in Section | |
12367 | 3.7 of <!Cite(Blume01)>. (MLton's representation of | |
12368 | <:Closure:closures> and | |
12369 | <:PackedRepresentation:aggressive representation> optimizations make | |
12370 | an `Unsafe.cast` even more "unsafe" than in other implementations.) | |
12371 | + | |
12372 | -- | |
12373 | We have considered two solutions: | |
12374 | ||
12375 | ** One solution is to utilize an additional type parameter (as | |
12376 | described in Section 3.7 of <!Cite(Blume01)>): | |
12377 | + | |
12378 | -- | |
12379 | __________ | |
12380 | [source,sml] | |
12381 | ---- | |
12382 | signature C = sig | |
12383 | type ('t, 'f, 'c) obj | |
12384 | eqtype ('t, 'f, 'c) obj' | |
12385 | ... | |
12386 | type ('o, 'f) ptr | |
12387 | eqtype ('o, 'f) ptr' | |
12388 | ... | |
12389 | type 'f fptr | |
12390 | type 'f fptr' | |
12391 | ... | |
12392 | structure T : sig | |
12393 | type ('t, 'f) typ | |
12394 | ... | |
12395 | end | |
12396 | end | |
12397 | ---- | |
12398 | ||
12399 | The rule for `('t, 'f, 'c) obj`, `('t, 'f, 'c) ptr`, and also `('t, 'f) | |
12400 | T.typ` is that whenever `F fptr` occurs within the instantiation of | |
12401 | `'t`, then `'f` must be instantiated to `F`. In all other cases, `'f` | |
12402 | will be instantiated to `unit`. | |
12403 | __________ | |
12404 | ||
12405 | (In the actual MLton implementation, an abstract type `naf` | |
12406 | (not-a-function) is used instead of `unit`.) | |
12407 | ||
12408 | While this means that type-annotated programs may not type-check under | |
12409 | both the SML/NJ implementation and the MLton implementation, this | |
12410 | should not be a problem in practice. Tools, like `ml-nlffigen`, which | |
12411 | are necessarily implementation dependent (in order to make | |
12412 | <:CallingFromSMLToCFunctionPointer:calls through a C function | |
12413 | pointer>), may be easily extended to emit the additional type | |
12414 | parameter. Client code which uses such generated glue-code (e.g., | |
12415 | Section 1 of <!Cite(Blume01)>) need rarely write type-annotations, | |
12416 | thanks to the magic of type inference. | |
12417 | -- | |
12418 | ||
12419 | ** The above implementation suffers from two disadvantages. | |
12420 | + | |
12421 | -- | |
12422 | First, it changes the MLNLFFI Library interface, meaning that the same | |
12423 | program may not type-check under both the SML/NJ implementation and | |
12424 | the MLton implementation (though, in light of type inference and the | |
12425 | richer `MLRep` structure provided by MLton, this point is mostly | |
12426 | moot). | |
12427 | ||
12428 | Second, it appears to unnecessarily duplicate type information. For | |
12429 | example, an external C variable of type `int (* f[3])(int)` (that is, | |
12430 | an array of three function pointers), would be represented by the SML | |
12431 | type `(((sint -> sint) fptr, dec dg3) arr, sint -> sint, rw) obj`. | |
12432 | One might well ask why the `'f` instantiation (`sint -> sint` in this | |
12433 | case) cannot be _extracted_ from the `'t` instantiation | |
12434 | (`((sint -> sint) fptr, dec dg3) arr` in this case), obviating the | |
12435 | need for a separate _function-type_ type argument. There are a number | |
12436 | of components to a complete answer to this question. Foremost is the | |
12437 | fact that <:StandardML: Standard ML> supports neither (general) | |
12438 | type-level functions nor intensional polymorphism. | |
12439 | ||
12440 | A more direct answer for MLNLFFI is that in the SML/NJ implementation, | |
12441 | the definitions of the types `('t, 'c) obj` and `('t, 'c) ptr` are made | |
12442 | in such a way that the type variables `'t` and `'c` are <:PhantomType: | |
12443 | phantom> (not contributing to the run-time representation of an | |
12444 | `('t, 'c) obj` or `('t, 'c) ptr` value), despite the fact that the | |
12445 | types `((sint -> sint) fptr, rw) ptr` and | |
12446 | `((double -> double) fptr, rw) ptr` necessarily carry distinct (and | |
12447 | type incompatible) run-time (C-)type information (RTTI), corresponding | |
12448 | to the different calling conventions of the two C functions. The | |
12449 | `Unsafe.cast` "cheat" overcomes the type incompatibility without | |
12450 | introducing a new type variable (as in the first solution above). | |
12451 | ||
12452 | Hence, the reason that the _function-type_ type cannot be extracted from | |
12453 | the `'t` type variable instantiation is that the type of the | |
12454 | representation of RTTI doesn't even _see_ the (phantom) `'t` type | |
12455 | variable. The solution which presents itself is to give up on the | |
12456 | phantomness of the `'t` type variable, making it available to the | |
12457 | representation of RTTI. | |
12458 | ||
12459 | This is not without some small drawbacks. Because many of the types | |
12460 | used to instantiate `'t` carry more structure than is strictly | |
12461 | necessary for `'t`'s RTTI, it is sometimes necessary to wrap and | |
12462 | unwrap RTTI to accommodate the additional structure. (In the other | |
12463 | implementations, the corresponding operations can pass along the RTTI | |
12464 | unchanged.) However, these coercions contribute minuscule overhead; | |
12465 | in fact, in a majority of cases, MLton's optimizations will completely | |
12466 | eliminate the RTTI from the final program. | |
12467 | -- | |
12468 | ||
12469 | The implementation distributed with MLton uses the second solution. | |
12470 | ||
12471 | Bonus question: Why can't one use a <:UniversalType: universal type> | |
12472 | to eliminate the use of `Unsafe.cast`? | |
12473 | ||
12474 | ** Answer: ??? | |
12475 | -- | |
12476 | ||
12477 | * MLton (in both of the above implementations) provides a richer | |
12478 | `MLRep` structure, utilizing ++Int__<N>__++ and ++Word__<N>__++ | |
12479 | structures. | |
12480 | + | |
12481 | -- | |
12482 | [source,sml] | |
12483 | ---- | |
12484 | structure MLRep = struct | |
12485 | structure Char = | |
12486 | struct | |
12487 | structure Signed = Int8 | |
12488 | structure Unsigned = Word8 | |
12489 | (* word-style bit-operations on integers... *) | |
12490 | structure SignedBitops = IntBitOps(structure I = Signed | |
12491 | structure W = Unsigned) | |
12492 | end | |
12493 | structure Short = | |
12494 | struct | |
12495 | structure Signed = Int16 | |
12496 | structure Unsigned = Word16 | |
12497 | (* word-style bit-operations on integers... *) | |
12498 | structure SignedBitops = IntBitOps(structure I = Signed | |
12499 | structure W = Unsigned) | |
12500 | end | |
12501 | structure Int = | |
12502 | struct | |
12503 | structure Signed = Int32 | |
12504 | structure Unsigned = Word32 | |
12505 | (* word-style bit-operations on integers... *) | |
12506 | structure SignedBitops = IntBitOps(structure I = Signed | |
12507 | structure W = Unsigned) | |
12508 | end | |
12509 | structure Long = | |
12510 | struct | |
12511 | structure Signed = Int32 | |
12512 | structure Unsigned = Word32 | |
12513 | (* word-style bit-operations on integers... *) | |
12514 | structure SignedBitops = IntBitOps(structure I = Signed | |
12515 | structure W = Unsigned) | |
12516 | end | |
12517 | structure LongLong = | |
12518 | struct | |
12519 | structure Signed = Int64 | |
12520 | structure Unsigned = Word64 | |
12521 | (* word-style bit-operations on integers... *) | |
12522 | structure SignedBitops = IntBitOps(structure I = Signed | |
12523 | structure W = Unsigned) | |
12524 | end | |
12525 | structure Float = Real32 | |
12526 | structure Double = Real64 | |
12527 | end | |
12528 | ---- | |
12529 | ||
12530 | This would appear to be a better interface, even when an | |
12531 | implementation must choose `Int32` and `Word32` as the representation | |
12532 | for smaller C-types. | |
12533 | -- | |
12534 | ||
12535 | <<< | |
12536 | ||
12537 | :mlton-guide-page: MLRISCLibrary | |
12538 | [[MLRISCLibrary]] | |
12539 | MLRISCLibrary | |
12540 | ============= | |
12541 | ||
12542 | The http://www.cs.nyu.edu/leunga/www/MLRISC/Doc/html/index.html[MLRISC | |
12543 | Library] is a framework for retargetable and optimizing compiler back | |
12544 | ends. The MLRISC Library is distributed with SML/NJ. Due to | |
12545 | differences between SML/NJ and MLton, this library will not work | |
12546 | out of the box with MLton. | |
12547 | ||
12548 | As of 20180119, MLton includes a port of the MLRISC Library | |
12549 | synchronized with SML/NJ version 110.82. | |
12550 | ||
12551 | == Usage == | |
12552 | ||
12553 | * You can import a sub-library of the MLRISC Library into an MLB file with: | |
12554 | + | |
12555 | [options="header"] | |
12556 | |==== | |
12557 | |MLB file|Description | |
12558 | |`$(SML_LIB)/mlrisc-lib/mlb/ALPHA.mlb`|The ALPHA backend | |
12559 | |`$(SML_LIB)/mlrisc-lib/mlb/AMD64.mlb`|The AMD64 backend | |
12560 | |`$(SML_LIB)/mlrisc-lib/mlb/AMD64-Peephole.mlb`|The AMD64 peephole optimizer | |
12561 | |`$(SML_LIB)/mlrisc-lib/mlb/CCall.mlb`| | |
12562 | |`$(SML_LIB)/mlrisc-lib/mlb/CCall-sparc.mlb`| | |
12563 | |`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86-64.mlb`| | |
12564 | |`$(SML_LIB)/mlrisc-lib/mlb/CCall-x86.mlb`| | |
12565 | |`$(SML_LIB)/mlrisc-lib/mlb/Control.mlb`| | |
12566 | |`$(SML_LIB)/mlrisc-lib/mlb/Graphs.mlb`| | |
12567 | |`$(SML_LIB)/mlrisc-lib/mlb/HPPA.mlb`|The HPPA backend | |
12568 | |`$(SML_LIB)/mlrisc-lib/mlb/IA32.mlb`|The IA32 backend | |
12569 | |`$(SML_LIB)/mlrisc-lib/mlb/IA32-Peephole.mlb`|The IA32 peephole optimizer | |
12570 | |`$(SML_LIB)/mlrisc-lib/mlb/Lib.mlb`| | |
12571 | |`$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb`| | |
12572 | |`$(SML_LIB)/mlrisc-lib/mlb/MLTREE.mlb`| | |
12573 | |`$(SML_LIB)/mlrisc-lib/mlb/Peephole.mlb`| | |
12574 | |`$(SML_LIB)/mlrisc-lib/mlb/PPC.mlb`|The PPC backend | |
12575 | |`$(SML_LIB)/mlrisc-lib/mlb/RA.mlb`| | |
12576 | |`$(SML_LIB)/mlrisc-lib/mlb/SPARC.mlb`|The Sparc backend | |
12577 | |`$(SML_LIB)/mlrisc-lib/mlb/StagedAlloc.mlb`| | |
12578 | |`$(SML_LIB)/mlrisc-lib/mlb/Visual.mlb`| | |
12579 | |==== | |
12580 | ||
12581 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
12582 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
12583 | following map is included by default: | |
12584 | + | |
12585 | ---- | |
12586 | # MLRISC Library | |
12587 | $SMLNJ-MLRISC $(SML_LIB)/mlrisc-lib/mlb | |
12588 | ---- | |
12589 | + | |
12590 | This will automatically convert a `$SMLNJ-MLRISC/MLRISC.cm` import in | |
12591 | an input `.cm` file into a `$(SML_LIB)/mlrisc-lib/mlb/MLRISC.mlb` | |
12592 | import in the output `.mlb` file. | |
12593 | ||
12594 | == Details == | |
12595 | ||
12596 | The following changes were made to the MLRISC Library, in addition to | |
12597 | deriving the `.mlb` files from the `.cm` files: | |
12598 | ||
12599 | * eliminate sequential `withtype` expansions: Most could be rewritten as a sequence of type definitions and datatype definitions. | |
12600 | * eliminate higher-order functors: Every higher-order functor definition and application could be uncurried in the obvious way. | |
12601 | * eliminate `where <str> = <str>`: Quite painful to expand out all the flexible types in the respective structures. Furthermore, many of the implied type equalities aren't needed, but it's too hard to pick out the right ones. | |
12602 | * `library/array-noneq.sml` (added, not exported): Implements `signature ARRAY_NONEQ`, similar to `signature ARRAY` from the <:BasisLibrary:Basis Library>, but replacing the latter's `eqtype 'a array = 'a array` and `type 'a vector = 'a Vector.vector` with `type 'a array` and `type 'a vector`. Thus, array-like containers may match `ARRAY_NONEQ`, whereas only the pervasive `'a array` container may match `ARRAY`. (SML/NJ's implementation of `signature ARRAY` omits the type realizations.) | |
12603 | * `library/dynamic-array.sml` and `library/hash-array.sml` (modified): Replace `include ARRAY` with `include ARRAY_NONEQ`; see above. | |
12604 | ||
12605 | == Patch == | |
12606 | ||
12607 | * <!ViewGitFile(mlton,master,lib/mlrisc-lib/MLRISC.patch)> | |
12608 | ||
12609 | <<< | |
12610 | ||
12611 | :mlton-guide-page: MLtonArray | |
12612 | [[MLtonArray]] | |
12613 | MLtonArray | |
12614 | ========== | |
12615 | ||
12616 | [source,sml] | |
12617 | ---- | |
12618 | signature MLTON_ARRAY = | |
12619 | sig | |
12620 | val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a array * 'b | |
12621 | end | |
12622 | ---- | |
12623 | ||
12624 | * `unfoldi (n, b, f)` | |
12625 | + | |
12626 | constructs an array _a_ of length `n`, whose elements _a~i~_ are | |
12627 | determined by the equations __b~0~ = b__ and | |
12628 | __(a~i~, b~i+1~) = f (i, b~i~)__. | |
12629 | ||
12630 | <<< | |
12631 | ||
12632 | :mlton-guide-page: MLtonBinIO | |
12633 | [[MLtonBinIO]] | |
12634 | MLtonBinIO | |
12635 | ========== | |
12636 | ||
12637 | [source,sml] | |
12638 | ---- | |
12639 | signature MLTON_BIN_IO = MLTON_IO | |
12640 | ---- | |
12641 | ||
12642 | See <:MLtonIO:>. | |
12643 | ||
12644 | <<< | |
12645 | ||
12646 | :mlton-guide-page: MLtonCont | |
12647 | [[MLtonCont]] | |
12648 | MLtonCont | |
12649 | ========= | |
12650 | ||
12651 | [source,sml] | |
12652 | ---- | |
12653 | signature MLTON_CONT = | |
12654 | sig | |
12655 | type 'a t | |
12656 | ||
12657 | val callcc: ('a t -> 'a) -> 'a | |
12658 | val isolate: ('a -> unit) -> 'a t | |
12659 | val prepend: 'a t * ('b -> 'a) -> 'b t | |
12660 | val throw: 'a t * 'a -> 'b | |
12661 | val throw': 'a t * (unit -> 'a) -> 'b | |
12662 | end | |
12663 | ---- | |
12664 | ||
12665 | * `type 'a t` | |
12666 | + | |
12667 | the type of continuations that expect a value of type `'a`. | |
12668 | ||
12669 | * `callcc f` | |
12670 | + | |
12671 | applies `f` to the current continuation. This copies the entire | |
12672 | stack; hence, `callcc` takes time proportional to the size of the | |
12673 | current stack. | |
12674 | ||
12675 | * `isolate f` | |
12676 | + | |
12677 | creates a continuation that evaluates `f` in an empty context. This | |
12678 | is a constant time operation, and yields a constant size stack. | |
12679 | ||
12680 | * `prepend (k, f)` | |
12681 | + | |
12682 | composes a function `f` with a continuation `k` to create a | |
12683 | continuation that first does `f` and then does `k`. This is a | |
12684 | constant time operation. | |
12685 | ||
12686 | * `throw (k, v)` | |
12687 | + | |
12688 | throws value `v` to continuation `k`. This copies the entire stack of | |
12689 | `k`; hence, `throw` takes time proportional to the size of this stack. | |
12690 | ||
12691 | * `throw' (k, th)` | |
12692 | + | |
12693 | a generalization of `throw` that evaluates `th ()` in the context of | |
12694 | `k`. Thus, for example, if `th ()` raises an exception or captures | |
12695 | another continuation, it will see `k`, not the current continuation. | |
12696 | ||
12697 | ||
12698 | == Also see == | |
12699 | ||
12700 | * <:MLtonContIsolateImplementation:> | |
12701 | ||
12702 | <<< | |
12703 | ||
12704 | :mlton-guide-page: MLtonContIsolateImplementation | |
12705 | [[MLtonContIsolateImplementation]] | |
12706 | MLtonContIsolateImplementation | |
12707 | ============================== | |
12708 | ||
12709 | As noted before, it is fairly easy to get the operational behavior of `isolate` with just `callcc` and `throw`, but establishing the right space behavior is trickier. Here, we show how to start from the obvious, but inefficient, implementation of `isolate` using only `callcc` and `throw`, and 'derive' an equivalent, but more efficient, implementation of `isolate` using MLton's primitive stack capture and copy operations. This isn't a formal derivation, as we are not formally showing the equivalence of the programs (though I believe that they are all equivalent, modulo the space behavior). | |
12710 | ||
12711 | Here is a direct implementation of `isolate` using only `callcc` and `throw`: | |
12712 | ||
12713 | [source,sml] | |
12714 | ---- | |
12715 | val isolate: ('a -> unit) -> 'a t = | |
12716 | fn (f: 'a -> unit) => | |
12717 | callcc | |
12718 | (fn k1 => | |
12719 | let | |
12720 | val x = callcc (fn k2 => throw (k1, k2)) | |
12721 | val _ = (f x ; Exit.topLevelSuffix ()) | |
12722 | handle exn => MLtonExn.topLevelHandler exn | |
12723 | in | |
12724 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12725 | end) | |
12726 | ---- | |
12727 | ||
12728 | ||
12729 | We use the standard nested `callcc` trick to return a continuation that is ready to receive an argument, execute the isolated function, and exit the program. Both `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program. | |
12730 | ||
12731 | Throwing to an isolated function will execute the function in a 'semantically' empty context, in the sense that we never re-execute the 'original' continuation of the call to `isolate` (i.e., the context that was in place at the time `isolate` was called). However, we assume that the compiler isn't able to recognize that the 'original' continuation is unused; for example, while we (the programmer) know that `Exit.topLevelSuffix` and `MLtonExn.topLevelHandler` will terminate the program, the compiler may only see opaque calls to unknown foreign functions. So, that original continuation (in its entirety) is part of the continuation returned by `isolate`, and throwing to the continuation returned by `isolate` will execute `f x` (with the exit wrapper) in the context of that original continuation. Thus, the garbage collector will retain everything reachable from that original continuation during the evaluation of `f x`, even though it is 'semantically' garbage. | |
12732 ||
12733 | Note that this space leak is independent of the implementation of continuations: it arises in MLton's stack-copying implementation of continuations and would equally arise in SML/NJ's CPS-translation implementation. We are only assuming that the implementation can't 'see' the program termination, and so must retain the original continuation (and anything reachable from it). | |
12734 | ||
12735 | So, we need an 'empty' continuation in which to execute `f x`. (No surprise there, as that is the written description of `isolate`.) To do this, we capture a top-level continuation and throw to that in order to execute `f x`: | |
12736 | ||
12737 | [source,sml] | |
12738 | ---- | |
12739 | local | |
12740 | val base: (unit -> unit) t = | |
12741 | callcc | |
12742 | (fn k1 => | |
12743 | let | |
12744 | val th = callcc (fn k2 => throw (k1, k2)) | |
12745 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12746 | handle exn => MLtonExn.topLevelHandler exn | |
12747 | in | |
12748 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12749 | end) | |
12750 | in | |
12751 | val isolate: ('a -> unit) -> 'a t = | |
12752 | fn (f: 'a -> unit) => | |
12753 | callcc | |
12754 | (fn k1 => | |
12755 | let | |
12756 | val x = callcc (fn k2 => throw (k1, k2)) | |
12757 | in | |
12758 | throw (base, fn () => f x) | |
12759 | end) | |
12760 | end | |
12761 | ---- | |
12762 | ||
12763 | ||
12764 | We presume that `base` is evaluated 'early' in the program. There is a subtlety here, because one needs to believe that this `base` continuation (which technically corresponds to the entire rest of the program evaluation) 'works' as an empty context; in particular, we want it to be the case that executing `f x` in the `base` context retains less space than executing `f x` in the context in place at the call to `isolate` (as occurred in the previous implementation of `isolate`). This isn't particularly easy to believe if one takes a normal substitution-based operational semantics, because it seems that the context captured and bound to `base` is arbitrarily large. However, this context is mostly unevaluated code; the only heap-allocated values that are reachable from it are those that were evaluated before the evaluation of `base` (and used in the program after the evaluation of `base`). Assuming that `base` is evaluated 'early' in the program, we conclude that there are few heap-allocated values reachable from its continuation. In contrast, the previous implementation of `isolate` could capture a context that has many heap-allocated values reachable from it (because we could evaluate `isolate f` 'late' in the program and 'deep' in a call stack), which would all remain reachable during the evaluation of | |
12765 | `f x`. [We'll return to this point later, as it is taking a slightly MLton-esque view of the evaluation of a program, and may not apply as strongly to other implementations (e.g., SML/NJ).] | |
12766 | ||
12767 | Now, once we throw to `base` and begin executing `f x`, only the heap-allocated values reachable from `f` and `x` and the few heap-allocated values reachable from `base` are retained by the garbage collector. So, it seems that `base` 'works' as an empty context. | |
12768 | ||
12769 | But, what about the continuation returned from `isolate f`? Note that the continuation returned by `isolate` is one that receives an argument `x` and then | |
12770 | throws to `base` to evaluate `f x`. If we used a CPS-translation implementation (and assume sufficient beta-contractions to eliminate administrative redexes), then the original continuation passed to `isolate` (i.e., the continuation bound to `k1`) will not be free in the continuation returned by `isolate f`. Rather, the only free variables in the continuation returned by `isolate f` will be `base` and `f`, so the only heap-allocated values reachable from the continuation returned by `isolate f` will be those values reachable from `base` (assumed to be few) and those values reachable from `f` (necessary in order to execute `f` at some later point). | |
12771 | ||
12772 | But, MLton doesn't use a CPS-translation implementation. Rather, at each call to `callcc` in the body of `isolate`, MLton will copy the current execution stack. Thus, `k2` (the continuation returned by `isolate f`) will include the execution stack at the time of the call to `isolate f` -- that is, it will include the 'original' continuation of the call to `isolate f`. Thus, the heap-allocated values reachable from the continuation returned by `isolate f` will include those values reachable from `base`, those values reachable from `f`, and those values reachable from the original continuation of the call to `isolate f`. So, just holding on to the continuation returned by `isolate f` will retain all of the heap-allocated values live at the time `isolate f` was called. This leaks space, since, 'semantically', the | |
12773 | continuation returned by `isolate f` only needs the heap-allocated values reachable from `f` (and `base`). | |
12774 | ||
12775 | In practice, this probably isn't a significant issue. A common use of `isolate` is to implement `abort`: | |
12776 | [source,sml] | |
12777 | ---- | |
12778 | fun abort th = throw (isolate th, ()) | |
12779 | ---- | |
12780 | ||
12781 | The continuation returned by `isolate th` is dead immediately after being thrown to -- the continuation isn't retained, so neither is the 'semantic' | |
12782 | garbage it would have retained. | |
12783 | ||
12784 | But, it is easy enough to 'move' onto the 'empty' context `base` the capturing of the context that we want to be returned by `isolate f`: | |
12785 | ||
12786 | [source,sml] | |
12787 | ---- | |
12788 | local | |
12789 | val base: (unit -> unit) t = | |
12790 | callcc | |
12791 | (fn k1 => | |
12792 | let | |
12793 | val th = callcc (fn k2 => throw (k1, k2)) | |
12794 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12795 | handle exn => MLtonExn.topLevelHandler exn | |
12796 | in | |
12797 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12798 | end) | |
12799 | in | |
12800 | val isolate: ('a -> unit) -> 'a t = | |
12801 | fn (f: 'a -> unit) => | |
12802 | callcc | |
12803 | (fn k1 => | |
12804 | throw (base, fn () => | |
12805 | let | |
12806 | val x = callcc (fn k2 => throw (k1, k2)) | |
12807 | in | |
12808 | throw (base, fn () => f x) | |
12809 | end)) | |
12810 | end | |
12811 | ---- | |
12812 | ||
12813 | ||
12814 | This implementation now has the right space behavior; the continuation returned by `isolate f` will only retain the heap-allocated values reachable from `f` and from `base`. (Technically, the continuation will retain two copies of the stack that was in place at the time `base` was evaluated, but we are assuming that that stack is small.) | |
12815 | ||
12816 | One minor inefficiency of this implementation (given MLton's implementation of continuations) is that every `callcc` and `throw` entails copying a stack (albeit, some of them are small). We can avoid this in the evaluation of `base` by using a reference cell, because `base` is evaluated at the top-level: | |
12817 | ||
12818 | [source,sml] | |
12819 | ---- | |
12820 | local | |
12821 | val base: (unit -> unit) option t = | |
12822 | let | |
12823 | val baseRef: (unit -> unit) option t option ref = ref NONE | |
12824 | val th = callcc (fn k => (baseRef := SOME k; NONE)) | |
12825 | in | |
12826 | case th of | |
12827 | NONE => (case !baseRef of | |
12828 | NONE => raise Fail "MLton.Cont.isolate: missing base" | |
12829 | | SOME base => base) | |
12830 | | SOME th => let | |
12831 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12832 | handle exn => MLtonExn.topLevelHandler exn | |
12833 | in | |
12834 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12836 | end | |
12837 | end | |
12838 | in | |
12839 | val isolate: ('a -> unit) -> 'a t = | |
12840 | fn (f: 'a -> unit) => | |
12841 | callcc | |
12842 | (fn k1 => | |
12843 | throw (base, SOME (fn () => | |
12844 | let | |
12845 | val x = callcc (fn k2 => throw (k1, k2)) | |
12846 | in | |
12847 | throw (base, SOME (fn () => f x)) | |
12848 | end))) | |
12849 | end | |
12850 | ---- | |
12851 | ||
12852 | ||
12853 | Now, to evaluate `base`, we only copy the stack once (instead of 3 times). Because we don't have a dummy continuation around to initialize the reference cell, the reference cell holds a continuation `option`. To distinguish between the original evaluation of `base` (when we want to return the continuation) and the subsequent evaluations of `base` (when we want to evaluate a thunk), we capture a `(unit -> unit) option` continuation. | |
12854 | ||
12855 | This seems to be as far as we can go without exploiting the concrete implementation of continuations in <:MLtonCont:>. Examining the implementation, we note that the type of | |
12856 | continuations is given by | |
12857 | [source,sml] | |
12858 | ---- | |
12859 | type 'a t = (unit -> 'a) -> unit | |
12860 | ---- | |
12861 | ||
12862 | and the implementation of `throw` is given by | |
12863 | [source,sml] | |
12864 | ---- | |
12865 | fun ('a, 'b) throw' (k: 'a t, v: unit -> 'a): 'b = | |
12866 | (k v; raise Fail "MLton.Cont.throw': return from continuation") | |
12867 | ||
12868 | fun ('a, 'b) throw (k: 'a t, v: 'a): 'b = throw' (k, fn () => v) | |
12869 | ---- | |
12870 | ||
12871 | ||
12872 | Suffice it to say, a continuation is simply a function that accepts a thunk to yield the thrown value and the body of the function performs the actual throw. Using this knowledge, we can create a dummy continuation to initialize `baseRef` and greatly simplify the body of `isolate`: | |
12873 | ||
12874 | [source,sml] | |
12875 | ---- | |
12876 | local | |
12877 | val base: (unit -> unit) option t = | |
12878 | let | |
12879 | val baseRef: (unit -> unit) option t ref = | |
12880 | ref (fn _ => raise Fail "MLton.Cont.isolate: missing base") | |
12881 | val th = callcc (fn k => (baseRef := k; NONE)) | |
12882 | in | |
12883 | case th of | |
12884 | NONE => !baseRef | |
12885 | | SOME th => let | |
12886 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12887 | handle exn => MLtonExn.topLevelHandler exn | |
12888 | in | |
12889 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12891 | end | |
12892 | end | |
12893 | in | |
12894 | val isolate: ('a -> unit) -> 'a t = | |
12895 | fn (f: 'a -> unit) => | |
12896 | fn (v: unit -> 'a) => | |
12897 | throw (base, SOME (f o v)) | |
12898 | end | |
12899 | ---- | |
12900 | ||
12901 | ||
12902 | Note that this implementation of `isolate` makes it clear that the continuation returned by `isolate f` only retains the heap-allocated values reachable from `f` and `base`. It also retains only one copy of the stack that was in place at the time `base` was evaluated. Finally, it completely avoids making any copies of the stack that is in place at the time `isolate f` is evaluated; indeed, `isolate f` is a constant-time operation. | |
12903 | ||
12904 | Next, suppose we limited ourselves to capturing `unit` continuations with `callcc`. We can't pass the thunk to be evaluated in the 'empty' context directly, but we can use a reference cell. | |
12905 | ||
12906 | [source,sml] | |
12907 | ---- | |
12908 | local | |
12909 | val thRef: (unit -> unit) option ref = ref NONE | |
12910 | val base: unit t = | |
12911 | let | |
12912 | val baseRef: unit t ref = | |
12913 | ref (fn _ => raise Fail "MLton.Cont.isolate: missing base") | |
12914 | val () = callcc (fn k => baseRef := k) | |
12915 | in | |
12916 | case !thRef of | |
12917 | NONE => !baseRef | |
12918 | | SOME th => | |
12919 | let | |
12920 | val _ = thRef := NONE | |
12921 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12922 | handle exn => MLtonExn.topLevelHandler exn | |
12923 | in | |
12924 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12925 | end | |
12926 | end | |
12927 | in | |
12928 | val isolate: ('a -> unit) -> 'a t = | |
12929 | fn (f: 'a -> unit) => | |
12930 | fn (v: unit -> 'a) => | |
12931 | let | |
12932 | val () = thRef := SOME (f o v) | |
12933 | in | |
12934 | throw (base, ()) | |
12935 | end | |
12936 | end | |
12937 | ---- | |
12938 | ||
12939 | ||
12940 | Note that it is important to set `thRef` to `NONE` before evaluating the thunk, so that the garbage collector doesn't retain all the heap-allocated values reachable from `f` and `v` during the evaluation of `f (v ())`. This is because `thRef` is still live during the evaluation of the thunk; in particular, it was allocated before the evaluation of `base` (and used after), and so is retained by the continuation on which the thunk is evaluated. | |
12941 | ||
12942 | This implementation can be easily adapted to use MLton's primitive stack copying operations. | |
12943 | ||
12944 | [source,sml] | |
12945 | ---- | |
12946 | local | |
12947 | val thRef: (unit -> unit) option ref = ref NONE | |
12948 | val base: Thread.preThread = | |
12949 | let | |
12950 | val () = Thread.copyCurrent () | |
12951 | in | |
12952 | case !thRef of | |
12953 | NONE => Thread.savedPre () | |
12954 | | SOME th => | |
12955 | let | |
12956 | val () = thRef := NONE | |
12957 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12958 | handle exn => MLtonExn.topLevelHandler exn | |
12959 | in | |
12960 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12961 | end | |
12962 | end | |
12963 | in | |
12964 | val isolate: ('a -> unit) -> 'a t = | |
12965 | fn (f: 'a -> unit) => | |
12966 | fn (v: unit -> 'a) => | |
12967 | let | |
12968 | val () = thRef := SOME (f o v) | |
12969 | val new = Thread.copy base | |
12970 | in | |
12971 | Thread.switchTo new | |
12972 | end | |
12973 | end | |
12974 | ---- | |
12975 | ||
12976 | ||
12977 | In essence, `Thread.copyCurrent` copies the current execution stack and stores it in an implicit reference cell in the runtime system, which is fetchable with `Thread.savedPre`. When we are ready to throw to the isolated function, `Thread.copy` copies the saved execution stack (because the stack is modified in place during execution, we need to retain a pristine copy in case the isolated function itself throws to other isolated functions) and `Thread.switchTo` abandons the current execution stack, installing the newly copied execution stack. | |
12978 | ||
12979 | The actual implementation of `MLton.Cont.isolate` simply adds some `Thread.atomicBegin` and `Thread.atomicEnd` commands, which effectively protect the global `thRef` and accommodate the fact that `Thread.switchTo` does an implicit `Thread.atomicEnd` (used for leaving a signal handler thread). | |
12980 | ||
12981 | [source,sml] | |
12982 | ---- | |
12983 | local | |
12984 | val thRef: (unit -> unit) option ref = ref NONE | |
12985 | val base: Thread.preThread = | |
12986 | let | |
12987 | val () = Thread.copyCurrent () | |
12988 | in | |
12989 | case !thRef of | |
12990 | NONE => Thread.savedPre () | |
12991 | | SOME th => | |
12992 | let | |
12993 | val () = thRef := NONE | |
12994 | val _ = MLton.atomicEnd (* Match 1 *) | |
12995 | val _ = (th () ; Exit.topLevelSuffix ()) | |
12996 | handle exn => MLtonExn.topLevelHandler exn | |
12997 | in | |
12998 | raise Fail "MLton.Cont.isolate: return from (wrapped) func" | |
12999 | end | |
13000 | end | |
13001 | in | |
13002 | val isolate: ('a -> unit) -> 'a t = | |
13003 | fn (f: 'a -> unit) => | |
13004 | fn (v: unit -> 'a) => | |
13005 | let | |
13006 | val _ = MLton.atomicBegin (* Match 1 *) | |
13007 | val () = thRef := SOME (f o v) | |
13008 | val new = Thread.copy base | |
13009 | val _ = MLton.atomicBegin (* Match 2 *) | |
13010 | in | |
13011 | Thread.switchTo new (* Match 2 *) | |
13012 | end | |
13013 | end | |
13014 | ---- | |
13015 | ||
13016 | ||
13017 | It is perhaps interesting to note that the above implementation was originally 'derived' by specializing implementations of the <:MLtonThread:> `new`, `prepare`, and `switch` functions as if their only use was in the following implementation of `isolate`: | |
13018 | ||
13019 | [source,sml] | |
13020 | ---- | |
13021 | val isolate: ('a -> unit) -> 'a t = | |
13022 | fn (f: 'a -> unit) => | |
13023 | fn (v: unit -> 'a) => | |
13024 | let | |
13025 | val th = fn () => (f (v ()) ; Exit.topLevelSuffix ()) | |
13026 | handle exn => MLtonExn.topLevelHandler exn | |
13027 | val t = MLton.Thread.prepare (MLton.Thread.new th, ()) | |
13028 | in | |
13029 | MLton.Thread.switch (fn _ => t) | |
13030 | end | |
13031 | ---- | |
13032 | ||
13033 | ||
13034 | It was pleasant to discover that it could equally well be 'derived' starting from the `callcc` and `throw` implementation. | |
13035 | ||
13036 | As a final comment, we noted that the degree to which the context of `base` could be considered 'empty' (i.e., retaining few heap-allocated values) depended upon a slightly MLton-esque view. In particular, MLton does not heap-allocate executable code. So, although the `base` context keeps a lot of unevaluated code 'live', such code is not heap-allocated. In a system like SML/NJ, which does heap-allocate executable code, one might want it to be the case that after throwing to an isolated function, the garbage collector retains only the code necessary to evaluate the function, and not any code that was necessary to evaluate the `base` context. | |
13037 | ||
13038 | <<< | |
13039 | ||
13040 | :mlton-guide-page: MLtonCross | |
13041 | [[MLtonCross]] | |
13042 | MLtonCross | |
13043 | ========== | |
13044 | ||
13045 | The Debian package MLton-Cross adds various targets to MLton. In | |
13046 | combination with the Emdebian project, this allows a Debian system to | |
13047 | compile SML files for other architectures. | |
13048 | ||
13049 | Currently, these targets are supported: | |
13050 | ||
13051 | * _Windows (MinGW)_ | |
13052 | ** -target i586-mingw32msvc (mlton-target-i586-mingw32msvc) | |
13053 | ** -target amd64-mingw32msvc (mlton-target-amd64-mingw32msvc) | |
13054 | * _Linux (Debian)_ | |
13055 | ** -target alpha-linux-gnu (mlton-target-alpha-linux-gnu) | |
13056 | ** -target arm-linux-gnueabi (mlton-target-arm-linux-gnueabi) | |
13057 | ** -target hppa-linux-gnu (mlton-target-hppa-linux-gnu) | |
13058 | ** -target i486-linux-gnu (mlton-target-i486-linux-gnu) | |
13059 | ** -target ia64-linux-gnu (mlton-target-ia64-linux-gnu) | |
13060 | ** -target mips-linux-gnu (mlton-target-mips-linux-gnu) | |
13061 | ** -target mipsel-linux-gnu (mlton-target-mipsel-linux-gnu) | |
13062 | ** -target powerpc-linux-gnu (mlton-target-powerpc-linux-gnu) | |
13063 | ** -target s390-linux-gnu (mlton-target-s390-linux-gnu) | |
13064 | ** -target sparc-linux-gnu (mlton-target-sparc-linux-gnu) | |
13065 | ** -target x86-64-linux-gnu (mlton-target-x86-64-linux-gnu) | |
13066 | ||
13067 | ||
13068 | == Download == | |
13069 | ||
13070 | MLton-Cross is kept in sync with the current MLton release. | |
13071 | ||
13072 | * <!Attachment(MLtonCross,mlton-cross_20100608.orig.tar.gz)> | |
13073 | ||
13074 | <<< | |
13075 | ||
13076 | :mlton-guide-page: MLtonExn | |
13077 | [[MLtonExn]] | |
13078 | MLtonExn | |
13079 | ======== | |
13080 | ||
13081 | [source,sml] | |
13082 | ---- | |
13083 | signature MLTON_EXN = | |
13084 | sig | |
13085 | val addExnMessager: (exn -> string option) -> unit | |
13086 | val history: exn -> string list | |
13087 | ||
13088 | val defaultTopLevelHandler: exn -> 'a | |
13089 | val getTopLevelHandler: unit -> (exn -> unit) | |
13090 | val setTopLevelHandler: (exn -> unit) -> unit | |
13091 | val topLevelHandler: exn -> 'a | |
13092 | end | |
13093 | ---- | |
13094 | ||
13095 | * `addExnMessager f` | |
13096 | + | |
13097 | adds `f` as a pretty-printer to be used by `General.exnMessage` for | |
13098 | converting exceptions to strings. Messagers are tried in order from | |
13099 | most recently added to least recently added. | |
13100 | ||
13101 | * `history e` | |
13102 | + | |
13103 | returns the call stack at the point that `e` was first raised. Each | |
13104 | element of the list is a file position. The elements are in reverse | |
13105 | chronological order, i.e. the function called last is at the front of | |
13106 | the list. | |
13107 | + | |
13108 | `history e` will return `[]` unless the program is compiled with | |
13109 | `-const 'Exn.keepHistory true'`. | |
13110 | ||
13111 | * `defaultTopLevelHandler e` | |
13112 | + | |
13113 | a function that behaves as the default top level handler; that is, it | |
13114 | prints the unhandled exception message for `e` and exits. | |
13115 | ||
13116 | * `getTopLevelHandler ()` | |
13117 | + | |
13118 | get the top level handler. | |
13119 | ||
13120 | * `setTopLevelHandler f` | |
13121 | + | |
13122 | set the top level handler to the function `f`. The function `f` | |
13123 | should not raise an exception or return normally. | |
13124 | ||
13125 | * `topLevelHandler e` | |
13126 | + | |
13127 | behaves as if the top level handler received the exception `e`. | |
13128 | ||
13129 | <<< | |
13130 | ||
13131 | :mlton-guide-page: MLtonFinalizable | |
13132 | [[MLtonFinalizable]] | |
13133 | MLtonFinalizable | |
13134 | ================ | |
13135 | ||
13136 | [source,sml] | |
13137 | ---- | |
13138 | signature MLTON_FINALIZABLE = | |
13139 | sig | |
13140 | type 'a t | |
13141 | ||
13142 | val addFinalizer: 'a t * ('a -> unit) -> unit | |
13143 | val finalizeBefore: 'a t * 'b t -> unit | |
13144 | val new: 'a -> 'a t | |
13145 | val touch: 'a t -> unit | |
13146 | val withValue: 'a t * ('a -> 'b) -> 'b | |
13147 | end | |
13148 | ---- | |
13149 | ||
13150 | A _finalizable_ value is a container to which finalizers can be | |
13151 | attached. A container holds a value, which is reachable as long as | |
13152 | the container itself is reachable. A _finalizer_ is a function that | |
13153 | runs at some point after garbage collection determines that the | |
13154 | container to which it is attached has become | |
13155 | <:Reachability:unreachable>. A finalizer is treated like a signal | |
13156 | handler, in that it runs asynchronously in a separate thread, with | |
13157 | signals blocked, and will not interrupt a critical section (see | |
13158 | <:MLtonThread:>). | |
13159 | ||
13160 | * `addFinalizer (v, f)` | |
13161 | + | |
13162 | adds `f` as a finalizer to `v`. This means that sometime after the | |
13163 | last call to `withValue` on `v` completes and `v` becomes unreachable, | |
13164 | `f` will be called with the value of `v`. | |
13165 | ||
13166 | * `finalizeBefore (v1, v2)` | |
13167 | + | |
13168 | ensures that `v1` will be finalized before `v2`. A cycle of values | |
13169 | `v` = `v1`, ..., `vn` = `v` with `finalizeBefore (vi, vi+1)` will | |
13170 | result in none of the `vi` being finalized. | |
13171 | ||
13172 | * `new x` | |
13173 | + | |
13174 | creates a new finalizable value, `v`, with value `x`. The finalizers | |
13175 | of `v` will run sometime after the last call to `withValue` on `v` | |
13176 | when the garbage collector determines that `v` is unreachable. | |
13177 | ||
13178 | * `touch v` | |
13179 | + | |
13180 | ensures that `v`'s finalizers will not run before the call to `touch`. | |
13181 | ||
13182 | * `withValue (v, f)` | |
13183 | + | |
13184 | returns the result of applying `f` to the value of `v` and ensures | |
13185 | that `v`'s finalizers will not run before `f` completes. The call to | |
13186 | `f` is a nontail call. | |
13187 | ||
13188 | ||
13189 | == Example == | |
13190 | ||
13191 | Suppose that `finalizable.sml` contains the following: | |
13192 | [source,sml] | |
13193 | ---- | |
13194 | sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/finalizable.sml] | |
13195 | ---- | |
13196 | ||
13197 | Suppose that `cons.c` contains the following. | |
13198 | [source,c] | |
13199 | ---- | |
13200 | sys::[./bin/InclGitFile.py mlton master doc/examples/finalizable/cons.c] | |
13201 | ---- | |
13202 | ||
13203 | We can compile these to create an executable with | |
13204 | ---- | |
13205 | % mlton -default-ann 'allowFFI true' finalizable.sml cons.c | |
13206 | ---- | |
13207 | ||
13208 | Running this executable will create output like the following. | |
13209 | ---- | |
13210 | % finalizable | |
13211 | 0x08072890 = listSing (2) | |
13212 | 0x080728a0 = listCons (2) | |
13213 | 0x080728b0 = listCons (2) | |
13214 | 0x080728c0 = listCons (2) | |
13215 | 0x080728d0 = listCons (2) | |
13216 | 0x080728e0 = listCons (2) | |
13217 | 0x080728f0 = listCons (2) | |
13218 | listSum | |
13219 | listSum(l) = 14 | |
13220 | listFree (0x080728f0) | |
13221 | listFree (0x080728e0) | |
13222 | listFree (0x080728d0) | |
13223 | listFree (0x080728c0) | |
13224 | listFree (0x080728b0) | |
13225 | listFree (0x080728a0) | |
13226 | listFree (0x08072890) | |
13227 | ---- | |
13228 | ||
13229 | ||
13230 | == Synchronous Finalizers == | |
13231 | ||
13232 | Finalizers in MLton are asynchronous. That is, they run at an | |
13233 | unspecified time, interrupting the user program. It is also possible, | |
13234 | and sometimes useful, to have synchronous finalizers, where the user | |
13235 | program explicitly decides when to run enabled finalizers. We have | |
13236 | considered this in MLton, and it seems possible, but there are some | |
13237 | unresolved design issues. See the thread at | |
13238 | ||
13239 | * http://www.mlton.org/pipermail/mlton/2004-September/016570.html | |
13240 | ||
13241 | == Also see == | |
13242 | ||
13243 | * <!Cite(Boehm03)> | |
13244 | ||
13245 | <<< | |
13246 | ||
13247 | :mlton-guide-page: MLtonGC | |
13248 | [[MLtonGC]] | |
13249 | MLtonGC | |
13250 | ======= | |
13251 | ||
13252 | [source,sml] | |
13253 | ---- | |
13254 | signature MLTON_GC = | |
13255 | sig | |
13256 | val collect: unit -> unit | |
13257 | val pack: unit -> unit | |
13258 | val setMessages: bool -> unit | |
13259 | val setSummary: bool -> unit | |
13260 | val unpack: unit -> unit | |
13261 | structure Statistics : | |
13262 | sig | |
13263 | val bytesAllocated: unit -> IntInf.int | |
13264 | val lastBytesLive: unit -> IntInf.int | |
13265 | val numCopyingGCs: unit -> IntInf.int | |
13266 | val numMarkCompactGCs: unit -> IntInf.int | |
13267 | val numMinorGCs: unit -> IntInf.int | |
13268 | val maxBytesLive: unit -> IntInf.int | |
13269 | end | |
13270 | end | |
13271 | ---- | |
13272 | ||
13273 | * `collect ()` | |
13274 | + | |
13275 | causes a garbage collection to occur. | |
13276 | ||
13277 | * `pack ()` | |
13278 | + | |
13279 | shrinks the heap as much as possible so that other processes can use | |
13280 | available RAM. | |
13281 | ||
13282 | * `setMessages b` | |
13283 | + | |
13284 | controls whether diagnostic messages are printed at the beginning and | |
13285 | end of each garbage collection. It is the same as the `gc-messages` | |
13286 | runtime system option. | |
13287 | ||
13288 | * `setSummary b` | |
13289 | + | |
13290 | controls whether a summary of garbage collection statistics is printed | |
13291 | upon termination of the program. It is the same as the `gc-summary` | |
13292 | runtime system option. | |
13293 | ||
13294 | * `unpack ()` | |
13295 | + | |
13296 | resizes a packed heap to the size desired by the runtime. | |
13297 | ||
13298 | * `Statistics.bytesAllocated ()` | |
13299 | + | |
13300 | returns bytes allocated (as of the most recent garbage collection). | |
13301 | ||
13302 | * `Statistics.lastBytesLive ()` | |
13303 | + | |
13304 | returns bytes live (as of the most recent garbage collection). | |
13305 | ||
13306 | * `Statistics.numCopyingGCs ()` | |
13307 | + | |
13308 | returns number of (major) copying garbage collections performed (as of | |
13309 | the most recent garbage collection). | |
13310 | ||
13311 | * `Statistics.numMarkCompactGCs ()` | |
13312 | + | |
13313 | returns number of (major) mark-compact garbage collections performed | |
13314 | (as of the most recent garbage collection). | |
13315 | ||
13316 | * `Statistics.numMinorGCs ()` | |
13317 | + | |
13318 | returns number of minor garbage collections performed (as of the most | |
13319 | recent garbage collection). | |
13320 | ||
13321 | * `Statistics.maxBytesLive ()` | |
13322 | + | |
13323 | returns maximum bytes live (as of the most recent garbage collection). | |
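
As a rough illustration of the functions above, the following sketch forces a collection and then reads two of the counters. It assumes the structure is reachable as `MLton.GC`; the list `xs` is only a made-up workload so that the counters have something to report.

[source,sml]
----
(* Hypothetical workload: allocate a list so there is something to measure. *)
val xs = List.tabulate (100000, fn i => i)

(* Force a collection so that the statistics below are up to date. *)
val () = MLton.GC.collect ()

val allocated = MLton.GC.Statistics.bytesAllocated ()
val live = MLton.GC.Statistics.lastBytesLive ()

val () =
   print (concat ["length = ", Int.toString (List.length xs), "\n",
                  "bytes allocated = ", IntInf.toString allocated, "\n",
                  "bytes live = ", IntInf.toString live, "\n"])
----

The exact numbers depend on the platform and on the runtime options in effect.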
13324 | ||
13325 | <<< | |
13326 | ||
13327 | :mlton-guide-page: MLtonIntInf | |
13328 | [[MLtonIntInf]] | |
13329 | MLtonIntInf | |
13330 | =========== | |
13331 | ||
13332 | [source,sml] | |
13333 | ---- | |
13334 | signature MLTON_INT_INF = | |
13335 | sig | |
13336 | type t = IntInf.int | |
13337 | ||
13338 | val areSmall: t * t -> bool | |
13339 | val gcd: t * t -> t | |
13340 | val isSmall: t -> bool | |
13341 | ||
13342 | structure BigWord : WORD | |
13343 | structure SmallInt : INTEGER | |
13344 | datatype rep = | |
13345 | Big of BigWord.word vector | |
13346 | | Small of SmallInt.int | |
13347 | val rep: t -> rep | |
13348 | val fromRep : rep -> t option | |
13349 | end | |
13350 | ---- | |
13351 | ||
13352 | MLton represents an arbitrary precision integer either as an unboxed | |
13353 | word with the bottom bit set to 1 and the top bits representing a | |
13354 | small signed integer, or as a pointer to a vector of words, where the | |
13355 | first word indicates the sign and the rest are the limbs of a | |
13356 | <:GnuMP:> big integer. | |
13357 | ||
13358 | * `type t` | |
13359 | + | |
13360 | the same as type `IntInf.int`. | |
13361 | ||
13362 | * `areSmall (a, b)` | |
13363 | + | |
13364 | returns true iff both `a` and `b` are small. | |
13365 | ||
13366 | * `gcd (a, b)` | |
13367 | + | |
13368 | returns the greatest common divisor of `a` and `b`, computed with <:GnuMP:GnuMP's> fast gcd implementation. | |
13369 | ||
13370 | * `isSmall a` | |
13371 | + | |
13372 | returns true iff `a` is small. | |
13373 | ||
13374 | * `BigWord : WORD` | |
13375 | + | |
13376 | representation of a big `IntInf.int` as a vector of words; on 32-bit | |
13377 | platforms, `BigWord` is likely to be equivalent to `Word32`, and on | |
13378 | 64-bit platforms, `BigWord` is likely to be equivalent to `Word64`. | |
13379 | ||
13380 | * `SmallInt : INTEGER` | |
13381 | + | |
13382 | representation of a small `IntInf.int` as a signed integer; on 32-bit | |
13383 | platforms, `SmallInt` is likely to be equivalent to `Int32`, and on | |
13384 | 64-bit platforms, `SmallInt` is likely to be equivalent to `Int64`. | |
13385 | ||
13386 | * `datatype rep` | |
13387 | + | |
13388 | the underlying representation of an `IntInf.int`. | |
13389 | ||
13390 | * `rep i` | |
13391 | + | |
13392 | returns the underlying representation of `i`. | |
13393 | ||
13394 | * `fromRep r` | |
13395 | + | |
13396 | converts from the underlying representation back to an `IntInf.int`. | |
13397 | If `fromRep r` is given anything besides the valid result of `rep i` | |
13398 | for some `i`, this function call will return `NONE`. | |
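
A minimal sketch of inspecting the representation, assuming the structure is reachable as `MLton.IntInf`. The helper `describe` and the values `small` and `big` are hypothetical names introduced only for this example.

[source,sml]
----
structure I = MLton.IntInf

(* Report which of the two representations a given integer uses. *)
fun describe (i: IntInf.int): string =
   case I.rep i of
      I.Small s => "small " ^ I.SmallInt.toString s
    | I.Big v => "big, " ^ Int.toString (Vector.length v) ^ " words"

val small: IntInf.int = 42
val big: IntInf.int = IntInf.pow (2, 200)

val () = print (describe small ^ "\n")
val () = print (describe big ^ "\n")

(* rep followed by fromRep is expected to give back the original integer. *)
val roundTrips = (I.fromRep (I.rep big) = SOME big)
val () = print ("round trip ok: " ^ Bool.toString roundTrips ^ "\n")
----

The number of words reported for `big` depends on the platform's word size.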
13399 | ||
13400 | <<< | |
13401 | ||
13402 | :mlton-guide-page: MLtonIO | |
13403 | [[MLtonIO]] | |
13404 | MLtonIO | |
13405 | ======= | |
13406 | ||
13407 | [source,sml] | |
13408 | ---- | |
13409 | signature MLTON_IO = | |
13410 | sig | |
13411 | type instream | |
13412 | type outstream | |
13413 | ||
13414 | val inFd: instream -> Posix.IO.file_desc | |
13415 | val mkstemp: string -> string * outstream | |
13416 | val mkstemps: {prefix: string, suffix: string} -> string * outstream | |
13417 | val newIn: Posix.IO.file_desc * string -> instream | |
13418 | val newOut: Posix.IO.file_desc * string -> outstream | |
13419 | val outFd: outstream -> Posix.IO.file_desc | |
13420 | val tempPrefix: string -> string | |
13421 | end | |
13422 | ---- | |
13423 | ||
13424 | * `inFd ins` | |
13425 | + | |
13426 | returns the file descriptor corresponding to `ins`. | |
13427 | ||
13428 | * `mkstemp s` | |
13429 | + | |
13430 | like the C `mkstemp` function, generates and opens a temporary file | |
13431 | with prefix `s`. | |
13432 | ||
13433 | * `mkstemps {prefix, suffix}` | |
13434 | + | |
13435 | like `mkstemp`, except it has both a prefix and suffix. | |
13436 | ||
13437 | * `newIn (fd, name)` | |
13438 | + | |
13439 | creates a new instream from file descriptor `fd`, with `name` used in | |
13440 | any `Io` exceptions later raised. | |
13441 | ||
13442 | * `newOut (fd, name)` | |
13443 | + | |
13444 | creates a new outstream from file descriptor `fd`, with `name` used in | |
13445 | any `Io` exceptions later raised. | |
13446 | ||
13447 | * `outFd out` | |
13448 | + | |
13449 | returns the file descriptor corresponding to `out`. | |
13450 | ||
13451 | * `tempPrefix s` | |
13452 | + | |
13453 | adds a suitable system- or user-specific prefix (directory) for temp | |
13454 | files. | |
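
The following sketch shows one way the temporary-file functions might be combined. It assumes that `MLton.TextIO` is an instance of this signature whose `outstream` is `TextIO.outstream`; the file name stem `"example"` is arbitrary.

[source,sml]
----
(* Ask for a temp file whose name starts with a platform-appropriate
   directory followed by "example". *)
val (path, out) = MLton.TextIO.mkstemp (MLton.TextIO.tempPrefix "example")

val () = TextIO.output (out, "hello\n")
val () = TextIO.closeOut out

val () = print ("wrote temporary file " ^ path ^ "\n")

(* Clean up; mkstemp leaves the file in place. *)
val () = OS.FileSys.remove path
----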
13455 | ||
13456 | <<< | |
13457 | ||
13458 | :mlton-guide-page: MLtonItimer | |
13459 | [[MLtonItimer]] | |
13460 | MLtonItimer | |
13461 | =========== | |
13462 | ||
13463 | [source,sml] | |
13464 | ---- | |
13465 | signature MLTON_ITIMER = | |
13466 | sig | |
13467 | datatype t = | |
13468 | Prof | |
13469 | | Real | |
13470 | | Virtual | |
13471 | ||
13472 | val set: t * {interval: Time.time, value: Time.time} -> unit | |
13473 | val signal: t -> Posix.Signal.signal | |
13474 | end | |
13475 | ---- | |
13476 | ||
13477 | * `set (t, {interval, value})` | |
13478 | + | |
13479 | sets the interval timer (using `setitimer`) specified by `t` to the | |
13480 | given `interval` and `value`. | |
13481 | ||
13482 | * `signal t` | |
13483 | + | |
13484 | returns the signal corresponding to `t`. | |
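
A hedged sketch of a periodic timer follows. It assumes the structure is reachable as `MLton.Itimer` and relies on `MLton.Signal.setHandler` and `MLton.Signal.Handler.simple` from the separate `MLton.Signal` structure, which is not described on this page. The counter `ticks` and the 10 ms period are illustrative choices.

[source,sml]
----
val ticks = ref 0

(* Install a handler first; otherwise the timer's signal would
   terminate the program under the default disposition. *)
val alrm = MLton.Itimer.signal MLton.Itimer.Real
val () =
   MLton.Signal.setHandler
   (alrm, MLton.Signal.Handler.simple (fn () => ticks := !ticks + 1))

(* Fire after 10 ms, then every 10 ms. *)
val period = Time.fromMilliseconds 10
val () = MLton.Itimer.set (MLton.Itimer.Real,
                           {interval = period, value = period})

(* Busy-wait until the handler has run a few times. *)
fun wait () = if !ticks < 5 then wait () else ()
val () = wait ()

(* A zero value disarms the timer, as with setitimer. *)
val () = MLton.Itimer.set (MLton.Itimer.Real,
                           {interval = Time.zeroTime, value = Time.zeroTime})

val () = print ("ticks seen: " ^ Int.toString (!ticks) ^ "\n")
----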
13485 | ||
13486 | <<< | |
13487 | ||
13488 | :mlton-guide-page: MLtonLibraryProject | |
13489 | [[MLtonLibraryProject]] | |
13490 | MLtonLibraryProject | |
13491 | =================== | |
13492 | ||
13493 | We have a https://github.com/MLton/mltonlib[MLton Library repository] | |
13494 | that is intended to collect libraries. | |
13495 | ||
13496 | ===== | |
13497 | https://github.com/MLton/mltonlib | |
13498 | ===== | |
13499 | ||
13500 | Libraries are kept in the `master` branch, and are grouped according | |
13501 | to domain name, in the Java package style. For example, | |
13502 | <:VesaKarvonen:>, who works at `ssh.com`, has been putting code at: | |
13503 | ||
13504 | ===== | |
13505 | https://github.com/MLton/mltonlib/tree/master/com/ssh | |
13506 | ===== | |
13507 | ||
13508 | <:StephenWeeks:>, owning `sweeks.com`, has been putting code at: | |
13509 | ||
13510 | ===== | |
13511 | https://github.com/MLton/mltonlib/tree/master/com/sweeks | |
13512 | ===== | |
13513 | ||
13514 | A "library" is a subdirectory of some such directory. For example, | |
13515 | Stephen's basis-library replacement library is at | |
13516 | ||
13517 | ===== | |
13518 | https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic | |
13519 | ===== | |
13520 | ||
13521 | We use "transparent per-library branching" to handle library | |
13522 | versioning. Each library has an "unstable" subdirectory in which work | |
13523 | happens. When one is happy with a library, one tags it by copying it | |
13524 | to a stable version directory. Stable libraries are immutable -- when | |
13525 | one refers to a stable library, one always gets exactly the same code. | |
13526 | No one has actually made a stable library yet, but, when I'm ready to | |
13527 | tag my library, I was thinking that I would do something like copying | |
13528 | ||
13529 | ===== | |
13530 | https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/unstable | |
13531 | ===== | |
13532 | ||
13533 | to | |
13534 | ||
13535 | ===== | |
13536 | https://github.com/MLton/mltonlib/tree/master/com/sweeks/basic/v1 | |
13537 | ===== | |
13538 | ||
13539 | So far, libraries in the MLton repository have been licensed under | |
13540 | MLton's <:License:>. We haven't decided on whether that will be a | |
13541 | requirement to be in the repository or not. For the sake of | |
13542 | simplicity (a single license) and encouraging widest use of code, | |
13543 | contributors are encouraged to use that license. But it may be too | |
13544 | strict to require it. | |
13545 | ||
13546 | If someone wants to contribute a new library to our repository or to | |
13547 | work on an old one, they can make a pull request. If people want to | |
13548 | work in their own repository, they can do so -- that's the point of | |
13549 | using domain names to prevent clashes. The idea is that a user should | |
13550 | be able to bring library collections in from many different | |
13551 | repositories without problems. And those libraries could even work | |
13552 | with each other. | |
13553 | ||
13554 | At some point we may want to settle on an <:MLBasisPathMap:> variable | |
13555 | for the root of the library project. Or, we could reuse `SML_LIB`, | |
13556 | and migrate what we currently keep there into the library | |
13557 | infrastructure. | |
13558 | ||
13559 | <<< | |
13560 | ||
13561 | :mlton-guide-page: MLtonMonoArray | |
13562 | [[MLtonMonoArray]] | |
13563 | MLtonMonoArray | |
13564 | ============== | |
13565 | ||
13566 | [source,sml] | |
13567 | ---- | |
13568 | signature MLTON_MONO_ARRAY = | |
13569 | sig | |
13570 | type t | |
13571 | type elem | |
13572 | val fromPoly: elem array -> t | |
13573 | val toPoly: t -> elem array | |
13574 | end | |
13575 | ---- | |
13576 | ||
13577 | * `type t` | |
13578 | + | |
13579 | type of monomorphic array | |
13580 | ||
13581 | * `type elem` | |
13582 | + | |
13583 | type of array elements | |
13584 | ||
13585 | * `fromPoly a` | |
13586 | + | |
13587 | type cast a polymorphic array to its monomorphic counterpart; the | |
13588 | argument and result arrays share the same identity | |
13589 | ||
13590 | * `toPoly a` | |
13591 | + | |
13592 | type cast a monomorphic array to its polymorphic counterpart; the | |
13593 | argument and result arrays share the same identity | |
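
To illustrate the shared identity, here is a small sketch that assumes `MLton.Word8Array` is an instance of this signature with `t = Word8Array.array` and `elem = Word8.word`.

[source,sml]
----
val poly: Word8.word array = Array.tabulate (4, Word8.fromInt)

(* Cast, not copy: mono and poly are expected to be the same array. *)
val mono: Word8Array.array = MLton.Word8Array.fromPoly poly

(* Update through the monomorphic view ... *)
val () = Word8Array.update (mono, 0, 0wxFF)

(* ... and read the element back through the polymorphic view. *)
val () = print (Word8.toString (Array.sub (poly, 0)) ^ "\n")
----

If the two arrays share identity as described, the update made through `mono` is visible through `poly`.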
13594 | ||
13595 | <<< | |
13596 | ||
13597 | :mlton-guide-page: MLtonMonoVector | |
13598 | [[MLtonMonoVector]] | |
13599 | MLtonMonoVector | |
13600 | =============== | |
13601 | ||
13602 | [source,sml] | |
13603 | ---- | |
13604 | signature MLTON_MONO_VECTOR = | |
13605 | sig | |
13606 | type t | |
13607 | type elem | |
13608 | val fromPoly: elem vector -> t | |
13609 | val toPoly: t -> elem vector | |
13610 | end | |
13611 | ---- | |
13612 | ||
13613 | * `type t` | |
13614 | + | |
13615 | type of monomorphic vector | |
13616 | ||
13617 | * `type elem` | |
13618 | + | |
13619 | type of vector elements | |
13620 | ||
13621 | * `fromPoly v` | |
13622 | + | |
13623 | type cast a polymorphic vector to its monomorphic counterpart; in | |
13624 | MLton, this is a constant-time operation | |
13625 | ||
13626 | * `toPoly v` | |
13627 | + | |
13628 | type cast a monomorphic vector to its polymorphic counterpart; in | |
13629 | MLton, this is a constant-time operation | |
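
A short sketch, assuming `MLton.CharVector` is an instance of this signature with `t = string` and `elem = char`:

[source,sml]
----
(* View a string as a polymorphic vector of characters. *)
val v: char vector = MLton.CharVector.toPoly "hello"

(* Use the generic Vector operations, then cast back to a string. *)
val upper: string = MLton.CharVector.fromPoly (Vector.map Char.toUpper v)

val () = print (upper ^ "\n")
----

This makes the polymorphic `Vector` functions available on strings without copying at the cast.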
13630 | ||
13631 | <<< | |
13632 | ||
13633 | :mlton-guide-page: MLtonPlatform | |
13634 | [[MLtonPlatform]] | |
13635 | MLtonPlatform | |
13636 | ============= | |
13637 | ||
13638 | [source,sml] | |
13639 | ---- | |
13640 | signature MLTON_PLATFORM = | |
13641 | sig | |
13642 | structure Arch: | |
13643 | sig | |
13644 | datatype t = Alpha | AMD64 | ARM | ARM64 | HPPA | IA64 | m68k | |
13645 | | MIPS | PowerPC | PowerPC64 | S390 | Sparc | X86 | |
13646 | ||
13647 | val fromString: string -> t option | |
13648 | val host: t | |
13649 | val toString: t -> string | |
13650 | end | |
13651 | ||
13652 | structure OS: | |
13653 | sig | |
13654 | datatype t = AIX | Cygwin | Darwin | FreeBSD | Hurd | HPUX | |
13655 | | Linux | MinGW | NetBSD | OpenBSD | Solaris | |
13656 | ||
13657 | val fromString: string -> t option | |
13658 | val host: t | |
13659 | val toString: t -> string | |
13660 | end | |
13661 | end | |
13662 | ---- | |
13663 | ||
13664 | * `datatype Arch.t` | |
13665 | + | |
13666 | processor architectures | |
13667 | ||
13668 | * `Arch.fromString a` | |
13669 | + | |
13670 | converts from string to architecture. Case insensitive. | |
13671 | ||
13672 | * `Arch.host` | |
13673 | + | |
13674 | the architecture for which the program is compiled. | |
13675 | ||
13676 | * `Arch.toString` | |
13677 | + | |
13678 | string for architecture. | |
13679 | ||
13680 | * `datatype OS.t` | |
13681 | + | |
13682 | operating systems | |
13683 | ||
13684 | * `OS.fromString` | |
13685 | + | |
13686 | converts from string to operating system. Case insensitive. | |
13687 | ||
13688 | * `OS.host` | |
13689 | + | |
13690 | the operating system for which the program is compiled. | |
13691 | ||
13692 | * `OS.toString` | |
13693 | + | |
13694 | string for operating system. | |
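
A minimal sketch of querying the platform, assuming the structure is reachable as `MLton.Platform`. The name `isWindows` is a hypothetical helper for this example.

[source,sml]
----
structure Arch = MLton.Platform.Arch
structure OS' = MLton.Platform.OS

val () =
   print (concat ["arch = ", Arch.toString Arch.host, "\n",
                  "os = ", OS'.toString OS'.host, "\n"])

(* Branch on the host operating system. *)
val isWindows =
   case OS'.host of
      OS'.Cygwin => true
    | OS'.MinGW => true
    | _ => false

val () = print ("windows-like host: " ^ Bool.toString isWindows ^ "\n")
----

Because `host` is fixed when the program is compiled, such a test describes the compilation target rather than a property discovered at run time.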
13695 | ||
13696 | <<< | |
13697 | ||
13698 | :mlton-guide-page: MLtonPointer | |
13699 | [[MLtonPointer]] | |
13700 | MLtonPointer | |
13701 | ============ | |
13702 | ||
13703 | [source,sml] | |
13704 | ---- | |
13705 | signature MLTON_POINTER = | |
13706 | sig | |
13707 | eqtype t | |
13708 | ||
13709 | val add: t * word -> t | |
13710 | val compare: t * t -> order | |
13711 | val diff: t * t -> word | |
13712 | val getInt8: t * int -> Int8.int | |
13713 | val getInt16: t * int -> Int16.int | |
13714 | val getInt32: t * int -> Int32.int | |
13715 | val getInt64: t * int -> Int64.int | |
13716 | val getPointer: t * int -> t | |
13717 | val getReal32: t * int -> Real32.real | |
13718 | val getReal64: t * int -> Real64.real | |
13719 | val getWord8: t * int -> Word8.word | |
13720 | val getWord16: t * int -> Word16.word | |
13721 | val getWord32: t * int -> Word32.word | |
13722 | val getWord64: t * int -> Word64.word | |
13723 | val null: t | |
13724 | val setInt8: t * int * Int8.int -> unit | |
13725 | val setInt16: t * int * Int16.int -> unit | |
13726 | val setInt32: t * int * Int32.int -> unit | |
13727 | val setInt64: t * int * Int64.int -> unit | |
13728 | val setPointer: t * int * t -> unit | |
13729 | val setReal32: t * int * Real32.real -> unit | |
13730 | val setReal64: t * int * Real64.real -> unit | |
13731 | val setWord8: t * int * Word8.word -> unit | |
13732 | val setWord16: t * int * Word16.word -> unit | |
13733 | val setWord32: t * int * Word32.word -> unit | |
13734 | val setWord64: t * int * Word64.word -> unit | |
13735 | val sizeofPointer: word | |
13736 | val sub: t * word -> t | |
13737 | end | |
13738 | ---- | |
13739 | ||
13740 | * `eqtype t` | |
13741 | + | |
13742 | the type of pointers, i.e. machine addresses. | |
13743 | ||
13744 | * `add (p, w)` | |
13745 | + | |
13746 | returns the pointer `w` bytes after `p`. Does not check for | |
13747 | overflow. | |
13748 | ||
13749 | * `compare (p1, p2)` | |
13750 | + | |
13751 | compares the pointer `p1` to the pointer `p2` (as addresses). | |
13752 | ||
13753 | * `diff (p1, p2)` | |
13754 | + | |
13755 | returns the number of bytes `w` such that `add (p2, w) = p1`. Does | |
13756 | not check for overflow. | |
13757 | ||
13758 | * ++get__<X>__ (p, i)++ | |
13759 | + | |
13760 | returns the object stored at index `i` of the array of _X_ objects | |
13761 | pointed to by `p`. For example, `getWord32 (p, 7)` returns the 32-bit | |
13762 | word stored 28 bytes beyond `p`. | |
13763 | ||
13764 | * `null` | |
13765 | + | |
13766 | the null pointer, i.e. 0. | |
13767 | ||
13768 | * ++set__<X>__ (p, i, v)++ | |
13769 | + | |
13770 | assigns `v` to the object stored at index `i` of the array of _X_ | |
13771 | objects pointed to by `p`. For example, `setWord32 (p, 7, w)` stores | |
13772 | the 32-bit word `w` at the address 28 bytes beyond `p`. | |
13773 | ||
13774 | * `sizeofPointer` | |
13775 | + | |
13776 | size, in bytes, of a pointer. | |
13777 | ||
13778 | * `sub (p, w)` | |
13779 | + | |
13780 | returns the pointer `w` bytes before `p`. Does not check for | |
13781 | overflow. | |
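
The address arithmetic can be sketched without dereferencing anything, which keeps the example safe to run. It assumes the structure is reachable as `MLton.Pointer`. The offset of 28 bytes matches the `getWord32 (p, 7)` example above, since 7 * 4 = 28.

[source,sml]
----
structure P = MLton.Pointer

val base = P.null

(* The address that getWord32 (base, 7) would read: 7 * 4 = 28 bytes on. *)
val p = P.add (base, 0w28)

(* diff is the inverse of add; sub undoes add. *)
val offset = P.diff (p, base)
val back = P.sub (p, offset)

val () = print ("offset = " ^ Word.fmt StringCvt.DEC offset ^ "\n")
val () = print ("back at base: " ^ Bool.toString (back = base) ^ "\n")
val () = print ("pointer size = " ^ Word.fmt StringCvt.DEC P.sizeofPointer
                ^ " bytes\n")
----

Actually reading or writing through a pointer requires a valid address, typically one obtained through the foreign function interface.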
13782 | ||
13783 | <<< | |
13784 | ||
13785 | :mlton-guide-page: MLtonProcEnv | |
13786 | [[MLtonProcEnv]] | |
13787 | MLtonProcEnv | |
13788 | ============ | |
13789 | ||
13790 | [source,sml] | |
13791 | ---- | |
13792 | signature MLTON_PROC_ENV = | |
13793 | sig | |
13794 | type gid | |
13795 | ||
13796 | val setenv: {name: string, value: string} -> unit | |
13797 | val setgroups: gid list -> unit | |
13798 | end | |
13799 | ---- | |
13800 | ||
13801 | * `setenv {name, value}` | |
13802 | + | |
13803 | like the C `setenv` function. Does not require `name` or `value` to | |
13804 | be null terminated. | |
13805 | ||
13806 | * `setgroups grps` | |
13807 | + | |
13808 | like the C `setgroups` function. | |
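
A small sketch of `setenv`, assuming the structure is reachable as `MLton.ProcEnv`. The variable name `EXAMPLE_VAR` is made up for this example.

[source,sml]
----
val () = MLton.ProcEnv.setenv {name = "EXAMPLE_VAR", value = "42"}

(* Read the variable back through the standard basis. *)
val () =
   case OS.Process.getEnv "EXAMPLE_VAR" of
      SOME v => print ("EXAMPLE_VAR = " ^ v ^ "\n")
    | NONE => print "EXAMPLE_VAR is unset\n"
----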
13809 | ||
13810 | <<< | |
13811 | ||
13812 | :mlton-guide-page: MLtonProcess | |
13813 | [[MLtonProcess]] | |
13814 | MLtonProcess | |
13815 | ============ | |
13816 | ||
13817 | [source,sml] | |
13818 | ---- | |
13819 | signature MLTON_PROCESS = | |
13820 | sig | |
13821 | type pid | |
13822 | ||
13823 | val spawn: {args: string list, path: string} -> pid | |
13824 | val spawne: {args: string list, env: string list, path: string} -> pid | |
13825 | val spawnp: {args: string list, file: string} -> pid | |
13826 | ||
13827 | type ('stdin, 'stdout, 'stderr) t | |
13828 | ||
13829 | type input | |
13830 | type output | |
13831 | ||
13832 | type none | |
13833 | type chain | |
13834 | type any | |
13835 | ||
13836 | exception MisuseOfForget | |
13837 | exception DoublyRedirected | |
13838 | ||
13839 | structure Child: | |
13840 | sig | |
13841 | type ('use, 'dir) t | |
13842 | ||
13843 | val binIn: (BinIO.instream, input) t -> BinIO.instream | |
13844 | val binOut: (BinIO.outstream, output) t -> BinIO.outstream | |
13845 | val fd: (Posix.FileSys.file_desc, 'dir) t -> Posix.FileSys.file_desc | |
13846 | val remember: (any, 'dir) t -> ('use, 'dir) t | |
13847 | val textIn: (TextIO.instream, input) t -> TextIO.instream | |
13848 | val textOut: (TextIO.outstream, output) t -> TextIO.outstream | |
13849 | end | |
13850 | ||
13851 | structure Param: | |
13852 | sig | |
13853 | type ('use, 'dir) t | |
13854 | ||
13855 | val child: (chain, 'dir) Child.t -> (none, 'dir) t | |
13856 | val fd: Posix.FileSys.file_desc -> (none, 'dir) t | |
13857 | val file: string -> (none, 'dir) t | |
13858 | val forget: ('use, 'dir) t -> (any, 'dir) t | |
13859 | val null: (none, 'dir) t | |
13860 | val pipe: ('use, 'dir) t | |
13861 | val self: (none, 'dir) t | |
13862 | end | |
13863 | ||
13864 | val create: | |
13865 | {args: string list, | |
13866 | env: string list option, | |
13867 | path: string, | |
13868 | stderr: ('stderr, output) Param.t, | |
13869 | stdin: ('stdin, input) Param.t, | |
13870 | stdout: ('stdout, output) Param.t} | |
13871 | -> ('stdin, 'stdout, 'stderr) t | |
13872 | val getStderr: ('stdin, 'stdout, 'stderr) t -> ('stderr, input) Child.t | |
13873 | val getStdin: ('stdin, 'stdout, 'stderr) t -> ('stdin, output) Child.t | |
13874 | val getStdout: ('stdin, 'stdout, 'stderr) t -> ('stdout, input) Child.t | |
13875 | val kill: ('stdin, 'stdout, 'stderr) t * Posix.Signal.signal -> unit | |
13876 | val reap: ('stdin, 'stdout, 'stderr) t -> Posix.Process.exit_status | |
13877 | end | |
13878 | ---- | |
13879 | ||
13880 | ||
13881 | == Spawn == | |
13882 | ||
13883 | The `spawn` functions provide an alternative to the | |
13884 | `fork`/`exec` idiom that is typically used to create a new | |
13885 | process. On most platforms, the `spawn` functions are simple | |
13886 | wrappers around `fork`/`exec`. However, under Windows, the | |
13887 | `spawn` functions are primitive. All `spawn` functions return | |
13888 | the process id of the spawned process. They differ in how the | |
13889 | executable is found and the environment that it uses. | |
13890 | ||
13891 | * `spawn {args, path}` | |
13892 | + | |
13893 | starts a new process running the executable specified by `path` | |
13894 | with the arguments `args`. Like `Posix.Process.exec`. | |
13895 | ||
13896 | * `spawne {args, env, path}` | |
13897 | + | |
13898 | starts a new process running the executable specified by `path` with | |
13899 | the arguments `args` and environment `env`. Like | |
13900 | `Posix.Process.exece`. | |
13901 | ||
13902 | * `spawnp {args, file}` | |
13903 | + | |
13904 | searches the `PATH` environment variable for an executable named `file`, | |
13905 | and starts a new process running that executable with the arguments | |
13906 | `args`. Like `Posix.Process.execp`. | |
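
A hedged sketch of `spawnp` on a Unix-like host follows. It makes two assumptions that are not stated above: that `args` includes the program name as its first element, by analogy with `Posix.Process.execp`, and that `MLton.Process.pid` is the same type as `Posix.Process.pid`, so the child can be awaited with `Posix.Process.waitpid`.

[source,sml]
----
(* Assumption: args starts with the program name, as for execp. *)
val pid = MLton.Process.spawnp {file = "echo", args = ["echo", "hello"]}

(* Assumption: pid can be passed to Posix.Process.waitpid. *)
val (_, status) = Posix.Process.waitpid (Posix.Process.W_CHILD pid, [])

val () =
   case status of
      Posix.Process.W_EXITED => print "child exited normally\n"
    | _ => print "child did not exit normally\n"
----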
13907 | ||
13908 | ||
13909 | == Create == | |
13910 | ||
13911 | `MLton.Process.create` provides functionality similar to | |
13912 | `Unix.executeInEnv`, but provides more control over the input, | |
13913 | output, and error streams. In addition, `create` works on all | |
13914 | platforms, including Cygwin and MinGW (Windows) where `Posix.fork` is | |
13915 | unavailable. For greatest portability, programs should still use the | |
13916 | standard `Unix.execute`, `Unix.executeInEnv`, and `OS.Process.system`. | |
13917 | ||
13918 | The following types and sub-structures are used by the `create` | |
13919 | function. They provide static type checking of correct stream usage. | |
13920 | ||
13921 | === Child === | |
13922 | ||
13923 | * `('use, 'dir) Child.t` | |
13924 | + | |
13925 | This represents a handle to one of a child's standard streams. The | |
13926 | `'dir` is viewed with respect to the parent. Thus a `('a, input) | |
13927 | Child.t` handle means that the parent may input the output from the | |
13928 | child. | |
13929 | ||
13930 | * `Child.{bin,text}{In,Out} h` | |
13931 | + | |
13932 | These functions take a handle and bind it to a stream of the named | |
13933 | type. The type system will detect attempts to reverse the direction | |
13934 | of a stream or to use the same stream in multiple, incompatible ways. | |
13935 | ||
13936 | * `Child.fd h` | |
13937 | + | |
13938 | This function behaves like the other `Child.*` functions; it opens a | |
13939 | stream. However, it does not enforce that you read or write from the | |
13940 | handle. If you use the descriptor in an inappropriate direction, the | |
13941 | behavior is undefined. Furthermore, this function may potentially be | |
13942 | unavailable on future MLton host platforms. | |
13943 | ||
13944 | * `Child.remember h` | |
13945 | + | |
13946 | This function takes a stream of use `any` and resets the use of the | |
13947 | stream so that the stream may be used by `Child.*`. An `any` stream | |
13948 | may have had use `none` or `'use` prior to calling `Param.forget`. If | |
13949 | the stream was `none` and is used, `MisuseOfForget` is raised. | |
13950 | ||
13951 | === Param === | |
13952 | ||
13953 | * `('use, 'dir) Param.t` | |
13954 | + | |
13955 | This is a handle to an input/output source and will be passed to the | |
13956 | created child process. The `'dir` is relative to the child process. | |
13957 | Input means that the child process will read from this stream. | |
13958 | ||
13959 | * `Param.child h` | |
13960 | + | |
13961 | Connect the stream of the new child process to the stream of a | |
13962 | previously created child process. A single child stream should be | |
13963 | connected to only one child process or else `DoublyRedirected` will be | |
13964 | raised. | |
13965 | ||
13966 | * `Param.fd fd` | |
13967 | + | |
13968 | This creates a stream from the provided file descriptor which will be | |
13969 | closed when `create` is called. This function may not be available on | |
13970 | future MLton host platforms. | |
13971 | ||
13972 | * `Param.forget h` | |
13973 | + | |
13974 | This hides the type of the actual parameter as `any`. This is useful | |
13975 | if you are implementing an application which conditionally attaches | |
13976 | the child process to files or pipes. However, you must ensure that | |
13977 | your use after `Child.remember` matches the original type. | |
13978 | ||
13979 | * `Param.file s` | |
13980 | + | |
13981 | Open the given file and connect it to the child process. Note that the | |
13982 | file will be opened only when `create` is called. So any exceptions | |
13983 | will be raised there and not by this function. If used for `input`, | |
13984 | the file is opened read-only. If used for `output`, the file is opened | |
13985 | read-write. | |
13986 | ||
13987 | * `Param.null` | |
13988 | + | |
13989 | In some situations, the child process should have its output | |
13990 | discarded. The `null` param when passed as `stdout` or `stderr` does | |
13991 | this. When used for `stdin`, the child process will either receive | |
13992 | `EOF` or a failure condition if it attempts to read from `stdin`. | |
13993 | ||
13994 | * `Param.pipe` | |
13995 | + | |
13996 | This will connect the input/output of the child process to a pipe | |
13997 | which the parent process holds. This may later form the input to one | |
13998 | of the `Child.*` functions and/or the `Param.child` function. | |
13999 | ||
14000 | * `Param.self` | |
14001 | + | |
14002 | This will connect the input/output of the child process to the | |
14003 | corresponding stream of the parent process. | |
14004 | ||
14005 | === Process === | |
14006 | ||
14007 | * `type ('stdin, 'stdout, 'stderr) t` | |
14008 | + | |
14009 | represents a handle to a child process. The type arguments capture | |
14010 | how the named stream of the child process may be used. | |
14011 | ||
14012 | * `type any` | |
14013 | + | |
14014 | bypasses the type system in situations where an application does not | |
14015 | want it to enforce correct usage. See `Child.remember` and | |
14016 | `Param.forget`. | |
14017 | ||
14018 | * `type chain` | |
14019 | + | |
14020 | means that the child process's stream was connected via a pipe to the | |
14021 | parent process. The parent process may pass this pipe in turn to | |
14022 | another child, thus chaining them together. | |
14023 | ||
14024 | * `type input, output` | |
14025 | + | |
14026 | record the direction in which a stream flows. They are used as part of | |
14027 | `Param.t` and `Child.t` and are detailed there. | |
14028 | ||
14029 | * `type none` | |
14030 | + | |
14031 | means that the child process's stream may not be used by the parent | |
14032 | process. This happens when the child process is connected directly to | |
14033 | some source. | |
14034 | + | |
14035 | The types `BinIO.instream`, `BinIO.outstream`, `TextIO.instream`, | |
14036 | `TextIO.outstream`, and `Posix.FileSys.file_desc` are also valid types | |
14037 | with which to instantiate child streams. | |
14038 | ||
14039 | * `exception MisuseOfForget` | |
14040 | + | |
14041 | may be raised if `Child.remember` and `Param.forget` are used to | |
14042 | bypass the normal type checking. This exception will only be raised | |
14043 | in cases where the `forget` mechanism allows a misuse that would be | |
14044 | impossible with the type-safe versions. | |
14045 | ||
14046 | * `exception DoublyRedirected` | |
14047 | + | |
14048 | raised if a stream connected to a child process is redirected to two | |
14049 | separate child processes. It is safe, though bad style, to use a | |
14050 | `Child.t` with the same `Child.*` function repeatedly. | |
14051 | ||
14052 | * `create {args, path, env, stderr, stdin, stdout}` | |
14053 | + | |
14054 | starts a child process with the given command-line `args` (excluding | |
14055 | the program name). `path` should be an absolute path to the executable | |
14056 | run in the new child process; relative paths work, but are less | |
14057 | robust. Optionally, the environment may be overridden with `env` | |
14058 | where each string element has the form `"key=value"`. The `std*` | |
14059 | options must be provided by the `Param.*` functions documented above. | |
14060 | + | |
14061 | Processes which are `create`-d must be either `reap`-ed or `kill`-ed. | |
14062 | ||
14063 | * `getStd{in,out,err} proc` | |
14064 | + | |
14065 | gets a handle to the specified stream. These should be used by the | |
14066 | `Child.*` functions. Failure to use a stream connected via pipe to a | |
14067 | child process may result in runtime deadlock and elicits a compiler | |
14068 | warning. | |
14069 | ||
14070 | * `kill (proc, sig)` | |
14071 | + | |
14072 | terminates the child process immediately. The signal may or may not | |
14073 | mean anything depending on the host platform. A good value is | |
14074 | `Posix.Signal.term`. | |
14075 | ||
14076 | * `reap proc` | |
14077 | + | |
14078 | waits for the child process to terminate and returns its exit status. | |
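
The pieces above might fit together as in the following sketch for a Unix-like host. The path `/bin/echo` is an assumption about the host system. The pattern is: create the child with a pipe on `stdout`, bind that pipe to a text stream, read it to the end, and then reap the child.

[source,sml]
----
val () =
   let
      open MLton.Process

      (* args excludes the program name; /bin/echo is assumed to exist. *)
      val p = create {args = ["hello"],
                      env = NONE,
                      path = "/bin/echo",
                      stderr = Param.self,
                      stdin = Param.null,
                      stdout = Param.pipe}

      (* Bind the child's stdout pipe to a text input stream. *)
      val ins = Child.textIn (getStdout p)
      val text = TextIO.inputAll ins
      val () = TextIO.closeIn ins

      (* Every created process must be reaped or killed. *)
      val status = reap p
   in
      print ("child wrote: " ^ text);
      case status of
         Posix.Process.W_EXITED => print "child exited normally\n"
       | _ => print "child did not exit normally\n"
   end
----

Reading the pipe to the end before calling `reap` avoids leaving the child blocked on a full pipe.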
14079 | ||
14080 | ||
14081 | == Important usage notes == | |
14082 | ||
14083 | When building an application with many pipes between child processes, | |
14084 | it is important to ensure that there are no cycles in the undirected | |
14085 | pipe graph. If this property is not maintained, deadlock is a | |
14086 | serious potential bug that may appear only under | |
14087 | difficult-to-reproduce conditions. | |
14088 | ||
14089 | The danger lies in that most operating systems implement pipes with a | |
14090 | fixed buffer size. If process A has two output pipes which process B | |
14091 | reads, it can happen that process A blocks writing to pipe 2 because | |
14092 | it is full while process B blocks reading from pipe 1 because it is | |
14093 | empty. This same situation can happen with any undirected cycle formed | |
14094 | between processes (vertexes) and pipes (undirected edges) in the | |
14095 | graph. | |
14096 | ||
14097 | It is possible to make this safe using low-level I/O primitives for | |
14098 | polling. However, these primitives are not very portable and are | |
14099 | difficult to use properly. A far better approach is to make sure you | |
14100 | never create a cycle in the first place. | |
14101 | ||
14102 | For these reasons, `Unix.executeInEnv` is a very dangerous | |
14103 | function. Be careful when using it to ensure that the child process | |
14104 | only operates on either `stdin` or `stdout`, but not both. | |
14105 | ||
14106 | ||
14107 | == Example use of MLton.Process.create == | |
14108 | ||
14109 | The following example program launches the `ipconfig` utility, pipes | |
14110 | its output through `grep`, and then reads the result back into the | |
14111 | program. | |
14112 | ||
14113 | [source,sml] | |
14114 | ---- | |
14115 | open MLton.Process | |
14116 | val p = | |
14117 | create {args = [ "/all" ], | |
14118 | env = NONE, | |
14119 | path = "C:\\WINDOWS\\system32\\ipconfig.exe", | |
14120 | stderr = Param.self, | |
14121 | stdin = Param.null, | |
14122 | stdout = Param.pipe} | |
14123 | val q = | |
14124 | create {args = [ "IP-Ad" ], | |
14125 | env = NONE, | |
14126 | path = "C:\\msys\\bin\\grep.exe", | |
14127 | stderr = Param.self, | |
14128 | stdin = Param.child (getStdout p), | |
14129 | stdout = Param.pipe} | |
14130 | fun suck h = | |
14131 | case TextIO.inputLine h of | |
14132 | NONE => () | |
14133 | | SOME s => (print ("'" ^ s ^ "'\n"); suck h) | |
14134 | ||
14135 | val () = suck (Child.textIn (getStdout q)) | |
14136 | ---- | |
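
The example above never collects its children, although every `create`-d process must be `reap`-ed or `kill`-ed. The following is a minimal sketch of the full lifecycle. It assumes a POSIX host where `/bin/echo` exists; that path and the `describe` helper are illustrative and not part of `MLton.Process`.

[source,sml]
----
open MLton.Process

(* Hypothetical helper: render an exit status for display. *)
fun describe status =
   case status of
      Posix.Process.W_EXITED => "exited normally"
    | Posix.Process.W_EXITSTATUS w =>
         "exited with status 0x" ^ Word8.toString w
    | Posix.Process.W_SIGNALED _ => "was terminated by a signal"
    | Posix.Process.W_STOPPED _ => "was stopped by a signal"

(* Assumption: /bin/echo exists on the host. *)
val p =
   create {args = ["hello"],
           env = NONE,
           path = "/bin/echo",
           stderr = Param.self,
           stdin = Param.null,
           stdout = Param.pipe}

(* Drain the pipe first so the child cannot block on a full buffer. *)
val output = TextIO.inputAll (Child.textIn (getStdout p))
val () = print output

(* Every created process must be reaped (or killed). *)
val () = print ("child " ^ describe (reap p) ^ "\n")
----

Reading the pipe to end-of-file before calling `reap` keeps the parent from waiting on a child that is itself blocked writing to a full pipe.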
14137 | ||
14138 | <<< | |
14139 | ||
14140 | :mlton-guide-page: MLtonProfile | |
14141 | [[MLtonProfile]] | |
14142 | MLtonProfile | |
14143 | ============ | |
14144 | ||
14145 | [source,sml] | |
14146 | ---- | |
14147 | signature MLTON_PROFILE = | |
14148 | sig | |
14149 | structure Data: | |
14150 | sig | |
14151 | type t | |
14152 | ||
14153 | val equals: t * t -> bool | |
14154 | val free: t -> unit | |
14155 | val malloc: unit -> t | |
14156 | val write: t * string -> unit | |
14157 | end | |
14158 | ||
14159 | val isOn: bool | |
14160 | val withData: Data.t * (unit -> 'a) -> 'a | |
14161 | end | |
14162 | ---- | |
14163 | ||
14164 | `MLton.Profile` provides <:Profiling:> control from within the | |
14165 | program, allowing you to profile individual portions of your | |
14166 | program. With `MLton.Profile`, you can create many units of profiling | |
14167 | data (essentially, mappings from functions to counts) during a run of | |
14168 | a program, switch between them while the program is running, and | |
14169 | output multiple `mlmon.out` files. | |
14170 | ||
14171 | * `isOn` | |
14172 | + | |
14173 | a compile-time constant that is false only when compiling `-profile no`. | |
14174 | ||
14175 | * `type Data.t` | |
14176 | + | |
14177 | the type of a unit of profiling data. In order to most efficiently | |
14178 | execute non-profiled programs, when compiling `-profile no` (the | |
14179 | default), `Data.t` is equivalent to `unit ref`. | |
14180 | ||
14181 | * `Data.equals (x, y)` | |
14182 | + | |
14183 | returns true if `x` and `y` are the same unit of profiling data. | |
14184 | ||
14185 | * `Data.free x` | |
14186 | + | |
14187 | frees the memory associated with the unit of profiling data `x`. It | |
14188 | is an error to free the current unit of profiling data or to free a | |
14189 | previously freed unit of profiling data. When compiling | |
14190 | `-profile no`, `Data.free x` is a no-op. | |
14191 | ||
14192 | * `Data.malloc ()` | |
14193 | + | |
14194 | returns a new unit of profiling data. Each unit of profiling data is | |
14195 | allocated from the process address space (but is _not_ in the MLton | |
14196 | heap) and consumes memory proportional to the number of source | |
14197 | functions. When compiling `-profile no`, `Data.malloc ()` is | |
14198 | equivalent to allocating a new `unit ref`. | |
14199 | ||
14200 | * `Data.write (x, f)` | |
14201 | + | |
14202 | writes the accumulated ticks in the unit of profiling data `x` to file | |
14203 | `f`. It is an error to write a previously freed unit of profiling | |
14204 | data. When compiling `-profile no`, `Data.write (x, f)` is a no-op. A | |
14205 | profiled program will always write the current unit of profiling data | |
14206 | at program exit to a file named `mlmon.out`. | |
14207 | ||
14208 | * `withData (d, f)` | |
14209 | + | |
14210 | runs `f` with `d` as the unit of profiling data, and returns the | |
14211 | result of `f` after restoring the current unit of profiling data. | |
14212 | When compiling `-profile no`, `withData (d, f)` is equivalent to | |
14213 | `f ()`. | |
14214 | ||
14215 | ||
14216 | == Example == | |
14217 | ||
14218 | Here is an example, taken from the `examples/profiling` directory, | |
14219 | showing how to profile the executions of the `fib` and `tak` functions | |
14220 | separately. Suppose that `fib-tak.sml` contains the following. | |
14221 | [source,sml] | |
14222 | ---- | |
14223 | structure Profile = MLton.Profile | |
14224 | ||
14225 | val fibData = Profile.Data.malloc () | |
14226 | val takData = Profile.Data.malloc () | |
14227 | ||
14228 | fun wrap (f, d) x = | |
14229 | Profile.withData (d, fn () => f x) | |
14230 | ||
14231 | val rec fib = | |
14232 | fn 0 => 0 | |
14233 | | 1 => 1 | |
14234 | | n => fib (n - 1) + fib (n - 2) | |
14235 | val fib = wrap (fib, fibData) | |
14236 | ||
14237 | fun tak (x,y,z) = | |
14238 | if not (y < x) | |
14239 | then z | |
14240 | else tak (tak (x - 1, y, z), | |
14241 | tak (y - 1, z, x), | |
14242 | tak (z - 1, x, y)) | |
14243 | val tak = wrap (tak, takData) | |
14244 | ||
14245 | val rec f = | |
14246 | fn 0 => () | |
14247 | | n => (fib 38; f (n-1)) | |
14248 | val _ = f 2 | |
14249 | ||
14250 | val rec g = | |
14251 | fn 0 => () | |
14252 | | n => (tak (18,12,6); g (n-1)) | |
14253 | val _ = g 500 | |
14254 | ||
14255 | fun done (data, file) = | |
14256 | (Profile.Data.write (data, file) | |
14257 | ; Profile.Data.free data) | |
14258 | ||
14259 | val _ = done (fibData, "mlmon.fib.out") | |
14260 | val _ = done (takData, "mlmon.tak.out") | |
14261 | ---- | |
14262 | ||
14263 | Compile and run the program. | |
14264 | ---- | |
14265 | % mlton -profile time fib-tak.sml | |
14266 | % ./fib-tak | |
14267 | ---- | |
14268 | ||
14269 | Separately display the profiling data for `fib` | |
14270 | ---- | |
14271 | % mlprof fib-tak mlmon.fib.out | |
14272 | 5.77 seconds of CPU time (0.00 seconds GC) | |
14273 | function cur | |
14274 | --------- ----- | |
14275 | fib 96.9% | |
14276 | <unknown> 3.1% | |
14277 | ---- | |
14278 | and for `tak` | |
14279 | ---- | |
14280 | % mlprof fib-tak mlmon.tak.out | |
14281 | 0.68 seconds of CPU time (0.00 seconds GC) | |
14282 | function cur | |
14283 | -------- ------ | |
14284 | tak 100.0% | |
14285 | ---- | |
14286 | ||
14287 | Combine the data for `fib` and `tak` by calling `mlprof` | |
14288 | with multiple `mlmon.out` files. | |
14289 | ---- | |
14290 | % mlprof fib-tak mlmon.fib.out mlmon.tak.out mlmon.out | |
14291 | 6.45 seconds of CPU time (0.00 seconds GC) | |
14292 | function cur | |
14293 | --------- ----- | |
14294 | fib 86.7% | |
14295 | tak 10.5% | |
14296 | <unknown> 2.8% | |
14297 | ---- | |
14298 | ||
14299 | <<< | |
14300 | ||
14301 | :mlton-guide-page: MLtonRandom | |
14302 | [[MLtonRandom]] | |
14303 | MLtonRandom | |
14304 | =========== | |
14305 | ||
14306 | [source,sml] | |
14307 | ---- | |
14308 | signature MLTON_RANDOM = | |
14309 | sig | |
14310 | val alphaNumChar: unit -> char | |
14311 | val alphaNumString: int -> string | |
14312 | val rand: unit -> word | |
14313 | val seed: unit -> word option | |
14314 | val srand: word -> unit | |
14315 | val useed: unit -> word option | |
14316 | end | |
14317 | ---- | |
14318 | ||
14319 | * `alphaNumChar ()` | |
14320 | + | |
14321 | returns a random alphanumeric character. | |
14322 | ||
14323 | * `alphaNumString n` | |
14324 | + | |
14325 | returns a string of length `n` of random alphanumeric characters. | |
14326 | ||
14327 | * `rand ()` | |
14328 | + | |
14329 | returns the next pseudo-random number. | |
14330 | ||
14331 | * `seed ()` | |
14332 | + | |
14333 | returns a random word from `/dev/random`. Useful as an argument to | |
14334 | `srand`. If `/dev/random` cannot be read, `seed ()` returns | |
14335 | `NONE`. A call to `seed` may block until enough random bits are | |
14336 | available. | |
14337 | ||
14338 | * `srand w` | |
14339 | + | |
14340 | sets the seed used by `rand` to `w`. | |
14341 | ||
14342 | * `useed ()` | |
14343 | + | |
14344 | returns a random word from `/dev/urandom`. Useful as an argument to | |
14345 | `srand`. If `/dev/urandom` cannot be read, `useed ()` returns | |
14346 | `NONE`. A call to `useed` will never block -- it will instead return | |
14347 | lower quality random bits. | |
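
A minimal sketch of seeding the generator, assuming a fixed fallback seed is acceptable when `/dev/urandom` cannot be read. The fallback value `0w13` and the name `token` are illustrative.

[source,sml]
----
structure Random = MLton.Random

(* Prefer the non-blocking source; fall back to an arbitrary fixed seed. *)
val () =
   case Random.useed () of
      SOME w => Random.srand w
    | NONE => Random.srand 0w13

(* A string of 16 random alphanumeric characters. *)
val token = Random.alphaNumString 16
val () = print (token ^ "\n")
----

Using `useed` rather than `seed` avoids blocking at startup, at the cost of possibly lower quality random bits.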
14348 | ||
14349 | <<< | |
14350 | ||
14351 | :mlton-guide-page: MLtonReal | |
14352 | [[MLtonReal]] | |
14353 | MLtonReal | |
14354 | ========= | |
14355 | ||
14356 | [source,sml] | |
14357 | ---- | |
14358 | signature MLTON_REAL = | |
14359 | sig | |
14360 | type t | |
14361 | ||
14362 | val fromWord: word -> t | |
14363 | val fromLargeWord: LargeWord.word -> t | |
14364 | val toWord: IEEEReal.rounding_mode -> t -> word | |
14365 | val toLargeWord: IEEEReal.rounding_mode -> t -> LargeWord.word | |
14366 | end | |
14367 | ---- | |
14368 | ||
14369 | * `type t` | |
14370 | + | |
14371 | the type of reals. For `MLton.LargeReal` this is `LargeReal.real`, | |
14372 | for `MLton.Real` this is `Real.real`, for `MLton.Real32` this is | |
14373 | `Real32.real`, for `MLton.Real64` this is `Real64.real`. | |
14374 | ||
14375 | * `fromWord w` | |
14376 | * `fromLargeWord w` | |
14377 | + | |
14378 | convert the word `w` to a real value. If the value of `w` is larger | |
14379 | than (the appropriate) `REAL.maxFinite`, then infinity is returned. | |
14380 | If `w` cannot be exactly represented as a real value, then the current | |
14381 | rounding mode is used to determine the resulting value. | |
14382 | ||
14383 | * `toWord mode r` | |
14384 | * `toLargeWord mode r` | |
14385 | + | |
14386 | convert the argument `r` to a word type using the specified rounding | |
14387 | mode. They raise `Overflow` if the result is not representable, in | |
14388 | particular, if `r` is an infinity. They raise `Domain` if `r` is NaN. | |
14389 | ||
14390 | * `MLton.Real32.castFromWord w` | |
14391 | * `MLton.Real64.castFromWord w` | |
14392 | + | |
14393 | convert the argument `w` to a real type as a bit-wise cast. | |
14394 | ||
14395 | * `MLton.Real32.castToWord r` | |
14396 | * `MLton.Real64.castToWord r` | |
14397 | + | |
14398 | convert the argument `r` to a word type as a bit-wise cast. | |
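
The difference between the value conversions and the bit-wise casts can be sketched as follows. This assumes the signatures shown in <:MLtonStructure:>; the value names and the `toWordOpt` helper are illustrative.

[source,sml]
----
(* Value conversion: the word 42 becomes the real 42.0. *)
val r = MLton.Real.fromWord 0w42

(* Convert back with an explicit rounding mode; 2.75 rounds toward zero. *)
val w = MLton.Real.toWord IEEEReal.TO_ZERO 2.75

(* Hypothetical helper: turn the documented exceptions into an option. *)
fun toWordOpt x =
   SOME (MLton.Real.toWord IEEEReal.TO_NEAREST x)
   handle Overflow => NONE
        | Domain => NONE

(* Bit-wise cast: the IEEE 754 bit pattern of 1.0, not the number 1. *)
val bits = MLton.Real64.castToWord (1.0 : Real64.real)
val one = MLton.Real64.castFromWord bits
val () = print (Word64.toString bits ^ "\n")  (* prints 3FF0000000000000 *)
----

`toWordOpt Real.posInf` evaluates to `NONE` because `toWord` raises `Overflow` for an infinity.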
14399 | ||
14400 | <<< | |
14401 | ||
14402 | :mlton-guide-page: MLtonRlimit | |
14403 | [[MLtonRlimit]] | |
14404 | MLtonRlimit | |
14405 | =========== | |
14406 | ||
14407 | [source,sml] | |
14408 | ---- | |
14409 | signature MLTON_RLIMIT = | |
14410 | sig | |
14411 | structure RLim : sig | |
14412 | type t | |
14413 | val castFromSysWord: SysWord.word -> t | |
14414 | val castToSysWord: t -> SysWord.word | |
14415 | end | |
14416 | ||
14417 | val infinity: RLim.t | |
14418 | ||
14419 | type t | |
14420 | ||
14421 | val coreFileSize: t (* CORE max core file size *) | |
14422 | val cpuTime: t (* CPU CPU time in seconds *) | |
14423 | val dataSize: t (* DATA max data size *) | |
14424 | val fileSize: t (* FSIZE Maximum filesize *) | |
14425 | val numFiles: t (* NOFILE max number of open files *) | |
14426 | val lockedInMemorySize: t (* MEMLOCK max locked address space *) | |
14427 | val numProcesses: t (* NPROC max number of processes *) | |
14428 | val residentSetSize: t (* RSS max resident set size *) | |
14429 | val stackSize: t (* STACK max stack size *) | |
14430 | val virtualMemorySize: t (* AS virtual memory limit *) | |
14431 | ||
14432 | val get: t -> {hard: RLim.t, soft: RLim.t} | |
14433 | val set: t * {hard: RLim.t, soft: RLim.t} -> unit | |
14434 | end | |
14435 | ---- | |
14436 | ||
14437 | `MLton.Rlimit` provides a wrapper around the C `getrlimit` and | |
14438 | `setrlimit` functions. | |
14439 | ||
14440 | * `type RLim.t` | |
14441 | + | |
14442 | the type of resource limits. | |
14443 | ||
14444 | * `infinity` | |
14445 | + | |
14446 | indicates that a resource is unlimited. | |
14447 | ||
14448 | * `type t` | |
14449 | + | |
14450 | the types of resources that can be inspected and modified. | |
14451 | ||
14452 | * `get r` | |
14453 | + | |
14454 | returns the current hard and soft limits for resource `r`. May raise | |
14455 | `OS.SysErr`. | |
14456 | ||
14457 | * `set (r, {hard, soft})` | |
14458 | + | |
14459 | sets the hard and soft limits for resource `r`. May raise | |
14460 | `OS.SysErr`. | |
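
As a sketch of typical use, the following prints the limits on open files and then raises the soft limit to the hard limit. It assumes the platform permits the change; the `show` helper is illustrative.

[source,sml]
----
structure Rlimit = MLton.Rlimit

(* Hypothetical helper: print a limit, naming the unlimited case. *)
fun show (label, lim) =
   let
      val w = Rlimit.RLim.castToSysWord lim
      val inf = Rlimit.RLim.castToSysWord Rlimit.infinity
      val s = if w = inf then "unlimited" else SysWord.fmt StringCvt.DEC w
   in
      print (label ^ ": " ^ s ^ "\n")
   end

val {hard, soft} = Rlimit.get Rlimit.numFiles
val () = show ("soft", soft)
val () = show ("hard", hard)

(* Raise the soft limit to the hard limit; set may raise OS.SysErr. *)
val () =
   Rlimit.set (Rlimit.numFiles, {hard = hard, soft = hard})
   handle OS.SysErr (msg, _) => print ("setrlimit failed: " ^ msg ^ "\n")
----

Limits are compared through `RLim.castToSysWord` because the signature does not declare `RLim.t` as an equality type.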
14461 | ||
14462 | <<< | |
14463 | ||
14464 | :mlton-guide-page: MLtonRusage | |
14465 | [[MLtonRusage]] | |
14466 | MLtonRusage | |
14467 | =========== | |
14468 | ||
14469 | [source,sml] | |
14470 | ---- | |
14471 | signature MLTON_RUSAGE = | |
14472 | sig | |
14473 | type t = {utime: Time.time, (* user time *) | |
14474 | stime: Time.time} (* system time *) | |
14475 | ||
14476 | val measureGC: bool -> unit | |
14477 | val rusage: unit -> {children: t, gc: t, self: t} | |
14478 | end | |
14479 | ---- | |
14480 | ||
14481 | * `type t` | |
14482 | + | |
14483 | corresponds to a subset of the C `struct rusage`. | |
14484 | ||
14485 | * `measureGC b` | |
14486 | + | |
14487 | controls whether garbage collection time is separately measured during | |
14488 | program execution. This affects the behavior of both `rusage` and | |
14489 | `Timer.checkCPUTimes`, both of which will return gc times of zero with | |
14490 | `measureGC false`. Garbage collection time is always measured when | |
14491 | either `gc-messages` or `gc-summary` is given as a | |
14492 | <:RunTimeOptions:runtime system option>. | |
14493 | ||
14494 | * `rusage ()` | |
14495 | + | |
14496 | corresponds to the C `getrusage` function. It returns the resource | |
14497 | usage of the exited children, the garbage collector, and the process | |
14498 | itself. The `self` component includes the usage of the `gc` | |
14499 | component, regardless of whether `measureGC` is `true` or `false`. If | |
14500 | `rusage` is used in a program, either directly, or indirectly via the | |
14501 | `Timer` structure, then `measureGC true` is automatically called at | |
14502 | the start of the program (it can still be disabled by user code later). | |
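
A minimal sketch of measuring a computation with `rusage`. The workload and the `show` helper are illustrative.

[source,sml]
----
structure Rusage = MLton.Rusage

(* Ask the runtime to account for garbage collection separately. *)
val () = Rusage.measureGC true

(* Some allocation-heavy work to measure (illustrative). *)
val total = List.foldl (op +) 0 (List.tabulate (1000000, fn i => i mod 7))

(* Hypothetical helper: print one component of the result. *)
fun show (label, {utime, stime}: Rusage.t) =
   print (concat [label, ": user ", Time.toString utime,
                  "s, system ", Time.toString stime, "s\n"])

val {self, gc, children = _} = Rusage.rusage ()
val () = show ("self", self)
val () = show ("gc", gc)
val () = print ("total = " ^ Int.toString total ^ "\n")
----

Because `self` includes the `gc` component, the time spent outside the collector is the difference between the two.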
14503 | ||
14504 | <<< | |
14505 | ||
14506 | :mlton-guide-page: MLtonSignal | |
14507 | [[MLtonSignal]] | |
14508 | MLtonSignal | |
14509 | =========== | |
14510 | ||
14511 | [source,sml] | |
14512 | ---- | |
14513 | signature MLTON_SIGNAL = | |
14514 | sig | |
14515 | type t = Posix.Signal.signal | |
14516 | type signal = t | |
14517 | ||
14518 | structure Handler: | |
14519 | sig | |
14520 | type t | |
14521 | ||
14522 | val default: t | |
14523 | val handler: (Thread.Runnable.t -> Thread.Runnable.t) -> t | |
14524 | val ignore: t | |
14525 | val isDefault: t -> bool | |
14526 | val isIgnore: t -> bool | |
14527 | val simple: (unit -> unit) -> t | |
14528 | end | |
14529 | ||
14530 | structure Mask: | |
14531 | sig | |
14532 | type t | |
14533 | ||
14534 | val all: t | |
14535 | val allBut: signal list -> t | |
14536 | val block: t -> unit | |
14537 | val getBlocked: unit -> t | |
14538 | val isMember: t * signal -> bool | |
14539 | val none: t | |
14540 | val setBlocked: t -> unit | |
14541 | val some: signal list -> t | |
14542 | val unblock: t -> unit | |
14543 | end | |
14544 | ||
14545 | val getHandler: t -> Handler.t | |
14546 | val handled: unit -> Mask.t | |
14547 | val prof: t | |
14548 | val restart: bool ref | |
14549 | val setHandler: t * Handler.t -> unit | |
14550 | val suspend: Mask.t -> unit | |
14551 | val vtalrm: t | |
14552 | end | |
14553 | ---- | |
14554 | ||
14555 | Signal handlers are functions from (runnable) threads to (runnable) | |
14556 | threads. When a signal arrives, the corresponding signal handler is | |
14557 | invoked, its argument being the thread that was interrupted by the | |
14558 | signal. The signal handler runs asynchronously, in its own thread. | |
14559 | The signal handler returns the thread that it would like to resume | |
14560 | execution (this is often the thread that it was passed). It is an | |
14561 | error for a signal handler to raise an exception that is not handled | |
14562 | within the signal handler itself. | |
14563 | ||
14564 | A signal handler is never invoked while the running thread is in a | |
14565 | critical section (see <:MLtonThread:>). Invoking a signal handler | |
14566 | implicitly enters a critical section and the normal return of a signal | |
14567 | handler implicitly exits the critical section; hence, a signal handler | |
14568 | is never interrupted by another signal handler. | |
14569 | ||
14570 | * `type t` | |
14571 | + | |
14572 | the type of signals. | |
14573 | ||
14574 | * `type Handler.t` | |
14575 | + | |
14576 | the type of signal handlers. | |
14577 | ||
14578 | * `Handler.default` | |
14579 | + | |
14580 | handles the signal with the default action. | |
14581 | ||
14582 | * `Handler.handler f` | |
14583 | + | |
14584 | returns a handler `h` such that when a signal `s` is handled by `h`, | |
14585 | `f` will be passed the thread that was interrupted by `s` and should | |
14586 | return the thread that will resume execution. | |
14587 | ||
14588 | * `Handler.ignore` | |
14589 | + | |
14590 | is a handler that will ignore the signal. | |
14591 | ||
14592 | * `Handler.isDefault` | |
14593 | + | |
14594 | returns true if the handler is the default handler. | |
14595 | ||
14596 | * `Handler.isIgnore` | |
14597 | + | |
14598 | returns true if the handler is the ignore handler. | |
14599 | ||
14600 | * `Handler.simple f` | |
14601 | + | |
14602 | returns a handler that executes `f ()` and does not switch threads. | |
14603 | ||
14604 | * `type Mask.t` | |
14605 | + | |
14606 | the type of signal masks, which are sets of blocked signals. | |
14607 | ||
14608 | * `Mask.all` | |
14609 | + | |
14610 | a mask of all signals. | |
14611 | ||
14612 | * `Mask.allBut l` | |
14613 | + | |
14614 | a mask of all signals except for those in `l`. | |
14615 | ||
14616 | * `Mask.block m` | |
14617 | + | |
14618 | blocks all signals in `m`. | |
14619 | ||
14620 | * `Mask.getBlocked ()` | |
14621 | + | |
14622 | gets the signal mask `m`, i.e. a signal is blocked if and only if it | |
14623 | is in `m`. | |
14624 | ||
14625 | * `Mask.isMember (m, s)` | |
14626 | + | |
14627 | returns true if the signal `s` is in `m`. | |
14628 | ||
14629 | * `Mask.none` | |
14630 | + | |
14631 | a mask of no signals. | |
14632 | ||
14633 | * `Mask.setBlocked m` | |
14634 | + | |
14635 | sets the signal mask to `m`, i.e. a signal is blocked if and only if | |
14636 | it is in `m`. | |
14637 | ||
14638 | * `Mask.some l` | |
14639 | + | |
14640 | a mask of the signals in `l`. | |
14641 | ||
14642 | * `Mask.unblock m` | |
14643 | + | |
14644 | unblocks all signals in `m`. | |
14645 | ||
14646 | * `getHandler s` | |
14647 | + | |
14648 | returns the current handler for signal `s`. | |
14649 | ||
14650 | * `handled ()` | |
14651 | + | |
14652 | returns the signal mask `m` corresponding to the currently handled | |
14653 | signals; i.e., a signal is handled if and only if it is in `m`. | |
14654 | ||
14655 | * `prof` | |
14656 | + | |
14657 | `SIGPROF`, the profiling signal. | |
14658 | ||
14659 | * `restart` | |
14660 | + | |
14661 | dynamically determines the behavior of interrupted system calls; when | |
14662 | `true`, interrupted system calls are restarted; when `false`, | |
14663 | interrupted system calls raise `OS.SysErr`. | |
14664 | ||
14665 | * `setHandler (s, h)` | |
14666 | + | |
14667 | sets the handler for signal `s` to `h`. | |
14668 | ||
14669 | * `suspend m` | |
14670 | + | |
14671 | temporarily sets the signal mask to `m` and suspends until an unmasked | |
14672 | signal is received and handled, at which point `suspend` resets the | |
14673 | mask and returns. | |
14674 | ||
14675 | * `vtalrm` | |
14676 | + | |
14677 | `SIGVTALRM`, the signal for virtual timers. | |
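
The handler and mask operations combine as in the following sketch, which sends `SIGUSR1` to the running process and waits for its handler. It assumes a POSIX host; the names `usr1` and `received` are illustrative.

[source,sml]
----
structure Signal = MLton.Signal

val usr1 = Posix.Signal.usr1
val received = ref false

(* Install a handler that only records that the signal arrived. *)
val () =
   Signal.setHandler
   (usr1, Signal.Handler.simple (fn () => received := true))

(* Block SIGUSR1 so that it stays pending when it is sent below. *)
val () = Signal.Mask.block (Signal.Mask.some [usr1])
val () =
   Posix.Process.kill
   (Posix.Process.K_PROC (Posix.ProcEnv.getpid ()), usr1)

(* Wait with every signal masked except SIGUSR1; suspend returns
 * once the pending signal has been received and handled. *)
val () = Signal.suspend (Signal.Mask.allBut [usr1])

(* suspend reset the mask, so SIGUSR1 is blocked again; unblock it. *)
val () = Signal.Mask.unblock (Signal.Mask.some [usr1])

val () = print ("handler ran: " ^ Bool.toString (!received) ^ "\n")
----

Blocking the signal before sending it avoids the race in which the handler runs before `suspend` is called and the program then waits for a second signal that never arrives.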
14678 | ||
14679 | ||
14680 | == Interruptible System Calls == | |
14681 | ||
14682 | Signal handling interacts in a non-trivial way with those functions in | |
14683 | the <:BasisLibrary:Basis Library> that correspond directly to | |
14684 | interruptible system calls (a subset of those functions that may raise | |
14685 | `OS.SysErr`). The desire is that these functions should have | |
14686 | predictable semantics. The principal concerns are: | |
14687 | ||
14688 | 1. System calls that are interrupted by signals should, by default, be | |
14689 | restarted; the alternative is to raise | |
14690 | + | |
14691 | [source,sml] | |
14692 | ---- | |
14693 | OS.SysErr (Posix.Error.errorMsg Posix.Error.intr, | |
14694 | SOME Posix.Error.intr) | |
14695 | ---- | |
14696 | + | |
14697 | This behavior is determined dynamically by the value of `Signal.restart`. | |
14698 | ||
14699 | 2. Signal handlers should always get a chance to run (when outside a | |
14700 | critical region). If a system call is interrupted by a signal, then | |
14701 | the signal handler will run before the call is restarted or | |
14702 | `OS.SysErr` is raised; that is, before the `Signal.restart` check. | |
14703 | ||
14704 | 3. A system call that must be restarted while in a critical section | |
14705 | will be restarted with the handled signals blocked (and the previously | |
14706 | blocked signals remembered). This encourages the system call to | |
14707 | complete, allowing the program to make progress towards leaving the | |
14708 | critical section where the signal can be handled. If the system call | |
14709 | completes, the set of blocked signals is restored to those previously | |
14710 | blocked. | |
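
A program that prefers to see interruptions can turn off automatic restarts and handle the `EINTR` error itself. The following is a minimal sketch using the Basis Library exception `OS.SysErr`; the function names are illustrative.

[source,sml]
----
(* Opt out of automatic restarts: interrupted calls now raise OS.SysErr. *)
val () = MLton.Signal.restart := false

(* Read up to 1024 bytes from stdin; NONE means the call was interrupted. *)
fun readChunk () =
   SOME (Posix.IO.readVec (Posix.FileSys.stdin, 1024))
   handle e as OS.SysErr (_, SOME err) =>
      if err = Posix.Error.intr then NONE else raise e

(* Retry by hand, which is what restart = true does automatically. *)
fun readRetrying () =
   case readChunk () of
      NONE => readRetrying ()
    | SOME bytes => bytes
----

By the time `readChunk` returns `NONE`, the signal handler has already run, so the caller can inspect any state the handler updated before deciding whether to retry.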
14711 | ||
14712 | <<< | |
14713 | ||
14714 | :mlton-guide-page: MLtonStructure | |
14715 | [[MLtonStructure]] | |
14716 | MLtonStructure | |
14717 | ============== | |
14718 | ||
14719 | The `MLton` structure contains a lot of functionality that is not | |
14720 | available in the <:BasisLibrary:Basis Library>. As a warning, | |
14721 | please keep in mind that the `MLton` structure and its | |
14722 | substructures do change from release to release of MLton. | |
14723 | ||
14724 | [source,sml] | |
14725 | ---- | |
14726 | structure MLton: | |
14727 | sig | |
14728 | val eq: 'a * 'a -> bool | |
14729 | val equal: 'a * 'a -> bool | |
14730 | val hash: 'a -> Word32.word | |
14731 | val isMLton: bool | |
14732 | val share: 'a -> unit | |
14733 | val shareAll: unit -> unit | |
14734 | val size: 'a -> int | |
14735 | ||
14736 | structure Array: MLTON_ARRAY | |
14737 | structure BinIO: MLTON_BIN_IO | |
14738 | structure CharArray: MLTON_MONO_ARRAY where type t = CharArray.array | |
14739 | where type elem = CharArray.elem | |
14740 | structure CharVector: MLTON_MONO_VECTOR where type t = CharVector.vector | |
14741 | where type elem = CharVector.elem | |
14742 | structure Cont: MLTON_CONT | |
14743 | structure Exn: MLTON_EXN | |
14744 | structure Finalizable: MLTON_FINALIZABLE | |
14745 | structure GC: MLTON_GC | |
14746 | structure IntInf: MLTON_INT_INF | |
14747 | structure Itimer: MLTON_ITIMER | |
14748 | structure LargeReal: MLTON_REAL where type t = LargeReal.real | |
14749 | structure LargeWord: MLTON_WORD where type t = LargeWord.word | |
14750 | structure Platform: MLTON_PLATFORM | |
14751 | structure Pointer: MLTON_POINTER | |
14752 | structure ProcEnv: MLTON_PROC_ENV | |
14753 | structure Process: MLTON_PROCESS | |
14754 | structure Profile: MLTON_PROFILE | |
14755 | structure Random: MLTON_RANDOM | |
14756 | structure Real: MLTON_REAL where type t = Real.real | |
14757 | structure Real32: sig | |
14758 | include MLTON_REAL | |
14759 | val castFromWord: Word32.word -> t | |
14760 | val castToWord: t -> Word32.word | |
14761 | end where type t = Real32.real | |
14762 | structure Real64: sig | |
14763 | include MLTON_REAL | |
14764 | val castFromWord: Word64.word -> t | |
14765 | val castToWord: t -> Word64.word | |
14766 | end where type t = Real64.real | |
14767 | structure Rlimit: MLTON_RLIMIT | |
14768 | structure Rusage: MLTON_RUSAGE | |
14769 | structure Signal: MLTON_SIGNAL | |
14770 | structure Syslog: MLTON_SYSLOG | |
14771 | structure TextIO: MLTON_TEXT_IO | |
14772 | structure Thread: MLTON_THREAD | |
14773 | structure Vector: MLTON_VECTOR | |
14774 | structure Weak: MLTON_WEAK | |
14775 | structure Word: MLTON_WORD where type t = Word.word | |
14776 | structure Word8: MLTON_WORD where type t = Word8.word | |
14777 | structure Word16: MLTON_WORD where type t = Word16.word | |
14778 | structure Word32: MLTON_WORD where type t = Word32.word | |
14779 | structure Word64: MLTON_WORD where type t = Word64.word | |
14780 | structure Word8Array: MLTON_MONO_ARRAY where type t = Word8Array.array | |
14781 | where type elem = Word8Array.elem | |
14782 | structure Word8Vector: MLTON_MONO_VECTOR where type t = Word8Vector.vector | |
14783 | where type elem = Word8Vector.elem | |
14784 | structure World: MLTON_WORLD | |
14785 | end | |
14786 | ---- | |
14787 | ||
14788 | ||
14789 | == Substructures == | |
14790 | ||
14791 | * <:MLtonArray:> | |
14792 | * <:MLtonBinIO:> | |
14793 | * <:MLtonCont:> | |
14794 | * <:MLtonExn:> | |
14795 | * <:MLtonFinalizable:> | |
14796 | * <:MLtonGC:> | |
14797 | * <:MLtonIntInf:> | |
14798 | * <:MLtonIO:> | |
14799 | * <:MLtonItimer:> | |
14800 | * <:MLtonMonoArray:> | |
14801 | * <:MLtonMonoVector:> | |
14802 | * <:MLtonPlatform:> | |
14803 | * <:MLtonPointer:> | |
14804 | * <:MLtonProcEnv:> | |
14805 | * <:MLtonProcess:> | |
14806 | * <:MLtonRandom:> | |
14807 | * <:MLtonReal:> | |
14808 | * <:MLtonRlimit:> | |
14809 | * <:MLtonRusage:> | |
14810 | * <:MLtonSignal:> | |
14811 | * <:MLtonSyslog:> | |
14812 | * <:MLtonTextIO:> | |
14813 | * <:MLtonThread:> | |
14814 | * <:MLtonVector:> | |
14815 | * <:MLtonWeak:> | |
14816 | * <:MLtonWord:> | |
14817 | * <:MLtonWorld:> | |
14818 | ||
14819 | == Values == | |
14820 | ||
14821 | * `eq (x, y)` | |
14822 | + | |
14823 | returns true if `x` and `y` are equal as pointers. For simple types | |
14824 | like `char`, `int`, and `word`, this is the same as equality. For | |
14825 | arrays, datatypes, strings, tuples, and vectors, this is a simple | |
14826 | pointer equality. The semantics is a bit murky. | |
14827 | ||
14828 | * `equal (x, y)` | |
14829 | + | |
14830 | returns true if `x` and `y` are structurally equal. For equality | |
14831 | types, this is the same as <:PolymorphicEquality:>. For other types, | |
14832 | it is a conservative approximation of equivalence. | |
14833 | ||
14834 | * `hash x` | |
14835 | + | |
14836 | returns a structural hash of `x`. The hash function is consistent | |
14837 | between executions of the same program, but may not be consistent | |
14838 | between different programs. | |
14839 | ||
14840 | * `isMLton` | |
14841 | + | |
14842 | is always `true` in a MLton implementation, and is always `false` in a | |
14843 | stub implementation. | |
14844 | ||
14845 | * `share x` | |
14846 | + | |
14847 | maximizes sharing in the heap for the object graph reachable from `x`. | |
14848 | ||
14849 | * `shareAll ()` | |
14850 | + | |
14851 | maximizes sharing in the heap by sharing space for equivalent | |
14852 | immutable objects. A call to `shareAll` performs a major garbage | |
14853 | collection, and takes time proportional to the size of the heap. | |
14854 | ||
14855 | * `size x` | |
14856 | + | |
14857 | returns the amount of heap space (in bytes) taken by the value of `x`, | |
14858 | including all objects reachable from `x` by following pointers. It | |
14859 | takes time proportional to the size of `x`. See below for an example. | |
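
The following sketch contrasts `eq`, `equal`, and `hash` on two separately built lists. It assumes the signature shown above, in which `size` returns `int`; the value names are illustrative.

[source,sml]
----
val xs = [1, 2, 3]
val ys = xs                                 (* the same object as xs *)
val zs = List.tabulate (3, fn i => i + 1)   (* built separately *)

(* The same object is always pointer-equal to itself. *)
val () = print (Bool.toString (MLton.eq (xs, ys)) ^ "\n")

(* Structural equality ignores how the lists were built. *)
val () = print (Bool.toString (MLton.equal (xs, zs)) ^ "\n")

(* Pointer equality of separately built values depends on the
 * representation and optimizations chosen by the compiler. *)
val () = print (Bool.toString (MLton.eq (xs, zs)) ^ "\n")

(* Structurally equal values have equal structural hashes. *)
val () = print (Bool.toString (MLton.hash xs = MLton.hash zs) ^ "\n")

(* Heap bytes reachable from xs; platform dependent. *)
val () = print (Int.toString (MLton.size xs) ^ "\n")
----

Only the first, second, and fourth lines of output are determined by the documented semantics; the others vary with the platform and compiler.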
14860 | ||
14861 | ||
14862 | == <!Anchor(size)>Example of `MLton.size` == | |
14863 | ||
14864 | This example, `size.sml`, demonstrates the application of `MLton.size` | |
14865 | to many different kinds of objects. | |
14866 | [source,sml] | |
14867 | ---- | |
14868 | sys::[./bin/InclGitFile.py mlton master doc/examples/size/size.sml] | |
14869 | ---- | |
14870 | ||
14871 | Compile and run as usual. | |
14872 | ---- | |
14873 | % mlton size.sml | |
14874 | % ./size | |
14875 | The size of an int list of length 4 is 48 bytes. | |
14876 | The size of a string of length 10 is 24 bytes. | |
14877 | The size of an int array of length 10 is 52 bytes. | |
14878 | The size of a double array of length 10 is 92 bytes. | |
14879 | The size of an array of length 10 of 2-ples of ints is 92 bytes. | |
14880 | The size of a useless function is 0 bytes. | |
14881 | The size of a continuation option ref is 4544 bytes. | |
14882 | 13 | |
14883 | The size of a continuation option ref is 8 bytes. | |
14884 | ---- | |
14885 | ||
14886 | Note that sizes are dependent upon the target platform and compiler | |
14887 | optimizations. | |
14888 | ||
14889 | <<< | |
14890 | ||
14891 | :mlton-guide-page: MLtonSyslog | |
14892 | [[MLtonSyslog]] | |
14893 | MLtonSyslog | |
14894 | =========== | |
14895 | ||
14896 | [source,sml] | |
14897 | ---- | |
14898 | signature MLTON_SYSLOG = | |
14899 | sig | |
14900 | type openflag | |
14901 | ||
14902 | val CONS : openflag | |
14903 | val NDELAY : openflag | |
14904 | val NOWAIT : openflag | |
14905 | val ODELAY : openflag | |
14906 | val PERROR : openflag | |
14907 | val PID : openflag | |
14908 | ||
14909 | type facility | |
14910 | ||
14911 | val AUTHPRIV : facility | |
14912 | val CRON : facility | |
14913 | val DAEMON : facility | |
14914 | val KERN : facility | |
14915 | val LOCAL0 : facility | |
14916 | val LOCAL1 : facility | |
14917 | val LOCAL2 : facility | |
14918 | val LOCAL3 : facility | |
14919 | val LOCAL4 : facility | |
14920 | val LOCAL5 : facility | |
14921 | val LOCAL6 : facility | |
14922 | val LOCAL7 : facility | |
14923 | val LPR : facility | |
14924 | val MAIL : facility | |
14925 | val NEWS : facility | |
14926 | val SYSLOG : facility | |
14927 | val USER : facility | |
14928 | val UUCP : facility | |
14929 | ||
14930 | type loglevel | |
14931 | ||
14932 | val EMERG : loglevel | |
14933 | val ALERT : loglevel | |
14934 | val CRIT : loglevel | |
14935 | val ERR : loglevel | |
14936 | val WARNING : loglevel | |
14937 | val NOTICE : loglevel | |
14938 | val INFO : loglevel | |
14939 | val DEBUG : loglevel | |
14940 | ||
14941 | val closelog: unit -> unit | |
14942 | val log: loglevel * string -> unit | |
14943 | val openlog: string * openflag list * facility -> unit | |
14944 | end | |
14945 | ---- | |
14946 | ||
14947 | `MLton.Syslog` is a complete interface to the system logging | |
14948 | facilities. See `man 3 syslog` for more details. | |
14949 | ||
14950 | * `closelog ()` | |
14951 | + | |
14952 | closes the connection to the system logger. | |
14953 | ||
14954 | * `log (l, s)` | |
14955 | + | |
14956 | logs message `s` at a loglevel `l`. | |
14957 | ||
14958 | * `openlog (name, flags, facility)` | |
14959 | + | |
14960 | opens a connection to the system logger. `name` will be prefixed to | |
14961 | each message, and is typically set to the program name. | |
14962 | ||
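As a minimal sketch of how the three functions fit together, the following opens a connection, logs two messages, and closes the connection. It uses only names from the signature above; the program name `"my-daemon"` and the message texts are illustrative assumptions.

[source,sml]
----
structure S = MLton.Syslog

(* Prefix each message with "my-daemon" and the process id. *)
val () = S.openlog ("my-daemon", [S.PID, S.CONS], S.DAEMON)
val () = S.log (S.INFO, "service started")
val () = S.log (S.ERR, "something went wrong")
val () = S.closelog ()
----

Where the messages end up depends on the configuration of the system logger.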
14963 | <<< | |
14964 | ||
14965 | :mlton-guide-page: MLtonTextIO | |
14966 | [[MLtonTextIO]] | |
14967 | MLtonTextIO | |
14968 | =========== | |
14969 | ||
14970 | [source,sml] | |
14971 | ---- | |
14972 | signature MLTON_TEXT_IO = MLTON_IO | |
14973 | ---- | |
14974 | ||
14975 | See <:MLtonIO:>. | |
14976 | ||
14977 | <<< | |
14978 | ||
14979 | :mlton-guide-page: MLtonThread | |
14980 | [[MLtonThread]] | |
14981 | MLtonThread | |
14982 | =========== | |
14983 | ||
14984 | [source,sml] | |
14985 | ---- | |
14986 | signature MLTON_THREAD = | |
14987 | sig | |
14988 | structure AtomicState: | |
14989 | sig | |
14990 | datatype t = NonAtomic | Atomic of int | |
14991 | end | |
14992 | ||
14993 | val atomically: (unit -> 'a) -> 'a | |
14994 | val atomicBegin: unit -> unit | |
14995 | val atomicEnd: unit -> unit | |
14996 | val atomicState: unit -> AtomicState.t | |
14997 | ||
14998 | structure Runnable: | |
14999 | sig | |
15000 | type t | |
15001 | end | |
15002 | ||
15003 | type 'a t | |
15004 | ||
15005 | val atomicSwitch: ('a t -> Runnable.t) -> 'a | |
15006 | val new: ('a -> unit) -> 'a t | |
15007 | val prepend: 'a t * ('b -> 'a) -> 'b t | |
15008 | val prepare: 'a t * 'a -> Runnable.t | |
15009 | val switch: ('a t -> Runnable.t) -> 'a | |
15010 | end | |
15011 | ---- | |
15012 | ||
15013 | `MLton.Thread` provides access to MLton's user-level thread | |
15014 | implementation (i.e. not OS-level threads). Threads are lightweight | |
15015 | data structures that represent a paused computation. Runnable threads | |
15016 | are threads that will begin or continue computing when `switch`-ed to. | |
15017 | `MLton.Thread` does not include a default scheduling mechanism, but it | |
15018 | can be used to implement both preemptive and non-preemptive threads. | |
15019 | ||
15020 | * `type AtomicState.t` | |
15021 | + | |
15022 | the type of atomic states. | |
15023 | ||
15024 | ||
15025 | * `atomically f` | |
15026 | + | |
15027 | runs `f` in a critical section. | |
15028 | ||
15029 | * `atomicBegin ()` | |
15030 | + | |
15031 | begins a critical section. | |
15032 | ||
15033 | * `atomicEnd ()` | |
15034 | + | |
15035 | ends a critical section. | |
15036 | ||
15037 | * `atomicState ()` | |
15038 | + | |
15039 | returns the current atomic state. | |
15040 | ||
15041 | * `type Runnable.t` | |
15042 | + | |
15043 | the type of threads that can be resumed. | |
15044 | ||
15045 | * `type 'a t` | |
15046 | + | |
15047 | the type of threads that expect a value of type `'a`. | |
15048 | ||
15049 | * `atomicSwitch f` | |
15050 | + | |
15051 | like `switch`, but assumes an atomic calling context. Upon | |
15052 | `switch`-ing back to the current thread, an implicit `atomicEnd` is | |
15053 | performed. | |
15054 | ||
15055 | * `new f` | |
15056 | + | |
15057 | creates a new thread that, when run, applies `f` to the value given to | |
15058 | the thread. `f` must terminate by `switch`-ing to another thread or | |
15059 | exiting the process. | |
15060 | ||
15061 | * `prepend (t, f)` | |
15062 | + | |
15063 | creates a new thread (destroying `t` in the process) that first | |
15064 | applies `f` to the value given to the thread and then continues with | |
15065 | `t`. This is a constant time operation. | |
15066 | ||
15067 | * `prepare (t, v)` | |
15068 | + | |
15069 | prepares a new runnable thread (destroying `t` in the process) that | |
15070 | will evaluate `t` on `v`. | |
15071 | ||
15072 | * `switch f` | |
15073 | + | |
15074 | applies `f` to the current thread to get `rt`, and then starts running | |
15075 | thread `rt`. It is an error for `f` to perform another `switch`. `f` | |
15076 | is guaranteed to run atomically. | |
15077 | ||
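The interplay of `new`, `prepare`, and `switch` can be illustrated with a minimal sketch that transfers control to a fresh thread and back. The names `T`, `main`, and `child` are chosen here for illustration; only the functions from the signature above are assumed.

[source,sml]
----
structure T = MLton.Thread

val () =
   T.switch (fn (main: unit T.t) =>
      let
         (* The child prints a message and then switches back to the
          * paused main thread; the child itself is never resumed. *)
         val child: unit T.t =
            T.new (fn () =>
                   (print "in child\n"
                    ; T.switch (fn _ => T.prepare (main, ()))))
      in
         (* Make the child runnable; switch starts running it. *)
         T.prepare (child, ())
      end)

val () = print "back in main\n"
----

Under these assumptions the program should print `in child` followed by `back in main`. Note that the function passed to the outer `switch` only creates and prepares the child; it does not itself perform a `switch`.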
15078 | ||
15079 | == Example of non-preemptive threads == | |
15080 | ||
15081 | [source,sml] | |
15082 | ---- | |
15083 | sys::[./bin/InclGitFile.py mlton master doc/examples/thread/non-preemptive-threads.sml] | |
15084 | ---- | |
15085 | ||
15086 | ||
15087 | == Example of preemptive threads == | |
15088 | ||
15089 | [source,sml] | |
15090 | ---- | |
15091 | sys::[./bin/InclGitFile.py mlton master doc/examples/thread/preemptive-threads.sml] | |
15092 | ---- | |
15093 | ||
15094 | <<< | |
15095 | ||
15096 | :mlton-guide-page: MLtonVector | |
15097 | [[MLtonVector]] | |
15098 | MLtonVector | |
15099 | =========== | |
15100 | ||
15101 | [source,sml] | |
15102 | ---- | |
15103 | signature MLTON_VECTOR = | |
15104 | sig | |
15105 | val create: int -> {done: unit -> 'a vector, | |
15106 | sub: int -> 'a, | |
15107 | update: int * 'a -> unit} | |
15108 | val unfoldi: int * 'b * (int * 'b -> 'a * 'b) -> 'a vector * 'b | |
15109 | end | |
15110 | ---- | |
15111 | ||
15112 | * `create n` | |
15113 | + | |
15114 | initiates the construction of a vector _v_ of length `n`, returning | |
15115 | functions to manipulate the vector. The `done` function may be called | |
15116 | to return the created vector; it is an error to call `done` before all | |
15117 | entries have been initialized; it is an error to call `done` after | |
15118 | having called `done`. The `sub` function may be called to return an | |
15119 | initialized vector entry; it is not an error to call `sub` after | |
15120 | having called `done`. The `update` function may be called to | |
15121 | initialize a vector entry; it is an error to call `update` after | |
15122 | having called `done`. One must initialize vector entries in order | |
15123 | from lowest to highest; that is, before calling `update (i, x)`, one | |
15124 | must have already called `update (j, x)` for all `j` in `[0, i)`. The | |
15125 | `done`, `sub`, and `update` functions are all constant-time | |
15126 | operations. | |
15127 | ||
15128 | * `unfoldi (n, b, f)` | |
15129 | + | |
15130 | constructs a vector _v_ of length `n`, whose elements __v~i~__ are | |
15131 | determined by the equations __b~0~ = b__ and | |
15132 | __(v~i~, b~i+1~) = f (i, b~i~)__. | |
15133 | ||
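The following sketch exercises both functions, assuming only the signature above. The first binding fills a vector in index order with `create`, reading back earlier entries with `sub`; the second uses `unfoldi` to build the squares of `0` through `4` while threading their running sum as the state. All value names are illustrative.

[source,sml]
----
(* Entries must be initialized from index 0 upwards. *)
val v: int vector =
   let
      val {done, sub, update} = MLton.Vector.create 3
   in
      update (0, 10)
      ; update (1, sub 0 + 1)
      ; update (2, sub 1 + 1)
      ; done ()
   end
(* v has the same elements as Vector.fromList [10, 11, 12] *)

(* The state is the sum of the squares produced so far. *)
val (squares, total) =
   MLton.Vector.unfoldi (5, 0, fn (i, sum) => (i * i, sum + i * i))
(* squares has the elements 0, 1, 4, 9, 16 and total = 30 *)
----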
15134 | <<< | |
15135 | ||
15136 | :mlton-guide-page: MLtonWeak | |
15137 | [[MLtonWeak]] | |
15138 | MLtonWeak | |
15139 | ========= | |
15140 | ||
15141 | [source,sml] | |
15142 | ---- | |
15143 | signature MLTON_WEAK = | |
15144 | sig | |
15145 | type 'a t | |
15146 | ||
15147 | val get: 'a t -> 'a option | |
15148 | val new: 'a -> 'a t | |
15149 | end | |
15150 | ---- | |
15151 | ||
15152 | A weak pointer is a pointer to an object that is nulled if the object | |
15153 | becomes <:Reachability:unreachable> due to garbage collection. The | |
15154 | weak pointer does not itself cause the object it points to be retained | |
15155 | by the garbage collector -- only other strong pointers can do that. | |
15156 | For objects that are not allocated in the heap, like integers, a weak | |
15157 | pointer will always be nulled. So, if `w: int Weak.t`, then | |
15158 | `Weak.get w = NONE`. | |
15159 | ||
15160 | * `type 'a t` | |
15161 | + | |
15162 | the type of weak pointers to objects of type `'a`. | |
15163 | ||
15164 | * `get w` | |
15165 | + | |
15166 | returns `NONE` if the object pointed to by `w` no longer exists. | |
15167 | Otherwise, returns `SOME` of the object pointed to by `w`. | |
15168 | ||
15169 | * `new x` | |
15170 | + | |
15171 | returns a weak pointer to `x`. | |
15172 | ||
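A minimal sketch of creating and querying a weak pointer follows, assuming only the signature above. A ref cell is used as a typical heap-allocated object; which branch is taken depends on whether the cell is still reachable when `get` is called.

[source,sml]
----
val r = ref 13
val w = MLton.Weak.new r

val () =
   case MLton.Weak.get w of
      NONE => print "object has been collected\n"
    | SOME r' => print (concat ["object is alive: ", Int.toString (!r'), "\n"])
----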
15173 | <<< | |
15174 | ||
15175 | :mlton-guide-page: MLtonWord | |
15176 | [[MLtonWord]] | |
15177 | MLtonWord | |
15178 | ========= | |
15179 | ||
15180 | [source,sml] | |
15181 | ---- | |
15182 | signature MLTON_WORD = | |
15183 | sig | |
15184 | type t | |
15185 | ||
15186 | val bswap: t -> t | |
15187 | val rol: t * word -> t | |
15188 | val ror: t * word -> t | |
15189 | end | |
15190 | ---- | |
15191 | ||
15192 | * `type t` | |
15193 | + | |
15194 | the type of words. For `MLton.LargeWord` this is `LargeWord.word`, | |
15195 | for `MLton.Word` this is `Word.word`, for `MLton.Word8` this is | |
15196 | `Word8.word`, for `MLton.Word16` this is `Word16.word`, for | |
15197 | `MLton.Word32` this is `Word32.word`, for `MLton.Word64` this is | |
15198 | `Word64.word`. | |
15199 | ||
15200 | * `bswap w` | |
15201 | + | |
15202 | byte swap. | |
15203 | ||
15204 | * `rol (w, w')` | |
15205 | + | |
15206 | rotates left (circular). | |
15207 | ||
15208 | * `ror (w, w')` | |
15209 | + | |
15210 | rotates right (circular). | |
15211 | ||
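As a small worked example, assuming the `MLton.Word8` and `MLton.Word32` structures mentioned above, the following shows the effect of each operation on concrete values. The value names are illustrative.

[source,sml]
----
(* Byte swap reverses the order of the four bytes. *)
val swapped = MLton.Word32.bswap 0wx11223344   (* 0wx44332211 *)

(* 0wx81 is 10000001 in binary. *)
val left = MLton.Word8.rol (0wx81, 0w1)        (* 00000011 = 0wx03 *)
val right = MLton.Word8.ror (0wx81, 0w1)       (* 11000000 = 0wxC0 *)
----

Unlike a shift, a rotation reinserts the bits that fall off one end at the other end.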
15212 | <<< | |
15213 | ||
15214 | :mlton-guide-page: MLtonWorld | |
15215 | [[MLtonWorld]] | |
15216 | MLtonWorld | |
15217 | ========== | |
15218 | ||
15219 | [source,sml] | |
15220 | ---- | |
15221 | signature MLTON_WORLD = | |
15222 | sig | |
15223 | datatype status = Clone | Original | |
15224 | ||
15225 | val load: string -> 'a | |
15226 | val save: string -> status | |
15227 | val saveThread: string * Thread.Runnable.t -> unit | |
15228 | end | |
15229 | ---- | |
15230 | ||
15231 | * `datatype status` | |
15232 | + | |
15233 | specifies whether a world is original or restarted (a clone). | |
15234 | ||
15235 | * `load f` | |
15236 | + | |
15237 | loads the saved computation from file `f`. | |
15238 | ||
15239 | * `save f` | |
15240 | + | |
15241 | saves the entire state of the computation to the file `f`. The | |
15242 | computation can then be restarted at a later time using `World.load` | |
15243 | or the `load-world` <:RunTimeOptions:runtime option>. The call to | |
15244 | `save` in the original computation returns `Original` and the call in | |
15245 | the restarted world returns `Clone`. | |
15246 | ||
15247 | * `saveThread (f, rt)` | |
15248 | + | |
15249 | saves the entire state of the computation to the file `f` that will | |
15250 | resume with thread `rt` upon restart. | |
15251 | ||
15252 | ||
15253 | == Notes == | |
15254 | ||
15255 | <!Anchor(ASLR)> | |
15256 | Executables that save and load worlds are incompatible with | |
15257 | http://en.wikipedia.org/wiki/Address_space_layout_randomization[address space layout randomization (ASLR)] | |
15258 | of the executable (though, not of shared libraries). The state of a | |
15259 | computation includes addresses into the code and data segments of the | |
15260 | executable (e.g., static runtime-system data, return addresses); such | |
15261 | addresses are invalid when interpreted by the executable loaded at a | |
15262 | different base address. | |
15263 | ||
15264 | Executables that save and load worlds should be compiled with an | |
15265 | option to suppress the generation of position-independent executables. | |
15266 | ||
15267 | * <:RunningOnDarwin:Darwin 11 (Mac OS X Lion) and higher> : `-link-opt -fno-PIE` | |
15268 | ||
15269 | ||
15270 | == Example == | |
15271 | ||
15272 | Suppose that `save-world.sml` contains the following. | |
15273 | [source,sml] | |
15274 | ---- | |
15275 | sys::[./bin/InclGitFile.py mlton master doc/examples/save-world/save-world.sml] | |
15276 | ---- | |
15277 | ||
15278 | Then, if we compile `save-world.sml` and run it, the `Original` | |
15279 | branch will execute, and a file named `world` will be created. | |
15280 | ---- | |
15281 | % mlton save-world.sml | |
15282 | % ./save-world | |
15283 | I am the original | |
15284 | ---- | |
15285 | ||
15286 | We can then load `world` using the `load-world` | |
15287 | <:RunTimeOptions:runtime option>. | |
15288 | ---- | |
15289 | % ./save-world @MLton load-world world -- | |
15290 | I am the clone | |
15291 | ---- | |
15292 | ||
15293 | <<< | |
15294 | ||
15295 | :mlton-guide-page: MLULex | |
15296 | [[MLULex]] | |
15297 | MLULex | |
15298 | ====== | |
15299 | ||
15300 | http://smlnj-gforge.cs.uchicago.edu/projects/ml-lpt/[MLULex] is a | |
15301 | scanner generator for <:StandardML:Standard ML>. | |
15302 | ||
15303 | == Also see == | |
15304 | ||
15305 | * <:MLAntlr:> | |
15306 | * <:MLLPTLibrary:> | |
15307 | * <!Cite(OwensEtAl09)> | |
15308 | ||
15309 | <<< | |
15310 | ||
15311 | :mlton-guide-page: MLYacc | |
15312 | [[MLYacc]] | |
15313 | MLYacc | |
15314 | ====== | |
15315 | ||
15316 | <:MLYacc:> is a parser generator for <:StandardML:Standard ML> modeled | |
15317 | after the Yacc parser generator. | |
15318 | ||
15319 | A version of MLYacc, ported from the <:SMLNJ:SML/NJ> sources, is | |
15320 | distributed with MLton. | |
15321 | ||
15322 | == Also see == | |
15323 | ||
15324 | * <!Attachment(Documentation,mlyacc.pdf)> | |
15325 | * <:MLLex:> | |
15326 | * <!Cite(TarditiAppel00)> | |
15327 | * <!Cite(Price09)> | |
15328 | ||
15329 | <<< | |
15330 | ||
15331 | :mlton-guide-page: Monomorphise | |
15332 | [[Monomorphise]] | |
15333 | Monomorphise | |
15334 | ============ | |
15335 | ||
15336 | <:Monomorphise:> is a translation pass from the <:XML:> | |
15337 | <:IntermediateLanguage:> to the <:SXML:> <:IntermediateLanguage:>. | |
15338 | ||
15339 | == Description == | |
15340 | ||
15341 | Monomorphisation eliminates polymorphic values and datatype | |
15342 | declarations by duplicating them for each type at which they are used. | |
15343 | ||
15344 | Consider the following <:XML:> program. | |
15345 | [source,sml] | |
15346 | ---- | |
15347 | datatype 'a t = T of 'a | |
15348 | fun 'a f (x: 'a) = T x | |
15349 | val a = f 1 | |
15350 | val b = f 2 | |
15351 | val z = f (3, 4) | |
15352 | ---- | |
15353 | ||
15354 | The result of monomorphising this program is the following <:SXML:> program: | |
15355 | [source,sml] | |
15356 | ---- | |
15357 | datatype t1 = T1 of int | |
15358 | datatype t2 = T2 of int * int | |
15359 | fun f1 (x: int) = T1 x | |
15360 | fun f2 (x: int * int) = T2 x | |
15361 | val a = f1 1 | |
15362 | val b = f1 2 | |
15363 | val z = f2 (3, 4) | |
15364 | ---- | |
15365 | ||
15366 | == Implementation == | |
15367 | ||
15368 | * <!ViewGitFile(mlton,master,mlton/xml/monomorphise.sig)> | |
15369 | * <!ViewGitFile(mlton,master,mlton/xml/monomorphise.fun)> | |
15370 | ||
15371 | == Details and Notes == | |
15372 | ||
15373 | The monomorphiser works by making one pass over the entire program. | |
15374 | On the way down, it creates a cache for each variable declared in a | |
15375 | polymorphic declaration that maps a list of type arguments to a new | |
15376 | variable name. At a variable reference, it consults the cache (based | |
15377 | on the types the variable is applied to). If there is already an | |
15378 | entry in the cache, it is used. If not, a new entry is created. On | |
15379 | the way up, the monomorphiser duplicates a variable declaration for | |
15380 | each entry in the cache. | |
15381 | ||
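The cache lookup described above can be modelled with a toy sketch. This is not the code in `monomorphise.fun`; it represents type arguments as strings and uses the hypothetical names `cache` and `instance` purely to illustrate the idea of mapping each list of type arguments to one fresh variable name.

[source,sml]
----
(* One cache per polymorphic variable: type arguments -> new name. *)
type cache = (string list * string) list ref

fun instance (name: string, c: cache) (tyargs: string list): string =
   case List.find (fn (args, _) => args = tyargs) (!c) of
      SOME (_, newName) => newName      (* seen before: reuse the entry *)
    | NONE =>
         let
            val newName = name ^ Int.toString (length (!c) + 1)
         in
            c := (tyargs, newName) :: !c
            ; newName
         end

val fCache: cache = ref []
val a = instance ("f", fCache) ["int"]         (* "f1" *)
val b = instance ("f", fCache) ["int"]         (* "f1" again *)
val z = instance ("f", fCache) ["int * int"]   (* "f2" *)
----

After the three references the cache holds two entries, so the declaration of `f` would be duplicated twice, matching `f1` and `f2` in the example in the Description.
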
15382 | As with variables, the monomorphiser records all of the types at which | |
15383 | constructors are used. After the entire program is processed, the | |
15384 | monomorphiser duplicates each datatype declaration and its associated | |
15385 | constructors. | |
15386 | ||
15387 | The monomorphiser duplicates all of the functions declared in a | |
15388 | `fun` declaration as a unit. Consider the following program | |
15389 | [source,sml] | |
15390 | ---- | |
15391 | fun 'a f (x: 'a) = g x | |
15392 | and g (y: 'a) = f y | |
15393 | val a = f 13 | |
15394 | val b = g 14 | |
15395 | val c = f (1, 2) | |
15396 | ---- | |
15397 | ||
15398 | and its monomorphisation | |
15399 | ||
15400 | [source,sml] | |
15401 | ---- | |
15402 | fun f1 (x: int) = g1 x | |
15403 | and g1 (y: int) = f1 y | |
15404 | fun f2 (x : int * int) = g2 x | |
15405 | and g2 (y : int * int) = f2 y | |
15406 | val a = f1 13 | |
15407 | val b = g1 14 | |
15408 | val c = f2 (1, 2) | |
15409 | ---- | |
15410 | ||
15411 | == Pathological datatype declarations == | |
15412 | ||
15413 | SML allows a pathological polymorphic datatype declaration in which | |
15414 | recursive uses of the defined type constructor are applied to | |
15415 | different type arguments than the definition. This has been | |
15416 | disallowed by others on type-theoretic grounds. A canonical example | |
15417 | is the following. | |
15418 | [source,sml] | |
15419 | ---- | |
15420 | datatype 'a t = A of 'a | B of ('a * 'a) t | |
15421 | val z : int t = B (B (A ((1, 2), (3, 4)))) | |
15422 | ---- | |
15423 | ||
15424 | The presence of the recursion in the datatype declaration might appear | |
15425 | to cause the need for the monomorphiser to create an infinite number | |
15426 | of types. However, due to the absence of polymorphic recursion in | |
15427 | SML, there are in fact only a finite number of instances of such types | |
15428 | in any given program. The monomorphiser translates the above program | |
15429 | to the following one. | |
15430 | [source,sml] | |
15431 | ---- | |
15432 | datatype t1 = B1 of t2 | |
15433 | datatype t2 = B2 of t3 | |
15434 | datatype t3 = A3 of (int * int) * (int * int) | |
15435 | val z : t1 = B1 (B2 (A3 ((1, 2), (3, 4)))) | |
15436 | ---- | |
15437 | ||
15438 | It is crucial that the monomorphiser be allowed to drop unused | |
15439 | constructors from datatype declarations in order for the translation | |
15440 | to terminate. | |
15441 | ||
15442 | <<< | |
15443 | ||
15444 | :mlton-guide-page: MoscowML | |
15445 | [[MoscowML]] | |
15446 | MoscowML | |
15447 | ======== | |
15448 | ||
15449 | http://mosml.org[Moscow ML] is a | |
15450 | <:StandardMLImplementations:Standard ML implementation>. It is a | |
15451 | byte-code compiler, so it compiles code quickly, but the code runs | |
15452 | slowly. See <:Performance:>. | |
15453 | ||
15454 | <<< | |
15455 | ||
15456 | :mlton-guide-page: Multi | |
15457 | [[Multi]] | |
15458 | Multi | |
15459 | ===== | |
15460 | ||
15461 | <:Multi:> is an analysis pass for the <:SSA:> | |
15462 | <:IntermediateLanguage:>, invoked from <:ConstantPropagation:> and | |
15463 | <:LocalRef:>. | |
15464 | ||
15465 | == Description == | |
15466 | ||
15467 | This pass analyzes the control flow of an <:SSA:> program to determine | |
15468 | which <:SSA:> functions and blocks might be executed more than once or | |
15469 | by more than one thread. It also determines when a program uses | |
15470 | threads and when functions and blocks directly or indirectly invoke | |
15471 | `Thread_copyCurrent`. | |
15472 | ||
15473 | == Implementation == | |
15474 | ||
15475 | * <!ViewGitFile(mlton,master,mlton/ssa/multi.sig)> | |
15476 | * <!ViewGitFile(mlton,master,mlton/ssa/multi.fun)> | |
15477 | ||
15478 | == Details and Notes == | |
15479 | ||
15480 | {empty} | |
15481 | ||
15482 | <<< | |
15483 | ||
15484 | :mlton-guide-page: Mutable | |
15485 | [[Mutable]] | |
15486 | Mutable | |
15487 | ======= | |
15488 | ||
15489 | Mutable is an adjective meaning "can be modified". In | |
15490 | <:StandardML:Standard ML>, ref cells and arrays are mutable, while all | |
15491 | other values are <:Immutable:immutable>. | |
15492 | ||
15493 | <<< | |
15494 | ||
15495 | :mlton-guide-page: NeedsReview | |
15496 | [[NeedsReview]] | |
15497 | NeedsReview | |
15498 | =========== | |
15499 | ||
15500 | This page documents some patches and bug fixes that need additional review by experienced developers: | |
15501 | ||
15502 | * Bug in transparent signature match: | |
15503 | ** What is an 'original' interface and why does the equivalence of original interfaces imply the equivalence of the actual interfaces? | |
15504 | ** http://www.mlton.org/pipermail/mlton/2007-September/029991.html | |
15505 | ** http://www.mlton.org/pipermail/mlton/2007-September/029995.html | |
15506 | ** SVN Revision: <!ViewSVNRev(6046)> | |
15507 | ||
15508 | * Bug in <:DeepFlatten:> pass: | |
15509 | ** Should we allow argument to `Weak_new` to be flattened? | |
15510 | ** SVN Revision: <!ViewSVNRev(6189)> (regression test demonstrating bug) | |
15511 | ** SVN Revision: <!ViewSVNRev(6191)> | |
15512 | ||
15513 | <<< | |
15514 | ||
15515 | :mlton-guide-page: NumericLiteral | |
15516 | [[NumericLiteral]] | |
15517 | NumericLiteral | |
15518 | ============== | |
15519 | ||
15520 | Numeric literals in <:StandardML:Standard ML> can be written in either | |
15521 | decimal or hexadecimal notation. Sometimes it can be convenient to | |
15522 | write numbers down in other bases. Fortunately, using <:Fold:>, it is | |
15523 | possible to define a concise syntax for numeric literals that allows | |
15524 | one to write numeric constants in any base and of various types | |
15525 | (`int`, `IntInf.int`, `word`, and more). | |
15526 | ||
15527 | We will define constants `I`, `II`, `W`, and +`+ so | |
15528 | that, for example, | |
15529 | [source,sml] | |
15530 | ---- | |
15531 | I 10 `1`2`3 $ | |
15532 | ---- | |
15533 | denotes `123:int` in base 10, while | |
15534 | [source,sml] | |
15535 | ---- | |
15536 | II 8 `2`3 $ | |
15537 | ---- | |
15538 | denotes `19:IntInf.int` in base 8, and | |
15539 | [source,sml] | |
15540 | ---- | |
15541 | W 2 `1`1`0`1 $ | |
15542 | ---- | |
15543 | denotes `0w13: word`. | |
15544 | ||
15545 | Here is the code. | |
15546 | ||
15547 | [source,sml] | |
15548 | ---- | |
15549 | structure Num = | |
15550 | struct | |
15551 | fun make (op *, op +, i2x) iBase = | |
15552 | let | |
15553 | val xBase = i2x iBase | |
15554 | in | |
15555 | Fold.fold | |
15556 | ((i2x 0, | |
15557 | fn (i, x) => | |
15558 | if 0 <= i andalso i < iBase then | |
15559 | x * xBase + i2x i | |
15560 | else | |
15561 | raise Fail (concat | |
15562 | ["Num: ", Int.toString i, | |
15563 | " is not a valid\ | |
15564 | \ digit in base ", | |
15565 | Int.toString iBase])), | |
15566 | fst) | |
15567 | end | |
15568 | ||
15569 | fun I ? = make (op *, op +, id) ? | |
15570 | fun II ? = make (op *, op +, IntInf.fromInt) ? | |
15571 | fun W ? = make (op *, op +, Word.fromInt) ? | |
15572 | ||
15573 | fun ` ? = Fold.step1 (fn (i, (x, step)) => | |
15574 | (step (i, x), step)) ? | |
15575 | ||
15576 | val a = 10 | |
15577 | val b = 11 | |
15578 | val c = 12 | |
15579 | val d = 13 | |
15580 | val e = 14 | |
15581 | val f = 15 | |
15582 | end | |
15583 | ---- | |
15584 | where `id` is the identity function (`fun id x = x`) and | |
15585 | [source,sml] | |
15586 | ---- | |
15587 | fun fst (x, _) = x | |
15588 | ---- | |
15589 | ||
15590 | The idea is for the fold to start with zero and to construct the | |
15591 | result one digit at a time, with each stepper multiplying the previous | |
15592 | result by the base and adding the next digit. The code is abstracted | |
15593 | in two different ways for extra generality. First, the `make` | |
15594 | function abstracts over the various primitive operations (addition, | |
15595 | multiplication, etc.) that are needed to construct a number. This | |
15596 | allows the same code to be shared for constants `I`, `II`, `W` used to | |
15597 | write down the various numeric types. It also allows users to add new | |
15598 | constants for additional numeric types, by supplying the necessary | |
15599 | arguments to `make`. | |
15600 | ||
15601 | Second, the step function, +`+, is abstracted over the actual | |
15602 | construction operation, which is created by `make`, and passed along the | |
15603 | fold. This allows the same constant, +`+, to be used for all | |
15604 | numeric types. The alternative approach, having a different step | |
15605 | function for each numeric type, would be more painful to use. | |
15606 | ||
15607 | On the surface, it appears that the code checks the digits dynamically | |
15608 | to ensure they are valid for the base. However, MLton will simplify | |
15609 | everything away at compile time, leaving just the final numeric | |
15610 | constant. | |
15611 | ||
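The digit-by-digit accumulation that each step performs can be sketched without <:Fold:>, using an ordinary list of digits. The function `fromDigits` is a hypothetical helper introduced here for illustration; it omits the digit validation done by `make`.

[source,sml]
----
(* Multiply the running result by the base and add the next digit. *)
fun fromDigits (base: int) (digits: int list): int =
   List.foldl (fn (d, acc) => acc * base + d) 0 digits

val x = fromDigits 10 [1, 2, 3]     (* 123 *)
val y = fromDigits 8 [2, 3]         (* 19 *)
val z = fromDigits 2 [1, 1, 0, 1]   (* 13 *)
----

The <:Fold:>-based version computes the same values, but lets the digits be written inline and works uniformly for several numeric types.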
15612 | <<< | |
15613 | ||
15614 | :mlton-guide-page: ObjectOrientedProgramming | |
15615 | [[ObjectOrientedProgramming]] | |
15616 | ObjectOrientedProgramming | |
15617 | ========================= | |
15618 | ||
15619 | <:StandardML:Standard ML> does not have explicit support for | |
15620 | object-oriented programming. Here are some papers that show how to | |
15621 | express certain object-oriented concepts in SML. | |
15622 | ||
15623 | * <!Cite(Berthomieu00, OO Programming styles in ML)> | |
15624 | ||
15625 | * <!Cite(ThorupTofte94, Object-oriented programming and Standard ML)> | |
15626 | ||
15627 | * <!Cite(LarsenNiss04, mGTK: An SML binding of Gtk+)> | |
15628 | ||
15629 | * <!Cite(FluetPucella06, Phantom Types and Subtyping)> | |
15630 | ||
15631 | The question of OO programming in SML comes up every now and then. | |
15632 | The following discusses a simple object-oriented (OO) programming | |
15633 | technique in Standard ML. The reader is assumed to be able to read | |
15634 | Java and SML code. | |
15635 | ||
15636 | ||
15637 | == Motivation == | |
15638 | ||
15639 | SML doesn't provide subtyping, but it does provide parametric | |
15640 | polymorphism, which can be used to encode some forms of subtyping. | |
15641 | Most articles on OO programming in SML concentrate on such encoding | |
15642 | techniques. While those techniques are interesting -- and it is | |
15643 | recommended to read such articles -- and sometimes useful, it seems | |
15644 | that basically all OO gurus agree that (deep) subtyping (or | |
15645 | inheritance) hierarchies aren't as practical as they were thought to | |
15646 | be in the early OO days. "Good", flexible, "OO" designs tend to have | |
15647 | a flat structure | |
15648 | ||
15649 | ---- | |
15650 | Interface | |
15651 | ^ | |
15652 | | | |
15653 | - - -+-------+-------+- - - | |
15654 | | | | | |
15655 | ImplA ImplB ImplC | |
15656 | ---- | |
15657 | ||
15658 | ||
15659 | and deep inheritance hierarchies | |
15660 | ||
15661 | ---- | |
15662 | ClassA | |
15663 | ^ | |
15664 | | | |
15665 | ClassB | |
15666 | ^ | |
15667 | | | |
15668 | ClassC | |
15669 | ^ | |
15670 | | | |
15671 | ---- | |
15672 | ||
15673 | tend to be signs of design mistakes. There are good underlying | |
15674 | reasons for this, but a thorough discussion is not in the scope of | |
15675 | this article. However, the point is that perhaps the encoding of | |
15676 | subtyping is not as important as one might believe. In the following | |
15677 | we ignore subtyping and rather concentrate on a very simple and basic | |
15678 | dynamic dispatch technique. | |
15679 | ||
15680 | ||
15681 | == Dynamic Dispatch Using a Recursive Record of Functions == | |
15682 | ||
15683 | Quite simply, the basic idea is to implement a "virtual function | |
15684 | table" using a record that is wrapped inside a (possibly recursive) | |
15685 | datatype. Let's first take a look at a simple concrete example. | |
15686 | ||
15687 | Consider the following Java interface: | |
15688 | ||
15689 | ---- | |
15690 | public interface Counter { | |
15691 | public void inc(); | |
15692 | public int get(); | |
15693 | } | |
15694 | ---- | |
15695 | ||
15696 | We can translate the `Counter` interface to SML as follows: | |
15697 | ||
15698 | [source,sml] | |
15699 | ---- | |
15700 | datatype counter = Counter of {inc : unit -> unit, get : unit -> int} | |
15701 | ---- | |
15702 | ||
15703 | Each value of type `counter` can be thought of as an object that | |
15704 | responds to two messages `inc` and `get`. To actually send messages | |
15705 | to a counter, it is useful to define auxiliary functions | |
15706 | ||
15707 | [source,sml] | |
15708 | ---- | |
15709 | local | |
15710 | fun mk m (Counter t) = m t () | |
15711 | in | |
15712 | val cGet = mk #get | |
15713 | val cInc = mk #inc | |
15714 | end | |
15715 | ---- | |
15716 | ||
15717 | that basically extract the "function table" `t` from a counter object | |
15718 | and then select the specified method `m` from the table. | |
15719 | ||
15720 | Let's then implement a simple function that increments a counter until a | |
15721 | given maximum is reached: | |
15722 | ||
15723 | [source,sml] | |
15724 | ---- | |
15725 | fun incUpto counter max = while cGet counter < max do cInc counter | |
15726 | ---- | |
15727 | ||
15728 | You can easily verify that the above code compiles even without any | |
15729 | concrete implementation of a counter, thus it is clear that it doesn't | |
15730 | depend on a particular counter implementation. | |
15731 | ||
15732 | Let's then implement a couple of counters. First consider the | |
15733 | following Java class implementing the `Counter` interface given earlier. | |
15734 | ||
15735 | ---- | |
15736 | public class BasicCounter implements Counter { | |
15737 | private int cnt; | |
15738 | public BasicCounter(int initialCnt) { this.cnt = initialCnt; } | |
15739 | public void inc() { this.cnt += 1; } | |
15740 | public int get() { return this.cnt; } | |
15741 | } | |
15742 | ---- | |
15743 | ||
15744 | We can translate the above to SML as follows: | |
15745 | ||
15746 | [source,sml] | |
15747 | ---- | |
15748 | fun newBasicCounter initialCnt = let | |
15749 | val cnt = ref initialCnt | |
15750 | in | |
15751 | Counter {inc = fn () => cnt := !cnt + 1, | |
15752 | get = fn () => !cnt} | |
15753 | end | |
15754 | ---- | |
15755 | ||
15756 | The SML function `newBasicCounter` can be described as a constructor | |
15757 | function for counter objects of the `BasicCounter` "class". We can | |
15758 | also have other counter implementations. Here is the constructor for | |
15759 | a counter decorator that logs messages: | |
15760 | ||
15761 | [source,sml] | |
15762 | ---- | |
15763 | fun newLoggedCounter counter = | |
15764 | Counter {inc = fn () => (print "inc\n" ; cInc counter), | |
15765 | get = fn () => (print "get\n" ; cGet counter)} | |
15766 | ---- | |
15767 | ||
15768 | The `incUpto` function works just as well with objects of either | |
15769 | class: | |
15770 | ||
15771 | [source,sml] | |
15772 | ---- | |
15773 | val aCounter = newBasicCounter 0 | |
15774 | val () = incUpto aCounter 5 | |
15775 | val () = print (Int.toString (cGet aCounter) ^"\n") | |
15776 | ||
15777 | val aCounter = newLoggedCounter (newBasicCounter 0) | |
15778 | val () = incUpto aCounter 5 | |
15779 | val () = print (Int.toString (cGet aCounter) ^"\n") | |
15780 | ---- | |
15781 | ||
15782 | In general, a dynamic dispatch interface is represented as a record | |
15783 | type wrapped inside a datatype. Each field of the record corresponds | |
15784 | to a public method or field of the object: | |
15785 | ||
15786 | [source,sml] | |
15787 | ---- | |
15788 | datatype interface = | |
15789 | Interface of {method : t1 -> t2, | |
15790 | immutableField : t, | |
15791 | mutableField : t ref} | |
15792 | ---- | |
15793 | ||
15794 | The reason for wrapping the record inside a datatype is that records, | |
15795 | in SML, cannot be recursive. However, SML datatypes can be | |
15796 | recursive. A record wrapped in a datatype can contain fields that | |
15797 | contain the datatype. For example, an interface such as `Cloneable` | |
15798 | ||
15799 | [source,sml] | |
15800 | ---- | |
15801 | datatype cloneable = Cloneable of {clone : unit -> cloneable} | |
15802 | ---- | |
15803 | ||
15804 | can be represented using recursive datatypes. | |
15805 | ||
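To make the recursion concrete, here is a minimal sketch of an interface with a `clone` method together with one constructor function. The names `cell`, `newCell`, `cClone`, `cGet`, and `cSet` are hypothetical and chosen to mirror the `Counter` example.

[source,sml]
----
datatype cell =
   Cell of {clone : unit -> cell,
            get : unit -> int,
            set : int -> unit}

(* Message-sending helpers, in the style of cGet and cInc. *)
fun cClone (Cell t) = #clone t ()
fun cGet (Cell t) = #get t ()
fun cSet (Cell t, n) = #set t n

(* The private state r is captured by the closures in the record. *)
fun newCell (init: int): cell =
   let
      val r = ref init
   in
      Cell {clone = fn () => newCell (!r),
            get = fn () => !r,
            set = fn n => r := n}
   end

val original = newCell 1
val copy = cClone original
val () = cSet (original, 5)
val fromCopy = cGet copy           (* 1: the clone has its own state *)
val fromOriginal = cGet original   (* 5 *)
----

The `clone` field returns a value of the very type being defined, which is only expressible because the record is wrapped in a datatype.
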
15806 | As in OO languages, interfaces are abstract and cannot be | |
15807 | instantiated to produce objects. To be able to instantiate objects, | |
15808 | the constructors of a concrete class are needed. In SML, we can | |
15809 | implement constructors as simple functions from arbitrary arguments to | |
15810 | values of the interface type. Such a constructor function can | |
15811 | encapsulate arbitrary private state and functions using lexical | |
15812 | closure. It is also easy to share implementations of methods between | |
15813 | two or more constructors. | |
15814 | ||
15815 | While the `Counter` example is rather trivial, it should not be | |
15816 | difficult to see that this technique quite simply doesn't require a huge | |
15817 | amount of extra verbiage and is more than usable in practice. | |
15818 | ||
15819 | ||
15820 | == SML Modules and Dynamic Dispatch == | |
15821 | ||
15822 | One might wonder about how SML modules and the dynamic dispatch | |
15823 | technique work together. Let's investigate! Let's use a simple | |
15824 | dispenser framework as a concrete example. (Note that this isn't | |
15825 | intended to be an introduction to the SML module system.) | |
15826 | ||
15827 | === Programming with SML Modules === | |
15828 | ||
15829 | Using SML signatures we can specify abstract data types (ADTs) such as | |
15830 | dispensers. Here is a signature for an "abstract" functional (as | |
15831 | opposed to imperative) dispenser: | |
15832 | ||
15833 | [source,sml] | |
15834 | ---- | |
15835 | signature ABSTRACT_DISPENSER = sig | |
15836 | type 'a t | |
15837 | val isEmpty : 'a t -> bool | |
15838 | val push : 'a * 'a t -> 'a t | |
15839 | val pop : 'a t -> ('a * 'a t) option | |
15840 | end | |
15841 | ---- | |
15842 | ||
15843 | The term "abstract" in the name of the signature refers to the fact that | |
15844 | the signature gives no way to instantiate a dispenser. It has nothing to | |
15845 | do with the concept of abstract data types. | |
15846 | ||
15847 | Using SML functors we can write "generic" algorithms that manipulate | |
15848 | dispensers of an unknown type. Here are a couple of very simple | |
15849 | algorithms: | |
15850 | ||
15851 | [source,sml] | |
15852 | ---- | |
15853 | functor DispenserAlgs (D : ABSTRACT_DISPENSER) = struct | |
15854 | open D | |
15855 | ||
15856 | fun pushAll (xs, d) = foldl push d xs | |
15857 | ||
15858 | fun popAll d = let | |
15859 | fun lp (xs, NONE) = rev xs | |
15860 | | lp (xs, SOME (x, d)) = lp (x::xs, pop d) | |
15861 | in | |
15862 | lp ([], pop d) | |
15863 | end | |
15864 | ||
15865 | fun cp (from, to) = pushAll (popAll from, to) | |
15866 | end | |
15867 | ---- | |
15868 | ||
15869 | As one can easily verify, the above compiles even without any concrete | |
15870 | dispenser structure. Functors essentially provide a form of static | |
15871 | dispatch that one can use to break compile-time dependencies. | |
15872 | ||
15873 | We can also give a signature for a concrete dispenser | |
15874 | ||
15875 | [source,sml] | |
15876 | ---- | |
15877 | signature DISPENSER = sig | |
15878 | include ABSTRACT_DISPENSER | |
15879 | val empty : 'a t | |
15880 | end | |
15881 | ---- | |
15882 | ||
15883 | and write any number of concrete structures implementing the signature. | |
15884 | For example, we could implement stacks | |
15885 | ||
15886 | [source,sml] | |
15887 | ---- | |
15888 | structure Stack :> DISPENSER = struct | |
15889 | type 'a t = 'a list | |
15890 | val empty = [] | |
15891 | val isEmpty = null | |
15892 | val push = op :: | |
15893 | val pop = List.getItem | |
15894 | end | |
15895 | ---- | |
15896 | ||
15897 | and queues | |
15898 | ||
15899 | [source,sml] | |
15900 | ---- | |
15901 | structure Queue :> DISPENSER = struct | |
15902 | datatype 'a t = T of 'a list * 'a list | |
15903 | val empty = T ([], []) | |
15904 | val isEmpty = fn T ([], _) => true | _ => false | |
15905 | val normalize = fn ([], ys) => (rev ys, []) | q => q | |
15906 | fun push (y, T (xs, ys)) = T (normalize (xs, y::ys)) | |
15907 | val pop = fn (T (x::xs, ys)) => SOME (x, T (normalize (xs, ys))) | _ => NONE | |
15908 | end | |
15909 | ---- | |
15910 | ||
15911 | One can now write code that uses either the `Stack` or the `Queue` | |
15912 | dispenser. One can also instantiate the previously defined functor to | |
15913 | create functions for manipulating dispensers of a given type: | |
15914 | ||
15915 | [source,sml] | |
15916 | ---- | |
15917 | structure S = DispenserAlgs (Stack) | |
15918 | val [4,3,2,1] = S.popAll (S.pushAll ([1,2,3,4], Stack.empty)) | |
15919 | ||
15920 | structure Q = DispenserAlgs (Queue) | |
15921 | val [1,2,3,4] = Q.popAll (Q.pushAll ([1,2,3,4], Queue.empty)) | |
15922 | ---- | |
15923 | ||
15924 | There is no dynamic dispatch involved at the module level in SML. An | |
15925 | attempt to do dynamic dispatch | |
15926 | ||
15927 | [source,sml] | |
15928 | ---- | |
15929 | val q = Q.push (1, Stack.empty) | |
15930 | ---- | |
15931 | ||
15932 | will give a type error. | |
15933 | ||
15934 | === Combining SML Modules and Dynamic Dispatch === | |
15935 | ||
15936 | Let's then combine SML modules and the dynamic dispatch technique | |
15937 | introduced in this article. First we define an interface for | |
15938 | dispensers: | |
15939 | ||
15940 | [source,sml] | |
15941 | ---- | |
15942 | structure Dispenser = struct | |
15943 | datatype 'a t = | |
15944 | I of {isEmpty : unit -> bool, | |
15945 | push : 'a -> 'a t, | |
15946 | pop : unit -> ('a * 'a t) option} | |
15947 | ||
15948 | fun O m (I t) = m t | |
15949 | ||
15950 | fun isEmpty t = O#isEmpty t () | |
15951 | fun push (v, t) = O#push t v | |
15952 | fun pop t = O#pop t () | |
15953 | end | |
15954 | ---- | |
15955 | ||
15956 | The `Dispenser` module, which we can think of as an interface for | |
15957 | dispensers, implements the `ABSTRACT_DISPENSER` signature using | |
15958 | the dynamic dispatch technique, but we leave the signature ascription | |
15959 | until later. | |
15960 | ||
15961 | Then we define a `DispenserClass` functor that makes a "class" out of | |
15962 | a given dispenser module: | |
15963 | ||
15964 | [source,sml] | |
15965 | ---- | |
15966 | functor DispenserClass (D : DISPENSER) : DISPENSER = struct | |
15967 | open Dispenser | |
15968 | ||
15969 | fun make d = | |
15970 | I {isEmpty = fn () => D.isEmpty d, | |
15971 | push = fn x => make (D.push (x, d)), | |
15972 | pop = fn () => | |
15973 | case D.pop d of | |
15974 | NONE => NONE | |
15975 | | SOME (x, d) => SOME (x, make d)} | |
15976 | ||
15977 | val empty = | |
15978 | I {isEmpty = fn () => true, | |
15979 | push = fn x => make (D.push (x, D.empty)), | |
15980 | pop = fn () => NONE} | |
15981 | end | |
15982 | ---- | |
15983 | ||
15984 | Finally we seal the `Dispenser` module: | |
15985 | ||
15986 | [source,sml] | |
15987 | ---- | |
15988 | structure Dispenser : ABSTRACT_DISPENSER = Dispenser | |
15989 | ---- | |
15990 | ||
15991 | This isn't necessary for type safety, because the unsealed `Dispenser` | |
15992 | module does not allow one to break encapsulation, but it ensures that | |
15993 | only the `DispenserClass` functor can create dispenser classes | |
15994 | (because the constructor `Dispenser.I` is no longer accessible). | |
15995 | ||
15996 | Using the `DispenserClass` functor we can turn any concrete dispenser | |
15997 | module into a dispenser class: | |
15998 | ||
15999 | [source,sml] | |
16000 | ---- | |
16001 | structure StackClass = DispenserClass (Stack) | |
16002 | structure QueueClass = DispenserClass (Queue) | |
16003 | ---- | |
16004 | ||
16005 | Each dispenser class implements the same dynamic dispatch interface | |
16006 | and the `ABSTRACT_DISPENSER` signature. | |
16007 | ||
16008 | Because the dynamic dispatch `Dispenser` module implements the | |
16009 | `ABSTRACT_DISPENSER` signature, we can use it to instantiate the | |
16010 | `DispenserAlgs` functor: | |
16011 | ||
16012 | [source,sml] | |
16013 | ---- | |
16014 | structure D = DispenserAlgs (Dispenser) | |
16015 | ---- | |
16016 | ||
16017 | The resulting `D` module, like the `Dispenser` module, works with | |
16018 | any dispenser class and uses dynamic dispatch: | |
16019 | ||
16020 | [source,sml] | |
16021 | ---- | |
16022 | val [4, 3, 2, 1] = D.popAll (D.pushAll ([1, 2, 3, 4], StackClass.empty)) | |
16023 | val [1, 2, 3, 4] = D.popAll (D.pushAll ([1, 2, 3, 4], QueueClass.empty)) | |
16024 | ---- | |
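What dynamic dispatch adds over the functor-only version is that values built by different implementations share one type, so they can be stored and processed together. The following standalone sketch condenses that idea. It does not reuse the modules above, and the names `dispenser`, `stack`, and `queue` are illustrative assumptions (the queue is deliberately naive).

[source,sml]
----
(* Condensed variant of the dispenser interface. *)
datatype 'a dispenser =
   I of {push : 'a -> 'a dispenser,
         pop : unit -> ('a * 'a dispenser) option}

fun push (x, I r) = #push r x
fun pop (I r) = #pop r ()

(* A stack "class": the list holds the elements, newest first. *)
fun stack xs =
   I {push = fn x => stack (x :: xs),
      pop = fn () =>
         case xs of
            [] => NONE
          | x :: xs' => SOME (x, stack xs')}

(* A naive queue "class": push appends at the end of the list. *)
fun queue xs =
   I {push = fn x => queue (xs @ [x]),
      pop = fn () =>
         case xs of
            [] => NONE
          | x :: xs' => SOME (x, queue xs')}

fun pushAll (xs, d) = foldl push d xs

fun popAll d =
   case pop d of
      NONE => []
    | SOME (x, d') => x :: popAll d'

(* One list, two different implementations behind the same type. *)
val results =
   map (fn d => popAll (pushAll ([1, 2, 3], d))) [stack [], queue []]
----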
16025 | ||
16026 | <<< | |
16027 | ||
16028 | :mlton-guide-page: OCaml | |
16029 | [[OCaml]] | |
16030 | OCaml | |
16031 | ===== | |
16032 | ||
16033 | http://caml.inria.fr/[OCaml] is a variant of <:ML:> and is similar to | |
16034 | <:StandardML:Standard ML>. | |
16035 | ||
16036 | == OCaml and SML == | |
16037 | ||
16038 | Here's a comparison of some aspects of the OCaml and SML languages. | |
16039 | ||
16040 | * Standard ML has a formal <:DefinitionOfStandardML:Definition>, while | |
16041 | OCaml is specified by its lone implementation and informal | |
16042 | documentation. | |
16043 | ||
16044 | * Standard ML has a number of <:StandardMLImplementations:compilers>, | |
16045 | while OCaml has only one. | |
16046 | ||
16047 | * OCaml has built-in support for object-oriented programming, while | |
16048 | Standard ML does not (however, see <:ObjectOrientedProgramming:>). | |
16049 | ||
16050 | * Andreas Rossberg has a | |
16051 | http://www.mpi-sws.org/%7Erossberg/sml-vs-ocaml.html[side-by-side | |
16052 | comparison] of the syntax of SML and OCaml. | |
16053 | ||
16054 | * Adam Chlipala has a | |
16055 | http://adam.chlipala.net/mlcomp[point-by-point comparison] of OCaml | |
16056 | and SML. | |
16057 | ||
16058 | == OCaml and MLton == | |
16059 | ||
16060 | Here's a comparison of some aspects of OCaml and MLton. | |
16061 | ||
16062 | * Performance | |
16063 | ||
16064 | ** Both OCaml and MLton have excellent performance. | |
16065 | ||
16066 | ** MLton performs extensive <:WholeProgramOptimization:>, which can | |
16067 | provide substantial improvements in large, modular programs. | |
16068 | ||
16069 | ** MLton uses native types, like 32-bit integers, without any penalty | |
16070 | due to tagging or boxing. OCaml uses 31-bit integers with a penalty | |
16071 | due to tagging, and 32-bit integers with a penalty due to boxing. | |
16072 | ||
16073 | ** MLton uses native types, like 64-bit floats, without any penalty | |
16074 | due to boxing. OCaml, in some situations, boxes 64-bit floats. | |
16075 | ||
16076 | ** MLton represents arrays of all types unboxed. In OCaml, only | |
16077 | arrays of 64-bit floats are unboxed, and then only when it is | |
16078 | syntactically apparent. | |
16079 | ||
16080 | ** MLton represents records compactly by reordering and packing the | |
16081 | fields. | |
16082 | ||
16083 | ** In MLton, polymorphic and monomorphic code have the same | |
16084 | performance. In OCaml, polymorphism can introduce a performance | |
16085 | penalty. | |
16086 | ||
16087 | ** In MLton, module boundaries have no impact on performance. In | |
16088 | OCaml, moving code between modules can cause a performance penalty. | |
16089 | ||
16090 | ** MLton's <:ForeignFunctionInterface:> is simpler than OCaml's. | |
16091 | ||
16092 | * Tools | |
16093 | ||
16094 | ** OCaml has a debugger, while MLton does not. | |
16095 | ||
16096 | ** OCaml supports separate compilation, while MLton does not. | |
16097 | ||
16098 | ** OCaml compiles faster than MLton. | |
16099 | ||
16100 | ** MLton supports profiling of both time and allocation. | |
16101 | ||
16102 | * Libraries | |
16103 | ||
16104 | ** OCaml has more available libraries. | |
16105 | ||
16106 | * Community | |
16107 | ||
16108 | ** OCaml has a larger community than MLton. | |
16109 | ||
16110 | ** MLton has a very responsive | |
16111 | http://www.mlton.org/mailman/listinfo/mlton[developer list]. | |
16112 | ||
16113 | <<< | |
16114 | ||
16115 | :mlton-guide-page: OpenGL | |
16116 | [[OpenGL]] | |
16117 | OpenGL | |
16118 | ====== | |
16119 | ||
16120 | There are at least two interfaces to OpenGL for MLton/SML, both of | |
16121 | which should be considered alpha quality. | |
16122 | ||
16123 | * <:MikeThomas:> built a low-level interface, directly translating | |
16124 | many of the functions, covering GL, GLU, and GLUT. This is available | |
16125 | in the MLton <:Sources:>: | |
16126 | <!ViewGitDir(mltonlib,master,org/mlton/mike/opengl)>. The code | |
16127 | contains a number of small, standard OpenGL examples translated to | |
16128 | SML. | |
16129 | ||
16130 | * <:ChrisClearwater:> has written at least an interface to GL, and | |
16131 | possibly more. See | |
16132 | ** http://mlton.org/pipermail/mlton/2005-January/026669.html | |
16133 | ||
16134 | <:Contact:> us for more information or an update on the status of | |
16135 | these projects. | |
16136 | ||
16137 | <<< | |
16138 | ||
16139 | :mlton-guide-page: OperatorPrecedence | |
16140 | [[OperatorPrecedence]] | |
16141 | OperatorPrecedence | |
16142 | ================== | |
16143 | ||
16144 | <:StandardML:Standard ML> has a built-in notion of precedence for | |
16145 | certain symbols. Every program that includes the | |
16146 | <:BasisLibrary:Basis Library> automatically gets the following infix | |
16147 | declarations. A higher number indicates higher precedence. | |
16148 | ||
16149 | [source,sml] | |
16150 | ---- | |
16151 | infix 7 * / mod div | |
16152 | infix 6 + - ^ | |
16153 | infixr 5 :: @ | |
16154 | infix 4 = <> > >= < <= | |
16155 | infix 3 := o | |
16156 | infix 0 before | |
16157 | ---- | |
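As a quick illustration of how these declarations affect parsing, the bindings below each succeed only if the expression groups as the comment says. The operator `+++` is a made-up example of a user-declared fixity, not part of the Basis Library.

[source,sml]
----
(* "*" (level 7) binds tighter than "+" (level 6). *)
val 7 = 1 + 2 * 3

(* "+" (level 6) binds tighter than "=" (level 4). *)
val true = (1 + 1 = 2)

(* infix operators associate to the left ... *)
val 2 = 10 - 5 - 3

(* ... except "::" and "@", which are declared infixr. *)
val [1, 2, 3, 4] = 1 :: 2 :: [3] @ [4]

(* User code can declare fixity for its own operators in the same way. *)
infix 6 +++
fun (a, b) +++ (c, d) = (a + c, b + d)
val (9, 12) = (1, 2) +++ (2 * 4, 10)
----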
16158 | ||
16159 | <<< | |
16160 | ||
16161 | :mlton-guide-page: OptionalArguments | |
16162 | [[OptionalArguments]] | |
16163 | OptionalArguments | |
16164 | ================= | |
16165 | ||
16166 | <:StandardML:Standard ML> does not have built-in support for optional | |
16167 | arguments. Nevertheless, using <:Fold:>, it is easy to define | |
16168 | functions that take optional arguments. | |
16169 | ||
16170 | For example, suppose that we have the following definition of a | |
16171 | function `f`. | |
16172 | ||
16173 | [source,sml] | |
16174 | ---- | |
16175 | fun f (i, r, s) = | |
16176 | concat [Int.toString i, ", ", Real.toString r, ", ", s] | |
16177 | ---- | |
16178 | ||
16179 | Using the `OptionalArg` structure described below, we can define a | |
16180 | function `f'`, an optionalized version of `f`, that takes 0, 1, 2, or | |
16181 | 3 arguments. Embedded within `f'` will be default values for `i`, | |
16182 | `r`, and `s`. If `f'` gets no arguments, then all the defaults are | |
16183 | used. If `f'` gets one argument, then that will be used for `i`. Two | |
16184 | arguments will be used for `i` and `r` respectively. Three arguments | |
16185 | will override all default values. Calls to `f'` will look like the | |
16186 | following. | |
16187 | ||
16188 | [source,sml] | |
16189 | ---- | |
16190 | f' $ | |
16191 | f' `2 $ | |
16192 | f' `2 `3.0 $ | |
16193 | f' `2 `3.0 `"four" $ | |
16194 | ---- | |
16195 | ||
16196 | The optional argument indicator, +`+, is not special syntax --- | |
16197 | it is a normal SML value, defined in the `OptionalArg` structure | |
16198 | below. | |
16199 | ||
16200 | Here is the definition of `f'` using the `OptionalArg` structure, in | |
16201 | particular, `OptionalArg.make` and `OptionalArg.D`. | |
16202 | ||
16203 | [source,sml] | |
16204 | ---- | |
16205 | val f' = | |
16206 | fn z => | |
16207 | let open OptionalArg in | |
16208 | make (D 1) (D 2.0) (D "three") $ | |
16209 | end (fn i & r & s => f (i, r, s)) | |
16210 | z | |
16211 | ---- | |
16212 | ||
16213 | The definition of `f'` is eta expanded as with all uses of fold. A | |
16214 | call to `OptionalArg.make` is supplied with a variable number of | |
16215 | defaults (in this case, three), the end-of-arguments terminator, `$`, | |
16216 | and the function to run, taking its arguments as an n-ary | |
16217 | <:ProductType:product>. In this case, the function simply converts | |
16218 | the product to an ordinary tuple and calls `f`. Often, the function | |
16219 | body will simply be written directly. | |
16220 | ||
16221 | In general, the definition of an optional-argument function looks like | |
16222 | the following. | |
16223 | ||
16224 | [source,sml] | |
16225 | ---- | |
16226 | val f = | |
16227 | fn z => | |
16228 | let open OptionalArg in | |
16229 | make (D <default1>) (D <default2>) ... (D <defaultn>) $ | |
16230 | end (fn x1 & x2 & ... & xn => | |
16231 | <function code goes here>) | |
16232 | z | |
16233 | ---- | |
16234 | ||
16235 | Here is the definition of `OptionalArg`. | |
16236 | ||
16237 | [source,sml] | |
16238 | ---- | |
16239 | structure OptionalArg = | |
16240 | struct | |
16241 | val make = | |
16242 | fn z => | |
16243 | Fold.fold | |
16244 | ((id, fn (f, x) => f x), | |
16245 | fn (d, r) => fn func => | |
16246 | Fold.fold ((id, d ()), fn (f, d) => | |
16247 | let | |
16248 | val d & () = r (id, f d) | |
16249 | in | |
16250 | func d | |
16251 | end)) | |
16252 | z | |
16253 | ||
16254 | fun D d = Fold.step0 (fn (f, r) => | |
16255 | (fn ds => f (d & ds), | |
16256 | fn (f, a & b) => r (fn x => f a & x, b))) | |
16257 | ||
16258 | val ` = | |
16259 | fn z => | |
16260 | Fold.step1 (fn (x, (f, _ & d)) => (fn d => f (x & d), d)) | |
16261 | z | |
16262 | end | |
16263 | ---- | |
16264 | ||
16265 | `OptionalArg.make` uses a nested fold. The first `fold` accumulates | |
16266 | the default values in a product, associated to the right, and a | |
16267 | reversal function that converts a product (of the same arity as the | |
16268 | number of defaults) from right associativity to left associativity. | |
16269 | The accumulated defaults are used by the second fold, which recurs | |
16270 | over the product, replacing the appropriate component as it encounters | |
16271 | optional arguments. The second fold also constructs a "fill" | |
16272 | function, `f`, that is used to reconstruct the product once the | |
16273 | end-of-arguments is reached. Finally, the finisher reconstructs the | |
16274 | product and uses the reversal function to convert the product from | |
16275 | right associative to left associative, at which point it is passed to | |
16276 | the user-supplied function. | |
16277 | ||
16278 | Much of the complexity comes from the fact that while recurring over a | |
16279 | product from left to right, one wants it to be right-associative, | |
16280 | e.g., look like | |
16281 | ||
16282 | [source,sml] | |
16283 | ---- | |
16284 | a & (b & (c & d)) | |
16285 | ---- | |
16286 | ||
16287 | but the user function in the end wants the product to be left | |
16288 | associative, so that the product argument pattern can be written | |
16289 | without parentheses (since `&` is left associative). | |
16290 | ||
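The difference between the two associations can be seen in isolation with the small sketch below. It assumes the usual definition of the binary product constructor `&` (declared here so that the example is self-contained) and is not part of `OptionalArg` itself.

[source,sml]
----
(* Assumed definition of the binary product type. *)
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Left-associated, as the user function receives it: no parentheses. *)
val a & b & c = 1 & "two" & 3.0

(* The same value with the grouping made explicit. *)
val (a' & b') & c' = (1 & "two") & 3.0

(* Right-associated, the shape used while recurring left to right. *)
val x & (y & z) = 1 & ("two" & 3.0)
----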
16291 | ||
16292 | == Labelled optional arguments == | |
16293 | ||
16294 | In addition to the positional optional arguments described above, it | |
16295 | is sometimes useful to have labelled optional arguments. These allow | |
16296 | one to define a function, `f`, with defaults, say `a` and `b`. Then, | |
16297 | a caller of `f` can supply values for `a` and `b` by name. If no | |
16298 | value is supplied then the default is used. | |
16299 | ||
16300 | Labelled optional arguments are a simple extension of | |
16301 | <:FunctionalRecordUpdate:> using post composition. Suppose, for | |
16302 | example, that one wants a function `f` with labelled optional | |
16303 | arguments `a` and `b` with default values `0` and `0.0` respectively. | |
16304 | If one has a functional-record-update function `updateAB` for records | |
16305 | with `a` and `b` fields, then one can define `f` in the following way. | |
16306 | ||
16307 | [source,sml] | |
16308 | ---- | |
16309 | val f = | |
16310 | fn z => | |
16311 | Fold.post | |
16312 | (updateAB {a = 0, b = 0.0}, | |
16313 | fn {a, b} => print (concat [Int.toString a, " ", | |
16314 | Real.toString b, "\n"])) | |
16315 | z | |
16316 | ---- | |
16317 | ||
16318 | The idea is that `f` is the post composition (using `Fold.post`) of | |
16319 | the actual code for the function with a functional-record updater that | |
16320 | starts with the defaults. | |
16321 | ||
16322 | Here are some example calls to `f`. | |
16323 | [source,sml] | |
16324 | ---- | |
16325 | val () = f $ | |
16326 | val () = f (U#a 13) $ | |
16327 | val () = f (U#a 13) (U#b 17.5) $ | |
16328 | val () = f (U#b 17.5) (U#a 13) $ | |
16329 | ---- | |
16330 | ||
16331 | Notice that a caller can supply neither of the arguments, either of | |
16332 | the arguments, or both of the arguments, and in either order. All | |
16333 | that matters is that the arguments be labelled correctly (and of the | |
16334 | right type, of course). | |
16335 | ||
16336 | Here is another example. | |
16337 | ||
16338 | [source,sml] | |
16339 | ---- | |
16340 | val f = | |
16341 | fn z => | |
16342 | Fold.post | |
16343 | (updateBCD {b = 0, c = 0.0, d = "<>"}, | |
16344 | fn {b, c, d} => | |
16345 | print (concat [Int.toString b, " ", | |
16346 | Real.toString c, " ", | |
16347 | d, "\n"])) | |
16348 | z | |
16349 | ---- | |
16350 | ||
16351 | Here are some example calls. | |
16352 | ||
16353 | [source,sml] | |
16354 | ---- | |
16355 | val () = f $ | |
16356 | val () = f (U#d "goodbye") $ | |
16357 | val () = f (U#d "hello") (U#b 17) (U#c 19.3) $ | |
16358 | ---- | |
16359 | ||
16360 | <<< | |
16361 | ||
16362 | :mlton-guide-page: Overloading | |
16363 | [[Overloading]] | |
16364 | Overloading | |
16365 | =========== | |
16366 | ||
16367 | In <:StandardML:Standard ML>, constants (like `13`, `0w13`, `13.0`) | |
16368 | are overloaded, meaning that they can denote a constant of the | |
16369 | appropriate type as determined by context. SML defines the | |
16370 | overloading classes _Int_, _Real_, and _Word_, which denote the sets | |
16371 | of types that integer, real, and word constants may take on. In | |
16372 | MLton, these are defined as follows. | |
16373 | ||
16374 | [cols="^25%,<75%"] | |
16375 | |===== | |
16376 | | _Int_ | `Int2.int`, `Int3.int`, ... `Int32.int`, `Int64.int`, `Int.int`, `IntInf.int`, `LargeInt.int`, `FixedInt.int`, `Position.int` | |
16377 | | _Real_ | `Real32.real`, `Real64.real`, `Real.real`, `LargeReal.real` | |
16378 | | _Word_ | `Word2.word`, `Word3.word`, ... `Word32.word`, `Word64.word`, `Word.word`, `LargeWord.word`, `SysWord.word` | |
16379 | |===== | |
16380 | ||
16381 | The <:DefinitionOfStandardML:Definition> allows flexibility in how | |
16382 | much context is used to resolve overloading. It says that the context | |
16383 | is _no larger than the smallest enclosing structure-level | |
16384 | declaration_, but that _an implementation may require that a smaller | |
16385 | context determines the type_. MLton uses the largest possible context | |
16386 | allowed by SML in resolving overloading. If the type of a constant is | |
16387 | not determined by context, then it takes on a default type. In MLton, | |
16388 | these are defined as follows. | |
16389 | ||
16390 | [cols="^25%,<75%"] | |
16391 | |===== | |
16392 | | _Int_ | `Int.int` | |
16393 | | _Real_ | `Real.real` | |
16394 | | _Word_ | `Word.word` | |
16395 | |===== | |
16396 | ||
16397 | Other implementations may use a smaller context or different default | |
16398 | types. | |
16399 | ||
16400 | == Also see == | |
16401 | ||
16402 | * http://www.standardml.org/Basis/top-level-chapter.html[discussion of overloading in the Basis Library] | |
16403 | ||
16404 | == Examples == | |
16405 | ||
16406 | * The following program is rejected. | |
16407 | + | |
16408 | [source,sml] | |
16409 | ---- | |
16410 | structure S: | |
16411 | sig | |
16412 | val x: Word8.word | |
16413 | end = | |
16414 | struct | |
16415 | val x = 0w0 | |
16416 | end | |
16417 | ---- | |
16418 | + | |
16419 | The smallest enclosing structure-level declaration for `0w0` is | |
16420 | `val x = 0w0`. Hence, `0w0` receives the default type for words, | |
16421 | which is `Word.word`. | |
16422 | ||
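One way to make such a program acceptable is to put the type constraint inside the declaration that contains the constant. The sketch below shows this, together with a case where an overloaded operator supplies the context and a few constants that fall back to the default types listed above.

[source,sml]
----
structure S:
   sig
      val x: Word8.word
   end =
   struct
      (* The constraint now lies within the declaration containing 0w0. *)
      val x: Word8.word = 0w0
   end

(* The other operand of an overloaded operator can supply the context. *)
val y = Word8.fromInt 1 + 0w2

(* With no constraining context, the default types apply. *)
val i = 13      (* Int.int *)
val w = 0w13    (* Word.word *)
val r = 13.0    (* Real.real *)
----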
16423 | <<< | |
16424 | ||
16425 | :mlton-guide-page: PackedRepresentation | |
16426 | [[PackedRepresentation]] | |
16427 | PackedRepresentation | |
16428 | ==================== | |
16429 | ||
16430 | <:PackedRepresentation:> is an analysis pass for the <:SSA2:> | |
16431 | <:IntermediateLanguage:>, invoked from <:ToRSSA:>. | |
16432 | ||
16433 | == Description == | |
16434 | ||
16435 | This pass analyzes an <:SSA2:> program to compute a packed | |
16436 | representation for each object. | |
16437 | ||
16438 | == Implementation == | |
16439 | ||
16440 | * <!ViewGitFile(mlton,master,mlton/backend/representation.sig)> | |
16441 | * <!ViewGitFile(mlton,master,mlton/backend/packed-representation.fun)> | |
16442 | ||
16443 | == Details and Notes == | |
16444 | ||
16445 | Has a special case to make sure that `true` is represented as `1` and | |
16446 | `false` is represented as `0`. | |
16447 | ||
16448 | <<< | |
16449 | ||
16450 | :mlton-guide-page: ParallelMove | |
16451 | [[ParallelMove]] | |
16452 | ParallelMove | |
16453 | ============ | |
16454 | ||
16455 | <:ParallelMove:> is a rewrite pass, agnostic in the | |
16456 | <:IntermediateLanguage:> which it produces. | |
16457 | ||
16458 | == Description == | |
16459 | ||
16460 | This function computes a sequence of individual moves to effect a | |
16461 | parallel move (with possibly overlapping froms and tos). | |
16462 | ||
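The problem can be illustrated with a small standalone sketch. This is not MLton's implementation (see the files listed under Implementation); it is one simple way to sequentialize a parallel move. It assumes that locations are named by strings, that all destinations are distinct, and that a scratch location `tmp` not mentioned in any move is available. The names `move` and `sequentialize` are hypothetical.

[source,sml]
----
type move = {dst : string, src : string}

(* Turn a parallel move into an equivalent list of sequential moves. *)
fun sequentialize (moves : move list, tmp : string) : move list =
   let
      fun isRead (r, ms : move list) = List.exists (fn m => #src m = r) ms

      fun loop ([], acc) = rev acc
        | loop (pending : move list, acc) =
            case List.partition
                    (fn m => not (isRead (#dst m, pending))) pending of
               (ready as _ :: _, blocked) =>
                  (* No pending move reads these destinations: emit them. *)
                  loop (blocked, List.revAppend (ready, acc))
             | ([], {dst, src} :: rest) =>
                  (* Only cycles remain: save dst in tmp, redirect readers. *)
                  let
                     fun redirect (m : move) =
                        if #src m = dst
                           then {dst = #dst m, src = tmp}
                        else m
                  in
                     loop ({dst = dst, src = src} :: map redirect rest,
                           {dst = tmp, src = dst} :: acc)
                  end
             | ([], []) => rev acc
   in
      (* Moves from a location to itself are no-ops and are dropped. *)
      loop (List.filter (fn m => #dst m <> #src m) moves, [])
   end

(* Swapping a and b needs the scratch location. *)
val swap =
   sequentialize ([{dst = "a", src = "b"}, {dst = "b", src = "a"}], "t")
----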
16463 | == Implementation == | |
16464 | ||
16465 | * <!ViewGitFile(mlton,master,mlton/backend/parallel-move.sig)> | |
16466 | * <!ViewGitFile(mlton,master,mlton/backend/parallel-move.fun)> | |
16467 | ||
16468 | == Details and Notes == | |
16469 | ||
16470 | {empty} | |
16471 | ||
16472 | <<< | |
16473 | ||
16474 | :mlton-guide-page: Performance | |
16475 | [[Performance]] | |
16476 | Performance | |
16477 | =========== | |
16478 | ||
16479 | This page compares the performance of a number of SML compilers on a | |
16480 | range of benchmarks. | |
16481 | ||
16482 | The following SML compiler versions are compared. | |
16483 | ||
16484 | * <:Home:MLton> 20171211 (git 79d4a623c) | |
16485 | * <:MLKit:ML Kit> 4.3.12 (20171210) | |
16486 | * <:MoscowML:Moscow ML> 2.10.1 ++ (git f529b33bb, 20170711) | |
16487 | * <:PolyML:Poly/ML> 5.7.2 Testing (git 5.7.1-35-gcb73407a) | |
16488 | * <:SMLNJ:SML/NJ> 110.81 (20170501) | |
16489 | ||
16490 | There are tables for <:#RunTime:run time>, <:#CodeSize:code size>, and | |
16491 | <:#CompileTime:compile time>. | |
16492 | ||
16493 | ||
16494 | == Setup == | |
16495 | ||
16496 | All benchmarks were compiled and run on a 2.6 GHz Core i7-5600U with 16 GB of | |
16497 | RAM. The benchmarks were compiled with the default settings for all | |
16498 | the compilers, except for Moscow ML, which was passed the | |
16499 | `-orthodox -standalone -toplevel` switches. The Poly/ML executables | |
16500 | were produced using `polyc`. | |
16501 | The SML/NJ executables were produced by wrapping the entire program in | |
16502 | a `local` declaration whose body performs an `SMLofNJ.exportFn`. | |
16503 | ||
16504 | For more details, or if you want to run the benchmarks yourself, | |
16505 | please see the <!ViewGitDir(mlton,master,benchmark)> directory of our | |
16506 | <:Sources:>. | |
16507 | ||
16508 | All of the benchmarks are available for download from this page. Some | |
16509 | of the benchmarks were obtained from the SML/NJ benchmark suite. Some | |
16510 | of the benchmarks expect certain input files to exist in the | |
16511 | <!ViewGitDir(mlton,master,benchmark/tests/DATA)> subdirectory. | |
16512 | ||
16513 | * <!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/hamlet-input.sml)> | |
16514 | * <!RawGitFile(mlton,master,benchmark/tests/ray.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ray)> | |
16515 | * <!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/chess.gml)> | |
16516 | * <!RawGitFile(mlton,master,benchmark/tests/vliw.sml)> <!RawGitFile(mlton,master,benchmark/tests/DATA/ndotprod.s)> | |
16517 | ||
16518 | ||
16519 | == <!Anchor(RunTime)>Run-time ratio == | |
16520 | ||
16521 | The following table gives the ratio of the run time of each benchmark | |
16522 | when compiled by another compiler to the run time when compiled by | |
16523 | MLton. That is, the larger the number, the slower the generated code | |
16524 | runs. A number larger than one indicates that the corresponding | |
16525 | compiler produces code that runs more slowly than MLton. A * in an | |
16526 | entry means the compiler failed to compile the benchmark or that the | |
16527 | benchmark failed to run. | |
16528 | ||
16529 | [options="header",cols="<2,5*<1"] | |
16530 | |==== | |
16531 | |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ | |
16532 | |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|1.00|10.11|19.36|2.98|1.24 | |
16533 | |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|1.00|*|7.87|1.22|1.75 | |
16534 | |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|1.00|30.79|*|10.94|9.08 | |
16535 | |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|1.00|6.51|40.42|2.34|2.32 | |
16536 | |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|1.00|0.97|*|0.60|* | |
16537 | |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|1.00|0.50|11.50|0.42|0.42 | |
16538 | |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|1.00|7.35|81.51|4.03|1.19 | |
16539 | |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|1.00|1.41|10.94|1.25|1.17 | |
16540 | |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|1.00|7.19|68.33|5.28|13.16 | |
16541 | |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1.00|4.97|22.85|1.58|* | |
16542 | |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|1.00|4.99|57.84|3.34|4.67 | |
16543 | |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|1.00|*|18.43|3.18|3.06 | |
16544 | |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|1.00|2.76|7.94|3.19|* | |
16545 | |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|1.00|1.80|20.19|0.89|1.50 | |
16546 | |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|1.00|5.10|11.06|1.15|1.27 | |
16547 | |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|1.00|3.50|25.52|1.33|1.28 | |
16548 | |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|1.00|29.40|183.02|7.41|15.19 | |
16549 | |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|1.00|95.18|*|32.61|47.47 | |
16550 | |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|1.00|1.42|*|0.74|3.24 | |
16551 | |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|1.00|1.83|8.45|0.84|* | |
16552 | |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|1.00|4.03|12.42|1.70|2.25 | |
16553 | |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|1.00|3.73|57.44|2.05|3.22 | |
16554 | |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|1.00|3.96|*|1.73|1.20 | |
16555 | |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|1.00|6.26|30.85|7.82|5.99 | |
16556 | |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|1.00|9.37|44.78|2.18|2.15 | |
16557 | |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|1.00|*|*|2.79|3.59 | |
16558 | |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|1.00|5.68|165.56|3.92|37.52 | |
16559 | |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|1.00|12.05|25.08|8.73|1.75 | |
16560 | |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|1.00|*|*|2.11|3.33 | |
16561 | |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|1.00|2.95|24.03|3.67|1.93 | |
16562 | |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|1.00|*|*|1.04|* | |
16563 | |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|1.00|1.88|28.01|0.70|2.67 | |
16564 | |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|1.00|1.58|23.57|0.90|1.04 | |
16565 | |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|1.00|1.69|15.90|1.57|2.01 | |
16566 | |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|1.00|*|*|*|2.07 | |
16567 | |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|1.00|2.19|66.76|3.27|1.48 | |
16568 | |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|1.00|*|19.43|1.08|1.03 | |
16569 | |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|1.00|13.85|*|1.80|12.48 | |
16570 | |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|1.00|*|*|*|13.92 | |
16571 | |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|1.00|7.88|68.85|9.39|68.80 | |
16572 | |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|1.00|2.46|15.39|1.43|1.55 | |
16573 | |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|1.00|6.00|*|29.25|9.54 | |
16574 | |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|1.00|80.43|*|19.45|8.71 | |
16575 | |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|1.00|4.62|35.56|1.68|9.97 | |
16576 | |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|1.00|*|*|*|1.60 | |
16577 | |==== | |
16578 | ||
16579 | <!Anchor(SNFNote)> | |
16580 | Note: for SML/NJ, the | |
16581 | <!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)> | |
16582 | benchmark was killed after running for over 51,000 seconds. | |
16583 | ||
16584 | ||
16585 | == <!Anchor(CodeSize)>Code size == | |
16586 | ||
16587 | The following table gives the code size of each benchmark in bytes. | |
16588 | The size for MLton and the ML Kit is the sum of text and data for the | |
16589 | standalone executable as reported by `size`. The size for Moscow | |
16590 | ML is the size in bytes of the executable `a.out`. The size for | |
16591 | Poly/ML is the difference in size of the database before the session | |
16592 | start and after the commit. The size for SML/NJ is the size of the | |
16593 | heap file created by `exportFn` and does not include the size of | |
16594 | the SML/NJ runtime system (approximately 100K). A * in an entry means | |
16595 | that the compiler failed to compile the benchmark. | |
16596 | ||
16597 | [options="header",cols="<2,5*<1"] | |
16598 | |==== | |
16599 | |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ | |
16600 | |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|180,788|810,267|199,503|148,120|402,480 | |
16601 | |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|250,246|*|248,018|196,984|496,664 | |
16602 | |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|122,422|225,274|*|106,088|406,560 | |
16603 | |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|151,878|250,126|187,048|144,032|428,136 | |
16604 | |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|223,073|827,483|*|272,664|* | |
16605 | |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|122,350|87,586|181,415|106,072|380,928 | |
16606 | |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|145,008|237,230|186,228|131,400|418,896 | |
16607 | |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|122,310|87,402|181,312|106,088|380,928 | |
16608 | |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|121,958|104,102|181,464|106,072|394,256 | |
16609 | |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|1,503,849|2,280,691|407,219|2,249,504|* | |
16610 | |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|122,078|89,346|181,470|106,088|381,952 | |
16611 | |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|193,145|*|192,659|161,080|400,408 | |
16612 | |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|308,296|826,819|213,128|268,272|* | |
16613 | |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|141,862|721,419|186,463|118,552|384,024 | |
16614 | |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|211,086|782,667|188,908|198,408|409,624 | |
16615 | |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|122,086|700,075|183,037|106,104|386,048 | |
16616 | |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|124,398|280,006|184,328|110,232|416,784 | |
16617 | |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|150,497|271,794|*|122,624|399,416 | |
16618 | |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|123,846|100,858|181,542|106,136|381,960 | |
16619 | |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|678,920|1,233,587|263,721|576,728|* | |
16620 | |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|846,779|1,432,283|297,108|777,664|985,304 | |
16621 | |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|124,126|229,078|184,440|114,584|392,232 | |
16622 | |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|298,038|507,186|*|475,808|456,744 | |
16623 | |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|157,973|699,003|181,680|118,800|380,928 | |
16624 | |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|156,401|201,138|183,438|110,456|385,072 | |
16625 | |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|126,486|106,166|*|106,088|393,256 | |
16626 | |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|150,174|265,694|190,088|184,536|414,760 | |
16627 | |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|260,863|736,795|195,064|198,976|512,160 | |
16628 | |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|384,905|*|*|446,424|623,824 | |
16629 | |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|365,578|895,139|197,765|1,051,952|708,696 | |
16630 | |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|286,474|*|*|262,616|547,984 | |
16631 | |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|119,102|140,626|183,249|106,088|390,160 | |
16632 | |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|122,110|87,890|181,369|106,072|381,952 | |
16633 | |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|122,246|87,402|181,349|106,088|376,832 | |
16634 | |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|186,545|*|*|*|421,984 | |
16635 | |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|163,033|722,571|188,634|126,984|393,264 | |
16636 | |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|235,449|*|195,401|184,816|478,296 | |
16637 | |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|123,790|104,398|*|106,200|394,256 | |
16638 | |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|123,846|*|*|*|405,552 | |
16639 | |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|122,982|104,614|181,534|106,072|394,256 | |
16640 | |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|538,074|1,182,851|249,884|580,792|749,752 | |
16641 | |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|186,152|699,459|191,347|127,200|386,048 | |
16642 | |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|196,232|700,131|191,539|127,232|387,072 | |
16643 | |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|230,433|128,354|186,322|127,048|390,184 | |
16644 | |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|156,902|*|*|*|453,768 | |
16645 | |==== | |
16646 | ||
16647 | ||
16648 | == <!Anchor(CompileTime)>Compile time == | |
16649 | ||
16650 | The following table gives the compile time of each benchmark in | |
16651 | seconds. A * in an entry means that the compiler failed to compile | |
16652 | the benchmark. | |
16653 | ||
16654 | [options="header",cols="<2,5*<1"] | |
16655 | |==== | |
16656 | |benchmark|MLton|ML-Kit|MosML|Poly/ML|SML/NJ | |
16657 | |<!RawGitFile(mlton,master,benchmark/tests/barnes-hut.sml)>|2.70|0.89|0.15|0.29|0.20 | |
16658 | |<!RawGitFile(mlton,master,benchmark/tests/boyer.sml)>|2.87|*|0.14|0.20|0.41 | |
16659 | |<!RawGitFile(mlton,master,benchmark/tests/checksum.sml)>|2.21|0.24|*|0.07|0.05 | |
16660 | |<!RawGitFile(mlton,master,benchmark/tests/count-graphs.sml)>|2.28|0.34|0.04|0.11|0.21 | |
16661 | |<!RawGitFile(mlton,master,benchmark/tests/DLXSimulator.sml)>|2.93|1.01|*|0.27|* | |
16662 | |<!RawGitFile(mlton,master,benchmark/tests/even-odd.sml)>|2.23|0.20|0.01|0.07|0.04 | |
16663 | |<!RawGitFile(mlton,master,benchmark/tests/fft.sml)>|2.35|0.28|0.03|0.09|0.10 | |
16664 | |<!RawGitFile(mlton,master,benchmark/tests/fib.sml)>|2.16|0.19|0.01|0.07|0.04 | |
16665 | |<!RawGitFile(mlton,master,benchmark/tests/flat-array.sml)>|2.16|0.20|0.01|0.07|0.04 | |
16666 | |<!RawGitFile(mlton,master,benchmark/tests/hamlet.sml)>|12.28|19.25|23.75|6.44|* | |
16667 | |<!RawGitFile(mlton,master,benchmark/tests/imp-for.sml)>|2.14|0.20|0.01|0.08|0.04 | |
16668 | |<!RawGitFile(mlton,master,benchmark/tests/knuth-bendix.sml)>|2.48|*|0.08|0.14|0.23 | |
16669 | |<!RawGitFile(mlton,master,benchmark/tests/lexgen.sml)>|3.31|0.75|0.15|0.22|* | |
16670 | |<!RawGitFile(mlton,master,benchmark/tests/life.sml)>|2.25|0.32|0.03|0.09|0.10 | |
16671 | |<!RawGitFile(mlton,master,benchmark/tests/logic.sml)>|2.72|0.57|0.07|0.17|0.21 | |
16672 | |<!RawGitFile(mlton,master,benchmark/tests/mandelbrot.sml)>|2.14|0.24|0.01|0.07|0.04 | |
16673 | |<!RawGitFile(mlton,master,benchmark/tests/matrix-multiply.sml)>|2.14|0.24|0.01|0.08|0.05 | |
16674 | |<!RawGitFile(mlton,master,benchmark/tests/md5.sml)>|2.31|0.39|*|0.12|0.27 | |
16675 | |<!RawGitFile(mlton,master,benchmark/tests/merge.sml)>|2.15|0.21|0.01|0.07|0.04 | |
16676 | |<!RawGitFile(mlton,master,benchmark/tests/mlyacc.sml)>|7.07|4.53|2.05|0.80|* | |
16677 | |<!RawGitFile(mlton,master,benchmark/tests/model-elimination.sml)>|6.78|4.76|1.20|1.65|4.78 | |
16678 | |<!RawGitFile(mlton,master,benchmark/tests/mpuz.sml)>|2.14|0.28|0.02|0.08|0.07 | |
16679 | |<!RawGitFile(mlton,master,benchmark/tests/nucleic.sml)>|3.96|2.12|*|0.37|0.49 | |
16680 | |<!RawGitFile(mlton,master,benchmark/tests/output1.sml)>|2.30|0.22|0.01|0.07|0.04 | |
16681 | |<!RawGitFile(mlton,master,benchmark/tests/peek.sml)>|2.26|0.20|0.01|0.07|0.04 | |
16682 | |<!RawGitFile(mlton,master,benchmark/tests/psdes-random.sml)>|2.12|0.22|*|9.83|12.55 | |
16683 | |<!RawGitFile(mlton,master,benchmark/tests/ratio-regions.sml)>|2.59|0.47|0.07|0.16|0.24 | |
16684 | |<!RawGitFile(mlton,master,benchmark/tests/ray.sml)>|2.95|0.46|0.05|0.17|0.14 | |
16685 | |<!RawGitFile(mlton,master,benchmark/tests/raytrace.sml)>|3.93|*|*|0.45|0.74 | |
16686 | |<!RawGitFile(mlton,master,benchmark/tests/simple.sml)>|3.42|1.23|0.30|0.32|0.53 | |
16687 | |<!RawGitFile(mlton,master,benchmark/tests/smith-normal-form.sml)>|3.23|*|*|0.15|0.32 | |
16688 | |<!RawGitFile(mlton,master,benchmark/tests/string-concat.sml)>|2.25|0.28|0.01|0.08|0.05 | |
16689 | |<!RawGitFile(mlton,master,benchmark/tests/tailfib.sml)>|2.24|0.21|0.01|0.08|0.05 | |
16690 | |<!RawGitFile(mlton,master,benchmark/tests/tak.sml)>|2.23|0.20|0.01|0.08|0.05 | |
16691 | |<!RawGitFile(mlton,master,benchmark/tests/tensor.sml)>|2.73|*|*|*|0.44 | |
16692 | |<!RawGitFile(mlton,master,benchmark/tests/tsp.sml)>|2.42|0.38|0.05|0.11|0.11 | |
16693 | |<!RawGitFile(mlton,master,benchmark/tests/tyan.sml)>|2.93|*|0.10|0.27|0.31 | |
16694 | |<!RawGitFile(mlton,master,benchmark/tests/vector32-concat.sml)>|2.23|0.22|*|0.07|0.04 | |
16695 | |<!RawGitFile(mlton,master,benchmark/tests/vector64-concat.sml)>|2.18|*|*|*|0.04 | |
16696 | |<!RawGitFile(mlton,master,benchmark/tests/vector-rev.sml)>|2.23|0.22|0.01|0.08|0.05 | |
16697 | |<!RawGitFile(mlton,master,benchmark/tests/vliw.sml)>|5.25|2.93|0.63|0.94|1.85 | |
16698 | |<!RawGitFile(mlton,master,benchmark/tests/wc-input1.sml)>|2.46|0.24|0.01|0.08|0.05 | |
16699 | |<!RawGitFile(mlton,master,benchmark/tests/wc-scanStream.sml)>|2.61|0.25|0.01|0.08|0.05 | |
16700 | |<!RawGitFile(mlton,master,benchmark/tests/zebra.sml)>|2.99|0.35|0.03|0.09|0.11 | |
16701 | |<!RawGitFile(mlton,master,benchmark/tests/zern.sml)>|2.31|*|*|*|0.11 | |
16702 | |==== | |
16703 | ||
16704 | <<< | |
16705 | ||
16706 | :mlton-guide-page: PhantomType | |
16707 | [[PhantomType]] | |
16708 | PhantomType | |
16709 | =========== | |
16710 | ||
16711 | A phantom type is a type that has no run-time representation, but is | |
16712 | used to force the type checker to ensure invariants at compile time. | |
16713 | This is done by augmenting a type with additional arguments (phantom | |
16714 | type variables) and expressing constraints by choosing phantom types | |
16715 | to stand for the phantom type variables in the types of values. | |
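
As a minimal sketch of the idea, the following hypothetical `Length` structure (its names are illustrative and not taken from MLton or the Basis Library) tags a `real` with a phantom unit of measure. The type variable `'unit` never appears in the representation, yet the type checker uses it to refuse mixing meters and feet.

[source,sml]
----
(* Hypothetical example: lengths tagged with a phantom unit type. *)
signature LENGTH =
   sig
      type 'unit t   (* 'unit is a phantom type variable *)
      type meters    (* phantom types; no values of them are ever built *)
      type feet
      val meters: real -> meters t
      val feet: real -> feet t
      val add: 'unit t * 'unit t -> 'unit t
      val toReal: 'unit t -> real
   end

structure Length :> LENGTH =
   struct
      type 'unit t = real   (* the phantom argument has no run-time representation *)
      type meters = unit
      type feet = unit
      fun meters x = x
      fun feet x = x
      fun add (x: real, y) = x + y
      fun toReal x = x
   end

(* Accepted: both arguments have type Length.meters Length.t. *)
val total = Length.add (Length.meters 1.0, Length.meters 2.5)

(* Rejected at compile time, because meters and feet differ:
 *   Length.add (Length.meters 1.0, Length.feet 2.5)
 *)
----

The opaque signature ascription (`:>`) is what keeps `meters` and `feet` distinct outside the structure, even though both are implemented as `unit`.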
16716 | ||
16717 | == Also see == | |
16718 | ||
16719 | * <!Cite(Blume01)> | |
16720 | ** dimensions | |
16721 | ** C type system | |
16722 | * <!Cite(FluetPucella06)> | |
16723 | ** subtyping | |
16724 | * socket module in <:BasisLibrary:Basis Library> | |
16725 | ||
16726 | <<< | |
16727 | ||
16728 | :mlton-guide-page: PlatformSpecificNotes | |
16729 | [[PlatformSpecificNotes]] | |
16730 | PlatformSpecificNotes | |
16731 | ===================== | |
16732 | ||
16733 | Here are notes about using MLton on the following platforms. | |
16734 | ||
16735 | == Operating Systems == | |
16736 | ||
16737 | * <:RunningOnAIX:AIX> | |
16738 | * <:RunningOnCygwin:Cygwin> | |
16739 | * <:RunningOnDarwin:Darwin> | |
16740 | * <:RunningOnFreeBSD:FreeBSD> | |
16741 | * <:RunningOnHPUX:HPUX> | |
16742 | * <:RunningOnLinux:Linux> | |
16743 | * <:RunningOnMinGW:MinGW> | |
16744 | * <:RunningOnNetBSD:NetBSD> | |
16745 | * <:RunningOnOpenBSD:OpenBSD> | |
16746 | * <:RunningOnSolaris:Solaris> | |
16747 | ||
16748 | == Architectures == | |
16749 | ||
16750 | * <:RunningOnAMD64:AMD64> | |
16751 | * <:RunningOnHPPA:HPPA> | |
16752 | * <:RunningOnPowerPC:PowerPC> | |
16753 | * <:RunningOnPowerPC64:PowerPC64> | |
16754 | * <:RunningOnSparc:Sparc> | |
16755 | * <:RunningOnX86:X86> | |
16756 | ||
16757 | == Also see == | |
16758 | ||
16759 | * <:PortingMLton:> | |
16760 | ||
16761 | <<< | |
16762 | ||
16763 | :mlton-guide-page: PolyEqual | |
16764 | [[PolyEqual]] | |
16765 | PolyEqual | |
16766 | ========= | |
16767 | ||
16768 | <:PolyEqual:> is an optimization pass for the <:SSA:> | |
16769 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
16770 | ||
16771 | == Description == | |
16772 | ||
16773 | This pass implements polymorphic equality. | |
16774 | ||
16775 | == Implementation == | |
16776 | ||
16777 | * <!ViewGitFile(mlton,master,mlton/ssa/poly-equal.fun)> | |
16778 | ||
16779 | == Details and Notes == | |
16780 | ||
16781 | For each datatype, tycon, and vector type, it builds an equality | |
16782 | function and translates calls to `MLton_equal` into calls to that | |
16783 | function. | |
16784 | ||
16785 | It also generates calls to `Word_equal`. | |
16786 | ||
16787 | For tuples, it does the equality test inline; i.e., it does not create | |
16788 | a separate equality function for each tuple type. | |
16789 | ||
16790 | All equality functions are created only if necessary, i.e., if | |
16791 | equality is actually used at a type. | |
16792 | ||
16793 | Optimizations: | |
16794 | ||
16795 | * for datatypes that are enumerations, do not build a case dispatch, | |
16796 | just use `MLton_eq`, as the backend will represent these as ints | |
16797 | ||
16798 | * deep equality always does an `MLton_eq` test first | |
16799 | ||
16800 | * If one argument to `=` is a constant and the type will get | |
16801 | translated to an `IntOrPointer`, then just use `eq` instead of the | |
16802 | full equality. This is important for implementing code like the | |
16803 | following efficiently: | |
16804 | + | |
16805 | ---- | |
16806 | if x = 0 ... (* where x is of type IntInf.int *) | |
16807 | ---- | |
16808 | ||
16809 | * Also convert pointer equality on scalar types to type specific | |
16810 | primitives. | |
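
The pass itself works on the SSA intermediate language, so the following is only a source-level analogy of the kind of function it builds for a datatype. The `tree` datatype and the name `equalTree` are hypothetical, and the initial `MLton_eq` pointer test is omitted because it cannot be written in SML source.

[source,sml]
----
datatype tree = Leaf | Node of tree * int * tree

(* Hypothetical stand-in for the equality function built for `tree`. *)
fun equalTree (x: tree, y: tree): bool =
   case (x, y) of
      (Leaf, Leaf) => true
    | (Node (l1, v1, r1), Node (l2, v2, r2)) =>
         (* The tuple carried by Node is compared inline, field by field,
          * rather than through a separate tuple equality function. *)
         equalTree (l1, l2) andalso v1 = v2 andalso equalTree (r1, r2)
    | _ => false

val same = equalTree (Node (Leaf, 1, Leaf), Node (Leaf, 1, Leaf))
val differ = equalTree (Node (Leaf, 1, Leaf), Leaf)
----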
16811 | ||
16812 | <<< | |
16813 | ||
16814 | :mlton-guide-page: PolyHash | |
16815 | [[PolyHash]] | |
16816 | PolyHash | |
16817 | ======== | |
16818 | ||
16819 | <:PolyHash:> is an optimization pass for the <:SSA:> | |
16820 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
16821 | ||
16822 | == Description == | |
16823 | ||
16824 | This pass implements polymorphic, structural hashing. | |
16825 | ||
16826 | == Implementation == | |
16827 | ||
16828 | * <!ViewGitFile(mlton,master,mlton/ssa/poly-hash.fun)> | |
16829 | ||
16830 | == Details and Notes == | |
16831 | ||
16832 | For each datatype, tycon, and vector type, it builds a hash | |
16833 | function and translates calls to `MLton_hash` into calls to that | |
16834 | function. | |
16835 | ||
16836 | For tuples, it does the hashing inline; i.e., it does not create | |
16837 | a separate hash function for each tuple type. | |
16838 | ||
16839 | All hash functions are created only if necessary, i.e., if | |
16840 | hashing is actually used at a type. | |
16841 | ||
16842 | <<< | |
16843 | ||
16844 | :mlton-guide-page: PolyML | |
16845 | [[PolyML]] | |
16846 | PolyML | |
16847 | ====== | |
16848 | ||
16849 | http://www.polyml.org/[Poly/ML] is a | |
16850 | <:StandardMLImplementations:Standard ML implementation>. | |
16851 | ||
16852 | == Also see == | |
16853 | ||
16854 | * <!Cite(Matthews95)> | |
16855 | ||
16856 | <<< | |
16857 | ||
16858 | :mlton-guide-page: PolymorphicEquality | |
16859 | [[PolymorphicEquality]] | |
16860 | PolymorphicEquality | |
16861 | =================== | |
16862 | ||
16863 | Polymorphic equality is a built-in function in | |
16864 | <:StandardML:Standard ML> that compares two values of the same type | |
16865 | for equality. It is specified as | |
16866 | ||
16867 | [source,sml] | |
16868 | ---- | |
16869 | val = : ''a * ''a -> bool | |
16870 | ---- | |
16871 | ||
16872 | The `''a` in the specification are | |
16873 | <:EqualityTypeVariable:equality type variables>, and indicate that | |
16874 | polymorphic equality can only be applied to values of an | |
16875 | <:EqualityType:equality type>. It is not allowed in SML to rebind | |
16876 | `=`, so a programmer is guaranteed that `=` always denotes polymorphic | |
16877 | equality. | |
16878 | ||
16879 | ||
16880 | == Equality of ground types == | |
16881 | ||
16882 | Ground types like `char`, `int`, and `word` may be compared (to values | |
16883 | of the same type). For example, `13 = 14` is type correct and yields | |
16884 | `false`. | |
16885 | ||
16886 | ||
16887 | == Equality of reals == | |
16888 | ||
16889 | The one ground type that cannot be compared is `real`. So, | |
16890 | `13.0 = 14.0` is not type correct. One can use `Real.==` to compare | |
16891 | reals for equality, but beware that this has different algebraic | |
16892 | properties than polymorphic equality. | |
16893 | ||
16894 | See http://standardml.org/Basis/real.html for a discussion of why | |
16895 | `real` is not an equality type. | |
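
A short sketch of how `Real.==` departs from the usual laws of equality, following its Basis Library specification (IEEE comparison, ignoring the sign of zero). The value names are illustrative only.

[source,sml]
----
val nan = 0.0 / 0.0                 (* real division yields NaN, not an exception *)

val a = Real.== (13.0, 13.0)        (* true *)
val b = Real.== (nan, nan)          (* false: Real.== is not reflexive *)
val c = Real.== (0.0, ~0.0)         (* true: the sign of zero is ignored ... *)
val d = Real.signBit 0.0 = Real.signBit (~0.0)
                                    (* false: ... yet the two zeros are distinguishable *)
----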
16896 | ||
16897 | ||
16898 | == Equality of functions == | |
16899 | ||
16900 | Comparison of functions is not allowed. | |
16901 | ||
16902 | ||
16903 | == Equality of immutable types == | |
16904 | ||
16905 | Polymorphic equality can be used on <:Immutable:immutable> values like | |
16906 | tuples, records, lists, and vectors. For example, | |
16907 | ||
16908 | ---- | |
16909 | (1, 2, 3) = (4, 5, 6) | |
16910 | ---- | |
16911 | ||
16912 | is a type-correct expression yielding `false`, while | |
16913 | ||
16914 | ---- | |
16915 | [1, 2, 3] = [1, 2, 3] | |
16916 | ---- | |
16917 | ||
16918 | is type correct and yields `true`. | |
16919 | ||
16920 | Equality on immutable values is computed by structure, which means | |
16921 | that values are compared by recursively descending the data structure | |
16922 | until ground types are reached, at which point the ground types are | |
16923 | compared with primitive equality tests (like comparison of | |
16924 | characters). So, the expression | |
16925 | ||
16926 | ---- | |
16927 | [1, 2, 3] = [1, 1 + 1, 1 + 1 + 1] | |
16928 | ---- | |
16929 | ||
16930 | is guaranteed to yield `true`, even though the lists may occupy | |
16931 | different locations in memory. | |
16932 | ||
16933 | Because of structural equality, immutable values can only be compared | |
16934 | if their components can be compared. For example, `[1, 2, 3]` can be | |
16935 | compared, but `[1.0, 2.0, 3.0]` cannot. The SML type system uses | |
16936 | <:EqualityType:equality types> to ensure that structural equality is | |
16937 | only applied to valid values. | |
16938 | ||
16939 | ||
16940 | == Equality of mutable values == | |
16941 | ||
16942 | In contrast to immutable values, polymorphic equality of | |
16943 | <:Mutable:mutable> values (like ref cells and arrays) is performed by | |
16944 | pointer comparison, not by structure. So, the expression | |
16945 | ||
16946 | ---- | |
16947 | ref 13 = ref 13 | |
16948 | ---- | |
16949 | ||
16950 | is guaranteed to yield `false`, even though the ref cells hold the | |
16951 | same contents. | |
16952 | ||
16953 | Because equality of mutable values is not structural, arrays and refs | |
16954 | can be compared _even if their components are not equality types_. | |
16955 | Hence, the following expression is type correct (and yields `true`). | |
16956 | ||
16957 | [source,sml] | |
16958 | ---- | |
16959 | let | |
16960 | val r = ref 13.0 | |
16961 | in | |
16962 | r = r | |
16963 | end | |
16964 | ---- | |
16965 | ||
16966 | ||
16967 | == Equality of datatypes == | |
16968 | ||
16969 | Polymorphic equality of datatypes is structural. Two values of the | |
16970 | same datatype are equal if they are of the same <:Variant:variant> and | |
16971 | if the <:Variant:variant>'s arguments are equal (recursively). So, | |
16972 | with the datatype | |
16973 | ||
16974 | [source,sml] | |
16975 | ---- | |
16976 | datatype t = A | B of t | |
16977 | ---- | |
16978 | ||
16979 | then `B (B A) = B A` is type correct and yields `false`, while `A = A` | |
16980 | and `B A = B A` yield `true`. | |
16981 | ||
16982 | As polymorphic equality descends two values to compare them, it uses | |
16983 | pointer equality whenever it reaches a mutable value. So, with the | |
16984 | datatype | |
16985 | ||
16986 | [source,sml] | |
16987 | ---- | |
16988 | datatype t = A of int ref | ... | |
16989 | ---- | |
16990 | ||
16991 | then `A (ref 13) = A (ref 13)` is type correct and yields `false`, | |
16992 | because the pointer equality on the two ref cells yields `false`. | |
16993 | ||
16994 | One weakness of the SML type system is that datatypes do not inherit | |
16995 | the special property of the `ref` and `array` type constructors that | |
16996 | allows them to be compared regardless of their component type. For | |
16997 | example, after declaring | |
16998 | ||
16999 | [source,sml] | |
17000 | ---- | |
17001 | datatype 'a t = A of 'a ref | |
17002 | ---- | |
17003 | ||
17004 | one might expect to be able to compare two values of type `real t`, | |
17005 | because pointer comparison on a ref cell would suffice. | |
17006 | Unfortunately, the type system can only express that a user-defined | |
17007 | datatype <:AdmitsEquality:admits equality> or not. In this case, `t` | |
17008 | admits equality, which means that `int t` can be compared but that | |
17009 | `real t` cannot. We can confirm this with the program | |
17010 | ||
17011 | [source,sml] | |
17012 | ---- | |
17013 | datatype 'a t = A of 'a ref | |
17014 | fun f (x: real t, y: real t) = x = y | |
17015 | ---- | |
17016 | ||
17017 | on which MLton reports the following error. | |
17018 | ||
17019 | ---- | |
17020 | Error: z.sml 2.32-2.36. | |
17021 | Function applied to incorrect argument. | |
17022 | expects: [<equality>] t * [<equality>] t | |
17023 | but got: [real] t * [real] t | |
17024 | in: = (x, y) | |
17025 | ---- | |
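
One possible way around this limitation, sketched here with a hypothetical helper named `sameCell`, is to pattern match on the constructor and compare the ref cells directly. Since `'a ref` is an equality type for any `'a`, the helper is accepted at every instance of `t`, including `real t`.

[source,sml]
----
datatype 'a t = A of 'a ref

(* Hypothetical helper: compares the underlying ref cells by pointer.
 * Its type is 'a t * 'a t -> bool, with no equality constraint on 'a. *)
fun sameCell (A r, A r') = r = r'

val x = A (ref 13.0)
val y = A (ref 13.0)

val xx = sameCell (x, x)   (* true: the same ref cell *)
val xy = sameCell (x, y)   (* false: distinct ref cells with equal contents *)
----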
17026 | ||
17027 | ||
17028 | == Implementation == | |
17029 | ||
17030 | Polymorphic equality is implemented by recursively descending the two | |
17031 | values being compared, stopping as soon as they are determined to be | |
17032 | unequal, or exploring the entire values to determine that they are | |
17033 | equal. Hence, polymorphic equality can take time proportional to the | |
17034 | size of the smaller value. | |
17035 | ||
17036 | MLton uses some optimizations to improve performance. | |
17037 | ||
17038 | * When computing structural equality, first do a pointer comparison. | |
17039 | If the comparison yields `true`, then stop and return `true`, since | |
17040 | the structural comparison is guaranteed to do so. If the pointer | |
17041 | comparison fails, then recursively descend the values. | |
17042 | ||
17043 | * If a datatype is an enum (e.g. `datatype t = A | B | C`), then a | |
17044 | single comparison suffices to compare values of the datatype. No case | |
17045 | dispatch is required to determine whether the two values are of the | |
17046 | same <:Variant:variant>. | |
17047 | ||
17048 | * When comparing a known constant non-value-carrying | |
17049 | <:Variant:variant>, use a single comparison. For example, the | |
17050 | following code will compile into a single comparison for `A = x`. | |
17051 | + | |
17052 | [source,sml] | |
17053 | ---- | |
17054 | datatype t = A | B | C of ... | |
17055 | fun f x = ... if A = x then ... | |
17056 | ---- | |
17057 | ||
17058 | * When comparing a small constant `IntInf.int` to another | |
17059 | `IntInf.int`, use a single comparison against the constant. No case | |
17060 | dispatch is required. | |
17061 | ||
17062 | ||
17063 | == Also see == | |
17064 | ||
17065 | * <:AdmitsEquality:> | |
17066 | * <:EqualityType:> | |
17067 | * <:EqualityTypeVariable:> | |
17068 | ||
17069 | <<< | |
17070 | ||
17071 | :mlton-guide-page: Polyvariance | |
17072 | [[Polyvariance]] | |
17073 | Polyvariance | |
17074 | ============ | |
17075 | ||
17076 | Polyvariance is an optimization pass for the <:SXML:> | |
17077 | <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>. | |
17078 | ||
17079 | == Description == | |
17080 | ||
17081 | This pass duplicates a higher-order, `let` bound function at each | |
17082 | variable reference, if the cost is smaller than some threshold. | |
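
The pass operates on <:SXML:>, but its effect can be illustrated, as a rough sketch only, in source syntax. In the hypothetical functions below, `original` has one higher-order function `twice` referenced twice, and `duplicated` shows the shape of the program after each reference receives its own copy, so that each copy is applied to a single known function.

[source,sml]
----
(* Before: one let-bound, higher-order function with two references. *)
fun original (xs: int list): int list =
   let
      fun twice f = f o f
      val inc2 = twice (fn x => x + 1)
      val dbl2 = twice (fn x => x * 2)
   in
      map (inc2 o dbl2) xs
   end

(* After (illustrative): one copy of `twice` per variable reference. *)
fun duplicated (xs: int list): int list =
   let
      fun twice1 f = f o f
      fun twice2 f = f o f
      val inc2 = twice1 (fn x => x + 1)
      val dbl2 = twice2 (fn x => x * 2)
   in
      map (inc2 o dbl2) xs
   end
----

Both functions compute the same result; only the number of copies of `twice` differs.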
17083 | ||
17084 | == Implementation == | |
17085 | ||
17086 | * <!ViewGitFile(mlton,master,mlton/xml/polyvariance.fun)> | |
17087 | ||
17088 | == Details and Notes == | |
17089 | ||
17090 | {empty} | |
17091 | ||
17092 | <<< | |
17093 | ||
17094 | :mlton-guide-page: Poplog | |
17095 | [[Poplog]] | |
17096 | Poplog | |
17097 | ====== | |
17098 | ||
17099 | http://www.cs.bham.ac.uk/research/poplog/poplog.info.html[POPLOG] is a | |
17100 | development environment that includes implementations of a number of | |
17101 | languages, including <:StandardML:Standard ML>. | |
17102 | ||
17103 | While POPLOG is actively developed, the <:ML:> support predates | |
17104 | <:DefinitionOfStandardML:SML'97>, and there is no support for the | |
17105 | <:BasisLibrary:Basis Library> | |
17106 | http://www.standardml.org/Basis[specification]. | |
17107 | ||
17108 | == Also see == | |
17109 | ||
17110 | * http://www.cs.bham.ac.uk/research/poplog/doc/pmlhelp/mlinpop[Mixed-language programming in ML and Pop-11]. | |
17111 | ||
17112 | <<< | |
17113 | ||
17114 | :mlton-guide-page: PortingMLton | |
17115 | [[PortingMLton]] | |
17116 | PortingMLton | |
17117 | ============ | |
17118 | ||
17119 | Porting MLton to a new target platform (architecture or OS) involves | |
17120 | the following steps. | |
17121 | ||
17122 | 1. Make the necessary changes to the scripts, runtime system, | |
17123 | <:BasisLibrary: Basis Library> implementation, and compiler. | |
17124 | ||
17125 | 2. Get the regressions working using a cross compiler. | |
17126 | ||
17127 | 3. <:CrossCompiling: Cross compile> MLton and bootstrap on the target. | |
17128 | ||
17129 | MLton has a native code generator only for AMD64 and X86, so, if you | |
17130 | are porting to another architecture, you must use the C code | |
17131 | generator. These notes do not cover building a new native code | |
17132 | generator. | |
17133 | ||
17134 | Some of the following steps will not be necessary if MLton already | |
17135 | supports the architecture or operating system you are porting to. | |
17136 | ||
17137 | ||
17138 | == What code to change == | |
17139 | ||
17140 | * Scripts. | |
17141 | + | |
17142 | -- | |
17143 | * In `bin/platform`, add new cases to define `$HOST_OS` and `$HOST_ARCH`. | |
17144 | -- | |
17145 | ||
17146 | * Runtime system. | |
17147 | + | |
17148 | -- | |
17149 | The goal of this step is to be able to successfully run `make` in the | |
17150 | `runtime` directory on the target machine. | |
17151 | ||
17152 | * In `platform.h`, add a new case to include `platform/<arch>.h` and `platform/<os>.h`. | |
17153 | ||
17154 | * In `platform/<arch>.h`: | |
17155 | ** define `MLton_Platform_Arch_host`. | |
17156 | ||
17157 | * In `platform/<os>.h`: | |
17158 | ** include platform-specific includes. | |
17159 | ** define `MLton_Platform_OS_host`. | |
17160 | ** define all of the `HAS_*` macros. | |
17161 | ||
17162 | * In `platform/<os>.c` implement any platform-dependent functions that the runtime needs. | |
17163 | ||
17164 | * Add rounding mode control to `basis/Real/IEEEReal.c` for the new arch (if not `HAS_FEROUND`). | |
17165 | ||
17166 | * Compile and install the <:GnuMP:>. This varies from platform to platform. In `platform/<os>.h`, you need to include the appropriate `gmp.h`. | |
17167 | -- | |
17168 | ||
17169 | * Basis Library implementation (`basis-library/*`) | |
17170 | + | |
17171 | -- | |
17172 | * In `primitive/prim-mlton.sml`: | |
17173 | ** Add a new variant to the `MLton.Platform.Arch.t` datatype. | |
17174 | ** Modify the constants that define `MLton.Platform.Arch.host` to match with `MLton_Platform_Arch_host`, as set in `runtime/platform/<arch>.h`. | |
17175 | ** Add a new variant to the `MLton.Platform.OS.t` datatype. | |
17176 | ** Modify the constants that define `MLton.Platform.OS.host` to match with `MLton_Platform_OS_host`, as set in `runtime/platform/<os>.h`. | |
17177 | ||
17178 | * In `mlton/platform.{sig,sml}` add a new variant. | |
17179 | ||
17180 | * In `sml-nj/sml-nj.sml`, modify `getOSKind`. | |
17181 | ||
17182 | * Look at all the uses of `MLton.Platform` in the Basis Library implementation and see if you need to do anything special. You might use the following command to see where to look. | |
17183 | + | |
17184 | ---- | |
17185 | find basis-library -type f | xargs grep 'MLton\.Platform' | |
17186 | ---- | |
17187 | + | |
17188 | If in doubt, leave the code alone and wait to see what happens when you run the regression tests. | |
17189 | -- | |
17190 | ||
17191 | * Compiler. | |
17192 | + | |
17193 | -- | |
17194 | * In `lib/stubs/mlton-stubs/platform.sig` add any new variants, as was done in the Basis Library. | |
17195 | ||
17196 | * In `lib/stubs/mlton-stubs/mlton.sml` add any new variants in `MLton.Platform`, as was done in the Basis Library. | |
17197 | -- | |
17198 | ||
17199 | The string used to identify a particular architecture or operating | |
17200 | system must be the same (except for possibly case of letters) in the | |
17201 | scripts, runtime, Basis Library implementation, and compiler (stubs). | |
17202 | In `mlton/main/main.fun`, MLton itself uses the conversions to and | |
17203 | from strings: | |
17204 | ---- | |
17205 | MLton.Platform.{Arch,OS}.{from,to}String | |
17206 | ---- | |
17207 | ||
17208 | If there is a mismatch, you may see the error message | |
17209 | `strange arch` or `strange os`. | |
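
One way to check the strings on the new platform is a small test program, sketched below, that prints what the Basis Library implementation reports for the host. It uses only `host` and `toString` from `MLton.Platform` and must be compiled with MLton; the output labels are arbitrary.

[source,sml]
----
(* Prints the architecture and OS strings as seen by MLton.Platform. *)
val () =
   print (concat ["arch: ",
                  MLton.Platform.Arch.toString MLton.Platform.Arch.host,
                  "\n",
                  "os:   ",
                  MLton.Platform.OS.toString MLton.Platform.OS.host,
                  "\n"])
----

The printed names can then be compared, ignoring letter case, with the values produced by `bin/platform`.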
17210 | ||
17211 | ||
17212 | == Running the regressions with a cross compiler == | |
17213 | ||
17214 | When porting to a new platform, it is always best to get all (or as | |
17215 | many as possible) of the regressions working before moving to a self | |
17216 | compile. It is easiest to do this by modifying and rebuilding the | |
17217 | compiler on a working machine and then running the regressions with a | |
17218 | cross compiler. It is not easy to build a gcc cross compiler, so we | |
17219 | recommend generating the C and assembly on a working machine (using | |
17220 | MLton's `-target` and `-stop g` flags), copying the generated files to | |
17221 | the target machine, then compiling and linking there. | |
17222 | ||
17223 | 1. Remake the compiler on a working machine. | |
17224 | ||
17225 | 2. Use `bin/add-cross` to add support for the new target. In particular, this should create `build/lib/mlton/targets/<target>/` with the necessary platform-specific cross-compilation information. | |
17226 | ||
17227 | 3. Run the regression tests with the cross-compiler. To cross-compile all the tests, do | |
17228 | + | |
17229 | ---- | |
17230 | bin/regression -cross <target> | |
17231 | ---- | |
17232 | + | |
17233 | This will create all the executables. Then, copy `bin/regression` and | |
17234 | the `regression` directory to the target machine, and do | |
17235 | + | |
17236 | ---- | |
17237 | bin/regression -run-only <target> | |
17238 | ---- | |
17239 | + | |
17240 | This should run all the tests. | |
17241 | ||
17242 | Repeat this step, interleaved with appropriate compiler modifications, | |
17243 | until all the regressions pass. | |
17244 | ||
17245 | ||
17246 | == Bootstrap == | |
17247 | ||
17248 | Once you've got all the regressions working, you can build MLton for | |
17249 | the new target. As with the regressions, the idea for bootstrapping | |
17250 | is to generate the C and assembly on a working machine, copy it to the | |
17251 | target machine, and then compile and link there. Here's the sequence | |
17252 | of steps. | |
17253 | ||
17254 | 1. On a working machine, with the newly rebuilt compiler, in the `mlton` directory, do: | |
17255 | + | |
17256 | ---- | |
17257 | mlton -stop g -target <target> mlton.mlb | |
17258 | ---- | |
17259 | ||
17260 | 2. Copy to the target machine. | |
17261 | ||
17262 | 3. On the target machine, move the libraries to the right place. That is, in `build/lib/mlton/targets`, do: | |
17263 | + | |
17264 | ---- | |
17265 | rm -rf self | |
17266 | mv <target> self | |
17267 | ---- | |
17268 | + | |
17269 | Also make sure you have all the header files in `build/lib/mlton/include`. You can copy them from a host machine that has run `make runtime`. | |
17270 | ||
17271 | 4. On the target machine, compile and link MLton. That is, in the mlton directory, do something like: | |
17272 | + | |
17273 | ---- | |
17274 | gcc -c -Ibuild/lib/mlton/include -Ibuild/lib/mlton/targets/self/include -O1 -w mlton/mlton.*.[cs] | |
17275 | gcc -o build/lib/mlton/mlton-compile \ | |
17276 | -Lbuild/lib/mlton/targets/self \ | |
17277 | -L/usr/local/lib \ | |
17278 | mlton.*.o \ | |
17279 | -lmlton -lgmp -lgdtoa -lm | |
17280 | ---- | |
17281 | ||
17282 | 5. At this point, MLton should be working and you can finish the rest of a usual make on the target machine. | |
17283 | + | |
17284 | ---- | |
17285 | make basis-no-check script mlbpathmap constants libraries tools | |
17286 | ---- | |
17287 | ||
17288 | 6. Making the last tool, mlyacc, will fail, because mlyacc cannot bootstrap its own yacc.grm.* files. On the host machine, run `make -C mlyacc src/yacc.grm.sml`. Then copy both files to the target machine, and compile mlyacc, making sure to supply the path to your newly compiled mllex: `make -C mlyacc MLLEX=mllex/mllex`. | |
17289 | ||
17290 | There are other details to get right, like making sure that the tools | |
17291 | directories are clean so that the tools are rebuilt on the new | |
17292 | platform, but hopefully this structure works. Once you've got a | |
17293 | compiler on the target machine, you should test it by running all the | |
17294 | regressions normally (i.e. without the `-cross` flag) and by running a | |
17295 | couple of rounds of self compiles. | |
17296 | ||
17297 | ||
17298 | == Also see == | |
17299 | ||
17300 | The above description is based on the following emails sent to the | |
17301 | MLton list. | |
17302 | ||
17303 | * http://www.mlton.org/pipermail/mlton/2002-October/013110.html | |
17304 | * http://www.mlton.org/pipermail/mlton/2004-July/016029.html | |
17305 | ||
17306 | <<< | |
17307 | ||
17308 | :mlton-guide-page: PrecedenceParse | |
17309 | [[PrecedenceParse]] | |
17310 | PrecedenceParse | |
17311 | =============== | |
17312 | ||
17313 | <:PrecedenceParse:> is an analysis/rewrite pass for the <:AST:> | |
17314 | <:IntermediateLanguage:>, invoked from <:Elaborate:>. | |
17315 | ||
17316 | == Description == | |
17317 | ||
17318 | This pass rewrites <:AST:> function clauses, expressions, and patterns | |
17319 | to resolve <:OperatorPrecedence:>. | |
17320 | ||
17321 | == Implementation == | |
17322 | ||
17323 | * <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.sig)> | |
17324 | * <!ViewGitFile(mlton,master,mlton/elaborate/precedence-parse.fun)> | |
17325 | ||
17326 | == Details and Notes == | |
17327 | ||
17328 | {empty} | |
17329 | ||
17330 | <<< | |
17331 | ||
17332 | :mlton-guide-page: Printf | |
17333 | [[Printf]] | |
17334 | Printf | |
17335 | ====== | |
17336 | ||
17337 | Programmers coming from C or Java often ask if | |
17338 | <:StandardML:Standard ML> has a `printf` function. It does not. | |
17339 | However, it is possible to implement your own version with only a few | |
17340 | lines of code. | |
17341 | ||
17342 | Here is a definition for `printf` and `fprintf`, along with format | |
17343 | specifiers for booleans, integers, and reals. | |
17344 | ||
17345 | [source,sml] | |
17346 | ---- | |
17347 | structure Printf = | |
17348 | struct | |
17349 | fun $ (_, f) = f (fn p => p ()) ignore | |
17350 | fun fprintf out f = f (out, fn x => x) | |
17351 | val printf = fn z => fprintf TextIO.stdOut z | |
17352 | fun one ((out, f), make) g = | |
17353 | g (out, fn r => | |
17354 | f (fn p => | |
17355 | make (fn s => | |
17356 | r (fn () => (p (); TextIO.output (out, s)))))) | |
17357 | fun ` x s = one (x, fn f => f s) | |
17358 | fun spec to x = one (x, fn f => f o to) | |
17359 | val B = fn z => spec Bool.toString z | |
17360 | val I = fn z => spec Int.toString z | |
17361 | val R = fn z => spec Real.toString z | |
17362 | end | |
17363 | ---- | |
17364 | ||
17365 | Here's an example use. | |
17366 | ||
17367 | [source,sml] | |
17368 | ---- | |
17369 | val () = printf `"Int="I`" Bool="B`" Real="R`"\n" $ 1 false 2.0 | |
17370 | ---- | |
17371 | ||
17372 | This prints the following. | |
17373 | ||
17374 | ---- | |
17375 | Int=1 Bool=false Real=2.0 | |
17376 | ---- | |
17377 | ||
17378 | In general, a use of `printf` looks like | |
17379 | ||
17380 | ---- | |
17381 | printf <spec1> ... <specn> $ <arg1> ... <argm> | |
17382 | ---- | |
17383 | ||
17384 | where each `<speci>` is either a specifier like `B`, `I`, or `R`, or | |
17385 | is an inline string, like ++`"foo"++. A backtick (+`+) | |
17386 | must precede each inline string. Each `<argi>` must be of the | |
17387 | appropriate type for the corresponding specifier. | |
17388 | ||
17389 | SML `printf` is more powerful than its C counterpart in a number of | |
17390 | ways. In particular, the function produced by `printf` is a perfectly | |
17391 | ordinary SML function, and can be passed around, used multiple times, | |
17392 | etc. For example: | |
17393 | ||
17394 | [source,sml] | |
17395 | ---- | |
17396 | val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $ | |
17397 | val () = f 1 true | |
17398 | val () = f 2 false | |
17399 | ---- | |
17400 | ||
17401 | The definition of `printf` is even careful to not print anything until | |
17402 | it is fully applied. So, examples like the following will work as | |
17403 | expected. | |
17404 | ||
17405 | ---- | |
17406 | val f: int -> bool -> unit = printf `"Int="I`" Bool="B`"\n" $ 13 | |
17407 | val () = f true | |
17408 | val () = f false | |
17409 | ---- | |
17410 | ||
17411 | It is also easy to define new format specifiers. For example, suppose | |
17412 | we wanted format specifiers for characters and strings. | |
17413 | ||
17414 | ---- | |
17415 | val C = fn z => spec Char.toString z | |
17416 | val S = fn z => spec (fn s => s) z | |
17417 | ---- | |
17418 | ||
17419 | One can define format specifiers for more complex types, e.g. pairs of | |
17420 | integers. | |
17421 | ||
17422 | ---- | |
17423 | val I2 = | |
17424 | fn z => | |
17425 | spec (fn (i, j) => | |
17426 | concat ["(", Int.toString i, ", ", Int.toString j, ")"]) | |
17427 | z | |
17428 | ---- | |
17429 | ||
17430 | Here's an example use. | |
17431 | ||
17432 | ---- | |
17433 | val () = printf `"Test "I2`" a string "S`"\n" $ (1, 2) "hello" | |
17434 | ---- | |
17435 | ||
17436 | ||
17437 | == Printf via <:Fold:> == | |
17438 | ||
17439 | `printf` is best viewed as a special case of variable-argument | |
17440 | <:Fold:> that inductively builds a function as it processes its | |
17441 | arguments. Here is the definition of a `Printf` structure in terms of | |
17442 | fold. The structure is equivalent to the above one, except that it | |
17443 | uses the standard `$` instead of a specialized one. | |
17444 | ||
17445 | [source,sml] | |
17446 | ---- | |
17447 | structure Printf = | |
17448 | struct | |
17449 | fun fprintf out = | |
17450 | Fold.fold ((out, id), fn (_, f) => f (fn p => p ()) ignore) | |
17451 | ||
17452 | val printf = fn z => fprintf TextIO.stdOut z | |
17453 | ||
17454 | fun one ((out, f), make) = | |
17455 | (out, fn r => | |
17456 | f (fn p => | |
17457 | make (fn s => | |
17458 | r (fn () => (p (); TextIO.output (out, s)))))) | |
17459 | ||
17460 | val ` = | |
17461 | fn z => Fold.step1 (fn (s, x) => one (x, fn f => f s)) z | |
17462 | ||
17463 | fun spec to = Fold.step0 (fn x => one (x, fn f => f o to)) | |
17464 | ||
17465 | val B = fn z => spec Bool.toString z | |
17466 | val I = fn z => spec Int.toString z | |
17467 | val R = fn z => spec Real.toString z | |
17468 | end | |
17469 | ---- | |
17470 | ||
17471 | Viewing `printf` as a fold opens up a number of possibilities. For | |
17472 | example, one can name parts of format strings using the fold idiom for | |
17473 | naming sequences of steps. | |
17474 | ||
17475 | ---- | |
17476 | val IB = fn u => Fold.fold u `"Int="I`" Bool="B | |
17477 | val () = printf IB`" "IB`"\n" $ 1 true 3 false | |
17478 | ---- | |
17479 | ||
17480 | One can even parametrize over partial format strings. | |
17481 | ||
17482 | ---- | |
17483 | fun XB X = fn u => Fold.fold u `"X="X`" Bool="B | |
17484 | val () = printf (XB I)`" "(XB R)`"\n" $ 1 true 2.0 false | |
17485 | ---- | |
17486 | ||
17487 | ||
17488 | == Also see == | |
17489 | ||
17490 | * <:PrintfGentle:> | |
17491 | * <!Cite(Danvy98, Functional Unparsing)> | |
17492 | ||
17493 | <<< | |
17494 | ||
17495 | :mlton-guide-page: PrintfGentle | |
17496 | [[PrintfGentle]] | |
17497 | PrintfGentle | |
17498 | ============ | |
17499 | ||
17500 | This page provides a gentle introduction and derivation of <:Printf:>, | |
17501 | with sections and arrangement more suitable to a talk. | |
17502 | ||
17503 | ||
17504 | == Introduction == | |
17505 | ||
17506 | SML does not have `printf`. Could we define it ourselves? | |
17507 | ||
17508 | [source,sml] | |
17509 | ---- | |
17510 | val () = printf ("here's an int %d and a real %f.\n", 13, 17.0) | |
17511 | val () = printf ("here's three values (%d, %f, %f).\n", 13, 17.0, 19.0) | |
17512 | ---- | |
17513 | ||
17514 | What could the type of `printf` be? | |
17515 | ||
17516 | This obviously can't work, because SML functions take a fixed number | |
17517 | of arguments. Actually they take one argument, but if that's a tuple, | |
17518 | it can only have a fixed number of components. | |
17519 | ||
17520 | ||
17521 | == From tupling to currying == | |
17522 | ||
17523 | What about currying to get around the typing problem? | |
17524 | ||
17525 | [source,sml] | |
17526 | ---- | |
17527 | val () = printf "here's an int %d and a real %f.\n" 13 17.0 | |
17528 | val () = printf "here's three values (%d, %f, %f).\n" 13 17.0 19.0 | |
17529 | ---- | |
17530 | ||
17531 | That fails for a similar reason. We need two types for `printf`. | |
17532 | ||
17533 | ---- | |
17534 | val printf: string -> int -> real -> unit | |
17535 | val printf: string -> int -> real -> real -> unit | |
17536 | ---- | |
17537 | ||
17538 | This can't work, because `printf` can only have one type. SML doesn't | |
17539 | support programmer-defined overloading. | |
17540 | ||
17541 | ||
17542 | == Overloading and dependent types == | |
17543 | ||
17544 | Even without worrying about number of arguments, there is another | |
17545 | problem. The type of `printf` depends on the format string. | |
17546 | ||
17547 | [source,sml] | |
17548 | ---- | |
17549 | val () = printf "here's an int %d and a real %f.\n" 13 17.0 | |
17550 | val () = printf "here's a real %f and an int %d.\n" 17.0 13 | |
17551 | ---- | |
17552 | ||
17553 | Now we need | |
17554 | ||
17555 | ---- | |
17556 | val printf: string -> int -> real -> unit | |
17557 | val printf: string -> real -> int -> unit | |
17558 | ---- | |
17559 | ||
17560 | Again, this can't possibly work because SML doesn't have | |
17561 | overloading, and types can't depend on values. | |
17562 | ||
17563 | ||
17564 | == Idea: express type information in the format string == | |
17565 | ||
17566 | If we express type information in the format string, then different | |
17567 | uses of `printf` can have different types. | |
17568 | ||
17569 | [source,sml] | |
17570 | ---- | |
17571 | type 'a t (* the type of format strings *) | |
17572 | val printf: 'a t -> 'a | |
17573 | infix D F | |
17574 | val fs1: (int -> real -> unit) t = "here's an int "D" and a real "F".\n" | |
17575 | val fs2: (int -> real -> real -> unit) t = | |
17576 | "here's three values ("D", "F", "F").\n" | |
17577 | val () = printf fs1 13 17.0 | |
17578 | val () = printf fs2 13 17.0 19.0 | |
17579 | ---- | |
17580 | ||
17581 | Now, our two calls to `printf` type check, because the format | |
17582 | string specializes `printf` to the appropriate type. | |
17583 | ||
17584 | ||
17585 | == The types of format characters == | |
17586 | ||
17587 | What should the type of format characters `D` and `F` be? Each format | |
17588 | character requires an additional argument of the appropriate type to | |
17589 | be supplied to `printf`. | |
17590 | ||
17591 | Idea: guess the final type that `printf` will need for the format | |
17592 | string and verify it with each format character. | |
17593 | ||
17594 | [source,sml] | |
17595 | ---- | |
17596 | type ('a, 'b) t (* 'a = rest of type to verify, 'b = final type *) | |
17597 | val ` : string -> ('a, 'a) t (* guess the type, which must be verified *) | |
17598 | val D: (int -> 'a, 'b) t * string -> ('a, 'b) t (* consume an int *) | |
17599 | val F: (real -> 'a, 'b) t * string -> ('a, 'b) t (* consume a real *) | |
17600 | val printf: (unit, 'a) t -> 'a | |
17601 | ---- | |
17602 | ||
17603 | Don't worry. In the end, type inference will guess and verify for us. | |
17604 | ||
17605 | ||
17606 | == Understanding guess and verify == | |
17607 | ||
17608 | Now, let's build up a format string and a specialized `printf`. | |
17609 | ||
17610 | [source,sml] | |
17611 | ---- | |
17612 | infix D F | |
17613 | val f0 = `"here's an int " | |
17614 | val f1 = f0 D " and a real " | |
17615 | val f2 = f1 F ".\n" | |
17616 | val p = printf f2 | |
17617 | ---- | |
17618 | ||
17619 | These definitions yield the following types. | |
17620 | ||
17621 | [source,sml] | |
17622 | ---- | |
17623 | val f0: (int -> real -> unit, int -> real -> unit) t | |
17624 | val f1: (real -> unit, int -> real -> unit) t | |
17625 | val f2: (unit, int -> real -> unit) t | |
17626 | val p: int -> real -> unit | |
17627 | ---- | |
17628 | ||
17629 | So, `p` is a specialized `printf` function. We could use it as | |
17630 | follows. | |
17631 | ||
17632 | [source,sml] | |
17633 | ---- | |
17634 | val () = p 13 17.0 | |
17635 | val () = p 14 19.0 | |
17636 | ---- | |
17637 | ||
17638 | ||
17639 | == Type checking this using a functor == | |
17640 | ||
17641 | [source,sml] | |
17642 | ---- | |
17643 | signature PRINTF = | |
17644 | sig | |
17645 | type ('a, 'b) t | |
17646 | val ` : string -> ('a, 'a) t | |
17647 | val D: (int -> 'a, 'b) t * string -> ('a, 'b) t | |
17648 | val F: (real -> 'a, 'b) t * string -> ('a, 'b) t | |
17649 | val printf: (unit, 'a) t -> 'a | |
17650 | end | |
17651 | ||
17652 | functor Test (P: PRINTF) = | |
17653 | struct | |
17654 | open P | |
17655 | infix D F | |
17656 | ||
17657 | val () = printf (`"here's an int "D" and a real "F".\n") 13 17.0 | |
17658 | val () = printf (`"here's three values ("D", "F ", "F").\n") 13 17.0 19.0 | |
17659 | end | |
17660 | ---- | |
17661 | ||
17662 | ||
17663 | == Implementing `Printf` == | |
17664 | ||
17665 | Think of a format character as a formatter transformer. It takes the | |
17666 | formatter for the part of the format string before it and transforms | |
17667 | it into a new formatter that first does the left hand bit, then does | |
17668 | its bit, then continues on with the rest of the format string. | |
17669 | ||
17670 | [source,sml] | |
17671 | ---- | |
17672 | structure Printf: PRINTF = | |
17673 | struct | |
17674 | datatype ('a, 'b) t = T of (unit -> 'a) -> 'b | |
17675 | ||
17676 | fun printf (T f) = f (fn () => ()) | |
17677 | ||
17678 | fun ` s = T (fn a => (print s; a ())) | |
17679 | ||
17680 | fun D (T f, s) = | |
17681 | T (fn g => f (fn () => fn i => | |
17682 | (print (Int.toString i); print s; g ()))) | |
17683 | ||
17684 | fun F (T f, s) = | |
17685 | T (fn g => f (fn () => fn i => | |
17686 | (print (Real.toString i); print s; g ()))) | |
17687 | end | |
17688 | ---- | |
17689 | ||
17690 | ||
17691 | == Testing printf == | |
17692 | ||
17693 | [source,sml] | |
17694 | ---- | |
17695 | structure Z = Test (Printf) | |
17696 | ---- | |
17697 | ||
17698 | ||
17699 | == User-definable formats == | |
17700 | ||
17701 | The definition of the format characters is pretty much the same. | |
17702 | Within the `Printf` structure we can define a format character | |
17703 | generator. | |
17704 | ||
17705 | [source,sml] | |
17706 | ---- | |
17707 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t = | |
17708 | fn toString => fn (T f, s) => | |
17709 | T (fn th => f (fn () => fn a => (print (toString a); print s ; th ()))) | |
17710 | val D = fn z => newFormat Int.toString z | |
17711 | val F = fn z => newFormat Real.toString z | |
17712 | ---- | |
17713 | ||
17714 | ||
17715 | == A core `Printf` == | |
17716 | ||
17717 | We can now have a very small `PRINTF` signature, and define all | |
17718 | the format strings externally to the core module. | |
17719 | ||
17720 | [source,sml] | |
17721 | ---- | |
17722 | signature PRINTF = | |
17723 | sig | |
17724 | type ('a, 'b) t | |
17725 | val ` : string -> ('a, 'a) t | |
17726 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t | |
17727 | val printf: (unit, 'a) t -> 'a | |
17728 | end | |
17729 | ||
17730 | structure Printf: PRINTF = | |
17731 | struct | |
17732 | datatype ('a, 'b) t = T of (unit -> 'a) -> 'b | |
17733 | ||
17734 | fun printf (T f) = f (fn () => ()) | |
17735 | ||
17736 | fun ` s = T (fn a => (print s; a ())) | |
17737 | ||
17738 | fun newFormat toString (T f, s) = | |
17739 | T (fn th => | |
17740 | f (fn () => fn a => | |
17741 | (print (toString a) | |
17742 | ; print s | |
17743 | ; th ()))) | |
17744 | end | |
17745 | ---- | |
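| ||
| As a rough sketch of how a client might then define format characters | |
| outside the core module, assuming the `Printf` structure above is in | |
| scope (the names `D` and `B` and the sample call are illustrative and | |
| not part of the signature): | |
| ||
| [source,sml] | |
| ---- | |
| open Printf | |
| ||
| (* Eta-expanded so that each format character stays polymorphic. *) | |
| val D = fn z => newFormat Int.toString z | |
| val B = fn z => newFormat Bool.toString z | |
| infix D B | |
| ||
| (* The format string fixes the type of this use to int -> bool -> unit. *) | |
| val () = printf (`"x = "D", ok = "B"\n") 13 true | |
| ---- | |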
17746 | ||
17747 | ||
17748 | == Extending to fprintf == | |
17749 | ||
17750 | One can implement fprintf by threading the outstream through all the | |
17751 | transformers. | |
17752 | ||
17753 | [source,sml] | |
17754 | ---- | |
17755 | signature PRINTF = | |
17756 | sig | |
17757 | type ('a, 'b) t | |
17758 | val ` : string -> ('a, 'a) t | |
17759 | val fprintf: (unit, 'a) t * TextIO.outstream -> 'a | |
17760 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t | |
17761 | val printf: (unit, 'a) t -> 'a | |
17762 | end | |
17763 | ||
17764 | structure Printf: PRINTF = | |
17765 | struct | |
17766 | type out = TextIO.outstream | |
17767 | val output = TextIO.output | |
17768 | ||
17769 | datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b | |
17770 | ||
17771 | fun fprintf (T f, out) = f (fn _ => ()) out | |
17772 | ||
17773 | fun printf t = fprintf (t, TextIO.stdOut) | |
17774 | ||
17775 | fun ` s = T (fn a => fn out => (output (out, s); a out)) | |
17776 | ||
17777 | fun newFormat toString (T f, s) = | |
17778 | T (fn g => | |
17779 | f (fn out => fn a => | |
17780 | (output (out, toString a) | |
17781 | ; output (out, s) | |
17782 | ; g out))) | |
17783 | end | |
17784 | ---- | |
17785 | ||
17786 | ||
17787 | == Notes == | |
17788 | ||
17789 | * Lesson: instead of using dependent types for a function, express the | |
17790 | dependency in the type of the argument. | |
17791 | ||
17792 | * If `printf` is partially applied, it will do the printing then and | |
17793 | there. Perhaps this could be fixed with some kind of terminator. | |
17794 | + | |
17795 | A syntactic or argument terminator is not necessary. A formatter can | |
17796 | either be eager (as above) or lazy (as below). A lazy formatter | |
17797 | accumulates enough state to print the entire string. The simplest | |
17798 | lazy formatter concatenates the strings as they become available: | |
17799 | + | |
17800 | [source,sml] | |
17801 | ---- | |
17802 | structure PrintfLazyConcat: PRINTF = | |
17803 | struct | |
17804 | datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b | |
17805 | ||
17806 | fun printf (T f) = f print "" | |
17807 | ||
17808 | fun ` s = T (fn th => fn s' => th (s' ^ s)) | |
17809 | ||
17810 | fun newFormat toString (T f, s) = | |
17811 | T (fn th => | |
17812 | f (fn s' => fn a => | |
17813 | th (s' ^ toString a ^ s))) | |
17814 | end | |
17815 | ---- | |
17816 | + | |
17817 | It is somewhat more efficient to accumulate the strings as a list: | |
17818 | + | |
17819 | [source,sml] | |
17820 | ---- | |
17821 | structure PrintfLazyList: PRINTF = | |
17822 | struct | |
17823 | datatype ('a, 'b) t = T of (string list -> 'a) -> string list -> 'b | |
17824 | ||
17825 | fun printf (T f) = f (List.app print o List.rev) [] | |
17826 | ||
17827 | fun ` s = T (fn th => fn ss => th (s::ss)) | |
17828 | ||
17829 | fun newFormat toString (T f, s) = | |
17830 | T (fn th => | |
17831 | f (fn ss => fn a => | |
17832 | th (s::toString a::ss))) | |
17833 | end | |
17834 | ---- | |
17835 | ||
17836 | ||
17837 | == Also see == | |
17838 | ||
17839 | * <:Printf:> | |
17840 | * <!Cite(Danvy98, Functional Unparsing)> | |
17841 | ||
17842 | <<< | |
17843 | ||
17844 | :mlton-guide-page: ProductType | |
17845 | [[ProductType]] | |
17846 | ProductType | |
17847 | =========== | |
17848 | ||
17849 | <:StandardML:Standard ML> has special syntax for products (tuples). A | |
17850 | product type is written as | |
17851 | [source,sml] | |
17852 | ---- | |
17853 | t1 * t2 * ... * tN | |
17854 | ---- | |
17855 | and a product pattern is written as | |
17856 | [source,sml] | |
17857 | ---- | |
17858 | (p1, p2, ..., pN) | |
17859 | ---- | |
17860 | ||
17861 | In most situations the syntax is quite convenient. However, there are | |
17862 | situations where the syntax is cumbersome. There are also situations | |
17863 | in which it is useful to construct and destruct n-ary products | |
17864 | inductively, especially when using <:Fold:>. | |
17865 | ||
17866 | In such situations, it is useful to have a binary product datatype | |
17867 | with an infix constructor defined as follows. | |
17868 | [source,sml] | |
17869 | ---- | |
17870 | datatype ('a, 'b) product = & of 'a * 'b | |
17871 | infix & | |
17872 | ---- | |
17873 | ||
17874 | With these definitions, one can write an n-ary product as a nested | |
17875 | binary product quite conveniently. | |
17876 | [source,sml] | |
17877 | ---- | |
17878 | x1 & x2 & ... & xn | |
17879 | ---- | |
17880 | ||
17881 | Because of left associativity, this is the same as | |
17882 | [source,sml] | |
17883 | ---- | |
17884 | (((x1 & x2) & ...) & xn) | |
17885 | ---- | |
17886 | ||
17887 | Because `&` is a constructor, the syntax can also be used for | |
17888 | patterns. | |
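| ||
| As a small illustration of the pattern form (a sketch; the function | |
| `describe` and its arguments are hypothetical names chosen for this | |
| example): | |
| [source,sml] | |
| ---- | |
| datatype ('a, 'b) product = & of 'a * 'b | |
| infix & | |
| ||
| (* The pattern parses as ((name & age) & score), matching how the | |
|    value is built below. *) | |
| fun describe (name & age & score) = | |
|    concat [name, " ", Int.toString age, " ", Real.toString score] | |
| ||
| val s = describe ("ada" & 36 & 9.5) | |
| ---- | |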
17889 | ||
17890 | The symbol `&` is inspired by the Curry-Howard isomorphism: the proof | |
17891 | of a conjunction `(A & B)` is a pair of proofs `(a, b)`. | |
17892 | ||
17893 | ||
17894 | == Example: parser combinators == | |
17895 | ||
17896 | A typical parser combinator library provides a combinator that has a | |
17897 | type of the form | |
17898 | [source,sml] | |
17899 | ---- | |
17900 | 'a parser * 'b parser -> ('a * 'b) parser | |
17901 | ---- | |
17902 | and produces a parser for the concatenation of two parsers. When more | |
17903 | than two parsers are concatenated, the result of the combined parser | |
17904 | is a nested structure of pairs | |
17905 | [source,sml] | |
17906 | ---- | |
17907 | (...((p1, p2), p3)..., pN) | |
17908 | ---- | |
17909 | which is somewhat cumbersome. | |
17910 | ||
17911 | By using a product type, the type of the concatenation combinator then | |
17912 | becomes | |
17913 | [source,sml] | |
17914 | ---- | |
17915 | 'a parser * 'b parser -> ('a, 'b) product parser | |
17916 | ---- | |
17917 | While this doesn't stop the nesting, it makes the pattern significantly | |
17918 | easier to write. Instead of | |
17919 | [source,sml] | |
17920 | ---- | |
17921 | (...((p1, p2), p3)..., pN) | |
17922 | ---- | |
17923 | the pattern is written as | |
17924 | [source,sml] | |
17925 | ---- | |
17926 | p1 & p2 & p3 & ... & pN | |
17927 | ---- | |
17928 | which is considerably more concise. | |
17929 | ||
17930 | ||
17931 | == Also see == | |
17932 | ||
17933 | * <:VariableArityPolymorphism:> | |
17934 | * <:Utilities:> | |
17935 | ||
17936 | <<< | |
17937 | ||
17938 | :mlton-guide-page: Profiling | |
17939 | [[Profiling]] | |
17940 | Profiling | |
17941 | ========= | |
17942 | ||
17943 | With MLton and `mlprof`, you can profile your program to find out | |
17944 | bytes allocated, execution counts, or time spent in each function. To | |
17945 | profile your program, compile with ++-profile __kind__++, where _kind_ | |
17946 | is one of `alloc`, `count`, or `time`. Then, run the executable, | |
17947 | which will write an `mlmon.out` file when it finishes. You can then | |
17948 | run `mlprof` on the executable and the `mlmon.out` file to see the | |
17949 | performance data. | |
17950 | ||
17951 | Here are the three kinds of profiling that MLton supports. | |
17952 | ||
17953 | * <:ProfilingAllocation:> | |
17954 | * <:ProfilingCounts:> | |
17955 | * <:ProfilingTime:> | |
17956 | ||
17957 | == Next steps == | |
17958 | ||
17959 | * <:CallGraph:>s to visualize profiling data. | |
17960 | * <:HowProfilingWorks:> | |
17961 | * <:MLmon:> | |
17962 | * <:MLtonProfile:> to selectively profile parts of your program. | |
17963 | * <:ProfilingTheStack:> | |
17964 | * <:ShowProf:> | |
17965 | ||
17966 | <<< | |
17967 | ||
17968 | :mlton-guide-page: ProfilingAllocation | |
17969 | [[ProfilingAllocation]] | |
17970 | ProfilingAllocation | |
17971 | =================== | |
17972 | ||
17973 | With MLton and `mlprof`, you can <:Profiling:profile> your program to | |
17974 | find out how many bytes each function allocates. To do so, compile | |
17975 | your program with `-profile alloc`. For example, suppose that | |
17976 | `list-rev.sml` is the following. | |
17977 | ||
17978 | [source,sml] | |
17979 | ---- | |
17980 | sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml] | |
17981 | ---- | |
17982 | ||
17983 | Compile and run `list-rev` as follows. | |
17984 | ---- | |
17985 | % mlton -profile alloc list-rev.sml | |
17986 | % ./list-rev | |
17987 | % mlprof -show-line true list-rev mlmon.out | |
17988 | 6,030,136 bytes allocated (108,336 bytes by GC) | |
17989 | function cur | |
17990 | ----------------------- ----- | |
17991 | append list-rev.sml: 1 97.6% | |
17992 | <gc> 1.8% | |
17993 | <main> 0.4% | |
17994 | rev list-rev.sml: 6 0.2% | |
17995 | ---- | |
17996 | ||
17997 | The data shows that most of the allocation is done by the `append` | |
17998 | function defined on line 1 of `list-rev.sml`. The table also shows | |
17999 | how special functions like `gc` and `main` are handled: they are | |
18000 | printed with surrounding brackets. C functions are displayed | |
18001 | similarly. In this example, the allocation done by the garbage | |
18002 | collector is due to stack growth, which is usually the case. | |
18003 | ||
18004 | The run-time performance impact of allocation profiling is noticeable, | |
18005 | because it inserts additional C calls for object allocation. | |
18006 | ||
18007 | Compile with `-profile alloc -profile-branch true` to find out how | |
18008 | much allocation is done in each branch of a function; see | |
18009 | <:ProfilingCounts:> for more details on `-profile-branch`. | |
18010 | ||
18011 | <<< | |
18012 | ||
18013 | :mlton-guide-page: ProfilingCounts | |
18014 | [[ProfilingCounts]] | |
18015 | ProfilingCounts | |
18016 | =============== | |
18017 | ||
18018 | With MLton and `mlprof`, you can <:Profiling:profile> your program to | |
18019 | find out how many times each function is called and how many times | |
18020 | each branch is taken. To do so, compile your program with | |
18021 | `-profile count -profile-branch true`. For example, suppose that | |
18022 | `tak.sml` contains the following. | |
18023 | ||
18024 | [source,sml] | |
18025 | ---- | |
18026 | sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml] | |
18027 | ---- | |
18028 | ||
18029 | Compile with count profiling and run the program. | |
18030 | ---- | |
18031 | % mlton -profile count -profile-branch true tak.sml | |
18032 | % ./tak | |
18033 | ---- | |
18034 | ||
18035 | Display the profiling data, along with raw counts and file positions. | |
18036 | ---- | |
18037 | % mlprof -raw true -show-line true tak mlmon.out | |
18038 | 623,610,002 ticks | |
18039 | function cur raw | |
18040 | --------------------------------- ----- ------------- | |
18041 | Tak.tak1.tak2 tak.sml: 5 38.2% (238,530,000) | |
18042 | Tak.tak1.tak2.<true> tak.sml: 7 27.5% (171,510,000) | |
18043 | Tak.tak1 tak.sml: 3 10.7% (67,025,000) | |
18044 | Tak.tak1.<true> tak.sml: 14 10.7% (67,025,000) | |
18045 | Tak.tak1.tak2.<false> tak.sml: 9 10.7% (67,020,000) | |
18046 | Tak.tak1.<false> tak.sml: 16 2.0% (12,490,000) | |
18047 | f tak.sml: 23 0.0% (5,001) | |
18048 | f.<branch> tak.sml: 25 0.0% (5,000) | |
18049 | f.<branch> tak.sml: 23 0.0% (1) | |
18050 | uncalled tak.sml: 29 0.0% (0) | |
18051 | f.<branch> tak.sml: 24 0.0% (0) | |
18052 | ---- | |
18053 | ||
18054 | Branches are displayed with lexical nesting followed by `<branch>` | |
18055 | where the function name would normally be, or `<true>` or `<false>` | |
18056 | for if-expressions. It is best to run `mlprof` with `-show-line true` | |
18057 | to help identify the branch. | |
18058 | ||
18059 | One use of `-profile count` is as a code-coverage tool, to help find | |
18060 | code in your program that hasn't been tested. For this reason, | |
18061 | `mlprof` displays functions and branches even if they have a count of | |
18062 | zero. As the above output shows, the branch on line 24 was never | |
18063 | taken and the function defined on line 29 was never called. To see | |
18064 | zero counts, it is best to run `mlprof` with `-raw true`, since some | |
18065 | code (e.g. the branch on line 23 above) will show up with `0.0%` but | |
18066 | may still have been executed and hence have a nonzero raw count. | |
18067 | ||
18068 | <<< | |
18069 | ||
18070 | :mlton-guide-page: ProfilingTheStack | |
18071 | [[ProfilingTheStack]] | |
18072 | ProfilingTheStack | |
18073 | ================= | |
18074 | ||
18075 | For all forms of <:Profiling:>, you can gather counts for all | |
18076 | functions on the stack, not just the currently executing function. To | |
18077 | do so, compile your program with `-profile-stack true`. For example, | |
18078 | suppose that `list-rev.sml` contains the following. | |
18079 | ||
18080 | [source,sml] | |
18081 | ---- | |
18082 | sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/list-rev.sml] | |
18083 | ---- | |
18084 | ||
18085 | Compile with stack profiling and then run the program. | |
18086 | ---- | |
18087 | % mlton -profile alloc -profile-stack true list-rev.sml | |
18088 | % ./list-rev | |
18089 | ---- | |
18090 | ||
18091 | Display the profiling data. | |
18092 | ---- | |
18093 | % mlprof -show-line true list-rev mlmon.out | |
18094 | 6,030,136 bytes allocated (108,336 bytes by GC) | |
18095 | function cur stack GC | |
18096 | ----------------------- ----- ----- ---- | |
18097 | append list-rev.sml: 1 97.6% 97.6% 1.4% | |
18098 | <gc> 1.8% 0.0% 1.8% | |
18099 | <main> 0.4% 98.2% 1.8% | |
18100 | rev list-rev.sml: 6 0.2% 97.6% 1.8% | |
18101 | ---- | |
18102 | ||
18103 | In the above table, we see that `rev`, defined on line 6 of | |
18104 | `list-rev.sml`, is only responsible for 0.2% of the allocation, but is | |
18105 | on the stack while 97.6% of the allocation is done by the user program | |
18106 | and while 1.8% of the allocation is done by the garbage collector. | |
18107 | ||
18108 | The run-time performance impact of `-profile-stack true` can be | |
18109 | noticeable since there is some extra bookkeeping at every nontail call | |
18110 | and return. | |
18111 | ||
18112 | <<< | |
18113 | ||
18114 | :mlton-guide-page: ProfilingTime | |
18115 | [[ProfilingTime]] | |
18116 | ProfilingTime | |
18117 | ============= | |
18118 | ||
18119 | With MLton and `mlprof`, you can <:Profiling:profile> your program to | |
18120 | find out how much time is spent in each function over an entire run of | |
18121 | the program. To do so, compile your program with `-profile time`. | |
18122 | For example, suppose that `tak.sml` contains the following. | |
18123 | ||
18124 | [source,sml] | |
18125 | ---- | |
18126 | sys::[./bin/InclGitFile.py mlton master doc/examples/profiling/tak.sml] | |
18127 | ---- | |
18128 | ||
18129 | Compile with time profiling and run the program. | |
18130 | ---- | |
18131 | % mlton -profile time tak.sml | |
18132 | % ./tak | |
18133 | ---- | |
18134 | ||
18135 | Display the profiling data. | |
18136 | ---- | |
18137 | % mlprof tak mlmon.out | |
18138 | 6.00 seconds of CPU time (0.00 seconds GC) | |
18139 | function cur | |
18140 | ------------- ----- | |
18141 | Tak.tak1.tak2 75.8% | |
18142 | Tak.tak1 24.2% | |
18143 | ---- | |
18144 | ||
18145 | This example shows how `mlprof` indicates lexical nesting: as a | |
18146 | sequence of period-separated names indicating the structures and | |
18147 | functions in which a function definition is nested. The profiling | |
18148 | data shows that roughly three-quarters of the time is spent in the | |
18149 | `Tak.tak1.tak2` function, while the rest is spent in `Tak.tak1`. | |
18150 | ||
18151 | Display raw counts in addition to percentages with `-raw true`. | |
18152 | ---- | |
18153 | % mlprof -raw true tak mlmon.out | |
18154 | 6.00 seconds of CPU time (0.00 seconds GC) | |
18155 | function cur raw | |
18156 | ------------- ----- ------- | |
18157 | Tak.tak1.tak2 75.8% (4.55s) | |
18158 | Tak.tak1 24.2% (1.45s) | |
18159 | ---- | |
18160 | ||
18161 | Display the file name and line number for each function in addition to | |
18162 | its name with `-show-line true`. | |
18163 | ---- | |
18164 | % mlprof -show-line true tak mlmon.out | |
18165 | 6.00 seconds of CPU time (0.00 seconds GC) | |
18166 | function cur | |
18167 | ------------------------- ----- | |
18168 | Tak.tak1.tak2 tak.sml: 5 75.8% | |
18169 | Tak.tak1 tak.sml: 3 24.2% | |
18170 | ---- | |
18171 | ||
18172 | Time profiling is designed to have a very small performance impact. | |
18173 | However, in some cases there will be a run-time performance cost, | |
18174 | which may perturb the results. There is more likely to be an impact | |
18175 | with `-codegen c` than `-codegen native`. | |
18176 | ||
18177 | You can also compile with `-profile time -profile-branch true` to find | |
18178 | out how much time is spent in each branch of a function; see | |
18179 | <:ProfilingCounts:> for more details on `-profile-branch`. | |
18180 | ||
18181 | ||
18182 | == Caveats == | |
18183 | ||
18184 | With `-profile time`, use of the following in your program will cause | |
18185 | a run-time error, since they would interfere with the profiler signal | |
18186 | handler. | |
18187 | ||
18188 | * `MLton.Itimer.set (MLton.Itimer.Prof, ...)` | |
18189 | * `MLton.Signal.setHandler (MLton.Signal.prof, ...)` | |
18190 | ||
18191 | Also, because of the random sampling used to implement `-profile | |
18192 | time`, it is best to have a long running program (at least tens of | |
18193 | seconds) in order to get reasonable timing data. | |
18194 | ||
18195 | <<< | |
18196 | ||
18197 | :mlton-guide-page: Projects | |
18198 | [[Projects]] | |
18199 | Projects | |
18200 | ======== | |
18201 | ||
18202 | We have lots of ideas for projects to improve MLton, many of which we | |
18203 | do not have time to implement, or at least haven't started on yet. | |
18204 | Here is a list of some of those improvements, ranging from the easy (1 | |
18205 | week) to the difficult (several months). If you have any interest in | |
18206 | working on one of these, or some other improvement to MLton not listed | |
18207 | here, please send mail to | |
18208 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]. | |
18209 | ||
18210 | * Port to new platform: Windows (native, not Cygwin or MinGW), ... | |
18211 | * Source-level debugger | |
18212 | * Heap profiler | |
18213 | * Interfaces to libraries: OpenGL, Gtk+, D-BUS, ... | |
18214 | * More libraries written in SML (see <!ViewGitProj(mltonlib)>) | |
18215 | * Additional constant types: `structure Real80: REAL`, ... | |
18216 | * An IDE (possibly integrated with <:Eclipse:>) | |
18217 | * Port MLRISC and use for code generation | |
18218 | * Optimizations | |
18219 | ** Improved closure representation | |
18220 | + | |
18221 | Right now, MLton's closure conversion algorithm uses a simple flat closure to represent each function. | |
18222 | + | |
18223 | *** http://www.mlton.org/pipermail/mlton/2003-October/024570.html | |
18224 | *** http://www.mlton.org/pipermail/mlton-user/2007-July/001150.html | |
18225 | *** <!Cite(ShaoAppel94)> | |
18226 | ** Elimination of array bounds checks in loops | |
18227 | ** Elimination of overflow checks on array index computations | |
18228 | ** Common-subexpression elimination of repeated array subscripts | |
18229 | ** Loop-invariant code motion, especially for tuple selects | |
18230 | ** Partial redundancy elimination | |
18231 | *** http://www.mlton.org/pipermail/mlton/2006-April/028598.html | |
18232 | ** Loop unrolling, especially for small loops | |
18233 | ** Auto-vectorization, for MMX/SSE/3DNow!/AltiVec (see the http://gcc.gnu.org/projects/tree-ssa/vectorization.html[work done on GCC]) | |
18234 | ** Optimize `MLton_eq`: pointer equality is necessarily false when one of the arguments is freshly allocated in the block | |
18235 | * Analyses | |
18236 | ** Uncaught exception analysis | |
18237 | ||
18238 | <<< | |
18239 | ||
18240 | :mlton-guide-page: Pronounce | |
18241 | [[Pronounce]] | |
18242 | Pronounce | |
18243 | ========= | |
18244 | ||
18245 | Here is <!Attachment(Pronounce,pronounce-mlton.mp3,how "MLton" sounds)>. | |
18246 | ||
18247 | "MLton" is pronounced in two syllables, with stress on the first | |
18248 | syllable. The first syllable sounds like the word _mill_ (as in | |
18249 | "steel mill"), the second like the word _tin_ (as in "cookie tin"). | |
18250 | ||
18251 | <<< | |
18252 | ||
18253 | :mlton-guide-page: PropertyList | |
18254 | [[PropertyList]] | |
18255 | PropertyList | |
18256 | ============ | |
18257 | ||
18258 | A property list is a dictionary-like data structure into which | |
18259 | properties (name-value pairs) can be inserted and from which | |
18260 | properties can be looked up by name. The term comes from the Lisp | |
18261 | language, where every symbol has a property list for storing | |
18262 | information, and where the names are typically symbols and the values | |
18263 | can be of any type. | |
18264 | ||
18265 | Here is an SML signature for property lists such that for any type of | |
18266 | value a new property can be dynamically created to manipulate that | |
18267 | type of value in a property list. | |
18268 | ||
18269 | [source,sml] | |
18270 | ---- | |
18271 | signature PROPERTY_LIST = | |
18272 | sig | |
18273 | type t | |
18274 | ||
18275 | val new: unit -> t | |
18276 | val newProperty: unit -> {add: t * 'a -> unit, | |
18277 | peek: t -> 'a option} | |
18278 | end | |
18279 | ---- | |
18280 | ||
18281 | Here is a functor demonstrating the use of property lists. It first | |
18282 | creates a property list, then two new properties (of different types), | |
18283 | and adds a value to the list for each property. | |
18284 | ||
18285 | [source,sml] | |
18286 | ---- | |
18287 | functor Test (P: PROPERTY_LIST) = | |
18288 | struct | |
18289 | val pl = P.new () | |
18290 | ||
18291 | val {add = addInt: P.t * int -> unit, peek = peekInt} = P.newProperty () | |
18292 | val {add = addReal: P.t * real -> unit, peek = peekReal} = P.newProperty () | |
18293 | ||
18294 | val () = addInt (pl, 13) | |
18295 | val () = addReal (pl, 17.0) | |
18296 | val s1 = Int.toString (valOf (peekInt pl)) | |
18297 | val s2 = Real.toString (valOf (peekReal pl)) | |
18298 | val () = print (concat [s1, " ", s2, "\n"]) | |
18299 | end | |
18300 | ---- | |
18301 | ||
18302 | Applied to an appropriate implementation of `PROPERTY_LIST`, the `Test` | |
18303 | functor will produce the following output. | |
18304 | ||
18305 | ---- | |
18306 | 13 17.0 | |
18307 | ---- | |
18308 | ||
18309 | ||
18310 | == Implementation == | |
18311 | ||
18312 | Because property lists can hold values of any type, their | |
18313 | implementation requires a <:UniversalType:>. Given that, a property | |
18314 | list is simply a list of elements of the universal type. Adding a | |
18315 | property adds to the front of the list, and looking up a property | |
18316 | scans the list. | |
18317 | ||
18318 | [source,sml] | |
18319 | ---- | |
18320 | functor PropertyList (U: UNIVERSAL_TYPE): PROPERTY_LIST = | |
18321 | struct | |
18322 | datatype t = T of U.t list ref | |
18323 | ||
18324 | fun new () = T (ref []) | |
18325 | ||
18326 | fun 'a newProperty () = | |
18327 | let | |
18328 | val (inject, out) = U.embed () | |
18329 | fun add (T r, a: 'a): unit = r := inject a :: (!r) | |
18330 | fun peek (T r) = | |
18331 | Option.map (valOf o out) (List.find (isSome o out) (!r)) | |
18332 | in | |
18333 | {add = add, peek = peek} | |
18334 | end | |
18335 | end | |
18336 | ---- | |
18337 | ||
18338 | ||
18339 | If `U: UNIVERSAL_TYPE`, then we can test our code as follows. | |
18340 | ||
18341 | [source,sml] | |
18342 | ---- | |
18343 | structure Z = Test (PropertyList (U)) | |
18344 | ---- | |
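
One possible `U` is sketched below, assuming the exception-based encoding that is commonly used for universal types in SML. The signature is restated so that the fragment stands alone; the names `intIn`, `intOut`, `realIn`, and `realOut` are chosen here for illustration only.

```sml
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

(* Assumption: exn serves as the universal type, because every call to
 * embed can declare a fresh exception constructor. *)
structure U:> UNIVERSAL_TYPE =
   struct
      type t = exn

      fun 'a embed () =
         let
            exception E of 'a
            fun project (e: t): 'a option =
               case e of
                  E a => SOME a
                | _ => NONE
         in
            (E, project)
         end
   end

(* Each embed call yields an independent injection/projection pair. *)
val (intIn: int -> U.t, intOut) = U.embed ()
val (realIn: real -> U.t, realOut) = U.embed ()

val u = intIn 13
val a = intOut u      (* SOME 13 *)
val b = realOut u     (* NONE: u was not injected by realIn *)
```

Because a projection only recognizes values built by its own injection, two properties of the same type created by separate `embed` calls do not interfere with each other.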
18345 | ||
18346 | Of course, a serious implementation of property lists would have to | |
18347 | handle duplicate insertions of the same property, as well as the | |
18348 | removal of elements in order to avoid space leaks. | |
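
A minimal sketch of those two refinements follows. It is not MLton's own implementation: the functor name `PropertyListWithRemove` and the extra `remove` field are hypothetical, and the result no longer matches `PROPERTY_LIST` because of that extra field. Here `add` replaces any earlier value of the same property instead of shadowing it, and `remove` drops the property's element from the list.

```sml
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

(* Hypothetical variant of the PropertyList functor. *)
functor PropertyListWithRemove (U: UNIVERSAL_TYPE) =
   struct
      datatype t = T of U.t list ref

      fun new () = T (ref [])

      fun 'a newProperty () =
         let
            val (inject, out) = U.embed ()
            (* Drop every element that belongs to this property. *)
            fun remove (T r): unit =
               r := List.filter (not o isSome o out) (!r)
            (* Replace, rather than shadow, an earlier value. *)
            fun add (pl as T r, a: 'a): unit =
               (remove pl; r := inject a :: (!r))
            (* Return the first element this property can project. *)
            fun peek (T r): 'a option =
               case List.mapPartial out (!r) of
                  [] => NONE
                | a :: _ => SOME a
         in
            {add = add, peek = peek, remove = remove}
         end
   end
```

With this variant the list holds at most one element per property, so repeated `add` calls no longer grow it, at the cost of a linear scan on every insertion.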
18349 | ||
18350 | == Also see == | |
18351 | ||
18352 | * MLton relies heavily on property lists for attaching information to | |
18353 | syntax tree nodes in its intermediate languages. See | |
18354 | <!ViewGitFile(mlton,master,lib/mlton/basic/property-list.sig)> and | |
18355 | <!ViewGitFile(mlton,master,lib/mlton/basic/property-list.fun)>. | |
18356 | ||
18357 | * The <:MLRISCLibrary:> <!Cite(LeungGeorge99, uses property lists | |
18358 | extensively)>. | |
18359 | ||
18360 | <<< | |
18361 | ||
18362 | :mlton-guide-page: Pygments | |
18363 | [[Pygments]] | |
18364 | Pygments | |
18365 | ======== | |
18366 | ||
18367 | http://pygments.org/[Pygments] is a generic syntax highlighter. Here is a _lexer_ for highlighting | |
18368 | <:StandardML: Standard ML>. | |
18369 | ||
18370 | * <!ViewGitDir(mlton,master,ide/pygments/sml_lexer)> -- Provides highlighting of keywords, special constants, and (nested) comments. | |
18371 | ||
18372 | == Install and use == | |
18373 | * Check out all files and install as a http://pygments.org/[Pygments] plugin. | |
18374 | + | |
18375 | ---- | |
18376 | $ git clone https://github.com/MLton/mlton.git mlton | |
18377 | $ cd mlton/ide/pygments | |
18378 | $ python setup.py install | |
18379 | ---- | |
18380 | ||
18381 | * Invoke `pygmentize` with `-l sml`. | |
18382 | ||
18383 | == Feedback == | |
18384 | ||
18385 | Comments and suggestions should be directed to <:MatthewFluet:>. | |
18386 | ||
18387 | <<< | |
18388 | ||
18389 | :mlton-guide-page: RayRacine | |
18390 | [[RayRacine]] | |
18391 | RayRacine | |
18392 | ========= | |
18393 | ||
18394 | Using SML in some _Semantic Web_ stuff. Anyone interested in | |
18395 | similar, please contact me. GreyLensman on #sml on IRC or rracine at | |
18396 | this domain adelphia with a dot here net. | |
18397 | ||
18398 | Current areas of coding. | |
18399 | ||
18400 | . Pretty solid, high performance Rete implementation - base functionality is complete. | |
18401 | . N3 parser - mostly complete | |
18402 | . RDF parser based on fxg - not started. | |
18403 | . Swerve HTTP server - 1/2 done. | |
18404 | . SPARQL implementation - not started. | |
18405 | . Persistent engine based on BerkeleyDB - not started. | |
18406 | . Native implementation of Postgresql protocol - underway, ways to go. | |
18407 | . I also have a small change to the MLton compiler to add ++PackWord__<N>__++ - changes compile but needs some more work, clean-up and unit tests. | |
18408 | ||
18409 | <<< | |
18410 | ||
18411 | :mlton-guide-page: Reachability | |
18412 | [[Reachability]] | |
18413 | Reachability | |
18414 | ============ | |
18415 | ||
18416 | Reachability is a notion dealing with the graph of heap objects | |
18417 | maintained at runtime. Nodes in the graph are heap objects and edges | |
18418 | correspond to the pointers between heap objects. As the program runs, | |
18419 | it allocates new objects (adds nodes to the graph), and those new | |
18420 | objects can contain pointers to other objects (new edges in the | |
18421 | graph). If the program uses mutable objects (refs or arrays), it can | |
18422 | also change edges in the graph. | |
18423 | ||
18424 | At any time, the program has access to some finite set of _root_ | |
18425 | nodes, and can only ever access nodes that are reachable by following | |
18426 | edges from these root nodes. Nodes that are _unreachable_ can be | |
18427 | garbage collected. | |
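
The definition can be made concrete with a small sketch. The heap representation below (integer node ids and a vector of successor lists) is an assumption made for illustration and says nothing about MLton's actual runtime layout; `reachable`, `live`, and `garbage` are likewise illustrative names.

```sml
(* A toy heap: node i points to the nodes listed at index i.
 * Nodes 3 and 4 point to each other but are not reachable from node 0. *)
val heap: int list vector =
   Vector.fromList [[1, 2], [2], [], [4], [3]]

(* Depth-first traversal from the roots; returns the reachable ids
 * in increasing order. *)
fun reachable (heap: int list vector, roots: int list): int list =
   let
      val seen = Array.array (Vector.length heap, false)
      fun visit n =
         if Array.sub (seen, n)
            then ()
         else (Array.update (seen, n, true)
               ; List.app visit (Vector.sub (heap, n)))
      val () = List.app visit roots
   in
      List.filter (fn n => Array.sub (seen, n))
                  (List.tabulate (Vector.length heap, fn n => n))
   end

val live = reachable (heap, [0])
val garbage =
   List.filter (fn n => not (List.exists (fn m => m = n) live))
               (List.tabulate (Vector.length heap, fn n => n))
```

With node 0 as the only root, `live` is `[0, 1, 2]` and `garbage` is `[3, 4]`: the cycle between nodes 3 and 4 does not keep either of them alive.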
18428 | ||
18429 | == Also see == | |
18430 | ||
18431 | * <:MLtonFinalizable:> | |
18432 | * <:MLtonWeak:> | |
18433 | ||
18434 | <<< | |
18435 | ||
18436 | :mlton-guide-page: Redundant | |
18437 | [[Redundant]] | |
18438 | Redundant | |
18439 | ========= | |
18440 | ||
18441 | <:Redundant:> is an optimization pass for the <:SSA:> | |
18442 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
18443 | ||
18444 | == Description == | |
18445 | ||
18446 | The redundant SSA optimization eliminates redundant function and label | |
18447 | arguments; an argument of a function or label is redundant if it is | |
18448 | always the same as another argument of the same function or label. | |
18449 | The analysis finds an equivalence relation on the arguments of a | |
18450 | function or label, such that all arguments in an equivalence class are | |
18451 | redundant with respect to the other arguments in the equivalence | |
18452 | class; the transformation selects one representative of each | |
18453 | equivalence class and drops the binding occurrence of | |
18454 | non-representative variables and renames use occurrences of the | |
18455 | non-representative variables to the representative variable. The | |
18456 | analysis finds the equivalence classes via a fixed-point analysis. | |
18457 | Each vector of arguments to a function or label is initialized to | |
18458 | equivalence classes that equate all arguments of the same type; one | |
18459 | could start with an equivalence class that equates all arguments, but | |
18460 | arguments of different type cannot be redundant. Variables bound in | |
18461 | statements are initialized to singleton equivalence classes. The | |
18462 | fixed-point analysis repeatedly refines these equivalence classes on | |
18463 | the formals by the equivalence classes of the actuals. | |
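
The refinement step can be illustrated with a deliberately simplified sketch. It handles a single function or label, ignores types, compares actuals only by name, and performs one round instead of iterating to a fixed point, so it is not the algorithm in `redundant.fun`. The names `call`, `sameAtAllCalls`, and `classes` are hypothetical.

```sml
(* Formals are identified by position; each call site supplies one
 * actual (a variable name) per position. *)
type call = string list

(* Two positions stay equivalent only if every call passes the same
 * actual in both positions. *)
fun sameAtAllCalls (calls: call list) (i: int, j: int): bool =
   List.all (fn actuals => List.nth (actuals, i) = List.nth (actuals, j))
            calls

(* Group positions 0 .. n-1 into classes; the first member of each
 * class acts as its representative. *)
fun classes (n: int, calls: call list): int list list =
   let
      fun insert (i, []) = [[i]]
        | insert (i, (c as rep :: _) :: cs) =
             if sameAtAllCalls calls (rep, i)
                then (c @ [i]) :: cs
             else c :: insert (i, cs)
        | insert (i, [] :: cs) = insert (i, cs)
   in
      List.foldl insert [] (List.tabulate (n, fn i => i))
   end

(* Two call sites of a three-argument function. *)
val calls = [["x", "x", "y"], ["z", "z", "z"]]
val result = classes (3, calls)
```

Here `result` is `[[0, 1], [2]]`: the second formal always receives the same actual as the first, so it could be dropped and its uses renamed to the first, while the third formal must be kept.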
18464 | ||
18465 | == Implementation == | |
18466 | ||
18467 | * <!ViewGitFile(mlton,master,mlton/ssa/redundant.fun)> | |
18468 | ||
18469 | == Details and Notes == | |
18470 | ||
18471 | <:Redundant:> was added because some output of the | |
18472 | <:ClosureConvert:> pass passed the environment record, or | |
18473 | components of it, around in several places. That may have | |
18474 | been more relevant with polyvariant analyses (which are long gone). | |
18475 | But it still seems possibly relevant, especially with more aggressive | |
18476 | flattening, which should reveal some fields in nested closure records | |
18477 | that are redundant. | |
18478 | ||
18479 | <<< | |
18480 | ||
18481 | :mlton-guide-page: RedundantTests | |
18482 | [[RedundantTests]] | |
18483 | RedundantTests | |
18484 | ============== | |
18485 | ||
18486 | <:RedundantTests:> is an optimization pass for the <:SSA:> | |
18487 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
18488 | ||
18489 | == Description == | |
18490 | ||
18491 | This pass simplifies conditionals whose results are implied by a | |
18492 | previous conditional test. | |
18493 | ||
18494 | == Implementation == | |
18495 | ||
18496 | * <!ViewGitFile(mlton,master,mlton/ssa/redundant-tests.fun)> | |
18497 | ||
18498 | == Details and Notes == | |
18499 | ||
18500 | An additional test will sometimes eliminate the overflow test when | |
18501 | adding or subtracting 1. In particular, it will eliminate it in the | |
18502 | following cases: | |
18503 | [source,sml] | |
18504 | ---- | |
18505 | if x < y | |
18506 | then ... x + 1 ... | |
18507 | else ... y - 1 ... | |
18508 | ---- | |
18509 | ||
18510 | <<< | |
18511 | ||
18512 | :mlton-guide-page: References | |
18513 | [[References]] | |
18514 | References | |
18515 | ========== | |
18516 | ||
18517 | <:#AAA:A> | |
18518 | <:#BBB:B> | |
18519 | <:#CCC:C> | |
18520 | <:#DDD:D> | |
18521 | <:#EEE:E> | |
18522 | <:#FFF:F> | |
18523 | <:#GGG:G> | |
18524 | <:#HHH:H> | |
18525 | <:#III:I> | |
18526 | <:#JJJ:J> | |
18527 | <:#KKK:K> | |
18528 | <:#LLL:L> | |
18529 | <:#MMM:M> | |
18530 | <:#NNN:N> | |
18531 | <:#OOO:O> | |
18532 | <:#PPP:P> | |
18533 | <:#QQQ:Q> | |
18534 | <:#RRR:R> | |
18535 | <:#SSS:S> | |
18536 | <:#TTT:T> | |
18537 | <:#UUU:U> | |
18538 | <:#VVV:V> | |
18539 | <:#WWW:W> | |
18540 | <:#XXX:X> | |
18541 | <:#YYY:Y> | |
18542 | <:#ZZZ:Z> | |
18543 | ||
18544 | == <!Anchor(AAA)>A == | |
18545 | ||
18546 | * <!Anchor(AcarEtAl06)> | |
18547 | http://www.umut-acar.org/publications/pldi2006.pdf[An Experimental Analysis of Self-Adjusting Computation] | |
18548 | Umut Acar, Guy Blelloch, Matthias Blume, and Kanat Tangwongsan. | |
18549 | <:#PLDI:> 2006. | |
18550 | ||
18551 | * <!Anchor(Appel92)> | |
18552 | http://us.cambridge.org/titles/catalogue.asp?isbn=0521416957[Compiling with Continuations] | |
18553 | (http://www.addall.com/New/submitNew.cgi?query=0-521-41695-7&type=ISBN&location=10000&state=&dispCurr=USD[addall]). | |
18554 | ISBN 0521416957. | |
18555 | Andrew W. Appel. | |
18556 | Cambridge University Press, 1992. | |
18557 | ||
18558 | * <!Anchor(Appel93)> | |
18559 | http://www.cs.princeton.edu/research/techreps/TR-364-92[A Critique of Standard ML]. | |
18560 | Andrew W. Appel. | |
18561 | <:#JFP:> 1993. | |
18562 | ||
18563 | * <!Anchor(Appel98)> | |
18564 | http://us.cambridge.org/titles/catalogue.asp?isbn=0521582741[Modern Compiler Implementation in ML] | |
18565 | (http://www.addall.com/New/submitNew.cgi?query=0-521-58274-1&type=ISBN&location=10000&state=&dispCurr=USD[addall]). | |
18566 | ISBN 0521582741. | |
18567 | Andrew W. Appel. | |
18568 | Cambridge University Press, 1998. | |
18569 | ||
18570 | * <!Anchor(AppelJim97)> | |
18571 | http://ncstrl.cs.princeton.edu/expand.php?id=TR-556-97[Shrinking Lambda Expressions in Linear Time] | |
18572 | Andrew Appel and Trevor Jim. | |
18573 | <:#JFP:> 1997. | |
18574 | ||
18575 | * <!Anchor(AppelEtAl94)> | |
18576 | http://www.smlnj.org/doc/ML-Lex/manual.html[A lexical analyzer generator for Standard ML. Version 1.6.0] | |
18577 | Andrew W. Appel, James S. Mattson, and David R. Tarditi. 1994. | |
18578 | ||
18579 | == <!Anchor(BBB)>B == | |
18580 | ||
18581 | * <!Anchor(BaudinetMacQueen85)> | |
18582 | http://www.classes.cs.uchicago.edu/archive/2011/spring/22620-1/papers/macqueen-baudinet85.pdf[Tree Pattern Matching for ML]. | |
18583 | Marianne Baudinet, David MacQueen. 1985. | |
18584 | + | |
18585 | ____ | |
18586 | Describes the match compiler used in an early version of | |
18587 | <:SMLNJ:SML/NJ>. | |
18588 | ____ | |
18589 | ||
18590 | * <!Anchor(BentonEtAl98)> | |
18591 | http://research.microsoft.com/en-us/um/people/nick/icfp98.pdf[Compiling Standard ML to Java Bytecodes]. | |
18592 | Nick Benton, Andrew Kennedy, and George Russell. | |
18593 | <:#ICFP:> 1998. | |
18594 | ||
18595 | * <!Anchor(BentonKennedy99)> | |
18596 | http://research.microsoft.com/en-us/um/people/nick/SMLJavaInterop.pdf[Interlanguage Working Without Tears: Blending SML with Java]. | |
18597 | Nick Benton and Andrew Kennedy. | |
18598 | <:#ICFP:> 1999. | |
18599 | ||
18600 | * <!Anchor(BentonKennedy01)> | |
18601 | http://research.microsoft.com/en-us/um/people/akenn/sml/ExceptionalSyntax.pdf[Exceptional Syntax]. | |
18602 | Nick Benton and Andrew Kennedy. | |
18603 | <:#JFP:> 2001. | |
18604 | ||
18605 | * <!Anchor(BentonEtAl04)> | |
18606 | http://research.microsoft.com/en-us/um/people/nick/p53-Benton.pdf[Adventures in Interoperability: The SML.NET Experience]. | |
18607 | Nick Benton, Andrew Kennedy, and Claudio Russo. | |
18608 | <:#PPDP:> 2004. | |
18609 | ||
18610 | * <!Anchor(BentonEtAl04_2)> | |
18611 | http://research.microsoft.com/en-us/um/people/nick/shrinking.pdf[Shrinking Reductions in SML.NET]. | |
18612 | Nick Benton, Andrew Kennedy, Sam Lindley and Claudio Russo. | |
18613 | <:#IFL:> 2004. | |
18614 | + | |
18615 | ____ | |
18616 | Describes a linear-time implementation of an | |
18617 | <!Cite(AppelJim97,Appel-Jim shrinker)>, using a mutable IL, and shows | |
18618 | that it yields nice speedups in SML.NET's compile times. There are | |
18619 | also benchmarks showing that SML.NET when compiled by MLton runs | |
18620 | roughly five times faster than when compiled by SML/NJ. | |
18621 | ____ | |
18622 | ||
18623 | * <!Anchor(Benton05)> | |
18624 | http://research.microsoft.com/en-us/um/people/nick/benton03.pdf[Embedded Interpreters]. | |
18625 | Nick Benton. | |
18626 | <:#JFP:> 2005. | |
18627 | ||
18628 | * <!Anchor(Berry91)> | |
18629 | http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-148/ECS-LFCS-91-148.pdf[The Edinburgh SML Library]. | |
18630 | Dave Berry. | |
18631 | University of Edinburgh Technical Report ECS-LFCS-91-148, 1991. | |
18632 | ||
18633 | * <!Anchor(BerryEtAl93)> | |
18634 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.36.7958&rep=rep1&type=ps[A semantics for ML concurrency primitives]. | |
18635 | Dave Berry, Robin Milner, and David N. Turner. | |
18636 | <:#POPL:> 1992. | |
18637 | ||
18638 | * <!Anchor(Berry93)> | |
18639 | http://journals.cambridge.org/abstract_S0956796800000873[Lessons From the Design of a Standard ML Library]. | |
18640 | Dave Berry. | |
18641 | <:#JFP:> 1993. | |
18642 | ||
18643 | * <!Anchor(Bertelsen98)> | |
18644 | http://www.petermb.dk/sml2jvm.ps.gz[Compiling SML to Java Bytecode]. | |
18645 | Peter Bertelsen. | |
18646 | Master's Thesis, 1998. | |
18647 | ||
18648 | * <!Anchor(Berthomieu00)> | |
18649 | http://homepages.laas.fr/bernard/oo/ooml.html[OO Programming styles in ML]. | |
18650 | Bernard Berthomieu. | |
18651 | LAAS Report #2000111, 2000. | |
18652 | ||
18653 | * <!Anchor(Blume01)> | |
18654 | http://people.cs.uchicago.edu/~blume/papers/nlffi-entcs.pdf[No-Longer-Foreign: Teaching an ML compiler to speak C "natively"]. | |
18655 | Matthias Blume. | |
18656 | <:#BABEL:> 2001. | |
18657 | ||
18658 | * <!Anchor(Blume01_02)> | |
18659 | http://people.cs.uchicago.edu/~blume/pgraph/proposal.pdf[Portable library descriptions for Standard ML]. | |
18660 | Matthias Blume. 2001. | |
18661 | ||
18662 | * <!Anchor(Boehm03)> | |
18663 | http://www.hpl.hp.com/techreports/2002/HPL-2002-335.html[Destructors, Finalizers, and Synchronization]. | |
18664 | Hans Boehm. | |
18665 | <:#POPL:> 2003. | |
18666 | + | |
18667 | ____ | |
18668 | Discusses a number of issues in the design of finalizers. Many of the | |
18669 | design choices are consistent with <:MLtonFinalizable:>. | |
18670 | ____ | |
18671 | ||
18672 | == <!Anchor(CCC)>C == | |
18673 | ||
18674 | * <!Anchor(CejtinEtAl00)> | |
18675 | http://www.cs.purdue.edu/homes/suresh/papers/icfp99.ps.gz[Flow-directed Closure Conversion for Typed Languages]. | |
18676 | Henry Cejtin, Suresh Jagannathan, and Stephen Weeks. | |
18677 | <:#ESOP:> 2000. | |
18678 | + | |
18679 | ____ | |
18680 | Describes MLton's closure-conversion algorithm, which translates from | |
18681 | its simply-typed higher-order intermediate language to its | |
18682 | simply-typed first-order intermediate language. | |
18683 | ____ | |
18684 | ||
18685 | * <!Anchor(ChengBlelloch01)> | |
18686 | http://www.cs.cmu.edu/afs/cs/project/pscico/pscico/papers/gc01/pldi-final.pdf[A Parallel, Real-Time Garbage Collector]. | |
18687 | Perry Cheng and Guy E. Blelloch. | |
18688 | <:#PLDI:> 2001. | |
18689 | ||
18690 | * <!Anchor(Claessen00)> | |
18691 | http://users.eecs.northwestern.edu/~robby/courses/395-495-2009-fall/quick.pdf[QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs]. | |
18692 | Koen Claessen and John Hughes. | |
18693 | <:#ICFP:> 2000. | |
18694 | ||
18695 | * <!Anchor(Clinger98)> | |
18696 | http://www.cesura17.net/~will/Professional/Research/Papers/tail.pdf[Proper Tail Recursion and Space Efficiency]. | |
18697 | William D. Clinger. | |
18698 | <:#PLDI:> 1998. | |
18699 | ||
18700 | * <!Anchor(CooperMorrisett90)> | |
18701 | http://www.eecs.harvard.edu/~greg/papers/jgmorris-mlthreads.ps[Adding Threads to Standard ML]. | |
18702 | Eric C. Cooper and J. Gregory Morrisett. | |
18703 | CMU Technical Report CMU-CS-90-186, 1990. | |
18704 | ||
18705 | * <!Anchor(CouttsEtAl07)> | |
18706 | http://metagraph.org/papers/stream_fusion.pdf[Stream Fusion: From Lists to Streams to Nothing at All]. | |
18707 | Duncan Coutts, Roman Leshchinskiy, and Don Stewart. | |
18708 | Submitted for publication. April 2007. | |
18709 | ||
18710 | == <!Anchor(DDD)>D == | |
18711 | ||
18712 | * <!Anchor(DamasMilner82)> | |
18713 | http://groups.csail.mit.edu/pag/6.883/readings/p207-damas.pdf[Principal Type-Schemes for Functional Programs]. | |
18714 | Luis Damas and Robin Milner. | |
18715 | <:#POPL:> 1982. | |
18716 | ||
18717 | * <!Anchor(Danvy98)> | |
18718 | http://www.brics.dk/RS/98/12[Functional Unparsing]. | |
18719 | Olivier Danvy. | |
18720 | BRICS Technical Report RS 98-12, 1998. | |
18721 | ||
18722 | * <!Anchor(Deboer05)> | |
18723 | http://alleystoughton.us/eXene/dusty-thesis.pdf[Exhancements to eXene]. | |
18724 | Dustin B. deBoer. | |
18725 | Master of Science Thesis, 2005. | |
18726 | + | |
18727 | ____ | |
18728 | Describes ways to improve widget concurrency, handling of input focus, | |
18729 | X resources and selections. | |
18730 | ____ | |
18731 | ||
18732 | * <!Anchor(DoligezLeroy93)> | |
18733 | http://cristal.inria.fr/~doligez/publications/doligez-leroy-popl-1993.pdf[A Concurrent, Generational Garbage Collector for a Multithreaded Implementation of ML]. | |
18734 | Damien Doligez and Xavier Leroy. | |
18735 | <:#POPL:> 1993. | |
18736 | ||
18737 | * <!Anchor(Dreyer07)> | |
18738 | http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf[Modular Type Classes]. | |
18739 | Derek Dreyer, Robert Harper, Manuel M.T. Chakravarty, Gabriele Keller. | |
18740 | University of Chicago Technical Report TR-2007-02, 2006. | |
18741 | ||
18742 | * <!Anchor(DreyerBlume07)> | |
18743 | http://www.mpi-sws.org/~dreyer/papers/infmod/main-long.pdf[Principal Type Schemes for Modular Programs]. | |
18744 | Derek Dreyer and Matthias Blume. | |
18745 | <:#ESOP:> 2007. | |
18746 | ||
18747 | * <!Anchor(Dubois95)> | |
18748 | ftp://ftp.inria.fr/INRIA/Projects/cristal/Francois.Rouaix/generics.dvi.Z[Extensional Polymorphism]. | |
18749 | Catherine Dubois, Francois Rouaix, and Pierre Weis. | |
18750 | <:#POPL:> 1995. | |
18751 | + | |
18752 | ____ | |
18753 | An extension of ML that allows the definition of ad-hoc polymorphic | |
18754 | functions by inspecting the type of their argument. | |
18755 | ____ | |
18756 | ||
18757 | == <!Anchor(EEE)>E == | |
18758 | ||
18759 | * <!Anchor(Elsman03)> | |
18760 | http://www.elsman.com/tldi03.pdf[Garbage Collection Safety for Region-based Memory Management]. | |
18761 | Martin Elsman. | |
18762 | <:#TLDI:> 2003. | |
18763 | ||
18764 | * <!Anchor(Elsman04)> | |
18765 | http://www.elsman.com/ITU-TR-2004-43.pdf[Type-Specialized Serialization with Sharing]. | |
18766 | Martin Elsman. University of Copenhagen. IT University Technical | |
18767 | Report TR-2004-43, 2004. | |
18768 | ||
18769 | == <!Anchor(FFF)>F == | |
18770 | ||
18771 | * <!Anchor(FelleisenFreidman98)> | |
18772 | http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=4787[The Little MLer] | |
18773 | (http://www3.addall.com/New/submitNew.cgi?query=026256114X&type=ISBN[addall]). | |
18774 | ISBN 026256114X. | |
18775 | Matthias Felleisen and Dan Friedman. | |
18776 | The MIT Press, 1998. | |
18777 | ||
18778 | * <!Anchor(FlattFindler04)> | |
18779 | http://www.cs.utah.edu/plt/kill-safe/[Kill-Safe Synchronization Abstractions]. | |
18780 | Matthew Flatt and Robert Bruce Findler. | |
18781 | <:#PLDI:> 2004. | |
18782 | ||
18783 | * <!Anchor(FluetWeeks01)> | |
18784 | http://www.cs.rit.edu/~mtf/research/contification[Contification Using Dominators]. | |
18785 | Matthew Fluet and Stephen Weeks. | |
18786 | <:#ICFP:> 2001. | |
18787 | + | |
18788 | ____ | |
18789 | Describes contification, a generalization of tail-recursion | |
18790 | elimination that is an optimization operating on MLton's static single | |
18791 | assignment (SSA) intermediate language. | |
18792 | ____ | |
18793 | ||
18794 | * <!Anchor(FluetPucella06)> | |
18795 | http://www.cs.rit.edu/~mtf/research/phantom-subtyping/jfp06/jfp06.pdf[Phantom Types and Subtyping]. | |
18796 | Matthew Fluet and Riccardo Pucella. | |
18797 | <:#JFP:> 2006. | |
18798 | ||
18799 | * <!Anchor(Furuse01)> | |
18800 | http://jfla.inria.fr/2001/actes/07-furuse.ps[Generic Polymorphism in ML]. | |
18801 | J{empty}. Furuse. | |
18802 | <:#JFLA:> 2001. | |
18803 | + | |
18804 | ____ | |
18805 | The formalism behind G'CAML, which has an approach to ad-hoc | |
18806 | polymorphism based on <!Cite(Dubois95)>, the differences being in how | |
18807 | type checking works and an improved compilation approach for typecase | |
18808 | that does the matching at compile time, not run time. | |
18809 | ____ | |
18810 | ||
18811 | == <!Anchor(GGG)>G == | |
18812 | ||
18813 | * <!Anchor(GansnerReppy93)> | |
18814 | http://alleystoughton.us/eXene/1993-trends.pdf[A Multi-Threaded Higher-order User Interface Toolkit]. | |
18815 | Emden R. Gansner and John H. Reppy. | |
18816 | User Interface Software, 1993. | |
18817 | ||
18818 | * <!Anchor(GansnerReppy04)> | |
18819 | http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/standard-ml-basis-library[The Standard ML Basis Library]. | |
18820 | (http://www3.addall.com/New/submitNew.cgi?query=9780521794787&type=ISBN[addall]) | |
18821 | ISBN 9780521794787. | |
18822 | Emden R. Gansner and John H. Reppy. | |
18823 | Cambridge University Press, 2004. | |
18824 | + | |
18825 | ____ | |
18826 | An introduction and overview of the <:BasisLibrary:Basis Library>, | |
18827 | followed by a detailed description of each module. The module | |
18828 | descriptions are also available | |
18829 | http://www.standardml.org/Basis[online]. | |
18830 | ____ | |
18831 | ||
18832 | * <!Anchor(GrossmanEtAl02)> | |
18833 | http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf[Region-based Memory Management in Cyclone]. | |
18834 | Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling | |
18835 | Wang, and James Cheney. | |
18836 | <:#PLDI:> 2002. | |
18837 | ||
18838 | == <!Anchor(HHH)>H == | |
18839 | ||
18840 | * <!Anchor(HallenbergEtAl02)> | |
18841 | http://www.itu.dk/people/tofte/publ/pldi2002.pdf[Combining Region Inference and Garbage Collection]. | |
18842 | Niels Hallenberg, Martin Elsman, and Mads Tofte. | |
18843 | <:#PLDI:> 2002. | |
18844 | ||
18845 | * <!Anchor(HansenRichel99)> | |
18846 | http://www.it.dtu.dk/introSML[Introduction to Programming Using SML] | |
18847 | (http://www3.addall.com/New/submitNew.cgi?query=0201398206&type=ISBN[addall]). | |
18848 | ISBN 0201398206. | |
18849 | Michael R. Hansen, Hans Rischel. | |
18850 | Addison-Wesley, 1999. | |
18851 | ||
18852 | * <!Anchor(Harper11)> | |
18853 | http://www.cs.cmu.edu/~rwh/smlbook/book.pdf[Programming in Standard ML]. | |
18854 | Robert Harper. | |
18855 | ||
18856 | * <!Anchor(HarperEtAl93)> | |
18857 | http://www.cs.cmu.edu/~rwh/papers/callcc/jfp.pdf[Typing First-Class Continuations in ML]. | |
18858 | Robert Harper, Bruce F. Duba, and David MacQueen. | |
18859 | <:#JFP:> 1993. | |
18860 | ||
18861 | * <!Anchor(HarperMitchell92)> | |
18862 | http://www.cs.cmu.edu/~rwh/papers/xml/toplas93.pdf[On the Type Structure of Standard ML]. | |
18863 | Robert Harper and John C. Mitchell. | |
18864 | <:#TOPLAS:> 1992. | |
18865 | ||
18866 | * <!Anchor(HauserBenson04)> | |
18867 | http://doi.ieeecomputersociety.org/10.1109/CSD.2004.1309122[On the Practicality and Desirability of Highly-concurrent, Mostly-functional Programming]. | |
18868 | Carl H. Hauser and David B. Benson. | |
18869 | <:#ACSD:> 2004. | |
18870 | + | |
18871 | ____ | |
18872 | Describes the use of <:ConcurrentML: Concurrent ML> in implementing | |
18873 | the Ped text editor. Argues that using large numbers of threads and | |
18874 | message passing style is a practical and effective way of | |
18875 | modularizing a program. | |
18876 | ____ | |
18877 | ||
18878 | * <!Anchor(HeckmanWilhelm97)> | |
18879 | http://rw4.cs.uni-sb.de/~heckmann/abstracts/neuform.html[A Functional Description of TeX's Formula Layout]. | |
18880 | Reinhold Heckmann and Reinhard Wilhelm. | |
18881 | <:#JFP:> 1997. | |
18882 | ||
18883 | * <!Anchor(HicksEtAl03)> | |
18884 | http://wwwold.cs.umd.edu/Library/TRs/CS-TR-4514/CS-TR-4514.pdf[Safe and Flexible Memory Management in Cyclone]. | |
18885 | Mike Hicks, Greg Morrisett, Dan Grossman, and Trevor Jim. | |
18886 | University of Maryland Technical Report CS-TR-4514, 2003. | |
18887 | ||
18888 | * <!Anchor(Hurd04)> | |
18889 | http://www.gilith.com/research/talks/tphols2004.pdf[Compiling HOL4 to Native Code]. | |
18890 | Joe Hurd. | |
18891 | <:#TPHOLs:> 2004. | |
18892 | + | |
18893 | ____ | |
18894 | Describes a port of HOL from Moscow ML to MLton, the difficulties | |
18895 | encountered in compiling large programs, and the speedups achieved | |
18896 | (roughly 10x). | |
18897 | ____ | |
18898 | ||
18899 | == <!Anchor(III)>I == | |
18900 | ||
18901 | {empty} | |
18902 | ||
18903 | == <!Anchor(JJJ)>J == | |
18904 | ||
18905 | * <!Anchor(Jones99)> | |
18906 | http://www.cs.kent.ac.uk/people/staff/rej/gcbook[Garbage Collection: Algorithms for Automatic Memory Management] | |
18907 | (http://www3.addall.com/New/submitNew.cgi?query=0471941484&type=ISBN[addall]). | |
18908 | ISBN 0471941484. | |
18909 | Richard Jones. | |
18910 | John Wiley & Sons, 1999. | |
18911 | ||
18912 | == <!Anchor(KKK)>K == | |
18913 | ||
18914 | * <!Anchor(Kahrs93)> | |
18915 | http://kar.kent.ac.uk/21122/[Mistakes and Ambiguities in the Definition of Standard ML]. | |
18916 | Stefan Kahrs. | |
18917 | University of Edinburgh Technical Report ECS-LFCS-93-257, 1993. | |
18918 | + | |
18919 | ____ | |
18920 | Describes a number of problems with the | |
18921 | <!Cite(MilnerEtAl90,1990 Definition)>, many of which were fixed in the | |
18922 | <!Cite(MilnerEtAl97,1997 Definition)>. | |
18923 | ||
18924 | Also see the http://www.cs.kent.ac.uk/~smk/errors-new.ps.Z[addenda] | |
18925 | published in 1996. | |
18926 | ____ | |
18927 | ||
18928 | * <!Anchor(Karvonen07)> | |
18929 | http://dl.acm.org/citation.cfm?doid=1292535.1292547[Generics for the Working ML'er]. | |
18930 | Vesa Karvonen. | |
18931 | <:#ML:> 2007. http://research.microsoft.com/~crusso/ml2007/slides/ml08rp-karvonen-slides.pdf[Slides] from the presentation are also available. | |
18932 | ||
18933 | * <!Anchor(Kennedy04)> | |
18934 | http://research.microsoft.com/~akenn/fun/picklercombinators.pdf[Pickler Combinators]. | |
18935 | Andrew Kennedy. | |
18936 | <:#JFP:> 2004. | |
18937 | ||
18938 | * <!Anchor(KoserEtAl03)> | |
18939 | http://www.litech.org/~vaughan/pdf/dpcool2003.pdf[sml2java: A Source To Source Translator]. | |
18940 | Justin Koser, Haakon Larsen, Jeffrey A. Vaughan. | |
18941 | <:#DPCOOL:> 2003. | |
18942 | ||
18943 | == <!Anchor(LLL)>L == | |
18944 | ||
18945 | * <!Anchor(Lang99)> | |
18946 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.7130&rep=rep1&type=ps[Faster Algorithms for Finding Minimal Consistent DFAs]. | |
18947 | Kevin Lang. 1999. | |
18948 | ||
18949 | * <!Anchor(LarsenNiss04)> | |
18950 | http://usenix.org/publications/library/proceedings/usenix04/tech/freenix/full_papers/larsen/larsen.pdf[mGTK: An SML binding of Gtk+]. | |
18951 | Ken Larsen and Henning Niss. | |
18952 | USENIX Annual Technical Conference, 2004. | |
18953 | ||
18954 | * <!Anchor(Leibig13)> | |
18955 | http://www.cs.rit.edu/~bal6053/msproject/[An LLVM Back-end for MLton]. | |
18956 | Brian Leibig. | |
18957 | MS Project Report, 2013. | |
18958 | + | |
18959 | ____ | |
18960 | Describes MLton's <:LLVMCodegen:>. | |
18961 | ____ | |
18962 | ||
18963 | * <!Anchor(Leroy90)> | |
18964 | http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-ZINC.html[The ZINC Experiment: an Economical Implementation of the ML Language]. | |
18965 | Xavier Leroy. | |
18966 | Technical report 117, INRIA, 1990. | |
18967 | + | |
18968 | ____ | |
18969 | A detailed explanation of the design and implementation of a bytecode | |
18970 | compiler and interpreter for ML with a machine model aimed at | |
18971 | efficient implementation. | |
18972 | ____ | |
18973 | ||
18974 | * <!Anchor(Leroy93)> | |
18975 | http://pauillac.inria.fr/~xleroy/bibrefs/Leroy-poly-par-nom.html[Polymorphism by Name for References and Continuations]. | |
18976 | Xavier Leroy. | |
18977 | <:#POPL:> 1993. | |
18978 | ||
18979 | * <!Anchor(LeungGeorge99)> | |
18980 | http://www.cs.nyu.edu/leunga/my-papers/annotations.ps[MLRISC Annotations]. | |
18981 | Allen Leung and Lal George. 1999. | |
18982 | ||
18983 | == <!Anchor(MMM)>M == | |
18984 | ||
18985 | * <!Anchor(MarlowEtAl01)> | |
18986 | http://community.haskell.org/~simonmar/papers/async.pdf[Asynchronous Exceptions in Haskell]. | |
18987 | Simon Marlow, Simon Peyton Jones, Andy Moran and John Reppy. | |
18988 | <:#PLDI:> 2001. | |
18989 | + | |
18990 | ____ | |
18991 | An asynchronous exception is a signal that one thread can send to | |
18992 | another, and is useful for the receiving thread to treat as an | |
18993 | exception so that it can clean up locks or other state relevant to its | |
18994 | current context. | |
18995 | ____ | |
18996 | ||
18997 | * <!Anchor(MacQueenEtAl84)> | |
18998 | http://homepages.inf.ed.ac.uk/gdp/publications/Ideal_model.pdf[An Ideal Model for Recursive Polymorphic Types]. | |
18999 | David MacQueen, Gordon Plotkin, Ravi Sethi. | |
19000 | <:#POPL:> 1984. | |
19001 | ||
19002 | * <!Anchor(Matthews91)> | |
19003 | http://www.lfcs.inf.ed.ac.uk/reports/91/ECS-LFCS-91-174[A Distributed Concurrent Implementation of Standard ML]. | |
19004 | David Matthews. | |
19005 | University of Edinburgh Technical Report ECS-LFCS-91-174, 1991. | |
19006 | ||
19007 | * <!Anchor(Matthews95)> | |
19008 | http://www.lfcs.inf.ed.ac.uk/reports/95/ECS-LFCS-95-335[Papers on Poly/ML]. | |
19009 | David C. J. Matthews. | |
19010 | University of Edinburgh Technical Report ECS-LFCS-95-335, 1995. | |
19011 | ||
19012 | * http://www.lfcs.inf.ed.ac.uk/reports/97/ECS-LFCS-97-375[That About Wraps it Up: Using FIX to Handle Errors Without Exceptions, and Other Programming Tricks]. | |
19013 | Bruce J. McAdam. | |
19014 | University of Edinburgh Technical Report ECS-LFCS-97-375, 1997. | |
19015 | ||
19016 | * <!Anchor(MeierNorgaard93)> | |
19017 | A Just-In-Time Backend for Moscow ML 2.00 in SML. | |
19018 | Bjarke Meier, Kristian Nørgaard. | |
19019 | Master's Thesis, 2003. | |
19020 | + | |
19021 | ____ | |
19022 | A just-in-time compiler using GNU Lightning, showing a speedup of up | |
19023 | to four times over Moscow ML's usual bytecode interpreter. | |
19024 | ||
19025 | The full report is only available in | |
19026 | http://www.itu.dk/stud/speciale/bmkn/fundanemt/download/report[Danish]. | |
19027 | ____ | |
19028 | ||
19029 | * <!Anchor(Milner78)> | |
19030 | http://courses.engr.illinois.edu/cs421/sp2013/project/milner-polymorphism.pdf[A Theory of Type Polymorphism in Programming]. | |
19031 | Robin Milner. | |
19032 | Journal of Computer and System Sciences, 1978. | |
19033 | ||
19034 | * <!Anchor(Milner82)> | |
19035 | http://homepages.inf.ed.ac.uk/dts/fps/papers/evolved.dvi.gz[How ML Evolved]. | |
19036 | Robin Milner. | |
19037 | Polymorphism--The ML/LCF/Hope Newsletter, 1983. | |
19038 | ||
19039 | * <!Anchor(MilnerTofte91)> | |
19040 | http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[Commentary on Standard ML] | |
19041 | (http://www3.addall.com/New/submitNew.cgi?query=0262631377&type=ISBN[addall]) | |
19042 | ISBN 0262631377. | |
19043 | Robin Milner and Mads Tofte. | |
19044 | The MIT Press, 1991. | |
19045 | + | |
19046 | ____ | |
19047 | Introduces and explains the notation and approach used in | |
19048 | <!Cite(MilnerEtAl90,The Definition of Standard ML)>. | |
19049 | ____ | |
19050 | ||
19051 | * <!Anchor(MilnerEtAl90)> | |
19052 | http://www.itu.dk/people/tofte/publ/1990sml/1990sml.html[The Definition of Standard ML]. | |
19053 | (http://www3.addall.com/New/submitNew.cgi?query=0262631326&type=ISBN[addall]) | |
19054 | ISBN 0262631326. | |
19055 | Robin Milner, Mads Tofte, and Robert Harper. | |
19056 | The MIT Press, 1990. | |
19057 | + | |
19058 | ____ | |
19059 | Superseded by <!Cite(MilnerEtAl97,The Definition of Standard ML (Revised))>. | |
19060 | Accompanied by the <!Cite(MilnerTofte91,Commentary on Standard ML)>. | |
19061 | ____ | |
19062 | ||
19063 | * <!Anchor(MilnerEtAl97)> | |
19064 | http://mitpress.mit.edu/books/definition-standard-ml[The Definition of Standard ML (Revised)]. | |
19065 | (http://www3.addall.com/New/submitNew.cgi?query=0262631814&type=ISBN[addall]) | |
19066 | ISBN 0262631814. | |
19067 | Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. | |
19068 | The MIT Press, 1997. | |
19069 | + | |
19070 | ____ | |
19071 | A terse and formal specification of Standard ML's syntax and | |
19072 | semantics. Supersedes <!Cite(MilnerEtAl90,The Definition of Standard ML)>. | |
19073 | ____ | |
19074 | ||
19075 | * <!Anchor(ML2000)> | |
19076 | http://flint.cs.yale.edu/flint/publications/ml2000.html[Principles and a Preliminary Design for ML2000]. | |
19077 | The ML2000 working group, 1999. | |
19078 | ||
19079 | * <!Anchor(Morentsen99)> | |
19080 | http://daimi.au.dk/CPnets/workshop99/papers/Mortensen.pdf[Automatic Code Generation from Coloured Petri Nets for an Access Control System]. | |
19081 | Kjeld H. Mortensen. | |
19082 | Workshop on Practical Use of Coloured Petri Nets and Design/CPN, 1999. | |
19083 | ||
19084 | * <!Anchor(MorrisettTolmach93)> | |
19085 | http://web.cecs.pdx.edu/~apt/ppopp93.ps[Procs and Locks: a Portable Multiprocessing Platform for Standard ML of New Jersey]. | |
19086 | J{empty}. Gregory Morrisett and Andrew Tolmach. | |
19087 | <:#PPoPP:> 1993. | |
19088 | ||
19089 | * <!Anchor(Murphy06)> | |
19090 | http://www.cs.cmu.edu/~tom7/papers/grid-ml06.pdf[ML Grid Programming with ConCert]. | |
19091 | Tom Murphy VII. | |
19092 | <:#ML:> 2006. | |
19093 | ||
19094 | == <!Anchor(NNN)>N == | |
19095 | ||
19096 | * <!Anchor(Neumann99)> | |
19097 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.9485&rep=rep1&type=ps[fxp - Processing Structured Documents in SML]. | |
19098 | Andreas Neumann. | |
19099 | Scottish Functional Programming Workshop, 1999. | |
19100 | + | |
19101 | ____ | |
19102 | Describes http://atseidl2.informatik.tu-muenchen.de/~berlea/Fxp[fxp], | |
19103 | an XML parser implemented in Standard ML. | |
19104 | ____ | |
19105 | ||
19106 | * <!Anchor(Neumann99Thesis)> | |
19107 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.8108&rep=rep1&type=ps[Parsing and Querying XML Documents in SML]. | |
19108 | Andreas Neumann. | |
19109 | Doctoral Thesis, 1999. | |
19110 | ||
19111 | * <!Anchor(NguyenOhori06)> | |
19112 | http://www.pllab.riec.tohoku.ac.jp/~ohori/research/NguyenOhoriPPDP06.pdf[Compiling ML Polymorphism with Explicit Layout Bitmap]. | |
19113 | Huu-Duc Nguyen and Atsushi Ohori. | |
19114 | <:#PPDP:> 2006. | |
19115 | ||
19116 | == <!Anchor(OOO)>O == | |
19117 | ||
19118 | * <!Anchor(Okasaki99)> | |
19119 | http://www.cambridge.org/gb/academic/subjects/computer-science/programming-languages-and-applied-logic/purely-functional-data-structures[Purely Functional Data Structures]. | |
19120 | ISBN 9780521663502. | |
19121 | Chris Okasaki. | |
19122 | Cambridge University Press, 1999. | |
19123 | ||
19124 | * <!Anchor(Ohori89)> | |
19125 | http://www.pllab.riec.tohoku.ac.jp/~ohori/research/fpca89.pdf[A Simple Semantics for ML Polymorphism]. | |
19126 | Atsushi Ohori. | |
19127 | <:#FPCA:> 1989. | |
19128 | ||
19129 | * <!Anchor(Ohori95)> | |
19130 | http://www.pllab.riec.tohoku.ac.jp/~ohori/research/toplas95.pdf[A Polymorphic Record Calculus and Its Compilation]. | |
19131 | Atsushi Ohori. | |
19132 | <:#TOPLAS:> 1995. | |
19133 | ||
19134 | * <!Anchor(OhoriTakamizawa97)> | |
19135 | http://www.pllab.riec.tohoku.ac.jp/~ohori/research/jlsc97.pdf[An Unboxed Operational Semantics for ML Polymorphism]. | |
19136 | Atsushi Ohori and Tomonobu Takamizawa. | |
19137 | <:#LASC:> 1997. | |
19138 | ||
19139 | * <!Anchor(Ohori99)> | |
19140 | http://www.pllab.riec.tohoku.ac.jp/~ohori/research/ic98.pdf[Type-Directed Specialization of Polymorphism]. | |
19141 | Atsushi Ohori. | |
19142 | <:#IC:> 1999. | |
19143 | ||
19144 | * <!Anchor(OwensEtAl09)> | |
19145 | http://www.mpi-sws.org/~turon/re-deriv.pdf[Regular-expression derivatives reexamined]. | |
19146 | Scott Owens, John Reppy, and Aaron Turon. | |
19147 | <:#JFP:> 2009. | |
19148 | ||
19149 | == <!Anchor(PPP)>P == | |
19150 | ||
19151 | * <!Anchor(Paulson96)> | |
19152 | http://www.cambridge.org/co/academic/subjects/computer-science/programming-languages-and-applied-logic/ml-working-programmer-2nd-edition[ML For the Working Programmer] | |
19153 | (http://www3.addall.com/New/submitNew.cgi?query=052156543X&type=ISBN[addall]) | |
19154 | ISBN 052156543X. | |
19155 | Larry C. Paulson. | |
19156 | Cambridge University Press, 1996. | |
19157 | ||
19158 | * <!Anchor(PetterssonEtAl02)> | |
19159 | http://user.it.uu.se/~kostis/Papers/flops02_22.ps.gz[The HiPE/x86 Erlang Compiler: System Description and Performance Evaluation]. | |
19160 | Mikael Pettersson, Konstantinos Sagonas, and Erik Johansson. | |
19161 | <:#FLOPS:> 2002. | |
19162 | + | |
19163 | ____ | |
19164 | Describes a native x86 Erlang compiler and a comparison of many | |
19165 | different native x86 compilers (including MLton) and their register | |
19166 | usage and call stack implementations. | |
19167 | ____ | |
19168 | ||
19169 | * <!Anchor(Price09)> | |
19170 | http://rogerprice.org/#UG[User's Guide to ML-Lex and ML-Yacc] | |
19171 | Roger Price. 2009. | |
19172 | ||
19173 | * <!Anchor(Pucella98)> | |
19174 | http://arxiv.org/abs/cs.PL/0405080[Reactive Programming in Standard ML]. | |
19175 | Riccardo R. Pucella. | |
19176 | <:#ICCL:> 1998. | |
19177 | ||
19178 | == <!Anchor(QQQ)>Q == | |
19179 | ||
19180 | {empty} | |
19181 | ||
19182 | == <!Anchor(RRR)>R == | |
19183 | ||
19184 | * <!Anchor(Ramsey90)> | |
19185 | https://www.cs.princeton.edu/research/techreps/TR-262-90[Concurrent Programming in ML]. | |
19186 | Norman Ramsey. | |
19187 | Princeton University Technical Report CS-TR-262-90, 1990. | |
19188 | ||
19189 | * <!Anchor(Ramsey11)> | |
19190 | http://www.cs.tufts.edu/~nr/pubs/embedj-abstract.html[Embedding an Interpreted Language Using Higher-Order Functions and Types]. | |
19191 | Norman Ramsey. | |
19192 | <:#JFP:> 2011. | |
19193 | ||
19194 | * <!Anchor(RamseyFisherGovereau05)> | |
19195 | http://www.cs.tufts.edu/~nr/pubs/els-abstract.html[An Expressive Language of Signatures]. | |
19196 | Norman Ramsey, Kathleen Fisher, and Paul Govereau. | |
19197 | <:#ICFP:> 2005. | |
19198 | ||
19199 | * <!Anchor(RedwineRamsey04)> | |
19200 | http://www.cs.tufts.edu/~nr/pubs/widen-abstract.html[Widening Integer Arithmetic]. | |
19201 | Kevin Redwine and Norman Ramsey. | |
19202 | <:#CC:> 2004. | |
19203 | + | |
19204 | ____ | |
19205 | Describes a method to implement numeric types and operations (like | |
19206 | `Int31` or `Word17`) for sizes smaller than that provided by the | |
19207 | processor. | |
19208 | ____ | |
19209 | ||
19210 | * <!Anchor(Reppy88)> | |
19211 | Synchronous Operations as First-Class Values. | |
19212 | John Reppy. | |
19213 | <:#PLDI:> 1988. | |
19214 | ||
19215 | * <!Anchor(Reppy07)> | |
19216 | http://www.cambridge.org/co/academic/subjects/computer-science/distributed-networked-and-mobile-computing/concurrent-programming-ml[Concurrent Programming in ML] | |
19217 | (http://www3.addall.com/New/submitNew.cgi?query=9780521714723&type=ISBN[addall]). | |
19218 | ISBN 9780521714723. | |
19219 | John Reppy. | |
19220 | Cambridge University Press, 2007. | |
19221 | + | |
19222 | ____ | |
19223 | Describes <:ConcurrentML:>. | |
19224 | ____ | |
19225 | ||
19226 | * <!Anchor(Reynolds98)> | |
19227 | https://users-cs.au.dk/hosc/local/HOSC-11-4-pp355-361.pdf[Definitional Interpreters Revisited]. | |
19228 | John C. Reynolds. | |
19229 | <:#HOSC:> 1998. | |
19230 | ||
19231 | * <!Anchor(Reynolds98_2)> | |
19232 | https://users-cs.au.dk/hosc/local/HOSC-11-4-pp363-397.pdf[Definitional Interpreters for Higher-Order Programming Languages] | |
19233 | John C. Reynolds. | |
19234 | <:#HOSC:> 1998. | |
19235 | ||
19236 | * <!Anchor(Rossberg01)> | |
19237 | http://www.mpi-sws.org/~rossberg/papers/Rossberg%20-%20Defects%20in%20the%20Revised%20Definition%20of%20Standard%20ML%20%5B2007-01-22%20Update%5D.pdf[Defects in the Revised Definition of Standard ML]. | |
19238 | Andreas Rossberg. 2001. | |
19239 | ||
19240 | == <!Anchor(SSS)>S == | |
19241 | ||
19242 | * <!Anchor(Sansom91)> | |
19243 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.1020&rep=rep1&type=ps[Dual-Mode Garbage Collection]. | |
19244 | Patrick M. Sansom. | |
19245 | Workshop on the Parallel Implementation of Functional Languages, 1991. | |
19246 | ||
19247 | * <!Anchor(ScottRamsey00)> | |
19248 | http://www.cs.tufts.edu/~nr/pubs/match-abstract.html[When Do Match-Compilation Heuristics Matter]. | |
19249 | Kevin Scott and Norman Ramsey. | |
19250 | University of Virginia Technical Report CS-2000-13, 2000. | |
19251 | + | |
19252 | ____ | |
19253 | Modified SML/NJ to experimentally compare a number of | |
19254 | match-compilation heuristics and showed that choice of heuristic | |
19255 | usually does not significantly affect code size or run time. | |
19256 | ____ | |
19257 | ||
19258 | * <!Anchor(Sestoft96)> | |
19259 | http://www.itu.dk/~sestoft/papers/match.ps.gz[ML Pattern Match Compilation and Partial Evaluation]. | |
19260 | Peter Sestoft. | |
19261 | Partial Evaluation, 1996. | |
19262 | + | |
19263 | ____ | |
19264 | Describes the derivation of the match compiler used in | |
19265 | <:MoscowML:Moscow ML>. | |
19266 | ____ | |
19267 | ||
19268 | * <!Anchor(ShaoAppel94)> | |
19269 | http://flint.cs.yale.edu/flint/publications/closure.html[Space-Efficient Closure Representations]. | |
19270 | Zhong Shao and Andrew W. Appel. | |
19271 | <:#LFP:> 1994. | |
19272 | ||
19273 | * <!Anchor(Shipman02)> | |
19274 | <!Attachment(References,Shipman02.pdf,Unix System Programming with Standard ML)>. | |
19275 | Anthony L. Shipman. | |
19276 | 2002. | |
19277 | + | |
19278 | ____ | |
19279 | Includes a description of the <:Swerve:> HTTP server written in SML. | |
19280 | ____ | |
19281 | ||
19282 | * <!Anchor(Signoles03)> | |
19283 | Calcul Statique des Applications de Modules Parametres. | |
19284 | Julien Signoles. | |
19285 | <:#JFLA:> 2003. | |
19286 | + | |
19287 | ____ | |
19288 | Describes a http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=382[defunctorizer] | |
19289 | for OCaml, and compares it to existing defunctorizers, including MLton. | |
19290 | ____ | |
19291 | ||
19292 | * <!Anchor(SittampalamEtAl04)> | |
19293 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.1349&rep=rep1&type=ps[Incremental Execution of Transformation Specifications]. | |
19294 | Ganesh Sittampalam, Oege de Moor, and Ken Friis Larsen. | |
19295 | <:#POPL:> 2004. | |
19296 | + | |
19297 | ____ | |
19298 | Mentions a port from Moscow ML to MLton of | |
19299 | http://www.itu.dk/research/muddy/[MuDDY], an SML wrapper around the | |
19300 | http://sourceforge.net/projects/buddy[BuDDY] BDD package. | |
19301 | ____ | |
19302 | ||
19303 | * <!Anchor(SwaseyEtAl06)> | |
19304 | http://www.cs.cmu.edu/~tom7/papers/smlsc2-ml06.pdf[A Separate Compilation Extension to Standard ML]. | |
19305 | David Swasey, Tom Murphy VII, Karl Crary and Robert Harper. | |
19306 | <:#ML:> 2006. | |
19307 | ||
19308 | == <!Anchor(TTT)>T == | |
19309 | ||
19310 | * <!Anchor(TarditiAppel00)> | |
19311 | http://www.smlnj.org/doc/ML-Yacc/index.html[ML-Yacc User's Manual. Version 2.4] | |
19312 | David R. Tarditi and Andrew W. Appel. 2000. | |
19313 | ||
19314 | * <!Anchor(TarditiEtAl90)> | |
19315 | http://research.microsoft.com/pubs/68738/loplas-sml2c.ps[No Assembly Required: Compiling Standard ML to C]. | |
19316 | David Tarditi, Peter Lee, and Anurag Acharya. 1990. | |
19317 | ||
19318 | * <!Anchor(ThorupTofte94)> | |
19319 | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.5372&rep=rep1&type=ps[Object-oriented programming and Standard ML]. | |
19320 | Lars Thorup and Mads Tofte. | |
19321 | <:#ML:>, 1994. | |
19322 | ||
19323 | * <!Anchor(Tofte90)> | |
19324 | Type Inference for Polymorphic References. | |
19325 | Mads Tofte. | |
19326 | <:#IC:> 1990. | |
19327 | ||
19328 | * <!Anchor(Tofte96)> | |
19329 | http://www.itu.dk/courses/FDP/E2004/Tofte-1996-Essentials_of_SML_Modules.pdf[Essentials of Standard ML Modules]. | |
19330 | Mads Tofte. | |
19331 | ||
19332 | * <!Anchor(Tofte09)> | |
19333 | http://www.itu.dk/people/tofte/publ/tips.pdf[Tips for Computer Scientists on Standard ML (Revised)]. | |
19334 | Mads Tofte. | |
19335 | ||
19336 | * <!Anchor(TolmachAppel95)> | |
19337 | http://web.cecs.pdx.edu/~apt/jfp95.ps[A Debugger for Standard ML]. | |
19338 | Andrew Tolmach and Andrew W. Appel. | |
19339 | <:#JFP:> 1995. | |
19340 | ||
19341 | * <!Anchor(Tolmach97)> | |
19342 | http://web.cecs.pdx.edu/~apt/tic97.ps[Combining Closure Conversion with Closure Analysis using Algebraic Types]. | |
19343 | Andrew Tolmach. | |
19344 | <:#TIC:> 1997. | |
19345 | + | |
19346 | ____ | |
19347 | Describes a closure-conversion algorithm for a monomorphic IL. The | |
19348 | algorithm uses a unification-based flow analysis followed by | |
19349 | defunctionalization and is similar to the approach used in MLton | |
19350 | (<!Cite(CejtinEtAl00)>). | |
19351 | ____ | |
19352 | ||
19353 | * <!Anchor(TolmachOliva98)> | |
19354 | http://web.cecs.pdx.edu/~apt/jfp98.ps[From ML to Ada: Strongly-typed Language Interoperability via Source Translation]. | |
19355 | Andrew Tolmach and Dino Oliva. | |
19356 | <:#JFP:> 1998. | |
19357 | + | |
19358 | ____ | |
19359 | Describes a compiler for RML, a core SML-like language. The compiler | |
19360 | is similar in structure to MLton, using monomorphisation, | |
19361 | defunctionalization, and optimization on a first-order IL. | |
19362 | ____ | |
19363 | ||
19364 | == <!Anchor(UUU)>U == | |
19365 | ||
19366 | * <!Anchor(Ullman98)> | |
19367 | http://www-db.stanford.edu/~ullman/emlp.html[Elements of ML Programming] | |
19368 | (http://www3.addall.com/New/submitNew.cgi?query=0137903871&type=ISBN[addall]). | |
19369 | ISBN 0137903871. | |
19370 | Jeffrey D. Ullman. | |
19371 | Prentice-Hall, 1998. | |
19372 | ||
19373 | == <!Anchor(VVV)>V == | |
19374 | ||
19375 | {empty} | |
19376 | ||
19377 | == <!Anchor(WWW)>W == | |
19378 | ||
19379 | * <!Anchor(Wand84)> | |
19380 | http://portal.acm.org/citation.cfm?id=800527[A Types-as-Sets Semantics for Milner-Style Polymorphism]. | |
19381 | Mitchell Wand. | |
19382 | <:#POPL:> 1984. | |
19383 | ||
19384 | * <!Anchor(Wang01)> | |
19385 | http://ncstrl.cs.princeton.edu/expand.php?id=TR-640-01[Managing Memory with Types]. | |
19386 | Daniel C. Wang. | |
19387 | PhD Thesis. | |
19388 | + | |
19389 | ____ | |
19390 | Chapter 6 describes an implementation of a type-preserving garbage | |
19391 | collector for MLton. | |
19392 | ____ | |
19393 | ||
19394 | * <!Anchor(WangAppel01)> | |
19395 | http://www.cs.princeton.edu/~appel/papers/typegc.pdf[Type-Preserving Garbage Collectors]. | |
19396 | Daniel C. Wang and Andrew W. Appel. | |
19397 | <:#POPL:> 2001. | |
19398 | + | |
19399 | ____ | |
19400 | Shows how to modify MLton to generate a strongly-typed garbage | |
19401 | collector as part of a program. | |
19402 | ____ | |
19403 | ||
19404 | * <!Anchor(WangMurphy02)> | |
19405 | http://www.cs.cmu.edu/~tom7/papers/wang-murphy-recursion.pdf[Programming With Recursion Schemes]. | |
19406 | Daniel C. Wang and Tom Murphy VII. | |
19407 | + | |
19408 | ____ | |
19409 | Describes a programming technique for data abstraction, along with | |
19410 | benchmarks of MLton and other SML compilers. | |
19411 | ____ | |
19412 | ||
19413 | * <!Anchor(Weeks06)> | |
19414 | <!Attachment(References,060916-mlton.pdf,Whole-Program Compilation in MLton)>. | |
19415 | Stephen Weeks. | |
19416 | <:#ML:> 2006. | |
19417 | ||
19418 | * <!Anchor(Wright95)> | |
19419 | http://homepages.inf.ed.ac.uk/dts/fps/papers/wright.ps.gz[Simple Imperative Polymorphism]. | |
19420 | Andrew Wright. | |
19421 | <:#LASC:>, 8(4):343-355, 1995. | |
19422 | + | |
19423 | ____ | |
19424 | The origin of the <:ValueRestriction:>. | |
19425 | ____ | |
19426 | ||
19427 | == <!Anchor(XXX)>X == | |
19428 | ||
19429 | {empty} | |
19430 | ||
19431 | == <!Anchor(YYY)>Y == | |
19432 | ||
19433 | * <!Anchor(Yang98)> | |
19434 | http://cs.nyu.edu/zheyang/papers/YangZ\--ICFP98.html[Encoding Types in ML-like Languages]. | |
19435 | Zhe Yang. | |
19436 | <:#ICFP:> 1998. | |
19437 | ||
19438 | == <!Anchor(ZZZ)>Z == | |
19439 | ||
19440 | * <!Anchor(ZiarekEtAl06)> | |
19441 | http://www.cs.purdue.edu/homes/lziarek/icfp06.pdf[Stabilizers: A Modular Checkpointing Abstraction for Concurrent Functional Programs]. | |
19442 | Lukasz Ziarek, Philip Schatz, and Suresh Jagannathan. | |
19443 | <:#ICFP:> 2006. | |
19444 | ||
19445 | * <!Anchor(ZiarekEtAl08)> | |
19446 | http://www.cse.buffalo.edu/~lziarek/hosc.pdf[Flattening tuples in an SSA intermediate representation]. | |
19447 | Lukasz Ziarek, Stephen Weeks, and Suresh Jagannathan. | |
19448 | <:#HOSC:> 2008. | |
19449 | ||
19450 | ||
19451 | == Abbreviations == | |
19452 | ||
19453 | * <!Anchor(ACSD)> ACSD = International Conference on Application of Concurrency to System Design | |
19454 | * <!Anchor(BABEL)> BABEL = Workshop on multi-language infrastructure and interoperability | |
19455 | * <!Anchor(CC)> CC = International Conference on Compiler Construction | |
19456 | * <!Anchor(DPCOOL)> DPCOOL = Workshop on Declarative Programming in the Context of OO Languages | |
19457 | * <!Anchor(ESOP)> ESOP = European Symposium on Programming | |
19458 | * <!Anchor(FLOPS)> FLOPS = Symposium on Functional and Logic Programming | |
19459 | * <!Anchor(FPCA)> FPCA = Conference on Functional Programming Languages and Computer Architecture | |
19460 | * <!Anchor(HOSC)> HOSC = Higher-Order and Symbolic Computation | |
19461 | * <!Anchor(IC)> IC = Information and Computation | |
19462 | * <!Anchor(ICCL)> ICCL = IEEE International Conference on Computer Languages | |
19463 | * <!Anchor(ICFP)> ICFP = International Conference on Functional Programming | |
19464 | * <!Anchor(IFL)> IFL = International Workshop on Implementation and Application of Functional Languages | |
19465 | * <!Anchor(IVME)> IVME = Workshop on Interpreters, Virtual Machines and Emulators | |
19466 | * <!Anchor(JFLA)> JFLA = Journees Francophones des Langages Applicatifs | |
19467 | * <!Anchor(JFP)> JFP = Journal of Functional Programming | |
19468 | * <!Anchor(LASC)> LASC = Lisp and Symbolic Computation | |
19469 | * <!Anchor(LFP)> LFP = Lisp and Functional Programming | |
19470 | * <!Anchor(ML)> ML = Workshop on ML | |
19471 | * <!Anchor(PLDI)> PLDI = Conference on Programming Language Design and Implementation | |
19472 | * <!Anchor(POPL)> POPL = Symposium on Principles of Programming Languages | |
19473 | * <!Anchor(PPDP)> PPDP = International Conference on Principles and Practice of Declarative Programming | |
19474 | * <!Anchor(PPoPP)> PPoPP = Principles and Practice of Parallel Programming | |
19475 | * <!Anchor(TCS)> TCS = IFIP International Conference on Theoretical Computer Science | |
19476 | * <!Anchor(TIC)> TIC = Types in Compilation | |
19477 | * <!Anchor(TLDI)> TLDI = Workshop on Types in Language Design and Implementation | |
19478 | * <!Anchor(TOPLAS)> TOPLAS = Transactions on Programming Languages and Systems | |
19479 | * <!Anchor(TPHOLs)> TPHOLs = International Conference on Theorem Proving in Higher Order Logics | |
19480 | ||
19481 | <<< | |
19482 | ||
19483 | :mlton-guide-page: RefFlatten | |
19484 | [[RefFlatten]] | |
19485 | RefFlatten | |
19486 | ========== | |
19487 | ||
19488 | <:RefFlatten:> is an optimization pass for the <:SSA2:> | |
19489 | <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>. | |
19490 | ||
19491 | == Description == | |
19492 | ||
19493 | This pass flattens a `ref` cell into its containing object. | |
19494 | The idea is to replace, where possible, a type like | |
19495 | ---- | |
19496 | (int ref * real) | |
19497 | ---- | |
19498 | ||
19499 | with a type like | |
19500 | ---- | |
19501 | (int[m] * real) | |
19502 | ---- | |
19503 | ||
19504 | where the `[m]` indicates a mutable field of a tuple. | |
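
As a source-level illustration (a minimal sketch, not taken from the MLton sources), the following Standard ML program builds values of the type `int ref * real` shown above. The names `cell`, `make`, `bump`, and `get` are hypothetical. Whether the pass actually flattens this particular `ref` depends on the analysis; the sketch only shows the shape of code where a `ref` is allocated together with its containing tuple.

[source,sml]
----
(* Hypothetical example: a tuple whose first component is a ref cell. *)
type cell = int ref * real

(* The ref and the tuple that contains it are allocated together. *)
fun make (x: real): cell = (ref 0, x)

(* Each get and set goes through the ref; without flattening, that is an
 * extra indirection from the tuple to a separate heap object. *)
fun bump ((r, _): cell): unit = r := !r + 1
fun get ((r, _): cell): int = !r

val c = make 1.5
val () = bump c
val () = bump c
val n = get c
----

If the `ref` were flattened, `bump` and `get` would read and write a mutable field of the tuple directly, with no separate object for the `ref`.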
19505 | ||
19506 | == Implementation == | |
19507 | ||
19508 | * <!ViewGitFile(mlton,master,mlton/ssa/ref-flatten.fun)> | |
19509 | ||
19510 | == Details and Notes == | |
19511 | ||
19512 | The savings is obvious, I hope. We avoid an extra heap-allocated | |
19513 | object for the `ref`, which in the above case saves two words. We | |
19514 | also save the time and code for the extra indirection at each get and | |
19515 | set. There are lots of useful data structures (singly-linked and | |
19516 | doubly-linked lists, union-find, Fibonacci heaps, ...) for which I | |
19517 | believe programs pay through the nose in the absence of ref | |
19518 | flattening. | |
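
To make one of these data structures concrete, here is a small union-find sketch in Standard ML in which every element is a constructor object holding two `ref` cells. All names (`elem`, `new`, `find`, `same`, `union`) are hypothetical and the code is only an illustration; it is not claimed that MLton flattens these particular `ref`-s.

[source,sml]
----
(* Each element carries a mutable parent link and a mutable rank.
 * Both refs are allocated in the same expression as the Elem object. *)
datatype elem = Elem of {parent: elem option ref, rank: int ref}

fun new (): elem = Elem {parent = ref NONE, rank = ref 0}

(* Find the representative, compressing the path along the way. *)
fun find (e as Elem {parent, ...}): elem =
   case !parent of
      NONE => e
    | SOME p =>
         let
            val root = find p
         in
            parent := SOME root; root
         end

(* Two elements are in the same set if their representatives share
 * the same parent ref (ref equality is identity). *)
fun same (a: elem, b: elem): bool =
   let
      val Elem {parent = pa, ...} = find a
      val Elem {parent = pb, ...} = find b
   in
      pa = pb
   end

(* Union by rank. *)
fun union (a: elem, b: elem): unit =
   let
      val ra as Elem {parent = pa, rank = ka} = find a
      val rb as Elem {parent = pb, rank = kb} = find b
   in
      if pa = pb then ()
      else if !ka < !kb then pa := SOME rb
      else if !ka > !kb then pb := SOME ra
      else (pb := SOME ra; ka := !ka + 1)
   end
----

Without flattening, each `Elem` costs three heap objects (the record and its two `ref` cells), and every `find` step follows an extra pointer to reach the parent link.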
19519 | ||
19520 | The idea is to compute for each occurrence of a `ref` type in the | |
19521 | program whether or not that `ref` can be represented as an offset of | |
19522 | some object (constructor or tuple). As before, a unification-based | |
19523 | whole-program analysis with deep abstract values makes sure the | |
19524 | result is consistent. | |
19525 | ||
19526 | The only syntactic part of the analysis that remains is the part that | |
19527 | checks, for a variable bound to a value constructed by `Ref_ref`, that: | |
19528 | ||
19529 | * the object allocation is in the same block. This is pretty | |
19530 | draconian, and it would be nice to generalize it some day to allow | |
19531 | flattening as long as the `ref` allocation and object allocation "line | |
19532 | up one-to-one" in the same loop-free chunk of code. | |
19533 | ||
19534 | * updates occur in the same block (and hence it is safe-for-space | |
19535 | because the containing object is still alive). It would be nice to | |
19536 | relax this to allow updates as long as it can be proved that the | |
19537 | container is live. | |
19538 | ||
19539 | The pass does not flatten `unit ref`-s. | |
19540 | ||
19541 | <:RefFlatten:> is safe for space. The idea is to prevent a `ref` | |
19542 | being flattened into an object that has a component of unbounded size | |
19543 | (other than possibly the `ref` itself) unless we can prove that, at | |
19544 | each point where the `ref` is live, the containing object is live too. | |
19545 | I used a pretty simple approximation to liveness. | |
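
As a rough illustration of the kind of data structure discussed above, the following sketch defines a union-find node whose `parent` field is a `ref` created in the same expression as the enclosing record. The names `node`, `new`, `find`, and `union` are hypothetical and exist only for this example. Whether MLton actually flattens this particular `ref` is an assumption; it depends on the analysis described above, which works on an intermediate form of the program rather than on the source.

[source,sml]
----
(* Hypothetical union-find node.  The `parent` ref is created in the same
   expression that builds the enclosing record. *)
datatype 'a node = Node of {parent : 'a node option ref, value : 'a}

fun new (v : 'a) : 'a node = Node {parent = ref NONE, value = v}

(* Follow parent links to the root, compressing the path on the way back. *)
fun find (n as Node {parent, ...}) =
   case !parent of
      NONE => n
    | SOME p =>
         let
            val root = find p
            val () = parent := SOME root
         in
            root
         end

(* Link the root of `a` under the root of `b`, unless they already
   coincide.  Refs admit equality, so comparing the two `parent` refs
   tells us whether both roots are the same node. *)
fun union (a : 'a node, b : 'a node) : unit =
   let
      val Node {parent = pa, ...} = find a
      val rb as Node {parent = pb, ...} = find b
   in
      if pa = pb then () else pa := SOME rb
   end

val a = new "a"
val b = new "b"
val () = union (a, b)
val Node {value = rootValue, ...} = find a
val () = print (rootValue ^ "\n")
----

Without flattening, one would expect each `Node` to hold a pointer to a separately allocated `ref` cell. With flattening, `parent` could instead be stored as a mutable field of the node object itself.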
19546 | ||
19547 | <<< | |
19548 | ||
19549 | :mlton-guide-page: Regions | |
19550 | [[Regions]] | |
19551 | Regions | |
19552 | ======= | |
19553 | ||
19554 | In region-based memory management, the heap is divided into a | |
19555 | collection of regions into which objects are allocated. At compile | |
19556 | time, either in the source program or through automatic inference, | |
19557 | allocation points are annotated with the region in which the | |
19558 | allocation will occur. Typically, although not always, the regions | |
19559 | are allocated and deallocated according to a stack discipline. | |
19560 | ||
19561 | MLton does not use region-based memory management; it uses traditional | |
19562 | <:GarbageCollection:>. We have considered integrating regions with | |
19563 | MLton, but in our opinion it is far from clear that regions would | |
19564 | provide MLton with improved performance, while they would certainly | |
19565 | add a lot of complexity to the compiler and complicate reasoning about | |
19566 | and achieving <:SpaceSafety:>. Region-based memory management and | |
19567 | garbage collection have different strengths and weaknesses; it's | |
19568 | pretty easy to come up with programs that do significantly better | |
19569 | under regions than under GC, and vice versa. We believe that it is | |
19570 | the case that common SML idioms tend to work better under GC than | |
19571 | under regions. | |
19572 | ||
19573 | One common argument for regions is that the region operations can all | |
19574 | be done in (approximately) constant time; therefore, you eliminate GC | |
19575 | pause times, leading to real-time memory management. However, because of space | |
19576 | safety concerns (see below), we believe that region-based memory | |
19577 | management for SML must also include a traditional garbage collector. | |
19578 | Hence, to achieve real-time memory management for MLton/SML, we | |
19579 | believe that it would be both easier and more efficient to implement a | |
19580 | traditional real-time garbage collector than it would be to implement | |
19581 | a region system. | |
19582 | ||
19583 | == Regions, the ML Kit, and space safety == | |
19584 | ||
19585 | The <:MLKit:ML Kit> pioneered the use of regions for compiling | |
19586 | Standard ML. The ML Kit maintains a stack of regions at run time. At | |
19587 | compile time, it uses region inference to decide when data can be | |
19588 | allocated in a stack-like manner, assigning it to an appropriate | |
19589 | region. The ML Kit has put a lot of effort into improving the | |
19590 | supporting analyses and representations of regions, which are all | |
19591 | necessary to improve the performance. | |
19592 | ||
19593 | Unfortunately, under a pure stack-based region system, space leaks are | |
19594 | inevitable in theory, and costly in practice. Data for which region | |
19595 | inference cannot determine the lifetime is moved into the "global | |
19596 | region" whose lifetime is the entire program. There are two ways in | |
19597 | which region inference will place an object in the global region. | |
19598 | ||
19599 | * When the inference is too conservative, that is, when the data is | |
19600 | used in a stack-like manner but the region inference can't figure it | |
19601 | out. | |
19602 | ||
19603 | * When data is not used in a stack-like manner. In this case, | |
19604 | correctness requires region inference to place the object in the global region. | |
19605 | ||
19606 | This global region is a source of space leaks. No matter what region | |
19607 | system you use, there are some programs such that the global region | |
19608 | must exist, and its size will grow to an unbounded multiple of the | |
19609 | live data size. For these programs one must have a GC to achieve | |
19610 | space safety. | |
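
The following sketch shows the kind of program the paragraph above has in mind. It is an illustration under stated assumptions, not a measurement of the ML Kit or of any particular region system. The names `latest`, `step`, and `loop` are hypothetical. The assumption is that, in a pure stack-of-regions system without GC, every list stored into the long-lived `latest` cell must be placed in a region that lives as long as `latest` does.

[source,sml]
----
(* A long-lived cell whose contents are replaced over and over. *)
val latest : int list ref = ref []

(* Allocate a fresh 1000-element list and overwrite the cell with it.
   The previously stored list becomes unreachable at this point. *)
fun step (n : int) : unit =
   latest := List.tabulate (1000, fn i => i + n)

fun loop (i : int, n : int) : unit =
   if i >= n then () else (step i; loop (i + 1, n))

val () = loop (0, 100000)
val () = print (Int.toString (length (!latest)) ^ "\n")
----

At any moment only one list is reachable, so the live data stays roughly constant, and a garbage collector can reclaim each overwritten list. Under the assumption above, a pure region system would be expected to keep every list ever stored, so the space used would grow with the number of iterations rather than with the live data.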
19611 | ||
19612 | To solve this problem, the ML Kit has undergone work to combine | |
19613 | garbage collection with region-based memory management. | |
19614 | <!Cite(HallenbergEtAl02)> and <!Cite(Elsman03)> describe the addition | |
19615 | of a garbage collector to the ML Kit's region-based system. These | |
19616 | papers provide convincing evidence for space leaks in the global | |
19617 | region. They show a number of benchmarks where the memory usage of | |
19618 | the program running with just regions is a large multiple (2, 10, 50, | |
19619 | even 150) of the program running with regions plus GC. | |
19620 | ||
19621 | These papers also give some numbers to show the ML Kit with just | |
19622 | regions does better than either a system with just GC or a combined | |
19623 | system. Unfortunately, a pure region system isn't practical because | |
19624 | of the lack of space safety. And the other performance numbers are | |
19625 | not so convincing, because they compare to an old version of SML/NJ | |
19626 | and not at all with MLton. It would be interesting to see a | |
19627 | comparison with a more serious collector. | |
19628 | ||
19629 | == Regions, Garbage Collection, and Cyclone == | |
19630 | ||
19631 | One possibility is to take Cyclone's approach, and provide both | |
19632 | region-based memory management and garbage collection, but at the | |
19633 | programmer's option (<!Cite(GrossmanEtAl02)>, <!Cite(HicksEtAl03)>). | |
19634 | ||
19635 | One might ask whether we might do the same thing -- i.e., provide a | |
19636 | `MLton.Regions` structure with explicit region-based memory | |
19637 | management operations, so that the programmer could use them when | |
19638 | appropriate. <:MatthewFluet:> has thought about this question: | |
19639 | ||
19640 | * http://www.cs.cornell.edu/People/fluet/rgn-monad/index.html | |
19641 | ||
19642 | Unfortunately, his conclusion is that the SML type system is too weak | |
19643 | to support this option, although there might be a "poor-man's" version | |
19644 | with dynamic checks. | |
19645 | ||
19646 | <<< | |
19647 | ||
19648 | :mlton-guide-page: Release20041109 | |
19649 | [[Release20041109]] | |
19650 | Release20041109 | |
19651 | =============== | |
19652 | ||
19653 | This is an archived public release of MLton, version 20041109. | |
19654 | ||
19655 | == Changes since the last public release == | |
19656 | ||
19657 | * New platforms: | |
19658 | ** x86: FreeBSD 5.x, OpenBSD | |
19659 | ** PowerPC: Darwin (MacOSX) | |
19660 | * Support for the <:MLBasis: ML Basis system>, a new mechanism supporting programming in the very large, separate delivery of library sources, and more. | |
19661 | * Support for dynamic libraries. | |
19662 | * Support for <:ConcurrentML:> (CML). | |
19663 | * New structures: `Int2`, `Int3`, ..., `Int31` and `Word2`, `Word3`, ..., `Word31`. | |
19664 | * Front-end bug fixes and improvements. | |
19665 | * A new form of profiling with ++-profile count++, which can be used to test code coverage. | |
19666 | * A bytecode generator, available via ++-codegen bytecode++. | |
19667 | * Representation improvements: | |
19668 | ** Tuples and datatypes are packed to decrease space usage. | |
19669 | ** Ref cells may be unboxed into their containing object. | |
19670 | ** Arrays of tuples may represent the tuples unboxed. | |
19671 | ||
19672 | For a complete list of changes and bug fixes since 20040227, see the | |
19673 | <!RawGitFile(mlton,on-20041109-release,doc/changelog)>. | |
19674 | ||
19675 | == Also see == | |
19676 | ||
19677 | * <:Bugs20041109:> | |
19678 | ||
19679 | <<< | |
19680 | ||
19681 | :mlton-guide-page: Release20051202 | |
19682 | [[Release20051202]] | |
19683 | Release20051202 | |
19684 | =============== | |
19685 | ||
19686 | This is an archived public release of MLton, version 20051202. | |
19687 | ||
19688 | == Changes since the last public release == | |
19689 | ||
19690 | * The <:License:MLton license> is now BSD-style instead of the GPL. | |
19691 | * New platforms: <:RunningOnMinGW:X86/MinGW> and HPPA/Linux. | |
19692 | * Improved and expanded documentation, based on the MLton wiki. | |
19693 | * Compiler. | |
19694 | ** improved exception history. | |
19695 | ** <:CompileTimeOptions:Command-line switches>. | |
19696 | *** Added: ++-as-opt++, ++-mlb-path-map++, ++-target-as-opt++, ++-target-cc-opt++. | |
19697 | *** Removed: ++-native++, ++-sequence-unit++, ++-warn-match++, ++-warn-unused++. | |
19698 | * Language. | |
19699 | ** <:ForeignFunctionInterface:FFI> syntax changes and extensions. | |
19700 | *** Added: `_symbol`. | |
19701 | *** Changed: `_export`, `_import`. | |
19702 | *** Removed: `_ffi`. | |
19703 | ** <:MLBasisAnnotations:ML Basis annotations>. | |
19704 | *** Added: `allowFFI`, `nonexhaustiveExnMatch`, `nonexhaustiveMatch`, `redundantMatch`, `sequenceNonUnit`. | |
19705 | *** Deprecated: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`. | |
19706 | * Libraries. | |
19707 | ** Basis Library. | |
19708 | *** Added: `Int1`, `Word1`. | |
19709 | ** <:MLtonStructure:MLton structure>. | |
19710 | *** Added: `Process.create`, `ProcEnv.setgroups`, `Rusage.measureGC`, `Socket.fdToSock`, `Socket.Ctl.getError`. | |
19711 | *** Changed: `MLton.Platform.Arch`. | |
19712 | ** Other libraries. | |
19713 | *** Added: <:CKitLibrary:ckit>, <:MLNLFFI:ML-NLFFI library>, <:SMLNJLibrary:SML/NJ library>. | |
19714 | * Tools. | |
19715 | ** Updates of `mllex` and `mlyacc` from SML/NJ. | |
19716 | ** Added <:MLNLFFI:mlnlffigen>. | |
19717 | ** <:Profiling:> supports better inclusion/exclusion of code. | |
19718 | ||
19719 | For a complete list of changes and bug fixes since | |
19720 | <:Release20041109:>, see the | |
19721 | <!RawGitFile(mlton,on-20051202-release,doc/changelog)> and | |
19722 | <:Bugs20041109:>. | |
19723 | ||
19724 | == 20051202 binary packages == | |
19725 | ||
19726 | * x86 | |
19727 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-cygwin.tgz[Cygwin] 1.5.18-1 | |
19728 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-freebsd.tbz[FreeBSD] 5.4 | |
19729 | ** Linux | |
19730 | *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.deb[Debian] sid | |
19731 | *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1_i386.stable.deb[Debian] stable (Sarge) | |
19732 | *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386.rpm[RedHat] 7.1-9.3 FC1-FC4 | |
19733 | *** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-linux.tgz[tgz] for other distributions (glibc 2.3) | |
19734 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-mingw.tgz[MinGW] | |
19735 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-netbsd.tgz[NetBSD] 2.0.2 | |
19736 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.i386-openbsd.tgz[OpenBSD] 3.7 | |
19737 | * PowerPC | |
19738 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.powerpc-darwin.tgz[Darwin] 7.9.0 (Mac OS X) | |
19739 | * Sparc | |
19740 | ** http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.sparc-solaris.tgz[Solaris] 8 | |
19741 | ||
19742 | == 20051202 source packages == | |
19743 | ||
19744 | * http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.tgz[source tgz] | |
19745 | * Debian http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.dsc[dsc], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202-1.diff.gz[diff.gz], http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton_20051202.orig.tar.gz[orig.tar.gz] | |
19746 | * RedHat http://sourceforge.net/projects/mlton/files/mlton/20051202/mlton-20051202-1.src.rpm[source rpm] | |
19747 | ||
19748 | == Packages available at other sites == | |
19749 | ||
19750 | * http://packages.debian.org/cgi-bin/search_packages.pl?searchon=names&version=all&exact=1&keywords=mlton[Debian] | |
19751 | * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD] | |
19752 | * Fedora Core http://fedoraproject.org/extras/4/i386/repodata/repoview/mlton-0-20051202-8.fc4.html[4] http://fedoraproject.org/extras/5/i386/repodata/repoview/mlton-0-20051202-8.fc5.html[5] | |
19753 | * http://packages.ubuntu.com/dapper/devel/mlton[Ubuntu] | |
19754 | ||
19755 | == Also see == | |
19756 | ||
19757 | * <:Bugs20051202:> | |
19758 | * http://www.mlton.org/guide/20051202/[MLton Guide (20051202)]. | |
19759 | + | |
19760 | A snapshot of the MLton wiki at the time of release. | |
19761 | ||
19762 | <<< | |
19763 | ||
19764 | :mlton-guide-page: Release20070826 | |
19765 | [[Release20070826]] | |
19766 | Release20070826 | |
19767 | =============== | |
19768 | ||
19769 | This is an archived public release of MLton, version 20070826. | |
19770 | ||
19771 | == Changes since the last public release == | |
19772 | ||
19773 | * New platforms: | |
19774 | ** <:RunningOnAMD64:AMD64>/<:RunningOnLinux:Linux>, <:RunningOnAMD64:AMD64>/<:RunningOnFreeBSD:FreeBSD> | |
19775 | ** <:RunningOnHPPA:HPPA>/<:RunningOnHPUX:HPUX> | |
19776 | ** <:RunningOnPowerPC:PowerPC>/<:RunningOnAIX:AIX> | |
19777 | ** <:RunningOnX86:X86>/<:RunningOnDarwin:Darwin (Mac OS X)> | |
19778 | * Compiler. | |
19779 | ** Support for 64-bit platforms. | |
19780 | *** Native amd64 codegen. | |
19781 | ** <:CompileTimeOptions:Compile-time options>. | |
19782 | *** Added: ++-codegen amd64++, ++-codegen x86++, ++-default-type __type__++, ++-profile-val {false|true}++. | |
19783 | *** Changed: ++-stop f++ (file listing now includes `.mlb` files). | |
19784 | ** Bytecode codegen. | |
19785 | *** Support for exception history. | |
19786 | *** Support for profiling. | |
19787 | * Language. | |
19788 | ** <:MLBasisAnnotations:ML Basis annotations>. | |
19789 | *** Removed: `allowExport`, `allowImport`, `sequenceUnit`, `warnMatch`. | |
19790 | * Libraries. | |
19791 | ** <:BasisLibrary:Basis Library>. | |
19792 | *** Added: `PackWord16Big`, `PackWord16Little`, `PackWord64Big`, `PackWord64Little`. | |
19793 | *** Bug fixes: see <!RawGitFile(mlton,on-20070826-release,doc/changelog)>. | |
19794 | ** <:MLtonStructure:MLton structure>. | |
19795 | *** Added: `MLTON_MONO_ARRAY`, `MLTON_MONO_VECTOR`, `MLTON_REAL`, `MLton.BinIO.tempPrefix`, `MLton.CharArray`, `MLton.CharVector`, `MLton.Exn.defaultTopLevelHandler`, `MLton.Exn.getTopLevelHandler`, `MLton.Exn.setTopLevelHandler`, `MLton.IntInf.BigWord`, `MLton.IntInf.SmallInt`, `MLton.LargeReal`, `MLton.LargeWord`, `MLton.Real`, `MLton.Real32`, `MLton.Real64`, `MLton.Rlimit.Rlim`, `MLton.TextIO.tempPrefix`, `MLton.Vector.create`, `MLton.Word.bswap`, `MLton.Word8.bswap`, `MLton.Word16`, `MLton.Word32`, `MLton.Word64`, `MLton.Word8Array`, `MLton.Word8Vector`. | |
19796 | *** Changed: `MLton.Array.unfoldi`, `MLton.IntInf.rep`, `MLton.Rlimit`, `MLton.Vector.unfoldi`. | |
19797 | *** Deprecated: `MLton.Socket`. | |
19798 | ** Other libraries. | |
19799 | *** Added: <:MLRISCLibrary:MLRISC library>. | |
19800 | *** Updated: <:CKitLibrary:ckit library>, <:SMLNJLibrary:SML/NJ library>. | |
19801 | * Tools. | |
19802 | ||
19803 | For a complete list of changes and bug fixes since | |
19804 | <:Release20051202:>, see the | |
19805 | <!RawGitFile(mlton,on-20070826-release,doc/changelog)> and | |
19806 | <:Bugs20051202:>. | |
19807 | ||
19808 | == 20070826 binary packages == | |
19809 | ||
19810 | * AMD64 | |
19811 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.amd64-linux.tgz[Linux], glibc 2.3 | |
19812 | * HPPA | |
19813 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.hppa-hpux1100.tgz[HPUX] 11.00 and above, statically linked against <:GnuMP:> | |
19814 | * PowerPC | |
19815 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-aix51.tgz[AIX] 5.1 and above, statically linked against <:GnuMP:> | |
19816 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-static.tgz[Darwin] 8.10 (Mac OS X), statically linked against <:GnuMP:> | |
19817 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.powerpc-darwin.gmp-macports.tgz[Darwin] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
19818 | * Sparc | |
19819 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.sparc-solaris8.tgz[Solaris] 8 and above, statically linked against <:GnuMP:> | |
19820 | * X86 | |
19821 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-cygwin.tgz[Cygwin] 1.5.24-2 | |
19822 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
19823 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-macports.dmg[Darwin (.dmg)] 8.10 (Mac OS X), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
19824 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 8.10 (Mac OS X), statically linked against <:GnuMP:> | |
19825 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-darwin.gmp-static.dmg[Darwin (.dmg)] 8.10 (Mac OS X), statically linked against <:GnuMP:> | |
19826 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-freebsd.tgz[FreeBSD] | |
19827 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.tgz[Linux], glibc 2.3 | |
19828 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-linux.glibc213.gmp-static.tgz[Linux], glibc 2.1, statically linked against <:GnuMP:> | |
19829 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-dll.tgz[MinGW], dynamically linked against <:GnuMP:> (requires `libgmp-3.dll`) | |
19830 | ** http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.x86-mingw.gmp-static.tgz[MinGW], statically linked against <:GnuMP:> | |
19831 | ||
19832 | == 20070826 source packages == | |
19833 | ||
19834 | * http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton-20070826-1.src.tgz[source tgz] | |
19835 | ||
19836 | * Debian http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.dsc[dsc], | |
19837 | http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826-1.diff.gz[diff.gz], | |
19838 | http://sourceforge.net/projects/mlton/files/mlton/20070826/mlton_20070826.orig.tar.gz[orig.tar.gz] | |
19839 | ||
19840 | == Packages available at other sites == | |
19841 | ||
19842 | * http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] | |
19843 | * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD] | |
19844 | * https://admin.fedoraproject.org/pkgdb/packages/name/mlton[Fedora] | |
19845 | * http://packages.ubuntu.com/cgi-bin/search_packages.pl?keywords=mlton&searchon=names&version=all&release=all[Ubuntu] | |
19846 | ||
19847 | == Also see == | |
19848 | ||
19849 | * <:Bugs20070826:> | |
19850 | * http://www.mlton.org/guide/20070826/[MLton Guide (20070826)]. | |
19851 | + | |
19852 | A snapshot of the MLton wiki at the time of release. | |
19853 | ||
19854 | <<< | |
19855 | ||
19856 | :mlton-guide-page: Release20100608 | |
19857 | [[Release20100608]] | |
19858 | Release20100608 | |
19859 | =============== | |
19860 | ||
19861 | This is an archived public release of MLton, version 20100608. | |
19862 | ||
19863 | == Changes since the last public release == | |
19864 | ||
19865 | * New platforms. | |
19866 | ** <:RunningOnAMD64:AMD64>/<:RunningOnDarwin:Darwin> (Mac OS X Snow Leopard) | |
19867 | ** <:RunningOnIA64:IA64>/<:RunningOnHPUX:HPUX> | |
19868 | ** <:RunningOnPowerPC64:PowerPC64>/<:RunningOnAIX:AIX> | |
19869 | * Compiler. | |
19870 | ** <:CompileTimeOptions:Command-line switches>. | |
19871 | *** Added: ++-mlb-path-var __<name> <value>__++ | |
19872 | *** Removed: ++-keep sml++, ++-stop sml++ | |
19873 | ** Improved constant folding of floating-point operations. | |
19874 | ** Experimental: Support for compiling to a C library; see <:LibrarySupport: documentation>. | |
19875 | ** Extended ++-show-def-use __output__++ to include types of variable definitions. | |
19876 | ** Deprecated features (to be removed in a future release) | |
19877 | *** Bytecode codegen: The bytecode codegen has not seen significant use and it is not well understood by any of the active developers. | |
19878 | *** Support for `.cm` files as input: The ML Basis system provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers. | |
19879 | ** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)> | |
19880 | * Runtime. | |
19881 | ** <:RunTimeOptions:@MLton switches>. | |
19882 | *** Added: ++may-page-heap {false|true}++ | |
19883 | ** ++may-page-heap++: By default, MLton will not page the heap to disk when unable to grow the heap to accommodate an allocation. (Previously, paging the heap to disk was the default, with no means to disable it, which raised security and least-surprise issues.) | |
19884 | ** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)> | |
19885 | * Language. | |
19886 | ** Allow numeric characters in <:MLBasis:ML Basis> path variables. | |
19887 | * Libraries. | |
19888 | ** <:BasisLibrary:Basis Library>. | |
19889 | *** Bug fixes: see <!RawGitFile(mlton,on-20100608-release,doc/changelog)> | |
19890 | ** <:MLtonStructure:MLton structure>. | |
19891 | *** Added: `MLton.equal`, `MLton.hash`, `MLton.Cont.isolate`, `MLton.GC.Statistics`, `MLton.Pointer.sizeofPointer`, `MLton.Socket.Address.toVector` | |
19892 | *** Changed: | |
19893 | *** Deprecated: `MLton.Socket` | |
19894 | ** <:UnsafeStructure:Unsafe structure>. | |
19895 | *** Added versions of all of the monomorphic array and vector structures. | |
19896 | ** Other libraries. | |
19897 | *** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library>. | |
19898 | * Tools. | |
19899 | ** `mllex` | |
19900 | *** Eliminated top-level `type int = Int.int` in output. | |
19901 | *** Include `(*#line line:col "file.lex" *)` directives in output. | |
19902 | *** Added `%posint` command, to set the `yypos` type and allow the lexing of multi-gigabyte files. | |
19903 | ** `mlnlffigen` | |
19904 | *** Added command-line switches `-linkage archive` and `-linkage shared`. | |
19905 | *** Deprecated command-line switch `-linkage static`. | |
19906 | *** Added support for <:RunningOnIA64:IA64> and <:RunningOnHPPA:HPPA> targets. | |
19907 | ** `mlyacc` | |
19908 | *** Eliminated top-level `type int = Int.int` in output. | |
19909 | *** Include `(*#line line:col "file.grm" *)` directives in output. | |
19910 | ||
19911 | For a complete list of changes and bug fixes since <:Release20070826:>, see the | |
19912 | <!RawGitFile(mlton,on-20100608-release,doc/changelog)> | |
19913 | and <:Bugs20070826:>. | |
19914 | ||
19915 | == 20100608 binary packages == | |
19916 | ||
19917 | * AMD64 (aka "x86-64" or "x64") | |
19918 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
19919 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 10.3 (Mac OS X Snow Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables) | |
19920 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.tgz[Linux], glibc 2.11 | |
19921 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.amd64-linux.static.tgz[Linux], statically linked | |
19922 | ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer | |
19923 | * X86 | |
19924 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-cygwin.tgz[Cygwin] 1.7.5 | |
19925 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-macports.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
19926 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-darwin.gmp-static.tgz[Darwin (.tgz)] 9.8 (Mac OS X Leopard), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables) | |
19927 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.tgz[Linux], glibc 2.11 | |
19928 | ** http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608-1.x86-linux.static.tgz[Linux], statically linked | |
19929 | ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20100608/MLton-20100608-1.msi[MSI] (61MB) installer | |
19930 | ||
19931 | == 20100608 source packages == | |
19932 | ||
19933 | * http://sourceforge.net/projects/mlton/files/mlton/20100608/mlton-20100608.src.tgz[mlton-20100608.src.tgz] | |
19934 | ||
19935 | == Packages available at other sites == | |
19936 | ||
19937 | * http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] | |
19938 | * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD] | |
19939 | * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora] | |
19940 | * http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu] | |
19941 | ||
19942 | == Also see == | |
19943 | ||
19944 | * <:Bugs20100608:> | |
19945 | * http://www.mlton.org/guide/20100608/[MLton Guide (20100608)]. | |
19946 | + | |
19947 | A snapshot of the MLton wiki at the time of release. | |
19948 | ||
19949 | <<< | |
19950 | ||
19951 | :mlton-guide-page: Release20130715 | |
19952 | [[Release20130715]] | |
19953 | Release20130715 | |
19954 | =============== | |
19955 | ||
19956 | This is an archived public release of MLton, version 20130715. | |
19957 | ||
19958 | == Changes since the last public release == | |
19959 | ||
19960 | // * New platforms. | |
19961 | // ** ??? | |
19962 | * Compiler. | |
19963 | ** Cosmetic improvements to type-error messages. | |
19964 | ** Removed features: | |
19965 | *** Bytecode codegen: The bytecode codegen had not seen significant use and it was not well understood by any of the active developers. | |
19966 | *** Support for `.cm` files as input: The <:MLBasis:ML Basis system> provides much better infrastructure for "programming in the very large" than the (very) limited support for CM. The `cm2mlb` tool (available in the source distribution) can be used to convert CM projects to MLB projects, preserving the CM scoping of module identifiers. | |
19967 | ** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)> | |
19968 | * Runtime. | |
19969 | ** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)> | |
19970 | * Language. | |
19971 | ** Interpret `(*#line line:col "file" *)` directives as relative file names. | |
19972 | ** <:MLBasisAnnotations:ML Basis annotations>. | |
19973 | *** Added: `resolveScope` | |
19974 | * Libraries. | |
19975 | ** <:BasisLibrary:Basis Library>. | |
19976 | *** Improved performance of `String.concatWith`. | |
19977 | *** Use bit operations for `REAL.class` and other low-level operations. | |
19978 | *** Support additional variables with `Posix.ProcEnv.sysconf`. | |
19979 | *** Bug fixes: see <!RawGitFile(mlton,on-20130715-release,doc/changelog)> | |
19980 | ** <:MLtonStructure:MLton structure>. | |
19981 | *** Removed: `MLton.Socket` | |
19982 | ** Other libraries. | |
19983 | *** Updated: <:CKitLibrary:ckit library>, <:MLRISCLibrary:MLRISC library>, <:SMLNJLibrary:SML/NJ library> | |
19984 | *** Added: <:MLLPTLibrary:MLLPT library> | |
19985 | * Tools. | |
19986 | ** `mllex` | |
19987 | *** Generate `(*#line line:col "file.lex" *)` directives with simple (relative) file names, rather than absolute paths. | |
19988 | ** `mlyacc` | |
19989 | *** Generate `(*#line line:col "file.grm" *)` directives with simple (relative) file names, rather than absolute paths. | |
19990 | *** Fixed bug in comment-handling in lexer. | |
19991 | ||
19992 | For a complete list of changes and bug fixes since | |
19993 | <:Release20100608:>, see the | |
19994 | <!RawGitFile(mlton,on-20130715-release,doc/changelog)> and | |
19995 | <:Bugs20100608:>. | |
19996 | ||
19997 | == 20130715 binary packages == | |
19998 | ||
19999 | * AMD64 (aka "x86-64" or "x64") | |
20000 | ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-macports.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), dynamically linked against <:GnuMP:> in `/opt/local/lib` (suitable for http://macports.org[MacPorts] install of <:GnuMP:>) | |
20001 | ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 11.4 (Mac OS X Lion), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables) | |
20002 | ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.tgz[Linux], glibc 2.15 | |
20003 | // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.amd64-linux.static.tgz[Linux], statically linked | |
20004 | // ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer | |
20005 | * X86 | |
20006 | // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-cygwin.tgz[Cygwin] 1.7.5 | |
20007 | ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.tgz[Linux], glibc 2.15 | |
20008 | // ** http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715-1.x86-linux.static.tgz[Linux], statically linked | |
20009 | // ** Windows MinGW 32/64 http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.exe[self-extracting] (28MB) or http://sourceforge.net/projects/mlton/files/mlton/20130715/MLton-20130715-1.msi[MSI] (61MB) installer | |
20010 | ||
20011 | == 20130715 source packages == | |
20012 | ||
20013 | * http://sourceforge.net/projects/mlton/files/mlton/20130715/mlton-20130715.src.tgz[mlton-20130715.src.tgz] | |
20014 | ||
20015 | == Downstream packages == | |
20016 | ||
20017 | * http://packages.debian.org/search?keywords=mlton&searchon=names&suite=all&section=all[Debian] | |
20018 | * http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[FreeBSD] | |
20019 | * https://admin.fedoraproject.org/pkgdb/acls/name/mlton[Fedora] | |
20020 | * http://packages.ubuntu.com/search?suite=default&section=all&arch=any&searchon=names&keywords=mlton[Ubuntu] | |
20021 | ||
20022 | == Also see == | |
20023 | ||
20024 | * <:Bugs20130715:> | |
20025 | * http://www.mlton.org/guide/20130715/[MLton Guide (20130715)]. | |
20026 | + | |
20027 | A snapshot of the MLton website at the time of release. | |
20028 | ||
20029 | <<< | |
20030 | ||
20031 | :mlton-guide-page: Release20180207 | |
20032 | [[Release20180207]] | |
20033 | Release20180207 | |
20034 | =============== | |
20035 | ||
20036 | Here you can download the latest public release of MLton, version 20180207. | |
20037 | ||
20038 | == Changes since the last public release == | |
20039 | ||
20040 | * Compiler. | |
20041 | ** Added an experimental LLVM codegen (`-codegen llvm`); requires LLVM tools | |
20042 | (`llvm-as`, `opt`, `llc`) version ≥ 3.7. | |
20043 | ** Made many substantial cosmetic improvements to front-end diagnostic | |
20044 | messages, especially with respect to source location regions, type inference | |
20045 | for `fun` and `val rec` declarations, signature constraints applied to a | |
20046 | structure, `sharing type` specifications and `where type` signature | |
20047 | expressions, type constructor or type variable escaping scope, and | |
20048 | nonexhaustive pattern matching. | |
20049 | ** Fixed minor bugs with exception replication, precedence parsing of function | |
20050 | clauses, and simultaneous `sharing` of multiple structures. | |
20051 | ** Made compilation deterministic (eliminate output executable name from | |
20052 | compile-time specified `@MLton` runtime arguments; deterministically generate | |
20053 | magic constant for executable). | |
20054 | ** Updated `-show-basis` (recursively expand structures in environments, | |
20055 | displaying components with long identifiers; append `(* @ region *)` | |
20056 | annotations to items shown in environment). | |
20057 | ** Forced amd64 codegen to generate PIC on amd64-linux targets. | |
20058 | * Runtime. | |
20059 | ** Added `gc-summary-file file` runtime option. | |
20060 | ** Reorganized runtime support for `IntInf` operations so that programs that | |
20061 | do not use `IntInf` compile to executables with no residual dependency on GMP. | |
20062 | ** Changed heap representation to store forwarding pointer for an object in | |
20063 | the object header (rather than in the object data and setting the header to a | |
20064 | sentinel value). | |
20065 | * Language. | |
20066 | ** Added support for selected SuccessorML features; see | |
20067 | http://mlton.org/SuccessorML for details. | |
20068 | ** Added `(*#showBasis "file" *)` directive; see | |
20069 | http://mlton.org/ShowBasisDirective for details. | |
20070 | ** FFI: | |
20071 | *** Added `pure`, `impure`, and `reentrant` attributes to `_import`. An | |
20072 | unattributed `_import` is treated as `impure`. A `pure` `_import` may be | |
20073 | subject to more aggressive optimizations (common subexpression elimination, | |
20074 | dead-code elimination). An `_import`-ed C function that (directly or | |
20075 | indirectly) calls an `_export`-ed SML function should be attributed | |
20076 | `reentrant`. | |
20077 | ** ML Basis annotations. | |
20078 | *** Added `allowSuccessorML {false|true}` to enable all SuccessorML features | |
20079 | and other annotations to enable specific SuccessorML features; see | |
20080 | http://mlton.org/SuccessorML for details. | |
20081 | *** Split `nonexhaustiveMatch {warn|error|ignore}` and `redundantMatch | |
20082 | {warn|error|ignore}` into `nonexhaustiveMatch` and `redundantMatch` | |
20083 | (which control diagnostics for `case` expressions, `fn` expressions, and `fun` | |
20084 | declarations, all of which may raise `Match` on failure) and `nonexhaustiveBind` | |
20085 | and `redundantBind` (which control diagnostics for `val` declarations, which may | |
20086 | raise `Bind` on failure). | |
20087 | *** Added `valrecConstr {warn|error|ignore}` to report when a `val rec` (or | |
20088 | `fun`) declaration redefines an identifier that previously had constructor | |
20089 | status. | |
20090 | * Libraries. | |
20091 | ** Basis Library. | |
20092 | *** Improved performance of `Array.copy`, `Array.copyVec`, `Vector.append`, | |
20093 | `String.^`, `String.concat`, `String.concatWith`, and other related | |
20094 | functions by using `memmove` rather than element-by-element constructions. | |
20095 | ** `Unsafe` structure. | |
20096 | *** Added unsafe operations for array uninitialization and raw arrays; see | |
20097 | https://github.com/MLton/mlton/pull/207 for details. | |
20098 | ** Other libraries. | |
20099 | *** Updated: ckit library, MLLPT library, MLRISC library, SML/NJ library | |
20100 | * Tools. | |
20101 | ** mlnlffigen | |
20102 | *** Updated to warn and skip (rather than abort) when encountering functions | |
20103 | with `struct`/`union` argument or return type. | |
20104 | ||
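As a rough illustration of the `Match`/`Bind` split listed under the ML Basis annotations above, the following sketch in plain Standard ML shows one nonexhaustive `fun` declaration and one nonexhaustive `val` declaration. The names `firstElem` and `unwrap` are hypothetical and chosen only for this example; the diagnostics a compiler prints for them depend on the annotations in effect.

```sml
(* Nonexhaustive `fun` clause: this kind of declaration is covered by
   nonexhaustiveMatch and raises Match when applied to the empty list. *)
fun firstElem (x :: _) = x

(* Nonexhaustive `val` pattern: this kind of declaration is covered by
   nonexhaustiveBind and raises Bind when the option is NONE. *)
fun unwrap (opt : int option) = let val SOME n = opt in n end

(* Observe which exception each failure raises. *)
val matchRaised = (firstElem ([] : int list); false) handle Match => true
val bindRaised = (unwrap NONE; false) handle Bind => true
```
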
20105 | For a complete list of changes and bug fixes since | |
20106 | <:Release20130715:>, see the | |
20107 | <!ViewGitFile(mlton,on-20180207-release,CHANGELOG.adoc)> and | |
20108 | <:Bugs20130715:>. | |
20109 | ||
20110 | == 20180207 binary packages == | |
20111 | ||
20112 | * AMD64 (aka "x86-64" or "x64") | |
20113 | ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-homebrew.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), dynamically linked against <:GnuMP:> in `/usr/local/lib` (suitable for https://brew.sh/[Homebrew] install of <:GnuMP:>) | |
20114 | ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-darwin.gmp-static.tgz[Darwin (.tgz)] 16.7 (Mac OS X Sierra), statically linked against <:GnuMP:> (but requires <:GnuMP:> for generated executables) | |
20115 | ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.amd64-linux.tgz[Linux], glibc 2.23 | |
20116 | // ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer | |
20117 | // * X86 | |
20118 | // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-cygwin.tgz[Cygwin] 1.7.5 | |
20119 | // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.tgz[Linux], glibc 2.23 | |
20120 | // ** https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207-1.x86-linux.static.tgz[Linux], statically linked | |
20121 | // ** Windows MinGW 32/64 https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.exe[self-extracting] (28MB) or https://sourceforge.net/projects/mlton/files/mlton/20180207/MLton-20180207-1.msi[MSI] (61MB) installer | |
20122 | ||
20123 | == 20180207 source packages == | |
20124 | ||
20125 | * https://sourceforge.net/projects/mlton/files/mlton/20180207/mlton-20180207.src.tgz[mlton-20180207.src.tgz] | |
20126 | ||
20127 | == Also see == | |
20128 | ||
20129 | * <:Bugs20180207:> | |
20130 | * http://www.mlton.org/guide/20180207/[MLton Guide (20180207)]. | |
20131 | + | |
20132 | A snapshot of the MLton website at the time of release. | |
20133 | ||
20134 | <<< | |
20135 | ||
20136 | :mlton-guide-page: ReleaseChecklist | |
20137 | [[ReleaseChecklist]] | |
20138 | ReleaseChecklist | |
20139 | ================ | |
20140 | ||
20141 | == Advance preparation for release == | |
20142 | ||
20143 | * Update `./CHANGELOG.adoc`. | |
20144 | ** Write entries for missing notable commits. | |
20145 | ** Write summary of changes from previous release. | |
20146 | ** Update with estimated release date. | |
20147 | * Update `./README.adoc`. | |
20148 | ** Check features and description. | |
20149 | * Update `man/{mlton,mlprof}.1`. | |
20150 | ** Check compile-time and run-time options in `man/mlton.1`. | |
20151 | ** Check options in `man/mlprof.1`. | |
20152 | ** Update with estimated release date. | |
20153 | * Update `doc/guide`. | |
20154 | // ** Check <:OrphanedPages:> and <:WantedPages:>. | |
20155 | ** Synchronize <:Features:> page with `./README.adoc`. | |
20156 | ** Update <:Credits:> page with acknowledgements. | |
20157 | ** Create *ReleaseYYYYMM??* page (i.e., forthcoming release) based on *ReleaseXXXXLLCC* (i.e., previous release). | |
20158 | *** Update summary from `./CHANGELOG.adoc`. | |
20159 | *** Update links to estimated release date. | |
20160 | ** Create *BugsYYYYMM??* page based on *BugsXXXXLLCC*. | |
20161 | *** Update links to estimated release date. | |
20162 | ** Spell check pages. | |
20163 | * Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>. | |
20164 | ||
20165 | == Prepare sources for tagging == | |
20166 | ||
20167 | * Update `./CHANGELOG.adoc`. | |
20168 | ** Update with proper release date. | |
20169 | * Update `man/{mlton,mlprof}.1`. | |
20170 | ** Update with proper release date. | |
20171 | * Update `doc/guide`. | |
20172 | ** Rename *ReleaseYYYYMM??* to *ReleaseYYYYMMDD* with proper release date. | |
20173 | *** Update links with proper release date. | |
20174 | ** Rename *BugsYYYYMM??* to *BugsYYYYMMDD* with proper release date. | |
20175 | *** Update links with proper release date. | |
20176 | ** Update *ReleaseXXXXLLCC*. | |
20177 | *** Change intro to "`This is an archived public release of MLton, version XXXXLLCC.`" | |
20178 | ** Update <:Home:> with note of new release. | |
20179 | *** Change `What's new?` text to `Please try out our new release, <:ReleaseYYYYMMDD:MLton YYYYMMDD>`. | |
20180 | *** Update `Download` link with proper release date. | |
20181 | ** Update <:Releases:> with new release. | |
20182 | * Ensure that all updates are pushed to `master` branch of <!ViewGitProj(mlton)>. | |
20183 | ||
20184 | == Tag sources == | |
20185 | ||
20186 | * Shell commands: | |
20187 | + | |
20188 | ---- | |
20189 | git clone http://github.com/MLton/mlton mlton.git | |
20190 | cd mlton.git | |
20191 | git checkout master | |
20192 | git tag -a -m "Tagging YYYYMMDD release" on-YYYYMMDD-release master | |
20193 | git push origin on-YYYYMMDD-release | |
20194 | ---- | |
20195 | ||
20196 | == Packaging == | |
20197 | ||
20198 | === SourceForge FRS === | |
20199 | ||
20200 | * Create *YYYYMMDD* directory: | |
20201 | + | |
20202 | ----- | |
20203 | sftp user@frs.sourceforge.net:/home/frs/project/mlton/mlton | |
20204 | sftp> mkdir YYYYMMDD | |
20205 | sftp> quit | |
20206 | ----- | |
20207 | ||
20208 | === Source release === | |
20209 | ||
20210 | * Create `mlton-YYYYMMDD.src.tgz`: | |
20211 | + | |
20212 | ---- | |
20213 | git clone http://github.com/MLton/mlton mlton | |
20214 | cd mlton | |
20215 | git checkout on-YYYYMMDD-release | |
20216 | make MLTON_VERSION=YYYYMMDD source-release | |
20217 | cd .. | |
20218 | ---- | |
20219 | + | |
20220 | or | |
20221 | + | |
20222 | ---- | |
20223 | wget https://github.com/MLton/mlton/archive/on-YYYYMMDD-release.tar.gz | |
20224 | tar xzvf on-YYYYMMDD-release.tar.gz | |
20225 | cd mlton-on-YYYYMMDD-release | |
20226 | make MLTON_VERSION=YYYYMMDD source-release | |
20227 | cd .. | |
20228 | ---- | |
20229 | ||
20230 | * Upload `mlton-YYYYMMDD.src.tgz`: | |
20231 | + | |
20232 | ----- | |
20233 | scp mlton-YYYYMMDD.src.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/ | |
20234 | ----- | |
20235 | ||
20236 | * Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD.src.tgz` link. | |
20237 | ||
20238 | === Binary releases === | |
20239 | ||
20240 | * Build and create `mlton-YYYYMMDD-1.ARCH-OS.tgz`: | |
20241 | + | |
20242 | ---- | |
20243 | wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz | |
20244 | tar xzvf mlton-YYYYMMDD.src.tgz | |
20245 | cd mlton-YYYYMMDD | |
20246 | make binary-release | |
20247 | cd .. | |
20248 | ---- | |
20249 | ||
20250 | * Upload `mlton-YYYYMMDD-1.ARCH-OS.tgz`: | |
20251 | + | |
20252 | ----- | |
20253 | scp mlton-YYYYMMDD-1.ARCH-OS.tgz user@frs.sourceforge.net:/home/frs/project/mlton/mlton/YYYYMMDD/ | |
20254 | ----- | |
20255 | ||
20256 | * Update *ReleaseYYYYMMDD* with `mlton-YYYYMMDD-1.ARCH-OS.tgz` link. | |
20257 | ||
20258 | == Website == | |
20259 | ||
20260 | * `guide/YYYYMMDD` gets a copy of `doc/guide/localhost`. | |
20261 | * Shell commands: | |
20262 | + | |
20263 | ---- | |
20264 | wget http://sourceforge.net/projects/mlton/files/mlton/YYYYMMDD/mlton-YYYYMMDD.src.tgz | |
20265 | tar xzvf mlton-YYYYMMDD.src.tgz | |
20266 | cd mlton-YYYYMMDD | |
20267 | cd doc/guide | |
20268 | cp -prf localhost YYYYMMDD | |
20269 | tar czvf guide-YYYYMMDD.tgz YYYYMMDD | |
20270 | rsync -avzP --delete -e ssh YYYYMMDD user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/ | |
20271 | rsync -avzP --delete -e ssh guide-YYYYMMDD.tgz user@web.sourceforge.net:/home/project-web/mlton/htdocs/guide/ | |
20272 | ---- | |
20273 | ||
20274 | == Announce release == | |
20275 | ||
20276 | * Mail announcement to: | |
20277 | ** mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`] | |
20278 | ** mailto:MLton-user@mlton.org[`MLton-user@mlton.org`] | |
20279 | ||
20280 | == Misc. == | |
20281 | ||
20282 | * Generate new <:Performance:> numbers. | |
20283 | ||
20284 | <<< | |
20285 | ||
20286 | :mlton-guide-page: Releases | |
20287 | [[Releases]] | |
20288 | Releases | |
20289 | ======== | |
20290 | ||
20291 | Public releases of MLton: | |
20292 | ||
20293 | * <:Release20180207:> | |
20294 | * <:Release20130715:> | |
20295 | * <:Release20100608:> | |
20296 | * <:Release20070826:> | |
20297 | * <:Release20051202:> | |
20298 | * <:Release20041109:> | |
20299 | * Release20040227 | |
20300 | * Release20030716 | |
20301 | * Release20030711 | |
20302 | * Release20030312 | |
20303 | * Release20020923 | |
20304 | * Release20020410 | |
20305 | * Release20011006 | |
20306 | * Release20010806 | |
20307 | * Release20010706 | |
20308 | * Release20000906 | |
20309 | * Release20000712 | |
20310 | * Release19990712 | |
20311 | * Release19990319 | |
20312 | * Release19980826 | |
20313 | ||
20314 | <<< | |
20315 | ||
20316 | :mlton-guide-page: RemoveUnused | |
20317 | [[RemoveUnused]] | |
20318 | RemoveUnused | |
20319 | ============ | |
20320 | ||
20321 | <:RemoveUnused:> is an optimization pass for both the <:SSA:> and | |
20322 | <:SSA2:> <:IntermediateLanguage:>s, invoked from <:SSASimplify:> and | |
20323 | <:SSA2Simplify:>. | |
20324 | ||
20325 | == Description == | |
20326 | ||
20327 | This pass aggressively removes unused: | |
20328 | ||
20329 | * datatypes | |
20330 | * datatype constructors | |
20331 | * datatype constructor arguments | |
20332 | * functions | |
20333 | * function arguments | |
20334 | * function returns | |
20335 | * blocks | |
20336 | * block arguments | |
20337 | * statements (variable bindings) | |
20338 | * handlers from non-tail calls (mayRaise analysis) | |
20339 | * continuations from non-tail calls (mayReturn analysis) | |
20340 | ||
20341 | == Implementation == | |
20342 | ||
20343 | * <!ViewGitFile(mlton,master,mlton/ssa/remove-unused.fun)> | |
20344 | * <!ViewGitFile(mlton,master,mlton/ssa/remove-unused2.fun)> | |
20345 | ||
20346 | == Details and Notes == | |
20347 | ||
20348 | {empty} | |
20349 | ||
20350 | <<< | |
20351 | ||
20352 | :mlton-guide-page: Restore | |
20353 | [[Restore]] | |
20354 | Restore | |
20355 | ======= | |
20356 | ||
20357 | <:Restore:> is a rewrite pass for the <:SSA:> and <:SSA2:> | |
20358 | <:IntermediateLanguage:>s, invoked from <:KnownCase:> and | |
20359 | <:LocalRef:>. | |
20360 | ||
20361 | == Description == | |
20362 | ||
20363 | This pass restores the SSA condition for a violating <:SSA:> or | |
20364 | <:SSA2:> program; the program must satisfy: | |
20365 | ____ | |
20366 | Every path from the root to a use of a variable (excluding globals) | |
20367 | passes through a def of that variable. | |
20368 | ____ | |
20369 | ||
20370 | == Implementation == | |
20371 | ||
20372 | * <!ViewGitFile(mlton,master,mlton/ssa/restore.sig)> | |
20373 | * <!ViewGitFile(mlton,master,mlton/ssa/restore.fun)> | |
20374 | * <!ViewGitFile(mlton,master,mlton/ssa/restore2.sig)> | |
20375 | * <!ViewGitFile(mlton,master,mlton/ssa/restore2.fun)> | |
20376 | ||
20377 | == Details and Notes == | |
20378 | ||
20379 | Based primarily on Section 19.1 of <!Cite(Appel98, Modern Compiler | |
20380 | Implementation in ML)>. | |
20381 | ||
20382 | The main deviation is the calculation of liveness of the violating | |
20383 | variables, which is used to predicate the insertion of phi arguments. | |
20384 | This is due to the algorithm's bias towards imperative languages, for | |
20385 | which it makes the assumption that all variables are defined in the | |
20386 | start block and all variables are "used" at exit. | |
20387 | ||
20388 | The implementation is "optimized" for restoring functions with small numbers of | |
20389 | violating variables: it uses bool vectors to represent sets of violating | |
20390 | variables. | |
20391 | ||
20392 | Also, we use a `Promise.t` to suspend part of the dominance frontier | |
20393 | computation. | |
20394 | ||
20395 | <<< | |
20396 | ||
20397 | :mlton-guide-page: ReturnStatement | |
20398 | [[ReturnStatement]] | |
20399 | ReturnStatement | |
20400 | =============== | |
20401 | ||
20402 | Programmers coming from languages that have a `return` statement, such | |
20403 | as C, Java, and Python, often ask how one can translate functions that | |
20404 | return early into SML. This page briefly describes a number of ways | |
20405 | to translate uses of `return` to SML. | |
20406 | ||
20407 | == Conditional iterator function == | |
20408 | ||
20409 | A conditional iterator function, such as | |
20410 | http://www.standardml.org/Basis/list.html#SIG:LIST.find:VAL[`List.find`], | |
20411 | http://www.standardml.org/Basis/list.html#SIG:LIST.exists:VAL[`List.exists`], | |
20412 | or | |
20413 | http://www.standardml.org/Basis/list.html#SIG:LIST.all:VAL[`List.all`] | |
20414 | is probably what you want in most cases. Unfortunately, it might be | |
20415 | the case that the particular conditional iteration pattern that you | |
20416 | want isn't provided for your data structure. Usually the best | |
20417 | alternative in such a case is to implement the desired iteration | |
20418 | pattern as a higher-order function. For example, to implement a | |
20419 | `find` function for arrays (which already exists as | |
20420 | http://www.standardml.org/Basis/array.html#SIG:ARRAY.findi:VAL[`Array.find`]) | |
20421 | one could write | |
20422 | ||
20423 | [source,sml] | |
20424 | ---- | |
20425 | fun find predicate array = let | |
20426 | fun loop i = | |
20427 | if i = Array.length array then | |
20428 | NONE | |
20429 | else if predicate (Array.sub (array, i)) then | |
20430 | SOME (Array.sub (array, i)) | |
20431 | else | |
20432 | loop (i+1) | |
20433 | in | |
20434 | loop 0 | |
20435 | end | |
20436 | ---- | |
20437 | ||
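For comparison, here is a small usage sketch of the Basis Library's own `Array.find`, which has the same curried shape as the `find` function above. The array contents and predicates are made up for illustration.

```sml
(* Sample data, chosen only for this example. *)
val squares = Array.fromList [1, 4, 9, 16]

(* The search stops at the first element satisfying the predicate. *)
val firstEven = Array.find (fn x => x mod 2 = 0) squares   (* SOME 4 *)

(* No element satisfies the predicate, so the result is NONE. *)
val tooLarge = Array.find (fn x => x > 100) squares        (* NONE *)
```
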
20438 | Of course, this technique, while probably the most common case in | |
20439 | practice, applies only if you are essentially iterating over some data | |
20440 | structure. | |
20441 | ||
20442 | == Escape handler == | |
20443 | ||
20444 | Probably the most direct way to translate code using `return` | |
20445 | statements is to implement `return` using exception | |
20446 | handling. The mechanism can be packaged into a reusable module with | |
20447 | the signature | |
20448 | (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/public/control/exit.sig)>): | |
20449 | [source,sml] | |
20450 | ---- | |
20451 | sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/public/control/exit.sig 6:] | |
20452 | ---- | |
20453 | ||
20454 | (<!Cite(HarperEtAl93, Typing First-Class Continuations in ML)> | |
20455 | discusses the typing of a related construct.) The implementation | |
20456 | (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/detail/control/exit.sml)>) | |
20457 | is straightforward: | |
20458 | [source,sml] | |
20459 | ---- | |
20460 | sys::[./bin/InclGitFile.py mltonlib master com/ssh/extended-basis/unstable/detail/control/exit.sml 6:] | |
20461 | ---- | |
20462 | ||
20463 | Here is an example of how one could implement a `find` function given | |
20464 | an `app` function: | |
20465 | [source,sml] | |
20466 | ---- | |
20467 | fun appToFind (app : ('a -> unit) -> 'b -> unit) | |
20468 | (predicate : 'a -> bool) | |
20469 | (data : 'b) = | |
20470 | Exit.call | |
20471 | (fn return => | |
20472 | (app (fn x => | |
20473 | if predicate x then | |
20474 | return (SOME x) | |
20475 | else | |
20476 | ()) | |
20477 | data | |
20478 | ; NONE)) | |
20479 | ---- | |
20480 | ||
20481 | In the above, as soon as the expression `predicate x` evaluates to | |
20482 | `true` the `app` invocation is terminated. | |
20483 | ||
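The listings above are pulled from the mltonlib repository. As a self-contained sketch of the same idea, the following defines a hypothetical `callWithReturn` function that plays the role of `Exit.call`, assuming the type `(('a -> 'b) -> 'a) -> 'a` that its use above suggests. It is not the mltonlib code, and `firstNegative` is likewise an invented example.

```sml
(* Hypothetical stand-in for Exit.call, built on a local exception. *)
fun ('a, 'b) callWithReturn (body : ('a -> 'b) -> 'a) : 'a =
   let
      (* Exception declarations are generative, so each invocation gets
         its own Return and nested uses do not interfere. *)
      exception Return of 'a
   in
      body (fn result => raise Return result)
      handle Return result => result
   end

(* Leave List.app early at the first negative element. *)
fun firstNegative (xs : int list) : int option =
   callWithReturn
      (fn return =>
          (List.app (fn x => if x < 0 then return (SOME x) else ()) xs
           ; NONE))
```

One limitation of this approach is that a handler for all exceptions (`handle _ => ...`) between the call to `return` and `callWithReturn` would intercept the escape.
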
20484 | ||
20485 | == Continuation-passing Style (CPS) == | |
20486 | ||
20487 | A general way to implement complex control patterns is to use | |
20488 | http://en.wikipedia.org/wiki/Continuation-passing_style[CPS]. In CPS, | |
20489 | instead of returning normally, functions invoke a function passed as | |
20490 | an argument. In general, multiple continuation functions may be | |
20491 | passed as arguments and the ordinary return continuation may also be | |
20492 | used. As an example, here is a function that finds the leftmost | |
20493 | element of a binary tree satisfying a given predicate: | |
20494 | [source,sml] | |
20495 | ---- | |
20496 | datatype 'a tree = LEAF | BRANCH of 'a tree * 'a * 'a tree | |
20497 | ||
20498 | fun find predicate = let | |
20499 | fun recurse continue = | |
20500 | fn LEAF => | |
20501 | continue () | |
20502 | | BRANCH (lhs, elem, rhs) => | |
20503 | recurse | |
20504 | (fn () => | |
20505 | if predicate elem then | |
20506 | SOME elem | |
20507 | else | |
20508 | recurse continue rhs) | |
20509 | lhs | |
20510 | in | |
20511 | recurse (fn () => NONE) | |
20512 | end | |
20513 | ---- | |
20514 | ||
20515 | Note that the above function returns as soon as the leftmost element | |
20516 | satisfying the predicate is found. | |
20517 | ||
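The same early-exit idea applies to other data structures. A minimal sketch, using a hypothetical `product` function that is not part of the original page: the continuation holds the multiplications still pending, and a zero element returns immediately by not invoking it.

```sml
(* Multiply the elements of a list in CPS.  On a zero element the
   pending continuation is dropped, so no multiplications are performed. *)
fun product (xs : int list) : int =
   let
      fun go ([], k) = k 1
        | go (0 :: _, _) = 0
        | go (x :: rest, k) = go (rest, fn p => k (x * p))
   in
      go (xs, fn p => p)
   end
```
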
20518 | <<< | |
20519 | ||
20520 | :mlton-guide-page: RSSA | |
20521 | [[RSSA]] | |
20522 | RSSA | |
20523 | ==== | |
20524 | ||
20525 | <:RSSA:> is an <:IntermediateLanguage:>, translated from <:SSA2:> by | |
20526 | <:ToRSSA:>, optimized by <:RSSASimplify:>, and translated by | |
20527 | <:ToMachine:> to <:Machine:>. | |
20528 | ||
20529 | == Description == | |
20530 | ||
20531 | <:RSSA:> is an <:IntermediateLanguage:> that makes representation | |
20532 | decisions explicit. | |
20533 | ||
20534 | == Implementation == | |
20535 | ||
20536 | * <!ViewGitFile(mlton,master,mlton/backend/rssa.sig)> | |
20537 | * <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)> | |
20538 | ||
20539 | == Type Checking == | |
20540 | ||
20541 | The new type language is aimed at expressing bit-level control over | |
20542 | layout and associated packing of data representations. There are | |
20543 | singleton types that denote constants, other atomic types for things | |
20544 | like integers and reals, and arbitrary sum types and sequence (tuple) | |
20545 | types. The big change to the type system is that type checking is now | |
20546 | based on subtyping, not type equality. So, for example, the singleton | |
20547 | type `0xFFFFEEBB` whose only inhabitant is the eponymous constant is a | |
20548 | subtype of the type `Word32`. | |
20549 | ||
20550 | == Details and Notes == | |
20551 | ||
20552 | SSA is an abbreviation for Static Single Assignment. The <:RSSA:> | |
20553 | <:IntermediateLanguage:> is a variant of SSA. | |
20554 | ||
20555 | <<< | |
20556 | ||
20557 | :mlton-guide-page: RSSAShrink | |
20558 | [[RSSAShrink]] | |
20559 | RSSAShrink | |
20560 | ========== | |
20561 | ||
20562 | <:RSSAShrink:> is an optimization pass for the <:RSSA:> | |
20563 | <:IntermediateLanguage:>. | |
20564 | ||
20565 | == Description == | |
20566 | ||
20567 | This pass implements a whole family of compile-time reductions, like: | |
20568 | ||
20569 | * constant folding, copy propagation | |
20570 | * inline the `Goto` to a block with a unique predecessor | |
20571 | ||
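To give a flavor of the first kind of reduction, here is a toy constant folder over a made-up expression datatype. This is only an illustration of the general technique; the `exp` type and `fold` function are assumptions for this sketch and have nothing to do with the actual <:RSSA:> data structures or the pass's implementation.

```sml
(* A toy expression language, invented for this example. *)
datatype exp = Const of int | Var of string | Add of exp * exp

(* Fold additions whose operands both reduce to constants;
   leave everything else unchanged. *)
fun fold (Add (a, b)) =
      (case (fold a, fold b) of
          (Const x, Const y) => Const (x + y)
        | (a', b') => Add (a', b'))
  | fold e = e
```
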
20572 | == Implementation == | |
20573 | ||
20574 | * <!ViewGitFile(mlton,master,mlton/backend/rssa.fun)> | |
20575 | ||
20576 | == Details and Notes == | |
20577 | ||
20578 | {empty} | |
20579 | ||
20580 | <<< | |
20581 | ||
20582 | :mlton-guide-page: RSSASimplify | |
20583 | [[RSSASimplify]] | |
20584 | RSSASimplify | |
20585 | ============ | |
20586 | ||
20587 | The optimization passes for the <:RSSA:> <:IntermediateLanguage:> are | |
20588 | collected and controlled by the `Backend` functor | |
20589 | (<!ViewGitFile(mlton,master,mlton/backend/backend.sig)>, | |
20590 | <!ViewGitFile(mlton,master,mlton/backend/backend.fun)>). | |
20591 | ||
20592 | The following optimization pass is implemented: | |
20593 | ||
20594 | * <:RSSAShrink:> | |
20595 | ||
20596 | The following implementation passes are implemented: | |
20597 | ||
20598 | * <:ImplementHandlers:> | |
20599 | * <:ImplementProfiling:> | |
20600 | * <:InsertLimitChecks:> | |
20601 | * <:InsertSignalChecks:> | |
20602 | ||
20603 | The optimization passes can be controlled from the command line with the options: | |
20604 | ||
20605 | * `-diag-pass <pass>` -- keep diagnostic info for pass | |
20606 | * `-drop-pass <pass>` -- omit optimization pass | |
20607 | * `-keep-pass <pass>` -- keep the results of pass | |
20608 | ||
20609 | <<< | |
20610 | ||
20611 | :mlton-guide-page: RunningOnAIX | |
20612 | [[RunningOnAIX]] | |
20613 | RunningOnAIX | |
20614 | ============ | |
20615 | ||
20616 | MLton runs fine on AIX. | |
20617 | ||
20618 | == Also see == | |
20619 | ||
20620 | * <:RunningOnPowerPC:> | |
20621 | * <:RunningOnPowerPC64:> | |
20622 | ||
20623 | <<< | |
20624 | ||
20625 | :mlton-guide-page: RunningOnAlpha | |
20626 | [[RunningOnAlpha]] | |
20627 | RunningOnAlpha | |
20628 | ============== | |
20629 | ||
20630 | MLton runs fine on the Alpha architecture. | |
20631 | ||
20632 | == Notes == | |
20633 | ||
20634 | * When compiling for Alpha, MLton doesn't support native code | |
20635 | generation (`-codegen native`). Hence, performance is not as good as | |
20636 | it might be and compile times are longer. Also, the quality of code | |
20637 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
20638 | You can change this by calling MLton with `-cc-opt -O2`. | |
20639 | ||
20640 | * When compiling for Alpha, MLton uses `-align 8` by default. | |
20641 | ||
20642 | <<< | |
20643 | ||
20644 | :mlton-guide-page: RunningOnAMD64 | |
20645 | [[RunningOnAMD64]] | |
20646 | RunningOnAMD64 | |
20647 | ============== | |
20648 | ||
20649 | MLton runs fine on the AMD64 (aka "x86-64" or "x64") architecture. | |
20650 | ||
20651 | == Notes == | |
20652 | ||
20653 | * When compiling for AMD64, MLton targets the 64-bit ABI. | |
20654 | ||
20655 | * On AMD64, MLton supports native code generation (`-codegen native` or `-codegen amd64`). | |
20656 | ||
20657 | * When compiling for AMD64, MLton uses `-align 8` by default. Using | |
20658 | `-align 4` may be incompatible with optimized builds of the <:GnuMP:> | |
20659 | library, which assume 8-byte alignment. (See the thread at | |
20660 | http://www.mlton.org/pipermail/mlton/2009-October/030674.html for more | |
20661 | details.) | |
20662 | ||
20663 | <<< | |
20664 | ||
20665 | :mlton-guide-page: RunningOnARM | |
20666 | [[RunningOnARM]] | |
20667 | RunningOnARM | |
20668 | ============ | |
20669 | ||
20670 | MLton runs fine on the ARM architecture. | |
20671 | ||
20672 | == Notes == | |
20673 | ||
20674 | * When compiling for ARM, MLton doesn't support native code generation | |
20675 | (`-codegen native`). Hence, performance is not as good as it might be | |
20676 | and compile times are longer. Also, the quality of code generated by | |
20677 | `gcc` is important. By default, MLton calls `gcc -O1`. You can | |
20678 | change this by calling MLton with `-cc-opt -O2`. | |
20679 | ||
20680 | <<< | |
20681 | ||
20682 | :mlton-guide-page: RunningOnCygwin | |
20683 | [[RunningOnCygwin]] | |
20684 | RunningOnCygwin | |
20685 | =============== | |
20686 | ||
20687 | MLton runs on the http://www.cygwin.com/[Cygwin] emulation layer, | |
20688 | which provides a POSIX-like environment while running on Windows. To | |
20689 | run MLton with Cygwin, you must first install Cygwin on your Windows | |
20690 | machine. To do this, visit the Cygwin site from your Windows machine | |
20691 | and run their `setup.exe` script. Then, you can unpack the MLton | |
20692 | binary `tgz` in your Cygwin environment. | |
20693 | ||
20694 | To run MLton cross-compiled executables on Windows, you must install | |
20695 | the Cygwin `dll` on the Windows machine. | |
20696 | ||
20697 | == Known issues == | |
20698 | ||
20699 | * Time profiling is disabled. | |
20700 | ||
20701 | * Cygwin's `mmap` emulation is less than perfect. Sometimes it | |
20702 | interacts badly with `Posix.Process.fork`. | |
20703 | ||
20704 | * The <!RawGitFile(mlton,master,regression/socket.sml)> regression | |
20705 | test fails. We suspect this is not a bug and is simply due to our | |
20706 | test relying on a certain behavior when connecting to a socket that | |
20707 | has not yet accepted, which is handled differently on Cygwin than | |
20708 | other platforms. Any help in understanding and resolving this issue | |
20709 | is appreciated. | |
20710 | ||
20711 | == Also see == | |
20712 | ||
20713 | * <:RunningOnMinGW:RunningOnMinGW> | |
20714 | ||
20715 | <<< | |
20716 | ||
20717 | :mlton-guide-page: RunningOnDarwin | |
20718 | [[RunningOnDarwin]] | |
20719 | RunningOnDarwin | |
20720 | =============== | |
20721 | ||
20722 | MLton runs fine on Darwin (and on Mac OS X). | |
20723 | ||
20724 | == Notes == | |
20725 | ||
20726 | * MLton requires the <:GnuMP:> library, which is available via | |
20727 | http://www.finkproject.org[Fink], http://www.macports.com[MacPorts], | |
20728 | or http://mxcl.github.io/homebrew/[Homebrew]. | |
20729 | ||
20730 | * For Intel-based Macs, MLton targets the <:RunningOnAMD64:AMD64 | |
20731 | architecture> on Darwin 10 (Mac OS X Snow Leopard) and higher and | |
20732 | targets the <:RunningOnX86:x86 architecture> on Darwin 8 (Mac OS X | |
20733 | Tiger) and Darwin 9 (Mac OS X Leopard). | |
20734 | ||
20735 | == Known issues == | |
20736 | ||
20737 | * Executables that save and load worlds on Darwin 11 (Mac OS X Lion) | |
20738 | and higher should be compiled with `-link-opt -fno-PIE`; see | |
20739 | <:MLtonWorld:> for more details. | |
20740 | ||
20741 | * <:ProfilingTime:> may give inaccurate results on multi-processor | |
20742 | machines. The `SIGPROF` signal, used to sample the profiled program, | |
20743 | is supposed to be delivered 100 times a second (i.e., at 10000us | |
20744 | intervals), but there can be delays of over 1 minute between the | |
20745 | delivery of consecutive `SIGPROF` signals. A more complete | |
20746 | description may be found | |
20747 | http://lists.apple.com/archives/Unix-porting/2007/Aug/msg00000.html[here] | |
20748 | and | |
20749 | http://lists.apple.com/archives/Darwin-dev/2007/Aug/msg00045.html[here]. | |
20750 | ||
20751 | == Also see == | |
20752 | ||
20753 | * <:RunningOnAMD64:> | |
20754 | * <:RunningOnPowerPC:> | |
20755 | * <:RunningOnX86:> | |
20756 | ||
20757 | <<< | |
20758 | ||
20759 | :mlton-guide-page: RunningOnFreeBSD | |
20760 | [[RunningOnFreeBSD]] | |
20761 | RunningOnFreeBSD | |
20762 | ================ | |
20763 | ||
20764 | MLton runs fine on http://www.freebsd.org/[FreeBSD]. | |
20765 | ||
20766 | == Notes == | |
20767 | ||
20768 | * MLton is available as a http://www.freebsd.org/[FreeBSD] | |
20769 | http://www.freebsd.org/cgi/ports.cgi?query=mlton&stype=all[port]. | |
20770 | ||
20771 | == Known issues == | |
20772 | ||
20773 | * Executables often run more slowly than on a comparable Linux | |
20774 | machine. We conjecture that part of this is due to the costs of heap | |
20775 | resizing and kernel zeroing of pages. Any help in solving the problem | |
20776 | would be appreciated. | |
20777 | ||
20778 | * FreeBSD defaults to a datasize limit of 512M, even if you have more | |
20779 | than that amount of memory in the computer. Hence, your MLton process | |
20780 | will be limited in the amount of memory it has. To fix this problem, | |
20781 | raise the maximum and default datasize available to a process by | |
20782 | editing `/boot/loader.conf`. For example, the setting | |
20783 | + | |
20784 | ---- | |
20785 | kern.maxdsiz="671088640" | |
20786 | kern.dfldsiz="671088640" | |
20787 | kern.maxssiz="134217728" | |
20788 | ---- | |
20789 | + | |
20790 | gives a process a maximum datasize of 640M, a default datasize of 640M, | |
20791 | and a maximum stack size of 128M. | |
20792 | ||
20793 | <<< | |
20794 | ||
20795 | :mlton-guide-page: RunningOnHPPA | |
20796 | [[RunningOnHPPA]] | |
20797 | RunningOnHPPA | |
20798 | ============= | |
20799 | ||
20800 | MLton runs fine on the HPPA architecture. | |
20801 | ||
20802 | == Notes == | |
20803 | ||
20804 | * When compiling for HPPA, MLton targets the 32-bit HPPA architecture. | |
20805 | ||
20806 | * When compiling for HPPA, MLton doesn't support native code | |
20807 | generation (`-codegen native`). Hence, performance is not as good as | |
20808 | it might be and compile times are longer. Also, the quality of code | |
20809 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
20810 | You can change this by calling MLton with `-cc-opt -O2`. | |
20811 | ||
20812 | * When compiling for HPPA, MLton uses `-align 8` by default. While | |
20813 | this speeds up reals, it also may increase object sizes. If your | |
20814 | program does not make significant use of reals, you might see a | |
20815 | speedup with `-align 4`. | |
20816 | ||
20817 | <<< | |
20818 | ||
20819 | :mlton-guide-page: RunningOnHPUX | |
20820 | [[RunningOnHPUX]] | |
20821 | RunningOnHPUX | |
20822 | ============= | |
20823 | ||
20824 | MLton runs fine on HPUX. | |
20825 | ||
20826 | == Also see == | |
20827 | ||
20828 | * <:RunningOnHPPA:> | |
20829 | ||
20830 | <<< | |
20831 | ||
20832 | :mlton-guide-page: RunningOnIA64 | |
20833 | [[RunningOnIA64]] | |
20834 | RunningOnIA64 | |
20835 | ============= | |
20836 | ||
20837 | MLton runs fine on the IA64 architecture. | |
20838 | ||
20839 | == Notes == | |
20840 | ||
20841 | * When compiling for IA64, MLton targets the 64-bit ABI. | |
20842 | ||
20843 | * When compiling for IA64, MLton doesn't support native code | |
20844 | generation (`-codegen native`). Hence, performance is not as good as | |
20845 | it might be and compile times are longer. Also, the quality of code | |
20846 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
20847 | You can change this by calling MLton with `-cc-opt -O2`. | |
20848 | ||
20849 | * When compiling for IA64, MLton uses `-align 8` by default. | |
20850 | ||
20851 | * On the IA64, the <:GnuMP:> library supports multiple ABIs. See the | |
20852 | <:GnuMP:> page for more details. | |
20853 | ||
20854 | <<< | |
20855 | ||
20856 | :mlton-guide-page: RunningOnLinux | |
20857 | [[RunningOnLinux]] | |
20858 | RunningOnLinux | |
20859 | ============== | |
20860 | ||
20861 | MLton runs fine on Linux. | |
20862 | ||
20863 | <<< | |
20864 | ||
20865 | :mlton-guide-page: RunningOnMinGW | |
20866 | [[RunningOnMinGW]] | |
20867 | RunningOnMinGW | |
20868 | ============== | |
20869 | ||
20870 | MLton runs on http://mingw.org[MinGW], a library for porting Unix | |
20871 | applications to Windows. Some library functionality is missing or | |
20872 | changed. | |
20873 | ||
20874 | == Notes == | |
20875 | ||
20876 | * To compile MLton on MinGW: | |
20877 | ** The <:GnuMP:> library is required. | |
20878 | ** The Bash shell is required. If you are using a prebuilt MSYS, you | |
20879 | probably want to symlink `bash` to `sh`. | |
20880 | ||
20881 | == Known issues == | |
20882 | ||
20883 | * Many functions are unimplemented and will `raise SysErr`: | |
20884 | ** `MLton.Itimer.set` | |
20885 | ** `MLton.ProcEnv.setgroups` | |
20886 | ** `MLton.Process.kill` | |
20887 | ** `MLton.Process.reap` | |
20888 | ** `MLton.World.load` | |
20889 | ** `OS.FileSys.readLink` | |
20890 | ** `OS.IO.poll` | |
20891 | ** `OS.Process.terminate` | |
20892 | ** `Posix.FileSys.chown` | |
20893 | ** `Posix.FileSys.fchown` | |
20894 | ** `Posix.FileSys.fpathconf` | |
20895 | ** `Posix.FileSys.link` | |
20896 | ** `Posix.FileSys.mkfifo` | |
20897 | ** `Posix.FileSys.pathconf` | |
20898 | ** `Posix.FileSys.readlink` | |
20899 | ** `Posix.FileSys.symlink` | |
20900 | ** `Posix.IO.dupfd` | |
20901 | ** `Posix.IO.getfd` | |
20902 | ** `Posix.IO.getfl` | |
20903 | ** `Posix.IO.getlk` | |
20904 | ** `Posix.IO.setfd` | |
20905 | ** `Posix.IO.setfl` | |
20906 | ** `Posix.IO.setlkw` | |
20907 | ** `Posix.IO.setlk` | |
20908 | ** `Posix.ProcEnv.ctermid` | |
20909 | ** `Posix.ProcEnv.getegid` | |
20910 | ** `Posix.ProcEnv.geteuid` | |
20911 | ** `Posix.ProcEnv.getgid` | |
20912 | ** `Posix.ProcEnv.getgroups` | |
20913 | ** `Posix.ProcEnv.getlogin` | |
20914 | ** `Posix.ProcEnv.getpgrp` | |
20915 | ** `Posix.ProcEnv.getpid` | |
20916 | ** `Posix.ProcEnv.getppid` | |
20917 | ** `Posix.ProcEnv.getuid` | |
20918 | ** `Posix.ProcEnv.setgid` | |
20919 | ** `Posix.ProcEnv.setpgid` | |
20920 | ** `Posix.ProcEnv.setsid` | |
20921 | ** `Posix.ProcEnv.setuid` | |
20922 | ** `Posix.ProcEnv.sysconf` | |
20923 | ** `Posix.ProcEnv.times` | |
20924 | ** `Posix.ProcEnv.ttyname` | |
20925 | ** `Posix.Process.exece` | |
20926 | ** `Posix.Process.execp` | |
20927 | ** `Posix.Process.exit` | |
20928 | ** `Posix.Process.fork` | |
20929 | ** `Posix.Process.kill` | |
20930 | ** `Posix.Process.pause` | |
20931 | ** `Posix.Process.waitpid_nh` | |
20932 | ** `Posix.Process.waitpid` | |
20933 | ** `Posix.SysDB.getgrgid` | |
20934 | ** `Posix.SysDB.getgrnam` | |
20935 | ** `Posix.SysDB.getpwuid` | |
20936 | ** `Posix.TTY.TC.drain` | |
20937 | ** `Posix.TTY.TC.flow` | |
20938 | ** `Posix.TTY.TC.flush` | |
20939 | ** `Posix.TTY.TC.getattr` | |
20940 | ** `Posix.TTY.TC.getpgrp` | |
20941 | ** `Posix.TTY.TC.sendbreak` | |
20942 | ** `Posix.TTY.TC.setattr` | |
20943 | ** `Posix.TTY.TC.setpgrp` | |
20944 | ** `Unix.kill` | |
20945 | ** `Unix.reap` | |
20946 | ** `UnixSock.fromAddr` | |
20947 | ** `UnixSock.toAddr` | |
20948 | ||
20949 | <<< | |
20950 | ||
20951 | :mlton-guide-page: RunningOnNetBSD | |
20952 | [[RunningOnNetBSD]] | |
20953 | RunningOnNetBSD | |
20954 | =============== | |
20955 | ||
20956 | MLton runs fine on http://www.netbsd.org/[NetBSD]. | |
20957 | ||
20958 | == Installing the correct packages for NetBSD == | |
20959 | ||
20960 | NetBSD installs third-party packages through a mechanism known as | |
20961 | pkgsrc, a tree of Makefiles that, when invoked, downloads the source | |
20962 | code, builds a package, and installs it on the system. To run MLton | |
20963 | on NetBSD, you will have to install the following packages from | |
20964 | pkgsrc: | |
20965 | ||
20966 | * `shells/bash` | |
20967 | ||
20968 | * `devel/gmp` | |
20969 | ||
20970 | * `devel/gmake` | |
20971 | ||
20972 | In order to get graphical call-graphs of profiling information, you | |
20973 | will need the additional package | |
20974 | ||
20975 | * `graphics/graphviz` | |
20976 | ||
20977 | To build the documentation for MLton, you will need the additional | |
20978 | package | |
20979 | ||
20980 | * `textproc/asciidoc` | |
20981 | ||
20982 | == Tips for compiling and using MLton on NetBSD == | |
20983 | ||
20984 | MLton can be a memory hog on computers with little memory. While | |
20985 | 640MB of RAM ought to be enough to self-compile MLton, one might want | |
20986 | to tune the NetBSD VM subsystem in order to succeed. The notes | |
20987 | presented here are what <:JesperLouisAndersen:> uses for | |
20988 | compiling MLton on his laptop. | |
20989 | ||
20990 | === The NetBSD VM subsystem === | |
20991 | ||
20992 | NetBSD uses a VM subsystem named | |
20993 | http://www.ccrc.wustl.edu/pub/chuck/tech/uvm/[UVM]. | |
20994 | http://www.selonen.org/arto/netbsd/vm_tune.html[Tuning the VM system] | |
20995 | can be done via the `sysctl(8)`-interface with the "VM" MIB set. | |
20996 | ||
20997 | === Tuning the NetBSD VM subsystem for MLton === | |
20998 | ||
20999 | MLton uses a lot of anonymous pages when it is running. Thus, we will | |
21000 | need to tune up the default of 80 for anonymous pages. Setting | |
21001 | ||
21002 | ---- | |
21003 | sysctl -w vm.anonmax=95 | |
21004 | sysctl -w vm.anonmin=50 | |
21005 | sysctl -w vm.filemin=2 | |
21006 | sysctl -w vm.execmin=2 | |
21007 | sysctl -w vm.filemax=4 | |
21008 | sysctl -w vm.execmax=4 | |
21009 | ---- | |
21010 | ||
21011 | makes it less likely for the VM system to swap out anonymous pages. | |
21012 | For a full explanation of the above flags, see the documentation. | |
21013 | ||
21014 | The result is that my laptop goes from a MLton compile where it swaps | |
21015 | a lot to a MLton compile with no swapping. | |
21016 | ||
21017 | <<< | |
21018 | ||
21019 | :mlton-guide-page: RunningOnOpenBSD | |
21020 | [[RunningOnOpenBSD]] | |
21021 | RunningOnOpenBSD | |
21022 | ================ | |
21023 | ||
21024 | MLton runs fine on http://www.openbsd.org/[OpenBSD]. | |
21025 | ||
21026 | == Known issues == | |
21027 | ||
21028 | * The <!RawGitFile(mlton,master,regression/socket.sml)> regression | |
21029 | test fails. We suspect this is not a bug and is simply due to our | |
21030 | test relying on a certain behavior when connecting to a socket that | |
21031 | has not yet accepted, which is handled differently on OpenBSD than | |
21032 | other platforms. Any help in understanding and resolving this issue | |
21033 | is appreciated. | |
21034 | ||
21035 | <<< | |
21036 | ||
21037 | :mlton-guide-page: RunningOnPowerPC | |
21038 | [[RunningOnPowerPC]] | |
21039 | RunningOnPowerPC | |
21040 | ================ | |
21041 | ||
21042 | MLton runs fine on the PowerPC architecture. | |
21043 | ||
21044 | == Notes == | |
21045 | ||
21046 | * When compiling for PowerPC, MLton targets the 32-bit PowerPC | |
21047 | architecture. | |
21048 | ||
21049 | * When compiling for PowerPC, MLton doesn't support native code | |
21050 | generation (`-codegen native`). Hence, performance is not as good as | |
21051 | it might be and compile times are longer. Also, the quality of code | |
21052 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
21053 | You can change this by calling MLton with `-cc-opt -O2`. | |
21054 | ||
21055 | * On the PowerPC, the <:GnuMP:> library supports multiple ABIs. See | |
21056 | the <:GnuMP:> page for more details. | |
21057 | ||
21058 | <<< | |
21059 | ||
21060 | :mlton-guide-page: RunningOnPowerPC64 | |
21061 | [[RunningOnPowerPC64]] | |
21062 | RunningOnPowerPC64 | |
21063 | ================== | |
21064 | ||
21065 | MLton runs fine on the PowerPC64 architecture. | |
21066 | ||
21067 | == Notes == | |
21068 | ||
21069 | * When compiling for PowerPC64, MLton targets the 64-bit PowerPC | |
21070 | architecture. | |
21071 | ||
21072 | * When compiling for PowerPC64, MLton doesn't support native code | |
21073 | generation (`-codegen native`). Hence, performance is not as good as | |
21074 | it might be and compile times are longer. Also, the quality of code | |
21075 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
21076 | You can change this by calling MLton with `-cc-opt -O2`. | |
21077 | ||
21078 | * On the PowerPC64, the <:GnuMP:> library supports multiple ABIs. See | |
21079 | the <:GnuMP:> page for more details. | |
21080 | ||
21081 | <<< | |
21082 | ||
21083 | :mlton-guide-page: RunningOnS390 | |
21084 | [[RunningOnS390]] | |
21085 | RunningOnS390 | |
21086 | ============= | |
21087 | ||
21088 | MLton runs fine on the S390 architecture. | |
21089 | ||
21090 | == Notes == | |
21091 | ||
21092 | * When compiling for S390, MLton doesn't support native code | |
21093 | generation (`-codegen native`). Hence, performance is not as good as | |
21094 | it might be and compile times are longer. Also, the quality of code | |
21095 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
21096 | You can change this by calling MLton with `-cc-opt -O2`. | |
21097 | ||
21098 | <<< | |
21099 | ||
21100 | :mlton-guide-page: RunningOnSolaris | |
21101 | [[RunningOnSolaris]] | |
21102 | RunningOnSolaris | |
21103 | ================ | |
21104 | ||
21105 | MLton runs fine on Solaris. | |
21106 | ||
21107 | == Notes == | |
21108 | ||
21109 | * You must install the `binutils`, `gcc`, and `make` packages. You | |
21110 | can find out how to get these at | |
21111 | http://www.sunfreeware.com[sunfreeware.com]. | |
21112 | ||
21113 | * Making the documentation requires that you install `latex` and | |
21114 | `dvips`, which are available in the `tetex` package. | |
21115 | ||
21116 | == Known issues == | |
21117 | ||
21118 | * Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow | |
21119 | as to be impractical (many hours on a 500MHz UltraSparc). For this | |
21120 | reason, we strongly recommend building with a | |
21121 | <:CrossCompiling:cross compiler>. | |
21122 | ||
21123 | == Also see == | |
21124 | ||
21125 | * <:RunningOnAMD64:> | |
21126 | * <:RunningOnSparc:> | |
21127 | * <:RunningOnX86:> | |
21128 | ||
21129 | <<< | |
21130 | ||
21131 | :mlton-guide-page: RunningOnSparc | |
21132 | [[RunningOnSparc]] | |
21133 | RunningOnSparc | |
21134 | ============== | |
21135 | ||
21136 | MLton runs fine on the Sparc architecture. | |
21137 | ||
21138 | == Notes == | |
21139 | ||
21140 | * When compiling for Sparc, MLton targets the 32-bit Sparc | |
21141 | architecture (i.e., Sparc V8). | |
21142 | ||
21143 | * When compiling for Sparc, MLton doesn't support native code | |
21144 | generation (`-codegen native`). Hence, performance is not as good as | |
21145 | it might be and compile times are longer. Also, the quality of code | |
21146 | generated by `gcc` is important. By default, MLton calls `gcc -O1`. | |
21147 | You can change this by calling MLton with `-cc-opt -O2`. We have seen | |
21148 | this speed up some programs by as much as 30%, especially those | |
21149 | involving floating point; however, it can also more than double | |
21150 | compile times. | |
21151 | ||
21152 | * When compiling for Sparc, MLton uses `-align 8` by default. While | |
21153 | this speeds up reals, it also may increase object sizes. If your | |
21154 | program does not make significant use of reals, you might see a | |
21155 | speedup with `-align 4`. | |
21156 | ||
21157 | == Known issues == | |
21158 | ||
21159 | * Bootstrapping on the <:RunningOnSparc:Sparc architecture> is so slow | |
21160 | as to be impractical (many hours on a 500MHz UltraSparc). For this | |
21161 | reason, we strongly recommend building with a | |
21162 | <:CrossCompiling:cross compiler>. | |
21163 | ||
21164 | == Also see == | |
21165 | ||
21166 | * <:RunningOnSolaris:> | |
21167 | ||
21168 | <<< | |
21169 | ||
21170 | :mlton-guide-page: RunningOnX86 | |
21171 | [[RunningOnX86]] | |
21172 | RunningOnX86 | |
21173 | ============ | |
21174 | ||
21175 | MLton runs fine on the x86 architecture. | |
21176 | ||
21177 | == Notes == | |
21178 | ||
21179 | * On x86, MLton supports native code generation (`-codegen native` or | |
21180 | `-codegen x86`). | |
21181 | ||
21182 | <<< | |
21183 | ||
21184 | :mlton-guide-page: RunTimeOptions | |
21185 | [[RunTimeOptions]] | |
21186 | RunTimeOptions | |
21187 | ============== | |
21188 | ||
21189 | Executables produced by MLton take command line arguments that control | |
21190 | the runtime system. These arguments are optional, and occur before | |
21191 | the executable's usual arguments. To use these options, the first | |
21192 | argument to the executable must be `@MLton`. The optional arguments | |
21193 | then follow, must be terminated by `--`, and are followed by any | |
21194 | arguments to the program. The optional arguments are _not_ made | |
21195 | available to the SML program via `CommandLine.arguments`. For | |
21196 | example, a valid call to `hello-world` is: | |
21197 | ||
21198 | ---- | |
21199 | hello-world @MLton gc-summary fixed-heap 10k -- a b c | |
21200 | ---- | |
21201 | ||
21202 | In the above example, | |
21203 | `CommandLine.arguments () = ["a", "b", "c"]`. | |
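
A minimal sketch of a program that makes this visible, assuming it is
compiled to an executable named `show-args` (a hypothetical name used
only for illustration):

[source,sml]
----
(* Print each argument that the SML program actually receives.  The
 * `@MLton ... --` groups are consumed by the runtime and should not
 * appear in this list. *)
val args : string list = CommandLine.arguments ()

val () =
   List.app (fn a => print (concat ["arg: ", a, "\n"])) args
----

Running `show-args @MLton gc-summary -- a b c` would be expected to
print one line each for `a`, `b`, and `c`, with the garbage collection
summary going to standard error.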
21204 | ||
21205 | A sequence of `@MLton` argument groups is also allowed, as in: | |
21206 | ||
21207 | ---- | |
21208 | hello-world @MLton gc-summary -- @MLton fixed-heap 10k -- a b c | |
21209 | ---- | |
21210 | ||
21211 | Run-time options can also control MLton itself, as in: | |
21212 | ||
21213 | ---- | |
21214 | mlton @MLton fixed-heap 0.5g -- foo.sml | |
21215 | ---- | |
21216 | ||
21217 | ||
21218 | == Options == | |
21219 | ||
21220 | * ++fixed-heap __x__{k|K|m|M|g|G}++ | |
21221 | + | |
21222 | Use a fixed size heap of size _x_, where _x_ is a real number and the | |
21223 | trailing letter indicates its units. | |
21224 | + | |
21225 | [cols="^25%,<75%"] | |
21226 | |==== | |
21227 | | `k` or `K` | 1024 | |
21228 | | `m` or `M` | 1,048,576 | |
21229 | | `g` or `G` | 1,073,741,824 | |
21230 | |==== | |
21231 | + | |
21232 | A value of `0` means to use almost all the RAM present on the machine. | |
21233 | + | |
21234 | The heap size used by `fixed-heap` includes all memory allocated by | |
21235 | SML code, including memory for the stack (or stacks, if there are | |
21236 | multiple threads). It does not, however, include any memory used for | |
21237 | code itself or memory used by C globals, the C stack, or malloc. | |
21238 | ||
21239 | * ++gc-messages++ | |
21240 | + | |
21241 | Print a message at the start and end of every garbage collection. | |
21242 | ||
21243 | * ++gc-summary++ | |
21244 | + | |
21245 | Print a summary of garbage collection statistics upon program | |
21246 | termination to standard error. | |
21247 | ||
21248 | * ++gc-summary-file __file__++ | |
21249 | + | |
21250 | Print a summary of garbage collection statistics upon program | |
21251 | termination to the file specified by _file_. | |
21252 | ||
21253 | * ++load-world __world__++ | |
21254 | + | |
21255 | Restart the computation with the file specified by _world_, which must | |
21256 | have been created by a call to `MLton.World.save` by the same | |
21257 | executable. See <:MLtonWorld:>. | |
21258 | ||
21259 | * ++max-heap __x__{k|K|m|M|g|G}++ | |
21260 | + | |
21261 | Run the computation with an automatically resized heap that is never | |
21262 | larger than _x_, where _x_ is a real number and the trailing letter | |
21263 | indicates the units as with `fixed-heap`. The heap size for | |
21264 | `max-heap` is accounted for as with `fixed-heap`. | |
21265 | ||
21266 | * ++may-page-heap {false|true}++ | |
21267 | + | |
21268 | Enable paging the heap to disk when unable to grow the heap to a | |
21269 | desired size. | |
21270 | ||
21271 | * ++no-load-world++ | |
21272 | + | |
21273 | Disable `load-world`. This can be used as an argument to the compiler | |
21274 | via `-runtime no-load-world` to create executables that will not load | |
21275 | a world. This may be useful to ensure that set-uid executables do not | |
21276 | load some strange world. | |
21277 | ||
21278 | * ++ram-slop __x__++ | |
21279 | + | |
21280 | Multiply _x_ by the amount of RAM on the machine to obtain what the | |
21281 | runtime views as the amount of RAM it can use. Typically _x_ is less | |
21282 | than 1, and is used to account for space used by other programs | |
21283 | running on the same machine. | |
21284 | ||
21285 | * ++stop++ | |
21286 | + | |
21287 | Causes the runtime to stop processing `@MLton` arguments once the next | |
21288 | `--` is reached. This can be used as an argument to the compiler via | |
21289 | `-runtime stop` to create executables that don't process any `@MLton` | |
21290 | arguments. | |
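
The size arguments of `fixed-heap` and `max-heap` can be read as a real
number scaled by the unit given in the table above. The following is
only an illustrative sketch in SML: `sizeToBytes` is a hypothetical
helper, not part of MLton, and truncating fractional bytes is an
assumption rather than documented runtime behavior. The special value
`0` is not modeled.

[source,sml]
----
(* Hypothetical helper, not part of MLton: interpret a size such as
 * "10k" or "0.5g" as a number of bytes, following the unit table. *)
fun sizeToBytes (s: string): LargeInt.int option =
   let
      val n = String.size s
   in
      if n < 2 then NONE
      else
         let
            (* The trailing letter selects the multiplier. *)
            val mult =
               case Char.toLower (String.sub (s, n - 1)) of
                  #"k" => SOME 1024.0
                | #"m" => SOME 1048576.0
                | #"g" => SOME 1073741824.0
                | _ => NONE
            (* Everything before the unit is read as a real number. *)
            val num = Real.fromString (String.substring (s, 0, n - 1))
         in
            case (mult, num) of
               (SOME m, SOME x) =>
                  (* Assumption: fractional bytes are truncated. *)
                  SOME (Real.toLargeInt IEEEReal.TO_ZERO (x * m))
             | _ => NONE
         end
   end
----

Under these assumptions, `sizeToBytes "10k"` evaluates to `SOME 10240`
and `sizeToBytes "0.5g"` to `SOME 536870912`.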
21291 | ||
21292 | <<< | |
21293 | ||
21294 | :mlton-guide-page: ScopeInference | |
21295 | [[ScopeInference]] | |
21296 | ScopeInference | |
21297 | ============== | |
21298 | ||
21299 | Scope inference is an analysis/rewrite pass for the <:AST:> | |
21300 | <:IntermediateLanguage:>, invoked from <:Elaborate:>. | |
21301 | ||
21302 | == Description == | |
21303 | ||
21304 | This pass adds free type variables to the `val` or `fun` | |
21305 | declaration where they are implicitly scoped. | |
21306 | ||
21307 | == Implementation == | |
21308 | ||
21309 | <!ViewGitFile(mlton,master,mlton/elaborate/scope.sig)> | |
21310 | <!ViewGitFile(mlton,master,mlton/elaborate/scope.fun)> | |
21311 | ||
21312 | == Details and Notes == | |
21313 | ||
21314 | Scope inference determines, for each type variable, the declaration | |
21315 | where it is bound. Scope inference is a direct implementation of the | |
21316 | specification given in section 4.6 of the | |
21317 | <:DefinitionOfStandardML: Definition>. Recall that a free occurrence | |
21318 | of a type variable `'a` in a declaration `d` is _unguarded_ | |
21319 | in `d` if `'a` is not part of a smaller declaration. A type | |
21320 | variable `'a` is implicitly scoped at `d` if `'a` is | |
21321 | unguarded in `d` and `'a` does not occur unguarded in any | |
21322 | declaration containing `d`. | |
21323 | ||
21324 | The first pass of scope inference walks down the tree and renames all | |
21325 | explicitly bound type variables in order to avoid name collisions. It | |
21326 | then walks up the tree and adds to each declaration the set of | |
21327 | unguarded type variables occurring in that declaration. At this | |
21328 | point, if declaration `d` contains an unguarded type variable | |
21329 | `'a` and the immediately containing declaration does not contain | |
21330 | `'a`, then `'a` is implicitly scoped at `d`. The final | |
21331 | pass walks down the tree leaving `'a` at the declaration where | |
21332 | it is scoped and removing it from all enclosed declarations. | |
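
As an illustration of these rules (a small sketch written for this
page, not taken from the MLton sources), consider where `'a` ends up
being scoped in the following two declarations:

[source,sml]
----
(* 'a occurs only inside the inner `fun id`, so it is guarded in the
 * outer `val` and implicitly scoped at `id`.  Hence `id` is
 * polymorphic and may be used at two different types. *)
val h = let fun id (z: 'a) = z in (id 1, id "s") end

(* Here 'a also occurs unguarded in the outer `val` (in the constraint
 * on `y`), so it is implicitly scoped at the outer declaration.
 * Within the body, `id` is monomorphic in 'a. *)
val g = fn (y: 'a) => let fun id (z: 'a) = z in id y end
----

In the first declaration the type variable would be attached to the
inner `fun` (as if written `fun 'a id (z: 'a) = z`), while in the
second it would be attached to the outer `val` (as if written
`val 'a g = ...`) and removed from the enclosed `fun`.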
21333 | ||
21334 | <<< | |
21335 | ||
21336 | :mlton-guide-page: SelfCompiling | |
21337 | [[SelfCompiling]] | |
21338 | SelfCompiling | |
21339 | ============= | |
21340 | ||
21341 | If you want to compile MLton, you must first get the <:Sources:>. You | |
21342 | can compile with either MLton or SML/NJ, but we strongly recommend | |
21343 | using MLton, since it generates a much faster and more robust | |
21344 | executable. | |
21345 | ||
21346 | == Compiling with MLton == | |
21347 | ||
21348 | To compile with MLton, you need the binary versions of `mlton`, | |
21349 | `mllex`, and `mlyacc` that come with the MLton binary package. To be | |
21350 | safe, you should use the same version of MLton that you are building. | |
21351 | However, older versions may work, as long as they don't go back too | |
21352 | far. To build MLton, run `make` from within the root directory of the | |
21353 | sources. This will build MLton first with the already installed | |
21354 | binary version of MLton and will then rebuild MLton with itself. | |
21355 | ||
21356 | First, the `Makefile` calls `mllex` and `mlyacc` to build the lexer | |
21357 | and parser, and then calls `mlton` to compile itself. When making | |
21358 | MLton using another version the `Makefile` automatically uses | |
21359 | `mlton-stubs.mlb`, which will put in enough stubs to emulate the | |
21360 | `structure MLton`. Once MLton is built, the `Makefile` will rebuild | |
21361 | MLton with itself, this time using `mlton.mlb` and the real | |
21362 | `structure MLton` from the <:BasisLibrary:Basis Library>. This second round | |
21363 | of compilation is essential in order to achieve a fast and robust | |
21364 | MLton. | |
21365 | ||
21366 | Compiling MLton requires at least 1GB of RAM for 32-bit platforms (2GB is | |
21367 | preferable) and at least 2GB RAM for 64-bit platforms (4GB is preferable). | |
21368 | If your machine has less RAM, self-compilation will | |
21369 | likely fail, or at least take a very long time due to paging. Even if | |
21370 | you have enough memory, there simply may not be enough available, due | |
21371 | to memory consumed by other processes. In this case, you may see an | |
21372 | `Out of memory` message, or self-compilation may become extremely | |
21373 | slow. The only fix is to make sure that enough memory is available. | |
21374 | ||
21375 | === Possible Errors === | |
21376 | ||
21377 | * The C compiler may not be able to find the <:GnuMP:> header file, | |
21378 | `gmp.h`, leading to an error like the following: | |
21379 | + | |
21380 | ---- | |
21381 | cenv.h:49:18: fatal error: gmp.h: No such file or directory | |
21382 | ---- | |
21383 | + | |
21384 | The solution is to install (or build) GnuMP on your machine. If you | |
21385 | install it at a location not on the default search path, then run | |
21386 | ++make WITH_GMP_INC_DIR=__/path/to/gmp/include__ WITH_GMP_LIB_DIR=__/path/to/gmp/lib__++. | |
21387 | ||
21388 | * The following errors indicate that a binary version of MLton could | |
21389 | not be found in your path. | |
21390 | + | |
21391 | ---- | |
21392 | /bin/sh: mlton: command not found | |
21393 | ---- | |
21394 | + | |
21395 | ---- | |
21396 | make[2]: mlton: Command not found | |
21397 | ---- | |
21398 | + | |
21399 | You need to have `mlton` in your path to build MLton from source. | |
21400 | + | |
21401 | During the build process, there are various times that the `Makefile`-s | |
21402 | look for a `mlton` in your path and in `src/build/bin`. It is OK if | |
21403 | the latter doesn't exist when the build starts; it is the target being | |
21404 | built. Failure to find a `mlton` in your path will abort the build. | |
21405 | ||
21406 | ||
21407 | == Compiling with SML/NJ == | |
21408 | ||
21409 | To compile with SML/NJ, run `make bootstrap-smlnj` from within the | |
21410 | root directory of the sources. You must use a recent version of | |
21411 | SML/NJ. First, the `Makefile` calls `ml-lex` and `ml-yacc` to build | |
21412 | the lexer and parser. Then, it calls SML/NJ with the appropriate | |
21413 | `sources.cm` file. Once MLton is built with SML/NJ, the `Makefile` | |
21414 | will rebuild MLton with this SML/NJ-built MLton and then will rebuild | |
21415 | MLton with the MLton-built MLton. Building with SML/NJ takes | |
21416 | significant time (particularly during the "`parseAndElaborate`" phase | |
21417 | when the SML/NJ-built MLton is compiling MLton). Unless you are doing | |
21418 | compiler development and need rapid recompilation, we recommend | |
21419 | compiling with MLton. | |
21420 | ||
21421 | <<< | |
21422 | ||
21423 | :mlton-guide-page: Serialization | |
21424 | [[Serialization]] | |
21425 | Serialization | |
21426 | ============= | |
21427 | ||
21428 | <:StandardML:Standard ML> does not have built-in support for | |
21429 | serialization. Here are papers that describe user-level approaches: | |
21430 | ||
21431 | * <!Cite(Elsman04)> | |
21432 | * <!Cite(Kennedy04)> | |
21433 | ||
21434 | The MLton repository also contains an experimental generic programming | |
21435 | library (see | |
21436 | <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that | |
21437 | includes a pickling (serialization) generic (see | |
21438 | <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pickle.sig)>). | |
21439 | ||
21440 | <<< | |
21441 | ||
21442 | :mlton-guide-page: ShareZeroVec | |
21443 | [[ShareZeroVec]] | |
21444 | ShareZeroVec | |
21445 | ============ | |
21446 | ||
21447 | <:ShareZeroVec:> is an optimization pass for the <:SSA:> | |
21448 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
21449 | ||
21450 | == Description == | |
21451 | ||
21452 | An SSA optimization to share zero-length vectors. | |
21453 | ||
21454 | From <!ViewGitCommit(mlton,be8c5f576)>, which replaced the use of the | |
21455 | `Array_array0Const` primitive in the Basis Library implementation with a | |
21456 | (nullary) `Vector_vector` primitive: | |
21457 | ||
21458 | ________ | |
21459 | ||
21460 | The original motivation for the `Array_array0Const` primitive was to share the | |
21461 | heap space required for zero-length vectors among all vectors (of a given type). | |
21462 | It was claimed that this optimization is important, e.g., in a self-compile, | |
21463 | where vectors are used for lots of syntax tree elements and many of those | |
21464 | vectors are empty. See: | |
21465 | http://www.mlton.org/pipermail/mlton-devel/2002-February/021523.html | |
21466 | ||
21467 | Curiously, the full effect of this optimization has been missing for quite some | |
21468 | time (perhaps since the port of <:ConstantPropagation:> to the SSA IL). While | |
21469 | <:ConstantPropagation:> has "globalized" the nullary application of the | |
21470 | `Array_array0Const` primitive, it also simultaneously transformed it to an | |
21471 | application of the `Array_uninit` (previously, the `Array_array`) primitive to | |
21472 | the zero constant. The hash-consing of globals, meant to create exactly one | |
21473 | global for each distinct constant, treats `Array_uninit` primitives as unequal | |
21474 | (appropriately, since `Array_uninit` allocates an array with identity (though | |
21475 | the identity may be supressed by a subsequent `Array_toVector`)), hence each | |
21476 | distinct `Array_array0Const` primitive in the program remained as distinct | |
21477 | globals. The limited amount of inlining prior to <:ConstantPropagation:> meant | |
21478 | that there were typically fewer than a dozen "copies" of the same empty vector | |
21479 | in a program for a given type. | |
21480 | ||
21481 | As a "functional" primitive, a nullary `Vector_vector` is globalized by | |
21482 | ClosureConvert, but is further recognized by ConstantPropagation and hash-consed | |
21483 | into a unique instance for each type. | |
21484 | ________ | |
21485 | ||
21486 | However, a single, shared, global `Vector_vector ()` inhibits the | |
21487 | coercion-based optimizations of `Useless`. For example, consider the | |
21488 | following program: | |
21489 | ||
21490 | [source,sml] | |
21491 | ---- | |
21492 | val n = valOf (Int.fromString (hd (CommandLine.arguments ()))) | |
21493 | ||
21494 | val v1 = Vector.tabulate (n, fn i => | |
21495 | let val w = Word16.fromInt i | |
21496 | in (w - 0wx1, w, w + 0wx1 + w) | |
21497 | end) | |
21498 | val v2 = Vector.map (fn (w1, w2, w3) => (w1, 0wx2 * w2, 0wx3 * w3)) v1 | |
21499 | val v3 = VectorSlice.vector (VectorSlice.slice (v1, 1, SOME (n - 2))) | |
21500 | val ans1 = Vector.foldl (fn ((w1,w2,w3),w) => w + w1 + w2 + w3) 0wx0 v1 | |
21501 | val ans2 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v2 | |
21502 | val ans3 = Vector.foldl (fn ((_,w2,_),w) => w + w2) 0wx0 v3 | |
21503 | ||
21504 | val _ = print (concat ["ans1 = ", Word16.toString ans1, " ", | |
21505 | "ans2 = ", Word16.toString ans2, " ", | |
21506 | "ans3 = ", Word16.toString ans3, "\n"]) | |
21507 | ---- | |
21508 | ||
21509 | We would like `v2` and `v3` to be optimized from | |
21510 | `(word16 * word16 * word16) vector` to `word16 vector` because only | |
21511 | the 2nd component of the elements is needed to compute the answer. | |
21512 | ||
21513 | With `Array_array0Const`, each distinct occurrence of | |
21514 | `Array_array0Const((word16 * word16 * word16))` arising from | |
21515 | polyvariance and inlining remained a distinct | |
21516 | `Array_uninit((word16 * word16 * word16)) (0x0)` global, which | |
21517 | resulted in distinct occurrences for the | |
21518 | `val v1 = Vector.tabulate ...` and for the | |
21519 | `val v2 = Vector.map ...`. The latter could be optimized to | |
21520 | `Array_uninit(word16) (0x0)` by `Useless`, because its result only | |
21521 | flows to places requiring the 2nd component of the elements. | |
21522 | ||
21523 | With `Vector_vector ()`, the distinct occurrences of | |
21524 | `Vector_vector((word16 * word16 * word16)) ()` arising from | |
21525 | polyvariance are globalized during `ClosureConvert`, those global | |
21526 | references may be further duplicated by inlining, but the distinct | |
21527 | occurrences of `Vector_vector((word16 * word16 * word16)) ()` are | |
21528 | merged to a single occurrence. Because this result flows to places | |
21529 | requiring all three components of the elements, it remains | |
21530 | `Vector_vector((word16 * word16 * word16)) ()` after | |
21531 | `Useless`. Furthermore, because one cannot (in constant time) coerce a | |
21532 | `(word16 * word16 * word16) vector` to a `word16 vector`, the `v2` | |
21533 | value remains of type `(word16 * word16 * word16) vector`. | |
21534 | ||
21535 | One option would be to drop the 0-element vector "optimization" | |
21536 | entirely. This costs some space (no sharing of empty vectors) and | |
21537 | some time (allocation and garbage collection of empty vectors). | |
21538 | ||
21539 | Another option would be to reinstate the `Array_array0Const` primitive | |
21540 | and associated `ConstantPropagation` treatment. However, the semantics | |
21541 | and purpose of `Array_array0Const` were poorly understood, resulting in | |
21542 | this break. | |
21543 | ||
21544 | The <:ShareZeroVec:> pass pursues a different approach: perform the 0-element | |
21545 | vector "optimization" as a separate optimization, after | |
21546 | `ConstantPropagation` and `Useless`. A trivial static analysis is | |
21547 | used to match `val v: t vector = Array_toVector(t) (a)` with | |
21548 | corresponding `val a: t array = Array_uninit(t) (l)` and the latter are | |
21549 | expanded to | |
21550 | `val a: t array = if 0 = l then zeroArr_[t] else Array_uninit(t) (l)` | |
21551 | with a single global `val zeroArr_[t] = Array_uninit(t) (0)` created | |
21552 | for each distinct type (after coercion-based optimizations). | |
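
As a rough source-level analogy (the pass itself rewrites SSA, not SML source), the expansion guards each allocation with a zero-length test and shares one global empty array per element type. This is only a sketch: the names `zeroArrInt` and `allocInt` are hypothetical, and `Array.array` with a dummy initial value stands in for the `Array_uninit` primitive, which source programs cannot call.

```sml
(* One shared zero-length array for element type int;
   stands in for the global `zeroArr_[t]`. *)
val zeroArrInt : int array = Array.fromList []

(* Stand-in for `Array_uninit(t) (l)`: allocate only when l is nonzero. *)
fun allocInt (l: int) : int array =
   if 0 = l then zeroArrInt else Array.array (l, 0)

(* Array equality is identity, so this checks that every zero-length
   request yields the same shared array. *)
val shared = allocInt 0 = allocInt 0

(* Nonzero requests still allocate a fresh array each time. *)
val fresh = allocInt 3 = allocInt 3
```

Here `shared` is `true` and `fresh` is `false`, which mirrors the intent of having a single `zeroArr_[t]` global per type.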
21553 | ||
21554 | One disadvantage of this approach, compared to the `Vector_vector(t) ()` | |
21555 | approach, is that `Array_toVector` is applied each time a vector | |
21556 | is created, even if it is being applied to the `zeroArr_[t]` | |
21557 | zero-length array. (Although, this was the behavior of the | |
21558 | `Array_array0Const` approach.) This updates the object header each | |
21559 | time, whereas the `Vector_vector(t) ()` approach would have updated | |
21560 | the object header once, when the global was created, and the | |
21561 | `zeroVec_[t]` global and the `Array_toVector` result would flow to the | |
21562 | join point. | |
21563 | ||
21564 | It would be possible to properly share zero-length vectors, but doing | |
21565 | so is a more sophisticated analysis and transformation, because there | |
21566 | can be arbitrary code between the | |
21567 | `val a: t array = Array_uninit(t) (l)` and the corresponding | |
21568 | `val v: t vector = Array_toVector(t) (a)`, although, in practice, | |
21569 | nothing happens when a zero-length vector is created. It may be best | |
21570 | to pursue a more general "array to vector" optimization that | |
21571 | transforms creations of static-length vectors (e.g., all the | |
21572 | `Vector.new<N>` functions) into `Vector_vector` primitives (some of | |
21573 | which could be globalized). | |
21574 | ||
21575 | == Implementation == | |
21576 | ||
21577 | * <!ViewGitFile(mlton,master,mlton/ssa/share-zero-vec.fun)> | |
21578 | ||
21579 | == Details and Notes == | |
21580 | ||
21581 | {empty} | |
21582 | ||
21583 | <<< | |
21584 | ||
21585 | :mlton-guide-page: ShowBasis | |
21586 | [[ShowBasis]] | |
21587 | ShowBasis | |
21588 | ========= | |
21589 | ||
21590 | MLton has a flag, `-show-basis <file>`, that causes MLton to | |
21591 | pretty-print to _file_ the basis defined by the input program. For example, | |
21592 | if `foo.sml` contains | |
21593 | [source,sml] | |
21594 | ---- | |
21595 | fun f x = x + 1 | |
21596 | ---- | |
21597 | then `mlton -show-basis foo.basis foo.sml` will create `foo.basis` | |
21598 | with the following contents. | |
21599 | ---- | |
21600 | val f: int -> int | |
21601 | ---- | |
21602 | ||
21603 | If you only want to see the basis and do not wish to compile the | |
21604 | program, you can call MLton with `-stop tc`. | |
21605 | ||
21606 | == Displaying signatures == | |
21607 | ||
21608 | When displaying signatures, MLton prefixes types defined in the | |
21609 | signature with `_sig.` to distinguish them from types defined in the | |
21610 | environment. For example, | |
21611 | [source,sml] | |
21612 | ---- | |
21613 | signature SIG = | |
21614 | sig | |
21615 | type t | |
21616 | val x: t * int -> unit | |
21617 | end | |
21618 | ---- | |
21619 | is displayed as | |
21620 | ---- | |
21621 | signature SIG = | |
21622 | sig | |
21623 | type t | |
21624 | val x: _sig.t * int -> unit | |
21625 | end | |
21626 | ---- | |
21627 | ||
21628 | Notice that `int` occurs without the `_sig.` prefix. | |
21629 | ||
21630 | MLton also uses a canonical name for each type in the signature, and | |
21631 | that name is used everywhere for that type, no matter what the input | |
21632 | signature looked like. For example: | |
21633 | [source,sml] | |
21634 | ---- | |
21635 | signature SIG = | |
21636 | sig | |
21637 | type t | |
21638 | type u = t | |
21639 | val x: t | |
21640 | val y: u | |
21641 | end | |
21642 | ---- | |
21643 | is displayed as | |
21644 | ---- | |
21645 | signature SIG = | |
21646 | sig | |
21647 | type t | |
21648 | type u = _sig.t | |
21649 | val x: _sig.t | |
21650 | val y: _sig.t | |
21651 | end | |
21652 | ---- | |
21653 | ||
21654 | Canonical names are always relative to the "top" of the signature, | |
21655 | even when used in nested substructures. For example: | |
21656 | [source,sml] | |
21657 | ---- | |
21658 | signature S = | |
21659 | sig | |
21660 | type t | |
21661 | val w: t | |
21662 | structure U: | |
21663 | sig | |
21664 | type u | |
21665 | val x: t | |
21666 | val y: u | |
21667 | end | |
21668 | val z: U.u | |
21669 | end | |
21670 | ---- | |
21671 | is displayed as | |
21672 | ---- | |
21673 | signature S = | |
21674 | sig | |
21675 | type t | |
21676 | val w: _sig.t | |
21677 | val z: _sig.U.u | |
21678 | structure U: | |
21679 | sig | |
21680 | type u | |
21681 | val x: _sig.t | |
21682 | val y: _sig.U.u | |
21683 | end | |
21684 | end | |
21685 | ---- | |
21686 | ||
21687 | == Displaying structures == | |
21688 | ||
21689 | When displaying structures, MLton uses signature constraints wherever | |
21690 | possible, combined with `where type` clauses to specify the meanings | |
21691 | of the types defined within the signature. For example: | |
21692 | [source,sml] | |
21693 | ---- | |
21694 | signature SIG = | |
21695 | sig | |
21696 | type t | |
21697 | val x: t | |
21698 | end | |
21699 | structure S: SIG = | |
21700 | struct | |
21701 | type t = int | |
21702 | val x = 13 | |
21703 | end | |
21704 | structure S2:> SIG = S | |
21705 | ---- | |
21706 | is displayed as | |
21707 | ---- | |
21708 | signature SIG = | |
21709 | sig | |
21710 | type t | |
21711 | val x: _sig.t | |
21712 | end | |
21713 | structure S: SIG | |
21714 | where type t = int | |
21715 | structure S2: SIG | |
21716 | where type t = S2.t | |
21717 | ---- | |
21718 | ||
21719 | <<< | |
21720 | ||
21721 | :mlton-guide-page: ShowBasisDirective | |
21722 | [[ShowBasisDirective]] | |
21723 | ShowBasisDirective | |
21724 | ================== | |
21725 | ||
21726 | A comment of the form `(*#showBasis "<file>"*)` is recognized as a directive to | |
21727 | save the current basis (i.e., environment) to `<file>` (in the same format as | |
21728 | the `-show-basis <file>` <:CompileTimeOptions: compile-time option>). The | |
21729 | `<file>` is interpreted relative to the source file in which it appears. The | |
21730 | comment is lexed as a distinct token and is parsed as a structure-level | |
21731 | declaration. [Note that treating the directive as a top-level declaration would | |
21732 | prohibit using it inside a functor body, which would make the feature | |
21733 | significantly less useful in the context of the MLton compiler sources (with its | |
21734 | nearly fully functorial style).] | |
21735 | ||
21736 | This feature is meant to facilitate auto-completion via | |
21737 | https://github.com/MatthewFluet/company-mlton[`company-mlton`] and similar | |
21738 | tools. | |
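
A minimal sketch of the directive placed inside a functor body, which is the case the structure-level treatment is meant to allow. The functor `F`, its argument signature, and the file name `f-body.basis` are assumptions made up for illustration. Compilers that do not recognize the directive simply see an ordinary comment.

```sml
(* Hypothetical functor; "f-body.basis" is an illustrative file name,
   interpreted relative to this source file. *)
functor F (S: sig type t val x: t end) =
struct
   val y = S.x
   (*#showBasis "f-body.basis"*)
   val z = (y, y)
end

structure A = F (struct type t = int val x = 1 end)
```

Because the directive saves the basis current at that point, the saved file should describe `S` and `y` but not yet `z`.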
21739 | ||
21740 | <<< | |
21741 | ||
21742 | :mlton-guide-page: ShowProf | |
21743 | [[ShowProf]] | |
21744 | ShowProf | |
21745 | ======== | |
21746 | ||
21747 | If an executable is compiled for <:Profiling:profiling>, then it | |
21748 | accepts a special command-line runtime system argument, `show-prof`, | |
21749 | that outputs information about the source functions that are profiled. | |
21750 | Normally, this information is used by `mlprof`. This page documents | |
21751 | the `show-prof` output format, and is intended for those working on | |
21752 | the profiler internals. | |
21753 | ||
21754 | The `show-prof` output is ASCII, and consists of a sequence of lines. | |
21755 | ||
21756 | * The magic number of the executable. | |
21757 | * The number of source names in the executable. | |
21758 | * A line for each source name giving the name of the function, a tab, | |
21759 | the filename of the file containing the function, a colon, a space, | |
21760 | and the line number that the function starts on in that file. | |
21761 | * The number of (split) source functions. | |
21762 | * A line for each (split) source function, where each line consists of | |
21763 | a source-name index (into the array of source names) and a successors | |
21764 | index (into the array of split-source sequences, defined below). | |
21765 | * The number of split-source sequences. | |
21766 | * A line for each split-source sequence, where each line is a | |
21767 | space-separated list of (split) source functions. | |
21768 | ||
21769 | The latter two arrays, split sources and split-source sequences, | |
21770 | define a directed graph, which is the call-graph of the program. | |
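
To make the source-name line format concrete, here is a small sketch of a parser for one such line (function name, tab, file name, colon, space, line number). The function `parseSourceName` is hypothetical and is not part of `mlprof`. It assumes that the function name contains no tab characters.

```sml
(* Parse one source-name line of the form
   "<function>\t<file>: <line>" into its three parts.
   Returns NONE if the line does not have that shape. *)
fun parseSourceName (line: string) : (string * string * int) option =
   case String.fields (fn c => c = #"\t") line of
      [name, loc] =>
         let
            (* Split at the last colon, so colons in the file name survive. *)
            val (pre, post) =
               Substring.splitr (fn c => c <> #":") (Substring.full loc)
         in
            if Substring.isEmpty pre
               then NONE
            else
               case Int.fromString (Substring.string post) of
                  SOME n =>
                     (* Drop the trailing colon from the file part. *)
                     SOME (name, Substring.string (Substring.trimr 1 pre), n)
                | NONE => NONE
         end
    | _ => NONE
```

`Int.fromString` skips leading whitespace, so the space after the colon needs no special handling.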
21771 | ||
21772 | <<< | |
21773 | ||
21774 | :mlton-guide-page: Shrink | |
21775 | [[Shrink]] | |
21776 | Shrink | |
21777 | ====== | |
21778 | ||
21779 | <:Shrink:> is a rewrite pass for the <:SSA:> and <:SSA2:> | |
21780 | <:IntermediateLanguage:>s, invoked from every optimization pass (see | |
21781 | <:SSASimplify:> and <:SSA2Simplify:>). | |
21782 | ||
21783 | == Description == | |
21784 | ||
21785 | This pass implements a whole family of compile-time reductions, such as: | |
21786 | ||
21787 | * `#1(a, b)` => `a` | |
21788 | * `case C x of C y => e` => `let y = x in e` | |
21789 | * constant folding, copy propagation | |
21790 | * eta blocks | |
21791 | * tuple reconstruction elimination | |
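
The reductions listed above can be illustrated at the source level. This is only an analogy, since the pass operates on SSA and SSA2 programs rather than on SML source. Each primed binding shows the form that the unprimed one would reduce to. All names are made up for illustration.

```sml
datatype t = C of int

(* `#1(a, b)` => `a`: selecting from a freshly built tuple. *)
val sel = #1 (3, 4)
val sel' = 3

(* `case C x of C y => e` => `let y = x in e`: known-constructor case. *)
val x = 5
val known = case C x of C y => y + 1
val known' = let val y = x in y + 1 end

(* Constant folding. *)
val folded = 2 * 21
val folded' = 42
```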
21792 | ||
21793 | == Implementation == | |
21794 | ||
21795 | * <!ViewGitFile(mlton,master,mlton/ssa/shrink.sig)> | |
21796 | * <!ViewGitFile(mlton,master,mlton/ssa/shrink.fun)> | |
21797 | * <!ViewGitFile(mlton,master,mlton/ssa/shrink2.sig)> | |
21798 | * <!ViewGitFile(mlton,master,mlton/ssa/shrink2.fun)> | |
21799 | ||
21800 | == Details and Notes == | |
21801 | ||
21802 | The <:Shrink:> pass is run after every <:SSA:> and <:SSA2:> | |
21803 | optimization pass. | |
21804 | ||
21805 | The <:Shrink:> implementation also includes functions to eliminate | |
21806 | unreachable blocks from a <:SSA:> or <:SSA2:> program or function. | |
21807 | The <:Shrink:> pass is not guaranteed to eliminate all unreachable | |
21808 | blocks. Doing so would unduly complicate the implementation, and it | |
21809 | is almost always the case that all unreachable blocks are eliminated. | |
21810 | However, a small number of optimization passes require that the input | |
21811 | have no unreachable blocks (essentially, when the analysis works on | |
21812 | the control flow graph and the rewrite iterates on the vector of | |
21813 | blocks). These passes explicitly call `eliminateDeadBlocks`. | |
21814 | ||
21815 | The <:Shrink:> pass has a special case to turn a non-tail call where | |
21816 | the continuation and handler only do `Profile` statements into a tail | |
21817 | call where the `Profile` statements precede the tail call. | |
21818 | ||
21819 | <<< | |
21820 | ||
21821 | :mlton-guide-page: SimplifyTypes | |
21822 | [[SimplifyTypes]] | |
21823 | SimplifyTypes | |
21824 | ============= | |
21825 | ||
21826 | <:SimplifyTypes:> is an optimization pass for the <:SSA:> | |
21827 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
21828 | ||
21829 | == Description == | |
21830 | ||
21831 | This pass computes a "cardinality" of each datatype, which is an | |
21832 | abstraction of the number of values of the datatype. | |
21833 | ||
21834 | * `Zero` means the datatype has no values (except for bottom). | |
21835 | * `One` means the datatype has one value (except for bottom). | |
21836 | * `Many` means the datatype has many values. | |
21837 | ||
21838 | This pass removes all datatypes whose cardinality is `Zero` or `One` | |
21839 | and removes: | |
21840 | ||
21841 | * components of tuples | |
21842 | * function args | |
21843 | * constructor args | |
21844 | ||
21845 | which are such datatypes. | |
21846 | ||
21847 | This pass marks constructors as one of: | |
21848 | ||
21849 | * `Useless`: it never appears in a `ConApp`. | |
21850 | * `Transparent`: it is the only variant in its datatype and its argument type does not contain any uses of `array` or `vector`. | |
21851 | * `Useful`: otherwise | |
21852 | ||
21853 | This pass also removes `Useless` and `Transparent` constructors. | |
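
The datatypes below sketch how the cardinalities and constructor markings described above would be expected to apply. The classifications in the comments are read off the definitions on this page, not taken from compiler output. The pass sees monomorphic SSA datatypes, not source declarations, and all names are hypothetical.

```sml
(* No finite value can be built, so the expected cardinality is Zero. *)
datatype void = Void of void

(* Exactly one value, so the expected cardinality is One. *)
datatype token = Token

(* Several values, so the expected cardinality is Many. *)
datatype color = Red | Green | Blue

(* Single variant whose argument mentions neither array nor vector:
   `Wrap` is a candidate for `Transparent`. *)
datatype wrapped = Wrap of int * color

(* A `token` component carries no information, so a pair such as
   (Token, 7) could be represented by the int alone. *)
val p : token * int = (Token, 7)
```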
21854 | ||
21855 | == Implementation == | |
21856 | ||
21857 | * <!ViewGitFile(mlton,master,mlton/ssa/simplify-types.fun)> | |
21858 | ||
21859 | == Details and Notes == | |
21860 | ||
21861 | This pass must happen before polymorphic equality is implemented because | |
21862 | ||
21863 | * it will make polymorphic equality faster because some types are simpler | |
21864 | * it removes uses of polymorphic equality that must return true | |
21865 | ||
21866 | We must keep track of `Transparent` constructors whose argument type | |
21867 | uses `array` because of datatypes like the following: | |
21868 | [source,sml] | |
21869 | ---- | |
21870 | datatype t = T of t array | |
21871 | ---- | |
21872 | ||
21873 | Such a datatype has `Cardinality.Many`, but we cannot eliminate the | |
21874 | datatype and replace the lhs by the rhs, i.e., we must keep the | |
21875 | circularity around. | |
21876 | ||
21877 | Similar care must be taken for `vector`. | |
21878 | ||
21879 | Also, to eliminate as many `Transparent` constructors as possible, for | |
21880 | something like the following, | |
21881 | [source,sml] | |
21882 | ---- | |
21883 | datatype t = T of u array | |
21884 | and u = U of t vector | |
21885 | ---- | |
21886 | we (arbitrarily) expand one of the datatypes first. The result will | |
21887 | be something like | |
21888 | [source,sml] | |
21889 | ---- | |
21890 | datatype u = U of u array vector | |
21891 | ---- | |
21892 | where all uses of `t` are replaced by `u array`. | |
21893 | ||
21894 | <<< | |
21895 | ||
21896 | :mlton-guide-page: SML3d | |
21897 | [[SML3d]] | |
21898 | SML3d | |
21899 | ===== | |
21900 | ||
21901 | The http://sml3d.cs.uchicago.edu/[SML3d Project] is a collection of | |
21902 | libraries to support 3D graphics programming using Standard ML and the | |
21903 | http://www.opengl.org/[OpenGL] graphics API. It currently requires the | |
21904 | MLton implementation of SML and is supported on Linux, Mac OS X, and | |
21905 | Microsoft Windows. There is also support for | |
21906 | http://www.khronos.org/opencl/[OpenCL]. | |
21907 | ||
21908 | <<< | |
21909 | ||
21910 | :mlton-guide-page: SMLNET | |
21911 | [[SMLNET]] | |
21912 | SMLNET | |
21913 | ====== | |
21914 | ||
21915 | http://www.cl.cam.ac.uk/research/tsg/SMLNET[SML.NET] is a | |
21916 | <:StandardMLImplementations:Standard ML implementation> that | |
21917 | targets the .NET Common Language Runtime. | |
21918 | ||
21919 | SML.NET is based on the <:MLj:MLj> compiler. | |
21920 | ||
21921 | == Also see == | |
21922 | ||
21923 | * <!Cite(BentonEtAl04)> | |
21924 | ||
21925 | <<< | |
21926 | ||
21927 | :mlton-guide-page: SMLNJ | |
21928 | [[SMLNJ]] | |
21929 | SMLNJ | |
21930 | ===== | |
21931 | ||
21932 | http://www.smlnj.org/[SML/NJ] is a | |
21933 | <:StandardMLImplementations:Standard ML implementation>. It is a | |
21934 | native code compiler that runs on a variety of platforms and has a | |
21935 | number of libraries and tools. | |
21936 | ||
21937 | We maintain a list of SML/NJ's <:SMLNJDeviations:deviations> from | |
21938 | <:DefinitionOfStandardML:The Definition of Standard ML>. | |
21939 | ||
21940 | MLton has support for some features of SML/NJ in order to ease porting | |
21941 | between MLton and SML/NJ. | |
21942 | ||
21943 | * <:CompilationManager:> (CM) | |
21944 | * <:LineDirective:>s | |
21945 | * <:SMLofNJStructure:> | |
21946 | * <:UnsafeStructure:> | |
21947 | ||
21948 | <<< | |
21949 | ||
21950 | :mlton-guide-page: SMLNJDeviations | |
21951 | [[SMLNJDeviations]] | |
21952 | SMLNJDeviations | |
21953 | =============== | |
21954 | ||
21955 | Here are some deviations of <:SMLNJ:SML/NJ> from | |
21956 | <:DefinitionOfStandardML:The Definition of Standard ML (Revised)>. | |
21957 | Some of these are documented in the | |
21958 | http://www.smlnj.org/doc/Conversion/index.html[SML '97 Conversion Guide]. | |
21959 | Since MLton does not deviate from the Definition, you should look here | |
21960 | if you are having trouble porting a program from MLton to SML/NJ or | |
21961 | vice versa. If you discover other deviations of SML/NJ that aren't | |
21962 | listed here, please send mail to | |
21963 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]. | |
21964 | ||
21965 | * SML/NJ allows spaces in long identifiers, as in `S . x`. Section | |
21966 | 2.5 of the Definition implies that `S . x` should be treated as three | |
21967 | separate lexical items. | |
21968 | ||
21969 | * SML/NJ allows `op` to appear in `val` specifications: | |
21970 | + | |
21971 | [source,sml] | |
21972 | ---- | |
21973 | signature FOO = sig | |
21974 | val op + : int * int -> int | |
21975 | end | |
21976 | ---- | |
21977 | + | |
21978 | The grammar on page 14 of the Definition does not allow it. Recent | |
21979 | versions of SML/NJ do give a warning. | |
21980 | ||
21981 | * SML/NJ rejects | |
21982 | + | |
21983 | [source,sml] | |
21984 | ---- | |
21985 | (op *) | |
21986 | ---- | |
21987 | + | |
21988 | as an unmatched close comment. | |
21989 | ||
21990 | * SML/NJ allows `=` to be rebound by the declaration: | |
21991 | + | |
21992 | [source,sml] | |
21993 | ---- | |
21994 | val op = = 13 | |
21995 | ---- | |
21996 | + | |
21997 | This is explicitly forbidden on page 5 of the Definition. Recent | |
21998 | versions of SML/NJ do give a warning. | |
21999 | ||
22000 | * SML/NJ allows rebinding `true`, `false`, `nil`, `::`, and `ref` by | |
22001 | the declarations: | |
22002 | + | |
22003 | [source,sml] | |
22004 | ---- | |
22005 | fun true () = () | |
22006 | fun false () = () | |
22007 | fun nil () = () | |
22008 | fun op :: () = () | |
22009 | fun ref () = () | |
22010 | ---- | |
22011 | + | |
22012 | This is explicitly forbidden on page 9 of the Definition. | |
22013 | ||
22014 | * SML/NJ extends the syntax of the language to allow vector | |
22015 | expressions and patterns like the following: | |
22016 | + | |
22017 | [source,sml] | |
22018 | ---- | |
22019 | val v = #[1,2,3] | |
22020 | val #[x,y,z] = v | |
22021 | ---- | |
22022 | + | |
22023 | MLton supports vector expressions and patterns with the <:SuccessorML#VectorExpsAndPats:`allowVectorExpsAndPats`> <:MLBasisAnnotations:ML Basis annotation>. | |
22024 | ||
22025 | * SML/NJ extends the syntax of the language to allow _or patterns_ | |
22026 | like the following: | |
22027 | + | |
22028 | [source,sml] | |
22029 | ---- | |
22030 | datatype foo = Foo of int | Bar of int | |
22031 | val (Foo x | Bar x) = Foo 13 | |
22032 | ---- | |
22033 | + | |
22034 | MLton supports or patterns with the <:SuccessorML#OrPats:`allowOrPats`> <:MLBasisAnnotations:ML Basis annotation>. | |
22035 | ||
22036 | * SML/NJ allows higher-order functors, that is, functors can be | |
22037 | components of structures and can be passed as functor arguments and | |
22038 | returned as functor results. As a consequence, SML/NJ allows | |
22039 | abbreviated functor definitions, as in the following: | |
22040 | + | |
22041 | [source,sml] | |
22042 | ---- | |
22043 | signature S = | |
22044 | sig | |
22045 | type t | |
22046 | val x: t | |
22047 | end | |
22048 | functor F (structure A: S): S = | |
22049 | struct | |
22050 | type t = A.t * A.t | |
22051 | val x = (A.x, A.x) | |
22052 | end | |
22053 | functor G = F | |
22054 | ---- | |
22055 | ||
22056 | * SML/NJ extends the syntax of the language to allow `functor` and | |
22057 | `signature` declarations to occur within the scope of `local` and | |
22058 | `structure` declarations. | |
22059 | ||
22060 | * SML/NJ allows duplicate type specifications in signatures when the | |
22061 | duplicates are introduced by `include`, as in the following: | |
22062 | + | |
22063 | [source,sml] | |
22064 | ---- | |
22065 | signature SIG1 = | |
22066 | sig | |
22067 | type t | |
22068 | type u | |
22069 | end | |
22070 | signature SIG2 = | |
22071 | sig | |
22072 | type t | |
22073 | type v | |
22074 | end | |
22075 | signature SIG = | |
22076 | sig | |
22077 | include SIG1 | |
22078 | include SIG2 | |
22079 | end | |
22080 | ---- | |
22081 | + | |
22082 | This is disallowed by rule 77 of the Definition. | |
22083 | ||
22084 | * SML/NJ allows sharing constraints between type abbreviations in | |
22085 | signatures, as in the following: | |
22086 | + | |
22087 | [source,sml] | |
22088 | ---- | |
22089 | signature SIG = | |
22090 | sig | |
22091 | type t = int * int | |
22092 | type u = int * int | |
22093 | sharing type t = u | |
22094 | end | |
22095 | ---- | |
22096 | + | |
22097 | These are disallowed by rule 78 of the Definition. Recent versions of | |
22098 | SML/NJ correctly disallow sharing constraints between type | |
22099 | abbreviations in signatures. | |
22100 | ||
22101 | * SML/NJ disallows multiple `where type` specifications of the same | |
22102 | type name, as in the following: | |
22103 | + | |
22104 | [source,sml] | |
22105 | ---- | |
22106 | signature S = | |
22107 | sig | |
22108 | type t | |
22109 | type u = t | |
22110 | end | |
22111 | where type u = int | |
22112 | ---- | |
22113 | + | |
22114 | This is allowed by rule 64 of the Definition. | |
22115 | ||
22116 | * SML/NJ allows `and` in `sharing` specs in signatures, as in | |
22117 | + | |
22118 | [source,sml] | |
22119 | ---- | |
22120 | signature S = | |
22121 | sig | |
22122 | type t | |
22123 | type u | |
22124 | type v | |
22125 | sharing type t = u | |
22126 | and type u = v | |
22127 | end | |
22128 | ---- | |
22129 | ||
22130 | * SML/NJ does not expand the `withtype` derived form as described by | |
22131 | the Definition. According to page 55 of the Definition, the type | |
22132 | bindings of a `withtype` declaration are substituted simultaneously in | |
22133 | the connected datatype. Consider the following program. | |
22134 | + | |
22135 | [source,sml] | |
22136 | ---- | |
22137 | type u = real ; | |
22138 | datatype a = | |
22139 | A of t | |
22140 | | B of u | |
22141 | withtype u = int | |
22142 | and t = u | |
22143 | ---- | |
22144 | + | |
22145 | According to the Definition, it should be expanded to the following. | |
22146 | + | |
22147 | [source,sml] | |
22148 | ---- | |
22149 | type u = real ; | |
22150 | datatype a = | |
22151 | A of u | |
22152 | | B of int ; | |
22153 | type u = int | |
22154 | and t = u | |
22155 | ---- | |
22156 | + | |
22157 | However, SML/NJ expands `withtype` bindings sequentially, meaning that | |
22158 | earlier bindings are expanded within later ones. Hence, the above | |
22159 | program is expanded to the following. | |
22160 | + | |
22161 | [source,sml] | |
22162 | ---- | |
22163 | type u = real ; | |
22164 | datatype a = | |
22165 | A of int | |
22166 | | B of int ; | |
22167 | type u = int | |
22168 | type t = int | |
22169 | ---- | |
22170 | ||
22171 | * SML/NJ allows `withtype` specifications in signatures. | |
22172 | + | |
22173 | MLton supports `withtype` specifications in signatures with the <:SuccessorML#SigWithtype:`allowSigWithtype`> <:MLBasisAnnotations:ML Basis annotation>. | |
22174 | ||
22175 | * SML/NJ allows a `where` structure specification that is similar to a | |
22176 | `where type` specification. For example: | |
22177 | + | |
22178 | [source,sml] | |
22179 | ---- | |
22180 | structure S = struct type t = int end | |
22181 | signature SIG = | |
22182 | sig | |
22183 | structure T : sig type t end | |
22184 | end where T = S | |
22185 | ---- | |
22186 | + | |
22187 | This is equivalent to: | |
22188 | + | |
22189 | [source,sml] | |
22190 | ---- | |
22191 | structure S = struct type t = int end | |
22192 | signature SIG = | |
22193 | sig | |
22194 | structure T : sig type t end | |
22195 | end where type T.t = S.t | |
22196 | ---- | |
22197 | + | |
22198 | SML/NJ also allows a definitional structure specification that is | |
22199 | similar to a definitional type specification. For example: | |
22200 | + | |
22201 | [source,sml] | |
22202 | ---- | |
22203 | structure S = struct type t = int end | |
22204 | signature SIG = | |
22205 | sig | |
22206 | structure T : sig type t end = S | |
22207 | end | |
22208 | ---- | |
22209 | + | |
22210 | This is equivalent to the previous examples and to: | |
22211 | + | |
22212 | [source,sml] | |
22213 | ---- | |
22214 | structure S = struct type t = int end | |
22215 | signature SIG = | |
22216 | sig | |
22217 | structure T : sig type t end where type t = S.t | |
22218 | end | |
22219 | ---- | |
22220 | ||
22221 | * SML/NJ disallows binding non-datatypes with datatype replication. | |
22222 | For example, it rejects the following program that should be allowed | |
22223 | according to the Definition. | |
22224 | + | |
22225 | [source,sml] | |
22226 | ---- | |
22227 | type ('a, 'b) t = 'a * 'b | |
22228 | datatype u = datatype t | |
22229 | ---- | |
22230 | + | |
22231 | This idiom can be useful when one wants to rename a type without | |
22232 | rewriting all the type arguments. For example, the above would have | |
22233 | to be written in SML/NJ as follows. | |
22234 | + | |
22235 | [source,sml] | |
22236 | ---- | |
22237 | type ('a, 'b) t = 'a * 'b | |
22238 | type ('a, 'b) u = ('a, 'b) t | |
22239 | ---- | |
22240 | ||
22241 | * SML/NJ disallows sharing a structure with one of its substructures. | |
22242 | For example, SML/NJ disallows the following. | |
22243 | + | |
22244 | [source,sml] | |
22245 | ---- | |
22246 | signature SIG = | |
22247 | sig | |
22248 | structure S: | |
22249 | sig | |
22250 | type t | |
22251 | structure T: sig type t end | |
22252 | end | |
22253 | sharing S = S.T | |
22254 | end | |
22255 | ---- | |
22256 | + | |
22257 | This signature is allowed by the Definition. | |
22258 | ||
22259 | * SML/NJ disallows polymorphic generalization of refutable | |
22260 | patterns. For example, SML/NJ disallows the following. | |
22261 | + | |
22262 | [source,sml] | |
22263 | ---- | |
22264 | val [x] = [[]] | |
22265 | val _ = (1 :: x, "one" :: x) | |
22266 | ---- | |
22267 | + | |
22268 | Recent versions of SML/NJ correctly allow polymorphic generalization | |
22269 | of refutable patterns. | |
22270 | ||
22271 | * SML/NJ uses an overly restrictive context for type inference. For | |
22272 | example, SML/NJ rejects both of the following. | |
22273 | + | |
22274 | [source,sml] | |
22275 | ---- | |
22276 | structure S = | |
22277 | struct | |
22278 | val z = (fn x => x) [] | |
22279 | val y = z :: [true] :: nil | |
22280 | end | |
22281 | ---- | |
22282 | + | |
22283 | [source,sml] | |
22284 | ---- | |
22285 | structure S : sig val z : bool list end = | |
22286 | struct | |
22287 | val z = (fn x => x) [] | |
22288 | end | |
22289 | ---- | |
22290 | + | |
22291 | These structures are allowed by the Definition. | |
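
Where portability between the two compilers matters, some of the SML/NJ-only forms above can be spelled in ways the Definition accepts. The following is a minimal sketch, not an exhaustive list. It eta-expands the abbreviated functor definition `functor G = F` and replaces `and` in a `sharing` spec with a single chained spec. The names `Int13`, `R`, and `SHARE` are made up for illustration.

```sml
signature S =
   sig
      type t
      val x: t
   end

functor F (structure A: S): S =
   struct
      type t = A.t * A.t
      val x = (A.x, A.x)
   end

(* Portable replacement for the abbreviated `functor G = F`. *)
functor G (structure A: S): S = F (structure A = A)

structure Int13 = struct type t = int val x = 13 end
structure R = G (structure A = Int13)

(* Portable replacement for `sharing type t = u and type u = v`. *)
signature SHARE =
   sig
      type t
      type u
      type v
      sharing type t = u = v
   end
```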
22292 | ||
22293 | == Deviations from the Basis Library Specification == | |
22294 | ||
22295 | Here are some deviations of SML/NJ from the <:BasisLibrary:Basis Library> | |
22296 | http://www.standardml.org/Basis[specification]. | |
22297 | ||
22298 | * SML/NJ exposes the equality of the `vector` type in structures such | |
22299 | as `Word8Vector` that abstractly match `MONO_VECTOR`, which says | |
22300 | `type vector`, not `eqtype vector`. So, for example, SML/NJ accepts | |
22301 | the following program: | |
22302 | + | |
22303 | [source,sml] | |
22304 | ---- | |
22305 | fun f (v: Word8Vector.vector) = v = v | |
22306 | ---- | |
22307 | ||
22308 | * SML/NJ exposes the equality property of the type `status` in | |
22309 | `OS.Process`. This means that programs which directly compare two | |
22310 | values of type `status` will work with SML/NJ but not MLton. | |
22311 | ||
22312 | * Under SML/NJ on Windows, `OS.Path.validVolume` incorrectly considers | |
22313 | absolute empty volumes to be valid. In other words, when the | |
22314 | expression | |
22315 | + | |
22316 | [source,sml] | |
22317 | ---- | |
22318 | OS.Path.validVolume { isAbs = true, vol = "" } | |
22319 | ---- | |
22320 | + | |
22321 | is evaluated by SML/NJ on Windows, the result is `true`. MLton, on | |
22322 | the other hand, correctly follows the Basis Library Specification, | |
22323 | which states that on Windows, `OS.Path.validVolume` should return | |
22324 | `false` whenever `isAbs = true` and `vol = ""`. | |
22325 | + | |
22326 | This incorrect behavior causes other `OS.Path` functions to behave | |
22327 | differently. For example, when the expression | |
22328 | + | |
22329 | [source,sml] | |
22330 | ---- | |
22331 | OS.Path.toString (OS.Path.fromString "\\usr\\local") | |
22332 | ---- | |
22333 | + | |
22334 | is evaluated by SML/NJ on Windows, the result is `"\\usr\\local"`, | |
22335 | whereas under MLton on Windows, evaluating this expression (correctly) | |
22336 | causes an `OS.Path.Path` exception to be raised. | |
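
Code that must build under both compilers can avoid depending on the equality that SML/NJ exposes for `Word8Vector.vector` and `OS.Process.status`. This sketch uses only functions from the Basis Library signatures. The helper names `sameBytes` and `succeeded` are hypothetical.

```sml
(* Compare monomorphic vectors without relying on `=` for the abstract type. *)
fun sameBytes (a: Word8Vector.vector, b: Word8Vector.vector) : bool =
   Word8Vector.collate Word8.compare (a, b) = EQUAL

(* Inspect a process status through the interface rather than with `=`. *)
fun succeeded (st: OS.Process.status) : bool =
   OS.Process.isSuccess st

val v = Word8Vector.fromList [0w1, 0w2, 0w3]
val ok = sameBytes (v, v) andalso succeeded OS.Process.success
```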
22337 | ||
22338 | <<< | |
22339 | ||
22340 | :mlton-guide-page: SMLNJLibrary | |
22341 | [[SMLNJLibrary]] | |
22342 | SMLNJLibrary | |
22343 | ============ | |
22344 | ||
22345 | The http://www.smlnj.org/doc/smlnj-lib/index.html[SML/NJ Library] is a | |
22346 | collection of libraries that are distributed with SML/NJ. Due to | |
22347 | differences between SML/NJ and MLton, these libraries will not work | |
22348 | out of the box with MLton. | |
22349 | ||
22350 | As of 20180119, MLton includes a port of the SML/NJ Library | |
22351 | synchronized with SML/NJ version 110.82. | |
22352 | ||
22353 | == Usage == | |
22354 | ||
22355 | * You can import a sub-library of the SML/NJ Library into an MLB file with: | |
22356 | + | |
22357 | [options="header"] | |
22358 | |===== | |
22359 | |MLB file|Description | |
22360 | |`$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb`|Various utility modules, including collections, simple formatting, ... | |
22361 | |`$(SML_LIB)/smlnj-lib/Controls/controls-lib.mlb`|A library for managing control flags in an application. | |
22362 | |`$(SML_LIB)/smlnj-lib/HashCons/hash-cons-lib.mlb`|Support for implementing hash-consed data structures. | |
22363 | |`$(SML_LIB)/smlnj-lib/HTML/html-lib.mlb`|HTML 3.2 parsing and pretty-printing library. | |
22364 | |`$(SML_LIB)/smlnj-lib/HTML4/html4-lib.mlb`|HTML 4.01 parsing and pretty-printing library. | |
22365 | |`$(SML_LIB)/smlnj-lib/INet/inet-lib.mlb`|Networking utilities; supported on both Unix and Windows systems. | |
22366 | |`$(SML_LIB)/smlnj-lib/JSON/json-lib.mlb`|JavaScript Object Notation (JSON) reading and writing library. | |
22367 | |`$(SML_LIB)/smlnj-lib/PP/pp-lib.mlb`|Pretty-printing library. | |
22368 | |`$(SML_LIB)/smlnj-lib/Reactive/reactive-lib.mlb`|Reactive scripting library. | |
22369 | |`$(SML_LIB)/smlnj-lib/RegExp/regexp-lib.mlb`|Regular expression library. | |
22370 | |`$(SML_LIB)/smlnj-lib/SExp/sexp-lib.mlb`|S-expression library. | |
22371 | |`$(SML_LIB)/smlnj-lib/Unix/unix-lib.mlb`|Utilities for Unix-based operating systems. | |
22372 | |`$(SML_LIB)/smlnj-lib/XML/xml-lib.mlb`|XML library. | |
22373 | |===== | |
22374 | ||
22375 | * If you are porting a project from SML/NJ's <:CompilationManager:> to | |
22376 | MLton's <:MLBasis: ML Basis system> using `cm2mlb`, note that the | |
22377 | following maps are included by default: | |
22378 | + | |
22379 | ----- | |
22380 | # SMLNJ Library | |
22381 | $SMLNJ-LIB $(SML_LIB)/smlnj-lib | |
22382 | $smlnj-lib.cm $(SML_LIB)/smlnj-lib/Util | |
22383 | $controls-lib.cm $(SML_LIB)/smlnj-lib/Controls | |
22384 | $hash-cons-lib.cm $(SML_LIB)/smlnj-lib/HashCons | |
22385 | $html-lib.cm $(SML_LIB)/smlnj-lib/HTML | |
22386 | $html4-lib.cm $(SML_LIB)/smlnj-lib/HTML4 | |
22387 | $inet-lib.cm $(SML_LIB)/smlnj-lib/INet | |
22388 | $json-lib.cm $(SML_LIB)/smlnj-lib/JSON | |
22389 | $pp-lib.cm $(SML_LIB)/smlnj-lib/PP | |
22390 | $reactive-lib.cm $(SML_LIB)/smlnj-lib/Reactive | |
22391 | $regexp-lib.cm $(SML_LIB)/smlnj-lib/RegExp | |
22392 | $sexp-lib.cm $(SML_LIB)/smlnj-lib/SExp | |
22393 | $unix-lib.cm $(SML_LIB)/smlnj-lib/Unix | |
22394 | $xml-lib.cm $(SML_LIB)/smlnj-lib/XML | |
22395 | ----- | |
22396 | + | |
22397 | This will automatically convert a `$/smlnj-lib.cm` import in an input | |
22398 | `.cm` file into a `$(SML_LIB)/smlnj-lib/Util/smlnj-lib.mlb` import in | |
22399 | the output `.mlb` file. | |
22400 | ||
22401 | == Details == | |
22402 | ||
22403 | The following changes were made to the SML/NJ Library, in addition to | |
22404 | deriving the `.mlb` files from the `.cm` files: | |
22405 | ||
22406 | * `HTML4/pp-init.sml` (added): Implements `structure PrettyPrint` using the SML/NJ PP Library. This implementation is taken from the SML/NJ compiler source, since the SML/NJ HTML4 Library used the `structure PrettyPrint` provided by the SML/NJ compiler itself. | |
22407 | * `Util/base64.sml` (modified): Rewrote use of `Unsafe.CharVector.create` and `Unsafe.CharVector.update`; MLton assumes that vectors are immutable. | |
22408 | * `Util/engine.mlton.sml` (added, not exported): Implements `structure Engine`, providing time-limited, resumable computations using <:MLtonThread:>, <:MLtonSignal:>, and <:MLtonItimer:>. | |
22409 | * `Util/graph-scc-fn.sml` (modified): Rewrote use of `where` structure specification. | |
22410 | * `Util/redblack-map-fn.sml` (modified): Rewrote use of `where` structure specification. | |
22411 | * `Util/redblack-set-fn.sml` (modified): Rewrote use of `where` structure specification. | |
22412 | * `Util/time-limit.mlb` (added): Exports `structure TimeLimit`, which is _not_ exported by `smlnj-lib.mlb`. Since MLton is very conservative in the presence of threads and signals, program performance may be adversely affected by unnecessarily including `structure TimeLimit`. | |
22413 | * `Util/time-limit.mlton.sml` (added): Implements `structure TimeLimit` using `structure Engine`. The SML/NJ implementation of `structure TimeLimit` uses SML/NJ's first-class continuations, signals, and interval timer. | |
22414 | ||
22415 | == Patch == | |
22416 | ||
22417 | * <!ViewGitFile(mlton,master,lib/smlnj-lib/smlnj-lib.patch)> | |
22418 | ||
22419 | <<< | |
22420 | ||
22421 | :mlton-guide-page: SMLofNJStructure | |
22422 | [[SMLofNJStructure]] | |
22423 | SMLofNJStructure | |
22424 | ================ | |
22425 | ||
22426 | [source,sml] | |
22427 | ---- | |
22428 | signature SML_OF_NJ = | |
22429 | sig | |
22430 | structure Cont: | |
22431 | sig | |
22432 | type 'a cont | |
22433 | val callcc: ('a cont -> 'a) -> 'a | |
22434 | val isolate: ('a -> unit) -> 'a cont | |
22435 | val throw: 'a cont -> 'a -> 'b | |
22436 | end | |
22437 | structure SysInfo: | |
22438 | sig | |
22439 | exception UNKNOWN | |
22440 | datatype os_kind = BEOS | MACOS | OS2 | UNIX | WIN32 | |
22441 | ||
22442 | val getHostArch: unit -> string | |
22443 | val getOSKind: unit -> os_kind | |
22444 | val getOSName: unit -> string | |
22445 | end | |
22446 | ||
22447 | val exnHistory: exn -> string list | |
22448 | val exportFn: string * (string * string list -> OS.Process.status) -> unit | |
22449 | val exportML: string -> bool | |
22450 | val getAllArgs: unit -> string list | |
22451 | val getArgs: unit -> string list | |
22452 | val getCmdName: unit -> string | |
22453 | end | |
22454 | ---- | |
22455 | ||
22456 | `SMLofNJ` implements a subset of the structure of the same name | |
22457 | provided in <:SMLNJ:Standard ML of New Jersey>. It is included to | |
22458 | make it easier to port programs between the two systems. The | |
22459 | semantics of these functions may differ from those in SML/NJ. | |
22460 | ||
22461 | * `structure Cont` | |
22462 | + | |
22463 | implements continuations. | |
22464 | ||
22465 | * `SysInfo.getHostArch ()` | |
22466 | + | |
22467 | returns the name of the host architecture as a string. | |
22468 | ||
22469 | * `SysInfo.getOSKind ()` | |
22470 | + | |
22471 | returns the OS kind. | |
22472 | ||
22473 | * `SysInfo.getOSName ()` | |
22474 | + | |
22475 | returns the name of the host operating system as a string. | |
22476 | ||
22477 | * `exnHistory` | |
22478 | + | |
22479 | the same as `MLton.Exn.history`. | |
22480 | ||
22481 | * `getCmdName ()` | |
22482 | + | |
22483 | the same as `CommandLine.name ()`. | |
22484 | ||
22485 | * `getArgs ()` | |
22486 | + | |
22487 | the same as `CommandLine.arguments ()`. | |
22488 | ||
22489 | * `getAllArgs ()` | |
22490 | + | |
22491 | the same as `getCmdName()::getArgs()`. | |
22492 | ||
22493 | * `exportFn (file, f)` | |
22494 | + | |
22495 | saves the state of the computation to the file `file`, which will | |
22496 | apply `f` to the command-line arguments upon restart. | |
22497 | ||
22498 | * `exportML f` | |
22499 | + | |
22500 | saves the state of the computation to file `f` and continues. Returns | |
22501 | `true` in the restarted computation and `false` in the continuing | |
22502 | computation. | |
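
As a minimal sketch of the command-line functions above, the following program prints the command name and each argument. It assumes the program is compiled with MLton, so that `structure SMLofNJ` is in scope through the default basis; the output labels are illustrative only.

[source,sml]
----
(* Print the name the program was invoked with. *)
val () = print ("command: " ^ SMLofNJ.getCmdName () ^ "\n")

(* Print each command-line argument on its own line. *)
val () = List.app (fn a => print ("arg: " ^ a ^ "\n")) (SMLofNJ.getArgs ())
----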
22503 | ||
22504 | <<< | |
22505 | ||
22506 | :mlton-guide-page: SMLSharp | |
22507 | [[SMLSharp]] | |
22508 | SMLSharp | |
22509 | ======== | |
22510 | ||
22511 | http://www.pllab.riec.tohoku.ac.jp/smlsharp/[SML#] is an | |
22512 | <:StandardMLImplementations:implementation> of an extension of SML. | |
22513 | ||
22514 | It includes some | |
22515 | http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Tools[generally useful SML tools] | |
22516 | including a pretty printer generator, a document generator, and a | |
22517 | regression testing framework, and a | |
22518 | http://www.pllab.riec.tohoku.ac.jp/smlsharp/?Library%2FScripting[scripting library]. | |
22519 | ||
22520 | <<< | |
22521 | ||
22522 | :mlton-guide-page: Sources | |
22523 | [[Sources]] | |
22524 | Sources | |
22525 | ======= | |
22526 | ||
22527 | We maintain our sources with <:Git:>. You can | |
22528 | https://github.com/MLton/mlton/[view them on the web] or access | |
22529 | them with a git client. | |
22530 | ||
22531 | Anonymous read-only access is available via | |
22532 | ---------- | |
22533 | https://github.com/MLton/mlton.git | |
22534 | ---------- | |
22535 | or | |
22536 | ---------- | |
22537 | git://github.com/MLton/mlton.git | |
22538 | ---------- | |
22539 | ||
22540 | ||
22541 | == Commit email == | |
22542 | ||
22543 | All commits are sent to | |
22544 | mailto:MLton-commit@mlton.org[`MLton-commit@mlton.org`] | |
22545 | (https://lists.sourceforge.net/lists/listinfo/mlton-commit[subscribe], | |
22546 | https://sourceforge.net/mailarchive/forum.php?forum_name=mlton-commit[archive], | |
22547 | http://www.mlton.org/pipermail/mlton-commit/[archive]) which is a | |
22548 | read-only mailing list for commit emails. Discussion should go to | |
22549 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]. | |
22550 | ||
22551 | ///// | |
22552 | If the first line of a commit log message begins with "++MAIL{nbsp} ++", | |
22553 | then the commit message will be sent with the subject as the rest of | |
22554 | that first line, and will also be sent to | |
22555 | mailto:MLton-devel@mlton.org[`MLton-devel@mlton.org`]. | |
22556 | ///// | |
22557 | ||
22558 | ||
22559 | == Changelog == | |
22560 | ||
22561 | See <!ViewGitFile(mlton,master,CHANGELOG.adoc)> for a list of | |
22562 | changes and bug fixes. | |
22563 | ||
22564 | ||
22565 | == Subversion == | |
22566 | ||
22567 | Prior to 20130308, we used <:Subversion:>. | |
22568 | ||
22569 | == CVS == | |
22570 | ||
22571 | Prior to 20050730, we used <:CVS:>. | |
22572 | ||
22573 | <<< | |
22574 | ||
22575 | :mlton-guide-page: SpaceSafety | |
22576 | [[SpaceSafety]] | |
22577 | SpaceSafety | |
22578 | =========== | |
22579 | ||
22580 | Informally, space safety is a property of a language implementation | |
22581 | that asymptotically bounds the space used by a running program. | |
22582 | ||
22583 | == Also see == | |
22584 | ||
22585 | * Chapter 12 of <!Cite(Appel92)> | |
22586 | * <!Cite(Clinger98)> | |
22587 | ||
22588 | <<< | |
22589 | ||
22590 | :mlton-guide-page: SSA | |
22591 | [[SSA]] | |
22592 | SSA | |
22593 | === | |
22594 | ||
22595 | <:SSA:> is an <:IntermediateLanguage:>, translated from <:SXML:> by | |
22596 | <:ClosureConvert:>, optimized by <:SSASimplify:>, and translated by | |
22597 | <:ToSSA2:> to <:SSA2:>. | |
22598 | ||
22599 | == Description == | |
22600 | ||
22601 | <:SSA:> is a <:FirstOrder:>, <:SimplyTyped:> <:IntermediateLanguage:>. | |
22602 | It is the main <:IntermediateLanguage:> used for optimizations. | |
22603 | ||
22604 | An <:SSA:> program consists of a collection of datatype declarations, | |
22605 | a sequence of global statements, and a collection of functions, along | |
22606 | with a distinguished "main" function. Each function consists of a | |
22607 | collection of basic blocks, where each basic block is a sequence of | |
22608 | statements ending with some control transfer. | |
22609 | ||
22610 | == Implementation == | |
22611 | ||
22612 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa.sig)> | |
22613 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa.fun)> | |
22614 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.sig)> | |
22615 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree.fun)> | |
22616 | ||
22617 | == Type Checking == | |
22618 | ||
22619 | Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check.sig)>, | |
22620 | <!ViewGitFile(mlton,master,mlton/ssa/type-check.fun)>) of an <:SSA:> program | |
22621 | verifies the following: | |
22622 | ||
22623 | * no duplicate definitions (tycons, cons, vars, labels, funcs) | |
22624 | * no out of scope references (tycons, cons, vars, labels, funcs) | |
22625 | * variable definitions dominate variable uses | |
22626 | * case transfers are exhaustive and irredundant | |
22627 | * `Enter`/`Leave` profile statements match | |
22628 | * "traditional" well-typedness | |
22629 | ||
22630 | == Details and Notes == | |
22631 | ||
22632 | SSA is an abbreviation for Static Single Assignment. | |
22633 | ||
22634 | For some initial design discussion, see the thread at: | |
22635 | ||
22636 | * http://mlton.org/pipermail/mlton/2001-August/019689.html | |
22637 | ||
22638 | For some retrospectives, see the threads at: | |
22639 | ||
22640 | * http://mlton.org/pipermail/mlton/2003-January/023054.html | |
22641 | * http://mlton.org/pipermail/mlton/2007-February/029597.html | |
22642 | ||
22643 | <<< | |
22644 | ||
22645 | :mlton-guide-page: SSA2 | |
22646 | [[SSA2]] | |
22647 | SSA2 | |
22648 | ==== | |
22649 | ||
22650 | <:SSA2:> is an <:IntermediateLanguage:>, translated from <:SSA:> by | |
22651 | <:ToSSA2:>, optimized by <:SSA2Simplify:>, and translated by | |
22652 | <:ToRSSA:> to <:RSSA:>. | |
22653 | ||
22654 | == Description == | |
22655 | ||
22656 | <:SSA2:> is a <:FirstOrder:>, <:SimplyTyped:> | |
22657 | <:IntermediateLanguage:>, a slight variant of the <:SSA:> | |
22658 | <:IntermediateLanguage:>. | |
22659 | ||
22660 | Like <:SSA:>, an <:SSA2:> program consists of a collection of datatype | |
22661 | declarations, a sequence of global statements, and a collection of | |
22662 | functions, along with a distinguished "main" function. Each function | |
22663 | consists of a collection of basic blocks, where each basic block is a | |
22664 | sequence of statements ending with some control transfer. | |
22665 | ||
22666 | Unlike <:SSA:>, <:SSA2:> includes mutable fields in objects and makes | |
22667 | the vector type constructor n-ary instead of unary. This allows | |
22668 | optimizations like <:RefFlatten:> and <:DeepFlatten:> to be expressed. | |
22669 | ||
22670 | == Implementation == | |
22671 | ||
22672 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa2.sig)> | |
22673 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa2.fun)> | |
22674 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.sig)> | |
22675 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-tree2.fun)> | |
22676 | ||
22677 | == Type Checking == | |
22678 | ||
22679 | Type checking (<!ViewGitFile(mlton,master,mlton/ssa/type-check2.sig)>, | |
22680 | <!ViewGitFile(mlton,master,mlton/ssa/type-check2.fun)>) of an <:SSA2:> | |
22681 | program verifies the following: | |
22682 | ||
22683 | * no duplicate definitions (tycons, cons, vars, labels, funcs) | |
22684 | * no out of scope references (tycons, cons, vars, labels, funcs) | |
22685 | * variable definitions dominate variable uses | |
22686 | * case transfers are exhaustive and irredundant | |
22687 | * `Enter`/`Leave` profile statements match | |
22688 | * "traditional" well-typedness | |
22689 | ||
22690 | == Details and Notes == | |
22691 | ||
22692 | SSA is an abbreviation for Static Single Assignment. | |
22693 | ||
22694 | <<< | |
22695 | ||
22696 | :mlton-guide-page: SSA2Simplify | |
22697 | [[SSA2Simplify]] | |
22698 | SSA2Simplify | |
22699 | ============ | |
22700 | ||
22701 | The optimization passes for the <:SSA2:> <:IntermediateLanguage:> are | |
22702 | collected and controlled by the `Simplify2` functor | |
22703 | (<!ViewGitFile(mlton,master,mlton/ssa/simplify2.sig)>, | |
22704 | <!ViewGitFile(mlton,master,mlton/ssa/simplify2.fun)>). | |
22705 | ||
22706 | The following optimization passes are implemented: | |
22707 | ||
22708 | * <:DeepFlatten:> | |
22709 | * <:RefFlatten:> | |
22710 | * <:RemoveUnused:> | |
22711 | * <:Zone:> | |
22712 | ||
22713 | There are additional analysis and rewrite passes that augment many of the other optimization passes: | |
22714 | ||
22715 | * <:Restore:> | |
22716 | * <:Shrink:> | |
22717 | ||
22718 | The optimization passes can be controlled from the command-line by the options: | |
22719 | ||
22720 | * `-diag-pass <pass>` -- keep diagnostic info for pass | |
22721 | * `-disable-pass <pass>` -- skip optimization pass (if normally performed) | |
22722 | * `-enable-pass <pass>` -- perform optimization pass (if normally skipped) | |
22723 | * `-keep-pass <pass>` -- keep the results of pass | |
22724 | * `-loop-passes <n>` -- loop optimization passes | |
22725 | * `-ssa2-passes <passes>` -- ssa2 optimization passes | |
22726 | ||
22727 | <<< | |
22728 | ||
22729 | :mlton-guide-page: SSASimplify | |
22730 | [[SSASimplify]] | |
22731 | SSASimplify | |
22732 | =========== | |
22733 | ||
22734 | The optimization passes for the <:SSA:> <:IntermediateLanguage:> are | |
22735 | collected and controlled by the `Simplify` functor | |
22736 | (<!ViewGitFile(mlton,master,mlton/ssa/simplify.sig)>, | |
22737 | <!ViewGitFile(mlton,master,mlton/ssa/simplify.fun)>). | |
22738 | ||
22739 | The following optimization passes are implemented: | |
22740 | ||
22741 | * <:CombineConversions:> | |
22742 | * <:CommonArg:> | |
22743 | * <:CommonBlock:> | |
22744 | * <:CommonSubexp:> | |
22745 | * <:ConstantPropagation:> | |
22746 | * <:Contify:> | |
22747 | * <:Flatten:> | |
22748 | * <:Inline:> | |
22749 | * <:IntroduceLoops:> | |
22750 | * <:KnownCase:> | |
22751 | * <:LocalFlatten:> | |
22752 | * <:LocalRef:> | |
22753 | * <:LoopInvariant:> | |
22754 | * <:LoopUnroll:> | |
22755 | * <:LoopUnswitch:> | |
22756 | * <:Redundant:> | |
22757 | * <:RedundantTests:> | |
22758 | * <:RemoveUnused:> | |
22759 | * <:ShareZeroVec:> | |
22760 | * <:SimplifyTypes:> | |
22761 | * <:Useless:> | |
22762 | ||
22763 | The following implementation passes are implemented: | |
22764 | ||
22765 | * <:PolyEqual:> | |
22766 | * <:PolyHash:> | |
22767 | ||
22768 | There are additional analysis and rewrite passes that augment many of the other optimization passes: | |
22769 | ||
22770 | * <:Multi:> | |
22771 | * <:Restore:> | |
22772 | * <:Shrink:> | |
22773 | ||
22774 | The optimization passes can be controlled from the command-line by the options: | |
22775 | ||
22776 | * `-diag-pass <pass>` -- keep diagnostic info for pass | |
22777 | * `-disable-pass <pass>` -- skip optimization pass (if normally performed) | |
22778 | * `-enable-pass <pass>` -- perform optimization pass (if normally skipped) | |
22779 | * `-keep-pass <pass>` -- keep the results of pass | |
22780 | * `-loop-passes <n>` -- loop optimization passes | |
22781 | * `-ssa-passes <passes>` -- ssa optimization passes | |
22782 | ||
22783 | <<< | |
22784 | ||
22785 | :mlton-guide-page: Stabilizers | |
22786 | [[Stabilizers]] | |
22787 | Stabilizers | |
22788 | =========== | |
22789 | ||
22790 | == Installation == | |
22791 | ||
22792 | * Stabilizers currently require the MLton sources; this should be fixed by the next release | |
22793 | ||
22794 | == License == | |
22795 | ||
22796 | * Stabilizers are released under the MLton License | |
22797 | ||
22798 | == Instructions == | |
22799 | ||
22800 | * Download and build a source copy of MLton | |
22801 | * Extract the tar.gz file attached to this page | |
22802 | * Some examples are provided in the "examples/" subdirectory; more examples will be added to this page in the following week | |
22803 | ||
22804 | == Bug reports / Suggestions == | |
22805 | ||
22806 | * Please send any errors you encounter to schatzp and lziarek at cs.purdue.edu | |
22807 | * We are looking to expand the usability of stabilizers | |
22808 | * Please send any suggestions and desired functionality to the above email addresses | |
22809 | ||
22810 | == Note == | |
22811 | ||
22812 | * This is an alpha release. We expect to have another release with added functionality soon | |
22813 | * More documentation, such as signatures and descriptions of functionality, will be forthcoming | |
22814 | ||
22815 | ||
22816 | == Documentation == | |
22817 | ||
22818 | [source,sml] | |
22819 | ---- | |
22820 | signature STABLE = | |
22821 | sig | |
22822 | type checkpoint | |
22823 | ||
22824 | val stable: ('a -> 'b) -> ('a -> 'b) | |
22825 | val stabilize: unit -> 'a | |
22826 | ||
22827 | val stableCP: (('a -> 'b) * (unit -> unit)) -> | |
22828 | (('a -> 'b) * checkpoint) | |
22829 | val stabilizeCP: checkpoint -> unit | |
22830 | ||
22831 | val unmonitoredAssign: ('a ref * 'a) -> unit | |
22832 | val monitoredAssign: ('a ref * 'a) -> unit | |
22833 | end | |
22834 | ---- | |
22835 | ||
22836 | ||
22837 | `Stable` provides functions to manage stable sections. | |
22838 | ||
22839 | * `type checkpoint` | |
22840 | + | |
22841 | handle used to stabilize contexts other than the current one. | |
22842 | ||
22843 | * `stable f` | |
22844 | + | |
22845 | returns a function identical to `f` that will execute within a stable section. | |
22846 | ||
22847 | * `stabilize ()` | |
22848 | + | |
22849 | unrolls the effects made up to the current context to at least the | |
22850 | nearest enclosing _stable_ section. These effects may have propagated | |
22851 | to other threads, so all affected threads are returned to a globally | |
22852 | consistent previous state. The return value is undefined because control | |
22853 | cannot resume after `stabilize` is called. | |
22854 | ||
22855 | * `stableCP (f, comp)` | |
22856 | + | |
22857 | returns a function `f'` and checkpoint tag `cp`. Function `f'` is | |
22858 | identical to `f` but when applied will execute within a stable | |
22859 | section. `comp` will be executed if `f'` is later stabilized. `cp` | |
22860 | is used by `stabilizeCP` to stabilize a given checkpoint. | |
22861 | ||
22862 | * `stabilizeCP cp` | |
22863 | + | |
22864 | same as `stabilize` except that the (possibly current) checkpoint to | |
22865 | stabilize is provided. | |
22866 | ||
22867 | * `unmonitoredAssign (r, v)` | |
22868 | + | |
22869 | standard assignment (`:=`). The distributed version of CML rebinds | |
22870 | `:=` to a monitored version so interesting effects can be recorded. | |
22871 | ||
22872 | * `monitoredAssign (r, v)` | |
22873 | + | |
22874 | the assignment operator that should be used in programs that use | |
22875 | stabilizers. `:=` is rebound to this by including CML. | |
22876 | ||
22877 | == Download == | |
22878 | ||
22879 | * <!Attachment(Stabilizers,stabilizers_alpha_2006-10-09.tar.gz)> | |
22880 | ||
22881 | == Also see == | |
22882 | ||
22883 | * <!Cite(ZiarekEtAl06)> | |
22884 | ||
22885 | <<< | |
22886 | ||
22887 | :mlton-guide-page: StandardML | |
22888 | [[StandardML]] | |
22889 | StandardML | |
22890 | ========== | |
22891 | ||
22892 | Standard ML (SML) is a programming language that combines excellent | |
22893 | support for rapid prototyping, modularity, and development of large | |
22894 | programs, with performance approaching that of C. | |
22895 | ||
22896 | == SML Resources == | |
22897 | ||
22898 | * <:StandardMLTutorials:Tutorials> | |
22899 | * <:StandardMLBooks:Books> | |
22900 | * <:StandardMLImplementations:Implementations> | |
22901 | // * http://google.com/coop/cse?cx=014714656471597805969%3Afzuz7eybmcy[SML web search] from Google Co-op | |
22902 | ||
22903 | == Aspects of SML == | |
22904 | ||
22905 | * <:DefineTypeBeforeUse:> | |
22906 | * <:EqualityType:> | |
22907 | * <:EqualityTypeVariable:> | |
22908 | * <:GenerativeDatatype:> | |
22909 | * <:GenerativeException:> | |
22910 | * <:Identifier:> | |
22911 | * <:OperatorPrecedence:> | |
22912 | * <:Overloading:> | |
22913 | * <:PolymorphicEquality:> | |
22914 | * <:TypeVariableScope:> | |
22915 | * <:ValueRestriction:> | |
22916 | ||
22917 | == Using SML == | |
22918 | ||
22919 | * <:Fixpoints:> | |
22920 | * <:ForLoops:> | |
22921 | * <:FunctionalRecordUpdate:> | |
22922 | * <:InfixingOperators:> | |
22923 | * <:Lazy:> | |
22924 | * <:ObjectOrientedProgramming:> | |
22925 | * <:OptionalArguments:> | |
22926 | * <:Printf:> | |
22927 | * <:PropertyList:> | |
22928 | * <:ReturnStatement:> | |
22929 | * <:Serialization:> | |
22930 | * <:StandardMLGotchas:> | |
22931 | * <:StyleGuide:> | |
22932 | * <:TipsForWritingConciseSML:> | |
22933 | * <:UniversalType:> | |
22934 | ||
22935 | == Programming in SML == | |
22936 | ||
22937 | * <:Emacs:> | |
22938 | * <:Enscript:> | |
22939 | * <:Pygments:> | |
22940 | ||
22941 | == Notes == | |
22942 | ||
22943 | * <:StandardMLHistory: History of SML> | |
22944 | * <:Regions:> | |
22945 | ||
22946 | == Related Languages == | |
22947 | ||
22948 | * <:Alice:> | |
22949 | * <:FSharp:F#> | |
22950 | * <:OCaml:> | |
22951 | ||
22952 | <<< | |
22953 | ||
22954 | :mlton-guide-page: StandardMLBooks | |
22955 | [[StandardMLBooks]] | |
22956 | StandardMLBooks | |
22957 | =============== | |
22958 | ||
22959 | == Introductory Books == | |
22960 | ||
22961 | * <!Cite(Ullman98, Elements of ML Programming)> | |
22962 | ||
22963 | * <!Cite(Paulson96, ML For the Working Programmer)> | |
22964 | ||
22965 | * <!Cite(HansenRichel99, Introduction to Programming using SML)> | |
22966 | ||
22967 | * <!Cite(FelleisenFreidman98, The Little MLer)> | |
22968 | ||
22969 | == Applications == | |
22970 | ||
22971 | * <!Cite(Shipman02, Unix System Programming with Standard ML)> | |
22972 | ||
22973 | == Reference Books == | |
22974 | ||
22975 | * <!Cite(GansnerReppy04, The Standard ML Basis Library)> | |
22976 | ||
22977 | * <:DefinitionOfStandardML:The Definition of Standard ML (Revised)> | |
22978 | ||
22979 | == Related Topics == | |
22980 | ||
22981 | * <!Cite(Reppy07, Concurrent Programming in ML)> | |
22982 | ||
22983 | * <!Cite(Okasaki99, Purely Functional Data Structures)> | |
22984 | ||
22985 | <<< | |
22986 | ||
22987 | :mlton-guide-page: StandardMLGotchas | |
22988 | [[StandardMLGotchas]] | |
22989 | StandardMLGotchas | |
22990 | ================= | |
22991 | ||
22992 | This page contains brief explanations of some recurring sources of | |
22993 | confusion and problems that SML newbies encounter. | |
22994 | ||
22995 | Many confusions about the syntax of SML seem to arise from the use of | |
22996 | an interactive REPL (Read-Eval-Print Loop) while trying to learn the | |
22997 | basics of the language. While writing your first SML programs, you | |
22998 | should keep the source code of your programs in a form that is | |
22999 | accepted by an SML compiler as a whole. | |
23000 | ||
23001 | == The `and` keyword == | |
23002 | ||
23003 | It is a common mistake to misuse the `and` keyword or to not know how | |
23004 | to introduce mutually recursive definitions. The purpose of the `and` | |
23005 | keyword is to introduce mutually recursive definitions of functions | |
23006 | and datatypes. For example, | |
23007 | ||
23008 | [source,sml] | |
23009 | ---- | |
23010 | fun isEven 0w0 = true | |
23011 | | isEven 0w1 = false | |
23012 | | isEven n = isOdd (n-0w1) | |
23013 | and isOdd 0w0 = false | |
23014 | | isOdd 0w1 = true | |
23015 | | isOdd n = isEven (n-0w1) | |
23016 | ---- | |
23017 | ||
23018 | and | |
23019 | ||
23020 | [source,sml] | |
23021 | ---- | |
23022 | datatype decl = VAL of id * pat * expr | |
23023 | (* | ... *) | |
23024 | and expr = LET of decl * expr | |
23025 | (* | ... *) | |
23026 | ---- | |
23027 | ||
23028 | You can also use `and` as a shorthand in a couple of other places, but | |
23029 | it is not necessary. | |
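
One such place is a `val` declaration. As a small sketch (the names `x` and `y` are illustrative), bindings joined by `and` are made simultaneously: every right-hand side is evaluated before any of the new names comes into scope.

[source,sml]
----
val x = 1 and y = 2

(* Both right-hand sides refer to the earlier x and y,
 * so this declaration swaps the two values. *)
val x = y and y = x
----

After the second declaration, `x` is `2` and `y` is `1`. Writing two separate `val` declarations instead would bind both names to `2`.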
23030 | ||
23031 | == Constructed patterns == | |
23032 | ||
23033 | It is a common mistake to forget to parenthesize constructed patterns | |
23034 | in `fun` bindings. Consider the following invalid definition: | |
23035 | ||
23036 | [source,sml] | |
23037 | ---- | |
23038 | fun length nil = 0 | |
23039 | | length h :: t = 1 + length t | |
23040 | ---- | |
23041 | ||
23042 | The pattern `h :: t` needs to be parenthesized: | |
23043 | ||
23044 | [source,sml] | |
23045 | ---- | |
23046 | fun length nil = 0 | |
23047 | | length (h :: t) = 1 + length t | |
23048 | ---- | |
23049 | ||
23050 | The parentheses are needed, because a `fun` definition may have | |
23051 | multiple consecutive constructed patterns through currying. | |
23052 | ||
23053 | The same applies to nonfix constructors. For example, the parentheses | |
23054 | in | |
23055 | ||
23056 | [source,sml] | |
23057 | ---- | |
23058 | fun valOf NONE = raise Option | |
23059 | | valOf (SOME x) = x | |
23060 | ---- | |
23061 | ||
23062 | are required. However, the outermost constructed pattern in a `fn` or | |
23063 | `case` expression need not be parenthesized, because in those cases | |
23064 | there is always just one constructed pattern. So, both | |
23065 | ||
23066 | [source,sml] | |
23067 | ---- | |
23068 | val valOf = fn NONE => raise Option | |
23069 | | SOME x => x | |
23070 | ---- | |
23071 | ||
23072 | and | |
23073 | ||
23074 | [source,sml] | |
23075 | ---- | |
23076 | fun valOf x = case x of | |
23077 | NONE => raise Option | |
23078 | | SOME x => x | |
23079 | ---- | |
23080 | ||
23081 | are fine. | |
23082 | ||
23083 | == Declarations and expressions == | |
23084 | ||
23085 | It is a common mistake to confuse expressions and declarations. | |
23086 | Normally an SML source file should only contain declarations. The | |
23087 | following are declarations: | |
23088 | ||
23089 | [source,sml] | |
23090 | ---- | |
23091 | datatype dt = ... | |
23092 | fun f ... = ... | |
23093 | functor Fn (...) = ... | |
23094 | infix ... | |
23095 | infixr ... | |
23096 | local ... in ... end | |
23097 | nonfix ... | |
23098 | open ... | |
23099 | signature SIG = ... | |
23100 | structure Struct = ... | |
23101 | type t = ... | |
23102 | val v = ... | |
23103 | ---- | |
23104 | ||
23105 | Note that | |
23106 | ||
23107 | [source,sml] | |
23108 | ---- | |
23109 | let ... in ... end | |
23110 | ---- | |
23111 | ||
23112 | isn't a declaration. | |
23113 | ||
23114 | To specify a side-effecting computation in a source file, you can write: | |
23115 | ||
23116 | [source,sml] | |
23117 | ---- | |
23118 | val () = ... | |
23119 | ---- | |
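
For instance, a complete source file that consists only of declarations might look like the following sketch, where the function name `greet` is purely illustrative:

[source,sml]
----
(* A function declaration. *)
fun greet name = print ("Hello, " ^ name ^ "!\n")

(* A declaration whose only purpose is to run a side effect. *)
val () = greet "world"
----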
23120 | ||
23121 | ||
23122 | == Equality types == | |
23123 | ||
23124 | SML has a fairly intricate built-in notion of equality. See | |
23125 | <:EqualityType:> and <:EqualityTypeVariable:> for a thorough | |
23126 | discussion. | |
23127 | ||
23128 | ||
23129 | == Nested cases == | |
23130 | ||
23131 | It is a common mistake to write nested case expressions without the | |
23132 | necessary parentheses. See <:UnresolvedBugs:> for a discussion. | |
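
The following sketch (the function name `f` and its argument names are illustrative) shows the parentheses in question. Without them, the final `| SOME a` rule would be parsed as a third rule of the inner `case` rather than as the second rule of the outer one.

[source,sml]
----
fun f (x, y) =
   case x of
      NONE => (case y of
                  NONE => 0
                | SOME b => b)
    | SOME a => a
----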
23133 | ||
23134 | ||
23135 | == (op *) == | |
23136 | ||
23137 | It used to be a common mistake to parenthesize `op *` as `(op *)`. | |
23138 | Before SML'97, `*)` was considered a comment terminator in SML and | |
23139 | caused a syntax error. At the time of writing, <:SMLNJ:SML/NJ> still | |
23140 | rejects the code. An extra space may be used for portability: | |
23141 | `(op * )`. However, parenthesizing `op` is redundant, even though it | |
23142 | is a widely used convention. | |
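
Both spellings are sketched below; the value names are illustrative. The first keeps the parentheses and adds the space for portability, and the second drops the redundant parentheses.

[source,sml]
----
(* Parenthesized, with a space so that "*)" never appears. *)
val product1 = foldl (op * ) 1 [1, 2, 3, 4]

(* Without the redundant parentheses. *)
val product2 = foldl op * 1 [1, 2, 3, 4]
----

Both values are `24`.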
23143 | ||
23144 | ||
23145 | == Overloading == | |
23146 | ||
23147 | A number of standard operators (`+`, `-`, `~`, `*`, `<`, `>`, ...) and | |
23148 | numeric constants are overloaded for some of the numeric types (`int`, | |
23149 | `real`, `word`). It is a common surprise that definitions using | |
23150 | overloaded operators such as | |
23151 | ||
23152 | [source,sml] | |
23153 | ---- | |
23154 | fun min (x, y) = if y < x then y else x | |
23155 | ---- | |
23156 | ||
23157 | are not overloaded themselves. SML doesn't really support | |
23158 | (user-defined) overloading or other forms of ad hoc polymorphism. In | |
23159 | cases such as the above where the context doesn't resolve the | |
23160 | overloading, expressions using overloaded operators or constants get | |
23161 | assigned a default type. The above definition gets the type | |
23162 | ||
23163 | [source,sml] | |
23164 | ---- | |
23165 | val min : int * int -> int | |
23166 | ---- | |
23167 | ||
23168 | See <:Overloading:> and <:TypeIndexedValues:> for further discussion. | |
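
When a type other than the default is wanted, a type annotation can give the context needed to resolve the overloading. A minimal sketch, in which the name `minReal` is illustrative:

[source,sml]
----
(* No annotation: < is resolved at the default type int. *)
fun min (x, y) = if y < x then y else x

(* Annotating one argument resolves < at type real instead. *)
fun minReal (x: real, y) = if y < x then y else x

val a = min (1, 2)
val b = minReal (1.5, 2.5)
----

Here `minReal` has type `real * real -> real`, while `min` keeps the type `int * int -> int`, so two separate definitions are still needed.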
23169 | ||
23170 | ||
23171 | == Semicolons == | |
23172 | ||
23173 | It is a common mistake to use redundant semicolons in SML code. This | |
23174 | is probably caused by the fact that in an SML REPL, a semicolon (and | |
23175 | enter) is used to signal the REPL that it should evaluate the | |
23176 | preceding chunk of code as a unit. In SML source files, semicolons | |
23177 | are really needed in only two places. Namely, in expressions of the | |
23178 | form | |
23179 | ||
23180 | [source,sml] | |
23181 | ---- | |
23182 | (exp ; ... ; exp) | |
23183 | ---- | |
23184 | ||
23185 | and | |
23186 | ||
23187 | [source,sml] | |
23188 | ---- | |
23189 | let ... in exp ; ... ; exp end | |
23190 | ---- | |
23191 | ||
23192 | Note that semicolons act as expression (or declaration) separators | |
23193 | rather than as terminators. | |
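
Both forms are sketched below; the value name `n` is illustrative. Each sequence evaluates its expressions from left to right and takes the value of the last one, and no semicolon follows the declarations themselves.

[source,sml]
----
(* Parenthesized sequence: prints two lines and has type unit. *)
val () = (print "first\n"; print "second\n")

(* Sequence in a let body: prints a line, then yields x + 1. *)
val n = let val x = 1 in print "computing\n"; x + 1 end
----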
23194 | ||
23195 | ||
23196 | == Stale bindings == | |
23197 | ||
23198 | {empty} | |
23199 | ||
23200 | ||
23201 | == Unresolved records == | |
23202 | ||
23203 | {empty} | |
23204 | ||
23205 | ||
23206 | == Value restriction == | |
23207 | ||
23208 | See <:ValueRestriction:>. | |
23209 | ||
23210 | ||
23211 | == Type Variable Scope == | |
23212 | ||
23213 | See <:TypeVariableScope:>. | |
23214 | ||
23215 | <<< | |
23216 | ||
23217 | :mlton-guide-page: StandardMLHistory | |
23218 | [[StandardMLHistory]] | |
23219 | StandardMLHistory | |
23220 | ================= | |
23221 | ||
23222 | <:StandardML:Standard ML> grew out of <:ML:> in the early 1980s. | |
23223 | ||
23224 | For an excellent overview of SML's history, see Appendix F of the | |
23225 | <:DefinitionOfStandardML:Definition>. | |
23226 | ||
23227 | For an overview of its history before 1982, see <!Cite(Milner82, How | |
23228 | ML Evolved)>. | |
23229 | ||
23230 | <<< | |
23231 | ||
23232 | :mlton-guide-page: StandardMLImplementations | |
23233 | [[StandardMLImplementations]] | |
23234 | StandardMLImplementations | |
23235 | ========================= | |
23236 | ||
23237 | There are a number of implementations of <:StandardML:Standard ML>, | |
23238 | from interpreters, to byte-code compilers, to incremental compilers, | |
23239 | to whole-program compilers. | |
23240 | ||
23241 | * <:Alice:Alice ML> | |
23242 | * <:HaMLet:HaMLet> | |
23243 | * <:MLKit:ML Kit> | |
23244 | * <:Home:MLton> | |
23245 | * <:MoscowML:Moscow ML> | |
23246 | * <:PolyML:Poly/ML> | |
23247 | * <:SMLSharp:SML#> | |
23248 | * <:SMLNJ:SML/NJ> | |
23249 | * <:SMLNET:SML.NET> | |
23250 | * <:TILT:TILT> | |
23251 | ||
23252 | == Not Actively Maintained == | |
23253 | ||
23254 | * http://www.dcs.ed.ac.uk/home/edml/[Edinburgh ML] | |
23255 | * <:MLj:MLj> | |
23256 | * MLWorks | |
23257 | * <:Poplog:> | |
23258 | * http://www.cs.cornell.edu/Info/People/jgm/til.tar.Z[TIL] | |
23259 | ||
23260 | <<< | |
23261 | ||
23262 | :mlton-guide-page: StandardMLPortability | |
23263 | [[StandardMLPortability]] | |
23264 | StandardMLPortability | |
23265 | ===================== | |
23266 | ||
23267 | Technically, SML'97 as defined in the | |
23268 | <:DefinitionOfStandardML:Definition> | |
23269 | requires only a minimal initial basis, which, while including the | |
23270 | types `int`, `real`, `char`, and `string`, need have | |
23271 | no operations on those base types. Hence, the only observable output | |
23272 | of an SML'97 program is termination or raising an exception. Most SML | |
23273 | compilers should agree there, to the degree each agrees with the | |
23274 | Definition. See <:UnresolvedBugs:> for MLton's very few corner cases. | |
23275 | ||
23276 | Realistically, a program needs to make use of the | |
23277 | <:BasisLibrary:Basis Library>. | |
23278 | Within the Basis Library, there are numerous places where the behavior | |
23279 | is implementation dependent. For a trivial example: | |
23280 | ||
23281 | [source,sml] | |
23282 | ---- | |
23283 | val _ = valOf (Int.maxInt) | |
23284 | ---- | |
23285 | ||
23286 | ||
23287 | may either raise the `Option` exception (if | |
23288 | `Int.maxInt = NONE`) or may terminate normally. The default | |
23289 | Int/Real/Word sizes are the biggest implementation dependent aspect; | |
23290 | so, one implementation may raise `Overflow` while another can | |
23291 | accommodate the result. Also, maximum array and vector lengths are | |
23292 | implementation dependent. Interfacing with the operating system is a | |
23293 | bit murky, and implementations surely differ in handling of errors | |
23294 | there. | |
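
A program that wants to behave predictably across implementations can inspect these implementation-dependent values instead of assuming them. The following is a small sketch that uses only `Int.maxInt` and the `Overflow` exception from the Basis Library; the helper name `safeAdd` is hypothetical and not part of any library:

[source,sml]
----
(* Report the default int size instead of assuming one. *)
val () =
   case Int.maxInt of
      NONE => print "int has no upper bound\n"
    | SOME m => print ("largest int: " ^ Int.toString m ^ "\n")

(* safeAdd is a hypothetical helper: it turns Overflow into NONE.
 * With an arbitrary-precision default int it never returns NONE. *)
fun safeAdd (a : int, b : int) : int option =
   SOME (a + b) handle Overflow => NONE
----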
23295 | ||
23296 | <<< | |
23297 | ||
23298 | :mlton-guide-page: StandardMLTutorials | |
23299 | [[StandardMLTutorials]] | |
23300 | StandardMLTutorials | |
23301 | =================== | |
23302 | ||
23303 | * http://www.dcs.napier.ac.uk/course-notes/sml/manual.html[A Gentle Introduction to ML]. | |
23304 | Andrew Cummings. | |
23305 | ||
23306 | * http://www.dcs.ed.ac.uk/home/stg/NOTES/[Programming in Standard ML '97: An Online Tutorial]. | |
23307 | Stephen Gilmore. | |
23308 | ||
23309 | * <!Cite(Harper11, Programming in Standard ML)>. | |
23310 | Robert Harper. | |
23311 | ||
23312 | * <!Cite(Tofte96, Essentials of Standard ML Modules)>. | |
23313 | Mads Tofte. | |
23314 | ||
23315 | * <!Cite(Tofte09, Tips for Computer Scientists on Standard ML (Revised))>. | |
23316 | Mads Tofte. | |
23317 | ||
23318 | <<< | |
23319 | ||
23320 | :mlton-guide-page: StaticSum | |
23321 | [[StaticSum]] | |
23322 | StaticSum | |
23323 | ========= | |
23324 | ||
23325 | While SML makes it impossible to write functions whose types would | |
23326 | depend on the values of their arguments, or so-called dependently | |
23327 | typed functions, it is possible, and arguably commonplace, to write | |
23328 | functions whose types depend on the types of their arguments. Indeed, | |
23329 | the types of parametrically polymorphic functions like `map` and | |
23330 | `foldl` can be said to depend on the types of their arguments. What | |
23331 | is less commonplace, however, is to write functions whose behavior | |
23332 | would depend on the types of their arguments. Nevertheless, there are | |
23333 | several techniques for writing such functions. | |
23334 | <:TypeIndexedValues:Type-indexed values> and <:Fold:fold> are two such | |
23335 | techniques. This page presents another such technique dubbed static | |
23336 | sums. | |
23337 | ||
23338 | ||
23339 | == Ordinary Sums == | |
23340 | ||
23341 | Consider the sum type as defined below: | |
23342 | [source,sml] | |
23343 | ---- | |
23344 | structure Sum = struct | |
23345 | datatype ('a, 'b) t = INL of 'a | INR of 'b | |
23346 | end | |
23347 | ---- | |
23348 | ||
23349 | While a generic sum type such as defined above is very useful, it has | |
23350 | a number of limitations. As an example, we could write the function | |
23351 | `out` to extract the value from a sum as follows: | |
23352 | [source,sml] | |
23353 | ---- | |
23354 | fun out (s : ('a, 'a) Sum.t) : 'a = | |
23355 | case s | |
23356 | of Sum.INL a => a | |
23357 | | Sum.INR a => a | |
23358 | ---- | |
23359 | ||
23360 | As can be seen from the type of `out`, it is limited in the sense that | |
23361 | it requires both variants of the sum to have the same type. So, `out` | |
23362 | cannot be used to extract the value of a sum of two different types, | |
23363 | such as the type `(int, real) Sum.t`. As another example of a | |
23364 | limitation, consider the following attempt at a `succ` function: | |
23365 | [source,sml] | |
23366 | ---- | |
23367 | fun succ (s : (int, real) Sum.t) : ??? = | |
23368 | case s | |
23369 | of Sum.INL i => i + 1 | |
23370 | | Sum.INR r => Real.nextAfter (r, Real.posInf) | |
23371 | ---- | |
23372 | ||
23373 | The above definition of `succ` cannot be typed, because there is no | |
23374 | type for the codomain within SML. | |
23375 | ||
23376 | ||
23377 | == Static Sums == | |
23378 | ||
23379 | Interestingly, it is possible to define values `inL`, `inR`, and | |
23380 | `match` that satisfy the laws | |
23381 | ---- | |
23382 | match (inL x) (f, g) = f x | |
23383 | match (inR x) (f, g) = g x | |
23384 | ---- | |
23385 | and do not suffer from the same limitations. The definitions are | |
23386 | actually quite trivial: | |
23387 | [source,sml] | |
23388 | ---- | |
23389 | structure StaticSum = struct | |
23390 | fun inL x (f, _) = f x | |
23391 | fun inR x (_, g) = g x | |
23392 | fun match x = x | |
23393 | end | |
23394 | ---- | |
23395 | ||
23396 | Now, given the `succ` function defined as | |
23397 | [source,sml] | |
23398 | ---- | |
23399 | fun succ s = | |
23400 | StaticSum.match s | |
23401 | (fn i => i + 1, | |
23402 | fn r => Real.nextAfter (r, Real.posInf)) | |
23403 | ---- | |
23404 | we get | |
23405 | [source,sml] | |
23406 | ---- | |
23407 | succ (StaticSum.inL 1) = 2 | |
23408 | succ (StaticSum.inR Real.maxFinite) = Real.posInf | |
23409 | ---- | |
23410 | ||
23411 | To better understand how this works, consider the following signature | |
23412 | for static sums: | |
23413 | [source,sml] | |
23414 | ---- | |
23415 | structure StaticSum :> sig | |
23416 | type ('dL, 'cL, 'dR, 'cR, 'c) t | |
23417 | val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t | |
23418 | val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t | |
23419 | val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c | |
23420 | end = struct | |
23421 | type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c | |
23422 | open StaticSum | |
23423 | end | |
23424 | ---- | |
23425 | ||
23426 | Above, `'d` stands for domain and `'c` for codomain. The key | |
23427 | difference between an ordinary sum type, like `(int, real) Sum.t`, and | |
23428 | a static sum type, like `(int, real, real, int, real) StaticSum.t`, is | |
23429 | that the ordinary sum type says nothing about the type of the result | |
23430 | of deconstructing a sum while the static sum type specifies the type. | |
23431 | ||
23432 | With the sealed static sum module, we get the type | |
23433 | [source,sml] | |
23434 | ---- | |
23435 | val succ : (int, int, real, real, 'a) StaticSum.t -> 'a | |
23436 | ---- | |
23437 | for the previously defined `succ` function. The type specifies that | |
23438 | `succ` maps a left `int` to an `int` and a right `real` to a `real`. | |
23439 | For example, the type of `StaticSum.inL 1` is | |
23440 | `(int, 'cL, 'dR, 'cR, 'cL) StaticSum.t`. Unifying this with the | |
23441 | argument type of `succ` gives the type `(int, int, real, real, int) | |
23442 | StaticSum.t -> int`. | |
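
The effect of this unification can be observed by annotating the result types. The sketch below repeats the unsealed definitions so that it stands alone; with the unsealed structure the inferred types are plain function types rather than `StaticSum.t`, and the names `two` and `next` are illustrative only:

[source,sml]
----
(* Unsealed definitions, repeated so the sketch is self-contained. *)
structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
end

fun succ s =
   StaticSum.match s
      (fn i => i + 1,
       fn r => Real.nextAfter (r, Real.posInf))

(* The annotations only check what inference already determines:
 * the left injection selects the int codomain, the right one the real. *)
val two : int = succ (StaticSum.inL 1)
val next : real = succ (StaticSum.inR 1.0)
----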
23443 | ||
23444 | The `out` function is quite useful on its own. Here is how it can be | |
23445 | defined: | |
23446 | [source,sml] | |
23447 | ---- | |
23448 | structure StaticSum = struct | |
23449 | open StaticSum | |
23450 | val out : ('a, 'a, 'b, 'b, 'c) t -> 'c = | |
23451 | fn s => match s (fn x => x, fn x => x) | |
23452 | end | |
23453 | ---- | |
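
As a short usage sketch (again with the unsealed definitions, and with `out` written without its type annotation because the unsealed structure has no type `t`), `out` returns whichever value was injected, even when the two sides have different types:

[source,sml]
----
structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
   (* Both handlers are the identity, so the injected value is returned. *)
   fun out s = match s (fn x => x, fn x => x)
end

(* The names n and s are illustrative only. *)
val n : int = StaticSum.out (StaticSum.inL 1)
val s : string = StaticSum.out (StaticSum.inR "right")
----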
23454 | ||
23455 | Due to the value restriction, lack of first class polymorphism and | |
23456 | polymorphic recursion, the usefulness and convenience of static sums | |
23457 | is somewhat limited in SML. So, don't throw away the ordinary sum | |
23458 | type just yet. Static sums can nevertheless be quite useful. | |
23459 | ||
23460 | ||
23461 | === Example: Send and Receive with Argument Type Dependent Result Types === | |
23462 | ||
23463 | In some situations it would seem useful to define functions whose | |
23464 | result type would depend on some of the arguments. Traditionally such | |
23465 | functions have been thought to be impossible in SML and the solution | |
23466 | has been to define multiple functions. For example, the | |
23467 | http://www.standardml.org/Basis/socket.html[`Socket` structure] of the | |
23468 | Basis library defines 16 `send` and 16 `recv` functions. In contrast, | |
23469 | the Net structure | |
23470 | (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sig)>) of the | |
23471 | Basic library designed by Stephen Weeks defines only a single `send` | |
23472 | and a single `receive` and the result types of the functions depend on | |
23473 | their arguments. The implementation | |
23474 | (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sml)>) uses | |
23475 | static sums (with a slightly different signature: | |
23476 | <!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/static-sum.sig)>). | |
23477 | ||
23478 | ||
23479 | === Example: Picking Monad Results === | |
23480 | ||
23481 | Suppose that we need to write a parser that accepts a pair of integers | |
23482 | and returns their sum given a monadic parsing combinator library. A | |
23483 | part of the signature of such a library could look like this | |
23484 | [source,sml] | |
23485 | ---- | |
23486 | signature PARSING = sig | |
23487 | include MONAD | |
23488 | val int : int t | |
23489 | val lparen : unit t | |
23490 | val rparen : unit t | |
23491 | val comma : unit t | |
23492 | (* ... *) | |
23493 | end | |
23494 | ---- | |
23495 | where the `MONAD` signature could be defined as | |
23496 | [source,sml] | |
23497 | ---- | |
23498 | signature MONAD = sig | |
23499 | type 'a t | |
23500 | val return : 'a -> 'a t | |
23501 | val >>= : 'a t * ('a -> 'b t) -> 'b t | |
23502 | end | |
23503 | infix >>= | |
23504 | ---- | |
23505 | ||
23506 | The straightforward, but tedious, way to write the desired parser is: | |
23507 | [source,sml] | |
23508 | ---- | |
23509 | val p = lparen >>= (fn _ => | |
23510 | int >>= (fn x => | |
23511 | comma >>= (fn _ => | |
23512 | int >>= (fn y => | |
23513 | rparen >>= (fn _ => | |
23514 | return (x + y)))))) | |
23515 | ---- | |
23516 | ||
23517 | In Haskell, the parser could be written using the `do` notation | |
23518 | considerably less verbosely as: | |
23519 | [source,haskell] | |
23520 | ---- | |
23521 | p = do { lparen ; x <- int ; comma ; y <- int ; rparen ; return $ x + y } | |
23522 | ---- | |
23523 | ||
23524 | SML doesn't provide a `do` notation, so we need another solution. | |
23525 | ||
23526 | Suppose we had a "pick" notation for monads that would allow | |
23527 | us to write the parser as | |
23528 | [source,sml] | |
23529 | ---- | |
23530 | val p = `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y) | |
23531 | ---- | |
23532 | using four auxiliary combinators: +`+, `\`, `^`, and `@`. | |
23533 | ||
23534 | Roughly speaking | |
23535 | ||
23536 | * +`p+ means that the result of `p` is dropped, | |
23537 | * `\p` means that the result of `p` is taken, | |
23538 | * `p ^ q` means that results of `p` and `q` are taken as a product, and | |
23539 | * `p @ a` means that the results of `p` are passed to the function `a` and that result is returned. | |
23540 | ||
23541 | The difficulty is in implementing the concatenation combinator `^`. | |
23542 | The type of the result of the concatenation depends on the types of | |
23543 | the arguments. | |
23544 | ||
23545 | Using static sums and the <:ProductType:product type>, the pick | |
23546 | notation for monads can be implemented as follows: | |
23547 | [source,sml] | |
23548 | ---- | |
23549 | functor MkMonadPick (include MONAD) = let | |
23550 | open StaticSum | |
23551 | in | |
23552 | struct | |
23553 | fun `a = inL (a >>= (fn _ => return ())) | |
23554 | val \ = inR | |
23555 | fun a @ f = out a >>= (return o f) | |
23556 | fun a ^ b = | |
23557 | (match b o match a) | |
23558 | (fn a => | |
23559 | (fn b => inL (a >>= (fn _ => b)), | |
23560 | fn b => inR (a >>= (fn _ => b))), | |
23561 | fn a => | |
23562 | (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))), | |
23563 | fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b)))))) | |
23564 | end | |
23565 | end | |
23566 | ---- | |
23567 | ||
23568 | The above implementation is inefficient, however. It uses many more | |
23569 | bind operations, `>>=`, than necessary. That can be solved with an | |
23570 | additional level of abstraction: | |
23571 | [source,sml] | |
23572 | ---- | |
23573 | functor MkMonadPick (include MONAD) = let | |
23574 | open StaticSum | |
23575 | in | |
23576 | struct | |
23577 | fun `a = inL (fn b => a >>= (fn _ => b ())) | |
23578 | fun \a = inR (fn b => a >>= b) | |
23579 | fun a @ f = out a (return o f) | |
23580 | fun a ^ b = | |
23581 | (match b o match a) | |
23582 | (fn a => (fn b => inL (fn c => a (fn () => b c)), | |
23583 | fn b => inR (fn c => a (fn () => b c))), | |
23584 | fn a => (fn b => inR (fn c => a (fn a => b (fn () => c a))), | |
23585 | fn b => inR (fn c => a (fn a => b (fn b => c (a & b)))))) | |
23586 | end | |
23587 | end | |
23588 | ---- | |
23589 | ||
23590 | After instantiating and opening either of the above monad pick | |
23591 | implementations, the previously given definition of `p` can be | |
23592 | compiled and results in a parser whose result is of type `int`. Here | |
23593 | is a functor to test the theory: | |
23594 | [source,sml] | |
23595 | ---- | |
23596 | functor Test (Arg : PARSING) = struct | |
23597 | local | |
23598 | structure Pick = MkMonadPick (Arg) | |
23599 | open Pick Arg | |
23600 | in | |
23601 | val p : int t = | |
23602 | `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y) | |
23603 | end | |
23604 | end | |
23605 | ---- | |
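
The `Test` functor only checks that the notation type checks. To watch it compute a value, the pick combinators can be instantiated with a concrete monad. The following is a sketch under stated assumptions: `OptionMonad` is a hypothetical instance introduced only for this demonstration, the product type is redeclared locally in the style of the <:ProductType:product type> page, `StaticSum` is the unsealed version extended with `out`, and the functor is the first (unoptimized) implementation given above:

[source,sml]
----
(* Product type with an infix constructor. *)
datatype ('a, 'b) product = & of 'a * 'b
infix &

structure StaticSum = struct
   fun inL x (f, _) = f x
   fun inR x (_, g) = g x
   fun match x = x
   fun out s = match s (fn x => x, fn x => x)
end

signature MONAD = sig
   type 'a t
   val return : 'a -> 'a t
   val >>= : 'a t * ('a -> 'b t) -> 'b t
end
infix >>=

functor MkMonadPick (include MONAD) = let
   open StaticSum
in
   struct
      fun `a = inL (a >>= (fn _ => return ()))
      val \ = inR
      fun a @ f = out a >>= (return o f)
      fun a ^ b =
         (match b o match a)
            (fn a =>
                (fn b => inL (a >>= (fn _ => b)),
                 fn b => inR (a >>= (fn _ => b))),
             fn a =>
                (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))),
                 fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b))))))
   end
end

(* Hypothetical monad instance used only for this demonstration. *)
structure OptionMonad = struct
   type 'a t = 'a option
   val return = SOME
   fun m >>= f = case m of NONE => NONE | SOME x => f x
end

local
   structure Pick = MkMonadPick (OptionMonad)
   open Pick OptionMonad
in
   (* Drop the unit and string results, keep the two ints, and add them. *)
   val r : int option =
      `(SOME ()) ^ \(SOME 1) ^ `(SOME "sep") ^ \(SOME 2) @ (fn x & y => x + y)
end

(* Outside the local block, ^ is string concatenation again. *)
val () = print (case r of SOME n => Int.toString n ^ "\n" | NONE => "NONE\n")
----

The two picked results, 1 and 2, reach the final function as the product `1 & 2`, so `r` should be `SOME 3`. Rebinding `^` and `@` inside a `local` block keeps the usual string and list meanings of those operators available in the rest of the program.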
23606 | ||
23607 | ||
23608 | == Also see == | |
23609 | ||
23610 | There are a number of related techniques. Here are some of them. | |
23611 | ||
23612 | * <:Fold:> | |
23613 | * <:TypeIndexedValues:> | |
23614 | ||
23615 | <<< | |
23616 | ||
23617 | :mlton-guide-page: StephenWeeks | |
23618 | [[StephenWeeks]] | |
23619 | StephenWeeks | |
23620 | ============ | |
23621 | ||
23622 | I live in the New York City area and work at http://janestcapital.com[Jane Street Capital]. | |
23623 | ||
23624 | My http://sweeks.com/[home page]. | |
23625 | ||
23626 | You can email me at sweeks@sweeks.com. | |
23627 | ||
23628 | <<< | |
23629 | ||
23630 | :mlton-guide-page: StyleGuide | |
23631 | [[StyleGuide]] | |
23632 | StyleGuide | |
23633 | ========== | |
23634 | ||
23635 | These conventions are chosen so that inertia is towards modularity, code reuse and finding bugs early, _not_ to save typing. | |
23636 | ||
23637 | * <:SyntacticConventions:> | |
23638 | ||
23639 | <<< | |
23640 | ||
23641 | :mlton-guide-page: Subversion | |
23642 | [[Subversion]] | |
23643 | Subversion | |
23644 | ========== | |
23645 | ||
23646 | http://subversion.apache.org/[Subversion] is a version control system. | |
23647 | The MLton project used Subversion to maintain its | |
23648 | <:Sources:source code>, but switched to <:Git:> on 20130308. | |
23649 | ||
23650 | Here are some online Subversion resources. | |
23651 | ||
23652 | * http://svnbook.red-bean.com[Version Control with Subversion] | |
23653 | ||
23654 | <<< | |
23655 | ||
23656 | :mlton-guide-page: SuccessorML | |
23657 | [[SuccessorML]] | |
23658 | SuccessorML | |
23659 | =========== | |
23660 | ||
23661 | The purpose of http://sml-family.org/successor-ml/[successor ML], or | |
23662 | sML for short, is to provide a vehicle for the continued evolution of | |
23663 | ML, using Standard ML as a starting point. The intention is for | |
23664 | successor ML to be a living, evolving dialect of ML that is responsive | |
23665 | to community needs and advances in language design, implementation, | |
23666 | and semantics. | |
23667 | ||
23668 | == SuccessorML Features in MLton == | |
23669 | ||
23670 | The following SuccessorML features have been implemented in MLton. | |
23671 | The features are disabled by default, and may be enabled utilizing the | |
23672 | feature's corresponding <:MLBasisAnnotations:ML Basis annotation> | |
23673 | which is listed directly after the feature name. In addition, the | |
23674 | +allowSuccessorML {false|true}+ annotation can be used to | |
23675 | simultaneously enable all of the features. | |
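
As a configuration sketch, the annotations can be applied to individual source files from an ML Basis file. The file names `program.sml` and `other.sml` are hypothetical; the annotation names are the ones listed on this page:

----
$(SML_LIB)/basis/basis.mlb

ann "allowSuccessorML true" in
   program.sml
end

ann "allowOrPats true" "allowOptBar true" in
   other.sml
end
----

Scoping an annotation to the files that need it keeps the remaining sources checked against plain Standard ML.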
23676 | ||
23677 | * <!Anchor(DoDecls)> | |
23678 | `do` Declarations: +allowDoDecls {false|true}+ | |
23679 | + | |
23680 | Allow a +do _exp_+ declaration form, which evaluates _exp_ for its | |
23681 | side effects. The following example uses a `do` declaration: | |
23682 | + | |
23683 | [source,sml] | |
23684 | ---- | |
23685 | do print "Hello world.\n" | |
23686 | ---- | |
23687 | + | |
23688 | and is equivalent to: | |
23689 | + | |
23690 | [source,sml] | |
23691 | ---- | |
23692 | val () = print "Hello world.\n" | |
23693 | ---- | |
23694 | ||
23695 | * <!Anchor(ExtendedConsts)> | |
23696 | Extended Constants: +allowExtendedConsts {false|true}+ | |
23697 | + | |
23698 | -- | |
23699 | Allow or disallow all of the extended constants features. This is a | |
23700 | proxy for all of the following annotations. | |
23701 | ||
23702 | ** <!Anchor(ExtendedNumConsts)> | |
23703 | Extended Numeric Constants: +allowExtendedNumConsts {false|true}+ | |
23704 | + | |
23705 | Allow underscores as a separator in numeric constants and allow binary | |
23706 | integer and word constants. | |
23707 | + | |
23708 | Underscores in a numeric constant must occur between digits and | |
23709 | consecutive underscores are allowed. | |
23710 | + | |
23711 | Binary integer constants use the prefix +0b+ and binary word constants | |
23712 | use the prefix +0wb+. | |
23713 | + | |
23714 | The following example uses extended numeric constants (although it may | |
23715 | be incorrectly syntax highlighted): | |
23716 | + | |
23717 | [source,sml] | |
23718 | ---- | |
23719 | val pb = 0b10101 | |
23720 | val nb = ~0b10_10_10 | |
23721 | val wb = 0wb1010 | |
23722 | val i = 4__327__829 | |
23723 | val r = 6.022_140_9e23 | |
23724 | ---- | |
23725 | ||
23726 | ** <!Anchor(ExtendedTextConsts)> Extended Text Constants: +allowExtendedTextConsts {false|true}+ | |
23727 | + | |
23728 | Allow characters with integer codes ≥ 128 and ≤ 247 that | |
23729 | correspond to syntactically well-formed UTF-8 byte sequences in text | |
23730 | constants. | |
23731 | + | |
23732 | //// | |
23733 | and allow `\Uxxxxxxxx` numeric escapes in text constants. | |
23734 | //// | |
23735 | + | |
23736 | Any 1, 2, 3, or 4 byte sequence that can be properly decoded to a | |
23737 | binary number according to the UTF-8 encoding/decoding scheme is | |
23738 | allowed in a text constant (but invalid sequences are not explicitly | |
23739 | rejected) and denotes the corresponding sequence of characters with | |
23740 | integer codes ≥ 128 and ≤ 247. This feature enables "UTF-8 | |
23741 | convenience" (but not comprehensive Unicode support); in particular, | |
23742 | it allows one to copy text from a browser and paste it into a string | |
23743 | constant in an editor and, furthermore, if the string is printed to a | |
23744 | terminal, then it will (typically) appear as the original text. The | |
23745 | following example uses UTF-8 byte sequences: | |
23746 | + | |
23747 | [source,sml] | |
23748 | ---- | |
23749 | val s1 : String.string = "\240\159\130\161" | |
23750 | val s2 : String.string = "🂡" | |
23751 | val _ = print ("s1 --> " ^ s1 ^ "\n") | |
23752 | val _ = print ("s2 --> " ^ s2 ^ "\n") | |
23753 | val _ = print ("String.size s1 --> " ^ Int.toString (String.size s1) ^ "\n") | |
23754 | val _ = print ("String.size s2 --> " ^ Int.toString (String.size s2) ^ "\n") | |
23755 | val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n") | |
23756 | ---- | |
23757 | + | |
23758 | and, when compiled and executed, will display: | |
23759 | + | |
23760 | ---- | |
23761 | s1 --> 🂡 | |
23762 | s2 --> 🂡 | |
23763 | String.size s1 --> 4 | |
23764 | String.size s2 --> 4 | |
23765 | s1 = s2 --> true | |
23766 | ---- | |
23767 | + | |
23768 | Note that the `String.string` type corresponds to any sequence of | |
23769 | 8-bit values, including invalid UTF-8 sequences; hence the string | |
23770 | constant `"\192"` (a UTF-8 leading byte with no UTF-8 continuation | |
23771 | byte) is valid. Similarly, the `Char.char` type corresponds to a | |
23772 | single 8-bit value; hence the char constant `#"α"` is not valid, as | |
23773 | the text constant `"α"` denotes a sequence of two 8-bit values. | |
23774 | + | |
23775 | //// | |
23776 | A `\Uxxxxxxxx` numeric escape denotes a single character with the | |
23777 | hexadecimal integer code `xxxxxxxx`. Such numeric escapes are not | |
23778 | necessary for the `String.string` and `Char.char` types, since | |
23779 | characters in such text constants must have integer codes ≤ 255 and | |
23780 | the `\ddd` and `\uxxxx` numeric escapes suffice. However, the | |
23781 | `\Uxxxxxxxx` numeric escapes are useful for the `WideString.string` | |
23782 | and `WideChar.char` types, since characters in such text constants may | |
23783 | have integer codes ≤ 2^32^-1. The following uses a `\Uxxxxxxxx` | |
23784 | numeric escape (although it may be incorrectly syntax highlighted): | |
23785 | + | |
23786 | [source,sml] | |
23787 | ---- | |
23788 | val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *) | |
23789 | val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n") | |
23790 | ---- | |
23791 | + | |
23792 | and, when compiled and executed, will display: | |
23793 | + | |
23794 | ---- | |
23795 | WideString.size s1 --> 1 | |
23796 | ---- | |
23797 | + | |
23798 | Note that the `WideString.string` type corresponds to any sequence of | |
23799 | 32-bit values, including invalid Unicode code points; hence, the | |
23800 | string constants `"\U001F0000"` and `"\U40000000"` are valid (but the | |
23801 | corresponding integer codes are not valid Unicode code points). | |
23802 | Similarly, the `WideChar.char` type corresponds to a single 32-bit | |
23803 | value. | |
23804 | + | |
23805 | Finally, note that a UTF-8 byte sequence in a `WideString.string` or | |
23806 | `WideChar.char` text constant does not denote a single 32-bit value, | |
23807 | but rather a sequence of 32-bit values ≥ 128 and ≤ 247. The | |
23808 | following example uses both UTF-8 byte sequences and `\Uxxxxxxxx` | |
23809 | numeric escapes (although it may be incorrectly syntax highlighted): | |
23810 | + | |
23811 | [source,sml] | |
23812 | ---- | |
23813 | val s1 : WideString.string = "\U0001F0A1" (* 'PLAYING CARD ACE OF SPADES' (U+1F0A1) *) | |
23814 | val s2 : WideString.string = "🂡" | |
23815 | val s3 : WideString.string = "\U000000F0\U0000009F\U00000082\U000000A1" | |
23816 | val _ = print ("WideString.size s1 --> " ^ Int.toString (WideString.size s1) ^ "\n") | |
23817 | val _ = print ("WideString.size s2 --> " ^ Int.toString (WideString.size s2) ^ "\n") | |
23818 | val _ = print ("WideString.size s3 --> " ^ Int.toString (WideString.size s3) ^ "\n") | |
23819 | val _ = print ("s1 = s2 --> " ^ Bool.toString (s1 = s2) ^ "\n") | |
23820 | val _ = print ("s2 = s3 --> " ^ Bool.toString (s2 = s3) ^ "\n") | |
23821 | ---- | |
23822 | + | |
23823 | and, when compiled and executed, will display: | |
23824 | + | |
23825 | ---- | |
23826 | WideString.size s1 --> 1 | |
23827 | WideString.size s2 --> 4 | |
23828 | WideString.size s3 --> 4 | |
23829 | s1 = s2 --> false | |
23830 | s2 = s3 --> true | |
23831 | ---- | |
23832 | //// | |
23833 | -- | |
23834 | ||
23835 | * <!Anchor(LineComments)> | |
23836 | Line Comments: +allowLineComments {false|true}+ | |
23837 | + | |
23838 | Allow line comments beginning with the token ++(*)++. The following | |
23839 | example uses a line comment: | |
23840 | + | |
23841 | [source,sml] | |
23842 | ---- | |
23843 | (*) This is a line comment | |
23844 | ---- | |
23845 | + | |
23846 | Line comments properly nest within block comments. The following | |
23847 | example uses line comments nested within block comments: | |
23848 | + | |
23849 | [source,sml] | |
23850 | ---- | |
23851 | (* | |
23852 | val x = 4 (*) This is a line comment | |
23853 | *) | |
23854 | ||
23855 | (* | |
23856 | val y = 5 (*) This is a line comment *) | |
23857 | *) | |
23858 | ---- | |
23859 | ||
23860 | * <!Anchor(OptBar)> | |
23861 | Optional Pattern Bars: +allowOptBar {false|true}+ | |
23862 | + | |
23863 | Allow a bar to appear before the first match rule of a `case`, `fn`, | |
23864 | or `handle` expression, allow a bar to appear before the first | |
23865 | function-value binding of a `fun` declaration, and allow a bar to | |
23866 | appear before the first constructor binding or description of a | |
23867 | `datatype` declaration or specification. The following example uses | |
23868 | leading bars in a `datatype` declaration, a `fun` declaration, and a | |
23869 | `case` expression: | |
23870 | + | |
23871 | [source,sml] | |
23872 | ---- | |
23873 | datatype t = | |
23874 | | C | |
23875 | | B | |
23876 | | A | |
23877 | ||
23878 | fun | |
23879 | | f NONE = 0 | |
23880 | | f (SOME t) = | |
23881 | (case t of | |
23882 | | A => 1 | |
23883 | | B => 2 | |
23884 | | C => 3) | |
23885 | ---- | |
23886 | + | |
23887 | By eliminating the special case of the first element, this feature | |
23888 | allows for simpler refactoring (e.g., sorting the lines of the | |
23889 | `datatype` declaration's constructor bindings to put the constructors | |
23890 | in alphabetical order). | |
23891 | ||
23892 | * <!Anchor(OptSemicolon)> | |
23893 | Optional Semicolons: +allowOptSemicolon {false|true}+ | |
23894 | + | |
23895 | Allow a semicolon to appear after the last expression in a sequence or | |
23896 | `let`-body expression. The following example uses a trailing | |
23897 | semicolon in the body of a `let` expression: | |
23898 | + | |
23899 | [source,sml] | |
23900 | ---- | |
23901 | fun h z = | |
23902 | let | |
23903 | val x = 3 * z | |
23904 | in | |
23905 | f x ; | |
23906 | g x ; | |
23907 | end | |
23908 | ---- | |
23909 | + | |
23910 | By eliminating the special case of the last element, this feature | |
23911 | allows for simpler refactoring. | |
23912 | ||
23913 | * <!Anchor(OrPats)> | |
23914 | Disjunctive (Or) Patterns: +allowOrPats {false|true}+ | |
23915 | + | |
23916 | Allow disjunctive (a.k.a., "or") patterns of the form +_pat~1~_ | | |
23917 | _pat~2~_+, which matches a value that matches either +_pat~1~_+ or | |
23918 | +_pat~2~_+. Disjunctive patterns have lower precedence than `as` | |
23919 | patterns and constraint patterns, much as `orelse` expressions have | |
23920 | lower precedence than `andalso` expressions and constraint | |
23921 | expressions. Both sub-patterns of a disjunctive pattern must bind the | |
23922 | same variables with the same types. The following example uses | |
23923 | disjunctive patterns: | |
23924 | + | |
23925 | [source,sml] | |
23926 | ---- | |
23927 | datatype t = A of int | B of int | C of int | D of int * int | E of int * int | |
23928 | ||
23929 | fun f t = | |
23930 | case t of | |
23931 | A x | B x | C x => x + 1 | |
23932 | | D (x, _) | E (_, x) => x * 2 | |
23933 | ---- | |
23934 | ||
23935 | * <!Anchor(RecordPunExps)> | |
23936 | Record Punning Expressions: +allowRecordPunExps {false|true}+ | |
23937 | + | |
23938 | Allow record punning expressions, whereby an identifier +_vid_+ as an | |
23939 | expression row in a record expression denotes the expression row | |
23940 | +_vid_ = _vid_+ (i.e., treating a label as a variable). The following | |
23941 | example uses record punning expressions (and also record punning | |
23942 | patterns): | |
23943 | + | |
23944 | [source,sml] | |
23945 | ---- | |
23946 | fun incB r = | |
23947 | case r of {a, b, c} => {a, b = b + 1, c} | |
23948 | ---- | |
23949 | + | |
23950 | and is equivalent to: | |
23951 | + | |
23952 | [source,sml] | |
23953 | ---- | |
23954 | fun incB r = | |
23955 | case r of {a = a, b = b, c = c} => {a = a, b = b + 1, c = c} | |
23956 | ---- | |
23957 | ||
23958 | * <!Anchor(SigWithtype)> | |
23959 | `withtype` in Signatures: +allowSigWithtype {false|true}+ | |
23960 | + | |
23961 | Allow `withtype` to modify a `datatype` specification in a signature. | |
23962 | The following example uses `withtype` in a signature (and also | |
23963 | `withtype` in a declaration): | |
23964 | + | |
23965 | [source,sml] | |
23966 | ---- | |
23967 | signature STREAM = | |
23968 | sig | |
23969 | datatype 'a u = Nil | Cons of 'a * 'a t | |
23970 | withtype 'a t = unit -> 'a u | |
23971 | end | |
23972 | structure Stream : STREAM = | |
23973 | struct | |
23974 | datatype 'a u = Nil | Cons of 'a * 'a t | |
23975 | withtype 'a t = unit -> 'a u | |
23976 | end | |
23977 | ---- | |
23978 | + | |
23979 | and is equivalent to: | |
23980 | + | |
23981 | [source,sml] | |
23982 | ---- | |
23983 | signature STREAM = | |
23984 | sig | |
23985 | datatype 'a u = Nil | Cons of 'a * (unit -> 'a u) | |
23986 | type 'a t = unit -> 'a u | |
23987 | end | |
23988 | structure Stream : STREAM = | |
23989 | struct | |
23990 | datatype 'a u = Nil | Cons of 'a * (unit -> 'a u) | |
23991 | type 'a t = unit -> 'a u | |
23992 | end | |
23993 | ---- | |
23994 | ||
23995 | * <!Anchor(VectorExpsAndPats)> | |
23996 | Vector Expressions and Patterns: +allowVectorExpsAndPats {false|true}+ | |
23997 | + | |
23998 | -- | |
23999 | Allow or disallow vector expressions and vector patterns. This is a | |
24000 | proxy for all of the following annotations. | |
24001 | ||
24002 | ** <!Anchor(VectorExps)> | |
24003 | Vector Expressions: +allowVectorExps {false|true}+ | |
24004 | + | |
24005 | Allow vector expressions of the form +#[_exp~0~_, _exp~1~_, ..., _exp~n-1~_]+ (where _n ≥ 0_). The expression has type +_τ_ vector+ when each expression _exp~i~_ has type +_τ_+. | |
24006 | ||
24007 | ** <!Anchor(VectorPats)> | |
24008 | Vector Patterns: +allowVectorPats {false|true}+ | |
24009 | + | |
24010 | Allow vector patterns of the form +#[_pat~0~_, _pat~1~_, ..., _pat~n-1~_]+ (where _n ≥ 0_). The pattern matches values of type +_τ_ vector+ when each pattern _pat~i~_ matches values of type +_τ_+. | |
24011 | -- | |
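
As a minimal sketch of the two annotations above, the following assumes that `allowVectorExpsAndPats true` is in effect, for example through an ML Basis annotation such as `ann "allowVectorExpsAndPats true" in vec.sml end`. The file name `vec.sml` and the names `v`, `describe`, and `s` are illustrative only.

[source,sml]
----
(* Assumes the annotation allowVectorExpsAndPats true is in effect. *)

(* A vector expression; every element has type int, so v: int vector. *)
val v = #[1, 2, 3]

(* Vector patterns match on both the length and the elements. *)
fun describe (xs: int vector): string =
   case xs of
      #[] => "empty"
    | #[x] => "one element: " ^ Int.toString x
    | _ => "two or more elements"

val s = describe v
----

The wildcard clause is needed because vector patterns of fixed lengths can never cover all vectors.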
24012 | ||
24013 | <<< | |
24014 | ||
24015 | :mlton-guide-page: SureshJagannathan | |
24016 | [[SureshJagannathan]] | |
24017 | SureshJagannathan | |
24018 | ================= | |
24019 | ||
24020 | I am an Associate Professor at the http://www.cs.purdue.edu/[Department of Computer Science] at Purdue University. | |
24021 | My research focus is in programming language design and implementation, concurrency, | |
24022 | and distributed systems. I am interested in various aspects of MLton, mostly related to (in no particular order): (1) control-flow analysis, (2) representation | |
24023 | strategies (e.g., flattening), (3) IR formats, and (4) extensions for distributed programming. | |
24024 | ||
24025 | ||
24026 | Please see my http://www.cs.purdue.edu/homes/suresh/index.html[Home page] for more details. | |
24027 | ||
24028 | <<< | |
24029 | ||
24030 | :mlton-guide-page: Swerve | |
24031 | [[Swerve]] | |
24032 | Swerve | |
24033 | ====== | |
24034 | ||
24035 | http://ftp.sun.ac.za/ftp/mirrorsites/ocaml/Systems_programming/book/c3253.html[Swerve] | |
24036 | is an HTTP server written in SML, originally developed with SML/NJ. | |
24037 | <:RayRacine:> ported Swerve to MLton in January 2005. | |
24038 | ||
24039 | <!Attachment(Swerve,swerve.tar.bz2,Download)> the port. | |
24040 | ||
24041 | Excerpt from the included `README`: | |
24042 | ____ | |
24043 | Total testing of this port consisted of a successful compile, startup, | |
24044 | and serving one html page with one gif image. Given that the original | |
24045 | code was throughly designed and implemented in a thoughtful manner and | |
24046 | I expect it is quite usable modulo a few minor bugs introduced by my | |
24047 | porting effort. | |
24048 | ____ | |
24049 | ||
24050 | Swerve is described in <!Cite(Shipman02)>. | |
24051 | ||
24052 | <<< | |
24053 | ||
24054 | :mlton-guide-page: SXML | |
24055 | [[SXML]] | |
24056 | SXML | |
24057 | ==== | |
24058 | ||
24059 | <:SXML:> is an <:IntermediateLanguage:>, translated from <:XML:> by | |
24060 | <:Monomorphise:>, optimized by <:SXMLSimplify:>, and translated by | |
24061 | <:ClosureConvert:> to <:SSA:>. | |
24062 | ||
24063 | == Description == | |
24064 | ||
24065 | SXML is a simply-typed version of <:XML:>. | |
24066 | ||
24067 | == Implementation == | |
24068 | ||
24069 | * <!ViewGitFile(mlton,master,mlton/xml/sxml.sig)> | |
24070 | * <!ViewGitFile(mlton,master,mlton/xml/sxml.fun)> | |
24071 | * <!ViewGitFile(mlton,master,mlton/xml/sxml-tree.sig)> | |
24072 | ||
24073 | == Type Checking == | |
24074 | ||
24075 | <:SXML:> shares the type checker for <:XML:>. | |
24076 | ||
24077 | == Details and Notes == | |
24078 | ||
24079 | There are only two differences between <:XML:> and <:SXML:>. First, | |
24080 | <:SXML:> `val`, `fun`, and `datatype` declarations always have an | |
24081 | empty list of type variables. Second, <:SXML:> variable references | |
24082 | always have an empty list of type arguments. Constructor uses can | |
24083 | only have a nonempty list of type arguments if the constructor is a | |
24084 | primitive. | |
24085 | ||
24086 | We could rely on the type system to enforce these constraints by | |
24087 | parameterizing the <:XML:> signature. <:StephenWeeks:> did so in a | |
24088 | previous version of the compiler, but the software engineering gains | |
24089 | were not worth the effort. | |
24090 | ||
24091 | <<< | |
24092 | ||
24093 | :mlton-guide-page: SXMLShrink | |
24094 | [[SXMLShrink]] | |
24095 | SXMLShrink | |
24096 | ========== | |
24097 | ||
24098 | SXMLShrink is an optimization pass for the <:SXML:> | |
24099 | <:IntermediateLanguage:>, invoked from <:SXMLSimplify:>. | |
24100 | ||
24101 | == Description == | |
24102 | ||
24103 | This pass performs optimizations based on a reduction system. | |
24104 | ||
24105 | == Implementation == | |
24106 | ||
24107 | * <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)> | |
24108 | * <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)> | |
24109 | ||
24110 | == Details and Notes == | |
24111 | ||
24112 | <:SXML:> shares the <:XMLShrink:> simplifier. | |
24113 | ||
24114 | <<< | |
24115 | ||
24116 | :mlton-guide-page: SXMLSimplify | |
24117 | [[SXMLSimplify]] | |
24118 | SXMLSimplify | |
24119 | ============ | |
24120 | ||
24121 | The optimization passes for the <:SXML:> <:IntermediateLanguage:> are | |
24122 | collected and controlled by the `SxmlSimplify` functor | |
24123 | (<!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.sig)>, | |
24124 | <!ViewGitFile(mlton,master,mlton/xml/sxml-simplify.fun)>). | |
24125 | ||
24126 | The following optimization passes are implemented: | |
24127 | ||
24128 | * <:Polyvariance:> | |
24129 | * <:SXMLShrink:> | |
24130 | ||
24131 | The following implementation passes are implemented: | |
24132 | ||
24133 | * <:ImplementExceptions:> | |
24134 | * <:ImplementSuffix:> | |
24135 | ||
24136 | The following optimization passes are not implemented, but might prove useful: | |
24137 | ||
24138 | * <:Uncurry:> | |
24139 | * <:LambdaLift:> | |
24140 | ||
24141 | The optimization passes can be controlled from the command-line by the options | |
24142 | ||
24143 | * `-diag-pass <pass>` -- keep diagnostic info for pass | |
24144 | * `-disable-pass <pass>` -- skip optimization pass (if normally performed) | |
24145 | * `-enable-pass <pass>` -- perform optimization pass (if normally skipped) | |
24146 | * `-keep-pass <pass>` -- keep the results of pass | |
24147 | * `-sxml-passes <passes>` -- sxml optimization passes | |
24148 | ||
24149 | <<< | |
24150 | ||
24151 | :mlton-guide-page: SyntacticConventions | |
24152 | [[SyntacticConventions]] | |
24153 | SyntacticConventions | |
24154 | ==================== | |
24155 | ||
24156 | Here are a number of syntactic conventions useful for programming in | |
24157 | SML. | |
24158 | ||
24159 | ||
24160 | == General == | |
24161 | ||
24162 | * A line of code never exceeds 80 columns. | |
24163 | ||
24164 | * Only split a syntactic entity across multiple lines if it doesn't fit on one line within 80 columns. | |
24165 | ||
24166 | * Use alphabetical order wherever possible. | |
24167 | ||
24168 | * Avoid redundant parentheses. | |
24169 | ||
24170 | * When using `:`, there is no space before the colon, and a single space after it. | |
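
The colon convention above applies wherever a type annotation appears. A small sketch follows; the names `limit` and `double` are illustrative and not taken from any MLton source.

[source,sml]
----
(* No space before the colon, a single space after it. *)
val limit: int = 80

(* The same spacing for argument and result annotations;
 * the body needs no parentheses. *)
fun double (x: int): int = 2 * x
----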
24171 | ||
24172 | ||
24173 | == Identifiers == | |
24174 | ||
24175 | * Variables, record labels and type constructors begin with and use | |
24176 | small letters, using capital letters to separate words. | |
24177 | + | |
24178 | [source,sml] | |
24179 | ---- | |
24180 | cost | |
24181 | maxValue | |
24182 | ---- | |
24183 | ||
24184 | * Variables that represent collections of objects (lists, arrays, | |
24185 | vectors, ...) are often suffixed with an `s`. | |
24186 | + | |
24187 | [source,sml] | |
24188 | ---- | |
24189 | xs | |
24190 | employees | |
24191 | ---- | |
24192 | ||
24193 | * Constructors, structure identifiers, and functor identifiers begin | |
24194 | with a capital letter. | |
24195 | + | |
24196 | [source,sml] | |
24197 | ---- | |
24198 | Queue | |
24199 | LinkedList | |
24200 | ---- | |
24201 | ||
24202 | * Signature identifiers are in all capitals, using `_` to separate | |
24203 | words. | |
24204 | + | |
24205 | [source,sml] | |
24206 | ---- | |
24207 | LIST | |
24208 | BINARY_HEAP | |
24209 | ---- | |
24210 | ||
24211 | ||
24212 | == Types == | |
24213 | ||
24214 | * Alphabetize record labels. In a record type, there are spaces after | |
24215 | colons and commas, but not before colons or commas, or at the | |
24216 | delimiters `{` and `}`. | |
24217 | + | |
24218 | [source,sml] | |
24219 | ---- | |
24220 | {bar: int, foo: int} | |
24221 | ---- | |
24222 | ||
24223 | * Only split a record type across multiple lines if it doesn't fit on | |
24224 | one line. If a record type must be split over multiple lines, put one | |
24225 | field per line. | |
24226 | + | |
24227 | [source,sml] | |
24228 | ---- | |
24229 | {bar: int, | |
24230 |  foo: real * real, | |
24231 |  zoo: bool} | |
24232 | ---- | |
24233 | ||
24234 | ||
24235 | * In a tuple type, there are spaces before and after each `*`. | |
24236 | + | |
24237 | [source,sml] | |
24238 | ---- | |
24239 | int * bool * real | |
24240 | ---- | |
24241 | ||
24242 | * Only split a tuple type across multiple lines if it doesn't fit on | |
24243 | one line. In a tuple type split over multiple lines, there is one | |
24244 | type per line, and the `*`-s go at the beginning of the lines. | |
24245 | + | |
24246 | [source,sml] | |
24247 | ---- | |
24248 | int | |
24249 | * bool | |
24250 | * real | |
24251 | ---- | |
24252 | + | |
24253 | It may also be useful to parenthesize to make the grouping more | |
24254 | apparent. | |
24255 | + | |
24256 | [source,sml] | |
24257 | ---- | |
24258 | (int | |
24259 |  * bool | |
24260 |  * real) | |
24261 | ---- | |
24262 | ||
24263 | * In an arrow type split over multiple lines, put the arrow at the | |
24264 | beginning of its line. | |
24265 | + | |
24266 | [source,sml] | |
24267 | ---- | |
24268 | int * real | |
24269 | -> bool | |
24270 | ---- | |
24271 | + | |
24272 | It may also be useful to parenthesize to make the grouping more | |
24273 | apparent. | |
24274 | + | |
24275 | [source,sml] | |
24276 | ---- | |
24277 | (int * real | |
24278 |  -> bool) | |
24279 | ---- | |
24280 | ||
24281 | * Avoid redundant parentheses. | |
24282 | ||
24283 | * Arrow types associate to the right, so write | |
24284 | + | |
24285 | [source,sml] | |
24286 | ---- | |
24287 | a -> b -> c | |
24288 | ---- | |
24289 | + | |
24290 | not | |
24291 | + | |
24292 | [source,sml] | |
24293 | ---- | |
24294 | a -> (b -> c) | |
24295 | ---- | |
24296 | ||
24297 | * Type constructor application associates to the left, so write | |
24298 | + | |
24299 | [source,sml] | |
24300 | ---- | |
24301 | int ref list | |
24302 | ---- | |
24303 | + | |
24304 | not | |
24305 | + | |
24306 | [source,sml] | |
24307 | ---- | |
24308 | (int ref) list | |
24309 | ---- | |
24310 | ||
24311 | * Type constructor application binds more tightly than a tuple type, | |
24312 | so write | |
24313 | + | |
24314 | [source,sml] | |
24315 | ---- | |
24316 | int list * bool list | |
24317 | ---- | |
24318 | + | |
24319 | not | |
24320 | + | |
24321 | [source,sml] | |
24322 | ---- | |
24323 | (int list) * (bool list) | |
24324 | ---- | |
24325 | ||
24326 | * Tuple types bind more tightly than arrow types, so write | |
24327 | + | |
24328 | [source,sml] | |
24329 | ---- | |
24330 | int * bool -> real | |
24331 | ---- | |
24332 | + | |
24333 | not | |
24334 | + | |
24335 | [source,sml] | |
24336 | ---- | |
24337 | (int * bool) -> real | |
24338 | ---- | |
24339 | ||
24340 | ||
24341 | == Core == | |
24342 | ||
24343 | * A core expression or declaration split over multiple lines does not | |
24344 | contain any blank lines. | |
24345 | ||
24346 | * A record field selector has no space between the `#` and the record | |
24347 | label. So, write | |
24348 | + | |
24349 | [source,sml] | |
24350 | ---- | |
24351 | #foo | |
24352 | ---- | |
24353 | + | |
24354 | not | |
24355 | + | |
24356 | [source,sml] | |
24357 | ---- | |
24358 | # foo | |
24359 | ---- | |
24361 | ||
24362 | * A tuple has a space after each comma, but not before, and not at the | |
24363 | delimiters `(` and `)`. | |
24364 | + | |
24365 | [source,sml] | |
24366 | ---- | |
24367 | (e1, e2, e3) | |
24368 | ---- | |
24369 | ||
24370 | * A tuple split over multiple lines has one element per line, and the | |
24371 | commas go at the end of the lines. | |
24372 | + | |
24373 | [source,sml] | |
24374 | ---- | |
24375 | (e1, | |
24376 |  e2, | |
24377 |  e3) | |
24378 | ---- | |
24379 | ||
24380 | * A list has a space after each comma, but not before, and not at the | |
24381 | delimiters `[` and `]`. | |
24382 | + | |
24383 | [source,sml] | |
24384 | ---- | |
24385 | [e1, e2, e3] | |
24386 | ---- | |
24387 | ||
24388 | * A list split over multiple lines has one element per line, and the | |
24389 | commas at the end of the lines. | |
24390 | + | |
24391 | [source,sml] | |
24392 | ---- | |
24393 | [e1, | |
24394 |  e2, | |
24395 |  e3] | |
24396 | ---- | |
24397 | ||
24398 | * A record has spaces before and after `=`, a space after each comma, | |
24399 | but not before, and not at the delimiters `{` and `}`. Field names | |
24400 | appear in alphabetical order. | |
24401 | + | |
24402 | [source,sml] | |
24403 | ---- | |
24404 | {bar = 13, foo = true} | |
24405 | ---- | |
24406 | ||
24407 | * A sequence expression has a space after each semicolon, but not before. | |
24408 | + | |
24409 | [source,sml] | |
24410 | ---- | |
24411 | (e1; e2; e3) | |
24412 | ---- | |
24413 | ||
24414 | * A sequence expression split over multiple lines has one expression | |
24415 | per line, and the semicolons at the beginning of lines. Lisp and | |
24416 | Scheme programmers may find this hard to read at first. | |
24417 | + | |
24418 | [source,sml] | |
24419 | ---- | |
24420 | (e1 | |
24421 |  ; e2 | |
24422 |  ; e3) | |
24423 | ---- | |
24424 | + | |
24425 | _Rationale_: this makes it easy to visually spot the beginning of each | |
24426 | expression, which becomes more valuable as the expressions themselves | |
24427 | are split across multiple lines. | |
24428 | ||
24429 | * An application expression has a space between the function and the | |
24430 | argument. There are no parens unless the argument is a tuple (in | |
24431 | which case the parens are really part of the tuple, not the | |
24432 | application). | |
24433 | + | |
24434 | [source,sml] | |
24435 | ---- | |
24436 | f a | |
24437 | f (a1, a2, a3) | |
24438 | ---- | |
24439 | ||
24440 | * Avoid redundant parentheses. Application associates to the left, so | |
24441 | write | |
24442 | + | |
24443 | [source,sml] | |
24444 | ---- | |
24445 | f a1 a2 a3 | |
24446 | ---- | |
24447 | + | |
24448 | not | |
24449 | + | |
24450 | [source,sml] | |
24451 | ---- | |
24452 | ((f a1) a2) a3 | |
24453 | ---- | |
24454 | ||
24455 | * Infix operators have a space before and after the operator. | |
24456 | + | |
24457 | [source,sml] | |
24458 | ---- | |
24459 | x + y | |
24460 | x * y - z | |
24461 | ---- | |
24462 | ||
24463 | * Avoid redundant parentheses. Use <:OperatorPrecedence:>. So, write | |
24464 | + | |
24465 | [source,sml] | |
24466 | ---- | |
24467 | x + y * z | |
24468 | ---- | |
24469 | + | |
24470 | not | |
24471 | + | |
24472 | [source,sml] | |
24473 | ---- | |
24474 | x + (y * z) | |
24475 | ---- | |
24476 | ||
24477 | * An `andalso` expression split over multiple lines has the `andalso` | |
24478 | at the beginning of subsequent lines. | |
24479 | + | |
24480 | [source,sml] | |
24481 | ---- | |
24482 | e1 | |
24483 | andalso e2 | |
24484 | andalso e3 | |
24485 | ---- | |
24486 | ||
24487 | * A `case` expression is indented as follows. | |
24488 | + | |
24489 | [source,sml] | |
24490 | ---- | |
24491 | case e of | |
24492 |    p1 => e1 | |
24493 |  | p2 => e2 | |
24494 |  | p3 => e3 | |
24495 | ---- | |
24496 | ||
24497 | * A `datatype`'s constructors are alphabetized. | |
24498 | + | |
24499 | [source,sml] | |
24500 | ---- | |
24501 | datatype t = A | B | C | |
24502 | ---- | |
24503 | ||
24504 | * A `datatype` declaration has a space before and after each `|`. | |
24505 | + | |
24506 | [source,sml] | |
24507 | ---- | |
24508 | datatype t = A | B of int | C | |
24509 | ---- | |
24510 | ||
24511 | * A `datatype` split over multiple lines has one constructor per line, | |
24512 | with the `|` at the beginning of lines and the constructors beginning | |
24513 | 3 columns to the right of the `datatype`. | |
24514 | + | |
24515 | [source,sml] | |
24516 | ---- | |
24517 | datatype t = | |
24518 |    A | |
24519 |  | B | |
24520 |  | C | |
24521 | ---- | |
24522 | ||
24523 | * A `fun` declaration may start its body on the subsequent line, | |
24524 | indented 3 spaces. | |
24525 | + | |
24526 | [source,sml] | |
24527 | ---- | |
24528 | fun f x y = | |
24529 |    let | |
24530 |       val z = x + y | |
24531 |    in | |
24532 |       z | |
24533 |    end | |
24534 | ---- | |
24535 | ||
24536 | * An `if` expression is indented as follows. | |
24537 | + | |
24538 | [source,sml] | |
24539 | ---- | |
24540 | if e1 | |
24541 |    then e2 | |
24542 | else e3 | |
24543 | ---- | |
24544 | ||
24545 | * A sequence of `if`-`then`-`else`-s is indented as follows. | |
24546 | + | |
24547 | [source,sml] | |
24548 | ---- | |
24549 | if e1 | |
24550 |    then e2 | |
24551 | else if e3 | |
24552 |    then e4 | |
24553 | else if e5 | |
24554 |    then e6 | |
24555 | else e7 | |
24556 | ---- | |
24557 | ||
24558 | * A `let` expression has the `let`, `in`, and `end` on their own | |
24559 | lines, starting in the same column. Declarations and the body are | |
24560 | indented 3 spaces. | |
24561 | + | |
24562 | [source,sml] | |
24563 | ---- | |
24564 | let | |
24565 |    val x = 13 | |
24566 |    val y = 14 | |
24567 | in | |
24568 |    x + y | |
24569 | end | |
24570 | ---- | |
24571 | ||
24572 | * A `local` declaration has the `local`, `in`, and `end` on their own | |
24573 | lines, starting in the same column. Declarations are indented 3 | |
24574 | spaces. | |
24575 | + | |
24576 | [source,sml] | |
24577 | ---- | |
24578 | local | |
24579 |    val x = 13 | |
24580 | in | |
24581 |    val y = x | |
24582 | end | |
24583 | ---- | |
24584 | ||
24585 | * An `orelse` expression split over multiple lines has the `orelse` at | |
24586 | the beginning of subsequent lines. | |
24587 | + | |
24588 | [source,sml] | |
24589 | ---- | |
24590 | e1 | |
24591 | orelse e2 | |
24592 | orelse e3 | |
24593 | ---- | |
24594 | ||
24595 | * A `val` declaration has a space before and after the `=`. | |
24596 | + | |
24597 | [source,sml] | |
24598 | ---- | |
24599 | val p = e | |
24600 | ---- | |
24601 | ||
24602 | * A `val` declaration can start the expression on the subsequent line, | |
24603 | indented 3 spaces. | |
24604 | + | |
24605 | [source,sml] | |
24606 | ---- | |
24607 | val p = | |
24608 |    if e1 then e2 else e3 | |
24609 | ---- | |
24610 | ||
24611 | ||
24612 | == Signatures == | |
24613 | ||
24614 | * A `signature` declaration is indented as follows. | |
24615 | + | |
24616 | [source,sml] | |
24617 | ---- | |
24618 | signature FOO = | |
24619 |    sig | |
24620 |       val x: int | |
24621 |    end | |
24622 | ---- | |
24623 | + | |
24624 | _Exception_: a signature declaration in a file by itself can omit the | |
24625 | indentation to save horizontal space. | |
24626 | + | |
24627 | [source,sml] | |
24628 | ---- | |
24629 | signature FOO = | |
24630 | sig | |
24631 | ||
24632 | val x: int | |
24633 | ||
24634 | end | |
24635 | ---- | |
24636 | + | |
24637 | In this case, there should be a blank line after the `sig` and before | |
24638 | the `end`. | |
24639 | ||
24640 | * A `val` specification has a space after the colon, but not before. | |
24641 | + | |
24642 | [source,sml] | |
24643 | ---- | |
24644 | val x: int | |
24645 | ---- | |
24646 | + | |
24647 | _Exception_: in the case of operators (like `+`), there is a space | |
24648 | before the colon to avoid lexing the colon as part of the operator. | |
24649 | + | |
24650 | [source,sml] | |
24651 | ---- | |
24652 | val + : t * t -> t | |
24653 | ---- | |
24654 | ||
24655 | * Alphabetize specifications in signatures. | |
24656 | + | |
24657 | [source,sml] | |
24658 | ---- | |
24659 | sig | |
24660 |    val x: int | |
24661 |    val y: bool | |
24662 | end | |
24663 | ---- | |
24664 | ||
24665 | ||
24666 | == Structures == | |
24667 | ||
24668 | * A `structure` declaration has a space on both sides of the `=`. | |
24669 | + | |
24670 | [source,sml] | |
24671 | ---- | |
24672 | structure Foo = Bar | |
24673 | ---- | |
24674 | ||
24675 | * A `structure` declaration split over multiple lines is indented as | |
24676 | follows. | |
24677 | + | |
24678 | [source,sml] | |
24679 | ---- | |
24680 | structure S = | |
24681 |    struct | |
24682 |       val x = 13 | |
24683 |    end | |
24684 | ---- | |
24685 | + | |
24686 | _Exception_: a structure declaration in a file by itself can omit the | |
24687 | indentation to save horizontal space. | |
24688 | + | |
24689 | [source,sml] | |
24690 | ---- | |
24691 | structure S = | |
24692 | struct | |
24693 | ||
24694 | val x = 13 | |
24695 | ||
24696 | end | |
24697 | ---- | |
24698 | + | |
24699 | In this case, there should be a blank line after the `struct` and | |
24700 | before the `end`. | |
24701 | ||
24702 | * Declarations in a `struct` are separated by blank lines. | |
24703 | + | |
24704 | [source,sml] | |
24705 | ---- | |
24706 | struct | |
24707 |    val x = | |
24708 |       let | |
24709 |          val y = 13 | |
24710 |       in | |
24711 |          y + 1 | |
24712 |       end | |
24713 | ||
24714 |    val z = 14 | |
24715 | end | |
24716 | ---- | |
24717 | ||
24718 | ||
24719 | == Functors == | |
24720 | ||
24721 | * A `functor` declaration has spaces after each `:` (or `:>`) but not | |
24722 | before, and a space before and after the `=`. It is indented as | |
24723 | follows. | |
24724 | + | |
24725 | [source,sml] | |
24726 | ---- | |
24727 | functor Foo (S: FOO_ARG): FOO = | |
24728 |    struct | |
24729 |       val x = S.x | |
24730 |    end | |
24731 | ---- | |
24732 | + | |
24733 | _Exception_: a functor declaration in a file by itself can omit the | |
24734 | indentation to save horizontal space. | |
24735 | + | |
24736 | [source,sml] | |
24737 | ---- | |
24738 | functor Foo (S: FOO_ARG): FOO = | |
24739 | struct | |
24740 | ||
24741 | val x = S.x | |
24742 | ||
24743 | end | |
24744 | ---- | |
24745 | + | |
24746 | In this case, there should be a blank line after the `struct` | |
24747 | and before the `end`. | |
24748 | ||
24749 | <<< | |
24750 | ||
24751 | :mlton-guide-page: Talk | |
24752 | [[Talk]] | |
24753 | Talk | |
24754 | ==== | |
24755 | ||
24756 | == The MLton Standard ML Compiler == | |
24757 | ||
24758 | *Henry Cejtin, Matthew Fluet, Suresh Jagannathan, Stephen Weeks* | |
24759 | ||
24760 | {nbsp} + | |
24761 | {nbsp} + | |
24762 | {nbsp} + | |
24763 | ||
24764 | ''' | |
24765 | ||
24766 | [cols="<,>"] | |
24767 | |==== | |
24768 | ||<:TalkStandardML: Next> | |
24769 | |==== | |
24770 | ||
24771 | <<< | |
24772 | ||
24773 | :mlton-guide-page: TalkDiveIn | |
24774 | [[TalkDiveIn]] | |
24775 | TalkDiveIn | |
24776 | ========== | |
24777 | ||
24778 | == Dive In == | |
24779 | ||
24780 | * to <:Development:> | |
24781 | * to <:Documentation:> | |
24782 | * to <:Download:> | |
24783 | ||
24784 | {nbsp} + | |
24785 | {nbsp} + | |
24786 | {nbsp} + | |
24787 | ||
24788 | ''' | |
24789 | ||
24790 | [cols="<,>"] | |
24791 | |==== | |
24792 | |<:TalkMLtonHistory: Prev>| | |
24793 | |==== | |
24794 | ||
24795 | <<< | |
24796 | ||
24797 | :mlton-guide-page: TalkFolkLore | |
24798 | [[TalkFolkLore]] | |
24799 | TalkFolkLore | |
24800 | ============ | |
24801 | ||
24802 | == Folk Lore == | |
24803 | ||
24804 | * Defunctorization and monomorphisation are feasible | |
24805 | * Global control-flow analysis is feasible | |
24806 | * Early closure conversion is feasible | |
24807 | ||
24808 | {nbsp} + | |
24809 | {nbsp} + | |
24810 | {nbsp} + | |
24811 | ||
24812 | ''' | |
24813 | ||
24814 | [cols="<,>"] | |
24815 | |==== | |
24816 | |<:TalkWholeProgram: Prev>|<:TalkMLtonFeatures: Next> | |
24817 | |==== | |
24818 | ||
24819 | <<< | |
24820 | ||
24821 | :mlton-guide-page: TalkFromSMLTo | |
24822 | [[TalkFromSMLTo]] | |
24823 | TalkFromSMLTo | |
24824 | ============= | |
24825 | ||
24826 | == From Standard ML to S-T F-O IL == | |
24827 | ||
24828 | * What issues arise when translating from Standard ML into an intermediate language? | |
24829 | ||
24830 | {nbsp} + | |
24831 | {nbsp} + | |
24832 | {nbsp} + | |
24833 | ||
24834 | ''' | |
24835 | ||
24836 | [cols="<,>"] | |
24837 | |==== | |
24838 | |<:TalkMLtonApproach: Prev>|<:TalkHowModules: Next> | |
24839 | |==== | |
24840 | ||
24841 | <<< | |
24842 | ||
24843 | :mlton-guide-page: TalkHowHigherOrder | |
24844 | [[TalkHowHigherOrder]] | |
24845 | TalkHowHigherOrder | |
24846 | ================== | |
24847 | ||
24848 | == Higher-order Functions == | |
24849 | ||
24850 | * How does one represent SML's higher-order functions? | |
24851 | * MLton's answer: defunctionalize | |
24852 | ||
24853 | {nbsp} + | |
24854 | {nbsp} + | |
24855 | ||
24856 | See <:ClosureConvert:>. | |
24857 | ||
24858 | {nbsp} + | |
24859 | {nbsp} + | |
24860 | {nbsp} + | |
24861 | ||
24862 | ''' | |
24863 | [cols="<,>"] | |
24864 | |==== | |
24865 | |<:TalkMLtonApproach: Prev>|<:TalkWholeProgram: Next> | |
24866 | |==== | |
24867 | ||
24868 | <<< | |
24869 | ||
24870 | :mlton-guide-page: TalkHowModules | |
24871 | [[TalkHowModules]] | |
24872 | TalkHowModules | |
24873 | ============== | |
24874 | ||
24875 | == Modules == | |
24876 | ||
24877 | * How does one represent SML's modules? | |
24878 | * MLton's answer: defunctorize | |
24879 | ||
24880 | {nbsp} + | |
24881 | {nbsp} + | |
24882 | ||
24883 | See <:Elaborate:>. | |
24884 | ||
24885 | {nbsp} + | |
24886 | {nbsp} + | |
24887 | {nbsp} + | |
24888 | ||
24889 | ''' | |
24890 | ||
24891 | [cols="<,>"] | |
24892 | |==== | |
24893 | |<:TalkFromSMLTo: Prev>|<:TalkHowPolymorphism: Next> | |
24894 | |==== | |
24895 | ||
24896 | <<< | |
24897 | ||
24898 | :mlton-guide-page: TalkHowPolymorphism | |
24899 | [[TalkHowPolymorphism]] | |
24900 | TalkHowPolymorphism | |
24901 | =================== | |
24902 | ||
24903 | == Polymorphism == | |
24904 | ||
24905 | * How does one represent SML's polymorphism? | |
24906 | * MLton's answer: monomorphise | |
24907 | ||
24908 | {nbsp} + | |
24909 | {nbsp} + | |
24910 | ||
24911 | See <:Monomorphise:>. | |
24912 | ||
24913 | {nbsp} + | |
24914 | {nbsp} + | |
24915 | {nbsp} + | |
24916 | ||
24917 | ''' | |
24918 | ||
24919 | [cols="<,>"] | |
24920 | |==== | |
24921 | |<:TalkHowModules: Prev>|<:TalkHowHigherOrder: Next> | |
24922 | |==== | |
24923 | ||
24924 | <<< | |
24925 | ||
24926 | :mlton-guide-page: TalkMLtonApproach | |
24927 | [[TalkMLtonApproach]] | |
24928 | TalkMLtonApproach | |
24929 | ================= | |
24930 | ||
24931 | == MLton's Approach == | |
24932 | ||
24933 | * whole-program optimization using a simply-typed, first-order intermediate language | |
24934 | * ensures programs are not penalized for exploiting abstraction and modularity | |
24935 | ||
24936 | {nbsp} + | |
24937 | {nbsp} + | |
24938 | {nbsp} + | |
24939 | ||
24940 | ''' | |
24941 | ||
24942 | [cols="<,>"] | |
24943 | |==== | |
24944 | |<:TalkStandardML: Prev>|<:TalkFromSMLTo: Next> | |
24945 | |==== | |
24946 | ||
24947 | <<< | |
24948 | ||
24949 | :mlton-guide-page: TalkMLtonFeatures | |
24950 | [[TalkMLtonFeatures]] | |
24951 | TalkMLtonFeatures | |
24952 | ================= | |
24953 | ||
24954 | == MLton Features == | |
24955 | ||
24956 | * Supports full Standard ML language and Basis Library | |
24957 | * Generates standalone executables | |
24958 | * Extensions | |
24959 | ** Foreign function interface (SML to C, C to SML) | |
24960 | ** ML Basis system for programming in the very large | |
24961 | ** Extension libraries | |
24962 | ||
24963 | {nbsp} + | |
24964 | {nbsp} + | |
24965 | ||
24966 | See <:Features:>. | |
24967 | ||
24968 | {nbsp} + | |
24969 | {nbsp} + | |
24970 | {nbsp} + | |
24971 | ||
24972 | ''' | |
24973 | ||
24974 | [cols="<,>"] | |
24975 | |==== | |
24976 | |<:TalkFolkLore: Prev>|<:TalkMLtonHistory: Next> | |
24977 | |==== | |
24978 | ||
24979 | <<< | |
24980 | ||
24981 | :mlton-guide-page: TalkMLtonHistory | |
24982 | [[TalkMLtonHistory]] | |
24983 | TalkMLtonHistory | |
24984 | ================ | |
24985 | ||
24986 | == MLton History == | |
24987 | ||
24988 | [cols="<25%,<75%"] | |
24989 | |==== | |
24990 | | April 1997 | Stephen Weeks wrote a defunctorizer for SML/NJ | |
24991 | | Aug. 1997 | Begin independent compiler (`smlc`) | |
24992 | | Oct. 1997 | Monomorphiser | |
24993 | | Nov. 1997 | Polyvariant higher-order control-flow analysis (10,000 lines) | |
24994 | | March 1999 | First release of MLton (48,006 lines) | |
24995 | | Jan. 2002 | MLton at 102,541 lines | |
24996 | | Jan. 2003 | MLton at 112,204 lines | |
24997 | | Jan. 2004 | MLton at 122,299 lines | |
24998 | | Nov. 2004 | MLton at 141,311 lines | |
24999 | |==== | |
25000 | ||
25001 | {nbsp} + | |
25002 | {nbsp} + | |
25003 | ||
25004 | See <:History:>. | |
25005 | ||
25006 | {nbsp} + | |
25007 | {nbsp} + | |
25008 | {nbsp} + | |
25009 | ||
25010 | ''' | |
25011 | ||
25012 | [cols="<,>"] | |
25013 | |==== | |
25014 | |<:TalkMLtonFeatures: Prev>|<:TalkDiveIn: Next> | |
25015 | |==== | |
25016 | ||
25017 | <<< | |
25018 | ||
25019 | :mlton-guide-page: TalkStandardML | |
25020 | [[TalkStandardML]] | |
25021 | TalkStandardML | |
25022 | ============== | |
25023 | ||
25024 | == Standard ML == | |
25025 | ||
25026 | * a high-level language makes | |
25027 | ** a programmer's life easier | |
25028 | ** a compiler writer's life harder | |
25029 | ||
25030 | * perceived overheads of features discourage their use | |
25031 | ** higher-order functions | |
25032 | ** polymorphic datatypes | |
25033 | ** separate modules | |
25034 | ||
25035 | {nbsp} + | |
25036 | {nbsp} + | |
25037 | ||
25038 | Also see <:StandardML:Standard ML>. | |
25039 | ||
25040 | {nbsp} + | |
25041 | {nbsp} + | |
25042 | {nbsp} + | |
25043 | ||
25044 | ''' | |
25045 | ||
25046 | [cols="<,>"] | |
25047 | |==== | |
25048 | |<:Talk: Prev>|<:TalkMLtonApproach: Next> | |
25049 | |==== | |
25050 | ||
25051 | <<< | |
25052 | ||
25053 | :mlton-guide-page: TalkTemplate | |
25054 | [[TalkTemplate]] | |
25055 | TalkTemplate | |
25056 | ============ | |
25057 | ||
25058 | == Title == | |
25059 | ||
25060 | * Bullet | |
25061 | * Bullet | |
25062 | ||
25063 | ||
25064 | {nbsp} + | |
25065 | {nbsp} + | |
25066 | {nbsp} + | |
25067 | ||
25068 | ''' | |
25069 | ||
25070 | [cols="<,>"] | |
25071 | |==== | |
25072 | |<:ZZZPrev: Prev>|<:ZZZNext: Next> | |
25073 | |==== | |
25074 | ||
25075 | <<< | |
25076 | ||
25077 | :mlton-guide-page: TalkWholeProgram | |
25078 | [[TalkWholeProgram]] | |
25079 | TalkWholeProgram | |
25080 | ================ | |
25081 | ||
25082 | == Whole Program Compiler == | |
25083 | ||
25084 | * Each of these techniques requires whole-program analysis | |
25085 | * But, additional benefits: | |
25086 | ** eliminates (some) variability in programming styles | |
25087 | ** specializes representations | |
25088 | ** simplifies and improves the runtime system | |
25089 | ||
25090 | {nbsp} + | |
25091 | {nbsp} + | |
25092 | {nbsp} + | |
25093 | ||
25094 | ''' | |
25095 | ||
25096 | [cols="<,>"] | |
25097 | |==== | |
25098 | |<:TalkHowHigherOrder: Prev>|<:TalkFolkLore: Next> | |
25099 | |==== | |
25100 | ||
25101 | <<< | |
25102 | ||
25103 | :mlton-guide-page: TILT | |
25104 | [[TILT]] | |
25105 | TILT | |
25106 | ==== | |
25107 | ||
25108 | http://www.cs.cornell.edu/home/jgm/tilt.html[TILT] is a | |
25109 | <:StandardMLImplementations:Standard ML implementation>. | |
25110 | ||
25111 | <<< | |
25112 | ||
25113 | :mlton-guide-page: TipsForWritingConciseSML | |
25114 | [[TipsForWritingConciseSML]] | |
25115 | TipsForWritingConciseSML | |
25116 | ======================== | |
25117 | ||
25118 | SML is a rich enough language that there are often several ways to | |
25119 | express things. This page contains miscellaneous tips (ideas not | |
25120 | rules) for writing concise SML. The metric that we are interested in | |
25121 | here is the number of tokens or words (rather than the number of | |
25122 | lines, for example). | |
25123 | ||
25124 | == Datatypes in Signatures == | |
25125 | ||
25126 | A seemingly frequent source of repetition in SML is that of datatype | |
25127 | definitions in signatures and structures. Actually, it isn't | |
25128 | repetition at all. A datatype specification in a signature, such as, | |
25129 | ||
25130 | [source,sml] | |
25131 | ---- | |
25132 | signature EXP = sig | |
25133 | datatype exp = Fn of id * exp | App of exp * exp | Var of id | |
25134 | end | |
25135 | ---- | |
25136 | ||
25137 | is just a specification of a datatype that may be matched by multiple | |
25138 | (albeit identical) datatype declarations. For example, in | |
25139 | ||
25140 | [source,sml] | |
25141 | ---- | |
25142 | structure AnExp : EXP = struct | |
25143 | datatype exp = Fn of id * exp | App of exp * exp | Var of id | |
25144 | end | |
25145 | ||
25146 | structure AnotherExp : EXP = struct | |
25147 | datatype exp = Fn of id * exp | App of exp * exp | Var of id | |
25148 | end | |
25149 | ---- | |
25150 | ||
25151 | the types `AnExp.exp` and `AnotherExp.exp` are two distinct types. If | |
25152 | such <:GenerativeDatatype:generativity> isn't desired or needed, you | |
25153 | can avoid the repetition: | |
25154 | ||
25155 | [source,sml] | |
25156 | ---- | |
25157 | structure Exp = struct | |
25158 | datatype exp = Fn of id * exp | App of exp * exp | Var of id | |
25159 | end | |
25160 | ||
25161 | signature EXP = sig | |
25162 | datatype exp = datatype Exp.exp | |
25163 | end | |
25164 | ||
25165 | structure Exp : EXP = struct | |
25166 | open Exp | |
25167 | end | |
25168 | ---- | |
25169 | ||
25170 | Keep in mind that this isn't semantically equivalent to the original: every structure matching `EXP` now shares the single type `Exp.exp`. | |
25171 | ||
25172 | ||
25173 | == Clausal Function Definitions == | |
25174 | ||
25175 | The syntax of clausal function definitions is rather repetitive. For | |
25176 | example, | |
25177 | ||
25178 | [source,sml] | |
25179 | ---- | |
25180 | fun isSome NONE = false | |
25181 | | isSome (SOME _) = true | |
25182 | ---- | |
25183 | ||
25184 | is more verbose than | |
25185 | ||
25186 | [source,sml] | |
25187 | ---- | |
25188 | val isSome = | |
25189 | fn NONE => false | |
25190 | | SOME _ => true | |
25191 | ---- | |
25192 | ||
25193 | For recursive functions the break-even point is one clause higher. For example, | |
25194 | ||
25195 | [source,sml] | |
25196 | ---- | |
25197 | fun fib 0 = 0 | |
25198 | | fib 1 = 1 | |
25199 | | fib n = fib (n-1) + fib (n-2) | |
25200 | ---- | |
25201 | ||
25202 | isn't less verbose than | |
25203 | ||
25204 | [source,sml] | |
25205 | ---- | |
25206 | val rec fib = | |
25207 | fn 0 => 0 | |
25208 | | 1 => 1 | |
25209 | | n => fib (n-1) + fib (n-2) | |
25210 | ---- | |
25211 | ||
25212 | It is quite often the case that a curried function primarily examines | |
25213 | just one of its arguments. Such functions can be written particularly | |
25214 | concisely by making the examined argument last. For example, instead | |
25215 | of | |
25216 | ||
25217 | [source,sml] | |
25218 | ---- | |
25219 | fun eval (Fn (v, b)) env = ... | |
25220 | | eval (App (f, a)) env = ... | |
25221 | | eval (Var v) env = ... | |
25222 | ---- | |
25223 | ||
25224 | consider writing | |
25225 | ||
25226 | [source,sml] | |
25227 | ---- | |
25228 | fun eval env = | |
25229 | fn Fn (v, b) => ... | |
25230 | | App (f, a) => ... | |
25231 | | Var v => ... | |
25232 | ---- | |
25233 | ||
25234 | ||
25235 | == Parentheses == | |
25236 | ||
25237 | It is a good idea to avoid using lots of irritating superfluous | |
25238 | parentheses. An important rule to know is that prefix function | |
25239 | application in SML has higher precedence than any infix operator. For | |
25240 | example, the outer parentheses in | |
25241 | ||
25242 | [source,sml] | |
25243 | ---- | |
25244 | (square (5 + 1)) + (square (5 * 2)) | |
25245 | ---- | |
25246 | ||
25247 | are superfluous. | |
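
As a quick check of the precedence rule, the following sketch binds both forms and compares them. Here `square` is an assumed helper; it is not defined on this page and is not part of the Basis Library.

[source,sml]
----
(* Assumed helper; not part of the Basis Library. *)
fun square x = x * x

(* Function application binds tighter than the infix operators + and *. *)
val parenthesized = (square (5 + 1)) + (square (5 * 2))
val plain = square (5 + 1) + square (5 * 2)

(* This binding raises Bind unless both forms agree. *)
val true = parenthesized = plain
----

The inner parentheses are still required: without them, `square 5 + 1` would parse as `(square 5) + 1`.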
25248 | ||
25249 | People trained in other languages often use superfluous parentheses in | |
25250 | a number of places. In particular, the parentheses in the following | |
25251 | examples are practically always superfluous and are best avoided: | |
25252 | ||
25253 | [source,sml] | |
25254 | ---- | |
25255 | if (condition) then ... else ... | |
25256 | while (condition) do ... | |
25257 | ---- | |
25258 | ||
25259 | The same basically applies to case expressions: | |
25260 | ||
25261 | [source,sml] | |
25262 | ---- | |
25263 | case (expression) of ... | |
25264 | ---- | |
25265 | ||
25266 | It is not uncommon to match a tuple of two or more values: | |
25267 | ||
25268 | [source,sml] | |
25269 | ---- | |
25270 | case (a, b) of | |
25271 | (A1, B1) => ... | |
25272 | | (A2, B2) => ... | |
25273 | ---- | |
25274 | ||
25275 | Such case expressions can be written more concisely with an | |
25276 | <:ProductType:infix product constructor>: | |
25277 | ||
25278 | [source,sml] | |
25279 | ---- | |
25280 | case a & b of | |
25281 | A1 & B1 => ... | |
25282 | | A2 & B2 => ... | |
25283 | ---- | |
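
The `&` constructor is not part of the Basis Library. The following is a minimal self-contained sketch, assuming a definition along the lines of the one described on the <:ProductType:> page; the function `classify` is a hypothetical example.

[source,sml]
----
(* Assumed definition of an infix product constructor. *)
datatype ('a, 'b) product = & of 'a * 'b
infix &

(* Hypothetical example: locate a point relative to the axes. *)
fun classify (x, y) =
   case Int.compare (x, 0) & Int.compare (y, 0) of
      EQUAL & EQUAL => "origin"
    | EQUAL & _ => "on the y-axis"
    | _ & EQUAL => "on the x-axis"
    | _ => "off the axes"
----

The datatype is declared before the `infix` directive so that the constructor can be written without `op` in its own declaration.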
25284 | ||
25285 | ||
25286 | == Conditionals == | |
25287 | ||
25288 | Repeated sequences of conditionals such as | |
25289 | ||
25290 | [source,sml] | |
25291 | ---- | |
25292 | if x < y then ... | |
25293 | else if x = y then ... | |
25294 | else ... | |
25295 | ---- | |
25296 | ||
25297 | can often be written more concisely as case expressions such as | |
25298 | ||
25299 | [source,sml] | |
25300 | ---- | |
25301 | case Int.compare (x, y) of | |
25302 | LESS => ... | |
25303 | | EQUAL => ... | |
25304 | | GREATER => ... | |
25305 | ---- | |
25306 | ||
25307 | For a custom comparison, you would then define an appropriate datatype | |
25308 | and a reification function. An alternative to using datatypes is to | |
25309 | use dispatch functions | |
25310 | ||
25311 | [source,sml] | |
25312 | ---- | |
25313 | comparing (x, y) | |
25314 | {lt = fn () => ..., | |
25315 | eq = fn () => ..., | |
25316 | gt = fn () => ...} | |
25317 | ---- | |
25318 | ||
25319 | where | |
25320 | ||
25321 | [source,sml] | |
25322 | ---- | |
25323 | fun comparing (x, y) {lt, eq, gt} = | |
25324 | (case Int.compare (x, y) of | |
25325 | LESS => lt | |
25326 | | EQUAL => eq | |
25327 | | GREATER => gt) () | |
25328 | ---- | |
25329 | ||
25330 | An advantage is that no datatype definition is needed. A disadvantage | |
25331 | is that you can't combine multiple dispatch results easily. | |
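
To make the dispatch style concrete, here is a small self-contained sketch. It repeats the `comparing` definition from above; `describe` is a hypothetical caller.

[source,sml]
----
fun comparing (x, y) {lt, eq, gt} =
   (case Int.compare (x, y) of
       LESS => lt
     | EQUAL => eq
     | GREATER => gt) ()

(* Hypothetical caller: each branch is a thunk of the same result type. *)
fun describe (x, y) =
   comparing (x, y)
      {lt = fn () => "less",
       eq = fn () => "equal",
       gt = fn () => "greater"}
----

Only the selected thunk is applied, so any effects in the other two branches are never performed.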
25332 | ||
25333 | ||
25334 | == Command-Query Fusion == | |
25335 | ||
25336 | Many are familiar with the | |
25337 | http://en.wikipedia.org/wiki/Command-Query_Separation[Command-Query | |
25338 | Separation Principle]. Adhering to the principle, a signature for an | |
25339 | imperative stack might contain specifications | |
25340 | ||
25341 | [source,sml] | |
25342 | ---- | |
25343 | val isEmpty : 'a t -> bool | |
25344 | val pop : 'a t -> 'a | |
25345 | ---- | |
25346 | ||
25347 | and use of a stack would look like | |
25348 | ||
25349 | [source,sml] | |
25350 | ---- | |
25351 | if isEmpty stack | |
25352 | then ... pop stack ... | |
25353 | else ... | |
25354 | ---- | |
25355 | ||
25356 | or, when the element needs to be named, | |
25357 | ||
25358 | [source,sml] | |
25359 | ---- | |
25360 | if isEmpty stack | |
25361 | then let val elem = pop stack in ... end | |
25362 | else ... | |
25363 | ---- | |
25364 | ||
25365 | For efficiency, correctness, and conciseness, it is often better to | |
25366 | combine the query and command and return the result as an option: | |
25367 | ||
25368 | [source,sml] | |
25369 | ---- | |
25370 | val pop : 'a t -> 'a option | |
25371 | ---- | |
25372 | ||
25373 | A use of a stack would then look like this: | |
25374 | ||
25375 | [source,sml] | |
25376 | ---- | |
25377 | case pop stack of | |
25378 | NONE => ... | |
25379 | | SOME elem => ... | |
25380 | ---- | |
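
A minimal sketch of such a stack follows, assuming a list-in-a-ref representation. The structure name `Stack` and the functions `new`, `push`, and `drain` are illustrative and are not taken from any particular library.

[source,sml]
----
structure Stack :> sig
   type 'a t
   val new : unit -> 'a t
   val push : 'a t * 'a -> unit
   val pop : 'a t -> 'a option
end = struct
   type 'a t = 'a list ref
   fun new () = ref []
   fun push (s, x) = s := x :: !s
   (* The emptiness query and the removal command are fused into one call. *)
   fun pop s =
      case !s of
         [] => NONE
       | x :: xs => (s := xs; SOME x)
end

(* Pop until the stack is empty, collecting the elements in pop order. *)
fun drain s =
   case Stack.pop s of
      NONE => []
    | SOME elem => elem :: drain s

val s : int Stack.t = Stack.new ()
val () = List.app (fn x => Stack.push (s, x)) [1, 2, 3]
val popped = drain s
----

With this interface a client cannot pop an empty stack by forgetting the `isEmpty` test, because the empty case has to be handled in the `case` expression.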
25381 | ||
25382 | <<< | |
25383 | ||
25384 | :mlton-guide-page: ToMachine | |
25385 | [[ToMachine]] | |
25386 | ToMachine | |
25387 | ========= | |
25388 | ||
25389 | <:ToMachine:> is a translation pass from the <:RSSA:> | |
25390 | <:IntermediateLanguage:> to the <:Machine:> <:IntermediateLanguage:>. | |
25391 | ||
25392 | == Description == | |
25393 | ||
25394 | This pass converts a <:RSSA:> program into a <:Machine:> program. | |
25395 | ||
25396 | It uses <:AllocateRegisters:>, <:Chunkify:>, and <:ParallelMove:>. | |
25397 | ||
25398 | == Implementation == | |
25399 | ||
25400 | * <!ViewGitFile(mlton,master,mlton/backend/backend.sig)> | |
25401 | * <!ViewGitFile(mlton,master,mlton/backend/backend.fun)> | |
25402 | ||
25403 | == Details and Notes == | |
25404 | ||
25405 | Because the MLton runtime system is shared by all <:Codegen:codegens>, it is most | |
25406 | convenient to decide on stack layout _before_ any <:Codegen:codegen> takes over. | |
25407 | In particular, we compute all the stack frame info for each <:RSSA:> | |
25408 | function, including stack size, <:GarbageCollection:garbage collector> | |
25409 | masks for each frame, etc. To do so, the <:Machine:> | |
25410 | <:IntermediateLanguage:> imagines an abstract machine with an infinite | |
25411 | number of (pseudo-)registers of every size. A liveness analysis | |
25412 | determines, for each variable, whether or not it is live across a | |
25413 | point where the runtime system might take over (for example, any | |
25414 | garbage collection point) or a non-tail call to another <:RSSA:> | |
25415 | function. Those that are live go on the stack, while those that | |
25416 | aren't live go into pseudo-registers. From this information, we know | |
25417 | all we need to know about each stack frame. On the downside, nothing | |
25418 | further on is allowed to change this stack info; it is set in stone. | |
25419 | ||
25420 | <<< | |
25421 | ||
25422 | :mlton-guide-page: TomMurphy | |
25423 | [[TomMurphy]] | |
25424 | TomMurphy | |
25425 | ========= | |
25426 | ||
25427 | Tom Murphy VII is a long-time MLton user and occasional contributor. He works on programming languages for his PhD work at Carnegie Mellon in Pittsburgh, USA. <:AdamGoode:> lives on the same floor of Wean Hall. | |
25428 | ||
25429 | http://tom7.org[Home page] | |
25430 | ||
25431 | <<< | |
25432 | ||
25433 | :mlton-guide-page: ToRSSA | |
25434 | [[ToRSSA]] | |
25435 | ToRSSA | |
25436 | ====== | |
25437 | ||
25438 | <:ToRSSA:> is a translation pass from the <:SSA2:> | |
25439 | <:IntermediateLanguage:> to the <:RSSA:> <:IntermediateLanguage:>. | |
25440 | ||
25441 | == Description == | |
25442 | ||
25443 | This pass converts a <:SSA2:> program into a <:RSSA:> program. | |
25444 | ||
25445 | It uses <:PackedRepresentation:>. | |
25446 | ||
25447 | == Implementation == | |
25448 | ||
25449 | * <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.sig)> | |
25450 | * <!ViewGitFile(mlton,master,mlton/backend/ssa-to-rssa.fun)> | |
25451 | ||
25452 | == Details and Notes == | |
25453 | ||
25454 | {empty} | |
25455 | ||
25456 | <<< | |
25457 | ||
25458 | :mlton-guide-page: ToSSA2 | |
25459 | [[ToSSA2]] | |
25460 | ToSSA2 | |
25461 | ====== | |
25462 | ||
25463 | <:ToSSA2:> is a translation pass from the <:SSA:> | |
25464 | <:IntermediateLanguage:> to the <:SSA2:> <:IntermediateLanguage:>. | |
25465 | ||
25466 | == Description == | |
25467 | ||
25468 | This pass is a simple conversion from a <:SSA:> program into a | |
25469 | <:SSA2:> program. | |
25470 | ||
25471 | The only interesting portions of the translation are: | |
25472 | ||
25473 | * an <:SSA:> `ref` type becomes an object with a single mutable field | |
25474 | * `array`, `vector`, and `ref` are eliminated in favor of selects and updates | |
25475 | * `Case` transfers separate discrimination and constructor argument selects | |
25476 | ||
25477 | == Implementation == | |
25478 | ||
25479 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.sig)> | |
25480 | * <!ViewGitFile(mlton,master,mlton/ssa/ssa-to-ssa2.fun)> | |
25481 | ||
25482 | == Details and Notes == | |
25483 | ||
25484 | {empty} | |
25485 | ||
25486 | <<< | |
25487 | ||
25488 | :mlton-guide-page: TypeChecking | |
25489 | [[TypeChecking]] | |
25490 | TypeChecking | |
25491 | ============ | |
25492 | ||
25493 | MLton's type checker follows the <:DefinitionOfStandardML:Definition> | |
25494 | closely, so you may find differences between MLton and other SML | |
25495 | compilers that do not follow the Definition so closely. In | |
25496 | particular, SML/NJ has many deviations from the Definition -- please | |
25497 | see <:SMLNJDeviations:> for those that we are aware of. | |
25498 | ||
25499 | In some respects MLton's type checker is more powerful than other SML | |
25500 | compilers, so there are programs that MLton accepts that are rejected | |
25501 | by some other SML compilers. These kinds of programs fall into a few | |
25502 | simple categories. | |
25503 | ||
25504 | * MLton resolves flexible record patterns using a larger context than | |
25505 | many other SML compilers. For example, MLton accepts the | |
25506 | following. | |
25507 | + | |
25508 | [source,sml] | |
25509 | ---- | |
25510 | fun f {x, ...} = x | |
25511 | val _ = f {x = 13, y = "foo"} | |
25512 | ---- | |
25513 | ||
25514 | * MLton uses as large a context as possible to resolve the type of | |
25515 | variables constrained by the value restriction to be monotypes. For | |
25516 | example, MLton accepts the following. | |
25517 | + | |
25518 | [source,sml] | |
25519 | ---- | |
25520 | structure S: | |
25521 | sig | |
25522 | val f: int -> int | |
25523 | end = | |
25524 | struct | |
25525 | val f = (fn x => x) (fn y => y) | |
25526 | end | |
25527 | ---- | |
25528 | ||
25529 | ||
25530 | == Type error messages == | |
25531 | ||
25532 | To aid in the understanding of type errors, MLton's type checker | |
25533 | displays type errors differently than other SML compilers. In | |
25534 | particular, when two types are different, it is important for the | |
25535 | programmer to easily understand why they are different. So, MLton | |
25536 | displays only the differences between two types that don't match, | |
25537 | using underscores for the parts that match. For example, if a | |
25538 | function expects `real * int` but gets `real * real`, the type error | |
25539 | message would look like | |
25540 | ||
25541 | ---- | |
25542 | expects: _ * [int] | |
25543 | but got: _ * [real] | |
25544 | ---- | |
25545 | ||
25546 | As another aid to spotting differences, MLton places brackets `[]` | |
25547 | around the parts of the types that don't match. A common situation is | |
25548 | when a function receives a different number of arguments than it | |
25549 | expects, in which case you might see an error like | |
25550 | ||
25551 | ---- | |
25552 | expects: [int * real] | |
25553 | but got: [int * real * string] | |
25554 | ---- | |
25555 | ||
25556 | The brackets make it easy to see that the problem is that the tuples | |
25557 | have different numbers of components -- not that the components don't | |
25558 | match. Contrast that with a case where a function receives the right | |
25559 | number of arguments, but in the wrong order, in which case you might | |
25560 | see an error like | |
25561 | ||
25562 | ---- | |
25563 | expects: [int] * [real] | |
25564 | but got: [real] * [int] | |
25565 | ---- | |
25566 | ||
25567 | Here the brackets make it easy to see that the components do not match. | |
25568 | ||
25569 | We appreciate feedback on any type error messages that you find | |
25570 | confusing, or suggestions you may have for improvements to error | |
25571 | messages. | |
25572 | ||
25573 | ||
25574 | == The shortest/most-recent rule for type names == | |
25575 | ||
25576 | In a type error message, MLton often has a number of choices in | |
25577 | deciding what name to use for a type. For example, in the following | |
25578 | type-incorrect program | |
25579 | ||
25580 | [source,sml] | |
25581 | ---- | |
25582 | type t = int | |
25583 | fun f (x: t) = x | |
25584 | val _ = f "foo" | |
25585 | ---- | |
25586 | ||
25587 | MLton reports the error message | |
25588 | ||
25589 | ---- | |
25590 | Error: z.sml 3.9-3.15. | |
25591 | Function applied to incorrect argument. | |
25592 | expects: [t] | |
25593 | but got: [string] | |
25594 | in: f "foo" | |
25595 | ---- | |
25596 | ||
25597 | MLton could have reported `expects: [int]` instead of `expects: [t]`. | |
25598 | However, MLton uses the shortest/most-recent rule in order to decide | |
25599 | what type name to display. This rule means that, at the point of the | |
25600 | error, MLton first looks for the shortest name for a type in terms of | |
25601 | number of structure identifiers (e.g. `foobar` is shorter than `A.t`). | |
25602 | Next, if there are multiple names of the same length, then MLton uses | |
25603 | the most recently defined name. It is this tiebreaker that causes | |
25604 | MLton to prefer `t` to `int` in the above example. | |
25605 | ||
25606 | In signature matching, most recently defined is not taken to include | |
25607 | all of the definitions introduced by the structure (since the matching | |
25608 | takes place outside the structure and before it is defined). For | |
25609 | example, in the following type-incorrect program | |
25610 | ||
25611 | [source,sml] | |
25612 | ---- | |
25613 | structure S: | |
25614 | sig | |
25615 | val x: int | |
25616 | end = | |
25617 | struct | |
25618 | type t = int | |
25619 | val x = "foo" | |
25620 | end | |
25621 | ---- | |
25622 | ||
25623 | MLton reports the error message | |
25624 | ||
25625 | ---- | |
25626 | Error: z.sml 2.4-4.6. | |
25627 | Variable in structure disagrees with signature (type): x. | |
25628 | structure: val x: [string] | |
25629 | defn at: z.sml 7.11-7.11 | |
25630 | signature: val x: [int] | |
25631 | spec at: z.sml 3.11-3.11 | |
25632 | ---- | |
25633 | ||
25634 | If there is a type that only exists inside the structure being | |
25635 | matched, then the prefix `_str.` is used. For example, in the | |
25636 | following type-incorrect program | |
25637 | ||
25638 | [source,sml] | |
25639 | ---- | |
25640 | structure S: | |
25641 | sig | |
25642 | val x: int | |
25643 | end = | |
25644 | struct | |
25645 | datatype t = T | |
25646 | val x = T | |
25647 | end | |
25648 | ---- | |
25649 | ||
25650 | MLton reports the error message | |
25651 | ||
25652 | ---- | |
25653 | Error: z.sml 2.4-4.6. | |
25654 | Variable in structure disagrees with signature (type): x. | |
25655 | structure: val x: [_str.t] | |
25656 | defn at: z.sml 7.11-7.11 | |
25657 | signature: val x: [int] | |
25658 | spec at: z.sml 3.11-3.11 | |
25659 | ---- | |
25660 | ||
25661 | in which the `[_str.t]` refers to the type defined in the structure. | |
25662 | ||
25663 | <<< | |
25664 | ||
25665 | :mlton-guide-page: TypeConstructor | |
25666 | [[TypeConstructor]] | |
25667 | TypeConstructor | |
25668 | =============== | |
25669 | ||
25670 | In <:StandardML:Standard ML>, a type constructor is a function from | |
25671 | types to types. Type constructors can be _nullary_, meaning that | |
25672 | they take no arguments, as in `char`, `int`, and `real`. | |
25673 | Type constructors can be _unary_, meaning that they take one | |
25674 | argument, as in `array`, `list`, and `vector`. A program | |
25675 | can define a new type constructor in two ways: a `type` definition | |
25676 | or a `datatype` declaration. User-defined type constructors can | |
25677 | take any number of arguments. | |
25678 | ||
25679 | [source,sml] | |
25680 | ---- | |
25681 | datatype t = T of int * real (* 0 arguments *) | |
25682 | type 'a t = 'a * int (* 1 argument *) | |
25683 | datatype ('a, 'b) t = A | B of 'a * 'b (* 2 arguments *) | |
25684 | type ('a, 'b, 'c) t = 'a * ('b -> 'c) (* 3 arguments *) | |
25685 | ---- | |
25686 | ||
25687 | Here are the syntax rules for type constructor application. | |
25688 | ||
25689 | * Type constructor application is written in postfix. So, one writes | |
25690 | `int list`, not `list int`. | |
25691 | ||
25692 | * Unary type constructors drop the parens, so one writes | |
25693 | `int list`, not `(int) list`. | |
25694 | ||
25695 | * Nullary type constructors drop the argument entirely, so one writes | |
25696 | `int`, not `() int`. | |
25697 | ||
25698 | * N-ary type constructors use tuple notation; for example, | |
25699 | `(int, real) t`. | |
25700 | ||
25701 | * Type constructor application associates to the left. So, | |
25702 | `int ref list` is the same as `(int ref) list`. | |
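
These rules can be checked with a few type annotations. The following is a small sketch; the names `pair`, `either`, `xs`, `ys`, `es`, and `p` are made up for illustration.

[source,sml]
----
type 'a pair = 'a * 'a                               (* 1 argument *)
datatype ('a, 'b) either = Left of 'a | Right of 'b  (* 2 arguments *)

(* Postfix application associates to the left, so
 * `int ref list` and `(int ref) list` name the same type. *)
val xs : int ref list = [ref 1, ref 2]
val ys : (int ref) list = xs

(* N-ary type constructors take their arguments in tuple notation. *)
val es : (int, string) either list = [Left 1, Right "two"]
val p : int pair = (3, 4)
----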
25703 | ||
25704 | <<< | |
25705 | ||
25706 | :mlton-guide-page: TypeIndexedValues | |
25707 | [[TypeIndexedValues]] | |
25708 | TypeIndexedValues | |
25709 | ================= | |
25710 | ||
25711 | <:StandardML:Standard ML> does not support ad hoc polymorphism. This | |
25712 | presents a challenge to programmers. The problem is that at first | |
25713 | glance there seems to be no practical way to implement something like | |
25714 | a function for converting a value of any type to a string or a | |
25715 | function for computing a hash value for a value of any type. | |
25716 | Fortunately there are ways to implement type-indexed values in SML as | |
25717 | discussed in <!Cite(Yang98)>. Various articles such as | |
25718 | <!Cite(Danvy98)>, <!Cite(Ramsey11)>, <!Cite(Elsman04)>, | |
25719 | <!Cite(Kennedy04)>, and <!Cite(Benton05)> also contain examples of | |
25720 | type-indexed values. | |
25721 | ||
25722 | *NOTE:* The following example uses an early (and somewhat broken) | |
25723 | variation of the basic technique used in an | |
25724 | experimental generic programming library (see | |
25725 | <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that can | |
25726 | be found in the MLton repository. The generic programming library | |
25727 | also includes a more advanced generic pretty printing function (see | |
25728 | <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pretty.sig)>). | |
25729 | ||
25730 | == Example: Converting any SML value to (roughly) SML syntax == | |
25731 | ||
25732 | Consider the problem of converting any SML value to a textual | |
25733 | presentation that matches the syntax of SML as closely as possible. | |
25734 | One solution is a type-indexed function that maps a given type to a | |
25735 | function that maps any value (of the type) to its textual | |
25736 | presentation. A type-indexed function like this can be useful for a | |
25737 | variety of purposes. For example, one could use it to show debugging | |
25738 | information. We'll call this function "`show`". | |
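
To give a feel for the idea before the full design, here is a deliberately simplified sketch in which a type-index for `'a` is just a function of type `'a -> string`. The structure `MiniShow` and its members are hypothetical; they are far less capable than the `SHOW` design discussed on this page (no records, datatypes, or recursion).

[source,sml]
----
structure MiniShow = struct
   (* A type-index for 'a is a function that renders values of type 'a. *)
   type 'a t = 'a -> string

   fun show (t : 'a t) : 'a -> string = t

   (* Type-indices for two base types. *)
   val int : int t = Int.toString
   val bool : bool t = Bool.toString

   (* A combinator: builds an index for lists from an index for elements. *)
   fun list (t : 'a t) : 'a list t =
      fn xs => "[" ^ String.concatWith ", " (map t xs) ^ "]"

   (* Reuse an existing index through a conversion function. *)
   fun inj (f : 'a -> 'b) (t : 'b t) : 'a t = t o f
end

val "[3, 1, 4]" =
   let open MiniShow in show (list int) end
      [3, 1, 4]
----

The value passed to `show` is built by composing combinators so that it mirrors the structure of the type, which is the sense in which `show` is indexed by types.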
25739 | ||
25740 | We'll do a fairly complete implementation of `show`. We do not | |
25741 | distinguish infix and nonfix constructors, but that is not an | |
25742 | intrinsic property of SML datatypes. We also don't reconstruct a type | |
25743 | name for the value, although it would be particularly useful for | |
25744 | functional values. To reconstruct type names, some changes would be | |
25745 | needed and the reader is encouraged to consider how to do that. A | |
25746 | more realistic implementation would use some pretty printing | |
25747 | combinators to compute a layout for the result. This should be a | |
25748 | relatively easy change (given a suitable pretty printing library). | |
25749 | Cyclic values (through references and arrays) do not have a standard | |
25750 | textual presentation and it is impossible to convert arbitrary | |
25751 | functional values (within SML) to a meaningful textual presentation. | |
25752 | Finally, it would also make sense to show sharing of references and | |
25753 | arrays. We'll leave these improvements to an actual library | |
25754 | implementation. | |
25755 | ||
25756 | The following code uses the <:Fixpoints:fixpoint framework> and other | |
25757 | utilities from an Extended Basis library (see | |
25758 | <!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>). | |
25759 | ||
25760 | === Signature === | |
25761 | ||
25762 | Let's consider the design of the `SHOW` signature: | |
25763 | [source,sml] | |
25764 | ---- | |
25765 | infixr --> | |
25766 | ||
25767 | signature SHOW = sig | |
25768 | type 'a t (* complete type-index *) | |
25769 | type 'a s (* incomplete sum *) | |
25770 | type ('a, 'k) p (* incomplete product *) | |
25771 | type u (* tuple or unlabelled product *) | |
25772 | type l (* record or labelled product *) | |
25773 | ||
25774 | val show : 'a t -> 'a -> string | |
25775 | ||
25776 | (* user-defined types *) | |
25777 | val inj : ('a -> 'b) -> 'b t -> 'a t | |
25778 | ||
25779 | (* tuples and records *) | |
25780 | val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p | |
25781 | ||
25782 | val U : 'a t -> ('a, u) p | |
25783 | val L : string -> 'a t -> ('a, l) p | |
25784 | ||
25785 | val tuple : ('a, u) p -> 'a t | |
25786 | val record : ('a, l) p -> 'a t | |
25787 | ||
25788 | (* datatypes *) | |
25789 | val + : 'a s * 'b s -> (('a, 'b) sum) s | |
25790 | ||
25791 | val C0 : string -> unit s | |
25792 | val C1 : string -> 'a t -> 'a s | |
25793 | ||
25794 | val data : 'a s -> 'a t | |
25795 | ||
25796 | val Y : 'a t Tie.t | |
25797 | ||
25798 | (* exceptions *) | |
25799 | val exn : exn t | |
25800 | val regExn : (exn -> ('a * 'a s) option) -> unit | |
25801 | ||
25802 | (* some built-in type constructors *) | |
25803 | val refc : 'a t -> 'a ref t | |
25804 | val array : 'a t -> 'a array t | |
25805 | val list : 'a t -> 'a list t | |
25806 | val vector : 'a t -> 'a vector t | |
25807 | val --> : 'a t * 'b t -> ('a -> 'b) t | |
25808 | ||
25809 | (* some built-in base types *) | |
25810 | val string : string t | |
25811 | val unit : unit t | |
25812 | val bool : bool t | |
25813 | val char : char t | |
25814 | val int : int t | |
25815 | val word : word t | |
25816 | val real : real t | |
25817 | end | |
25818 | ---- | |
25819 | ||
25820 | While some details are shaped by the specific requirements of `show`, | |
25821 | there are a number of (design) patterns that translate to other | |
25822 | type-indexed values. The former kind of details are mostly shaped by | |
25823 | the syntax of SML values that `show` is designed to produce. To this | |
25824 | end, abstract types and phantom types are used to distinguish | |
25825 | incomplete record, tuple, and datatype type-indices from each other | |
25826 | and from complete type-indices. Also, names of record labels and | |
25827 | datatype constructors need to be provided by the user. | |
25828 | ||
25829 | ==== Arbitrary user-defined datatypes ==== | |
25830 | ||
25831 | Perhaps the most important pattern is how the design supports | |
25832 | arbitrary user-defined datatypes. A number of combinators together | |
25833 | conspire to provide the functionality. First of all, to support new | |
25834 | user-defined types, a combinator taking a conversion function to a | |
25835 | previously supported type is provided: | |
25836 | [source,sml] | |
25837 | ---- | |
25838 | val inj : ('a -> 'b) -> 'b t -> 'a t | |
25839 | ---- | |
25840 | ||
25841 | An injection function is sufficient in this case, but in the general | |
25842 | case, an embedding with injection and projection functions may be | |
25843 | needed. | |
25844 | ||
25845 | To support products (tuples and records) a product combinator is | |
25846 | provided: | |
25847 | [source,sml] | |
25848 | ---- | |
25849 | val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p | |
25850 | ---- | |
25851 | The second (phantom) type variable `'k` is there to distinguish | |
25852 | between labelled and unlabelled products and the type `p` | |
25853 | distinguishes incomplete products from complete type-indices of type | |
25854 | `t`. Most type-indexed values do not need to make such distinctions. | |
25855 | ||
25856 | To support sums (datatypes) a sum combinator is provided: | |
25857 | [source,sml] | |
25858 | ---- | |
25859 | val + : 'a s * 'b s -> (('a, 'b) sum) s | |
25860 | ---- | |
25861 | Again, the purpose of the type `s` is to distinguish incomplete sums | |
25862 | from complete type-indices of type `t`, which usually isn't necessary. | |
25863 | ||
25864 | Finally, to support recursive datatypes, including sets of mutually | |
25865 | recursive datatypes, a <:Fixpoints:fixpoint tier> is provided: | |
25866 | [source,sml] | |
25867 | ---- | |
25868 | val Y : 'a t Tie.t | |
25869 | ---- | |
25870 | ||
25871 | Together these combinators (with the more domain specific combinators | |
25872 | `U`, `L`, `tuple`, `record`, `C0`, `C1`, and `data`) enable one to | |
25873 | encode a type-index for any user-defined datatype. | |
25874 | ||
25875 | ==== Exceptions ==== | |
25876 | ||
25877 | The `exn` type in SML is a <:UniversalType:universal type> into which | |
25878 | all types can be embedded. SML also allows a program to generate new | |
25879 | exception variants at run-time. Thus a mechanism is required to register | |
25880 | handlers for particular variants: | |
25881 | [source,sml] | |
25882 | ---- | |
25883 | val exn : exn t | |
25884 | val regExn : (exn -> ('a * 'a s) option) -> unit | |
25885 | ---- | |
25886 | ||
25887 | The universal `exn` type-index then makes use of the registered | |
25888 | handlers. The above particular form of handler, which converts an | |
25889 | exception value to a value of some type and a type-index for that type | |
25890 | (essentially an existential type) is designed to make it convenient to | |
25891 | write handlers. To write a handler, one can conveniently reuse | |
25892 | existing type-indices: | |
25893 | [source,sml] | |
25894 | ---- | |
25895 | exception Int of int | |
25896 | ||
25897 | local | |
25898 | open Show | |
25899 | in | |
25900 | val () = regExn (fn Int v => SOME (v, C1"Int" int) | |
25901 | | _ => NONE) | |
25902 | end | |
25903 | ---- | |
25904 | ||
25905 | Note that a single handler may actually handle an arbitrary number of | |
25906 | different exceptions. | |
25907 | ||
25908 | ==== Other types ==== | |
25909 | ||
25910 | Some built-in and standard types typically require special treatment | |
25911 | due to their special nature. The most important of these are arrays | |
25912 | and references, because cyclic data (ignoring closures) and observable | |
25913 | sharing can only be constructed through them. | |
25914 | ||
25915 | When arrow types are really supported, unlike in this case, they | |
25916 | usually need special treatment due to the contravariance of arguments. | |
25917 | ||
25918 | Lists and vectors require special treatment in the case of `show`, | |
25919 | because of their special syntax. This isn't usually the case. | |
25920 | ||
25921 | The set of base types to support also needs to be considered unless | |
25922 | one exports an interface for constructing type-indices for entirely | |
25923 | new base types. | |
25924 | ||
25925 | == Usage == | |
25926 | ||
25927 | Before going to the implementation, let's look at some examples. For | |
25928 | the following examples, we'll assume a structure binding | |
25929 | `Show :> SHOW`. If you want to try the examples immediately, just | |
25930 | skip forward to the implementation. | |
25931 | ||
25932 | To use `show`, one first needs a type-index, which is then given to | |
25933 | `show`. To show a list of integers, one would use the type-index | |
25934 | `list int`, which has the type `int list Show.t`: | |
25935 | [source,sml] | |
25936 | ---- | |
25937 | val "[3, 1, 4]" = | |
25938 | let open Show in show (list int) end | |
25939 | [3, 1, 4] | |
25940 | ---- | |
25941 | ||
25942 | Likewise, to show a list of lists of characters, one would use the | |
25943 | type-index `list (list char)`, which has the type `char list list | |
25944 | Show.t`: | |
25945 | [source,sml] | |
25946 | ---- | |
25947 | val "[[#\"a\", #\"b\", #\"c\"], []]" = | |
25948 | let open Show in show (list (list char)) end | |
25949 | [[#"a", #"b", #"c"], []] | |
25950 | ---- | |
25951 | ||
25952 | Handling standard types is not particularly interesting. It is more | |
25953 | interesting to see how user-defined types can be handled. Although | |
25954 | the `option` datatype is a standard type, it requires no special | |
25955 | support, so we can treat it as a user-defined type. Options can be | |
25956 | encoded easily using a sum: | |
25957 | [source,sml] | |
25958 | ---- | |
25959 | fun option t = let | |
25960 | open Show | |
25961 | in | |
25962 | inj (fn NONE => INL () | |
25963 | | SOME v => INR v) | |
25964 | (data (C0"NONE" + C1"SOME" t)) | |
25965 | end | |
25966 | ||
25967 | val "SOME 5" = | |
25968 | let open Show in show (option int) end | |
25969 | (SOME 5) | |
25970 | ---- | |
25971 | ||
25972 | Readers new to type-indexed values might want to type annotate each | |
25973 | subexpression of the above example as an exercise. (Use a compiler to | |
25974 | check your annotations.) | |
25975 | ||
25976 | Using a product, user-specified records can also be encoded easily: | |
25977 | [source,sml] | |
25978 | ---- | |
25979 | val abc = let | |
25980 | open Show | |
25981 | in | |
25982 | inj (fn {a, b, c} => a & b & c) | |
25983 | (record (L"a" (option int) * | |
25984 | L"b" real * | |
25985 | L"c" bool)) | |
25986 | end | |
25987 | ||
25988 | val "{a = SOME 1, b = 3.0, c = false}" = | |
25989 | let open Show in show abc end | |
25990 | {a = SOME 1, b = 3.0, c = false} | |
25991 | ---- | |
25992 | ||
25993 | As you can see, both of the above use `inj` to inject user-defined | |
25994 | types to the general purpose sum and product types. | |
25995 | ||
25996 | Of particular interest is whether recursive datatypes and cyclic data | |
25997 | can be handled. For example, how does one write a type-index for a | |
25998 | recursive datatype such as a cyclic graph? | |
25999 | [source,sml] | |
26000 | ---- | |
26001 | datatype 'a graph = VTX of 'a * 'a graph list ref | |
26002 | fun arcs (VTX (_, r)) = r | |
26003 | ---- | |
26004 | ||
26005 | Using the `Show` combinators, we could first write a new type-index | |
26006 | combinator for `graph`: | |
26007 | [source,sml] | |
26008 | ---- | |
26009 | fun graph a = let | |
26010 | open Tie Show | |
26011 | in | |
26012 | fix Y (fn graph_a => | |
26013 | inj (fn VTX (x, y) => x & y) | |
26014 | (data (C1"VTX" | |
26015 | (tuple (U a * | |
26016 | U (refc (list graph_a))))))) | |
26017 | end | |
26018 | ---- | |
26019 | ||
26020 | To show a graph with integer labels | |
26021 | [source,sml] | |
26022 | ---- | |
26023 | val a_graph = let | |
26024 | val a = VTX (1, ref []) | |
26025 | val b = VTX (2, ref []) | |
26026 | val c = VTX (3, ref []) | |
26027 | val d = VTX (4, ref []) | |
26028 | val e = VTX (5, ref []) | |
26029 | val f = VTX (6, ref []) | |
26030 | in | |
26031 | arcs a := [b, d] | |
26032 | ; arcs b := [c, e] | |
26033 | ; arcs c := [a, f] | |
26034 | ; arcs d := [f] | |
26035 | ; arcs e := [d] | |
26036 | ; arcs f := [e] | |
26037 | ; a | |
26038 | end | |
26039 | ---- | |
26040 | we could then simply write | |
26041 | [source,sml] | |
26042 | ---- | |
26043 | val "VTX (1, ref [VTX (2, ref [VTX (3, ref [VTX (1, %0), \ | |
26044 | \VTX (6, ref [VTX (5, ref [VTX (4, ref [VTX (6, %3)])])] as %3)]), \ | |
26045 | \VTX (5, ref [VTX (4, ref [VTX (6, ref [VTX (5, %2)])])] as %2)]), \ | |
26046 | \VTX (4, ref [VTX (6, ref [VTX (5, ref [VTX (4, %1)])])] as %1)] as %0)" = | |
26047 | let open Show in show (graph int) end | |
26048 | a_graph | |
26049 | ---- | |
26050 | ||
26051 | There is a subtle gotcha with cyclic data. Consider the following code: | |
26052 | [source,sml] | |
26053 | ---- | |
26054 | exception ExnArray of exn array | |
26055 | ||
26056 | val () = let | |
26057 | open Show | |
26058 | in | |
26059 | regExn (fn ExnArray a => | |
26060 | SOME (a, C1"ExnArray" (array exn)) | |
26061 | | _ => NONE) | |
26062 | end | |
26063 | ||
26064 | val a_cycle = let | |
26065 | val a = Array.fromList [Empty] | |
26066 | in | |
26067 | Array.update (a, 0, ExnArray a) ; a | |
26068 | end | |
26069 | ---- | |
26070 | ||
26071 | Although the above looks innocent enough, the evaluation of | |
26072 | [source,sml] | |
26073 | ---- | |
26074 | val "[|ExnArray %0|] as %0" = | |
26075 | let open Show in show (array exn) end | |
26076 | a_cycle | |
26077 | ---- | |
26078 | goes into an infinite loop. To avoid this problem, the type-index | |
26079 | `array exn` must be evaluated only once, as in the following: | |
26080 | [source,sml] | |
26081 | ---- | |
26082 | val array_exn = let open Show in array exn end | |
26083 | ||
26084 | exception ExnArray of exn array | |
26085 | ||
26086 | val () = let | |
26087 | open Show | |
26088 | in | |
26089 | regExn (fn ExnArray a => | |
26090 | SOME (a, C1"ExnArray" array_exn) | |
26091 | | _ => NONE) | |
26092 | end | |
26093 | ||
26094 | val a_cycle = let | |
26095 | val a = Array.fromList [Empty] | |
26096 | in | |
26097 | Array.update (a, 0, ExnArray a) ; a | |
26098 | end | |
26099 | ||
26100 | val "[|ExnArray %0|] as %0" = | |
26101 | let open Show in show array_exn end | |
26102 | a_cycle | |
26103 | ---- | |
26104 | ||
26105 | Cyclic data (excluding closures) in Standard ML can only be | |
26106 | constructed imperatively through arrays and references (combined with | |
26107 | exceptions or recursive datatypes). Before recursing to a reference | |
26108 | or an array, one needs to check whether that reference or array has | |
26109 | already been seen before. When `refc` or `array` is called with a | |
26110 | type-index, a new cyclicity checker is instantiated. | |
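
To illustrate the idea of checking whether a reference has already been seen, here is a minimal, self-contained sketch that is independent of `Show`. The `node` datatype and the `showNode` function are hypothetical names introduced only for this illustration. The sketch relies on the fact that equality on `ref` values compares identity, not contents.

[source,sml]
----
(* Hypothetical example type: a labelled node with mutable outgoing arcs. *)
datatype node = Node of int * node list ref

(* Render a node, printing a cycle marker instead of recursing into a
 * reference that is already on the current path. *)
fun showNode n =
   let
      fun go (seen, Node (label, arcs)) =
         if List.exists (fn r => r = arcs) seen
            then "<cycle>"
         else concat ["Node (", Int.toString label, ", [",
                      String.concatWith ", "
                         (map (fn m => go (arcs :: seen, m)) (!arcs)),
                      "])"]
   in
      go ([], n)
   end

(* Build a one-node cycle imperatively and show it. *)
val a = Node (1, ref [])
val Node (_, arcsA) = a
val () = arcsA := [a]
val () = print (showNode a ^ "\n")
----

This prints `Node (1, [<cycle>])`. Unlike the `Show` implementation, the sketch does not number the back-references; it only shows why the set of already-visited references has to be threaded through the traversal.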
26111 | ||
26112 | == Implementation == | |
26113 | ||
26114 | [source,sml] | |
26115 | ---- | |
26116 | structure SmlSyntax = struct | |
26117 | local | |
26118 | structure CV = CharVector and C = Char | |
26119 | in | |
26120 | val isSym = Char.contains "!%&$#+-/:<=>?@\\~`^|*" | |
26121 | ||
26122 | fun isSymId s = 0 < size s andalso CV.all isSym s | |
26123 | ||
26124 | fun isAlphaNumId s = | |
26125 | 0 < size s | |
26126 | andalso C.isAlpha (CV.sub (s, 0)) | |
26127 | andalso CV.all (fn c => C.isAlphaNum c | |
26128 | orelse #"'" = c | |
26129 | orelse #"_" = c) s | |
26130 | ||
26131 | fun isNumLabel s = | |
26132 | 0 < size s | |
26133 | andalso #"0" <> CV.sub (s, 0) | |
26134 | andalso CV.all C.isDigit s | |
26135 | ||
26136 | fun isId s = isAlphaNumId s orelse isSymId s | |
26137 | ||
26138 | fun isLongId s = List.all isId (String.fields (#"." <\ op =) s) | |
26139 | ||
26140 | fun isLabel s = isId s orelse isNumLabel s | |
26141 | end | |
26142 | end | |
26143 | ||
26144 | structure Show :> SHOW = struct | |
26145 | datatype 'a t = IN of exn list * 'a -> bool * string | |
26146 | type 'a s = 'a t | |
26147 | type ('a, 'k) p = 'a t | |
26148 | type u = unit | |
26149 | type l = unit | |
26150 | ||
26151 | fun show (IN t) x = #2 (t ([], x)) | |
26152 | ||
26153 | (* user-defined types *) | |
26154 | fun inj inj (IN b) = IN (b o Pair.map (id, inj)) | |
26155 | ||
26156 | local | |
26157 | fun surround pre suf (_, s) = (false, concat [pre, s, suf]) | |
26158 | fun parenthesize x = if #1 x then surround "(" ")" x else x | |
26159 | fun construct tag = | |
26160 | (fn (_, s) => (true, concat [tag, " ", s])) o parenthesize | |
26161 | fun check p m s = if p s then () else raise Fail (m^s) | |
26162 | in | |
26163 | (* tuples and records *) | |
26164 | fun (IN l) * (IN r) = | |
26165 | IN (fn (rs, a & b) => | |
26166 | (false, concat [#2 (l (rs, a)), | |
26167 | ", ", | |
26168 | #2 (r (rs, b))])) | |
26169 | ||
26170 | val U = id | |
26171 | fun L l = (check SmlSyntax.isLabel "Invalid label: " l | |
26172 | ; fn IN t => IN (surround (l^" = ") "" o t)) | |
26173 | ||
26174 | fun tuple (IN t) = IN (surround "(" ")" o t) | |
26175 | fun record (IN t) = IN (surround "{" "}" o t) | |
26176 | ||
26177 | (* datatypes *) | |
26178 | fun (IN l) + (IN r) = IN (fn (rs, INL a) => l (rs, a) | |
26179 | | (rs, INR b) => r (rs, b)) | |
26180 | ||
26181 | fun C0 c = (check SmlSyntax.isId "Invalid constructor: " c | |
26182 | ; IN (const (false, c))) | |
26183 | fun C1 c (IN t) = (check SmlSyntax.isId "Invalid constructor: " c | |
26184 | ; IN (construct c o t)) | |
26185 | ||
26186 | val data = id | |
26187 | ||
26188 | fun Y ? = Tie.iso Tie.function (fn IN x => x, IN) ? | |
26189 | ||
26190 | (* exceptions *) | |
26191 | local | |
26192 | val handlers = ref ([] : (exn -> unit t option) list) | |
26193 | in | |
26194 | val exn = IN (fn (rs, e) => let | |
26195 | fun lp [] = | |
26196 | C0(concat ["<exn:", | |
26197 | General.exnName e, | |
26198 | ">"]) | |
26199 | | lp (f::fs) = | |
26200 | case f e | |
26201 | of NONE => lp fs | |
26202 | | SOME t => t | |
26203 | val IN f = lp (!handlers) | |
26204 | in | |
26205 | f (rs, ()) | |
26206 | end) | |
26207 | fun regExn f = | |
26208 | handlers := (Option.map | |
26209 | (fn (x, IN f) => | |
26210 | IN (fn (rs, ()) => | |
26211 | f (rs, x))) o f) | |
26212 | :: !handlers | |
26213 | end | |
26214 | ||
26215 | (* some built-in type constructors *) | |
26216 | local | |
26217 | fun cyclic (IN t) = let | |
26218 | exception E of ''a * bool ref | |
26219 | in | |
26220 | IN (fn (rs, v : ''a) => let | |
26221 | val idx = Int.toString o length | |
26222 | fun lp (E (v', c)::rs) = | |
26223 | if v' <> v then lp rs | |
26224 | else (c := false ; (false, "%"^idx rs)) | |
26225 | | lp (_::rs) = lp rs | |
26226 | | lp [] = let | |
26227 | val c = ref true | |
26228 | val r = t (E (v, c)::rs, v) | |
26229 | in | |
26230 | if !c then r | |
26231 | else surround "" (" as %"^idx rs) r | |
26232 | end | |
26233 | in | |
26234 | lp rs | |
26235 | end) | |
26236 | end | |
26237 | ||
26238 | fun aggregate pre suf toList (IN t) = | |
26239 | IN (surround pre suf o | |
26240 | (fn (rs, a) => | |
26241 | (false, | |
26242 | String.concatWith | |
26243 | ", " | |
26244 | (map (#2 o curry t rs) | |
26245 | (toList a))))) | |
26246 | in | |
26247 | fun refc ? = (cyclic o inj ! o C1"ref") ? | |
26248 | fun array ? = (cyclic o aggregate "[|" "|]" (Array.foldr op:: [])) ? | |
26249 | fun list ? = aggregate "[" "]" id ? | |
26250 | fun vector ? = aggregate "#[" "]" (Vector.foldr op:: []) ? | |
26251 | end | |
26252 | ||
26253 | fun (IN _) --> (IN _) = IN (const (false, "<fn>")) | |
26254 | ||
26255 | (* some built-in base types *) | |
26256 | local | |
26257 | fun mk toS = (fn x => (false, x)) o toS o (fn (_, x) => x) | |
26258 | in | |
26259 | val string = | |
26260 | IN (surround "\"" "\"" o mk (String.translate Char.toString)) | |
26261 | val unit = IN (mk (fn () => "()")) | |
26262 | val bool = IN (mk Bool.toString) | |
26263 | val char = IN (surround "#\"" "\"" o mk Char.toString) | |
26264 | val int = IN (mk Int.toString) | |
26265 | val word = IN (surround "0wx" "" o mk Word.toString) | |
26266 | val real = IN (mk Real.toString) | |
26267 | end | |
26268 | end | |
26269 | end | |
26270 | ||
26271 | (* Handlers for standard top-level exceptions *) | |
26272 | val () = let | |
26273 | open Show | |
26274 | fun E0 name = SOME ((), C0 name) | |
26275 | in | |
26276 | regExn (fn Bind => E0"Bind" | |
26277 | | Chr => E0"Chr" | |
26278 | | Div => E0"Div" | |
26279 | | Domain => E0"Domain" | |
26280 | | Empty => E0"Empty" | |
26281 | | Match => E0"Match" | |
26282 | | Option => E0"Option" | |
26283 | | Overflow => E0"Overflow" | |
26284 | | Size => E0"Size" | |
26285 | | Span => E0"Span" | |
26286 | | Subscript => E0"Subscript" | |
26287 | | _ => NONE) | |
26288 | ; regExn (fn Fail s => SOME (s, C1"Fail" string) | |
26289 | | _ => NONE) | |
26290 | end | |
26291 | ---- | |
26292 | ||
26293 | ||
26294 | == Also see == | |
26295 | ||
26296 | There are a number of related techniques. Here are some of them. | |
26297 | ||
26298 | * <:Fold:> | |
26299 | * <:StaticSum:> | |
26300 | ||
26301 | <<< | |
26302 | ||
26303 | :mlton-guide-page: TypeVariableScope | |
26304 | [[TypeVariableScope]] | |
26305 | TypeVariableScope | |
26306 | ================= | |
26307 | ||
26308 | In <:StandardML:Standard ML>, every type variable is _scoped_ (or | |
26309 | bound) at a particular point in the program. A type variable can be | |
26310 | either implicitly scoped or explicitly scoped. For example, `'a` is | |
26311 | implicitly scoped in | |
26312 | ||
26313 | [source,sml] | |
26314 | ---- | |
26315 | val id: 'a -> 'a = fn x => x | |
26316 | ---- | |
26317 | ||
26318 | and is implicitly scoped in | |
26319 | ||
26320 | [source,sml] | |
26321 | ---- | |
26322 | val id = fn x: 'a => x | |
26323 | ---- | |
26324 | ||
26325 | On the other hand, `'a` is explicitly scoped in | |
26326 | ||
26327 | [source,sml] | |
26328 | ---- | |
26329 | val 'a id: 'a -> 'a = fn x => x | |
26330 | ---- | |
26331 | ||
26332 | and is explicitly scoped in | |
26333 | ||
26334 | [source,sml] | |
26335 | ---- | |
26336 | val 'a id = fn x: 'a => x | |
26337 | ---- | |
26338 | ||
26339 | A type variable can be scoped at a `val` or `fun` declaration. An SML | |
26340 | type checker performs scope inference on each top-level declaration to | |
26341 | determine the scope of each implicitly scoped type variable. After | |
26342 | scope inference, every type variable is scoped at exactly one | |
26343 | enclosing `val` or `fun` declaration. Scope inference shows that the | |
26344 | first and second example above are equivalent to the third and fourth | |
26345 | example, respectively. | |
26346 | ||
26347 | Section 4.6 of the <:DefinitionOfStandardML:Definition> specifies | |
26348 | precisely the scope of an implicitly scoped type variable. A free | |
26349 | occurrence of a type variable `'a` in a declaration `d` is said to be | |
26350 | _unguarded_ in `d` if `'a` is not part of a smaller declaration. A | |
26351 | type variable `'a` is implicitly scoped at `d` if `'a` is unguarded in | |
26352 | `d` and `'a` does not occur unguarded in any declaration containing | |
26353 | `d`. | |
26354 | ||
26355 | ||
26356 | == Scope inference examples == | |
26357 | ||
26358 | * In this example, | |
26359 | + | |
26360 | [source,sml] | |
26361 | ---- | |
26362 | val id: 'a -> 'a = fn x => x | |
26363 | ---- | |
26364 | + | |
26365 | `'a` is unguarded in `val id` and does not occur unguarded in any | |
26366 | containing declaration. Hence, `'a` is scoped at `val id` and the | |
26367 | declaration is equivalent to the following. | |
26368 | + | |
26369 | [source,sml] | |
26370 | ---- | |
26371 | val 'a id: 'a -> 'a = fn x => x | |
26372 | ---- | |
26373 | ||
26374 | * In this example, | |
26375 | + | |
26376 | [source,sml] | |
26377 | ---- | |
26378 | val f = fn x => let exception E of 'a in E x end | |
26379 | ---- | |
26380 | + | |
26381 | `'a` is unguarded in `val f` and does not occur unguarded in any | |
26382 | containing declaration. Hence, `'a` is scoped at `val f` and the | |
26383 | declaration is equivalent to the following. | |
26384 | + | |
26385 | [source,sml] | |
26386 | ---- | |
26387 | val 'a f = fn x => let exception E of 'a in E x end | |
26388 | ---- | |
26389 | ||
26390 | * In this example (taken from the <:DefinitionOfStandardML:Definition>), | |
26391 | + | |
26392 | [source,sml] | |
26393 | ---- | |
26394 | val x: int -> int = let val id: 'a -> 'a = fn z => z in id id end | |
26395 | ---- | |
26396 | + | |
26397 | `'a` occurs unguarded in `val id`, but not in `val x`. Hence, `'a` is | |
26398 | implicitly scoped at `val id`, and the declaration is equivalent to | |
26399 | the following. | |
26400 | + | |
26401 | [source,sml] | |
26402 | ---- | |
26403 | val x: int -> int = let val 'a id: 'a -> 'a = fn z => z in id id end | |
26404 | ---- | |
26405 | ||
26406 | ||
26407 | * In this example, | |
26408 | + | |
26409 | [source,sml] | |
26410 | ---- | |
26411 | val f = (fn x: 'a => x) (fn y => y) | |
26412 | ---- | |
26413 | + | |
26414 | `'a` occurs unguarded in `val f` and does not occur unguarded in any | |
26415 | containing declaration. Hence, `'a` is implicitly scoped at `val f`, | |
26416 | and the declaration is equivalent to the following. | |
26417 | + | |
26418 | [source,sml] | |
26419 | ---- | |
26420 | val 'a f = (fn x: 'a => x) (fn y => y) | |
26421 | ---- | |
26422 | + | |
26423 | This does not type check due to the <:ValueRestriction:>. | |
26424 | ||
26425 | * In this example, | |
26426 | + | |
26427 | [source,sml] | |
26428 | ---- | |
26429 | fun f x = | |
26430 | let | |
26431 | fun g (y: 'a) = if true then x else y | |
26432 | in | |
26433 | g x | |
26434 | end | |
26435 | ---- | |
26436 | + | |
26437 | `'a` occurs unguarded in `fun g`, not in `fun f`. Hence, `'a` is | |
26438 | implicitly scoped at `fun g`, and the declaration is equivalent to | |
26439 | + | |
26440 | [source,sml] | |
26441 | ---- | |
26442 | fun f x = | |
26443 | let | |
26444 | fun 'a g (y: 'a) = if true then x else y | |
26445 | in | |
26446 | g x | |
26447 | end | |
26448 | ---- | |
26449 | + | |
26450 | This fails to type check because `x` and `y` must have the same type, | |
26451 | but the `x` occurs outside the scope of the type variable `'a`. MLton | |
26452 | reports the following error. | |
26453 | + | |
26454 | ---- | |
26455 | Error: z.sml 3.21-3.41. | |
26456 | Then and else branches disagree. | |
26457 | then: [???] | |
26458 | else: ['a] | |
26459 | in: if true then x else y | |
26460 | note: type would escape its scope: 'a | |
26461 | escape to: z.sml 1.1-6.5 | |
26462 | ---- | |
26463 | + | |
26464 | This problem could be fixed either by adding an explicit type | |
26465 | constraint, as in `fun f (x: 'a)`, or by explicitly scoping `'a`, as | |
26466 | in `fun 'a f x = ...`. | |
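
The two fixes mentioned in the last example can be sketched as follows. The names `f1` and `f2` are chosen here only so that both variants can live in one file; otherwise the code is the example from the text with the respective fix applied.

[source,sml]
----
(* Fix 1: constrain x, so that 'a occurs unguarded in the outer
 * declaration and is therefore implicitly scoped there. *)
fun f1 (x: 'a) =
   let
      fun g (y: 'a) = if true then x else y
   in
      g x
   end

(* Fix 2: explicitly scope 'a at the outer declaration. *)
fun 'a f2 x =
   let
      fun g (y: 'a) = if true then x else y
   in
      g x
   end

val () = print (concat [Int.toString (f1 5), " ", f2 "ok", "\n"])
----

In both variants `'a` is scoped at the outer function, so `x` and `y` can share the type `'a`, and both functions get the type `'a -> 'a`.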
26467 | ||
26468 | ||
26469 | == Restrictions on type variable scope == | |
26470 | ||
26471 | It is not allowed to scope a type variable within a declaration in | |
26472 | which it is already in scope (see the last restriction listed on page | |
26473 | 9 of the <:DefinitionOfStandardML:Definition>). For example, the | |
26474 | following program is invalid. | |
26475 | ||
26476 | [source,sml] | |
26477 | ---- | |
26478 | fun 'a f (x: 'a) = | |
26479 | let | |
26480 | fun 'a g (y: 'a) = y | |
26481 | in | |
26482 | () | |
26483 | end | |
26484 | ---- | |
26485 | ||
26486 | MLton reports the following error. | |
26487 | ||
26488 | ---- | |
26489 | Error: z.sml 3.11-3.12. | |
26490 | Type variable scoped at an outer declaration: 'a. | |
26491 | scoped at: z.sml 1.1-6.6 | |
26492 | ---- | |
26493 | ||
26494 | This is an error even if the scoping is implicit. That is, the | |
26495 | following program is invalid as well. | |
26496 | ||
26497 | [source,sml] | |
26498 | ---- | |
26499 | fun f (x: 'a) = | |
26500 | let | |
26501 | fun 'a g (y: 'a) = y | |
26502 | in | |
26503 | () | |
26504 | end | |
26505 | ---- | |
26506 | ||
26507 | <<< | |
26508 | ||
26509 | :mlton-guide-page: Unicode | |
26510 | [[Unicode]] | |
26511 | Unicode | |
26512 | ======= | |
26513 | ||
26514 | == Support in The Definition of Standard ML == | |
26515 | ||
26516 | There is no real support for Unicode in the | |
26517 | <:DefinitionOfStandardML:Definition>; there are only a few throw-away | |
26518 | sentences along the lines of "the characters with numbers 0 to 127 | |
26519 | coincide with the ASCII character set." | |
26520 | ||
26521 | == Support in The Standard ML Basis Library == | |
26522 | ||
26523 | Neither is there real support for Unicode in the <:BasisLibrary:Basis | |
26524 | Library>. The general consensus (which includes the opinions of the | |
26525 | editors of the Basis Library) is that the `WideChar` and `WideString` | |
26526 | structures are insufficient for the purposes of Unicode. There is no | |
26527 | `LargeChar` structure, which in itself is a deficiency, since a | |
26528 | programmer cannot program against the largest supported character | |
26529 | size. | |
26530 | ||
26531 | == Current Support in MLton == | |
26532 | ||
26533 | MLton, as a minor extension over the Definition, supports UTF-8 byte | |
26534 | sequences in text constants. This feature enables "UTF-8 convenience" | |
26535 | (but not comprehensive Unicode support); in particular, it allows one | |
26536 | to copy text from a browser and paste it into a string constant in an | |
26537 | editor and, furthermore, if the string is printed to a terminal, then | |
26538 | it will (typically) appear as the original text. See the | |
26539 | <:SuccessorML#ExtendedTextConsts:extended text constants feature of | |
26540 | Successor ML> for more details. | |
26541 | ||
26542 | MLton, also as a minor extension over the Definition, supports | |
26543 | `\Uxxxxxxxx` numeric escapes in text constants and has preliminary | |
26544 | internal support for 16- and 32-bit characters and strings. | |
26545 | ||
26546 | MLton provides `WideChar` and `WideString` structures, corresponding | |
26547 | to 32-bit characters and strings, respectively. | |
26548 | ||
26549 | == Questions and Discussions == | |
26550 | ||
26551 | There are periodic flurries of questions and discussion about Unicode | |
26552 | in MLton/SML. In December 2004, there was a discussion that led to | |
26553 | some seemingly sound design decisions. The discussion started at: | |
26554 | ||
26555 | * http://www.mlton.org/pipermail/mlton/2004-December/026396.html | |
26556 | ||
26557 | There is a good summary of points at: | |
26558 | ||
26559 | * http://www.mlton.org/pipermail/mlton/2004-December/026440.html | |
26560 | ||
26561 | In November 2005, there was a followup discussion and the beginning of | |
26562 | some coding. | |
26563 | ||
26564 | * http://www.mlton.org/pipermail/mlton/2005-November/028300.html | |
26565 | ||
26566 | == Also see == | |
26567 | ||
26568 | The <:fxp:> XML parser has some support for dealing with Unicode | |
26569 | documents. | |
26570 | ||
26571 | <<< | |
26572 | ||
26573 | :mlton-guide-page: UniversalType | |
26574 | [[UniversalType]] | |
26575 | UniversalType | |
26576 | ============= | |
26577 | ||
26578 | A universal type is a type into which all other types can be embedded. | |
26579 | Here's a <:StandardML:Standard ML> signature for a universal type. | |
26580 | ||
26581 | [source,sml] | |
26582 | ---- | |
26583 | signature UNIVERSAL_TYPE = | |
26584 | sig | |
26585 | type t | |
26586 | ||
26587 | val embed: unit -> ('a -> t) * (t -> 'a option) | |
26588 | end | |
26589 | ---- | |
26590 | ||
26591 | The idea is that `type t` is the universal type and that each call to | |
26592 | `embed` returns a new pair of functions `(inject, project)`, where | |
26593 | `inject` embeds a value into the universal type and `project` extracts | |
26594 | the value from the universal type. A pair `(inject, project)` | |
26595 | returned by `embed` works together in that `project u` will return | |
26596 | `SOME v` if and only if `u` was created by `inject v`. If `u` was | |
26597 | created by a different function `inject'`, then `project` returns | |
26598 | `NONE`. | |
26599 | ||
26600 | Here's an example embedding integers and reals into a universal type. | |
26601 | ||
26602 | [source,sml] | |
26603 | ---- | |
26604 | functor Test (U: UNIVERSAL_TYPE): sig end = | |
26605 | struct | |
26606 | val (intIn: int -> U.t, intOut) = U.embed () | |
26607 | val r: U.t ref = ref (intIn 13) | |
26608 | val s1 = | |
26609 | case intOut (!r) of | |
26610 | NONE => "NONE" | |
26611 | | SOME i => Int.toString i | |
26612 | val (realIn: real -> U.t, realOut) = U.embed () | |
26613 | val () = r := realIn 13.0 | |
26614 | val s2 = | |
26615 | case intOut (!r) of | |
26616 | NONE => "NONE" | |
26617 | | SOME i => Int.toString i | |
26618 | val s3 = | |
26619 | case realOut (!r) of | |
26620 | NONE => "NONE" | |
26621 | | SOME x => Real.toString x | |
26622 | val () = print (concat [s1, " ", s2, " ", s3, "\n"]) | |
26623 | end | |
26624 | ---- | |
26625 | ||
26626 | Applying `Test` to an appropriate implementation will print | |
26627 | ||
26628 | ---- | |
26629 | 13 NONE 13.0 | |
26630 | ---- | |
26631 | ||
26632 | Note that two different calls to embed on the same type return | |
26633 | different embeddings. | |
26634 | ||
26635 | Standard ML does not have explicit support for universal types; | |
26636 | however, there are at least two ways to implement them. | |
26637 | ||
26638 | ||
26639 | == Implementation Using Exceptions == | |
26640 | ||
26641 | While the intended use of SML exceptions is for exception handling, an | |
26642 | accidental feature of their design is that the `exn` type is a | |
26643 | universal type. The implementation relies on being able to declare | |
26644 | exceptions locally to a function and on the fact that exceptions are | |
26645 | <:GenerativeException:generative>. | |
26646 | ||
26647 | [source,sml] | |
26648 | ---- | |
26649 | structure U:> UNIVERSAL_TYPE = | |
26650 | struct | |
26651 | type t = exn | |
26652 | ||
26653 | fun 'a embed () = | |
26654 | let | |
26655 | exception E of 'a | |
26656 | fun project (e: t): 'a option = | |
26657 | case e of | |
26658 | E a => SOME a | |
26659 | | _ => NONE | |
26660 | in | |
26661 | (E, project) | |
26662 | end | |
26663 | end | |
26664 | ---- | |
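
As a small check of the earlier remark that two calls to `embed` at the same type return different embeddings, the following self-contained sketch repeats the signature and the exception-based implementation and then embeds `int` twice. The structure name `ExnUniversal` and the helper `render` are assumptions made for this illustration only.

[source,sml]
----
signature UNIVERSAL_TYPE =
   sig
      type t
      val embed: unit -> ('a -> t) * (t -> 'a option)
   end

(* Exception-based implementation, as in the text. *)
structure ExnUniversal:> UNIVERSAL_TYPE =
   struct
      type t = exn
      fun 'a embed () =
         let
            exception E of 'a
            fun project (e: t): 'a option =
               case e of
                  E a => SOME a
                | _ => NONE
         in
            (E, project)
         end
   end

(* Two independent embeddings of int. *)
val (inA: int -> ExnUniversal.t, outA) = ExnUniversal.embed ()
val (inB: int -> ExnUniversal.t, outB) = ExnUniversal.embed ()

fun render NONE = "NONE"
  | render (SOME i) = Int.toString i

val u = inA 7
val () = print (concat [render (outA u), " ", render (outB u), "\n"])
----

This prints `7 NONE`: each call to `embed` declares a fresh exception constructor, so `outB` does not recognize a value injected with `inA`, even though both embeddings are at type `int`.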
26665 | ||
26666 | ||
26667 | == Implementation Using Functions and References == | |
26668 | ||
26669 | [source,sml] | |
26670 | ---- | |
26671 | structure U:> UNIVERSAL_TYPE = | |
26672 | struct | |
26673 | datatype t = T of {clear: unit -> unit, | |
26674 | store: unit -> unit} | |
26675 | ||
26676 | fun 'a embed () = | |
26677 | let | |
26678 | val r: 'a option ref = ref NONE | |
26679 | fun inject (a: 'a): t = | |
26680 | T {clear = fn () => r := NONE, | |
26681 | store = fn () => r := SOME a} | |
26682 | fun project (T {clear, store}): 'a option = | |
26683 | let | |
26684 | val () = store () | |
26685 | val res = !r | |
26686 | val () = clear () | |
26687 | in | |
26688 | res | |
26689 | end | |
26690 | in | |
26691 | (inject, project) | |
26692 | end | |
26693 | end | |
26694 | ---- | |
26695 | ||
26696 | Note that due to the use of a shared ref cell, the above | |
26697 | implementation is not thread safe. | |
26698 | ||
26699 | One could try to simplify the above implementation by eliminating the | |
26700 | `clear` function, making `type t = unit -> unit`. | |
26701 | ||
26702 | [source,sml] | |
26703 | ---- | |
26704 | structure U:> UNIVERSAL_TYPE = | |
26705 | struct | |
26706 | type t = unit -> unit | |
26707 | ||
26708 | fun 'a embed () = | |
26709 | let | |
26710 | val r: 'a option ref = ref NONE | |
26711 | fun inject (a: 'a): t = fn () => r := SOME a | |
26712 | fun project (f: t): 'a option = (r := NONE; f (); !r) | |
26713 | in | |
26714 | (inject, project) | |
26715 | end | |
26716 | end | |
26717 | ---- | |
26718 | ||
26719 | While correct, this approach keeps the contents of the ref cell alive | |
26720 | longer than necessary, which could cause a space leak. The problem is | |
26721 | in `project`, where the call to `f` stores some value in some ref cell | |
26722 | `r'`. Perhaps `r'` is the same ref cell as `r`, but perhaps not. If | |
26723 | we do not clear `r'` before returning from `project`, then `r'` will | |
26724 | keep the value alive, even though it is useless. | |
26725 | ||
26726 | ||
26727 | == Also see == | |
26728 | ||
26729 | * <:PropertyList:>: Lisp-style property lists implemented with a universal type | |
26730 | ||
26731 | <<< | |
26732 | ||
26733 | :mlton-guide-page: UnresolvedBugs | |
26734 | [[UnresolvedBugs]] | |
26735 | UnresolvedBugs | |
26736 | ============== | |
26737 | ||
26738 | Here are the places where MLton deviates from | |
26739 | <:DefinitionOfStandardML:The Definition of Standard ML (Revised)> and | |
26740 | the <:BasisLibrary:Basis Library>. In general, MLton complies with | |
26741 | the <:DefinitionOfStandardML:Definition> quite closely, typically much | |
26742 | more closely than other SML compilers (see, e.g., our list of | |
26743 | <:SMLNJDeviations:SML/NJ's deviations>). In fact, the four deviations | |
26744 | listed here are the only known deviations, and we have no immediate | |
26745 | plans to fix them. If you find a deviation not listed here, please | |
26746 | report a <:Bug:>. | |
26747 | ||
26748 | We don't plan to fix these bugs because the first (parsing nested | |
26749 | cases) has historically never been accepted by any SML compiler, the | |
26750 | second clearly indicates a problem in the | |
26751 | <:DefinitionOfStandardML:Definition>, and the remaining are difficult | |
26752 | to resolve in the context of MLton's implementation of Standard ML (and | |
26753 | unlikely to be problematic in practice). | |
26754 | ||
26755 | * MLton does not correctly parse case expressions nested within other | |
26756 | matches. For example, the following fails. | |
26757 | + | |
26758 | [source,sml] | |
26759 | ---- | |
26760 | fun f 0 y = | |
26761 | case x of | |
26762 | 1 => 2 | |
26763 | | _ => 3 | |
26764 | | f _ y = 4 | |
26765 | ---- | |
26766 | + | |
26767 | To do this in a program, simply parenthesize the case expression. | |
26768 | + | |
26769 | Allowing such expressions, although compliant with the Definition, | |
26770 | would be a mistake, since using parentheses is clearer and no SML | |
26771 | compiler has ever allowed them. Furthermore, implementing this would | |
26772 | require serious yacc grammar rewriting followed by postprocessing. | |
26773 | ||
26774 | * MLton does not raise the `Bind` exception at run time when | |
26775 | evaluating `val rec` (and `fun`) declarations that redefine | |
26776 | identifiers that previously had constructor status. (By default, | |
26777 | MLton does warn at compile time about `val rec` (and `fun`) | |
26778 | declarations that redefine identifiers that previously had | |
26779 | constructor status; see the `valrecConstr` <:MLBasisAnnotations:ML | |
26780 | Basis annotation>.) For example, the Definition requires the | |
26781 | following program to type check, but also (bizarrely) requires it to | |
26782 | raise the `Bind` exception. | |
26783 | + | |
26784 | [source,sml] | |
26785 | ---- | |
26786 | val rec NONE = fn () => () | |
26787 | ---- | |
26788 | + | |
26789 | The Definition's behavior is obviously an error, a mismatch between | |
26790 | the static semantics (rule 26) and the dynamic semantics (rule 126). | |
26791 | Given the comments on rule 26 in the Definition, it seems clear that | |
26792 | the authors meant for `val rec` to allow an identifier's constructor | |
26793 | status to be overridden both statically and dynamically. Hence, MLton | |
26794 | and most SML compilers follow rule 26, but do not follow rule 126. | |
26795 | ||
26796 | * MLton does not hide the equality aspect of types declared in | |
26797 | `abstype` declarations. So, MLton accepts programs like the following, | |
26798 | while the Definition rejects them. | |
26799 | + | |
26800 | [source,sml] | |
26801 | ---- | |
26802 | abstype t = T with end | |
26803 | val _ = fn (t1, t2 : t) => t1 = t2 | |
26804 | ||
26805 | abstype t = T with val a = T end | |
26806 | val _ = a = a | |
26807 | ---- | |
26808 | + | |
26809 | One consequence of this choice is that MLton accepts the following | |
26810 | program, in accordance with the Definition. | |
26811 | + | |
26812 | [source,sml] | |
26813 | ---- | |
26814 | abstype t = T with val eq = op = end | |
26815 | val _ = fn (t1, t2 : t) => eq (t1, t2) | |
26816 | ---- | |
26817 | + | |
26818 | Other implementations will typically reject this program, because they | |
26819 | make an early choice for the type of `eq` to be `''a * ''a -> bool` | |
26820 | instead of `t * t -> bool`. The choice is understandable, since the | |
26821 | Definition accepts the following program. | |
26822 | + | |
26823 | [source,sml] | |
26824 | ---- | |
26825 | abstype t = T with val eq = op = end | |
26826 | val _ = eq (1, 2) | |
26827 | ---- | |
26828 | + | |
26829 | ||
26830 | * MLton (re-)type checks each functor definition at every | |
26831 | corresponding functor application (the compilation technique of | |
26832 | defunctorization). One consequence of this implementation is that | |
26833 | MLton accepts the following program, while the Definition rejects | |
26834 | it. | |
26835 | + | |
26836 | [source,sml] | |
26837 | ---- | |
26838 | functor F (X: sig type t end) = struct | |
26839 | val f = id id | |
26840 | end | |
26841 | structure A = F (struct type t = int end) | |
26842 | structure B = F (struct type t = bool end) | |
26843 | val _ = A.f 10 | |
26844 | val _ = B.f "dude" | |
26845 | ---- | |
26846 | + | |
26847 | On the other hand, other implementations will typically reject the | |
26848 | following program, while MLton and the Definition accept it. | |
26849 | + | |
26850 | [source,sml] | |
26851 | ---- | |
26852 | functor F (X: sig type t end) = struct | |
26853 | val f = id id | |
26854 | end | |
26855 | structure A = F (struct type t = int end) | |
26856 | structure B = F (struct type t = bool end) | |
26857 | val _ = A.f 10 | |
26858 | val _ = B.f false | |
26859 | ---- | |
26860 | + | |
26861 | See <!Cite(DreyerBlume07)> for more details. | |
26862 | ||
26863 | <<< | |
26864 | ||
26865 | :mlton-guide-page: UnsafeStructure | |
26866 | [[UnsafeStructure]] | |
26867 | UnsafeStructure | |
26868 | =============== | |
26869 | ||
26870 | This module is a subset of the `Unsafe` module provided by SML/NJ, | |
26871 | with a few extra operations for `PackWord` and `PackReal`. | |
26872 | ||
26873 | [source,sml] | |
26874 | ---- | |
26875 | signature UNSAFE_MONO_ARRAY = | |
26876 | sig | |
26877 | type array | |
26878 | type elem | |
26879 | ||
26880 | val create: int -> array | |
26881 | val sub: array * int -> elem | |
26882 | val update: array * int * elem -> unit | |
26883 | end | |
26884 | ||
26885 | signature UNSAFE_MONO_VECTOR = | |
26886 | sig | |
26887 | type elem | |
26888 | type vector | |
26889 | ||
26890 | val sub: vector * int -> elem | |
26891 | end | |
26892 | ||
26893 | signature UNSAFE = | |
26894 | sig | |
26895 | structure Array: | |
26896 | sig | |
26897 | val create: int * 'a -> 'a array | |
26898 | val sub: 'a array * int -> 'a | |
26899 | val update: 'a array * int * 'a -> unit | |
26900 | end | |
26901 | structure CharArray: UNSAFE_MONO_ARRAY | |
26902 | structure CharVector: UNSAFE_MONO_VECTOR | |
26903 | structure IntArray: UNSAFE_MONO_ARRAY | |
26904 | structure IntVector: UNSAFE_MONO_VECTOR | |
26905 | structure Int8Array: UNSAFE_MONO_ARRAY | |
26906 | structure Int8Vector: UNSAFE_MONO_VECTOR | |
26907 | structure Int16Array: UNSAFE_MONO_ARRAY | |
26908 | structure Int16Vector: UNSAFE_MONO_VECTOR | |
26909 | structure Int32Array: UNSAFE_MONO_ARRAY | |
26910 | structure Int32Vector: UNSAFE_MONO_VECTOR | |
26911 | structure Int64Array: UNSAFE_MONO_ARRAY | |
26912 | structure Int64Vector: UNSAFE_MONO_VECTOR | |
26913 | structure IntInfArray: UNSAFE_MONO_ARRAY | |
26914 | structure IntInfVector: UNSAFE_MONO_VECTOR | |
26915 | structure LargeIntArray: UNSAFE_MONO_ARRAY | |
26916 | structure LargeIntVector: UNSAFE_MONO_VECTOR | |
26917 | structure LargeRealArray: UNSAFE_MONO_ARRAY | |
26918 | structure LargeRealVector: UNSAFE_MONO_VECTOR | |
26919 | structure LargeWordArray: UNSAFE_MONO_ARRAY | |
26920 | structure LargeWordVector: UNSAFE_MONO_VECTOR | |
26921 | structure RealArray: UNSAFE_MONO_ARRAY | |
26922 | structure RealVector: UNSAFE_MONO_VECTOR | |
26923 | structure Real32Array: UNSAFE_MONO_ARRAY | |
26924 | structure Real32Vector: UNSAFE_MONO_VECTOR | |
26925 | structure Real64Array: UNSAFE_MONO_ARRAY | |
26926 | structure Vector: | |
26927 | sig | |
26928 | val sub: 'a vector * int -> 'a | |
26929 | end | |
26930 | structure Word8Array: UNSAFE_MONO_ARRAY | |
26931 | structure Word8Vector: UNSAFE_MONO_VECTOR | |
26932 | structure Word16Array: UNSAFE_MONO_ARRAY | |
26933 | structure Word16Vector: UNSAFE_MONO_VECTOR | |
26934 | structure Word32Array: UNSAFE_MONO_ARRAY | |
26935 | structure Word32Vector: UNSAFE_MONO_VECTOR | |
26936 | structure Word64Array: UNSAFE_MONO_ARRAY | |
26937 | structure Word64Vector: UNSAFE_MONO_VECTOR | |
26938 | ||
26939 | structure PackReal32Big : PACK_REAL | |
26940 | structure PackReal32Little : PACK_REAL | |
26941 | structure PackReal64Big : PACK_REAL | |
26942 | structure PackReal64Little : PACK_REAL | |
26943 | structure PackRealBig : PACK_REAL | |
26944 | structure PackRealLittle : PACK_REAL | |
26945 | structure PackWord16Big : PACK_WORD | |
26946 | structure PackWord16Little : PACK_WORD | |
26947 | structure PackWord32Big : PACK_WORD | |
26948 | structure PackWord32Little : PACK_WORD | |
26949 | structure PackWord64Big : PACK_WORD | |
26950 | structure PackWord64Little : PACK_WORD | |
26951 | end | |
26952 | ---- | |
26953 | ||
26954 | <<< | |
26955 | ||
26956 | :mlton-guide-page: Useless | |
26957 | [[Useless]] | |
26958 | Useless | |
26959 | ======= | |
26960 | ||
26961 | <:Useless:> is an optimization pass for the <:SSA:> | |
26962 | <:IntermediateLanguage:>, invoked from <:SSASimplify:>. | |
26963 | ||
26964 | == Description == | |
26965 | ||
26966 | This pass: | |
26967 | ||
26968 | * removes components of tuples that are constants (use unification) | |
26969 | * removes function arguments that are constants | |
26970 | * builds a dependence graph in which | |
26971 | ** a value of ground type is useful if it is an arg to a primitive | |
26972 | ** a tuple is useful if it contains a useful component | |
26973 | ** a constructor is useful if it contains a useful component or is used in a `Case` transfer | |
26974 | ||
26975 | If a useful tuple is coerced to another useful tuple, then all of | |
26976 | their components must agree (exactly). It is trivial to convert a | |
26977 | useful value to a useless one. | |
26978 | ||
26979 | == Implementation == | |
26980 | ||
26981 | * <!ViewGitFile(mlton,master,mlton/ssa/useless.fun)> | |
26982 | ||
26983 | == Details and Notes == | |
26984 | ||
26985 | It is also trivial to convert a useful tuple to one of its useful | |
26986 | components -- but this seems hard. | |
26987 | ||
26988 | Suppose that you have a `ref`/`array`/`vector` that is useful, but the | |
26989 | components aren't -- then the components are converted to type `unit`, | |
26990 | and any primitive args must be as well. | |
26991 | ||
26992 | Unify all handler arguments so that `raise`/`handle` has a consistent | |
26993 | calling convention. | |
26994 | ||
26995 | <<< | |
26996 | ||
26997 | :mlton-guide-page: Users | |
26998 | [[Users]] | |
26999 | Users | |
27000 | ===== | |
27001 | ||
27002 | Here is a list of companies, projects, and courses that use or have | |
27003 | used MLton. If you use MLton and are not here, please add your | |
27004 | project with a brief description and a link. Thanks. | |
27005 | ||
27006 | == Companies == | |
27007 | ||
27008 | * http://www.hardcoreprocessing.com/[Hardcore Processing] uses MLton as a http://www.hardcoreprocessing.com/Freeware/MLTonWin32.html[crosscompiler from Linux to Windows] for graphics and game software. | |
27009 | ** http://www.cex3d.net/[CEX3D Converter], a conversion program for 3D objects. | |
27010 | ** http://www.hardcoreprocessing.com/company/showreel/index.html[Interactive Showreel], which contains a crossplatform GUI-toolkit and a realtime renderer for a subset of RenderMan written in Standard ML. | |
27011 | ** various http://www.hardcoreprocessing.com/entertainment/index.html[games] | |
27012 | * http://www.mathworks.com/products/polyspace/[MathWorks/PolySpace Technologies] uses MLton to build their product, which detects runtime errors in embedded systems using abstract interpretation. | |
27013 | // * http://www.sourcelight.com/[Sourcelight Technologies] uses MLton internally for prototyping and for processing databases as part of their system that makes personalized movie recommen | |
27014 | * http://www.reactive-systems.com/[Reactive Systems] uses MLton to build Reactis, a model-based testing and validation package used in the automotive and aerospace industries. | |
27015 | ||
27016 | == Projects == | |
27017 | ||
27018 | * http://www-ia.hiof.no/%7Erolando/adate_intro.html[ADATE], Automatic Design of Algorithms Through Evolution, a system for automatic programming, i.e., inductive inference of algorithms. ADATE can automatically generate non-trivial and novel algorithms written in Standard ML. | |
27019 | * http://types.bu.edu/reports/Dim+Wes+Mul+Tur+Wel+Con:TIC-2000-LNCS.html[CIL], a compiler for SML based on intersection and union types. | |
27020 | * http://www.cs.cmu.edu/%7Econcert/[ConCert], a project investigating certified code for grid computing. | |
27021 | * http://hcoop.sourceforge.net/[Cooperative Internet hosting tools] | |
27022 | // * http://www.eecs.harvard.edu/%7Estein/[DesynchFS], a programming model and distributed file system for large clusters | |
27023 | * http://www.fantasy-coders.de/projects/gh/[Guugelhupf], a simple search engine. | |
27024 | * http://www.mpi-sws.org/%7Erossberg/hamlet/[HaMLet], a model implementation of Standard ML. | |
27025 | * http://code.google.com/p/kepler-code/[KeplerCode], independent verification of the computational aspects of proofs of the Kepler conjecture and the Dodecahedral conjecture. | |
27026 | * http://www.gilith.com/research/metis/[Metis], a first-order prover (used in the http://hol.sourceforge.net/[HOL4 theorem prover] and the http://isabelle.in.tum.de/[Isabelle theorem prover]). | |
27027 | * http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/mlftpd/[mlftpd], an ftp daemon written in SML. <:TomMurphy:> is also working on http://tom7misc.cvs.sourceforge.net/viewvc/tom7misc/net/[replacements for standard network services] in SML. He also uses MLton to build his entries (http://www.cs.cmu.edu/%7Etom7/icfp2001/[2001], http://www.cs.cmu.edu/%7Etom7/icfp2002/[2002], http://www.cs.cmu.edu/%7Etom7/icfp2004/[2004], http://www.cs.cmu.edu/%7Etom7/icfp2005/[2005]) in the annual ICFP programming contest. | |
27028 | * http://www.informatik.uni-freiburg.de/proglang/research/software/mlope/[MLOPE], an offline partial evaluator for Standard ML. | |
27029 | * http://www.ida.liu.se/%7Epelab/rml/[RML], a system for developing, compiling, debugging, and teaching structural operational semantics (SOS) and natural semantics specifications. | |
27030 | * http://www.macs.hw.ac.uk/ultra/skalpel/index.html[Skalpel], a type-error slicer for SML | |
27031 | // * http://alleystoughton.us/smlnjtrans/[SMLNJtrans], a program for generating SML/NJ transcripts in LaTeX. | |
27032 | * http://www.cs.cmu.edu/%7Etom7/ssapre/[SSA PRE], an implementation of Partial Redundancy Elimination for MLton. | |
27033 | * <:Stabilizers:>, a modular checkpointing abstraction for concurrent functional programs. | |
27034 | * http://ttic.uchicago.edu/%7Epl/sa-sml/[Self-Adjusting SML], self-adjusting computation, a model of computing where programs can automatically adjust to changes to their data. | |
27035 | * http://faculty.ist.unomaha.edu/winter/ShiftLab/TL_web/TL_index.html[TL System], providing general-purpose support for rewrite-based transformation over elements belonging to a (user-defined) domain language. | |
27036 | * http://projects.laas.fr/tina/[Tina] (Time Petri net Analyzer) | |
27037 | * http://www.twelf.org/[Twelf], an implementation of the LF logical framework. | |
27038 | * http://www.cs.indiana.edu/%7Errnewton/wavescope/[WaveScope/WaveScript], a sensor network project; the WaveScript compiler can generate SML (MLton) code. | |
27039 | ||
27040 | == Courses == | |
27041 | ||
27042 | * http://www.eecs.harvard.edu/%7Enr/cs152/[Harvard CS-152], undergraduate programming languages. | |
27043 | * http://www.ia-stud.hiof.no/%7Erolando/PL/[Høgskolen i Østfold IAI30202], programming languages. | |
27044 | ||
27045 | <<< | |
27046 | ||
27047 | :mlton-guide-page: Utilities | |
27048 | [[Utilities]] | |
27049 | Utilities | |
27050 | ========= | |
27051 | ||
27052 | This page is a collection of basic utilities used in the examples on | |
27053 | various pages. See | |
27054 | ||
27055 | * <:InfixingOperators:>, and | |
27056 | * <:ProductType:> | |
27057 | ||
27058 | for longer discussions on some of these utilities. | |
27059 | ||
27060 | [source,sml] | |
27061 | ---- | |
27062 | (* Operator precedence table *) | |
27063 | infix 8 * / div mod (* +1 from Basis Library *) | |
27064 | infix 7 + - ^ (* +1 from Basis Library *) | |
27065 | infixr 6 :: @ (* +1 from Basis Library *) | |
27066 | infix 5 = <> > >= < <= (* +1 from Basis Library *) | |
27067 | infix 4 <\ \> | |
27068 | infixr 4 </ /> | |
27069 | infix 3 o | |
27070 | infix 2 >| | |
27071 | infixr 2 |< | |
27072 | infix 1 := (* -2 from Basis Library *) | |
27073 | infix 0 before & | |
27074 | ||
27075 | (* Some basic combinators *) | |
27076 | fun const x _ = x | |
27077 | fun cross (f, g) (x, y) = (f x, g y) | |
27078 | fun curry f x y = f (x, y) | |
27079 | fun fail e _ = raise e | |
27080 | fun id x = x | |
27081 | ||
27082 | (* Product type *) | |
27083 | datatype ('a, 'b) product = & of 'a * 'b | |
27084 | ||
27085 | (* Sum type *) | |
27086 | datatype ('a, 'b) sum = INL of 'a | INR of 'b | |
27087 | ||
27088 | (* Some type shorthands *) | |
27089 | type 'a uop = 'a -> 'a | |
27090 | type 'a fix = 'a uop -> 'a | |
27091 | type 'a thunk = unit -> 'a | |
27092 | type 'a effect = 'a -> unit | |
27093 | type ('a, 'b) emb = ('a -> 'b) * ('b -> 'a) | |
27094 | ||
27095 | (* Infixing, sectioning, and application operators *) | |
27096 | fun x <\ f = fn y => f (x, y) | |
27097 | fun f \> y = f y | |
27098 | fun f /> y = fn x => f (x, y) | |
27099 | fun x </ f = f x | |
27100 | ||
27101 | (* Piping operators *) | |
27102 | val op>| = op</ | |
27103 | val op|< = op\> | |
27104 | ---- | |
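| |
 | As a brief illustration, the following sketch exercises the sectioning and | |
 | piping operators defined above. It repeats only the declarations it needs | |
 | so that it stands alone; the bindings `seven`, `three`, and `nine` are | |
 | hypothetical names chosen for this example and are not part of the | |
 | utilities. | |
| |
 | [source,sml] | |
 | ---- | |
 | (* Only the fixity declarations and operators this sketch needs *) | |
 | infix 4 <\ \> | |
 | infixr 4 </ /> | |
 | infix 2 >| | |
 | fun x <\ f = fn y => f (x, y) | |
 | fun f \> y = f y | |
 | fun f /> y = fn x => f (x, y) | |
 | fun x </ f = f x | |
 | val op>| = op</ | |
| |
 | (* Left section followed by application: computes op+ (3, 4) *) | |
 | val seven = 3 <\ op+ \> 4 | |
| |
 | (* Right section applied from the left: computes op- (10, 7) *) | |
 | val three = 10 </ op- /> 7 | |
| |
 | (* Piping: pass a value through a chain of functions, left to right *) | |
 | val nine = [1, 2, 3] >| map (fn x => x + 1) >| foldl op+ 0 | |
 | ---- | |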
27105 | ||
27106 | <<< | |
27107 | ||
27108 | :mlton-guide-page: ValueRestriction | |
27109 | [[ValueRestriction]] | |
27110 | ValueRestriction | |
27111 | ================ | |
27112 | ||
27113 | The value restriction is a rule that governs when type inference is | |
27114 | allowed to polymorphically generalize a value declaration. In short, | |
27115 | the value restriction says that generalization can only occur if the | |
27116 | right-hand side of the declaration is syntactically a value. For | |
27117 | example, in | |
27118 | ||
27119 | [source,sml] | |
27120 | ---- | |
27121 | val f = fn x => x | |
27122 | val _ = (f "foo"; f 13) | |
27123 | ---- | |
27124 | ||
27125 | the expression `fn x => x` is syntactically a value, so `f` has | |
27126 | polymorphic type `'a -> 'a` and both calls to `f` type check. On the | |
27127 | other hand, in | |
27128 | ||
27129 | [source,sml] | |
27130 | ---- | |
27131 | val f = let in fn x => x end | |
27132 | val _ = (f "foo"; f 13) | |
27133 | ---- | |
27134 | ||
27135 | the expression `let in fn x => x end` is not syntactically a value | |
27136 | and so `f` can either have type `int -> int` or `string -> string`, | |
27137 | but not `'a -> 'a`. Hence, the program does not type check. | |
27138 | ||
27139 | <:DefinitionOfStandardML:The Definition of Standard ML> spells out | |
27140 | precisely which expressions are syntactic values (it refers to such | |
27141 | expressions as _non-expansive_). An expression is a value if it is of | |
27142 | one of the following forms. | |
27143 | ||
27144 | * a constant (`13`, `"foo"`, `13.0`, ...) | |
27145 | * a variable (`x`, `y`, ...) | |
27146 | * a function (`fn x => e`) | |
27147 | * the application of a constructor other than `ref` to a value (`Foo v`) | |
27148 | * a type constrained value (`v: t`) | |
27149 | * a tuple in which each field is a value (`(v1, v2, ...)`) | |
27150 | * a record in which each field is a value (`{l1 = v1, l2 = v2, ...}`) | |
27151 | * a list in which each element is a value (`[v1, v2, ...]`) | |
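| |
 | To make the list concrete, here is a small sketch in which every | |
 | right-hand side is a syntactic value, so each declaration is generalized | |
 | and can then be used at two different types. The datatype `box`, the | |
 | function `unbox`, and all binding names are hypothetical, introduced only | |
 | for this illustration. | |
| |
 | [source,sml] | |
 | ---- | |
 | datatype 'a box = Box of 'a list | |
 | fun unbox (Box l) = l | |
| |
 | val empty = []                            (* a list of values: 'a list *) | |
 | val boxed = Box []                        (* a constructor applied to a value: 'a box *) | |
 | val pair = ([], fn x => x)                (* a tuple of values: 'a list * ('b -> 'b) *) | |
 | val constrained = (fn x => x) : 'a -> 'a  (* a type-constrained value *) | |
| |
 | (* Each binding is polymorphic, so it can be used at int and at string. *) | |
 | val ints = 1 :: empty | |
 | val strings = "one" :: empty | |
 | val moreInts = 2 :: unbox boxed | |
 | val moreStrings = "two" :: unbox boxed | |
 | val n = #2 pair 13 | |
 | val s = #2 pair "foo" | |
 | val _ = (constrained 13; constrained "foo") | |
 | ---- | |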
27152 | ||
27153 | ||
27154 | == Why the value restriction exists == | |
27155 | ||
27156 | The value restriction prevents a ref cell (or an array) from holding | |
27157 | values of different types, which would allow a value of one type to be | |
27158 | cast to another and hence would break type safety. If the restriction | |
27159 | were not in place, the following program would type check. | |
27160 | ||
27161 | [source,sml] | |
27162 | ---- | |
27163 | val r: 'a option ref = ref NONE | |
27164 | val r1: string option ref = r | |
27165 | val r2: int option ref = r | |
27166 | val () = r1 := SOME "foo" | |
27167 | val v: int = valOf (!r2) | |
27168 | ---- | |
27169 | ||
27170 | The first line violates the value restriction because `ref NONE` is | |
27171 | not a value. All other lines are type correct. By its last line, the | |
27172 | program has cast the string `"foo"` to an integer. This breaks type | |
27173 | safety, because now we can add a string to an integer with an | |
27174 | expression like `v + 13`. We could even be more devious, by adding | |
27175 | the following two lines, which allow us to treat the string `"foo"` | |
27176 | as a function. | |
27177 | ||
27178 | [source,sml] | |
27179 | ---- | |
27180 | val r3: (int -> int) option ref = r | |
27181 | val v: int -> int = valOf (!r3) | |
27182 | ---- | |
27183 | ||
27184 | Eliminating the explicit `ref` does nothing to fix the problem. For | |
27185 | example, we could replace the declaration of `r` with the following. | |
27186 | ||
27187 | [source,sml] | |
27188 | ---- | |
27189 | val f: unit -> 'a option ref = fn () => ref NONE | |
27190 | val r: 'a option ref = f () | |
27191 | ---- | |
27192 | ||
27193 | The declaration of `f` is well typed, while the declaration of `r` | |
27194 | violates the value restriction because `f ()` is not a value. | |
27195 | ||
27196 | ||
27197 | == Unnecessarily rejected programs == | |
27198 | ||
27199 | Unfortunately, the value restriction rejects some programs that could | |
27200 | be accepted. | |
27201 | ||
27202 | [source,sml] | |
27203 | ---- | |
27204 | val id: 'a -> 'a = fn x => x | |
27205 | val f: 'a -> 'a = id id | |
27206 | ---- | |
27207 | ||
27208 | The type constraint on `f` requires `f` to be polymorphic, which is | |
27209 | disallowed because `id id` is not a value. MLton reports the | |
27210 | following type error. | |
27211 | ||
27212 | ---- | |
27213 | Error: z.sml 2.5-2.5. | |
27214 | Type of variable cannot be generalized in expansive declaration: f. | |
27215 | type: ['a] -> ['a] | |
27216 | in: val 'a f: ('a -> 'a) = id id | |
27217 | ---- | |
27218 | ||
27219 | MLton indicates the inability to make `f` polymorphic by saying that | |
27220 | the type of `f` cannot be generalized (made polymorphic) because its | |
27221 | declaration is expansive (not a value). MLton doesn't explicitly | |
27222 | mention the value restriction, but that is the reason. If we leave | |
27223 | the type constraint off of `f` | |
27224 | ||
27225 | [source,sml] | |
27226 | ---- | |
27227 | val id: 'a -> 'a = fn x => x | |
27228 | val f = id id | |
27229 | ---- | |
27230 | ||
27231 | then the program succeeds; however, MLton gives us the following | |
27232 | warning. | |
27233 | ||
27234 | ---- | |
27235 | Warning: z.sml 2.5-2.5. | |
27236 | Type of variable was not inferred and could not be generalized: f. | |
27237 | type: ??? -> ??? | |
27238 | in: val f = id id | |
27239 | ---- | |
27240 | ||
27241 | This warning indicates that MLton couldn't polymorphically generalize | |
27242 | `f`, nor was there enough context using `f` to determine its type. | |
27243 | This in itself is not a type error, but it is a hint that something | |
27244 | is wrong with our program. Using `f` provides enough context to | |
27245 | eliminate the warning. | |
27246 | ||
27247 | [source,sml] | |
27248 | ---- | |
27249 | val id: 'a -> 'a = fn x => x | |
27250 | val f = id id | |
27251 | val _ = f 13 | |
27252 | ---- | |
27253 | ||
27254 | But attempting to use `f` as a polymorphic function will fail. | |
27255 | ||
27256 | [source,sml] | |
27257 | ---- | |
27258 | val id: 'a -> 'a = fn x => x | |
27259 | val f = id id | |
27260 | val _ = f 13 | |
27261 | val _ = f "foo" | |
27262 | ---- | |
27263 | ||
27264 | ---- | |
27265 | Error: z.sml 4.9-4.15. | |
27266 | Function applied to incorrect argument. | |
27267 | expects: [int] | |
27268 | but got: [string] | |
27269 | in: f "foo" | |
27270 | ---- | |
27271 | ||
27272 | ||
27273 | == Alternatives to the value restriction == | |
27274 | ||
27275 | There would be nothing wrong with treating `f` as polymorphic in | |
27276 | ||
27277 | [source,sml] | |
27278 | ---- | |
27279 | val id: 'a -> 'a = fn x => x | |
27280 | val f = id id | |
27281 | ---- | |
27282 | ||
27283 | One might think that the value restriction could be relaxed, and that | |
27284 | only types involving `ref` should be disallowed. Unfortunately, the | |
27285 | following example shows that even the type `'a -> 'a` can cause | |
27286 | problems. If this program were allowed, then we could cast an integer | |
27287 | to a string (or any other type). | |
27288 | ||
27289 | [source,sml] | |
27290 | ---- | |
27291 | val f: 'a -> 'a = | |
27292 | let | |
27293 | val r: 'a option ref = ref NONE | |
27294 | in | |
27295 | fn x => | |
27296 | let | |
27297 | val y = !r | |
27298 | val () = r := SOME x | |
27299 | in | |
27300 | case y of | |
27301 | NONE => x | |
27302 | | SOME y => y | |
27303 | end | |
27304 | end | |
27305 | val _ = f 13 | |
27306 | val _ = f "foo" | |
27307 | ---- | |
27308 | ||
27309 | The previous version of Standard ML took a different approach | |
27310 | (<!Cite(MilnerEtAl90)>, <!Cite(Tofte90)>, <:ImperativeTypeVariable:>) | |
27311 | than the value restriction. It encoded information in the type system | |
27312 | about when ref cells would be created, and used this to prevent a ref | |
27313 | cell from holding multiple types. Although it allowed more programs | |
27314 | to be type checked, this approach had significant drawbacks. First, | |
27315 | it was significantly more complex, both for implementers and for | |
27316 | programmers. Second, it had an unfortunate interaction with | |
27317 | modularity, because information about ref usage was exposed in module | |
27318 | signatures. This either prevented the use of references for | |
27319 | implementing a signature, or required information that one would like | |
27320 | to keep hidden to propagate across modules. | |
27321 | ||
27322 | In the early nineties, Andrew Wright studied about 250,000 lines of | |
27323 | existing SML code and discovered that it did not make significant use | |
27324 | of the extended typing ability, and proposed the value restriction as | |
27325 | a simpler alternative (<!Cite(Wright95)>). This was adopted in the | |
27326 | revised <:DefinitionOfStandardML:Definition>. | |
27327 | ||
27328 | ||
27329 | == Working with the value restriction == | |
27330 | ||
27331 | One technique that works with the value restriction is | |
27332 | <:EtaExpansion:>. We can use eta expansion to make our `id id` | |
27333 | example type check as follows. | |
27334 | ||
27335 | [source,sml] | |
27336 | ---- | |
27337 | val id: 'a -> 'a = fn x => x | |
27338 | val f: 'a -> 'a = fn z => (id id) z | |
27339 | ---- | |
27340 | ||
27341 | This solution means that the computation (in this case `id id`) will | |
27342 | be performed each time `f` is applied, instead of just once when `f` | |
27343 | is declared. In this case, that is not a problem, but it could be if | |
27344 | the declaration of `f` performs substantial computation or creates a | |
27345 | shared data structure. | |
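| |
 | The cost of re-evaluation can be observed with a side effect. In the | |
 | sketch below, `calls` and `mkId` are hypothetical names: `mkId` stands in | |
 | for an arbitrary computation that returns a polymorphic function, and | |
 | `calls` counts how often that computation runs. | |
| |
 | [source,sml] | |
 | ---- | |
 | val calls = ref 0 | |
| |
 | (* Stand-in for an expensive computation that yields a polymorphic function *) | |
 | fun mkId () = (calls := !calls + 1; fn x => x) | |
| |
 | (* Eta expansion makes the right-hand side a value, so f is polymorphic *) | |
 | val f: 'a -> 'a = fn z => mkId () z | |
| |
 | (* The price: mkId () is now evaluated at every application of f *) | |
 | val _ = f 13 | |
 | val _ = f "foo" | |
 | val n = !calls  (* 2, one evaluation of mkId () per call of f *) | |
 | ---- | |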
27346 | ||
27347 | Another technique that sometimes works is to move a monomorphic | |
27348 | computation prior to a (would-be) polymorphic declaration so that the | |
27349 | expression is a value. Consider the following program, which fails | |
27350 | due to the value restriction. | |
27351 | ||
27352 | [source,sml] | |
27353 | ---- | |
27354 | datatype 'a t = A of string | B of 'a | |
27355 | val x: 'a t = A (if true then "yes" else "no") | |
27356 | ---- | |
27357 | ||
27358 | It is easy to rewrite this program as | |
27359 | ||
27360 | [source,sml] | |
27361 | ---- | |
27362 | datatype 'a t = A of string | B of 'a | |
27363 | local | |
27364 | val s = if true then "yes" else "no" | |
27365 | in | |
27366 | val x: 'a t = A s | |
27367 | end | |
27368 | ---- | |
27369 | ||
27370 | The following example (taken from <!Cite(Wright95)>) creates a ref | |
27371 | cell to count the number of times a function is called. | |
27372 | ||
27373 | [source,sml] | |
27374 | ---- | |
27375 | val count: ('a -> 'a) -> ('a -> 'a) * (unit -> int) = | |
27376 | fn f => | |
27377 | let | |
27378 | val r = ref 0 | |
27379 | in | |
27380 | (fn x => (r := 1 + !r; f x), fn () => !r) | |
27381 | end | |
27382 | val id: 'a -> 'a = fn x => x | |
27383 | val (countId: 'a -> 'a, numCalls) = count id | |
27384 | ---- | |
27385 | ||
27386 | The example does not type check, due to the value restriction. | |
27387 | However, it is easy to rewrite the program, staging the ref cell | |
27388 | creation before the polymorphic code. | |
27389 | ||
27390 | [source,sml] | |
27391 | ---- | |
27392 | datatype t = T of int ref | |
27393 | val count1: unit -> t = fn () => T (ref 0) | |
27394 | val count2: t * ('a -> 'a) -> (unit -> int) * ('a -> 'a) = | |
27395 | fn (T r, f) => (fn () => !r, fn x => (r := 1 + !r; f x)) | |
27396 | val id: 'a -> 'a = fn x => x | |
27397 | val t = count1 () | |
27398 | val countId: 'a -> 'a = fn z => #2 (count2 (t, id)) z | |
27399 | val numCalls = #1 (count2 (t, id)) | |
27400 | ---- | |
27401 | ||
27402 | Of course, one can hide the constructor `T` inside a `local` or behind | |
27403 | a signature. | |
27404 | ||
27405 | ||
27406 | == Also see == | |
27407 | ||
27408 | * <:ImperativeTypeVariable:> | |
27409 | ||
27410 | <<< | |
27411 | ||
27412 | :mlton-guide-page: VariableArityPolymorphism | |
27413 | [[VariableArityPolymorphism]] | |
27414 | VariableArityPolymorphism | |
27415 | ========================= | |
27416 | ||
27417 | <:StandardML:Standard ML> programmers often face the problem of how to | |
27418 | provide a variable-arity polymorphic function. For example, suppose | |
27419 | one is defining a combinator library, e.g. for parsing or pickling. | |
27420 | The signature for such a library might look something like the | |
27421 | following. | |
27422 | ||
27423 | [source,sml] | |
27424 | ---- | |
27425 | signature COMBINATOR = | |
27426 | sig | |
27427 | type 'a t | |
27428 | ||
27429 | val int: int t | |
27430 | val real: real t | |
27431 | val string: string t | |
27432 | val unit: unit t | |
27433 | val tuple2: 'a1 t * 'a2 t -> ('a1 * 'a2) t | |
27434 | val tuple3: 'a1 t * 'a2 t * 'a3 t -> ('a1 * 'a2 * 'a3) t | |
27435 | val tuple4: 'a1 t * 'a2 t * 'a3 t * 'a4 t | |
27436 | -> ('a1 * 'a2 * 'a3 * 'a4) t | |
27437 | ... | |
27438 | end | |
27439 | ---- | |
27440 | ||
27441 | The question is how to define a variable-arity tuple combinator. | |
27442 | Traditionally, the only way to take a variable number of arguments in | |
27443 | SML is to put the arguments in a list (or vector) and pass that. So, | |
27444 | one might define a tuple combinator with the following signature. | |
27445 | [source,sml] | |
27446 | ---- | |
27447 | val tupleN: 'a t list -> 'a list t | |
27448 | ---- | |
27449 | ||
27450 | The problem with this approach is that as soon as one places values in | |
27451 | a list, they must all have the same type. So, programmers often take | |
27452 | an alternative approach, and define a family of `tuple<N>` functions, | |
27453 | as we see in the `COMBINATOR` signature above. | |
27454 | ||
27455 | The family-of-functions approach is ugly for many reasons. First, it | |
27456 | clutters the signature with a number of functions when there should | |
27457 | really only be one. Second, it is _closed_, in that there are a fixed | |
27458 | number of tuple combinators in the interface, and should a client need | |
27459 | a combinator for a large tuple, he is out of luck. Third, this | |
27460 | approach often requires a lot of duplicate code in the implementation | |
27461 | of the combinators. | |
27462 | ||
27463 | Fortunately, using <:Fold01N:> and <:ProductType:products>, one can | |
27464 | provide an interface and implementation that solves all these | |
27465 | problems. Here is a simple pickling module that converts values to | |
27466 | strings. | |
27467 | [source,sml] | |
27468 | ---- | |
27469 | structure Pickler = | |
27470 | struct | |
27471 | type 'a t = 'a -> string | |
27472 | ||
27473 | val unit = fn () => "" | |
27474 | ||
27475 | val int = Int.toString | |
27476 | ||
27477 | val real = Real.toString | |
27478 | ||
27479 | val string = id | |
27480 | ||
27481 | type 'a accum = 'a * string list -> string list | |
27482 | ||
27483 | val tuple = | |
27484 | fn z => | |
27485 | Fold01N.fold | |
27486 | {finish = fn ps => fn x => concat (rev (ps (x, []))), | |
27487 | start = fn p => fn (x, l) => p x :: l, | |
27488 | zero = unit} | |
27489 | z | |
27490 | ||
27491 | val ` = | |
27492 | fn z => | |
27493 | Fold01N.step1 | |
27494 | {combine = (fn (p, p') => fn (x & x', l) => p' x' :: "," :: p (x, l))} | |
27495 | z | |
27496 | end | |
27497 | ---- | |
27498 | ||
27499 | If one has `n` picklers of types | |
27500 | [source,sml] | |
27501 | ---- | |
27502 | val p1: a1 Pickler.t | |
27503 | val p2: a2 Pickler.t | |
27504 | ... | |
27505 | val pn: an Pickler.t | |
27506 | ---- | |
27507 | then one can construct a pickler for n-ary products as follows. | |
27508 | [source,sml] | |
27509 | ---- | |
27510 | tuple `p1 `p2 ... `pn $ : (a1 & a2 & ... & an) Pickler.t | |
27511 | ---- | |
27512 | ||
27513 | For example, with `Pickler` in scope, one can prove the following | |
27514 | equations. | |
27515 | [source,sml] | |
27516 | ---- | |
27517 | "" = tuple $ () | |
27518 | "1" = tuple `int $ 1 | |
27519 | "1,2.0" = tuple `int `real $ (1 & 2.0) | |
27520 | "1,2.0,three" = tuple `int `real `string $ (1 & 2.0 & "three") | |
27521 | ---- | |
27522 | ||
27523 | Here is the signature for `Pickler`. It shows why the `accum` type is | |
27524 | useful. | |
27525 | [source,sml] | |
27526 | ---- | |
27527 | signature PICKLER = | |
27528 | sig | |
27529 | type 'a t | |
27530 | ||
27531 | val int: int t | |
27532 | val real: real t | |
27533 | val string: string t | |
27534 | val unit: unit t | |
27535 | ||
27536 | type 'a accum | |
27537 | val ` : ('a accum, 'b t, ('a, 'b) prod accum, | |
27538 | 'z1, 'z2, 'z3, 'z4, 'z5, 'z6, 'z7) Fold01N.step1 | |
27539 | val tuple: ('a t, 'a accum, 'b accum, 'b t, unit t, | |
27540 | 'z1, 'z2, 'z3, 'z4, 'z5) Fold01N.t | |
27541 | end | |
27542 | ||
27543 | structure Pickler: PICKLER = Pickler | |
27544 | ---- | |
27545 | ||
27546 | <<< | |
27547 | ||
27548 | :mlton-guide-page: Variant | |
27549 | [[Variant]] | |
27550 | Variant | |
27551 | ======= | |
27552 | ||
27553 | A _variant_ is an arm of a datatype declaration. For example, the | |
27554 | datatype | |
27555 | ||
27556 | [source,sml] | |
27557 | ---- | |
27558 | datatype t = A | B of int | C of real | |
27559 | ---- | |
27560 | ||
27561 | has three variants: `A`, `B`, and `C`. | |
27562 | ||
27563 | <<< | |
27564 | ||
27565 | :mlton-guide-page: VesaKarvonen | |
27566 | [[VesaKarvonen]] | |
27567 | VesaKarvonen | |
27568 | ============ | |
27569 | ||
27570 | Vesa Karvonen is a student at the http://www.cs.helsinki.fi/index.en.html[University of Helsinki]. | |
27571 | His interests lie in programming techniques that allow complex programs to be expressed | |
27572 | clearly and concisely and the design and implementation of programming languages. | |
27573 | ||
27574 | image::VesaKarvonen.attachments/vesa-in-mlton-t-shirt.jpg[align="center"] | |
27575 | ||
27576 | Things he'd like to see for SML and hopes to be able to contribute towards: | |
27577 | ||
27578 | * A practical tool for documenting libraries. Preferably one that is | |
27579 | based on extracting the documentation from source code comments. | |
27580 | ||
27581 | * A good IDE. Possibly an enhanced SML mode (`esml-mode`) for Emacs. | |
27582 | Google for http://www.google.com/search?&q=SLIME+video[SLIME video] to | |
27583 | get an idea of what he'd like to see. Some specific notes: | |
27584 | + | |
27585 | -- | |
27586 | * show type at point | |
27587 | * robust, consistent indentation | |
27588 | * show documentation | |
27589 | * jump to definition (see <:EmacsDefUseMode:>) | |
27590 | -- | |
27591 | + | |
27592 | <:EmacsBgBuildMode:> has also been written for working with MLton. | |
27593 | ||
27594 | * Documented and cataloged libraries. Perhaps something like | |
27595 | http://www.boost.org[Boost], but for SML libraries. Here is a partial | |
27596 | list of libraries, tools, and frameworks Vesa is or has been working | |
27597 | on: | |
27598 | + | |
27599 | -- | |
27600 | * Asynchronous Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/async/unstable/README)>) | |
27601 | * Extended Basis Library (<!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>) | |
27602 | * Generic Programming Library (<!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) | |
27603 | * Pretty Printing Library (<!ViewGitFile(mltonlib,master,com/ssh/prettier/unstable/README)>) | |
27604 | * Random Generator Library (<!ViewGitFile(mltonlib,master,com/ssh/random/unstable/README)>) | |
27605 | * RPC (Remote Procedure Call) Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/rpc-lib/unstable/README)>) | |
27606 | * http://www.libsdl.org/[SDL] Binding (<!ViewGitFile(mltonlib,master,org/mlton/vesak/sdl/unstable/README)>) | |
27607 | * Unit Testing Library (<!ViewGitFile(mltonlib,master,com/ssh/unit-test/unstable/README)>) | |
27608 | * Use Library (<!ViewGitFile(mltonlib,master,org/mlton/vesak/use-lib/unstable/README)>) | |
27609 | * Windows Library (<!ViewGitFile(mltonlib,master,com/ssh/windows/unstable/README)>) | |
27610 | -- | |
27611 | Note that most of these libraries have been ported to several <:StandardMLImplementations:SML implementations>. | |
27612 | ||
27613 | <<< | |
27614 | ||
27615 | :mlton-guide-page: WarnUnusedAnomalies | |
27616 | [[WarnUnusedAnomalies]] | |
27617 | WarnUnusedAnomalies | |
27618 | =================== | |
27619 | ||
27620 | The `warnUnused` <:MLBasisAnnotations:MLBasis annotation> can be used | |
27621 | to report unused identifiers. This can be useful for catching bugs | |
27622 | and for code maintenance (e.g., eliminating dead code). However, the | |
27623 | `warnUnused` annotation can sometimes behave in counter-intuitive | |
27624 | ways. This page gives some of the anomalies that have been reported. | |
27625 | ||
27626 | * Functions whose only uses are recursive uses within their bodies are | |
27627 | not warned as unused: | |
27628 | + | |
27629 | [source,sml] | |
27630 | ---- | |
27631 | local | |
27632 | fun foo () = foo () : unit | |
27633 | val bar = let fun baz () = baz () : unit in baz end | |
27634 | in | |
27635 | end | |
27636 | ---- | |
27637 | + | |
27638 | ---- | |
27639 | Warning: z.sml 3.5. | |
27640 | Unused variable: bar. | |
27641 | ---- | |
27642 | ||
27643 | * Components of an actual functor argument that are necessary to match | |
27644 | the functor argument signature but are unused in the body of the | |
27645 | functor are warned as unused: | |
27646 | + | |
27647 | [source,sml] | |
27648 | ---- | |
27649 | functor Warning (type t val x : t) = struct | |
27650 | val y = x | |
27651 | end | |
27652 | structure X = Warning (type t = int val x = 1) | |
27653 | ---- | |
27654 | + | |
27655 | ---- | |
27656 | Warning: z.sml 4.29. | |
27657 | Unused type: t. | |
27658 | ---- | |
27659 | ||
27660 | ||
27661 | * No component of a functor result is warned as unused. In the | |
27662 | following, the only uses of `f2` are to match the functor argument | |
27663 | signatures of `functor G` and `functor H` and there are no uses of | |
27664 | `z`: | |
27665 | + | |
27666 | [source,sml] | |
27667 | ---- | |
27668 | functor F(structure X : sig type t end) = struct | |
27669 | type t = X.t | |
27670 | fun f1 (_ : X.t) = () | |
27671 | fun f2 (_ : X.t) = () | |
27672 | val z = () | |
27673 | end | |
27674 | functor G(structure Y : sig | |
27675 | type t | |
27676 | val f1 : t -> unit | |
27677 | val f2 : t -> unit | |
27678 | val z : unit | |
27679 | end) = struct | |
27680 | fun g (x : Y.t) = Y.f1 x | |
27681 | end | |
27682 | functor H(structure Y : sig | |
27683 | type t | |
27684 | val f1 : t -> unit | |
27685 | val f2 : t -> unit | |
27686 | val z : unit | |
27687 | end) = struct | |
27688 | fun h (x : Y.t) = Y.f1 x | |
27689 | end | |
27690 | functor Z() = struct | |
27691 | structure S = F(structure X = struct type t = unit end) | |
27692 | structure SG = G(structure Y = S) | |
27693 | structure SH = H(structure Y = S) | |
27694 | end | |
27695 | structure U = Z() | |
27696 | val _ = U.SG.g () | |
27697 | val _ = U.SH.h () | |
27698 | ---- | |
27699 | + | |
27700 | ---- | |
27701 | ---- | |
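
To reproduce the warnings shown on this page, the annotation has to be enabled for the file being checked. A minimal sketch of an MLB file that does so follows; the file name `z.sml` is taken from the warnings above, while the idea of placing it in a file called `z.mlb` and compiling that file with `mlton z.mlb` is an assumption made for illustration.

----
$(SML_LIB)/basis/basis.mlb
ann "warnUnused true" in
  z.sml
end
----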
27702 | ||
27703 | <<< | |
27704 | ||
27705 | :mlton-guide-page: WesleyTerpstra | |
27706 | [[WesleyTerpstra]] | |
27707 | WesleyTerpstra | |
27708 | ============== | |
27709 | ||
27710 | Wesley W. Terpstra is a PhD student at the Technische Universität Darmstadt (Germany). | |
27711 | ||
27712 | Research interests | |
27713 | ||
27714 | * Distributed systems (P2P) | |
27715 | * Number theory (Error-correcting codes) | |
27716 | ||
27717 | My interest in SML is centered on the fact that the language is able to directly express ideas from number theory which are important for my work. Modules and Functors seem to be a very natural basis for implementing many algebraic structures. MLton provides an ideal platform for actual implementation as it is fast and has unboxed words. | |
27718 | ||
27719 | Things I would like from MLton in the future: | |
27720 | ||
27721 | * Some better optimization of mathematical expressions | |
27722 | * IPv6 and multicast support | |
27723 | * A complete GUI toolkit like mGTK | |
27724 | * More supported platforms so that applications written under MLton have a wider audience | |
27725 | ||
27726 | <<< | |
27727 | ||
27728 | :mlton-guide-page: WholeProgramOptimization | |
27729 | [[WholeProgramOptimization]] | |
27730 | WholeProgramOptimization | |
27731 | ======================== | |
27732 | ||
27733 | Whole-program optimization is a compilation technique in which | |
27734 | optimizations operate over the entire program. This allows the | |
27735 | compiler many optimization opportunities that are not available when | |
27736 | analyzing modules separately (as with separate compilation). | |
27737 | ||
27738 | Most of MLton's optimizations are whole-program optimizations. | |
27739 | Because MLton compiles the whole program at once, it can perform | |
27740 | optimization across module boundaries. As a consequence, MLton often | |
27741 | reduces or eliminates the run-time penalty that arises with separate | |
27742 | compilation of SML features such as functors, modules, polymorphism, | |
27743 | and higher-order functions. MLton takes advantage of having the | |
27744 | entire program to perform transformations such as: defunctorization, | |
27745 | monomorphisation, higher-order control-flow analysis, inlining, | |
27746 | unboxing, argument flattening, redundant-argument removal, constant | |
27747 | folding, and representation selection. Whole-program compilation is | |
27748 | an integral part of the design of MLton and is not likely to change. | |
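
As an illustration of the kind of source code these transformations target, the following small program combines a functor, a polymorphic function, and a higher-order function. It is only a sketch: the names `ORD`, `Max`, `IntMax`, and `twice` are invented for this example, and nothing here shows what MLton actually emits.

[source,sml]
----
(* A signature and functor: defunctorization would replace the functor
   application below with a specialized copy of the functor body. *)
signature ORD = sig type t val lt : t * t -> bool end

functor Max (O : ORD) = struct
  fun max (x, y) = if O.lt (x, y) then y else x
end

structure IntMax = Max (struct type t = int val lt = Int.< end)

(* Polymorphic and higher-order: used here only at type int. *)
fun twice f x = f (f x)

val n = twice (fn k => IntMax.max (k, 10)) 3
val () = print (Int.toString n ^ "\n")
----

With the whole program in view, a compiler can see that `twice` is only used at type `int` and that its function argument is a single known lambda, which is the situation monomorphisation, control-flow analysis, and inlining are designed to exploit.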
27749 | ||
27750 | <<< | |
27751 | ||
27752 | :mlton-guide-page: WishList | |
27753 | [[WishList]] | |
27754 | WishList | |
27755 | ======== | |
27756 | ||
27757 | This page is mainly for recording recurring feature requests. If you | |
27758 | have a new feature request, you probably want to query interest on one | |
27759 | of the <:Contact:mailing lists> first. | |
27760 | ||
27761 | Please be aware of MLton's policy on | |
27762 | <:LanguageChanges:language changes>. Nonetheless, we hope to provide | |
27763 | support for some of the "immediate" <:SuccessorML:> proposals in a | |
27764 | future release. | |
27765 | ||
27766 | ||
27767 | == Support for link options in ML Basis files == | |
27768 | ||
27769 | Introduce a mechanism to specify link options in <:MLBasis:ML Basis> | |
27770 | files. For example, generalizing a bit, an ML Basis declaration of the | |
27771 | form | |
27772 | ||
27773 | ---- | |
27774 | option "option" | |
27775 | ---- | |
27776 | ||
27777 | could be introduced whose semantics would be the same (as closely as | |
27778 | possible) as if the option string were specified on the compiler | |
27779 | command line. | |
27780 | ||
27781 | The main motivation for this is that a MLton library that would | |
27782 | introduce bindings (through <:ForeignFunctionInterface:FFI>) to an | |
27783 | external library could be packaged conveniently as a single MLB file. | |
27784 | For example, to link with library `foo` the MLB file would simply | |
27785 | contain: | |
27786 | ||
27787 | ---- | |
27788 | option "-link-opt -lfoo" | |
27789 | ---- | |
27790 | ||
27791 | Similar feature requests have been discussed previously on the mailing lists: | |
27792 | ||
27793 | * http://www.mlton.org/pipermail/mlton/2004-July/025553.html | |
27794 | * http://www.mlton.org/pipermail/mlton/2005-January/026648.html | |
27795 | ||
27796 | <<< | |
27797 | ||
27798 | :mlton-guide-page: XML | |
27799 | [[XML]] | |
27800 | XML | |
27801 | === | |
27802 | ||
27803 | <:XML:> is an <:IntermediateLanguage:>, translated from <:CoreML:> by | |
27804 | <:Defunctorize:>, optimized by <:XMLSimplify:>, and translated by | |
27805 | <:Monomorphise:> to <:SXML:>. | |
27806 | ||
27807 | == Description == | |
27808 | ||
27809 | <:XML:> is polymorphic, higher-order, with flat patterns. Every | |
27810 | <:XML:> expression is annotated with its type. Polymorphic | |
27811 | generalization is made explicit through type variables annotating | |
27812 | `val` and `fun` declarations. Polymorphic instantiation is made | |
27813 | explicit by specifying type arguments at variable references. <:XML:> | |
27814 | patterns cannot be nested and cannot contain wildcards, constraints, | |
27815 | flexible records, or layering. | |
27816 | ||
27817 | == Implementation == | |
27818 | ||
27819 | * <!ViewGitFile(mlton,master,mlton/xml/xml.sig)> | |
27820 | * <!ViewGitFile(mlton,master,mlton/xml/xml.fun)> | |
27821 | * <!ViewGitFile(mlton,master,mlton/xml/xml-tree.sig)> | |
27822 | * <!ViewGitFile(mlton,master,mlton/xml/xml-tree.fun)> | |
27823 | ||
27824 | == Type Checking == | |
27825 | ||
27826 | <:XML:> also has a type checker, used for debugging. At present, the | |
27827 | type checker is also the best specification of the type system of | |
27828 | <:XML:>. If you need more details, the type checker | |
27829 | (<!ViewGitFile(mlton,master,mlton/xml/type-check.sig)>, | |
27830 | <!ViewGitFile(mlton,master,mlton/xml/type-check.fun)>) is pretty short. | |
27831 | ||
27832 | Since the type checker does not affect the output of the compiler | |
27833 | (unless it reports an error), it can be turned off. The type checker | |
27834 | recursively descends the program, checking that the type annotating | |
27835 | each node is the same as the type synthesized from the types of the | |
27836 | expression's subnodes. | |
27837 | ||
27838 | == Details and Notes == | |
27839 | ||
27840 | <:XML:> uses the same atoms as <:CoreML:>, hence all identifiers | |
27841 | (constructors, variables, etc.) are unique and can have properties | |
27842 | attached to them. Finally, <:XML:> has a simplifier (<:XMLShrink:>), | |
27843 | which implements a reduction system. | |
27844 | ||
27845 | === Types === | |
27846 | ||
27847 | <:XML:> types are either type variables or applications of n-ary type | |
27848 | constructors. There are many utility functions for constructing and | |
27849 | destructing types involving built-in type constructors. | |
27850 | ||
27851 | A type scheme binds a list of type variables in a type. The only | |
27852 | interesting operation on type schemes is the application of a type | |
27853 | scheme to a list of types, which performs a simultaneous substitution | |
27854 | of the type arguments for the bound type variables of the scheme. For | |
27855 | the purposes of type checking, it is necessary to know the type scheme | |
27856 | of variables, constructors, and primitives. This is done by | |
27857 | associating the scheme with the identifier using its property list. | |
27858 | This approach is used instead of the more traditional environment | |
27859 | approach for reasons of speed. | |
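
Applying a type scheme can be sketched as follows. This is a simplified model and not MLton's code: the datatype `ty`, the record type `scheme`, and the function `apply` are hypothetical, and type variables and constructors are represented by plain strings rather than by identifiers with property lists.

[source,sml]
----
(* Types are type variables or applications of n-ary type constructors. *)
datatype ty =
    Var of string
  | Con of string * ty list

(* A scheme binds a list of type variables in a type. *)
type scheme = {tyvars : string list, ty : ty}

(* Apply a scheme to a list of types.  The substitution is simultaneous:
   the arguments themselves are never substituted into. *)
fun apply ({tyvars, ty} : scheme, args : ty list) : ty =
  let
    (* zipEq raises UnequalLengths when the arity is wrong. *)
    val env = ListPair.zipEq (tyvars, args)
    fun subst (Var a) =
          (case List.find (fn (b, _) => a = b) env of
              SOME (_, t) => t
            | NONE => Var a)
      | subst (Con (c, ts)) = Con (c, List.map subst ts)
  in
    subst ty
  end

(* Instantiating 'a list at int. *)
val intList =
  apply ({tyvars = ["'a"], ty = Con ("list", [Var "'a"])}, [Con ("int", [])])
----

Because the substitution is simultaneous, applying a scheme over `'a` and `'b` to the arguments `'b` and `'a` swaps the two variables instead of collapsing them into one.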
27860 | ||
27861 | === XmlTree === | |
27862 | ||
27863 | Before defining `XML`, the signature for language <:XML:>, we need to | |
27864 | define an auxiliary signature `XML_TREE`, that contains the datatype | |
27865 | declarations for the expression trees of <:XML:>. This is done solely | |
27866 | for the purpose of modularity -- it allows the simplifier and type | |
27867 | checker to be defined by separate functors (which take a structure | |
27868 | matching `XML_TREE`). Then, `XML` is defined as the signature for a | |
27869 | module containing the expression trees, the simplifier, and the type | |
27870 | checker. | |
27871 | ||
27872 | Both constructors and variables can have type schemes, hence both | |
27873 | constructor and variable references specify the instance of the scheme | |
27874 | at the point of reference. An instance is specified with a vector of | |
27875 | types, which corresponds to the type variables in the scheme. | |
27876 | ||
27877 | <:XML:> patterns are flat (i.e. not nested). A pattern is a | |
27878 | constructor with an optional argument variable. Patterns only occur | |
27879 | in `case` expressions. To evaluate a case expression, compare the | |
27880 | test value sequentially against each pattern. For the first pattern | |
27881 | that matches, destruct the value if necessary to bind the pattern | |
27882 | variables and evaluate the corresponding expression. If no pattern | |
27883 | matches, evaluate the default. All patterns of a case expression are | |
27884 | of the same variant of `Pat.t`, although this is not enforced by ML's | |
27885 | type system. The type checker, however, does enforce this. Because | |
27886 | tuple patterns are irrefutable, there will only ever be one tuple | |
27887 | pattern in a case expression and there will be no default. | |
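
The evaluation rule for constructor patterns described above can be sketched with a toy model. The types `value` and `pat` and the function `select` are hypothetical and much simpler than the real `Pat.t`; tuple patterns are left out.

[source,sml]
----
(* A constructed value: a constructor name and an optional argument. *)
datatype value = Con of string * value option

(* A flat pattern: a constructor and an optional argument variable. *)
type pat = {con : string, arg : string option}

(* Compare the test value against each pattern in order.  Returns the
   index of the first matching rule and the bindings it introduces, or
   NONE when no pattern matches and the default should be evaluated. *)
fun select (Con (c, v) : value, rules : pat list) =
  let
    fun loop (_, []) = NONE
      | loop (i, {con, arg} :: rest) =
          if con = c
            then
              let
                (* Destruct the value only if the pattern names its argument. *)
                val binds =
                  case (arg, v) of
                     (SOME x, SOME v') => [(x, v')]
                   | _ => []
              in
                SOME (i, binds)
              end
          else loop (i + 1, rest)
  in
    loop (0, rules)
  end
----

For example, matching the value `B (A)` against the rules `A` and `B x` selects the second rule (index 1) and binds `x` to `A`.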
27888 | ||
27889 | <:XML:> contains value, exception, and mutually recursive function | |
27890 | declarations. There are no free type variables in <:XML:>. All type | |
27891 | variables are explicitly bound at either a value or function | |
27892 | declaration. At some point in the future, exception declarations may | |
27893 | go away, and exceptions may be represented with a single datatype | |
27894 | containing a `unit ref` component to implement genericity. | |
27895 | ||
27896 | <:XML:> expressions are like those of <:CoreML:>, with the following | |
27897 | exceptions. There are no record expressions. After type inference, | |
27898 | all records (some of which may have originally been tuples in the | |
27899 | source) are converted to tuples, because once flexible record patterns | |
27900 | have been resolved, tuple labels are superfluous. Tuple components | |
27901 | are ordered based on the field ordering relation. <:XML:> eta expands | |
27902 | primitives and constructors so that they are always fully applied. | |
27903 | Hence, the only kind of value of arrow type is a lambda. This | |
27904 | property is useful for flow analysis and later in code generation. | |
27905 | ||
27906 | An <:XML:> program is a list of toplevel datatype declarations and a | |
27907 | body expression. Because datatype declarations are not generative, | |
27908 | the defunctorizer can safely move them to toplevel. | |
27909 | ||
27910 | <<< | |
27911 | ||
27912 | :mlton-guide-page: XMLShrink | |
27913 | [[XMLShrink]] | |
27914 | XMLShrink | |
27915 | ========= | |
27916 | ||
27917 | XMLShrink is an optimization pass for the <:XML:> | |
27918 | <:IntermediateLanguage:>, invoked from <:XMLSimplify:>. | |
27919 | ||
27920 | == Description == | |
27921 | ||
27922 | This pass performs optimizations based on a reduction system. | |
27923 | ||
27924 | == Implementation == | |
27925 | ||
27926 | * <!ViewGitFile(mlton,master,mlton/xml/shrink.sig)> | |
27927 | * <!ViewGitFile(mlton,master,mlton/xml/shrink.fun)> | |
27928 | ||
27929 | == Details and Notes == | |
27930 | ||
27931 | The simplifier is based on <!Cite(AppelJim97, Shrinking Lambda | |
27932 | Expressions in Linear Time)>. | |
27933 | ||
27934 | The source program may contain functions that are only called once, or | |
27935 | not even called at all. Match compilation introduces many such | |
27936 | functions. In order to reduce the program size, speed up later | |
27937 | phases, and improve the flow analysis, a source-to-source simplifier | |
27938 | is run on <:XML:> after type inference and match compilation. | |
27939 | ||
27940 | The simplifier implements the reductions shown below. The reductions | |
27941 | eliminate unnecessary declarations (see the side constraint in the | |
27942 | figure), applications where the function is immediate, and case | |
27943 | statements where the test is immediate. Declarations can be | |
27944 | eliminated only when the expression is nonexpansive (see Section 4.7 | |
27945 | of the <:DefinitionOfStandardML: Definition>), which is a syntactic | |
27946 | condition that ensures that the expression has no effects | |
27947 | (assignments, raises, or nontermination). The reductions on case | |
27948 | statements do not show the other irrelevant cases that may exist. The | |
27949 | reductions were chosen so that they were strongly normalizing and so | |
27950 | that they never increased tree size. | |
27951 | ||
27952 | * {empty} | |
27953 | + | |
27954 | -- | |
27955 | [source,sml] | |
27956 | ---- | |
27957 | let x = e1 in e2 | |
27958 | ---- | |
27959 | ||
27960 | reduces to | |
27961 | ||
27962 | [source,sml] | |
27963 | ---- | |
27964 | e2 [x -> e1] | |
27965 | ---- | |
27966 | ||
27967 | if `e1` is a constant or variable or if `e1` is nonexpansive and `x` occurs zero or one time in `e2` | |
27968 | -- | |
27969 | ||
27970 | * {empty} | |
27971 | + | |
27972 | -- | |
27973 | [source,sml] | |
27974 | ---- | |
27975 | (fn x => e1) e2 | |
27976 | ---- | |
27977 | ||
27978 | reduces to | |
27979 | ||
27980 | [source,sml] | |
27981 | ---- | |
27982 | let x = e2 in e1 | |
27983 | ---- | |
27984 | -- | |
27985 | ||
27986 | * {empty} | |
27987 | + | |
27988 | -- | |
27989 | [source,sml] | |
27990 | ---- | |
27991 | e1 handle e2 | |
27992 | ---- | |
27993 | ||
27994 | reduces to | |
27995 | ||
27996 | [source,sml] | |
27997 | ---- | |
27998 | e1 | |
27999 | ---- | |
28000 | ||
28001 | if `e1` is nonexpansive | |
28002 | -- | |
28003 | ||
28004 | * {empty} | |
28005 | + | |
28006 | -- | |
28007 | [source,sml] | |
28008 | ---- | |
28009 | case let d in e end of p1 => e1 ... | |
28010 | ---- | |
28011 | ||
28012 | reduces to | |
28013 | ||
28014 | [source,sml] | |
28015 | ---- | |
28016 | let d in case e of p1 => e1 ... end | |
28017 | ---- | |
28018 | -- | |
28019 | ||
28020 | * {empty} | |
28021 | + | |
28022 | -- | |
28023 | [source,sml] | |
28024 | ---- | |
28025 | case C e1 of C x => e2 | |
28026 | ---- | |
28027 | ||
28028 | reduces to | |
28029 | ||
28030 | [source,sml] | |
28031 | ---- | |
28032 | let x = e1 in e2 | |
28033 | ---- | |
28034 | -- | |
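
The first two reductions can be sketched on a toy lambda language. This is a naive illustration and not the linear-time algorithm of the cited paper or MLton's implementation: the datatype `exp` and the functions `count`, `nonexpansive`, `subst`, and `shrink` are hypothetical, the `handle` and `case` reductions are omitted, and bound variable names are assumed to be unique so that substitution cannot capture.

[source,sml]
----
datatype exp =
    Const of int
  | Var of string
  | Lam of string * exp
  | App of exp * exp
  | Let of string * exp * exp

(* Number of free occurrences of x in e. *)
fun count x e =
  case e of
     Const _ => 0
   | Var y => if x = y then 1 else 0
   | Lam (y, b) => if x = y then 0 else count x b
   | App (f, a) => count x f + count x a
   | Let (y, d, b) => count x d + (if x = y then 0 else count x b)

(* Syntactic values only: a conservative stand-in for "nonexpansive". *)
fun nonexpansive e =
  case e of Const _ => true | Var _ => true | Lam _ => true | _ => false

(* e [x -> e1], assuming unique bound names. *)
fun subst (x, e1) e =
  case e of
     Const _ => e
   | Var y => if x = y then e1 else e
   | Lam (y, b) => if x = y then e else Lam (y, subst (x, e1) b)
   | App (f, a) => App (subst (x, e1) f, subst (x, e1) a)
   | Let (y, d, b) =>
        Let (y, subst (x, e1) d, if x = y then b else subst (x, e1) b)

fun shrink e =
  case e of
     App (f, a) =>
        (case (shrink f, shrink a) of
            (* (fn x => e1) e2  reduces to  let x = e2 in e1 *)
            (Lam (x, b), a') => shrink (Let (x, a', b))
          | (f', a') => App (f', a'))
   | Let (x, e1, e2) =>
        let
          val e1' = shrink e1
          val e2' = shrink e2
          val atom = case e1' of Const _ => true | Var _ => true | _ => false
        in
          (* let x = e1 in e2  reduces to  e2 [x -> e1]  under the side
             condition stated in the text. *)
          if atom orelse (nonexpansive e1' andalso count x e2' <= 1)
            then shrink (subst (x, e1') e2')
          else Let (x, e1', e2')
        end
   | Lam (x, b) => Lam (x, shrink b)
   | _ => e
----

In this model, `shrink (App (Lam ("x", Var "x"), Const 1))` first becomes a `Let` and then reduces to `Const 1`, while a `Let` whose bound expression is an application is left alone because an application is not nonexpansive.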
28035 | ||
28036 | <<< | |
28037 | ||
28038 | :mlton-guide-page: XMLSimplify | |
28039 | [[XMLSimplify]] | |
28040 | XMLSimplify | |
28041 | =========== | |
28042 | ||
28043 | The optimization passes for the <:XML:> <:IntermediateLanguage:> are | |
28044 | collected and controlled by the `XmlSimplify` functor | |
28045 | (<!ViewGitFile(mlton,master,mlton/xml/xml-simplify.sig)>, | |
28046 | <!ViewGitFile(mlton,master,mlton/xml/xml-simplify.fun)>). | |
28047 | ||
28048 | The following optimization passes are implemented: | |
28049 | ||
28050 | * <:XMLSimplifyTypes:> | |
28051 | * <:XMLShrink:> | |
28052 | ||
28053 | The optimization passes can be controlled from the command-line by the options | |
28054 | ||
28055 | * `-diag-pass <pass>` -- keep diagnostic info for pass | |
28056 | * `-disable-pass <pass>` -- skip optimization pass (if normally performed) | |
28057 | * `-enable-pass <pass>` -- perform optimization pass (if normally skipped) | |
28058 | * `-keep-pass <pass>` -- keep the results of pass | |
28059 | * `-xml-passes <passes>` -- xml optimization passes | |
28060 | ||
28061 | <<< | |
28062 | ||
28063 | :mlton-guide-page: XMLSimplifyTypes | |
28064 | [[XMLSimplifyTypes]] | |
28065 | XMLSimplifyTypes | |
28066 | ================ | |
28067 | ||
28068 | <:XMLSimplifyTypes:> is an optimization pass for the <:XML:> | |
28069 | <:IntermediateLanguage:>, invoked from <:XMLSimplify:>. | |
28070 | ||
28071 | == Description == | |
28072 | ||
28073 | This pass simplifies types in an <:XML:> program, eliminating all | |
28074 | unused type arguments. | |
28075 | ||
28076 | == Implementation == | |
28077 | ||
28078 | * <!ViewGitFile(mlton,master,mlton/xml/simplify-types.sig)> | |
28079 | * <!ViewGitFile(mlton,master,mlton/xml/simplify-types.fun)> | |
28080 | ||
28081 | == Details and Notes == | |
28082 | ||
28083 | It first computes a simple fixpoint on all the `datatype` declarations | |
28084 | to determine which `datatype` `tycon` args are actually used. Then it | |
28085 | does a single pass over the program to determine which polymorphic | |
28086 | declaration type variables are used, and rewrites types to eliminate | |
28087 | unused type arguments. | |
28088 | ||
28089 | This pass should eliminate any spurious duplication that the | |
28090 | <:Monomorphise:> pass might perform due to phantom types. | |
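
A small source-level example of a phantom type shows the kind of type argument this pass is meant to remove. The names `t`, `T`, `get`, `a`, and `b` are made up for illustration, and the remark about duplication describes what could happen in principle rather than observed compiler output.

[source,sml]
----
(* 'a is a phantom: it never appears on the right-hand side. *)
datatype 'a t = T of int

fun get (T n) = n

(* Two instances that differ only in the unused type argument. *)
val a : bool t = T 1
val b : string t = T 2

val sum = get a + get b
----

Without simplification, monomorphising this program could produce separate copies of `t` and `get` for `bool` and `string`, even though both copies would be identical. Once the unused argument is eliminated, a single copy suffices.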
28091 | ||
28092 | <<< | |
28093 | ||
28094 | :mlton-guide-page: Zone | |
28095 | [[Zone]] | |
28096 | Zone | |
28097 | ==== | |
28098 | ||
28099 | <:Zone:> is an optimization pass for the <:SSA2:> | |
28100 | <:IntermediateLanguage:>, invoked from <:SSA2Simplify:>. | |
28101 | ||
28102 | == Description == | |
28103 | ||
28104 | This pass breaks large <:SSA2:> functions into zones, which are | |
28105 | connected subgraphs of the dominator tree. For each zone, at the node | |
28106 | that dominates the zone (the "zone root"), it places a tuple | |
28107 | collecting all of the live variables at that node. It replaces any | |
28108 | variables used in that zone with offsets from the tuple. The goal is | |
28109 | to decrease the liveness information in large <:SSA:> functions. | |
28110 | ||
28111 | == Implementation == | |
28112 | ||
28113 | * <!ViewGitFile(mlton,master,mlton/ssa/zone.fun)> | |
28114 | ||
28115 | == Details and Notes == | |
28116 | ||
28117 | Compute strongly-connected components to avoid putting tuple constructions | |
28118 | in loops. | |
28119 | ||
28120 | There are two (expert) flags that govern the use of this pass: | |
28121 | ||
28122 | * `-max-function-size <n>` | |
28123 | * `-zone-cut-depth <n>` | |
28124 | ||
28125 | Zone splitting is applied only when the number of basic blocks in a | |
28126 | function is greater than the `n` given to `-max-function-size`. The `n` used | |
28127 | to cut the dominator tree is set by `-zone-cut-depth`. | |
28128 | ||
28129 | There is currently no attempt to be safe-for-space. That is, the | |
28130 | tuples are not restricted to containing only "small" values. | |
28131 | ||
28132 | In the `HOL` program, the particular problem is the main function, | |
28133 | which has 161,783 blocks and 257,519 variables -- the product of those | |
28134 | two numbers being about 41 billion. Now, we're not likely going to | |
28135 | need that much space since we use a sparse representation. But even | |
28136 | 1/100th would really hurt. And of course this rules out bit vectors. | |
28137 | ||
28138 | <<< | |
28139 | ||
28140 | :mlton-guide-page: ZZZOrphanedPages | |
28141 | [[ZZZOrphanedPages]] | |
28142 | ZZZOrphanedPages | |
28143 | ================ | |
28144 | ||
28145 | The contents of these pages have been moved to other pages. | |
28146 | ||
28147 | These templates are used by other pages. | |
28148 | ||
28149 | * <:CompilerPassTemplate:> | |
28150 | * <:TalkTemplate:> | |
28151 | ||
28152 | <<< |