4 <:StandardML:Standard ML> does not support ad hoc polymorphism. This
5 presents a challenge to programmers. The problem is that at first
6 glance there seems to be no practical way to implement something like
7 a function for converting a value of any type to a string or a
8 function for computing a hash value for a value of any type.
9 Fortunately there are ways to implement type-indexed values in SML as
10 discussed in <!Cite(Yang98)>. Various articles such as
11 <!Cite(Danvy98)>, <!Cite(Ramsey11)>, <!Cite(Elsman04)>,
12 <!Cite(Kennedy04)>, and <!Cite(Benton05)> also contain examples of
15 *NOTE:* The technique used in the following example uses an early (and
16 somewhat broken) variation of the basic technique used in an
17 experimental generic programming library (see
18 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/README)>) that can
19 be found from the MLton repository. The generic programming library
20 also includes a more advanced generic pretty printing function (see
21 <!ViewGitFile(mltonlib,master,com/ssh/generic/unstable/public/value/pretty.sig)>).
23 == Example: Converting any SML value to (roughly) SML syntax ==
25 Consider the problem of converting any SML value to a textual
26 presentation that matches the syntax of SML as closely as possible.
27 One solution is a type-indexed function that maps a given type to a
28 function that maps any value (of the type) to its textual
29 presentation. A type-indexed function like this can be useful for a
30 variety of purposes. For example, one could use it to show debugging
31 information. We'll call this function "`show`".
33 We'll do a fairly complete implementation of `show`. We do not
34 distinguish infix and nonfix constructors, but that is not an
35 intrinsic property of SML datatypes. We also don't reconstruct a type
36 name for the value, although it would be particularly useful for
37 functional values. To reconstruct type names, some changes would be
38 needed and the reader is encouraged to consider how to do that. A
39 more realistic implementation would use some pretty printing
40 combinators to compute a layout for the result. This should be a
41 relatively easy change (given a suitable pretty printing library).
42 Cyclic values (through references and arrays) do not have a standard
43 textual presentation and it is impossible to convert arbitrary
44 functional values (within SML) to a meaningful textual presentation.
45 Finally, it would also make sense to show sharing of references and
46 arrays. We'll leave these improvements to an actual library
49 The following code uses the <:Fixpoints:fixpoint framework> and other
50 utilities from an Extended Basis library (see
51 <!ViewGitFile(mltonlib,master,com/ssh/extended-basis/unstable/README)>).
55 Let's consider the design of the `SHOW` signature:
61 type 'a t (* complete type-index *)
62 type 'a s (* incomplete sum *)
63 type ('a, 'k) p (* incomplete product *)
64 type u (* tuple or unlabelled product *)
65 type l (* record or labelled product *)
67 val show : 'a t -> 'a -> string
69 (* user-defined types *)
70 val inj : ('a -> 'b) -> 'b t -> 'a t
72 (* tuples and records *)
73 val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
75 val U : 'a t -> ('a, u) p
76 val L : string -> 'a t -> ('a, l) p
78 val tuple : ('a, u) p -> 'a t
79 val record : ('a, l) p -> 'a t
82 val + : 'a s * 'b s -> (('a, 'b) sum) s
84 val C0 : string -> unit s
85 val C1 : string -> 'a t -> 'a s
87 val data : 'a s -> 'a t
93 val regExn : (exn -> ('a * 'a s) option) -> unit
95 (* some built-in type constructors *)
96 val refc : 'a t -> 'a ref t
97 val array : 'a t -> 'a array t
98 val list : 'a t -> 'a list t
99 val vector : 'a t -> 'a vector t
100 val --> : 'a t * 'b t -> ('a -> 'b) t
102 (* some built-in base types *)
103 val string : string t
113 While some details are shaped by the specific requirements of `show`,
114 there are a number of (design) patterns that translate to other
115 type-indexed values. The former kind of details are mostly shaped by
116 the syntax of SML values that `show` is designed to produce. To this
117 end, abstract types and phantom types are used to distinguish
118 incomplete record, tuple, and datatype type-indices from each other
119 and from complete type-indices. Also, names of record labels and
120 datatype constructors need to be provided by the user.
122 ==== Arbitrary user-defined datatypes ====
124 Perhaps the most important pattern is how the design supports
125 arbitrary user-defined datatypes. A number of combinators together
126 conspire to provide the functionality. First of all, to support new
127 user-defined types, a combinator taking a conversion function to a
128 previously supported type is provided:
131 val inj : ('a -> 'b) -> 'b t -> 'a t
134 An injection function is sufficient in this case, but in the general
135 case, an embedding with injection and projection functions may be
138 To support products (tuples and records) a product combinator is
142 val * : ('a, 'k) p * ('b, 'k) p -> (('a, 'b) product, 'k) p
144 The second (phantom) type variable `'k` is there to distinguish
145 between labelled and unlabelled products and the type `p`
146 distinguishes incomplete products from complete type-indices of type
147 `t`. Most type-indexed values do not need to make such distinctions.
149 To support sums (datatypes) a sum combinator is provided:
152 val + : 'a s * 'b s -> (('a, 'b) sum) s
154 Again, the purpose of the type `s` is to distinguish incomplete sums
155 from complete type-indices of type `t`, which usually isn't necessary.
157 Finally, to support recursive datatypes, including sets of mutually
158 recursive datatypes, a <:Fixpoints:fixpoint tier> is provided:
164 Together these combinators (with the more domain specific combinators
165 `U`, `L`, `tuple`, `record`, `C0`, `C1`, and `data`) enable one to
166 encode a type-index for any user-defined datatype.
170 The `exn` type in SML is a <:UniversalType:universal type> into which
171 all types can be embedded. SML also allows a program to generate new
172 exception variants at run-time. Thus a mechanism is required to register
173 handlers for particular variants:
177 val regExn : (exn -> ('a * 'a s) option) -> unit
180 The universal `exn` type-index then makes use of the registered
181 handlers. The above particular form of handler, which converts an
182 exception value to a value of some type and a type-index for that type
183 (essentially an existential type) is designed to make it convenient to
184 write handlers. To write a handler, one can conveniently reuse
185 existing type-indices:
193 val () = regExn (fn Int v => SOME (v, C1"Int" int)
198 Note that a single handler may actually handle an arbitrary number of
199 different exceptions.
201 ==== Other types ====
203 Some built-in and standard types typically require special treatment
204 due to their special nature. The most important of these are arrays
205 and references, because cyclic data (ignoring closures) and observable
206 sharing can only be constructed through them.
208 When arrow types are really supported, unlike in this case, they
209 usually need special treatment due to the contravariance of arguments.
211 Lists and vectors require special treatment in the case of `show`,
212 because of their special syntax. This isn't usually the case.
214 The set of base types to support also needs to be considered unless
215 one exports an interface for constructing type-indices for entirely
220 Before going to the implementation, let's look at some examples. For
221 the following examples, we'll assume a structure binding
222 `Show :> SHOW`. If you want to try the examples immediately, just
223 skip forward to the implementation.
225 To use `show`, one first needs a type-index, which is then given to
226 `show`. To show a list of integers, one would use the type-index
227 `list int`, which has the type `int list Show.t`:
231 let open Show in show (list int) end
235 Likewise, to show a list of lists of characters, one would use the
236 type-index `list (list char)`, which has the type `char list list
240 val "[[#\"a\", #\"b\", #\"c\"], []]" =
241 let open Show in show (list (list char)) end
242 [[#"a", #"b", #"c"], []]
245 Handling standard types is not particularly interesting. It is more
246 interesting to see how user-defined types can be handled. Although
247 the `option` datatype is a standard type, it requires no special
248 support, so we can treat it as a user-defined type. Options can be
249 encoded easily using a sum:
255 inj (fn NONE => INL ()
257 (data (C0"NONE" + C1"SOME" t))
261 let open Show in show (option int) end
265 Readers new to type-indexed values might want to type annotate each
266 subexpression of the above example as an exercise. (Use a compiler to
267 check your annotations.)
269 Using a product, user specified records can be also be encoded easily:
275 inj (fn {a, b, c} => a & b & c)
276 (record (L"a" (option int) *
281 val "{a = SOME 1, b = 3.0, c = false}" =
282 let open Show in show abc end
283 {a = SOME 1, b = 3.0, c = false}
286 As you can see, both of the above use `inj` to inject user-defined
287 types to the general purpose sum and product types.
289 Of particular interest is whether recursive datatypes and cyclic data
290 can be handled. For example, how does one write a type-index for a
291 recursive datatype such as a cyclic graph?
294 datatype 'a graph = VTX of 'a * 'a graph list ref
295 fun arcs (VTX (_, r)) = r
298 Using the `Show` combinators, we could first write a new type-index
299 combinator for `graph`:
306 inj (fn VTX (x, y) => x & y)
309 U (refc (list graph_a)))))))
313 To show a graph with integer labels
317 val a = VTX (1, ref [])
318 val b = VTX (2, ref [])
319 val c = VTX (3, ref [])
320 val d = VTX (4, ref [])
321 val e = VTX (5, ref [])
322 val f = VTX (6, ref [])
333 we could then simply write
336 val "VTX (1, ref [VTX (2, ref [VTX (3, ref [VTX (1, %0), \
337 \VTX (6, ref [VTX (5, ref [VTX (4, ref [VTX (6, %3)])])] as %3)]), \
338 \VTX (5, ref [VTX (4, ref [VTX (6, ref [VTX (5, %2)])])] as %2)]), \
339 \VTX (4, ref [VTX (6, ref [VTX (5, ref [VTX (4, %1)])])] as %1)] as %0)" =
340 let open Show in show (graph int) end
344 There is a subtle gotcha with cyclic data. Consider the following code:
347 exception ExnArray of exn array
352 regExn (fn ExnArray a =>
353 SOME (a, C1"ExnArray" (array exn))
358 val a = Array.fromList [Empty]
360 Array.update (a, 0, ExnArray a) ; a
364 Although the above looks innocent enough, the evaluation of
367 val "[|ExnArray %0|] as %0" =
368 let open Show in show (array exn) end
371 goes into an infinite loop. To avoid this problem, the type-index
372 `array exn` must be evaluated only once, as in the following:
375 val array_exn = let open Show in array exn end
377 exception ExnArray of exn array
382 regExn (fn ExnArray a =>
383 SOME (a, C1"ExnArray" array_exn)
388 val a = Array.fromList [Empty]
390 Array.update (a, 0, ExnArray a) ; a
393 val "[|ExnArray %0|] as %0" =
394 let open Show in show array_exn end
398 Cyclic data (excluding closures) in Standard ML can only be
399 constructed imperatively through arrays and references (combined with
400 exceptions or recursive datatypes). Before recursing to a reference
401 or an array, one needs to check whether that reference or array has
402 already been seen before. When `ref` or `array` is called with a
403 type-index, a new cyclicity checker is instantiated.
409 structure SmlSyntax = struct
411 structure CV = CharVector and C = Char
413 val isSym = Char.contains "!%&$#+-/:<=>?@\\~`^|*"
415 fun isSymId s = 0 < size s andalso CV.all isSym s
419 andalso C.isAlpha (CV.sub (s, 0))
420 andalso CV.all (fn c => C.isAlphaNum c
426 andalso #"0" <> CV.sub (s, 0)
427 andalso CV.all C.isDigit s
429 fun isId s = isAlphaNumId s orelse isSymId s
431 fun isLongId s = List.all isId (String.fields (#"." <\ op =) s)
433 fun isLabel s = isId s orelse isNumLabel s
437 structure Show :> SHOW = struct
438 datatype 'a t = IN of exn list * 'a -> bool * string
440 type ('a, 'k) p = 'a t
444 fun show (IN t) x = #2 (t ([], x))
446 (* user-defined types *)
447 fun inj inj (IN b) = IN (b o Pair.map (id, inj))
450 fun surround pre suf (_, s) = (false, concat [pre, s, suf])
451 fun parenthesize x = if #1 x then surround "(" ")" x else x
453 (fn (_, s) => (true, concat [tag, " ", s])) o parenthesize
454 fun check p m s = if p s then () else raise Fail (m^s)
456 (* tuples and records *)
457 fun (IN l) * (IN r) =
458 IN (fn (rs, a & b) =>
459 (false, concat [#2 (l (rs, a)),
464 fun L l = (check SmlSyntax.isLabel "Invalid label: " l
465 ; fn IN t => IN (surround (l^" = ") "" o t))
467 fun tuple (IN t) = IN (surround "(" ")" o t)
468 fun record (IN t) = IN (surround "{" "}" o t)
471 fun (IN l) + (IN r) = IN (fn (rs, INL a) => l (rs, a)
472 | (rs, INR b) => r (rs, b))
474 fun C0 c = (check SmlSyntax.isId "Invalid constructor: " c
475 ; IN (const (false, c)))
476 fun C1 c (IN t) = (check SmlSyntax.isId "Invalid constructor: " c
477 ; IN (construct c o t))
481 fun Y ? = Tie.iso Tie.function (fn IN x => x, IN) ?
485 val handlers = ref ([] : (exn -> unit t option) list)
487 val exn = IN (fn (rs, e) => let
496 val IN f = lp (!handlers)
501 handlers := (Option.map
508 (* some built-in type constructors *)
510 fun cyclic (IN t) = let
511 exception E of ''a * bool ref
513 IN (fn (rs, v : ''a) => let
514 val idx = Int.toString o length
515 fun lp (E (v', c)::rs) =
516 if v' <> v then lp rs
517 else (c := false ; (false, "%"^idx rs))
521 val r = t (E (v, c)::rs, v)
524 else surround "" (" as %"^idx rs) r
531 fun aggregate pre suf toList (IN t) =
532 IN (surround pre suf o
537 (map (#2 o curry t rs)
540 fun refc ? = (cyclic o inj ! o C1"ref") ?
541 fun array ? = (cyclic o aggregate "[|" "|]" (Array.foldr op:: [])) ?
542 fun list ? = aggregate "[" "]" id ?
543 fun vector ? = aggregate "#[" "]" (Vector.foldr op:: []) ?
546 fun (IN _) --> (IN _) = IN (const (false, "<fn>"))
548 (* some built-in base types *)
550 fun mk toS = (fn x => (false, x)) o toS o (fn (_, x) => x)
553 IN (surround "\"" "\"" o mk (String.translate Char.toString))
554 val unit = IN (mk (fn () => "()"))
555 val bool = IN (mk Bool.toString)
556 val char = IN (surround "#\"" "\"" o mk Char.toString)
557 val int = IN (mk Int.toString)
558 val word = IN (surround "0wx" "" o mk Word.toString)
559 val real = IN (mk Real.toString)
564 (* Handlers for standard top-level exceptions *)
567 fun E0 name = SOME ((), C0 name)
569 regExn (fn Bind => E0"Bind"
572 | Domain => E0"Domain"
575 | Option => E0"Option"
576 | Overflow => E0"Overflow"
579 | Subscript => E0"Subscript"
581 ; regExn (fn Fail s => SOME (s, C1"Fail" string)
589 There are a number of related techniques. Here are some of them.