Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | PrintfGentle |
2 | ============ | |
3 | :toc: | |
4 | ||
5 | This page provides a gentle introduction and derivation of <:Printf:>, | |
6 | with sections and arrangement more suitable to a talk. | |
7 | ||
8 | ||
9 | == Introduction == | |
10 | ||
11 | SML does not have `printf`. Could we define it ourselves? | |
12 | ||
13 | [source,sml] | |
14 | ---- | |
15 | val () = printf ("here's an int %d and a real %f.\n", 13, 17.0) | |
16 | val () = printf ("here's three values (%d, %f, %f).\n", 13, 17.0, 19.0) | |
17 | ---- | |
18 | ||
19 | What could the type of `printf` be? | |
20 | ||
21 | This obviously can't work, because SML functions take a fixed number | |
22 | of arguments. Actually they take one argument, but if that's a tuple, | |
23 | it can only have a fixed number of components. | |
24 | ||
25 | ||
26 | == From tupling to currying == | |
27 | ||
28 | What about currying to get around the typing problem? | |
29 | ||
30 | [source,sml] | |
31 | ---- | |
32 | val () = printf "here's an int %d and a real %f.\n" 13 17.0 | |
33 | val () = printf "here's three values (%d, %f, %f).\n" 13 17.0 19.0 | |
34 | ---- | |
35 | ||
36 | That fails for a similar reason. We need two types for `printf`. | |
37 | ||
38 | ---- | |
39 | val printf: string -> int -> real -> unit | |
40 | val printf: string -> int -> real -> real -> unit | |
41 | ---- | |
42 | ||
43 | This can't work, because `printf` can only have one type. SML doesn't | |
44 | support programmer-defined overloading. | |
45 | ||
46 | ||
47 | == Overloading and dependent types == | |
48 | ||
49 | Even without worrying about number of arguments, there is another | |
50 | problem. The type of `printf` depends on the format string. | |
51 | ||
52 | [source,sml] | |
53 | ---- | |
54 | val () = printf "here's an int %d and a real %f.\n" 13 17.0 | |
55 | val () = printf "here's a real %f and an int %d.\n" 17.0 13 | |
56 | ---- | |
57 | ||
58 | Now we need | |
59 | ||
60 | ---- | |
61 | val printf: string -> int -> real -> unit | |
62 | val printf: string -> real -> int -> unit | |
63 | ---- | |
64 | ||
65 | Again, this can't possibly work, because SML doesn't have | |
66 | overloading, and types can't depend on values. | |
67 | ||
68 | ||
69 | == Idea: express type information in the format string == | |
70 | ||
71 | If we express type information in the format string, then different | |
72 | uses of `printf` can have different types. | |
73 | ||
74 | [source,sml] | |
75 | ---- | |
76 | type 'a t (* the type of format strings *) | |
77 | val printf: 'a t -> 'a | |
78 | infix D F | |
79 | val fs1: (int -> real -> unit) t = "here's an int "D" and a real "F".\n" | |
80 | val fs2: (int -> real -> real -> unit) t = | |
81 | "here's three values ("D", "F", "F").\n" | |
82 | val () = printf fs1 13 17.0 | |
83 | val () = printf fs2 13 17.0 19.0 | |
84 | ---- | |
85 | ||
86 | Now, our two calls to `printf` type check, because the format | |
87 | string specializes `printf` to the appropriate type. | |
88 | ||
89 | ||
90 | == The types of format characters == | |
91 | ||
92 | What should the type of format characters `D` and `F` be? Each format | |
93 | character requires an additional argument of the appropriate type to | |
94 | be supplied to `printf`. | |
95 | ||
96 | Idea: guess the final type that `printf` will need for the format | |
97 | string, and verify it with each format character. | |
98 | ||
99 | [source,sml] | |
100 | ---- | |
101 | type ('a, 'b) t (* 'a = rest of type to verify, 'b = final type *) | |
102 | val ` : string -> ('a, 'a) t (* guess the type, which must be verified *) | |
103 | val D: (int -> 'a, 'b) t * string -> ('a, 'b) t (* consume an int *) | |
104 | val F: (real -> 'a, 'b) t * string -> ('a, 'b) t (* consume a real *) | |
105 | val printf: (unit, 'a) t -> 'a | |
106 | ---- | |
107 | ||
108 | Don't worry. In the end, type inference will guess and verify for us. | |
109 | ||
110 | ||
111 | == Understanding guess and verify == | |
112 | ||
113 | Now, let's build up a format string and a specialized `printf`. | |
114 | ||
115 | [source,sml] | |
116 | ---- | |
117 | infix D F | |
118 | val f0 = `"here's an int " | |
119 | val f1 = f0 D " and a real " | |
120 | val f2 = f1 F ".\n" | |
121 | val p = printf f2 | |
122 | ---- | |
123 | ||
124 | These definitions yield the following types. | |
125 | ||
126 | [source,sml] | |
127 | ---- | |
128 | val f0: (int -> real -> unit, int -> real -> unit) t | |
129 | val f1: (real -> unit, int -> real -> unit) t | |
130 | val f2: (unit, int -> real -> unit) t | |
131 | val p: int -> real -> unit | |
132 | ---- | |
133 | ||
134 | So, `p` is a specialized `printf` function. We could use it as | |
135 | follows: | |
136 | ||
137 | [source,sml] | |
138 | ---- | |
139 | val () = p 13 17.0 | |
140 | val () = p 14 19.0 | |
141 | ---- | |
142 | ||
143 | ||
144 | == Type checking this using a functor == | |
145 | ||
146 | [source,sml] | |
147 | ---- | |
148 | signature PRINTF = | |
149 | sig | |
150 | type ('a, 'b) t | |
151 | val ` : string -> ('a, 'a) t | |
152 | val D: (int -> 'a, 'b) t * string -> ('a, 'b) t | |
153 | val F: (real -> 'a, 'b) t * string -> ('a, 'b) t | |
154 | val printf: (unit, 'a) t -> 'a | |
155 | end | |
156 | ||
157 | functor Test (P: PRINTF) = | |
158 | struct | |
159 | open P | |
160 | infix D F | |
161 | ||
162 | val () = printf (`"here's an int "D" and a real "F".\n") 13 17.0 | |
163 | val () = printf (`"here's three values ("D", "F ", "F").\n") 13 17.0 19.0 | |
164 | end | |
165 | ---- | |
166 | ||
167 | ||
168 | == Implementing `Printf` == | |
169 | ||
170 | Think of a format character as a formatter transformer. It takes the | |
171 | formatter for the part of the format string to its left and turns it | |
172 | into a new formatter that first runs the left-hand part, then prints its | |
173 | own argument and the literal text that follows, then continues with the rest. | |
174 | ||
175 | [source,sml] | |
176 | ---- | |
177 | structure Printf: PRINTF = | |
178 | struct | |
179 | datatype ('a, 'b) t = T of (unit -> 'a) -> 'b | |
180 | ||
181 | fun printf (T f) = f (fn () => ()) | |
182 | ||
183 | fun ` s = T (fn a => (print s; a ())) | |
184 | ||
185 | fun D (T f, s) = | |
186 | T (fn g => f (fn () => fn i => | |
187 | (print (Int.toString i); print s; g ()))) | |
188 | ||
189 | fun F (T f, s) = | |
190 | T (fn g => f (fn () => fn r => | |
191 | (print (Real.toString r); print s; g ()))) | |
192 | end | |
193 | ---- | |
194 | ||
195 | ||
196 | == Testing printf == | |
197 | ||
198 | [source,sml] | |
199 | ---- | |
200 | structure Z = Test (Printf) | |
201 | ---- | |
202 | ||
203 | ||
204 | == User-definable formats == | |
205 | ||
206 | The definitions of the format characters `D` and `F` differ only in | |
207 | the function that converts the argument to a string. Within the | |
208 | `Printf` structure we can define a format character generator. | |
209 | ||
210 | [source,sml] | |
211 | ---- | |
212 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t = | |
213 | fn toString => fn (T f, s) => | |
214 | T (fn th => f (fn () => fn a => (print (toString a); print s ; th ()))) | |
215 | val D = fn z => newFormat Int.toString z | |
216 | val F = fn z => newFormat Real.toString z | |
217 | ---- | |
218 | ||
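As an illustration of the generator, here is a minimal sketch of a boolean format character. The name `B` is an assumption made for this example and is not defined anywhere on this page. The structure repeats the eager implementation, without a signature constraint, so that `newFormat` is visible and the fragment is self-contained.

[source,sml]
----
structure Printf =
struct
   datatype ('a, 'b) t = T of (unit -> 'a) -> 'b

   fun printf (T f) = f (fn () => ())

   fun ` s = T (fn a => (print s; a ()))

   fun newFormat toString (T f, s) =
      T (fn th => f (fn () => fn a => (print (toString a); print s; th ())))
end

open Printf

(* Eta-expanded so the definitions stay polymorphic under the value
   restriction. *)
val D = fn z => newFormat Int.toString z
val B = fn z => newFormat Bool.toString z  (* hypothetical format character *)
infix D B

(* Prints: flag true and count 3. *)
val () = printf (`"flag "B" and count "D".\n") true 3
----

Any function of type `'a -> string` can be turned into a format character this way, so the set of formats is open to clients of the module.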
219 | ||
220 | == A core `Printf` == | |
221 | ||
222 | We can now have a very small `PRINTF` signature, and define all | |
223 | the format characters externally to the core module. | |
224 | ||
225 | [source,sml] | |
226 | ---- | |
227 | signature PRINTF = | |
228 | sig | |
229 | type ('a, 'b) t | |
230 | val ` : string -> ('a, 'a) t | |
231 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t | |
232 | val printf: (unit, 'a) t -> 'a | |
233 | end | |
234 | ||
235 | structure Printf: PRINTF = | |
236 | struct | |
237 | datatype ('a, 'b) t = T of (unit -> 'a) -> 'b | |
238 | ||
239 | fun printf (T f) = f (fn () => ()) | |
240 | ||
241 | fun ` s = T (fn a => (print s; a ())) | |
242 | ||
243 | fun newFormat toString (T f, s) = | |
244 | T (fn th => | |
245 | f (fn () => fn a => | |
246 | (print (toString a) | |
247 | ; print s | |
248 | ; th ()))) | |
249 | end | |
250 | ---- | |
251 | ||
252 | ||
253 | == Extending to fprintf == | |
254 | ||
255 | One can implement `fprintf` by threading the outstream through all the | |
256 | transformers. | |
257 | ||
258 | [source,sml] | |
259 | ---- | |
260 | signature PRINTF = | |
261 | sig | |
262 | type ('a, 'b) t | |
263 | val ` : string -> ('a, 'a) t | |
264 | val fprintf: (unit, 'a) t * TextIO.outstream -> 'a | |
265 | val newFormat: ('a -> string) -> ('a -> 'b, 'c) t * string -> ('b, 'c) t | |
266 | val printf: (unit, 'a) t -> 'a | |
267 | end | |
268 | ||
269 | structure Printf: PRINTF = | |
270 | struct | |
271 | type out = TextIO.outstream | |
272 | val output = TextIO.output | |
273 | ||
274 | datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b | |
275 | ||
276 | fun fprintf (T f, out) = f (fn _ => ()) out | |
277 | ||
278 | fun printf t = fprintf (t, TextIO.stdOut) | |
279 | ||
280 | fun ` s = T (fn a => fn out => (output (out, s); a out)) | |
281 | ||
282 | fun newFormat toString (T f, s) = | |
283 | T (fn g => | |
284 | f (fn out => fn a => | |
285 | (output (out, toString a) | |
286 | ; output (out, s) | |
287 | ; g out))) | |
288 | end | |
289 | ---- | |
290 | ||
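A short usage sketch may help here. It repeats the stream-threading structure without the signature constraint and writes one formatted line to `TextIO.stdErr`. The message text and the choice of `stdErr` are assumptions made for illustration only.

[source,sml]
----
structure Printf =
struct
   type out = TextIO.outstream
   val output = TextIO.output

   datatype ('a, 'b) t = T of (out -> 'a) -> out -> 'b

   fun fprintf (T f, out) = f (fn _ => ()) out

   fun printf t = fprintf (t, TextIO.stdOut)

   fun ` s = T (fn a => fn out => (output (out, s); a out))

   fun newFormat toString (T f, s) =
      T (fn g =>
         f (fn out => fn a =>
            (output (out, toString a); output (out, s); g out)))
end

open Printf

val D = fn z => newFormat Int.toString z
val F = fn z => newFormat Real.toString z
infix D F

(* The same format value works with any outstream; here it goes to stderr. *)
val () = fprintf (`"warning: "D" items, ratio "F".\n", TextIO.stdErr) 3 0.5
----

The format characters themselves never mention a stream. Only `fprintf` chooses one, which is why `printf` can be defined as a one-line wrapper.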
291 | ||
292 | == Notes == | |
293 | ||
294 | * Lesson: instead of using dependent types for a function, express the | |
295 | dependency in the type of the argument. | |
296 | ||
297 | * If `printf` is partially applied, it will do the printing then and | |
298 | there. Perhaps this could be fixed with some kind of terminator. | |
299 | + | |
300 | A syntactic or argument terminator is not necessary. A formatter can | |
301 | either be eager (as above) or lazy (as below). A lazy formatter | |
302 | accumulates enough state to print the entire string. The simplest | |
303 | lazy formatter concatenates the strings as they become available: | |
304 | + | |
305 | [source,sml] | |
306 | ---- | |
307 | structure PrintfLazyConcat: PRINTF = | |
308 | struct | |
309 | datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b | |
310 | ||
311 | fun printf (T f) = f print "" | |
312 | ||
313 | fun ` s = T (fn th => fn s' => th (s' ^ s)) | |
314 | ||
315 | fun newFormat toString (T f, s) = | |
316 | T (fn th => | |
317 | f (fn s' => fn a => | |
318 | th (s' ^ toString a ^ s))) | |
319 | end | |
320 | ---- | |
321 | + | |
322 | It is somewhat more efficient to accumulate the strings as a list: | |
323 | + | |
324 | [source,sml] | |
325 | ---- | |
326 | structure PrintfLazyList: PRINTF = | |
327 | struct | |
328 | datatype ('a, 'b) t = T of (string list -> 'a) -> string list -> 'b | |
329 | ||
330 | fun printf (T f) = f (List.app print o List.rev) [] | |
331 | ||
332 | fun ` s = T (fn th => fn ss => th (s::ss)) | |
333 | ||
334 | fun newFormat toString (T f, s) = | |
335 | T (fn th => | |
336 | f (fn ss => fn a => | |
337 | th (s::toString a::ss))) | |
338 | end | |
339 | ---- | |
340 | ||
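The lazy representation also makes a string-returning variant straightforward. The following sketch is an assumption layered on the concatenating formatter: `sprintf` and its type `(string, 'a) t -> 'a` are not part of the `PRINTF` signature on this page. The only change is that the final continuation returns the accumulated string instead of printing it.

[source,sml]
----
structure Sprintf =
struct
   datatype ('a, 'b) t = T of (string -> 'a) -> string -> 'b

   (* Hypothetical: finish by returning the accumulated string. *)
   fun sprintf (T f) = f (fn s => s) ""

   fun ` s = T (fn th => fn s' => th (s' ^ s))

   fun newFormat toString (T f, s) =
      T (fn th => f (fn s' => fn a => th (s' ^ toString a ^ s)))
end

open Sprintf

val D = fn z => newFormat Int.toString z
infix D

(* Partial application does no output, so fmt can be reused freely. *)
val fmt = sprintf (`"x = "D", y = "D".")
val s1 = fmt 1 2   (* "x = 1, y = 2." *)
val s2 = fmt 3 4   (* "x = 3, y = 4." *)
----

Because nothing is printed until all arguments have arrived, the specialized function `fmt` behaves the same however many times it is applied.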
341 | ||
342 | == Also see == | |
343 | ||
344 | * <:Printf:> | |
345 | * <!Cite(Danvy98, Functional Unparsing)> |