Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | StaticSum |
2 | ========= | |
3 | ||
4 | While SML makes it impossible to write functions whose types would | |
5 | depend on the values of their arguments, or so called dependently | |
6 | typed functions, it is possible, and arguably commonplace, to write | |
7 | functions whose types depend on the types of their arguments. Indeed, | |
8 | the types of parametrically polymorphic functions like `map` and | |
9 | `foldl` can be said to depend on the types of their arguments. What | |
10 | is less commonplace, however, is to write functions whose behavior | |
11 | would depend on the types of their arguments. Nevertheless, there are | |
12 | several techniques for writing such functions. | |
13 | <:TypeIndexedValues:Type-indexed values> and <:Fold:fold> are two such | |
14 | techniques. This page presents another such technique dubbed static | |
15 | sums. | |
16 | ||
17 | ||
18 | == Ordinary Sums == | |
19 | ||
20 | Consider the sum type as defined below: | |
21 | [source,sml] | |
22 | ---- | |
23 | structure Sum = struct | |
24 | datatype ('a, 'b) t = INL of 'a | INR of 'b | |
25 | end | |
26 | ---- | |
27 | ||
28 | While a generic sum type such as defined above is very useful, it has | |
29 | a number of limitations. As an example, we could write the function | |
30 | `out` to extract the value from a sum as follows: | |
31 | [source,sml] | |
32 | ---- | |
33 | fun out (s : ('a, 'a) Sum.t) : 'a = | |
34 | case s | |
35 | of Sum.INL a => a | |
36 | | Sum.INR a => a | |
37 | ---- | |
38 | ||
39 | As can be seen from the type of `out`, it is limited in the sense that | |
40 | it requires both variants of the sum to have the same type. So, `out` | |
41 | cannot be used to extract the value of a sum of two different types, | |
42 | such as the type `(int, real) Sum.t`. As another example of a | |
43 | limitation, consider the following attempt at a `succ` function: | |
44 | [source,sml] | |
45 | ---- | |
46 | fun succ (s : (int, real) Sum.t) : ??? = | |
47 | case s | |
48 | of Sum.INL i => i + 1 | |
49 | | Sum.INR r => Real.nextAfter (r, Real.posInf) | |
50 | ---- | |
51 | ||
52 | The above definition of `succ` cannot be typed, because there is no | |
53 | type for the codomain within SML. | |
54 | ||
55 | ||
56 | == Static Sums == | |
57 | ||
58 | Interestingly, it is possible to define values `inL`, `inR`, and | |
59 | `match` that satisfy the laws | |
60 | ---- | |
61 | match (inL x) (f, g) = f x | |
62 | match (inR x) (f, g) = g x | |
63 | ---- | |
64 | and do not suffer from the same limitions. The definitions are | |
65 | actually quite trivial: | |
66 | [source,sml] | |
67 | ---- | |
68 | structure StaticSum = struct | |
69 | fun inL x (f, _) = f x | |
70 | fun inR x (_, g) = g x | |
71 | fun match x = x | |
72 | end | |
73 | ---- | |
74 | ||
75 | Now, given the `succ` function defined as | |
76 | [source,sml] | |
77 | ---- | |
78 | fun succ s = | |
79 | StaticSum.match s | |
80 | (fn i => i + 1, | |
81 | fn r => Real.nextAfter (r, Real.posInf)) | |
82 | ---- | |
83 | we get | |
84 | [source,sml] | |
85 | ---- | |
86 | succ (StaticSum.inL 1) = 2 | |
87 | succ (StaticSum.inR Real.maxFinite) = Real.posInf | |
88 | ---- | |
89 | ||
90 | To better understand how this works, consider the following signature | |
91 | for static sums: | |
92 | [source,sml] | |
93 | ---- | |
94 | structure StaticSum :> sig | |
95 | type ('dL, 'cL, 'dR, 'cR, 'c) t | |
96 | val inL : 'dL -> ('dL, 'cL, 'dR, 'cR, 'cL) t | |
97 | val inR : 'dR -> ('dL, 'cL, 'dR, 'cR, 'cR) t | |
98 | val match : ('dL, 'cL, 'dR, 'cR, 'c) t -> ('dL -> 'cL) * ('dR -> 'cR) -> 'c | |
99 | end = struct | |
100 | type ('dL, 'cL, 'dR, 'cR, 'c) t = ('dL -> 'cL) * ('dR -> 'cR) -> 'c | |
101 | open StaticSum | |
102 | end | |
103 | ---- | |
104 | ||
105 | Above, `'d` stands for domain and `'c` for codomain. The key | |
106 | difference between an ordinary sum type, like `(int, real) Sum.t`, and | |
107 | a static sum type, like `(int, real, real, int, real) StaticSum.t`, is | |
108 | that the ordinary sum type says nothing about the type of the result | |
109 | of deconstructing a sum while the static sum type specifies the type. | |
110 | ||
111 | With the sealed static sum module, we get the type | |
112 | [source,sml] | |
113 | ---- | |
114 | val succ : (int, int, real, real, 'a) StaticSum.t -> 'a | |
115 | ---- | |
116 | for the previously defined `succ` function. The type specifies that | |
117 | `succ` maps a left `int` to an `int` and a right `real` to a `real`. | |
118 | For example, the type of `StaticSum.inL 1` is | |
119 | `(int, 'cL, 'dR, 'cR, 'cL) StaticSum.t`. Unifying this with the | |
120 | argument type of `succ` gives the type `(int, int, real, real, int) | |
121 | StaticSum.t -> int`. | |
122 | ||
123 | The `out` function is quite useful on its own. Here is how it can be | |
124 | defined: | |
125 | [source,sml] | |
126 | ---- | |
127 | structure StaticSum = struct | |
128 | open StaticSum | |
129 | val out : ('a, 'a, 'b, 'b, 'c) t -> 'c = | |
130 | fn s => match s (fn x => x, fn x => x) | |
131 | end | |
132 | ---- | |
133 | ||
134 | Due to the value restriction, lack of first class polymorphism and | |
135 | polymorphic recursion, the usefulness and convenience of static sums | |
136 | is somewhat limited in SML. So, don't throw away the ordinary sum | |
137 | type just yet. Static sums can nevertheless be quite useful. | |
138 | ||
139 | ||
140 | === Example: Send and Receive with Argument Type Dependent Result Types === | |
141 | ||
142 | In some situations it would seem useful to define functions whose | |
143 | result type would depend on some of the arguments. Traditionally such | |
144 | functions have been thought to be impossible in SML and the solution | |
145 | has been to define multiple functions. For example, the | |
146 | http://www.standardml.org/Basis/socket.html[`Socket` structure] of the | |
147 | Basis library defines 16 `send` and 16 `recv` functions. In contrast, | |
148 | the Net structure | |
149 | (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sig)>) of the | |
150 | Basic library designed by Stephen Weeks defines only a single `send` | |
151 | and a single `receive` and the result types of the functions depend on | |
152 | their arguments. The implementation | |
153 | (<!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/net.sml)>) uses | |
154 | static sums (with a slighly different signature: | |
155 | <!ViewGitFile(mltonlib,master,com/sweeks/basic/unstable/static-sum.sig)>). | |
156 | ||
157 | ||
158 | === Example: Picking Monad Results === | |
159 | ||
160 | Suppose that we need to write a parser that accepts a pair of integers | |
161 | and returns their sum given a monadic parsing combinator library. A | |
162 | part of the signature of such library could look like this | |
163 | [source,sml] | |
164 | ---- | |
165 | signature PARSING = sig | |
166 | include MONAD | |
167 | val int : int t | |
168 | val lparen : unit t | |
169 | val rparen : unit t | |
170 | val comma : unit t | |
171 | (* ... *) | |
172 | end | |
173 | ---- | |
174 | where the `MONAD` signature could be defined as | |
175 | [source,sml] | |
176 | ---- | |
177 | signature MONAD = sig | |
178 | type 'a t | |
179 | val return : 'a -> 'a t | |
180 | val >>= : 'a t * ('a -> 'b t) -> 'b t | |
181 | end | |
182 | infix >>= | |
183 | ---- | |
184 | ||
185 | The straightforward, but tedious, way to write the desired parser is: | |
186 | [source,sml] | |
187 | ---- | |
188 | val p = lparen >>= (fn _ => | |
189 | int >>= (fn x => | |
190 | comma >>= (fn _ => | |
191 | int >>= (fn y => | |
192 | rparen >>= (fn _ => | |
193 | return (x + y)))))) | |
194 | ---- | |
195 | ||
196 | In Haskell, the parser could be written using the `do` notation | |
197 | considerably less verbosely as: | |
198 | [source,haskell] | |
199 | ---- | |
200 | p = do { lparen ; x <- int ; comma ; y <- int ; rparen ; return $ x + y } | |
201 | ---- | |
202 | ||
203 | SML doesn't provide a `do` notation, so we need another solution. | |
204 | ||
205 | Suppose we would have a "pick" notation for monads that would allows | |
206 | us to write the parser as | |
207 | [source,sml] | |
208 | ---- | |
209 | val p = `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y) | |
210 | ---- | |
211 | using four auxiliary combinators: +`+, `\`, `^`, and `@`. | |
212 | ||
213 | Roughly speaking | |
214 | ||
215 | * +`p+ means that the result of `p` is dropped, | |
216 | * `\p` means that the result of `p` is taken, | |
217 | * `p ^ q` means that results of `p` and `q` are taken as a product, and | |
218 | * `p @ a` means that the results of `p` are passed to the function `a` and that result is returned. | |
219 | ||
220 | The difficulty is in implementing the concatenation combinator `^`. | |
221 | The type of the result of the concatenation depends on the types of | |
222 | the arguments. | |
223 | ||
224 | Using static sums and the <:ProductType:product type>, the pick | |
225 | notation for monads can be implemented as follows: | |
226 | [source,sml] | |
227 | ---- | |
228 | functor MkMonadPick (include MONAD) = let | |
229 | open StaticSum | |
230 | in | |
231 | struct | |
232 | fun `a = inL (a >>= (fn _ => return ())) | |
233 | val \ = inR | |
234 | fun a @ f = out a >>= (return o f) | |
235 | fun a ^ b = | |
236 | (match b o match a) | |
237 | (fn a => | |
238 | (fn b => inL (a >>= (fn _ => b)), | |
239 | fn b => inR (a >>= (fn _ => b))), | |
240 | fn a => | |
241 | (fn b => inR (a >>= (fn a => b >>= (fn _ => return a))), | |
242 | fn b => inR (a >>= (fn a => b >>= (fn b => return (a & b)))))) | |
243 | end | |
244 | end | |
245 | ---- | |
246 | ||
247 | The above implementation is inefficient, however. It uses many more | |
248 | bind operations, `>>=`, than necessary. That can be solved with an | |
249 | additional level of abstraction: | |
250 | [source,sml] | |
251 | ---- | |
252 | functor MkMonadPick (include MONAD) = let | |
253 | open StaticSum | |
254 | in | |
255 | struct | |
256 | fun `a = inL (fn b => a >>= (fn _ => b ())) | |
257 | fun \a = inR (fn b => a >>= b) | |
258 | fun a @ f = out a (return o f) | |
259 | fun a ^ b = | |
260 | (match b o match a) | |
261 | (fn a => (fn b => inL (fn c => a (fn () => b c)), | |
262 | fn b => inR (fn c => a (fn () => b c))), | |
263 | fn a => (fn b => inR (fn c => a (fn a => b (fn () => c a))), | |
264 | fn b => inR (fn c => a (fn a => b (fn b => c (a & b)))))) | |
265 | end | |
266 | end | |
267 | ---- | |
268 | ||
269 | After instantiating and opening either of the above monad pick | |
270 | implementations, the previously given definition of `p` can be | |
271 | compiled and results in a parser whose result is of type `int`. Here | |
272 | is a functor to test the theory: | |
273 | [source,sml] | |
274 | ---- | |
275 | functor Test (Arg : PARSING) = struct | |
276 | local | |
277 | structure Pick = MkMonadPick (Arg) | |
278 | open Pick Arg | |
279 | in | |
280 | val p : int t = | |
281 | `lparen ^ \int ^ `comma ^ \int ^ `rparen @ (fn x & y => x + y) | |
282 | end | |
283 | end | |
284 | ---- | |
285 | ||
286 | ||
287 | == Also see == | |
288 | ||
289 | There are a number of related techniques. Here are some of them. | |
290 | ||
291 | * <:Fold:> | |
292 | * <:TypeIndexedValues:> |