ConcurrentMLImplementation
==========================

Here are some notes on MLton's implementation of <:ConcurrentML:>.

Concurrent ML was originally implemented for SML/NJ. It was ported to
MLton in the summer of 2004. The main difference between the
implementations is that SML/NJ uses continuations to implement CML
threads, while MLton uses its underlying <:MLtonThread:thread>
package. Presently, MLton's threads are a little more heavyweight
than SML/NJ's continuations, but it's pretty clear that there is some
fat there that could be trimmed.

The implementation of CML in SML/NJ is built upon the first-class
continuations of the `SMLofNJ.Cont` module.
[source,sml]
----
type 'a cont
val callcc: ('a cont -> 'a) -> 'a
val isolate: ('a -> unit) -> 'a cont
val throw: 'a cont -> 'a -> 'b
----

The implementation of CML in MLton is built upon the first-class
threads of the <:MLtonThread:> module.
[source,sml]
----
type 'a t
val new: ('a -> unit) -> 'a t
val prepare: 'a t * 'a -> Runnable.t
val switch: ('a t -> Runnable.t) -> 'a
----

The port is relatively straightforward, because CML always throws to a
continuation at most once. Hence, an "abstract" implementation of
CML could be built upon first-class one-shot continuations, which map
equally well to SML/NJ's continuations and MLton's threads.

The "essence" of the port is to transform:
----
callcc (fn k => ... throw k' v')
----
{empty}to
----
switch (fn t => ... prepare (t', v'))
----
which suffices for the vast majority of the CML implementation.
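
For instance (a hypothetical fragment, not taken from the CML
sources), a routine that suspends the current computation onto a
queue of waiters looks like this in the two styles; here `Queue`,
`dispatch`, and `next` are assumed scheduler helpers, where
`dispatch` throws to some ready continuation and `next` returns the
next `Runnable.t` from the ready queue:
[source,sml]
----
(* SML/NJ style: capture the current continuation, enqueue it, and
 * dispatch some other ready continuation. *)
fun suspend (waiters: unit cont Queue.queue): unit =
   callcc (fn k => (Queue.enqueue (waiters, k); dispatch ()))

(* MLton style: capture the current thread, enqueue it, and hand the
 * scheduler the next ready thread to run. *)
fun suspend (waiters: unit MLton.Thread.t Queue.queue): unit =
   MLton.Thread.switch
   (fn t => (Queue.enqueue (waiters, t); next ()))
----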

There was only one complicated transformation: blocking multiple base
events. In SML/NJ CML, the representation of base events is given by:
[source,sml]
----
datatype 'a event_status
  = ENABLED of {prio: int, doFn: unit -> 'a}
  | BLOCKED of {
        transId: trans_id ref,
        cleanUp: unit -> unit,
        next: unit -> unit
      } -> 'a
type 'a base_evt = unit -> 'a event_status
----
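
For example (a hypothetical base event, not from the CML sources), an
always-enabled event carrying a constant value could be represented
as follows; the priority value here is arbitrary:
[source,sml]
----
(* An event that is always ready and yields v when chosen. *)
fun alwaysBaseEvt (v: 'a): 'a base_evt =
   fn () => ENABLED {prio = 1, doFn = fn () => v}
----
A blocked event, by contrast, returns `BLOCKED f`, where `f` records
the synchronization context (`transId`, `cleanUp`, `next`) and only
returns once some other thread enables the event.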

When synchronizing on a set of base events, which are all blocked, we
must invoke each `BLOCKED` function with the same `transId` and
`cleanUp`. (The `transId` is checked and set to `CANCEL` by the
`cleanUp` function, which is invoked by the first enabled event; this
"fizzles" every other event in the synchronization group that later
becomes enabled.) However, each `BLOCKED` function is implemented by
a `callcc`, so that when the event is enabled, it throws back to the
point of synchronization. Hence, the `next` function (which doesn't
return) is invoked by the `BLOCKED` function to escape the `callcc`
and continue in the thread performing the synchronization. In SML/NJ
this is implemented as follows:
[source,sml]
----
fun ext ([], blockFns) = callcc (fn k => let
      val throw = throw k
      val (transId, setFlg) = mkFlg ()
      fun log [] = S.atomicDispatch ()
        | log (blockFn :: r) =
            throw (blockFn {
                transId = transId,
                cleanUp = setFlg,
                next = fn () => log r
              })
      in
        log blockFns; error "[log]"
      end)
----

(Note that `S.atomicDispatch` invokes the next continuation on the
ready queue.) This doesn't map well to the MLton thread model.
Although it follows the
----
callcc (fn k => ... throw k v)
----
model, the fact that `blockFn` will also attempt to do
----
callcc (fn k' => ... next ())
----
means that the naive transformation will result in nested `switch`-es.

We need to think a little more about what this code is trying to do.
Essentially, each `blockFn` wants to capture the current
continuation, hold on to it until the event is enabled, and continue
with `next`; when the event is enabled, before invoking the
continuation and returning to the synchronization point, the
`cleanUp` and other event-specific operations are performed.

To accomplish the same effect in the MLton thread implementation, we
have the following:
[source,sml]
----
datatype 'a status =
   ENABLED of {prio: int, doitFn: unit -> 'a}
 | BLOCKED of {transId: trans_id,
               cleanUp: unit -> unit,
               next: unit -> rdy_thread} -> 'a

type 'a base = unit -> 'a status

fun ext ([], blockFns): 'a =
   S.atomicSwitch
   (fn (t: 'a S.thread) =>
    let
       val (transId, cleanUp) = TransID.mkFlg ()
       fun log blockFns: S.rdy_thread =
          case blockFns of
             [] => S.next ()
           | blockFn :: blockFns =>
                (S.prep o S.new)
                (fn _ => fn () =>
                 let
                    val () = S.atomicBegin ()
                    val x = blockFn {transId = transId,
                                     cleanUp = cleanUp,
                                     next = fn () => log blockFns}
                 in S.switch (fn _ => S.prepVal (t, x))
                 end)
    in
       log blockFns
    end)
----

To avoid the nested `switch`-es, I run the `blockFn` in its own
thread, whose only purpose is to return to the synchronization point.
This corresponds to the `throw (blockFn {...})` in the SML/NJ
implementation. I'm worried that this implementation might be a
little expensive, starting a new thread for each blocked event
(though this only happens when there are multiple blocked events in a
synchronization group). But I don't see another way of implementing
this behavior in the MLton thread model.

Note that another way of thinking about what is going on is to
consider each `blockFn` as prepending a different set of actions to
the thread `t`. It might be possible to provide a
`MLton.Thread.unsafePrepend` operation:
[source,sml]
----
fun unsafePrepend (T r: 'a t, f: 'b -> 'a): 'b t =
   let
      val t =
         case !r of
            Dead => raise Fail "prepend to a Dead thread"
          | New g => New (g o f)
          | Paused (g, t) => Paused (fn h => g (f o h), t)
   in (* r := Dead; *)
      T (ref t)
   end
----
I have commented out the `r := Dead`, which would allow multiple
prepends to the same thread (i.e., not destroying the original thread
in the process). Of course, only one of the threads could be run: if
the original thread were in the `Paused` state, then multiple threads
would share the underlying runtime/primitive thread. Now, this
matches the "one-shot" nature of CML continuations/threads, but I'm
not comfortable with extending `MLton.Thread` with such an unsafe
operation.

Other than this complication with blocking multiple base events, the
port was quite routine. (As a very pleasant surprise, the CML
implementation in SML/NJ doesn't use any SML/NJ-isms.) There is a
slight difference in the way in which critical sections are handled
in SML/NJ and MLton; since `MLton.Thread.switch` _always_ leaves a
critical section, it is sometimes necessary to add additional
`atomicBegin`-s/`atomicEnd`-s to ensure that we remain in a critical
section after a thread switch.

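Concretely (a hypothetical sketch; `S.atomicBegin`, `S.atomicEnd`,
`S.switch`, `S.next`, and the waiter queue are assumed names, not the
actual ones from the sources), code that must remain in a critical
section on both sides of a thread switch ends up bracketed like this:
[source,sml]
----
fun atomicSuspend (waiters: unit S.thread Queue.queue): unit =
   let
      val () = S.atomicBegin ()
      (* ... atomic work before the switch ... *)
      val () = S.switch (fn t =>
                  (Queue.enqueue (waiters, t)
                   ; S.next ()))
      (* The switch left the critical section, so when another
       * thread resumes us we must re-enter it to finish the atomic
       * work after the switch. *)
      val () = S.atomicBegin ()
      (* ... atomic work after resumption ... *)
   in
      S.atomicEnd ()
   end
----
This is the same pattern as the `S.atomicBegin ()` at the start of
each per-`blockFn` thread in the `ext` code above.
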
While looking at virtually every file in the core CML implementation,
I took the liberty of simplifying things where it seemed possible; in
terms of style, the implementation is about half-way between Reppy's
original and MLton's.

Some changes of note:

* `util/` contains all pertinent data structures: (functional and
imperative) queues, (functional) priority queues. Hence, it should be
easier to switch in more efficient or real-time implementations.

* `core-cml/scheduler.sml`: in both implementations, this is where
most of the interesting action takes place. I've made the connection
between `MLton.Thread.t`-s and `ThreadId.thread_id`-s more abstract
than it is in the SML/NJ implementation, and encapsulated all of the
`MLton.Thread` operations in this module.

* Eliminated all of the "by hand" inlining.


== Future Extensions ==

The CML documentation says the following:
____

----
CML.joinEvt: thread_id -> unit event
----

* `joinEvt tid`
+
creates an event value for synchronizing on the termination of the
thread with the ID `tid`. There are three ways that a thread may
terminate: the function that was passed to `spawn` (or `spawnc`) may
return; it may call the `exit` function; or it may have an uncaught
exception. Note that `joinEvt` does not distinguish between these
cases; it also does not become enabled if the named thread deadlocks
(even if it is garbage collected).
____

I believe that `MLton.Finalizable` might be able to relax that last
restriction. Upon the creation of a `'a Scheduler.thread`, we could
attach a finalizer to the underlying `'a MLton.Thread.t` that enables
the `joinEvt` (in the associated `ThreadID.thread_id`) when the
`'a MLton.Thread.t` becomes unreachable.

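Such a finalizer might be attached roughly as follows (a sketch; the
`enableJoinEvt` operation on `ThreadID.thread_id` is an assumed name,
not one from the sources):
[source,sml]
----
fun newThread (tid: ThreadID.thread_id, f: 'a -> unit)
      : 'a MLton.Thread.t MLton.Finalizable.t =
   let
      val t = MLton.Finalizable.new (MLton.Thread.new f)
      (* When the underlying thread becomes unreachable, enable the
       * thread's joinEvt so that joiners do not block forever. *)
      val () =
         MLton.Finalizable.addFinalizer (t, fn _ => enableJoinEvt tid)
   in
      t
   end
----
All later uses of the thread would have to go through
`MLton.Finalizable.withValue`, so that the thread is not finalized
while it is still in use.
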
I don't know why CML doesn't have
----
CML.kill: thread_id -> unit
----
which has a fairly simple implementation -- setting a kill flag in
the `thread_id` and adjusting the scheduler to discard any killed
threads that it takes off the ready queue. The fairness of the
scheduler ensures that a killed thread will eventually be discarded.
The semantics are a little murky for blocked threads that are killed,
though. For example, consider a thread blocked on `SyncVar.mTake mv`
and a thread blocked on `SyncVar.mGet mv`. If the first thread is
killed while blocked, and a third thread does `SyncVar.mPut (mv, x)`,
then we might expect that we'll enable the second thread, and never
the first. But when only the ready queue is able to discard killed
threads, the `SyncVar.mPut` could enable the first thread (putting it
on the ready queue, from which it will be discarded) and leave the
second thread blocked. We could solve this by adjusting the
`TransID.trans_id` type and the "cleaner" functions to look for both
canceled transactions and transactions on killed threads.

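A sketch of that scheme (all names hypothetical, not from the
sources):
[source,sml]
----
(* In ThreadID: each thread_id carries a kill flag
 * (other fields elided). *)
datatype thread_id = TID of {killed: bool ref}

fun kill (TID {killed}) = killed := true
fun isKilled (TID {killed}) = !killed

(* In the scheduler: skip killed threads when dequeueing from the
 * ready queue; fairness guarantees that each killed thread is
 * eventually dequeued and dropped. *)
fun next (): S.rdy_thread =
   let
      val (tid, t) = Queue.dequeue readyQ
   in
      if isKilled tid then next () else t
   end
----
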
John Reppy says that <!Cite(MarlowEtAl01)> and <!Cite(FlattFindler04)>
explain why `CML.kill` would be a bad idea.

Between `CML.timeOutEvt` and `CML.kill`, one could give an efficient
solution to the recent `comp.lang.ml` post about terminating a
function that doesn't complete in a given time.
[source,sml]
----
fun timeOut (f: unit -> 'a, t: Time.time): 'a option =
   let
      val iv = SyncVar.iVar ()
      val tid = CML.spawn (fn () => SyncVar.iPut (iv, f ()))
   in
      CML.select
      [CML.wrap (CML.timeOutEvt t, fn () => (CML.kill tid; NONE)),
       CML.wrap (SyncVar.iGetEvt iv, fn x => SOME x)]
   end
----


== Space Safety ==

There are some CML-related posts on the MLton mailing list:

* http://www.mlton.org/pipermail/mlton/2004-May/

that discuss concerns that SML/NJ's implementation is not space
efficient, because multi-shot continuations can be held indefinitely
on event queues. MLton is better off because of the one-shot nature
of its threads -- when an event enables a thread, all other copies of
the thread waiting in other event queues get turned into dead threads
(of zero size).