MLtonProcess
============

[source,sml]
----
signature MLTON_PROCESS =
   sig
      type pid

      val spawn: {args: string list, path: string} -> pid
      val spawne: {args: string list, env: string list, path: string} -> pid
      val spawnp: {args: string list, file: string} -> pid

      type ('stdin, 'stdout, 'stderr) t

      type input
      type output

      type none
      type chain
      type any

      exception MisuseOfForget
      exception DoublyRedirected

      structure Child:
         sig
            type ('use, 'dir) t

            val binIn: (BinIO.instream, input) t -> BinIO.instream
            val binOut: (BinIO.outstream, output) t -> BinIO.outstream
            val fd: (Posix.FileSys.file_desc, 'dir) t -> Posix.FileSys.file_desc
            val remember: (any, 'dir) t -> ('use, 'dir) t
            val textIn: (TextIO.instream, input) t -> TextIO.instream
            val textOut: (TextIO.outstream, output) t -> TextIO.outstream
         end

      structure Param:
         sig
            type ('use, 'dir) t

            val child: (chain, 'dir) Child.t -> (none, 'dir) t
            val fd: Posix.FileSys.file_desc -> (none, 'dir) t
            val file: string -> (none, 'dir) t
            val forget: ('use, 'dir) t -> (any, 'dir) t
            val null: (none, 'dir) t
            val pipe: ('use, 'dir) t
            val self: (none, 'dir) t
         end

      val create:
         {args: string list,
          env: string list option,
          path: string,
          stderr: ('stderr, output) Param.t,
          stdin: ('stdin, input) Param.t,
          stdout: ('stdout, output) Param.t}
         -> ('stdin, 'stdout, 'stderr) t
      val getStderr: ('stdin, 'stdout, 'stderr) t -> ('stderr, input) Child.t
      val getStdin: ('stdin, 'stdout, 'stderr) t -> ('stdin, output) Child.t
      val getStdout: ('stdin, 'stdout, 'stderr) t -> ('stdout, input) Child.t
      val kill: ('stdin, 'stdout, 'stderr) t * Posix.Signal.signal -> unit
      val reap: ('stdin, 'stdout, 'stderr) t -> Posix.Process.exit_status
   end
----


== Spawn ==

The `spawn` functions provide an alternative to the
`fork`/`exec` idiom that is typically used to create a new
process. On most platforms, the `spawn` functions are simple
wrappers around `fork`/`exec`. However, under Windows, the
`spawn` functions are primitive. All `spawn` functions return
the process id of the spawned process. They differ in how the
executable is found and in the environment that it uses.

* `spawn {args, path}`
+
starts a new process running the executable specified by `path`
with the arguments `args`. Like `Posix.Process.exec`.

* `spawne {args, env, path}`
+
starts a new process running the executable specified by `path` with
the arguments `args` and environment `env`. Like
`Posix.Process.exece`.

* `spawnp {args, file}`
+
searches the `PATH` environment variable for an executable named `file`,
and starts a new process running that executable with the arguments
`args`. Like `Posix.Process.execp`.
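
As a minimal sketch of how the three variants differ, the following
calls each start `ls -l`. The location `/bin/ls` and the environment
string are assumptions about a typical Unix host. Because the
functions mirror `Posix.Process.exec*`, the sketch also assumes that
the first element of `args` is the conventional program name.

[source,sml]
----
(* Full path; the child inherits the parent's environment. *)
val pid1 = MLton.Process.spawn {path = "/bin/ls", args = ["ls", "-l"]}

(* Full path; explicit environment given as "key=value" strings. *)
val pid2 = MLton.Process.spawne {path = "/bin/ls",
                                 args = ["ls", "-l"],
                                 env = ["LC_ALL=C"]}

(* Executable located by searching PATH. *)
val pid3 = MLton.Process.spawnp {file = "ls", args = ["ls", "-l"]}
----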


== Create ==

`MLton.Process.create` provides functionality similar to
`Unix.executeInEnv`, but offers more control over the input,
output, and error streams. In addition, `create` works on all
platforms, including Cygwin and MinGW (Windows) where `Posix.fork` is
unavailable. For greatest portability, programs should still use the
standard `Unix.execute`, `Unix.executeInEnv`, and `OS.Process.system`.

The following types and sub-structures are used by the `create`
function. They provide static type checking of correct stream usage.

=== Child ===

* `('use, 'dir) Child.t`
+
This represents a handle to one of a child's standard streams. The
`'dir` is viewed with respect to the parent. Thus a `('a, input)
Child.t` handle means that the parent may input the output from the
child.

* `Child.{bin,text}{In,Out} h`
+
These functions take a handle and bind it to a stream of the named
type. The type system will detect attempts to reverse the direction
of a stream or to use the same stream in multiple, incompatible ways.

* `Child.fd h`
+
This function behaves like the other `Child.*` functions; it opens a
stream. However, it does not enforce whether you read from or write to
the handle. If you use the descriptor in an inappropriate direction, the
behavior is undefined. Furthermore, this function may be
unavailable on future MLton host platforms.

* `Child.remember h`
+
This function takes a stream of use `any` and resets the use of the
stream so that the stream may be used by `Child.*`. An `any` stream
may have had use `none` or `'use` prior to calling `Param.forget`. If
the stream was `none` and is used, `MisuseOfForget` is raised.
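
To illustrate how a handle is bound to a stream, here is a minimal
sketch that feeds text to a child's standard input. It relies on
`create`, `getStdin`, and `reap`, which are described further down
this page. The helper name `sortLines` and the location
`/usr/bin/sort` are assumptions made for illustration only.

[source,sml]
----
(* Hypothetical helper: pipe some lines into sort(1), whose own output
   goes straight to the parent's stdout. *)
fun sortLines (lines: string list): Posix.Process.exit_status =
   let
      open MLton.Process
      val p =
         create {args = [],
                 env = NONE,
                 path = "/usr/bin/sort",   (* assumed location *)
                 stderr = Param.self,
                 stdin = Param.pipe,
                 stdout = Param.self}
      (* The handle has direction `output`: the parent writes and the
         child reads. *)
      val out = Child.textOut (getStdin p)
      val () = List.app (fn l => TextIO.output (out, l ^ "\n")) lines
      (* Closing the stream delivers EOF so that sort can finish. *)
      val () = TextIO.closeOut out
   in
      reap p
   end
----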

=== Param ===

* `('use, 'dir) Param.t`
+
This is a handle to an input/output source and will be passed to the
created child process. The `'dir` is relative to the child process.
Input means that the child process will read from this stream.

* `Param.child h`
+
Connect the stream of the new child process to the stream of a
previously created child process. A single child stream should be
connected to only one child process or else `DoublyRedirected` will be
raised.

* `Param.fd fd`
+
This creates a stream from the provided file descriptor, which will be
closed when `create` is called. This function may not be available on
future MLton host platforms.

* `Param.forget h`
+
This hides the type of the actual parameter as `any`. This is useful
if you are implementing an application which conditionally attaches
the child process to files or pipes. However, you must ensure that
your use after `Child.remember` matches the original type.

* `Param.file s`
+
Open the given file and connect it to the child process. Note that the
file will be opened only when `create` is called, so any exceptions
will be raised there and not by this function. If used for `input`,
the file is opened read-only. If used for `output`, the file is opened
read-write.

* `Param.null`
+
In some situations, the child process should have its output
discarded. The `null` param when passed as `stdout` or `stderr` does
this. When used for `stdin`, the child process will either receive
`EOF` or a failure condition if it attempts to read from `stdin`.

* `Param.pipe`
+
This will connect the input/output of the child process to a pipe
which the parent process holds. This may later form the input to one
of the `Child.*` functions and/or the `Param.child` function.

* `Param.self`
+
This will connect the input/output of the child process to the
corresponding stream of the parent process.
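
The conditional attachment that `Param.forget` is meant for can be
sketched as follows. This is only an illustration under stated
assumptions: the helper name `runTo` is hypothetical, and the code
uses `create`, `getStdout`, and `reap` as described further down this
page.

[source,sml]
----
(* Hypothetical helper: run `path` and send its stdout either to a file
   or back to the parent, decided at run time. Returns the captured
   text, or "" when the output went to a file. *)
fun runTo (path: string, dest: string option): string =
   let
      open MLton.Process
      (* Both branches must have one type, so the use is hidden as `any`. *)
      val out =
         case dest of
            SOME file => Param.forget (Param.file file)
          | NONE => Param.forget Param.pipe
      val p =
         create {args = [], env = NONE, path = path,
                 stderr = Param.self, stdin = Param.null, stdout = out}
      val text =
         case dest of
            (* The stream had use `none`; using it would be a misuse. *)
            SOME _ => ""
            (* Only the pipe case may be remembered and read. *)
          | NONE =>
               TextIO.inputAll (Child.textIn (Child.remember (getStdout p)))
      val _ = reap p
   in
      text
   end
----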

=== Process ===

* `type ('stdin, 'stdout, 'stderr) t`
+
represents a handle to a child process. The type arguments capture
how the named stream of the child process may be used.

* `type any`
+
bypasses the type system in situations where an application does not
want it to enforce correct usage. See `Child.remember` and
`Param.forget`.

* `type chain`
+
means that the child process's stream was connected via a pipe to the
parent process. The parent process may pass this pipe in turn to
another child, thus chaining them together.

* `type input, output`
+
record the direction that a stream flows. They are used as a part of
`Param.t` and `Child.t` and are detailed there.

* `type none`
+
means that the child process's stream may not be used by the parent
process. This happens when the child process is connected directly to
some source.
+
The types `BinIO.instream`, `BinIO.outstream`, `TextIO.instream`,
`TextIO.outstream`, and `Posix.FileSys.file_desc` are also valid types
with which to instantiate child streams.

* `exception MisuseOfForget`
+
may be raised if `Child.remember` and `Param.forget` are used to
bypass the normal type checking. This exception will only be raised
in cases where the `forget` mechanism allows a misuse that would be
impossible with the type-safe versions.

* `exception DoublyRedirected`
+
is raised if a stream connected to a child process is redirected to two
separate child processes. It is safe, though bad style, to use a
`Child.t` with the same `Child.*` function repeatedly.

* `create {args, path, env, stderr, stdin, stdout}`
+
starts a child process with the given command-line `args` (excluding
the program name). `path` should be an absolute path to the executable
run in the new child process; relative paths work, but are less
robust. Optionally, the environment may be overridden with `env`
where each string element has the form `"key=value"`. The `std*`
options must be provided by the `Param.*` functions documented above.
+
Processes which are `create`-d must be either `reap`-ed or `kill`-ed.

* `getStd{in,out,err} proc`
+
gets a handle to the specified stream. These should be used by the
`Child.*` functions. Failure to use a stream connected via pipe to a
child process may result in runtime dead-lock and elicits a compiler
warning.

* `kill (proc, sig)`
+
terminates the child process immediately. The signal may or may not
mean anything depending on the host platform. A good value is
`Posix.Signal.term`.

* `reap proc`
+
waits for the child process to terminate and returns its exit status.
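
As a small sketch of the `create`/`reap` life cycle, the code below
runs a program with none of its streams connected to the parent and
turns the resulting exit status into text. The helper names `describe`
and `runQuietly` are hypothetical; the constructors are those of
`Posix.Process.exit_status` from the Basis Library.

[source,sml]
----
(* Render an exit status as text. Word8.toInt is used because
   Word8.toString would print the status in hexadecimal. *)
fun describe (st: Posix.Process.exit_status): string =
   case st of
      Posix.Process.W_EXITED => "exited normally"
    | Posix.Process.W_EXITSTATUS w =>
         "exited with status " ^ Int.toString (Word8.toInt w)
    | Posix.Process.W_SIGNALED _ => "terminated by a signal"
    | Posix.Process.W_STOPPED _ => "stopped by a signal"

(* Hypothetical helper: run a program with every stream set to
   Param.null, then wait for it and report how it ended. *)
fun runQuietly (path: string, args: string list): string =
   let
      open MLton.Process
      val p =
         create {args = args, env = NONE, path = path,
                 stderr = Param.null,
                 stdin = Param.null,
                 stdout = Param.null}
   in
      describe (reap p)
   end
----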


== Important usage notes ==

When building an application with many pipes between child processes,
it is important to ensure that there are no cycles in the undirected
pipe graph. If this property is not maintained, deadlocks become a
serious risk, and they may appear only under conditions that are
difficult to reproduce.

The danger lies in that most operating systems implement pipes with a
fixed buffer size. If process A has two output pipes which process B
reads, it can happen that process A blocks writing to pipe 2 because
it is full while process B blocks reading from pipe 1 because it is
empty. This same situation can happen with any undirected cycle formed
between processes (vertices) and pipes (undirected edges) in the
graph.

It is possible to make this safe using low-level I/O primitives for
polling. However, these primitives are not very portable and are
difficult to use properly. A far better approach is to make sure you
never create a cycle in the first place.

For these reasons, `Unix.executeInEnv` is a very dangerous
function. Be careful when using it to ensure that the child process
operates on either `stdin` or `stdout`, but not both.
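
One way to avoid such a cycle when a child needs both input and
output is to stage the input in a file, so that only a single pipe
connects parent and child. The following is a sketch of that idea
rather than a definitive recipe; the helper name `filterVia` is
hypothetical, and the temporary file comes from the Basis Library's
`OS.FileSys.tmpName`.

[source,sml]
----
(* Hypothetical helper: filter `text` through the program at `path`
   without ever holding two pipes to the same child. *)
fun filterVia (path: string, args: string list, text: string): string =
   let
      open MLton.Process
      (* Stage the input in a temporary file instead of a stdin pipe. *)
      val tmp = OS.FileSys.tmpName ()
      val out = TextIO.openOut tmp
      val () = TextIO.output (out, text)
      val () = TextIO.closeOut out
      val p =
         create {args = args, env = NONE, path = path,
                 stderr = Param.self,
                 stdin = Param.file tmp,   (* no pipe on the input side *)
                 stdout = Param.pipe}      (* the only pipe to the parent *)
      (* Read everything the child writes, then wait for it to exit. *)
      val result = TextIO.inputAll (Child.textIn (getStdout p))
      val _ = reap p
      val () = OS.FileSys.remove tmp
   in
      result
   end
----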


== Example use of MLton.Process.create ==

The following example program launches the `ipconfig` utility, pipes
its output through `grep`, and then reads the result back into the
program.

[source,sml]
----
open MLton.Process

(* Run ipconfig with its stdout connected to a pipe held by the parent. *)
val p =
   create {args = [ "/all" ],
           env = NONE,
           path = "C:\\WINDOWS\\system32\\ipconfig.exe",
           stderr = Param.self,
           stdin = Param.null,
           stdout = Param.pipe}

(* Chain that pipe into grep's stdin; grep's stdout is a new pipe. *)
val q =
   create {args = [ "IP-Ad" ],
           env = NONE,
           path = "C:\\msys\\bin\\grep.exe",
           stderr = Param.self,
           stdin = Param.child (getStdout p),
           stdout = Param.pipe}

(* Print every line read from the stream, wrapped in quotes. *)
fun suck h =
   case TextIO.inputLine h of
      NONE => ()
    | SOME s => (print ("'" ^ s ^ "'\n"); suck h)

val () = suck (Child.textIn (getStdout q))

(* Every created process must be reaped (or killed). *)
val _ = reap q
val _ = reap p
----