Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | |
2 | Date: Tue, 23 Jul 2002 11:49:57 -0400 (EDT) | |
3 | From: Matthew Fluet <fluet@CS.Cornell.EDU> | |
4 | ||
5 | ||
6 | John and SML implementers, | |
7 | ||
8 | Here is a loose collection of notes I've taken while starting to | |
9 | update the MLton implementation of the SML Basis Library to the latest | |
10 | version. They span quite a range: errata and typos, signature | |
11 | constraint concerns, and some design questions. Thus far, I've looked | |
12 | at the structures that had been grouped under the headings General, | |
13 | Text, Integer, Reals, Lists, and Arrays and Vectors (i.e., excluding | |
14 | IO, System, and Posix) in the "old" web specification. | |
15 | ||
16 | A few high level comments: | |
17 | ||
18 | * As an organizational principle, I liked the grouping of modules into | |
19 | larger collections used in the "old" web specification better than | |
20 | the long alphabetical list. | |
21 | * I'm quite happy to see opaque signature matches for most structures. | |
22 | In particular, I think it will help avoid porting problems between | |
23 | implementations that provide different INTEGER structures, especially | |
24 | when LargeInt = Int in one implementation and LargeInt = IntInf in | |
25 | another. | |
26 | ||
27 | Required and optional components, Top-level: | |
28 | ||
29 | * A number of structures have an opaque signature match in | |
30 | overview.html, but not in the corresponding structure specific page: | |
31 | General, Bool, Option, List, ListPair, IntInf, | |
32 | Array, ArraySlice, Vector, VectorSlice. | |
33 | * Word8Array2 is listed as required in overview.html, | |
34 | but its signature, MONO_ARRAY2, is not required. | |
35 | Furthermore, Word8Array2 is marked optional in mono-array2.html. | |
36 | I don't quite see a rationale for Word8Array2 being required. | |
37 | * With the addition of val ~ : word -> word to the WORD signature, | |
38 | presumably ~ should be overloaded at num, rather than at intreal. | |
39 | ||
40 | Reals: | |
41 | ||
42 | * In pack-float.html, the where type clauses are incorrect: | |
43 | structure PackRealBig :> PACK_REAL | |
44 | where type PackRealBig.real = Real.real | |
45 | should be | |
46 | structure PackRealBig :> PACK_REAL | |
47 | where type real = Real.real | |
48 | * Likewise, in most places, references to basic types are unqualified, | |
49 | so perhaps the where clause should read | |
50 | where type real = real | |
51 | for the PackRealBig and PackRealLittle structures. | |
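The point about where clauses can be illustrated with a small sketch. The signature MINI_REAL and the structure MiniReal are hypothetical stand-ins, not part of the Basis; the sketch only assumes that a where type clause names the type as it appears inside the signature (here, plain real), not as a path through the structure being defined.

```sml
(* Hypothetical cut-down signature; NOT the Basis PACK_REAL. *)
signature MINI_REAL =
sig
  type real
  val double : real -> real
end

(* The where clause refers to the signature's own type name, "real".
   Writing "where type MiniReal.real = Real.real" would be rejected,
   since the signature has no component called MiniReal. *)
structure MiniReal :> MINI_REAL where type real = Real.real =
struct
  type real = Real.real
  fun double (x : real) : real = x + x
end

(* Type-checks only because the where clause exposes real = Real.real
   through the opaque match. *)
val four = MiniReal.double 2.0
```

Without the where clause, MiniReal.real would be abstract and the last line would fail to type-check.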
52 | ||
53 | Arrays and Vectors: | |
54 | ||
55 | * In vector-slice.html, the description of subslice references |arr| | |
56 | when it should reference |sl|. | |
57 | * In {[mono-]array[-slice],[mono-]vector[-slice]}.html, the | |
58 | description of findi references appi when it should reference findi. | |
59 | * In mono-array-slice.html, structure CharArraySlice has the clause | |
60 | where type array = CharVector.vector | |
61 | which should be | |
62 | where type array = CharArray.array. | |
63 | * In mono-{vector[-slice],array[-slice],array2}.html, there are | |
64 | Word<N> structures but no (default word) Word structures. | |
65 | * In mono-vector.html, structure CharVector has the clause | |
66 | where type elem = Char.char | |
67 | while the other monomorphic vectors of basic types reference | |
68 | the unqualified type; i.e. structure BoolVector has the clause | |
69 | where type elem = bool. | |
70 | * There are no "See also"'s into MONO_VECTOR_SLICE or MONO_ARRAY_SLICE | |
71 | from MONO_VECTOR or MONO_ARRAY. | |
72 | * A long discussion about types defined in | |
73 | [MONO_]{ARRAY,VECTOR}[_SLICE] signatures; deferred to a separate | |
74 | email. | |
75 | ||
76 | Really nit-picky: | |
77 | ||
78 | * Ordering of comparison functions (>, >=, etc.) and unary negation | |
79 | are different within INTEGER and WORD. | |
80 | * Ordering of functions in CHAR seems awkward. | |
81 | * Ordering of full, slice, subslice different in ARRAY_SLICE and | |
82 | VECTOR_SLICE. | |
83 | * Ordering of foldi/fold and modifi/modify different in ARRAY2 and | |
84 | MONO_ARRAY2. | |
85 | ||
86 | Top-level and opaque signatures: | |
87 | * I think it would be useful to see the entire top-level of required | |
88 | structures written out with their respective signature constraints. | |
89 | For example, in the description of the Math structure, the spec | |
90 | reads: "The top-level structure Math provides these functions for | |
91 | the default real type Real.real." Because the top-level Math | |
92 | structure has an opaque signature match (in overview.html), the | |
93 | sentence above implies that there ought to be the constraint | |
94 | where type real = real (or Real.real). | |
95 | Granted, none of the other structures in overview.html have where | |
96 | clauses, and most type constraints are documented in the structure | |
97 | specific pages, but the constraint on the top-level Math.real | |
98 | slipped my mind when I first looked at it. | |
99 | ||
100 | -Matthew | |
101 | ||
102 | ****************************************************************************** | |
103 | ****************************************************************************** | |
104 | ||
105 | Date: Tue, 23 Jul 2002 11:54:09 -0400 (EDT) | |
106 | From: Matthew Fluet <fluet@CS.Cornell.EDU> | |
107 | ||
108 | ||
109 | As promised, here is a longish look at the types used in Arrays and | |
110 | Vectors. | |
111 | ||
112 | Array and Vector design: | |
113 | ||
114 | * The ARRAY signature includes type 'a vector. | |
115 | Presumably, type 'a Array.vector = type 'a Vector.vector, but no | |
116 | constraint makes this explicit. | |
117 | * MONO_ARRAY_SLICE includes type vector and type vector_slice, | |
118 | while the ARRAY_SLICE signature explicitly references | |
119 | 'a VectorSlice.slice and 'a Vector.vector. | |
120 | * VECTOR_SLICE doesn't include 'a vector, but has | |
121 | val mapi : (int * 'a -> 'b) -> 'a slice -> 'b vector | |
122 | val map : ('a -> 'b) -> 'a slice -> 'b vector; | |
123 | On the other hand, full, slice, base, vector, and concat | |
124 | reference 'a Vector.vector. | |
125 | ||
126 | For consistency, I'd prefer to see | |
127 | signature VECTOR = | |
128 | sig type 'a vector ... end | |
129 | signature VECTOR_SLICE = | |
130 | sig type 'a vector type 'a slice ... end | |
131 | signature ARRAY = | |
132 | sig type 'a vector type 'a array ... end | |
133 | signature ARRAY_SLICE = | |
134 | sig type 'a vector type 'a vector_slice | |
135 | type 'a array type 'a slice ... end | |
136 | signature MONO_VECTOR = | |
137 | sig type elem type vector ... end | |
138 | signature MONO_VECTOR_SLICE = | |
139 | sig type elem type vector type slice ... end | |
140 | signature MONO_ARRAY = | |
141 | sig type elem type vector type array ... end | |
142 | signature MONO_ARRAY_SLICE = | |
143 | sig type elem type vector type vector_slice | |
144 | type array type slice ... end | |
145 | ||
146 | structure Vector :> VECTOR | |
147 | structure VectorSlice :> VECTOR_SLICE | |
148 | where type 'a vector = 'a Vector.vector | |
149 | structure Array :> ARRAY | |
150 | where type 'a vector = 'a Vector.vector | |
151 | structure ArraySlice :> ARRAY_SLICE | |
152 | where type 'a vector = 'a Vector.vector | |
153 | where type 'a vector_slice = 'a VectorSlice.slice | |
154 | where type 'a array = 'a Array.array | |
155 | structure BoolVector :> MONO_VECTOR | |
156 | where type elem = bool | |
157 | structure BoolVectorSlice :> MONO_VECTOR_SLICE | |
158 | where type elem = bool | |
159 | where type vector = BoolVector.vector | |
160 | structure BoolArray :> MONO_ARRAY | |
161 | where type elem = bool | |
162 | where type vector = BoolVector.vector | |
163 | structure BoolArraySlice :> MONO_ARRAY_SLICE | |
164 | where type elem = bool | |
165 | where type vector = BoolVector.vector | |
166 | where type vector_slice = BoolVectorSlice.slice | |
167 | where type array = BoolArray.array | |
168 | ||
169 | While semantically, this shouldn't be any different than the | |
170 | specification, it could affect type-error messages. For example, if I | |
171 | have the structure Foo: | |
172 | ||
173 | structure Foo = struct | |
174 | open BoolArraySlice | |
175 | ||
176 | fun copyVec0 {src: vector_slice, | |
177 | dst: array} = copyVec {src = src, dst = dst, di = 0} | |
178 | end | |
179 | ||
180 | which I decide to generalize to polymorphic array slices, then just | |
181 | changing BoolArraySlice to ArraySlice will lead to different | |
182 | type-error messages: either "unbound type constructor: vector_slice" | |
183 | (under the specification) or "type constructor vector_slice given 0 | |
184 | arguments, wants 1" (under the signatures given above); and an arity | |
185 | error for array in either case. It's not much of an argument, but I | |
186 | need to replace vector_slice with 'a VectorSlice.slice under the | |
187 | specification, while I only need to add 'a under the sigs above. | |
188 | ||
189 | ||
190 | Array2: | |
191 | * Why not have an ARRAY2_REGION analogous to ARRAY_SLICE? | |
192 | Likewise, how about VECTOR2 and VECTOR2_REGION? | |
193 | I think the decision to separate Arrays and Vectors from | |
194 | their corresponding slices is a nice design choice, and I'd be in | |
195 | favor of extending it to multi-dimensional ones. | |
196 | * Should ARRAY2 have findi/find, exists, all? collate? | |
197 | ||
198 | ****************************************************************************** | |
199 | ****************************************************************************** | |
200 | ||
201 | Date: Thu, 25 Jul 2002 15:20:01 +0200 | |
202 | From: Andreas Rossberg <rossberg@ps.uni-sb.de> | |
203 | ||
204 | ||
205 | Like Matthew I started implementing the latest version of the Basis spec | |
206 | for Alice and Hamlet. I'm quite happy with most of the changes. It was a | |
207 | surprise to discover the presence of a Windows structure, though :-) | |
208 | ||
209 | Here is my list of comments, some of which may duplicate observations | |
210 | already made by Matthew. They primarily cover global issues and the | |
211 | required part of the library, though I haven't looked deeper into the IO | |
212 | and Posix parts yet. I also included some proposals for modest additions | |
213 | to the library, which I believe are useful and fit its spirit. | |
214 | ||
215 | ||
216 | Trivial bugs, typos, cosmetics | |
217 | ------------------------------ | |
218 | ||
219 | * Overview: | |
220 | - INT_INF appears in the list of required signatures. | |
221 | - WordArray2 appears under the list of required structures, | |
222 | instead of optional ones. | |
223 | ||
224 | * LIST_PAIR: | |
225 | - Typo in description of allEq: double "the". | |
226 | ||
227 | * SUBSTRING: | |
228 | - The scan example uses the deprecated "all" function. | |
229 | ||
230 | * VECTOR_SLICE: | |
231 | - Typo in synopsis of subslice: s/opt/sz/. | |
232 | - Typo in description of subslice: s/|arr|/|sl|/. | |
233 | - Typo in description of findi: s/appi/findi/. | |
234 | - Signature sometimes uses Vector.vector instead of plain vector. | |
235 | - The equation for mapi can be simplified to: | |
236 | Vector.fromList (foldri (fn (i,a,l) => f(i,a)::l) [] slice) | |
237 | ||
238 | * MONO_VECTOR_SLICE and ARRAY_SLICE and MONO_ARRAY_SLICE: | |
239 | - Typo in synopsis of subslice: s/opt/sz/. | |
240 | - Typo in description of findi: s/appi/findi/. | |
241 | ||
242 | * BYTE: | |
243 | - Accidental "val" keyword in synopsis of some functions. | |
244 | ||
245 | * TEXT_IO: | |
246 | - The "where" constraints contain erroneously qualified ids. | |
247 | - The specification of the TEXT_IO signature is not valid SML'97, | |
248 | since StreamIO is specified twice. You might want to add a | |
249 | comment regarding that. | |
250 | - The constraints for types vector and elem are redundant | |
251 | (in fact, invalid), because the signature TEXT_STREAM_IO | |
252 | already specifies the necessary equations. | |
253 | ||
254 | * The use of variable names is sometimes inconsistent: | |
255 | - Predicate arguments to higher-order functions are usually | |
256 | named "f" (eg. List.all), sometimes "p" (eg. String.tokens, | |
257 | StringCvt.splitl), and sometimes even "pred" (eg. ListPair.all). | |
258 | - Similarly, fold functions mostly use "init" to name initial | |
259 | accumulators, except in the List and ListPair modules. | |
260 | ||
261 | ||
262 | ||
263 | Ambiguities / Unclear Details | |
264 | ----------------------------- | |
265 | ||
266 | * Overview: | |
267 | - The subsection about dependencies among optional modules has | |
268 | disappeared. Does that mean that there aren't any anymore? | |
269 | (The nice subsection about design rules and conventions also | |
270 | has gone.) | |
271 | ||
272 | * The intended meaning of opaque signature constraints is not always | |
273 | clear to me. Sometimes the prose contains remarks about additional | |
274 | equalities that are not apparent from the signature constraints. | |
275 | For example, is or isn't | |
276 | - Text.Char.char = Char.char ? (and so on for the rest of Text) | |
277 | - LargeInt.int = IntN.int (for some structure IntN) ? | |
278 | (likewise LargeWord.word, LargeReal.real) | |
279 | - Char.string = String.string ? | |
280 | - Math.real = Real.real ? | |
281 | In particular, the spec sometimes speaks of "equal structures", | |
282 | which has no real technical meaning in SML'97. | |
283 | Note that from the opaque matching on the overview page one might | |
284 | even conclude that General.unit <> {} ! | |
285 | ||
286 | * The type specification of String.string and CharVector.vector | |
287 | is circular: | |
288 | structure String :> STRING | |
289 | where type string = CharVector.vector | |
290 | structure CharVector :> MONO_VECTOR | |
291 | where type vector = String.string | |
292 | Likewise for Substring.substring and CharVectorSlice.slice. | |
293 | A respective defining structure should be chosen. | |
294 | ||
295 | * STRING: | |
296 | - Function fromString has a special case that is not covered by | |
297 | implementing the function through straight-forward iterative | |
298 | application of the Char.scan function, namely a trailing gap | |
299 | escape (\f...f\) as in "foo\\ \\" or "foo\\ \\\000" (where \000 | |
300 | is a non-convertible character). Several implementations I | |
301 | tried get that detail wrong, so a corresponding note might be | |
302 | in order. Moreover, it is not completely obvious from the | |
303 | description what the result should be for strings that contain | |
304 | a gap escape as the only convertible sequence, e.g. "\\ \\" or | |
305 | "\\ \\\000" - it is supposed to be SOME "", I guess. | |
306 | ||
307 | * SUBSTRING: | |
308 | - Shouldn't span raise Span if i' < i? Otherwise, contrary | |
309 | to the prose, it in fact accepts arguments where ss' is | |
310 | left to ss, as long as they overlap (which is rather odd). | |
311 | - For the curried triml/trimr it is not clear whether a | |
312 | Subscript exception has to be raised already if k < 0 but no | |
313 | second argument is applied. | |
314 | ||
315 | ||
316 | ||
317 | Naming and structuring | |
318 | ---------------------- | |
319 | ||
320 | Its nicely chosen regular naming conventions and structure are two of | |
321 | the aspects I like most about the Standard Basis. The following list | |
322 | enumerates the few cases where I feel that the spec violates its own | |
323 | conventions. | |
324 | ||
325 | * WORD: | |
326 | - The fromLargeWord and toLargeWord functions should drop | |
327 | the "Word" suffix to be consistent with the corresponding | |
328 | functions in the REAL and INTEGER signatures. | |
329 | ||
330 | * CHAR: | |
331 | - The functions contains/notContains should be moved to the | |
332 | STRING signature, as they are similar to find/exists | |
333 | operations and thus functionality of the aggregate. The | |
334 | type string could then be removed from the signature. | |
335 | ||
336 | * ARRAY_SLICE and MONO_ARRAY_SLICE: | |
337 | - The function copyVec seems completely out of place: it does | |
338 | neither operate on array slices, nor on vectors. But honestly | |
339 | I have got no idea where else to put it :-( | |
340 | ||
341 | * STRING and SUBSTRING: | |
342 | - There is a certain asymmetry between slices and substrings | |
343 | which tends to confuse at least myself when hacking. For more | |
344 | consistency I propose: | |
345 | (1) changing the type of Substring.substring to | |
346 | string * int * int option -> substring | |
347 | (for consistency with VectorSlice.slice), | |
348 | (2) renaming Substring.slice to Substring.subsubstring, | |
349 | (for consistency with VectorSlice.subslice), | |
350 | (3) removing Substring.{app,foldl,foldr} (there are no similar | |
351 | functions in the STRING signature, and in both cases they | |
352 | are available through CharVector/CharVectorSlice), | |
353 | (4) removing String.extract and Substring.extract (the same | |
354 | functionality is available through CharVector[Slice]). | |
355 | - I believe the deprecated Substring.all can be removed for good. | |
356 | After all, there are more serious incompatible changes being | |
357 | made (e.g. array copying functions). | |
358 | ||
359 | * Vectors and arrays: | |
360 | - While the lib consistently uses the to/from convention for | |
361 | conversions on basic types, it sometimes uses ad hoc conventions | |
362 | for aggregates. I propose renaming: | |
363 | (1) Array.vector to Array.toVector | |
364 | (2) VectorSlice.vector to VectorSlice.toVector, | |
365 | (3) ArraySlice.vector to ArraySlice.toVector, | |
366 | (4) Substring.string to Substring.toString, | |
367 | - Since the copy functions have only 3, mostly distinctly typed | |
368 | arguments now, there no longer seems to be a strong reason to | |
369 | require passing those by notationally heavy records. | |
370 | ||
371 | * INT_INF: | |
372 | - The presence of bit fiddling operators in that signature is | |
373 | something that feels exceptionally ad-hoc. Either they should | |
374 | be available for all integer types, or there should be a | |
375 | separate WORD_INF, with appropriate conversions, that makes | |
376 | these available. | |
377 | ||
378 | * Toplevel: | |
379 | - Now that there is Word.~ (which is good) it seems rather odd | |
380 | that the toplevel ~ is not overloaded for words, i.e. does not | |
381 | have type num -> num. | |
382 | ||
383 | * Net functionality: | |
384 | - I really like the idea of structuring the library namespace as | |
385 | it has been done with the OS and Posix structures. I would | |
386 | prefer to see something similar being done for the added | |
387 | network functionality. More precisely, I propose | |
388 | (1) moving the structures Socket, INetSock, GenericSock, and | |
389 | the three Net*DB structures into a new wrapper structure | |
390 | Net (renaming Net*DB to *DB), | |
391 | (2) defining a corresponding signature NET, | |
392 | (3) renaming the signatures SOCKET, GENERIC_SOCK and INET_SOCK | |
393 | to NET_SOCKET, NET_GENERIC_SOCK and NET_INET_SOCK, resp., | |
394 | (4) moving UnixSock to the Unix structure (renamed as Socket). | |
395 | ||
396 | ||
397 | ||
398 | Misc. proposals for additional functionality | |
399 | -------------------------------------------- | |
400 | ||
401 | Here is a small collection of miscellaneous simple functions which I | |
402 | believe the library is still lacking, either because they are commonly | |
403 | useful or because they would make the library more regular. | |
404 | ||
405 | * LIST and LIST_PAIR: | |
406 | - The IMHO single most convenient extension to the library would | |
407 | be indexed morphisms on lists, i.e. adding | |
408 | val appi : (int * 'a -> unit) -> 'a list -> unit | |
409 | val mapi : (int * 'a -> 'b) -> 'a list -> 'b list | |
410 | val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a list -> 'b | |
411 | val foldri : (int * 'a * 'b -> 'b) -> 'b -> 'a list -> 'b | |
412 | val findi : (int * 'a -> bool) -> 'a list -> (int * 'a) option | |
413 | - Likewise for LIST_PAIR. | |
414 | - LIST_PAIR does not support partial mapping: | |
415 | val mapPartial : ('a * 'b -> 'c option) -> | |
416 | 'a list * 'b list -> 'c list | |
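The indexed list functions proposed above can be sketched on top of the existing List and ListPair structures. The structure name ListIndexed is hypothetical, and the left-to-right application order in mapi is an assumption; the proposal gives only the types.

```sml
(* Hypothetical structure; these functions are proposals, not Basis members. *)
structure ListIndexed =
struct
  (* Apply f to each (index, element) pair, left to right. *)
  fun appi f xs =
    let
      fun loop (_, []) = ()
        | loop (i, x :: rest) = (f (i, x); loop (i + 1, rest))
    in
      loop (0, xs)
    end

  (* Build the result list, calling f in left-to-right order. *)
  fun mapi f xs =
    let
      fun loop (_, []) = []
        | loop (i, x :: rest) =
            let val y = f (i, x) in y :: loop (i + 1, rest) end
    in
      loop (0, xs)
    end

  (* Thread the index alongside the accumulator. *)
  fun foldli f init xs =
    let
      val (_, result) =
        List.foldl (fn (x, (i, acc)) => (i + 1, f (i, x, acc))) (0, init) xs
    in
      result
    end

  (* Same idea from the right; the index starts at length - 1. *)
  fun foldri f init xs =
    let
      val (_, result) =
        List.foldr (fn (x, (i, acc)) => (i - 1, f (i, x, acc)))
                   (List.length xs - 1, init) xs
    in
      result
    end

  (* Return the first (index, element) pair satisfying p, if any. *)
  fun findi p xs =
    let
      fun loop (_, []) = NONE
        | loop (i, x :: rest) =
            if p (i, x) then SOME (i, x) else loop (i + 1, rest)
    in
      loop (0, xs)
    end

  (* Partial mapping over a pair of lists; like ListPair.zip, this
     ignores the excess of the longer list. *)
  fun mapPartialPair f (xs, ys) = List.mapPartial f (ListPair.zip (xs, ys))
end

(* Usage: labelled = ["0:a", "1:b"] *)
val labelled =
  ListIndexed.mapi (fn (i, s) => Int.toString i ^ ":" ^ s) ["a", "b"]
```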
417 | ||
418 | * LIST, VECTOR, ARRAY, etc.: | |
419 | - Another function on lists that would be very useful from my | |
420 | perspective is | |
421 | val appr : ('a -> unit) -> 'a list -> unit | |
422 | and its indexed sibling | |
423 | val appri : (int * 'a -> unit) -> 'a list -> unit | |
424 | which traverse the list from right to left. | |
425 | - Likewise for all aggregate types. | |
426 | - All aggregates come with a fromList function. I often feel the | |
427 | need to have inverse toList functions. Use of foldr is obfuscating. | |
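For comparison, here is how the proposed appr and toList functions look when written against the current library. The names appr, vectorToList and arrayToList are hypothetical, and this appr reverses the list first, which costs an extra traversal that a library version might avoid.

```sml
(* Hypothetical helpers; not Basis members. *)

(* Apply f to the elements from right to left. *)
fun appr f xs = List.app f (List.rev xs)

(* The foldr idiom that a toList function would hide. *)
fun vectorToList v = Vector.foldr (op ::) [] v
fun arrayToList a = Array.foldr (op ::) [] a

(* Usage: xs = [1, 2, 3] *)
val xs = vectorToList (Vector.fromList [1, 2, 3])
```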
428 | ||
429 | * OPTION: | |
430 | - Often using isSome is a bit clumsy. I thus propose adding the dual | |
431 | val isNone : 'a option -> bool | |
432 | ||
433 | * STRING and SUBSTRING: | |
434 | - For historical reasons we have {String,Substring}.size instead | |
435 | of *.length, which is inconsistent with all other aggregates and | |
436 | frequently lets me mix them up when I use them side by side. | |
437 | I propose adding aliases | |
438 | String.maxLen | |
439 | String.length | |
440 | Substring.length | |
441 | ||
442 | * WideChar and WideString: | |
443 | - There is no convenient way to convert between the standard and | |
444 | wide character set. Would it be reasonable to introduce LargeChar | |
445 | and LargeString structures (and so on) and have the CHAR and | |
446 | STRING signatures enriched by fromLarge/toLarge functions, as for | |
447 | numbers? That would also allow a program to select the widest | |
448 | character set available (which is currently impossible within the | |
449 | language). | |
450 | ||
451 | * String conversion: | |
452 | - I don't quite see the rationale for which signatures contain a | |
453 | scan function and which don't. I believe it makes sense to have | |
454 | scan in every signature that has fromString. | |
455 | - There should be a function | |
456 | val scanC : (Char.char, 'a) StringCvt.reader | |
457 | -> (char, 'a) StringCvt.reader | |
458 | to scan strings as C characters. This would make Char.fromCString | |
459 | and particularly String.fromCString more modular. | |
460 | - How about a dual writer abstraction as with | |
461 | type ('a,'b) writer = 'a * 'b -> 'b option | |
462 | and supporting fmt functions for basic types? Such a thing might | |
463 | be useful for writing to streams or buffers. | |
464 | ||
465 | * Vectors: | |
466 | For some time now I have been trying to use vectors more often | |
467 | instead of an often inappropriate list representation. This is | |
468 | sometimes made more difficult simply because the library support | |
469 | isn't as good as for lists. It improved in the updated version | |
470 | but still I miss: | |
471 | - Array.fromVector, | |
472 | - Vector.mapPartial, | |
473 | - Vector.rev, | |
474 | - Vector.append (though I guess concat is good enough), | |
475 | - most of all: a VectorPair structure. | |
476 | ||
477 | * Hash functions: | |
478 | - Giving every basic type a (default) hash function in addition to | |
479 | comparison would be quite useful in conjunction with container | |
480 | libraries. | |
481 | ||
482 | * There is no defining structure for references. I would like to see | |
483 | signature REF | |
484 | structure Ref : REF | |
485 | where REF contains: | |
486 | datatype ref = datatype ref | |
487 | val ! : 'a ref -> 'a | |
488 | val := : 'a ref * 'a -> unit | |
489 | val swap : 'a ref * 'a ref -> unit (* or :=: ? *) | |
490 | val map : ('a -> 'a) -> 'a ref -> 'a ref | |
491 | You might then consider removing ! and := from GENERAL. | |
492 | ||
493 | * Signature conventions: | |
494 | Some additional conventions would make use of Basis types as | |
495 | functor arguments more convenient: | |
496 | - Each signature defining an abstract type should make that | |
497 | type available under the alias "t" as well (this includes | |
498 | monomorphic types as well as polymorphic ones). | |
499 | - Every equality type should come with an explicit equality | |
500 | function | |
501 | val eq : t * t -> bool | |
502 | to move away from the reliance on eqtypes. | |
503 | - There should be a uniform name for canonical constructor | |
504 | functions, e.g. "new" (or at least an alias). | |
505 | ||
506 | -- | |
507 | Andreas Rossberg, rossberg@ps.uni-sb.de | |
508 | ||
509 | ****************************************************************************** | |
510 | ****************************************************************************** | |
511 | ||
512 | Date: Fri, 2 Aug 2002 14:04:16 +0100 | |
513 | From: David Matthews <David.Matthews@deanvillage.com> | |
514 | ||
515 | ||
516 | I've been having another look at the Basis library implementation in | |
517 | Poly/ML and in particular the I/O library. I'm still not sure I fully | |
518 | understand the implications of the Stream IO (functional IO) layer and | |
519 | in particular the way "canInput" works and interacts with "input". | |
520 | ||
521 | The definition says that canInput(f, n) returns SOME k "if a call to | |
522 | input would return immediately with at least k characters". | |
523 | Specifically it does not say "if a call to inputN(f, k) would return | |
524 | immediately". Secondly it says that it "should attempt to return as | |
525 | large a k as possible" and gives the example of a buffer containing 10 | |
526 | characters with the user calling canInput(f, 15). This suggests that a | |
527 | call to canInput could have the effect of committing the stream since a | |
528 | perfectly good implementation of "input" would be to return what was | |
529 | left of the buffer, i.e. 10 characters, and only read from the | |
530 | underlying stream on a subsequent call to "input". Yet after a call to | |
531 | canInput(f, 15) which returns SOME 15 the call to "input" is forced to | |
532 | return at least 15. In other words a call to canInput changes the | |
533 | behaviour of a subsequent call to "input". Generally, what is the | |
534 | behaviour of canInput with an argument larger than the buffer size? How | |
535 | far ahead is canInput expected to read? | |
536 | ||
537 | A few other notes of things I've discovered, some of which are trivial: | |
538 | ||
539 | The signature for TextIO.StreamIO contains duplicates of | |
540 | where type StreamIO.reader = TextPrimIO.reader | |
541 | where type StreamIO.writer = TextPrimIO.writer | |
542 | ||
543 | There are declared constants for platformWin32Windows2000 and | |
544 | platformWin32WindowsXP in the Windows structure. When I proposed the | |
545 | Windows.Config structure I didn't include constants for these versions | |
546 | of the OS because the underlying GetVersionEx function returns the same | |
547 | value, VER_PLATFORM_WIN32_NT in the dwPlatformId field for NT, Windows | |
548 | 2000 and XP. It is possible to distinguish these but only using the | |
549 | major and minor version fields. Windows CE does give a different value | |
550 | for the platformID. I would say it is confusing to have these here | |
551 | because it implies that it's possible to discriminate on the basis of | |
552 | the platformID field. | |
553 | ||
554 | The example definition of input1 at the bottom of STREAM_IO returns a | |
555 | value of type elem option * instream when the signature says it should | |
556 | be (elem * instream) option. | |
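A minimal sketch of an input1 with the result type the signature asks for, written against TextIO.StreamIO for concreteness. It is an illustration built on inputN, not the example from the specification, and the name input1Sketch is hypothetical.

```sml
(* Hypothetical name; result type is (elem * instream) option. *)
fun input1Sketch (f : TextIO.StreamIO.instream)
    : (char * TextIO.StreamIO.instream) option =
  let
    (* inputN returns the characters read and the rest of the stream. *)
    val (s, f') = TextIO.StreamIO.inputN (f, 1)
  in
    if String.size s = 0
      then NONE                          (* end of stream *)
      else SOME (String.sub (s, 0), f')
  end
```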
557 | ||
558 | Description of "input" function in STREAM_IO signature. The word "ay" | |
559 | should be "may". | |
560 | ||
561 | -- | |
562 | David. | |
563 | ||
564 | ****************************************************************************** | |
565 | ****************************************************************************** | |
566 | ||
567 | Date: Fri, 11 Oct 2002 17:46:59 -0400 (EDT) | |
568 | From: Matthew Fluet <fluet@CS.Cornell.EDU> | |
569 | ||
570 | ||
571 | Following up my previous post, here is another loose collection of | |
572 | notes I've taken while updating the MLton implementation of the SML | |
573 | Basis Library. This includes the structures that had been grouped | |
574 | under the headings System, Posix, and IO in the "old" web | |
575 | specification. | |
576 | ||
577 | Required and optional components: | |
578 | * The optional functors PrimIO, StreamIO, and ImperativeIO are not | |
579 | listed among the optional components in overview.html. | |
580 | ||
581 | Lists: | |
582 | * The discussion for the ListPair structure says: | |
583 | "Note that a function requiring equal length arguments may determine | |
584 | this lazily, i.e., it may act as though the lists have equal length | |
585 | and invoke the user-supplied function argument, but raise the | |
586 | exception when it arrives at the end of one list before the end of the | |
587 | other." | |
588 | Such an implementation choice seems to go against the spirit that | |
589 | programs run under conforming implementations of the Basis Library | |
590 | should behave the same. | |
591 | ||
592 | Posix: | |
593 | * In posix.html, last sentence in Discussion: "onsult" instead of | |
594 | "consult" | |
595 | PosixSignal: | |
596 | * In posix-signal.html, in Discussion: "The name of the corresponding | |
597 | ..." sentence is repeated. | |
598 | PosixError: | |
599 | * In the discussion of POSIX_ERROR: | |
600 | "The name of a corresponding POSIX error can be derived by | |
601 | capitalizing all letters and adding the character ``E'' as a | |
602 | prefix. For example, the POSIX error associated with nodev is | |
603 | ENODEV. The only exception to this rule is the error toobig, whose | |
604 | associated POSIX error is E2BIG." | |
605 | It isn't clear if this is the intended semantics for errorName and | |
606 | syserror. | |
607 | ||
608 | Time: | |
609 | * The type time now includes "negative values moving to the past." | |
610 | In the absence of negative values, the text for the | |
611 | to{Seconds,Milliseconds,Microseconds} functions to drop fractions of | |
612 | the time unit was unambiguous. With negative values, I would | |
613 | interpret this as rounding towards zero. Is this correct? Would it | |
614 | be clearer to describe the rounding as such? | |
615 | * The + and - functions are required to raise Overflow, although most | |
616 | other "result not representable as a time value" errors raise Time. | |
617 | * The - function is written prefix instead of infix in the | |
618 | description. | |
619 | * The scan and fromString functions do not specify how to treat a | |
620 | value with greater precision than the internal representation; | |
621 | should it have rounding or truncation semantics? Also, the | |
622 | functions are required to raise Overflow for an unrepresentable | |
623 | time value. | |
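The two readings of "drop fractions of the time unit" differ only for negative values. The sketch below uses plain integers rather than Time.time, and the helper names are hypothetical; it shows rounding toward zero (Int.quot) next to rounding toward negative infinity (div).

```sml
(* Hypothetical helpers: whole seconds contained in a millisecond count. *)
fun secondsTowardZero (ms : int) = Int.quot (ms, 1000)
fun secondsTowardNegInf (ms : int) = ms div 1000

val a = secondsTowardZero ~1500     (* ~1 *)
val b = secondsTowardNegInf ~1500   (* ~2 *)
val c = secondsTowardZero 1500      (* 1, same as 1500 div 1000 *)
```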
624 | ||
625 | IO: | |
626 | * The nice introduction to IO that appears at | |
627 | http://cm.bell-labs.com/cm/cs/what/smlnj/doc/basis/pages/io-explain.html | |
628 | doesn't seem to be included with the new pages. | |
629 | * The functor arguments in PrimIO, StreamIO, and ImperativeIO functors | |
630 | don't match; some use structure A: MONO_ARRAY and others use | |
631 | structure Array: MONO_ARRAY. | |
632 | ||
633 | PrimIO() and PRIM_IO | |
634 | * The PRIM_IO signature requires pos to be an eqtype, but the PrimIO | |
635 | functor argument only requires pos to be a type. | |
636 | * readArr[NB], write{Vec,Arr}[NB] take "slices" (records of type {buf: | |
637 | {vector,array}, i: int, sz: int option}) but there is no description of the | |
638 | appropriate action to take when the slices are invalid. Presumably, | |
639 | they should raise Subscript. | |
640 | * There are a number of "contradictory" statements: | |
641 | "Readers and writers should not, in general, raise the IO.Io | |
642 | exception. It is assumed that the higher levels will appropriately | |
643 | handle these exceptions." | |
644 | "A reader is required to raise IO.Io if any of its functions, except | |
645 | close or getPos, is invoked after a call to close. A writer is | |
646 | required to raise IO.Io if any of its functions, except close, is | |
647 | invoked after a call to close." | |
648 | "closes the reader and frees operating system resources. Further | |
649 | operations on the reader (besides close and getPos) raise | |
650 | IO.ClosedStream." | |
651 | "closes the writer and frees operating system resources. Further | |
652 | operations (other than close) raise IO.ClosedStream." | |
653 | * The augment_reader and augment_writer functions may introduce new | |
654 | functions. Should the synthesized operations handle IO.Io | |
655 | exceptions and change the function field? Maybe this falls under | |
656 | the "intentionally unspecified" clause. | |
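
For the slice records taken by readArr[NB] and write{Vec,Arr}[NB], the presumed behaviour (raise Subscript on an invalid slice) could be sketched as below. The helper checkSlice is hypothetical, and reading NONE as "through the end of the buffer" is an assumption borrowed from the slice structures, not something the PRIM_IO text states.

```sml
(* Hypothetical helper, not part of the Basis. Given the length of buf
   and the i and sz fields of a slice record, return the start index and
   the element count, or raise Subscript if the slice is invalid. *)
fun checkSlice (len : int, i : int, sz : int option) : int * int =
  let
    (* Assumption: NONE means "from i through the end of the buffer". *)
    val n = getOpt (sz, len - i)
  in
    if i < 0 orelse n < 0 orelse n > len - i
    then raise Subscript
    else (i, n)
  end

val whole = checkSlice (10, 0, NONE)      (* (0, 10) *)
val part  = checkSlice (10, 2, SOME 3)    (* (2, 3) *)
val bad   = (checkSlice (10, 11, NONE); false) handle Subscript => true
```

A reader or writer function would call such a check before touching the buffer.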
657 | ||
658 | StreamIO() and STREAM_IO: | |
659 | * What is the difference between a terminated output stream and a | |
660 | closed output stream? Some operations say what to do when the | |
661 | stream is terminated or closed, but many are unspecified when the | |
662 | other condition holds. I resolved this by looking at the IO | |
663 | introduction mentioned above, where it discusses stream states. | |
664 | But, closeOut is still confusing: "flushes f's buffers, marks the | |
665 | stream closed, and closes the underlying writer. This operation has | |
666 | no effect if f is already closed. If f is terminated, it should | |
667 | close the underlying writer." Shouldn't closeOut always execute the | |
668 | underlying writer's close function? The only way to terminate an | |
669 | outstream is to getOutstream, but I would really expect | |
670 | TextIO.closeOut to "really" close the underlying | |
671 | file/outstream/writer. | |
672 | * The IO structure has dropped the TerminatedStream exception, but | |
673 | there seem to be sufficient cases when a stream should raise an | |
674 | exception when it is terminated. | |
675 | * The semantics of the vector returned by getReader are unclear. At | |
676 | the very least, the source code for SML/NJ and PolyML have very | |
677 | different interpretations, and I've chosen yet another. I think | |
678 | part of the problem is that the word "[un]consumed" only appears in | |
679 | the description of this function, so it's unclear what corresponds | |
680 | to consumed input. | |
681 | * I suspect the example under endOfStream is wrong: | |
682 | ||
683 | In these cases the StreamIO.instream will also have multiple EOF's; | |
684 | that is, it can be that | |
685 | ||
686 | val true = endOfStream(f) | |
687 | val ("",f') = input f | |
688 | val true = endOfStream(f') | |
689 | val ("xyz",f'') = input f | |
690 | ||
691 | The fact that input f can return two different values would seem to | |
692 | violate the principal argument for functional streams! Looking at | |
693 | the aforementioned IO introduction in the "old" pages, I see the | |
694 | more reasonable example: | |
695 | ||
696 | Consequently, the following is not guaranteed to be true: | |
697 | ||
698 | let val z = TextIO.StreamIO.endOfStream f | |
699 | val (a,f') = TextIO.StreamIO.input f | |
700 | val x = TextIO.StreamIO.endOfStream f' | |
701 | in x=z (* not necessarily true! *) | |
702 | end | |
703 | ||
704 | whereas the following is guaranteed to be true: | |
705 | ||
706 | let val z = TextIO.StreamIO.endOfStream f | |
707 | val (a,f') = TextIO.StreamIO.input f | |
708 | val x = TextIO.StreamIO.endOfStream f (* note, no prime! *) | |
709 | in x=z (* guaranteed true! *) | |
710 | end | |
711 | * David Matthews's post on Aug. 2 raised questions about canInput | |
712 | which are unresolved. | |
713 | ||
714 | General comments: | |
715 | * Various operations in IO take "slices", but aren't expressed in | |
716 | terms of {Vector,Array}Slice structures. One difficulty with this | |
717 | is that the slice types are not in scope within the IO signatures. | |
718 | ||
719 | I would really advocate making the VectorSlice structure a | |
720 | substructure of the Vector structure (and likewise for arrays). | |
721 | Even if this isn't done for the polymorphic vector/array structures, | |
722 | it would be extremely beneficial for the monomorphic structures, | |
723 | where in the {Prim,Stream,Imperative}IO functors, it is impossible | |
724 | to access the corresponding monomorphic vector/array slice | |
725 | structures. I found myself using Vector.tabulate when I really | |
726 | wanted ArraySlice.vector. | |
727 | ||
728 | The "old" MONO_ARRAY signature included structure Vector: | |
729 | MONO_VECTOR which gave access to the corresponding monomorphic | |
730 | vectors. | |
731 | ||
732 | -Matthew | |
733 | ||
734 | ****************************************************************************** | |
735 | ****************************************************************************** | |
736 | ||
737 | Date: Fri, 13 Dec 2002 15:57:55 +0100 | |
738 | From: Andreas Rossberg <rossberg@ps.uni-sb.de> | |
739 | ||
740 | ||
741 | Here is a collection of issues and comments we gathered when | |
742 | implementing the I/O stack from the Standard Basis (primitive, stream, | |
743 | imperative I/O) for Alice. While in general the specification seems to | |
744 | be pretty precise and complete, we sometimes found it hard to understand | |
745 | the semantic details of stream I/O, especially since many of them can | |
746 | only be derived indirectly from the examples in the discussion section | |
747 | and there appear to be some minor ambiguities and inconsistencies. Also, | |
748 | the PrimIO and StreamIO functors cannot always be implemented as | |
749 | suggested, because of their parametricity in types such as position and | |
750 | element. | |
751 | ||
752 | As a general note, the I/O interface does not seem to have been designed | |
753 | with concurrency in mind. In particular, augmenting readers and writers | |
754 | cannot be made thread-safe, AFAWCS. This is a bit of a problem for us, | |
755 | since Alice is relying on concurrency. However, that does not seem to be | |
756 | an issue easily solved. | |
757 | ||
758 | - Leif Kornstaedt, Andreas Rossberg | |
759 | ||
760 | ||
761 | The IO structure | |
762 | ---------------- | |
763 | ||
764 | * exception Io: | |
765 | ||
766 | - function field: (pedantic) The wording seems to imply that only | |
767 | functions from STREAM_IO raise the Io exception, but this is | |
768 | clearly not the case (consider TextIO.openIn to name just one). | |
769 | ||
770 | * datatype buffer_mode: | |
771 | ||
772 | - There is no specification of what precisely line buffering is | |
773 | supposed to mean, in particular for non-text streams. | |
774 | ||
775 | ||
776 | ||
777 | The PRIM_IO signature | |
778 | --------------------- | |
779 | ||
780 | * Synopsis: | |
781 | ||
782 | - (pedantic) It says that "higher level I/O facilities do not | |
783 | access the OS structure directly...". That's somewhat misleading | |
784 | since OS does not provide the same functionality anyway (if any, | |
785 | it was the Posix structure). | |
786 | ||
787 | * type reader: | |
788 | ||
789 | - Unlike for writers, it is not specified what the minimal set of | |
790 | operations is that a reader must support. | |
791 | ||
792 | - It is not specified whether multiple end-of-streams may occur. | |
793 | Since they are anticipated for StreamIO, one should expect them | |
794 | to be possible for underlying readers as well. However, this | |
795 | requires clarification of the semantics of several operations. | |
796 | ||
797 | - readArr, readArrNB: It is specified nowhere what the option for | |
798 | sz is supposed to mean, i.e. what the semantics of NONE is | |
799 | (presumably as for slices). | |
800 | ||
801 | - readVec, readVecNB: Unlike all other similar read and write | |
802 | functions, these two do not accept an option for the size | |
803 | argument. | |
804 | ||
805 | - avail: The description suggests that the function can be used as | |
806 | a hint by inputAll. However, this information is too inaccurate | |
807 | to be useful, since (apart from translation issues) the physical | |
808 | size of elements cannot be obtained (in particular in the | |
809 | StreamIO functor, which is parametric in the element type). In | |
810 | practice, endPos seems to be more useful for this purpose. So it | |
811 | is not clear what purpose avail could actually serve at all at | |
812 | the abstraction level provided by readers. | |
813 | ||
814 | - endPos: | |
815 | (1) May it block? For example, when reading from terminal or | |
816 | from another kind of stream, this can be naturally expected. | |
817 | ||
818 | (2) Which position is returned if there are multiple | |
819 | end-of-streams? | |
820 | ||
821 | - getPos, setPos, endPos, verifyPos: Description should start with | |
822 | "when present". | |
823 | ||
824 | - setPos, endPos: Should not raise an exception if unimplemented, | |
825 | but rather be NONE. Actually, the implementation notes on writers | |
826 | state that endPos *must* be implemented for readers. | |
827 | ||
828 | - Implementation note, item 6: Why is it likely that the client | |
829 | uses getPos frequently? And why should the reader count | |
830 | *untranslated* elements (and how would there be actual elements | |
831 | before translation)? | |
832 | (See also comments on STREAM_IO.filePosIn) | |
833 | ||
834 | * type writer: | |
835 | ||
836 | - writeVec, writeArr, writeVecNB, writeArrNB: | |
837 | (1) Again, it is not specified what the optional size means. | |
838 | ||
839 | (2) When may k < sz occur without having IO failure? If it is | |
840 | arbitrary, then there appears to be no correct way to write a | |
841 | sequence of elements, because it is neither possible to detect | |
842 | partial element writes (which are explained in the paragraph | |
843 | before the Implementation Notes), nor to complete such writes. | |
844 | This particularly implies that the StreamIO functor cannot | |
845 | implement flushing correctly (see below). | |
846 | ||
847 | - getPos, setPos, endPos, verifyPos: Description should start with | |
848 | "when present". | |
849 | ||
850 | - getPos, setPos: Should not raise an exception if unimplemented, | |
851 | but rather be NONE. | |
852 | ||
853 | - last paragraph before Implementation Note: Typo, double "plus". | |
854 | ||
855 | - first sentence in Implementation Note: (pedantic) Why is this | |
856 | put into the implementation notes when it actually seems to be a | |
857 | requirement of the specification? | |
858 | ||
859 | - last paragraph of Implementation Note: | |
860 | (1) States that readers must implement getPos, which seems to be | |
861 | contradicted by its optional type. | |
862 | ||
863 | (2) Typo, double "need". | |
864 | ||
865 | * openVector: | |
866 | ||
867 | - Is this supposed to support random access? Note that for types | |
868 | generated with the PrimIO functor it cannot (see below)! That | |
869 | seems to make this function rather useless. | |
870 | ||
871 | * augmentReader, augmentWriter: | |
872 | ||
873 | - It is not possible to synthesize operations in a way that is | |
874 | thread-safe in concurrent systems, hence it should be noted that | |
875 | augmenting is potentially dangerous. | |
876 | ||
877 | * There is no reference to the PrimIO functor. | |
878 | ||
879 | ||
880 | ||
881 | The PrimIO functor | |
882 | ------------------ | |
883 | ||
884 | * General problems: | |
885 | ||
886 | - Since the implementation is necessarily parametric in the pos | |
887 | type, openVector, nullRd, nullWr cannot create readers that | |
888 | allow random access, although one would expect that at least for | |
889 | openVector. | |
890 | ||
891 | * Functor argument: | |
892 | ||
893 | - Structure names A and V are inconsistent with the StreamIO and | |
894 | ImperativeIO functors. | |
895 | ||
896 | - Type pos has to be an eqtype to match the result signature. | |
897 | ||
898 | - Since the extract and copy functions have been removed/changed | |
899 | from ARRAY and VECTOR signatures, the PrimIO functor now | |
900 | naturally requires slice structures for efficient | |
901 | implementation. (Likewise the StreamIO functor) | |
902 | ||
903 | * Functor result: | |
904 | ||
905 | - Type sharing of the pos type is not specified, though essential | |
906 | for this functor being useful at all. | |
907 | ||
908 | ||
909 | ||
910 | ||
911 | The STREAM_IO signature | |
912 | ----------------------- | |
913 | ||
914 | * Synopsis: | |
915 | ||
916 | - An exception likely to be raised by the underlying | |
917 | reader/writer is Size, which is not mentioned. OTOH, Fail can | |
918 | only occur in the rare case of user-supplied readers/writers, as | |
919 | the Basis itself is supposed to never raise it. | |
920 | ||
921 | * type out_pos: | |
922 | ||
923 | - A note on the meaning of this type would be desirable, since its | |
924 | canonical representation is (outstream * pos) rather than pos. | |
925 | (That also may have caused confusion in the discussion of | |
926 | imperative I/O, see below.) | |
927 | ||
928 | * input1: | |
929 | ||
930 | - The signature of this function is inconsistent with all other | |
931 | input functions. It should rather have type | |
932 | ||
933 | instream -> elem option * instream | |
934 | ||
935 | which in fact appears to be the type assumed in the discussion | |
936 | example relating input1 to inputN. | |
937 | ||
938 | * input: | |
939 | ||
940 | - Typo, s/ay/may/ | |
941 | ||
942 | * inputN: | |
943 | ||
944 | - This function is somewhat underspecified for n=0. In particular, | |
945 | may it block? Is it required to raise Io if the underlying | |
946 | reader is closed? | |
947 | ||
948 | * input, input1, inputN, inputAll: | |
949 | ||
950 | - (pedantic) Descriptions speak of "underlying system calls", | |
951 | although the reader may not actually depend on system calls. | |
952 | Preferably speak of "underlying reader" only. | |
953 | ||
954 | * closeIn: | |
955 | ||
956 | - Likewise, description speaks of "releasing system resources". | |
957 | This should be replaced by saying that it closes the underlying | |
958 | reader (which is not even specified as is). | |
959 | ||
960 | * closeOut: | |
961 | ||
962 | - Does the function attempt to close the stream even if flushing | |
963 | fails? | |
964 | ||
965 | - Why is it possible to close terminated streams? That seems to | |
966 | allow unfortunate interference with another stream that has been | |
967 | created from the extracted writer. | |
968 | ||
969 | * mkInstream, getReader: | |
970 | ||
971 | - The table seems to imply that mkInstream always augments its | |
972 | reader. This is inappropriate for concurrent environments (see | |
973 | above). | |
974 | ||
975 | - Should getReader return the original or the augmented reader? | |
976 | ||
977 | - The table still includes the removed getPosIn and setPosIn | |
978 | functions. | |
979 | ||
980 | * mkOutstream, getWriter: | |
981 | ||
982 | - Likewise. | |
983 | ||
984 | * filePosIn: | |
985 | ||
986 | - There seems to be no way to implement this function for buffered | |
987 | I/O, because the reader position that corresponds to a | |
988 | mid-block-element is not available and cannot be calculated in | |
989 | general. So how is this meant? | |
990 | ||
991 | - Typo, s/character/element/ | |
992 | ||
993 | * filePosOut: | |
994 | ||
995 | - Likewise. | |
996 | ||
997 | * getWriter: | |
998 | ||
999 | - It is non-obvious what the precise meaning of "terminating" a | |
1000 | stream is. If this is merely setting a status flag then a | |
1001 | corresponding note would be helpful. | |
1002 | ||
1003 | * getPosOut: | |
1004 | ||
1005 | - May this flush the stream (and hence raise Io exceptions)? | |
1006 | ||
1007 | * setPosOut: | |
1008 | ||
1009 | - This may raise an exception because the position has been | |
1010 | invalidated after obtaining it (e.g. by file truncation | |
1011 | performed by another process). | |
1012 | ||
1013 | - Typo, s/underlying device/underlying writer/ | |
1014 | ||
1015 | * setBufferMode, getBufferMode: | |
1016 | ||
1017 | - There is no specification of the semantics of line buffering, in | |
1018 | particular for non-text streams. | |
1019 | (See also comments on StreamIO functor) | |
1020 | ||
1021 | - It is not specified whether the stream may be flushed when set | |
1022 | to LINE_BUF mode (may cause Io exception). It seems unreasonable | |
1023 | to require it not to do so (assuming that line buffering is | |
1024 | intended to maintain the invariant that the buffer never | |
1025 | contains line breaks). | |
1026 | ||
1027 | - The synopsis of this function uses "ostr", while all others | |
1028 | use "f" for streams. | |
1029 | ||
1030 | * setPosOut, setBufferMode, getWriter: | |
1031 | ||
1032 | - Can raise an exception if flushing fails. | |
1033 | ||
1034 | * Discussion: | |
1035 | ||
1036 | - The statement that closing a stream just causes the | |
1037 | not-yet-determined part of the stream to be empty should | |
1038 | probably be generalised to explain what *truncating* a stream | |
1039 | means (getReader also truncates the stream). | |
1040 | ||
1041 | - Example of freshly opened stream: | |
1042 | s/mkInstream r/mkInstream(r, vector [])/ | |
1043 | s/size/length/ | |
1044 | ||
1045 | - nreads example: | |
1046 | s/mkInstream r/mkInstream(r, vector [])/ | |
1047 | s/size/length/ | |
1048 | ||
1049 | - input1/inputN relation example: | |
1050 | (1) Inconsistent with the actual typing of input1 (see above). | |
1051 | ||
1052 | (2) Typo, s/inputN f/inputN(f,1)/ | |
1053 | ||
1054 | - Unbuffered I/O, 1st example: | |
1055 | (1) Typos, | |
1056 | s/mkInstream(reader)/mkInstream(reader, vector [])/ | |
1057 | s/PrimIO.Rd{chunkSize,...}/(PrimIO.RD{chunkSize,...}, v)/ | |
1058 | ||
1059 | (2) More importantly, the actual condition appears to be | |
1060 | incorrect. It should read: | |
1061 | (chunkSize > 1 orelse length v = 1) andalso endOfStream f' | |
1062 | ||
1063 | - Unbuffered I/O, 2nd example: | |
1064 | s/mkInstream(reader)/mkInstream(reader, vector [])/ | |
1065 | s/PrimIO.Rd{chunkSize,...}/(PrimIO.RD{chunkSize,...}, v)/ | |
1066 | The condition must be corrected as above. | |
1067 | ||
1068 | * There is no reference to the StreamIO functor. | |
1069 | ||
1070 | ||
1071 | ||
1072 | The StreamIO functor | |
1073 | -------------------- | |
1074 | ||
1075 | * General problems: | |
1076 | ||
1077 | - It is impossible for this functor to support line buffering, | |
1078 | since it has no way of knowing which element constitutes a line | |
1079 | break. This could be solved by changing the someElem functor | |
1080 | argument to a breakElem argument. | |
1081 | ||
1082 | - It is also impossible to utilize reader's endPos for | |
1083 | pre-allocation, because the functor is parametric in the | |
1084 | position type. | |
1085 | ||
1086 | * Functor argument: | |
1087 | ||
1088 | - Since the extract and copy functions have been removed/changed | |
1089 | from ARRAY and VECTOR signatures, the StreamIO functor now | |
1090 | naturally requires slice structures for efficient | |
1091 | implementation. (Likewise the PrimIO functor) | |
1092 | ||
1093 | * Functor result: | |
1094 | ||
1095 | - Type sharing of the result types is not specified. | |
1096 | ||
1097 | * Discussion, paragraph on flushing: | |
1098 | ||
1099 | - Most of this discussion rather belongs to the description of | |
1100 | STREAM_IO. | |
1101 | ||
1102 | - Everything said here is not restricted to flushOut, but applies | |
1103 | to flushing in general. | |
1104 | ||
1105 | - Unfortunately, it is left unspecified where flushing may happen | |
1106 | and, consequently, where respective Io exceptions may occur. | |
1107 | ||
1108 | - Write retries as suggested here seem to be impossible to | |
1109 | implement correctly using the writer interface as specified (see | |
1110 | comments on PRIM_IO.writer). | |
1111 | ||
1112 | - According to the writer description, write operations may never | |
1113 | return an element count of 0, so the last sentence is | |
1114 | misleading. | |
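
The write-retry concern can be illustrated with the obvious retry loop. This is only a sketch under stated assumptions: writeAll is a hypothetical name, the write argument uses the record-style slice described for PRIM_IO, and it is assumed to return an element count of at least 1 on success.

```sml
(* Hypothetical retry loop: keep calling write until all len elements of
   buf have been accepted. write is assumed to return the number of
   elements it took, always at least 1. *)
fun writeAll (write : {buf : 'v, i : int, sz : int option} -> int,
              buf : 'v, len : int) : unit =
  let
    fun loop i =
      if i >= len then ()
      else loop (i + write {buf = buf, i = i, sz = SOME (len - i)})
  in
    loop 0
  end
```

The loop only sees whole-element counts. If a writer stopped in the middle of an element, nothing in the returned count would reveal it, and there is no operation for completing the partial element, which is the gap noted in the comments on PRIM_IO.writer.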
1115 | ||
1116 | * Discussion, last paragraph: | |
1117 | ||
1118 | - Typo, missing ")" | |
1119 | ||
1120 | * Implementation note: | |
1121 | ||
1122 | - 3rd bullet: typo, s/PrimIO.augmentIn/PrimIO.augmentReader/ | |
1123 | ||
1124 | - 5th and 6th bullet: The endPos function cannot be utilized as | |
1125 | suggested, because the functor is necessarily parametric in the | |
1126 | position type. | |
1127 | ||
1128 | ||
1129 | ||
1130 | The IMPERATIVE_IO signature | |
1131 | --------------------------- | |
1132 | ||
1133 | * General comment: | |
1134 | ||
1135 | - It is unfortunate that imperative I/O is asymmetric with respect | |
1136 | to providing (limited) random access on input vs. output streams | |
1137 | - the former requires going down to the lower-level stream I/O. | |
1138 | That makes imperative I/O a somewhat incomplete abstraction | |
1139 | layer. | |
1140 | ||
1141 | - Likewise, it would be desirable if there were ways for | |
1142 | performing full-fledged random access without leaving the | |
1143 | imperative I/O abstraction layer, at least for streams where it | |
1144 | is suitable (e.g. BinIO). Despite the statement in the | |
1145 | discussion this is neither available for input nor for output | |
1146 | streams (see comments below). | |
1147 | ||
1148 | * closeIn: | |
1149 | ||
1150 | - Typo, s/S.closeIn/StreamIO.closeIn/ | |
1151 | ||
1152 | * flushOut: | |
1153 | ||
1154 | - Typo, s/S.flushOut/StreamIO.flushOut/ | |
1155 | ||
1156 | * closeOut: | |
1157 | ||
1158 | - Typo, s/S.closeOut/StreamIO.closeOut/ | |
1159 | ||
1160 | * Discussion: | |
1161 | ||
1162 | - Equivalences, last line: s/StreamIO.output/StreamIO.flushOut/ | |
1163 | ||
1164 | - Paragraph about random-access on output streams: It says that | |
1165 | BinIO.StreamIO.out_pos = Position.int. This is not true, we have | |
1166 | BinPrimIO.pos = Position.int, but that is a completely different | |
1167 | type. In fact, it is impossible to implement out_pos as | |
1168 | Position.int. | |
1169 | ||
1170 | * There is no reference to the ImperativeIO functor. | |
1171 | ||
1172 | ||
1173 | ||
1174 | The ImperativeIO functor | |
1175 | ------------------------ | |
1176 | ||
1177 | * Functor argument: | |
1178 | ||
1179 | - The Array argument is unnecessary. | |
1180 | ||
1181 | * Functor result: | |
1182 | ||
1183 | - Type sharing of the result types is not specified. | |
1184 | ||
1185 | ||
1186 | ||
1187 | The TEXT_STREAM_IO signature | |
1188 | ---------------------------- | |
1189 | ||
1190 | * General comment: | |
1191 | ||
1192 | - Why bother separating this signature from STREAM_IO? | |
1193 | => outputSubstr can easily be generalised to outputSlice | |
1194 | (for good), | |
1195 | => if line buffering is part of STREAM_IO, inputLine | |
1196 | might be as well. | |
1197 | ||
1198 | ||
1199 | ||
1200 | The TextIO structure | |
1201 | -------------------- | |
1202 | ||
1203 | * General comment: | |
1204 | ||
1205 | - Systems providing WideText should also provide a WideTextIO | |
1206 | structure (they have to provide WideTextPrimIO already, which | |
1207 | seems inconsistent). | |
1208 | ||
1209 | * Interface: | |
1210 | ||
1211 | - Duplicated type constraints for StreamIO.reader and | |
1212 | StreamIO.writer. | |
1213 | ||
1214 | ||
1215 | ||
1216 | The BinIO structure | |
1217 | -------------------- | |
1218 | ||
1219 | * Interface: | |
1220 | ||
1221 | - Type sharing with BinPrimIO is not specified (unlike for | |
1222 | TextIO), i.e. the following constraints are missing: | |
1223 | ||
1224 | where type StreamIO.reader = BinPrimIO.reader | |
1225 | where type StreamIO.writer = BinPrimIO.writer | |
1226 | where type StreamIO.pos = BinPrimIO.pos | |
1227 | ||
1228 | ****************************************************************************** | |
1229 | ****************************************************************************** | |
1230 | ****************************************************************************** | |
1231 | ****************************************************************************** | |
1232 | ||
1233 | Doing host/network byte order conversions on ML side. | |
1234 | ||
1235 | Socket.Ctl | |
1236 | * Semantics of setNBIO, getNREAD, getATMARK are unclear; | |
1237 | Don't seem to be accessible via {get,set}sockopt; | |
1238 | Instead, using ioctl. | |
1239 | ||
1240 | ****************************************************************************** | |
1241 | ****************************************************************************** | |
1242 | ||
1243 | Posix.FileSys: | |
1244 | * Within structure S, the type mode is constrained equal to flags, | |
1245 | but flags is an eqtype. | |
1246 | ||
1247 | STREAM_IO.pos | |
1248 | * "This is the type of positions in the underlying readers and | |
1249 | writers. In some instantiations of this signature (e.g., | |
1250 | TextIO.StreamIO), pos is abstract; in others (e.g., BinIO.StreamIO) | |
1251 | it is Position.int." But, the equality of BinIO.StreamIO.pos and | |
1252 | Position.int is never specified in any where constraint of BinIO. | |
1253 | * How can filePosIn be implemented with completely abstract pos? | |
1254 | ||
1255 | Not sent to list: | |
1256 | ||
1257 | * (In general, probably a good idea to look at the entire top-level | |
1258 | structure/signature matches and choose a consistent usage of base | |
1259 | types. For example, Int:>INTEGER would seem to hide the top-level | |
1260 | int; unless Int is opened afterwards. But, then what about all the | |
1261 | other structures that reference int? Is top-level int = Int.int or | |
1262 | is Int.int = top-level int.) | |
1263 | --> I think I'm biased from looking at the MLton implementation, | |
1264 | because I'm finding it hard to think about how to really express all | |
1265 | of the sharing constraints in a way that will be acceptable. This | |
1266 | might be the wrong way to look at things: the listing of structures | |
1267 | and signatures with clauses doesn't correspond to a build order, it | |
1268 | corresponds to the way the environment should look to the program. | |
1269 | ||
1270 | Sequences and Slices: | |
1271 | Why not existsi, alli? | |
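
For reference, a minimal sketch of what existsi and alli might look like for polymorphic vectors. Both names are hypothetical, and the sketch assumes Vector.foldli takes the vector directly (no slice triple).

```sml
(* Hypothetical indexed predicates built on Vector.foldli. They visit
   every element, so they do not stop early the way exists/all may. *)
fun existsi (p : int * 'a -> bool) (v : 'a vector) : bool =
  Vector.foldli (fn (i, x, found) => found orelse p (i, x)) false v

fun alli (p : int * 'a -> bool) (v : 'a vector) : bool =
  Vector.foldli (fn (i, x, ok) => ok andalso p (i, x)) true v

(* Does some element equal its own index? *)
val hit = existsi (fn (i, x) => i = x) (Vector.fromList [5, 1, 7])
```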
1272 | ||
1273 | Vector: | |
1274 | Why no vector: int * 'a -> 'a vector? | |
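
Such a function is easy to write with Vector.tabulate, as the sketch below shows. It is named constVector here (a hypothetical name) only to avoid shadowing the top-level vector : 'a list -> 'a vector.

```sml
(* Hypothetical: a vector of n copies of x. Vector.tabulate raises Size
   when n is negative or too large. *)
fun constVector (n : int, x : 'a) : 'a vector =
  Vector.tabulate (n, fn _ => x)

val dashes = constVector (3, #"-")
```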
1275 | ||
1276 | ||
1277 | Resolved: | |
1278 | ||
1279 | If one defines VECTOR_SLICE by including a type 'a vector and replacing | |
1280 | 'a Vector.vector with the local 'a vector, but then binds | |
1281 | structure Vector: VECTOR | |
1282 | structure VectorSlice: VECTOR_SLICE where type 'a vector = 'a Vector.vector | |
1283 | at the top-level, does one violate the basis spec? | |
1284 | Rationale: it's easiest to implement Vector and VectorSlice | |
1285 | simultaneously, say with VectorSlice as a substructure of Vector (in | |
1286 | fact, with all of the Vector operations being dispatched to the | |
1287 | corresponding VectorSlice ops with full slices), so Vector isn't in | |
1288 | scope for the VECTOR_SLICE. | |
1289 | *** No, it's not o.k., because opening VectorSlice will introduce a binding | |
1290 | for 'a vector; but, if we're lucky, John will accept the proposal. | |
1291 | ||
1292 | IEEEReal: | |
1293 | toString prepends a #"~" even when the class is NAN? | |
1294 | *** I guess this is o.k.; there is an explicit sign field. | |
1295 | ||
1296 | PACK_WORD: | |
1297 | structure Pack<N>Big :> PACK_WORD (* OPTIONAL *) | |
1298 | structure Pack<N>Little :> PACK_WORD (* OPTIONAL *) | |
1299 | but PACK_WORD has | |
1300 | val subVec : Word8Vector.vector * int -> LargeWord.word | |
1301 | i.e., reference to LargeWord.word. | |
1302 | Should it be | |
1303 | PACK_WORD | |
1304 | type word | |
1305 | val subVec : Word8Vector.vector * int -> word | |
1306 | with | |
1307 | structure Pack<N>Big :> PACK_WORD with word = Word<N>.word (* OPTIONAL *) | |
1308 | Should there be PackBig and PackLittle with word = Word.word? | |
1309 | Should there be PackLargeBig with word = LargeWord.word? | |
1310 | There aren't many structures that refine on LargeXYZ; most refine on XYZ<N>. | |
1311 | *** O.k., we always unpack into a LargeWord, which we could then | |
1312 | Word<N>.fromLargeWord back to the size. I guess this is o.k.; It | |
1313 | lets an implementation give more Pack<N>Big structures than there | |
1314 | are Word<N> structures. | |
1315 | ||
1316 | MLton specific: | |
1317 | + why are Int32_gtu and Int32_geu primitive? | |
1318 | Why not just Word.fromInt and use Word comparisons? | |
1319 | + Real:>REAL doesn't match basis because it may perform | |
1320 | arithmetic at extended precision. Should this be mentioned | |
1321 | in the user guide? | |
1322 | + QUESTION: proc-env.sml | |
1323 | + QUESTION: char.sml | |
1324 | + check uses of {Vector,Array}Slice.slice for replacement by unsafeSlice. | |
1325 | ||
1326 | ||
1327 | ****************************************************************************** | |
1328 | ****************************************************************************** | |
1329 | ||
1330 | UNIX: | |
1331 | I'm not quite sure how the ('a, 'b) proc type is supposed to work in | |
1332 | practice; The old Unix structure just used them as | |
1333 | TextIO.{in,out}streams. My suspicion is that we're supposed to use | |
1334 | Posix.IO.mk{Bin,Text}{Reader,Writer} functions and then use the type | |
1335 | system to ensure that if we force a stream to be bin or text, then all | |
1336 | other uses have to be the same. I also suspect that we're only | |
1337 | supposed to lift the file_desc up to an instream/outstream once; i.e., | |
1338 | multiple textInstreamOf calls should continue to return the same | |
1339 | TextIO.instream. That would seem to suggest we need an 'a option ref | |
1340 | that can be banged at the first call to a streamOf function, and | |
1341 | subsequent calls just return the value there. | |
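
The 'a option ref idea can be sketched as a small memo cell. The names force, cell, and make are hypothetical; in the Unix structure, make would be whatever builds the stream from the file descriptor.

```sml
(* Hypothetical memo cell: run make on the first request, store the
   result, and return the stored value on every later request. *)
fun force (cell : 'a option ref, make : unit -> 'a) : 'a =
  if isSome (!cell) then valOf (!cell)
  else
    let val x = make ()
    in cell := SOME x; x
    end

(* Usage with a stand-in constructor that counts how often it runs. *)
val built = ref 0
val cell : string option ref = ref NONE
fun make () = (built := !built + 1; "stream")
val s1 = force (cell, make)
val s2 = force (cell, make)   (* same value; make is not run again *)
```

With one such cell per stream kind inside the proc value, repeated streamOf calls would hand back the same stream.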
1342 | ||
1343 | textInstreamOf pr | |
1344 | binInstreamOf pr | |
1345 | return a text or binary instream connected to the standard output | |
1346 | stream of the process pr. Note the multiple calls to these | |
1347 | functions on the same proc will result in multiple streams that | |
1348 | all share the same underlying Unix stream. | |
1349 | ||
1350 | textOutstreamOf pr | |
1351 | binOutstreamOf pr | |
1352 | return a text or binary outstream connected to the standard input | |
1353 | stream of the process pr. Note the multiple calls to these | |
1354 | functions on the same proc will result in multiple streams that | |
1355 | all share the same underlying Unix stream. | |
1356 | ||
1357 | streamsOf pr | |
1358 | returns a pair of input and output text streams associated with | |
1359 | pr. This function is equivalent to (textInstream pr, textOutstream | |
1360 | pr) and is provided for backward compatibility. |