| 1 | |
| 2 | Date: Tue, 23 Jul 2002 11:49:57 -0400 (EDT) |
| 3 | From: Matthew Fluet <fluet@CS.Cornell.EDU> |
| 4 | |
| 5 | |
| 6 | John and SML implementers, |
| 7 | |
| 8 | Here is a loose collection of notes I've taken while starting to |
| 9 | update the MLton implementation of the SML Basis Library to the latest |
| 10 | version. They span quite a range: errata and typos, signature |
| 11 | constraint concerns, and some design questions. Thus far, I've looked |
| 12 | at the structures that had been grouped under the headings General, |
| 13 | Text, Integer, Reals, Lists, and Arrays and Vectors (i.e., excluding |
| 14 | IO, System, and Posix) in the "old" web specification. |
| 15 | |
| 16 | A few high-level comments: |
| 17 | |
| 18 | * As an organizational principle, I liked the grouping of modules into |
| 19 | larger collections used in the "old" web specification better than |
| 20 | the long alphabetical list. |
| 21 | * I'm quite happy to see opaque signature matches for most structures. |
| 22 | In particular, I think it will help avoid porting problems between |
| 23 | implementations that provide different INTEGER structures, especially |
| 24 | when LargeInt = Int in one implementation and LargeInt = IntInf in |
| 25 | another. |
| 26 | |
| 27 | Required and optional components, Top-level: |
| 28 | |
| 29 | * A number of structures have an opaque signature match in |
| 30 | overview.html, but not in the corresponding structure specific page: |
| 31 | General, Bool, Option, List, ListPair, IntInf, |
| 32 | Array, ArraySlice, Vector, VectorSlice. |
| 33 | * Word8Array2 is listed as required in overview.html, |
| 34 | but its signature, MONO_ARRAY2, is not required. |
| 35 | Furthermore, Word8Array2 is marked optional in mono-array2.html. |
| 36 | I don't quite see a rationale for Word8Array2 being required. |
| 37 | * With the addition of val ~ : word -> word to the WORD signature, |
| 38 | presumably ~ should be overloaded at num, rather than at intreal. |
| 39 | |
| 40 | Reals: |
| 41 | |
| 42 | * In pack-float.html, the where type clauses are incorrect: |
| 43 | structure PackRealBig :> PACK_REAL |
| 44 | where type PackRealBig.real = Real.real |
| 45 | should be |
| 46 | structure PackRealBig :> PACK_REAL |
| 47 | where type real = Real.real |
| 48 | * Likewise, in most places, references to basic types are unqualified, |
| 49 | so perhaps the where clause should read |
| 50 | where type real = real |
| 51 | for the PackRealBig and PackRealLittle structures. |
| 52 | |
| 53 | Arrays and Vectors: |
| 54 | |
| 55 | * In vector-slice.html, the description of subslice references |arr| |
| 56 | when it should reference |sl|. |
| 57 | * In {[mono-]array[-slice],[mono-]vector[-slice]}.html, the |
| 58 | description of findi references appi when it should reference findi. |
| 59 | * In mono-array-slice.html, structure CharArraySlice has the clause |
| 60 | where type array = CharVector.vector |
| 61 | which should be |
| 62 | where type array = CharArray.array. |
| 63 | * In mono-{vector[-slice],array[-slice],array2}.html, there are |
| 64 | Word<N> structures but no (default word) Word structures. |
| 65 | * In mono-vector.html, structure CharVector has the clause |
| 66 | where type elem = Char.char |
| 67 | while the other monomorphic vectors of basic types reference |
| 68 | the unqualified type; i.e. structure BoolVector has the clause |
| 69 | where type elem = bool. |
| 70 | * There are no "See also"'s into MONO_VECTOR_SLICE or MONO_ARRAY_SLICE |
| 71 | from MONO_VECTOR or MONO_ARRAY. |
| 72 | * A long discussion about types defined in |
| 73 | [MONO_]{ARRAY,VECTOR}[_SLICE] signatures; deferred to a separate |
| 74 | email. |
| 75 | |
| 76 | Really nit-picky: |
| 77 | |
| 78 | * Ordering of comparison functions (>, >=, etc.) and unary negation |
| 79 | are different within INTEGER and WORD. |
| 80 | * Ordering of functions in CHAR seems awkward. |
| 81 | * Ordering of full, slice, subslice different in ARRAY_SLICE and |
| 82 | VECTOR_SLICE. |
| 83 | * Ordering of foldi/fold and modifi/modify different in ARRAY2 and |
| 84 | MONO_ARRAY2. |
| 85 | |
| 86 | Top-level and opaque signatures: |
| 87 | * I think it would be useful to see the entire top-level of required |
| 88 | structures written out with their respective signature constraints. |
| 89 | For example, in the description of the Math structure, the spec |
| 90 | reads: "The top-level structure Math provides these functions for |
| 91 | the default real type Real.real." Because the top-level Math |
| 92 | structure has an opaque signature match (in overview.html), the |
| 93 | sentence above implies that there ought to be the constraint |
| 94 | where type real = real (or Real.real). |
| 95 | Granted, none of the other structures in overview.html have where |
| 96 | clauses, and most type constraints are documented in the structure |
| 97 | specific pages, but the constraint on the top-level Math.real |
| 98 | slipped my mind when I first looked at it. |
| 99 | |
| 100 | -Matthew |
| 101 | |
| 102 | ****************************************************************************** |
| 103 | ****************************************************************************** |
| 104 | |
| 105 | Date: Tue, 23 Jul 2002 11:54:09 -0400 (EDT) |
| 106 | From: Matthew Fluet <fluet@CS.Cornell.EDU> |
| 107 | |
| 108 | |
| 109 | As promised, here is a longish look at the types used in Arrays and |
| 110 | Vectors. |
| 111 | |
| 112 | Array and Vector design: |
| 113 | |
| 114 | * The ARRAY signature includes type 'a vector. |
| 115 | Presumably, type 'a Array.vector = type 'a Vector.vector, but no |
| 116 | constraint makes this explicit. |
| 117 | * MONO_ARRAY_SLICE includes type vector and type vector_slice, |
| 118 | while the ARRAY_SLICE signature explicitly references |
| 119 | 'a VectorSlice.slice and 'a Vector.vector. |
| 120 | * VECTOR_SLICE doesn't include 'a vector, but has |
| 121 | val mapi : (int * 'a -> 'b) -> 'a slice -> 'b vector |
| 122 | val map : ('a -> 'b) -> 'a slice -> 'b vector; |
| 123 | On the other hand, full, slice, base, vector, and concat |
| 124 | reference 'a Vector.vector. |
| 125 | |
| 126 | For consistency, I'd prefer to see |
| 127 | signature VECTOR = |
| 128 | sig type 'a vector ... end |
| 129 | signature VECTOR_SLICE = |
| 130 | sig type 'a vector type 'a slice ... end |
| 131 | signature ARRAY = |
| 132 | sig type 'a vector type 'a array ... end |
| 133 | signature ARRAY_SLICE = |
| 134 | sig type 'a vector type 'a vector_slice |
| 135 | type 'a array type 'a slice ... end |
| 136 | signature MONO_VECTOR = |
| 137 | sig type elem type vector ... end |
| 138 | signature MONO_VECTOR_SLICE = |
| 139 | sig type elem type vector type slice ... end |
| 140 | signature MONO_ARRAY = |
| 141 | sig type elem type vector type array ... end |
| 142 | signature MONO_ARRAY_SLICE = |
| 143 | sig type elem type vector type vector_slice |
| 144 | type array type slice ... end |
| 145 | |
| 146 | structure Vector :> VECTOR |
| 147 | structure VectorSlice :> VECTOR_SLICE |
| 148 | where type 'a vector = 'a Vector.vector |
| 149 | structure Array :> ARRAY |
| 150 | where type 'a vector = 'a Vector.vector |
| 151 | structure ArraySlice :> ARRAY_SLICE |
| 152 | where type 'a vector = 'a Vector.vector |
| 153 | where type 'a vector_slice = 'a VectorSlice.slice |
| 154 | where type 'a array = 'a Array.array |
| 155 | structure BoolVector :> MONO_VECTOR |
| 156 | where type elem = bool |
| 157 | structure BoolVectorSlice :> MONO_VECTOR_SLICE |
| 158 | where type elem = bool |
| 159 | where type vector = BoolVector.vector |
| 160 | structure BoolArray :> MONO_ARRAY |
| 161 | where type elem = bool |
| 162 | where type vector = BoolVector.vector |
| 163 | structure BoolArraySlice :> MONO_ARRAY_SLICE |
| 164 | where type elem = bool |
| 165 | where type vector = BoolVector.vector |
| 166 | where type vector_slice = BoolVectorSlice.slice |
| 167 | where type array = BoolArray.array |
| 168 | |
| 169 | While semantically, this shouldn't be any different than the |
| 170 | specification, it could affect type-error messages. For example, if I |
| 171 | have the structure Foo: |
| 172 | |
| 173 | structure Foo = struct |
| 174 | open BoolArraySlice |
| 175 | |
| 176 | fun copyVec0 {src: vector_slice, |
| 177 | dst: array} = copyVec {src = src, dst = dst, di = 0} |
| 178 | end |
| 179 | |
| 180 | which I decide to generalize to polymorphic array slices, then just |
| 181 | changing BoolArraySlice to ArraySlice will lead to different |
| 182 | type-error messages: either "unbound type constructor: vector_slice" |
| 183 | (under the specification) or "type constructor vector_slice given 0 |
| 184 | arguments, wants 1" (under the signatures given above); and an arity |
| 185 | error for array in either case. It's not much of an argument, but I |
| 186 | need to replace vector_slice with 'a VectorSlice.slice under the |
| 187 | specification, while I only need to add 'a under the sigs above. |
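
A minimal sketch of the generalized version, assuming the ARRAY_SLICE signature as specified, where copyVec mentions 'a VectorSlice.slice and 'a Array.array directly. The name PolyFoo is hypothetical and only serves the illustration:

```sml
(* Hypothetical polymorphic counterpart of Foo.
   Assumes ArraySlice.copyVec has type
   {src : 'a VectorSlice.slice, dst : 'a Array.array, di : int} -> unit. *)
structure PolyFoo =
struct
  fun copyVec0 {src : 'a VectorSlice.slice, dst : 'a Array.array} : unit =
    ArraySlice.copyVec {src = src, dst = dst, di = 0}
end

(* Usage: copy a three-element vector slice to the front of an array. *)
val dst = Array.array (3, 0)
val () =
  PolyFoo.copyVec0 {src = VectorSlice.full (Vector.fromList [1, 2, 3]), dst = dst}
```

Both type annotations had to be rewritten by hand here, which is the porting cost described above.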
| 188 | |
| 189 | |
| 190 | Array2: |
| 191 | * Why not have an ARRAY2_REGION analogous to ARRAY_SLICE? |
| 192 | Likewise, how about VECTOR2 and VECTOR2_REGION? |
| 193 | I think the decision to separate Arrays and Vectors from |
| 194 | their corresponding slices is a nice design choice, and I'd be in |
| 195 | favor of extending it to multi-dimensional ones. |
| 196 | * Should ARRAY2 have findi/find, exists, all? collate? |
| 197 | |
| 198 | ****************************************************************************** |
| 199 | ****************************************************************************** |
| 200 | |
| 201 | Date: Thu, 25 Jul 2002 15:20:01 +0200 |
| 202 | From: Andreas Rossberg <rossberg@ps.uni-sb.de> |
| 203 | |
| 204 | |
| 205 | Like Matthew I started implementing the latest version of the Basis spec |
| 206 | for Alice and Hamlet. I'm quite happy with most of the changes. It was a |
| 207 | surprise to discover the presence of a Windows structure, though :-) |
| 208 | |
| 209 | Here is my list of comments, some of which may duplicate observations |
| 210 | already made by Matthew. They primarily cover global issues and the |
| 211 | required part of the library, though I haven't looked deeper into the IO |
| 212 | and Posix parts yet. I also included some proposals for modest additions |
| 213 | to the library, which I believe are useful and fit its spirit. |
| 214 | |
| 215 | |
| 216 | Trivial bugs, typos, cosmetics |
| 217 | ------------------------------ |
| 218 | |
| 219 | * Overview: |
| 220 | - INT_INF appears in the list of required signatures. |
| 221 | - WordArray2 appears under the list of required structures, |
| 222 | instead of optional ones. |
| 223 | |
| 224 | * LIST_PAIR: |
| 225 | - Typo in description of allEq: double "the". |
| 226 | |
| 227 | * SUBSTRING: |
| 228 | - The scan example uses the deprecated "all" function. |
| 229 | |
| 230 | * VECTOR_SLICE: |
| 231 | - Typo in synopsis of subslice: s/opt/sz/. |
| 232 | - Typo in description of subslice: s/|arr|/|sl|/. |
| 233 | - Typo in description of findi: s/appi/findi/. |
| 234 | - Signature sometimes uses Vector.vector instead of plain vector. |
| 235 | - The equation for mapi can be simplified to: |
| 236 | Vector.fromList (foldri (fn (i,a,l) => f(i,a)::l) [] slice) |
| 237 | |
| 238 | * MONO_VECTOR_SLICE and ARRAY_SLICE and MONO_ARRAY_SLICE: |
| 239 | - Typo in synopsis of subslice: s/opt/sz/. |
| 240 | - Typo in description of findi: s/appi/findi/. |
| 241 | |
| 242 | * BYTE: |
| 243 | - Accidental "val" keyword in synopsis of some functions. |
| 244 | |
| 245 | * TEXT_IO: |
| 246 | - The "where" constraints contain erroneously qualified ids. |
| 247 | - The specification of the TEXT_IO signature is not valid SML'97, |
| 248 | since StreamIO is specified twice. You might want to add a |
| 249 | comment regarding that. |
| 250 | - The constraints for types vector and elem are redundant |
| 251 | (in fact, invalid), because the signature TEXT_STREAM_IO |
| 252 | already specifies the necessary equations. |
| 253 | |
| 254 | * The use of variable names is sometimes inconsistent: |
| 255 | - Predicate arguments to higher-order functions are usually |
| 256 | named "f" (eg. List.all), sometimes "p" (eg. String.tokens, |
| 257 | StringCvt.splitl), and sometimes even "pred" (eg. ListPair.all). |
| 258 | - Similarly, fold functions mostly use "init" to name initial |
| 259 | accumulators, except in the List and ListPair modules. |
| 260 | |
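
The simplified mapi equation suggested for VECTOR_SLICE above can be tried out directly. This is only a sketch: mapiSketch is a hypothetical name, and it assumes an implementation that provides VectorSlice.foldri and VectorSlice.mapi as in the specification.

```sml
(* mapi over a vector slice, written with foldri and Vector.fromList.
   foldri calls f from right to left, so this matches the library's
   mapi only when f has no observable side effects. *)
fun mapiSketch f slice =
  Vector.fromList (VectorSlice.foldri (fn (i, a, l) => f (i, a) :: l) [] slice)

(* Usage: compare against the library's own mapi on a sample slice. *)
val sl = VectorSlice.slice (Vector.fromList [10, 20, 30, 40], 1, NONE)
fun g (i, a) = i + a
val agrees = (mapiSketch g sl = VectorSlice.mapi g sl)
```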
| 261 | |
| 262 | |
| 263 | Ambiguities / Unclear Details |
| 264 | ----------------------------- |
| 265 | |
| 266 | * Overview: |
| 267 | - The subsection about dependencies among optional modules has |
| 268 | disappeared. Does that mean that there aren't any anymore? |
| 269 | (The nice subsection about design rules and conventions also |
| 270 | has gone.) |
| 271 | |
| 272 | * The intended meaning of opaque signature constraints is not always |
| 273 | clear to me. Sometimes the prose contains remarks about additional |
| 274 | equalities that are not apparent from the signature constraints. |
| 275 | For example, is or isn't |
| 276 | - Text.Char.char = Char.char ? (and so on for the rest of Text) |
| 277 | - LargeInt.int = IntN.int (for some structure IntN) ? |
| 278 | (likewise LargeWord.word, LargeReal.real) |
| 279 | - Char.string = String.string ? |
| 280 | - Math.real = Real.real ? |
| 281 | In particular, the spec sometimes speaks of "equal structures", |
| 282 | which has no real technical meaning in SML'97. |
| 283 | Note that from the opaque matching on the overview page one might |
| 284 | even conclude that General.unit <> {} ! |
| 285 | |
| 286 | * The type specification of String.string and CharVector.vector |
| 287 | is circular: |
| 288 | structure String :> STRING |
| 289 | where type string = CharVector.vector |
| 290 | structure CharVector :> MONO_VECTOR |
| 291 | where type vector = String.string |
| 292 | Likewise for Substring.substring and CharVectorSlice.slice. |
| 293 | A respective defining structure should be chosen. |
| 294 | |
| 295 | * STRING: |
| 296 | - Function fromString has a special case that is not covered by |
| 297 | implementing the function through straight-forward iterative |
| 298 | application of the Char.scan function, namely a trailing gap |
| 299 | escape (\f...f\) as in "foo\\ \\" or "foo\\ \\\000" (where \000 |
| 300 | is a non-convertible character). Several implementations I |
| 301 | tried get that detail wrong, so a corresponding note might be |
| 302 | in order. Moreover, it is not completely obvious from the |
| 303 | description what the result should be for strings that contain |
| 304 | a gap escape as the only convertible sequence, e.g. "\\ \\" or |
| 305 | "\\ \\\000" - it is supposed to be SOME "", I guess. |
| 306 | |
| 307 | * SUBSTRING: |
| 308 | - Shouldn't span raise Span if i' < i? Otherwise, contrary |
| 309 | to the prose, it in fact accepts arguments where ss' is |
| 310 | left to ss, as long as they overlap (which is rather odd). |
| 311 | - For the curried triml/trimr it is not clear whether a |
| 312 | Subscript exception has to be raised already if k < 0 but no |
| 313 | second argument is applied. |
| 314 | |
| 315 | |
| 316 | |
| 317 | Naming and structuring |
| 318 | ---------------------- |
| 319 | |
| 320 | Its nicely chosen regular naming conventions and structure are two of |
| 321 | the aspects I like most about the Standard Basis. The following list |
| 322 | enumerates the few cases where I feel that the spec violates its own |
| 323 | conventions. |
| 324 | |
| 325 | * WORD: |
| 326 | - The fromLargeWord and toLargeWord functions should drop |
| 327 | the "Word" suffix to be consistent with the corresponding |
| 328 | functions in the REAL and INTEGER signatures. |
| 329 | |
| 330 | * CHAR: |
| 331 | - The functions contains/notContains should be moved to the |
| 332 | STRING signature, as they are similar to find/exists |
| 333 | operations and thus functionality of the aggregate. The |
| 334 | type string could then be removed from the signature. |
| 335 | |
| 336 | * ARRAY_SLICE and MONO_ARRAY_SLICE: |
| 337 | - The function copyVec seems completely out of place: it operates |
| 338 | neither on array slices nor on vectors. But honestly |
| 339 | I have got no idea where else to put it :-( |
| 340 | |
| 341 | * STRING and SUBSTRING: |
| 342 | - There is a certain asymmetry between slices and substrings |
| 343 | which tends to confuse at least myself when hacking. For more |
| 344 | consistency I propose: |
| 345 | (1) changing the type of Substring.substring to |
| 346 | string * int * int option -> substring |
| 347 | (for consistency with VectorSlice.slice), |
| 348 | (2) renaming Substring.slice to Substring.subsubstring, |
| 349 | (for consistency with VectorSlice.subslice), |
| 350 | (3) removing Substring.{app,foldl,foldr} (there are no similar |
| 351 | functions in the STRING signature, and in both cases they |
| 352 | are available through CharVector/CharVectorSlice), |
| 353 | (4) removing String.extract and Substring.extract (the same |
| 354 | functionality is available through CharVector[Slice]). |
| 355 | - I believe the deprecated Substring.all can be removed for good. |
| 356 | After all, there are more serious incompatible changes being |
| 357 | made (e.g. array copying functions). |
| 358 | |
| 359 | * Vectors and arrays: |
| 360 | - While the lib consistently uses the to/from convention for |
| 361 | conversions on basic types, it sometimes uses ad hoc conventions |
| 362 | for aggregates. I propose renaming: |
| 363 | (1) Array.vector to Array.toVector |
| 364 | (2) VectorSlice.vector to VectorSlice.toVector, |
| 365 | (3) ArraySlice.vector to ArraySlice.toVector, |
| 366 | (4) Substring.string to Substring.toString, |
| 367 | - Since the copy functions have only 3, mostly distinctly typed |
| 368 | arguments now, there no longer seems to be a strong reason to |
| 369 | require passing those by notationally heavy records. |
| 370 | |
| 371 | * INT_INF: |
| 372 | - The presence of bit fiddling operators in that signature is |
| 373 | something that feels exceptionally ad-hoc. Either they should |
| 374 | be available for all integer types, or there should be a |
| 375 | separate WORD_INF, with appropriate conversions, that makes |
| 376 | these available. |
| 377 | |
| 378 | * Toplevel: |
| 379 | - Now that there is Word.~ (which is good) it seems rather odd |
| 380 | that the toplevel ~ is not overloaded for words, i.e. does not |
| 381 | have type num -> num. |
| 382 | |
| 383 | * Net functionality: |
| 384 | - I really like the idea of structuring the library namespace as |
| 385 | it has been done with the OS and Posix structures. I would |
| 386 | prefer to see something similar being done for the added |
| 387 | network functionality. More precisely, I propose |
| 388 | (1) moving the structures Socket, INetSock, GenericSock, and |
| 389 | the three Net*DB structures into a new wrapper structure |
| 390 | Net (renaming Net*DB to *DB), |
| 391 | (2) defining a corresponding signature NET, |
| 392 | (3) renaming the signatures SOCKET, GENERIC_SOCK and INET_SOCK |
| 393 | to NET_SOCKET, NET_GENERIC_SOCK and NET_INET_SOCK, resp., |
| 394 | (4) moving UnixSock to the Unix structure (renamed as Socket). |
| 395 | |
| 396 | |
| 397 | |
| 398 | Misc. proposals for additional functionality |
| 399 | -------------------------------------------- |
| 400 | |
| 401 | Here is a small collection of miscellaneous simple functions which I |
| 402 | believe the library is still lacking, either because they are commonly |
| 403 | useful or because they would make the library more regular. |
| 404 | |
| 405 | * LIST and LIST_PAIR: |
| 406 | - The IMHO single most convenient extension to the library would |
| 407 | be indexed morphisms on lists, i.e. adding |
| 408 | val appi : (int * 'a -> unit) -> 'a list -> unit |
| 409 | val mapi : (int * 'a -> 'b) -> 'a list -> 'b list |
| 410 | val foldli : (int * 'a * 'b -> 'b) -> 'b -> 'a list -> 'b |
| 411 | val foldri : (int * 'a * 'b -> 'b) -> 'b -> 'a list -> 'b |
| 412 | val findi : (int * 'a -> bool) -> 'a list -> (int * 'a) option |
| 413 | - Likewise for LIST_PAIR. |
| 414 | - LIST_PAIR does not support partial mapping: |
| 415 | val mapPartial : ('a * 'b -> 'c option) -> |
| 416 | 'a list * 'b list -> 'c list |
| 417 | |
| 418 | * LIST, VECTOR, ARRAY, etc.: |
| 419 | - Another function on lists that would be very useful from my |
| 420 | perspective is |
| 421 | val appr : ('a -> unit) -> 'a list -> unit |
| 422 | and its indexed sibling |
| 423 | val appri : (int * 'a -> unit) -> 'a list -> unit |
| 424 | which traverse the list from right to left. |
| 425 | - Likewise for all aggregate types. |
| 426 | - All aggregates come with a fromList function. I often feel the |
| 427 | need to have inverse toList functions. Use of foldr is obfuscating. |
| 428 | |
| 429 | * OPTION: |
| 430 | - Often using isSome is a bit clumsy. I thus propose adding the dual |
| 431 | val isNone : 'a option -> bool |
| 432 | |
| 433 | * STRING and SUBSTRING: |
| 434 | - For historical reasons we have {String,Substring}.size instead |
| 435 | of *.length, which is inconsistent with all other aggregates and |
| 436 | frequently lets me mix them up when I use them side by side. |
| 437 | I propose adding aliases |
| 438 | String.maxLen |
| 439 | String.length |
| 440 | Substring.length |
| 441 | |
| 442 | * WideChar and WideString: |
| 443 | - There is no convenient way to convert between the standard and |
| 444 | wide character set. Would it be reasonable to introduce LargeChar |
| 445 | and LargeString structures (and so on) and have the CHAR and |
| 446 | STRING signatures enriched by fromLarge/toLarge functions, as for |
| 447 | numbers? That would also allow a program to select the widest |
| 448 | character set available (which is currently impossible within the |
| 449 | language). |
| 450 | |
| 451 | * String conversion: |
| 452 | - I don't quite see the rationale for which signatures contain a |
| 453 | scan function and which don't. I believe it makes sense to have |
| 454 | scan in every signature that has fromString. |
| 455 | - There should be a function |
| 456 | val scanC : (Char.char, 'a) StringCvt.reader |
| 457 | -> (char, 'a) StringCvt.reader |
| 458 | to scan strings as C characters. This would make Char.fromCString |
| 459 | and particularly String.fromCString more modular. |
| 460 | - How about a dual writer abstraction as with |
| 461 | type ('a,'b) writer = 'a * 'b -> 'b option |
| 462 | and supporting fmt functions for basic types? Such a thing might |
| 463 | be useful for writing to streams or buffers. |
| 464 | |
| 465 | * Vectors: |
| 466 | For some time now I have been trying to use vectors more often |
| 467 | instead of an often inappropriate list representation. This is |
| 468 | sometimes made more difficult simply because the library support |
| 469 | isn't as good as for lists. It improved in the updated version |
| 470 | but still I miss: |
| 471 | - Array.fromVector, |
| 472 | - Vector.mapPartial, |
| 473 | - Vector.rev, |
| 474 | - Vector.append (though I guess concat is good enough), |
| 475 | - most of all: a VectorPair structure. |
| 476 | |
| 477 | * Hash functions: |
| 478 | - Giving every basic type a (default) hash function in addition to |
| 479 | comparison would be quite useful in conjunction with container |
| 480 | libraries. |
| 481 | |
| 482 | * There is no defining structure for references. I would like to see |
| 483 | signature REF |
| 484 | structure Ref : REF |
| 485 | where REF contains: |
| 486 | datatype ref = datatype ref |
| 487 | val ! : 'a ref -> 'a |
| 488 | val := : 'a ref * 'a -> unit |
| 489 | val swap : 'a ref * 'a ref -> unit (* or :=: ? *) |
| 490 | val map : ('a -> 'a) -> 'a ref -> 'a ref |
| 491 | You might then consider removing ! and := from GENERAL. |
| 492 | |
| 493 | * Signature conventions: |
| 494 | Some additional conventions would make use of Basis types as |
| 495 | functor arguments more convenient: |
| 496 | - Each signature defining an abstract type should make that |
| 497 | type available under the alias "t" as well (this includes |
| 498 | monomorphic types as well as polymorphic ones). |
| 499 | - Every equality type should come with an explicit equality |
| 500 | function |
| 501 | val eq : t * t -> bool |
| 502 | to move away from the reliance on eqtypes. |
| 503 | - There should be a uniform name for canonical constructor |
| 504 | functions, e.g. "new" (or at least an alias). |
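
The indexed list operations proposed under LIST and LIST_PAIR above could be sketched as follows. The structure name ListIndexedSketch is hypothetical, the signatures follow the proposal, and the traversal orders are an assumption modelled on the existing List functions:

```sml
structure ListIndexedSketch =
struct
  (* Left-to-right fold that also passes each element's index. *)
  fun foldli f init xs =
    let
      fun loop (_, acc, []) = acc
        | loop (i, acc, x :: rest) = loop (i + 1, f (i, x, acc), rest)
    in
      loop (0, init, xs)
    end

  (* Right-to-left fold; indices still count from the head of the list. *)
  fun foldri f init xs =
    let
      fun loop (_, []) = init
        | loop (i, x :: rest) = f (i, x, loop (i + 1, rest))
    in
      loop (0, xs)
    end

  (* Apply f to every (index, element) pair, left to right. *)
  fun appi f xs = foldli (fn (i, x, ()) => f (i, x)) () xs

  (* Map with index; f is applied left to right, like List.map. *)
  fun mapi f xs = rev (foldli (fn (i, x, acc) => f (i, x) :: acc) [] xs)

  (* First (index, element) pair satisfying p, if any. *)
  fun findi p xs =
    let
      fun loop (_, []) = NONE
        | loop (i, x :: rest) =
            if p (i, x) then SOME (i, x) else loop (i + 1, rest)
    in
      loop (0, xs)
    end
end
```

For instance, ListIndexedSketch.mapi (fn (i, x) => i * x) [5, 6, 7] evaluates to [0, 6, 14].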
| 505 | |
| 506 | -- |
| 507 | Andreas Rossberg, rossberg@ps.uni-sb.de |
| 508 | |
| 509 | ****************************************************************************** |
| 510 | ****************************************************************************** |
| 511 | |
| 512 | Date: Fri, 2 Aug 2002 14:04:16 +0100 |
| 513 | From: David Matthews <David.Matthews@deanvillage.com> |
| 514 | |
| 515 | |
| 516 | I've been having another look at the Basis library implementation in |
| 517 | Poly/ML and in particular the I/O library. I'm still not sure I fully |
| 518 | understand the implications of the Stream IO (functional IO) layer and |
| 519 | in particular the way "canInput" works and interacts with "input". |
| 520 | |
| 521 | The definition says that canInput(f, n) returns SOME k "if a call to |
| 522 | input would return immediately with at least k characters". |
| 523 | Specifically it does not say "if a call to inputN(f, k) would return |
| 524 | immediately". Secondly it says that it "should attempt to return as |
| 525 | large a k as possible" and gives the example of a buffer containing 10 |
| 526 | characters with the user calling canInput(f, 15). This suggests that a |
| 527 | call to canInput could have the effect of committing the stream since a |
| 528 | perfectly good implementation of "input" would be to return what was |
| 529 | left of the buffer, i.e. 10 characters, and only read from the |
| 530 | underlying stream on a subsequent call to "input". Yet after a call to |
| 531 | canInput(f, 15) which returns SOME 15 the call to "input" is forced to |
| 532 | return at least 15. In other words a call to canInput changes the |
| 533 | behaviour of a subsequent call to "input". Generally, what is the |
| 534 | behaviour of canInput with an argument larger than the buffer size? How |
| 535 | far ahead is canInput expected to read? |
| 536 | |
| 537 | A few other notes of things I've discovered, some of which are trivial: |
| 538 | |
| 539 | The signature for TextIO.StreamIO contains duplicates of |
| 540 | where type StreamIO.reader = TextPrimIO.reader |
| 541 | where type StreamIO.writer = TextPrimIO.writer |
| 542 | |
| 543 | There are declared constants for platformWin32Windows2000 and |
| 544 | platformWin32WindowsXP in the Windows structure. When I proposed the |
| 545 | Windows.Config structure I didn't include constants for these versions |
| 546 | of the OS because the underlying GetVersionEx function returns the same |
| 547 | value, VER_PLATFORM_WIN32_NT in the dwPlatformId field for NT, Windows |
| 548 | 2000 and XP. It is possible to distinguish these but only using the |
| 549 | major and minor version fields. Windows CE does give a different value |
| 550 | for the platformID. I would say it is confusing to have these here |
| 551 | because it implies that it's possible to discriminate on the basis of |
| 552 | the platformID field. |
| 553 | |
| 554 | The example definition of input1 at the bottom of STREAM_IO returns a |
| 555 | value of type elem option * instream when the signature says it should |
| 556 | be (elem * instream) option. |
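
For comparison, a definition with the result type the signature asks for might look like the sketch below. It is not the example from the STREAM_IO page; the names SIO and input1Sketch are hypothetical, and it assumes TextIO.StreamIO with elem = char and vector = string.

```sml
(* Sketch: input1 expressed through inputN, returning
   (elem * instream) option as the STREAM_IO signature requires. *)
structure SIO = TextIO.StreamIO

fun input1Sketch (f : SIO.instream) : (SIO.elem * SIO.instream) option =
  let
    (* inputN (f, 1) yields an empty string only at end of stream. *)
    val (s, f') = SIO.inputN (f, 1)
  in
    if String.size s = 0 then NONE else SOME (String.sub (s, 0), f')
  end

(* Usage on an in-memory stream. *)
val ins = TextIO.getInstream (TextIO.openString "ab")
val first = input1Sketch ins
```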
| 557 | |
| 558 | Description of "input" function in STREAM_IO signature. The word "ay" |
| 559 | should be "may". |
| 560 | |
| 561 | -- |
| 562 | David. |
| 563 | |
| 564 | ****************************************************************************** |
| 565 | ****************************************************************************** |
| 566 | |
| 567 | Date: Fri, 11 Oct 2002 17:46:59 -0400 (EDT) |
| 568 | From: Matthew Fluet <fluet@CS.Cornell.EDU> |
| 569 | |
| 570 | |
| 571 | Following up my previous post, here is another loose collection of |
| 572 | notes I've taken while updating the MLton implementation of the SML |
| 573 | Basis Library. This includes the structures that had been grouped |
| 574 | under the headings System, Posix, and IO in the "old" web |
| 575 | specification. |
| 576 | |
| 577 | Required and optional components: |
| 578 | * The optional functors PrimIO, StreamIO, and ImperativeIO are not |
| 579 | listed among the optional components in overview.html. |
| 580 | |
| 581 | Lists: |
| 582 | * The discussion for the ListPair structure says: |
| 583 | "Note that a function requiring equal length arguments may determine |
| 584 | this lazily, i.e. , it may act as though the lists have equal length |
| 585 | and invoke the user-supplied function argument, but raise the |
| 586 | exception when it arrives at the end of one list before the end of the |
| 587 | other." |
| 588 | Such an implementation choice seems to go against the spirit that |
| 589 | programs run under conforming implementations of the Basis Library |
| 590 | should behave the same. |
| 591 | |
| 592 | Posix: |
| 593 | * In posix.html, last sentence in Discussion: "onsult" instead of |
| 594 | "consult" |
| 595 | PosixSignal: |
| 596 | * In posix-signal.html, in Discussion: "The name of the coressponding |
| 597 | ..." sentence is repeated. |
| 598 | PosixError: |
| 599 | * In the discussion of POSIX_ERROR: |
| 600 | "The name of a corresponding POSIX error can be derived by |
| 601 | capitalizing all letters and adding the character ``E'' as a |
| 602 | prefix. For example, the POSIX error associated with nodev is |
| 603 | ENODEV. The only exception to this rule is the error toobig, whose |
| 604 | associated POSIX error is E2BIG." |
| 605 | It isn't clear if this is the intended semantics for errorName and |
| 606 | syserror. |
| 607 | |
| 608 | Time: |
| 609 | * The type time now includes "negative values moving to the past." |
| 610 | In the absence of negative values, the text for the |
| 611 | to{Seconds,Milliseconds,Microseconds} functions to drop fractions of |
| 612 | the time unit was unambiguous. With negative values, I would |
| 613 | interpret this as rounding towards zero. Is this correct? Would it |
| 614 | be clearer to describe the rounding as such? |
| 615 | * The + and - functions are required to raise Overflow, although most |
| 616 | other "result not representable as a time value" errors raise Time. |
| 617 | * The - function is written prefix instead of infix in the |
| 618 | description. |
| 619 | * The scan and fromString functions do not specify how to treat a |
| 620 | value with greater precision than the internal representation; |
| 621 | should it have rounding or truncation semantics? Also, the |
| 622 | functions are required to raise Overflow for an unrepresentable |
| 623 | time value. |
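
The two readings of "drop fractions of the time unit" for negative values can be made concrete with plain integer arithmetic. This sketch does not call the Time structure at all; the function names are hypothetical and the microsecond count is only an example value.

```sml
(* Convert a count of microseconds to whole seconds in two ways. *)

(* Int.quot truncates, i.e. rounds toward zero. *)
fun secondsTowardZero (usec : int) : int = Int.quot (usec, 1000000)

(* div rounds toward negative infinity. *)
fun secondsTowardNegInf (usec : int) : int = usec div 1000000

val a = secondsTowardZero ~1500000    (* ~1 *)
val b = secondsTowardNegInf ~1500000  (* ~2 *)
```

For non-negative inputs the two functions agree, which is why the question only arises once time values may be negative.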
| 624 | |
| 625 | IO: |
| 626 | * The nice introduction to IO that appears at |
| 627 | http://cm.bell-labs.com/cm/cs/what/smlnj/doc/basis/pages/io-explain.html |
| 628 | doesn't seem to be included with the new pages. |
| 629 | * The functor arguments in PrimIO, StreamIO, and ImperativeIO functors |
| 630 | don't match; some use structure A: MONO_ARRAY and others use |
| 631 | structure Array: MONO_ARRAY. |
| 632 | |
| 633 | PrimIO() and PRIM_IO |
| 634 | * The PRIM_IO signature requires pos to be an eqtype, but the PrimIO |
| 635 | functor argument only requires pos to be a type. |
| 636 | * readArr[NB], write{Vec,Arr}[NB] take "slices" (records of type {buf: |
| 637 | {vector,array}, i: int, sz: int option}), but there is no description of the |
| 638 | appropriate action to take when the slices are invalid. Presumably, |
| 639 | they should raise Subscript. |
| 640 | * There are a number of "contradictory" statements: |
| 641 | "Readers and writers should not, in general, raise the IO.Io |
| 642 | exception. It is assumed that the higher levels will appropriately |
| 643 | handle these exceptions." |
| 644 | "A reader is required to raise IO.Io if any of its functions, except |
| 645 | close or getPos, is invoked after a call to close. A writer is |
| 646 | required to raise IO.Io if any of its functions, except close, is |
| 647 | invoked after a call to close." |
| 648 | "closes the reader and frees operating system resources. Further |
| 649 | operations on the reader (besides close and getPos) raise |
| 650 | IO.ClosedStream." |
| 651 | "closes the writer and frees operating system resources. Further |
| 652 | operations (other than close) raise IO.ClosedStream." |
| 653 | * The augment_reader and augment_writer functions may introduce new |
| 654 | functions. Should the synthesized operations handle IO.Io |
| 655 | exceptions and change the function field? Maybe this falls under |
| 656 | the "intentionally unspecified" clause. |
| 657 | |
| 658 | StreamIO() and STREAM_IO: |
| 659 | * What is the difference between a terminated output stream and a |
| 660 | closed output stream? Some operations say what to do when the |
| 661 | stream is terminated or closed, but many are unspecified when the |
| 662 | other condition holds. I resolved this by looking at the IO |
| 663 | introduction mentioned above, where it discusses stream states. |
| 664 | But, closeOut is still confusing: "flushes f's buffers, marks the |
| 665 | stream closed, and closes the underlying writer. This operation has |
| 666 | no effect if f is already closed. If f is terminated, it should |
| 667 | close the underlying writer." Shouldn't closeOut always execute the |
| 668 | underlying writer's close function? The only way to terminate an |
| 669 | outstream is to getOutstream, but I would really expect |
| 670 | TextIO.closeOut to "really" close the underlying |
| 671 | file/outstream/writer. |
| 672 | * The IO structure has dropped the TerminatedStream exception, but |
| 673 | there seem to be sufficient cases when a stream should raise an |
| 674 | exception when it is terminated. |
| 675 | * The semantics of the vector returned by getReader are unclear. At |
| 676 | the very least, the source code for SML/NJ and PolyML have very |
| 677 | different interpretations, and I've chosen yet another. I think |
| 678 | part of the problem is that the word "[un]consumed" only appears in |
| 679 | the description of this function, so it's unclear what corresponds |
| 680 | to consumed input. |
| 681 | * I suspect the example under endOfStream is wrong: |
| 682 | |
| 683 | In these cases the StreamIO.instream will also have multiple EOF's; |
| 684 | that is, it can be that |
| 685 | |
| 686 | val true = endOfStream(f) |
| 687 | val ("",f') = input f |
| 688 | val true = endOfStream(f') |
| 689 | val ("xyz",f'') = input f |
| 690 | |
| 691 | The fact that input f can return two different values would seem to |
| 692 | violate the principal argument for functional streams! Looking at |
| 693 | the aforementioned IO introduction in the "old" pages, I see the |
| 694 | more reasonable example: |
| 695 | |
| 696 | Consequently, the following is not guaranteed to be true: |
| 697 | |
| 698 | let val z = TextIO.StreamIO.endOfStream f |
| 699 | val (a,f') = TextIO.StreamIO.input f |
| 700 | val x = TextIO.StreamIO.endOfStream f' |
| 701 | in x=z (* not necessarily true! *) |
| 702 | end |
| 703 | |
| 704 | whereas the following is guaranteed to be true: |
| 705 | |
| 706 | let val z = TextIO.StreamIO.endOfStream f |
| 707 | val (a,f') = TextIO.StreamIO.input f |
| 708 | val x = TextIO.StreamIO.endOfStream f (* note, no prime! *) |
| 709 | in x=z (* guaranteed true! *) |
| 710 | end |
| 711 | * David Matthews's post on Aug. 2 raised questions about canInput |
| 712 | which are unresolved. |
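
The functional-stream property behind the endOfStream objection can be sketched as follows. The example assumes TextIO.openString is available to build an in-memory stream; the names f, a, b, and agree are chosen here for illustration. Reading twice from the same stream value is expected to give the same result, which is why an example in which input f returns two different values looks wrong.

```sml
(* Build a functional instream over an in-memory string. *)
val f = TextIO.getInstream (TextIO.openString "xyz")

(* Read twice from the same stream value f. *)
val (a, _) = TextIO.StreamIO.input f
val (b, _) = TextIO.StreamIO.input f

(* For a functional stream, both reads are expected to agree. *)
val agree = (a = b)
```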
| 713 | |
| 714 | General comments: |
| 715 | * Various operations in IO take "slices", but aren't expressed in |
| 716 | terms of {Vector,Array}Slice structures. One difficulty with this |
| 717 | is that the slice types are not in scope within the IO signatures. |
| 718 | |
| 719 | I would really advocate making the VectorSlice structure a |
| 720 | substructure of the Vector structure (and likewise for arrays). |
| 721 | Even if this isn't done for the polymorphic vector/array structures, |
| 722 | it would be extremely beneficial for the monomorphic structures, |
| 723 | where in the {Prim,Stream,Imperative}IO functors, it is impossible |
| 724 | to access the corresponding monomorphic vector/array slice |
| 725 | structures. I found myself using Vector.tabulate when I really |
| 726 | wanted ArraySlice.vector. |
| 727 | |
| 728 | The "old" MONO_ARRAY signature included structure Vector: |
| 729 | MONO_VECTOR which gave access to the corresponding monomorphic |
| 730 | vectors. |
| 731 | |
| 732 | -Matthew |
| 733 | |
| 734 | ****************************************************************************** |
| 735 | ****************************************************************************** |
| 736 | |
| 737 | Date: Fri, 13 Dec 2002 15:57:55 +0100 |
| 738 | From: Andreas Rossberg <rossberg@ps.uni-sb.de> |
| 739 | |
| 740 | |
| 741 | Here is a collection of issues and comments we gathered when |
| 742 | implementing the I/O stack from the Standard Basis (primitive, stream, |
| 743 | imperative I/O) for Alice. While in general the specification seems to |
| 744 | be pretty precise and complete, we sometimes found it hard to understand |
| 745 | the semantic details of stream I/O, especially since many of them can |
| 746 | only be derived indirectly from the examples in the discussion section |
| 747 | and there appear to be some minor ambiguities and inconsistencies. Also, |
| 748 | the PrimIO and StreamIO functors cannot always be implemented as |
| 749 | suggested, because of their parametricity in types such as position and |
| 750 | element. |
| 751 | |
| 752 | As a general note, the I/O interface does not seem to have been designed |
| 753 | with concurrency in mind. In particular, augmenting readers and writers |
| 754 | cannot be made thread-safe, AFAWCS. This is a bit of a problem for us, |
| 755 | since Alice is relying on concurrency. However, that does not seem to be |
| 756 | an issue easily solved. |
| 757 | |
| 758 | - Leif Kornstaedt, Andreas Rossberg |
| 759 | |
| 760 | |
| 761 | The IO structure |
| 762 | ---------------- |
| 763 | |
| 764 | * exception Io: |
| 765 | |
| 766 | - function field: (pedantic) The wording seems to imply that only |
| 767 | functions from STREAM_IO raise the Io exception, but this is |
| 768 | clearly not the case (consider TextIO.openIn to name just one). |
| 769 | |
| 770 | * datatype buffer_mode: |
| 771 | |
| 772 | - There is no specification of what precisely line buffering is |
| 773 | supposed to mean, in particular for non-text streams. |
| 774 | |
| 775 | |
| 776 | |
| 777 | The PRIM_IO signature |
| 778 | --------------------- |
| 779 | |
| 780 | * Synopsis: |
| 781 | |
| 782 | - (pedantic) It says that "higher level I/O facilities do not |
| 783 | access the OS structure directly...". That's somewhat misleading |
| 784 | since OS does not provide the same functionality anyway (if any, |
| 785 | it was the Posix structure). |
| 786 | |
| 787 | * type reader: |
| 788 | |
| 789 | - Unlike for writers, it is not specified what the minimal set of |
| 790 | operations is that a reader must support. |
| 791 | |
| 792 | - It is not specified whether multiple end-of-streams may occur. |
| 793 | Since they are anticipated for StreamIO, one should expect them |
| 794 | to be possible for underlying readers as well. However, this |
| 795 | requires clarification of the semantics of several operations. |
| 796 | |
| 797 | - readArr, readArrNB: It is specified nowhere what the option for |
| 798 | sz is supposed to mean, i.e. what the semantics of NONE is |
| 799 | (presumably as for slices). |
| 800 | |
| 801 | - readVec, readVecNB: Unlike all other similar read and write |
| 802 | functions, these two do not accept an option for the size |
| 803 | argument. |
| 804 | |
| 805 | - avail: The description suggests that the function can be used as |
| 806 | a hint by inputAll. However, this information is too inaccurate |
| 807 | to be useful, since (apart from translation issues) the physical |
| 808 | size of elements cannot be obtained (in particular in the |
| 809 | StreamIO functor, which is parametric in the element type). In |
| 810 | practice, endPos seems to be more useful for this purpose. So it |
| 811 | is not clear what purpose avail could actually serve at all at |
| 812 | the abstraction level provided by readers. |
| 813 | |
| 814 | - endPos: |
| 815 | (1) May it block? For example, when reading from terminal or |
| 816 | from another kind of stream, this can be naturally expected. |
| 817 | |
| 818 | (2) Which position is returned if there are multiple |
| 819 | end-of-streams? |
| 820 | |
| 821 | - getPos, setPos, endPos, verifyPos: Description should start with |
| 822 | "when present". |
| 823 | |
| 824 | - setPos, endPos: Should not raise an exception if unimplemented, |
| 825 | but rather be NONE. Actually, the implementation notes on writers |
| 826 | state that endPos *must* be implemented for readers. |
| 827 | |
| 828 | - Implementation note, item 6: Why is it likely that the client |
| 829 | uses getPos frequently? And why should the reader count |
| 830 | *untranslated* elements (and how would there be actual elements |
| 831 | before translation)? |
| 832 | (See also comments on STREAM_IO.filePosIn) |
| 833 | |
| 834 | * type writer: |
| 835 | |
| 836 | - writeVec, writeArr, writeVecNB, writeArrNB: |
| 837 | (1) Again, it is not specified what the optional size means. |
| 838 | |
| 839 | (2) When may k < sz occur without having IO failure? If it is |
| 840 | arbitrary, then there appears to be no correct way to write a |
| 841 | sequence of elements, because it is neither possible to detect |
| 842 | partial element writes (which are explained in the paragraph |
| 843 | before the Implementation Notes), nor to complete such writes. |
| 844 | This particularly implies that the StreamIO functor cannot |
| 845 | implement flushing correctly (see below). |
| 846 | |
| 847 | - getPos, setPos, endPos, verifyPos: Description should start with |
| 848 | "when present". |
| 849 | |
| 850 | - getPos, setPos: Should not raise an exception if unimplemented, |
| 851 | but rather be NONE. |
| 852 | |
| 853 | - last paragraph before Implementation Note: Typo, double "plus". |
| 854 | |
| 855 | - first sentence in Implementation Note: (pedantic) Why is this |
| 856 | put into the implementation notes when it actually seems to be a |
| 857 | requirement of the specification? |
| 858 | |
| 859 | - last paragraph of Implementation Note: |
| 860 | (1) States that readers must implement getPos, which seems to be |
| 861 | contradicted by its optional type. |
| 862 | |
| 863 | (2) Typo, double "need". |
| 864 | |
| 865 | * openVector: |
| 866 | |
| 867 | - Is this supposed to support random access? Note that for types |
| 868 | generated with the PrimIO functor it cannot (see below)! That |
| 869 | seems to make this function rather useless. |
| 870 | |
| 871 | * augmentReader, augmentWriter: |
| 872 | |
| 873 | - It is not possible to synthesize operations in a way that is |
| 874 | thread-safe in concurrent systems, hence it should be noted that |
| 875 | augmenting is potentially dangerous. |
| 876 | |
| 877 | * There is no reference to the PrimIO functor. |
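
The k < sz concern under "type writer" can be illustrated with the retry loop a client would naturally write. This is only a sketch under stated assumptions: it uses a slice-typed writeVec (CharVectorSlice.slice -> int) rather than the {buf, i, sz} record discussed above, it assumes the returned count k refers to whole elements and is positive for a non-empty slice, and the names writeAll, sink, and writeAtMost2 are hypothetical. If k could reflect a partial element write, this loop would have no way to detect or complete it, which is the problem raised above.

```sml
(* Keep calling writeVec until the whole slice has been accepted.
   Assumes writeVec returns the number of whole elements written. *)
fun writeAll (writeVec : CharVectorSlice.slice -> int)
             (sl : CharVectorSlice.slice) : unit =
  if CharVectorSlice.length sl = 0 then ()
  else
    let
      val k = writeVec sl
    in
      writeAll writeVec (CharVectorSlice.subslice (sl, k, NONE))
    end

(* Hypothetical writer that accepts at most two elements per call
   and records what it accepted. *)
val sink : string list ref = ref []
fun writeAtMost2 (sl : CharVectorSlice.slice) : int =
  let
    val k = Int.min (2, CharVectorSlice.length sl)
    val piece = CharVectorSlice.vector (CharVectorSlice.subslice (sl, 0, SOME k))
  in
    sink := piece :: !sink;
    k
  end

val () = writeAll writeAtMost2 (CharVectorSlice.full "hello")
val written = String.concat (List.rev (!sink))   (* "hello" *)
```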
| 878 | |
| 879 | |
| 880 | |
| 881 | The PrimIO functor |
| 882 | ------------------ |
| 883 | |
| 884 | * General problems: |
| 885 | |
| 886 | - Since the implementation is necessarily parametric in the pos |
| 887 | type, openVector, nullRd, nullWr cannot create readers that |
| 888 | allow random access, although one would expect that at least for |
| 889 | openVector. |
| 890 | |
| 891 | * Functor argument: |
| 892 | |
| 893 | - Structure names A and V are inconsistent with the StreamIO and |
| 894 | ImperativeIO functors. |
| 895 | |
| 896 | - Type pos has to be an eqtype to match the result signature. |
| 897 | |
| 898 | - Since the extract and copy functions have been removed/changed |
| 899 | from ARRAY and VECTOR signatures, the PrimIO functor now |
| 900 | naturally requires slice structures for efficient |
| 901 | implementation. (Likewise the StreamIO functor) |
| 902 | |
| 903 | * Functor result: |
| 904 | |
| 905 | - Type sharing of the pos type is not specified, though essential |
| 906 | for this functor being useful at all. |
| 907 | |
| 908 | |
| 909 | |
| 910 | |
| 911 | The STREAM_IO signature |
| 912 | ----------------------- |
| 913 | |
| 914 | * Synopsis: |
| 915 | |
| 916 | - An exception likely to be raised by the underlying |
| 917 | reader/writer is Size, which is not mentioned. OTOH, Fail can |
| 918 | only occur in the rare case of user-supplied readers/writers, as |
| 919 | the Basis itself is supposed to never raise it. |
| 920 | |
| 921 | * type out_pos: |
| 922 | |
| 923 | - A note on the meaning of this type would be desirable, since its |
| 924 | canonical representation is (outstream * pos) rather than pos. |
| 925 | (That also may have caused confusion in the discussion of |
| 926 | imperative I/O, see below.) |
| 927 | |
| 928 | * input1: |
| 929 | |
| 930 | - The signature of this function is inconsistent with all other |
| 931 | input functions. It should rather have type |
| 932 | |
| 933 | instream -> elem option * instream |
| 934 | |
| 935 | which in fact appears to be the type assumed in the discussion |
| 936 | example relating input1 to inputN. |
| 937 | |
| 938 | * input: |
| 939 | |
| 940 | - Typo, s/ay/may/ |
| 941 | |
| 942 | * inputN: |
| 943 | |
| 944 | - This function is somewhat underspecified for n=0. In particular, |
| 945 | may it block? Is it required to raise Io if the underlying |
| 946 | reader is closed? |
| 947 | |
| 948 | * input, input1, inputN, inputAll: |
| 949 | |
| 950 | - (pedantic) Descriptions speak of "underlying system calls", |
| 951 | although the reader may not actually depend on system calls. |
| 952 | Preferably speak of "underlying reader" only. |
| 953 | |
| 954 | * closeIn: |
| 955 | |
| 956 | - Likewise, description speaks of "releasing system resources". |
| 957 | This should be replaced by saying that it closes the underlying |
| 958 | reader (which is not even specified as is). |
| 959 | |
| 960 | * closeOut: |
| 961 | |
| 962 | - Does the function attempt to close the stream even if flushing |
| 963 | fails? |
| 964 | |
| 965 | - Why is it possible to close terminated streams? That seems to |
| 966 | allow unfortunate interference with another stream that has been |
| 967 | created from the extracted writer. |
| 968 | |
| 969 | * mkInstream, getReader: |
| 970 | |
| 971 | - The table seems to imply that mkInstream always augments its |
| 972 | reader. This is inappropriate for concurrent environments (see |
| 973 | above). |
| 974 | |
| 975 | - Should getReader return the original or the augmented reader? |
| 976 | |
| 977 | - The table still includes the removed getPosIn and setPosIn |
| 978 | functions. |
| 979 | |
| 980 | * mkOutstream, getWriter: |
| 981 | |
| 982 | - Likewise. |
| 983 | |
| 984 | * filePosIn: |
| 985 | |
| 986 | - There seems to be no way to implement this function for buffered |
| 987 | I/O, because the reader position that corresponds to a |
| 988 | mid-block-element is not available and cannot be calculated in |
| 989 | general. So how is this meant? |
| 990 | |
| 991 | - Typo, s/character/element/ |
| 992 | |
| 993 | * filePosOut: |
| 994 | |
| 995 | - Likewise. |
| 996 | |
| 997 | * getWriter: |
| 998 | |
| 999 | - It is non-obvious what the precise meaning of "terminating" a |
| 1000 | stream is. If this is merely setting a status flag then a |
| 1001 | corresponding note would be helpful. |
| 1002 | |
| 1003 | * getPosOut: |
| 1004 | |
| 1005 | - May this flush the stream (and hence raise Io exceptions)? |
| 1006 | |
| 1007 | * setPosOut: |
| 1008 | |
| 1009 | - This may raise an exception because the position has been |
| 1010 | invalidated after obtaining it (e.g. by file truncation |
| 1011 | performed by another process). |
| 1012 | |
| 1013 | - Typo, s/underlying device/underlying writer/ |
| 1014 | |
| 1015 | * setBufferMode, getBufferMode: |
| 1016 | |
| 1017 | - There is no specification of the semantics of line buffering, in |
| 1018 | particular for non-text streams. |
| 1019 | (See also comments on StreamIO functor) |
| 1020 | |
| 1021 | - It is not specified whether the stream may be flushed when set |
| 1022 | to LINE_BUF mode (may cause Io exception). It seems unreasonable |
| 1023 | to require it not to do so (assuming that line buffering is |
| 1024 | intended to maintain the invariant that the buffer never |
| 1025 | contains line breaks). |
| 1026 | |
| 1027 | - The synopsis of this function uses "ostr", while all others |
| 1028 | use "f" for streams. |
| 1029 | |
| 1030 | * setPosOut, setBufferMode, getWriter: |
| 1031 | |
| 1032 | - Can raise an exception if flushing fails. |
| 1033 | |
| 1034 | * Discussion: |
| 1035 | |
| 1036 | - The statement that closing a stream just causes the |
| 1037 | not-yet-determined part of the stream to be empty should |
| 1038 | probably be generalised to explain what *truncating* a stream |
| 1039 | means (getReader also truncates the stream). |
| 1040 | |
| 1041 | - Example of freshly opened stream: |
| 1042 | s/mkInstream r/mkInstream(r, vector [])/ |
| 1043 | s/size/length/ |
| 1044 | |
| 1045 | - nreads example: |
| 1046 | s/mkInstream r/mkInstream(r, vector [])/ |
| 1047 | s/size/length/ |
| 1048 | |
| 1049 | - input1/inputN relation example: |
| 1050 | (1) Inconsistent with the actual typing of input1 (see above). |
| 1051 | |
| 1052 | (2) Typo, s/inputN f/inputN(f,1)/ |
| 1053 | |
| 1054 | - Unbuffered I/O, 1st example: |
| 1055 | (1) Typos, |
| 1056 | s/mkInstream(reader)/mkInstream(reader, vector [])/ |
| 1057 | s/PrimIO.Rd{chunkSize,...}/(PrimIO.RD{chunkSize,...}, v)/ |
| 1058 | |
| 1059 | (2) More importantly, the actual condition appears to be |
| 1060 | incorrect. It should read: |
| 1061 | (chunkSize > 1 orelse length v = 1) andalso endOfStream f' |
| 1062 | |
| 1063 | - Unbuffered I/O, 2nd example: |
| 1064 | s/mkInstream(reader)/mkInstream(reader, vector [])/ |
| 1065 | s/PrimIO.Rd{chunkSize,...}/(PrimIO.RD{chunkSize,...}, v)/ |
| 1066 | The condition must be corrected as above. |
| 1067 | |
| 1068 | * There is no reference to the StreamIO functor. |
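
The alternative typing suggested for input1 can be sketched in terms of inputN. The function name input1' and the structure abbreviation S are hypothetical, and the example is specialised to TextIO.StreamIO (where vectors are strings) and assumes TextIO.openString for an in-memory stream. It is a sketch of the proposed shape, not the signature's actual input1.

```sml
structure S = TextIO.StreamIO

(* Hypothetical variant of input1 with the proposed type
   S.instream -> S.elem option * S.instream, defined through inputN. *)
fun input1' (f : S.instream) : S.elem option * S.instream =
  let
    val (s, f') = S.inputN (f, 1)
    val c = if String.size s = 0 then NONE else SOME (String.sub (s, 0))
  in
    (c, f')
  end

val f0 = TextIO.getInstream (TextIO.openString "ab")
val (c1, f1) = input1' f0   (* c1 = SOME #"a" *)
val (c2, f2) = input1' f1   (* c2 = SOME #"b" *)
val (c3, _)  = input1' f2   (* end of stream: c3 = NONE *)
```

With this shape the caller always receives a successor stream, as with input and inputN.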
| 1069 | |
| 1070 | |
| 1071 | |
| 1072 | The StreamIO functor |
| 1073 | -------------------- |
| 1074 | |
| 1075 | * General problems: |
| 1076 | |
| 1077 | - It is impossible for this functor to support line buffering, |
| 1078 | since it has no way of knowing which element constitutes a line |
| 1079 | break. This could be solved by changing the someElem functor |
| 1080 | argument to a breakElem argument. |
| 1081 | |
| 1082 | - It is also impossible to utilize reader's endPos for |
| 1083 | pre-allocation, because the functor is parametric in the |
| 1084 | position type. |
| 1085 | |
| 1086 | * Functor argument: |
| 1087 | |
| 1088 | - Since the extract and copy functions have been removed/changed |
| 1089 | from ARRAY and VECTOR signatures, the StreamIO functor now |
| 1090 | naturally requires slice structures for efficient |
| 1091 | implementation. (Likewise the PrimIO functor) |
| 1092 | |
| 1093 | * Functor result: |
| 1094 | |
| 1095 | - Type sharing of the result types is not specified. |
| 1096 | |
| 1097 | * Discussion, paragraph on flushing: |
| 1098 | |
| 1099 | - Most of this discussion rather belongs to the description of |
| 1100 | STREAM_IO. |
| 1101 | |
| 1102 | - Everything said here is not restricted to flushOut, but applies |
| 1103 | to flushing in general. |
| 1104 | |
| 1105 | - Unfortunately, it is left unspecified where flushing may happen |
| 1106 | and, consequently, where respective Io exceptions may occur. |
| 1107 | |
| 1108 | - Write retries as suggested here seem to be impossible to |
| 1109 | implement correctly using the writer interface as specified (see |
| 1110 | comments on PRIM_IO.writer). |
| 1111 | |
| 1112 | - According to the writer description, write operations may never |
| 1113 | return an element count of 0, so the last sentence is |
| 1114 | misleading. |
| 1115 | |
| 1116 | * Discussion, last paragraph: |
| 1117 | |
| 1118 | - Typo, missing ")" |
| 1119 | |
| 1120 | * Implementation note: |
| 1121 | |
| 1122 | - 3rd bullet: typo, s/PrimIO.augmentIn/PrimIO.augmentReader/ |
| 1123 | |
| 1124 | - 5th and 6th bullet: The endPos function cannot be utilized as |
| 1125 | suggested, because the functor is necessarily parametric in the |
| 1126 | position type. |
| 1127 | |
| 1128 | |
| 1129 | |
| 1130 | The IMPERATIVE_IO signature |
| 1131 | --------------------------- |
| 1132 | |
| 1133 | * General comment: |
| 1134 | |
| 1135 | - It is unfortunate that imperative I/O is asymmetric with respect |
| 1136 | to providing (limited) random access on input vs. output streams |
| 1137 | - the former requires going down to the lower-level stream I/O. |
| 1138 | That makes imperative I/O a somewhat incomplete abstraction |
| 1139 | layer. |
| 1140 | |
| 1141 | - Likewise, it would be desirable if there were ways for |
| 1142 | performing full-fledged random access without leaving the |
| 1143 | imperative I/O abstraction layer, at least for streams where it |
| 1144 | is suitable (e.g. BinIO). Despite the statement in the |
| 1145 | discussion this is neither available for input nor for output |
| 1146 | streams (see comments below). |
| 1147 | |
| 1148 | * closeIn: |
| 1149 | |
| 1150 | - Typo, s/S.closeIn/StreamIO.closeIn/ |
| 1151 | |
| 1152 | * flushOut: |
| 1153 | |
| 1154 | - Typo, s/S.flushOut/StreamIO.flushOut/ |
| 1155 | |
| 1156 | * closeOut: |
| 1157 | |
| 1158 | - Typo, s/S.closeOut/StreamIO.closeOut/ |
| 1159 | |
| 1160 | * Discussion: |
| 1161 | |
| 1162 | - Equivalences, last line: s/StreamIO.output/StreamIO.flushOut/ |
| 1163 | |
| 1164 | - Paragraph about random-access on output streams: It says that |
| 1165 | BinIO.StreamIO.out_pos = Position.int. This is not true, we have |
| 1166 | BinPrimIO.pos = Position.int, but that is a completely different |
| 1167 | type. In fact, it is impossible to implement out_pos as |
| 1168 | Position.int. |
| 1169 | |
| 1170 | * There is no reference to the ImperativeIO functor. |
| 1171 | |
| 1172 | |
| 1173 | |
| 1174 | The ImperativeIO functor |
| 1175 | ------------------------ |
| 1176 | |
| 1177 | * Functor argument: |
| 1178 | |
| 1179 | - The Array argument is unnecessary. |
| 1180 | |
| 1181 | * Functor result: |
| 1182 | |
| 1183 | - Type sharing of the result types is not specified. |
| 1184 | |
| 1185 | |
| 1186 | |
| 1187 | The TEXT_STREAM_IO signature |
| 1188 | ---------------------------- |
| 1189 | |
| 1190 | * General comment: |
| 1191 | |
| 1192 | - Why bother separating this signature from STREAM_IO? |
| 1193 | => outputSubstr can easily be generalised to outputSlice |
| 1194 | (for good), |
| 1195 | => if line buffering is part of STREAM_IO, inputLine |
| 1196 | might be as well. |
| 1197 | |
| 1198 | |
| 1199 | |
| 1200 | The TextIO structure |
| 1201 | -------------------- |
| 1202 | |
| 1203 | * General comment: |
| 1204 | |
| 1205 | - Systems providing WideText should also provide a WideTextIO |
| 1206 | structure (they have to provide WideTextPrimIO already, which |
| 1207 | seems inconsistent). |
| 1208 | |
| 1209 | * Interface: |
| 1210 | |
| 1211 | - Duplicated type constraints for StreamIO.reader and |
| 1212 | StreamIO.writer. |
| 1213 | |
| 1214 | |
| 1215 | |
| 1216 | The BinIO structure |
| 1217 | ------------------- |
| 1218 | |
| 1219 | * Interface: |
| 1220 | |
| 1221 | - Type sharing with BinPrimIO is not specified (unlike for |
| 1222 | TextIO), i.e. the following constraints are missing: |
| 1223 | |
| 1224 | where type StreamIO.reader = BinPrimIO.reader |
| 1225 | where type StreamIO.writer = BinPrimIO.writer |
| 1226 | where type StreamIO.pos = BinPrimIO.pos |
| 1227 | |
| 1228 | ****************************************************************************** |
| 1229 | ****************************************************************************** |
| 1230 | ****************************************************************************** |
| 1231 | ****************************************************************************** |
| 1232 | |
| 1233 | Doing host/network byte order conversions on ML side. |
| 1234 | |
| 1235 | Socket.Ctl |
| 1236 | * Semantics of setNBIO, getNREAD, getATMARK are unclear; |
| 1237 | Don't seem to be accessible via {get,set}sockopt; |
| 1238 | Instead, using ioctl. |
| 1239 | |
| 1240 | ****************************************************************************** |
| 1241 | ****************************************************************************** |
| 1242 | |
| 1243 | Posix.FileSys: |
| 1244 | * Within structure S, the type mode is constrained equal to flags, |
| 1245 | but flags is an eqtype. |
| 1246 | |
| 1247 | STREAM_IO.pos |
| 1248 | * "This is the type of positions in the underlying readers and |
| 1249 | writers. In some instantiations of this signature (e.g., |
| 1250 | TextIO.StreamIO), pos is abstract; in others (e.g., BinIO.StreamIO) |
| 1251 | it is Position.int." But, the equality of BinIO.StreamIO.pos and |
| 1252 | Position.int is never specified in any where constraint of BinIO. |
| 1253 | * How can filePosIn be implemented with completely abstract pos? |
| 1254 | |
| 1255 | Not sent to list: |
| 1256 | |
| 1257 | * (In general, probably a good idea to look at the entire top-level |
| 1258 | structure/signature matches and choose a consistent usage of base |
| 1259 | types. For example, Int:>INTEGER would seem to hide the top-level |
| 1260 | int; unless Int is opened afterwards. But, then what about all the |
| 1261 | other structures that reference int? Is top-level int = Int.int or |
| 1262 | is Int.int = top-level int.) |
| 1263 | --> I think I'm biased from looking at the MLton implementation, |
| 1264 | because I'm finding it hard to think about how to really express all |
| 1265 | of the sharing constraints in a way that will be acceptable. This |
| 1266 | might be the wrong way to look at things: the listing of structures |
| 1267 | and signatures with clauses doesn't correspond to a build order, it |
| 1268 | corresponds to the way the environment should look to the program. |
| 1269 | |
| 1270 | Sequences and Slices: |
| 1271 | Why not existsi, alli? |
| 1272 | |
| 1273 | Vector: |
| 1274 | Why no vector: int * 'a -> 'a vector? |
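
For reference, the missing operation is easy to define on top of Vector.tabulate. The name mkVector is hypothetical and is used to avoid shadowing the existing top-level vector function; this is a sketch of the idea, not a proposed Basis definition.

```sml
(* Hypothetical helper: a vector of n copies of x, in the spirit of
   Array.array. Vector.tabulate raises Size if n is negative. *)
fun mkVector (n : int, x : 'a) : 'a vector =
  Vector.tabulate (n, fn _ => x)

val v = mkVector (3, #"a")
val n = Vector.length v        (* 3 *)
val first = Vector.sub (v, 0)  (* #"a" *)
```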
| 1275 | |
| 1276 | |
| 1277 | Resolved: |
| 1278 | |
| 1279 | If one defines VECTOR_SLICE by including a type 'a vector and replacing |
| 1280 | 'a Vector.vector with the local 'a vector, but then binds |
| 1281 | structure Vector: VECTOR |
| 1282 | structure VectorSlice: VECTOR_SLICE where type 'a vector = 'a Vector.vector |
| 1283 | at the top-level, does one violate the basis spec? |
| 1284 | Rationale: it's easiest to implement Vector and VectorSlice |
| 1285 | simultaneously, say with VectorSlice as a substructure of Vector (in |
| 1286 | fact, with all of the Vector operations being dispatched to the |
| 1287 | corresponding VectorSlice ops with full slices), so Vector isn't in |
| 1288 | scope for the VECTOR_SLICE. |
| 1289 | *** No, it's not o.k., because opening VectorSlice will introduce a binding |
| 1290 | for 'a vector; but, if we're lucky, John will accept the proposal. |
| 1291 | |
| 1292 | IEEEReal: |
| 1293 | toString prepends a #"~" even when the class is NAN? |
| 1294 | *** I guess this is o.k.; there is an explicit sign field. |
| 1295 | |
| 1296 | PACK_WORD: |
| 1297 | structure Pack<N>Big :> PACK_WORD (* OPTIONAL *) |
| 1298 | structure Pack<N>Little :> PACK_WORD (* OPTIONAL *) |
| 1299 | but PACK_WORD has |
| 1300 | val subVec : Word8Vector.vector * int -> LargeWord.word |
| 1301 | i.e., reference to LargeWord.word. |
| 1302 | Should it be |
| 1303 | PACK_WORD |
| 1304 | type word |
| 1305 | val subVec : Word8Vector.vector * int -> word |
| 1306 | with |
| 1307 | structure Pack<N>Big :> PACK_WORD with word = Word<N>.word (* OPTIONAL *) |
| 1308 | Should there be PackBig and PackLittle with word = Word.word? |
| 1309 | Should there be PackLargeBig with word = LargeWord.word? |
| 1310 | There aren't many structures that refine on LargeXYZ; most refine on XYZ<N>. |
| 1311 | *** O.k., we always unpack into a LargeWord, which we could then |
| 1312 | Word<N>.fromLargeWord back to the size. I guess this is o.k.; it |
| 1313 | lets an implementation give more Pack<N>Big structures than there |
| 1314 | are Word<N> structures. |
| 1315 | |
| 1316 | MLton specific: |
| 1317 | + why are Int32_gtu and Int32_geu primitive? |
| 1318 | Why not just Word.fromInt and use Word comparisons? |
| 1319 | + Real:>REAL doesn't match basis because it may perform |
| 1320 | arithmetic at extended precision. Should this be mentioned |
| 1321 | in the user guide? |
| 1322 | + QUESTION: proc-env.sml |
| 1323 | + QUESTION: char.sml |
| 1324 | + check uses of {Vector,Array}Slice.slice for replacement by unsafeSlice. |
| 1325 | |
| 1326 | |
| 1327 | ****************************************************************************** |
| 1328 | ****************************************************************************** |
| 1329 | |
| 1330 | UNIX: |
| 1331 | I'm not quite sure how the ('a, 'b) proc type is supposed to work in |
| 1332 | practice; the old Unix structure just used them as |
| 1333 | TextIO.{in,out}streams. My suspicion is that we're supposed to use |
| 1334 | Posix.IO.mk{Bin,Text}{Reader,Writer} functions and then use the type |
| 1335 | system to ensure that if we force a stream to be bin or text, then all |
| 1336 | other uses have to be the same. I also suspect that we're only |
| 1337 | supposed to lift the file_desc up to an instream/outstream once; i.e., |
| 1338 | multiple textInstreamOf calls should continue to return the same |
| 1339 | TextIO.instream. That would seem to suggest we need an 'a option ref |
| 1340 | that can be banged at the first call to a streamOf function, and |
| 1341 | subsequent calls just return the value there. |
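
The 'a option ref idea can be sketched generically. The name once is hypothetical, and the counter stands in for the real work of lifting a file_desc to a stream; this shows the caching pattern only, under the assumption that repeated streamOf calls should return the same stream.

```sml
(* Run mk at most once; later calls return the cached result. *)
fun once (mk : unit -> 'a) : unit -> 'a =
  let
    val cache : 'a option ref = ref NONE
  in
    fn () =>
      case !cache of
        SOME v => v
      | NONE =>
          let val v = mk ()
          in cache := SOME v; v end
  end

(* Stand-in for an expensive, effectful constructor. *)
val calls = ref 0
val get = once (fn () => (calls := !calls + 1; !calls))

val first = get ()
val second = get ()   (* same value as first; mk ran only once *)
```

A Unix proc record could hold one such cached thunk per stream direction, so that textInstreamOf and friends build the stream on first use only.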
| 1342 | |
| 1343 | textInstreamOf pr |
| 1344 | binInstreamOf pr |
| 1345 | return a text or binary instream connected to the standard output |
| 1346 | stream of the process pr. Note the multiple calls to these |
| 1347 | functions on the same proc will result in multiple streams that |
| 1348 | all share the same underlying Unix stream. |
| 1349 | |
| 1350 | textOutstreamOf pr |
| 1351 | binOutstreamOf pr |
| 1352 | return a text or binary outstream connected to the standard input |
| 1353 | stream of the process pr. Note the multiple calls to these |
| 1354 | functions on the same proc will result in multiple streams that |
| 1355 | all share the same underlying Unix stream. |
| 1356 | |
| 1357 | streamsOf pr |
| 1358 | returns a pair of input and output text streams associated with |
| 1359 | pr. This function is equivalent to (textInstream pr, textOutstream |
| 1360 | pr) and is provided for backward compatibility. |