| 1 | @c -*-texinfo-*- |
| 2 | @c This is part of the GNU Guile Reference Manual. |
| 3 | @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009, |
| 4 | @c 2010, 2011, 2013 Free Software Foundation, Inc. |
| 5 | @c See the file guile.texi for copying conditions. |
| 6 | |
| 7 | @node Input and Output |
| 8 | @section Input and Output |
| 9 | |
| 10 | @menu |
| 11 | * Ports:: The idea of the port abstraction. |
| 12 | * Reading:: Procedures for reading from a port. |
| 13 | * Writing:: Procedures for writing to a port. |
| 14 | * Closing:: Procedures to close a port. |
| 15 | * Random Access:: Moving around a random access port. |
| 16 | * Line/Delimited:: Read and write lines or delimited text. |
| 17 | * Block Reading and Writing:: Reading and writing blocks of text. |
| 18 | * Default Ports:: Defaults for input, output and errors. |
| 19 | * Port Types:: Types of port and how to make them. |
| 20 | * R6RS I/O Ports:: The R6RS port API. |
| 21 | * I/O Extensions:: Using and extending ports in C. |
| 22 | * BOM Handling:: Handling of Unicode byte order marks. |
| 23 | @end menu |
| 24 | |
| 25 | |
| 26 | @node Ports |
| 27 | @subsection Ports |
| 28 | @cindex Port |
| 29 | |
| 30 | Sequential input/output in Scheme is represented by operations on a |
| 31 | @dfn{port}. This section explains the operations that Guile provides |
| 32 | for working with ports. |
| 33 | |
| 34 | Ports are created by opening them, for instance with @code{open-file} for a file |
| 35 | (@pxref{File Ports}). Characters can be read from an input port and |
| 36 | written to an output port, or both on an input/output port. A port |
| 37 | can be closed (@pxref{Closing}) when no longer required, after which |
| 38 | any attempt to read or write is an error. |
| 39 | |
| 40 | The formal definition of a port is very generic: an input port is |
| 41 | simply ``an object which can deliver characters on demand,'' and an |
| 42 | output port is ``an object which can accept characters.'' Because |
| 43 | this definition is so loose, it is easy to write functions that |
| 44 | simulate ports in software. @dfn{Soft ports} and @dfn{string ports} |
| 45 | are two interesting and powerful examples of this technique. |
| 46 | (@pxref{Soft Ports}, and @ref{String Ports}.) |
| 47 | |
| 48 | Ports are garbage collected in the usual way (@pxref{Memory |
| 49 | Management}), and will be closed at that time if not already closed. |
| 50 | In this case any errors occurring in the close will not be reported. |
| 51 | Usually a program will want to explicitly close so as to be sure all |
| 52 | its operations have been successful. Of course if a program has |
| 53 | abandoned something due to an error or other condition then closing |
| 54 | problems are probably not of interest. |
| 55 | |
| 56 | It is strongly recommended that file ports be closed explicitly when |
| 57 | no longer required. Most systems have limits on how many files can be |
| 58 | open, both on a per-process and a system-wide basis. A program that |
| 59 | uses many files should take care not to hit those limits. The same |
| 60 | applies to similar system resources such as pipes and sockets. |
| 61 | |
| 62 | Note that automatic garbage collection is triggered only by memory |
| 63 | consumption, not by file or other resource usage, so a program cannot |
| 64 | rely on that to keep it away from system limits. An explicit call to |
| 65 | @code{gc} can of course be relied on to pick up unreferenced ports. |
| 66 | If program flow makes it hard to be certain when to close then this |
| 67 | may be an acceptable way to control resource usage. |
| 68 | |
| 69 | All file access uses the ``LFS'' large file support functions when |
| 70 | available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be |
| 71 | read and written on a 32-bit system. |
| 72 | |
| 73 | Each port has an associated character encoding that controls how bytes |
| 74 | read from the port are converted to characters and strings, and |
| 75 | how characters and strings written to the port are converted to bytes. |
| 76 | When ports are created, they inherit their character encoding from the |
| 77 | current locale, but that can be changed after the port is created. |
| 78 | |
| 79 | Currently, ports only work with @emph{non-modal} encodings. Most |
| 80 | encodings are non-modal, meaning that the conversion of bytes to a |
| 81 | string doesn't depend on its context: the same byte sequence always |
| 82 | decodes to the same string. A couple of modal encodings are in common use, |
| 83 | like ISO-2022-JP and ISO-2022-KR, and they are not yet supported. |
| 84 | |
| 85 | Each port also has an associated conversion strategy: what to do when |
| 86 | a Guile character can't be converted to the port's encoded character |
| 87 | representation for output. There are three possible strategies: to |
| 88 | raise an error, to replace the character with a hex escape, or to |
| 89 | replace the character with a substitute character. |
| 90 | |
| 91 | @rnindex input-port? |
| 92 | @deffn {Scheme Procedure} input-port? x |
| 93 | @deffnx {C Function} scm_input_port_p (x) |
| 94 | Return @code{#t} if @var{x} is an input port, otherwise return |
| 95 | @code{#f}. Any object satisfying this predicate also satisfies |
| 96 | @code{port?}. |
| 97 | @end deffn |
| 98 | |
| 99 | @rnindex output-port? |
| 100 | @deffn {Scheme Procedure} output-port? x |
| 101 | @deffnx {C Function} scm_output_port_p (x) |
| 102 | Return @code{#t} if @var{x} is an output port, otherwise return |
| 103 | @code{#f}. Any object satisfying this predicate also satisfies |
| 104 | @code{port?}. |
| 105 | @end deffn |
| 106 | |
| 107 | @deffn {Scheme Procedure} port? x |
| 108 | @deffnx {C Function} scm_port_p (x) |
| 109 | Return a boolean indicating whether @var{x} is a port. |
| 110 | Equivalent to @code{(or (input-port? @var{x}) (output-port? |
| 111 | @var{x}))}. |
| 112 | @end deffn |
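| | |
| | As a brief illustration of the three predicates above, the following |
| | sketch checks a string port. @code{open-input-string} (@pxref{String |
| | Ports}) is used here only so that the example needs no file; any other |
| | way of obtaining a port would serve equally well. |
| | |
| | @lisp |
| | ;; A string port opened for input is an input port, hence also a port. |
| | (define p (open-input-string "hello")) |
| | (input-port? p)    ; => #t |
| | (output-port? p)   ; => #f |
| | (port? p)          ; => #t |
| | (port? "hello")    ; => #f, a string is not a port |
| | @end lisp |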
| 113 | |
| 114 | @deffn {Scheme Procedure} set-port-encoding! port enc |
| 115 | @deffnx {C Function} scm_set_port_encoding_x (port, enc) |
| 116 | Sets the character encoding that will be used to interpret all port I/O. |
| 117 | @var{enc} is a string containing the name of an encoding. Valid |
| 118 | encoding names are those |
| 119 | @url{http://www.iana.org/assignments/character-sets, defined by IANA}. |
| 120 | @end deffn |
| 121 | |
| 122 | @defvr {Scheme Variable} %default-port-encoding |
| 123 | A fluid containing @code{#f} or the name of the encoding to |
| 124 | be used by default for newly created ports (@pxref{Fluids and Dynamic |
| 125 | States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}. |
| 126 | |
| 127 | New ports are created with the encoding appropriate for the current |
| 128 | locale if @code{setlocale} has been called, or with the value specified by |
| 129 | this fluid otherwise. |
| 130 | @end defvr |
| 131 | |
| 132 | @deffn {Scheme Procedure} port-encoding port |
| 133 | @deffnx {C Function} scm_port_encoding (port) |
| 134 | Returns, as a string, the character encoding that @var{port} uses to interpret |
| 135 | its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}. |
| 136 | @end deffn |
| 137 | |
| 138 | @deffn {Scheme Procedure} set-port-conversion-strategy! port sym |
| 139 | @deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym) |
| 140 | Sets the behavior of the interpreter when outputting a character that |
| 141 | is not representable in the port's current encoding. @var{sym} can be |
| 142 | one of @code{'error}, @code{'substitute}, or @code{'escape}. If it is |
| 143 | @code{'error}, an error will be thrown when a nonconvertible character |
| 144 | is encountered. If it is @code{'substitute}, then nonconvertible |
| 145 | characters will be replaced with approximate characters, or with |
| 146 | question marks if no approximately correct character is available. If |
| 147 | it is @code{'escape}, the character will appear as a hex escape when output. |
| 148 | |
| 149 | If @var{port} is an open port, the conversion error behavior |
| 150 | is set for that port. If it is @code{#f}, it is set as the |
| 151 | default behavior for any future ports that get created in |
| 152 | this thread. |
| 153 | @end deffn |
| 154 | |
| 155 | @deffn {Scheme Procedure} port-conversion-strategy port |
| 156 | @deffnx {C Function} scm_port_conversion_strategy (port) |
| 157 | Returns the behavior of the port when outputting a character that is |
| 158 | not representable in the port's current encoding. It returns the |
| 159 | symbol @code{error} if unrepresentable characters should cause |
| 160 | exceptions, @code{substitute} if the port should try to replace |
| 161 | unrepresentable characters with question marks or approximate |
| 162 | characters, or @code{escape} if unrepresentable characters should be |
| 163 | converted to string escapes. |
| 164 | |
| 165 | If @var{port} is @code{#f}, then the current default behavior will be |
| 166 | returned. New ports will have this default behavior when they are |
| 167 | created. |
| 168 | @end deffn |
| 169 | |
| 170 | @deffn {Scheme Variable} %default-port-conversion-strategy |
| 171 | The fluid that defines the conversion strategy for newly created ports, |
| 172 | and for other conversion routines such as @code{scm_to_stringn}, |
| 173 | @code{scm_from_stringn}, @code{string->pointer}, and |
| 174 | @code{pointer->string}. |
| 175 | |
| 176 | Its value must be one of the symbols described above, with the same |
| 177 | semantics: @code{'error}, @code{'substitute}, or @code{'escape}. |
| 178 | |
| 179 | When Guile starts, its value is @code{'substitute}. |
| 180 | |
| 181 | Note that @code{(set-port-conversion-strategy! #f @var{sym})} is |
| 182 | equivalent to @code{(fluid-set! %default-port-conversion-strategy |
| 183 | @var{sym})}. |
| 184 | @end deffn |
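| | |
| | The encoding and conversion-strategy procedures above might be combined |
| | as in the following sketch. The string port and the particular choices |
| | of @code{"ISO-8859-1"}, @code{'escape} and @code{'error} are assumptions |
| | made only for illustration. |
| | |
| | @lisp |
| | ;; Assumption: a string port is used only so the sketch needs no file. |
| | (define p (open-output-string)) |
| | |
| | ;; Choose the encoding and the conversion strategy for this port only. |
| | (set-port-encoding! p "ISO-8859-1") |
| | (set-port-conversion-strategy! p 'escape) |
| | (port-conversion-strategy p)        ; => escape |
| | |
| | ;; Temporarily change the default used for newly created ports. |
| | (with-fluids ((%default-port-conversion-strategy 'error)) |
| |   (port-conversion-strategy #f))    ; => error |
| | @end lisp |
| | |
| | Binding the fluid with @code{with-fluids} restores the previous default |
| | on exit, which is usually preferable to changing it globally. |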
| 185 | |
| 186 | |
| 187 | @node Reading |
| 188 | @subsection Reading |
| 189 | @cindex Reading |
| 190 | |
| 191 | [Generic procedures for reading from ports.] |
| 192 | |
| 193 | These procedures pertain to reading characters and strings from |
| 194 | ports. To read general S-expressions from ports, see @ref{Scheme Read}. |
| 195 | |
| 196 | @rnindex eof-object? |
| 197 | @cindex End of file object |
| 198 | @deffn {Scheme Procedure} eof-object? x |
| 199 | @deffnx {C Function} scm_eof_object_p (x) |
| 200 | Return @code{#t} if @var{x} is an end-of-file object; otherwise |
| 201 | return @code{#f}. |
| 202 | @end deffn |
| 203 | |
| 204 | @rnindex char-ready? |
| 205 | @deffn {Scheme Procedure} char-ready? [port] |
| 206 | @deffnx {C Function} scm_char_ready_p (port) |
| 207 | Return @code{#t} if a character is ready on input @var{port} |
| 208 | and return @code{#f} otherwise. If @code{char-ready?} returns |
| 209 | @code{#t} then the next @code{read-char} operation on |
| 210 | @var{port} is guaranteed not to hang. If @var{port} is a file |
| 211 | port at end of file then @code{char-ready?} returns @code{#t}. |
| 212 | |
| 213 | @code{char-ready?} exists to make it possible for a |
| 214 | program to accept characters from interactive ports without |
| 215 | getting stuck waiting for input. Any input editors associated |
| 216 | with such ports must make sure that characters whose existence |
| 217 | has been asserted by @code{char-ready?} cannot be rubbed out. |
| 218 | If @code{char-ready?} were to return @code{#f} at end of file, |
| 219 | a port at end of file would be indistinguishable from an |
| 220 | interactive port that has no ready characters. |
| 221 | @end deffn |
| 222 | |
| 223 | @rnindex read-char |
| 224 | @deffn {Scheme Procedure} read-char [port] |
| 225 | @deffnx {C Function} scm_read_char (port) |
| 226 | Return the next character available from @var{port}, updating |
| 227 | @var{port} to point to the following character. If no more |
| 228 | characters are available, the end-of-file object is returned. |
| 229 | |
| 230 | When @var{port}'s data cannot be decoded according to its |
| 231 | character encoding, a @code{decoding-error} is raised and |
| 232 | @var{port} points past the erroneous byte sequence. |
| 233 | @end deffn |
| 234 | |
| 235 | @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size) |
| 236 | Read up to @var{size} bytes from @var{port} and store them in |
| 237 | @var{buffer}. The return value is the number of bytes actually read, |
| 238 | which can be less than @var{size} if end-of-file has been reached. |
| 239 | |
| 240 | Note that this function does not update @code{port-line} and |
| 241 | @code{port-column} below. |
| 242 | @end deftypefn |
| 243 | |
| 244 | @rnindex peek-char |
| 245 | @deffn {Scheme Procedure} peek-char [port] |
| 246 | @deffnx {C Function} scm_peek_char (port) |
| 247 | Return the next character available from @var{port}, |
| 248 | @emph{without} updating @var{port} to point to the following |
| 249 | character. If no more characters are available, the |
| 250 | end-of-file object is returned. |
| 251 | |
| 252 | The value returned by |
| 253 | a call to @code{peek-char} is the same as the value that would |
| 254 | have been returned by a call to @code{read-char} on the same |
| 255 | port. The only difference is that the very next call to |
| 256 | @code{read-char} or @code{peek-char} on that @var{port} will |
| 257 | return the value returned by the preceding call to |
| 258 | @code{peek-char}. In particular, a call to @code{peek-char} on |
| 259 | an interactive port will hang waiting for input whenever a call |
| 260 | to @code{read-char} would have hung. |
| 261 | |
| 262 | As with @code{read-char}, a @code{decoding-error} may be raised |
| 263 | if the data cannot be decoded. However, unlike with @code{read-char}, |
| 264 | @var{port} still points at the beginning of the erroneous byte |
| 265 | sequence when the error is raised. |
| 266 | @end deffn |
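| | |
| | The difference between @code{peek-char} and @code{read-char} can be seen |
| | in a short sketch. The string port is an assumption chosen to keep the |
| | example self-contained. |
| | |
| | @lisp |
| | (define p (open-input-string "ab")) |
| | (peek-char p)                 ; => #\a, the port position is unchanged |
| | (read-char p)                 ; => #\a |
| | (read-char p)                 ; => #\b |
| | (eof-object? (peek-char p))   ; => #t |
| | (eof-object? (read-char p))   ; => #t |
| | @end lisp |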
| 267 | |
| 268 | @deffn {Scheme Procedure} unread-char cobj [port] |
| 269 | @deffnx {C Function} scm_unread_char (cobj, port) |
| 270 | Place character @var{cobj} in @var{port} so that it will be read by the |
| 271 | next read operation. If called multiple times, the unread characters |
| 272 | will be read again in last-in first-out order. If @var{port} is |
| 273 | not supplied, the current input port is used. |
| 274 | @end deffn |
| 275 | |
| 276 | @deffn {Scheme Procedure} unread-string str port |
| 277 | @deffnx {C Function} scm_unread_string (str, port) |
| 278 | Place the string @var{str} in @var{port} so that its characters will |
| 279 | be read from left-to-right as the next characters from @var{port} |
| 280 | during subsequent read operations. If called multiple times, the |
| 281 | unread characters will be read again in last-in first-out order. If |
| 282 | @var{port} is not supplied, the @code{current-input-port} is used. |
| 283 | @end deffn |
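| | |
| | The last-in first-out behavior of the two procedures above is sketched |
| | below, again on a string port chosen only for convenience. |
| | |
| | @lisp |
| | (define p (open-input-string "c")) |
| | (unread-char #\b p) |
| | (unread-char #\a p)       ; unread last, so it is read first |
| | (read-char p)             ; => #\a |
| | (read-char p)             ; => #\b |
| | (unread-string "xy" p)    ; characters come back left to right |
| | (read-char p)             ; => #\x |
| | (read-char p)             ; => #\y |
| | (read-char p)             ; => #\c |
| | @end lisp |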
| 284 | |
| 285 | @deffn {Scheme Procedure} drain-input port |
| 286 | @deffnx {C Function} scm_drain_input (port) |
| 287 | This procedure clears a port's input buffers, similar |
| 288 | to the way that @code{force-output} clears the output buffer. The |
| 289 | contents of the buffers are returned as a single string, e.g., |
| 290 | |
| 291 | @lisp |
| 292 | (define p (open-input-file ...)) |
| 293 | (drain-input p)   ; => empty string, nothing buffered yet. |
| 294 | (unread-char (read-char p) p) |
| 295 | (drain-input p)   ; => initial chars from p, up to the buffer size. |
| 296 | @end lisp |
| 297 | |
| 298 | Draining the buffers may be useful for cleanly finishing |
| 299 | buffered I/O so that the file descriptor can be used directly |
| 300 | for further input. |
| 301 | @end deffn |
| 302 | |
| 303 | @deffn {Scheme Procedure} port-column port |
| 304 | @deffnx {Scheme Procedure} port-line port |
| 305 | @deffnx {C Function} scm_port_column (port) |
| 306 | @deffnx {C Function} scm_port_line (port) |
| 307 | Return the current column number or line number of @var{port}. |
| 308 | If the number is |
| 309 | unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer; |
| 310 | i.e.@: the first character of the first line is line 0, column 0. |
| 311 | (However, when you display a file position, for example in an error |
| 312 | message, we recommend you add 1 to get 1-origin integers. This is |
| 313 | because lines and column numbers traditionally start with 1, and that is |
| 314 | what non-programmers will find most natural.) |
| 315 | @end deffn |
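| | |
| | The 0-origin counting and the recommended 1-origin display can be |
| | sketched as follows. The helper @code{position-string} is hypothetical |
| | and is not provided by Guile. |
| | |
| | @lisp |
| | (define p (open-input-string "ab\ncd")) |
| | (port-line p)      ; => 0 |
| | (port-column p)    ; => 0 |
| | ;; Consume "ab" and the newline. |
| | (read-char p) (read-char p) (read-char p) |
| | (port-line p)      ; => 1 |
| | (port-column p)    ; => 0 |
| | |
| | ;; Hypothetical helper: format a 1-origin position for messages. |
| | (define (position-string port) |
| |   (simple-format #f "~A:~A" |
| |                  (+ 1 (port-line port)) |
| |                  (+ 1 (port-column port)))) |
| | (position-string p)   ; => "2:1" |
| | @end lisp |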
| 316 | |
| 317 | @deffn {Scheme Procedure} set-port-column! port column |
| 318 | @deffnx {Scheme Procedure} set-port-line! port line |
| 319 | @deffnx {C Function} scm_set_port_column_x (port, column) |
| 320 | @deffnx {C Function} scm_set_port_line_x (port, line) |
| 321 | Set the current column or line number of @var{port}. |
| 322 | @end deffn |
| 323 | |
| 324 | @node Writing |
| 325 | @subsection Writing |
| 326 | @cindex Writing |
| 327 | |
| 328 | [Generic procedures for writing to ports.] |
| 329 | |
| 330 | These procedures are for writing characters and strings to |
| 331 | ports. For more information on writing arbitrary Scheme objects to |
| 332 | ports, see @ref{Scheme Write}. |
| 333 | |
| 334 | @deffn {Scheme Procedure} get-print-state port |
| 335 | @deffnx {C Function} scm_get_print_state (port) |
| 336 | Return the print state of the port @var{port}. If @var{port} |
| 337 | has no associated print state, @code{#f} is returned. |
| 338 | @end deffn |
| 339 | |
| 340 | @rnindex newline |
| 341 | @deffn {Scheme Procedure} newline [port] |
| 342 | @deffnx {C Function} scm_newline (port) |
| 343 | Send a newline to @var{port}. |
| 344 | If @var{port} is omitted, send to the current output port. |
| 345 | @end deffn |
| 346 | |
| 347 | @deffn {Scheme Procedure} port-with-print-state port [pstate] |
| 348 | @deffnx {C Function} scm_port_with_print_state (port, pstate) |
| 349 | Create a new port which behaves like @var{port}, but with an |
| 350 | included print state @var{pstate}. @var{pstate} is optional. |
| 351 | If @var{pstate} isn't supplied and @var{port} already has |
| 352 | a print state, the old print state is reused. |
| 353 | @end deffn |
| 354 | |
| 355 | @deffn {Scheme Procedure} simple-format destination message . args |
| 356 | @deffnx {C Function} scm_simple_format (destination, message, args) |
| 357 | Write @var{message} to @var{destination}, defaulting to |
| 358 | the current output port. |
| 359 | @var{message} can contain @code{~A} (was @code{%s}) and |
| 360 | @code{~S} (was @code{%S}) escapes. When printed, |
| 361 | the escapes are replaced with corresponding members of |
| 362 | @var{args}: |
| 363 | @code{~A} formats using @code{display} and @code{~S} formats |
| 364 | using @code{write}. |
| 365 | If @var{destination} is @code{#t}, then use the current output |
| 366 | port; if @var{destination} is @code{#f}, then return a string |
| 367 | containing the formatted text. Does not add a trailing newline. |
| 368 | @end deffn |
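| | |
| | A minimal sketch of the two escapes and the two special destinations |
| | described above; the message strings are made up for illustration. |
| | |
| | @lisp |
| | ;; ~A formats like display, ~S formats like write. |
| | (simple-format #f "~A vs. ~S" "text" "text") |
| | ;; => "text vs. \"text\"" |
| | |
| | ;; With #t the text goes to the current output port instead. |
| | (simple-format #t "count: ~A\n" 42)   ; prints "count: 42" and a newline |
| | @end lisp |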
| 369 | |
| 370 | @rnindex write-char |
| 371 | @deffn {Scheme Procedure} write-char chr [port] |
| 372 | @deffnx {C Function} scm_write_char (chr, port) |
| 373 | Send character @var{chr} to @var{port}. |
| 374 | @end deffn |
| 375 | |
| 376 | @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size) |
| 377 | Write @var{size} bytes at @var{buffer} to @var{port}. |
| 378 | |
| 379 | Note that this function does not update @code{port-line} and |
| 380 | @code{port-column} (@pxref{Reading}). |
| 381 | @end deftypefn |
| 382 | |
| 383 | @findex fflush |
| 384 | @deffn {Scheme Procedure} force-output [port] |
| 385 | @deffnx {C Function} scm_force_output (port) |
| 386 | Flush the specified output port, or the current output port if @var{port} |
| 387 | is omitted. The current output buffer contents are passed to the |
| 388 | underlying port implementation (e.g., in the case of fports, the |
| 389 | data will be written to the file and the output buffer will be cleared). |
| 390 | It has no effect on an unbuffered port. |
| 391 | |
| 392 | The return value is unspecified. |
| 393 | @end deffn |
| 394 | |
| 395 | @deffn {Scheme Procedure} flush-all-ports |
| 396 | @deffnx {C Function} scm_flush_all_ports () |
| 397 | Equivalent to calling @code{force-output} on |
| 398 | all open output ports. The return value is unspecified. |
| 399 | @end deffn |
| 400 | |
| 401 | |
| 402 | @node Closing |
| 403 | @subsection Closing |
| 404 | @cindex Closing ports |
| 405 | @cindex Port, close |
| 406 | |
| 407 | @deffn {Scheme Procedure} close-port port |
| 408 | @deffnx {C Function} scm_close_port (port) |
| 409 | Close the specified port object. Return @code{#t} if it |
| 410 | successfully closes a port or @code{#f} if it was already |
| 411 | closed. An exception may be raised if an error occurs, for |
| 412 | example when flushing buffered output. See also @ref{Ports and |
| 413 | File Descriptors, close}, for a procedure which can close file |
| 414 | descriptors. |
| 415 | @end deffn |
| 416 | |
| 417 | @deffn {Scheme Procedure} close-input-port port |
| 418 | @deffnx {Scheme Procedure} close-output-port port |
| 419 | @deffnx {C Function} scm_close_input_port (port) |
| 420 | @deffnx {C Function} scm_close_output_port (port) |
| 421 | @rnindex close-input-port |
| 422 | @rnindex close-output-port |
| 423 | Close the specified input or output @var{port}. An exception may be |
| 424 | raised if an error occurs while closing. If @var{port} is already |
| 425 | closed, nothing is done. The return value is unspecified. |
| 426 | |
| 427 | See also @ref{Ports and File Descriptors, close}, for a procedure |
| 428 | which can close file descriptors. |
| 429 | @end deffn |
| 430 | |
| 431 | @deffn {Scheme Procedure} port-closed? port |
| 432 | @deffnx {C Function} scm_port_closed_p (port) |
| 433 | Return @code{#t} if @var{port} is closed or @code{#f} if it is |
| 434 | open. |
| 435 | @end deffn |
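| | |
| | The closing procedures can be exercised as in this sketch, which uses a |
| | string port only to avoid depending on a file. |
| | |
| | @lisp |
| | (define p (open-input-string "data")) |
| | (port-closed? p)   ; => #f |
| | (close-port p) |
| | (port-closed? p)   ; => #t |
| | (close-port p)     ; already closed: returns #f, as described above |
| | @end lisp |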
| 436 | |
| 437 | |
| 438 | @node Random Access |
| 439 | @subsection Random Access |
| 440 | @cindex Random access, ports |
| 441 | @cindex Port, random access |
| 442 | |
| 443 | @deffn {Scheme Procedure} seek fd_port offset whence |
| 444 | @deffnx {C Function} scm_seek (fd_port, offset, whence) |
| 445 | Sets the current position of @var{fd_port} to the integer |
| 446 | @var{offset}, which is interpreted according to the value of |
| 447 | @var{whence}. |
| 448 | |
| 449 | One of the following variables should be supplied for |
| 450 | @var{whence}: |
| 451 | @defvar SEEK_SET |
| 452 | Seek from the beginning of the file. |
| 453 | @end defvar |
| 454 | @defvar SEEK_CUR |
| 455 | Seek from the current position. |
| 456 | @end defvar |
| 457 | @defvar SEEK_END |
| 458 | Seek from the end of the file. |
| 459 | @end defvar |
| 460 | If @var{fd_port} is a file descriptor, the underlying system |
| 461 | call is @code{lseek}. @var{fd_port} may be a string port. |
| 462 | |
| 463 | The value returned is the new position in the file. This means |
| 464 | that the current position of a port can be obtained using: |
| 465 | @lisp |
| 466 | (seek port 0 SEEK_CUR) |
| 467 | @end lisp |
| 468 | @end deffn |
| 469 | |
| 470 | @deffn {Scheme Procedure} ftell fd_port |
| 471 | @deffnx {C Function} scm_ftell (fd_port) |
| 472 | Return an integer representing the current position of |
| 473 | @var{fd_port}, measured from the beginning. Equivalent to: |
| 474 | |
| 475 | @lisp |
| 476 | (seek port 0 SEEK_CUR) |
| 477 | @end lisp |
| 478 | @end deffn |
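| | |
| | A small sketch of @code{seek} and @code{ftell} on a string port follows. |
| | It assumes ASCII-only contents, so that each character occupies one |
| | position. |
| | |
| | @lisp |
| | (define p (open-input-string "hello world")) |
| | (seek p 6 SEEK_SET)    ; => 6 |
| | (read-char p)          ; => #\w |
| | (ftell p)              ; => 7 |
| | (seek p -1 SEEK_END)   ; => 10 |
| | (read-char p)          ; => #\d |
| | @end lisp |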
| 479 | |
| 480 | @findex truncate |
| 481 | @findex ftruncate |
| 482 | @deffn {Scheme Procedure} truncate-file file [length] |
| 483 | @deffnx {C Function} scm_truncate_file (file, length) |
| 484 | Truncate @var{file} to @var{length} bytes. @var{file} can be a |
| 485 | filename string, a port object, or an integer file descriptor. The |
| 486 | return value is unspecified. |
| 487 | |
| 488 | For a port or file descriptor @var{length} can be omitted, in which |
| 489 | case the file is truncated at the current position (per @code{ftell} |
| 490 | above). |
| 491 | |
| 492 | On most systems a file can be extended by giving a length greater than |
| 493 | the current size, but this is not mandatory in the POSIX standard. |
| 494 | @end deffn |
| 495 | |
| 496 | @node Line/Delimited |
| 497 | @subsection Line Oriented and Delimited Text |
| 498 | @cindex Line input/output |
| 499 | @cindex Port, line input/output |
| 500 | |
| 501 | The delimited-I/O module can be accessed with: |
| 502 | |
| 503 | @lisp |
| 504 | (use-modules (ice-9 rdelim)) |
| 505 | @end lisp |
| 506 | |
| 507 | It can be used to read or write lines of text, or read text delimited by |
| 508 | a specified set of characters. It's similar to the @code{(scsh rdelim)} |
| 509 | module from guile-scsh, but does not use multiple values or character |
| 510 | sets and has an extra procedure @code{write-line}. |
| 511 | |
| 512 | @c begin (scm-doc-string "rdelim.scm" "read-line") |
| 513 | @deffn {Scheme Procedure} read-line [port] [handle-delim] |
| 514 | Return a line of text from @var{port} if specified, otherwise from the |
| 515 | value returned by @code{(current-input-port)}. Under Unix, a line of text |
| 516 | is terminated by the first end-of-line character or by end-of-file. |
| 517 | |
| 518 | If @var{handle-delim} is specified, it should be one of the following |
| 519 | symbols: |
| 520 | @table @code |
| 521 | @item trim |
| 522 | Discard the terminating delimiter. This is the default, but it will |
| 523 | be impossible to tell whether the read terminated with a delimiter or |
| 524 | end-of-file. |
| 525 | @item concat |
| 526 | Append the terminating delimiter (if any) to the returned string. |
| 527 | @item peek |
| 528 | Push the terminating delimiter (if any) back on to the port. |
| 529 | @item split |
| 530 | Return a pair containing the string read from the port and the |
| 531 | terminating delimiter or end-of-file object. |
| 532 | @end table |
| 533 | |
| 534 | Like @code{read-char}, this procedure can throw to @code{decoding-error} |
| 535 | (@pxref{Reading, @code{read-char}}). |
| 536 | @end deffn |
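| | |
| | The effect of the @var{handle-delim} symbols can be sketched as follows, |
| | using a string port whose contents are made up for illustration. |
| | |
| | @lisp |
| | (use-modules (ice-9 rdelim)) |
| | |
| | (define p (open-input-string "one\ntwo\nthree")) |
| | (read-line p)                 ; => "one", delimiter discarded (trim) |
| | (read-line p 'concat)         ; => "two\n" |
| | (read-line p 'split)          ; => ("three" . #<eof>) |
| | (eof-object? (read-line p))   ; => #t |
| | @end lisp |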
| 537 | |
| 538 | @c begin (scm-doc-string "rdelim.scm" "read-line!") |
| 539 | @deffn {Scheme Procedure} read-line! buf [port] |
| 540 | Read a line of text into the supplied string @var{buf} and return the |
| 541 | number of characters added to @var{buf}. If @var{buf} is filled, then |
| 542 | @code{#f} is returned. |
| 543 | Read from @var{port} if |
| 544 | specified, otherwise from the value returned by @code{(current-input-port)}. |
| 545 | @end deffn |
| 546 | |
| 547 | @c begin (scm-doc-string "rdelim.scm" "read-delimited") |
| 548 | @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim] |
| 549 | Read text until one of the characters in the string @var{delims} is found |
| 550 | or end-of-file is reached. Read from @var{port} if supplied, otherwise |
| 551 | from the value returned by @code{(current-input-port)}. |
| 552 | @var{handle-delim} takes the same values as described for @code{read-line}. |
| 553 | @end deffn |
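| | |
| | As a sketch, @code{read-delimited} can split a simple record on chosen |
| | delimiter characters. The input string below is an invented example. |
| | |
| | @lisp |
| | (use-modules (ice-9 rdelim)) |
| | |
| | (define p (open-input-string "key=value;rest")) |
| | (read-delimited "=" p)          ; => "key" |
| | (read-delimited ";" p 'split)   ; => ("value" . #\;) |
| | (read-delimited ";" p)          ; => "rest", stopped by end-of-file |
| | @end lisp |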
| 554 | |
| 555 | @c begin (scm-doc-string "rdelim.scm" "read-delimited!") |
| 556 | @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end] |
| 557 | Read text into the supplied string @var{buf}. |
| 558 | |
| 559 | If a delimiter was found, return the number of characters written, |
| 560 | except if @var{handle-delim} is @code{split}, in which case the return |
| 561 | value is a pair, as noted above. |
| 562 | |
| 563 | As a special case, if @var{port} was already at end-of-stream, the EOF |
| 564 | object is returned. Also, if no characters were written because the |
| 565 | buffer was full, @code{#f} is returned. |
| 566 | |
| 567 | It's something of a wacky interface, to be honest. |
| 568 | @end deffn |
| 569 | |
| 570 | @deffn {Scheme Procedure} write-line obj [port] |
| 571 | @deffnx {C Function} scm_write_line (obj, port) |
| 572 | Display @var{obj} and a newline character to @var{port}. If |
| 573 | @var{port} is not specified, @code{(current-output-port)} is |
| 574 | used. This function is equivalent to: |
| 575 | @lisp |
| 576 | (display obj [port]) |
| 577 | (newline [port]) |
| 578 | @end lisp |
| 579 | @end deffn |
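| | |
| | For instance, the following sketch collects the output of two |
| | @code{write-line} calls in a string; @code{call-with-output-string} |
| | (@pxref{String Ports}) is used only to make the result visible. |
| | |
| | @lisp |
| | (use-modules (ice-9 rdelim)) |
| | |
| | (call-with-output-string |
| |   (lambda (port) |
| |     (write-line "first" port) |
| |     (write-line 2 port)))       ; => "first\n2\n" |
| | @end lisp |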
| 580 | |
| 581 | In the past, Guile did not have a procedure that would just read out all |
| 582 | of the characters from a port. As a workaround, many people just called |
| 583 | @code{read-delimited} with no delimiters, knowing that would produce the |
| 584 | behavior they wanted. This prompted Guile developers to add some |
| 585 | routines that would read all characters from a port. So it is that |
| 586 | @code{(ice-9 rdelim)} is also the home for procedures that can read |
| 587 | undelimited text: |
| 588 | |
| 589 | @deffn {Scheme Procedure} read-string [port] [count] |
| 590 | Read all of the characters out of @var{port} and return them as a |
| 591 | string. If the @var{count} is present, treat it as a limit to the |
| 592 | number of characters to read. |
| 593 | |
| 594 | By default, read from the current input port, with no size limit on the |
| 595 | result. This procedure always returns a string, even if no characters |
| 596 | were read. |
| 597 | @end deffn |
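| | |
| | A brief sketch of @code{read-string} with and without a @var{count}, |
| | on a string port chosen for illustration: |
| | |
| | @lisp |
| | (use-modules (ice-9 rdelim)) |
| | |
| | (define p (open-input-string "abcdef")) |
| | (read-string p 2)   ; => "ab" |
| | (read-string p)     ; => "cdef" |
| | (read-string p)     ; => "", still a string at end of file |
| | @end lisp |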
| 598 | |
| 599 | @deffn {Scheme Procedure} read-string! buf [port] [start] [end] |
| 600 | Fill @var{buf} with characters read from @var{port}, defaulting to the |
| 601 | current input port. Return the number of characters read. |
| 602 | |
| 603 | If @var{start} or @var{end} are specified, store data only into the |
| 604 | substring of @var{buf} bounded by @var{start} and @var{end} (which |
| 605 | default to the beginning and end of the string, respectively). |
| 606 | @end deffn |
| 607 | |
| 608 | Some of the aforementioned I/O functions rely on the following C |
| 609 | primitives. These will mainly be of interest to people hacking Guile |
| 610 | internals. |
| 611 | |
| 612 | @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]] |
| 613 | @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end) |
| 614 | Read characters from @var{port} into @var{str} until one of the |
| 615 | characters in the @var{delims} string is encountered. If |
| 616 | @var{gobble} is true, discard the delimiter character; |
| 617 | otherwise, leave it in the input stream for the next read. If |
| 618 | @var{port} is not specified, use the value of |
| 619 | @code{(current-input-port)}. If @var{start} or @var{end} are |
| 620 | specified, store data only into the substring of @var{str} |
| 621 | bounded by @var{start} and @var{end} (which default to the |
| 622 | beginning and end of the string, respectively). |
| 623 | |
| 624 | Return a pair consisting of the delimiter that terminated the |
| 625 | string and the number of characters read. If reading stopped |
| 626 | at the end of file, the delimiter returned is the |
| 627 | end-of-file object; if the string was filled without encountering |
| 628 | a delimiter, this value is @code{#f}. |
| 629 | @end deffn |
| 630 | |
| 631 | @deffn {Scheme Procedure} %read-line [port] |
| 632 | @deffnx {C Function} scm_read_line (port) |
| 633 | Read a newline-terminated line from @var{port}, allocating storage as |
| 634 | necessary. The newline terminator (if any) is removed from the string, |
| 635 | and a pair consisting of the line and its delimiter is returned. The |
| 636 | delimiter may be either a newline or the end-of-file object; if |
| 637 | @code{%read-line} is called at the end of file, it returns the pair |
| 638 | @code{(#<eof> . #<eof>)}. |
| 639 | @end deffn |
| 640 | |
| 641 | @node Block Reading and Writing |
| 642 | @subsection Block reading and writing |
| 643 | @cindex Block read/write |
| 644 | @cindex Port, block read/write |
| 645 | |
| 646 | The Block-string-I/O module can be accessed with: |
| 647 | |
| 648 | @lisp |
| 649 | (use-modules (ice-9 rw)) |
| 650 | @end lisp |
| 651 | |
| 652 | It currently contains procedures that help to implement the |
| 653 | @code{(scsh rw)} module in guile-scsh. |
| 654 | |
| 655 | @deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]] |
| 656 | @deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end) |
| 657 | Read characters from a port or file descriptor into a |
| 658 | string @var{str}. A port must have an underlying file |
| 659 | descriptor --- a so-called fport. This procedure is |
| 660 | scsh-compatible and can efficiently read large strings. |
| 661 | It will: |
| 662 | |
| 663 | @itemize |
| 664 | @item |
attempt to fill the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
| 669 | @item |
| 670 | use the current input port if @var{port_or_fdes} is not |
| 671 | supplied. |
| 672 | @item |
| 673 | return fewer than the requested number of characters in some |
| 674 | cases, e.g., on end of file, if interrupted by a signal, or if |
| 675 | not all the characters are immediately available. |
| 676 | @item |
| 677 | wait indefinitely for some input if no characters are |
| 678 | currently available, |
| 679 | unless the port is in non-blocking mode. |
| 680 | @item |
read characters from the port's input buffers if available,
instead of from the underlying file descriptor.
| 683 | @item |
| 684 | return @code{#f} if end-of-file is encountered before reading |
| 685 | any characters, otherwise return the number of characters |
| 686 | read. |
| 687 | @item |
| 688 | return 0 if the port is in non-blocking mode and no characters |
| 689 | are immediately available. |
| 690 | @item |
| 691 | return 0 if the request is for 0 bytes, with no |
| 692 | end-of-file check. |
| 693 | @end itemize |
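
Because a single call may return fewer characters than requested,
callers that need a full buffer usually loop.  The following is only a
sketch: @code{read-string!/fully} is a hypothetical helper, not part of
@code{(ice-9 rw)}, and it assumes a blocking port (on a non-blocking
port the 0 return value would make it spin).

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: call read-string!/partial until STR is full
;; or end of file is reached.  Return the number of characters stored.
(define (read-string!/fully str port)
  (let loop ((start 0))
    (if (= start (string-length str))
        start
        (let ((n (read-string!/partial str port start)))
          (if n                          ; #f means end of file
              (loop (+ start n))
              start)))))

(define p (pipe))                        ; a pair of fports
(display "hello" (cdr p))
(close-port (cdr p))                     ; flush, then signal end of file

(define buf (make-string 8 #\-))
(read-string!/fully buf (car p))         ; @result{} 5
buf                                      ; @result{} "hello---"
@end lisp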
| 694 | @end deffn |
| 695 | |
| 696 | @deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]] |
| 697 | @deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end) |
| 698 | Write characters from a string @var{str} to a port or file |
| 699 | descriptor. A port must have an underlying file descriptor |
| 700 | --- a so-called fport. This procedure is |
| 701 | scsh-compatible and can efficiently write large strings. |
| 702 | It will: |
| 703 | |
| 704 | @itemize |
| 705 | @item |
attempt to write the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
| 710 | @item |
use the current output port if @var{port_or_fdes} is not
supplied.
| 713 | @item |
| 714 | in the case of a buffered port, store the characters in the |
| 715 | port's output buffer, if all will fit. If they will not fit |
| 716 | then any existing buffered characters will be flushed |
| 717 | before attempting |
| 718 | to write the new characters directly to the underlying file |
descriptor.  If the port is in non-blocking mode and
buffered characters cannot be flushed immediately, then an
@code{EAGAIN} system-error exception will be raised.  (Note:
scsh does not support the use of non-blocking buffered ports.)
| 723 | @item |
| 724 | write fewer than the requested number of |
| 725 | characters in some cases, e.g., if interrupted by a signal or |
| 726 | if not all of the output can be accepted immediately. |
| 727 | @item |
| 728 | wait indefinitely for at least one character |
| 729 | from @var{str} to be accepted by the port, unless the port is |
| 730 | in non-blocking mode. |
| 731 | @item |
| 732 | return the number of characters accepted by the port. |
| 733 | @item |
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
| 736 | @item |
| 737 | return 0 immediately if the request size is 0 bytes. |
| 738 | @end itemize |
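
Since fewer characters than requested may be written, a caller that
must write the whole string can loop on the returned count.  This is a
sketch under the assumption of a blocking port;
@code{write-string/fully} is a hypothetical helper, not part of
@code{(ice-9 rw)}.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: call write-string/partial repeatedly until
;; all of STR has been accepted by PORT.
(define (write-string/fully str port)
  (let loop ((start 0))
    (when (< start (string-length str))
      (loop (+ start (write-string/partial str port start))))))

;; Write through a pipe, which provides a pair of fports.
(define p (pipe))
(write-string/fully "hello, world" (cdr p))
(close-port (cdr p))                     ; flush any buffered output
@end lisp

The other end of the pipe, @code{(car p)}, can then be read in the
usual way.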
| 739 | @end deffn |
| 740 | |
| 741 | @node Default Ports |
| 742 | @subsection Default Ports for Input, Output and Errors |
| 743 | @cindex Default ports |
| 744 | @cindex Port, default |
| 745 | |
| 746 | @rnindex current-input-port |
| 747 | @deffn {Scheme Procedure} current-input-port |
| 748 | @deffnx {C Function} scm_current_input_port () |
| 749 | @cindex standard input |
| 750 | Return the current input port. This is the default port used |
| 751 | by many input procedures. |
| 752 | |
| 753 | Initially this is the @dfn{standard input} in Unix and C terminology. |
| 754 | When the standard input is a tty the port is unbuffered, otherwise |
| 755 | it's fully buffered. |
| 756 | |
| 757 | Unbuffered input is good if an application runs an interactive |
| 758 | subprocess, since any type-ahead input won't go into Guile's buffer |
| 759 | and be unavailable to the subprocess. |
| 760 | |
| 761 | Note that Guile buffering is completely separate from the tty ``line |
| 762 | discipline''. In the usual cooked mode on a tty Guile only sees a |
| 763 | line of input once the user presses @key{Return}. |
| 764 | @end deffn |
| 765 | |
| 766 | @rnindex current-output-port |
| 767 | @deffn {Scheme Procedure} current-output-port |
| 768 | @deffnx {C Function} scm_current_output_port () |
| 769 | @cindex standard output |
| 770 | Return the current output port. This is the default port used |
| 771 | by many output procedures. |
| 772 | |
| 773 | Initially this is the @dfn{standard output} in Unix and C terminology. |
| 774 | When the standard output is a tty this port is unbuffered, otherwise |
| 775 | it's fully buffered. |
| 776 | |
| 777 | Unbuffered output to a tty is good for ensuring progress output or a |
| 778 | prompt is seen. But an application which always prints whole lines |
| 779 | could change to line buffered, or an application with a lot of output |
| 780 | could go fully buffered and perhaps make explicit @code{force-output} |
| 781 | calls (@pxref{Writing}) at selected points. |
| 782 | @end deffn |
| 783 | |
| 784 | @deffn {Scheme Procedure} current-error-port |
| 785 | @deffnx {C Function} scm_current_error_port () |
| 786 | @cindex standard error output |
| 787 | Return the port to which errors and warnings should be sent. |
| 788 | |
| 789 | Initially this is the @dfn{standard error} in Unix and C terminology. |
| 790 | When the standard error is a tty this port is unbuffered, otherwise |
| 791 | it's fully buffered. |
| 792 | @end deffn |
| 793 | |
| 794 | @deffn {Scheme Procedure} set-current-input-port port |
| 795 | @deffnx {Scheme Procedure} set-current-output-port port |
| 796 | @deffnx {Scheme Procedure} set-current-error-port port |
| 797 | @deffnx {C Function} scm_set_current_input_port (port) |
| 798 | @deffnx {C Function} scm_set_current_output_port (port) |
| 799 | @deffnx {C Function} scm_set_current_error_port (port) |
| 800 | Change the ports returned by @code{current-input-port}, |
| 801 | @code{current-output-port} and @code{current-error-port}, respectively, |
| 802 | so that they use the supplied @var{port} for input or output. |
| 803 | @end deffn |
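
Unlike @code{with-output-to-string} and similar procedures, these
setters do not restore the previous port automatically.  A minimal
sketch of saving and restoring by hand (the names @code{saved} and
@code{capture} are illustrative):

@lisp
(define saved (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "captured")               ; goes to the string port
(set-current-output-port saved)    ; restore by hand

(get-output-string capture)        ; @result{} "captured"
@end lisp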
| 804 | |
| 805 | @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port) |
| 806 | @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port) |
| 807 | @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port) |
| 808 | These functions must be used inside a pair of calls to |
| 809 | @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic |
| 810 | Wind}). During the dynwind context, the indicated port is set to |
| 811 | @var{port}. |
| 812 | |
| 813 | More precisely, the current port is swapped with a `backup' value |
| 814 | whenever the dynwind context is entered or left. The backup value is |
| 815 | initialized with the @var{port} argument. |
| 816 | @end deftypefn |
| 817 | |
| 818 | @node Port Types |
| 819 | @subsection Types of Port |
| 820 | @cindex Types of ports |
| 821 | @cindex Port, types |
| 822 | |
| 823 | [Types of port; how to make them.] |
| 824 | |
| 825 | @menu |
| 826 | * File Ports:: Ports on an operating system file. |
| 827 | * String Ports:: Ports on a Scheme string. |
| 828 | * Soft Ports:: Ports on arbitrary Scheme procedures. |
| 829 | * Void Ports:: Ports on nothing at all. |
| 830 | @end menu |
| 831 | |
| 832 | |
| 833 | @node File Ports |
| 834 | @subsubsection File Ports |
| 835 | @cindex File port |
| 836 | @cindex Port, file |
| 837 | |
| 838 | The following procedures are used to open file ports. |
| 839 | See also @ref{Ports and File Descriptors, open}, for an interface |
| 840 | to the Unix @code{open} system call. |
| 841 | |
| 842 | Most systems have limits on how many files can be open, so it's |
| 843 | strongly recommended that file ports be closed explicitly when no |
| 844 | longer required (@pxref{Ports}). |
| 845 | |
| 846 | @deffn {Scheme Procedure} open-file filename mode @ |
| 847 | [#:guess-encoding=#f] [#:encoding=#f] |
| 848 | @deffnx {C Function} scm_open_file_with_encoding @ |
| 849 | (filename, mode, guess_encoding, encoding) |
| 850 | @deffnx {C Function} scm_open_file (filename, mode) |
| 851 | Open the file whose name is @var{filename}, and return a port |
| 852 | representing that file. The attributes of the port are |
| 853 | determined by the @var{mode} string. The way in which this is |
| 854 | interpreted is similar to C stdio. The first character must be |
| 855 | one of the following: |
| 856 | |
| 857 | @table @samp |
| 858 | @item r |
| 859 | Open an existing file for input. |
| 860 | @item w |
| 861 | Open a file for output, creating it if it doesn't already exist |
| 862 | or removing its contents if it does. |
| 863 | @item a |
| 864 | Open a file for output, creating it if it doesn't already |
| 865 | exist. All writes to the port will go to the end of the file. |
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
| 868 | @end table |
| 869 | |
| 870 | The following additional characters can be appended: |
| 871 | |
| 872 | @table @samp |
| 873 | @item + |
| 874 | Open the port for both input and output. E.g., @code{r+}: open |
| 875 | an existing file for both input and output. |
| 876 | @item 0 |
Create an ``unbuffered'' port.  In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering.  This is likely to
slow down I/O operations.  The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
| 883 | @item l |
| 884 | Add line-buffering to the port. The port output buffer will be |
| 885 | automatically flushed whenever a newline character is written. |
| 886 | @item b |
| 887 | Use binary mode, ensuring that each byte in the file will be read as one |
| 888 | Scheme character. |
| 889 | |
| 890 | To provide this property, the file will be opened with the 8-bit |
| 891 | character encoding "ISO-8859-1", ignoring the default port encoding. |
| 892 | @xref{Ports}, for more information on port encodings. |
| 893 | |
| 894 | Note that while it is possible to read and write binary data as |
| 895 | characters or strings, it is usually better to treat bytes as octets, |
| 896 | and byte sequences as bytevectors. @xref{R6RS Binary Input}, and |
| 897 | @ref{R6RS Binary Output}, for more. |
| 898 | |
| 899 | This option had another historical meaning, for DOS compatibility: in |
| 900 | the default (textual) mode, DOS reads a CR-LF sequence as one LF byte. |
| 901 | The @code{b} flag prevents this from happening, adding @code{O_BINARY} |
| 902 | to the underlying @code{open} call. Still, the flag is generally useful |
| 903 | because of its port encoding ramifications. |
| 904 | @end table |
| 905 | |
| 906 | Unless binary mode is requested, the character encoding of the new port |
| 907 | is determined as follows: First, if @var{guess-encoding} is true, the |
| 908 | @code{file-encoding} procedure is used to guess the encoding of the file |
| 909 | (@pxref{Character Encoding of Source Files}). If @var{guess-encoding} |
| 910 | is false or if @code{file-encoding} fails, @var{encoding} is used unless |
| 911 | it is also false. As a last resort, the default port encoding is used. |
| 912 | @xref{Ports}, for more information on port encodings. It is an error to |
| 913 | pass a non-false @var{guess-encoding} or @var{encoding} if binary mode |
| 914 | is requested. |
| 915 | |
| 916 | If a file cannot be opened with the access requested, @code{open-file} |
| 917 | throws an exception. |
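
As a rough illustration of the mode strings and the @code{#:encoding}
keyword, the following sketch creates a file, appends to it, and reads
back its first character.  The path is a hypothetical scratch file
chosen for this example.

@lisp
;; Hypothetical scratch path used only for this example.
(define file "/tmp/guile-open-file-demo.txt")

;; "w" creates the file, or truncates it if it already exists.
(let ((port (open-file file "w" #:encoding "UTF-8")))
  (display "first line\n" port)
  (close-port port))

;; "a" sends every write to the end of the file.
(let ((port (open-file file "a")))
  (display "second line\n" port)
  (close-port port))

;; "r" opens the existing file for input.
(let* ((port (open-file file "r" #:encoding "UTF-8"))
       (first (read-char port)))
  (close-port port)
  first)                             ; @result{} #\f
@end lisp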
| 918 | |
| 919 | When the file is opened, its encoding is set to the current |
| 920 | @code{%default-port-encoding}, unless the @code{b} flag was supplied. |
| 921 | Sometimes it is desirable to honor Emacs-style coding declarations in |
| 922 | files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This |
| 923 | behavior was deemed inappropriate and disabled starting from Guile |
| 924 | 2.0.8.}. When that is the case, the @code{file-encoding} procedure can |
| 925 | be used as follows (@pxref{Character Encoding of Source Files, |
| 926 | @code{file-encoding}}): |
| 927 | |
| 928 | @example |
(let* ((port     (open-input-file file))
       (encoding (file-encoding port)))
  ;; Fall back to the port's current encoding if no declaration is found.
  (set-port-encoding! port (or encoding (port-encoding port)))
  port)
| 932 | @end example |
| 933 | |
| 934 | In theory we could create read/write ports which were buffered |
| 935 | in one direction only. However this isn't included in the |
| 936 | current interfaces. |
| 937 | @end deffn |
| 938 | |
| 939 | @rnindex open-input-file |
| 940 | @deffn {Scheme Procedure} open-input-file filename @ |
| 941 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] |
| 942 | |
| 943 | Open @var{filename} for input. If @var{binary} is true, open the port |
| 944 | in binary mode, otherwise use text mode. @var{encoding} and |
| 945 | @var{guess-encoding} determine the character encoding as described above |
| 946 | for @code{open-file}. Equivalent to |
| 947 | @lisp |
| 948 | (open-file @var{filename} |
| 949 | (if @var{binary} "rb" "r") |
| 950 | #:guess-encoding @var{guess-encoding} |
| 951 | #:encoding @var{encoding}) |
| 952 | @end lisp |
| 953 | @end deffn |
| 954 | |
| 955 | @rnindex open-output-file |
| 956 | @deffn {Scheme Procedure} open-output-file filename @ |
| 957 | [#:encoding=#f] [#:binary=#f] |
| 958 | |
| 959 | Open @var{filename} for output. If @var{binary} is true, open the port |
| 960 | in binary mode, otherwise use text mode. @var{encoding} specifies the |
| 961 | character encoding as described above for @code{open-file}. Equivalent |
| 962 | to |
| 963 | @lisp |
| 964 | (open-file @var{filename} |
| 965 | (if @var{binary} "wb" "w") |
| 966 | #:encoding @var{encoding}) |
| 967 | @end lisp |
| 968 | @end deffn |
| 969 | |
| 970 | @deffn {Scheme Procedure} call-with-input-file filename proc @ |
| 971 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] |
| 972 | @deffnx {Scheme Procedure} call-with-output-file filename proc @ |
| 973 | [#:encoding=#f] [#:binary=#f] |
| 974 | @rnindex call-with-input-file |
| 975 | @rnindex call-with-output-file |
| 976 | Open @var{filename} for input or output, and call @code{(@var{proc} |
| 977 | port)} with the resulting port. Return the value returned by |
| 978 | @var{proc}. @var{filename} is opened as per @code{open-input-file} or |
| 979 | @code{open-output-file} respectively, and an error is signaled if it |
| 980 | cannot be opened. |
| 981 | |
| 982 | When @var{proc} returns, the port is closed. If @var{proc} does not |
| 983 | return (e.g.@: if it throws an error), then the port might not be |
| 984 | closed automatically, though it will be garbage collected in the usual |
| 985 | way if not otherwise referenced. |
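
A short round trip, as a sketch: write one line with
@code{call-with-output-file}, then read it back with
@code{call-with-input-file}.  The path is a hypothetical scratch file.

@lisp
(use-modules (ice-9 rdelim))

;; Hypothetical scratch path used only for this example.
(define file "/tmp/guile-call-with-demo.txt")

(call-with-output-file file
  (lambda (port)
    (write-line "hello" port)))         ; port is closed on return

(call-with-input-file file read-line)   ; @result{} "hello"
@end lisp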
| 986 | @end deffn |
| 987 | |
| 988 | @deffn {Scheme Procedure} with-input-from-file filename thunk @ |
| 989 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] |
| 990 | @deffnx {Scheme Procedure} with-output-to-file filename thunk @ |
| 991 | [#:encoding=#f] [#:binary=#f] |
| 992 | @deffnx {Scheme Procedure} with-error-to-file filename thunk @ |
| 993 | [#:encoding=#f] [#:binary=#f] |
| 994 | @rnindex with-input-from-file |
| 995 | @rnindex with-output-to-file |
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as respectively the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}.  Return the
| 999 | value returned by @var{thunk}. @var{filename} is opened as per |
| 1000 | @code{open-input-file} or @code{open-output-file} respectively, and an |
| 1001 | error is signaled if it cannot be opened. |
| 1002 | |
| 1003 | When @var{thunk} returns, the port is closed and the previous setting |
| 1004 | of the respective current port is restored. |
| 1005 | |
| 1006 | The current port setting is managed with @code{dynamic-wind}, so the |
previous value is restored no matter how @var{thunk} exits (e.g.@: an
| 1008 | exception), and if @var{thunk} is re-entered (via a captured |
| 1009 | continuation) then it's set again to the @var{filename} port. |
| 1010 | |
| 1011 | The port is closed when @var{thunk} returns normally, but not when |
| 1012 | exited via an exception or new continuation. This ensures it's still |
| 1013 | ready for use if @var{thunk} is re-entered by a captured continuation. |
| 1014 | Of course the port is always garbage collected and closed in the usual |
| 1015 | way when no longer referenced anywhere. |
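
For example, procedures that use the default ports can be redirected
to a file without passing a port explicitly.  This is a sketch; the
path is a hypothetical scratch file.

@lisp
;; Hypothetical scratch path used only for this example.
(define file "/tmp/guile-with-output-demo.txt")

(with-output-to-file file
  (lambda ()
    (display "42 is the answer")))  ; written to FILE, not the terminal

(with-input-from-file file read)    ; @result{} 42
@end lisp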
| 1016 | @end deffn |
| 1017 | |
| 1018 | @deffn {Scheme Procedure} port-mode port |
| 1019 | @deffnx {C Function} scm_port_mode (port) |
| 1020 | Return the port modes associated with the open port @var{port}. |
| 1021 | These will not necessarily be identical to the modes used when |
| 1022 | the port was opened, since modes such as "append" which are |
| 1023 | used only during port creation are not retained. |
| 1024 | @end deffn |
| 1025 | |
| 1026 | @deffn {Scheme Procedure} port-filename port |
| 1027 | @deffnx {C Function} scm_port_filename (port) |
| 1028 | Return the filename associated with @var{port}, or @code{#f} if no |
| 1029 | filename is associated with the port. |
| 1030 | |
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
| 1033 | @end deffn |
| 1034 | |
| 1035 | @deffn {Scheme Procedure} set-port-filename! port filename |
| 1036 | @deffnx {C Function} scm_set_port_filename_x (port, filename) |
| 1037 | Change the filename associated with @var{port}, using the current input |
| 1038 | port if none is specified. Note that this does not change the port's |
| 1039 | source of data, but only the value that is returned by |
| 1040 | @code{port-filename} and reported in diagnostic output. |
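
A small sketch using a string port, which has no file name to begin
with (@code{"demo.scm"} is an arbitrary label chosen for the example):

@lisp
(define port (open-input-string "(+ 1 2)"))

(port-filename port)                   ; @result{} #f
(set-port-filename! port "demo.scm")
(port-filename port)                   ; @result{} "demo.scm"

;; The data source is unchanged: reading still yields the string's datum.
(read port)                            ; @result{} (+ 1 2)
@end lisp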
| 1041 | @end deffn |
| 1042 | |
| 1043 | @deffn {Scheme Procedure} file-port? obj |
| 1044 | @deffnx {C Function} scm_file_port_p (obj) |
| 1045 | Determine whether @var{obj} is a port that is related to a file. |
| 1046 | @end deffn |
| 1047 | |
| 1048 | |
| 1049 | @node String Ports |
| 1050 | @subsubsection String Ports |
| 1051 | @cindex String port |
| 1052 | @cindex Port, string |
| 1053 | |
The following procedures allow string ports to be opened by analogy to
the R4RS file port facilities.
| 1056 | |
| 1057 | With string ports, the port-encoding is treated differently than other |
| 1058 | types of ports. When string ports are created, they do not inherit a |
character encoding from the current locale.  They are given a
default encoding that allows them to handle all valid string characters.
| 1061 | Typically one should not modify a string port's character encoding |
| 1062 | away from its default. |
| 1063 | |
| 1064 | @deffn {Scheme Procedure} call-with-output-string proc |
| 1065 | @deffnx {C Function} scm_call_with_output_string (proc) |
| 1066 | Calls the one-argument procedure @var{proc} with a newly created output |
| 1067 | port. When the function returns, the string composed of the characters |
| 1068 | written into the port is returned. @var{proc} should not close the port. |
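
For example, as a minimal sketch:

@lisp
(call-with-output-string
  (lambda (port)
    (display "x = " port)
    (write 42 port)))            ; @result{} "x = 42"
@end lisp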
| 1069 | @end deffn |
| 1070 | |
| 1071 | @deffn {Scheme Procedure} call-with-input-string string proc |
| 1072 | @deffnx {C Function} scm_call_with_input_string (string, proc) |
| 1073 | Calls the one-argument procedure @var{proc} with a newly |
| 1074 | created input port from which @var{string}'s contents may be |
| 1075 | read. The value yielded by the @var{proc} is returned. |
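
For example, passing @code{read} as @var{proc} parses the first datum
in the string:

@lisp
(call-with-input-string "42 foo" read)   ; @result{} 42
@end lisp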
| 1076 | @end deffn |
| 1077 | |
| 1078 | @deffn {Scheme Procedure} with-output-to-string thunk |
| 1079 | Calls the zero-argument procedure @var{thunk} with the current output |
| 1080 | port set temporarily to a new string port. It returns a string |
| 1081 | composed of the characters written to the current output. |
| 1082 | @end deffn |
| 1083 | |
| 1084 | @deffn {Scheme Procedure} with-input-from-string string thunk |
| 1085 | Calls the zero-argument procedure @var{thunk} with the current input |
| 1086 | port set temporarily to a string port opened on the specified |
| 1087 | @var{string}. The value yielded by @var{thunk} is returned. |
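
A brief sketch of this procedure together with
@code{with-output-to-string}; both rebind a default port, so the
thunks use @code{display}, @code{write} and @code{read} without a port
argument:

@lisp
(with-output-to-string
  (lambda ()
    (display "hi")
    (write 'there)))                       ; @result{} "hithere"

(with-input-from-string "(a b) c" read)    ; @result{} (a b)
@end lisp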
| 1088 | @end deffn |
| 1089 | |
| 1090 | @deffn {Scheme Procedure} open-input-string str |
| 1091 | @deffnx {C Function} scm_open_input_string (str) |
| 1092 | Take a string and return an input port that delivers characters |
| 1093 | from the string. The port can be closed by |
| 1094 | @code{close-input-port}, though its storage will be reclaimed |
| 1095 | by the garbage collector if it becomes inaccessible. |
| 1096 | @end deffn |
| 1097 | |
| 1098 | @deffn {Scheme Procedure} open-output-string |
| 1099 | @deffnx {C Function} scm_open_output_string () |
| 1100 | Return an output port that will accumulate characters for |
| 1101 | retrieval by @code{get-output-string}. The port can be closed |
| 1102 | by the procedure @code{close-output-port}, though its storage |
| 1103 | will be reclaimed by the garbage collector if it becomes |
| 1104 | inaccessible. |
| 1105 | @end deffn |
| 1106 | |
| 1107 | @deffn {Scheme Procedure} get-output-string port |
| 1108 | @deffnx {C Function} scm_get_output_string (port) |
| 1109 | Given an output port created by @code{open-output-string}, |
| 1110 | return a string consisting of the characters that have been |
| 1111 | output to the port so far. |
| 1112 | |
@code{get-output-string} must be used before closing @var{port}; once
the port is closed, the string cannot be obtained.
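
The following sketch shows that the string keeps accumulating across
calls, and that it is retrieved before the port is closed:

@lisp
(define port (open-output-string))

(display "abc" port)
(get-output-string port)     ; @result{} "abc"
(display "def" port)
(get-output-string port)     ; @result{} "abcdef"

(close-output-port port)     ; the string can no longer be obtained
@end lisp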
| 1115 | @end deffn |
| 1116 | |
| 1117 | A string port can be used in many procedures which accept a port |
| 1118 | but which are not dependent on implementation details of fports. |
| 1119 | E.g., seeking and truncating will work on a string port, |
| 1120 | but trying to extract the file descriptor number will fail. |
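
A small sketch of that distinction follows.  It uses
@code{false-if-exception} so that the failing @code{fileno} call
yields @code{#f} instead of signaling an error.

@lisp
(define port (open-input-string "abcdef"))

(read-char port)                       ; @result{} #\a
(seek port 3 SEEK_SET)                 ; @result{} 3
(read-char port)                       ; @result{} #\d

;; A string port has no file descriptor, so fileno signals an error.
(false-if-exception (fileno port))     ; @result{} #f
@end lisp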
| 1121 | |
| 1122 | |
| 1123 | @node Soft Ports |
| 1124 | @subsubsection Soft Ports |
| 1125 | @cindex Soft port |
| 1126 | @cindex Port, soft |
| 1127 | |
| 1128 | A @dfn{soft-port} is a port based on a vector of procedures capable of |
| 1129 | accepting or delivering characters. It allows emulation of I/O ports. |
| 1130 | |
| 1131 | @deffn {Scheme Procedure} make-soft-port pv modes |
| 1132 | @deffnx {C Function} scm_make_soft_port (pv, modes) |
| 1133 | Return a port capable of receiving or delivering characters as |
| 1134 | specified by the @var{modes} string (@pxref{File Ports, |
| 1135 | open-file}). @var{pv} must be a vector of length 5 or 6. Its |
| 1136 | components are as follows: |
| 1137 | |
| 1138 | @enumerate 0 |
| 1139 | @item |
| 1140 | procedure accepting one character for output |
| 1141 | @item |
| 1142 | procedure accepting a string for output |
| 1143 | @item |
| 1144 | thunk for flushing output |
| 1145 | @item |
| 1146 | thunk for getting one character |
| 1147 | @item |
| 1148 | thunk for closing port (not by garbage collection) |
| 1149 | @item |
| 1150 | (if present and not @code{#f}) thunk for computing the number of |
| 1151 | characters that can be read from the port without blocking. |
| 1152 | @end enumerate |
| 1153 | |
| 1154 | For an output-only port only elements 0, 1, 2, and 4 need be |
| 1155 | procedures. For an input-only port only elements 3 and 4 need |
| 1156 | be procedures. Thunks 2 and 4 can instead be @code{#f} if |
| 1157 | there is no useful operation for them to perform. |
| 1158 | |
| 1159 | If thunk 3 returns @code{#f} or an @code{eof-object} |
| 1160 | (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on |
| 1161 | Scheme}) it indicates that the port has reached end-of-file. |
| 1162 | For example: |
| 1163 | |
| 1164 | @lisp |
| 1165 | (define stdout (current-output-port)) |
| 1166 | (define p (make-soft-port |
| 1167 | (vector |
| 1168 | (lambda (c) (write c stdout)) |
| 1169 | (lambda (s) (display s stdout)) |
| 1170 | (lambda () (display "." stdout)) |
| 1171 | (lambda () (char-upcase (read-char))) |
| 1172 | (lambda () (display "@@" stdout))) |
| 1173 | "rw")) |
| 1174 | |
| 1175 | (write p p) @result{} #<input-output: soft 8081e20> |
| 1176 | @end lisp |
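
As a further sketch, an output-only soft port can collect whatever is
written to it.  Element 3 is @code{#f} because the port is never read
from; @code{chunks} and @code{collector} are illustrative names.

@lisp
(define chunks '())

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! chunks (cons (string c) chunks)))  ; 0: one char
    (lambda (s) (set! chunks (cons s chunks)))           ; 1: a string
    (lambda () #t)                                       ; 2: flush
    #f                                                   ; 3: no input
    (lambda () #t))                                      ; 4: close
   "w"))

(display "soft " collector)
(write 'port collector)
(force-output collector)                 ; push out anything buffered

(apply string-append (reverse chunks))   ; @result{} "soft port"
@end lisp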
| 1177 | @end deffn |
| 1178 | |
| 1179 | |
| 1180 | @node Void Ports |
| 1181 | @subsubsection Void Ports |
| 1182 | @cindex Void port |
| 1183 | @cindex Port, void |
| 1184 | |
| 1185 | This kind of port causes any data to be discarded when written to, and |
| 1186 | always returns the end-of-file object when read from. |
| 1187 | |
| 1188 | @deffn {Scheme Procedure} %make-void-port mode |
| 1189 | @deffnx {C Function} scm_sys_make_void_port (mode) |
| 1190 | Create and return a new void port. A void port acts like |
| 1191 | @file{/dev/null}. The @var{mode} argument |
| 1192 | specifies the input/output modes for this port: see the |
| 1193 | documentation for @code{open-file} in @ref{File Ports}. |
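
For example, as a minimal sketch:

@lisp
(define port (%make-void-port "rw"))

(display "discarded" port)        ; the data goes nowhere
(eof-object? (read-char port))    ; @result{} #t
@end lisp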
| 1194 | @end deffn |
| 1195 | |
| 1196 | |
| 1197 | @node R6RS I/O Ports |
| 1198 | @subsection R6RS I/O Ports |
| 1199 | |
| 1200 | @cindex R6RS |
| 1201 | @cindex R6RS ports |
| 1202 | |
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
| 1205 | io ports)} module. It provides features, such as binary I/O and Unicode |
| 1206 | string I/O, that complement or refine Guile's historical port API |
| 1207 | presented above (@pxref{Input and Output}). Note that R6RS ports are not |
| 1208 | disjoint from Guile's native ports, so Guile-specific procedures will |
| 1209 | work on ports created using the R6RS API, and vice versa. |
| 1210 | |
| 1211 | The text in this section is taken from the R6RS standard libraries |
document, with only minor adaptations for inclusion in this manual.  The
| 1213 | Guile developers offer their thanks to the R6RS editors for having |
| 1214 | provided the report's text under permissive conditions making this |
| 1215 | possible. |
| 1216 | |
| 1217 | @c FIXME: Update description when implemented. |
| 1218 | @emph{Note}: The implementation of this R6RS API is not complete yet. |
| 1219 | |
| 1220 | @menu |
| 1221 | * R6RS File Names:: File names. |
| 1222 | * R6RS File Options:: Options for opening files. |
| 1223 | * R6RS Buffer Modes:: Influencing buffering behavior. |
| 1224 | * R6RS Transcoders:: Influencing port encoding. |
| 1225 | * R6RS End-of-File:: The end-of-file object. |
| 1226 | * R6RS Port Manipulation:: Manipulating R6RS ports. |
| 1227 | * R6RS Input Ports:: Input Ports. |
| 1228 | * R6RS Binary Input:: Binary input. |
| 1229 | * R6RS Textual Input:: Textual input. |
| 1230 | * R6RS Output Ports:: Output Ports. |
| 1231 | * R6RS Binary Output:: Binary output. |
| 1232 | * R6RS Textual Output:: Textual output. |
| 1233 | @end menu |
| 1234 | |
| 1235 | A subset of the @code{(rnrs io ports)} module, plus one non-standard |
| 1236 | procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is |
| 1237 | provided by the @code{(ice-9 binary-ports)} module. It contains binary |
| 1238 | input/output procedures and does not rely on R6RS support. |
| 1239 | |
| 1240 | @node R6RS File Names |
| 1241 | @subsubsection File Names |
| 1242 | |
| 1243 | Some of the procedures described in this chapter accept a file name as an |
| 1244 | argument. Valid values for such a file name include strings that name a file |
| 1245 | using the native notation of file system paths on an implementation's |
| 1246 | underlying operating system, and may include implementation-dependent |
| 1247 | values as well. |
| 1248 | |
| 1249 | A @var{filename} parameter name means that the |
| 1250 | corresponding argument must be a file name. |
| 1251 | |
| 1252 | @node R6RS File Options |
| 1253 | @subsubsection File Options |
| 1254 | @cindex file options |
| 1255 | |
| 1256 | When opening a file, the various procedures in this library accept a |
| 1257 | @code{file-options} object that encapsulates flags to specify how the |
| 1258 | file is to be opened. A @code{file-options} object is an enum-set |
| 1259 | (@pxref{rnrs enums}) over the symbols constituting valid file options. |
| 1260 | |
| 1261 | A @var{file-options} parameter name means that the corresponding |
| 1262 | argument must be a file-options object. |
| 1263 | |
| 1264 | @deffn {Scheme Syntax} file-options @var{file-options-symbol} ... |
| 1265 | |
| 1266 | Each @var{file-options-symbol} must be a symbol. |
| 1267 | |
| 1268 | The @code{file-options} syntax returns a file-options object that |
| 1269 | encapsulates the specified options. |
| 1270 | |
| 1271 | When supplied to an operation that opens a file for output, the |
| 1272 | file-options object returned by @code{(file-options)} specifies that the |
| 1273 | file is created if it does not exist and an exception with condition |
| 1274 | type @code{&i/o-file-already-exists} is raised if it does exist. The |
| 1275 | following standard options can be included to modify the default |
| 1276 | behavior. |
| 1277 | |
| 1278 | @table @code |
| 1279 | @item no-create |
| 1280 | If the file does not already exist, it is not created; |
| 1281 | instead, an exception with condition type @code{&i/o-file-does-not-exist} |
| 1282 | is raised. |
| 1283 | If the file already exists, the exception with condition type |
| 1284 | @code{&i/o-file-already-exists} is not raised |
| 1285 | and the file is truncated to zero length. |
| 1286 | @item no-fail |
| 1287 | If the file already exists, the exception with condition type |
| 1288 | @code{&i/o-file-already-exists} is not raised, |
| 1289 | even if @code{no-create} is not included, |
| 1290 | and the file is truncated to zero length. |
| 1291 | @item no-truncate |
| 1292 | If the file already exists and the exception with condition type |
| 1293 | @code{&i/o-file-already-exists} has been inhibited by inclusion of |
| 1294 | @code{no-create} or @code{no-fail}, the file is not truncated, but |
| 1295 | the port's current position is still set to the beginning of the |
| 1296 | file. |
| 1297 | @end table |
| 1298 | |
| 1299 | These options have no effect when a file is opened only for input. |
| 1300 | Symbols other than those listed above may be used as |
| 1301 | @var{file-options-symbol}s; they have implementation-specific meaning, |
| 1302 | if any. |
| 1303 | |
| 1304 | @quotation Note |
| 1305 | Only the name of @var{file-options-symbol} is significant. |
| 1306 | @end quotation |
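
As a sketch of how a file-options object is typically passed, the
following uses @code{open-file-output-port} from
@code{(rnrs io ports)} with @code{no-fail}, so that an existing file is
truncated instead of raising an exception.  The path is a hypothetical
scratch file, and the example assumes these procedures are available
in the Guile version in use.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical scratch path used only for this example.
(define file "/tmp/guile-file-options-demo.bin")

(let ((port (open-file-output-port file (file-options no-fail))))
  (put-bytevector port (u8-list->bytevector '(1 2 3)))
  (close-port port))

(let* ((port (open-file-input-port file))
       (bytes (get-bytevector-all port)))
  (close-port port)
  (bytevector->u8-list bytes))           ; @result{} (1 2 3)
@end lisp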
| 1307 | @end deffn |
| 1308 | |
| 1309 | @node R6RS Buffer Modes |
| 1310 | @subsubsection Buffer Modes |
| 1311 | |
| 1312 | Each port has an associated buffer mode. For an output port, the |
| 1313 | buffer mode defines when an output operation flushes the buffer |
| 1314 | associated with the output port. For an input port, the buffer mode |
| 1315 | defines how much data will be read to satisfy read operations. The |
| 1316 | possible buffer modes are the symbols @code{none} for no buffering, |
| 1317 | @code{line} for flushing upon line endings and reading up to line |
| 1318 | endings, or other implementation-dependent behavior, |
| 1319 | and @code{block} for arbitrary buffering. This section uses |
| 1320 | the parameter name @var{buffer-mode} for arguments that must be |
| 1321 | buffer-mode symbols. |
| 1322 | |
| 1323 | If two ports are connected to the same mutable source, both ports |
| 1324 | are unbuffered, and reading a byte or character from that shared |
| 1325 | source via one of the two ports would change the bytes or characters |
| 1326 | seen via the other port, a lookahead operation on one port will |
| 1327 | render the peeked byte or character inaccessible via the other port, |
| 1328 | while a subsequent read operation on the peeked port will see the |
| 1329 | peeked byte or character even though the port is otherwise unbuffered. |
| 1330 | |
| 1331 | In other words, the semantics of buffering is defined in terms of side |
| 1332 | effects on shared mutable sources, and a lookahead operation has the |
| 1333 | same side effect on the shared source as a read operation. |
| 1334 | |
| 1335 | @deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol} |
| 1336 | |
| 1337 | @var{buffer-mode-symbol} must be a symbol whose name is one of |
| 1338 | @code{none}, @code{line}, and @code{block}. The result is the |
| 1339 | corresponding symbol, and specifies the associated buffer mode. |
| 1340 | |
| 1341 | @quotation Note |
| 1342 | Only the name of @var{buffer-mode-symbol} is significant. |
| 1343 | @end quotation |
| 1344 | @end deffn |
| 1345 | |
| 1346 | @deffn {Scheme Procedure} buffer-mode? obj |
| 1347 | Returns @code{#t} if the argument is a valid buffer-mode symbol, and |
| 1348 | returns @code{#f} otherwise. |
| 1349 | @end deffn |
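| | |
| | As a brief illustration (a sketch that assumes the @code{(rnrs io ports)} |
| | module has been loaded), the @code{buffer-mode} syntax and the |
| | @code{buffer-mode?} predicate can be combined as follows: |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | ;; The syntax evaluates to the symbol naming the buffer mode. |
| | (buffer-mode line) |
| | @result{} line |
| | |
| | ;; Only none, line, and block are valid buffer-mode symbols. |
| | (buffer-mode? (buffer-mode block)) |
| | @result{} #t |
| | (buffer-mode? 'unbuffered) |
| | @result{} #f |
| | @end lisp |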
| 1350 | |
| 1351 | @node R6RS Transcoders |
| 1352 | @subsubsection Transcoders |
| 1353 | @cindex codec |
| 1354 | @cindex end-of-line style |
| 1355 | @cindex transcoder |
| 1356 | @cindex binary port |
| 1357 | @cindex textual port |
| 1358 | |
| 1359 | Several different Unicode encoding schemes describe standard ways to |
| 1360 | encode characters and strings as byte sequences and to decode those |
| 1361 | sequences. Within this document, a @dfn{codec} is an immutable Scheme |
| 1362 | object that represents a Unicode or similar encoding scheme. |
| 1363 | |
| 1364 | An @dfn{end-of-line style} is a symbol that, if it is not @code{none}, |
| 1365 | describes how a textual port transcodes representations of line endings. |
| 1366 | |
| 1367 | A @dfn{transcoder} is an immutable Scheme object that combines a codec |
| 1368 | with an end-of-line style and a method for handling decoding errors. |
| 1369 | Each transcoder represents some specific bidirectional (but not |
| 1370 | necessarily lossless), possibly stateful translation between byte |
| 1371 | sequences and Unicode characters and strings. Every transcoder can |
| 1372 | operate in the input direction (bytes to characters) or in the output |
| 1373 | direction (characters to bytes). A @var{transcoder} parameter name |
| 1374 | means that the corresponding argument must be a transcoder. |
| 1375 | |
| 1376 | A @dfn{binary port} is a port that supports binary I/O, does not have an |
| 1377 | associated transcoder and does not support textual I/O. A @dfn{textual |
| 1378 | port} is a port that supports textual I/O, and does not support binary |
| 1379 | I/O. A textual port may or may not have an associated transcoder. |
| 1380 | |
| 1381 | @deffn {Scheme Procedure} latin-1-codec |
| 1382 | @deffnx {Scheme Procedure} utf-8-codec |
| 1383 | @deffnx {Scheme Procedure} utf-16-codec |
| 1384 | |
| 1385 | These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16 |
| 1386 | encoding schemes. |
| 1387 | |
| 1388 | A call to any of these procedures returns a value that is equal in the |
| 1389 | sense of @code{eqv?} to the result of any other call to the same |
| 1390 | procedure. |
| 1391 | @end deffn |
| 1392 | |
| 1393 | @deffn {Scheme Syntax} eol-style @var{eol-style-symbol} |
| 1394 | |
| 1395 | @var{eol-style-symbol} should be a symbol whose name is one of |
| 1396 | @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls}, |
| 1397 | and @code{none}. |
| 1398 | |
| 1399 | The form evaluates to the corresponding symbol. If the name of |
| 1400 | @var{eol-style-symbol} is not one of these symbols, the effect and |
| 1401 | result are implementation-dependent; in particular, the result may be an |
| 1402 | eol-style symbol acceptable as an @var{eol-style} argument to |
| 1403 | @code{make-transcoder}. Otherwise, an exception is raised. |
| 1404 | |
| 1405 | All eol-style symbols except @code{none} describe a specific |
| 1406 | line-ending encoding: |
| 1407 | |
| 1408 | @table @code |
| 1409 | @item lf |
| 1410 | linefeed |
| 1411 | @item cr |
| 1412 | carriage return |
| 1413 | @item crlf |
| 1414 | carriage return, linefeed |
| 1415 | @item nel |
| 1416 | next line |
| 1417 | @item crnel |
| 1418 | carriage return, next line |
| 1419 | @item ls |
| 1420 | line separator |
| 1421 | @end table |
| 1422 | |
| 1423 | For a textual port with a transcoder, and whose transcoder has an |
| 1424 | eol-style symbol @code{none}, no conversion occurs. For a textual input |
| 1425 | port, any eol-style symbol other than @code{none} means that all of the |
| 1426 | above line-ending encodings are recognized and are translated into a |
| 1427 | single linefeed. For a textual output port, @code{none} and @code{lf} |
| 1428 | are equivalent. Linefeed characters are encoded according to the |
| 1429 | specified eol-style symbol, and all other characters that participate in |
| 1430 | possible line endings are encoded as is. |
| 1431 | |
| 1432 | @quotation Note |
| 1433 | Only the name of @var{eol-style-symbol} is significant. |
| 1434 | @end quotation |
| 1435 | @end deffn |
| 1436 | |
| 1437 | @deffn {Scheme Procedure} native-eol-style |
| 1438 | Returns the default end-of-line style of the underlying platform, e.g., |
| 1439 | @code{lf} on Unix and @code{crlf} on Windows. |
| 1440 | @end deffn |
| 1441 | |
| 1442 | @deffn {Condition Type} &i/o-decoding |
| 1443 | @deffnx {Scheme Procedure} make-i/o-decoding-error port |
| 1444 | @deffnx {Scheme Procedure} i/o-decoding-error? obj |
| 1445 | |
| 1446 | This condition type could be defined by |
| 1447 | |
| 1448 | @lisp |
| 1449 | (define-condition-type &i/o-decoding &i/o-port |
| 1450 | make-i/o-decoding-error i/o-decoding-error?) |
| 1451 | @end lisp |
| 1452 | |
| 1453 | An exception with this type is raised when one of the operations for |
| 1454 | textual input from a port encounters a sequence of bytes that cannot be |
| 1455 | translated into a character or string by the input direction of the |
| 1456 | port's transcoder. |
| 1457 | |
| 1458 | When such an exception is raised, the port's position is past the |
| 1459 | invalid encoding. |
| 1460 | @end deffn |
| 1461 | |
| 1462 | @deffn {Condition Type} &i/o-encoding |
| 1463 | @deffnx {Scheme Procedure} make-i/o-encoding-error port char |
| 1464 | @deffnx {Scheme Procedure} i/o-encoding-error? obj |
| 1465 | @deffnx {Scheme Procedure} i/o-encoding-error-char condition |
| 1466 | |
| 1467 | This condition type could be defined by |
| 1468 | |
| 1469 | @lisp |
| 1470 | (define-condition-type &i/o-encoding &i/o-port |
| 1471 | make-i/o-encoding-error i/o-encoding-error? |
| 1472 | (char i/o-encoding-error-char)) |
| 1473 | @end lisp |
| 1474 | |
| 1475 | An exception with this type is raised when one of the operations for |
| 1476 | textual output to a port encounters a character that cannot be |
| 1477 | translated into bytes by the output direction of the port's transcoder. |
| 1478 | @var{char} is the character that could not be encoded. |
| 1479 | @end deffn |
| 1480 | |
| 1481 | @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol} |
| 1482 | |
| 1483 | @var{error-handling-mode-symbol} should be a symbol whose name is one of |
| 1484 | @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to |
| 1485 | the corresponding symbol. If @var{error-handling-mode-symbol} is not |
| 1486 | one of these identifiers, the effect and result are |
| 1487 | implementation-dependent: the result may be an error-handling-mode |
| 1488 | symbol acceptable as a @var{handling-mode} argument to |
| 1489 | @code{make-transcoder}. If it is not acceptable as a |
| 1490 | @var{handling-mode} argument to @code{make-transcoder}, an exception is |
| 1491 | raised. |
| 1492 | |
| 1493 | @quotation Note |
| 1494 | Only the name of @var{error-handling-mode-symbol} is significant. |
| 1495 | @end quotation |
| 1496 | |
| 1497 | The error-handling mode of a transcoder specifies the behavior |
| 1498 | of textual I/O operations in the presence of encoding or decoding |
| 1499 | errors. |
| 1500 | |
| 1501 | If a textual input operation encounters an invalid or incomplete |
| 1502 | character encoding, and the error-handling mode is @code{ignore}, an |
| 1503 | appropriate number of bytes of the invalid encoding are ignored and |
| 1504 | decoding continues with the following bytes. |
| 1505 | |
| 1506 | If the error-handling mode is @code{replace}, the replacement |
| 1507 | character U+FFFD is injected into the data stream, an appropriate |
| 1508 | number of bytes are ignored, and decoding |
| 1509 | continues with the following bytes. |
| 1510 | |
| 1511 | If the error-handling mode is @code{raise}, an exception with condition |
| 1512 | type @code{&i/o-decoding} is raised. |
| 1513 | |
| 1514 | If a textual output operation encounters a character it cannot encode, |
| 1515 | and the error-handling mode is @code{ignore}, the character is ignored |
| 1516 | and encoding continues with the next character. If the error-handling |
| 1517 | mode is @code{replace}, a codec-specific replacement character is |
| 1518 | emitted by the transcoder, and encoding continues with the next |
| 1519 | character. The replacement character is U+FFFD for transcoders whose |
| 1520 | codec is one of the Unicode encodings, but is the @code{?} character |
| 1521 | for the Latin-1 encoding. If the error-handling mode is @code{raise}, |
| 1522 | an exception with condition type @code{&i/o-encoding} is raised. |
| 1523 | @end deffn |
| 1524 | |
| 1525 | @deffn {Scheme Procedure} make-transcoder codec |
| 1526 | @deffnx {Scheme Procedure} make-transcoder codec eol-style |
| 1527 | @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode |
| 1528 | |
| 1529 | @var{codec} must be a codec; @var{eol-style}, if present, an eol-style |
| 1530 | symbol; and @var{handling-mode}, if present, an error-handling-mode |
| 1531 | symbol. |
| 1532 | |
| 1533 | @var{eol-style} may be omitted, in which case it defaults to the native |
| 1534 | end-of-line style of the underlying platform. @var{handling-mode} may |
| 1535 | be omitted, in which case it defaults to @code{replace}. The result is |
| 1536 | a transcoder with the behavior specified by its arguments. |
| 1537 | @end deffn |
| 1538 | |
| 1539 | @deffn {Scheme Procedure} native-transcoder |
| 1540 | Returns an implementation-dependent transcoder that represents a |
| 1541 | possibly locale-dependent ``native'' transcoding. |
| 1542 | @end deffn |
| 1543 | |
| 1544 | @deffn {Scheme Procedure} transcoder-codec transcoder |
| 1545 | @deffnx {Scheme Procedure} transcoder-eol-style transcoder |
| 1546 | @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder |
| 1547 | |
| 1548 | These are accessors for transcoder objects; when applied to a |
| 1549 | transcoder returned by @code{make-transcoder}, they return the |
| 1550 | @var{codec}, @var{eol-style}, and @var{handling-mode} arguments, |
| 1551 | respectively. |
| 1552 | @end deffn |
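| | |
| | The following sketch shows how a transcoder might be built and then |
| | inspected with the accessors above.  The variable name @code{tc} is |
| | chosen only for illustration. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | ;; Combine a codec, an end-of-line style, and an error-handling mode. |
| | (define tc |
| |   (make-transcoder (utf-8-codec) |
| |                    (eol-style lf) |
| |                    (error-handling-mode raise))) |
| | |
| | ;; Each accessor returns the corresponding constructor argument. |
| | (eqv? (transcoder-codec tc) (utf-8-codec)) |
| | @result{} #t |
| | (transcoder-eol-style tc) |
| | @result{} lf |
| | (transcoder-error-handling-mode tc) |
| | @result{} raise |
| | @end lisp |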
| 1553 | |
| 1554 | @deffn {Scheme Procedure} bytevector->string bytevector transcoder |
| 1555 | |
| 1556 | Returns the string that results from transcoding the |
| 1557 | @var{bytevector} according to the input direction of the transcoder. |
| 1558 | @end deffn |
| 1559 | |
| 1560 | @deffn {Scheme Procedure} string->bytevector string transcoder |
| 1561 | |
| 1562 | Returns the bytevector that results from transcoding the |
| 1563 | @var{string} according to the output direction of the transcoder. |
| 1564 | @end deffn |
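| | |
| | For instance, an ASCII-only string can be converted to UTF-8 octets and |
| | back again.  This is a minimal sketch; the name @code{tc} is |
| | illustrative, and the end-of-line style and error-handling mode are left |
| | at their defaults. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | (define tc (make-transcoder (utf-8-codec))) |
| | |
| | ;; Output direction: characters to bytes. |
| | (string->bytevector "hello" tc) |
| | @result{} #vu8(104 101 108 108 111) |
| | |
| | ;; Input direction: bytes to characters. |
| | (bytevector->string #vu8(104 101 108 108 111) tc) |
| | @result{} "hello" |
| | @end lisp |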
| 1565 | |
| 1566 | @node R6RS End-of-File |
| 1567 | @subsubsection The End-of-File Object |
| 1568 | |
| 1569 | @cindex EOF |
| 1570 | @cindex end-of-file |
| 1571 | |
| 1572 | The R5RS @code{eof-object?} procedure is provided by the @code{(rnrs io |
| 1573 | ports)} module: |
| 1574 | |
| 1575 | @deffn {Scheme Procedure} eof-object? obj |
| 1576 | @deffnx {C Function} scm_eof_object_p (obj) |
| 1577 | Return true if @var{obj} is the end-of-file (EOF) object. |
| 1578 | @end deffn |
| 1579 | |
| 1580 | In addition, the following procedure is provided: |
| 1581 | |
| 1582 | @deffn {Scheme Procedure} eof-object |
| 1583 | @deffnx {C Function} scm_eof_object () |
| 1584 | Return the end-of-file (EOF) object. |
| 1585 | |
| 1586 | @lisp |
| 1587 | (eof-object? (eof-object)) |
| 1588 | @result{} #t |
| 1589 | @end lisp |
| 1590 | @end deffn |
| 1591 | |
| 1592 | |
| 1593 | @node R6RS Port Manipulation |
| 1594 | @subsubsection Port Manipulation |
| 1595 | |
| 1596 | The procedures listed below operate on any kind of R6RS I/O port. |
| 1597 | |
| 1598 | @deffn {Scheme Procedure} port? obj |
| 1599 | Returns @code{#t} if the argument is a port, and returns @code{#f} |
| 1600 | otherwise. |
| 1601 | @end deffn |
| 1602 | |
| 1603 | @deffn {Scheme Procedure} port-transcoder port |
| 1604 | Returns the transcoder associated with @var{port} if @var{port} is |
| 1605 | textual and has an associated transcoder, and returns @code{#f} if |
| 1606 | @var{port} is binary or does not have an associated transcoder. |
| 1607 | @end deffn |
| 1608 | |
| 1609 | @deffn {Scheme Procedure} binary-port? port |
| 1610 | Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for |
| 1611 | binary data input/output. |
| 1612 | |
| 1613 | Note that internally Guile does not differentiate between binary and |
| 1614 | textual ports, unlike the R6RS. Thus, this procedure returns true when |
| 1615 | @var{port} does not have an associated encoding---i.e., when |
| 1616 | @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports, |
| 1617 | port-encoding}). This is the case for ports returned by R6RS procedures |
| 1618 | such as @code{open-bytevector-input-port} and |
| 1619 | @code{make-custom-binary-output-port}. |
| 1620 | |
| 1621 | However, Guile currently does not prevent use of textual I/O procedures |
| 1622 | such as @code{display} or @code{read-char} with binary ports. Doing so |
| 1623 | ``upgrades'' the port from binary to textual, under the ISO-8859-1 |
| 1624 | encoding. Likewise, Guile does not prevent use of |
| 1625 | @code{set-port-encoding!} on a binary port, which also turns it into a |
| 1626 | ``textual'' port. |
| 1627 | @end deffn |
| 1628 | |
| 1629 | @deffn {Scheme Procedure} textual-port? port |
| 1630 | Always return @code{#t}, as all ports can be used for textual I/O in |
| 1631 | Guile. |
| 1632 | @end deffn |
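| | |
| | The behavior described above can be observed with a bytevector input |
| | port, as in this short sketch (the name @code{port} is illustrative): |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | (define port (open-bytevector-input-port #vu8(1 2 3))) |
| | |
| | (port? port) |
| | @result{} #t |
| | ;; Ports returned by open-bytevector-input-port count as binary ... |
| | (binary-port? port) |
| | @result{} #t |
| | ;; ... yet textual-port? is true for every port in Guile. |
| | (textual-port? port) |
| | @result{} #t |
| | @end lisp |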
| 1633 | |
| 1634 | @deffn {Scheme Procedure} transcoded-port binary-port transcoder |
| 1635 | The @code{transcoded-port} procedure |
| 1636 | returns a new textual port with the specified @var{transcoder}. |
| 1637 | Otherwise the new textual port's state is largely the same as |
| 1638 | that of @var{binary-port}. |
| 1639 | If @var{binary-port} is an input port, the new textual |
| 1640 | port will be an input port and |
| 1641 | will transcode the bytes that have not yet been read from |
| 1642 | @var{binary-port}. |
| 1643 | If @var{binary-port} is an output port, the new textual |
| 1644 | port will be an output port and |
| 1645 | will transcode output characters into bytes that are |
| 1646 | written to the byte sink represented by @var{binary-port}. |
| 1647 | |
| 1648 | As a side effect, however, @code{transcoded-port} |
| 1649 | closes @var{binary-port} in |
| 1650 | a special way that allows the new textual port to continue to |
| 1651 | use the byte source or sink represented by @var{binary-port}, |
| 1652 | even though @var{binary-port} itself is closed and cannot |
| 1653 | be used by the input and output operations described in this |
| 1654 | chapter. |
| 1655 | @end deffn |
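| | |
| | As a sketch of typical use, a binary port holding UTF-8 data can be |
| | wrapped in a textual port and read as characters.  The names @code{bin} |
| | and @code{txt} are illustrative, and @code{string->utf8} comes from |
| | @code{(rnrs bytevectors)}. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports) (rnrs bytevectors)) |
| | |
| | (define bin (open-bytevector-input-port (string->utf8 "hi there"))) |
| | |
| | ;; After this call, only txt should be used; bin is considered closed. |
| | (define txt (transcoded-port bin (make-transcoder (utf-8-codec)))) |
| | |
| | (get-string-all txt) |
| | @result{} "hi there" |
| | @end lisp |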
| 1656 | |
| 1657 | @deffn {Scheme Procedure} port-position port |
| 1658 | If @var{port} supports it (see below), return the offset (an integer) |
| 1659 | indicating where the next octet will be read from/written to in |
| 1660 | @var{port}. If @var{port} does not support this operation, an error |
| 1661 | condition is raised. |
| 1662 | |
| 1663 | This is similar to Guile's @code{seek} procedure with the |
| 1664 | @code{SEEK_CUR} argument (@pxref{Random Access}). |
| 1665 | @end deffn |
| 1666 | |
| 1667 | @deffn {Scheme Procedure} port-has-port-position? port |
| 1668 | Return @code{#t} if @var{port} supports @code{port-position}. |
| 1669 | @end deffn |
| 1670 | |
| 1671 | @deffn {Scheme Procedure} set-port-position! port offset |
| 1672 | If @var{port} supports it (see below), set the position where the next |
| 1673 | octet will be read from/written to @var{port} to @var{offset} (an |
| 1674 | integer). If @var{port} does not support this operation, an error |
| 1675 | condition is raised. |
| 1676 | |
| 1677 | This is similar to Guile's @code{seek} procedure with the |
| 1678 | @code{SEEK_SET} argument (@pxref{Random Access}). |
| 1679 | @end deffn |
| 1680 | |
| 1681 | @deffn {Scheme Procedure} port-has-set-port-position!? port |
| 1682 | Return @code{#t} if @var{port} supports @code{set-port-position!}. |
| 1683 | @end deffn |
| 1684 | |
| 1685 | @deffn {Scheme Procedure} call-with-port port proc |
| 1686 | Call @var{proc}, passing it @var{port} and closing @var{port} upon exit |
| 1687 | of @var{proc}. Return the return values of @var{proc}. |
| 1688 | @end deffn |
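| | |
| | The position procedures and @code{call-with-port} can be combined as in |
| | the sketch below.  It assumes that bytevector input ports support both |
| | @code{port-position} and @code{set-port-position!}. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | (call-with-port (open-bytevector-input-port #vu8(10 20 30 40)) |
| |   (lambda (port) |
| |     (get-u8 port)                       ; consume the first octet |
| |     (let ((pos (port-position port)))   ; offset of the next octet |
| |       (set-port-position! port 3)       ; jump to the last octet |
| |       (list pos (get-u8 port))))) |
| | @result{} (1 40) |
| | @end lisp |
| | |
| | The port is closed when the procedure passed to @code{call-with-port} |
| | returns. |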
| 1689 | |
| 1690 | @node R6RS Input Ports |
| 1691 | @subsubsection Input Ports |
| 1692 | |
| 1693 | @deffn {Scheme Procedure} input-port? obj |
| 1694 | Returns @code{#t} if the argument is an input port (or a combined input |
| 1695 | and output port), and returns @code{#f} otherwise. |
| 1696 | @end deffn |
| 1697 | |
| 1698 | @deffn {Scheme Procedure} port-eof? input-port |
| 1699 | Returns @code{#t} |
| 1700 | if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port) |
| 1701 | or the @code{lookahead-char} procedure (if @var{input-port} is a textual port) |
| 1702 | would return |
| 1703 | the end-of-file object, and @code{#f} otherwise. |
| 1704 | The operation may block indefinitely if no data is available |
| 1705 | but the port cannot be determined to be at end of file. |
| 1706 | @end deffn |
| 1707 | |
| 1708 | @deffn {Scheme Procedure} open-file-input-port filename |
| 1709 | @deffnx {Scheme Procedure} open-file-input-port filename file-options |
| 1710 | @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode |
| 1711 | @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder |
| 1712 | @var{maybe-transcoder} must be either a transcoder or @code{#f}. |
| 1713 | |
| 1714 | The @code{open-file-input-port} procedure returns an |
| 1715 | input port for the named file.  The @var{file-options}, |
| 1716 | @var{buffer-mode}, and @var{maybe-transcoder} arguments are optional. |
| 1717 | |
| 1718 | The @var{file-options} argument, which may determine |
| 1719 | various aspects of the returned port (@pxref{R6RS File Options}), |
| 1720 | defaults to the value of @code{(file-options)}. |
| 1721 | |
| 1722 | The @var{buffer-mode} argument, if supplied, |
| 1723 | must be one of the symbols that name a buffer mode. |
| 1724 | The @var{buffer-mode} argument defaults to @code{block}. |
| 1725 | |
| 1726 | If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated |
| 1727 | with the returned port. |
| 1728 | |
| 1729 | If @var{maybe-transcoder} is @code{#f} or absent, |
| 1730 | the port will be a binary port and will support the |
| 1731 | @code{port-position} and @code{set-port-position!} operations. |
| 1732 | Otherwise the port will be a textual port, and whether it supports |
| 1733 | the @code{port-position} and @code{set-port-position!} operations |
| 1734 | is implementation-dependent (and possibly transcoder-dependent). |
| 1735 | @end deffn |
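| | |
| | The sketch below writes a few octets to a file and reads them back |
| | through a binary port returned by @code{open-file-input-port}.  The file |
| | name is hypothetical and is assumed to lie in a writable directory; |
| | @code{open-file-output-port} and @code{put-bytevector} are the |
| | corresponding output procedures from @code{(rnrs io ports)}. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | ;; Hypothetical path, used only for this illustration. |
| | (define file "/tmp/r6rs-ports-demo.bin") |
| | |
| | ;; no-fail: do not raise an exception if the file already exists. |
| | (call-with-port (open-file-output-port file (file-options no-fail)) |
| |   (lambda (out) |
| |     (put-bytevector out #vu8(1 2 3)))) |
| | |
| | ;; With no transcoder argument, the returned port is binary. |
| | (call-with-port (open-file-input-port file) |
| |   (lambda (in) |
| |     (get-bytevector-all in))) |
| | @result{} #vu8(1 2 3) |
| | @end lisp |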
| 1736 | |
| 1737 | @deffn {Scheme Procedure} standard-input-port |
| 1738 | Returns a fresh binary input port connected to standard input. Whether |
| 1739 | the port supports the @code{port-position} and @code{set-port-position!} |
| 1740 | operations is implementation-dependent. |
| 1741 | @end deffn |
| 1742 | |
| 1743 | @deffn {Scheme Procedure} current-input-port |
| 1744 | This returns a default textual port for input. Normally, this default |
| 1745 | port is associated with standard input, but can be dynamically |
| 1746 | re-assigned using the @code{with-input-from-file} procedure from the |
| 1747 | @code{io simple (6)} library (@pxref{rnrs io simple}). The port may or |
| 1748 | may not have an associated transcoder; if it does, the transcoder is |
| 1749 | implementation-dependent. |
| 1750 | @end deffn |
| 1751 | |
| 1752 | @node R6RS Binary Input |
| 1753 | @subsubsection Binary Input |
| 1754 | |
| 1755 | @cindex binary input |
| 1756 | |
| 1757 | R6RS binary input ports can be created with the procedures described |
| 1758 | below. |
| 1759 | |
| 1760 | @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder] |
| 1761 | @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder) |
| 1762 | Return an input port whose contents are drawn from bytevector @var{bv} |
| 1763 | (@pxref{Bytevectors}). |
| 1764 | |
| 1765 | @c FIXME: Update description when implemented. |
| 1766 | The @var{transcoder} argument is currently not supported. |
| 1767 | @end deffn |
| 1768 | |
| 1769 | @cindex custom binary input ports |
| 1770 | |
| 1771 | @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close |
| 1772 | @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close) |
| 1773 | Return a new custom binary input port@footnote{This is similar in spirit |
| 1774 | to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a |
| 1775 | string) whose input is drained by invoking @var{read!} and passing it a |
| 1776 | bytevector, an index where bytes should be written, and the number of |
| 1777 | bytes to read. The @code{read!} procedure must return an integer |
| 1778 | indicating the number of bytes read, or @code{0} to indicate the |
| 1779 | end-of-file. |
| 1780 | |
| 1781 | Optionally, if @var{get-position} is not @code{#f}, it must be a thunk |
| 1782 | that will be called when @code{port-position} is invoked on the custom |
| 1783 | binary port and should return an integer indicating the position within |
| 1784 | the underlying data stream; if @var{get-position} is @code{#f}, the |
| 1785 | returned port does not support @code{port-position}. |
| 1786 | |
| 1787 | Likewise, if @var{set-position!} is not @code{#f}, it should be a |
| 1788 | one-argument procedure. When @code{set-port-position!} is invoked on the |
| 1789 | custom binary input port, @var{set-position!} is passed an integer |
| 1790 | indicating the position of the next byte to read. |
| 1791 | |
| 1792 | Finally, if @var{close} is not @code{#f}, it must be a thunk. It is |
| 1793 | invoked when the custom binary input port is closed. |
| 1794 | |
| 1795 | Using a custom binary input port, the @code{open-bytevector-input-port} |
| 1796 | procedure could be implemented as follows: |
| 1797 | |
| 1798 | @lisp |
| 1799 | (define (open-bytevector-input-port source) |
| 1800 | (define position 0) |
| 1801 | (define length (bytevector-length source)) |
| 1802 | |
| 1803 | (define (read! bv start count) |
| 1804 | (let ((count (min count (- length position)))) |
| 1805 | (bytevector-copy! source position |
| 1806 | bv start count) |
| 1807 | (set! position (+ position count)) |
| 1808 | count)) |
| 1809 | |
| 1810 | (define (get-position) position) |
| 1811 | |
| 1812 | (define (set-position! new-position) |
| 1813 | (set! position new-position)) |
| 1814 | |
| 1815 | (make-custom-binary-input-port "the port" read! |
| 1816 | get-position |
| 1817 | set-position!)) |
| 1818 | |
| 1819 | (read (open-bytevector-input-port (string->utf8 "hello"))) |
| 1820 | @result{} hello |
| 1821 | @end lisp |
| 1822 | @end deffn |
| 1823 | |
| 1824 | @cindex binary input |
| 1825 | Binary input is achieved using the procedures below: |
| 1826 | |
| 1827 | @deffn {Scheme Procedure} get-u8 port |
| 1828 | @deffnx {C Function} scm_get_u8 (port) |
| 1829 | Return an octet read from @var{port}, a binary input port, blocking as |
| 1830 | necessary, or the end-of-file object. |
| 1831 | @end deffn |
| 1832 | |
| 1833 | @deffn {Scheme Procedure} lookahead-u8 port |
| 1834 | @deffnx {C Function} scm_lookahead_u8 (port) |
| 1835 | Like @code{get-u8} but does not update @var{port}'s position to point |
| 1836 | past the octet. |
| 1837 | @end deffn |
| 1838 | |
| 1839 | @deffn {Scheme Procedure} get-bytevector-n port count |
| 1840 | @deffnx {C Function} scm_get_bytevector_n (port, count) |
| 1841 | Read @var{count} octets from @var{port}, blocking as necessary and |
| 1842 | return a bytevector containing the octets read. If fewer bytes are |
| 1843 | available, a bytevector smaller than @var{count} is returned. |
| 1844 | @end deffn |
| 1845 | |
| 1846 | @deffn {Scheme Procedure} get-bytevector-n! port bv start count |
| 1847 | @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count) |
| 1848 | Read @var{count} bytes from @var{port} and store them in @var{bv} |
| 1849 | starting at index @var{start}. Return either the number of bytes |
| 1850 | actually read or the end-of-file object. |
| 1851 | @end deffn |
| 1852 | |
| 1853 | @deffn {Scheme Procedure} get-bytevector-some port |
| 1854 | @deffnx {C Function} scm_get_bytevector_some (port) |
| 1855 | Read from @var{port}, blocking as necessary, until bytes are available |
| 1856 | or an end-of-file is reached. Return either the end-of-file object or a |
| 1857 | new bytevector containing some of the available bytes (at least one), |
| 1858 | and update the port position to point just past these bytes. |
| 1859 | @end deffn |
| 1860 | |
| 1861 | @deffn {Scheme Procedure} get-bytevector-all port |
| 1862 | @deffnx {C Function} scm_get_bytevector_all (port) |
| 1863 | Read from @var{port}, blocking as necessary, until the end-of-file is |
| 1864 | reached. Return either a new bytevector containing the data read or the |
| 1865 | end-of-file object (if no data were available). |
| 1866 | @end deffn |
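| | |
| | The following sketch exercises several of these procedures on a |
| | bytevector input port; the name @code{port} is illustrative. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports)) |
| | |
| | (define port (open-bytevector-input-port #vu8(1 2 3 4 5))) |
| | |
| | ;; Peek at the next octet without consuming it. |
| | (lookahead-u8 port) |
| | @result{} 1 |
| | (get-u8 port) |
| | @result{} 1 |
| | |
| | ;; Read exactly two octets, then everything that remains. |
| | (get-bytevector-n port 2) |
| | @result{} #vu8(2 3) |
| | (get-bytevector-all port) |
| | @result{} #vu8(4 5) |
| | |
| | ;; The port is now exhausted. |
| | (eof-object? (get-u8 port)) |
| | @result{} #t |
| | @end lisp |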
| 1867 | |
| 1868 | The @code{(ice-9 binary-ports)} module provides the following procedure |
| 1869 | as an extension to @code{(rnrs io ports)}: |
| 1870 | |
| 1871 | @deffn {Scheme Procedure} unget-bytevector port bv [start [count]] |
| 1872 | @deffnx {C Function} scm_unget_bytevector (port, bv, start, count) |
| 1873 | Place the contents of @var{bv} in @var{port}, optionally starting at |
| 1874 | index @var{start} and limiting to @var{count} octets, so that its bytes |
| 1875 | will be read from left-to-right as the next bytes from @var{port} during |
| 1876 | subsequent read operations. If called multiple times, the unread bytes |
| 1877 | will be read again in last-in first-out order. |
| 1878 | @end deffn |
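| | |
| | A minimal sketch of pushing octets back onto a port follows; the name |
| | @code{port} is illustrative. |
| | |
| | @lisp |
| | (use-modules (rnrs io ports) (ice-9 binary-ports)) |
| | |
| | (define port (open-bytevector-input-port #vu8(3 4))) |
| | |
| | ;; Push two octets back; they are read before 3 and 4. |
| | (unget-bytevector port #vu8(1 2)) |
| | |
| | ;; let* fixes the order in which the reads happen. |
| | (let* ((a (get-u8 port)) |
| |        (b (get-u8 port)) |
| |        (c (get-u8 port))) |
| |   (list a b c)) |
| | @result{} (1 2 3) |
| | @end lisp |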
| 1879 | |
| 1880 | @node R6RS Textual Input |
| 1881 | @subsubsection Textual Input |
| 1882 | |
| 1883 | @deffn {Scheme Procedure} get-char textual-input-port |
| 1884 | Reads from @var{textual-input-port}, blocking as necessary, until a |
| 1885 | complete character is available from @var{textual-input-port}, |
| 1886 | or until an end of file is reached. |
| 1887 | |
| 1888 | If a complete character is available before the next end of file, |
| 1889 | @code{get-char} returns that character and updates the input port to |
| 1890 | point past the character. If an end of file is reached before any |
| 1891 | character is read, @code{get-char} returns the end-of-file object. |
| 1892 | @end deffn |
| 1893 | |
| 1894 | @deffn {Scheme Procedure} lookahead-char textual-input-port |
| 1895 | The @code{lookahead-char} procedure is like @code{get-char}, but it does |
| 1896 | not update @var{textual-input-port} to point past the character. |
| 1897 | @end deffn |
| 1898 | |
| 1899 | @deffn {Scheme Procedure} get-string-n textual-input-port count |
| 1900 | |
| 1901 | @var{count} must be an exact, non-negative integer object, representing |
| 1902 | the number of characters to be read. |
| 1903 | |
| 1904 | The @code{get-string-n} procedure reads from @var{textual-input-port}, |
| 1905 | blocking as necessary, until @var{count} characters are available, or |
| 1906 | until an end of file is reached. |
| 1907 | |
| 1908 | If @var{count} characters are available before end of file, |
| 1909 | @code{get-string-n} returns a string consisting of those @var{count} |
| 1910 | characters. If fewer characters are available before an end of file, but |
| 1911 | one or more characters can be read, @code{get-string-n} returns a string |
| 1912 | containing those characters. In either case, the input port is updated |
| 1913 | to point just past the characters read. If no characters can be read |
| 1914 | before an end of file, the end-of-file object is returned. |
| 1915 | @end deffn |
| 1916 | |
| 1917 | @deffn {Scheme Procedure} get-string-n! textual-input-port string start count |
| 1918 | |
| 1919 | @var{start} and @var{count} must be exact, non-negative integer objects, |
| 1920 | with @var{count} representing the number of characters to be read. |
| 1921 | @var{string} must be a string with at least @var{start} + @var{count} |
| 1922 | characters. |
| 1923 | |
| 1924 | The @code{get-string-n!} procedure reads from @var{textual-input-port} |
| 1925 | in the same manner as @code{get-string-n}. If @var{count} characters |
| 1926 | are available before an end of file, they are written into @var{string} |
| 1927 | starting at index @var{start}, and @var{count} is returned. If fewer |
| 1928 | characters are available before an end of file, but one or more can be |
| 1929 | read, those characters are written into @var{string} starting at index |
| 1930 | @var{start} and the number of characters actually read is returned as an |
| 1931 | exact integer object. If no characters can be read before an end of |
| 1932 | file, the end-of-file object is returned. |
| 1933 | @end deffn |
| 1934 | |
| 1935 | @deffn {Scheme Procedure} get-string-all textual-input-port |
| 1936 | Reads from @var{textual-input-port} until an end of file, decoding |
| 1937 | characters in the same manner as @code{get-string-n} and |
| 1938 | @code{get-string-n!}. |
| 1939 | |
| 1940 | If characters are available before the end of file, a string containing |
| 1941 | all the characters decoded from that data is returned.  If no character |
| 1942 | precedes the end of file, the end-of-file object is returned. |
| 1943 | @end deffn |
| 1944 | |
| 1945 | @deffn {Scheme Procedure} get-line textual-input-port |
| 1946 | Reads from @var{textual-input-port} up to and including the linefeed |
| 1947 | character or end of file, decoding characters in the same manner as |
| 1948 | @code{get-string-n} and @code{get-string-n!}. |
| 1949 | |
| 1950 | If a linefeed character is read, a string containing all of the text up |
| 1951 | to (but not including) the linefeed character is returned, and the port |
| 1952 | is updated to point just past the linefeed character. If an end of file |
| 1953 | is encountered before any linefeed character is read, but some |
| 1954 | characters have been read and decoded, a string containing |
| 1955 | those characters is returned. If an end of file is encountered before |
| 1956 | any characters are read, the end-of-file object is returned. |
| 1957 | |
| 1958 | @quotation Note |
| 1959 | The end-of-line style, if not @code{none}, will cause all line endings |
| 1960 | to be read as linefeed characters. @xref{R6RS Transcoders}. |
| 1961 | @end quotation |
| 1962 | @end deffn |
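As an illustration of the three cases described for @code{get-line}, the following sketch reads from a string port whose last line has no trailing linefeed. The port and variable names are assumptions made for the example.

```scheme
(use-modules (rnrs io ports))

;; The final line is not terminated by a linefeed.
(define line-port (open-string-input-port "first\nsecond"))

;; A linefeed is read: the text before it is returned.
(define line-1 (get-line line-port))

;; End of file is reached after some characters: they are returned.
(define line-2 (get-line line-port))

;; End of file is reached before any character: the end-of-file
;; object is returned.
(define line-3 (get-line line-port))
```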
| 1963 | |
| 1964 | @deffn {Scheme Procedure} get-datum textual-input-port |
| 1965 | Reads an external representation from @var{textual-input-port} and returns the |
| 1966 | datum it represents. The @code{get-datum} procedure returns the next |
| 1967 | datum that can be parsed from the given @var{textual-input-port}, updating |
| 1968 | @var{textual-input-port} to point exactly past the end of the external |
| 1969 | representation of the object. |
| 1970 | |
| 1971 | Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme |
| 1972 | Syntax}) in the input is first skipped. If an end of file occurs after |
| 1973 | the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File}) |
| 1974 | is returned. |
| 1975 | |
| 1976 | If a character inconsistent with an external representation is |
| 1977 | encountered in the input, an exception with condition types |
| 1978 | @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of |
| 1979 | file is encountered after the beginning of an external representation, |
| 1980 | but the external representation is incomplete and therefore cannot be |
| 1981 | parsed, an exception with condition types @code{&lexical} and |
| 1982 | @code{&i/o-read} is raised. |
| 1983 | @end deffn |
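A small sketch of @code{get-datum} on a string port follows. It assumes that the input holds two data followed only by interlexeme space (whitespace and a comment), so the third call reaches the end of file.

```scheme
(use-modules (rnrs io ports))

(define datum-port
  (open-string-input-port "(1 2) foo ; trailing comment\n"))

;; Each call parses exactly one external representation.
(define datum-1 (get-datum datum-port))   ; the list (1 2)
(define datum-2 (get-datum datum-port))   ; the symbol foo

;; Only a comment and whitespace remain, so the end-of-file object
;; is returned.
(define datum-3 (get-datum datum-port))
```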
| 1984 | |
| 1985 | @node R6RS Output Ports |
| 1986 | @subsubsection Output Ports |
| 1987 | |
| 1988 | @deffn {Scheme Procedure} output-port? obj |
| 1989 | Returns @code{#t} if the argument is an output port (or a |
| 1990 | combined input and output port), @code{#f} otherwise. |
| 1991 | @end deffn |
| 1992 | |
| 1993 | @deffn {Scheme Procedure} flush-output-port port |
| 1994 | Flushes any buffered output from the buffer of @var{port} to the |
| 1995 | underlying file, device, or object. The @code{flush-output-port} |
| 1996 | procedure returns an unspecified value. |
| 1997 | @end deffn |
| 1998 | |
| 1999 | @deffn {Scheme Procedure} open-file-output-port filename |
| 2000 | @deffnx {Scheme Procedure} open-file-output-port filename file-options |
| 2001 | @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode |
| 2002 | @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder |
| 2003 | |
| 2004 | @var{maybe-transcoder} must be either a transcoder or @code{#f}. |
| 2005 | |
| 2006 | The @code{open-file-output-port} procedure returns an output port for the named file. |
| 2007 | |
| 2008 | The @var{file-options} argument, which may determine various aspects of |
| 2009 | the returned port (@pxref{R6RS File Options}), defaults to the value of |
| 2010 | @code{(file-options)}. |
| 2011 | |
| 2012 | The @var{buffer-mode} argument, if supplied, |
| 2013 | must be one of the symbols that name a buffer mode. |
| 2014 | The @var{buffer-mode} argument defaults to @code{block}. |
| 2015 | |
| 2016 | If @var{maybe-transcoder} is a transcoder, it becomes the transcoder |
| 2017 | associated with the port. |
| 2018 | |
| 2019 | If @var{maybe-transcoder} is @code{#f} or absent, |
| 2020 | the port will be a binary port and will support the |
| 2021 | @code{port-position} and @code{set-port-position!} operations. |
| 2022 | Otherwise the port will be a textual port, and whether it supports |
| 2023 | the @code{port-position} and @code{set-port-position!} operations |
| 2024 | is implementation-dependent (and possibly transcoder-dependent). |
| 2025 | @end deffn |
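The following sketch opens a binary file output port, writes a few octets, and reads them back. The file name is a hypothetical placeholder, and the @code{no-fail} file option is used on the assumption that the file may already exist from an earlier run.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical path, chosen only for this example.
(define example-file "/tmp/guile-r6rs-example.bin")

;; No transcoder is given, so the result is a binary port.
(define out (open-file-output-port example-file (file-options no-fail)))
(put-bytevector out #vu8(1 2 3))
(close-port out)

;; Read the octets back to check what was written.
(define in (open-file-input-port example-file))
(define file-contents (get-bytevector-all in))
(close-port in)
```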
| 2026 | |
| 2027 | @deffn {Scheme Procedure} standard-output-port |
| 2028 | @deffnx {Scheme Procedure} standard-error-port |
| 2029 | Returns a fresh binary output port connected to the standard output or |
| 2030 | standard error respectively. Whether the port supports the |
| 2031 | @code{port-position} and @code{set-port-position!} operations is |
| 2032 | implementation-dependent. |
| 2033 | @end deffn |
| 2034 | |
| 2035 | @deffn {Scheme Procedure} current-output-port |
| 2036 | @deffnx {Scheme Procedure} current-error-port |
| 2037 | These return default textual ports for regular output and error output. |
| 2038 | Normally, these default ports are associated with standard output and |
| 2039 | standard error, respectively. The return value of |
| 2040 | @code{current-output-port} can be dynamically re-assigned using the |
| 2041 | @code{with-output-to-file} procedure from the @code{(rnrs io simple (6))} |
| 2042 | library (@pxref{rnrs io simple}). A port returned by one of these |
| 2043 | procedures may or may not have an associated transcoder; if it does, the |
| 2044 | transcoder is implementation-dependent. |
| 2045 | @end deffn |
| 2046 | |
| 2047 | @node R6RS Binary Output |
| 2048 | @subsubsection Binary Output |
| 2049 | |
| 2050 | Binary output ports can be created with the procedures below. |
| 2051 | |
| 2052 | @deffn {Scheme Procedure} open-bytevector-output-port [transcoder] |
| 2053 | @deffnx {C Function} scm_open_bytevector_output_port (transcoder) |
| 2054 | Return two values: a binary output port and a procedure. The latter |
| 2055 | should be called with zero arguments to obtain a bytevector containing |
| 2056 | the data accumulated by the port, as illustrated below. |
| 2057 | |
| 2058 | @lisp |
| 2059 | (call-with-values |
| 2060 | (lambda () |
| 2061 | (open-bytevector-output-port)) |
| 2062 | (lambda (port get-bytevector) |
| 2063 | (display "hello" port) |
| 2064 | (get-bytevector))) |
| 2065 | |
| 2066 | @result{} #vu8(104 101 108 108 111) |
| 2067 | @end lisp |
| 2068 | |
| 2069 | @c FIXME: Update description when implemented. |
| 2070 | The @var{transcoder} argument is currently not supported. |
| 2071 | @end deffn |
| 2072 | |
| 2073 | @cindex custom binary output ports |
| 2074 | |
| 2075 | @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close |
| 2076 | @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close) |
| 2077 | Return a new custom binary output port named @var{id} (a string) whose |
| 2078 | output is sunk by invoking @var{write!} and passing it a bytevector, an |
| 2079 | index where bytes should be read from this bytevector, and the number of |
| 2080 | bytes to be ``written''. The @code{write!} procedure must return an |
| 2081 | integer indicating the number of bytes actually written; when it is |
| 2082 | passed @code{0} as the number of bytes to write, it should behave as |
| 2083 | though an end-of-file was sent to the byte sink. |
| 2084 | |
| 2085 | The other arguments are as for @code{make-custom-binary-input-port} |
| 2086 | (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}). |
| 2087 | @end deffn |
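As a sketch of the @var{write!} protocol, the port below collects every octet it receives in a list. The names @code{collected} and @code{collect!} are illustrative, and @code{#f} is passed for the position and close procedures on the assumption that they are not needed here. The port is flushed before the list is inspected, since output may be buffered.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Octets received so far, most recent first.
(define collected '())

;; Consume COUNT octets of BV starting at index START and report
;; how many were consumed.  A COUNT of zero consumes nothing.
(define (collect! bv start count)
  (let loop ((i 0))
    (when (< i count)
      (set! collected
            (cons (bytevector-u8-ref bv (+ start i)) collected))
      (loop (+ i 1))))
  count)

(define sink (make-custom-binary-output-port "collector" collect! #f #f #f))

(put-bytevector sink #vu8(1 2 3))
(put-u8 sink 4)
(flush-output-port sink)

;; The octets in the order they were written.
(define collected-in-order (reverse collected))
```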
| 2088 | |
| 2089 | @cindex binary output |
| 2090 | Writing to a binary output port can be done using the following |
| 2091 | procedures: |
| 2092 | |
| 2093 | @deffn {Scheme Procedure} put-u8 port octet |
| 2094 | @deffnx {C Function} scm_put_u8 (port, octet) |
| 2095 | Write @var{octet}, an integer in the 0--255 range, to @var{port}, a |
| 2096 | binary output port. |
| 2097 | @end deffn |
| 2098 | |
| 2099 | @deffn {Scheme Procedure} put-bytevector port bv [start [count]] |
| 2100 | @deffnx {C Function} scm_put_bytevector (port, bv, start, count) |
| 2101 | Write the contents of @var{bv} to @var{port}, optionally starting at |
| 2102 | index @var{start} and limiting to @var{count} octets. |
| 2103 | @end deffn |
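To illustrate the optional @var{start} and @var{count} arguments, this sketch writes a slice of a bytevector to a bytevector output port and retrieves the result.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

(define slice
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytevector)
      ;; Write 3 octets of the source, starting at index 1.
      (put-bytevector port #vu8(10 20 30 40 50) 1 3)
      (get-bytevector))))
```

Here @code{slice} holds the octets 20, 30 and 40.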
| 2104 | |
| 2105 | @node R6RS Textual Output |
| 2106 | @subsubsection Textual Output |
| 2107 | |
| 2108 | @deffn {Scheme Procedure} put-char port char |
| 2109 | Writes @var{char} to the port. The @code{put-char} procedure returns |
| 2110 | an unspecified value. |
| 2111 | @end deffn |
| 2112 | |
| 2113 | @deffn {Scheme Procedure} put-string port string |
| 2114 | @deffnx {Scheme Procedure} put-string port string start |
| 2115 | @deffnx {Scheme Procedure} put-string port string start count |
| 2116 | |
| 2117 | @var{start} and @var{count} must be non-negative exact integer objects. |
| 2118 | @var{string} must have a length of at least @math{@var{start} + |
| 2119 | @var{count}}. @var{start} defaults to 0. @var{count} defaults to |
| 2120 | @math{@code{(string-length @var{string})} - @var{start}}. The |
| 2121 | @code{put-string} procedure writes the @var{count} characters of |
| 2122 | @var{string} starting at index @var{start} to the port. The |
| 2123 | @code{put-string} procedure returns an unspecified value. |
| 2124 | @end deffn |
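The effect of @var{start} and @var{count} can be sketched with a string output port. Guile's @code{call-with-output-string} is used here only as a convenient way to capture what was written.

```scheme
(use-modules (rnrs io ports))

;; Write 5 characters of the string, starting at index 7.
(define written
  (call-with-output-string
    (lambda (port)
      (put-string port "hello, world" 7 5))))

;; With only START given, COUNT defaults to the rest of the string.
(define written-rest
  (call-with-output-string
    (lambda (port)
      (put-string port "hello, world" 7))))
```

Both @code{written} and @code{written-rest} are the string @code{"world"}.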
| 2125 | |
| 2126 | @deffn {Scheme Procedure} put-datum textual-output-port datum |
| 2127 | @var{datum} should be a datum value. The @code{put-datum} procedure |
| 2128 | writes an external representation of @var{datum} to |
| 2129 | @var{textual-output-port}. The specific external representation is |
| 2130 | implementation-dependent. However, whenever possible, an implementation |
| 2131 | should produce a representation for which @code{get-datum}, when reading |
| 2132 | the representation, will return an object equal (in the sense of |
| 2133 | @code{equal?}) to @var{datum}. |
| 2134 | |
| 2135 | @quotation Note |
| 2136 | Not every datum has an external representation from which |
| 2137 | @code{get-datum} can produce an object that is equal to the |
| 2138 | original. Specifically, NaNs contained in @var{datum} may make |
| 2139 | this impossible. |
| 2140 | @end quotation |
| 2141 | |
| 2142 | @quotation Note |
| 2143 | The @code{put-datum} procedure merely writes the external |
| 2144 | representation, but no trailing delimiter. If @code{put-datum} is |
| 2145 | used to write several subsequent external representations to an |
| 2146 | output port, care should be taken to delimit them properly so they can |
| 2147 | be read back in by subsequent calls to @code{get-datum}. |
| 2148 | @end quotation |
| 2149 | @end deffn |
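The need for explicit delimiters can be shown with a short sketch. It writes the numbers 1 and 2 twice, once back to back and once separated by a space, and reads the text back with @code{get-datum}. The variable names are illustrative.

```scheme
(use-modules (rnrs io ports))

;; Without a delimiter the two representations run together.
(define undelimited
  (call-with-output-string
    (lambda (port)
      (put-datum port 1)
      (put-datum port 2))))

;; With a space in between they stay separate.
(define delimited
  (call-with-output-string
    (lambda (port)
      (put-datum port 1)
      (put-char port #\space)
      (put-datum port 2))))

;; Reading back: the first text yields the single number 12, the
;; second yields 1 followed by 2.
(define read-undelimited (get-datum (open-string-input-port undelimited)))
(define delimited-port (open-string-input-port delimited))
(define read-first (get-datum delimited-port))
(define read-second (get-datum delimited-port))
```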
| 2150 | |
| 2151 | @node I/O Extensions |
| 2152 | @subsection Using and Extending Ports in C |
| 2153 | |
| 2154 | @menu |
| 2155 | * C Port Interface:: Using ports from C. |
| 2156 | * Port Implementation:: How to implement a new port type in C. |
| 2157 | @end menu |
| 2158 | |
| 2159 | |
| 2160 | @node C Port Interface |
| 2161 | @subsubsection C Port Interface |
| 2162 | @cindex C port interface |
| 2163 | @cindex Port, C interface |
| 2164 | |
| 2165 | This section describes how to use Scheme ports from C. |
| 2166 | |
| 2167 | @subsubheading Port basics |
| 2168 | |
| 2169 | @cindex ptob |
| 2170 | @tindex scm_ptob_descriptor |
| 2171 | @tindex scm_port |
| 2172 | @findex SCM_PTAB_ENTRY |
| 2173 | @findex SCM_PTOBNUM |
| 2174 | @vindex scm_ptobs |
| 2175 | There are two main data structures. A port type object (ptob) is of |
| 2176 | type @code{scm_ptob_descriptor}. A port instance is of type |
| 2177 | @code{scm_port}. Given an @code{SCM} variable which points to a port, |
| 2178 | the corresponding C port object can be obtained using the |
| 2179 | @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using |
| 2180 | @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs} |
| 2181 | global array. |
| 2182 | |
| 2183 | @subsubheading Port buffers |
| 2184 | |
| 2185 | An input port always has a read buffer and an output port always has a |
| 2186 | write buffer. However, the size of these buffers is not guaranteed to be |
| 2187 | more than one byte (e.g., the @code{shortbuf} field in @code{scm_port} |
| 2188 | which is used when no other buffer is allocated). The way in which the |
| 2189 | buffers are allocated depends on the implementation of the ptob. For |
| 2190 | example, in the case of an fport, buffers may be allocated with malloc |
| 2191 | when the port is created, but in the case of an strport the underlying |
| 2192 | string is used as the buffer. |
| 2193 | |
| 2194 | @subsubheading The @code{rw_random} flag |
| 2195 | |
| 2196 | Special treatment is required for ports which can be seeked at random. |
| 2197 | Before various operations, such as seeking the port or changing from |
| 2198 | input to output on a bidirectional port or vice versa, the port |
| 2199 | implementation must be given a chance to update its state. The write |
| 2200 | buffer is updated by calling the @code{flush} ptob procedure and the |
| 2201 | input buffer is updated by calling the @code{end_input} ptob procedure. |
| 2202 | In the case of an fport, @code{flush} causes buffered output to be |
| 2203 | written to the file descriptor, while @code{end_input} causes the |
| 2204 | descriptor position to be adjusted to account for buffered input which |
| 2205 | was never read. |
| 2206 | |
| 2207 | The special treatment must be performed if the @code{rw_random} flag in |
| 2208 | the port is non-zero. |
| 2209 | |
| 2210 | @subsubheading The @code{rw_active} variable |
| 2211 | |
| 2212 | The @code{rw_active} variable in the port is only used if |
| 2213 | @code{rw_random} is set. It's defined as an enum with the following |
| 2214 | values: |
| 2215 | |
| 2216 | @table @code |
| 2217 | @item SCM_PORT_READ |
| 2218 | the read buffer may have unread data. |
| 2219 | |
| 2220 | @item SCM_PORT_WRITE |
| 2221 | the write buffer may have unwritten data. |
| 2222 | |
| 2223 | @item SCM_PORT_NEITHER |
| 2224 | neither the write nor the read buffer has data. |
| 2225 | @end table |
| 2226 | |
| 2227 | @subsubheading Reading from a port |
| 2228 | |
| 2229 | To read from a port, it's possible to either call existing libguile |
| 2230 | procedures such as @code{scm_getc} and @code{scm_read_line} or to read |
| 2231 | data from the read buffer directly. Reading from the buffer involves |
| 2232 | the following steps: |
| 2233 | |
| 2234 | @enumerate |
| 2235 | @item |
| 2236 | Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}. |
| 2237 | |
| 2238 | @item |
| 2239 | Fill the read buffer, if it's empty, using @code{scm_fill_input}. |
| 2240 | |
| 2241 | @item Read the data from the buffer and update the read position in |
| 2242 | the buffer. Steps 2 and 3 may be repeated as many times as required. |
| 2243 | |
| 2244 | @item Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set. |
| 2245 | |
| 2246 | @item Update the port's line and column counts. |
| 2247 | @end enumerate |
| 2248 | |
| 2249 | @subsubheading Writing to a port |
| 2250 | |
| 2251 | To write data to a port, calling @code{scm_lfwrite} should be sufficient for |
| 2252 | most purposes. This takes care of the following steps: |
| 2253 | |
| 2254 | @enumerate |
| 2255 | @item |
| 2256 | End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}. |
| 2257 | |
| 2258 | @item |
| 2259 | Pass the data to the ptob implementation using the @code{write} ptob |
| 2260 | procedure. The advantage of using the ptob @code{write} instead of |
| 2261 | manipulating the write buffer directly is that it allows the data to be |
| 2262 | written in one operation even if the port is using the single-byte |
| 2263 | @code{shortbuf}. |
| 2264 | |
| 2265 | @item |
| 2266 | Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random} |
| 2267 | is set. |
| 2268 | @end enumerate |
| 2269 | |
| 2270 | |
| 2271 | @node Port Implementation |
| 2272 | @subsubsection Port Implementation |
| 2273 | @cindex Port implementation |
| 2274 | |
| 2275 | This section describes how to implement a new port type in C. |
| 2276 | |
| 2277 | As described in the previous section, a port type object (ptob) is |
| 2278 | a structure of type @code{scm_ptob_descriptor}. A ptob is created by |
| 2279 | calling @code{scm_make_port_type}. |
| 2280 | |
| 2281 | @deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size)) |
| 2282 | Return a new port type object. The @var{name}, @var{fill_input} and |
| 2283 | @var{write} parameters are initial values for those port type fields, |
| 2284 | as described below. The other fields are initialized with default |
| 2285 | values and can be changed later. |
| 2286 | @end deftypefun |
| 2287 | |
| 2288 | All of the elements of the ptob, apart from @code{name}, are procedures |
| 2289 | which collectively implement the port behaviour. Creating a new port |
| 2290 | type mostly involves writing these procedures. |
| 2291 | |
| 2292 | @table @code |
| 2293 | @item name |
| 2294 | A pointer to a NUL terminated string: the name of the port type. This |
| 2295 | is the only element of @code{scm_ptob_descriptor} which is not |
| 2296 | a procedure. Set via the first argument to @code{scm_make_port_type}. |
| 2297 | |
| 2298 | @item mark |
| 2299 | Called during garbage collection to mark any SCM objects that a port |
| 2300 | object may contain. It doesn't need to be set unless the port has |
| 2301 | @code{SCM} components. Set using |
| 2302 | |
| 2303 | @deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port)) |
| 2304 | @end deftypefun |
| 2305 | |
| 2306 | @item free |
| 2307 | Called when the port is collected during gc. It |
| 2308 | should free any resources used by the port. |
| 2309 | Set using |
| 2310 | |
| 2311 | @deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port)) |
| 2312 | @end deftypefun |
| 2313 | |
| 2314 | @item print |
| 2315 | Called when @code{write} is called on the port object, to print a |
| 2316 | port description. E.g., for an fport it may produce something like: |
| 2317 | @code{#<input: /etc/passwd 3>}. Set using |
| 2318 | |
| 2319 | @deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate)) |
| 2320 | The first argument @var{port} is the object being printed, the second |
| 2321 | argument @var{dest_port} is where its description should go. |
| 2322 | @end deftypefun |
| 2323 | |
| 2324 | @item equalp |
| 2325 | Not used at present. Set using |
| 2326 | |
| 2327 | @deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM)) |
| 2328 | @end deftypefun |
| 2329 | |
| 2330 | @item close |
| 2331 | Called when the port is closed, unless it was collected during gc. It |
| 2332 | should free any resources used by the port. |
| 2333 | Set using |
| 2334 | |
| 2335 | @deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port)) |
| 2336 | @end deftypefun |
| 2337 | |
| 2338 | @item write |
| 2339 | Accept data which is to be written using the port. The port implementation |
| 2340 | may choose to buffer the data instead of processing it directly. |
| 2341 | Set via the third argument to @code{scm_make_port_type}. |
| 2342 | |
| 2343 | @item flush |
| 2344 | Complete the processing of buffered output data. Reset the value of |
| 2345 | @code{rw_active} to @code{SCM_PORT_NEITHER}. |
| 2346 | Set using |
| 2347 | |
| 2348 | @deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port)) |
| 2349 | @end deftypefun |
| 2350 | |
| 2351 | @item end_input |
| 2352 | Perform any synchronization required when switching from input to output |
| 2353 | on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}. |
| 2354 | Set using |
| 2355 | |
| 2356 | @deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset)) |
| 2357 | @end deftypefun |
| 2358 | |
| 2359 | @item fill_input |
| 2360 | Read new data into the read buffer and return the first character. It |
| 2361 | can be assumed that the read buffer is empty when this procedure is called. |
| 2362 | Set via the second argument to @code{scm_make_port_type}. |
| 2363 | |
| 2364 | @item input_waiting |
| 2365 | Return a lower bound on the number of bytes that could be read from the |
| 2366 | port without blocking. It can be assumed that the current state of |
| 2367 | @code{rw_active} is @code{SCM_PORT_NEITHER}. |
| 2368 | Set using |
| 2369 | |
| 2370 | @deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port)) |
| 2371 | @end deftypefun |
| 2372 | |
| 2373 | @item seek |
| 2374 | Set the current position of the port. The procedure cannot make |
| 2375 | any assumptions about the value of @code{rw_active} when it's |
| 2376 | called. It can reset the buffers first if desired by using something |
| 2377 | like: |
| 2378 | |
| 2379 | @example |
| 2380 | if (pt->rw_active == SCM_PORT_READ) |
| 2381 | scm_end_input (port); |
| 2382 | else if (pt->rw_active == SCM_PORT_WRITE) |
| 2383 | ptob->flush (port); |
| 2384 | @end example |
| 2385 | |
| 2386 | However note that this will have the side effect of discarding any data |
| 2387 | in the unread-char buffer, in addition to any side effects from the |
| 2388 | @code{end_input} and @code{flush} ptob procedures. This is undesirable |
| 2389 | when seek is called to measure the current position of the port, i.e., |
| 2390 | @code{(seek p 0 SEEK_CUR)}. The libguile fport and string port |
| 2391 | implementations take care to avoid this problem. |
| 2392 | |
| 2393 | The procedure is set using |
| 2394 | |
| 2395 | @deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence)) |
| 2396 | @end deftypefun |
| 2397 | |
| 2398 | @item truncate |
| 2399 | Truncate the port data to the specified length. It can be assumed that the |
| 2400 | current state of @code{rw_active} is @code{SCM_PORT_NEITHER}. |
| 2401 | Set using |
| 2402 | |
| 2403 | @deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length)) |
| 2404 | @end deftypefun |
| 2405 | |
| 2406 | @end table |
| 2407 | |
| 2408 | @node BOM Handling |
| 2409 | @subsection Handling of Unicode Byte Order Marks |
| 2410 | @cindex BOM |
| 2411 | @cindex byte order mark |
| 2412 | |
| 2413 | This section documents the finer points of Guile's handling of Unicode |
| 2414 | byte order marks (BOMs). A byte order mark (U+FEFF) is typically found |
| 2415 | at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably |
| 2416 | determine the byte order. Occasionally, a BOM is found at the start of |
| 2417 | a UTF-8 stream, but this is much less common and not generally |
| 2418 | recommended. |
| 2419 | |
| 2420 | Guile attempts to handle BOMs automatically, and in accordance with the |
| 2421 | recommendations of the Unicode Standard, when the port encoding is set |
| 2422 | to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile |
| 2423 | automatically writes a BOM at the start of a UTF-16 or UTF-32 stream, |
| 2424 | and automatically consumes one from the start of a UTF-8, UTF-16, or |
| 2425 | UTF-32 stream. |
| 2426 | |
| 2427 | As specified in the Unicode Standard, a BOM is only handled specially at |
| 2428 | the start of a stream, and only if the port encoding is set to |
| 2429 | @code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is |
| 2430 | set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or |
| 2431 | @code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of |
| 2432 | the special handling described in this section applies. |
| 2433 | |
| 2434 | @itemize @bullet |
| 2435 | @item |
| 2436 | To ensure that Guile will properly detect the byte order of a UTF-16 or |
| 2437 | UTF-32 stream, you must perform a textual read before any writes, seeks, |
| 2438 | or binary I/O. Guile will not attempt to read a BOM unless a read is |
| 2439 | explicitly requested at the start of the stream. |
| 2440 | |
| 2441 | @item |
| 2442 | If a textual write is performed before the first read, then an arbitrary |
| 2443 | byte order will be chosen. Currently, big endian is the default on all |
| 2444 | platforms, but that may change in the future. If you wish to explicitly |
| 2445 | control the byte order of an output stream, set the port encoding to |
| 2446 | @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE}, |
| 2447 | and explicitly write a BOM (@code{#\xFEFF}) if desired. |
| 2448 | |
| 2449 | @item |
| 2450 | If @code{set-port-encoding!} is called in the middle of a stream, Guile |
| 2451 | treats this as a new logical ``start of stream'' for purposes of BOM |
| 2452 | handling, and will forget about any BOMs that had previously been seen. |
| 2453 | Therefore, it may choose a different byte order than had been used |
| 2454 | previously. This is intended to support multiple logical text streams |
| 2455 | embedded within a larger binary stream. |
| 2456 | |
| 2457 | @item |
| 2458 | Binary I/O operations are not guaranteed to update Guile's notion of |
| 2459 | whether the port is at the ``start of the stream'', nor are they |
| 2460 | guaranteed to produce or consume BOMs. |
| 2461 | |
| 2462 | @item |
| 2463 | For ports that support seeking (e.g., normal files), the input and output |
| 2464 | streams are considered linked: if the user reads first, then a BOM will |
| 2465 | be consumed (if appropriate), but later writes will @emph{not} produce a |
| 2466 | BOM. Similarly, if the user writes first, then later reads will |
| 2467 | @emph{not} consume a BOM. |
| 2468 | |
| 2469 | @item |
| 2470 | For ports that do not support seeking (e.g., pipes, sockets, and |
| 2471 | terminals), the input and output streams are considered |
| 2472 | @emph{independent} for purposes of BOM handling: the first read will |
| 2473 | consume a BOM (if appropriate), and the first write will @emph{also} |
| 2474 | produce a BOM (if appropriate). However, the input and output streams |
| 2475 | will always use the same byte order. |
| 2476 | |
| 2477 | @item |
| 2478 | Seeks to the beginning of a file will set the ``start of stream'' flags. |
| 2479 | Therefore, a subsequent textual read or write will consume or produce a |
| 2480 | BOM. However, unlike @code{set-port-encoding!}, if a byte order had |
| 2481 | already been chosen for the port, it will remain in effect after a seek, |
| 2482 | and cannot be changed by the presence of a BOM. Seeks anywhere other |
| 2483 | than the beginning of a file clear the ``start of stream'' flags. |
| 2484 | @end itemize |
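The difference between the @code{UTF-16} and @code{UTF-16BE} encodings on output can be sketched as follows. The helper @code{encode-with} is hypothetical and exists only for this example; it writes a string through a bytevector output port whose encoding has been set with @code{set-port-encoding!}. Since the default byte order for @code{UTF-16} may change, the sketch makes no assumption about the exact octets produced in that case.

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical helper: encode TEXT through a bytevector port whose
;; encoding is ENCODING, and return the resulting octets.
(define (encode-with text encoding)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytevector)
      (set-port-encoding! port encoding)
      (display text port)
      (get-bytevector))))

;; UTF-16: a BOM is written automatically before the first character.
(define with-bom (encode-with "hi" "UTF-16"))

;; UTF-16BE: no BOM handling takes place.
(define without-bom (encode-with "hi" "UTF-16BE"))

;; UTF-16BE with the BOM written explicitly as the first character.
(define explicit-bom
  (encode-with (string #\xFEFF #\h #\i) "UTF-16BE"))
```

With @code{UTF-16BE}, @code{without-bom} holds just the four octets of the two characters, while @code{explicit-bom} is preceded by the octets 254 and 255. With @code{UTF-16}, the result is two octets longer than @code{without-bom} because of the automatically written BOM.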
| 2485 | |
| 2486 | @c Local Variables: |
| 2487 | @c TeX-master: "guile.texi" |
| 2488 | @c End: |