1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
4 @c 2010, 2011, 2013 Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
6
7 @node Input and Output
8 @section Input and Output
9
10 @menu
11 * Ports:: The idea of the port abstraction.
12 * Reading:: Procedures for reading from a port.
13 * Writing:: Procedures for writing to a port.
14 * Closing:: Procedures to close a port.
15 * Random Access:: Moving around a random access port.
16 * Line/Delimited:: Read and write lines or delimited text.
17 * Block Reading and Writing:: Reading and writing blocks of text.
18 * Default Ports:: Defaults for input, output and errors.
19 * Port Types:: Types of port and how to make them.
20 * R6RS I/O Ports:: The R6RS port API.
21 * I/O Extensions:: Using and extending ports in C.
22 * BOM Handling:: Handling of Unicode byte order marks.
23 @end menu
24
25
26 @node Ports
27 @subsection Ports
28 @cindex Port
29
30 Sequential input/output in Scheme is represented by operations on a
31 @dfn{port}. This chapter explains the operations that Guile provides
32 for working with ports.
33
34 Ports are created by opening, for instance @code{open-file} for a file
35 (@pxref{File Ports}). Characters can be read from an input port and
36 written to an output port, or both on an input/output port. A port
37 can be closed (@pxref{Closing}) when no longer required, after which
38 any attempt to read or write is an error.
39
40 The formal definition of a port is very generic: an input port is
41 simply ``an object which can deliver characters on demand,'' and an
42 output port is ``an object which can accept characters.'' Because
43 this definition is so loose, it is easy to write functions that
44 simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
45 are two interesting and powerful examples of this technique.
46 (@pxref{Soft Ports}, and @ref{String Ports}.)
47
48 Ports are garbage collected in the usual way (@pxref{Memory
49 Management}), and will be closed at that time if not already closed.
50 In this case any errors occurring in the close will not be reported.
51 Usually a program will want to explicitly close so as to be sure all
52 its operations have been successful. Of course if a program has
53 abandoned something due to an error or other condition then closing
54 problems are probably not of interest.
55
56 It is strongly recommended that file ports be closed explicitly when
57 no longer required. Most systems have limits on how many files can be
58 open, both on a per-process and a system-wide basis. A program that
59 uses many files should take care not to hit those limits. The same
60 applies to similar system resources such as pipes and sockets.
61
62 Note that automatic garbage collection is triggered only by memory
63 consumption, not by file or other resource usage, so a program cannot
64 rely on that to keep it away from system limits. An explicit call to
65 @code{gc} can of course be relied on to pick up unreferenced ports.
66 If program flow makes it hard to be certain when to close then this
67 may be an acceptable way to control resource usage.
68
69 All file access uses the ``LFS'' large file support functions when
70 available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
71 read and written on a 32-bit system.
72
73 Each port has an associated character encoding that controls how bytes
74 read from the port are converted to characters and strings and controls
75 how characters and strings written to the port are converted to bytes.
76 When ports are created, they inherit their character encoding from the
77 current locale, but that can be modified after the port is created.
78
79 Currently, the ports only work with @emph{non-modal} encodings. Most
80 encodings are non-modal, meaning that the conversion of bytes to a
81 string doesn't depend on its context: the same byte sequence will always
82 return the same string. A couple of modal encodings are in common use,
83 like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
84
85 Each port also has an associated conversion strategy: what to do when
86 a Guile character can't be converted to the port's encoded character
87 representation for output. There are three possible strategies: to
88 raise an error, to replace the character with a hex escape, or to
89 replace the character with a substitute character.
90
91 @rnindex input-port?
92 @deffn {Scheme Procedure} input-port? x
93 @deffnx {C Function} scm_input_port_p (x)
94 Return @code{#t} if @var{x} is an input port, otherwise return
95 @code{#f}. Any object satisfying this predicate also satisfies
96 @code{port?}.
97 @end deffn
98
99 @rnindex output-port?
100 @deffn {Scheme Procedure} output-port? x
101 @deffnx {C Function} scm_output_port_p (x)
102 Return @code{#t} if @var{x} is an output port, otherwise return
103 @code{#f}. Any object satisfying this predicate also satisfies
104 @code{port?}.
105 @end deffn
106
107 @deffn {Scheme Procedure} port? x
108 @deffnx {C Function} scm_port_p (x)
109 Return a boolean indicating whether @var{x} is a port.
110 Equivalent to @code{(or (input-port? @var{x}) (output-port?
111 @var{x}))}.
112 @end deffn
113
114 @deffn {Scheme Procedure} set-port-encoding! port enc
115 @deffnx {C Function} scm_set_port_encoding_x (port, enc)
116 Sets the character encoding that will be used to interpret all port I/O.
117 @var{enc} is a string containing the name of an encoding. Valid
118 encoding names are those
119 @url{http://www.iana.org/assignments/character-sets, defined by IANA}.
120 @end deffn
121
122 @defvr {Scheme Variable} %default-port-encoding
123 A fluid containing @code{#f} or the name of the encoding to
124 be used by default for newly created ports (@pxref{Fluids and Dynamic
125 States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
126
127 New ports are created with the encoding appropriate for the current
128 locale if @code{setlocale} has been called or the value specified by
129 this fluid otherwise.
130 @end defvr
131
132 @deffn {Scheme Procedure} port-encoding port
133 @deffnx {C Function} scm_port_encoding (port)
134 Returns, as a string, the character encoding that @var{port} uses to interpret
135 its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
136 @end deffn
137
138 @deffn {Scheme Procedure} set-port-conversion-strategy! port sym
139 @deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
140 Sets the behavior of the interpreter when outputting a character that
141 is not representable in the port's current encoding. @var{sym} can be
142 either @code{'error}, @code{'substitute}, or @code{'escape}. If it is
143 @code{'error}, an error will be thrown when a nonconvertible character
144 is encountered. If it is @code{'substitute}, then nonconvertible
145 characters will be replaced with approximate characters, or with
146 question marks if no approximately correct character is available. If
147 it is @code{'escape}, the character will appear as a hex escape when output.
148
149 If @var{port} is an open port, the conversion error behavior
150 is set for that port. If it is @code{#f}, it is set as the
151 default behavior for any future ports that get created in
152 this thread.
153 @end deffn
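
As an illustration of the encoding and conversion-strategy procedures
above, the following minimal sketch configures a string port and reads
the settings back. The names @code{p}, @code{enc} and @code{strategy}
are only illustrative.

@lisp
;; Create an in-memory output port and choose how it encodes text.
(define p (open-output-string))
(set-port-encoding! p "ISO-8859-1")
(set-port-conversion-strategy! p 'escape)

;; Read the settings back from the port.
(define enc (port-encoding p))
(define strategy (port-conversion-strategy p))
@end lisp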
154
155 @deffn {Scheme Procedure} port-conversion-strategy port
156 @deffnx {C Function} scm_port_conversion_strategy (port)
157 Returns the behavior of the port when outputting a character that is
158 not representable in the port's current encoding. It returns the
159 symbol @code{error} if unrepresentable characters should cause
160 exceptions, @code{substitute} if the port should try to replace
161 unrepresentable characters with question marks or approximate
162 characters, or @code{escape} if unrepresentable characters should be
163 converted to string escapes.
164
165 If @var{port} is @code{#f}, then the current default behavior will be
166 returned. New ports will have this default behavior when they are
167 created.
168 @end deffn
169
170 @deffn {Scheme Variable} %default-port-conversion-strategy
171 The fluid that defines the conversion strategy for newly created ports,
172 and for other conversion routines such as @code{scm_to_stringn},
173 @code{scm_from_stringn}, @code{string->pointer}, and
174 @code{pointer->string}.
175
176 Its value must be one of the symbols described above, with the same
177 semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
178
179 When Guile starts, its value is @code{'substitute}.
180
181 Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
182 equivalent to @code{(fluid-set! %default-port-conversion-strategy
183 @var{sym})}.
184 @end deffn
185
186
187 @node Reading
188 @subsection Reading
189 @cindex Reading
190
191 [Generic procedures for reading from ports.]
192
193 These procedures pertain to reading characters and strings from
194 ports. To read general S-expressions from ports, @xref{Scheme Read}.
195
196 @rnindex eof-object?
197 @cindex End of file object
198 @deffn {Scheme Procedure} eof-object? x
199 @deffnx {C Function} scm_eof_object_p (x)
200 Return @code{#t} if @var{x} is an end-of-file object; otherwise
201 return @code{#f}.
202 @end deffn
203
204 @rnindex char-ready?
205 @deffn {Scheme Procedure} char-ready? [port]
206 @deffnx {C Function} scm_char_ready_p (port)
207 Return @code{#t} if a character is ready on input @var{port}
208 and return @code{#f} otherwise. If @code{char-ready?} returns
209 @code{#t} then the next @code{read-char} operation on
210 @var{port} is guaranteed not to hang. If @var{port} is a file
211 port at end of file then @code{char-ready?} returns @code{#t}.
212
213 @code{char-ready?} exists to make it possible for a
214 program to accept characters from interactive ports without
215 getting stuck waiting for input. Any input editors associated
216 with such ports must make sure that characters whose existence
217 has been asserted by @code{char-ready?} cannot be rubbed out.
218 If @code{char-ready?} were to return @code{#f} at end of file,
219 a port at end of file would be indistinguishable from an
220 interactive port that has no ready characters.
221 @end deffn
222
223 @rnindex read-char
224 @deffn {Scheme Procedure} read-char [port]
225 @deffnx {C Function} scm_read_char (port)
226 Return the next character available from @var{port}, updating
227 @var{port} to point to the following character. If no more
228 characters are available, the end-of-file object is returned.
229
230 When @var{port}'s data cannot be decoded according to its
231 character encoding, a @code{decoding-error} is raised and
232 @var{port} points past the erroneous byte sequence.
233 @end deffn
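
A typical use of @code{read-char} together with @code{eof-object?} is a
loop that consumes a port one character at a time. The sketch below
assumes a string port as input; @code{port->char-list} is a hypothetical
helper, not part of Guile.

@lisp
;; Collect every character from PORT into a list, stopping at end of file.
(define (port->char-list port)
  (let loop ((c (read-char port)) (acc '()))
    (if (eof-object? c)
        (reverse acc)
        (loop (read-char port) (cons c acc)))))

(define chars (call-with-input-string "abc" port->char-list))
@end lisp

Here @code{chars} is the list of the three characters in the string.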
234
235 @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
236 Read up to @var{size} bytes from @var{port} and store them in
237 @var{buffer}. The return value is the number of bytes actually read,
238 which can be less than @var{size} if end-of-file has been reached.
239
240 Note that this function does not update @code{port-line} and
241 @code{port-column} below.
242 @end deftypefn
243
244 @rnindex peek-char
245 @deffn {Scheme Procedure} peek-char [port]
246 @deffnx {C Function} scm_peek_char (port)
247 Return the next character available from @var{port},
248 @emph{without} updating @var{port} to point to the following
249 character. If no more characters are available, the
250 end-of-file object is returned.
251
252 The value returned by
253 a call to @code{peek-char} is the same as the value that would
254 have been returned by a call to @code{read-char} on the same
255 port. The only difference is that the very next call to
256 @code{read-char} or @code{peek-char} on that @var{port} will
257 return the value returned by the preceding call to
258 @code{peek-char}. In particular, a call to @code{peek-char} on
259 an interactive port will hang waiting for input whenever a call
260 to @code{read-char} would have hung.
261
262 As for @code{read-char}, a @code{decoding-error} may be raised
263 if such a situation occurs. However, unlike with @code{read-char},
264 @var{port} still points at the beginning of the erroneous byte
265 sequence when the error is raised.
266 @end deffn
267
268 @deffn {Scheme Procedure} unread-char cobj [port]
269 @deffnx {C Function} scm_unread_char (cobj, port)
270 Place character @var{cobj} in @var{port} so that it will be read by the
271 next read operation. If called multiple times, the unread characters
272 will be read again in last-in first-out order. If @var{port} is
273 not supplied, the current input port is used.
274 @end deffn
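
The relationship between @code{peek-char}, @code{read-char} and
@code{unread-char} can be sketched as follows, assuming a string port;
all variable names are illustrative.

@lisp
(define p (open-input-string "xy"))

(define peeked (peek-char p))  ; look at the next character, do not consume it
(define c1 (read-char p))      ; consume that same character
(unread-char c1 p)             ; push it back onto the port
(define c2 (read-char p))      ; read it a second time
(define c3 (read-char p))      ; move on to the following character
@end lisp

After this, @code{peeked}, @code{c1} and @code{c2} are all @code{#\x},
and @code{c3} is @code{#\y}.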
275
276 @deffn {Scheme Procedure} unread-string str port
277 @deffnx {C Function} scm_unread_string (str, port)
278 Place the string @var{str} in @var{port} so that its characters will
279 be read from left-to-right as the next characters from @var{port}
280 during subsequent read operations. If called multiple times, the
281 unread characters will be read again in last-in first-out order. If
282 @var{port} is not supplied, the @code{current-input-port} is used.
283 @end deffn
284
285 @deffn {Scheme Procedure} drain-input port
286 @deffnx {C Function} scm_drain_input (port)
287 This procedure clears a port's input buffers, similar
288 to the way that @code{force-output} clears the output buffer. The
289 contents of the buffers are returned as a single string, e.g.,
290
291 @lisp
292 (define p (open-input-file ...))
293 (drain-input p) => empty string, nothing buffered yet.
294 (unread-char (read-char p) p)
295 (drain-input p) => initial chars from p, up to the buffer size.
296 @end lisp
297
298 Draining the buffers may be useful for cleanly finishing
299 buffered I/O so that the file descriptor can be used directly
300 for further input.
301 @end deffn
302
303 @deffn {Scheme Procedure} port-column port
304 @deffnx {Scheme Procedure} port-line port
305 @deffnx {C Function} scm_port_column (port)
306 @deffnx {C Function} scm_port_line (port)
307 Return the current column number or line number of @var{port}.
308 If the number is
309 unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer
310 - i.e.@: the first character of the first line is line 0, column 0.
311 (However, when you display a file position, for example in an error
312 message, we recommend you add 1 to get 1-origin integers. This is
313 because lines and column numbers traditionally start with 1, and that is
314 what non-programmers will find most natural.)
315 @end deffn
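
The recommendation to display 1-origin positions could be followed with
a small helper such as the one below. This is only a sketch;
@code{port-position-1-origin} is a hypothetical name, not a Guile
procedure.

@lisp
;; Return the position of PORT as a 1-origin (line . column) pair,
;; or #f when either number is unknown.
(define (port-position-1-origin port)
  (let ((line (port-line port))
        (column (port-column port)))
    (and line column
         (cons (+ line 1) (+ column 1)))))

;; Read four characters ("a", "b", newline, "c") and ask where we are.
(define pos
  (call-with-input-string "ab\ncd"
    (lambda (port)
      (read-char port)
      (read-char port)
      (read-char port)
      (read-char port)
      (port-position-1-origin port))))
@end lisp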
316
317 @deffn {Scheme Procedure} set-port-column! port column
318 @deffnx {Scheme Procedure} set-port-line! port line
319 @deffnx {C Function} scm_set_port_column_x (port, column)
320 @deffnx {C Function} scm_set_port_line_x (port, line)
321 Set the current column or line number of @var{port}.
322 @end deffn
323
324 @node Writing
325 @subsection Writing
326 @cindex Writing
327
328 [Generic procedures for writing to ports.]
329
330 These procedures are for writing characters and strings to
331 ports. For more information on writing arbitrary Scheme objects to
332 ports, see @ref{Scheme Write}.
333
334 @deffn {Scheme Procedure} get-print-state port
335 @deffnx {C Function} scm_get_print_state (port)
336 Return the print state of the port @var{port}. If @var{port}
337 has no associated print state, @code{#f} is returned.
338 @end deffn
339
340 @rnindex newline
341 @deffn {Scheme Procedure} newline [port]
342 @deffnx {C Function} scm_newline (port)
343 Send a newline to @var{port}.
344 If @var{port} is omitted, send to the current output port.
345 @end deffn
346
347 @deffn {Scheme Procedure} port-with-print-state port [pstate]
348 @deffnx {C Function} scm_port_with_print_state (port, pstate)
349 Create a new port which behaves like @var{port}, but with an
350 included print state @var{pstate}. @var{pstate} is optional.
351 If @var{pstate} isn't supplied and @var{port} already has
352 a print state, the old print state is reused.
353 @end deffn
354
355 @deffn {Scheme Procedure} simple-format destination message . args
356 @deffnx {C Function} scm_simple_format (destination, message, args)
357 Write @var{message} to @var{destination}, defaulting to
358 the current output port.
359 @var{message} can contain @code{~A} (was @code{%s}) and
360 @code{~S} (was @code{%S}) escapes. When printed,
361 the escapes are replaced with corresponding members of
362 @var{args}:
363 @code{~A} formats using @code{display} and @code{~S} formats
364 using @code{write}.
365 If @var{destination} is @code{#t}, then use the current output
366 port, if @var{destination} is @code{#f}, then return a string
367 containing the formatted text. Does not add a trailing newline.
368 @end deffn
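
For instance, the difference between the @code{~A} and @code{~S}
escapes shows up when the same string is formatted with both. Passing
@code{#f} as the destination returns the text instead of printing it;
the variable name @code{shown} is illustrative.

@lisp
;; ~A uses display (no quotes), ~S uses write (with quotes).
(define shown (simple-format #f "~A and ~S" "text" "text"))
@end lisp

Here @code{shown} is the string @code{text and "text"}.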
369
370 @rnindex write-char
371 @deffn {Scheme Procedure} write-char chr [port]
372 @deffnx {C Function} scm_write_char (chr, port)
373 Send character @var{chr} to @var{port}.
374 @end deffn
375
376 @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
377 Write @var{size} bytes at @var{buffer} to @var{port}.
378
379 Note that this function does not update @code{port-line} and
380 @code{port-column} (@pxref{Reading}).
381 @end deftypefn
382
383 @findex fflush
384 @deffn {Scheme Procedure} force-output [port]
385 @deffnx {C Function} scm_force_output (port)
386 Flush the specified output port, or the current output port if @var{port}
387 is omitted. The current output buffer contents are passed to the
388 underlying port implementation (e.g., in the case of fports, the
389 data will be written to the file and the output buffer will be cleared).
390 It has no effect on an unbuffered port.
391
392 The return value is unspecified.
393 @end deffn
394
395 @deffn {Scheme Procedure} flush-all-ports
396 @deffnx {C Function} scm_flush_all_ports ()
397 Equivalent to calling @code{force-output} on
398 all open output ports. The return value is unspecified.
399 @end deffn
400
401
402 @node Closing
403 @subsection Closing
404 @cindex Closing ports
405 @cindex Port, close
406
407 @deffn {Scheme Procedure} close-port port
408 @deffnx {C Function} scm_close_port (port)
409 Close the specified port object. Return @code{#t} if it
410 successfully closes a port or @code{#f} if it was already
411 closed. An exception may be raised if an error occurs, for
412 example when flushing buffered output. See also @ref{Ports and
413 File Descriptors, close}, for a procedure which can close file
414 descriptors.
415 @end deffn
416
417 @deffn {Scheme Procedure} close-input-port port
418 @deffnx {Scheme Procedure} close-output-port port
419 @deffnx {C Function} scm_close_input_port (port)
420 @deffnx {C Function} scm_close_output_port (port)
421 @rnindex close-input-port
422 @rnindex close-output-port
423 Close the specified input or output @var{port}. An exception may be
424 raised if an error occurs while closing. If @var{port} is already
425 closed, nothing is done. The return value is unspecified.
426
427 See also @ref{Ports and File Descriptors, close}, for a procedure
428 which can close file descriptors.
429 @end deffn
430
431 @deffn {Scheme Procedure} port-closed? port
432 @deffnx {C Function} scm_port_closed_p (port)
433 Return @code{#t} if @var{port} is closed or @code{#f} if it is
434 open.
435 @end deffn
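
The return values described above can be observed on a string port.
This is a minimal sketch with illustrative variable names.

@lisp
(define p (open-input-string "data"))

(define open-before (port-closed? p))   ; the port starts out open
(define first-close (close-port p))     ; closes the port
(define second-close (close-port p))    ; the port was already closed
(define closed-after (port-closed? p))
@end lisp

Following the descriptions above, @code{first-close} and
@code{closed-after} are @code{#t}, while @code{open-before} and
@code{second-close} are @code{#f}.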
436
437
438 @node Random Access
439 @subsection Random Access
440 @cindex Random access, ports
441 @cindex Port, random access
442
443 @deffn {Scheme Procedure} seek fd_port offset whence
444 @deffnx {C Function} scm_seek (fd_port, offset, whence)
445 Sets the current position of @var{fd_port} to the integer
446 @var{offset}, which is interpreted according to the value of
447 @var{whence}.
448
449 One of the following variables should be supplied for
450 @var{whence}:
451 @defvar SEEK_SET
452 Seek from the beginning of the file.
453 @end defvar
454 @defvar SEEK_CUR
455 Seek from the current position.
456 @end defvar
457 @defvar SEEK_END
458 Seek from the end of the file.
459 @end defvar
460 If @var{fd_port} is a file descriptor, the underlying system
461 call is @code{lseek}. @var{fd_port} may be a string port.
462
463 The value returned is the new position in the file. This means
464 that the current position of a port can be obtained using:
465 @lisp
466 (seek port 0 SEEK_CUR)
467 @end lisp
468 @end deffn
469
470 @deffn {Scheme Procedure} ftell fd_port
471 @deffnx {C Function} scm_ftell (fd_port)
472 Return an integer representing the current position of
473 @var{fd_port}, measured from the beginning. Equivalent to:
474
475 @lisp
476 (seek fd_port 0 SEEK_CUR)
477 @end lisp
478 @end deffn
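
Since a string port may be used with @code{seek}, positioning can be
tried without touching the file system. The sketch below assumes an
ASCII-only string, so that characters and bytes coincide; the variable
names are illustrative.

@lisp
(define p (open-input-string "hello"))

(seek p 2 SEEK_SET)                   ; skip the first two characters
(define c (read-char p))              ; read the third character
(define pos (ftell p))                ; position after that read
(define same (seek p 0 SEEK_CUR))     ; the same position, via seek
@end lisp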
479
480 @findex truncate
481 @findex ftruncate
482 @deffn {Scheme Procedure} truncate-file file [length]
483 @deffnx {C Function} scm_truncate_file (file, length)
484 Truncate @var{file} to @var{length} bytes. @var{file} can be a
485 filename string, a port object, or an integer file descriptor. The
486 return value is unspecified.
487
488 For a port or file descriptor @var{length} can be omitted, in which
489 case the file is truncated at the current position (per @code{ftell}
490 above).
491
492 On most systems a file can be extended by giving a length greater than
493 the current size, but this is not mandatory in the POSIX standard.
494 @end deffn
495
496 @node Line/Delimited
497 @subsection Line Oriented and Delimited Text
498 @cindex Line input/output
499 @cindex Port, line input/output
500
501 The delimited-I/O module can be accessed with:
502
503 @lisp
504 (use-modules (ice-9 rdelim))
505 @end lisp
506
507 It can be used to read or write lines of text, or read text delimited by
508 a specified set of characters. It's similar to the @code{(scsh rdelim)}
509 module from guile-scsh, but does not use multiple values or character
510 sets and has an extra procedure @code{write-line}.
511
512 @c begin (scm-doc-string "rdelim.scm" "read-line")
513 @deffn {Scheme Procedure} read-line [port] [handle-delim]
514 Return a line of text from @var{port} if specified, otherwise from the
515 value returned by @code{(current-input-port)}. Under Unix, a line of text
516 is terminated by the first end-of-line character or by end-of-file.
517
518 If @var{handle-delim} is specified, it should be one of the following
519 symbols:
520 @table @code
521 @item trim
522 Discard the terminating delimiter. This is the default, but it will
523 be impossible to tell whether the read terminated with a delimiter or
524 end-of-file.
525 @item concat
526 Append the terminating delimiter (if any) to the returned string.
527 @item peek
528 Push the terminating delimiter (if any) back on to the port.
529 @item split
530 Return a pair containing the string read from the port and the
531 terminating delimiter or end-of-file object.
532 @end table
533
534 Like @code{read-char}, this procedure can throw to @code{decoding-error}
535 (@pxref{Reading, @code{read-char}}).
536 @end deffn
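
The effect of the @var{handle-delim} argument can be seen by reading
the same kind of input in different modes. This is a minimal sketch
using a string port; the names @code{results}, @code{a}, @code{b} and
@code{c} are illustrative.

@lisp
(use-modules (ice-9 rdelim))

(define results
  (call-with-input-string "one\ntwo"
    (lambda (port)
      (let* ((a (read-line port 'split))    ; pair of line and delimiter
             (b (read-line port 'concat))   ; last line, ended by end-of-file
             (c (read-line port)))          ; nothing left to read
        (list a b c)))))
@end lisp

The first element of @code{results} is a pair of @code{"one"} and the
newline character, the second is @code{"two"} with nothing appended
because the line ended at end-of-file, and the third is the end-of-file
object.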
537
538 @c begin (scm-doc-string "rdelim.scm" "read-line!")
539 @deffn {Scheme Procedure} read-line! buf [port]
540 Read a line of text into the supplied string @var{buf} and return the
541 number of characters added to @var{buf}. If @var{buf} is filled, then
542 @code{#f} is returned.
543 Read from @var{port} if
544 specified, otherwise from the value returned by @code{(current-input-port)}.
545 @end deffn
546
547 @c begin (scm-doc-string "rdelim.scm" "read-delimited")
548 @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
549 Read text until one of the characters in the string @var{delims} is found
550 or end-of-file is reached. Read from @var{port} if supplied, otherwise
551 from the value returned by @code{(current-input-port)}.
552 @var{handle-delim} takes the same values as described for @code{read-line}.
553 @end deffn
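
As a sketch of how @code{read-delimited} might be used, the following
splits input into fields on either a comma or a semicolon, relying on
the default @code{trim} behavior. The name @code{fields} is
illustrative.

@lisp
(use-modules (ice-9 rdelim))

;; Read fields until the end-of-file object is returned.
(define fields
  (call-with-input-string "a,b;c"
    (lambda (port)
      (let loop ((field (read-delimited ",;" port)) (acc '()))
        (if (eof-object? field)
            (reverse acc)
            (loop (read-delimited ",;" port) (cons field acc)))))))
@end lisp

Here @code{fields} is the list @code{("a" "b" "c")}.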
554
555 @c begin (scm-doc-string "rdelim.scm" "read-delimited!")
556 @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
557 Read text into the supplied string @var{buf}.
558
559 If a delimiter was found, return the number of characters written,
560 except if @var{handle-delim} is @code{split}, in which case the return
561 value is a pair, as noted above.
562
563 As a special case, if @var{port} was already at end-of-stream, the EOF
564 object is returned. Also, if no characters were written because the
565 buffer was full, @code{#f} is returned.
566
567 It's something of a wacky interface, to be honest.
568 @end deffn
569
570 @deffn {Scheme Procedure} write-line obj [port]
571 @deffnx {C Function} scm_write_line (obj, port)
572 Display @var{obj} and a newline character to @var{port}. If
573 @var{port} is not specified, @code{(current-output-port)} is
574 used. This function is equivalent to:
575 @lisp
576 (display obj [port])
577 (newline [port])
578 @end lisp
579 @end deffn
580
581 In the past, Guile did not have a procedure that would just read out all
582 of the characters from a port. As a workaround, many people just called
583 @code{read-delimited} with no delimiters, knowing that would produce the
584 behavior they wanted. This prompted Guile developers to add some
585 routines that would read all characters from a port. So it is that
586 @code{(ice-9 rdelim)} is also the home for procedures that can read
587 undelimited text:
588
589 @deffn {Scheme Procedure} read-string [port] [count]
590 Read all of the characters out of @var{port} and return them as a
591 string. If the @var{count} is present, treat it as a limit to the
592 number of characters to read.
593
594 By default, read from the current input port, with no size limit on the
595 result. This procedure always returns a string, even if no characters
596 were read.
597 @end deffn
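
For example, @var{count} can be used to read a fixed-size prefix before
reading the remainder of the port. This is a minimal sketch with
illustrative names.

@lisp
(use-modules (ice-9 rdelim))

(define parts
  (call-with-input-string "abcdef"
    (lambda (port)
      (let* ((head (read-string port 2))   ; at most two characters
             (tail (read-string port)))    ; everything that is left
        (list head tail)))))
@end lisp

Here @code{parts} is the list @code{("ab" "cdef")}.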
598
599 @deffn {Scheme Procedure} read-string! buf [port] [start] [end]
600 Fill @var{buf} with characters read from @var{port}, defaulting to the
601 current input port. Return the number of characters read.
602
603 If @var{start} or @var{end} are specified, store data only into the
604 substring of @var{buf} bounded by @var{start} and @var{end} (which
605 default to the beginning and end of the string, respectively).
606 @end deffn
607
608 Some of the aforementioned I/O functions rely on the following C
609 primitives. These will mainly be of interest to people hacking Guile
610 internals.
611
612 @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
613 @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
614 Read characters from @var{port} into @var{str} until one of the
615 characters in the @var{delims} string is encountered. If
616 @var{gobble} is true, discard the delimiter character;
617 otherwise, leave it in the input stream for the next read. If
618 @var{port} is not specified, use the value of
619 @code{(current-input-port)}. If @var{start} or @var{end} are
620 specified, store data only into the substring of @var{str}
621 bounded by @var{start} and @var{end} (which default to the
622 beginning and end of the string, respectively).
623
624 Return a pair consisting of the delimiter that terminated the
625 string and the number of characters read. If reading stopped
626 at the end of file, the delimiter returned is the
627 @var{eof-object}; if the string was filled without encountering
628 a delimiter, this value is @code{#f}.
629 @end deffn
630
631 @deffn {Scheme Procedure} %read-line [port]
632 @deffnx {C Function} scm_read_line (port)
633 Read a newline-terminated line from @var{port}, allocating storage as
634 necessary. The newline terminator (if any) is removed from the string,
635 and a pair consisting of the line and its delimiter is returned. The
636 delimiter may be either a newline or the @var{eof-object}; if
637 @code{%read-line} is called at the end of file, it returns the pair
638 @code{(#<eof> . #<eof>)}.
639 @end deffn
640
641 @node Block Reading and Writing
642 @subsection Block reading and writing
643 @cindex Block read/write
644 @cindex Port, block read/write
645
646 The Block-string-I/O module can be accessed with:
647
648 @lisp
649 (use-modules (ice-9 rw))
650 @end lisp
651
652 It currently contains procedures that help to implement the
653 @code{(scsh rw)} module in guile-scsh.
654
655 @deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
656 @deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
657 Read characters from a port or file descriptor into a
658 string @var{str}. A port must have an underlying file
659 descriptor --- a so-called fport. This procedure is
660 scsh-compatible and can efficiently read large strings.
661 It will:
662
663 @itemize
664 @item
665 attempt to fill the entire string, unless the @var{start}
666 and/or @var{end} arguments are supplied. i.e., @var{start}
667 defaults to 0 and @var{end} defaults to
668 @code{(string-length str)}
669 @item
670 use the current input port if @var{port_or_fdes} is not
671 supplied.
672 @item
673 return fewer than the requested number of characters in some
674 cases, e.g., on end of file, if interrupted by a signal, or if
675 not all the characters are immediately available.
676 @item
677 wait indefinitely for some input if no characters are
678 currently available,
679 unless the port is in non-blocking mode.
680 @item
681 read characters from the port's input buffers if available,
682 instead of from the underlying file descriptor.
683 @item
684 return @code{#f} if end-of-file is encountered before reading
685 any characters, otherwise return the number of characters
686 read.
687 @item
688 return 0 if the port is in non-blocking mode and no characters
689 are immediately available.
690 @item
691 return 0 if the request is for 0 bytes, with no
692 end-of-file check.
693 @end itemize
694 @end deffn
695
696 @deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
697 @deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
698 Write characters from a string @var{str} to a port or file
699 descriptor. A port must have an underlying file descriptor
700 --- a so-called fport. This procedure is
701 scsh-compatible and can efficiently write large strings.
702 It will:
703
704 @itemize
705 @item
706 attempt to write the entire string, unless the @var{start}
707 and/or @var{end} arguments are supplied. i.e., @var{start}
708 defaults to 0 and @var{end} defaults to
709 @code{(string-length str)}
710 @item
711 use the current output port if @var{port_or_fdes} is not
712 supplied.
713 @item
714 in the case of a buffered port, store the characters in the
715 port's output buffer, if all will fit. If they will not fit
716 then any existing buffered characters will be flushed
717 before attempting
718 to write the new characters directly to the underlying file
719 descriptor. If the port is in non-blocking mode and
720 buffered characters can not be flushed immediately, then an
721 @code{EAGAIN} system-error exception will be raised (Note:
722 scsh does not support the use of non-blocking buffered ports.)
723 @item
724 write fewer than the requested number of
725 characters in some cases, e.g., if interrupted by a signal or
726 if not all of the output can be accepted immediately.
727 @item
728 wait indefinitely for at least one character
729 from @var{str} to be accepted by the port, unless the port is
730 in non-blocking mode.
731 @item
732 return the number of characters accepted by the port.
733 @item
734 return 0 if the port is in non-blocking mode and can not accept
735 at least one character from @var{str} immediately.
736 @item
737 return 0 immediately if the request size is 0 bytes.
738 @end itemize
739 @end deffn
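
Because @code{write-string/partial} may accept fewer characters than
requested, a caller that needs the whole string written has to retry
with an updated @var{start}. The following is only a sketch of such a
loop: @code{write-string/all} is a hypothetical helper, and it assumes
a blocking port, since on a non-blocking port a return value of 0 would
make the loop spin.

@lisp
(use-modules (ice-9 rw))

;; Write all of STR to PORT, retrying after partial writes.
;; Returns the number of characters written.
(define (write-string/all str port)
  (let loop ((start 0))
    (if (< start (string-length str))
        (loop (+ start (write-string/partial str port start)))
        start)))
@end lisp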
740
741 @node Default Ports
742 @subsection Default Ports for Input, Output and Errors
743 @cindex Default ports
744 @cindex Port, default
745
746 @rnindex current-input-port
747 @deffn {Scheme Procedure} current-input-port
748 @deffnx {C Function} scm_current_input_port ()
749 @cindex standard input
750 Return the current input port. This is the default port used
751 by many input procedures.
752
753 Initially this is the @dfn{standard input} in Unix and C terminology.
754 When the standard input is a tty the port is unbuffered, otherwise
755 it's fully buffered.
756
757 Unbuffered input is good if an application runs an interactive
758 subprocess, since any type-ahead input won't go into Guile's buffer
759 and be unavailable to the subprocess.
760
761 Note that Guile buffering is completely separate from the tty ``line
762 discipline''. In the usual cooked mode on a tty Guile only sees a
763 line of input once the user presses @key{Return}.
764 @end deffn
765
766 @rnindex current-output-port
767 @deffn {Scheme Procedure} current-output-port
768 @deffnx {C Function} scm_current_output_port ()
769 @cindex standard output
770 Return the current output port. This is the default port used
771 by many output procedures.
772
773 Initially this is the @dfn{standard output} in Unix and C terminology.
774 When the standard output is a tty this port is unbuffered, otherwise
775 it's fully buffered.
776
777 Unbuffered output to a tty is good for ensuring progress output or a
778 prompt is seen. But an application which always prints whole lines
779 could change to line buffered, or an application with a lot of output
780 could go fully buffered and perhaps make explicit @code{force-output}
781 calls (@pxref{Writing}) at selected points.
782 @end deffn
783
784 @deffn {Scheme Procedure} current-error-port
785 @deffnx {C Function} scm_current_error_port ()
786 @cindex standard error output
787 Return the port to which errors and warnings should be sent.
788
789 Initially this is the @dfn{standard error} in Unix and C terminology.
790 When the standard error is a tty this port is unbuffered, otherwise
791 it's fully buffered.
792 @end deffn
793
794 @deffn {Scheme Procedure} set-current-input-port port
795 @deffnx {Scheme Procedure} set-current-output-port port
796 @deffnx {Scheme Procedure} set-current-error-port port
797 @deffnx {C Function} scm_set_current_input_port (port)
798 @deffnx {C Function} scm_set_current_output_port (port)
799 @deffnx {C Function} scm_set_current_error_port (port)
800 Change the ports returned by @code{current-input-port},
801 @code{current-output-port} and @code{current-error-port}, respectively,
802 so that they use the supplied @var{port} for input or output.
803 @end deffn
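
The following sketch, offered only as an illustration, redirects the
current output port to a string port (@pxref{String Ports}) and then
restores the saved port by hand. The names @code{saved} and
@code{sink} are hypothetical.

@lisp
(define saved (current-output-port))
(define sink (open-output-string))

(set-current-output-port sink)     ; output now accumulates in SINK
(display "captured")
(set-current-output-port saved)    ; restore the original port

(get-output-string sink) @result{} "captured"
@end lisp

Nothing here restores the port if an error escapes between the two
@code{set-current-output-port} calls. Code that may exit non-locally
is usually better served by procedures such as
@code{with-output-to-file} (@pxref{File Ports}), which manage the
setting with @code{dynamic-wind}.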
804
805 @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
806 @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
807 @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
808 These functions must be used inside a pair of calls to
809 @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
810 Wind}). During the dynwind context, the indicated port is set to
811 @var{port}.
812
813 More precisely, the current port is swapped with a `backup' value
814 whenever the dynwind context is entered or left. The backup value is
815 initialized with the @var{port} argument.
816 @end deftypefn
817
818 @node Port Types
819 @subsection Types of Port
820 @cindex Types of ports
821 @cindex Port, types
822
Guile provides several types of port. The following sections describe
each type and how to create it.
824
825 @menu
826 * File Ports:: Ports on an operating system file.
827 * String Ports:: Ports on a Scheme string.
828 * Soft Ports:: Ports on arbitrary Scheme procedures.
829 * Void Ports:: Ports on nothing at all.
830 @end menu
831
832
833 @node File Ports
834 @subsubsection File Ports
835 @cindex File port
836 @cindex Port, file
837
838 The following procedures are used to open file ports.
839 See also @ref{Ports and File Descriptors, open}, for an interface
840 to the Unix @code{open} system call.
841
842 Most systems have limits on how many files can be open, so it's
843 strongly recommended that file ports be closed explicitly when no
844 longer required (@pxref{Ports}).
845
846 @deffn {Scheme Procedure} open-file filename mode @
847 [#:guess-encoding=#f] [#:encoding=#f]
848 @deffnx {C Function} scm_open_file_with_encoding @
849 (filename, mode, guess_encoding, encoding)
850 @deffnx {C Function} scm_open_file (filename, mode)
851 Open the file whose name is @var{filename}, and return a port
852 representing that file. The attributes of the port are
853 determined by the @var{mode} string. The way in which this is
854 interpreted is similar to C stdio. The first character must be
855 one of the following:
856
857 @table @samp
858 @item r
859 Open an existing file for input.
860 @item w
861 Open a file for output, creating it if it doesn't already exist
862 or removing its contents if it does.
863 @item a
864 Open a file for output, creating it if it doesn't already
865 exist. All writes to the port will go to the end of the file.
The "append mode" can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
868 @end table
869
870 The following additional characters can be appended:
871
872 @table @samp
873 @item +
874 Open the port for both input and output. E.g., @code{r+}: open
875 an existing file for both input and output.
876 @item 0
877 Create an "unbuffered" port. In this case input and output
878 operations are passed directly to the underlying port
879 implementation without additional buffering. This is likely to
slow down I/O operations. The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
883 @item l
884 Add line-buffering to the port. The port output buffer will be
885 automatically flushed whenever a newline character is written.
886 @item b
887 Use binary mode, ensuring that each byte in the file will be read as one
888 Scheme character.
889
890 To provide this property, the file will be opened with the 8-bit
891 character encoding "ISO-8859-1", ignoring the default port encoding.
892 @xref{Ports}, for more information on port encodings.
893
894 Note that while it is possible to read and write binary data as
895 characters or strings, it is usually better to treat bytes as octets,
896 and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
897 @ref{R6RS Binary Output}, for more.
898
899 This option had another historical meaning, for DOS compatibility: in
900 the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
901 The @code{b} flag prevents this from happening, adding @code{O_BINARY}
902 to the underlying @code{open} call. Still, the flag is generally useful
903 because of its port encoding ramifications.
904 @end table
905
906 Unless binary mode is requested, the character encoding of the new port
907 is determined as follows: First, if @var{guess-encoding} is true, the
908 @code{file-encoding} procedure is used to guess the encoding of the file
909 (@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
910 is false or if @code{file-encoding} fails, @var{encoding} is used unless
911 it is also false. As a last resort, the default port encoding is used.
912 @xref{Ports}, for more information on port encodings. It is an error to
913 pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
914 is requested.
915
916 If a file cannot be opened with the access requested, @code{open-file}
917 throws an exception.
918
919 When the file is opened, its encoding is set to the current
920 @code{%default-port-encoding}, unless the @code{b} flag was supplied.
921 Sometimes it is desirable to honor Emacs-style coding declarations in
922 files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This
923 behavior was deemed inappropriate and disabled starting from Guile
924 2.0.8.}. When that is the case, the @code{file-encoding} procedure can
925 be used as follows (@pxref{Character Encoding of Source Files,
926 @code{file-encoding}}):
927
928 @example
(let* ((port     (open-input-file file))
       (encoding (file-encoding port)))
  (set-port-encoding! port (or encoding (port-encoding port))))
932 @end example
933
934 In theory we could create read/write ports which were buffered
935 in one direction only. However this isn't included in the
936 current interfaces.
937 @end deffn
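
The mode strings above can be tried out with a short session such as
the one below. This is only a sketch: the file name is a hypothetical
path chosen for the example, and @code{read-line} is assumed to come
from the @code{(ice-9 rdelim)} module.

@lisp
(use-modules (ice-9 rdelim))

(define file "/tmp/guile-open-file-example.txt")  ; hypothetical path

;; "w": create or truncate the file, writing text as UTF-8.
(define out (open-file file "w" #:encoding "UTF-8"))
(display "first line\n" out)
(close-port out)

;; "a": every write goes to the end of the existing contents.
(define app (open-file file "a" #:encoding "UTF-8"))
(display "second line\n" app)
(close-port app)

;; "r": read the file back, one line at a time.
(define in (open-file file "r" #:encoding "UTF-8"))
(read-line in) @result{} "first line"
(read-line in) @result{} "second line"
(close-port in)
@end lisp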
938
939 @rnindex open-input-file
940 @deffn {Scheme Procedure} open-input-file filename @
941 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
942
943 Open @var{filename} for input. If @var{binary} is true, open the port
944 in binary mode, otherwise use text mode. @var{encoding} and
945 @var{guess-encoding} determine the character encoding as described above
946 for @code{open-file}. Equivalent to
947 @lisp
(open-file @var{filename}
           (if @var{binary} "rb" "r")
           #:guess-encoding @var{guess-encoding}
           #:encoding @var{encoding})
952 @end lisp
953 @end deffn
954
955 @rnindex open-output-file
956 @deffn {Scheme Procedure} open-output-file filename @
957 [#:encoding=#f] [#:binary=#f]
958
959 Open @var{filename} for output. If @var{binary} is true, open the port
960 in binary mode, otherwise use text mode. @var{encoding} specifies the
961 character encoding as described above for @code{open-file}. Equivalent
962 to
963 @lisp
(open-file @var{filename}
           (if @var{binary} "wb" "w")
           #:encoding @var{encoding})
967 @end lisp
968 @end deffn
969
970 @deffn {Scheme Procedure} call-with-input-file filename proc @
971 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
972 @deffnx {Scheme Procedure} call-with-output-file filename proc @
973 [#:encoding=#f] [#:binary=#f]
974 @rnindex call-with-input-file
975 @rnindex call-with-output-file
976 Open @var{filename} for input or output, and call @code{(@var{proc}
977 port)} with the resulting port. Return the value returned by
978 @var{proc}. @var{filename} is opened as per @code{open-input-file} or
979 @code{open-output-file} respectively, and an error is signaled if it
980 cannot be opened.
981
982 When @var{proc} returns, the port is closed. If @var{proc} does not
983 return (e.g.@: if it throws an error), then the port might not be
984 closed automatically, though it will be garbage collected in the usual
985 way if not otherwise referenced.
986 @end deffn
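
A minimal sketch of the two procedures working together is shown
below. The file name is a hypothetical path, and @code{write-line} and
@code{read-line} are assumed to come from @code{(ice-9 rdelim)}.

@lisp
(use-modules (ice-9 rdelim))

(define file "/tmp/guile-call-with-example.txt")  ; hypothetical path

;; The port is closed for us once the lambda returns.
(call-with-output-file file
  (lambda (port)
    (write-line "first line" port)))

;; Any one-argument procedure can serve as PROC; here read-line
;; receives the port directly and its result is returned.
(call-with-input-file file read-line) @result{} "first line"
@end lisp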
987
988 @deffn {Scheme Procedure} with-input-from-file filename thunk @
989 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
990 @deffnx {Scheme Procedure} with-output-to-file filename thunk @
991 [#:encoding=#f] [#:binary=#f]
992 @deffnx {Scheme Procedure} with-error-to-file filename thunk @
993 [#:encoding=#f] [#:binary=#f]
994 @rnindex with-input-from-file
995 @rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as, respectively, the @code{current-input-port},
998 @code{current-output-port}, or @code{current-error-port}. Return the
999 value returned by @var{thunk}. @var{filename} is opened as per
1000 @code{open-input-file} or @code{open-output-file} respectively, and an
1001 error is signaled if it cannot be opened.
1002
1003 When @var{thunk} returns, the port is closed and the previous setting
1004 of the respective current port is restored.
1005
1006 The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: via an
1008 exception), and if @var{thunk} is re-entered (via a captured
1009 continuation) then it's set again to the @var{filename} port.
1010
1011 The port is closed when @var{thunk} returns normally, but not when
1012 exited via an exception or new continuation. This ensures it's still
1013 ready for use if @var{thunk} is re-entered by a captured continuation.
1014 Of course the port is always garbage collected and closed in the usual
1015 way when no longer referenced anywhere.
1016 @end deffn
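
As an illustration only, the following sketch redirects the current
output port to a file and then reads the first character back through
the current input port. The file name is a hypothetical path.

@lisp
(define file "/tmp/guile-with-file-example.txt")  ; hypothetical path

(with-output-to-file file
  (lambda ()
    (display "hello")     ; goes to FILE, not to the terminal
    (newline)))

;; read-char called with no arguments reads from the current input
;; port, so it can be passed directly as the thunk.
(with-input-from-file file read-char) @result{} #\h
@end lisp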
1017
1018 @deffn {Scheme Procedure} port-mode port
1019 @deffnx {C Function} scm_port_mode (port)
1020 Return the port modes associated with the open port @var{port}.
1021 These will not necessarily be identical to the modes used when
1022 the port was opened, since modes such as "append" which are
1023 used only during port creation are not retained.
1024 @end deffn
1025
1026 @deffn {Scheme Procedure} port-filename port
1027 @deffnx {C Function} scm_port_filename (port)
1028 Return the filename associated with @var{port}, or @code{#f} if no
1029 filename is associated with the port.
1030
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
1033 @end deffn
1034
1035 @deffn {Scheme Procedure} set-port-filename! port filename
1036 @deffnx {C Function} scm_set_port_filename_x (port, filename)
1037 Change the filename associated with @var{port}, using the current input
1038 port if none is specified. Note that this does not change the port's
1039 source of data, but only the value that is returned by
1040 @code{port-filename} and reported in diagnostic output.
1041 @end deffn
1042
1043 @deffn {Scheme Procedure} file-port? obj
1044 @deffnx {C Function} scm_file_port_p (obj)
1045 Determine whether @var{obj} is a port that is related to a file.
1046 @end deffn
1047
1048
1049 @node String Ports
1050 @subsubsection String Ports
1051 @cindex String port
1052 @cindex Port, string
1053
The following allow string ports to be opened by analogy to R4RS
file port facilities.

With string ports, the port encoding is treated differently than for
other types of ports. When string ports are created, they do not inherit
a character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
1061 Typically one should not modify a string port's character encoding
1062 away from its default.
1063
1064 @deffn {Scheme Procedure} call-with-output-string proc
1065 @deffnx {C Function} scm_call_with_output_string (proc)
1066 Calls the one-argument procedure @var{proc} with a newly created output
1067 port. When the function returns, the string composed of the characters
1068 written into the port is returned. @var{proc} should not close the port.
1069
Note that which characters can be written to a string port depends on the port's
1071 encoding. The default encoding of string ports is specified by the
1072 @code{%default-port-encoding} fluid (@pxref{Ports,
1073 @code{%default-port-encoding}}). For instance, it is an error to write Greek
1074 letter alpha to an ISO-8859-1-encoded string port since this character cannot be
1075 represented with ISO-8859-1:
1076
1077 @example
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

(with-fluids ((%default-port-encoding "ISO-8859-1"))
  (call-with-output-string
    (lambda (p)
      (display alpha p))))

@result{}
Throw to key `encoding-error'
1087 @end example
1088
1089 Changing the string port's encoding to a Unicode-capable encoding such as UTF-8
1090 solves the problem.
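
For instance, the failing call above might be rewritten as in this
sketch, which binds the fluid to UTF-8 for the duration of the call:

@lisp
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

(with-fluids ((%default-port-encoding "UTF-8"))
  (string-length
    (call-with-output-string
      (lambda (p)
        (display alpha p)))))

@result{} 1
@end lisp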
1091 @end deffn
1092
1093 @deffn {Scheme Procedure} call-with-input-string string proc
1094 @deffnx {C Function} scm_call_with_input_string (string, proc)
1095 Calls the one-argument procedure @var{proc} with a newly
1096 created input port from which @var{string}'s contents may be
1097 read. The value yielded by the @var{proc} is returned.
1098 @end deffn
1099
1100 @deffn {Scheme Procedure} with-output-to-string thunk
1101 Calls the zero-argument procedure @var{thunk} with the current output
1102 port set temporarily to a new string port. It returns a string
1103 composed of the characters written to the current output.
1104
1105 See @code{call-with-output-string} above for character encoding considerations.
1106 @end deffn
1107
1108 @deffn {Scheme Procedure} with-input-from-string string thunk
1109 Calls the zero-argument procedure @var{thunk} with the current input
1110 port set temporarily to a string port opened on the specified
1111 @var{string}. The value yielded by @var{thunk} is returned.
1112 @end deffn
1113
1114 @deffn {Scheme Procedure} open-input-string str
1115 @deffnx {C Function} scm_open_input_string (str)
1116 Take a string and return an input port that delivers characters
1117 from the string. The port can be closed by
1118 @code{close-input-port}, though its storage will be reclaimed
1119 by the garbage collector if it becomes inaccessible.
1120 @end deffn
1121
1122 @deffn {Scheme Procedure} open-output-string
1123 @deffnx {C Function} scm_open_output_string ()
1124 Return an output port that will accumulate characters for
1125 retrieval by @code{get-output-string}. The port can be closed
1126 by the procedure @code{close-output-port}, though its storage
1127 will be reclaimed by the garbage collector if it becomes
1128 inaccessible.
1129 @end deffn
1130
1131 @deffn {Scheme Procedure} get-output-string port
1132 @deffnx {C Function} scm_get_output_string (port)
1133 Given an output port created by @code{open-output-string},
1134 return a string consisting of the characters that have been
1135 output to the port so far.
1136
@code{get-output-string} must be used before closing @var{port}; once
the port is closed, the string cannot be obtained.
1139 @end deffn
1140
1141 A string port can be used in many procedures which accept a port
1142 but which are not dependent on implementation details of fports.
1143 E.g., seeking and truncating will work on a string port,
1144 but trying to extract the file descriptor number will fail.
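
The procedures above can be combined as in the following short sketch;
the names @code{out} and @code{in} are chosen only for the example.

@lisp
;; Accumulate output in a string.
(define out (open-output-string))
(write 'hello out)
(display " " out)
(write "world" out)
(get-output-string out) @result{} "hello \"world\""

;; Parse Scheme data out of a string.
(define in (open-input-string "(1 2 3) foo"))
(read in) @result{} (1 2 3)
(read in) @result{} foo
(eof-object? (read in)) @result{} #t
@end lisp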
1145
1146
1147 @node Soft Ports
1148 @subsubsection Soft Ports
1149 @cindex Soft port
1150 @cindex Port, soft
1151
1152 A @dfn{soft-port} is a port based on a vector of procedures capable of
1153 accepting or delivering characters. It allows emulation of I/O ports.
1154
1155 @deffn {Scheme Procedure} make-soft-port pv modes
1156 @deffnx {C Function} scm_make_soft_port (pv, modes)
1157 Return a port capable of receiving or delivering characters as
1158 specified by the @var{modes} string (@pxref{File Ports,
1159 open-file}). @var{pv} must be a vector of length 5 or 6. Its
1160 components are as follows:
1161
1162 @enumerate 0
1163 @item
1164 procedure accepting one character for output
1165 @item
1166 procedure accepting a string for output
1167 @item
1168 thunk for flushing output
1169 @item
1170 thunk for getting one character
1171 @item
1172 thunk for closing port (not by garbage collection)
1173 @item
1174 (if present and not @code{#f}) thunk for computing the number of
1175 characters that can be read from the port without blocking.
1176 @end enumerate
1177
1178 For an output-only port only elements 0, 1, 2, and 4 need be
1179 procedures. For an input-only port only elements 3 and 4 need
1180 be procedures. Thunks 2 and 4 can instead be @code{#f} if
1181 there is no useful operation for them to perform.
1182
1183 If thunk 3 returns @code{#f} or an @code{eof-object}
1184 (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
1185 Scheme}) it indicates that the port has reached end-of-file.
1186 For example:
1187
1188 @lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))

(write p p) @result{} #<input-output: soft 8081e20>
1200 @end lisp
1201 @end deffn
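
As a further sketch, an output-only soft port can be used to transform
text on its way to another port. Here @code{make-upcase-port} is a
hypothetical helper, not a Guile procedure; elements 3 and 4 of the
vector are @code{#f} because the port is never read from and has
nothing to do on close.

@lisp
;; Hypothetical helper: return an output port that upcases everything
;; written to it and forwards the result to TARGET.
(define (make-upcase-port target)
  (make-soft-port
   (vector
    (lambda (c) (write-char (char-upcase c) target))  ; 0: one character
    (lambda (s) (display (string-upcase s) target))   ; 1: a string
    (lambda () (force-output target))                 ; 2: flush
    #f                                                ; 3: no input
    #f)                                               ; 4: nothing on close
   "w"))

(define sink (open-output-string))
(define up (make-upcase-port sink))

(display "hello" up)
(force-output up)          ; make sure nothing is left buffered
(get-output-string sink) @result{} "HELLO"
@end lisp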
1202
1203
1204 @node Void Ports
1205 @subsubsection Void Ports
1206 @cindex Void port
1207 @cindex Port, void
1208
1209 This kind of port causes any data to be discarded when written to, and
1210 always returns the end-of-file object when read from.
1211
1212 @deffn {Scheme Procedure} %make-void-port mode
1213 @deffnx {C Function} scm_sys_make_void_port (mode)
1214 Create and return a new void port. A void port acts like
1215 @file{/dev/null}. The @var{mode} argument
1216 specifies the input/output modes for this port: see the
1217 documentation for @code{open-file} in @ref{File Ports}.
1218 @end deffn
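
A brief sketch of both directions, using the hypothetical name
@code{void}:

@lisp
(define void (%make-void-port "rw"))

(display "this text is discarded" void)
(eof-object? (read-char void)) @result{} #t
@end lisp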
1219
1220
1221 @node R6RS I/O Ports
1222 @subsection R6RS I/O Ports
1223
1224 @cindex R6RS
1225 @cindex R6RS ports
1226
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
1229 io ports)} module. It provides features, such as binary I/O and Unicode
1230 string I/O, that complement or refine Guile's historical port API
1231 presented above (@pxref{Input and Output}). Note that R6RS ports are not
1232 disjoint from Guile's native ports, so Guile-specific procedures will
1233 work on ports created using the R6RS API, and vice versa.
1234
1235 The text in this section is taken from the R6RS standard libraries
1236 document, with only minor adaptions for inclusion in this manual. The
1237 Guile developers offer their thanks to the R6RS editors for having
1238 provided the report's text under permissive conditions making this
1239 possible.
1240
1241 @c FIXME: Update description when implemented.
1242 @emph{Note}: The implementation of this R6RS API is not complete yet.
1243
1244 @menu
1245 * R6RS File Names:: File names.
1246 * R6RS File Options:: Options for opening files.
1247 * R6RS Buffer Modes:: Influencing buffering behavior.
1248 * R6RS Transcoders:: Influencing port encoding.
1249 * R6RS End-of-File:: The end-of-file object.
1250 * R6RS Port Manipulation:: Manipulating R6RS ports.
1251 * R6RS Input Ports:: Input Ports.
1252 * R6RS Binary Input:: Binary input.
1253 * R6RS Textual Input:: Textual input.
1254 * R6RS Output Ports:: Output Ports.
1255 * R6RS Binary Output:: Binary output.
1256 * R6RS Textual Output:: Textual output.
1257 @end menu
1258
1259 A subset of the @code{(rnrs io ports)} module, plus one non-standard
1260 procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
1261 provided by the @code{(ice-9 binary-ports)} module. It contains binary
1262 input/output procedures and does not rely on R6RS support.
1263
1264 @node R6RS File Names
1265 @subsubsection File Names
1266
1267 Some of the procedures described in this chapter accept a file name as an
1268 argument. Valid values for such a file name include strings that name a file
1269 using the native notation of file system paths on an implementation's
1270 underlying operating system, and may include implementation-dependent
1271 values as well.
1272
1273 A @var{filename} parameter name means that the
1274 corresponding argument must be a file name.
1275
1276 @node R6RS File Options
1277 @subsubsection File Options
1278 @cindex file options
1279
1280 When opening a file, the various procedures in this library accept a
1281 @code{file-options} object that encapsulates flags to specify how the
1282 file is to be opened. A @code{file-options} object is an enum-set
1283 (@pxref{rnrs enums}) over the symbols constituting valid file options.
1284
1285 A @var{file-options} parameter name means that the corresponding
1286 argument must be a file-options object.
1287
1288 @deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
1289
1290 Each @var{file-options-symbol} must be a symbol.
1291
1292 The @code{file-options} syntax returns a file-options object that
1293 encapsulates the specified options.
1294
1295 When supplied to an operation that opens a file for output, the
1296 file-options object returned by @code{(file-options)} specifies that the
1297 file is created if it does not exist and an exception with condition
1298 type @code{&i/o-file-already-exists} is raised if it does exist. The
1299 following standard options can be included to modify the default
1300 behavior.
1301
1302 @table @code
1303 @item no-create
1304 If the file does not already exist, it is not created;
1305 instead, an exception with condition type @code{&i/o-file-does-not-exist}
1306 is raised.
1307 If the file already exists, the exception with condition type
1308 @code{&i/o-file-already-exists} is not raised
1309 and the file is truncated to zero length.
1310 @item no-fail
1311 If the file already exists, the exception with condition type
1312 @code{&i/o-file-already-exists} is not raised,
1313 even if @code{no-create} is not included,
1314 and the file is truncated to zero length.
1315 @item no-truncate
1316 If the file already exists and the exception with condition type
1317 @code{&i/o-file-already-exists} has been inhibited by inclusion of
1318 @code{no-create} or @code{no-fail}, the file is not truncated, but
1319 the port's current position is still set to the beginning of the
1320 file.
1321 @end table
1322
1323 These options have no effect when a file is opened only for input.
1324 Symbols other than those listed above may be used as
1325 @var{file-options-symbol}s; they have implementation-specific meaning,
1326 if any.
1327
1328 @quotation Note
1329 Only the name of @var{file-options-symbol} is significant.
1330 @end quotation
1331 @end deffn
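
As a sketch of how a file-options object is typically used, the
following opens a binary output port with @code{no-fail}, so that an
existing file is truncated instead of raising
@code{&i/o-file-already-exists}. It assumes that
@code{open-file-output-port}, @code{open-file-input-port},
@code{put-bytevector} and @code{get-bytevector-all} are available from
@code{(rnrs io ports)}; the file name is a hypothetical path.

@lisp
(use-modules (rnrs io ports))

(define file "/tmp/guile-r6rs-example.bin")  ; hypothetical path

;; With (file-options) alone, a second run would fail because the
;; file already exists; no-fail truncates it instead.
(define out (open-file-output-port file (file-options no-fail)))
(put-bytevector out #vu8(1 2 3))
(close-port out)

(define in (open-file-input-port file))
(get-bytevector-all in) @result{} #vu8(1 2 3)
(close-port in)
@end lisp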
1332
1333 @node R6RS Buffer Modes
1334 @subsubsection Buffer Modes
1335
1336 Each port has an associated buffer mode. For an output port, the
1337 buffer mode defines when an output operation flushes the buffer
1338 associated with the output port. For an input port, the buffer mode
1339 defines how much data will be read to satisfy read operations. The
1340 possible buffer modes are the symbols @code{none} for no buffering,
1341 @code{line} for flushing upon line endings and reading up to line
1342 endings, or other implementation-dependent behavior,
1343 and @code{block} for arbitrary buffering. This section uses
1344 the parameter name @var{buffer-mode} for arguments that must be
1345 buffer-mode symbols.
1346
1347 If two ports are connected to the same mutable source, both ports
1348 are unbuffered, and reading a byte or character from that shared
1349 source via one of the two ports would change the bytes or characters
1350 seen via the other port, a lookahead operation on one port will
1351 render the peeked byte or character inaccessible via the other port,
1352 while a subsequent read operation on the peeked port will see the
1353 peeked byte or character even though the port is otherwise unbuffered.
1354
1355 In other words, the semantics of buffering is defined in terms of side
1356 effects on shared mutable sources, and a lookahead operation has the
1357 same side effect on the shared source as a read operation.
1358
1359 @deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
1360
1361 @var{buffer-mode-symbol} must be a symbol whose name is one of
1362 @code{none}, @code{line}, and @code{block}. The result is the
1363 corresponding symbol, and specifies the associated buffer mode.
1364
1365 @quotation Note
1366 Only the name of @var{buffer-mode-symbol} is significant.
1367 @end quotation
1368 @end deffn
1369
1370 @deffn {Scheme Procedure} buffer-mode? obj
1371 Returns @code{#t} if the argument is a valid buffer-mode symbol, and
1372 returns @code{#f} otherwise.
1373 @end deffn
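
For illustration, assuming the bindings are imported from
@code{(rnrs io ports)}:

@lisp
(use-modules (rnrs io ports))

(buffer-mode block)  @result{} block
(buffer-mode? 'line) @result{} #t
(buffer-mode? 'full) @result{} #f   ; not one of none, line, block
@end lisp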
1374
1375 @node R6RS Transcoders
1376 @subsubsection Transcoders
1377 @cindex codec
1378 @cindex end-of-line style
1379 @cindex transcoder
1380 @cindex binary port
1381 @cindex textual port
1382
1383 Several different Unicode encoding schemes describe standard ways to
1384 encode characters and strings as byte sequences and to decode those
1385 sequences. Within this document, a @dfn{codec} is an immutable Scheme
1386 object that represents a Unicode or similar encoding scheme.
1387
1388 An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
1389 describes how a textual port transcodes representations of line endings.
1390
1391 A @dfn{transcoder} is an immutable Scheme object that combines a codec
1392 with an end-of-line style and a method for handling decoding errors.
1393 Each transcoder represents some specific bidirectional (but not
1394 necessarily lossless), possibly stateful translation between byte
1395 sequences and Unicode characters and strings. Every transcoder can
1396 operate in the input direction (bytes to characters) or in the output
1397 direction (characters to bytes). A @var{transcoder} parameter name
1398 means that the corresponding argument must be a transcoder.
1399
1400 A @dfn{binary port} is a port that supports binary I/O, does not have an
1401 associated transcoder and does not support textual I/O. A @dfn{textual
1402 port} is a port that supports textual I/O, and does not support binary
1403 I/O. A textual port may or may not have an associated transcoder.
1404
1405 @deffn {Scheme Procedure} latin-1-codec
1406 @deffnx {Scheme Procedure} utf-8-codec
1407 @deffnx {Scheme Procedure} utf-16-codec
1408
1409 These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
1410 encoding schemes.
1411
1412 A call to any of these procedures returns a value that is equal in the
1413 sense of @code{eqv?} to the result of any other call to the same
1414 procedure.
1415 @end deffn
1416
1417 @deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
1418
1419 @var{eol-style-symbol} should be a symbol whose name is one of
1420 @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
1421 and @code{none}.
1422
1423 The form evaluates to the corresponding symbol. If the name of
1424 @var{eol-style-symbol} is not one of these symbols, the effect and
1425 result are implementation-dependent; in particular, the result may be an
1426 eol-style symbol acceptable as an @var{eol-style} argument to
1427 @code{make-transcoder}. Otherwise, an exception is raised.
1428
1429 All eol-style symbols except @code{none} describe a specific
1430 line-ending encoding:
1431
1432 @table @code
1433 @item lf
1434 linefeed
1435 @item cr
1436 carriage return
1437 @item crlf
1438 carriage return, linefeed
1439 @item nel
1440 next line
1441 @item crnel
1442 carriage return, next line
1443 @item ls
1444 line separator
1445 @end table
1446
1447 For a textual port with a transcoder, and whose transcoder has an
1448 eol-style symbol @code{none}, no conversion occurs. For a textual input
1449 port, any eol-style symbol other than @code{none} means that all of the
1450 above line-ending encodings are recognized and are translated into a
1451 single linefeed. For a textual output port, @code{none} and @code{lf}
1452 are equivalent. Linefeed characters are encoded according to the
1453 specified eol-style symbol, and all other characters that participate in
1454 possible line endings are encoded as is.
1455
1456 @quotation Note
1457 Only the name of @var{eol-style-symbol} is significant.
1458 @end quotation
1459 @end deffn
1460
1461 @deffn {Scheme Procedure} native-eol-style
1462 Returns the default end-of-line style of the underlying platform, e.g.,
1463 @code{lf} on Unix and @code{crlf} on Windows.
1464 @end deffn
1465
1466 @deffn {Condition Type} &i/o-decoding
1467 @deffnx {Scheme Procedure} make-i/o-decoding-error port
1468 @deffnx {Scheme Procedure} i/o-decoding-error? obj
1469
1470 This condition type could be defined by
1471
1472 @lisp
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
1475 @end lisp
1476
1477 An exception with this type is raised when one of the operations for
1478 textual input from a port encounters a sequence of bytes that cannot be
1479 translated into a character or string by the input direction of the
1480 port's transcoder.
1481
1482 When such an exception is raised, the port's position is past the
1483 invalid encoding.
1484 @end deffn
1485
1486 @deffn {Condition Type} &i/o-encoding
1487 @deffnx {Scheme Procedure} make-i/o-encoding-error port char
1488 @deffnx {Scheme Procedure} i/o-encoding-error? obj
1489 @deffnx {Scheme Procedure} i/o-encoding-error-char condition
1490
1491 This condition type could be defined by
1492
1493 @lisp
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
1497 @end lisp
1498
1499 An exception with this type is raised when one of the operations for
1500 textual output to a port encounters a character that cannot be
1501 translated into bytes by the output direction of the port's transcoder.
1502 @var{char} is the character that could not be encoded.
1503 @end deffn
1504
1505 @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
1506
1507 @var{error-handling-mode-symbol} should be a symbol whose name is one of
1508 @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
1509 the corresponding symbol. If @var{error-handling-mode-symbol} is not
1510 one of these identifiers, effect and result are
1511 implementation-dependent: The result may be an error-handling-mode
1512 symbol acceptable as a @var{handling-mode} argument to
1513 @code{make-transcoder}. If it is not acceptable as a
1514 @var{handling-mode} argument to @code{make-transcoder}, an exception is
1515 raised.
1516
1517 @quotation Note
1518 Only the name of @var{error-handling-mode-symbol} is significant.
1519 @end quotation
1520
1521 The error-handling mode of a transcoder specifies the behavior
1522 of textual I/O operations in the presence of encoding or decoding
1523 errors.
1524
1525 If a textual input operation encounters an invalid or incomplete
1526 character encoding, and the error-handling mode is @code{ignore}, an
1527 appropriate number of bytes of the invalid encoding are ignored and
1528 decoding continues with the following bytes.
1529
1530 If the error-handling mode is @code{replace}, the replacement
1531 character U+FFFD is injected into the data stream, an appropriate
1532 number of bytes are ignored, and decoding
1533 continues with the following bytes.
1534
1535 If the error-handling mode is @code{raise}, an exception with condition
1536 type @code{&i/o-decoding} is raised.
1537
1538 If a textual output operation encounters a character it cannot encode,
1539 and the error-handling mode is @code{ignore}, the character is ignored
1540 and encoding continues with the next character. If the error-handling
1541 mode is @code{replace}, a codec-specific replacement character is
1542 emitted by the transcoder, and encoding continues with the next
1543 character. The replacement character is U+FFFD for transcoders whose
1544 codec is one of the Unicode encodings, but is the @code{?} character
1545 for the Latin-1 encoding. If the error-handling mode is @code{raise},
1546 an exception with condition type @code{&i/o-encoding} is raised.
1547 @end deffn
1548
1549 @deffn {Scheme Procedure} make-transcoder codec
1550 @deffnx {Scheme Procedure} make-transcoder codec eol-style
1551 @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
1552
1553 @var{codec} must be a codec; @var{eol-style}, if present, an eol-style
1554 symbol; and @var{handling-mode}, if present, an error-handling-mode
1555 symbol.
1556
1557 @var{eol-style} may be omitted, in which case it defaults to the native
1558 end-of-line style of the underlying platform. @var{handling-mode} may
1559 be omitted, in which case it defaults to @code{replace}. The result is
1560 a transcoder with the behavior specified by its arguments.
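
As a minimal sketch, assuming the @code{(rnrs io ports)} module is
loaded, a transcoder can be built from all three arguments and then
inspected with the accessors described below.  The name @code{tc} is
only illustrative.

@lisp
(use-modules (rnrs io ports))

;; UTF-8, linefeed line endings, raise on coding errors.
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style tc)
@result{} lf
(transcoder-error-handling-mode tc)
@result{} raise
@end lisp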
1561 @end deffn
1562
@deffn {Scheme Procedure} native-transcoder
1564 Returns an implementation-dependent transcoder that represents a
1565 possibly locale-dependent ``native'' transcoding.
1566 @end deffn
1567
1568 @deffn {Scheme Procedure} transcoder-codec transcoder
1569 @deffnx {Scheme Procedure} transcoder-eol-style transcoder
1570 @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
1571
1572 These are accessors for transcoder objects; when applied to a
1573 transcoder returned by @code{make-transcoder}, they return the
1574 @var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
1575 respectively.
1576 @end deffn
1577
1578 @deffn {Scheme Procedure} bytevector->string bytevector transcoder
1579
1580 Returns the string that results from transcoding the
1581 @var{bytevector} according to the input direction of the transcoder.
1582 @end deffn
1583
1584 @deffn {Scheme Procedure} string->bytevector string transcoder
1585
1586 Returns the bytevector that results from transcoding the
1587 @var{string} according to the output direction of the transcoder.
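
The following sketch, which assumes a UTF-8 transcoder with default
end-of-line style and error handling, shows that
@code{string->bytevector} and @code{bytevector->string} are inverses
for text the codec can represent.  The name @code{utf8-tc} is
illustrative.

@lisp
(use-modules (rnrs io ports))

(define utf8-tc (make-transcoder (utf-8-codec)))

(string->bytevector "hello" utf8-tc)
@result{} #vu8(104 101 108 108 111)

(bytevector->string (string->bytevector "hello" utf8-tc) utf8-tc)
@result{} "hello"
@end lisp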
1588 @end deffn
1589
1590 @node R6RS End-of-File
1591 @subsubsection The End-of-File Object
1592
1593 @cindex EOF
1594 @cindex end-of-file
1595
1596 R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
1597 ports)} module:
1598
1599 @deffn {Scheme Procedure} eof-object? obj
1600 @deffnx {C Function} scm_eof_object_p (obj)
1601 Return true if @var{obj} is the end-of-file (EOF) object.
1602 @end deffn
1603
1604 In addition, the following procedure is provided:
1605
1606 @deffn {Scheme Procedure} eof-object
1607 @deffnx {C Function} scm_eof_object ()
1608 Return the end-of-file (EOF) object.
1609
1610 @lisp
1611 (eof-object? (eof-object))
1612 @result{} #t
1613 @end lisp
1614 @end deffn
1615
1616
1617 @node R6RS Port Manipulation
1618 @subsubsection Port Manipulation
1619
1620 The procedures listed below operate on any kind of R6RS I/O port.
1621
1622 @deffn {Scheme Procedure} port? obj
1623 Returns @code{#t} if the argument is a port, and returns @code{#f}
1624 otherwise.
1625 @end deffn
1626
1627 @deffn {Scheme Procedure} port-transcoder port
1628 Returns the transcoder associated with @var{port} if @var{port} is
1629 textual and has an associated transcoder, and returns @code{#f} if
1630 @var{port} is binary or does not have an associated transcoder.
1631 @end deffn
1632
1633 @deffn {Scheme Procedure} binary-port? port
1634 Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
1635 binary data input/output.
1636
1637 Note that internally Guile does not differentiate between binary and
1638 textual ports, unlike the R6RS. Thus, this procedure returns true when
1639 @var{port} does not have an associated encoding---i.e., when
1640 @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
1641 port-encoding}). This is the case for ports returned by R6RS procedures
1642 such as @code{open-bytevector-input-port} and
1643 @code{make-custom-binary-output-port}.
1644
1645 However, Guile currently does not prevent use of textual I/O procedures
1646 such as @code{display} or @code{read-char} with binary ports. Doing so
1647 ``upgrades'' the port from binary to textual, under the ISO-8859-1
1648 encoding. Likewise, Guile does not prevent use of
1649 @code{set-port-encoding!} on a binary port, which also turns it into a
1650 ``textual'' port.
1651 @end deffn
1652
1653 @deffn {Scheme Procedure} textual-port? port
1654 Always return @code{#t}, as all ports can be used for textual I/O in
1655 Guile.
1656 @end deffn
1657
1658 @deffn {Scheme Procedure} transcoded-port binary-port transcoder
1659 The @code{transcoded-port} procedure
1660 returns a new textual port with the specified @var{transcoder}.
1661 Otherwise the new textual port's state is largely the same as
1662 that of @var{binary-port}.
1663 If @var{binary-port} is an input port, the new textual
1664 port will be an input port and
1665 will transcode the bytes that have not yet been read from
1666 @var{binary-port}.
1667 If @var{binary-port} is an output port, the new textual
1668 port will be an output port and
1669 will transcode output characters into bytes that are
1670 written to the byte sink represented by @var{binary-port}.
1671
1672 As a side effect, however, @code{transcoded-port}
1673 closes @var{binary-port} in
1674 a special way that allows the new textual port to continue to
1675 use the byte source or sink represented by @var{binary-port},
1676 even though @var{binary-port} itself is closed and cannot
1677 be used by the input and output operations described in this
1678 chapter.
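
A minimal sketch, assuming the @code{(rnrs io ports)} and
@code{(rnrs bytevectors)} modules: the bytes 206 and 187 are the UTF-8
encoding of U+03BB, so the textual port decodes them as a single
character.  The names @code{binary} and @code{textual} are
illustrative.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define binary
  (open-bytevector-input-port (u8-list->bytevector '(206 187 120))))
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

;; From here on, only TEXTUAL should be used.
(char->integer (get-char textual))
@result{} 955
@end lisp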
1679 @end deffn
1680
1681 @deffn {Scheme Procedure} port-position port
1682 If @var{port} supports it (see below), return the offset (an integer)
1683 indicating where the next octet will be read from/written to in
1684 @var{port}. If @var{port} does not support this operation, an error
1685 condition is raised.
1686
1687 This is similar to Guile's @code{seek} procedure with the
1688 @code{SEEK_CUR} argument (@pxref{Random Access}).
1689 @end deffn
1690
1691 @deffn {Scheme Procedure} port-has-port-position? port
Return @code{#t} if @var{port} supports @code{port-position}.
1693 @end deffn
1694
1695 @deffn {Scheme Procedure} set-port-position! port offset
1696 If @var{port} supports it (see below), set the position where the next
1697 octet will be read from/written to @var{port} to @var{offset} (an
1698 integer). If @var{port} does not support this operation, an error
1699 condition is raised.
1700
1701 This is similar to Guile's @code{seek} procedure with the
1702 @code{SEEK_SET} argument (@pxref{Random Access}).
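
As an illustration, and assuming a bytevector input port (which is
expected to support both operations), the position can be queried and
changed as follows.  The name @code{port} is illustrative.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(10 20 30 40))))

(get-u8 port)
@result{} 10
(port-position port)
@result{} 1

;; Jump to the last octet.
(set-port-position! port 3)
(get-u8 port)
@result{} 40
@end lisp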
1703 @end deffn
1704
1705 @deffn {Scheme Procedure} port-has-set-port-position!? port
Return @code{#t} if @var{port} supports @code{set-port-position!}.
1707 @end deffn
1708
1709 @deffn {Scheme Procedure} call-with-port port proc
1710 Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
1711 of @var{proc}. Return the return values of @var{proc}.
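
For instance, in this small sketch (assuming the
@code{(rnrs io ports)} and @code{(rnrs bytevectors)} modules), the port
is closed once the procedure returns, and the value of the call is the
first octet:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(call-with-port
    (open-bytevector-input-port (u8-list->bytevector '(1 2 3)))
  (lambda (port)
    (get-u8 port)))
@result{} 1
@end lisp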
1712 @end deffn
1713
1714 @node R6RS Input Ports
1715 @subsubsection Input Ports
1716
1717 @deffn {Scheme Procedure} input-port? obj
1718 Returns @code{#t} if the argument is an input port (or a combined input
1719 and output port), and returns @code{#f} otherwise.
1720 @end deffn
1721
1722 @deffn {Scheme Procedure} port-eof? input-port
1723 Returns @code{#t}
1724 if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
1725 or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
1726 would return
1727 the end-of-file object, and @code{#f} otherwise.
1728 The operation may block indefinitely if no data is available
1729 but the port cannot be determined to be at end of file.
1730 @end deffn
1731
1732 @deffn {Scheme Procedure} open-file-input-port filename
1733 @deffnx {Scheme Procedure} open-file-input-port filename file-options
1734 @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
1735 @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
1736 @var{maybe-transcoder} must be either a transcoder or @code{#f}.
1737
The @code{open-file-input-port} procedure returns an
input port for the named file. The @var{file-options},
@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
1741
1742 The @var{file-options} argument, which may determine
1743 various aspects of the returned port (@pxref{R6RS File Options}),
1744 defaults to the value of @code{(file-options)}.
1745
1746 The @var{buffer-mode} argument, if supplied,
1747 must be one of the symbols that name a buffer mode.
1748 The @var{buffer-mode} argument defaults to @code{block}.
1749
1750 If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
1751 with the returned port.
1752
1753 If @var{maybe-transcoder} is @code{#f} or absent,
1754 the port will be a binary port and will support the
1755 @code{port-position} and @code{set-port-position!} operations.
1756 Otherwise the port will be a textual port, and whether it supports
1757 the @code{port-position} and @code{set-port-position!} operations
1758 is implementation-dependent (and possibly transcoder-dependent).
1759 @end deffn
1760
1761 @deffn {Scheme Procedure} standard-input-port
1762 Returns a fresh binary input port connected to standard input. Whether
1763 the port supports the @code{port-position} and @code{set-port-position!}
1764 operations is implementation-dependent.
1765 @end deffn
1766
1767 @deffn {Scheme Procedure} current-input-port
1768 This returns a default textual port for input. Normally, this default
1769 port is associated with standard input, but can be dynamically
1770 re-assigned using the @code{with-input-from-file} procedure from the
@code{(rnrs io simple (6))} library (@pxref{rnrs io simple}). The port may or
1772 may not have an associated transcoder; if it does, the transcoder is
1773 implementation-dependent.
1774 @end deffn
1775
1776 @node R6RS Binary Input
1777 @subsubsection Binary Input
1778
1779 @cindex binary input
1780
1781 R6RS binary input ports can be created with the procedures described
1782 below.
1783
1784 @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
1785 @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
1786 Return an input port whose contents are drawn from bytevector @var{bv}
1787 (@pxref{Bytevectors}).
1788
1789 @c FIXME: Update description when implemented.
1790 The @var{transcoder} argument is currently not supported.
1791 @end deffn
1792
1793 @cindex custom binary input ports
1794
1795 @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
1796 @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
1797 Return a new custom binary input port@footnote{This is similar in spirit
1798 to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
1799 string) whose input is drained by invoking @var{read!} and passing it a
1800 bytevector, an index where bytes should be written, and the number of
1801 bytes to read. The @code{read!} procedure must return an integer
1802 indicating the number of bytes read, or @code{0} to indicate the
1803 end-of-file.
1804
1805 Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
1806 that will be called when @code{port-position} is invoked on the custom
1807 binary port and should return an integer indicating the position within
1808 the underlying data stream; if @var{get-position} was not supplied, the
1809 returned port does not support @code{port-position}.
1810
1811 Likewise, if @var{set-position!} is not @code{#f}, it should be a
1812 one-argument procedure. When @code{set-port-position!} is invoked on the
1813 custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
1815
1816 Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
1817 invoked when the custom binary input port is closed.
1818
1819 The returned port is fully buffered by default, but its buffering mode
1820 can be changed using @code{setvbuf} (@pxref{Ports and File Descriptors,
1821 @code{setvbuf}}).
1822
1823 Using a custom binary input port, the @code{open-bytevector-input-port}
1824 procedure could be implemented as follows:
1825
1826 @lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy up to COUNT bytes into BV at START; returning 0 signals EOF.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; The final #f means there is no close procedure.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
@result{} hello
1849 @end lisp
1850 @end deffn
1851
1852 @cindex binary input
1853 Binary input is achieved using the procedures below:
1854
1855 @deffn {Scheme Procedure} get-u8 port
1856 @deffnx {C Function} scm_get_u8 (port)
1857 Return an octet read from @var{port}, a binary input port, blocking as
1858 necessary, or the end-of-file object.
1859 @end deffn
1860
1861 @deffn {Scheme Procedure} lookahead-u8 port
1862 @deffnx {C Function} scm_lookahead_u8 (port)
1863 Like @code{get-u8} but does not update @var{port}'s position to point
1864 past the octet.
1865 @end deffn
1866
1867 @deffn {Scheme Procedure} get-bytevector-n port count
1868 @deffnx {C Function} scm_get_bytevector_n (port, count)
1869 Read @var{count} octets from @var{port}, blocking as necessary and
1870 return a bytevector containing the octets read. If fewer bytes are
1871 available, a bytevector smaller than @var{count} is returned.
1872 @end deffn
1873
1874 @deffn {Scheme Procedure} get-bytevector-n! port bv start count
1875 @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
1876 Read @var{count} bytes from @var{port} and store them in @var{bv}
1877 starting at index @var{start}. Return either the number of bytes
1878 actually read or the end-of-file object.
1879 @end deffn
1880
1881 @deffn {Scheme Procedure} get-bytevector-some port
1882 @deffnx {C Function} scm_get_bytevector_some (port)
1883 Read from @var{port}, blocking as necessary, until bytes are available
1884 or an end-of-file is reached. Return either the end-of-file object or a
1885 new bytevector containing some of the available bytes (at least one),
1886 and update the port position to point just past these bytes.
1887 @end deffn
1888
1889 @deffn {Scheme Procedure} get-bytevector-all port
1890 @deffnx {C Function} scm_get_bytevector_all (port)
1891 Read from @var{port}, blocking as necessary, until the end-of-file is
1892 reached. Return either a new bytevector containing the data read or the
1893 end-of-file object (if no data were available).
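
The procedures above can be combined as in the following sketch, which
assumes the @code{(rnrs io ports)} and @code{(rnrs bytevectors)}
modules and uses an in-memory port named @code{port} for illustration:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(1 2 3 4 5))))

(get-bytevector-n port 2)
@result{} #vu8(1 2)

;; Peek at the next octet without consuming it.
(lookahead-u8 port)
@result{} 3

(get-bytevector-all port)
@result{} #vu8(3 4 5)

(eof-object? (get-u8 port))
@result{} #t
@end lisp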
1894 @end deffn
1895
1896 The @code{(ice-9 binary-ports)} module provides the following procedure
1897 as an extension to @code{(rnrs io ports)}:
1898
1899 @deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
1900 @deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
1901 Place the contents of @var{bv} in @var{port}, optionally starting at
1902 index @var{start} and limiting to @var{count} octets, so that its bytes
1903 will be read from left-to-right as the next bytes from @var{port} during
1904 subsequent read operations. If called multiple times, the unread bytes
1905 will be read again in last-in first-out order.
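
A brief sketch, assuming the @code{(ice-9 binary-ports)} and
@code{(rnrs bytevectors)} modules: the bytes pushed back are returned
before the data that was still waiting in the port.  The name
@code{port} is illustrative.

@lisp
(use-modules (ice-9 binary-ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(3 4))))

(unget-bytevector port (u8-list->bytevector '(1 2)))

(get-u8 port)
@result{} 1
(get-u8 port)
@result{} 2
(get-u8 port)
@result{} 3
@end lisp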
1906 @end deffn
1907
1908 @node R6RS Textual Input
1909 @subsubsection Textual Input
1910
1911 @deffn {Scheme Procedure} get-char textual-input-port
1912 Reads from @var{textual-input-port}, blocking as necessary, until a
1913 complete character is available from @var{textual-input-port},
1914 or until an end of file is reached.
1915
1916 If a complete character is available before the next end of file,
1917 @code{get-char} returns that character and updates the input port to
1918 point past the character. If an end of file is reached before any
1919 character is read, @code{get-char} returns the end-of-file object.
1920 @end deffn
1921
1922 @deffn {Scheme Procedure} lookahead-char textual-input-port
1923 The @code{lookahead-char} procedure is like @code{get-char}, but it does
1924 not update @var{textual-input-port} to point past the character.
1925 @end deffn
1926
1927 @deffn {Scheme Procedure} get-string-n textual-input-port count
1928
1929 @var{count} must be an exact, non-negative integer object, representing
1930 the number of characters to be read.
1931
1932 The @code{get-string-n} procedure reads from @var{textual-input-port},
1933 blocking as necessary, until @var{count} characters are available, or
1934 until an end of file is reached.
1935
1936 If @var{count} characters are available before end of file,
1937 @code{get-string-n} returns a string consisting of those @var{count}
1938 characters. If fewer characters are available before an end of file, but
1939 one or more characters can be read, @code{get-string-n} returns a string
1940 containing those characters. In either case, the input port is updated
1941 to point just past the characters read. If no characters can be read
1942 before an end of file, the end-of-file object is returned.
1943 @end deffn
1944
1945 @deffn {Scheme Procedure} get-string-n! textual-input-port string start count
1946
1947 @var{start} and @var{count} must be exact, non-negative integer objects,
1948 with @var{count} representing the number of characters to be read.
@var{string} must be a string with at least @math{@var{start} + @var{count}}
1950 characters.
1951
1952 The @code{get-string-n!} procedure reads from @var{textual-input-port}
1953 in the same manner as @code{get-string-n}. If @var{count} characters
1954 are available before an end of file, they are written into @var{string}
1955 starting at index @var{start}, and @var{count} is returned. If fewer
1956 characters are available before an end of file, but one or more can be
1957 read, those characters are written into @var{string} starting at index
1958 @var{start} and the number of characters actually read is returned as an
1959 exact integer object. If no characters can be read before an end of
1960 file, the end-of-file object is returned.
1961 @end deffn
1962
1963 @deffn {Scheme Procedure} get-string-all textual-input-port
1964 Reads from @var{textual-input-port} until an end of file, decoding
1965 characters in the same manner as @code{get-string-n} and
1966 @code{get-string-n!}.
1967
1968 If characters are available before the end of file, a string containing
all the characters decoded from that data is returned. If no character
1970 precedes the end of file, the end-of-file object is returned.
1971 @end deffn
1972
1973 @deffn {Scheme Procedure} get-line textual-input-port
1974 Reads from @var{textual-input-port} up to and including the linefeed
1975 character or end of file, decoding characters in the same manner as
1976 @code{get-string-n} and @code{get-string-n!}.
1977
1978 If a linefeed character is read, a string containing all of the text up
1979 to (but not including) the linefeed character is returned, and the port
1980 is updated to point just past the linefeed character. If an end of file
1981 is encountered before any linefeed character is read, but some
1982 characters have been read and decoded as characters, a string containing
1983 those characters is returned. If an end of file is encountered before
1984 any characters are read, the end-of-file object is returned.
1985
1986 @quotation Note
1987 The end-of-line style, if not @code{none}, will cause all line endings
1988 to be read as linefeed characters. @xref{R6RS Transcoders}.
1989 @end quotation
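
The textual input procedures above can be tried on an in-memory port.
This sketch assumes the @code{open-string-input-port} procedure from
@code{(rnrs io ports)}, which is not described in this section; the
name @code{port} is illustrative.

@lisp
(use-modules (rnrs io ports))

(define port (open-string-input-port "first line\nsecond"))

(get-line port)
@result{} "first line"
(get-char port)
@result{} #\s
(get-string-n port 3)
@result{} "eco"
(get-string-all port)
@result{} "nd"
(eof-object? (get-line port))
@result{} #t
@end lisp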
1990 @end deffn
1991
@deffn {Scheme Procedure} get-datum textual-input-port
1993 Reads an external representation from @var{textual-input-port} and returns the
1994 datum it represents. The @code{get-datum} procedure returns the next
1995 datum that can be parsed from the given @var{textual-input-port}, updating
1996 @var{textual-input-port} to point exactly past the end of the external
1997 representation of the object.
1998
1999 Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
2000 Syntax}) in the input is first skipped. If an end of file occurs after
2001 the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
2002 is returned.
2003
2004 If a character inconsistent with an external representation is
2005 encountered in the input, an exception with condition types
2006 @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
2007 file is encountered after the beginning of an external representation,
2008 but the external representation is incomplete and therefore cannot be
2009 parsed, an exception with condition types @code{&lexical} and
2010 @code{&i/o-read} is raised.
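
As a short illustration, assuming @code{open-string-input-port} from
@code{(rnrs io ports)}, successive calls return successive datums and
then the end-of-file object.  The name @code{port} is illustrative.

@lisp
(use-modules (rnrs io ports))

(define port (open-string-input-port "  (a b) 42"))

(get-datum port)
@result{} (a b)
(get-datum port)
@result{} 42
(eof-object? (get-datum port))
@result{} #t
@end lisp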
2011 @end deffn
2012
2013 @node R6RS Output Ports
2014 @subsubsection Output Ports
2015
2016 @deffn {Scheme Procedure} output-port? obj
2017 Returns @code{#t} if the argument is an output port (or a
2018 combined input and output port), @code{#f} otherwise.
2019 @end deffn
2020
2021 @deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{port} to the
underlying file, device, or object. The @code{flush-output-port}
procedure returns an unspecified value.
2025 @end deffn
2026
2027 @deffn {Scheme Procedure} open-file-output-port filename
2028 @deffnx {Scheme Procedure} open-file-output-port filename file-options
2029 @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
2030 @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
2031
2032 @var{maybe-transcoder} must be either a transcoder or @code{#f}.
2033
2034 The @code{open-file-output-port} procedure returns an output port for the named file.
2035
2036 The @var{file-options} argument, which may determine various aspects of
2037 the returned port (@pxref{R6RS File Options}), defaults to the value of
2038 @code{(file-options)}.
2039
2040 The @var{buffer-mode} argument, if supplied,
2041 must be one of the symbols that name a buffer mode.
2042 The @var{buffer-mode} argument defaults to @code{block}.
2043
2044 If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
2045 associated with the port.
2046
2047 If @var{maybe-transcoder} is @code{#f} or absent,
2048 the port will be a binary port and will support the
2049 @code{port-position} and @code{set-port-position!} operations.
2050 Otherwise the port will be a textual port, and whether it supports
2051 the @code{port-position} and @code{set-port-position!} operations
2052 is implementation-dependent (and possibly transcoder-dependent).
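
The sketch below writes a few octets to a file and reads them back
through binary ports.  It assumes the @code{(rnrs io ports)} and
@code{(rnrs bytevectors)} modules; the file name is hypothetical, and
the @code{no-fail} file option is used so that an existing file is
overwritten rather than causing an error.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical scratch file.
(define filename "/tmp/r6rs-io-example.bin")

(call-with-port
    (open-file-output-port filename (file-options no-fail))
  (lambda (out)
    (put-bytevector out (u8-list->bytevector '(1 2 3)))))

(call-with-port (open-file-input-port filename)
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp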
2053 @end deffn
2054
2055 @deffn {Scheme Procedure} standard-output-port
2056 @deffnx {Scheme Procedure} standard-error-port
2057 Returns a fresh binary output port connected to the standard output or
2058 standard error respectively. Whether the port supports the
2059 @code{port-position} and @code{set-port-position!} operations is
2060 implementation-dependent.
2061 @end deffn
2062
2063 @deffn {Scheme Procedure} current-output-port
2064 @deffnx {Scheme Procedure} current-error-port
2065 These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output and
standard error, respectively. The return value of
@code{current-output-port} can be dynamically re-assigned using the
@code{with-output-to-file} procedure from the @code{(rnrs io simple (6))}
2070 library (@pxref{rnrs io simple}). A port returned by one of these
2071 procedures may or may not have an associated transcoder; if it does, the
2072 transcoder is implementation-dependent.
2073 @end deffn
2074
2075 @node R6RS Binary Output
2076 @subsubsection Binary Output
2077
2078 Binary output ports can be created with the procedures below.
2079
2080 @deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
2081 @deffnx {C Function} scm_open_bytevector_output_port (transcoder)
2082 Return two values: a binary output port and a procedure. The latter
2083 should be called with zero arguments to obtain a bytevector containing
2084 the data accumulated by the port, as illustrated below.
2085
2086 @lisp
2087 (call-with-values
2088 (lambda ()
2089 (open-bytevector-output-port))
2090 (lambda (port get-bytevector)
2091 (display "hello" port)
2092 (get-bytevector)))
2093
2094 @result{} #vu8(104 101 108 108 111)
2095 @end lisp
2096
2097 @c FIXME: Update description when implemented.
2098 The @var{transcoder} argument is currently not supported.
2099 @end deffn
2100
2101 @cindex custom binary output ports
2102
2103 @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
2104 @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
2105 Return a new custom binary output port named @var{id} (a string) whose
2106 output is sunk by invoking @var{write!} and passing it a bytevector, an
2107 index where bytes should be read from this bytevector, and the number of
2108 bytes to be ``written''. The @code{write!} procedure must return an
2109 integer indicating the number of bytes actually written; when it is
2110 passed @code{0} as the number of bytes to write, it should behave as
2111 though an end-of-file was sent to the byte sink.
2112
2113 The other arguments are as for @code{make-custom-binary-input-port}
2114 (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
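
As a minimal sketch, the following custom port collects every octet it
receives in a list.  The names @code{received}, @code{write!}, and
@code{port} are illustrative, and the port is flushed explicitly on the
assumption that it may buffer its output.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define received '())

;; Record COUNT octets of BV, starting at START, most recent first.
(define (write! bv start count)
  (let loop ((i 0))
    (when (< i count)
      (set! received
            (cons (bytevector-u8-ref bv (+ start i)) received))
      (loop (+ i 1))))
  count)

;; No get-position, set-position!, or close procedure.
(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port (u8-list->bytevector '(1 2 3)))
(flush-output-port port)

(reverse received)
@result{} (1 2 3)
@end lisp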
2115 @end deffn
2116
2117 @cindex binary output
2118 Writing to a binary output port can be done using the following
2119 procedures:
2120
2121 @deffn {Scheme Procedure} put-u8 port octet
2122 @deffnx {C Function} scm_put_u8 (port, octet)
2123 Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
2124 binary output port.
2125 @end deffn
2126
2127 @deffn {Scheme Procedure} put-bytevector port bv [start [count]]
2128 @deffnx {C Function} scm_put_bytevector (port, bv, start, count)
2129 Write the contents of @var{bv} to @var{port}, optionally starting at
2130 index @var{start} and limiting to @var{count} octets.
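
For example, assuming the @code{(rnrs io ports)} and
@code{(rnrs bytevectors)} modules, the optional @var{start} and
@var{count} arguments select a slice of the bytevector:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(call-with-values
    (lambda ()
      (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (put-u8 port 255)
    ;; Write two octets, starting at index 1.
    (put-bytevector port (u8-list->bytevector '(1 2 3 4)) 1 2)
    (get-bytevector)))
@result{} #vu8(255 2 3)
@end lisp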
2131 @end deffn
2132
2133 @node R6RS Textual Output
2134 @subsubsection Textual Output
2135
2136 @deffn {Scheme Procedure} put-char port char
2137 Writes @var{char} to the port. The @code{put-char} procedure returns
2138 an unspecified value.
2139 @end deffn
2140
2141 @deffn {Scheme Procedure} put-string port string
2142 @deffnx {Scheme Procedure} put-string port string start
2143 @deffnx {Scheme Procedure} put-string port string start count
2144
2145 @var{start} and @var{count} must be non-negative exact integer objects.
2146 @var{string} must have a length of at least @math{@var{start} +
2147 @var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}. The
2149 @code{put-string} procedure writes the @var{count} characters of
2150 @var{string} starting at index @var{start} to the port. The
2151 @code{put-string} procedure returns an unspecified value.
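
The effect of @var{start} and @var{count} can be seen in this sketch,
which assumes the @code{call-with-string-output-port} procedure from
@code{(rnrs io ports)} to collect the output in a string:

@lisp
(use-modules (rnrs io ports))

(call-with-string-output-port
  (lambda (port)
    (put-string port "hello world" 6)     ; from index 6 to the end
    (put-char port #\!)
    (put-string port "hello world" 0 5))) ; five characters from index 0
@result{} "world!hello"
@end lisp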
2152 @end deffn
2153
2154 @deffn {Scheme Procedure} put-datum textual-output-port datum
2155 @var{datum} should be a datum value. The @code{put-datum} procedure
2156 writes an external representation of @var{datum} to
2157 @var{textual-output-port}. The specific external representation is
2158 implementation-dependent. However, whenever possible, an implementation
2159 should produce a representation for which @code{get-datum}, when reading
2160 the representation, will return an object equal (in the sense of
2161 @code{equal?}) to @var{datum}.
2162
2163 @quotation Note
2164 Not all datums may allow producing an external representation for which
2165 @code{get-datum} will produce an object that is equal to the
2166 original. Specifically, NaNs contained in @var{datum} may make
2167 this impossible.
2168 @end quotation
2169
2170 @quotation Note
2171 The @code{put-datum} procedure merely writes the external
2172 representation, but no trailing delimiter. If @code{put-datum} is
2173 used to write several subsequent external representations to an
2174 output port, care should be taken to delimit them properly so they can
2175 be read back in by subsequent calls to @code{get-datum}.
2176 @end quotation
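
The need for a delimiter can be sketched as follows, assuming
@code{call-with-string-output-port} and @code{open-string-input-port}
from @code{(rnrs io ports)}.  A space is written between the two
datums so that @code{get-datum} can read them back separately; the
names @code{text} and @code{port} are illustrative.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-string-output-port
    (lambda (port)
      (put-datum port '(a b))
      (put-char port #\space)   ; explicit delimiter
      (put-datum port 42))))

text
@result{} "(a b) 42"

(define port (open-string-input-port text))
(get-datum port)
@result{} (a b)
(get-datum port)
@result{} 42
@end lisp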
2177 @end deffn
2178
2179 @node I/O Extensions
2180 @subsection Using and Extending Ports in C
2181
2182 @menu
2183 * C Port Interface:: Using ports from C.
2184 * Port Implementation:: How to implement a new port type in C.
2185 @end menu
2186
2187
2188 @node C Port Interface
2189 @subsubsection C Port Interface
2190 @cindex C port interface
2191 @cindex Port, C interface
2192
2193 This section describes how to use Scheme ports from C.
2194
2195 @subsubheading Port basics
2196
2197 @cindex ptob
2198 @tindex scm_ptob_descriptor
2199 @tindex scm_port
2200 @findex SCM_PTAB_ENTRY
2201 @findex SCM_PTOBNUM
2202 @vindex scm_ptobs
2203 There are two main data structures. A port type object (ptob) is of
2204 type @code{scm_ptob_descriptor}. A port instance is of type
2205 @code{scm_port}. Given an @code{SCM} variable which points to a port,
2206 the corresponding C port object can be obtained using the
2207 @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
2208 @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
2209 global array.
2210
2211 @subsubheading Port buffers
2212
2213 An input port always has a read buffer and an output port always has a
2214 write buffer. However, the size of these buffers is not guaranteed to be
2215 more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
2216 which is used when no other buffer is allocated). The way in which the
2217 buffers are allocated depends on the implementation of the ptob. For
2218 example, in the case of an fport, buffers may be allocated with
2219 @code{malloc} when the port is created, but in the case of an strport
2220 the underlying string is used as the buffer.
2221
2222 @subsubheading The @code{rw_random} flag
2223
2224 Special treatment is required for ports which support random-access seeking.
2225 Before various operations, such as seeking the port or changing from
2226 input to output on a bidirectional port or vice versa, the port
2227 implementation must be given a chance to update its state. The write
2228 buffer is updated by calling the @code{flush} ptob procedure and the
2229 input buffer is updated by calling the @code{end_input} ptob procedure.
2230 In the case of an fport, @code{flush} causes buffered output to be
2231 written to the file descriptor, while @code{end_input} causes the
2232 descriptor position to be adjusted to account for buffered input which
2233 was never read.
2234
2235 The special treatment must be performed if the @code{rw_random} flag in
2236 the port is non-zero.
2237
2238 @subsubheading The @code{rw_active} variable
2239
2240 The @code{rw_active} variable in the port is only used if
2241 @code{rw_random} is set. It's defined as an enum with the following
2242 values:
2243
2244 @table @code
2245 @item SCM_PORT_READ
2246 the read buffer may have unread data.
2247
2248 @item SCM_PORT_WRITE
2249 the write buffer may have unwritten data.
2250
2251 @item SCM_PORT_NEITHER
2252 neither the write nor the read buffer has data.
2253 @end table
2254
2255 @subsubheading Reading from a port
2256
2257 To read from a port, it's possible to either call existing libguile
2258 procedures such as @code{scm_getc} and @code{scm_read_line} or to read
2259 data from the read buffer directly. Reading from the buffer involves
2260 the following steps:
2261
2262 @enumerate
2263 @item
2264 Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
2265
2266 @item
2267 Fill the read buffer, if it's empty, using @code{scm_fill_input}.
2268
2269 @item Read the data from the buffer and update the read position in
2270 the buffer. Steps 2 and 3 may be repeated as many times as required.
2271
2272 @item Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.
2273
2274 @item Update the port's line and column counts.
2275 @end enumerate
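
The five steps above can be sketched with a small standalone model.
This sketch does not use libguile: @code{toy_port}, @code{toy_flush},
@code{toy_fill_input} and @code{toy_read_byte} are hypothetical names
chosen for illustration, the read buffer is a fixed four-byte array, and
a string in memory stands in for the underlying file. It is only meant
to show the order in which the steps happen.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the libguile port state; not the real API. */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port {
  const char *source;        /* in-memory data standing in for a file */
  size_t source_len;
  size_t source_pos;         /* next byte of source not yet buffered */
  char read_buf[4];          /* deliberately tiny read buffer */
  size_t read_pos, read_end; /* unread bytes lie in [read_pos, read_end) */
  int rw_random;
  enum toy_rw rw_active;
  int flushes;               /* counts calls to toy_flush */
  long line, column;
};

/* Model of the flush ptob procedure: pretend pending output was written. */
static void toy_flush(struct toy_port *pt)
{
  pt->flushes++;
  pt->rw_active = TOY_NEITHER;
}

/* Refill the empty read buffer; return the number of bytes now available. */
static size_t toy_fill_input(struct toy_port *pt)
{
  size_t left = pt->source_len - pt->source_pos;
  size_t n = left < sizeof pt->read_buf ? left : sizeof pt->read_buf;

  memcpy(pt->read_buf, pt->source + pt->source_pos, n);
  pt->source_pos += n;
  pt->read_pos = 0;
  pt->read_end = n;
  return n;
}

/* Return the next byte, or -1 at end of input. */
static int toy_read_byte(struct toy_port *pt)
{
  int c;

  if (pt->rw_active == TOY_WRITE)                 /* step 1 */
    toy_flush(pt);
  if (pt->read_pos == pt->read_end                /* step 2 */
      && toy_fill_input(pt) == 0)
    return -1;
  c = (unsigned char) pt->read_buf[pt->read_pos++]; /* step 3 */
  if (pt->rw_random)                              /* step 4 */
    pt->rw_active = TOY_READ;
  if (c == '\n')                                  /* step 5 */
    {
      pt->line++;
      pt->column = 0;
    }
  else
    pt->column++;
  return c;
}
```

In this model a caller zero-initializes a @code{toy_port}, points
@code{source} at some data, and calls @code{toy_read_byte} until it
returns -1. Real code would normally call @code{scm_getc} or a similar
libguile procedure rather than reimplementing these steps.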
2276
2277 @subsubheading Writing to a port
2278
2279 To write data to a port, calling @code{scm_lfwrite} should be sufficient for
2280 most purposes. This takes care of the following steps:
2281
2282 @enumerate
2283 @item
2284 End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
2285
2286 @item
2287 Pass the data to the ptob implementation using the @code{write} ptob
2288 procedure. The advantage of using the ptob @code{write} instead of
2289 manipulating the write buffer directly is that it allows the data to be
2290 written in one operation even if the port is using the single-byte
2291 @code{shortbuf}.
2292
2293 @item
2294 Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
2295 is set.
2296 @end enumerate
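
The write path can be modelled in the same spirit. The sketch below is
a standalone illustration and not the libguile implementation: the names
@code{toy_port}, @code{toy_end_input}, @code{toy_ptob_write} and
@code{toy_lfwrite} are hypothetical, and a fixed array stands in for the
file behind an fport. It assumes that ending input means moving the
descriptor position back over input that was buffered but never read, as
described for fports above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the libguile port state; not the real API. */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port {
  char file[64];       /* stands in for the file behind the port */
  size_t file_len;
  size_t fd_offset;    /* position of the underlying descriptor */
  size_t unread;       /* bytes buffered for input but never read;
                          assumed never to exceed fd_offset */
  int rw_random;
  enum toy_rw rw_active;
};

/* Model of end_input: rewind the descriptor over unconsumed input. */
static void toy_end_input(struct toy_port *pt)
{
  pt->fd_offset -= pt->unread;
  pt->unread = 0;
  pt->rw_active = TOY_NEITHER;
}

/* Model of the write ptob procedure: accept the whole block in one
   operation, truncating only if the toy file is full.  */
static size_t toy_ptob_write(struct toy_port *pt, const void *data,
                             size_t size)
{
  size_t room = sizeof pt->file - pt->fd_offset;

  if (size > room)
    size = room;
  memcpy(pt->file + pt->fd_offset, data, size);
  pt->fd_offset += size;
  if (pt->fd_offset > pt->file_len)
    pt->file_len = pt->fd_offset;
  return size;
}

/* The three steps from the list above, in order. */
static size_t toy_lfwrite(struct toy_port *pt, const void *data,
                          size_t size)
{
  if (pt->rw_active == TOY_READ)            /* step 1 */
    toy_end_input(pt);
  size = toy_ptob_write(pt, data, size);    /* step 2 */
  if (pt->rw_random)                        /* step 3 */
    pt->rw_active = TOY_WRITE;
  return size;
}
```

If the model port has read ahead to offset 8 of a file but the program
has consumed only 5 bytes, a one-byte write lands at offset 5, which is
where the program believes it is, rather than at offset 8.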
2297
2298
2299 @node Port Implementation
2300 @subsubsection Port Implementation
2301 @cindex Port implementation
2302
2303 This section describes how to implement a new port type in C.
2304
2305 As described in the previous section, a port type object (ptob) is
2306 a structure of type @code{scm_ptob_descriptor}. A ptob is created by
2307 calling @code{scm_make_port_type}.
2308
2309 @deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
2310 Return a new port type object. The @var{name}, @var{fill_input} and
2311 @var{write} parameters are initial values for those port type fields,
2312 as described below. The other fields are initialized with default
2313 values and can be changed later.
2314 @end deftypefun
2315
2316 All of the elements of the ptob, apart from @code{name}, are procedures
2317 which collectively implement the port behaviour. Creating a new port
2318 type mostly involves writing these procedures.
2319
2320 @table @code
2321 @item name
2322 A pointer to a NUL terminated string: the name of the port type. This
2323 is the only element of @code{scm_ptob_descriptor} which is not
2324 a procedure. Set via the first argument to @code{scm_make_port_type}.
2325
2326 @item mark
2327 Called during garbage collection to mark any SCM objects that a port
2328 object may contain. It doesn't need to be set unless the port has
2329 @code{SCM} components. Set using
2330
2331 @deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
2332 @end deftypefun
2333
2334 @item free
2335 Called when the port is collected during gc. It
2336 should free any resources used by the port.
2337 Set using
2338
2339 @deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
2340 @end deftypefun
2341
2342 @item print
2343 Called when @code{write} is called on the port object, to print a
2344 port description. E.g., for an fport it may produce something like:
2345 @code{#<input: /etc/passwd 3>}. Set using
2346
2347 @deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
2348 The first argument @var{port} is the object being printed, the second
2349 argument @var{dest_port} is where its description should go.
2350 @end deftypefun
2351
2352 @item equalp
2353 Not used at present. Set using
2354
2355 @deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
2356 @end deftypefun
2357
2358 @item close
2359 Called when the port is closed, unless it was collected during gc. It
2360 should free any resources used by the port.
2361 Set using
2362
2363 @deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
2364 @end deftypefun
2365
2366 @item write
2367 Accept data which is to be written using the port. The port implementation
2368 may choose to buffer the data instead of processing it directly.
2369 Set via the third argument to @code{scm_make_port_type}.
2370
2371 @item flush
2372 Complete the processing of buffered output data. Reset the value of
2373 @code{rw_active} to @code{SCM_PORT_NEITHER}.
2374 Set using
2375
2376 @deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
2377 @end deftypefun
2378
2379 @item end_input
2380 Perform any synchronization required when switching from input to output
2381 on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
2382 Set using
2383
2384 @deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
2385 @end deftypefun
2386
2387 @item fill_input
2388 Read new data into the read buffer and return the first character. It
2389 can be assumed that the read buffer is empty when this procedure is called.
2390 Set via the second argument to @code{scm_make_port_type}.
2391
2392 @item input_waiting
2393 Return a lower bound on the number of bytes that could be read from the
2394 port without blocking. It can be assumed that the current state of
2395 @code{rw_active} is @code{SCM_PORT_NEITHER}.
2396 Set using
2397
2398 @deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
2399 @end deftypefun
2400
2401 @item seek
2402 Set the current position of the port. The procedure cannot make
2403 any assumptions about the value of @code{rw_active} when it's
2404 called. It can reset the buffers first if desired by using something
2405 like:
2406
2407 @example
2408 if (pt->rw_active == SCM_PORT_READ)
2409   scm_end_input (port);
2410 else if (pt->rw_active == SCM_PORT_WRITE)
2411   ptob->flush (port);
2412 @end example
2413
2414 However, note that this will have the side effect of discarding any data
2415 in the unread-char buffer, in addition to any side effects from the
2416 @code{end_input} and @code{flush} ptob procedures. This is undesirable
2417 when seek is called to measure the current position of the port, i.e.,
2418 @code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
2419 implementations take care to avoid this problem.
2420
2421 The procedure is set using
2422
2423 @deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
2424 @end deftypefun
2425
2426 @item truncate
2427 Truncate the port data to the specified length. It can be assumed that the
2428 current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
2429 Set using
2430
2431 @deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
2432 @end deftypefun
2433
2434 @end table
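
The @code{seek} entry above notes that measuring the current position
should not disturb the buffers. One way to achieve this is to compute
the logical position from the descriptor position and the buffer state
instead of resetting the buffers first. The following standalone sketch
illustrates that arithmetic. It is a simplified model rather than the
libguile fport code: @code{toy_port} and @code{toy_tell} are
hypothetical names, and the unread-char buffer is ignored.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libguile port state; not the real API. */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port {
  long fd_offset;            /* position of the underlying descriptor */
  size_t read_pos, read_end; /* unread input lies in [read_pos, read_end) */
  size_t write_pending;      /* bytes buffered for output, not yet flushed */
  enum toy_rw rw_active;
};

/* Report the logical position without touching the buffers. */
static long toy_tell(const struct toy_port *pt)
{
  if (pt->rw_active == TOY_READ)
    /* The descriptor is ahead of the reader by the unread bytes. */
    return pt->fd_offset - (long) (pt->read_end - pt->read_pos);
  if (pt->rw_active == TOY_WRITE)
    /* The descriptor is behind the writer by the unflushed bytes. */
    return pt->fd_offset + (long) pt->write_pending;
  return pt->fd_offset;
}
```

Because @code{toy_tell} takes a pointer to a @code{const} port, it
cannot change the buffers, so any buffered input stays available to
later reads.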
2435
2436 @node BOM Handling
2437 @subsection Handling of Unicode byte order marks
2438 @cindex BOM
2439 @cindex byte order mark
2440
2441 This section documents the finer points of Guile's handling of Unicode
2442 byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
2443 at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
2444 determine the byte order. Occasionally, a BOM is found at the start of
2445 a UTF-8 stream, but this is much less common and not generally
2446 recommended.
2447
2448 Guile attempts to handle BOMs automatically, and in accordance with the
2449 recommendations of the Unicode Standard, when the port encoding is set
2450 to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
2451 automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
2452 and automatically consumes one from the start of a UTF-8, UTF-16, or
2453 UTF-32 stream.
2454
2455 As specified in the Unicode Standard, a BOM is only handled specially at
2456 the start of a stream, and only if the port encoding is set to
2457 @code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
2458 set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
2459 @code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
2460 the special handling described in this section applies.
2461
2462 @itemize @bullet
2463 @item
2464 To ensure that Guile will properly detect the byte order of a UTF-16 or
2465 UTF-32 stream, you must perform a textual read before any writes, seeks,
2466 or binary I/O. Guile will not attempt to read a BOM unless a read is
2467 explicitly requested at the start of the stream.
2468
2469 @item
2470 If a textual write is performed before the first read, then an arbitrary
2471 byte order will be chosen. Currently, big endian is the default on all
2472 platforms, but that may change in the future. If you wish to explicitly
2473 control the byte order of an output stream, set the port encoding to
2474 @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
2475 and explicitly write a BOM (@code{#\xFEFF}) if desired.
2476
2477 @item
2478 If @code{set-port-encoding!} is called in the middle of a stream, Guile
2479 treats this as a new logical ``start of stream'' for purposes of BOM
2480 handling, and will forget about any BOMs that had previously been seen.
2481 Therefore, it may choose a different byte order than had been used
2482 previously. This is intended to support multiple logical text streams
2483 embedded within a larger binary stream.
2484
2485 @item
2486 Binary I/O operations are not guaranteed to update Guile's notion of
2487 whether the port is at the ``start of the stream'', nor are they
2488 guaranteed to produce or consume BOMs.
2489
2490 @item
2491 For ports that support seeking (e.g. normal files), the input and output
2492 streams are considered linked: if the user reads first, then a BOM will
2493 be consumed (if appropriate), but later writes will @emph{not} produce a
2494 BOM. Similarly, if the user writes first, then later reads will
2495 @emph{not} consume a BOM.
2496
2497 @item
2498 For ports that do not support seeking (e.g. pipes, sockets, and
2499 terminals), the input and output streams are considered
2500 @emph{independent} for purposes of BOM handling: the first read will
2501 consume a BOM (if appropriate), and the first write will @emph{also}
2502 produce a BOM (if appropriate). However, the input and output streams
2503 will always use the same byte order.
2504
2505 @item
2506 Seeks to the beginning of a file will set the ``start of stream'' flags.
2507 Therefore, a subsequent textual read or write will consume or produce a
2508 BOM. However, unlike @code{set-port-encoding!}, if a byte order had
2509 already been chosen for the port, it will remain in effect after a seek,
2510 and cannot be changed by the presence of a BOM. Seeks anywhere other
2511 than the beginning of a file clear the ``start of stream'' flags.
2512 @end itemize
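
The rule for consuming a BOM on the first textual read can be
illustrated with a short standalone sketch. This is not Guile's
implementation: @code{toy_consume_bom}, @code{toy_bom_result} and
@code{toy_order} are hypothetical names, and the sketch assumes
big-endian byte order when a UTF-16 or UTF-32 stream carries no BOM. It
only shows which byte sequences count as a BOM for each encoding name,
and that the encodings with an explicit byte order receive no special
handling.

```c
#include <stddef.h>
#include <string.h>

enum toy_order { TOY_BIG_ENDIAN, TOY_LITTLE_ENDIAN };

struct toy_bom_result {
  size_t consumed;      /* number of BOM bytes to skip; 0 if none */
  enum toy_order order; /* meaningful only for "UTF-16" and "UTF-32" */
};

/* Inspect the first bytes of a stream for a BOM, given the port
   encoding name.  Big endian is assumed when no BOM is present.  */
static struct toy_bom_result
toy_consume_bom(const char *encoding, const unsigned char *buf, size_t len)
{
  struct toy_bom_result r = { 0, TOY_BIG_ENDIAN };

  if (strcmp(encoding, "UTF-8") == 0)
    {
      if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        r.consumed = 3;
    }
  else if (strcmp(encoding, "UTF-16") == 0)
    {
      if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        r.consumed = 2;
      else if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        {
          r.consumed = 2;
          r.order = TOY_LITTLE_ENDIAN;
        }
    }
  else if (strcmp(encoding, "UTF-32") == 0)
    {
      if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00
          && buf[2] == 0xFE && buf[3] == 0xFF)
        r.consumed = 4;
      else if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE
               && buf[2] == 0x00 && buf[3] == 0x00)
        {
          r.consumed = 4;
          r.order = TOY_LITTLE_ENDIAN;
        }
    }
  /* UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE and all other encodings:
     the byte order is explicit, so a leading U+FEFF is ordinary data.  */
  return r;
}
```

With this model, the bytes @code{FF FE} at the start of a stream are
skipped and select little-endian decoding when the encoding is
@code{UTF-16}, but are left in place as data when the encoding is
@code{UTF-16LE}.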
2513
2514 @c Local Variables:
2515 @c TeX-master: "guile.texi"
2516 @c End: