Add the `%default-port-conversion-strategy' fluid.
1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
4 @c 2010, 2011 Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
6
7 @node Input and Output
8 @section Input and Output
9
10 @menu
11 * Ports:: The idea of the port abstraction.
12 * Reading:: Procedures for reading from a port.
13 * Writing:: Procedures for writing to a port.
14 * Closing:: Procedures to close a port.
15 * Random Access:: Moving around a random access port.
16 * Line/Delimited:: Read and write lines or delimited text.
17 * Block Reading and Writing:: Reading and writing blocks of text.
18 * Default Ports:: Defaults for input, output and errors.
19 * Port Types:: Types of port and how to make them.
20 * R6RS I/O Ports:: The R6RS port API.
21 * I/O Extensions:: Using and extending ports in C.
22 @end menu
23
24
25 @node Ports
26 @subsection Ports
27 @cindex Port
28
29 Sequential input/output in Scheme is represented by operations on a
30 @dfn{port}. This chapter explains the operations that Guile provides
31 for working with ports.
32
33 Ports are created by opening, for instance @code{open-file} for a file
34 (@pxref{File Ports}). Characters can be read from an input port and
35 written to an output port, or both on an input/output port. A port
36 can be closed (@pxref{Closing}) when no longer required, after which
37 any attempt to read or write is an error.
38
39 The formal definition of a port is very generic: an input port is
40 simply ``an object which can deliver characters on demand,'' and an
41 output port is ``an object which can accept characters.'' Because
42 this definition is so loose, it is easy to write functions that
43 simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
44 are two interesting and powerful examples of this technique.
45 (@pxref{Soft Ports}, and @ref{String Ports}.)
46
47 Ports are garbage collected in the usual way (@pxref{Memory
48 Management}), and will be closed at that time if not already closed.
49 In this case any errors occurring in the close will not be reported.
50 Usually a program will want to explicitly close so as to be sure all
51 its operations have been successful. Of course if a program has
52 abandoned something due to an error or other condition then closing
53 problems are probably not of interest.
54
55 It is strongly recommended that file ports be closed explicitly when
56 no longer required. Most systems have limits on how many files can be
57 open, both on a per-process and a system-wide basis. A program that
58 uses many files should take care not to hit those limits. The same
59 applies to similar system resources such as pipes and sockets.
60
61 Note that automatic garbage collection is triggered only by memory
62 consumption, not by file or other resource usage, so a program cannot
63 rely on that to keep it away from system limits. An explicit call to
64 @code{gc} can of course be relied on to pick up unreferenced ports.
65 If program flow makes it hard to be certain when to close then this
66 may be an acceptable way to control resource usage.
67
68 All file access uses the ``LFS'' large file support functions when
69 available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
70 read and written on a 32-bit system.
71
72 Each port has an associated character encoding that controls how bytes
73 read from the port are converted to characters and strings and controls
74 how characters and strings written to the port are converted to bytes.
75 When ports are created, they inherit their character encoding from the
76 current locale, but that can be modified after the port is created.
77
78 Currently, the ports only work with @emph{non-modal} encodings. Most
79 encodings are non-modal, meaning that the conversion of bytes to a
80 string doesn't depend on its context: the same byte sequence will always
81 return the same string. A couple of modal encodings are in common use,
82 like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
83
84 Each port also has an associated conversion strategy: what to do when
85 a Guile character can't be converted to the port's encoded character
86 representation for output. There are three possible strategies: to
87 raise an error, to replace the character with a hex escape, or to
88 replace the character with a substitute character.
89
90 @rnindex input-port?
91 @deffn {Scheme Procedure} input-port? x
92 @deffnx {C Function} scm_input_port_p (x)
93 Return @code{#t} if @var{x} is an input port, otherwise return
94 @code{#f}. Any object satisfying this predicate also satisfies
95 @code{port?}.
96 @end deffn
97
98 @rnindex output-port?
99 @deffn {Scheme Procedure} output-port? x
100 @deffnx {C Function} scm_output_port_p (x)
101 Return @code{#t} if @var{x} is an output port, otherwise return
102 @code{#f}. Any object satisfying this predicate also satisfies
103 @code{port?}.
104 @end deffn
105
106 @deffn {Scheme Procedure} port? x
107 @deffnx {C Function} scm_port_p (x)
108 Return a boolean indicating whether @var{x} is a port.
109 Equivalent to @code{(or (input-port? @var{x}) (output-port?
110 @var{x}))}.
111 @end deffn
112
113 @deffn {Scheme Procedure} set-port-encoding! port enc
114 @deffnx {C Function} scm_set_port_encoding_x (port, enc)
115 Sets the character encoding that will be used to interpret all port I/O.
116 @var{enc} is a string containing the name of an encoding. Valid
117 encoding names are those
118 @url{http://www.iana.org/assignments/character-sets, defined by IANA}.
119 @end deffn
120
121 @defvr {Scheme Variable} %default-port-encoding
122 A fluid containing @code{#f} or the name of the encoding to
123 be used by default for newly created ports (@pxref{Fluids and Dynamic
124 States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
125
126 New ports are created with the encoding appropriate for the current
127 locale if @code{setlocale} has been called, or the value specified by
128 this fluid otherwise.
129 @end defvr
130
131 @deffn {Scheme Procedure} port-encoding port
132 @deffnx {C Function} scm_port_encoding (port)
133 Returns, as a string, the character encoding that @var{port} uses to interpret
134 its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
135 @end deffn
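
As a minimal sketch of the two procedures above, the following sets and
then queries the encoding of a port. It assumes a string port created
with @code{open-input-string} (@pxref{String Ports}); the name @code{p}
is only illustrative.

@lisp
;; Create a port, then choose its encoding explicitly.
(define p (open-input-string "hello"))
(set-port-encoding! p "UTF-8")

;; Query the encoding; this is expected to return the string "UTF-8".
(port-encoding p)
@end lisp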
136
137 @deffn {Scheme Procedure} set-port-conversion-strategy! port sym
138 @deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
139 Sets the behavior of the interpreter when outputting a character that
140 is not representable in the port's current encoding. @var{sym} can be
141 either @code{'error}, @code{'substitute}, or @code{'escape}. If it is
142 @code{'error}, an error will be thrown when a nonconvertible character
143 is encountered. If it is @code{'substitute}, then nonconvertible
144 characters will be replaced with approximate characters, or with
145 question marks if no approximately correct character is available. If
146 it is @code{'escape}, it will appear as a hex escape when output.
147
148 If @var{port} is an open port, the conversion error behavior
149 is set for that port. If it is @code{#f}, it is set as the
150 default behavior for any future ports that get created in
151 this thread.
152 @end deffn
153
154 @deffn {Scheme Procedure} port-conversion-strategy port
155 @deffnx {C Function} scm_port_conversion_strategy (port)
156 Returns the behavior of the port when outputting a character that is
157 not representable in the port's current encoding. It returns the
158 symbol @code{error} if unrepresentable characters should cause
159 exceptions, @code{substitute} if the port should try to replace
160 unrepresentable characters with question marks or approximate
161 characters, or @code{escape} if unrepresentable characters should be
162 converted to string escapes.
163
164 If @var{port} is @code{#f}, then the current default behavior will be
165 returned. New ports will have this default behavior when they are
166 created.
167 @end deffn
168
169 @deffn {Scheme Variable} %default-port-conversion-strategy
170 The fluid that defines the conversion strategy for newly created ports,
171 and for other conversion routines such as @code{scm_to_stringn},
172 @code{scm_from_stringn}, @code{string->pointer}, and
173 @code{pointer->string}.
174
175 Its value must be one of the symbols described above, with the same
176 semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
177
178 When Guile starts, its value is @code{'substitute}.
179
180 Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
181 equivalent to @code{(fluid-set! %default-port-conversion-strategy
182 @var{sym})}.
183 @end deffn
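
The following sketch shows one way the default strategy might be changed
for a limited dynamic extent using @code{with-fluids} (@pxref{Fluids and
Dynamic States}). The helper name @code{make-escaping-string-port} is
hypothetical, and the example assumes that a string port picks up the
default strategy at creation time, as described above for newly created
ports.

@lisp
;; Hypothetical helper: create an output string port while the
;; default conversion strategy is temporarily 'escape.
(define (make-escaping-string-port)
  (with-fluids ((%default-port-conversion-strategy 'escape))
    (open-output-string)))

;; The new port is expected to report the symbol escape.
(port-conversion-strategy (make-escaping-string-port))

;; Outside the with-fluids body the default is unchanged.
(port-conversion-strategy #f)
@end lisp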
184
185
186 @node Reading
187 @subsection Reading
188 @cindex Reading
189
190 [Generic procedures for reading from ports.]
191
192 These procedures pertain to reading characters and strings from
193 ports. To read general S-expressions from ports, see @ref{Scheme Read}.
194
195 @rnindex eof-object?
196 @cindex End of file object
197 @deffn {Scheme Procedure} eof-object? x
198 @deffnx {C Function} scm_eof_object_p (x)
199 Return @code{#t} if @var{x} is an end-of-file object; otherwise
200 return @code{#f}.
201 @end deffn
202
203 @rnindex char-ready?
204 @deffn {Scheme Procedure} char-ready? [port]
205 @deffnx {C Function} scm_char_ready_p (port)
206 Return @code{#t} if a character is ready on input @var{port}
207 and return @code{#f} otherwise. If @code{char-ready?} returns
208 @code{#t} then the next @code{read-char} operation on
209 @var{port} is guaranteed not to hang. If @var{port} is a file
210 port at end of file then @code{char-ready?} returns @code{#t}.
211
212 @code{char-ready?} exists to make it possible for a
213 program to accept characters from interactive ports without
214 getting stuck waiting for input. Any input editors associated
215 with such ports must make sure that characters whose existence
216 has been asserted by @code{char-ready?} cannot be rubbed out.
217 If @code{char-ready?} were to return @code{#f} at end of file,
218 a port at end of file would be indistinguishable from an
219 interactive port that has no ready characters.
220 @end deffn
221
222 @rnindex read-char
223 @deffn {Scheme Procedure} read-char [port]
224 @deffnx {C Function} scm_read_char (port)
225 Return the next character available from @var{port}, updating
226 @var{port} to point to the following character. If no more
227 characters are available, the end-of-file object is returned.
228
229 When @var{port}'s data cannot be decoded according to its
230 character encoding, a @code{decoding-error} is raised and
231 @var{port} points past the erroneous byte sequence.
232 @end deffn
233
234 @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
235 Read up to @var{size} bytes from @var{port} and store them in
236 @var{buffer}. The return value is the number of bytes actually read,
237 which can be less than @var{size} if end-of-file has been reached.
238
239 Note that this function does not update @code{port-line} and
240 @code{port-column} below.
241 @end deftypefn
242
243 @rnindex peek-char
244 @deffn {Scheme Procedure} peek-char [port]
245 @deffnx {C Function} scm_peek_char (port)
246 Return the next character available from @var{port},
247 @emph{without} updating @var{port} to point to the following
248 character. If no more characters are available, the
249 end-of-file object is returned.
250
251 The value returned by
252 a call to @code{peek-char} is the same as the value that would
253 have been returned by a call to @code{read-char} on the same
254 port. The only difference is that the very next call to
255 @code{read-char} or @code{peek-char} on that @var{port} will
256 return the value returned by the preceding call to
257 @code{peek-char}. In particular, a call to @code{peek-char} on
258 an interactive port will hang waiting for input whenever a call
259 to @code{read-char} would have hung.
260
261 As for @code{read-char}, a @code{decoding-error} may be raised
262 if such a situation occurs. However, unlike with @code{read-char},
263 @var{port} still points at the beginning of the erroneous byte
264 sequence when the error is raised.
265 @end deffn
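
The difference between @code{peek-char} and @code{read-char} can be
illustrated with a short sketch. It assumes a string port created with
@code{open-input-string} (@pxref{String Ports}) so that the input is
known in advance.

@lisp
(define p (open-input-string "ab"))

(peek-char p)                 ; #\a, the port does not advance
(read-char p)                 ; #\a again, now the port advances
(read-char p)                 ; #\b
(eof-object? (read-char p))   ; #t, no characters are left
@end lisp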
266
267 @deffn {Scheme Procedure} unread-char cobj [port]
268 @deffnx {C Function} scm_unread_char (cobj, port)
269 Place character @var{cobj} in @var{port} so that it will be read by the
270 next read operation. If called multiple times, the unread characters
271 will be read again in last-in first-out order. If @var{port} is
272 not supplied, the current input port is used.
273 @end deffn
274
275 @deffn {Scheme Procedure} unread-string str [port]
276 @deffnx {C Function} scm_unread_string (str, port)
277 Place the string @var{str} in @var{port} so that its characters will
278 be read from left-to-right as the next characters from @var{port}
279 during subsequent read operations. If called multiple times, the
280 unread characters will be read again in last-in first-out order. If
281 @var{port} is not supplied, the @code{current-input-port} is used.
282 @end deffn
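
A small sketch of pushing characters back onto a port, again assuming a
string port so that the result is easy to follow:

@lisp
(define p (open-input-string "c"))

;; Push back a single character, then a string in front of it.
(unread-char #\b p)
(unread-string "xa" p)

;; The most recently unread data comes out first, and the string
;; is read from left to right.
(read-char p)   ; #\x
(read-char p)   ; #\a
(read-char p)   ; #\b
(read-char p)   ; #\c
@end lisp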
283
284 @deffn {Scheme Procedure} drain-input port
285 @deffnx {C Function} scm_drain_input (port)
286 This procedure clears a port's input buffers, similar
287 to the way that @code{force-output} clears the output buffer. The
288 contents of the buffers are returned as a single string, e.g.,
289
290 @lisp
291 (define p (open-input-file ...))
292 (drain-input p) => empty string, nothing buffered yet.
293 (unread-char (read-char p) p)
294 (drain-input p) => initial chars from p, up to the buffer size.
295 @end lisp
296
297 Draining the buffers may be useful for cleanly finishing
298 buffered I/O so that the file descriptor can be used directly
299 for further input.
300 @end deffn
301
302 @deffn {Scheme Procedure} port-column port
303 @deffnx {Scheme Procedure} port-line port
304 @deffnx {C Function} scm_port_column (port)
305 @deffnx {C Function} scm_port_line (port)
306 Return the current column number or line number of @var{port}.
307 If the number is
308 unknown, the result is @code{#f}. Otherwise, the result is a 0-origin
309 integer, i.e.@: the first character of the first line is line 0, column 0.
310 (However, when you display a file position, for example in an error
311 message, we recommend you add 1 to get 1-origin integers. This is
312 because lines and column numbers traditionally start with 1, and that is
313 what non-programmers will find most natural.)
314 @end deffn
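
As a sketch of the recommendation above, a program might convert the
0-origin values to 1-origin values before showing them to a user. The
helper name @code{position-string} is hypothetical, and the code assumes
that both the line and the column of the port are known (that is,
neither is @code{#f}).

@lisp
;; Hypothetical helper: format a port position as "LINE:COLUMN"
;; using 1-origin numbers for display.
(define (position-string port)
  (simple-format #f "~A:~A"
                 (+ 1 (port-line port))
                 (+ 1 (port-column port))))

(define p (open-input-string "ab\ncd"))
(read-char p)   ; #\a
(read-char p)   ; #\b
(read-char p)   ; #\newline, the port is now at the start of line 1

;; For a port that tracks positions this is expected to give "2:1".
(position-string p)
@end lisp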
315
316 @deffn {Scheme Procedure} set-port-column! port column
317 @deffnx {Scheme Procedure} set-port-line! port line
318 @deffnx {C Function} scm_set_port_column_x (port, column)
319 @deffnx {C Function} scm_set_port_line_x (port, line)
320 Set the current column or line number of @var{port}.
321 @end deffn
322
323 @node Writing
324 @subsection Writing
325 @cindex Writing
326
327 [Generic procedures for writing to ports.]
328
329 These procedures are for writing characters and strings to
330 ports. For more information on writing arbitrary Scheme objects to
331 ports, see @ref{Scheme Write}.
332
333 @deffn {Scheme Procedure} get-print-state port
334 @deffnx {C Function} scm_get_print_state (port)
335 Return the print state of the port @var{port}. If @var{port}
336 has no associated print state, @code{#f} is returned.
337 @end deffn
338
339 @rnindex newline
340 @deffn {Scheme Procedure} newline [port]
341 @deffnx {C Function} scm_newline (port)
342 Send a newline to @var{port}.
343 If @var{port} is omitted, send to the current output port.
344 @end deffn
345
346 @deffn {Scheme Procedure} port-with-print-state port [pstate]
347 @deffnx {C Function} scm_port_with_print_state (port, pstate)
348 Create a new port which behaves like @var{port}, but with an
349 included print state @var{pstate}. @var{pstate} is optional.
350 If @var{pstate} isn't supplied and @var{port} already has
351 a print state, the old print state is reused.
352 @end deffn
353
354 @deffn {Scheme Procedure} simple-format destination message . args
355 @deffnx {C Function} scm_simple_format (destination, message, args)
356 Write @var{message} to @var{destination}, defaulting to
357 the current output port.
358 @var{message} can contain @code{~A} (was @code{%s}) and
359 @code{~S} (was @code{%S}) escapes. When printed,
360 the escapes are replaced with corresponding members of
361 @var{args}:
362 @code{~A} formats using @code{display} and @code{~S} formats
363 using @code{write}.
364 If @var{destination} is @code{#t}, then use the current output
365 port; if @var{destination} is @code{#f}, then return a string
366 containing the formatted text. Does not add a trailing newline.
367 @end deffn
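
For instance, the two escapes differ in the same way that
@code{display} and @code{write} differ. A minimal sketch:

@lisp
;; ~A uses display, so the string is shown without quotes;
;; ~S uses write, so the string is shown with quotes.
(simple-format #f "~A and ~S" "foo" "bar")
;; returns the string:  foo and "bar"

;; With #t as the destination the text goes to the current
;; output port instead; no newline is added automatically.
(simple-format #t "value: ~A" 42)
(newline)
@end lisp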
368
369 @rnindex write-char
370 @deffn {Scheme Procedure} write-char chr [port]
371 @deffnx {C Function} scm_write_char (chr, port)
372 Send character @var{chr} to @var{port}.
373 @end deffn
374
375 @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
376 Write @var{size} bytes at @var{buffer} to @var{port}.
377
378 Note that this function does not update @code{port-line} and
379 @code{port-column} (@pxref{Reading}).
380 @end deftypefn
381
382 @findex fflush
383 @deffn {Scheme Procedure} force-output [port]
384 @deffnx {C Function} scm_force_output (port)
385 Flush the specified output port, or the current output port if @var{port}
386 is omitted. The current output buffer contents are passed to the
387 underlying port implementation (e.g., in the case of fports, the
388 data will be written to the file and the output buffer will be cleared).
389 It has no effect on an unbuffered port.
390
391 The return value is unspecified.
392 @end deffn
393
394 @deffn {Scheme Procedure} flush-all-ports
395 @deffnx {C Function} scm_flush_all_ports ()
396 Equivalent to calling @code{force-output} on
397 all open output ports. The return value is unspecified.
398 @end deffn
399
400
401 @node Closing
402 @subsection Closing
403 @cindex Closing ports
404 @cindex Port, close
405
406 @deffn {Scheme Procedure} close-port port
407 @deffnx {C Function} scm_close_port (port)
408 Close the specified port object. Return @code{#t} if it
409 successfully closes a port or @code{#f} if it was already
410 closed. An exception may be raised if an error occurs, for
411 example when flushing buffered output. See also @ref{Ports and
412 File Descriptors, close}, for a procedure which can close file
413 descriptors.
414 @end deffn
415
416 @deffn {Scheme Procedure} close-input-port port
417 @deffnx {Scheme Procedure} close-output-port port
418 @deffnx {C Function} scm_close_input_port (port)
419 @deffnx {C Function} scm_close_output_port (port)
420 @rnindex close-input-port
421 @rnindex close-output-port
422 Close the specified input or output @var{port}. An exception may be
423 raised if an error occurs while closing. If @var{port} is already
424 closed, nothing is done. The return value is unspecified.
425
426 See also @ref{Ports and File Descriptors, close}, for a procedure
427 which can close file descriptors.
428 @end deffn
429
430 @deffn {Scheme Procedure} port-closed? port
431 @deffnx {C Function} scm_port_closed_p (port)
432 Return @code{#t} if @var{port} is closed or @code{#f} if it is
433 open.
434 @end deffn
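
A short sketch of closing a port explicitly and checking its state,
assuming a string port for simplicity:

@lisp
(define p (open-input-string "x"))

(port-closed? p)   ; #f, the port is still open
(close-port p)
(port-closed? p)   ; #t
@end lisp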
435
436
437 @node Random Access
438 @subsection Random Access
439 @cindex Random access, ports
440 @cindex Port, random access
441
442 @deffn {Scheme Procedure} seek fd_port offset whence
443 @deffnx {C Function} scm_seek (fd_port, offset, whence)
444 Sets the current position of @var{fd_port} to the integer
445 @var{offset}, which is interpreted according to the value of
446 @var{whence}.
447
448 One of the following variables should be supplied for
449 @var{whence}:
450 @defvar SEEK_SET
451 Seek from the beginning of the file.
452 @end defvar
453 @defvar SEEK_CUR
454 Seek from the current position.
455 @end defvar
456 @defvar SEEK_END
457 Seek from the end of the file.
458 @end defvar
459 If @var{fd_port} is a file descriptor, the underlying system
460 call is @code{lseek}. @var{fd_port} may be a string port.
461
462 The value returned is the new position in the file. This means
463 that the current position of a port can be obtained using:
464 @lisp
465 (seek port 0 SEEK_CUR)
466 @end lisp
467 @end deffn
468
469 @deffn {Scheme Procedure} ftell fd_port
470 @deffnx {C Function} scm_ftell (fd_port)
471 Return an integer representing the current position of
472 @var{fd_port}, measured from the beginning. Equivalent to:
473
474 @lisp
475 (seek fd_port 0 SEEK_CUR)
476 @end lisp
477 @end deffn
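
The following sketch repositions a port and then reads from the new
position. It assumes a string port over ASCII text, so that every
character occupies one position, and that string ports support seeking
as noted above; the values in the comments are what one would expect
under those assumptions.

@lisp
(define p (open-input-string "abcdef"))

(seek p 2 SEEK_SET)   ; move to offset 2 from the beginning
(read-char p)         ; expected: #\c
(ftell p)             ; expected: 3, one past the character just read
@end lisp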
478
479 @findex truncate
480 @findex ftruncate
481 @deffn {Scheme Procedure} truncate-file file [length]
482 @deffnx {C Function} scm_truncate_file (file, length)
483 Truncate @var{file} to @var{length} bytes. @var{file} can be a
484 filename string, a port object, or an integer file descriptor. The
485 return value is unspecified.
486
487 For a port or file descriptor @var{length} can be omitted, in which
488 case the file is truncated at the current position (per @code{ftell}
489 above).
490
491 On most systems a file can be extended by giving a length greater than
492 the current size, but this is not mandatory in the POSIX standard.
493 @end deffn
494
495 @node Line/Delimited
496 @subsection Line Oriented and Delimited Text
497 @cindex Line input/output
498 @cindex Port, line input/output
499
500 The delimited-I/O module can be accessed with:
501
502 @lisp
503 (use-modules (ice-9 rdelim))
504 @end lisp
505
506 It can be used to read or write lines of text, or read text delimited by
507 a specified set of characters. It's similar to the @code{(scsh rdelim)}
508 module from guile-scsh, but does not use multiple values or character
509 sets and has an extra procedure @code{write-line}.
510
511 @c begin (scm-doc-string "rdelim.scm" "read-line")
512 @deffn {Scheme Procedure} read-line [port] [handle-delim]
513 Return a line of text from @var{port} if specified, otherwise from the
514 value returned by @code{(current-input-port)}. Under Unix, a line of text
515 is terminated by the first end-of-line character or by end-of-file.
516
517 If @var{handle-delim} is specified, it should be one of the following
518 symbols:
519 @table @code
520 @item trim
521 Discard the terminating delimiter. This is the default, but it will
522 be impossible to tell whether the read terminated with a delimiter or
523 end-of-file.
524 @item concat
525 Append the terminating delimiter (if any) to the returned string.
526 @item peek
527 Push the terminating delimiter (if any) back on to the port.
528 @item split
529 Return a pair containing the string read from the port and the
530 terminating delimiter or end-of-file object.
531 @end table
532
533 Like @code{read-char}, this procedure can throw to @code{decoding-error}
534 (@pxref{Reading, @code{read-char}}).
535 @end deffn
536
537 @c begin (scm-doc-string "rdelim.scm" "read-line!")
538 @deffn {Scheme Procedure} read-line! buf [port]
539 Read a line of text into the supplied string @var{buf} and return the
540 number of characters added to @var{buf}. If @var{buf} is filled, then
541 @code{#f} is returned.
542 Read from @var{port} if
543 specified, otherwise from the value returned by @code{(current-input-port)}.
544 @end deffn
545
546 @c begin (scm-doc-string "rdelim.scm" "read-delimited")
547 @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
548 Read text until one of the characters in the string @var{delims} is found
549 or end-of-file is reached. Read from @var{port} if supplied, otherwise
550 from the value returned by @code{(current-input-port)}.
551 @var{handle-delim} takes the same values as described for @code{read-line}.
552 @end deffn
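
A brief sketch of @code{read-line} and @code{read-delimited}, using
string ports so that the input is visible in the example:

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "one\ntwo"))

(read-line p)           ; "one", the newline is discarded (trim)
(read-line p 'split)    ; ("two" . #<eof>), ended by end-of-file

;; Stop at the first comma or semicolon.
(read-delimited ",;" (open-input-string "a,b;c"))   ; "a"
@end lisp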
553
554 @c begin (scm-doc-string "rdelim.scm" "read-delimited!")
555 @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
556 Read text into the supplied string @var{buf}.
557
558 If a delimiter was found, return the number of characters written,
559 except if @var{handle-delim} is @code{split}, in which case the return
560 value is a pair, as noted above.
561
562 As a special case, if @var{port} was already at end-of-stream, the EOF
563 object is returned. Also, if no characters were written because the
564 buffer was full, @code{#f} is returned.
565
566 It's something of a wacky interface, to be honest.
567 @end deffn
568
569 @deffn {Scheme Procedure} write-line obj [port]
570 @deffnx {C Function} scm_write_line (obj, port)
571 Display @var{obj} and a newline character to @var{port}. If
572 @var{port} is not specified, @code{(current-output-port)} is
573 used. This function is equivalent to:
574 @lisp
575 (display obj [port])
576 (newline [port])
577 @end lisp
578 @end deffn
579
580 Some of the aforementioned I/O functions rely on the following C
581 primitives. These will mainly be of interest to people hacking Guile
582 internals.
583
584 @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
585 @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
586 Read characters from @var{port} into @var{str} until one of the
587 characters in the @var{delims} string is encountered. If
588 @var{gobble} is true, discard the delimiter character;
589 otherwise, leave it in the input stream for the next read. If
590 @var{port} is not specified, use the value of
591 @code{(current-input-port)}. If @var{start} or @var{end} are
592 specified, store data only into the substring of @var{str}
593 bounded by @var{start} and @var{end} (which default to the
594 beginning and end of the string, respectively).
595
596 Return a pair consisting of the delimiter that terminated the
597 string and the number of characters read. If reading stopped
598 at the end of file, the delimiter returned is the
599 @var{eof-object}; if the string was filled without encountering
600 a delimiter, this value is @code{#f}.
601 @end deffn
602
603 @deffn {Scheme Procedure} %read-line [port]
604 @deffnx {C Function} scm_read_line (port)
605 Read a newline-terminated line from @var{port}, allocating storage as
606 necessary. The newline terminator (if any) is removed from the string,
607 and a pair consisting of the line and its delimiter is returned. The
608 delimiter may be either a newline or the @var{eof-object}; if
609 @code{%read-line} is called at the end of file, it returns the pair
610 @code{(#<eof> . #<eof>)}.
611 @end deffn
612
613 @node Block Reading and Writing
614 @subsection Block reading and writing
615 @cindex Block read/write
616 @cindex Port, block read/write
617
618 The Block-string-I/O module can be accessed with:
619
620 @lisp
621 (use-modules (ice-9 rw))
622 @end lisp
623
624 It currently contains procedures that help to implement the
625 @code{(scsh rw)} module in guile-scsh.
626
627 @deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
628 @deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
629 Read characters from a port or file descriptor into a
630 string @var{str}. A port must have an underlying file
631 descriptor --- a so-called fport. This procedure is
632 scsh-compatible and can efficiently read large strings.
633 It will:
634
635 @itemize
636 @item
637 attempt to fill the entire string, unless the @var{start}
638 and/or @var{end} arguments are supplied; i.e., @var{start}
639 defaults to 0 and @var{end} defaults to
640 @code{(string-length str)}.
641 @item
642 use the current input port if @var{port_or_fdes} is not
643 supplied.
644 @item
645 return fewer than the requested number of characters in some
646 cases, e.g., on end of file, if interrupted by a signal, or if
647 not all the characters are immediately available.
648 @item
649 wait indefinitely for some input if no characters are
650 currently available,
651 unless the port is in non-blocking mode.
652 @item
653 read characters from the port's input buffers if available,
654 instead of from the underlying file descriptor.
655 @item
656 return @code{#f} if end-of-file is encountered before reading
657 any characters, otherwise return the number of characters
658 read.
659 @item
660 return 0 if the port is in non-blocking mode and no characters
661 are immediately available.
662 @item
663 return 0 if the request is for 0 bytes, with no
664 end-of-file check.
665 @end itemize
666 @end deffn
667
668 @deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
669 @deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
670 Write characters from a string @var{str} to a port or file
671 descriptor. A port must have an underlying file descriptor
672 --- a so-called fport. This procedure is
673 scsh-compatible and can efficiently write large strings.
674 It will:
675
676 @itemize
677 @item
678 attempt to write the entire string, unless the @var{start}
679 and/or @var{end} arguments are supplied; i.e., @var{start}
680 defaults to 0 and @var{end} defaults to
681 @code{(string-length str)}.
682 @item
683 use the current output port if @var{port_or_fdes} is not
684 supplied.
685 @item
686 in the case of a buffered port, store the characters in the
687 port's output buffer, if all will fit. If they will not fit
688 then any existing buffered characters will be flushed
689 before attempting
690 to write the new characters directly to the underlying file
691 descriptor. If the port is in non-blocking mode and
692 buffered characters can not be flushed immediately, then an
693 @code{EAGAIN} system-error exception will be raised. (Note:
694 scsh does not support the use of non-blocking buffered ports.)
695 @item
696 write fewer than the requested number of
697 characters in some cases, e.g., if interrupted by a signal or
698 if not all of the output can be accepted immediately.
699 @item
700 wait indefinitely for at least one character
701 from @var{str} to be accepted by the port, unless the port is
702 in non-blocking mode.
703 @item
704 return the number of characters accepted by the port.
705 @item
706 return 0 if the port is in non-blocking mode and can not accept
707 at least one character from @var{str} immediately.
708 @item
709 return 0 immediately if the request size is 0 bytes.
710 @end itemize
711 @end deffn
712
713 @node Default Ports
714 @subsection Default Ports for Input, Output and Errors
715 @cindex Default ports
716 @cindex Port, default
717
718 @rnindex current-input-port
719 @deffn {Scheme Procedure} current-input-port
720 @deffnx {C Function} scm_current_input_port ()
721 @cindex standard input
722 Return the current input port. This is the default port used
723 by many input procedures.
724
725 Initially this is the @dfn{standard input} in Unix and C terminology.
726 When the standard input is a tty the port is unbuffered, otherwise
727 it's fully buffered.
728
729 Unbuffered input is good if an application runs an interactive
730 subprocess, since any type-ahead input won't go into Guile's buffer
731 and be unavailable to the subprocess.
732
733 Note that Guile buffering is completely separate from the tty ``line
734 discipline''. In the usual cooked mode on a tty Guile only sees a
735 line of input once the user presses @key{Return}.
736 @end deffn
737
738 @rnindex current-output-port
739 @deffn {Scheme Procedure} current-output-port
740 @deffnx {C Function} scm_current_output_port ()
741 @cindex standard output
742 Return the current output port. This is the default port used
743 by many output procedures.
744
745 Initially this is the @dfn{standard output} in Unix and C terminology.
746 When the standard output is a tty this port is unbuffered, otherwise
747 it's fully buffered.
748
749 Unbuffered output to a tty is good for ensuring progress output or a
750 prompt is seen. But an application which always prints whole lines
751 could change to line buffered, or an application with a lot of output
752 could go fully buffered and perhaps make explicit @code{force-output}
753 calls (@pxref{Writing}) at selected points.
754 @end deffn
755
756 @deffn {Scheme Procedure} current-error-port
757 @deffnx {C Function} scm_current_error_port ()
758 @cindex standard error output
759 Return the port to which errors and warnings should be sent.
760
761 Initially this is the @dfn{standard error} in Unix and C terminology.
762 When the standard error is a tty this port is unbuffered, otherwise
763 it's fully buffered.
764 @end deffn
765
766 @deffn {Scheme Procedure} set-current-input-port port
767 @deffnx {Scheme Procedure} set-current-output-port port
768 @deffnx {Scheme Procedure} set-current-error-port port
769 @deffnx {C Function} scm_set_current_input_port (port)
770 @deffnx {C Function} scm_set_current_output_port (port)
771 @deffnx {C Function} scm_set_current_error_port (port)
772 Change the ports returned by @code{current-input-port},
773 @code{current-output-port} and @code{current-error-port}, respectively,
774 so that they use the supplied @var{port} for input or output.
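
As a minimal sketch, the following redirects the current output port to
a string port and then restores it by hand.  The names @code{saved},
@code{capture} and @code{captured-text} are chosen for this illustration
only.

@lisp
(define saved (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "captured")              ; written to the string port
(set-current-output-port saved)   ; restore the previous port

(define captured-text (get-output-string capture))
;; captured-text @result{} "captured"
@end lisp

Unlike @code{with-output-to-string}, this sketch does not restore the
previous port if the code in between exits non-locally.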
775 @end deffn
776
777 @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
778 @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
779 @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
780 These functions must be used inside a pair of calls to
781 @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
782 Wind}). During the dynwind context, the indicated port is set to
783 @var{port}.
784
785 More precisely, the current port is swapped with a `backup' value
786 whenever the dynwind context is entered or left. The backup value is
787 initialized with the @var{port} argument.
788 @end deftypefn
789
790 @node Port Types
791 @subsection Types of Port
792 @cindex Types of ports
793 @cindex Port, types
794
795 [Types of port; how to make them.]
796
797 @menu
798 * File Ports:: Ports on an operating system file.
799 * String Ports:: Ports on a Scheme string.
800 * Soft Ports:: Ports on arbitrary Scheme procedures.
801 * Void Ports:: Ports on nothing at all.
802 @end menu
803
804
805 @node File Ports
806 @subsubsection File Ports
807 @cindex File port
808 @cindex Port, file
809
810 The following procedures are used to open file ports.
811 See also @ref{Ports and File Descriptors, open}, for an interface
812 to the Unix @code{open} system call.
813
814 Most systems have limits on how many files can be open, so it's
815 strongly recommended that file ports be closed explicitly when no
816 longer required (@pxref{Ports}).
817
818 @deffn {Scheme Procedure} open-file filename mode
819 @deffnx {C Function} scm_open_file (filename, mode)
820 Open the file whose name is @var{filename}, and return a port
821 representing that file. The attributes of the port are
822 determined by the @var{mode} string. The way in which this is
823 interpreted is similar to C stdio. The first character must be
824 one of the following:
825
826 @table @samp
827 @item r
828 Open an existing file for input.
829 @item w
830 Open a file for output, creating it if it doesn't already exist
831 or removing its contents if it does.
832 @item a
833 Open a file for output, creating it if it doesn't already
834 exist. All writes to the port will go to the end of the file.
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
837 @end table
838
839 The following additional characters can be appended:
840
841 @table @samp
842 @item +
843 Open the port for both input and output. E.g., @code{r+}: open
844 an existing file for both input and output.
845 @item 0
Create an ``unbuffered'' port.  In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering.  This is likely to
slow down I/O operations.  The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
852 @item l
853 Add line-buffering to the port. The port output buffer will be
854 automatically flushed whenever a newline character is written.
855 @item b
856 Use binary mode, ensuring that each byte in the file will be read as one
857 Scheme character.
858
859 To provide this property, the file will be opened with the 8-bit
860 character encoding "ISO-8859-1", ignoring any coding declaration or port
861 encoding. @xref{Ports}, for more information on port encodings.
862
863 Note that while it is possible to read and write binary data as
864 characters or strings, it is usually better to treat bytes as octets,
865 and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
866 @ref{R6RS Binary Output}, for more.
867
868 This option had another historical meaning, for DOS compatibility: in
869 the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
870 The @code{b} flag prevents this from happening, adding @code{O_BINARY}
871 to the underlying @code{open} call. Still, the flag is generally useful
872 because of its port encoding ramifications.
873 @end table
874
875 If a file cannot be opened with the access
876 requested, @code{open-file} throws an exception.
877
878 When the file is opened, this procedure will scan for a coding
879 declaration (@pxref{Character Encoding of Source Files}). If a coding
880 declaration is found, it will be used to interpret the file. Otherwise,
881 the port's encoding will be used. To suppress this behavior, open the
882 file in binary mode and then set the port encoding explicitly using
883 @code{set-port-encoding!}.
884
885 In theory we could create read/write ports which were buffered
886 in one direction only. However this isn't included in the
887 current interfaces.
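
The mode characters described above can be sketched as follows.  The
file name is a hypothetical scratch path, and @code{read-line} is
assumed to come from the @code{(ice-9 rdelim)} module.

@lisp
(use-modules (ice-9 rdelim))            ; for read-line

;; Hypothetical scratch file, chosen only for this illustration.
(define filename "/tmp/guile-open-file-example.txt")

;; "w" creates the file, or empties it if it already exists.
(let ((port (open-file filename "w")))
  (display "first line\n" port)
  (close-port port))

;; "a" sends every write to the end of the file.
(let ((port (open-file filename "a")))
  (display "second line\n" port)
  (close-port port))

;; "r" opens the existing file for input.
(define lines
  (let ((port (open-file filename "r")))
    (let loop ((line (read-line port)) (acc '()))
      (if (eof-object? line)
          (begin
            (close-port port)
            (reverse acc))
          (loop (read-line port) (cons line acc))))))
;; lines @result{} ("first line" "second line")
@end lisp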
888 @end deffn
889
890 @rnindex open-input-file
891 @deffn {Scheme Procedure} open-input-file filename
892 Open @var{filename} for input. Equivalent to
893 @lisp
894 (open-file @var{filename} "r")
895 @end lisp
896 @end deffn
897
898 @rnindex open-output-file
899 @deffn {Scheme Procedure} open-output-file filename
900 Open @var{filename} for output. Equivalent to
901 @lisp
902 (open-file @var{filename} "w")
903 @end lisp
904 @end deffn
905
906 @deffn {Scheme Procedure} call-with-input-file filename proc
907 @deffnx {Scheme Procedure} call-with-output-file filename proc
908 @rnindex call-with-input-file
909 @rnindex call-with-output-file
910 Open @var{filename} for input or output, and call @code{(@var{proc}
911 port)} with the resulting port. Return the value returned by
912 @var{proc}. @var{filename} is opened as per @code{open-input-file} or
913 @code{open-output-file} respectively, and an error is signaled if it
914 cannot be opened.
915
916 When @var{proc} returns, the port is closed. If @var{proc} does not
917 return (e.g.@: if it throws an error), then the port might not be
918 closed automatically, though it will be garbage collected in the usual
919 way if not otherwise referenced.
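
For instance, a datum can be written to a file and read back without
closing either port explicitly.  This is only a sketch, and the file
name is a hypothetical scratch path.

@lisp
;; Hypothetical scratch file, chosen only for this illustration.
(define filename "/tmp/guile-call-with-file-example.scm")

(call-with-output-file filename
  (lambda (port)
    (write '(1 2 3) port)
    (newline port)))

;; `read' is called with the input port as its only argument.
(define datum (call-with-input-file filename read))
;; datum @result{} (1 2 3)
@end lisp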
920 @end deffn
921
922 @deffn {Scheme Procedure} with-input-from-file filename thunk
923 @deffnx {Scheme Procedure} with-output-to-file filename thunk
924 @deffnx {Scheme Procedure} with-error-to-file filename thunk
925 @rnindex with-input-from-file
926 @rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as, respectively, the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}.  Return the
value returned by @var{thunk}.  @var{filename} is opened as per
@code{open-input-file} or @code{open-output-file} as appropriate, and an
error is signaled if it cannot be opened.
933
934 When @var{thunk} returns, the port is closed and the previous setting
935 of the respective current port is restored.
936
The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: an
exception), and if @var{thunk} is re-entered (via a captured
continuation) then it's set again to the @var{filename} port.
941
942 The port is closed when @var{thunk} returns normally, but not when
943 exited via an exception or new continuation. This ensures it's still
944 ready for use if @var{thunk} is re-entered by a captured continuation.
945 Of course the port is always garbage collected and closed in the usual
946 way when no longer referenced anywhere.
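
A short sketch of the redirection, assuming a hypothetical scratch file
name and @code{read-line} from the @code{(ice-9 rdelim)} module:

@lisp
(use-modules (ice-9 rdelim))            ; for read-line

;; Hypothetical scratch file, chosen only for this illustration.
(define filename "/tmp/guile-with-output-example.txt")

;; Inside the thunk, display and newline use the file port
;; because it is the current output port.
(with-output-to-file filename
  (lambda ()
    (display "hello")
    (newline)))

;; Inside this thunk, read-line reads from the file.
(define line
  (with-input-from-file filename
    (lambda ()
      (read-line))))
;; line @result{} "hello"
@end lisp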
947 @end deffn
948
949 @deffn {Scheme Procedure} port-mode port
950 @deffnx {C Function} scm_port_mode (port)
951 Return the port modes associated with the open port @var{port}.
952 These will not necessarily be identical to the modes used when
953 the port was opened, since modes such as "append" which are
954 used only during port creation are not retained.
955 @end deffn
956
957 @deffn {Scheme Procedure} port-filename port
958 @deffnx {C Function} scm_port_filename (port)
959 Return the filename associated with @var{port}, or @code{#f} if no
960 filename is associated with the port.
961
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
964 @end deffn
965
966 @deffn {Scheme Procedure} set-port-filename! port filename
967 @deffnx {C Function} scm_set_port_filename_x (port, filename)
Change the filename associated with @var{port}.  Note that this does
not change the port's source of data, but only the value that is
returned by @code{port-filename} and reported in diagnostic output.
972 @end deffn
973
974 @deffn {Scheme Procedure} file-port? obj
975 @deffnx {C Function} scm_file_port_p (obj)
976 Determine whether @var{obj} is a port that is related to a file.
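
The port inspection procedures above might be used as in the following
sketch.  The file name and the replacement label are hypothetical.

@lisp
;; Hypothetical scratch file, chosen only for this illustration.
(define filename "/tmp/guile-port-info-example.txt")
(define port (open-output-file filename))

(define opened-name (port-filename port))   ; name given when opening
(define on-file? (file-port? port))         ; #t for a file port
(define on-string?                          ; #f for a string port
  (file-port? (open-input-string "text")))

;; Only the reported name changes; data still goes to the same file.
(set-port-filename! port "renamed-for-diagnostics")
(define new-name (port-filename port))

(close-port port)
@end lisp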
977 @end deffn
978
979
980 @node String Ports
981 @subsubsection String Ports
982 @cindex String port
983 @cindex Port, string
984
The following procedures allow string ports to be opened, by analogy
with the R4RS file port facilities.

With string ports, the port encoding is treated differently than for
other types of ports.  When string ports are created, they do not
inherit a character encoding from the current locale.  They are given a
default locale that allows them to handle all valid string characters.
Typically one should not modify a string port's character encoding
away from its default.
994
995 @deffn {Scheme Procedure} call-with-output-string proc
996 @deffnx {C Function} scm_call_with_output_string (proc)
997 Calls the one-argument procedure @var{proc} with a newly created output
998 port. When the function returns, the string composed of the characters
999 written into the port is returned. @var{proc} should not close the port.
1000
Note that which characters can be written to a string port depends on
the port's encoding.  The default encoding of string ports is specified
by the @code{%default-port-encoding} fluid (@pxref{Ports,
@code{%default-port-encoding}}).  For instance, it is an error to write
the Greek letter alpha to an ISO-8859-1-encoded string port, since this
character cannot be represented in ISO-8859-1:
1007
1008 @example
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

(with-fluids ((%default-port-encoding "ISO-8859-1"))
  (call-with-output-string
    (lambda (p)
      (display alpha p))))
1015
1016 @result{}
1017 Throw to key `encoding-error'
1018 @end example
1019
1020 Changing the string port's encoding to a Unicode-capable encoding such as UTF-8
1021 solves the problem.
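
A sketch of the UTF-8 variant of the example above, with @code{result}
as an illustrative name:

@lisp
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA

(define result
  (with-fluids ((%default-port-encoding "UTF-8"))
    (call-with-output-string
      (lambda (p)
        (display alpha p)))))

;; result is a one-character string holding alpha.
(string-length result)
@result{} 1
@end lisp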
1022 @end deffn
1023
1024 @deffn {Scheme Procedure} call-with-input-string string proc
1025 @deffnx {C Function} scm_call_with_input_string (string, proc)
1026 Calls the one-argument procedure @var{proc} with a newly
1027 created input port from which @var{string}'s contents may be
1028 read. The value yielded by the @var{proc} is returned.
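
For example, the following sketch reads two data from a string.  The
names @code{words}, @code{a} and @code{b} are illustrative.

@lisp
(define words
  (call-with-input-string "alpha beta"
    (lambda (port)
      (let* ((a (read port))      ; first datum
             (b (read port)))     ; second datum
        (list a b)))))
;; words @result{} (alpha beta)
@end lisp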
1029 @end deffn
1030
1031 @deffn {Scheme Procedure} with-output-to-string thunk
1032 Calls the zero-argument procedure @var{thunk} with the current output
1033 port set temporarily to a new string port. It returns a string
1034 composed of the characters written to the current output.
1035
1036 See @code{call-with-output-string} above for character encoding considerations.
1037 @end deffn
1038
1039 @deffn {Scheme Procedure} with-input-from-string string thunk
1040 Calls the zero-argument procedure @var{thunk} with the current input
1041 port set temporarily to a string port opened on the specified
1042 @var{string}. The value yielded by @var{thunk} is returned.
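
The two redirecting procedures can be combined, as in this sketch, which
reads a datum from one string and writes it into another.  The name
@code{echoed} is illustrative.

@lisp
(define echoed
  (with-output-to-string
    (lambda ()
      (with-input-from-string "(+ 1 2)"
        (lambda ()
          ;; read uses the current input port,
          ;; write uses the current output port.
          (write (read)))))))
;; echoed @result{} "(+ 1 2)"
@end lisp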
1043 @end deffn
1044
1045 @deffn {Scheme Procedure} open-input-string str
1046 @deffnx {C Function} scm_open_input_string (str)
1047 Take a string and return an input port that delivers characters
1048 from the string. The port can be closed by
1049 @code{close-input-port}, though its storage will be reclaimed
1050 by the garbage collector if it becomes inaccessible.
1051 @end deffn
1052
1053 @deffn {Scheme Procedure} open-output-string
1054 @deffnx {C Function} scm_open_output_string ()
1055 Return an output port that will accumulate characters for
1056 retrieval by @code{get-output-string}. The port can be closed
1057 by the procedure @code{close-output-port}, though its storage
1058 will be reclaimed by the garbage collector if it becomes
1059 inaccessible.
1060 @end deffn
1061
1062 @deffn {Scheme Procedure} get-output-string port
1063 @deffnx {C Function} scm_get_output_string (port)
1064 Given an output port created by @code{open-output-string},
1065 return a string consisting of the characters that have been
1066 output to the port so far.
1067
@code{get-output-string} must be used before closing @var{port}; once
the port is closed, the string cannot be obtained.
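
As a sketch, the accumulated text can be retrieved more than once while
the port is still open.  The names @code{out}, @code{so-far} and
@code{all} are illustrative.

@lisp
(define out (open-output-string))

(write 'symbol out)
(define so-far (get-output-string out))
;; so-far @result{} "symbol"

(display " and more" out)
(define all (get-output-string out))
;; all @result{} "symbol and more"

(close-output-port out)   ; only after the text has been retrieved
@end lisp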
1070 @end deffn
1071
1072 A string port can be used in many procedures which accept a port
1073 but which are not dependent on implementation details of fports.
1074 E.g., seeking and truncating will work on a string port,
1075 but trying to extract the file descriptor number will fail.
1076
1077
1078 @node Soft Ports
1079 @subsubsection Soft Ports
1080 @cindex Soft port
1081 @cindex Port, soft
1082
A @dfn{soft port} is a port based on a vector of procedures capable of
accepting or delivering characters.  It allows emulation of I/O ports.
1085
1086 @deffn {Scheme Procedure} make-soft-port pv modes
1087 @deffnx {C Function} scm_make_soft_port (pv, modes)
1088 Return a port capable of receiving or delivering characters as
1089 specified by the @var{modes} string (@pxref{File Ports,
1090 open-file}). @var{pv} must be a vector of length 5 or 6. Its
1091 components are as follows:
1092
1093 @enumerate 0
1094 @item
1095 procedure accepting one character for output
1096 @item
1097 procedure accepting a string for output
1098 @item
1099 thunk for flushing output
1100 @item
1101 thunk for getting one character
1102 @item
1103 thunk for closing port (not by garbage collection)
1104 @item
1105 (if present and not @code{#f}) thunk for computing the number of
1106 characters that can be read from the port without blocking.
1107 @end enumerate
1108
1109 For an output-only port only elements 0, 1, 2, and 4 need be
1110 procedures. For an input-only port only elements 3 and 4 need
1111 be procedures. Thunks 2 and 4 can instead be @code{#f} if
1112 there is no useful operation for them to perform.
1113
1114 If thunk 3 returns @code{#f} or an @code{eof-object}
1115 (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
1116 Scheme}) it indicates that the port has reached end-of-file.
1117 For example:
1118
1119 @lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))
1129
1130 (write p p) @result{} #<input-output: soft 8081e20>
1131 @end lisp
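
As a further sketch, an output-only soft port can collect everything
written to it.  The names @code{chunks}, @code{collector} and
@code{collected} are hypothetical.  The call to @code{force-output} is
an assumption made here, in case the port implementation buffers output
before calling the procedures.

@lisp
(define chunks '())

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! chunks (cons (string c) chunks))) ; one character
    (lambda (s) (set! chunks (cons s chunks)))          ; a string
    #f      ; nothing useful to do on flush
    #f      ; no input procedure: the port is output-only
    #f)     ; nothing useful to do on close
   "w"))

(display "hello" collector)
(force-output collector)

;; Join the pieces in the order they were written.
(define collected (apply string-append (reverse chunks)))
@end lisp

Here @code{collected} is expected to be the string @code{"hello"},
however the output was split between the two procedures.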
1132 @end deffn
1133
1134
1135 @node Void Ports
1136 @subsubsection Void Ports
1137 @cindex Void port
1138 @cindex Port, void
1139
1140 This kind of port causes any data to be discarded when written to, and
1141 always returns the end-of-file object when read from.
1142
1143 @deffn {Scheme Procedure} %make-void-port mode
1144 @deffnx {C Function} scm_sys_make_void_port (mode)
1145 Create and return a new void port. A void port acts like
1146 @file{/dev/null}. The @var{mode} argument
1147 specifies the input/output modes for this port: see the
1148 documentation for @code{open-file} in @ref{File Ports}.
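
A minimal sketch of the behavior described above, with @code{void} and
@code{at-eof?} as illustrative names:

@lisp
(define void (%make-void-port "rw"))

(display "discarded" void)        ; the data is thrown away

;; Reading always yields the end-of-file object.
(define at-eof? (eof-object? (read-char void)))
;; at-eof? @result{} #t

(close-port void)
@end lisp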
1149 @end deffn
1150
1151
1152 @node R6RS I/O Ports
1153 @subsection R6RS I/O Ports
1154
1155 @cindex R6RS
1156 @cindex R6RS ports
1157
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
1160 io ports)} module. It provides features, such as binary I/O and Unicode
1161 string I/O, that complement or refine Guile's historical port API
1162 presented above (@pxref{Input and Output}). Note that R6RS ports are not
1163 disjoint from Guile's native ports, so Guile-specific procedures will
1164 work on ports created using the R6RS API, and vice versa.
1165
The text in this section is taken from the R6RS standard libraries
document, with only minor adaptations for inclusion in this manual.  The
1168 Guile developers offer their thanks to the R6RS editors for having
1169 provided the report's text under permissive conditions making this
1170 possible.
1171
1172 @c FIXME: Update description when implemented.
1173 @emph{Note}: The implementation of this R6RS API is not complete yet.
1174
1175 @menu
1176 * R6RS File Names:: File names.
1177 * R6RS File Options:: Options for opening files.
1178 * R6RS Buffer Modes:: Influencing buffering behavior.
1179 * R6RS Transcoders:: Influencing port encoding.
1180 * R6RS End-of-File:: The end-of-file object.
1181 * R6RS Port Manipulation:: Manipulating R6RS ports.
1182 * R6RS Input Ports:: Input Ports.
1183 * R6RS Binary Input:: Binary input.
1184 * R6RS Textual Input:: Textual input.
1185 * R6RS Output Ports:: Output Ports.
1186 * R6RS Binary Output:: Binary output.
1187 * R6RS Textual Output:: Textual output.
1188 @end menu
1189
1190 A subset of the @code{(rnrs io ports)} module is provided by the
1191 @code{(ice-9 binary-ports)} module. It contains binary input/output
1192 procedures and does not rely on R6RS support.
1193
1194 @node R6RS File Names
1195 @subsubsection File Names
1196
1197 Some of the procedures described in this chapter accept a file name as an
1198 argument. Valid values for such a file name include strings that name a file
1199 using the native notation of file system paths on an implementation's
1200 underlying operating system, and may include implementation-dependent
1201 values as well.
1202
1203 A @var{filename} parameter name means that the
1204 corresponding argument must be a file name.
1205
1206 @node R6RS File Options
1207 @subsubsection File Options
1208 @cindex file options
1209
1210 When opening a file, the various procedures in this library accept a
1211 @code{file-options} object that encapsulates flags to specify how the
1212 file is to be opened. A @code{file-options} object is an enum-set
1213 (@pxref{rnrs enums}) over the symbols constituting valid file options.
1214
1215 A @var{file-options} parameter name means that the corresponding
1216 argument must be a file-options object.
1217
1218 @deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
1219
1220 Each @var{file-options-symbol} must be a symbol.
1221
1222 The @code{file-options} syntax returns a file-options object that
1223 encapsulates the specified options.
1224
1225 When supplied to an operation that opens a file for output, the
1226 file-options object returned by @code{(file-options)} specifies that the
1227 file is created if it does not exist and an exception with condition
1228 type @code{&i/o-file-already-exists} is raised if it does exist. The
1229 following standard options can be included to modify the default
1230 behavior.
1231
1232 @table @code
1233 @item no-create
1234 If the file does not already exist, it is not created;
1235 instead, an exception with condition type @code{&i/o-file-does-not-exist}
1236 is raised.
1237 If the file already exists, the exception with condition type
1238 @code{&i/o-file-already-exists} is not raised
1239 and the file is truncated to zero length.
1240 @item no-fail
1241 If the file already exists, the exception with condition type
1242 @code{&i/o-file-already-exists} is not raised,
1243 even if @code{no-create} is not included,
1244 and the file is truncated to zero length.
1245 @item no-truncate
1246 If the file already exists and the exception with condition type
1247 @code{&i/o-file-already-exists} has been inhibited by inclusion of
1248 @code{no-create} or @code{no-fail}, the file is not truncated, but
1249 the port's current position is still set to the beginning of the
1250 file.
1251 @end table
1252
1253 These options have no effect when a file is opened only for input.
1254 Symbols other than those listed above may be used as
1255 @var{file-options-symbol}s; they have implementation-specific meaning,
1256 if any.
1257
1258 @quotation Note
1259 Only the name of @var{file-options-symbol} is significant.
1260 @end quotation
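
The following sketch builds a file-options object and queries it.  It
assumes that the object can be inspected with @code{enum-set-member?}
from the @code{(rnrs enums)} module, since file-options objects are
described as enum-sets.  The name @code{opts} is illustrative.

@lisp
(use-modules (rnrs io ports) (rnrs enums))

;; Do not fail if the file exists, and keep its contents.
(define opts (file-options no-fail no-truncate))

(define has-no-fail?   (enum-set-member? 'no-fail opts))
(define has-no-create? (enum-set-member? 'no-create opts))
@end lisp

Under that assumption, @code{has-no-fail?} should be true and
@code{has-no-create?} false.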
1261 @end deffn
1262
1263 @node R6RS Buffer Modes
1264 @subsubsection Buffer Modes
1265
1266 Each port has an associated buffer mode. For an output port, the
1267 buffer mode defines when an output operation flushes the buffer
1268 associated with the output port. For an input port, the buffer mode
1269 defines how much data will be read to satisfy read operations. The
1270 possible buffer modes are the symbols @code{none} for no buffering,
1271 @code{line} for flushing upon line endings and reading up to line
1272 endings, or other implementation-dependent behavior,
1273 and @code{block} for arbitrary buffering. This section uses
1274 the parameter name @var{buffer-mode} for arguments that must be
1275 buffer-mode symbols.
1276
1277 If two ports are connected to the same mutable source, both ports
1278 are unbuffered, and reading a byte or character from that shared
1279 source via one of the two ports would change the bytes or characters
1280 seen via the other port, a lookahead operation on one port will
1281 render the peeked byte or character inaccessible via the other port,
1282 while a subsequent read operation on the peeked port will see the
1283 peeked byte or character even though the port is otherwise unbuffered.
1284
1285 In other words, the semantics of buffering is defined in terms of side
1286 effects on shared mutable sources, and a lookahead operation has the
1287 same side effect on the shared source as a read operation.
1288
1289 @deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
1290
1291 @var{buffer-mode-symbol} must be a symbol whose name is one of
1292 @code{none}, @code{line}, and @code{block}. The result is the
1293 corresponding symbol, and specifies the associated buffer mode.
1294
1295 @quotation Note
1296 Only the name of @var{buffer-mode-symbol} is significant.
1297 @end quotation
1298 @end deffn
1299
1300 @deffn {Scheme Procedure} buffer-mode? obj
1301 Returns @code{#t} if the argument is a valid buffer-mode symbol, and
1302 returns @code{#f} otherwise.
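
A brief sketch of both forms, with @code{mode} as an illustrative name:

@lisp
(use-modules (rnrs io ports))

(define mode (buffer-mode line))   ; evaluates to the symbol line

(define mode-is-line? (eq? mode 'line))
(define valid?   (buffer-mode? mode))
(define invalid? (buffer-mode? 'bogus))
@end lisp

Here @code{mode-is-line?} and @code{valid?} should be @code{#t}, and
@code{invalid?} should be @code{#f}.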
1303 @end deffn
1304
1305 @node R6RS Transcoders
1306 @subsubsection Transcoders
1307 @cindex codec
1308 @cindex end-of-line style
1309 @cindex transcoder
1310 @cindex binary port
1311 @cindex textual port
1312
1313 Several different Unicode encoding schemes describe standard ways to
1314 encode characters and strings as byte sequences and to decode those
1315 sequences. Within this document, a @dfn{codec} is an immutable Scheme
1316 object that represents a Unicode or similar encoding scheme.
1317
1318 An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
1319 describes how a textual port transcodes representations of line endings.
1320
1321 A @dfn{transcoder} is an immutable Scheme object that combines a codec
1322 with an end-of-line style and a method for handling decoding errors.
1323 Each transcoder represents some specific bidirectional (but not
1324 necessarily lossless), possibly stateful translation between byte
1325 sequences and Unicode characters and strings. Every transcoder can
1326 operate in the input direction (bytes to characters) or in the output
1327 direction (characters to bytes). A @var{transcoder} parameter name
1328 means that the corresponding argument must be a transcoder.
1329
1330 A @dfn{binary port} is a port that supports binary I/O, does not have an
1331 associated transcoder and does not support textual I/O. A @dfn{textual
1332 port} is a port that supports textual I/O, and does not support binary
1333 I/O. A textual port may or may not have an associated transcoder.
1334
1335 @deffn {Scheme Procedure} latin-1-codec
1336 @deffnx {Scheme Procedure} utf-8-codec
1337 @deffnx {Scheme Procedure} utf-16-codec
1338
1339 These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
1340 encoding schemes.
1341
1342 A call to any of these procedures returns a value that is equal in the
1343 sense of @code{eqv?} to the result of any other call to the same
1344 procedure.
1345 @end deffn
1346
1347 @deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
1348
1349 @var{eol-style-symbol} should be a symbol whose name is one of
1350 @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
1351 and @code{none}.
1352
1353 The form evaluates to the corresponding symbol. If the name of
1354 @var{eol-style-symbol} is not one of these symbols, the effect and
1355 result are implementation-dependent; in particular, the result may be an
1356 eol-style symbol acceptable as an @var{eol-style} argument to
1357 @code{make-transcoder}. Otherwise, an exception is raised.
1358
1359 All eol-style symbols except @code{none} describe a specific
1360 line-ending encoding:
1361
1362 @table @code
1363 @item lf
1364 linefeed
1365 @item cr
1366 carriage return
1367 @item crlf
1368 carriage return, linefeed
1369 @item nel
1370 next line
1371 @item crnel
1372 carriage return, next line
1373 @item ls
1374 line separator
1375 @end table
1376
1377 For a textual port with a transcoder, and whose transcoder has an
1378 eol-style symbol @code{none}, no conversion occurs. For a textual input
1379 port, any eol-style symbol other than @code{none} means that all of the
1380 above line-ending encodings are recognized and are translated into a
1381 single linefeed. For a textual output port, @code{none} and @code{lf}
1382 are equivalent. Linefeed characters are encoded according to the
1383 specified eol-style symbol, and all other characters that participate in
1384 possible line endings are encoded as is.
1385
1386 @quotation Note
1387 Only the name of @var{eol-style-symbol} is significant.
1388 @end quotation
1389 @end deffn
1390
1391 @deffn {Scheme Procedure} native-eol-style
1392 Returns the default end-of-line style of the underlying platform, e.g.,
1393 @code{lf} on Unix and @code{crlf} on Windows.
1394 @end deffn
1395
1396 @deffn {Condition Type} &i/o-decoding
1397 @deffnx {Scheme Procedure} make-i/o-decoding-error port
1398 @deffnx {Scheme Procedure} i/o-decoding-error? obj
1399
1400 This condition type could be defined by
1401
1402 @lisp
1403 (define-condition-type &i/o-decoding &i/o-port
1404 make-i/o-decoding-error i/o-decoding-error?)
1405 @end lisp
1406
1407 An exception with this type is raised when one of the operations for
1408 textual input from a port encounters a sequence of bytes that cannot be
1409 translated into a character or string by the input direction of the
1410 port's transcoder.
1411
1412 When such an exception is raised, the port's position is past the
1413 invalid encoding.
1414 @end deffn
1415
1416 @deffn {Condition Type} &i/o-encoding
1417 @deffnx {Scheme Procedure} make-i/o-encoding-error port char
1418 @deffnx {Scheme Procedure} i/o-encoding-error? obj
1419 @deffnx {Scheme Procedure} i/o-encoding-error-char condition
1420
1421 This condition type could be defined by
1422
1423 @lisp
1424 (define-condition-type &i/o-encoding &i/o-port
1425 make-i/o-encoding-error i/o-encoding-error?
1426 (char i/o-encoding-error-char))
1427 @end lisp
1428
1429 An exception with this type is raised when one of the operations for
1430 textual output to a port encounters a character that cannot be
1431 translated into bytes by the output direction of the port's transcoder.
1432 @var{char} is the character that could not be encoded.
1433 @end deffn
1434
1435 @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
1436
1437 @var{error-handling-mode-symbol} should be a symbol whose name is one of
1438 @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
1439 the corresponding symbol. If @var{error-handling-mode-symbol} is not
1440 one of these identifiers, effect and result are
1441 implementation-dependent: The result may be an error-handling-mode
1442 symbol acceptable as a @var{handling-mode} argument to
1443 @code{make-transcoder}. If it is not acceptable as a
1444 @var{handling-mode} argument to @code{make-transcoder}, an exception is
1445 raised.
1446
1447 @quotation Note
1448 Only the name of @var{error-handling-mode-symbol} is significant.
1449 @end quotation
1450
1451 The error-handling mode of a transcoder specifies the behavior
1452 of textual I/O operations in the presence of encoding or decoding
1453 errors.
1454
1455 If a textual input operation encounters an invalid or incomplete
1456 character encoding, and the error-handling mode is @code{ignore}, an
1457 appropriate number of bytes of the invalid encoding are ignored and
1458 decoding continues with the following bytes.
1459
1460 If the error-handling mode is @code{replace}, the replacement
1461 character U+FFFD is injected into the data stream, an appropriate
1462 number of bytes are ignored, and decoding
1463 continues with the following bytes.
1464
1465 If the error-handling mode is @code{raise}, an exception with condition
1466 type @code{&i/o-decoding} is raised.
1467
1468 If a textual output operation encounters a character it cannot encode,
1469 and the error-handling mode is @code{ignore}, the character is ignored
1470 and encoding continues with the next character. If the error-handling
1471 mode is @code{replace}, a codec-specific replacement character is
1472 emitted by the transcoder, and encoding continues with the next
1473 character. The replacement character is U+FFFD for transcoders whose
1474 codec is one of the Unicode encodings, but is the @code{?} character
1475 for the Latin-1 encoding. If the error-handling mode is @code{raise},
1476 an exception with condition type @code{&i/o-encoding} is raised.
1477 @end deffn
1478
1479 @deffn {Scheme Procedure} make-transcoder codec
1480 @deffnx {Scheme Procedure} make-transcoder codec eol-style
1481 @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
1482
1483 @var{codec} must be a codec; @var{eol-style}, if present, an eol-style
1484 symbol; and @var{handling-mode}, if present, an error-handling-mode
1485 symbol.
1486
1487 @var{eol-style} may be omitted, in which case it defaults to the native
1488 end-of-line style of the underlying platform. @var{handling-mode} may
1489 be omitted, in which case it defaults to @code{replace}. The result is
1490 a transcoder with the behavior specified by its arguments.
1491 @end deffn
1492
@deffn {Scheme Procedure} native-transcoder
1494 Returns an implementation-dependent transcoder that represents a
1495 possibly locale-dependent ``native'' transcoding.
1496 @end deffn
1497
1498 @deffn {Scheme Procedure} transcoder-codec transcoder
1499 @deffnx {Scheme Procedure} transcoder-eol-style transcoder
1500 @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
1501
1502 These are accessors for transcoder objects; when applied to a
1503 transcoder returned by @code{make-transcoder}, they return the
1504 @var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
1505 respectively.
1506 @end deffn
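
As an illustration, the sketch below builds a transcoder and reads two
of its fields back. It assumes the @code{utf-8-codec} procedure and the
@code{eol-style} and @code{error-handling-mode} syntaxes exported by
@code{(rnrs io ports)} (@pxref{R6RS Transcoders}).

@lisp
(use-modules (rnrs io ports))

;; A UTF-8 transcoder with explicit end-of-line and error handling.
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style tc)
@result{} lf

(transcoder-error-handling-mode tc)
@result{} raise
@end lisp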
1507
1508 @deffn {Scheme Procedure} bytevector->string bytevector transcoder
1509
1510 Returns the string that results from transcoding the
1511 @var{bytevector} according to the input direction of the transcoder.
1512 @end deffn
1513
1514 @deffn {Scheme Procedure} string->bytevector string transcoder
1515
1516 Returns the bytevector that results from transcoding the
1517 @var{string} according to the output direction of the transcoder.
1518 @end deffn
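
For example, the following sketch round-trips an ASCII string through a
UTF-8 transcoder. The @code{utf-8-codec} procedure is assumed to be
available from @code{(rnrs io ports)}.

@lisp
(use-modules (rnrs io ports))

(define tc (make-transcoder (utf-8-codec)))

;; Output direction: characters to bytes.
(string->bytevector "hello" tc)
@result{} #vu8(104 101 108 108 111)

;; Input direction: bytes back to characters.
(bytevector->string #vu8(104 101 108 108 111) tc)
@result{} "hello"
@end lisp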
1519
1520 @node R6RS End-of-File
1521 @subsubsection The End-of-File Object
1522
1523 @cindex EOF
1524 @cindex end-of-file
1525
1526 R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
1527 ports)} module:
1528
1529 @deffn {Scheme Procedure} eof-object? obj
1530 @deffnx {C Function} scm_eof_object_p (obj)
1531 Return true if @var{obj} is the end-of-file (EOF) object.
1532 @end deffn
1533
1534 In addition, the following procedure is provided:
1535
1536 @deffn {Scheme Procedure} eof-object
1537 @deffnx {C Function} scm_eof_object ()
1538 Return the end-of-file (EOF) object.
1539
1540 @lisp
(eof-object? (eof-object))
@result{} #t
1543 @end lisp
1544 @end deffn
1545
1546
1547 @node R6RS Port Manipulation
1548 @subsubsection Port Manipulation
1549
1550 The procedures listed below operate on any kind of R6RS I/O port.
1551
1552 @deffn {Scheme Procedure} port? obj
1553 Returns @code{#t} if the argument is a port, and returns @code{#f}
1554 otherwise.
1555 @end deffn
1556
1557 @deffn {Scheme Procedure} port-transcoder port
1558 Returns the transcoder associated with @var{port} if @var{port} is
1559 textual and has an associated transcoder, and returns @code{#f} if
1560 @var{port} is binary or does not have an associated transcoder.
1561 @end deffn
1562
1563 @deffn {Scheme Procedure} binary-port? port
1564 Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
1565 binary data input/output.
1566
1567 Note that internally Guile does not differentiate between binary and
1568 textual ports, unlike the R6RS. Thus, this procedure returns true when
1569 @var{port} does not have an associated encoding---i.e., when
1570 @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
1571 port-encoding}). This is the case for ports returned by R6RS procedures
1572 such as @code{open-bytevector-input-port} and
1573 @code{make-custom-binary-output-port}.
1574
1575 However, Guile currently does not prevent use of textual I/O procedures
1576 such as @code{display} or @code{read-char} with binary ports. Doing so
1577 ``upgrades'' the port from binary to textual, under the ISO-8859-1
1578 encoding. Likewise, Guile does not prevent use of
1579 @code{set-port-encoding!} on a binary port, which also turns it into a
1580 ``textual'' port.
1581 @end deffn
1582
1583 @deffn {Scheme Procedure} textual-port? port
1584 Always return @code{#t}, as all ports can be used for textual I/O in
1585 Guile.
1586 @end deffn
1587
1588 @deffn {Scheme Procedure} transcoded-port binary-port transcoder
1589 The @code{transcoded-port} procedure
1590 returns a new textual port with the specified @var{transcoder}.
1591 Otherwise the new textual port's state is largely the same as
1592 that of @var{binary-port}.
1593 If @var{binary-port} is an input port, the new textual
1594 port will be an input port and
1595 will transcode the bytes that have not yet been read from
1596 @var{binary-port}.
1597 If @var{binary-port} is an output port, the new textual
1598 port will be an output port and
1599 will transcode output characters into bytes that are
1600 written to the byte sink represented by @var{binary-port}.
1601
1602 As a side effect, however, @code{transcoded-port}
1603 closes @var{binary-port} in
1604 a special way that allows the new textual port to continue to
1605 use the byte source or sink represented by @var{binary-port},
1606 even though @var{binary-port} itself is closed and cannot
1607 be used by the input and output operations described in this
1608 chapter.
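
A minimal sketch of wrapping a bytevector port, assuming the
@code{utf-8-codec} procedure from @code{(rnrs io ports)}. Once
@code{transcoded-port} has returned, only the textual port should be
used:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define binary
  (open-bytevector-input-port (string->utf8 "hello")))

;; BINARY is closed as a side effect; read from TEXTUAL instead.
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

(get-string-all textual)
@result{} "hello"
@end lisp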
1609 @end deffn
1610
1611 @deffn {Scheme Procedure} port-position port
1612 If @var{port} supports it (see below), return the offset (an integer)
1613 indicating where the next octet will be read from/written to in
1614 @var{port}. If @var{port} does not support this operation, an error
1615 condition is raised.
1616
1617 This is similar to Guile's @code{seek} procedure with the
1618 @code{SEEK_CUR} argument (@pxref{Random Access}).
1619 @end deffn
1620
1621 @deffn {Scheme Procedure} port-has-port-position? port
1622 Return @code{#t} if @var{port} supports @code{port-position}.
1623 @end deffn
1624
1625 @deffn {Scheme Procedure} set-port-position! port offset
1626 If @var{port} supports it (see below), set the position where the next
1627 octet will be read from/written to @var{port} to @var{offset} (an
1628 integer). If @var{port} does not support this operation, an error
1629 condition is raised.
1630
1631 This is similar to Guile's @code{seek} procedure with the
1632 @code{SEEK_SET} argument (@pxref{Random Access}).
1633 @end deffn
1634
1635 @deffn {Scheme Procedure} port-has-set-port-position!? port
1636 Return @code{#t} if @var{port} supports @code{set-port-position!}.
1637 @end deffn
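
The positioning procedures can be tried on a bytevector input port.
This is only an illustrative sketch, and it assumes that such ports
support repositioning:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(10 20 30 40)))

(get-u8 port)
@result{} 10

;; One octet has been consumed so far.
(port-position port)
@result{} 1

;; Jump to the last octet and read it.
(set-port-position! port 3)
(get-u8 port)
@result{} 40
@end lisp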
1638
1639 @deffn {Scheme Procedure} call-with-port port proc
1640 Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
1641 of @var{proc}. Return the return values of @var{proc}.
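
For instance, this sketch reads an entire bytevector port and lets
@code{call-with-port} close it afterwards:

@lisp
(use-modules (rnrs io ports))

(call-with-port (open-bytevector-input-port #vu8(1 2 3))
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp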
1642 @end deffn
1643
1644 @node R6RS Input Ports
1645 @subsubsection Input Ports
1646
1647 @deffn {Scheme Procedure} input-port? obj
1648 Returns @code{#t} if the argument is an input port (or a combined input
1649 and output port), and returns @code{#f} otherwise.
1650 @end deffn
1651
1652 @deffn {Scheme Procedure} port-eof? input-port
1653 Returns @code{#t}
1654 if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
1655 or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
1656 would return
1657 the end-of-file object, and @code{#f} otherwise.
1658 The operation may block indefinitely if no data is available
1659 but the port cannot be determined to be at end of file.
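
A short sketch on a one-octet bytevector port, where no blocking can
occur:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(1)))

(port-eof? port)
@result{} #f

(get-u8 port)
@result{} 1

(port-eof? port)
@result{} #t
@end lisp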
1660 @end deffn
1661
1662 @deffn {Scheme Procedure} open-file-input-port filename
1663 @deffnx {Scheme Procedure} open-file-input-port filename file-options
1664 @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
1665 @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
1666 @var{maybe-transcoder} must be either a transcoder or @code{#f}.
1667
1668 The @code{open-file-input-port} procedure returns an
1669 input port for the named file. The @var{file-options} and
1670 @var{maybe-transcoder} arguments are optional.
1671
1672 The @var{file-options} argument, which may determine
1673 various aspects of the returned port (@pxref{R6RS File Options}),
1674 defaults to the value of @code{(file-options)}.
1675
1676 The @var{buffer-mode} argument, if supplied,
1677 must be one of the symbols that name a buffer mode.
1678 The @var{buffer-mode} argument defaults to @code{block}.
1679
1680 If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
1681 with the returned port.
1682
1683 If @var{maybe-transcoder} is @code{#f} or absent,
1684 the port will be a binary port and will support the
1685 @code{port-position} and @code{set-port-position!} operations.
1686 Otherwise the port will be a textual port, and whether it supports
1687 the @code{port-position} and @code{set-port-position!} operations
1688 is implementation-dependent (and possibly transcoder-dependent).
1689 @end deffn
1690
1691 @deffn {Scheme Procedure} standard-input-port
1692 Returns a fresh binary input port connected to standard input. Whether
1693 the port supports the @code{port-position} and @code{set-port-position!}
1694 operations is implementation-dependent.
1695 @end deffn
1696
1697 @deffn {Scheme Procedure} current-input-port
1698 This returns a default textual port for input. Normally, this default
1699 port is associated with standard input, but can be dynamically
1700 re-assigned using the @code{with-input-from-file} procedure from the
1701 @code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
1702 may not have an associated transcoder; if it does, the transcoder is
1703 implementation-dependent.
1704 @end deffn
1705
1706 @node R6RS Binary Input
1707 @subsubsection Binary Input
1708
1709 @cindex binary input
1710
1711 R6RS binary input ports can be created with the procedures described
1712 below.
1713
1714 @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
1715 @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
1716 Return an input port whose contents are drawn from bytevector @var{bv}
1717 (@pxref{Bytevectors}).
1718
1719 @c FIXME: Update description when implemented.
1720 The @var{transcoder} argument is currently not supported.
1721 @end deffn
1722
1723 @cindex custom binary input ports
1724
1725 @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
1726 @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
1727 Return a new custom binary input port@footnote{This is similar in spirit
1728 to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
1729 string) whose input is drained by invoking @var{read!} and passing it a
1730 bytevector, an index where bytes should be written, and the number of
1731 bytes to read. The @code{read!} procedure must return an integer
1732 indicating the number of bytes read, or @code{0} to indicate the
1733 end-of-file.
1734
1735 Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
1736 that will be called when @code{port-position} is invoked on the custom
1737 binary port and should return an integer indicating the position within
1738 the underlying data stream; if @var{get-position} is @code{#f}, the
1739 returned port does not support @code{port-position}.
1740
1741 Likewise, if @var{set-position!} is not @code{#f}, it should be a
1742 one-argument procedure. When @code{set-port-position!} is invoked on the
1743 custom binary input port, @var{set-position!} is passed an integer
1744 indicating the position of the next byte to read.
1745
1746 Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
1747 invoked when the custom binary input port is closed.
1748
1749 Using a custom binary input port, the @code{open-bytevector-input-port}
1750 procedure could be implemented as follows:
1751
1752 @lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy up to COUNT bytes into BV at START; return 0 at end-of-file.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; The last argument is the CLOSE thunk; #f means none is needed.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
@result{} hello
1775 @end lisp
1776 @end deffn
1777
1778 @cindex binary input
1779 Binary input is achieved using the procedures below:
1780
1781 @deffn {Scheme Procedure} get-u8 port
1782 @deffnx {C Function} scm_get_u8 (port)
1783 Return an octet read from @var{port}, a binary input port, blocking as
1784 necessary, or the end-of-file object.
1785 @end deffn
1786
1787 @deffn {Scheme Procedure} lookahead-u8 port
1788 @deffnx {C Function} scm_lookahead_u8 (port)
1789 Like @code{get-u8} but does not update @var{port}'s position to point
1790 past the octet.
1791 @end deffn
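
The difference between the two procedures can be seen in this small
sketch, which uses a bytevector port as the byte source:

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(7 8)))

;; Peek at the next octet without consuming it.
(lookahead-u8 port)
@result{} 7

(get-u8 port)
@result{} 7

(get-u8 port)
@result{} 8

(eof-object? (get-u8 port))
@result{} #t
@end lisp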
1792
1793 @deffn {Scheme Procedure} get-bytevector-n port count
1794 @deffnx {C Function} scm_get_bytevector_n (port, count)
1795 Read @var{count} octets from @var{port}, blocking as necessary and
1796 return a bytevector containing the octets read. If fewer bytes are
1797 available, a bytevector smaller than @var{count} is returned.
1798 @end deffn
1799
1800 @deffn {Scheme Procedure} get-bytevector-n! port bv start count
1801 @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
1802 Read @var{count} bytes from @var{port} and store them in @var{bv}
1803 starting at index @var{start}. Return either the number of bytes
1804 actually read or the end-of-file object.
1805 @end deffn
1806
1807 @deffn {Scheme Procedure} get-bytevector-some port
1808 @deffnx {C Function} scm_get_bytevector_some (port)
1809 Read from @var{port}, blocking as necessary, until data are available or
1810 an end-of-file is reached. Return either a new bytevector containing
1811 the data read or the end-of-file object.
1812 @end deffn
1813
1814 @deffn {Scheme Procedure} get-bytevector-all port
1815 @deffnx {C Function} scm_get_bytevector_all (port)
1816 Read from @var{port}, blocking as necessary, until the end-of-file is
1817 reached. Return either a new bytevector containing the data read or the
1818 end-of-file object (if no data were available).
1819 @end deffn
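
As a sketch, the block-reading procedures can be combined on a single
bytevector port. The name @code{buf} is only illustrative:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port (open-bytevector-input-port #vu8(1 2 3 4 5)))

(get-bytevector-n port 2)
@result{} #vu8(1 2)

;; Store the next two octets into BUF, starting at index 1.
(define buf (make-bytevector 4 0))
(get-bytevector-n! port buf 1 2)
@result{} 2

buf
@result{} #vu8(0 3 4 0)

(get-bytevector-all port)
@result{} #vu8(5)

;; Nothing is left, so the end-of-file object is returned.
(eof-object? (get-bytevector-all port))
@result{} #t
@end lisp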
1820
1821 @node R6RS Textual Input
1822 @subsubsection Textual Input
1823
1824 @deffn {Scheme Procedure} get-char textual-input-port
1825 Reads from @var{textual-input-port}, blocking as necessary, until a
1826 complete character is available from @var{textual-input-port},
1827 or until an end of file is reached.
1828
1829 If a complete character is available before the next end of file,
1830 @code{get-char} returns that character and updates the input port to
1831 point past the character. If an end of file is reached before any
1832 character is read, @code{get-char} returns the end-of-file object.
1833 @end deffn
1834
1835 @deffn {Scheme Procedure} lookahead-char textual-input-port
1836 The @code{lookahead-char} procedure is like @code{get-char}, but it does
1837 not update @var{textual-input-port} to point past the character.
1838 @end deffn
1839
1840 @deffn {Scheme Procedure} get-string-n textual-input-port count
1841
1842 @var{count} must be an exact, non-negative integer object, representing
1843 the number of characters to be read.
1844
1845 The @code{get-string-n} procedure reads from @var{textual-input-port},
1846 blocking as necessary, until @var{count} characters are available, or
1847 until an end of file is reached.
1848
1849 If @var{count} characters are available before end of file,
1850 @code{get-string-n} returns a string consisting of those @var{count}
1851 characters. If fewer characters are available before an end of file, but
1852 one or more characters can be read, @code{get-string-n} returns a string
1853 containing those characters. In either case, the input port is updated
1854 to point just past the characters read. If no characters can be read
1855 before an end of file, the end-of-file object is returned.
1856 @end deffn
1857
1858 @deffn {Scheme Procedure} get-string-n! textual-input-port string start count
1859
1860 @var{start} and @var{count} must be exact, non-negative integer objects,
1861 with @var{count} representing the number of characters to be read.
1862 @var{string} must be a string with at least @math{@var{start} + @var{count}}
1863 characters.
1864
1865 The @code{get-string-n!} procedure reads from @var{textual-input-port}
1866 in the same manner as @code{get-string-n}. If @var{count} characters
1867 are available before an end of file, they are written into @var{string}
1868 starting at index @var{start}, and @var{count} is returned. If fewer
1869 characters are available before an end of file, but one or more can be
1870 read, those characters are written into @var{string} starting at index
1871 @var{start} and the number of characters actually read is returned as an
1872 exact integer object. If no characters can be read before an end of
1873 file, the end-of-file object is returned.
1874 @end deffn
1875
1876 @deffn {Scheme Procedure} get-string-all textual-input-port
1877 Reads from @var{textual-input-port} until an end of file, decoding
1878 characters in the same manner as @code{get-string-n} and
1879 @code{get-string-n!}.
1880
1881 If characters are available before the end of file, a string containing
1882 all the characters decoded from that data is returned. If no character
1883 precedes the end of file, the end-of-file object is returned.
1884 @end deffn
1885
1886 @deffn {Scheme Procedure} get-line textual-input-port
1887 Reads from @var{textual-input-port} up to and including the linefeed
1888 character or end of file, decoding characters in the same manner as
1889 @code{get-string-n} and @code{get-string-n!}.
1890
1891 If a linefeed character is read, a string containing all of the text up
1892 to (but not including) the linefeed character is returned, and the port
1893 is updated to point just past the linefeed character. If an end of file
1894 is encountered before any linefeed character is read, but some
1895 characters have been read and decoded as characters, a string containing
1896 those characters is returned. If an end of file is encountered before
1897 any characters are read, the end-of-file object is returned.
1898
1899 @quotation Note
1900 The end-of-line style, if not @code{none}, will cause all line endings
1901 to be read as linefeed characters. @xref{R6RS Transcoders}.
1902 @end quotation
1903 @end deffn
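
The character and line procedures above can be sketched on a string
port. This uses Guile's @code{open-input-string} (@pxref{String Ports})
as a convenient textual source:

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "first line\nsecond"))

(lookahead-char port)
@result{} #\f

(get-char port)
@result{} #\f

(get-string-n port 4)
@result{} "irst"

;; The linefeed is consumed but not included in the result.
(get-line port)
@result{} " line"

;; The last line is ended by the end of file rather than a linefeed.
(get-line port)
@result{} "second"

(eof-object? (get-line port))
@result{} #t
@end lisp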
1904
1905 @deffn {Scheme Procedure} get-datum textual-input-port
1906 Reads an external representation from @var{textual-input-port} and returns the
1907 datum it represents. The @code{get-datum} procedure returns the next
1908 datum that can be parsed from the given @var{textual-input-port}, updating
1909 @var{textual-input-port} to point exactly past the end of the external
1910 representation of the object.
1911
1912 Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
1913 Syntax}) in the input is first skipped. If an end of file occurs after
1914 the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
1915 is returned.
1916
1917 If a character inconsistent with an external representation is
1918 encountered in the input, an exception with condition types
1919 @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
1920 file is encountered after the beginning of an external representation,
1921 but the external representation is incomplete and therefore cannot be
1922 parsed, an exception with condition types @code{&lexical} and
1923 @code{&i/o-read} is raised.
1924 @end deffn
1925
1926 @node R6RS Output Ports
1927 @subsubsection Output Ports
1928
1929 @deffn {Scheme Procedure} output-port? obj
1930 Returns @code{#t} if the argument is an output port (or a
1931 combined input and output port), @code{#f} otherwise.
1932 @end deffn
1933
1934 @deffn {Scheme Procedure} flush-output-port port
1935 Flushes any buffered output from the buffer of @var{port} to the
1936 underlying file, device, or object. The @code{flush-output-port}
1937 procedure returns an unspecified value.
1938 @end deffn
1939
1940 @deffn {Scheme Procedure} open-file-output-port filename
1941 @deffnx {Scheme Procedure} open-file-output-port filename file-options
1942 @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
1943 @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
1944
1945 @var{maybe-transcoder} must be either a transcoder or @code{#f}.
1946
1947 The @code{open-file-output-port} procedure returns an output port for the named file.
1948
1949 The @var{file-options} argument, which may determine various aspects of
1950 the returned port (@pxref{R6RS File Options}), defaults to the value of
1951 @code{(file-options)}.
1952
1953 The @var{buffer-mode} argument, if supplied,
1954 must be one of the symbols that name a buffer mode.
1955 The @var{buffer-mode} argument defaults to @code{block}.
1956
1957 If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
1958 associated with the port.
1959
1960 If @var{maybe-transcoder} is @code{#f} or absent,
1961 the port will be a binary port and will support the
1962 @code{port-position} and @code{set-port-position!} operations.
1963 Otherwise the port will be a textual port, and whether it supports
1964 the @code{port-position} and @code{set-port-position!} operations
1965 is implementation-dependent (and possibly transcoder-dependent).
1966 @end deffn
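
As a sketch, the file procedures can be combined into a binary round
trip. The file name below is purely hypothetical, and
@code{(file-options no-fail)} is assumed to behave as in R6RS, replacing
any existing file (@pxref{R6RS File Options}):

@lisp
(use-modules (rnrs io ports))

;; Hypothetical scratch file, used for illustration only.
(define file "/tmp/guile-r6rs-example.bin")

;; No transcoder is given, so this is a binary port.
(call-with-port (open-file-output-port file (file-options no-fail))
  (lambda (out)
    (put-bytevector out #vu8(1 2 3))))

(call-with-port (open-file-input-port file)
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp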
1967
1968 @deffn {Scheme Procedure} standard-output-port
1969 @deffnx {Scheme Procedure} standard-error-port
1970 Returns a fresh binary output port connected to the standard output or
1971 standard error respectively. Whether the port supports the
1972 @code{port-position} and @code{set-port-position!} operations is
1973 implementation-dependent.
1974 @end deffn
1975
1976 @deffn {Scheme Procedure} current-output-port
1977 @deffnx {Scheme Procedure} current-error-port
1978 These return default textual ports for regular output and error output.
1979 Normally, these default ports are associated with standard output and
1980 standard error, respectively. The return value of
1981 @code{current-output-port} can be dynamically re-assigned using the
1982 @code{with-output-to-file} procedure from the @code{io simple (6)}
1983 library (@pxref{rnrs io simple}). A port returned by one of these
1984 procedures may or may not have an associated transcoder; if it does, the
1985 transcoder is implementation-dependent.
1986 @end deffn
1987
1988 @node R6RS Binary Output
1989 @subsubsection Binary Output
1990
1991 Binary output ports can be created with the procedures below.
1992
1993 @deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
1994 @deffnx {C Function} scm_open_bytevector_output_port (transcoder)
1995 Return two values: a binary output port and a procedure. The latter
1996 should be called with zero arguments to obtain a bytevector containing
1997 the data accumulated by the port, as illustrated below.
1998
1999 @lisp
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))

@result{} #vu8(104 101 108 108 111)
2008 @end lisp
2009
2010 @c FIXME: Update description when implemented.
2011 The @var{transcoder} argument is currently not supported.
2012 @end deffn
2013
2014 @cindex custom binary output ports
2015
2016 @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
2017 @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
2018 Return a new custom binary output port named @var{id} (a string) whose
2019 output is sunk by invoking @var{write!} and passing it a bytevector, an
2020 index where bytes should be read from this bytevector, and the number of
2021 bytes to be ``written''. The @code{write!} procedure must return an
2022 integer indicating the number of bytes actually written; when it is
2023 passed @code{0} as the number of bytes to write, it should behave as
2024 though an end-of-file was sent to the byte sink.
2025
2026 The other arguments are as for @code{make-custom-binary-input-port}
2027 (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
2028 @end deffn
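
As a rough sketch, a custom binary output port can collect every octet
it receives into a list. The names @code{collected} and
@code{"collector"} are illustrative only, and the port is flushed
explicitly in case the implementation buffers output:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define collected '())

;; Append COUNT octets of BV, starting at START, to COLLECTED.
(define (write! bv start count)
  (let ((chunk (make-bytevector count)))
    (bytevector-copy! bv start chunk 0 count)
    (set! collected
          (append collected (bytevector->u8-list chunk)))
    count))

;; No positioning or close procedures are supplied.
(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port #vu8(1 2 3))
(flush-output-port port)

collected
@result{} (1 2 3)
@end lisp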
2029
2030 @cindex binary output
2031 Writing to a binary output port can be done using the following
2032 procedures:
2033
2034 @deffn {Scheme Procedure} put-u8 port octet
2035 @deffnx {C Function} scm_put_u8 (port, octet)
2036 Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
2037 binary output port.
2038 @end deffn
2039
2040 @deffn {Scheme Procedure} put-bytevector port bv [start [count]]
2041 @deffnx {C Function} scm_put_bytevector (port, bv, start, count)
2042 Write the contents of @var{bv} to @var{port}, optionally starting at
2043 index @var{start} and limiting to @var{count} octets.
2044 @end deffn
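
For example, the following sketch writes a single octet followed by a
slice of a bytevector to a bytevector output port:

@lisp
(use-modules (rnrs io ports))

(call-with-values open-bytevector-output-port
  (lambda (port get-bytevector)
    (put-u8 port 255)
    ;; Write 3 octets of the bytevector, starting at index 1.
    (put-bytevector port #vu8(1 2 3 4 5) 1 3)
    (get-bytevector)))
@result{} #vu8(255 2 3 4)
@end lisp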
2045
2046 @node R6RS Textual Output
2047 @subsubsection Textual Output
2048
2049 @deffn {Scheme Procedure} put-char port char
2050 Writes @var{char} to the port. The @code{put-char} procedure returns an unspecified value.
2051 @end deffn
2052
2053 @deffn {Scheme Procedure} put-string port string
2054 @deffnx {Scheme Procedure} put-string port string start
2055 @deffnx {Scheme Procedure} put-string port string start count
2056
2057 @var{start} and @var{count} must be non-negative exact integer objects.
2058 @var{string} must have a length of at least @math{@var{start} +
2059 @var{count}}. @var{start} defaults to 0. @var{count} defaults to
2060 @math{@code{(string-length @var{string})} - @var{start}}. The
2061 @code{put-string} procedure writes the @var{count} characters of
2062 @var{string} starting at index @var{start} to the port. The
2063 @code{put-string} procedure returns an unspecified value.
2064 @end deffn
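
The effect of the optional @var{start} and @var{count} arguments can be
sketched with Guile's @code{call-with-output-string} (@pxref{String
Ports}), which collects the output in a string:

@lisp
(use-modules (rnrs io ports))

(call-with-output-string
  (lambda (port)
    ;; From index 7 to the end of the string.
    (put-string port "hello, world" 7)
    (put-char port #\!)
    ;; Three characters, starting at index 1.
    (put-string port "abcdef" 1 3)))
@result{} "world!bcd"
@end lisp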
2065
2066 @deffn {Scheme Procedure} put-datum textual-output-port datum
2067 @var{datum} should be a datum value. The @code{put-datum} procedure
2068 writes an external representation of @var{datum} to
2069 @var{textual-output-port}. The specific external representation is
2070 implementation-dependent. However, whenever possible, an implementation
2071 should produce a representation for which @code{get-datum}, when reading
2072 the representation, will return an object equal (in the sense of
2073 @code{equal?}) to @var{datum}.
2074
2075 @quotation Note
2076 Not all datums may allow producing an external representation for which
2077 @code{get-datum} will produce an object that is equal to the
2078 original. Specifically, NaNs contained in @var{datum} may make
2079 this impossible.
2080 @end quotation
2081
2082 @quotation Note
2083 The @code{put-datum} procedure merely writes the external
2084 representation, but no trailing delimiter. If @code{put-datum} is
2085 used to write several subsequent external representations to an
2086 output port, care should be taken to delimit them properly so they can
2087 be read back in by subsequent calls to @code{get-datum}.
2088 @end quotation
2089 @end deffn
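
The second note can be illustrated as follows. This sketch writes two
datums separated by a space and reads them back with @code{get-datum}.
The string port procedures come from Guile's core (@pxref{String
Ports}), and the name @code{text} is only illustrative:

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-datum port '(1 "two" three))
      ;; Delimit the two external representations explicitly.
      (put-char port #\space)
      (put-datum port 42))))

text
@result{} "(1 \"two\" three) 42"

(define in (open-input-string text))

(get-datum in)
@result{} (1 "two" three)

(get-datum in)
@result{} 42

(eof-object? (get-datum in))
@result{} #t
@end lisp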
2090
2091 @node I/O Extensions
2092 @subsection Using and Extending Ports in C
2093
2094 @menu
2095 * C Port Interface:: Using ports from C.
2096 * Port Implementation:: How to implement a new port type in C.
2097 @end menu
2098
2099
2100 @node C Port Interface
2101 @subsubsection C Port Interface
2102 @cindex C port interface
2103 @cindex Port, C interface
2104
2105 This section describes how to use Scheme ports from C.
2106
2107 @subsubheading Port basics
2108
2109 @cindex ptob
2110 @tindex scm_ptob_descriptor
2111 @tindex scm_port
2112 @findex SCM_PTAB_ENTRY
2113 @findex SCM_PTOBNUM
2114 @vindex scm_ptobs
2115 There are two main data structures. A port type object (ptob) is of
2116 type @code{scm_ptob_descriptor}. A port instance is of type
2117 @code{scm_port}. Given an @code{SCM} variable which points to a port,
2118 the corresponding C port object can be obtained using the
2119 @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
2120 @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
2121 global array.
2122
2123 @subsubheading Port buffers
2124
2125 An input port always has a read buffer and an output port always has a
2126 write buffer. However the size of these buffers is not guaranteed to be
2127 more than one byte (e.g., the @code{shortbuf} field in @code{scm_port}
2128 which is used when no other buffer is allocated). The way in which the
2129 buffers are allocated depends on the implementation of the ptob. For
2130 example in the case of an fport, buffers may be allocated with malloc
2131 when the port is created, but in the case of an strport the underlying
2132 string is used as the buffer.
2133
2134 @subsubheading The @code{rw_random} flag
2135
2136 Special treatment is required for ports which can be seeked at random.
2137 Before various operations, such as seeking the port or changing from
2138 input to output on a bidirectional port or vice versa, the port
2139 implementation must be given a chance to update its state. The write
2140 buffer is updated by calling the @code{flush} ptob procedure and the
2141 input buffer is updated by calling the @code{end_input} ptob procedure.
2142 In the case of an fport, @code{flush} causes buffered output to be
2143 written to the file descriptor, while @code{end_input} causes the
2144 descriptor position to be adjusted to account for buffered input which
2145 was never read.
2146
2147 The special treatment must be performed if the @code{rw_random} flag in
2148 the port is non-zero.
2149
2150 @subsubheading The @code{rw_active} variable
2151
2152 The @code{rw_active} variable in the port is only used if
2153 @code{rw_random} is set. It's defined as an enum with the following
2154 values:
2155
2156 @table @code
2157 @item SCM_PORT_READ
2158 the read buffer may have unread data.
2159
2160 @item SCM_PORT_WRITE
2161 the write buffer may have unwritten data.
2162
2163 @item SCM_PORT_NEITHER
2164 neither the write nor the read buffer has data.
2165 @end table
2166
2167 @subsubheading Reading from a port
2168
2169 To read from a port, it's possible to either call existing libguile
2170 procedures such as @code{scm_getc} and @code{scm_read_line} or to read
2171 data from the read buffer directly. Reading from the buffer involves
2172 the following steps:
2173
2174 @enumerate
2175 @item
2176 Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
2177
2178 @item
2179 Fill the read buffer, if it's empty, using @code{scm_fill_input}.
2180
2181 @item Read the data from the buffer and update the read position in
2182 the buffer. Steps 2 and 3 may be repeated as many times as required.
2183
2184 @item Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.
2185
2186 @item Update the port's line and column counts.
2187 @end enumerate
2188
2189 @subsubheading Writing to a port
2190
2191 To write data to a port, calling @code{scm_lfwrite} should be sufficient for
2192 most purposes. This takes care of the following steps:
2193
2194 @enumerate
2195 @item
2196 End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
2197
2198 @item
2199 Pass the data to the ptob implementation using the @code{write} ptob
2200 procedure. The advantage of using the ptob @code{write} instead of
2201 manipulating the write buffer directly is that it allows the data to be
2202 written in one operation even if the port is using the single-byte
2203 @code{shortbuf}.
2204
2205 @item
2206 Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
2207 is set.
2208 @end enumerate
2209
2210
2211 @node Port Implementation
2212 @subsubsection Port Implementation
2213 @cindex Port implementation
2214
2215 This section describes how to implement a new port type in C.
2216
2217 As described in the previous section, a port type object (ptob) is
2218 a structure of type @code{scm_ptob_descriptor}. A ptob is created by
2219 calling @code{scm_make_port_type}.
2220
2221 @deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
2222 Return a new port type object. The @var{name}, @var{fill_input} and
2223 @var{write} parameters are initial values for those port type fields,
2224 as described below. The other fields are initialized with default
2225 values and can be changed later.
2226 @end deftypefun
2227
2228 All of the elements of the ptob, apart from @code{name}, are procedures
2229 which collectively implement the port behaviour. Creating a new port
2230 type mostly involves writing these procedures.
2231
2232 @table @code
2233 @item name
2234 A pointer to a NUL terminated string: the name of the port type. This
2235 is the only element of @code{scm_ptob_descriptor} which is not
2236 a procedure. Set via the first argument to @code{scm_make_port_type}.
2237
2238 @item mark
2239 Called during garbage collection to mark any SCM objects that a port
2240 object may contain. It doesn't need to be set unless the port has
2241 @code{SCM} components. Set using
2242
2243 @deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
2244 @end deftypefun
2245
2246 @item free
2247 Called when the port is collected during gc. It
2248 should free any resources used by the port.
2249 Set using
2250
2251 @deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
2252 @end deftypefun
2253
2254 @item print
2255 Called when @code{write} is called on the port object, to print a
2256 port description. E.g., for an fport it may produce something like:
2257 @code{#<input: /etc/passwd 3>}. Set using
2258
2259 @deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
2260 The first argument @var{port} is the object being printed, the second
2261 argument @var{dest_port} is where its description should go.
2262 @end deftypefun
2263
2264 @item equalp
2265 Not used at present. Set using
2266
2267 @deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
2268 @end deftypefun
2269
2270 @item close
2271 Called when the port is closed, unless it was collected during gc. It
2272 should free any resources used by the port.
2273 Set using
2274
2275 @deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
2276 @end deftypefun
2277
2278 @item write
2279 Accept data which is to be written using the port. The port implementation
2280 may choose to buffer the data instead of processing it directly.
2281 Set via the third argument to @code{scm_make_port_type}.
2282
2283 @item flush
2284 Complete the processing of buffered output data. Reset the value of
2285 @code{rw_active} to @code{SCM_PORT_NEITHER}.
2286 Set using
2287
2288 @deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
2289 @end deftypefun
2290
2291 @item end_input
2292 Perform any synchronization required when switching from input to output
2293 on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
2294 Set using
2295
2296 @deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
2297 @end deftypefun
2298
2299 @item fill_input
2300 Read new data into the read buffer and return the first character. It
2301 can be assumed that the read buffer is empty when this procedure is called.
2302 Set via the second argument to @code{scm_make_port_type}.
2303
2304 @item input_waiting
2305 Return a lower bound on the number of bytes that could be read from the
2306 port without blocking. It can be assumed that the current state of
2307 @code{rw_active} is @code{SCM_PORT_NEITHER}.
2308 Set using
2309
2310 @deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
2311 @end deftypefun
2312
2313 @item seek
2314 Set the current position of the port. The procedure can not make
2315 any assumptions about the value of @code{rw_active} when it's
2316 called. It can reset the buffers first if desired by using something
2317 like:
2318
2319 @example
2320 if (pt->rw_active == SCM_PORT_READ)
2321   scm_end_input (port);
2322 else if (pt->rw_active == SCM_PORT_WRITE)
2323   ptob->flush (port);
2324 @end example
2325
2326 However, note that this will have the side effect of discarding any data
2327 in the unread-char buffer, in addition to any side effects from the
2328 @code{end_input} and @code{flush} ptob procedures. This is undesirable
2329 when seek is called to measure the current position of the port, i.e.,
2330 @code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
2331 implementations take care to avoid this problem.
2332
2333 The procedure is set using
2334
2335 @deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
2336 @end deftypefun
2337
2338 @item truncate
2339 Truncate the port data to the specified length. It can be assumed that the
2340 current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
2341 Set using
2342
2343 @deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
2344 @end deftypefun
2345
2346 @end table
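
The caveat in the @code{seek} entry above, that a pure position query
such as @code{(seek p 0 SEEK_CUR)} should not disturb the buffers, can
be illustrated with a small self-contained model.  It is a sketch
only: @code{struct toy_port}, its byte counters and @code{toy_seek}
are assumptions made for this example and are not the libguile fport
or string port code.

```c
#include <assert.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR */

/* Toy stand-ins, not the libguile definitions.  */
enum toy_rw { TOY_PORT_NEITHER, TOY_PORT_READ, TOY_PORT_WRITE };

struct toy_port
{
  enum toy_rw rw_active;
  long dev_pos;     /* position of the underlying device */
  long unread;      /* bytes fetched from the device but not yet consumed */
  long unwritten;   /* bytes buffered but not yet sent to the device */
};

/* Return the new logical position of the port.  */
static long
toy_seek (struct toy_port *pt, long offset, int whence)
{
  assert (whence == SEEK_SET || whence == SEEK_CUR);

  if (offset == 0 && whence == SEEK_CUR)
    {
      /* Position query: compute the answer from the buffer state and
         leave the buffers and rw_active untouched.  */
      if (pt->rw_active == TOY_PORT_READ)
        return pt->dev_pos - pt->unread;
      if (pt->rw_active == TOY_PORT_WRITE)
        return pt->dev_pos + pt->unwritten;
      return pt->dev_pos;
    }

  /* A real move: reset the buffers first.  */
  if (pt->rw_active == TOY_PORT_READ)
    {
      pt->dev_pos -= pt->unread;      /* plays the role of end_input */
      pt->unread = 0;
    }
  else if (pt->rw_active == TOY_PORT_WRITE)
    {
      pt->dev_pos += pt->unwritten;   /* plays the role of flush */
      pt->unwritten = 0;
    }
  pt->rw_active = TOY_PORT_NEITHER;

  pt->dev_pos = (whence == SEEK_SET) ? offset : pt->dev_pos + offset;
  return pt->dev_pos;
}
```

The design choice shown here is to treat a zero offset relative to the
current position as a special case, so that measuring the position
never costs the caller any buffered data.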
2347
2348 @c Local Variables:
2349 @c TeX-master: "guile.texi"
2350 @c End: