@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
@c 2010, 2011, 2013 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Input and Output
@section Input and Output

@menu
* Ports:: The idea of the port abstraction.
* Reading:: Procedures for reading from a port.
* Writing:: Procedures for writing to a port.
* Closing:: Procedures to close a port.
* Random Access:: Moving around a random access port.
* Line/Delimited:: Read and write lines or delimited text.
* Block Reading and Writing:: Reading and writing blocks of text.
* Default Ports:: Defaults for input, output and errors.
* Port Types:: Types of port and how to make them.
* R6RS I/O Ports:: The R6RS port API.
* I/O Extensions:: Using and extending ports in C.
* BOM Handling:: Handling of Unicode byte order marks.
@end menu


@node Ports
@subsection Ports
@cindex Port

Sequential input/output in Scheme is represented by operations on a
@dfn{port}. This chapter explains the operations that Guile provides
for working with ports.

Ports are created by opening, for instance @code{open-file} for a file
(@pxref{File Ports}). Characters can be read from an input port and
written to an output port, or both on an input/output port. A port
can be closed (@pxref{Closing}) when no longer required, after which
any attempt to read or write is an error.

The formal definition of a port is very generic: an input port is
simply ``an object which can deliver characters on demand,'' and an
output port is ``an object which can accept characters.'' Because
this definition is so loose, it is easy to write functions that
simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
are two interesting and powerful examples of this technique.
(@pxref{Soft Ports}, and @ref{String Ports}.)

Ports are garbage collected in the usual way (@pxref{Memory
Management}), and will be closed at that time if not already closed.
In this case any errors occurring in the close will not be reported.
Usually a program will want to close its ports explicitly so as to be
sure all its operations have been successful. Of course if a program
has abandoned something due to an error or other condition then closing
problems are probably not of interest.

It is strongly recommended that file ports be closed explicitly when
no longer required. Most systems have limits on how many files can be
open, both on a per-process and a system-wide basis. A program that
uses many files should take care not to hit those limits. The same
applies to similar system resources such as pipes and sockets.

Note that automatic garbage collection is triggered only by memory
consumption, not by file or other resource usage, so a program cannot
rely on that to keep it away from system limits. An explicit call to
@code{gc} can of course be relied on to pick up unreferenced ports.
If program flow makes it hard to be certain when to close then this
may be an acceptable way to control resource usage.

All file access uses the ``LFS'' large file support functions when
available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
read and written on a 32-bit system.

Each port has an associated character encoding that controls how bytes
read from the port are converted to characters and strings, and how
characters and strings written to the port are converted to bytes.
When ports are created, they inherit their character encoding from the
current locale, but that can be changed after the port is created.

Currently, the ports only work with @emph{non-modal} encodings. Most
encodings are non-modal, meaning that the conversion of bytes to a
string doesn't depend on its context: the same byte sequence always
decodes to the same string. A couple of modal encodings are in common
use, like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.

Each port also has an associated conversion strategy: what to do when
a Guile character can't be converted to the port's encoded character
representation for output. There are three possible strategies: to
raise an error, to replace the character with a hex escape, or to
replace the character with a substitute character.

@rnindex input-port?
@deffn {Scheme Procedure} input-port? x
@deffnx {C Function} scm_input_port_p (x)
Return @code{#t} if @var{x} is an input port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@rnindex output-port?
@deffn {Scheme Procedure} output-port? x
@deffnx {C Function} scm_output_port_p (x)
Return @code{#t} if @var{x} is an output port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn

@deffn {Scheme Procedure} port? x
@deffnx {C Function} scm_port_p (x)
Return a boolean indicating whether @var{x} is a port.
Equivalent to @code{(or (input-port? @var{x}) (output-port?
@var{x}))}.
@end deffn
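
The three predicates can be tried at the REPL. This is only a sketch;
it uses a string port created with @code{open-input-string}
(@pxref{String Ports}) as a convenient input-only port, and the name
@code{in} is chosen here for illustration.

@lisp
(define in (open-input-string "some text"))

(input-port? in)       ; => #t
(output-port? in)      ; => #f
(port? in)             ; => #t
(port? "some text")    ; => #f, a string is not a port
@end lisp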

@deffn {Scheme Procedure} set-port-encoding! port enc
@deffnx {C Function} scm_set_port_encoding_x (port, enc)
Sets the character encoding that will be used to interpret all port I/O.
@var{enc} is a string containing the name of an encoding. Valid
encoding names are those
@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
@end deffn

@defvr {Scheme Variable} %default-port-encoding
A fluid containing @code{#f} or the name of the encoding to
be used by default for newly created ports (@pxref{Fluids and Dynamic
States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.

New ports are created with the encoding appropriate for the current
locale if @code{setlocale} has been called or the value specified by
this fluid otherwise.
@end defvr

@deffn {Scheme Procedure} port-encoding port
@deffnx {C Function} scm_port_encoding (port)
Returns, as a string, the character encoding that @var{port} uses to interpret
its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
@end deffn
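
As a small sketch of how these fit together, the following sets the
encoding of one port and then rebinds the default for a limited dynamic
extent with @code{with-fluids}. The string port and the name @code{p}
are only for illustration; the same calls apply to file ports.

@lisp
(define p (open-input-string "hello"))

;; Switch this port to UTF-8 and read the setting back.
(set-port-encoding! p "UTF-8")
(port-encoding p)                  ; => "UTF-8"

;; Ports opened inside this extent take UTF-8 as their default
;; encoding; the previous default is restored on exit.
(with-fluids ((%default-port-encoding "UTF-8"))
  (port-encoding (open-input-string "x")))
@end lisp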

@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
Sets the behavior of the interpreter when outputting a character that
is not representable in the port's current encoding. @var{sym} can be
either @code{'error}, @code{'substitute}, or @code{'escape}. If it is
@code{'error}, an error will be thrown when a nonconvertible character
is encountered. If it is @code{'substitute}, then nonconvertible
characters will be replaced with approximate characters, or with
question marks if no approximately correct character is available. If
it is @code{'escape}, nonconvertible characters will appear as hex
escapes when output.

If @var{port} is an open port, the conversion error behavior
is set for that port. If it is @code{#f}, it is set as the
default behavior for any future ports that get created in
this thread.
@end deffn

@deffn {Scheme Procedure} port-conversion-strategy port
@deffnx {C Function} scm_port_conversion_strategy (port)
Returns the behavior of the port when outputting a character that is
not representable in the port's current encoding. It returns the
symbol @code{error} if unrepresentable characters should cause
exceptions, @code{substitute} if the port should try to replace
unrepresentable characters with question marks or approximate
characters, or @code{escape} if unrepresentable characters should be
converted to string escapes.

If @var{port} is @code{#f}, then the current default behavior will be
returned. New ports will have this default behavior when they are
created.
@end deffn

@deffn {Scheme Variable} %default-port-conversion-strategy
The fluid that defines the conversion strategy for newly created ports,
and for other conversion routines such as @code{scm_to_stringn},
@code{scm_from_stringn}, @code{string->pointer}, and
@code{pointer->string}.

Its value must be one of the symbols described above, with the same
semantics: @code{'error}, @code{'substitute}, or @code{'escape}.

When Guile starts, its value is @code{'substitute}.

Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
equivalent to @code{(fluid-set! %default-port-conversion-strategy
@var{sym})}.
@end deffn
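
The following sketch sets a strategy on a single port and then changes
the default temporarily. The string output port and the name @code{p}
are illustrative only; what a nonconvertible character actually looks
like on output depends on the port's encoding and is not shown here.

@lisp
(define p (open-output-string))

;; Per-port setting.
(set-port-conversion-strategy! p 'escape)
(port-conversion-strategy p)         ; => escape

;; Default for ports created later, limited to this dynamic extent.
(with-fluids ((%default-port-conversion-strategy 'error))
  (port-conversion-strategy #f))     ; => error
@end lisp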


@node Reading
@subsection Reading
@cindex Reading

[Generic procedures for reading from ports.]

These procedures pertain to reading characters and strings from
ports. To read general S-expressions from ports, see @ref{Scheme Read}.

@rnindex eof-object?
@cindex End of file object
@deffn {Scheme Procedure} eof-object? x
@deffnx {C Function} scm_eof_object_p (x)
Return @code{#t} if @var{x} is an end-of-file object; otherwise
return @code{#f}.
@end deffn

@rnindex char-ready?
@deffn {Scheme Procedure} char-ready? [port]
@deffnx {C Function} scm_char_ready_p (port)
Return @code{#t} if a character is ready on input @var{port}
and return @code{#f} otherwise. If @code{char-ready?} returns
@code{#t} then the next @code{read-char} operation on
@var{port} is guaranteed not to hang. If @var{port} is a file
port at end of file then @code{char-ready?} returns @code{#t}.

@code{char-ready?} exists to make it possible for a
program to accept characters from interactive ports without
getting stuck waiting for input. Any input editors associated
with such ports must make sure that characters whose existence
has been asserted by @code{char-ready?} cannot be rubbed out.
If @code{char-ready?} were to return @code{#f} at end of file,
a port at end of file would be indistinguishable from an
interactive port that has no ready characters.
@end deffn

@rnindex read-char
@deffn {Scheme Procedure} read-char [port]
@deffnx {C Function} scm_read_char (port)
Return the next character available from @var{port}, updating
@var{port} to point to the following character. If no more
characters are available, the end-of-file object is returned.

When @var{port}'s data cannot be decoded according to its
character encoding, a @code{decoding-error} is raised and
@var{port} points past the erroneous byte sequence.
@end deffn

@deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
Read up to @var{size} bytes from @var{port} and store them in
@var{buffer}. The return value is the number of bytes actually read,
which can be less than @var{size} if end-of-file has been reached.

Note that this function does not update @code{port-line} and
@code{port-column} below.
@end deftypefn

@rnindex peek-char
@deffn {Scheme Procedure} peek-char [port]
@deffnx {C Function} scm_peek_char (port)
Return the next character available from @var{port},
@emph{without} updating @var{port} to point to the following
character. If no more characters are available, the
end-of-file object is returned.

The value returned by
a call to @code{peek-char} is the same as the value that would
have been returned by a call to @code{read-char} on the same
port. The only difference is that the very next call to
@code{read-char} or @code{peek-char} on that @var{port} will
return the value returned by the preceding call to
@code{peek-char}. In particular, a call to @code{peek-char} on
an interactive port will hang waiting for input whenever a call
to @code{read-char} would have hung.

As for @code{read-char}, a @code{decoding-error} may be raised
if such a situation occurs. However, unlike with @code{read-char},
@var{port} still points at the beginning of the erroneous byte
sequence when the error is raised.
@end deffn
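
The difference between @code{peek-char} and @code{read-char} is easiest
to see on a short input. The sketch below reads from a string port; the
name @code{p} is chosen for illustration.

@lisp
(define p (open-input-string "ab"))

(peek-char p)                 ; => #\a, the port does not advance
(read-char p)                 ; => #\a
(read-char p)                 ; => #\b
(eof-object? (peek-char p))   ; => #t
(eof-object? (read-char p))   ; => #t
@end lisp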

@deffn {Scheme Procedure} unread-char cobj [port]
@deffnx {C Function} scm_unread_char (cobj, port)
Place character @var{cobj} in @var{port} so that it will be read by the
next read operation. If called multiple times, the unread characters
will be read again in last-in first-out order. If @var{port} is
not supplied, the current input port is used.
@end deffn

@deffn {Scheme Procedure} unread-string str port
@deffnx {C Function} scm_unread_string (str, port)
Place the string @var{str} in @var{port} so that its characters will
be read from left-to-right as the next characters from @var{port}
during subsequent read operations. If called multiple times, the
unread characters will be read again in last-in first-out order. If
@var{port} is not supplied, the @code{current-input-port} is used.
@end deffn
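
The last-in first-out order can be illustrated as follows. This is a
sketch on a string port named @code{p} for the example; note that the
characters of a string passed to @code{unread-string} come back left to
right.

@lisp
(define p (open-input-string "c"))

(unread-char #\b p)
(unread-char #\a p)      ; unread last, so read first
(read-char p)            ; => #\a
(read-char p)            ; => #\b

(unread-string "xy" p)
(read-char p)            ; => #\x
(read-char p)            ; => #\y
(read-char p)            ; => #\c
@end lisp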

@deffn {Scheme Procedure} drain-input port
@deffnx {C Function} scm_drain_input (port)
This procedure clears a port's input buffers, similar
to the way that @code{force-output} clears the output buffer. The
contents of the buffers are returned as a single string, e.g.,

@lisp
(define p (open-input-file ...))
(drain-input p) => empty string, nothing buffered yet.
(unread-char (read-char p) p)
(drain-input p) => initial chars from p, up to the buffer size.
@end lisp

Draining the buffers may be useful for cleanly finishing
buffered I/O so that the file descriptor can be used directly
for further input.
@end deffn

@deffn {Scheme Procedure} port-column port
@deffnx {Scheme Procedure} port-line port
@deffnx {C Function} scm_port_column (port)
@deffnx {C Function} scm_port_line (port)
Return the current column number or line number of @var{port}.
If the number is unknown, the result is @code{#f}. Otherwise, the
result is a 0-origin integer, i.e.@: the first character of the first
line is line 0, column 0.
(However, when you display a file position, for example in an error
message, we recommend you add 1 to get 1-origin integers. This is
because lines and column numbers traditionally start with 1, and that is
what non-programmers will find most natural.)
@end deffn

@deffn {Scheme Procedure} set-port-column! port column
@deffnx {Scheme Procedure} set-port-line! port line
@deffnx {C Function} scm_set_port_column_x (port, column)
@deffnx {C Function} scm_set_port_line_x (port, line)
Set the current column or line number of @var{port}.
@end deffn
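
For instance, after consuming the first line of a two-line input the
position is line 1, column 0. The sketch below also shows the
recommended conversion to 1-origin numbers for display; the port name
@code{p} and the message format are illustrative.

@lisp
(define p (open-input-string "ab\ncd"))

(read-char p) (read-char p) (read-char p)  ; consume "ab" and the newline
(port-line p)                              ; => 1
(port-column p)                            ; => 0

;; Convert to 1-origin numbers for a message aimed at people.
(simple-format #f "line ~A, column ~A"
               (+ 1 (port-line p))
               (+ 1 (port-column p)))      ; => "line 2, column 1"
@end lisp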

@node Writing
@subsection Writing
@cindex Writing

[Generic procedures for writing to ports.]

These procedures are for writing characters and strings to
ports. For more information on writing arbitrary Scheme objects to
ports, see @ref{Scheme Write}.

@deffn {Scheme Procedure} get-print-state port
@deffnx {C Function} scm_get_print_state (port)
Return the print state of the port @var{port}. If @var{port}
has no associated print state, @code{#f} is returned.
@end deffn

@rnindex newline
@deffn {Scheme Procedure} newline [port]
@deffnx {C Function} scm_newline (port)
Send a newline to @var{port}.
If @var{port} is omitted, send to the current output port.
@end deffn

@deffn {Scheme Procedure} port-with-print-state port [pstate]
@deffnx {C Function} scm_port_with_print_state (port, pstate)
Create a new port which behaves like @var{port}, but with an
included print state @var{pstate}. @var{pstate} is optional.
If @var{pstate} isn't supplied and @var{port} already has
a print state, the old print state is reused.
@end deffn

@deffn {Scheme Procedure} simple-format destination message . args
@deffnx {C Function} scm_simple_format (destination, message, args)
Write @var{message} to @var{destination}, defaulting to
the current output port.
@var{message} can contain @code{~A} (was @code{%s}) and
@code{~S} (was @code{%S}) escapes. When printed,
the escapes are replaced with corresponding members of
@var{args}:
@code{~A} formats using @code{display} and @code{~S} formats
using @code{write}.
If @var{destination} is @code{#t}, then use the current output
port; if @var{destination} is @code{#f}, then return a string
containing the formatted text. Does not add a trailing newline.
@end deffn
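
The difference between the two escapes shows up with string arguments,
since @code{write} quotes them and @code{display} does not. A brief
sketch:

@lisp
(simple-format #f "~A and ~S" "text" "text")
;; => "text and \"text\""

(simple-format #t "Hello, ~A!\n" 'world)   ; prints: Hello, world!
@end lisp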

@rnindex write-char
@deffn {Scheme Procedure} write-char chr [port]
@deffnx {C Function} scm_write_char (chr, port)
Send character @var{chr} to @var{port}.
@end deffn

@deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
Write @var{size} bytes at @var{buffer} to @var{port}.

Note that this function does not update @code{port-line} and
@code{port-column} (@pxref{Reading}).
@end deftypefn

@findex fflush
@deffn {Scheme Procedure} force-output [port]
@deffnx {C Function} scm_force_output (port)
Flush the specified output port, or the current output port if @var{port}
is omitted. The current output buffer contents are passed to the
underlying port implementation (e.g., in the case of fports, the
data will be written to the file and the output buffer will be cleared).
It has no effect on an unbuffered port.

The return value is unspecified.
@end deffn

@deffn {Scheme Procedure} flush-all-ports
@deffnx {C Function} scm_flush_all_ports ()
Equivalent to calling @code{force-output} on
all open output ports. The return value is unspecified.
@end deffn
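
A common use of @code{force-output} is to make a prompt visible before
the program blocks waiting for input, since a buffered port may
otherwise hold the text back. A minimal sketch, with an illustrative
prompt string:

@lisp
(display "Enter your name: ")
(force-output)     ; flush the current output port so the prompt shows
@end lisp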


@node Closing
@subsection Closing
@cindex Closing ports
@cindex Port, close

@deffn {Scheme Procedure} close-port port
@deffnx {C Function} scm_close_port (port)
Close the specified port object. Return @code{#t} if it
successfully closes a port or @code{#f} if it was already
closed. An exception may be raised if an error occurs, for
example when flushing buffered output. See also @ref{Ports and
File Descriptors, close}, for a procedure which can close file
descriptors.
@end deffn

@deffn {Scheme Procedure} close-input-port port
@deffnx {Scheme Procedure} close-output-port port
@deffnx {C Function} scm_close_input_port (port)
@deffnx {C Function} scm_close_output_port (port)
@rnindex close-input-port
@rnindex close-output-port
Close the specified input or output @var{port}. An exception may be
raised if an error occurs while closing. If @var{port} is already
closed, nothing is done. The return value is unspecified.

See also @ref{Ports and File Descriptors, close}, for a procedure
which can close file descriptors.
@end deffn

@deffn {Scheme Procedure} port-closed? port
@deffnx {C Function} scm_port_closed_p (port)
Return @code{#t} if @var{port} is closed or @code{#f} if it is
open.
@end deffn
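
The return values described above can be observed on any port. The
following sketch uses a string port named @code{p} for illustration.

@lisp
(define p (open-input-string "data"))

(port-closed? p)    ; => #f
(close-port p)      ; => #t, the port was open
(port-closed? p)    ; => #t
(close-port p)      ; => #f, the port was already closed
@end lisp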


@node Random Access
@subsection Random Access
@cindex Random access, ports
@cindex Port, random access

@deffn {Scheme Procedure} seek fd_port offset whence
@deffnx {C Function} scm_seek (fd_port, offset, whence)
Sets the current position of @var{fd_port} to the integer
@var{offset}. For a file port, @var{offset} is expressed
as a number of bytes; for other types of ports, such as string
ports, @var{offset} is an abstract representation of the
position within the port's data, not necessarily expressed
as a number of bytes. @var{offset} is interpreted according to
the value of @var{whence}.

One of the following variables should be supplied for
@var{whence}:
@defvar SEEK_SET
Seek from the beginning of the file.
@end defvar
@defvar SEEK_CUR
Seek from the current position.
@end defvar
@defvar SEEK_END
Seek from the end of the file.
@end defvar
If @var{fd_port} is a file descriptor, the underlying system
call is @code{lseek}. @var{fd_port} may be a string port.

The value returned is the new position in @var{fd_port}. This means
that the current position of a port can be obtained using:
@lisp
(seek port 0 SEEK_CUR)
@end lisp
@end deffn

@deffn {Scheme Procedure} ftell fd_port
@deffnx {C Function} scm_ftell (fd_port)
Return an integer representing the current position of
@var{fd_port}, measured from the beginning. Equivalent to:

@lisp
(seek fd_port 0 SEEK_CUR)
@end lisp
@end deffn
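
The sketch below checks the equivalence of @code{ftell} and a zero
offset @code{seek}, and then rewinds the port. It uses a string port
named @code{p} for illustration, and deliberately seeks only to position
0, since other offsets on a string port are an abstract position rather
than a guaranteed byte count.

@lisp
(define p (open-input-string "abcdef"))

(read-char p)                        ; => #\a
(= (ftell p) (seek p 0 SEEK_CUR))    ; => #t
(seek p 0 SEEK_SET)                  ; rewind to the beginning
(read-char p)                        ; => #\a again
@end lisp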

@findex truncate
@findex ftruncate
@deffn {Scheme Procedure} truncate-file file [length]
@deffnx {C Function} scm_truncate_file (file, length)
Truncate @var{file} to @var{length} bytes. @var{file} can be a
filename string, a port object, or an integer file descriptor. The
return value is unspecified.

For a port or file descriptor @var{length} can be omitted, in which
case the file is truncated at the current position (per @code{ftell}
above).

On most systems a file can be extended by giving a length greater than
the current size, but this is not mandatory in the POSIX standard.
@end deffn

@node Line/Delimited
@subsection Line Oriented and Delimited Text
@cindex Line input/output
@cindex Port, line input/output

The delimited-I/O module can be accessed with:

@lisp
(use-modules (ice-9 rdelim))
@end lisp

It can be used to read or write lines of text, or read text delimited by
a specified set of characters. It's similar to the @code{(scsh rdelim)}
module from guile-scsh, but does not use multiple values or character
sets and has an extra procedure @code{write-line}.

@c begin (scm-doc-string "rdelim.scm" "read-line")
@deffn {Scheme Procedure} read-line [port] [handle-delim]
Return a line of text from @var{port} if specified, otherwise from the
value returned by @code{(current-input-port)}. Under Unix, a line of text
is terminated by the first end-of-line character or by end-of-file.

If @var{handle-delim} is specified, it should be one of the following
symbols:
@table @code
@item trim
Discard the terminating delimiter. This is the default, but it will
be impossible to tell whether the read terminated with a delimiter or
end-of-file.
@item concat
Append the terminating delimiter (if any) to the returned string.
@item peek
Push the terminating delimiter (if any) back on to the port.
@item split
Return a pair containing the string read from the port and the
terminating delimiter or end-of-file object.
@end table

Like @code{read-char}, this procedure can throw to @code{decoding-error}
(@pxref{Reading, @code{read-char}}).
@end deffn
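
The effect of the different @var{handle-delim} values can be sketched
on a three-line string port. The name @code{p} and the input text are
illustrative.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "one\ntwo\nthree"))

(read-line p)                  ; => "one"
(read-line p 'concat)          ; => "two\n"
(read-line p 'split)           ; => ("three" . #<eof>)
(eof-object? (read-line p))    ; => #t
@end lisp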

@c begin (scm-doc-string "rdelim.scm" "read-line!")
@deffn {Scheme Procedure} read-line! buf [port]
Read a line of text into the supplied string @var{buf} and return the
number of characters added to @var{buf}. If @var{buf} is filled, then
@code{#f} is returned.
Read from @var{port} if
specified, otherwise from the value returned by @code{(current-input-port)}.
@end deffn

@c begin (scm-doc-string "rdelim.scm" "read-delimited")
@deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
Read text until one of the characters in the string @var{delims} is found
or end-of-file is reached. Read from @var{port} if supplied, otherwise
from the value returned by @code{(current-input-port)}.
@var{handle-delim} takes the same values as described for @code{read-line}.
@end deffn
559@c begin (scm-doc-string "rdelim.scm" "read-delimited!")
560@deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
e7fb779f
AW
561Read text into the supplied string @var{buf}.
562
563If a delimiter was found, return the number of characters written,
564except if @var{handle-delim} is @code{split}, in which case the return
565value is a pair, as noted above.
566
567As a special case, if @var{port} was already at end-of-stream, the EOF
568object is returned. Also, if no characters were written because the
569buffer was full, @code{#f} is returned.
570
571It's something of a wacky interface, to be honest.
07d83abe
MV
572@end deffn
573
574@deffn {Scheme Procedure} write-line obj [port]
575@deffnx {C Function} scm_write_line (obj, port)
576Display @var{obj} and a newline character to @var{port}. If
577@var{port} is not specified, @code{(current-output-port)} is
578used. This function is equivalent to:
579@lisp
580(display obj [port])
581(newline [port])
582@end lisp
583@end deffn
584

In the past, Guile did not have a procedure that would just read out all
of the characters from a port. As a workaround, many people just called
@code{read-delimited} with no delimiters, knowing that would produce the
behavior they wanted. This prompted Guile developers to add some
routines that would read all characters from a port. So it is that
@code{(ice-9 rdelim)} is also the home for procedures that can read
undelimited text:

@deffn {Scheme Procedure} read-string [port] [count]
Read all of the characters out of @var{port} and return them as a
string. If the @var{count} is present, treat it as a limit to the
number of characters to read.

By default, read from the current input port, with no size limit on the
result. This procedure always returns a string, even if no characters
were read.
@end deffn

@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
Fill @var{buf} with characters read from @var{port}, defaulting to the
current input port. Return the number of characters read.

If @var{start} or @var{end} are specified, store data only into the
substring of @var{buf} bounded by @var{start} and @var{end} (which
default to the beginning and end of the string, respectively).
@end deffn

Some of the aforementioned I/O functions rely on the following C
primitives. These will mainly be of interest to people hacking Guile
internals.

@deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
@deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
Read characters from @var{port} into @var{str} until one of the
characters in the @var{delims} string is encountered. If
@var{gobble} is true, discard the delimiter character;
otherwise, leave it in the input stream for the next read. If
@var{port} is not specified, use the value of
@code{(current-input-port)}. If @var{start} or @var{end} are
specified, store data only into the substring of @var{str}
bounded by @var{start} and @var{end} (which default to the
beginning and end of the string, respectively).

Return a pair consisting of the delimiter that terminated the
string and the number of characters read. If reading stopped
at the end of file, the delimiter returned is the
@var{eof-object}; if the string was filled without encountering
a delimiter, this value is @code{#f}.
@end deffn

@deffn {Scheme Procedure} %read-line [port]
@deffnx {C Function} scm_read_line (port)
Read a newline-terminated line from @var{port}, allocating storage as
necessary. The newline terminator (if any) is removed from the string,
and a pair consisting of the line and its delimiter is returned. The
delimiter may be either a newline or the @var{eof-object}; if
@code{%read-line} is called at the end of file, it returns the pair
@code{(#<eof> . #<eof>)}.
@end deffn
645@node Block Reading and Writing
646@subsection Block reading and writing
bf5df489
KR
647@cindex Block read/write
648@cindex Port, block read/write
07d83abe
MV
649
650The Block-string-I/O module can be accessed with:
651
aba0dff5 652@lisp
07d83abe 653(use-modules (ice-9 rw))
aba0dff5 654@end lisp
07d83abe
MV
655
656It currently contains procedures that help to implement the
657@code{(scsh rw)} module in guile-scsh.
658
659@deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
660@deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
661Read characters from a port or file descriptor into a
662string @var{str}. A port must have an underlying file
663descriptor --- a so-called fport. This procedure is
664scsh-compatible and can efficiently read large strings.
665It will:
666
667@itemize
668@item
669attempt to fill the entire string, unless the @var{start}
670and/or @var{end} arguments are supplied. i.e., @var{start}
671defaults to 0 and @var{end} defaults to
672@code{(string-length str)}
673@item
674use the current input port if @var{port_or_fdes} is not
675supplied.
676@item
677return fewer than the requested number of characters in some
678cases, e.g., on end of file, if interrupted by a signal, or if
679not all the characters are immediately available.
680@item
681wait indefinitely for some input if no characters are
682currently available,
683unless the port is in non-blocking mode.
684@item
685read characters from the port's input buffers if available,
686instead from the underlying file descriptor.
687@item
688return @code{#f} if end-of-file is encountered before reading
689any characters, otherwise return the number of characters
690read.
691@item
692return 0 if the port is in non-blocking mode and no characters
693are immediately available.
694@item
695return 0 if the request is for 0 bytes, with no
696end-of-file check.
697@end itemize
698@end deffn
699
@deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
@deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
Write characters from a string @var{str} to a port or file
descriptor. A port must have an underlying file descriptor
--- a so-called fport. This procedure is
scsh-compatible and can efficiently write large strings.
It will:

@itemize
@item
attempt to write the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
@item
use the current output port if @var{port_or_fdes} is not
supplied.
@item
in the case of a buffered port, store the characters in the
port's output buffer, if all will fit. If they will not fit
then any existing buffered characters will be flushed
before attempting
to write the new characters directly to the underlying file
descriptor. If the port is in non-blocking mode and
buffered characters cannot be flushed immediately, then an
@code{EAGAIN} system-error exception will be raised. (Note:
scsh does not support the use of non-blocking buffered ports.)
@item
write fewer than the requested number of
characters in some cases, e.g., if interrupted by a signal or
if not all of the output can be accepted immediately.
@item
wait indefinitely for at least one character
from @var{str} to be accepted by the port, unless the port is
in non-blocking mode.
@item
return the number of characters accepted by the port.
@item
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
@item
return 0 immediately if the request size is 0 bytes.
@end itemize
@end deffn

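Because @code{write-string/partial} may accept fewer characters than
requested, a caller that needs the whole string written usually has to
loop. The following is a minimal sketch, assuming a blocking port and
assuming the procedure is imported from the @code{(ice-9 rw)} module.
The helper name @code{write-string/all} is hypothetical and not part of
Guile. On a non-blocking port this loop would spin while the procedure
returns 0, so it is not suitable there as written.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: keep calling write-string/partial until
;; every character of STR has been accepted by PORT.
(define (write-string/all str port)
  (let loop ((start 0))
    (when (< start (string-length str))
      ;; Each call returns the number of characters accepted.
      (loop (+ start (write-string/partial str port start))))))

(write-string/all "hello, world\n" (current-output-port))
@end lisp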
@node Default Ports
@subsection Default Ports for Input, Output and Errors
@cindex Default ports
@cindex Port, default

@rnindex current-input-port
@deffn {Scheme Procedure} current-input-port
@deffnx {C Function} scm_current_input_port ()
@cindex standard input
Return the current input port. This is the default port used
by many input procedures.

Initially this is the @dfn{standard input} in Unix and C terminology.
When the standard input is a tty the port is unbuffered, otherwise
it's fully buffered.

Unbuffered input is good if an application runs an interactive
subprocess, since any type-ahead input won't go into Guile's buffer
and be unavailable to the subprocess.

Note that Guile buffering is completely separate from the tty ``line
discipline''. In the usual cooked mode on a tty Guile only sees a
line of input once the user presses @key{Return}.
@end deffn

@rnindex current-output-port
@deffn {Scheme Procedure} current-output-port
@deffnx {C Function} scm_current_output_port ()
@cindex standard output
Return the current output port. This is the default port used
by many output procedures.

Initially this is the @dfn{standard output} in Unix and C terminology.
When the standard output is a tty this port is unbuffered, otherwise
it's fully buffered.

Unbuffered output to a tty is good for ensuring progress output or a
prompt is seen. But an application which always prints whole lines
could change to line buffered, or an application with a lot of output
could go fully buffered and perhaps make explicit @code{force-output}
calls (@pxref{Writing}) at selected points.
@end deffn

@deffn {Scheme Procedure} current-error-port
@deffnx {C Function} scm_current_error_port ()
@cindex standard error output
Return the port to which errors and warnings should be sent.

Initially this is the @dfn{standard error} in Unix and C terminology.
When the standard error is a tty this port is unbuffered, otherwise
it's fully buffered.
@end deffn

@deffn {Scheme Procedure} set-current-input-port port
@deffnx {Scheme Procedure} set-current-output-port port
@deffnx {Scheme Procedure} set-current-error-port port
@deffnx {C Function} scm_set_current_input_port (port)
@deffnx {C Function} scm_set_current_output_port (port)
@deffnx {C Function} scm_set_current_error_port (port)
Change the ports returned by @code{current-input-port},
@code{current-output-port} and @code{current-error-port}, respectively,
so that they use the supplied @var{port} for input or output.
@end deffn

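As an illustration of the procedures above, the following sketch
temporarily redirects the current output port to a string port and then
restores it by hand. The variable names are chosen for this example
only. Procedures such as @code{with-output-to-string} perform the
same save and restore automatically and are usually preferable.

@lisp
;; Remember the original port so it can be restored afterwards.
(define saved-port (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "captured")                  ; goes to the string port
(set-current-output-port saved-port)  ; restore the original port

(define captured-text (get-output-string capture))
;; captured-text @result{} "captured"
@end lisp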
@deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
@deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
@deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
These functions must be used inside a pair of calls to
@code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
Wind}). During the dynwind context, the indicated port is set to
@var{port}.

More precisely, the current port is swapped with a `backup' value
whenever the dynwind context is entered or left. The backup value is
initialized with the @var{port} argument.
@end deftypefn

@node Port Types
@subsection Types of Port
@cindex Types of ports
@cindex Port, types

Guile provides several types of port. The following sections describe
each type and how to create ports of that type.

@menu
* File Ports::    Ports on an operating system file.
* String Ports::  Ports on a Scheme string.
* Soft Ports::    Ports on arbitrary Scheme procedures.
* Void Ports::    Ports on nothing at all.
@end menu


@node File Ports
@subsubsection File Ports
@cindex File port
@cindex Port, file

The following procedures are used to open file ports.
See also @ref{Ports and File Descriptors, open}, for an interface
to the Unix @code{open} system call.

Most systems have limits on how many files can be open, so it's
strongly recommended that file ports be closed explicitly when no
longer required (@pxref{Ports}).

@deffn {Scheme Procedure} open-file filename mode @
  [#:guess-encoding=#f] [#:encoding=#f]
@deffnx {C Function} scm_open_file_with_encoding @
  (filename, mode, guess_encoding, encoding)
@deffnx {C Function} scm_open_file (filename, mode)
Open the file whose name is @var{filename}, and return a port
representing that file. The attributes of the port are
determined by the @var{mode} string. The way in which this is
interpreted is similar to C stdio. The first character must be
one of the following:

@table @samp
@item r
Open an existing file for input.
@item w
Open a file for output, creating it if it doesn't already exist
or removing its contents if it does.
@item a
Open a file for output, creating it if it doesn't already
exist. All writes to the port will go to the end of the file.
The "append mode" can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
@end table

The following additional characters can be appended:

@table @samp
@item +
Open the port for both input and output. E.g., @code{r+}: open
an existing file for both input and output.
@item 0
Create an "unbuffered" port. In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering. This is likely to
slow down I/O operations. The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
@item l
Add line-buffering to the port. The port output buffer will be
automatically flushed whenever a newline character is written.
@item b
Use binary mode, ensuring that each byte in the file will be read as one
Scheme character.

To provide this property, the file will be opened with the 8-bit
character encoding "ISO-8859-1", ignoring the default port encoding.
@xref{Ports}, for more information on port encodings.

Note that while it is possible to read and write binary data as
characters or strings, it is usually better to treat bytes as octets,
and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
@ref{R6RS Binary Output}, for more.

This option had another historical meaning, for DOS compatibility: in
the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
The @code{b} flag prevents this from happening, adding @code{O_BINARY}
to the underlying @code{open} call. Still, the flag is generally useful
because of its port encoding ramifications.
@end table

Unless binary mode is requested, the character encoding of the new port
is determined as follows: First, if @var{guess-encoding} is true, the
@code{file-encoding} procedure is used to guess the encoding of the file
(@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
is false or if @code{file-encoding} fails, @var{encoding} is used unless
it is also false. As a last resort, the default port encoding is used.
@xref{Ports}, for more information on port encodings. It is an error to
pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
is requested.

If a file cannot be opened with the access requested, @code{open-file}
throws an exception.

When the file is opened, its encoding is set to the current
@code{%default-port-encoding}, unless the @code{b} flag was supplied.
Sometimes it is desirable to honor Emacs-style coding declarations in
files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This
behavior was deemed inappropriate and disabled starting from Guile
2.0.8.}. When that is the case, the @code{file-encoding} procedure can
be used as follows (@pxref{Character Encoding of Source Files,
@code{file-encoding}}):

@example
(let* ((port (open-input-file file))
       (encoding (file-encoding port)))
  (set-port-encoding! port (or encoding (port-encoding port))))
@end example

In theory we could create read/write ports which were buffered
in one direction only. However this isn't included in the
current interfaces.
@end deffn

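The mode characters described for @code{open-file} can be tried out
with a short sketch such as the following. The scratch file name is
hypothetical, and the @code{#:encoding} keyword is assumed to be
available in the installed Guile, as in the signature shown above.
@code{read-line} comes from the @code{(ice-9 rdelim)} module.

@lisp
(use-modules (ice-9 rdelim))            ; for read-line

;; Hypothetical scratch file used only for this example.
(define example-file "/tmp/guile-open-file-example.txt")

;; "w" creates the file, or removes its contents if it exists.
(let ((out (open-file example-file "w" #:encoding "UTF-8")))
  (display "first line\n" out)
  (close-port out))

;; "a" appends, so the existing contents are kept.
(let ((out (open-file example-file "a" #:encoding "UTF-8")))
  (display "second line\n" out)
  (close-port out))

;; "r" opens the existing file for input.
(define first-line
  (let* ((in (open-file example-file "r" #:encoding "UTF-8"))
         (line (read-line in)))
    (close-port in)
    line))
;; first-line @result{} "first line"
@end lisp

Each port is closed explicitly, in line with the recommendation not to
rely on garbage collection to release file ports.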
@rnindex open-input-file
@deffn {Scheme Procedure} open-input-file filename @
  [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]

Open @var{filename} for input. If @var{binary} is true, open the port
in binary mode, otherwise use text mode. @var{encoding} and
@var{guess-encoding} determine the character encoding as described above
for @code{open-file}. Equivalent to
@lisp
(open-file @var{filename}
           (if @var{binary} "rb" "r")
           #:guess-encoding @var{guess-encoding}
           #:encoding @var{encoding})
@end lisp
@end deffn

@rnindex open-output-file
@deffn {Scheme Procedure} open-output-file filename @
  [#:encoding=#f] [#:binary=#f]

Open @var{filename} for output. If @var{binary} is true, open the port
in binary mode, otherwise use text mode. @var{encoding} specifies the
character encoding as described above for @code{open-file}. Equivalent
to
@lisp
(open-file @var{filename}
           (if @var{binary} "wb" "w")
           #:encoding @var{encoding})
@end lisp
@end deffn

@deffn {Scheme Procedure} call-with-input-file filename proc @
  [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} call-with-output-file filename proc @
  [#:encoding=#f] [#:binary=#f]
@rnindex call-with-input-file
@rnindex call-with-output-file
Open @var{filename} for input or output, and call @code{(@var{proc}
port)} with the resulting port. Return the value returned by
@var{proc}. @var{filename} is opened as per @code{open-input-file} or
@code{open-output-file} respectively, and an error is signaled if it
cannot be opened.

When @var{proc} returns, the port is closed. If @var{proc} does not
return (e.g.@: if it throws an error), then the port might not be
closed automatically, though it will be garbage collected in the usual
way if not otherwise referenced.
@end deffn

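A small round trip shows how @code{call-with-output-file} and
@code{call-with-input-file} pass the port to @var{proc} and return its
value. This is only a sketch; the file name is hypothetical.

@lisp
(define data-file "/tmp/guile-call-with-file-example.scm")

;; The output port is closed as soon as the lambda returns.
(call-with-output-file data-file
  (lambda (port)
    (write '(1 2 3) port)))

;; `read' accepts a port argument, so it can serve as PROC directly.
(define datum (call-with-input-file data-file read))
;; datum @result{} (1 2 3)
@end lisp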
@deffn {Scheme Procedure} with-input-from-file filename thunk @
  [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} with-output-to-file filename thunk @
  [#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} with-error-to-file filename thunk @
  [#:encoding=#f] [#:binary=#f]
@rnindex with-input-from-file
@rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as respectively the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}. Return the
value returned by @var{thunk}. @var{filename} is opened as per
@code{open-input-file} or @code{open-output-file} respectively, and an
error is signaled if it cannot be opened.

When @var{thunk} returns, the port is closed and the previous setting
of the respective current port is restored.

The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: via
an exception), and if @var{thunk} is re-entered (via a captured
continuation) then it's set again to the @var{filename} port.

The port is closed when @var{thunk} returns normally, but not when
exited via an exception or new continuation. This ensures it's still
ready for use if @var{thunk} is re-entered by a captured continuation.
Of course the port is always garbage collected and closed in the usual
way when no longer referenced anywhere.
@end deffn

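The difference from the @code{call-with-@dots{}} procedures is that the
thunk receives no port argument; it simply uses the current ports. A
minimal sketch, with a hypothetical file name:

@lisp
(define number-file "/tmp/guile-with-file-example.txt")

;; Inside the thunk, `display' writes to the file by default.
(with-output-to-file number-file
  (lambda ()
    (display 42)))

;; `read' called with no arguments reads from the current input
;; port, which is the file for the duration of the call.
(define answer (with-input-from-file number-file read))
;; answer @result{} 42
@end lisp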
@deffn {Scheme Procedure} port-mode port
@deffnx {C Function} scm_port_mode (port)
Return the port modes associated with the open port @var{port}.
These will not necessarily be identical to the modes used when
the port was opened, since modes such as "append" which are
used only during port creation are not retained.
@end deffn

@deffn {Scheme Procedure} port-filename port
@deffnx {C Function} scm_port_filename (port)
Return the filename associated with @var{port}, or @code{#f} if no
filename is associated with the port.

@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
@end deffn

@deffn {Scheme Procedure} set-port-filename! port filename
@deffnx {C Function} scm_set_port_filename_x (port, filename)
Change the filename associated with @var{port}, using the current input
port if none is specified. Note that this does not change the port's
source of data, but only the value that is returned by
@code{port-filename} and reported in diagnostic output.
@end deffn

@deffn {Scheme Procedure} file-port? obj
@deffnx {C Function} scm_file_port_p (obj)
Determine whether @var{obj} is a port that is related to a file.
@end deffn


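The query procedures above can be exercised as in the following sketch.
The file name and variable names are hypothetical. Note that
@code{port-filename} is called while the port is still open.

@lisp
(define probe-file "/tmp/guile-port-query-example.txt")
(call-with-output-file probe-file
  (lambda (port) (display "x" port)))

(define in (open-input-file probe-file))

(define in-is-file-port (file-port? in))
(define in-name (port-filename in))     ; query before closing
(define string-port-is-file-port
  (file-port? (open-input-string "x")))
(close-port in)
;; in-is-file-port @result{} #t
;; in-name @result{} "/tmp/guile-port-query-example.txt"
;; string-port-is-file-port @result{} #f
@end lisp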
@node String Ports
@subsubsection String Ports
@cindex String port
@cindex Port, string

The following allow string ports to be opened by analogy to R4RS
file port facilities.

With string ports, the port encoding is treated differently than for
other types of ports. When string ports are created, they do not
inherit a character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
Typically one should not modify a string port's character encoding
away from its default.

@deffn {Scheme Procedure} call-with-output-string proc
@deffnx {C Function} scm_call_with_output_string (proc)
Calls the one-argument procedure @var{proc} with a newly created output
port. When the function returns, the string composed of the characters
written into the port is returned. @var{proc} should not close the port.
@end deffn

@deffn {Scheme Procedure} call-with-input-string string proc
@deffnx {C Function} scm_call_with_input_string (string, proc)
Calls the one-argument procedure @var{proc} with a newly
created input port from which @var{string}'s contents may be
read. The value yielded by the @var{proc} is returned.
@end deffn

@deffn {Scheme Procedure} with-output-to-string thunk
Calls the zero-argument procedure @var{thunk} with the current output
port set temporarily to a new string port. It returns a string
composed of the characters written to the current output.
@end deffn

@deffn {Scheme Procedure} with-input-from-string string thunk
Calls the zero-argument procedure @var{thunk} with the current input
port set temporarily to a string port opened on the specified
@var{string}. The value yielded by @var{thunk} is returned.
@end deffn

@deffn {Scheme Procedure} open-input-string str
@deffnx {C Function} scm_open_input_string (str)
Take a string and return an input port that delivers characters
from the string. The port can be closed by
@code{close-input-port}, though its storage will be reclaimed
by the garbage collector if it becomes inaccessible.
@end deffn

@deffn {Scheme Procedure} open-output-string
@deffnx {C Function} scm_open_output_string ()
Return an output port that will accumulate characters for
retrieval by @code{get-output-string}. The port can be closed
by the procedure @code{close-output-port}, though its storage
will be reclaimed by the garbage collector if it becomes
inaccessible.
@end deffn

@deffn {Scheme Procedure} get-output-string port
@deffnx {C Function} scm_get_output_string (port)
Given an output port created by @code{open-output-string},
return a string consisting of the characters that have been
output to the port so far.

@code{get-output-string} must be used before closing @var{port}; once
closed the string cannot be obtained.
@end deffn

A string port can be used in many procedures which accept a port
but which are not dependent on implementation details of fports.
E.g., seeking and truncating will work on a string port,
but trying to extract the file descriptor number will fail.


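The string port procedures described in this section can be compared
side by side in a short sketch. The variable names are chosen for
illustration only.

@lisp
;; Collect everything PROC writes to the port into a string.
(define written
  (call-with-output-string
    (lambda (port)
      (write 'foo port)
      (display " bar" port))))
;; written @result{} "foo bar"

;; Redirect the current output port for the duration of the thunk.
(define shown (with-output-to-string (lambda () (display 42))))
;; shown @result{} "42"

;; Parse a datum out of a string.
(define parsed (call-with-input-string "(a b c)" read))
;; parsed @result{} (a b c)

;; Explicit open-output-string / get-output-string pair.
(define out (open-output-string))
(display "abc" out)
(define so-far (get-output-string out))  ; must precede close-port
(close-port out)
;; so-far @result{} "abc"
@end lisp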
@node Soft Ports
@subsubsection Soft Ports
@cindex Soft port
@cindex Port, soft

A @dfn{soft port} is a port based on a vector of procedures capable of
accepting or delivering characters. It allows emulation of I/O ports.

@deffn {Scheme Procedure} make-soft-port pv modes
@deffnx {C Function} scm_make_soft_port (pv, modes)
Return a port capable of receiving or delivering characters as
specified by the @var{modes} string (@pxref{File Ports,
open-file}). @var{pv} must be a vector of length 5 or 6. Its
components are as follows:

@enumerate 0
@item
procedure accepting one character for output
@item
procedure accepting a string for output
@item
thunk for flushing output
@item
thunk for getting one character
@item
thunk for closing port (not by garbage collection)
@item
(if present and not @code{#f}) thunk for computing the number of
characters that can be read from the port without blocking.
@end enumerate

For an output-only port only elements 0, 1, 2, and 4 need be
procedures. For an input-only port only elements 3 and 4 need
be procedures. Thunks 2 and 4 can instead be @code{#f} if
there is no useful operation for them to perform.

If thunk 3 returns @code{#f} or an @code{eof-object}
(@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
Scheme}) it indicates that the port has reached end-of-file.
For example:

@lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))

(write p p) @result{} #<input-output: soft 8081e20>
@end lisp
@end deffn


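As a further illustration, an output-only soft port can collect
whatever is written to it. The following is a sketch only; the names
@code{chunks}, @code{collector} and @code{collected} are hypothetical.
The number and size of the chunks delivered to the procedures is not
relied upon, only their concatenation.

@lisp
;; Chunks of output are collected here, most recent first.
(define chunks '())

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! chunks (cons (string c) chunks))) ; one character
    (lambda (s) (set! chunks (cons s chunks)))          ; a string
    (lambda () #t)                                      ; flush: no-op
    #f                                                  ; no input
    (lambda () #t))                                     ; close: no-op
   "w"))

(display "abc" collector)
(write-char #\d collector)
(force-output collector)

(define collected (apply string-append (reverse chunks)))
;; collected @result{} "abcd"
@end lisp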
@node Void Ports
@subsubsection Void Ports
@cindex Void port
@cindex Port, void

This kind of port causes any data to be discarded when written to, and
always returns the end-of-file object when read from.

@deffn {Scheme Procedure} %make-void-port mode
@deffnx {C Function} scm_sys_make_void_port (mode)
Create and return a new void port. A void port acts like
@file{/dev/null}. The @var{mode} argument
specifies the input/output modes for this port: see the
documentation for @code{open-file} in @ref{File Ports}.
@end deffn


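A brief sketch of the behavior described above, using hypothetical
variable names:

@lisp
(define sink (%make-void-port "rw"))

;; Output is accepted and thrown away.
(display "this text is discarded" sink)

;; Input immediately yields the end-of-file object.
(define at-eof (eof-object? (read-char sink)))
;; at-eof @result{} #t
@end lisp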
@node R6RS I/O Ports
@subsection R6RS I/O Ports

@cindex R6RS
@cindex R6RS ports

The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
io ports)} module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
presented above (@pxref{Input and Output}). Note that R6RS ports are not
disjoint from Guile's native ports, so Guile-specific procedures will
work on ports created using the R6RS API, and vice versa.

The text in this section is taken from the R6RS standard libraries
document, with only minor adaptations for inclusion in this manual. The
Guile developers offer their thanks to the R6RS editors for having
provided the report's text under permissive conditions making this
possible.

@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet.

@menu
* R6RS File Names::         File names.
* R6RS File Options::       Options for opening files.
* R6RS Buffer Modes::       Influencing buffering behavior.
* R6RS Transcoders::        Influencing port encoding.
* R6RS End-of-File::        The end-of-file object.
* R6RS Port Manipulation::  Manipulating R6RS ports.
* R6RS Input Ports::        Input Ports.
* R6RS Binary Input::       Binary input.
* R6RS Textual Input::      Textual input.
* R6RS Output Ports::       Output Ports.
* R6RS Binary Output::      Binary output.
* R6RS Textual Output::     Textual output.
@end menu

A subset of the @code{(rnrs io ports)} module, plus one non-standard
procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
provided by the @code{(ice-9 binary-ports)} module. It contains binary
input/output procedures and does not rely on R6RS support.

@node R6RS File Names
@subsubsection File Names

Some of the procedures described in this chapter accept a file name as an
argument. Valid values for such a file name include strings that name a file
using the native notation of file system paths on an implementation's
underlying operating system, and may include implementation-dependent
values as well.

A @var{filename} parameter name means that the
corresponding argument must be a file name.

@node R6RS File Options
@subsubsection File Options
@cindex file options

When opening a file, the various procedures in this library accept a
@code{file-options} object that encapsulates flags to specify how the
file is to be opened. A @code{file-options} object is an enum-set
(@pxref{rnrs enums}) over the symbols constituting valid file options.

A @var{file-options} parameter name means that the corresponding
argument must be a file-options object.

@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...

Each @var{file-options-symbol} must be a symbol.

The @code{file-options} syntax returns a file-options object that
encapsulates the specified options.

When supplied to an operation that opens a file for output, the
file-options object returned by @code{(file-options)} specifies that the
file is created if it does not exist and an exception with condition
type @code{&i/o-file-already-exists} is raised if it does exist. The
following standard options can be included to modify the default
behavior.

@table @code
@item no-create
  If the file does not already exist, it is not created;
  instead, an exception with condition type @code{&i/o-file-does-not-exist}
  is raised.
  If the file already exists, the exception with condition type
  @code{&i/o-file-already-exists} is not raised
  and the file is truncated to zero length.
@item no-fail
  If the file already exists, the exception with condition type
  @code{&i/o-file-already-exists} is not raised,
  even if @code{no-create} is not included,
  and the file is truncated to zero length.
@item no-truncate
  If the file already exists and the exception with condition type
  @code{&i/o-file-already-exists} has been inhibited by inclusion of
  @code{no-create} or @code{no-fail}, the file is not truncated, but
  the port's current position is still set to the beginning of the
  file.
@end table

These options have no effect when a file is opened only for input.
Symbols other than those listed above may be used as
@var{file-options-symbol}s; they have implementation-specific meaning,
if any.

@quotation Note
  Only the name of @var{file-options-symbol} is significant.
@end quotation
@end deffn

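The following sketch shows a file-options object being passed to an
output-opening procedure. It assumes that the installed Guile
implements the R6RS procedures @code{open-file-output-port},
@code{open-file-input-port}, @code{put-u8} and @code{get-u8} from
@code{(rnrs io ports)}; since the R6RS API is noted above as incomplete,
this may not hold for every version. The file name is hypothetical.

@lisp
(use-modules (rnrs io ports))

;; Create the file if needed, and truncate it if it already exists
;; instead of raising &i/o-file-already-exists.
(define overwrite-options (file-options no-fail))

(define bytes-file "/tmp/guile-r6rs-file-options-example.bin")

(let ((port (open-file-output-port bytes-file overwrite-options)))
  (put-u8 port 65)
  (close-port port))

(define first-byte
  (let* ((port (open-file-input-port bytes-file))
         (byte (get-u8 port)))
    (close-port port)
    byte))
;; first-byte @result{} 65
@end lisp

Because @code{no-fail} is included, running the example a second time
overwrites the file rather than raising an exception.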
@node R6RS Buffer Modes
@subsubsection Buffer Modes

Each port has an associated buffer mode. For an output port, the
buffer mode defines when an output operation flushes the buffer
associated with the output port. For an input port, the buffer mode
defines how much data will be read to satisfy read operations. The
possible buffer modes are the symbols @code{none} for no buffering,
@code{line} for flushing upon line endings and reading up to line
endings, or other implementation-dependent behavior,
and @code{block} for arbitrary buffering. This section uses
the parameter name @var{buffer-mode} for arguments that must be
buffer-mode symbols.

If two ports are connected to the same mutable source, both ports
are unbuffered, and reading a byte or character from that shared
source via one of the two ports would change the bytes or characters
seen via the other port, a lookahead operation on one port will
render the peeked byte or character inaccessible via the other port,
while a subsequent read operation on the peeked port will see the
peeked byte or character even though the port is otherwise unbuffered.

In other words, the semantics of buffering is defined in terms of side
effects on shared mutable sources, and a lookahead operation has the
same side effect on the shared source as a read operation.

@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}

@var{buffer-mode-symbol} must be a symbol whose name is one of
@code{none}, @code{line}, and @code{block}. The result is the
corresponding symbol, and specifies the associated buffer mode.

@quotation Note
  Only the name of @var{buffer-mode-symbol} is significant.
@end quotation
@end deffn

@deffn {Scheme Procedure} buffer-mode? obj
Returns @code{#t} if the argument is a valid buffer-mode symbol, and
returns @code{#f} otherwise.
@end deffn

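A short sketch of the @code{buffer-mode} syntax and the
@code{buffer-mode?} predicate, following the descriptions above. The
variable names are hypothetical.

@lisp
(use-modules (rnrs io ports))

;; The syntax evaluates to the corresponding symbol.
(define mode (buffer-mode line))
;; mode @result{} line

(define block-is-mode (buffer-mode? 'block))
;; block-is-mode @result{} #t

(define bogus-is-mode (buffer-mode? 'bogus))
;; bogus-is-mode @result{} #f
@end lisp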
@node R6RS Transcoders
@subsubsection Transcoders
@cindex codec
@cindex end-of-line style
@cindex transcoder
@cindex binary port
@cindex textual port

Several different Unicode encoding schemes describe standard ways to
encode characters and strings as byte sequences and to decode those
sequences. Within this document, a @dfn{codec} is an immutable Scheme
object that represents a Unicode or similar encoding scheme.

An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
describes how a textual port transcodes representations of line endings.

A @dfn{transcoder} is an immutable Scheme object that combines a codec
with an end-of-line style and a method for handling decoding errors.
Each transcoder represents some specific bidirectional (but not
necessarily lossless), possibly stateful translation between byte
sequences and Unicode characters and strings. Every transcoder can
operate in the input direction (bytes to characters) or in the output
direction (characters to bytes). A @var{transcoder} parameter name
means that the corresponding argument must be a transcoder.

A @dfn{binary port} is a port that supports binary I/O, does not have an
associated transcoder and does not support textual I/O. A @dfn{textual
port} is a port that supports textual I/O, and does not support binary
I/O. A textual port may or may not have an associated transcoder.

@deffn {Scheme Procedure} latin-1-codec
@deffnx {Scheme Procedure} utf-8-codec
@deffnx {Scheme Procedure} utf-16-codec

These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
encoding schemes.

A call to any of these procedures returns a value that is equal in the
sense of @code{eqv?} to the result of any other call to the same
procedure.
@end deffn

@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}

@var{eol-style-symbol} should be a symbol whose name is one of
@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
and @code{none}.

The form evaluates to the corresponding symbol. If the name of
@var{eol-style-symbol} is not one of these symbols, the effect and
result are implementation-dependent; in particular, the result may be an
eol-style symbol acceptable as an @var{eol-style} argument to
@code{make-transcoder}. Otherwise, an exception is raised.

All eol-style symbols except @code{none} describe a specific
line-ending encoding:

@table @code
@item lf
linefeed
@item cr
carriage return
@item crlf
carriage return, linefeed
@item nel
next line
@item crnel
carriage return, next line
@item ls
line separator
@end table

For a textual port with a transcoder, and whose transcoder has an
eol-style symbol @code{none}, no conversion occurs. For a textual input
port, any eol-style symbol other than @code{none} means that all of the
above line-ending encodings are recognized and are translated into a
single linefeed. For a textual output port, @code{none} and @code{lf}
are equivalent. Linefeed characters are encoded according to the
specified eol-style symbol, and all other characters that participate in
possible line endings are encoded as is.

@quotation Note
  Only the name of @var{eol-style-symbol} is significant.
@end quotation
@end deffn

@deffn {Scheme Procedure} native-eol-style
Returns the default end-of-line style of the underlying platform, e.g.,
@code{lf} on Unix and @code{crlf} on Windows.
@end deffn

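The pieces described so far (codecs and end-of-line styles) are combined
by @code{make-transcoder}. The following sketch assumes the R6RS
three-argument form of @code{make-transcoder} (codec, eol-style,
handling-mode) is available from @code{(rnrs io ports)}; the variable
names are hypothetical.

@lisp
(use-modules (rnrs io ports))

;; Repeated calls to a codec procedure yield eqv? values.
(define same-codec (eqv? (utf-8-codec) (utf-8-codec)))
;; same-codec @result{} #t

;; The eol-style syntax evaluates to the corresponding symbol.
(define style (eol-style crlf))
;; style @result{} crlf

;; Combine a codec, an end-of-line style and an error-handling mode.
(define utf8-crlf
  (make-transcoder (utf-8-codec)
                   (eol-style crlf)
                   (error-handling-mode replace)))
@end lisp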
1446@deffn {Condition Type} &i/o-decoding
1447@deffnx {Scheme Procedure} make-i/o-decoding-error port
1448@deffnx {Scheme Procedure} i/o-decoding-error? obj
1449
1450This condition type could be defined by
1451
1452@lisp
1453(define-condition-type &i/o-decoding &i/o-port
1454 make-i/o-decoding-error i/o-decoding-error?)
1455@end lisp
1456
1457An exception with this type is raised when one of the operations for
1458textual input from a port encounters a sequence of bytes that cannot be
1459translated into a character or string by the input direction of the
1460port's transcoder.
1461
1462When such an exception is raised, the port's position is past the
1463invalid encoding.
1464@end deffn
1465
1466@deffn {Condition Type} &i/o-encoding
1467@deffnx {Scheme Procedure} make-i/o-encoding-error port char
1468@deffnx {Scheme Procedure} i/o-encoding-error? obj
1469@deffnx {Scheme Procedure} i/o-encoding-error-char condition
1470
1471This condition type could be defined by
1472
1473@lisp
1474(define-condition-type &i/o-encoding &i/o-port
1475 make-i/o-encoding-error i/o-encoding-error?
1476 (char i/o-encoding-error-char))
1477@end lisp
1478
1479An exception with this type is raised when one of the operations for
1480textual output to a port encounters a character that cannot be
1481translated into bytes by the output direction of the port's transcoder.
@var{char} is the character that could not be encoded.
1483@end deffn
1484
1485@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
1486
1487@var{error-handling-mode-symbol} should be a symbol whose name is one of
1488@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
1489the corresponding symbol. If @var{error-handling-mode-symbol} is not
1490one of these identifiers, effect and result are
1491implementation-dependent: The result may be an error-handling-mode
1492symbol acceptable as a @var{handling-mode} argument to
1493@code{make-transcoder}. If it is not acceptable as a
1494@var{handling-mode} argument to @code{make-transcoder}, an exception is
1495raised.
1496
1497@quotation Note
 Only the name of @var{error-handling-mode-symbol} is significant.
1499@end quotation
1500
1501The error-handling mode of a transcoder specifies the behavior
1502of textual I/O operations in the presence of encoding or decoding
1503errors.
1504
1505If a textual input operation encounters an invalid or incomplete
1506character encoding, and the error-handling mode is @code{ignore}, an
1507appropriate number of bytes of the invalid encoding are ignored and
1508decoding continues with the following bytes.
1509
1510If the error-handling mode is @code{replace}, the replacement
1511character U+FFFD is injected into the data stream, an appropriate
1512number of bytes are ignored, and decoding
1513continues with the following bytes.
1514
1515If the error-handling mode is @code{raise}, an exception with condition
1516type @code{&i/o-decoding} is raised.
1517
1518If a textual output operation encounters a character it cannot encode,
1519and the error-handling mode is @code{ignore}, the character is ignored
1520and encoding continues with the next character. If the error-handling
1521mode is @code{replace}, a codec-specific replacement character is
1522emitted by the transcoder, and encoding continues with the next
1523character. The replacement character is U+FFFD for transcoders whose
1524codec is one of the Unicode encodings, but is the @code{?} character
1525for the Latin-1 encoding. If the error-handling mode is @code{raise},
1526an exception with condition type @code{&i/o-encoding} is raised.
1527@end deffn
1528
1529@deffn {Scheme Procedure} make-transcoder codec
1530@deffnx {Scheme Procedure} make-transcoder codec eol-style
1531@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
1532
1533@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
1534symbol; and @var{handling-mode}, if present, an error-handling-mode
1535symbol.
1536
1537@var{eol-style} may be omitted, in which case it defaults to the native
end-of-line style of the underlying platform. @var{handling-mode} may
1539be omitted, in which case it defaults to @code{replace}. The result is
1540a transcoder with the behavior specified by its arguments.
1541@end deffn
1542
@deffn {Scheme Procedure} native-transcoder
1544Returns an implementation-dependent transcoder that represents a
1545possibly locale-dependent ``native'' transcoding.
1546@end deffn
1547
1548@deffn {Scheme Procedure} transcoder-codec transcoder
1549@deffnx {Scheme Procedure} transcoder-eol-style transcoder
1550@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
1551
1552These are accessors for transcoder objects; when applied to a
1553transcoder returned by @code{make-transcoder}, they return the
1554@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
1555respectively.
1556@end deffn
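As a minimal sketch of how the constructor and the accessors fit
together, the following builds a transcoder explicitly and reads its
components back. It assumes the @code{(rnrs io ports)} module has been
imported; the variable name @code{t} is purely illustrative.

@lisp
(use-modules (rnrs io ports))

;; Build a UTF-8 transcoder with an explicit end-of-line style and
;; error-handling mode.
(define t
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode replace)))

(transcoder-eol-style t)
;; @result{} lf
(transcoder-error-handling-mode t)
;; @result{} replace

;; With the optional arguments omitted, the handling mode defaults
;; to replace.
(transcoder-error-handling-mode (make-transcoder (utf-8-codec)))
;; @result{} replace
@end lisp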
1557
1558@deffn {Scheme Procedure} bytevector->string bytevector transcoder
1559
1560Returns the string that results from transcoding the
1561@var{bytevector} according to the input direction of the transcoder.
1562@end deffn
1563
1564@deffn {Scheme Procedure} string->bytevector string transcoder
1565
1566Returns the bytevector that results from transcoding the
1567@var{string} according to the output direction of the transcoder.
1568@end deffn
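The two conversion procedures are inverses of each other for text that
the codec can represent. The sketch below assumes a UTF-8 transcoder
and an ASCII-only string, so that the end-of-line style plays no role;
the names @code{utf8} and @code{bv} are illustrative.

@lisp
(use-modules (rnrs io ports))

(define utf8 (make-transcoder (utf-8-codec)))

;; Encode a string through the transcoder's output direction ...
(define bv (string->bytevector "hello" utf8))
bv
;; @result{} #vu8(104 101 108 108 111)

;; ... and decode it again through the input direction.
(bytevector->string bv utf8)
;; @result{} "hello"
@end lisp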
1569
1570@node R6RS End-of-File
1571@subsubsection The End-of-File Object
1572
1573@cindex EOF
1574@cindex end-of-file
1575
1576R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
1577ports)} module:
1578
1579@deffn {Scheme Procedure} eof-object? obj
1580@deffnx {C Function} scm_eof_object_p (obj)
1581Return true if @var{obj} is the end-of-file (EOF) object.
1582@end deffn
1583
1584In addition, the following procedure is provided:
1585
1586@deffn {Scheme Procedure} eof-object
1587@deffnx {C Function} scm_eof_object ()
1588Return the end-of-file (EOF) object.
1589
1590@lisp
1591(eof-object? (eof-object))
1592@result{} #t
1593@end lisp
1594@end deffn
1595
1596
1597@node R6RS Port Manipulation
1598@subsubsection Port Manipulation
1599
1600The procedures listed below operate on any kind of R6RS I/O port.
1601
1602@deffn {Scheme Procedure} port? obj
1603Returns @code{#t} if the argument is a port, and returns @code{#f}
1604otherwise.
1605@end deffn
1606
1607@deffn {Scheme Procedure} port-transcoder port
1608Returns the transcoder associated with @var{port} if @var{port} is
1609textual and has an associated transcoder, and returns @code{#f} if
1610@var{port} is binary or does not have an associated transcoder.
1611@end deffn
1612
1613@deffn {Scheme Procedure} binary-port? port
1614Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
1615binary data input/output.
1616
1617Note that internally Guile does not differentiate between binary and
1618textual ports, unlike the R6RS. Thus, this procedure returns true when
1619@var{port} does not have an associated encoding---i.e., when
1620@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
1621port-encoding}). This is the case for ports returned by R6RS procedures
1622such as @code{open-bytevector-input-port} and
1623@code{make-custom-binary-output-port}.
1624
1625However, Guile currently does not prevent use of textual I/O procedures
1626such as @code{display} or @code{read-char} with binary ports. Doing so
1627``upgrades'' the port from binary to textual, under the ISO-8859-1
1628encoding. Likewise, Guile does not prevent use of
1629@code{set-port-encoding!} on a binary port, which also turns it into a
1630``textual'' port.
1631@end deffn
1632
1633@deffn {Scheme Procedure} textual-port? port
Always return @code{#t}, as all ports can be used for textual I/O in
1635Guile.
1636@end deffn
1637
@deffn {Scheme Procedure} transcoded-port binary-port transcoder
1639The @code{transcoded-port} procedure
1640returns a new textual port with the specified @var{transcoder}.
1641Otherwise the new textual port's state is largely the same as
1642that of @var{binary-port}.
1643If @var{binary-port} is an input port, the new textual
1644port will be an input port and
1645will transcode the bytes that have not yet been read from
1646@var{binary-port}.
1647If @var{binary-port} is an output port, the new textual
1648port will be an output port and
1649will transcode output characters into bytes that are
1650written to the byte sink represented by @var{binary-port}.
1651
1652As a side effect, however, @code{transcoded-port}
1653closes @var{binary-port} in
1654a special way that allows the new textual port to continue to
1655use the byte source or sink represented by @var{binary-port},
1656even though @var{binary-port} itself is closed and cannot
1657be used by the input and output operations described in this
1658chapter.
1659@end deffn
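For instance, a binary port over UTF-8 encoded bytes can be wrapped so
that it delivers characters instead of octets. This is a sketch that
assumes @code{(rnrs io ports)} is imported; @code{bin} and @code{txt}
are illustrative names, and @code{bin} is not used again after the call
to @code{transcoded-port}.

@lisp
(use-modules (rnrs io ports))

;; The octets #xCE #xBB are the UTF-8 encoding of U+03BB; 120 is
;; the ASCII code of the letter x.
(define bin (open-bytevector-input-port #vu8(#xCE #xBB 120)))
(define txt (transcoded-port bin (make-transcoder (utf-8-codec))))

(define first-char (get-char txt))
(char->integer first-char)
;; @result{} 955

(define second-char (get-char txt))
second-char
;; @result{} #\x
@end lisp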
1660
1661@deffn {Scheme Procedure} port-position port
1662If @var{port} supports it (see below), return the offset (an integer)
1663indicating where the next octet will be read from/written to in
1664@var{port}. If @var{port} does not support this operation, an error
1665condition is raised.
1666
1667This is similar to Guile's @code{seek} procedure with the
1668@code{SEEK_CUR} argument (@pxref{Random Access}).
1669@end deffn
1670
1671@deffn {Scheme Procedure} port-has-port-position? port
Return @code{#t} if @var{port} supports @code{port-position}.
1673@end deffn
1674
1675@deffn {Scheme Procedure} set-port-position! port offset
1676If @var{port} supports it (see below), set the position where the next
1677octet will be read from/written to @var{port} to @var{offset} (an
1678integer). If @var{port} does not support this operation, an error
1679condition is raised.
1680
1681This is similar to Guile's @code{seek} procedure with the
1682@code{SEEK_SET} argument (@pxref{Random Access}).
1683@end deffn
1684
1685@deffn {Scheme Procedure} port-has-set-port-position!? port
Return @code{#t} if @var{port} supports @code{set-port-position!}.
1687@end deffn
1688
1689@deffn {Scheme Procedure} call-with-port port proc
1690Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
1691of @var{proc}. Return the return values of @var{proc}.
1692@end deffn
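The positioning procedures and @code{call-with-port} can be combined as
in the following sketch, which relies on bytevector input ports
(@pxref{R6RS Binary Input}) supporting both @code{port-position} and
@code{set-port-position!}. The name @code{result} is illustrative.

@lisp
(use-modules (rnrs io ports))

(define result
  (call-with-port (open-bytevector-input-port #vu8(10 20 30 40))
    (lambda (port)
      (get-u8 port)                      ; consume the first octet
      (let ((pos (port-position port)))  ; offset of the next octet
        (set-port-position! port 3)      ; jump to the last octet
        (list pos (get-u8 port))))))

result
;; @result{} (1 40)
@end lisp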
1693
1694@node R6RS Input Ports
1695@subsubsection Input Ports

@deffn {Scheme Procedure} input-port? obj
1698Returns @code{#t} if the argument is an input port (or a combined input
1699and output port), and returns @code{#f} otherwise.
1700@end deffn

@deffn {Scheme Procedure} port-eof? input-port
1703Returns @code{#t}
1704if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
1705or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
1706would return
1707the end-of-file object, and @code{#f} otherwise.
1708The operation may block indefinitely if no data is available
1709but the port cannot be determined to be at end of file.
1710@end deffn
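A short sketch of @code{port-eof?} on a one-octet bytevector port
follows; it assumes @code{(rnrs io ports)} is imported, and the name
@code{port} is illustrative.

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(7)))

(define before (port-eof? port))   ; one octet is still pending
before
;; @result{} #f

(get-u8 port)
;; @result{} 7

(define after (port-eof? port))    ; nothing is left to read
after
;; @result{} #t
@end lisp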
1711
1712@deffn {Scheme Procedure} open-file-input-port filename
1713@deffnx {Scheme Procedure} open-file-input-port filename file-options
1714@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
1715@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
1717
The @code{open-file-input-port} procedure returns an
input port for the named file. The @var{file-options},
@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
1721
1722The @var{file-options} argument, which may determine
1723various aspects of the returned port (@pxref{R6RS File Options}),
1724defaults to the value of @code{(file-options)}.
1725
1726The @var{buffer-mode} argument, if supplied,
1727must be one of the symbols that name a buffer mode.
1728The @var{buffer-mode} argument defaults to @code{block}.
1729
1730If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
1731with the returned port.
1732
1733If @var{maybe-transcoder} is @code{#f} or absent,
1734the port will be a binary port and will support the
1735@code{port-position} and @code{set-port-position!} operations.
1736Otherwise the port will be a textual port, and whether it supports
1737the @code{port-position} and @code{set-port-position!} operations
1738is implementation-dependent (and possibly transcoder-dependent).
1739@end deffn
1740
1741@deffn {Scheme Procedure} standard-input-port
1742Returns a fresh binary input port connected to standard input. Whether
1743the port supports the @code{port-position} and @code{set-port-position!}
1744operations is implementation-dependent.
1745@end deffn
1746
1747@deffn {Scheme Procedure} current-input-port
1748This returns a default textual port for input. Normally, this default
1749port is associated with standard input, but can be dynamically
1750re-assigned using the @code{with-input-from-file} procedure from the
1751@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
1752may not have an associated transcoder; if it does, the transcoder is
1753implementation-dependent.
1754@end deffn
1755
1756@node R6RS Binary Input
1757@subsubsection Binary Input
1758
1759@cindex binary input
1760
1761R6RS binary input ports can be created with the procedures described
1762below.
1763
1764@deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
1765@deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
1766Return an input port whose contents are drawn from bytevector @var{bv}
1767(@pxref{Bytevectors}).
1768
1769@c FIXME: Update description when implemented.
1770The @var{transcoder} argument is currently not supported.
1771@end deffn
1772
1773@cindex custom binary input ports
1774
1775@deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
1776@deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
1777Return a new custom binary input port@footnote{This is similar in spirit
1778to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
1779string) whose input is drained by invoking @var{read!} and passing it a
1780bytevector, an index where bytes should be written, and the number of
1781bytes to read. The @code{read!} procedure must return an integer
1782indicating the number of bytes read, or @code{0} to indicate the
1783end-of-file.
1784
1785Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
that will be called when @code{port-position} is invoked on the custom
binary port and should return an integer indicating the position within
the underlying data stream; if @var{get-position} is @code{#f}, the
returned port does not support @code{port-position}.
1790
1791Likewise, if @var{set-position!} is not @code{#f}, it should be a
one-argument procedure. When @code{set-port-position!} is invoked on the
custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
1795
1796Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
1797invoked when the custom binary input port is closed.
1798
1799The returned port is fully buffered by default, but its buffering mode
1800can be changed using @code{setvbuf} (@pxref{Ports and File Descriptors,
1801@code{setvbuf}}).
1802
1803Using a custom binary input port, the @code{open-bytevector-input-port}
1804procedure could be implemented as follows:
1805
1806@lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy up to COUNT bytes into BV at START; returning 0 signals
  ;; the end-of-file.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; The final argument is CLOSE; #f means that nothing needs to be
  ;; done when the port is closed.
  (make-custom-binary-input-port "the port" read!
                                 get-position
                                 set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
@result{} hello
1829@end lisp
1830@end deffn
1831
1832@cindex binary input
1833Binary input is achieved using the procedures below:
1834
1835@deffn {Scheme Procedure} get-u8 port
1836@deffnx {C Function} scm_get_u8 (port)
1837Return an octet read from @var{port}, a binary input port, blocking as
1838necessary, or the end-of-file object.
1839@end deffn
1840
1841@deffn {Scheme Procedure} lookahead-u8 port
1842@deffnx {C Function} scm_lookahead_u8 (port)
1843Like @code{get-u8} but does not update @var{port}'s position to point
1844past the octet.
1845@end deffn
1846
1847@deffn {Scheme Procedure} get-bytevector-n port count
1848@deffnx {C Function} scm_get_bytevector_n (port, count)
1849Read @var{count} octets from @var{port}, blocking as necessary and
1850return a bytevector containing the octets read. If fewer bytes are
1851available, a bytevector smaller than @var{count} is returned.
1852@end deffn
1853
1854@deffn {Scheme Procedure} get-bytevector-n! port bv start count
1855@deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
1856Read @var{count} bytes from @var{port} and store them in @var{bv}
1857starting at index @var{start}. Return either the number of bytes
1858actually read or the end-of-file object.
1859@end deffn
1860
1861@deffn {Scheme Procedure} get-bytevector-some port
1862@deffnx {C Function} scm_get_bytevector_some (port)
1863Read from @var{port}, blocking as necessary, until bytes are available
1864or an end-of-file is reached. Return either the end-of-file object or a
1865new bytevector containing some of the available bytes (at least one),
1866and update the port position to point just past these bytes.
1867@end deffn
1868
1869@deffn {Scheme Procedure} get-bytevector-all port
1870@deffnx {C Function} scm_get_bytevector_all (port)
1871Read from @var{port}, blocking as necessary, until the end-of-file is
1872reached. Return either a new bytevector containing the data read or the
1873end-of-file object (if no data were available).
1874@end deffn
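The following sketch exercises several of these procedures on a single
bytevector input port. It assumes @code{(rnrs io ports)} is imported;
the variable names are illustrative.

@lisp
(use-modules (rnrs io ports))

(define port (open-bytevector-input-port #vu8(1 2 3 4 5)))

(define peeked (lookahead-u8 port))      ; does not advance the port
(define first (get-u8 port))             ; reads the same octet
(define pair (get-bytevector-n port 2))  ; the next two octets
(define rest (get-bytevector-all port))  ; everything that remains
(define end (get-u8 port))               ; the port is now exhausted

(list peeked first pair rest (eof-object? end))
;; @result{} (1 1 #vu8(2 3) #vu8(4 5) #t)
@end lisp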
1875
1876The @code{(ice-9 binary-ports)} module provides the following procedure
1877as an extension to @code{(rnrs io ports)}:
1878
1879@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
1880@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
1881Place the contents of @var{bv} in @var{port}, optionally starting at
1882index @var{start} and limiting to @var{count} octets, so that its bytes
1883will be read from left-to-right as the next bytes from @var{port} during
1884subsequent read operations. If called multiple times, the unread bytes
1885will be read again in last-in first-out order.
1886@end deffn
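As an illustration, bytes pushed back with @code{unget-bytevector} are
delivered before the data still pending in the port. This sketch
assumes both @code{(rnrs io ports)} and @code{(ice-9 binary-ports)} are
imported; the names @code{port} and @code{data} are illustrative.

@lisp
(use-modules (rnrs io ports)
             (ice-9 binary-ports))

(define port (open-bytevector-input-port #vu8(3 4)))

;; Push two octets back in front of the pending input.
(unget-bytevector port #vu8(1 2))

(define data (get-bytevector-all port))
data
;; @result{} #vu8(1 2 3 4)
@end lisp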
1887
1888@node R6RS Textual Input
1889@subsubsection Textual Input
1890
@deffn {Scheme Procedure} get-char textual-input-port
1892Reads from @var{textual-input-port}, blocking as necessary, until a
1893complete character is available from @var{textual-input-port},
1894or until an end of file is reached.
1895
1896If a complete character is available before the next end of file,
1897@code{get-char} returns that character and updates the input port to
1898point past the character. If an end of file is reached before any
1899character is read, @code{get-char} returns the end-of-file object.
1900@end deffn
1901
@deffn {Scheme Procedure} lookahead-char textual-input-port
1903The @code{lookahead-char} procedure is like @code{get-char}, but it does
1904not update @var{textual-input-port} to point past the character.
1905@end deffn
1906
@deffn {Scheme Procedure} get-string-n textual-input-port count

@var{count} must be an exact, non-negative integer object, representing
1910the number of characters to be read.
1911
1912The @code{get-string-n} procedure reads from @var{textual-input-port},
1913blocking as necessary, until @var{count} characters are available, or
1914until an end of file is reached.
1915
1916If @var{count} characters are available before end of file,
1917@code{get-string-n} returns a string consisting of those @var{count}
1918characters. If fewer characters are available before an end of file, but
1919one or more characters can be read, @code{get-string-n} returns a string
1920containing those characters. In either case, the input port is updated
1921to point just past the characters read. If no characters can be read
1922before an end of file, the end-of-file object is returned.
1923@end deffn
1924
@deffn {Scheme Procedure} get-string-n! textual-input-port string start count

@var{start} and @var{count} must be exact, non-negative integer objects,
with @var{count} representing the number of characters to be read.
@var{string} must be a string with at least @math{@var{start} + @var{count}}
1930characters.
1931
1932The @code{get-string-n!} procedure reads from @var{textual-input-port}
1933in the same manner as @code{get-string-n}. If @var{count} characters
1934are available before an end of file, they are written into @var{string}
1935starting at index @var{start}, and @var{count} is returned. If fewer
1936characters are available before an end of file, but one or more can be
1937read, those characters are written into @var{string} starting at index
1938@var{start} and the number of characters actually read is returned as an
1939exact integer object. If no characters can be read before an end of
1940file, the end-of-file object is returned.
1941@end deffn
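The sketch below shows the short-read case of @code{get-string-n!}:
five characters are requested but only three are available, so the
return value is 3 and the rest of the buffer is left untouched. It
assumes @code{(rnrs io ports)} is imported; @code{buf}, @code{port} and
@code{n} are illustrative names.

@lisp
(use-modules (rnrs io ports))

(define buf (make-string 8 #\-))        ; room for start + count = 7
(define port (open-input-string "abc"))

(define n (get-string-n! port buf 2 5))
n
;; @result{} 3
buf
;; @result{} "--abc---"
@end lisp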
1942
@deffn {Scheme Procedure} get-string-all textual-input-port
1944Reads from @var{textual-input-port} until an end of file, decoding
1945characters in the same manner as @code{get-string-n} and
1946@code{get-string-n!}.
1947
If characters are available before the end of file, a string containing
all the characters decoded from that data is returned. If no character
1950precedes the end of file, the end-of-file object is returned.
1951@end deffn
1952
@deffn {Scheme Procedure} get-line textual-input-port
1954Reads from @var{textual-input-port} up to and including the linefeed
1955character or end of file, decoding characters in the same manner as
1956@code{get-string-n} and @code{get-string-n!}.
1957
1958If a linefeed character is read, a string containing all of the text up
1959to (but not including) the linefeed character is returned, and the port
1960is updated to point just past the linefeed character. If an end of file
1961is encountered before any linefeed character is read, but some
1962characters have been read and decoded as characters, a string containing
1963those characters is returned. If an end of file is encountered before
1964any characters are read, the end-of-file object is returned.
1965
1966@quotation Note
1967 The end-of-line style, if not @code{none}, will cause all line endings
1968 to be read as linefeed characters. @xref{R6RS Transcoders}.
1969@end quotation
1970@end deffn
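The textual input procedures described so far can be tried out on a
string port, as in this sketch. It assumes @code{(rnrs io ports)} is
imported and uses Guile's @code{open-input-string} to obtain a textual
port; the variable names are illustrative.

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "first line\nsecond"))

(define c (get-char port))              ; #\f
(define peeked (lookahead-char port))   ; #\i, port not advanced
(define chunk (get-string-n port 4))    ; "irst"
(define line (get-line port))           ; " line", linefeed dropped
(define rest (get-string-all port))     ; "second"
(define end (get-char port))            ; end-of-file object

(list c peeked chunk line rest (eof-object? end))
;; @result{} (#\f #\i "irst" " line" "second" #t)
@end lisp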
1971
@deffn {Scheme Procedure} get-datum textual-input-port
1973Reads an external representation from @var{textual-input-port} and returns the
1974datum it represents. The @code{get-datum} procedure returns the next
1975datum that can be parsed from the given @var{textual-input-port}, updating
1976@var{textual-input-port} to point exactly past the end of the external
1977representation of the object.
1978
1979Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
1980Syntax}) in the input is first skipped. If an end of file occurs after
1981the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
1982is returned.
1983
1984If a character inconsistent with an external representation is
1985encountered in the input, an exception with condition types
1986@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
1987file is encountered after the beginning of an external representation,
1988but the external representation is incomplete and therefore cannot be
1989parsed, an exception with condition types @code{&lexical} and
1990@code{&i/o-read} is raised.
1991@end deffn
1992
1993@node R6RS Output Ports
1994@subsubsection Output Ports
1995
1996@deffn {Scheme Procedure} output-port? obj
1997Returns @code{#t} if the argument is an output port (or a
1998combined input and output port), @code{#f} otherwise.
1999@end deffn
2000
@deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{port} to the
underlying file, device, or object. The @code{flush-output-port}
procedure returns an unspecified value.
2005@end deffn
2006
2007@deffn {Scheme Procedure} open-file-output-port filename
2008@deffnx {Scheme Procedure} open-file-output-port filename file-options
2009@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
2010@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
2011
2012@var{maybe-transcoder} must be either a transcoder or @code{#f}.
2013
2014The @code{open-file-output-port} procedure returns an output port for the named file.
2015
2016The @var{file-options} argument, which may determine various aspects of
2017the returned port (@pxref{R6RS File Options}), defaults to the value of
2018@code{(file-options)}.
2019
2020The @var{buffer-mode} argument, if supplied,
2021must be one of the symbols that name a buffer mode.
2022The @var{buffer-mode} argument defaults to @code{block}.
2023
2024If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
2025associated with the port.
2026
2027If @var{maybe-transcoder} is @code{#f} or absent,
2028the port will be a binary port and will support the
2029@code{port-position} and @code{set-port-position!} operations.
2030Otherwise the port will be a textual port, and whether it supports
2031the @code{port-position} and @code{set-port-position!} operations
2032is implementation-dependent (and possibly transcoder-dependent).
2033@end deffn
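As a hedged sketch, the following writes a few octets to a file through
a binary output port and reads them back with
@code{open-file-input-port}. The file name is hypothetical, and the
@code{no-fail} file option is assumed to create the file or truncate an
existing one (@pxref{R6RS File Options}).

@lisp
(use-modules (rnrs io ports))

;; Hypothetical scratch file in the current directory.
(define file "r6rs-port-demo.bin")

;; No transcoder is given, so both ports are binary ports.
(define out (open-file-output-port file (file-options no-fail)))
(put-bytevector out #vu8(1 2 3))
(close-port out)

(define in (open-file-input-port file))
(define contents (get-bytevector-all in))
(close-port in)
(delete-file file)

contents
;; @result{} #vu8(1 2 3)
@end lisp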
2034
2035@deffn {Scheme Procedure} standard-output-port
2036@deffnx {Scheme Procedure} standard-error-port
2037Returns a fresh binary output port connected to the standard output or
2038standard error respectively. Whether the port supports the
2039@code{port-position} and @code{set-port-position!} operations is
2040implementation-dependent.
2041@end deffn
2042
2043@deffn {Scheme Procedure} current-output-port
2044@deffnx {Scheme Procedure} current-error-port
2045These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output and
standard error, respectively. The return value of
2048@code{current-output-port} can be dynamically re-assigned using the
2049@code{with-output-to-file} procedure from the @code{io simple (6)}
2050library (@pxref{rnrs io simple}). A port returned by one of these
2051procedures may or may not have an associated transcoder; if it does, the
2052transcoder is implementation-dependent.
2053@end deffn
2054
2055@node R6RS Binary Output
2056@subsubsection Binary Output
2057
2058Binary output ports can be created with the procedures below.
2059
2060@deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
2061@deffnx {C Function} scm_open_bytevector_output_port (transcoder)
2062Return two values: a binary output port and a procedure. The latter
2063should be called with zero arguments to obtain a bytevector containing
2064the data accumulated by the port, as illustrated below.
2065
2066@lisp
2067(call-with-values
2068 (lambda ()
2069 (open-bytevector-output-port))
2070 (lambda (port get-bytevector)
2071 (display "hello" port)
2072 (get-bytevector)))
2073
2074@result{} #vu8(104 101 108 108 111)
2075@end lisp
2076
2077@c FIXME: Update description when implemented.
2078The @var{transcoder} argument is currently not supported.
2079@end deffn
2080
2081@cindex custom binary output ports
2082
2083@deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
2084@deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
2085Return a new custom binary output port named @var{id} (a string) whose
2086output is sunk by invoking @var{write!} and passing it a bytevector, an
2087index where bytes should be read from this bytevector, and the number of
2088bytes to be ``written''. The @code{write!} procedure must return an
2089integer indicating the number of bytes actually written; when it is
2090passed @code{0} as the number of bytes to write, it should behave as
2091though an end-of-file was sent to the byte sink.
2092
2093The other arguments are as for @code{make-custom-binary-input-port}
2094(@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
2095@end deffn
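A minimal sketch of a custom binary output port follows. Its
@code{write!} procedure simply accumulates every octet in a list; the
names @code{collected} and @code{port} are illustrative, and the port
is closed before the result is inspected so that any buffered output
has reached @code{write!}.

@lisp
(use-modules (rnrs io ports)
             (rnrs bytevectors))

(define collected '())

;; Consume COUNT octets of BV starting at START and report how
;; many were written.
(define (write! bv start count)
  (do ((i 0 (+ i 1)))
      ((= i count))
    (set! collected
          (cons (bytevector-u8-ref bv (+ start i)) collected)))
  count)

;; No positioning or close procedures are supplied.
(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port #vu8(1 2 3))
(close-port port)

(reverse collected)
;; @result{} (1 2 3)
@end lisp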
2096
2097@cindex binary output
2098Writing to a binary output port can be done using the following
2099procedures:
2100
2101@deffn {Scheme Procedure} put-u8 port octet
2102@deffnx {C Function} scm_put_u8 (port, octet)
2103Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
2104binary output port.
2105@end deffn
2106
2107@deffn {Scheme Procedure} put-bytevector port bv [start [count]]
2108@deffnx {C Function} scm_put_bytevector (port, bv, start, count)
2109Write the contents of @var{bv} to @var{port}, optionally starting at
2110index @var{start} and limiting to @var{count} octets.
2111@end deffn
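The sketch below combines @code{put-u8} and @code{put-bytevector} with
a bytevector output port; the optional @var{start} and @var{count}
arguments select the octets at indices 1 to 3. It assumes
@code{(rnrs io ports)} is imported; @code{result} and @code{get-bytes}
are illustrative names.

@lisp
(use-modules (rnrs io ports))

(define result
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (put-u8 port 255)
      (put-bytevector port #vu8(1 2 3 4 5) 1 3)
      (get-bytes))))

result
;; @result{} #vu8(255 2 3 4)
@end lisp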
2112
2113@node R6RS Textual Output
2114@subsubsection Textual Output
2115
2116@deffn {Scheme Procedure} put-char port char
2117Writes @var{char} to the port. The @code{put-char} procedure returns
an unspecified value.
2119@end deffn
2120
2121@deffn {Scheme Procedure} put-string port string
2122@deffnx {Scheme Procedure} put-string port string start
2123@deffnx {Scheme Procedure} put-string port string start count
2124
2125@var{start} and @var{count} must be non-negative exact integer objects.
2126@var{string} must have a length of at least @math{@var{start} +
2127@var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}. The
2129@code{put-string} procedure writes the @var{count} characters of
2130@var{string} starting at index @var{start} to the port. The
2131@code{put-string} procedure returns an unspecified value.
2132@end deffn
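The effect of the optional @var{start} and @var{count} arguments can be
seen in this sketch, which writes to a string port obtained with
Guile's @code{call-with-output-string}. It assumes
@code{(rnrs io ports)} is imported; the name @code{text} is
illustrative.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-string port "hello, world" 7)   ; from index 7 to the end
      (put-char port #\!)
      (put-string port "abcdef" 1 3))))    ; three characters from index 1

text
;; @result{} "world!bcd"
@end lisp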
2133
@deffn {Scheme Procedure} put-datum textual-output-port datum
2135@var{datum} should be a datum value. The @code{put-datum} procedure
2136writes an external representation of @var{datum} to
2137@var{textual-output-port}. The specific external representation is
2138implementation-dependent. However, whenever possible, an implementation
2139should produce a representation for which @code{get-datum}, when reading
2140the representation, will return an object equal (in the sense of
2141@code{equal?}) to @var{datum}.
2142
2143@quotation Note
2144 Not all datums may allow producing an external representation for which
2145 @code{get-datum} will produce an object that is equal to the
2146 original. Specifically, NaNs contained in @var{datum} may make
2147 this impossible.
2148@end quotation
2149
2150@quotation Note
2151 The @code{put-datum} procedure merely writes the external
2152 representation, but no trailing delimiter. If @code{put-datum} is
2153 used to write several subsequent external representations to an
2154 output port, care should be taken to delimit them properly so they can
2155 be read back in by subsequent calls to @code{get-datum}.
2156@end quotation
2157@end deffn
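The need for an explicit delimiter can be sketched as follows: two
datums are written to a string port with a space between them and then
read back with @code{get-datum}. The example assumes
@code{(rnrs io ports)} is imported; @code{text}, @code{in} and the
other variable names are illustrative.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-datum port '(1 "two" #\3))
      (put-char port #\space)          ; delimiter between the datums
      (put-datum port 'four))))

(define in (open-input-string text))
(define first (get-datum in))
(define second (get-datum in))
(define end (get-datum in))

(list first second (eof-object? end))
;; @result{} ((1 "two" #\3) four #t)
@end lisp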
2159@node I/O Extensions
2160@subsection Using and Extending Ports in C
2161
2162@menu
2163* C Port Interface:: Using ports from C.
2164* Port Implementation:: How to implement a new port type in C.
2165@end menu
2166
2167
2168@node C Port Interface
2169@subsubsection C Port Interface
2170@cindex C port interface
2171@cindex Port, C interface
2172
2173This section describes how to use Scheme ports from C.
2174
2175@subsubheading Port basics
2176
2177@cindex ptob
2178@tindex scm_ptob_descriptor
2179@tindex scm_port
2180@findex SCM_PTAB_ENTRY
2181@findex SCM_PTOBNUM
2182@vindex scm_ptobs
2183There are two main data structures. A port type object (ptob) is of
2184type @code{scm_ptob_descriptor}. A port instance is of type
2185@code{scm_port}. Given an @code{SCM} variable which points to a port,
2186the corresponding C port object can be obtained using the
2187@code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
2188@code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
2189global array.
2190
2191@subsubheading Port buffers
2192
An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with
@code{malloc} when the port is created, but in the case of an strport the
underlying string is used as the buffer.
2201
2202@subsubheading The @code{rw_random} flag
2203
Special treatment is required for ports that support seeking to arbitrary positions.
2205Before various operations, such as seeking the port or changing from
2206input to output on a bidirectional port or vice versa, the port
2207implementation must be given a chance to update its state. The write
2208buffer is updated by calling the @code{flush} ptob procedure and the
2209input buffer is updated by calling the @code{end_input} ptob procedure.
2210In the case of an fport, @code{flush} causes buffered output to be
2211written to the file descriptor, while @code{end_input} causes the
2212descriptor position to be adjusted to account for buffered input which
2213was never read.
2214
2215The special treatment must be performed if the @code{rw_random} flag in
2216the port is non-zero.
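
As a rough illustration of the position adjustment that @code{end_input}
performs for an fport, the following self-contained C sketch models the
bookkeeping with a toy structure. The names @code{toy_fport},
@code{toy_logical_pos} and @code{toy_end_input} are hypothetical and are
not part of libguile; a real fport would seek the file descriptor rather
than adjust a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in libguile.  */
struct toy_fport
{
  long fd_pos;      /* offset the file descriptor currently points at */
  size_t read_pos;  /* index of the next unread byte in the read buffer */
  size_t read_end;  /* index one past the last buffered byte */
};

/* Position as seen by the program: the descriptor offset minus the
   bytes that were fetched into the buffer but not yet consumed.  */
static long
toy_logical_pos (const struct toy_fport *p)
{
  return p->fd_pos - (long) (p->read_end - p->read_pos);
}

/* Model of end_input: discard the unread input and move the descriptor
   offset back so that it matches the logical position.  */
static void
toy_end_input (struct toy_fport *p)
{
  assert (p->read_pos <= p->read_end);
  p->fd_pos -= (long) (p->read_end - p->read_pos);
  p->read_pos = p->read_end = 0;
}
```

In this model the logical position is the same before and after
@code{toy_end_input}; only the descriptor offset and the buffer change.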
2217
2218@subsubheading The @code{rw_active} variable
2219
2220The @code{rw_active} variable in the port is only used if
2221@code{rw_random} is set. It's defined as an enum with the following
2222values:
2223
2224@table @code
2225@item SCM_PORT_READ
2226the read buffer may have unread data.
2227
2228@item SCM_PORT_WRITE
2229the write buffer may have unwritten data.
2230
2231@item SCM_PORT_NEITHER
2232neither the write nor the read buffer has data.
2233@end table
2234
@subsubheading Reading from a port
2236
2237To read from a port, it's possible to either call existing libguile
2238procedures such as @code{scm_getc} and @code{scm_read_line} or to read
2239data from the read buffer directly. Reading from the buffer involves
2240the following steps:
2241
2242@enumerate
2243@item
2244Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
2245
2246@item
2247Fill the read buffer, if it's empty, using @code{scm_fill_input}.
2248
@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.

@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.

@item
Update the port's line and column counts.
2255@end enumerate
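
The five steps above can be sketched with a toy port written in plain C.
This is a model for illustration under stated assumptions, not libguile
code: @code{toy_port}, @code{toy_flush}, @code{toy_fill_input} and
@code{toy_read} are hypothetical names, and the ``device'' is simply a C
string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the libguile types; all names are hypothetical.  */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port
{
  const char *source;          /* data the "device" will deliver */
  size_t source_pos;           /* how much of it has been fetched */
  char read_buf[4];            /* deliberately tiny read buffer */
  size_t read_pos, read_end;   /* unread bytes are [read_pos, read_end) */
  int rw_random;               /* non-zero if the port can seek */
  enum toy_rw rw_active;
  int flushes;                 /* number of times toy_flush ran */
  int line, column;
};

static void
toy_flush (struct toy_port *p)
{
  p->flushes++;
  p->rw_active = TOY_NEITHER;
}

/* Refill the read buffer; return the number of bytes now available,
   or 0 at end of input.  */
static size_t
toy_fill_input (struct toy_port *p)
{
  size_t left = strlen (p->source) - p->source_pos;
  size_t n = left < sizeof p->read_buf ? left : sizeof p->read_buf;
  memcpy (p->read_buf, p->source + p->source_pos, n);
  p->source_pos += n;
  p->read_pos = 0;
  p->read_end = n;
  return n;
}

/* Read up to WANT bytes into DST, following the five steps.  */
static size_t
toy_read (struct toy_port *p, char *dst, size_t want)
{
  size_t got = 0;
  assert (p != NULL && dst != NULL);

  if (p->rw_active == TOY_WRITE)        /* step 1: flush pending output */
    toy_flush (p);

  while (got < want)
    {
      char c;
      if (p->read_pos == p->read_end    /* step 2: fill an empty buffer */
          && toy_fill_input (p) == 0)
        break;
      c = p->read_buf[p->read_pos++];   /* step 3: consume one byte */
      dst[got++] = c;
      if (c == '\n')                    /* step 5: line and column counts */
        {
          p->line++;
          p->column = 0;
        }
      else
        p->column++;
    }

  if (p->rw_random)                     /* step 4: record the direction */
    p->rw_active = TOY_READ;
  return got;
}
```

The model updates the line and column counts as each byte is consumed
rather than as a separate final pass; the outcome is the same.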
2256
@subsubheading Writing to a port
2258
2259To write data to a port, calling @code{scm_lfwrite} should be sufficient for
2260most purposes. This takes care of the following steps:
2261
2262@enumerate
2263@item
2264End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
2265
2266@item
2267Pass the data to the ptob implementation using the @code{write} ptob
2268procedure. The advantage of using the ptob @code{write} instead of
2269manipulating the write buffer directly is that it allows the data to be
2270written in one operation even if the port is using the single-byte
2271@code{shortbuf}.
2272
2273@item
2274Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
2275is set.
2276@end enumerate
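
The same three steps can be modelled in a few lines of plain C. The
sketch below is an illustration only: @code{toy_port},
@code{toy_end_input}, @code{toy_ptob_write} and @code{toy_lfwrite} are
hypothetical names, and the ``device'' is a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the libguile types; all names are hypothetical.  */
enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port
{
  char sink[64];         /* where the toy write procedure stores output */
  size_t sink_len;
  int write_calls;       /* number of times the write procedure ran */
  int end_input_calls;   /* number of times end_input ran */
  int rw_random;         /* non-zero if the port can seek */
  enum toy_rw rw_active;
};

static void
toy_end_input (struct toy_port *p)
{
  p->end_input_calls++;
  p->rw_active = TOY_NEITHER;
}

/* Stand-in for the ptob write procedure: it receives the whole block
   in one call, however small the port's own buffer might be.  */
static void
toy_ptob_write (struct toy_port *p, const void *data, size_t size)
{
  assert (p->sink_len + size <= sizeof p->sink);
  memcpy (p->sink + p->sink_len, data, size);
  p->sink_len += size;
  p->write_calls++;
}

/* Model of the steps performed on behalf of the caller.  */
static void
toy_lfwrite (struct toy_port *p, const char *data, size_t size)
{
  if (p->rw_active == TOY_READ)   /* step 1: end pending input */
    toy_end_input (p);
  toy_ptob_write (p, data, size); /* step 2: hand over the data */
  if (p->rw_random)               /* step 3: record the direction */
    p->rw_active = TOY_WRITE;
}
```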
2277
2278
@node Port Implementation
@subsubsection Port Implementation
@cindex Port implementation

This section describes how to implement a new port type in C.
2284
2285As described in the previous section, a port type object (ptob) is
2286a structure of type @code{scm_ptob_descriptor}. A ptob is created by
2287calling @code{scm_make_port_type}.
2288
@deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
2290Return a new port type object. The @var{name}, @var{fill_input} and
2291@var{write} parameters are initial values for those port type fields,
2292as described below. The other fields are initialized with default
2293values and can be changed later.
2294@end deftypefun
2295
All of the elements of the ptob, apart from @code{name}, are procedures
which collectively implement the port behaviour. Creating a new port
type mostly involves writing these procedures.

2300@table @code
2301@item name
2302A pointer to a NUL terminated string: the name of the port type. This
2303is the only element of @code{scm_ptob_descriptor} which is not
2304a procedure. Set via the first argument to @code{scm_make_port_type}.
2305
@item mark
Called during garbage collection to mark any SCM objects that a port
object may contain. It doesn't need to be set unless the port has
@code{SCM} components. Set using

@deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
@end deftypefun
2313
@item free
Called when the port is collected during gc. It
should free any resources used by the port.
Set using

@deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
@end deftypefun
2321
@item print
Called when @code{write} is called on the port object, to print a
port description. E.g., for an fport it may produce something like:
@code{#<input: /etc/passwd 3>}. Set using

@deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
The first argument @var{port} is the object being printed, the second
argument @var{dest_port} is where its description should go.
@end deftypefun
2331
@item equalp
Not used at present. Set using

@deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
@end deftypefun
2337
@item close
Called when the port is closed, unless it was collected during gc. It
should free any resources used by the port.
Set using

@deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
@end deftypefun
2345
2346@item write
2347Accept data which is to be written using the port. The port implementation
2348may choose to buffer the data instead of processing it directly.
2349Set via the third argument to @code{scm_make_port_type}.
2350
@item flush
Complete the processing of buffered output data. Reset the value of
@code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
@end deftypefun
2358
@item end_input
Perform any synchronization required when switching from input to output
on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
@end deftypefun
2366
2367@item fill_input
2368Read new data into the read buffer and return the first character. It
2369can be assumed that the read buffer is empty when this procedure is called.
2370Set via the second argument to @code{scm_make_port_type}.
2371
@item input_waiting
Return a lower bound on the number of bytes that could be read from the
port without blocking. It can be assumed that the current state of
@code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
@end deftypefun
2380
@item seek
Set the current position of the port. The procedure cannot make
any assumptions about the value of @code{rw_active} when it's
called. It can reset the buffers first if desired by using something
like:

@example
/* pt is the port's scm_port, ptob its scm_ptob_descriptor.  */
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (port);       /* discard buffered input */
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (port);         /* write out buffered output */
@end example

However, note that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
@code{end_input} and @code{flush} ptob procedures. This is undesirable
when @code{seek} is called to measure the current position of the port, i.e.,
@code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
implementations take care to avoid this problem.

The procedure is set using

@deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
@end deftypefun
2405
@item truncate
Truncate the port data to the specified length. It can be assumed that the
current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
@end deftypefun
2413
2414@end table
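
To make the shape of a port type descriptor more concrete, the following
self-contained C sketch models a small table of descriptors: a name plus
function pointers, created with two required procedures and completed
later through a setter. It is only a model under stated assumptions.
Every identifier beginning with @code{toy_} or @code{null_}, and
@code{counting_close}, is hypothetical; the field layout is not that of
@code{scm_ptob_descriptor}, and the default procedures are invented for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_port;   /* opaque in this sketch */

/* Toy descriptor: a name plus the procedures implementing the type.  */
struct toy_ptob
{
  const char *name;
  int (*fill_input) (struct toy_port *port);
  void (*write) (struct toy_port *port, const void *data, size_t size);
  void (*flush) (struct toy_port *port);
  int (*close) (struct toy_port *port);
};

#define TOY_MAX_PTOBS 8
static struct toy_ptob toy_ptobs[TOY_MAX_PTOBS];  /* table of port types */
static size_t toy_n_ptobs;

/* Invented defaults used for the fields that are not passed in.  */
static void toy_default_flush (struct toy_port *port) { (void) port; }
static int toy_default_close (struct toy_port *port) { (void) port; return 0; }

/* Register a new type and return its index in the table.  */
static size_t
toy_make_port_type (const char *name,
                    int (*fill_input) (struct toy_port *port),
                    void (*write) (struct toy_port *port,
                                   const void *data, size_t size))
{
  struct toy_ptob *d;
  assert (toy_n_ptobs < TOY_MAX_PTOBS);
  d = &toy_ptobs[toy_n_ptobs];
  d->name = name;
  d->fill_input = fill_input;
  d->write = write;
  d->flush = toy_default_flush;
  d->close = toy_default_close;
  return toy_n_ptobs++;
}

/* Replace the close procedure of an existing type.  */
static void
toy_set_port_close (size_t tc, int (*close) (struct toy_port *port))
{
  assert (tc < toy_n_ptobs);
  toy_ptobs[tc].close = close;
}

/* Example procedures for a port type that reads nothing and discards
   everything written to it.  */
static int null_fill_input (struct toy_port *port) { (void) port; return -1; }
static void null_write (struct toy_port *port, const void *data, size_t size)
{ (void) port; (void) data; (void) size; }
static int counting_close (struct toy_port *port) { (void) port; return 1; }
```

A caller would register the type once, keep the returned index, and
install the optional procedures afterwards, in the same order as the
@code{scm_make_port_type} and @code{scm_set_port_close} calls described
above.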
2415
2416@node BOM Handling
@subsection Handling of Unicode Byte Order Marks
2418@cindex BOM
2419@cindex byte order mark
2420
2421This section documents the finer points of Guile's handling of Unicode
2422byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
2423at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
2424determine the byte order. Occasionally, a BOM is found at the start of
2425a UTF-8 stream, but this is much less common and not generally
2426recommended.
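
For concreteness, U+FEFF is encoded as the bytes EF BB BF in UTF-8, as
FE FF or FF FE in big-endian or little-endian UTF-16, and as 00 00 FE FF
or FF FE 00 00 in big-endian or little-endian UTF-32. The following C
sketch shows how a reader could recognise a BOM at the start of a buffer
once the encoding family is known. It is not Guile's implementation;
@code{toy_bom_length} and the two enumerations are hypothetical names
introduced for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; this is not how Guile represents encodings.  */
enum toy_encoding { TOY_UTF8, TOY_UTF16, TOY_UTF32 };
enum toy_order { TOY_ORDER_UNKNOWN, TOY_BIG_ENDIAN, TOY_LITTLE_ENDIAN };

/* Return the number of bytes occupied by a BOM at the start of BUF for
   the encoding family ENC, or 0 if there is none.  For UTF-16 and
   UTF-32, *ORDER receives the byte order that the BOM indicates.  */
static size_t
toy_bom_length (enum toy_encoding enc, const unsigned char *buf,
                size_t len, enum toy_order *order)
{
  static const unsigned char utf8[] = { 0xEF, 0xBB, 0xBF };
  static const unsigned char utf16be[] = { 0xFE, 0xFF };
  static const unsigned char utf16le[] = { 0xFF, 0xFE };
  static const unsigned char utf32be[] = { 0x00, 0x00, 0xFE, 0xFF };
  static const unsigned char utf32le[] = { 0xFF, 0xFE, 0x00, 0x00 };

  assert (buf != NULL && order != NULL);
  *order = TOY_ORDER_UNKNOWN;

  switch (enc)
    {
    case TOY_UTF8:
      if (len >= 3 && memcmp (buf, utf8, 3) == 0)
        return 3;
      break;

    case TOY_UTF16:
      if (len >= 2 && memcmp (buf, utf16be, 2) == 0)
        {
          *order = TOY_BIG_ENDIAN;
          return 2;
        }
      if (len >= 2 && memcmp (buf, utf16le, 2) == 0)
        {
          *order = TOY_LITTLE_ENDIAN;
          return 2;
        }
      break;

    case TOY_UTF32:
      if (len >= 4 && memcmp (buf, utf32be, 4) == 0)
        {
          *order = TOY_BIG_ENDIAN;
          return 4;
        }
      if (len >= 4 && memcmp (buf, utf32le, 4) == 0)
        {
          *order = TOY_LITTLE_ENDIAN;
          return 4;
        }
      break;
    }
  return 0;
}
```

Passing the encoding family explicitly avoids an ambiguity: the bytes
FF FE 00 00 are a UTF-32 little-endian BOM, but also a UTF-16
little-endian BOM followed by U+0000.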
2427
2428Guile attempts to handle BOMs automatically, and in accordance with the
2429recommendations of the Unicode Standard, when the port encoding is set
2430to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
2431automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
2432and automatically consumes one from the start of a UTF-8, UTF-16, or
2433UTF-32 stream.
2434
2435As specified in the Unicode Standard, a BOM is only handled specially at
2436the start of a stream, and only if the port encoding is set to
2437@code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
2438set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
2439@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
2440the special handling described in this section applies.
2441
2442@itemize @bullet
2443@item
2444To ensure that Guile will properly detect the byte order of a UTF-16 or
2445UTF-32 stream, you must perform a textual read before any writes, seeks,
2446or binary I/O. Guile will not attempt to read a BOM unless a read is
2447explicitly requested at the start of the stream.
2448
2449@item
2450If a textual write is performed before the first read, then an arbitrary
2451byte order will be chosen. Currently, big endian is the default on all
2452platforms, but that may change in the future. If you wish to explicitly
2453control the byte order of an output stream, set the port encoding to
2454@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
2455and explicitly write a BOM (@code{#\xFEFF}) if desired.
2456
2457@item
2458If @code{set-port-encoding!} is called in the middle of a stream, Guile
2459treats this as a new logical ``start of stream'' for purposes of BOM
2460handling, and will forget about any BOMs that had previously been seen.
2461Therefore, it may choose a different byte order than had been used
2462previously. This is intended to support multiple logical text streams
2463embedded within a larger binary stream.
2464
2465@item
2466Binary I/O operations are not guaranteed to update Guile's notion of
2467whether the port is at the ``start of the stream'', nor are they
2468guaranteed to produce or consume BOMs.
2469
2470@item
2471For ports that support seeking (e.g. normal files), the input and output
2472streams are considered linked: if the user reads first, then a BOM will
2473be consumed (if appropriate), but later writes will @emph{not} produce a
2474BOM. Similarly, if the user writes first, then later reads will
2475@emph{not} consume a BOM.
2476
2477@item
2478For ports that do not support seeking (e.g. pipes, sockets, and
2479terminals), the input and output streams are considered
2480@emph{independent} for purposes of BOM handling: the first read will
2481consume a BOM (if appropriate), and the first write will @emph{also}
2482produce a BOM (if appropriate). However, the input and output streams
2483will always use the same byte order.
2484
2485@item
2486Seeks to the beginning of a file will set the ``start of stream'' flags.
2487Therefore, a subsequent textual read or write will consume or produce a
2488BOM. However, unlike @code{set-port-encoding!}, if a byte order had
2489already been chosen for the port, it will remain in effect after a seek,
2490and cannot be changed by the presence of a BOM. Seeks anywhere other
2491than the beginning of a file clear the ``start of stream'' flags.
2492@end itemize
2493
2494@c Local Variables:
2495@c TeX-master: "guile.texi"
2496@c End: