Merge remote-tracking branch 'origin/stable-2.0'
[bpt/guile.git] / doc / ref / api-io.texi
Commit Line Data
07d83abe
MV
1@c -*-texinfo-*-
2@c This is part of the GNU Guile Reference Manual.
c62da8f8 3@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
cdd3d6c9 4@c 2010, 2011, 2013 Free Software Foundation, Inc.
07d83abe
MV
5@c See the file guile.texi for copying conditions.
6
07d83abe
MV
7@node Input and Output
8@section Input and Output
9
10@menu
11* Ports:: The idea of the port abstraction.
12* Reading:: Procedures for reading from a port.
13* Writing:: Procedures for writing to a port.
14* Closing:: Procedures to close a port.
15* Random Access:: Moving around a random access port.
16* Line/Delimited:: Read and write lines or delimited text.
17* Block Reading and Writing:: Reading and writing blocks of text.
18* Default Ports:: Defaults for input, output and errors.
19* Port Types:: Types of port and how to make them.
b242715b 20* R6RS I/O Ports:: The R6RS port API.
07d83abe 21* I/O Extensions:: Using and extending ports in C.
cdd3d6c9 22* BOM Handling:: Handling of Unicode byte order marks.
07d83abe
MV
23@end menu
24
25
26@node Ports
27@subsection Ports
bf5df489 28@cindex Port
07d83abe
MV
29
30Sequential input/output in Scheme is represented by operations on a
31@dfn{port}. This chapter explains the operations that Guile provides
32for working with ports.
33
34Ports are created by opening, for instance @code{open-file} for a file
35(@pxref{File Ports}). Characters can be read from an input port and
36written to an output port, or both on an input/output port. A port
37can be closed (@pxref{Closing}) when no longer required, after which
38any attempt to read or write is an error.
39
40The formal definition of a port is very generic: an input port is
41simply ``an object which can deliver characters on demand,'' and an
42output port is ``an object which can accept characters.'' Because
43this definition is so loose, it is easy to write functions that
44simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
45are two interesting and powerful examples of this technique.
46(@pxref{Soft Ports}, and @ref{String Ports}.)
47
48Ports are garbage collected in the usual way (@pxref{Memory
49Management}), and will be closed at that time if not already closed.
28cc8dac 50In this case any errors occurring in the close will not be reported.
07d83abe
MV
51Usually a program will want to explicitly close so as to be sure all
52its operations have been successful. Of course if a program has
53abandoned something due to an error or other condition then closing
54problems are probably not of interest.
55
56It is strongly recommended that file ports be closed explicitly when
57no longer required. Most systems have limits on how many files can be
58open, both on a per-process and a system-wide basis. A program that
59uses many files should take care not to hit those limits. The same
60applies to similar system resources such as pipes and sockets.
61
62Note that automatic garbage collection is triggered only by memory
63consumption, not by file or other resource usage, so a program cannot
64rely on that to keep it away from system limits. An explicit call to
65@code{gc} can of course be relied on to pick up unreferenced ports.
66If program flow makes it hard to be certain when to close then this
67may be an acceptable way to control resource usage.
68
40296bab
KR
69All file access uses the ``LFS'' large file support functions when
70available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
71read and written on a 32-bit system.
72
28cc8dac
MG
73Each port has an associated character encoding that controls how bytes
74read from the port are converted to characters and strings, and how
75characters and strings written to the port are converted to bytes.
76When ports are created, they inherit their character encoding from the
77current locale, but that can be modified after the port is created.
78
912a8702
MG
79Currently, the ports only work with @emph{non-modal} encodings. Most
80encodings are non-modal, meaning that the conversion of bytes to a
81string doesn't depend on its context: the same byte sequence will always
82return the same string. A couple of modal encodings are in common use,
83like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
84
28cc8dac
MG
85Each port also has an associated conversion strategy: what to do when
86a Guile character can't be converted to the port's encoded character
87representation for output. There are three possible strategies: to
88raise an error, to replace the character with a hex escape, or to
89replace the character with a substitute character.
90
07d83abe
MV
91@rnindex input-port?
92@deffn {Scheme Procedure} input-port? x
93@deffnx {C Function} scm_input_port_p (x)
94Return @code{#t} if @var{x} is an input port, otherwise return
95@code{#f}. Any object satisfying this predicate also satisfies
96@code{port?}.
97@end deffn
98
99@rnindex output-port?
100@deffn {Scheme Procedure} output-port? x
101@deffnx {C Function} scm_output_port_p (x)
102Return @code{#t} if @var{x} is an output port, otherwise return
103@code{#f}. Any object satisfying this predicate also satisfies
104@code{port?}.
105@end deffn
106
107@deffn {Scheme Procedure} port? x
108@deffnx {C Function} scm_port_p (x)
109Return a boolean indicating whether @var{x} is a port.
110Equivalent to @code{(or (input-port? @var{x}) (output-port?
111@var{x}))}.
112@end deffn
113
28cc8dac
MG
114@deffn {Scheme Procedure} set-port-encoding! port enc
115@deffnx {C Function} scm_set_port_encoding_x (port, enc)
4c7b9975
LC
116Sets the character encoding that will be used to interpret all port I/O.
117@var{enc} is a string containing the name of an encoding. Valid
118encoding names are those
119@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
28cc8dac 120@end deffn
d6a6989e
LC
121
122@defvr {Scheme Variable} %default-port-encoding
72b3aa56 123A fluid containing @code{#f} or the name of the encoding to
d6a6989e
LC
124be used by default for newly created ports (@pxref{Fluids and Dynamic
125States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
28cc8dac
MG
126
127New ports are created with the encoding appropriate for the current
4c7b9975
LC
128locale if @code{setlocale} has been called or the value specified by
129this fluid otherwise.
130@end defvr
28cc8dac
MG
131
132@deffn {Scheme Procedure} port-encoding port
5f6ffd66 133@deffnx {C Function} scm_port_encoding (port)
211683cc
MG
134Returns, as a string, the character encoding that @var{port} uses to interpret
135its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
28cc8dac
MG
136@end deffn
137
138@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
139@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
140Sets the behavior of the interpreter when outputting a character that
141is not representable in the port's current encoding. @var{sym} can be
142either @code{'error}, @code{'substitute}, or @code{'escape}. If it is
143@code{'error}, an error will be thrown when a nonconvertible character
144is encountered. If it is @code{'substitute}, then nonconvertible
145characters will be replaced with approximate characters, or with
146question marks if no approximately correct character is available. If
147it is @code{'escape}, the character will appear as a hex escape when output.
148
149If @var{port} is an open port, the conversion error behavior
150is set for that port. If it is @code{#f}, it is set as the
151default behavior for any future ports that get created in
152this thread.
153@end deffn
154
155@deffn {Scheme Procedure} port-conversion-strategy port
156@deffnx {C Function} scm_port_conversion_strategy (port)
157Returns the behavior of the port when outputting a character that is
158not representable in the port's current encoding. It returns the
159symbol @code{error} if unrepresentable characters should cause
160exceptions, @code{substitute} if the port should try to replace
161unrepresentable characters with question marks or approximate
162characters, or @code{escape} if unrepresentable characters should be
163converted to string escapes.
164
165If @var{port} is @code{#f}, then the current default behavior will be
166returned. New ports will have this default behavior when they are
167created.
168@end deffn
169
b22e94db
LC
170@deffn {Scheme Variable} %default-port-conversion-strategy
171The fluid that defines the conversion strategy for newly created ports,
172and for other conversion routines such as @code{scm_to_stringn},
173@code{scm_from_stringn}, @code{string->pointer}, and
174@code{pointer->string}.
175
176Its value must be one of the symbols described above, with the same
177semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
178
179When Guile starts, its value is @code{'substitute}.
180
181Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
182equivalent to @code{(fluid-set! %default-port-conversion-strategy
183@var{sym})}.
184@end deffn
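
As a minimal sketch of the procedures above, the following sets and then
queries the encoding and conversion strategy of a port.  The string port
and the name @code{port} are illustrative assumptions only; any open
port should behave in the same way.

@lisp
;; Illustrative only: a string port stands in for any open port.
(define port (open-output-string))

(set-port-encoding! port "UTF-8")
(port-encoding port)                    ; expected: "UTF-8"

(set-port-conversion-strategy! port 'escape)
(port-conversion-strategy port)         ; expected: the symbol escape
@end lisp

Setting the strategy on the port itself leaves the thread-wide default,
and therefore other ports, unchanged.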
28cc8dac 185
07d83abe
MV
186
187@node Reading
188@subsection Reading
bf5df489 189@cindex Reading
07d83abe
MV
190
191[Generic procedures for reading from ports.]
192
1518f649
AW
193These procedures pertain to reading characters and strings from
194ports. To read general S-expressions from ports, see @ref{Scheme Read}.
195
07d83abe 196@rnindex eof-object?
bf5df489 197@cindex End of file object
07d83abe
MV
198@deffn {Scheme Procedure} eof-object? x
199@deffnx {C Function} scm_eof_object_p (x)
200Return @code{#t} if @var{x} is an end-of-file object; otherwise
201return @code{#f}.
202@end deffn
203
204@rnindex char-ready?
205@deffn {Scheme Procedure} char-ready? [port]
206@deffnx {C Function} scm_char_ready_p (port)
207Return @code{#t} if a character is ready on input @var{port}
208and return @code{#f} otherwise. If @code{char-ready?} returns
209@code{#t} then the next @code{read-char} operation on
210@var{port} is guaranteed not to hang. If @var{port} is a file
211port at end of file then @code{char-ready?} returns @code{#t}.
cdf1ad3b
MV
212
213@code{char-ready?} exists to make it possible for a
07d83abe
MV
214program to accept characters from interactive ports without
215getting stuck waiting for input. Any input editors associated
216with such ports must make sure that characters whose existence
217has been asserted by @code{char-ready?} cannot be rubbed out.
218If @code{char-ready?} were to return @code{#f} at end of file,
219a port at end of file would be indistinguishable from an
cdf1ad3b 220interactive port that has no ready characters.
07d83abe
MV
221@end deffn
222
223@rnindex read-char
224@deffn {Scheme Procedure} read-char [port]
225@deffnx {C Function} scm_read_char (port)
226Return the next character available from @var{port}, updating
227@var{port} to point to the following character. If no more
228characters are available, the end-of-file object is returned.
c62da8f8
LC
229
230When @var{port}'s data cannot be decoded according to its
231character encoding, a @code{decoding-error} is raised and
232@var{port} points past the erroneous byte sequence.
07d83abe
MV
233@end deffn
234
235@deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
236Read up to @var{size} bytes from @var{port} and store them in
237@var{buffer}. The return value is the number of bytes actually read,
238which can be less than @var{size} if end-of-file has been reached.
239
240Note that this function does not update @code{port-line} and
241@code{port-column} below.
242@end deftypefn
243
244@rnindex peek-char
245@deffn {Scheme Procedure} peek-char [port]
246@deffnx {C Function} scm_peek_char (port)
247Return the next character available from @var{port},
248@emph{without} updating @var{port} to point to the following
249character. If no more characters are available, the
cdf1ad3b
MV
250end-of-file object is returned.
251
252The value returned by
07d83abe
MV
253a call to @code{peek-char} is the same as the value that would
254have been returned by a call to @code{read-char} on the same
255port. The only difference is that the very next call to
256@code{read-char} or @code{peek-char} on that @var{port} will
257return the value returned by the preceding call to
258@code{peek-char}. In particular, a call to @code{peek-char} on
259an interactive port will hang waiting for input whenever a call
cdf1ad3b 260to @code{read-char} would have hung.
c62da8f8
LC
261
262As for @code{read-char}, a @code{decoding-error} may be raised
263if such a situation occurs. However, unlike with @code{read-char},
264@var{port} still points at the beginning of the erroneous byte
265sequence when the error is raised.
07d83abe
MV
266@end deffn
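
The relationship between @code{peek-char}, @code{read-char} and
@code{eof-object?} can be sketched as follows.  The helper
@code{port->char-list} is a hypothetical name introduced for this
example, and a string port is assumed as the input source.

@lisp
;; Hypothetical helper: collect every remaining character of PORT.
(define (port->char-list port)
  (let loop ((chars '()))
    (let ((c (read-char port)))
      (if (eof-object? c)
          (reverse chars)
          (loop (cons c chars))))))

(define p (open-input-string "abc"))
(peek-char p)                  ; #\a, the port does not advance
(read-char p)                  ; #\a again, the port now advances
(port->char-list p)            ; (#\b #\c)
(eof-object? (read-char p))    ; #t, nothing is left to read
@end lisp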
267
268@deffn {Scheme Procedure} unread-char cobj [port]
269@deffnx {C Function} scm_unread_char (cobj, port)
64de6db5 270Place character @var{cobj} in @var{port} so that it will be read by the
07d83abe
MV
271next read operation. If called multiple times, the unread characters
272will be read again in last-in first-out order. If @var{port} is
273not supplied, the current input port is used.
274@end deffn
275
276@deffn {Scheme Procedure} unread-string str port
277@deffnx {C Function} scm_unread_string (str, port)
278Place the string @var{str} in @var{port} so that its characters will
279be read from left-to-right as the next characters from @var{port}
280during subsequent read operations. If called multiple times, the
281unread characters will be read again in last-in first-out order. If
9782da8a 282@var{port} is not supplied, the @code{current-input-port} is used.
07d83abe
MV
283@end deffn
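
The last-in first-out behavior described above can be illustrated with a
short sketch.  The port names @code{p} and @code{q} and the use of
string ports are assumptions made for the example.

@lisp
(define p (open-input-string "c"))
(unread-char #\b p)
(unread-char #\a p)      ; unread last, so it is read first
(read-char p)            ; #\a
(read-char p)            ; #\b
(read-char p)            ; #\c, the original content

(define q (open-input-string "!"))
(unread-string "hi" q)   ; characters come back left-to-right
(read-char q)            ; #\h
(read-char q)            ; #\i
(read-char q)            ; #\!
@end lisp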
284
285@deffn {Scheme Procedure} drain-input port
286@deffnx {C Function} scm_drain_input (port)
287This procedure clears a port's input buffers, similar
288to the way that @code{force-output} clears the output buffer. The
289contents of the buffers are returned as a single string, e.g.,
290
291@lisp
292(define p (open-input-file ...))
293(drain-input p) => empty string, nothing buffered yet.
294(unread-char (read-char p) p)
295(drain-input p) => initial chars from p, up to the buffer size.
296@end lisp
297
298Draining the buffers may be useful for cleanly finishing
299buffered I/O so that the file descriptor can be used directly
300for further input.
301@end deffn
302
303@deffn {Scheme Procedure} port-column port
304@deffnx {Scheme Procedure} port-line port
305@deffnx {C Function} scm_port_column (port)
306@deffnx {C Function} scm_port_line (port)
307Return the current column number or line number of @var{port}.
308If the number is
309unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer
310- i.e.@: the first character of the first line is line 0, column 0.
311(However, when you display a file position, for example in an error
312message, we recommend you add 1 to get 1-origin integers. This is
313because lines and column numbers traditionally start with 1, and that is
314what non-programmers will find most natural.)
315@end deffn
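
The recommendation to add 1 when displaying a position might be followed
as in this sketch.  The helper @code{port-position-string} is a
hypothetical name, not part of Guile, and a string port is assumed for
the demonstration.

@lisp
;; Hypothetical helper: format PORT's position as 1-origin "LINE:COLUMN".
(define (port-position-string port)
  (let ((line (port-line port))
        (column (port-column port)))
    (if (and line column)
        (simple-format #f "~A:~A" (+ line 1) (+ column 1))
        "unknown")))

(define p (open-input-string "ab\ncd"))
(port-position-string p)     ; "1:1", nothing has been read yet
(read-char p)
(read-char p)
(read-char p)                ; consumes "ab" and the newline
(port-position-string p)     ; "2:1"
@end lisp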
316
317@deffn {Scheme Procedure} set-port-column! port column
318@deffnx {Scheme Procedure} set-port-line! port line
319@deffnx {C Function} scm_set_port_column_x (port, column)
320@deffnx {C Function} scm_set_port_line_x (port, line)
321Set the current column or line number of @var{port}.
322@end deffn
323
324@node Writing
325@subsection Writing
bf5df489 326@cindex Writing
07d83abe
MV
327
328[Generic procedures for writing to ports.]
329
1518f649
AW
330These procedures are for writing characters and strings to
331ports. For more information on writing arbitrary Scheme objects to
332ports, see @ref{Scheme Write}.
333
07d83abe
MV
334@deffn {Scheme Procedure} get-print-state port
335@deffnx {C Function} scm_get_print_state (port)
336Return the print state of the port @var{port}. If @var{port}
337has no associated print state, @code{#f} is returned.
338@end deffn
339
07d83abe
MV
340@rnindex newline
341@deffn {Scheme Procedure} newline [port]
342@deffnx {C Function} scm_newline (port)
343Send a newline to @var{port}.
344If @var{port} is omitted, send to the current output port.
345@end deffn
346
cdf1ad3b 347@deffn {Scheme Procedure} port-with-print-state port [pstate]
07d83abe
MV
348@deffnx {C Function} scm_port_with_print_state (port, pstate)
349Create a new port which behaves like @var{port}, but with an
cdf1ad3b
MV
350included print state @var{pstate}. @var{pstate} is optional.
351If @var{pstate} isn't supplied and @var{port} already has
352a print state, the old print state is reused.
07d83abe
MV
353@end deffn
354
07d83abe
MV
355@deffn {Scheme Procedure} simple-format destination message . args
356@deffnx {C Function} scm_simple_format (destination, message, args)
357Write @var{message} to @var{destination}, defaulting to
358the current output port.
359@var{message} can contain @code{~A} (was @code{%s}) and
360@code{~S} (was @code{%S}) escapes. When printed,
361the escapes are replaced with corresponding members of
64de6db5 362@var{args}:
07d83abe
MV
363@code{~A} formats using @code{display} and @code{~S} formats
364using @code{write}.
365If @var{destination} is @code{#t}, then use the current output
366port; if @var{destination} is @code{#f}, then return a string
367containing the formatted text. Does not add a trailing newline.
368@end deffn
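
As a brief illustration of the two escapes, the sketch below formats the
same string once with @code{~A} and once with @code{~S}.  The argument
values are arbitrary.

@lisp
;; ~A uses display (no quotes); ~S uses write (quotes and escapes).
(simple-format #f "~A is not ~S" "text" "text")
;; => "text is not \"text\""

;; With #t the text goes to the current output port, without a newline.
(simple-format #t "value: ~A" 42)
@end lisp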
369
370@rnindex write-char
371@deffn {Scheme Procedure} write-char chr [port]
372@deffnx {C Function} scm_write_char (chr, port)
373Send character @var{chr} to @var{port}.
374@end deffn
375
376@deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
377Write @var{size} bytes at @var{buffer} to @var{port}.
378
379Note that this function does not update @code{port-line} and
380@code{port-column} (@pxref{Reading}).
381@end deftypefn
382
383@findex fflush
384@deffn {Scheme Procedure} force-output [port]
385@deffnx {C Function} scm_force_output (port)
386Flush the specified output port, or the current output port if @var{port}
387is omitted. The current output buffer contents are passed to the
388underlying port implementation (e.g., in the case of fports, the
389data will be written to the file and the output buffer will be cleared).
390It has no effect on an unbuffered port.
391
392The return value is unspecified.
393@end deffn
394
395@deffn {Scheme Procedure} flush-all-ports
396@deffnx {C Function} scm_flush_all_ports ()
397Equivalent to calling @code{force-output} on
398all open output ports. The return value is unspecified.
399@end deffn
400
401
402@node Closing
403@subsection Closing
bf5df489
KR
404@cindex Closing ports
405@cindex Port, close
07d83abe
MV
406
407@deffn {Scheme Procedure} close-port port
408@deffnx {C Function} scm_close_port (port)
409Close the specified port object. Return @code{#t} if it
410successfully closes a port or @code{#f} if it was already
411closed. An exception may be raised if an error occurs, for
412example when flushing buffered output. See also @ref{Ports and
413File Descriptors, close}, for a procedure which can close file
414descriptors.
415@end deffn
416
417@deffn {Scheme Procedure} close-input-port port
418@deffnx {Scheme Procedure} close-output-port port
419@deffnx {C Function} scm_close_input_port (port)
420@deffnx {C Function} scm_close_output_port (port)
421@rnindex close-input-port
422@rnindex close-output-port
423Close the specified input or output @var{port}. An exception may be
424raised if an error occurs while closing. If @var{port} is already
425closed, nothing is done. The return value is unspecified.
426
427See also @ref{Ports and File Descriptors, close}, for a procedure
428which can close file descriptors.
429@end deffn
430
431@deffn {Scheme Procedure} port-closed? port
432@deffnx {C Function} scm_port_closed_p (port)
433Return @code{#t} if @var{port} is closed or @code{#f} if it is
434open.
435@end deffn
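
A minimal sketch of explicit closing, assuming a string port named
@code{p} purely for illustration:

@lisp
(define p (open-input-string "data"))
(port-closed? p)     ; #f, the port is still open
(close-port p)
(port-closed? p)     ; #t
(close-port p)       ; the port is already closed; no error is expected
@end lisp

For file ports the same pattern applies, and closing explicitly also
releases the underlying file descriptor promptly.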
436
437
438@node Random Access
439@subsection Random Access
bf5df489
KR
440@cindex Random access, ports
441@cindex Port, random access
07d83abe
MV
442
443@deffn {Scheme Procedure} seek fd_port offset whence
444@deffnx {C Function} scm_seek (fd_port, offset, whence)
64de6db5 445Sets the current position of @var{fd_port} to the integer
07d83abe
MV
446@var{offset}, which is interpreted according to the value of
447@var{whence}.
448
449One of the following variables should be supplied for
450@var{whence}:
451@defvar SEEK_SET
452Seek from the beginning of the file.
453@end defvar
454@defvar SEEK_CUR
455Seek from the current position.
456@end defvar
457@defvar SEEK_END
458Seek from the end of the file.
459@end defvar
64de6db5 460If @var{fd_port} is a file descriptor, the underlying system
07d83abe
MV
461call is @code{lseek}. @var{fd_port} may also be a string port.
462
463The value returned is the new position in the file. This means
464that the current position of a port can be obtained using:
465@lisp
466(seek port 0 SEEK_CUR)
467@end lisp
468@end deffn
469
470@deffn {Scheme Procedure} ftell fd_port
471@deffnx {C Function} scm_ftell (fd_port)
472Return an integer representing the current position of
64de6db5 473@var{fd_port}, measured from the beginning. Equivalent to:
07d83abe
MV
474
475@lisp
476(seek fd_port 0 SEEK_CUR)
477@end lisp
478@end deffn
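
Since a string port is accepted, repositioning can be tried without
touching the file system.  This sketch assumes a string port named
@code{p} holding only ASCII characters, so that offsets and character
counts coincide.

@lisp
(define p (open-input-string "abcdef"))
(seek p 2 SEEK_SET)      ; move to offset 2 from the beginning
(read-char p)            ; #\c
(seek p 0 SEEK_CUR)      ; query the current position, like (ftell p)
@end lisp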
479
480@findex truncate
481@findex ftruncate
40296bab
KR
482@deffn {Scheme Procedure} truncate-file file [length]
483@deffnx {C Function} scm_truncate_file (file, length)
484Truncate @var{file} to @var{length} bytes. @var{file} can be a
485filename string, a port object, or an integer file descriptor. The
486return value is unspecified.
487
488For a port or file descriptor @var{length} can be omitted, in which
489case the file is truncated at the current position (per @code{ftell}
490above).
491
492On most systems a file can be extended by giving a length greater than
493the current size, but this is not mandatory in the POSIX standard.
07d83abe
MV
494@end deffn
495
496@node Line/Delimited
497@subsection Line Oriented and Delimited Text
bf5df489
KR
498@cindex Line input/output
499@cindex Port, line input/output
07d83abe
MV
500
501The delimited-I/O module can be accessed with:
502
aba0dff5 503@lisp
07d83abe 504(use-modules (ice-9 rdelim))
aba0dff5 505@end lisp
07d83abe
MV
506
507It can be used to read or write lines of text, or read text delimited by
508a specified set of characters. It's similar to the @code{(scsh rdelim)}
509module from guile-scsh, but does not use multiple values or character
510sets and has an extra procedure @code{write-line}.
511
512@c begin (scm-doc-string "rdelim.scm" "read-line")
513@deffn {Scheme Procedure} read-line [port] [handle-delim]
514Return a line of text from @var{port} if specified, otherwise from the
515value returned by @code{(current-input-port)}. Under Unix, a line of text
516is terminated by the first end-of-line character or by end-of-file.
517
518If @var{handle-delim} is specified, it should be one of the following
519symbols:
520@table @code
521@item trim
522Discard the terminating delimiter. This is the default, but it will
523be impossible to tell whether the read terminated with a delimiter or
524end-of-file.
525@item concat
526Append the terminating delimiter (if any) to the returned string.
527@item peek
528Push the terminating delimiter (if any) back on to the port.
529@item split
530Return a pair containing the string read from the port and the
531terminating delimiter or end-of-file object.
532@end table
c62da8f8
LC
533
534Like @code{read-char}, this procedure can throw to @code{decoding-error}
535(@pxref{Reading, @code{read-char}}).
07d83abe
MV
536@end deffn
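
A typical use of @code{read-line} is a loop that stops at the
end-of-file object.  In the sketch below, @code{read-all-lines} is a
hypothetical helper written for this example, and string ports stand in
for real input.

@lisp
(use-modules (ice-9 rdelim))

;; Hypothetical helper: return all lines of PORT as a list of strings.
(define (read-all-lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons line lines))))))

(call-with-input-string "one\ntwo" read-all-lines)
;; => ("one" "two")

;; With 'split, the delimiter that ended the line is returned as well.
(call-with-input-string "one\ntwo"
  (lambda (port) (read-line port 'split)))
;; => ("one" . #\newline)
@end lisp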
537
538@c begin (scm-doc-string "rdelim.scm" "read-line!")
539@deffn {Scheme Procedure} read-line! buf [port]
540Read a line of text into the supplied string @var{buf} and return the
541number of characters added to @var{buf}. If @var{buf} is filled, then
542@code{#f} is returned.
543Read from @var{port} if
544specified, otherwise from the value returned by @code{(current-input-port)}.
545@end deffn
546
547@c begin (scm-doc-string "rdelim.scm" "read-delimited")
548@deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
549Read text until one of the characters in the string @var{delims} is found
550or end-of-file is reached. Read from @var{port} if supplied, otherwise
551from the value returned by @code{(current-input-port)}.
552@var{handle-delim} takes the same values as described for @code{read-line}.
553@end deffn
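
For instance, fields separated by either a comma or a semicolon could be
read one at a time as sketched here.  The input string and the port name
@code{p} are illustrative assumptions.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "a,b;c"))
(read-delimited ",;" p)    ; "a", the comma is discarded
(read-delimited ",;" p)    ; "b", the semicolon is discarded
(read-delimited ",;" p)    ; "c", ended by end-of-file
(read-delimited ",;" p)    ; the end-of-file object
@end lisp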
554
555@c begin (scm-doc-string "rdelim.scm" "read-delimited!")
556@deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
e7fb779f
AW
557Read text into the supplied string @var{buf}.
558
559If a delimiter was found, return the number of characters written,
560except if @var{handle-delim} is @code{split}, in which case the return
561value is a pair, as noted above.
562
563As a special case, if @var{port} was already at end-of-stream, the EOF
564object is returned. Also, if no characters were written because the
565buffer was full, @code{#f} is returned.
566
567It's something of a wacky interface, to be honest.
07d83abe
MV
568@end deffn
569
570@deffn {Scheme Procedure} write-line obj [port]
571@deffnx {C Function} scm_write_line (obj, port)
572Display @var{obj} and a newline character to @var{port}. If
573@var{port} is not specified, @code{(current-output-port)} is
574used. This function is equivalent to:
575@lisp
576(display obj [port])
577(newline [port])
578@end lisp
579@end deffn
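
A short sketch of @code{write-line}, capturing the output in a string
so that the added newlines are visible.  The use of
@code{call-with-output-string} is a choice made for this example only.

@lisp
(use-modules (ice-9 rdelim))

(call-with-output-string
  (lambda (port)
    (write-line "first" port)
    (write-line 'second port)))
;; => "first\nsecond\n"
@end lisp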
580
5a35d42a
AW
581In the past, Guile did not have a procedure that would just read out all
582of the characters from a port. As a workaround, many people just called
583@code{read-delimited} with no delimiters, knowing that would produce the
584behavior they wanted. This prompted Guile developers to add some
585routines that would read all characters from a port. So it is that
586@code{(ice-9 rdelim)} is also the home for procedures that can read
587undelimited text:
588
589@deffn {Scheme Procedure} read-string [port] [count]
590Read all of the characters out of @var{port} and return them as a
591string. If the @var{count} is present, treat it as a limit to the
592number of characters to read.
593
594By default, read from the current input port, with no size limit on the
595result. This procedure always returns a string, even if no characters
596were read.
597@end deffn
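
The effect of the optional @var{count} can be seen in the following
sketch, which assumes a Guile version whose @code{(ice-9 rdelim)}
provides @code{read-string} and uses a string port named @code{p} for
illustration.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "hello world"))
(read-string p 5)     ; "hello", at most five characters
(read-string p)       ; " world", everything that remains
(read-string p)       ; "", still a string although nothing was read
@end lisp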
598
599@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
600Fill @var{buf} with characters read from @var{port}, defaulting to the
601current input port. Return the number of characters read.
602
603If @var{start} or @var{end} are specified, store data only into the
604substring of @var{buf} bounded by @var{start} and @var{end} (which
605default to the beginning and end of the string, respectively).
606@end deffn
607
28cc8dac 608Some of the aforementioned I/O functions rely on the following C
07d83abe
MV
609primitives. These will mainly be of interest to people hacking Guile
610internals.
611
612@deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
613@deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
614Read characters from @var{port} into @var{str} until one of the
615characters in the @var{delims} string is encountered. If
616@var{gobble} is true, discard the delimiter character;
617otherwise, leave it in the input stream for the next read. If
618@var{port} is not specified, use the value of
619@code{(current-input-port)}. If @var{start} or @var{end} are
620specified, store data only into the substring of @var{str}
621bounded by @var{start} and @var{end} (which default to the
622beginning and end of the string, respectively).
623
624Return a pair consisting of the delimiter that terminated the
625string and the number of characters read. If reading stopped
626at the end of file, the delimiter returned is the
627@var{eof-object}; if the string was filled without encountering
628a delimiter, this value is @code{#f}.
629@end deffn
630
631@deffn {Scheme Procedure} %read-line [port]
632@deffnx {C Function} scm_read_line (port)
633Read a newline-terminated line from @var{port}, allocating storage as
634necessary. The newline terminator (if any) is removed from the string,
635and a pair consisting of the line and its delimiter is returned. The
636delimiter may be either a newline or the @var{eof-object}; if
637@code{%read-line} is called at the end of file, it returns the pair
638@code{(#<eof> . #<eof>)}.
639@end deffn
640
641@node Block Reading and Writing
642@subsection Block reading and writing
bf5df489
KR
643@cindex Block read/write
644@cindex Port, block read/write
07d83abe
MV
645
646The Block-string-I/O module can be accessed with:
647
aba0dff5 648@lisp
07d83abe 649(use-modules (ice-9 rw))
aba0dff5 650@end lisp
07d83abe
MV
651
652It currently contains procedures that help to implement the
653@code{(scsh rw)} module in guile-scsh.
654
655@deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
656@deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
657Read characters from a port or file descriptor into a
658string @var{str}. A port must have an underlying file
659descriptor --- a so-called fport. This procedure is
660scsh-compatible and can efficiently read large strings.
661It will:
662
663@itemize
664@item
665attempt to fill the entire string, unless the @var{start}
666and/or @var{end} arguments are supplied; i.e., @var{start}
667defaults to 0 and @var{end} defaults to
668@code{(string-length str)}.
669@item
670use the current input port if @var{port_or_fdes} is not
671supplied.
672@item
673return fewer than the requested number of characters in some
674cases, e.g., on end of file, if interrupted by a signal, or if
675not all the characters are immediately available.
676@item
677wait indefinitely for some input if no characters are
678currently available,
679unless the port is in non-blocking mode.
680@item
681read characters from the port's input buffers if available,
682instead of from the underlying file descriptor.
683@item
684return @code{#f} if end-of-file is encountered before reading
685any characters, otherwise return the number of characters
686read.
687@item
688return 0 if the port is in non-blocking mode and no characters
689are immediately available.
690@item
691return 0 if the request is for 0 bytes, with no
692end-of-file check.
693@end itemize
694@end deffn
695
696@deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
697@deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
698Write characters from a string @var{str} to a port or file
699descriptor. A port must have an underlying file descriptor
700--- a so-called fport. This procedure is
701scsh-compatible and can efficiently write large strings.
702It will:
703
704@itemize
705@item
706attempt to write the entire string, unless the @var{start}
707and/or @var{end} arguments are supplied; i.e., @var{start}
708defaults to 0 and @var{end} defaults to
709@code{(string-length str)}.
710@item
711use the current output port if @var{port_or_fdes} is not
712supplied.
713@item
714in the case of a buffered port, store the characters in the
715port's output buffer, if all will fit. If they will not fit
716then any existing buffered characters will be flushed
717before attempting
718to write the new characters directly to the underlying file
719descriptor. If the port is in non-blocking mode and
720buffered characters cannot be flushed immediately, then an
721@code{EAGAIN} system-error exception will be raised. (Note:
722scsh does not support the use of non-blocking buffered ports.)
723@item
724write fewer than the requested number of
725characters in some cases, e.g., if interrupted by a signal or
726if not all of the output can be accepted immediately.
727@item
728wait indefinitely for at least one character
729from @var{str} to be accepted by the port, unless the port is
730in non-blocking mode.
731@item
732return the number of characters accepted by the port.
733@item
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
@item
return 0 immediately if the request size is 0 bytes.
738@end itemize
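
Because fewer characters than requested may be written, callers that
need the whole string written usually loop. The following is only a
sketch: @code{write-all} is a hypothetical helper, not part of Guile,
and it assumes a blocking file port, since on a non-blocking port a
return value of 0 would make the loop spin.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: keep calling write-string/partial until
;; every character of STR has been accepted by PORT.
(define (write-all str port)
  (let loop ((start 0))
    (if (< start (string-length str))
        ;; Each call returns the number of characters accepted,
        ;; which may be fewer than requested.
        (loop (+ start (write-string/partial str port start))))))
@end lisp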
739@end deffn
740
741@node Default Ports
742@subsection Default Ports for Input, Output and Errors
743@cindex Default ports
744@cindex Port, default
745
746@rnindex current-input-port
747@deffn {Scheme Procedure} current-input-port
748@deffnx {C Function} scm_current_input_port ()
@cindex standard input
Return the current input port. This is the default port used
by many input procedures.
752
753Initially this is the @dfn{standard input} in Unix and C terminology.
754When the standard input is a tty the port is unbuffered, otherwise
755it's fully buffered.
756
Unbuffered input is useful if an application runs an interactive
subprocess, since type-ahead input then does not end up in Guile's
buffer, where it would be unavailable to the subprocess.

Note that Guile buffering is completely separate from the tty ``line
discipline''. In the usual cooked mode on a tty, Guile only sees a
line of input once the user presses @key{Return}.
764@end deffn
765
766@rnindex current-output-port
767@deffn {Scheme Procedure} current-output-port
768@deffnx {C Function} scm_current_output_port ()
@cindex standard output
Return the current output port. This is the default port used
by many output procedures.
772
773Initially this is the @dfn{standard output} in Unix and C terminology.
774When the standard output is a tty this port is unbuffered, otherwise
775it's fully buffered.
776
777Unbuffered output to a tty is good for ensuring progress output or a
778prompt is seen. But an application which always prints whole lines
779could change to line buffered, or an application with a lot of output
780could go fully buffered and perhaps make explicit @code{force-output}
781calls (@pxref{Writing}) at selected points.
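
As a sketch of the alternatives just mentioned, and assuming the
@code{setvbuf} interface of the Guile 2.0 series with its
@code{_IOLBF} and @code{_IOFBF} constants (@pxref{Ports and File
Descriptors, setvbuf}), an application might write:

@lisp
;; Line buffered: suitable when only whole lines are printed.
(setvbuf (current-output-port) _IOLBF)
(display "one whole line\n")

;; Fully buffered, with explicit flushes at selected points.
(setvbuf (current-output-port) _IOFBF)
(display "Working...")
(force-output)            ; make sure the progress message is seen
@end lisp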
782@end deffn
783
784@deffn {Scheme Procedure} current-error-port
785@deffnx {C Function} scm_current_error_port ()
@cindex standard error output
Return the port to which errors and warnings should be sent.
788
789Initially this is the @dfn{standard error} in Unix and C terminology.
790When the standard error is a tty this port is unbuffered, otherwise
791it's fully buffered.
792@end deffn
793
794@deffn {Scheme Procedure} set-current-input-port port
795@deffnx {Scheme Procedure} set-current-output-port port
796@deffnx {Scheme Procedure} set-current-error-port port
797@deffnx {C Function} scm_set_current_input_port (port)
798@deffnx {C Function} scm_set_current_output_port (port)
799@deffnx {C Function} scm_set_current_error_port (port)
800Change the ports returned by @code{current-input-port},
801@code{current-output-port} and @code{current-error-port}, respectively,
802so that they use the supplied @var{port} for input or output.
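
For instance, output can be redirected into a string port and the
original port restored afterwards. This is a minimal sketch; the names
@code{saved} and @code{sp} are only illustrative.

@lisp
(define saved (current-output-port))
(define sp (open-output-string))

(set-current-output-port sp)
(display "captured")              ; goes to sp, not to the terminal
(set-current-output-port saved)   ; restore the previous port

(get-output-string sp) @result{} "captured"
@end lisp

Nothing in this sketch restores the previous port if a non-local exit
happens between the two @code{set-current-output-port} calls.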
803@end deffn
804
805@deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
806@deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
807@deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
These functions must be used inside a pair of calls to
@code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
Wind}). During the dynwind context, the indicated port is set to
@var{port}.

More precisely, the current port is swapped with a `backup' value
whenever the dynwind context is entered or left. The backup value is
initialized with the @var{port} argument.
816@end deftypefn
817
818@node Port Types
819@subsection Types of Port
820@cindex Types of ports
821@cindex Port, types
822
823[Types of port; how to make them.]
824
825@menu
826* File Ports:: Ports on an operating system file.
827* String Ports:: Ports on a Scheme string.
828* Soft Ports:: Ports on arbitrary Scheme procedures.
829* Void Ports:: Ports on nothing at all.
830@end menu
831
832
833@node File Ports
834@subsubsection File Ports
835@cindex File port
836@cindex Port, file
837
838The following procedures are used to open file ports.
839See also @ref{Ports and File Descriptors, open}, for an interface
840to the Unix @code{open} system call.
841
842Most systems have limits on how many files can be open, so it's
843strongly recommended that file ports be closed explicitly when no
844longer required (@pxref{Ports}).
845
846@deffn {Scheme Procedure} open-file filename mode @
847 [#:guess-encoding=#f] [#:encoding=#f]
848@deffnx {C Function} scm_open_file_with_encoding @
849 (filename, mode, guess_encoding, encoding)
850@deffnx {C Function} scm_open_file (filename, mode)
851Open the file whose name is @var{filename}, and return a port
852representing that file. The attributes of the port are
853determined by the @var{mode} string. The way in which this is
854interpreted is similar to C stdio. The first character must be
855one of the following:
857@table @samp
858@item r
859Open an existing file for input.
860@item w
861Open a file for output, creating it if it doesn't already exist
862or removing its contents if it does.
863@item a
864Open a file for output, creating it if it doesn't already
865exist. All writes to the port will go to the end of the file.
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
868@end table
The following additional characters can be appended:
872@table @samp
873@item +
874Open the port for both input and output. E.g., @code{r+}: open
875an existing file for both input and output.
876@item 0
Create an ``unbuffered'' port. In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering. This is likely to
slow down I/O operations. The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
883@item l
884Add line-buffering to the port. The port output buffer will be
885automatically flushed whenever a newline character is written.
@item b
Use binary mode, ensuring that each byte in the file will be read as one
Scheme character.

To provide this property, the file will be opened with the 8-bit
character encoding ``ISO-8859-1'', ignoring the default port encoding.
@xref{Ports}, for more information on port encodings.
893
894Note that while it is possible to read and write binary data as
895characters or strings, it is usually better to treat bytes as octets,
896and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
897@ref{R6RS Binary Output}, for more.
898
899This option had another historical meaning, for DOS compatibility: in
900the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
901The @code{b} flag prevents this from happening, adding @code{O_BINARY}
902to the underlying @code{open} call. Still, the flag is generally useful
903because of its port encoding ramifications.
@end table
906Unless binary mode is requested, the character encoding of the new port
907is determined as follows: First, if @var{guess-encoding} is true, the
908@code{file-encoding} procedure is used to guess the encoding of the file
909(@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
910is false or if @code{file-encoding} fails, @var{encoding} is used unless
911it is also false. As a last resort, the default port encoding is used.
912@xref{Ports}, for more information on port encodings. It is an error to
913pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
914is requested.
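
The precedence described above might be exercised as follows. This is
only a sketch, and the file names are hypothetical.

@lisp
;; Use UTF-8 regardless of the default port encoding.
(define p1 (open-file "notes.txt" "r" #:encoding "UTF-8"))
(port-encoding p1) @result{} "UTF-8"

;; Look for a coding declaration in the file first; if none is
;; found, use ISO-8859-15.
(define p2 (open-file "legacy.txt" "r"
                      #:guess-encoding #t
                      #:encoding "ISO-8859-15"))
@end lisp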
915
916If a file cannot be opened with the access requested, @code{open-file}
917throws an exception.
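
A caller that wants to recover from such a failure can catch the
exception. The sketch below assumes that the failure is reported with
the @code{system-error} key, as for other system call failures;
@code{try-open} is a hypothetical helper.

@lisp
;; Hypothetical helper: return a port for FILE, or #f if the file
;; cannot be opened for input.
(define (try-open file)
  (catch 'system-error
    (lambda ()
      (open-file file "r"))
    (lambda args
      ;; system-error-errno extracts the errno value from the
      ;; handler's argument list.
      (format #t "cannot open ~a: ~a~%"
              file (strerror (system-error-errno args)))
      #f)))
@end lisp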
092bdcc4 918
919When the file is opened, its encoding is set to the current
920@code{%default-port-encoding}, unless the @code{b} flag was supplied.
921Sometimes it is desirable to honor Emacs-style coding declarations in
922files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This
923behavior was deemed inappropriate and disabled starting from Guile
9242.0.8.}. When that is the case, the @code{file-encoding} procedure can
925be used as follows (@pxref{Character Encoding of Source Files,
926@code{file-encoding}}):
927
@example
(let* ((port (open-input-file file))
       (encoding (file-encoding port)))
  ;; Use the declared encoding if one was found; otherwise keep the
  ;; encoding the port already has.
  (set-port-encoding! port (or encoding (port-encoding port)))
  port)
@end example
934In theory we could create read/write ports which were buffered
935in one direction only. However this isn't included in the
092bdcc4 936current interfaces.
07d83abe
MV
937@end deffn
938
939@rnindex open-input-file
940@deffn {Scheme Procedure} open-input-file filename @
941 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
942
943Open @var{filename} for input. If @var{binary} is true, open the port
944in binary mode, otherwise use text mode. @var{encoding} and
945@var{guess-encoding} determine the character encoding as described above
946for @code{open-file}. Equivalent to
@lisp
(open-file @var{filename}
           (if @var{binary} "rb" "r")
           #:guess-encoding @var{guess-encoding}
           #:encoding @var{encoding})
@end lisp
953@end deffn
954
955@rnindex open-output-file
956@deffn {Scheme Procedure} open-output-file filename @
957 [#:encoding=#f] [#:binary=#f]
958
959Open @var{filename} for output. If @var{binary} is true, open the port
960in binary mode, otherwise use text mode. @var{encoding} specifies the
961character encoding as described above for @code{open-file}. Equivalent
962to
@lisp
(open-file @var{filename}
           (if @var{binary} "wb" "w")
           #:encoding @var{encoding})
@end lisp
968@end deffn
969
970@deffn {Scheme Procedure} call-with-input-file filename proc @
971 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
972@deffnx {Scheme Procedure} call-with-output-file filename proc @
973 [#:encoding=#f] [#:binary=#f]
974@rnindex call-with-input-file
975@rnindex call-with-output-file
Open @var{filename} for input or output, and call @code{(@var{proc}
port)} with the resulting port. Return the value returned by
@var{proc}. @var{filename} is opened as per @code{open-input-file} or
@code{open-output-file} respectively, and an error is signaled if it
cannot be opened.

When @var{proc} returns, the port is closed. If @var{proc} does not
return (e.g.@: if it throws an error), then the port might not be
closed automatically, though it will be garbage collected in the usual
way if not otherwise referenced.
986@end deffn
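
As an illustration of @code{call-with-output-file} and
@code{call-with-input-file}, the sketch below writes a small file and
reads its first line back. @code{first-line} is a hypothetical helper,
the file name is arbitrary, and @code{read-line} comes from the
@code{(ice-9 rdelim)} module (@pxref{Line/Delimited}).

@lisp
(use-modules (ice-9 rdelim))

;; Hypothetical helper: return the first line of FILE as a string,
;; or the end-of-file object if FILE is empty.
(define (first-line file)
  (call-with-input-file file read-line))

;; Write a small file; the port is closed when the lambda returns.
(call-with-output-file "greeting.txt"
  (lambda (port)
    (display "hello\nworld\n" port)))

(first-line "greeting.txt") @result{} "hello"
@end lisp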
987
988@deffn {Scheme Procedure} with-input-from-file filename thunk @
989 [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
990@deffnx {Scheme Procedure} with-output-to-file filename thunk @
991 [#:encoding=#f] [#:binary=#f]
992@deffnx {Scheme Procedure} with-error-to-file filename thunk @
993 [#:encoding=#f] [#:binary=#f]
994@rnindex with-input-from-file
995@rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as, respectively, the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}. Return the
value returned by @var{thunk}. @var{filename} is opened as per
@code{open-input-file} or @code{open-output-file} respectively, and an
error is signaled if it cannot be opened.
1002
1003When @var{thunk} returns, the port is closed and the previous setting
1004of the respective current port is restored.
1005
The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: an
exception), and if @var{thunk} is re-entered (via a captured
continuation) then it's set again to the @var{filename} port.
1010
1011The port is closed when @var{thunk} returns normally, but not when
1012exited via an exception or new continuation. This ensures it's still
1013ready for use if @var{thunk} is re-entered by a captured continuation.
1014Of course the port is always garbage collected and closed in the usual
1015way when no longer referenced anywhere.
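
A short sketch of both directions follows; the file name is
hypothetical.

@lisp
;; Everything written to the current output port inside the thunk
;; ends up in "report.scm".
(with-output-to-file "report.scm"
  (lambda ()
    (write '(total 42))
    (newline)))

;; Inside this thunk the file is the current input port, so
;; read needs no port argument.
(with-input-from-file "report.scm"
  (lambda ()
    (read)))
@result{} (total 42)
@end lisp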
1016@end deffn
1017
1018@deffn {Scheme Procedure} port-mode port
1019@deffnx {C Function} scm_port_mode (port)
1020Return the port modes associated with the open port @var{port}.
1021These will not necessarily be identical to the modes used when
1022the port was opened, since modes such as "append" which are
1023used only during port creation are not retained.
1024@end deffn
1025
1026@deffn {Scheme Procedure} port-filename port
1027@deffnx {C Function} scm_port_filename (port)
1028Return the filename associated with @var{port}, or @code{#f} if no
1029filename is associated with the port.
1030
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
1033@end deffn
1034
1035@deffn {Scheme Procedure} set-port-filename! port filename
1036@deffnx {C Function} scm_set_port_filename_x (port, filename)
1037Change the filename associated with @var{port}, using the current input
1038port if none is specified. Note that this does not change the port's
1039source of data, but only the value that is returned by
1040@code{port-filename} and reported in diagnostic output.
1041@end deffn
1042
1043@deffn {Scheme Procedure} file-port? obj
1044@deffnx {C Function} scm_file_port_p (obj)
1045Determine whether @var{obj} is a port that is related to a file.
1046@end deffn
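
The following sketch shows these predicates and accessors on a file
port and on a string port. The path is only an example and is assumed
to exist.

@lisp
(define p (open-input-file "/etc/hostname"))

(file-port? p)                       @result{} #t
(port-filename p)                    @result{} "/etc/hostname"
(file-port? (open-input-string ""))  @result{} #f

;; port-filename must be called before the port is closed.
(close-port p)
@end lisp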
1047
1048
1049@node String Ports
1050@subsubsection String Ports
1051@cindex String port
1052@cindex Port, string

The following procedures allow string ports to be opened by analogy to
R4RS file port facilities.

With string ports, the port encoding is treated differently than for
other types of ports. When string ports are created, they do not inherit
a character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
Typically one should not modify a string port's character encoding
away from its default.

1064@deffn {Scheme Procedure} call-with-output-string proc
1065@deffnx {C Function} scm_call_with_output_string (proc)
1066Calls the one-argument procedure @var{proc} with a newly created output
1067port. When the function returns, the string composed of the characters
1068written into the port is returned. @var{proc} should not close the port.
1069@end deffn
1070
1071@deffn {Scheme Procedure} call-with-input-string string proc
1072@deffnx {C Function} scm_call_with_input_string (string, proc)
1073Calls the one-argument procedure @var{proc} with a newly
1074created input port from which @var{string}'s contents may be
1075read. The value yielded by the @var{proc} is returned.
1076@end deffn
1077
1078@deffn {Scheme Procedure} with-output-to-string thunk
1079Calls the zero-argument procedure @var{thunk} with the current output
1080port set temporarily to a new string port. It returns a string
1081composed of the characters written to the current output.
1082@end deffn
1083
1084@deffn {Scheme Procedure} with-input-from-string string thunk
1085Calls the zero-argument procedure @var{thunk} with the current input
1086port set temporarily to a string port opened on the specified
1087@var{string}. The value yielded by @var{thunk} is returned.
1088@end deffn
1089
1090@deffn {Scheme Procedure} open-input-string str
1091@deffnx {C Function} scm_open_input_string (str)
1092Take a string and return an input port that delivers characters
1093from the string. The port can be closed by
1094@code{close-input-port}, though its storage will be reclaimed
1095by the garbage collector if it becomes inaccessible.
1096@end deffn
1097
1098@deffn {Scheme Procedure} open-output-string
1099@deffnx {C Function} scm_open_output_string ()
1100Return an output port that will accumulate characters for
1101retrieval by @code{get-output-string}. The port can be closed
1102by the procedure @code{close-output-port}, though its storage
1103will be reclaimed by the garbage collector if it becomes
1104inaccessible.
1105@end deffn
1106
1107@deffn {Scheme Procedure} get-output-string port
1108@deffnx {C Function} scm_get_output_string (port)
1109Given an output port created by @code{open-output-string},
1110return a string consisting of the characters that have been
1111output to the port so far.
1112
@code{get-output-string} must be used before closing @var{port}; once
the port is closed, the string cannot be obtained.
1115@end deffn
1116
1117A string port can be used in many procedures which accept a port
1118but which are not dependent on implementation details of fports.
1119E.g., seeking and truncating will work on a string port,
1120but trying to extract the file descriptor number will fail.
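
The string port procedures above can be sketched together as follows.

@lisp
;; Collect everything PROC writes to the port into a string.
(call-with-output-string
  (lambda (port)
    (write '(1 2 3) port)))
@result{} "(1 2 3)"

;; Read from a string as if it were the current input port.
(with-input-from-string "42 foo" read)
@result{} 42

;; Accumulate output explicitly and fetch it before closing.
(define out (open-output-string))
(display "abc" out)
(get-output-string out)
@result{} "abc"
(close-port out)
@end lisp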
1121
1122
1123@node Soft Ports
1124@subsubsection Soft Ports
1125@cindex Soft port
1126@cindex Port, soft
1127
1128A @dfn{soft-port} is a port based on a vector of procedures capable of
1129accepting or delivering characters. It allows emulation of I/O ports.
1130
1131@deffn {Scheme Procedure} make-soft-port pv modes
1132@deffnx {C Function} scm_make_soft_port (pv, modes)
1133Return a port capable of receiving or delivering characters as
1134specified by the @var{modes} string (@pxref{File Ports,
1135open-file}). @var{pv} must be a vector of length 5 or 6. Its
1136components are as follows:
1137
1138@enumerate 0
1139@item
1140procedure accepting one character for output
1141@item
1142procedure accepting a string for output
1143@item
1144thunk for flushing output
1145@item
1146thunk for getting one character
1147@item
1148thunk for closing port (not by garbage collection)
1149@item
1150(if present and not @code{#f}) thunk for computing the number of
1151characters that can be read from the port without blocking.
1152@end enumerate
1153
1154For an output-only port only elements 0, 1, 2, and 4 need be
1155procedures. For an input-only port only elements 3 and 4 need
1156be procedures. Thunks 2 and 4 can instead be @code{#f} if
1157there is no useful operation for them to perform.
1158
1159If thunk 3 returns @code{#f} or an @code{eof-object}
1160(@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
1161Scheme}) it indicates that the port has reached end-of-file.
1162For example:
1163
@lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))

(write p p) @result{} #<input-output: soft 8081e20>
@end lisp
1177@end deffn
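
An output-only soft port can be sketched along the same lines. The
names @code{chars} and @code{collector} are hypothetical; elements 2, 3
and 4 of the vector are @code{#f} here because this port has nothing to
flush, read or close.

@lisp
;; Characters written to the port, most recent first.
(define chars '())

(define collector
  (make-soft-port
   (vector
    ;; Element 0: accept one character.
    (lambda (c) (set! chars (cons c chars)))
    ;; Element 1: accept a string.
    (lambda (s)
      (set! chars (append (reverse (string->list s)) chars)))
    #f      ; element 2: no flush action
    #f      ; element 3: not needed for an output-only port
    #f)     ; element 4: no close action
   "w"))

(display "hi" collector)
(force-output collector)

;; The characters written so far, in the order they were written.
(list->string (reverse chars))
@end lisp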
1178
1179
1180@node Void Ports
1181@subsubsection Void Ports
1182@cindex Void port
1183@cindex Port, void
1184
1185This kind of port causes any data to be discarded when written to, and
1186always returns the end-of-file object when read from.
1187
1188@deffn {Scheme Procedure} %make-void-port mode
1189@deffnx {C Function} scm_sys_make_void_port (mode)
1190Create and return a new void port. A void port acts like
1191@file{/dev/null}. The @var{mode} argument
1192specifies the input/output modes for this port: see the
1193documentation for @code{open-file} in @ref{File Ports}.
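
A brief sketch of both directions; the names @code{sink} and
@code{empty} are only illustrative.

@lisp
;; Output written to a void port is silently discarded.
(define sink (%make-void-port "w"))
(display "discarded" sink)

;; Reading from a void port yields the end-of-file object.
(define empty (%make-void-port "r"))
(eof-object? (read-char empty)) @result{} #t
@end lisp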
1194@end deffn
1195
1196
1197@node R6RS I/O Ports
1198@subsection R6RS I/O Ports
1199
1200@cindex R6RS
1201@cindex R6RS ports
1202
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
io ports)} module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
presented above (@pxref{Input and Output}). Note that R6RS ports are not
disjoint from Guile's native ports, so Guile-specific procedures will
work on ports created using the R6RS API, and vice versa.
1210
The text in this section is taken from the R6RS standard libraries
document, with only minor adaptations for inclusion in this manual. The
Guile developers offer their thanks to the R6RS editors for having
provided the report's text under permissive conditions making this
possible.

@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet.
1219
1220@menu
* R6RS File Names:: File names.
* R6RS File Options:: Options for opening files.
* R6RS Buffer Modes:: Influencing buffering behavior.
* R6RS Transcoders:: Influencing port encoding.
* R6RS End-of-File:: The end-of-file object.
* R6RS Port Manipulation:: Manipulating R6RS ports.
* R6RS Input Ports:: Input Ports.
* R6RS Binary Input:: Binary input.
* R6RS Textual Input:: Textual input.
* R6RS Output Ports:: Output Ports.
* R6RS Binary Output:: Binary output.
* R6RS Textual Output:: Textual output.
1233@end menu
1234
1235A subset of the @code{(rnrs io ports)} module, plus one non-standard
1236procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
1237provided by the @code{(ice-9 binary-ports)} module. It contains binary
1238input/output procedures and does not rely on R6RS support.
1240@node R6RS File Names
1241@subsubsection File Names
1242
Some of the procedures described in this chapter accept a file name as an
argument. Valid values for such a file name include strings that name a file
using the native notation of file system paths on an implementation's
underlying operating system, and may include implementation-dependent
values as well.
1248
1249A @var{filename} parameter name means that the
1250corresponding argument must be a file name.
1251
1252@node R6RS File Options
1253@subsubsection File Options
1254@cindex file options
1255
1256When opening a file, the various procedures in this library accept a
1257@code{file-options} object that encapsulates flags to specify how the
1258file is to be opened. A @code{file-options} object is an enum-set
1259(@pxref{rnrs enums}) over the symbols constituting valid file options.
1260
1261A @var{file-options} parameter name means that the corresponding
1262argument must be a file-options object.
1263
1264@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
1265
1266Each @var{file-options-symbol} must be a symbol.
1267
1268The @code{file-options} syntax returns a file-options object that
1269encapsulates the specified options.
1270
1271When supplied to an operation that opens a file for output, the
1272file-options object returned by @code{(file-options)} specifies that the
1273file is created if it does not exist and an exception with condition
1274type @code{&i/o-file-already-exists} is raised if it does exist. The
1275following standard options can be included to modify the default
1276behavior.
1277
1278@table @code
1279@item no-create
1280 If the file does not already exist, it is not created;
1281 instead, an exception with condition type @code{&i/o-file-does-not-exist}
1282 is raised.
1283 If the file already exists, the exception with condition type
1284 @code{&i/o-file-already-exists} is not raised
1285 and the file is truncated to zero length.
1286@item no-fail
1287 If the file already exists, the exception with condition type
1288 @code{&i/o-file-already-exists} is not raised,
1289 even if @code{no-create} is not included,
1290 and the file is truncated to zero length.
1291@item no-truncate
1292 If the file already exists and the exception with condition type
1293 @code{&i/o-file-already-exists} has been inhibited by inclusion of
1294 @code{no-create} or @code{no-fail}, the file is not truncated, but
1295 the port's current position is still set to the beginning of the
1296 file.
1297@end table
1298
1299These options have no effect when a file is opened only for input.
1300Symbols other than those listed above may be used as
1301@var{file-options-symbol}s; they have implementation-specific meaning,
1302if any.
1303
1304@quotation Note
1305 Only the name of @var{file-options-symbol} is significant.
1306@end quotation
1307@end deffn
1308
1309@node R6RS Buffer Modes
1310@subsubsection Buffer Modes
1311
1312Each port has an associated buffer mode. For an output port, the
1313buffer mode defines when an output operation flushes the buffer
1314associated with the output port. For an input port, the buffer mode
1315defines how much data will be read to satisfy read operations. The
1316possible buffer modes are the symbols @code{none} for no buffering,
1317@code{line} for flushing upon line endings and reading up to line
1318endings, or other implementation-dependent behavior,
1319and @code{block} for arbitrary buffering. This section uses
1320the parameter name @var{buffer-mode} for arguments that must be
1321buffer-mode symbols.
1322
1323If two ports are connected to the same mutable source, both ports
1324are unbuffered, and reading a byte or character from that shared
1325source via one of the two ports would change the bytes or characters
1326seen via the other port, a lookahead operation on one port will
1327render the peeked byte or character inaccessible via the other port,
1328while a subsequent read operation on the peeked port will see the
1329peeked byte or character even though the port is otherwise unbuffered.
1330
1331In other words, the semantics of buffering is defined in terms of side
1332effects on shared mutable sources, and a lookahead operation has the
1333same side effect on the shared source as a read operation.
1334
1335@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
1336
1337@var{buffer-mode-symbol} must be a symbol whose name is one of
1338@code{none}, @code{line}, and @code{block}. The result is the
1339corresponding symbol, and specifies the associated buffer mode.
1340
1341@quotation Note
1342 Only the name of @var{buffer-mode-symbol} is significant.
1343@end quotation
1344@end deffn
1345
1346@deffn {Scheme Procedure} buffer-mode? obj
1347Returns @code{#t} if the argument is a valid buffer-mode symbol, and
1348returns @code{#f} otherwise.
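
For example, assuming the @code{(rnrs io ports)} module is loaded:

@lisp
(use-modules (rnrs io ports))

(buffer-mode line)      @result{} line
(buffer-mode? 'block)   @result{} #t
(buffer-mode? 'bogus)   @result{} #f
@end lisp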
1349@end deffn
1350
1351@node R6RS Transcoders
1352@subsubsection Transcoders
1353@cindex codec
1354@cindex end-of-line style
1355@cindex transcoder
1356@cindex binary port
1357@cindex textual port
1358
1359Several different Unicode encoding schemes describe standard ways to
1360encode characters and strings as byte sequences and to decode those
1361sequences. Within this document, a @dfn{codec} is an immutable Scheme
1362object that represents a Unicode or similar encoding scheme.
1363
1364An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
1365describes how a textual port transcodes representations of line endings.
1366
1367A @dfn{transcoder} is an immutable Scheme object that combines a codec
1368with an end-of-line style and a method for handling decoding errors.
1369Each transcoder represents some specific bidirectional (but not
1370necessarily lossless), possibly stateful translation between byte
1371sequences and Unicode characters and strings. Every transcoder can
1372operate in the input direction (bytes to characters) or in the output
1373direction (characters to bytes). A @var{transcoder} parameter name
1374means that the corresponding argument must be a transcoder.
1375
1376A @dfn{binary port} is a port that supports binary I/O, does not have an
1377associated transcoder and does not support textual I/O. A @dfn{textual
1378port} is a port that supports textual I/O, and does not support binary
1379I/O. A textual port may or may not have an associated transcoder.
1380
1381@deffn {Scheme Procedure} latin-1-codec
1382@deffnx {Scheme Procedure} utf-8-codec
1383@deffnx {Scheme Procedure} utf-16-codec
1384
1385These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
1386encoding schemes.
1387
1388A call to any of these procedures returns a value that is equal in the
1389sense of @code{eqv?} to the result of any other call to the same
1390procedure.
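
In other words, a check such as the following is expected to hold:

@lisp
(use-modules (rnrs io ports))

(eqv? (utf-8-codec) (utf-8-codec)) @result{} #t
@end lisp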
1391@end deffn
1392
1393@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
1394
1395@var{eol-style-symbol} should be a symbol whose name is one of
1396@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
1397and @code{none}.
1398
1399The form evaluates to the corresponding symbol. If the name of
1400@var{eol-style-symbol} is not one of these symbols, the effect and
1401result are implementation-dependent; in particular, the result may be an
1402eol-style symbol acceptable as an @var{eol-style} argument to
1403@code{make-transcoder}. Otherwise, an exception is raised.
1404
1405All eol-style symbols except @code{none} describe a specific
1406line-ending encoding:
1407
1408@table @code
1409@item lf
1410linefeed
1411@item cr
1412carriage return
1413@item crlf
1414carriage return, linefeed
1415@item nel
1416next line
1417@item crnel
1418carriage return, next line
1419@item ls
1420line separator
1421@end table
1422
1423For a textual port with a transcoder, and whose transcoder has an
1424eol-style symbol @code{none}, no conversion occurs. For a textual input
1425port, any eol-style symbol other than @code{none} means that all of the
1426above line-ending encodings are recognized and are translated into a
1427single linefeed. For a textual output port, @code{none} and @code{lf}
1428are equivalent. Linefeed characters are encoded according to the
1429specified eol-style symbol, and all other characters that participate in
1430possible line endings are encoded as is.
1431
1432@quotation Note
1433 Only the name of @var{eol-style-symbol} is significant.
1434@end quotation
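
An eol-style symbol is normally combined with a codec and an
error-handling mode into a transcoder. The sketch below assumes the
R6RS procedures @code{make-transcoder} and @code{transcoder-eol-style}
from @code{(rnrs io ports)}; the name @code{t} is only illustrative.

@lisp
(use-modules (rnrs io ports))

;; UTF-8 text whose linefeeds are written as CR LF, with
;; unencodable characters replaced rather than raising.
(define t
  (make-transcoder (utf-8-codec)
                   (eol-style crlf)
                   (error-handling-mode replace)))

(transcoder-eol-style t) @result{} crlf
@end lisp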
1435@end deffn
1436
1437@deffn {Scheme Procedure} native-eol-style
1438Returns the default end-of-line style of the underlying platform, e.g.,
1439@code{lf} on Unix and @code{crlf} on Windows.
1440@end deffn
1441
1442@deffn {Condition Type} &i/o-decoding
1443@deffnx {Scheme Procedure} make-i/o-decoding-error port
1444@deffnx {Scheme Procedure} i/o-decoding-error? obj
1445
1446This condition type could be defined by
1447
@lisp
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
@end lisp
1452
1453An exception with this type is raised when one of the operations for
1454textual input from a port encounters a sequence of bytes that cannot be
1455translated into a character or string by the input direction of the
1456port's transcoder.
1457
1458When such an exception is raised, the port's position is past the
1459invalid encoding.
1460@end deffn
1461
1462@deffn {Condition Type} &i/o-encoding
1463@deffnx {Scheme Procedure} make-i/o-encoding-error port char
1464@deffnx {Scheme Procedure} i/o-encoding-error? obj
1465@deffnx {Scheme Procedure} i/o-encoding-error-char condition
1466
1467This condition type could be defined by
1468
@lisp
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
@end lisp
1474
1475An exception with this type is raised when one of the operations for
1476textual output to a port encounters a character that cannot be
1477translated into bytes by the output direction of the port's transcoder.
@var{char} is the character that could not be encoded.
1479@end deffn
1480
1481@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
1482
1483@var{error-handling-mode-symbol} should be a symbol whose name is one of
1484@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
1485the corresponding symbol. If @var{error-handling-mode-symbol} is not
1486one of these identifiers, effect and result are
1487implementation-dependent: The result may be an error-handling-mode
1488symbol acceptable as a @var{handling-mode} argument to
1489@code{make-transcoder}. If it is not acceptable as a
1490@var{handling-mode} argument to @code{make-transcoder}, an exception is
1491raised.
1492
1493@quotation Note
  Only the name of @var{error-handling-mode-symbol} is significant.
1495@end quotation
1496
1497The error-handling mode of a transcoder specifies the behavior
1498of textual I/O operations in the presence of encoding or decoding
1499errors.
1500
1501If a textual input operation encounters an invalid or incomplete
1502character encoding, and the error-handling mode is @code{ignore}, an
1503appropriate number of bytes of the invalid encoding are ignored and
1504decoding continues with the following bytes.
1505
1506If the error-handling mode is @code{replace}, the replacement
1507character U+FFFD is injected into the data stream, an appropriate
1508number of bytes are ignored, and decoding
1509continues with the following bytes.
1510
1511If the error-handling mode is @code{raise}, an exception with condition
1512type @code{&i/o-decoding} is raised.
1513
1514If a textual output operation encounters a character it cannot encode,
1515and the error-handling mode is @code{ignore}, the character is ignored
1516and encoding continues with the next character. If the error-handling
1517mode is @code{replace}, a codec-specific replacement character is
1518emitted by the transcoder, and encoding continues with the next
1519character. The replacement character is U+FFFD for transcoders whose
1520codec is one of the Unicode encodings, but is the @code{?} character
1521for the Latin-1 encoding. If the error-handling mode is @code{raise},
1522an exception with condition type @code{&i/o-encoding} is raised.
1523@end deffn
1524
1525@deffn {Scheme Procedure} make-transcoder codec
1526@deffnx {Scheme Procedure} make-transcoder codec eol-style
1527@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
1528
1529@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
1530symbol; and @var{handling-mode}, if present, an error-handling-mode
1531symbol.
1532
1533@var{eol-style} may be omitted, in which case it defaults to the native
end-of-line style of the underlying platform.  @var{handling-mode} may
1535be omitted, in which case it defaults to @code{replace}. The result is
1536a transcoder with the behavior specified by its arguments.
1537@end deffn
1538
@deffn {Scheme Procedure} native-transcoder
1540Returns an implementation-dependent transcoder that represents a
1541possibly locale-dependent ``native'' transcoding.
1542@end deffn
1543
1544@deffn {Scheme Procedure} transcoder-codec transcoder
1545@deffnx {Scheme Procedure} transcoder-eol-style transcoder
1546@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
1547
1548These are accessors for transcoder objects; when applied to a
1549transcoder returned by @code{make-transcoder}, they return the
1550@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
1551respectively.
1552@end deffn
1553
1554@deffn {Scheme Procedure} bytevector->string bytevector transcoder
1555
1556Returns the string that results from transcoding the
1557@var{bytevector} according to the input direction of the transcoder.
1558@end deffn
1559
1560@deffn {Scheme Procedure} string->bytevector string transcoder
1561
1562Returns the bytevector that results from transcoding the
1563@var{string} according to the output direction of the transcoder.
1564@end deffn
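
As a minimal sketch of how the transcoder procedures fit together, the
following round-trips a string through a UTF-8 transcoder.  The names
@code{tx} and @code{bv} are illustrative only, and the example assumes
that the @code{(rnrs io ports)} module is available.

@lisp
(use-modules (rnrs io ports))

;; A transcoder with the default eol-style and handling mode.
(define tx (make-transcoder (utf-8-codec)))

(transcoder-error-handling-mode tx)    ; @result{} replace

;; Encode, then decode again with the same transcoder.
(define bv (string->bytevector "hello" tx))
bv                                     ; @result{} #vu8(104 101 108 108 111)
(bytevector->string bv tx)             ; @result{} "hello"
@end lisp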
1565
1566@node R6RS End-of-File
1567@subsubsection The End-of-File Object
1568
1569@cindex EOF
1570@cindex end-of-file
1571
1572R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
1573ports)} module:
1574
1575@deffn {Scheme Procedure} eof-object? obj
1576@deffnx {C Function} scm_eof_object_p (obj)
1577Return true if @var{obj} is the end-of-file (EOF) object.
1578@end deffn
1579
1580In addition, the following procedure is provided:
1581
1582@deffn {Scheme Procedure} eof-object
1583@deffnx {C Function} scm_eof_object ()
1584Return the end-of-file (EOF) object.
1585
@lisp
(eof-object? (eof-object))
@result{} #t
@end lisp
1590@end deffn
1591
1592
1593@node R6RS Port Manipulation
1594@subsubsection Port Manipulation
1595
1596The procedures listed below operate on any kind of R6RS I/O port.
1597
1598@deffn {Scheme Procedure} port? obj
1599Returns @code{#t} if the argument is a port, and returns @code{#f}
1600otherwise.
1601@end deffn
1602
1603@deffn {Scheme Procedure} port-transcoder port
1604Returns the transcoder associated with @var{port} if @var{port} is
1605textual and has an associated transcoder, and returns @code{#f} if
1606@var{port} is binary or does not have an associated transcoder.
1607@end deffn
1608
1609@deffn {Scheme Procedure} binary-port? port
1610Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
1611binary data input/output.
1612
1613Note that internally Guile does not differentiate between binary and
1614textual ports, unlike the R6RS. Thus, this procedure returns true when
1615@var{port} does not have an associated encoding---i.e., when
1616@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
1617port-encoding}). This is the case for ports returned by R6RS procedures
1618such as @code{open-bytevector-input-port} and
1619@code{make-custom-binary-output-port}.
1620
1621However, Guile currently does not prevent use of textual I/O procedures
1622such as @code{display} or @code{read-char} with binary ports. Doing so
1623``upgrades'' the port from binary to textual, under the ISO-8859-1
1624encoding. Likewise, Guile does not prevent use of
1625@code{set-port-encoding!} on a binary port, which also turns it into a
1626``textual'' port.
1627@end deffn
1628
1629@deffn {Scheme Procedure} textual-port? port
Always return @code{#t}, as all ports can be used for textual I/O in
1631Guile.
1632@end deffn
1633
@deffn {Scheme Procedure} transcoded-port binary-port transcoder
1635The @code{transcoded-port} procedure
1636returns a new textual port with the specified @var{transcoder}.
1637Otherwise the new textual port's state is largely the same as
1638that of @var{binary-port}.
1639If @var{binary-port} is an input port, the new textual
1640port will be an input port and
1641will transcode the bytes that have not yet been read from
1642@var{binary-port}.
1643If @var{binary-port} is an output port, the new textual
1644port will be an output port and
1645will transcode output characters into bytes that are
1646written to the byte sink represented by @var{binary-port}.
1647
1648As a side effect, however, @code{transcoded-port}
1649closes @var{binary-port} in
1650a special way that allows the new textual port to continue to
1651use the byte source or sink represented by @var{binary-port},
1652even though @var{binary-port} itself is closed and cannot
1653be used by the input and output operations described in this
1654chapter.
1655@end deffn
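
The following sketch shows one way @code{transcoded-port} might be
used: a bytevector input port is wrapped so that its bytes are decoded
as UTF-8.  The variable names @code{binary} and @code{textual} are
illustrative, and the @code{(rnrs io ports)} and @code{(rnrs
bytevectors)} modules are assumed to be available.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define binary
  (open-bytevector-input-port (string->utf8 "hi there")))
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

;; From here on, read from TEXTUAL only; BINARY must not be used.
(define text (get-string-all textual))
text                                   ; @result{} "hi there"
@end lisp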
1656
1657@deffn {Scheme Procedure} port-position port
1658If @var{port} supports it (see below), return the offset (an integer)
1659indicating where the next octet will be read from/written to in
1660@var{port}. If @var{port} does not support this operation, an error
1661condition is raised.
1662
1663This is similar to Guile's @code{seek} procedure with the
1664@code{SEEK_CUR} argument (@pxref{Random Access}).
1665@end deffn
1666
1667@deffn {Scheme Procedure} port-has-port-position? port
Return @code{#t} if @var{port} supports @code{port-position}.
1669@end deffn
1670
1671@deffn {Scheme Procedure} set-port-position! port offset
1672If @var{port} supports it (see below), set the position where the next
1673octet will be read from/written to @var{port} to @var{offset} (an
1674integer). If @var{port} does not support this operation, an error
1675condition is raised.
1676
1677This is similar to Guile's @code{seek} procedure with the
1678@code{SEEK_SET} argument (@pxref{Random Access}).
1679@end deffn
1680
1681@deffn {Scheme Procedure} port-has-set-port-position!? port
Return @code{#t} if @var{port} supports @code{set-port-position!}.
1683@end deffn
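
As an illustration of the position procedures, the sketch below reads
from a bytevector input port, which is assumed here to support both
@code{port-position} and @code{set-port-position!}.  The port contents
are arbitrary example data.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(10 20 30 40))))

(port-position port)                   ; @result{} 0
(get-u8 port)                          ; @result{} 10
(port-position port)                   ; @result{} 1

;; Jump to the last octet and read it.
(set-port-position! port 3)
(get-u8 port)                          ; @result{} 40
@end lisp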
1684
1685@deffn {Scheme Procedure} call-with-port port proc
1686Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
1687of @var{proc}. Return the return values of @var{proc}.
1688@end deffn
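
A short sketch of @code{call-with-port}, assuming a bytevector input
port as the example port.  It uses Guile's @code{port-closed?} to check
that the port was closed once @var{proc} returned.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port (open-bytevector-input-port (string->utf8 "abc")))

(define contents (call-with-port port get-bytevector-all))
contents                               ; @result{} #vu8(97 98 99)

;; PORT has been closed by call-with-port.
(port-closed? port)                    ; @result{} #t
@end lisp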
1689
1690@node R6RS Input Ports
1691@subsubsection Input Ports

@deffn {Scheme Procedure} input-port? obj
1694Returns @code{#t} if the argument is an input port (or a combined input
1695and output port), and returns @code{#f} otherwise.
1696@end deffn

@deffn {Scheme Procedure} port-eof? input-port
1699Returns @code{#t}
1700if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
1701or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
1702would return
1703the end-of-file object, and @code{#f} otherwise.
1704The operation may block indefinitely if no data is available
1705but the port cannot be determined to be at end of file.
1706@end deffn
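
For instance, with a one-octet bytevector input port (an arbitrary
example), @code{port-eof?} might be used as in this sketch:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(1))))

(port-eof? port)                       ; @result{} #f
(get-u8 port)                          ; @result{} 1
(port-eof? port)                       ; @result{} #t
@end lisp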
1707
1708@deffn {Scheme Procedure} open-file-input-port filename
1709@deffnx {Scheme Procedure} open-file-input-port filename file-options
1710@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
1711@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
1713
The @code{open-file-input-port} procedure returns an
input port for the named file.  The @var{file-options},
@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
1717
1718The @var{file-options} argument, which may determine
1719various aspects of the returned port (@pxref{R6RS File Options}),
1720defaults to the value of @code{(file-options)}.
1721
1722The @var{buffer-mode} argument, if supplied,
1723must be one of the symbols that name a buffer mode.
1724The @var{buffer-mode} argument defaults to @code{block}.
1725
1726If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
1727with the returned port.
1728
1729If @var{maybe-transcoder} is @code{#f} or absent,
1730the port will be a binary port and will support the
1731@code{port-position} and @code{set-port-position!} operations.
1732Otherwise the port will be a textual port, and whether it supports
1733the @code{port-position} and @code{set-port-position!} operations
1734is implementation-dependent (and possibly transcoder-dependent).
1735@end deffn
1736
1737@deffn {Scheme Procedure} standard-input-port
1738Returns a fresh binary input port connected to standard input. Whether
1739the port supports the @code{port-position} and @code{set-port-position!}
1740operations is implementation-dependent.
1741@end deffn
1742
1743@deffn {Scheme Procedure} current-input-port
1744This returns a default textual port for input. Normally, this default
1745port is associated with standard input, but can be dynamically
1746re-assigned using the @code{with-input-from-file} procedure from the
1747@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
1748may not have an associated transcoder; if it does, the transcoder is
1749implementation-dependent.
1750@end deffn
1751
1752@node R6RS Binary Input
1753@subsubsection Binary Input
1754
1755@cindex binary input
1756
1757R6RS binary input ports can be created with the procedures described
1758below.
1759
1760@deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
1761@deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
1762Return an input port whose contents are drawn from bytevector @var{bv}
1763(@pxref{Bytevectors}).
1764
1765@c FIXME: Update description when implemented.
1766The @var{transcoder} argument is currently not supported.
1767@end deffn
1768
1769@cindex custom binary input ports
1770
1771@deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
1772@deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
1773Return a new custom binary input port@footnote{This is similar in spirit
1774to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
1775string) whose input is drained by invoking @var{read!} and passing it a
1776bytevector, an index where bytes should be written, and the number of
1777bytes to read. The @code{read!} procedure must return an integer
1778indicating the number of bytes read, or @code{0} to indicate the
1779end-of-file.
1780
Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
that will be called when @code{port-position} is invoked on the custom
binary port and should return an integer indicating the position within
the underlying data stream; if @var{get-position} is @code{#f}, the
returned port does not support @code{port-position}.
1786
Likewise, if @var{set-position!} is not @code{#f}, it should be a
one-argument procedure.  When @code{set-port-position!} is invoked on the
custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
1791
1792Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
1793invoked when the custom binary input port is closed.
1794
1795Using a custom binary input port, the @code{open-bytevector-input-port}
1796procedure could be implemented as follows:
1797
@lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy up to COUNT bytes into BV at START; return 0 at end-of-file.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; No close procedure is needed here, so pass #f.
  (make-custom-binary-input-port "the port" read!
                                  get-position
                                  set-position!
                                  #f))

(read (open-bytevector-input-port (string->utf8 "hello")))
@result{} hello
@end lisp
1822@end deffn
1823
1824@cindex binary input
1825Binary input is achieved using the procedures below:
1826
1827@deffn {Scheme Procedure} get-u8 port
1828@deffnx {C Function} scm_get_u8 (port)
1829Return an octet read from @var{port}, a binary input port, blocking as
1830necessary, or the end-of-file object.
1831@end deffn
1832
1833@deffn {Scheme Procedure} lookahead-u8 port
1834@deffnx {C Function} scm_lookahead_u8 (port)
1835Like @code{get-u8} but does not update @var{port}'s position to point
1836past the octet.
1837@end deffn
1838
1839@deffn {Scheme Procedure} get-bytevector-n port count
1840@deffnx {C Function} scm_get_bytevector_n (port, count)
1841Read @var{count} octets from @var{port}, blocking as necessary and
1842return a bytevector containing the octets read. If fewer bytes are
1843available, a bytevector smaller than @var{count} is returned.
1844@end deffn
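
The short-read and end-of-file behavior of @code{get-bytevector-n} can
be sketched as follows, using a five-octet bytevector port as arbitrary
example input:

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(1 2 3 4 5))))

(get-bytevector-n port 3)              ; @result{} #vu8(1 2 3)
;; Only two octets remain, so a shorter bytevector is returned.
(get-bytevector-n port 3)              ; @result{} #vu8(4 5)
;; Nothing is left: the end-of-file object is returned.
(eof-object? (get-bytevector-n port 3)) ; @result{} #t
@end lisp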
1845
1846@deffn {Scheme Procedure} get-bytevector-n! port bv start count
1847@deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
1848Read @var{count} bytes from @var{port} and store them in @var{bv}
1849starting at index @var{start}. Return either the number of bytes
1850actually read or the end-of-file object.
1851@end deffn
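
A small sketch of @code{get-bytevector-n!} filling part of an existing
bytevector; the names @code{bv} and @code{n} and the data are
illustrative only.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define bv (make-bytevector 4 0))
(define port
  (open-bytevector-input-port (u8-list->bytevector '(7 8))))

;; Ask for three octets starting at index 1; only two are available.
(define n (get-bytevector-n! port bv 1 3))
n                                      ; @result{} 2
bv                                     ; @result{} #vu8(0 7 8 0)
@end lisp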
1852
1853@deffn {Scheme Procedure} get-bytevector-some port
1854@deffnx {C Function} scm_get_bytevector_some (port)
1855Read from @var{port}, blocking as necessary, until bytes are available
1856or an end-of-file is reached. Return either the end-of-file object or a
1857new bytevector containing some of the available bytes (at least one),
1858and update the port position to point just past these bytes.
1859@end deffn
1860
1861@deffn {Scheme Procedure} get-bytevector-all port
1862@deffnx {C Function} scm_get_bytevector_all (port)
1863Read from @var{port}, blocking as necessary, until the end-of-file is
1864reached. Return either a new bytevector containing the data read or the
1865end-of-file object (if no data were available).
1866@end deffn
1867
1868The @code{(ice-9 binary-ports)} module provides the following procedure
1869as an extension to @code{(rnrs io ports)}:
1870
1871@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
1872@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
1873Place the contents of @var{bv} in @var{port}, optionally starting at
1874index @var{start} and limiting to @var{count} octets, so that its bytes
1875will be read from left-to-right as the next bytes from @var{port} during
1876subsequent read operations. If called multiple times, the unread bytes
1877will be read again in last-in first-out order.
1878@end deffn
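
As a sketch, assuming the @code{(ice-9 binary-ports)} module is
available, bytes pushed back with @code{unget-bytevector} are read
before the bytes still pending in the port:

@lisp
(use-modules (ice-9 binary-ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(3 4))))

;; Push two octets back in front of the pending input.
(unget-bytevector port (u8-list->bytevector '(1 2)))

(get-u8 port)                          ; @result{} 1
(get-u8 port)                          ; @result{} 2
(get-u8 port)                          ; @result{} 3
@end lisp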
1879
1880@node R6RS Textual Input
1881@subsubsection Textual Input
1882
@deffn {Scheme Procedure} get-char textual-input-port
1884Reads from @var{textual-input-port}, blocking as necessary, until a
1885complete character is available from @var{textual-input-port},
1886or until an end of file is reached.
1887
1888If a complete character is available before the next end of file,
1889@code{get-char} returns that character and updates the input port to
1890point past the character. If an end of file is reached before any
1891character is read, @code{get-char} returns the end-of-file object.
1892@end deffn
1893
@deffn {Scheme Procedure} lookahead-char textual-input-port
1895The @code{lookahead-char} procedure is like @code{get-char}, but it does
1896not update @var{textual-input-port} to point past the character.
1897@end deffn
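
The difference between @code{lookahead-char} and @code{get-char} can be
sketched with a string port.  Guile's @code{open-input-string} is used
here merely as a convenient source of textual input.

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "ab"))

(lookahead-char port)                  ; @result{} #\a
;; The character is still there, since lookahead-char did not consume it.
(get-char port)                        ; @result{} #\a
(get-char port)                        ; @result{} #\b
(eof-object? (get-char port))          ; @result{} #t
@end lisp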
1898
@deffn {Scheme Procedure} get-string-n textual-input-port count

@var{count} must be an exact, non-negative integer object, representing
1902the number of characters to be read.
1903
1904The @code{get-string-n} procedure reads from @var{textual-input-port},
1905blocking as necessary, until @var{count} characters are available, or
1906until an end of file is reached.
1907
1908If @var{count} characters are available before end of file,
1909@code{get-string-n} returns a string consisting of those @var{count}
1910characters. If fewer characters are available before an end of file, but
1911one or more characters can be read, @code{get-string-n} returns a string
1912containing those characters. In either case, the input port is updated
1913to point just past the characters read. If no characters can be read
1914before an end of file, the end-of-file object is returned.
1915@end deffn
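
The three cases described for @code{get-string-n} can be sketched as
follows, again using a string port as an arbitrary example input:

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "hello"))

(get-string-n port 3)                  ; @result{} "hel"
;; Fewer than three characters remain.
(get-string-n port 3)                  ; @result{} "lo"
;; No characters remain.
(eof-object? (get-string-n port 3))    ; @result{} #t
@end lisp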
1916
@deffn {Scheme Procedure} get-string-n! textual-input-port string start count

@var{start} and @var{count} must be exact, non-negative integer objects,
with @var{count} representing the number of characters to be read.
@var{string} must be a string with at least @math{@var{start} + @var{count}}
characters.
1923
1924The @code{get-string-n!} procedure reads from @var{textual-input-port}
1925in the same manner as @code{get-string-n}. If @var{count} characters
1926are available before an end of file, they are written into @var{string}
1927starting at index @var{start}, and @var{count} is returned. If fewer
1928characters are available before an end of file, but one or more can be
1929read, those characters are written into @var{string} starting at index
1930@var{start} and the number of characters actually read is returned as an
1931exact integer object. If no characters can be read before an end of
1932file, the end-of-file object is returned.
1933@end deffn
1934
@deffn {Scheme Procedure} get-string-all textual-input-port
1936Reads from @var{textual-input-port} until an end of file, decoding
1937characters in the same manner as @code{get-string-n} and
1938@code{get-string-n!}.
1939
If characters are available before the end of file, a string containing
all the characters decoded from that data is returned.  If no character
precedes the end of file, the end-of-file object is returned.
1943@end deffn
1944
@deffn {Scheme Procedure} get-line textual-input-port
1946Reads from @var{textual-input-port} up to and including the linefeed
1947character or end of file, decoding characters in the same manner as
1948@code{get-string-n} and @code{get-string-n!}.
1949
1950If a linefeed character is read, a string containing all of the text up
1951to (but not including) the linefeed character is returned, and the port
1952is updated to point just past the linefeed character. If an end of file
1953is encountered before any linefeed character is read, but some
1954characters have been read and decoded as characters, a string containing
1955those characters is returned. If an end of file is encountered before
1956any characters are read, the end-of-file object is returned.
1957
1958@quotation Note
1959 The end-of-line style, if not @code{none}, will cause all line endings
1960 to be read as linefeed characters. @xref{R6RS Transcoders}.
1961@end quotation
1962@end deffn
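
A minimal sketch of reading lines with @code{get-line}; the input
string is an arbitrary example whose last line has no trailing
linefeed.

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "one\ntwo"))

(get-line port)                        ; @result{} "one"
;; The final line ends at end of file rather than at a linefeed.
(get-line port)                        ; @result{} "two"
(eof-object? (get-line port))          ; @result{} #t
@end lisp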
1963
@deffn {Scheme Procedure} get-datum textual-input-port
1965Reads an external representation from @var{textual-input-port} and returns the
1966datum it represents. The @code{get-datum} procedure returns the next
1967datum that can be parsed from the given @var{textual-input-port}, updating
1968@var{textual-input-port} to point exactly past the end of the external
1969representation of the object.
1970
1971Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
1972Syntax}) in the input is first skipped. If an end of file occurs after
1973the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
1974is returned.
1975
1976If a character inconsistent with an external representation is
1977encountered in the input, an exception with condition types
1978@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
1979file is encountered after the beginning of an external representation,
1980but the external representation is incomplete and therefore cannot be
1981parsed, an exception with condition types @code{&lexical} and
1982@code{&i/o-read} is raised.
1983@end deffn
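
For example, reading successive data from a string port (chosen here
only for illustration) might look like this sketch:

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "(1 2) foo"))

(get-datum port)                       ; @result{} (1 2)
;; Interlexeme space before the next datum is skipped.
(get-datum port)                       ; @result{} foo
(eof-object? (get-datum port))         ; @result{} #t
@end lisp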
1984
1985@node R6RS Output Ports
1986@subsubsection Output Ports
1987
1988@deffn {Scheme Procedure} output-port? obj
1989Returns @code{#t} if the argument is an output port (or a
1990combined input and output port), @code{#f} otherwise.
1991@end deffn
1992
1993@deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{port} to the
underlying file, device, or object.  The @code{flush-output-port}
procedure returns an unspecified value.
1997@end deffn
1998
1999@deffn {Scheme Procedure} open-file-output-port filename
2000@deffnx {Scheme Procedure} open-file-output-port filename file-options
2001@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
2002@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
2003
2004@var{maybe-transcoder} must be either a transcoder or @code{#f}.
2005
2006The @code{open-file-output-port} procedure returns an output port for the named file.
2007
2008The @var{file-options} argument, which may determine various aspects of
2009the returned port (@pxref{R6RS File Options}), defaults to the value of
2010@code{(file-options)}.
2011
2012The @var{buffer-mode} argument, if supplied,
2013must be one of the symbols that name a buffer mode.
2014The @var{buffer-mode} argument defaults to @code{block}.
2015
2016If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
2017associated with the port.
2018
2019If @var{maybe-transcoder} is @code{#f} or absent,
2020the port will be a binary port and will support the
2021@code{port-position} and @code{set-port-position!} operations.
2022Otherwise the port will be a textual port, and whether it supports
2023the @code{port-position} and @code{set-port-position!} operations
2024is implementation-dependent (and possibly transcoder-dependent).
2025@end deffn
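
The sketch below writes a few octets to a file and reads them back
through binary file ports.  The file name @file{example.bin} is
hypothetical, and the @code{no-fail} file option is assumed to make the
call succeed whether or not the file already exists.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical file name, created in the current directory.
(define file "example.bin")

;; No transcoder is given, so the port is binary.
(call-with-port (open-file-output-port file (file-options no-fail))
  (lambda (port)
    (put-bytevector port (u8-list->bytevector '(1 2 3)))))

(define contents
  (call-with-port (open-file-input-port file) get-bytevector-all))
contents                               ; @result{} #vu8(1 2 3)

(delete-file file)
@end lisp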
2026
2027@deffn {Scheme Procedure} standard-output-port
2028@deffnx {Scheme Procedure} standard-error-port
2029Returns a fresh binary output port connected to the standard output or
2030standard error respectively. Whether the port supports the
2031@code{port-position} and @code{set-port-position!} operations is
2032implementation-dependent.
2033@end deffn
2034
2035@deffn {Scheme Procedure} current-output-port
2036@deffnx {Scheme Procedure} current-error-port
2037These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output and
standard error, respectively.  The return value of
2040@code{current-output-port} can be dynamically re-assigned using the
2041@code{with-output-to-file} procedure from the @code{io simple (6)}
2042library (@pxref{rnrs io simple}). A port returned by one of these
2043procedures may or may not have an associated transcoder; if it does, the
2044transcoder is implementation-dependent.
2045@end deffn
2046
2047@node R6RS Binary Output
2048@subsubsection Binary Output
2049
2050Binary output ports can be created with the procedures below.
2051
2052@deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
2053@deffnx {C Function} scm_open_bytevector_output_port (transcoder)
2054Return two values: a binary output port and a procedure. The latter
2055should be called with zero arguments to obtain a bytevector containing
2056the data accumulated by the port, as illustrated below.
2057
@lisp
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))

@result{} #vu8(104 101 108 108 111)
@end lisp
2068
2069@c FIXME: Update description when implemented.
2070The @var{transcoder} argument is currently not supported.
2071@end deffn
2072
2073@cindex custom binary output ports
2074
2075@deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
2076@deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
2077Return a new custom binary output port named @var{id} (a string) whose
2078output is sunk by invoking @var{write!} and passing it a bytevector, an
2079index where bytes should be read from this bytevector, and the number of
2080bytes to be ``written''. The @code{write!} procedure must return an
2081integer indicating the number of bytes actually written; when it is
2082passed @code{0} as the number of bytes to write, it should behave as
2083though an end-of-file was sent to the byte sink.
2084
2085The other arguments are as for @code{make-custom-binary-input-port}
2086(@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
2087@end deffn
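
As a rough sketch, a custom binary output port that merely counts the
bytes it receives could be written as follows.  The names @code{total}
and @code{write!} and the port name @code{"counter"} are illustrative.
The three trailing @code{#f} arguments stand for @var{get-position},
@var{set-position!} and @var{close}, none of which is needed here.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define total 0)

;; Accept all COUNT bytes and add them to the running total.
(define (write! bv start count)
  (set! total (+ total count))
  count)

(define port
  (make-custom-binary-output-port "counter" write! #f #f #f))

(put-bytevector port (u8-list->bytevector '(1 2 3)))
;; Closing the port makes sure any buffered output reaches write!.
(close-port port)

total                                  ; @result{} 3
@end lisp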
2088
2089@cindex binary output
2090Writing to a binary output port can be done using the following
2091procedures:
2092
2093@deffn {Scheme Procedure} put-u8 port octet
2094@deffnx {C Function} scm_put_u8 (port, octet)
2095Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
2096binary output port.
2097@end deffn
2098
2099@deffn {Scheme Procedure} put-bytevector port bv [start [count]]
2100@deffnx {C Function} scm_put_bytevector (port, bv, start, count)
2101Write the contents of @var{bv} to @var{port}, optionally starting at
2102index @var{start} and limiting to @var{count} octets.
2103@end deffn
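
The following sketch combines @code{put-u8} and @code{put-bytevector}
on a bytevector output port.  The optional @var{start} and @var{count}
arguments select a slice of the source bytevector.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define result
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytevector)
      (put-u8 port 1)
      ;; Write two octets, starting at index 1: that is, 3 and 4.
      (put-bytevector port (u8-list->bytevector '(2 3 4 5)) 1 2)
      (get-bytevector))))

result                                 ; @result{} #vu8(1 3 4)
@end lisp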
2104
2105@node R6RS Textual Output
2106@subsubsection Textual Output
2107
2108@deffn {Scheme Procedure} put-char port char
2109Writes @var{char} to the port. The @code{put-char} procedure returns
an unspecified value.
2111@end deffn
2112
2113@deffn {Scheme Procedure} put-string port string
2114@deffnx {Scheme Procedure} put-string port string start
2115@deffnx {Scheme Procedure} put-string port string start count
2116
2117@var{start} and @var{count} must be non-negative exact integer objects.
2118@var{string} must have a length of at least @math{@var{start} +
2119@var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}.  The
2121@code{put-string} procedure writes the @var{count} characters of
2122@var{string} starting at index @var{start} to the port. The
2123@code{put-string} procedure returns an unspecified value.
2124@end deffn
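
A brief sketch of @code{put-char} and @code{put-string}, collecting
the output with Guile's @code{call-with-output-string}, which is used
here only as a convenient textual output port:

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-char port #\a)
      ;; Write three characters of "hello", starting at index 1.
      (put-string port "hello" 1 3))))

text                                   ; @result{} "aell"
@end lisp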
2125
@deffn {Scheme Procedure} put-datum textual-output-port datum
2127@var{datum} should be a datum value. The @code{put-datum} procedure
2128writes an external representation of @var{datum} to
2129@var{textual-output-port}. The specific external representation is
2130implementation-dependent. However, whenever possible, an implementation
2131should produce a representation for which @code{get-datum}, when reading
2132the representation, will return an object equal (in the sense of
2133@code{equal?}) to @var{datum}.
2134
2135@quotation Note
2136 Not all datums may allow producing an external representation for which
2137 @code{get-datum} will produce an object that is equal to the
2138 original. Specifically, NaNs contained in @var{datum} may make
2139 this impossible.
2140@end quotation
2141
2142@quotation Note
2143 The @code{put-datum} procedure merely writes the external
2144 representation, but no trailing delimiter. If @code{put-datum} is
2145 used to write several subsequent external representations to an
2146 output port, care should be taken to delimit them properly so they can
2147 be read back in by subsequent calls to @code{get-datum}.
2148@end quotation
2149@end deffn
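
The delimiting concern mentioned in the note above can be sketched as
follows: two data are written with an explicit space between them and
then read back with @code{get-datum}.  The string ports and variable
names are illustrative only.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-datum port '(1 "two" three))
      ;; put-datum writes no delimiter, so add one explicitly.
      (put-char port #\space)
      (put-datum port 4))))

text                                   ; @result{} "(1 \"two\" three) 4"

(define in (open-input-string text))
(get-datum in)                         ; @result{} (1 "two" three)
(get-datum in)                         ; @result{} 4
@end lisp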

@node I/O Extensions
2152@subsection Using and Extending Ports in C
2153
2154@menu
2155* C Port Interface:: Using ports from C.
2156* Port Implementation:: How to implement a new port type in C.
2157@end menu
2158
2159
2160@node C Port Interface
2161@subsubsection C Port Interface
2162@cindex C port interface
2163@cindex Port, C interface
2164
2165This section describes how to use Scheme ports from C.
2166
2167@subsubheading Port basics
2168
2169@cindex ptob
2170@tindex scm_ptob_descriptor
2171@tindex scm_port
2172@findex SCM_PTAB_ENTRY
2173@findex SCM_PTOBNUM
2174@vindex scm_ptobs
2175There are two main data structures. A port type object (ptob) is of
2176type @code{scm_ptob_descriptor}. A port instance is of type
2177@code{scm_port}. Given an @code{SCM} variable which points to a port,
2178the corresponding C port object can be obtained using the
2179@code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
2180@code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
2181global array.

@subsubheading Port buffers

An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with
@code{malloc} when the port is created, but in the case of an strport
the underlying string is used as the buffer.

@subsubheading The @code{rw_random} flag

Special treatment is required for ports that support random-access
seeking. Before various operations, such as seeking the port or changing
from input to output on a bidirectional port or vice versa, the port
implementation must be given a chance to update its state. The write
buffer is updated by calling the @code{flush} ptob procedure and the
input buffer is updated by calling the @code{end_input} ptob procedure.
In the case of an fport, @code{flush} causes buffered output to be
written to the file descriptor, while @code{end_input} causes the
descriptor position to be adjusted to account for buffered input which
was never read.

The special treatment must be performed if the @code{rw_random} flag in
the port is non-zero.

@subsubheading The @code{rw_active} variable

The @code{rw_active} variable in the port is only used if
@code{rw_random} is set. It's defined as an enum with the following
values:

@table @code
@item SCM_PORT_READ
The read buffer may have unread data.

@item SCM_PORT_WRITE
The write buffer may have unwritten data.

@item SCM_PORT_NEITHER
Neither the write nor the read buffer has data.
@end table

@subsubheading Reading from a port

To read from a port, it's possible to either call existing libguile
procedures such as @code{scm_getc} and @code{scm_read_line} or to read
data from the read buffer directly. Reading from the buffer involves
the following steps:

@enumerate
@item
Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.

@item
Fill the read buffer, if it's empty, using @code{scm_fill_input}.

@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.

@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.

@item
Update the port's line and column counts.
@end enumerate
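
The five steps above can be sketched with a small self-contained model.
This is only an illustration under stated assumptions: the
@code{toy_port} structure, the @code{TOY_PORT_*} values and every
@code{toy_} function are hypothetical stand-ins invented for this
example, not the libguile definitions, and the ``port'' simply reads
from a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the libguile types; not the real API. */
enum toy_rw_active { TOY_PORT_NEITHER, TOY_PORT_READ, TOY_PORT_WRITE };

struct toy_port {
  const char *source;        /* backing data the port reads from */
  size_t source_pos;         /* next byte of source not yet buffered */
  char read_buf[4];          /* deliberately small read buffer */
  size_t read_pos, read_end; /* unread data is read_buf[read_pos..read_end) */
  int rw_random;             /* non-zero if the port can seek */
  enum toy_rw_active rw_active;
  int flushes;               /* number of toy_flush calls, for illustration */
  long line, column;
};

/* Model of the "flush" procedure: output is done, reset rw_active. */
static void toy_flush (struct toy_port *pt)
{
  pt->flushes++;
  pt->rw_active = TOY_PORT_NEITHER;
}

/* Refill the read buffer; return the number of bytes now available. */
static size_t toy_fill_input (struct toy_port *pt)
{
  size_t left = strlen (pt->source) - pt->source_pos;
  size_t n = left < sizeof pt->read_buf ? left : sizeof pt->read_buf;
  memcpy (pt->read_buf, pt->source + pt->source_pos, n);
  pt->source_pos += n;
  pt->read_pos = 0;
  pt->read_end = n;
  return n;
}

/* Read up to len bytes into dst, following the steps listed above. */
static size_t toy_read (struct toy_port *pt, char *dst, size_t len)
{
  size_t done = 0;
  assert (pt != NULL && dst != NULL);

  if (pt->rw_active == TOY_PORT_WRITE)        /* step 1: flush output */
    toy_flush (pt);

  while (done < len)
    {
      if (pt->read_pos == pt->read_end        /* step 2: fill if empty */
          && toy_fill_input (pt) == 0)
        break;                                /* end of input */
      char c = pt->read_buf[pt->read_pos++];  /* step 3: consume a byte */
      dst[done++] = c;
      if (c == '\n')                          /* step 5: line and column */
        {
          pt->line++;
          pt->column = 0;
        }
      else
        pt->column++;
    }

  if (pt->rw_random)                          /* step 4: record direction */
    pt->rw_active = TOY_PORT_READ;
  return done;
}
```

The model updates the line and column counts per byte inside the loop,
which is a simplification; the order of steps 4 and 5 does not matter
here because neither depends on the other.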

@subsubheading Writing to a port

To write data to a port, calling @code{scm_lfwrite} should be sufficient for
most purposes. This takes care of the following steps:

@enumerate
@item
End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.

@item
Pass the data to the ptob implementation using the @code{write} ptob
procedure. The advantage of using the ptob @code{write} instead of
manipulating the write buffer directly is that it allows the data to be
written in one operation even if the port is using the single-byte
@code{shortbuf}.

@item
Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
is set.
@end enumerate
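
The three write steps can be modelled in the same spirit. Again, this is
a hedged sketch rather than libguile code: @code{toy_port},
@code{toy_end_input}, @code{toy_ptob_write} and @code{toy_lfwrite} are
hypothetical names, and the ``device'' is a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the libguile types; not the real API. */
enum toy_rw_active { TOY_PORT_NEITHER, TOY_PORT_READ, TOY_PORT_WRITE };

struct toy_port {
  char sink[64];        /* where the toy write procedure sends data */
  size_t sink_len;
  int write_calls;      /* how many times the write procedure ran */
  int end_input_calls;  /* how many times input was ended */
  int rw_random;        /* non-zero if the port can seek */
  enum toy_rw_active rw_active;
};

/* Model of "end_input": drop buffered input and reset rw_active. */
static void toy_end_input (struct toy_port *pt)
{
  pt->end_input_calls++;
  pt->rw_active = TOY_PORT_NEITHER;
}

/* Model of the ptob "write" procedure: it receives the whole block in
   one call, however small the port's own write buffer may be. */
static void toy_ptob_write (struct toy_port *pt, const void *data, size_t size)
{
  assert (pt->sink_len + size <= sizeof pt->sink);
  memcpy (pt->sink + pt->sink_len, data, size);
  pt->sink_len += size;
  pt->write_calls++;
}

/* Model of the three steps listed above. */
static void toy_lfwrite (struct toy_port *pt, const char *data, size_t size)
{
  if (pt->rw_active == TOY_PORT_READ)   /* step 1: end input */
    toy_end_input (pt);
  toy_ptob_write (pt, data, size);      /* step 2: hand data to the ptob */
  if (pt->rw_random)                    /* step 3: record direction */
    pt->rw_active = TOY_PORT_WRITE;
}
```

Passing the whole block to the write procedure is what lets five bytes
go out in a single call here, instead of five one-byte buffer flushes.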


@node Port Implementation
@subsubsection Port Implementation
@cindex Port implementation

This section describes how to implement a new port type in C.

As described in the previous section, a port type object (ptob) is
a structure of type @code{scm_ptob_descriptor}. A ptob is created by
calling @code{scm_make_port_type}.

@deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
Return a new port type object. The @var{name}, @var{fill_input} and
@var{write} parameters are initial values for those port type fields,
as described below. The other fields are initialized with default
values and can be changed later.
@end deftypefun

All of the elements of the ptob, apart from @code{name}, are procedures
which collectively implement the port behaviour. Creating a new port
type mostly involves writing these procedures.

@table @code
@item name
A pointer to a NUL terminated string: the name of the port type. This
is the only element of @code{scm_ptob_descriptor} which is not
a procedure. Set via the first argument to @code{scm_make_port_type}.

@item mark
Called during garbage collection to mark any SCM objects that a port
object may contain. It doesn't need to be set unless the port has
@code{SCM} components. Set using

@deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
@end deftypefun

@item free
Called when the port is collected during garbage collection. It
should free any resources used by the port.
Set using

@deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
@end deftypefun

@item print
Called when @code{write} is called on the port object, to print a
port description. For example, for an fport it may produce something like
@code{#<input: /etc/passwd 3>}. Set using

@deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
The first argument @var{port} is the object being printed, the second
argument @var{dest_port} is where its description should go.
@end deftypefun

@item equalp
Not used at present. Set using

@deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
@end deftypefun

@item close
Called when the port is closed, unless it was collected during garbage
collection. It should free any resources used by the port.
Set using

@deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
@end deftypefun

@item write
Accept data which is to be written using the port. The port implementation
may choose to buffer the data instead of processing it directly.
Set via the third argument to @code{scm_make_port_type}.

@item flush
Complete the processing of buffered output data. Reset the value of
@code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
@end deftypefun

@item end_input
Perform any synchronization required when switching from input to output
on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
@end deftypefun

@item fill_input
Read new data into the read buffer and return the first character. It
can be assumed that the read buffer is empty when this procedure is called.
Set via the second argument to @code{scm_make_port_type}.

@item input_waiting
Return a lower bound on the number of bytes that could be read from the
port without blocking. It can be assumed that the current state of
@code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
@end deftypefun

@item seek
Set the current position of the port. The procedure cannot make
any assumptions about the value of @code{rw_active} when it's
called. It can reset the buffers first if desired by using something
like:

@example
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (port);
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (port);
@end example

However, note that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
@code{end_input} and @code{flush} ptob procedures. This is undesirable
when seek is called to measure the current position of the port, i.e.,
@code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
implementations take care to avoid this problem.

The procedure is set using

@deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
@end deftypefun

@item truncate
Truncate the port data to the specified length. It can be assumed that the
current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using

@deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
@end deftypefun

@end table
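
The concern raised under @code{seek} above, that measuring the position
should not disturb the buffers, can be illustrated with a small model.
One possible approach, shown here as an assumption and not as a
description of how the libguile fport or string port code works, is to
derive the logical position from the descriptor position and the buffer
indices without modifying either. The @code{toy_port} structure and
@code{toy_tell} are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a buffered port; not the libguile scm_port. */
struct toy_port {
  long fd_pos;       /* position of the underlying descriptor */
  size_t read_pos;   /* next unread byte in the read buffer */
  size_t read_end;   /* one past the last buffered input byte */
  size_t write_pos;  /* number of bytes waiting in the write buffer */
};

/* Report the logical position without touching any buffer.  Buffered
   input that was never consumed lies behind the descriptor position;
   buffered output that was never flushed lies ahead of it. */
static long toy_tell (const struct toy_port *pt)
{
  assert (pt->read_pos <= pt->read_end);
  long unread = (long) (pt->read_end - pt->read_pos);
  long unwritten = (long) pt->write_pos;
  return pt->fd_pos - unread + unwritten;
}
```

Because @code{toy_tell} takes a pointer to a constant structure, the
compiler enforces that the measurement leaves the buffers untouched.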

@node BOM Handling
@subsection Handling of Unicode Byte Order Marks
@cindex BOM
@cindex byte order mark

This section documents the finer points of Guile's handling of Unicode
byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
determine the byte order. Occasionally, a BOM is found at the start of
a UTF-8 stream, but this is much less common and not generally
recommended.

Guile attempts to handle BOMs automatically, and in accordance with the
recommendations of the Unicode Standard, when the port encoding is set
to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
and automatically consumes one from the start of a UTF-8, UTF-16, or
UTF-32 stream.
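
To make the byte-level meaning of ``consuming a BOM'' concrete, the
following sketch recognizes the encoded forms of U+FEFF at the start of
a buffer. It is an independent illustration, not Guile's implementation;
@code{toy_consume_bom} and @code{toy_byte_order} are hypothetical names,
and the @var{width} argument (8, 16 or 32) stands in for a port encoding
of UTF-8, UTF-16 or UTF-32.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_byte_order { TOY_ORDER_UNKNOWN, TOY_ORDER_BIG, TOY_ORDER_LITTLE };

/* Return the number of leading BOM bytes in buf (0 if there is none)
   and store the byte order the BOM indicates in *order.  For UTF-8 the
   order stays TOY_ORDER_UNKNOWN because UTF-8 has no byte order. */
static size_t toy_consume_bom (const unsigned char *buf, size_t len,
                               int width, enum toy_byte_order *order)
{
  static const unsigned char utf8[]    = { 0xEF, 0xBB, 0xBF };
  static const unsigned char utf16be[] = { 0xFE, 0xFF };
  static const unsigned char utf16le[] = { 0xFF, 0xFE };
  static const unsigned char utf32be[] = { 0x00, 0x00, 0xFE, 0xFF };
  static const unsigned char utf32le[] = { 0xFF, 0xFE, 0x00, 0x00 };

  assert (buf != NULL && order != NULL);
  *order = TOY_ORDER_UNKNOWN;

  switch (width)
    {
    case 8:
      if (len >= 3 && memcmp (buf, utf8, 3) == 0)
        return 3;
      break;
    case 16:
      if (len >= 2 && memcmp (buf, utf16be, 2) == 0)
        {
          *order = TOY_ORDER_BIG;
          return 2;
        }
      if (len >= 2 && memcmp (buf, utf16le, 2) == 0)
        {
          *order = TOY_ORDER_LITTLE;
          return 2;
        }
      break;
    case 32:
      if (len >= 4 && memcmp (buf, utf32be, 4) == 0)
        {
          *order = TOY_ORDER_BIG;
          return 4;
        }
      if (len >= 4 && memcmp (buf, utf32le, 4) == 0)
        {
          *order = TOY_ORDER_LITTLE;
          return 4;
        }
      break;
    }
  return 0;  /* no BOM for this encoding family */
}
```

The width has to be known in advance because the UTF-16 little-endian
BOM is a prefix of the UTF-32 little-endian one, so the bytes alone do
not identify the encoding family.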

As specified in the Unicode Standard, a BOM is only handled specially at
the start of a stream, and only if the port encoding is set to
@code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
the special handling described in this section applies.

@itemize @bullet
@item
To ensure that Guile will properly detect the byte order of a UTF-16 or
UTF-32 stream, you must perform a textual read before any writes, seeks,
or binary I/O. Guile will not attempt to read a BOM unless a read is
explicitly requested at the start of the stream.

@item
If a textual write is performed before the first read, then an arbitrary
byte order will be chosen. Currently, big endian is the default on all
platforms, but that may change in the future. If you wish to explicitly
control the byte order of an output stream, set the port encoding to
@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
and explicitly write a BOM (@code{#\xFEFF}) if desired.

@item
If @code{set-port-encoding!} is called in the middle of a stream, Guile
treats this as a new logical ``start of stream'' for purposes of BOM
handling, and will forget about any BOMs that had previously been seen.
Therefore, it may choose a different byte order than had been used
previously. This is intended to support multiple logical text streams
embedded within a larger binary stream.

@item
Binary I/O operations are not guaranteed to update Guile's notion of
whether the port is at the ``start of the stream'', nor are they
guaranteed to produce or consume BOMs.

@item
For ports that support seeking (e.g. normal files), the input and output
streams are considered linked: if the user reads first, then a BOM will
be consumed (if appropriate), but later writes will @emph{not} produce a
BOM. Similarly, if the user writes first, then later reads will
@emph{not} consume a BOM.

@item
For ports that do not support seeking (e.g. pipes, sockets, and
terminals), the input and output streams are considered
@emph{independent} for purposes of BOM handling: the first read will
consume a BOM (if appropriate), and the first write will @emph{also}
produce a BOM (if appropriate). However, the input and output streams
will always use the same byte order.

@item
Seeks to the beginning of a file will set the ``start of stream'' flags.
Therefore, a subsequent textual read or write will consume or produce a
BOM. However, unlike @code{set-port-encoding!}, if a byte order had
already been chosen for the port, it will remain in effect after a seek,
and cannot be changed by the presence of a BOM. Seeks anywhere other
than the beginning of a file clear the ``start of stream'' flags.
@end itemize
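
The rules above for seekable and non-seekable ports can be summarized as
a small state machine with two ``start of stream'' flags. The sketch
below models only the behaviour described in the list; it is not taken
from Guile's source, all @code{toy_} names are hypothetical, and it
leaves out byte-order selection, binary I/O and
@code{set-port-encoding!}.

```c
#include <assert.h>

/* Hypothetical model of the "start of stream" flags described above. */
struct toy_bom_state {
  int seekable;        /* non-zero for files, zero for pipes and sockets */
  int at_read_start;   /* the next textual read may consume a BOM */
  int at_write_start;  /* the next textual write may produce a BOM */
};

static void toy_bom_init (struct toy_bom_state *s, int seekable)
{
  s->seekable = seekable;
  s->at_read_start = 1;
  s->at_write_start = 1;
}

/* Return non-zero if this textual read should look for a BOM. */
static int toy_bom_on_read (struct toy_bom_state *s)
{
  int check = s->at_read_start;
  s->at_read_start = 0;
  if (s->seekable)          /* linked streams: a read also clears writes */
    s->at_write_start = 0;
  return check;
}

/* Return non-zero if this textual write should emit a BOM first. */
static int toy_bom_on_write (struct toy_bom_state *s)
{
  int emit = s->at_write_start;
  s->at_write_start = 0;
  if (s->seekable)          /* linked streams: a write also clears reads */
    s->at_read_start = 0;
  return emit;
}

/* A seek to offset 0 sets both flags; any other seek clears them. */
static void toy_bom_on_seek (struct toy_bom_state *s, long new_pos)
{
  assert (s->seekable);
  s->at_read_start = (new_pos == 0);
  s->at_write_start = (new_pos == 0);
}
```

In this model a file port that reads first never emits a BOM on a later
write unless it seeks back to offset 0, while a socket port handles its
first read and its first write independently.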

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: