Commit | Line | Data |
---|---|---|
07d83abe MV |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
c62da8f8 | 3 | @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009, |
cdd3d6c9 | 4 | @c 2010, 2011, 2013 Free Software Foundation, Inc. |
07d83abe MV |
5 | @c See the file guile.texi for copying conditions. |
6 | ||
07d83abe MV |
7 | @node Input and Output |
8 | @section Input and Output | |
9 | ||
10 | @menu | |
11 | * Ports:: The idea of the port abstraction. | |
12 | * Reading:: Procedures for reading from a port. | |
13 | * Writing:: Procedures for writing to a port. | |
14 | * Closing:: Procedures to close a port. | |
15 | * Random Access:: Moving around a random access port. | |
16 | * Line/Delimited:: Read and write lines or delimited text. | |
17 | * Block Reading and Writing:: Reading and writing blocks of text. | |
18 | * Default Ports:: Defaults for input, output and errors. | |
19 | * Port Types:: Types of port and how to make them. | |
b242715b | 20 | * R6RS I/O Ports:: The R6RS port API. |
07d83abe | 21 | * I/O Extensions:: Using and extending ports in C. |
cdd3d6c9 | 22 | * BOM Handling:: Handling of Unicode byte order marks. |
07d83abe MV |
23 | @end menu |
24 | ||
25 | ||
26 | @node Ports | |
27 | @subsection Ports | |
bf5df489 | 28 | @cindex Port |
07d83abe MV |
29 | |
30 | Sequential input/output in Scheme is represented by operations on a | |
31 | @dfn{port}. This chapter explains the operations that Guile provides | |
32 | for working with ports. | |
33 | ||
34 | Ports are created by opening, for instance @code{open-file} for a file | |
35 | (@pxref{File Ports}). Characters can be read from an input port and | |
36 | written to an output port, or both on an input/output port. A port | |
37 | can be closed (@pxref{Closing}) when no longer required, after which | |
38 | any attempt to read or write is an error. | |
39 | ||
40 | The formal definition of a port is very generic: an input port is | |
41 | simply ``an object which can deliver characters on demand,'' and an | |
42 | output port is ``an object which can accept characters.'' Because | |
43 | this definition is so loose, it is easy to write functions that | |
44 | simulate ports in software. @dfn{Soft ports} and @dfn{string ports} | |
45 | are two interesting and powerful examples of this technique. | |
46 | (@pxref{Soft Ports}, and @ref{String Ports}.) | |
47 | ||
48 | Ports are garbage collected in the usual way (@pxref{Memory | |
49 | Management}), and will be closed at that time if not already closed. | |
28cc8dac | 50 | In this case any errors occurring in the close will not be reported. |
07d83abe MV |
51 | Usually a program will want to explicitly close so as to be sure all |
52 | its operations have been successful. Of course if a program has | |
53 | abandoned something due to an error or other condition then closing | |
54 | problems are probably not of interest. | |
55 | ||
56 | It is strongly recommended that file ports be closed explicitly when | |
57 | no longer required. Most systems have limits on how many files can be | |
58 | open, both on a per-process and a system-wide basis. A program that | |
59 | uses many files should take care not to hit those limits. The same | |
60 | applies to similar system resources such as pipes and sockets. | |
61 | ||
62 | Note that automatic garbage collection is triggered only by memory | |
63 | consumption, not by file or other resource usage, so a program cannot | |
64 | rely on that to keep it away from system limits. An explicit call to | |
65 | @code{gc} can of course be relied on to pick up unreferenced ports. | |
66 | If program flow makes it hard to be certain when to close then this | |
67 | may be an acceptable way to control resource usage. | |
68 | ||
40296bab KR |
69 | All file access uses the ``LFS'' large file support functions when |
70 | available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be | |
71 | read and written on a 32-bit system. | |
72 | ||
28cc8dac MG |
73 | Each port has an associated character encoding that controls how bytes |
74 | read from the port are converted to characters and strings and controls | |
75 | how characters and strings written to the port are converted to bytes. | |
76 | When ports are created, they inherit their character encoding from the | |
77 | current locale, but that can be modified after the port is created. | |
78 | ||
912a8702 MG |
79 | Currently, the ports only work with @emph{non-modal} encodings. Most |
80 | encodings are non-modal, meaning that the conversion of bytes to a | |
81 | string doesn't depend on its context: the same byte sequence will always | |
82 | return the same string. A couple of modal encodings are in common use, | |
83 | like ISO-2022-JP and ISO-2022-KR, and they are not yet supported. | |
84 | ||
28cc8dac MG |
85 | Each port also has an associated conversion strategy: what to do when |
86 | a Guile character can't be converted to the port's encoded character | |
87 | representation for output. There are three possible strategies: to | |
88 | raise an error, to replace the character with a hex escape, or to | |
89 | replace the character with a substitute character. | |
90 | ||
07d83abe MV |
91 | @rnindex input-port? |
92 | @deffn {Scheme Procedure} input-port? x | |
93 | @deffnx {C Function} scm_input_port_p (x) | |
94 | Return @code{#t} if @var{x} is an input port, otherwise return | |
95 | @code{#f}. Any object satisfying this predicate also satisfies | |
96 | @code{port?}. | |
97 | @end deffn | |
98 | ||
99 | @rnindex output-port? | |
100 | @deffn {Scheme Procedure} output-port? x | |
101 | @deffnx {C Function} scm_output_port_p (x) | |
102 | Return @code{#t} if @var{x} is an output port, otherwise return | |
103 | @code{#f}. Any object satisfying this predicate also satisfies | |
104 | @code{port?}. | |
105 | @end deffn | |
106 | ||
107 | @deffn {Scheme Procedure} port? x | |
108 | @deffnx {C Function} scm_port_p (x) | |
109 | Return a boolean indicating whether @var{x} is a port. | |
110 | Equivalent to @code{(or (input-port? @var{x}) (output-port? | |
111 | @var{x}))}. | |
112 | @end deffn | |
113 | ||
28cc8dac MG |
114 | @deffn {Scheme Procedure} set-port-encoding! port enc |
115 | @deffnx {C Function} scm_set_port_encoding_x (port, enc) | |
4c7b9975 LC |
116 | Sets the character encoding that will be used to interpret all port I/O. |
117 | @var{enc} is a string containing the name of an encoding. Valid | |
118 | encoding names are those | |
119 | @url{http://www.iana.org/assignments/character-sets, defined by IANA}. | |
28cc8dac | 120 | @end deffn |
d6a6989e LC |
121 | |
122 | @defvr {Scheme Variable} %default-port-encoding | |
72b3aa56 | 123 | A fluid containing @code{#f} or the name of the encoding to |
d6a6989e LC |
124 | be used by default for newly created ports (@pxref{Fluids and Dynamic |
125 | States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}. | |
28cc8dac MG |
126 | |
127 | New ports are created with the encoding appropriate for the current | |
4c7b9975 LC |
128 | locale if @code{setlocale} has been called or the value specified by |
129 | this fluid otherwise. | |
130 | @end defvr | |
28cc8dac MG |
131 | |
132 | @deffn {Scheme Procedure} port-encoding port | |
5f6ffd66 | 133 | @deffnx {C Function} scm_port_encoding (port) |
211683cc MG |
134 | Returns, as a string, the character encoding that @var{port} uses to interpret |
135 | its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}. | |
28cc8dac MG |
136 | @end deffn |
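As a minimal sketch of the two procedures above, the following sets the
encoding of a freshly opened string port and reads it back.  The string
port and the names @code{p} and @code{enc} are only illustrative; any
open port should behave the same way.

@lisp
;; Illustrative only: a string port stands in for any open port.
(define p (open-output-string))
(set-port-encoding! p "UTF-8")
(define enc (port-encoding p))   ; expected to be the string "UTF-8"
@end lisp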
137 | ||
138 | @deffn {Scheme Procedure} set-port-conversion-strategy! port sym | |
139 | @deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym) | |
140 | Sets the behavior of the interpreter when outputting a character that | |
141 | is not representable in the port's current encoding. @var{sym} can be | |
142 | either @code{'error}, @code{'substitute}, or @code{'escape}. If it is | |
143 | @code{'error}, an error will be thrown when a nonconvertible character | |
144 | is encountered. If it is @code{'substitute}, then nonconvertible | |
145 | characters will be replaced with approximate characters, or with | |
146 | question marks if no approximately correct character is available. If | |
147 | it is @code{'escape}, it will appear as a hex escape when output. | |
148 | ||
149 | If @var{port} is an open port, the conversion error behavior | |
150 | is set for that port. If it is @code{#f}, it is set as the | |
151 | default behavior for any future ports that get created in | |
152 | this thread. | |
153 | @end deffn | |
154 | ||
155 | @deffn {Scheme Procedure} port-conversion-strategy port | |
156 | @deffnx {C Function} scm_port_conversion_strategy (port) | |
157 | Returns the behavior of the port when outputting a character that is | |
158 | not representable in the port's current encoding. It returns the | |
159 | symbol @code{error} if unrepresentable characters should cause | |
160 | exceptions, @code{substitute} if the port should try to replace | |
161 | unrepresentable characters with question marks or approximate | |
162 | characters, or @code{escape} if unrepresentable characters should be | |
163 | converted to string escapes. | |
164 | ||
165 | If @var{port} is @code{#f}, then the current default behavior will be | |
166 | returned. New ports will have this default behavior when they are | |
167 | created. | |
168 | @end deffn | |
169 | ||
b22e94db LC |
170 | @deffn {Scheme Variable} %default-port-conversion-strategy |
171 | The fluid that defines the conversion strategy for newly created ports, | |
172 | and for other conversion routines such as @code{scm_to_stringn}, | |
173 | @code{scm_from_stringn}, @code{string->pointer}, and | |
174 | @code{pointer->string}. | |
175 | ||
176 | Its value must be one of the symbols described above, with the same | |
177 | semantics: @code{'error}, @code{'substitute}, or @code{'escape}. | |
178 | ||
179 | When Guile starts, its value is @code{'substitute}. | |
180 | ||
181 | Note that @code{(set-port-conversion-strategy! #f @var{sym})} is | |
182 | equivalent to @code{(fluid-set! %default-port-conversion-strategy | |
183 | @var{sym})}. | |
184 | @end deffn | |
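The following sketch shows how the per-port strategy and the default
strategy described above might be set and queried.  The string port and
the variable names are illustrative assumptions, not part of the API.

@lisp
;; Per-port strategy: set it, then read it back.
(define p (open-output-string))
(set-port-conversion-strategy! p 'escape)
(define strategy (port-conversion-strategy p))   ; the symbol escape

;; Default strategy: rebind the fluid for a limited dynamic extent.
(define scoped
  (with-fluids ((%default-port-conversion-strategy 'error))
    (port-conversion-strategy #f)))              ; the symbol error
@end lisp

Rebinding the fluid with @code{with-fluids} leaves the default unchanged
outside the body, which may be preferable to a global
@code{fluid-set!}.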
28cc8dac | 185 | |
07d83abe MV |
186 | |
187 | @node Reading | |
188 | @subsection Reading | |
bf5df489 | 189 | @cindex Reading |
07d83abe MV |
190 | |
191 | [Generic procedures for reading from ports.] | |
192 | ||
1518f649 AW |
193 | These procedures pertain to reading characters and strings from |
194 | ports. To read general S-expressions from ports, see @ref{Scheme Read}. | |
195 | ||
07d83abe | 196 | @rnindex eof-object? |
bf5df489 | 197 | @cindex End of file object |
07d83abe MV |
198 | @deffn {Scheme Procedure} eof-object? x |
199 | @deffnx {C Function} scm_eof_object_p (x) | |
200 | Return @code{#t} if @var{x} is an end-of-file object; otherwise | |
201 | return @code{#f}. | |
202 | @end deffn | |
203 | ||
204 | @rnindex char-ready? | |
205 | @deffn {Scheme Procedure} char-ready? [port] | |
206 | @deffnx {C Function} scm_char_ready_p (port) | |
207 | Return @code{#t} if a character is ready on input @var{port} | |
208 | and return @code{#f} otherwise. If @code{char-ready?} returns | |
209 | @code{#t} then the next @code{read-char} operation on | |
210 | @var{port} is guaranteed not to hang. If @var{port} is a file | |
211 | port at end of file then @code{char-ready?} returns @code{#t}. | |
cdf1ad3b MV |
212 | |
213 | @code{char-ready?} exists to make it possible for a | |
07d83abe MV |
214 | program to accept characters from interactive ports without |
215 | getting stuck waiting for input. Any input editors associated | |
216 | with such ports must make sure that characters whose existence | |
217 | has been asserted by @code{char-ready?} cannot be rubbed out. | |
218 | If @code{char-ready?} were to return @code{#f} at end of file, | |
219 | a port at end of file would be indistinguishable from an | |
cdf1ad3b | 220 | interactive port that has no ready characters. |
07d83abe MV |
221 | @end deffn |
222 | ||
223 | @rnindex read-char | |
224 | @deffn {Scheme Procedure} read-char [port] | |
225 | @deffnx {C Function} scm_read_char (port) | |
226 | Return the next character available from @var{port}, updating | |
227 | @var{port} to point to the following character. If no more | |
228 | characters are available, the end-of-file object is returned. | |
c62da8f8 LC |
229 | |
230 | When @var{port}'s data cannot be decoded according to its | |
231 | character encoding, a @code{decoding-error} is raised and | |
232 | @var{port} points past the erroneous byte sequence. | |
07d83abe MV |
233 | @end deffn |
234 | ||
235 | @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size) | |
236 | Read up to @var{size} bytes from @var{port} and store them in | |
237 | @var{buffer}. The return value is the number of bytes actually read, | |
238 | which can be less than @var{size} if end-of-file has been reached. | |
239 | ||
240 | Note that this function does not update @code{port-line} and | |
241 | @code{port-column} below. | |
242 | @end deftypefn | |
243 | ||
244 | @rnindex peek-char | |
245 | @deffn {Scheme Procedure} peek-char [port] | |
246 | @deffnx {C Function} scm_peek_char (port) | |
247 | Return the next character available from @var{port}, | |
248 | @emph{without} updating @var{port} to point to the following | |
249 | character. If no more characters are available, the | |
cdf1ad3b MV |
250 | end-of-file object is returned. |
251 | ||
252 | The value returned by | |
07d83abe MV |
253 | a call to @code{peek-char} is the same as the value that would |
254 | have been returned by a call to @code{read-char} on the same | |
255 | port. The only difference is that the very next call to | |
256 | @code{read-char} or @code{peek-char} on that @var{port} will | |
257 | return the value returned by the preceding call to | |
258 | @code{peek-char}. In particular, a call to @code{peek-char} on | |
259 | an interactive port will hang waiting for input whenever a call | |
cdf1ad3b | 260 | to @code{read-char} would have hung. |
c62da8f8 LC |
261 | |
262 | As for @code{read-char}, a @code{decoding-error} may be raised | |
263 | if such a situation occurs. However, unlike with @code{read-char}, | |
264 | @var{port} still points at the beginning of the erroneous byte | |
265 | sequence when the error is raised. | |
07d83abe MV |
266 | @end deffn |
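To illustrate the difference between @code{peek-char} and
@code{read-char}, here is a small sketch using a string port
(@pxref{String Ports}).  The port contents and variable names are
chosen only for the example.

@lisp
(define p (open-input-string "ab"))
(define c0 (peek-char p))    ; #\a, and the port still points at #\a
(define c1 (read-char p))    ; #\a again, now consumed
(define c2 (read-char p))    ; #\b
(define c3 (read-char p))    ; the end-of-file object
(define at-eof (eof-object? c3))   ; #t
@end lisp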
267 | ||
268 | @deffn {Scheme Procedure} unread-char cobj [port] | |
269 | @deffnx {C Function} scm_unread_char (cobj, port) | |
64de6db5 | 270 | Place character @var{cobj} in @var{port} so that it will be read by the |
07d83abe MV |
271 | next read operation. If called multiple times, the unread characters |
272 | will be read again in last-in first-out order. If @var{port} is | |
273 | not supplied, the current input port is used. | |
274 | @end deffn | |
275 | ||
276 | @deffn {Scheme Procedure} unread-string str port | |
277 | @deffnx {C Function} scm_unread_string (str, port) | |
278 | Place the string @var{str} in @var{port} so that its characters will | |
279 | be read from left-to-right as the next characters from @var{port} | |
280 | during subsequent read operations. If called multiple times, the | |
281 | unread characters will be read again in last-in first-out order. If | |
9782da8a | 282 | @var{port} is not supplied, the @code{current-input-port} is used. |
07d83abe MV |
283 | @end deffn |
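The last-in first-out behavior of @code{unread-char} and the
left-to-right behavior of @code{unread-string} can be sketched as
follows.  The string port and the variable names are illustrative; the
port is always passed explicitly here.

@lisp
(define p (open-input-string "c"))
(unread-char #\b p)
(unread-char #\a p)          ; the most recently unread char comes first
(define c1 (read-char p))    ; #\a
(define c2 (read-char p))    ; #\b
(unread-string "xy" p)       ; read back left to right
(define c3 (read-char p))    ; #\x
(define c4 (read-char p))    ; #\y
(define c5 (read-char p))    ; #\c, the original content of the port
@end lisp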
284 | ||
285 | @deffn {Scheme Procedure} drain-input port | |
286 | @deffnx {C Function} scm_drain_input (port) | |
287 | This procedure clears a port's input buffers, similar | |
288 | to the way that @code{force-output} clears the output buffer. The | |
289 | contents of the buffers are returned as a single string, e.g., | |
290 | ||
291 | @lisp | |
292 | (define p (open-input-file ...)) | |
293 | (drain-input p) => empty string, nothing buffered yet. | |
294 | (unread-char (read-char p) p) | |
295 | (drain-input p) => initial chars from p, up to the buffer size. | |
296 | @end lisp | |
297 | ||
298 | Draining the buffers may be useful for cleanly finishing | |
299 | buffered I/O so that the file descriptor can be used directly | |
300 | for further input. | |
301 | @end deffn | |
302 | ||
303 | @deffn {Scheme Procedure} port-column port | |
304 | @deffnx {Scheme Procedure} port-line port | |
305 | @deffnx {C Function} scm_port_column (port) | |
306 | @deffnx {C Function} scm_port_line (port) | |
307 | Return the current column number or line number of @var{port}. | |
308 | If the number is | |
309 | unknown, the result is @code{#f}. Otherwise, the result is a 0-origin integer; | |
310 | i.e.@: the first character of the first line is line 0, column 0. | |
311 | (However, when you display a file position, for example in an error | |
312 | message, we recommend you add 1 to get 1-origin integers. This is | |
313 | because lines and column numbers traditionally start with 1, and that is | |
314 | what non-programmers will find most natural.) | |
315 | @end deffn | |
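As a sketch of the 0-origin convention and the recommended 1-origin
display, the following reads past the first newline of a string port and
formats the position for a message.  The names @code{line}, @code{col}
and @code{shown} are illustrative.

@lisp
(define p (open-input-string "ab\ncd"))
(read-char p) (read-char p) (read-char p)   ; consume "ab" and the newline
(define line (port-line p))                 ; 1, i.e. the second line
(define col (port-column p))                ; 0, i.e. the first column
;; Add 1 to each value when showing the position to a user.
(define shown (simple-format #f "~A:~A" (+ line 1) (+ col 1)))   ; "2:1"
@end lisp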
316 | ||
317 | @deffn {Scheme Procedure} set-port-column! port column | |
318 | @deffnx {Scheme Procedure} set-port-line! port line | |
319 | @deffnx {C Function} scm_set_port_column_x (port, column) | |
320 | @deffnx {C Function} scm_set_port_line_x (port, line) | |
321 | Set the current column or line number of @var{port}. | |
322 | @end deffn | |
323 | ||
324 | @node Writing | |
325 | @subsection Writing | |
bf5df489 | 326 | @cindex Writing |
07d83abe MV |
327 | |
328 | [Generic procedures for writing to ports.] | |
329 | ||
1518f649 AW |
330 | These procedures are for writing characters and strings to |
331 | ports. For more information on writing arbitrary Scheme objects to | |
332 | ports, see @ref{Scheme Write}. | |
333 | ||
07d83abe MV |
334 | @deffn {Scheme Procedure} get-print-state port |
335 | @deffnx {C Function} scm_get_print_state (port) | |
336 | Return the print state of the port @var{port}. If @var{port} | |
337 | has no associated print state, @code{#f} is returned. | |
338 | @end deffn | |
339 | ||
07d83abe MV |
340 | @rnindex newline |
341 | @deffn {Scheme Procedure} newline [port] | |
342 | @deffnx {C Function} scm_newline (port) | |
343 | Send a newline to @var{port}. | |
344 | If @var{port} is omitted, send to the current output port. | |
345 | @end deffn | |
346 | ||
cdf1ad3b | 347 | @deffn {Scheme Procedure} port-with-print-state port [pstate] |
07d83abe MV |
348 | @deffnx {C Function} scm_port_with_print_state (port, pstate) |
349 | Create a new port which behaves like @var{port}, but with an | |
cdf1ad3b MV |
350 | included print state @var{pstate}. @var{pstate} is optional. |
351 | If @var{pstate} isn't supplied and @var{port} already has | |
352 | a print state, the old print state is reused. | |
07d83abe MV |
353 | @end deffn |
354 | ||
07d83abe MV |
355 | @deffn {Scheme Procedure} simple-format destination message . args |
356 | @deffnx {C Function} scm_simple_format (destination, message, args) | |
357 | Write @var{message} to @var{destination}, defaulting to | |
358 | the current output port. | |
359 | @var{message} can contain @code{~A} (was @code{%s}) and | |
360 | @code{~S} (was @code{%S}) escapes. When printed, | |
361 | the escapes are replaced with corresponding members of | |
64de6db5 | 362 | @var{args}: |
07d83abe MV |
363 | @code{~A} formats using @code{display} and @code{~S} formats |
364 | using @code{write}. | |
365 | If @var{destination} is @code{#t}, then use the current output | |
366 | port, if @var{destination} is @code{#f}, then return a string | |
367 | containing the formatted text. Does not add a trailing newline. | |
368 | @end deffn | |
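A short sketch of the two escapes, using @code{#f} as the destination so
that the formatted text is returned as a string.  The argument values
are arbitrary.

@lisp
;; ~A formats like display, ~S formats like write.
(define s1 (simple-format #f "~A and ~S" "x" "y"))   ; "x and \"y\""

;; With #t the text goes to the current output port, without a newline.
(simple-format #t "count: ~A" 42)
@end lisp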
369 | ||
370 | @rnindex write-char | |
371 | @deffn {Scheme Procedure} write-char chr [port] | |
372 | @deffnx {C Function} scm_write_char (chr, port) | |
373 | Send character @var{chr} to @var{port}. | |
374 | @end deffn | |
375 | ||
376 | @deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size) | |
377 | Write @var{size} bytes at @var{buffer} to @var{port}. | |
378 | ||
379 | Note that this function does not update @code{port-line} and | |
380 | @code{port-column} (@pxref{Reading}). | |
381 | @end deftypefn | |
382 | ||
383 | @findex fflush | |
384 | @deffn {Scheme Procedure} force-output [port] | |
385 | @deffnx {C Function} scm_force_output (port) | |
386 | Flush the specified output port, or the current output port if @var{port} | |
387 | is omitted. The current output buffer contents are passed to the | |
388 | underlying port implementation (e.g., in the case of fports, the | |
389 | data will be written to the file and the output buffer will be cleared). | |
390 | It has no effect on an unbuffered port. | |
391 | ||
392 | The return value is unspecified. | |
393 | @end deffn | |
394 | ||
395 | @deffn {Scheme Procedure} flush-all-ports | |
396 | @deffnx {C Function} scm_flush_all_ports () | |
397 | Equivalent to calling @code{force-output} on | |
398 | all open output ports. The return value is unspecified. | |
399 | @end deffn | |
400 | ||
401 | ||
402 | @node Closing | |
403 | @subsection Closing | |
bf5df489 KR |
404 | @cindex Closing ports |
405 | @cindex Port, close | |
07d83abe MV |
406 | |
407 | @deffn {Scheme Procedure} close-port port | |
408 | @deffnx {C Function} scm_close_port (port) | |
409 | Close the specified port object. Return @code{#t} if it | |
410 | successfully closes a port or @code{#f} if it was already | |
411 | closed. An exception may be raised if an error occurs, for | |
412 | example when flushing buffered output. See also @ref{Ports and | |
413 | File Descriptors, close}, for a procedure which can close file | |
414 | descriptors. | |
415 | @end deffn | |
416 | ||
417 | @deffn {Scheme Procedure} close-input-port port | |
418 | @deffnx {Scheme Procedure} close-output-port port | |
419 | @deffnx {C Function} scm_close_input_port (port) | |
420 | @deffnx {C Function} scm_close_output_port (port) | |
421 | @rnindex close-input-port | |
422 | @rnindex close-output-port | |
423 | Close the specified input or output @var{port}. An exception may be | |
424 | raised if an error occurs while closing. If @var{port} is already | |
425 | closed, nothing is done. The return value is unspecified. | |
426 | ||
427 | See also @ref{Ports and File Descriptors, close}, for a procedure | |
428 | which can close file descriptors. | |
429 | @end deffn | |
430 | ||
431 | @deffn {Scheme Procedure} port-closed? port | |
432 | @deffnx {C Function} scm_port_closed_p (port) | |
433 | Return @code{#t} if @var{port} is closed or @code{#f} if it is | |
434 | open. | |
435 | @end deffn | |
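The return values of @code{close-port} and @code{port-closed?} described
above can be observed with a sketch like this one.  A string port is
used only to keep the example free of file system side effects.

@lisp
(define p (open-input-string "data"))
(define before (port-closed? p))   ; #f, the port is open
(define r1 (close-port p))         ; #t, the port was closed by this call
(define r2 (close-port p))         ; #f, the port was already closed
(define after (port-closed? p))    ; #t
@end lisp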
436 | ||
437 | ||
438 | @node Random Access | |
439 | @subsection Random Access | |
bf5df489 KR |
440 | @cindex Random access, ports |
441 | @cindex Port, random access | |
07d83abe MV |
442 | |
443 | @deffn {Scheme Procedure} seek fd_port offset whence | |
444 | @deffnx {C Function} scm_seek (fd_port, offset, whence) | |
64de6db5 | 445 | Sets the current position of @var{fd_port} to the integer |
07d83abe MV |
446 | @var{offset}, which is interpreted according to the value of |
447 | @var{whence}. | |
448 | ||
449 | One of the following variables should be supplied for | |
450 | @var{whence}: | |
451 | @defvar SEEK_SET | |
452 | Seek from the beginning of the file. | |
453 | @end defvar | |
454 | @defvar SEEK_CUR | |
455 | Seek from the current position. | |
456 | @end defvar | |
457 | @defvar SEEK_END | |
458 | Seek from the end of the file. | |
459 | @end defvar | |
64de6db5 | 460 | If @var{fd_port} is a file descriptor, the underlying system |
07d83abe MV |
461 | call is @code{lseek}. @var{fd_port} may also be a string port. | |
462 | ||
463 | The value returned is the new position in the file. This means | |
464 | that the current position of a port can be obtained using: | |
465 | @lisp | |
466 | (seek port 0 SEEK_CUR) | |
467 | @end lisp | |
468 | @end deffn | |
469 | ||
470 | @deffn {Scheme Procedure} ftell fd_port | |
471 | @deffnx {C Function} scm_ftell (fd_port) | |
472 | Return an integer representing the current position of | |
64de6db5 | 473 | @var{fd_port}, measured from the beginning. Equivalent to: |
07d83abe MV |
474 | |
475 | @lisp | |
476 | (seek fd_port 0 SEEK_CUR) | |
477 | @end lisp | |
478 | @end deffn | |
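The following sketch exercises @code{seek} and @code{ftell} on a string
port.  It assumes ASCII content, so that character and byte positions
coincide; the variable names are illustrative.

@lisp
(define p (open-input-string "hello"))
(define pos (seek p 2 SEEK_SET))   ; 2, counted from the beginning
(define c (read-char p))           ; #\l, the character at offset 2
(define now (ftell p))             ; 3, one past the character just read
(define end (seek p 0 SEEK_END))   ; 5, the length of the content
@end lisp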
479 | ||
480 | @findex truncate | |
481 | @findex ftruncate | |
40296bab KR |
482 | @deffn {Scheme Procedure} truncate-file file [length] |
483 | @deffnx {C Function} scm_truncate_file (file, length) | |
484 | Truncate @var{file} to @var{length} bytes. @var{file} can be a | |
485 | filename string, a port object, or an integer file descriptor. The | |
486 | return value is unspecified. | |
487 | ||
488 | For a port or file descriptor @var{length} can be omitted, in which | |
489 | case the file is truncated at the current position (per @code{ftell} | |
490 | above). | |
491 | ||
492 | On most systems a file can be extended by giving a length greater than | |
493 | the current size, but this is not mandatory in the POSIX standard. | |
07d83abe MV |
494 | @end deffn |
495 | ||
496 | @node Line/Delimited | |
497 | @subsection Line Oriented and Delimited Text | |
bf5df489 KR |
498 | @cindex Line input/output |
499 | @cindex Port, line input/output | |
07d83abe MV |
500 | |
501 | The delimited-I/O module can be accessed with: | |
502 | ||
aba0dff5 | 503 | @lisp |
07d83abe | 504 | (use-modules (ice-9 rdelim)) |
aba0dff5 | 505 | @end lisp |
07d83abe MV |
506 | |
507 | It can be used to read or write lines of text, or read text delimited by | |
508 | a specified set of characters. It's similar to the @code{(scsh rdelim)} | |
509 | module from guile-scsh, but does not use multiple values or character | |
510 | sets and has an extra procedure @code{write-line}. | |
511 | ||
512 | @c begin (scm-doc-string "rdelim.scm" "read-line") | |
513 | @deffn {Scheme Procedure} read-line [port] [handle-delim] | |
514 | Return a line of text from @var{port} if specified, otherwise from the | |
515 | value returned by @code{(current-input-port)}. Under Unix, a line of text | |
516 | is terminated by the first end-of-line character or by end-of-file. | |
517 | ||
518 | If @var{handle-delim} is specified, it should be one of the following | |
519 | symbols: | |
520 | @table @code | |
521 | @item trim | |
522 | Discard the terminating delimiter. This is the default, but it will | |
523 | be impossible to tell whether the read terminated with a delimiter or | |
524 | end-of-file. | |
525 | @item concat | |
526 | Append the terminating delimiter (if any) to the returned string. | |
527 | @item peek | |
528 | Push the terminating delimiter (if any) back on to the port. | |
529 | @item split | |
530 | Return a pair containing the string read from the port and the | |
531 | terminating delimiter or end-of-file object. | |
532 | @end table | |
c62da8f8 LC |
533 | |
534 | Like @code{read-char}, this procedure can throw to @code{decoding-error} | |
535 | (@pxref{Reading, @code{read-char}}). | |
07d83abe MV |
536 | @end deffn |
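As a sketch of how the @var{handle-delim} symbols change the result, the
following reads three lines from a string port whose last line has no
trailing newline.  The input text and variable names are illustrative.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "one\ntwo\nthree"))
(define l1 (read-line p))           ; "one", delimiter discarded (trim)
(define l2 (read-line p 'concat))   ; "two\n", delimiter appended
(define l3 (read-line p 'split))    ; ("three" . #<eof>)
(define l4 (read-line p))           ; the end-of-file object
@end lisp

With @code{split}, the @code{cdr} of the result tells the caller whether
the line ended with a delimiter or with end-of-file, which @code{trim}
cannot express.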
537 | ||
538 | @c begin (scm-doc-string "rdelim.scm" "read-line!") | |
539 | @deffn {Scheme Procedure} read-line! buf [port] | |
540 | Read a line of text into the supplied string @var{buf} and return the | |
541 | number of characters added to @var{buf}. If @var{buf} is filled, then | |
542 | @code{#f} is returned. | |
543 | Read from @var{port} if | |
544 | specified, otherwise from the value returned by @code{(current-input-port)}. | |
545 | @end deffn | |
546 | ||
547 | @c begin (scm-doc-string "rdelim.scm" "read-delimited") | |
548 | @deffn {Scheme Procedure} read-delimited delims [port] [handle-delim] | |
549 | Read text until one of the characters in the string @var{delims} is found | |
550 | or end-of-file is reached. Read from @var{port} if supplied, otherwise | |
551 | from the value returned by @code{(current-input-port)}. | |
552 | @var{handle-delim} takes the same values as described for @code{read-line}. | |
553 | @end deffn | |
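A small sketch of @code{read-delimited} with two delimiter characters.
The input text, the delimiter set and the variable names are chosen only
for illustration.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "key=value;rest"))
(define k (read-delimited "=;" p))          ; "key", the "=" is discarded
(define v (read-delimited "=;" p 'split))   ; ("value" . #\;)
(define r (read-delimited "=;" p))          ; "rest", ended by end-of-file
@end lisp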
554 | ||
555 | @c begin (scm-doc-string "rdelim.scm" "read-delimited!") | |
556 | @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end] | |
e7fb779f AW |
557 | Read text into the supplied string @var{buf}. |
558 | ||
559 | If a delimiter was found, return the number of characters written, | |
560 | except if @var{handle-delim} is @code{split}, in which case the return | |
561 | value is a pair, as noted above. | |
562 | ||
563 | As a special case, if @var{port} was already at end-of-stream, the EOF | |
564 | object is returned. Also, if no characters were written because the | |
565 | buffer was full, @code{#f} is returned. | |
566 | ||
567 | It's something of a wacky interface, to be honest. | |
07d83abe MV |
568 | @end deffn |
569 | ||
570 | @deffn {Scheme Procedure} write-line obj [port] | |
571 | @deffnx {C Function} scm_write_line (obj, port) | |
572 | Display @var{obj} and a newline character to @var{port}. If | |
573 | @var{port} is not specified, @code{(current-output-port)} is | |
574 | used. This function is equivalent to: | |
575 | @lisp | |
576 | (display obj [port]) | |
577 | (newline [port]) | |
578 | @end lisp | |
579 | @end deffn | |
580 | ||
5a35d42a AW |
581 | In the past, Guile did not have a procedure that would just read out all |
582 | of the characters from a port. As a workaround, many people just called | |
583 | @code{read-delimited} with no delimiters, knowing that would produce the | |
584 | behavior they wanted. This prompted Guile developers to add some | |
585 | routines that would read all characters from a port. So it is that | |
586 | @code{(ice-9 rdelim)} is also the home for procedures that can read | |
587 | undelimited text: | |
588 | ||
589 | @deffn {Scheme Procedure} read-string [port] [count] | |
590 | Read all of the characters out of @var{port} and return them as a | |
591 | string. If the @var{count} is present, treat it as a limit to the | |
592 | number of characters to read. | |
593 | ||
594 | By default, read from the current input port, with no size limit on the | |
595 | result. This procedure always returns a string, even if no characters | |
596 | were read. | |
597 | @end deffn | |
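The effect of the optional @var{count} argument can be sketched as
follows, again with an illustrative string port and variable names.

@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "hello world"))
(define head (read-string p 5))   ; "hello", at most 5 characters
(define tail (read-string p))     ; " world", everything that remains
(define none (read-string p))     ; "", still a string at end-of-file
@end lisp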
598 | ||
599 | @deffn {Scheme Procedure} read-string! buf [port] [start] [end] | |
600 | Fill @var{buf} with characters read from @var{port}, defaulting to the | |
601 | current input port. Return the number of characters read. | |
602 | ||
603 | If @var{start} or @var{end} are specified, store data only into the | |
604 | substring of @var{buf} bounded by @var{start} and @var{end} (which | |
605 | default to the beginning and end of the string, respectively). | |
606 | @end deffn | |
607 | ||
28cc8dac | 608 | Some of the aforementioned I/O functions rely on the following C |
07d83abe MV |
609 | primitives. These will mainly be of interest to people hacking Guile |
610 | internals. | |
611 | ||
612 | @deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]] | |
613 | @deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end) | |
614 | Read characters from @var{port} into @var{str} until one of the | |
615 | characters in the @var{delims} string is encountered. If | |
616 | @var{gobble} is true, discard the delimiter character; | |
617 | otherwise, leave it in the input stream for the next read. If | |
618 | @var{port} is not specified, use the value of | |
619 | @code{(current-input-port)}. If @var{start} or @var{end} are | |
620 | specified, store data only into the substring of @var{str} | |
621 | bounded by @var{start} and @var{end} (which default to the | |
622 | beginning and end of the string, respectively). | |
623 | ||
624 | Return a pair consisting of the delimiter that terminated the | |
625 | string and the number of characters read. If reading stopped | |
626 | at the end of file, the delimiter returned is the | |
627 | @var{eof-object}; if the string was filled without encountering | |
628 | a delimiter, this value is @code{#f}. | |
629 | @end deffn | |
630 | ||
631 | @deffn {Scheme Procedure} %read-line [port] | |
632 | @deffnx {C Function} scm_read_line (port) | |
633 | Read a newline-terminated line from @var{port}, allocating storage as | |
634 | necessary. The newline terminator (if any) is removed from the string, | |
635 | and a pair consisting of the line and its delimiter is returned. The | |
636 | delimiter may be either a newline or the @var{eof-object}; if | |
637 | @code{%read-line} is called at the end of file, it returns the pair | |
638 | @code{(#<eof> . #<eof>)}. | |
639 | @end deffn | |
640 | ||
641 | @node Block Reading and Writing | |
642 | @subsection Block Reading and Writing | |
bf5df489 KR |
643 | @cindex Block read/write |
644 | @cindex Port, block read/write | |
07d83abe MV |
645 | |
646 | The Block-string-I/O module can be accessed with: | |
647 | ||
aba0dff5 | 648 | @lisp |
07d83abe | 649 | (use-modules (ice-9 rw)) |
aba0dff5 | 650 | @end lisp |
07d83abe MV |
651 | |
652 | It currently contains procedures that help to implement the | |
653 | @code{(scsh rw)} module in guile-scsh. | |
654 | ||
655 | @deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]] | |
656 | @deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end) | |
657 | Read characters from a port or file descriptor into a | |
658 | string @var{str}. A port must have an underlying file | |
659 | descriptor --- a so-called fport. This procedure is | |
660 | scsh-compatible and can efficiently read large strings. | |
661 | It will: | |
662 | ||
663 | @itemize | |
664 | @item | |
665 | attempt to fill the entire string, unless the @var{start} | |
666 | and/or @var{end} arguments are supplied; @var{start} | |
667 | defaults to 0 and @var{end} defaults to | |
668 | @code{(string-length str)}. | |
669 | @item | |
670 | use the current input port if @var{port_or_fdes} is not | |
671 | supplied. | |
672 | @item | |
673 | return fewer than the requested number of characters in some | |
674 | cases, e.g., on end of file, if interrupted by a signal, or if | |
675 | not all the characters are immediately available. | |
676 | @item | |
677 | wait indefinitely for some input if no characters are | |
678 | currently available, | |
679 | unless the port is in non-blocking mode. | |
680 | @item | |
681 | read characters from the port's input buffers if available, | |
682 | instead of from the underlying file descriptor. | |
683 | @item | |
684 | return @code{#f} if end-of-file is encountered before reading | |
685 | any characters, otherwise return the number of characters | |
686 | read. | |
687 | @item | |
688 | return 0 if the port is in non-blocking mode and no characters | |
689 | are immediately available. | |
690 | @item | |
691 | return 0 if the request is for 0 bytes, with no | |
692 | end-of-file check. | |
693 | @end itemize | |
694 | @end deffn | |
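
Since @code{read-string!/partial} may return fewer characters than
requested, callers usually loop.  The sketch below assumes a blocking
file port; @code{for-each-chunk} and @code{consume} are hypothetical
names, not part of @code{(ice-9 rw)}.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: read PORT in chunks of at most 4096 characters
;; and pass each chunk to CONSUME until end of file is reached.
;; Assumes PORT is a blocking fport; on a non-blocking port a return
;; value of 0 would make this loop spin.
(define (for-each-chunk consume port)
  (let ((buf (make-string 4096)))
    (let loop ()
      (let ((n (read-string!/partial buf port)))
        (if n
            (begin
              (consume (substring buf 0 n))
              (loop))
            #t)))))                     ; #f from the read means end of file
@end lisp

For instance, @code{(call-with-input-file "notes.txt" (lambda (port)
(for-each-chunk display port)))} would copy a file to the current output
port, where @file{notes.txt} is a placeholder name.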
695 | ||
696 | @deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]] | |
697 | @deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end) | |
698 | Write characters from a string @var{str} to a port or file | |
699 | descriptor. A port must have an underlying file descriptor | |
700 | --- a so-called fport. This procedure is | |
701 | scsh-compatible and can efficiently write large strings. | |
702 | It will: | |
703 | ||
704 | @itemize | |
705 | @item | |
706 | attempt to write the entire string, unless the @var{start} | |
707 | and/or @var{end} arguments are supplied; @var{start} | |
708 | defaults to 0 and @var{end} defaults to | |
709 | @code{(string-length str)}. | |
710 | @item | |
711 | use the current output port if @var{port_or_fdes} is not | |
712 | supplied. | |
713 | @item | |
714 | in the case of a buffered port, store the characters in the | |
715 | port's output buffer, if all will fit. If they will not fit | |
716 | then any existing buffered characters will be flushed | |
717 | before attempting | |
718 | to write the new characters directly to the underlying file | |
719 | descriptor. If the port is in non-blocking mode and | |
720 | buffered characters cannot be flushed immediately, then an | |
721 | @code{EAGAIN} system-error exception will be raised. (Note: | |
722 | scsh does not support the use of non-blocking buffered ports.) | |
723 | @item | |
724 | write fewer than the requested number of | |
725 | characters in some cases, e.g., if interrupted by a signal or | |
726 | if not all of the output can be accepted immediately. | |
727 | @item | |
728 | wait indefinitely for at least one character | |
729 | from @var{str} to be accepted by the port, unless the port is | |
730 | in non-blocking mode. | |
731 | @item | |
732 | return the number of characters accepted by the port. | |
733 | @item | |
734 | return 0 if the port is in non-blocking mode and cannot accept | |
735 | at least one character from @var{str} immediately. | |
736 | @item | |
737 | return 0 immediately if the request size is 0 bytes. | |
738 | @end itemize | |
739 | @end deffn | |
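
Because a partial write is possible, writing a whole string generally
takes a loop that advances @var{start} by the count returned.  This is
only a sketch for a blocking file port; @code{write-string/all} is a
hypothetical helper name.

@lisp
(use-modules (ice-9 rw))

;; Hypothetical helper: keep calling write-string/partial until every
;; character of STR has been accepted by PORT.  Assumes a blocking
;; fport, so each call accepts at least one character.
(define (write-string/all str port)
  (let ((end (string-length str)))
    (let loop ((start 0))
      (when (< start end)
        (loop (+ start
                 (write-string/partial str port start end)))))))
@end lisp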
740 | ||
741 | @node Default Ports | |
742 | @subsection Default Ports for Input, Output and Errors | |
bf5df489 KR |
743 | @cindex Default ports |
744 | @cindex Port, default | |
07d83abe MV |
745 | |
746 | @rnindex current-input-port | |
747 | @deffn {Scheme Procedure} current-input-port | |
748 | @deffnx {C Function} scm_current_input_port () | |
34846414 | 749 | @cindex standard input |
07d83abe | 750 | Return the current input port. This is the default port used |
3fa0a042 KR |
751 | by many input procedures. |
752 | ||
753 | Initially this is the @dfn{standard input} in Unix and C terminology. | |
754 | When the standard input is a tty the port is unbuffered, otherwise | |
755 | it's fully buffered. | |
756 | ||
757 | Unbuffered input is good if an application runs an interactive | |
758 | subprocess, since any type-ahead input won't go into Guile's buffer | |
9782da8a | 759 | and be unavailable to the subprocess. |
3fa0a042 KR |
760 | |
761 | Note that Guile buffering is completely separate from the tty ``line | |
9782da8a KR |
762 | discipline''. In the usual cooked mode on a tty Guile only sees a |
763 | line of input once the user presses @key{Return}. | |
07d83abe MV |
764 | @end deffn |
765 | ||
766 | @rnindex current-output-port | |
767 | @deffn {Scheme Procedure} current-output-port | |
768 | @deffnx {C Function} scm_current_output_port () | |
34846414 | 769 | @cindex standard output |
07d83abe | 770 | Return the current output port. This is the default port used |
3fa0a042 KR |
771 | by many output procedures. |
772 | ||
773 | Initially this is the @dfn{standard output} in Unix and C terminology. | |
774 | When the standard output is a tty this port is unbuffered, otherwise | |
775 | it's fully buffered. | |
776 | ||
777 | Unbuffered output to a tty is good for ensuring progress output or a | |
778 | prompt is seen. But an application which always prints whole lines | |
779 | could change to line buffered, or an application with a lot of output | |
780 | could go fully buffered and perhaps make explicit @code{force-output} | |
781 | calls (@pxref{Writing}) at selected points. | |
07d83abe MV |
782 | @end deffn |
783 | ||
784 | @deffn {Scheme Procedure} current-error-port | |
785 | @deffnx {C Function} scm_current_error_port () | |
34846414 | 786 | @cindex standard error output |
3fa0a042 KR |
787 | Return the port to which errors and warnings should be sent. |
788 | ||
789 | Initially this is the @dfn{standard error} in Unix and C terminology. | |
790 | When the standard error is a tty this port is unbuffered, otherwise | |
791 | it's fully buffered. | |
07d83abe MV |
792 | @end deffn |
793 | ||
794 | @deffn {Scheme Procedure} set-current-input-port port | |
795 | @deffnx {Scheme Procedure} set-current-output-port port | |
796 | @deffnx {Scheme Procedure} set-current-error-port port | |
797 | @deffnx {C Function} scm_set_current_input_port (port) | |
798 | @deffnx {C Function} scm_set_current_output_port (port) | |
799 | @deffnx {C Function} scm_set_current_error_port (port) | |
800 | Change the ports returned by @code{current-input-port}, | |
801 | @code{current-output-port} and @code{current-error-port}, respectively, | |
802 | so that they use the supplied @var{port} for input or output. | |
803 | @end deffn | |
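
The following sketch swaps the current output port for a string port and
then restores it by hand.  The names @code{saved} and @code{capture} are
illustrative.

@lisp
;; Remember the original port so that it can be restored afterwards.
(define saved (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "hidden from the terminal")   ; goes to CAPTURE
(set-current-output-port saved)

;; The text written while CAPTURE was current.
(get-output-string capture)
@end lisp

Nothing here restores the port if an error occurs between the two
@code{set-current-output-port} calls, so a form such as
@code{with-output-to-string} is often the safer choice when the
redirection only needs to cover one computation.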
804 | ||
661ae7ab MV |
805 | @deftypefn {C Function} void scm_dynwind_current_input_port (SCM port) |
806 | @deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port) | |
807 | @deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port) | |
07d83abe | 808 | These functions must be used inside a pair of calls to |
661ae7ab MV |
809 | @code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic |
810 | Wind}). During the dynwind context, the indicated port is set to | |
07d83abe MV |
811 | @var{port}. |
812 | ||
813 | More precisely, the current port is swapped with a `backup' value | |
661ae7ab | 814 | whenever the dynwind context is entered or left. The backup value is |
07d83abe MV |
815 | initialized with the @var{port} argument. |
816 | @end deftypefn | |
817 | ||
818 | @node Port Types | |
819 | @subsection Types of Port | |
bf5df489 KR |
820 | @cindex Types of ports |
821 | @cindex Port, types | |
07d83abe MV |
822 | |
823 | [Types of port; how to make them.] | |
824 | ||
825 | @menu | |
826 | * File Ports:: Ports on an operating system file. | |
827 | * String Ports:: Ports on a Scheme string. | |
828 | * Soft Ports:: Ports on arbitrary Scheme procedures. | |
829 | * Void Ports:: Ports on nothing at all. | |
830 | @end menu | |
831 | ||
832 | ||
833 | @node File Ports | |
834 | @subsubsection File Ports | |
bf5df489 KR |
835 | @cindex File port |
836 | @cindex Port, file | |
07d83abe MV |
837 | |
838 | The following procedures are used to open file ports. | |
839 | See also @ref{Ports and File Descriptors, open}, for an interface | |
840 | to the Unix @code{open} system call. | |
841 | ||
842 | Most systems have limits on how many files can be open, so it's | |
843 | strongly recommended that file ports be closed explicitly when no | |
844 | longer required (@pxref{Ports}). | |
845 | ||
3ace9a8e MW |
846 | @deffn {Scheme Procedure} open-file filename mode @ |
847 | [#:guess-encoding=#f] [#:encoding=#f] | |
848 | @deffnx {C Function} scm_open_file_with_encoding @ | |
849 | (filename, mode, guess_encoding, encoding) | |
07d83abe MV |
850 | @deffnx {C Function} scm_open_file (filename, mode) |
851 | Open the file whose name is @var{filename}, and return a port | |
852 | representing that file. The attributes of the port are | |
853 | determined by the @var{mode} string. The way in which this is | |
854 | interpreted is similar to C stdio. The first character must be | |
855 | one of the following: | |
c755b861 | 856 | |
07d83abe MV |
857 | @table @samp |
858 | @item r | |
859 | Open an existing file for input. | |
860 | @item w | |
861 | Open a file for output, creating it if it doesn't already exist | |
862 | or removing its contents if it does. | |
863 | @item a | |
864 | Open a file for output, creating it if it doesn't already | |
865 | exist. All writes to the port will go to the end of the file. | |
866 | The ``append mode'' can be turned off while the port is in use | |
867 | (@pxref{Ports and File Descriptors, fcntl}). | |
868 | @end table | |
c755b861 | 869 | |
07d83abe | 870 | The following additional characters can be appended: |
c755b861 | 871 | |
07d83abe MV |
872 | @table @samp |
873 | @item + | |
874 | Open the port for both input and output. E.g., @code{r+}: open | |
875 | an existing file for both input and output. | |
876 | @item 0 | |
877 | Create an ``unbuffered'' port. In this case input and output | |
878 | operations are passed directly to the underlying port | |
879 | implementation without additional buffering. This is likely to | |
880 | slow down I/O operations. The buffering mode can be changed | |
881 | while a port is in use (@pxref{Ports and File Descriptors, | |
882 | setvbuf}). | |
883 | @item l | |
884 | Add line-buffering to the port. The port output buffer will be | |
885 | automatically flushed whenever a newline character is written. | |
c755b861 | 886 | @item b |
5261e742 AW |
887 | Use binary mode, ensuring that each byte in the file will be read as one |
888 | Scheme character. | |
889 | ||
890 | To provide this property, the file will be opened with the 8-bit | |
9a334eb3 MW |
891 | character encoding "ISO-8859-1", ignoring the default port encoding. |
892 | @xref{Ports}, for more information on port encodings. | |
5261e742 AW |
893 | |
894 | Note that while it is possible to read and write binary data as | |
895 | characters or strings, it is usually better to treat bytes as octets, | |
896 | and byte sequences as bytevectors. @xref{R6RS Binary Input}, and | |
897 | @ref{R6RS Binary Output}, for more. | |
898 | ||
899 | This option had another historical meaning, for DOS compatibility: in | |
900 | the default (textual) mode, DOS reads a CR-LF sequence as one LF byte. | |
901 | The @code{b} flag prevents this from happening, adding @code{O_BINARY} | |
902 | to the underlying @code{open} call. Still, the flag is generally useful | |
903 | because of its port encoding ramifications. | |
07d83abe | 904 | @end table |
c755b861 | 905 | |
3ace9a8e MW |
906 | Unless binary mode is requested, the character encoding of the new port |
907 | is determined as follows: First, if @var{guess-encoding} is true, the | |
908 | @code{file-encoding} procedure is used to guess the encoding of the file | |
909 | (@pxref{Character Encoding of Source Files}). If @var{guess-encoding} | |
910 | is false or if @code{file-encoding} fails, @var{encoding} is used unless | |
911 | it is also false. As a last resort, the default port encoding is used. | |
912 | @xref{Ports}, for more information on port encodings. It is an error to | |
913 | pass a non-false @var{guess-encoding} or @var{encoding} if binary mode | |
914 | is requested. | |
915 | ||
916 | If a file cannot be opened with the access requested, @code{open-file} | |
917 | throws an exception. | |
092bdcc4 | 918 | |
9a334eb3 MW |
919 | When the file is opened, its encoding is set to the current |
920 | @code{%default-port-encoding}, unless the @code{b} flag was supplied. | |
921 | Sometimes it is desirable to honor Emacs-style coding declarations in | |
922 | files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This | |
923 | behavior was deemed inappropriate and disabled starting from Guile | |
924 | 2.0.8.}. When that is the case, the @code{file-encoding} procedure can | |
925 | be used as follows (@pxref{Character Encoding of Source Files, | |
926 | @code{file-encoding}}): | |
927 | ||
928 | @example | |
929 | (let* ((port (open-input-file file)) | |
930 | (encoding (file-encoding port))) | |
931 | (set-port-encoding! port (or encoding (port-encoding port)))) | |
932 | @end example | |
211683cc | 933 | |
07d83abe MV |
934 | In theory we could create read/write ports which were buffered |
935 | in one direction only. However this isn't included in the | |
092bdcc4 | 936 | current interfaces. |
07d83abe MV |
937 | @end deffn |
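
To make the mode strings concrete, here is a small sketch that writes,
appends to, and then reads back one file.  The file name
@file{scratch.txt} is a placeholder, and @code{read-line} is assumed to
come from @code{(ice-9 rdelim)}.

@lisp
(use-modules (ice-9 rdelim))

;; Placeholder name; any writable path would do.
(define file "scratch.txt")

;; "w" creates the file, or empties it if it already exists.
(let ((port (open-file file "w" #:encoding "UTF-8")))
  (display "first line\n" port)
  (close-port port))

;; "a" sends every write to the end of the file.
(let ((port (open-file file "a" #:encoding "UTF-8")))
  (display "second line\n" port)
  (close-port port))

;; "r" opens the existing file for input.
(define lines
  (let* ((port (open-file file "r" #:encoding "UTF-8"))
         (one (read-line port))
         (two (read-line port)))
    (close-port port)
    (list one two)))
@end lisp

Each port is closed explicitly, in line with the recommendation at the
start of this section.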
938 | ||
939 | @rnindex open-input-file | |
3ace9a8e MW |
940 | @deffn {Scheme Procedure} open-input-file filename @ |
941 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] | |
942 | ||
943 | Open @var{filename} for input. If @var{binary} is true, open the port | |
944 | in binary mode, otherwise use text mode. @var{encoding} and | |
945 | @var{guess-encoding} determine the character encoding as described above | |
946 | for @code{open-file}. Equivalent to | |
aba0dff5 | 947 | @lisp |
3ace9a8e MW |
948 | (open-file @var{filename} |
949 | (if @var{binary} "rb" "r") | |
950 | #:guess-encoding @var{guess-encoding} | |
951 | #:encoding @var{encoding}) | |
aba0dff5 | 952 | @end lisp |
07d83abe MV |
953 | @end deffn |
954 | ||
955 | @rnindex open-output-file | |
3ace9a8e MW |
956 | @deffn {Scheme Procedure} open-output-file filename @ |
957 | [#:encoding=#f] [#:binary=#f] | |
958 | ||
959 | Open @var{filename} for output. If @var{binary} is true, open the port | |
960 | in binary mode, otherwise use text mode. @var{encoding} specifies the | |
961 | character encoding as described above for @code{open-file}. Equivalent | |
962 | to | |
aba0dff5 | 963 | @lisp |
3ace9a8e MW |
964 | (open-file @var{filename} |
965 | (if @var{binary} "wb" "w") | |
966 | #:encoding @var{encoding}) | |
aba0dff5 | 967 | @end lisp |
07d83abe MV |
968 | @end deffn |
969 | ||
3ace9a8e MW |
970 | @deffn {Scheme Procedure} call-with-input-file filename proc @ |
971 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] | |
972 | @deffnx {Scheme Procedure} call-with-output-file filename proc @ | |
973 | [#:encoding=#f] [#:binary=#f] | |
07d83abe MV |
974 | @rnindex call-with-input-file |
975 | @rnindex call-with-output-file | |
976 | Open @var{filename} for input or output, and call @code{(@var{proc} | |
977 | port)} with the resulting port. Return the value returned by | |
978 | @var{proc}. @var{filename} is opened as per @code{open-input-file} or | |
28cc8dac | 979 | @code{open-output-file} respectively, and an error is signaled if it |
07d83abe MV |
980 | cannot be opened. |
981 | ||
982 | When @var{proc} returns, the port is closed. If @var{proc} does not | |
28cc8dac | 983 | return (e.g.@: if it throws an error), then the port might not be |
07d83abe MV |
984 | closed automatically, though it will be garbage collected in the usual |
985 | way if not otherwise referenced. | |
986 | @end deffn | |
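
A short sketch of the two procedures used together follows.  The file
name @file{greeting.txt} is a placeholder, and @code{write-line} and
@code{read-line} are assumed to come from @code{(ice-9 rdelim)}.

@lisp
(use-modules (ice-9 rdelim))

;; The port is closed once the lambda returns.
(call-with-output-file "greeting.txt"
  (lambda (port)
    (write-line "hello, world" port)))

;; read-line takes the port as its argument, so it can serve as PROC.
(define greeting
  (call-with-input-file "greeting.txt" read-line))
@end lisp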
987 | ||
3ace9a8e MW |
988 | @deffn {Scheme Procedure} with-input-from-file filename thunk @ |
989 | [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f] | |
990 | @deffnx {Scheme Procedure} with-output-to-file filename thunk @ | |
991 | [#:encoding=#f] [#:binary=#f] | |
992 | @deffnx {Scheme Procedure} with-error-to-file filename thunk @ | |
993 | [#:encoding=#f] [#:binary=#f] | |
07d83abe MV |
994 | @rnindex with-input-from-file |
995 | @rnindex with-output-to-file | |
996 | Open @var{filename} and call @code{(@var{thunk})} with the new port | |
997 | set up as, respectively, the @code{current-input-port}, | |
998 | @code{current-output-port}, or @code{current-error-port}. Return the | |
999 | value returned by @var{thunk}. @var{filename} is opened as per | |
1000 | @code{open-input-file} or @code{open-output-file} respectively, and an | |
28cc8dac | 1001 | error is signaled if it cannot be opened. |
07d83abe MV |
1002 | |
1003 | When @var{thunk} returns, the port is closed and the previous setting | |
1004 | of the respective current port is restored. | |
1005 | ||
1006 | The current port setting is managed with @code{dynamic-wind}, so the | |
1007 | previous value is restored no matter how @var{thunk} exits (e.g.@: via an | |
1008 | exception), and if @var{thunk} is re-entered (via a captured | |
64de6db5 | 1009 | continuation) then it's set again to the @var{filename} port. |
07d83abe MV |
1010 | |
1011 | The port is closed when @var{thunk} returns normally, but not when | |
1012 | exited via an exception or new continuation. This ensures it's still | |
1013 | ready for use if @var{thunk} is re-entered by a captured continuation. | |
1014 | Of course the port is always garbage collected and closed in the usual | |
1015 | way when no longer referenced anywhere. | |
1016 | @end deffn | |
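
By contrast with the @code{call-with-} procedures, the thunk here takes
no argument and uses the current ports implicitly.  In this sketch
@file{report.txt} is a placeholder name and @code{read-line} is assumed
to come from @code{(ice-9 rdelim)}.

@lisp
(use-modules (ice-9 rdelim))

;; display and newline write to the current output port, which is
;; the file for the duration of the thunk.
(with-output-to-file "report.txt"
  (lambda ()
    (display "written through the current output port")
    (newline)))

;; read-line with no argument reads from the current input port.
(define first-line
  (with-input-from-file "report.txt" read-line))
@end lisp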
1017 | ||
1018 | @deffn {Scheme Procedure} port-mode port | |
1019 | @deffnx {C Function} scm_port_mode (port) | |
1020 | Return the port modes associated with the open port @var{port}. | |
1021 | These will not necessarily be identical to the modes used when | |
1022 | the port was opened, since modes such as ``append'' which are | |
1023 | used only during port creation are not retained. | |
1024 | @end deffn | |
1025 | ||
1026 | @deffn {Scheme Procedure} port-filename port | |
1027 | @deffnx {C Function} scm_port_filename (port) | |
ac012a27 AW |
1028 | Return the filename associated with @var{port}, or @code{#f} if no |
1029 | filename is associated with the port. | |
e55abf41 KR |
1030 | |
1031 | @var{port} must be open; @code{port-filename} cannot be used once the | |
1032 | port is closed. | |
07d83abe MV |
1033 | @end deffn |
1034 | ||
1035 | @deffn {Scheme Procedure} set-port-filename! port filename | |
1036 | @deffnx {C Function} scm_set_port_filename_x (port, filename) | |
1037 | Change the filename associated with @var{port} to @var{filename}. | |
1038 | Note that this does not change the port's | |
1039 | source of data, but only the value that is returned by | |
1040 | @code{port-filename} and reported in diagnostic output. | |
1041 | @end deffn | |
1042 | ||
1043 | @deffn {Scheme Procedure} file-port? obj | |
1044 | @deffnx {C Function} scm_file_port_p (obj) | |
1045 | Determine whether @var{obj} is a port that is related to a file. | |
1046 | @end deffn | |
1047 | ||
1048 | ||
1049 | @node String Ports | |
1050 | @subsubsection String Ports | |
bf5df489 KR |
1051 | @cindex String port |
1052 | @cindex Port, string | |
07d83abe | 1053 | |
ecb87335 | 1054 | The following allow string ports to be opened by analogy to R4RS |
07d83abe MV |
1055 | file port facilities: |
1056 | ||
28cc8dac MG |
1057 | With string ports, the port encoding is treated differently than for other | |
1058 | types of ports. When string ports are created, they do not inherit a | |
1059 | character encoding from the current locale. They are given a | |
1060 | default encoding that allows them to handle all valid string characters. | |
1061 | Typically one should not modify a string port's character encoding | |
1062 | away from its default. | |
1063 | ||
07d83abe MV |
1064 | @deffn {Scheme Procedure} call-with-output-string proc |
1065 | @deffnx {C Function} scm_call_with_output_string (proc) | |
1066 | Calls the one-argument procedure @var{proc} with a newly created output | |
1067 | port. When the function returns, the string composed of the characters | |
1068 | written into the port is returned. @var{proc} should not close the port. | |
1069 | @end deffn | |
1070 | ||
1071 | @deffn {Scheme Procedure} call-with-input-string string proc | |
1072 | @deffnx {C Function} scm_call_with_input_string (string, proc) | |
1073 | Calls the one-argument procedure @var{proc} with a newly | |
1074 | created input port from which @var{string}'s contents may be | |
1075 | read. The value yielded by the @var{proc} is returned. | |
1076 | @end deffn | |
1077 | ||
1078 | @deffn {Scheme Procedure} with-output-to-string thunk | |
1079 | Calls the zero-argument procedure @var{thunk} with the current output | |
1080 | port set temporarily to a new string port. It returns a string | |
1081 | composed of the characters written to the current output. | |
1082 | @end deffn | |
1083 | ||
1084 | @deffn {Scheme Procedure} with-input-from-string string thunk | |
1085 | Calls the zero-argument procedure @var{thunk} with the current input | |
1086 | port set temporarily to a string port opened on the specified | |
1087 | @var{string}. The value yielded by @var{thunk} is returned. | |
1088 | @end deffn | |
1089 | ||
1090 | @deffn {Scheme Procedure} open-input-string str | |
1091 | @deffnx {C Function} scm_open_input_string (str) | |
1092 | Take a string and return an input port that delivers characters | |
1093 | from the string. The port can be closed by | |
1094 | @code{close-input-port}, though its storage will be reclaimed | |
1095 | by the garbage collector if it becomes inaccessible. | |
1096 | @end deffn | |
1097 | ||
1098 | @deffn {Scheme Procedure} open-output-string | |
1099 | @deffnx {C Function} scm_open_output_string () | |
1100 | Return an output port that will accumulate characters for | |
1101 | retrieval by @code{get-output-string}. The port can be closed | |
1102 | by the procedure @code{close-output-port}, though its storage | |
1103 | will be reclaimed by the garbage collector if it becomes | |
1104 | inaccessible. | |
1105 | @end deffn | |
1106 | ||
1107 | @deffn {Scheme Procedure} get-output-string port | |
1108 | @deffnx {C Function} scm_get_output_string (port) | |
1109 | Given an output port created by @code{open-output-string}, | |
1110 | return a string consisting of the characters that have been | |
1111 | output to the port so far. | |
1112 | ||
1113 | @code{get-output-string} must be used before closing @var{port}; once | |
1114 | closed, the string cannot be obtained. | |
1115 | @end deffn | |
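
The string port procedures above can be combined as in the following
sketch.  The variable names are illustrative only.

@lisp
;; Accumulate output in a string port, retrieving it before closing.
(define out (open-output-string))
(write 'abc out)
(display " " out)
(write (+ 1 2) out)
(define text (get-output-string out))
(close-port out)

;; Read two datums back out of a string.
(define data
  (call-with-input-string "42 (a b)"
    (lambda (port)
      (let* ((a (read port))
             (b (read port)))
        (list a b)))))

;; Capture whatever a thunk writes to the current output port.
(define shout
  (with-output-to-string
    (lambda () (display "HELLO"))))
@end lisp

Here @code{text} should be @code{"abc 3"}, @code{data} should be the
list @code{(42 (a b))}, and @code{shout} should be @code{"HELLO"}.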
1116 | ||
1117 | A string port can be used in many procedures which accept a port | |
1118 | but which are not dependent on implementation details of fports. | |
1119 | E.g., seeking and truncating will work on a string port, | |
1120 | but trying to extract the file descriptor number will fail. | |
1121 | ||
1122 | ||
1123 | @node Soft Ports | |
1124 | @subsubsection Soft Ports | |
bf5df489 KR |
1125 | @cindex Soft port |
1126 | @cindex Port, soft | |
07d83abe MV |
1127 | |
1128 | A @dfn{soft-port} is a port based on a vector of procedures capable of | |
1129 | accepting or delivering characters. It allows emulation of I/O ports. | |
1130 | ||
1131 | @deffn {Scheme Procedure} make-soft-port pv modes | |
1132 | @deffnx {C Function} scm_make_soft_port (pv, modes) | |
1133 | Return a port capable of receiving or delivering characters as | |
1134 | specified by the @var{modes} string (@pxref{File Ports, | |
1135 | open-file}). @var{pv} must be a vector of length 5 or 6. Its | |
1136 | components are as follows: | |
1137 | ||
1138 | @enumerate 0 | |
1139 | @item | |
1140 | procedure accepting one character for output | |
1141 | @item | |
1142 | procedure accepting a string for output | |
1143 | @item | |
1144 | thunk for flushing output | |
1145 | @item | |
1146 | thunk for getting one character | |
1147 | @item | |
1148 | thunk for closing port (not by garbage collection) | |
1149 | @item | |
1150 | (if present and not @code{#f}) thunk for computing the number of | |
1151 | characters that can be read from the port without blocking. | |
1152 | @end enumerate | |
1153 | ||
1154 | For an output-only port only elements 0, 1, 2, and 4 need be | |
1155 | procedures. For an input-only port only elements 3 and 4 need | |
1156 | be procedures. Thunks 2 and 4 can instead be @code{#f} if | |
1157 | there is no useful operation for them to perform. | |
1158 | ||
1159 | If thunk 3 returns @code{#f} or an @code{eof-object} | |
1160 | (@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on | |
1161 | Scheme}) it indicates that the port has reached end-of-file. | |
1162 | For example: | |
1163 | ||
1164 | @lisp | |
1165 | (define stdout (current-output-port)) | |
1166 | (define p (make-soft-port | |
1167 | (vector | |
1168 | (lambda (c) (write c stdout)) | |
1169 | (lambda (s) (display s stdout)) | |
1170 | (lambda () (display "." stdout)) | |
1171 | (lambda () (char-upcase (read-char))) | |
1172 | (lambda () (display "@@" stdout))) | |
1173 | "rw")) | |
1174 | ||
1175 | (write p p) @result{} #<input-output: soft 8081e20> | |
1176 | @end lisp | |
1177 | @end deffn | |
1178 | ||
1179 | ||
1180 | @node Void Ports | |
1181 | @subsubsection Void Ports | |
bf5df489 KR |
1182 | @cindex Void port |
1183 | @cindex Port, void | |
07d83abe MV |
1184 | |
1185 | This kind of port causes any data to be discarded when written to, and | |
1186 | always returns the end-of-file object when read from. | |
1187 | ||
1188 | @deffn {Scheme Procedure} %make-void-port mode | |
1189 | @deffnx {C Function} scm_sys_make_void_port (mode) | |
1190 | Create and return a new void port. A void port acts like | |
1191 | @file{/dev/null}. The @var{mode} argument | |
1192 | specifies the input/output modes for this port: see the | |
1193 | documentation for @code{open-file} in @ref{File Ports}. | |
1194 | @end deffn | |
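
A brief sketch of both directions on one void port follows.  The name
@code{sink} is illustrative.

@lisp
;; "rw" requests a port usable for both input and output.
(define sink (%make-void-port "rw"))

;; Output is accepted and thrown away.
(display "discarded" sink)

;; Input yields the end-of-file object straight away.
(define at-eof? (eof-object? (read-char sink)))
@end lisp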
1195 | ||
1196 | ||
b242715b LC |
1197 | @node R6RS I/O Ports |
1198 | @subsection R6RS I/O Ports | |
1199 | ||
1200 | @cindex R6RS | |
1201 | @cindex R6RS ports | |
1202 | ||
1203 | The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on | |
1204 | the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs | |
1205 | io ports)} module. It provides features, such as binary I/O and Unicode | |
1206 | string I/O, that complement or refine Guile's historical port API | |
040dfa6f AR |
1207 | presented above (@pxref{Input and Output}). Note that R6RS ports are not |
1208 | disjoint from Guile's native ports, so Guile-specific procedures will | |
1209 | work on ports created using the R6RS API, and vice versa. | |
1210 | ||
1211 | The text in this section is taken from the R6RS standard libraries | |
1212 | document, with only minor adaptations for inclusion in this manual. The | |
1213 | Guile developers offer their thanks to the R6RS editors for having | |
1214 | provided the report's text under permissive conditions making this | |
1215 | possible. | |
b242715b LC |
1216 | |
1217 | @c FIXME: Update description when implemented. | |
958173e4 | 1218 | @emph{Note}: The implementation of this R6RS API is not complete yet. |
b242715b LC |
1219 | |
1220 | @menu | |
040dfa6f AR |
1221 | * R6RS File Names:: File names. |
1222 | * R6RS File Options:: Options for opening files. | |
1223 | * R6RS Buffer Modes:: Influencing buffering behavior. | |
1224 | * R6RS Transcoders:: Influencing port encoding. | |
b242715b LC |
1225 | * R6RS End-of-File:: The end-of-file object. |
1226 | * R6RS Port Manipulation:: Manipulating R6RS ports. | |
040dfa6f | 1227 | * R6RS Input Ports:: Input Ports. |
b242715b | 1228 | * R6RS Binary Input:: Binary input. |
040dfa6f AR |
1229 | * R6RS Textual Input:: Textual input. |
1230 | * R6RS Output Ports:: Output Ports. | |
b242715b | 1231 | * R6RS Binary Output:: Binary output. |
040dfa6f | 1232 | * R6RS Textual Output:: Textual output. |
b242715b LC |
1233 | @end menu |
1234 | ||
7f6c3f8f MW |
1235 | A subset of the @code{(rnrs io ports)} module, plus one non-standard |
1236 | procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is | |
1237 | provided by the @code{(ice-9 binary-ports)} module. It contains binary | |
1238 | input/output procedures and does not rely on R6RS support. | |
de424d95 | 1239 | |
040dfa6f AR |
1240 | @node R6RS File Names |
1241 | @subsubsection File Names | |
1242 | ||
1243 | Some of the procedures described in this chapter accept a file name as an | |
1244 | argument. Valid values for such a file name include strings that name a file | |
b3da54d1 | 1245 | using the native notation of file system paths on an implementation's |
040dfa6f AR |
1246 | underlying operating system, and may include implementation-dependent |
1247 | values as well. | |
1248 | ||
1249 | A @var{filename} parameter name means that the | |
1250 | corresponding argument must be a file name. | |
1251 | ||
1252 | @node R6RS File Options | |
1253 | @subsubsection File Options | |
1254 | @cindex file options | |
1255 | ||
1256 | When opening a file, the various procedures in this library accept a | |
1257 | @code{file-options} object that encapsulates flags to specify how the | |
1258 | file is to be opened. A @code{file-options} object is an enum-set | |
1259 | (@pxref{rnrs enums}) over the symbols constituting valid file options. | |
1260 | ||
1261 | A @var{file-options} parameter name means that the corresponding | |
1262 | argument must be a file-options object. | |
1263 | ||
1264 | @deffn {Scheme Syntax} file-options @var{file-options-symbol} ... | |
1265 | ||
1266 | Each @var{file-options-symbol} must be a symbol. | |
1267 | ||
1268 | The @code{file-options} syntax returns a file-options object that | |
1269 | encapsulates the specified options. | |
1270 | ||
1271 | When supplied to an operation that opens a file for output, the | |
1272 | file-options object returned by @code{(file-options)} specifies that the | |
1273 | file is created if it does not exist and an exception with condition | |
1274 | type @code{&i/o-file-already-exists} is raised if it does exist. The | |
1275 | following standard options can be included to modify the default | |
1276 | behavior. | |
1277 | ||
1278 | @table @code | |
1279 | @item no-create | |
1280 | If the file does not already exist, it is not created; | |
1281 | instead, an exception with condition type @code{&i/o-file-does-not-exist} | |
1282 | is raised. | |
1283 | If the file already exists, the exception with condition type | |
1284 | @code{&i/o-file-already-exists} is not raised | |
1285 | and the file is truncated to zero length. | |
1286 | @item no-fail | |
1287 | If the file already exists, the exception with condition type | |
1288 | @code{&i/o-file-already-exists} is not raised, | |
1289 | even if @code{no-create} is not included, | |
1290 | and the file is truncated to zero length. | |
1291 | @item no-truncate | |
1292 | If the file already exists and the exception with condition type | |
1293 | @code{&i/o-file-already-exists} has been inhibited by inclusion of | |
1294 | @code{no-create} or @code{no-fail}, the file is not truncated, but | |
1295 | the port's current position is still set to the beginning of the | |
1296 | file. | |
1297 | @end table | |
1298 | ||
1299 | These options have no effect when a file is opened only for input. | |
1300 | Symbols other than those listed above may be used as | |
1301 | @var{file-options-symbol}s; they have implementation-specific meaning, | |
1302 | if any. | |
1303 | ||
1304 | @quotation Note | |
1305 | Only the name of @var{file-options-symbol} is significant. | |
1306 | @end quotation | |
1307 | @end deffn | |
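
As a minimal sketch of the syntax above, a file-options object can be
built and inspected like any other enum-set.  The variable name
@code{opts} is only illustrative, and @code{enum-set-member?} is assumed
to come from the @code{(rnrs enums)} module.

@lisp
(use-modules (rnrs io ports) (rnrs enums))

;; Combine two of the standard options into one file-options object.
(define opts (file-options no-fail no-truncate))

(enum-set-member? 'no-fail opts)
@result{} #t

(enum-set-member? 'no-create opts)
@result{} #f
@end lisp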
1308 | ||
1309 | @node R6RS Buffer Modes | |
1310 | @subsubsection Buffer Modes | |
1311 | ||
1312 | Each port has an associated buffer mode. For an output port, the | |
1313 | buffer mode defines when an output operation flushes the buffer | |
1314 | associated with the output port. For an input port, the buffer mode | |
1315 | defines how much data will be read to satisfy read operations. The | |
1316 | possible buffer modes are the symbols @code{none} for no buffering, | |
1317 | @code{line} for flushing upon line endings and reading up to line | |
1318 | endings, or other implementation-dependent behavior, | |
1319 | and @code{block} for arbitrary buffering. This section uses | |
1320 | the parameter name @var{buffer-mode} for arguments that must be | |
1321 | buffer-mode symbols. | |
1322 | ||
1323 | If two ports are connected to the same mutable source, both ports | |
1324 | are unbuffered, and reading a byte or character from that shared | |
1325 | source via one of the two ports would change the bytes or characters | |
1326 | seen via the other port, a lookahead operation on one port will | |
1327 | render the peeked byte or character inaccessible via the other port, | |
1328 | while a subsequent read operation on the peeked port will see the | |
1329 | peeked byte or character even though the port is otherwise unbuffered. | |
1330 | ||
1331 | In other words, the semantics of buffering is defined in terms of side | |
1332 | effects on shared mutable sources, and a lookahead operation has the | |
1333 | same side effect on the shared source as a read operation. | |
1334 | ||
1335 | @deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol} | |
1336 | ||
1337 | @var{buffer-mode-symbol} must be a symbol whose name is one of | |
1338 | @code{none}, @code{line}, and @code{block}. The result is the | |
1339 | corresponding symbol, and specifies the associated buffer mode. | |
1340 | ||
1341 | @quotation Note | |
1342 | Only the name of @var{buffer-mode-symbol} is significant. | |
1343 | @end quotation | |
1344 | @end deffn | |
1345 | ||
1346 | @deffn {Scheme Procedure} buffer-mode? obj | |
1347 | Returns @code{#t} if the argument is a valid buffer-mode symbol, and | |
1348 | returns @code{#f} otherwise. | |
1349 | @end deffn | |
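
The following short sketch illustrates both forms.  The symbol
@code{huge} is a made-up name used only to show a rejected value.

@lisp
(use-modules (rnrs io ports))

;; The syntax evaluates to the buffer-mode symbol itself.
(buffer-mode line)
@result{} line

;; The predicate accepts only the three standard names.
(buffer-mode? 'block)
@result{} #t

(buffer-mode? 'huge)
@result{} #f
@end lisp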
1350 | ||
1351 | @node R6RS Transcoders | |
1352 | @subsubsection Transcoders | |
1353 | @cindex codec | |
1354 | @cindex end-of-line style | |
1355 | @cindex transcoder | |
1356 | @cindex binary port | |
1357 | @cindex textual port | |
1358 | ||
1359 | Several different Unicode encoding schemes describe standard ways to | |
1360 | encode characters and strings as byte sequences and to decode those | |
1361 | sequences. Within this document, a @dfn{codec} is an immutable Scheme | |
1362 | object that represents a Unicode or similar encoding scheme. | |
1363 | ||
1364 | An @dfn{end-of-line style} is a symbol that, if it is not @code{none}, | |
1365 | describes how a textual port transcodes representations of line endings. | |
1366 | ||
1367 | A @dfn{transcoder} is an immutable Scheme object that combines a codec | |
1368 | with an end-of-line style and a method for handling decoding errors. | |
1369 | Each transcoder represents some specific bidirectional (but not | |
1370 | necessarily lossless), possibly stateful translation between byte | |
1371 | sequences and Unicode characters and strings. Every transcoder can | |
1372 | operate in the input direction (bytes to characters) or in the output | |
1373 | direction (characters to bytes). A @var{transcoder} parameter name | |
1374 | means that the corresponding argument must be a transcoder. | |
1375 | ||
1376 | A @dfn{binary port} is a port that supports binary I/O, does not have an | |
1377 | associated transcoder and does not support textual I/O. A @dfn{textual | |
1378 | port} is a port that supports textual I/O, and does not support binary | |
1379 | I/O. A textual port may or may not have an associated transcoder. | |
1380 | ||
1381 | @deffn {Scheme Procedure} latin-1-codec | |
1382 | @deffnx {Scheme Procedure} utf-8-codec | |
1383 | @deffnx {Scheme Procedure} utf-16-codec | |
1384 | ||
1385 | These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16 | |
1386 | encoding schemes. | |
1387 | ||
1388 | A call to any of these procedures returns a value that is equal in the | |
1389 | sense of @code{eqv?} to the result of any other call to the same | |
1390 | procedure. | |
1391 | @end deffn | |
1392 | ||
1393 | @deffn {Scheme Syntax} eol-style @var{eol-style-symbol} | |
1394 | ||
1395 | @var{eol-style-symbol} should be a symbol whose name is one of | |
1396 | @code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls}, | |
1397 | and @code{none}. | |
1398 | ||
1399 | The form evaluates to the corresponding symbol. If the name of | |
1400 | @var{eol-style-symbol} is not one of these symbols, the effect and | |
1401 | result are implementation-dependent; in particular, the result may be an | |
1402 | eol-style symbol acceptable as an @var{eol-style} argument to | |
1403 | @code{make-transcoder}. If it is not acceptable, an exception is raised. | |
1404 | ||
1405 | All eol-style symbols except @code{none} describe a specific | |
1406 | line-ending encoding: | |
1407 | ||
1408 | @table @code | |
1409 | @item lf | |
1410 | linefeed | |
1411 | @item cr | |
1412 | carriage return | |
1413 | @item crlf | |
1414 | carriage return, linefeed | |
1415 | @item nel | |
1416 | next line | |
1417 | @item crnel | |
1418 | carriage return, next line | |
1419 | @item ls | |
1420 | line separator | |
1421 | @end table | |
1422 | ||
1423 | For a textual port with a transcoder, and whose transcoder has an | |
1424 | eol-style symbol @code{none}, no conversion occurs. For a textual input | |
1425 | port, any eol-style symbol other than @code{none} means that all of the | |
1426 | above line-ending encodings are recognized and are translated into a | |
1427 | single linefeed. For a textual output port, @code{none} and @code{lf} | |
1428 | are equivalent. Linefeed characters are encoded according to the | |
1429 | specified eol-style symbol, and all other characters that participate in | |
1430 | possible line endings are encoded as is. | |
1431 | ||
1432 | @quotation Note | |
1433 | Only the name of @var{eol-style-symbol} is significant. | |
1434 | @end quotation | |
1435 | @end deffn | |
1436 | ||
1437 | @deffn {Scheme Procedure} native-eol-style | |
1438 | Returns the default end-of-line style of the underlying platform, e.g., | |
1439 | @code{lf} on Unix and @code{crlf} on Windows. | |
1440 | @end deffn | |
1441 | ||
1442 | @deffn {Condition Type} &i/o-decoding | |
1443 | @deffnx {Scheme Procedure} make-i/o-decoding-error port | |
1444 | @deffnx {Scheme Procedure} i/o-decoding-error? obj | |
1445 | ||
1446 | This condition type could be defined by | |
1447 | ||
1448 | @lisp | |
1449 | (define-condition-type &i/o-decoding &i/o-port | |
1450 | make-i/o-decoding-error i/o-decoding-error?) | |
1451 | @end lisp | |
1452 | ||
1453 | An exception with this type is raised when one of the operations for | |
1454 | textual input from a port encounters a sequence of bytes that cannot be | |
1455 | translated into a character or string by the input direction of the | |
1456 | port's transcoder. | |
1457 | ||
1458 | When such an exception is raised, the port's position is past the | |
1459 | invalid encoding. | |
1460 | @end deffn | |
1461 | ||
1462 | @deffn {Condition Type} &i/o-encoding | |
1463 | @deffnx {Scheme Procedure} make-i/o-encoding-error port char | |
1464 | @deffnx {Scheme Procedure} i/o-encoding-error? obj | |
1465 | @deffnx {Scheme Procedure} i/o-encoding-error-char condition | |
1466 | ||
1467 | This condition type could be defined by | |
1468 | ||
1469 | @lisp | |
1470 | (define-condition-type &i/o-encoding &i/o-port | |
1471 | make-i/o-encoding-error i/o-encoding-error? | |
1472 | (char i/o-encoding-error-char)) | |
1473 | @end lisp | |
1474 | ||
1475 | An exception with this type is raised when one of the operations for | |
1476 | textual output to a port encounters a character that cannot be | |
1477 | translated into bytes by the output direction of the port's transcoder. | |
64de6db5 | 1478 | @var{char} is the character that could not be encoded. |
040dfa6f AR |
1479 | @end deffn |
1480 | ||
1481 | @deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol} | |
1482 | ||
1483 | @var{error-handling-mode-symbol} should be a symbol whose name is one of | |
1484 | @code{ignore}, @code{raise}, and @code{replace}. The form evaluates to | |
1485 | the corresponding symbol. If @var{error-handling-mode-symbol} is not | |
1486 | one of these identifiers, effect and result are | |
1487 | implementation-dependent: The result may be an error-handling-mode | |
1488 | symbol acceptable as a @var{handling-mode} argument to | |
1489 | @code{make-transcoder}. If it is not acceptable as a | |
1490 | @var{handling-mode} argument to @code{make-transcoder}, an exception is | |
1491 | raised. | |
1492 | ||
1493 | @quotation Note | |
64de6db5 | 1494 | Only the name of @var{error-handling-mode-symbol} is significant. |
040dfa6f AR |
1495 | @end quotation |
1496 | ||
1497 | The error-handling mode of a transcoder specifies the behavior | |
1498 | of textual I/O operations in the presence of encoding or decoding | |
1499 | errors. | |
1500 | ||
1501 | If a textual input operation encounters an invalid or incomplete | |
1502 | character encoding, and the error-handling mode is @code{ignore}, an | |
1503 | appropriate number of bytes of the invalid encoding are ignored and | |
1504 | decoding continues with the following bytes. | |
1505 | ||
1506 | If the error-handling mode is @code{replace}, the replacement | |
1507 | character U+FFFD is injected into the data stream, an appropriate | |
1508 | number of bytes are ignored, and decoding | |
1509 | continues with the following bytes. | |
1510 | ||
1511 | If the error-handling mode is @code{raise}, an exception with condition | |
1512 | type @code{&i/o-decoding} is raised. | |
1513 | ||
1514 | If a textual output operation encounters a character it cannot encode, | |
1515 | and the error-handling mode is @code{ignore}, the character is ignored | |
1516 | and encoding continues with the next character. If the error-handling | |
1517 | mode is @code{replace}, a codec-specific replacement character is | |
1518 | emitted by the transcoder, and encoding continues with the next | |
1519 | character. The replacement character is U+FFFD for transcoders whose | |
1520 | codec is one of the Unicode encodings, but is the @code{?} character | |
1521 | for the Latin-1 encoding. If the error-handling mode is @code{raise}, | |
1522 | an exception with condition type @code{&i/o-encoding} is raised. | |
1523 | @end deffn | |
1524 | ||
1525 | @deffn {Scheme Procedure} make-transcoder codec | |
1526 | @deffnx {Scheme Procedure} make-transcoder codec eol-style | |
1527 | @deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode | |
1528 | ||
1529 | @var{codec} must be a codec; @var{eol-style}, if present, an eol-style | |
1530 | symbol; and @var{handling-mode}, if present, an error-handling-mode | |
1531 | symbol. | |
1532 | ||
1533 | @var{eol-style} may be omitted, in which case it defaults to the native | |
64de6db5 | 1534 | end-of-line style of the underlying platform. @var{handling-mode} may |
040dfa6f AR |
1535 | be omitted, in which case it defaults to @code{replace}. The result is |
1536 | a transcoder with the behavior specified by its arguments. | |
1537 | @end deffn | |
1538 | ||
1539 | @deffn {Scheme Procedure} native-transcoder | |
1540 | Returns an implementation-dependent transcoder that represents a | |
1541 | possibly locale-dependent ``native'' transcoding. | |
1542 | @end deffn | |
1543 | ||
1544 | @deffn {Scheme Procedure} transcoder-codec transcoder | |
1545 | @deffnx {Scheme Procedure} transcoder-eol-style transcoder | |
1546 | @deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder | |
1547 | ||
1548 | These are accessors for transcoder objects; when applied to a | |
1549 | transcoder returned by @code{make-transcoder}, they return the | |
1550 | @var{codec}, @var{eol-style}, and @var{handling-mode} arguments, | |
1551 | respectively. | |
1552 | @end deffn | |
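
As an illustration, the sketch below builds a transcoder with all three
arguments supplied and reads them back through the accessors.  The
variable name @code{t} is arbitrary.

@lisp
(use-modules (rnrs io ports))

;; UTF-8 codec, linefeed line endings, raise on coding errors.
(define t
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(transcoder-eol-style t)
@result{} lf

(transcoder-error-handling-mode t)
@result{} raise

;; Codecs returned by the same procedure compare equal with eqv?.
(eqv? (transcoder-codec t) (utf-8-codec))
@result{} #t
@end lisp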
1553 | ||
1554 | @deffn {Scheme Procedure} bytevector->string bytevector transcoder | |
1555 | ||
1556 | Returns the string that results from transcoding the | |
1557 | @var{bytevector} according to the input direction of the transcoder. | |
1558 | @end deffn | |
1559 | ||
1560 | @deffn {Scheme Procedure} string->bytevector string transcoder | |
1561 | ||
1562 | Returns the bytevector that results from transcoding the | |
1563 | @var{string} according to the output direction of the transcoder. | |
1564 | @end deffn | |
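
The two conversion procedures are inverses for text that the codec can
represent.  The sketch below assumes a UTF-8 transcoder and a plain
ASCII string; the names @code{utf8} and @code{bv} are illustrative only.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define utf8 (make-transcoder (utf-8-codec)))

;; Encode a string to bytes, then decode the bytes again.
(define bv (string->bytevector "hello" utf8))

(bytevector->string bv utf8)
@result{} "hello"
@end lisp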
1565 | ||
b242715b LC |
1566 | @node R6RS End-of-File |
1567 | @subsubsection The End-of-File Object | |
1568 | ||
1569 | @cindex EOF | |
1570 | @cindex end-of-file | |
1571 | ||
1572 | R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io | |
1573 | ports)} module: | |
1574 | ||
1575 | @deffn {Scheme Procedure} eof-object? obj | |
1576 | @deffnx {C Function} scm_eof_object_p (obj) | |
1577 | Return true if @var{obj} is the end-of-file (EOF) object. | |
1578 | @end deffn | |
1579 | ||
1580 | In addition, the following procedure is provided: | |
1581 | ||
1582 | @deffn {Scheme Procedure} eof-object | |
1583 | @deffnx {C Function} scm_eof_object () | |
1584 | Return the end-of-file (EOF) object. | |
1585 | ||
1586 | @lisp | |
1587 | (eof-object? (eof-object)) | |
1588 | @result{} #t | |
1589 | @end lisp | |
1590 | @end deffn | |
1591 | ||
1592 | ||
1593 | @node R6RS Port Manipulation | |
1594 | @subsubsection Port Manipulation | |
1595 | ||
1596 | The procedures listed below operate on any kind of R6RS I/O port. | |
1597 | ||
040dfa6f AR |
1598 | @deffn {Scheme Procedure} port? obj |
1599 | Returns @code{#t} if the argument is a port, and returns @code{#f} | |
1600 | otherwise. | |
1601 | @end deffn | |
1602 | ||
1603 | @deffn {Scheme Procedure} port-transcoder port | |
1604 | Returns the transcoder associated with @var{port} if @var{port} is | |
1605 | textual and has an associated transcoder, and returns @code{#f} if | |
1606 | @var{port} is binary or does not have an associated transcoder. | |
1607 | @end deffn | |
1608 | ||
1609 | @deffn {Scheme Procedure} binary-port? port | |
1610 | Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for | |
1611 | binary data input/output. | |
1612 | ||
1613 | Note that internally Guile does not differentiate between binary and | |
1614 | textual ports, unlike the R6RS. Thus, this procedure returns true when | |
1615 | @var{port} does not have an associated encoding---i.e., when | |
1616 | @code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports, | |
1617 | port-encoding}). This is the case for ports returned by R6RS procedures | |
1618 | such as @code{open-bytevector-input-port} and | |
1619 | @code{make-custom-binary-output-port}. | |
1620 | ||
1621 | However, Guile currently does not prevent use of textual I/O procedures | |
1622 | such as @code{display} or @code{read-char} with binary ports. Doing so | |
1623 | ``upgrades'' the port from binary to textual, under the ISO-8859-1 | |
1624 | encoding. Likewise, Guile does not prevent use of | |
1625 | @code{set-port-encoding!} on a binary port, which also turns it into a | |
1626 | ``textual'' port. | |
1627 | @end deffn | |
1628 | ||
1629 | @deffn {Scheme Procedure} textual-port? port | |
64de6db5 | 1630 | Always return @code{#t}, as all ports can be used for textual I/O in |
040dfa6f AR |
1631 | Guile. |
1632 | @end deffn | |
1633 | ||
64de6db5 | 1634 | @deffn {Scheme Procedure} transcoded-port binary-port transcoder |
040dfa6f AR |
1635 | The @code{transcoded-port} procedure |
1636 | returns a new textual port with the specified @var{transcoder}. | |
1637 | Otherwise the new textual port's state is largely the same as | |
1638 | that of @var{binary-port}. | |
1639 | If @var{binary-port} is an input port, the new textual | |
1640 | port will be an input port and | |
1641 | will transcode the bytes that have not yet been read from | |
1642 | @var{binary-port}. | |
1643 | If @var{binary-port} is an output port, the new textual | |
1644 | port will be an output port and | |
1645 | will transcode output characters into bytes that are | |
1646 | written to the byte sink represented by @var{binary-port}. | |
1647 | ||
1648 | As a side effect, however, @code{transcoded-port} | |
1649 | closes @var{binary-port} in | |
1650 | a special way that allows the new textual port to continue to | |
1651 | use the byte source or sink represented by @var{binary-port}, | |
1652 | even though @var{binary-port} itself is closed and cannot | |
1653 | be used by the input and output operations described in this | |
1654 | chapter. | |
1655 | @end deffn | |
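
A minimal sketch of this conversion, assuming a bytevector-backed binary
port as the source (the names @code{binary} and @code{textual} are
illustrative):

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; A binary input port over the UTF-8 encoding of "hi".
(define binary (open-bytevector-input-port (string->utf8 "hi")))

;; Wrap it in a textual port that decodes UTF-8.
(define textual
  (transcoded-port binary (make-transcoder (utf-8-codec))))

(get-string-all textual)
@result{} "hi"
@end lisp

After the call to @code{transcoded-port}, only @code{textual} should be
used for further I/O.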
1656 | ||
b242715b LC |
1657 | @deffn {Scheme Procedure} port-position port |
1658 | If @var{port} supports it (see below), return the offset (an integer) | |
1659 | indicating where the next octet will be read from/written to in | |
1660 | @var{port}. If @var{port} does not support this operation, an error | |
1661 | condition is raised. | |
1662 | ||
1663 | This is similar to Guile's @code{seek} procedure with the | |
1664 | @code{SEEK_CUR} argument (@pxref{Random Access}). | |
1665 | @end deffn | |
1666 | ||
1667 | @deffn {Scheme Procedure} port-has-port-position? port | |
1668 | Return @code{#t} if @var{port} supports @code{port-position}. | |
1669 | @end deffn | |
1670 | ||
1671 | @deffn {Scheme Procedure} set-port-position! port offset | |
1672 | If @var{port} supports it (see below), set the position where the next | |
1673 | octet will be read from/written to @var{port} to @var{offset} (an | |
1674 | integer). If @var{port} does not support this operation, an error | |
1675 | condition is raised. | |
1676 | ||
1677 | This is similar to Guile's @code{seek} procedure with the | |
1678 | @code{SEEK_SET} argument (@pxref{Random Access}). | |
1679 | @end deffn | |
1680 | ||
1681 | @deffn {Scheme Procedure} port-has-set-port-position!? port | |
1682 | Return @code{#t} if @var{port} supports @code{set-port-position!}. | |
1683 | @end deffn | |
1684 | ||
1685 | @deffn {Scheme Procedure} call-with-port port proc | |
1686 | Call @var{proc}, passing it @var{port} and closing @var{port} upon exit | |
1687 | of @var{proc}. Return the return values of @var{proc}. | |
1688 | @end deffn | |
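
The positioning procedures and @code{call-with-port} can be combined as
in the sketch below, which assumes a bytevector input port (a port type
that supports both @code{port-position} and @code{set-port-position!}).

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(call-with-port
    (open-bytevector-input-port (u8-list->bytevector '(10 20 30 40)))
  (lambda (port)
    ;; Skip the first two octets, then read the third.
    (set-port-position! port 2)
    (let ((octet (get-u8 port)))
      ;; The position now points just past the octet read.
      (list octet (port-position port)))))
@result{} (30 3)
@end lisp

The port is closed when the procedure passed to @code{call-with-port}
returns.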
1689 | ||
040dfa6f AR |
1690 | @node R6RS Input Ports |
1691 | @subsubsection Input Ports | |
96128014 | 1692 | |
64de6db5 | 1693 | @deffn {Scheme Procedure} input-port? obj |
040dfa6f AR |
1694 | Returns @code{#t} if the argument is an input port (or a combined input |
1695 | and output port), and returns @code{#f} otherwise. | |
1696 | @end deffn | |
96128014 | 1697 | |
64de6db5 | 1698 | @deffn {Scheme Procedure} port-eof? input-port |
040dfa6f AR |
1699 | Returns @code{#t} |
1700 | if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port) | |
1701 | or the @code{lookahead-char} procedure (if @var{input-port} is a textual port) | |
1702 | would return | |
1703 | the end-of-file object, and @code{#f} otherwise. | |
1704 | The operation may block indefinitely if no data is available | |
1705 | but the port cannot be determined to be at end of file. | |
96128014 LC |
1706 | @end deffn |
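
For instance, using bytevector input ports as a convenient source (the
names @code{empty} and @code{full} are illustrative):

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define empty (open-bytevector-input-port (make-bytevector 0)))
(define full  (open-bytevector-input-port (u8-list->bytevector '(7))))

(port-eof? empty)
@result{} #t

;; port-eof? only looks ahead; the octet is still available.
(port-eof? full)
@result{} #f

(get-u8 full)
@result{} 7
@end lisp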
1707 | ||
040dfa6f AR |
1708 | @deffn {Scheme Procedure} open-file-input-port filename |
1709 | @deffnx {Scheme Procedure} open-file-input-port filename file-options | |
1710 | @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode | |
1711 | @deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder | |
64de6db5 | 1712 | @var{maybe-transcoder} must be either a transcoder or @code{#f}. |
040dfa6f AR |
1713 | |
1714 | The @code{open-file-input-port} procedure returns an | |
1715 | input port for the named file. The @var{file-options}, | |
1716 | @var{buffer-mode}, and @var{maybe-transcoder} arguments are optional. | |
1717 | ||
1718 | The @var{file-options} argument, which may determine | |
1719 | various aspects of the returned port (@pxref{R6RS File Options}), | |
1720 | defaults to the value of @code{(file-options)}. | |
1721 | ||
1722 | The @var{buffer-mode} argument, if supplied, | |
1723 | must be one of the symbols that name a buffer mode. | |
1724 | The @var{buffer-mode} argument defaults to @code{block}. | |
1725 | ||
1726 | If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated | |
1727 | with the returned port. | |
1728 | ||
1729 | If @var{maybe-transcoder} is @code{#f} or absent, | |
1730 | the port will be a binary port and will support the | |
1731 | @code{port-position} and @code{set-port-position!} operations. | |
1732 | Otherwise the port will be a textual port, and whether it supports | |
1733 | the @code{port-position} and @code{set-port-position!} operations | |
1734 | is implementation-dependent (and possibly transcoder-dependent). | |
96128014 LC |
1735 | @end deffn |
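
The sketch below opens a file as a binary port with the default
arguments.  The file name @code{example.bin} is hypothetical; the file
is created first with Guile's core @code{call-with-output-file} only so
that the example is self-contained.

@lisp
(use-modules (rnrs io ports))

;; Hypothetical file name, used only for this sketch.
(define file-name "example.bin")

;; Create a small file to read back.
(call-with-output-file file-name
  (lambda (port) (display "abc" port)))

;; No transcoder is given, so the result is a binary port.
(define port (open-file-input-port file-name))

;; Read the whole file as a bytevector, then clean up.
(define contents (get-bytevector-all port))
(close-port port)
(delete-file file-name)
@end lisp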
1736 | ||
040dfa6f AR |
1737 | @deffn {Scheme Procedure} standard-input-port |
1738 | Returns a fresh binary input port connected to standard input. Whether | |
1739 | the port supports the @code{port-position} and @code{set-port-position!} | |
1740 | operations is implementation-dependent. | |
1741 | @end deffn | |
1742 | ||
1743 | @deffn {Scheme Procedure} current-input-port | |
1744 | This returns a default textual port for input. Normally, this default | |
1745 | port is associated with standard input, but can be dynamically | |
1746 | re-assigned using the @code{with-input-from-file} procedure from the | |
1747 | @code{(rnrs io simple (6))} library (@pxref{rnrs io simple}). The port may or | |
1748 | may not have an associated transcoder; if it does, the transcoder is | |
1749 | implementation-dependent. | |
1750 | @end deffn | |
b242715b LC |
1751 | |
1752 | @node R6RS Binary Input | |
1753 | @subsubsection Binary Input | |
1754 | ||
1755 | @cindex binary input | |
1756 | ||
1757 | R6RS binary input ports can be created with the procedures described | |
1758 | below. | |
1759 | ||
1760 | @deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder] | |
1761 | @deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder) | |
1762 | Return an input port whose contents are drawn from bytevector @var{bv} | |
1763 | (@pxref{Bytevectors}). | |
1764 | ||
1765 | @c FIXME: Update description when implemented. | |
1766 | The @var{transcoder} argument is currently not supported. | |
1767 | @end deffn | |
1768 | ||
1769 | @cindex custom binary input ports | |
1770 | ||
1771 | @deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close | |
1772 | @deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close) | |
1773 | Return a new custom binary input port@footnote{This is similar in spirit | |
1774 | to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a | |
1775 | string) whose input is drained by invoking @var{read!} and passing it a | |
1776 | bytevector, an index where bytes should be written, and the number of | |
1777 | bytes to read. The @code{read!} procedure must return an integer | |
1778 | indicating the number of bytes read, or @code{0} to indicate the | |
1779 | end-of-file. | |
1780 | ||
1781 | Optionally, if @var{get-position} is not @code{#f}, it must be a thunk | |
64de6db5 | 1782 | that will be called when @code{port-position} is invoked on the custom |
b242715b LC |
1783 | binary port and should return an integer indicating the position within |
1784 | the underlying data stream; if @var{get-position} is @code{#f}, the | |
64de6db5 | 1785 | returned port does not support @code{port-position}. |
b242715b LC |
1786 | |
1787 | Likewise, if @var{set-position!} is not @code{#f}, it should be a | |
64de6db5 | 1788 | one-argument procedure. When @code{set-port-position!} is invoked on the |
b242715b LC |
1789 | custom binary input port, @var{set-position!} is passed an integer |
1790 | indicating the position of the next byte to read. | |
1791 | ||
1792 | Finally, if @var{close} is not @code{#f}, it must be a thunk. It is | |
1793 | invoked when the custom binary input port is closed. | |
1794 | ||
1795 | Using a custom binary input port, the @code{open-bytevector-input-port} | |
1796 | procedure could be implemented as follows: | |
1797 | ||
1798 | @lisp | |
1799 | (define (open-bytevector-input-port source) | |
1800 | (define position 0) | |
1801 | (define length (bytevector-length source)) | |
1802 | ||
1803 | (define (read! bv start count) | |
1804 | (let ((count (min count (- length position)))) | |
1805 | (bytevector-copy! source position | |
1806 | bv start count) | |
1807 | (set! position (+ position count)) | |
1808 | count)) | |
1809 | ||
1810 | (define (get-position) position) | |
1811 | ||
1812 | (define (set-position! new-position) | |
1813 | (set! position new-position)) | |
1814 | ||
1815 | (make-custom-binary-input-port "the port" read! | |
1816 | get-position | |
1817 | set-position! #f)) ; #f: no close procedure | |
1818 | ||
1819 | (read (open-bytevector-input-port (string->utf8 "hello"))) | |
1820 | @result{} hello | |
1821 | @end lisp | |
1822 | @end deffn | |
1823 | ||
1824 | @cindex binary input | |
1825 | Binary input is achieved using the procedures below: | |
1826 | ||
1827 | @deffn {Scheme Procedure} get-u8 port | |
1828 | @deffnx {C Function} scm_get_u8 (port) | |
1829 | Return an octet read from @var{port}, a binary input port, blocking as | |
1830 | necessary, or the end-of-file object. | |
1831 | @end deffn | |
1832 | ||
1833 | @deffn {Scheme Procedure} lookahead-u8 port | |
1834 | @deffnx {C Function} scm_lookahead_u8 (port) | |
1835 | Like @code{get-u8} but does not update @var{port}'s position to point | |
1836 | past the octet. | |
1837 | @end deffn | |
1838 | ||
1839 | @deffn {Scheme Procedure} get-bytevector-n port count | |
1840 | @deffnx {C Function} scm_get_bytevector_n (port, count) | |
1841 | Read @var{count} octets from @var{port}, blocking as necessary, and | |
1842 | return a bytevector containing the octets read. If fewer octets are | |
1843 | available, a bytevector shorter than @var{count} octets is returned. | |
1844 | @end deffn | |
1845 | ||
1846 | @deffn {Scheme Procedure} get-bytevector-n! port bv start count | |
1847 | @deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count) | |
1848 | Read @var{count} bytes from @var{port} and store them in @var{bv} | |
1849 | starting at index @var{start}. Return either the number of bytes | |
1850 | actually read or the end-of-file object. | |
1851 | @end deffn | |
1852 | ||
1853 | @deffn {Scheme Procedure} get-bytevector-some port | |
1854 | @deffnx {C Function} scm_get_bytevector_some (port) | |
21bbe22a MW |
1855 | Read from @var{port}, blocking as necessary, until bytes are available |
1856 | or an end-of-file is reached. Return either the end-of-file object or a | |
1857 | new bytevector containing some of the available bytes (at least one), | |
1858 | and update the port position to point just past these bytes. | |
b242715b LC |
1859 | @end deffn |
1860 | ||
1861 | @deffn {Scheme Procedure} get-bytevector-all port | |
1862 | @deffnx {C Function} scm_get_bytevector_all (port) | |
1863 | Read from @var{port}, blocking as necessary, until the end-of-file is | |
1864 | reached. Return either a new bytevector containing the data read or the | |
1865 | end-of-file object (if no data were available). | |
1866 | @end deffn | |
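
The following sketch walks through these procedures on a single port.
It assumes a bytevector input port holding five octets; the name
@code{port} is illustrative.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(1 2 3 4 5))))

;; Peek at the next octet without consuming it.
(lookahead-u8 port)
@result{} 1

(get-u8 port)
@result{} 1

;; Read exactly two octets.
(get-bytevector-n port 2)
@result{} #vu8(2 3)

;; Read everything that is left.
(get-bytevector-all port)
@result{} #vu8(4 5)

;; Further reads yield the end-of-file object.
(eof-object? (get-u8 port))
@result{} #t
@end lisp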
1867 | ||
7f6c3f8f MW |
1868 | The @code{(ice-9 binary-ports)} module provides the following procedure |
1869 | as an extension to @code{(rnrs io ports)}: | |
1870 | ||
1871 | @deffn {Scheme Procedure} unget-bytevector port bv [start [count]] | |
1872 | @deffnx {C Function} scm_unget_bytevector (port, bv, start, count) | |
1873 | Place the contents of @var{bv} in @var{port}, optionally starting at | |
1874 | index @var{start} and limiting to @var{count} octets, so that its bytes | |
1875 | will be read from left-to-right as the next bytes from @var{port} during | |
1876 | subsequent read operations. If called multiple times, the unread bytes | |
1877 | will be read again in last-in first-out order. | |
1878 | @end deffn | |
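
As a small sketch, assuming a bytevector input port as the target, bytes
pushed back with @code{unget-bytevector} are read before the port's
remaining contents:

@lisp
(use-modules (ice-9 binary-ports) (rnrs bytevectors))

(define port
  (open-bytevector-input-port (u8-list->bytevector '(3 4))))

;; Push two octets back in front of the unread data.
(unget-bytevector port (u8-list->bytevector '(1 2)))

(get-bytevector-all port)
@result{} #vu8(1 2 3 4)
@end lisp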
1879 | ||
040dfa6f AR |
1880 | @node R6RS Textual Input |
1881 | @subsubsection Textual Input | |
1882 | ||
64de6db5 | 1883 | @deffn {Scheme Procedure} get-char textual-input-port |
040dfa6f AR |
1884 | Reads from @var{textual-input-port}, blocking as necessary, until a |
1885 | complete character is available from @var{textual-input-port}, | |
1886 | or until an end of file is reached. | |
1887 | ||
1888 | If a complete character is available before the next end of file, | |
1889 | @code{get-char} returns that character and updates the input port to | |
1890 | point past the character. If an end of file is reached before any | |
1891 | character is read, @code{get-char} returns the end-of-file object. | |
1892 | @end deffn | |
1893 | ||
64de6db5 | 1894 | @deffn {Scheme Procedure} lookahead-char textual-input-port |
040dfa6f AR |
1895 | The @code{lookahead-char} procedure is like @code{get-char}, but it does |
1896 | not update @var{textual-input-port} to point past the character. | |
1897 | @end deffn | |
1898 | ||
64de6db5 | 1899 | @deffn {Scheme Procedure} get-string-n textual-input-port count |
040dfa6f | 1900 | |
64de6db5 | 1901 | @var{count} must be an exact, non-negative integer object, representing |
040dfa6f AR |
1902 | the number of characters to be read. |
1903 | ||
1904 | The @code{get-string-n} procedure reads from @var{textual-input-port}, | |
1905 | blocking as necessary, until @var{count} characters are available, or | |
1906 | until an end of file is reached. | |
1907 | ||
1908 | If @var{count} characters are available before end of file, | |
1909 | @code{get-string-n} returns a string consisting of those @var{count} | |
1910 | characters. If fewer characters are available before an end of file, but | |
1911 | one or more characters can be read, @code{get-string-n} returns a string | |
1912 | containing those characters. In either case, the input port is updated | |
1913 | to point just past the characters read. If no characters can be read | |
1914 | before an end of file, the end-of-file object is returned. | |
1915 | @end deffn | |
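
The three possible outcomes of @code{get-string-n} can be illustrated
with a short string port.  This is a minimal sketch, assuming the
@code{(rnrs io ports)} module is loaded; the port is created with
Guile's @code{open-input-string} purely for demonstration.

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "hello"))

(get-string-n port 3)                ; @result{} "hel"
(get-string-n port 10)               ; @result{} "lo", only two remain
(eof-object? (get-string-n port 1))  ; @result{} #t
@end lisp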
1916 | ||
64de6db5 | 1917 | @deffn {Scheme Procedure} get-string-n! textual-input-port string start count |
040dfa6f | 1918 | |
64de6db5 | 1919 | @var{start} and @var{count} must be exact, non-negative integer objects, |
040dfa6f | 1920 | with @var{count} representing the number of characters to be read. |
64de6db5 | 1921 | @var{string} must be a string with at least @math{@var{start} + @var{count}} |
040dfa6f AR |
1922 | characters. |
1923 | ||
1924 | The @code{get-string-n!} procedure reads from @var{textual-input-port} | |
1925 | in the same manner as @code{get-string-n}. If @var{count} characters | |
1926 | are available before an end of file, they are written into @var{string} | |
1927 | starting at index @var{start}, and @var{count} is returned. If fewer | |
1928 | characters are available before an end of file, but one or more can be | |
1929 | read, those characters are written into @var{string} starting at index | |
1930 | @var{start} and the number of characters actually read is returned as an | |
1931 | exact integer object. If no characters can be read before an end of | |
1932 | file, the end-of-file object is returned. | |
1933 | @end deffn | |
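
As a rough illustration, the following sketch reads into the middle of a
pre-allocated string.  It assumes the @code{(rnrs io ports)} module is
loaded; the names @code{buf} and @code{port} are chosen for this example
only.

@lisp
(use-modules (rnrs io ports))

;; A mutable string of eight characters, filled with dashes.
(define buf (make-string 8 #\-))
(define port (open-input-string "abc"))

;; Ask for up to five characters, stored from index 2 onwards.
(get-string-n! port buf 2 5)   ; @result{} 3, only three were available
buf                            ; @result{} "--abc---"
@end lisp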
1934 | ||
1fcf6909 | 1935 | @deffn {Scheme Procedure} get-string-all textual-input-port |
040dfa6f AR |
1936 | Reads from @var{textual-input-port} until an end of file, decoding |
1937 | characters in the same manner as @code{get-string-n} and | |
1938 | @code{get-string-n!}. | |
1939 | ||
1940 | If characters are available before the end of file, a string containing | |
1941 | all the characters decoded from that data is returned. If no character | |
1942 | precedes the end of file, the end-of-file object is returned. | |
1943 | @end deffn | |
1944 | ||
64de6db5 | 1945 | @deffn {Scheme Procedure} get-line textual-input-port |
040dfa6f AR |
1946 | Reads from @var{textual-input-port} up to and including the linefeed |
1947 | character or end of file, decoding characters in the same manner as | |
1948 | @code{get-string-n} and @code{get-string-n!}. | |
1949 | ||
1950 | If a linefeed character is read, a string containing all of the text up | |
1951 | to (but not including) the linefeed character is returned, and the port | |
1952 | is updated to point just past the linefeed character. If an end of file | |
1953 | is encountered before any linefeed character is read, but some | |
1954 | characters have been read and decoded as characters, a string containing | |
1955 | those characters is returned. If an end of file is encountered before | |
1956 | any characters are read, the end-of-file object is returned. | |
1957 | ||
1958 | @quotation Note | |
1959 | The end-of-line style, if not @code{none}, will cause all line endings | |
1960 | to be read as linefeed characters. @xref{R6RS Transcoders}. | |
1961 | @end quotation | |
1962 | @end deffn | |
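
A short sketch of the behaviour described for @code{get-line}, assuming
the @code{(rnrs io ports)} module is loaded and using a string port as a
stand-in for a file:

@lisp
(use-modules (rnrs io ports))

(define port (open-input-string "first\nsecond"))

(get-line port)                ; @result{} "first", linefeed dropped
(get-line port)                ; @result{} "second", ended by end of file
(eof-object? (get-line port))  ; @result{} #t
@end lisp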
1963 | ||
64de6db5 | 1964 | @deffn {Scheme Procedure} get-datum textual-input-port |
040dfa6f AR |
1965 | Reads an external representation from @var{textual-input-port} and returns the |
1966 | datum it represents. The @code{get-datum} procedure returns the next | |
1967 | datum that can be parsed from the given @var{textual-input-port}, updating | |
1968 | @var{textual-input-port} to point exactly past the end of the external | |
1969 | representation of the object. | |
1970 | ||
1971 | Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme | |
1972 | Syntax}) in the input is first skipped. If an end of file occurs after | |
1973 | the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File}) | |
1974 | is returned. | |
1975 | ||
1976 | If a character inconsistent with an external representation is | |
1977 | encountered in the input, an exception with condition types | |
1978 | @code{&lexical} and @code{&i/o-read} is raised. Also, if the end of | |
1979 | file is encountered after the beginning of an external representation, | |
1980 | but the external representation is incomplete and therefore cannot be | |
1981 | parsed, an exception with condition types @code{&lexical} and | |
1982 | @code{&i/o-read} is raised. | |
1983 | @end deffn | |
1984 | ||
1985 | @node R6RS Output Ports | |
1986 | @subsubsection Output Ports | |
1987 | ||
1988 | @deffn {Scheme Procedure} output-port? obj | |
1989 | Returns @code{#t} if the argument is an output port (or a | |
1990 | combined input and output port), @code{#f} otherwise. | |
1991 | @end deffn | |
1992 | ||
1993 | @deffn {Scheme Procedure} flush-output-port port | |
1994 | Flushes any buffered output from the buffer of @var{port} to the | |
1995 | underlying file, device, or object. The @code{flush-output-port} | |
1996 | procedure returns an unspecified value. | |
1997 | @end deffn | |
1998 | ||
1999 | @deffn {Scheme Procedure} open-file-output-port filename | |
2000 | @deffnx {Scheme Procedure} open-file-output-port filename file-options | |
2001 | @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode | |
2002 | @deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder | |
2003 | ||
2004 | @var{maybe-transcoder} must be either a transcoder or @code{#f}. | |
2005 | ||
2006 | The @code{open-file-output-port} procedure returns an output port for the named file. | |
2007 | ||
2008 | The @var{file-options} argument, which may determine various aspects of | |
2009 | the returned port (@pxref{R6RS File Options}), defaults to the value of | |
2010 | @code{(file-options)}. | |
2011 | ||
2012 | The @var{buffer-mode} argument, if supplied, | |
2013 | must be one of the symbols that name a buffer mode. | |
2014 | The @var{buffer-mode} argument defaults to @code{block}. | |
2015 | ||
2016 | If @var{maybe-transcoder} is a transcoder, it becomes the transcoder | |
2017 | associated with the port. | |
2018 | ||
2019 | If @var{maybe-transcoder} is @code{#f} or absent, | |
2020 | the port will be a binary port and will support the | |
2021 | @code{port-position} and @code{set-port-position!} operations. | |
2022 | Otherwise the port will be a textual port, and whether it supports | |
2023 | the @code{port-position} and @code{set-port-position!} operations | |
2024 | is implementation-dependent (and possibly transcoder-dependent). | |
2025 | @end deffn | |
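
The following sketch opens a textual output port with all four arguments
given explicitly.  It assumes the @code{(rnrs io ports)} module is
loaded; the path @file{/tmp/example.txt} is a placeholder chosen for
this example, and the @code{no-fail} file option is used so that an
existing file does not cause an exception.

@lisp
(use-modules (rnrs io ports))

;; Placeholder path, used only for this sketch.
(define filename "/tmp/example.txt")

(define port
  (open-file-output-port filename
                         (file-options no-fail)
                         (buffer-mode block)
                         (native-transcoder)))

(put-string port "hello\n")
(close-port port)

;; Read the file back to check what was written.
(call-with-input-file filename get-string-all)  ; @result{} "hello\n"
@end lisp

Leaving out the transcoder argument, or passing @code{#f}, would give a
binary port instead, suitable for @code{put-u8} and
@code{put-bytevector}.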
2026 | ||
2027 | @deffn {Scheme Procedure} standard-output-port | |
2028 | @deffnx {Scheme Procedure} standard-error-port | |
2029 | Returns a fresh binary output port connected to the standard output or | |
2030 | standard error respectively. Whether the port supports the | |
2031 | @code{port-position} and @code{set-port-position!} operations is | |
2032 | implementation-dependent. | |
2033 | @end deffn | |
2034 | ||
2035 | @deffn {Scheme Procedure} current-output-port | |
2036 | @deffnx {Scheme Procedure} current-error-port | |
2037 | These return default textual ports for regular output and error output. | |
2038 | Normally, these default ports are associated with standard output and | |
2039 | standard error, respectively. The return value of | |
2040 | @code{current-output-port} can be dynamically re-assigned using the | |
2041 | @code{with-output-to-file} procedure from the @code{io simple (6)} | |
2042 | library (@pxref{rnrs io simple}). A port returned by one of these | |
2043 | procedures may or may not have an associated transcoder; if it does, the | |
2044 | transcoder is implementation-dependent. | |
2045 | @end deffn | |
2046 | ||
b242715b LC |
2047 | @node R6RS Binary Output |
2048 | @subsubsection Binary Output | |
2049 | ||
2050 | Binary output ports can be created with the procedures below. | |
2051 | ||
2052 | @deffn {Scheme Procedure} open-bytevector-output-port [transcoder] | |
2053 | @deffnx {C Function} scm_open_bytevector_output_port (transcoder) | |
2054 | Return two values: a binary output port and a procedure. The latter | |
2055 | should be called with zero arguments to obtain a bytevector containing | |
2056 | the data accumulated by the port, as illustrated below. | |
2057 | ||
2058 | @lisp | |
2059 | (call-with-values | |
2060 | (lambda () | |
2061 | (open-bytevector-output-port)) | |
2062 | (lambda (port get-bytevector) | |
2063 | (display "hello" port) | |
2064 | (get-bytevector))) | |
2065 | ||
2066 | @result{} #vu8(104 101 108 108 111) | |
2067 | @end lisp | |
2068 | ||
2069 | @c FIXME: Update description when implemented. | |
2070 | The @var{transcoder} argument is currently not supported. | |
2071 | @end deffn | |
2072 | ||
2073 | @cindex custom binary output ports | |
2074 | ||
2075 | @deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close | |
2076 | @deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close) | |
2077 | Return a new custom binary output port named @var{id} (a string) whose | |
2078 | output is sunk by invoking @var{write!} and passing it a bytevector, an | |
2079 | index where bytes should be read from this bytevector, and the number of | |
2080 | bytes to be ``written''. The @code{write!} procedure must return an | |
2081 | integer indicating the number of bytes actually written; when it is | |
2082 | passed @code{0} as the number of bytes to write, it should behave as | |
2083 | though an end-of-file was sent to the byte sink. | |
2084 | ||
2085 | The other arguments are as for @code{make-custom-binary-input-port} | |
2086 | (@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}). | |
2087 | @end deffn | |
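
As a minimal sketch, a custom binary output port can collect the octets
written to it in a list.  The names @code{collected} and @code{write!}
are illustrative, and @code{#f} is passed for the position and close
procedures on the assumption that they are optional, as for
@code{make-custom-binary-input-port}.

@lisp
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Octets received so far, most recent first.
(define collected '())

;; Receives a bytevector, a start index and a byte count, and returns
;; the number of bytes consumed.
(define (write! bv start count)
  (let loop ((i 0))
    (when (< i count)
      (set! collected
            (cons (bytevector-u8-ref bv (+ start i)) collected))
      (loop (+ i 1))))
  count)

(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port #vu8(1 2 3))
(flush-output-port port)     ; make sure buffered bytes reach write!

(reverse collected)          ; @result{} (1 2 3)
@end lisp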
2088 | ||
2089 | @cindex binary output | |
2090 | Writing to a binary output port can be done using the following | |
2091 | procedures: | |
2092 | ||
2093 | @deffn {Scheme Procedure} put-u8 port octet | |
2094 | @deffnx {C Function} scm_put_u8 (port, octet) | |
2095 | Write @var{octet}, an integer in the 0--255 range, to @var{port}, a | |
2096 | binary output port. | |
2097 | @end deffn | |
2098 | ||
2099 | @deffn {Scheme Procedure} put-bytevector port bv [start [count]] | |
2100 | @deffnx {C Function} scm_put_bytevector (port, bv, start, count) | |
2101 | Write the contents of @var{bv} to @var{port}, optionally starting at | |
2102 | index @var{start} and limiting to @var{count} octets. | |
2103 | @end deffn | |
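
The two procedures can be combined as in the following sketch, which
assumes the @code{(rnrs io ports)} module is loaded and uses a
bytevector output port to capture the result.

@lisp
(use-modules (rnrs io ports))

(call-with-values
    (lambda ()
      (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (put-u8 port 255)
    ;; Start at index 1 and write 2 octets: 20 and 30.
    (put-bytevector port #vu8(10 20 30 40) 1 2)
    (get-bytevector)))

@result{} #vu8(255 20 30)
@end lisp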
2104 | ||
040dfa6f AR |
2105 | @node R6RS Textual Output |
2106 | @subsubsection Textual Output | |
2107 | ||
2108 | @deffn {Scheme Procedure} put-char port char | |
2109 | Writes @var{char} to the port. The @code{put-char} procedure returns | |
803c087e | 2110 | an unspecified value. |
040dfa6f AR |
2111 | @end deffn |
2112 | ||
2113 | @deffn {Scheme Procedure} put-string port string | |
2114 | @deffnx {Scheme Procedure} put-string port string start | |
2115 | @deffnx {Scheme Procedure} put-string port string start count | |
2116 | ||
2117 | @var{start} and @var{count} must be non-negative exact integer objects. | |
2118 | @var{string} must have a length of at least @math{@var{start} + | |
2119 | @var{count}}. @var{start} defaults to 0. @var{count} defaults to | |
2120 | @math{@code{(string-length @var{string})} - @var{start}}. The | |
2121 | @code{put-string} procedure writes the @var{count} characters of | |
2122 | @var{string} starting at index @var{start} to the port. The | |
2123 | @code{put-string} procedure returns an unspecified value. | |
2124 | @end deffn | |
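
The effect of the optional @var{start} and @var{count} arguments can be
seen in this sketch, which assumes the @code{(rnrs io ports)} module is
loaded and collects the output with Guile's
@code{call-with-output-string}.

@lisp
(use-modules (rnrs io ports))

(call-with-output-string
  (lambda (port)
    (put-string port "hello, world" 7)      ; from index 7 to the end
    (put-char port #\space)
    (put-string port "hello, world" 0 5)))  ; 5 characters from index 0

@result{} "world hello"
@end lisp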
2125 | ||
64de6db5 | 2126 | @deffn {Scheme Procedure} put-datum textual-output-port datum |
040dfa6f AR |
2127 | @var{datum} should be a datum value. The @code{put-datum} procedure |
2128 | writes an external representation of @var{datum} to | |
2129 | @var{textual-output-port}. The specific external representation is | |
2130 | implementation-dependent. However, whenever possible, an implementation | |
2131 | should produce a representation for which @code{get-datum}, when reading | |
2132 | the representation, will return an object equal (in the sense of | |
2133 | @code{equal?}) to @var{datum}. | |
2134 | ||
2135 | @quotation Note | |
2136 | Not all datums may allow producing an external representation for which | |
2137 | @code{get-datum} will produce an object that is equal to the | |
2138 | original. Specifically, NaNs contained in @var{datum} may make | |
2139 | this impossible. | |
2140 | @end quotation | |
2141 | ||
2142 | @quotation Note | |
2143 | The @code{put-datum} procedure merely writes the external | |
2144 | representation, but no trailing delimiter. If @code{put-datum} is | |
2145 | used to write several subsequent external representations to an | |
2146 | output port, care should be taken to delimit them properly so they can | |
2147 | be read back in by subsequent calls to @code{get-datum}. | |
2148 | @end quotation | |
2149 | @end deffn | |
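
The need for explicit delimiters can be sketched as follows, assuming
the @code{(rnrs io ports)} module is loaded.  Two numbers are written
with a space between them; without the space the output would read back
as the single datum @code{1234}.

@lisp
(use-modules (rnrs io ports))

(define text
  (call-with-output-string
    (lambda (port)
      (put-datum port 12)
      (put-char port #\space)    ; explicit delimiter
      (put-datum port 34))))

(define in (open-input-string text))

(get-datum in)                 ; @result{} 12
(get-datum in)                 ; @result{} 34
(eof-object? (get-datum in))   ; @result{} #t
@end lisp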
b242715b | 2150 | |
07d83abe MV |
2151 | @node I/O Extensions |
2152 | @subsection Using and Extending Ports in C | |
2153 | ||
2154 | @menu | |
2155 | * C Port Interface:: Using ports from C. | |
2156 | * Port Implementation:: How to implement a new port type in C. | |
2157 | @end menu | |
2158 | ||
2159 | ||
2160 | @node C Port Interface | |
2161 | @subsubsection C Port Interface | |
bf5df489 KR |
2162 | @cindex C port interface |
2163 | @cindex Port, C interface | |
07d83abe MV |
2164 | |
2165 | This section describes how to use Scheme ports from C. | |
2166 | ||
2167 | @subsubheading Port basics | |
2168 | ||
3081aee1 KR |
2169 | @cindex ptob |
2170 | @tindex scm_ptob_descriptor | |
2171 | @tindex scm_port | |
2172 | @findex SCM_PTAB_ENTRY | |
2173 | @findex SCM_PTOBNUM | |
2174 | @vindex scm_ptobs | |
07d83abe MV |
2175 | There are two main data structures. A port type object (ptob) is of |
2176 | type @code{scm_ptob_descriptor}. A port instance is of type | |
2177 | @code{scm_port}. Given an @code{SCM} variable which points to a port, | |
2178 | the corresponding C port object can be obtained using the | |
2179 | @code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using | |
2180 | @code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs} | |
2181 | global array. | |
2182 | ||
2183 | @subsubheading Port buffers | |
2184 | ||
2185 | An input port always has a read buffer and an output port always has a | |
2186 | write buffer. However the size of these buffers is not guaranteed to be | |
2187 | more than one byte (e.g., the @code{shortbuf} field in @code{scm_port} | |
2188 | which is used when no other buffer is allocated). The way in which the | |
2189 | buffers are allocated depends on the implementation of the ptob. For | |
2190 | example in the case of an fport, buffers may be allocated with malloc | |
2191 | when the port is created, but in the case of an strport the underlying | |
2192 | string is used as the buffer. | |
2193 | ||
2194 | @subsubheading The @code{rw_random} flag | |
2195 | ||
2196 | Special treatment is required for ports that support random-access seeking. | |
2197 | Before various operations, such as seeking the port or changing from | |
2198 | input to output on a bidirectional port or vice versa, the port | |
2199 | implementation must be given a chance to update its state. The write | |
2200 | buffer is updated by calling the @code{flush} ptob procedure and the | |
2201 | input buffer is updated by calling the @code{end_input} ptob procedure. | |
2202 | In the case of an fport, @code{flush} causes buffered output to be | |
2203 | written to the file descriptor, while @code{end_input} causes the | |
2204 | descriptor position to be adjusted to account for buffered input which | |
2205 | was never read. | |
2206 | ||
2207 | The special treatment must be performed if the @code{rw_random} flag in | |
2208 | the port is non-zero. | |
2209 | ||
2210 | @subsubheading The @code{rw_active} variable | |
2211 | ||
2212 | The @code{rw_active} variable in the port is only used if | |
2213 | @code{rw_random} is set. It's defined as an enum with the following | |
2214 | values: | |
2215 | ||
2216 | @table @code | |
2217 | @item SCM_PORT_READ | |
2218 | the read buffer may have unread data. | |
2219 | ||
2220 | @item SCM_PORT_WRITE | |
2221 | the write buffer may have unwritten data. | |
2222 | ||
2223 | @item SCM_PORT_NEITHER | |
2224 | neither the write nor the read buffer has data. | |
2225 | @end table | |
2226 | ||
2227 | @subsubheading Reading from a port | |
2228 | ||
2229 | To read from a port, it's possible to either call existing libguile | |
2230 | procedures such as @code{scm_getc} and @code{scm_read_line} or to read | |
2231 | data from the read buffer directly. Reading from the buffer involves | |
2232 | the following steps: | |
2233 | ||
2234 | @enumerate | |
2235 | @item | |
2236 | Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}. | |
2237 | ||
2238 | @item | |
2239 | Fill the read buffer, if it's empty, using @code{scm_fill_input}. | |
2240 | ||
2241 | @item Read the data from the buffer and update the read position in | |
2242 | the buffer. Steps 2) and 3) may be repeated as many times as required. | |
2243 | ||
2244 | @item Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set. | |
2245 | ||
2246 | @item Update the port's line and column counts. | |
2247 | @end enumerate | |
2248 | ||
2249 | @subsubheading Writing to a port | |
2250 | ||
2251 | To write data to a port, calling @code{scm_lfwrite} should be sufficient for | |
2252 | most purposes. This takes care of the following steps: | |
2253 | ||
2254 | @enumerate | |
2255 | @item | |
2256 | End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}. | |
2257 | ||
2258 | @item | |
2259 | Pass the data to the ptob implementation using the @code{write} ptob | |
2260 | procedure. The advantage of using the ptob @code{write} instead of | |
2261 | manipulating the write buffer directly is that it allows the data to be | |
2262 | written in one operation even if the port is using the single-byte | |
2263 | @code{shortbuf}. | |
2264 | ||
2265 | @item | |
2266 | Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random} | |
2267 | is set. | |
2268 | @end enumerate | |
2269 | ||
2270 | ||
2271 | @node Port Implementation | |
2272 | @subsubsection Port Implementation | |
28cc8dac | 2273 | @cindex Port implementation |
07d83abe MV |
2274 | |
2275 | This section describes how to implement a new port type in C. | |
2276 | ||
2277 | As described in the previous section, a port type object (ptob) is | |
2278 | a structure of type @code{scm_ptob_descriptor}. A ptob is created by | |
2279 | calling @code{scm_make_port_type}. | |
2280 | ||
23f2b9a3 KR |
2281 | @deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size)) |
2282 | Return a new port type object. The @var{name}, @var{fill_input} and | |
2283 | @var{write} parameters are initial values for those port type fields, | |
2284 | as described below. The other fields are initialized with default | |
2285 | values and can be changed later. | |
2286 | @end deftypefun | |
2287 | ||
07d83abe MV |
2288 | All of the elements of the ptob, apart from @code{name}, are procedures |
2289 | which collectively implement the port behaviour. Creating a new port | |
2290 | type mostly involves writing these procedures. | |
2291 | ||
07d83abe MV |
2292 | @table @code |
2293 | @item name | |
2294 | A pointer to a NUL terminated string: the name of the port type. This | |
2295 | is the only element of @code{scm_ptob_descriptor} which is not | |
2296 | a procedure. Set via the first argument to @code{scm_make_port_type}. | |
2297 | ||
2298 | @item mark | |
2299 | Called during garbage collection to mark any SCM objects that a port | |
2300 | object may contain. It doesn't need to be set unless the port has | |
23f2b9a3 KR |
2301 | @code{SCM} components. Set using |
2302 | ||
2303 | @deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port)) | |
2304 | @end deftypefun | |
07d83abe MV |
2305 | |
2306 | @item free | |
2307 | Called when the port is collected during gc. It | |
2308 | should free any resources used by the port. | |
23f2b9a3 KR |
2309 | Set using |
2310 | ||
2311 | @deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port)) | |
2312 | @end deftypefun | |
07d83abe MV |
2313 | |
2314 | @item print | |
2315 | Called when @code{write} is called on the port object, to print a | |
23f2b9a3 KR |
2316 | port description. E.g., for an fport it may produce something like: |
2317 | @code{#<input: /etc/passwd 3>}. Set using | |
2318 | ||
2319 | @deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate)) | |
2320 | The first argument @var{port} is the object being printed, the second | |
2321 | argument @var{dest_port} is where its description should go. | |
2322 | @end deftypefun | |
07d83abe MV |
2323 | |
2324 | @item equalp | |
23f2b9a3 KR |
2325 | Not used at present. Set using |
2326 | ||
2327 | @deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM)) | |
2328 | @end deftypefun | |
07d83abe MV |
2329 | |
2330 | @item close | |
2331 | Called when the port is closed, unless it was collected during gc. It | |
2332 | should free any resources used by the port. | |
23f2b9a3 KR |
2333 | Set using |
2334 | ||
2335 | @deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port)) | |
2336 | @end deftypefun | |
07d83abe MV |
2337 | |
2338 | @item write | |
2339 | Accept data which is to be written using the port. The port implementation | |
2340 | may choose to buffer the data instead of processing it directly. | |
2341 | Set via the third argument to @code{scm_make_port_type}. | |
2342 | ||
2343 | @item flush | |
2344 | Complete the processing of buffered output data. Reset the value of | |
2345 | @code{rw_active} to @code{SCM_PORT_NEITHER}. | |
23f2b9a3 KR |
2346 | Set using |
2347 | ||
2348 | @deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port)) | |
2349 | @end deftypefun | |
07d83abe MV |
2350 | |
2351 | @item end_input | |
2352 | Perform any synchronization required when switching from input to output | |
2353 | on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}. | |
23f2b9a3 KR |
2354 | Set using |
2355 | ||
2356 | @deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset)) | |
2357 | @end deftypefun | |
07d83abe MV |
2358 | |
2359 | @item fill_input | |
2360 | Read new data into the read buffer and return the first character. It | |
2361 | can be assumed that the read buffer is empty when this procedure is called. | |
2362 | Set via the second argument to @code{scm_make_port_type}. | |
2363 | ||
2364 | @item input_waiting | |
2365 | Return a lower bound on the number of bytes that could be read from the | |
2366 | port without blocking. It can be assumed that the current state of | |
2367 | @code{rw_active} is @code{SCM_PORT_NEITHER}. | |
23f2b9a3 KR |
2368 | Set using |
2369 | ||
2370 | @deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port)) | |
2371 | @end deftypefun | |
07d83abe MV |
2372 | |
2373 | @item seek | |
2374 | Set the current position of the port. The procedure cannot make | |
2375 | any assumptions about the value of @code{rw_active} when it's | |
2376 | called. It can reset the buffers first if desired by using something | |
2377 | like: | |
2378 | ||
2379 | @example | |
23f2b9a3 KR |
2380 | if (pt->rw_active == SCM_PORT_READ) |
2381 | scm_end_input (port); | |
2382 | else if (pt->rw_active == SCM_PORT_WRITE) | |
2383 | ptob->flush (port); | |
07d83abe MV |
2384 | @end example |
2385 | ||
2386 | However note that this will have the side effect of discarding any data | |
2387 | in the unread-char buffer, in addition to any side effects from the | |
2388 | @code{end_input} and @code{flush} ptob procedures. This is undesirable | |
2389 | when seek is called to measure the current position of the port, i.e., | |
2390 | @code{(seek p 0 SEEK_CUR)}. The libguile fport and string port | |
2391 | implementations take care to avoid this problem. | |
2392 | ||
23f2b9a3 KR |
2393 | The procedure is set using |
2394 | ||
f1ce9199 | 2395 | @deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence)) |
23f2b9a3 | 2396 | @end deftypefun |
07d83abe MV |
2397 | |
2398 | @item truncate | |
2399 | Truncate the port data to the specified length. It can be assumed that the | |
2400 | current state of @code{rw_active} is @code{SCM_PORT_NEITHER}. | |
23f2b9a3 KR |
2401 | Set using |
2402 | ||
f1ce9199 | 2403 | @deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length)) |
23f2b9a3 | 2404 | @end deftypefun |
07d83abe MV |
2405 | |
2406 | @end table | |
2407 | ||
cdd3d6c9 MW |
2408 | @node BOM Handling |
2409 | @subsection Handling of Unicode Byte Order Marks | |
2410 | @cindex BOM | |
2411 | @cindex byte order mark | |
2412 | ||
2413 | This section documents the finer points of Guile's handling of Unicode | |
2414 | byte order marks (BOMs). A byte order mark (U+FEFF) is typically found | |
2415 | at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably | |
2416 | determine the byte order. Occasionally, a BOM is found at the start of | |
2417 | a UTF-8 stream, but this is much less common and not generally | |
2418 | recommended. | |
2419 | ||
2420 | Guile attempts to handle BOMs automatically, and in accordance with the | |
2421 | recommendations of the Unicode Standard, when the port encoding is set | |
2422 | to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile | |
2423 | automatically writes a BOM at the start of a UTF-16 or UTF-32 stream, | |
2424 | and automatically consumes one from the start of a UTF-8, UTF-16, or | |
2425 | UTF-32 stream. | |
2426 | ||
2427 | As specified in the Unicode Standard, a BOM is only handled specially at | |
2428 | the start of a stream, and only if the port encoding is set to | |
2429 | @code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is | |
2430 | set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or | |
2431 | @code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of | |
2432 | the special handling described in this section applies. | |
2433 | ||
2434 | @itemize @bullet | |
2435 | @item | |
2436 | To ensure that Guile will properly detect the byte order of a UTF-16 or | |
2437 | UTF-32 stream, you must perform a textual read before any writes, seeks, | |
2438 | or binary I/O. Guile will not attempt to read a BOM unless a read is | |
2439 | explicitly requested at the start of the stream. | |
2440 | ||
2441 | @item | |
2442 | If a textual write is performed before the first read, then an arbitrary | |
2443 | byte order will be chosen. Currently, big endian is the default on all | |
2444 | platforms, but that may change in the future. If you wish to explicitly | |
2445 | control the byte order of an output stream, set the port encoding to | |
2446 | @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE}, | |
2447 | and explicitly write a BOM (@code{#\xFEFF}) if desired. | |
2448 | ||
2449 | @item | |
2450 | If @code{set-port-encoding!} is called in the middle of a stream, Guile | |
2451 | treats this as a new logical ``start of stream'' for purposes of BOM | |
2452 | handling, and will forget about any BOMs that had previously been seen. | |
2453 | Therefore, it may choose a different byte order than had been used | |
2454 | previously. This is intended to support multiple logical text streams | |
2455 | embedded within a larger binary stream. | |
2456 | ||
2457 | @item | |
2458 | Binary I/O operations are not guaranteed to update Guile's notion of | |
2459 | whether the port is at the ``start of the stream'', nor are they | |
2460 | guaranteed to produce or consume BOMs. | |
2461 | ||
2462 | @item | |
2463 | For ports that support seeking (e.g., normal files), the input and output | |
2464 | streams are considered linked: if the user reads first, then a BOM will | |
2465 | be consumed (if appropriate), but later writes will @emph{not} produce a | |
2466 | BOM. Similarly, if the user writes first, then later reads will | |
2467 | @emph{not} consume a BOM. | |
2468 | ||
2469 | @item | |
2470 | For ports that do not support seeking (e.g., pipes, sockets, and | |
2471 | terminals), the input and output streams are considered | |
2472 | @emph{independent} for purposes of BOM handling: the first read will | |
2473 | consume a BOM (if appropriate), and the first write will @emph{also} | |
2474 | produce a BOM (if appropriate). However, the input and output streams | |
2475 | will always use the same byte order. | |
2476 | ||
2477 | @item | |
2478 | Seeks to the beginning of a file will set the ``start of stream'' flags. | |
2479 | Therefore, a subsequent textual read or write will consume or produce a | |
2480 | BOM. However, unlike @code{set-port-encoding!}, if a byte order had | |
2481 | already been chosen for the port, it will remain in effect after a seek, | |
2482 | and cannot be changed by the presence of a BOM. Seeks anywhere other | |
2483 | than the beginning of a file clear the ``start of stream'' flags. | |
2484 | @end itemize | |
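
Two of the rules above can be sketched with bytevector ports.  The
sketch assumes the @code{(rnrs io ports)} module is loaded; the variable
names are illustrative.  The first part reads a UTF-16 stream whose BOM
selects little-endian order.  The second part fixes the byte order with
@code{UTF-16LE} and writes the BOM explicitly, since no BOM is produced
automatically for that encoding.

@lisp
(use-modules (rnrs io ports))

;; FF FE is a little-endian UTF-16 BOM; "hi" follows as UTF-16LE.
(define in
  (open-bytevector-input-port
   #vu8(#xFF #xFE #x68 #x00 #x69 #x00)))
(set-port-encoding! in "UTF-16")
(get-string-all in)            ; @result{} "hi", the BOM is consumed

;; Explicit byte order: the BOM must be written by hand.
(call-with-values
    (lambda ()
      (open-bytevector-output-port))
  (lambda (out get-bytevector)
    (set-port-encoding! out "UTF-16LE")
    (put-char out #\xFEFF)
    (put-string out "h")
    (get-bytevector)))
;; @result{} #vu8(255 254 104 0)
@end lisp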
2485 | ||
07d83abe MV |
2486 | @c Local Variables: |
2487 | @c TeX-master: "guile.texi" | |
2488 | @c End: |