Merge remote-tracking branch 'origin/stable-2.0'

diff --git a/doc/ref/api-io.texi b/doc/ref/api-io.texi

index b0b5741..5ca3506 100644
--- a/doc/ref/api-io.texi
+++ b/doc/ref/api-io.texi
@@ -1,10 +1,9 @@
  @c -*-texinfo-*-
  @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009
-@c   Free Software Foundation, Inc.
+@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
+@c   2010, 2011, 2013  Free Software Foundation, Inc.
  @c See the file guile.texi for copying conditions.
  
-@page
  @node Input and Output
  @section Input and Output
  
@@ -20,6 +19,7 @@
  * Port Types::                  Types of port and how to make them.
  * R6RS I/O Ports::              The R6RS port API.
  * I/O Extensions::              Using and extending ports in C.
+* BOM Handling::                Handling of Unicode byte order marks.
  @end menu
  
  
@@ -47,7 +47,7 @@ are two interesting and powerful examples of this technique.
  
  Ports are garbage collected in the usual way (@pxref{Memory
  Management}), and will be closed at that time if not already closed.
-In this case any errors occuring in the close will not be reported.
+In this case any errors occurring in the close will not be reported.
  Usually a program will want to explicitly close so as to be sure all
  its operations have been successful.  Of course if a program has
  abandoned something due to an error or other condition then closing
@@ -70,6 +70,24 @@ All file access uses the ``LFS'' large file support functions when
  available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
  read and written on a 32-bit system.
  
+Each port has an associated character encoding that controls how bytes
+read from the port are converted to characters and strings, and
+how characters and strings written to the port are converted to bytes.
+When ports are created, they inherit their character encoding from the
+current locale, but that can be modified after the port is created.
+
+Currently, the ports only work with @emph{non-modal} encodings.  Most
+encodings are non-modal, meaning that the conversion of bytes to a
+string doesn't depend on its context: the same byte sequence will always
+return the same string.  A couple of modal encodings are in common use,
+like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.
+
+Each port also has an associated conversion strategy: what to do when
+a Guile character can't be converted to the port's encoded character
+representation for output. There are three possible strategies: to
+raise an error, to replace the character with a hex escape, or to
+replace the character with a substitute character.
+
  @rnindex input-port?
  @deffn {Scheme Procedure} input-port? x
  @deffnx {C Function} scm_input_port_p (x)
@@ -93,6 +111,78 @@ Equivalent to @code{(or (input-port? @var{x}) (output-port?
  @var{x}))}.
  @end deffn
  
+@deffn {Scheme Procedure} set-port-encoding! port enc
+@deffnx {C Function} scm_set_port_encoding_x (port, enc)
+Sets the character encoding that will be used to interpret all port I/O.
+@var{enc} is a string containing the name of an encoding.  Valid
+encoding names are those
+@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
+@end deffn
+
+@defvr {Scheme Variable} %default-port-encoding
+A fluid containing @code{#f} or the name of the encoding to
+be used by default for newly created ports (@pxref{Fluids and Dynamic
+States}).  The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
+
+New ports are created with the encoding appropriate for the current
+locale if @code{setlocale} has been called or the value specified by
+this fluid otherwise.
+@end defvr
+
+@deffn {Scheme Procedure} port-encoding port
+@deffnx {C Function} scm_port_encoding (port)
+Returns, as a string, the character encoding that @var{port} uses to interpret
+its input and output.  The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
+@end deffn
+
+@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
+@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
+Sets the behavior of the interpreter when outputting a character that
+is not representable in the port's current encoding.  @var{sym} can be
+either @code{'error}, @code{'substitute}, or @code{'escape}.  If it is
+@code{'error}, an error will be thrown when a nonconvertible character
+is encountered.  If it is @code{'substitute}, then nonconvertible
+characters will be replaced with approximate characters, or with
+question marks if no approximately correct character is available.  If
+it is @code{'escape}, it will appear as a hex escape when output.
+
+If @var{port} is an open port, the conversion error behavior
+is set for that port.  If it is @code{#f}, it is set as the
+default behavior for any future ports that get created in
+this thread.
+@end deffn
+
+@deffn {Scheme Procedure} port-conversion-strategy port
+@deffnx {C Function} scm_port_conversion_strategy (port)
+Returns the behavior of the port when outputting a character that is
+not representable in the port's current encoding.  It returns the
+symbol @code{error} if unrepresentable characters should cause
+exceptions, @code{substitute} if the port should try to replace
+unrepresentable characters with question marks or approximate
+characters, or @code{escape} if unrepresentable characters should be
+converted to string escapes.
+
+If @var{port} is @code{#f}, then the current default behavior will be
+returned.  New ports will have this default behavior when they are
+created.
+@end deffn
+
+@deffn {Scheme Variable} %default-port-conversion-strategy
+The fluid that defines the conversion strategy for newly created ports,
+and for other conversion routines such as @code{scm_to_stringn},
+@code{scm_from_stringn}, @code{string->pointer}, and
+@code{pointer->string}.
+
+Its value must be one of the symbols described above, with the same
+semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
+
+When Guile starts, its value is @code{'substitute}.
+
+Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
+equivalent to @code{(fluid-set! %default-port-conversion-strategy
+@var{sym})}.
+@end deffn
+
  
  @node Reading
  @subsection Reading
@@ -100,6 +190,9 @@ Equivalent to @code{(or (input-port? @var{x}) (output-port?
  
  [Generic procedures for reading from ports.]
  
+These procedures pertain to reading characters and strings from
+ports.  To read general S-expressions from ports, see @ref{Scheme Read}.
+
  @rnindex eof-object?
  @cindex End of file object
  @deffn {Scheme Procedure} eof-object? x
@@ -133,6 +226,10 @@ interactive port that has no ready characters.
  Return the next character available from @var{port}, updating
  @var{port} to point to the following character.  If no more
  characters are available, the end-of-file object is returned.
+
+When @var{port}'s data cannot be decoded according to its
+character encoding, a @code{decoding-error} is raised and
+@var{port} points past the erroneous byte sequence.
  @end deffn
  
  @deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
@@ -161,11 +258,16 @@ return the value returned by the preceding call to
  @code{peek-char}.  In particular, a call to @code{peek-char} on
  an interactive port will hang waiting for input whenever a call
  to @code{read-char} would have hung.
+
+As for @code{read-char}, a @code{decoding-error} may be raised
+if such a situation occurs.  However, unlike with @code{read-char},
+@var{port} still points at the beginning of the erroneous byte
+sequence when the error is raised.
  @end deffn
  
  @deffn {Scheme Procedure} unread-char cobj [port]
  @deffnx {C Function} scm_unread_char (cobj, port)
-Place @var{char} in @var{port} so that it will be read by the
+Place character @var{cobj} in @var{port} so that it will be read by the
  next read operation.  If called multiple times, the unread characters
  will be read again in last-in first-out order.  If @var{port} is
  not supplied, the current input port is used.
@@ -225,34 +327,16 @@ Set the current column or line number of @var{port}.
  
  [Generic procedures for writing to ports.]
  
+These procedures are for writing characters and strings to
+ports.  For more information on writing arbitrary Scheme objects to
+ports, see @ref{Scheme Write}.
+
  @deffn {Scheme Procedure} get-print-state port
  @deffnx {C Function} scm_get_print_state (port)
  Return the print state of the port @var{port}.  If @var{port}
  has no associated print state, @code{#f} is returned.
  @end deffn
  
-@rnindex write
-@deffn {Scheme Procedure} write obj [port]
-Send a representation of @var{obj} to @var{port} or to the current
-output port if not given.
-
-The output is designed to be machine readable, and can be read back
-with @code{read} (@pxref{Reading}).  Strings are printed in
-doublequotes, with escapes if necessary, and characters are printed in
-@samp{#\} notation.
-@end deffn
-
-@rnindex display
-@deffn {Scheme Procedure} display obj [port]
-Send a representation of @var{obj} to @var{port} or to the current
-output port if not given.
-
-The output is designed for human readability, it differs from
-@code{write} in that strings are printed without doublequotes and
-escapes, and characters are printed as per @code{write-char}, not in
-@samp{#\} form.
-@end deffn
-
  @rnindex newline
  @deffn {Scheme Procedure} newline [port]
  @deffnx {C Function} scm_newline (port)
@@ -268,14 +352,6 @@ If @var{pstate} isn't supplied and @var{port} already has
  a print state, the old print state is reused.
  @end deffn
  
-@deffn {Scheme Procedure} print-options-interface [setting]
-@deffnx {C Function} scm_print_options (setting)
-Option interface for the print options. Instead of using
-this procedure directly, use the procedures
-@code{print-enable}, @code{print-disable}, @code{print-set!}
-and @code{print-options}.
-@end deffn
-
  @deffn {Scheme Procedure} simple-format destination message . args
  @deffnx {C Function} scm_simple_format (destination, message, args)
  Write @var{message} to @var{destination}, defaulting to
@@ -283,7 +359,7 @@ the current output port.
  @var{message} can contain @code{~A} (was @code{%s}) and
  @code{~S} (was @code{%S}) escapes.  When printed,
  the escapes are replaced with corresponding members of
-@var{ARGS}:
+@var{args}:
  @code{~A} formats using @code{display} and @code{~S} formats
  using @code{write}.
  If @var{destination} is @code{#t}, then use the current output
@@ -366,7 +442,7 @@ open.
  
  @deffn {Scheme Procedure} seek fd_port offset whence
  @deffnx {C Function} scm_seek (fd_port, offset, whence)
-Sets the current position of @var{fd/port} to the integer
+Sets the current position of @var{fd_port} to the integer
  @var{offset}, which is interpreted according to the value of
  @var{whence}.
  
@@ -381,7 +457,7 @@ Seek from the current position.
  @defvar SEEK_END
  Seek from the end of the file.
  @end defvar
-If @var{fd/port} is a file descriptor, the underlying system
+If @var{fd_port} is a file descriptor, the underlying system
  call is @code{lseek}.  @var{port} may be a string port.
  
  The value returned is the new position in the file.  This means
@@ -394,7 +470,7 @@ that the current position of a port can be obtained using:
  @deffn {Scheme Procedure} ftell fd_port
  @deffnx {C Function} scm_ftell (fd_port)
  Return an integer representing the current position of
-@var{fd/port}, measured from the beginning.  Equivalent to:
+@var{fd_port}, measured from the beginning.  Equivalent to:
  
  @lisp
  (seek port 0 SEEK_CUR)
@@ -424,9 +500,9 @@ the current size, but this is not mandatory in the POSIX standard.
  
  The delimited-I/O module can be accessed with:
  
-@smalllisp
+@lisp
  (use-modules (ice-9 rdelim))
-@end smalllisp
+@end lisp
  
  It can be used to read or write lines of text, or read text delimited by
  a specified set of characters.  It's similar to the @code{(scsh rdelim)}
@@ -454,6 +530,9 @@ Push the terminating delimiter (if any) back on to the port.
  Return a pair containing the string read from the port and the
  terminating delimiter or end-of-file object.
  @end table
+
+Like @code{read-char}, this procedure can throw to @code{decoding-error}
+(@pxref{Reading, @code{read-char}}).
  @end deffn
  
  @c begin (scm-doc-string "rdelim.scm" "read-line!")
@@ -475,14 +554,17 @@ from the value returned by @code{(current-input-port)}.
  
  @c begin (scm-doc-string "rdelim.scm" "read-delimited!")
  @deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
-Read text into the supplied string @var{buf} and return the number of
-characters added to @var{buf} (subject to @var{handle-delim}, which takes
-the same values specified for @code{read-line}.  If @var{buf} is filled,
-@code{#f} is returned for both the number of characters read and the
-delimiter.  Also terminates if one of the characters in the string
-@var{delims} is found
-or end-of-file is reached.  Read from @var{port} if supplied, otherwise
-from the value returned by @code{(current-input-port)}.
+Read text into the supplied string @var{buf}.
+
+If a delimiter was found, return the number of characters written,
+except if @var{handle-delim} is @code{split}, in which case the return
+value is a pair, as noted above.
+
+As a special case, if @var{port} was already at end-of-stream, the EOF
+object is returned. Also, if no characters were written because the
+buffer was full, @code{#f} is returned.
+
+It's something of a wacky interface, to be honest.
  @end deffn
  
  @deffn {Scheme Procedure} write-line obj [port]
@@ -496,7 +578,34 @@ used.  This function is equivalent to:
  @end lisp
  @end deffn
  
-Some of the abovementioned I/O functions rely on the following C
+In the past, Guile did not have a procedure that would just read out all
+of the characters from a port.  As a workaround, many people just called
+@code{read-delimited} with no delimiters, knowing that would produce the
+behavior they wanted.  This prompted Guile developers to add some
+routines that would read all characters from a port.  So it is that
+@code{(ice-9 rdelim)} is also the home for procedures that can read
+undelimited text:
+
+@deffn {Scheme Procedure} read-string [port] [count]
+Read all of the characters out of @var{port} and return them as a
+string.  If the @var{count} is present, treat it as a limit to the
+number of characters to read.
+
+By default, read from the current input port, with no size limit on the
+result.  This procedure always returns a string, even if no characters
+were read.
+@end deffn
+
+@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
+Fill @var{buf} with characters read from @var{port}, defaulting to the
+current input port.  Return the number of characters read.
+
+If @var{start} or @var{end} are specified, store data only into the
+substring of @var{buf} bounded by @var{start} and @var{end} (which
+default to the beginning and end of the string, respectively).
+@end deffn
+
+Some of the aforementioned I/O functions rely on the following C
  primitives.  These will mainly be of interest to people hacking Guile
  internals.
  
@@ -536,9 +645,9 @@ delimiter may be either a newline or the @var{eof-object}; if
  
  The Block-string-I/O module can be accessed with:
  
-@smalllisp
+@lisp
  (use-modules (ice-9 rw))
-@end smalllisp
+@end lisp
  
  It currently contains procedures that help to implement the
  @code{(scsh rw)} module in guile-scsh.
@@ -734,7 +843,10 @@ Most systems have limits on how many files can be open, so it's
  strongly recommended that file ports be closed explicitly when no
  longer required (@pxref{Ports}).
  
-@deffn {Scheme Procedure} open-file filename mode
+@deffn {Scheme Procedure} open-file filename mode @
+                          [#:guess-encoding=#f] [#:encoding=#f]
+@deffnx {C Function} scm_open_file_with_encoding @
+                     (filename, mode, guess_encoding, encoding)
  @deffnx {C Function} scm_open_file (filename, mode)
  Open the file whose name is @var{filename}, and return a port
  representing that file.  The attributes of the port are
@@ -772,20 +884,52 @@ setvbuf}
  Add line-buffering to the port.  The port output buffer will be
  automatically flushed whenever a newline character is written.
  @item b
-Use binary mode.  On DOS systems the default text mode converts CR+LF
-in the file to newline for the program, whereas binary mode reads and
-writes all bytes unchanged.  On Unix-like systems there is no such
-distinction, text files already contain just newlines and no
-conversion is ever made.  The @code{b} flag is accepted on all
-systems, but has no effect on Unix-like systems.
-
-(For reference, Guile leaves text versus binary up to the C library,
-@code{b} here just adds @code{O_BINARY} to the underlying @code{open}
-call, when that flag is available.)
+Use binary mode, ensuring that each byte in the file will be read as one
+Scheme character.
+
+To provide this property, the file will be opened with the 8-bit
+character encoding "ISO-8859-1", ignoring the default port encoding.
+@xref{Ports}, for more information on port encodings.
+
+Note that while it is possible to read and write binary data as
+characters or strings, it is usually better to treat bytes as octets,
+and byte sequences as bytevectors.  @xref{R6RS Binary Input}, and
+@ref{R6RS Binary Output}, for more.
+
+This option had another historical meaning, for DOS compatibility: in
+the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
+The @code{b} flag prevents this from happening, adding @code{O_BINARY}
+to the underlying @code{open} call.  Still, the flag is generally useful
+because of its port encoding ramifications.
  @end table
  
-If a file cannot be opened with the access
-requested, @code{open-file} throws an exception.
+Unless binary mode is requested, the character encoding of the new port
+is determined as follows: First, if @var{guess-encoding} is true, the
+@code{file-encoding} procedure is used to guess the encoding of the file
+(@pxref{Character Encoding of Source Files}).  If @var{guess-encoding}
+is false or if @code{file-encoding} fails, @var{encoding} is used unless
+it is also false.  As a last resort, the default port encoding is used.
+@xref{Ports}, for more information on port encodings.  It is an error to
+pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
+is requested.
+
+If a file cannot be opened with the access requested, @code{open-file}
+throws an exception.
+
+When the file is opened, its encoding is set to the current
+@code{%default-port-encoding}, unless the @code{b} flag was supplied.
+Sometimes it is desirable to honor Emacs-style coding declarations in
+files@footnote{Guile 2.0.0 to 2.0.7 would do this by default.  This
+behavior was deemed inappropriate and disabled starting from Guile
+2.0.8.}.  When that is the case, the @code{file-encoding} procedure can
+be used as follows (@pxref{Character Encoding of Source Files,
+@code{file-encoding}}):
+
+@example
+(let* ((port     (open-input-file file))
+       (encoding (file-encoding port)))
+  (set-port-encoding! port (or encoding (port-encoding port))))
+@end example
  
  In theory we could create read/write ports which were buffered
  in one direction only.  However this isn't included in the
@@ -793,40 +937,60 @@ current interfaces.
  @end deffn
  
  @rnindex open-input-file
-@deffn {Scheme Procedure} open-input-file filename
-Open @var{filename} for input.  Equivalent to
-@smalllisp
-(open-file @var{filename} "r")
-@end smalllisp
+@deffn {Scheme Procedure} open-input-file filename @
+       [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
+
+Open @var{filename} for input.  If @var{binary} is true, open the port
+in binary mode, otherwise use text mode.  @var{encoding} and
+@var{guess-encoding} determine the character encoding as described above
+for @code{open-file}.  Equivalent to
+@lisp
+(open-file @var{filename}
+           (if @var{binary} "rb" "r")
+           #:guess-encoding @var{guess-encoding}
+           #:encoding @var{encoding})
+@end lisp
  @end deffn
  
  @rnindex open-output-file
-@deffn {Scheme Procedure} open-output-file filename
-Open @var{filename} for output.  Equivalent to
-@smalllisp
-(open-file @var{filename} "w")
-@end smalllisp
+@deffn {Scheme Procedure} open-output-file filename @
+       [#:encoding=#f] [#:binary=#f]
+
+Open @var{filename} for output.  If @var{binary} is true, open the port
+in binary mode, otherwise use text mode.  @var{encoding} specifies the
+character encoding as described above for @code{open-file}.  Equivalent
+to
+@lisp
+(open-file @var{filename}
+           (if @var{binary} "wb" "w")
+           #:encoding @var{encoding})
+@end lisp
  @end deffn
  
-@deffn {Scheme Procedure} call-with-input-file filename proc
-@deffnx {Scheme Procedure} call-with-output-file filename proc
+@deffn {Scheme Procedure} call-with-input-file filename proc @
+        [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
+@deffnx {Scheme Procedure} call-with-output-file filename proc @
+        [#:encoding=#f] [#:binary=#f]
  @rnindex call-with-input-file
  @rnindex call-with-output-file
  Open @var{filename} for input or output, and call @code{(@var{proc}
  port)} with the resulting port.  Return the value returned by
  @var{proc}.  @var{filename} is opened as per @code{open-input-file} or
-@code{open-output-file} respectively, and an error is signalled if it
+@code{open-output-file} respectively, and an error is signaled if it
  cannot be opened.
  
  When @var{proc} returns, the port is closed.  If @var{proc} does not
-return (eg.@: if it throws an error), then the port might not be
+return (e.g.@: if it throws an error), then the port might not be
  closed automatically, though it will be garbage collected in the usual
  way if not otherwise referenced.
  @end deffn
  
-@deffn {Scheme Procedure} with-input-from-file filename thunk
-@deffnx {Scheme Procedure} with-output-to-file filename thunk
-@deffnx {Scheme Procedure} with-error-to-file filename thunk
+@deffn {Scheme Procedure} with-input-from-file filename thunk @
+        [#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
+@deffnx {Scheme Procedure} with-output-to-file filename thunk @
+        [#:encoding=#f] [#:binary=#f]
+@deffnx {Scheme Procedure} with-error-to-file filename thunk @
+        [#:encoding=#f] [#:binary=#f]
  @rnindex with-input-from-file
  @rnindex with-output-to-file
  Open @var{filename} and call @code{(@var{thunk})} with the new port
@@ -834,7 +998,7 @@ setup as respectively the @code{current-input-port},
  @code{current-output-port}, or @code{current-error-port}.  Return the
  value returned by @var{thunk}.  @var{filename} is opened as per
  @code{open-input-file} or @code{open-output-file} respectively, and an
-error is signalled if it cannot be opened.
+error is signaled if it cannot be opened.
  
  When @var{thunk} returns, the port is closed and the previous setting
  of the respective current port is restored.
@@ -842,7 +1006,7 @@ of the respective current port is restored.
  The current port setting is managed with @code{dynamic-wind}, so the
  previous value is restored no matter how @var{thunk} exits (eg.@: an
  exception), and if @var{thunk} is re-entered (via a captured
-continuation) then it's set again to the @var{FILENAME} port.
+continuation) then it's set again to the @var{filename} port.
  
  The port is closed when @var{thunk} returns normally, but not when
  exited via an exception or new continuation.  This ensures it's still
@@ -861,9 +1025,8 @@ used only during port creation are not retained.
  
  @deffn {Scheme Procedure} port-filename port
  @deffnx {C Function} scm_port_filename (port)
-Return the filename associated with @var{port}.  This function returns
-the strings "standard input", "standard output" and "standard error"
-when called on the current input, output and error ports respectively.
+Return the filename associated with @var{port}, or @code{#f} if no
+filename is associated with the port.
  
  @var{port} must be open, @code{port-filename} cannot be used once the
  port is closed.
@@ -888,9 +1051,16 @@ Determine whether @var{obj} is a port that is related to a file.
  @cindex String port
  @cindex Port, string
  
-The following allow string ports to be opened by analogy to R4R*
+The following allow string ports to be opened by analogy to R4RS
  file port facilities:
  
+With string ports, the port-encoding is treated differently than other
+types of ports.  When string ports are created, they do not inherit a
+character encoding from the current locale.  They are given a
+default encoding that allows them to handle all valid string characters.
+Typically one should not modify a string port's character encoding
+away from its default.
+
  @deffn {Scheme Procedure} call-with-output-string proc
  @deffnx {C Function} scm_call_with_output_string (proc)
  Calls the one-argument procedure @var{proc} with a newly created output
@@ -1034,19 +1204,365 @@ The I/O port API of the @uref{http://www.r6rs.org/, Revised Report^6 on
  the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
  io ports)} module.  It provides features, such as binary I/O and Unicode
  string I/O, that complement or refine Guile's historical port API
-presented above (@pxref{Input and Output}).
+presented above (@pxref{Input and Output}). Note that R6RS ports are not
+disjoint from Guile's native ports, so Guile-specific procedures will
+work on ports created using the R6RS API, and vice versa.
+
+The text in this section is taken from the R6RS standard libraries
+document, with only minor adaptations for inclusion in this manual.  The
+Guile developers offer their thanks to the R6RS editors for having
+provided the report's text under permissive conditions making this
+possible.
  
  @c FIXME: Update description when implemented.
-@emph{Note}: The implementation of this R6RS API is currently far from
-complete, notably due to the lack of support for Unicode I/O and strings.
+@emph{Note}: The implementation of this R6RS API is not complete yet.
  
  @menu
+* R6RS File Names::             File names.
+* R6RS File Options::           Options for opening files.
+* R6RS Buffer Modes::           Influencing buffering behavior.
+* R6RS Transcoders::            Influencing port encoding.
  * R6RS End-of-File::            The end-of-file object.
  * R6RS Port Manipulation::      Manipulating R6RS ports.
+* R6RS Input Ports::            Input Ports.
  * R6RS Binary Input::           Binary input.
+* R6RS Textual Input::          Textual input.
+* R6RS Output Ports::           Output Ports.
  * R6RS Binary Output::          Binary output.
+* R6RS Textual Output::         Textual output.
  @end menu
  
+A subset of the @code{(rnrs io ports)} module, plus one non-standard
+procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
+provided by the @code{(ice-9 binary-ports)} module.  It contains binary
+input/output procedures and does not rely on R6RS support.
+
+@node R6RS File Names
+@subsubsection File Names
+
+Some of the procedures described in this chapter accept a file name as an
+argument. Valid values for such a file name include strings that name a file
+using the native notation of file system paths on an implementation's
+underlying operating system, and may include implementation-dependent
+values as well.
+
+A @var{filename} parameter name means that the
+corresponding argument must be a file name.
+
+@node R6RS File Options
+@subsubsection File Options
+@cindex file options
+
+When opening a file, the various procedures in this library accept a
+@code{file-options} object that encapsulates flags to specify how the
+file is to be opened. A @code{file-options} object is an enum-set
+(@pxref{rnrs enums}) over the symbols constituting valid file options.
+
+A @var{file-options} parameter name means that the corresponding
+argument must be a file-options object.
+
+@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
+
+Each @var{file-options-symbol} must be a symbol.
+
+The @code{file-options} syntax returns a file-options object that
+encapsulates the specified options.
+
+When supplied to an operation that opens a file for output, the
+file-options object returned by @code{(file-options)} specifies that the
+file is created if it does not exist and an exception with condition
+type @code{&i/o-file-already-exists} is raised if it does exist.  The
+following standard options can be included to modify the default
+behavior.
+
+@table @code
+@item no-create
+      If the file does not already exist, it is not created;
+      instead, an exception with condition type @code{&i/o-file-does-not-exist}
+      is raised.
+      If the file already exists, the exception with condition type
+      @code{&i/o-file-already-exists} is not raised
+      and the file is truncated to zero length.
+@item no-fail
+      If the file already exists, the exception with condition type
+      @code{&i/o-file-already-exists} is not raised,
+      even if @code{no-create} is not included,
+      and the file is truncated to zero length.
+@item no-truncate
+      If the file already exists and the exception with condition type
+      @code{&i/o-file-already-exists} has been inhibited by inclusion of
+      @code{no-create} or @code{no-fail}, the file is not truncated, but
+      the port's current position is still set to the beginning of the
+      file.
+@end table
+
+These options have no effect when a file is opened only for input.
+Symbols other than those listed above may be used as
+@var{file-options-symbol}s; they have implementation-specific meaning,
+if any.
+
+@quotation Note
+  Only the name of @var{file-options-symbol} is significant.
+@end quotation
+@end deffn
+
+@node R6RS Buffer Modes
+@subsubsection Buffer Modes
+
+Each port has an associated buffer mode.  For an output port, the
+buffer mode defines when an output operation flushes the buffer
+associated with the output port.  For an input port, the buffer mode
+defines how much data will be read to satisfy read operations.  The
+possible buffer modes are the symbols @code{none} for no buffering,
+@code{line} for flushing upon line endings and reading up to line
+endings, or other implementation-dependent behavior,
+and @code{block} for arbitrary buffering.  This section uses
+the parameter name @var{buffer-mode} for arguments that must be
+buffer-mode symbols.
+
+If two ports are connected to the same mutable source, both ports
+are unbuffered, and reading a byte or character from that shared
+source via one of the two ports would change the bytes or characters
+seen via the other port, a lookahead operation on one port will
+render the peeked byte or character inaccessible via the other port,
+while a subsequent read operation on the peeked port will see the
+peeked byte or character even though the port is otherwise unbuffered.
+
+In other words, the semantics of buffering is defined in terms of side
+effects on shared mutable sources, and a lookahead operation has the
+same side effect on the shared source as a read operation.
+
+@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
+
+@var{buffer-mode-symbol} must be a symbol whose name is one of
+@code{none}, @code{line}, and @code{block}. The result is the
+corresponding symbol, and specifies the associated buffer mode.
+
+@quotation Note
+  Only the name of @var{buffer-mode-symbol} is significant.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} buffer-mode?  obj
+Returns @code{#t} if the argument is a valid buffer-mode symbol, and
+returns @code{#f} otherwise.
+@end deffn
+
+@node R6RS Transcoders
+@subsubsection Transcoders
+@cindex codec
+@cindex end-of-line style
+@cindex transcoder
+@cindex binary port
+@cindex textual port
+
+Several different Unicode encoding schemes describe standard ways to
+encode characters and strings as byte sequences and to decode those
+sequences. Within this document, a @dfn{codec} is an immutable Scheme
+object that represents a Unicode or similar encoding scheme.
+
+An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
+describes how a textual port transcodes representations of line endings.
+
+A @dfn{transcoder} is an immutable Scheme object that combines a codec
+with an end-of-line style and a method for handling decoding errors.
+Each transcoder represents some specific bidirectional (but not
+necessarily lossless), possibly stateful translation between byte
+sequences and Unicode characters and strings.  Every transcoder can
+operate in the input direction (bytes to characters) or in the output
+direction (characters to bytes).  A @var{transcoder} parameter name
+means that the corresponding argument must be a transcoder.
+
+A @dfn{binary port} is a port that supports binary I/O, does not have an
+associated transcoder and does not support textual I/O.  A @dfn{textual
+port} is a port that supports textual I/O, and does not support binary
+I/O.  A textual port may or may not have an associated transcoder.
+
+@deffn {Scheme Procedure} latin-1-codec
+@deffnx {Scheme Procedure} utf-8-codec
+@deffnx {Scheme Procedure} utf-16-codec
+
+These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
+encoding schemes.
+
+A call to any of these procedures returns a value that is equal in the
+sense of @code{eqv?} to the result of any other call to the same
+procedure.
+@end deffn
+
+@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
+
+@var{eol-style-symbol} should be a symbol whose name is one of
+@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
+and @code{none}.
+
+The form evaluates to the corresponding symbol.  If the name of
+@var{eol-style-symbol} is not one of these symbols, the effect and
+result are implementation-dependent; in particular, the result may be an
+eol-style symbol acceptable as an @var{eol-style} argument to
+@code{make-transcoder}.  Otherwise, an exception is raised.
+
+All eol-style symbols except @code{none} describe a specific
+line-ending encoding:
+
+@table @code
+@item lf
+linefeed
+@item cr
+carriage return
+@item crlf
+carriage return, linefeed
+@item nel
+next line
+@item crnel
+carriage return, next line
+@item ls
+line separator
+@end table
+
+For a textual port with a transcoder, and whose transcoder has an
+eol-style symbol @code{none}, no conversion occurs.  For a textual input
+port, any eol-style symbol other than @code{none} means that all of the
+above line-ending encodings are recognized and are translated into a
+single linefeed.  For a textual output port, @code{none} and @code{lf}
+are equivalent.  Linefeed characters are encoded according to the
+specified eol-style symbol, and all other characters that participate in
+possible line endings are encoded as is.
+
+@quotation Note
+  Only the name of @var{eol-style-symbol} is significant.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} native-eol-style
+Returns the default end-of-line style of the underlying platform, e.g.,
+@code{lf} on Unix and @code{crlf} on Windows.
+@end deffn
+
+@deffn {Condition Type} &i/o-decoding
+@deffnx {Scheme Procedure} make-i/o-decoding-error  port
+@deffnx {Scheme Procedure} i/o-decoding-error?  obj
+
+This condition type could be defined by
+
+@lisp
+(define-condition-type &i/o-decoding &i/o-port
+  make-i/o-decoding-error i/o-decoding-error?)
+@end lisp
+
+An exception with this type is raised when one of the operations for
+textual input from a port encounters a sequence of bytes that cannot be
+translated into a character or string by the input direction of the
+port's transcoder.
+
+When such an exception is raised, the port's position is past the
+invalid encoding.
+@end deffn
+
+@deffn {Condition Type} &i/o-encoding
+@deffnx {Scheme Procedure} make-i/o-encoding-error  port char
+@deffnx {Scheme Procedure} i/o-encoding-error?  obj
+@deffnx {Scheme Procedure} i/o-encoding-error-char  condition
+
+This condition type could be defined by
+
+@lisp
+(define-condition-type &i/o-encoding &i/o-port
+  make-i/o-encoding-error i/o-encoding-error?
+  (char i/o-encoding-error-char))
+@end lisp
+
+An exception with this type is raised when one of the operations for
+textual output to a port encounters a character that cannot be
+translated into bytes by the output direction of the port's transcoder.
+@var{char} is the character that could not be encoded.
+@end deffn
+
+@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
+
+@var{error-handling-mode-symbol} should be a symbol whose name is one of
+@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
+the corresponding symbol.  If @var{error-handling-mode-symbol} is not
+one of these identifiers, effect and result are
+implementation-dependent: The result may be an error-handling-mode
+symbol acceptable as a @var{handling-mode} argument to
+@code{make-transcoder}.  If it is not acceptable as a
+@var{handling-mode} argument to @code{make-transcoder}, an exception is
+raised.
+
+@quotation Note
+  Only the name of @var{error-handling-mode-symbol} is significant.
+@end quotation
+
+The error-handling mode of a transcoder specifies the behavior
+of textual I/O operations in the presence of encoding or decoding
+errors.
+
+If a textual input operation encounters an invalid or incomplete
+character encoding, and the error-handling mode is @code{ignore}, an
+appropriate number of bytes of the invalid encoding are ignored and
+decoding continues with the following bytes.
+
+If the error-handling mode is @code{replace}, the replacement
+character U+FFFD is injected into the data stream, an appropriate
+number of bytes are ignored, and decoding
+continues with the following bytes.
+
+If the error-handling mode is @code{raise}, an exception with condition
+type @code{&i/o-decoding} is raised.
+
+If a textual output operation encounters a character it cannot encode,
+and the error-handling mode is @code{ignore}, the character is ignored
+and encoding continues with the next character.  If the error-handling
+mode is @code{replace}, a codec-specific replacement character is
+emitted by the transcoder, and encoding continues with the next
+character.  The replacement character is U+FFFD for transcoders whose
+codec is one of the Unicode encodings, but is the @code{?}  character
+for the Latin-1 encoding.  If the error-handling mode is @code{raise},
+an exception with condition type @code{&i/o-encoding} is raised.
+@end deffn
+
+@deffn {Scheme Procedure} make-transcoder  codec
+@deffnx {Scheme Procedure} make-transcoder codec eol-style
+@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
+
+@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
+symbol; and @var{handling-mode}, if present, an error-handling-mode
+symbol.
+
+@var{eol-style} may be omitted, in which case it defaults to the native
+end-of-line style of the underlying platform.  @var{handling-mode} may
+be omitted, in which case it defaults to @code{replace}.  The result is
+a transcoder with the behavior specified by its arguments.
+@end deffn
+
+@deffn {Scheme Procedure} native-transcoder
+Returns an implementation-dependent transcoder that represents a
+possibly locale-dependent ``native'' transcoding.
+@end deffn
+
+@deffn {Scheme Procedure} transcoder-codec  transcoder
+@deffnx {Scheme Procedure} transcoder-eol-style  transcoder
+@deffnx {Scheme Procedure} transcoder-error-handling-mode  transcoder
+
+These are accessors for transcoder objects; when applied to a
+transcoder returned by @code{make-transcoder}, they return the
+@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
+respectively.
+@end deffn
+
+@deffn {Scheme Procedure} bytevector->string  bytevector transcoder
+
+Returns the string that results from transcoding the
+@var{bytevector} according to the input direction of the transcoder.
+@end deffn
+
+@deffn {Scheme Procedure} string->bytevector  string transcoder
+
+Returns the bytevector that results from transcoding the
+@var{string} according to the output direction of the transcoder.
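+
+The following sketch, which assumes the @code{(rnrs io ports)} and
+@code{(rnrs bytevectors)} modules, illustrates one way to build a
+transcoder and round-trip a string through it.  The names @code{tc} and
+@code{bv} are only illustrative.
+
+@lisp
+(use-modules (rnrs io ports) (rnrs bytevectors))
+
+;; UTF-8, linefeed line endings, raise on encoding/decoding errors.
+(define tc
+  (make-transcoder (utf-8-codec)
+                   (eol-style lf)
+                   (error-handling-mode raise)))
+
+(transcoder-eol-style tc)             ; => lf
+(transcoder-error-handling-mode tc)   ; => raise
+
+;; U+03BB (Greek small letter lambda) occupies two bytes in UTF-8.
+(define bv (string->bytevector (string #\x3bb) tc))
+(bytevector->u8-list bv)              ; => (206 187)
+(bytevector->string bv tc)            ; => a string equal to the original
+@end lisp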
+@end deffn
+
  @node R6RS End-of-File
  @subsubsection The End-of-File Object
  
@@ -1079,6 +1595,65 @@ Return the end-of-file (EOF) object.
  
  The procedures listed below operate on any kind of R6RS I/O port.
  
+@deffn {Scheme Procedure} port? obj
+Returns @code{#t} if the argument is a port, and returns @code{#f}
+otherwise.
+@end deffn
+
+@deffn {Scheme Procedure} port-transcoder port
+Returns the transcoder associated with @var{port} if @var{port} is
+textual and has an associated transcoder, and returns @code{#f} if
+@var{port} is binary or does not have an associated transcoder.
+@end deffn
+
+@deffn {Scheme Procedure} binary-port? port
+Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
+binary data input/output.
+
+Note that internally Guile does not differentiate between binary and
+textual ports, unlike the R6RS.  Thus, this procedure returns true when
+@var{port} does not have an associated encoding---i.e., when
+@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
+port-encoding}).  This is the case for ports returned by R6RS procedures
+such as @code{open-bytevector-input-port} and
+@code{make-custom-binary-output-port}.
+
+However, Guile currently does not prevent use of textual I/O procedures
+such as @code{display} or @code{read-char} with binary ports.  Doing so
+``upgrades'' the port from binary to textual, under the ISO-8859-1
+encoding.  Likewise, Guile does not prevent use of
+@code{set-port-encoding!} on a binary port, which also turns it into a
+``textual'' port.
+@end deffn
+
+@deffn {Scheme Procedure} textual-port? port
+Always return @code{#t}, as all ports can be used for textual I/O in
+Guile.
+@end deffn
+
+@deffn {Scheme Procedure} transcoded-port binary-port transcoder
+The @code{transcoded-port} procedure
+returns a new textual port with the specified @var{transcoder}.
+Otherwise the new textual port's state is largely the same as
+that of @var{binary-port}.
+If @var{binary-port} is an input port, the new textual
+port will be an input port and
+will transcode the bytes that have not yet been read from
+@var{binary-port}.
+If @var{binary-port} is an output port, the new textual
+port will be an output port and
+will transcode output characters into bytes that are
+written to the byte sink represented by @var{binary-port}.
+
+As a side effect, however, @code{transcoded-port}
+closes @var{binary-port} in
+a special way that allows the new textual port to continue to
+use the byte source or sink represented by @var{binary-port},
+even though @var{binary-port} itself is closed and cannot
+be used by the input and output operations described in this
+chapter.
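+
+For instance, assuming @code{(rnrs io ports)} and
+@code{(rnrs bytevectors)} are available, a binary bytevector port could
+be wrapped as follows.  This is a sketch with illustrative names, not
+the only possible usage:
+
+@lisp
+(use-modules (rnrs io ports) (rnrs bytevectors))
+
+;; The bytes of "hi" followed by a linefeed.
+(define bin
+  (open-bytevector-input-port (u8-list->bytevector '(104 105 10))))
+
+;; After this call, read from txt rather than from bin.
+(define txt
+  (transcoded-port bin (make-transcoder (utf-8-codec) (eol-style lf))))
+
+(define first-char (get-char txt))    ; => #\h
+(define rest-line  (get-line txt))    ; => "i"
+(eof-object? (get-char txt))          ; => #t
+@end lisp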
+@end deffn
+
  @deffn {Scheme Procedure} port-position port
  If @var{port} supports it (see below), return the offset (an integer)
  indicating where the next octet will be read from/written to in
@@ -1112,6 +1687,67 @@ Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
  of @var{proc}.  Return the return values of @var{proc}.
  @end deffn
  
+@node R6RS Input Ports
+@subsubsection Input Ports
+
+@deffn {Scheme Procedure} input-port? obj
+Returns @code{#t} if the argument is an input port (or a combined input
+and output port), and returns @code{#f} otherwise.
+@end deffn
+
+@deffn {Scheme Procedure} port-eof? input-port
+Returns @code{#t}
+if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
+or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
+would return
+the end-of-file object, and @code{#f} otherwise.
+The operation may block indefinitely if no data is available
+but the port cannot be determined to be at end of file.
+@end deffn
+
+@deffn {Scheme Procedure} open-file-input-port filename
+@deffnx {Scheme Procedure} open-file-input-port filename file-options
+@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
+@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
+@var{maybe-transcoder} must be either a transcoder or @code{#f}.
+
+The @code{open-file-input-port} procedure returns an
+input port for the named file. The @var{file-options},
+@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
+
+The @var{file-options} argument, which may determine
+various aspects of the returned port (@pxref{R6RS File Options}),
+defaults to the value of @code{(file-options)}.
+
+The @var{buffer-mode} argument, if supplied,
+must be one of the symbols that name a buffer mode.
+The @var{buffer-mode} argument defaults to @code{block}.
+
+If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
+with the returned port.
+
+If @var{maybe-transcoder} is @code{#f} or absent,
+the port will be a binary port and will support the
+@code{port-position} and @code{set-port-position!}  operations.
+Otherwise the port will be a textual port, and whether it supports
+the @code{port-position} and @code{set-port-position!} operations
+is implementation-dependent (and possibly transcoder-dependent).
+@end deffn
+
+@deffn {Scheme Procedure} standard-input-port
+Returns a fresh binary input port connected to standard input.  Whether
+the port supports the @code{port-position} and @code{set-port-position!}
+operations is implementation-dependent.
+@end deffn
+
+@deffn {Scheme Procedure} current-input-port
+This returns a default textual port for input.  Normally, this default
+port is associated with standard input, but can be dynamically
+re-assigned using the @code{with-input-from-file} procedure from the
+@code{io simple (6)} library (@pxref{rnrs io simple}).  The port may or
+may not have an associated transcoder; if it does, the transcoder is
+implementation-dependent.
+@end deffn
  
  @node R6RS Binary Input
  @subsubsection Binary Input
@@ -1143,13 +1779,13 @@ indicating the number of bytes read, or @code{0} to indicate the
  end-of-file.
  
  Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
-that will be called when @var{port-position} is invoked on the custom
+that will be called when @code{port-position} is invoked on the custom
  binary port and should return an integer indicating the position within
  the underlying data stream; if @var{get-position} was not supplied, the
-returned port does not support @var{port-position}.
+returned port does not support @code{port-position}.
  
  Likewise, if @var{set-position!} is not @code{#f}, it should be a
-one-argument procedure.  When @var{set-port-position!} is invoked on the
+one-argument procedure.  When @code{set-port-position!} is invoked on the
  custom binary input port, @var{set-position!} is passed an integer
  indicating the position of the next byte is to read.
  
@@ -1216,9 +1852,10 @@ actually read or the end-of-file object.
  
  @deffn {Scheme Procedure} get-bytevector-some port
  @deffnx {C Function} scm_get_bytevector_some (port)
-Read from @var{port}, blocking as necessary, until data are available or
-and end-of-file is reached.  Return either a new bytevector containing
-the data read or the end-of-file object.
+Read from @var{port}, blocking as necessary, until bytes are available
+or an end-of-file is reached.  Return either the end-of-file object or a
+new bytevector containing some of the available bytes (at least one),
+and update the port position to point just past these bytes.
  @end deffn
  
  @deffn {Scheme Procedure} get-bytevector-all port
@@ -1228,6 +1865,185 @@ reached.  Return either a new bytevector containing the data read or the
  end-of-file object (if no data were available).
  @end deffn
  
+The @code{(ice-9 binary-ports)} module provides the following procedure
+as an extension to @code{(rnrs io ports)}:
+
+@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
+@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
+Place the contents of @var{bv} in @var{port}, optionally starting at
+index @var{start} and limiting to @var{count} octets, so that its bytes
+will be read from left-to-right as the next bytes from @var{port} during
+subsequent read operations.  If called multiple times, the unread bytes
+will be read again in last-in first-out order.
+@end deffn
+
+@node R6RS Textual Input
+@subsubsection Textual Input
+
+@deffn {Scheme Procedure} get-char textual-input-port
+Reads from @var{textual-input-port}, blocking as necessary, until a
+complete character is available from @var{textual-input-port},
+or until an end of file is reached.
+
+If a complete character is available before the next end of file,
+@code{get-char} returns that character and updates the input port to
+point past the character. If an end of file is reached before any
+character is read, @code{get-char} returns the end-of-file object.
+@end deffn
+
+@deffn {Scheme Procedure} lookahead-char textual-input-port
+The @code{lookahead-char} procedure is like @code{get-char}, but it does
+not update @var{textual-input-port} to point past the character.
+@end deffn
+
+@deffn {Scheme Procedure} get-string-n textual-input-port count
+
+@var{count} must be an exact, non-negative integer object, representing
+the number of characters to be read.
+
+The @code{get-string-n} procedure reads from @var{textual-input-port},
+blocking as necessary, until @var{count} characters are available, or
+until an end of file is reached.
+
+If @var{count} characters are available before end of file,
+@code{get-string-n} returns a string consisting of those @var{count}
+characters. If fewer characters are available before an end of file, but
+one or more characters can be read, @code{get-string-n} returns a string
+containing those characters. In either case, the input port is updated
+to point just past the characters read. If no characters can be read
+before an end of file, the end-of-file object is returned.
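+
+A short sketch of how @code{get-string-n} interacts with
+@code{lookahead-char}, using @code{open-string-input-port} from
+@code{(rnrs io ports)} as a convenient source of characters (the
+variable names are illustrative):
+
+@lisp
+(use-modules (rnrs io ports))
+
+(define in (open-string-input-port "hello"))
+
+(define head (get-string-n in 3))     ; => "hel"
+(define peek (lookahead-char in))     ; => #\l, port position unchanged
+(define tail (get-string-n in 10))    ; => "lo", fewer than requested
+(eof-object? (get-string-n in 1))     ; => #t
+@end lisp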
+@end deffn
+
+@deffn {Scheme Procedure} get-string-n! textual-input-port string start count
+
+@var{start} and @var{count} must be exact, non-negative integer objects,
+with @var{count} representing the number of characters to be read.
+@var{string} must be a string with at least
+@math{@var{start} + @var{count}} characters.
+
+The @code{get-string-n!} procedure reads from @var{textual-input-port}
+in the same manner as @code{get-string-n}.  If @var{count} characters
+are available before an end of file, they are written into @var{string}
+starting at index @var{start}, and @var{count} is returned. If fewer
+characters are available before an end of file, but one or more can be
+read, those characters are written into @var{string} starting at index
+@var{start} and the number of characters actually read is returned as an
+exact integer object. If no characters can be read before an end of
+file, the end-of-file object is returned.
+@end deffn
+
+@deffn {Scheme Procedure} get-string-all textual-input-port
+Reads from @var{textual-input-port} until an end of file, decoding
+characters in the same manner as @code{get-string-n} and
+@code{get-string-n!}.
+
+If characters are available before the end of file, a string containing
+all the characters decoded from that data is returned. If no character
+precedes the end of file, the end-of-file object is returned.
+@end deffn
+
+@deffn {Scheme Procedure} get-line textual-input-port
+Reads from @var{textual-input-port} up to and including the linefeed
+character or end of file, decoding characters in the same manner as
+@code{get-string-n} and @code{get-string-n!}.
+
+If a linefeed character is read, a string containing all of the text up
+to (but not including) the linefeed character is returned, and the port
+is updated to point just past the linefeed character. If an end of file
+is encountered before any linefeed character is read, but some
+characters have been read and decoded as characters, a string containing
+those characters is returned. If an end of file is encountered before
+any characters are read, the end-of-file object is returned.
+
+@quotation Note
+  The end-of-line style, if not @code{none}, will cause all line endings
+  to be read as linefeed characters.  @xref{R6RS Transcoders}.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} get-datum textual-input-port
+Reads an external representation from @var{textual-input-port} and returns the
+datum it represents.  The @code{get-datum} procedure returns the next
+datum that can be parsed from the given @var{textual-input-port}, updating
+@var{textual-input-port} to point exactly past the end of the external
+representation of the object.
+
+Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
+Syntax}) in the input is first skipped.  If an end of file occurs after
+the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
+is returned.
+
+If a character inconsistent with an external representation is
+encountered in the input, an exception with condition types
+@code{&lexical} and @code{&i/o-read} is raised.  Also, if the end of
+file is encountered after the beginning of an external representation,
+but the external representation is incomplete and therefore cannot be
+parsed, an exception with condition types @code{&lexical} and
+@code{&i/o-read} is raised.
+@end deffn
+
+@node R6RS Output Ports
+@subsubsection Output Ports
+
+@deffn {Scheme Procedure} output-port? obj
+Returns @code{#t} if the argument is an output port (or a
+combined input and output port), @code{#f} otherwise.
+@end deffn
+
+@deffn {Scheme Procedure} flush-output-port output-port
+Flushes any buffered output from the buffer of @var{output-port} to the
+underlying file, device, or object. The @code{flush-output-port}
+procedure returns unspecified values.
+@end deffn
+
+@deffn {Scheme Procedure} open-file-output-port filename
+@deffnx {Scheme Procedure} open-file-output-port filename file-options
+@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
+@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
+
+@var{maybe-transcoder} must be either a transcoder or @code{#f}.
+
+The @code{open-file-output-port} procedure returns an output port for the named file.
+
+The @var{file-options} argument, which may determine various aspects of
+the returned port (@pxref{R6RS File Options}), defaults to the value of
+@code{(file-options)}.
+
+The @var{buffer-mode} argument, if supplied,
+must be one of the symbols that name a buffer mode.
+The @var{buffer-mode} argument defaults to @code{block}.
+
+If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
+associated with the port.
+
+If @var{maybe-transcoder} is @code{#f} or absent,
+the port will be a binary port and will support the
+@code{port-position} and @code{set-port-position!}  operations.
+Otherwise the port will be a textual port, and whether it supports
+the @code{port-position} and @code{set-port-position!} operations
+is implementation-dependent (and possibly transcoder-dependent).
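+
+As a hedged illustration, the sketch below writes a small text file and
+reads it back with @code{open-file-input-port}.  The file name is
+hypothetical, and the explicit @code{lf} end-of-line style is just one
+possible choice:
+
+@lisp
+(use-modules (rnrs io ports))
+
+;; Hypothetical scratch file; any writable path would do.
+(define file "/tmp/r6rs-ports-demo.txt")
+(define tc (make-transcoder (utf-8-codec) (eol-style lf)))
+
+;; no-fail: do not raise an exception if the file already exists.
+(call-with-port
+    (open-file-output-port file (file-options no-fail)
+                           (buffer-mode block) tc)
+  (lambda (out)
+    (put-string out "first line\nsecond line\n")))
+
+(define lines
+  (call-with-port
+      (open-file-input-port file (file-options) (buffer-mode block) tc)
+    (lambda (in)
+      (let* ((a (get-line in))
+             (b (get-line in))
+             (c (get-line in)))
+        (list a b (eof-object? c))))))
+;; lines => ("first line" "second line" #t)
+@end lisp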
+@end deffn
+
+@deffn {Scheme Procedure} standard-output-port
+@deffnx {Scheme Procedure} standard-error-port
+Returns a fresh binary output port connected to the standard output or
+standard error respectively.  Whether the port supports the
+@code{port-position} and @code{set-port-position!} operations is
+implementation-dependent.
+@end deffn
+
+@deffn {Scheme Procedure} current-output-port
+@deffnx {Scheme Procedure} current-error-port
+These return default textual ports for regular output and error output.
+Normally, these default ports are associated with standard output and
+standard error, respectively.  The return value of
+@code{current-output-port} can be dynamically re-assigned using the
+@code{with-output-to-file} procedure from the @code{io simple (6)}
+library (@pxref{rnrs io simple}).  A port returned by one of these
+procedures may or may not have an associated transcoder; if it does, the
+transcoder is implementation-dependent.
+@end deffn
+
  @node R6RS Binary Output
  @subsubsection Binary Output
  
@@ -1286,6 +2102,51 @@ Write the contents of @var{bv} to @var{port}, optionally starting at
  index @var{start} and limiting to @var{count} octets.
  @end deffn
  
+@node R6RS Textual Output
+@subsubsection Textual Output
+
+@deffn {Scheme Procedure} put-char port char
+Writes @var{char} to the port. The @code{put-char} procedure returns
+an unspecified value.
+@end deffn
+
+@deffn {Scheme Procedure} put-string port string
+@deffnx {Scheme Procedure} put-string port string start
+@deffnx {Scheme Procedure} put-string port string start count
+
+@var{start} and @var{count} must be non-negative exact integer objects.
+@var{string} must have a length of at least @math{@var{start} +
+@var{count}}.  @var{start} defaults to 0.  @var{count} defaults to
+@math{@code{(string-length @var{string})} - @var{start}}. The
+@code{put-string} procedure writes the @var{count} characters of
+@var{string} starting at index @var{start} to the port.  The
+@code{put-string} procedure returns an unspecified value.
+@end deffn
+
+@deffn {Scheme Procedure} put-datum textual-output-port datum
+@var{datum} should be a datum value.  The @code{put-datum} procedure
+writes an external representation of @var{datum} to
+@var{textual-output-port}.  The specific external representation is
+implementation-dependent.  However, whenever possible, an implementation
+should produce a representation for which @code{get-datum}, when reading
+the representation, will return an object equal (in the sense of
+@code{equal?}) to @var{datum}.
+
+@quotation Note
+  Not all datums may allow producing an external representation for which
+  @code{get-datum} will produce an object that is equal to the
+  original.  Specifically, NaNs contained in @var{datum} may make
+  this impossible.
+@end quotation
+
+@quotation Note
+  The @code{put-datum} procedure merely writes the external
+  representation, but no trailing delimiter.  If @code{put-datum} is
+  used to write several subsequent external representations to an
+  output port, care should be taken to delimit them properly so they can
+  be read back in by subsequent calls to @code{get-datum}.
+@end quotation
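+
+The delimiting advice in the note above can be sketched as follows,
+assuming @code{(rnrs io ports)}; the choice of a single space as the
+delimiter and the variable names are illustrative only:
+
+@lisp
+(use-modules (rnrs io ports))
+
+;; Write two datums separated by a space so they can be told apart.
+(define text
+  (call-with-string-output-port
+    (lambda (out)
+      (put-datum out '(1 2 3))
+      (put-char out #\space)
+      (put-datum out "two"))))
+
+(define in (open-string-input-port text))
+(define d1 (get-datum in))      ; => (1 2 3)
+(define d2 (get-datum in))      ; => "two"
+(eof-object? (get-datum in))    ; => #t
+@end lisp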
+@end deffn
  
  @node I/O Extensions
  @subsection Using and Extending Ports in C
@@ -1409,7 +2270,7 @@ is set.
  
  @node Port Implementation
  @subsubsection Port Implementation
-@cindex Port implemenation
+@cindex Port implementation
  
  This section describes how to implement a new port type in C.
  
@@ -1544,6 +2405,83 @@ Set using
  
  @end table
  
+@node BOM Handling
+@subsection Handling of Unicode byte order marks
+@cindex BOM
+@cindex byte order mark
+
+This section documents the finer points of Guile's handling of Unicode
+byte order marks (BOMs).  A byte order mark (U+FEFF) is typically found
+at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
+determine the byte order.  Occasionally, a BOM is found at the start of
+a UTF-8 stream, but this is much less common and not generally
+recommended.
+
+Guile attempts to handle BOMs automatically, and in accordance with the
+recommendations of the Unicode Standard, when the port encoding is set
+to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}.  In brief, Guile
+automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
+and automatically consumes one from the start of a UTF-8, UTF-16, or
+UTF-32 stream.
+
+As specified in the Unicode Standard, a BOM is only handled specially at
+the start of a stream, and only if the port encoding is set to
+@code{UTF-8}, @code{UTF-16} or @code{UTF-32}.  If the port encoding is
+set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
+@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
+the special handling described in this section applies.
+
+@itemize @bullet
+@item
+To ensure that Guile will properly detect the byte order of a UTF-16 or
+UTF-32 stream, you must perform a textual read before any writes, seeks,
+or binary I/O.  Guile will not attempt to read a BOM unless a read is
+explicitly requested at the start of the stream.
+
+@item
+If a textual write is performed before the first read, then an arbitrary
+byte order will be chosen.  Currently, big endian is the default on all
+platforms, but that may change in the future.  If you wish to explicitly
+control the byte order of an output stream, set the port encoding to
+@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
+and explicitly write a BOM (@code{#\xFEFF}) if desired.
+
+@item
+If @code{set-port-encoding!} is called in the middle of a stream, Guile
+treats this as a new logical ``start of stream'' for purposes of BOM
+handling, and will forget about any BOMs that had previously been seen.
+Therefore, it may choose a different byte order than had been used
+previously.  This is intended to support multiple logical text streams
+embedded within a larger binary stream.
+
+@item
+Binary I/O operations are not guaranteed to update Guile's notion of
+whether the port is at the ``start of the stream'', nor are they
+guaranteed to produce or consume BOMs.
+
+@item
+For ports that support seeking (e.g. normal files), the input and output
+streams are considered linked: if the user reads first, then a BOM will
+be consumed (if appropriate), but later writes will @emph{not} produce a
+BOM.  Similarly, if the user writes first, then later reads will
+@emph{not} consume a BOM.
+
+@item
+For ports that do not support seeking (e.g. pipes, sockets, and
+terminals), the input and output streams are considered
+@emph{independent} for purposes of BOM handling: the first read will
+consume a BOM (if appropriate), and the first write will @emph{also}
+produce a BOM (if appropriate).  However, the input and output streams
+will always use the same byte order.
+
+@item
+Seeks to the beginning of a file will set the ``start of stream'' flags.
+Therefore, a subsequent textual read or write will consume or produce a
+BOM.  However, unlike @code{set-port-encoding!}, if a byte order had
+already been chosen for the port, it will remain in effect after a seek,
+and cannot be changed by the presence of a BOM.  Seeks anywhere other
+than the beginning of a file clear the ``start of stream'' flags.
+@end itemize
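+
+As an illustration of the advice above about controlling byte order
+explicitly, the following sketch writes a little-endian UTF-16 stream
+with a hand-written BOM to an in-memory port.  It assumes the
+@code{(rnrs io ports)} and @code{(rnrs bytevectors)} modules; the bytes
+shown in the comment follow from the UTF-16LE encoding itself rather
+than from any automatic BOM logic.
+
+@lisp
+(use-modules (rnrs io ports) (rnrs bytevectors))
+
+(define bytes
+  (call-with-values open-bytevector-output-port
+    (lambda (port get-bytes)
+      ;; An explicit byte order means BOMs are not handled specially.
+      (set-port-encoding! port "UTF-16LE")
+      (put-char port #\xFEFF)           ; write the BOM by hand
+      (put-string port "A")
+      (flush-output-port port)
+      (get-bytes))))
+
+(bytevector->u8-list bytes)             ; => (255 254 65 0)
+@end lisp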
  
  @c Local Variables:
  @c TeX-master: "guile.texi"