@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007, 2008, 2009, 2010, 2011
-@c Free Software Foundation, Inc.
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007,
+@c 2008, 2009, 2010, 2011, 2012, 2013 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@node Simple Data Types
#t
@end lisp
-In test condition contexts like @code{if} and @code{cond} (@pxref{if
-cond case}), where a group of subexpressions will be evaluated only if a
-@var{condition} expression evaluates to ``true'', ``true'' means any
-value at all except @code{#f}.
+In test condition contexts like @code{if} and @code{cond}
+(@pxref{Conditionals}), where a group of subexpressions will be
+evaluated only if a @var{condition} expression evaluates to ``true'',
+``true'' means any value at all except @code{#f}.
@lisp
(if #t "yes" "no")
@deftypefnx {C Function} {unsigned long long} scm_to_ulong_long (SCM x)
@deftypefnx {C Function} size_t scm_to_size_t (SCM x)
@deftypefnx {C Function} ssize_t scm_to_ssize_t (SCM x)
+@deftypefnx {C Function} scm_t_ptrdiff scm_to_ptrdiff_t (SCM x)
@deftypefnx {C Function} scm_t_int8 scm_to_int8 (SCM x)
@deftypefnx {C Function} scm_t_uint8 scm_to_uint8 (SCM x)
@deftypefnx {C Function} scm_t_int16 scm_to_int16 (SCM x)
@deftypefnx {C Function} SCM scm_from_ulong_long (unsigned long long x)
@deftypefnx {C Function} SCM scm_from_size_t (size_t x)
@deftypefnx {C Function} SCM scm_from_ssize_t (ssize_t x)
+@deftypefnx {C Function} SCM scm_from_ptrdiff_t (scm_t_ptrdiff x)
@deftypefnx {C Function} SCM scm_from_int8 (scm_t_int8 x)
@deftypefnx {C Function} SCM scm_from_uint8 (scm_t_uint8 x)
@deftypefnx {C Function} SCM scm_from_int16 (scm_t_int16 x)
@deffn {Scheme Procedure} complex? z
@deffnx {C Function} scm_complex_p (z)
-Return @code{#t} if @var{x} is a complex number, @code{#f}
+Return @code{#t} if @var{z} is a complex number, @code{#f}
otherwise. Note that the sets of real, rational and integer
values form subsets of the set of complex numbers, i.e.@: the
-predicate will also be fulfilled if @var{x} is a real,
+predicate will also be fulfilled if @var{z} is a real,
rational or integer number.
@end deffn
@end deffn
+@deftypefn {C Function} int scm_is_exact (SCM z)
+Return a @code{1} if the number @var{z} is exact, and @code{0}
+otherwise. This is equivalent to @code{scm_is_true (scm_exact_p (z))}.
+
+An alternate approch to testing the exactness of a number is to
+use @code{scm_is_signed_integer} or @code{scm_is_unsigned_integer}.
+@end deftypefn
+
@deffn {Scheme Procedure} inexact? z
@deffnx {C Function} scm_inexact_p (z)
Return @code{#t} if the number @var{z} is inexact, @code{#f}
else.
@end deffn
+@deftypefn {C Function} int scm_is_inexact (SCM z)
+Return a @code{1} if the number @var{z} is inexact, and @code{0}
+otherwise. This is equivalent to @code{scm_is_true (scm_inexact_p (z))}.
+@end deftypefn
+
@deffn {Scheme Procedure} inexact->exact z
@deffnx {C Function} scm_inexact_to_exact (z)
Return an exact number that is numerically closest to @var{z}, when
@end lisp
@end deffn
+@deftypefn {Scheme Procedure} {} exact-integer-sqrt @var{k}
+@deftypefnx {C Function} void scm_exact_integer_sqrt (SCM @var{k}, SCM *@var{s}, SCM *@var{r})
+Return two exact non-negative integers @var{s} and @var{r}
+such that @math{@var{k} = @var{s}^2 + @var{r}} and
+@math{@var{s}^2 <= @var{k} < (@var{s} + 1)^2}.
+An error is raised if @var{k} is not an exact non-negative integer.
+
+@lisp
+(exact-integer-sqrt 10) @result{} 3 and 1
+@end lisp
+@end deftypefn
+
@node Comparison
@subsubsection Comparison Predicates
@rnindex zero?
separately. Note that @var{r}, if non-zero, will have the same sign
as @var{y}.
-When @var{x} and @var{y} are exact integers, @code{floor-remainder} is
+When @var{x} and @var{y} are integers, @code{floor-remainder} is
equivalent to the R5RS integer-only operator @code{modulo}.
@lisp
separately. Note that @var{r}, if non-zero, will have the same sign
as @var{x}.
-When @var{x} and @var{y} are exact integers, these operators are
+When @var{x} and @var{y} are integers, these operators are
equivalent to the R5RS integer-only operators @code{quotient} and
@code{remainder}.
@end lisp
@end deffn
-@deffn {Scheme Procedure} ash n cnt
-@deffnx {C Function} scm_ash (n, cnt)
-Return @var{n} shifted left by @var{cnt} bits, or shifted right if
-@var{cnt} is negative. This is an ``arithmetic'' shift.
+@deffn {Scheme Procedure} ash n count
+@deffnx {C Function} scm_ash (n, count)
+Return @math{floor(n * 2^count)}.
+@var{n} and @var{count} must be exact integers.
-This is effectively a multiplication by @m{2^{cnt}, 2^@var{cnt}}, and
-when @var{cnt} is negative it's a division, rounded towards negative
-infinity. (Note that this is not the same rounding as @code{quotient}
-does.)
-
-With @var{n} viewed as an infinite precision twos complement,
-@code{ash} means a left shift introducing zero bits, or a right shift
-dropping bits.
+With @var{n} viewed as an infinite-precision twos-complement
+integer, @code{ash} means a left shift introducing zero bits
+when @var{count} is positive, or a right shift dropping bits
+when @var{count} is negative. This is an ``arithmetic'' shift.
@lisp
(number->string (ash #b1 3) 2) @result{} "1000"
@end lisp
@end deffn
+@deffn {Scheme Procedure} round-ash n count
+@deffnx {C Function} scm_round_ash (n, count)
+Return @math{round(n * 2^count)}.
+@var{n} and @var{count} must be exact integers.
+
+With @var{n} viewed as an infinite-precision twos-complement
+integer, @code{round-ash} means a left shift introducing zero
+bits when @var{count} is positive, or a right shift rounding
+to the nearest integer (with ties going to the nearest even
+integer) when @var{count} is negative. This is a rounded
+``arithmetic'' shift.
+
+@lisp
+(number->string (round-ash #b1 3) 2) @result{} \"1000\"
+(number->string (round-ash #b1010 -1) 2) @result{} \"101\"
+(number->string (round-ash #b1010 -2) 2) @result{} \"10\"
+(number->string (round-ash #b1011 -2) 2) @result{} \"11\"
+(number->string (round-ash #b1101 -2) 2) @result{} \"11\"
+(number->string (round-ash #b1110 -2) 2) @result{} \"100\"
+@end lisp
+@end deffn
+
@deffn {Scheme Procedure} logcount n
@deffnx {C Function} scm_logcount (n)
Return the number of bits in integer @var{n}. If @var{n} is
read back with the Scheme reader.
@end deffn
+@deffn {Scheme Procedure} random-state-from-platform
+@deffnx {C Function} scm_random_state_from_platform ()
+Construct a new random state seeded from a platform-specific source of
+entropy, appropriate for use in non-security-critical applications.
+Currently @file{/dev/urandom} is tried first, or else the seed is based
+on the time, date, process ID, an address from a freshly allocated heap
+cell, an address from the local stack frame, and a high-resolution timer
+if available.
+@end deffn
+
@defvar *random-state*
The global random state used by the above functions when the
@var{state} parameter is not given.
(0 1 1 2 2 2 1 2 6 7 10 0 5 3 12 5 5 12)
@end lisp
-To use the time of day as the random seed, you can use code like this:
+To seed the random state in a sensible way for non-security-critical
+applications, do this during initialization of your program:
@lisp
-(let ((time (gettimeofday)))
- (set! *random-state*
- (seed->random-state (+ (car time)
- (cdr time)))))
+(set! *random-state* (random-state-from-platform))
@end lisp
-@noindent
-And then (depending on the time of day, of course):
-
-@lisp
-(map random (cdr (iota 19)))
-@result{}
-(0 0 1 0 2 4 5 4 5 5 9 3 10 1 8 3 14 17)
-@end lisp
-
-For security applications, such as password generation, you should use
-more bits of seed. Otherwise an open source password generator could
-be attacked by guessing the seed@dots{} but that's a subject for
-another manual.
-
@node Characters
@subsection Characters
@rnindex char?
@deffn {Scheme Procedure} char? x
@deffnx {C Function} scm_char_p (x)
-Return @code{#t} iff @var{x} is a character, else @code{#f}.
+Return @code{#t} if @var{x} is a character, else @code{#f}.
@end deffn
Fundamentally, the character comparison operations below are
@rnindex char=?
@deffn {Scheme Procedure} char=? x y
-Return @code{#t} iff code point of @var{x} is equal to the code point
+Return @code{#t} if code point of @var{x} is equal to the code point
of @var{y}, else @code{#f}.
@end deffn
@rnindex char<?
@deffn {Scheme Procedure} char<? x y
-Return @code{#t} iff the code point of @var{x} is less than the code
+Return @code{#t} if the code point of @var{x} is less than the code
point of @var{y}, else @code{#f}.
@end deffn
@rnindex char<=?
@deffn {Scheme Procedure} char<=? x y
-Return @code{#t} iff the code point of @var{x} is less than or equal
+Return @code{#t} if the code point of @var{x} is less than or equal
to the code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char>?
@deffn {Scheme Procedure} char>? x y
-Return @code{#t} iff the code point of @var{x} is greater than the
+Return @code{#t} if the code point of @var{x} is greater than the
code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char>=?
@deffn {Scheme Procedure} char>=? x y
-Return @code{#t} iff the code point of @var{x} is greater than or
+Return @code{#t} if the code point of @var{x} is greater than or
equal to the code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char-ci=?
@deffn {Scheme Procedure} char-ci=? x y
-Return @code{#t} iff the case-folded code point of @var{x} is the same
+Return @code{#t} if the case-folded code point of @var{x} is the same
as the case-folded code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char-ci<?
@deffn {Scheme Procedure} char-ci<? x y
-Return @code{#t} iff the case-folded code point of @var{x} is less
+Return @code{#t} if the case-folded code point of @var{x} is less
than the case-folded code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char-ci<=?
@deffn {Scheme Procedure} char-ci<=? x y
-Return @code{#t} iff the case-folded code point of @var{x} is less
+Return @code{#t} if the case-folded code point of @var{x} is less
than or equal to the case-folded code point of @var{y}, else
@code{#f}.
@end deffn
@rnindex char-ci>?
@deffn {Scheme Procedure} char-ci>? x y
-Return @code{#t} iff the case-folded code point of @var{x} is greater
+Return @code{#t} if the case-folded code point of @var{x} is greater
than the case-folded code point of @var{y}, else @code{#f}.
@end deffn
@rnindex char-ci>=?
@deffn {Scheme Procedure} char-ci>=? x y
-Return @code{#t} iff the case-folded code point of @var{x} is greater
+Return @code{#t} if the case-folded code point of @var{x} is greater
than or equal to the case-folded code point of @var{y}, else
@code{#f}.
@end deffn
@rnindex char-alphabetic?
@deffn {Scheme Procedure} char-alphabetic? chr
@deffnx {C Function} scm_char_alphabetic_p (chr)
-Return @code{#t} iff @var{chr} is alphabetic, else @code{#f}.
+Return @code{#t} if @var{chr} is alphabetic, else @code{#f}.
@end deffn
@rnindex char-numeric?
@deffn {Scheme Procedure} char-numeric? chr
@deffnx {C Function} scm_char_numeric_p (chr)
-Return @code{#t} iff @var{chr} is numeric, else @code{#f}.
+Return @code{#t} if @var{chr} is numeric, else @code{#f}.
@end deffn
@rnindex char-whitespace?
@deffn {Scheme Procedure} char-whitespace? chr
@deffnx {C Function} scm_char_whitespace_p (chr)
-Return @code{#t} iff @var{chr} is whitespace, else @code{#f}.
+Return @code{#t} if @var{chr} is whitespace, else @code{#f}.
@end deffn
@rnindex char-upper-case?
@deffn {Scheme Procedure} char-upper-case? chr
@deffnx {C Function} scm_char_upper_case_p (chr)
-Return @code{#t} iff @var{chr} is uppercase, else @code{#f}.
+Return @code{#t} if @var{chr} is uppercase, else @code{#f}.
@end deffn
@rnindex char-lower-case?
@deffn {Scheme Procedure} char-lower-case? chr
@deffnx {C Function} scm_char_lower_case_p (chr)
-Return @code{#t} iff @var{chr} is lowercase, else @code{#f}.
+Return @code{#t} if @var{chr} is lowercase, else @code{#f}.
@end deffn
@deffn {Scheme Procedure} char-is-both? chr
@deffnx {C Function} scm_char_is_both_p (chr)
-Return @code{#t} iff @var{chr} is either uppercase or lowercase, else
+Return @code{#t} if @var{chr} is either uppercase or lowercase, else
@code{#f}.
@end deffn
otherwise.
@end deffn
-@deffn {Scheme Procedure} char-set= . char_sets
+@deffn {Scheme Procedure} char-set= char_set @dots{}
@deffnx {C Function} scm_char_set_eq (char_sets)
Return @code{#t} if all given character sets are equal.
@end deffn
-@deffn {Scheme Procedure} char-set<= . char_sets
+@deffn {Scheme Procedure} char-set<= char_set @dots{}
@deffnx {C Function} scm_char_set_leq (char_sets)
-Return @code{#t} if every character set @var{cs}i is a subset
-of character set @var{cs}i+1.
+Return @code{#t} if every character set @var{char_set}i is a subset
+of character set @var{char_set}i+1.
@end deffn
@deffn {Scheme Procedure} char-set-hash cs [bound]
@deffnx {C Function} scm_char_set_hash (cs, bound)
Compute a hash value for the character set @var{cs}. If
@var{bound} is given and non-zero, it restricts the
-returned value to the range 0 @dots{} @var{bound - 1}.
+returned value to the range 0 @dots{} @var{bound} - 1.
@end deffn
@c ===================================================================
characters in @var{cs}.
@end deffn
-@deffn {Scheme Procedure} char-set . rest
-@deffnx {C Function} scm_char_set (rest)
+@deffn {Scheme Procedure} char-set chr @dots{}
+@deffnx {C Function} scm_char_set (chrs)
Return a character set containing all given characters.
@end deffn
@deffn {Scheme Procedure} char-set-contains? cs ch
@deffnx {C Function} scm_char_set_contains_p (cs, ch)
-Return @code{#t} iff the character @var{ch} is contained in the
-character set @var{cs}.
+Return @code{#t} if the character @var{ch} is contained in the
+character set @var{cs}, or @code{#f} otherwise.
@end deffn
@deffn {Scheme Procedure} char-set-every pred cs
provide side-effecting variants, which modify their character set
argument(s).
-@deffn {Scheme Procedure} char-set-adjoin cs . rest
-@deffnx {C Function} scm_char_set_adjoin (cs, rest)
+@deffn {Scheme Procedure} char-set-adjoin cs chr @dots{}
+@deffnx {C Function} scm_char_set_adjoin (cs, chrs)
Add all character arguments to the first argument, which must
be a character set.
@end deffn
-@deffn {Scheme Procedure} char-set-delete cs . rest
-@deffnx {C Function} scm_char_set_delete (cs, rest)
+@deffn {Scheme Procedure} char-set-delete cs chr @dots{}
+@deffnx {C Function} scm_char_set_delete (cs, chrs)
Delete all character arguments from the first argument, which
must be a character set.
@end deffn
-@deffn {Scheme Procedure} char-set-adjoin! cs . rest
-@deffnx {C Function} scm_char_set_adjoin_x (cs, rest)
+@deffn {Scheme Procedure} char-set-adjoin! cs chr @dots{}
+@deffnx {C Function} scm_char_set_adjoin_x (cs, chrs)
Add all character arguments to the first argument, which must
be a character set.
@end deffn
-@deffn {Scheme Procedure} char-set-delete! cs . rest
-@deffnx {C Function} scm_char_set_delete_x (cs, rest)
+@deffn {Scheme Procedure} char-set-delete! cs chr @dots{}
+@deffnx {C Function} scm_char_set_delete_x (cs, chrs)
Delete all character arguments from the first argument, which
must be a character set.
@end deffn
@code{char-set-complement} by computing its intersection with the set
of designated code points, @code{char-set:designated}.
-@deffn {Scheme Procedure} char-set-union . rest
-@deffnx {C Function} scm_char_set_union (rest)
+@deffn {Scheme Procedure} char-set-union cs @dots{}
+@deffnx {C Function} scm_char_set_union (char_sets)
Return the union of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-intersection . rest
-@deffnx {C Function} scm_char_set_intersection (rest)
+@deffn {Scheme Procedure} char-set-intersection cs @dots{}
+@deffnx {C Function} scm_char_set_intersection (char_sets)
Return the intersection of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-difference cs1 . rest
-@deffnx {C Function} scm_char_set_difference (cs1, rest)
+@deffn {Scheme Procedure} char-set-difference cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_difference (cs1, char_sets)
Return the difference of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-xor . rest
-@deffnx {C Function} scm_char_set_xor (rest)
+@deffn {Scheme Procedure} char-set-xor cs @dots{}
+@deffnx {C Function} scm_char_set_xor (char_sets)
Return the exclusive-or of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-diff+intersection cs1 . rest
-@deffnx {C Function} scm_char_set_diff_plus_intersection (cs1, rest)
+@deffn {Scheme Procedure} char-set-diff+intersection cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_diff_plus_intersection (cs1, char_sets)
Return the difference and the intersection of all argument
character sets.
@end deffn
Return the complement of the character set @var{cs}.
@end deffn
-@deffn {Scheme Procedure} char-set-union! cs1 . rest
-@deffnx {C Function} scm_char_set_union_x (cs1, rest)
+@deffn {Scheme Procedure} char-set-union! cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_union_x (cs1, char_sets)
Return the union of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-intersection! cs1 . rest
-@deffnx {C Function} scm_char_set_intersection_x (cs1, rest)
+@deffn {Scheme Procedure} char-set-intersection! cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_intersection_x (cs1, char_sets)
Return the intersection of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-difference! cs1 . rest
-@deffnx {C Function} scm_char_set_difference_x (cs1, rest)
+@deffn {Scheme Procedure} char-set-difference! cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_difference_x (cs1, char_sets)
Return the difference of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-xor! cs1 . rest
-@deffnx {C Function} scm_char_set_xor_x (cs1, rest)
+@deffn {Scheme Procedure} char-set-xor! cs1 cs @dots{}
+@deffnx {C Function} scm_char_set_xor_x (cs1, char_sets)
Return the exclusive-or of all argument character sets.
@end deffn
-@deffn {Scheme Procedure} char-set-diff+intersection! cs1 cs2 . rest
-@deffnx {C Function} scm_char_set_diff_plus_intersection_x (cs1, cs2, rest)
+@deffn {Scheme Procedure} char-set-diff+intersection! cs1 cs2 cs @dots{}
+@deffnx {C Function} scm_char_set_diff_plus_intersection_x (cs1, cs2, char_sets)
Return the difference and the intersection of all argument
character sets.
@end deffn
These character sets are locale independent and are not recomputed
upon a @code{setlocale} call. They contain characters from the whole
range of Unicode code points. For instance, @code{char-set:letter}
-contains about 94,000 characters.
+contains about 100,000 characters.
@defvr {Scheme Variable} char-set:lower-case
@defvrx {C Variable} scm_char_set_lower_case
* Reversing and Appending Strings:: Appending strings to form a new string.
* Mapping Folding and Unfolding:: Iterating over strings.
* Miscellaneous String Operations:: Replicating, insertion, parsing, ...
+* Representing Strings as Bytes:: Encoding and decoding strings.
* Conversion to/from C::
* String Internals:: The storage strategy for strings.
@end menu
Return a newly allocated string of
length @var{k}. If @var{chr} is given, then all elements of
the string are initialized to @var{chr}, otherwise the contents
-of the @var{string} are unspecified.
+of the string are unspecified.
@end deffn
@deftypefn {C Function} SCM scm_c_make_string (size_t len, SCM chr)
@deffn {Scheme Procedure} string-join ls [delimiter [grammar]]
@deffnx {C Function} scm_string_join (ls, delimiter, grammar)
Append the string in the string list @var{ls}, using the string
-@var{delim} as a delimiter between the elements of @var{ls}.
+@var{delimiter} as a delimiter between the elements of @var{ls}.
@var{grammar} is a symbol which specifies how the delimiter is
placed between the strings, and defaults to the symbol
@code{infix}.
@item infix
Insert the separator between list elements. An empty string
will produce an empty list.
-@item string-infix
+@item strict-infix
Like @code{infix}, but will raise an error if given the empty
list.
@item suffix
Convert the string @var{str} into a list of characters.
@end deffn
-@deffn {Scheme Procedure} string-split str chr
-@deffnx {C Function} scm_string_split (str, chr)
+@deffn {Scheme Procedure} string-split str char_pred
+@deffnx {C Function} scm_string_split (str, char_pred)
Split the string @var{str} into a list of substrings delimited
-by appearances of the character @var{chr}. Note that an empty substring
-between separator characters will result in an empty string in the
-result list.
+by appearances of characters that
+
+@itemize @bullet
+@item
+equal @var{char_pred}, if it is a character,
+
+@item
+satisfy the predicate @var{char_pred}, if it is a procedure,
+
+@item
+are in the set @var{char_pred}, if it is a character set.
+@end itemize
+
+Note that an empty substring between separator characters will result in
+an empty string in the result list.
@lisp
(string-split "root:x:0:0:root:/root:/bin/bash" #\:)
@deffnx {C Function} scm_string_pad (s, len, chr, start, end)
@deffnx {C Function} scm_string_pad_right (s, len, chr, start, end)
Take characters @var{start} to @var{end} from the string @var{s} and
-either pad with @var{char} or truncate them to give @var{len}
+either pad with @var{chr} or truncate them to give @var{len}
characters.
@code{string-pad} pads or truncates on the left, so for example
@var{end} to @var{fill}.
@lisp
-(define y "abcdefg")
+(define y (string-copy "abcdefg"))
(substring-fill! y 1 3 #\r)
y
@result{} "arrdefg"
i18n)} module}, for locale-dependent string comparison.
@rnindex string=?
-@deffn {Scheme Procedure} string=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_equal_p (s1, s2, rest)
-Lexicographic equality predicate; return @code{#t} if the two
-strings are the same length and contain the same characters in
-the same positions, otherwise return @code{#f}.
+@deffn {Scheme Procedure} string=? s1 s2 s3 @dots{}
+Lexicographic equality predicate; return @code{#t} if all strings are
+the same length and contain the same characters in the same positions,
+otherwise return @code{#f}.
The procedure @code{string-ci=?} treats upper and lower case
letters as though they were the same character, but
@end deffn
@rnindex string<?
-@deffn {Scheme Procedure} string<? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_less_p (s1, s2, rest)
-Lexicographic ordering predicate; return @code{#t} if @var{s1}
-is lexicographically less than @var{s2}.
+@deffn {Scheme Procedure} string<? s1 s2 s3 @dots{}
+Lexicographic ordering predicate; return @code{#t} if, for every pair of
+consecutive string arguments @var{str_i} and @var{str_i+1}, @var{str_i} is
+lexicographically less than @var{str_i+1}.
@end deffn
@rnindex string<=?
-@deffn {Scheme Procedure} string<=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_leq_p (s1, s2, rest)
-Lexicographic ordering predicate; return @code{#t} if @var{s1}
-is lexicographically less than or equal to @var{s2}.
+@deffn {Scheme Procedure} string<=? s1 s2 s3 @dots{}
+Lexicographic ordering predicate; return @code{#t} if, for every pair of
+consecutive string arguments @var{str_i} and @var{str_i+1}, @var{str_i} is
+lexicographically less than or equal to @var{str_i+1}.
@end deffn
@rnindex string>?
-@deffn {Scheme Procedure} string>? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_gr_p (s1, s2, rest)
-Lexicographic ordering predicate; return @code{#t} if @var{s1}
-is lexicographically greater than @var{s2}.
+@deffn {Scheme Procedure} string>? s1 s2 s3 @dots{}
+Lexicographic ordering predicate; return @code{#t} if, for every pair of
+consecutive string arguments @var{str_i} and @var{str_i+1}, @var{str_i} is
+lexicographically greater than @var{str_i+1}.
@end deffn
@rnindex string>=?
-@deffn {Scheme Procedure} string>=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_geq_p (s1, s2, rest)
-Lexicographic ordering predicate; return @code{#t} if @var{s1}
-is lexicographically greater than or equal to @var{s2}.
+@deffn {Scheme Procedure} string>=? s1 s2 s3 @dots{}
+Lexicographic ordering predicate; return @code{#t} if, for every pair of
+consecutive string arguments @var{str_i} and @var{str_i+1}, @var{str_i} is
+lexicographically greater than or equal to @var{str_i+1}.
@end deffn
@rnindex string-ci=?
-@deffn {Scheme Procedure} string-ci=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_ci_equal_p (s1, s2, rest)
+@deffn {Scheme Procedure} string-ci=? s1 s2 s3 @dots{}
Case-insensitive string equality predicate; return @code{#t} if
-the two strings are the same length and their component
+all strings are the same length and their component
characters match (ignoring case) at each position; otherwise
return @code{#f}.
@end deffn
@rnindex string-ci<?
-@deffn {Scheme Procedure} string-ci<? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_ci_less_p (s1, s2, rest)
-Case insensitive lexicographic ordering predicate; return
-@code{#t} if @var{s1} is lexicographically less than @var{s2}
+@deffn {Scheme Procedure} string-ci<? s1 s2 s3 @dots{}
+Case insensitive lexicographic ordering predicate; return @code{#t} if,
+for every pair of consecutive string arguments @var{str_i} and
+@var{str_i+1}, @var{str_i} is lexicographically less than @var{str_i+1}
regardless of case.
@end deffn
@rnindex string<=?
-@deffn {Scheme Procedure} string-ci<=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_ci_leq_p (s1, s2, rest)
-Case insensitive lexicographic ordering predicate; return
-@code{#t} if @var{s1} is lexicographically less than or equal
-to @var{s2} regardless of case.
+@deffn {Scheme Procedure} string-ci<=? s1 s2 s3 @dots{}
+Case insensitive lexicographic ordering predicate; return @code{#t} if,
+for every pair of consecutive string arguments @var{str_i} and
+@var{str_i+1}, @var{str_i} is lexicographically less than or equal to
+@var{str_i+1} regardless of case.
@end deffn
@rnindex string-ci>?
-@deffn {Scheme Procedure} string-ci>? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_ci_gr_p (s1, s2, rest)
-Case insensitive lexicographic ordering predicate; return
-@code{#t} if @var{s1} is lexicographically greater than
-@var{s2} regardless of case.
+@deffn {Scheme Procedure} string-ci>? s1 s2 s3 @dots{}
+Case insensitive lexicographic ordering predicate; return @code{#t} if,
+for every pair of consecutive string arguments @var{str_i} and
+@var{str_i+1}, @var{str_i} is lexicographically greater than
+@var{str_i+1} regardless of case.
@end deffn
@rnindex string-ci>=?
-@deffn {Scheme Procedure} string-ci>=? [s1 [s2 . rest]]
-@deffnx {C Function} scm_i_string_ci_geq_p (s1, s2, rest)
-Case insensitive lexicographic ordering predicate; return
-@code{#t} if @var{s1} is lexicographically greater than or
-equal to @var{s2} regardless of case.
+@deffn {Scheme Procedure} string-ci>=? s1 s2 s3 @dots{}
+Case insensitive lexicographic ordering predicate; return @code{#t} if,
+for every pair of consecutive string arguments @var{str_i} and
+@var{str_i+1}, @var{str_i} is lexicographically greater than or equal to
+@var{str_i+1} regardless of case.
@end deffn
@deffn {Scheme Procedure} string-compare s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
@deffn {Scheme Procedure} string-hash s [bound [start [end]]]
@deffnx {C Function} scm_substring_hash (s, bound, start, end)
-Compute a hash value for @var{S}. The optional argument @var{bound} is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
+Compute a hash value for @var{s}. The optional argument @var{bound} is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
@end deffn
@deffn {Scheme Procedure} string-hash-ci s [bound [start [end]]]
@deffnx {C Function} scm_substring_hash_ci (s, bound, start, end)
-Compute a hash value for @var{S}. The optional argument @var{bound} is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
+Compute a hash value for @var{s}. The optional argument @var{bound} is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
@end deffn
Because the same visual appearance of an abstract Unicode character can
@end deffn
@rnindex string-append
-@deffn {Scheme Procedure} string-append . args
+@deffn {Scheme Procedure} string-append arg @dots{}
@deffnx {C Function} scm_string_append (args)
Return a newly allocated string whose characters form the
-concatenation of the given strings, @var{args}.
+concatenation of the given strings, @var{arg} @enddots{}.
@example
(let ((h "hello "))
@end example
@end deffn
-@deffn {Scheme Procedure} string-append/shared . rest
-@deffnx {C Function} scm_string_append_shared (rest)
+@deffn {Scheme Procedure} string-append/shared arg @dots{}
+@deffnx {C Function} scm_string_append_shared (args)
Like @code{string-append}, but the result may share memory
with the argument strings.
@end deffn
@deffn {Scheme Procedure} string-concatenate ls
@deffnx {C Function} scm_string_concatenate (ls)
-Append the elements of @var{ls} (which must be strings)
-together into a single string. Guaranteed to return a freshly
-allocated string.
+Append the elements (which must be strings) of @var{ls} together into a
+single string. Guaranteed to return a freshly allocated string.
@end deffn
@deffn {Scheme Procedure} string-concatenate-reverse ls [final_string [end]]
is a character set, it is tested for membership.
@end deffn
+@node Representing Strings as Bytes
+@subsubsection Representing Strings as Bytes
+
+Out in the cold world outside of Guile, not all strings are treated in
+the same way. Out there there are only bytes, and there are many ways
+of representing a strings (sequences of characters) as binary data
+(sequences of bytes).
+
+As a user, usually you don't have to think about this very much. When
+you type on your keyboard, your system encodes your keystrokes as bytes
+according to the locale that you have configured on your computer.
+Guile uses the locale to decode those bytes back into characters --
+hopefully the same characters that you typed in.
+
+All is not so clear when dealing with a system with multiple users, such
+as a web server. Your web server might get a request from one user for
+data encoded in the ISO-8859-1 character set, and then another request
+from a different user for UTF-8 data.
+
+@cindex iconv
+@cindex character encoding
+Guile provides an @dfn{iconv} module for converting between strings and
+sequences of bytes. @xref{Bytevectors}, for more on how Guile
+represents raw byte sequences. This module gets its name from the
+common @sc{unix} command of the same name.
+
+Note that often it is sufficient to just read and write strings from
+ports instead of using these functions. To do this, specify the port
+encoding using @code{set-port-encoding!}. @xref{Ports}, for more on
+ports and character encodings.
+
+Unlike the rest of the procedures in this section, you have to load the
+@code{iconv} module before having access to these procedures:
+
+@example
+(use-modules (ice-9 iconv))
+@end example
+
+@deffn string->bytevector string encoding [conversion-strategy]
+Encode @var{string} as a sequence of bytes.
+
+The string will be encoded in the character set specified by the
+@var{encoding} string. If the string has characters that cannot be
+represented in the encoding, by default this procedure raises an
+@code{encoding-error}. Pass a @var{conversion-strategy} argument to
+specify other behaviors.
+
+The return value is a bytevector. @xref{Bytevectors}, for more on
+bytevectors. @xref{Ports}, for more on character encodings and
+conversion strategies.
+@end deffn
+
+@deffn bytevector->string bytevector encoding [conversion-strategy]
+Decode @var{bytevector} into a string.
+
+The bytes will be decoded from the character set by the @var{encoding}
+string. If the bytes do not form a valid encoding, by default this
+procedure raises an @code{decoding-error}. As with
+@code{string->bytevector}, pass the optional @var{conversion-strategy}
+argument to modify this behavior. @xref{Ports}, for more on character
+encodings and conversion strategies.
+@end deffn
+
+@deffn call-with-output-encoded-string encoding proc [conversion-strategy]
+Like @code{call-with-output-string}, but instead of returning a string,
+returns a encoding of the string according to @var{encoding}, as a
+bytevector. This procedure can be more efficient than collecting a
+string and then converting it via @code{string->bytevector}.
+@end deffn
+
@node Conversion to/from C
@subsubsection Conversion to/from C
In C, a string is just a sequence of bytes, and the character encoding
describes the relation between these bytes and the actual characters
-that make up the string. For Scheme strings, character encoding is
-not an issue (most of the time), since in Scheme you never get to see
-the bytes, only the characters.
+that make up the string. For Scheme strings, character encoding is not
+an issue (most of the time), since in Scheme you usually treat strings
+as character sequences, not byte sequences.
Converting to C and converting from C each have their own challenges.
@deftypefn {C Function} SCM scm_from_locale_string (const char *str)
@deftypefnx {C Function} SCM scm_from_locale_stringn (const char *str, size_t len)
Creates a new Scheme string that has the same contents as @var{str} when
-interpreted in the locale character encoding of the
-@code{current-input-port}.
+interpreted in the character encoding of the current locale.
For @code{scm_from_locale_string}, @var{str} must be null-terminated.
null-terminated and the real length will be found with @code{strlen}.
If the C string is ill-formed, an error will be raised.
+
+Note that these functions should @emph{not} be used to convert C string
+constants, because there is no guarantee that the current locale will
+match that of the source code. To convert C string constants, use
+@code{scm_from_latin1_string}, @code{scm_from_utf8_string} or
+@code{scm_from_utf32_string}.
@end deftypefn
@deftypefn {C Function} SCM scm_take_locale_string (char *str)
@deftypefn {C Function} {char *} scm_to_locale_string (SCM str)
@deftypefnx {C Function} {char *} scm_to_locale_stringn (SCM str, size_t *lenp)
-Returns a C string with the same contents as @var{str} in the locale
-encoding of the @code{current-output-port}. The C string must be freed
-with @code{free} eventually, maybe by using @code{scm_dynwind_free},
+Returns a C string with the same contents as @var{str} in the character
+encoding of the current locale. The C string must be freed with
+@code{free} eventually, maybe by using @code{scm_dynwind_free},
@xref{Dynamic Wind}.
For @code{scm_to_locale_string}, the returned string is
@var{lenp} is @code{NULL}, @code{scm_to_locale_stringn} behaves like
@code{scm_to_locale_string}.
-If a character in @var{str} cannot be represented in the locale encoding
-of the current output port, the port conversion strategy of the current
-output port will determine the result, @xref{Ports}. If output port's
-conversion strategy is @code{error}, an error will be raised. If it is
-@code{substitute}, a replacement character, such as a question mark, will
-be inserted in its place. If it is @code{escape}, a hex escape will be
-inserted in its place.
+If a character in @var{str} cannot be represented in the character
+encoding of the current locale, the default port conversion strategy is
+used. @xref{Ports}, for more on conversion strategies.
+
+If the conversion strategy is @code{error}, an error will be raised. If
+it is @code{substitute}, a replacement character, such as a question
+mark, will be inserted in its place. If it is @code{escape}, a hex
+escape will be inserted in its place.
@end deftypefn
@deftypefn {C Function} size_t scm_to_locale_stringbuf (SCM str, char *buf, size_t max_len)
@deftypefn {C Function} char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding, scm_t_string_failed_conversion_handler handler)
This function returns a newly allocated C string from the Guile string
-@var{str}. The length of the string will be returned in @var{lenp}.
-The character encoding of the C string is passed as the ASCII,
+@var{str}. The length of the returned string in bytes will be returned in
+@var{lenp}. The character encoding of the C string is passed as the ASCII,
null-terminated C string @var{encoding}. The @var{handler} parameter
gives a strategy for dealing with characters that cannot be converted
into @var{encoding}.
-If @var{lenp} is NULL, this function will return a null-terminated C
+If @var{lenp} is @code{NULL}, this function will return a null-terminated C
string. It will throw an error if the string contains a null
character.
+
+The Scheme interface to this function is @code{string->bytevector}, from the
+@code{ice-9 iconv} module. @xref{Representing Strings as Bytes}.
@end deftypefn
@deftypefn {C Function} SCM scm_from_stringn (const char *str, size_t len, const char *encoding, scm_t_string_failed_conversion_handler handler)
This function returns a scheme string from the C string @var{str}. The
-length of the C string is input as @var{len}. The encoding of the C
+length in bytes of the C string is input as @var{len}. The encoding of the C
string is passed as the ASCII, null-terminated C string @code{encoding}.
The @var{handler} parameters suggests a strategy for dealing with
unconvertable characters.
+
+The Scheme interface to this function is @code{bytevector->string}.
+@xref{Representing Strings as Bytes}.
@end deftypefn
-ISO-8859-1 is the most common 8-bit character encoding. This encoding
-is also referred to as the Latin-1 encoding. The following two
-conversion functions are provided to convert between Latin-1 C strings
-and Guile strings.
+The following conversion functions are provided as a convenience for the
+most commonly used encodings.
+
+@deftypefn {C Function} SCM scm_from_latin1_string (const char *str)
+@deftypefnx {C Function} SCM scm_from_utf8_string (const char *str)
+@deftypefnx {C Function} SCM scm_from_utf32_string (const scm_t_wchar *str)
+Return a scheme string from the null-terminated C string @var{str},
+which is ISO-8859-1-, UTF-8-, or UTF-32-encoded. These functions should
+be used to convert hard-coded C string constants into Scheme strings.
+@end deftypefn
@deftypefn {C Function} SCM scm_from_latin1_stringn (const char *str, size_t len)
@deftypefnx {C Function} SCM scm_from_utf8_stringn (const char *str, size_t len)
@deftypefnx {C function} scm_t_wchar *scm_to_utf32_stringn (SCM str, size_t *lenp)
Return a newly allocated, ISO-8859-1-, UTF-8-, or UTF-32-encoded C string
from Scheme string @var{str}. An error is thrown when @var{str}
-string cannot be converted to the specified encoding. If @var{lenp} is
+cannot be converted to the specified encoding. If @var{lenp} is
@code{NULL}, the returned C string will be null terminated, and an error
will be thrown if the C string would otherwise contain null
-characters. If @var{lenp} is not NULL, the length of the string is
-returned in @var{lenp}, and the string is not null terminated.
+characters. If @var{lenp} is not @code{NULL}, the string is not null terminated,
+and the length of the returned string is returned in @var{lenp}. The length
+returned is the number of bytes for @code{scm_to_latin1_stringn} and
+@code{scm_to_utf8_stringn}; it is the number of elements (code points)
+for @code{scm_to_utf32_stringn}.
+@end deftypefn
+
+It is not often the case, but sometimes when you are dealing with the
+implementation details of a port, you need to encode and decode strings
+according to the encoding and conversion strategy of the port. There
+are some convenience functions for that purpose as well.
+
+@deftypefn {C Function} SCM scm_from_port_string (const char *str, SCM port)
+@deftypefnx {C Function} SCM scm_from_port_stringn (const char *str, size_t len, SCM port)
+@deftypefnx {C Function} char* scm_to_port_string (SCM str, SCM port)
+@deftypefnx {C Function} char* scm_to_port_stringn (SCM str, size_t *lenp, SCM port)
+Like @code{scm_from_stringn} and friends, except they take their
+encoding and conversion strategy from a given port object.
@end deftypefn
@node String Internals
* Bytevectors and Integer Lists:: Converting to/from an integer list.
* Bytevectors as Floats:: Interpreting bytes as real numbers.
* Bytevectors as Strings:: Interpreting bytes as Unicode strings.
-* Bytevectors as Generalized Vectors:: Guile extension to the bytevector API.
+* Bytevectors as Arrays:: Guile extension to the bytevector API.
* Bytevectors as Uniform Vectors:: Bytevectors and SRFI-4.
@end menu
@deffnx {C Function} scm_bytevector_copy_x (source, source_start, target, target_start, len)
Copy @var{len} bytes from @var{source} into @var{target}, starting
reading from @var{source-start} (a positive index within @var{source})
-and start writing at @var{target-start}.
+and start writing at @var{target-start}. It is permitted for the
+@var{source} and @var{target} regions to overlap.
@end deffn
@deffn {Scheme Procedure} bytevector-copy bv
Bytevector contents can also be interpreted as Unicode strings encoded
in one of the most commonly available encoding formats.
+@xref{Representing Strings as Bytes}, for a more generic interface.
@lisp
(utf8->string (u8-list->bytevector '(99 97 102 101)))
it defaults to big endian.
@end deffn
-@node Bytevectors as Generalized Vectors
-@subsubsection Accessing Bytevectors with the Generalized Vector API
+@node Bytevectors as Arrays
+@subsubsection Accessing Bytevectors with the Array API
As an extension to the R6RS, Guile allows bytevectors to be manipulated
-with the @dfn{generalized vector} procedures (@pxref{Generalized
-Vectors}). This also allows bytevectors to be accessed using the
-generic @dfn{array} procedures (@pxref{Array Procedures}). When using
-these APIs, bytes are accessed one at a time as 8-bit unsigned integers:
+with the @dfn{array} procedures (@pxref{Arrays}). When using these
+APIs, bytes are accessed one at a time as 8-bit unsigned integers:
@example
(define bv #vu8(0 1 2 3))
-(generalized-vector? bv)
+(array? bv)
@result{} #t
-(generalized-vector-ref bv 2)
+(array-rank bv)
+@result{} 1
+
+(array-ref bv 2)
@result{} 2
-(generalized-vector-set! bv 2 77)
+;; Note the different argument order on array-set!.
+(array-set! bv 77 2)
(array-ref bv 2)
@result{} 77
@end deffn
@rnindex symbol-append
-@deffn {Scheme Procedure} symbol-append . args
+@deffn {Scheme Procedure} symbol-append arg @dots{}
Return a newly allocated symbol whose characters form the
-concatenation of the given symbols, @var{args}.
+concatenation of the given symbols, @var{arg} @enddots{}.
@example
(let ((h 'hello))
and strings using @code{scm_symbol_to_string} and
@code{scm_string_to_symbol} and work with the strings.
-@deffn {C Function} scm_from_locale_symbol (const char *name)
-@deffnx {C Function} scm_from_locale_symboln (const char *name, size_t len)
+@deftypefn {C Function} scm_from_latin1_symbol (const char *name)
+@deftypefnx {C Function} scm_from_utf8_symbol (const char *name)
+Construct and return a Scheme symbol whose name is specified by the
+null-terminated C string @var{name}. These are appropriate when
+the C string is hard-coded in the source code.
+@end deftypefn
+
+@deftypefn {C Function} scm_from_locale_symbol (const char *name)
+@deftypefnx {C Function} scm_from_locale_symboln (const char *name, size_t len)
Construct and return a Scheme symbol whose name is specified by
@var{name}. For @code{scm_from_locale_symbol}, @var{name} must be null
terminated; for @code{scm_from_locale_symboln} the length of @var{name} is
specified explicitly by @var{len}.
-@end deffn
+
+Note that these functions should @emph{not} be used when @var{name} is a
+C string constant, because there is no guarantee that the current locale
+will match that of the source code. In such cases, use
+@code{scm_from_latin1_symbol} or @code{scm_from_utf8_symbol}.
+@end deftypefn
@deftypefn {C Function} SCM scm_take_locale_symbol (char *str)
@deftypefnx {C Function} SCM scm_take_locale_symboln (char *str, size_t len)
Equivalent to @code{scm_is_true (scm_keyword_p (@var{obj}))}.
@end deftypefn
-@deftypefn {C Function} SCM scm_from_locale_keyword (const char *str)
-@deftypefnx {C Function} SCM scm_from_locale_keywordn (const char *str, size_t len)
+@deftypefn {C Function} SCM scm_from_locale_keyword (const char *name)
+@deftypefnx {C Function} SCM scm_from_locale_keywordn (const char *name, size_t len)
Equivalent to @code{scm_symbol_to_keyword (scm_from_locale_symbol
-(@var{str}))} and @code{scm_symbol_to_keyword (scm_from_locale_symboln
-(@var{str}, @var{len}))}, respectively.
+(@var{name}))} and @code{scm_symbol_to_keyword (scm_from_locale_symboln
+(@var{name}, @var{len}))}, respectively.
+
+Note that these functions should @emph{not} be used when @var{name} is a
+C string constant, because there is no guarantee that the current locale
+will match that of the source code. In such cases, use
+@code{scm_from_latin1_keyword} or @code{scm_from_utf8_keyword}.
+@end deftypefn
+
+@deftypefn {C Function} SCM scm_from_latin1_keyword (const char *name)
+@deftypefnx {C Function} SCM scm_from_utf8_keyword (const char *name)
+Equivalent to @code{scm_symbol_to_keyword (scm_from_latin1_symbol
+(@var{name}))} and @code{scm_symbol_to_keyword (scm_from_utf8_symbol
+(@var{name}))}, respectively.
@end deftypefn
@node Other Types