@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
In addition to the classification into integers, rationals, reals and
complex numbers, Scheme also distinguishes between whether a number is
represented exactly or not. For example, the result of
-@m{2\sin(\pi/4),sin(pi/4)} is exactly @m{\sqrt{2},2^(1/2)} but Guile
-can neither represent @m{\pi/4,pi/4} nor @m{\sqrt{2},2^(1/2)} exactly.
+@m{2\sin(\pi/4),2*sin(pi/4)} is exactly @m{\sqrt{2},2^(1/2)}, but Guile
+can represent neither @m{\pi/4,pi/4} nor @m{\sqrt{2},2^(1/2)} exactly.
Instead, it stores an inexact approximation, using the C type
@code{double}.
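+
+As an informal illustration of the distinction (a sketch only, using
+the standard R5RS predicates @code{exact?} and @code{inexact?}; it is
+not part of the original text),
+
+@lisp
+(exact? 1/3) @result{} #t
+(inexact? 0.5) @result{} #t
+(inexact? (sqrt 2)) @result{} #t
+@end lisp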
@deffn {Scheme Procedure} integer? x
@deffnx {C Function} scm_integer_p (x)
-Return @code{#t} if @var{x} is an exactor inexact integer number, else
+Return @code{#t} if @var{x} is an exact or inexact integer number, else
@code{#f}.
@lisp
@xref{Initializing Integers,,, gmp, GNU MP Manual}, for details.
@end deftypefn
-@deftypefn {C Function} SCM scm_from_mpz_t (mpz_t val)
+@deftypefn {C Function} SCM scm_from_mpz (mpz_t val)
Return the @code{SCM} value that represents @var{val}.
@end deftypefn
The rational numbers are the set of all numbers that can be written as
fractions @var{p}/@var{q}, where @var{p} and @var{q} are integers.
All rational numbers are also real, but there are real numbers that
-are not rational, for example the square root of 2, and pi.
+are not rational, for example @m{\sqrt2, the square root of 2}, and
+@m{\pi,pi}.
Guile can represent both exact and inexact rational numbers, but it
can not represent irrational numbers. Exact rationals are represented
9.3-17.5i
@end lisp
+@cindex polar form
+@noindent
+Polar form can also be used, with an @samp{@@} between magnitude and
+angle,
+
+@lisp
+1@@3.141592 @result{} -1.0 (approx)
+-1@@1.57079 @result{} 0.0-1.0i (approx)
+@end lisp
+
Guile represents a complex number with a non-zero imaginary part as a
pair of inexact rationals, so the real and imaginary parts of a
complex number have the same properties of inexactness and limited
@end deffn
@c begin (texi-doc-string "guile" "gcd")
-@deffn {Scheme Procedure} gcd
+@deffn {Scheme Procedure} gcd x@dots{}
@deffnx {C Function} scm_gcd (x, y)
Return the greatest common divisor of all arguments.
If called without arguments, 0 is returned.
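+
+For instance (an illustrative sketch, not taken from the original
+docstring),
+
+@lisp
+(gcd 12 18) @result{} 6
+(gcd 12 18 8) @result{} 2
+(gcd) @result{} 0
+@end lisp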
@end deffn
@c begin (texi-doc-string "guile" "lcm")
-@deffn {Scheme Procedure} lcm
+@deffn {Scheme Procedure} lcm x@dots{}
@deffnx {C Function} scm_lcm (x, y)
Return the least common multiple of the arguments.
If called without arguments, 1 is returned.
@rnindex number->string
@rnindex string->number
+The following procedures read and write numbers according to their
+external representation as defined by R5RS (@pxref{Lexical structure,
+R5RS Lexical Structure,, r5rs, The Revised^5 Report on the Algorithmic
+Language Scheme}). @xref{The ice-9 i18n Module, the @code{(ice-9
+i18n)} module}, for locale-dependent number parsing.
+
@deffn {Scheme Procedure} number->string n [radix]
@deffnx {C Function} scm_number_to_string (n, radix)
Return a string holding the external representation of the
@code{string->number} returns @code{#f}.
@end deffn
+@deftypefn {C Function} SCM scm_c_locale_stringn_to_number (const char *string, size_t len, unsigned radix)
+As per @code{string->number} above, but taking a C string, as pointer
+and length. The string characters should be in the current locale
+encoding (@code{locale} in the name refers only to that; there is no
+locale-dependent parsing).
+@end deftypefn
+
@node Complex
@subsubsection Complex Number Operations
@deffn {Scheme Procedure} make-polar x y
@deffnx {C Function} scm_make_polar (x, y)
+@cindex polar form
Return the complex number @var{x} * e^(i * @var{y}).
@end deffn
@end deffn
@c begin (texi-doc-string "guile" "truncate")
-@deffn {Scheme Procedure} truncate
+@deffn {Scheme Procedure} truncate x
@deffnx {C Function} scm_truncate_number (x)
Round the inexact number @var{x} towards zero.
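+
+For example (an illustrative sketch of rounding towards zero),
+
+@lisp
+(truncate 2.7) @result{} 2.0
+(truncate -2.7) @result{} -2.0
+@end lisp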
@end deffn
@rnindex sqrt
@c begin (texi-doc-string "guile" "sqrt")
@deffn {Scheme Procedure} sqrt z
-Return the square root of @var{z}.
+Return the square root of @var{z}. Of the two possible roots
+(positive and negative), the one with a positive real part is
+returned, or if that's zero then the one with a positive imaginary
+part. Thus,
+
+@example
+(sqrt 9.0) @result{} 3.0
+(sqrt -9.0) @result{} 0.0+3.0i
+(sqrt 1.0+1.0i) @result{} 1.09868411346781+0.455089860562227i
+(sqrt -1.0-1.0i) @result{} 0.455089860562227-1.09868411346781i
+@end example
@end deffn
@rnindex expt
@deffn {Scheme Procedure} logtest j k
@deffnx {C Function} scm_logtest (j, k)
-@lisp
-(logtest j k) @equiv{} (not (zero? (logand j k)))
+Test whether @var{j} and @var{k} have any 1 bits in common. This is
+equivalent to @code{(not (zero? (logand j k)))}, but without actually
+calculating the @code{logand}, just testing for non-zero.
+@lisp
(logtest #b0100 #b1011) @result{} #f
(logtest #b0100 #b0111) @result{} #t
@end lisp
@deffn {Scheme Procedure} logbit? index j
@deffnx {C Function} scm_logbit_p (index, j)
-@lisp
-(logbit? index j) @equiv{} (logtest (integer-expt 2 index) j)
+Test whether bit number @var{index} in @var{j} is set. @var{index}
+starts from 0 for the least significant bit.
+@lisp
(logbit? 0 #b1101) @result{} #t
(logbit? 1 #b1101) @result{} #f
(logbit? 2 #b1101) @result{} #t
@deffn {Scheme Procedure} logcount n
@deffnx {C Function} scm_logcount (n)
-Return the number of bits in integer @var{n}. If integer is
+Return the number of bits in integer @var{n}. If @var{n} is
positive, the 1-bits in its binary representation are counted.
If negative, the 0-bits in its two's-complement binary
-representation are counted. If 0, 0 is returned.
+representation are counted. If zero, 0 is returned.
@lisp
(logcount #b10101010)
@deffn {Scheme Procedure} integer-expt n k
@deffnx {C Function} scm_integer_expt (n, k)
-Return @var{n} raised to the exact integer exponent
-@var{k}.
+Return @var{n} raised to the power @var{k}. @var{k} must be an exact
+integer, @var{n} can be any number.
+
+Negative @var{k} is supported, and results in @m{1/n^|k|, 1/n^abs(k)}
+in the usual way. @math{@var{n}^0} is 1, as usual, and that includes
+@math{0^0}, which is also 1.
@lisp
-(integer-expt 2 5)
- @result{} 32
-(integer-expt -3 3)
- @result{} -27
+(integer-expt 2 5) @result{} 32
+(integer-expt -3 3) @result{} -27
+(integer-expt 5 -3) @result{} 1/125
+(integer-expt 0 0) @result{} 1
@end lisp
@end deffn
squares is less than 1.0. Thinking of @var{vect} as coordinates in
space of dimension @var{n} @math{=} @code{(vector-length @var{vect})},
the coordinates are uniformly distributed within the unit
-@var{n}-sphere. The sum of the squares of the numbers is returned.
+@var{n}-sphere.
@c FIXME: What does this mean, particularly the n-sphere part?
@end deffn
In order to make the use of the character set data type and procedures
useful, several predefined character set variables exist.
+@cindex codeset
+@cindex charset
+@cindex locale
+
+Currently, the contents of these character sets are recomputed upon a
+successful @code{setlocale} call (@pxref{Locales}) in order to reflect
+the characters available in the current locale's codeset. For
+instance, @code{char-set:letter} contains 52 characters under an ASCII
+locale (e.g., the default @code{C} locale) and 117 characters under an
+ISO-8859-1 (``Latin-1'') locale.
+
@defvr {Scheme Variable} char-set:lower-case
@defvrx {C Variable} scm_char_set_lower_case
All lower-case characters.
@deffn {Scheme Procedure} string-any char_pred s [start [end]]
@deffnx {C Function} scm_string_any (char_pred, s, start, end)
-Check if the predicate @var{pred} is true for any character in
-the string @var{s}.
+Check if @var{char_pred} is true for any character in string @var{s}.
+
+@var{char_pred} can be a character to check for any equal to that, or
+a character set (@pxref{Character Sets}) to check for any in that set,
+or a predicate procedure to call.
-Calls to @var{pred} are made from left to right across @var{s}.
-When it returns true (ie.@: non-@code{#f}), that return value
-is the return from @code{string-any}.
+For a procedure, calls @code{(@var{char_pred} c)} are made
+successively on the characters from @var{start} to @var{end}. If
+@var{char_pred} returns true (ie.@: non-@code{#f}), @code{string-any}
+stops and that return value is the return from @code{string-any}. The
+call on the last character (ie.@: at @math{@var{end}-1}), if that
+point is reached, is a tail call.
-The SRFI-13 specification requires that the call to @var{pred}
-on the last character of @var{s} (assuming that point is
-reached) be a tail call, but currently in Guile this is not the
-case.
+If there are no characters in @var{s} (ie.@: @var{start} equals
+@var{end}) then the return is @code{#f}.
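+
+A few illustrative calls (a sketch based on the description above, not
+output captured from a particular Guile release). The last one uses a
+hypothetical predicate that returns the matching character itself, to
+show that the predicate's return value is passed through,
+
+@example
+(string-any #\b "abc") @result{} #t
+(string-any char-numeric? "abc") @result{} #f
+(string-any char-numeric? "") @result{} #f
+(string-any (lambda (c) (and (char-numeric? c) c)) "ab1c")
+ @result{} #\1
+@end example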
@end deffn
@deffn {Scheme Procedure} string-every char_pred s [start [end]]
@deffnx {C Function} scm_string_every (char_pred, s, start, end)
-Check if the predicate @var{pred} is true for every character
-in the string @var{s}.
+Check if @var{char_pred} is true for every character in string
+@var{s}.
+
+@var{char_pred} can be a character to check for every character equal
+to that, or a character set (@pxref{Character Sets}) to check for
+every character being in that set, or a predicate procedure to call.
-Calls to @var{pred} are made from left to right across @var{s}.
-If the predicate is true for every character then the return
-value from the last @var{pred} call is the return from
-@code{string-every}.
+For a procedure, calls @code{(@var{char_pred} c)} are made
+successively on the characters from @var{start} to @var{end}. If
+@var{char_pred} returns @code{#f}, @code{string-every} stops and
+returns @code{#f}. The call on the last character (ie.@: at
+@math{@var{end}-1}), if that point is reached, is a tail call and the
+return from that call is the return from @code{string-every}.
If there are no characters in @var{s} (ie.@: @var{start} equals
@var{end}) then the return is @code{#t}.
-
-The SRFI-13 specification requires that the call to @var{pred}
-on the last character of @var{s} (assuming that point is
-reached) be a tail call, but currently in Guile this is not the
-case.
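+
+A few illustrative calls (a sketch based on the description above, not
+output captured from a particular Guile release),
+
+@example
+(string-every char-alphabetic? "abc") @result{} #t
+(string-every char-alphabetic? "ab1") @result{} #f
+(string-every #\a "") @result{} #t
+@end example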
@end deffn
@node String Constructors
@c FIXME::martin: list->string belongs into `List/String Conversion'
+@deffn {Scheme Procedure} string char@dots{}
@rnindex string
+Return a newly allocated string made from the given character
+arguments.
+
+@example
+(string #\x #\y #\z) @result{} "xyz"
+(string) @result{} ""
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} list->string lst
+@deffnx {C Function} scm_string (lst)
@rnindex list->string
-@deffn {Scheme Procedure} string . chrs
-@deffnx {Scheme Procedure} list->string chrs
-@deffnx {C Function} scm_string (chrs)
-Return a newly allocated string composed of the arguments,
-@var{chrs}.
+Return a newly allocated string made from a list of characters.
+
+@example
+(list->string '(#\a #\b #\c)) @result{} "abc"
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} reverse-list->string lst
+@deffnx {C Function} scm_reverse_list_to_string (lst)
+Return a newly allocated string made from a list of characters, in
+reverse order.
+
+@example
+(reverse-list->string '(#\a #\B #\c)) @result{} "cBa"
+@end example
@end deffn
@rnindex make-string
@var{proc} is applied to the indices is not specified.
@end deffn
-@deffn {Scheme Procedure} reverse-list->string chrs
-@deffnx {C Function} scm_reverse_list_to_string (chrs)
-An efficient implementation of @code{(compose string->list
-reverse)}:
-
-@smalllisp
-(reverse-list->string '(#\a #\B #\c)) @result{} "cBa"
-@end smalllisp
-@end deffn
-
@deffn {Scheme Procedure} string-join ls [delimiter [grammar]]
@deffnx {C Function} scm_string_join (ls, delimiter, grammar)
Append the string in the string list @var{ls}, using the string
@end deffn
@deffn {Scheme Procedure} string-pad s len [chr [start [end]]]
+@deffnx {Scheme Procedure} string-pad-right s len [chr [start [end]]]
@deffnx {C Function} scm_string_pad (s, len, chr, start, end)
-Take that characters from @var{start} to @var{end} from the
-string @var{s} and return a new string, right-padded by the
-character @var{chr} to length @var{len}. If the resulting
-string is longer than @var{len}, it is truncated on the right.
-@end deffn
-
-@deffn {Scheme Procedure} string-pad-right s len [chr [start [end]]]
@deffnx {C Function} scm_string_pad_right (s, len, chr, start, end)
-Take that characters from @var{start} to @var{end} from the
-string @var{s} and return a new string, left-padded by the
-character @var{chr} to length @var{len}. If the resulting
-string is longer than @var{len}, it is truncated on the left.
-@end deffn
-
-@deffn {Scheme Procedure} string-trim s [char_pred [start [end]]]
-@deffnx {C Function} scm_string_trim (s, char_pred, start, end)
-Trim @var{s} by skipping over all characters on the left
-that satisfy the parameter @var{char_pred}:
+Take characters @var{start} to @var{end} from the string @var{s} and
+either pad with @var{chr} or truncate them to give @var{len}
+characters.
-@itemize @bullet
-@item
-if it is the character @var{ch}, characters equal to
-@var{ch} are trimmed,
+@code{string-pad} pads or truncates on the left, so for example
-@item
-if it is a procedure @var{pred} characters that
-satisfy @var{pred} are trimmed,
+@example
+(string-pad "x" 3) @result{} "  x"
+(string-pad "abcde" 3) @result{} "cde"
+@end example
-@item
-if it is a character set, characters in that set are trimmed.
-@end itemize
+@code{string-pad-right} pads or truncates on the right, so for example
-If called without a @var{char_pred} argument, all whitespace is
-trimmed.
+@example
+(string-pad-right "x" 3) @result{} "x  "
+(string-pad-right "abcde" 3) @result{} "abc"
+@end example
@end deffn
-@deffn {Scheme Procedure} string-trim-right s [char_pred [start [end]]]
+@deffn {Scheme Procedure} string-trim s [char_pred [start [end]]]
+@deffnx {Scheme Procedure} string-trim-right s [char_pred [start [end]]]
+@deffnx {Scheme Procedure} string-trim-both s [char_pred [start [end]]]
+@deffnx {C Function} scm_string_trim (s, char_pred, start, end)
@deffnx {C Function} scm_string_trim_right (s, char_pred, start, end)
-Trim @var{s} by skipping over all characters on the rightt
-that satisfy the parameter @var{char_pred}:
-
-@itemize @bullet
-@item
-if it is the character @var{ch}, characters equal to @var{ch}
-are trimmed,
-
-@item
-if it is a procedure @var{pred} characters that satisfy
-@var{pred} are trimmed,
-
-@item
-if it is a character sets, all characters in that set are
-trimmed.
-@end itemize
-
-If called without a @var{char_pred} argument, all whitespace is
-trimmed.
-@end deffn
-
-@deffn {Scheme Procedure} string-trim-both s [char_pred [start [end]]]
@deffnx {C Function} scm_string_trim_both (s, char_pred, start, end)
-Trim @var{s} by skipping over all characters on both sides of
-the string that satisfy the parameter @var{char_pred}:
-
-@itemize @bullet
-@item
-if it is the character @var{ch}, characters equal to @var{ch}
-are trimmed,
+Trim occurrences of @var{char_pred} from the ends of @var{s}.
-@item
-if it is a procedure @var{pred} characters that satisfy
-@var{pred} are trimmed,
+@code{string-trim} trims @var{char_pred} characters from the left
+(start) of the string, @code{string-trim-right} trims them from the
+right (end) of the string, @code{string-trim-both} trims from both
+ends.
-@item
-if it is a character set, the characters in the set are
-trimmed.
-@end itemize
+@var{char_pred} can be a character, a character set, or a predicate
+procedure to call on each character. If @var{char_pred} is not given
+the default is whitespace as per @code{char-set:whitespace}
+(@pxref{Standard Character Sets}).
-If called without a @var{char_pred} argument, all whitespace is
-trimmed.
+@example
+(string-trim " x ") @result{} "x "
+(string-trim-right "banana" #\a) @result{} "banan"
+(string-trim-both ".,xy:;" char-set:punctuation)
+ @result{} "xy"
+(string-trim-both "xyzzy" (lambda (c)
+ (or (eqv? c #\x)
+ (eqv? c #\y))))
+ @result{} "zz"
+@end example
@end deffn
@node String Modification
The first set is specified in R5RS and has names that end in @code{?}.
The second set is specified in SRFI-13 and the names have no ending
@code{?}. The predicates ending in @code{-ci} ignore the character case
-when comparing strings.
+when comparing strings. @xref{The ice-9 i18n Module, the @code{(ice-9
+i18n)} module}, for locale-dependent string comparison.
@rnindex string=?
@deffn {Scheme Procedure} string=? s1 s2
@deffn {Scheme Procedure} string-for-each-index proc s [start [end]]
@deffnx {C Function} scm_string_for_each_index (proc, s, start, end)
-@var{proc} is mapped over @var{s} in left-to-right order. The
-return value is not specified.
+Call @code{(@var{proc} i)} for each index i in @var{s}, from left to
+right.
+
+For example, to change characters to alternately upper and lower case,
+
+@example
+(define str (string-copy "studly"))
+(string-for-each-index (lambda (i)
+ (string-set! str i
+ ((if (even? i) char-upcase char-downcase)
+ (string-ref str i))))
+ str)
+str @result{} "StUdLy"
+@end example
@end deffn
@deffn {Scheme Procedure} string-fold kons knil s [start [end]]
@deffn {Scheme Procedure} string-filter s char_pred [start [end]]
@deffnx {C Function} scm_string_filter (s, char_pred, start, end)
-Filter the string @var{s}, retaining only those characters that
-satisfy the @var{char_pred} argument. If the argument is a
-procedure, it is applied to each character as a predicate, if
-it is a character, it is tested for equality and if it is a
-character set, it is tested for membership.
+Filter the string @var{s}, retaining only those characters which
+satisfy @var{char_pred}.
+
+If @var{char_pred} is a procedure, it is applied to each character as
+a predicate; if it is a character, each character is tested for
+equality with it; and if it is a character set, each character is
+tested for membership.
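+
+For example (an illustrative sketch; it assumes the argument order
+given in the signature above, with the string first, so check that
+against the Guile release in use),
+
+@example
+(string-filter "a1b2c3" char-numeric?) @result{} "123"
+(string-filter "banana" #\a) @result{} "aaa"
+@end example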
@end deffn
@deffn {Scheme Procedure} string-delete s char_pred [start [end]]
@deffnx {C Function} scm_string_delete (s, char_pred, start, end)
-Filter the string @var{s}, retaining only those characters that
-do not satisfy the @var{char_pred} argument. If the argument
-is a procedure, it is applied to each character as a predicate,
-if it is a character, it is tested for equality and if it is a
-character set, it is tested for membership.
+Delete characters satisfying @var{char_pred} from @var{s}.
+
+If @var{char_pred} is a procedure, it is applied to each character as
+a predicate; if it is a character, each character is tested for
+equality with it; and if it is a character set, each character is
+tested for membership.
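+
+For example (an illustrative sketch; it assumes the argument order
+given in the signature above, with the string first, so check that
+against the Guile release in use),
+
+@example
+(string-delete "a1b2c3" char-numeric?) @result{} "abc"
+(string-delete "banana" #\a) @result{} "bnn"
+@end example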
@end deffn
@node Conversion to/from C
Converting a Scheme string to a C string will often allocate fresh
memory to hold the result. You must take care that this memory is
properly freed eventually. In many cases, this can be achieved by
-using @code{scm_frame_free} inside an appropriate frame,
-@xref{Frames}.
+using @code{scm_dynwind_free} inside an appropriate dynwind context
+(@pxref{Dynamic Wind}).
@deftypefn {C Function} SCM scm_from_locale_string (const char *str)
@deftypefnx {C Function} SCM scm_from_locale_stringn (const char *str, size_t len)
@deftypefnx {C Function} {char *} scm_to_locale_stringn (SCM str, size_t *lenp)
Returns a C string in the current locale encoding with the same
contents as @var{str}. The C string must be freed with @code{free}
-eventually, maybe by using @code{scm_frame_free}, @xref{Frames}.
+eventually, maybe by using @code{scm_dynwind_free} (@pxref{Dynamic
+Wind}).
For @code{scm_to_locale_string}, the returned string is
null-terminated and an error is signalled when @var{str} contains
implemented by SCSH, the Scheme Shell. It is intended to be
upwardly compatible with SCSH regular expressions.
+Zero bytes (@code{#\nul}) cannot be used in regex patterns or input
+strings, since the underlying C functions treat that as the end of
+string. If there's a zero byte an error is thrown.
+
+Patterns and input strings are treated as being in the locale
+character set if @code{setlocale} has been called (@pxref{Locales}),
+and in a multibyte locale this includes treating multi-byte sequences
+as a single character. (Guile strings are currently merely bytes,
+though this may change in the future; see @ref{Conversion to/from C}.)
+
@deffn {Scheme Procedure} string-match pattern str [start]
Compile the string @var{pattern} into a regular expression and compare
it with @var{str}. The optional numeric argument @var{start} specifies
Return a match structure describing the results of the match,
or @code{#f} if no match could be found.
-The @var{flags} arguments change the matching behavior.
-The following flags may be supplied:
+The @var{flags} argument changes the matching behavior. The following
+flag values may be supplied; use @code{logior} (@pxref{Bitwise
+Operations}) to combine them.
@defvar regexp/notbol
-Operator @samp{^} always fails (unless @code{regexp/newline}
-is used). Use this when the beginning of the string should
-not be considered the beginning of a line.
+Consider that the @var{start} offset into @var{str} is not the
+beginning of a line and should not match operator @samp{^}.
+
+If @var{rx} was created with the @code{regexp/newline} option above,
+@samp{^} will still match after a newline in @var{str}.
@end defvar
@defvar regexp/noteol
-Operator @samp{$} always fails (unless @code{regexp/newline}
-is used). Use this when the end of the string should not be
-considered the end of a line.
+Consider that the end of @var{str} is not the end of a line and should
+not match operator @samp{$}.
+
+If @var{rx} was created with the @code{regexp/newline} option above,
+@samp{$} will still match before a newline in @var{str}.
@end defvar
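+
+As a sketch of how these flags might be passed (the pattern and the
+variable name @code{rx} here are made up for illustration),
+
+@example
+(define rx (make-regexp "^abc$"))
+(if (regexp-exec rx "abc") #t #f) @result{} #t
+(regexp-exec rx "abc" 0 regexp/notbol) @result{} #f
+(regexp-exec rx "abc" 0 (logior regexp/notbol regexp/noteol))
+ @result{} #f
+@end example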
@end deffn
or @code{#f} otherwise.
@end deffn
-Regular expressions are commonly used to find patterns in one string and
-replace them with the contents of another string.
+@sp 1
+@deffn {Scheme Procedure} list-matches regexp str [flags]
+Return a list of match structures which are the non-overlapping
+matches of @var{regexp} in @var{str}. @var{regexp} can be either a
+pattern string or a compiled regexp. The @var{flags} argument is as
+per @code{regexp-exec} above.
+
+@example
+(map match:substring (list-matches "[a-z]+" "abc 42 def 78"))
+@result{} ("abc" "def")
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} fold-matches regexp str init proc [flags]
+Apply @var{proc} to the non-overlapping matches of @var{regexp} in
+@var{str}, to build a result. @var{regexp} can be either a pattern
+string or a compiled regexp. The @var{flags} argument is as per
+@code{regexp-exec} above.
+
+@var{proc} is called as @code{(@var{proc} match prev)} where
+@var{match} is a match structure and @var{prev} is the previous return
+from @var{proc}. For the first call @var{prev} is the given
+@var{init} parameter. @code{fold-matches} returns the final value
+from @var{proc}.
+
+For example to count matches,
+
+@example
+(fold-matches "[a-z][0-9]" "abc x1 def y2" 0
+ (lambda (match count)
+ (1+ count)))
+@result{} 2
+@end example
+@end deffn
+
+@sp 1
+Regular expressions are commonly used to find patterns in one string
+and replace them with the contents of another string. The following
+functions are convenient ways to do this.
@c begin (scm-doc-string "regex.scm" "regexp-substitute")
@deffn {Scheme Procedure} regexp-substitute port match [item@dots{}]
-Write to the output port @var{port} selected contents of the match
-structure @var{match}. Each @var{item} specifies what should be
-written, and may be one of the following arguments:
+Write to @var{port} selected parts of the match structure @var{match}.
+Or if @var{port} is @code{#f} then form a string from those parts and
+return that.
+
+Each @var{item} specifies a part to be written, and may be one of the
+following,
@itemize @bullet
@item
A string. String arguments are written out verbatim.
@item
-An integer. The submatch with that number is written.
+An integer. The submatch with that number is written
+(@code{match:substring}). Zero is the entire match.
@item
The symbol @samp{pre}. The portion of the matched string preceding
-the regexp match is written.
+the regexp match is written (@code{match:prefix}).
@item
The symbol @samp{post}. The portion of the matched string following
-the regexp match is written.
+the regexp match is written (@code{match:suffix}).
@end itemize
-The @var{port} argument may be @code{#f}, in which case nothing is
-written; instead, @code{regexp-substitute} constructs a string from the
-specified @var{item}s and returns that.
-@end deffn
+For example, changing a match and retaining the text before and after,
-The following example takes a regular expression that matches a standard
-@sc{yyyymmdd}-format date such as @code{"20020828"}. The
-@code{regexp-substitute} call returns a string computed from the
-information in the match structure, consisting of the fields and text
-from the original string reordered and reformatted.
+@example
+(regexp-substitute #f (string-match "[0-9]+" "number 25 is good")
+ 'pre "37" 'post)
+@result{} "number 37 is good"
+@end example
+
+Or matching a @sc{yyyymmdd} format date such as @samp{20020828} and
+re-ordering and hyphenating the fields.
@lisp
(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
-(define sm (string-match date-regex s))
-(regexp-substitute #f sm 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+(regexp-substitute #f (string-match date-regex s)
+ 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
@result{} "Date 04-29-2002 12am. (20020429)"
@end lisp
+@end deffn
+
@c begin (scm-doc-string "regex.scm" "regexp-substitute")
@deffn {Scheme Procedure} regexp-substitute/global port regexp target [item@dots{}]
-Similar to @code{regexp-substitute}, but can be used to perform global
-substitutions on @var{str}. Instead of taking a match structure as an
-argument, @code{regexp-substitute/global} takes two string arguments: a
-@var{regexp} string describing a regular expression, and a @var{target}
-string which should be matched against this regular expression.
+@cindex search and replace
+Write to @var{port} selected parts of matches of @var{regexp} in
+@var{target}. If @var{port} is @code{#f} then form a string from
+those parts and return that. @var{regexp} can be a string or a
+compiled regex.
-Each @var{item} behaves as in @code{regexp-substitute}, with the
-following exceptions:
+This is similar to @code{regexp-substitute}, but allows global
+substitutions on @var{target}. Each @var{item} behaves as per
+@code{regexp-substitute}, with the following differences,
@itemize @bullet
@item
-A function may be supplied. When this function is called, it will be
-passed one argument: a match structure for a given regular expression
-match. It should return a string to be written out to @var{port}.
+A function. Called as @code{(@var{item} match)} with the match
+structure for the @var{regexp} match, it should return a string to be
+written to @var{port}.
@item
-The @samp{post} symbol causes @code{regexp-substitute/global} to recurse
-on the unmatched portion of @var{str}. This @emph{must} be supplied in
-order to perform global search-and-replace on @var{str}; if it is not
-present among the @var{item}s, then @code{regexp-substitute/global} will
-return after processing a single match.
+The symbol @samp{post}. This doesn't output anything, but instead
+causes @code{regexp-substitute/global} to recurse on the unmatched
+portion of @var{target}.
+
+This @emph{must} be supplied to perform a global search and replace on
+@var{target}; without it @code{regexp-substitute/global} returns after
+a single match and output.
@end itemize
-@end deffn
-The example above for @code{regexp-substitute} could be rewritten as
-follows to remove the @code{string-match} stage:
+For example, to collapse runs of tabs and spaces to a single hyphen
+each,
+
+@example
+(regexp-substitute/global #f "[ \t]+" "this is the text"
+ 'pre "-" 'post)
+@result{} "this-is-the-text"
+@end example
+
+Or using a function to reverse the letters in each word,
+
+@example
+(regexp-substitute/global #f "[a-z]+" "to do and not-do"
+ 'pre (lambda (m) (string-reverse (match:substring m))) 'post)
+@result{} "ot od dna ton-od"
+@end example
+
+Without the @code{post} symbol, just one regexp match is made. For
+example the following is the date example from
+@code{regexp-substitute} above, without the need for the separate
+@code{string-match} call.
@lisp
(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
(define s "Date 20020429 12am.")
(regexp-substitute/global #f date-regex s
- 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+ 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
+
@result{} "Date 04-29-2002 12am. (20020429)"
@end lisp
+@end deffn
@node Match Structures
specified explicitly by @var{len}.
@end deffn
+@deftypefn {C Function} SCM scm_take_locale_symbol (char *str)
+@deftypefnx {C Function} SCM scm_take_locale_symboln (char *str, size_t len)
+Like @code{scm_from_locale_symbol} and @code{scm_from_locale_symboln},
+respectively, but also frees @var{str} with @code{free} eventually.
+Thus, you can use these functions when you would free @var{str} anyway
+immediately after creating the Scheme symbol. In certain cases, Guile
+can then use @var{str} directly as its internal representation.
+@end deftypefn
+
+
Finally, some applications, especially those that generate new Scheme
code dynamically, need to generate symbols for use in the generated
code. The @code{gensym} primitive meets this need:
@end deffn
Support for these extra slots may be removed in a future release, and it
-is probably better to avoid using them. (In release 1.6, Guile itself
-uses the property list slot sparingly, and the function slot not at
-all.) For a more modern and Schemely approach to properties, see
-@ref{Object Properties}.
+is probably better to avoid using them. For a more modern and Schemely
+approach to properties, see @ref{Object Properties}.
@node Symbol Read Syntax