1@c -*-texinfo-*-
2@c This is part of the GNU Guile Reference Manual.
3@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
4@c Free Software Foundation, Inc.
5@c See the file guile.texi for copying conditions.
6
7@page
8@node Simple Data Types
9@section Simple Generic Data Types
10
11This chapter describes those of Guile's simple data types which are
12primarily used for their role as items of generic data. By
13@dfn{simple} we mean data types that are not primarily used as
14containers to hold other data --- i.e.@: pairs, lists, vectors and so on.
15For the documentation of such @dfn{compound} data types, see
16@ref{Compound Data Types}.
17
18@c One of the great strengths of Scheme is that there is no straightforward
19@c distinction between ``data'' and ``functionality''. For example,
20@c Guile's support for dynamic linking could be described:
21
22@c @itemize @bullet
23@c @item
24@c either in a ``data-centric'' way, as the behaviour and properties of the
25@c ``dynamically linked object'' data type, and the operations that may be
26@c applied to instances of this type
27
28@c @item
29@c or in a ``functionality-centric'' way, as the set of procedures that
30@c constitute Guile's support for dynamic linking, in the context of the
31@c module system.
32@c @end itemize
33
34@c The contents of this chapter are, therefore, a matter of judgment. By
35@c @dfn{generic}, we mean to select those data types whose typical use as
36@c @emph{data} in a wide variety of programming contexts is more important
37@c than their use in the implementation of a particular piece of
38@c @emph{functionality}. The last section of this chapter provides
39@c references for all the data types that are documented not here but in a
40@c ``functionality-centric'' way elsewhere in the manual.
41
42@menu
43* Booleans:: True/false values.
44* Numbers:: Numerical data types.
45* Characters:: New character names.
46* Strings:: Special things about strings.
47* Regular Expressions:: Pattern matching and substitution.
48* Symbols:: Symbols.
49* Keywords:: Self-quoting, customizable display keywords.
50* Other Types:: "Functionality-centric" data types.
51@end menu
52
53
54@node Booleans
55@subsection Booleans
56@tpindex Booleans
57
58The two boolean values are @code{#t} for true and @code{#f} for false.
59
60Boolean values are returned by predicate procedures, such as the general
61equality predicates @code{eq?}, @code{eqv?} and @code{equal?}
62(@pxref{Equality}) and numerical and string comparison operators like
63@code{string=?} (@pxref{String Comparison}) and @code{<=}
64(@pxref{Comparison}).
65
66@lisp
67(<= 3 8)
68@result{} #t
69
70(<= 3 -3)
71@result{} #f
72
73(equal? "house" "houses")
74@result{} #f
75
76(eq? #f #f)
@result{} #t
79@end lisp
80
81In test condition contexts like @code{if} and @code{cond} (@pxref{if
82cond case}), where a group of subexpressions will be evaluated only if a
83@var{condition} expression evaluates to ``true'', ``true'' means any
84value at all except @code{#f}.
85
86@lisp
87(if #t "yes" "no")
88@result{} "yes"
89
90(if 0 "yes" "no")
91@result{} "yes"
92
93(if #f "yes" "no")
94@result{} "no"
95@end lisp
96
97A result of this asymmetry is that typical Scheme source code more often
98uses @code{#f} explicitly than @code{#t}: @code{#f} is necessary to
99represent an @code{if} or @code{cond} false value, whereas @code{#t} is
100not necessary to represent an @code{if} or @code{cond} true value.
101
It is important to note that @code{#f} is @strong{not} equivalent to any
other Scheme value. In particular, @code{#f} is not the same as the
number 0 (as in C and C++), and not the same as the ``empty list''
(as in some Lisp dialects).
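
A minimal sketch of this point, using only standard Scheme: the number 0
and the empty list both count as true in a test, and neither is
identical to @code{#f}.

@lisp
(if '() "yes" "no")   ; @result{} "yes"
(eq? #f 0)            ; @result{} #f
(eq? #f '())          ; @result{} #f
@end lisp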
106
107In C, the two Scheme boolean values are available as the two constants
108@code{SCM_BOOL_T} for @code{#t} and @code{SCM_BOOL_F} for @code{#f}.
109Care must be taken with the false value @code{SCM_BOOL_F}: it is not
110false when used in C conditionals. In order to test for it, use
111@code{scm_is_false} or @code{scm_is_true}.
112
113@rnindex not
114@deffn {Scheme Procedure} not x
115@deffnx {C Function} scm_not (x)
116Return @code{#t} if @var{x} is @code{#f}, else return @code{#f}.
117@end deffn
118
119@rnindex boolean?
120@deffn {Scheme Procedure} boolean? obj
121@deffnx {C Function} scm_boolean_p (obj)
122Return @code{#t} if @var{obj} is either @code{#t} or @code{#f}, else
123return @code{#f}.
124@end deffn
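
As a short illustration of the two procedures above (a sketch using
only standard Scheme values), note that @code{not} returns @code{#t}
for @code{#f} alone, and that @code{boolean?} accepts only the two
boolean objects themselves.

@lisp
(not #f)         ; @result{} #t
(not #t)         ; @result{} #f
(not 0)          ; @result{} #f
(not '())        ; @result{} #f

(boolean? #f)    ; @result{} #t
(boolean? 0)     ; @result{} #f
(boolean? '())   ; @result{} #f
@end lisp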
125
126@deftypevr {C Macro} SCM SCM_BOOL_T
127The @code{SCM} representation of the Scheme object @code{#t}.
128@end deftypevr
129
130@deftypevr {C Macro} SCM SCM_BOOL_F
131The @code{SCM} representation of the Scheme object @code{#f}.
132@end deftypevr
133
134@deftypefn {C Function} int scm_is_true (SCM obj)
135Return @code{0} if @var{obj} is @code{#f}, else return @code{1}.
136@end deftypefn
137
138@deftypefn {C Function} int scm_is_false (SCM obj)
139Return @code{1} if @var{obj} is @code{#f}, else return @code{0}.
140@end deftypefn
141
142@deftypefn {C Function} int scm_is_bool (SCM obj)
143Return @code{1} if @var{obj} is either @code{#t} or @code{#f}, else
144return @code{0}.
145@end deftypefn
146
147@deftypefn {C Function} SCM scm_from_bool (int val)
148Return @code{#f} if @var{val} is @code{0}, else return @code{#t}.
149@end deftypefn
150
151@deftypefn {C Function} int scm_to_bool (SCM val)
152Return @code{1} if @var{val} is @code{SCM_BOOL_T}, return @code{0}
153when @var{val} is @code{SCM_BOOL_F}, else signal a `wrong type' error.
154
155You should probably use @code{scm_is_true} instead of this function
156when you just want to test a @code{SCM} value for trueness.
157@end deftypefn
158
159@node Numbers
160@subsection Numerical data types
161@tpindex Numbers
162
163Guile supports a rich ``tower'' of numerical types --- integer,
164rational, real and complex --- and provides an extensive set of
165mathematical and scientific functions for operating on numerical
166data. This section of the manual documents those types and functions.
167
168You may also find it illuminating to read R5RS's presentation of numbers
169in Scheme, which is particularly clear and accessible: see
170@ref{Numbers,,,r5rs,R5RS}.
171
172@menu
173* Numerical Tower:: Scheme's numerical "tower".
174* Integers:: Whole numbers.
175* Reals and Rationals:: Real and rational numbers.
176* Complex Numbers:: Complex numbers.
177* Exactness:: Exactness and inexactness.
178* Number Syntax:: Read syntax for numerical data.
179* Integer Operations:: Operations on integer values.
180* Comparison:: Comparison predicates.
181* Conversion:: Converting numbers to and from strings.
182* Complex:: Complex number operations.
183* Arithmetic:: Arithmetic functions.
184* Scientific:: Scientific functions.
185* Primitive Numerics:: Primitive numeric functions.
186* Bitwise Operations:: Logical AND, OR, NOT, and so on.
187* Random:: Random number generation.
188@end menu
189
190
191@node Numerical Tower
192@subsubsection Scheme's Numerical ``Tower''
193@rnindex number?
194
195Scheme's numerical ``tower'' consists of the following categories of
196numbers:
197
198@table @dfn
199@item integers
Whole numbers, positive or negative; e.g.@: @minus{}5, 0, 18.
201
202@item rationals
203The set of numbers that can be expressed as @math{@var{p}/@var{q}}
204where @var{p} and @var{q} are integers; e.g.@: @math{9/16} works, but
205pi (an irrational number) doesn't. These include integers
206(@math{@var{n}/1}).
207
208@item real numbers
209The set of numbers that describes all possible positions along a
210one-dimensional line. This includes rationals as well as irrational
211numbers.
212
213@item complex numbers
The set of numbers that describes all possible positions in a
two-dimensional space. This includes real as well as imaginary numbers
(@math{@var{a}+@var{b}i}, where @var{a} is the @dfn{real part},
@var{b} is the @dfn{imaginary part}, and @math{i} is the square root of
@minus{}1).
219@end table
220
221It is called a tower because each category ``sits on'' the one that
222follows it, in the sense that every integer is also a rational, every
223rational is also real, and every real number is also a complex number
224(but with zero imaginary part).
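
The containment between categories can be observed with the standard
type predicates. The sketch below assumes a helper named
@code{tower-levels}, which is purely illustrative and not part of
Guile; it reports which of the four categories a value belongs to.

@lisp
;; tower-levels is a hypothetical helper, not a Guile procedure.
;; It returns the answers of integer?, rational?, real? and complex?.
(define (tower-levels x)
  (list (integer? x) (rational? x) (real? x) (complex? x)))

(tower-levels 5)        ; @result{} (#t #t #t #t)
(tower-levels 1/2)      ; @result{} (#f #t #t #t)
(tower-levels 1.5+2i)   ; @result{} (#f #f #f #t)
@end lisp

Each value satisfies the predicate for its own category and for every
category to the right of it in the list.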
225
In addition to the classification into integers, rationals, reals and
complex numbers, Scheme also distinguishes between whether a number is
represented exactly or not. For example, the result of
@m{2\sin(\pi/4),2*sin(pi/4)} is exactly @m{\sqrt{2},2^(1/2)}, but Guile
can represent neither @m{\pi/4,pi/4} nor @m{\sqrt{2},2^(1/2)} exactly.
Instead, it stores an inexact approximation, using the C type
@code{double}.
233
234Guile can represent exact rationals of any magnitude, inexact
235rationals that fit into a C @code{double}, and inexact complex numbers
236with @code{double} real and imaginary parts.
237
238The @code{number?} predicate may be applied to any Scheme value to
239discover whether the value is any of the supported numerical types.
240
241@deffn {Scheme Procedure} number? obj
242@deffnx {C Function} scm_number_p (obj)
243Return @code{#t} if @var{obj} is any kind of number, else @code{#f}.
244@end deffn
245
246For example:
247
248@lisp
249(number? 3)
250@result{} #t
251
252(number? "hello there!")
253@result{} #f
254
255(define pi 3.141592654)
256(number? pi)
257@result{} #t
258@end lisp
259
260@deftypefn {C Function} int scm_is_number (SCM obj)
261This is equivalent to @code{scm_is_true (scm_number_p (obj))}.
262@end deftypefn
263
264The next few subsections document each of Guile's numerical data types
265in detail.
266
267@node Integers
268@subsubsection Integers
269
270@tpindex Integer numbers
271
272@rnindex integer?
273
Integers are whole numbers, that is, numbers with no fractional part,
such as 2, 83, and @minus{}3789.
276
277Integers in Guile can be arbitrarily big, as shown by the following
278example.
279
280@lisp
(define (factorial n)
  (let loop ((n n) (product 1))
    (if (= n 0)
        product
        (loop (- n 1) (* product n)))))
286
287(factorial 3)
288@result{} 6
289
290(factorial 20)
291@result{} 2432902008176640000
292
293(- (factorial 45))
294@result{} -119622220865480194561963161495657715064383733760000000000
295@end lisp
296
297Readers whose background is in programming languages where integers are
298limited by the need to fit into just 4 or 8 bytes of memory may find
299this surprising, or suspect that Guile's representation of integers is
300inefficient. In fact, Guile achieves a near optimal balance of
301convenience and efficiency by using the host computer's native
302representation of integers where possible, and a more general
303representation where the required number does not fit in the native
304form. Conversion between these two representations is automatic and
305completely invisible to the Scheme level programmer.
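
As a small sketch of what this means in practice, exact integer
arithmetic neither overflows nor rounds, however large the values
become. The particular power of two used here is only an example.

@lisp
(expt 2 100)
;; @result{} 1267650600228229401496703205376

(exact? (expt 2 100))                 ; @result{} #t
(= (+ (expt 2 100) 1) (expt 2 100))   ; @result{} #f
@end lisp

The point at which Guile switches from the native representation to the
general one depends on the platform and is not visible in results such
as these.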
306
307The infinities @samp{+inf.0} and @samp{-inf.0} are considered to be
308inexact integers. They are explained in detail in the next section,
309together with reals and rationals.
310
311C has a host of different integer types, and Guile offers a host of
312functions to convert between them and the @code{SCM} representation.
313For example, a C @code{int} can be handled with @code{scm_to_int} and
314@code{scm_from_int}. Guile also defines a few C integer types of its
315own, to help with differences between systems.
316
317C integer types that are not covered can be handled with the generic
318@code{scm_to_signed_integer} and @code{scm_from_signed_integer} for
319signed types, or with @code{scm_to_unsigned_integer} and
320@code{scm_from_unsigned_integer} for unsigned types.
321
Scheme integers can be exact or inexact. For example, a number
323written as @code{3.0} with an explicit decimal-point is inexact, but
324it is also an integer. The functions @code{integer?} and
325@code{scm_is_integer} report true for such a number, but the functions
326@code{scm_is_signed_integer} and @code{scm_is_unsigned_integer} only
327allow exact integers and thus report false. Likewise, the conversion
328functions like @code{scm_to_signed_integer} only accept exact
329integers.
330
331The motivation for this behavior is that the inexactness of a number
332should not be lost silently. If you want to allow inexact integers,
you can explicitly insert a call to @code{inexact->exact} or to its C
334equivalent @code{scm_inexact_to_exact}. (Only inexact integers will
335be converted by this call into exact integers; inexact non-integers
336will become exact fractions.)
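
At the Scheme level, the distinction described above can be sketched as
follows. The values are chosen so that they are exactly representable
as a @code{double}.

@lisp
(integer? 3.0)          ; @result{} #t
(exact? 3.0)            ; @result{} #f
(inexact->exact 3.0)    ; @result{} 3
(inexact->exact 0.25)   ; @result{} 1/4
@end lisp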
337
338@deffn {Scheme Procedure} integer? x
339@deffnx {C Function} scm_integer_p (x)
Return @code{#t} if @var{x} is an exact or inexact integer number, else
341@code{#f}.
342
343@lisp
344(integer? 487)
345@result{} #t
346
347(integer? 3.0)
348@result{} #t
349
350(integer? -3.4)
351@result{} #f
352
353(integer? +inf.0)
354@result{} #t
355@end lisp
356@end deffn
357
358@deftypefn {C Function} int scm_is_integer (SCM x)
359This is equivalent to @code{scm_is_true (scm_integer_p (x))}.
360@end deftypefn
361
362@defvr {C Type} scm_t_int8
363@defvrx {C Type} scm_t_uint8
364@defvrx {C Type} scm_t_int16
365@defvrx {C Type} scm_t_uint16
366@defvrx {C Type} scm_t_int32
367@defvrx {C Type} scm_t_uint32
368@defvrx {C Type} scm_t_int64
369@defvrx {C Type} scm_t_uint64
370@defvrx {C Type} scm_t_intmax
371@defvrx {C Type} scm_t_uintmax
372The C types are equivalent to the corresponding ISO C types but are
373defined on all platforms, with the exception of @code{scm_t_int64} and
374@code{scm_t_uint64}, which are only defined when a 64-bit type is
375available. For example, @code{scm_t_int8} is equivalent to
376@code{int8_t}.
377
378You can regard these definitions as a stop-gap measure until all
379platforms provide these types. If you know that all the platforms
380that you are interested in already provide these types, it is better
381to use them directly instead of the types provided by Guile.
382@end defvr
383
384@deftypefn {C Function} int scm_is_signed_integer (SCM x, scm_t_intmax min, scm_t_intmax max)
385@deftypefnx {C Function} int scm_is_unsigned_integer (SCM x, scm_t_uintmax min, scm_t_uintmax max)
Return @code{1} when @var{x} represents an exact integer that is
between @var{min} and @var{max}, inclusive; else return @code{0}.
388
389These functions can be used to check whether a @code{SCM} value will
390fit into a given range, such as the range of a given C integer type.
391If you just want to convert a @code{SCM} value to a given C integer
392type, use one of the conversion functions directly.
393@end deftypefn
394
395@deftypefn {C Function} scm_t_intmax scm_to_signed_integer (SCM x, scm_t_intmax min, scm_t_intmax max)
396@deftypefnx {C Function} scm_t_uintmax scm_to_unsigned_integer (SCM x, scm_t_uintmax min, scm_t_uintmax max)
397When @var{x} represents an exact integer that is between @var{min} and
398@var{max} inclusive, return that integer. Else signal an error,
399either a `wrong-type' error when @var{x} is not an exact integer, or
400an `out-of-range' error when it doesn't fit the given range.
401@end deftypefn
402
403@deftypefn {C Function} SCM scm_from_signed_integer (scm_t_intmax x)
404@deftypefnx {C Function} SCM scm_from_unsigned_integer (scm_t_uintmax x)
Return the @code{SCM} value that represents the integer @var{x}. These
functions will always succeed and will always return an exact number.
407@end deftypefn
408
409@deftypefn {C Function} char scm_to_char (SCM x)
410@deftypefnx {C Function} {signed char} scm_to_schar (SCM x)
411@deftypefnx {C Function} {unsigned char} scm_to_uchar (SCM x)
412@deftypefnx {C Function} short scm_to_short (SCM x)
413@deftypefnx {C Function} {unsigned short} scm_to_ushort (SCM x)
414@deftypefnx {C Function} int scm_to_int (SCM x)
415@deftypefnx {C Function} {unsigned int} scm_to_uint (SCM x)
416@deftypefnx {C Function} long scm_to_long (SCM x)
417@deftypefnx {C Function} {unsigned long} scm_to_ulong (SCM x)
418@deftypefnx {C Function} {long long} scm_to_long_long (SCM x)
419@deftypefnx {C Function} {unsigned long long} scm_to_ulong_long (SCM x)
420@deftypefnx {C Function} size_t scm_to_size_t (SCM x)
421@deftypefnx {C Function} ssize_t scm_to_ssize_t (SCM x)
422@deftypefnx {C Function} scm_t_int8 scm_to_int8 (SCM x)
423@deftypefnx {C Function} scm_t_uint8 scm_to_uint8 (SCM x)
424@deftypefnx {C Function} scm_t_int16 scm_to_int16 (SCM x)
425@deftypefnx {C Function} scm_t_uint16 scm_to_uint16 (SCM x)
426@deftypefnx {C Function} scm_t_int32 scm_to_int32 (SCM x)
427@deftypefnx {C Function} scm_t_uint32 scm_to_uint32 (SCM x)
428@deftypefnx {C Function} scm_t_int64 scm_to_int64 (SCM x)
429@deftypefnx {C Function} scm_t_uint64 scm_to_uint64 (SCM x)
430@deftypefnx {C Function} scm_t_intmax scm_to_intmax (SCM x)
431@deftypefnx {C Function} scm_t_uintmax scm_to_uintmax (SCM x)
432When @var{x} represents an exact integer that fits into the indicated
433C type, return that integer. Else signal an error, either a
434`wrong-type' error when @var{x} is not an exact integer, or an
435`out-of-range' error when it doesn't fit the given range.
436
437The functions @code{scm_to_long_long}, @code{scm_to_ulong_long},
438@code{scm_to_int64}, and @code{scm_to_uint64} are only available when
439the corresponding types are.
440@end deftypefn
441
442@deftypefn {C Function} SCM scm_from_char (char x)
443@deftypefnx {C Function} SCM scm_from_schar (signed char x)
444@deftypefnx {C Function} SCM scm_from_uchar (unsigned char x)
445@deftypefnx {C Function} SCM scm_from_short (short x)
446@deftypefnx {C Function} SCM scm_from_ushort (unsigned short x)
447@deftypefnx {C Function} SCM scm_from_int (int x)
448@deftypefnx {C Function} SCM scm_from_uint (unsigned int x)
449@deftypefnx {C Function} SCM scm_from_long (long x)
450@deftypefnx {C Function} SCM scm_from_ulong (unsigned long x)
451@deftypefnx {C Function} SCM scm_from_long_long (long long x)
452@deftypefnx {C Function} SCM scm_from_ulong_long (unsigned long long x)
453@deftypefnx {C Function} SCM scm_from_size_t (size_t x)
454@deftypefnx {C Function} SCM scm_from_ssize_t (ssize_t x)
455@deftypefnx {C Function} SCM scm_from_int8 (scm_t_int8 x)
456@deftypefnx {C Function} SCM scm_from_uint8 (scm_t_uint8 x)
457@deftypefnx {C Function} SCM scm_from_int16 (scm_t_int16 x)
458@deftypefnx {C Function} SCM scm_from_uint16 (scm_t_uint16 x)
459@deftypefnx {C Function} SCM scm_from_int32 (scm_t_int32 x)
460@deftypefnx {C Function} SCM scm_from_uint32 (scm_t_uint32 x)
461@deftypefnx {C Function} SCM scm_from_int64 (scm_t_int64 x)
462@deftypefnx {C Function} SCM scm_from_uint64 (scm_t_uint64 x)
463@deftypefnx {C Function} SCM scm_from_intmax (scm_t_intmax x)
464@deftypefnx {C Function} SCM scm_from_uintmax (scm_t_uintmax x)
465Return the @code{SCM} value that represents the integer @var{x}.
466These functions will always succeed and will always return an exact
467number.
468@end deftypefn
469
470@node Reals and Rationals
471@subsubsection Real and Rational Numbers
472@tpindex Real numbers
473@tpindex Rational numbers
474
475@rnindex real?
476@rnindex rational?
477
478Mathematically, the real numbers are the set of numbers that describe
479all possible points along a continuous, infinite, one-dimensional line.
480The rational numbers are the set of all numbers that can be written as
481fractions @var{p}/@var{q}, where @var{p} and @var{q} are integers.
482All rational numbers are also real, but there are real numbers that
483are not rational, for example the square root of 2, and pi.
484
485Guile can represent both exact and inexact rational numbers, but it
486can not represent irrational numbers. Exact rationals are represented
487by storing the numerator and denominator as two exact integers.
488Inexact rationals are stored as floating point numbers using the C
489type @code{double}.
490
491Exact rationals are written as a fraction of integers. There must be
492no whitespace around the slash:
493
494@lisp
4951/2
496-22/7
497@end lisp
498
499Even though the actual encoding of inexact rationals is in binary, it
500may be helpful to think of it as a decimal number with a limited
501number of significant figures and a decimal point somewhere, since
502this corresponds to the standard notation for non-whole numbers. For
503example:
504
505@lisp
5060.34
507-0.00000142857931198
508-5648394822220000000000.0
5094.0
510@end lisp
511
512The limited precision of Guile's encoding means that any ``real'' number
513in Guile can be written in a rational form, by multiplying and then dividing
514by sufficient powers of 10 (or in fact, 2). For example,
515@samp{-0.00000142857931198} is the same as @minus{}142857931198 divided by
516100000000000000000. In Guile's current incarnation, therefore, the
517@code{rational?} and @code{real?} predicates are equivalent.
518
519
Dividing by an exact zero leads to an error message, as one might
521expect. However, dividing by an inexact zero does not produce an
522error. Instead, the result of the division is either plus or minus
523infinity, depending on the sign of the divided number.
524
525The infinities are written @samp{+inf.0} and @samp{-inf.0},
respectively. This syntax is also recognized by @code{read} as an
527extension to the usual Scheme syntax.
528
529Dividing zero by zero yields something that is not a number at all:
530@samp{+nan.0}. This is the special `not a number' value.
531
532On platforms that follow @acronym{IEEE} 754 for their floating point
533arithmetic, the @samp{+inf.0}, @samp{-inf.0}, and @samp{+nan.0} values
534are implemented using the corresponding @acronym{IEEE} 754 values.
They behave in arithmetic operations as @acronym{IEEE} 754 describes;
for example, @code{(= +nan.0 +nan.0)} @result{} @code{#f}.
537
538The infinities are inexact integers and are considered to be both even
539and odd. While @samp{+nan.0} is not @code{=} to itself, it is
540@code{eqv?} to itself.
541
542To test for the special values, use the functions @code{inf?} and
543@code{nan?}.
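
The behaviour described in the preceding paragraphs can be sketched as
follows, assuming a platform with @acronym{IEEE} 754 floating point.
The predicates are used here so that the example does not depend on how
the special values are printed.

@lisp
(inf? (/ 1 0.0))     ; @result{} #t
(inf? (/ -1 0.0))    ; @result{} #t
(< (/ -1 0.0) 0)     ; @result{} #t
(nan? (/ 0.0 0.0))   ; @result{} #t

(let ((x (/ 0.0 0.0)))
  (= x x))           ; @result{} #f
@end lisp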
544
545@deffn {Scheme Procedure} real? obj
546@deffnx {C Function} scm_real_p (obj)
547Return @code{#t} if @var{obj} is a real number, else @code{#f}. Note
548that the sets of integer and rational values form subsets of the set
549of real numbers, so the predicate will also be fulfilled if @var{obj}
550is an integer number or a rational number.
551@end deffn
552
553@deffn {Scheme Procedure} rational? x
554@deffnx {C Function} scm_rational_p (x)
555Return @code{#t} if @var{x} is a rational number, @code{#f} otherwise.
Note that the set of integer values forms a subset of the set of
rational numbers, i.e.@: the predicate will also be fulfilled if
@var{x} is an integer number.
559
560Since Guile can not represent irrational numbers, every number
561satisfying @code{real?} also satisfies @code{rational?} in Guile.
562@end deffn
563
564@deffn {Scheme Procedure} rationalize x eps
565@deffnx {C Function} scm_rationalize (x, eps)
566Returns the @emph{simplest} rational number differing
567from @var{x} by no more than @var{eps}.
568
569As required by @acronym{R5RS}, @code{rationalize} only returns an
570exact result when both its arguments are exact. Thus, you might need
571to use @code{inexact->exact} on the arguments.
572
573@lisp
574(rationalize (inexact->exact 1.2) 1/100)
575@result{} 6/5
576@end lisp
577
578@end deffn
579
580@deffn {Scheme Procedure} inf? x
581@deffnx {C Function} scm_inf_p (x)
582Return @code{#t} if @var{x} is either @samp{+inf.0} or @samp{-inf.0},
583@code{#f} otherwise.
584@end deffn
585
586@deffn {Scheme Procedure} nan? x
@deffnx {C Function} scm_nan_p (x)
588Return @code{#t} if @var{x} is @samp{+nan.0}, @code{#f} otherwise.
589@end deffn
590
591@deffn {Scheme Procedure} numerator x
592@deffnx {C Function} scm_numerator (x)
593Return the numerator of the rational number @var{x}.
594@end deffn
595
596@deffn {Scheme Procedure} denominator x
597@deffnx {C Function} scm_denominator (x)
598Return the denominator of the rational number @var{x}.
599@end deffn
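
For illustration, a brief sketch of @code{numerator} and
@code{denominator}. Exact rationals are kept in lowest terms, so the
results refer to the reduced fraction, and an integer has a denominator
of 1.

@lisp
(numerator 6/4)      ; @result{} 3
(denominator 6/4)    ; @result{} 2
(numerator 5)        ; @result{} 5
(denominator 5)      ; @result{} 1
@end lisp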
600
601@deftypefn {C Function} int scm_is_real (SCM val)
602@deftypefnx {C Function} int scm_is_rational (SCM val)
603Equivalent to @code{scm_is_true (scm_real_p (val))} and
604@code{scm_is_true (scm_rational_p (val))}, respectively.
605@end deftypefn
606
607@deftypefn {C Function} double scm_to_double (SCM val)
608Returns the number closest to @var{val} that is representable as a
609@code{double}. Returns infinity for a @var{val} that is too large in
610magnitude. The argument @var{val} must be a real number.
611@end deftypefn
612
613@deftypefn {C Function} SCM scm_from_double (double val)
Return the @code{SCM} value that represents @var{val}. The returned
615value is inexact according to the predicate @code{inexact?}, but it
616will be exactly equal to @var{val}.
617@end deftypefn
618
619@node Complex Numbers
620@subsubsection Complex Numbers
621@tpindex Complex numbers
622
623@rnindex complex?
624
625Complex numbers are the set of numbers that describe all possible points
626in a two-dimensional space. The two coordinates of a particular point
627in this space are known as the @dfn{real} and @dfn{imaginary} parts of
628the complex number that describes that point.
629
630In Guile, complex numbers are written in rectangular form as the sum of
631their real and imaginary parts, using the symbol @code{i} to indicate
632the imaginary part.
633
634@lisp
6353+4i
636@result{}
6373.0+4.0i
638
639(* 3-8i 2.3+0.3i)
640@result{}
6419.3-17.5i
642@end lisp
643
644Guile represents a complex number with a non-zero imaginary part as a
645pair of inexact rationals, so the real and imaginary parts of a
646complex number have the same properties of inexactness and limited
647precision as single inexact rational numbers. Guile can not represent
648exact complex numbers with non-zero imaginary parts.
649
650@deffn {Scheme Procedure} complex? z
651@deffnx {C Function} scm_complex_p (z)
Return @code{#t} if @var{z} is a complex number, @code{#f}
otherwise. Note that the sets of real, rational and integer
values form subsets of the set of complex numbers, i.e.@: the
predicate will also be fulfilled if @var{z} is a real,
rational or integer number.
657@end deffn
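
A short sketch of these properties follows. It uses the standard R5RS
accessors @code{real-part} and @code{imag-part}, which are described
with the other complex number operations (@pxref{Complex}).

@lisp
(complex? 3+4i)     ; @result{} #t
(complex? 7)        ; @result{} #t
(real? 3+4i)        ; @result{} #f
(exact? 3+4i)       ; @result{} #f

(real-part 3+4i)    ; @result{} 3.0
(imag-part 3+4i)    ; @result{} 4.0
@end lisp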
658
659@node Exactness
660@subsubsection Exact and Inexact Numbers
661@tpindex Exact numbers
662@tpindex Inexact numbers
663
664@rnindex exact?
665@rnindex inexact?
666@rnindex exact->inexact
667@rnindex inexact->exact
668
669R5RS requires that a calculation involving inexact numbers always
670produces an inexact result. To meet this requirement, Guile
671distinguishes between an exact integer value such as @samp{5} and the
672corresponding inexact real value which, to the limited precision
673available, has no fractional part, and is printed as @samp{5.0}. Guile
674will only convert the latter value to the former when forced to do so by
675an invocation of the @code{inexact->exact} procedure.
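
The rule can be sketched with a few small calculations. A single
inexact operand makes the result inexact, even when the result happens
to be a whole number.

@lisp
(exact? (+ 1 2))             ; @result{} #t
(exact? (+ 1 2.0))           ; @result{} #f
(* 2 0.5)                    ; @result{} 1.0
(inexact->exact (* 2 0.5))   ; @result{} 1
@end lisp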
676
677@deffn {Scheme Procedure} exact? z
678@deffnx {C Function} scm_exact_p (z)
679Return @code{#t} if the number @var{z} is exact, @code{#f}
680otherwise.
681
682@lisp
683(exact? 2)
684@result{} #t
685
686(exact? 0.5)
687@result{} #f
688
689(exact? (/ 2))
690@result{} #t
691@end lisp
692
693@end deffn
694
695@deffn {Scheme Procedure} inexact? z
696@deffnx {C Function} scm_inexact_p (z)
Return @code{#t} if the number @var{z} is inexact, @code{#f}
otherwise.
699@end deffn
700
701@deffn {Scheme Procedure} inexact->exact z
702@deffnx {C Function} scm_inexact_to_exact (z)
703Return an exact number that is numerically closest to @var{z}, when
704there is one. For inexact rationals, Guile returns the exact rational
705that is numerically equal to the inexact rational. Inexact complex
706numbers with a non-zero imaginary part can not be made exact.
707
708@lisp
709(inexact->exact 0.5)
710@result{} 1/2
711@end lisp
712
In the first example below, the result is not 6/5 because 12/10 is not
exactly representable as a @code{double} (on most platforms). However,
when reading a decimal number that has been marked exact with the
``#e'' prefix, Guile is able to represent it correctly.
717
718@lisp
719(inexact->exact 1.2)
720@result{} 5404319552844595/4503599627370496
721
722#e1.2
723@result{} 6/5
724@end lisp
725
726@end deffn
727
728@c begin (texi-doc-string "guile" "exact->inexact")
729@deffn {Scheme Procedure} exact->inexact z
730@deffnx {C Function} scm_exact_to_inexact (z)
731Convert the number @var{z} to its inexact representation.
732@end deffn
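
For example, as a sketch using values that a @code{double} can hold
exactly, so that the round trip through @code{inexact->exact} loses
nothing:

@lisp
(exact->inexact 1/2)                    ; @result{} 0.5
(exact->inexact 3)                      ; @result{} 3.0
(inexact->exact (exact->inexact 1/4))   ; @result{} 1/4
(exact? (exact->inexact 1/3))           ; @result{} #f
@end lisp

A fraction such as 1/3 has no exact @code{double} representation, so
converting it to inexact form and back does not in general return the
original value.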
733
734
735@node Number Syntax
736@subsubsection Read Syntax for Numerical Data
737
738The read syntax for integers is a string of digits, optionally
739preceded by a minus or plus character, a code indicating the
740base in which the integer is encoded, and a code indicating whether
741the number is exact or inexact. The supported base codes are:
742
743@table @code
744@item #b
745@itemx #B
746the integer is written in binary (base 2)
747
748@item #o
749@itemx #O
750the integer is written in octal (base 8)
751
752@item #d
753@itemx #D
754the integer is written in decimal (base 10)
755
756@item #x
757@itemx #X
758the integer is written in hexadecimal (base 16)
759@end table
760
761If the base code is omitted, the integer is assumed to be decimal. The
762following examples show how these base codes are used.
763
764@lisp
765-13
766@result{} -13
767
768#d-13
769@result{} -13
770
771#x-13
772@result{} -19
773
774#b+1101
775@result{} 13
776
777#o377
778@result{} 255
779@end lisp
780
781The codes for indicating exactness (which can, incidentally, be applied
782to all numerical values) are:
783
784@table @code
785@item #e
786@itemx #E
787the number is exact
788
789@item #i
790@itemx #I
the number is inexact
792@end table
793
794If the exactness indicator is omitted, the number is exact unless it
795contains a radix point. Since Guile can not represent exact complex
796numbers, an error is signalled when asking for them.
797
798@lisp
799(exact? 1.2)
800@result{} #f
801
802(exact? #e1.2)
803@result{} #t
804
805(exact? #e+1i)
806ERROR: Wrong type argument
807@end lisp
808
809Guile also understands the syntax @samp{+inf.0} and @samp{-inf.0} for
plus and minus infinity, respectively. The values must be written
exactly as shown; that is, they must always have a sign and exactly
one zero digit after the decimal point. Guile also understands
813@samp{+nan.0} and @samp{-nan.0} for the special `not-a-number' value.
814The sign is ignored for `not-a-number' and the value is always printed
815as @samp{+nan.0}.
816
817@node Integer Operations
818@subsubsection Operations on Integer Values
819@rnindex odd?
820@rnindex even?
821@rnindex quotient
822@rnindex remainder
823@rnindex modulo
824@rnindex gcd
825@rnindex lcm
826
827@deffn {Scheme Procedure} odd? n
828@deffnx {C Function} scm_odd_p (n)
829Return @code{#t} if @var{n} is an odd number, @code{#f}
830otherwise.
831@end deffn
832
833@deffn {Scheme Procedure} even? n
834@deffnx {C Function} scm_even_p (n)
835Return @code{#t} if @var{n} is an even number, @code{#f}
836otherwise.
837@end deffn
838
839@c begin (texi-doc-string "guile" "quotient")
840@c begin (texi-doc-string "guile" "remainder")
841@deffn {Scheme Procedure} quotient n d
842@deffnx {Scheme Procedure} remainder n d
843@deffnx {C Function} scm_quotient (n, d)
844@deffnx {C Function} scm_remainder (n, d)
845Return the quotient or remainder from @var{n} divided by @var{d}. The
846quotient is rounded towards zero, and the remainder will have the same
847sign as @var{n}. In all cases quotient and remainder satisfy
848@math{@var{n} = @var{q}*@var{d} + @var{r}}.
849
850@lisp
851(remainder 13 4) @result{} 1
852(remainder -13 4) @result{} -1
853@end lisp
854@end deffn
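
The identity stated above can be checked for all four sign
combinations. The sketch below assumes a helper named
@code{check-division}, which is illustrative only and not part of
Guile; it returns the quotient, the remainder, and whether
@math{@var{n} = @var{q}*@var{d} + @var{r}} holds.

@lisp
;; check-division is a hypothetical helper, not a Guile procedure.
(define (check-division n d)
  (let ((q (quotient n d))
        (r (remainder n d)))
    (list q r (= n (+ (* q d) r)))))

(check-division 13 4)     ; @result{} (3 1 #t)
(check-division -13 4)    ; @result{} (-3 -1 #t)
(check-division 13 -4)    ; @result{} (-3 1 #t)
(check-division -13 -4)   ; @result{} (3 -1 #t)
@end lisp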
855
856@c begin (texi-doc-string "guile" "modulo")
857@deffn {Scheme Procedure} modulo n d
858@deffnx {C Function} scm_modulo (n, d)
859Return the remainder from @var{n} divided by @var{d}, with the same
860sign as @var{d}.
861
862@lisp
863(modulo 13 4) @result{} 1
864(modulo -13 4) @result{} 3
865(modulo 13 -4) @result{} -3
866(modulo -13 -4) @result{} -1
867@end lisp
868@end deffn
869
870@c begin (texi-doc-string "guile" "gcd")
871@deffn {Scheme Procedure} gcd
872@deffnx {C Function} scm_gcd (x, y)
873Return the greatest common divisor of all arguments.
874If called without arguments, 0 is returned.
875
876The C function @code{scm_gcd} always takes two arguments, while the
877Scheme function can take an arbitrary number.
878@end deffn
879
880@c begin (texi-doc-string "guile" "lcm")
881@deffn {Scheme Procedure} lcm
882@deffnx {C Function} scm_lcm (x, y)
883Return the least common multiple of the arguments.
884If called without arguments, 1 is returned.
885
886The C function @code{scm_lcm} always takes two arguments, while the
887Scheme function can take an arbitrary number.
888@end deffn
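
A few sample calls may help to show how @code{gcd} and @code{lcm}
behave with zero, two and more than two arguments:

@lisp
(gcd)             @result{} 0
(gcd 12 18)       @result{} 6
(gcd 12 18 27)    @result{} 3
(lcm)             @result{} 1
(lcm 4 6)         @result{} 12
(lcm 4 6 10)      @result{} 60
@end lisp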
889
890
891@node Comparison
892@subsubsection Comparison Predicates
893@rnindex zero?
894@rnindex positive?
895@rnindex negative?
896
The C comparison functions below always take two arguments, while the
Scheme functions can take an arbitrary number. Also keep in mind that
the C functions return one of the Scheme boolean values
@code{SCM_BOOL_T} or @code{SCM_BOOL_F}, both of which are true as far as C
is concerned. Thus, always write @code{scm_is_true (scm_num_eq_p (x,
y))} when testing the two Scheme numbers @code{x} and @code{y} for
equality, for example.
904
905@c begin (texi-doc-string "guile" "=")
906@deffn {Scheme Procedure} =
907@deffnx {C Function} scm_num_eq_p (x, y)
908Return @code{#t} if all parameters are numerically equal.
909@end deffn
910
911@c begin (texi-doc-string "guile" "<")
912@deffn {Scheme Procedure} <
913@deffnx {C Function} scm_less_p (x, y)
914Return @code{#t} if the list of parameters is monotonically
915increasing.
916@end deffn
917
918@c begin (texi-doc-string "guile" ">")
919@deffn {Scheme Procedure} >
920@deffnx {C Function} scm_gr_p (x, y)
921Return @code{#t} if the list of parameters is monotonically
922decreasing.
923@end deffn
924
925@c begin (texi-doc-string "guile" "<=")
926@deffn {Scheme Procedure} <=
927@deffnx {C Function} scm_leq_p (x, y)
928Return @code{#t} if the list of parameters is monotonically
929non-decreasing.
930@end deffn
931
932@c begin (texi-doc-string "guile" ">=")
933@deffn {Scheme Procedure} >=
934@deffnx {C Function} scm_geq_p (x, y)
935Return @code{#t} if the list of parameters is monotonically
936non-increasing.
937@end deffn
938
939@c begin (texi-doc-string "guile" "zero?")
940@deffn {Scheme Procedure} zero? z
941@deffnx {C Function} scm_zero_p (z)
942Return @code{#t} if @var{z} is an exact or inexact number equal to
943zero.
944@end deffn
945
946@c begin (texi-doc-string "guile" "positive?")
947@deffn {Scheme Procedure} positive? x
948@deffnx {C Function} scm_positive_p (x)
949Return @code{#t} if @var{x} is an exact or inexact number greater than
950zero.
951@end deffn
952
953@c begin (texi-doc-string "guile" "negative?")
954@deffn {Scheme Procedure} negative? x
955@deffnx {C Function} scm_negative_p (x)
956Return @code{#t} if @var{x} is an exact or inexact number less than
957zero.
958@end deffn
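
The following sketch illustrates how the Scheme comparison predicates
treat more than two arguments, and that exact and inexact numbers are
compared by value:

@lisp
(= 1 1.0 1)      @result{} #t
(< 1 2 3)        @result{} #t
(< 1 2 2)        @result{} #f
(<= 1 2 2)       @result{} #t
(>= 3 3 1)       @result{} #t
(zero? 0.0)      @result{} #t
(positive? -1)   @result{} #f
@end lisp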
959
960
961@node Conversion
962@subsubsection Converting Numbers To and From Strings
963@rnindex number->string
964@rnindex string->number
965
966@deffn {Scheme Procedure} number->string n [radix]
967@deffnx {C Function} scm_number_to_string (n, radix)
968Return a string holding the external representation of the
969number @var{n} in the given @var{radix}. If @var{n} is
970inexact, a radix of 10 will be used.
971@end deffn
972
973@deffn {Scheme Procedure} string->number string [radix]
974@deffnx {C Function} scm_string_to_number (string, radix)
Return a number of the maximally precise representation
expressed by the given @var{string}. @var{radix} must be an
exact integer, either 2, 8, 10, or 16. If supplied, @var{radix}
is a default radix that may be overridden by an explicit radix
prefix in @var{string} (e.g.@: @samp{#o177}). If @var{radix} is not
supplied, then the default radix is 10. If @var{string} is not a
syntactically valid notation for a number, then
@code{string->number} returns @code{#f}.
983@end deffn
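
The examples below sketch both directions of the conversion, including
a radix prefix in the string that overrides the @var{radix} argument:

@lisp
(number->string 255 16)         @result{} "ff"
(number->string 255 2)          @result{} "11111111"
(string->number "ff" 16)        @result{} 255
(string->number "#o177" 16)     @result{} 127
(string->number "hello")        @result{} #f
@end lisp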
984
985
986@node Complex
987@subsubsection Complex Number Operations
988@rnindex make-rectangular
989@rnindex make-polar
990@rnindex real-part
991@rnindex imag-part
992@rnindex magnitude
993@rnindex angle
994
995@deffn {Scheme Procedure} make-rectangular real imaginary
996@deffnx {C Function} scm_make_rectangular (real, imaginary)
997Return a complex number constructed of the given @var{real} and
998@var{imaginary} parts.
999@end deffn
1000
1001@deffn {Scheme Procedure} make-polar x y
1002@deffnx {C Function} scm_make_polar (x, y)
1003Return the complex number @var{x} * e^(i * @var{y}).
1004@end deffn
1005
1006@c begin (texi-doc-string "guile" "real-part")
1007@deffn {Scheme Procedure} real-part z
1008@deffnx {C Function} scm_real_part (z)
1009Return the real part of the number @var{z}.
1010@end deffn
1011
1012@c begin (texi-doc-string "guile" "imag-part")
1013@deffn {Scheme Procedure} imag-part z
1014@deffnx {C Function} scm_imag_part (z)
1015Return the imaginary part of the number @var{z}.
1016@end deffn
1017
1018@c begin (texi-doc-string "guile" "magnitude")
1019@deffn {Scheme Procedure} magnitude z
1020@deffnx {C Function} scm_magnitude (z)
1021Return the magnitude of the number @var{z}. This is the same as
1022@code{abs} for real arguments, but also allows complex numbers.
1023@end deffn
1024
1025@c begin (texi-doc-string "guile" "angle")
1026@deffn {Scheme Procedure} angle z
1027@deffnx {C Function} scm_angle (z)
1028Return the angle of the complex number @var{z}.
1029@end deffn
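
As a short sketch of the accessors above, the number with real part 3
and imaginary part 4 has magnitude 5. The results are shown as inexact
values, on the assumption that the components of a complex number are
stored as floating point.

@lisp
(define z (make-rectangular 3 4))

(real-part z)     @result{} 3.0
(imag-part z)     @result{} 4.0
(magnitude z)     @result{} 5.0
(magnitude -5)    @result{} 5
@end lisp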
1030
1031@deftypefn {C Function} SCM scm_c_make_rectangular (double re, double im)
1032@deftypefnx {C Function} SCM scm_c_make_polar (double x, double y)
1033Like @code{scm_make_rectangular} or @code{scm_make_polar},
1034respectively, but these functions take @code{double}s as their
1035arguments.
1036@end deftypefn
1037
1038@deftypefn {C Function} double scm_c_real_part (z)
1039@deftypefnx {C Function} double scm_c_imag_part (z)
1040Returns the real or imaginary part of @var{z} as a @code{double}.
1041@end deftypefn
1042
1043@deftypefn {C Function} double scm_c_magnitude (z)
1044@deftypefnx {C Function} double scm_c_angle (z)
1045Returns the magnitude or angle of @var{z} as a @code{double}.
1046@end deftypefn
1047
1048
1049@node Arithmetic
1050@subsubsection Arithmetic Functions
1051@rnindex max
1052@rnindex min
1053@rnindex +
1054@rnindex *
1055@rnindex -
1056@rnindex /
1057@rnindex abs
1058@rnindex floor
1059@rnindex ceiling
1060@rnindex truncate
1061@rnindex round
1062
The C arithmetic functions below always take two arguments, while the
Scheme functions can take an arbitrary number. When you need to
invoke them with just one argument, for example to compute the
equivalent of @code{(- x)}, pass @code{SCM_UNDEFINED} as the second
one: @code{scm_difference (x, SCM_UNDEFINED)}.
1068
1069@c begin (texi-doc-string "guile" "+")
1070@deffn {Scheme Procedure} + z1 @dots{}
1071@deffnx {C Function} scm_sum (z1, z2)
1072Return the sum of all parameter values. Return 0 if called without any
1073parameters.
1074@end deffn
1075
1076@c begin (texi-doc-string "guile" "-")
1077@deffn {Scheme Procedure} - z1 z2 @dots{}
1078@deffnx {C Function} scm_difference (z1, z2)
If called with one argument @var{z1}, -@var{z1} is returned. Otherwise
the sum of all but the first argument is subtracted from the first
argument.
1082@end deffn
1083
1084@c begin (texi-doc-string "guile" "*")
1085@deffn {Scheme Procedure} * z1 @dots{}
1086@deffnx {C Function} scm_product (z1, z2)
1087Return the product of all arguments. If called without arguments, 1 is
1088returned.
1089@end deffn
1090
1091@c begin (texi-doc-string "guile" "/")
1092@deffn {Scheme Procedure} / z1 z2 @dots{}
1093@deffnx {C Function} scm_divide (z1, z2)
1094Divide the first argument by the product of the remaining arguments. If
1095called with one argument @var{z1}, 1/@var{z1} is returned.
1096@end deffn
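
The following calls sketch how the four basic operators behave with
zero, one and several arguments. Inexact operands are used for
@code{/} so that the results do not depend on how exact division is
represented.

@lisp
(+)              @result{} 0
(+ 1 2 3)        @result{} 6
(*)              @result{} 1
(- 5)            @result{} -5
(- 10 1 2 3)     @result{} 4
(/ 4.0)          @result{} 0.25
(/ 12.0 2 3)     @result{} 2.0
@end lisp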
1097
1098@c begin (texi-doc-string "guile" "abs")
1099@deffn {Scheme Procedure} abs x
1100@deffnx {C Function} scm_abs (x)
1101Return the absolute value of @var{x}.
1102
1103@var{x} must be a number with zero imaginary part. To calculate the
1104magnitude of a complex number, use @code{magnitude} instead.
1105@end deffn
1106
1107@c begin (texi-doc-string "guile" "max")
1108@deffn {Scheme Procedure} max x1 x2 @dots{}
1109@deffnx {C Function} scm_max (x1, x2)
1110Return the maximum of all parameter values.
1111@end deffn
1112
1113@c begin (texi-doc-string "guile" "min")
1114@deffn {Scheme Procedure} min x1 x2 @dots{}
1115@deffnx {C Function} scm_min (x1, x2)
1116Return the minimum of all parameter values.
1117@end deffn
1118
1119@c begin (texi-doc-string "guile" "truncate")
@deffn {Scheme Procedure} truncate x
1121@deffnx {C Function} scm_truncate_number (x)
1122Round the inexact number @var{x} towards zero.
1123@end deffn
1124
1125@c begin (texi-doc-string "guile" "round")
1126@deffn {Scheme Procedure} round x
1127@deffnx {C Function} scm_round_number (x)
1128Round the inexact number @var{x} to the nearest integer. When exactly
1129halfway between two integers, round to the even one.
1130@end deffn
1131
1132@c begin (texi-doc-string "guile" "floor")
1133@deffn {Scheme Procedure} floor x
1134@deffnx {C Function} scm_floor (x)
1135Round the number @var{x} towards minus infinity.
1136@end deffn
1137
1138@c begin (texi-doc-string "guile" "ceiling")
1139@deffn {Scheme Procedure} ceiling x
1140@deffnx {C Function} scm_ceiling (x)
1141Round the number @var{x} towards infinity.
1142@end deffn
1143
1144@deftypefn {C Function} double scm_c_truncate (double x)
1145@deftypefnx {C Function} double scm_c_round (double x)
1146Like @code{scm_truncate_number} or @code{scm_round_number},
1147respectively, but these functions take and return @code{double}
1148values.
1149@end deftypefn
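
The four Scheme rounding procedures differ only in the direction they
round. The sketch below applies them to the same inputs, and shows
that @code{round} picks the even neighbour when the argument is exactly
halfway between two integers:

@lisp
(truncate 2.7)     @result{} 2.0
(truncate -2.7)    @result{} -2.0
(floor -2.7)       @result{} -3.0
(ceiling -2.7)     @result{} -2.0
(round 2.5)        @result{} 2.0
(round 3.5)        @result{} 4.0
(round -2.5)       @result{} -2.0
@end lisp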
1150
1151@node Scientific
1152@subsubsection Scientific Functions
1153
1154The following procedures accept any kind of number as arguments,
1155including complex numbers.
1156
1157@rnindex sqrt
1158@c begin (texi-doc-string "guile" "sqrt")
1159@deffn {Scheme Procedure} sqrt z
1160Return the square root of @var{z}.
1161@end deffn
1162
1163@rnindex expt
1164@c begin (texi-doc-string "guile" "expt")
1165@deffn {Scheme Procedure} expt z1 z2
1166Return @var{z1} raised to the power of @var{z2}.
1167@end deffn
1168
1169@rnindex sin
1170@c begin (texi-doc-string "guile" "sin")
1171@deffn {Scheme Procedure} sin z
1172Return the sine of @var{z}.
1173@end deffn
1174
1175@rnindex cos
1176@c begin (texi-doc-string "guile" "cos")
1177@deffn {Scheme Procedure} cos z
1178Return the cosine of @var{z}.
1179@end deffn
1180
1181@rnindex tan
1182@c begin (texi-doc-string "guile" "tan")
1183@deffn {Scheme Procedure} tan z
1184Return the tangent of @var{z}.
1185@end deffn
1186
1187@rnindex asin
1188@c begin (texi-doc-string "guile" "asin")
1189@deffn {Scheme Procedure} asin z
1190Return the arcsine of @var{z}.
1191@end deffn
1192
1193@rnindex acos
1194@c begin (texi-doc-string "guile" "acos")
1195@deffn {Scheme Procedure} acos z
1196Return the arccosine of @var{z}.
1197@end deffn
1198
1199@rnindex atan
1200@c begin (texi-doc-string "guile" "atan")
1201@deffn {Scheme Procedure} atan z
1202@deffnx {Scheme Procedure} atan y x
1203Return the arctangent of @var{z}, or of @math{@var{y}/@var{x}}.
1204@end deffn
1205
1206@rnindex exp
1207@c begin (texi-doc-string "guile" "exp")
1208@deffn {Scheme Procedure} exp z
1209Return e to the power of @var{z}, where e is the base of natural
1210logarithms (2.71828@dots{}).
1211@end deffn
1212
1213@rnindex log
1214@c begin (texi-doc-string "guile" "log")
1215@deffn {Scheme Procedure} log z
1216Return the natural logarithm of @var{z}.
1217@end deffn
1218
1219@c begin (texi-doc-string "guile" "log10")
1220@deffn {Scheme Procedure} log10 z
1221Return the base 10 logarithm of @var{z}.
1222@end deffn
1223
1224@c begin (texi-doc-string "guile" "sinh")
1225@deffn {Scheme Procedure} sinh z
1226Return the hyperbolic sine of @var{z}.
1227@end deffn
1228
1229@c begin (texi-doc-string "guile" "cosh")
1230@deffn {Scheme Procedure} cosh z
1231Return the hyperbolic cosine of @var{z}.
1232@end deffn
1233
1234@c begin (texi-doc-string "guile" "tanh")
1235@deffn {Scheme Procedure} tanh z
1236Return the hyperbolic tangent of @var{z}.
1237@end deffn
1238
1239@c begin (texi-doc-string "guile" "asinh")
1240@deffn {Scheme Procedure} asinh z
1241Return the hyperbolic arcsine of @var{z}.
1242@end deffn
1243
1244@c begin (texi-doc-string "guile" "acosh")
1245@deffn {Scheme Procedure} acosh z
1246Return the hyperbolic arccosine of @var{z}.
1247@end deffn
1248
1249@c begin (texi-doc-string "guile" "atanh")
1250@deffn {Scheme Procedure} atanh z
1251Return the hyperbolic arctangent of @var{z}.
1252@end deffn
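
Because the printed digits of floating-point results can vary, the
sketch below compares results within a tolerance instead of showing
them literally. The names @code{pi} and @code{approx=?} are
hypothetical helpers defined only for this illustration. The
two-argument form of @code{atan} is assumed to follow the usual Scheme
convention of using the signs of both arguments to select the quadrant.

@lisp
(define pi (* 4 (atan 1)))

(define (approx=? a b)
  ;; #t when a and b differ by less than 1e-9; magnitude is used
  ;; so that complex results can be compared as well
  (< (magnitude (- a b)) 1e-9))

(approx=? (sin (/ pi 2)) 1)                 @result{} #t
(approx=? (exp (log 10)) 10)                @result{} #t
(approx=? (expt 2 0.5) (sqrt 2))            @result{} #t
(approx=? (* (sqrt -4) (sqrt -4)) -4)       @result{} #t
(approx=? (atan 1 1) (/ pi 4))              @result{} #t
(approx=? (atan -1 -1) (* -0.75 pi))        @result{} #t
@end lisp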
1253
1254
1255@node Primitive Numerics
1256@subsubsection Primitive Numeric Functions
1257
1258Many of Guile's numeric procedures which accept any kind of numbers as
1259arguments, including complex numbers, are implemented as Scheme
1260procedures that use the following real number-based primitives. These
1261primitives signal an error if they are called with complex arguments.
1262
1263@c begin (texi-doc-string "guile" "$abs")
1264@deffn {Scheme Procedure} $abs x
1265Return the absolute value of @var{x}.
1266@end deffn
1267
1268@c begin (texi-doc-string "guile" "$sqrt")
1269@deffn {Scheme Procedure} $sqrt x
1270Return the square root of @var{x}.
1271@end deffn
1272
1273@deffn {Scheme Procedure} $expt x y
1274@deffnx {C Function} scm_sys_expt (x, y)
1275Return @var{x} raised to the power of @var{y}. This
1276procedure does not accept complex arguments.
1277@end deffn
1278
1279@c begin (texi-doc-string "guile" "$sin")
1280@deffn {Scheme Procedure} $sin x
1281Return the sine of @var{x}.
1282@end deffn
1283
1284@c begin (texi-doc-string "guile" "$cos")
1285@deffn {Scheme Procedure} $cos x
1286Return the cosine of @var{x}.
1287@end deffn
1288
1289@c begin (texi-doc-string "guile" "$tan")
1290@deffn {Scheme Procedure} $tan x
1291Return the tangent of @var{x}.
1292@end deffn
1293
1294@c begin (texi-doc-string "guile" "$asin")
1295@deffn {Scheme Procedure} $asin x
1296Return the arcsine of @var{x}.
1297@end deffn
1298
1299@c begin (texi-doc-string "guile" "$acos")
1300@deffn {Scheme Procedure} $acos x
1301Return the arccosine of @var{x}.
1302@end deffn
1303
1304@c begin (texi-doc-string "guile" "$atan")
1305@deffn {Scheme Procedure} $atan x
1306Return the arctangent of @var{x} in the range @minus{}@math{PI/2} to
1307@math{PI/2}.
1308@end deffn
1309
1310@deffn {Scheme Procedure} $atan2 x y
1311@deffnx {C Function} scm_sys_atan2 (x, y)
1312Return the arc tangent of the two arguments @var{x} and
1313@var{y}. This is similar to calculating the arc tangent of
1314@var{x} / @var{y}, except that the signs of both arguments
1315are used to determine the quadrant of the result. This
1316procedure does not accept complex arguments.
1317@end deffn
1318
1319@c begin (texi-doc-string "guile" "$exp")
1320@deffn {Scheme Procedure} $exp x
1321Return e to the power of @var{x}, where e is the base of natural
1322logarithms (2.71828@dots{}).
1323@end deffn
1324
1325@c begin (texi-doc-string "guile" "$log")
1326@deffn {Scheme Procedure} $log x
1327Return the natural logarithm of @var{x}.
1328@end deffn
1329
1330@c begin (texi-doc-string "guile" "$sinh")
1331@deffn {Scheme Procedure} $sinh x
1332Return the hyperbolic sine of @var{x}.
1333@end deffn
1334
1335@c begin (texi-doc-string "guile" "$cosh")
1336@deffn {Scheme Procedure} $cosh x
1337Return the hyperbolic cosine of @var{x}.
1338@end deffn
1339
1340@c begin (texi-doc-string "guile" "$tanh")
1341@deffn {Scheme Procedure} $tanh x
1342Return the hyperbolic tangent of @var{x}.
1343@end deffn
1344
1345@c begin (texi-doc-string "guile" "$asinh")
1346@deffn {Scheme Procedure} $asinh x
1347Return the hyperbolic arcsine of @var{x}.
1348@end deffn
1349
1350@c begin (texi-doc-string "guile" "$acosh")
1351@deffn {Scheme Procedure} $acosh x
1352Return the hyperbolic arccosine of @var{x}.
1353@end deffn
1354
1355@c begin (texi-doc-string "guile" "$atanh")
1356@deffn {Scheme Procedure} $atanh x
1357Return the hyperbolic arctangent of @var{x}.
1358@end deffn
1359
C functions for the above are provided by the standard mathematics
library. Naturally these take @code{double} arguments and return
@code{double} values
(@pxref{Mathematics,,, libc, GNU C Library Reference Manual}).
1363
1364@multitable {xx} {Scheme Procedure} {C Function}
1365@item @tab Scheme Procedure @tab C Function
1366
1367@item @tab @code{$abs} @tab @code{fabs}
1368@item @tab @code{$sqrt} @tab @code{sqrt}
1369@item @tab @code{$sin} @tab @code{sin}
1370@item @tab @code{$cos} @tab @code{cos}
1371@item @tab @code{$tan} @tab @code{tan}
1372@item @tab @code{$asin} @tab @code{asin}
1373@item @tab @code{$acos} @tab @code{acos}
1374@item @tab @code{$atan} @tab @code{atan}
1375@item @tab @code{$atan2} @tab @code{atan2}
1376@item @tab @code{$exp} @tab @code{exp}
1377@item @tab @code{$expt} @tab @code{pow}
1378@item @tab @code{$log} @tab @code{log}
1379@item @tab @code{$sinh} @tab @code{sinh}
1380@item @tab @code{$cosh} @tab @code{cosh}
1381@item @tab @code{$tanh} @tab @code{tanh}
1382@item @tab @code{$asinh} @tab @code{asinh}
1383@item @tab @code{$acosh} @tab @code{acosh}
1384@item @tab @code{$atanh} @tab @code{atanh}
1385@end multitable
1386
1387@code{asinh}, @code{acosh} and @code{atanh} are C99 standard but might
1388not be available on older systems. Guile provides the following
1389equivalents (on all systems).
1390
1391@deftypefn {C Function} double scm_asinh (double x)
1392@deftypefnx {C Function} double scm_acosh (double x)
1393@deftypefnx {C Function} double scm_atanh (double x)
1394Return the hyperbolic arcsine, arccosine or arctangent of @var{x}
1395respectively.
1396@end deftypefn
1397
1398
1399@node Bitwise Operations
1400@subsubsection Bitwise Operations
1401
1402For the following bitwise functions, negative numbers are treated as
1403infinite precision twos-complements. For instance @math{-6} is bits
1404@math{@dots{}111010}, with infinitely many ones on the left. It can
1405be seen that adding 6 (binary 110) to such a bit pattern gives all
1406zeros.
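
The twos-complement view can be made visible by masking a negative
number with a finite run of one bits. As a small sketch, the low eight
bits of @math{-6} are:

@lisp
(+ -6 6)                                     @result{} 0
(number->string (logand -6 #b11111111) 2)    @result{} "11111010"
(number->string (logand -1 #b1011) 2)        @result{} "1011"
@end lisp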
1407
1408@deffn {Scheme Procedure} logand n1 n2 @dots{}
1409@deffnx {C Function} scm_logand (n1, n2)
1410Return the bitwise @sc{and} of the integer arguments.
1411
1412@lisp
1413(logand) @result{} -1
1414(logand 7) @result{} 7
1415(logand #b111 #b011 #b001) @result{} 1
1416@end lisp
1417@end deffn
1418
1419@deffn {Scheme Procedure} logior n1 n2 @dots{}
1420@deffnx {C Function} scm_logior (n1, n2)
1421Return the bitwise @sc{or} of the integer arguments.
1422
1423@lisp
1424(logior) @result{} 0
1425(logior 7) @result{} 7
1426(logior #b000 #b001 #b011) @result{} 3
1427@end lisp
1428@end deffn
1429
1430@deffn {Scheme Procedure} logxor n1 n2 @dots{}
@deffnx {C Function} scm_logxor (n1, n2)
1432Return the bitwise @sc{xor} of the integer arguments. A bit is
1433set in the result if it is set in an odd number of arguments.
1434
1435@lisp
1436(logxor) @result{} 0
1437(logxor 7) @result{} 7
1438(logxor #b000 #b001 #b011) @result{} 2
1439(logxor #b000 #b001 #b011 #b011) @result{} 1
1440@end lisp
1441@end deffn
1442
1443@deffn {Scheme Procedure} lognot n
1444@deffnx {C Function} scm_lognot (n)
Return the integer which is the ones-complement of the integer
argument, i.e.@: each 0 bit is changed to 1 and each 1 bit to 0.
1447
1448@lisp
1449(number->string (lognot #b10000000) 2)
1450 @result{} "-10000001"
1451(number->string (lognot #b0) 2)
1452 @result{} "-1"
1453@end lisp
1454@end deffn
1455
1456@deffn {Scheme Procedure} logtest j k
1457@deffnx {C Function} scm_logtest (j, k)
1458@lisp
1459(logtest j k) @equiv{} (not (zero? (logand j k)))
1460
1461(logtest #b0100 #b1011) @result{} #f
1462(logtest #b0100 #b0111) @result{} #t
1463@end lisp
1464@end deffn
1465
1466@deffn {Scheme Procedure} logbit? index j
1467@deffnx {C Function} scm_logbit_p (index, j)
1468@lisp
1469(logbit? index j) @equiv{} (logtest (integer-expt 2 index) j)
1470
1471(logbit? 0 #b1101) @result{} #t
1472(logbit? 1 #b1101) @result{} #f
1473(logbit? 2 #b1101) @result{} #t
1474(logbit? 3 #b1101) @result{} #t
1475(logbit? 4 #b1101) @result{} #f
1476@end lisp
1477@end deffn
1478
1479@deffn {Scheme Procedure} ash n cnt
1480@deffnx {C Function} scm_ash (n, cnt)
1481Return @var{n} shifted left by @var{cnt} bits, or shifted right if
1482@var{cnt} is negative. This is an ``arithmetic'' shift.
1483
1484This is effectively a multiplication by @m{2^{cnt}, 2^@var{cnt}}, and
1485when @var{cnt} is negative it's a division, rounded towards negative
1486infinity. (Note that this is not the same rounding as @code{quotient}
1487does.)
1488
1489With @var{n} viewed as an infinite precision twos complement,
1490@code{ash} means a left shift introducing zero bits, or a right shift
1491dropping bits.
1492
1493@lisp
1494(number->string (ash #b1 3) 2) @result{} "1000"
1495(number->string (ash #b1010 -1) 2) @result{} "101"
1496
1497;; -23 is bits ...11101001, -6 is bits ...111010
1498(ash -23 -2) @result{} -6
1499@end lisp
1500@end deffn
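
The difference in rounding between a right shift and @code{quotient}
only shows up for negative numbers. The following sketch compares the
two on the same values:

@lisp
(ash 3 4)            @result{} 48
(* 3 (expt 2 4))     @result{} 48

(ash 23 -2)          @result{} 5
(quotient 23 4)      @result{} 5

(ash -23 -2)         @result{} -6
(quotient -23 4)     @result{} -5
@end lisp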
1501
1502@deffn {Scheme Procedure} logcount n
1503@deffnx {C Function} scm_logcount (n)
Return the number of bits in integer @var{n}. If @var{n} is
positive, the 1-bits in its binary representation are counted.
If negative, the 0-bits in its two's-complement binary
representation are counted. If 0, 0 is returned.
1508
1509@lisp
1510(logcount #b10101010)
1511 @result{} 4
1512(logcount 0)
1513 @result{} 0
1514(logcount -2)
1515 @result{} 1
1516@end lisp
1517@end deffn
1518
1519@deffn {Scheme Procedure} integer-length n
1520@deffnx {C Function} scm_integer_length (n)
1521Return the number of bits necessary to represent @var{n}.
1522
1523For positive @var{n} this is how many bits to the most significant one
1524bit. For negative @var{n} it's how many bits to the most significant
1525zero bit in twos complement form.
1526
1527@lisp
1528(integer-length #b10101010) @result{} 8
1529(integer-length #b1111) @result{} 4
1530(integer-length 0) @result{} 0
1531(integer-length -1) @result{} 0
1532(integer-length -256) @result{} 8
1533(integer-length -257) @result{} 9
1534@end lisp
1535@end deffn
1536
1537@deffn {Scheme Procedure} integer-expt n k
1538@deffnx {C Function} scm_integer_expt (n, k)
1539Return @var{n} raised to the non-negative integer exponent
1540@var{k}.
1541
1542@lisp
1543(integer-expt 2 5)
1544 @result{} 32
1545(integer-expt -3 3)
1546 @result{} -27
1547@end lisp
1548@end deffn
1549
1550@deffn {Scheme Procedure} bit-extract n start end
1551@deffnx {C Function} scm_bit_extract (n, start, end)
1552Return the integer composed of the @var{start} (inclusive)
1553through @var{end} (exclusive) bits of @var{n}. The
1554@var{start}th bit becomes the 0-th bit in the result.
1555
1556@lisp
1557(number->string (bit-extract #b1101101010 0 4) 2)
1558 @result{} "1010"
1559(number->string (bit-extract #b1101101010 4 9) 2)
1560 @result{} "10110"
1561@end lisp
1562@end deffn
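
One way to think about @code{bit-extract} is as a right shift followed
by a mask. The sketch below expresses that reading for non-negative
@var{n}; @code{shift-and-mask} is a hypothetical name used only for
this illustration, not a Guile procedure.

@lisp
(define (shift-and-mask n start end)
  ;; drop the low START bits, then keep the next (END - START) bits
  (logand (ash n (- start))
          (- (ash 1 (- end start)) 1)))

(shift-and-mask #b1101101010 0 4)    @result{} 10
(shift-and-mask #b1101101010 4 9)    @result{} 22
(number->string 22 2)                @result{} "10110"
@end lisp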
1563
1564
1565@node Random
1566@subsubsection Random Number Generation
1567
Pseudo-random numbers are generated from a random state object, which
can be created with @code{seed->random-state}. The @var{state}
parameter to the various functions below is optional; it defaults to
the state object in the @code{*random-state*} variable.
1572
1573@deffn {Scheme Procedure} copy-random-state [state]
1574@deffnx {C Function} scm_copy_random_state (state)
1575Return a copy of the random state @var{state}.
1576@end deffn
1577
1578@deffn {Scheme Procedure} random n [state]
1579@deffnx {C Function} scm_random (n, state)
1580Return a number in [0, @var{n}).
1581
Accepts a positive integer or real @var{n} and returns a
number of the same type between zero (inclusive) and
@var{n} (exclusive). The values returned have a uniform
distribution.
1586@end deffn
1587
1588@deffn {Scheme Procedure} random:exp [state]
1589@deffnx {C Function} scm_random_exp (state)
1590Return an inexact real in an exponential distribution with mean
15911. For an exponential distribution with mean @var{u} use @code{(*
1592@var{u} (random:exp))}.
1593@end deffn
1594
1595@deffn {Scheme Procedure} random:hollow-sphere! vect [state]
1596@deffnx {C Function} scm_random_hollow_sphere_x (vect, state)
1597Fills @var{vect} with inexact real random numbers the sum of whose
1598squares is equal to 1.0. Thinking of @var{vect} as coordinates in
1599space of dimension @var{n} @math{=} @code{(vector-length @var{vect})},
1600the coordinates are uniformly distributed over the surface of the unit
@var{n}-sphere.
1602@end deffn
1603
1604@deffn {Scheme Procedure} random:normal [state]
1605@deffnx {C Function} scm_random_normal (state)
1606Return an inexact real in a normal distribution. The distribution
1607used has mean 0 and standard deviation 1. For a normal distribution
1608with mean @var{m} and standard deviation @var{d} use @code{(+ @var{m}
1609(* @var{d} (random:normal)))}.
1610@end deffn
1611
1612@deffn {Scheme Procedure} random:normal-vector! vect [state]
1613@deffnx {C Function} scm_random_normal_vector_x (vect, state)
1614Fills @var{vect} with inexact real random numbers that are
1615independent and standard normally distributed
1616(i.e., with mean 0 and variance 1).
1617@end deffn
1618
1619@deffn {Scheme Procedure} random:solid-sphere! vect [state]
1620@deffnx {C Function} scm_random_solid_sphere_x (vect, state)
1621Fills @var{vect} with inexact real random numbers the sum of whose
1622squares is less than 1.0. Thinking of @var{vect} as coordinates in
1623space of dimension @var{n} @math{=} @code{(vector-length @var{vect})},
1624the coordinates are uniformly distributed within the unit
1625@var{n}-sphere. The sum of the squares of the numbers is returned.
1626@c FIXME: What does this mean, particularly the n-sphere part?
1627@end deffn
1628
1629@deffn {Scheme Procedure} random:uniform [state]
1630@deffnx {C Function} scm_random_uniform (state)
1631Return a uniformly distributed inexact real random number in
1632[0,1).
1633@end deffn
1634
1635@deffn {Scheme Procedure} seed->random-state seed
1636@deffnx {C Function} scm_seed_to_random_state (seed)
1637Return a new random state using @var{seed}.
1638@end deffn
1639
1640@defvar *random-state*
1641The global random state used by the above functions when the
1642@var{state} parameter is not given.
1643@end defvar
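
An explicit state object is useful when a sequence must be
reproducible. The sketch below assumes that two states created from
the same seed, or a state and its copy, produce the same sequence; the
seed 42 and the variable names are arbitrary choices for illustration.

@lisp
(define s1 (seed->random-state 42))
(define s2 (seed->random-state 42))

;; same seed, same first number
(= (random 1000 s1) (random 1000 s2))    @result{} #t

;; a copy continues from the same point as the original
(define s3 (copy-random-state s1))
(= (random 1000 s1) (random 1000 s3))    @result{} #t

;; the result has the same type as the argument
(exact? (random 10))                     @result{} #t
(inexact? (random 1.0))                  @result{} #t
@end lisp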
1644
1645
1646@node Characters
1647@subsection Characters
1648@tpindex Characters
1649
1650@noindent
1651[@strong{FIXME}: how do you specify regular (non-control) characters?]
1652
1653Most of the ``control characters'' (those below codepoint 32) in the
1654@acronym{ASCII} character set, as well as the space, may be referred
1655to by name: for example, @code{#\tab}, @code{#\esc}, @code{#\stx}, and
1656so on. The following table describes the @acronym{ASCII} names for
1657each character.
1658
1659@multitable @columnfractions .25 .25 .25 .25
1660@item 0 = @code{#\nul}
1661 @tab 1 = @code{#\soh}
1662 @tab 2 = @code{#\stx}
1663 @tab 3 = @code{#\etx}
1664@item 4 = @code{#\eot}
1665 @tab 5 = @code{#\enq}
1666 @tab 6 = @code{#\ack}
1667 @tab 7 = @code{#\bel}
1668@item 8 = @code{#\bs}
1669 @tab 9 = @code{#\ht}
1670 @tab 10 = @code{#\nl}
1671 @tab 11 = @code{#\vt}
1672@item 12 = @code{#\np}
1673 @tab 13 = @code{#\cr}
1674 @tab 14 = @code{#\so}
1675 @tab 15 = @code{#\si}
1676@item 16 = @code{#\dle}
1677 @tab 17 = @code{#\dc1}
1678 @tab 18 = @code{#\dc2}
1679 @tab 19 = @code{#\dc3}
1680@item 20 = @code{#\dc4}
1681 @tab 21 = @code{#\nak}
1682 @tab 22 = @code{#\syn}
1683 @tab 23 = @code{#\etb}
1684@item 24 = @code{#\can}
1685 @tab 25 = @code{#\em}
1686 @tab 26 = @code{#\sub}
1687 @tab 27 = @code{#\esc}
1688@item 28 = @code{#\fs}
1689 @tab 29 = @code{#\gs}
1690 @tab 30 = @code{#\rs}
1691 @tab 31 = @code{#\us}
1692@item 32 = @code{#\sp}
1693@end multitable
1694
1695The ``delete'' character (octal 177) may be referred to with the name
1696@code{#\del}.
1697
1698Several characters have more than one name:
1699
1700@multitable {@code{#\backspace}} {Original}
1701@item Alias @tab Original
1702@item @code{#\space} @tab @code{#\sp}
1703@item @code{#\newline} @tab @code{#\nl}
1704@item @code{#\tab} @tab @code{#\ht}
1705@item @code{#\backspace} @tab @code{#\bs}
1706@item @code{#\return} @tab @code{#\cr}
1707@item @code{#\page} @tab @code{#\np}
1708@item @code{#\null} @tab @code{#\nul}
1709@end multitable
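
As a small sketch of how the names in the tables above relate to one
another, an alias and its original name denote the same character, and
@code{char->integer} recovers the codepoint listed in the table:

@lisp
(char=? #\tab #\ht)           @result{} #t
(char=? #\space #\sp)         @result{} #t
(char->integer #\newline)     @result{} 10
(char->integer #\esc)         @result{} 27
(char->integer #\del)         @result{} 127
@end lisp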
1710
1711@rnindex char?
1712@deffn {Scheme Procedure} char? x
1713@deffnx {C Function} scm_char_p (x)
1714Return @code{#t} iff @var{x} is a character, else @code{#f}.
1715@end deffn
1716
1717@rnindex char=?
1718@deffn {Scheme Procedure} char=? x y
1719Return @code{#t} iff @var{x} is the same character as @var{y}, else @code{#f}.
1720@end deffn
1721
1722@rnindex char<?
1723@deffn {Scheme Procedure} char<? x y
1724Return @code{#t} iff @var{x} is less than @var{y} in the @acronym{ASCII} sequence,
1725else @code{#f}.
1726@end deffn
1727
1728@rnindex char<=?
1729@deffn {Scheme Procedure} char<=? x y
1730Return @code{#t} iff @var{x} is less than or equal to @var{y} in the
1731@acronym{ASCII} sequence, else @code{#f}.
1732@end deffn
1733
1734@rnindex char>?
1735@deffn {Scheme Procedure} char>? x y
1736Return @code{#t} iff @var{x} is greater than @var{y} in the @acronym{ASCII}
1737sequence, else @code{#f}.
1738@end deffn
1739
1740@rnindex char>=?
1741@deffn {Scheme Procedure} char>=? x y
1742Return @code{#t} iff @var{x} is greater than or equal to @var{y} in the
1743@acronym{ASCII} sequence, else @code{#f}.
1744@end deffn
1745
1746@rnindex char-ci=?
1747@deffn {Scheme Procedure} char-ci=? x y
1748Return @code{#t} iff @var{x} is the same character as @var{y} ignoring
1749case, else @code{#f}.
1750@end deffn
1751
1752@rnindex char-ci<?
1753@deffn {Scheme Procedure} char-ci<? x y
1754Return @code{#t} iff @var{x} is less than @var{y} in the @acronym{ASCII} sequence
1755ignoring case, else @code{#f}.
1756@end deffn
1757
1758@rnindex char-ci<=?
1759@deffn {Scheme Procedure} char-ci<=? x y
1760Return @code{#t} iff @var{x} is less than or equal to @var{y} in the
1761@acronym{ASCII} sequence ignoring case, else @code{#f}.
1762@end deffn
1763
1764@rnindex char-ci>?
1765@deffn {Scheme Procedure} char-ci>? x y
1766Return @code{#t} iff @var{x} is greater than @var{y} in the @acronym{ASCII}
1767sequence ignoring case, else @code{#f}.
1768@end deffn
1769
1770@rnindex char-ci>=?
1771@deffn {Scheme Procedure} char-ci>=? x y
1772Return @code{#t} iff @var{x} is greater than or equal to @var{y} in the
1773@acronym{ASCII} sequence ignoring case, else @code{#f}.
1774@end deffn
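
The effect of ignoring case is easiest to see when comparing a
lowercase letter with an uppercase one, since all uppercase letters
precede the lowercase letters in the @acronym{ASCII} sequence. A short
sketch:

@lisp
(char<? #\a #\b)        @result{} #t
(char<? #\a #\B)        @result{} #f
(char-ci<? #\a #\B)     @result{} #t
(char=? #\a #\A)        @result{} #f
(char-ci=? #\a #\A)     @result{} #t
@end lisp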
1775
1776@rnindex char-alphabetic?
1777@deffn {Scheme Procedure} char-alphabetic? chr
1778@deffnx {C Function} scm_char_alphabetic_p (chr)
1779Return @code{#t} iff @var{chr} is alphabetic, else @code{#f}.
1780Alphabetic means the same thing as the @code{isalpha} C library function.
1781@end deffn
1782
1783@rnindex char-numeric?
1784@deffn {Scheme Procedure} char-numeric? chr
1785@deffnx {C Function} scm_char_numeric_p (chr)
1786Return @code{#t} iff @var{chr} is numeric, else @code{#f}.
1787Numeric means the same thing as the @code{isdigit} C library function.
1788@end deffn
1789
1790@rnindex char-whitespace?
1791@deffn {Scheme Procedure} char-whitespace? chr
1792@deffnx {C Function} scm_char_whitespace_p (chr)
1793Return @code{#t} iff @var{chr} is whitespace, else @code{#f}.
1794Whitespace means the same thing as the @code{isspace} C library function.
1795@end deffn
1796
1797@rnindex char-upper-case?
1798@deffn {Scheme Procedure} char-upper-case? chr
1799@deffnx {C Function} scm_char_upper_case_p (chr)
1800Return @code{#t} iff @var{chr} is uppercase, else @code{#f}.
1801Uppercase means the same thing as the @code{isupper} C library function.
1802@end deffn
1803
1804@rnindex char-lower-case?
1805@deffn {Scheme Procedure} char-lower-case? chr
1806@deffnx {C Function} scm_char_lower_case_p (chr)
1807Return @code{#t} iff @var{chr} is lowercase, else @code{#f}.
1808Lowercase means the same thing as the @code{islower} C library function.
1809@end deffn
1810
1811@deffn {Scheme Procedure} char-is-both? chr
1812@deffnx {C Function} scm_char_is_both_p (chr)
1813Return @code{#t} iff @var{chr} is either uppercase or lowercase, else
1814@code{#f}. Uppercase and lowercase are as defined by the
1815@code{isupper} and @code{islower} C library functions.
1816@end deffn
1817
1818@rnindex char->integer
1819@deffn {Scheme Procedure} char->integer chr
1820@deffnx {C Function} scm_char_to_integer (chr)
1821Return the number corresponding to the ordinal position of @var{chr} in the
1822@acronym{ASCII} sequence.
1823@end deffn
1824
1825@rnindex integer->char
1826@deffn {Scheme Procedure} integer->char n
1827@deffnx {C Function} scm_integer_to_char (n)
1828Return the character at position @var{n} in the @acronym{ASCII} sequence.
1829@end deffn
1830
1831@rnindex char-upcase
1832@deffn {Scheme Procedure} char-upcase chr
1833@deffnx {C Function} scm_char_upcase (chr)
1834Return the uppercase character version of @var{chr}.
1835@end deffn
1836
1837@rnindex char-downcase
1838@deffn {Scheme Procedure} char-downcase chr
1839@deffnx {C Function} scm_char_downcase (chr)
1840Return the lowercase character version of @var{chr}.
1841@end deffn
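
The conversion, case-mapping and classification procedures above can
be combined as in the following sketch. It assumes plain
@acronym{ASCII} letters and digits; the particular characters are
arbitrary and chosen only for illustration.

@lisp
(char->integer #\A)      ; @result{} 65
(integer->char 97)       ; @result{} #\a
(char-upcase #\a)        ; @result{} #\A
(char-downcase #\A)      ; @result{} #\a
(char-ci<? #\a #\B)      ; @result{} #t
(char-alphabetic? #\5)   ; @result{} #f
(char-numeric? #\5)      ; @result{} #t
@end lisp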
1842
1843@xref{Classification of Characters,,,libc,GNU C Library Reference
1844Manual}, for information about the @code{is*} Standard C functions
1845mentioned above.
1846
1847
1848@node Strings
1849@subsection Strings
1850@tpindex Strings
1851
1852Strings are fixed-length sequences of characters. They can be created
1853by calling constructor procedures, but they can also be entered as
1854literals at the @acronym{REPL} or in Scheme source files.
1855
1856@c Guile provides a rich set of string processing procedures, because text
1857@c handling is very important when Guile is used as a scripting language.
1858
1859Strings always carry the information about how many characters they are
1860composed of with them, so there is no special end-of-string character,
1861like in C. That means that Scheme strings can contain any character,
c48c62d0
MV
1862even the @samp{#\nul} character @samp{\0}.
1863
1864To use strings efficiently, you need to know a bit about how Guile
1865implements them. In Guile, a string consists of two parts, a head and
1866the actual memory where the characters are stored. When a string (or
1867a substring of it) is copied, only a new head gets created, the memory
1868is usually not copied. The two heads start out pointing to the same
1869memory.
1870
1871When one of these two strings is modified, as with @code{string-set!},
1872their common memory does get copied so that each string has its own
1873memory and modifying one does not accidentally modify the other as well.
1874Thus, Guile's strings are `copy on write'; the actual copying of their
1875memory is delayed until one string is written to.
1876
1877This implementation makes functions like @code{substring} very
1878efficient in the common case that no modifications are done to the
1879involved strings.
1880
1881If you do know that your strings are getting modified right away, you
1882can use @code{substring/copy} instead of @code{substring}. This
1883function performs the copy immediately at the time of creation. This
1884is more efficient, especially in a multi-threaded program. Also,
1885@code{substring/copy} can avoid the problem that a short substring
1886holds on to the memory of a very large original string that could
1887otherwise be recycled.
1888
1889If you want to avoid the copy altogether, so that modifications of one
1890string show up in the other, you can use @code{substring/shared}. The
1891strings created by this procedure are called @dfn{mutation sharing
1892substrings} since the substring and the original string share
1893modifications to each other.
07d83abe
MV
1894
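The three sharing behaviours described above can be compared side by
side. This is only a minimal sketch: the variable names are arbitrary,
and @code{string-copy} is used here merely so that the string being
modified is not a literal.

@lisp
(define str    (string-copy "hello world"))
(define cow    (substring str 0 5))          ; copy on write
(define shared (substring/shared str 6 11))  ; mutation sharing
(define copy   (substring/copy str 0 5))     ; copied immediately

(string-set! str 0 #\J)
(string-set! str 6 #\W)

str     ; @result{} "Jello World"
cow     ; @result{} "hello"
shared  ; @result{} "World"
copy    ; @result{} "hello"
@end lisp

Only the string made with @code{substring/shared} sees the change to
@var{str}; the other two keep the characters they had when they were
created.
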
1895@menu
1896* String Syntax:: Read syntax for strings.
1897* String Predicates:: Testing strings for certain properties.
1898* String Constructors:: Creating new string objects.
1899* List/String Conversion:: Converting from/to lists of characters.
1900* String Selection:: Select portions from strings.
1901* String Modification:: Modify parts or whole strings.
1902* String Comparison:: Lexicographic ordering predicates.
1903* String Searching:: Searching in strings.
1904* Alphabetic Case Mapping:: Convert the alphabetic case of strings.
1905* Appending Strings:: Appending strings to form a new string.
91210d62 1906* Conversion to/from C::
07d83abe
MV
1907@end menu
1908
1909@node String Syntax
1910@subsubsection String Read Syntax
1911
1912@c In the following @code is used to get a good font in TeX etc, but
1913@c is omitted for Info format, so as not to risk any confusion over
1914@c whether surrounding ` ' quotes are part of the escape or are
1915@c special in a string (they're not).
1916
1917The read syntax for strings is an arbitrarily long sequence of
c48c62d0 1918characters enclosed in double quotes (@nicode{"}).
07d83abe
MV
1919
1920Backslash is an escape character and can be used to insert the
1921following special characters. @nicode{\"} and @nicode{\\} are R5RS
1922standard; the rest are Guile extensions. Note that they follow C string
1923syntax.
1924
1925@table @asis
1926@item @nicode{\\}
1927Backslash character.
1928
1929@item @nicode{\"}
1930Double quote character (an unescaped @nicode{"} is otherwise the end
1931of the string).
1932
1933@item @nicode{\0}
1934NUL character (ASCII 0).
1935
1936@item @nicode{\a}
1937Bell character (ASCII 7).
1938
1939@item @nicode{\f}
1940Formfeed character (ASCII 12).
1941
1942@item @nicode{\n}
1943Newline character (ASCII 10).
1944
1945@item @nicode{\r}
1946Carriage return character (ASCII 13).
1947
1948@item @nicode{\t}
1949Tab character (ASCII 9).
1950
1951@item @nicode{\v}
1952Vertical tab character (ASCII 11).
1953
1954@item @nicode{\xHH}
1955Character code given by two hexadecimal digits. For example
1956@nicode{\x7f} for an ASCII DEL (127).
1957@end table
1958
1959@noindent
1960The following are examples of string literals:
1961
1962@lisp
1963"foo"
1964"bar plonk"
1965"Hello World"
1966"\"Hi\", he said."
1967@end lisp
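
The effect of the escapes can be checked by looking at the characters
that end up in the string. The following sketch uses only escapes from
the table above; the strings themselves are arbitrary.

@lisp
(string-length "a\tb")                  ; @result{} 3
(string->list "a\nb")                   ; @result{} (#\a #\newline #\b)
(char->integer (string-ref "\x7f" 0))   ; @result{} 127
(string-length "\"Hi\"")                ; @result{} 4
@end lisp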
1968
1969
1970@node String Predicates
1971@subsubsection String Predicates
1972
1973The following procedures can be used to check whether a given string
1974fulfills some specified property.
1975
1976@rnindex string?
1977@deffn {Scheme Procedure} string? obj
1978@deffnx {C Function} scm_string_p (obj)
1979Return @code{#t} if @var{obj} is a string, else @code{#f}.
1980@end deffn
1981
91210d62
MV
1982@deftypefn {C Function} int scm_is_string (SCM obj)
1983Returns @code{1} if @var{obj} is a string, @code{0} otherwise.
1984@end deftypefn
1985
07d83abe
MV
1986@deffn {Scheme Procedure} string-null? str
1987@deffnx {C Function} scm_string_null_p (str)
1988Return @code{#t} if @var{str}'s length is zero, and
1989@code{#f} otherwise.
1990@lisp
1991(string-null? "") @result{} #t
1992y @result{} "foo"
1993(string-null? y) @result{} #f
1994@end lisp
1995@end deffn
1996
1997@node String Constructors
1998@subsubsection String Constructors
1999
2000The string constructor procedures create new string objects, possibly
c48c62d0
MV
2001initializing them with some specified character data.
2002@xref{String Selection}, for ways to create strings from existing
2003strings.
07d83abe
MV
2004
2005@c FIXME::martin: list->string belongs into `List/String Conversion'
2006
2007@rnindex string
2008@rnindex list->string
2009@deffn {Scheme Procedure} string . chrs
2010@deffnx {Scheme Procedure} list->string chrs
2011@deffnx {C Function} scm_string (chrs)
2012Return a newly allocated string composed of the arguments,
2013@var{chrs}.
2014@end deffn
2015
2016@rnindex make-string
2017@deffn {Scheme Procedure} make-string k [chr]
2018@deffnx {C Function} scm_make_string (k, chr)
2019Return a newly allocated string of
2020length @var{k}. If @var{chr} is given, then all elements of
2021the string are initialized to @var{chr}, otherwise the contents
2022of the string are unspecified.
2023@end deffn
2024
c48c62d0
MV
2025@deftypefn {C Function} SCM scm_c_make_string (size_t len, SCM chr)
2026Like @code{scm_make_string}, but expects the length as a
2027@code{size_t}.
2028@end deftypefn
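
As a short illustration of the Scheme-level constructors (the
characters and lengths are arbitrary):

@lisp
(string #\a #\b #\c)              ; @result{} "abc"
(list->string (list #\a #\b))     ; @result{} "ab"
(make-string 3 #\x)               ; @result{} "xxx"
(string-length (make-string 4))   ; @result{} 4
@end lisp

In the last call no fill character is given, so only the length of the
result can be relied upon.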
2029
07d83abe
MV
2030@node List/String Conversion
2031@subsubsection List/String conversion
2032
2033When processing strings, it is often convenient to first convert them
2034into a list representation by using the procedure @code{string->list},
2035work with the resulting list, and then convert it back into a string.
2036These procedures are useful for similar tasks.
2037
2038@rnindex string->list
2039@deffn {Scheme Procedure} string->list str
2040@deffnx {C Function} scm_string_to_list (str)
2041Return a newly allocated list of the characters that make up
2042the given string @var{str}. @code{string->list} and
2043@code{list->string} are inverses as far as @samp{equal?} is
2044concerned.
2045@end deffn
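
The convert, process, convert-back pattern mentioned at the start of
this section might look like the following sketch, which reverses a
string by way of a list. The use of @code{reverse} is just one possible
list operation.

@lisp
(string->list "abc")
; @result{} (#\a #\b #\c)

(list->string (reverse (string->list "hello")))
; @result{} "olleh"

(equal? "abc" (list->string (string->list "abc")))
; @result{} #t
@end lisp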
2046
2047@deffn {Scheme Procedure} string-split str chr
2048@deffnx {C Function} scm_string_split (str, chr)
2049Split the string @var{str} into a list of the substrings delimited
2050by appearances of the character @var{chr}. Note that an empty substring
2051between separator characters will result in an empty string in the
2052result list.
2053
2054@lisp
2055(string-split "root:x:0:0:root:/root:/bin/bash" #\:)
2056@result{}
2057("root" "x" "0" "0" "root" "/root" "/bin/bash")
2058
2059(string-split "::" #\:)
2060@result{}
2061("" "" "")
2062
2063(string-split "" #\:)
2064@result{}
2065("")
2066@end lisp
2067@end deffn
2068
2069
2070@node String Selection
2071@subsubsection String Selection
2072
2073Portions of strings can be extracted by these procedures.
2074@code{string-ref} delivers individual characters whereas
2075@code{substring} can be used to extract substrings from longer strings.
2076
2077@rnindex string-length
2078@deffn {Scheme Procedure} string-length string
2079@deffnx {C Function} scm_string_length (string)
2080Return the number of characters in @var{string}.
2081@end deffn
2082
c48c62d0
MV
2083@deftypefn {C Function} size_t scm_c_string_length (SCM str)
2084Return the number of characters in @var{str} as a @code{size_t}.
2085@end deftypefn
2086
07d83abe
MV
2087@rnindex string-ref
2088@deffn {Scheme Procedure} string-ref str k
2089@deffnx {C Function} scm_string_ref (str, k)
2090Return character @var{k} of @var{str} using zero-origin
2091indexing. @var{k} must be a valid index of @var{str}.
2092@end deffn
2093
c48c62d0
MV
2094@deftypefn {C Function} SCM scm_c_string_ref (SCM str, size_t k)
2095Return character @var{k} of @var{str} using zero-origin
2096indexing. @var{k} must be a valid index of @var{str}.
2097@end deftypefn
2098
07d83abe
MV
2099@rnindex string-copy
2100@deffn {Scheme Procedure} string-copy str
2101@deffnx {C Function} scm_string_copy (str)
c48c62d0
MV
2102Return a copy of the given @var{str}.
2103
2104The returned string shares storage with @var{str} initially, but it is
2105copied as soon as one of the two strings is modified.
07d83abe
MV
2106@end deffn
2107
2108@rnindex substring
2109@deffn {Scheme Procedure} substring str start [end]
2110@deffnx {C Function} scm_substring (str, start, end)
c48c62d0 2111Return a new string formed from the characters
07d83abe
MV
2112of @var{str} beginning with index @var{start} (inclusive) and
2113ending with index @var{end} (exclusive).
2114@var{str} must be a string, @var{start} and @var{end} must be
2115exact integers satisfying:
2116
21170 <= @var{start} <= @var{end} <= @code{(string-length @var{str})}.
c48c62d0
MV
2118
2119The returned string shares storage with @var{str} initially, but it is
2120copied as soon as one of the two strings is modified.
2121@end deffn
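
The index conventions (zero origin, inclusive @var{start}, exclusive
@var{end}) are easiest to see on a small example. The string and the
name @code{s} below are arbitrary.

@lisp
(define s "hello world")

(string-length s)    ; @result{} 11
(string-ref s 4)     ; @result{} #\o
(substring s 0 5)    ; @result{} "hello"
(substring s 6)      ; @result{} "world"
(substring s 3 3)    ; @result{} ""
@end lisp

When @var{end} is omitted the substring extends to the end of the
string, and equal @var{start} and @var{end} give an empty string.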
2122
2123@deffn {Scheme Procedure} substring/shared str start [end]
2124@deffnx {C Function} scm_substring_shared (str, start, end)
2125Like @code{substring}, but the strings continue to share their storage
2126even if they are modified. Thus, modifications to @var{str} show up
2127in the new string, and vice versa.
2128@end deffn
2129
2130@deffn {Scheme Procedure} substring/copy str start [end]
2131@deffnx {C Function} scm_substring_copy (str, start, end)
2132Like @code{substring}, but the storage for the new string is copied
2133immediately.
07d83abe
MV
2134@end deffn
2135
c48c62d0
MV
2136@deftypefn {C Function} SCM scm_c_substring (SCM str, size_t start, size_t end)
2137@deftypefnx {C Function} SCM scm_c_substring_shared (SCM str, size_t start, size_t end)
2138@deftypefnx {C Function} SCM scm_c_substring_copy (SCM str, size_t start, size_t end)
2139Like @code{scm_substring}, etc. but the bounds are given as a @code{size_t}.
2140@end deftypefn
2141
07d83abe
MV
2142@node String Modification
2143@subsubsection String Modification
2144
2145These procedures are for modifying strings in-place. This means that the
2146result of the operation is not a new string; instead, the original string's
2147memory representation is modified.
2148
2149@rnindex string-set!
2150@deffn {Scheme Procedure} string-set! str k chr
2151@deffnx {C Function} scm_string_set_x (str, k, chr)
2152Store @var{chr} in element @var{k} of @var{str} and return
2153an unspecified value. @var{k} must be a valid index of
2154@var{str}.
2155@end deffn
2156
c48c62d0
MV
2157@deftypefn {C Function} void scm_c_string_set_x (SCM str, size_t k, SCM chr)
2158Like @code{scm_string_set_x}, but the index is given as a @code{size_t}.
2159@end deftypefn
2160
07d83abe
MV
2161@rnindex string-fill!
2162@deffn {Scheme Procedure} string-fill! str chr
2163@deffnx {C Function} scm_string_fill_x (str, chr)
2164Store @var{chr} in every element of @var{str} and
2165return an unspecified value.
2166@end deffn
2167
2168@deffn {Scheme Procedure} substring-fill! str start end fill
2169@deffnx {C Function} scm_substring_fill_x (str, start, end, fill)
2170Change every character in @var{str} between @var{start} and
2171@var{end} to @var{fill}.
2172
2173@lisp
2174(define y "abcdefg")
2175(substring-fill! y 1 3 #\r)
2176y
2177@result{} "arrdefg"
2178@end lisp
2179@end deffn
2180
2181@deffn {Scheme Procedure} substring-move! str1 start1 end1 str2 start2
2182@deffnx {C Function} scm_substring_move_x (str1, start1, end1, str2, start2)
2183Copy the substring of @var{str1} bounded by @var{start1} and @var{end1}
2184into @var{str2} beginning at position @var{start2}.
2185@var{str1} and @var{str2} can be the same string.
2186@end deffn
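
The following sketch applies several of these procedures to one
buffer. The buffer is created with @code{make-string} so that it is
certainly safe to modify; the name @code{buf} and the fill characters
are arbitrary.

@lisp
(define buf (make-string 5 #\-))

(string-set! buf 0 #\a)
buf                                    ; @result{} "a----"

;; Copy characters 1 to 3 of the source ("bcd") into buf at index 1.
(substring-move! "abcdef" 1 4 buf 1)
buf                                    ; @result{} "abcd-"

(string-fill! buf #\z)
buf                                    ; @result{} "zzzzz"
@end lisp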
2187
2188
2189@node String Comparison
2190@subsubsection String Comparison
2191
2192The procedures in this section are similar to the character ordering
2193predicates (@pxref{Characters}), but are defined on character sequences.
2194They all return @code{#t} on success and @code{#f} on failure. The
2195predicates with @code{-ci} in their names ignore the character case when comparing
2196strings.
2197
2198
2199@rnindex string=?
2200@deffn {Scheme Procedure} string=? s1 s2
2201Lexicographic equality predicate; return @code{#t} if the two
2202strings are the same length and contain the same characters in
2203the same positions, otherwise return @code{#f}.
2204
2205The procedure @code{string-ci=?} treats upper and lower case
2206letters as though they were the same character, but
2207@code{string=?} treats upper and lower case as distinct
2208characters.
2209@end deffn
2210
2211@rnindex string<?
2212@deffn {Scheme Procedure} string<? s1 s2
2213Lexicographic ordering predicate; return @code{#t} if @var{s1}
2214is lexicographically less than @var{s2}.
2215@end deffn
2216
2217@rnindex string<=?
2218@deffn {Scheme Procedure} string<=? s1 s2
2219Lexicographic ordering predicate; return @code{#t} if @var{s1}
2220is lexicographically less than or equal to @var{s2}.
2221@end deffn
2222
2223@rnindex string>?
2224@deffn {Scheme Procedure} string>? s1 s2
2225Lexicographic ordering predicate; return @code{#t} if @var{s1}
2226is lexicographically greater than @var{s2}.
2227@end deffn
2228
2229@rnindex string>=?
2230@deffn {Scheme Procedure} string>=? s1 s2
2231Lexicographic ordering predicate; return @code{#t} if @var{s1}
2232is lexicographically greater than or equal to @var{s2}.
2233@end deffn
2234
2235@rnindex string-ci=?
2236@deffn {Scheme Procedure} string-ci=? s1 s2
2237Case-insensitive string equality predicate; return @code{#t} if
2238the two strings are the same length and their component
2239characters match (ignoring case) at each position; otherwise
2240return @code{#f}.
2241@end deffn
2242
2243@rnindex string-ci<?
2244@deffn {Scheme Procedure} string-ci<? s1 s2
2245Case insensitive lexicographic ordering predicate; return
2246@code{#t} if @var{s1} is lexicographically less than @var{s2}
2247regardless of case.
2248@end deffn
2249
2250@rnindex string-ci<=?
2251@deffn {Scheme Procedure} string-ci<=? s1 s2
2252Case insensitive lexicographic ordering predicate; return
2253@code{#t} if @var{s1} is lexicographically less than or equal
2254to @var{s2} regardless of case.
2255@end deffn
2256
2257@rnindex string-ci>?
2258@deffn {Scheme Procedure} string-ci>? s1 s2
2259Case insensitive lexicographic ordering predicate; return
2260@code{#t} if @var{s1} is lexicographically greater than
2261@var{s2} regardless of case.
2262@end deffn
2263
2264@rnindex string-ci>=?
2265@deffn {Scheme Procedure} string-ci>=? s1 s2
2266Case insensitive lexicographic ordering predicate; return
2267@code{#t} if @var{s1} is lexicographically greater than or
2268equal to @var{s2} regardless of case.
2269@end deffn
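
A few comparisons, given here as a sketch that assumes
@acronym{ASCII} ordering of the characters involved, show the
difference between the case-sensitive and case-insensitive predicates:

@lisp
(string=? "guile" "guile")      ; @result{} #t
(string=? "Guile" "guile")      ; @result{} #f
(string-ci=? "Guile" "guile")   ; @result{} #t
(string<? "apple" "banana")     ; @result{} #t
(string<? "abc" "abcd")         ; @result{} #t
(string<? "Zebra" "apple")      ; @result{} #t
(string-ci<? "Zebra" "apple")   ; @result{} #f
@end lisp

@code{"Zebra"} sorts before @code{"apple"} in the case-sensitive
comparison because uppercase letters precede lowercase letters in
@acronym{ASCII}.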
2270
2271
2272@node String Searching
2273@subsubsection String Searching
2274
2275When searching for the index of a character in a string, these
2276procedures can be used.
2277
2278@deffn {Scheme Procedure} string-index str chr [frm [to]]
2279@deffnx {C Function} scm_string_index (str, chr, frm, to)
2280Return the index of the first occurrence of @var{chr} in
2281@var{str}. The optional integer arguments @var{frm} and
2282@var{to} limit the search to a portion of the string. This
2283procedure essentially implements the @code{index} or
2284@code{strchr} functions from the C library.
2285
2286@lisp
2287(string-index "weiner" #\e)
2288@result{} 1
2289
2290(string-index "weiner" #\e 2)
2291@result{} 4
2292
2293(string-index "weiner" #\e 2 4)
2294@result{} #f
2295@end lisp
2296@end deffn
2297
2298@deffn {Scheme Procedure} string-rindex str chr [frm [to]]
2299@deffnx {C Function} scm_string_rindex (str, chr, frm, to)
2300Like @code{string-index}, but search from the right of the
2301string rather than from the left. This procedure essentially
2302implements the @code{rindex} or @code{strrchr} functions from
2303the C library.
2304
2305@lisp
2306(string-rindex "weiner" #\e)
2307@result{} 4
2308
2309(string-rindex "weiner" #\e 2 4)
2310@result{} #f
2311
2312(string-rindex "weiner" #\e 2 5)
2313@result{} 4
2314@end lisp
2315@end deffn
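
A common use of @code{string-index} is to cut a string at a separator.
The sketch below splits a @samp{name=value} line; the names
@code{line} and @code{pos} are illustrative only. Since the result is
@code{#f} when the character is absent, real code should test it before
passing it to @code{substring}.

@lisp
(define line "name=value")
(define pos (string-index line #\=))

pos                          ; @result{} 4
(substring line 0 pos)       ; @result{} "name"
(substring line (+ pos 1))   ; @result{} "value"
(string-index line #\:)      ; @result{} #f
@end lisp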
2316
2317@node Alphabetic Case Mapping
2318@subsubsection Alphabetic Case Mapping
2319
2320These are procedures for mapping strings to their upper- or lower-case
2321equivalents, respectively, or for capitalizing strings.
2322
2323@deffn {Scheme Procedure} string-upcase str
2324@deffnx {C Function} scm_string_upcase (str)
2325Return a freshly allocated string containing the characters of
2326@var{str} in upper case.
2327@end deffn
2328
2329@deffn {Scheme Procedure} string-upcase! str
2330@deffnx {C Function} scm_string_upcase_x (str)
2331Destructively upcase every character in @var{str} and return
2332@var{str}.
2333@lisp
2334y @result{} "arrdefg"
2335(string-upcase! y) @result{} "ARRDEFG"
2336y @result{} "ARRDEFG"
2337@end lisp
2338@end deffn
2339
2340@deffn {Scheme Procedure} string-downcase str
2341@deffnx {C Function} scm_string_downcase (str)
2342Return a freshly allocated string containing the characters of
2343@var{str} in lower case.
2344@end deffn
2345
2346@deffn {Scheme Procedure} string-downcase! str
2347@deffnx {C Function} scm_string_downcase_x (str)
2348Destructively downcase every character in @var{str} and return
2349@var{str}.
2350@lisp
2351y @result{} "ARRDEFG"
2352(string-downcase! y) @result{} "arrdefg"
2353y @result{} "arrdefg"
2354@end lisp
2355@end deffn
2356
2357@deffn {Scheme Procedure} string-capitalize str
2358@deffnx {C Function} scm_string_capitalize (str)
2359Return a freshly allocated string with the characters in
2360@var{str}, where the first character of every word is
2361capitalized.
2362@end deffn
2363
2364@deffn {Scheme Procedure} string-capitalize! str
2365@deffnx {C Function} scm_string_capitalize_x (str)
2366Upcase the first character of every word in @var{str}
2367destructively and return @var{str}.
2368
2369@lisp
2370y @result{} "hello world"
2371(string-capitalize! y) @result{} "Hello World"
2372y @result{} "Hello World"
2373@end lisp
2374@end deffn
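
In contrast to the destructive variants shown above, the procedures
without a trailing @samp{!} leave their argument untouched. A brief
sketch, with arbitrary strings:

@lisp
(define s "hello world")

(string-upcase s)           ; @result{} "HELLO WORLD"
(string-capitalize s)       ; @result{} "Hello World"
(string-downcase "MiXeD")   ; @result{} "mixed"
s                           ; @result{} "hello world"
@end lisp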
2375
2376
2377@node Appending Strings
2378@subsubsection Appending Strings
2379
2380The procedure @code{string-append} appends several strings together to
2381form a longer result string.
2382
2383@rnindex string-append
2384@deffn {Scheme Procedure} string-append . args
2385@deffnx {C Function} scm_string_append (args)
2386Return a newly allocated string whose characters form the
2387concatenation of the given strings, @var{args}.
2388
2389@example
2390(let ((h "hello "))
2391 (string-append h "world"))
2392@result{} "hello world"
2393@end example
2394@end deffn
2395
91210d62
MV
2396@node Conversion to/from C
2397@subsubsection Conversion to/from C
2398
2399When creating a Scheme string from a C string or when converting a
2400Scheme string to a C string, the concept of character encoding becomes
2401important.
2402
2403In C, a string is just a sequence of bytes, and the character encoding
2404describes the relation between these bytes and the actual characters
c88453e8
MV
2405that make up the string. For Scheme strings, character encoding is
2406not an issue (most of the time), since in Scheme you never get to see
2407the bytes, only the characters.
91210d62
MV
2408
2409Well, ideally, anyway. Right now, Guile simply equates Scheme
2410characters and bytes, ignoring the possibility of multi-byte encodings
2411completely. This will change in the future, when Guile will use
c48c62d0
MV
2412Unicode codepoints as its characters and UTF-8 or some other encoding
2413as its internal encoding. When you exclusively use the functions
2414listed in this section, you are `future-proof'.
91210d62 2415
c88453e8
MV
2416Converting a Scheme string to a C string will often allocate fresh
2417memory to hold the result. You must take care that this memory is
2418properly freed eventually. In many cases, this can be achieved by
2419using @code{scm_frame_free} inside an appropriate frame
2420(@pxref{Frames}).
91210d62
MV
2421
2422@deftypefn {C Function} SCM scm_from_locale_string (const char *str)
2423@deftypefnx {C Function} SCM scm_from_locale_stringn (const char *str, size_t len)
2424Creates a new Scheme string that has the same contents as @var{str}
2425when interpreted in the current locale character encoding.
2426
2427For @code{scm_from_locale_string}, @var{str} must be null-terminated.
2428
2429For @code{scm_from_locale_stringn}, @var{len} specifies the length of
2430@var{str} in bytes, and @var{str} does not need to be null-terminated.
2431If @var{len} is @code{(size_t)-1}, then @var{str} does need to be
2432null-terminated and the real length will be found with @code{strlen}.
2433@end deftypefn
2434
2435@deftypefn {C Function} SCM scm_take_locale_string (char *str)
2436@deftypefnx {C Function} SCM scm_take_locale_stringn (char *str, size_t len)
2437Like @code{scm_from_locale_string} and @code{scm_from_locale_stringn},
2438respectively, but also frees @var{str} with @code{free} eventually.
2439Thus, you can use this function when you would free @var{str} anyway
2440immediately after creating the Scheme string. In certain cases, Guile
2441can then use @var{str} directly as its internal representation.
2442@end deftypefn
2443
2444@deftypefn {C Function} char *scm_to_locale_string (SCM str)
2445@deftypefnx {C Function} char *scm_to_locale_stringn (SCM str, size_t *lenp)
2446Returns a C string in the current locale encoding with the same
2447contents as @var{str}. The C string must be freed with @code{free}
2448eventually, for example by using @code{scm_frame_free} (@pxref{Frames}).
2449
2450For @code{scm_to_locale_string}, the returned string is
2451null-terminated and an error is signalled when @var{str} contains
2452@code{#\nul} characters.
2453
2454For @code{scm_to_locale_stringn} and @var{lenp} not @code{NULL},
2455@var{str} might contain @code{#\nul} characters and the length of the
2456returned string in bytes is stored in @code{*@var{lenp}}. The
2457returned string will not be null-terminated in this case. If
2458@var{lenp} is @code{NULL}, @code{scm_to_locale_stringn} behaves like
2459@code{scm_to_locale_string}.
2460@end deftypefn
2461
2462@deftypefn {C Function} size_t scm_to_locale_stringbuf (SCM str, char *buf, size_t max_len)
2463Puts @var{str} as a C string in the current locale encoding into the
2464memory pointed to by @var{buf}. The buffer at @var{buf} has room for
2465@var{max_len} bytes and @code{scm_to_locale_stringbuf} will never store
2466more than that. No terminating @code{'\0'} will be stored.
2467
2468The return value of @code{scm_to_locale_stringbuf} is the number of
2469bytes that are needed for all of @var{str}, regardless of whether
2470@var{buf} was large enough to hold them. Thus, when the return value
2471is larger than @var{max_len}, only @var{max_len} bytes have been
2472stored and you probably need to try again with a larger buffer.
2473@end deftypefn
07d83abe
MV
2474
2475@node Regular Expressions
2476@subsection Regular Expressions
2477@tpindex Regular expressions
2478
2479@cindex regular expressions
2480@cindex regex
2481@cindex emacs regexp
2482
2483A @dfn{regular expression} (or @dfn{regexp}) is a pattern that
2484describes a whole class of strings. A full description of regular
2485expressions and their syntax is beyond the scope of this manual;
2486an introduction can be found in the Emacs manual (@pxref{Regexps,
2487, Syntax of Regular Expressions, emacs, The GNU Emacs Manual}), or
2488in many general Unix reference books.
2489
2490If your system does not include a POSIX regular expression library,
2491and you have not linked Guile with a third-party regexp library such
2492as Rx, these functions will not be available. You can tell whether
2493your Guile installation includes regular expression support by
2494checking whether @code{(provided? 'regex)} returns true.
2495
2496The following regexp and string matching features are provided by the
2497@code{(ice-9 regex)} module. Before using the described functions,
2498you should load this module by executing @code{(use-modules (ice-9
2499regex))}.
2500
2501@menu
2502* Regexp Functions:: Functions that create and match regexps.
2503* Match Structures:: Finding what was matched by a regexp.
2504* Backslash Escapes:: Removing the special meaning of regexp
2505 meta-characters.
2506@end menu
2507
2508
2509@node Regexp Functions
2510@subsubsection Regexp Functions
2511
2512By default, Guile supports POSIX extended regular expressions.
2513That means that the characters @samp{(}, @samp{)}, @samp{+} and
2514@samp{?} are special, and must be escaped if you wish to match the
2515literal characters.
2516
2517This regular expression interface was modeled after that
2518implemented by SCSH, the Scheme Shell. It is intended to be
2519upwardly compatible with SCSH regular expressions.
2520
2521@deffn {Scheme Procedure} string-match pattern str [start]
2522Compile the string @var{pattern} into a regular expression and compare
2523it with @var{str}. The optional numeric argument @var{start} specifies
2524the position of @var{str} at which to begin matching.
2525
2526@code{string-match} returns a @dfn{match structure} which
2527describes what, if anything, was matched by the regular
2528expression. @xref{Match Structures}. If @var{str} does not match
2529@var{pattern} at all, @code{string-match} returns @code{#f}.
2530@end deffn
2531
2532Two examples of a match follow. In the first example, the pattern
2533matches the four digits in the match string. In the second, the pattern
2534matches nothing.
2535
2536@example
2537(string-match "[0-9][0-9][0-9][0-9]" "blah2002")
2538@result{} #("blah2002" (4 . 8))
2539
2540(string-match "[A-Za-z]" "123456")
2541@result{} #f
2542@end example
2543
2544Each time @code{string-match} is called, it must compile its
2545@var{pattern} argument into a regular expression structure. This
2546operation is expensive, which makes @code{string-match} inefficient if
2547the same regular expression is used several times (for example, in a
2548loop). For better performance, you can compile a regular expression in
2549advance and then match strings against the compiled regexp.
2550
2551@deffn {Scheme Procedure} make-regexp pat flag@dots{}
2552@deffnx {C Function} scm_make_regexp (pat, flaglst)
2553Compile the regular expression described by @var{pat}, and
2554return the compiled regexp structure. If @var{pat} does not
2555describe a legal regular expression, @code{make-regexp} throws
2556a @code{regular-expression-syntax} error.
2557
2558The @var{flag} arguments change the behavior of the compiled
2559regular expression. The following values may be supplied:
2560
2561@defvar regexp/icase
2562Consider uppercase and lowercase letters to be the same when
2563matching.
2564@end defvar
2565
2566@defvar regexp/newline
2567If a newline appears in the target string, then permit the
2568@samp{^} and @samp{$} operators to match immediately after or
2569immediately before the newline, respectively. Also, the
2570@samp{.} and @samp{[^...]} operators will never match a newline
2571character. The intent of this flag is to treat the target
2572string as a buffer containing many lines of text, and the
2573regular expression as a pattern that may match a single one of
2574those lines.
2575@end defvar
2576
2577@defvar regexp/basic
2578Compile a basic (``obsolete'') regexp instead of the extended
2579(``modern'') regexps that are the default. Basic regexps do
2580not consider @samp{|}, @samp{+} or @samp{?} to be special
2581characters, and require the @samp{@{...@}} and @samp{(...)}
2582metacharacters to be backslash-escaped (@pxref{Backslash
2583Escapes}). There are several other differences between basic
2584and extended regular expressions, but these are the most
2585significant.
2586@end defvar
2587
2588@defvar regexp/extended
2589Compile an extended regular expression rather than a basic
2590regexp. This is the default behavior; this flag will not
2591usually be needed. If a call to @code{make-regexp} includes
2592both @code{regexp/basic} and @code{regexp/extended} flags, the
2593one which comes last will override the earlier one.
2594@end defvar
2595@end deffn
2596
2597@deffn {Scheme Procedure} regexp-exec rx str [start [flags]]
2598@deffnx {C Function} scm_regexp_exec (rx, str, start, flags)
2599Match the compiled regular expression @var{rx} against
2600@var{str}. If the optional integer @var{start} argument is
2601provided, begin matching from that position in the string.
2602Return a match structure describing the results of the match,
2603or @code{#f} if no match could be found.
2604
2605The @var{flags} arguments change the matching behavior.
2606The following flags may be supplied:
2607
2608@defvar regexp/notbol
2609Operator @samp{^} always fails (unless @code{regexp/newline}
2610is used). Use this when the beginning of the string should
2611not be considered the beginning of a line.
2612@end defvar
2613
2614@defvar regexp/noteol
2615Operator @samp{$} always fails (unless @code{regexp/newline}
2616is used). Use this when the end of the string should not be
2617considered the end of a line.
2618@end defvar
2619@end deffn
2620
2621@lisp
2622;; Regexp to match uppercase letters
2623(define r (make-regexp "[A-Z]*"))
2624
2625;; Regexp to match letters, ignoring case
2626(define ri (make-regexp "[A-Z]*" regexp/icase))
2627
;; Search "bob" using regexp r
(match:substring (regexp-exec r "bob"))
@result{} "" ; no uppercase letters, so only the empty string matches

;; Search "Bob" using regexp ri
(match:substring (regexp-exec ri "Bob"))
@result{} "Bob" ; matched case insensitive
2635@end lisp
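
The optional @var{start} and @var{flags} arguments can be sketched in
the same way. The variable names @code{digits} and @code{at-bol} are
illustrative only.

@lisp
(use-modules (ice-9 regex))   ; for match:substring

(define digits (make-regexp "[0-9]+"))

;; Without a start position the first run of digits is found.
(match:substring (regexp-exec digits "ab12cd34"))
;; @result{} "12"

;; Starting at index 4 skips past it.
(match:substring (regexp-exec digits "ab12cd34" 4))
;; @result{} "34"

;; With regexp/notbol, ^ does not match at the start of the string.
(define at-bol (make-regexp "^foo"))
(regexp-exec at-bol "foobar" 0 regexp/notbol)
;; @result{} #f
@end lisp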
2636
2637@deffn {Scheme Procedure} regexp? obj
2638@deffnx {C Function} scm_regexp_p (obj)
2639Return @code{#t} if @var{obj} is a compiled regular expression,
2640or @code{#f} otherwise.
2641@end deffn
2642
2643Regular expressions are commonly used to find patterns in one string and
2644replace them with the contents of another string.
2645
2646@c begin (scm-doc-string "regex.scm" "regexp-substitute")
2647@deffn {Scheme Procedure} regexp-substitute port match [item@dots{}]
2648Write to the output port @var{port} selected contents of the match
2649structure @var{match}. Each @var{item} specifies what should be
2650written, and may be one of the following arguments:
2651
2652@itemize @bullet
2653@item
2654A string. String arguments are written out verbatim.
2655
2656@item
2657An integer. The submatch with that number is written.
2658
2659@item
2660The symbol @samp{pre}. The portion of the matched string preceding
2661the regexp match is written.
2662
2663@item
2664The symbol @samp{post}. The portion of the matched string following
2665the regexp match is written.
2666@end itemize
2667
2668The @var{port} argument may be @code{#f}, in which case nothing is
2669written; instead, @code{regexp-substitute} constructs a string from the
2670specified @var{item}s and returns that.
2671@end deffn
2672
2673The following example takes a regular expression that matches a standard
2674@sc{yyyymmdd}-format date such as @code{"20020828"}. The
2675@code{regexp-substitute} call returns a string computed from the
2676information in the match structure, consisting of the fields and text
2677from the original string reordered and reformatted.
2678
2679@lisp
2680(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
2681(define s "Date 20020429 12am.")
2682(define sm (string-match date-regex s))
2683(regexp-substitute #f sm 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
2684@result{} "Date 04-29-2002 12am. (20020429)"
2685@end lisp
2686
@c begin (scm-doc-string "regex.scm" "regexp-substitute/global")
@deffn {Scheme Procedure} regexp-substitute/global port regexp target [item@dots{}]
Similar to @code{regexp-substitute}, but can be used to perform global
substitutions on @var{target}. Instead of taking a match structure as an
2691argument, @code{regexp-substitute/global} takes two string arguments: a
2692@var{regexp} string describing a regular expression, and a @var{target}
2693string which should be matched against this regular expression.
2694
2695Each @var{item} behaves as in @code{regexp-substitute}, with the
2696following exceptions:
2697
2698@itemize @bullet
2699@item
2700A function may be supplied. When this function is called, it will be
2701passed one argument: a match structure for a given regular expression
2702match. It should return a string to be written out to @var{port}.
2703
2704@item
The @samp{post} symbol causes @code{regexp-substitute/global} to recurse
on the unmatched portion of @var{target}. This @emph{must} be supplied in
order to perform global search-and-replace on @var{target}; if it is not
2708present among the @var{item}s, then @code{regexp-substitute/global} will
2709return after processing a single match.
2710@end itemize
2711@end deffn
2712
2713The example above for @code{regexp-substitute} could be rewritten as
2714follows to remove the @code{string-match} stage:
2715
2716@lisp
2717(define date-regex "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])")
2718(define s "Date 20020429 12am.")
2719(regexp-substitute/global #f date-regex s
2720 'pre 2 "-" 3 "-" 1 'post " (" 0 ")")
2721@result{} "Date 04-29-2002 12am. (20020429)"
2722@end lisp
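
As a sketch of the two exceptions described for
@code{regexp-substitute/global}, the following call passes a procedure
as an @var{item} and relies on @code{post} to reach every match. The
pattern and target string are made up for illustration.

@lisp
(use-modules (ice-9 regex))

;; Upcase every run of lowercase letters, leaving the rest untouched.
(regexp-substitute/global #f "[a-z]+" "abc 123 def"
  'pre
  (lambda (m) (string-upcase (match:substring m)))
  'post)
;; @result{} "ABC 123 DEF"
@end lisp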
2723
2724
2725@node Match Structures
2726@subsubsection Match Structures
2727
2728@cindex match structures
2729
2730A @dfn{match structure} is the object returned by @code{string-match} and
2731@code{regexp-exec}. It describes which portion of a string, if any,
2732matched the given regular expression. Match structures include: a
2733reference to the string that was checked for matches; the starting and
2734ending positions of the regexp match; and, if the regexp included any
2735parenthesized subexpressions, the starting and ending positions of each
2736submatch.
2737
2738In each of the regexp match functions described below, the @code{match}
2739argument must be a match structure returned by a previous call to
2740@code{string-match} or @code{regexp-exec}. Most of these functions
2741return some information about the original target string that was
2742matched against a regular expression; we will call that string
2743@var{target} for easy reference.
2744
2745@c begin (scm-doc-string "regex.scm" "regexp-match?")
2746@deffn {Scheme Procedure} regexp-match? obj
2747Return @code{#t} if @var{obj} is a match structure returned by a
2748previous call to @code{regexp-exec}, or @code{#f} otherwise.
2749@end deffn
2750
2751@c begin (scm-doc-string "regex.scm" "match:substring")
2752@deffn {Scheme Procedure} match:substring match [n]
2753Return the portion of @var{target} matched by subexpression number
2754@var{n}. Submatch 0 (the default) represents the entire regexp match.
2755If the regular expression as a whole matched, but the subexpression
2756number @var{n} did not match, return @code{#f}.
2757@end deffn
2758
2759@lisp
2760(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2761(match:substring s)
2762@result{} "2002"
2763
2764;; match starting at offset 6 in the string
2765(match:substring
2766 (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6))
2767@result{} "7654"
2768@end lisp
2769
2770@c begin (scm-doc-string "regex.scm" "match:start")
2771@deffn {Scheme Procedure} match:start match [n]
2772Return the starting position of submatch number @var{n}.
2773@end deffn
2774
2775In the following example, the result is 4, since the match starts at
2776character index 4:
2777
2778@lisp
2779(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2780(match:start s)
2781@result{} 4
2782@end lisp
2783
2784@c begin (scm-doc-string "regex.scm" "match:end")
2785@deffn {Scheme Procedure} match:end match [n]
2786Return the ending position of submatch number @var{n}.
2787@end deffn
2788
In the following example, the result is 8: the match (``2002'')
occupies character indices 4 through 7, and the ending position is the
index just past the last matched character.
2791
2792@lisp
2793(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2794(match:end s)
2795@result{} 8
2796@end lisp
2797
2798@c begin (scm-doc-string "regex.scm" "match:prefix")
2799@deffn {Scheme Procedure} match:prefix match
2800Return the unmatched portion of @var{target} preceding the regexp match.
2801
2802@lisp
2803(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2804(match:prefix s)
2805@result{} "blah"
2806@end lisp
2807@end deffn
2808
2809@c begin (scm-doc-string "regex.scm" "match:suffix")
2810@deffn {Scheme Procedure} match:suffix match
2811Return the unmatched portion of @var{target} following the regexp match.
2812@end deffn
2813
2814@lisp
2815(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2816(match:suffix s)
2817@result{} "foo"
2818@end lisp
2819
2820@c begin (scm-doc-string "regex.scm" "match:count")
2821@deffn {Scheme Procedure} match:count match
2822Return the number of parenthesized subexpressions from @var{match}.
2823Note that the entire regular expression match itself counts as a
2824subexpression, and failed submatches are included in the count.
2825@end deffn
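
A short sketch of how @code{match:count} and @code{match:substring}
treat a subexpression that takes no part in the match; the pattern and
target are made up for illustration.

@lisp
(use-modules (ice-9 regex))

;; The second group is optional and finds no digits in "abc".
(define m (string-match "([a-z]+)([0-9]+)?" "abc"))

(match:count m)
;; @result{} 3, the whole match plus two subexpressions
(match:substring m 1)
;; @result{} "abc"
(match:substring m 2)
;; @result{} #f
@end lisp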
2826
2827@c begin (scm-doc-string "regex.scm" "match:string")
2828@deffn {Scheme Procedure} match:string match
2829Return the original @var{target} string.
2830@end deffn
2831
2832@lisp
2833(define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo"))
2834(match:string s)
2835@result{} "blah2002foo"
2836@end lisp
2837
2838
2839@node Backslash Escapes
2840@subsubsection Backslash Escapes
2841
2842Sometimes you will want a regexp to match characters like @samp{*} or
2843@samp{$} exactly. For example, to check whether a particular string
2844represents a menu entry from an Info node, it would be useful to match
2845it against a regexp like @samp{^* [^:]*::}. However, this won't work;
2846because the asterisk is a metacharacter, it won't match the @samp{*} at
2847the beginning of the string. In this case, we want to make the first
2848asterisk un-magic.
2849
2850You can do this by preceding the metacharacter with a backslash
2851character @samp{\}. (This is also called @dfn{quoting} the
2852metacharacter, and is known as a @dfn{backslash escape}.) When Guile
2853sees a backslash in a regular expression, it considers the following
2854glyph to be an ordinary character, no matter what special meaning it
2855would ordinarily have. Therefore, we can make the above example work by
2856changing the regexp to @samp{^\* [^:]*::}. The @samp{\*} sequence tells
2857the regular expression engine to match only a single asterisk in the
2858target string.
2859
2860Since the backslash is itself a metacharacter, you may force a regexp to
2861match a backslash in the target string by preceding the backslash with
2862itself. For example, to find variable references in a @TeX{} program,
2863you might want to find occurrences of the string @samp{\let\} followed
2864by any number of alphabetic characters. The regular expression
2865@samp{\\let\\[A-Za-z]*} would do this: the double backslashes in the
2866regexp each match a single backslash in the target string.
2867
2868@c begin (scm-doc-string "regex.scm" "regexp-quote")
2869@deffn {Scheme Procedure} regexp-quote str
2870Quote each special character found in @var{str} with a backslash, and
2871return the resulting string.
2872@end deffn
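
For instance, quoting lets an arbitrary string be used as a literal
pattern. This sketch only checks whether a match is found, since the
exact text returned by @code{regexp-quote} is not spelled out here.

@lisp
(use-modules (ice-9 regex))

;; Unquoted, the dot matches any character.
(match:substring (string-match "a.c" "abc"))
;; @result{} "abc"

;; Quoted, the dot only matches itself.
(string-match (regexp-quote "a.c") "abc")
;; @result{} #f
(match:substring (string-match (regexp-quote "a.c") "xa.cx"))
;; @result{} "a.c"
@end lisp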
2873
2874@strong{Very important:} Using backslash escapes in Guile source code
2875(as in Emacs Lisp or C) can be tricky, because the backslash character
2876has special meaning for the Guile reader. For example, if Guile
2877encounters the character sequence @samp{\n} in the middle of a string
2878while processing Scheme code, it replaces those characters with a
2879newline character. Similarly, the character sequence @samp{\t} is
2880replaced by a horizontal tab. Several of these @dfn{escape sequences}
2881are processed by the Guile reader before your code is executed.
2882Unrecognized escape sequences are ignored: if the characters @samp{\*}
2883appear in a string, they will be translated to the single character
2884@samp{*}.
2885
2886This translation is obviously undesirable for regular expressions, since
2887we want to be able to include backslashes in a string in order to
2888escape regexp metacharacters. Therefore, to make sure that a backslash
2889is preserved in a string in your Guile program, you must use @emph{two}
2890consecutive backslashes:
2891
2892@lisp
2893(define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
2894@end lisp
2895
2896The string in this example is preprocessed by the Guile reader before
2897any code is executed. The resulting argument to @code{make-regexp} is
2898the string @samp{^\* [^:]*}, which is what we really want.
2899
2900This also means that in order to write a regular expression that matches
2901a single backslash character, the regular expression string in the
2902source code must include @emph{four} backslashes. Each consecutive pair
2903of backslashes gets translated by the Guile reader to a single
2904backslash, and the resulting double-backslash is interpreted by the
2905regexp engine as matching a single backslash character. Hence:
2906
2907@lisp
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
2909@end lisp
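
The two layers of translation can be checked directly. In this sketch
the four backslashes in the source become a two-character string, which
the regexp engine then treats as one literal backslash. The name
@code{m} is used only for the example.

@lisp
(use-modules (ice-9 regex))

;; The reader turns each pair of backslashes into one.
(string-length "\\\\")
;; @result{} 2

;; The target "a\\b" is the three characters a, backslash, b.
(define m (string-match "\\\\" "a\\b"))
(match:start m)
;; @result{} 1
(string-length (match:substring m))
;; @result{} 1
@end lisp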
2910
2911The reason for the unwieldiness of this syntax is historical. Both
2912regular expression pattern matchers and Unix string processing systems
2913have traditionally used backslashes with the special meanings
2914described above. The POSIX regular expression specification and ANSI C
2915standard both require these semantics. Attempting to abandon either
2916convention would cause other kinds of compatibility problems, possibly
2917more severe ones. Therefore, without extending the Scheme reader to
2918support strings with different quoting conventions (an ungainly and
2919confusing extension when implemented in other languages), we must adhere
2920to this cumbersome escape syntax.
2921
2922
2923@node Symbols
2924@subsection Symbols
2925@tpindex Symbols
2926
2927Symbols in Scheme are widely used in three ways: as items of discrete
2928data, as lookup keys for alists and hash tables, and to denote variable
2929references.
2930
2931A @dfn{symbol} is similar to a string in that it is defined by a
2932sequence of characters. The sequence of characters is known as the
2933symbol's @dfn{name}. In the usual case --- that is, where the symbol's
2934name doesn't include any characters that could be confused with other
2935elements of Scheme syntax --- a symbol is written in a Scheme program by
2936writing the sequence of characters that make up the name, @emph{without}
2937any quotation marks or other special syntax. For example, the symbol
2938whose name is ``multiply-by-2'' is written, simply:
2939
2940@lisp
2941multiply-by-2
2942@end lisp
2943
2944Notice how this differs from a @emph{string} with contents
2945``multiply-by-2'', which is written with double quotation marks, like
2946this:
2947
2948@lisp
2949"multiply-by-2"
2950@end lisp
2951
2952Looking beyond how they are written, symbols are different from strings
2953in two important respects.
2954
2955The first important difference is uniqueness. If the same-looking
2956string is read twice from two different places in a program, the result
2957is two @emph{different} string objects whose contents just happen to be
2958the same. If, on the other hand, the same-looking symbol is read twice
2959from two different places in a program, the result is the @emph{same}
2960symbol object both times.
2961
2962Given two read symbols, you can use @code{eq?} to test whether they are
2963the same (that is, have the same name). @code{eq?} is the most
2964efficient comparison operator in Scheme, and comparing two symbols like
2965this is as fast as comparing, for example, two numbers. Given two
2966strings, on the other hand, you must use @code{equal?} or
2967@code{string=?}, which are much slower comparison operators, to
2968determine whether the strings have the same contents.
2969
2970@lisp
2971(define sym1 (quote hello))
2972(define sym2 (quote hello))
2973(eq? sym1 sym2) @result{} #t
2974
2975(define str1 "hello")
2976(define str2 "hello")
2977(eq? str1 str2) @result{} #f
2978(equal? str1 str2) @result{} #t
2979@end lisp
2980
2981The second important difference is that symbols, unlike strings, are not
2982self-evaluating. This is why we need the @code{(quote @dots{})}s in the
example above: @code{(quote hello)} evaluates to the symbol named
``hello'' itself, whereas an unquoted @code{hello} is @emph{read} as the
symbol named ``hello'' and evaluated as a variable reference @dots{} about
2986which more below (@pxref{Symbol Variables}).
2987
2988@menu
2989* Symbol Data:: Symbols as discrete data.
2990* Symbol Keys:: Symbols as lookup keys.
2991* Symbol Variables:: Symbols as denoting variables.
2992* Symbol Primitives:: Operations related to symbols.
2993* Symbol Props:: Function slots and property lists.
2994* Symbol Read Syntax:: Extended read syntax for symbols.
2995* Symbol Uninterned:: Uninterned symbols.
2996@end menu
2997
2998
2999@node Symbol Data
3000@subsubsection Symbols as Discrete Data
3001
3002Numbers and symbols are similar to the extent that they both lend
3003themselves to @code{eq?} comparison. But symbols are more descriptive
3004than numbers, because a symbol's name can be used directly to describe
3005the concept for which that symbol stands.
3006
3007For example, imagine that you need to represent some colours in a
3008computer program. Using numbers, you would have to choose arbitrarily
3009some mapping between numbers and colours, and then take care to use that
3010mapping consistently:
3011
3012@lisp
3013;; 1=red, 2=green, 3=purple
3014
3015(if (eq? (colour-of car) 1)
3016 ...)
3017@end lisp
3018
3019@noindent
3020You can make the mapping more explicit and the code more readable by
3021defining constants:
3022
3023@lisp
3024(define red 1)
3025(define green 2)
3026(define purple 3)
3027
3028(if (eq? (colour-of car) red)
3029 ...)
3030@end lisp
3031
3032@noindent
3033But the simplest and clearest approach is not to use numbers at all, but
3034symbols whose names specify the colours that they refer to:
3035
3036@lisp
3037(if (eq? (colour-of car) 'red)
3038 ...)
3039@end lisp
3040
3041The descriptive advantages of symbols over numbers increase as the set
3042of concepts that you want to describe grows. Suppose that a car object
3043can have other properties as well, such as whether it has or uses:
3044
3045@itemize @bullet
3046@item
3047automatic or manual transmission
3048@item
3049leaded or unleaded fuel
3050@item
3051power steering (or not).
3052@end itemize
3053
3054@noindent
3055Then a car's combined property set could be naturally represented and
3056manipulated as a list of symbols:
3057
3058@lisp
3059(properties-of car1)
3060@result{}
3061(red manual unleaded power-steering)
3062
3063(if (memq 'power-steering (properties-of car1))
3064 (display "Unfit people can drive this car.\n")
3065 (display "You'll need strong arms to drive this car!\n"))
3066@print{}
3067Unfit people can drive this car.
3068@end lisp
3069
3070Remember, the fundamental property of symbols that we are relying on
3071here is that an occurrence of @code{'red} in one part of a program is an
3072@emph{indistinguishable} symbol from an occurrence of @code{'red} in
3073another part of a program; this means that symbols can usefully be
3074compared using @code{eq?}. At the same time, symbols have naturally
3075descriptive names. This combination of efficiency and descriptive power
3076makes them ideal for use as discrete data.
3077
3078
3079@node Symbol Keys
3080@subsubsection Symbols as Lookup Keys
3081
3082Given their efficiency and descriptive power, it is natural to use
3083symbols as the keys in an association list or hash table.
3084
3085To illustrate this, consider a more structured representation of the car
3086properties example from the preceding subsection. Rather than
3087mixing all the properties up together in a flat list, we could use an
3088association list like this:
3089
3090@lisp
3091(define car1-properties '((colour . red)
3092 (transmission . manual)
3093 (fuel . unleaded)
3094 (steering . power-assisted)))
3095@end lisp
3096
3097Notice how this structure is more explicit and extensible than the flat
3098list. For example it makes clear that @code{manual} refers to the
3099transmission rather than, say, the windows or the locking of the car.
3100It also allows further properties to use the same symbols among their
3101possible values without becoming ambiguous:
3102
3103@lisp
3104(define car1-properties '((colour . red)
3105 (transmission . manual)
3106 (fuel . unleaded)
3107 (steering . power-assisted)
3108 (seat-colour . red)
3109 (locking . manual)))
3110@end lisp
3111
3112With a representation like this, it is easy to use the efficient
3113@code{assq-XXX} family of procedures (@pxref{Association Lists}) to
3114extract or change individual pieces of information:
3115
3116@lisp
3117(assq-ref car1-properties 'fuel) @result{} unleaded
3118(assq-ref car1-properties 'transmission) @result{} manual
3119
3120(assq-set! car1-properties 'seat-colour 'black)
3121@result{}
3122((colour . red)
3123 (transmission . manual)
3124 (fuel . unleaded)
3125 (steering . power-assisted)
3126 (seat-colour . black)
 (locking . manual))
3128@end lisp
3129
3130Hash tables also have keys, and exactly the same arguments apply to the
3131use of symbols in hash tables as in association lists. The hash value
3132that Guile uses to decide where to add a symbol-keyed entry to a hash
3133table can be obtained by calling the @code{symbol-hash} procedure:
3134
3135@deffn {Scheme Procedure} symbol-hash symbol
3136@deffnx {C Function} scm_symbol_hash (symbol)
3137Return a hash value for @var{symbol}.
3138@end deffn
3139
3140See @ref{Hash Tables} for information about hash tables in general, and
3141for why you might choose to use a hash table rather than an association
3142list.
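
As a rough sketch, symbol keys work with the @code{hashq} family of
procedures in the same way as with @code{assq}. The table name
@code{car1-table} and its size of 31 are arbitrary choices for this
example.

@lisp
(define car1-table (make-hash-table 31))

(hashq-set! car1-table 'colour 'red)
(hashq-set! car1-table 'fuel 'unleaded)

(hashq-ref car1-table 'fuel)
;; @result{} unleaded
(hashq-ref car1-table 'locking)
;; @result{} #f, since no such key was stored

;; Equal symbols are the same object, so their hash values agree.
(= (symbol-hash 'fuel) (symbol-hash (string->symbol "fuel")))
;; @result{} #t
@end lisp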
3143
3144
3145@node Symbol Variables
3146@subsubsection Symbols as Denoting Variables
3147
3148When an unquoted symbol in a Scheme program is evaluated, it is
3149interpreted as a variable reference, and the result of the evaluation is
3150the appropriate variable's value.
3151
3152For example, when the expression @code{(string-length "abcd")} is read
3153and evaluated, the sequence of characters @code{string-length} is read
as the symbol whose name is ``string-length''. This symbol is associated
3155with a variable whose value is the procedure that implements string
3156length calculation. Therefore evaluation of the @code{string-length}
3157symbol results in that procedure.
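
The distinction between a symbol and the variable it names can be
sketched as follows; @code{answer} is a name invented for the example.

@lisp
(define answer 42)

answer
;; @result{} 42, the value of the variable
'answer
;; @result{} answer, the symbol itself

(symbol? 'string-length)
;; @result{} #t
(procedure? string-length)
;; @result{} #t
(string-length "abcd")
;; @result{} 4
@end lisp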
3158
3159The details of the connection between an unquoted symbol and the
3160variable to which it refers are explained elsewhere. See @ref{Binding
3161Constructs}, for how associations between symbols and variables are
3162created, and @ref{Modules}, for how those associations are affected by
3163Guile's module system.
3164
3165
3166@node Symbol Primitives
3167@subsubsection Operations Related to Symbols
3168
3169Given any Scheme value, you can determine whether it is a symbol using
3170the @code{symbol?} primitive:
3171
3172@rnindex symbol?
3173@deffn {Scheme Procedure} symbol? obj
3174@deffnx {C Function} scm_symbol_p (obj)
3175Return @code{#t} if @var{obj} is a symbol, otherwise return
3176@code{#f}.
3177@end deffn
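
A few checks, shown here as a brief sketch:

@lisp
(symbol? 'foo)            ; @result{} #t
(symbol? (car '(a b)))    ; @result{} #t
(symbol? "bar")           ; @result{} #f
(symbol? 42)              ; @result{} #f
(symbol? '())             ; @result{} #f
@end lisp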
3178
3179Once you know that you have a symbol, you can obtain its name as a
3180string by calling @code{symbol->string}. Note that Guile differs by
3181default from R5RS on the details of @code{symbol->string} as regards
3182case-sensitivity:
3183
3184@rnindex symbol->string
3185@deffn {Scheme Procedure} symbol->string s
3186@deffnx {C Function} scm_symbol_to_string (s)
3187Return the name of symbol @var{s} as a string. By default, Guile reads
3188symbols case-sensitively, so the string returned will have the same case
3189variation as the sequence of characters that caused @var{s} to be
3190created.
3191
3192If Guile is set to read symbols case-insensitively (as specified by
3193R5RS), and @var{s} comes into being as part of a literal expression
3194(@pxref{Literal expressions,,,r5rs, The Revised^5 Report on Scheme}) or
3195by a call to the @code{read} or @code{string-ci->symbol} procedures,
3196Guile converts any alphabetic characters in the symbol's name to
3197lower case before creating the symbol object, so the string returned
3198here will be in lower case.
3199
3200If @var{s} was created by @code{string->symbol}, the case of characters
3201in the string returned will be the same as that in the string that was
3202passed to @code{string->symbol}, regardless of Guile's case-sensitivity
3203setting at the time @var{s} was created.
3204
3205It is an error to apply mutation procedures like @code{string-set!} to
3206strings returned by this procedure.
3207@end deffn
3208
3209Most symbols are created by writing them literally in code. However it
3210is also possible to create symbols programmatically using the following
3211@code{string->symbol} and @code{string-ci->symbol} procedures:
3212
3213@rnindex string->symbol
3214@deffn {Scheme Procedure} string->symbol string
3215@deffnx {C Function} scm_string_to_symbol (string)
3216Return the symbol whose name is @var{string}. This procedure can create
3217symbols with names containing special characters or letters in the
3218non-standard case, but it is usually a bad idea to create such symbols
3219because in some implementations of Scheme they cannot be read as
3220themselves.
3221@end deffn
3222
3223@deffn {Scheme Procedure} string-ci->symbol str
3224@deffnx {C Function} scm_string_ci_to_symbol (str)
3225Return the symbol whose name is @var{str}. If Guile is currently
3226reading symbols case-insensitively, @var{str} is converted to lowercase
3227before the returned symbol is looked up or created.
3228@end deffn
3229
3230The following examples illustrate Guile's detailed behaviour as regards
3231the case-sensitivity of symbols:
3232
3233@lisp
3234(read-enable 'case-insensitive) ; R5RS compliant behaviour
3235
3236(symbol->string 'flying-fish) @result{} "flying-fish"
3237(symbol->string 'Martin) @result{} "martin"
3238(symbol->string
3239 (string->symbol "Malvina")) @result{} "Malvina"
3240
3241(eq? 'mISSISSIppi 'mississippi) @result{} #t
3242(string->symbol "mISSISSIppi") @result{} mISSISSIppi
3243(eq? 'bitBlt (string->symbol "bitBlt")) @result{} #f
3244(eq? 'LolliPop
3245 (string->symbol (symbol->string 'LolliPop))) @result{} #t
3246(string=? "K. Harper, M.D."
3247 (symbol->string
3248 (string->symbol "K. Harper, M.D."))) @result{} #t
3249
3250(read-disable 'case-insensitive) ; Guile default behaviour
3251
3252(symbol->string 'flying-fish) @result{} "flying-fish"
3253(symbol->string 'Martin) @result{} "Martin"
3254(symbol->string
3255 (string->symbol "Malvina")) @result{} "Malvina"
3256
3257(eq? 'mISSISSIppi 'mississippi) @result{} #f
3258(string->symbol "mISSISSIppi") @result{} mISSISSIppi
3259(eq? 'bitBlt (string->symbol "bitBlt")) @result{} #t
3260(eq? 'LolliPop
3261 (string->symbol (symbol->string 'LolliPop))) @result{} #t
3262(string=? "K. Harper, M.D."
3263 (symbol->string
3264 (string->symbol "K. Harper, M.D."))) @result{} #t
3265@end lisp
3266
From C, there are lower level functions that construct a Scheme symbol
from a C string in the current locale encoding.

When you want to do more from C, you should convert between symbols
and strings using @code{scm_symbol_to_string} and
@code{scm_string_to_symbol} and work with the strings.

@deffn {C Function} scm_from_locale_symbol (const char *name)
@deffnx {C Function} scm_from_locale_symboln (const char *name, size_t len)
Construct and return a Scheme symbol whose name is specified by
@var{name}. For @code{scm_from_locale_symbol}, @var{name} must be null
terminated; for @code{scm_from_locale_symboln} the length of @var{name} is
specified explicitly by @var{len}.
3280@end deffn
3281
3282Finally, some applications, especially those that generate new Scheme
3283code dynamically, need to generate symbols for use in the generated
3284code. The @code{gensym} primitive meets this need:
3285
3286@deffn {Scheme Procedure} gensym [prefix]
3287@deffnx {C Function} scm_gensym (prefix)
3288Create a new symbol with a name constructed from a prefix and a counter
3289value. The string @var{prefix} can be specified as an optional
argument. The default prefix is @samp{@w{ g}}. The counter is increased by 1
3291at each call. There is no provision for resetting the counter.
3292@end deffn
3293
3294The symbols generated by @code{gensym} are @emph{likely} to be unique,
3295since their names begin with a space and it is only otherwise possible
3296to generate such symbols if a programmer goes out of their way to do
3297so. Uniqueness can be guaranteed by instead using uninterned symbols
3298(@pxref{Symbol Uninterned}), though they can't be usefully written out
3299and read back in.
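
A minimal sketch of @code{gensym} in use follows. The helper
@code{make-swap-form} is hypothetical and only illustrates generated
code that needs a fresh temporary name.

@lisp
;; Build a form that swaps the values of variables A and B.
(define (make-swap-form a b)
  (let ((tmp (gensym "tmp")))
    `(let ((,tmp ,a))
       (set! ,a ,b)
       (set! ,b ,tmp))))

(define form (make-swap-form 'x 'y))

;; The temporary is a symbol, and each call yields a different one.
(symbol? (car (car (cadr form))))
;; @result{} #t
(eq? (gensym) (gensym))
;; @result{} #f
@end lisp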
3300
3301
3302@node Symbol Props
3303@subsubsection Function Slots and Property Lists
3304
3305In traditional Lisp dialects, symbols are often understood as having
3306three kinds of value at once:
3307
3308@itemize @bullet
3309@item
3310a @dfn{variable} value, which is used when the symbol appears in
3311code in a variable reference context
3312
3313@item
3314a @dfn{function} value, which is used when the symbol appears in
code in a function name position (i.e.@: as the first element in an
3316unquoted list)
3317
3318@item
3319a @dfn{property list} value, which is used when the symbol is given as
3320the first argument to Lisp's @code{put} or @code{get} functions.
3321@end itemize
3322
3323Although Scheme (as one of its simplifications with respect to Lisp)
3324does away with the distinction between variable and function namespaces,
3325Guile currently retains some elements of the traditional structure in
3326case they turn out to be useful when implementing translators for other
3327languages, in particular Emacs Lisp.
3328
Specifically, Guile symbols have two extra slots: one for a symbol's
property list, and one for its ``function value.'' The following procedures
3331are provided to access these slots.
3332
3333@deffn {Scheme Procedure} symbol-fref symbol
3334@deffnx {C Function} scm_symbol_fref (symbol)
3335Return the contents of @var{symbol}'s @dfn{function slot}.
3336@end deffn
3337
3338@deffn {Scheme Procedure} symbol-fset! symbol value
3339@deffnx {C Function} scm_symbol_fset_x (symbol, value)
3340Set the contents of @var{symbol}'s function slot to @var{value}.
3341@end deffn
3342
3343@deffn {Scheme Procedure} symbol-pref symbol
3344@deffnx {C Function} scm_symbol_pref (symbol)
3345Return the @dfn{property list} currently associated with @var{symbol}.
3346@end deffn
3347
3348@deffn {Scheme Procedure} symbol-pset! symbol value
3349@deffnx {C Function} scm_symbol_pset_x (symbol, value)
3350Set @var{symbol}'s property list to @var{value}.
3351@end deffn
3352
3353@deffn {Scheme Procedure} symbol-property sym prop
3354From @var{sym}'s property list, return the value for property
3355@var{prop}. The assumption is that @var{sym}'s property list is an
3356association list whose keys are distinguished from each other using
3357@code{equal?}; @var{prop} should be one of the keys in that list. If
3358the property list has no entry for @var{prop}, @code{symbol-property}
3359returns @code{#f}.
3360@end deffn
3361
3362@deffn {Scheme Procedure} set-symbol-property! sym prop val
3363In @var{sym}'s property list, set the value for property @var{prop} to
3364@var{val}, or add a new entry for @var{prop}, with value @var{val}, if
3365none already exists. For the structure of the property list, see
3366@code{symbol-property}.
3367@end deffn
3368
3369@deffn {Scheme Procedure} symbol-property-remove! sym prop
3370From @var{sym}'s property list, remove the entry for property
3371@var{prop}, if there is one. For the structure of the property list,
3372see @code{symbol-property}.
3373@end deffn
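
As a brief sketch, the property procedures might be used like this; the
symbol @code{apple} and its properties are invented for the example.

@lisp
(set-symbol-property! 'apple 'colour 'green)

(symbol-property 'apple 'colour)
;; @result{} green
(symbol-property 'apple 'weight)
;; @result{} #f, no such property

(symbol-property-remove! 'apple 'colour)
(symbol-property 'apple 'colour)
;; @result{} #f
@end lisp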
3374
3375Support for these extra slots may be removed in a future release, and it
3376is probably better to avoid using them. (In release 1.6, Guile itself
3377uses the property list slot sparingly, and the function slot not at
3378all.) For a more modern and Schemely approach to properties, see
3379@ref{Object Properties}.
3380
3381
3382@node Symbol Read Syntax
3383@subsubsection Extended Read Syntax for Symbols
3384
3385The read syntax for a symbol is a sequence of letters, digits, and
3386@dfn{extended alphabetic characters}, beginning with a character that
3387cannot begin a number. In addition, the special cases of @code{+},
3388@code{-}, and @code{...} are read as symbols even though numbers can
3389begin with @code{+}, @code{-} or @code{.}.
3390
3391Extended alphabetic characters may be used within identifiers as if
3392they were letters. The set of extended alphabetic characters is:
3393
3394@example
3395! $ % & * + - . / : < = > ? @@ ^ _ ~
3396@end example
3397
3398In addition to the standard read syntax defined above (which is taken
3399from R5RS (@pxref{Formal syntax,,,r5rs,The Revised^5 Report on
3400Scheme})), Guile provides an extended symbol read syntax that allows the
3401inclusion of unusual characters such as space characters, newlines and
3402parentheses. If (for whatever reason) you need to write a symbol
3403containing characters not mentioned above, you can do so as follows.
3404
3405@itemize @bullet
3406@item
3407Begin the symbol with the characters @code{#@{},
3408
3409@item
3410write the characters of the symbol and
3411
3412@item
3413finish the symbol with the characters @code{@}#}.
3414@end itemize
3415
3416Here are a few examples of this form of read syntax. The first symbol
3417needs to use extended syntax because it contains a space character, the
3418second because it contains a line break, and the last because it looks
3419like a number.
3420
3421@lisp
3422#@{foo bar@}#
3423
3424#@{what
3425ever@}#
3426
3427#@{4242@}#
3428@end lisp
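
The following checks may help to see what these forms denote. This is
only a sketch; it relies on nothing beyond @code{string->symbol},
@code{symbol->string} and the standard type predicates.

@lisp
(eq? '#@{foo bar@}# (string->symbol "foo bar"))
@result{} #t
; The extended syntax reads as the ordinary interned symbol.

(symbol->string '#@{foo bar@}#)
@result{} "foo bar"

(symbol? '#@{4242@}#)
@result{} #t

(number? '#@{4242@}#)
@result{} #f
; It looks like a number, but it is read as a symbol.
@end lisp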
3429
3430Although Guile provides this extended read syntax for symbols,
3431widespread usage of it is discouraged because it is not portable and not
3432very readable.
3433
3434
3435@node Symbol Uninterned
3436@subsubsection Uninterned Symbols
3437
3438What makes symbols useful is that they are automatically kept unique.
3439There are no two symbols that are distinct objects but have the same
3440name. But of course, there is no rule without exception. In addition
3441to the normal symbols that have been discussed up to now, you can also
3442create special @dfn{uninterned} symbols that behave slightly
3443differently.
3444
3445To understand what is different about them and why they might be useful,
3446we look at how normal symbols are actually kept unique.
3447
3448Whenever Guile wants to find the symbol with a specific name, for
3449example during @code{read} or when executing @code{string->symbol}, it
3450first looks into a table of all existing symbols to find out whether a
3451symbol with the given name already exists. When this is the case, Guile
3452just returns that symbol. When not, a new symbol with the name is
3453created and entered into the table so that it can be found later.
3454
Sometimes you might want to create a symbol that is guaranteed `fresh',
i.e.@: a symbol that did not exist previously. You might also want to
3457somehow guarantee that no one else will ever unintentionally stumble
3458across your symbol in the future. These properties of a symbol are
3459often needed when generating code during macro expansion. When
3460introducing new temporary variables, you want to guarantee that they
3461don't conflict with variables in other people's code.
3462
3463The simplest way to arrange for this is to create a new symbol but
3464not enter it into the global table of all symbols. That way, no one
3465will ever get access to your symbol by chance. Symbols that are not in
3466the table are called @dfn{uninterned}. Of course, symbols that
3467@emph{are} in the table are called @dfn{interned}.
3468
3469You create new uninterned symbols with the function @code{make-symbol}.
3470You can test whether a symbol is interned or not with
3471@code{symbol-interned?}.
3472
Uninterned symbols break the rule that the name of a symbol uniquely
identifies the symbol object. Because of this, they cannot be written
out and read back in like interned symbols. Currently, Guile has no
support for reading uninterned symbols. Note that, for this reason,
the function @code{gensym} does not return uninterned symbols.
3478
3479@deffn {Scheme Procedure} make-symbol name
3480@deffnx {C Function} scm_make_symbol (name)
3481Return a new uninterned symbol with the name @var{name}. The returned
3482symbol is guaranteed to be unique and future calls to
3483@code{string->symbol} will not return it.
3484@end deffn
3485
3486@deffn {Scheme Procedure} symbol-interned? symbol
3487@deffnx {C Function} scm_symbol_interned_p (symbol)
3488Return @code{#t} if @var{symbol} is interned, otherwise return
3489@code{#f}.
3490@end deffn
3491
3492For example:
3493
3494@lisp
3495(define foo-1 (string->symbol "foo"))
3496(define foo-2 (string->symbol "foo"))
3497(define foo-3 (make-symbol "foo"))
3498(define foo-4 (make-symbol "foo"))
3499
3500(eq? foo-1 foo-2)
3501@result{} #t
3502; Two interned symbols with the same name are the same object,
3503
3504(eq? foo-1 foo-3)
3505@result{} #f
3506; but a call to make-symbol with the same name returns a
3507; distinct object.
3508
3509(eq? foo-3 foo-4)
3510@result{} #f
3511; A call to make-symbol always returns a new object, even for
3512; the same name.
3513
3514foo-3
3515@result{} #<uninterned-symbol foo 8085290>
3516; Uninterned symbols print differently from interned symbols,
3517
3518(symbol? foo-3)
3519@result{} #t
3520; but they are still symbols,
3521
3522(symbol-interned? foo-3)
3523@result{} #f
3524; just not interned.
3525@end lisp
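
The macro-expansion use case described earlier can be sketched as
follows. The macro @code{swap!} and the variables @code{tmp} and
@code{other} are hypothetical names chosen for this illustration, and
the sketch assumes that @code{define-macro} is available, as it is in a
default Guile environment.

@lisp
(define-macro (swap! a b)
  ;; Create a fresh, uninterned name for the temporary variable.
  (let ((temp (make-symbol "tmp")))
    `(let ((,temp ,a))
       (set! ,a ,b)
       (set! ,b ,temp))))

(define tmp 1)
(define other 2)

(swap! tmp other)

tmp
@result{} 2

other
@result{} 1
@end lisp

Had the macro used the interned symbol @code{tmp} for its temporary,
the expansion would have shadowed the caller's variable of the same
name and the swap would not have worked.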
3526
3527
3528@node Keywords
3529@subsection Keywords
3530@tpindex Keywords
3531
3532Keywords are self-evaluating objects with a convenient read syntax that
3533makes them easy to type.
3534
3535Guile's keyword support conforms to R5RS, and adds a (switchable) read
3536syntax extension to permit keywords to begin with @code{:} as well as
3537@code{#:}.
3538
3539@menu
3540* Why Use Keywords?:: Motivation for keyword usage.
3541* Coding With Keywords:: How to use keywords.
3542* Keyword Read Syntax:: Read syntax for keywords.
3543* Keyword Procedures:: Procedures for dealing with keywords.
3544* Keyword Primitives:: The underlying primitive procedures.
3545@end menu
3546
3547@node Why Use Keywords?
3548@subsubsection Why Use Keywords?
3549
3550Keywords are useful in contexts where a program or procedure wants to be
3551able to accept a large number of optional arguments without making its
3552interface unmanageable.
3553
3554To illustrate this, consider a hypothetical @code{make-window}
3555procedure, which creates a new window on the screen for drawing into
3556using some graphical toolkit. There are many parameters that the caller
3557might like to specify, but which could also be sensibly defaulted, for
3558example:
3559
3560@itemize @bullet
3561@item
3562color depth -- Default: the color depth for the screen
3563
3564@item
3565background color -- Default: white
3566
3567@item
3568width -- Default: 600
3569
3570@item
3571height -- Default: 400
3572@end itemize
3573
3574If @code{make-window} did not use keywords, the caller would have to
3575pass in a value for each possible argument, remembering the correct
3576argument order and using a special value to indicate the default value
3577for that argument:
3578
3579@lisp
3580(make-window 'default ;; Color depth
3581 'default ;; Background color
3582 800 ;; Width
3583 100 ;; Height
3584 @dots{}) ;; More make-window arguments
3585@end lisp
3586
3587With keywords, on the other hand, defaulted arguments are omitted, and
3588non-default arguments are clearly tagged by the appropriate keyword. As
3589a result, the invocation becomes much clearer:
3590
3591@lisp
3592(make-window #:width 800 #:height 100)
3593@end lisp
3594
3595On the other hand, for a simpler procedure with few arguments, the use
3596of keywords would be a hindrance rather than a help. The primitive
3597procedure @code{cons}, for example, would not be improved if it had to
3598be invoked as
3599
3600@lisp
3601(cons #:car x #:cdr y)
3602@end lisp
3603
So the decision whether to use keywords or not is purely pragmatic: use
them if they will clarify the procedure invocation at the point of call.
3606
3607@node Coding With Keywords
3608@subsubsection Coding With Keywords
3609
3610If a procedure wants to support keywords, it should take a rest argument
3611and then use whatever means is convenient to extract keywords and their
3612corresponding arguments from the contents of that rest argument.
3613
3614The following example illustrates the principle: the code for
3615@code{make-window} uses a helper procedure called
3616@code{get-keyword-value} to extract individual keyword arguments from
3617the rest argument.
3618
3619@lisp
;; Return the value that follows KEYWORD in ARGS, or DEFAULT if
;; KEYWORD is absent or is not followed by a value.
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

;; The defaults match the list given in the previous section.
(define (make-window . args)
  (let ((depth  (get-keyword-value args #:depth  screen-depth))
        (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  600))
        (height (get-keyword-value args #:height 400))
        @dots{})
    @dots{}))
3633@end lisp
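
To see how the helper behaves on its own, here is a small sketch that
repeats its definition and calls it directly. The argument lists are
made up for this example.

@lisp
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

(get-keyword-value (list #:width 800 #:height 100) #:width 600)
@result{} 800
; The keyword is present, so its value is returned.

(get-keyword-value (list #:width 800 #:height 100) #:depth 24)
@result{} 24
; The keyword is absent, so the default is returned.

(get-keyword-value (list #:bg "white" #:width) #:width 600)
@result{} 600
; The keyword is present but has no value, so the default is used.
@end lisp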
3634
3635But you don't need to write @code{get-keyword-value}. The @code{(ice-9
3636optargs)} module provides a set of powerful macros that you can use to
3637implement keyword-supporting procedures like this:
3638
3639@lisp
3640(use-modules (ice-9 optargs))
3641
(define (make-window . args)
  (let-keywords args #f ((depth  screen-depth)
                         (bg     "white")
                         (width  600)
                         (height 400))
    @dots{}))
3648@end lisp
3649
3650@noindent
3651Or, even more economically, like this:
3652
3653@lisp
3654(use-modules (ice-9 optargs))
3655
(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  600)
                            (height 400))
  @dots{})
3661@end lisp
3662
3663For further details on @code{let-keywords}, @code{define*} and other
3664facilities provided by the @code{(ice-9 optargs)} module, see
3665@ref{Optional Arguments}.
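
As a self-contained sketch of the @code{define*} form in use, the
following version of @code{make-window} simply returns its settings as
a list instead of creating a window. The value given to
@code{screen-depth} and the list-returning body are assumptions made so
that the example can be tried at the REPL.

@lisp
(use-modules (ice-9 optargs))

;; Hypothetical stand-in for the real screen depth.
(define screen-depth 24)

(define* (make-window #:key (depth  screen-depth)
                            (bg     "white")
                            (width  600)
                            (height 400))
  ;; Stand-in body: report the settings that would be used.
  (list depth bg width height))

(make-window)
@result{} (24 "white" 600 400)

(make-window #:width 800 #:height 100)
@result{} (24 "white" 800 100)

(make-window #:height 100 #:bg "black")
@result{} (24 "black" 600 100)
; Keyword arguments may be given in any order.
@end lisp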
3666
3667
3668@node Keyword Read Syntax
3669@subsubsection Keyword Read Syntax
3670
Guile, by default, only recognizes a keyword syntax that is compatible with R5RS.
3672A token of the form @code{#:NAME}, where @code{NAME} has the same syntax
3673as a Scheme symbol (@pxref{Symbol Read Syntax}), is the external
3674representation of the keyword named @code{NAME}. Keyword objects print
3675using this syntax as well, so values containing keyword objects can be
3676read back into Guile. When used in an expression, keywords are
3677self-quoting objects.
3678
If the @code{keywords} read option is set to @code{'prefix}, Guile also
3680recognizes the alternative read syntax @code{:NAME}. Otherwise, tokens
3681of the form @code{:NAME} are read as symbols, as required by R5RS.
3682
3683To enable and disable the alternative non-R5RS keyword syntax, you use
3684the @code{read-set!} procedure documented in @ref{User level options
3685interfaces} and @ref{Reader options}.
3686
3687@smalllisp
3688(read-set! keywords 'prefix)
3689
3690#:type
3691@result{}
3692#:type
3693
3694:type
3695@result{}
3696#:type
3697
3698(read-set! keywords #f)
3699
3700#:type
3701@result{}
3702#:type
3703
3704:type
3705@print{}
3706ERROR: In expression :type:
3707ERROR: Unbound variable: :type
3708ABORT: (unbound-variable)
3709@end smalllisp
3710
3711@node Keyword Procedures
3712@subsubsection Keyword Procedures
3713
3714The following procedures can be used for converting symbols to keywords
3715and back.
3716
3717@deffn {Scheme Procedure} symbol->keyword sym
3718Return a keyword with the same characters as in @var{sym}.
3719@end deffn
3720
3721@deffn {Scheme Procedure} keyword->symbol kw
3722Return a symbol with the same characters as in @var{kw}.
3723@end deffn
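
A brief sketch of the two conversions, using the made-up name
@code{foo}:

@lisp
(symbol->keyword 'foo)
@result{} #:foo

(keyword->symbol #:foo)
@result{} foo

(eq? (keyword->symbol (symbol->keyword 'foo)) 'foo)
@result{} #t
; Converting to a keyword and back yields the original symbol.
@end lisp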
3724
3725
3726@node Keyword Primitives
3727@subsubsection Keyword Primitives
3728
3729Internally, a keyword is implemented as something like a tagged symbol,
3730where the tag identifies the keyword as being self-evaluating, and the
symbol, known as the keyword's @dfn{dash symbol}, has the same name as
3732the keyword name but prefixed by a single dash. For example, the
3733keyword @code{#:name} has the corresponding dash symbol @code{-name}.
3734
3735Most keyword objects are constructed automatically by the reader when it
3736reads a token beginning with @code{#:}. However, if you need to
3737construct a keyword object programmatically, you can do so by calling
3738@code{make-keyword-from-dash-symbol} with the corresponding dash symbol
3739(as the reader does). The dash symbol for a keyword object can be
3740retrieved using the @code{keyword-dash-symbol} procedure.
3741
3742@deffn {Scheme Procedure} make-keyword-from-dash-symbol symbol
3743@deffnx {C Function} scm_make_keyword_from_dash_symbol (symbol)
3744Make a keyword object from a @var{symbol} that starts with a dash.
3745For example,
3746
3747@example
3748(make-keyword-from-dash-symbol '-foo)
3749@result{} #:foo
3750@end example
3751@end deffn
3752
3753@deffn {Scheme Procedure} keyword? obj
3754@deffnx {C Function} scm_keyword_p (obj)
3755Return @code{#t} if the argument @var{obj} is a keyword, else
3756@code{#f}.
3757@end deffn
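
For instance, a short sketch of @code{keyword?} applied to a keyword
and to two objects that merely resemble one:

@lisp
(keyword? #:foo)
@result{} #t

(keyword? 'foo)
@result{} #f
; A symbol with the same name is not a keyword.

(keyword? "#:foo")
@result{} #f
; Neither is a string that spells out the read syntax.
@end lisp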
3758
3759@deffn {Scheme Procedure} keyword-dash-symbol keyword
3760@deffnx {C Function} scm_keyword_dash_symbol (keyword)
3761Return the dash symbol for @var{keyword}.
3762This is the inverse of @code{make-keyword-from-dash-symbol}.
3763For example,
3764
3765@example
3766(keyword-dash-symbol #:foo)
3767@result{} -foo
3768@end example
3769@end deffn
3770
3771@deftypefn {C Function} SCM scm_c_make_keyword (char *@var{str})
3772Make a keyword object from a string. For example,
3773
3774@example
3775scm_c_make_keyword ("foo")
3776@result{} #:foo
3777@end example
3778@c
3779@c FIXME: What can be said about the string argument? Currently it's
3780@c not used after creation, but should that be documented?
3781@end deftypefn
3782
3783
3784@node Other Types
3785@subsection ``Functionality-Centric'' Data Types
3786
3787Procedures and macros are documented in their own chapter: see
3788@ref{Procedures and Macros}.
3789
3790Variable objects are documented as part of the description of Guile's
3791module system: see @ref{Variables}.
3792
3793Asyncs, dynamic roots and fluids are described in the chapter on
3794scheduling: see @ref{Scheduling}.
3795
3796Hooks are documented in the chapter on general utility functions: see
3797@ref{Hooks}.
3798
3799Ports are described in the chapter on I/O: see @ref{Input and Output}.
3800
3801
3802@c Local Variables:
3803@c TeX-master: "guile.texi"
3804@c End: