doc/ref/scheme-data.texi

   1 @page
   2 @node Simple Data Types
   3 @chapter Simple Generic Data Types
   4
   5 This chapter describes those of Guile's simple data types which are
   6 primarily used for their role as items of generic data.  By
   7 @dfn{simple} we mean data types that are not primarily used as
   8 containers to hold other data, such as pairs, lists and vectors.
   9 For the documentation of such @dfn{compound} data types, see
  10 @ref{Compound Data Types}.
  11
  12 One of the great strengths of Scheme is that there is no straightforward
  13 distinction between ``data'' and ``functionality''.  For example,
  14 Guile's support for dynamic linking could be described
  15
  16 @itemize @bullet
  17 @item
  18 either in a ``data-centric'' way, as the behaviour and properties of the
  19 ``dynamically linked object'' data type, and the operations that may be
  20 applied to instances of this type
  21
  22 @item
  23 or in a ``functionality-centric'' way, as the set of procedures that
  24 constitute Guile's support for dynamic linking, in the context of the
  25 module system.
  26 @end itemize
  27
  28 The contents of this chapter are, therefore, a matter of judgment.  By
  29 @dfn{generic}, we mean to select those data types whose typical use as
  30 @emph{data} in a wide variety of programming contexts is more important
  31 than their use in the implementation of a particular piece of
  32 @emph{functionality}.  The last section of this chapter provides
  33 references for all the data types that are documented not here but in a
  34 ``functionality-centric'' way elsewhere in the manual.
  35
  36 @menu
  37 * Booleans::                    True/false values.
  38 * Numbers::                     Numerical data types.
  39 * Characters::                  New character names.
  40 * Strings::                     Special things about strings.
  41 * Regular Expressions::         Pattern matching and substitution.
  42 * Symbols::                     Symbols.
  43 * Keywords::                    Self-quoting, customizable display keywords.
  44 * Other Types::                 "Functionality-centric" data types.
  45 @end menu
  46
  47
  48 @node Booleans
  49 @section Booleans
  50 @tpindex Booleans
  51
  52 The two boolean values are @code{#t} for true and @code{#f} for false.
  53
  54 Boolean values are returned by predicate procedures, such as the general
  55 equality predicates @code{eq?}, @code{eqv?} and @code{equal?}
  56 (@pxref{Equality}) and numerical and string comparison operators like
  57 @code{string=?} (@pxref{String Comparison}) and @code{<=}
  58 (@pxref{Comparison}).
  59
  60 @lisp
  61 (<= 3 8)
  62 @result{}
  63 #t
  64
  65 (<= 3 -3)
  66 @result{}
  67 #f
  68
  69 (equal? "house" "houses")
  70 @result{}
  71 #f
  72
  73 (eq? #f #f)
  74 @result{}
  75 #t
  76 @end lisp
  77
  78 In test condition contexts like @code{if} and @code{cond} (@pxref{if
  79 cond case}), where a group of subexpressions will be evaluated only if a
  80 @var{condition} expression evaluates to ``true'', ``true'' means any
  81 value at all except @code{#f}.
  82
  83 @lisp
  84 (if #t "yes" "no")
  85 @result{}
  86 "yes"
  87
  88 (if 0 "yes" "no")
  89 @result{}
  90 "yes"
  91
  92 (if #f "yes" "no")
  93 @result{}
  94 "no"
  95 @end lisp
  96
  97 A result of this asymmetry is that typical Scheme source code more often
  98 uses @code{#f} explicitly than @code{#t}: @code{#f} is necessary to
  99 represent an @code{if} or @code{cond} false value, whereas @code{#t} is
 100 not necessary to represent an @code{if} or @code{cond} true value.
 101
 102 It is important to note that @code{#f} is @strong{not} equivalent to any
 103 other Scheme value.  In particular, @code{#f} is not the same as the
 104 number 0 (as in C and C++), and not the same as the ``empty list''
 105 (as in some Lisp dialects).
 106
 107 The @code{not} procedure returns the boolean inverse of its argument:
 108
 109 @rnindex not
 110 @deffn {Scheme Procedure} not x
 111 @deffnx {C Function} scm_not (x)
 112 Return @code{#t} iff @var{x} is @code{#f}, else return @code{#f}.
 113 @end deffn
 114
 115 The @code{boolean?} procedure is a predicate that returns @code{#t} if
 116 its argument is one of the boolean values, otherwise @code{#f}.
 117
 118 @rnindex boolean?
 119 @deffn {Scheme Procedure} boolean? obj
 120 @deffnx {C Function} scm_boolean_p (obj)
 121 Return @code{#t} iff @var{obj} is either @code{#t} or @code{#f}.
 122 @end deffn
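
As a brief illustration of the two procedures above (a sketch that uses
only @code{not} and @code{boolean?}), values such as @code{0} and the
empty list count as true, and are not themselves booleans:

@lisp
(not #f)       @result{} #t
(not #t)       @result{} #f
(not 0)        @result{} #f
(not '())      @result{} #f

(boolean? #f)  @result{} #t
(boolean? 0)   @result{} #f
(boolean? '()) @result{} #f
@end lisp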
 123
 124
 125 @node Numbers
 126 @section Numerical data types
 127 @tpindex Numbers
 128
 129 Guile supports a rich ``tower'' of numerical types --- integer,
 130 rational, real and complex --- and provides an extensive set of
 131 mathematical and scientific functions for operating on numerical
 132 data.  This section of the manual documents those types and functions.
 133
 134 You may also find it illuminating to read R5RS's presentation of numbers
 135 in Scheme, which is particularly clear and accessible: see
 136 @ref{Numbers,,,r5rs}.
 137
 138 @menu
 139 * Numerical Tower::             Scheme's numerical "tower".
 140 * Integers::                    Whole numbers.
 141 * Reals and Rationals::         Real and rational numbers.
 142 * Complex Numbers::             Complex numbers.
 143 * Exactness::                   Exactness and inexactness.
 144 * Number Syntax::               Read syntax for numerical data.
 145 * Integer Operations::          Operations on integer values.
 146 * Comparison::                  Comparison predicates.
 147 * Conversion::                  Converting numbers to and from strings.
 148 * Complex::                     Complex number operations.
 149 * Arithmetic::                  Arithmetic functions.
 150 * Scientific::                  Scientific functions.
 151 * Primitive Numerics::          Primitive numeric functions.
 152 * Bitwise Operations::          Logical AND, OR, NOT, and so on.
 153 * Random::                      Random number generation.
 154 @end menu
 155
 156
 157 @node Numerical Tower
 158 @subsection Scheme's Numerical ``Tower''
 159 @rnindex number?
 160
 161 Scheme's numerical ``tower'' consists of the following categories of
 162 numbers:
 163
 164 @itemize @bullet
 165 @item
 166 integers (whole numbers)
 167
 168 @item
 169 rationals (the set of numbers that can be expressed as P/Q where P and Q
 170 are integers)
 171
 172 @item
 173 real numbers (the set of numbers that describes all possible positions
 174 along a one dimensional line)
 175
 176 @item
 177 complex numbers (the set of numbers that describes all possible
 178 positions in a two dimensional space)
 179 @end itemize
 180
 181 It is called a tower because each category ``sits on'' the one that
 182 follows it, in the sense that every integer is also a rational, every
 183 rational is also real, and every real number is also a complex number
 184 (but with zero imaginary part).
 185
 186 Of these, Guile implements integers, reals and complex numbers as
 187 distinct types.  Rationals are supported only in the read syntax
 188 for rational numbers that is specified by R5RS; Guile immediately
 189 converts each such number to the corresponding real number.
 190
 191 The @code{number?} predicate may be applied to any Scheme value to
 192 discover whether the value is any of the supported numerical types.
 193
 194 @deffn {Scheme Procedure} number? obj
 195 @deffnx {C Function} scm_number_p (obj)
 196 Return @code{#t} if @var{obj} is any kind of number, else @code{#f}.
 197 @end deffn
 198
 199 For example:
 200
 201 @lisp
 202 (number? 3)
 203 @result{}
 204 #t
 205
 206 (number? "hello there!")
 207 @result{}
 208 #f
 209
 210 (define pi 3.141592654)
 211 (number? pi)
 212 @result{}
 213 #t
 214 @end lisp
 215
 216 The next few subsections document each of Guile's numerical data types
 217 in detail.
 218
 219 @node Integers
 220 @subsection Integers
 221
 222 @tpindex Integer numbers
 223
 224 @rnindex integer?
 225
 226 Integers are whole numbers, that is numbers with no fractional part,
 227 such as 2, 83 and -3789.
 228
 229 Integers in Guile can be arbitrarily big, as shown by the following
 230 example.
 231
 232 @lisp
 233 (define (factorial n)
 234   (let loop ((n n) (product 1))
 235     (if (= n 0)
 236         product
 237         (loop (- n 1) (* product n)))))
 238
 239 (factorial 3)
 240 @result{}
 241 6
 242
 243 (factorial 20)
 244 @result{}
 245 2432902008176640000
 246
 247 (- (factorial 45))
 248 @result{}
 249 -119622220865480194561963161495657715064383733760000000000
 250 @end lisp
 251
 252 Readers whose background is in programming languages where integers are
 253 limited by the need to fit into just 4 or 8 bytes of memory may find
 254 this surprising, or suspect that Guile's representation of integers is
 255 inefficient.  In fact, Guile achieves a near optimal balance of
 256 convenience and efficiency by using the host computer's native
 257 representation of integers where possible, and a more general
 258 representation where the required number does not fit in the native
 259 form.  Conversion between these two representations is automatic and
 260 completely invisible to the Scheme level programmer.
 261
 262 The infinities @code{+inf.0} and @code{-inf.0} are considered to be
 263 inexact integers.  They are explained in detail in the next section,
 264 together with reals and rationals.
 265
 266 @c REFFIXME Maybe point here to discussion of handling immediates/bignums
 267 @c on the C level, where the conversion is not so automatic - NJ
 268
 269 @deffn {Scheme Procedure} integer? x
 270 @deffnx {C Function} scm_integer_p (x)
 271 Return @code{#t} if @var{x} is an integer number, else @code{#f}.
 272
 273 @lisp
 274 (integer? 487)
 275 @result{}
 276 #t
 277
 278 (integer? -3.4)
 279 @result{}
 280 #f
 281
 282 (integer? +inf.0)
 283 @result{}
 284 #t
 285 @end lisp
 286 @end deffn
 287
 288
 289 @node Reals and Rationals
 290 @subsection Real and Rational Numbers
 291 @tpindex Real numbers
 292 @tpindex Rational numbers
 293
 294 @rnindex real?
 295 @rnindex rational?
 296
 297 Mathematically, the real numbers are the set of numbers that describe
 298 all possible points along a continuous, infinite, one-dimensional line.
 299 The rational numbers are the set of all numbers that can be written as
 300 fractions P/Q, where P and Q are integers.  All rational numbers are
 301 also real, but there are real numbers that are not rational, for example
 302 the square root of 2, and pi.
 303
 304 Guile represents both real and rational numbers approximately using a
 305 floating point encoding with limited precision.  Even though the actual
 306 encoding is in binary, it may be helpful to think of it as a decimal
 307 number with a limited number of significant figures and a decimal point
 308 somewhere, since this corresponds to the standard notation for non-whole
 309 numbers.  For example:
 310
 311 @lisp
 312 0.34
 313 -0.00000142857931198
 314 -5648394822220000000000.0
 315 4.0
 316 @end lisp
 317
 318 The limited precision of Guile's encoding means that any ``real'' number
 319 in Guile can be written in a rational form, by multiplying and then dividing
 320 by sufficient powers of 10 (or in fact, 2).  For example,
 321 @code{-0.00000142857931198} is the same as @code{-142857931198} divided by
 322 @code{100000000000000000}.  In Guile's current incarnation, therefore,
 323 the @code{rational?} and @code{real?} predicates are equivalent.
 324
 325 Another aspect of this equivalence is that Guile currently does not
 326 preserve the exactness that is possible with rational arithmetic.
 327 If such exactness is needed, it is of course possible to implement
 328 exact rational arithmetic at the Scheme level using Guile's arbitrary
 329 size integers.
 330
 331 A planned future revision of Guile's numerical tower will make it
 332 possible to implement exact representations and arithmetic for both
 333 rational numbers and real irrational numbers such as square roots,
 334 and in such a way that the new kinds of number integrate seamlessly
 335 with those that are already implemented.
 336
 337 Dividing by an exact zero leads to an error message, as one might
 338 expect.  However, dividing by an inexact zero does not produce an
 339 error.  Instead, the result of the division is either plus or minus
 340 infinity, depending on the sign of the divided number.
 341
 342 The infinities are written @samp{+inf.0} and @samp{-inf.0},
 343 respectively.  This syntax is also recognized by @code{read} as an
 344 extension to the usual Scheme syntax.
 345
 346 Dividing zero by zero yields something that is not a number at all:
 347 @samp{+nan.0}.  This is the special ``not a number'' value.
 348
 349 On platforms that follow IEEE 754 for their floating point arithmetic,
 350 the @samp{+inf.0}, @samp{-inf.0}, and @samp{+nan.0} values are
 351 implemented using the corresponding IEEE 754 values.  They behave in
 352 arithmetic operations as IEEE 754 describes, e.g., @code{(=
 353 +nan.0 +nan.0)} @result{} @code{#f}.
 354
 355 The infinities are inexact integers and are considered to be both even
 356 and odd.  While @samp{+nan.0} is not @code{=} to itself, it is
 357 @code{eqv?} to itself.
 358
 359 To test for the special values, use the functions @code{inf?} and
 360 @code{nan?}.
 361
 362 @deffn {Scheme Procedure} real? obj
 363 @deffnx {C Function} scm_real_p (obj)
 364 Return @code{#t} if @var{obj} is a real number, else @code{#f}.
 365 Note that the sets of integer and rational values form subsets
 366 of the set of real numbers, so the predicate will also be fulfilled
 367 if @var{obj} is an integer number or a rational number.
 368 @end deffn
 369
 370 @deffn {Scheme Procedure} rational? x
 371 @deffnx {C Function} scm_real_p (x)
 372 Return @code{#t} if @var{x} is a rational number, @code{#f}
 373 otherwise.  Note that the set of integer values forms a subset of
 374 the set of rational numbers, i.e. the predicate will also be
 375 fulfilled if @var{x} is an integer number.  Real numbers
 376 will also satisfy this predicate, because of their limited
 377 precision.
 378 @end deffn
 379
 380 @deffn {Scheme Procedure} inf? x
 381 Return @code{#t} if @var{x} is either @samp{+inf.0} or @samp{-inf.0},
 382 @code{#f} otherwise.
 383 @end deffn
 384
 385 @deffn {Scheme Procedure} nan? x
 386 Return @code{#t} if @var{x} is @samp{+nan.0}, @code{#f} otherwise.
 387 @end deffn
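
The behaviour of the special values described above can be sketched as
follows.  This example assumes a platform whose floating point
arithmetic follows IEEE 754; the results may differ elsewhere.

@lisp
(/ 1 0.0)               @result{} +inf.0
(/ -1 0.0)              @result{} -inf.0
(/ 0.0 0.0)             @result{} +nan.0

(inf? (/ 1 0.0))        @result{} #t
(nan? (/ 0.0 0.0))      @result{} #t
(inf? 42.0)             @result{} #f

;; +nan.0 is never numerically equal to anything, itself included,
;; but it is the same object as itself in the sense of eqv?.
(= +nan.0 +nan.0)       @result{} #f
(eqv? +nan.0 +nan.0)    @result{} #t
@end lisp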
 388
 389 @node Complex Numbers
 390 @subsection Complex Numbers
 391 @tpindex Complex numbers
 392
 393 @rnindex complex?
 394
 395 Complex numbers are the set of numbers that describe all possible points
 396 in a two-dimensional space.  The two coordinates of a particular point
 397 in this space are known as the @dfn{real} and @dfn{imaginary} parts of
 398 the complex number that describes that point.
 399
 400 In Guile, complex numbers are written in rectangular form as the sum of
 401 their real and imaginary parts, using the symbol @code{i} to indicate
 402 the imaginary part.
 403
 404 @lisp
 405 3+4i
 406 @result{}
 407 3.0+4.0i
 408
 409 (* 3-8i 2.3+0.3i)
 410 @result{}
 411 9.3-17.5i
 412 @end lisp
 413
 414 Guile represents a complex number as a pair of numbers both of which are
 415 real, so the real and imaginary parts of a complex number have the same
 416 properties of inexactness and limited precision as single real numbers.
 417
 418 @deffn {Scheme Procedure} complex? x
 419 @deffnx {C Function} scm_number_p (x)
 420 Return @code{#t} if @var{x} is a complex number, @code{#f}
 421 otherwise.  Note that the sets of real, rational and integer
 422 values form subsets of the set of complex numbers, i.e. the
 423 predicate will also be fulfilled if @var{x} is a real,
 424 rational or integer number.
 425 @end deffn
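
The way the type predicates nest can be seen by applying them to the
same values.  This is only an illustrative sketch; the sample values are
arbitrary.

@lisp
;; An integer satisfies every predicate in the tower.
(integer? 3)     @result{} #t
(rational? 3)    @result{} #t
(real? 3)        @result{} #t
(complex? 3)     @result{} #t

;; A non-integral real is rational, real and complex.
(integer? 1.5)   @result{} #f
(rational? 1.5)  @result{} #t
(real? 1.5)      @result{} #t

;; A number with a non-zero imaginary part is only complex.
(real? 3+4i)     @result{} #f
(complex? 3+4i)  @result{} #t
(number? 3+4i)   @result{} #t
@end lisp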
 426
 427
 428 @node Exactness
 429 @subsection Exact and Inexact Numbers
 430 @tpindex Exact numbers
 431 @tpindex Inexact numbers
 432
 433 @rnindex exact?
 434 @rnindex inexact?
 435 @rnindex exact->inexact
 436 @rnindex inexact->exact
 437
 438 R5RS requires that a calculation involving inexact numbers always
 439 produces an inexact result.  To meet this requirement, Guile
 440 distinguishes between an exact integer value such as @code{5} and the
 441 corresponding inexact real value which, to the limited precision
 442 available, has no fractional part, and is printed as @code{5.0}.  Guile
 443 will only convert the latter value to the former when forced to do so by
 444 an invocation of the @code{inexact->exact} procedure.
 445
 446 @deffn {Scheme Procedure} exact? x
 447 @deffnx {C Function} scm_exact_p (x)
 448 Return @code{#t} if @var{x} is an exact number, @code{#f}
 449 otherwise.
 450 @end deffn
 451
 452 @deffn {Scheme Procedure} inexact? x
 453 @deffnx {C Function} scm_inexact_p (x)
 454 Return @code{#t} if @var{x} is an inexact number, @code{#f}
 455 otherwise.
 456 @end deffn
 457
 458 @deffn {Scheme Procedure} inexact->exact z
 459 @deffnx {C Function} scm_inexact_to_exact (z)
 460 Return an exact number that is numerically closest to @var{z}.
 461 @end deffn
 462
 463 @c begin (texi-doc-string "guile" "exact->inexact")
 464 @deffn {Scheme Procedure} exact->inexact z
 465 Convert the number @var{z} to its inexact representation.
 466 @end deffn
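
The following short sketch shows the exactness predicates and
conversions together, including the rule that a calculation involving an
inexact number produces an inexact result:

@lisp
(exact? 5)             @result{} #t
(exact? 5.0)           @result{} #f
(inexact? 5.0)         @result{} #t

(exact->inexact 5)     @result{} 5.0
(inexact->exact 5.0)   @result{} 5

;; Mixing exact and inexact operands yields an inexact result.
(+ 5 1.0)              @result{} 6.0
@end lisp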
 467
 468
 469 @node Number Syntax
 470 @subsection Read Syntax for Numerical Data
 471
 472 The read syntax for integers is a string of digits, optionally
 473 preceded by a minus or plus character, a code indicating the
 474 base in which the integer is encoded, and a code indicating whether
 475 the number is exact or inexact.  The supported base codes are:
 476
 477 @itemize @bullet
 478 @item
 479 @code{#b}, @code{#B} --- the integer is written in binary (base 2)
 480
 481 @item
 482 @code{#o}, @code{#O} --- the integer is written in octal (base 8)
 483
 484 @item
 485 @code{#d}, @code{#D} --- the integer is written in decimal (base 10)
 486
 487 @item
 488 @code{#x}, @code{#X} --- the integer is written in hexadecimal (base 16).
 489 @end itemize
 490
 491 If the base code is omitted, the integer is assumed to be decimal.  The
 492 following examples show how these base codes are used.
 493
 494 @lisp
 495 -13
 496 @result{}
 497 -13
 498
 499 #d-13
 500 @result{}
 501 -13
 502
 503 #x-13
 504 @result{}
 505 -19
 506
 507 #b+1101
 508 @result{}
 509 13
 510
 511 #o377
 512 @result{}
 513 255
 514 @end lisp
 515
 516 The codes for indicating exactness (which can, incidentally, be applied
 517 to all numerical values) are:
 518
 519 @itemize @bullet
 520 @item
 521 @code{#e}, @code{#E} --- the number is exact
 522
 523 @item
 524 @code{#i}, @code{#I} --- the number is inexact.
 525 @end itemize
 526
 527 If the exactness indicator is omitted, the integer is assumed to be exact,
 528 since Guile's internal representation for integers is always exact.
 529 Real numbers have limited precision similar to the precision of the
 530 @code{double} type in C.  A consequence of the limited precision is that
 531 all real numbers in Guile are also rational, since any number R with a
 532 limited number of decimal places, say N, can be made into an integer by
 533 multiplying by 10^N.
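
As a small sketch of the exactness codes, the same numeral can be read
as exact or inexact depending on its prefix.  The values are chosen to
be whole numbers so that the conversion loses no information.

@lisp
5       @result{} 5
#i5     @result{} 5.0
#i-13   @result{} -13.0

5.0     @result{} 5.0
#e5.0   @result{} 5

(exact? #e5.0)   @result{} #t
(inexact? #i5)   @result{} #t
@end lisp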
 534
 535 Guile also understands the syntax @samp{+inf.0} and @samp{-inf.0} for
 536 plus and minus infinity, respectively.  The value must be written
 537 exactly as shown, that is, they must always have a sign and exactly one
 538 zero digit after the decimal point.  It also understands @samp{+nan.0}
 539 and @samp{-nan.0} for the special ``not-a-number'' value.  The sign is
 540 ignored for ``not-a-number'' and the value is always printed as @samp{+nan.0}.
 541
 542 @node Integer Operations
 543 @subsection Operations on Integer Values
 544 @rnindex odd?
 545 @rnindex even?
 546 @rnindex quotient
 547 @rnindex remainder
 548 @rnindex modulo
 549 @rnindex gcd
 550 @rnindex lcm
 551
 552 @deffn {Scheme Procedure} odd? n
 553 @deffnx {C Function} scm_odd_p (n)
 554 Return @code{#t} if @var{n} is an odd number, @code{#f}
 555 otherwise.
 556 @end deffn
 557
 558 @deffn {Scheme Procedure} even? n
 559 @deffnx {C Function} scm_even_p (n)
 560 Return @code{#t} if @var{n} is an even number, @code{#f}
 561 otherwise.
 562 @end deffn
 563
 564 @c begin (texi-doc-string "guile" "quotient")
 565 @deffn {Scheme Procedure} quotient x y
 566 Return the quotient of the numbers @var{x} and @var{y}.
 567 @end deffn
 568
 569 @c begin (texi-doc-string "guile" "remainder")
 570 @deffn {Scheme Procedure} remainder x y
 571 Return the remainder of the numbers @var{x} and @var{y}.
 572 @lisp
 573 (remainder 13 4) @result{} 1
 574 (remainder -13 4) @result{} -1
 575 @end lisp
 576 @end deffn
 577
 578 @c begin (texi-doc-string "guile" "modulo")
 579 @deffn {Scheme Procedure} modulo x y
 580 Return the modulo of the numbers @var{x} and @var{y}.
 581 @lisp
 582 (modulo 13 4) @result{} 1
 583 (modulo -13 4) @result{} 3
 584 @end lisp
 585 @end deffn
 586
 587 @c begin (texi-doc-string "guile" "gcd")
 588 @deffn {Scheme Procedure} gcd
 589 Return the greatest common divisor of all arguments.
 590 If called without arguments, 0 is returned.
 591 @end deffn
 592
 593 @c begin (texi-doc-string "guile" "lcm")
 594 @deffn {Scheme Procedure} lcm
 595 Return the least common multiple of the arguments.
 596 If called without arguments, 1 is returned.
 597 @end deffn
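
The integer operations above can be tried together as follows.  This is
an illustrative sketch only; it also checks the usual identity relating
@code{quotient} and @code{remainder}.

@lisp
(odd? 7)          @result{} #t
(even? 0)         @result{} #t

;; quotient truncates towards zero.
(quotient 13 4)   @result{} 3
(quotient -13 4)  @result{} -3

;; n = d * (quotient n d) + (remainder n d)
(= -13 (+ (* 4 (quotient -13 4)) (remainder -13 4)))
@result{} #t

(gcd 12 18)       @result{} 6
(gcd)             @result{} 0
(lcm 4 6)         @result{} 12
(lcm)             @result{} 1
@end lisp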
 598
 599
 600 @node Comparison
 601 @subsection Comparison Predicates
 602 @rnindex zero?
 603 @rnindex positive?
 604 @rnindex negative?
 605
 606 @c begin (texi-doc-string "guile" "=")
 607 @deffn {Scheme Procedure} =
 608 Return @code{#t} if all parameters are numerically equal.
 609 @end deffn
 610
 611 @c begin (texi-doc-string "guile" "<")
 612 @deffn {Scheme Procedure} <
 613 Return @code{#t} if the list of parameters is monotonically
 614 increasing.
 615 @end deffn
 616
 617 @c begin (texi-doc-string "guile" ">")
 618 @deffn {Scheme Procedure} >
 619 Return @code{#t} if the list of parameters is monotonically
 620 decreasing.
 621 @end deffn
 622
 623 @c begin (texi-doc-string "guile" "<=")
 624 @deffn {Scheme Procedure} <=
 625 Return @code{#t} if the list of parameters is monotonically
 626 non-decreasing.
 627 @end deffn
 628
 629 @c begin (texi-doc-string "guile" ">=")
 630 @deffn {Scheme Procedure} >=
 631 Return @code{#t} if the list of parameters is monotonically
 632 non-increasing.
 633 @end deffn
 634
 635 @c begin (texi-doc-string "guile" "zero?")
 636 @deffn {Scheme Procedure} zero? z
 637 Return @code{#t} if @var{z} is an exact or inexact number equal to
 638 zero.
 639 @end deffn
 640
 641 @c begin (texi-doc-string "guile" "positive?")
 642 @deffn {Scheme Procedure} positive? x
 643 Return @code{#t} if @var{x} is an exact or inexact number greater than
 644 zero.
 645 @end deffn
 646
 647 @c begin (texi-doc-string "guile" "negative?")
 648 @deffn {Scheme Procedure} negative? x
 649 Return @code{#t} if @var{x} is an exact or inexact number less than
 650 zero.
 651 @end deffn
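
A short sketch of the comparison predicates follows.  Note that
@code{<} and @code{>} require strictly increasing or decreasing
arguments, whereas @code{<=} and @code{>=} also accept equal neighbours.

@lisp
(= 1 1.0)      @result{} #t
(= 1 1 2)      @result{} #f

(< 1 2 3)      @result{} #t
(< 1 1 2)      @result{} #f
(<= 1 1 2)     @result{} #t

(> 3 2 1)      @result{} #t
(>= 3 3 1)     @result{} #t

(zero? 0.0)    @result{} #t
(positive? -2) @result{} #f
(negative? -2) @result{} #t
@end lisp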
 652
 653
 654 @node Conversion
 655 @subsection Converting Numbers To and From Strings
 656 @rnindex number->string
 657 @rnindex string->number
 658
 659 @deffn {Scheme Procedure} number->string n [radix]
 660 @deffnx {C Function} scm_number_to_string (n, radix)
 661 Return a string holding the external representation of the
 662 number @var{n} in the given @var{radix}.  If @var{n} is
 663 inexact, a radix of 10 will be used.
 664 @end deffn
 665
 666 @deffn {Scheme Procedure} string->number string [radix]
 667 @deffnx {C Function} scm_string_to_number (string, radix)
 668 Return a number of the maximally precise representation
 669 expressed by the given @var{string}. @var{radix} must be an
 670 exact integer, either 2, 8, 10, or 16. If supplied, @var{radix}
 671 is a default radix that may be overridden by an explicit radix
 672 prefix in @var{string} (e.g. "#o177"). If @var{radix} is not
 673 supplied, then the default radix is 10. If string is not a
 674 syntactically valid notation for a number, then
 675 @code{string->number} returns @code{#f}.
 676 @end deffn
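
The two conversion procedures can be sketched as follows.  The last
@code{string->number} examples show a radix prefix in the string
overriding the @var{radix} argument, and the @code{#f} result for a
string that is not a valid number.

@lisp
(number->string 255)         @result{} "255"
(number->string 255 16)      @result{} "ff"
(number->string 255 2)       @result{} "11111111"

(string->number "255")       @result{} 255
(string->number "ff" 16)     @result{} 255
(string->number "#o177" 16)  @result{} 127
(string->number "hello")     @result{} #f
@end lisp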
 677
 678
 679 @node Complex
 680 @subsection Complex Number Operations
 681 @rnindex make-rectangular
 682 @rnindex make-polar
 683 @rnindex real-part
 684 @rnindex imag-part
 685 @rnindex magnitude
 686 @rnindex angle
 687
 688 @deffn {Scheme Procedure} make-rectangular real imaginary
 689 @deffnx {C Function} scm_make_rectangular (real, imaginary)
 690 Return a complex number constructed of the given @var{real} and
 691 @var{imaginary} parts.
 692 @end deffn
 693
 694 @deffn {Scheme Procedure} make-polar x y
 695 @deffnx {C Function} scm_make_polar (x, y)
 696 Return the complex number @var{x} * e^(i * @var{y}).
 697 @end deffn
 698
 699 @c begin (texi-doc-string "guile" "real-part")
 700 @deffn {Scheme Procedure} real-part z
 701 Return the real part of the number @var{z}.
 702 @end deffn
 703
 704 @c begin (texi-doc-string "guile" "imag-part")
 705 @deffn {Scheme Procedure} imag-part z
 706 Return the imaginary part of the number @var{z}.
 707 @end deffn
 708
 709 @c begin (texi-doc-string "guile" "magnitude")
 710 @deffn {Scheme Procedure} magnitude z
 711 Return the magnitude of the number @var{z}. This is the same as
 712 @code{abs} for real arguments, but also allows complex numbers.
 713 @end deffn
 714
 715 @c begin (texi-doc-string "guile" "angle")
 716 @deffn {Scheme Procedure} angle z
 717 Return the angle of the complex number @var{z}.
 718 @end deffn
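
As an illustrative sketch, the number 3+4i can be built and taken apart
with the procedures above.  The values are chosen so that the magnitude
is a whole number.

@lisp
(define z (make-rectangular 3 4))

(real-part z)   @result{} 3.0
(imag-part z)   @result{} 4.0
(magnitude z)   @result{} 5.0

;; magnitude agrees with abs for real arguments.
(magnitude -5)  @result{} 5
@end lisp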
 719
 720
 721 @node Arithmetic
 722 @subsection Arithmetic Functions
 723 @rnindex max
 724 @rnindex min
 725 @rnindex +
 726 @rnindex *
 727 @rnindex -
 728 @rnindex /
 729 @rnindex abs
 730 @rnindex floor
 731 @rnindex ceiling
 732 @rnindex truncate
 733 @rnindex round
 734
 735 @c begin (texi-doc-string "guile" "+")
 736 @deffn {Scheme Procedure} + z1 @dots{}
 737 Return the sum of all parameter values.  Return 0 if called without any
 738 parameters.
 739 @end deffn
 740
 741 @c begin (texi-doc-string "guile" "-")
 742 @deffn {Scheme Procedure} - z1 z2 @dots{}
 743 If called with one argument @var{z1}, -@var{z1} is returned. Otherwise
 744 the sum of all but the first argument is subtracted from the first
 745 argument.
 746 @end deffn
 747
 748 @c begin (texi-doc-string "guile" "*")
 749 @deffn {Scheme Procedure} * z1 @dots{}
 750 Return the product of all arguments.  If called without arguments, 1 is
 751 returned.
 752 @end deffn
 753
 754 @c begin (texi-doc-string "guile" "/")
 755 @deffn {Scheme Procedure} / z1 z2 @dots{}
 756 Divide the first argument by the product of the remaining arguments.  If
 757 called with one argument @var{z1}, 1/@var{z1} is returned.
 758 @end deffn
 759
 760 @c begin (texi-doc-string "guile" "abs")
 761 @deffn {Scheme Procedure} abs x
 762 @deffnx {C Function} scm_abs (x)
 763 Return the absolute value of @var{x}.
 764
 765 @var{x} must be a number with zero imaginary part.  To calculate the
 766 magnitude of a complex number, use @code{magnitude} instead.
 767 @end deffn
 768
 769 @c begin (texi-doc-string "guile" "max")
 770 @deffn {Scheme Procedure} max x1 x2 @dots{}
 771 Return the maximum of all parameter values.
 772 @end deffn
 773
 774 @c begin (texi-doc-string "guile" "min")
 775 @deffn {Scheme Procedure} min x1 x2 @dots{}
 776 Return the minimum of all parameter values.
 777 @end deffn
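
The basic arithmetic procedures can be sketched as follows, including
the results of calling them with no arguments or with one argument.  The
operands are chosen so that every division comes out exactly.

@lisp
(+)            @result{} 0
(+ 1 2 3)      @result{} 6

(- 5)          @result{} -5
(- 10 1 2 3)   @result{} 4

(*)            @result{} 1
(* 2 3 4)      @result{} 24

(/ 0.5)        @result{} 2.0
(/ 12 2 3)     @result{} 2

(abs -7)       @result{} 7
(max 1 5 3)    @result{} 5
(min 1 5 3)    @result{} 1
@end lisp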
 778
 779 @c begin (texi-doc-string "guile" "truncate")
 780 @deffn {Scheme Procedure} truncate x
 781 Round the inexact number @var{x} towards zero.
 782 @end deffn
 783
 784 @c begin (texi-doc-string "guile" "round")
 785 @deffn {Scheme Procedure} round x
 786 Round the inexact number @var{x} to the nearest integer, with ties to even.
 787 @end deffn
 788
 789 @c begin (texi-doc-string "guile" "floor")
 790 @deffn {Scheme Procedure} floor x
 791 Round the number @var{x} towards minus infinity.
 792 @end deffn
 793
 794 @c begin (texi-doc-string "guile" "ceiling")
 795 @deffn {Scheme Procedure} ceiling x
 796 Round the number @var{x} towards infinity.
 797 @end deffn
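
The four rounding procedures differ only in the direction in which they
round, as this sketch shows.  @code{round} follows the R5RS rule of
rounding to the nearest integer, choosing the even one when the argument
is exactly halfway between two integers.

@lisp
(truncate 2.7)   @result{} 2.0
(truncate -2.7)  @result{} -2.0

(round 2.7)      @result{} 3.0
(round 2.5)      @result{} 2.0
(round 3.5)      @result{} 4.0

(floor -2.7)     @result{} -3.0
(ceiling -2.7)   @result{} -2.0
@end lisp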
 798
 799 For the @code{truncate} and @code{round} procedures, the Guile library
 800 exports equivalent C functions, but taking and returning arguments of
 801 type @code{double} rather than the usual @code{SCM}.
 802
 803 @deftypefn {C Function} double scm_truncate (double x)
 804 @deftypefnx {C Function} double scm_round (double x)
 805 @end deftypefn
 806
 807 For @code{floor} and @code{ceiling}, the equivalent C functions are
 808 @code{floor} and @code{ceil} from the standard mathematics library
 809 (which also take and return @code{double} arguments).
 810
 811
 812 @node Scientific
 813 @subsection Scientific Functions
 814
 815 The following procedures accept any kind of number as arguments,
 816 including complex numbers.
 817
 818 @rnindex sqrt
 819 @c begin (texi-doc-string "guile" "sqrt")
 820 @deffn {Scheme Procedure} sqrt z
 821 Return the square root of @var{z}.
 822 @end deffn
 823
 824 @rnindex expt
 825 @c begin (texi-doc-string "guile" "expt")
 826 @deffn {Scheme Procedure} expt z1 z2
 827 Return @var{z1} raised to the power of @var{z2}.
 828 @end deffn
 829
 830 @rnindex sin
 831 @c begin (texi-doc-string "guile" "sin")
 832 @deffn {Scheme Procedure} sin z
 833 Return the sine of @var{z}.
 834 @end deffn
 835
 836 @rnindex cos
 837 @c begin (texi-doc-string "guile" "cos")
 838 @deffn {Scheme Procedure} cos z
 839 Return the cosine of @var{z}.
 840 @end deffn
 841
 842 @rnindex tan
 843 @c begin (texi-doc-string "guile" "tan")
 844 @deffn {Scheme Procedure} tan z
 845 Return the tangent of @var{z}.
 846 @end deffn
 847
 848 @rnindex asin
 849 @c begin (texi-doc-string "guile" "asin")
 850 @deffn {Scheme Procedure} asin z
 851 Return the arcsine of @var{z}.
 852 @end deffn
 853
 854 @rnindex acos
 855 @c begin (texi-doc-string "guile" "acos")
 856 @deffn {Scheme Procedure} acos z
 857 Return the arccosine of @var{z}.
 858 @end deffn
 859
 860 @rnindex atan
 861 @c begin (texi-doc-string "guile" "atan")
 862 @deffn {Scheme Procedure} atan z
 863 Return the arctangent of @var{z}.
 864 @end deffn
 865
 866 @rnindex exp
 867 @c begin (texi-doc-string "guile" "exp")
 868 @deffn {Scheme Procedure} exp z
 869 Return e to the power of @var{z}, where e is the base of natural
 870 logarithms (2.71828@dots{}).
 871 @end deffn
 872
 873 @rnindex log
 874 @c begin (texi-doc-string "guile" "log")
 875 @deffn {Scheme Procedure} log z
 876 Return the natural logarithm of @var{z}.
 877 @end deffn
 878
 879 @c begin (texi-doc-string "guile" "log10")
 880 @deffn {Scheme Procedure} log10 z
 881 Return the base 10 logarithm of @var{z}.
 882 @end deffn
 883
 884 @c begin (texi-doc-string "guile" "sinh")
 885 @deffn {Scheme Procedure} sinh z
 886 Return the hyperbolic sine of @var{z}.
 887 @end deffn
 888
 889 @c begin (texi-doc-string "guile" "cosh")
 890 @deffn {Scheme Procedure} cosh z
 891 Return the hyperbolic cosine of @var{z}.
 892 @end deffn
 893
 894 @c begin (texi-doc-string "guile" "tanh")
 895 @deffn {Scheme Procedure} tanh z
 896 Return the hyperbolic tangent of @var{z}.
 897 @end deffn
 898
 899 @c begin (texi-doc-string "guile" "asinh")
 900 @deffn {Scheme Procedure} asinh z
 901 Return the hyperbolic arcsine of @var{z}.
 902 @end deffn
 903
 904 @c begin (texi-doc-string "guile" "acosh")
 905 @deffn {Scheme Procedure} acosh z
 906 Return the hyperbolic arccosine of @var{z}.
 907 @end deffn
 908
 909 @c begin (texi-doc-string "guile" "atanh")
 910 @deffn {Scheme Procedure} atanh z
 911 Return the hyperbolic arctangent of @var{z}.
 912 @end deffn
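
A few of the scientific functions are sketched below.  Inexact arguments
are used throughout, apart from the @code{expt} call, because the
exactness of results for exact arguments may differ between Guile
versions.

@lisp
(expt 2 10)   @result{} 1024

(sqrt 16.0)   @result{} 4.0
(exp 0.0)     @result{} 1.0
(log 1.0)     @result{} 0.0
(sin 0.0)     @result{} 0.0
(cos 0.0)     @result{} 1.0

;; The square root of a negative real is complex.
(imag-part (sqrt -4.0))  @result{} 2.0
@end lisp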
 913
 914
 915 @node Primitive Numerics
 916 @subsection Primitive Numeric Functions
 917
 918 Many of Guile's numeric procedures which accept any kind of numbers as
 919 arguments, including complex numbers, are implemented as Scheme
 920 procedures that use the following real number-based primitives.  These
 921 primitives signal an error if they are called with complex arguments.
 922
 923 @c begin (texi-doc-string "guile" "$abs")
 924 @deffn {Scheme Procedure} $abs x
 925 Return the absolute value of @var{x}.
 926 @end deffn
 927
 928 @c begin (texi-doc-string "guile" "$sqrt")
 929 @deffn {Scheme Procedure} $sqrt x
 930 Return the square root of @var{x}.
 931 @end deffn
 932
 933 @deffn {Scheme Procedure} $expt x y
 934 @deffnx {C Function} scm_sys_expt (x, y)
 935 Return @var{x} raised to the power of @var{y}. This
 936 procedure does not accept complex arguments.
 937 @end deffn
 938
 939 @c begin (texi-doc-string "guile" "$sin")
 940 @deffn {Scheme Procedure} $sin x
 941 Return the sine of @var{x}.
 942 @end deffn
 943
 944 @c begin (texi-doc-string "guile" "$cos")
 945 @deffn {Scheme Procedure} $cos x
 946 Return the cosine of @var{x}.
 947 @end deffn
 948
 949 @c begin (texi-doc-string "guile" "$tan")
 950 @deffn {Scheme Procedure} $tan x
 951 Return the tangent of @var{x}.
 952 @end deffn
 953
 954 @c begin (texi-doc-string "guile" "$asin")
 955 @deffn {Scheme Procedure} $asin x
 956 Return the arcsine of @var{x}.
 957 @end deffn
 958
 959 @c begin (texi-doc-string "guile" "$acos")
 960 @deffn {Scheme Procedure} $acos x
 961 Return the arccosine of @var{x}.
 962 @end deffn
 963
 964 @c begin (texi-doc-string "guile" "$atan")
 965 @deffn {Scheme Procedure} $atan x
 966 Return the arctangent of @var{x} in the range -PI/2 to PI/2.
 967 @end deffn
 968
 969 @deffn {Scheme Procedure} $atan2 x y
 970 @deffnx {C Function} scm_sys_atan2 (x, y)
 971 Return the arc tangent of the two arguments @var{x} and
 972 @var{y}. This is similar to calculating the arc tangent of
 973 @var{x} / @var{y}, except that the signs of both arguments
 974 are used to determine the quadrant of the result. This
 975 procedure does not accept complex arguments.
 976 @end deffn
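
The quadrant behaviour described for @code{$atan2} can be sketched as
follows.  This example assumes the standard two-argument form of
@code{atan} from R5RS rather than the @code{$atan2} primitive itself;
the binding @code{pi} and the helper @code{close?} are defined locally
and are not provided by Guile.

@lisp
;; pi computed locally for this example.
(define pi (* 4 (atan 1)))

;; Hypothetical helper: compare two reals within a small tolerance.
(define (close? a b)
  (< (abs (- a b)) 1e-9))

;; One-argument atan sees only the quotient, so the points (1, 1)
;; and (-1, -1) are indistinguishable.
(close? (atan (/ 1 1)) (atan (/ -1 -1)))   ; @result{} #t

;; The two-argument form uses the signs of both arguments to pick
;; the quadrant.
(close? (atan 1 1) (/ pi 4))               ; @result{} #t
(close? (atan -1 -1) (* -3/4 pi))          ; @result{} #t
@end lisp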
 977
 978 @c begin (texi-doc-string "guile" "$exp")
 979 @deffn {Scheme Procedure} $exp x
 980 Return e to the power of @var{x}, where e is the base of natural
 981 logarithms (2.71828@dots{}).
 982 @end deffn
 983
 984 @c begin (texi-doc-string "guile" "$log")
 985 @deffn {Scheme Procedure} $log x
 986 Return the natural logarithm of @var{x}.
 987 @end deffn
 988
 989 @c begin (texi-doc-string "guile" "$sinh")
 990 @deffn {Scheme Procedure} $sinh x
 991 Return the hyperbolic sine of @var{x}.
 992 @end deffn
 993
 994 @c begin (texi-doc-string "guile" "$cosh")
 995 @deffn {Scheme Procedure} $cosh x
 996 Return the hyperbolic cosine of @var{x}.
 997 @end deffn
 998
 999 @c begin (texi-doc-string "guile" "$tanh")
1000 @deffn {Scheme Procedure} $tanh x
1001 Return the hyperbolic tangent of @var{x}.
1002 @end deffn
1003
1004 @c begin (texi-doc-string "guile" "$asinh")
1005 @deffn {Scheme Procedure} $asinh x
1006 Return the hyperbolic arcsine of @var{x}.
1007 @end deffn
1008
1009 @c begin (texi-doc-string "guile" "$acosh")
1010 @deffn {Scheme Procedure} $acosh x
1011 Return the hyperbolic arccosine of @var{x}.
1012 @end deffn
1013
1014 @c begin (texi-doc-string "guile" "$atanh")
1015 @deffn {Scheme Procedure} $atanh x
1016 Return the hyperbolic arctangent of @var{x}.
1017 @end deffn
1018
1019 For the hyperbolic arc-functions, the Guile library exports C functions
1020 corresponding to these Scheme procedures, but taking and returning
1021 arguments of type @code{double} rather than the usual @code{SCM}.
1022
1023 @deftypefn {C Function} double scm_asinh (double x)
1024 @deftypefnx {C Function} double scm_acosh (double x)
1025 @deftypefnx {C Function} double scm_atanh (double x)
1026 Return the hyperbolic arcsine, arccosine or arctangent of @var{x}
1027 respectively.
1028 @end deftypefn
1029
For all the other Scheme procedures above, except @code{$expt} and
@code{$atan2} (whose entries specifically mention an equivalent C
function), the equivalent C functions are those provided by the standard
mathematics library.  The mapping is as follows.
1034
1035 @multitable {xx} {Scheme Procedure} {C Function}
1036 @item @tab Scheme Procedure @tab C Function
1037
1038 @item @tab @code{$abs}      @tab @code{fabs}
1039 @item @tab @code{$sqrt}     @tab @code{sqrt}
1040 @item @tab @code{$sin}      @tab @code{sin}
1041 @item @tab @code{$cos}      @tab @code{cos}
1042 @item @tab @code{$tan}      @tab @code{tan}
1043 @item @tab @code{$asin}     @tab @code{asin}
1044 @item @tab @code{$acos}     @tab @code{acos}
1045 @item @tab @code{$atan}     @tab @code{atan}
1046 @item @tab @code{$exp}      @tab @code{exp}
1047 @item @tab @code{$log}      @tab @code{log}
1048 @item @tab @code{$sinh}     @tab @code{sinh}
1049 @item @tab @code{$cosh}     @tab @code{cosh}
1050 @item @tab @code{$tanh}     @tab @code{tanh}
1051 @end multitable
1052
1053 @noindent
Naturally, these C functions take @code{double} arguments and return
@code{double} values.
1055
1056
1057 @node Bitwise Operations
1058 @subsection Bitwise Operations
1059
1060 @deffn {Scheme Procedure} logand n1 n2
1061 Return the bitwise AND of the integer arguments.
1062
1063 @lisp
1064 (logand) @result{} -1
1065 (logand 7) @result{} 7
1066 (logand #b111 #b011 #b001) @result{} 1
1067 @end lisp
1068 @end deffn
1069
1070 @deffn {Scheme Procedure} logior n1 n2
1071 Return the bitwise OR of the integer arguments.
1072
1073 @lisp
1074 (logior) @result{} 0
1075 (logior 7) @result{} 7
1076 (logior #b000 #b001 #b011) @result{} 3
1077 @end lisp
1078 @end deffn
1079
1080 @deffn {Scheme Procedure} logxor n1 n2
1081 Return the bitwise XOR of the integer arguments.  A bit is
1082 set in the result if it is set in an odd number of arguments.
1083 @lisp
1084 (logxor) @result{} 0
1085 (logxor 7) @result{} 7
1086 (logxor #b000 #b001 #b011) @result{} 2
1087 (logxor #b000 #b001 #b011 #b011) @result{} 1
1088 @end lisp
1089 @end deffn
1090
1091 @deffn {Scheme Procedure} lognot n
1092 @deffnx {C Function} scm_lognot (n)
Return the ones-complement of the integer argument @var{n}, that is,
@var{n} with every bit inverted.
1095
1096 @lisp
1097 (number->string (lognot #b10000000) 2)
1098    @result{} "-10000001"
1099 (number->string (lognot #b0) 2)
1100    @result{} "-1"
1101 @end lisp
1102 @end deffn
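
For two's-complement integers, inverting every bit gives the same
result as negating and then subtracting one.  The sketch below checks
that identity; the helper name @code{lognot-by-arithmetic} is
hypothetical.

@lisp
;; Hypothetical helper: bit inversion expressed with arithmetic only.
(define (lognot-by-arithmetic n)
  (- -1 n))

(lognot 5)                                   ; @result{} -6
(lognot -1)                                  ; @result{} 0
(= (lognot 5) (lognot-by-arithmetic 5))      ; @result{} #t
(lognot (lognot 12345))                      ; @result{} 12345
@end lisp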
1103
1104 @deffn {Scheme Procedure} logtest j k
@deffnx {C Function} scm_logtest (j, k)
Return @code{#t} if @var{j} and @var{k} have any 1 bits in common, else
@code{#f}.

1106 @lisp
1107 (logtest j k) @equiv{} (not (zero? (logand j k)))
1108
1109 (logtest #b0100 #b1011) @result{} #f
1110 (logtest #b0100 #b0111) @result{} #t
1111 @end lisp
1112 @end deffn
1113
1114 @deffn {Scheme Procedure} logbit? index j
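@deffnx {C Function} scm_logbit_p (index, j)
Return @code{#t} if bit number @var{index} of @var{j} is set, else
@code{#f}.  Bit 0 is the least significant bit.
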
1116 @lisp
1117 (logbit? index j) @equiv{} (logtest (integer-expt 2 index) j)
1118
1119 (logbit? 0 #b1101) @result{} #t
1120 (logbit? 1 #b1101) @result{} #f
1121 (logbit? 2 #b1101) @result{} #t
1122 (logbit? 3 #b1101) @result{} #t
1123 (logbit? 4 #b1101) @result{} #f
1124 @end lisp
1125 @end deffn
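
A common use of these procedures is to pack several boolean flags into
one integer.  The following sketch assumes three hypothetical flag
values, each occupying a single bit; the names are for illustration
only.

@lisp
;; Hypothetical flag values, one bit each.
(define flag-read    #b001)
(define flag-write   #b010)
(define flag-execute #b100)

;; Combine flags with logior.
(define perms (logior flag-read flag-write))   ; #b011

(logtest perms flag-write)                     ; @result{} #t
(logtest perms flag-execute)                   ; @result{} #f
(logbit? 0 perms)                              ; @result{} #t
(logbit? 2 perms)                              ; @result{} #f

;; Clear a flag by ANDing with its complement.
(logand perms (lognot flag-write))             ; @result{} 1
@end lisp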
1126
1127 @deffn {Scheme Procedure} ash n cnt
1128 @deffnx {C Function} scm_ash (n, cnt)
Return @var{n} shifted arithmetically left by @var{cnt} bits (or
shifted right, if @var{cnt} is negative).  ``Arithmetic'' means that
the procedure does not guarantee to preserve the bit structure of
@var{n}, but rather guarantees that the result is always rounded
towards minus infinity.  Therefore, the results of @code{ash} and a
corresponding bitwise shift differ if @var{n} is negative.
1136
1137 Formally, the function returns an integer equivalent to
1138 @code{(inexact->exact (floor (* @var{n} (expt 2 @var{cnt}))))}.
1139
1140 @lisp
1141 (number->string (ash #b1 3) 2)     @result{} "1000"
1142 (number->string (ash #b1010 -1) 2) @result{} "101"
1143 @end lisp
1144 @end deffn
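
The rounding rule matters only for negative numbers shifted right.  The
sketch below contrasts @code{ash} with @code{quotient}, which truncates
towards zero, and checks one case against the formal definition given
for @code{ash}.

@lisp
(ash 3 4)             ; @result{} 48
(ash 5 -1)            ; @result{} 2
(ash -5 -1)           ; @result{} -3   (rounded towards minus infinity)
(quotient -5 2)       ; @result{} -2   (truncated towards zero)

;; Agreement with the formal definition.
(= (ash -5 -1)
   (inexact->exact (floor (* -5 (expt 2 -1)))))   ; @result{} #t
@end lisp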
1145
1146 @deffn {Scheme Procedure} logcount n
1147 @deffnx {C Function} scm_logcount (n)
Return the number of bits in integer @var{n}.  If @var{n} is
positive, the 1-bits in its binary representation are counted.
If @var{n} is negative, the 0-bits in its two's-complement binary
representation are counted.  If @var{n} is 0, 0 is returned.
1152
1153 @lisp
1154 (logcount #b10101010)
1155    @result{} 4
1156 (logcount 0)
1157    @result{} 0
1158 (logcount -2)
1159    @result{} 1
1160 @end lisp
1161 @end deffn
1162
1163 @deffn {Scheme Procedure} integer-length n
1164 @deffnx {C Function} scm_integer_length (n)
1165 Return the number of bits necessary to represent @var{n}.
1166
1167 @lisp
1168 (integer-length #b10101010)
1169    @result{} 8
1170 (integer-length 0)
1171    @result{} 0
1172 (integer-length #b1111)
1173    @result{} 4
1174 @end lisp
1175 @end deffn
1176
1177 @deffn {Scheme Procedure} integer-expt n k
1178 @deffnx {C Function} scm_integer_expt (n, k)
1179 Return @var{n} raised to the non-negative integer exponent
1180 @var{k}.
1181
1182 @lisp
1183 (integer-expt 2 5)
1184    @result{} 32
1185 (integer-expt -3 3)
1186    @result{} -27
1187 @end lisp
1188 @end deffn
1189
1190 @deffn {Scheme Procedure} bit-extract n start end
1191 @deffnx {C Function} scm_bit_extract (n, start, end)
1192 Return the integer composed of the @var{start} (inclusive)
1193 through @var{end} (exclusive) bits of @var{n}.  The
1194 @var{start}th bit becomes the 0-th bit in the result.
1195
1196 @lisp
1197 (number->string (bit-extract #b1101101010 0 4) 2)
1198    @result{} "1010"
1199 (number->string (bit-extract #b1101101010 4 9) 2)
1200    @result{} "10110"
1201 @end lisp
1202 @end deffn
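
For a non-negative @var{n}, the same result can be obtained by shifting
and masking.  The following is a sketch under that assumption, not a
description of how @code{bit-extract} is implemented; the helper name
@code{my-bit-extract} is hypothetical.

@lisp
;; Hypothetical helper: extract bits start (inclusive) through end
;; (exclusive) of a non-negative integer by shifting and masking.
(define (my-bit-extract n start end)
  (logand (ash n (- start))
          (- (ash 1 (- end start)) 1)))

(number->string (my-bit-extract #b1101101010 4 9) 2)   ; @result{} "10110"
(= (my-bit-extract #b1101101010 0 4)
   (bit-extract #b1101101010 0 4))                     ; @result{} #t
@end lisp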
1203
1204
1205 @node Random
1206 @subsection Random Number Generation
1207
1208 @deffn {Scheme Procedure} copy-random-state [state]
1209 @deffnx {C Function} scm_copy_random_state (state)
1210 Return a copy of the random state @var{state}.
1211 @end deffn
1212
1213 @deffn {Scheme Procedure} random n [state]
1214 @deffnx {C Function} scm_random (n, state)
Return a number in [0, @var{n}).

Accepts a positive integer or real @var{n} and returns a
number of the same type between zero (inclusive) and
@var{n} (exclusive).  The values returned have a uniform
distribution.
1220 distribution.
1221
1222 The optional argument @var{state} must be of the type produced
1223 by @code{seed->random-state}. It defaults to the value of the
variable @code{*random-state*}.  This object is used to maintain
1225 the state of the pseudo-random-number generator and is altered
1226 as a side effect of the random operation.
1227 @end deffn
1228
1229 @deffn {Scheme Procedure} random:exp [state]
1230 @deffnx {C Function} scm_random_exp (state)
Return an inexact real in an exponential distribution with mean
1.  For an exponential distribution with mean @var{u} use
@code{(* @var{u} (random:exp))}.
1234 @end deffn
1235
1236 @deffn {Scheme Procedure} random:hollow-sphere! v [state]
1237 @deffnx {C Function} scm_random_hollow_sphere_x (v, state)
Fill @var{v} with inexact real random numbers
the sum of whose squares is equal to 1.0.
Thinking of @var{v} as coordinates in space of
dimension @var{n} = @code{(vector-length @var{v})}, the coordinates
are uniformly distributed over the surface of the
unit @var{n}-sphere.
1244 @end deffn
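
The defining property can be checked directly.  The sketch below fills
a three-element vector and confirms that the sum of the squares of its
elements is 1.0 up to rounding error.  It assumes that an ordinary
vector created with @code{make-vector} is acceptable as the argument;
the helper @code{sum-of-squares} is hypothetical.

@lisp
(define v (make-vector 3 0.0))
(random:hollow-sphere! v)

;; Hypothetical helper: sum of the squares of a vector's elements.
(define (sum-of-squares vec)
  (apply + (map (lambda (x) (* x x)) (vector->list vec))))

;; The point lies on the unit sphere, up to rounding error.
(< (abs (- (sum-of-squares v) 1.0)) 1e-9)      ; @result{} #t
@end lisp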
1245
1246 @deffn {Scheme Procedure} random:normal [state]
1247 @deffnx {C Function} scm_random_normal (state)
Return an inexact real in a normal distribution.  The
distribution used has mean 0 and standard deviation 1.  For a
normal distribution with mean @var{m} and standard deviation @var{d} use
@code{(+ @var{m} (* @var{d} (random:normal)))}.
1252 @end deffn
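
The scaling formula given for @code{random:normal} can be wrapped in a
small procedure.  This is a sketch only; the names @code{random-normal}
and @code{sample-mean} are hypothetical, and the sample size is an
arbitrary choice.

@lisp
;; Hypothetical helper: draw from a normal distribution with mean m
;; and standard deviation d.
(define (random-normal m d)
  (+ m (* d (random:normal))))

;; Hypothetical helper: average n draws.  With many draws the sample
;; mean should fall near m.
(define (sample-mean n m d)
  (let loop ((i 0) (sum 0.0))
    (if (= i n)
        (/ sum n)
        (loop (+ i 1) (+ sum (random-normal m d))))))

(sample-mean 10000 10.0 2.0)   ; a value near 10.0; it varies per run
@end lisp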
1253
1254 @deffn {Scheme Procedure} random:normal-vector! v [state]
1255 @deffnx {C Function} scm_random_normal_vector_x (v, state)
Fill @var{v} with inexact real random numbers that are
independent and standard normally distributed
(i.e., with mean 0 and variance 1).
1259 @end deffn
1260
1261 @deffn {Scheme Procedure} random:solid-sphere! v [state]
1262 @deffnx {C Function} scm_random_solid_sphere_x (v, state)
Fill @var{v} with inexact real random numbers
the sum of whose squares is less than 1.0.
Thinking of @var{v} as coordinates in space of
dimension @var{n} = @code{(vector-length @var{v})}, the coordinates
are uniformly distributed within the unit @var{n}-sphere.
The sum of the squares of the numbers is returned.
1269 @end deffn
1270
1271 @deffn {Scheme Procedure} random:uniform [state]
1272 @deffnx {C Function} scm_random_uniform (state)
1273 Return a uniformly distributed inexact real random number in
1274 [0,1).
1275 @end deffn
1276
1277 @deffn {Scheme Procedure} seed->random-state seed
1278 @deffnx {C Function} scm_seed_to_random_state (seed)
1279 Return a new random state using @var{seed}.
1280 @end deffn
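
Explicit state objects make a random sequence repeatable.  The
following sketch uses the arbitrary seed 42, chosen only for
illustration, and relies on the generator being deterministic for a
given state.

@lisp
;; Two states built from the same seed yield the same sequence.
(define s1 (seed->random-state 42))
(define s2 (seed->random-state 42))
(= (random 1000 s1) (random 1000 s2))          ; @result{} #t

;; A copy continues from the same point as the original.
(define s3 (copy-random-state s1))
(= (random 1000 s1) (random 1000 s3))          ; @result{} #t

;; Results stay within the requested range and keep the type of
;; the argument.
(let ((r (random 6 s1)))
  (and (exact? r) (<= 0 r) (< r 6)))           ; @result{} #t
(let ((r (random 1.0 s1)))
  (and (inexact? r) (<= 0 r) (< r 1.0)))       ; @result{} #t
@end lisp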
1281
1282
1283 @node Characters
1284 @section Characters
1285 @tpindex Characters
1286
The non-printing characters in the ASCII character set, as well as the
space character, may be referred to by name: for example, @code{#\tab},
@code{#\esc}, @code{#\stx}, and so on.  The following table gives the
ASCII name of each of these characters together with its character code.
1290
1291 @multitable @columnfractions .25 .25 .25 .25
1292 @item 0 = @code{#\nul}
1293  @tab 1 = @code{#\soh}
1294  @tab 2 = @code{#\stx}
1295  @tab 3 = @code{#\etx}
1296 @item 4 = @code{#\eot}
1297  @tab 5 = @code{#\enq}
1298  @tab 6 = @code{#\ack}
1299  @tab 7 = @code{#\bel}
1300 @item 8 = @code{#\bs}
1301  @tab 9 = @code{#\ht}
1302  @tab 10 = @code{#\nl}
1303  @tab 11 = @code{#\vt}
1304 @item 12 = @code{#\np}
1305  @tab 13 = @code{#\cr}
1306  @tab 14 = @code{#\so}
1307  @tab 15 = @code{#\si}
1308 @item 16 = @code{#\dle}
1309  @tab 17 = @code{#\dc1}
1310  @tab 18 = @code{#\dc2}
1311  @tab 19 = @code{#\dc3}
1312 @item 20 = @code{#\dc4}
1313  @tab 21 = @code{#\nak}
1314  @tab 22 = @code{#\syn}
1315  @tab 23 = @code{#\etb}
1316 @item 24 = @code{#\can}
1317  @tab 25 = @code{#\em}
1318  @tab 26 = @code{#\sub}
1319  @tab 27 = @code{#\esc}
1320 @item 28 = @code{#\fs}
1321  @tab 29 = @code{#\gs}
1322  @tab 30 = @code{#\rs}
1323  @tab 31 = @code{#\us}
1324 @item 32 = @code{#\sp}
1325 @end multitable
1326
1327 The @code{delete} character (octal 177) may be referred to with the name
1328 @code{#\del}.
1329
1330 Several characters have more than one name:
1331
1332 @itemize @bullet
1333 @item
1334 @code{#\space}, @code{#\sp}
1335 @item
1336 @code{#\newline}, @code{#\nl}
1337 @item
1338 @code{#\tab}, @code{#\ht}
1339 @item
1340 @code{#\backspace}, @code{#\bs}
1341 @item
1342 @code{#\return}, @code{#\cr}
1343 @item
1344 @code{#\page}, @code{#\np}
1345 @item
1346 @code{#\null}, @code{#\nul}
1347 @end itemize
1348
1349 @rnindex char?
1350 @deffn {Scheme Procedure} char? x
1351 @deffnx {C Function} scm_char_p (x)
1352 Return @code{#t} iff @var{x} is a character, else @code{#f}.
1353 @end deffn
1354
1355 @rnindex char=?
1356 @deffn {Scheme Procedure} char=? x y
1357 Return @code{#t} iff @var{x} is the same character as @var{y}, else @code{#f}.
1358 @end deffn
1359
1360 @rnindex char<?
1361 @deffn {Scheme Procedure} char<? x y
1362 Return @code{#t} iff @var{x} is less than @var{y} in the ASCII sequence,
1363 else @code{#f}.
1364 @end deffn
1365
1366 @rnindex char<=?
1367 @deffn {Scheme Procedure} char<=? x y
1368 Return @code{#t} iff @var{x} is less than or equal to @var{y} in the
1369 ASCII sequence, else @code{#f}.
1370 @end deffn
1371
1372 @rnindex char>?
1373 @deffn {Scheme Procedure} char>? x y
1374 Return @code{#t} iff @var{x} is greater than @var{y} in the ASCII
1375 sequence, else @code{#f}.
1376 @end deffn
1377
1378 @rnindex char>=?
1379 @deffn {Scheme Procedure} char>=? x y
1380 Return @code{#t} iff @var{x} is greater than or equal to @var{y} in the
1381 ASCII sequence, else @code{#f}.
1382 @end deffn
1383
1384 @rnindex char-ci=?
1385 @deffn {Scheme Procedure} char-ci=? x y
1386 Return @code{#t} iff @var{x} is the same character as @var{y} ignoring
1387 case, else @code{#f}.
1388 @end deffn
1389
1390 @rnindex char-ci<?
1391 @deffn {Scheme Procedure} char-ci<? x y
1392 Return @code{#t} iff @var{x} is less than @var{y} in the ASCII sequence
1393 ignoring case, else @code{#f}.
1394 @end deffn
1395
1396 @rnindex char-ci<=?
1397 @deffn {Scheme Procedure} char-ci<=? x y
1398 Return @code{#t} iff @var{x} is less than or equal to @var{y} in the
1399 ASCII sequence ignoring case, else @code{#f}.
1400 @end deffn
1401
1402 @rnindex char-ci>?
1403 @deffn {Scheme Procedure} char-ci>? x y
1404 Return @code{#t} iff @var{x} is greater than @var{y} in the ASCII
1405 sequence ignoring case, else @code{#f}.
1406 @end deffn
1407
1408 @rnindex char-ci>=?
1409 @deffn {Scheme Procedure} char-ci>=? x y
1410 Return @code{#t} iff @var{x} is greater than or equal to @var{y} in the
1411 ASCII sequence ignoring case, else @code{#f}.
1412 @end deffn
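
The difference between the case-sensitive and case-insensitive
predicates shows up when letters of different case are compared,
because every uppercase letter precedes every lowercase letter in the
ASCII sequence.  A short sketch:

@lisp
(char<? #\a #\b)          ; @result{} #t
(char<? #\a #\B)          ; @result{} #f   (97 is not less than 66)
(char-ci<? #\a #\B)       ; @result{} #t
(char=? #\a #\A)          ; @result{} #f
(char-ci=? #\a #\A)       ; @result{} #t
@end lisp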
1413
1414 @rnindex char-alphabetic?
1415 @deffn {Scheme Procedure} char-alphabetic? chr
1416 @deffnx {C Function} scm_char_alphabetic_p (chr)
1417 Return @code{#t} iff @var{chr} is alphabetic, else @code{#f}.
Alphabetic has the same meaning as for the @code{isalpha} C library function.
1419 @end deffn
1420
1421 @rnindex char-numeric?
1422 @deffn {Scheme Procedure} char-numeric? chr
1423 @deffnx {C Function} scm_char_numeric_p (chr)
1424 Return @code{#t} iff @var{chr} is numeric, else @code{#f}.
Numeric has the same meaning as for the @code{isdigit} C library function.
1426 @end deffn
1427
1428 @rnindex char-whitespace?
1429 @deffn {Scheme Procedure} char-whitespace? chr
1430 @deffnx {C Function} scm_char_whitespace_p (chr)
1431 Return @code{#t} iff @var{chr} is whitespace, else @code{#f}.
Whitespace has the same meaning as for the @code{isspace} C library function.
1433 @end deffn
1434
1435 @rnindex char-upper-case?
1436 @deffn {Scheme Procedure} char-upper-case? chr
1437 @deffnx {C Function} scm_char_upper_case_p (chr)
1438 Return @code{#t} iff @var{chr} is uppercase, else @code{#f}.
Uppercase has the same meaning as for the @code{isupper} C library function.
1440 @end deffn
1441
1442 @rnindex char-lower-case?
1443 @deffn {Scheme Procedure} char-lower-case? chr
1444 @deffnx {C Function} scm_char_lower_case_p (chr)
1445 Return @code{#t} iff @var{chr} is lowercase, else @code{#f}.
Lowercase has the same meaning as for the @code{islower} C library function.
1447 @end deffn
1448
1449 @deffn {Scheme Procedure} char-is-both? chr
1450 @deffnx {C Function} scm_char_is_both_p (chr)
1451 Return @code{#t} iff @var{chr} is either uppercase or lowercase, else @code{#f}.
Uppercase and lowercase are as defined by the @code{isupper} and
@code{islower} C library functions.
1454 @end deffn
1455
1456 @rnindex char->integer
1457 @deffn {Scheme Procedure} char->integer chr
1458 @deffnx {C Function} scm_char_to_integer (chr)
Return the number corresponding to the ordinal position of @var{chr} in
the ASCII sequence.
1461 @end deffn
1462
1463 @rnindex integer->char
1464 @deffn {Scheme Procedure} integer->char n
1465 @deffnx {C Function} scm_integer_to_char (n)
1466 Return the character at position @var{n} in the ASCII sequence.
1467 @end deffn
1468
1469 @rnindex char-upcase
1470 @deffn {Scheme Procedure} char-upcase chr
1471 @deffnx {C Function} scm_char_upcase (chr)
1472 Return the uppercase character version of @var{chr}.
1473 @end deffn
1474
1475 @rnindex char-downcase
1476 @deffn {Scheme Procedure} char-downcase chr
1477 @deffnx {C Function} scm_char_downcase (chr)
1478 Return the lowercase character version of @var{chr}.
1479 @end deffn
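
The conversion procedures can be combined for simple character
arithmetic.  In the sketch below the helper @code{digit->number} is
hypothetical; it relies on the decimal digits being contiguous in the
ASCII sequence.

@lisp
(char->integer #\A)                       ; @result{} 65
(integer->char 97)                        ; @result{} #\a
(char-upcase #\a)                         ; @result{} #\A
(char-downcase #\A)                       ; @result{} #\a
(char-upcase #\1)                         ; @result{} #\1

;; Hypothetical helper: the numeric value of a decimal digit character.
(define (digit->number chr)
  (- (char->integer chr) (char->integer #\0)))

(digit->number #\7)                       ; @result{} 7
@end lisp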
1480
1481
1482 @node Strings
1483 @section Strings
1484 @tpindex Strings
1485
Strings are fixed-length sequences of characters.  They can be created
by calling constructor procedures, but they can also be entered
literally at the REPL or in Scheme source files.
1489
1490 Guile provides a rich set of string processing procedures, because text
1491 handling is very important when Guile is used as a scripting language.
1492
Strings always carry their length with them, so there is no special
end-of-string character as there is in C.  That means that Scheme
strings can contain any character, even the NUL character
@code{'\0'}.  But note: since most operating system calls dealing with
strings (such as for file operations) expect strings to be
zero-terminated, they might do unexpected things when called with
strings containing NUL characters.
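
As a small illustration of a string that holds a NUL character, the
following sketch builds one with the @code{string} constructor and the
character name @code{#\nul}, and shows that the NUL counts towards the
length like any other character.

@lisp
;; A three-character string whose middle character is NUL.
(define s (string #\a #\nul #\b))

(string-length s)                 ; @result{} 3
(char->integer (string-ref s 1))  ; @result{} 0
(string-ref s 2)                  ; @result{} #\b
@end lisp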
1500
1501 @menu
1502 * String Syntax::               Read syntax for strings.
1503 * String Predicates::           Testing strings for certain properties.
1504 * String Constructors::         Creating new string objects.
1505 * List/String Conversion::      Converting from/to lists of characters.
1506 * String Selection::            Select portions from strings.
1507 * String Modification::         Modify parts or whole strings.
1508 * String Comparison::           Lexicographic ordering predicates.
1509 * String Searching::            Searching in strings.
1510 * Alphabetic Case Mapping::     Convert the alphabetic case of strings.
1511 * Appending Strings::           Appending strings to form a new string.
1512 @end menu
1513
1514 @node String Syntax
1515 @subsection String Read Syntax
1516
1517 The read syntax for strings is an arbitrarily long sequence of
1518 characters enclosed in double quotes (@code{"}). @footnote{Actually, the
1519 current implementation restricts strings to a length of 2^24
1520 characters.}  If you want to insert a double quote character into a
1521 string literal, it must be prefixed with a backslash @code{\} character
1522 (called an @dfn{escape character}).
1523
1524 The following are examples of string literals:
1525
1526 @lisp
1527 "foo"
1528 "bar plonk"
1529 "Hello World"
1530 "\"Hi\", he said."
1531 @end lisp
1532
1533 @c FIXME::martin: What about escape sequences like \r, \n etc.?
1534
1535 @node String Predicates
1536 @subsection String Predicates
1537
1538 The following procedures can be used to check whether a given string
1539 fulfills some specified property.
1540
1541 @rnindex string?
1542 @deffn {Scheme Procedure} string? obj
1543 @deffnx {C Function} scm_string_p (obj)
1544 Return @code{#t} if @var{obj} is a string, else @code{#f}.
1545 @end deffn
1546
1547 @deffn {Scheme Procedure} string-null? str
1548 @deffnx {C Function} scm_string_null_p (str)
1549 Return @code{#t} if @var{str}'s length is zero, and
1550 @code{#f} otherwise.
1551 @lisp
1552 (string-null? "")  @result{} #t
1553 y                    @result{} "foo"
1554 (string-null? y)     @result{} #f
1555 @end lisp
1556 @end deffn
1557
1558 @node String Constructors
1559 @subsection String Constructors
1560
1561 The string constructor procedures create new string objects, possibly
1562 initializing them with some specified character data.
1563
1564 @c FIXME::martin: list->string belongs into `List/String Conversion'
1565
1566 @rnindex string
1567 @rnindex list->string
1568 @deffn {Scheme Procedure} string . chrs
1569 @deffnx {Scheme Procedure} list->string chrs
1570 @deffnx {C Function} scm_string (chrs)
1571 Return a newly allocated string composed of the arguments,
1572 @var{chrs}.
1573 @end deffn
1574
1575 @rnindex make-string
1576 @deffn {Scheme Procedure} make-string k [chr]
1577 @deffnx {C Function} scm_make_string (k, chr)
Return a newly allocated string of
length @var{k}.  If @var{chr} is given, then all elements of
the string are initialized to @var{chr}, otherwise the contents
of the string are unspecified.
1582 @end deffn
1583
1584 @node List/String Conversion
@subsection List/String Conversion
1586
When processing strings, it is often convenient to first convert them
into a list representation by using the procedure @code{string->list},
work with the resulting list, and then convert it back into a string.
The procedures in this section support this style of processing.
1591
1592 @rnindex string->list
1593 @deffn {Scheme Procedure} string->list str
1594 @deffnx {C Function} scm_string_to_list (str)
1595 Return a newly allocated list of the characters that make up
1596 the given string @var{str}. @code{string->list} and
1597 @code{list->string} are inverses as far as @samp{equal?} is
1598 concerned.
1599 @end deffn
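
The convert, process, convert back pattern can be sketched as follows.
The helper name @code{reverse-string} is hypothetical.

@lisp
;; Hypothetical helper: reverse a string by way of a character list.
(define (reverse-string str)
  (list->string (reverse (string->list str))))

(reverse-string "hello")                           ; @result{} "olleh"
(string->list "abc")                               ; @result{} (#\a #\b #\c)
(equal? (list->string (string->list "abc")) "abc") ; @result{} #t
@end lisp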
1600
1601 @deffn {Scheme Procedure} string-split str chr
1602 @deffnx {C Function} scm_string_split (str, chr)
Split the string @var{str} into a list of the substrings delimited
1604 by appearances of the character @var{chr}.  Note that an empty substring
1605 between separator characters will result in an empty string in the
1606 result list.
1607
1608 @lisp
1609 (string-split "root:x:0:0:root:/root:/bin/bash" #\:)
1610 @result{}
1611 ("root" "x" "0" "0" "root" "/root" "/bin/bash")
1612
1613 (string-split "::" #\:)
1614 @result{}
1615 ("" "" "")
1616
1617 (string-split "" #\:)
1618 @result{}
1619 ("")
1620 @end lisp
1621 @end deffn
1622
1623
1624 @node String Selection
1625 @subsection String Selection
1626
1627 Portions of strings can be extracted by these procedures.
1628 @code{string-ref} delivers individual characters whereas
1629 @code{substring} can be used to extract substrings from longer strings.
1630
1631 @rnindex string-length
1632 @deffn {Scheme Procedure} string-length string
1633 @deffnx {C Function} scm_string_length (string)
1634 Return the number of characters in @var{string}.
1635 @end deffn
1636
1637 @rnindex string-ref
1638 @deffn {Scheme Procedure} string-ref str k
1639 @deffnx {C Function} scm_string_ref (str, k)
1640 Return character @var{k} of @var{str} using zero-origin
1641 indexing. @var{k} must be a valid index of @var{str}.
1642 @end deffn
1643
1644 @rnindex string-copy
1645 @deffn {Scheme Procedure} string-copy str
1646 @deffnx {C Function} scm_string_copy (str)
Return a newly allocated copy of the given string @var{str}.
1648 @end deffn
1649
1650 @rnindex substring
1651 @deffn {Scheme Procedure} substring str start [end]
1652 @deffnx {C Function} scm_substring (str, start, end)
1653 Return a newly allocated string formed from the characters
1654 of @var{str} beginning with index @var{start} (inclusive) and
1655 ending with index @var{end} (exclusive).
1656 @var{str} must be a string, @var{start} and @var{end} must be
1657 exact integers satisfying:
1658
1659 0 <= @var{start} <= @var{end} <= (string-length @var{str}).
1660 @end deffn
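
The selection procedures above can be exercised together.  The sketch
below uses an arbitrary sample string and shows the inclusive start and
exclusive end of @code{substring}, including the case where @var{end}
is omitted.

@lisp
(define s "hello world")

(string-length s)        ; @result{} 11
(string-ref s 4)         ; @result{} #\o
(substring s 0 5)        ; @result{} "hello"
(substring s 6)          ; @result{} "world"
(substring s 3 3)        ; @result{} ""

;; string-copy returns a distinct object with equal contents.
(let ((c (string-copy s)))
  (and (equal? c s) (not (eq? c s))))   ; @result{} #t
@end lisp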
1661
1662 @node String Modification
1663 @subsection String Modification
1664
1665 These procedures are for modifying strings in-place.  This means that the
1666 result of the operation is not a new string; instead, the original string's
1667 memory representation is modified.
1668
1669 @rnindex string-set!
1670 @deffn {Scheme Procedure} string-set! str k chr
1671 @deffnx {C Function} scm_string_set_x (str, k, chr)
1672 Store @var{chr} in element @var{k} of @var{str} and return
1673 an unspecified value. @var{k} must be a valid index of
1674 @var{str}.
1675 @end deffn
1676
1677 @rnindex string-fill!
1678 @deffn {Scheme Procedure} string-fill! str chr
1679 @deffnx {C Function} scm_string_fill_x (str, chr)
Store @var{chr} in every element of the given string @var{str} and
return an unspecified value.
1682 @end deffn
1683
1684 @deffn {Scheme Procedure} substring-fill! str start end fill
1685 @deffnx {C Function} scm_substring_fill_x (str, start, end, fill)
Change every character in @var{str} from index @var{start} (inclusive)
to index @var{end} (exclusive) to @var{fill}.
1688
1689 @lisp
1690 (define y "abcdefg")
1691 (substring-fill! y 1 3 #\r)
1692 y
1693 @result{} "arrdefg"
1694 @end lisp
1695 @end deffn
1696
1697 @deffn {Scheme Procedure} substring-move! str1 start1 end1 str2 start2
1698 @deffnx {C Function} scm_substring_move_x (str1, start1, end1, str2, start2)
1699 Copy the substring of @var{str1} bounded by @var{start1} and @var{end1}
1700 into @var{str2} beginning at position @var{start2}.
1701 @var{str1} and @var{str2} can be the same string.
1702 @end deffn
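
The in-place procedures can be combined as in the following sketch.  It
starts from freshly allocated strings, on the assumption that string
literals may be read-only in some versions of Guile; the variable names
are arbitrary.

@lisp
;; Start from freshly allocated strings so that they may be modified.
(define dst (make-string 7 #\-))
(define src (string-copy "abcdefg"))

;; Copy characters 2, 3 and 4 of src to the start of dst.
(substring-move! src 2 5 dst 0)
dst                                  ; @result{} "cde----"

(string-set! dst 6 #\!)
dst                                  ; @result{} "cde---!"

(substring-fill! dst 3 6 #\.)
dst                                  ; @result{} "cde...!"

(string-fill! dst #\x)
dst                                  ; @result{} "xxxxxxx"
@end lisp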
1703
1704
1705 @node String Comparison
1706 @subsection String Comparison
1707
1708 The procedures in this section are similar to the character ordering
1709 predicates (@pxref{Characters}), but are defined on character sequences.
1710 They all return @code{#t} on success and @code{#f} on failure.  The
1711 predicates ending in @code{-ci} ignore the character case when comparing
1712 strings.
1713
1714
1715 @rnindex string=?
1716 @deffn {Scheme Procedure} string=? s1 s2
1717 Lexicographic equality predicate; return @code{#t} if the two
1718 strings are the same length and contain the same characters in
1719 the same positions, otherwise return @code{#f}.
1720
1721 The procedure @code{string-ci=?} treats upper and lower case
1722 letters as though they were the same character, but
1723 @code{string=?} treats upper and lower case as distinct
1724 characters.
1725 @end deffn
1726
1727 @rnindex string<?
1728 @deffn {Scheme Procedure} string<? s1 s2
1729 Lexicographic ordering predicate; return @code{#t} if @var{s1}
1730 is lexicographically less than @var{s2}.
1731 @end deffn
1732
1733 @rnindex string<=?
1734 @deffn {Scheme Procedure} string<=? s1 s2
1735 Lexicographic ordering predicate; return @code{#t} if @var{s1}
1736 is lexicographically less than or equal to @var{s2}.
1737 @end deffn
1738
1739 @rnindex string>?
1740 @deffn {Scheme Procedure} string>? s1 s2
1741 Lexicographic ordering predicate; return @code{#t} if @var{s1}
1742 is lexicographically greater than @var{s2}.
1743 @end deffn
1744
1745 @rnindex string>=?
1746 @deffn {Scheme Procedure} string>=? s1 s2
1747 Lexicographic ordering predicate; return @code{#t} if @var{s1}
1748 is lexicographically greater than or equal to @var{s2}.
1749 @end deffn
1750
1751 @rnindex string-ci=?
1752 @deffn {Scheme Procedure} string-ci=? s1 s2
1753 Case-insensitive string equality predicate; return @code{#t} if
1754 the two strings are the same length and their component
1755 characters match (ignoring case) at each position; otherwise
1756 return @code{#f}.
1757 @end deffn
1758
@rnindex string-ci<?
1760 @deffn {Scheme Procedure} string-ci<? s1 s2
1761 Case insensitive lexicographic ordering predicate; return
1762 @code{#t} if @var{s1} is lexicographically less than @var{s2}
1763 regardless of case.
1764 @end deffn
1765
@rnindex string-ci<=?
1767 @deffn {Scheme Procedure} string-ci<=? s1 s2
1768 Case insensitive lexicographic ordering predicate; return
1769 @code{#t} if @var{s1} is lexicographically less than or equal
1770 to @var{s2} regardless of case.
1771 @end deffn
1772
1773 @rnindex string-ci>?
1774 @deffn {Scheme Procedure} string-ci>? s1 s2
1775 Case insensitive lexicographic ordering predicate; return
1776 @code{#t} if @var{s1} is lexicographically greater than
1777 @var{s2} regardless of case.
1778 @end deffn
1779
1780 @rnindex string-ci>=?
1781 @deffn {Scheme Procedure} string-ci>=? s1 s2
1782 Case insensitive lexicographic ordering predicate; return
1783 @code{#t} if @var{s1} is lexicographically greater than or
1784 equal to @var{s2} regardless of case.
1785 @end deffn
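
The following sketch contrasts the case-sensitive and case-insensitive
predicates.  Because uppercase letters precede lowercase letters in the
ASCII sequence, a string beginning with @samp{Z} sorts before one
beginning with @samp{a} unless case is ignored.

@lisp
(string=? "Guile" "guile")        ; @result{} #f
(string-ci=? "Guile" "guile")     ; @result{} #t
(string<? "apple" "banana")       ; @result{} #t
(string<? "Zebra" "apple")        ; @result{} #t   (#\Z precedes #\a)
(string-ci<? "Zebra" "apple")     ; @result{} #f
(string<? "app" "apple")          ; @result{} #t   (a prefix sorts first)
@end lisp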
1786
1787
1788 @node String Searching
1789 @subsection String Searching
1790
1791 When searching for the index of a character in a string, these
1792 procedures can be used.
1793
1794 @deffn {Scheme Procedure} string-index str chr [frm [to]]
1795 @deffnx {C Function} scm_string_index (str, chr, frm, to)
1796 Return the index of the first occurrence of @var{chr} in
1797 @var{str}.  The optional integer arguments @var{frm} and
1798 @var{to} limit the search to a portion of the string.  This
1799 procedure essentially implements the @code{index} or
1800 @code{strchr} functions from the C library.
1801
1802 @lisp
1803 (string-index "weiner" #\e)
1804 @result{} 1
1805
1806 (string-index "weiner" #\e 2)
1807 @result{} 4
1808
1809 (string-index "weiner" #\e 2 4)
1810 @result{} #f
1811 @end lisp
1812 @end deffn
1813
1814 @deffn {Scheme Procedure} string-rindex str chr [frm [to]]
1815 @deffnx {C Function} scm_string_rindex (str, chr, frm, to)
1816 Like @code{string-index}, but search from the right of the
1817 string rather than from the left.  This procedure essentially
1818 implements the @code{rindex} or @code{strrchr} functions from
1819 the C library.
1820
1821 @lisp
1822 (string-rindex "weiner" #\e)
1823 @result{} 4
1824
1825 (string-rindex "weiner" #\e 2 4)
1826 @result{} #f
1827
1828 (string-rindex "weiner" #\e 2 5)
1829 @result{} 4
1830 @end lisp
1831 @end deffn
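
Together with @code{substring}, the searching procedures are enough for
simple parsing tasks.  The following is a sketch only; the helpers
@code{last-component} and @code{split-pair} are hypothetical names, and
both rely on the search returning @code{#f} when the character is not
found.

@lisp
;; Hypothetical helper: the part of path after the last slash,
;; or the whole of path if it contains no slash.
(define (last-component path)
  (let ((i (string-rindex path #\/)))
    (if i
        (substring path (+ i 1))
        path)))

(last-component "/usr/local/bin/guile")   ; @result{} "guile"
(last-component "guile")                  ; @result{} "guile"

;; Hypothetical helper: split "key=value" at the first equals sign.
(define (split-pair str)
  (let ((i (string-index str #\=)))
    (and i
         (cons (substring str 0 i)
               (substring str (+ i 1))))))

(split-pair "name=guile")                 ; @result{} ("name" . "guile")
(split-pair "novalue")                    ; @result{} #f
@end lisp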
1832
1833 @node Alphabetic Case Mapping
1834 @subsection Alphabetic Case Mapping
1835
These procedures map strings to their upper-case or lower-case
equivalents, or capitalize the words in a string.
1838
1839 @deffn {Scheme Procedure} string-upcase str
1840 @deffnx {C Function} scm_string_upcase (str)
1841 Return a freshly allocated string containing the characters of
1842 @var{str} in upper case.
1843 @end deffn
1844
1845 @deffn {Scheme Procedure} string-upcase! str
1846 @deffnx {C Function} scm_string_upcase_x (str)
1847 Destructively upcase every character in @var{str} and return
1848 @var{str}.
1849 @lisp
1850 y                  @result{} "arrdefg"
1851 (string-upcase! y) @result{} "ARRDEFG"
1852 y                  @result{} "ARRDEFG"
1853 @end lisp
1854 @end deffn
1855
1856 @deffn {Scheme Procedure} string-downcase str
1857 @deffnx {C Function} scm_string_downcase (str)
Return a freshly allocated string containing the characters in
1859 @var{str} in lower case.
1860 @end deffn
1861
1862 @deffn {Scheme Procedure} string-downcase! str
1863 @deffnx {C Function} scm_string_downcase_x (str)
1864 Destructively downcase every character in @var{str} and return
1865 @var{str}.
1866 @lisp
1867 y                     @result{} "ARRDEFG"
1868 (string-downcase! y)  @result{} "arrdefg"
1869 y                     @result{} "arrdefg"
1870 @end lisp
1871 @end deffn
1872
1873 @deffn {Scheme Procedure} string-capitalize str
1874 @deffnx {C Function} scm_string_capitalize (str)
1875 Return a freshly allocated string with the characters in
1876 @var{str}, where the first character of every word is
1877 capitalized.
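
For instance, with an arbitrary all-lower-case sample string, the
non-destructive variant leaves its argument untouched:

@lisp
(define s "hello world")
(string-capitalize s) @result{} "Hello World"
s                     @result{} "hello world"
@end lisp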
1878 @end deffn
1879
1880 @deffn {Scheme Procedure} string-capitalize! str
1881 @deffnx {C Function} scm_string_capitalize_x (str)
1882 Upcase the first character of every word in @var{str}
1883 destructively and return @var{str}.
1884
1885 @lisp
1886 y                      @result{} "hello world"
1887 (string-capitalize! y) @result{} "Hello World"
1888 y                      @result{} "Hello World"
1889 @end lisp
1890 @end deffn
1891
1892
1893 @node Appending Strings
1894 @subsection Appending Strings
1895
1896 The procedure @code{string-append} appends several strings together to
1897 form a longer result string.
1898
1899 @rnindex string-append
1900 @deffn {Scheme Procedure} string-append . args
1901 @deffnx {C Function} scm_string_append (args)
1902 Return a newly allocated string whose characters form the
1903 concatenation of the given strings, @var{args}.
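
A short sketch with made-up strings; called with no arguments, the
procedure returns an empty string.

@lisp
(string-append "foo" "bar" "baz") @result{} "foobarbaz"
(string-append)                   @result{} ""
@end lisp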
1904 @end deffn
1905
1906
1907 @node Regular Expressions
1908 @section Regular Expressions
1909 @tpindex Regular expressions
1910
1911 @cindex regular expressions
1912 @cindex regex
1913 @cindex emacs regexp
1914
1915 A @dfn{regular expression} (or @dfn{regexp}) is a pattern that
1916 describes a whole class of strings.  A full description of regular
1917 expressions and their syntax is beyond the scope of this manual;
1918 an introduction can be found in the Emacs manual (@pxref{Regexps,
1919 , Syntax of Regular Expressions, emacs, The GNU Emacs Manual}), or
1920 in many general Unix reference books.
1921
1922 If your system does not include a POSIX regular expression library, and
1923 you have not linked Guile with a third-party regexp library such as Rx,
1924 these functions will not be available.  You can tell whether your Guile
1925 installation includes regular expression support by checking whether the
1926 @code{*features*} list includes the @code{regex} symbol.
1927
1928 @menu
1929 * Regexp Functions::            Functions that create and match regexps.
1930 * Match Structures::            Finding what was matched by a regexp.
1931 * Backslash Escapes::           Removing the special meaning of regexp
1932                                 meta-characters.
1933 @end menu
1934
1935 [FIXME: it may be useful to include an Examples section.  Parts of this
1936 interface are bewildering on first glance.]
1937
1938 @node Regexp Functions
1939 @subsection Regexp Functions
1940
1941 By default, Guile supports POSIX extended regular expressions.
1942 That means that the characters @samp{(}, @samp{)}, @samp{+} and
1943 @samp{?} are special, and must be escaped if you wish to match the
1944 literal characters.
1945
1946 This regular expression interface was modeled after that
1947 implemented by SCSH, the Scheme Shell.  It is intended to be
1948 upwardly compatible with SCSH regular expressions.
1949
1950 @c begin (scm-doc-string "regex.scm" "string-match")
1951 @deffn {Scheme Procedure} string-match pattern str [start]
1952 Compile the string @var{pattern} into a regular expression and compare
1953 it with @var{str}.  The optional numeric argument @var{start} specifies
1954 the position of @var{str} at which to begin matching.
1955
1956 @code{string-match} returns a @dfn{match structure} which
1957 describes what, if anything, was matched by the regular
1958 expression.  @xref{Match Structures}.  If @var{str} does not match
1959 @var{pattern} at all, @code{string-match} returns @code{#f}.
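
The following is a minimal sketch of typical use; the pattern and the
input strings are invented for illustration.  Depending on the Guile
version, the regexp procedures may first need to be loaded with
@code{(use-modules (ice-9 regex))}, which is assumed here.

@lisp
(use-modules (ice-9 regex))

(define m (string-match "[0-9]+" "abc123def"))
(match:substring m)               @result{} "123"
(string-match "[0-9]+" "abcdef")  @result{} #f

;; With a start position, matching begins at that index.
(match:substring (string-match "[a-z]+" "abc123def" 3))
@result{} "def"
@end lisp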
1960 @end deffn
1961
1962 Each time @code{string-match} is called, it must compile its
1963 @var{pattern} argument into a regular expression structure.  This
1964 operation is expensive, which makes @code{string-match} inefficient if
1965 the same regular expression is used several times (for example, in a
1966 loop).  For better performance, you can compile a regular expression in
1967 advance and then match strings against the compiled regexp.
1968
1969 @deffn {Scheme Procedure} make-regexp pat . flags
1970 @deffnx {C Function} scm_make_regexp (pat, flags)
1971 Compile the regular expression described by @var{pat}, and
1972 return the compiled regexp structure.  If @var{pat} does not
describe a valid regular expression, @code{make-regexp} throws
1974 a @code{regular-expression-syntax} error.
1975
1976 The @var{flags} arguments change the behavior of the compiled
1977 regular expression.  The following flags may be supplied:
1978
1979 @table @code
1980 @item regexp/icase
1981 Consider uppercase and lowercase letters to be the same when
1982 matching.
1983 @item regexp/newline
1984 If a newline appears in the target string, then permit the
1985 @samp{^} and @samp{$} operators to match immediately after or
1986 immediately before the newline, respectively.  Also, the
1987 @samp{.} and @samp{[^...]} operators will never match a newline
1988 character.  The intent of this flag is to treat the target
1989 string as a buffer containing many lines of text, and the
1990 regular expression as a pattern that may match a single one of
1991 those lines.
1992 @item regexp/basic
1993 Compile a basic (``obsolete'') regexp instead of the extended
1994 (``modern'') regexps that are the default.  Basic regexps do
1995 not consider @samp{|}, @samp{+} or @samp{?} to be special
1996 characters, and require the @samp{@{...@}} and @samp{(...)}
1997 metacharacters to be backslash-escaped (@pxref{Backslash
1998 Escapes}).  There are several other differences between basic
1999 and extended regular expressions, but these are the most
2000 significant.
2001 @item regexp/extended
2002 Compile an extended regular expression rather than a basic
2003 regexp.  This is the default behavior; this flag will not
2004 usually be needed.  If a call to @code{make-regexp} includes
2005 both @code{regexp/basic} and @code{regexp/extended} flags, the
2006 one which comes last will override the earlier one.
2007 @end table
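
As a rough illustration of how the compile-time flags change matching,
consider the sketch below.  The variable names and sample strings are
hypothetical, and @code{(use-modules (ice-9 regex))} is assumed to be
needed for @code{match:substring}.

@lisp
(use-modules (ice-9 regex))

;; regexp/icase: letter case is ignored.
(define rx-icase (make-regexp "guile" regexp/icase))
(match:substring (regexp-exec rx-icase "GNU GUILE")) @result{} "GUILE"

;; regexp/newline: ^ and $ also match at line boundaries.
(define rx-plain (make-regexp "^b.*$"))
(define rx-lines (make-regexp "^b.*$" regexp/newline))
(regexp-exec rx-plain "a\nbcd\ne")                   @result{} #f
(match:substring (regexp-exec rx-lines "a\nbcd\ne")) @result{} "bcd"
@end lisp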
2008 @end deffn
2009
2010 @deffn {Scheme Procedure} regexp-exec rx str [start [flags]]
2011 @deffnx {C Function} scm_regexp_exec (rx, str, start, flags)
2012 Match the compiled regular expression @var{rx} against
@var{str}.  If the optional integer @var{start} argument is
2014 provided, begin matching from that position in the string.
2015 Return a match structure describing the results of the match,
2016 or @code{#f} if no match could be found.
2017
2018 The @var{flags} arguments change the matching behavior.
2019 The following flags may be supplied:
2020
2021 @table @code
2022 @item regexp/notbol
2023 Operator @samp{^} always fails (unless @code{regexp/newline}
2024 is used).  Use this when the beginning of the string should
2025 not be considered the beginning of a line.
2026 @item regexp/noteol
2027 Operator @samp{$} always fails (unless @code{regexp/newline}
2028 is used).  Use this when the end of the string should not be
2029 considered the end of a line.
2030 @end table
2031 @end deffn
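
The advice above about compiling a pattern once and reusing it can be
sketched as follows.  The names @code{digits-rx} and @code{first-number}
are hypothetical helpers introduced only for this example, and
@code{(use-modules (ice-9 regex))} is assumed.

@lisp
(use-modules (ice-9 regex))

;; Compile the pattern a single time ...
(define digits-rx (make-regexp "[0-9]+"))

;; ... then reuse the compiled regexp for every string.
(define (first-number str)
  (let ((m (regexp-exec digits-rx str)))
    (and m (match:substring m))))

(map first-number '("abc123" "no digits" "42"))
@result{} ("123" #f "42")
@end lisp

Calling @code{string-match} inside @code{first-number} would give the
same results, but would recompile the pattern on every call.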
2032
2033 @deffn {Scheme Procedure} regexp? obj
2034 @deffnx {C Function} scm_regexp_p (obj)
2035 Return @code{#t} if @var{obj} is a compiled regular expression,
2036 or @code{#f} otherwise.
2037 @end deffn
2038
2039 Regular expressions are commonly used to find patterns in one string and
2040 replace them with the contents of another string.
2041
2042 @c begin (scm-doc-string "regex.scm" "regexp-substitute")
2043 @deffn {Scheme Procedure} regexp-substitute port match [item@dots{}]
2044 Write to the output port @var{port} selected contents of the match
2045 structure @var{match}.  Each @var{item} specifies what should be
2046 written, and may be one of the following arguments:
2047
2048 @itemize @bullet
2049 @item
2050 A string.  String arguments are written out verbatim.
2051
2052 @item
2053 An integer.  The submatch with that number is written.
2054
2055 @item
2056 The symbol @samp{pre}.  The portion of the matched string preceding
2057 the regexp match is written.
2058
2059 @item
2060 The symbol @samp{post}.  The portion of the matched string following
2061 the regexp match is written.
2062 @end itemize
2063
2064 @var{port} may be @code{#f}, in which case nothing is written; instead,
2065 @code{regexp-substitute} constructs a string from the specified
2066 @var{item}s and returns that.
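
A small sketch of the string-returning form, using an invented pattern and
target, and assuming @code{(use-modules (ice-9 regex))}:

@lisp
(use-modules (ice-9 regex))

(define m (string-match "[0-9]+" "abc123def"))

;; Rebuild the target with the match wrapped in angle brackets.
(regexp-substitute #f m 'pre "<" 0 ">" 'post)
@result{} "abc<123>def"

;; Only the matched text.
(regexp-substitute #f m 0)
@result{} "123"
@end lisp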
2067 @end deffn
2068
@c begin (scm-doc-string "regex.scm" "regexp-substitute/global")
2070 @deffn {Scheme Procedure} regexp-substitute/global port regexp target [item@dots{}]
2071 Similar to @code{regexp-substitute}, but can be used to perform global
substitutions on @var{target}.  Instead of taking a match structure as an
2073 argument, @code{regexp-substitute/global} takes two string arguments: a
2074 @var{regexp} string describing a regular expression, and a @var{target}
2075 string which should be matched against this regular expression.
2076
Each @var{item} behaves as in @code{regexp-substitute}, with the
2078 following exceptions:
2079
2080 @itemize @bullet
2081 @item
2082 A function may be supplied.  When this function is called, it will be
2083 passed one argument: a match structure for a given regular expression
2084 match.  It should return a string to be written out to @var{port}.
2085
2086 @item
2087 The @samp{post} symbol causes @code{regexp-substitute/global} to recurse
on the unmatched portion of @var{target}.  This @emph{must} be supplied in
order to perform global search-and-replace on @var{target}; if it is not
2090 present among the @var{item}s, then @code{regexp-substitute/global} will
2091 return after processing a single match.
2092 @end itemize
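
The two exceptions above can be sketched as follows.  The patterns and
target strings are made up for illustration, and
@code{(use-modules (ice-9 regex))} is assumed.

@lisp
(use-modules (ice-9 regex))

;; Replace every run of blanks with a single hyphen.
(regexp-substitute/global #f "[ \t]+" "this   is   the text"
                          'pre "-" 'post)
@result{} "this-is-the-text"

;; A procedure item receives each match structure in turn.
(regexp-substitute/global #f "[a-z]+" "foo 1 bar"
                          'pre
                          (lambda (m)
                            (string-upcase (match:substring m)))
                          'post)
@result{} "FOO 1 BAR"
@end lisp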
2093 @end deffn
2094
2095 @node Match Structures
2096 @subsection Match Structures
2097
2098 @cindex match structures
2099
2100 A @dfn{match structure} is the object returned by @code{string-match} and
2101 @code{regexp-exec}.  It describes which portion of a string, if any,
2102 matched the given regular expression.  Match structures include: a
2103 reference to the string that was checked for matches; the starting and
2104 ending positions of the regexp match; and, if the regexp included any
2105 parenthesized subexpressions, the starting and ending positions of each
2106 submatch.
2107
In each of the regexp match functions described below, the @var{match}
2109 argument must be a match structure returned by a previous call to
2110 @code{string-match} or @code{regexp-exec}.  Most of these functions
2111 return some information about the original target string that was
2112 matched against a regular expression; we will call that string
2113 @var{target} for easy reference.
2114
2115 @c begin (scm-doc-string "regex.scm" "regexp-match?")
2116 @deffn {Scheme Procedure} regexp-match? obj
2117 Return @code{#t} if @var{obj} is a match structure returned by a
2118 previous call to @code{regexp-exec}, or @code{#f} otherwise.
2119 @end deffn
2120
2121 @c begin (scm-doc-string "regex.scm" "match:substring")
2122 @deffn {Scheme Procedure} match:substring match [n]
2123 Return the portion of @var{target} matched by subexpression number
2124 @var{n}.  Submatch 0 (the default) represents the entire regexp match.
2125 If the regular expression as a whole matched, but the subexpression
2126 number @var{n} did not match, return @code{#f}.
2127 @end deffn
2128
2129 @c begin (scm-doc-string "regex.scm" "match:start")
2130 @deffn {Scheme Procedure} match:start match [n]
2131 Return the starting position of submatch number @var{n}.
2132 @end deffn
2133
2134 @c begin (scm-doc-string "regex.scm" "match:end")
2135 @deffn {Scheme Procedure} match:end match [n]
2136 Return the ending position of submatch number @var{n}.
2137 @end deffn
2138
2139 @c begin (scm-doc-string "regex.scm" "match:prefix")
2140 @deffn {Scheme Procedure} match:prefix match
2141 Return the unmatched portion of @var{target} preceding the regexp match.
2142 @end deffn
2143
2144 @c begin (scm-doc-string "regex.scm" "match:suffix")
2145 @deffn {Scheme Procedure} match:suffix match
2146 Return the unmatched portion of @var{target} following the regexp match.
2147 @end deffn
2148
2149 @c begin (scm-doc-string "regex.scm" "match:count")
2150 @deffn {Scheme Procedure} match:count match
2151 Return the number of parenthesized subexpressions from @var{match}.
2152 Note that the entire regular expression match itself counts as a
2153 subexpression, and failed submatches are included in the count.
2154 @end deffn
2155
2156 @c begin (scm-doc-string "regex.scm" "match:string")
2157 @deffn {Scheme Procedure} match:string match
2158 Return the original @var{target} string.
2159 @end deffn
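
The accessors described in this section can be tried together on a single
match.  The following is only a sketch: the pattern and target are
invented, and @code{(use-modules (ice-9 regex))} is assumed.

@lisp
(use-modules (ice-9 regex))

(define m (string-match "([a-z]+)=([0-9]+)" "set x=42 now"))

(match:substring m)    @result{} "x=42"
(match:substring m 1)  @result{} "x"
(match:substring m 2)  @result{} "42"
(match:start m)        @result{} 4
(match:end m)          @result{} 8
(match:start m 2)      @result{} 6
(match:prefix m)       @result{} "set "
(match:suffix m)       @result{} " now"
(match:count m)        @result{} 3
(match:string m)       @result{} "set x=42 now"

;; A subexpression that took no part in the match yields #f.
(define alt (string-match "(a)|(b)" "b"))
(match:substring alt 1) @result{} #f
(match:substring alt 2) @result{} "b"
@end lisp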
2160
2161 @node Backslash Escapes
2162 @subsection Backslash Escapes
2163
2164 Sometimes you will want a regexp to match characters like @samp{*} or
2165 @samp{$} exactly.  For example, to check whether a particular string
2166 represents a menu entry from an Info node, it would be useful to match
2167 it against a regexp like @samp{^* [^:]*::}.  However, this won't work;
2168 because the asterisk is a metacharacter, it won't match the @samp{*} at
2169 the beginning of the string.  In this case, we want to make the first
2170 asterisk un-magic.
2171
2172 You can do this by preceding the metacharacter with a backslash
2173 character @samp{\}.  (This is also called @dfn{quoting} the
2174 metacharacter, and is known as a @dfn{backslash escape}.)  When Guile
2175 sees a backslash in a regular expression, it considers the following
2176 glyph to be an ordinary character, no matter what special meaning it
2177 would ordinarily have.  Therefore, we can make the above example work by
2178 changing the regexp to @samp{^\* [^:]*::}.  The @samp{\*} sequence tells
2179 the regular expression engine to match only a single asterisk in the
2180 target string.
2181
2182 Since the backslash is itself a metacharacter, you may force a regexp to
2183 match a backslash in the target string by preceding the backslash with
2184 itself.  For example, to find variable references in a @TeX{} program,
2185 you might want to find occurrences of the string @samp{\let\} followed
2186 by any number of alphabetic characters.  The regular expression
2187 @samp{\\let\\[A-Za-z]*} would do this: the double backslashes in the
2188 regexp each match a single backslash in the target string.
2189
2190 @c begin (scm-doc-string "regex.scm" "regexp-quote")
2191 @deffn {Scheme Procedure} regexp-quote str
2192 Quote each special character found in @var{str} with a backslash, and
2193 return the resulting string.
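
A sketch of why quoting matters when a pattern is built from literal text;
the sample strings are arbitrary, and @code{(use-modules (ice-9 regex))}
is assumed.

@lisp
(use-modules (ice-9 regex))

;; Unquoted, the dot matches any character.
(match:substring (string-match "a.b" "axb"))   @result{} "axb"

;; Quoted, the dot matches only a literal dot.
(define literal (regexp-quote "a.b"))
(string-match literal "axb")                   @result{} #f
(match:substring (string-match literal "a.b")) @result{} "a.b"
@end lisp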
2194 @end deffn
2195
2196 @strong{Very important:} Using backslash escapes in Guile source code
2197 (as in Emacs Lisp or C) can be tricky, because the backslash character
2198 has special meaning for the Guile reader.  For example, if Guile
2199 encounters the character sequence @samp{\n} in the middle of a string
2200 while processing Scheme code, it replaces those characters with a
2201 newline character.  Similarly, the character sequence @samp{\t} is
2202 replaced by a horizontal tab.  Several of these @dfn{escape sequences}
2203 are processed by the Guile reader before your code is executed.
2204 Unrecognized escape sequences are ignored: if the characters @samp{\*}
2205 appear in a string, they will be translated to the single character
2206 @samp{*}.
2207
2208 This translation is obviously undesirable for regular expressions, since
2209 we want to be able to include backslashes in a string in order to
2210 escape regexp metacharacters.  Therefore, to make sure that a backslash
2211 is preserved in a string in your Guile program, you must use @emph{two}
2212 consecutive backslashes:
2213
2214 @lisp
2215 (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
2216 @end lisp
2217
2218 The string in this example is preprocessed by the Guile reader before
2219 any code is executed.  The resulting argument to @code{make-regexp} is
2220 the string @samp{^\* [^:]*}, which is what we really want.
2221
2222 This also means that in order to write a regular expression that matches
2223 a single backslash character, the regular expression string in the
2224 source code must include @emph{four} backslashes.  Each consecutive pair
2225 of backslashes gets translated by the Guile reader to a single
2226 backslash, and the resulting double-backslash is interpreted by the
2227 regexp engine as matching a single backslash character.  Hence:
2228
2229 @lisp
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
2231 @end lisp
2232
2233 The reason for the unwieldiness of this syntax is historical.  Both
2234 regular expression pattern matchers and Unix string processing systems
2235 have traditionally used backslashes with the special meanings
2236 described above.  The POSIX regular expression specification and ANSI C
2237 standard both require these semantics.  Attempting to abandon either
2238 convention would cause other kinds of compatibility problems, possibly
2239 more severe ones.  Therefore, without extending the Scheme reader to
2240 support strings with different quoting conventions (an ungainly and
2241 confusing extension when implemented in other languages), we must adhere
2242 to this cumbersome escape syntax.
2243
2244
2245 @node Symbols
2246 @section Symbols
2247 @tpindex Symbols
2248
2249 Symbols in Scheme are widely used in three ways: as items of discrete
2250 data, as lookup keys for alists and hash tables, and to denote variable
2251 references.
2252
2253 A @dfn{symbol} is similar to a string in that it is defined by a
2254 sequence of characters.  The sequence of characters is known as the
2255 symbol's @dfn{name}.  In the usual case --- that is, where the symbol's
2256 name doesn't include any characters that could be confused with other
2257 elements of Scheme syntax --- a symbol is written in a Scheme program by
2258 writing the sequence of characters that make up the name, @emph{without}
2259 any quotation marks or other special syntax.  For example, the symbol
2260 whose name is ``multiply-by-2'' is written, simply:
2261
2262 @lisp
2263 multiply-by-2
2264 @end lisp
2265
2266 Notice how this differs from a @emph{string} with contents
2267 ``multiply-by-2'', which is written with double quotation marks, like
2268 this:
2269
2270 @lisp
2271 "multiply-by-2"
2272 @end lisp
2273
2274 Looking beyond how they are written, symbols are different from strings
2275 in two important respects.
2276
2277 The first important difference is uniqueness.  If the same-looking
2278 string is read twice from two different places in a program, the result
2279 is two @emph{different} string objects whose contents just happen to be
2280 the same.  If, on the other hand, the same-looking symbol is read twice
2281 from two different places in a program, the result is the @emph{same}
2282 symbol object both times.
2283
2284 Given two read symbols, you can use @code{eq?} to test whether they are
2285 the same (that is, have the same name).  @code{eq?} is the most
2286 efficient comparison operator in Scheme, and comparing two symbols like
2287 this is as fast as comparing, for example, two numbers.  Given two
2288 strings, on the other hand, you must use @code{equal?} or
2289 @code{string=?}, which are much slower comparison operators, to
2290 determine whether the strings have the same contents.
2291
2292 @lisp
2293 (define sym1 (quote hello))
2294 (define sym2 (quote hello))
2295 (eq? sym1 sym2) @result{} #t
2296
2297 (define str1 "hello")
2298 (define str2 "hello")
2299 (eq? str1 str2) @result{} #f
2300 (equal? str1 str2) @result{} #t
2301 @end lisp
2302
2303 The second important difference is that symbols, unlike strings, are not
2304 self-evaluating.  This is why we need the @code{(quote @dots{})}s in the
2305 example above: @code{(quote hello)} evaluates to the symbol named
``hello'' itself, whereas an unquoted @code{hello} is @emph{read} as the
symbol named ``hello'' and evaluated as a variable reference @dots{} about
2308 which more below (@pxref{Symbol Variables}).
2309
2310 @menu
2311 * Symbol Data::                 Symbols as discrete data.
2312 * Symbol Keys::                 Symbols as lookup keys.
2313 * Symbol Variables::            Symbols as denoting variables.
2314 * Symbol Primitives::           Operations related to symbols.
2315 * Symbol Props::                Function slots and property lists.
2316 * Symbol Read Syntax::          Extended read syntax for symbols.
2317 * Symbol Uninterned::           Uninterned symbols.
2318 @end menu
2319
2320
2321 @node Symbol Data
2322 @subsection Symbols as Discrete Data
2323
2324 Numbers and symbols are similar to the extent that they both lend
2325 themselves to @code{eq?} comparison.  But symbols are more descriptive
2326 than numbers, because a symbol's name can be used directly to describe
2327 the concept for which that symbol stands.
2328
2329 For example, imagine that you need to represent some colours in a
2330 computer program.  Using numbers, you would have to choose arbitrarily
2331 some mapping between numbers and colours, and then take care to use that
2332 mapping consistently:
2333
2334 @lisp
2335 ;; 1=red, 2=green, 3=purple
2336
2337 (if (eq? (colour-of car) 1)
2338     ...)
2339 @end lisp
2340
2341 @noindent
2342 You can make the mapping more explicit and the code more readable by
2343 defining constants:
2344
2345 @lisp
2346 (define red 1)
2347 (define green 2)
2348 (define purple 3)
2349
2350 (if (eq? (colour-of car) red)
2351     ...)
2352 @end lisp
2353
2354 @noindent
2355 But the simplest and clearest approach is not to use numbers at all, but
2356 symbols whose names specify the colours that they refer to:
2357
2358 @lisp
2359 (if (eq? (colour-of car) 'red)
2360     ...)
2361 @end lisp
2362
2363 The descriptive advantages of symbols over numbers increase as the set
2364 of concepts that you want to describe grows.  Suppose that a car object
2365 can have other properties as well, such as whether it has or uses:
2366
2367 @itemize @bullet
2368 @item
2369 automatic or manual transmission
2370 @item
2371 leaded or unleaded fuel
2372 @item
2373 power steering (or not).
2374 @end itemize
2375
2376 @noindent
2377 Then a car's combined property set could be naturally represented and
2378 manipulated as a list of symbols:
2379
2380 @lisp
2381 (properties-of car1)
2382 @result{}
2383 (red manual unleaded power-steering)
2384
2385 (if (memq 'power-steering (properties-of car1))
2386     (display "Unfit people can drive this car.\n")
2387     (display "You'll need strong arms to drive this car!\n"))
2388 @print{}
2389 Unfit people can drive this car.
2390 @end lisp
2391
2392 Remember, the fundamental property of symbols that we are relying on
2393 here is that an occurrence of @code{'red} in one part of a program is an
2394 @emph{indistinguishable} symbol from an occurrence of @code{'red} in
2395 another part of a program; this means that symbols can usefully be
2396 compared using @code{eq?}.  At the same time, symbols have naturally
2397 descriptive names.  This combination of efficiency and descriptive power
2398 makes them ideal for use as discrete data.
2399
2400
2401 @node Symbol Keys
2402 @subsection Symbols as Lookup Keys
2403
2404 Given their efficiency and descriptive power, it is natural to use
2405 symbols as the keys in an association list or hash table.
2406
2407 To illustrate this, consider a more structured representation of the car
2408 properties example from the preceding subsection.  Rather than
2409 mixing all the properties up together in a flat list, we could use an
2410 association list like this:
2411
2412 @lisp
2413 (define car1-properties '((colour . red)
2414                           (transmission . manual)
2415                           (fuel . unleaded)
2416                           (steering . power-assisted)))
2417 @end lisp
2418
2419 Notice how this structure is more explicit and extensible than the flat
2420 list.  For example it makes clear that @code{manual} refers to the
2421 transmission rather than, say, the windows or the locking of the car.
2422 It also allows further properties to use the same symbols among their
2423 possible values without becoming ambiguous:
2424
2425 @lisp
2426 (define car1-properties '((colour . red)
2427                           (transmission . manual)
2428                           (fuel . unleaded)
2429                           (steering . power-assisted)
2430                           (seat-colour . red)
2431                           (locking . manual)))
2432 @end lisp
2433
2434 With a representation like this, it is easy to use the efficient
2435 @code{assq-XXX} family of procedures (@pxref{Association Lists}) to
2436 extract or change individual pieces of information:
2437
2438 @lisp
2439 (assq-ref car1-properties 'fuel) @result{} unleaded
2440 (assq-ref car1-properties 'transmission) @result{} manual
2441
2442 (assq-set! car1-properties 'seat-colour 'black)
2443 @result{}
2444 ((colour . red)
2445  (transmission . manual)
2446  (fuel . unleaded)
2447  (steering . power-assisted)
2448  (seat-colour . black)
 (locking . manual))
2450 @end lisp
2451
2452 Hash tables also have keys, and exactly the same arguments apply to the
2453 use of symbols in hash tables as in association lists.  The hash value
2454 that Guile uses to decide where to add a symbol-keyed entry to a hash
2455 table can be obtained by calling the @code{symbol-hash} procedure:
2456
2457 @deffn {Scheme Procedure} symbol-hash symbol
2458 @deffnx {C Function} scm_symbol_hash (symbol)
2459 Return a hash value for @var{symbol}.
2460 @end deffn
2461
2462 See @ref{Hash Tables} for information about hash tables in general, and
2463 for why you might choose to use a hash table rather than an association
2464 list.
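
As a rough sketch, the car properties from the association list example
could be stored in a hash table instead.  The table name
@code{car1-table} and the size argument of 31 are arbitrary choices made
for this illustration.

@lisp
(define car1-table (make-hash-table 31))

;; hashq-set! and hashq-ref compare keys with eq?, which suits symbols.
(hashq-set! car1-table 'colour 'red)
(hashq-set! car1-table 'fuel 'unleaded)

(hashq-ref car1-table 'fuel)     @result{} unleaded
(hashq-ref car1-table 'sunroof)  @result{} #f
@end lisp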
2465
2466
2467 @node Symbol Variables
2468 @subsection Symbols as Denoting Variables
2469
2470 When an unquoted symbol in a Scheme program is evaluated, it is
2471 interpreted as a variable reference, and the result of the evaluation is
2472 the appropriate variable's value.
2473
2474 For example, when the expression @code{(string-length "abcd")} is read
2475 and evaluated, the sequence of characters @code{string-length} is read
as the symbol whose name is ``string-length''.  This symbol is associated
2477 with a variable whose value is the procedure that implements string
2478 length calculation.  Therefore evaluation of the @code{string-length}
2479 symbol results in that procedure.
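
The difference between evaluating a symbol and quoting it can be seen in
a short sketch; the variable @code{answer} is a made-up name used only
here.

@lisp
(define answer 42)

answer                      @result{} 42
'answer                     @result{} answer
(symbol? 'answer)           @result{} #t
(procedure? string-length)  @result{} #t
@end lisp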
2480
2481 The details of the connection between an unquoted symbol and the
2482 variable to which it refers are explained elsewhere.  See @ref{Binding
2483 Constructs}, for how associations between symbols and variables are
2484 created, and @ref{Modules}, for how those associations are affected by
2485 Guile's module system.
2486
2487
2488 @node Symbol Primitives
2489 @subsection Operations Related to Symbols
2490
2491 Given any Scheme value, you can determine whether it is a symbol using
2492 the @code{symbol?} primitive:
2493
2494 @rnindex symbol?
2495 @deffn {Scheme Procedure} symbol? obj
2496 @deffnx {C Function} scm_symbol_p (obj)
2497 Return @code{#t} if @var{obj} is a symbol, otherwise return
2498 @code{#f}.
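
A few sample calls, with arbitrary arguments:

@lisp
(symbol? 'foo)          @result{} #t
(symbol? (car '(a b)))  @result{} #t
(symbol? "bar")         @result{} #f
(symbol? '())           @result{} #f
@end lisp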
2499 @end deffn
2500
2501 Once you know that you have a symbol, you can obtain its name as a
2502 string by calling @code{symbol->string}.  Note that Guile differs by
2503 default from R5RS on the details of @code{symbol->string} as regards
2504 case-sensitivity:
2505
2506 @rnindex symbol->string
2507 @deffn {Scheme Procedure} symbol->string s
2508 @deffnx {C Function} scm_symbol_to_string (s)
2509 Return the name of symbol @var{s} as a string.  By default, Guile reads
2510 symbols case-sensitively, so the string returned will have the same case
2511 variation as the sequence of characters that caused @var{s} to be
2512 created.
2513
2514 If Guile is set to read symbols case-insensitively (as specified by
2515 R5RS), and @var{s} comes into being as part of a literal expression
2516 (@pxref{Literal expressions,,,r5rs, The Revised^5 Report on Scheme}) or
2517 by a call to the @code{read} or @code{string-ci->symbol} procedures,
2518 Guile converts any alphabetic characters in the symbol's name to
2519 lower case before creating the symbol object, so the string returned
2520 here will be in lower case.
2521
2522 If @var{s} was created by @code{string->symbol}, the case of characters
2523 in the string returned will be the same as that in the string that was
2524 passed to @code{string->symbol}, regardless of Guile's case-sensitivity
2525 setting at the time @var{s} was created.
2526
2527 It is an error to apply mutation procedures like @code{string-set!} to
2528 strings returned by this procedure.
2529 @end deffn
2530
2531 Most symbols are created by writing them literally in code.  However it
2532 is also possible to create symbols programmatically using the following
2533 @code{string->symbol} and @code{string-ci->symbol} procedures:
2534
2535 @rnindex string->symbol
2536 @deffn {Scheme Procedure} string->symbol string
2537 @deffnx {C Function} scm_string_to_symbol (string)
2538 Return the symbol whose name is @var{string}.  This procedure can create
2539 symbols with names containing special characters or letters in the
2540 non-standard case, but it is usually a bad idea to create such symbols
2541 because in some implementations of Scheme they cannot be read as
2542 themselves.
2543 @end deffn
2544
2545 @deffn {Scheme Procedure} string-ci->symbol str
2546 @deffnx {C Function} scm_string_ci_to_symbol (str)
2547 Return the symbol whose name is @var{str}.  If Guile is currently
2548 reading symbols case-insensitively, @var{str} is converted to lowercase
2549 before the returned symbol is looked up or created.
2550 @end deffn
2551
2552 The following examples illustrate Guile's detailed behaviour as regards
2553 the case-sensitivity of symbols:
2554
2555 @lisp
2556 (read-enable 'case-insensitive)   ; R5RS compliant behaviour
2557
2558 (symbol->string 'flying-fish)    @result{} "flying-fish"
2559 (symbol->string 'Martin)         @result{} "martin"
2560 (symbol->string
2561    (string->symbol "Malvina"))   @result{} "Malvina"
2562
2563 (eq? 'mISSISSIppi 'mississippi)  @result{} #t
2564 (string->symbol "mISSISSIppi")   @result{} mISSISSIppi
2565 (eq? 'bitBlt (string->symbol "bitBlt")) @result{} #f
2566 (eq? 'LolliPop
2567   (string->symbol (symbol->string 'LolliPop))) @result{} #t
2568 (string=? "K. Harper, M.D."
2569   (symbol->string
2570     (string->symbol "K. Harper, M.D."))) @result{} #t
2571
2572 (read-disable 'case-insensitive)   ; Guile default behaviour
2573
2574 (symbol->string 'flying-fish)    @result{} "flying-fish"
2575 (symbol->string 'Martin)         @result{} "Martin"
2576 (symbol->string
2577    (string->symbol "Malvina"))   @result{} "Malvina"
2578
2579 (eq? 'mISSISSIppi 'mississippi)  @result{} #f
2580 (string->symbol "mISSISSIppi")   @result{} mISSISSIppi
2581 (eq? 'bitBlt (string->symbol "bitBlt")) @result{} #t
2582 (eq? 'LolliPop
2583   (string->symbol (symbol->string 'LolliPop))) @result{} #t
2584 (string=? "K. Harper, M.D."
2585   (symbol->string
2586     (string->symbol "K. Harper, M.D."))) @result{} #t
2587 @end lisp
2588
2589 Finally, some applications, especially those that generate new Scheme
2590 code dynamically, need to generate symbols for use in the generated
2591 code.  The @code{gensym} primitive meets this need:
2592
2593 @deffn {Scheme Procedure} gensym [prefix]
2594 @deffnx {C Function} scm_gensym (prefix)
Create a new symbol with a name constructed from a prefix and a counter
value.  The string @var{prefix} can be specified as an optional
argument.  The default prefix is @samp{ g}.  The counter is increased by
1 at each call.  There is no provision for resetting the counter.
2599 @end deffn
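
As a brief illustration (a sketch only: the prefix @code{"tmp"} is an
arbitrary choice, and the complete generated names depend on the current
counter value, so they are not shown here):

@lisp
(define a (gensym "tmp"))
(define b (gensym "tmp"))

(symbol? a)
@result{} #t

(substring (symbol->string a) 0 3)
@result{} "tmp"
; The generated name starts with the given prefix.

(eq? a b)
@result{} #f
; Each call yields a symbol with a different name.
@end lisp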
2600
The symbols generated by @code{gensym} are @emph{likely} to be unique,
since, with the default prefix, their names begin with a space and it is
only otherwise possible to generate such symbols if a programmer goes
out of their way to do so.  To create symbols that are
@emph{guaranteed} to be unique, use @code{make-symbol}
(@pxref{Symbol Uninterned}).
2606
2607
2608 @node Symbol Props
2609 @subsection Function Slots and Property Lists
2610
2611 In traditional Lisp dialects, symbols are often understood as having
2612 three kinds of value at once:
2613
2614 @itemize @bullet
2615 @item
2616 a @dfn{variable} value, which is used when the symbol appears in
2617 code in a variable reference context
2618
2619 @item
2620 a @dfn{function} value, which is used when the symbol appears in
2621 code in a function name position (i.e. as the first element in an
2622 unquoted list)
2623
2624 @item
2625 a @dfn{property list} value, which is used when the symbol is given as
2626 the first argument to Lisp's @code{put} or @code{get} functions.
2627 @end itemize
2628
2629 Although Scheme (as one of its simplifications with respect to Lisp)
2630 does away with the distinction between variable and function namespaces,
2631 Guile currently retains some elements of the traditional structure in
2632 case they turn out to be useful when implementing translators for other
2633 languages, in particular Emacs Lisp.
2634
Specifically, Guile symbols have two extra slots: one for a symbol's
property list, and one for its ``function value.''  The following
procedures are provided to access these slots.
2638
2639 @deffn {Scheme Procedure} symbol-fref symbol
2640 @deffnx {C Function} scm_symbol_fref (symbol)
2641 Return the contents of @var{symbol}'s @dfn{function slot}.
2642 @end deffn
2643
2644 @deffn {Scheme Procedure} symbol-fset! symbol value
2645 @deffnx {C Function} scm_symbol_fset_x (symbol, value)
2646 Set the contents of @var{symbol}'s function slot to @var{value}.
2647 @end deffn
2648
2649 @deffn {Scheme Procedure} symbol-pref symbol
2650 @deffnx {C Function} scm_symbol_pref (symbol)
2651 Return the @dfn{property list} currently associated with @var{symbol}.
2652 @end deffn
2653
2654 @deffn {Scheme Procedure} symbol-pset! symbol value
2655 @deffnx {C Function} scm_symbol_pset_x (symbol, value)
2656 Set @var{symbol}'s property list to @var{value}.
2657 @end deffn
2658
2659 @deffn {Scheme Procedure} symbol-property sym prop
2660 From @var{sym}'s property list, return the value for property
2661 @var{prop}.  The assumption is that @var{sym}'s property list is an
2662 association list whose keys are distinguished from each other using
2663 @code{equal?}; @var{prop} should be one of the keys in that list.  If
2664 the property list has no entry for @var{prop}, @code{symbol-property}
2665 returns @code{#f}.
2666 @end deffn
2667
@deffn {Scheme Procedure} set-symbol-property! sym prop val
2669 In @var{sym}'s property list, set the value for property @var{prop} to
2670 @var{val}, or add a new entry for @var{prop}, with value @var{val}, if
2671 none already exists.  For the structure of the property list, see
2672 @code{symbol-property}.
2673 @end deffn
2674
2675 @deffn {Scheme Procedure} symbol-property-remove! sym prop
2676 From @var{sym}'s property list, remove the entry for property
2677 @var{prop}, if there is one.  For the structure of the property list,
2678 see @code{symbol-property}.
2679 @end deffn
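
The following sketch exercises both slots.  The symbol @code{widget} and
the property names @code{color}, @code{size} and @code{weight} are
arbitrary names chosen for illustration.

@lisp
;; Install an association list as the property list of `widget'.
(symbol-pset! 'widget (list (cons 'color "blue") (cons 'size 3)))

(symbol-property 'widget 'color)
@result{} "blue"

(symbol-property 'widget 'weight)
@result{} #f
; There is no entry for weight.

(symbol-property-remove! 'widget 'size)
(symbol-pref 'widget)
@result{} ((color . "blue"))

;; The function slot can hold any value, for example a procedure.
(symbol-fset! 'widget car)
((symbol-fref 'widget) '(1 2 3))
@result{} 1
@end lisp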
2680
2681 Support for these extra slots may be removed in a future release, and it
2682 is probably better to avoid using them.  (In release 1.6, Guile itself
2683 uses the property list slot sparingly, and the function slot not at
2684 all.)  For a more modern and Schemely approach to properties, see
2685 @ref{Object Properties}.
2686
2687
2688 @node Symbol Read Syntax
2689 @subsection Extended Read Syntax for Symbols
2690
2691 The read syntax for a symbol is a sequence of letters, digits, and
2692 @dfn{extended alphabetic characters}, beginning with a character that
2693 cannot begin a number.  In addition, the special cases of @code{+},
2694 @code{-}, and @code{...} are read as symbols even though numbers can
2695 begin with @code{+}, @code{-} or @code{.}.
2696
2697 Extended alphabetic characters may be used within identifiers as if
2698 they were letters.  The set of extended alphabetic characters is:
2699
2700 @example
2701 ! $ % & * + - . / : < = > ? @@ ^ _ ~
2702 @end example
2703
2704 In addition to the standard read syntax defined above (which is taken
2705 from R5RS (@pxref{Formal syntax,,,r5rs,The Revised^5 Report on
2706 Scheme})), Guile provides an extended symbol read syntax that allows the
2707 inclusion of unusual characters such as space characters, newlines and
2708 parentheses.  If (for whatever reason) you need to write a symbol
2709 containing characters not mentioned above, you can do so as follows.
2710
2711 @itemize @bullet
2712 @item
2713 Begin the symbol with the characters @code{#@{},
2714
2715 @item
2716 write the characters of the symbol and
2717
2718 @item
2719 finish the symbol with the characters @code{@}#}.
2720 @end itemize
2721
2722 Here are a few examples of this form of read syntax.  The first symbol
2723 needs to use extended syntax because it contains a space character, the
2724 second because it contains a line break, and the last because it looks
2725 like a number.
2726
2727 @lisp
2728 #@{foo bar@}#
2729
2730 #@{what
2731 ever@}#
2732
2733 #@{4242@}#
2734 @end lisp
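
Symbols written in this syntax are ordinary symbols.  The short sketch
below, which reuses two of the examples above, shows that they are
identical to the symbols that @code{string->symbol} returns for the same
names:

@lisp
(symbol->string '#@{foo bar@}#)
@result{} "foo bar"

(eq? '#@{foo bar@}# (string->symbol "foo bar"))
@result{} #t

(symbol? '#@{4242@}#)
@result{} #t

(number? '#@{4242@}#)
@result{} #f
@end lisp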
2735
2736 Although Guile provides this extended read syntax for symbols,
2737 widespread usage of it is discouraged because it is not portable and not
2738 very readable.
2739
2740
2741 @node Symbol Uninterned
2742 @subsection Uninterned Symbols
2743
2744 What makes symbols useful is that they are automatically kept unique.
2745 There are no two symbols that are distinct objects but have the same
2746 name.  But of course, there is no rule without exception.  In addition
2747 to the normal symbols that have been discussed up to now, you can also
2748 create special @dfn{uninterned} symbols that behave slightly
2749 differently.
2750
2751 To understand what is different about them and why they might be useful,
2752 we look at how normal symbols are actually kept unique.
2753
2754 Whenever Guile wants to find the symbol with a specific name, for
2755 example during @code{read} or when executing @code{string->symbol}, it
2756 first looks into a table of all existing symbols to find out whether a
2757 symbol with the given name already exists.  When this is the case, Guile
2758 just returns that symbol.  When not, a new symbol with the name is
2759 created and entered into the table so that it can be found later.
2760
2761 Sometimes you might want to create a symbol that is guaranteed `fresh',
2762 i.e. a symbol that did not exist previously.  You might also want to
2763 somehow guarantee that no one else will ever unintentionally stumble
2764 across your symbol in the future.  These properties of a symbol are
2765 often needed when generating code during macro expansion.  When
2766 introducing new temporary variables, you want to guarantee that they
2767 don't conflict with variables in other people's code.
2768
2769 The simplest way to arrange for this is to create a new symbol but
2770 not enter it into the global table of all symbols.  That way, no one
2771 will ever get access to your symbol by chance.  Symbols that are not in
2772 the table are called @dfn{uninterned}.  Of course, symbols that
2773 @emph{are} in the table are called @dfn{interned}.
2774
2775 You create new uninterned symbols with the function @code{make-symbol}.
2776 You can test whether a symbol is interned or not with
2777 @code{symbol-interned?}.
2778
2779 Uninterned symbols break the rule that the name of a symbol uniquely
2780 identifies the symbol object.  Because of this, they can not be written
2781 out and read back in like interned symbols.  Currently, Guile has no
2782 support for reading uninterned symbols.  Note that the function
2783 @code{gensym} does not return uninterned symbols for this reason.
2784
2785 @deffn {Scheme Procedure} make-symbol name
2786 @deffnx {C Function} scm_make_symbol (name)
2787 Return a new uninterned symbol with the name @var{name}.  The returned
2788 symbol is guaranteed to be unique and future calls to
2789 @code{string->symbol} will not return it.
2790 @end deffn
2791
2792 @deffn {Scheme Procedure} symbol-interned? symbol
2793 @deffnx {C Function} scm_symbol_interned_p (symbol)
2794 Return @code{#t} if @var{symbol} is interned, otherwise return
2795 @code{#f}.
2796 @end deffn
2797
2798 For example:
2799
2800 @lisp
2801 (define foo-1 (string->symbol "foo"))
2802 (define foo-2 (string->symbol "foo"))
2803 (define foo-3 (make-symbol "foo"))
2804 (define foo-4 (make-symbol "foo"))
2805
2806 (eq? foo-1 foo-2)
2807 @result{} #t
2808 ; Two interned symbols with the same name are the same object,
2809
2810 (eq? foo-1 foo-3)
2811 @result{} #f
2812 ; but a call to make-symbol with the same name returns a
2813 ; distinct object.
2814
2815 (eq? foo-3 foo-4)
2816 @result{} #f
2817 ; A call to make-symbol always returns a new object, even for
2818 ; the same name.
2819
2820 foo-3
2821 @result{} #<uninterned-symbol foo 8085290>
2822 ; Uninterned symbols print differently from interned symbols,
2823
2824 (symbol? foo-3)
2825 @result{} #t
2826 ; but they are still symbols,
2827
2828 (symbol-interned? foo-3)
2829 @result{} #f
2830 ; just not interned.
2831 @end lisp
2832
2833
2834 @node Keywords
2835 @section Keywords
2836 @tpindex Keywords
2837
2838 Keywords are self-evaluating objects with a convenient read syntax that
2839 makes them easy to type.
2840
2841 Guile's keyword support conforms to R5RS, and adds a (switchable) read
2842 syntax extension to permit keywords to begin with @code{:} as well as
2843 @code{#:}.
2844
2845 @menu
2846 * Why Use Keywords?::           Motivation for keyword usage.
2847 * Coding With Keywords::        How to use keywords.
2848 * Keyword Read Syntax::         Read syntax for keywords.
2849 * Keyword Procedures::          Procedures for dealing with keywords.
2850 * Keyword Primitives::          The underlying primitive procedures.
2851 @end menu
2852
2853 @node Why Use Keywords?
2854 @subsection Why Use Keywords?
2855
2856 Keywords are useful in contexts where a program or procedure wants to be
2857 able to accept a large number of optional arguments without making its
2858 interface unmanageable.
2859
2860 To illustrate this, consider a hypothetical @code{make-window}
2861 procedure, which creates a new window on the screen for drawing into
2862 using some graphical toolkit.  There are many parameters that the caller
2863 might like to specify, but which could also be sensibly defaulted, for
2864 example:
2865
2866 @itemize @bullet
2867 @item
2868 color depth -- Default: the color depth for the screen
2869
2870 @item
2871 background color -- Default: white
2872
2873 @item
2874 width -- Default: 600
2875
2876 @item
2877 height -- Default: 400
2878 @end itemize
2879
2880 If @code{make-window} did not use keywords, the caller would have to
2881 pass in a value for each possible argument, remembering the correct
2882 argument order and using a special value to indicate the default value
2883 for that argument:
2884
2885 @lisp
2886 (make-window 'default              ;; Color depth
2887              'default              ;; Background color
2888              800                   ;; Width
2889              100                   ;; Height
2890              @dots{})                  ;; More make-window arguments
2891 @end lisp
2892
2893 With keywords, on the other hand, defaulted arguments are omitted, and
2894 non-default arguments are clearly tagged by the appropriate keyword.  As
2895 a result, the invocation becomes much clearer:
2896
2897 @lisp
2898 (make-window #:width 800 #:height 100)
2899 @end lisp
2900
2901 On the other hand, for a simpler procedure with few arguments, the use
2902 of keywords would be a hindrance rather than a help.  The primitive
2903 procedure @code{cons}, for example, would not be improved if it had to
2904 be invoked as
2905
2906 @lisp
2907 (cons #:car x #:cdr y)
2908 @end lisp
2909
2910 So the decision whether to use keywords or not is purely pragmatic: use
2911 them if they will clarify the procedure invocation at point of call.
2912
2913 @node Coding With Keywords
2914 @subsection Coding With Keywords
2915
2916 If a procedure wants to support keywords, it should take a rest argument
2917 and then use whatever means is convenient to extract keywords and their
2918 corresponding arguments from the contents of that rest argument.
2919
2920 The following example illustrates the principle: the code for
2921 @code{make-window} uses a helper procedure called
2922 @code{get-keyword-value} to extract individual keyword arguments from
2923 the rest argument.
2924
2925 @lisp
2926 (define (get-keyword-value args keyword default)
2927   (let ((kv (memq keyword args)))
2928     (if (and kv (>= (length kv) 2))
2929         (cadr kv)
2930         default)))
2931
2932 (define (make-window . args)
2933   (let ((depth  (get-keyword-value args #:depth  screen-depth))
2934         (bg     (get-keyword-value args #:bg     "white"))
        (width  (get-keyword-value args #:width  600))
        (height (get-keyword-value args #:height 400))
2937         @dots{})
2938     @dots{}))
2939 @end lisp
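
To see the helper in isolation, it can be called directly on a list that
stands in for a rest argument.  In this sketch the argument list
@code{args} is made up for illustration, and the definition of
@code{get-keyword-value} is repeated so that the example is
self-contained.

@lisp
(define (get-keyword-value args keyword default)
  (let ((kv (memq keyword args)))
    (if (and kv (>= (length kv) 2))
        (cadr kv)
        default)))

(define args (list #:width 640 #:bg "black"))

(get-keyword-value args #:width 600)
@result{} 640

(get-keyword-value args #:height 400)
@result{} 400
; #:height is absent, so the default is returned.
@end lisp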
2940
2941 But you don't need to write @code{get-keyword-value}.  The @code{(ice-9
2942 optargs)} module provides a set of powerful macros that you can use to
2943 implement keyword-supporting procedures like this:
2944
2945 @lisp
2946 (use-modules (ice-9 optargs))
2947
2948 (define (make-window . args)
2949   (let-keywords args #f ((depth  screen-depth)
2950                          (bg     "white")
                         (width  600)
                         (height 400))
2953     ...))
2954 @end lisp
2955
2956 @noindent
2957 Or, even more economically, like this:
2958
2959 @lisp
2960 (use-modules (ice-9 optargs))
2961
2962 (define* (make-window #:key (depth  screen-depth)
2963                             (bg     "white")
                            (width  600)
                            (height 400))
2966   ...)
2967 @end lisp
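
A complete, runnable variant of this idiom might look as follows.  The
procedure @code{window-size} is a hypothetical stand-in for
@code{make-window}: it simply returns its width and height as a list, so
that the effect of the defaults can be seen.

@lisp
(use-modules (ice-9 optargs))

(define* (window-size #:key (width 600) (height 400))
  (list width height))

(window-size)
@result{} (600 400)

(window-size #:height 100)
@result{} (600 100)

(window-size #:height 100 #:width 800)
@result{} (800 100)
; Keyword arguments may be given in any order.
@end lisp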
2968
2969 For further details on @code{let-keywords}, @code{define*} and other
2970 facilities provided by the @code{(ice-9 optargs)} module, see
2971 @ref{Optional Arguments}.
2972
2973
2974 @node Keyword Read Syntax
2975 @subsection Keyword Read Syntax
2976
Guile, by default, only recognizes a keyword syntax that is compatible
with R5RS.  A token of the form @code{#:NAME}, where @code{NAME} has the
same syntax as a Scheme symbol (@pxref{Symbol Read Syntax}), is the
external
2980 representation of the keyword named @code{NAME}.  Keyword objects print
2981 using this syntax as well, so values containing keyword objects can be
2982 read back into Guile.  When used in an expression, keywords are
2983 self-quoting objects.
2984
If the @code{keywords} read option is set to @code{'prefix}, Guile also
2986 recognizes the alternative read syntax @code{:NAME}.  Otherwise, tokens
2987 of the form @code{:NAME} are read as symbols, as required by R5RS.
2988
To enable and disable the alternative non-R5RS keyword syntax, you use
the @code{read-set!} procedure documented in @ref{General option
interface} and @ref{Reader options}.
2992
2993 @smalllisp
2994 (read-set! keywords 'prefix)
2995
2996 #:type
2997 @result{}
2998 #:type
2999
3000 :type
3001 @result{}
3002 #:type
3003
3004 (read-set! keywords #f)
3005
3006 #:type
3007 @result{}
3008 #:type
3009
3010 :type
3011 @print{}
3012 ERROR: In expression :type:
3013 ERROR: Unbound variable: :type
3014 ABORT: (unbound-variable)
3015 @end smalllisp
3016
3017 @node Keyword Procedures
3018 @subsection Keyword Procedures
3019
3020 The following procedures can be used for converting symbols to keywords
3021 and back.
3022
3023 @deffn {Scheme Procedure} symbol->keyword sym
3024 Return a keyword with the same characters as in @var{sym}.
3025 @end deffn
3026
3027 @deffn {Scheme Procedure} keyword->symbol kw
3028 Return a symbol with the same characters as in @var{kw}.
3029 @end deffn
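
For example (a minimal sketch; the name @code{width} is chosen
arbitrarily):

@lisp
(symbol->keyword 'width)
@result{} #:width

(keyword->symbol #:width)
@result{} width

(eq? (symbol->keyword 'width) #:width)
@result{} #t
; Converting a symbol yields the same keyword object that the
; reader produces.
@end lisp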
3030
3031
3032 @node Keyword Primitives
3033 @subsection Keyword Primitives
3034
Internally, a keyword is implemented as something like a tagged symbol,
where the tag identifies the keyword as being self-evaluating, and the
symbol, known as the keyword's @dfn{dash symbol}, has the same name as
the keyword but prefixed by a single dash.  For example, the
keyword @code{#:name} has the corresponding dash symbol @code{-name}.
3040
3041 Most keyword objects are constructed automatically by the reader when it
3042 reads a token beginning with @code{#:}.  However, if you need to
3043 construct a keyword object programmatically, you can do so by calling
3044 @code{make-keyword-from-dash-symbol} with the corresponding dash symbol
3045 (as the reader does).  The dash symbol for a keyword object can be
3046 retrieved using the @code{keyword-dash-symbol} procedure.
3047
3048 @deffn {Scheme Procedure} make-keyword-from-dash-symbol symbol
3049 @deffnx {C Function} scm_make_keyword_from_dash_symbol (symbol)
3050 Make a keyword object from a @var{symbol} that starts with a dash.
3051 @end deffn
3052
3053 @deffn {Scheme Procedure} keyword? obj
3054 @deffnx {C Function} scm_keyword_p (obj)
3055 Return @code{#t} if the argument @var{obj} is a keyword, else
3056 @code{#f}.
3057 @end deffn
3058
3059 @deffn {Scheme Procedure} keyword-dash-symbol keyword
3060 @deffnx {C Function} scm_keyword_dash_symbol (keyword)
3061 Return the dash symbol for @var{keyword}.
3062 This is the inverse of @code{make-keyword-from-dash-symbol}.
3063 @end deffn
3064
3065
3066 @node Other Types
3067 @section ``Functionality-Centric'' Data Types
3068
3069 Procedures and macros are documented in their own chapter: see
3070 @ref{Procedures and Macros}.
3071
3072 Variable objects are documented as part of the description of Guile's
3073 module system: see @ref{Variables}.
3074
3075 Asyncs, dynamic roots and fluids are described in the chapter on
3076 scheduling: see @ref{Scheduling}.
3077
3078 Hooks are documented in the chapter on general utility functions: see
3079 @ref{Hooks}.
3080
3081 Ports are described in the chapter on I/O: see @ref{Input and Output}.
3082
3083
3084 @c Local Variables:
3085 @c TeX-master: "guile.texi"
3086 @c End: