X-Git-Url: http://git.hcoop.net/bpt/guile.git/blobdiff_plain/f0a9ab4d9071c14780d155a260bfae9b1039c555..7a329029cf898fc0b9b24252c9bb437e1ad0b1d7:/doc/ref/misc-modules.texi diff --git a/doc/ref/misc-modules.texi b/doc/ref/misc-modules.texi index efd10116f..c1e65d7e3 100644 --- a/doc/ref/misc-modules.texi +++ b/doc/ref/misc-modules.texi @@ -1,10 +1,9 @@ @c -*-texinfo-*- @c This is part of the GNU Guile Reference Manual. -@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004 -@c Free Software Foundation, Inc. +@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2009, +@c 2010, 2011, 2012 Free Software Foundation, Inc. @c See the file guile.texi for copying conditions. -@page @node Pretty Printing @section Pretty Printing @@ -16,7 +15,7 @@ The module @code{(ice-9 pretty-print)} provides the procedure objects. This is especially useful for deeply nested or complex data structures, such as lists and vectors. -The module is loaded by simply saying. +The module is loaded by entering the following: @lisp (use-modules (ice-9 pretty-print)) @@ -60,7 +59,67 @@ Print within the given @var{columns}. The default is 79. @end deffn -@page +@cindex truncated printing +Also exported by the @code{(ice-9 pretty-print)} module is +@code{truncated-print}, a procedure to print Scheme datums, truncating +the output to a certain number of characters. This is useful when you +need to present an arbitrary datum to the user, but you only have one +line in which to do so. + +@lisp +(define exp '(a b #(c d e) f . g)) +(truncated-print exp #:width 10) (newline) +@print{} (a b . #) +(truncated-print exp #:width 15) (newline) +@print{} (a b # f . g) +(truncated-print exp #:width 18) (newline) +@print{} (a b #(c ...) . #) +(truncated-print exp #:width 20) (newline) +@print{} (a b #(c d e) f . g) +(truncated-print "The quick brown fox" #:width 20) (newline) +@print{} "The quick brown..." 
+(truncated-print (current-module) #:width 20) (newline) +@print{} # +@end lisp + +@code{truncated-print} will not output a trailing newline. If an expression does +not fit in the given width, it will be truncated -- possibly +ellipsized@footnote{On Unicode-capable ports, the ellipsis is represented by +character `HORIZONTAL ELLIPSIS' (U+2026), otherwise it is represented by three +dots.}, or in the worst case, displayed as @nicode{#}. + +@deffn {Scheme Procedure} truncated-print obj [port] [keyword-options] +Print @var{obj}, truncating the output, if necessary, to make it fit +into @var{width} characters. By default, @var{obj} will be printed using +@code{write}, though that behavior can be overridden via the +@var{display?} keyword argument. + +The default behaviour is to print depth-first, meaning that the entire +remaining width will be available to each sub-expression of @var{obj} -- +e.g., if @var{obj} is a vector, each member of @var{obj}. One can attempt to +``ration'' the available width, trying to allocate it equally to each +sub-expression, via the @var{breadth-first?} keyword argument. + +The further @var{keyword-options} are keywords and parameters as +follows, + +@table @asis +@item @nicode{#:display?} @var{flag} +If @var{flag} is true then print using @code{display}. The default is +@code{#f} which means use @code{write} style. (@pxref{Writing}) + +@item @nicode{#:width} @var{columns} +Print within the given @var{columns}. The default is 79. + +@item @nicode{#:breadth-first?} @var{flag} +If @var{flag} is true, then allocate the available width breadth-first +among elements of a compound data structure (list, vector, pair, +etc.). The default is @code{#f} which means that any element is +allowed to consume all of the available width. 
+@end table +@end deffn + + @node Formatted Output @section Formatted Output @cindex formatted output @@ -97,12 +156,11 @@ C programmers will note the similarity between @code{format} and instead of @nicode{%}, and are more powerful. @sp 1 -@deffn {Scheme Procedure} format dest fmt [args@dots{}] +@deffn {Scheme Procedure} format dest fmt arg @dots{} Write output specified by the @var{fmt} string to @var{dest}. @var{dest} can be an output port, @code{#t} for -@code{current-output-port} (@pxref{Default Ports}), a number for -@code{current-error-port}, or @code{#f} to return the output as a -string. +@code{current-output-port} (@pxref{Default Ports}), or @code{#f} to +return the output as a string. @var{fmt} can contain literal text to be output, and @nicode{~} escapes. Each escape has the form @@ -118,7 +176,7 @@ parameters are accepted by some codes too. Parameters have the following forms, @table @asis -@item @nicode{[+/-] number} +@item @nicode{[+/-]number} An integer, with optional @nicode{+} or @nicode{-}. @item @nicode{'} (apostrophe) The following character in the format string, for instance @nicode{'z} @@ -153,16 +211,16 @@ outputs an argument like @code{write} (@pxref{Writing}). (format #t "~s" "foo") @print{} "foo" @end example -With the @nicode{:} modifier, objects which don't have an external -representation are put in quotes like a string. +@nicode{~:a} and @nicode{~:s} put objects that don't have an external +representation in quotes like a string. @example (format #t "~:a" car) @print{} "#" @end example If the output is less than @var{minwidth} characters (default 0), it's -padded on the right with @var{padchar} (default space). The -@nicode{@@} modifier puts the padding on the left instead. +padded on the right with @var{padchar} (default space). @nicode{~@@a} +and @nicode{~@@s} put the padding on the left instead. @example (format #f "~5a" 'abc) @result{} "abc " @@ -184,9 +242,9 @@ no minimum or multiple). Character. Parameter: @var{charnum}. 
Output a character. The default is to simply output, as per -@code{write-char} (@pxref{Writing}). With the @nicode{@@} modifier -output is in @code{write} style. Or with the @nicode{:} modifier -control characters (ASCII 0 to 31) are printed in @nicode{^X} form. +@code{write-char} (@pxref{Writing}). @nicode{~@@c} prints in +@code{write} style. @nicode{~:c} prints control characters (ASCII 0 +to 31) in @nicode{^X} form. @example (format #t "~c" #\z) @print{} z @@ -211,13 +269,13 @@ Integer. Parameters: @var{minwidth}, @var{padchar}, @var{commachar}, @var{commawidth}. Output an integer argument as a decimal, hexadecimal, octal or binary -integer (respectively). +integer (respectively), in a locale-independent way. @example (format #t "~d" 123) @print{} 123 @end example -With the @nicode{@@} modifier, a @nicode{+} sign is shown on positive +@nicode{~@@d} etc shows a @nicode{+} sign on positive numbers. @c FIXME: "+" is not shown on zero, unlike in Common Lisp. Should @@ -238,8 +296,10 @@ minimum), it's padded on the left with the @var{padchar} parameter (format #t "~3d" 1234) @print{} 1234 @end example -The @nicode{:} modifier adds commas (or the @var{commachar} parameter) -every three digits (or the @var{commawidth} parameter many). +@nicode{~:d} adds commas (or the @var{commachar} parameter) every +three digits (or the @var{commawidth} parameter many). However, when +your intent is to write numbers in a way that follows typographical +conventions, using @nicode{~h} is recommended. @example (format #t "~:d" 1234567) @print{} 1,234,567 @@ -261,7 +321,7 @@ Integer in words, roman numerals, or a specified radix. Parameters: @var{commawidth}. With no parameters output is in words as a cardinal like ``ten'', or -with the @nicode{:} modifier as an ordinal like ``tenth''. +@nicode{~:r} prints an ordinal like ``tenth''. @example (format #t "~r" 9) @print{} nine ;; cardinal @@ -269,15 +329,14 @@ with the @nicode{:} modifier as an ordinal like ``tenth''. 
(format #t "~:r" 9) @print{} ninth ;; ordinal @end example -And also with no parameters, the @nicode{@@} modifier gives roman -numerals and @nicode{@@} and @nicode{:} together give old roman -numerals. In old roman numerals there's no ``subtraction'', so 9 is -@nicode{VIIII} instead of @nicode{IX}. In both cases only positive -numbers can be output. +And also with no parameters, @nicode{~@@r} gives roman numerals and +@nicode{~:@@r} gives old roman numerals. In old roman numerals +there's no ``subtraction'', so 9 is @nicode{VIIII} instead of +@nicode{IX}. In both cases only positive numbers can be output. @example (format #t "~@@r" 89) @print{} LXXXIX ;; roman -(format #t "~@@:r" 89) @print{} LXXXVIIII ;; old roman +(format #t "~:@@r" 89) @print{} LXXXVIIII ;; old roman @end example When a parameter is given it means numeric output in the specified @@ -302,8 +361,8 @@ decimal point. (format #t "~f" "1e-1") @print{} 0.1 @end example -With the @nicode{@@} modifier a @nicode{+} sign is shown on positive -numbers (including zero). +@nicode{~@@f} prints a @nicode{+} sign on positive numbers (including +zero). @example (format #t "~@@f" 0) @print{} +0.0 @@ -343,8 +402,31 @@ would exceed @var{width}, then that many @var{overflowchar}s are printed instead of the value. @example -(format #t "~5,,,'xf" 12345) @print{} 12345 -(format #t "~4,,,'xf" 12345) @print{} xxxx +(format #t "~6,,,'xf" 12345) @print{} 12345. +(format #t "~5,,,'xf" 12345) @print{} xxxxx +@end example + +@item @nicode{~h} +Localized number@footnote{The @nicode{~h} format specifier first +appeared in Guile version 2.0.6.}. Parameters: @var{width}, +@var{decimals}, @var{padchar}. + +Like @nicode{~f}, output an exact or floating point number, but do so +according to the current locale, or according to the given locale object +when the @code{:} modifier is used (@pxref{Number Input and Output, +@code{number->locale-string}}). 
+ +@example +(format #t "~h" 12345.5678) ; with "C" as the current locale +@print{} 12345.5678 + +(format #t "~14,,'*:h" 12345.5678 + (make-locale LC_ALL "en_US")) +@print{} ***12,345.5678 + +(format #t "~,2:h" 12345.5678 + (make-locale LC_NUMERIC "fr_FR")) +@print{} 12 345,56 @end example @item @nicode{~e} @@ -360,9 +442,9 @@ Output a number or number string in exponential notation. (format #t "~e" "1e4") @print{} 1.0E+4 @end example -With the @nicode{@@} modifier a @nicode{+} sign is shown on positive -numbers (including zero). (This is for the mantissa, a @nicode{+} or -@nicode{-} sign is always shown on the exponent.) +@nicode{~@@e} prints a @nicode{+} sign on positive numbers (including +zero). (This is for the mantissa, a @nicode{+} or @nicode{-} sign is +always shown on the exponent.) @example (format #t "~@@e" 5000.0) @print{} +5.0E+3 @@ -418,7 +500,7 @@ in which case leading zeros are shown after the decimal point. @c FIXME: MANTDIGITS with negative INTDIGITS doesn't match CL spec, @c believe the spec says it ought to still show mantdigits+1 sig -@c figures, ie. leading zeros don't count towards MANTDIGITS, but it +@c figures, i.e. leading zeros don't count towards MANTDIGITS, but it @c seems to just treat MANTDIGITS as how many digits after the @c decimal point. @@ -486,8 +568,8 @@ show, default 2. (format #t "~4$" "1e-2") @print{} 0.0100 @end example -With the @nicode{@@} modifier a @nicode{+} sign is shown on positive -numbers (including zero). +@nicode{~@@$} prints a @nicode{+} sign on positive numbers (including +zero). @example (format #t "~@@$" 0) @print{} +0.00 @@ -502,13 +584,13 @@ part of the value (default 1). @end example If the output is less than @var{width} characters (default 0), it's -padded on the left with @var{padchar} (default space). With the -@nicode{:} modifier the padding is output after the sign. +padded on the left with @var{padchar} (default space). @nicode{~:$} +puts the padding after the sign. 
@example (format #f "~,,8$" -1.5) @result{} " -1.50" (format #f "~,,8:$" -1.5) @result{} "- 1.50" -(format #f "~,,8,'.@@:$" 3) @result{} "+...3.00" +(format #f "~,,8,'.:@@$" 3) @result{} "+...3.00" @end example Note that floating point for dollar amounts is generally not a good @@ -556,38 +638,60 @@ value. (format #t "enter name~p" 2) @print{} enter names @end example -With the @nicode{@@} modifier, the output is @samp{y} for 1 or -@samp{ies} otherwise. +@nicode{~@@p} prints @samp{y} for 1 or @samp{ies} otherwise. @example (format #t "pupp~@@p" 1) @print{} puppy (format #t "pupp~@@p" 2) @print{} puppies @end example -The @nicode{:} modifier means re-use the preceding argument instead of -taking a new one, which can be convenient when printing some sort of -count. +@nicode{~:p} re-uses the preceding argument instead of taking a new +one, which can be convenient when printing some sort of count. @example -(format #t "~d cat~:p" 9) @print{} 9 cats +(format #t "~d cat~:p" 9) @print{} 9 cats +(format #t "~d pupp~:@@p" 5) @print{} 5 puppies @end example +@nicode{~p} is designed for English plurals and there's no attempt to +support other languages. @nicode{~[} conditionals (below) may be able +to help. When using @code{gettext} to translate messages +@code{ngettext} is probably best though +(@pxref{Internationalization}). + @item @nicode{~y} -Pretty print. No parameters. +Structured printing. Parameters: @var{width}. -Output an argument with @code{pretty-print} (@pxref{Pretty Printing}). +@nicode{~y} outputs an argument using @code{pretty-print} +(@pxref{Pretty Printing}). The result will be formatted to fit within +@var{width} columns (79 by default), consuming multiple lines if +necessary. + +@nicode{~@@y} outputs an argument using @code{truncated-print} +(@pxref{Pretty Printing}). The resulting code will be formatted to fit +within @var{width} columns (79 by default), on a single line. The +output will be truncated if necessary. 
+ +@nicode{~:@@y} is like @nicode{~@@y}, except the @var{width} parameter +is interpreted to be the maximum column to which to output. That is to +say, if you are at column 10, and @nicode{~60:@@y} is seen, the datum +will be truncated to 50 columns. @item @nicode{~?} @itemx @nicode{~k} Sub-format. No parameters. Take a format string argument and a second argument which is a list of -arguments for it, and output the result. With the @nicode{@@} -modifier, the arguments for the sub-format are taken directly rather -than from a list. +arguments for that string, and output the result. + +@example +(format #t "~?" "~d ~d" '(1 2)) @print{} 1 2 +@end example + +@nicode{~@@?} takes arguments for the sub-format directly rather than +in a list. @example -(format #t "~?" "~d ~d" '(1 2)) @print{} 1 2 (format #t "~@@? ~s" "~d ~d" 1 2 "foo") @print{} 1 2 "foo" @end example @@ -597,16 +701,16 @@ T-Scheme compatibility. @item @nicode{~*} Argument jumping. Parameter: @var{N}. -Move forward @var{N} arguments (default 1) in the argument list. With -the @nicode{:} modifier move backwards. (@var{N} cannot be negative.) +Move forward @var{N} arguments (default 1) in the argument list. +@nicode{~:*} moves backwards. (@var{N} cannot be negative.) @example (format #f "~d ~2*~d" 1 2 3 4) @result{} "1 4" (format #f "~d ~:*~d" 6) @result{} "6 6" @end example -With the @nicode{@@} modifier, move to argument number @var{N}. The -first argument is number 0 (and that's the default for @var{N}). +@nicode{~@@*} moves to argument number @var{N}. The first argument is +number 0 (and that's the default for @var{N}). @example (format #f "~d~d again ~@@*~d~d" 1 2) @result{} "12 again 12" @@ -621,7 +725,7 @@ argument list, a reverse of what the @nicode{@@} modifier does. 
(format #t "~#*~2:*~a" 'a 'b 'c 'd) @print{} c @end example -At the end of the format string, the current argument postion doesn't +At the end of the format string the current argument position doesn't matter, any further arguments are ignored. @item @nicode{~t} @@ -647,10 +751,10 @@ The default @var{colinc} is 1 (which means no further move). (format #f "abcd~2,5,'.tx") @result{} "abcd...x" @end example -With the @nicode{@@} modifier, @var{colnum} is relative to the current -column. @var{colnum} many padding characters are output, then further -padding to make the current column a multiple of @var{colinc}, if it -isn't already so. +@nicode{~@@t} takes @var{colnum} as an offset from the current column. +@var{colnum} many pad characters are output, then further padding to +make the current column a multiple of @var{colinc}, if it isn't +already so. @example (format #f "a~3,5'*@@tx") @result{} "a****x" @@ -721,20 +825,22 @@ nothing. Continuation line. No parameters. Skip this newline and any following whitespace in the format string, -don't send it to the output. With the @nicode{:} modifier the newline -is not output but any further following whitespace is. With the -@nicode{@@} modifier the newline is output but not any following +i.e.@: don't send it to the output. This can be used to break up a +long format string for readability, but not print the extra whitespace. -This escape can be used to break up a long format string into multiple -lines for readability, but supress that extra whitespace. - @example (format #f "abc~ ~d def~ ~d" 1 2) @result{} "abc1 def2" @end example +@nicode{~:newline} skips the newline but leaves any further whitespace +to be printed normally. + +@nicode{~@@newline} prints the newline then skips following +whitespace. + @item @nicode{~(} @nicode{~)} Case conversion. No parameters. @@ -743,7 +849,7 @@ The modifiers on @nicode{~(} control the conversion. @itemize @w{} @item -no modifiers --- lower case. +@nicode{~(} --- lower case. 
@c @c FIXME: The : and @ modifiers are not yet documented because the @c code applies string-capitalize and string-capitalize-first to each @@ -767,14 +873,14 @@ no modifiers --- lower case. @c rest lower case. @c @item -@nicode{:} and @nicode{@@} together --- upper case. +@nicode{~:@@(} --- upper case. @end itemize For example, @example (format #t "~(Hello~)") @print{} hello -(format #t "~@@:(Hello~)") @print{} HELLO +(format #t "~:@@(Hello~)") @print{} HELLO @end example In the future it's intended the modifiers @nicode{:} and @nicode{@@} @@ -800,26 +906,30 @@ elements from it. This is a convenient way to output a whole list. (format #t "~@{~s=~d ~@}" '("x" 1 "y" 2)) @print{} "x"=1 "y"=2 @end example -With the @nicode{:} modifier a list of lists argument is taken, each -of those lists gives the arguments for the iterated format. +@nicode{~:@{} takes a single argument which is a list of lists, each +of those contained lists gives the arguments for the iterated format. +@c @print{} on a new line here to avoid overflowing page width in DVI @example -(format #t "~:@{~dx~d ~@}" '((1 2) (3 4) (5 6))) @print{} 1x2 3x4 5x6 +(format #t "~:@{~dx~d ~@}" '((1 2) (3 4) (5 6))) +@print{} 1x2 3x4 5x6 @end example -With the @nicode{@@} modifier, the remaining arguments are used, each -iteration successively consuming elements. +@nicode{~@@@{} takes arguments directly, with each iteration +successively consuming arguments. @example (format #t "~@@@{~d~@}" 1 2 3) @print{} 123 (format #t "~@@@{~s=~d ~@}" "x" 1 "y" 2) @print{} "x"=1 "y"=2 @end example -With both @nicode{:} and @nicode{@@} modifiers, the remaining -arguments are used, each is a list of arguments for the format. +@nicode{~:@@@{} takes list arguments, one argument for each iteration, +using that list for the format. 
+@c @print{} on a new line here to avoid overflowing page width in DVI @example -(format #t "~:@@@{~dx~d ~@}" '(1 2) '(3 4) '(5 6)) @print{} 1x2 3x4 5x6 +(format #t "~:@@@{~dx~d ~@}" '(1 2) '(3 4) '(5 6)) +@print{} 1x2 3x4 5x6 @end example Iterating stops when there are no more arguments or when the @@ -853,6 +963,8 @@ to @code{(1 2)} then @code{(3 4 5)} etc. (format #t "~@{~@{~d~@}x~@}" '((1 2) (3 4 5))) @print{} 12x345x @end example +See also @nicode{~^} below for escaping from iteration. + @item @nicode{~[} @nicode{~;} @nicode{~]} Conditional. Parameter: @var{selector}. @@ -873,17 +985,16 @@ instead of taking an argument. @end example If the clause number is out of range then nothing is output. Or the -last @nicode{~;} can have a @nicode{:} modifier to make it the default -for a number out of range. +last clause can be @nicode{~:;} to use that for a number out of range. @example (format #f "~[banana~;mango~]" 99) @result{} "" (format #f "~[banana~;mango~:;fruit~]" 99) @result{} "fruit" @end example -The @nicode{:} modifier to @nicode{~[} treats the argument as a flag, -and expects two clauses. The first is used if the argument is -@code{#f} or the second otherwise. +@nicode{~:[} treats the argument as a flag, and expects two clauses. +The first is used if the argument is @code{#f} or the second +otherwise. @example (format #f "~:[false~;not false~]" #f) @result{} "false" @@ -894,12 +1005,12 @@ and expects two clauses. The first is used if the argument is @print{} 3 gnus are here @end example -The @nicode{@@} modifier to @nicode{~[} also treats the argument as a -flag, and expects one clause. If the argument is @code{#f} then no -output is produced and the argument is consumed, otherwise the clause -is used and the argument is not consumed by @nicode{~[}, it's left for -the clause. This can be used for instance to suppress output if -@code{#f} means something not available. +@nicode{~@@[} also treats the argument as a flag, and expects one +clause. 
If the argument is @code{#f} then no output is produced and +the argument is consumed, otherwise the clause is used and the +argument is not consumed, it's left for the clause. This can be used +for instance to suppress output if @code{#f} means something not +available. @example (format #f "~@@[temperature=~d~]" 27) @result{} "temperature=27" @@ -910,7 +1021,7 @@ the clause. This can be used for instance to suppress output if Escape. Parameters: @var{val1}, @var{val2}, @var{val3}. Stop formatting if there are no more arguments. This can be used for -instance to let a format string adapt to a variable number of +instance to have a format string adapt to a variable number of arguments. @example @@ -920,8 +1031,9 @@ arguments. Within a @nicode{~@{} @nicode{~@}} iteration, @nicode{~^} stops the current iteration step if there are no more arguments to that step, -continuing with possible further steps (for instance in the case of -the @nicode{:} modifier to @nicode{~@{}) and the rest of the format. +but continuing with possible further steps and the rest of the format. +This can be used for instance to avoid a separator on the last +iteration, or to adapt to variable length argument lists. @example (format #f "~@{~d~^/~@} go" '(1 2 3)) @result{} "1/2/3 go" @@ -966,8 +1078,9 @@ equal. For three parameters, termination is when @math{@var{val1} @c FIXME: Good examples of these? @item @nicode{~q} -Inquiry message. Insert a copyright message into the output. With -the @nicode{:} modifier insert the format implementation version. +Inquiry message. Insert a copyright message into the output. + +@nicode{~:q} inserts the format implementation version. @end table @sp 1 @@ -1006,156 +1119,188 @@ try to use one of them. The reason for two versions is that the full @code{simple-format} is often adequate too. -@page -@node Rx Regexps -@section The Rx Regular Expression Library - -[FIXME: this is taken from Gary and Mark's quick summaries and should be -reviewed and expanded. 
Rx is pretty stable, so could already be done!] - -@cindex rx -@cindex finite automaton - -The @file{guile-lang-allover} package provides an interface to Tom -Lord's Rx library (currently only to POSIX regular expressions). Use of -the library requires a two step process: compile a regular expression -into an efficient structure, then use the structure in any number of -string comparisons. - -For example, given the regular expression @samp{abc.} (which matches any -string containing @samp{abc} followed by any single character): - -@smalllisp -guile> @kbd{(define r (regcomp "abc."))} -guile> @kbd{r} -# -guile> @kbd{(regexec r "abc")} -#f -guile> @kbd{(regexec r "abcd")} -#((0 . 4)) -guile> -@end smalllisp - -The definitions of @code{regcomp} and @code{regexec} are as follows: - -@deffn {Scheme Procedure} regcomp pattern [flags] -Compile the regular expression pattern using POSIX rules. Flags is -optional and should be specified using symbolic names: -@defvar REG_EXTENDED -use extended POSIX syntax -@end defvar -@defvar REG_ICASE -use case-insensitive matching -@end defvar -@defvar REG_NEWLINE -allow anchors to match after newline characters in the -string and prevents @code{.} or @code{[^...]} from matching newlines. -@end defvar - -The @code{logior} procedure can be used to combine multiple flags. -The default is to use -POSIX basic syntax, which makes @code{+} and @code{?} literals and @code{\+} -and @code{\?} -operators. Backslashes in @var{pattern} must be escaped if specified in a -literal string e.g., @code{"\\(a\\)\\?"}. -@end deffn +@node File Tree Walk +@section File Tree Walk +@cindex file tree walk -@deffn {Scheme Procedure} regexec regex string [match-pick] [flags] -Match @var{string} against the compiled POSIX regular expression -@var{regex}. -@var{match-pick} and @var{flags} are optional. 
Possible flags (which can be -combined using the logior procedure) are: +@cindex file system traversal +@cindex directory traversal -@defvar REG_NOTBOL -The beginning of line operator won't match the beginning of -@var{string} (presumably because it's not the beginning of a line) -@end defvar +The functions in this section traverse a tree of files and +directories. They come in two flavors: the first one is a high-level +functional interface, and the second one is similar to the C @code{ftw} +and @code{nftw} routines (@pxref{Working with Directory Trees,,, libc, +GNU C Library Reference Manual}). -@defvar REG_NOTEOL -Similar to REG_NOTBOL, but prevents the end of line operator -from matching the end of @var{string}. -@end defvar +@example +(use-modules (ice-9 ftw)) +@end example +@sp 1 -If no match is possible, regexec returns #f. Otherwise @var{match-pick} -determines the return value: +@deffn {Scheme Procedure} file-system-tree file-name [enter? [stat]] +Return a tree of the form @code{(@var{file-name} @var{stat} +@var{children} ...)} where @var{stat} is the result of @code{(@var{stat} +@var{file-name})} and @var{children} are similar structures for each +file contained in @var{file-name} when it designates a directory. + +The optional @var{enter?} predicate is invoked as @code{(@var{enter?} +@var{name} @var{stat})} and should return true to allow recursion into +directory @var{name}; the default value is a procedure that always +returns @code{#t}. When a directory does not match @var{enter?}, it +nonetheless appears in the resulting tree, only with zero children. + +The @var{stat} argument is optional and defaults to @code{lstat}, as for +@code{file-system-fold} (see below.) 
+ +The example below shows how to obtain a hierarchical listing of the +files under the @file{module/language} directory in the Guile source +tree, discarding their @code{stat} info: + +@example +(use-modules (ice-9 match)) + +(define remove-stat + ;; Remove the `stat' object the `file-system-tree' provides + ;; for each file in the tree. + (match-lambda + ((name stat) ; flat file + name) + ((name stat children ...) ; directory + (list name (map remove-stat children))))) + +(let ((dir (string-append (assq-ref %guile-build-info 'top_srcdir) + "/module/language"))) + (remove-stat (file-system-tree dir))) + +@result{} +("language" + (("value" ("spec.go" "spec.scm")) + ("scheme" + ("spec.go" + "spec.scm" + "compile-tree-il.scm" + "decompile-tree-il.scm" + "decompile-tree-il.go" + "compile-tree-il.go")) + ("tree-il" + ("spec.go" + "fix-letrec.go" + "inline.go" + "fix-letrec.scm" + "compile-glil.go" + "spec.scm" + "optimize.scm" + "primitives.scm" + @dots{})) + @dots{})) +@end example +@end deffn -@code{#t} or unspecified: a newly-allocated vector is returned, -containing pairs with the indices of the matched part of @var{string} and any -substrings. +@cindex file system combinator -@code{""}: a list is returned: the first element contains a nested list -with the matched part of @var{string} surrounded by the the unmatched parts. -Remaining elements are matched substrings (if any). All returned -substrings share memory with @var{string}. +It is often desirable to process directory entries directly, rather +than building up a tree of entries in memory, like +@code{file-system-tree} does. The following procedure, a +@dfn{combinator}, is designed to allow directory entries to be processed +directly as a directory tree is traversed; in fact, +@code{file-system-tree} is implemented in terms of it. -@code{#f}: regexec returns #t if a match is made, otherwise #f. +@deffn {Scheme Procedure} file-system-fold enter? 
leaf down up skip error init file-name [stat] +Traverse the directory at @var{file-name}, recursively, and return the +result of the successive applications of the @var{leaf}, @var{down}, +@var{up}, and @var{skip} procedures as described below. -vector: the supplied vector is returned, with the first element replaced -by a pair containing the indices of the matched portion of @var{string} and -further elements replaced by pairs containing the indices of matched -substrings (if any). +Enter sub-directories only when @code{(@var{enter?} @var{path} +@var{stat} @var{result})} returns true. When a sub-directory is +entered, call @code{(@var{down} @var{path} @var{stat} @var{result})}, +where @var{path} is the path of the sub-directory and @var{stat} the +result of @code{(false-if-exception (@var{stat} @var{path}))}; when it is +left, call @code{(@var{up} @var{path} @var{stat} @var{result})}. -list: a list will be returned, with each member of the list -specified by a code in the corresponding position of the supplied list: +For each file in a directory, call @code{(@var{leaf} @var{path} +@var{stat} @var{result})}. -a number: the numbered matching substring (0 for the entire match). +When @var{enter?} returns @code{#f}, or when an unreadable directory is +encountered, call @code{(@var{skip} @var{path} @var{stat} +@var{result})}. -@code{#\<}: the beginning of @var{string} to the beginning of the part matched -by regex. +When @var{file-name} names a flat file, @code{(@var{leaf} @var{path} +@var{stat} @var{init})} is returned. -@code{#\>}: the end of the matched part of @var{string} to the end of -@var{string}. +When an @code{opendir} or @var{stat} call fails, call @code{(@var{error} +@var{path} @var{stat} @var{errno} @var{result})}, with @var{errno} being +the operating system error number that was raised---e.g., +@code{EACCES}---and @var{stat} either @code{#f} or the result of the +@var{stat} call for that entry, when available. 
-@code{#\c}: the "final tag", which seems to be associated with the "cut -operator", which doesn't seem to be available through the posix -interface. +The special @file{.} and @file{..} entries are not passed to these +procedures. The @var{path} argument to the procedures is a full file +name---e.g., @code{"../foo/bar/gnu"}; if @var{file-name} is an absolute +file name, then @var{path} is also an absolute file name. Files and +directories, as identified by their device/inode number pair, are +traversed only once. -e.g., @code{(list #\< 0 1 #\>)}. The returned substrings share memory with -@var{string}. -@end deffn +The optional @var{stat} argument defaults to @code{lstat}, which means +that symbolic links are not followed; the @code{stat} procedure can be +used instead when symbolic links are to be followed (@pxref{File System, +stat}). -Here are some other procedures that might be used when using regular -expressions: +The example below illustrates the use of @code{file-system-fold}: -@deffn {Scheme Procedure} compiled-regexp? obj -Test whether obj is a compiled regular expression. -@end deffn +@example +(define (total-file-size file-name) + "Return the size in bytes of the files under FILE-NAME (similar +to `du --apparent-size' with GNU Coreutils.)" -@deffn {Scheme Procedure} regexp->dfa regex [flags] -@end deffn + (define (enter? name stat result) + ;; Skip version control directories. + (not (member (basename name) '(".git" ".svn" "CVS")))) + (define (leaf name stat result) + ;; Return RESULT plus the size of the file at NAME. + (+ result (stat:size stat))) -@deffn {Scheme Procedure} dfa-fork dfa -@end deffn + ;; Count zero bytes for directories. + (define (down name stat result) result) + (define (up name stat result) result) -@deffn {Scheme Procedure} reset-dfa! dfa -@end deffn + ;; Likewise for skipped directories. 
+ (define (skip name stat result) result) -@deffn {Scheme Procedure} dfa-final-tag dfa -@end deffn + ;; Ignore unreadable files/directories but warn the user. + (define (error name stat errno result) + (format (current-error-port) "warning: ~a: ~a~%" + name (strerror errno)) + result) -@deffn {Scheme Procedure} dfa-continuable? dfa -@end deffn + (file-system-fold enter? leaf down up skip error + 0 ; initial counter is zero bytes + file-name)) -@deffn {Scheme Procedure} advance-dfa! dfa string -@end deffn +(total-file-size ".") +@result{} 8217554 +(total-file-size "/dev/null") +@result{} 0 +@end example +@end deffn -@node File Tree Walk -@section File Tree Walk -@cindex file tree walk +The alternative C-like functions are described below. -The functions in this section traverse a tree of files and -directories, in a fashion similar to the C @code{ftw} and @code{nftw} -routines (@pxref{Working with Directory Trees,,, libc, GNU C Library -Reference Manual}). +@deffn {Scheme Procedure} scandir name [select? [entrystream list -@defunx vector->stream vector +@deffn {Scheme Procedure} list->stream list +@deffnx {Scheme Procedure} vector->stream vector Return a stream with the contents of @var{list} or @var{vector}. @var{list} or @var{vector} should not be modified subsequently, since it's unspecified whether changes there will be reflected in the stream returned. -@end defun +@end deffn -@defun port->stream port readproc +@deffn {Scheme Procedure} port->stream port readproc Return a stream which is the values obtained by reading from @var{port} using @var{readproc}. Each read call is @code{(@var{readproc} @var{port})}, and it should return an EOF object @@ -1524,64 +1671,127 @@ For example a stream of characters from a file, @example (port->stream (open-input-file "/foo/bar.txt") read-char) @end example -@end defun +@end deffn -@defun stream->list stream +@deffn {Scheme Procedure} stream->list stream Return a list which is the entire contents of @var{stream}. 
-@end defun +@end deffn -@defun stream->reversed-list stream +@deffn {Scheme Procedure} stream->reversed-list stream Return a list which is the entire contents of @var{stream}, but in reverse order. -@end defun - -@defun stream->list&length stream -Return two values (@pxref{Multiple Values}) being a list which is the -entire contents of @var{stream}, and the number of elements in that -list. -@end defun +@end deffn -@defun stream->reversed-list&length stream -Return two values (@pxref{Multiple Values}) being a list which is the -entire contents of @var{stream}, but in reverse order, and the number +@deffn {Scheme Procedure} stream->list&length stream +Return two values (@pxref{Multiple Values}), being firstly a list +which is the entire contents of @var{stream}, and secondly the number of elements in that list. -@end defun +@end deffn -@defun stream->vector stream +@deffn {Scheme Procedure} stream->reversed-list&length stream +Return two values (@pxref{Multiple Values}) being firstly a list which +is the entire contents of @var{stream}, but in reverse order, and +secondly the number of elements in that list. +@end deffn + +@deffn {Scheme Procedure} stream->vector stream Return a vector which is the entire contents of @var{stream}. -@end defun +@end deffn -@defun stream-fold proc init stream0 @dots{} streamN +@defun stream-fold proc init stream1 stream2 @dots{} Apply @var{proc} successively over the elements of the given streams, from first to last until the end of the shortest stream is reached. Return the result from the last @var{proc} call. -Each call is @code{(@var{proc} elem0 @dots{} elemN prev)}, where each +Each call is @code{(@var{proc} elem1 elem2 @dots{} prev)}, where each @var{elem} is from the corresponding @var{stream}. @var{prev} is the return from the previous @var{proc} call, or the given @var{init} for the first call. 
@end defun -@defun stream-for-each proc stream0 @dots{} streamN +@defun stream-for-each proc stream1 stream2 @dots{} Call @var{proc} on the elements from the given @var{stream}s. The return value is unspecified. -Each call is @code{(@var{proc} elem0 @dots{} elemN)}, where each +Each call is @code{(@var{proc} elem1 elem2 @dots{})}, where each @var{elem} is from the corresponding @var{stream}. @code{stream-for-each} stops when it reaches the end of the shortest @var{stream}. @end defun -@defun stream-map proc stream0 @dots{} streamN +@defun stream-map proc stream1 stream2 @dots{} Return a new stream which is the results of applying @var{proc} to the elements of the given @var{stream}s. -Each call is @code{(@var{proc} elem0 @dots{} elemN)}, where each +Each call is @code{(@var{proc} elem1 elem2 @dots{})}, where each @var{elem} is from the corresponding @var{stream}. The new stream -ends when the end of teh shortest given @var{stream} is reached. +ends when the end of the shortest given @var{stream} is reached. @end defun +@node Buffered Input +@section Buffered Input +@cindex Buffered input +@cindex Line continuation + +The following functions are provided by + +@example +(use-modules (ice-9 buffered-input)) +@end example + +A buffered input port allows a reader function to return chunks of +characters which are to be handed out on reading the port. A notion +of further input for an application level logical expression is +maintained too, and passed through to the reader. + +@deffn {Scheme Procedure} make-buffered-input-port reader +Create an input port which returns characters obtained from the given +@var{reader} function. @var{reader} is called (@var{reader} cont), +and should return a string or an EOF object. + +The new port gives precisely the characters returned by @var{reader}, +nothing is added, so if any newline characters or other separators are +desired they must come from the reader function. 
+ +The @var{cont} parameter to @var{reader} is @code{#f} for initial +input, or @code{#t} when continuing an expression. This is an +application level notion, set with +@code{set-buffered-input-continuation?!} below. If the user has +entered a partial expression then it allows @var{reader} for instance +to give a different prompt to show more is required. +@end deffn + +@deffn {Scheme Procedure} make-line-buffered-input-port reader +@cindex Line buffered input +Create an input port which returns characters obtained from the +specified @var{reader} function, similar to +@code{make-buffered-input-port} above, but where @var{reader} is +expected to be line-oriented. + +@var{reader} is called (@var{reader} cont), and should return a string +or an EOF object as above. Each string is a line of input without a +newline character; the port code inserts a newline after each string. +@end deffn + +@deffn {Scheme Procedure} set-buffered-input-continuation?! port cont +Set the input continuation flag for a given buffered input +@var{port}. + +An application uses this by calling with a @var{cont} flag of +@code{#f} when beginning to read a new logical expression. For +example with the Scheme @code{read} function (@pxref{Scheme Read}), + +@example +(define my-port (make-buffered-input-port my-reader)) + +(set-buffered-input-continuation?! my-port #f) +(let ((obj (read my-port))) + ... +@end example +@end deffn + + @c Local Variables: @c TeX-master: "guile.texi" @c End: