Add interface to disable automatic finalization

diff --git a/doc/ref/compiler.texi b/doc/ref/compiler.texi
index 0d68abf..bfc633e 100644
--- a/doc/ref/compiler.texi
+++ b/doc/ref/compiler.texi
@@ -1,6 +1,6 @@
  @c -*-texinfo-*-
  @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  2008
+@c Copyright (C)  2008, 2009, 2010, 2013
  @c   Free Software Foundation, Inc.
  @c See the file guile.texi for copying conditions.
  
@@ -17,7 +17,7 @@ This section aims to pay attention to the small man behind the
  curtain.
  
  @xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
-know how to compile your .scm file.
+know how to compile your @code{.scm} file.
  
  @menu
  * Compiler Tower::                   
@@ -26,6 +26,7 @@ know how to compile your .scm file.
  * GLIL::                
  * Assembly::                   
  * Bytecode and Objcode::                   
+* Writing New High-Level Languages::
  * Extending the Compiler::
  @end menu
  
@@ -52,8 +53,11 @@ Languages are registered in the module, @code{(system base language)}:
  They are registered with the @code{define-language} form.
  
  @deffn {Scheme Syntax} define-language @
-name title version reader printer @
-[parser=#f] [compilers='()] [decompilers='()] [evaluator=#f]
+                       [#:name] [#:title] [#:reader] [#:printer] @
+                       [#:parser=#f] [#:compilers='()] @
+                       [#:decompilers='()] [#:evaluator=#f] @
+                       [#:joiner=#f] [#:for-humans?=#t] @
+                       [#:make-default-environment=make-fresh-user-module]
  Define a language.
  
  This syntax defines a @code{#<language>} object, bound to @var{name}
@@ -63,14 +67,13 @@ for Scheme:
  
  @example
  (define-language scheme
-  #:title       "Guile Scheme"
-  #:version     "0.5"
-  #:reader      read
-  #:compilers   `((tree-il . ,compile-tree-il)
-                  (ghil . ,compile-ghil))
+  #:title      "Scheme"
+  #:reader      (lambda (port env) ...)
+  #:compilers   `((tree-il . ,compile-tree-il))
    #:decompilers `((tree-il . ,decompile-tree-il))
-  #:evaluator   (lambda (x module) (primitive-eval x))
-  #:printer     write)
+  #:evaluator  (lambda (x module) (primitive-eval x))
+  #:printer    write
+  #:make-default-environment (lambda () ...))
  @end example
  @end deffn
  
@@ -79,17 +82,11 @@ they present a uniform interface to the read-eval-print loop. This
  allows the user to change the current language of the REPL:
  
  @example
-$ guile
-Guile Scheme interpreter 0.5 on Guile 1.9.0
-Copyright (C) 2001-2008 Free Software Foundation, Inc.
-
-Enter `,help' for help.
  scheme@@(guile-user)> ,language tree-il
-Tree Intermediate Language interpreter 1.0 on Guile 1.9.0
-Copyright (C) 2001-2008 Free Software Foundation, Inc.
-
-Enter `,help' for help.
-tree-il@@(guile-user)> 
+Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
+tree-il@@(guile-user)> ,L scheme
+Happy hacking with Scheme!  To switch back, type `,L tree-il'.
+scheme@@(guile-user)> 
  @end example
  
  Languages can be looked up by name, as they were above.
@@ -117,18 +114,19 @@ subsequent calls to @code{define-language}, so it should be quite
  fast.
  @end deffn
  
-There is a notion of a ``current language'', which is maintained in
-the @code{*current-language*} fluid. This language is normally Scheme,
-and may be rebound by the user. The run-time compilation interfaces
+There is a notion of a ``current language'', which is maintained in the
+@code{current-language} parameter, defined in the core @code{(guile)}
+module. This language is normally Scheme, and may be rebound by the
+user. The run-time compilation interfaces
  (@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
  and target languages.
  
  The normal tower of languages when compiling Scheme goes like this:
  
  @itemize
-@item Scheme, which we know and love
+@item Scheme
  @item Tree Intermediate Language (Tree-IL)
-@item Guile Low Intermediate Language (GLIL)
+@item Guile Lowlevel Intermediate Language (GLIL)
  @item Assembly
  @item Bytecode
  @item Objcode
@@ -195,14 +193,14 @@ The Scheme-to-Tree-IL expander may be invoked using the generic
  Or, since Tree-IL is so close to Scheme, it is often useful to expand
  Scheme to Tree-IL, then translate back to Scheme. For that reason the
  expander provides two interfaces. The former is equivalent to calling
-@code{(sc-expand '(+ 1 2) 'c)}, where the @code{'c} is for
+@code{(macroexpand '(+ 1 2) 'c)}, where the @code{'c} is for
  ``compile''. With @code{'e} (the default), the result is translated
  back to Scheme:
  
  @lisp
-(sc-expand '(+ 1 2))
+(macroexpand '(+ 1 2))
  @result{} (+ 1 2)
-(sc-expand '(let ((x 10)) (* x x)))
+(macroexpand '(let ((x 10)) (* x x)))
  @result{} (let ((x84 10)) (* x84 x84))
  @end lisp
  
@@ -214,18 +212,18 @@ lexical binding only has one name. It is for this reason that the
  much information we would lose if we translated to Scheme directly:
  lexical variable names, source locations, and module hygiene.
  
-Note however that @code{sc-expand} does not have the same signature as
-@code{compile-tree-il}. @code{compile-tree-il} is a small wrapper
-around @code{sc-expand}, to make it conform to the general form of
+Note however that @code{macroexpand} does not have the same signature
+as @code{compile-tree-il}. @code{compile-tree-il} is a small wrapper
+around @code{macroexpand}, to make it conform to the general form of
  compiler procedures in Guile's language tower.
  
-Compiler procedures take two arguments, an expression and an
-environment. They return three values: the compiled expression, the
-corresponding environment for the target language, and a
-``continuation environment''. The compiled expression and environment
-will serve as input to the next language's compiler. The
-``continuation environment'' can be used to compile another expression
-from the same source language within the same module.
+Compiler procedures take three arguments: an expression, an
+environment, and a keyword list of options. They return three values:
+the compiled expression, the corresponding environment for the target
+language, and a ``continuation environment''. The compiled expression
+and environment will serve as input to the next language's compiler.
+The ``continuation environment'' can be used to compile another
+expression from the same source language within the same module.
  
  For example, you might compile the expression, @code{(define-module
  (foo))}. This will result in a Tree-IL expression and environment. But
@@ -235,12 +233,42 @@ which puts the user in the @code{(foo)} module. That is purpose of the
  ``continuation environment''; you would pass it as the environment
  when compiling the subsequent expression.
  
-For Scheme, an environment may be one of two things:
-@itemize
-@item @code{#f}, in which case compilation is performed in the context
-of the current module; or
-@item a module, which specifies the context of the compilation.
-@end itemize
+For Scheme, an environment is a module. By default, the @code{compile}
+and @code{compile-file} procedures compile in a fresh module, such
+that bindings and macros introduced by the expression being compiled
+are isolated:
+
+@example
+(eq? (current-module) (compile '(current-module)))
+@result{} #f
+
+(compile '(define hello 'world))
+(defined? 'hello)
+@result{} #f
+
+(define / *)
+(eq? (compile '/) /)
+@result{} #f
+@end example
+
+Similarly, changes to the @code{current-reader} fluid (@pxref{Loading,
+@code{current-reader}}) are isolated:
+
+@example
+(compile '(fluid-set! current-reader (lambda args 'fail)))
+(fluid-ref current-reader)
+@result{} #f
+@end example
+
+Nevertheless, having the compiler and @dfn{compilee} share the same name
+space can be achieved by explicitly passing @code{(current-module)} as
+the compilation environment:
+
+@example
+(define hello 'world)
+(compile 'hello #:env (current-module))
+@result{} world
+@end example
  
  @node Tree-IL
  @subsection Tree-IL
@@ -252,12 +280,12 @@ expanded, pre-analyzed Scheme.
  Tree-IL is ``structured'' in the sense that its representation is
  based on records, not S-expressions. This gives a rigidity to the
  language that ensures that compiling to a lower-level language only
-requires a limited set of transformations. Practically speaking,
-consider the Tree-IL type, @code{<const>}, which has two fields,
-@code{src} and @code{exp}. Instances of this type are records created
-via @code{make-const}, and whose fields are accessed as
-@code{const-src}, and @code{const-exp}. There is also a predicate,
-@code{const?}. @xref{Records}, for more information on records.
+requires a limited set of transformations. For example, the Tree-IL
+type @code{<const>} is a record type with two fields, @code{src} and
+@code{exp}. Instances of this type are created via @code{make-const}.
+Fields of this type are accessed via the @code{const-src} and
+@code{const-exp} procedures. There is also a predicate, @code{const?}.
+@xref{Records}, for more information on records.
  
  @c alpha renaming
  
@@ -270,7 +298,7 @@ Properties}, for more information.
  
  Although Tree-IL objects are represented internally using records,
  there is also an equivalent S-expression external representation for
-each kind of Tree-IL. For example, an the S-expression representation
+each kind of Tree-IL. For example, the S-expression representation
  of @code{#<const src: #f exp: 3>} expression would be:
  
  @example
@@ -281,16 +309,21 @@ Users may program with this format directly at the REPL:
  
  @example
  scheme@@(guile-user)> ,language tree-il
-Tree Intermediate Language interpreter 1.0 on Guile 1.9.0
-Copyright (C) 2001-2008 Free Software Foundation, Inc.
-
-Enter `,help' for help.
+Happy hacking with Tree Intermediate Language!  To switch back, type `,L scheme'.
  tree-il@@(guile-user)> (apply (primitive +) (const 32) (const 10))
  @result{} 42
  @end example
  
  The @code{src} fields are left out of the external representation.
  
+One may create Tree-IL objects from their external representations by
+calling @code{parse-tree-il}, the reader for Tree-IL. If any source
+information is attached to the input S-expression, it will be
+propagated to the resulting Tree-IL expressions. This is probably the
+easiest way to compile to Tree-IL: just make the appropriate external
+representations in S-expression format, and let @code{parse-tree-il}
+take care of the rest.
+
  @deftp {Scheme Variable} <void> src
  @deftpx {External Representation} (void)
  An empty expression. In practice, equivalent to Scheme's @code{(if #f
@@ -327,7 +360,7 @@ Sets a lexically-bound variable.
  @deftpx {External Representation} (@@ @var{mod} @var{name})
  @deftpx {External Representation} (@@@@ @var{mod} @var{name})
  A reference to a variable in a specific module. @var{mod} should be
-the name of the module, e.g. @code{(guile-user)}.
+the name of the module, e.g.@: @code{(guile-user)}.
  
  If @var{public?} is true, the variable named @var{name} will be looked
  up in @var{mod}'s public interface, and serialized with @code{@@};
@@ -363,44 +396,141 @@ A procedure call.
  @deftpx {External Representation} (begin . @var{exps})
  Like Scheme's @code{begin}.
  @end deftp
-@deftp {Scheme Variable} <lambda> src names vars meta body
-@deftpx {External Representation} (lambda @var{names} @var{vars} @var{meta} @var{body})
-A closure. @var{names} is original binding form, as given in the
-source code, which may be an improper list. @var{vars} are gensyms
-corresponding to the @var{names}. @var{meta} is an association list of
-properties. The actual @var{body} is a single Tree-IL expression.
+@deftp {Scheme Variable} <lambda> src meta body
+@deftpx {External Representation} (lambda @var{meta} @var{body})
+A closure. @var{meta} is an association list of properties for the
+procedure. @var{body} is a single Tree-IL expression of type
+@code{<lambda-case>}. As the @code{<lambda-case>} clause can chain to
+an alternate clause, this makes Tree-IL's @code{<lambda>} have the
+expressiveness of Scheme's @code{case-lambda}.
+@end deftp
+@deftp {Scheme Variable} <lambda-case> req opt rest kw inits gensyms body alternate
+@deftpx {External Representation} @
+  (lambda-case ((@var{req} @var{opt} @var{rest} @var{kw} @var{inits} @var{gensyms})@
+                @var{body})@
+               [@var{alternate}])
+One clause of a @code{case-lambda}. A @code{lambda} expression in
+Scheme is treated as a @code{case-lambda} with one clause.
+
+@var{req} is a list of the procedure's required arguments, as symbols.
+@var{opt} is a list of the optional arguments, or @code{#f} if there
+are no optional arguments. @var{rest} is the name of the rest
+argument, or @code{#f}.
+
+@var{kw} is a list of the form, @code{(@var{allow-other-keys?}
+(@var{keyword} @var{name} @var{var}) ...)}, where @var{keyword} is the
+keyword corresponding to the argument named @var{name}, and whose
+corresponding gensym is @var{var}. @var{inits} are Tree-IL expressions
+corresponding to all of the optional and keyword arguments, evaluated
+to bind variables whose value is not supplied by the procedure caller.
+Each @var{init} expression is evaluated in the lexical context of
+previously bound variables, from left to right.
+
+@var{gensyms} is a list of gensyms corresponding to all arguments:
+first all of the required arguments, then the optional arguments if
+any, then the rest argument if any, then all of the keyword arguments.
+
+@var{body} is the body of the clause. If the procedure is called with
+an appropriate number of arguments, @var{body} is evaluated in tail
+position. Otherwise, if there is an @var{alternate}, it should be a
+@code{<lambda-case>} expression, representing the next clause to try.
+If there is no @var{alternate}, a wrong-number-of-arguments error is
+signaled.
  @end deftp
-@deftp {Scheme Variable} <let> src names vars vals exp
-@deftpx {External Representation} (let @var{names} @var{vars} @var{vals} @var{exp})
+@deftp {Scheme Variable} <let> src names gensyms vals exp
+@deftpx {External Representation} (let @var{names} @var{gensyms} @var{vals} @var{exp})
  Lexical binding, like Scheme's @code{let}. @var{names} are the
-original binding names, @var{vars} are gensyms corresponding to the
+original binding names, @var{gensyms} are gensyms corresponding to the
  @var{names}, and @var{vals} are Tree-IL expressions for the values.
  @var{exp} is a single Tree-IL expression.
  @end deftp
-@deftp {Scheme Variable} <letrec> src names vars vals exp
-@deftpx {External Representation} (letrec @var{names} @var{vars} @var{vals} @var{exp})
+@deftp {Scheme Variable} <letrec> in-order? src names gensyms vals exp
+@deftpx {External Representation} (letrec @var{names} @var{gensyms} @var{vals} @var{exp})
+@deftpx {External Representation} (letrec* @var{names} @var{gensyms} @var{vals} @var{exp})
  A version of @code{<let>} that creates recursive bindings, like
-Scheme's @code{letrec}.
+Scheme's @code{letrec}, or @code{letrec*} if @var{in-order?} is true.
+@end deftp
+@deftp {Scheme Variable} <dynlet> fluids vals body
+@deftpx {External Representation} (dynlet @var{fluids} @var{vals} @var{body})
+Dynamic binding; the equivalent of Scheme's @code{with-fluids}.
+@var{fluids} should be a list of Tree-IL expressions that will
+evaluate to fluids, and @var{vals} a corresponding list of expressions
+to bind to the fluids during the dynamic extent of the evaluation of
+@var{body}.
+@end deftp
+@deftp {Scheme Variable} <dynref> fluid
+@deftpx {External Representation} (dynref @var{fluid})
+A dynamic variable reference. @var{fluid} should be a Tree-IL
+expression evaluating to a fluid.
+@end deftp
+@deftp {Scheme Variable} <dynset> fluid exp
+@deftpx {External Representation} (dynset @var{fluid} @var{exp})
+A dynamic variable set. @var{fluid}, a Tree-IL expression evaluating
+to a fluid, will be set to the result of evaluating @var{exp}.
+@end deftp
+@deftp {Scheme Variable} <dynwind> winder body unwinder
+@deftpx {External Representation} (dynwind @var{winder} @var{body} @var{unwinder})
+A @code{dynamic-wind}. @var{winder} and @var{unwinder} should both
+evaluate to thunks. Ensures that the winder is called before entering
+@var{body} and the unwinder after leaving it. Note that @var{body} is
+an expression, without a thunk wrapper.
+@end deftp
+@deftp {Scheme Variable} <prompt> tag body handler
+@deftpx {External Representation} (prompt @var{tag} @var{body} @var{handler})
+A dynamic prompt. Instates a prompt named @var{tag}, an expression,
+during the dynamic extent of the execution of @var{body}, also an
+expression. If an abort occurs to this prompt, control will be passed
+to @var{handler}, a @code{<lambda-case>} expression with no optional
+or keyword arguments, and no alternate. The first argument to the
+@code{<lambda-case>} will be the captured continuation, and then all
+of the values passed to the abort. @xref{Prompts}, for more
+information.
+@end deftp
+@deftp {Scheme Variable} <abort> tag args tail
+@deftpx {External Representation} (abort @var{tag} @var{args} @var{tail})
+An abort to the nearest prompt with the name @var{tag}, an expression.
+@var{args} should be a list of expressions to pass to the prompt's
+handler, and @var{tail} should be an expression that will evaluate to
+a list of additional arguments. An abort will save the partial
+continuation, which may later be reinstated, resulting in the
+@code{<abort>} expression evaluating to some number of values.
  @end deftp
  
-@c FIXME -- need to revive this one
-@c @deftp {Scheme Variable} <ghil-mv-bind> src vars rest producer . body
-@c Like Scheme's @code{receive} -- binds the values returned by
-@c applying @code{producer}, which should be a thunk, to the
-@c @code{lambda}-like bindings described by @var{vars} and @var{rest}.
-@c @end deftp
+There are two Tree-IL constructs that are not normally produced by
+higher-level compilers, but instead are generated during the
+source-to-source optimization and analysis passes that the Tree-IL
+compiler does. Users should not generate these expressions directly,
+unless they feel very clever, as the default analysis pass will
+generate them as necessary.
+
+@deftp {Scheme Variable} <let-values> src names gensyms exp body
+@deftpx {External Representation} (let-values @var{names} @var{gensyms} @var{exp} @var{body})
+Like Scheme's @code{receive} -- binds the values returned by
+evaluating @code{exp} to the @code{lambda}-like bindings described by
+@var{gensyms}. That is to say, @var{gensyms} may be an improper list.
+
+@code{<let-values>} is an optimization of @code{<application>} of the
+primitive, @code{call-with-values}.
+@end deftp
+@deftp {Scheme Variable} <fix> src names gensyms vals body
+@deftpx {External Representation} (fix @var{names} @var{gensyms} @var{vals} @var{body})
+Like @code{<letrec>}, but only for @var{vals} that are unset
+@code{lambda} expressions.
+
+@code{fix} is an optimization of @code{letrec} (and @code{let}).
+@end deftp
  
  Tree-IL implements a compiler to GLIL that recursively traverses
  Tree-IL expressions, writing out GLIL expressions into a linear list.
  The compiler also keeps some state as to whether the current
  expression is in tail context, and whether its value will be used in
  future computations. This state allows the compiler not to emit code
-for constant expressions that will not be used (e.g. docstrings), and
+for constant expressions that will not be used (e.g.@: docstrings), and
  to perform tail calls when in tail position.
  
-In the future, there will be a pass at the beginning of the
-Tree-IL->GLIL compilation step to perform inlining, copy propagation,
-dead code elimination, and constant folding.
+Most optimization, such as it currently is, is performed on Tree-IL
+expressions as source-to-source transformations. There will be more
+optimizations added in the future.
  
  Interested readers are encouraged to read the implementation in
  @code{(language tree-il compile-glil)} for more details.
@@ -408,20 +538,37 @@ Interested readers are encouraged to read the implementation in
  @node GLIL
  @subsection GLIL
  
-Guile Low Intermediate Language (GLIL) is a structured intermediate
+Guile Lowlevel Intermediate Language (GLIL) is a structured intermediate
  language whose expressions more closely approximate Guile's VM
-instruction set.
-
-Its expression types are defined in @code{(language glil)}, and as
-with GHIL, some of its fields parse as rest arguments.
+instruction set. Its expression types are defined in @code{(language
+glil)}.
  
-@deftp {Scheme Variable} <glil-program> nargs nrest nlocs nexts meta . body
+@deftp {Scheme Variable} <glil-program> meta . body
  A unit of code that at run-time will correspond to a compiled
-procedure. @var{nargs} @var{nrest} @var{nlocs}, and @var{nexts}
-collectively define the program's arity; see @ref{Compiled
-Procedures}, for more information. @var{meta} should be an alist of
-properties, as in Tree IL's @code{<lambda>}. @var{body} is a list of
-GLIL expressions.
+procedure. @var{meta} should be an alist of properties, as in
+Tree-IL's @code{<lambda>}. @var{body} is an ordered list of GLIL
+expressions.
+@end deftp
+@deftp {Scheme Variable} <glil-std-prelude> nreq nlocs else-label
+A prologue for a function with no optional, keyword, or rest
+arguments. @var{nreq} is the number of required arguments. @var{nlocs}
+is the total number of local variables, including the arguments. If the
+procedure was not given exactly @var{nreq} arguments, control will
+jump to @var{else-label}, if given, or otherwise signal an error.
+@end deftp
+@deftp {Scheme Variable} <glil-opt-prelude> nreq nopt rest nlocs else-label
+A prologue for a function with optional or rest arguments. Like
+@code{<glil-std-prelude>}, with the addition that @var{nopt} is the
+number of optional arguments (possibly zero) and @var{rest} is an
+index of a local variable at which to bind a rest argument, or
+@code{#f} if there is no rest argument.
+@end deftp
+@deftp {Scheme Variable} <glil-kw-prelude> nreq nopt rest kw allow-other-keys? nlocs else-label
+A prologue for a function with keyword arguments. Like
+@code{<glil-opt-prelude>}, with the addition that @var{kw} is a list
+of keyword arguments, and @var{allow-other-keys?} is a flag indicating
+whether to allow unknown keys. @xref{Function Prologue Instructions,
+@code{bind-kwargs}}, for details on the format of @var{kw}.
  @end deftp
  @deftp {Scheme Variable} <glil-bind> . vars
  An advisory expression that notes a liveness extent for a set of
@@ -433,15 +580,15 @@ variables. @var{vars} is a list of @code{(@var{name} @var{type}
  program's metadata and do not form part of a program's code path.
  @end deftp
  @deftp {Scheme Variable} <glil-mv-bind> vars rest
-A multiple-value binding of the values on the stack to @var{vars}. Iff
-@var{rest} is true, the last element of @var{vars} will be treated as
-a rest argument.
+A multiple-value binding of the values on the stack to @var{vars}.  If
+@var{rest} is true, the last element of @var{vars} will be treated as a
+rest argument.
  
  In addition to pushing a binding annotation on the stack, like
  @code{<glil-bind>}, an expression is emitted at compilation time to
  make sure that there are enough values available to bind. See the
-notes on @code{truncate-values} in @ref{Procedural Instructions}, for
-more information.
+notes on @code{truncate-values} in @ref{Procedure Call and Return
+Instructions}, for more information.
  @end deftp
  @deftp {Scheme Variable} <glil-unbind>
  Closes the liveness extent of the most recently encountered
@@ -456,27 +603,25 @@ offset within a VM program.
  @deftp {Scheme Variable} <glil-source> loc
  Records source information for the preceding expression. @var{loc}
  should be an association list of containing @code{line} @code{column},
-and @code{filename} keys, e.g. as returned by
+and @code{filename} keys, e.g.@: as returned by
  @code{source-properties}.
  @end deftp
  @deftp {Scheme Variable} <glil-void>
-Pushes the unspecified value on the stack.
+Pushes ``the unspecified value'' on the stack.
  @end deftp
  @deftp {Scheme Variable} <glil-const> obj
  Pushes a constant value onto the stack. @var{obj} must be a number,
-string, symbol, keyword, boolean, character, the empty list, or a pair
-or vector of constants.
-@end deftp
-@deftp {Scheme Variable} <glil-local> op index
-Accesses a lexically bound variable from the stack. If @var{op} is
-@code{ref}, the value is pushed onto the stack; if it is @code{set},
-the variable is set from the top value on the stack, which is popped
-off. @xref{Stack Layout}, for more information.
+string, symbol, keyword, boolean, character, uniform array, the empty
+list, or a pair or vector of constants.
  @end deftp
-@deftp {Scheme Variable} <glil-external> op depth index
-Accesses a heap-allocated variable, addressed by @var{depth}, the nth
-enclosing environment, and @var{index}, the variable's position within
-the environment. @var{op} is @code{ref} or @code{set}.
+@deftp {Scheme Variable} <glil-lexical> local? boxed? op index
+Accesses a lexically bound variable. If the variable is not
+@var{local?} it is free. All variables may have @code{ref},
+@code{set}, and @code{bound?} as their @var{op}. Boxed variables may
+also have the @var{op}s @code{box}, @code{empty-box}, and @code{fix},
+which correspond in semantics to the VM instructions @code{box},
+@code{empty-box}, and @code{fix-closure}. @xref{Stack Layout}, for
+more information.
  @end deftp
  @deftp {Scheme Variable} <glil-toplevel> op name
  Accesses a toplevel variable. @var{op} may be @code{ref}, @code{set},
@@ -504,22 +649,25 @@ the stack afterwards depends on the instruction.
  @deftp {Scheme Variable} <glil-mv-call> nargs ra
  Performs a multiple-value call. @var{ra} is a @code{<glil-label>}
  corresponding to the multiple-value return address for the call. See
-the notes on @code{mv-call} in @ref{Procedural Instructions}, for more
-information.
+the notes on @code{mv-call} in @ref{Procedure Call and Return
+Instructions}, for more information.
+@end deftp
+@deftp {Scheme Variable} <glil-prompt> label escape-only?
+Pushes a dynamic prompt onto the stack, with a handler at @var{label}.
+@var{escape-only?} is a flag that is propagated to the prompt,
+allowing an abort to avoid capturing a continuation in some cases.
+@xref{Prompts}, for more information.
  @end deftp
  
  Users may enter in GLIL at the REPL as well, though there is a bit
-more bookkeeping to do. Since GLIL needs the set of variables to be
-declared explicitly in a @code{<glil-program>}, GLIL expressions must
-be wrapped in a thunk that declares the arity of the expression:
+more bookkeeping to do:
  
  @example
  scheme@@(guile-user)> ,language glil
-Guile Lowlevel Intermediate Language (GLIL) interpreter 0.3 on Guile 1.9.0
-Copyright (C) 2001-2008 Free Software Foundation, Inc.
-
-Enter `,help' for help.
-glil@@(guile-user)> (program 0 0 0 0 () (const 3) (call return 0))
+Happy hacking with Guile Lowlevel Intermediate Language (GLIL)!
+To switch back, type `,L scheme'.
+glil@@(guile-user)> (program () (std-prelude 0 0 #f)
+                       (const 3) (call return 1))
  @result{} 3
  @end example
  
@@ -541,12 +689,12 @@ differs from GLIL in four main ways:
  @itemize
  @item Labels have been resolved to byte offsets in the program.
  @item Constants inside procedures have either been expressed as inline
-instructions, and possibly cached in object arrays.
+instructions or cached in object arrays.
  @item Procedures with metadata (source location information, liveness
  extents, procedure names, generic properties, etc) have had their
  metadata serialized out to thunks.
  @item All expressions correspond directly to VM instructions -- i.e.,
-there is no @code{<glil-local>} which can be a ref or a set.
+there is no @code{<glil-lexical>} which can be a ref or a set.
  @end itemize
  
  Assembly is isomorphic to the bytecode that it compiles to. You can
@@ -565,39 +713,27 @@ to play around with it at the REPL, as can be seen in this annotated
  example:
  
  @example
-scheme@@(guile-user)> (compile '(lambda (x) (+ x x)) #:to 'assembly)
-(load-program 0 0 0 0
-  () ; Labels
-  60 ; Length
-  #f ; Metadata
-  (make-false) ; object table for the returned lambda
-  (nop)
-  (nop) ; Alignment. Since assembly has already resolved its labels
-  (nop) ; to offsets, and programs must be 8-byte aligned since their
-  (nop) ; object code is mmap'd directly to structures, assembly
-  (nop) ; has to have the alignment embedded in it.
-  (nop) 
-  (load-program 1 0 0 0 
+scheme@@(guile-user)> ,pp (compile '(+ 32 10) #:to 'assembly)
+(load-program
+  ((:LCASE16 . 2))  ; Labels, unused in this case.
+  8                 ; Length of the thunk that was compiled.
+  (load-program     ; Metadata thunk.
      ()
-    6
-    ; This is the metadata thunk for the returned procedure.
-    (load-program 0 0 0 0 () 21 #f
-      (load-symbol "x")  ; Name and liveness extent for @code{x}.
-      (make-false)
-      (make-int8:0) ; Some instruction+arg combinations
-      (make-int8:0) ; have abbreviations.
-      (make-int8 6)
-      (list 0 5)
-      (list 0 1)
-      (make-eol)
-      (list 0 2)
-      (return))
-    ; And here, the actual code.
-    (local-ref 0)
-    (local-ref 0)
-    (add)
+    17
+    #f              ; No metadata thunk for the metadata thunk.
+    (make-eol)
+    (make-eol)
+    (make-int8 2)   ; Liveness extents, source info, and arities,
+    (make-int8 8)   ; in a format that Guile knows how to parse.
+    (make-int8:0)
+    (list 0 3)
+    (list 0 1)
+    (list 0 3)
      (return))
-  ; Return our new procedure.
+  (assert-nargs-ee/locals 0)  ; Prologue.
+  (make-int8 32)    ; Actual code starts here.
+  (make-int8 10)
+  (add)
    (return))
  @end example
  
@@ -616,11 +752,12 @@ structuring and destructuring code on the Scheme level. Bytecode is
  the next step down from assembly:
  
  @example
-scheme@@(guile-user)> (compile '(+ 32 10) #:to 'assembly)
-@result{} (load-program 0 0 0 0 () 6 #f
-       (make-int8 32) (make-int8 10) (add) (return))
  scheme@@(guile-user)> (compile '(+ 32 10) #:to 'bytecode)
-@result{} #u8(0 0 0 0 6 0 0 0 0 0 0 0 10 32 10 10 100 48)
+@result{} #vu8(8 0 0 0 25 0 0 0            ; Header.
+       95 0                            ; Prologue.
+       10 32 10 10 148 66 17           ; Actual code.
+       0 0 0 0 0 0 0 9                 ; Metadata thunk.
+       9 10 2 10 8 11 18 0 3 18 0 1 18 0 3 66)
  @end example
  
  ``Objcode'' is bytecode, but mapped directly to a C structure,
@@ -628,10 +765,6 @@ scheme@@(guile-user)> (compile '(+ 32 10) #:to 'bytecode)
  
  @example
  struct scm_objcode @{
-  scm_t_uint8 nargs;
-  scm_t_uint8 nrest;
-  scm_t_uint8 nlocs;
-  scm_t_uint8 nexts;
    scm_t_uint32 len;
    scm_t_uint32 metalen;
    scm_t_uint8 base[0];
@@ -639,9 +772,8 @@ struct scm_objcode @{
  @end example
  
  As one might imagine, objcode imposes a minimum length on the
-bytecode. Also, the multibyte fields are in native endianness, which
-makes objcode (and bytecode) system-dependent. Indeed, in the short
-example above, all but the last 5 bytes were the program's header.
+bytecode. Also, the @code{len} and @code{metalen} fields are in native
+endianness, which makes objcode (and bytecode) system-dependent.
  
  Objcode also has a couple of important efficiency hacks. First,
  objcode may be mapped directly from disk, allowing compiled code to be
@@ -657,13 +789,13 @@ objcode)} module.
  
  @deffn {Scheme Procedure} objcode? obj
  @deffnx {C Function} scm_objcode_p (obj)
-Returns @code{#f} iff @var{obj} is object code, @code{#f} otherwise.
+Returns @code{#t} if @var{obj} is object code, @code{#f} otherwise.
  @end deffn
  
  @deffn {Scheme Procedure} bytecode->objcode bytecode
-@deffnx {C Function} scm_bytecode_to_objcode (bytecode,)
+@deffnx {C Function} scm_bytecode_to_objcode (bytecode)
  Makes a bytecode object from @var{bytecode}, which should be a
-@code{u8vector}.
+bytevector. @xref{Bytevectors}.
  @end deffn
  
  @deffn {Scheme Variable} load-objcode file
@@ -671,28 +803,28 @@ Makes a bytecode object from @var{bytecode}, which should be a
  Load object code from a file named @var{file}. The file will be mapped
  into memory via @code{mmap}, so this is a very fast operation.
  
-On disk, object code has an eight-byte cookie prepended to it, to
+On disk, object code has a sixteen-byte cookie prepended to it, to
  prevent accidental loading of arbitrary garbage.
  @end deffn
  
  @deffn {Scheme Variable} write-objcode objcode file
  @deffnx {C Function} scm_write_objcode (objcode)
-Write object code out to a file, prepending the eight-byte cookie.
+Write object code out to a file, prepending the sixteen-byte cookie.
  @end deffn
  
-@deffn {Scheme Variable} objcode->u8vector objcode
-@deffnx {C Function} scm_objcode_to_u8vector (objcode)
-Copy object code out to a @code{u8vector} for analysis by Scheme.
+@deffn {Scheme Variable} objcode->bytecode objcode
+@deffnx {C Function} scm_objcode_to_bytecode (objcode)
+Copy object code out to a bytevector for analysis by Scheme.
  @end deffn
  
  The following procedure is actually in @code{(system vm program)}, but
  we'll mention it here:
  
-@deffn {Scheme Variable} make-program objcode objtable [external='()]
-@deffnx {C Function} scm_make_program (objcode, objtable, external)
+@deffn {Scheme Variable} make-program objcode objtable [free-vars=#f]
+@deffnx {C Function} scm_make_program (objcode, objtable, free_vars)
  Load up object code into a Scheme program. The resulting program will
  have @var{objtable} as its object table, which should be a vector or
-@code{#f}, and will capture the closure variables from @var{external}.
+@code{#f}, and will capture the free variables from @var{free-vars}.
  @end deffn
  
  Object code from a file may be disassembled at the REPL via the
@@ -704,24 +836,31 @@ Compiling object code to the fake language, @code{value}, is performed
  via loading objcode into a program, then executing that thunk with
  respect to the compilation environment. Normally the environment
  propagates through the compiler transparently, but users may specify
-the compilation environment manually as well:
+the compilation environment manually as well, as a module.
+
+
+@node Writing New High-Level Languages
+@subsection Writing New High-Level Languages
+
+In order to integrate a new language @var{lang} into Guile's compiler
+system, one has to create the module @code{(language @var{lang} spec)}
+containing the language definition and referencing the parser,
+compiler and other routines processing it. The module hierarchy in
+@code{(language brainfuck)} defines a very basic Brainfuck
+implementation meant to serve as an easy-to-understand example of how to
+do this. See for instance @url{http://en.wikipedia.org/wiki/Brainfuck}
+for more information about the Brainfuck language itself.
  
-@deffn {Scheme Procedure} make-objcode-env module externals
-Make an object code environment. @var{module} should be a Scheme
-module, and @var{externals} should be a list of external variables.
-@code{#f} is also a valid object code environment.
-@end deffn
  
  @node Extending the Compiler
  @subsection Extending the Compiler
  
-At this point, we break with the impersonal tone of the rest of the
-manual, and make an intervention. Admit it: if you've read this far
-into the compiler internals manual, you are a junkie. Perhaps a course
-at your university left you unsated, or perhaps you've always harbored
-a sublimated desire to hack the holy of computer science holies: a
-compiler. Well you're in good company, and in a good position. Guile's
-compiler needs your help.
+At this point we take a detour from the impersonal tone of the rest of
+the manual.  Admit it: if you've read this far into the compiler
+internals manual, you are a junkie.  Perhaps a course at your university
+left you unsated, or perhaps you've always harbored a desire to hack the
+holy of computer science holies: a compiler.  Well, you're in good
+company, and in a good position.  Guile's compiler needs your help.
  
  There are many possible avenues for improving Guile's compiler.
  Probably the most important improvement, speed-wise, will be some form
@@ -734,12 +873,14 @@ procedure is called a certain number of times.
  The name of the game is a profiling-based harvest of the low-hanging
  fruit, running programs of interest under a system-level profiler and
  determining which improvements would give the most bang for the buck.
-There are many well-known efficiency hacks in the literature: Dybvig's
-letrec optimization, individual boxing of heap-allocated values (and
-then store the boxes on the stack directly), optimized case-lambda
-expressions, stack underflow and overflow handlers, etc. Highly
-recommended papers: Dybvig's HOCS, Ghuloum's compiler paper.
+It's really getting to the point, though, that native compilation is the
+next step.
  
  The compiler also needs help at the top end, enhancing the Scheme that
-it knows to also understand R6RS, and adding new high-level compilers:
-Emacs Lisp, Lua, JavaScript...
+it knows to also understand R6RS, and adding new high-level compilers.
+We have JavaScript and Emacs Lisp mostly complete, but they could use
+some love; Lua would be nice as well, but whatever language it is
+that strikes your fancy would be welcome too.
+
+Compilers are for hacking, not for admiring or for complaining about.
+Get to it!