add assert-nargs-ee/locals instruction
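
Reviewer note, not part of the patch: a minimal sketch of how the arity check documented below shows up from Scheme. It assumes that the (assert-nargs-ee 0 1) prologue in the updated disassembly is what rejects calls with the wrong number of arguments. foo is the procedure from the manual's example; raises-error? is a hypothetical helper introduced only for this illustration.

```scheme
;; Same procedure as in the manual's disassembly example.
(define (foo a) (lambda (b) (list foo a b)))

;; Hypothetical helper: call THUNK and report whether it raised any error.
;; Catching #t avoids assuming which error key the VM uses.
(define (raises-error? thunk)
  (catch #t
    (lambda () (thunk) #f)
    (lambda (key . args) #t)))

;; `apply' is used so the wrong-arity calls are only detected at run time.
(raises-error? (lambda () (apply foo '(1))))    ; => #f, exactly one argument
(raises-error? (lambda () (apply foo '(1 2))))  ; => #t, too many arguments
(raises-error? (lambda () (apply foo '())))     ; => #t, too few arguments

;; The closure returned by foo still sees the captured value of `a'.
(cdr ((foo 1) 2))                               ; => (1 2)
```

The sketch only observes behavior at the Scheme level; it does not show how the VM implements the check.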

diff --git a/doc/ref/vm.texi b/doc/ref/vm.texi

index 49b420c..6a7a0a9 100644
--- a/doc/ref/vm.texi
+++ b/doc/ref/vm.texi
@@ -1,20 +1,20 @@
  @c -*-texinfo-*-
  @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  2008,2009
+@c Copyright (C)  2008,2009,2010
  @c   Free Software Foundation, Inc.
  @c See the file guile.texi for copying conditions.
  
  @node A Virtual Machine for Guile
  @section A Virtual Machine for Guile
  
-Guile has both an interpreter and a compiler. To a user, the
-difference is largely transparent---interpreted and compiled
-procedures can call each other as they please.
+Guile has both an interpreter and a compiler. To a user, the difference
+is transparent---interpreted and compiled procedures can call each other
+as they please.
  
  The difference is that the compiler creates and interprets bytecode
  for a custom virtual machine, instead of interpreting the
-S-expressions directly. Running compiled code is faster than running
-interpreted code.
+S-expressions directly. Loading and running compiled code is faster
+than loading and running source code.
  
  The virtual machine that does the bytecode interpretation is a part of
  Guile itself. This section describes the nature of Guile's virtual
@@ -33,21 +33,19 @@ machine.
  @subsection Why a VM?
  
  @cindex interpreter
-@cindex evaluator
-For a long time, Guile only had an interpreter, called the
-@dfn{evaluator}. Guile's evaluator operates directly on the
-S-expression representation of Scheme source code.
+For a long time, Guile only had an interpreter. Guile's interpreter
+operated directly on the S-expression representation of Scheme source
+code.
  
-But while the evaluator is highly optimized and hand-tuned, and
-contains some extensive speed trickery (@pxref{Memoization}), it still
+But while the interpreter was highly optimized and hand-tuned, it still
  performs many needless computations during the course of evaluating an
  expression. For example, application of a function to arguments
-needlessly conses up the arguments in a list. Evaluation of an
-expression always has to figure out what the car of the expression is
--- a procedure, a memoized form, or something else. All values have to
-be allocated on the heap. Et cetera.
+needlessly consed up the arguments in a list. Evaluation of an
+expression always had to figure out what the car of the expression is --
+a procedure, a memoized form, or something else. All values had to be
+allocated on the heap. Et cetera.
  
-The solution to this problem is to compile the higher-level language,
+The solution to this problem was to compile the higher-level language,
  Scheme, into a lower-level language for which all of the checks and
  dispatching have already been done---the code is instead stripped to
  the bare minimum needed to ``do the job''.
@@ -71,7 +69,21 @@ for Guile (@code{cons}, @code{struct-ref}, etc.).
  So this is what Guile does. The rest of this section describes that VM
  that Guile implements, and the compiled procedures that run on it.
  
-Note that this decision to implement a bytecode compiler does not
+Before moving on, though, we should note that though we spoke of the
+interpreter in the past tense, Guile still has an interpreter. The
+difference is that before, it was Guile's main evaluator, and so was
+implemented in highly optimized C; now, it is actually implemented in
+Scheme, and compiled down to VM bytecode, just like any other program.
+(There is still a C interpreter around, used to bootstrap the compiler,
+but it is not normally used at runtime.)
+
+The upside of implementing the interpreter in Scheme is that we preserve
+tail calls and multiple-value handling between interpreted and compiled
+code. The downside is that the interpreter in Guile 2.0 is slower than
+the interpreter in 1.8. We hope that the compiler's speed makes up
+for the loss!
+
+Also note that this decision to implement a bytecode compiler does not
  preclude native compilation. We can compile from bytecode to native
  code at runtime, or even do ahead of time compilation. More
  possibilities are discussed in @ref{Extending the Compiler}.
@@ -79,12 +91,9 @@ possibilities are discussed in @ref{Extending the Compiler}.
  @node VM Concepts
  @subsection VM Concepts
  
-A virtual machine (VM) is a Scheme object. Users may create virtual
-machines using the standard procedures described later in this manual,
-but that is usually unnecessary, as Guile ensures that there is one
-virtual machine per thread. When a VM-compiled procedure is run, Guile
-looks up the virtual machine for the current thread and executes the
-procedure using that VM.
+Compiled code is run by a virtual machine (VM). Each thread has its own
+VM. When a compiled procedure is run, Guile looks up the virtual machine
+for the current thread and executes the procedure using that VM.
  
  Guile's virtual machine is a stack machine---that is, it has few
  registers, and the instructions defined in the VM operate by pushing
@@ -113,12 +122,6 @@ the ``program counter'' (pc). This set of registers is pretty typical
  for stack machines; their exact meanings in the context of Guile's VM
  are described in the next section.
  
-A virtual machine executes by loading a compiled procedure, and
-executing the object code associated with that procedure. Of course,
-that procedure may call other procedures, tail-call others, ad
-infinitum---indeed, within a guile whose modules have all been
-compiled to object code, one might never leave the virtual machine.
-
  @c wingo: The following is true, but I don't know in what context to
  @c describe it. A documentation FIXME.
  
@@ -134,7 +137,7 @@ compiled to object code, one might never leave the virtual machine.
  @subsection Stack Layout
  
  While not strictly necessary to understand how to work with the VM, it
-is instructive and sometimes entertaining to consider the struture of
+is instructive and sometimes entertaining to consider the structure of
  the VM stack.
  
  Logically speaking, a VM stack is composed of ``frames''. Each frame
@@ -159,18 +162,19 @@ The structure of the fixed part of an application frame is as follows:
  
  @example
               Stack
-   |                  | <- fp + bp->nargs + bp->nlocs + 4
-   +------------------+    = SCM_FRAME_UPPER_ADDRESS (fp)
-   | Return address   |
-   | MV return address|
-   | Dynamic link     |
-   | External link    | <- fp + bp->nargs + bp->nlocs
-   | Local variable 1 |    = SCM_FRAME_DATA_ADDRESS (fp)
+   | ...              |
+   | Intermed. val. 0 | <- fp + bp->nargs + bp->nlocs = SCM_FRAME_UPPER_ADDRESS (fp)
+   +==================+
+   | Local variable 1 |
     | Local variable 0 | <- fp + bp->nargs
     | Argument 1       |
     | Argument 0       | <- fp
     | Program          | <- fp - 1
-   +------------------+    = SCM_FRAME_LOWER_ADDRESS (fp)
+   +------------------+    
+   | Return address   |
+   | MV return address|
+   | Dynamic link     | <- fp - 4 = SCM_FRAME_DATA_ADDRESS (fp) = SCM_FRAME_LOWER_ADDRESS (fp)
+   +==================+
     |                  |
  @end example
  
@@ -201,25 +205,17 @@ values being returned.
  @item Dynamic link
  This is the @code{fp} in effect before this program was applied. In
  effect, this and the return address are the registers that are always
-``saved''.
-
-@item External link
-This field is a reference to the list of heap-allocated variables
-associated with this frame. For a discussion of heap versus stack
-allocation, @xref{Variables and the VM}.
+``saved''. The dynamic link links the current frame to the previous
+frame; computing a stack trace involves traversing these frames.
  
  @item Local variable @var{n}
-Lambda-local variables that are allocated on the stack are all
-allocated as part of the frame. This makes access to non-captured,
-non-mutated variables very cheap.
+Lambda-local variables are all allocated as part of the frame.
+This makes access to variables very cheap.
  
  @item Argument @var{n}
  The calling convention of the VM requires arguments of a function
-application to be pushed on the stack, and here they are. Normally
-references to arguments dispatch to these locations on the stack.
-However if an argument has to be stored on the heap, it will be copied
-from its initial value here onto a location in the heap, and
-thereafter only referenced on the heap.
+application to be pushed on the stack, and here they are. References
+to arguments dispatch to these locations on the stack.
  
  @item Program
  This is the program being applied. For more information on how
@@ -236,26 +232,44 @@ Consider the following Scheme code as an example:
      (lambda (b) (list foo a b)))
  @end example
  
-Within the lambda expression, "foo" is a top-level variable, "a" is a
-lexically captured variable, and "b" is a local variable.
-
-@code{b} may safely be allocated on the stack, as there is no enclosed
-procedure that references it, nor is it ever mutated.
-
-@code{a}, on the other hand, is referenced by an enclosed procedure,
-that of the lambda. Thus it must be allocated on the heap, as it may
-(and will) outlive the dynamic extent of the invocation of @code{foo}.
-
-@code{foo} is a top-level variable, because it names the procedure
-@code{foo}, which is here defined at the top-level.
-
-Note that variables that are mutated (via @code{set!}) must be
-allocated on the heap, even if they are local variables. This is
-because any called subprocedure might capture the continuation, which
-would need to capture locations instead of values. Thus perhaps
-counterintuitively, what would seem ``closer to the metal'', viz
-@code{set!}, actually forces heap allocation instead of stack
-allocation.
+Within the lambda expression, @code{foo} is a top-level variable, @code{a} is a
+lexically captured variable, and @code{b} is a local variable.
+
+Another way to refer to @code{a} and @code{b} is to say that @code{a}
+is a ``free'' variable, since it is not defined within the lambda, and
+@code{b} is a ``bound'' variable. These are the terms used in the
+@dfn{lambda calculus}, a mathematical notation for describing
+functions. The lambda calculus is useful because it allows one to
+prove statements about functions. It is especially good at describing
+scope relations, and it is for that reason that we mention it here.
+
+Guile allocates all variables on the stack. When a lexically enclosed
+procedure with free variables---a @dfn{closure}---is created, it copies
+those variables into its free variable vector. References to free
+variables are then redirected through the free variable vector.
+
+If a variable is ever @code{set!}, however, it will need to be
+heap-allocated instead of stack-allocated, so that different closures
+that capture the same variable can see the same value. Also, this
+allows continuations to capture a reference to the variable, instead
+of to its value at one point in time. For these reasons, @code{set!}
+variables are allocated in ``boxes''---actually, in variable cells.
+@xref{Variables}, for more information. References to @code{set!}
+variables are indirected through the boxes.
+
+Thus perhaps counterintuitively, what would seem ``closer to the
+metal'', viz @code{set!}, actually forces an extra memory allocation
+and indirection.
+
+Going back to our example, @code{b} may be allocated on the stack, as
+it is never mutated.
+
+@code{a} may also be allocated on the stack, as it too is never
+mutated. Within the enclosed lambda, its value will be copied into
+(and referenced from) the free variables vector.
+
+@code{foo} is a top-level variable, because @code{foo} is not
+lexically bound in this example.
  
  @node VM Programs
  @subsection Compiled Procedures are VM Programs
@@ -295,48 +309,50 @@ We can see how these concepts tie together by disassembling the
  @smallexample
  scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
  scheme@@(guile-user)> ,x foo
-Disassembly of #<program foo (a)>:
+Disassembly of #<procedure foo (a)>:
  
-   0    (local-ref 0)                   ;; `a' (arg)
-   2    (external-set 0)                ;; `a' (arg)
-   4    (object-ref 1)                  ;; #<program b70d2910 at <unknown port>:0:16 (b)>
-   6    (make-closure)                  
-   7    (return)                        
+   0    (assert-nargs-ee 0 1)           
+   3    (reserve-locals 0 1)            
+   6    (object-ref 1)                  ;; #<procedure 85bfec0 at <current input>:0:16 (b)>
+   8    (local-ref 0)                   ;; `a'
+  10    (make-closure 0 1)              
+  13    (return)                        
  
  ----------------------------------------
-Disassembly of #<program b70d2910 at <unknown port>:0:16 (b)>:
-
-   0    (toplevel-ref 1)                ;; `foo'
-   2    (external-ref 0)                ;; (closure variable)
-   4    (local-ref 0)                   ;; `b' (arg)
-   6    (list 0 3)                      ;; 3 elements         at (unknown file):0:28
-   9    (return)                        
+Disassembly of #<procedure 85bfec0 at <current input>:0:16 (b)>:
+
+   0    (assert-nargs-ee 0 1)           
+   3    (reserve-locals 0 1)            
+   6    (toplevel-ref 1)                ;; `foo'
+   8    (free-ref 0)                    ;; (closure variable)
+  10    (local-ref 0)                   ;; `b'
+  12    (list 0 3)                      ;; 3 elements         at (unknown file):0:28
+  15    (return)                        
  @end smallexample
  
-At @code{ip} 0 and 2, we do the copy from argument to heap for
-@code{a}. @code{Ip} 4 loads up the compiled lambda, and then at
-@code{ip} 6 we make a closure---binding code (from the compiled
-lambda) with data (the heap-allocated variables). Finally we return
-the closure.
-
-The second stanza disassembles the compiled lambda. Toplevel variables
-are resolved relative to the module that was current when the
-procedure was created. This lookup occurs lazily, at the first time
-the variable is actually referenced, and the location of the lookup is
-cached so that future references are very cheap. @xref{Environment
-Control Instructions}, for more details.
-
-Then we see a reference to an external variable, corresponding to
-@code{a}. The disassembler doesn't have enough information to give a
-name to that variable, so it just marks it as being a ``closure
-variable''. Finally we see the reference to @code{b}, then the
-@code{list} opcode, an inline implementation of the @code{list} scheme
-routine.
+First there's some prelude, where @code{foo} checks that it was called with
+exactly 1 argument. Then at @code{ip} 6, we load up the compiled lambda. @code{Ip} 8
+loads up @code{a}, so that it can be captured into a closure at @code{ip}
+10---binding code (from the compiled lambda) with data (the free-variable
+vector). Finally we return the closure.
+
+The second stanza disassembles the compiled lambda. After the prelude, we note
+that toplevel variables are resolved relative to the module that was current
+when the procedure was created. This lookup occurs lazily, at the first time the
+variable is actually referenced, and the location of the lookup is cached so
+that future references are very cheap. @xref{Top-Level Environment Instructions},
+for more details.
+
+Then we see a reference to a free variable, corresponding to @code{a}. The
+disassembler doesn't have enough information to give a name to that variable, so
+it just marks it as being a ``closure variable''. Finally we see the reference
+to @code{b}, then the @code{list} opcode, an inline implementation of the
+@code{list} scheme routine.
  
  @node Instruction Set
  @subsection Instruction Set
  
-There are about 100 instructions in Guile's virtual machine. These
+There are about 180 instructions in Guile's virtual machine. These
  instructions represent atomic units of a program's execution. Ideally,
  they perform one task without conditional branches, then dispatch to
  the next instruction in the stream.
@@ -368,24 +384,36 @@ is not concerned with making a minimal, orthogonal set of
  instructions. More instructions may be added over time.
  
  @menu
-* Environment Control Instructions::  
+* Lexical Environment Instructions::  
+* Top-Level Environment Instructions::  
+* Procedure Call and Return Instructions::  
+* Function Prologue Instructions::  
+* Trampoline Instructions::  
  * Branch Instructions::         
+* Data Constructor Instructions::   
  * Loading Instructions::  
-* Procedural Instructions::  
-* Data Control Instructions::   
+* Dynamic Environment Instructions::  
  * Miscellaneous Instructions::  
  * Inlined Scheme Instructions::  
  * Inlined Mathematical Instructions::  
+* Inlined Bytevector Instructions::  
  @end menu
  
-@node Environment Control Instructions
-@subsubsection Environment Control Instructions
  
-These instructions access and mutate the environment of a compiled
-procedure---the local bindings, the ``external'' bindings, and the
-toplevel bindings.
+@node Lexical Environment Instructions
+@subsubsection Lexical Environment Instructions
+
+These instructions access and mutate the lexical environment of a
+compiled procedure---its free and bound variables.
+
+Some of these instructions have @code{long-} variants, the difference
+being that they take 16-bit arguments, encoded in big-endian order,
+instead of the normal 8-bit range.
+
+@xref{Stack Layout}, for more information on the format of stack frames.
  
  @deffn Instruction local-ref index
+@deffnx Instruction long-local-ref index
  Push onto the stack the value of the local variable located at
  @var{index} within the current stack frame.
  
@@ -395,34 +423,107 @@ arguments.
  @end deffn
  
  @deffn Instruction local-set index
+@deffnx Instruction long-local-set index
  Pop the Scheme object located on top of the stack and make it the new
  value of the local variable located at @var{index} within the current
  stack frame.
  @end deffn
  
-@deffn Instruction external-ref index
-Push the value of the closure variable located at position
-@var{index} within the program's list of external variables.
+@deffn Instruction box index
+Pop a value off the stack, and set the @var{index}nth local variable
+to a box containing that value. A shortcut for @code{make-variable}
+then @code{local-set}, used when binding boxed variables.
  @end deffn
  
-@deffn Instruction external-set index
-Pop the Scheme object located on top of the stack and make it the new
-value of the closure variable located at @var{index} within the
-program's list of external variables.
+@deffn Instruction empty-box index
+Set the @var{index}th local variable to a box containing a variable
+whose value is unbound. Used when compiling some @code{letrec}
+expressions.
+@end deffn
+
+@deffn Instruction local-boxed-ref index
+@deffnx Instruction local-boxed-set index
+Get or set the value of the variable located at @var{index} within the
+current stack frame. A shortcut for @code{local-ref} then
+@code{variable-ref} or @code{variable-set}, respectively.
+@end deffn
+
+@deffn Instruction free-ref index
+Push the value of the captured variable located at position
+@var{index} within the program's vector of captured variables.
  @end deffn
  
-The external variable lookup algorithm should probably be made more
-efficient in the future via addressing by frame and index. Currently,
-external variables are all consed onto a list, which results in O(N)
-lookup time.
+@deffn Instruction free-boxed-ref index
+@deffnx Instruction free-boxed-set index
+Get or set a boxed free variable. A shortcut for @code{free-ref} then
+@code{variable-ref} or @code{variable-set}, respectively.
+
+Note that there is no @code{free-set} instruction, as variables that are
+@code{set!} must be boxed.
+@end deffn
+
+@deffn Instruction make-closure num-free-vars
+Pop @var{num-free-vars} values and a program object off the stack in
+that order, and push a new program object closing over the given free
+variables. @var{num-free-vars} is encoded as a two-byte big-endian
+value.
+
+The free variables are stored in an array, inline to the new program
+object, in the order that they were on the stack (not the order they are
+popped off). The new closure shares state with the original program. At
+the time of this writing, the space overhead of closures is 3 words,
+plus one word for each free variable.
+@end deffn
+
+@deffn Instruction fix-closure index
+Fix up the free variables array of the closure stored in the
+@var{index}th local variable. @var{index} is a two-byte big-endian
+integer.
+
+This instruction will pop as many values from the stack as are in the
+corresponding closure's free variables array. The topmost value on the
+stack will be stored as the closure's last free variable, with other
+values filling in free variable slots in order.
+
+@code{fix-closure} is part of a hack for allocating mutually recursive
+procedures. The hack is to store the procedures in their corresponding
+local variable slots, with space already allocated for free variables.
+Then once they are all in place, this instruction fixes up their
+procedures' free variable bindings in place. This allows most
+@code{letrec}-bound procedures to be allocated unboxed on the stack.
+@end deffn
+
+@deffn Instruction local-bound? index
+@deffnx Instruction long-local-bound? index
+Push @code{#t} on the stack if the @var{index}th local variable has
+been assigned, or @code{#f} otherwise. Mostly useful for handling
+optional arguments in procedure prologues.
+@end deffn
+
+
+@node Top-Level Environment Instructions
+@subsubsection Top-Level Environment Instructions
+
+These instructions access values in the top-level environment: bindings
+that were not lexically apparent at the time that the code in question
+was compiled.
+
+The location in which a toplevel binding is stored can be looked up once
+and cached for later. The binding itself may change over time, but its
+location will stay constant.
+
+Currently only toplevel references within procedures are cached, as only
+procedures have a place to cache them, in their object tables.
  
  @deffn Instruction toplevel-ref index
+@deffnx Instruction long-toplevel-ref index
  Push the value of the toplevel binding whose location is stored in at
-position @var{index} in the object table.
+position @var{index} in the current procedure's object table. The
+@code{long-} variant encodes the index over two bytes.
  
-Initially, a cell in the object table that is used by
-@code{toplevel-ref} is initialized to one of two forms. The normal
-case is that the cell holds a symbol, whose binding will be looked up
+Initially, a cell in a procedure's object table that is used by
+@code{toplevel-ref} is initialized to one of two forms. The normal case
+is that the cell holds a symbol, whose binding will be looked up
  relative to the module that was current when the current program was
  created.
  
@@ -444,20 +545,27 @@ variable has been successfully resolved.
  This instruction pushes the value of the variable onto the stack.
  @end deffn
  
-@deffn Instruction toplevel-ref index
+@deffn Instruction toplevel-set index
+@deffnx Instruction long-toplevel-set index
  Pop a value off the stack, and set it as the value of the toplevel
  variable stored at @var{index} in the object table. If the variable
  has not yet been looked up, we do the lookup as in
  @code{toplevel-ref}.
  @end deffn
  
+@deffn Instruction define
+Pop a symbol and a value from the stack, in that order. Look up the symbol's
+binding in the current toplevel environment, creating the binding if
+necessary. Set the variable to the value.
+@end deffn
+
  @deffn Instruction link-now
  Pop a value, @var{x}, from the stack. Look up the binding for @var{x},
  according to the rules for @code{toplevel-ref}, and push that variable
  on the stack. If the lookup fails, an error will be signalled.
  
  This instruction is mostly used when loading programs, because it can
-do toplevel variable lookups without an object vector.
+do toplevel variable lookups without an object table.
  @end deffn
  
  @deffn Instruction variable-ref
@@ -470,217 +578,100 @@ Pop off two objects from the stack, a variable and a value, and set
  the variable to the value.
  @end deffn
  
-@deffn Instruction object-ref n
-Push @var{n}th value from the current program's object vector.
-@end deffn
-
-@node Branch Instructions
-@subsubsection Branch Instructions
-
-All the conditional branch instructions described below work in the
-same way:
-
-@itemize
-@item They pop off the Scheme object located on the stack and use it as
-the branch condition;
-@item If the condition is true, then the instruction pointer is
-increased by the offset passed as an argument to the branch
-instruction;
-@item Program execution proceeds with the next instruction (that is,
-the one to which the instruction pointer points).
-@end itemize
-
-Note that the offset passed to the instruction is encoded on two 8-bit
-integers which are then combined by the VM as one 16-bit integer.
-
-@deffn Instruction br offset
-Jump to @var{offset}.
-@end deffn
-
-@deffn Instruction br-if offset
-Jump to @var{offset} if the condition on the stack is not false.
-@end deffn
-
-@deffn Instruction br-if-not offset
-Jump to @var{offset} if the condition on the stack is false.
-@end deffn
-
-@deffn Instruction br-if-eq offset
-Jump to @var{offset} if the two objects located on the stack are
-equal in the sense of @var{eq?}.  Note that, for this instruction, the
-stack pointer is decremented by two Scheme objects instead of only
-one.
-@end deffn
-
-@deffn Instruction br-if-not-eq offset
-Same as @var{br-if-eq} for non-@code{eq?} objects.
-@end deffn
-
-@deffn Instruction br-if-null offset
-Jump to @var{offset} if the object on the stack is @code{'()}.
-@end deffn
-
-@deffn Instruction br-if-not-null offset
-Jump to @var{offset} if the object on the stack is not @code{'()}.
-@end deffn
-
-
-@node Loading Instructions
-@subsubsection Loading Instructions
-
-In addition to VM instructions, an instruction stream may contain
-variable-length data embedded within it. This data is always preceded
-by special loading instructions, which interpret the data and advance
-the instruction pointer to the next VM instruction.
-
-All of these loading instructions have a @code{length} parameter,
-indicating the size of the embedded data, in bytes. The length itself
-may be encoded in 1, 2, or 4 bytes.
-
-@deffn Instruction load-integer length
-@deffnx Instruction load-unsigned-integer length
-Load a 32-bit integer or unsigned integer from the instruction stream.
-The bytes of the integer are read in order of decreasing significance
-(i.e., big-endian).
-@end deffn
-@deffn Instruction load-number length
-Load an arbitrary number from the instruction stream. The number is
-embedded in the stream as a string.
-@end deffn
-@deffn Instruction load-string length
-Load a string from the instruction stream.
-@end deffn
-@deffn Instruction load-symbol length
-Load a symbol from the instruction stream.
-@end deffn
-@deffn Instruction load-keyword length
-Load a keyword from the instruction stream.
-@end deffn
-
-@deffn Instruction define length
-Load a symbol from the instruction stream, and look up its binding in
-the current toplevel environment, creating the binding if necessary.
-Push the variable corresponding to the binding.
+@deffn Instruction variable-bound?
+Pop off the variable object from the top of the stack and push @code{#t} if
+it is bound, or @code{#f} otherwise. Mostly useful in procedure
+prologues for defining default values for boxed optional variables.
  @end deffn
  
-@deffn Instruction load-program
-Load bytecode from the instruction stream, and push a compiled
-procedure.
-
-This instruction pops one value from the stack: the program's object
-table, as a vector, or @code{#f} in the case that the program has no
-object table. A program that does not reference toplevel bindings and
-does not use @code{object-ref} does not need an object table.
-
-This instruction is unlike the rest of the loading instructions,
-because instead of parsing its data, it directly maps the instruction
-stream onto a C structure, @code{struct scm_objcode}. @xref{Bytecode
-and Objcode}, for more information.
-
-The resulting compiled procedure will not have any ``external''
-variables captured, so it may be loaded only once but used many times
-to create closures.
+@deffn Instruction make-variable
+Replace the top object on the stack with a variable containing it.
+Used in some circumstances when compiling @code{letrec} expressions.
  @end deffn
  
-Finally, while this instruction is not strictly a ``loading''
-instruction, it's useful to wind up the @code{load-program} discussion
-here:
  
-@deffn Instruction make-closure
-Pop the program object from the stack, capture the current set of
-``external'' variables, and assign those external variables to a copy
-of the program. Push the new program object, which shares state with
-the original program.
+@node Procedure Call and Return Instructions
+@subsubsection Procedure Call and Return Instructions
  
-At the time of this writing, the space overhead of closures is 4 words
-per closure.
-@end deffn
-
-@node Procedural Instructions
-@subsubsection Procedural Instructions
-
-@deffn Instruction return
-Free the program's frame, returning the top value from the stack to
-the current continuation. (The stack should have exactly one value on
-it.)
+@c something about the calling convention here?
  
-Specifically, the @code{sp} is decremented to one below the current
-@code{fp}, the @code{ip} is reset to the current return address, the
-@code{fp} is reset to the value of the current dynamic link, and then
-the top item on the stack (formerly the procedure being applied) is
-set to the returned value.
+@deffn Instruction new-frame
+Push a new frame on the stack, reserving space for the dynamic link,
+return address, and the multiple-values return address. The frame
+pointer is not yet updated, because the frame is not yet active -- it
+has to be patched by a @code{call} instruction to get the return
+address.
  @end deffn
  
  @deffn Instruction call nargs
  Call the procedure located at @code{sp[-nargs]} with the @var{nargs}
  arguments located from @code{sp[-nargs + 1]} to @code{sp[0]}.
  
-For compiled procedures, this instruction sets up a new stack frame,
-as described in @ref{Stack Layout}, and then dispatches to the first
-instruction in the called procedure, relying on the called procedure
-to return one value to the newly-created continuation. Because the new
-frame pointer will point to sp[-nargs + 1], the arguments don't have
-to be shuffled around -- they are already in place.
-
-For non-compiled procedures (continuations, primitives, and
-interpreted procedures), @code{call} will pop the procedure and
-arguments off the stack, and push the result of calling
-@code{scm_apply}.
+This instruction requires that a new frame be pushed on the stack before
+the procedure, via @code{new-frame}. @xref{Stack Layout}, for more
+information. It patches up that frame with the current @code{ip} as the
+return address, then dispatches to the first instruction in the called
+procedure, relying on the called procedure to return one value to the
+newly-created continuation. Because the new frame pointer will point to
+@code{sp[-nargs + 1]}, the arguments don't have to be shuffled around --
+they are already in place.
  @end deffn
  
-@deffn Instruction goto/args nargs
-Like @code{call}, but reusing the current continuation. This
-instruction implements tail calls as required by RnRS.
+@deffn Instruction tail-call nargs
+Transfer control to the procedure located at @code{sp[-nargs]} with the
+@var{nargs} arguments located from @code{sp[-nargs + 1]} to
+@code{sp[0]}.
  
-For compiled procedures, that means that @code{goto/args} reuses the
-current frame instead of building a new one. The @code{goto/*}
-instruction family is named as it is because tail calls are equivalent
-to @code{goto}, along with relabeled variables.
-
-For non-VM procedures, the result is the same, but the current VM
-invocation remains on the C stack. True tail calls are not currently
-possible between compiled and non-compiled procedures.
+Unlike @code{call}, which requires a new frame to be pushed onto the
+stack, @code{tail-call} simply shuffles down the procedure and arguments
+to the current stack frame. This instruction implements tail calls as
+required by RnRS.
  @end deffn
  
  @deffn Instruction apply nargs
-@deffnx Instruction goto/apply nargs
-Like @code{call} and @code{goto/args}, except that the top item on the
+@deffnx Instruction tail-apply nargs
+Like @code{call} and @code{tail-call}, except that the top item on the
  stack must be a list. The elements of that list are then pushed on the
  stack and treated as additional arguments, replacing the list itself,
  then the procedure is invoked as usual.
  @end deffn
  
  @deffn Instruction call/nargs
-@deffnx Instruction goto/nargs
-These are like @code{call} and @code{goto/args}, except they take the
+@deffnx Instruction tail-call/nargs
+These are like @code{call} and @code{tail-call}, except they take the
  number of arguments from the stack instead of the instruction stream.
  These instructions are used in the implementation of multiple value
  returns, where the actual number of values is pushed on the stack.
  @end deffn
  
-@deffn Instruction call/cc
-@deffnx Instruction goto/cc
-Capture the current continuation, and then call (or tail-call) the
-procedure on the top of the stack, with the continuation as the
-argument.
-
-Both the VM continuation and the C continuation are captured.
-@end deffn
-
  @deffn Instruction mv-call nargs offset
  Like @code{call}, except that a multiple-value continuation is created
  in addition to a single-value continuation.
  
-The offset (a two-byte value) is an offset within the instruction
-stream; the multiple-value return address in the new frame
-(@pxref{Stack Layout}) will be set to the normal return address plus
-this offset. Instructions at that offset will expect the top value of
-the stack to be the number of values, and below that values
-themselves, pushed separately.
+The offset (a three-byte value) is an offset within the instruction
+stream; the multiple-value return address in the new frame (@pxref{Stack
+Layout}) will be set to the normal return address plus this offset.
+Instructions at that offset will expect the top value of the stack to be
+the number of values, and below that the values themselves, pushed
+separately.
+@end deffn
+
+@deffn Instruction return
+Free the program's frame, returning the top value from the stack to
+the current continuation. (The stack should have exactly one value on
+it.)
+
+Specifically, the @code{sp} is decremented to one below the current
+@code{fp}, the @code{ip} is reset to the current return address, the
+@code{fp} is reset to the value of the current dynamic link, and then
+the returned value is pushed on the stack.
  @end deffn
  
  @deffn Instruction return/values nvalues
-Return the top @var{nvalues} to the current continuation.
+@deffnx Instruction return/nvalues
+Return the top @var{nvalues} to the current continuation. In the case of
+@code{return/nvalues}, @var{nvalues} itself is first popped from the top
+of the stack.
  
  If the current continuation is a multiple-value continuation,
  @code{return/values} pushes the number of values on the stack, then
@@ -713,11 +704,247 @@ be 1 (to indicate that one of the bindings was a rest argument).
  Signals an error if there is an insufficient number of values.
  @end deffn
  
-@node Data Control Instructions
-@subsubsection Data Control Instructions
+@deffn Instruction call/cc
+@deffnx Instruction tail-call/cc
+Capture the current continuation, and then call (or tail-call) the
+procedure on the top of the stack, with the continuation as the
+argument.
+
+@code{call/cc} does not require a @code{new-frame} to be pushed on the
+stack, as @code{call} does, because it needs to capture the stack
+before the frame is pushed.
+
+Both the VM continuation and the C continuation are captured.
+@end deffn
+
+
+@node Function Prologue Instructions
+@subsubsection Function Prologue Instructions
+
+A function call in Guile is very cheap: the VM simply hands control to
+the procedure. The procedure itself is responsible for asserting that it
+has been passed an appropriate number of arguments. This strategy allows
+arbitrarily complex argument parsing idioms to be developed, without
+harming the common case.
+
+For example, only calls to keyword-argument procedures ``pay'' for the
+cost of parsing keyword arguments. (At the time of this writing, calling
+procedures with keyword arguments is typically two to four times as
+costly as calling procedures with a fixed set of arguments.)
+
+@deffn Instruction assert-nargs-ee n
+@deffnx Instruction assert-nargs-ge n
+Assert that the current procedure has been passed exactly @var{n}
+arguments, for the @code{-ee} case, or @var{n} or more arguments, for
+the @code{-ge} case. @var{n} is encoded over two bytes.
+
+The number of arguments is determined by subtracting the frame pointer
+from the stack pointer (@code{sp - (fp - 1)}). @xref{Stack Layout}, for
+more details on stack frames.
+@end deffn
+
+@deffn Instruction br-if-nargs-ne n offset
+@deffnx Instruction br-if-nargs-gt n offset
+@deffnx Instruction br-if-nargs-lt n offset
+Jump to @var{offset} if the number of arguments is not equal to, greater
+than, or less than @var{n}. @var{n} is encoded over two bytes, and
+@var{offset} has the normal three-byte encoding.
  
-These instructions push simple immediate values onto the stack, or
-manipulate lists and vectors on the stack.
+These instructions are used to implement multiple arities, as in
+@code{case-lambda}. @xref{Case-lambda}, for more information.
+@end deffn
+
+@deffn Instruction bind-optionals n
+If the procedure has been called with fewer than @var{n} arguments, fill
+in the remaining arguments with an unbound value (@code{SCM_UNDEFINED}).
+@var{n} is encoded over two bytes.
+
+The optionals can be later initialized conditionally via the
+@code{local-bound?} instruction.
+@end deffn
+
+@deffn Instruction push-rest n
+Pop off excess arguments (more than @var{n}), collecting them into a
+list, and push that list. Used to bind a rest argument, if the procedure
+has no keyword arguments. Procedures with keyword arguments use
+@code{bind-rest} instead.
+@end deffn
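As an illustration of the rest-argument collection described for @code{push-rest}, here is a minimal Python model. The function name `push_rest` and the use of a Python list for both the argument slots and the rest list are assumptions made for this sketch; the real instruction works on the VM stack.

```python
def push_rest(args, n):
    """Model of push-rest: keep the first n arguments in place and
    replace everything beyond them with a single list (the rest argument).

    `args` is a hypothetical stand-in for the argument slots of the frame.
    """
    positional = args[:n]   # the n arguments that stay where they are
    rest = args[n:]         # excess arguments, collected into one list
    return positional + [rest]


if __name__ == "__main__":
    print(push_rest([1, 2, 3, 4], 2))
    print(push_rest([1, 2], 2))
```

When there are no excess arguments, the rest slot holds the empty list, matching how a Scheme rest argument is bound to @code{'()}.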
+
+@deffn Instruction bind-rest n idx
+Pop off excess arguments (more than @var{n}), collecting them into a
+list. The list is then assigned to the @var{idx}th local variable.
+@end deffn
+
+@deffn Instruction bind-optionals/shuffle nreq nreq-and-opt ntotal
+Shuffle keyword arguments to the top of the stack, filling in the holes
+with @code{SCM_UNDEFINED}. Each argument is encoded over two bytes.
+
+This instruction is used by procedures with keyword arguments.
+@var{nreq} is the number of required arguments to the procedure, and
+@var{nreq-and-opt} is the total number of positional arguments (required
+plus optional). @code{bind-optionals/shuffle} will scan the stack from
+the @var{nreq}th argument up to the @var{nreq-and-opt}th, and start
+shuffling when it sees the first keyword argument or runs out of
+positional arguments.
+
+Shuffling simply moves the keyword arguments past the total number of
+arguments, @var{ntotal}, which includes keyword and rest arguments. The
+free slots created by the shuffle are filled in with
+@code{SCM_UNDEFINED}, so they may be conditionally initialized later in
+the function's prologue.
+@end deffn
+
+@deffn Instruction bind-kwargs idx ntotal flags
+Parse keyword arguments, assigning their values to the corresponding
+local variables. The keyword arguments should already have been shuffled
+above the @var{ntotal}th stack slot by @code{bind-optionals/shuffle}.
+
+The parsing is driven by a keyword arguments association list, looked up
+from the @var{idx}th element of the procedure's object array. The alist
+is a list of pairs of the form @code{(@var{kw} . @var{index})}, mapping
+keyword arguments to their local variable indices.
+
+There are two bitflags that affect the parser, @code{allow-other-keys?}
+(@code{0x1}) and @code{rest?} (@code{0x2}). Unless
+@code{allow-other-keys?} is set, the parser will signal an error if an
+unknown key is found. If @code{rest?} is set, errors parsing the
+keyword arguments will be ignored, as a later @code{bind-rest}
+instruction will collect all of the tail arguments, including the
+keywords, into a list. Otherwise if the keyword arguments are invalid,
+an error is signalled.
+
+@var{idx} and @var{ntotal} are encoded over two bytes each, and
+@var{flags} is encoded over one byte.
+@end deffn
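The interaction of the two flags can be sketched with a small Python model. Everything here is hypothetical scaffolding: keywords are modelled as strings starting with `#:`, the alist as a dict, and the local variables as a list. In particular, this sketch simply stops parsing when it meets malformed input while @code{rest?} is set; that is an assumption about what "ignored" means, not a statement about the VM's exact behaviour.

```python
ALLOW_OTHER_KEYS = 0x1  # allow-other-keys? bit, as given in the text
REST = 0x2              # rest? bit, as given in the text


def is_keyword(obj):
    # Assumption for this sketch: keywords are strings such as "#:foo".
    return isinstance(obj, str) and obj.startswith("#:")


def bind_kwargs(kw_args, kw_alist, local_slots, flags):
    """Assign keyword values from the flat list kw_args into local_slots.

    kw_alist maps each known keyword to its local variable index.
    """
    i = 0
    while i < len(kw_args):
        key = kw_args[i]
        if not is_keyword(key) or i + 1 >= len(kw_args):
            if flags & REST:
                return  # a later bind-rest collects the tail arguments
            raise ValueError("invalid keyword argument list")
        if key in kw_alist:
            local_slots[kw_alist[key]] = kw_args[i + 1]
        elif not (flags & ALLOW_OTHER_KEYS):
            raise ValueError("unknown keyword: " + key)
        i += 2


if __name__ == "__main__":
    slots = [None] * 4
    bind_kwargs(["#:a", 1, "#:b", 2], {"#:a": 2, "#:b": 3}, slots, 0)
    print(slots)
```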
+
+@deffn Instruction reserve-locals n
+Resets the stack pointer to have space for @var{n} local variables,
+including the arguments. If this operation increments the stack pointer,
+as in a push, the new slots are filled with @code{SCM_UNBOUND}. If this
+operation decrements the stack pointer, any excess values are dropped.
+
+@code{reserve-locals} is typically used after argument parsing to
+reserve space for local variables.
+@end deffn
+
+@deffn Instruction assert-nargs-ee/locals n
+@deffnx Instruction assert-nargs-ge/locals n
+A combination of @code{assert-nargs-ee} and @code{reserve-locals}. The
+number of arguments is encoded in the lower three bits of @var{n}, a
+one-byte value. The number of additional local variables is taken from
+the upper five bits of @var{n}.
+@end deffn
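The bit layout of the one-byte operand can be illustrated with a short Python sketch. The helper names `pack_nargs_locals` and `unpack_nargs_locals` are invented for this example; only the layout (argument count in the low three bits, additional locals in the high five bits) comes from the description above.

```python
def pack_nargs_locals(nargs, nlocals):
    """Combine an argument count and a local count into one byte."""
    if not 0 <= nargs <= 7:
        raise ValueError("nargs must fit in three bits")
    if not 0 <= nlocals <= 31:
        raise ValueError("nlocals must fit in five bits")
    return (nlocals << 3) | nargs


def unpack_nargs_locals(n):
    """Split the one-byte operand back into (nargs, nlocals)."""
    if not 0 <= n <= 0xFF:
        raise ValueError("operand must be a single byte")
    return n & 0x7, n >> 3


if __name__ == "__main__":
    operand = pack_nargs_locals(3, 2)
    print(operand, unpack_nargs_locals(operand))
```

One consequence of this layout is that the combined form can only describe procedures with at most 7 arguments and at most 31 additional locals; anything larger needs the separate instructions.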
+
+
+@node Trampoline Instructions
+@subsubsection Trampoline Instructions
+
+Though most applicable objects in Guile are procedures implemented
+in bytecode, not all are. There are primitives, continuations, and other
+procedure-like objects that have their own calling convention. Instead
+of adding special cases to the @code{call} instruction, Guile wraps
+these other applicable objects in VM trampoline procedures, then
+provides special support for these objects in bytecode.
+
+Trampoline procedures are typically generated by Guile at runtime, for
+example in response to a call to @code{scm_c_make_gsubr}. As such, a
+compiler probably shouldn't emit code with these instructions. However,
+it's still interesting to know how these things work, so we document
+these trampoline instructions here.
+
+@deffn Instruction subr-call nargs
+Pop off a foreign pointer (which should have been pushed on by the
+trampoline), and call it directly, with the @var{nargs} arguments from
+the stack. Return the resulting value or values to the calling
+procedure.
+@end deffn
+
+@deffn Instruction foreign-call nargs
+Pop off an internal foreign object (which should have been pushed on by
+the trampoline), and call that foreign function with the @var{nargs}
+arguments from the stack. Return the resulting value to the calling
+procedure.
+@end deffn
+
+@deffn Instruction smob-call nargs
+Pop off the smob object from the stack (which should have been pushed on
+by the trampoline), and call its descriptor's @code{apply} function with
+the @var{nargs} arguments from the stack. Return the resulting value or
+values to the calling procedure.
+@end deffn
+
+@deffn Instruction continuation-call
+Pop off an internal continuation object (which should have been pushed
+on by the trampoline), and reinstate that continuation. All of the
+procedure's arguments are passed to the continuation. Does not return.
+@end deffn
+
+@deffn Instruction partial-cont-call
+Pop off two objects from the stack: the dynamic winds associated with
+the partial continuation, and the VM continuation object. Unroll the
+continuation onto the stack, rewinding the dynamic environment and
+overwriting the current frame, and pass all arguments to the
+continuation. Control flow proceeds where the continuation was captured.
+@end deffn
+
+
+@node Branch Instructions
+@subsubsection Branch Instructions
+
+All the conditional branch instructions described below work in the
+same way:
+
+@itemize
+@item They pop off Scheme object(s) located on the stack for use in the
+branch condition;
+@item If the condition is true, then the instruction pointer is
+increased by the offset passed as an argument to the branch
+instruction;
+@item Program execution proceeds with the next instruction (that is,
+the one to which the instruction pointer points).
+@end itemize
+
+Note that the offset passed to the instruction is encoded as three 8-bit
+integers, in big-endian order, effectively giving Guile a 24-bit
+relative address space.
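To make the three-byte, big-endian offset encoding concrete, here is a hedged Python sketch. The function names are made up for this example. The text above does not say whether the 24-bit value is signed; the second helper treats it as two's complement so that backward jumps are representable, and that interpretation is an assumption.

```python
def decode_offset(b0, b1, b2):
    """Assemble three bytes, most significant first, into a 24-bit value."""
    return (b0 << 16) | (b1 << 8) | b2


def decode_signed_offset(b0, b1, b2):
    """Same as decode_offset, but read the result as two's complement.

    Assumption: the signedness of the offset is not stated in the text.
    """
    raw = decode_offset(b0, b1, b2)
    if raw & (1 << 23):
        return raw - (1 << 24)
    return raw


if __name__ == "__main__":
    print(decode_offset(0x00, 0x01, 0x02))
    print(decode_signed_offset(0xFF, 0xFF, 0xFE))
```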
+
+@deffn Instruction br offset
+Jump to @var{offset}. No values are popped.
+@end deffn
+
+@deffn Instruction br-if offset
+Jump to @var{offset} if the object on the stack is not false.
+@end deffn
+
+@deffn Instruction br-if-not offset
+Jump to @var{offset} if the object on the stack is false.
+@end deffn
+
+@deffn Instruction br-if-eq offset
+Jump to @var{offset} if the two objects located on the stack are
+equal in the sense of @code{eq?}.  Note that, for this instruction, the
+stack pointer is decremented by two Scheme objects instead of only
+one.
+@end deffn
+
+@deffn Instruction br-if-not-eq offset
+Same as @code{br-if-eq} for non-@code{eq?} objects.
+@end deffn
+
+@deffn Instruction br-if-null offset
+Jump to @var{offset} if the object on the stack is @code{'()}.
+@end deffn
+
+@deffn Instruction br-if-not-null offset
+Jump to @var{offset} if the object on the stack is not @code{'()}.
+@end deffn
+
+
+@node Data Constructor Instructions
+@subsubsection Data Constructor Instructions
+
+These instructions push simple immediate values onto the stack,
+or construct compound data structures from values on the stack.
  
  @deffn Instruction make-int8 value
  Push @var{value}, an 8-bit integer, onto the stack.
@@ -735,6 +962,17 @@ Push the immediate value @code{1} onto the stack.
  Push @var{value}, a 16-bit integer, onto the stack.
  @end deffn
  
+@deffn Instruction make-uint64 value
+Push @var{value}, an unsigned 64-bit integer, onto the stack. The
+value is encoded in 8 bytes, most significant byte first (big-endian).
+@end deffn
+
+@deffn Instruction make-int64 value
+Push @var{value}, a signed 64-bit integer, onto the stack. The value
+is encoded in 8 bytes, most significant byte first (big-endian), in
+twos-complement arithmetic.
+@end deffn
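The 8-byte, big-endian encodings used by @code{make-uint64} and @code{make-int64} can be reproduced with Python's built-in integer conversions. This is only a sketch of the byte layout described above; the helper names are hypothetical and nothing here touches Guile's own assembler.

```python
def encode_int64(value):
    """Encode a signed 64-bit integer, most significant byte first."""
    return value.to_bytes(8, "big", signed=True)


def decode_int64(data):
    """Decode 8 big-endian bytes as a two's-complement signed integer."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(data, "big", signed=True)


def decode_uint64(data):
    """Decode 8 big-endian bytes as an unsigned integer."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return int.from_bytes(data, "big", signed=False)


if __name__ == "__main__":
    print(encode_int64(-1).hex())
    print(decode_uint64(encode_int64(-1)))
```

The same eight bytes decode to different numbers depending on the instruction, which is why the signed and unsigned variants are separate.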
+
  @deffn Instruction make-false
  Push @code{#f} onto the stack.
  @end deffn
@@ -743,6 +981,10 @@ Push @code{#f} onto the stack.
  Push @code{#t} onto the stack.
  @end deffn
  
+@deffn Instruction make-nil
+Push @code{#nil} onto the stack.
+@end deffn
+
  @deffn Instruction make-eol
  Push @code{'()} onto the stack.
  @end deffn
@@ -751,6 +993,19 @@ Push @code{'()} onto the stack.
  Push @var{value}, an 8-bit character, onto the stack.
  @end deffn
  
+@deffn Instruction make-char32 value
+Push @var{value}, a 32-bit character, onto the stack. The value is
+encoded in big-endian order.
+@end deffn
+
+@deffn Instruction make-symbol
+Pops a string off the stack, and pushes a symbol.
+@end deffn
+
+@deffn Instruction make-keyword value
+Pops a symbol off the stack, and pushes a keyword.
+@end deffn
+
  @deffn Instruction list n
  Pops off the top @var{n} values off of the stack, consing them up into
  a list, then pushes that list on the stack. What was the topmost value
@@ -764,37 +1019,180 @@ popping off those values and pushing on the resulting vector. @var{n}
  is a two-byte value, like in @code{vector}.
  @end deffn
  
-@deffn Instruction mark
-Pushes a special value onto the stack that other stack instructions
-like @code{list-mark} can use.
+@deffn Instruction make-struct n
+Make a new struct from the top @var{n} values on the stack. The values
+are popped, and the new struct is pushed.
+
+The deepest value is used as the vtable for the struct, and the rest are
+used in order as the field initializers. Tail arrays are not supported
+by this instruction.
+@end deffn
+
+@deffn Instruction make-array n
+Pop an array shape from the stack, then pop the remaining @var{n}
+values, pushing a new array. @var{n} is encoded over three bytes.
+
+The array shape should be appropriate to store @var{n} values.
+@xref{Array Procedures}, for more information on array shapes.
+@end deffn
+
+Many of these data structures are constant, never changing over the
+course of the different invocations of the procedure. In that case it is
+often advantageous to make them once when the procedure is created, and
+just reference them from the object table thereafter. @xref{Variables
+and the VM}, for more information on the object table.
+
+@deffn Instruction object-ref n
+@deffnx Instruction long-object-ref n
+Push the @var{n}th value from the current program's object vector. The
+``long'' variant has a 16-bit index instead of an 8-bit index.
+@end deffn
+
+
+@node Loading Instructions
+@subsubsection Loading Instructions
+
+In addition to VM instructions, an instruction stream may contain
+variable-length data embedded within it. This data is always preceded
+by special loading instructions, which interpret the data and advance
+the instruction pointer to the next VM instruction.
+
+All of these loading instructions have a @code{length} parameter,
+indicating the size of the embedded data, in bytes. The length itself
+is encoded in 3 bytes.
+
+@deffn Instruction load-number length
+Load an arbitrary number from the instruction stream. The number is
+embedded in the stream as a string.
+@end deffn
+@deffn Instruction load-string length
+Load a string from the instruction stream. The string is assumed to be
+encoded in the ``latin1'' locale.
+@end deffn
+@deffn Instruction load-wide-string length
+Load a UTF-32 string from the instruction stream. @var{length} is the
+length in bytes, not in codepoints.
+@end deffn
+@deffn Instruction load-symbol length
+Load a symbol from the instruction stream. The symbol is assumed to be
+encoded in the ``latin1'' locale. Symbols backed by wide strings may
+be loaded via @code{load-wide-string} then @code{make-symbol}.
+@end deffn
+@deffn Instruction load-array length
+Load a uniform array from the instruction stream. The shape and type
+of the array are popped off the stack, in that order.
+@end deffn
+
+@deffn Instruction load-program
+Load bytecode from the instruction stream, and push a compiled
+procedure.
+
+This instruction pops one value from the stack: the program's object
+table, as a vector, or @code{#f} in the case that the program has no
+object table. A program that does not reference toplevel bindings and
+does not use @code{object-ref} does not need an object table.
+
+This instruction is unlike the rest of the loading instructions,
+because instead of parsing its data, it directly maps the instruction
+stream onto a C structure, @code{struct scm_objcode}. @xref{Bytecode
+and Objcode}, for more information.
+
+The resulting compiled procedure will not have any free variables
+captured, so it may be loaded only once but used many times to create
+closures.
+@end deffn
+
+@node Dynamic Environment Instructions
+@subsubsection Dynamic Environment Instructions
+
+Guile's virtual machine has low-level support for @code{dynamic-wind},
+dynamic binding, and composable prompts and aborts.
+
+@deffn Instruction wind
+Pop an unwind thunk and a wind thunk from the stack, in that order, and
+push them onto the ``dynamic stack''. The unwind thunk will be called on
+nonlocal exits, and the wind thunk on reentries. Used to implement
+@code{dynamic-wind}.
+
+Note that neither thunk is actually called; the compiler should emit
+calls to wind and unwind for the normal dynamic-wind control flow.
+@xref{Dynamic Wind}.
  @end deffn
  
-@deffn Instruction list-mark
-Create a list from values from the stack, as in @code{list}, but
-instead of knowing beforehand how many there will be, keep going until
-we see a @code{mark} value.
+@deffn Instruction unwind
+Pop off the top entry from the ``dynamic stack'', for example, a
+wind/unwind thunk pair. @code{unwind} instructions should be properly
+paired with their winding instructions, like @code{wind}.
  @end deffn
  
-@deffn Instruction cons-mark
-As the scheme procedure @code{cons*} is to the scheme procedure
-@code{list}, so the instruction @code{cons-mark} is to the instruction
-@code{list-mark}.
+@deffn Instruction wind-fluids n
+Pop off @var{n} values and @var{n} fluids from the stack, in that order.
+Set the fluids to the values by creating a with-fluids object and
+pushing that object on the dynamic stack. @xref{Fluids and Dynamic
+States}.
  @end deffn
  
-@deffn Instruction vector-mark
-Like @code{list-mark}, but makes a vector instead of a list.
+@deffn Instruction unwind-fluids
+Pop a with-fluids object from the dynamic stack, and swap the current
+values of its fluids with the saved values of its fluids. In this way,
+the dynamic environment is left as it was before the corresponding
+@code{wind-fluids} instruction was processed.
  @end deffn
  
-@deffn Instruction list-break
-The opposite of @code{list}: pops a value, which should be a list, and
-pushes its elements on the stack.
+@deffn Instruction fluid-ref
+Pop a fluid from the stack, and push its current value.
+@end deffn
+
+@deffn Instruction fluid-set
+Pop a value and a fluid from the stack, in that order, and set the fluid
+to the value.
+@end deffn
+
+@deffn Instruction prompt escape-only? offset
+Establish a dynamic prompt. @xref{Prompts}, for more information on
+prompts.
+
+The prompt will be pushed on the dynamic stack. The normal control flow
+should ensure that the prompt is popped off at the end, via
+@code{unwind}.
+
+If an abort is made to this prompt, control will jump to @var{offset}, a
+three-byte relative address. The continuation and all arguments to the
+abort will be pushed on the stack, along with the total number of
+arguments (including the continuation). If control returns to the
+handler, the prompt is already popped off by the abort mechanism.
+(Guile's @code{prompt} implements Felleisen's @dfn{--F--} operator.)
+
+If @var{escape-only?} is nonzero, the prompt will be marked as
+escape-only, which allows an abort to this prompt to avoid reifying the
+continuation.
+@end deffn
+
+@deffn Instruction abort n
+Abort to a dynamic prompt.
+
+This instruction pops one tail argument list, @var{n} arguments, and a
+prompt tag from the stack. The dynamic environment is then searched for
+a prompt having the given tag. If none is found, an error is signalled.
+Otherwise all arguments are passed to the prompt's handler, along with
+the captured continuation, if necessary.
+
+If the prompt's handler can be proven to not reference the captured
+continuation, no continuation is allocated. This decision happens
+dynamically, at run-time; the general case is that the continuation may
+be captured, and thus resumed. A reinstated continuation will have its
+arguments pushed on the stack, along with the number of arguments, as in
+the multiple-value return convention. Therefore an @code{abort}
+instruction should be followed by code ready to handle the equivalent of
+a multiply-valued return.
  @end deffn
  
  @node Miscellaneous Instructions
  @subsubsection Miscellaneous Instructions
  
  @deffn Instruction nop
-Does nothing!
+Does nothing! Used for padding other instructions to certain
+alignments.
  @end deffn
  
  @deffn Instruction halt
@@ -855,11 +1253,18 @@ stream.
  @deffnx Instruction list? x
  @deffnx Instruction set-car! pair x
  @deffnx Instruction set-cdr! pair x
-@deffnx Instruction slot-ref struct n
-@deffnx Instruction slot-set struct n x
  @deffnx Instruction cons x y
  @deffnx Instruction car x
  @deffnx Instruction cdr x
+@deffnx Instruction vector-ref x y
+@deffnx Instruction vector-set x n y
+@deffnx Instruction struct? x
+@deffnx Instruction struct-ref x n
+@deffnx Instruction struct-set x n v
+@deffnx Instruction struct-vtable x
+@deffnx Instruction class-of x
+@deffnx Instruction slot-ref struct n
+@deffnx Instruction slot-set struct n x
  Inlined implementations of their Scheme equivalents.
  @end deffn
  
@@ -880,7 +1285,9 @@ As in the previous section, the definitions below show stack
  parameters instead of instruction stream parameters.
  
  @deffn Instruction add x y
+@deffnx Instruction add1 x
  @deffnx Instruction sub x y
+@deffnx Instruction sub1 x
  @deffnx Instruction mul x y
  @deffnx Instruction div x y
  @deffnx Instruction quo x y
@@ -891,5 +1298,64 @@ parameters instead of instruction stream parameters.
  @deffnx Instruction gt? x y
  @deffnx Instruction le? x y
  @deffnx Instruction ge? x y
+@deffnx Instruction ash x n
+@deffnx Instruction logand x y
+@deffnx Instruction logior x y
+@deffnx Instruction logxor x y
  Inlined implementations of the corresponding mathematical operations.
  @end deffn
+
+@node Inlined Bytevector Instructions
+@subsubsection Inlined Bytevector Instructions
+
+Bytevector operations correspond closely to what the current hardware
+can do, so it makes sense to inline them to VM instructions, providing
+a clear path for eventual native compilation. Without this, Scheme
+programs would need other primitives for accessing raw bytes -- but
+these primitives are as good as any.
+
+As in the previous section, the definitions below show stack
+parameters instead of instruction stream parameters.
+
+The multibyte formats (@code{u16}, @code{f64}, etc) take an extra
+endianness argument. Only aligned native accesses are currently
+fast-pathed in Guile's VM.
+
+@deffn Instruction bv-u8-ref bv n
+@deffnx Instruction bv-s8-ref bv n
+@deffnx Instruction bv-u16-native-ref bv n
+@deffnx Instruction bv-s16-native-ref bv n
+@deffnx Instruction bv-u32-native-ref bv n
+@deffnx Instruction bv-s32-native-ref bv n
+@deffnx Instruction bv-u64-native-ref bv n
+@deffnx Instruction bv-s64-native-ref bv n
+@deffnx Instruction bv-f32-native-ref bv n
+@deffnx Instruction bv-f64-native-ref bv n
+@deffnx Instruction bv-u16-ref bv n endianness
+@deffnx Instruction bv-s16-ref bv n endianness
+@deffnx Instruction bv-u32-ref bv n endianness
+@deffnx Instruction bv-s32-ref bv n endianness
+@deffnx Instruction bv-u64-ref bv n endianness
+@deffnx Instruction bv-s64-ref bv n endianness
+@deffnx Instruction bv-f32-ref bv n endianness
+@deffnx Instruction bv-f64-ref bv n endianness
+@deffnx Instruction bv-u8-set bv n val
+@deffnx Instruction bv-s8-set bv n val
+@deffnx Instruction bv-u16-native-set bv n val
+@deffnx Instruction bv-s16-native-set bv n val
+@deffnx Instruction bv-u32-native-set bv n val
+@deffnx Instruction bv-s32-native-set bv n val
+@deffnx Instruction bv-u64-native-set bv n val
+@deffnx Instruction bv-s64-native-set bv n val
+@deffnx Instruction bv-f32-native-set bv n val
+@deffnx Instruction bv-f64-native-set bv n val
+@deffnx Instruction bv-u16-set bv n val endianness
+@deffnx Instruction bv-s16-set bv n val endianness
+@deffnx Instruction bv-u32-set bv n val endianness
+@deffnx Instruction bv-s32-set bv n val endianness
+@deffnx Instruction bv-u64-set bv n val endianness
+@deffnx Instruction bv-s64-set bv n val endianness
+@deffnx Instruction bv-f32-set bv n val endianness
+@deffnx Instruction bv-f64-set bv n val endianness
+Inlined implementations of the corresponding bytevector operations.
+@end deffn