doc/ref/vm.texi

   1 @c -*-texinfo-*-
   2 @c This is part of the GNU Guile Reference Manual.
   3 @c Copyright (C)  2008,2009
   4 @c   Free Software Foundation, Inc.
   5 @c See the file guile.texi for copying conditions.
   6
   7 @node A Virtual Machine for Guile
   8 @section A Virtual Machine for Guile
   9
  10 Guile has both an interpreter and a compiler. To a user, the
  11 difference is largely transparent---interpreted and compiled
  12 procedures can call each other as they please.
  13
  14 The difference is that the compiler creates and interprets bytecode
  15 for a custom virtual machine, instead of interpreting the
  16 S-expressions directly. Running compiled code is faster than running
  17 interpreted code.
  18
  19 The virtual machine that does the bytecode interpretation is a part of
  20 Guile itself. This section describes the nature of Guile's virtual
  21 machine.
  22
  23 @menu
  24 * Why a VM?::
  25 * VM Concepts::
  26 * Stack Layout::
  27 * Variables and the VM::
  28 * VM Programs::
  29 * Instruction Set::
  30 @end menu
  31
  32 @node Why a VM?
  33 @subsection Why a VM?
  34
  35 @cindex interpreter
  36 @cindex evaluator
  37 For a long time, Guile only had an interpreter, called the
  38 @dfn{evaluator}. Guile's evaluator operates directly on the
  39 S-expression representation of Scheme source code.
  40
  41 But while the evaluator is highly optimized and hand-tuned, and
  42 contains some extensive speed trickery (@pxref{Memoization}), it still
  43 performs many needless computations during the course of evaluating an
  44 expression. For example, application of a function to arguments
  45 needlessly conses up the arguments in a list. Evaluation of an
  46 expression always has to figure out what the car of the expression is
  47 -- a procedure, a memoized form, or something else. All values have to
  48 be allocated on the heap. Et cetera.
  49
  50 The solution to this problem is to compile the higher-level language,
  51 Scheme, into a lower-level language for which all of the checks and
  52 dispatching have already been done---the code is instead stripped to
  53 the bare minimum needed to ``do the job''.
  54
  55 The question becomes then, what low-level language to choose? There
  56 are many options. We could compile to native code directly, but that
  57 poses portability problems for Guile, as it is a highly cross-platform
  58 project.
  59
  60 So we want the performance gains that compilation provides, but we
  61 also want to maintain the portability benefits of a single code path.
  62 The obvious solution is to compile to a virtual machine that is
  63 present on all Guile installations.
  64
  65 The easiest (and most fun) way to depend on a virtual machine is to
  66 implement the virtual machine within Guile itself. This way the
  67 virtual machine provides what Scheme needs (tail calls, multiple
  68 values, @code{call/cc}) and can provide optimized inline instructions
  69 for Guile (@code{cons}, @code{struct-ref}, etc.).
  70
  71 So this is what Guile does. The rest of this section describes that VM
  72 that Guile implements, and the compiled procedures that run on it.
  73
  74 Note that this decision to implement a bytecode compiler does not
  75 preclude native compilation. We can compile from bytecode to native
  76 code at runtime, or even do ahead of time compilation. More
  77 possibilities are discussed in @ref{Extending the Compiler}.
  78
  79 @node VM Concepts
  80 @subsection VM Concepts
  81
  82 A virtual machine (VM) is a Scheme object. Users may create virtual
  83 machines using the standard procedures described later in this manual,
  84 but that is usually unnecessary, as Guile ensures that there is one
  85 virtual machine per thread. When a VM-compiled procedure is run, Guile
  86 looks up the virtual machine for the current thread and executes the
  87 procedure using that VM.
  88
  89 Guile's virtual machine is a stack machine---that is, it has few
  90 registers, and the instructions defined in the VM operate by pushing
  91 and popping values from a stack.
  92
  93 Stack memory is exclusive to the virtual machine that owns it. In
  94 addition to their stacks, virtual machines also have access to the
  95 global memory (modules, global bindings, etc) that is shared among
  96 other parts of Guile, including other VMs.
  97
  98 A VM has generic instructions, such as those to reference local
  99 variables, and instructions designed to support Guile's languages --
 100 mathematical instructions that support the entire numerical tower, an
 101 inlined implementation of @code{cons}, etc.
 102
 103 The registers that a VM has are as follows:
 104
 105 @itemize
 106 @item ip - Instruction pointer
 107 @item sp - Stack pointer
 108 @item fp - Frame pointer
 109 @end itemize
 110
 111 In other architectures, the instruction pointer is sometimes called
 112 the ``program counter'' (pc). This set of registers is pretty typical
 113 for stack machines; their exact meanings in the context of Guile's VM
 114 is described in the next section.
 115
 116 A virtual machine executes by loading a compiled procedure, and
 117 executing the object code associated with that procedure. Of course,
 118 that procedure may call other procedures, tail-call others, ad
 119 infinitum---indeed, within a guile whose modules have all been
 120 compiled to object code, one might never leave the virtual machine.
 121
 122 @c wingo: I wish the following were true, but currently we just use
 123 @c the one engine. This kind of thing is possible tho.
 124
 125 @c A VM may have one of three engines: reckless, regular, or debugging.
 126 @c Reckless engine is fastest but dangerous.  Regular engine is normally
 127 @c fail-safe and reasonably fast.  Debugging engine is safest and
 128 @c functional but very slow.
 129
 130 @node Stack Layout
 131 @subsection Stack Layout
 132
 133 While not strictly necessary to understand how to work with the VM, it
 134 is instructive and sometimes entertaining to consider the struture of
 135 the VM stack.
 136
 137 Logically speaking, a VM stack is composed of ``frames''. Each frame
 138 corresponds to the application of one compiled procedure, and contains
 139 storage space for arguments, local variables, intermediate values, and
 140 some bookkeeping information (such as what to do after the frame
 141 computes its value).
 142
 143 While the compiler is free to do whatever it wants to, as long as the
 144 semantics of a computation are preserved, in practice every time you
 145 call a function, a new frame is created. (The notable exception of
 146 course is the tail call case, @pxref{Tail Calls}.)
 147
 148 Within a frame, you have the data associated with the function
 149 application itself, which is of a fixed size, and the stack space for
 150 intermediate values. Sometimes only the former is referred to as the
 151 ``frame'', and the latter is the ``stack'', although all pending
 152 application frames can have some intermediate computations interleaved
 153 on the stack.
 154
 155 The structure of the fixed part of an application frame is as follows:
 156
 157 @example
 158              Stack
 159    |                  | <- fp + bp->nargs + bp->nlocs + 4
 160    +------------------+    = SCM_FRAME_UPPER_ADDRESS (fp)
 161    | Return address   |
 162    | MV return address|
 163    | Dynamic link     |
 164    | External link    | <- fp + bp->nargs + bp->nlocs
 165    | Local variable 1 |    = SCM_FRAME_DATA_ADDRESS (fp)
 166    | Local variable 0 | <- fp + bp->nargs
 167    | Argument 1       |
 168    | Argument 0       | <- fp
 169    | Program          | <- fp - 1
 170    +------------------+    = SCM_FRAME_LOWER_ADDRESS (fp)
 171    |                  |
 172 @end example
 173
 174 In the above drawing, the stack grows upward. The intermediate values
 175 stored in the application of this frame are stored above
 176 @code{SCM_FRAME_UPPER_ADDRESS (fp)}. @code{bp} refers to the
 177 @code{struct scm_program*} data associated with the program at
 178 @code{fp - 1}. @code{nargs} and @code{nlocs} are properties of the
 179 compiled procedure, which will be discussed later.
 180
 181 The individual fields of the frame are as follows:
 182
 183 @table @asis
 184 @item Return address
 185 The @code{ip} that was in effect before this program was applied. When
 186 we return from this activation frame, we will jump back to this
 187 @code{ip}.
 188
 189 @item MV return address
 190 The @code{ip} to return to if this application returns multiple
 191 values. For continuations that only accept one value, this value will
 192 be @code{NULL}; for others, it will be an @code{ip} that points to a
 193 multiple-value return address in the calling code. That code will
 194 expect the top value on the stack to be an integer---the number of
 195 values being returned---and that below that integer there are the
 196 values being returned.
 197
 198 @item Dynamic link
 199 This is the @code{fp} in effect before this program was applied. In
 200 effect, this and the return address are the registers that are always
 201 ``saved''.
 202
 203 @item External link
 204 This field is a reference to the list of heap-allocated variables
 205 associated with this frame. For a discussion of heap versus stack
 206 allocation, @xref{Variables and the VM}.
 207
 208 @item Local variable @var{n}
 209 Lambda-local variables that are allocated on the stack are all
 210 allocated as part of the frame. This makes access to non-captured,
 211 non-mutated variables very cheap.
 212
 213 @item Argument @var{n}
 214 The calling convention of the VM requires arguments of a function
 215 application to be pushed on the stack, and here they are. Normally
 216 references to arguments dispatch to these locations on the stack.
 217 However if an argument has to be stored on the heap, it will be copied
 218 from its initial value here onto a location in the heap, and
 219 thereafter only referenced on the heap.
 220
 221 @item Program
 222 This is the program being applied. For more information on how
 223 programs are implemented, @xref{VM Programs}.
 224 @end table
 225
 226 @node Variables and the VM
 227 @subsection Variables and the VM
 228
 229 Let's think about the following Scheme code as an example:
 230
 231 @example
 232   (define (foo a)
 233     (lambda (b) (list foo a b)))
 234 @end example
 235
 236 Within the lambda expression, "foo" is a top-level variable, "a" is a
 237 lexically captured variable, and "b" is a local variable.
 238
 239 That is to say: @code{b} may safely be allocated on the stack, as
 240 there is no enclosed procedure that references it, nor is it ever
 241 mutated.
 242
 243 @code{a}, on the other hand, is referenced by an enclosed procedure,
 244 that of the lambda. Thus it must be allocated on the heap, as it may
 245 (and will) outlive the dynamic extent of the invocation of @code{foo}.
 246
 247 @code{foo} is a toplevel variable, as mandated by Scheme's semantics:
 248
 249 @example
 250   (define proc (foo 'bar)) ; assuming prev. definition of @code{foo}
 251   (define foo 42)          ; redefinition
 252   (proc 'baz)
 253   @result{} (42 bar baz)
 254 @end example
 255
 256 Note that variables that are mutated (via @code{set!}) must be
 257 allocated on the heap, even if they are local variables. This is
 258 because any called subprocedure might capture the continuation, which
 259 would need to capture locations instead of values. Thus perhaps
 260 counterintuitively, what would seem ``closer to the metal'', viz
 261 @code{set!}, actually forces heap allocation instead of stack
 262 allocation.
 263
 264 @node VM Programs
 265 @subsection Compiled Procedures are VM Programs
 266
 267 By default, when you enter in expressions at Guile's REPL, they are
 268 first compiled to VM object code, then that VM object code is executed
 269 to produce a value. If the expression evaluates to a procedure, the
 270 result of this process is a compiled procedure.
 271
 272 A compiled procedure is a compound object, consisting of its bytecode,
 273 a reference to any captured lexical variables, an object array, and
 274 some metadata such as the procedure's arity, name, and documentation.
 275 You can pick apart these pieces with the accessors in @code{(system vm
 276 program)}. @xref{Compiled Procedures}, for a full API reference.
 277
 278 @cindex object table
 279 The object array of a compiled procedure, also known as the
 280 @dfn{object table}, holds all Scheme objects whose values are known
 281 not to change across invocations of the procedure: constant strings,
 282 symbols, etc. The object table of a program is initialized right
 283 before a program is loaded with @code{load-program}.
 284 @xref{Loading Instructions}, for more information.
 285
 286 Variable objects are one such type of constant object: when a global
 287 binding is defined, a variable object is associated to it and that
 288 object will remain constant over time, even if the value bound to it
 289 changes. Therefore, toplevel bindings only need to be looked up once.
 290 Thereafter, references to the corresponding toplevel variables from
 291 within the program are then performed via the @code{toplevel-ref}
 292 instruction, which uses the object vector, and are almost as fast as
 293 local variable references.
 294
 295 We can see how these concepts tie together by disassembling the
 296 @code{foo} function to see what is going on:
 297
 298 @smallexample
 299 scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
 300 scheme@@(guile-user)> ,x foo
 301 Disassembly of #<program foo (a)>:
 302
 303 Bytecode:
 304
 305    0    (local-ref 0)                   ;; `a' (arg)
 306    2    (external-set 0)                ;; `a' (arg)
 307    4    (object-ref 0)                  ;; #<program #(0 28 #f) (b)>
 308    6    (make-closure)                                        at (unknown file):0:16
 309    7    (return)
 310
 311 ----------------------------------------
 312 Disassembly of #<program #(0 28 #f) (b)>:
 313
 314 Bytecode:
 315
 316    0    (toplevel-ref 0)                ;; `list'
 317    2    (toplevel-ref 1)                ;; `foo'
 318    4    (external-ref 0)                ;; (closure variable)
 319    6    (local-ref 0)                   ;; `b' (arg)
 320    8    (goto/args 3)                                         at (unknown file):0:28
 321 @end smallexample
 322
 323 At @code{ip} 0 and 2, we do the copy from argument to heap for
 324 @code{a}. @code{Ip} 4 loads up the compiled lambda, and then at
 325 @code{ip} 6 we make a closure---binding code (from the compiled
 326 lambda) with data (the heap-allocated variables). Finally we return
 327 the closure.
 328
 329 The second stanza disassembles the compiled lambda. Toplevel variables
 330 are resolved relative to the module that was current when the
 331 procedure was created. This lookup occurs lazily, at the first time
 332 the variable is actually referenced, and the location of the lookup is
 333 cached so that future references are very cheap. @xref{Environment
 334 Control Instructions}, for more details.
 335
 336 Then we see a reference to an external variable, corresponding to
 337 @code{a}. The disassembler doesn't have enough information to give a
 338 name to that variable, so it just marks it as being a ``closure
 339 variable''. Finally we see the reference to @code{b}, then a tail call
 340 (@code{goto/args}) with three arguments.
 341
 342 @node Instruction Set
 343 @subsection Instruction Set
 344
 345 There are about 100 instructions in Guile's virtual machine. These
 346 instructions represent atomic units of a program's execution. Ideally,
 347 they perform one task without conditional branches, then dispatch to
 348 the next instruction in the stream.
 349
 350 Instructions themselves are one byte long. Some instructions take
 351 parameters, which follow the instruction byte in the instruction
 352 stream.
 353
 354 Sometimes the compiler can figure out that it is compiling a special
 355 case that can be run more efficiently. So, for example, while Guile
 356 offers a generic test-and-branch instruction, it also offers specific
 357 instructions for special cases, so that the following cases all have
 358 their own test-and-branch instructions:
 359
 360 @example
 361 (if pred then else)
 362 (if (not pred) then else)
 363 (if (null? l) then else)
 364 (if (not (null? l)) then else)
 365 @end example
 366
 367 In addition, some Scheme primitives have their own inline
 368 implementations, e.g. @code{cons}.
 369
 370 So Guile's instruction set is a @emph{complete} instruction set, in
 371 that it provides the instructions that are suited to the problem, and
 372 is not concerned with making a minimal, orthogonal set of
 373 instructions. More instructions may be added over time.
 374
 375 @menu
 376 * Environment Control Instructions::
 377 * Branch Instructions::
 378 * Loading Instructions::
 379 * Procedural Instructions::
 380 * Data Control Instructions::
 381 * Miscellaneous Instructions::
 382 * Inlined Scheme Instructions::
 383 * Inlined Mathematical Instructions::
 384 @end menu
 385
 386 @node Environment Control Instructions
 387 @subsubsection Environment Control Instructions
 388
 389 These instructions access and mutate the environment of a compiled
 390 procedure---the local bindings, the ``external'' bindings, and the
 391 toplevel bindings.
 392
 393 @deffn Instruction local-ref index
 394 Push onto the stack the value of the local variable located at
 395 @var{index} within the current stack frame.
 396
 397 Note that arguments and local variables are all in one block. Thus the
 398 first argument, if any, is at index 0, and local bindings follow the
 399 arguments.
 400 @end deffn
 401
 402 @deffn Instruction local-set index
 403 Pop the Scheme object located on top of the stack and make it the new
 404 value of the local variable located at @var{index} within the current
 405 stack frame.
 406 @end deffn
 407
 408 @deffn Instruction external-ref index
 409 Push the value of the closure variable located at position
 410 @var{index} within the program's list of external variables.
 411 @end deffn
 412
 413 @deffn Instruction external-set index
 414 Pop the Scheme object located on top of the stack and make it the new
 415 value of the closure variable located at @var{index} within the
 416 program's list of external variables.
 417 @end deffn
 418
 419 The external variable lookup algorithm should probably be made more
 420 efficient in the future via addressing by frame and index. Currently,
 421 external variables are all consed onto a list, which results in O(N)
 422 lookup time.
 423
 424 @deffn Instruction externals
 425 Pushes the current list of external variables onto the stack. This
 426 instruction is used in the implementation of
 427 @code{compile-time-environment}. @xref{The Scheme Compiler}.
 428 @end deffn
 429
 430 @deffn Instruction toplevel-ref index
 431 Push the value of the toplevel binding whose location is stored in at
 432 position @var{index} in the object table.
 433
 434 Initially, a cell in the object table that is used by
 435 @code{toplevel-ref} is initialized to one of two forms. The normal
 436 case is that the cell holds a symbol, whose binding will be looked up
 437 relative to the module that was current when the current program was
 438 created.
 439
 440 Alternately, the lookup may be performed relative to a particular
 441 module, determined at compile-time (e.g. via @code{@@} or
 442 @code{@@@@}). In that case, the cell in the object table holds a list:
 443 @code{(@var{modname} @var{sym} @var{interface?})}. The symbol
 444 @var{sym} will be looked up in the module named @var{modname} (a list
 445 of symbols). The lookup will be performed against the module's public
 446 interface, unless @var{interface?} is @code{#f}, which it is for
 447 example when compiling @code{@@@@}.
 448
 449 In any case, if the symbol is unbound, an error is signalled.
 450 Otherwise the initial form is replaced with the looked-up variable, an
 451 in-place mutation of the object table. This mechanism provides for
 452 lazy variable resolution, and an important cached fast-path once the
 453 variable has been successfully resolved.
 454
 455 This instruction pushes the value of the variable onto the stack.
 456 @end deffn
 457
 458 @deffn Instruction toplevel-ref index
 459 Pop a value off the stack, and set it as the value of the toplevel
 460 variable stored at @var{index} in the object table. If the variable
 461 has not yet been looked up, we do the lookup as in
 462 @code{toplevel-ref}.
 463 @end deffn
 464
 465 @deffn Instruction link-now
 466 Pop a value, @var{x}, from the stack. Look up the binding for @var{x},
 467 according to the rules for @code{toplevel-ref}, and push that variable
 468 on the stack. If the lookup fails, an error will be signalled.
 469
 470 This instruction is mostly used when loading programs, because it can
 471 do toplevel variable lookups without an object vector.
 472 @end deffn
 473
 474 @deffn Instruction variable-ref
 475 Dereference the variable object which is on top of the stack and
 476 replace it by the value of the variable it represents.
 477 @end deffn
 478
 479 @deffn Instruction variable-set
 480 Pop off two objects from the stack, a variable and a value, and set
 481 the variable to the value.
 482 @end deffn
 483
 484 @deffn Instruction object-ref n
 485 Push @var{n}th value from the current program's object vector.
 486 @end deffn
 487
 488 @node Branch Instructions
 489 @subsubsection Branch Instructions
 490
 491 All the conditional branch instructions described below work in the
 492 same way:
 493
 494 @itemize
 495 @item They pop off the Scheme object located on the stack and use it as
 496 the branch condition;
 497 @item If the condition is true, then the instruction pointer is
 498 increased by the offset passed as an argument to the branch
 499 instruction;
 500 @item Program execution proceeds with the next instruction (that is,
 501 the one to which the instruction pointer points).
 502 @end itemize
 503
 504 Note that the offset passed to the instruction is encoded on two 8-bit
 505 integers which are then combined by the VM as one 16-bit integer.
 506
 507 @deffn Instruction br offset
 508 Jump to @var{offset}.
 509 @end deffn
 510
 511 @deffn Instruction br-if offset
 512 Jump to @var{offset} if the condition on the stack is not false.
 513 @end deffn
 514
 515 @deffn Instruction br-if-not offset
 516 Jump to @var{offset} if the condition on the stack is false.
 517 @end deffn
 518
 519 @deffn Instruction br-if-eq offset
 520 Jump to @var{offset} if the two objects located on the stack are
 521 equal in the sense of @var{eq?}.  Note that, for this instruction, the
 522 stack pointer is decremented by two Scheme objects instead of only
 523 one.
 524 @end deffn
 525
 526 @deffn Instruction br-if-not-eq offset
 527 Same as @var{br-if-eq} for non-@code{eq?} objects.
 528 @end deffn
 529
 530 @deffn Instruction br-if-null offset
 531 Jump to @var{offset} if the object on the stack is @code{'()}.
 532 @end deffn
 533
 534 @deffn Instruction br-if-not-null offset
 535 Jump to @var{offset} if the object on the stack is not @code{'()}.
 536 @end deffn
 537
 538
 539 @node Loading Instructions
 540 @subsubsection Loading Instructions
 541
 542 In addition to VM instructions, an instruction stream may contain
 543 variable-length data embedded within it. This data is always preceded
 544 by special loading instructions, which interpret the data and advance
 545 the instruction pointer to the next VM instruction.
 546
 547 All of these loading instructions have a @code{length} parameter,
 548 indicating the size of the embedded data, in bytes. The length itself
 549 may be encoded in 1, 2, or 4 bytes.
 550
 551 @deffn Instruction load-integer length
 552 @deffnx Instruction load-unsigned-integer length
 553 Load a 32-bit integer (respectively unsigned integer) from the
 554 instruction stream.
 555 @end deffn
 556 @deffn Instruction load-number length
 557 Load an arbitrary number from the instruction stream. The number is
 558 embedded in the stream as a string.
 559 @end deffn
 560 @deffn Instruction load-string length
 561 Load a string from the instruction stream.
 562 @end deffn
 563 @deffn Instruction load-symbol length
 564 Load a symbol from the instruction stream.
 565 @end deffn
 566 @deffn Instruction load-keyword length
 567 Load a keyword from the instruction stream.
 568 @end deffn
 569
 570 @deffn Instruction define length
 571 Load a symbol from the instruction stream, and look up its binding in
 572 the current toplevel environment, creating the binding if necessary.
 573 Push the variable corresponding to the binding.
 574 @end deffn
 575
 576 @deffn Instruction load-program length
 577 Load bytecode from the instruction stream, and push a compiled
 578 procedure. This instruction pops the following values from the stack:
 579
 580 @itemize
 581 @item Optionally, a thunk, which when called should return metadata
 582 associated with this program---for example its name, the names of its
 583 arguments, its documentation string, debugging information, etc.
 584
 585 Normally, this thunk its itself a compiled procedure (with no
 586 metadata). Metadata is represented this way so that the initial load
 587 of a procedure is fast: the VM just mmap's the thunk and goes. The
 588 symbols and pairs associated with the metadata are only created if the
 589 user asks for them.
 590
 591 For information on the format of the thunk's return value,
 592 @xref{Compiled Procedures}.
 593 @item Optionally, the program's object table, as a vector.
 594
 595 A program that does not reference toplevel bindings and does not use
 596 @code{object-ref} does not need an object table.
 597 @item Finally, either one immediate integer or four immediate integers
 598 representing the arity of the program.
 599
 600 In the four-fixnum case, the values are respectively the number of
 601 arguments taken by the function (@var{nargs}), the number of @dfn{rest
 602 arguments} (@var{nrest}, 0 or 1), the number of local variables
 603 (@var{nlocs}) and the number of external variables (@var{nexts})
 604 (@pxref{Environment Control Instructions}).
 605
 606 The common single-fixnum case represents all of these values within a
 607 16-bit bitmask.
 608 @end itemize
 609
 610 The resulting compiled procedure will not have any ``external''
 611 variables captured, so it will be loaded only once but may be used
 612 many times to create closures.
 613 @end deffn
 614
 615 Finally, while this instruction is not strictly a ``loading''
 616 instruction, it's useful to wind up the @code{load-program} discussion
 617 here:
 618
 619 @deffn Instruction make-closure
 620 Pop the program object from the stack, capture the current set of
 621 ``external'' variables, and assign those external variables to a copy
 622 of the program. Push the new program object, which shares state with
 623 the original program. Also captures the current module.
 624 @end deffn
 625
 626 @node Procedural Instructions
 627 @subsubsection Procedural Instructions
 628
 629 @deffn Instruction return
 630 Free the program's frame, returning the top value from the stack to
 631 the current continuation. (The stack should have exactly one value on
 632 it.)
 633
 634 Specifically, the @code{sp} is decremented to one below the current
 635 @code{fp}, the @code{ip} is reset to the current return address, the
 636 @code{fp} is reset to the value of the current dynamic link, and then
 637 the top item on the stack (formerly the procedure being applied) is
 638 set to the returned value.
 639 @end deffn
 640
 641 @deffn Instruction call nargs
 642 Call the procedure located at @code{sp[-nargs]} with the @var{nargs}
 643 arguments located from @code{sp[0]} to @code{sp[-nargs + 1]}.
 644
 645 For non-compiled procedures (continuations, primitives, and
 646 interpreted procedures), @code{call} will pop the procedure and
 647 arguments off the stack, and push the result of calling
 648 @code{scm_apply}.
 649
 650 For compiled procedures, this instruction sets up a new stack frame,
 651 as described in @ref{Stack Layout}, and then dispatches to the first
 652 instruction in the called procedure, relying on the called procedure
 653 to return one value to the newly-created continuation.
 654 @end deffn
 655
 656 @deffn Instruction goto/args nargs
 657 Like @code{call}, but reusing the current continuation. This
 658 instruction implements tail calling as required by RnRS.
 659
 660 For compiled procedures, that means that @code{goto/args} reuses the
 661 current frame instead of building a new one. The @code{goto/*}
 662 instruction family is named as it is because tail calls are equivalent
 663 to @code{goto}, along with relabeled variables.
 664
 665 For non-VM procedures, the result is the same, but the current VM
 666 invocation remains on the C stack. True tail calls are not currently
 667 possible between compiled and non-compiled procedures.
 668 @end deffn
 669
 670 @deffn Instruction apply nargs
 671 @deffnx Instruction goto/apply nargs
 672 Like @code{call} and @code{goto/args}, except that the top item on the
 673 stack must be a list. The elements of that list are then pushed on the
 674 stack and treated as additional arguments, replacing the list itself,
 675 then the procedure is invoked as usual.
 676 @end deffn
 677
 678 @deffn Instruction call/nargs
 679 @deffnx Instruction goto/nargs
 680 These are like @code{call} and @code{goto/args}, except they take the
 681 number of arguments from the stack instead of the instruction stream.
 682 These instructions are used in the implementation of multiple value
 683 returns, where the actual number of values is pushed on the stack.
 684 @end deffn
 685
 686 @deffn Instruction call/cc
 687 @deffnx Instruction goto/cc
 688 Capture the current continuation, and then call (or tail-call) the
 689 procedure on the top of the stack, with the continuation as the
 690 argument.
 691
 692 Both the VM continuation and the C continuation are captured.
 693 @end deffn
 694
 695 @deffn Instruction mv-call nargs offset
 696 Like @code{call}, except that a multiple-value continuation is created
 697 in addition to a single-value continuation.
 698
 699 The offset (a two-byte value) is an offset within the instruction
 700 stream; the multiple-value return address in the new frame
 701 (@pxref{Stack Layout}) will be set to the normal return address plus
 702 this offset. Instructions at that offset will expect the top value of
 703 the stack to be the number of values, and below that values
 704 themselves, pushed separately.
 705 @end deffn
 706
 707 @deffn Instruction return/values nvalues
 708 Return the top @var{nvalues} to the current continuation.
 709
 710 If the current continuation is a multiple-value continuation,
 711 @code{return/values} pushes the number of values on the stack, then
 712 returns as in @code{return}, but to the multiple-value return address.
 713
 714 Otherwise if the current continuation accepts only one value, i.e. the
 715 multiple-value return address is @code{NULL}, then we assume the user
 716 only wants one value, and we give them the first one. If there are no
 717 values, an error is signaled.
 718 @end deffn
 719
 720 @deffn Instruction return/values* nvalues
 721 Like a combination of @code{apply} and @code{return/values}, in which
 722 the top value on the stack is interpreted as a list of additional
 723 values. This is an optimization for the common @code{(apply values
 724 ...)} case.
 725 @end deffn
 726
 727 @deffn Instruction truncate-values nbinds nrest
 728 Used in multiple-value continuations, this instruction takes the
 729 values that are on the stack (including the number-of-value marker)
 730 and truncates them for a binding construct.
 731
 732 For example, a call to @code{(receive (x y . z) (foo) ...)} would,
 733 logically speaking, pop off the values returned from @code{(foo)} and
 734 push them as three values, corresponding to @code{x}, @code{y}, and
 735 @code{z}. In that case, @var{nbinds} would be 3, and @var{nrest} would
 736 be 1 (to indicate that one of the bindings was a rest arguments).
 737
 738 Signals an error if there is an insufficient number of values.
 739 @end deffn
 740
 741 @node Data Control Instructions
 742 @subsubsection Data Control Instructions
 743
 744 These instructions push simple immediate values onto the stack, or
 745 manipulate lists and vectors on the stack.
 746
 747 @deffn Instruction make-int8 value
 748 Push @var{value}, an 8-bit integer, onto the stack.
 749 @end deffn
 750
 751 @deffn Instruction make-int8:0
 752 Push the immediate value @code{0} onto the stack.
 753 @end deffn
 754
 755 @deffn Instruction make-int8:1
 756 Push the immediate value @code{1} onto the stack.
 757 @end deffn
 758
 759 @deffn Instruction make-int16 value
 760 Push @var{value}, a 16-bit integer, onto the stack.
 761 @end deffn
 762
 763 @deffn Instruction make-false
 764 Push @code{#f} onto the stack.
 765 @end deffn
 766
 767 @deffn Instruction make-true
 768 Push @code{#t} onto the stack.
 769 @end deffn
 770
 771 @deffn Instruction make-eol
 772 Push @code{'()} onto the stack.
 773 @end deffn
 774
 775 @deffn Instruction make-char8 value
 776 Push @var{value}, an 8-bit character, onto the stack.
 777 @end deffn
 778
 779 @deffn Instruction list n
 780 Pops off the top @var{n} values off of the stack, consing them up into
 781 a list, then pushes that list on the stack. What was the topmost value
 782 will be the last element in the list.
 783 @end deffn
 784
 785 @deffn Instruction vector n
 786 Create and fill a vector with the top @var{n} values from the stack,
 787 popping off those values and pushing on the resulting vector.
 788 @end deffn
 789
 790 @deffn Instruction mark
 791 Pushes a special value onto the stack that other stack instructions
 792 like @code{list-mark} can use.
 793 @end deffn
 794
 795 @deffn Instruction list-mark
 796 Create a list from values from the stack, as in @code{list}, but
 797 instead of knowing beforehand how many there will be, keep going until
 798 we see a @code{mark} value.
 799 @end deffn
 800
 801 @deffn Instruction cons-mark
 802 As the scheme procedure @code{cons*} is to the scheme procedure
 803 @code{list}, so the instruction @code{cons-mark} is to the instruction
 804 @code{list-mark}.
 805 @end deffn
 806
 807 @deffn Instruction vector-mark
 808 Like @code{list-mark}, but makes a vector instead of a list.
 809 @end deffn
 810
 811 @deffn Instruction list-break
 812 The opposite of @code{list}: pops a value, which should be a list, and
 813 pushes its elements on the stack.
 814 @end deffn
 815
 816 @node Miscellaneous Instructions
 817 @subsubsection Miscellaneous Instructions
 818
 819 @deffn Instruction nop
 820 Does nothing!
 821 @end deffn
 822
 823 @deffn Instruction halt
 824 Exits the VM, returning a SCM value. Normally, this instruction is
 825 only part of the ``bootstrap program'', a program run when a virtual
 826 machine is first entered; compiled Scheme procedures will not contain
 827 this instruction.
 828
 829 If multiple values have been returned, the SCM value will be a
 830 multiple-values object (@pxref{Multiple Values}).
 831 @end deffn
 832
 833 @deffn Instruction break
 834 Does nothing, but invokes the break hook.
 835 @end deffn
 836
 837 @deffn Instruction drop
 838 Pops off the top value from the stack, throwing it away.
 839 @end deffn
 840
 841 @deffn Instruction dup
 842 Re-pushes the top value onto the stack.
 843 @end deffn
 844
 845 @deffn Instruction void
 846 Pushes ``the unspecified value'' onto the stack.
 847 @end deffn
 848
 849 @node Inlined Scheme Instructions
 850 @subsubsection Inlined Scheme Instructions
 851
 852 The Scheme compiler can recognize the application of standard Scheme
 853 procedures, or unbound variables that look like they are bound to
 854 standard Scheme procedures. It tries to inline these small operations
 855 to avoid the overhead of creating new stack frames.
 856
 857 Since most of these operations are historically implemented as C
 858 primitives, not inlining them would entail constantly calling out from
 859 the VM to the interpreter, which has some costs---registers must be
 860 saved, the interpreter has to dispatch, called procedures have to do
 861 much typechecking, etc. It's much more efficient to inline these
 862 operations in the virtual machine itself.
 863
 864 All of these instructions pop their arguments from the stack and push
 865 their results, and take no parameters from the instruction stream.
 866 Thus, unlike in the previous sections, these instruction definitions
 867 show stack parameters instead of parameters from the instruction
 868 stream.
 869
 870 @deffn Instruction not x
 871 @deffnx Instruction not-not x
 872 @deffnx Instruction eq? x y
 873 @deffnx Instruction not-eq? x y
 874 @deffnx Instruction null?
 875 @deffnx Instruction not-null?
 876 @deffnx Instruction eqv? x y
 877 @deffnx Instruction equal? x y
 878 @deffnx Instruction pair? x y
 879 @deffnx Instruction list? x y
 880 @deffnx Instruction set-car! pair x
 881 @deffnx Instruction set-cdr! pair x
 882 @deffnx Instruction slot-ref struct n
 883 @deffnx Instruction slot-set struct n x
 884 @deffnx Instruction cons x
 885 @deffnx Instruction car x
 886 @deffnx Instruction cdr x
 887 Inlined implementations of their Scheme equivalents.
 888 @end deffn
 889
 890 Note that @code{caddr} and friends compile to a series of @code{car}
 891 and @code{cdr} instructions.
 892
 893 @node Inlined Mathematical Instructions
 894 @subsubsection Inlined Mathematical Instructions
 895
 896 Inlining mathematical operations has the obvious advantage of handling
 897 fixnums without function calls or allocations. The trick, of course,
 898 is knowing when the result of an operation will be a fixnum, and there
 899 might be a couple bugs here.
 900
 901 More instructions could be added here over time.
 902
 903 As in the previous section, the definitions below show stack
 904 parameters instead of instruction stream parameters.
 905
 906 @deffn Instruction add x y
 907 @deffnx Instruction sub x y
 908 @deffnx Instruction mul x y
 909 @deffnx Instruction div x y
 910 @deffnx Instruction quo x y
 911 @deffnx Instruction rem x y
 912 @deffnx Instruction mod x y
 913 @deffnx Instruction ee? x y
 914 @deffnx Instruction lt? x y
 915 @deffnx Instruction gt? x y
 916 @deffnx Instruction le? x y
 917 @deffnx Instruction ge? x y
 918 Inlined implementations of the corresponding mathematical operations.
 919 @end deffn