Fix overfull hboxes
[bpt/guile.git] / doc / ref / vm.texi
1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2008,2009
4 @c Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
6
7 @node A Virtual Machine for Guile
8 @section A Virtual Machine for Guile
9
10 Guile has both an interpreter and a compiler. To a user, the
11 difference is largely transparent---interpreted and compiled
12 procedures can call each other as they please.
13
14 The difference is that the compiler creates and interprets bytecode
15 for a custom virtual machine, instead of interpreting the
16 S-expressions directly. Running compiled code is faster than running
17 interpreted code.
18
19 The virtual machine that does the bytecode interpretation is a part of
20 Guile itself. This section describes the nature of Guile's virtual
21 machine.
22
23 @menu
24 * Why a VM?::
25 * VM Concepts::
26 * Stack Layout::
27 * Variables and the VM::
28 * VM Programs::
29 * Instruction Set::
30 @end menu
31
32 @node Why a VM?
33 @subsection Why a VM?
34
35 @cindex interpreter
36 @cindex evaluator
37 For a long time, Guile only had an interpreter, called the
38 @dfn{evaluator}. Guile's evaluator operates directly on the
39 S-expression representation of Scheme source code.
40
41 But while the evaluator is highly optimized and hand-tuned, and
42 contains some extensive speed trickery (@pxref{Memoization}), it still
43 performs many needless computations during the course of evaluating an
44 expression. For example, application of a function to arguments
45 needlessly conses up the arguments in a list. Evaluation of an
46 expression always has to figure out what the car of the expression is
47 -- a procedure, a memoized form, or something else. All values have to
48 be allocated on the heap. Et cetera.
49
50 The solution to this problem is to compile the higher-level language,
51 Scheme, into a lower-level language for which all of the checks and
52 dispatching have already been done---the code is instead stripped to
53 the bare minimum needed to ``do the job''.
54
55 The question becomes then, what low-level language to choose? There
56 are many options. We could compile to native code directly, but that
57 poses portability problems for Guile, as it is a highly cross-platform
58 project.
59
60 So we want the performance gains that compilation provides, but we
61 also want to maintain the portability benefits of a single code path.
62 The obvious solution is to compile to a virtual machine that is
63 present on all Guile installations.
64
65 The easiest (and most fun) way to depend on a virtual machine is to
66 implement the virtual machine within Guile itself. This way the
67 virtual machine provides what Scheme needs (tail calls, multiple
68 values, @code{call/cc}) and can provide optimized inline instructions
69 for Guile (@code{cons}, @code{struct-ref}, etc.).
70
71 So this is what Guile does. The rest of this section describes that VM
72 that Guile implements, and the compiled procedures that run on it.
73
74 Note that this decision to implement a bytecode compiler does not
75 preclude native compilation. We can compile from bytecode to native
76 code at runtime, or even do ahead of time compilation. More
77 possibilities are discussed in @ref{Extending the Compiler}.
78
79 @node VM Concepts
80 @subsection VM Concepts
81
82 A virtual machine (VM) is a Scheme object. Users may create virtual
83 machines using the standard procedures described later in this manual,
84 but that is usually unnecessary, as Guile ensures that there is one
85 virtual machine per thread. When a VM-compiled procedure is run, Guile
86 looks up the virtual machine for the current thread and executes the
87 procedure using that VM.
88
89 Guile's virtual machine is a stack machine---that is, it has few
90 registers, and the instructions defined in the VM operate by pushing
91 and popping values from a stack.
92
93 Stack memory is exclusive to the virtual machine that owns it. In
94 addition to their stacks, virtual machines also have access to the
95 global memory (modules, global bindings, etc) that is shared among
96 other parts of Guile, including other VMs.
97
98 A VM has generic instructions, such as those to reference local
99 variables, and instructions designed to support Guile's languages --
100 mathematical instructions that support the entire numerical tower, an
101 inlined implementation of @code{cons}, etc.
102
103 The registers that a VM has are as follows:
104
105 @itemize
106 @item ip - Instruction pointer
107 @item sp - Stack pointer
108 @item fp - Frame pointer
109 @end itemize
110
111 In other architectures, the instruction pointer is sometimes called
112 the ``program counter'' (pc). This set of registers is pretty typical
113 for stack machines; their exact meanings in the context of Guile's VM
114 are described in the next section.
115
116 A virtual machine executes by loading a compiled procedure, and
117 executing the object code associated with that procedure. Of course,
118 that procedure may call other procedures, tail-call others, ad
119 infinitum---indeed, within a guile whose modules have all been
120 compiled to object code, one might never leave the virtual machine.
121
122 @c wingo: The following is true, but I don't know in what context to
123 @c describe it. A documentation FIXME.
124
125 @c A VM may have one of three engines: reckless, regular, or debugging.
126 @c Reckless engine is fastest but dangerous. Regular engine is normally
127 @c fail-safe and reasonably fast. Debugging engine is safest and
128 @c functional but very slow.
129
130 @c (Actually we have just a regular and a debugging engine; normally
131 @c we use the latter, it's almost as fast as the ``regular'' engine.)
132
133 @node Stack Layout
134 @subsection Stack Layout
135
136 While not strictly necessary to understand how to work with the VM, it
137 is instructive and sometimes entertaining to consider the struture of
138 the VM stack.
139
140 Logically speaking, a VM stack is composed of ``frames''. Each frame
141 corresponds to the application of one compiled procedure, and contains
142 storage space for arguments, local variables, intermediate values, and
143 some bookkeeping information (such as what to do after the frame
144 computes its value).
145
146 While the compiler is free to do whatever it wants to, as long as the
147 semantics of a computation are preserved, in practice every time you
148 call a function, a new frame is created. (The notable exception of
149 course is the tail call case, @pxref{Tail Calls}.)
150
151 Within a frame, you have the data associated with the function
152 application itself, which is of a fixed size, and the stack space for
153 intermediate values. Sometimes only the former is referred to as the
154 ``frame'', and the latter is the ``stack'', although all pending
155 application frames can have some intermediate computations interleaved
156 on the stack.
157
158 The structure of the fixed part of an application frame is as follows:
159
160 @example
161 Stack
162 | | <- fp + bp->nargs + bp->nlocs + 4
163 +------------------+ = SCM_FRAME_UPPER_ADDRESS (fp)
164 | Return address |
165 | MV return address|
166 | Dynamic link |
167 | External link | <- fp + bp->nargs + bp->nlocs
168 | Local variable 1 | = SCM_FRAME_DATA_ADDRESS (fp)
169 | Local variable 0 | <- fp + bp->nargs
170 | Argument 1 |
171 | Argument 0 | <- fp
172 | Program | <- fp - 1
173 +------------------+ = SCM_FRAME_LOWER_ADDRESS (fp)
174 | |
175 @end example
176
177 In the above drawing, the stack grows upward. The intermediate values
178 stored in the application of this frame are stored above
179 @code{SCM_FRAME_UPPER_ADDRESS (fp)}. @code{bp} refers to the
180 @code{struct scm_objcode} data associated with the program at
181 @code{fp - 1}. @code{nargs} and @code{nlocs} are properties of the
182 compiled procedure, which will be discussed later.
183
184 The individual fields of the frame are as follows:
185
186 @table @asis
187 @item Return address
188 The @code{ip} that was in effect before this program was applied. When
189 we return from this activation frame, we will jump back to this
190 @code{ip}.
191
192 @item MV return address
193 The @code{ip} to return to if this application returns multiple
194 values. For continuations that only accept one value, this value will
195 be @code{NULL}; for others, it will be an @code{ip} that points to a
196 multiple-value return address in the calling code. That code will
197 expect the top value on the stack to be an integer---the number of
198 values being returned---and that below that integer there are the
199 values being returned.
200
201 @item Dynamic link
202 This is the @code{fp} in effect before this program was applied. In
203 effect, this and the return address are the registers that are always
204 ``saved''.
205
206 @item External link
207 This field is a reference to the list of heap-allocated variables
208 associated with this frame. For a discussion of heap versus stack
209 allocation, @xref{Variables and the VM}.
210
211 @item Local variable @var{n}
212 Lambda-local variables that are allocated on the stack are all
213 allocated as part of the frame. This makes access to non-captured,
214 non-mutated variables very cheap.
215
216 @item Argument @var{n}
217 The calling convention of the VM requires arguments of a function
218 application to be pushed on the stack, and here they are. Normally
219 references to arguments dispatch to these locations on the stack.
220 However if an argument has to be stored on the heap, it will be copied
221 from its initial value here onto a location in the heap, and
222 thereafter only referenced on the heap.
223
224 @item Program
225 This is the program being applied. For more information on how
226 programs are implemented, @xref{VM Programs}.
227 @end table
228
229 @node Variables and the VM
230 @subsection Variables and the VM
231
232 Consider the following Scheme code as an example:
233
234 @example
235 (define (foo a)
236 (lambda (b) (list foo a b)))
237 @end example
238
239 Within the lambda expression, "foo" is a top-level variable, "a" is a
240 lexically captured variable, and "b" is a local variable.
241
242 @code{b} may safely be allocated on the stack, as there is no enclosed
243 procedure that references it, nor is it ever mutated.
244
245 @code{a}, on the other hand, is referenced by an enclosed procedure,
246 that of the lambda. Thus it must be allocated on the heap, as it may
247 (and will) outlive the dynamic extent of the invocation of @code{foo}.
248
249 @code{foo} is a top-level variable, because it names the procedure
250 @code{foo}, which is here defined at the top-level.
251
252 Note that variables that are mutated (via @code{set!}) must be
253 allocated on the heap, even if they are local variables. This is
254 because any called subprocedure might capture the continuation, which
255 would need to capture locations instead of values. Thus perhaps
256 counterintuitively, what would seem ``closer to the metal'', viz
257 @code{set!}, actually forces heap allocation instead of stack
258 allocation.
259
260 @node VM Programs
261 @subsection Compiled Procedures are VM Programs
262
263 By default, when you enter in expressions at Guile's REPL, they are
264 first compiled to VM object code, then that VM object code is executed
265 to produce a value. If the expression evaluates to a procedure, the
266 result of this process is a compiled procedure.
267
268 A compiled procedure is a compound object, consisting of its bytecode,
269 a reference to any captured lexical variables, an object array, and
270 some metadata such as the procedure's arity, name, and documentation.
271 You can pick apart these pieces with the accessors in @code{(system vm
272 program)}. @xref{Compiled Procedures}, for a full API reference.
273
274 @cindex object table
275 @cindex object array
276 The object array of a compiled procedure, also known as the
277 @dfn{object table}, holds all Scheme objects whose values are known
278 not to change across invocations of the procedure: constant strings,
279 symbols, etc. The object table of a program is initialized right
280 before a program is loaded with @code{load-program}.
281 @xref{Loading Instructions}, for more information.
282
283 Variable objects are one such type of constant object: when a global
284 binding is defined, a variable object is associated to it and that
285 object will remain constant over time, even if the value bound to it
286 changes. Therefore, toplevel bindings only need to be looked up once.
287 Thereafter, references to the corresponding toplevel variables from
288 within the program are then performed via the @code{toplevel-ref}
289 instruction, which uses the object vector, and are almost as fast as
290 local variable references.
291
292 We can see how these concepts tie together by disassembling the
293 @code{foo} function we defined earlier to see what is going on:
294
295 @smallexample
296 scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
297 scheme@@(guile-user)> ,x foo
298 Disassembly of #<program foo (a)>:
299
300 0 (local-ref 0) ;; `a' (arg)
301 2 (external-set 0) ;; `a' (arg)
302 4 (object-ref 1) ;; #<program b70d2910 at <unknown port>:0:16 (b)>
303 6 (make-closure)
304 7 (return)
305
306 ----------------------------------------
307 Disassembly of #<program b70d2910 at <unknown port>:0:16 (b)>:
308
309 0 (toplevel-ref 1) ;; `foo'
310 2 (external-ref 0) ;; (closure variable)
311 4 (local-ref 0) ;; `b' (arg)
312 6 (list 0 3) ;; 3 elements at (unknown file):0:28
313 9 (return)
314 @end smallexample
315
316 At @code{ip} 0 and 2, we do the copy from argument to heap for
317 @code{a}. @code{Ip} 4 loads up the compiled lambda, and then at
318 @code{ip} 6 we make a closure---binding code (from the compiled
319 lambda) with data (the heap-allocated variables). Finally we return
320 the closure.
321
322 The second stanza disassembles the compiled lambda. Toplevel variables
323 are resolved relative to the module that was current when the
324 procedure was created. This lookup occurs lazily, at the first time
325 the variable is actually referenced, and the location of the lookup is
326 cached so that future references are very cheap. @xref{Environment
327 Control Instructions}, for more details.
328
329 Then we see a reference to an external variable, corresponding to
330 @code{a}. The disassembler doesn't have enough information to give a
331 name to that variable, so it just marks it as being a ``closure
332 variable''. Finally we see the reference to @code{b}, then the
333 @code{list} opcode, an inline implementation of the @code{list} scheme
334 routine.
335
336 @node Instruction Set
337 @subsection Instruction Set
338
339 There are about 100 instructions in Guile's virtual machine. These
340 instructions represent atomic units of a program's execution. Ideally,
341 they perform one task without conditional branches, then dispatch to
342 the next instruction in the stream.
343
344 Instructions themselves are one byte long. Some instructions take
345 parameters, which follow the instruction byte in the instruction
346 stream.
347
348 Sometimes the compiler can figure out that it is compiling a special
349 case that can be run more efficiently. So, for example, while Guile
350 offers a generic test-and-branch instruction, it also offers specific
351 instructions for special cases, so that the following cases all have
352 their own test-and-branch instructions:
353
354 @example
355 (if pred then else)
356 (if (not pred) then else)
357 (if (null? l) then else)
358 (if (not (null? l)) then else)
359 @end example
360
361 In addition, some Scheme primitives have their own inline
362 implementations, e.g. @code{cons}, and @code{list}, as we saw in the
363 previous section.
364
365 So Guile's instruction set is a @emph{complete} instruction set, in
366 that it provides the instructions that are suited to the problem, and
367 is not concerned with making a minimal, orthogonal set of
368 instructions. More instructions may be added over time.
369
370 @menu
371 * Environment Control Instructions::
372 * Branch Instructions::
373 * Loading Instructions::
374 * Procedural Instructions::
375 * Data Control Instructions::
376 * Miscellaneous Instructions::
377 * Inlined Scheme Instructions::
378 * Inlined Mathematical Instructions::
379 @end menu
380
381 @node Environment Control Instructions
382 @subsubsection Environment Control Instructions
383
384 These instructions access and mutate the environment of a compiled
385 procedure---the local bindings, the ``external'' bindings, and the
386 toplevel bindings.
387
388 @deffn Instruction local-ref index
389 Push onto the stack the value of the local variable located at
390 @var{index} within the current stack frame.
391
392 Note that arguments and local variables are all in one block. Thus the
393 first argument, if any, is at index 0, and local bindings follow the
394 arguments.
395 @end deffn
396
397 @deffn Instruction local-set index
398 Pop the Scheme object located on top of the stack and make it the new
399 value of the local variable located at @var{index} within the current
400 stack frame.
401 @end deffn
402
403 @deffn Instruction external-ref index
404 Push the value of the closure variable located at position
405 @var{index} within the program's list of external variables.
406 @end deffn
407
408 @deffn Instruction external-set index
409 Pop the Scheme object located on top of the stack and make it the new
410 value of the closure variable located at @var{index} within the
411 program's list of external variables.
412 @end deffn
413
414 The external variable lookup algorithm should probably be made more
415 efficient in the future via addressing by frame and index. Currently,
416 external variables are all consed onto a list, which results in O(N)
417 lookup time.
418
419 @deffn Instruction toplevel-ref index
420 @deffnx Instruction long-toplevel-ref index
421 Push the value of the toplevel binding whose location is stored in at
422 position @var{index} in the object table.
423
424 Initially, a cell in the object table that is used by
425 @code{toplevel-ref} is initialized to one of two forms. The normal
426 case is that the cell holds a symbol, whose binding will be looked up
427 relative to the module that was current when the current program was
428 created.
429
430 Alternately, the lookup may be performed relative to a particular
431 module, determined at compile-time (e.g. via @code{@@} or
432 @code{@@@@}). In that case, the cell in the object table holds a list:
433 @code{(@var{modname} @var{sym} @var{public?})}. The symbol @var{sym}
434 will be looked up in the module named @var{modname} (a list of
435 symbols). The lookup will be performed against the module's public
436 interface, unless @var{public?} is @code{#f}, which it is for example
437 when compiling @code{@@@@}.
438
439 In any case, if the symbol is unbound, an error is signalled.
440 Otherwise the initial form is replaced with the looked-up variable, an
441 in-place mutation of the object table. This mechanism provides for
442 lazy variable resolution, and an important cached fast-path once the
443 variable has been successfully resolved.
444
445 The ``long'' variant has a 16-bit index instead of an 8-bit index,
446 with the most significant byte first.
447
448 This instruction pushes the value of the variable onto the stack.
449 @end deffn
450
451 @deffn Instruction toplevel-set index
452 @deffnx Instruction long-toplevel-set index
453 Pop a value off the stack, and set it as the value of the toplevel
454 variable stored at @var{index} in the object table. If the variable
455 has not yet been looked up, we do the lookup as in
456 @code{toplevel-ref}. The ``long'' variant has a 16-bit index instead
457 of an 8-bit index.
458 @end deffn
459
460 @deffn Instruction link-now
461 Pop a value, @var{x}, from the stack. Look up the binding for @var{x},
462 according to the rules for @code{toplevel-ref}, and push that variable
463 on the stack. If the lookup fails, an error will be signalled.
464
465 This instruction is mostly used when loading programs, because it can
466 do toplevel variable lookups without an object vector.
467 @end deffn
468
469 @deffn Instruction variable-ref
470 Dereference the variable object which is on top of the stack and
471 replace it by the value of the variable it represents.
472 @end deffn
473
474 @deffn Instruction variable-set
475 Pop off two objects from the stack, a variable and a value, and set
476 the variable to the value.
477 @end deffn
478
479 @deffn Instruction object-ref n
480 @deffnx Instruction long-object-ref n
481 Push @var{n}th value from the current program's object vector. The
482 ``long'' variant has a 16-bit index instead of an 8-bit index.
483 @end deffn
484
485 @node Branch Instructions
486 @subsubsection Branch Instructions
487
488 All the conditional branch instructions described below work in the
489 same way:
490
491 @itemize
492 @item They pop off the Scheme object located on the stack and use it as
493 the branch condition;
494 @item If the condition is true, then the instruction pointer is
495 increased by the offset passed as an argument to the branch
496 instruction;
497 @item Program execution proceeds with the next instruction (that is,
498 the one to which the instruction pointer points).
499 @end itemize
500
501 Note that the offset passed to the instruction is encoded on two 8-bit
502 integers which are then combined by the VM as one 16-bit integer.
503
504 @deffn Instruction br offset
505 Jump to @var{offset}.
506 @end deffn
507
508 @deffn Instruction br-if offset
509 Jump to @var{offset} if the condition on the stack is not false.
510 @end deffn
511
512 @deffn Instruction br-if-not offset
513 Jump to @var{offset} if the condition on the stack is false.
514 @end deffn
515
516 @deffn Instruction br-if-eq offset
517 Jump to @var{offset} if the two objects located on the stack are
518 equal in the sense of @var{eq?}. Note that, for this instruction, the
519 stack pointer is decremented by two Scheme objects instead of only
520 one.
521 @end deffn
522
523 @deffn Instruction br-if-not-eq offset
524 Same as @var{br-if-eq} for non-@code{eq?} objects.
525 @end deffn
526
527 @deffn Instruction br-if-null offset
528 Jump to @var{offset} if the object on the stack is @code{'()}.
529 @end deffn
530
531 @deffn Instruction br-if-not-null offset
532 Jump to @var{offset} if the object on the stack is not @code{'()}.
533 @end deffn
534
535
536 @node Loading Instructions
537 @subsubsection Loading Instructions
538
539 In addition to VM instructions, an instruction stream may contain
540 variable-length data embedded within it. This data is always preceded
541 by special loading instructions, which interpret the data and advance
542 the instruction pointer to the next VM instruction.
543
544 All of these loading instructions have a @code{length} parameter,
545 indicating the size of the embedded data, in bytes. The length itself
546 is encoded in 3 bytes.
547
548 @deffn Instruction load-number length
549 Load an arbitrary number from the instruction stream. The number is
550 embedded in the stream as a string.
551 @end deffn
552 @deffn Instruction load-string length
553 Load a string from the instruction stream.
554 @end deffn
555 @deffn Instruction load-symbol length
556 Load a symbol from the instruction stream.
557 @end deffn
558 @deffn Instruction load-keyword length
559 Load a keyword from the instruction stream.
560 @end deffn
561
562 @deffn Instruction define length
563 Load a symbol from the instruction stream, and look up its binding in
564 the current toplevel environment, creating the binding if necessary.
565 Push the variable corresponding to the binding.
566 @end deffn
567
568 @deffn Instruction load-program
569 Load bytecode from the instruction stream, and push a compiled
570 procedure.
571
572 This instruction pops one value from the stack: the program's object
573 table, as a vector, or @code{#f} in the case that the program has no
574 object table. A program that does not reference toplevel bindings and
575 does not use @code{object-ref} does not need an object table.
576
577 This instruction is unlike the rest of the loading instructions,
578 because instead of parsing its data, it directly maps the instruction
579 stream onto a C structure, @code{struct scm_objcode}. @xref{Bytecode
580 and Objcode}, for more information.
581
582 The resulting compiled procedure will not have any ``external''
583 variables captured, so it may be loaded only once but used many times
584 to create closures.
585 @end deffn
586
587 Finally, while this instruction is not strictly a ``loading''
588 instruction, it's useful to wind up the @code{load-program} discussion
589 here:
590
591 @deffn Instruction make-closure
592 Pop the program object from the stack, capture the current set of
593 ``external'' variables, and assign those external variables to a copy
594 of the program. Push the new program object, which shares state with
595 the original program.
596
597 At the time of this writing, the space overhead of closures is 4 words
598 per closure.
599 @end deffn
600
601 @node Procedural Instructions
602 @subsubsection Procedural Instructions
603
604 @deffn Instruction return
605 Free the program's frame, returning the top value from the stack to
606 the current continuation. (The stack should have exactly one value on
607 it.)
608
609 Specifically, the @code{sp} is decremented to one below the current
610 @code{fp}, the @code{ip} is reset to the current return address, the
611 @code{fp} is reset to the value of the current dynamic link, and then
612 the top item on the stack (formerly the procedure being applied) is
613 set to the returned value.
614 @end deffn
615
616 @deffn Instruction call nargs
617 Call the procedure located at @code{sp[-nargs]} with the @var{nargs}
618 arguments located from @code{sp[-nargs + 1]} to @code{sp[0]}.
619
620 For compiled procedures, this instruction sets up a new stack frame,
621 as described in @ref{Stack Layout}, and then dispatches to the first
622 instruction in the called procedure, relying on the called procedure
623 to return one value to the newly-created continuation. Because the new
624 frame pointer will point to sp[-nargs + 1], the arguments don't have
625 to be shuffled around -- they are already in place.
626
627 For non-compiled procedures (continuations, primitives, and
628 interpreted procedures), @code{call} will pop the procedure and
629 arguments off the stack, and push the result of calling
630 @code{scm_apply}.
631 @end deffn
632
633 @deffn Instruction goto/args nargs
634 Like @code{call}, but reusing the current continuation. This
635 instruction implements tail calls as required by RnRS.
636
637 For compiled procedures, that means that @code{goto/args} reuses the
638 current frame instead of building a new one. The @code{goto/*}
639 instruction family is named as it is because tail calls are equivalent
640 to @code{goto}, along with relabeled variables.
641
642 For non-VM procedures, the result is the same, but the current VM
643 invocation remains on the C stack. True tail calls are not currently
644 possible between compiled and non-compiled procedures.
645 @end deffn
646
647 @deffn Instruction apply nargs
648 @deffnx Instruction goto/apply nargs
649 Like @code{call} and @code{goto/args}, except that the top item on the
650 stack must be a list. The elements of that list are then pushed on the
651 stack and treated as additional arguments, replacing the list itself,
652 then the procedure is invoked as usual.
653 @end deffn
654
655 @deffn Instruction call/nargs
656 @deffnx Instruction goto/nargs
657 These are like @code{call} and @code{goto/args}, except they take the
658 number of arguments from the stack instead of the instruction stream.
659 These instructions are used in the implementation of multiple value
660 returns, where the actual number of values is pushed on the stack.
661 @end deffn
662
663 @deffn Instruction call/cc
664 @deffnx Instruction goto/cc
665 Capture the current continuation, and then call (or tail-call) the
666 procedure on the top of the stack, with the continuation as the
667 argument.
668
669 Both the VM continuation and the C continuation are captured.
670 @end deffn
671
672 @deffn Instruction mv-call nargs offset
673 Like @code{call}, except that a multiple-value continuation is created
674 in addition to a single-value continuation.
675
676 The offset (a two-byte value) is an offset within the instruction
677 stream; the multiple-value return address in the new frame
678 (@pxref{Stack Layout}) will be set to the normal return address plus
679 this offset. Instructions at that offset will expect the top value of
680 the stack to be the number of values, and below that values
681 themselves, pushed separately.
682 @end deffn
683
684 @deffn Instruction return/values nvalues
685 Return the top @var{nvalues} to the current continuation.
686
687 If the current continuation is a multiple-value continuation,
688 @code{return/values} pushes the number of values on the stack, then
689 returns as in @code{return}, but to the multiple-value return address.
690
691 Otherwise if the current continuation accepts only one value, i.e. the
692 multiple-value return address is @code{NULL}, then we assume the user
693 only wants one value, and we give them the first one. If there are no
694 values, an error is signaled.
695 @end deffn
696
697 @deffn Instruction return/values* nvalues
698 Like a combination of @code{apply} and @code{return/values}, in which
699 the top value on the stack is interpreted as a list of additional
700 values. This is an optimization for the common @code{(apply values
701 ...)} case.
702 @end deffn
703
704 @deffn Instruction truncate-values nbinds nrest
705 Used in multiple-value continuations, this instruction takes the
706 values that are on the stack (including the number-of-values marker)
707 and truncates them for a binding construct.
708
709 For example, a call to @code{(receive (x y . z) (foo) ...)} would,
710 logically speaking, pop off the values returned from @code{(foo)} and
711 push them as three values, corresponding to @code{x}, @code{y}, and
712 @code{z}. In that case, @var{nbinds} would be 3, and @var{nrest} would
713 be 1 (to indicate that one of the bindings was a rest argument).
714
715 Signals an error if there is an insufficient number of values.
716 @end deffn
717
718 @node Data Control Instructions
719 @subsubsection Data Control Instructions
720
721 These instructions push simple immediate values onto the stack, or
722 manipulate lists and vectors on the stack.
723
724 @deffn Instruction make-int8 value
725 Push @var{value}, an 8-bit integer, onto the stack.
726 @end deffn
727
728 @deffn Instruction make-int8:0
729 Push the immediate value @code{0} onto the stack.
730 @end deffn
731
732 @deffn Instruction make-int8:1
733 Push the immediate value @code{1} onto the stack.
734 @end deffn
735
736 @deffn Instruction make-int16 value
737 Push @var{value}, a 16-bit integer, onto the stack.
738 @end deffn
739
740 @deffn Instruction make-uint64 value
741 Push @var{value}, an unsigned 64-bit integer, onto the stack. The
742 value is encoded in 8 bytes, most significant byte first (big-endian).
743 @end deffn
744
745 @deffn Instruction make-int64 value
746 Push @var{value}, a signed 64-bit integer, onto the stack. The value
747 is encoded in 8 bytes, most significant byte first (big-endian), in
748 twos-complement arithmetic.
749 @end deffn
750
751 @deffn Instruction make-false
752 Push @code{#f} onto the stack.
753 @end deffn
754
755 @deffn Instruction make-true
756 Push @code{#t} onto the stack.
757 @end deffn
758
759 @deffn Instruction make-eol
760 Push @code{'()} onto the stack.
761 @end deffn
762
763 @deffn Instruction make-char8 value
764 Push @var{value}, an 8-bit character, onto the stack.
765 @end deffn
766
767 @deffn Instruction list n
768 Pops off the top @var{n} values off of the stack, consing them up into
769 a list, then pushes that list on the stack. What was the topmost value
770 will be the last element in the list. @var{n} is a two-byte value,
771 most significant byte first.
772 @end deffn
773
774 @deffn Instruction vector n
775 Create and fill a vector with the top @var{n} values from the stack,
776 popping off those values and pushing on the resulting vector. @var{n}
777 is a two-byte value, like in @code{vector}.
778 @end deffn
779
780 @deffn Instruction mark
781 Pushes a special value onto the stack that other stack instructions
782 like @code{list-mark} can use.
783 @end deffn
784
785 @deffn Instruction list-mark
786 Create a list from values from the stack, as in @code{list}, but
787 instead of knowing beforehand how many there will be, keep going until
788 we see a @code{mark} value.
789 @end deffn
790
791 @deffn Instruction cons-mark
792 As the scheme procedure @code{cons*} is to the scheme procedure
793 @code{list}, so the instruction @code{cons-mark} is to the instruction
794 @code{list-mark}.
795 @end deffn
796
797 @deffn Instruction vector-mark
798 Like @code{list-mark}, but makes a vector instead of a list.
799 @end deffn
800
801 @deffn Instruction list-break
802 The opposite of @code{list}: pops a value, which should be a list, and
803 pushes its elements on the stack.
804 @end deffn
805
806 @node Miscellaneous Instructions
807 @subsubsection Miscellaneous Instructions
808
809 @deffn Instruction nop
810 Does nothing!
811 @end deffn
812
813 @deffn Instruction halt
814 Exits the VM, returning a SCM value. Normally, this instruction is
815 only part of the ``bootstrap program'', a program run when a virtual
816 machine is first entered; compiled Scheme procedures will not contain
817 this instruction.
818
819 If multiple values have been returned, the SCM value will be a
820 multiple-values object (@pxref{Multiple Values}).
821 @end deffn
822
823 @deffn Instruction break
824 Does nothing, but invokes the break hook.
825 @end deffn
826
827 @deffn Instruction drop
828 Pops off the top value from the stack, throwing it away.
829 @end deffn
830
831 @deffn Instruction dup
832 Re-pushes the top value onto the stack.
833 @end deffn
834
835 @deffn Instruction void
836 Pushes ``the unspecified value'' onto the stack.
837 @end deffn
838
839 @node Inlined Scheme Instructions
840 @subsubsection Inlined Scheme Instructions
841
842 The Scheme compiler can recognize the application of standard Scheme
843 procedures. It tries to inline these small operations to avoid the
844 overhead of creating new stack frames.
845
846 Since most of these operations are historically implemented as C
847 primitives, not inlining them would entail constantly calling out from
848 the VM to the interpreter, which has some costs---registers must be
849 saved, the interpreter has to dispatch, called procedures have to do
850 much typechecking, etc. It's much more efficient to inline these
851 operations in the virtual machine itself.
852
853 All of these instructions pop their arguments from the stack and push
854 their results, and take no parameters from the instruction stream.
855 Thus, unlike in the previous sections, these instruction definitions
856 show stack parameters instead of parameters from the instruction
857 stream.
858
859 @deffn Instruction not x
860 @deffnx Instruction not-not x
861 @deffnx Instruction eq? x y
862 @deffnx Instruction not-eq? x y
863 @deffnx Instruction null?
864 @deffnx Instruction not-null?
865 @deffnx Instruction eqv? x y
866 @deffnx Instruction equal? x y
867 @deffnx Instruction pair? x y
868 @deffnx Instruction list? x
869 @deffnx Instruction set-car! pair x
870 @deffnx Instruction set-cdr! pair x
871 @deffnx Instruction slot-ref struct n
872 @deffnx Instruction slot-set struct n x
873 @deffnx Instruction cons x y
874 @deffnx Instruction car x
875 @deffnx Instruction cdr x
876 Inlined implementations of their Scheme equivalents.
877 @end deffn
878
879 Note that @code{caddr} and friends compile to a series of @code{car}
880 and @code{cdr} instructions.
881
882 @node Inlined Mathematical Instructions
883 @subsubsection Inlined Mathematical Instructions
884
885 Inlining mathematical operations has the obvious advantage of handling
886 fixnums without function calls or allocations. The trick, of course,
887 is knowing when the result of an operation will be a fixnum, and there
888 might be a couple bugs here.
889
890 More instructions could be added here over time.
891
892 As in the previous section, the definitions below show stack
893 parameters instead of instruction stream parameters.
894
895 @deffn Instruction add x y
896 @deffnx Instruction sub x y
897 @deffnx Instruction mul x y
898 @deffnx Instruction div x y
899 @deffnx Instruction quo x y
900 @deffnx Instruction rem x y
901 @deffnx Instruction mod x y
902 @deffnx Instruction ee? x y
903 @deffnx Instruction lt? x y
904 @deffnx Instruction gt? x y
905 @deffnx Instruction le? x y
906 @deffnx Instruction ge? x y
907 Inlined implementations of the corresponding mathematical operations.
908 @end deffn