@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 2008,2009,2010,2011,2013
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node A Virtual Machine for Guile
@section A Virtual Machine for Guile

Guile has both an interpreter and a compiler. To a user, the difference
is transparent---interpreted and compiled procedures can call each other
as they please.

The difference is that the compiler produces bytecode for a custom
virtual machine, which then interprets that bytecode, instead of
interpreting the S-expressions directly. Loading and running compiled
code is faster than loading and running source code.

The virtual machine that does the bytecode interpretation is a part of
Guile itself. This section describes the nature of Guile's virtual
machine.

@menu
* Why a VM?::
* VM Concepts::
* Stack Layout::
* Variables and the VM::
* VM Programs::
* Instruction Set::
@end menu

@node Why a VM?
@subsection Why a VM?

@cindex interpreter
For a long time, Guile only had an interpreter. Guile's interpreter
operated directly on the S-expression representation of Scheme source
code.

But while the interpreter was highly optimized and hand-tuned, it still
performed many needless computations during the course of evaluating an
expression. For example, application of a function to arguments
needlessly consed up the arguments in a list. Evaluation of an
expression always had to figure out what the car of the expression was --
a procedure, a memoized form, or something else. All values had to be
allocated on the heap. Et cetera.

The solution to this problem was to compile the higher-level language,
Scheme, into a lower-level language for which all of the checks and
dispatching have already been done---the code is instead stripped to
the bare minimum needed to ``do the job''.

The question becomes then, what low-level language to choose? There
are many options. We could compile to native code directly, but that
poses portability problems for Guile, as it is a highly cross-platform
project.

So we want the performance gains that compilation provides, but we
also want to maintain the portability benefits of a single code path.
The obvious solution is to compile to a virtual machine that is
present on all Guile installations.

The easiest (and most fun) way to depend on a virtual machine is to
implement the virtual machine within Guile itself. This way the
virtual machine provides what Scheme needs (tail calls, multiple
values, @code{call/cc}) and can provide optimized inline instructions
for Guile (@code{cons}, @code{struct-ref}, etc.).

So this is what Guile does. The rest of this section describes the VM
that Guile implements, and the compiled procedures that run on it.

Before moving on, though, we should note that though we spoke of the
interpreter in the past tense, Guile still has an interpreter. The
difference is that before, it was Guile's main evaluator, and so was
implemented in highly optimized C; now, it is actually implemented in
Scheme, and compiled down to VM bytecode, just like any other program.
(There is still a C interpreter around, used to bootstrap the compiler,
but it is not normally used at runtime.)

The upside of implementing the interpreter in Scheme is that we preserve
tail calls and multiple-value handling between interpreted and compiled
code. The downside is that the interpreter in Guile 2.2 is still slower
than the interpreter in 1.8. We hope that the compiler's speed makes
up for the loss. In any case, once we have native compilation for
Scheme code, we expect the new self-hosted interpreter to beat the old
hand-tuned C implementation.

Also note that this decision to implement a bytecode compiler does not
preclude native compilation. We can compile from bytecode to native
code at runtime, or even do ahead-of-time compilation. More
possibilities are discussed in @ref{Extending the Compiler}.

@node VM Concepts
@subsection VM Concepts

Compiled code is run by a virtual machine (VM). Each thread has its own
VM. The virtual machine executes the sequence of instructions in a
procedure.

Each VM instruction starts by indicating which operation it is, and then
follows by encoding its source and destination operands. Each procedure
declares that it has some number of local variables, including the
function arguments. These local variables form the available operands
of the procedure, and are accessed by index.

The local variables for a procedure are stored on a stack. Calling a
procedure typically enlarges the stack, and returning from a procedure
shrinks it. Stack memory is exclusive to the virtual machine that owns
it.

In addition to their stacks, virtual machines also have access to the
global memory (modules, global bindings, etc.) that is shared among other
parts of Guile, including other VMs.

The registers that a VM has are as follows:

@itemize
@item ip - Instruction pointer
@item sp - Stack pointer
@item fp - Frame pointer
@end itemize

In other architectures, the instruction pointer is sometimes called the
``program counter'' (pc). This set of registers is pretty typical for
virtual machines; their exact meanings in the context of Guile's VM are
described in the next section.

@node Stack Layout
@subsection Stack Layout

The stack of Guile's virtual machine is composed of @dfn{frames}. Each
frame corresponds to the application of one compiled procedure, and
contains storage space for arguments, local variables, and some
bookkeeping information (such as what to do after the frame is
finished).

While the compiler is free to do whatever it wants to, as long as the
semantics of a computation are preserved, in practice every time you
call a function, a new frame is created. (The notable exception of
course is the tail call case, @pxref{Tail Calls}.)

The structure of the top stack frame is as follows:

@example
   /------------------\ <- top of stack
   | Local N-1        | <- sp
   | ...              |
   | Local 1          |
   | Local 0          | <- fp = SCM_FRAME_LOCALS_ADDRESS (fp)
   +==================+
   | Return address   |
   | Dynamic link     | <- fp - 2 = SCM_FRAME_LOWER_ADDRESS (fp)
   +==================+
   |                  | <- fp - 3 = SCM_FRAME_PREVIOUS_SP (fp)
@end example

In the above drawing, the stack grows upward. Usually the procedure
being applied is in local 0, followed by the arguments from local 1.
After that are enough slots to store the various lexically-bound and
temporary values that are needed in the function's application.

The @dfn{return address} is the @code{ip} that was in effect before this
program was applied. When we return from this activation frame, we will
jump back to this @code{ip}. Likewise, the @dfn{dynamic link} is the
@code{fp} in effect before this program was applied.

To prepare for a non-tail application, Guile's compiler will emit code
that shuffles the function to apply and its arguments into appropriate
stack slots, with two free slots below them. The call then initializes
those free slots with the current @code{ip} and @code{fp}, and updates
@code{ip} to point to the function entry, and @code{fp} to point to the
new call frame.

In this way, the dynamic link links the current frame to the previous
frame. Computing a stack trace involves traversing these frames.
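
To make the traversal concrete, here is a toy model in Scheme. It is
only a sketch: a vector stands in for the VM stack, the helper names are
invented for this example, and the use of @code{#f} as the dynamic link
of the outermost frame is an assumption, not a statement about Guile's
implementation. The slot positions follow the drawing above.

@example
;; Toy model only.  Indices grow upward, as in the drawing:
;;   stack[fp - 2] = dynamic link (caller's fp; #f is assumed here
;;                   to mark the outermost frame)
;;   stack[fp - 1] = return address
;;   stack[fp]     = local 0 (usually the procedure being applied)
(define (frame-dynamic-link stack fp) (vector-ref stack (- fp 2)))
(define (frame-return-address stack fp) (vector-ref stack (- fp 1)))
(define (frame-local stack fp n) (vector-ref stack (+ fp n)))

(define (backtrace stack fp)
  ;; Follow dynamic links from the innermost frame outward,
  ;; collecting local 0 of each frame.
  (let loop ((fp fp) (acc '()))
    (if fp
        (loop (frame-dynamic-link stack fp)
              (cons (frame-local stack fp 0) acc))
        (reverse acc))))

;; Two frames: `outer' has called `inner'.
(define toy-stack
  (vector #f 'ra-outer 'outer 'arg      ; outer frame, fp = 2
          2  'ra-inner 'inner 'arg))    ; inner frame, fp = 6

(backtrace toy-stack 6)   ; => (inner outer)
@end example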

@node Variables and the VM
@subsection Variables and the VM

Consider the following Scheme code as an example:

@example
  (define (foo a)
    (lambda (b) (list foo a b)))
@end example

Within the lambda expression, @code{foo} is a top-level variable,
@code{a} is a lexically captured variable, and @code{b} is a local
variable.

Another way to refer to @code{a} and @code{b} is to say that @code{a} is
a ``free'' variable, since it is not defined within the lambda, and
@code{b} is a ``bound'' variable. These are the terms used in the
@dfn{lambda calculus}, a mathematical notation for describing functions.
The lambda calculus is useful because it is a language in which to
reason precisely about functions and variables. It is especially good
at describing scope relations, and it is for that reason that we mention
it here.

Guile allocates all variables on the stack. When a lexically enclosed
procedure with free variables---a @dfn{closure}---is created, it copies
those variables into its free variable vector. References to free
variables are then redirected through the free variable vector.

If a variable is ever @code{set!}, however, it will need to be
heap-allocated instead of stack-allocated, so that different closures
that capture the same variable can see the same value. Also, this
allows continuations to capture a reference to the variable, instead
of to its value at one point in time. For these reasons, @code{set!}
variables are allocated in ``boxes''---actually, in variable cells.
@xref{Variables}, for more information. References to @code{set!}
variables are indirected through the boxes.

Thus perhaps counterintuitively, what would seem ``closer to the
metal'', viz @code{set!}, actually forces an extra memory allocation
and indirection.

Going back to our example, @code{b} may be allocated on the stack, as
it is never mutated.

@code{a} may also be allocated on the stack, as it too is never
mutated. Within the enclosed lambda, its value will be copied into
(and referenced from) the free variables vector.

@code{foo} is a top-level variable, because @code{foo} is not
lexically bound in this example.
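
The need for boxes is easiest to see with two closures that share a
mutated variable. The sketch below first shows such a pair, then writes
out by hand roughly what boxing amounts to, using Guile's variable
objects (@code{make-variable}, @code{variable-ref},
@code{variable-set!}). The procedure names are invented for this
example, and the second definition is an illustration of the idea, not
the compiler's actual output.

@example
;; `count' is set!, so both closures must share one location.
(define (make-counter)
  (let ((count 0))
    (cons (lambda () (set! count (+ count 1)) count)   ; increment
          (lambda () count))))                          ; read

;; The same thing with the box made explicit: the closures
;; capture the box, not the value it held at capture time.
(define (make-counter/boxed)
  (let ((box (make-variable 0)))
    (cons (lambda ()
            (variable-set! box (+ (variable-ref box) 1))
            (variable-ref box))
          (lambda () (variable-ref box)))))

(define c (make-counter/boxed))
((car c))   ; => 1
((car c))   ; => 2
((cdr c))   ; => 2
@end example

Had the closures copied the value of @code{count} instead of sharing a
box, the reader would still see 0 after the two increments.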

@node VM Programs
@subsection Compiled Procedures are VM Programs

By default, when you enter in expressions at Guile's REPL, they are
first compiled to bytecode. Then that bytecode is executed to produce a
value. If the expression evaluates to a procedure, the result of this
process is a compiled procedure.

A compiled procedure is a compound object consisting of its bytecode and
a reference to any captured lexical variables. In addition, when a
procedure is compiled, it has associated metadata written to side
tables, for instance a line number mapping, or its docstring. You can
pick apart these pieces with the accessors in @code{(system vm
program)}. @xref{Compiled Procedures}, for a full API reference.

A procedure may reference data that was statically allocated when the
procedure was compiled. For example, a pair of immediate objects
(@pxref{Immediate objects}) can be allocated directly in the memory
segment that contains the compiled bytecode, and accessed directly by
the bytecode.

Another use for statically allocated data is to serve as a cache for a
bytecode instruction. Top-level variable lookups are handled in this
way. If the @code{toplevel-box} instruction finds that it does not have
a cached variable for a top-level reference, it accesses other static
data to resolve the reference, and fills in the cache slot. Thereafter
all access to the variable goes through the cache cell. The variable's
value may change in the future, but the variable itself will not.
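
The caching idea can be imitated in plain Scheme. The following sketch
is an analogy, not the implementation of @code{toplevel-box}:
@code{make-cached-toplevel-ref} is a name invented for this example, and
the cache is an ordinary closed-over variable rather than a slot in
static data. It relies only on @code{current-module},
@code{module-variable} and @code{variable-ref}.

@example
;; Returns a thunk that looks up SYM lazily, on first use, and
;; afterwards goes straight through the cached variable object.
(define (make-cached-toplevel-ref sym)
  (let ((cache #f)
        (mod (current-module)))
    (lambda ()
      (unless cache
        (set! cache (or (module-variable mod sym)
                        (error "unbound variable" sym))))
      (variable-ref cache))))

(define x 1)
(define ref-x (make-cached-toplevel-ref 'x))
(ref-x)       ; => 1, performs the lookup and fills the cache
(set! x 2)    ; the value changes, the variable object does not
(ref-x)       ; => 2, served through the cached variable
@end example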

We can see how these concepts tie together by disassembling the
@code{foo} function we defined earlier to see what is going on:

@smallexample
scheme@@(guile-user)> (define (foo a) (lambda (b) (list foo a b)))
scheme@@(guile-user)> ,x foo
Disassembly of #<procedure foo (a)> at #x203be34:

   0    (assert-nargs-ee/locals 2 1)    ;; 1 arg, 1 local at (unknown file):1:0
   1    (make-closure 2 6 1)            ;; anonymous procedure at #x203be50 (1 free var)
   4    (free-set! 2 1 0)               ;; free var 0
   6    (return 2)

----------------------------------------
Disassembly of anonymous procedure at #x203be50:

   0    (assert-nargs-ee/locals 2 3)    ;; 1 arg, 3 locals at (unknown file):1:0
   1    (toplevel-box 2 73 57 71 #t)    ;; `foo'
   6    (box-ref 2 2)
   7    (make-short-immediate 3 772)    ;; ()
   8    (cons 3 1 3)
   9    (free-ref 4 0 0)                ;; free var 0
  11    (cons 3 4 3)
  12    (cons 2 2 3)
  13    (return 2)
@end smallexample

First there's some prelude, where @code{foo} checks that it was called
with only 1 argument. Then at @code{ip} 1, we allocate a new closure
and store it in slot 2. The `6' in the @code{(make-closure 2 6 1)} is a
relative offset from the instruction pointer of the code for the
closure.

A closure is code with data. We already have the code part initialized;
what remains is to set the data. @code{Ip} 4 initializes free variable
0 in the new closure with the value from local variable 1, which
corresponds to the first argument of @code{foo}: `a'. Finally we return
the closure.

The second stanza disassembles the code for the closure. After the
prelude, we load the variable for the toplevel variable @code{foo} into
local variable 2. This lookup occurs lazily, the first time the
variable is actually referenced, and the location of the lookup is
cached so that future references are very cheap. @xref{Top-Level
Environment Instructions}, for more details. The @code{box-ref}
dereferences the variable cell, replacing the contents of local 2.

What follows is a sequence of conses to build up the result list.
@code{Ip} 7 makes the tail of the list. @code{Ip} 8 conses on the value
in local 1, corresponding to the first argument to the closure: `b'.
@code{Ip} 9 loads free variable 0 of local 0 -- the procedure being
called -- into slot 4, then @code{ip} 11 conses it onto the list.
Finally we cons local 2, containing the @code{foo} toplevel, onto the
front of the list, and we return it.

@node Instruction Set
@subsection Instruction Set

There are currently about 130 instructions in Guile's virtual machine.
These instructions represent atomic units of a program's execution.
Ideally, they perform one task without conditional branches, then
dispatch to the next instruction in the stream.

Instructions themselves are composed of 1 or more 32-bit units. The low
8 bits of the first word indicate the opcode, and the rest of the
instruction describes the operands. There are a number of different ways
operands can be encoded.

@table @code
@item u@var{n}
An unsigned @var{n}-bit integer. Usually indicates the index of a local
variable, but some instructions interpret these operands as immediate
values.
@item l24
An offset from the current @code{ip}, in 32-bit units, as a signed
24-bit value. Indicates a bytecode address, for a relative jump.
@item i16
@itemx i32
An immediate Scheme value (@pxref{Immediate objects}), encoded directly
in 16 or 32 bits.
@item a32
@itemx b32
An immediate Scheme value, encoded as a pair of 32-bit words.
@code{a32} and @code{b32} values always go together on the same opcode,
and indicate the high and low bits, respectively. Normally only used on
64-bit systems.
@item n32
A statically allocated non-immediate. The address of the non-immediate
is encoded as a signed 32-bit integer, and indicates a relative offset
in 32-bit units. Think of it as @code{SCM x = ip + offset}.
@item s32
Indirect Scheme value, like @code{n32} but indirected. Think of it as
@code{SCM *x = ip + offset}.
@item l32
@itemx lo32
An ip-relative address, as a signed 32-bit integer. Could indicate a
bytecode address, as in @code{make-closure}, or a non-immediate address,
as with @code{static-patch!}.

@code{l32} and @code{lo32} are the same from the perspective of the
virtual machine. The difference is that an assembler might want to
allow an @code{lo32} address to be specified as a label and then some
number of words offset from that label, for example when patching a
field of a statically allocated object.
@item b1
A boolean value: 1 for true, otherwise 0.
@item x@var{n}
An ignored sequence of @var{n} bits.
@end table
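
As a small illustration of the signed encodings, the sketch below pulls
an @code{l24} operand out of a 32-bit word. It assumes, following the
description above, that the opcode occupies the low 8 bits and the
@code{l24} operand fills the remaining 24 bits; the procedure names are
invented for this example.

@example
(define (word-opcode word)
  (logand word #xff))

(define (word-l24 word)
  ;; Take the upper 24 bits, then sign-extend.
  (let ((raw (logand (ash word -8) #xffffff)))
    (if (>= raw #x800000)
        (- raw #x1000000)    ; negative offset: a backward jump
        raw)))

(word-l24 #x00000321)   ; => 3
(word-l24 #xffffff21)   ; => -1
@end example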

An instruction is specified by giving its name, then describing its
operands. The operands are packed by 32-bit words, with earlier
operands occupying the lower bits.

For example, consider the following instruction specification:

@deftypefn Instruction {} free-set! u12:@var{dst} u12:@var{src} x8:@var{_} u24:@var{idx}
Set free variable @var{idx} from the closure @var{dst} to @var{src}.
@end deftypefn

The first word in the instruction will start with the 8-bit value
corresponding to the @code{free-set!} opcode in the low bits, followed by
@var{dst} and @var{src} as 12-bit values. The second word starts with 8
dead bits, followed by the index as a 24-bit immediate value.
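
The packing just described can be written out with @code{logior} and
@code{ash}. This is a sketch under stated assumptions: the opcode number
42 is made up for illustration (real opcode values are not given in this
section), and the procedure names are hypothetical.

@example
;; Hypothetical opcode number, for illustration only.
(define hypothetical-free-set!-opcode 42)

(define (encode-free-set! dst src idx)
  ;; Returns the two 32-bit words of the instruction as a list.
  (list (logior hypothetical-free-set!-opcode
                (ash dst 8)       ; u12:dst in bits 8-19
                (ash src 20))     ; u12:src in bits 20-31
        (ash idx 8)))             ; x8 dead bits, then u24:idx

(define (decode-free-set! words)
  (let ((w0 (car words))
        (w1 (cadr words)))
    (list (logand (ash w0 -8) #xfff)     ; dst
          (logand (ash w0 -20) #xfff)    ; src
          (ash w1 -8))))                 ; idx

(number->string (car (encode-free-set! 2 1 0)) 16)   ; => "10022a"
(decode-free-set! (encode-free-set! 2 1 0))          ; => (2 1 0)
@end example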

Sometimes the compiler can figure out that it is compiling a special
case that can be run more efficiently. So, for example, while Guile
offers a generic test-and-branch instruction, it also offers specific
instructions for special cases, so that the following cases all have
their own test-and-branch instructions:

@example
(if pred then else)
(if (not pred) then else)
(if (null? l) then else)
(if (not (null? l)) then else)
@end example

In addition, some Scheme primitives have their own inline
implementations. For example, in the previous section we saw
@code{cons}.

Guile's instruction set is a @emph{complete} instruction set, in that it
provides the instructions that are suited to the problem, and is not
concerned with making a minimal, orthogonal set of instructions. More
instructions may be added over time.

@menu
* Lexical Environment Instructions::
* Top-Level Environment Instructions::
* Procedure Call and Return Instructions::
* Function Prologue Instructions::
* Trampoline Instructions::
* Branch Instructions::
* Data Constructor Instructions::
* Loading Instructions::
* Dynamic Environment Instructions::
* Miscellaneous Instructions::
* Inlined Scheme Instructions::
* Inlined Mathematical Instructions::
* Inlined Bytevector Instructions::
@end menu


@node Lexical Environment Instructions
@subsubsection Lexical Environment Instructions

These instructions access and mutate the lexical environment of a
compiled procedure---its free and bound variables.

Some of these instructions have @code{long-} variants, the difference
being that they take 16-bit arguments, encoded in big-endian order,
instead of the normal 8-bit range.

@xref{Stack Layout}, for more information on the format of stack frames.

@deffn Instruction local-ref index
@deffnx Instruction long-local-ref index
Push onto the stack the value of the local variable located at
@var{index} within the current stack frame.

Note that arguments and local variables are all in one block. Thus the
first argument, if any, is at index 0, and local bindings follow the
arguments.
@end deffn

@deffn Instruction local-set index
@deffnx Instruction long-local-set index
Pop the Scheme object located on top of the stack and make it the new
value of the local variable located at @var{index} within the current
stack frame.
@end deffn

@deffn Instruction box index
Pop a value off the stack, and set the @var{index}th local variable
to a box containing that value. A shortcut for @code{make-variable}
then @code{local-set}, used when binding boxed variables.
@end deffn

@deffn Instruction empty-box index
Set the @var{index}th local variable to a box containing a variable
whose value is unbound. Used when compiling some @code{letrec}
expressions.
@end deffn

@deffn Instruction local-boxed-ref index
@deffnx Instruction local-boxed-set index
Get or set the value of the variable located at @var{index} within the
current stack frame. A shortcut for @code{local-ref} then
@code{variable-ref} or @code{variable-set}, respectively.
@end deffn

@deffn Instruction free-ref index
Push the value of the captured variable located at position
@var{index} within the program's vector of captured variables.
@end deffn

@deffn Instruction free-boxed-ref index
@deffnx Instruction free-boxed-set index
Get or set a boxed free variable. A shortcut for @code{free-ref} then
@code{variable-ref} or @code{variable-set}, respectively.

Note that there is no @code{free-set} instruction, as variables that are
@code{set!} must be boxed.
@end deffn

@deffn Instruction make-closure num-free-vars
Pop @var{num-free-vars} values and a program object off the stack in
that order, and push a new program object closing over the given free
variables. @var{num-free-vars} is encoded as a two-byte big-endian
value.

The free variables are stored in an array, inline to the new program
object, in the order that they were on the stack (not the order they are
popped off). The new closure shares state with the original program. At
the time of this writing, the space overhead of closures is 3 words,
plus one word for each free variable.
@end deffn

@deffn Instruction fix-closure index
Fix up the free variables array of the closure stored in the
@var{index}th local variable. @var{index} is a two-byte big-endian
integer.

This instruction will pop as many values from the stack as are in the
corresponding closure's free variables array. The topmost value on the
stack will be stored as the closure's last free variable, with other
values filling in free variable slots in order.

@code{fix-closure} is part of a hack for allocating mutually recursive
procedures. The hack is to store the procedures in their corresponding
local variable slots, with space already allocated for free variables.
Then once they are all in place, this instruction fixes up their
procedures' free variable bindings in place. This allows most
@code{letrec}-bound procedures to be allocated unboxed on the stack.
@end deffn
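
For reference, the kind of source code that @code{fix-closure} is meant
to serve looks like the following, where each @code{letrec}-bound
procedure has the other as a free variable. This is only an example of
the source pattern; whether the compiler actually emits
@code{fix-closure} for it, rather than optimizing the closures away, is
not something this sketch claims. The names are invented for the
example.

@example
(define (parity n)
  (letrec ((ev? (lambda (n) (if (= n 0) #t (od? (- n 1)))))
           (od? (lambda (n) (if (= n 0) #f (ev? (- n 1))))))
    ;; Neither closure can be completed before the other exists,
    ;; hence the need to fix up free variables after allocation.
    (if (ev? n) 'even 'odd)))

(parity 10)   ; => even
(parity 7)    ; => odd
@end example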

@deffn Instruction local-bound? index
@deffnx Instruction long-local-bound? index
Push @code{#t} on the stack if the @var{index}th local variable has
been assigned, or @code{#f} otherwise. Mostly useful for handling
optional arguments in procedure prologues.
@end deffn


@node Top-Level Environment Instructions
@subsubsection Top-Level Environment Instructions

These instructions access values in the top-level environment: bindings
that were not lexically apparent at the time that the code in question
was compiled.

The location in which a toplevel binding is stored can be looked up once
and cached for later. The binding itself may change over time, but its
location will stay constant.

Currently only toplevel references within procedures are cached, as only
procedures have a place to cache them, in their object tables.

@deffn Instruction toplevel-ref index
@deffnx Instruction long-toplevel-ref index
Push the value of the toplevel binding whose location is stored at
position @var{index} in the current procedure's object table. The
@code{long-} variant encodes the index over two bytes.

Initially, a cell in a procedure's object table that is used by
@code{toplevel-ref} is initialized to one of two forms. The normal case
is that the cell holds a symbol, whose binding will be looked up
relative to the module that was current when the current program was
created.

Alternatively, the lookup may be performed relative to a particular
module, determined at compile-time (e.g.@: via @code{@@} or
@code{@@@@}). In that case, the cell in the object table holds a list:
@code{(@var{modname} @var{sym} @var{public?})}. The symbol @var{sym}
will be looked up in the module named @var{modname} (a list of
symbols). The lookup will be performed against the module's public
interface, unless @var{public?} is @code{#f}, which it is for example
when compiling @code{@@@@}.

In any case, if the symbol is unbound, an error is signalled.
Otherwise the initial form is replaced with the looked-up variable, an
in-place mutation of the object table. This mechanism provides for
lazy variable resolution, and an important cached fast-path once the
variable has been successfully resolved.

This instruction pushes the value of the variable onto the stack.
@end deffn
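
At the source level, the two module-qualified forms mentioned above look
like this. The example uses @code{fold} from @code{(srfi srfi-1)} purely
for illustration; the comments relate each form to the @var{public?}
flag described above.

@example
;; Lookup against the public interface of (srfi srfi-1),
;; i.e. the case where public? is #t.
((@@ (srfi srfi-1) fold) + 0 '(1 2 3))      ; => 6

;; Lookup in the module itself, bypassing its export list,
;; i.e. the case where public? is #f.
((@@@@ (srfi srfi-1) fold) + 0 '(1 2 3))    ; => 6
@end example

Here both forms find the same binding because @code{fold} is exported;
the second form would also reach a binding that is not.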

@deffn Instruction toplevel-set index
@deffnx Instruction long-toplevel-set index
Pop a value off the stack, and set it as the value of the toplevel
variable stored at @var{index} in the object table. If the variable
has not yet been looked up, we do the lookup as in
@code{toplevel-ref}.
@end deffn

@deffn Instruction define
Pop a symbol and a value from the stack, in that order. Look up its
binding in the current toplevel environment, creating the binding if
necessary. Set the variable to the value.
@end deffn

@deffn Instruction link-now
Pop a value, @var{x}, from the stack. Look up the binding for @var{x},
according to the rules for @code{toplevel-ref}, and push that variable
on the stack. If the lookup fails, an error will be signalled.

This instruction is mostly used when loading programs, because it can
do toplevel variable lookups without an object table.
@end deffn

@deffn Instruction variable-ref
Dereference the variable object which is on top of the stack and
replace it by the value of the variable it represents.
@end deffn

@deffn Instruction variable-set
Pop off two objects from the stack, a variable and a value, and set
the variable to the value.
@end deffn

@deffn Instruction variable-bound?
Pop off the variable object from top of the stack and push @code{#t} if
it is bound, or @code{#f} otherwise. Mostly useful in procedure
prologues for defining default values for boxed optional variables.
@end deffn

@deffn Instruction make-variable
Replace the top object on the stack with a variable containing it.
Used in some circumstances when compiling @code{letrec} expressions.
@end deffn
81fd3152 610
acc51c3e
AW
611@node Procedure Call and Return Instructions
612@subsubsection Procedure Call and Return Instructions
8680d53b 613
acc51c3e 614@c something about the calling convention here?
8680d53b 615
acc51c3e 616@deffn Instruction new-frame
8274228f
AW
617Push a new frame on the stack, reserving space for the dynamic link,
618return address, and the multiple-values return address. The frame
619pointer is not yet updated, because the frame is not yet active -- it
620has to be patched by a @code{call} instruction to get the return
621address.
8680d53b
AW
622@end deffn
623
624@deffn Instruction call nargs
bd7aa35f 625Call the procedure located at @code{sp[-nargs]} with the @var{nargs}
81fd3152
AW
626arguments located from @code{sp[-nargs + 1]} to @code{sp[0]}.
627
acc51c3e
AW
628This instruction requires that a new frame be pushed on the stack before
629the procedure, via @code{new-frame}. @xref{Stack Layout}, for more
630information. It patches up that frame with the current @code{ip} as the
631return address, then dispatches to the first instruction in the called
632procedure, relying on the called procedure to return one value to the
633newly-created continuation. Because the new frame pointer will point to
634@code{sp[-nargs + 1]}, the arguments don't have to be shuffled around --
635they are already in place.
8680d53b
AW
636@end deffn

@deffn Instruction tail-call nargs
Transfer control to the procedure located at @code{sp[-nargs]} with the
@var{nargs} arguments located from @code{sp[-nargs + 1]} to
@code{sp[0]}.

Unlike @code{call}, which requires a new frame to be pushed onto the
stack, @code{tail-call} simply shuffles down the procedure and arguments
to the current stack frame. This instruction implements tail calls as
required by RnRS.
@end deffn

@deffn Instruction apply nargs
@deffnx Instruction tail-apply nargs
Like @code{call} and @code{tail-call}, except that the top item on the
stack must be a list. The elements of that list are then pushed on the
stack and treated as additional arguments, replacing the list itself,
then the procedure is invoked as usual.
@end deffn
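
The source-level counterpart of this behaviour is Scheme's
@code{apply}, whose last argument is a list that is spread into
individual arguments. The following lines only show that standard
behaviour; they make no claim about which instructions the compiler
selects for them.

@example
;; The trailing list is spread: + receives 1, 2, 3 and 4.
(apply + 1 2 '(3 4))       ; => 10

;; The list may also supply all remaining arguments.
(apply list 'a '(b c))     ; => (a b c)
@end example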

@deffn Instruction call/nargs
@deffnx Instruction tail-call/nargs
These are like @code{call} and @code{tail-call}, except they take the
number of arguments from the stack instead of the instruction stream.
These instructions are used in the implementation of multiple value
returns, where the actual number of values is pushed on the stack.
@end deffn

@deffn Instruction mv-call nargs offset
Like @code{call}, except that a multiple-value continuation is created
in addition to a single-value continuation.

The offset (a three-byte value) is an offset within the instruction
stream; the multiple-value return address in the new frame (@pxref{Stack
Layout}) will be set to the normal return address plus this offset.
Instructions at that offset will expect the top value of the stack to be
the number of values, and below that the values themselves, pushed
separately.
@end deffn
bd7aa35f 676
8274228f
AW
677@deffn Instruction return
678Free the program's frame, returning the top value from the stack to
679the current continuation. (The stack should have exactly one value on
680it.)
681
682Specifically, the @code{sp} is decremented to one below the current
683@code{fp}, the @code{ip} is reset to the current return address, the
684@code{fp} is reset to the value of the current dynamic link, and then
acc51c3e 685the returned value is pushed on the stack.
8274228f
AW
686@end deffn
687
bd7aa35f 688@deffn Instruction return/values nvalues
acc51c3e
AW
689@deffnx Instruction return/nvalues
690Return the top @var{nvalues} to the current continuation. In the case of
691@code{return/nvalues}, @var{nvalues} itself is first popped from the top
692of the stack.
bd7aa35f
AW
693
694If the current continuation is a multiple-value continuation,
695@code{return/values} pushes the number of values on the stack, then
696returns as in @code{return}, but to the multiple-value return address.
697
679cceed 698Otherwise if the current continuation accepts only one value, i.e.@: the
bd7aa35f
AW
699multiple-value return address is @code{NULL}, then we assume the user
700only wants one value, and we give them the first one. If there are no
701values, an error is signaled.
8680d53b 702@end deffn
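
The two cases of @code{return/values} can be sketched in Python as
follows. This is only a model under stated assumptions: the function name
@code{return_values} is hypothetical, @code{None} stands in for a
@code{NULL} multiple-value return address, and the result is a tuple
describing which return address is taken and what ends up on the
caller's stack.

```python
def return_values(values, mv_return_address):
    """Model return/values: choose what the continuation receives.

    `mv_return_address` is None for a single-value continuation
    (standing in for a NULL multiple-value return address).
    Returns (address_kind, values_pushed).
    """
    if mv_return_address is not None:
        # Multiple-value continuation: push the values, then their count.
        return ("mv", list(values) + [len(values)])
    if not values:
        raise ValueError("zero values returned to single-value continuation")
    # Single-value continuation: keep only the first value.
    return ("single", [values[0]])

mv_result = return_values([10, 20, 30], mv_return_address=0x40)
single_result = return_values([10, 20, 30], mv_return_address=None)
```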

@deffn Instruction return/values* nvalues
Like a combination of @code{apply} and @code{return/values}, in which
the top value on the stack is interpreted as a list of additional
values. This is an optimization for the common @code{(apply values
...)} case.
@end deffn

@deffn Instruction truncate-values nbinds nrest
Used in multiple-value continuations, this instruction takes the
values that are on the stack (including the number-of-values marker)
and truncates them for a binding construct.

For example, a call to @code{(receive (x y . z) (foo) ...)} would,
logically speaking, pop off the values returned from @code{(foo)} and
push them as three values, corresponding to @code{x}, @code{y}, and
@code{z}. In that case, @var{nbinds} would be 3, and @var{nrest} would
be 1 (to indicate that one of the bindings was a rest argument).

Signals an error if there is an insufficient number of values.
@end deffn
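
The @code{receive} example above can be modeled with a short Python
sketch. The stack is a list whose last element is the top, with the
number-of-values marker pushed last. The function name
@code{truncate_values} and the choice to silently drop surplus values
when there is no rest binding are assumptions of this sketch, inferred
from the word ``truncates''.

```python
def truncate_values(stack, nbinds, nrest):
    """Model truncate-values on a list-based stack (top is the last item)."""
    nvalues = stack.pop()               # the number-of-values marker
    values = [stack.pop() for _ in range(nvalues)]
    values.reverse()                    # restore push order
    required = nbinds - 1 if nrest else nbinds
    if len(values) < required:
        raise ValueError("too few values for binding construct")
    if nrest:
        # The final binding collects every remaining value as a list.
        bound = values[:required] + [values[required:]]
    else:
        # Without a rest binding, extra values are dropped (assumed).
        bound = values[:nbinds]
    stack.extend(bound)
    return stack

# (receive (x y . z) (foo) ...) where (foo) returns four values.
with_rest = truncate_values(["a", "b", "c", "d", 4], nbinds=3, nrest=1)
# The same binding form when (foo) returns exactly two values.
exact = truncate_values(["a", "b", 2], nbinds=3, nrest=1)
```

In the first call @code{z} is bound to a two-element list; in the second
it is bound to the empty list.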

@deffn Instruction call/cc
@deffnx Instruction tail-call/cc
Capture the current continuation, and then call (or tail-call) the
procedure on the top of the stack, with the continuation as the
argument.

@code{call/cc} does not require a @code{new-frame} to be pushed on the
stack, as @code{call} does, because it needs to capture the stack
before the frame is pushed.

Both the VM continuation and the C continuation are captured.
@end deffn


@node Function Prologue Instructions
@subsubsection Function Prologue Instructions

A function call in Guile is very cheap: the VM simply hands control to
the procedure. The procedure itself is responsible for asserting that it
has been passed an appropriate number of arguments. This strategy allows
arbitrarily complex argument parsing idioms to be developed, without
harming the common case.

For example, only calls to keyword-argument procedures ``pay'' for the
cost of parsing keyword arguments. (At the time of this writing, calling
procedures with keyword arguments is typically two to four times as
costly as calling procedures with a fixed set of arguments.)

@deffn Instruction assert-nargs-ee n
@deffnx Instruction assert-nargs-ge n
Assert that the current procedure has been passed exactly @var{n}
arguments, for the @code{-ee} case, or @var{n} or more arguments, for
the @code{-ge} case. @var{n} is encoded over two bytes.

The number of arguments is determined by subtracting the frame pointer
from the stack pointer (@code{sp - (fp - 1)}). @xref{Stack Layout}, for
more details on stack frames.
@end deffn
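
The argument-count formula can be checked with a few lines of Python.
This sketch treats @code{sp} and @code{fp} as plain slot indices, which
is a simplification; the function names are hypothetical and the error
types are placeholders for whatever the VM actually signals.

```python
def count_args(sp, fp):
    """Number of argument slots in the current frame: sp - (fp - 1)."""
    return sp - (fp - 1)

def assert_nargs_ee(sp, fp, n):
    """Raise unless exactly n arguments were passed."""
    if count_args(sp, fp) != n:
        raise TypeError("wrong number of arguments")

def assert_nargs_ge(sp, fp, n):
    """Raise unless at least n arguments were passed."""
    if count_args(sp, fp) < n:
        raise TypeError("too few arguments")

# With fp at slot 10 and sp at slot 12, the formula counts three slots.
nargs = count_args(sp=12, fp=10)
```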

@deffn Instruction br-if-nargs-ne n offset
@deffnx Instruction br-if-nargs-gt n offset
@deffnx Instruction br-if-nargs-lt n offset
Jump to @var{offset} if the number of arguments is not equal to, greater
than, or less than @var{n}, respectively. @var{n} is encoded over two
bytes, and @var{offset} has the normal three-byte encoding.

These instructions are used to implement multiple arities, as in
@code{case-lambda}. @xref{Case-lambda}, for more information.
@end deffn

@deffn Instruction bind-optionals n
If the procedure has been called with fewer than @var{n} arguments, fill
in the remaining arguments with an unbound value (@code{SCM_UNDEFINED}).
@var{n} is encoded over two bytes.

The optionals can be later initialized conditionally via the
@code{local-bound?} instruction.
@end deffn
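
A minimal Python sketch of this padding step follows. The sentinel
@code{UNDEFINED} stands in for @code{SCM_UNDEFINED}, and both function
names are hypothetical; the point is only to show how a prologue can
later tell a supplied optional from a missing one.

```python
UNDEFINED = object()  # stand-in for SCM_UNDEFINED

def bind_optionals(args, n):
    """Pad `args` in place with UNDEFINED until it holds at least n slots."""
    while len(args) < n:
        args.append(UNDEFINED)
    return args

def local_bound(args, index):
    """Model local-bound?: true if the slot was actually supplied."""
    return args[index] is not UNDEFINED

args = bind_optionals([1, 2], 4)
# A prologue could now give slot 2 its default value.
if not local_bound(args, 2):
    args[2] = "default"
```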

@deffn Instruction push-rest n
Pop off excess arguments (more than @var{n}), collecting them into a
list, and push that list. Used to bind a rest argument, if the procedure
has no keyword arguments. Procedures with keyword arguments use
@code{bind-rest} instead.
@end deffn
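
In Python terms, with the argument slots modeled as a list, the effect of
@code{push-rest} can be sketched as below. The function name
@code{push_rest} is hypothetical.

```python
def push_rest(args, n):
    """Collect arguments beyond the first n into a list and push it."""
    rest = args[n:]
    del args[n:]
    args.append(rest)
    return args

two_extra = push_rest([1, 2, 3, 4], 2)
none_extra = push_rest([1, 2], 2)
```

When there are no excess arguments, the rest argument is bound to the
empty list.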

@deffn Instruction bind-rest n idx
Pop off excess arguments (more than @var{n}), collecting them into a
list. The list is then assigned to the @var{idx}th local variable.
@end deffn

@deffn Instruction bind-optionals/shuffle nreq nreq-and-opt ntotal
@deffnx Instruction bind-optionals/shuffle-or-br nreq nreq-and-opt ntotal offset
Shuffle keyword arguments to the top of the stack, filling in the holes
with @code{SCM_UNDEFINED}. Each argument is encoded over two bytes.

This instruction is used by procedures with keyword arguments.
@var{nreq} is the number of required arguments to the procedure, and
@var{nreq-and-opt} is the total number of positional arguments (required
plus optional). @code{bind-optionals/shuffle} will scan the stack from
the @var{nreq}th argument up to the @var{nreq-and-opt}th, and start
shuffling when it sees the first keyword argument or runs out of
positional arguments.

@code{bind-optionals/shuffle-or-br} does the same, except that it checks
if there are too many positional arguments before shuffling. If this is
the case, it jumps to @var{offset}, encoded using the normal three-byte
encoding.

Shuffling simply moves the keyword arguments past the total number of
arguments, @var{ntotal}, which includes keyword and rest arguments. The
free slots created by the shuffle are filled in with
@code{SCM_UNDEFINED}, so they may be conditionally initialized later in
the function's prologue.
@end deffn

@deffn Instruction bind-kwargs idx ntotal flags
Parse keyword arguments, assigning their values to the corresponding
local variables. The keyword arguments should already have been shuffled
above the @var{ntotal}th stack slot by @code{bind-optionals/shuffle}.

The parsing is driven by a keyword arguments association list, looked up
from the @var{idx}th element of the procedure's object array. The alist
is a list of pairs of the form @code{(@var{kw} . @var{index})}, mapping
keyword arguments to their local variable indices.

There are two bitflags that affect the parser, @code{allow-other-keys?}
(@code{0x1}) and @code{rest?} (@code{0x2}). Unless
@code{allow-other-keys?} is set, the parser will signal an error if an
unknown key is found. If @code{rest?} is set, errors parsing the
keyword arguments will be ignored, as a later @code{bind-rest}
instruction will collect all of the tail arguments, including the
keywords, into a list. Otherwise if the keyword arguments are invalid,
an error is signaled.

@var{idx} and @var{ntotal} are encoded over two bytes each, and
@var{flags} is encoded over one byte.
@end deffn
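
The alist-driven parse can be approximated in Python as follows. This is
a sketch under several assumptions, not Guile's implementation: keywords
are modeled as strings, the shuffled keyword arguments arrive as a flat
list, the function name @code{bind_kwargs} is hypothetical, and
@code{rest?} is taken to suppress only the error for a keyword that has
no value.

```python
ALLOW_OTHER_KEYS = 0x1
REST = 0x2

def bind_kwargs(locals_, kwargs, alist, flags):
    """Assign keyword values to local slots.

    `kwargs` is a flat list [kw, value, kw, value, ...]; `alist` is a
    list of (kw, index) pairs. Keywords are modeled as strings such
    as "#:color" (a convention of this sketch only).
    """
    i = 0
    while i < len(kwargs):
        key = kwargs[i]
        if i + 1 >= len(kwargs):
            # A keyword with no value is malformed.
            if flags & REST:
                break
            raise ValueError("keyword without value: %s" % key)
        slot = None
        for kw, index in alist:
            if kw == key:
                slot = index
                break
        if slot is not None:
            locals_[slot] = kwargs[i + 1]
        elif not (flags & ALLOW_OTHER_KEYS):
            raise KeyError("unknown keyword: %s" % key)
        i += 2
    return locals_

alist = [("#:color", 1), ("#:size", 2)]
bound = bind_kwargs(["req", None, None],
                    ["#:size", 3, "#:color", "red"], alist, 0)
```

Keeping the mapping as a list of pairs, rather than a hash table, mirrors
the association list described above; keyword lists are short, so a
linear scan is a reasonable model.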

@deffn Instruction reserve-locals n
Resets the stack pointer to have space for @var{n} local variables,
including the arguments. If this operation increments the stack pointer,
as in a push, the new slots are filled with @code{SCM_UNBOUND}. If this
operation decrements the stack pointer, any excess values are dropped.

@code{reserve-locals} is typically used after argument parsing to
reserve space for local variables.
@end deffn

@deffn Instruction assert-nargs-ee/locals n
@deffnx Instruction assert-nargs-ge/locals n
A combination of @code{assert-nargs-ee} (or @code{assert-nargs-ge}) and
@code{reserve-locals}. The number of arguments is encoded in the lower
three bits of @var{n}, a one-byte value. The number of additional local
variables is taken from the upper five bits of @var{n}.
@end deffn
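
The one-byte packing can be illustrated with a little bit arithmetic in
Python. The helper names are hypothetical; only the bit layout (low three
bits for the argument count, high five bits for the extra locals) comes
from the description above.

```python
def pack_nargs_locals(nargs, nlocals):
    """Pack into one byte: low 3 bits = nargs, high 5 bits = nlocals."""
    if not 0 <= nargs < 8:
        raise ValueError("nargs must fit in 3 bits")
    if not 0 <= nlocals < 32:
        raise ValueError("nlocals must fit in 5 bits")
    return (nlocals << 3) | nargs

def unpack_nargs_locals(n):
    """Return (nargs, nlocals) from the one-byte operand."""
    return n & 0x7, n >> 3

operand = pack_nargs_locals(nargs=2, nlocals=5)
decoded = unpack_nargs_locals(operand)
```

The layout limits these combined instructions to procedures with at most
7 arguments and at most 31 additional locals.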


@node Trampoline Instructions
@subsubsection Trampoline Instructions

Though most applicable objects in Guile are procedures implemented
in bytecode, not all are. There are primitives, continuations, and other
procedure-like objects that have their own calling convention. Instead
of adding special cases to the @code{call} instruction, Guile wraps
these other applicable objects in VM trampoline procedures, then
provides special support for these objects in bytecode.

Trampoline procedures are typically generated by Guile at runtime, for
example in response to a call to @code{scm_c_make_gsubr}. As such, a
compiler probably shouldn't emit code with these instructions. However,
it's still interesting to know how these things work, so we document
these trampoline instructions here.

@deffn Instruction subr-call nargs
Pop off a foreign pointer (which should have been pushed on by the
trampoline), and call it directly, with the @var{nargs} arguments from
the stack. Return the resulting value or values to the calling
procedure.
@end deffn

@deffn Instruction foreign-call nargs
Pop off an internal foreign object (which should have been pushed on by
the trampoline), and call that foreign function with the @var{nargs}
arguments from the stack. Return the resulting value to the calling
procedure.
@end deffn

@deffn Instruction continuation-call
Pop off an internal continuation object (which should have been pushed
on by the trampoline), and reinstate that continuation. All of the
procedure's arguments are passed to the continuation. Does not return.
@end deffn

@deffn Instruction partial-cont-call
Pop off two objects from the stack: the dynamic winds associated with
the partial continuation, and the VM continuation object. Unroll the
continuation onto the stack, rewinding the dynamic environment and
overwriting the current frame, and pass all arguments to the
continuation. Control flow proceeds where the continuation was captured.
@end deffn


@node Branch Instructions
@subsubsection Branch Instructions

All the conditional branch instructions described below work in the
same way:

@itemize
@item They pop off Scheme object(s) located on the stack for use in the
branch condition;
@item If the condition is true, then the instruction pointer is
increased by the offset passed as an argument to the branch
instruction;
@item Program execution proceeds with the next instruction (that is,
the one to which the instruction pointer points).
@end itemize

Note that the offset passed to the instruction is encoded as three 8-bit
integers, in big-endian order, effectively giving Guile a 24-bit
relative address space.
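
Decoding such an offset amounts to combining three bytes, most
significant first. The Python sketch below shows one way to do it. The
function name @code{decode_offset} is hypothetical, and treating the
24-bit value as a two's-complement signed number (so that backward jumps
are possible) is an assumption of this sketch; the text above does not
state the signedness.

```python
def decode_offset(b0, b1, b2, signed=True):
    """Combine three bytes, most significant first, into an offset."""
    value = (b0 << 16) | (b1 << 8) | b2
    if signed and value & 0x800000:
        value -= 0x1000000   # two's-complement sign extension (assumed)
    return value

forward = decode_offset(0x00, 0x01, 0x02)
backward = decode_offset(0xFF, 0xFF, 0xFE)
```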

@deffn Instruction br offset
Jump to @var{offset}. No values are popped.
@end deffn

@deffn Instruction br-if offset
Jump to @var{offset} if the object on the stack is not false.
@end deffn

@deffn Instruction br-if-not offset
Jump to @var{offset} if the object on the stack is false.
@end deffn

@deffn Instruction br-if-eq offset
Jump to @var{offset} if the two objects located on the stack are
equal in the sense of @code{eq?}. Note that, for this instruction, the
stack pointer is decremented by two Scheme objects instead of only
one.
@end deffn

@deffn Instruction br-if-not-eq offset
Same as @code{br-if-eq}, but jumps if the two objects are not
@code{eq?}.
@end deffn

@deffn Instruction br-if-null offset
Jump to @var{offset} if the object on the stack is @code{'()}.
@end deffn

@deffn Instruction br-if-not-null offset
Jump to @var{offset} if the object on the stack is not @code{'()}.
@end deffn


@node Data Constructor Instructions
@subsubsection Data Constructor Instructions

These instructions push simple immediate values onto the stack,
or construct compound data structures from values on the stack.

@deffn Instruction make-int8 value
Push @var{value}, an 8-bit integer, onto the stack.
@end deffn

@deffn Instruction make-int8:0
Push the immediate value @code{0} onto the stack.
@end deffn

@deffn Instruction make-int8:1
Push the immediate value @code{1} onto the stack.
@end deffn

@deffn Instruction make-int16 value
Push @var{value}, a 16-bit integer, onto the stack.
@end deffn

@deffn Instruction make-uint64 value
Push @var{value}, an unsigned 64-bit integer, onto the stack. The
value is encoded in 8 bytes, most significant byte first (big-endian).
@end deffn

@deffn Instruction make-int64 value
Push @var{value}, a signed 64-bit integer, onto the stack. The value
is encoded in 8 bytes, most significant byte first (big-endian), in
two's-complement representation.
@end deffn
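
The difference between the two 64-bit encodings shows up when the same
eight bytes are decoded both ways. The Python sketch below uses the
standard @code{int.from_bytes} method; the wrapper names are
hypothetical.

```python
def decode_uint64(data):
    """Decode 8 big-endian bytes as an unsigned integer."""
    return int.from_bytes(data, "big", signed=False)

def decode_int64(data):
    """Decode 8 big-endian bytes as a two's-complement signed integer."""
    return int.from_bytes(data, "big", signed=True)

all_ones = bytes([0xFF] * 8)
as_unsigned = decode_uint64(all_ones)
as_signed = decode_int64(all_ones)
```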

@deffn Instruction make-false
Push @code{#f} onto the stack.
@end deffn

@deffn Instruction make-true
Push @code{#t} onto the stack.
@end deffn

@deffn Instruction make-nil
Push @code{#nil} onto the stack.
@end deffn

@deffn Instruction make-eol
Push @code{'()} onto the stack.
@end deffn

@deffn Instruction make-char8 value
Push @var{value}, an 8-bit character, onto the stack.
@end deffn

@deffn Instruction make-char32 value
Push @var{value}, a 32-bit character, onto the stack. The value is
encoded in big-endian order.
@end deffn

@deffn Instruction make-symbol
Pops a string off the stack, and pushes a symbol.
@end deffn

@deffn Instruction make-keyword value
Pops a symbol off the stack, and pushes a keyword.
@end deffn

@deffn Instruction list n
Pops the top @var{n} values off the stack, consing them up into
a list, then pushes that list on the stack. What was the topmost value
will be the last element in the list. @var{n} is a two-byte value,
most significant byte first.
@end deffn
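
A small Python model of @code{list} makes the element order concrete. The
stack is a list whose last element is the top, and the two operand bytes
are passed separately; the function name @code{make_list} is
hypothetical.

```python
def make_list(stack, hi, lo):
    """Model list: n is a two-byte operand, most significant byte first."""
    n = (hi << 8) | lo
    if n > len(stack):
        raise IndexError("stack underflow")
    # Slice by absolute position so that n == 0 yields an empty list.
    items = stack[len(stack) - n:]
    del stack[len(stack) - n:]
    stack.append(items)
    return stack

result = make_list(["x", 1, 2, 3], 0x00, 0x03)
```

The value that was on top of the stack, @code{3}, ends up as the last
element of the new list.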

@deffn Instruction vector n
Create and fill a vector with the top @var{n} values from the stack,
popping off those values and pushing on the resulting vector. @var{n}
is a two-byte value, as in @code{list}.
@end deffn

@deffn Instruction make-struct n
Make a new struct from the top @var{n} values on the stack. The values
are popped, and the new struct is pushed.

The deepest value is used as the vtable for the struct, and the rest are
used in order as the field initializers. Tail arrays are not supported
by this instruction.
@end deffn

@deffn Instruction make-array n
Pop an array shape from the stack, then pop the remaining @var{n}
values, pushing a new array. @var{n} is encoded over three bytes.

The array shape should be appropriate to store @var{n} values.
@xref{Array Procedures}, for more information on array shapes.
@end deffn

Many of these data structures are constant, never changing over the
course of the different invocations of the procedure. In that case it is
often advantageous to make them once when the procedure is created, and
just reference them from the object table thereafter. @xref{Variables
and the VM}, for more information on the object table.

@deffn Instruction object-ref n
@deffnx Instruction long-object-ref n
Push the @var{n}th value from the current program's object vector. The
``long'' variant has a 16-bit index instead of an 8-bit index.
@end deffn


@node Loading Instructions
@subsubsection Loading Instructions

In addition to VM instructions, an instruction stream may contain
variable-length data embedded within it. This data is always preceded
by special loading instructions, which interpret the data and advance
the instruction pointer to the next VM instruction.

All of these loading instructions have a @code{length} parameter,
indicating the size of the embedded data, in bytes. The length itself
is encoded in 3 bytes.

@deffn Instruction load-number length
Load an arbitrary number from the instruction stream. The number is
embedded in the stream as a string.
@end deffn
@deffn Instruction load-string length
Load a string from the instruction stream. The string is assumed to be
encoded in the ``latin1'' locale.
@end deffn
@deffn Instruction load-wide-string length
Load a UTF-32 string from the instruction stream. @var{length} is the
length in bytes, not in codepoints.
@end deffn
@deffn Instruction load-symbol length
Load a symbol from the instruction stream. The symbol is assumed to be
encoded in the ``latin1'' locale. Symbols backed by wide strings may
be loaded via @code{load-wide-string} then @code{make-symbol}.
@end deffn
@deffn Instruction load-array length
Load a uniform array from the instruction stream. The shape and type
of the array are popped off the stack, in that order.
@end deffn
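
To make the length-prefixed layout concrete, here is a Python sketch of
how a @code{load-string} operand might be decoded. It assumes that the
three length bytes are stored most significant byte first, like branch
offsets; this section does not state the byte order. The function name
@code{load_string} is hypothetical, and the instruction pointer is
modeled as an index just past the opcode.

```python
def load_string(stream, ip):
    """Decode a load-string operand starting at `ip`.

    `ip` points just past the opcode. Returns (string, next_ip).
    """
    # Three-byte length, assumed big-endian.
    length = (stream[ip] << 16) | (stream[ip + 1] << 8) | stream[ip + 2]
    start = ip + 3
    end = start + length
    if end > len(stream):
        raise ValueError("embedded data runs past end of stream")
    return stream[start:end].decode("latin-1"), end

# A length of 5, the payload, then one byte of the next instruction.
stream = bytes([0x00, 0x00, 0x05]) + b"guile" + bytes([0x2A])
text, next_ip = load_string(stream, 0)
```

The returned index is where the next VM instruction begins, matching the
statement that loading instructions advance the instruction pointer past
their data.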

@deffn Instruction load-program
Load bytecode from the instruction stream, and push a compiled
procedure.

This instruction pops one value from the stack: the program's object
table, as a vector, or @code{#f} in the case that the program has no
object table. A program that does not reference toplevel bindings and
does not use @code{object-ref} does not need an object table.

This instruction is unlike the rest of the loading instructions,
because instead of parsing its data, it directly maps the instruction
stream onto a C structure, @code{struct scm_objcode}. @xref{Bytecode
and Objcode}, for more information.

The resulting compiled procedure will not have any free variables
captured, so it may be loaded only once but used many times to create
closures.
@end deffn

@node Dynamic Environment Instructions
@subsubsection Dynamic Environment Instructions

Guile's virtual machine has low-level support for @code{dynamic-wind},
dynamic binding, and composable prompts and aborts.

@deffn Instruction wind
Pop an unwind thunk and a wind thunk from the stack, in that order, and
push them onto the ``dynamic stack''. The unwind thunk will be called on
nonlocal exits, and the wind thunk on reentries. Used to implement
@code{dynamic-wind}.

Note that neither thunk is actually called; the compiler should emit
calls to the wind and unwind thunks for the normal dynamic-wind control
flow. @xref{Dynamic Wind}.
@end deffn

@deffn Instruction unwind
Pop off the top entry from the ``dynamic stack'', for example, a
wind/unwind thunk pair. @code{unwind} instructions should be properly
paired with their winding instructions, like @code{wind}.
@end deffn
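
The interplay between @code{wind}, @code{unwind}, and a nonlocal exit can
be modeled in Python with a toy dynamic stack. Everything here is
illustrative: the class and method names are hypothetical, and a real
nonlocal exit in Guile involves far more than running thunks.

```python
class DynamicStack:
    """A toy dynamic stack holding (wind, unwind) thunk pairs."""

    def __init__(self):
        self.entries = []

    def wind(self, wind_thunk, unwind_thunk):
        # Like the instruction, this records the pair without calling it.
        self.entries.append((wind_thunk, unwind_thunk))

    def unwind(self):
        # Pop the top entry; again, no thunk is called.
        return self.entries.pop()

    def nonlocal_exit(self, depth):
        """Run unwind thunks, innermost first, down to `depth` entries."""
        while len(self.entries) > depth:
            _, unwind_thunk = self.entries.pop()
            unwind_thunk()

log = []
dyn = DynamicStack()
dyn.wind(lambda: log.append("in-1"), lambda: log.append("out-1"))
dyn.wind(lambda: log.append("in-2"), lambda: log.append("out-2"))
dyn.nonlocal_exit(0)
```

Only the unwind thunks run during the exit, innermost first; the wind
thunks would run in the opposite order if the extent were re-entered.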

@deffn Instruction push-fluid
Pop a value and a fluid from the stack, in that order. Set the fluid
to the value by creating a with-fluids object and pushing that object
on the dynamic stack. @xref{Fluids and Dynamic States}.
@end deffn

@deffn Instruction pop-fluid
Pop a with-fluids object from the dynamic stack, and swap the current
values of its fluids with the saved values of its fluids. In this way,
the dynamic environment is left as it was before the corresponding
@code{push-fluid} instruction was processed.
@end deffn
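
The swap performed by these two instructions can be sketched in Python.
This is a simplified model with a single fluid per with-fluids object;
the class and function names are hypothetical and do not correspond to
Guile's C data structures.

```python
class Fluid:
    def __init__(self, value):
        self.value = value

class WithFluids:
    """Pairs a fluid with a saved value; swap() exchanges the two."""

    def __init__(self, fluid, value):
        self.fluid = fluid
        self.saved = value

    def swap(self):
        self.fluid.value, self.saved = self.saved, self.fluid.value

def push_fluid(dynstack, fluid, value):
    entry = WithFluids(fluid, value)
    entry.swap()            # install the new value, remember the old one
    dynstack.append(entry)

def pop_fluid(dynstack):
    dynstack.pop().swap()   # restore the value from before push_fluid

dynstack = []
f = Fluid("outer")
push_fluid(dynstack, f, "inner")
during = f.value
pop_fluid(dynstack)
after = f.value
```

Because both directions are the same swap, the with-fluids object always
holds whichever value is not currently installed.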

@deffn Instruction fluid-ref
Pop a fluid from the stack, and push its current value.
@end deffn

@deffn Instruction fluid-set
Pop a value and a fluid from the stack, in that order, and set the fluid
to the value.
@end deffn

@deffn Instruction prompt escape-only? offset
Establish a dynamic prompt. @xref{Prompts}, for more information on
prompts.

The prompt will be pushed on the dynamic stack. The normal control flow
should ensure that the prompt is popped off at the end, via
@code{unwind}.

If an abort is made to this prompt, control will jump to @var{offset}, a
three-byte relative address. The continuation and all arguments to the
abort will be pushed on the stack, along with the total number of
arguments (including the continuation). If control returns to the
handler, the prompt is already popped off by the abort mechanism.
(Guile's @code{prompt} implements Felleisen's @dfn{--F--} operator.)

If @var{escape-only?} is nonzero, the prompt will be marked as
escape-only, which allows an abort to this prompt to avoid reifying the
continuation.
@end deffn

@deffn Instruction abort n
Abort to a dynamic prompt.

This instruction pops one tail argument list, @var{n} arguments, and a
prompt tag from the stack. The dynamic environment is then searched for
a prompt having the given tag. If none is found, an error is signaled.
Otherwise all arguments are passed to the prompt's handler, along with
the captured continuation, if necessary.

If the prompt's handler can be proven to not reference the captured
continuation, no continuation is allocated. This decision happens
dynamically, at run-time; the general case is that the continuation may
be captured, and thus resumed. A reinstated continuation will have its
arguments pushed on the stack, along with the number of arguments, as in
the multiple-value return convention. Therefore an @code{abort}
instruction should be followed by code ready to handle the equivalent of
a multiply-valued return.
@end deffn

@node Miscellaneous Instructions
@subsubsection Miscellaneous Instructions

@deffn Instruction nop
Does nothing! Used for padding other instructions to certain
alignments.
@end deffn

@deffn Instruction halt
Exits the VM, returning a SCM value. Normally, this instruction is
only part of the ``bootstrap program'', a program run when a virtual
machine is first entered; compiled Scheme procedures will not contain
this instruction.

If multiple values have been returned, the SCM value will be a
multiple-values object (@pxref{Multiple Values}).
@end deffn

@deffn Instruction break
Does nothing, but invokes the break hook.
@end deffn

@deffn Instruction drop
Pops off the top value from the stack, throwing it away.
@end deffn

@deffn Instruction dup
Re-pushes the top value onto the stack.
@end deffn

@deffn Instruction void
Pushes ``the unspecified value'' onto the stack.
@end deffn

@node Inlined Scheme Instructions
@subsubsection Inlined Scheme Instructions

The Scheme compiler can recognize the application of standard Scheme
procedures. It tries to inline these small operations to avoid the
overhead of creating new stack frames.

Since most of these operations are historically implemented as C
primitives, not inlining them would entail constantly calling out from
the VM to the interpreter, which has some costs: registers must be
saved, the interpreter has to dispatch, called procedures have to do
much type checking, etc. It's much more efficient to inline these
operations in the virtual machine itself.

All of these instructions pop their arguments from the stack and push
their results, and take no parameters from the instruction stream.
Thus, unlike in the previous sections, these instruction definitions
show stack parameters instead of parameters from the instruction
stream.

@deffn Instruction not x
@deffnx Instruction not-not x
@deffnx Instruction eq? x y
@deffnx Instruction not-eq? x y
@deffnx Instruction null? x
@deffnx Instruction not-null? x
@deffnx Instruction eqv? x y
@deffnx Instruction equal? x y
@deffnx Instruction pair? x
@deffnx Instruction list? x
@deffnx Instruction set-car! pair x
@deffnx Instruction set-cdr! pair x
@deffnx Instruction cons x y
@deffnx Instruction car x
@deffnx Instruction cdr x
@deffnx Instruction vector-ref x n
@deffnx Instruction vector-set x n y
@deffnx Instruction struct? x
@deffnx Instruction struct-ref x n
@deffnx Instruction struct-set x n v
@deffnx Instruction struct-vtable x
@deffnx Instruction class-of x
@deffnx Instruction slot-ref struct n
@deffnx Instruction slot-set struct n x
Inlined implementations of their Scheme equivalents.
@end deffn

Note that @code{caddr} and friends compile to a series of @code{car}
and @code{cdr} instructions.
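
As an illustration of that expansion, the following Python sketch turns a
name such as @code{caddr} into the sequence of @code{car} and @code{cdr}
instructions it stands for, in execution order. The function name
@code{expand_cxr} is hypothetical, and this models only the naming rule,
not the compiler's actual code path.

```python
def expand_cxr(name):
    """Expand a c[ad]+r name into car/cdr instructions, in execution order."""
    if len(name) < 3 or name[0] != "c" or name[-1] != "r":
        raise ValueError("not a cxr name: " + name)
    letters = name[1:-1]
    if any(ch not in "ad" for ch in letters):
        raise ValueError("not a cxr name: " + name)
    # The rightmost letter is applied first.
    return ["car" if ch == "a" else "cdr" for ch in reversed(letters)]

caddr_ops = expand_cxr("caddr")
```

Since @code{(caddr x)} means @code{(car (cdr (cdr x)))}, the two
@code{cdr} instructions run before the final @code{car}.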

@node Inlined Mathematical Instructions
@subsubsection Inlined Mathematical Instructions

Inlining mathematical operations has the obvious advantage of handling
fixnums without function calls or allocations. The trick, of course,
is knowing when the result of an operation will be a fixnum, and there
might be a couple of bugs here.

More instructions could be added here over time.

As in the previous section, the definitions below show stack
parameters instead of instruction stream parameters.

@deffn Instruction add x y
@deffnx Instruction add1 x
@deffnx Instruction sub x y
@deffnx Instruction sub1 x
@deffnx Instruction mul x y
@deffnx Instruction div x y
@deffnx Instruction quo x y
@deffnx Instruction rem x y
@deffnx Instruction mod x y
@deffnx Instruction ee? x y
@deffnx Instruction lt? x y
@deffnx Instruction gt? x y
@deffnx Instruction le? x y
@deffnx Instruction ge? x y
@deffnx Instruction ash x n
@deffnx Instruction logand x y
@deffnx Instruction logior x y
@deffnx Instruction logxor x y
Inlined implementations of the corresponding mathematical operations.
@end deffn

@node Inlined Bytevector Instructions
@subsubsection Inlined Bytevector Instructions

Bytevector operations correspond closely to what the current hardware
can do, so it makes sense to inline them to VM instructions, providing
a clear path for eventual native compilation. Without this, Scheme
programs would need other primitives for accessing raw bytes, but
these primitives are as good as any.

As in the previous section, the definitions below show stack
parameters instead of instruction stream parameters.

The multibyte formats (@code{u16}, @code{f64}, etc.@:) take an extra
endianness argument. Only aligned native accesses are currently
fast-pathed in Guile's VM.
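
The effect of the endianness argument can be seen by reading the same
bytes both ways. The Python sketch below uses the standard @code{struct}
module as a stand-in for the VM's accessors; the function names and the
use of the strings @code{"big"} and @code{"little"} for the endianness
argument are assumptions of this sketch.

```python
import struct

def bv_u16_ref(bv, n, endianness):
    """Read an unsigned 16-bit value at byte index n."""
    fmt = "<H" if endianness == "little" else ">H"
    return struct.unpack_from(fmt, bv, n)[0]

def bv_f64_ref(bv, n, endianness):
    """Read an IEEE 754 double at byte index n."""
    fmt = "<d" if endianness == "little" else ">d"
    return struct.unpack_from(fmt, bv, n)[0]

bv = bytes([0x12, 0x34, 0x56, 0x78])
big = bv_u16_ref(bv, 0, "big")
little = bv_u16_ref(bv, 0, "little")
```

The native variants listed below correspond to the case where the
requested byte order matches that of the host machine.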

@deffn Instruction bv-u8-ref bv n
@deffnx Instruction bv-s8-ref bv n
@deffnx Instruction bv-u16-native-ref bv n
@deffnx Instruction bv-s16-native-ref bv n
@deffnx Instruction bv-u32-native-ref bv n
@deffnx Instruction bv-s32-native-ref bv n
@deffnx Instruction bv-u64-native-ref bv n
@deffnx Instruction bv-s64-native-ref bv n
@deffnx Instruction bv-f32-native-ref bv n
@deffnx Instruction bv-f64-native-ref bv n
@deffnx Instruction bv-u16-ref bv n endianness
@deffnx Instruction bv-s16-ref bv n endianness
@deffnx Instruction bv-u32-ref bv n endianness
@deffnx Instruction bv-s32-ref bv n endianness
@deffnx Instruction bv-u64-ref bv n endianness
@deffnx Instruction bv-s64-ref bv n endianness
@deffnx Instruction bv-f32-ref bv n endianness
@deffnx Instruction bv-f64-ref bv n endianness
@deffnx Instruction bv-u8-set bv n val
@deffnx Instruction bv-s8-set bv n val
@deffnx Instruction bv-u16-native-set bv n val
@deffnx Instruction bv-s16-native-set bv n val
@deffnx Instruction bv-u32-native-set bv n val
@deffnx Instruction bv-s32-native-set bv n val
@deffnx Instruction bv-u64-native-set bv n val
@deffnx Instruction bv-s64-native-set bv n val
@deffnx Instruction bv-f32-native-set bv n val
@deffnx Instruction bv-f64-native-set bv n val
@deffnx Instruction bv-u16-set bv n val endianness
@deffnx Instruction bv-s16-set bv n val endianness
@deffnx Instruction bv-u32-set bv n val endianness
@deffnx Instruction bv-s32-set bv n val endianness
@deffnx Instruction bv-u64-set bv n val endianness
@deffnx Instruction bv-s64-set bv n val endianness
@deffnx Instruction bv-f32-set bv n val endianness
@deffnx Instruction bv-f64-set bv n val endianness
Inlined implementations of the corresponding bytevector operations.
@end deffn