Merge commit 'feccd2d3100fd2964d4c2df58ab3da7ce4949a66' into vm-check
[bpt/guile.git] / doc / ref / compiler.texi
1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2008
4 @c Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
6
7 @node Compiling to the Virtual Machine
8 @section Compiling to the Virtual Machine
9
10 Compilers have a mystique about them that is attractive and
11 off-putting at the same time. They are attractive because they are
12 magical -- they transform inert text into live results, like throwing
13 the switch on Frankenstein's monster. However, this magic is perceived
14 by many to be impenetrable.
15
16 This section aims to pay attention to the small man behind the
17 curtain.
18
19 @xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
20 know how to compile your .scm file.
21
22 @menu
23 * Compiler Tower::
24 * The Scheme Compiler::
25 * GHIL::
26 * GLIL::
27 * Object Code::
28 * Extending the Compiler::
29 @end menu
30
31 @node Compiler Tower
32 @subsection Compiler Tower
33
34 Guile's compiler is quite simple, actually -- its @emph{compilers}, to
35 put it more accurately. Guile defines a tower of languages, starting
36 at Scheme and progressively simplifying down to languages that
37 resemble the VM instruction set (@pxref{Instruction Set}).
38
39 Each language knows how to compile to the next, so each step is simple
40 and understandable. Furthermore, this set of languages is not
41 hardcoded into Guile, so it is possible for the user to add new
42 high-level languages, new passes, or even different compilation
43 targets.
44
45 Languages are registered in the module, @code{(system base language)}:
46
47 @example
48 (use-modules (system base language))
49 @end example
50
51 They are registered with the @code{define-language} form.
52
53 @deffn {Scheme Syntax} define-language @
54 name title version reader printer @
55 [parser=#f] [read-file=#f] [compilers='()] [evaluator=#f]
56 Define a language.
57
58 This syntax defines a @code{#<language>} object, bound to @var{name}
59 in the current environment. In addition, the language will be added to
60 the global language set. For example, this is the language definition
61 for Scheme:
62
63 @example
64 (define-language scheme
65 #:title "Guile Scheme"
66 #:version "0.5"
67 #:reader read
68 #:read-file read-file
69 #:compilers `((,ghil . ,compile-ghil))
70 #:evaluator (lambda (x module) (primitive-eval x))
71 #:printer write)
72 @end example
73
74 In this example, from @code{(language scheme spec)}, @code{read-file}
75 reads expressions from a port and wraps them in a @code{begin} block.
76 @end deffn
77
78 The interesting thing about having languages defined this way is that
79 they present a uniform interface to the read-eval-print loop. This
80 allows the user to change the current language of the REPL:
81
82 @example
83 $ guile
84 Guile Scheme interpreter 0.5 on Guile 1.9.0
85 Copyright (C) 2001-2008 Free Software Foundation, Inc.
86
87 Enter `,help' for help.
88 scheme@@(guile-user)> ,language ghil
89 Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
90 Copyright (C) 2001-2008 Free Software Foundation, Inc.
91
92 Enter `,help' for help.
93 ghil@@(guile-user)>
94 @end example
95
96 Languages can be looked up by name, as they were above.
97
98 @deffn {Scheme Procedure} lookup-language name
99 Looks up a language named @var{name}, autoloading it if necessary.
100
101 Languages are autoloaded by looking for a variable named @var{name} in
102 a module named @code{(language @var{name} spec)}.
103
104 The language object will be returned, or @code{#f} if there does not
105 exist a language with that name.
106 @end deffn
107
108 Defining languages this way allows us to programmatically determine
109 the necessary steps for compiling code from one language to another.
110
111 @deffn {Scheme Procedure} lookup-compilation-order from to
112 Recursively traverses the set of languages to which @var{from} can
113 compile, depth-first, and return the first path that can transform
114 @var{from} to @var{to}. Returns @code{#f} if no path is found.
115
116 This function memoizes its results in a cache that is invalidated by
117 subsequent calls to @code{define-language}, so it should be quite
118 fast.
119 @end deffn
120
121 There is a notion of a ``current language'', which is maintained in
122 the @code{*current-language*} fluid. This language is normally Scheme,
123 and may be rebound by the user. The run-time compilation interfaces
124 (@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
125 and target languages.
126
127 The normal tower of languages when compiling Scheme goes like this:
128
129 @itemize
130 @item Scheme, which we know and love
131 @item Guile High Intermediate Language (GHIL)
132 @item Guile Low Intermediate Language (GLIL)
133 @item Object code
134 @end itemize
135
136 Object code may be serialized to disk directly, though it has a cookie
137 and version prepended to the front. But when compiling Scheme at
138 run time, you want a Scheme value, e.g. a compiled procedure. For this
139 reason, so as not to break the abstraction, Guile defines a fake
140 language, @code{value}. Compiling to @code{value} loads the object
141 code into a procedure, and wakes the sleeping giant.
142
143 Perhaps this strangeness can be explained by example:
144 @code{compile-file} defaults to compiling to object code, because it
145 produces object code that has to live in the barren world outside the
146 Guile runtime; but @code{compile} defaults to compiling to
147 @code{value}, as its product re-enters the Guile world.
148
149 Indeed, the process of compilation can circulate through these
150 different worlds indefinitely, as shown by the following quine:
151
152 @example
153 ((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
154 @end example
155
156 @node The Scheme Compiler
157 @subsection The Scheme Compiler
158
159 The job of the Scheme compiler is to expand all macros and to resolve
160 all symbols to lexical variables. Its target language, GHIL, is fairly
161 close to Scheme itself, so this process is not very complicated.
162
163 The Scheme compiler is driven by a table of @dfn{translators},
164 declared with the @code{define-scheme-translator} form, defined in the
165 module, @code{(language scheme compile-ghil)}.
166
167 @deffn {Scheme Syntax} define-scheme-translator head clause1 clause2...
168 The best documentation of this form is probably an example. Here is
169 the translator for @code{if}:
170
171 @example
172 (define-scheme-translator if
173 ;; (if TEST THEN [ELSE])
174 ((,test ,then)
175 (make-ghil-if e l (retrans test) (retrans then) (retrans '(begin))))
176 ((,test ,then ,else)
177 (make-ghil-if e l (retrans test) (retrans then) (retrans else))))
178 @end example
179
180 The match syntax is from the @code{pmatch} macro, defined in
181 @code{(system base pmatch)}. The result of a clause should be a valid
182 GHIL value. If no clause matches, a syntax error is signalled.
183
184 In the body of the clauses, the following bindings are introduced:
185 @itemize
186 @item @code{e}, the current environment
187 @item @code{l}, the current source location (or @code{#f})
188 @item @code{retrans}, a procedure that may be called to compile
189 subexpressions
190 @end itemize
191
192 Note that translators are looked up by @emph{value}, not by name. That
193 is to say, the translator is keyed under the @emph{value} of
194 @code{if}, which normally prints as @code{#<primitive-builtin-macro!
195 if>}.
196 @end deffn
197
198 Users can extend the compiler by defining new translators.
199 Additionally, some forms can be inlined directly to
200 instructions -- @xref{Inlined Scheme Instructions}, for a list. The
201 actual inliners are defined in @code{(language scheme inline)}:
202
203 @deffn {Scheme Syntax} define-inline head arity1 result1 arity2 result2...
204 Defines an inliner for @code{head}. As in
205 @code{define-scheme-translator}, inliners are keyed by value and not
206 by name.
207
208 Expressions are matched on their arities. For example:
209
210 @example
211 (define-inline eq?
212 (x y) (eq? x y))
213 @end example
214
215 This inlines calls to the Scheme procedure, @code{eq?}, to the
216 instruction @code{eq?}.
217
218 A more complicated example would be:
219
220 @example
221 (define-inline +
222 () 0
223 (x) x
224 (x y) (add x y)
225 (x y . rest) (add x (+ y . rest)))
226 @end example
227 @end deffn
228
229 Compilers take two arguments, an expression and an environment, and
230 return two values as well: an expression in the target language, and
231 an environment suitable for the target language. The format of the
232 environment is language-dependent.
233
234 For Scheme, an environment may be one of three things:
235 @itemize
236 @item @code{#f}, in which case compilation is performed in the context
237 of the current module;
238 @item a module, which specifies the context of the compilation; or
239 @item a @dfn{compile environment}, which specifies lexical variables
240 as well.
241 @end itemize
242
243 The format of a compile environment for scheme is @code{(@var{module}
244 @var{lexicals} . @var{externals})}, though users are strongly
245 discouraged from constructing these environments themselves. Instead,
246 if you need this functionality -- as in GOOPS' dynamic method compiler
247 -- capture an environment with @code{compile-time-environment}, then
248 pass that environment to @code{compile}.
249
250 @deffn {Scheme Procedure} compile-time-environment
251 A special function known to the compiler that, when compiled, will
252 return a representation of the lexical environment in place at compile
253 time. Useful for supporting some forms of dynamic compilation. Returns
254 @code{#f} if called from the interpreter.
255 @end deffn
256
257 @node GHIL
258 @subsection GHIL
259
260 Guile High Intermediate Language (GHIL) is a structured intermediate
261 language that is close in expressive power to Scheme. It is an
262 expanded, pre-analyzed Scheme.
263
264 GHIL is ``structured'' in the sense that its representation is based
265 on records, not S-expressions. This gives a rigidity to the language
266 that ensures that compiling to a lower-level language only requires a
267 limited set of transformations. Practically speaking, consider the
268 GHIL type, @code{<ghil-quote>}, which has fields named @code{env},
269 @code{loc}, and @code{exp}. Instances of this type are records created
270 via @code{make-ghil-quote}, and whose fields are accessed as
271 @code{ghil-quote-env}, @code{ghil-quote-loc}, and
272 @code{ghil-quote-exp}. There is also a predicate, @code{ghil-quote?}.
273 @xref{Records}, for more information on records.
274
275 Expressions of GHIL name their environments explicitly, and all
276 variables are referenced by identity in addition to by name.
277 @code{(language ghil)} defines a number of routines to deal explicitly
278 with variables and environments:
279
280 @deftp {Scheme Variable} <ghil-toplevel-env> [table='()]
281 A toplevel environment. The @var{table} holds all toplevel variables
282 that have been resolved in this environment.
283 @end deftp
284 @deftp {Scheme Variable} <ghil-env> parent [table='()] [variables='()]
285 A lexical environment. @var{parent} will be the enclosing lexical
286 environment, or a toplevel environment. @var{table} holds an alist
287 mapping symbols to variables bound in this environment, while
288 @var{variables} holds a cumulative list of all variables ever defined
289 in this environment.
290
291 Lexical environments correspond to procedures. Bindings introduced
292 e.g. by Scheme's @code{let} add to the bindings in a lexical
293 environment. An example of a case in which a variable might be in
294 @var{variables} but not in @var{table} would be a variable that is in
295 the same procedure, but is out of scope.
296 @end deftp
297 @deftp {Scheme Variable} <ghil-var> env name kind [index=#f]
298 A variable. @var{kind} is one of @code{argument}, @code{local},
299 @code{external}, @code{toplevel}, @code{public}, or @code{private};
300 see the procedures below for more information. @var{index} is used in
301 compilation.
302 @end deftp
303
304 @deffn {Scheme Procedure} ghil-var-is-bound? env sym
305 Recursively look up a variable named @var{sym} in @var{env}, and
306 return it or @code{#f} if none is found.
307 @end deffn
308 @deffn {Scheme Procedure} ghil-var-for-ref! env sym
309 Recursively look up a variable named @var{sym} in @var{env}, and
310 return it. If the symbol was not bound, return a new toplevel
311 variable.
312 @end deffn
313 @deffn {Scheme Procedure} ghil-var-for-set! env sym
314 Like @code{ghil-var-for-ref!}, except that the returned variable will
315 be marked as @code{external}. @xref{Variables and the VM}.
316 @end deffn
317 @deffn {Scheme Procedure} ghil-var-define! toplevel-env sym
318 Return an existing or new toplevel variable named @var{sym}.
319 @var{toplevel-env} must be a toplevel environment.
320 @end deffn
321 @deffn {Scheme Procedure} ghil-var-at-module! env modname sym interface?
322 Return a variable that will be resolved at run-time with respect to a
323 specific module named @var{modname}. If @var{interface?} is true, the
324 variable will be of type @code{public}, otherwise @code{private}.
325 @end deffn
326 @deffn {Scheme Procedure} call-with-ghil-environment env syms func
327 Bind @var{syms} to fresh variables within a new lexical environment
328 whose parent is @var{env}, and call @var{func} as @code{(@var{func}
329 @var{new-env} @var{new-vars})}.
330 @end deffn
331 @deffn {Scheme Procedure} call-with-ghil-bindings env syms func
332 Like @code{call-with-ghil-environment}, except the existing
333 environment @var{env} is re-used. For that reason, @var{func} is
334 invoked as @code{(@var{func} @var{new-vars})}
335 @end deffn
336
337 In the aforementioned @code{<ghil-quote>} type, the @var{env} slot
338 holds a pointer to the environment in which the expression occurs. The
339 @var{loc} slot holds source location information, so that errors
340 corresponding to this expression can be mapped back to the initial
341 expression in the higher-level language, e.g. Scheme. @xref{Compiled
342 Procedures}, for more information on source location objects.
343
344 GHIL also has a declarative serialization format, which makes writing
345 and reading it a tractable problem for the human mind. Since all GHIL
346 language constructs contain @code{env} and @code{loc} pointers, they
347 are left out of the serialization. (Serializing @code{env} structures
348 would be difficult, as they are often circular.) What is left is the
349 type of expression, and the remaining slots defined in the expression
350 type.
351
352 For example, an S-expression representation of the @code{<ghil-quote>}
353 expression would be:
354
355 @example
356 (quote 3)
357 @end example
358
359 It's deceptively like Scheme. The general rule is, for a type defined
360 as @code{<ghil-@var{foo}> env loc @var{slot1} @var{slot2}...}, the
361 S-expression representation will be @code{(@var{foo} @var{slot1}
362 @var{slot2}...)}. Users may program with this format directly at the
363 REPL:
364
365 @example
366 scheme@@(guile-user)> ,language ghil
367 Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
368 Copyright (C) 2001-2008 Free Software Foundation, Inc.
369
370 Enter `,help' for help.
371 ghil@@(guile-user)> (call (ref +) (quote 32) (quote 10))
372 @result{} 42
373 @end example
374
375 For convenience, some slots are serialized as rest arguments; those
376 are noted below. The other caveat is that variables are serialized as
377 their names only, and not their identities.
378
379 @deftp {Scheme Variable} <ghil-void> env loc
380 The unspecified value.
381 @end deftp
382 @deftp {Scheme Variable} <ghil-quote> env loc exp
383 A quoted expression.
384
385 Note that unlike in Scheme, there are no self-quoting expressions; all
386 constants must come from @code{quote} expressions.
387 @end deftp
388 @deftp {Scheme Variable} <ghil-quasiquote> env loc exp
389 A quasiquoted expression. The expression is treated as a constant,
390 except for embedded @code{unquote} and @code{unquote-splicing} forms.
391 @end deftp
392 @deftp {Scheme Variable} <ghil-unquote> env loc exp
393 Like Scheme's @code{unquote}; only valid within a quasiquote.
394 @end deftp
395 @deftp {Scheme Variable} <ghil-unquote-splicing> env loc exp
396 Like Scheme's @code{unquote-splicing}; only valid within a quasiquote.
397 @end deftp
398 @deftp {Scheme Variable} <ghil-ref> env loc var
399 A variable reference. Note that for purposes of serialization,
400 @var{var} is serialized as its name, as a symbol.
401 @end deftp
402 @deftp {Scheme Variable} <ghil-set> env loc var val
403 A variable mutation. @var{var} is serialized as a symbol.
404 @end deftp
405 @deftp {Scheme Variable} <ghil-define> env loc var val
406 A toplevel variable definition. See @code{ghil-var-define!}.
407 @end deftp
408 @deftp {Scheme Variable} <ghil-if> env loc test then else
409 A conditional. Note that @var{else} is not optional.
410 @end deftp
411 @deftp {Scheme Variable} <ghil-and> env loc . exps
412 Like Scheme's @code{and}.
413 @end deftp
414 @deftp {Scheme Variable} <ghil-or> env loc . exps
415 Like Scheme's @code{or}.
416 @end deftp
417 @deftp {Scheme Variable} <ghil-begin> env loc . body
418 Like Scheme's @code{begin}.
419 @end deftp
420 @deftp {Scheme Variable} <ghil-bind> env loc vars exprs . body
421 Like a deconstructed @code{let}: each element of @var{vars} will be
422 bound to the corresponding GHIL expression in @var{exprs}.
423
424 Note that for purposes of the serialization format, @var{exprs} are
425 evaluated before the new bindings are added to the environment. For
426 @code{letrec} semantics, there also exists a @code{bindrec} parse
427 flavor. This is useful for writing GHIL at the REPL, but the
428 serializer does not currently have the cleverness needed to determine
429 whether a @code{<ghil-bind>} has @code{let} or @code{letrec}
430 semantics, and thus only serializes @code{<ghil-bind>} as @code{bind}.
431 @end deftp
432 @deftp {Scheme Variable} <ghil-mv-bind> env loc vars rest producer . body
433 Like Scheme's @code{receive} -- binds the values returned by
434 applying @code{producer}, which should be a thunk, to the
435 @code{lambda}-like bindings described by @var{vars} and @var{rest}.
436 @end deftp
437 @deftp {Scheme Variable} <ghil-lambda> env loc vars rest meta . body
438 A closure. @var{vars} is the argument list, serialized as a list of
439 symbols. @var{rest} is a boolean, which is @code{#t} iff the last
440 argument is a rest argument. @var{meta} is an association list of
441 properties. The actual @var{body} should be a list of GHIL
442 expressions.
443 @end deftp
444 @deftp {Scheme Variable} <ghil-call> env loc proc . args
445 A procedure call.
446 @end deftp
447 @deftp {Scheme Variable} <ghil-mv-call> env loc producer consumer
448 Like Scheme's @code{call-with-values}.
449 @end deftp
450 @deftp {Scheme Variable} <ghil-inline> env loc op . args
451 An inlined VM instruction. @var{op} should be the instruction name as
452 a symbol, and @var{args} should be its arguments, as GHIL expressions.
453 @end deftp
454 @deftp {Scheme Variable} <ghil-values> env loc . values
455 Like Scheme's @code{values}.
456 @end deftp
457 @deftp {Scheme Variable} <ghil-values*> env loc . values
458 @var{values} are as in the Scheme expression, @code{(apply values .
459 @var{vals})}.
460 @end deftp
461 @deftp {Scheme Variable} <ghil-reified-env> env loc
462 Produces, at run-time, a reification of the environment at compile
463 time. Used in the implementation of Scheme's
464 @code{compile-time-environment}.
465 @end deftp
466
467 GHIL implements a compiler to GLIL that recursively traverses GHIL
468 expressions, writing out GLIL expressions into a linear list. The
469 compiler also keeps some state as to whether the current expression is
470 in tail context, and whether its value will be used in future
471 computations. This state allows the compiler not to emit code for
472 constant expressions that will not be used (e.g. docstrings), and to
473 perform tail calls when in tail position.
474
475 Just as the Scheme to GHIL compiler introduced new hidden state---the
476 environment---the GHIL to GLIL compiler introduces more state, the
477 stack. While not represented explicitly, the stack is present in the
478 compilation of each GHIL expression: compiling a GHIL expression
479 should leave the run-time value stack in the same state. For example,
480 if the intermediate value stack has two elements before evaluating an
481 @code{if} expression, it should have two elements after that
482 expression.
483
484 Interested readers are encouraged to read the implementation in
485 @code{(language ghil compile-glil)} for more details.
486
487 @node GLIL
488 @subsection GLIL
489
490 Guile Low Intermediate Language (GLIL) is a structured intermediate
491 language whose expressions closely mirror the functionality of Guile's
492 VM instruction set.
493
494 Its expression types are defined in @code{(language glil)}, and as
495 with GHIL, some of its fields parse as rest arguments.
496
497 @deftp {Scheme Variable} <glil-program> nargs nrest nlocs nexts meta . body
498 A unit of code that at run-time will correspond to a compiled
499 procedure. @var{nargs} @var{nrest} @var{nlocs}, and @var{nexts}
500 collectively define the program's arity; see @ref{Compiled
501 Procedures}, for more information. @var{meta} should be an alist of
502 properties, as in @code{<ghil-lambda>}. @var{body} is a list of GLIL
503 expressions.
504 @end deftp
505 @deftp {Scheme Variable} <glil-bind> . vars
506 An advisory expression that notes a liveness extent for a set of
507 variables. @var{vars} is a list of @code{(@var{name} @var{type}
508 @var{index})}, where @var{type} should be either @code{argument},
509 @code{local}, or @code{external}.
510
511 @code{<glil-bind>} expressions end up being serialized as part of a
512 program's metadata and do not form part of a program's code path.
513 @end deftp
514 @deftp {Scheme Variable} <glil-mv-bind> vars rest
515 A multiple-value binding of the values on the stack to @var{vars}. Iff
516 @var{rest} is true, the last element of @var{vars} will be treated as
517 a rest argument.
518
519 In addition to pushing a binding annotation on the stack, like
520 @code{<glil-bind>}, an expression is emitted at compilation time to
521 make sure that there are enough values available to bind. See the
522 notes on @code{truncate-values} in @ref{Procedural Instructions}, for
523 more information.
524 @end deftp
525 @deftp {Scheme Variable} <glil-unbind>
526 Closes the liveness extent of the most recently encountered
527 @code{<glil-bind>} or @code{<glil-mv-bind>} expression. As GLIL
528 expressions are compiled, a parallel stack of live bindings is
529 maintained; this expression pops off the top element from that stack.
530
531 Bindings are written into the program's metadata so that debuggers and
532 other tools can determine the set of live local variables at a given
533 offset within a VM program.
534 @end deftp
535 @deftp {Scheme Variable} <glil-source> loc
536 Records source information for the preceding expression. @var{loc}
537 should be a vector, @code{#(@var{line} @var{column} @var{filename})}.
538 @end deftp
539 @deftp {Scheme Variable} <glil-void>
540 Pushes the unspecified value on the stack.
541 @end deftp
542 @deftp {Scheme Variable} <glil-const> obj
543 Pushes a constant value onto the stack. @var{obj} must be a number,
544 string, symbol, keyword, boolean, character, or a pair or vector or
545 list thereof, or the empty list.
546 @end deftp
547 @deftp {Scheme Variable} <glil-argument> op index
548 Accesses an argument on the stack. If @var{op} is @code{ref}, the
549 argument is pushed onto the stack; if it is @code{set}, the argument
550 is set from the top value on the stack, which is popped off.
551 @end deftp
552 @deftp {Scheme Variable} <glil-local> op index
553 Like @code{<glil-argument>}, but for local variables. @xref{Stack
554 Layout}, for more information.
555 @end deftp
556 @deftp {Scheme Variable} <glil-external> op depth index
557 Accesses a heap-allocated variable, addressed by @var{depth}, the nth
558 enclosing environment, and @var{index}, the variable's position within
559 the environment. @var{op} is @code{ref} or @code{set}.
560 @end deftp
561 @deftp {Scheme Variable} <glil-toplevel> op name
562 Accesses a toplevel variable. @var{op} may be @code{ref}, @code{set},
563 or @code{define}.
564 @end deftp
565 @deftp {Scheme Variable} <glil-module> op mod name public?
566 Accesses a variable within a specific module. See
567 @code{ghil-var-at-module!}, for more information.
568 @end deftp
569 @deftp {Scheme Variable} <glil-label> label
570 Creates a new label. @var{label} can be any Scheme value, and should
571 be unique.
572 @end deftp
573 @deftp {Scheme Variable} <glil-branch> inst label
574 Branch to a label. @var{label} should be a @code{<ghil-label>}.
575 @code{inst} is a branching instruction: @code{br-if}, @code{br}, etc.
576 @end deftp
577 @deftp {Scheme Variable} <glil-call> inst nargs
578 This expression is probably misnamed, as it does not correspond to
579 function calls. @code{<glil-call>} invokes the VM instruction named
580 @var{inst}, noting that it is called with @var{nargs} stack arguments.
581 The arguments should be pushed on the stack already. What happens to
582 the stack afterwards depends on the instruction.
583 @end deftp
584 @deftp {Scheme Variable} <glil-mv-call> nargs ra
585 Performs a multiple-value call. @var{ra} is a @code{<glil-label>}
586 corresponding to the multiple-value return address for the call. See
587 the notes on @code{mv-call} in @ref{Procedural Instructions}, for more
588 information.
589 @end deftp
590
591 Users may enter in GLIL at the REPL as well, though there is a bit
592 more bookkeeping to do. Since GLIL needs the set of variables to be
593 declared explicitly in a @code{<glil-program>}, GLIL expressions must
594 be wrapped in a thunk that declares the arity of the expression:
595
596 @example
597 scheme@@(guile-user)> ,language glil
598 Guile Lowlevel Intermediate Language (GLIL) interpreter 0.3 on Guile 1.9.0
599 Copyright (C) 2001-2008 Free Software Foundation, Inc.
600
601 Enter `,help' for help.
602 glil@@(guile-user)> (program 0 0 0 0 () (const 3) (call return 0))
603 @result{} 3
604 @end example
605
606 Just as in all of Guile's compilers, an environment is passed to the
607 GLIL-to-object code compiler, and one is returned as well, along with
608 the object code.
609
610 @node Object Code
611 @subsection Object Code
612
613 Object code is the serialization of the raw instruction stream of a
614 program, ready for interpretation by the VM. Procedures related to
615 object code are defined in the @code{(system vm objcode)} module.
616
617 @deffn {Scheme Procedure} objcode? obj
618 @deffnx {C Function} scm_objcode_p (obj)
619 Returns @code{#f} iff @var{obj} is object code, @code{#f} otherwise.
620 @end deffn
621
622 @deffn {Scheme Procedure} bytecode->objcode bytecode nlocs nexts
623 @deffnx {C Function} scm_bytecode_to_objcode (bytecode, nlocs, nexts)
624 Makes a bytecode object from @var{bytecode}, which should be a
625 @code{u8vector}. @var{nlocs} and @var{nexts} denote the number of
626 stack and heap variables to reserve when this objcode is executed.
627 @end deffn
628
629 @deffn {Scheme Variable} load-objcode file
630 @deffnx {C Function} scm_load_objcode (file)
631 Load object code from a file named @var{file}. The file will be mapped
632 into memory via @code{mmap}, so this is a very fast operation.
633
634 On disk, object code has an eight-byte cookie prepended to it, so that
635 we will not execute arbitrary garbage. In addition, two more bytes are
636 reserved for @var{nlocs} and @var{nexts}.
637 @end deffn
638
639 @deffn {Scheme Variable} objcode->u8vector objcode
640 @deffnx {C Function} scm_objcode_to_u8vector (objcode)
641 Copy object code out to a @code{u8vector} for analysis by Scheme. The
642 ten-byte header is included.
643 @end deffn
644
645 @deffn {Scheme Variable} objcode->program objcode [external='()]
646 @deffnx {C Function} scm_objcode_to_program (objcode, external)
647 Load up object code into a Scheme program. The resulting program will
648 be a thunk that captures closure variables from @var{external}.
649 @end deffn
650
651 Object code from a file may be disassembled at the REPL via the
652 meta-command @code{,disassemble-file}, abbreviated as @code{,xx}.
653 Programs may be disassembled via @code{,disassemble}, abbreviated as
654 @code{,x}.
655
656 Compiling object code to the fake language, @code{value}, is performed
657 via loading objcode into a program, then executing that thunk with
658 respect to the compilation environment. Normally the environment
659 propagates through the compiler transparently, but users may specify
660 the compilation environment manually as well:
661
662 @deffn {Scheme Procedure} make-objcode-env module externals
663 Make an object code environment. @var{module} should be a Scheme
664 module, and @var{externals} should be a list of external variables.
665 @code{#f} is also a valid object code environment.
666 @end deffn
667
668 @node Extending the Compiler
669 @subsection Extending the Compiler
670
671 At this point, we break with the impersonal tone of the rest of the
672 manual, and make an intervention. Admit it: if you've read this far
673 into the compiler internals manual, you are a junkie. Perhaps a course
674 at your university left you unsated, or perhaps you've always harbored
675 a sublimated desire to hack the holy of computer science holies: a
676 compiler. Well you're in good company, and in a good position. Guile's
677 compiler needs your help.
678
679 There are many possible avenues for improving Guile's compiler.
680 Probably the most important improvement, speed-wise, will be some form
681 of native compilation, both just-in-time and ahead-of-time. This could
682 be done in many ways. Probably the easiest strategy would be to extend
683 the compiled procedure structure to include a pointer to a native code
684 vector, and compile from bytecode to native code at run-time after a
685 procedure is called a certain number of times.
686
687 The name of the game is a profiling-based harvest of the low-hanging
688 fruit, running programs of interest under a system-level profiler and
689 determining which improvements would give the most bang for the buck.
690 There are many well-known efficiency hacks in the literature: Dybvig's
691 letrec optimization, individual boxing of heap-allocated values (and
692 then store the boxes on the stack directory), optimized case-lambda
693 expressions, stack underflow and overflow handlers, etc. Highly
694 recommended papers: Dybvig's HOCS, Ghuloum's compiler paper.
695
696 The compiler also needs help at the top end, enhancing the Scheme that
697 it knows to also understand R6RS, and adding new high-level compilers:
698 Emacs Lisp, Lua, JavaScript...