2 @c This is part of the GNU Guile Reference Manual.
4 @c Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
7 @node Compiling to the Virtual Machine
8 @section Compiling to the Virtual Machine
10 Compilers have a mystique about them that is attractive and
11 off-putting at the same time. They are attractive because they are
12 magical -- they transform inert text into live results, like throwing
13 the switch on Frankenstein's monster. However, this magic is perceived
14 by many to be impenetrable.
16 This section aims to pay attention to the small man behind the
19 @xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
20 know how to compile your .scm file.
24 * The Scheme Compiler::
28 * Extending the Compiler::
32 @subsection Compiler Tower
34 Guile's compiler is quite simple, actually -- its @emph{compilers}, to
35 put it more accurately. Guile defines a tower of languages, starting
36 at Scheme and progressively simplifying down to languages that
37 resemble the VM instruction set (@pxref{Instruction Set}).
39 Each language knows how to compile to the next, so each step is simple
40 and understandable. Furthermore, this set of languages is not
41 hardcoded into Guile, so it is possible for the user to add new
42 high-level languages, new passes, or even different compilation
45 Languages are registered in the module, @code{(system base language)}:
48 (use-modules (system base language))
51 They are registered with the @code{define-language} form.
53 @deffn {Scheme Syntax} define-language @
54 name title version reader printer @
55 [parser=#f] [read-file=#f] [compilers='()] [evaluator=#f]
58 This syntax defines a @code{#<language>} object, bound to @var{name}
59 in the current environment. In addition, the language will be added to
60 the global language set. For example, this is the language definition
64 (define-language scheme
65 #:title "Guile Scheme"
69 #:compilers `((,ghil . ,compile-ghil))
70 #:evaluator (lambda (x module) (primitive-eval x))
74 In this example, from @code{(language scheme spec)}, @code{read-file}
75 reads expressions from a port and wraps them in a @code{begin} block.
78 The interesting thing about having languages defined this way is that
79 they present a uniform interface to the read-eval-print loop. This
80 allows the user to change the current language of the REPL:
84 Guile Scheme interpreter 0.5 on Guile 1.9.0
85 Copyright (C) 2001-2008 Free Software Foundation, Inc.
87 Enter `,help' for help.
88 scheme@@(guile-user)> ,language ghil
89 Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
90 Copyright (C) 2001-2008 Free Software Foundation, Inc.
92 Enter `,help' for help.
96 Languages can be looked up by name, as they were above.
98 @deffn {Scheme Procedure} lookup-language name
99 Looks up a language named @var{name}, autoloading it if necessary.
101 Languages are autoloaded by looking for a variable named @var{name} in
102 a module named @code{(language @var{name} spec)}.
104 The language object will be returned, or @code{#f} if there does not
105 exist a language with that name.
108 Defining languages this way allows us to programmatically determine
109 the necessary steps for compiling code from one language to another.
111 @deffn {Scheme Procedure} lookup-compilation-order from to
112 Recursively traverses the set of languages to which @var{from} can
113 compile, depth-first, and return the first path that can transform
114 @var{from} to @var{to}. Returns @code{#f} if no path is found.
116 This function memoizes its results in a cache that is invalidated by
117 subsequent calls to @code{define-language}, so it should be quite
121 There is a notion of a ``current language'', which is maintained in
122 the @code{*current-language*} fluid. This language is normally Scheme,
123 and may be rebound by the user. The run-time compilation interfaces
124 (@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
125 and target languages.
127 The normal tower of languages when compiling Scheme goes like this:
130 @item Scheme, which we know and love
131 @item Guile High Intermediate Language (GHIL)
132 @item Guile Low Intermediate Language (GLIL)
136 Object code may be serialized to disk directly, though it has a cookie
137 and version prepended to the front. But when compiling Scheme at
138 run time, you want a Scheme value, e.g. a compiled procedure. For this
139 reason, so as not to break the abstraction, Guile defines a fake
140 language, @code{value}. Compiling to @code{value} loads the object
141 code into a procedure, and wakes the sleeping giant.
143 Perhaps this strangeness can be explained by example:
144 @code{compile-file} defaults to compiling to object code, because it
145 produces object code that has to live in the barren world outside the
146 Guile runtime; but @code{compile} defaults to compiling to
147 @code{value}, as its product re-enters the Guile world.
149 Indeed, the process of compilation can circulate through these
150 different worlds indefinitely, as shown by the following quine:
153 ((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
156 @node The Scheme Compiler
157 @subsection The Scheme Compiler
159 The job of the Scheme compiler is to expand all macros and to resolve
160 all symbols to lexical variables. Its target language, GHIL, is fairly
161 close to Scheme itself, so this process is not very complicated.
163 The Scheme compiler is driven by a table of @dfn{translators},
164 declared with the @code{define-scheme-translator} form, defined in the
165 module, @code{(language scheme compile-ghil)}.
167 @deffn {Scheme Syntax} define-scheme-translator head clause1 clause2...
168 The best documentation of this form is probably an example. Here is
169 the translator for @code{if}:
172 (define-scheme-translator if
173 ;; (if TEST THEN [ELSE])
175 (make-ghil-if e l (retrans test) (retrans then) (retrans '(begin))))
177 (make-ghil-if e l (retrans test) (retrans then) (retrans else))))
180 The match syntax is from the @code{pmatch} macro, defined in
181 @code{(system base pmatch)}. The result of a clause should be a valid
182 GHIL value. If no clause matches, a syntax error is signalled.
184 In the body of the clauses, the following bindings are introduced:
186 @item @code{e}, the current environment
187 @item @code{l}, the current source location (or @code{#f})
188 @item @code{retrans}, a procedure that may be called to compile
192 Note that translators are looked up by @emph{value}, not by name. That
193 is to say, the translator is keyed under the @emph{value} of
194 @code{if}, which normally prints as @code{#<primitive-builtin-macro!
198 Users can extend the compiler by defining new translators.
199 Additionally, some forms can be inlined directly to
200 instructions -- @xref{Inlined Scheme Instructions}, for a list. The
201 actual inliners are defined in @code{(language scheme inline)}:
203 @deffn {Scheme Syntax} define-inline head arity1 result1 arity2 result2...
204 Defines an inliner for @code{head}. As in
205 @code{define-scheme-translator}, inliners are keyed by value and not
208 Expressions are matched on their arities. For example:
215 This inlines calls to the Scheme procedure, @code{eq?}, to the
216 instruction @code{eq?}.
218 A more complicated example would be:
225 (x y . rest) (add x (+ y . rest)))
229 Compilers take two arguments, an expression and an environment, and
230 return two values as well: an expression in the target language, and
231 an environment suitable for the target language. The format of the
232 environment is language-dependent.
234 For Scheme, an environment may be one of three things:
236 @item @code{#f}, in which case compilation is performed in the context
237 of the current module;
238 @item a module, which specifies the context of the compilation; or
239 @item a @dfn{compile environment}, which specifies lexical variables
243 The format of a compile environment for scheme is @code{(@var{module}
244 @var{lexicals} . @var{externals})}, though users are strongly
245 discouraged from constructing these environments themselves. Instead,
246 if you need this functionality -- as in GOOPS' dynamic method compiler
247 -- capture an environment with @code{compile-time-environment}, then
248 pass that environment to @code{compile}.
250 @deffn {Scheme Procedure} compile-time-environment
251 A special function known to the compiler that, when compiled, will
252 return a representation of the lexical environment in place at compile
253 time. Useful for supporting some forms of dynamic compilation. Returns
254 @code{#f} if called from the interpreter.
260 Guile High Intermediate Language (GHIL) is a structured intermediate
261 language that is close in expressive power to Scheme. It is an
262 expanded, pre-analyzed Scheme.
264 GHIL is ``structured'' in the sense that its representation is based
265 on records, not S-expressions. This gives a rigidity to the language
266 that ensures that compiling to a lower-level language only requires a
267 limited set of transformations. Practically speaking, consider the
268 GHIL type, @code{<ghil-quote>}, which has fields named @code{env},
269 @code{loc}, and @code{exp}. Instances of this type are records created
270 via @code{make-ghil-quote}, and whose fields are accessed as
271 @code{ghil-quote-env}, @code{ghil-quote-loc}, and
272 @code{ghil-quote-exp}. There is also a predicate, @code{ghil-quote?}.
273 @xref{Records}, for more information on records.
275 Expressions of GHIL name their environments explicitly, and all
276 variables are referenced by identity in addition to by name.
277 @code{(language ghil)} defines a number of routines to deal explicitly
278 with variables and environments:
280 @deftp {Scheme Variable} <ghil-toplevel-env> [table='()]
281 A toplevel environment. The @var{table} holds all toplevel variables
282 that have been resolved in this environment.
284 @deftp {Scheme Variable} <ghil-env> parent [table='()] [variables='()]
285 A lexical environment. @var{parent} will be the enclosing lexical
286 environment, or a toplevel environment. @var{table} holds an alist
287 mapping symbols to variables bound in this environment, while
288 @var{variables} holds a cumulative list of all variables ever defined
291 Lexical environments correspond to procedures. Bindings introduced
292 e.g. by Scheme's @code{let} add to the bindings in a lexical
293 environment. An example of a case in which a variable might be in
294 @var{variables} but not in @var{table} would be a variable that is in
295 the same procedure, but is out of scope.
297 @deftp {Scheme Variable} <ghil-var> env name kind [index=#f]
298 A variable. @var{kind} is one of @code{argument}, @code{local},
299 @code{external}, @code{toplevel}, @code{public}, or @code{private};
300 see the procedures below for more information. @var{index} is used in
304 @deffn {Scheme Procedure} ghil-var-is-bound? env sym
305 Recursively look up a variable named @var{sym} in @var{env}, and
306 return it or @code{#f} if none is found.
308 @deffn {Scheme Procedure} ghil-var-for-ref! env sym
309 Recursively look up a variable named @var{sym} in @var{env}, and
310 return it. If the symbol was not bound, return a new toplevel
313 @deffn {Scheme Procedure} ghil-var-for-set! env sym
314 Like @code{ghil-var-for-ref!}, except that the returned variable will
315 be marked as @code{external}. @xref{Variables and the VM}.
317 @deffn {Scheme Procedure} ghil-var-define! toplevel-env sym
318 Return an existing or new toplevel variable named @var{sym}.
319 @var{toplevel-env} must be a toplevel environment.
321 @deffn {Scheme Procedure} ghil-var-at-module! env modname sym interface?
322 Return a variable that will be resolved at run-time with respect to a
323 specific module named @var{modname}. If @var{interface?} is true, the
324 variable will be of type @code{public}, otherwise @code{private}.
326 @deffn {Scheme Procedure} call-with-ghil-environment env syms func
327 Bind @var{syms} to fresh variables within a new lexical environment
328 whose parent is @var{env}, and call @var{func} as @code{(@var{func}
329 @var{new-env} @var{new-vars})}.
331 @deffn {Scheme Procedure} call-with-ghil-bindings env syms func
332 Like @code{call-with-ghil-environment}, except the existing
333 environment @var{env} is re-used. For that reason, @var{func} is
334 invoked as @code{(@var{func} @var{new-vars})}
337 In the aforementioned @code{<ghil-quote>} type, the @var{env} slot
338 holds a pointer to the environment in which the expression occurs. The
339 @var{loc} slot holds source location information, so that errors
340 corresponding to this expression can be mapped back to the initial
341 expression in the higher-level language, e.g. Scheme. @xref{Compiled
342 Procedures}, for more information on source location objects.
344 GHIL also has a declarative serialization format, which makes writing
345 and reading it a tractable problem for the human mind. Since all GHIL
346 language constructs contain @code{env} and @code{loc} pointers, they
347 are left out of the serialization. (Serializing @code{env} structures
348 would be difficult, as they are often circular.) What is left is the
349 type of expression, and the remaining slots defined in the expression
352 For example, an S-expression representation of the @code{<ghil-quote>}
359 It's deceptively like Scheme. The general rule is, for a type defined
360 as @code{<ghil-@var{foo}> env loc @var{slot1} @var{slot2}...}, the
361 S-expression representation will be @code{(@var{foo} @var{slot1}
362 @var{slot2}...)}. Users may program with this format directly at the
366 scheme@@(guile-user)> ,language ghil
367 Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
368 Copyright (C) 2001-2008 Free Software Foundation, Inc.
370 Enter `,help' for help.
371 ghil@@(guile-user)> (call (ref +) (quote 32) (quote 10))
375 For convenience, some slots are serialized as rest arguments; those
376 are noted below. The other caveat is that variables are serialized as
377 their names only, and not their identities.
379 @deftp {Scheme Variable} <ghil-void> env loc
380 The unspecified value.
382 @deftp {Scheme Variable} <ghil-quote> env loc exp
385 Note that unlike in Scheme, there are no self-quoting expressions; all
386 constants must come from @code{quote} expressions.
388 @deftp {Scheme Variable} <ghil-quasiquote> env loc exp
389 A quasiquoted expression. The expression is treated as a constant,
390 except for embedded @code{unquote} and @code{unquote-splicing} forms.
392 @deftp {Scheme Variable} <ghil-unquote> env loc exp
393 Like Scheme's @code{unquote}; only valid within a quasiquote.
395 @deftp {Scheme Variable} <ghil-unquote-splicing> env loc exp
396 Like Scheme's @code{unquote-splicing}; only valid within a quasiquote.
398 @deftp {Scheme Variable} <ghil-ref> env loc var
399 A variable reference. Note that for purposes of serialization,
400 @var{var} is serialized as its name, as a symbol.
402 @deftp {Scheme Variable} <ghil-set> env loc var val
403 A variable mutation. @var{var} is serialized as a symbol.
405 @deftp {Scheme Variable} <ghil-define> env loc var val
406 A toplevel variable definition. See @code{ghil-var-define!}.
408 @deftp {Scheme Variable} <ghil-if> env loc test then else
409 A conditional. Note that @var{else} is not optional.
411 @deftp {Scheme Variable} <ghil-and> env loc . exps
412 Like Scheme's @code{and}.
414 @deftp {Scheme Variable} <ghil-or> env loc . exps
415 Like Scheme's @code{or}.
417 @deftp {Scheme Variable} <ghil-begin> env loc . body
418 Like Scheme's @code{begin}.
420 @deftp {Scheme Variable} <ghil-bind> env loc vars exprs . body
421 Like a deconstructed @code{let}: each element of @var{vars} will be
422 bound to the corresponding GHIL expression in @var{exprs}.
424 Note that for purposes of the serialization format, @var{exprs} are
425 evaluated before the new bindings are added to the environment. For
426 @code{letrec} semantics, there also exists a @code{bindrec} parse
427 flavor. This is useful for writing GHIL at the REPL, but the
428 serializer does not currently have the cleverness needed to determine
429 whether a @code{<ghil-bind>} has @code{let} or @code{letrec}
430 semantics, and thus only serializes @code{<ghil-bind>} as @code{bind}.
432 @deftp {Scheme Variable} <ghil-mv-bind> env loc vars rest producer . body
433 Like Scheme's @code{receive} -- binds the values returned by
434 applying @code{producer}, which should be a thunk, to the
435 @code{lambda}-like bindings described by @var{vars} and @var{rest}.
437 @deftp {Scheme Variable} <ghil-lambda> env loc vars rest meta . body
438 A closure. @var{vars} is the argument list, serialized as a list of
439 symbols. @var{rest} is a boolean, which is @code{#t} iff the last
440 argument is a rest argument. @var{meta} is an association list of
441 properties. The actual @var{body} should be a list of GHIL
444 @deftp {Scheme Variable} <ghil-call> env loc proc . args
447 @deftp {Scheme Variable} <ghil-mv-call> env loc producer consumer
448 Like Scheme's @code{call-with-values}.
450 @deftp {Scheme Variable} <ghil-inline> env loc op . args
451 An inlined VM instruction. @var{op} should be the instruction name as
452 a symbol, and @var{args} should be its arguments, as GHIL expressions.
454 @deftp {Scheme Variable} <ghil-values> env loc . values
455 Like Scheme's @code{values}.
457 @deftp {Scheme Variable} <ghil-values*> env loc . values
458 @var{values} are as in the Scheme expression, @code{(apply values .
461 @deftp {Scheme Variable} <ghil-reified-env> env loc
462 Produces, at run-time, a reification of the environment at compile
463 time. Used in the implementation of Scheme's
464 @code{compile-time-environment}.
467 GHIL implements a compiler to GLIL that recursively traverses GHIL
468 expressions, writing out GLIL expressions into a linear list. The
469 compiler also keeps some state as to whether the current expression is
470 in tail context, and whether its value will be used in future
471 computations. This state allows the compiler not to emit code for
472 constant expressions that will not be used (e.g. docstrings), and to
473 perform tail calls when in tail position.
475 Just as the Scheme to GHIL compiler introduced new hidden state---the
476 environment---the GHIL to GLIL compiler introduces more state, the
477 stack. While not represented explicitly, the stack is present in the
478 compilation of each GHIL expression: compiling a GHIL expression
479 should leave the run-time value stack in the same state. For example,
480 if the intermediate value stack has two elements before evaluating an
481 @code{if} expression, it should have two elements after that
484 Interested readers are encouraged to read the implementation in
485 @code{(language ghil compile-glil)} for more details.
490 Guile Low Intermediate Language (GLIL) is a structured intermediate
491 language whose expressions closely mirror the functionality of Guile's
494 Its expression types are defined in @code{(language glil)}, and as
495 with GHIL, some of its fields parse as rest arguments.
497 @deftp {Scheme Variable} <glil-program> nargs nrest nlocs nexts meta . body
498 A unit of code that at run-time will correspond to a compiled
499 procedure. @var{nargs} @var{nrest} @var{nlocs}, and @var{nexts}
500 collectively define the program's arity; see @ref{Compiled
501 Procedures}, for more information. @var{meta} should be an alist of
502 properties, as in @code{<ghil-lambda>}. @var{body} is a list of GLIL
505 @deftp {Scheme Variable} <glil-bind> . vars
506 An advisory expression that notes a liveness extent for a set of
507 variables. @var{vars} is a list of @code{(@var{name} @var{type}
508 @var{index})}, where @var{type} should be either @code{argument},
509 @code{local}, or @code{external}.
511 @code{<glil-bind>} expressions end up being serialized as part of a
512 program's metadata and do not form part of a program's code path.
514 @deftp {Scheme Variable} <glil-mv-bind> vars rest
515 A multiple-value binding of the values on the stack to @var{vars}. Iff
516 @var{rest} is true, the last element of @var{vars} will be treated as
519 In addition to pushing a binding annotation on the stack, like
520 @code{<glil-bind>}, an expression is emitted at compilation time to
521 make sure that there are enough values available to bind. See the
522 notes on @code{truncate-values} in @ref{Procedural Instructions}, for
525 @deftp {Scheme Variable} <glil-unbind>
526 Closes the liveness extent of the most recently encountered
527 @code{<glil-bind>} or @code{<glil-mv-bind>} expression. As GLIL
528 expressions are compiled, a parallel stack of live bindings is
529 maintained; this expression pops off the top element from that stack.
531 Bindings are written into the program's metadata so that debuggers and
532 other tools can determine the set of live local variables at a given
533 offset within a VM program.
535 @deftp {Scheme Variable} <glil-source> loc
536 Records source information for the preceding expression. @var{loc}
537 should be a vector, @code{#(@var{line} @var{column} @var{filename})}.
539 @deftp {Scheme Variable} <glil-void>
540 Pushes the unspecified value on the stack.
542 @deftp {Scheme Variable} <glil-const> obj
543 Pushes a constant value onto the stack. @var{obj} must be a number,
544 string, symbol, keyword, boolean, character, or a pair or vector or
545 list thereof, or the empty list.
547 @deftp {Scheme Variable} <glil-argument> op index
548 Accesses an argument on the stack. If @var{op} is @code{ref}, the
549 argument is pushed onto the stack; if it is @code{set}, the argument
550 is set from the top value on the stack, which is popped off.
552 @deftp {Scheme Variable} <glil-local> op index
553 Like @code{<glil-argument>}, but for local variables. @xref{Stack
554 Layout}, for more information.
556 @deftp {Scheme Variable} <glil-external> op depth index
557 Accesses a heap-allocated variable, addressed by @var{depth}, the nth
558 enclosing environment, and @var{index}, the variable's position within
559 the environment. @var{op} is @code{ref} or @code{set}.
561 @deftp {Scheme Variable} <glil-toplevel> op name
562 Accesses a toplevel variable. @var{op} may be @code{ref}, @code{set},
565 @deftp {Scheme Variable} <glil-module> op mod name public?
566 Accesses a variable within a specific module. See
567 @code{ghil-var-at-module!}, for more information.
569 @deftp {Scheme Variable} <glil-label> label
570 Creates a new label. @var{label} can be any Scheme value, and should
573 @deftp {Scheme Variable} <glil-branch> inst label
574 Branch to a label. @var{label} should be a @code{<ghil-label>}.
575 @code{inst} is a branching instruction: @code{br-if}, @code{br}, etc.
577 @deftp {Scheme Variable} <glil-call> inst nargs
578 This expression is probably misnamed, as it does not correspond to
579 function calls. @code{<glil-call>} invokes the VM instruction named
580 @var{inst}, noting that it is called with @var{nargs} stack arguments.
581 The arguments should be pushed on the stack already. What happens to
582 the stack afterwards depends on the instruction.
584 @deftp {Scheme Variable} <glil-mv-call> nargs ra
585 Performs a multiple-value call. @var{ra} is a @code{<glil-label>}
586 corresponding to the multiple-value return address for the call. See
587 the notes on @code{mv-call} in @ref{Procedural Instructions}, for more
591 Users may enter in GLIL at the REPL as well, though there is a bit
592 more bookkeeping to do. Since GLIL needs the set of variables to be
593 declared explicitly in a @code{<glil-program>}, GLIL expressions must
594 be wrapped in a thunk that declares the arity of the expression:
597 scheme@@(guile-user)> ,language glil
598 Guile Lowlevel Intermediate Language (GLIL) interpreter 0.3 on Guile 1.9.0
599 Copyright (C) 2001-2008 Free Software Foundation, Inc.
601 Enter `,help' for help.
602 glil@@(guile-user)> (program 0 0 0 0 () (const 3) (call return 0))
606 Just as in all of Guile's compilers, an environment is passed to the
607 GLIL-to-object code compiler, and one is returned as well, along with
611 @subsection Object Code
613 Object code is the serialization of the raw instruction stream of a
614 program, ready for interpretation by the VM. Procedures related to
615 object code are defined in the @code{(system vm objcode)} module.
617 @deffn {Scheme Procedure} objcode? obj
618 @deffnx {C Function} scm_objcode_p (obj)
619 Returns @code{#f} iff @var{obj} is object code, @code{#f} otherwise.
622 @deffn {Scheme Procedure} bytecode->objcode bytecode nlocs nexts
623 @deffnx {C Function} scm_bytecode_to_objcode (bytecode, nlocs, nexts)
624 Makes a bytecode object from @var{bytecode}, which should be a
625 @code{u8vector}. @var{nlocs} and @var{nexts} denote the number of
626 stack and heap variables to reserve when this objcode is executed.
629 @deffn {Scheme Variable} load-objcode file
630 @deffnx {C Function} scm_load_objcode (file)
631 Load object code from a file named @var{file}. The file will be mapped
632 into memory via @code{mmap}, so this is a very fast operation.
634 On disk, object code has an eight-byte cookie prepended to it, so that
635 we will not execute arbitrary garbage. In addition, two more bytes are
636 reserved for @var{nlocs} and @var{nexts}.
639 @deffn {Scheme Variable} objcode->u8vector objcode
640 @deffnx {C Function} scm_objcode_to_u8vector (objcode)
641 Copy object code out to a @code{u8vector} for analysis by Scheme. The
642 ten-byte header is included.
645 @deffn {Scheme Variable} objcode->program objcode [external='()]
646 @deffnx {C Function} scm_objcode_to_program (objcode, external)
647 Load up object code into a Scheme program. The resulting program will
648 be a thunk that captures closure variables from @var{external}.
651 Object code from a file may be disassembled at the REPL via the
652 meta-command @code{,disassemble-file}, abbreviated as @code{,xx}.
653 Programs may be disassembled via @code{,disassemble}, abbreviated as
656 Compiling object code to the fake language, @code{value}, is performed
657 via loading objcode into a program, then executing that thunk with
658 respect to the compilation environment. Normally the environment
659 propagates through the compiler transparently, but users may specify
660 the compilation environment manually as well:
662 @deffn {Scheme Procedure} make-objcode-env module externals
663 Make an object code environment. @var{module} should be a Scheme
664 module, and @var{externals} should be a list of external variables.
665 @code{#f} is also a valid object code environment.
668 @node Extending the Compiler
669 @subsection Extending the Compiler
671 At this point, we break with the impersonal tone of the rest of the
672 manual, and make an intervention. Admit it: if you've read this far
673 into the compiler internals manual, you are a junkie. Perhaps a course
674 at your university left you unsated, or perhaps you've always harbored
675 a sublimated desire to hack the holy of computer science holies: a
676 compiler. Well you're in good company, and in a good position. Guile's
677 compiler needs your help.
679 There are many possible avenues for improving Guile's compiler.
680 Probably the most important improvement, speed-wise, will be some form
681 of native compilation, both just-in-time and ahead-of-time. This could
682 be done in many ways. Probably the easiest strategy would be to extend
683 the compiled procedure structure to include a pointer to a native code
684 vector, and compile from bytecode to native code at run-time after a
685 procedure is called a certain number of times.
687 The name of the game is a profiling-based harvest of the low-hanging
688 fruit, running programs of interest under a system-level profiler and
689 determining which improvements would give the most bang for the buck.
690 There are many well-known efficiency hacks in the literature: Dybvig's
691 letrec optimization, individual boxing of heap-allocated values (and
692 then store the boxes on the stack directory), optimized case-lambda
693 expressions, stack underflow and overflow handlers, etc. Highly
694 recommended papers: Dybvig's HOCS, Ghuloum's compiler paper.
696 The compiler also needs help at the top end, enhancing the Scheme that
697 it knows to also understand R6RS, and adding new high-level compilers:
698 Emacs Lisp, Lua, JavaScript...