tweaks to compiler.texi
[bpt/guile.git] / doc / ref / compiler.texi
CommitLineData
8680d53b
AW
1@c -*-texinfo-*-
2@c This is part of the GNU Guile Reference Manual.
3@c Copyright (C) 2008
4@c Free Software Foundation, Inc.
5@c See the file guile.texi for copying conditions.
6
7@node Compiling to the Virtual Machine
8@section Compiling to the Virtual Machine
9
00ce5125
AW
10Compilers have a mystique about them that is attractive and
11off-putting at the same time. They are attractive because they are
12magical -- they transform inert text into live results, like throwing
e33e3aee
AW
13the switch on Frankenstein's monster. However, this magic is perceived
14by many to be impenetrable.
00ce5125
AW
15
16This section aims to pull back the veil from over Guile's compiler
e3ba263d 17implementation, and pay attention to the small man behind the curtain.
00ce5125 18
e3ba263d
AW
19@xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
20know how to compile your .scm file.
00ce5125
AW
21
22@menu
23* Compiler Tower::
24* The Scheme Compiler::
25* GHIL::
26* GLIL::
27* Object Code::
e3ba263d 28* Extending the Compiler::
00ce5125
AW
29@end menu
30
31@node Compiler Tower
32@subsection Compiler Tower
33
34Guile's compiler is quite simple, actually -- its @emph{compilers}, to
35put it more accurately. Guile defines a tower of languages, starting
36at Scheme and progressively simplifying down to languages that
e3ba263d 37resemble the VM instruction set (@pxref{Instruction Set}).
00ce5125
AW
38
39Each language knows how to compile to the next, so each step is simple
40and understandable. Furthermore, this set of languages is not
41hardcoded into Guile, so it is possible for the user to add new
42high-level languages, new passes, or even different compilation
43targets.
44
e3ba263d
AW
45Languages are registered in the module, @code{(system base language)}:
46
47@example
48(use-modules (system base language))
49@end example
50
51They are registered with the @code{define-language} form.
52
53@deffn {Scheme Syntax} define-language @
54name title version reader printer @
55[parser=#f] [read-file=#f] [compilers='()] [evaluator=#f]
56Define a language.
57
58This syntax defines a @code{#<language>} object, bound to @var{name}
59in the current environment. In addition, the language will be added to
60the global language set. For example, this is the language definition
61for Scheme:
62
63@example
64(define-language scheme
65 #:title "Guile Scheme"
66 #:version "0.5"
67 #:reader read
68 #:read-file read-file
ca445ba5 69 #:compilers `((,ghil . ,compile-ghil))
e3ba263d
AW
70 #:evaluator (lambda (x module) (primitive-eval x))
71 #:printer write)
72@end example
73
74In this example, from @code{(language scheme spec)}, @code{read-file}
75reads expressions from a port and wraps them in a @code{begin} block.
76@end deffn
77
78The interesting thing about having languages defined this way is that
79they present a uniform interface to the read-eval-print loop. This
80allows the user to change the current language of the REPL:
81
82@example
83$ guile
84Guile Scheme interpreter 0.5 on Guile 1.9.0
85Copyright (C) 2001-2008 Free Software Foundation, Inc.
86
87Enter `,help' for help.
88scheme@@(guile-user)> ,language ghil
89Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
90Copyright (C) 2001-2008 Free Software Foundation, Inc.
91
92Enter `,help' for help.
93ghil@@(guile-user)>
94@end example
95
96Languages can be looked up by name, as they were above.
97
98@deffn {Scheme Procedure} lookup-language name
99Looks up a language named @var{name}, autoloading it if necessary.
100
101Languages are autoloaded by looking for a variable named @var{name} in
102a module named @code{(language @var{name} spec)}.
103
104The language object will be returned, or @code{#f} if there does not
105exist a language with that name.
106@end deffn
107
108Defining languages this way allows us to programmatically determine
109the necessary steps for compiling code from one language to another.
110
111@deffn {Scheme Procedure} lookup-compilation-order from to
112Recursively traverses the set of languages to which @var{from} can
113compile, depth-first, and return the first path that can transform
114@var{from} to @var{to}. Returns @code{#f} if no path is found.
115
116This function memoizes its results in a cache that is invalidated by
117subsequent calls to @code{define-language}, so it should be quite
118fast.
119@end deffn
120
121There is a notion of a ``current language'', which is maintained in
122the @code{*current-language*} fluid. This language is normally Scheme,
123and may be rebound by the user. The runtime compilation interfaces
124(@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
125and target languages.
126
127The normal tower of languages when compiling Scheme goes like this:
128
129@itemize
130@item Scheme, which we know and love
131@item Guile High Intermediate Language (GHIL)
132@item Guile Low Intermediate Language (GLIL)
133@item Object code
134@end itemize
135
136Object code may be serialized to disk directly, though it has a cookie
137and version prepended to the front. But when compiling Scheme at
138runtime, you want a Scheme value, e.g. a compiled procedure. For this
139reason, so as not to break the abstraction, Guile defines a fake
140language, @code{value}. Compiling to @code{value} loads the object
141code into a procedure, and wakes the sleeping giant.
142
143Perhaps this strangeness can be explained by example:
144@code{compile-file} defaults to compiling to object code, because it
145produces object code that has to live in the barren world outside the
146Guile runtime; but @code{compile} defaults to compiling to
147@code{value}, as its product re-enters the Guile world.
148
149Indeed, the process of compilation can circulate through these
150different worlds indefinitely, as shown by the following quine:
151
152@example
00ce5125 153((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
e3ba263d 154@end example
00ce5125
AW
155
156@node The Scheme Compiler
157@subsection The Scheme Compiler
158
ca445ba5
AW
159The job of the Scheme compiler is to expand all macros and to resolve
160all symbols to lexical variables. Its target language, GHIL, is fairly
161close to Scheme itself, so this process is not very complicated.
00ce5125 162
ca445ba5
AW
163The Scheme compiler is driven by a table of @dfn{translators},
164declared with the @code{define-scheme-translator} form, defined in the
165module, @code{(language scheme compile-ghil)}.
00ce5125 166
ca445ba5
AW
167@deffn {Scheme Syntax} define-scheme-translator head clause1 clause2...
168The best documentation of this form is probably an example. Here is
169the translator for @code{if}:
00ce5125 170
ca445ba5
AW
171@example
172(define-scheme-translator if
173 ;; (if TEST THEN [ELSE])
174 ((,test ,then)
175 (make-ghil-if e l (retrans test) (retrans then) (retrans '(begin))))
176 ((,test ,then ,else)
177 (make-ghil-if e l (retrans test) (retrans then) (retrans else))))
178@end example
00ce5125 179
ca445ba5
AW
180The match syntax is from the @code{pmatch} macro, defined in
181@code{(system base pmatch)}. The result of a clause should be a valid
182GHIL value. If no clause matches, a syntax error is signalled.
183
184In the body of the clauses, the following bindings are introduced:
185@itemize
186@item @code{e}, the current environment
187@item @code{l}, the current source location (or @code{#f})
188@item @code{retrans}, a procedure that may be called to compile
189subexpressions
190@end itemize
191
192Note that translators are looked up by @emph{value}, not by name. That
193is to say, the translator is keyed under the @emph{value} of
194@code{if}, which normally prints as @code{#<primitive-builtin-macro!
195if>}.
196@end deffn
197
198Users can extend the compiler by defining new translators.
199Additionally, some forms can be inlined directly to
200instructions -- @xref{Inlined Scheme Instructions}, for a list. The
201actual inliners are defined in @code{(language scheme inline)}:
202
203@deffn {Scheme Syntax} define-inline head arity1 result1 arity2 result2...
204Defines an inliner for @code{head}. As in
205@code{define-scheme-translator}, inliners are keyed by value and not
206by name.
207
208Expressions are matched on their arities. For example:
209
210@example
211(define-inline eq?
212 (x y) (eq? x y))
213@end example
00ce5125 214
ca445ba5
AW
215This inlines calls to the Scheme procedure, @code{eq?}, to the
216instruction @code{eq?}.
217
218A more complicated example would be:
219
220@example
221(define-inline +
222 () 0
223 (x) x
224 (x y) (add x y)
225 (x y . rest) (add x (+ y . rest)))
226@end example
227@end deffn
228
229Compilers take two arguments, an expression and an environment, and
230return two values as well: an expression in the target language, and
231an environment suitable for the target language. The format of the
232environment is language-dependent.
233
234For Scheme, an environment may be one of three things:
235@itemize
236@item @code{#f}, in which case compilation is performed in the context
237of the current module;
238@item a module, which specifies the context of the compilation; or
239@item a @dfn{compile environment}, which specifies lexical variables
240as well.
241@end itemize
242
243The format of a compile environment for scheme is @code{(@var{module}
244@var{lexicals} . @var{externals})}, though users are strongly
245discouraged from constructing these environments themselves. Instead,
246if you need this functionality -- as in GOOPS' dynamic method compiler
247-- capture an environment with @code{compile-time-environment}, then
248pass that environment to @code{compile}.
249
250@deffn {Scheme Procedure} compile-time-environment
251A special function known to the compiler that, when compiled, will
252return a representation of the lexical environment in place at compile
253time. Useful for supporting some forms of dynamic compilation. Returns
254@code{#f} if called from the interpreter.
255@end deffn
00ce5125
AW
256
257@node GHIL
258@subsection GHIL
259
c850030f
AW
260Guile High Intermediate Language (GHIL) is a structured intermediate
261language that is close in expressive power to Scheme. It is an
262expanded, pre-analyzed Scheme.
263
264GHIL is ``structured'' in the sense that its representation is based
265on records, not S-expressions. This gives a rigidity to the language
266that ensures that compiling to a lower-level language only requires a
267limited set of transformations. Practically speaking, consider the
268GHIL type, @code{<ghil-quote>}, which has fields named @code{env},
269@code{loc}, and @code{exp}. Instances of this type are records created
270via @code{make-ghil-quote}, and whose fields are accessed as
271@code{ghil-quote-env}, @code{ghil-quote-loc}, and
272@code{ghil-quote-exp}. There is also a predicate, @code{ghil-quote?}.
273@xref{Records}, for more information on records.
274
275Expressions of GHIL name their environments explicitly, and all
276variables are referenced by identity in addition to by name.
277@code{(language ghil)} defines a number of routines to deal explicitly
278with variables and environments:
279
280@deftp {Scheme Variable} <ghil-toplevel-env> [table='()]
281A toplevel environment. The @var{table} holds all toplevel variables
282that have been resolved in this environment.
283@end deftp
284@deftp {Scheme Variable} <ghil-env> parent [table='()] [variables='()]
285A lexical environment. @var{parent} will be the enclosing lexical
286environment, or a toplevel environment. @var{table} holds an alist
287mapping symbols to variables bound in this environment, while
288@var{variables} holds a cumulative list of all variables ever defined
289in this environment.
290
291Lexical environments correspond to procedures. Bindings introduced
292e.g. by Scheme's @code{let} add to the bindings in a lexical
293environment. An example of a case in which a variable might be in
294@var{variables} but not in @var{table} would be a variable that is in
295the same procedure, but is out of scope.
296@end deftp
297@deftp {Scheme Variable} <ghil-var> env name kind [index=#f]
298A variable. @var{kind} is one of @code{argument}, @code{local},
299@code{external}, @code{toplevel}, @code{public}, or @code{private};
300see the procedures below for more information. @var{index} is used in
301compilation.
302@end deftp
303
304@deffn {Scheme Procedure} ghil-var-is-bound? env sym
305Recursively look up a variable named @var{sym} in @var{env}, and
306return it or @code{#f} if none is found.
ca445ba5 307@end deffn
c850030f
AW
308@deffn {Scheme Procedure} ghil-var-for-ref! env sym
309Recursively look up a variable named @var{sym} in @var{env}, and
310return it. If the symbol was not bound, return a new toplevel
311variable.
ca445ba5 312@end deffn
c850030f
AW
313@deffn {Scheme Procedure} ghil-var-for-set! env sym
314Like @code{ghil-var-for-ref!}, except that the returned variable will
315be marked as @code{external}. @xref{Variables and the VM}.
ca445ba5 316@end deffn
c850030f
AW
317@deffn {Scheme Procedure} ghil-var-define! toplevel-env sym
318Return an existing or new toplevel variable named @var{sym}.
319@var{toplevel-env} must be a toplevel environment.
ca445ba5 320@end deffn
c850030f
AW
321@deffn {Scheme Procedure} ghil-var-at-module! env modname sym interface?
322Return a variable that will be resolved at runtime with respect to a
323specific module named @var{modname}. If @var{interface?} is true, the
324variable will be of type @code{public}, otherwise @code{private}.
ca445ba5 325@end deffn
c850030f
AW
326@deffn {Scheme Procedure} call-with-ghil-environment env syms func
327Bind @var{syms} to fresh variables within a new lexical environment
328whose parent is @var{env}, and call @var{func} as @code{(@var{func}
329@var{new-env} @var{new-vars})}.
ca445ba5 330@end deffn
c850030f
AW
331@deffn {Scheme Procedure} call-with-ghil-bindings env syms func
332Like @code{call-with-ghil-environment}, except the existing
333environment @var{env} is re-used. For that reason, @var{func} is
334invoked as @code{(@var{func} @var{new-vars})}
ca445ba5 335@end deffn
c850030f
AW
336
337In the aforementioned @code{<ghil-quote>} type, the @var{env} slot
338holds a pointer to the environment in which the expression occurs. The
339@var{loc} slot holds source location information, so that errors
340corresponding to this expression can be mapped back to the initial
341expression in the higher-level language, e.g. Scheme. @xref{Compiled
342Procedures}, for more information on source location objects.
343
344GHIL also has a declarative serialization format, which makes writing
345and reading it a tractable problem for the human mind. Since all GHIL
346language constructs contain @code{env} and @code{loc} pointers, they
347are left out of the serialization. (Serializing @code{env} structures
348would be difficult, as they are often circular.) What is left is the
349type of expression, and the remaining slots defined in the expression
350type.
351
352For example, an S-expression representation of the @code{<ghil-quote>}
353expression would be:
354
355@example
356(quote 3)
357@end example
358
359It's deceptively like Scheme. The general rule is, for a type defined
360as @code{<ghil-@var{foo}> env loc @var{slot1} @var{slot2}...}, the
361S-expression representation will be @code{(@var{foo} @var{slot1}
362@var{slot2}...)}. Users may program with this format directly at the
363REPL:
364
365@example
366scheme@@(guile-user)> ,language ghil
367Guile High Intermediate Language (GHIL) interpreter 0.3 on Guile 1.9.0
368Copyright (C) 2001-2008 Free Software Foundation, Inc.
369
370Enter `,help' for help.
371ghil@@(guile-user)> (call (ref +) (quote 32) (quote 10))
372@result{} 42
373@end example
374
375For convenience, some slots are serialized as rest arguments; those
376are noted below. The other caveat is that variables are serialized as
377their names only, and not their identities.
378
379@deftp {Scheme Variable} <ghil-void> env loc
380The unspecified value.
381@end deftp
382@deftp {Scheme Variable} <ghil-quote> env loc exp
383A quoted expression.
384
385Note that unlike in Scheme, there are no self-quoting expressions; all
386constants must come from @code{quote} expressions.
387@end deftp
388@deftp {Scheme Variable} <ghil-quasiquote> env loc exp
389A quasiquoted expression. The expression is treated as a constant,
390except for embedded @code{unquote} and @code{unquote-splicing} forms.
391@end deftp
392@deftp {Scheme Variable} <ghil-unquote> env loc exp
393Like Scheme's @code{unquote}; only valid within a quasiquote.
394@end deftp
395@deftp {Scheme Variable} <ghil-unquote-splicing> env loc exp
396Like Scheme's @code{unquote-splicing}; only valid within a quasiquote.
397@end deftp
398@deftp {Scheme Variable} <ghil-ref> env loc var
399A variable reference. Note that for purposes of serialization,
400@var{var} is serialized as its name, as a symbol.
401@end deftp
402@deftp {Scheme Variable} <ghil-set> env loc var val
403A variable mutation. @var{var} is serialized as a symbol.
404@end deftp
405@deftp {Scheme Variable} <ghil-define> env loc var val
406A toplevel variable definition. See @code{ghil-var-define!}.
407@end deftp
408@deftp {Scheme Variable} <ghil-if> env loc test then else
ca445ba5 409A conditional. Note that @var{else} is not optional.
c850030f
AW
410@end deftp
411@deftp {Scheme Variable} <ghil-and> env loc . exps
ca445ba5 412Like Scheme's @code{and}.
c850030f
AW
413@end deftp
414@deftp {Scheme Variable} <ghil-or> env loc . exps
ca445ba5 415Like Scheme's @code{or}.
c850030f
AW
416@end deftp
417@deftp {Scheme Variable} <ghil-begin> env loc . body
418Like Scheme's @code{begin}.
419@end deftp
420@deftp {Scheme Variable} <ghil-bind> env loc vars exprs . body
421Like a deconstructed @code{let}: each element of @var{vars} will be
422bound to the corresponding GHIL expression in @var{exprs}.
423
424Note that for purposes of the serialization format, @var{exprs} are
425evaluated before the new bindings are added to the environment. For
426@code{letrec} semantics, there also exists a @code{bindrec} parse
427flavor. This is useful for writing GHIL at the REPL, but the
428serializer does not currently have the cleverness needed to determine
429whether a @code{<ghil-bind>} has @code{let} or @code{letrec}
430semantics, and thus only serializes @code{<ghil-bind>} as @code{bind}.
431@end deftp
432@deftp {Scheme Variable} <ghil-mv-bind> env loc vars rest producer . body
ca445ba5
AW
433Like Scheme's @code{receive} -- binds the values returned by
434applying @code{producer}, which should be a thunk, to the
c850030f
AW
435@code{lambda}-like bindings described by @var{vars} and @var{rest}.
436@end deftp
437@deftp {Scheme Variable} <ghil-lambda> env loc vars rest meta . body
438A closure. @var{vars} is the argument list, serialized as a list of
439symbols. @var{rest} is a boolean, which is @code{#t} iff the last
440argument is a rest argument. @var{meta} is an association list of
441properties. The actual @var{body} should be a list of GHIL
442expressions.
443@end deftp
444@deftp {Scheme Variable} <ghil-call> env loc proc . args
ca445ba5 445A procedure call.
c850030f
AW
446@end deftp
447@deftp {Scheme Variable} <ghil-mv-call> env loc producer consumer
ca445ba5 448Like Scheme's @code{call-with-values}.
c850030f
AW
449@end deftp
450@deftp {Scheme Variable} <ghil-inline> env loc op . args
ca445ba5
AW
451An inlined VM instruction. @var{op} should be the instruction name as
452a symbol, and @var{args} should be its arguments, as GHIL expressions.
c850030f
AW
453@end deftp
454@deftp {Scheme Variable} <ghil-values> env loc . values
ca445ba5 455Like Scheme's @code{values}.
c850030f
AW
456@end deftp
457@deftp {Scheme Variable} <ghil-values*> env loc . values
ca445ba5
AW
458@var{values} are as in the Scheme expression, @code{(apply values .
459@var{vals})}.
c850030f
AW
460@end deftp
461@deftp {Scheme Variable} <ghil-reified-env> env loc
ca445ba5 462Produces, at runtime, a reification of the environment at compile
c850030f
AW
463time. Used in the implementation of Scheme's
464@code{compile-time-environment}.
465@end deftp
466
467GHIL implements a compiler to GLIL that recursively traverses GHIL
468expressions, writing out GLIL expressions into a linear list. The
469compiler also keeps some state as to whether the current expression is
470in tail context, and whether its value will be used in future
471computations. This state allows the compiler not to emit code for
472constant expressions that will not be used (e.g. docstrings), and to
473perform tail calls when in tail position.
474
475Just as the Scheme to GHIL compiler introduced new hidden state---the
476environment---the GHIL to GLIL compiler introduces more state, the
477stack. While not represented explicitly, the stack is present in the
478compilation of each GHIL expression: compiling a GHIL expression
479should leave the runtime value stack in the same state. For example,
480if the intermediate value stack has two elements before evaluating an
481@code{if} expression, it should have two elements after that
482expression.
483
484Interested readers are encouraged to read the implementation in
485@code{(language ghil compile-glil)} for more details.
00ce5125
AW
486
487@node GLIL
488@subsection GLIL
489
ff73ae34 490Guile Low Intermediate Language (GLIL) is a structured intermediate
c850030f
AW
491language whose expressions closely mirror the functionality of Guile's
492VM instruction set.
493
494Its expression types are defined in @code{(language glil)}, and as
495with GHIL, some of its fields parse as rest arguments.
496
ff73ae34
AW
497@deftp {Scheme Variable} <glil-program> nargs nrest nlocs nexts meta . body
498A unit of code that at runtime will correspond to a compiled
e33e3aee 499procedure. @var{nargs} @var{nrest} @var{nlocs}, and @var{nexts}
ff73ae34
AW
500collectively define the program's arity; see @ref{Compiled
501Procedures}, for more information. @var{meta} should be an alist of
502properties, as in @code{<ghil-lambda>}. @var{body} is a list of GLIL
503expressions.
c850030f
AW
504@end deftp
505@deftp {Scheme Variable} <glil-bind> . vars
ff73ae34
AW
506An advisory expression that notes a liveness extent for a set of
507variables. @var{vars} is a list of @code{(@var{name} @var{type}
508@var{index})}, where @var{type} should be either @code{argument},
509@code{local}, or @code{external}.
510
511@code{<glil-bind>} expressions end up being serialized as part of a
512program's metadata and do not form part of a program's code path.
c850030f
AW
513@end deftp
514@deftp {Scheme Variable} <glil-mv-bind> vars rest
ff73ae34
AW
515A multiple-value binding of the values on the stack to @var{vars}. Iff
516@var{rest} is true, the last element of @var{vars} will be treated as
517a rest argument.
518
519In addition to pushing a binding annotation on the stack, like
520@code{<glil-bind>}, an expression is emitted at compilation time to
521make sure that there are enough values available to bind. See the
522notes on @code{truncate-values} in @ref{Procedural Instructions}, for
523more information.
c850030f
AW
524@end deftp
525@deftp {Scheme Variable} <glil-unbind>
ff73ae34
AW
526Closes the liveness extent of the most recently encountered
527@code{<glil-bind>} or @code{<glil-mv-bind>} expression. As GLIL
528expressions are compiled, a parallel stack of live bindings is
529maintained; this expression pops off the top element from that stack.
530
531Bindings are written into the program's metadata so that debuggers and
532other tools can determine the set of live local variables at a given
533offset within a VM program.
c850030f
AW
534@end deftp
535@deftp {Scheme Variable} <glil-source> loc
ff73ae34
AW
536Records source information for the preceding expression. @var{loc}
537should be a vector, @code{#(@var{line} @var{column} @var{filename})}.
c850030f
AW
538@end deftp
539@deftp {Scheme Variable} <glil-void>
ff73ae34 540Pushes the unspecified value on the stack.
c850030f
AW
541@end deftp
542@deftp {Scheme Variable} <glil-const> obj
ff73ae34
AW
543Pushes a constant value onto the stack. @var{obj} must be a number,
544string, symbol, keyword, boolean, character, or a pair or vector or
545list thereof, or the empty list.
c850030f
AW
546@end deftp
547@deftp {Scheme Variable} <glil-argument> op index
ff73ae34
AW
548Accesses an argument on the stack. If @var{op} is @code{ref}, the
549argument is pushed onto the stack; if it is @code{set}, the argument
550is set from the top value on the stack, which is popped off.
c850030f
AW
551@end deftp
552@deftp {Scheme Variable} <glil-local> op index
ff73ae34
AW
553Like @code{<glil-argument>}, but for local variables. @xref{Stack
554Layout}, for more information.
c850030f
AW
555@end deftp
556@deftp {Scheme Variable} <glil-external> op depth index
ff73ae34
AW
557Accesses a heap-allocated variable, addressed by @var{depth}, the nth
558enclosing environment, and @var{index}, the variable's position within
559the environment. @var{op} is @code{ref} or @code{set}.
c850030f
AW
560@end deftp
561@deftp {Scheme Variable} <glil-toplevel> op name
ff73ae34
AW
562Accesses a toplevel variable. @var{op} may be @code{ref}, @code{set},
563or @code{define}.
c850030f
AW
564@end deftp
565@deftp {Scheme Variable} <glil-module> op mod name public?
ff73ae34
AW
566Accesses a variable within a specific module. See
567@code{ghil-var-at-module!}, for more information.
c850030f
AW
568@end deftp
569@deftp {Scheme Variable} <glil-label> label
ff73ae34
AW
570Creates a new label. @var{label} can be any Scheme value, and should
571be unique.
c850030f
AW
572@end deftp
573@deftp {Scheme Variable} <glil-branch> inst label
ff73ae34 574Branch to a label. @var{label} should be a @code{<ghil-label>}.
c850030f
AW
575@code{inst} is a branching instruction: @code{br-if}, @code{br}, etc.
576@end deftp
577@deftp {Scheme Variable} <glil-call> inst nargs
ff73ae34 578This expression is probably misnamed, as it does not correspond to
c850030f
AW
579function calls. @code{<glil-call>} invokes the VM instruction named
580@var{inst}, noting that it is called with @var{nargs} stack arguments.
ff73ae34
AW
581The arguments should be pushed on the stack already. What happens to
582the stack afterwards depends on the instruction.
c850030f
AW
583@end deftp
584@deftp {Scheme Variable} <glil-mv-call> nargs ra
ff73ae34
AW
585Performs a multiple-value call. @var{ra} is a @code{<glil-label>}
586corresponding to the multiple-value return address for the call. See
587the notes on @code{mv-call} in @ref{Procedural Instructions}, for more
588information.
c850030f
AW
589@end deftp
590
ff73ae34
AW
591Users may enter in GLIL at the REPL as well, though there is a bit
592more bookkeeping to do. Since GLIL needs the set of variables to be
593declared explicitly in a @code{<glil-program>}, GLIL expressions must
594be wrapped in a thunk that declares the arity of the expression:
00ce5125 595
ff73ae34
AW
596@example
597scheme@@(guile-user)> ,language glil
598Guile Lowlevel Intermediate Language (GLIL) interpreter 0.3 on Guile 1.9.0
599Copyright (C) 2001-2008 Free Software Foundation, Inc.
00ce5125 600
ff73ae34
AW
601Enter `,help' for help.
602glil@@(guile-user)> (program 0 0 0 0 () (const 3) (call return 0))
603@result{} 3
604@end example
00ce5125 605
ff73ae34
AW
606Just as in all of Guile's compilers, an environment is passed to the
607GLIL-to-object code compiler, and one is returned as well, along with
608the object code.
00ce5125
AW
609
610@node Object Code
611@subsection Object Code
612
ff73ae34
AW
613Object code is the serialization of the raw instruction stream of a
614program, ready for interpretation by the VM. Procedures related to
615object code are defined in the @code{(system vm objcode)} module.
00ce5125 616
ff73ae34
AW
617@deffn {Scheme Procedure} objcode? obj
618@deffnx {C Function} scm_objcode_p (obj)
619Returns @code{#f} iff @var{obj} is object code, @code{#f} otherwise.
620@end deffn
00ce5125 621
ff73ae34
AW
622@deffn {Scheme Procedure} bytecode->objcode bytecode nlocs nexts
623@deffnx {C Function} scm_bytecode_to_objcode (bytecode, nlocs, nexts)
624Makes a bytecode object from @var{bytecode}, which should be a
625@code{u8vector}. @var{nlocs} and @var{nexts} denote the number of
626stack and heap variables to reserve when this objcode is executed.
627@end deffn
e3ba263d 628
ff73ae34
AW
629@deffn {Scheme Variable} load-objcode file
630@deffnx {C Function} scm_load_objcode (file)
631Load object code from a file named @var{file}. The file will be mapped
632into memory via @code{mmap}, so this is a very fast operation.
e3ba263d 633
ff73ae34
AW
634On disk, object code has an eight-byte cookie prepended to it, so that
635we will not execute arbitrary garbage. In addition, two more bytes are
636reserved for @var{nlocs} and @var{nexts}.
637@end deffn
e3ba263d 638
ff73ae34
AW
639@deffn {Scheme Variable} objcode->u8vector objcode
640@deffnx {C Function} scm_objcode_to_u8vector (objcode)
641Copy object code out to a @code{u8vector} for analysis by Scheme. The
642ten-byte header is included.
643@end deffn
e3ba263d 644
ff73ae34
AW
645@deffn {Scheme Variable} objcode->program objcode [external='()]
646@deffnx {C Function} scm_objcode_to_program (objcode, external)
647Load up object code into a Scheme program. The resulting program will
648be a thunk that captures closure variables from @var{external}.
649@end deffn
c850030f 650
ff73ae34
AW
651Object code from a file may be disassembled at the REPL via the
652meta-command @code{,disassemble-file}, abbreviated as @code{,xx}.
653Programs may be disassembled via @code{,disassemble}, abbreviated as
654@code{,x}.
655
656Compiling object code to the fake language, @code{value}, is performed
657via loading objcode into a program, then executing that thunk with
658respect to the compilation environment. Normally the environment
659propagates through the compiler transparently, but users may specify
660the compilation environment manually as well:
661
662@deffn {Scheme Procedure} make-objcode-env module externals
663Make an object code environment. @var{module} should be a Scheme
664module, and @var{externals} should be a list of external variables.
665@code{#f} is also a valid object code environment.
666@end deffn
c850030f 667
ff73ae34
AW
668@node Extending the Compiler
669@subsection Extending the Compiler
e3ba263d 670
ff73ae34
AW
671At this point, we break with the impersonal tone of the rest of the
672manual, and make an intervention. Admit it: if you've read this far
673into the compiler internals manual, you are a junkie. Perhaps a course
674at your university left you unsated, or perhaps you've always harbored
675a sublimated desire to hack the holy of computer science holies: a
676compiler. Well you're in good company, and in a good position. Guile's
677compiler needs your help.
678
679There are many possible avenues for improving Guile's compiler.
680Probably the most important improvement, speed-wise, will be some form
681of native compilation, both just-in-time and ahead-of-time. This could
682be done in many ways. Probably the easiest strategy would be to extend
683the compiled procedure structure to include a pointer to a native code
684vector, and compile from bytecode to native code at runtime after a
685procedure is called a certain number of times.
686
687The name of the game is a profiling-based harvest of the low-hanging
688fruit, running programs of interest under a system-level profiler and
689determining which improvements would give the most bang for the buck.
690There are many well-known efficiency hacks in the literature: Dybvig's
691letrec optimization, individual boxing of heap-allocated values (and
692then store the boxes on the stack directory), optimized case-lambda
693expressions, stack underflow and overflow handlers, etc. Highly
694recommended papers: Dybvig's HOCS, Ghuloum's compiler paper.
695
696The compiler also needs help at the top end, enhancing the Scheme that
697it knows to also understand R6RS, and adding new high-level compilers:
698Emacs Lisp, Lua, JavaScript...