Fix cross-references in api-macros.texi
[bpt/guile.git] / doc / ref / compiler.texi
CommitLineData
8680d53b
AW
1@c -*-texinfo-*-
2@c This is part of the GNU Guile Reference Manual.
747bd534 3@c Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013
8680d53b
AW
4@c Free Software Foundation, Inc.
5@c See the file guile.texi for copying conditions.
6
7@node Compiling to the Virtual Machine
8@section Compiling to the Virtual Machine
9
67915ab0
AW
10Compilers! The word itself inspires excitement and awe, even among
11experienced practitioners. But a compiler is just a program: an
12eminently hackable thing. This section aims to to describe Guile's
13compiler in such a way that interested Scheme hackers can feel
14comfortable reading and extending it.
00ce5125 15
e3ba263d 16@xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to
98850fd7 17know how to compile your @code{.scm} file.
00ce5125
AW
18
19@menu
20* Compiler Tower::
21* The Scheme Compiler::
81fd3152 22* Tree-IL::
67915ab0
AW
23* Continuation-Passing Style::
24* Bytecode::
e63d888e 25* Writing New High-Level Languages::
e3ba263d 26* Extending the Compiler::
00ce5125
AW
27@end menu
28
29@node Compiler Tower
30@subsection Compiler Tower
31
d297e544
AW
32Guile's compiler is quite simple -- its @emph{compilers}, to put it more
33accurately. Guile defines a tower of languages, starting at Scheme and
34progressively simplifying down to languages that resemble the VM
35instruction set (@pxref{Instruction Set}).
00ce5125
AW
36
37Each language knows how to compile to the next, so each step is simple
67915ab0
AW
38and understandable. Furthermore, this set of languages is not hardcoded
39into Guile, so it is possible for the user to add new high-level
40languages, new passes, or even different compilation targets.
00ce5125 41
e3ba263d
AW
42Languages are registered in the module, @code{(system base language)}:
43
44@example
45(use-modules (system base language))
46@end example
47
48They are registered with the @code{define-language} form.
49
50@deffn {Scheme Syntax} define-language @
994d87be
BT
51 [#:name] [#:title] [#:reader] [#:printer] @
52 [#:parser=#f] [#:compilers='()] @
53 [#:decompilers='()] [#:evaluator=#f] @
54 [#:joiner=#f] [#:for-humans?=#t] @
55 [#:make-default-environment=make-fresh-user-module]
e3ba263d
AW
56Define a language.
57
67915ab0
AW
58This syntax defines a @code{<language>} object, bound to @var{name} in
59the current environment. In addition, the language will be added to the
60global language set. For example, this is the language definition for
61Scheme:
e3ba263d
AW
62
63@example
64(define-language scheme
41e64dd7
AW
65 #:title "Scheme"
66 #:reader (lambda (port env) ...)
98850fd7 67 #:compilers `((tree-il . ,compile-tree-il))
81fd3152 68 #:decompilers `((tree-il . ,decompile-tree-il))
41e64dd7
AW
69 #:evaluator (lambda (x module) (primitive-eval x))
70 #:printer write
71 #:make-default-environment (lambda () ...))
e3ba263d 72@end example
e3ba263d
AW
73@end deffn
74
75The interesting thing about having languages defined this way is that
67915ab0 76they present a uniform interface to the read-eval-print loop. This
e3ba263d
AW
77allows the user to change the current language of the REPL:
78
79@example
81fd3152 80scheme@@(guile-user)> ,language tree-il
41e64dd7
AW
81Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'.
82tree-il@@(guile-user)> ,L scheme
83Happy hacking with Scheme! To switch back, type `,L tree-il'.
84scheme@@(guile-user)>
e3ba263d
AW
85@end example
86
87Languages can be looked up by name, as they were above.
88
89@deffn {Scheme Procedure} lookup-language name
90Looks up a language named @var{name}, autoloading it if necessary.
91
92Languages are autoloaded by looking for a variable named @var{name} in
93a module named @code{(language @var{name} spec)}.
94
95The language object will be returned, or @code{#f} if there does not
96exist a language with that name.
97@end deffn
98
99Defining languages this way allows us to programmatically determine
100the necessary steps for compiling code from one language to another.
101
102@deffn {Scheme Procedure} lookup-compilation-order from to
103Recursively traverses the set of languages to which @var{from} can
104compile, depth-first, and return the first path that can transform
105@var{from} to @var{to}. Returns @code{#f} if no path is found.
106
107This function memoizes its results in a cache that is invalidated by
108subsequent calls to @code{define-language}, so it should be quite
109fast.
110@end deffn
111
5745de91
AW
112There is a notion of a ``current language'', which is maintained in the
113@code{current-language} parameter, defined in the core @code{(guile)}
67915ab0
AW
114module. This language is normally Scheme, and may be rebound by the
115user. The run-time compilation interfaces
e3ba263d
AW
116(@pxref{Read/Load/Eval/Compile}) also allow you to choose other source
117and target languages.
118
119The normal tower of languages when compiling Scheme goes like this:
120
121@itemize
41e64dd7 122@item Scheme
81fd3152 123@item Tree Intermediate Language (Tree-IL)
67915ab0 124@item Continuation-Passing Style (CPS)
81fd3152 125@item Bytecode
e3ba263d
AW
126@end itemize
127
67915ab0
AW
128As discussed before (@pxref{Object File Format}), bytecode is in ELF
129format, ready to be serialized to disk. But when compiling Scheme at
130run time, you want a Scheme value: for example, a compiled procedure.
131For this reason, so as not to break the abstraction, Guile defines a
132fake language at the bottom of the tower:
81fd3152
AW
133
134@itemize
135@item Value
136@end itemize
137
d297e544
AW
138Compiling to @code{value} loads the bytecode into a procedure, turning
139cold bytes into warm code.
e3ba263d
AW
140
141Perhaps this strangeness can be explained by example:
d297e544 142@code{compile-file} defaults to compiling to bytecode, because it
e3ba263d 143produces object code that has to live in the barren world outside the
67915ab0
AW
144Guile runtime; but @code{compile} defaults to compiling to @code{value},
145as its product re-enters the Guile world.
146
147@c FIXME: This doesn't work anymore :( Should we add some kind of
148@c special GC pass, or disclaim this kind of code, or what?
e3ba263d
AW
149
150Indeed, the process of compilation can circulate through these
151different worlds indefinitely, as shown by the following quine:
152
153@example
00ce5125 154((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x)))
e3ba263d 155@end example
00ce5125
AW
156
157@node The Scheme Compiler
158@subsection The Scheme Compiler
159
7081d4f9 160The job of the Scheme compiler is to expand all macros and all of Scheme
67915ab0
AW
161to its most primitive expressions. The definition of ``primitive
162expression'' is given by the inventory of constructs provided by
163Tree-IL, the target language of the Scheme compiler: procedure calls,
164conditionals, lexical references, and so on. This is described more
165fully in the next section.
81fd3152
AW
166
167The tricky and amusing thing about the Scheme-to-Tree-IL compiler is
67915ab0 168that it is completely implemented by the macro expander. Since the
81fd3152
AW
169macro expander has to run over all of the source code already in order
170to expand macros, it might as well do the analysis at the same time,
171producing Tree-IL expressions directly.
172
67915ab0
AW
173Because this compiler is actually the macro expander, it is extensible.
174Any macro which the user writes becomes part of the compiler.
81fd3152
AW
175
176The Scheme-to-Tree-IL expander may be invoked using the generic
177@code{compile} procedure:
178
179@lisp
180(compile '(+ 1 2) #:from 'scheme #:to 'tree-il)
181@result{}
67915ab0 182#<tree-il (call (toplevel +) (const 1) (const 2))>
81fd3152
AW
183@end lisp
184
67915ab0
AW
185@code{(compile @var{foo} #:from 'scheme #:to 'tree-il)} is entirely
186equivalent to calling the macro expander as @code{(macroexpand @var{foo}
187'c '(compile load eval))}. @xref{Macro Expansion}.
188@code{compile-tree-il}, the procedure dispatched by @code{compile} to
189@code{'tree-il}, is a small wrapper around @code{macroexpand}, to make
190it conform to the general form of compiler procedures in Guile's
191language tower.
81fd3152 192
98850fd7
AW
193Compiler procedures take three arguments: an expression, an
194environment, and a keyword list of options. They return three values:
195the compiled expression, the corresponding environment for the target
196language, and a ``continuation environment''. The compiled expression
197and environment will serve as input to the next language's compiler.
198The ``continuation environment'' can be used to compile another
199expression from the same source language within the same module.
81fd3152
AW
200
201For example, you might compile the expression, @code{(define-module
202(foo))}. This will result in a Tree-IL expression and environment. But
203if you compiled a second expression, you would want to take into
204account the compile-time effect of compiling the previous expression,
205which puts the user in the @code{(foo)} module. That is purpose of the
206``continuation environment''; you would pass it as the environment
207when compiling the subsequent expression.
208
41e64dd7
AW
209For Scheme, an environment is a module. By default, the @code{compile}
210and @code{compile-file} procedures compile in a fresh module, such
211that bindings and macros introduced by the expression being compiled
212are isolated:
1ebe6a63
LC
213
214@example
215(eq? (current-module) (compile '(current-module)))
216@result{} #f
217
218(compile '(define hello 'world))
219(defined? 'hello)
220@result{} #f
221
222(define / *)
223(eq? (compile '/) /)
224@result{} #f
225@end example
226
227Similarly, changes to the @code{current-reader} fluid (@pxref{Loading,
228@code{current-reader}}) are isolated:
229
230@example
231(compile '(fluid-set! current-reader (lambda args 'fail)))
232(fluid-ref current-reader)
233@result{} #f
234@end example
235
236Nevertheless, having the compiler and @dfn{compilee} share the same name
237space can be achieved by explicitly passing @code{(current-module)} as
238the compilation environment:
239
240@example
241(define hello 'world)
242(compile 'hello #:env (current-module))
243@result{} world
244@end example
245
81fd3152
AW
246@node Tree-IL
247@subsection Tree-IL
00ce5125 248
81fd3152 249Tree Intermediate Language (Tree-IL) is a structured intermediate
c850030f
AW
250language that is close in expressive power to Scheme. It is an
251expanded, pre-analyzed Scheme.
252
81fd3152
AW
253Tree-IL is ``structured'' in the sense that its representation is
254based on records, not S-expressions. This gives a rigidity to the
255language that ensures that compiling to a lower-level language only
41e64dd7
AW
256requires a limited set of transformations. For example, the Tree-IL
257type @code{<const>} is a record type with two fields, @code{src} and
258@code{exp}. Instances of this type are created via @code{make-const}.
259Fields of this type are accessed via the @code{const-src} and
260@code{const-exp} procedures. There is also a predicate, @code{const?}.
261@xref{Records}, for more information on records.
81fd3152
AW
262
263@c alpha renaming
264
265All Tree-IL types have a @code{src} slot, which holds source location
266information for the expression. This information, if present, will be
267residualized into the compiled object code, allowing backtraces to
268show source information. The format of @code{src} is the same as that
269returned by Guile's @code{source-properties} function. @xref{Source
270Properties}, for more information.
271
272Although Tree-IL objects are represented internally using records,
273there is also an equivalent S-expression external representation for
ecb87335 274each kind of Tree-IL. For example, the S-expression representation
81fd3152 275of @code{#<const src: #f exp: 3>} expression would be:
c850030f
AW
276
277@example
81fd3152 278(const 3)
c850030f
AW
279@end example
280
81fd3152 281Users may program with this format directly at the REPL:
c850030f
AW
282
283@example
81fd3152 284scheme@@(guile-user)> ,language tree-il
41e64dd7 285Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'.
67915ab0 286tree-il@@(guile-user)> (call (primitive +) (const 32) (const 10))
c850030f
AW
287@result{} 42
288@end example
289
81fd3152
AW
290The @code{src} fields are left out of the external representation.
291
98850fd7
AW
292One may create Tree-IL objects from their external representations via
293calling @code{parse-tree-il}, the reader for Tree-IL. If any source
294information is attached to the input S-expression, it will be
295propagated to the resulting Tree-IL expressions. This is probably the
296easiest way to compile to Tree-IL: just make the appropriate external
297representations in S-expression format, and let @code{parse-tree-il}
298take care of the rest.
299
81fd3152
AW
300@deftp {Scheme Variable} <void> src
301@deftpx {External Representation} (void)
67915ab0 302An empty expression. In practice, equivalent to Scheme's @code{(if #f
81fd3152
AW
303#f)}.
304@end deftp
d297e544 305
81fd3152
AW
306@deftp {Scheme Variable} <const> src exp
307@deftpx {External Representation} (const @var{exp})
308A constant.
309@end deftp
d297e544 310
81fd3152
AW
311@deftp {Scheme Variable} <primitive-ref> src name
312@deftpx {External Representation} (primitive @var{name})
67915ab0
AW
313A reference to a ``primitive''. A primitive is a procedure that, when
314compiled, may be open-coded. For example, @code{cons} is usually
81fd3152
AW
315recognized as a primitive, so that it compiles down to a single
316instruction.
317
318Compilation of Tree-IL usually begins with a pass that resolves some
319@code{<module-ref>} and @code{<toplevel-ref>} expressions to
67915ab0 320@code{<primitive-ref>} expressions. The actual compilation pass has
7081d4f9
AW
321special cases for calls to certain primitives, like @code{apply} or
322@code{cons}.
81fd3152 323@end deftp
d297e544 324
81fd3152
AW
325@deftp {Scheme Variable} <lexical-ref> src name gensym
326@deftpx {External Representation} (lexical @var{name} @var{gensym})
67915ab0 327A reference to a lexically-bound variable. The @var{name} is the
81fd3152
AW
328original name of the variable in the source program. @var{gensym} is a
329unique identifier for this variable.
330@end deftp
d297e544 331
81fd3152
AW
332@deftp {Scheme Variable} <lexical-set> src name gensym exp
333@deftpx {External Representation} (set! (lexical @var{name} @var{gensym}) @var{exp})
334Sets a lexically-bound variable.
335@end deftp
d297e544 336
81fd3152
AW
337@deftp {Scheme Variable} <module-ref> src mod name public?
338@deftpx {External Representation} (@@ @var{mod} @var{name})
339@deftpx {External Representation} (@@@@ @var{mod} @var{name})
340A reference to a variable in a specific module. @var{mod} should be
679cceed 341the name of the module, e.g.@: @code{(guile-user)}.
81fd3152
AW
342
343If @var{public?} is true, the variable named @var{name} will be looked
344up in @var{mod}'s public interface, and serialized with @code{@@};
345otherwise it will be looked up among the module's private bindings,
346and is serialized with @code{@@@@}.
347@end deftp
d297e544 348
81fd3152
AW
349@deftp {Scheme Variable} <module-set> src mod name public? exp
350@deftpx {External Representation} (set! (@@ @var{mod} @var{name}) @var{exp})
351@deftpx {External Representation} (set! (@@@@ @var{mod} @var{name}) @var{exp})
352Sets a variable in a specific module.
353@end deftp
d297e544 354
81fd3152
AW
355@deftp {Scheme Variable} <toplevel-ref> src name
356@deftpx {External Representation} (toplevel @var{name})
357References a variable from the current procedure's module.
358@end deftp
d297e544 359
81fd3152
AW
360@deftp {Scheme Variable} <toplevel-set> src name exp
361@deftpx {External Representation} (set! (toplevel @var{name}) @var{exp})
362Sets a variable in the current procedure's module.
363@end deftp
d297e544 364
81fd3152
AW
365@deftp {Scheme Variable} <toplevel-define> src name exp
366@deftpx {External Representation} (define (toplevel @var{name}) @var{exp})
367Defines a new top-level variable in the current procedure's module.
368@end deftp
d297e544 369
81fd3152
AW
370@deftp {Scheme Variable} <conditional> src test then else
371@deftpx {External Representation} (if @var{test} @var{then} @var{else})
ca445ba5 372A conditional. Note that @var{else} is not optional.
c850030f 373@end deftp
d297e544 374
7081d4f9
AW
375@deftp {Scheme Variable} <call> src proc args
376@deftpx {External Representation} (call @var{proc} . @var{args})
ca445ba5 377A procedure call.
c850030f 378@end deftp
d297e544 379
a881a4ae
AW
380@deftp {Scheme Variable} <primcall> src name args
381@deftpx {External Representation} (primcall @var{name} . @var{args})
382A call to a primitive. Equivalent to @code{(call (primitive @var{name})
383. @var{args})}. This construct is often more convenient to generate and
384analyze than @code{<call>}.
385
386As part of the compilation process, instances of @code{(call (primitive
387@var{name}) . @var{args})} are transformed into primcalls.
388@end deftp
d297e544 389
67915ab0
AW
390@deftp {Scheme Variable} <seq> src head tail
391@deftpx {External Representation} (seq @var{head} @var{tail})
392A sequence. The semantics is that @var{head} is evaluated first, and
393any resulting values are ignored. Then @var{tail} is evaluated, in tail
394position.
c850030f 395@end deftp
d297e544 396
41e64dd7
AW
397@deftp {Scheme Variable} <lambda> src meta body
398@deftpx {External Representation} (lambda @var{meta} @var{body})
67915ab0
AW
399A closure. @var{meta} is an association list of properties for the
400procedure. @var{body} is a single Tree-IL expression of type
401@code{<lambda-case>}. As the @code{<lambda-case>} clause can chain to
41e64dd7
AW
402an alternate clause, this makes Tree-IL's @code{<lambda>} have the
403expressiveness of Scheme's @code{case-lambda}.
404@end deftp
d297e544 405
41e64dd7
AW
406@deftp {Scheme Variable} <lambda-case> req opt rest kw inits gensyms body alternate
407@deftpx {External Representation} @
408 (lambda-case ((@var{req} @var{opt} @var{rest} @var{kw} @var{inits} @var{gensyms})@
409 @var{body})@
410 [@var{alternate}])
67915ab0 411One clause of a @code{case-lambda}. A @code{lambda} expression in
41e64dd7
AW
412Scheme is treated as a @code{case-lambda} with one clause.
413
414@var{req} is a list of the procedure's required arguments, as symbols.
415@var{opt} is a list of the optional arguments, or @code{#f} if there
416are no optional arguments. @var{rest} is the name of the rest
417argument, or @code{#f}.
418
419@var{kw} is a list of the form, @code{(@var{allow-other-keys?}
420(@var{keyword} @var{name} @var{var}) ...)}, where @var{keyword} is the
421keyword corresponding to the argument named @var{name}, and whose
67915ab0
AW
422corresponding gensym is @var{var}. @var{inits} are tree-il expressions
423corresponding to all of the optional and keyword arguments, evaluated to
424bind variables whose value is not supplied by the procedure caller.
41e64dd7
AW
425Each @var{init} expression is evaluated in the lexical context of
426previously bound variables, from left to right.
427
428@var{gensyms} is a list of gensyms corresponding to all arguments:
429first all of the required arguments, then the optional arguments if
430any, then the rest argument if any, then all of the keyword arguments.
431
67915ab0 432@var{body} is the body of the clause. If the procedure is called with
41e64dd7 433an appropriate number of arguments, @var{body} is evaluated in tail
67915ab0 434position. Otherwise, if there is an @var{alternate}, it should be a
41e64dd7 435@code{<lambda-case>} expression, representing the next clause to try.
64de6db5 436If there is no @var{alternate}, a wrong-number-of-arguments error is
41e64dd7
AW
437signaled.
438@end deftp
d297e544 439
41e64dd7
AW
440@deftp {Scheme Variable} <let> src names gensyms vals exp
441@deftpx {External Representation} (let @var{names} @var{gensyms} @var{vals} @var{exp})
67915ab0
AW
442Lexical binding, like Scheme's @code{let}. @var{names} are the original
443binding names, @var{gensyms} are gensyms corresponding to the
81fd3152
AW
444@var{names}, and @var{vals} are Tree-IL expressions for the values.
445@var{exp} is a single Tree-IL expression.
446@end deftp
d297e544 447
935c7aca 448@deftp {Scheme Variable} <letrec> in-order? src names gensyms vals exp
172988ee
AW
449@deftpx {External Representation} (letrec @var{names} @var{gensyms} @var{vals} @var{exp})
450@deftpx {External Representation} (letrec* @var{names} @var{gensyms} @var{vals} @var{exp})
81fd3152 451A version of @code{<let>} that creates recursive bindings, like
935c7aca 452Scheme's @code{letrec}, or @code{letrec*} if @var{in-order?} is true.
81fd3152 453@end deftp
d297e544 454
67915ab0
AW
455@deftp {Scheme Variable} <prompt> escape-only? tag body handler
456@deftpx {External Representation} (prompt @var{escape-only?} @var{tag} @var{body} @var{handler})
457A dynamic prompt. Instates a prompt named @var{tag}, an expression,
41e64dd7 458during the dynamic extent of the execution of @var{body}, also an
67915ab0
AW
459expression. If an abort occurs to this prompt, control will be passed
460to @var{handler}, also an expression, which should be a procedure. The
461first argument to the handler procedure will be the captured
462continuation, followed by all of the values passed to the abort. If
463@var{escape-only?} is true, the handler should be a @code{<lambda>} with
464a single @code{<lambda-case>} body expression with no optional or
465keyword arguments, and no alternate, and whose first argument is
466unreferenced. @xref{Prompts}, for more information.
41e64dd7 467@end deftp
d297e544 468
41e64dd7
AW
469@deftp {Scheme Variable} <abort> tag args tail
470@deftpx {External Representation} (abort @var{tag} @var{args} @var{tail})
471An abort to the nearest prompt with the name @var{tag}, an expression.
472@var{args} should be a list of expressions to pass to the prompt's
473handler, and @var{tail} should be an expression that will evaluate to
67915ab0 474a list of additional arguments. An abort will save the partial
41e64dd7
AW
475continuation, which may later be reinstated, resulting in the
476@code{<abort>} expression evaluating to some number of values.
477@end deftp
81fd3152 478
98850fd7
AW
479There are two Tree-IL constructs that are not normally produced by
480higher-level compilers, but instead are generated during the
481source-to-source optimization and analysis passes that the Tree-IL
67915ab0
AW
482compiler does. Users should not generate these expressions directly,
483unless they feel very clever, as the default analysis pass will generate
484them as necessary.
98850fd7 485
41e64dd7
AW
486@deftp {Scheme Variable} <let-values> src names gensyms exp body
487@deftpx {External Representation} (let-values @var{names} @var{gensyms} @var{exp} @var{body})
98850fd7
AW
488Like Scheme's @code{receive} -- binds the values returned by
489evaluating @code{exp} to the @code{lambda}-like bindings described by
67915ab0 490@var{gensyms}. That is to say, @var{gensyms} may be an improper list.
98850fd7 491
7081d4f9 492@code{<let-values>} is an optimization of a @code{<call>} to the
98850fd7
AW
493primitive, @code{call-with-values}.
494@end deftp
d297e544 495
41e64dd7
AW
496@deftp {Scheme Variable} <fix> src names gensyms vals body
497@deftpx {External Representation} (fix @var{names} @var{gensyms} @var{vals} @var{body})
98850fd7
AW
498Like @code{<letrec>}, but only for @var{vals} that are unset
499@code{lambda} expressions.
500
501@code{fix} is an optimization of @code{letrec} (and @code{let}).
502@end deftp
81fd3152 503
67915ab0
AW
504Tree-IL is a convenient compilation target from source languages. It
505can be convenient as a medium for optimization, though CPS is usually
506better. The strength of Tree-IL is that it does not fix order of
507evaluation, so it makes some code motion a bit easier.
00ce5125 508
67915ab0 509Optimization passes performed on Tree-IL currently include:
00ce5125 510
73643339 511@itemize
67915ab0
AW
512@item Open-coding (turning toplevel-refs into primitive-refs,
513and calls to primitives to primcalls)
514@item Partial evaluation (comprising inlining, copy propagation, and
515constant folding)
516@item Common subexpression elimination (CSE)
73643339
AW
517@end itemize
518
67915ab0
AW
519In the future, we will move the CSE pass to operate over the lower-level
520CPS language.
73643339 521
67915ab0
AW
522@node Continuation-Passing Style
523@subsection Continuation-Passing Style
73643339 524
67915ab0
AW
525@cindex CPS
526Continuation-passing style (CPS) is ...
73643339 527
67915ab0
AW
528@node Bytecode
529@subsection Bytecode
73643339 530
67915ab0 531Blah blah ...
73643339 532
67915ab0 533@xref{Object File Format}
00ce5125 534
67915ab0 535(system vm loader)
e3ba263d 536
b8bc86bc
AW
537@deffn {Scheme Variable} load-thunk-from-file file
538@deffnx {C Function} scm_load_thunk_from_file (file)
ff73ae34
AW
539Load object code from a file named @var{file}. The file will be mapped
540into memory via @code{mmap}, so this is a very fast operation.
e3ba263d 541
b8bc86bc
AW
542On disk, object code is embedded in ELF, a flexible container format
543created for use in UNIX systems. Guile has its own ELF linker and
544loader, so it uses the ELF format on all systems.
73643339
AW
545@end deffn
546
67915ab0 547likewise load-thunk-from-memory
ff73ae34
AW
548
549Compiling object code to the fake language, @code{value}, is performed
550via loading objcode into a program, then executing that thunk with
551respect to the compilation environment. Normally the environment
552propagates through the compiler transparently, but users may specify
41e64dd7 553the compilation environment manually as well, as a module.
ff73ae34 554
c850030f 555
e63d888e
DK
556@node Writing New High-Level Languages
557@subsection Writing New High-Level Languages
558
559In order to integrate a new language @var{lang} into Guile's compiler
560system, one has to create the module @code{(language @var{lang} spec)}
561containing the language definition and referencing the parser,
562compiler and other routines processing it. The module hierarchy in
563@code{(language brainfuck)} defines a very basic Brainfuck
564implementation meant to serve as easy-to-understand example on how to
4e432dab
AW
565do this. See for instance @url{http://en.wikipedia.org/wiki/Brainfuck}
566for more information about the Brainfuck language itself.
567
e63d888e 568
ff73ae34
AW
569@node Extending the Compiler
570@subsection Extending the Compiler
e3ba263d 571
dd6e37d0
NJ
572At this point we take a detour from the impersonal tone of the rest of
573the manual. Admit it: if you've read this far into the compiler
574internals manual, you are a junkie. Perhaps a course at your university
575left you unsated, or perhaps you've always harbored a desire to hack the
576holy of computer science holies: a compiler. Well you're in good
577company, and in a good position. Guile's compiler needs your help.
ff73ae34
AW
578
579There are many possible avenues for improving Guile's compiler.
580Probably the most important improvement, speed-wise, will be some form
581of native compilation, both just-in-time and ahead-of-time. This could
582be done in many ways. Probably the easiest strategy would be to extend
583the compiled procedure structure to include a pointer to a native code
86872cc3 584vector, and compile from bytecode to native code at run-time after a
ff73ae34
AW
585procedure is called a certain number of times.
586
587The name of the game is a profiling-based harvest of the low-hanging
588fruit, running programs of interest under a system-level profiler and
589determining which improvements would give the most bang for the buck.
98850fd7
AW
590It's really getting to the point though that native compilation is the
591next step.
ff73ae34
AW
592
593The compiler also needs help at the top end, enhancing the Scheme that
98850fd7
AW
594it knows to also understand R6RS, and adding new high-level compilers.
595We have JavaScript and Emacs Lisp mostly complete, but they could use
ecb87335 596some love; Lua would be nice as well, but whatever language it is
98850fd7
AW
597that strikes your fancy would be welcome too.
598
599Compilers are for hacking, not for admiring or for complaining about.
600Get to it!