Commit | Line | Data |
---|---|---|
8680d53b AW |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
747bd534 | 3 | @c Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013 |
8680d53b AW |
4 | @c Free Software Foundation, Inc. |
5 | @c See the file guile.texi for copying conditions. | |
6 | ||
7 | @node Compiling to the Virtual Machine | |
8 | @section Compiling to the Virtual Machine | |
9 | ||
67915ab0 AW |
10 | Compilers! The word itself inspires excitement and awe, even among |
11 | experienced practitioners. But a compiler is just a program: an | |
12 | eminently hackable thing. This section aims to to describe Guile's | |
13 | compiler in such a way that interested Scheme hackers can feel | |
14 | comfortable reading and extending it. | |
00ce5125 | 15 | |
e3ba263d | 16 | @xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to |
98850fd7 | 17 | know how to compile your @code{.scm} file. |
00ce5125 AW |
18 | |
19 | @menu | |
20 | * Compiler Tower:: | |
21 | * The Scheme Compiler:: | |
81fd3152 | 22 | * Tree-IL:: |
67915ab0 AW |
23 | * Continuation-Passing Style:: |
24 | * Bytecode:: | |
e63d888e | 25 | * Writing New High-Level Languages:: |
e3ba263d | 26 | * Extending the Compiler:: |
00ce5125 AW |
27 | @end menu |
28 | ||
29 | @node Compiler Tower | |
30 | @subsection Compiler Tower | |
31 | ||
d297e544 AW |
32 | Guile's compiler is quite simple -- its @emph{compilers}, to put it more |
33 | accurately. Guile defines a tower of languages, starting at Scheme and | |
34 | progressively simplifying down to languages that resemble the VM | |
35 | instruction set (@pxref{Instruction Set}). | |
00ce5125 AW |
36 | |
37 | Each language knows how to compile to the next, so each step is simple | |
67915ab0 AW |
38 | and understandable. Furthermore, this set of languages is not hardcoded |
39 | into Guile, so it is possible for the user to add new high-level | |
40 | languages, new passes, or even different compilation targets. | |
00ce5125 | 41 | |
e3ba263d AW |
42 | Languages are registered in the module, @code{(system base language)}: |
43 | ||
44 | @example | |
45 | (use-modules (system base language)) | |
46 | @end example | |
47 | ||
48 | They are registered with the @code{define-language} form. | |
49 | ||
50 | @deffn {Scheme Syntax} define-language @ | |
994d87be BT |
51 | [#:name] [#:title] [#:reader] [#:printer] @ |
52 | [#:parser=#f] [#:compilers='()] @ | |
53 | [#:decompilers='()] [#:evaluator=#f] @ | |
54 | [#:joiner=#f] [#:for-humans?=#t] @ | |
55 | [#:make-default-environment=make-fresh-user-module] | |
e3ba263d AW |
56 | Define a language. |
57 | ||
67915ab0 AW |
58 | This syntax defines a @code{<language>} object, bound to @var{name} in |
59 | the current environment. In addition, the language will be added to the | |
60 | global language set. For example, this is the language definition for | |
61 | Scheme: | |
e3ba263d AW |
62 | |
63 | @example | |
64 | (define-language scheme | |
41e64dd7 AW |
65 | #:title "Scheme" |
66 | #:reader (lambda (port env) ...) | |
98850fd7 | 67 | #:compilers `((tree-il . ,compile-tree-il)) |
81fd3152 | 68 | #:decompilers `((tree-il . ,decompile-tree-il)) |
41e64dd7 AW |
69 | #:evaluator (lambda (x module) (primitive-eval x)) |
70 | #:printer write | |
71 | #:make-default-environment (lambda () ...)) | |
e3ba263d | 72 | @end example |
e3ba263d AW |
73 | @end deffn |
74 | ||
75 | The interesting thing about having languages defined this way is that | |
67915ab0 | 76 | they present a uniform interface to the read-eval-print loop. This |
e3ba263d AW |
77 | allows the user to change the current language of the REPL: |
78 | ||
79 | @example | |
81fd3152 | 80 | scheme@@(guile-user)> ,language tree-il |
41e64dd7 AW |
81 | Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'. |
82 | tree-il@@(guile-user)> ,L scheme | |
83 | Happy hacking with Scheme! To switch back, type `,L tree-il'. | |
84 | scheme@@(guile-user)> | |
e3ba263d AW |
85 | @end example |
86 | ||
87 | Languages can be looked up by name, as they were above. | |
88 | ||
89 | @deffn {Scheme Procedure} lookup-language name | |
90 | Looks up a language named @var{name}, autoloading it if necessary. | |
91 | ||
92 | Languages are autoloaded by looking for a variable named @var{name} in | |
93 | a module named @code{(language @var{name} spec)}. | |
94 | ||
95 | The language object will be returned, or @code{#f} if there does not | |
96 | exist a language with that name. | |
97 | @end deffn | |
98 | ||
99 | Defining languages this way allows us to programmatically determine | |
100 | the necessary steps for compiling code from one language to another. | |
101 | ||
102 | @deffn {Scheme Procedure} lookup-compilation-order from to | |
103 | Recursively traverses the set of languages to which @var{from} can | |
104 | compile, depth-first, and return the first path that can transform | |
105 | @var{from} to @var{to}. Returns @code{#f} if no path is found. | |
106 | ||
107 | This function memoizes its results in a cache that is invalidated by | |
108 | subsequent calls to @code{define-language}, so it should be quite | |
109 | fast. | |
110 | @end deffn | |
111 | ||
5745de91 AW |
112 | There is a notion of a ``current language'', which is maintained in the |
113 | @code{current-language} parameter, defined in the core @code{(guile)} | |
67915ab0 AW |
114 | module. This language is normally Scheme, and may be rebound by the |
115 | user. The run-time compilation interfaces | |
e3ba263d AW |
116 | (@pxref{Read/Load/Eval/Compile}) also allow you to choose other source |
117 | and target languages. | |
118 | ||
119 | The normal tower of languages when compiling Scheme goes like this: | |
120 | ||
121 | @itemize | |
41e64dd7 | 122 | @item Scheme |
81fd3152 | 123 | @item Tree Intermediate Language (Tree-IL) |
67915ab0 | 124 | @item Continuation-Passing Style (CPS) |
81fd3152 | 125 | @item Bytecode |
e3ba263d AW |
126 | @end itemize |
127 | ||
67915ab0 AW |
128 | As discussed before (@pxref{Object File Format}), bytecode is in ELF |
129 | format, ready to be serialized to disk. But when compiling Scheme at | |
130 | run time, you want a Scheme value: for example, a compiled procedure. | |
131 | For this reason, so as not to break the abstraction, Guile defines a | |
132 | fake language at the bottom of the tower: | |
81fd3152 AW |
133 | |
134 | @itemize | |
135 | @item Value | |
136 | @end itemize | |
137 | ||
d297e544 AW |
138 | Compiling to @code{value} loads the bytecode into a procedure, turning |
139 | cold bytes into warm code. | |
e3ba263d AW |
140 | |
141 | Perhaps this strangeness can be explained by example: | |
d297e544 | 142 | @code{compile-file} defaults to compiling to bytecode, because it |
e3ba263d | 143 | produces object code that has to live in the barren world outside the |
67915ab0 AW |
144 | Guile runtime; but @code{compile} defaults to compiling to @code{value}, |
145 | as its product re-enters the Guile world. | |
146 | ||
147 | @c FIXME: This doesn't work anymore :( Should we add some kind of | |
148 | @c special GC pass, or disclaim this kind of code, or what? | |
e3ba263d AW |
149 | |
150 | Indeed, the process of compilation can circulate through these | |
151 | different worlds indefinitely, as shown by the following quine: | |
152 | ||
153 | @example | |
00ce5125 | 154 | ((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x))) |
e3ba263d | 155 | @end example |
00ce5125 AW |
156 | |
157 | @node The Scheme Compiler | |
158 | @subsection The Scheme Compiler | |
159 | ||
7081d4f9 | 160 | The job of the Scheme compiler is to expand all macros and all of Scheme |
67915ab0 AW |
161 | to its most primitive expressions. The definition of ``primitive |
162 | expression'' is given by the inventory of constructs provided by | |
163 | Tree-IL, the target language of the Scheme compiler: procedure calls, | |
164 | conditionals, lexical references, and so on. This is described more | |
165 | fully in the next section. | |
81fd3152 AW |
166 | |
167 | The tricky and amusing thing about the Scheme-to-Tree-IL compiler is | |
67915ab0 | 168 | that it is completely implemented by the macro expander. Since the |
81fd3152 AW |
169 | macro expander has to run over all of the source code already in order |
170 | to expand macros, it might as well do the analysis at the same time, | |
171 | producing Tree-IL expressions directly. | |
172 | ||
67915ab0 AW |
173 | Because this compiler is actually the macro expander, it is extensible. |
174 | Any macro which the user writes becomes part of the compiler. | |
81fd3152 AW |
175 | |
176 | The Scheme-to-Tree-IL expander may be invoked using the generic | |
177 | @code{compile} procedure: | |
178 | ||
179 | @lisp | |
180 | (compile '(+ 1 2) #:from 'scheme #:to 'tree-il) | |
181 | @result{} | |
67915ab0 | 182 | #<tree-il (call (toplevel +) (const 1) (const 2))> |
81fd3152 AW |
183 | @end lisp |
184 | ||
67915ab0 AW |
185 | @code{(compile @var{foo} #:from 'scheme #:to 'tree-il)} is entirely |
186 | equivalent to calling the macro expander as @code{(macroexpand @var{foo} | |
187 | 'c '(compile load eval))}. @xref{Macro Expansion}. | |
188 | @code{compile-tree-il}, the procedure dispatched by @code{compile} to | |
189 | @code{'tree-il}, is a small wrapper around @code{macroexpand}, to make | |
190 | it conform to the general form of compiler procedures in Guile's | |
191 | language tower. | |
81fd3152 | 192 | |
98850fd7 AW |
193 | Compiler procedures take three arguments: an expression, an |
194 | environment, and a keyword list of options. They return three values: | |
195 | the compiled expression, the corresponding environment for the target | |
196 | language, and a ``continuation environment''. The compiled expression | |
197 | and environment will serve as input to the next language's compiler. | |
198 | The ``continuation environment'' can be used to compile another | |
199 | expression from the same source language within the same module. | |
81fd3152 AW |
200 | |
201 | For example, you might compile the expression, @code{(define-module | |
202 | (foo))}. This will result in a Tree-IL expression and environment. But | |
203 | if you compiled a second expression, you would want to take into | |
204 | account the compile-time effect of compiling the previous expression, | |
205 | which puts the user in the @code{(foo)} module. That is purpose of the | |
206 | ``continuation environment''; you would pass it as the environment | |
207 | when compiling the subsequent expression. | |
208 | ||
41e64dd7 AW |
209 | For Scheme, an environment is a module. By default, the @code{compile} |
210 | and @code{compile-file} procedures compile in a fresh module, such | |
211 | that bindings and macros introduced by the expression being compiled | |
212 | are isolated: | |
1ebe6a63 LC |
213 | |
214 | @example | |
215 | (eq? (current-module) (compile '(current-module))) | |
216 | @result{} #f | |
217 | ||
218 | (compile '(define hello 'world)) | |
219 | (defined? 'hello) | |
220 | @result{} #f | |
221 | ||
222 | (define / *) | |
223 | (eq? (compile '/) /) | |
224 | @result{} #f | |
225 | @end example | |
226 | ||
227 | Similarly, changes to the @code{current-reader} fluid (@pxref{Loading, | |
228 | @code{current-reader}}) are isolated: | |
229 | ||
230 | @example | |
231 | (compile '(fluid-set! current-reader (lambda args 'fail))) | |
232 | (fluid-ref current-reader) | |
233 | @result{} #f | |
234 | @end example | |
235 | ||
236 | Nevertheless, having the compiler and @dfn{compilee} share the same name | |
237 | space can be achieved by explicitly passing @code{(current-module)} as | |
238 | the compilation environment: | |
239 | ||
240 | @example | |
241 | (define hello 'world) | |
242 | (compile 'hello #:env (current-module)) | |
243 | @result{} world | |
244 | @end example | |
245 | ||
81fd3152 AW |
246 | @node Tree-IL |
247 | @subsection Tree-IL | |
00ce5125 | 248 | |
81fd3152 | 249 | Tree Intermediate Language (Tree-IL) is a structured intermediate |
c850030f AW |
250 | language that is close in expressive power to Scheme. It is an |
251 | expanded, pre-analyzed Scheme. | |
252 | ||
81fd3152 AW |
253 | Tree-IL is ``structured'' in the sense that its representation is |
254 | based on records, not S-expressions. This gives a rigidity to the | |
255 | language that ensures that compiling to a lower-level language only | |
41e64dd7 AW |
256 | requires a limited set of transformations. For example, the Tree-IL |
257 | type @code{<const>} is a record type with two fields, @code{src} and | |
258 | @code{exp}. Instances of this type are created via @code{make-const}. | |
259 | Fields of this type are accessed via the @code{const-src} and | |
260 | @code{const-exp} procedures. There is also a predicate, @code{const?}. | |
261 | @xref{Records}, for more information on records. | |
81fd3152 AW |
262 | |
263 | @c alpha renaming | |
264 | ||
265 | All Tree-IL types have a @code{src} slot, which holds source location | |
266 | information for the expression. This information, if present, will be | |
267 | residualized into the compiled object code, allowing backtraces to | |
268 | show source information. The format of @code{src} is the same as that | |
269 | returned by Guile's @code{source-properties} function. @xref{Source | |
270 | Properties}, for more information. | |
271 | ||
272 | Although Tree-IL objects are represented internally using records, | |
273 | there is also an equivalent S-expression external representation for | |
ecb87335 | 274 | each kind of Tree-IL. For example, the S-expression representation |
81fd3152 | 275 | of @code{#<const src: #f exp: 3>} expression would be: |
c850030f AW |
276 | |
277 | @example | |
81fd3152 | 278 | (const 3) |
c850030f AW |
279 | @end example |
280 | ||
81fd3152 | 281 | Users may program with this format directly at the REPL: |
c850030f AW |
282 | |
283 | @example | |
81fd3152 | 284 | scheme@@(guile-user)> ,language tree-il |
41e64dd7 | 285 | Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'. |
67915ab0 | 286 | tree-il@@(guile-user)> (call (primitive +) (const 32) (const 10)) |
c850030f AW |
287 | @result{} 42 |
288 | @end example | |
289 | ||
81fd3152 AW |
290 | The @code{src} fields are left out of the external representation. |
291 | ||
98850fd7 AW |
292 | One may create Tree-IL objects from their external representations via |
293 | calling @code{parse-tree-il}, the reader for Tree-IL. If any source | |
294 | information is attached to the input S-expression, it will be | |
295 | propagated to the resulting Tree-IL expressions. This is probably the | |
296 | easiest way to compile to Tree-IL: just make the appropriate external | |
297 | representations in S-expression format, and let @code{parse-tree-il} | |
298 | take care of the rest. | |
299 | ||
81fd3152 AW |
300 | @deftp {Scheme Variable} <void> src |
301 | @deftpx {External Representation} (void) | |
67915ab0 | 302 | An empty expression. In practice, equivalent to Scheme's @code{(if #f |
81fd3152 AW |
303 | #f)}. |
304 | @end deftp | |
d297e544 | 305 | |
81fd3152 AW |
306 | @deftp {Scheme Variable} <const> src exp |
307 | @deftpx {External Representation} (const @var{exp}) | |
308 | A constant. | |
309 | @end deftp | |
d297e544 | 310 | |
81fd3152 AW |
311 | @deftp {Scheme Variable} <primitive-ref> src name |
312 | @deftpx {External Representation} (primitive @var{name}) | |
67915ab0 AW |
313 | A reference to a ``primitive''. A primitive is a procedure that, when |
314 | compiled, may be open-coded. For example, @code{cons} is usually | |
81fd3152 AW |
315 | recognized as a primitive, so that it compiles down to a single |
316 | instruction. | |
317 | ||
318 | Compilation of Tree-IL usually begins with a pass that resolves some | |
319 | @code{<module-ref>} and @code{<toplevel-ref>} expressions to | |
67915ab0 | 320 | @code{<primitive-ref>} expressions. The actual compilation pass has |
7081d4f9 AW |
321 | special cases for calls to certain primitives, like @code{apply} or |
322 | @code{cons}. | |
81fd3152 | 323 | @end deftp |
d297e544 | 324 | |
81fd3152 AW |
325 | @deftp {Scheme Variable} <lexical-ref> src name gensym |
326 | @deftpx {External Representation} (lexical @var{name} @var{gensym}) | |
67915ab0 | 327 | A reference to a lexically-bound variable. The @var{name} is the |
81fd3152 AW |
328 | original name of the variable in the source program. @var{gensym} is a |
329 | unique identifier for this variable. | |
330 | @end deftp | |
d297e544 | 331 | |
81fd3152 AW |
332 | @deftp {Scheme Variable} <lexical-set> src name gensym exp |
333 | @deftpx {External Representation} (set! (lexical @var{name} @var{gensym}) @var{exp}) | |
334 | Sets a lexically-bound variable. | |
335 | @end deftp | |
d297e544 | 336 | |
81fd3152 AW |
337 | @deftp {Scheme Variable} <module-ref> src mod name public? |
338 | @deftpx {External Representation} (@@ @var{mod} @var{name}) | |
339 | @deftpx {External Representation} (@@@@ @var{mod} @var{name}) | |
340 | A reference to a variable in a specific module. @var{mod} should be | |
679cceed | 341 | the name of the module, e.g.@: @code{(guile-user)}. |
81fd3152 AW |
342 | |
343 | If @var{public?} is true, the variable named @var{name} will be looked | |
344 | up in @var{mod}'s public interface, and serialized with @code{@@}; | |
345 | otherwise it will be looked up among the module's private bindings, | |
346 | and is serialized with @code{@@@@}. | |
347 | @end deftp | |
d297e544 | 348 | |
81fd3152 AW |
349 | @deftp {Scheme Variable} <module-set> src mod name public? exp |
350 | @deftpx {External Representation} (set! (@@ @var{mod} @var{name}) @var{exp}) | |
351 | @deftpx {External Representation} (set! (@@@@ @var{mod} @var{name}) @var{exp}) | |
352 | Sets a variable in a specific module. | |
353 | @end deftp | |
d297e544 | 354 | |
81fd3152 AW |
355 | @deftp {Scheme Variable} <toplevel-ref> src name |
356 | @deftpx {External Representation} (toplevel @var{name}) | |
357 | References a variable from the current procedure's module. | |
358 | @end deftp | |
d297e544 | 359 | |
81fd3152 AW |
360 | @deftp {Scheme Variable} <toplevel-set> src name exp |
361 | @deftpx {External Representation} (set! (toplevel @var{name}) @var{exp}) | |
362 | Sets a variable in the current procedure's module. | |
363 | @end deftp | |
d297e544 | 364 | |
81fd3152 AW |
365 | @deftp {Scheme Variable} <toplevel-define> src name exp |
366 | @deftpx {External Representation} (define (toplevel @var{name}) @var{exp}) | |
367 | Defines a new top-level variable in the current procedure's module. | |
368 | @end deftp | |
d297e544 | 369 | |
81fd3152 AW |
370 | @deftp {Scheme Variable} <conditional> src test then else |
371 | @deftpx {External Representation} (if @var{test} @var{then} @var{else}) | |
ca445ba5 | 372 | A conditional. Note that @var{else} is not optional. |
c850030f | 373 | @end deftp |
d297e544 | 374 | |
7081d4f9 AW |
375 | @deftp {Scheme Variable} <call> src proc args |
376 | @deftpx {External Representation} (call @var{proc} . @var{args}) | |
ca445ba5 | 377 | A procedure call. |
c850030f | 378 | @end deftp |
d297e544 | 379 | |
a881a4ae AW |
380 | @deftp {Scheme Variable} <primcall> src name args |
381 | @deftpx {External Representation} (primcall @var{name} . @var{args}) | |
382 | A call to a primitive. Equivalent to @code{(call (primitive @var{name}) | |
383 | . @var{args})}. This construct is often more convenient to generate and | |
384 | analyze than @code{<call>}. | |
385 | ||
386 | As part of the compilation process, instances of @code{(call (primitive | |
387 | @var{name}) . @var{args})} are transformed into primcalls. | |
388 | @end deftp | |
d297e544 | 389 | |
67915ab0 AW |
390 | @deftp {Scheme Variable} <seq> src head tail |
391 | @deftpx {External Representation} (seq @var{head} @var{tail}) | |
392 | A sequence. The semantics is that @var{head} is evaluated first, and | |
393 | any resulting values are ignored. Then @var{tail} is evaluated, in tail | |
394 | position. | |
c850030f | 395 | @end deftp |
d297e544 | 396 | |
41e64dd7 AW |
397 | @deftp {Scheme Variable} <lambda> src meta body |
398 | @deftpx {External Representation} (lambda @var{meta} @var{body}) | |
67915ab0 AW |
399 | A closure. @var{meta} is an association list of properties for the |
400 | procedure. @var{body} is a single Tree-IL expression of type | |
401 | @code{<lambda-case>}. As the @code{<lambda-case>} clause can chain to | |
41e64dd7 AW |
402 | an alternate clause, this makes Tree-IL's @code{<lambda>} have the |
403 | expressiveness of Scheme's @code{case-lambda}. | |
404 | @end deftp | |
d297e544 | 405 | |
41e64dd7 AW |
406 | @deftp {Scheme Variable} <lambda-case> req opt rest kw inits gensyms body alternate |
407 | @deftpx {External Representation} @ | |
408 | (lambda-case ((@var{req} @var{opt} @var{rest} @var{kw} @var{inits} @var{gensyms})@ | |
409 | @var{body})@ | |
410 | [@var{alternate}]) | |
67915ab0 | 411 | One clause of a @code{case-lambda}. A @code{lambda} expression in |
41e64dd7 AW |
412 | Scheme is treated as a @code{case-lambda} with one clause. |
413 | ||
414 | @var{req} is a list of the procedure's required arguments, as symbols. | |
415 | @var{opt} is a list of the optional arguments, or @code{#f} if there | |
416 | are no optional arguments. @var{rest} is the name of the rest | |
417 | argument, or @code{#f}. | |
418 | ||
419 | @var{kw} is a list of the form, @code{(@var{allow-other-keys?} | |
420 | (@var{keyword} @var{name} @var{var}) ...)}, where @var{keyword} is the | |
421 | keyword corresponding to the argument named @var{name}, and whose | |
67915ab0 AW |
422 | corresponding gensym is @var{var}. @var{inits} are tree-il expressions |
423 | corresponding to all of the optional and keyword arguments, evaluated to | |
424 | bind variables whose value is not supplied by the procedure caller. | |
41e64dd7 AW |
425 | Each @var{init} expression is evaluated in the lexical context of |
426 | previously bound variables, from left to right. | |
427 | ||
428 | @var{gensyms} is a list of gensyms corresponding to all arguments: | |
429 | first all of the required arguments, then the optional arguments if | |
430 | any, then the rest argument if any, then all of the keyword arguments. | |
431 | ||
67915ab0 | 432 | @var{body} is the body of the clause. If the procedure is called with |
41e64dd7 | 433 | an appropriate number of arguments, @var{body} is evaluated in tail |
67915ab0 | 434 | position. Otherwise, if there is an @var{alternate}, it should be a |
41e64dd7 | 435 | @code{<lambda-case>} expression, representing the next clause to try. |
64de6db5 | 436 | If there is no @var{alternate}, a wrong-number-of-arguments error is |
41e64dd7 AW |
437 | signaled. |
438 | @end deftp | |
d297e544 | 439 | |
41e64dd7 AW |
440 | @deftp {Scheme Variable} <let> src names gensyms vals exp |
441 | @deftpx {External Representation} (let @var{names} @var{gensyms} @var{vals} @var{exp}) | |
67915ab0 AW |
442 | Lexical binding, like Scheme's @code{let}. @var{names} are the original |
443 | binding names, @var{gensyms} are gensyms corresponding to the | |
81fd3152 AW |
444 | @var{names}, and @var{vals} are Tree-IL expressions for the values. |
445 | @var{exp} is a single Tree-IL expression. | |
446 | @end deftp | |
d297e544 | 447 | |
935c7aca | 448 | @deftp {Scheme Variable} <letrec> in-order? src names gensyms vals exp |
172988ee AW |
449 | @deftpx {External Representation} (letrec @var{names} @var{gensyms} @var{vals} @var{exp}) |
450 | @deftpx {External Representation} (letrec* @var{names} @var{gensyms} @var{vals} @var{exp}) | |
81fd3152 | 451 | A version of @code{<let>} that creates recursive bindings, like |
935c7aca | 452 | Scheme's @code{letrec}, or @code{letrec*} if @var{in-order?} is true. |
81fd3152 | 453 | @end deftp |
d297e544 | 454 | |
67915ab0 AW |
455 | @deftp {Scheme Variable} <prompt> escape-only? tag body handler |
456 | @deftpx {External Representation} (prompt @var{escape-only?} @var{tag} @var{body} @var{handler}) | |
457 | A dynamic prompt. Instates a prompt named @var{tag}, an expression, | |
41e64dd7 | 458 | during the dynamic extent of the execution of @var{body}, also an |
67915ab0 AW |
459 | expression. If an abort occurs to this prompt, control will be passed |
460 | to @var{handler}, also an expression, which should be a procedure. The | |
461 | first argument to the handler procedure will be the captured | |
462 | continuation, followed by all of the values passed to the abort. If | |
463 | @var{escape-only?} is true, the handler should be a @code{<lambda>} with | |
464 | a single @code{<lambda-case>} body expression with no optional or | |
465 | keyword arguments, and no alternate, and whose first argument is | |
466 | unreferenced. @xref{Prompts}, for more information. | |
41e64dd7 | 467 | @end deftp |
d297e544 | 468 | |
41e64dd7 AW |
469 | @deftp {Scheme Variable} <abort> tag args tail |
470 | @deftpx {External Representation} (abort @var{tag} @var{args} @var{tail}) | |
471 | An abort to the nearest prompt with the name @var{tag}, an expression. | |
472 | @var{args} should be a list of expressions to pass to the prompt's | |
473 | handler, and @var{tail} should be an expression that will evaluate to | |
67915ab0 | 474 | a list of additional arguments. An abort will save the partial |
41e64dd7 AW |
475 | continuation, which may later be reinstated, resulting in the |
476 | @code{<abort>} expression evaluating to some number of values. | |
477 | @end deftp | |
81fd3152 | 478 | |
98850fd7 AW |
479 | There are two Tree-IL constructs that are not normally produced by |
480 | higher-level compilers, but instead are generated during the | |
481 | source-to-source optimization and analysis passes that the Tree-IL | |
67915ab0 AW |
482 | compiler does. Users should not generate these expressions directly, |
483 | unless they feel very clever, as the default analysis pass will generate | |
484 | them as necessary. | |
98850fd7 | 485 | |
41e64dd7 AW |
486 | @deftp {Scheme Variable} <let-values> src names gensyms exp body |
487 | @deftpx {External Representation} (let-values @var{names} @var{gensyms} @var{exp} @var{body}) | |
98850fd7 AW |
488 | Like Scheme's @code{receive} -- binds the values returned by |
489 | evaluating @code{exp} to the @code{lambda}-like bindings described by | |
67915ab0 | 490 | @var{gensyms}. That is to say, @var{gensyms} may be an improper list. |
98850fd7 | 491 | |
7081d4f9 | 492 | @code{<let-values>} is an optimization of a @code{<call>} to the |
98850fd7 AW |
493 | primitive, @code{call-with-values}. |
494 | @end deftp | |
d297e544 | 495 | |
41e64dd7 AW |
496 | @deftp {Scheme Variable} <fix> src names gensyms vals body |
497 | @deftpx {External Representation} (fix @var{names} @var{gensyms} @var{vals} @var{body}) | |
98850fd7 AW |
498 | Like @code{<letrec>}, but only for @var{vals} that are unset |
499 | @code{lambda} expressions. | |
500 | ||
501 | @code{fix} is an optimization of @code{letrec} (and @code{let}). | |
502 | @end deftp | |
81fd3152 | 503 | |
67915ab0 AW |
504 | Tree-IL is a convenient compilation target from source languages. It |
505 | can be convenient as a medium for optimization, though CPS is usually | |
506 | better. The strength of Tree-IL is that it does not fix order of | |
507 | evaluation, so it makes some code motion a bit easier. | |
00ce5125 | 508 | |
67915ab0 | 509 | Optimization passes performed on Tree-IL currently include: |
00ce5125 | 510 | |
73643339 | 511 | @itemize |
67915ab0 AW |
512 | @item Open-coding (turning toplevel-refs into primitive-refs, |
513 | and calls to primitives to primcalls) | |
514 | @item Partial evaluation (comprising inlining, copy propagation, and | |
515 | constant folding) | |
516 | @item Common subexpression elimination (CSE) | |
73643339 AW |
517 | @end itemize |
518 | ||
67915ab0 AW |
519 | In the future, we will move the CSE pass to operate over the lower-level |
520 | CPS language. | |
73643339 | 521 | |
67915ab0 AW |
522 | @node Continuation-Passing Style |
523 | @subsection Continuation-Passing Style | |
73643339 | 524 | |
67915ab0 AW |
525 | @cindex CPS |
526 | Continuation-passing style (CPS) is ... | |
73643339 | 527 | |
67915ab0 AW |
528 | @node Bytecode |
529 | @subsection Bytecode | |
73643339 | 530 | |
67915ab0 | 531 | Blah blah ... |
73643339 | 532 | |
67915ab0 | 533 | @xref{Object File Format} |
00ce5125 | 534 | |
67915ab0 | 535 | (system vm loader) |
e3ba263d | 536 | |
b8bc86bc AW |
537 | @deffn {Scheme Variable} load-thunk-from-file file |
538 | @deffnx {C Function} scm_load_thunk_from_file (file) | |
ff73ae34 AW |
539 | Load object code from a file named @var{file}. The file will be mapped |
540 | into memory via @code{mmap}, so this is a very fast operation. | |
e3ba263d | 541 | |
b8bc86bc AW |
542 | On disk, object code is embedded in ELF, a flexible container format |
543 | created for use in UNIX systems. Guile has its own ELF linker and | |
544 | loader, so it uses the ELF format on all systems. | |
73643339 AW |
545 | @end deffn |
546 | ||
67915ab0 | 547 | likewise load-thunk-from-memory |
ff73ae34 AW |
548 | |
549 | Compiling object code to the fake language, @code{value}, is performed | |
550 | via loading objcode into a program, then executing that thunk with | |
551 | respect to the compilation environment. Normally the environment | |
552 | propagates through the compiler transparently, but users may specify | |
41e64dd7 | 553 | the compilation environment manually as well, as a module. |
ff73ae34 | 554 | |
c850030f | 555 | |
e63d888e DK |
556 | @node Writing New High-Level Languages |
557 | @subsection Writing New High-Level Languages | |
558 | ||
559 | In order to integrate a new language @var{lang} into Guile's compiler | |
560 | system, one has to create the module @code{(language @var{lang} spec)} | |
561 | containing the language definition and referencing the parser, | |
562 | compiler and other routines processing it. The module hierarchy in | |
563 | @code{(language brainfuck)} defines a very basic Brainfuck | |
564 | implementation meant to serve as easy-to-understand example on how to | |
4e432dab AW |
565 | do this. See for instance @url{http://en.wikipedia.org/wiki/Brainfuck} |
566 | for more information about the Brainfuck language itself. | |
567 | ||
e63d888e | 568 | |
ff73ae34 AW |
569 | @node Extending the Compiler |
570 | @subsection Extending the Compiler | |
e3ba263d | 571 | |
dd6e37d0 NJ |
572 | At this point we take a detour from the impersonal tone of the rest of |
573 | the manual. Admit it: if you've read this far into the compiler | |
574 | internals manual, you are a junkie. Perhaps a course at your university | |
575 | left you unsated, or perhaps you've always harbored a desire to hack the | |
576 | holy of computer science holies: a compiler. Well you're in good | |
577 | company, and in a good position. Guile's compiler needs your help. | |
ff73ae34 AW |
578 | |
579 | There are many possible avenues for improving Guile's compiler. | |
580 | Probably the most important improvement, speed-wise, will be some form | |
581 | of native compilation, both just-in-time and ahead-of-time. This could | |
582 | be done in many ways. Probably the easiest strategy would be to extend | |
583 | the compiled procedure structure to include a pointer to a native code | |
86872cc3 | 584 | vector, and compile from bytecode to native code at run-time after a |
ff73ae34 AW |
585 | procedure is called a certain number of times. |
586 | ||
587 | The name of the game is a profiling-based harvest of the low-hanging | |
588 | fruit, running programs of interest under a system-level profiler and | |
589 | determining which improvements would give the most bang for the buck. | |
98850fd7 AW |
590 | It's really getting to the point though that native compilation is the |
591 | next step. | |
ff73ae34 AW |
592 | |
593 | The compiler also needs help at the top end, enhancing the Scheme that | |
98850fd7 AW |
594 | it knows to also understand R6RS, and adding new high-level compilers. |
595 | We have JavaScript and Emacs Lisp mostly complete, but they could use | |
ecb87335 | 596 | some love; Lua would be nice as well, but whatever language it is |
98850fd7 AW |
597 | that strikes your fancy would be welcome too. |
598 | ||
599 | Compilers are for hacking, not for admiring or for complaining about. | |
600 | Get to it! |