Commit | Line | Data |
---|---|---|
8680d53b AW |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
bdad1340 | 3 | @c Copyright (C) 2008, 2009, 2010, 2011, 2012, 2013, 2014 |
8680d53b AW |
4 | @c Free Software Foundation, Inc. |
5 | @c See the file guile.texi for copying conditions. | |
6 | ||
7 | @node Compiling to the Virtual Machine | |
8 | @section Compiling to the Virtual Machine | |
9 | ||
67915ab0 AW |
10 | Compilers! The word itself inspires excitement and awe, even among |
11 | experienced practitioners. But a compiler is just a program: an | |
12 | eminently hackable thing. This section aims to to describe Guile's | |
13 | compiler in such a way that interested Scheme hackers can feel | |
14 | comfortable reading and extending it. | |
00ce5125 | 15 | |
e3ba263d | 16 | @xref{Read/Load/Eval/Compile}, if you're lost and you just wanted to |
98850fd7 | 17 | know how to compile your @code{.scm} file. |
00ce5125 AW |
18 | |
19 | @menu | |
20 | * Compiler Tower:: | |
21 | * The Scheme Compiler:: | |
81fd3152 | 22 | * Tree-IL:: |
67915ab0 AW |
23 | * Continuation-Passing Style:: |
24 | * Bytecode:: | |
e63d888e | 25 | * Writing New High-Level Languages:: |
e3ba263d | 26 | * Extending the Compiler:: |
00ce5125 AW |
27 | @end menu |
28 | ||
29 | @node Compiler Tower | |
30 | @subsection Compiler Tower | |
31 | ||
d297e544 AW |
32 | Guile's compiler is quite simple -- its @emph{compilers}, to put it more |
33 | accurately. Guile defines a tower of languages, starting at Scheme and | |
34 | progressively simplifying down to languages that resemble the VM | |
35 | instruction set (@pxref{Instruction Set}). | |
00ce5125 AW |
36 | |
37 | Each language knows how to compile to the next, so each step is simple | |
67915ab0 AW |
38 | and understandable. Furthermore, this set of languages is not hardcoded |
39 | into Guile, so it is possible for the user to add new high-level | |
40 | languages, new passes, or even different compilation targets. | |
00ce5125 | 41 | |
e3ba263d AW |
42 | Languages are registered in the module, @code{(system base language)}: |
43 | ||
44 | @example | |
45 | (use-modules (system base language)) | |
46 | @end example | |
47 | ||
48 | They are registered with the @code{define-language} form. | |
49 | ||
50 | @deffn {Scheme Syntax} define-language @ | |
994d87be BT |
51 | [#:name] [#:title] [#:reader] [#:printer] @ |
52 | [#:parser=#f] [#:compilers='()] @ | |
53 | [#:decompilers='()] [#:evaluator=#f] @ | |
54 | [#:joiner=#f] [#:for-humans?=#t] @ | |
55 | [#:make-default-environment=make-fresh-user-module] | |
e3ba263d AW |
56 | Define a language. |
57 | ||
67915ab0 AW |
58 | This syntax defines a @code{<language>} object, bound to @var{name} in |
59 | the current environment. In addition, the language will be added to the | |
60 | global language set. For example, this is the language definition for | |
61 | Scheme: | |
e3ba263d AW |
62 | |
63 | @example | |
64 | (define-language scheme | |
41e64dd7 AW |
65 | #:title "Scheme" |
66 | #:reader (lambda (port env) ...) | |
98850fd7 | 67 | #:compilers `((tree-il . ,compile-tree-il)) |
81fd3152 | 68 | #:decompilers `((tree-il . ,decompile-tree-il)) |
41e64dd7 AW |
69 | #:evaluator (lambda (x module) (primitive-eval x)) |
70 | #:printer write | |
71 | #:make-default-environment (lambda () ...)) | |
e3ba263d | 72 | @end example |
e3ba263d AW |
73 | @end deffn |
74 | ||
75 | The interesting thing about having languages defined this way is that | |
67915ab0 | 76 | they present a uniform interface to the read-eval-print loop. This |
e3ba263d AW |
77 | allows the user to change the current language of the REPL: |
78 | ||
79 | @example | |
81fd3152 | 80 | scheme@@(guile-user)> ,language tree-il |
41e64dd7 AW |
81 | Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'. |
82 | tree-il@@(guile-user)> ,L scheme | |
83 | Happy hacking with Scheme! To switch back, type `,L tree-il'. | |
84 | scheme@@(guile-user)> | |
e3ba263d AW |
85 | @end example |
86 | ||
87 | Languages can be looked up by name, as they were above. | |
88 | ||
89 | @deffn {Scheme Procedure} lookup-language name | |
90 | Looks up a language named @var{name}, autoloading it if necessary. | |
91 | ||
92 | Languages are autoloaded by looking for a variable named @var{name} in | |
93 | a module named @code{(language @var{name} spec)}. | |
94 | ||
95 | The language object will be returned, or @code{#f} if there does not | |
96 | exist a language with that name. | |
97 | @end deffn | |
98 | ||
99 | Defining languages this way allows us to programmatically determine | |
100 | the necessary steps for compiling code from one language to another. | |
101 | ||
102 | @deffn {Scheme Procedure} lookup-compilation-order from to | |
103 | Recursively traverses the set of languages to which @var{from} can | |
104 | compile, depth-first, and return the first path that can transform | |
105 | @var{from} to @var{to}. Returns @code{#f} if no path is found. | |
106 | ||
107 | This function memoizes its results in a cache that is invalidated by | |
108 | subsequent calls to @code{define-language}, so it should be quite | |
109 | fast. | |
110 | @end deffn | |
111 | ||
5745de91 AW |
112 | There is a notion of a ``current language'', which is maintained in the |
113 | @code{current-language} parameter, defined in the core @code{(guile)} | |
67915ab0 AW |
114 | module. This language is normally Scheme, and may be rebound by the |
115 | user. The run-time compilation interfaces | |
e3ba263d AW |
116 | (@pxref{Read/Load/Eval/Compile}) also allow you to choose other source |
117 | and target languages. | |
118 | ||
119 | The normal tower of languages when compiling Scheme goes like this: | |
120 | ||
121 | @itemize | |
41e64dd7 | 122 | @item Scheme |
81fd3152 | 123 | @item Tree Intermediate Language (Tree-IL) |
67915ab0 | 124 | @item Continuation-Passing Style (CPS) |
81fd3152 | 125 | @item Bytecode |
e3ba263d AW |
126 | @end itemize |
127 | ||
67915ab0 AW |
128 | As discussed before (@pxref{Object File Format}), bytecode is in ELF |
129 | format, ready to be serialized to disk. But when compiling Scheme at | |
130 | run time, you want a Scheme value: for example, a compiled procedure. | |
131 | For this reason, so as not to break the abstraction, Guile defines a | |
132 | fake language at the bottom of the tower: | |
81fd3152 AW |
133 | |
134 | @itemize | |
135 | @item Value | |
136 | @end itemize | |
137 | ||
d297e544 AW |
138 | Compiling to @code{value} loads the bytecode into a procedure, turning |
139 | cold bytes into warm code. | |
e3ba263d AW |
140 | |
141 | Perhaps this strangeness can be explained by example: | |
d297e544 | 142 | @code{compile-file} defaults to compiling to bytecode, because it |
e3ba263d | 143 | produces object code that has to live in the barren world outside the |
67915ab0 AW |
144 | Guile runtime; but @code{compile} defaults to compiling to @code{value}, |
145 | as its product re-enters the Guile world. | |
146 | ||
147 | @c FIXME: This doesn't work anymore :( Should we add some kind of | |
148 | @c special GC pass, or disclaim this kind of code, or what? | |
e3ba263d AW |
149 | |
150 | Indeed, the process of compilation can circulate through these | |
151 | different worlds indefinitely, as shown by the following quine: | |
152 | ||
153 | @example | |
00ce5125 | 154 | ((lambda (x) ((compile x) x)) '(lambda (x) ((compile x) x))) |
e3ba263d | 155 | @end example |
00ce5125 AW |
156 | |
157 | @node The Scheme Compiler | |
158 | @subsection The Scheme Compiler | |
159 | ||
7081d4f9 | 160 | The job of the Scheme compiler is to expand all macros and all of Scheme |
67915ab0 AW |
161 | to its most primitive expressions. The definition of ``primitive |
162 | expression'' is given by the inventory of constructs provided by | |
163 | Tree-IL, the target language of the Scheme compiler: procedure calls, | |
164 | conditionals, lexical references, and so on. This is described more | |
165 | fully in the next section. | |
81fd3152 AW |
166 | |
167 | The tricky and amusing thing about the Scheme-to-Tree-IL compiler is | |
67915ab0 | 168 | that it is completely implemented by the macro expander. Since the |
81fd3152 AW |
169 | macro expander has to run over all of the source code already in order |
170 | to expand macros, it might as well do the analysis at the same time, | |
171 | producing Tree-IL expressions directly. | |
172 | ||
67915ab0 AW |
173 | Because this compiler is actually the macro expander, it is extensible. |
174 | Any macro which the user writes becomes part of the compiler. | |
81fd3152 AW |
175 | |
176 | The Scheme-to-Tree-IL expander may be invoked using the generic | |
177 | @code{compile} procedure: | |
178 | ||
179 | @lisp | |
180 | (compile '(+ 1 2) #:from 'scheme #:to 'tree-il) | |
181 | @result{} | |
67915ab0 | 182 | #<tree-il (call (toplevel +) (const 1) (const 2))> |
81fd3152 AW |
183 | @end lisp |
184 | ||
67915ab0 AW |
185 | @code{(compile @var{foo} #:from 'scheme #:to 'tree-il)} is entirely |
186 | equivalent to calling the macro expander as @code{(macroexpand @var{foo} | |
187 | 'c '(compile load eval))}. @xref{Macro Expansion}. | |
188 | @code{compile-tree-il}, the procedure dispatched by @code{compile} to | |
189 | @code{'tree-il}, is a small wrapper around @code{macroexpand}, to make | |
190 | it conform to the general form of compiler procedures in Guile's | |
191 | language tower. | |
81fd3152 | 192 | |
98850fd7 AW |
193 | Compiler procedures take three arguments: an expression, an |
194 | environment, and a keyword list of options. They return three values: | |
195 | the compiled expression, the corresponding environment for the target | |
196 | language, and a ``continuation environment''. The compiled expression | |
197 | and environment will serve as input to the next language's compiler. | |
198 | The ``continuation environment'' can be used to compile another | |
199 | expression from the same source language within the same module. | |
81fd3152 AW |
200 | |
201 | For example, you might compile the expression, @code{(define-module | |
202 | (foo))}. This will result in a Tree-IL expression and environment. But | |
203 | if you compiled a second expression, you would want to take into | |
204 | account the compile-time effect of compiling the previous expression, | |
205 | which puts the user in the @code{(foo)} module. That is purpose of the | |
206 | ``continuation environment''; you would pass it as the environment | |
207 | when compiling the subsequent expression. | |
208 | ||
41e64dd7 AW |
209 | For Scheme, an environment is a module. By default, the @code{compile} |
210 | and @code{compile-file} procedures compile in a fresh module, such | |
211 | that bindings and macros introduced by the expression being compiled | |
212 | are isolated: | |
1ebe6a63 LC |
213 | |
214 | @example | |
215 | (eq? (current-module) (compile '(current-module))) | |
216 | @result{} #f | |
217 | ||
218 | (compile '(define hello 'world)) | |
219 | (defined? 'hello) | |
220 | @result{} #f | |
221 | ||
222 | (define / *) | |
223 | (eq? (compile '/) /) | |
224 | @result{} #f | |
225 | @end example | |
226 | ||
227 | Similarly, changes to the @code{current-reader} fluid (@pxref{Loading, | |
228 | @code{current-reader}}) are isolated: | |
229 | ||
230 | @example | |
231 | (compile '(fluid-set! current-reader (lambda args 'fail))) | |
232 | (fluid-ref current-reader) | |
233 | @result{} #f | |
234 | @end example | |
235 | ||
236 | Nevertheless, having the compiler and @dfn{compilee} share the same name | |
237 | space can be achieved by explicitly passing @code{(current-module)} as | |
238 | the compilation environment: | |
239 | ||
240 | @example | |
241 | (define hello 'world) | |
242 | (compile 'hello #:env (current-module)) | |
243 | @result{} world | |
244 | @end example | |
245 | ||
81fd3152 AW |
246 | @node Tree-IL |
247 | @subsection Tree-IL | |
00ce5125 | 248 | |
81fd3152 | 249 | Tree Intermediate Language (Tree-IL) is a structured intermediate |
c850030f AW |
250 | language that is close in expressive power to Scheme. It is an |
251 | expanded, pre-analyzed Scheme. | |
252 | ||
81fd3152 AW |
253 | Tree-IL is ``structured'' in the sense that its representation is |
254 | based on records, not S-expressions. This gives a rigidity to the | |
255 | language that ensures that compiling to a lower-level language only | |
41e64dd7 AW |
256 | requires a limited set of transformations. For example, the Tree-IL |
257 | type @code{<const>} is a record type with two fields, @code{src} and | |
258 | @code{exp}. Instances of this type are created via @code{make-const}. | |
259 | Fields of this type are accessed via the @code{const-src} and | |
260 | @code{const-exp} procedures. There is also a predicate, @code{const?}. | |
261 | @xref{Records}, for more information on records. | |
81fd3152 AW |
262 | |
263 | @c alpha renaming | |
264 | ||
265 | All Tree-IL types have a @code{src} slot, which holds source location | |
266 | information for the expression. This information, if present, will be | |
267 | residualized into the compiled object code, allowing backtraces to | |
268 | show source information. The format of @code{src} is the same as that | |
269 | returned by Guile's @code{source-properties} function. @xref{Source | |
270 | Properties}, for more information. | |
271 | ||
272 | Although Tree-IL objects are represented internally using records, | |
273 | there is also an equivalent S-expression external representation for | |
ecb87335 | 274 | each kind of Tree-IL. For example, the S-expression representation |
81fd3152 | 275 | of @code{#<const src: #f exp: 3>} expression would be: |
c850030f AW |
276 | |
277 | @example | |
81fd3152 | 278 | (const 3) |
c850030f AW |
279 | @end example |
280 | ||
81fd3152 | 281 | Users may program with this format directly at the REPL: |
c850030f AW |
282 | |
283 | @example | |
81fd3152 | 284 | scheme@@(guile-user)> ,language tree-il |
41e64dd7 | 285 | Happy hacking with Tree Intermediate Language! To switch back, type `,L scheme'. |
67915ab0 | 286 | tree-il@@(guile-user)> (call (primitive +) (const 32) (const 10)) |
c850030f AW |
287 | @result{} 42 |
288 | @end example | |
289 | ||
81fd3152 AW |
290 | The @code{src} fields are left out of the external representation. |
291 | ||
98850fd7 AW |
292 | One may create Tree-IL objects from their external representations via |
293 | calling @code{parse-tree-il}, the reader for Tree-IL. If any source | |
294 | information is attached to the input S-expression, it will be | |
295 | propagated to the resulting Tree-IL expressions. This is probably the | |
296 | easiest way to compile to Tree-IL: just make the appropriate external | |
297 | representations in S-expression format, and let @code{parse-tree-il} | |
298 | take care of the rest. | |
299 | ||
81fd3152 AW |
300 | @deftp {Scheme Variable} <void> src |
301 | @deftpx {External Representation} (void) | |
67915ab0 | 302 | An empty expression. In practice, equivalent to Scheme's @code{(if #f |
81fd3152 AW |
303 | #f)}. |
304 | @end deftp | |
d297e544 | 305 | |
81fd3152 AW |
306 | @deftp {Scheme Variable} <const> src exp |
307 | @deftpx {External Representation} (const @var{exp}) | |
308 | A constant. | |
309 | @end deftp | |
d297e544 | 310 | |
81fd3152 AW |
311 | @deftp {Scheme Variable} <primitive-ref> src name |
312 | @deftpx {External Representation} (primitive @var{name}) | |
67915ab0 AW |
313 | A reference to a ``primitive''. A primitive is a procedure that, when |
314 | compiled, may be open-coded. For example, @code{cons} is usually | |
81fd3152 AW |
315 | recognized as a primitive, so that it compiles down to a single |
316 | instruction. | |
317 | ||
318 | Compilation of Tree-IL usually begins with a pass that resolves some | |
319 | @code{<module-ref>} and @code{<toplevel-ref>} expressions to | |
67915ab0 | 320 | @code{<primitive-ref>} expressions. The actual compilation pass has |
7081d4f9 AW |
321 | special cases for calls to certain primitives, like @code{apply} or |
322 | @code{cons}. | |
81fd3152 | 323 | @end deftp |
d297e544 | 324 | |
81fd3152 AW |
325 | @deftp {Scheme Variable} <lexical-ref> src name gensym |
326 | @deftpx {External Representation} (lexical @var{name} @var{gensym}) | |
67915ab0 | 327 | A reference to a lexically-bound variable. The @var{name} is the |
81fd3152 AW |
328 | original name of the variable in the source program. @var{gensym} is a |
329 | unique identifier for this variable. | |
330 | @end deftp | |
d297e544 | 331 | |
81fd3152 AW |
332 | @deftp {Scheme Variable} <lexical-set> src name gensym exp |
333 | @deftpx {External Representation} (set! (lexical @var{name} @var{gensym}) @var{exp}) | |
334 | Sets a lexically-bound variable. | |
335 | @end deftp | |
d297e544 | 336 | |
81fd3152 AW |
337 | @deftp {Scheme Variable} <module-ref> src mod name public? |
338 | @deftpx {External Representation} (@@ @var{mod} @var{name}) | |
339 | @deftpx {External Representation} (@@@@ @var{mod} @var{name}) | |
340 | A reference to a variable in a specific module. @var{mod} should be | |
679cceed | 341 | the name of the module, e.g.@: @code{(guile-user)}. |
81fd3152 AW |
342 | |
343 | If @var{public?} is true, the variable named @var{name} will be looked | |
344 | up in @var{mod}'s public interface, and serialized with @code{@@}; | |
345 | otherwise it will be looked up among the module's private bindings, | |
346 | and is serialized with @code{@@@@}. | |
347 | @end deftp | |
d297e544 | 348 | |
81fd3152 AW |
349 | @deftp {Scheme Variable} <module-set> src mod name public? exp |
350 | @deftpx {External Representation} (set! (@@ @var{mod} @var{name}) @var{exp}) | |
351 | @deftpx {External Representation} (set! (@@@@ @var{mod} @var{name}) @var{exp}) | |
352 | Sets a variable in a specific module. | |
353 | @end deftp | |
d297e544 | 354 | |
81fd3152 AW |
355 | @deftp {Scheme Variable} <toplevel-ref> src name |
356 | @deftpx {External Representation} (toplevel @var{name}) | |
357 | References a variable from the current procedure's module. | |
358 | @end deftp | |
d297e544 | 359 | |
81fd3152 AW |
360 | @deftp {Scheme Variable} <toplevel-set> src name exp |
361 | @deftpx {External Representation} (set! (toplevel @var{name}) @var{exp}) | |
362 | Sets a variable in the current procedure's module. | |
363 | @end deftp | |
d297e544 | 364 | |
81fd3152 AW |
365 | @deftp {Scheme Variable} <toplevel-define> src name exp |
366 | @deftpx {External Representation} (define (toplevel @var{name}) @var{exp}) | |
367 | Defines a new top-level variable in the current procedure's module. | |
368 | @end deftp | |
d297e544 | 369 | |
81fd3152 AW |
370 | @deftp {Scheme Variable} <conditional> src test then else |
371 | @deftpx {External Representation} (if @var{test} @var{then} @var{else}) | |
ca445ba5 | 372 | A conditional. Note that @var{else} is not optional. |
c850030f | 373 | @end deftp |
d297e544 | 374 | |
7081d4f9 AW |
375 | @deftp {Scheme Variable} <call> src proc args |
376 | @deftpx {External Representation} (call @var{proc} . @var{args}) | |
ca445ba5 | 377 | A procedure call. |
c850030f | 378 | @end deftp |
d297e544 | 379 | |
a881a4ae AW |
380 | @deftp {Scheme Variable} <primcall> src name args |
381 | @deftpx {External Representation} (primcall @var{name} . @var{args}) | |
382 | A call to a primitive. Equivalent to @code{(call (primitive @var{name}) | |
383 | . @var{args})}. This construct is often more convenient to generate and | |
384 | analyze than @code{<call>}. | |
385 | ||
386 | As part of the compilation process, instances of @code{(call (primitive | |
387 | @var{name}) . @var{args})} are transformed into primcalls. | |
388 | @end deftp | |
d297e544 | 389 | |
67915ab0 AW |
390 | @deftp {Scheme Variable} <seq> src head tail |
391 | @deftpx {External Representation} (seq @var{head} @var{tail}) | |
392 | A sequence. The semantics is that @var{head} is evaluated first, and | |
393 | any resulting values are ignored. Then @var{tail} is evaluated, in tail | |
394 | position. | |
c850030f | 395 | @end deftp |
d297e544 | 396 | |
41e64dd7 AW |
397 | @deftp {Scheme Variable} <lambda> src meta body |
398 | @deftpx {External Representation} (lambda @var{meta} @var{body}) | |
67915ab0 AW |
399 | A closure. @var{meta} is an association list of properties for the |
400 | procedure. @var{body} is a single Tree-IL expression of type | |
401 | @code{<lambda-case>}. As the @code{<lambda-case>} clause can chain to | |
41e64dd7 AW |
402 | an alternate clause, this makes Tree-IL's @code{<lambda>} have the |
403 | expressiveness of Scheme's @code{case-lambda}. | |
404 | @end deftp | |
d297e544 | 405 | |
41e64dd7 AW |
406 | @deftp {Scheme Variable} <lambda-case> req opt rest kw inits gensyms body alternate |
407 | @deftpx {External Representation} @ | |
408 | (lambda-case ((@var{req} @var{opt} @var{rest} @var{kw} @var{inits} @var{gensyms})@ | |
409 | @var{body})@ | |
410 | [@var{alternate}]) | |
67915ab0 | 411 | One clause of a @code{case-lambda}. A @code{lambda} expression in |
41e64dd7 AW |
412 | Scheme is treated as a @code{case-lambda} with one clause. |
413 | ||
414 | @var{req} is a list of the procedure's required arguments, as symbols. | |
415 | @var{opt} is a list of the optional arguments, or @code{#f} if there | |
416 | are no optional arguments. @var{rest} is the name of the rest | |
417 | argument, or @code{#f}. | |
418 | ||
419 | @var{kw} is a list of the form, @code{(@var{allow-other-keys?} | |
420 | (@var{keyword} @var{name} @var{var}) ...)}, where @var{keyword} is the | |
421 | keyword corresponding to the argument named @var{name}, and whose | |
67915ab0 AW |
422 | corresponding gensym is @var{var}. @var{inits} are tree-il expressions |
423 | corresponding to all of the optional and keyword arguments, evaluated to | |
424 | bind variables whose value is not supplied by the procedure caller. | |
41e64dd7 AW |
425 | Each @var{init} expression is evaluated in the lexical context of |
426 | previously bound variables, from left to right. | |
427 | ||
428 | @var{gensyms} is a list of gensyms corresponding to all arguments: | |
429 | first all of the required arguments, then the optional arguments if | |
430 | any, then the rest argument if any, then all of the keyword arguments. | |
431 | ||
67915ab0 | 432 | @var{body} is the body of the clause. If the procedure is called with |
41e64dd7 | 433 | an appropriate number of arguments, @var{body} is evaluated in tail |
67915ab0 | 434 | position. Otherwise, if there is an @var{alternate}, it should be a |
41e64dd7 | 435 | @code{<lambda-case>} expression, representing the next clause to try. |
64de6db5 | 436 | If there is no @var{alternate}, a wrong-number-of-arguments error is |
41e64dd7 AW |
437 | signaled. |
438 | @end deftp | |
d297e544 | 439 | |
41e64dd7 AW |
440 | @deftp {Scheme Variable} <let> src names gensyms vals exp |
441 | @deftpx {External Representation} (let @var{names} @var{gensyms} @var{vals} @var{exp}) | |
67915ab0 AW |
442 | Lexical binding, like Scheme's @code{let}. @var{names} are the original |
443 | binding names, @var{gensyms} are gensyms corresponding to the | |
81fd3152 AW |
444 | @var{names}, and @var{vals} are Tree-IL expressions for the values. |
445 | @var{exp} is a single Tree-IL expression. | |
446 | @end deftp | |
d297e544 | 447 | |
935c7aca | 448 | @deftp {Scheme Variable} <letrec> in-order? src names gensyms vals exp |
172988ee AW |
449 | @deftpx {External Representation} (letrec @var{names} @var{gensyms} @var{vals} @var{exp}) |
450 | @deftpx {External Representation} (letrec* @var{names} @var{gensyms} @var{vals} @var{exp}) | |
81fd3152 | 451 | A version of @code{<let>} that creates recursive bindings, like |
935c7aca | 452 | Scheme's @code{letrec}, or @code{letrec*} if @var{in-order?} is true. |
81fd3152 | 453 | @end deftp |
d297e544 | 454 | |
67915ab0 AW |
455 | @deftp {Scheme Variable} <prompt> escape-only? tag body handler |
456 | @deftpx {External Representation} (prompt @var{escape-only?} @var{tag} @var{body} @var{handler}) | |
457 | A dynamic prompt. Instates a prompt named @var{tag}, an expression, | |
41e64dd7 | 458 | during the dynamic extent of the execution of @var{body}, also an |
67915ab0 AW |
459 | expression. If an abort occurs to this prompt, control will be passed |
460 | to @var{handler}, also an expression, which should be a procedure. The | |
461 | first argument to the handler procedure will be the captured | |
462 | continuation, followed by all of the values passed to the abort. If | |
463 | @var{escape-only?} is true, the handler should be a @code{<lambda>} with | |
464 | a single @code{<lambda-case>} body expression with no optional or | |
465 | keyword arguments, and no alternate, and whose first argument is | |
466 | unreferenced. @xref{Prompts}, for more information. | |
41e64dd7 | 467 | @end deftp |
d297e544 | 468 | |
41e64dd7 AW |
469 | @deftp {Scheme Variable} <abort> tag args tail |
470 | @deftpx {External Representation} (abort @var{tag} @var{args} @var{tail}) | |
471 | An abort to the nearest prompt with the name @var{tag}, an expression. | |
472 | @var{args} should be a list of expressions to pass to the prompt's | |
473 | handler, and @var{tail} should be an expression that will evaluate to | |
67915ab0 | 474 | a list of additional arguments. An abort will save the partial |
41e64dd7 AW |
475 | continuation, which may later be reinstated, resulting in the |
476 | @code{<abort>} expression evaluating to some number of values. | |
477 | @end deftp | |
81fd3152 | 478 | |
98850fd7 AW |
479 | There are two Tree-IL constructs that are not normally produced by |
480 | higher-level compilers, but instead are generated during the | |
481 | source-to-source optimization and analysis passes that the Tree-IL | |
67915ab0 AW |
482 | compiler does. Users should not generate these expressions directly, |
483 | unless they feel very clever, as the default analysis pass will generate | |
484 | them as necessary. | |
98850fd7 | 485 | |
41e64dd7 AW |
486 | @deftp {Scheme Variable} <let-values> src names gensyms exp body |
487 | @deftpx {External Representation} (let-values @var{names} @var{gensyms} @var{exp} @var{body}) | |
98850fd7 AW |
488 | Like Scheme's @code{receive} -- binds the values returned by |
489 | evaluating @code{exp} to the @code{lambda}-like bindings described by | |
67915ab0 | 490 | @var{gensyms}. That is to say, @var{gensyms} may be an improper list. |
98850fd7 | 491 | |
7081d4f9 | 492 | @code{<let-values>} is an optimization of a @code{<call>} to the |
98850fd7 AW |
493 | primitive, @code{call-with-values}. |
494 | @end deftp | |
d297e544 | 495 | |
41e64dd7 AW |
496 | @deftp {Scheme Variable} <fix> src names gensyms vals body |
497 | @deftpx {External Representation} (fix @var{names} @var{gensyms} @var{vals} @var{body}) | |
98850fd7 AW |
498 | Like @code{<letrec>}, but only for @var{vals} that are unset |
499 | @code{lambda} expressions. | |
500 | ||
501 | @code{fix} is an optimization of @code{letrec} (and @code{let}). | |
502 | @end deftp | |
81fd3152 | 503 | |
67915ab0 AW |
504 | Tree-IL is a convenient compilation target from source languages. It |
505 | can be convenient as a medium for optimization, though CPS is usually | |
506 | better. The strength of Tree-IL is that it does not fix order of | |
507 | evaluation, so it makes some code motion a bit easier. | |
00ce5125 | 508 | |
67915ab0 | 509 | Optimization passes performed on Tree-IL currently include: |
00ce5125 | 510 | |
73643339 | 511 | @itemize |
67915ab0 AW |
512 | @item Open-coding (turning toplevel-refs into primitive-refs, |
513 | and calls to primitives to primcalls) | |
514 | @item Partial evaluation (comprising inlining, copy propagation, and | |
515 | constant folding) | |
516 | @item Common subexpression elimination (CSE) | |
73643339 AW |
517 | @end itemize |
518 | ||
67915ab0 AW |
519 | In the future, we will move the CSE pass to operate over the lower-level |
520 | CPS language. | |
73643339 | 521 | |
67915ab0 AW |
522 | @node Continuation-Passing Style |
523 | @subsection Continuation-Passing Style | |
73643339 | 524 | |
67915ab0 | 525 | @cindex CPS |
58b23156 AW |
526 | Continuation-passing style (CPS) is Guile's principal intermediate |
527 | language, bridging the gap between languages for people and languages | |
528 | for machines. CPS gives a name to every part of a program: every | |
529 | control point, and every intermediate value. This makes it an excellent | |
530 | medium for reasoning about programs, which is the principal job of a | |
531 | compiler. | |
532 | ||
533 | @menu | |
534 | * An Introduction to CPS:: | |
535 | * CPS in Guile:: | |
bdad1340 | 536 | * Building CPS:: |
58b23156 AW |
537 | @end menu |
538 | ||
539 | @node An Introduction to CPS | |
540 | @subsubsection An Introduction to CPS | |
541 | ||
542 | As an example, consider the following Scheme expression: | |
543 | ||
544 | @lisp | |
545 | (begin | |
546 | (display "The sum of 32 and 10 is: ") | |
547 | (display 42) | |
548 | (newline)) | |
549 | @end lisp | |
550 | ||
551 | Let us identify all of the sub-expressions in this expression. We give | |
552 | them unique labels, like @var{k1}, and annotate the original source | |
553 | code: | |
554 | ||
555 | @lisp | |
556 | (begin | |
557 | (display "The sum of 32 and 10 is: ") | |
558 | |k1 k2 | |
559 | k0 | |
560 | (display 42) | |
561 | |k4 k5 | |
562 | k3 | |
563 | (newline)) | |
564 | |k7 | |
565 | k6 | |
566 | @end lisp | |
567 | ||
568 | These labels also identify continuations. For example, the continuation | |
569 | of @code{k7} is @code{k6}. This is because after evaluating the value | |
570 | of @code{newline}, performed by the expression labelled @code{k7}, we | |
bdad1340 | 571 | continue to apply it in @code{k6}. |
58b23156 AW |
572 | |
573 | Which label has @code{k0} as its continuation? It is either @code{k1} | |
574 | or @code{k2}. Scheme does not have a fixed order of evaluation of | |
575 | arguments, although it does guarantee that they are evaluated in some | |
576 | order. However, continuation-passing style makes evaluation order | |
577 | explicit. In Guile, this choice is made by the higher-level language | |
578 | compilers. | |
579 | ||
580 | Let us assume a left-to-right evaluation order. In that case the | |
581 | continuation of @code{k1} is @code{k2}, and the continuation of | |
582 | @code{k2} is @code{k0}. | |
583 | ||
584 | With this example established, we are ready to give an example of CPS in | |
585 | Scheme: | |
586 | ||
587 | @lisp | |
588 | (lambda (ktail) | |
589 | (let ((k1 (lambda () | |
590 | (let ((k2 (lambda (proc) | |
591 | (let ((k0 (lambda (arg0) | |
592 | (proc k4 arg0)))) | |
593 | (k0 "The sum of 32 and 10 is: "))))) | |
594 | (k2 display)))) | |
595 | (k4 (lambda _ | |
596 | (let ((k5 (lambda (proc) | |
597 | (let ((k3 (lambda (arg0) | |
598 | (proc k7 arg0)))) | |
599 | (k3 42))))) | |
600 | (k5 display)))) | |
601 | (k7 (lambda _ | |
602 | (let ((k6 (lambda (proc) | |
603 | (proc ktail)))) | |
604 | (k6 newline))))) | |
605 | (k1)) | |
606 | @end lisp | |
607 | ||
608 | Holy code explosion, Batman! What's with all the lambdas? Indeed, CPS | |
609 | is by nature much more verbose than ``direct-style'' intermediate | |
610 | languages like Tree-IL. At the same time, CPS is more simple than full | |
611 | Scheme, in the same way that a Turing machine is more simple than | |
612 | Scheme, although they are semantically equivalent. | |
613 | ||
614 | In the original program, the expression labelled @code{k0} is in effect | |
615 | context. Any values it returns are ignored. This is reflected in CPS | |
616 | by noting that its continuation, @code{k4}, takes any number of values | |
617 | and ignores them. Compare this to @code{k2}, which takes a single | |
618 | value; in this way we can say that @code{k1} is in a ``value'' context. | |
619 | Likewise @code{k6} is in tail context with respect to the expression as | |
620 | a whole, because its continuation is the tail continuation, | |
621 | @code{ktail}. CPS makes these details manifest, and gives them names. | |
622 | ||
bdad1340 AW |
623 | @subsubheading Compiling CPS |
624 | ||
625 | In CPS, there are no nested expressions. Indeed, CPS even removes the | |
626 | concept of a stack. All applications in CPS are in tail context. For | |
627 | that reason, applications in CPS are jumps, not calls. The @code{(k1)} | |
628 | above is nothing more than a @code{goto}. @code{(k3 42)} is a | |
629 | @code{goto} with a value. In this way, CPS bridges the gap between the | |
630 | lambda calculus and machine instruction sequences. | |
631 | ||
632 | On the side of machine instructions, Guile does still have a stack, and | |
633 | the @code{lambda} forms shown above do not actually result in one | |
634 | closure being allocated per subexpression at run-time. Lambda | |
635 | expressions introduced by a CPS transformation can always be allocated | |
636 | as labels or basic blocks within a function. In fact, we make a | |
637 | syntactic distinction between closures and continuations in the CPS | |
638 | language, and attempt to transform closures to continuations (basic | |
639 | blocks) where possible, via the @dfn{contification} optimization pass. | |
640 | ||
641 | Values bound by continuations are allocated to stack slots in a | |
642 | function's frame. The compiler from CPS only allocates slots to values | |
643 | that are actually live; it's possible to have a value in scope but not | |
644 | allocated to a slot. | |
645 | ||
58b23156 AW |
646 | @node CPS in Guile |
647 | @subsubsection CPS in Guile | |
648 | ||
bdad1340 AW |
649 | Guile's CPS language is composed of @dfn{terms}, @dfn{expressions}, |
650 | and @dfn{continuations}. | |
58b23156 | 651 | |
bdad1340 AW |
652 | A term can either evaluate an expression and pass the resulting values |
653 | to some continuation, or it can declare local continuations and contain | |
654 | a sub-term in the scope of those continuations. | |
655 | ||
656 | @deftp {CPS Term} $continue k src exp | |
657 | Evaluate the expression @var{exp} and pass the resulting values (if any) | |
658 | to the continuation labelled @var{k}. The source information associated | |
659 | with the expression may be found in @var{src}, which is either an alist | |
660 | as in @code{source-properties} or is @code{#f} if there is no associated | |
661 | source. | |
58b23156 AW |
662 | @end deftp |
663 | ||
bdad1340 AW |
664 | @deftp {CPS Term} $letk conts body |
665 | Bind @var{conts}, a list of continuations (@code{$cont} instances), in | |
666 | the scope of the sub-term @var{body}. The continuations are mutually | |
667 | recursive. | |
58b23156 AW |
668 | @end deftp |
669 | ||
bdad1340 AW |
670 | Additionally, the early stages of CPS allow for a set of mutually |
671 | recursive functions to be declared as a term. This @code{$letrec} type | |
672 | is like Tree-IL's @code{<fix>}. The contification pass will attempt to | |
673 | transform the functions declared in a @code{$letrec} into local | |
674 | continuations. Any remaining functions are later lowered to @code{$fun} | |
675 | expressions. | |
58b23156 | 676 | |
bdad1340 AW |
677 | @deftp {CPS Term} $letrec names syms funs body |
678 | Declare the mutually recursive set of functions denoted by @var{names}, | |
679 | @var{syms}, and @var{funs} within the sub-term @var{body}. @var{names} | |
680 | and @var{syms} are lists of symbols, and @var{funs} is a list of | |
681 | @var{$fun} values. @var{syms} are globally unique. | |
58b23156 AW |
682 | @end deftp |
683 | ||
bdad1340 AW |
684 | Here is an inventory of the kinds of expressions in Guile's CPS |
685 | language. Recall that all expressions are wrapped in a @code{$continue} | |
686 | term which specifies their continuation. | |
58b23156 | 687 | |
bdad1340 AW |
688 | @deftp {CPS Expression} $void |
689 | Continue with the unspecified value. | |
58b23156 | 690 | @end deftp |
bdad1340 AW |
691 | |
692 | @deftp {CPS Expression} $const val | |
693 | Continue with the constant value @var{val}. | |
58b23156 | 694 | @end deftp |
bdad1340 AW |
695 | |
696 | @deftp {CPS Expression} $prim name | |
697 | Continue with the procedure that implements the primitive operation | |
698 | named by @var{name}. | |
58b23156 | 699 | @end deftp |
bdad1340 AW |
700 | |
701 | @deftp {CPS Expression} $fun src meta free body | |
702 | Continue with a procedure. @var{src} identifies the source information | |
703 | for the procedure declaration, and @var{meta} is the metadata alist as | |
704 | described above in Tree-IL's @code{<lambda>}. @var{free} is a list of | |
705 | free variables accessed by the procedure. Early CPS uses an empty list | |
706 | for @var{free}; only after closure conversion is it correctly populated. | |
707 | Finally, @var{body} is the @code{$kentry} @code{$cont} of the procedure | |
708 | entry. | |
58b23156 | 709 | @end deftp |
bdad1340 AW |
710 | |
711 | @deftp {CPS Expression} $call proc args | |
712 | Call @var{proc} with the arguments @var{args}, and pass all values to | |
713 | the continuation. @var{proc} and the elements of the @var{args} list | |
714 | should all be variable names. The continuation identified by the term's | |
715 | @var{k} should be a @code{$kreceive} or a @code{$ktail} instance. | |
58b23156 | 716 | @end deftp |
bdad1340 AW |
717 | |
718 | @deftp {CPS Expression} $primcall name args | |
719 | Perform the primitive operation identified by @code{name}, a well-known | |
720 | symbol, passing it the arguments @var{args}, and pass all resulting | |
721 | values to the continuation. The set of available primitives includes | |
722 | all primitives known to Tree-IL and then some more; see the source code | |
723 | for details. | |
58b23156 AW |
724 | @end deftp |
725 | ||
bdad1340 AW |
726 | @deftp {CPS Expression} $values args |
727 | Pass the values named by the list @var{args} to the continuation. | |
728 | @end deftp | |
58b23156 | 729 | |
bdad1340 AW |
730 | @deftp {CPS Expression} $prompt escape? tag handler |
731 | Push a prompt on the stack identified by the variable name @var{tag}, | |
732 | which may be escape-only if @var{escape?} is true, and continue with | |
733 | zero values. If the body aborts to this prompt, control will proceed at | |
734 | the continuation labelled @var{handler}, which should be a | |
735 | @code{$kreceive} continuation. Prompts are later popped by | |
736 | @code{pop-prompt} primcalls. | |
737 | @end deftp | |
58b23156 | 738 | |
bdad1340 AW |
739 | The remaining element of the CPS language in Guile is the continuation. |
740 | In CPS, all continuations have unique labels. Since this aspect is | |
741 | common to all continuation types, all continuations are contained in a | |
742 | @code{$cont} instance: | |
58b23156 | 743 | |
bdad1340 AW |
744 | @deftp {CPS Continuation Wrapper} $cont k cont |
745 | Declare a continuation labelled @var{k}. All references to the | |
746 | continuation will use this label. | |
747 | @end deftp | |
58b23156 | 748 | |
bdad1340 AW |
749 | The most common kind of continuation binds some number of values, and |
750 | then evaluates a sub-term. @code{$kargs} is this kind of simple | |
751 | @code{lambda}. | |
58b23156 | 752 | |
bdad1340 AW |
753 | @deftp {CPS Continuation} $kargs names syms body |
754 | Bind the incoming values to the variables @var{syms}, with original | |
755 | names @var{names}, and then evaluate the sub-term @var{body}. | |
756 | @end deftp | |
58b23156 | 757 | |
bdad1340 AW |
758 | Variable names (the names in the @var{syms} of a @code{$kargs}) should |
759 | be globally unique, and also disjoint from continuation labels. To bind | |
760 | a value to a variable and then evaluate some term, you would continue | |
761 | with the value to a @code{$kargs} that declares one variable. The bound | |
762 | value would then be available for use within the body of the | |
763 | @code{$kargs}. | |
58b23156 | 764 | |
bdad1340 AW |
765 | @deftp {CPS Continuation} $kif kt kf |
766 | Receive one value. If it is true for the purposes of Scheme, branch to | |
767 | the continuation labelled @var{kt}, passing no values; otherwise, branch | |
768 | to @var{kf}. | |
769 | @end deftp | |
770 | ||
771 | For internal reasons, only certain terms may continue to a @code{$kif}. | |
772 | Compiling @code{$kif} avoids allocating space for the test variable, so | |
773 | it needs to be preceded by expressions that can test-and-branch without | |
774 | temporary values. In practice this condition is true for | |
775 | @code{$primcall}s to @code{null?}, @code{=}, and similar primitives that | |
776 | have corresponding @code{br-if-@var{foo}} VM operations; see the source | |
777 | code for full details. When in doubt, bind the test expression to a | |
778 | variable, and continue to the @code{$kif} with a @code{$values} | |
779 | expression. The optimizer should elide the @code{$values} if it is not | |
780 | needed. | |
58b23156 | 781 | |
bdad1340 AW |
782 | Calls out to other functions need to be wrapped in a @code{$kreceive} |
783 | continuation in order to adapt the returned values to their uses in the | |
784 | calling function, if any. | |
785 | ||
786 | @deftp {CPS Continuation} $kreceive arity k | |
787 | Receive values on the stack. Parse them according to @var{arity}, and | |
788 | then proceed with the parsed values to the @var{$kargs} continuation | |
789 | labelled @var{k}. As a limitation specific to @code{$kreceive}, | |
790 | @var{arity} may only contain required and rest arguments. | |
791 | @end deftp | |
792 | ||
793 | @code{$arity} is a helper data structure used by @code{$kreceive} and | |
794 | also by @code{$kclause}, described below. | |
795 | ||
796 | @deftp {CPS Data} $arity req opt rest kw allow-other-keys? | |
797 | A data type declaring an arity. @var{req} and @var{opt} are lists of | |
798 | source names of required and optional arguments, respectively. | |
799 | @var{rest} is either the source name of the rest variable, or @code{#f} | |
800 | if this arity does not accept additional values. @var{kw} is a list of | |
801 | the form @code{((@var{keyword} @var{name} @var{var}) ...)}, describing | |
802 | the keyword arguments. @var{allow-other-keys?} is true if other keyword | |
803 | arguments are allowed and false otherwise. | |
804 | ||
805 | Note that all of these names with the exception of the @var{var}s in the | |
806 | @var{kw} list are source names, not unique variable names. | |
807 | @end deftp | |
808 | ||
809 | Additionally, there are three specific kinds of continuations that can | |
810 | only be declared at function entries. | |
811 | ||
812 | @deftp {CPS Continuation} $kentry self tail clauses | |
813 | Declare a function entry. @var{self} is a variable bound to the | |
814 | procedure being called, and which may be used for self-references. | |
815 | @var{tail} declares the @code{$cont} wrapping the @code{$ktail} for this | |
816 | function, corresponding to the function's tail continuation. | |
817 | @var{clauses} is a list of @code{$kclause} @code{$cont} instances. | |
818 | @end deftp | |
819 | ||
820 | @deftp {CPS Continuation} $ktail | |
821 | A tail continuation. | |
822 | @end deftp | |
823 | ||
824 | @deftp {CPS Continuation} $kclause arity cont | |
825 | A clause of a function with a given arity. Applications of a function | |
826 | with a compatible set of actual arguments will continue to @var{cont}, a | |
827 | @code{$kargs} @code{$cont} instance representing the clause body. | |
828 | @end deftp | |
829 | ||
830 | ||
831 | @node Building CPS | |
832 | @subsubsection Building CPS | |
833 | ||
834 | Unlike Tree-IL, the CPS language is built to be constructed and | |
835 | deconstructed with abstract macros instead of via procedural | |
836 | constructors or accessors, or instead of S-expression matching. | |
837 | ||
838 | Deconstruction and matching is handled adequately by the @code{match} | |
839 | form from @code{(ice-9 match)}. @xref{Pattern Matching}. Construction | |
840 | is handled by a set of mutually recursive builder macros: | |
841 | @code{build-cps-term}, @code{build-cps-cont}, and @code{build-cps-exp}. | |
842 | ||
843 | In the following interface definitions, consider variables containing | |
844 | @code{cont} to be recursively build by @code{build-cps-cont}, and | |
845 | likewise for @code{term} and @code{exp}. Consider any other name to be | |
846 | evaluated as a Scheme expression. Many of these forms recognize | |
847 | @code{unquote} in some contexts, to splice in a previously-built value; | |
848 | see the specifications below for full details. | |
849 | ||
850 | @deffn {Scheme Syntax} build-cps-term ,val | |
851 | @deffnx {Scheme Syntax} build-cps-term ($letk (cont ...) term) | |
852 | @deffnx {Scheme Syntax} build-cps-term ($letrec names syms funs term) | |
853 | @deffnx {Scheme Syntax} build-cps-term ($continue k src exp) | |
854 | @deffnx {Scheme Syntax} build-cps-exp ,val | |
855 | @deffnx {Scheme Syntax} build-cps-exp ($void) | |
856 | @deffnx {Scheme Syntax} build-cps-exp ($const val) | |
857 | @deffnx {Scheme Syntax} build-cps-exp ($prim name) | |
858 | @deffnx {Scheme Syntax} build-cps-exp ($fun src meta free body) | |
859 | @deffnx {Scheme Syntax} build-cps-exp ($call proc (arg ...)) | |
860 | @deffnx {Scheme Syntax} build-cps-exp ($call proc args) | |
861 | @deffnx {Scheme Syntax} build-cps-exp ($primcall name (arg ...)) | |
862 | @deffnx {Scheme Syntax} build-cps-exp ($primcall name args) | |
863 | @deffnx {Scheme Syntax} build-cps-exp ($values (arg ...)) | |
864 | @deffnx {Scheme Syntax} build-cps-exp ($values args) | |
865 | @deffnx {Scheme Syntax} build-cps-exp ($prompt escape? tag handler) | |
866 | @deffnx {Scheme Syntax} build-cps-cont ,val | |
867 | @deffnx {Scheme Syntax} build-cps-cont (k ($kargs (name ...) (sym ...) term)) | |
868 | @deffnx {Scheme Syntax} build-cps-cont (k ($kargs names syms term)) | |
869 | @deffnx {Scheme Syntax} build-cps-cont (k ($kif kt kf)) | |
870 | @deffnx {Scheme Syntax} build-cps-cont (k ($kreceive req rest kargs)) | |
871 | @deffnx {Scheme Syntax} build-cps-cont (k ($kentry self tail-cont ,clauses)) | |
872 | @deffnx {Scheme Syntax} build-cps-cont (k ($kentry self tail-cont (cont ...))) | |
873 | @deffnx {Scheme Syntax} build-cps-cont (k ($kclause ,arity cont)) | |
874 | @deffnx {Scheme Syntax} build-cps-cont (k ($kclause (req opt rest kw aok?) cont)) | |
875 | Construct a CPS term, expression, or continuation. | |
876 | @end deffn | |
58b23156 | 877 | |
bdad1340 | 878 | There are a few more miscellaneous interfaces as well. |
58b23156 | 879 | |
bdad1340 AW |
880 | @deffn {Scheme Procedure} make-arity req opt rest kw allow-other-keywords? |
881 | A procedural constructor for @code{$arity} objects. | |
882 | @end deffn | |
58b23156 | 883 | |
bdad1340 AW |
884 | @deffn {Scheme Syntax} let-gensyms (sym ...) body ... |
885 | Bind @var{sym...} to fresh names, and evaluate @var{body...}. | |
886 | @end deffn | |
58b23156 | 887 | |
bdad1340 AW |
888 | @deffn {Scheme Syntax} rewrite-cps-term val (pat term) ... |
889 | @deffnx {Scheme Syntax} rewrite-cps-exp val (pat exp) ... | |
890 | @deffnx {Scheme Syntax} rewrite-cps-cont val (pat cont) ... | |
891 | Match @var{val} against the series of patterns @var{pat...}, using | |
892 | @code{match}. The body of the matching clause should be a template in | |
893 | the syntax of @code{build-cps-term}, @code{build-cps-exp}, or | |
894 | @code{build-cps-cont}, respectively. | |
895 | @end deffn | |
73643339 | 896 | |
67915ab0 AW |
897 | @node Bytecode |
898 | @subsection Bytecode | |
73643339 | 899 | |
18e11135 | 900 | @xref{Object File Format}. |
73643339 | 901 | |
18e11135 | 902 | TODO: document (system vm loader) |
e3ba263d | 903 | |
b8bc86bc AW |
904 | @deffn {Scheme Variable} load-thunk-from-file file |
905 | @deffnx {C Function} scm_load_thunk_from_file (file) | |
ff73ae34 AW |
906 | Load object code from a file named @var{file}. The file will be mapped |
907 | into memory via @code{mmap}, so this is a very fast operation. | |
e3ba263d | 908 | |
b8bc86bc AW |
909 | On disk, object code is embedded in ELF, a flexible container format |
910 | created for use in UNIX systems. Guile has its own ELF linker and | |
911 | loader, so it uses the ELF format on all systems. | |
73643339 AW |
912 | @end deffn |
913 | ||
18e11135 | 914 | TODO: document load-thunk-from-memory |
ff73ae34 AW |
915 | |
916 | Compiling object code to the fake language, @code{value}, is performed | |
917 | via loading objcode into a program, then executing that thunk with | |
918 | respect to the compilation environment. Normally the environment | |
919 | propagates through the compiler transparently, but users may specify | |
41e64dd7 | 920 | the compilation environment manually as well, as a module. |
ff73ae34 | 921 | |
c850030f | 922 | |
e63d888e DK |
923 | @node Writing New High-Level Languages |
924 | @subsection Writing New High-Level Languages | |
925 | ||
926 | In order to integrate a new language @var{lang} into Guile's compiler | |
927 | system, one has to create the module @code{(language @var{lang} spec)} | |
928 | containing the language definition and referencing the parser, | |
929 | compiler and other routines processing it. The module hierarchy in | |
930 | @code{(language brainfuck)} defines a very basic Brainfuck | |
931 | implementation meant to serve as easy-to-understand example on how to | |
4e432dab AW |
932 | do this. See for instance @url{http://en.wikipedia.org/wiki/Brainfuck} |
933 | for more information about the Brainfuck language itself. | |
934 | ||
e63d888e | 935 | |
ff73ae34 AW |
936 | @node Extending the Compiler |
937 | @subsection Extending the Compiler | |
e3ba263d | 938 | |
dd6e37d0 NJ |
939 | At this point we take a detour from the impersonal tone of the rest of |
940 | the manual. Admit it: if you've read this far into the compiler | |
941 | internals manual, you are a junkie. Perhaps a course at your university | |
942 | left you unsated, or perhaps you've always harbored a desire to hack the | |
943 | holy of computer science holies: a compiler. Well you're in good | |
944 | company, and in a good position. Guile's compiler needs your help. | |
ff73ae34 AW |
945 | |
946 | There are many possible avenues for improving Guile's compiler. | |
947 | Probably the most important improvement, speed-wise, will be some form | |
948 | of native compilation, both just-in-time and ahead-of-time. This could | |
949 | be done in many ways. Probably the easiest strategy would be to extend | |
950 | the compiled procedure structure to include a pointer to a native code | |
86872cc3 | 951 | vector, and compile from bytecode to native code at run-time after a |
ff73ae34 AW |
952 | procedure is called a certain number of times. |
953 | ||
954 | The name of the game is a profiling-based harvest of the low-hanging | |
955 | fruit, running programs of interest under a system-level profiler and | |
956 | determining which improvements would give the most bang for the buck. | |
98850fd7 AW |
957 | It's really getting to the point though that native compilation is the |
958 | next step. | |
ff73ae34 AW |
959 | |
960 | The compiler also needs help at the top end, enhancing the Scheme that | |
98850fd7 AW |
961 | it knows to also understand R6RS, and adding new high-level compilers. |
962 | We have JavaScript and Emacs Lisp mostly complete, but they could use | |
ecb87335 | 963 | some love; Lua would be nice as well, but whatever language it is |
98850fd7 AW |
964 | that strikes your fancy would be welcome too. |
965 | ||
966 | Compilers are for hacking, not for admiring or for complaining about. | |
967 | Get to it! |