Fixed a Scheme translation bug; cleaned compilation with GCC 4.
[bpt/guile.git] / doc / guile-vm.texi
1 \input texinfo @c -*-texinfo-*-
2 @c %**start of header
3 @setfilename guile-vm.info
4 @settitle Guile VM Specification
5 @footnotestyle end
6 @setchapternewpage odd
7 @c %**end of header
8
9 @set EDITION 0.6
10 @set VERSION 0.6
11 @set UPDATED 2005-04-26
12
13 @c Macro for instruction definitions.
14 @macro insn{}
15 Instruction
16 @end macro
17
18 @c For Scheme procedure definitions.
19 @macro scmproc{}
20 Scheme Procedure
21 @end macro
22
23 @c Scheme records.
24 @macro scmrec{}
25 Record
26 @end macro
27
28 @ifinfo
29 @dircategory Scheme Programming
30 @direntry
31 * Guile VM: (guile-vm). Guile's Virtual Machine.
32 @end direntry
33
34 This file documents Guile VM.
35
36 Copyright @copyright{} 2000 Keisuke Nishida
37 Copyright @copyright{} 2005 Ludovic Court`es
38
39 Permission is granted to make and distribute verbatim copies of this
40 manual provided the copyright notice and this permission notice are
41 preserved on all copies.
42
43 @ignore
44 Permission is granted to process this file through TeX and print the
45 results, provided the printed document carries a copying permission
46 notice identical to this one except for the removal of this paragraph
47 (this paragraph not being relevant to the printed manual).
48
49 @end ignore
50 Permission is granted to copy and distribute modified versions of this
51 manual under the conditions for verbatim copying, provided that the
52 entire resulting derived work is distributed under the terms of a
53 permission notice identical to this one.
54
55 Permission is granted to copy and distribute translations of this manual
56 into another language, under the above conditions for modified versions,
57 except that this permission notice may be stated in a translation
58 approved by the Free Software Foundation.
59 @end ifinfo
60
61 @titlepage
62 @title Guile VM Specification
63 @subtitle for Guile VM @value{VERSION}
64 @author Keisuke Nishida
65
66 @page
67 @vskip 0pt plus 1filll
68 Edition @value{EDITION} @*
69 Updated for Guile VM @value{VERSION} @*
70 @value{UPDATED} @*
71
72 Copyright @copyright{} 2000 Keisuke Nishida
73 Copyright @copyright{} 2005 Ludovic Court`es
74
75 Permission is granted to make and distribute verbatim copies of this
76 manual provided the copyright notice and this permission notice are
77 preserved on all copies.
78
79 Permission is granted to copy and distribute modified versions of this
80 manual under the conditions for verbatim copying, provided that the
81 entire resulting derived work is distributed under the terms of a
82 permission notice identical to this one.
83
84 Permission is granted to copy and distribute translations of this manual
85 into another language, under the above conditions for modified versions,
86 except that this permission notice may be stated in a translation
87 approved by the Free Software Foundation.
88 @end titlepage
89
90 @contents
91
92 @c *********************************************************************
93 @node Top, Introduction, (dir), (dir)
94 @top Guile VM Specification
95
96 This document would like to correspond to Guile VM @value{VERSION}.
97 However, be warned that important parts still correspond to version
98 0.0 and are not valid anymore.
99
100 @menu
101 * Introduction::
102 * Variable Management::
103 * Instruction Set::
104 * The Compiler::
105 * Concept Index::
106 * Function and Instruction Index::
107 * Command and Variable Index::
108
109 @detailmenu
110 --- The Detailed Node Listing ---
111
112 Instruction Set
113
114 * Environment Control Instructions::
115 * Branch Instructions::
116 * Subprogram Control Instructions::
117 * Data Control Instructions::
118
119 The Compiler
120
121 * Overview::
122 * The Language Front-Ends::
123 * GHIL::
124 * GLIL::
125 * The Assembler::
126
127 @end detailmenu
128 @end menu
129
130 @c *********************************************************************
131 @node Introduction, Variable Management, Top, Top
132 @chapter What is Guile VM?
133
134 A Guile VM has a set of registers and its own stack memory. Guile may
135 have more than one VM's. Each VM may execute at most one program at a
136 time. Guile VM is a CISC system so designed as to execute Scheme and
137 other languages efficiently.
138
139 @unnumberedsubsec Registers
140
141 @itemize
142 @item pc - Program counter ;; ip (instruction poiner) is better?
143 @item sp - Stack pointer
144 @item bp - Base pointer
145 @item ac - Accumulator
146 @end itemize
147
148 @unnumberedsubsec Engine
149
150 A VM may have one of three engines: reckless, regular, or debugging.
151 Reckless engine is fastest but dangerous. Regular engine is normally
152 fail-safe and reasonably fast. Debugging engine is safest and
153 functional but very slow.
154
155 @unnumberedsubsec Memory
156
157 Stack is the only memory that each VM owns. The other memory is shared
158 memory that is shared among every VM and other part of Guile.
159
160 @unnumberedsubsec Program
161
162 A VM program consists of a bytecode that is executed and an environment
163 in which execution is done. Each program is allocated in the shared
164 memory and may be executed by any VM. A program may call other programs
165 within a VM.
166
167 @unnumberedsubsec Instruction
168
169 Guile VM has dozens of system instructions and (possibly) hundreds of
170 functional instructions. Some Scheme procedures such as cons and car
171 are implemented as VM's builtin functions, which are very efficient.
172 Other procedures defined outside of the VM are also considered as VM's
173 functional features, since they do not change the state of VM.
174 Procedures defined within the VM are called subprograms.
175
176 Most instructions deal with the accumulator (ac). The VM stores all
177 results from functions in ac, instead of pushing them into the stack.
178 I'm not sure whether this is a good thing or not.
179
180 @node Variable Management, Instruction Set, Introduction, Top
181 @chapter Variable Management
182
183 FIXME: This chapter needs to be reviewed so that it matches reality.
184 A more up-to-date description of the mechanisms described in this
185 section is given in @ref{Instruction Set}.
186
187 A program may have access to local variables, external variables, and
188 top-level variables.
189
190 @section Local/external variables
191
192 A stack is logically divided into several blocks during execution. A
193 "block" is such a unit that maintains local variables and dynamic chain.
194 A "frame" is an upper level unit that maintains subprogram calls.
195
196 @example
197 Stack
198 dynamic | | | |
199 chain +==========+ - =
200 | |local vars| | |
201 `-|block data| | block |
202 /|frame data| | |
203 | +----------+ - |
204 | |local vars| | | frame
205 `-|block data| | |
206 /+----------+ - |
207 | |local vars| | |
208 `-|block data| | |
209 /+==========+ - =
210 | |local vars| | |
211 `-|block data| | |
212 /|frame data| | |
213 | +----------+ - |
214 | | | | |
215 @end example
216
217 The first block of each frame may look like this:
218
219 @example
220 Address Data
221 ------- ----
222 xxx0028 Local variable 2
223 xxx0024 Local variable 1
224 bp ->xxx0020 Local variable 0
225 xxx001c Local link (block data)
226 xxx0018 External link (block data)
227 xxx0014 Stack pointer (block data)
228 xxx0010 Return address (frame data)
229 xxx000c Parent program (frame data)
230 @end example
231
232 The base pointer (bp) always points to the lowest address of local
233 variables of the recent block. Local variables are referred as "bp[n]".
234 The local link field has a pointer to the dynamic parent of the block.
235 The parent's variables are referred as "bp[-1][n]", and grandparent's
236 are "bp[-1][-1][n]". Thus, any local variable is represented by its
237 depth and offset from the current bp.
238
239 A variable may be "external", which is allocated in the shared memory.
240 The external link field of a block has a pointer to such a variable set,
241 which I call "fragment" (what should I call?). A fragment has a set of
242 variables and its own chain.
243
244 @example
245 local external
246 chain| | chain
247 | +-----+ .--------, |
248 `-|block|--+->|external|-'
249 /+-----+ | `--------'\,
250 `-|block|--' |
251 /+-----+ .--------, |
252 `-|block|---->|external|-'
253 +-----+ `--------'
254 | |
255 @end example
256
257 An external variable is referred as "bp[-2]->variables[n]" or
258 "bp[-2]->link->...->variables[n]". This is also represented by a pair
259 of depth and offset. At any point of execution, the value of bp
260 determines the current local link and external link, and thus the
261 current environment of a program.
262
263 Other data fields are described later.
264
265 @section Top-level variables
266
267 Guile VM uses the same top-level variables as the regular Guile. A
268 program may have direct access to vcells. Currently this is done by
269 calling scm_intern0, but a program is possible to have any top-level
270 environment defined by the current module.
271
272 @section Scheme and VM variable
273
274 Let's think about the following Scheme code as an example:
275
276 @example
277 (define (foo a)
278 (lambda (b) (list foo a b)))
279 @end example
280
281 In the lambda expression, "foo" is a top-level variable, "a" is an
282 external variable, and "b" is a local variable.
283
284 When a VM executes foo, it allocates a block for "a". Since "a" may be
285 externally referred from the closure, the VM creates a fragment with a
286 copy of "a" in it. When the VM evaluates the lambda expression, it
287 creates a subprogram (closure), associating the fragment with the
288 subprogram as its external environment. When the closure is executed,
289 its environment will look like this:
290
291 @example
292 block Top-level: foo
293 +-------------+
294 |local var: b | fragment
295 +-------------+ .-----------,
296 |external link|---->|variable: a|
297 +-------------+ `-----------'
298 @end example
299
300 The fragment remains as long as the closure exists.
301
302 @section Addressing mode
303
304 Guile VM has five addressing modes:
305
306 @itemize
307 @item Real address
308 @item Local position
309 @item External position
310 @item Top-level location
311 @item Constant object
312 @end itemize
313
314 Real address points to the address in the real program and is only used
315 with the program counter (pc).
316
317 Local position and external position are represented as a pair of depth
318 and offset from bp, as described above. These are base relative
319 addresses, and the real address may vary during execution.
320
321 Top-level location is represented as a Guile's vcell. This location is
322 determined at loading time, so the use of this address is efficient.
323
324 Constant object is not an address but gives an instruction an Scheme
325 object directly.
326
327 [ We'll also need dynamic scope addressing to support Emacs Lisp? ]
328
329
330 Overall procedure:
331
332 @enumerate
333 @item A source program is compiled into a bytecode.
334 @item A bytecode is given an environment and becomes a program.
335 @item A VM starts execution, creating a frame for it.
336 @item Whenever a program calls a subprogram, a new frame is created for it.
337 @item When a program finishes execution, it returns a value, and the VM
338 continues execution of the parent program.
339 @item When all programs terminated, the VM returns the final value and stops.
340 @end enumerate
341
342 \f
343 @node Instruction Set, The Compiler, Variable Management, Top
344 @chapter Instruction Set
345
346 The Guile VM instruction set is roughly divided two groups: system
347 instructions and functional instructions. System instructions control
348 the execution of programs, while functional instructions provide many
349 useful calculations.
350
351 @menu
352 * Environment Control Instructions::
353 * Branch Instructions::
354 * Subprogram Control Instructions::
355 * Data Control Instructions::
356 @end menu
357
358 @node Environment Control Instructions, Branch Instructions, Instruction Set, Instruction Set
359 @section Environment Control Instructions
360
361 @deffn @insn{} link binding-name
362 Look up @var{binding-name} (a string) in the current environment and
363 push the corresponding variable object onto the stack. If
364 @var{binding-name} is not bound yet, then create a new binding and
365 push its variable object.
366 @end deffn
367
368 @deffn @insn{} variable-ref
369 Dereference the variable object which is on top of the stack and
370 replace it by the value of the variable it represents.
371 @end deffn
372
373 @deffn @insn{} variable-set
374 Set the value of the variable on top of the stack (at @code{sp[0]}) to
375 the object located immediately before (at @code{sp[-1]}).
376 @end deffn
377
378 As an example, let us look at what a simple function call looks like:
379
380 @example
381 (+ 2 3)
382 @end example
383
384 This call yields the following sequence of instructions:
385
386 @example
387 (link "+") ;; lookup binding "+"
388 (variable-ref) ;; dereference it
389 (make-int8 2) ;; push immediate value `2'
390 (make-int8 3) ;; push immediate value `3'
391 (tail-call 2) ;; call the proc at sp[-3] with two args
392 @end example
393
394 @deffn @insn{} local-ref offset
395 Push onto the stack the value of the local variable located at
396 @var{offset} within the current stack frame.
397 @end deffn
398
399 @deffn @insn{} local-set offset
400 Pop the Scheme object located on top of the stack and make it the new
401 value of the local variable located at @var{offset} within the current
402 stack frame.
403 @end deffn
404
405 @deffn @insn{} external-ref offset
406 Push the value of the closure variable located at position
407 @var{offset} within the program's list of external variables.
408 @end deffn
409
410 @deffn @insn{} external-set offset
411 Pop the Scheme object located on top of the stack and make it the new
412 value of the closure variable located at @var{offset} within the
413 program's list of external variables.
414 @end deffn
415
416 @deffn @insn{} make-closure
417 Pop the program object from the stack and assign it the current
418 closure variable list as its closure. Push the result program
419 object.
420 @end deffn
421
422 Let's illustrate this:
423
424 @example
425 (let ((x 2))
426 (lambda ()
427 (let ((x++ (+ 1 x)))
428 (set! x x++)
429 x++)))
430 @end example
431
432 The resulting program has one external (closure) variable, i.e. its
433 @var{nexts} is set to 1 (@pxref{Subprogram Control Instructions}).
434 This yields the following code:
435
436 @example
437 ;; the traditional program prologue with NLOCS = 0 and NEXTS = 1
438
439 0 (make-int8 2)
440 2 (external-set 0)
441 4 (make-int8 4)
442 6 (link "+") ;; lookup `+'
443 9 (vector 1) ;; create the external variable vector for
444 ;; later use by `object-ref' and `object-set'
445 ...
446 40 (load-program ##34#)
447 59 (make-closure) ;; assign the current closure to the program
448 ;; just pushed by `load-program'
449 60 (return)
450 @end example
451
452 The program loaded here by @var{load-program} contains the following
453 sequence of instructions:
454
455 @example
456 0 (object-ref 0) ;; push the variable for `+'
457 2 (variable-ref) ;; dereference `+'
458 3 (make-int8:1) ;; push 1
459 4 (external-ref 0) ;; push the value of `x'
460 6 (call 2) ;; call `+' and push the result
461 8 (local-set 0) ;; make it the new value of `x++'
462 10 (local-ref 0) ;; push the value of `x++'
463 12 (external-set 0) ;; make it the new value of `x'
464 14 (local-ref 0) ;; push the value of `x++'
465 16 (return) ;; return it
466 @end example
467
468 At this point, you should know pretty much everything about the three
469 types of variables a program may need to access.
470
471
472 @node Branch Instructions, Subprogram Control Instructions, Environment Control Instructions, Instruction Set
473 @section Branch Instructions
474
475 All the conditional branch instructions described below work in the
476 same way:
477
478 @itemize
479 @item They take the Scheme object located on the stack and use it as
480 the branch condition;
481 @item If the condition if false, then program execution continues with
482 the next instruction;
483 @item If the condition is true, then the instruction pointer is
484 increased by the offset passed as an argument to the branch
485 instruction;
486 @item Finally, when the instruction finished, the condition object is
487 removed from the stack.
488 @end itemize
489
490 Note that the offset passed to the instruction is encoded on two 8-bit
491 integers which are then combined by the VM as one 16-bit integer.
492
493 @deffn @insn{} br offset
494 Jump to @var{offset}.
495 @end deffn
496
497 @deffn @insn{} br-if offset
498 Jump to @var{offset} if the condition on the stack is not false.
499 @end deffn
500
501 @deffn @insn{} br-if-not offset
502 Jump to @var{offset} if the condition on the stack is false.
503 @end deffn
504
505 @deffn @insn{} br-if-eq offset
506 Jump to @var{offset} if the two objects located on the stack are
507 equal in the sense of @var{eq?}. Note that, for this instruction, the
508 stack pointer is decremented by two Scheme objects instead of only
509 one.
510 @end deffn
511
512 @deffn @insn{} br-if-not-eq offset
513 Same as @var{br-if-eq} for non-equal objects.
514 @end deffn
515
516 @deffn @insn{} br-if-null offset
517 Jump to @var{offset} if the object on the stack is @code{'()}.
518 @end deffn
519
520 @deffn @insn{} br-if-not-null offset
521 Jump to @var{offset} if the object on the stack is not @code{'()}.
522 @end deffn
523
524
525 @node Subprogram Control Instructions, Data Control Instructions, Branch Instructions, Instruction Set
526 @section Subprogram Control Instructions
527
528 Programs (read: ``compiled procedure'') may refer to external
529 bindings, like variables or functions defined outside the program
530 itself, in the environment in which it will evaluate at run-time. In
531 a sense, a program's environment and its bindings are an implicit
532 parameter of every program.
533
534 @cindex object table
535 In order to handle such bindings, each program has an @dfn{object
536 table} associated to it. This table (actually a Scheme vector)
537 contains all constant objects referenced by the program. The object
538 table of a program is initialized right before a program is loaded
539 with @var{load-program}.
540
541 Variable objects are one such type of constant object: when a global
542 binding is defined, a variable object is associated to it and that
543 object will remain constant over time, even if the value bound to it
544 changes. Therefore, external bindings only need to be looked up once
545 when the program is loaded. References to the corresponding external
546 variables from within the program are then performed via the
547 @var{object-ref} instruction and are almost as fast as local variable
548 references.
549
550 Let us consider the following program (procedure) which references
551 external bindings @code{frob} and @var{%magic}:
552
553 @example
554 (lambda (x)
555 (frob x %magic))
556 @end example
557
558 This yields the following assembly code:
559
560 @example
561 (make-int8 64) ;; number of args, vars, etc. (see below)
562 (link "frob")
563 (link "%magic")
564 (vector 2) ;; object table (external bindings)
565 ...
566 (load-program #u8(20 0 23 21 0 20 1 23 36 2))
567 (return)
568 @end example
569
570 All the instructions occurring before @var{load-program} (some were
571 omitted for simplicity) form a @dfn{prologue} which, among other
572 things, pushed an object table (a vector) that contains the variable
573 objects for the variables bound to @var{frob} and @var{%magic}. This
574 vector and other data pushed onto the stack are then popped by the
575 @var{load-program} instruction.
576
577 Besides, the @var{load-program} instruction takes one explicit
578 argument which is the bytecode of the program itself. Disassembled,
579 this bytecode looks like:
580
581 @example
582 (object-ref 0) ;; push the variable object of `frob'
583 (variable-ref) ;; dereference it
584 (local-ref 0) ;; push the value of `x'
585 (object-ref 1) ;; push the variable object of `%magic'
586 (variable-ref) ;; dereference it
587 (tail-call 2) ;; call `frob' with two parameters
588 @end example
589
590 This clearly shows that there is little difference between references
591 to local variables and references to externally bound variables since
592 lookup of externally bound variables if performed only once before the
593 program is run.
594
595 @deffn @insn{} load-program bytecode
596 Load the program whose bytecode is @var{bytecode} (a u8vector), pop
597 its meta-information from the stack, and push a corresponding program
598 object onto the stack. The program's meta-information may consist of
599 (in the order in which it should be pushed onto the stack):
600
601 @itemize
602 @item optionally, a pair representing meta-data (see the
603 @var{program-meta} procedure); [FIXME: explain their meaning]
604 @item optionally, a vector which is the program's object table (a
605 program that does not reference external bindings does not need an
606 object table);
607 @item either one immediate integer or four immediate integers
608 representing respectively the number of arguments taken by the
609 function (@var{nargs}), the number of @dfn{rest arguments}
610 (@var{nrest}, 0 or 1), the number of local variables (@var{nlocs}) and
611 the number of external variables (@var{nexts}) (@pxref{Environment
612 Control Instructions}).
613 @end itemize
614
615 @end deffn
616
617 @deffn @insn{} object-ref offset
618 Push the variable object for the external variable located at
619 @var{offset} within the program's object table.
620 @end deffn
621
622 @deffn @insn{} return
623 Free the program's frame.
624 @end deffn
625
626 @deffn @insn{} call nargs
627 Call the procedure, continuation or program located at
628 @code{sp[-nargs]} with the @var{nargs} arguments located from
629 @code{sp[0]} to @code{sp[-nargs + 1]}. The
630 procedure/continuation/program and its arguments are dropped from the
631 stack and the result is pushed. When calling a program, the
632 @code{call} instruction reserves room for its local variables on the
633 stack, and initializes its list of closure variables and its vector of
634 externally bound variables.
635 @end deffn
636
637 @deffn @insn{} tail-call nargs
638 Same as @code{call} except that, for tail-recursive calls to a
639 program, the current stack frame is re-used, as required by RnRS.
640 This instruction is otherwise similar to @code{call}.
641 @end deffn
642
643
644 @node Data Control Instructions, , Subprogram Control Instructions, Instruction Set
645 @section Data Control Instructions
646
647 @deffn @insn{} make-int8 value
648 Push @var{value}, an 8-bit integer, onto the stack.
649 @end deffn
650
651 @deffn @insn{} make-int8:0
652 Push the immediate value @code{0} onto the stack.
653 @end deffn
654
655 @deffn @insn{} make-int8:1
656 Push the immediate value @code{1} onto the stack.
657 @end deffn
658
659 @deffn @insn{} make-false
660 Push @code{#f} onto the stack.
661 @end deffn
662
663 @deffn @insn{} make-true
664 Push @code{#t} onto the stack.
665 @end deffn
666
667 @itemize
668 @item %push
669 @item %pushi
670 @item %pushl, %pushl:0:0, %pushl:0:1, %pushl:0:2, %pushl:0:3
671 @item %pushe, %pushe:0:0, %pushe:0:1, %pushe:0:2, %pushe:0:3
672 @item %pusht
673 @end itemize
674
675 @itemize
676 @item %loadi
677 @item %loadl, %loadl:0:0, %loadl:0:1, %loadl:0:2, %loadl:0:3
678 @item %loade, %loade:0:0, %loade:0:1, %loade:0:2, %loade:0:3
679 @item %loadt
680 @end itemize
681
682 @itemize
683 @item %savei
684 @item %savel, %savel:0:0, %savel:0:1, %savel:0:2, %savel:0:3
685 @item %savee, %savee:0:0, %savee:0:1, %savee:0:2, %savee:0:3
686 @item %savet
687 @end itemize
688
689 @section Flow control instructions
690
691 @itemize
692 @item %br-if
693 @item %br-if-not
694 @item %jump
695 @end itemize
696
697 @section Function call instructions
698
699 @itemize
700 @item %func, %func0, %func1, %func2
701 @end itemize
702
703 @section Scheme built-in functions
704
705 @itemize
706 @item cons
707 @item car
708 @item cdr
709 @end itemize
710
711 @section Mathematical buitin functions
712
713 @itemize
714 @item 1+
715 @item 1-
716 @item add, add2
717 @item sub, sub2, minus
718 @item mul2
719 @item div2
720 @item lt2
721 @item gt2
722 @item le2
723 @item ge2
724 @item num-eq2
725 @end itemize
726
727
728 \f
729 @node The Compiler, Concept Index, Instruction Set, Top
730 @chapter The Compiler
731
732 This section describes Guile-VM's compiler and the compilation process
733 to produce bytecode executable by the VM itself (@pxref{Instruction
734 Set}).
735
736 @menu
737 * Overview::
738 * The Language Front-Ends::
739 * GHIL::
740 * GLIL::
741 * The Assembler::
742 @end menu
743
744 @node Overview, The Language Front-Ends, The Compiler, The Compiler
745 @section Overview
746
747 Compilation in Guile-VM is a three-stage process:
748
749 @cindex intermediate language
750 @cindex assembler
751 @cindex compiler
752 @cindex GHIL
753 @cindex GLIL
754 @cindex bytecode
755
756 @enumerate
757 @item the source programming language (e.g. R5RS Scheme) is read and
758 translated into GHIL, @dfn{Guile's High-Level Intermediate Language};
759 @item GHIL code is then translated into a lower-level intermediate
760 language call GLIL, @dfn{Guile's Low-Level Intermediate Language};
761 @item finally, GLIL is @dfn{assembled} into the VM's assembly language
762 (@pxref{Instruction Set}) and bytecode.
763 @end enumerate
764
765 The use of two separate intermediate languages eases the
766 implementation of front-ends since the gap between high-level
767 languages like Scheme and GHIL is relatively small.
768
769 @findex compile-file
770 @vindex guilec
771 From an end-user viewpoint, compiling a Guile program into bytecode
772 can be done either by using the @command{guilec} command-line tool, or
773 by using the @code{compile-file} procedure exported by the
774 @code{(system base compile)} module.
775
776
777 @node The Language Front-Ends, GHIL, Overview, The Compiler
778 @section The Language Front-Ends
779
780 Guile-VM comes with a number of @dfn{language front-ends}, that is,
781 code that can read a given high-level programming language like R5RS
782 Scheme, and translate it into a lower-level representation suitable to
783 the compiler.
784
785 Each language front-end provides a @dfn{specification} and a
786 @dfn{translator} to GHIL. Both of them come in the @code{language}
787 module hierarchy. As an example, the front-end for Scheme is located
788 in the @code{(language scheme spec)} and @code{(language scheme
789 translate)} modules. Language front-ends can then be retrieved using
790 the @code{lookup-language} procedure of the @code{(system base
791 language)} module.
792
793 @deftp @scmrec{} <language> name title version reader printer read-file expander translator evaluator environment
794 Denotes a language front-end specification a various methods used by
795 the compiler to handle source written in that language. Of particular
796 interest is the @code{translator} slot (@pxref{GHIL}).
797 @end deftp
798
799 @deffn @scmproc{} lookup-language symbol
800 Look for a language front-end named @var{symbol} and return the
801 @code{<language>} record describing it if found. If @var{symbol}
802 doesn't denote a language front-end, an error is raised. Note that
803 this procedure assumes that language @var{symbol} exists if there
804 exist a @code{(language @var{symbol} spec)} module.
805 @end deffn
806
807
808 @node GHIL, GLIL, The Language Front-Ends, The Compiler
809 @section Guile's High-Level Intermediate Language
810
811 GHIL has constructs almost equivalent to those found in Scheme.
812 However, unlike Scheme, it is meant to be read only by the compiler
813 itself. Therefore, a sequence of GHIL code is only a sequence of GHIL
814 @emph{objects} (records), as opposed to symbols, each of which
815 represents a particular language feature. These records are all
816 defined in the @code{(system il ghil)} module and are named
817 @code{<ghil-*>}.
818
819 Each GHIL record has at least two fields: one containing the
820 environment (Guile module) in which it is considered, and one
821 containing its location [FIXME: currently seems to be unused]. Below
822 is a list of the main GHIL object types and their fields:
823
824 @example
825 ;; Objects
826 (<ghil-void> env loc)
827 (<ghil-quote> env loc obj)
828 (<ghil-quasiquote> env loc exp)
829 (<ghil-unquote> env loc exp)
830 (<ghil-unquote-splicing> env loc exp)
831 ;; Variables
832 (<ghil-ref> env loc var)
833 (<ghil-set> env loc var val)
834 (<ghil-define> env loc var val)
835 ;; Controls
836 (<ghil-if> env loc test then else)
837 (<ghil-and> env loc exps)
838 (<ghil-or> env loc exps)
839 (<ghil-begin> env loc exps)
840 (<ghil-bind> env loc vars vals body)
841 (<ghil-lambda> env loc vars rest body)
842 (<ghil-call> env loc proc args)
843 (<ghil-inline> env loc inline args)
844 @end example
845
846 As can be seen from this examples, the constructs in GHIL are pretty
847 close to the fundamental primitives of Scheme.
848
849 It is the role of front-end language translators (@pxref{The Language
850 Front-Ends}) to produce a sequence of GHIL objects from the
851 human-readable, source programming language.
852
853 [FIXME: Describe more.]
854
855 @node GLIL, The Assembler, GHIL, The Compiler
856 @section Guile's Low-Level Intermediate Language
857
858 A GHIL instruction sequence can be compiled into GLIL using the
859 @code{compile} procedure exported by the @code{(system il compile)}
860 module. During this translation process, various optimizations may
861 also be performed.
862
863 The module @code{(system il glil)} defines record types representing
864 various low-level abstractions. Compared to GHIL, the flow control
865 primitives in GLIL are much more low-level: only @code{<glil-label>},
866 @code{<glil-branch>} and @code{<glil-call>} are available, no
867 @code{lambda}, @code{if}, etc.
868
869
870 @deffn @scmproc{} compile ghil environment . opts
871 Compile @var{ghil}, a GHIL instruction sequence, within
872 environment/module @var{environment}, and return the resulting GLIL
873 instruction sequence. The option list @var{opts} may be either the
874 empty list or a list containing the @code{:O} keyword in which case
875 @code{compile} will first go through an optimization stage of
876 @var{ghil}.
877 @end deffn
878
879 @deffn @scmproc{} pprint-glil glil . port
880 Print @var{glil}, a GLIL sequence instructions, in a human-readable
881 form. If @var{port} is passed, it will be used as the output port.
882 @end deffn
883
884
885 Let's consider the following Scheme expression:
886
887 @example
888 (lambda (x) (+ x 1))
889 @end example
890
891 The corresponding (unoptimized) GLIL code, as shown by
892 @code{pprint-glil}, looks like this:
893
894 @example
895 (@@asm (0 0 0 0)
896 (@@asm (1 0 0 0) ;; expect one arg.
897 (@@bind (x argument 0)) ;; debugging info
898 (module-ref #f +) ;; lookup `+'
899 (argument-ref 0) ;; push the argument onto
900 ;; the stack
901 (const 1) ;; push `1'
902 (tail-call 2) ;; call `+', with 2 args,
903 ;; using the same stack frame
904 (@@source 15 33)) ;; additional debugging info
905 (return 0))
906 @end example
907
908 This is not unlike the VM's assembly language described in
909 @ref{Instruction Set}.
910
911 @node The Assembler, , GLIL, The Compiler
912 @section The Assembler
913
914 @findex code->bytes
915
916 The final compilation step consists in converting the GLIL instruction
917 sequence into VM bytecode. This is what the @code{assemble} procedure
918 defined in the @code{(system vm assemble)} module is for. It relies
919 on the @code{code->bytes} procedure of the @code{(system vm conv)}
920 module to convert instructions (represented as lists whose @code{car}
921 is a symbol naming the instruction, e.g. @code{object-ref},
922 @pxref{Instruction Set}) into binary code, or @dfn{bytecode}.
923 Bytecode itself is represented using SRFI-4 byte vectors,
924 @inforef{SRFI-4, SRFI-4 homogeneous numeric vectors, guile}.
925
926
927 @deffn @scmproc{} assemble glil environment . opts
928 Return a binary representation of @var{glil} (bytecode), either in the
929 form of an SRFI-4 @code{u8vector} or a @code{<bytespec>} object.
930 [FIXME: Why is that?]
931 @end deffn
932
933
934
935 @c *********************************************************************
936 @node Concept Index, Function and Instruction Index, The Compiler, Top
937 @unnumbered Concept Index
938 @printindex cp
939
940 @node Function and Instruction Index, Command and Variable Index, Concept Index, Top
941 @unnumbered Function and Instruction Index
942 @printindex fn
943
944 @node Command and Variable Index, , Function and Instruction Index, Top
945 @unnumbered Command and Variable Index
946 @printindex vr
947
948 @bye
949
950 @c Local Variables:
951 @c ispell-local-dictionary: "american";
952 @c End:
953
954 @c LocalWords: bytecode