a0b7eacddb4dad9b0db67b6404ae6106c235ab22
[bpt/guile.git] / doc / guile-vm.texi
1 \input texinfo @c -*-texinfo-*-
2 @c %**start of header
3 @setfilename guile-vm.info
4 @settitle Guile VM Specification
5 @footnotestyle end
6 @setchapternewpage odd
7 @c %**end of header
8
9 @set EDITION 0.6
10 @set VERSION 0.6
11 @set UPDATED 2005-04-26
12
13 @c Macro for instruction definitions.
14 @macro insn{}
15 Instruction
16 @end macro
17
18 @c For Scheme procedure definitions.
19 @macro scmproc{}
20 Scheme Procedure
21 @end macro
22
23 @c Scheme records.
24 @macro scmrec{}
25 Record
26 @end macro
27
28 @ifinfo
29 @dircategory Scheme Programming
30 @direntry
31 * Guile VM: (guile-vm). Guile's Virtual Machine.
32 @end direntry
33
34 This file documents Guile VM.
35
36 Copyright @copyright{} 2000 Keisuke Nishida
37 Copyright @copyright{} 2005 Ludovic Court`es
38
39 Permission is granted to make and distribute verbatim copies of this
40 manual provided the copyright notice and this permission notice are
41 preserved on all copies.
42
43 @ignore
44 Permission is granted to process this file through TeX and print the
45 results, provided the printed document carries a copying permission
46 notice identical to this one except for the removal of this paragraph
47 (this paragraph not being relevant to the printed manual).
48
49 @end ignore
50 Permission is granted to copy and distribute modified versions of this
51 manual under the conditions for verbatim copying, provided that the
52 entire resulting derived work is distributed under the terms of a
53 permission notice identical to this one.
54
55 Permission is granted to copy and distribute translations of this manual
56 into another language, under the above conditions for modified versions,
57 except that this permission notice may be stated in a translation
58 approved by the Free Software Foundation.
59 @end ifinfo
60
61 @titlepage
62 @title Guile VM Specification
63 @subtitle for Guile VM @value{VERSION}
64 @author Keisuke Nishida
65
66 @page
67 @vskip 0pt plus 1filll
68 Edition @value{EDITION} @*
69 Updated for Guile VM @value{VERSION} @*
70 @value{UPDATED} @*
71
72 Copyright @copyright{} 2000 Keisuke Nishida
73 Copyright @copyright{} 2005 Ludovic Court`es
74
75 Permission is granted to make and distribute verbatim copies of this
76 manual provided the copyright notice and this permission notice are
77 preserved on all copies.
78
79 Permission is granted to copy and distribute modified versions of this
80 manual under the conditions for verbatim copying, provided that the
81 entire resulting derived work is distributed under the terms of a
82 permission notice identical to this one.
83
84 Permission is granted to copy and distribute translations of this manual
85 into another language, under the above conditions for modified versions,
86 except that this permission notice may be stated in a translation
87 approved by the Free Software Foundation.
88 @end titlepage
89
90 @contents
91
92 @c *********************************************************************
93 @node Top, Introduction, (dir), (dir)
94 @top Guile VM Specification
95
96 This document would like to correspond to Guile VM @value{VERSION}.
97 However, be warned that important parts still correspond to version
98 0.0 and are not valid anymore.
99
100 @menu
101 * Introduction::
102 * Variable Management::
103 * Instruction Set::
104 * The Compiler::
105
106 @detailmenu
107 --- The Detailed Node Listing ---
108
109 Instruction Set
110
111 * Environment Control Instructions::
112 * Branch Instructions::
113 * Subprogram Control Instructions::
114 * Data Control Instructions::
115
116 The Compiler
117
118 * Overview::
119 * The Language Front-Ends::
120 * GHIL::
121 * GLIL::
122 * The Assembler::
123
124 @end detailmenu
125 @end menu
126
127 @c *********************************************************************
128 @node Introduction, Variable Management, Top, Top
129 @chapter What is Guile VM?
130
131 A Guile VM has a set of registers and its own stack memory. Guile may
132 have more than one VM's. Each VM may execute at most one program at a
133 time. Guile VM is a CISC system so designed as to execute Scheme and
134 other languages efficiently.
135
136 @unnumberedsubsec Registers
137
138 @itemize
139 @item pc - Program counter ;; ip (instruction poiner) is better?
140 @item sp - Stack pointer
141 @item bp - Base pointer
142 @item ac - Accumulator
143 @end itemize
144
145 @unnumberedsubsec Engine
146
147 A VM may have one of three engines: reckless, regular, or debugging.
148 Reckless engine is fastest but dangerous. Regular engine is normally
149 fail-safe and reasonably fast. Debugging engine is safest and
150 functional but very slow.
151
152 @unnumberedsubsec Memory
153
154 Stack is the only memory that each VM owns. The other memory is shared
155 memory that is shared among every VM and other part of Guile.
156
157 @unnumberedsubsec Program
158
159 A VM program consists of a bytecode that is executed and an environment
160 in which execution is done. Each program is allocated in the shared
161 memory and may be executed by any VM. A program may call other programs
162 within a VM.
163
164 @unnumberedsubsec Instruction
165
166 Guile VM has dozens of system instructions and (possibly) hundreds of
167 functional instructions. Some Scheme procedures such as cons and car
168 are implemented as VM's builtin functions, which are very efficient.
169 Other procedures defined outside of the VM are also considered as VM's
170 functional features, since they do not change the state of VM.
171 Procedures defined within the VM are called subprograms.
172
173 Most instructions deal with the accumulator (ac). The VM stores all
174 results from functions in ac, instead of pushing them into the stack.
175 I'm not sure whether this is a good thing or not.
176
177 @node Variable Management, Instruction Set, Introduction, Top
178 @chapter Variable Management
179
180 FIXME: This chapter needs to be reviewed so that it matches reality.
181 A more up-to-date description of the mechanisms described in this
182 section is given in @ref{Instruction Set}.
183
184 A program may have access to local variables, external variables, and
185 top-level variables.
186
187 @section Local/external variables
188
189 A stack is logically divided into several blocks during execution. A
190 "block" is such a unit that maintains local variables and dynamic chain.
191 A "frame" is an upper level unit that maintains subprogram calls.
192
193 @example
194 Stack
195 dynamic | | | |
196 chain +==========+ - =
197 | |local vars| | |
198 `-|block data| | block |
199 /|frame data| | |
200 | +----------+ - |
201 | |local vars| | | frame
202 `-|block data| | |
203 /+----------+ - |
204 | |local vars| | |
205 `-|block data| | |
206 /+==========+ - =
207 | |local vars| | |
208 `-|block data| | |
209 /|frame data| | |
210 | +----------+ - |
211 | | | | |
212 @end example
213
214 The first block of each frame may look like this:
215
216 @example
217 Address Data
218 ------- ----
219 xxx0028 Local variable 2
220 xxx0024 Local variable 1
221 bp ->xxx0020 Local variable 0
222 xxx001c Local link (block data)
223 xxx0018 External link (block data)
224 xxx0014 Stack pointer (block data)
225 xxx0010 Return address (frame data)
226 xxx000c Parent program (frame data)
227 @end example
228
229 The base pointer (bp) always points to the lowest address of local
230 variables of the recent block. Local variables are referred as "bp[n]".
231 The local link field has a pointer to the dynamic parent of the block.
232 The parent's variables are referred as "bp[-1][n]", and grandparent's
233 are "bp[-1][-1][n]". Thus, any local variable is represented by its
234 depth and offset from the current bp.
235
236 A variable may be "external", which is allocated in the shared memory.
237 The external link field of a block has a pointer to such a variable set,
238 which I call "fragment" (what should I call?). A fragment has a set of
239 variables and its own chain.
240
241 @example
242 local external
243 chain| | chain
244 | +-----+ .--------, |
245 `-|block|--+->|external|-'
246 /+-----+ | `--------'\,
247 `-|block|--' |
248 /+-----+ .--------, |
249 `-|block|---->|external|-'
250 +-----+ `--------'
251 | |
252 @end example
253
254 An external variable is referred as "bp[-2]->variables[n]" or
255 "bp[-2]->link->...->variables[n]". This is also represented by a pair
256 of depth and offset. At any point of execution, the value of bp
257 determines the current local link and external link, and thus the
258 current environment of a program.
259
260 Other data fields are described later.
261
262 @section Top-level variables
263
264 Guile VM uses the same top-level variables as the regular Guile. A
265 program may have direct access to vcells. Currently this is done by
266 calling scm_intern0, but a program is possible to have any top-level
267 environment defined by the current module.
268
269 @section Scheme and VM variable
270
271 Let's think about the following Scheme code as an example:
272
273 @example
274 (define (foo a)
275 (lambda (b) (list foo a b)))
276 @end example
277
278 In the lambda expression, "foo" is a top-level variable, "a" is an
279 external variable, and "b" is a local variable.
280
281 When a VM executes foo, it allocates a block for "a". Since "a" may be
282 externally referred from the closure, the VM creates a fragment with a
283 copy of "a" in it. When the VM evaluates the lambda expression, it
284 creates a subprogram (closure), associating the fragment with the
285 subprogram as its external environment. When the closure is executed,
286 its environment will look like this:
287
288 @example
289 block Top-level: foo
290 +-------------+
291 |local var: b | fragment
292 +-------------+ .-----------,
293 |external link|---->|variable: a|
294 +-------------+ `-----------'
295 @end example
296
297 The fragment remains as long as the closure exists.
298
299 @section Addressing mode
300
301 Guile VM has five addressing modes:
302
303 @itemize
304 @item Real address
305 @item Local position
306 @item External position
307 @item Top-level location
308 @item Constant object
309 @end itemize
310
311 Real address points to the address in the real program and is only used
312 with the program counter (pc).
313
314 Local position and external position are represented as a pair of depth
315 and offset from bp, as described above. These are base relative
316 addresses, and the real address may vary during execution.
317
318 Top-level location is represented as a Guile's vcell. This location is
319 determined at loading time, so the use of this address is efficient.
320
321 Constant object is not an address but gives an instruction an Scheme
322 object directly.
323
324 [ We'll also need dynamic scope addressing to support Emacs Lisp? ]
325
326
327 Overall procedure:
328
329 @enumerate
330 @item A source program is compiled into a bytecode.
331 @item A bytecode is given an environment and becomes a program.
332 @item A VM starts execution, creating a frame for it.
333 @item Whenever a program calls a subprogram, a new frame is created for it.
334 @item When a program finishes execution, it returns a value, and the VM
335 continues execution of the parent program.
336 @item When all programs terminated, the VM returns the final value and stops.
337 @end enumerate
338
339 \f
340 @node Instruction Set, The Compiler, Variable Management, Top
341 @chapter Instruction Set
342
343 The Guile VM instruction set is roughly divided two groups: system
344 instructions and functional instructions. System instructions control
345 the execution of programs, while functional instructions provide many
346 useful calculations.
347
348 @menu
349 * Environment Control Instructions::
350 * Branch Instructions::
351 * Subprogram Control Instructions::
352 * Data Control Instructions::
353 @end menu
354
355 @node Environment Control Instructions, Branch Instructions, Instruction Set, Instruction Set
356 @section Environment Control Instructions
357
358 @deffn @insn{} link binding-name
359 Look up @var{binding-name} (a string) in the current environment and
360 push the corresponding variable object onto the stack. If
361 @var{binding-name} is not bound yet, then create a new binding and
362 push its variable object.
363 @end deffn
364
365 @deffn @insn{} variable-ref
366 Dereference the variable object which is on top of the stack and
367 replace it by the value of the variable it represents.
368 @end deffn
369
370 @deffn @insn{} variable-set
371 Set the value of the variable on top of the stack (at @code{sp[0]}) to
372 the object located immediately before (at @code{sp[-1]}).
373 @end deffn
374
375 As an example, let us look at what a simple function call looks like:
376
377 @example
378 (+ 2 3)
379 @end example
380
381 This call yields the following sequence of instructions:
382
383 @example
384 (link "+") ;; lookup binding "+"
385 (variable-ref) ;; dereference it
386 (make-int8 2) ;; push immediate value `2'
387 (make-int8 3) ;; push immediate value `3'
388 (tail-call 2) ;; call the proc at sp[-3] with two args
389 @end example
390
391 @deffn @insn{} local-ref offset
392 Push onto the stack the value of the local variable located at
393 @var{offset} within the current stack frame.
394 @end deffn
395
396 @deffn @insn{} local-set offset
397 Pop the Scheme object located on top of the stack and make it the new
398 value of the local variable located at @var{offset} within the current
399 stack frame.
400 @end deffn
401
402 @deffn @insn{} external-ref offset
403 Push the value of the closure variable located at position
404 @var{offset} within the program's list of external variables.
405 @end deffn
406
407 @deffn @insn{} external-set offset
408 Pop the Scheme object located on top of the stack and make it the new
409 value of the closure variable located at @var{offset} within the
410 program's list of external variables.
411 @end deffn
412
413 @deffn @insn{} make-closure
414 Pop the program object from the stack and assign it the current
415 closure variable list as its closure. Push the result program
416 object.
417 @end deffn
418
419 Let's illustrate this:
420
421 @example
422 (let ((x 2))
423 (lambda ()
424 (let ((x++ (+ 1 x)))
425 (set! x x++)
426 x++)))
427 @end example
428
429 The resulting program has one external (closure) variable, i.e. its
430 @var{nexts} is set to 1 (@pxref{Subprogram Control Instructions}).
431 This yields the following code:
432
433 @example
434 ;; the traditional program prologue with NLOCS = 0 and NEXTS = 1
435
436 0 (make-int8 2)
437 2 (external-set 0)
438 4 (make-int8 4)
439 6 (link "+") ;; lookup `+'
440 9 (vector 1) ;; create the external variable vector for
441 ;; later use by `object-ref' and `object-set'
442 ...
443 40 (load-program ##34#)
444 59 (make-closure) ;; assign the current closure to the program
445 ;; just pushed by `load-program'
446 60 (return)
447 @end example
448
449 The program loaded here by @var{load-program} contains the following
450 sequence of instructions:
451
452 @example
453 0 (object-ref 0) ;; push the variable for `+'
454 2 (variable-ref) ;; dereference `+'
455 3 (make-int8:1) ;; push 1
456 4 (external-ref 0) ;; push the value of `x'
457 6 (call 2) ;; call `+' and push the result
458 8 (local-set 0) ;; make it the new value of `x++'
459 10 (local-ref 0) ;; push the value of `x++'
460 12 (external-set 0) ;; make it the new value of `x'
461 14 (local-ref 0) ;; push the value of `x++'
462 16 (return) ;; return it
463 @end example
464
465 At this point, you should know pretty much everything about the three
466 types of variables a program may need to access.
467
468
469 @node Branch Instructions, Subprogram Control Instructions, Environment Control Instructions, Instruction Set
470 @section Branch Instructions
471
472 All the conditional branch instructions described below work in the
473 same way:
474
475 @itemize
476 @item They take the Scheme object located on the stack and use it as
477 the branch condition;
478 @item If the condition if false, then program execution continues with
479 the next instruction;
480 @item If the condition is true, then the instruction pointer is
481 increased by the offset passed as an argument to the branch
482 instruction;
483 @item Finally, when the instruction finished, the condition object is
484 removed from the stack.
485 @end itemize
486
487 Note that the offset passed to the instruction is encoded on two 8-bit
488 integers which are then combined by the VM as one 16-bit integer.
489
490 @deffn @insn{} br offset
491 Jump to @var{offset}.
492 @end deffn
493
494 @deffn @insn{} br-if offset
495 Jump to @var{offset} if the condition on the stack is not false.
496 @end deffn
497
498 @deffn @insn{} br-if-not offset
499 Jump to @var{offset} if the condition on the stack is false.
500 @end deffn
501
502 @deffn @insn{} br-if-eq offset
503 Jump to @var{offset} if the two objects located on the stack are
504 equal in the sense of @var{eq?}. Note that, for this instruction, the
505 stack pointer is decremented by two Scheme objects instead of only
506 one.
507 @end deffn
508
509 @deffn @insn{} br-if-not-eq offset
510 Same as @var{br-if-eq} for non-equal objects.
511 @end deffn
512
513 @deffn @insn{} br-if-null offset
514 Jump to @var{offset} if the object on the stack is @code{'()}.
515 @end deffn
516
517 @deffn @insn{} br-if-not-null offset
518 Jump to @var{offset} if the object on the stack is not @code{'()}.
519 @end deffn
520
521
522 @node Subprogram Control Instructions, Data Control Instructions, Branch Instructions, Instruction Set
523 @section Subprogram Control Instructions
524
525 Programs (read: ``compiled procedure'') may refer to external
526 bindings, like variables or functions defined outside the program
527 itself, in the environment in which it will evaluate at run-time. In
528 a sense, a program's environment and its bindings are an implicit
529 parameter of every program.
530
531 @cindex Object table
532 In order to handle such bindings, each program has an @dfn{object
533 table} associated to it. This table (actually a Scheme vector)
534 contains all constant objects referenced by the program. The object
535 table of a program is initialized right before a program is loaded
536 with @var{load-program}.
537
538 Variable objects are one such type of constant object: when a global
539 binding is defined, a variable object is associated to it and that
540 object will remain constant over time, even if the value bound to it
541 changes. Therefore, external bindings only need to be looked up once
542 when the program is loaded. References to the corresponding external
543 variables from within the program are then performed via the
544 @var{object-ref} instruction and are almost as fast as local variable
545 references.
546
547 Let us consider the following program (procedure) which references
548 external bindings @code{frob} and @var{%magic}:
549
550 @example
551 (lambda (x)
552 (frob x %magic))
553 @end example
554
555 This yields the following assembly code:
556
557 @example
558 (make-int8 64) ;; number of args, vars, etc. (see below)
559 (link "frob")
560 (link "%magic")
561 (vector 2) ;; object table (external bindings)
562 ...
563 (load-program #u8(20 0 23 21 0 20 1 23 36 2))
564 (return)
565 @end example
566
567 All the instructions occurring before @var{load-program} (some were
568 omitted for simplicity) form a @dfn{prologue} which, among other
569 things, pushed an object table (a vector) that contains the variable
570 objects for the variables bound to @var{frob} and @var{%magic}. This
571 vector and other data pushed onto the stack are then popped by the
572 @var{load-program} instruction.
573
574 Besides, the @var{load-program} instruction takes one explicit
575 argument which is the bytecode of the program itself. Disassembled,
576 this bytecode looks like:
577
578 @example
579 (object-ref 0) ;; push the variable object of `frob'
580 (variable-ref) ;; dereference it
581 (local-ref 0) ;; push the value of `x'
582 (object-ref 1) ;; push the variable object of `%magic'
583 (variable-ref) ;; dereference it
584 (tail-call 2) ;; call `frob' with two parameters
585 @end example
586
587 This clearly shows that there is little difference between references
588 to local variables and references to externally bound variables since
589 lookup of externally bound variables if performed only once before the
590 program is run.
591
592 @deffn @insn{} load-program bytecode
593 Load the program whose bytecode is @var{bytecode} (a u8vector), pop
594 its meta-information from the stack, and push a corresponding program
595 object onto the stack. The program's meta-information may consist of
596 (in the order in which it should be pushed onto the stack):
597
598 @itemize
599 @item optionally, a pair representing meta-data (see the
600 @var{program-meta} procedure); [FIXME: explain their meaning]
601 @item optionally, a vector which is the program's object table (a
602 program that does not reference external bindings does not need an
603 object table);
604 @item either one immediate integer or four immediate integers
605 representing respectively the number of arguments taken by the
606 function (@var{nargs}), the number of @dfn{rest arguments}
607 (@var{nrest}, 0 or 1), the number of local variables (@var{nlocs}) and
608 the number of external variables (@var{nexts}) (@pxref{Environment
609 Control Instructions}).
610 @end itemize
611
612 @end deffn
613
614 @deffn @insn{} object-ref offset
615 Push the variable object for the external variable located at
616 @var{offset} within the program's object table.
617 @end deffn
618
619 @deffn @insn{} return
620 Free the program's frame.
621 @end deffn
622
623 @deffn @insn{} call nargs
624 Call the procedure, continuation or program located at
625 @code{sp[-nargs]} with the @var{nargs} arguments located from
626 @code{sp[0]} to @code{sp[-nargs + 1]}. The
627 procedure/continuation/program and its arguments are dropped from the
628 stack and the result is pushed. When calling a program, the
629 @code{call} instruction reserves room for its local variables on the
630 stack, and initializes its list of closure variables and its vector of
631 externally bound variables.
632 @end deffn
633
634 @deffn @insn{} tail-call nargs
635 Same as @code{call} except that, for tail-recursive calls to a
636 program, the current stack frame is re-used, as required by RnRS.
637 This instruction is otherwise similar to @code{call}.
638 @end deffn
639
640
641 @node Data Control Instructions, , Subprogram Control Instructions, Instruction Set
642 @section Data Control Instructions
643
644 @deffn @insn{} make-int8 value
645 Push @var{value}, an 8-bit integer, onto the stack.
646 @end deffn
647
648 @deffn @insn{} make-int8:0
649 Push the immediate value @code{0} onto the stack.
650 @end deffn
651
652 @deffn @insn{} make-int8:1
653 Push the immediate value @code{1} onto the stack.
654 @end deffn
655
656 @deffn @insn{} make-false
657 Push @code{#f} onto the stack.
658 @end deffn
659
660 @deffn @insn{} make-true
661 Push @code{#t} onto the stack.
662 @end deffn
663
664 @itemize
665 @item %push
666 @item %pushi
667 @item %pushl, %pushl:0:0, %pushl:0:1, %pushl:0:2, %pushl:0:3
668 @item %pushe, %pushe:0:0, %pushe:0:1, %pushe:0:2, %pushe:0:3
669 @item %pusht
670 @end itemize
671
672 @itemize
673 @item %loadi
674 @item %loadl, %loadl:0:0, %loadl:0:1, %loadl:0:2, %loadl:0:3
675 @item %loade, %loade:0:0, %loade:0:1, %loade:0:2, %loade:0:3
676 @item %loadt
677 @end itemize
678
679 @itemize
680 @item %savei
681 @item %savel, %savel:0:0, %savel:0:1, %savel:0:2, %savel:0:3
682 @item %savee, %savee:0:0, %savee:0:1, %savee:0:2, %savee:0:3
683 @item %savet
684 @end itemize
685
686 @section Flow control instructions
687
688 @itemize
689 @item %br-if
690 @item %br-if-not
691 @item %jump
692 @end itemize
693
694 @section Function call instructions
695
696 @itemize
697 @item %func, %func0, %func1, %func2
698 @end itemize
699
700 @section Scheme built-in functions
701
702 @itemize
703 @item cons
704 @item car
705 @item cdr
706 @end itemize
707
708 @section Mathematical buitin functions
709
710 @itemize
711 @item 1+
712 @item 1-
713 @item add, add2
714 @item sub, sub2, minus
715 @item mul2
716 @item div2
717 @item lt2
718 @item gt2
719 @item le2
720 @item ge2
721 @item num-eq2
722 @end itemize
723
724
725 \f
726 @node The Compiler, , Instruction Set, Top
727 @chapter The Compiler
728
729 This section describes Guile-VM's compiler and the compilation process
730 to produce bytecode executable by the VM itself (@pxref{Instruction
731 Set}).
732
733 @menu
734 * Overview::
735 * The Language Front-Ends::
736 * GHIL::
737 * GLIL::
738 * The Assembler::
739 @end menu
740
741 @node Overview, The Language Front-Ends, The Compiler, The Compiler
742 @section Overview
743
744 Compilation in Guile-VM is a three-stage process:
745
746 @enumerate
747 @item the source programming language (e.g. R5RS Scheme) is read and
748 translated into GHIL, @dfn{Guile's High-Level Intermediate Language};
749 @item GHIL code is then translated into a lower-level intermediate
750 language call GLIL, @dfn{Guile's Low-Level Intermediate Language};
751 @item finally, GLIL is @dfn{assembled} into the VM's assembly language
752 (@pxref{Instruction Set}) and bytecode.
753 @end enumerate
754
755 The use of two separate intermediate languages eases the
756 implementation of front-ends since the gap between high-level
757 languages like Scheme and GHIL is relatively small.
758
759 From an end-user viewpoint, compiling a Guile program into bytecode
760 can be done either by using the @command{guilec} command-line tool, or
761 by using the @code{compile-file} procedure exported by the
762 @code{(system base compile)} module.
763
764
765 @node The Language Front-Ends, GHIL, Overview, The Compiler
766 @section The Language Front-Ends
767
768 Guile-VM comes with a number of @dfn{language front-ends}, that is,
769 code that can read a given high-level programming language like R5RS
770 Scheme, and translate it into a lower-level representation suitable to
771 the compiler.
772
773 Each language front-end provides a @dfn{specification} and a
774 @dfn{translator} to GHIL. Both of them come in the @code{language}
775 module hierarchy. As an example, the front-end for Scheme is located
776 in the @code{(language scheme spec)} and @code{(language scheme
777 translate)} modules. Language front-ends can then be retrieved using
778 the @code{lookup-language} procedure of the @code{(system base
779 language)} module.
780
781 @deftp @scmrec{} <language> name title version reader printer read-file expander translator evaluator environment
782 Denotes a language front-end specification a various methods used by
783 the compiler to handle source written in that language. Of particular
784 interest is the @code{translator} slot (@pxref{GHIL}).
785 @end deftp
786
787 @deffn @scmproc{} lookup-language symbol
788 Look for a language front-end named @var{symbol} and return the
789 @code{<language>} record describing it if found. If @var{symbol}
790 doesn't denote a language front-end, an error is raised. Note that
791 this procedure assumes that language @var{symbol} exists if there
792 exist a @code{(language @var{symbol} spec)} module.
793 @end deffn
794
795
796 @node GHIL, GLIL, The Language Front-Ends, The Compiler
797 @section Guile's High-Level Intermediate Language
798
799 GHIL has constructs almost equivalent to those found in Scheme.
800 However, unlike Scheme, it is meant to be read only by the compiler
801 itself. Therefore, a sequence of GHIL code is only a sequence of GHIL
802 @emph{objects} (records), as opposed to symbols, each of which
803 represents a particular language feature. These records are all
804 defined in the @code{(system il ghil)} module and are named
805 @code{<ghil-*>}.
806
807 Each GHIL record has at least two fields: one containing the
808 environment (Guile module) in which it is considered, and one
809 containing its location [FIXME: currently seems to be unused]. Below
810 is a list of the main GHIL object types and their fields:
811
812 @example
813 ;; Objects
814 (<ghil-void> env loc)
815 (<ghil-quote> env loc obj)
816 (<ghil-quasiquote> env loc exp)
817 (<ghil-unquote> env loc exp)
818 (<ghil-unquote-splicing> env loc exp)
819 ;; Variables
820 (<ghil-ref> env loc var)
821 (<ghil-set> env loc var val)
822 (<ghil-define> env loc var val)
823 ;; Controls
824 (<ghil-if> env loc test then else)
825 (<ghil-and> env loc exps)
826 (<ghil-or> env loc exps)
827 (<ghil-begin> env loc exps)
828 (<ghil-bind> env loc vars vals body)
829 (<ghil-lambda> env loc vars rest body)
830 (<ghil-call> env loc proc args)
831 (<ghil-inline> env loc inline args)
832 @end example
833
834 As can be seen from this examples, the constructs in GHIL are pretty
835 close to the fundamental primitives of Scheme.
836
837 It is the role of front-end language translators (@pxref{The Language
838 Front-Ends}) to produce a sequence of GHIL objects from the
839 human-readable, source programming language.
840
841 [FIXME: Describe more.]
842
843 @node GLIL, The Assembler, GHIL, The Compiler
844 @section Guile's Low-Level Intermediate Language
845
846 A GHIL instruction sequence can be compiled into GLIL using the
847 @code{compile} procedure exported by the @code{(system il compile)}
848 module. During this translation process, various optimizations may
849 also be performed.
850
851 The module @code{(system il glil)} defines record types representing
852 various low-level abstractions. Compared to GHIL, the flow control
853 primitives in GLIL are much more low-level: only @code{<glil-label>},
854 @code{<glil-branch>} and @code{<glil-call>} are available, no
855 @code{lambda}, @code{if}, etc.
856
857
858 @deffn @scmproc{} compile ghil environment . opts
859 Compile @var{ghil}, a GHIL instruction sequence, within
860 environment/module @var{environment}, and return the resulting GLIL
861 instruction sequence. The option list @var{opts} may be either the
862 empty list or a list containing the @code{:O} keyword in which case
863 @code{compile} will first go through an optimization stage of
864 @var{ghil}.
865 @end deffn
866
867 @deffn @scmproc{} pprint-glil glil . port
868 Print @var{glil}, a GLIL sequence instructions, in a human-readable
869 form. If @var{port} is passed, it will be used as the output port.
870 @end deffn
871
872
873 Let's consider the following Scheme expression:
874
875 @example
876 (lambda (x) (+ x 1))
877 @end example
878
879 The corresponding (unoptimized) GLIL code, as shown by
880 @code{pprint-glil}, looks like this:
881
882 @example
883 (@@asm (0 0 0 0)
884 (@@asm (1 0 0 0) ;; expect one arg.
885 (@@bind (x argument 0)) ;; debugging info
886 (module-ref #f +) ;; lookup `+'
887 (argument-ref 0) ;; push the argument onto
888 ;; the stack
889 (const 1) ;; push `1'
890 (tail-call 2) ;; call `+', with 2 args,
891 ;; using the same stack frame
892 (@@source 15 33)) ;; additional debugging info
893 (return 0))
894 @end example
895
896 This is not unlike the VM's assembly language described in
897 @ref{Instruction Set}.
898
899 @node The Assembler, , GLIL, The Compiler
900 @section The Assembler
901
902 The final compilation step consists in converting the GLIL instruction
903 sequence into VM bytecode. This is what the @code{assemble} procedure
904 defined in the @code{(system vm assemble)} module is for. It relies
905 on the @code{code->bytes} procedure of the @code{(system vm conv)}
906 module to convert instructions (represented as lists whose @code{car}
907 is a symbol naming the instruction, e.g. @code{object-ref},
908 @pxref{Instruction Set}) into binary code, or @dfn{bytecode}.
909 Bytecode itself is represented using SRFI-4 byte vectors,
910 @inforef{SRFI-4, SRFI-4 homogeneous numeric vectors, guile}.
911
912
913 @deffn @scmproc{} assemble glil environment . opts
914 Return a binary representation of @var{glil} (bytecode), either in the
915 form of an SRFI-4 @code{u8vector} or a @code{<bytespec>} object.
916 [FIXME: Why is that?]
917 @end deffn
918
919
920 \f
921 @c *********************************************************************
922 @c @node Concept Index, Command Index, Related Information, Top
923 @c @unnumbered Concept Index
924 @c @printindex cp
925
926 @c @node Command Index, Variable Index, Concept Index, Top
927 @c @unnumbered Command Index
928 @c @printindex fn
929
930 @c @node Variable Index, , Command Index, Top
931 @c @unnumbered Variable Index
932 @c @printindex vr
933
934 @bye
935 \f
936 @c Local Variables:
937 @c mode:outline-minor
938 @c outline-regexp:"@\\(ch\\|sec\\|subs\\)"
939 @c End:
940
941 @c LocalWords: bytecode