Add rudimentary ECMAScript tests.
[bpt/guile.git] / module / language / ecmascript / parse-lalr.scm
CommitLineData
5dcd3e48
AW
1;; (language ecmascript parse-lalr) -- yacc's parser generator, in Guile
2;; Copyright (C) 1984,1989,1990 Free Software Foundation, Inc.
3;; Copyright (C) 1996-2002 Dominique Boucher
4
53befeb7
NJ
5;;;; This library is free software; you can redistribute it and/or
6;;;; modify it under the terms of the GNU Lesser General Public
7;;;; License as published by the Free Software Foundation; either
8;;;; version 3 of the License, or (at your option) any later version.
9;;;;
10;;;; This library is distributed in the hope that it will be useful,
11;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
12;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
13;;;; Lesser General Public License for more details.
14;;;;
15;;;; You should have received a copy of the GNU Lesser General Public
16;;;; License along with this library; if not, write to the Free Software
17;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
5dcd3e48
AW
18
19
20;; ---------------------------------------------------------------------- ;;
21#!
22;;; Commentary:
23This file contains yet another LALR(1) parser generator written in
24Scheme. In contrast to other such parser generators, this one
25implements a more efficient algorithm for computing the lookahead sets.
26The algorithm is the same as used in Bison (GNU yacc) and is described
27in the following paper:
28
29"Efficient Computation of LALR(1) Look-Ahead Set", F. DeRemer and
30T. Pennello, TOPLAS, vol. 4, no. 4, october 1982.
31
32As a consequence, it is not written in a fully functional style.
33In fact, much of the code is a direct translation from C to Scheme
34of the Bison sources.
35
36@section Defining a parser
37
38The module @code{(language ecmascript parse-lalr)} declares a macro
39called @code{lalr-parser}:
40
41@lisp
42 (lalr-parser tokens rules ...)
43@end lisp
44
45This macro, when given appropriate arguments, generates an LALR(1)
46syntax analyzer. The macro accepts at least two arguments. The first
47is a list of symbols which represent the terminal symbols of the
48grammar. The remaining arguments are the grammar production rules.
49
50@section Running the parser
51
52The parser generated by the @code{lalr-parser} macro is a function that
53takes two parameters. The first parameter is a lexical analyzer while
54the second is an error procedure.
55
56The lexical analyzer is zero-argument function (a thunk)
57invoked each time the parser needs to look-ahead in the token stream.
58A token is usually a pair whose @code{car} is the symbol corresponding to
59the token (the same symbol as used in the grammar definition). The
60@code{cdr} of the pair is the semantic value associated with the token. For
61example, a string token would have the @code{car} set to @code{'string}
62while the @code{cdr} is set to the string value @code{"hello"}.
63
64Once the end of file is encountered, the lexical analyzer must always
65return the symbol @code{'*eoi*} each time it is invoked.
66
67The error procedure must be a function that accepts at least two
68parameters.
69
70@section The grammar format
71
72The grammar is specified by first giving the list of terminals and the
73list of non-terminal definitions. Each non-terminal definition
74is a list where the first element is the non-terminal and the other
75elements are the right-hand sides (lists of grammar symbols). In
76addition to this, each rhs can be followed by a semantic action.
77
78For example, consider the following (yacc) grammar for a very simple
79expression language:
80@example
81 e : e '+' t
82 | e '-' t
83 | t
84 ;
85 t : t '*' f
86 : t '/' f
87 | f
88 ;
89 f : ID
90 ;
91@end example
92The same grammar, written for the scheme parser generator, would look
93like this (with semantic actions)
94@lisp
95(define expr-parser
96 (lalr-parser
97 ; Terminal symbols
98 (ID + - * /)
99 ; Productions
100 (e (e + t) -> (+ $1 $3)
101 (e - t) -> (- $1 $3)
102 (t) -> $1)
103 (t (t * f) -> (* $1 $3)
104 (t / f) -> (/ $1 $3)
105 (f) -> $1)
106 (f (ID) -> $1)))
107@end lisp
108In semantic actions, the symbol @code{$n} refers to the synthesized
109attribute value of the nth symbol in the production. The value
110associated with the non-terminal on the left is the result of
111evaluating the semantic action (it defaults to @code{#f}).
112
113The above grammar implicitly handles operator precedences. It is also
114possible to explicitly assign precedences and associativity to
115terminal symbols and productions a la Yacc. Here is a modified
116(and augmented) version of the grammar:
117@lisp
118(define expr-parser
119 (lalr-parser
120 ; Terminal symbols
121 (ID
122 (left: + -)
123 (left: * /)
124 (nonassoc: uminus))
125 (e (e + e) -> (+ $1 $3)
126 (e - e) -> (- $1 $3)
127 (e * e) -> (* $1 $3)
128 (e / e) -> (/ $1 $3)
129 (- e (prec: uminus)) -> (- $2)
130 (ID) -> $1)))
131@end lisp
132The @code{left:} directive is used to specify a set of left-associative
133operators of the same precedence level, the @code{right:} directive for
134right-associative operators, and @code{nonassoc:} for operators that
135are not associative. Note the use of the (apparently) useless
136terminal @code{uminus}. It is only defined in order to assign to the
137penultimate rule a precedence level higher than that of @code{*} and
138@code{/}. The @code{prec:} directive can only appear as the last element of a
139rule. Finally, note that precedence levels are incremented from
140left to right, i.e. the precedence level of @code{+} and @code{-} is less
141than the precedence level of @code{*} and @code{/} since the formers appear
142first in the list of terminal symbols (token definitions).
143
144@section A final note on conflict resolution
145
146Conflicts in the grammar are handled in a conventional way.
147In the absence of precedence directives,
148Shift/Reduce conflicts are resolved by shifting, and Reduce/Reduce
149conflicts are resolved by choosing the rule listed first in the
150grammar definition.
151
152You can print the states of the generated parser by evaluating
153@code{(print-states)}. The format of the output is similar to the one
154produced by bison when given the -v command-line option.
155;;; Code:
156!#
157
158;;; ---------- SYSTEM DEPENDENT SECTION -----------------
159;; put in a module by Richard Todd
160(define-module (language ecmascript parse-lalr)
161 #:export (lalr-parser
162 print-states))
163
164;; this code is by Thien-Thi Nguyen, found in a google search
165(begin
166 (defmacro def-macro (form . body)
167 `(defmacro ,(car form) ,(cdr form) ,@body))
168 (def-macro (BITS-PER-WORD) 28)
169 (def-macro (lalr-error msg obj) `(throw 'lalr-error ,msg ,obj))
170 (def-macro (logical-or x . y) `(logior ,x ,@y)))
171
172;;; ---------- END OF SYSTEM DEPENDENT SECTION ------------
173
174;; - Macros pour la gestion des vecteurs de bits
175
176(def-macro (set-bit v b)
177 `(let ((x (quotient ,b (BITS-PER-WORD)))
178 (y (expt 2 (remainder ,b (BITS-PER-WORD)))))
179 (vector-set! ,v x (logical-or (vector-ref ,v x) y))))
180
181(def-macro (bit-union v1 v2 n)
182 `(do ((i 0 (+ i 1)))
183 ((= i ,n))
184 (vector-set! ,v1 i (logical-or (vector-ref ,v1 i)
185 (vector-ref ,v2 i)))))
186
187;; - Macro pour les structures de donnees
188
189(def-macro (new-core) `(make-vector 4 0))
190(def-macro (set-core-number! c n) `(vector-set! ,c 0 ,n))
191(def-macro (set-core-acc-sym! c s) `(vector-set! ,c 1 ,s))
192(def-macro (set-core-nitems! c n) `(vector-set! ,c 2 ,n))
193(def-macro (set-core-items! c i) `(vector-set! ,c 3 ,i))
194(def-macro (core-number c) `(vector-ref ,c 0))
195(def-macro (core-acc-sym c) `(vector-ref ,c 1))
196(def-macro (core-nitems c) `(vector-ref ,c 2))
197(def-macro (core-items c) `(vector-ref ,c 3))
198
199(def-macro (new-shift) `(make-vector 3 0))
200(def-macro (set-shift-number! c x) `(vector-set! ,c 0 ,x))
201(def-macro (set-shift-nshifts! c x) `(vector-set! ,c 1 ,x))
202(def-macro (set-shift-shifts! c x) `(vector-set! ,c 2 ,x))
203(def-macro (shift-number s) `(vector-ref ,s 0))
204(def-macro (shift-nshifts s) `(vector-ref ,s 1))
205(def-macro (shift-shifts s) `(vector-ref ,s 2))
206
207(def-macro (new-red) `(make-vector 3 0))
208(def-macro (set-red-number! c x) `(vector-set! ,c 0 ,x))
209(def-macro (set-red-nreds! c x) `(vector-set! ,c 1 ,x))
210(def-macro (set-red-rules! c x) `(vector-set! ,c 2 ,x))
211(def-macro (red-number c) `(vector-ref ,c 0))
212(def-macro (red-nreds c) `(vector-ref ,c 1))
213(def-macro (red-rules c) `(vector-ref ,c 2))
214
215
216
217(def-macro (new-set nelem)
218 `(make-vector ,nelem 0))
219
220
221(def-macro (vector-map f v)
222 `(let ((vm-n (- (vector-length ,v) 1)))
223 (let loop ((vm-low 0) (vm-high vm-n))
224 (if (= vm-low vm-high)
225 (vector-set! ,v vm-low (,f (vector-ref ,v vm-low) vm-low))
226 (let ((vm-middle (quotient (+ vm-low vm-high) 2)))
227 (loop vm-low vm-middle)
228 (loop (+ vm-middle 1) vm-high))))))
229
230
231;; - Constantes
232(define STATE-TABLE-SIZE 1009)
233
234
235;; - Tableaux
236(define rrhs #f)
237(define rlhs #f)
238(define ritem #f)
239(define nullable #f)
240(define derives #f)
241(define fderives #f)
242(define firsts #f)
243(define kernel-base #f)
244(define kernel-end #f)
245(define shift-symbol #f)
246(define shift-set #f)
247(define red-set #f)
248(define state-table #f)
249(define acces-symbol #f)
250(define reduction-table #f)
251(define shift-table #f)
252(define consistent #f)
253(define lookaheads #f)
254(define LA #f)
255(define LAruleno #f)
256(define lookback #f)
257(define goto-map #f)
258(define from-state #f)
259(define to-state #f)
260(define includes #f)
261(define F #f)
262(define action-table #f)
263
264;; - Variables
265(define nitems #f)
266(define nrules #f)
267(define nvars #f)
268(define nterms #f)
269(define nsyms #f)
270(define nstates #f)
271(define first-state #f)
272(define last-state #f)
273(define final-state #f)
274(define first-shift #f)
275(define last-shift #f)
276(define first-reduction #f)
277(define last-reduction #f)
278(define nshifts #f)
279(define maxrhs #f)
280(define ngotos #f)
281(define token-set-size #f)
282
283(define (gen-tables! tokens gram)
284 (initialize-all)
285 (rewrite-grammar
286 tokens
287 gram
288 (lambda (terms terms/prec vars gram gram/actions)
289 (set! the-terminals/prec (list->vector terms/prec))
290 (set! the-terminals (list->vector terms))
291 (set! the-nonterminals (list->vector vars))
292 (set! nterms (length terms))
293 (set! nvars (length vars))
294 (set! nsyms (+ nterms nvars))
295 (let ((no-of-rules (length gram/actions))
296 (no-of-items (let loop ((l gram/actions) (count 0))
297 (if (null? l)
298 count
299 (loop (cdr l) (+ count (length (caar l))))))))
300 (pack-grammar no-of-rules no-of-items gram)
301 (set-derives)
302 (set-nullable)
303 (generate-states)
304 (lalr)
305 (build-tables)
306 (compact-action-table terms)
307 gram/actions))))
308
309
310(define (initialize-all)
311 (set! rrhs #f)
312 (set! rlhs #f)
313 (set! ritem #f)
314 (set! nullable #f)
315 (set! derives #f)
316 (set! fderives #f)
317 (set! firsts #f)
318 (set! kernel-base #f)
319 (set! kernel-end #f)
320 (set! shift-symbol #f)
321 (set! shift-set #f)
322 (set! red-set #f)
323 (set! state-table (make-vector STATE-TABLE-SIZE '()))
324 (set! acces-symbol #f)
325 (set! reduction-table #f)
326 (set! shift-table #f)
327 (set! consistent #f)
328 (set! lookaheads #f)
329 (set! LA #f)
330 (set! LAruleno #f)
331 (set! lookback #f)
332 (set! goto-map #f)
333 (set! from-state #f)
334 (set! to-state #f)
335 (set! includes #f)
336 (set! F #f)
337 (set! action-table #f)
338 (set! nstates #f)
339 (set! first-state #f)
340 (set! last-state #f)
341 (set! final-state #f)
342 (set! first-shift #f)
343 (set! last-shift #f)
344 (set! first-reduction #f)
345 (set! last-reduction #f)
346 (set! nshifts #f)
347 (set! maxrhs #f)
348 (set! ngotos #f)
349 (set! token-set-size #f)
350 (set! rule-precedences '()))
351
352
353(define (pack-grammar no-of-rules no-of-items gram)
354 (set! nrules (+ no-of-rules 1))
355 (set! nitems no-of-items)
356 (set! rlhs (make-vector nrules #f))
357 (set! rrhs (make-vector nrules #f))
358 (set! ritem (make-vector (+ 1 nitems) #f))
359
360 (let loop ((p gram) (item-no 0) (rule-no 1))
361 (if (not (null? p))
362 (let ((nt (caar p)))
363 (let loop2 ((prods (cdar p)) (it-no2 item-no) (rl-no2 rule-no))
364 (if (null? prods)
365 (loop (cdr p) it-no2 rl-no2)
366 (begin
367 (vector-set! rlhs rl-no2 nt)
368 (vector-set! rrhs rl-no2 it-no2)
369 (let loop3 ((rhs (car prods)) (it-no3 it-no2))
370 (if (null? rhs)
371 (begin
372 (vector-set! ritem it-no3 (- rl-no2))
373 (loop2 (cdr prods) (+ it-no3 1) (+ rl-no2 1)))
374 (begin
375 (vector-set! ritem it-no3 (car rhs))
376 (loop3 (cdr rhs) (+ it-no3 1))))))))))))
377
378
379;; Fonction set-derives
380;; --------------------
381(define (set-derives)
382 (define delts (make-vector (+ nrules 1) 0))
383 (define dset (make-vector nvars -1))
384
385 (let loop ((i 1) (j 0)) ; i = 0
386 (if (< i nrules)
387 (let ((lhs (vector-ref rlhs i)))
388 (if (>= lhs 0)
389 (begin
390 (vector-set! delts j (cons i (vector-ref dset lhs)))
391 (vector-set! dset lhs j)
392 (loop (+ i 1) (+ j 1)))
393 (loop (+ i 1) j)))))
394
395 (set! derives (make-vector nvars 0))
396
397 (let loop ((i 0))
398 (if (< i nvars)
399 (let ((q (let loop2 ((j (vector-ref dset i)) (s '()))
400 (if (< j 0)
401 s
402 (let ((x (vector-ref delts j)))
403 (loop2 (cdr x) (cons (car x) s)))))))
404 (vector-set! derives i q)
405 (loop (+ i 1))))))
406
407
408
409(define (set-nullable)
410 (set! nullable (make-vector nvars #f))
411 (let ((squeue (make-vector nvars #f))
412 (rcount (make-vector (+ nrules 1) 0))
413 (rsets (make-vector nvars #f))
414 (relts (make-vector (+ nitems nvars 1) #f)))
415 (let loop ((r 0) (s2 0) (p 0))
416 (let ((*r (vector-ref ritem r)))
417 (if *r
418 (if (< *r 0)
419 (let ((symbol (vector-ref rlhs (- *r))))
420 (if (and (>= symbol 0)
421 (not (vector-ref nullable symbol)))
422 (begin
423 (vector-set! nullable symbol #t)
424 (vector-set! squeue s2 symbol)
425 (loop (+ r 1) (+ s2 1) p))))
426 (let loop2 ((r1 r) (any-tokens #f))
427 (let* ((symbol (vector-ref ritem r1)))
428 (if (> symbol 0)
429 (loop2 (+ r1 1) (or any-tokens (>= symbol nvars)))
430 (if (not any-tokens)
431 (let ((ruleno (- symbol)))
432 (let loop3 ((r2 r) (p2 p))
433 (let ((symbol (vector-ref ritem r2)))
434 (if (> symbol 0)
435 (begin
436 (vector-set! rcount ruleno
437 (+ (vector-ref rcount ruleno) 1))
438 (vector-set! relts p2
439 (cons (vector-ref rsets symbol)
440 ruleno))
441 (vector-set! rsets symbol p2)
442 (loop3 (+ r2 1) (+ p2 1)))
443 (loop (+ r2 1) s2 p2)))))
444 (loop (+ r1 1) s2 p))))))
445 (let loop ((s1 0) (s3 s2))
446 (if (< s1 s3)
447 (let loop2 ((p (vector-ref rsets (vector-ref squeue s1))) (s4 s3))
448 (if p
449 (let* ((x (vector-ref relts p))
450 (ruleno (cdr x))
451 (y (- (vector-ref rcount ruleno) 1)))
452 (vector-set! rcount ruleno y)
453 (if (= y 0)
454 (let ((symbol (vector-ref rlhs ruleno)))
455 (if (and (>= symbol 0)
456 (not (vector-ref nullable symbol)))
457 (begin
458 (vector-set! nullable symbol #t)
459 (vector-set! squeue s4 symbol)
460 (loop2 (car x) (+ s4 1)))
461 (loop2 (car x) s4)))
462 (loop2 (car x) s4))))
463 (loop (+ s1 1) s4)))))))))
464
465
466
467; Fonction set-firsts qui calcule un tableau de taille
468; nvars et qui donne, pour chaque non-terminal X, une liste des
469; non-terminaux pouvant apparaitre au debut d'une derivation a
470; partir de X.
471
472(define (set-firsts)
473 (set! firsts (make-vector nvars '()))
474
475 ;; -- initialization
476 (let loop ((i 0))
477 (if (< i nvars)
478 (let loop2 ((sp (vector-ref derives i)))
479 (if (null? sp)
480 (loop (+ i 1))
481 (let ((sym (vector-ref ritem (vector-ref rrhs (car sp)))))
482 (if (< -1 sym nvars)
483 (vector-set! firsts i (sinsert sym (vector-ref firsts i))))
484 (loop2 (cdr sp)))))))
485
486 ;; -- reflexive and transitive closure
487 (let loop ((continue #t))
488 (if continue
489 (let loop2 ((i 0) (cont #f))
490 (if (>= i nvars)
491 (loop cont)
492 (let* ((x (vector-ref firsts i))
493 (y (let loop3 ((l x) (z x))
494 (if (null? l)
495 z
496 (loop3 (cdr l)
497 (sunion (vector-ref firsts (car l)) z))))))
498 (if (equal? x y)
499 (loop2 (+ i 1) cont)
500 (begin
501 (vector-set! firsts i y)
502 (loop2 (+ i 1) #t))))))))
503
504 (let loop ((i 0))
505 (if (< i nvars)
506 (begin
507 (vector-set! firsts i (sinsert i (vector-ref firsts i)))
508 (loop (+ i 1))))))
509
510
511
512
513; Fonction set-fderives qui calcule un tableau de taille
514; nvars et qui donne, pour chaque non-terminal, une liste des regles pouvant
515; etre derivees a partir de ce non-terminal. (se sert de firsts)
516
517(define (set-fderives)
518 (set! fderives (make-vector nvars #f))
519
520 (set-firsts)
521
522 (let loop ((i 0))
523 (if (< i nvars)
524 (let ((x (let loop2 ((l (vector-ref firsts i)) (fd '()))
525 (if (null? l)
526 fd
527 (loop2 (cdr l)
528 (sunion (vector-ref derives (car l)) fd))))))
529 (vector-set! fderives i x)
530 (loop (+ i 1))))))
531
532
533; Fonction calculant la fermeture d'un ensemble d'items LR0
534; ou core est une liste d'items
535
536(define (closure core)
537 ;; Initialization
538 (define ruleset (make-vector nrules #f))
539
540 (let loop ((csp core))
541 (if (not (null? csp))
542 (let ((sym (vector-ref ritem (car csp))))
543 (if (< -1 sym nvars)
544 (let loop2 ((dsp (vector-ref fderives sym)))
545 (if (not (null? dsp))
546 (begin
547 (vector-set! ruleset (car dsp) #t)
548 (loop2 (cdr dsp))))))
549 (loop (cdr csp)))))
550
551 (let loop ((ruleno 1) (csp core) (itemsetv '())) ; ruleno = 0
552 (if (< ruleno nrules)
553 (if (vector-ref ruleset ruleno)
554 (let ((itemno (vector-ref rrhs ruleno)))
555 (let loop2 ((c csp) (itemsetv2 itemsetv))
556 (if (and (pair? c)
557 (< (car c) itemno))
558 (loop2 (cdr c) (cons (car c) itemsetv2))
559 (loop (+ ruleno 1) c (cons itemno itemsetv2)))))
560 (loop (+ ruleno 1) csp itemsetv))
561 (let loop2 ((c csp) (itemsetv2 itemsetv))
562 (if (pair? c)
563 (loop2 (cdr c) (cons (car c) itemsetv2))
564 (reverse itemsetv2))))))
565
566
567
568(define (allocate-item-sets)
569 (set! kernel-base (make-vector nsyms 0))
570 (set! kernel-end (make-vector nsyms #f)))
571
572
573(define (allocate-storage)
574 (allocate-item-sets)
575 (set! red-set (make-vector (+ nrules 1) 0)))
576
577;; --
578
579
580(define (initialize-states)
581 (let ((p (new-core)))
582 (set-core-number! p 0)
583 (set-core-acc-sym! p #f)
584 (set-core-nitems! p 1)
585 (set-core-items! p '(0))
586
587 (set! first-state (list p))
588 (set! last-state first-state)
589 (set! nstates 1)))
590
591
592
593(define (generate-states)
594 (allocate-storage)
595 (set-fderives)
596 (initialize-states)
597 (let loop ((this-state first-state))
598 (if (pair? this-state)
599 (let* ((x (car this-state))
600 (is (closure (core-items x))))
601 (save-reductions x is)
602 (new-itemsets is)
603 (append-states)
604 (if (> nshifts 0)
605 (save-shifts x))
606 (loop (cdr this-state))))))
607
608
609;; Fonction calculant les symboles sur lesquels il faut "shifter"
610;; et regroupe les items en fonction de ces symboles
611
612(define (new-itemsets itemset)
613 ;; - Initialization
614 (set! shift-symbol '())
615 (let loop ((i 0))
616 (if (< i nsyms)
617 (begin
618 (vector-set! kernel-end i '())
619 (loop (+ i 1)))))
620
621 (let loop ((isp itemset))
622 (if (pair? isp)
623 (let* ((i (car isp))
624 (sym (vector-ref ritem i)))
625 (if (>= sym 0)
626 (begin
627 (set! shift-symbol (sinsert sym shift-symbol))
628 (let ((x (vector-ref kernel-end sym)))
629 (if (null? x)
630 (begin
631 (vector-set! kernel-base sym (cons (+ i 1) x))
632 (vector-set! kernel-end sym (vector-ref kernel-base sym)))
633 (begin
634 (set-cdr! x (list (+ i 1)))
635 (vector-set! kernel-end sym (cdr x)))))))
636 (loop (cdr isp)))))
637
638 (set! nshifts (length shift-symbol)))
639
640
641
642(define (get-state sym)
643 (let* ((isp (vector-ref kernel-base sym))
644 (n (length isp))
645 (key (let loop ((isp1 isp) (k 0))
646 (if (null? isp1)
647 (modulo k STATE-TABLE-SIZE)
648 (loop (cdr isp1) (+ k (car isp1))))))
649 (sp (vector-ref state-table key)))
650 (if (null? sp)
651 (let ((x (new-state sym)))
652 (vector-set! state-table key (list x))
653 (core-number x))
654 (let loop ((sp1 sp))
655 (if (and (= n (core-nitems (car sp1)))
656 (let loop2 ((i1 isp) (t (core-items (car sp1))))
657 (if (and (pair? i1)
658 (= (car i1)
659 (car t)))
660 (loop2 (cdr i1) (cdr t))
661 (null? i1))))
662 (core-number (car sp1))
663 (if (null? (cdr sp1))
664 (let ((x (new-state sym)))
665 (set-cdr! sp1 (list x))
666 (core-number x))
667 (loop (cdr sp1))))))))
668
669
670(define (new-state sym)
671 (let* ((isp (vector-ref kernel-base sym))
672 (n (length isp))
673 (p (new-core)))
674 (set-core-number! p nstates)
675 (set-core-acc-sym! p sym)
676 (if (= sym nvars) (set! final-state nstates))
677 (set-core-nitems! p n)
678 (set-core-items! p isp)
679 (set-cdr! last-state (list p))
680 (set! last-state (cdr last-state))
681 (set! nstates (+ nstates 1))
682 p))
683
684
685;; --
686
687(define (append-states)
688 (set! shift-set
689 (let loop ((l (reverse shift-symbol)))
690 (if (null? l)
691 '()
692 (cons (get-state (car l)) (loop (cdr l)))))))
693
694;; --
695
696(define (save-shifts core)
697 (let ((p (new-shift)))
698 (set-shift-number! p (core-number core))
699 (set-shift-nshifts! p nshifts)
700 (set-shift-shifts! p shift-set)
701 (if last-shift
702 (begin
703 (set-cdr! last-shift (list p))
704 (set! last-shift (cdr last-shift)))
705 (begin
706 (set! first-shift (list p))
707 (set! last-shift first-shift)))))
708
709(define (save-reductions core itemset)
710 (let ((rs (let loop ((l itemset))
711 (if (null? l)
712 '()
713 (let ((item (vector-ref ritem (car l))))
714 (if (< item 0)
715 (cons (- item) (loop (cdr l)))
716 (loop (cdr l))))))))
717 (if (pair? rs)
718 (let ((p (new-red)))
719 (set-red-number! p (core-number core))
720 (set-red-nreds! p (length rs))
721 (set-red-rules! p rs)
722 (if last-reduction
723 (begin
724 (set-cdr! last-reduction (list p))
725 (set! last-reduction (cdr last-reduction)))
726 (begin
727 (set! first-reduction (list p))
728 (set! last-reduction first-reduction)))))))
729
730
731;; --
732
733(define (lalr)
734 (set! token-set-size (+ 1 (quotient nterms (BITS-PER-WORD))))
735 (set-accessing-symbol)
736 (set-shift-table)
737 (set-reduction-table)
738 (set-max-rhs)
739 (initialize-LA)
740 (set-goto-map)
741 (initialize-F)
742 (build-relations)
743 (digraph includes)
744 (compute-lookaheads))
745
746(define (set-accessing-symbol)
747 (set! acces-symbol (make-vector nstates #f))
748 (let loop ((l first-state))
749 (if (pair? l)
750 (let ((x (car l)))
751 (vector-set! acces-symbol (core-number x) (core-acc-sym x))
752 (loop (cdr l))))))
753
754(define (set-shift-table)
755 (set! shift-table (make-vector nstates #f))
756 (let loop ((l first-shift))
757 (if (pair? l)
758 (let ((x (car l)))
759 (vector-set! shift-table (shift-number x) x)
760 (loop (cdr l))))))
761
762(define (set-reduction-table)
763 (set! reduction-table (make-vector nstates #f))
764 (let loop ((l first-reduction))
765 (if (pair? l)
766 (let ((x (car l)))
767 (vector-set! reduction-table (red-number x) x)
768 (loop (cdr l))))))
769
770(define (set-max-rhs)
771 (let loop ((p 0) (curmax 0) (length 0))
772 (let ((x (vector-ref ritem p)))
773 (if x
774 (if (>= x 0)
775 (loop (+ p 1) curmax (+ length 1))
776 (loop (+ p 1) (max curmax length) 0))
777 (set! maxrhs curmax)))))
778
779(define (initialize-LA)
780 (define (last l)
781 (if (null? (cdr l))
782 (car l)
783 (last (cdr l))))
784
785 (set! consistent (make-vector nstates #f))
786 (set! lookaheads (make-vector (+ nstates 1) #f))
787
788 (let loop ((count 0) (i 0))
789 (if (< i nstates)
790 (begin
791 (vector-set! lookaheads i count)
792 (let ((rp (vector-ref reduction-table i))
793 (sp (vector-ref shift-table i)))
794 (if (and rp
795 (or (> (red-nreds rp) 1)
796 (and sp
797 (not
798 (< (vector-ref acces-symbol
799 (last (shift-shifts sp)))
800 nvars)))))
801 (loop (+ count (red-nreds rp)) (+ i 1))
802 (begin
803 (vector-set! consistent i #t)
804 (loop count (+ i 1))))))
805
806 (begin
807 (vector-set! lookaheads nstates count)
808 (let ((c (max count 1)))
809 (set! LA (make-vector c #f))
810 (do ((j 0 (+ j 1))) ((= j c)) (vector-set! LA j (new-set token-set-size)))
811 (set! LAruleno (make-vector c -1))
812 (set! lookback (make-vector c #f)))
813 (let loop ((i 0) (np 0))
814 (if (< i nstates)
815 (if (vector-ref consistent i)
816 (loop (+ i 1) np)
817 (let ((rp (vector-ref reduction-table i)))
818 (if rp
819 (let loop2 ((j (red-rules rp)) (np2 np))
820 (if (null? j)
821 (loop (+ i 1) np2)
822 (begin
823 (vector-set! LAruleno np2 (car j))
824 (loop2 (cdr j) (+ np2 1)))))
825 (loop (+ i 1) np))))))))))
826
827
828(define (set-goto-map)
829 (set! goto-map (make-vector (+ nvars 1) 0))
830 (let ((temp-map (make-vector (+ nvars 1) 0)))
831 (let loop ((ng 0) (sp first-shift))
832 (if (pair? sp)
833 (let loop2 ((i (reverse (shift-shifts (car sp)))) (ng2 ng))
834 (if (pair? i)
835 (let ((symbol (vector-ref acces-symbol (car i))))
836 (if (< symbol nvars)
837 (begin
838 (vector-set! goto-map symbol
839 (+ 1 (vector-ref goto-map symbol)))
840 (loop2 (cdr i) (+ ng2 1)))
841 (loop2 (cdr i) ng2)))
842 (loop ng2 (cdr sp))))
843
844 (let loop ((k 0) (i 0))
845 (if (< i nvars)
846 (begin
847 (vector-set! temp-map i k)
848 (loop (+ k (vector-ref goto-map i)) (+ i 1)))
849
850 (begin
851 (do ((i 0 (+ i 1)))
852 ((>= i nvars))
853 (vector-set! goto-map i (vector-ref temp-map i)))
854
855 (set! ngotos ng)
856 (vector-set! goto-map nvars ngotos)
857 (vector-set! temp-map nvars ngotos)
858 (set! from-state (make-vector ngotos #f))
859 (set! to-state (make-vector ngotos #f))
860
861 (do ((sp first-shift (cdr sp)))
862 ((null? sp))
863 (let* ((x (car sp))
864 (state1 (shift-number x)))
865 (do ((i (shift-shifts x) (cdr i)))
866 ((null? i))
867 (let* ((state2 (car i))
868 (symbol (vector-ref acces-symbol state2)))
869 (if (< symbol nvars)
870 (let ((k (vector-ref temp-map symbol)))
871 (vector-set! temp-map symbol (+ k 1))
872 (vector-set! from-state k state1)
873 (vector-set! to-state k state2))))))))))))))
874
875
876(define (map-goto state symbol)
877 (let loop ((low (vector-ref goto-map symbol))
878 (high (- (vector-ref goto-map (+ symbol 1)) 1)))
879 (if (> low high)
880 (begin
881 (display (list "Error in map-goto" state symbol) (current-error-port))
882 (newline (current-error-port))
883 0)
884 (let* ((middle (quotient (+ low high) 2))
885 (s (vector-ref from-state middle)))
886 (cond
887 ((= s state)
888 middle)
889 ((< s state)
890 (loop (+ middle 1) high))
891 (else
892 (loop low (- middle 1))))))))
893
894
895(define (initialize-F)
896 (set! F (make-vector ngotos #f))
897 (do ((i 0 (+ i 1))) ((= i ngotos)) (vector-set! F i (new-set token-set-size)))
898
899 (let ((reads (make-vector ngotos #f)))
900
901 (let loop ((i 0) (rowp 0))
902 (if (< i ngotos)
903 (let* ((rowf (vector-ref F rowp))
904 (stateno (vector-ref to-state i))
905 (sp (vector-ref shift-table stateno)))
906 (if sp
907 (let loop2 ((j (shift-shifts sp)) (edges '()))
908 (if (pair? j)
909 (let ((symbol (vector-ref acces-symbol (car j))))
910 (if (< symbol nvars)
911 (if (vector-ref nullable symbol)
912 (loop2 (cdr j) (cons (map-goto stateno symbol)
913 edges))
914 (loop2 (cdr j) edges))
915 (begin
916 (set-bit rowf (- symbol nvars))
917 (loop2 (cdr j) edges))))
918 (if (pair? edges)
919 (vector-set! reads i (reverse edges))))))
920 (loop (+ i 1) (+ rowp 1)))))
921 (digraph reads)))
922
923(define (add-lookback-edge stateno ruleno gotono)
924 (let ((k (vector-ref lookaheads (+ stateno 1))))
925 (let loop ((found #f) (i (vector-ref lookaheads stateno)))
926 (if (and (not found) (< i k))
927 (if (= (vector-ref LAruleno i) ruleno)
928 (loop #t i)
929 (loop found (+ i 1)))
930
931 (if (not found)
932 (begin (display "Error in add-lookback-edge : " (current-error-port))
933 (display (list stateno ruleno gotono) (current-error-port))
934 (newline (current-error-port)))
935 (vector-set! lookback i
936 (cons gotono (vector-ref lookback i))))))))
937
938
939(define (transpose r-arg n)
940 (let ((new-end (make-vector n #f))
941 (new-R (make-vector n #f)))
942 (do ((i 0 (+ i 1)))
943 ((= i n))
944 (let ((x (list 'bidon)))
945 (vector-set! new-R i x)
946 (vector-set! new-end i x)))
947 (do ((i 0 (+ i 1)))
948 ((= i n))
949 (let ((sp (vector-ref r-arg i)))
950 (if (pair? sp)
951 (let loop ((sp2 sp))
952 (if (pair? sp2)
953 (let* ((x (car sp2))
954 (y (vector-ref new-end x)))
955 (set-cdr! y (cons i (cdr y)))
956 (vector-set! new-end x (cdr y))
957 (loop (cdr sp2))))))))
958 (do ((i 0 (+ i 1)))
959 ((= i n))
960 (vector-set! new-R i (cdr (vector-ref new-R i))))
961
962 new-R))
963
964
965
966(define (build-relations)
967
968 (define (get-state stateno symbol)
969 (let loop ((j (shift-shifts (vector-ref shift-table stateno)))
970 (stno stateno))
971 (if (null? j)
972 stno
973 (let ((st2 (car j)))
974 (if (= (vector-ref acces-symbol st2) symbol)
975 st2
976 (loop (cdr j) st2))))))
977
978 (set! includes (make-vector ngotos #f))
979 (do ((i 0 (+ i 1)))
980 ((= i ngotos))
981 (let ((state1 (vector-ref from-state i))
982 (symbol1 (vector-ref acces-symbol (vector-ref to-state i))))
983 (let loop ((rulep (vector-ref derives symbol1))
984 (edges '()))
985 (if (pair? rulep)
986 (let ((*rulep (car rulep)))
987 (let loop2 ((rp (vector-ref rrhs *rulep))
988 (stateno state1)
989 (states (list state1)))
990 (let ((*rp (vector-ref ritem rp)))
991 (if (> *rp 0)
992 (let ((st (get-state stateno *rp)))
993 (loop2 (+ rp 1) st (cons st states)))
994 (begin
995
996 (if (not (vector-ref consistent stateno))
997 (add-lookback-edge stateno *rulep i))
998
999 (let loop2 ((done #f)
1000 (stp (cdr states))
1001 (rp2 (- rp 1))
1002 (edgp edges))
1003 (if (not done)
1004 (let ((*rp (vector-ref ritem rp2)))
1005 (if (< -1 *rp nvars)
1006 (loop2 (not (vector-ref nullable *rp))
1007 (cdr stp)
1008 (- rp2 1)
1009 (cons (map-goto (car stp) *rp) edgp))
1010 (loop2 #t stp rp2 edgp)))
1011
1012 (loop (cdr rulep) edgp))))))))
1013 (vector-set! includes i edges)))))
1014 (set! includes (transpose includes ngotos)))
1015
1016
1017
1018(define (compute-lookaheads)
1019 (let ((n (vector-ref lookaheads nstates)))
1020 (let loop ((i 0))
1021 (if (< i n)
1022 (let loop2 ((sp (vector-ref lookback i)))
1023 (if (pair? sp)
1024 (let ((LA-i (vector-ref LA i))
1025 (F-j (vector-ref F (car sp))))
1026 (bit-union LA-i F-j token-set-size)
1027 (loop2 (cdr sp)))
1028 (loop (+ i 1))))))))
1029
1030
1031
1032(define (digraph relation)
1033 (define infinity (+ ngotos 2))
1034 (define INDEX (make-vector (+ ngotos 1) 0))
1035 (define VERTICES (make-vector (+ ngotos 1) 0))
1036 (define top 0)
1037 (define R relation)
1038
1039 (define (traverse i)
1040 (set! top (+ 1 top))
1041 (vector-set! VERTICES top i)
1042 (let ((height top))
1043 (vector-set! INDEX i height)
1044 (let ((rp (vector-ref R i)))
1045 (if (pair? rp)
1046 (let loop ((rp2 rp))
1047 (if (pair? rp2)
1048 (let ((j (car rp2)))
1049 (if (= 0 (vector-ref INDEX j))
1050 (traverse j))
1051 (if (> (vector-ref INDEX i)
1052 (vector-ref INDEX j))
1053 (vector-set! INDEX i (vector-ref INDEX j)))
1054 (let ((F-i (vector-ref F i))
1055 (F-j (vector-ref F j)))
1056 (bit-union F-i F-j token-set-size))
1057 (loop (cdr rp2))))))
1058 (if (= (vector-ref INDEX i) height)
1059 (let loop ()
1060 (let ((j (vector-ref VERTICES top)))
1061 (set! top (- top 1))
1062 (vector-set! INDEX j infinity)
1063 (if (not (= i j))
1064 (begin
1065 (bit-union (vector-ref F i)
1066 (vector-ref F j)
1067 token-set-size)
1068 (loop)))))))))
1069
1070 (let loop ((i 0))
1071 (if (< i ngotos)
1072 (begin
1073 (if (and (= 0 (vector-ref INDEX i))
1074 (pair? (vector-ref R i)))
1075 (traverse i))
1076 (loop (+ i 1))))))
1077
1078
1079;; ---------------------------------------------------------------------- ;;
1080;; operator precedence management ;;
1081;; ---------------------------------------------------------------------- ;;
1082
1083; a vector of precedence descriptors where each element
1084; is of the form (terminal type precedence)
1085(define the-terminals/prec #f) ; terminal symbols with precedence
1086; the precedence is an integer >= 0
1087(define (get-symbol-precedence sym)
1088 (caddr (vector-ref the-terminals/prec sym)))
1089; the operator type is either 'none, 'left, 'right, or 'nonassoc
1090(define (get-symbol-assoc sym)
1091 (cadr (vector-ref the-terminals/prec sym)))
1092
1093(define rule-precedences '())
1094(define (add-rule-precedence! rule sym)
1095 (set! rule-precedences
1096 (cons (cons rule sym) rule-precedences)))
1097
1098(define (get-rule-precedence ruleno)
1099 (cond
1100 ((assq ruleno rule-precedences)
1101 => (lambda (p)
1102 (get-symbol-precedence (cdr p))))
1103 (else
1104 ;; process the rule symbols from left to right
1105 (let loop ((i (vector-ref rrhs ruleno))
1106 (prec 0))
1107 (let ((item (vector-ref ritem i)))
1108 ;; end of rule
1109 (if (< item 0)
1110 prec
1111 (let ((i1 (+ i 1)))
1112 (if (>= item nvars)
1113 ;; it's a terminal symbol
1114 (loop i1 (get-symbol-precedence (- item nvars)))
1115 (loop i1 prec)))))))))
1116
1117;; ---------------------------------------------------------------------- ;;
1118;; Build the various tables ;;
1119;; ---------------------------------------------------------------------- ;;
1120(define (build-tables)
1121
1122 (define (resolve-conflict sym rule)
1123 (let ((sym-prec (get-symbol-precedence sym))
1124 (sym-assoc (get-symbol-assoc sym))
1125 (rule-prec (get-rule-precedence rule)))
1126 (cond
1127 ((> sym-prec rule-prec) 'shift)
1128 ((< sym-prec rule-prec) 'reduce)
1129 ((eq? sym-assoc 'left) 'reduce)
1130 ((eq? sym-assoc 'right) 'shift)
1131 (else 'shift))))
1132
1133 ;; --- Add an action to the action table ------------------------------ ;;
1134 (define (add-action St Sym Act)
1135 (let* ((x (vector-ref action-table St))
1136 (y (assv Sym x)))
1137 (if y
1138 (if (not (= Act (cdr y)))
1139 ;; -- there is a conflict
1140 (begin
1141 (if (and (<= (cdr y) 0)
1142 (<= Act 0))
1143 ;; --- reduce/reduce conflict ----------------------- ;;
1144 (begin
1145 (display "%% Reduce/Reduce conflict " (current-error-port))
1146 (display "(reduce " (current-error-port))
1147 (display (- Act) (current-error-port))
1148 (display ", reduce " (current-error-port))
1149 (display (- (cdr y)) (current-error-port))
1150 (display ") on " (current-error-port))
1151 (print-symbol (+ Sym nvars) (current-error-port))
1152 (display " in state " (current-error-port))
1153 (display St (current-error-port))
1154 (newline (current-error-port))
1155 (set-cdr! y (max (cdr y) Act)))
1156 ;; --- shift/reduce conflict ------------------------ ;;
1157 ;; can we resolve the conflict using precedences?
1158 (case (resolve-conflict Sym (- (cdr y)))
1159 ;; -- shift
1160 ((shift)
1161 (set-cdr! y Act))
1162 ;; -- reduce
1163 ((reduce)
1164 #f) ; well, nothing to do...
1165 ;; -- signal a conflict!
1166 (else
1167 (display "%% Shift/Reduce conflict " (current-error-port))
1168 (display "(shift " (current-error-port))
1169 (display Act (current-error-port))
1170 (display ", reduce " (current-error-port))
1171 (display (- (cdr y)) (current-error-port))
1172 (display ") on " (current-error-port))
1173 (print-symbol (+ Sym nvars) (current-error-port))
1174 (display " in state " (current-error-port))
1175 (display St (current-error-port))
1176 (newline (current-error-port))
1177 (set-cdr! y Act))))))
1178
1179 (vector-set! action-table St (cons (cons Sym Act) x)))))
1180
1181 (set! action-table (make-vector nstates '()))
1182
1183 (do ((i 0 (+ i 1))) ; i = state
1184 ((= i nstates))
1185 (let ((red (vector-ref reduction-table i)))
1186 (if (and red (>= (red-nreds red) 1))
1187 (if (and (= (red-nreds red) 1) (vector-ref consistent i))
1188 (add-action i 'default (- (car (red-rules red))))
1189 (let ((k (vector-ref lookaheads (+ i 1))))
1190 (let loop ((j (vector-ref lookaheads i)))
1191 (if (< j k)
1192 (let ((rule (- (vector-ref LAruleno j)))
1193 (lav (vector-ref LA j)))
1194 (let loop2 ((token 0) (x (vector-ref lav 0)) (y 1) (z 0))
1195 (if (< token nterms)
1196 (begin
1197 (let ((in-la-set? (modulo x 2)))
1198 (if (= in-la-set? 1)
1199 (add-action i token rule)))
1200 (if (= y (BITS-PER-WORD))
1201 (loop2 (+ token 1)
1202 (vector-ref lav (+ z 1))
1203 1
1204 (+ z 1))
1205 (loop2 (+ token 1) (quotient x 2) (+ y 1) z)))))
1206 (loop (+ j 1)))))))))
1207
1208 (let ((shiftp (vector-ref shift-table i)))
1209 (if shiftp
1210 (let loop ((k (shift-shifts shiftp)))
1211 (if (pair? k)
1212 (let* ((state (car k))
1213 (symbol (vector-ref acces-symbol state)))
1214 (if (>= symbol nvars)
1215 (add-action i (- symbol nvars) state))
1216 (loop (cdr k))))))))
1217
1218 (add-action final-state 0 'accept))
1219
1220(define (compact-action-table terms)
1221 (define (most-common-action acts)
1222 (let ((accums '()))
1223 (let loop ((l acts))
1224 (if (pair? l)
1225 (let* ((x (cdar l))
1226 (y (assv x accums)))
1227 (if (and (number? x) (< x 0))
1228 (if y
1229 (set-cdr! y (+ 1 (cdr y)))
1230 (set! accums (cons `(,x . 1) accums))))
1231 (loop (cdr l)))))
1232
1233 (let loop ((l accums) (max 0) (sym #f))
1234 (if (null? l)
1235 sym
1236 (let ((x (car l)))
1237 (if (> (cdr x) max)
1238 (loop (cdr l) (cdr x) (car x))
1239 (loop (cdr l) max sym)))))))
1240
1241 (define (translate-terms acts)
1242 (map (lambda (act)
1243 (cons (list-ref terms (car act))
1244 (cdr act)))
1245 acts))
1246
1247 (do ((i 0 (+ i 1)))
1248 ((= i nstates))
1249 (let ((acts (vector-ref action-table i)))
1250 (if (vector? (vector-ref reduction-table i))
1251 (let ((act (most-common-action acts)))
1252 (vector-set! action-table i
1253 (cons `(*default* . ,(if act act 'error))
1254 (translate-terms
1255 (lalr-filter (lambda (x)
1256 (not (eq? (cdr x) act)))
1257 acts)))))
1258 (vector-set! action-table i
1259 (cons `(*default* . *error*)
1260 (translate-terms acts)))))))
1261
1262
1263
1264;; --
1265
1266(define (rewrite-grammar tokens grammar k)
1267
1268 (define eoi '*eoi*)
1269
1270 (define (check-terminal term terms)
1271 (cond
1272 ((not (valid-terminal? term))
1273 (lalr-error "invalid terminal: " term))
1274 ((member term terms)
1275 (lalr-error "duplicate definition of terminal: " term))))
1276
1277 (define (prec->type prec)
1278 (cdr (assq prec '((left: . left)
1279 (right: . right)
1280 (nonassoc: . nonassoc)))))
1281
1282 (cond
1283 ;; --- a few error conditions ---------------------------------------- ;;
1284 ((not (list? tokens))
1285 (lalr-error "Invalid token list: " tokens))
1286 ((not (pair? grammar))
1287 (lalr-error "Grammar definition must have a non-empty list of productions" '()))
1288
1289 (else
1290 ;; --- check the terminals ---------------------------------------- ;;
1291 (let loop1 ((lst tokens)
1292 (rev-terms '())
1293 (rev-terms/prec '())
1294 (prec-level 0))
1295 (if (pair? lst)
1296 (let ((term (car lst)))
1297 (cond
1298 ((pair? term)
1299 (if (and (memq (car term) '(left: right: nonassoc:))
1300 (not (null? (cdr term))))
1301 (let ((prec (+ prec-level 1))
1302 (optype (prec->type (car term))))
1303 (let loop-toks ((l (cdr term))
1304 (rev-terms rev-terms)
1305 (rev-terms/prec rev-terms/prec))
1306 (if (null? l)
1307 (loop1 (cdr lst) rev-terms rev-terms/prec prec)
1308 (let ((term (car l)))
1309 (check-terminal term rev-terms)
1310 (loop-toks
1311 (cdr l)
1312 (cons term rev-terms)
1313 (cons (list term optype prec) rev-terms/prec))))))
1314
1315 (lalr-error "invalid operator precedence specification: " term)))
1316
1317 (else
1318 (check-terminal term rev-terms)
1319 (loop1 (cdr lst)
1320 (cons term rev-terms)
1321 (cons (list term 'none 0) rev-terms/prec)
1322 prec-level))))
1323
1324 ;; --- check the grammar rules ------------------------------ ;;
1325 (let loop2 ((lst grammar) (rev-nonterm-defs '()))
1326 (if (pair? lst)
1327 (let ((def (car lst)))
1328 (if (not (pair? def))
1329 (lalr-error "Nonterminal definition must be a non-empty list" '())
1330 (let ((nonterm (car def)))
1331 (cond ((not (valid-nonterminal? nonterm))
1332 (lalr-error "Invalid nonterminal:" nonterm))
1333 ((or (member nonterm rev-terms)
1334 (assoc nonterm rev-nonterm-defs))
1335 (lalr-error "Nonterminal previously defined:" nonterm))
1336 (else
1337 (loop2 (cdr lst)
1338 (cons def rev-nonterm-defs)))))))
1339 (let* ((terms (cons eoi (reverse rev-terms)))
1340 (terms/prec (cons '(eoi none 0) (reverse rev-terms/prec)))
1341 (nonterm-defs (reverse rev-nonterm-defs))
1342 (nonterms (cons '*start* (map car nonterm-defs))))
1343 (if (= (length nonterms) 1)
1344 (lalr-error "Grammar must contain at least one nonterminal" '())
1345 (let loop-defs ((defs (cons `(*start* (,(cadr nonterms) ,eoi) -> $1)
1346 nonterm-defs))
1347 (ruleno 0)
1348 (comp-defs '()))
1349 (if (pair? defs)
1350 (let* ((nonterm-def (car defs))
1351 (compiled-def (rewrite-nonterm-def
1352 nonterm-def
1353 ruleno
1354 terms nonterms)))
1355 (loop-defs (cdr defs)
1356 (+ ruleno (length compiled-def))
1357 (cons compiled-def comp-defs)))
1358
1359 (let ((compiled-nonterm-defs (reverse comp-defs)))
1360 (k terms
1361 terms/prec
1362 nonterms
1363 (map (lambda (x) (cons (caaar x) (map cdar x)))
1364 compiled-nonterm-defs)
1365 (apply append compiled-nonterm-defs))))))))))))))
1366
1367
1368(define *arrow* '->)
1369
1370(define (rewrite-nonterm-def nonterm-def ruleno terms nonterms)
1371
1372 (define No-NT (length nonterms))
1373
1374 (define (encode x)
1375 (let ((PosInNT (pos-in-list x nonterms)))
1376 (if PosInNT
1377 PosInNT
1378 (let ((PosInT (pos-in-list x terms)))
1379 (if PosInT
1380 (+ No-NT PosInT)
1381 (lalr-error "undefined symbol : " x))))))
1382
1383 (define (process-prec-directive rhs ruleno)
1384 (let loop ((l rhs))
1385 (if (null? l)
1386 '()
1387 (let ((first (car l))
1388 (rest (cdr l)))
1389 (cond
1390 ((or (member first terms) (member first nonterms))
1391 (cons first (loop rest)))
1392 ((and (pair? first)
1393 (eq? (car first) 'prec:))
1394 (pair? (cdr first))
1395 (if (and (pair? (cdr first))
1396 (member (cadr first) terms))
1397 (if (null? (cddr first))
1398 (begin
1399 (add-rule-precedence! ruleno (pos-in-list (cadr first) terms))
1400 (loop rest))
1401 (lalr-error "prec: directive should be at end of rule: " rhs))
1402 (lalr-error "Invalid prec: directive: " first)))
1403 (else
1404 (lalr-error "Invalid terminal or nonterminal: " first)))))))
1405
1406
1407 (if (not (pair? (cdr nonterm-def)))
1408 (lalr-error "At least one production needed for nonterminal" (car nonterm-def))
1409 (let ((name (symbol->string (car nonterm-def))))
1410 (let loop1 ((lst (cdr nonterm-def))
1411 (i 1)
1412 (rev-productions-and-actions '()))
1413 (if (not (pair? lst))
1414 (reverse rev-productions-and-actions)
1415 (let* ((rhs (process-prec-directive (car lst) (+ ruleno i -1)))
1416 (rest (cdr lst))
1417 (prod (map encode (cons (car nonterm-def) rhs))))
1418 (for-each (lambda (x)
1419 (if (not (or (member x terms) (member x nonterms)))
1420 (lalr-error "Invalid terminal or nonterminal" x)))
1421 rhs)
1422 (if (and (pair? rest)
1423 (eq? (car rest) *arrow*)
1424 (pair? (cdr rest)))
1425 (loop1 (cddr rest)
1426 (+ i 1)
1427 (cons (cons prod (cadr rest))
1428 rev-productions-and-actions))
1429 (let* ((rhs-length (length rhs))
1430 (action
1431 (cons 'vector
1432 (cons (list 'quote (string->symbol
1433 (string-append
1434 name
1435 "-"
1436 (number->string i))))
1437 (let loop-j ((j 1))
1438 (if (> j rhs-length)
1439 '()
1440 (cons (string->symbol
1441 (string-append
1442 "$"
1443 (number->string j)))
1444 (loop-j (+ j 1)))))))))
1445 (loop1 rest
1446 (+ i 1)
1447 (cons (cons prod action)
1448 rev-productions-and-actions))))))))))
1449
1450(define (valid-nonterminal? x)
1451 (symbol? x))
1452
1453(define (valid-terminal? x)
1454 (symbol? x)) ; DB
1455
1456;; ---------------------------------------------------------------------- ;;
1457;; Miscellaneous ;;
1458;; ---------------------------------------------------------------------- ;;
1459(define (pos-in-list x lst)
1460 (let loop ((lst lst) (i 0))
1461 (cond ((not (pair? lst)) #f)
1462 ((equal? (car lst) x) i)
1463 (else (loop (cdr lst) (+ i 1))))))
1464
1465(define (sunion lst1 lst2) ; union of sorted lists
1466 (let loop ((L1 lst1)
1467 (L2 lst2))
1468 (cond ((null? L1) L2)
1469 ((null? L2) L1)
1470 (else
1471 (let ((x (car L1)) (y (car L2)))
1472 (cond
1473 ((> x y)
1474 (cons y (loop L1 (cdr L2))))
1475 ((< x y)
1476 (cons x (loop (cdr L1) L2)))
1477 (else
1478 (loop (cdr L1) L2))
1479 ))))))
1480
1481(define (sinsert elem lst)
1482 (let loop ((l1 lst))
1483 (if (null? l1)
1484 (cons elem l1)
1485 (let ((x (car l1)))
1486 (cond ((< elem x)
1487 (cons elem l1))
1488 ((> elem x)
1489 (cons x (loop (cdr l1))))
1490 (else
1491 l1))))))
1492
1493(define (lalr-filter p lst)
1494 (let loop ((l lst))
1495 (if (null? l)
1496 '()
1497 (let ((x (car l)) (y (cdr l)))
1498 (if (p x)
1499 (cons x (loop y))
1500 (loop y))))))
1501
1502;; ---------------------------------------------------------------------- ;;
1503;; Debugging tools ... ;;
1504;; ---------------------------------------------------------------------- ;;
1505(define the-terminals #f) ; names of terminal symbols
1506(define the-nonterminals #f) ; non-terminals
1507
1508(define (print-item item-no)
1509 (let loop ((i item-no))
1510 (let ((v (vector-ref ritem i)))
1511 (if (>= v 0)
1512 (loop (+ i 1))
1513 (let* ((rlno (- v))
1514 (nt (vector-ref rlhs rlno)))
1515 (display (vector-ref the-nonterminals nt)) (display " --> ")
1516 (let loop ((i (vector-ref rrhs rlno)))
1517 (let ((v (vector-ref ritem i)))
1518 (if (= i item-no)
1519 (display ". "))
1520 (if (>= v 0)
1521 (begin
1522 (print-symbol v)
1523 (display " ")
1524 (loop (+ i 1)))
1525 (begin
1526 (display " (rule ")
1527 (display (- v))
1528 (display ")")
1529 (newline))))))))))
1530
1531(define (print-symbol n . port)
1532 (display (if (>= n nvars)
1533 (vector-ref the-terminals (- n nvars))
1534 (vector-ref the-nonterminals n))
1535 (if (null? port)
1536 (current-output-port)
1537 (car port))))
1538
1539(define (print-states)
1540"Print the states of a generated parser."
1541 (define (print-action act)
1542 (cond
1543 ((eq? act '*error*)
1544 (display " : Error"))
1545 ((eq? act 'accept)
1546 (display " : Accept input"))
1547 ((< act 0)
1548 (display " : reduce using rule ")
1549 (display (- act)))
1550 (else
1551 (display " : shift and goto state ")
1552 (display act)))
1553 (newline)
1554 #t)
1555
1556 (define (print-actions acts)
1557 (let loop ((l acts))
1558 (if (null? l)
1559 #t
1560 (let ((sym (caar l))
1561 (act (cdar l)))
1562 (display " ")
1563 (cond
1564 ((eq? sym 'default)
1565 (display "default action"))
1566 (else
1567 (if (number? sym)
1568 (print-symbol (+ sym nvars))
1569 (display sym))))
1570 (print-action act)
1571 (loop (cdr l))))))
1572
1573 (if (not action-table)
1574 (begin
1575 (display "No generated parser available!")
1576 (newline)
1577 #f)
1578 (begin
1579 (display "State table") (newline)
1580 (display "-----------") (newline) (newline)
1581
1582 (let loop ((l first-state))
1583 (if (null? l)
1584 #t
1585 (let* ((core (car l))
1586 (i (core-number core))
1587 (items (core-items core))
1588 (actions (vector-ref action-table i)))
1589 (display "state ") (display i) (newline)
1590 (newline)
1591 (for-each (lambda (x) (display " ") (print-item x))
1592 items)
1593 (newline)
1594 (print-actions actions)
1595 (newline)
1596 (loop (cdr l))))))))
1597
1598
1599
1600;; ---------------------------------------------------------------------- ;;
1601
1602(define build-goto-table
1603 (lambda ()
1604 `(vector
1605 ,@(map
1606 (lambda (shifts)
1607 (list 'quote
1608 (if shifts
1609 (let loop ((l (shift-shifts shifts)))
1610 (if (null? l)
1611 '()
1612 (let* ((state (car l))
1613 (symbol (vector-ref acces-symbol state)))
1614 (if (< symbol nvars)
1615 (cons `(,symbol . ,state)
1616 (loop (cdr l)))
1617 (loop (cdr l))))))
1618 '())))
1619 (vector->list shift-table)))))
1620
1621
1622(define build-reduction-table
1623 (lambda (gram/actions)
1624 `(vector
1625 '()
1626 ,@(map
1627 (lambda (p)
1628 (let ((act (cdr p)))
1629 `(lambda (___stack ___sp ___goto-table ___k)
1630 ,(let* ((nt (caar p)) (rhs (cdar p)) (n (length rhs)))
1631 `(let* (,@(if act
1632 (let loop ((i 1) (l rhs))
1633 (if (pair? l)
1634 (let ((rest (cdr l)))
1635 (cons
1636 `(,(string->symbol
1637 (string-append
1638 "$"
1639 (number->string
1640 (+ (- n i) 1))))
1641 (vector-ref ___stack (- ___sp ,(- (* i 2) 1))))
1642 (loop (+ i 1) rest)))
1643 '()))
1644 '()))
1645 ,(if (= nt 0)
1646 '$1
1647 `(___push ___stack (- ___sp ,(* 2 n))
1648 ,nt ___goto-table ,(cdr p) ___k)))))))
1649
1650 gram/actions))))
1651
1652
1653;; @section (api "API")
1654
1655(define-macro (lalr-parser tokens . rules)
1656 (let* ((gram/actions (gen-tables! tokens rules))
1657 (code
1658 `(letrec ((___max-stack-size 500)
1659
1660 (___atable ',action-table)
1661 (___gtable ,(build-goto-table))
1662 (___grow-stack (lambda (stack)
1663 ;; make a new stack twice as big as the original
1664 (let ((new-stack (make-vector (* 2 (vector-length stack)) #f)))
1665 ;; then copy the elements...
1666 (let loop ((i (- (vector-length stack) 1)))
1667 (if (< i 0)
1668 new-stack
1669 (begin
1670 (vector-set! new-stack i (vector-ref stack i))
1671 (loop (- i 1))))))))
1672
1673 (___push (lambda (stack sp new-cat goto-table lval k)
1674 (let* ((state (vector-ref stack sp))
1675 (new-state (cdr (assq new-cat (vector-ref goto-table state))))
1676 (new-sp (+ sp 2))
1677 (stack (if (< new-sp (vector-length stack))
1678 stack
1679 (___grow-stack stack))))
1680 (vector-set! stack new-sp new-state)
1681 (vector-set! stack (- new-sp 1) lval)
1682 (k stack new-sp))))
1683
1684 (___action (lambda (x l)
1685 (let ((y (assq x l)))
1686 (if y (cdr y) (cdar l)))))
1687
1688 (___rtable ,(build-reduction-table gram/actions)))
1689
1690 (lambda (lexerp errorp)
1691
1692 (let ((stack (make-vector ___max-stack-size 0)))
1693 (let loop ((stack stack) (sp 0) (input (lexerp)))
1694 (let* ((state (vector-ref stack sp))
1695 (i (if (pair? input) (car input) input))
1696 (attr (if (pair? input) (cdr input) #f))
1697 (act (___action i (vector-ref ___atable state))))
1698
1699 (if (not (symbol? i))
1700 (errorp "PARSE ERROR: invalid token: " input))
1701
1702 (cond
1703
1704 ;; Input succesfully parsed
1705 ((eq? act 'accept)
1706 (vector-ref stack 1))
1707
1708 ;; Syntax error in input
1709 ((eq? act '*error*)
1710 (if (eq? i '*eoi*)
1711 (errorp "PARSE ERROR : unexpected end of input ")
1712 (errorp "PARSE ERROR : unexpected token : " input)))
1713
1714 ;; Shift current token on top of the stack
1715 ((>= act 0)
1716 (let ((stack (if (< (+ sp 2) (vector-length stack))
1717 stack
1718 (___grow-stack stack))))
1719 (vector-set! stack (+ sp 1) attr)
1720 (vector-set! stack (+ sp 2) act)
1721 (loop stack (+ sp 2) (lexerp))))
1722
1723 ;; Reduce by rule (- act)
1724 (else
1725 ((vector-ref ___rtable (- act))
1726 stack sp ___gtable
1727 (lambda (stack sp)
1728 (loop stack sp input))))))))))))
1729 code))
1730
1731;; arch-tag: 4FE771DE-F56D-11D8-8B77-000A95B4C7DC