Merge commit 'ca5e0414e96886177d883a249edd957d2331db65'
[bpt/guile.git] / module / ice-9 / match.upstream.scm
1;;;; match.scm -- portable hygienic pattern matcher -*- coding: utf-8 -*-
2;;
3;; This code is written by Alex Shinn and placed in the
4;; Public Domain. All warranties are disclaimed.
5
6;;> @example-import[(srfi 9)]
7
8;;> This is a full superset of the popular @hyperlink[
9;;> "http://www.cs.indiana.edu/scheme-repository/code.match.html"]{match}
10;;> package by Andrew Wright, written in fully portable @scheme{syntax-rules}
11;;> and thus preserving hygiene.
12
13;;> The most notable extensions are the ability to use @emph{non-linear}
14;;> patterns - patterns in which the same identifier occurs multiple
15;;> times, tail patterns after ellipsis, and the experimental tree patterns.
16
17;;> @subsubsection{Patterns}
18
19;;> Patterns are written to look like the printed representation of
20;;> the objects they match. The basic usage is
21
22;;> @scheme{(match expr (pat body ...) ...)}
23
24;;> where the result of @var{expr} is matched against each pattern in
25;;> turn, and the corresponding body is evaluated for the first to
26;;> succeed. Thus, a pattern written as a list of three elements
27;;> matches a list of three elements.
28
29;;> @example{(let ((ls (list 1 2 3))) (match ls ((1 2 3) #t)))}
30
31;;> If no patterns match an error is signalled.
32
33;;> Identifiers will match anything, and make the corresponding
34;;> binding available in the body.
35
36;;> @example{(match (list 1 2 3) ((a b c) b))}
37
38;;> If the same identifier occurs multiple times, the first instance
39;;> will match anything, but subsequent instances must match a value
40;;> which is @scheme{equal?} to the first.
41
42;;> @example{(match (list 1 2 1) ((a a b) 1) ((a b a) 2))}
43
44;;> The special identifier @scheme{_} matches anything, no matter how
45;;> many times it is used, and does not bind the result in the body.
46
47;;> @example{(match (list 1 2 1) ((_ _ b) 1) ((a b a) 2))}
48
49;;> To match a literal identifier (or list or any other literal), use
50;;> @scheme{quote}.
51
52;;> @example{(match 'a ('b 1) ('a 2))}
53
54;;> Analogous to its normal usage in Scheme, @scheme{quasiquote} can
55;;> be used to quote a mostly literally matching object with selected
56;;> parts unquoted.
57
58;;> @example|{(match (list 1 2 3) (`(1 ,b ,c) (list b c)))}|
59
60;;> Often you want to match any number of a repeated pattern. Inside
61;;> a list pattern you can append @scheme{...} after an element to
62;;> match zero or more of that pattern (like a regexp Kleene star).
63
64;;> @example{(match (list 1 2) ((1 2 3 ...) #t))}
65;;> @example{(match (list 1 2 3) ((1 2 3 ...) #t))}
66;;> @example{(match (list 1 2 3 3 3) ((1 2 3 ...) #t))}
67
68;;> Pattern variables matched inside the repeated pattern are bound to
69;;> a list of each matching instance in the body.
70
71;;> @example{(match (list 1 2) ((a b c ...) c))}
72;;> @example{(match (list 1 2 3) ((a b c ...) c))}
73;;> @example{(match (list 1 2 3 4 5) ((a b c ...) c))}
74
75;;> More than one @scheme{...} may not be used in the same list, since
76;;> this would require exponential backtracking in the general case.
77;;> However, @scheme{...} need not be the final element in the list,
78;;> and may be succeeded by a fixed number of patterns.
79
80;;> @example{(match (list 1 2 3 4) ((a b c ... d e) c))}
81;;> @example{(match (list 1 2 3 4 5) ((a b c ... d e) c))}
82;;> @example{(match (list 1 2 3 4 5 6 7) ((a b c ... d e) c))}
83
84;;> @scheme{___} is provided as an alias for @scheme{...} when it is
85;;> inconvenient to use the ellipsis (as in a syntax-rules template).
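
;;> For instance, the following should behave exactly like the
;;> corresponding @scheme{...} pattern, binding @var{c} to the list
;;> @scheme{(3 4 5)}:

;;> @example{(match (list 1 2 3 4 5) ((a b c ___) c))}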
86
87;;> The @scheme{..1} syntax is exactly like the @scheme{...} except
88;;> that it matches one or more repetitions (like a regexp "+").
89
90;;> @example{(match (list 1 2) ((a b c ..1) c))}
91;;> @example{(match (list 1 2 3) ((a b c ..1) c))}
92
93;;> The boolean operators @scheme{and}, @scheme{or} and @scheme{not}
94;;> can be used to group and negate patterns analogously to their
95;;> Scheme counterparts.
96
97;;> The @scheme{and} operator ensures that all subpatterns match.
98;;> This operator is often used with the idiom @scheme{(and x pat)} to
99;;> bind @var{x} to the entire value that matches @var{pat}
100;;> (cf. "as-patterns" in ML or Haskell). Another common use is in
101;;> conjunction with @scheme{not} patterns to match a general case
102;;> with certain exceptions.
103
104;;> @example{(match 1 ((and) #t))}
105;;> @example{(match 1 ((and x) x))}
106;;> @example{(match 1 ((and x 1) x))}
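
;;> As an illustrative sketch of the @scheme{and}/@scheme{not} idiom
;;> mentioned above, the first clause below accepts any value except
;;> @scheme{0}, so the expression should evaluate to @scheme{zero}:

;;> @example{(match 0 ((and x (not 0)) x) (_ 'zero))}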
107
108;;> The @scheme{or} operator ensures that at least one subpattern
109;;> matches. If the same identifier occurs in different subpatterns,
110;;> it is matched independently. All identifiers from all subpatterns
111;;> are bound if the @scheme{or} operator matches, but the binding is
112;;> only defined for identifiers from the subpattern which matched.
113
114;;> @example{(match 1 ((or) #t) (else #f))}
115;;> @example{(match 1 ((or x) x))}
116;;> @example{(match 1 ((or x 2) x))}
117
118;;> The @scheme{not} operator succeeds if the given pattern doesn't
119;;> match. None of the identifiers used are available in the body.
120
121;;> @example{(match 1 ((not 2) #t))}
122
123;;> The more general operator @scheme{?} can be used to provide a
124;;> predicate. The usage is @scheme{(? predicate pat ...)} where
125;;> @var{predicate} is a Scheme expression evaluating to a predicate
126;;> called on the value to match, and any optional patterns after the
127;;> predicate are then matched as in an @scheme{and} pattern.
128
129;;> @example{(match 1 ((? odd? x) x))}
130
131;;> The field operator @scheme{=} is used to extract an arbitrary
132;;> field and match against it. It is useful for more complex or
133;;> conditional destructuring that can't be more directly expressed in
134;;> the pattern syntax. The usage is @scheme{(= field pat)}, where
135;;> @var{field} can be any expression, and should result in a
136;;> procedure of one argument, which is applied to the value to match
137;;> to generate a new value to match against @var{pat}.
138
139;;> Thus the pattern @scheme{(and (= car x) (= cdr y))} is equivalent
140;;> to @scheme{(x . y)}, except it will result in an immediate error
141;;> if the value isn't a pair.
142
143;;> @example{(match '(1 . 2) ((= car x) x))}
144;;> @example{(match 4 ((= sqrt x) x))}
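
;;> To illustrate the equivalence described above, both of the
;;> following should return @scheme{(1 2)}:

;;> @example{(match '(1 . 2) ((and (= car x) (= cdr y)) (list x y)))}
;;> @example{(match '(1 . 2) ((x . y) (list x y)))}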
145
146;;> The record operator @scheme{$} is used as a concise way to match
147;;> records defined by SRFI-9 (or SRFI-99). The usage is
148;;> @scheme{($ rtd field ...)}, where @var{rtd} should be the record
149;;> type descriptor specified as the first argument to
150;;> @scheme{define-record-type}, and each @var{field} is a subpattern
151;;> matched against the fields of the record in order. Not all fields
152;;> must be present.
153
154;;> @example{
155;;> (let ()
156;;> (define-record-type employee
157;;> (make-employee name title)
158;;> employee?
159;;> (name get-name)
160;;> (title get-title))
161;;> (match (make-employee "Bob" "Doctor")
162;;> (($ employee n t) (list t n))))
163;;> }
164
165;;> The @scheme{set!} and @scheme{get!} operators are used to bind an
166;;> identifier to the setter and getter of a field, respectively. The
167;;> setter is a procedure of one argument, which mutates the field to
168;;> that argument. The getter is a procedure of no arguments which
169;;> returns the current value of the field.
170
171;;> @example{(let ((x (cons 1 2))) (match x ((1 . (set! s)) (s 3) x)))}
172;;> @example{(match '(1 . 2) ((1 . (get! g)) (g)))}
173
174;;> The new operator @scheme{***} can be used to search a tree for
175;;> subpatterns. A pattern of the form @scheme{(x *** y)} represents
176;;> the subpattern @var{y} located somewhere in a tree where the path
177;;> from the current object to @var{y} can be seen as a list of the
178;;> form @scheme{(x ...)}. @var{y} can immediately match the current
179;;> object in which case the path is the empty list. In a sense it's
180;;> a 2-dimensional version of the @scheme{...} pattern.
181
182;;> As a common case the pattern @scheme{(_ *** y)} can be used to
183;;> search for @var{y} anywhere in a tree, regardless of the path
184;;> used.
185
186;;> @example{(match '(a (a (a b))) ((x *** 'b) x))}
187;;> @example{(match '(a (b) (c (d e) (f g))) ((x *** 'g) x))}
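
;;> The wildcard form can be used as a simple membership test when the
;;> path itself is not needed; this should return @scheme{#t}:

;;> @example{(match '(a (b) (c (d e) (f g))) ((_ *** 'g) #t))}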
188
189;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
190;; Notes
191
192;; The implementation is a simple generative pattern matcher - each
193;; pattern is expanded into the required tests, calling a failure
194;; continuation if the tests fail. This makes the logic easy to
195;; follow and extend, but produces sub-optimal code in cases where you
196;; have many similar clauses due to repeating the same tests.
197;; Nonetheless a smart compiler should be able to remove the redundant
198;; tests. For MATCH-LET and DESTRUCTURING-BIND type uses there is no
199;; performance hit.
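;;
;; As a rough, hand-simplified sketch (not the literal macro output),
;; a form such as (match expr ((a . b) (list a b))) expands along
;; these lines, with the failure continuation as a thunk:
;;
;;   (let ((v expr))
;;     (let ((failure (lambda () (match-error v))))
;;       (if (pair? v)
;;           (let ((a (car v)) (b (cdr v)))
;;             (list a b))
;;           (failure))))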
200
201;; The original version was written on 2006/11/29 and described in the
202;; following Usenet post:
203;; http://groups.google.com/group/comp.lang.scheme/msg/0941234de7112ffd
204;; and is still available at
205;; http://synthcode.com/scheme/match-simple.scm
206;; It's just 80 lines for the core MATCH, and an extra 40 lines for
207;; MATCH-LET, MATCH-LAMBDA and other syntactic sugar.
208;;
209;; A variant of this file which uses COND-EXPAND in a few places for
210;; performance can be found at
211;; http://synthcode.com/scheme/match-cond-expand.scm
212;;
213;; 2012/05/23 - fixing combinatorial explosion of code in certain or patterns
214;; 2011/09/25 - fixing bug when directly matching an identifier repeated in
215;; the pattern (thanks to Stefan Israelsson Tampe)
216;; 2011/01/27 - fixing bug when matching tail patterns against improper lists
217;; 2010/09/26 - adding `..1' patterns (thanks to Ludovic Courtès)
218;; 2010/09/07 - fixing identifier extraction in some `...' and `***' patterns
219;; 2009/11/25 - adding `***' tree search patterns
220;; 2008/03/20 - fixing bug where (a ...) matched non-lists
221;; 2008/03/15 - removing redundant check in vector patterns
222;; 2008/03/06 - you can use `...' portably now (thanks to Taylor Campbell)
223;; 2007/09/04 - fixing quasiquote patterns
224;; 2007/07/21 - allowing ellipse patterns in non-final list positions
225;; 2007/04/10 - fixing potential hygiene issue in match-check-ellipse
226;; (thanks to Taylor Campbell)
227;; 2007/04/08 - clean up, commenting
228;; 2006/12/24 - bugfixes
229;; 2006/12/01 - non-linear patterns, shared variables in OR, get!/set!
230
231;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
232;; force compile-time syntax errors with useful messages
233
234(define-syntax match-syntax-error
235 (syntax-rules ()
236 ((_) (match-syntax-error "invalid match-syntax-error usage"))))
237
238;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
239
240;;> @subsubsection{Syntax}
241
242;;> @subsubsubsection{@rawcode{(match expr (pattern . body) ...)@br{}
243;;> (match expr (pattern (=> failure) . body) ...)}}
244
245;;> The result of @var{expr} is matched against each @var{pattern} in
246;;> turn, according to the pattern rules described in the previous
247;;> section, until the first @var{pattern} matches. When a match is
248;;> found, the corresponding @var{body}s are evaluated in order,
249;;> and the result of the last expression is returned as the result
250;;> of the entire @scheme{match}. If a @var{failure} is provided,
251;;> then it is bound to a procedure of no arguments which continues
252;;> processing at the next @var{pattern}. If no @var{pattern} matches,
253;;> an error is signalled.
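
;;> As a small illustration of the @scheme{(=> failure)} form (the
;;> name @var{fail} is arbitrary), the first clause below matches but
;;> then calls @scheme{(fail)}, so matching resumes at the second
;;> clause and the result should be @scheme{second}:

;;> @example{
;;> (match (list 1 2)
;;>   ((a b) (=> fail) (if (< a b) (fail) 'first))
;;>   (_ 'second))
;;> }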
254
255;; The basic interface. MATCH just performs some basic syntax
256;; validation, binds the match expression to a temporary variable `v',
257;; and passes it on to MATCH-NEXT. It's an invariant throughout the
258;; code below that the binding `v' is a direct variable reference, not
259;; an expression.
260
261(define-syntax match
262 (syntax-rules ()
263 ((match)
264 (match-syntax-error "missing match expression"))
265 ((match atom)
266 (match-syntax-error "no match clauses"))
267 ((match (app ...) (pat . body) ...)
268 (let ((v (app ...)))
269 (match-next v ((app ...) (set! (app ...))) (pat . body) ...)))
270 ((match #(vec ...) (pat . body) ...)
271 (let ((v #(vec ...)))
272 (match-next v (v (set! v)) (pat . body) ...)))
273 ((match atom (pat . body) ...)
274 (let ((v atom))
275 (match-next v (atom (set! atom)) (pat . body) ...)))
276 ))
277
278;; MATCH-NEXT passes each clause to MATCH-ONE in turn with its failure
279;; thunk, which is expanded by recursing MATCH-NEXT on the remaining
280;; clauses. `g+s' is a list of two elements, the get! and set!
281;; expressions respectively.
282
283(define (match-error v)
284 #((definite-bailout? . #t))
285 (error 'match "no matching pattern" v))
286
287(define-syntax match-next
288 (syntax-rules (=>)
289 ;; no more clauses, the match failed
290 ((match-next v g+s)
291 ;; Here we call match-error in non-tail context, so that the
292 ;; backtrace can show the source location of the failing match
293 ;; form.
294 (begin
295 (match-error v)
296 #f))
297 ;; named failure continuation
298 ((match-next v g+s (pat (=> failure) . body) . rest)
299 (let ((failure (lambda () (match-next v g+s . rest))))
300 ;; match-one analyzes the pattern for us
301 (match-one v pat g+s (match-drop-ids (begin . body)) (failure) ())))
302 ;; anonymous failure continuation, give it a dummy name
303 ((match-next v g+s (pat . body) . rest)
304 (match-next v g+s (pat (=> failure) . body) . rest))))
305
306;; MATCH-ONE first checks for ellipse patterns, otherwise passes on to
307;; MATCH-TWO.
308
309(define-syntax match-one
310 (syntax-rules ()
311 ;; If it's a list of two or more values, check to see if the
312 ;; second one is an ellipse and handle accordingly, otherwise go
313 ;; to MATCH-TWO.
314 ((match-one v (p q . r) g+s sk fk i)
315 (match-check-ellipse
316 q
317 (match-extract-vars p (match-gen-ellipses v p r g+s sk fk i) i ())
318 (match-two v (p q . r) g+s sk fk i)))
319 ;; Go directly to MATCH-TWO.
320 ((match-one . x)
321 (match-two . x))))
322
323;; This is the guts of the pattern matcher. We are passed a lot of
324;; information in the form:
325;;
326;; (match-two var pattern getter setter success-k fail-k (ids ...))
327;;
328;; usually abbreviated
329;;
330;; (match-two v p g+s sk fk i)
331;;
332;; where VAR is the symbol name of the current variable we are
333;; matching, PATTERN is the current pattern, getter and setter are the
334;; corresponding accessors (e.g. CAR and SET-CAR! of the pair holding
335;; VAR), SUCCESS-K is the success continuation, FAIL-K is the failure
336;; continuation (which is just a thunk call and is thus safe to expand
337;; multiple times) and IDS are the list of identifiers bound in the
338;; pattern so far.
339
340(define-syntax match-two
341 (syntax-rules (_ ___ ..1 *** quote quasiquote ? $ = and or not set! get!)
342 ((match-two v () g+s (sk ...) fk i)
343 (if (null? v) (sk ... i) fk))
344 ((match-two v (quote p) g+s (sk ...) fk i)
345 (if (equal? v 'p) (sk ... i) fk))
346 ((match-two v (quasiquote p) . x)
347 (match-quasiquote v p . x))
348 ((match-two v (and) g+s (sk ...) fk i) (sk ... i))
349 ((match-two v (and p q ...) g+s sk fk i)
350 (match-one v p g+s (match-one v (and q ...) g+s sk fk) fk i))
351 ((match-two v (or) g+s sk fk i) fk)
352 ((match-two v (or p) . x)
353 (match-one v p . x))
354 ((match-two v (or p ...) g+s sk fk i)
355 (match-extract-vars (or p ...) (match-gen-or v (p ...) g+s sk fk i) i ()))
356 ((match-two v (not p) g+s (sk ...) fk i)
357 (match-one v p g+s (match-drop-ids fk) (sk ... i) i))
358 ((match-two v (get! getter) (g s) (sk ...) fk i)
359 (let ((getter (lambda () g))) (sk ... i)))
360 ((match-two v (set! setter) (g (s ...)) (sk ...) fk i)
361 (let ((setter (lambda (x) (s ... x)))) (sk ... i)))
362 ((match-two v (? pred . p) g+s sk fk i)
363 (if (pred v) (match-one v (and . p) g+s sk fk i) fk))
364 ((match-two v (= proc p) . x)
365 (let ((w (proc v))) (match-one w p . x)))
366 ((match-two v (p ___ . r) g+s sk fk i)
367 (match-extract-vars p (match-gen-ellipses v p r g+s sk fk i) i ()))
368 ((match-two v (p) g+s sk fk i)
369 (if (and (pair? v) (null? (cdr v)))
370 (let ((w (car v)))
371 (match-one w p ((car v) (set-car! v)) sk fk i))
372 fk))
373 ((match-two v (p *** q) g+s sk fk i)
374 (match-extract-vars p (match-gen-search v p q g+s sk fk i) i ()))
375 ((match-two v (p *** . q) g+s sk fk i)
376 (match-syntax-error "invalid use of ***" (p *** . q)))
377 ((match-two v (p ..1) g+s sk fk i)
378 (if (pair? v)
379 (match-one v (p ___) g+s sk fk i)
380 fk))
381 ((match-two v ($ rec p ...) g+s sk fk i)
382 (if (is-a? v rec)
383 (match-record-refs v rec 0 (p ...) g+s sk fk i)
384 fk))
385 ((match-two v (p . q) g+s sk fk i)
386 (if (pair? v)
387 (let ((w (car v)) (x (cdr v)))
388 (match-one w p ((car v) (set-car! v))
389 (match-one x q ((cdr v) (set-cdr! v)) sk fk)
390 fk
391 i))
392 fk))
393 ((match-two v #(p ...) g+s . x)
394 (match-vector v 0 () (p ...) . x))
395 ((match-two v _ g+s (sk ...) fk i) (sk ... i))
396 ;; Not a pair or vector or special literal, test to see if it's a
397 ;; new symbol, in which case we just bind it, or if it's an
398 ;; already bound symbol or some other literal, in which case we
399 ;; compare it with EQUAL?.
400 ((match-two v x g+s (sk ...) fk (id ...))
401 (let-syntax
402 ((new-sym?
403 (syntax-rules (id ...)
404 ((new-sym? x sk2 fk2) sk2)
405 ((new-sym? y sk2 fk2) fk2))))
406 (new-sym? random-sym-to-match
407 (let ((x v)) (sk ... (id ... x)))
408 (if (equal? v x) (sk ... (id ...)) fk))))
409 ))
410
411;; QUASIQUOTE patterns
412
413(define-syntax match-quasiquote
414 (syntax-rules (unquote unquote-splicing quasiquote)
415 ((_ v (unquote p) g+s sk fk i)
416 (match-one v p g+s sk fk i))
417 ((_ v ((unquote-splicing p) . rest) g+s sk fk i)
418 (if (pair? v)
419 (match-one v
420 (p . tmp)
421 (match-quasiquote tmp rest g+s sk fk)
422 fk
423 i)
424 fk))
425 ((_ v (quasiquote p) g+s sk fk i . depth)
426 (match-quasiquote v p g+s sk fk i #f . depth))
427 ((_ v (unquote p) g+s sk fk i x . depth)
428 (match-quasiquote v p g+s sk fk i . depth))
429 ((_ v (unquote-splicing p) g+s sk fk i x . depth)
430 (match-quasiquote v p g+s sk fk i . depth))
431 ((_ v (p . q) g+s sk fk i . depth)
432 (if (pair? v)
433 (let ((w (car v)) (x (cdr v)))
434 (match-quasiquote
435 w p g+s
436 (match-quasiquote-step x q g+s sk fk depth)
437 fk i . depth))
438 fk))
439 ((_ v #(elt ...) g+s sk fk i . depth)
440 (if (vector? v)
441 (let ((ls (vector->list v)))
442 (match-quasiquote ls (elt ...) g+s sk fk i . depth))
443 fk))
444 ((_ v x g+s sk fk i . depth)
445 (match-one v 'x g+s sk fk i))))
446
447(define-syntax match-quasiquote-step
448 (syntax-rules ()
449 ((match-quasiquote-step x q g+s sk fk depth i)
450 (match-quasiquote x q g+s sk fk i . depth))))
451
452;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
453;; Utilities
454
455;; Takes two values and just expands into the first.
456(define-syntax match-drop-ids
457 (syntax-rules ()
458 ((_ expr ids ...) expr)))
459
460(define-syntax match-tuck-ids
461 (syntax-rules ()
462 ((_ (letish args (expr ...)) ids ...)
463 (letish args (expr ... ids ...)))))
464
465(define-syntax match-drop-first-arg
466 (syntax-rules ()
467 ((_ arg expr) expr)))
468
469;; To expand an OR group we try each clause in succession, passing the
470;; first that succeeds to the success continuation. On failure for
471;; any clause, we just try the next clause, finally resorting to the
472;; failure continuation fk if all clauses fail. The only trick is
473;; that we want to unify the identifiers, so that the success
474;; continuation can refer to a variable from any of the OR clauses.
475
476(define-syntax match-gen-or
477 (syntax-rules ()
478 ((_ v p g+s (sk ...) fk (i ...) ((id id-ls) ...))
479 (let ((sk2 (lambda (id ...) (sk ... (i ... id ...)))))
480 (match-gen-or-step v p g+s (match-drop-ids (sk2 id ...)) fk (i ...))))))
481
482(define-syntax match-gen-or-step
483 (syntax-rules ()
484 ((_ v () g+s sk fk . x)
485 ;; no OR clauses, call the failure continuation
486 fk)
487 ((_ v (p) . x)
488 ;; last (or only) OR clause, just expand normally
489 (match-one v p . x))
490 ((_ v (p . q) g+s sk fk i)
491 ;; match one and try the remaining on failure
492 (let ((fk2 (lambda () (match-gen-or-step v q g+s sk fk i))))
493 (match-one v p g+s sk (fk2) i)))
494 ))
495
496;; We match a pattern (p ...) by matching the pattern p in a loop on
497;; each element of the variable, accumulating the bound ids into lists.
498
499;; Look at the body of the simple case - it's just a named let loop,
500;; matching each element in turn to the same pattern. The only trick
501;; is that we want to keep track of the lists of each extracted id, so
502;; when the loop recurses we cons the ids onto their respective list
503;; variables, and on success we bind the ids (what the user input and
504;; expects to see in the success body) to the reversed accumulated
505;; list IDs.
506
507(define-syntax match-gen-ellipses
508 (syntax-rules ()
509 ((_ v p () g+s (sk ...) fk i ((id id-ls) ...))
510 (match-check-identifier p
511 ;; simplest case equivalent to (p ...), just bind the list
512 (let ((p v))
513 (if (list? p)
514 (sk ... i)
515 fk))
516 ;; simple case, match all elements of the list
517 (let loop ((ls v) (id-ls '()) ...)
518 (cond
519 ((null? ls)
520 (let ((id (reverse id-ls)) ...) (sk ... i)))
521 ((pair? ls)
522 (let ((w (car ls)))
523 (match-one w p ((car ls) (set-car! ls))
524 (match-drop-ids (loop (cdr ls) (cons id id-ls) ...))
525 fk i)))
526 (else
527 fk)))))
528 ((_ v p r g+s (sk ...) fk i ((id id-ls) ...))
529 ;; general case, trailing patterns to match, keep track of the
530 ;; remaining list length so we don't need any backtracking
531 (match-verify-no-ellipses
532 r
533 (let* ((tail-len (length 'r))
534 (ls v)
535 (len (and (list? ls) (length ls))))
536 (if (or (not len) (< len tail-len))
537 fk
538 (let loop ((ls ls) (n len) (id-ls '()) ...)
539 (cond
540 ((= n tail-len)
541 (let ((id (reverse id-ls)) ...)
542 (match-one ls r (#f #f) (sk ...) fk i)))
543 ((pair? ls)
544 (let ((w (car ls)))
545 (match-one w p ((car ls) (set-car! ls))
546 (match-drop-ids
547 (loop (cdr ls) (- n 1) (cons id id-ls) ...))
548 fk
549 i)))
550 (else
551 fk)))))))))
552
553;; This is just a safety check. Although unlike syntax-rules we allow
554;; trailing patterns after an ellipsis, we explicitly disable multiple
555;; ellipses at the same level. This is because in the general case
556;; such patterns are exponential in the number of ellipses, and we
557;; don't want to make it easy to construct very expensive operations
558;; with simple looking patterns. For example, it would be O(n^2) for
559;; patterns like (a ... b ...) because we must consider every trailing
560;; element for every possible break for the leading "a ...".
561
562(define-syntax match-verify-no-ellipses
563 (syntax-rules ()
564 ((_ (x . y) sk)
565 (match-check-ellipse
566 x
567 (match-syntax-error
568 "multiple ellipse patterns not allowed at same level")
569 (match-verify-no-ellipses y sk)))
570 ((_ () sk)
571 sk)
572 ((_ x sk)
573 (match-syntax-error "dotted tail not allowed after ellipse" x))))
574
575;; To implement the tree search, we use two recursive procedures. TRY
576;; attempts to match Y once, and on success it calls the normal SK on
577;; the accumulated list ids as in MATCH-GEN-ELLIPSES. On failure, we
578;; call NEXT which first checks if the current value is a list
579;; beginning with X, then calls TRY on each remaining element of the
580;; list. Since TRY will recursively call NEXT again on failure, this
581;; effects a full depth-first search.
582;;
583;; The failure continuation throughout is a jump to the next step in
584;; the tree search, initialized with the original failure continuation
585;; FK.
586
587(define-syntax match-gen-search
588 (syntax-rules ()
589 ((match-gen-search v p q g+s sk fk i ((id id-ls) ...))
590 (letrec ((try (lambda (w fail id-ls ...)
591 (match-one w q g+s
592 (match-tuck-ids
593 (let ((id (reverse id-ls)) ...)
594 sk))
595 (next w fail id-ls ...) i)))
596 (next (lambda (w fail id-ls ...)
597 (if (not (pair? w))
598 (fail)
599 (let ((u (car w)))
600 (match-one
601 u p ((car w) (set-car! w))
602 (match-drop-ids
603 ;; accumulate the head variables from
604 ;; the p pattern, and loop over the tail
605 (let ((id-ls (cons id id-ls)) ...)
606 (let lp ((ls (cdr w)))
607 (if (pair? ls)
608 (try (car ls)
609 (lambda () (lp (cdr ls)))
610 id-ls ...)
611 (fail)))))
612 (fail) i))))))
613 ;; the initial id-ls binding here is a dummy to get the right
614 ;; number of '()s
615 (let ((id-ls '()) ...)
616 (try v (lambda () fk) id-ls ...))))))
617
618;; Vector patterns are just more of the same, with the slight
619;; exception that we pass around the current vector index being
620;; matched.
621
622(define-syntax match-vector
623 (syntax-rules (___)
624 ((_ v n pats (p q) . x)
625 (match-check-ellipse q
626 (match-gen-vector-ellipses v n pats p . x)
627 (match-vector-two v n pats (p q) . x)))
628 ((_ v n pats (p ___) sk fk i)
629 (match-gen-vector-ellipses v n pats p sk fk i))
630 ((_ . x)
631 (match-vector-two . x))))
632
633;; Check the exact vector length, then check each element in turn.
634
635(define-syntax match-vector-two
636 (syntax-rules ()
637 ((_ v n ((pat index) ...) () sk fk i)
638 (if (vector? v)
639 (let ((len (vector-length v)))
640 (if (= len n)
641 (match-vector-step v ((pat index) ...) sk fk i)
642 fk))
643 fk))
644 ((_ v n (pats ...) (p . q) . x)
645 (match-vector v (+ n 1) (pats ... (p n)) q . x))))
646
647(define-syntax match-vector-step
648 (syntax-rules ()
649 ((_ v () (sk ...) fk i) (sk ... i))
650 ((_ v ((pat index) . rest) sk fk i)
651 (let ((w (vector-ref v index)))
652 (match-one w pat ((vector-ref v index) (vector-set! v index))
653 (match-vector-step v rest sk fk)
654 fk i)))))
655
656;; With a vector ellipse pattern we first check to see if the vector
657;; length is at least the required length.
658
659(define-syntax match-gen-vector-ellipses
660 (syntax-rules ()
661 ((_ v n ((pat index) ...) p sk fk i)
662 (if (vector? v)
663 (let ((len (vector-length v)))
664 (if (>= len n)
665 (match-vector-step v ((pat index) ...)
666 (match-vector-tail v p n len sk fk)
667 fk i)
668 fk))
669 fk))))
670
671(define-syntax match-vector-tail
672 (syntax-rules ()
673 ((_ v p n len sk fk i)
674 (match-extract-vars p (match-vector-tail-two v p n len sk fk i) i ()))))
675
676(define-syntax match-vector-tail-two
677 (syntax-rules ()
678 ((_ v p n len (sk ...) fk i ((id id-ls) ...))
679 (let loop ((j n) (id-ls '()) ...)
680 (if (>= j len)
681 (let ((id (reverse id-ls)) ...) (sk ... i))
682 (let ((w (vector-ref v j)))
683 (match-one w p ((vector-ref v j) (vector-set! v j))
684 (match-drop-ids (loop (+ j 1) (cons id id-ls) ...))
685 fk i)))))))
686
687(define-syntax match-record-refs
688 (syntax-rules ()
689 ((_ v rec n (p . q) g+s sk fk i)
690 (let ((w (slot-ref rec v n)))
691 (match-one w p ((slot-ref rec v n) (slot-set! rec v n))
692 (match-record-refs v rec (+ n 1) q g+s sk fk) fk i)))
693 ((_ v rec n () g+s (sk ...) fk i)
694 (sk ... i))))
695
696;; Extract all identifiers in a pattern. A little more complicated
697;; than just looking for symbols, we need to ignore special keywords
698;; and non-pattern forms (such as the predicate expression in ?
699;; patterns), and also ignore previously bound identifiers.
700;;
701;; Calls the continuation with all new vars as a list of the form
702;; ((orig-var tmp-name) ...), where tmp-name can be used to uniquely
703;; pair with the original variable (e.g. it's used in the ellipse
704;; generation for list variables).
705;;
706;; (match-extract-vars pattern continuation (ids ...) (new-vars ...))
707
708(define-syntax match-extract-vars
709 (syntax-rules (_ ___ ..1 *** ? $ = quote quasiquote and or not get! set!)
710 ((match-extract-vars (? pred . p) . x)
711 (match-extract-vars p . x))
712 ((match-extract-vars ($ rec . p) . x)
713 (match-extract-vars p . x))
714 ((match-extract-vars (= proc p) . x)
715 (match-extract-vars p . x))
716 ((match-extract-vars (quote x) (k ...) i v)
717 (k ... v))
718 ((match-extract-vars (quasiquote x) k i v)
719 (match-extract-quasiquote-vars x k i v (#t)))
720 ((match-extract-vars (and . p) . x)
721 (match-extract-vars p . x))
722 ((match-extract-vars (or . p) . x)
723 (match-extract-vars p . x))
724 ((match-extract-vars (not . p) . x)
725 (match-extract-vars p . x))
726 ;; A non-keyword pair, expand the CAR with a continuation to
727 ;; expand the CDR.
728 ((match-extract-vars (p q . r) k i v)
729 (match-check-ellipse
730 q
731 (match-extract-vars (p . r) k i v)
732 (match-extract-vars p (match-extract-vars-step (q . r) k i v) i ())))
733 ((match-extract-vars (p . q) k i v)
734 (match-extract-vars p (match-extract-vars-step q k i v) i ()))
735 ((match-extract-vars #(p ...) . x)
736 (match-extract-vars (p ...) . x))
737 ((match-extract-vars _ (k ...) i v) (k ... v))
738 ((match-extract-vars ___ (k ...) i v) (k ... v))
739 ((match-extract-vars *** (k ...) i v) (k ... v))
740 ((match-extract-vars ..1 (k ...) i v) (k ... v))
741 ;; This is the main part, the only place where we might add a new
742 ;; var if it's an unbound symbol.
743 ((match-extract-vars p (k ...) (i ...) v)
744 (let-syntax
745 ((new-sym?
746 (syntax-rules (i ...)
747 ((new-sym? p sk fk) sk)
748 ((new-sym? any sk fk) fk))))
749 (new-sym? random-sym-to-match
750 (k ... ((p p-ls) . v))
751 (k ... v))))
752 ))
753
754;; Stepper used in the above so it can expand the CAR and CDR
755;; separately.
756
757(define-syntax match-extract-vars-step
758 (syntax-rules ()
759 ((_ p k i v ((v2 v2-ls) ...))
760 (match-extract-vars p k (v2 ... . i) ((v2 v2-ls) ... . v)))
761 ))
762
763(define-syntax match-extract-quasiquote-vars
764 (syntax-rules (quasiquote unquote unquote-splicing)
765 ((match-extract-quasiquote-vars (quasiquote x) k i v d)
766 (match-extract-quasiquote-vars x k i v (#t . d)))
767 ((match-extract-quasiquote-vars (unquote-splicing x) k i v d)
768 (match-extract-quasiquote-vars (unquote x) k i v d))
769 ((match-extract-quasiquote-vars (unquote x) k i v (#t))
770 (match-extract-vars x k i v))
771 ((match-extract-quasiquote-vars (unquote x) k i v (#t . d))
772 (match-extract-quasiquote-vars x k i v d))
773 ((match-extract-quasiquote-vars (x . y) k i v (#t . d))
774 (match-extract-quasiquote-vars
775 x
776 (match-extract-quasiquote-vars-step y k i v d) i ()))
777 ((match-extract-quasiquote-vars #(x ...) k i v (#t . d))
778 (match-extract-quasiquote-vars (x ...) k i v d))
779 ((match-extract-quasiquote-vars x (k ...) i v (#t . d))
780 (k ... v))
781 ))
782
783(define-syntax match-extract-quasiquote-vars-step
784 (syntax-rules ()
785 ((_ x k i v d ((v2 v2-ls) ...))
786 (match-extract-quasiquote-vars x k (v2 ... . i) ((v2 v2-ls) ... . v) d))
787 ))
788
789
790;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
791;; Gimme some sugar baby.
792
793;;> Shortcut for @scheme{lambda} + @scheme{match}. Creates a
794;;> procedure of one argument, and matches that argument against each
795;;> clause.
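
;;> For example, this should return @scheme{3}:

;;> @example{((match-lambda ((a b) (+ a b))) (list 1 2))}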
796
797(define-syntax match-lambda
798 (syntax-rules ()
799 ((_ (pattern . body) ...) (lambda (expr) (match expr (pattern . body) ...)))))
800
801;;> Similar to @scheme{match-lambda}. Creates a procedure of any
802;;> number of arguments, and matches the argument list against each
803;;> clause.
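
;;> As a rough sketch (clauses and arguments are illustrative only),
;;> the whole argument list is matched, so clauses can dispatch on
;;> the number of arguments:

;;> @example{((match-lambda* ((a b) (+ a b)) ((a) a)) 1 2)}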
d967913f
LC
804
805(define-syntax match-lambda*
806 (syntax-rules ()
5fcb7b3c
LC
807 ((_ (pattern . body) ...) (lambda expr (match expr (pattern . body) ...)))))
808
809;;> Matches each var to the corresponding expression, and evaluates
810;;> the body with all match variables in scope. Raises an error if
811;;> any of the expressions fail to match. Syntax analogous to named
812;;> let can also be used for recursive functions which match on their
813;;> arguments as in @scheme{match-lambda*}.
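
;;> As a rough sketch (the bindings are illustrative only), pattern
;;> bindings and plain variable bindings can be mixed:

;;> @example{(match-let (((a b) (list 1 2)) (c 3)) (+ a b c))}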
d967913f
LC
814
815(define-syntax match-let
816 (syntax-rules ()
5fcb7b3c
LC
817 ((_ ((var value) ...) . body)
818 (match-let/helper let () () ((var value) ...) . body))
819 ((_ loop ((var init) ...) . body)
820 (match-named-let loop () ((var init) ...) . body))))
821
822;;> Similar to @scheme{match-let}, but analogously to @scheme{letrec}
823;;> matches and binds the variables with all match variables in scope.
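
;;> As a rough sketch using only plain identifiers as patterns (the
;;> local procedure names are chosen for illustration), mutually
;;> recursive procedures can refer to one another:

;;> @example{
;;> (match-letrec ((even? (lambda (n) (if (= n 0) #t (odd? (- n 1)))))
;;>                (odd? (lambda (n) (if (= n 0) #f (even? (- n 1))))))
;;>   (even? 10))
;;> }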
d967913f
LC
824
825(define-syntax match-letrec
826 (syntax-rules ()
5fcb7b3c
LC
827 ((_ ((var value) ...) . body)
828 (match-let/helper letrec () () ((var value) ...) . body))))
d967913f
LC
829
830(define-syntax match-let/helper
831 (syntax-rules ()
832 ((_ let ((var expr) ...) () () . body)
833 (let ((var expr) ...) . body))
834 ((_ let ((var expr) ...) ((pat tmp) ...) () . body)
835 (let ((var expr) ...)
836 (match-let* ((pat tmp) ...)
837 . body)))
838 ((_ let (v ...) (p ...) (((a . b) expr) . rest) . body)
839 (match-let/helper
840 let (v ... (tmp expr)) (p ... ((a . b) tmp)) rest . body))
841 ((_ let (v ...) (p ...) ((#(a ...) expr) . rest) . body)
842 (match-let/helper
843 let (v ... (tmp expr)) (p ... (#(a ...) tmp)) rest . body))
844 ((_ let (v ...) (p ...) ((a expr) . rest) . body)
845 (match-let/helper let (v ... (a expr)) (p ...) rest . body))))
846
847(define-syntax match-named-let
848 (syntax-rules ()
849 ((_ loop ((pat expr var) ...) () . body)
850 (let loop ((var expr) ...)
851 (match-let ((pat var) ...)
852 . body)))
853 ((_ loop (v ...) ((pat expr) . rest) . body)
854 (match-named-let loop (v ... (pat expr tmp)) rest . body))))
855
5fcb7b3c
LC
856;;> @subsubsubsection{@rawcode{(match-let* ((var value) ...) body ...)}}
857
858;;> Similar to @scheme{match-let}, but analogously to @scheme{let*}
859;;> matches and binds the variables in sequence, with preceding match
860;;> variables in scope.
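
;;> As a rough sketch (the bindings are illustrative only), the second
;;> binding below refers to variables bound by the first:

;;> @example{(match-let* (((a b) (list 1 2)) ((c) (list (+ a b)))) (list a b c))}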
861
d967913f
LC
862(define-syntax match-let*
863 (syntax-rules ()
864 ((_ () . body)
865 (begin . body))
866 ((_ ((pat expr) . rest) . body)
867 (match expr (pat (match-let* rest . body))))))
868
869
870;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
871;; Otherwise COND-EXPANDed bits.
872
873;; This *should* work, but doesn't :(
874;; (define-syntax match-check-ellipse
875;; (syntax-rules (...)
876;; ((_ ... sk fk) sk)
877;; ((_ x sk fk) fk)))
878
879;; This is a little more complicated, and introduces a new let-syntax,
880;; but should work portably in any R[56]RS Scheme. Taylor Campbell
881;; originally came up with the idea.
882(define-syntax match-check-ellipse
883 (syntax-rules ()
884 ;; these two aren't necessary but provide fast-case failures
885 ((match-check-ellipse (a . b) success-k failure-k) failure-k)
886 ((match-check-ellipse #(a ...) success-k failure-k) failure-k)
887 ;; matching an atom
888 ((match-check-ellipse id success-k failure-k)
889 (let-syntax ((ellipse? (syntax-rules ()
890 ;; iff `id' is `...' here then this will
891 ;; match a list of any length
892 ((ellipse? (foo id) sk fk) sk)
893 ((ellipse? other sk fk) fk))))
894 ;; this list of three elements will only match the (foo id) list
895 ;; above if `id' is `...'
896 (ellipse? (a b c) success-k failure-k)))))
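
;; A small sketch of the intended behaviour. These uses are hypothetical,
;; with quoted symbols standing in for the success and failure
;; continuations, and the first result assumes the implementation
;; recognises an ellipsis passed in through a pattern variable:
;;
;;   (match-check-ellipse ... 'ellipsis 'other)      ; expected: ellipsis
;;   (match-check-ellipse x 'ellipsis 'other)        ; expected: other
;;   (match-check-ellipse (a . b) 'ellipsis 'other)  ; expected: other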
897
898
899;; This is portable but can be more efficient with non-portable
900;; extensions. This trick was originally discovered by Oleg Kiselyov.
901
902(define-syntax match-check-identifier
903 (syntax-rules ()
904 ;; fast-case failures, lists and vectors are not identifiers
905 ((_ (x . y) success-k failure-k) failure-k)
906 ((_ #(x ...) success-k failure-k) failure-k)
907 ;; x is an atom
908 ((_ x success-k failure-k)
909 (let-syntax
910 ((sym?
911 (syntax-rules ()
912 ;; if the symbol `abracadabra' matches x, then x is a
913 ;; symbol
914 ((sym? x sk fk) sk)
915 ;; otherwise x is a non-symbol datum
916 ((sym? y sk fk) fk))))
917 (sym? abracadabra success-k failure-k)))))
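
;; A small sketch of the intended behaviour. These uses are hypothetical,
;; with quoted symbols standing in for the success and failure
;; continuations:
;;
;;   (match-check-identifier foo 'identifier 'other)    ; expected: identifier
;;   (match-check-identifier 42 'identifier 'other)     ; expected: other
;;   (match-check-identifier (a b) 'identifier 'other)  ; expected: other
;;
;; In the first case `foo' becomes a pattern variable of the inner
;; `sym?' macro, so it matches `abracadabra'.  In the second case the
;; literal datum 42 does not match the symbol, so the fallback rule
;; is chosen.  The third case is rejected by the fast-case pair rule.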