Commit | Line | Data |
---|---|---|
96ca59d8 NJ |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
7aa394b5 | 3 | @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009, 2010, 2012 |
4 | @c Free Software Foundation, Inc. |
5 | @c See the file guile.texi for copying conditions. | |
6 | ||
7 | @node Regular Expressions | |
8 | @section Regular Expressions | |
9 | @tpindex Regular expressions | |
10 | ||
11 | @cindex regular expressions | |
12 | @cindex regex | |
13 | @cindex emacs regexp | |
14 | ||
15 | A @dfn{regular expression} (or @dfn{regexp}) is a pattern that | |
16 | describes a whole class of strings. A full description of regular | |
17 | expressions and their syntax is beyond the scope of this manual; | |
18 | an introduction can be found in the Emacs manual (@pxref{Regexps, | |
19 | , Syntax of Regular Expressions, emacs, The GNU Emacs Manual}), or | |
20 | in many general Unix reference books. | |
21 | ||
22 | If your system does not include a POSIX regular expression library, | |
23 | and you have not linked Guile with a third-party regexp library such | |
24 | as Rx, these functions will not be available. You can tell whether | |
25 | your Guile installation includes regular expression support by | |
26 | checking whether @code{(provided? 'regex)} returns true. | |
27 | ||
28 | The following regexp and string matching features are provided by the | |
29 | @code{(ice-9 regex)} module. Before using the described functions, | |
30 | you should load this module by executing @code{(use-modules (ice-9 | |
31 | regex))}. | |
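
As a minimal sketch of the two steps just described, a program can test
for the feature and then load the module. Nothing here goes beyond the
calls named above.

@lisp
;; #t when this Guile build includes regular expression support.
(provided? 'regex)

;; Make string-match, match:substring and friends available.
(use-modules (ice-9 regex))
@end lisp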
32 | ||
33 | @menu | |
34 | * Regexp Functions:: Functions that create and match regexps. | |
35 | * Match Structures:: Finding what was matched by a regexp. | |
36 | * Backslash Escapes:: Removing the special meaning of regexp | |
37 | meta-characters. | |
38 | @end menu | |
39 | ||
40 | ||
41 | @node Regexp Functions | |
42 | @subsection Regexp Functions | |
43 | ||
44 | By default, Guile supports POSIX extended regular expressions. | |
45 | That means that the characters @samp{(}, @samp{)}, @samp{+} and | |
46 | @samp{?} are special, and must be escaped if you wish to match the | |
47 | literal characters. | |
48 | ||
49 | This regular expression interface was modeled after that | |
50 | implemented by SCSH, the Scheme Shell. It is intended to be | |
51 | upwardly compatible with SCSH regular expressions. | |
52 | ||
53 | Zero bytes (@code{#\nul}) cannot be used in regex patterns or input | |
54 | strings, since the underlying C functions treat a zero byte as the end | |
55 | of the string. If either one contains a zero byte, an error is thrown. | |
56 | ||
57 | Internally, patterns and input strings are converted to the current |
58 | locale's encoding, and then passed to the C library's regular expression | |
59 | routines (@pxref{Regular Expressions,,, libc, The GNU C Library | |
60 | Reference Manual}). The returned match structures always point to | |
61 | characters in the strings, not to individual bytes, even in the case of | |
62 | multi-byte encodings. | |
63 | |
64 | @deffn {Scheme Procedure} string-match pattern str [start] | |
65 | Compile the string @var{pattern} into a regular expression and compare | |
66 | it with @var{str}. The optional numeric argument @var{start} specifies | |
67 | the position of @var{str} at which to begin matching. | |
68 | ||
69 | @code{string-match} returns a @dfn{match structure} which | |
70 | describes what, if anything, was matched by the regular | |
71 | expression. @xref{Match Structures}. If @var{str} does not match | |
72 | @var{pattern} at all, @code{string-match} returns @code{#f}. | |
73 | @end deffn | |
74 | ||
75 | Two examples follow. In the first, the pattern matches the four | |
76 | digits in the target string. In the second, the pattern matches | |
77 | nothing, so @code{string-match} returns @code{#f}. | |
78 | ||
79 | @example | |
80 | (string-match "[0-9][0-9][0-9][0-9]" "blah2002") | |
81 | @result{} #("blah2002" (4 . 8)) | |
82 | ||
83 | (string-match "[A-Za-z]" "123456") | |
84 | @result{} #f | |
85 | @end example | |
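
The optional @var{start} argument and the @code{#f} return value can be
combined as in the following sketch. The input strings are made up for
illustration; @code{and=>} is a core Guile procedure that applies its
second argument only when the first is true.

@lisp
(use-modules (ice-9 regex))

;; Begin matching at index 6, skipping the earlier "123".
(define m (string-match "[0-9]+" "abc123def456" 6))
(match:substring m)
@result{} "456"
(match:start m)
@result{} 9

;; string-match returns #f on failure, so guard before using the result.
(and=> (string-match "[0-9]+" "no digits") match:substring)
@result{} #f
@end lisp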
86 | ||
87 | Each time @code{string-match} is called, it must compile its | |
88 | @var{pattern} argument into a regular expression structure. This | |
89 | operation is expensive, which makes @code{string-match} inefficient if | |
90 | the same regular expression is used several times (for example, in a | |
91 | loop). For better performance, you can compile a regular expression in | |
92 | advance and then match strings against the compiled regexp. | |
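
For instance, a loop over many strings might compile the pattern once
with @code{make-regexp} and reuse it with @code{regexp-exec}, both
described below. This is only a sketch; the name @code{digits-rx} and
the sample list are illustrative.

@lisp
(use-modules (ice-9 regex))

;; Compile once, outside the loop.
(define digits-rx (make-regexp "^[0-9]+$"))

;; Reuse the compiled regexp for every element.
(filter (lambda (s) (regexp-exec digits-rx s))
        '("123" "abc" "42x" "7"))
@result{} ("123" "7")
@end lisp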
93 | ||
94 | @deffn {Scheme Procedure} make-regexp pat flag@dots{} | |
95 | @deffnx {C Function} scm_make_regexp (pat, flaglst) | |
96 | Compile the regular expression described by @var{pat}, and | |
97 | return the compiled regexp structure. If @var{pat} does not | |
98 | describe a legal regular expression, @code{make-regexp} throws | |
99 | a @code{regular-expression-syntax} error. | |
100 | ||
101 | The @var{flag} arguments change the behavior of the compiled | |
102 | regular expression. The following values may be supplied: | |
103 | ||
104 | @defvar regexp/icase | |
105 | Consider uppercase and lowercase letters to be the same when | |
106 | matching. | |
107 | @end defvar | |
108 | ||
109 | @defvar regexp/newline | |
110 | If a newline appears in the target string, then permit the | |
111 | @samp{^} and @samp{$} operators to match immediately after or | |
112 | immediately before the newline, respectively. Also, the | |
113 | @samp{.} and @samp{[^...]} operators will never match a newline | |
114 | character. The intent of this flag is to treat the target | |
115 | string as a buffer containing many lines of text, and the | |
116 | regular expression as a pattern that may match a single one of | |
117 | those lines. | |
118 | @end defvar | |
119 | ||
120 | @defvar regexp/basic | |
121 | Compile a basic (``obsolete'') regexp instead of the extended | |
122 | (``modern'') regexps that are the default. Basic regexps do | |
123 | not consider @samp{|}, @samp{+} or @samp{?} to be special | |
124 | characters, and require the @samp{@{...@}} and @samp{(...)} | |
125 | metacharacters to be backslash-escaped (@pxref{Backslash | |
126 | Escapes}). There are several other differences between basic | |
127 | and extended regular expressions, but these are the most | |
128 | significant. | |
129 | @end defvar | |
130 | ||
131 | @defvar regexp/extended | |
132 | Compile an extended regular expression rather than a basic | |
133 | regexp. This is the default behavior; this flag will not | |
134 | usually be needed. If a call to @code{make-regexp} includes | |
135 | both @code{regexp/basic} and @code{regexp/extended} flags, the | |
136 | one which comes last will override the earlier one. | |
137 | @end defvar | |
138 | @end deffn | |
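
The following sketch combines two of the flags above and shows the
error thrown for an invalid pattern. The sample strings are
illustrative, and exactly which patterns are rejected as invalid may
depend on the underlying C library.

@lisp
(use-modules (ice-9 regex))

;; Treat the target as several lines and ignore case.
(define line-rx (make-regexp "^foo$" regexp/newline regexp/icase))
(match:substring (regexp-exec line-rx "bar\nFOO\nbaz"))
@result{} "FOO"

;; In a basic regexp, + is an ordinary character.
(define basic-rx (make-regexp "a+" regexp/basic))
(match:substring (regexp-exec basic-rx "aaa a+"))
@result{} "a+"

;; An unbalanced parenthesis is not a legal extended regexp.
(catch 'regular-expression-syntax
  (lambda () (make-regexp "("))
  (lambda (key . args) 'invalid))
@result{} invalid
@end lisp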
139 | ||
140 | @deffn {Scheme Procedure} regexp-exec rx str [start [flags]] | |
141 | @deffnx {C Function} scm_regexp_exec (rx, str, start, flags) | |
142 | Match the compiled regular expression @var{rx} against | |
143 | @var{str}. If the optional integer @var{start} argument is | |
144 | provided, begin matching from that position in the string. | |
145 | Return a match structure describing the results of the match, | |
146 | or @code{#f} if no match could be found. | |
147 | ||
148 | The @var{flags} argument changes the matching behavior. The following | |
149 | flag values may be supplied; use @code{logior} (@pxref{Bitwise | |
150 | Operations}) to combine them: | |
151 | ||
152 | @defvar regexp/notbol | |
153 | Consider that the @var{start} offset into @var{str} is not the | |
154 | beginning of a line and should not match operator @samp{^}. | |
155 | ||
156 | If @var{rx} was created with the @code{regexp/newline} option above, | |
157 | @samp{^} will still match after a newline in @var{str}. | |
158 | @end defvar | |
159 | ||
160 | @defvar regexp/noteol | |
161 | Consider that the end of @var{str} is not the end of a line and should | |
162 | not match operator @samp{$}. | |
163 | ||
164 | If @var{rx} was created with the @code{regexp/newline} option above, | |
165 | @samp{$} will still match before a newline in @var{str}. | |
166 | @end defvar | |
167 | @end deffn | |
168 | ||
169 | @lisp | |
170 | ;; Regexp to match uppercase letters | |
171 | (define r (make-regexp "[A-Z]*")) | |
172 | ||
173 | ;; Regexp to match letters, ignoring case | |
174 | (define ri (make-regexp "[A-Z]*" regexp/icase)) | |
175 | ||
176 | ;; Match "bob" against r: with no uppercase letters, only the | |
177 | ;; empty string at the start matches | |
177 | (match:substring (regexp-exec r "bob")) | |
178 | @result{} "" ; empty match | |
179 | ||
180 | ;; Match "Bob" against ri | |
181 | (match:substring (regexp-exec ri "Bob")) | |
182 | @result{} "Bob" ; matched case-insensitively | |
183 | @end lisp | |
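
The @var{flags} argument of @code{regexp-exec} can be sketched as
follows. Note that @var{start} must be given explicitly in order to
pass @var{flags}; the patterns and strings are illustrative.

@lisp
(use-modules (ice-9 regex))

(define bol-rx (make-regexp "^abc"))
(define eol-rx (make-regexp "abc$"))

;; By default ^ matches at the start of the string.
(match:substring (regexp-exec bol-rx "abc"))
@result{} "abc"

;; With regexp/notbol the start of the string no longer matches ^.
(regexp-exec bol-rx "abc" 0 regexp/notbol)
@result{} #f

;; Flags are integers and are combined with logior.
(regexp-exec eol-rx "abc" 0 (logior regexp/notbol regexp/noteol))
@result{} #f
@end lisp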
184 | ||
185 | @deffn {Scheme Procedure} regexp? obj | |
186 | @deffnx {C Function} scm_regexp_p (obj) | |
187 | Return @code{#t} if @var{obj} is a compiled regular expression, | |
188 | or @code{#f} otherwise. | |
189 | @end deffn | |
190 | ||
191 | @sp 1 | |
192 | @deffn {Scheme Procedure} list-matches regexp str [flags] | |
193 | Return a list of match structures which are the non-overlapping | |
194 | matches of @var{regexp} in @var{str}. @var{regexp} can be either a | |
195 | pattern string or a compiled regexp. The @var{flags} argument is as | |
196 | per @code{regexp-exec} above. | |
197 | ||
198 | @example | |
199 | (map match:substring (list-matches "[a-z]+" "abc 42 def 78")) | |
200 | @result{} ("abc" "def") | |
201 | @end example | |
202 | @end deffn | |
203 | ||
204 | @deffn {Scheme Procedure} fold-matches regexp str init proc [flags] | |
205 | Apply @var{proc} to the non-overlapping matches of @var{regexp} in | |
206 | @var{str}, to build a result. @var{regexp} can be either a pattern | |
207 | string or a compiled regexp. The @var{flags} argument is as per | |
208 | @code{regexp-exec} above. | |
209 | ||
210 | @var{proc} is called as @code{(@var{proc} match prev)} where | |
211 | @var{match} is a match structure and @var{prev} is the previous return | |
212 | from @var{proc}. For the first call @var{prev} is the given | |
213 | @var{init} parameter. @code{fold-matches} returns the final value | |
214 | from @var{proc}. | |
215 | ||
216 | For example to count matches, | |
217 | ||
218 | @example | |
219 | (fold-matches "[a-z][0-9]" "abc x1 def y2" 0 | |
220 | (lambda (match count) | |
221 | (1+ count))) | |
222 | @result{} 2 | |
223 | @end example | |
224 | @end deffn | |
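
As a further sketch, the accumulator can hold any value. The first
call below sums the numbers found in an illustrative string; the second
collects the matched substrings, which pile up most-recent-first when
built with @code{cons} and are therefore reversed at the end.

@lisp
(use-modules (ice-9 regex))

;; Sum every run of digits.
(fold-matches "[0-9]+" "abc 42 def 78" 0
              (lambda (match sum)
                (+ sum (string->number (match:substring match)))))
@result{} 120

;; Collect the matched substrings in order of appearance.
(reverse
 (fold-matches "[0-9]+" "abc 42 def 78" '()
               (lambda (match acc)
                 (cons (match:substring match) acc))))
@result{} ("42" "78")
@end lisp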
225 | ||
226 | @sp 1 | |
227 | Regular expressions are commonly used to find patterns in one string | |
228 | and replace them with the contents of another string. The following | |
229 | functions are convenient ways to do this. | |
230 | ||
231 | @c begin (scm-doc-string "regex.scm" "regexp-substitute") | |
df0a1002 | 232 | @deffn {Scheme Procedure} regexp-substitute port match item @dots{} |
233 | Write to @var{port} selected parts of the match structure @var{match}. |
234 | Or if @var{port} is @code{#f} then form a string from those parts and | |
235 | return that. | |
236 | ||
237 | Each @var{item} specifies a part to be written, and may be one of the | |
238 | following, | |
239 | ||
240 | @itemize @bullet | |
241 | @item | |
242 | A string. String arguments are written out verbatim. | |
243 | ||
244 | @item | |
245 | An integer. The submatch with that number is written | |
246 | (@code{match:substring}). Zero is the entire match. | |
247 | ||
248 | @item | |
249 | The symbol @samp{pre}. The portion of the matched string preceding | |
250 | the regexp match is written (@code{match:prefix}). | |
251 | ||
252 | @item | |
253 | The symbol @samp{post}. The portion of the matched string following | |
254 | the regexp match is written (@code{match:suffix}). | |
255 | @end itemize | |
256 | ||
257 | For example, changing a match and retaining the text before and after, | |
258 | ||
259 | @example | |
260 | (regexp-substitute #f (string-match "[0-9]+" "number 25 is good") | |
261 | 'pre "37" 'post) | |
262 | @result{} "number 37 is good" | |
263 | @end example | |
264 | ||
265 | Or matching a @sc{yyyymmdd} format date such as @samp{20020828} and | |
266 | re-ordering and hyphenating the fields. | |
267 | ||
268 | @lisp | |
269 | (define date-regex | |
270 | "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])") | |
271 | (define s "Date 20020429 12am.") | |
272 | (regexp-substitute #f (string-match date-regex s) | |
273 | 'pre 2 "-" 3 "-" 1 'post " (" 0 ")") | |
274 | @result{} "Date 04-29-2002 12am. (20020429)" | |
275 | @end lisp | |
276 | @end deffn | |
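
When @var{port} is an output port rather than @code{#f}, the selected
parts are written to it. A minimal sketch, using
@code{call-with-output-string} to capture the output; the input string
is illustrative.

@lisp
(use-modules (ice-9 regex))

(define m (string-match "([a-z]+)=([0-9]+)" "set width=80 now"))

;; Write submatch 2, a literal string, then submatch 1 to the port.
(call-with-output-string
  (lambda (port)
    (regexp-substitute port m 2 " is the " 1)))
@result{} "80 is the width"
@end lisp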
277 | ||
278 | ||
279 | @c begin (scm-doc-string "regex.scm" "regexp-substitute/global") | |
df0a1002 | 280 | @deffn {Scheme Procedure} regexp-substitute/global port regexp target item@dots{} |
281 | @cindex search and replace |
282 | Write to @var{port} selected parts of matches of @var{regexp} in | |
283 | @var{target}. If @var{port} is @code{#f} then form a string from | |
284 | those parts and return that. @var{regexp} can be a string or a | |
285 | compiled regex. | |
286 | ||
287 | This is similar to @code{regexp-substitute}, but allows global | |
288 | substitutions on @var{target}. Each @var{item} behaves as per | |
289 | @code{regexp-substitute}, with the following differences, | |
290 | ||
291 | @itemize @bullet | |
292 | @item | |
293 | A function. Called as @code{(@var{item} match)} with the match | |
294 | structure for the @var{regexp} match, it should return a string to be | |
295 | written to @var{port}. | |
296 | ||
297 | @item | |
298 | The symbol @samp{post}. This doesn't output anything, but instead | |
299 | causes @code{regexp-substitute/global} to recurse on the unmatched | |
300 | portion of @var{target}. | |
301 | ||
302 | This @emph{must} be supplied to perform a global search and replace on | |
303 | @var{target}; without it @code{regexp-substitute/global} returns after | |
304 | a single match and output. | |
305 | @end itemize | |
306 | ||
307 | For example, to collapse runs of tabs and spaces to a single hyphen | |
308 | each, | |
309 | ||
310 | @example | |
311 | (regexp-substitute/global #f "[ \t]+" "this is the text" | |
312 | 'pre "-" 'post) | |
313 | @result{} "this-is-the-text" | |
314 | @end example | |
315 | ||
316 | Or using a function to reverse the letters in each word, | |
317 | ||
318 | @example | |
319 | (regexp-substitute/global #f "[a-z]+" "to do and not-do" | |
320 | 'pre (lambda (m) (string-reverse (match:substring m))) 'post) | |
321 | @result{} "ot od dna ton-od" | |
322 | @end example | |
323 | ||
324 | Without the @code{post} symbol, just one regexp match is made. The | |
325 | following is the date example from @code{regexp-substitute} above, | |
326 | written without the separate @code{string-match} call. Here | |
327 | @code{post} writes the rest of @var{target}, which holds no further match. | |
328 | ||
329 | @lisp | |
330 | (define date-regex | |
331 | "([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9])") | |
332 | (define s "Date 20020429 12am.") | |
333 | (regexp-substitute/global #f date-regex s | |
334 | 'pre 2 "-" 3 "-" 1 'post " (" 0 ")") | |
335 | ||
336 | @result{} "Date 04-29-2002 12am. (20020429)" | |
337 | @end lisp | |
338 | @end deffn | |
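
The effect of omitting @code{post} can be seen in this sketch, which
replaces runs of digits in an illustrative string with @samp{#}.

@lisp
(use-modules (ice-9 regex))

;; With post, every match is replaced.
(regexp-substitute/global #f "[0-9]+" "a1b22c333"
  'pre "#" 'post)
@result{} "a#b#c#"

;; Without post, output stops after the first match.
(regexp-substitute/global #f "[0-9]+" "a1b22c333"
  'pre "#")
@result{} "a#"
@end lisp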
339 | ||
340 | ||
341 | @node Match Structures | |
342 | @subsection Match Structures | |
343 | ||
344 | @cindex match structures | |
345 | ||
346 | A @dfn{match structure} is the object returned by @code{string-match} and | |
347 | @code{regexp-exec}. It describes which portion of a string, if any, | |
348 | matched the given regular expression. Match structures include: a | |
349 | reference to the string that was checked for matches; the starting and | |
350 | ending positions of the regexp match; and, if the regexp included any | |
351 | parenthesized subexpressions, the starting and ending positions of each | |
352 | submatch. | |
353 | ||
354 | In each of the regexp match functions described below, the @code{match} | |
355 | argument must be a match structure returned by a previous call to | |
356 | @code{string-match} or @code{regexp-exec}. Most of these functions | |
357 | return some information about the original target string that was | |
358 | matched against a regular expression; we will call that string | |
359 | @var{target} for easy reference. | |
360 | ||
361 | @c begin (scm-doc-string "regex.scm" "regexp-match?") | |
362 | @deffn {Scheme Procedure} regexp-match? obj | |
363 | Return @code{#t} if @var{obj} is a match structure returned by a | |
364 | previous call to @code{regexp-exec}, or @code{#f} otherwise. | |
365 | @end deffn | |
366 | ||
367 | @c begin (scm-doc-string "regex.scm" "match:substring") | |
368 | @deffn {Scheme Procedure} match:substring match [n] | |
369 | Return the portion of @var{target} matched by subexpression number | |
370 | @var{n}. Submatch 0 (the default) represents the entire regexp match. | |
371 | If the regular expression as a whole matched, but the subexpression | |
372 | number @var{n} did not match, return @code{#f}. | |
373 | @end deffn | |
374 | ||
375 | @lisp | |
376 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
377 | (match:substring s) | |
378 | @result{} "2002" | |
379 | ||
380 | ;; match starting at offset 6 in the string | |
381 | (match:substring | |
382 | (string-match "[0-9][0-9][0-9][0-9]" "blah987654" 6)) | |
383 | @result{} "7654" | |
384 | @end lisp | |
385 | ||
386 | @c begin (scm-doc-string "regex.scm" "match:start") | |
387 | @deffn {Scheme Procedure} match:start match [n] | |
388 | Return the starting position of submatch number @var{n}. | |
389 | @end deffn | |
390 | ||
391 | In the following example, the result is 4, since the match starts at | |
392 | character index 4: | |
393 | ||
394 | @lisp | |
395 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
396 | (match:start s) | |
397 | @result{} 4 | |
398 | @end lisp | |
399 | ||
400 | @c begin (scm-doc-string "regex.scm" "match:end") | |
401 | @deffn {Scheme Procedure} match:end match [n] | |
402 | Return the ending position of submatch number @var{n}. | |
403 | @end deffn | |
404 | ||
405 | In the following example, the result is 8, since the match spans | |
679cceed | 406 | indices 4 (inclusive) to 8 (exclusive), i.e.@: the ``2002''. |
407 | |
408 | @lisp | |
409 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
410 | (match:end s) | |
411 | @result{} 8 | |
412 | @end lisp | |
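
With a submatch number @var{n}, @code{match:start} and @code{match:end}
report the bounds of that parenthesized subexpression. A short sketch
with an illustrative pattern and target:

@lisp
(use-modules (ice-9 regex))

(define m (string-match "([a-z]+)-([0-9]+)" "id: abc-123;"))

;; Bounds of the first group, "abc".
(list (match:start m 1) (match:end m 1))
@result{} (4 7)

;; Bounds of the second group, "123".
(list (match:start m 2) (match:end m 2))
@result{} (8 11)

;; The end position is exclusive, so substring recovers the text.
(substring (match:string m) (match:start m 2) (match:end m 2))
@result{} "123"
@end lisp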
413 | ||
414 | @c begin (scm-doc-string "regex.scm" "match:prefix") | |
415 | @deffn {Scheme Procedure} match:prefix match | |
416 | Return the unmatched portion of @var{target} preceding the regexp match. | |
417 | ||
418 | @lisp | |
419 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
420 | (match:prefix s) | |
421 | @result{} "blah" | |
422 | @end lisp | |
423 | @end deffn | |
424 | ||
425 | @c begin (scm-doc-string "regex.scm" "match:suffix") | |
426 | @deffn {Scheme Procedure} match:suffix match | |
427 | Return the unmatched portion of @var{target} following the regexp match. | |
428 | @end deffn | |
429 | ||
430 | @lisp | |
431 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
432 | (match:suffix s) | |
433 | @result{} "foo" | |
434 | @end lisp | |
435 | ||
436 | @c begin (scm-doc-string "regex.scm" "match:count") | |
437 | @deffn {Scheme Procedure} match:count match | |
438 | Return the number of parenthesized subexpressions from @var{match}. | |
439 | Note that the entire regular expression match itself counts as a | |
440 | subexpression, and failed submatches are included in the count. | |
441 | @end deffn | |
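
The following sketch shows how a failed submatch is counted and
reported. The alternation is illustrative: only its second branch can
match the target.

@lisp
(use-modules (ice-9 regex))

(define m (string-match "(a)|(b)" "b"))

;; The whole match plus two parenthesized groups.
(match:count m)
@result{} 3

;; Group 1 did not take part in the match.
(match:substring m 1)
@result{} #f

(match:substring m 2)
@result{} "b"
@end lisp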
442 | ||
443 | @c begin (scm-doc-string "regex.scm" "match:string") | |
444 | @deffn {Scheme Procedure} match:string match | |
445 | Return the original @var{target} string. | |
446 | @end deffn | |
447 | ||
448 | @lisp | |
449 | (define s (string-match "[0-9][0-9][0-9][0-9]" "blah2002foo")) | |
450 | (match:string s) | |
451 | @result{} "blah2002foo" | |
452 | @end lisp | |
453 | ||
454 | ||
455 | @node Backslash Escapes | |
456 | @subsection Backslash Escapes | |
457 | ||
458 | Sometimes you will want a regexp to match characters like @samp{*} or | |
459 | @samp{$} exactly. For example, to check whether a particular string | |
460 | represents a menu entry from an Info node, it would be useful to match | |
461 | it against a regexp like @samp{^* [^:]*::}. However, this won't work; | |
462 | because the asterisk is a metacharacter, it won't match the @samp{*} at | |
463 | the beginning of the string. In this case, we want to make the first | |
464 | asterisk un-magic. | |
465 | ||
466 | You can do this by preceding the metacharacter with a backslash | |
467 | character @samp{\}. (This is also called @dfn{quoting} the | |
468 | metacharacter, and is known as a @dfn{backslash escape}.) When Guile | |
469 | sees a backslash in a regular expression, it considers the following | |
470 | glyph to be an ordinary character, no matter what special meaning it | |
471 | would ordinarily have. Therefore, we can make the above example work by | |
472 | changing the regexp to @samp{^\* [^:]*::}. The @samp{\*} sequence tells | |
473 | the regular expression engine to match only a single asterisk in the | |
474 | target string. | |
475 | ||
476 | Since the backslash is itself a metacharacter, you may force a regexp to | |
477 | match a backslash in the target string by preceding the backslash with | |
478 | itself. For example, to find variable references in a @TeX{} program, | |
479 | you might want to find occurrences of the string @samp{\let\} followed | |
480 | by any number of alphabetic characters. The regular expression | |
481 | @samp{\\let\\[A-Za-z]*} would do this: the double backslashes in the | |
482 | regexp each match a single backslash in the target string. | |
483 | ||
484 | @c begin (scm-doc-string "regex.scm" "regexp-quote") | |
485 | @deffn {Scheme Procedure} regexp-quote str | |
486 | Quote each special character found in @var{str} with a backslash, and | |
487 | return the resulting string. | |
488 | @end deffn | |
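
A common use is to search for a literal string that happens to contain
metacharacters. In this sketch the string @code{literal} is
illustrative; the exact quoted form returned by @code{regexp-quote} is
not shown, since only its matching behavior matters here.

@lisp
(use-modules (ice-9 regex))

(define literal "1+1=2?")

;; Unquoted, + and ? act as operators and the pattern fails to match.
(string-match literal "is 1+1=2? yes")
@result{} #f

;; Quoted, the pattern matches the literal text.
(match:substring
 (string-match (regexp-quote literal) "is 1+1=2? yes"))
@result{} "1+1=2?"
@end lisp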
489 | ||
490 | @strong{Very important:} Using backslash escapes in Guile source code | |
491 | (as in Emacs Lisp or C) can be tricky, because the backslash character | |
492 | has special meaning for the Guile reader. For example, if Guile | |
493 | encounters the character sequence @samp{\n} in the middle of a string | |
494 | while processing Scheme code, it replaces those characters with a | |
495 | newline character. Similarly, the character sequence @samp{\t} is | |
496 | replaced by a horizontal tab. Several of these @dfn{escape sequences} | |
497 | are processed by the Guile reader before your code is executed. | |
498 | How unrecognized escape sequences such as @samp{\*} are handled depends | |
499 | on the Guile version: older readers dropped the backslash and kept the | |
500 | @samp{*}, while more recent ones signal a read error. | |
501 | ||
502 | This translation is obviously undesirable for regular expressions, since | |
503 | we want to be able to include backslashes in a string in order to | |
504 | escape regexp metacharacters. Therefore, to make sure that a backslash | |
505 | is preserved in a string in your Guile program, you must use @emph{two} | |
506 | consecutive backslashes: | |
507 | ||
508 | @lisp | |
509 | (define Info-menu-entry-pattern (make-regexp "^\\* [^:]*")) | |
510 | @end lisp | |
511 | ||
512 | The string in this example is preprocessed by the Guile reader before | |
513 | any code is executed. The resulting argument to @code{make-regexp} is | |
514 | the string @samp{^\* [^:]*}, which is what we really want. | |
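
A quick way to see the reader's effect is to look at the string the
regexp engine actually receives. This sketch uses the full menu-entry
regexp from the start of this section and an illustrative menu line.

@lisp
(use-modules (ice-9 regex))

;; The reader turns the two backslashes into one: \ and * remain.
(string-length "\\*")
@result{} 2

(match:substring
 (string-match "^\\* [^:]*::" "* Regexp Functions:: Matching."))
@result{} "* Regexp Functions::"
@end lisp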
515 | ||
516 | This also means that in order to write a regular expression that matches | |
517 | a single backslash character, the regular expression string in the | |
518 | source code must include @emph{four} backslashes. Each consecutive pair | |
519 | of backslashes gets translated by the Guile reader to a single | |
520 | backslash, and the resulting double-backslash is interpreted by the | |
521 | regexp engine as matching a single backslash character. Hence: | |
522 | ||
523 | @lisp | |
524 | (define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*")) | |
525 | @end lisp | |
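
The same doubling applies to the target string when it is written as a
Scheme literal. A sketch with an illustrative target, using the pattern
@samp{\\let\\[A-Za-z]*} given earlier; @code{display} shows the matched
text without reader escapes.

@lisp
(use-modules (ice-9 regex))

;; Four backslashes in the source give two in the pattern,
;; which match one backslash in the target.
(define rx (make-regexp "\\\\let\\\\[A-Za-z]*"))

;; The target literal below contains the text: x \let\foo y
(display (match:substring (regexp-exec rx "x \\let\\foo y")))
@print{} \let\foo
@end lisp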
526 | ||
527 | The reason for the unwieldiness of this syntax is historical. Both | |
528 | regular expression pattern matchers and Unix string processing systems | |
529 | have traditionally used backslashes with the special meanings | |
530 | described above. The POSIX regular expression specification and ANSI C | |
531 | standard both require these semantics. Attempting to abandon either | |
532 | convention would cause other kinds of compatibility problems, possibly | |
533 | more severe ones. Therefore, without extending the Scheme reader to | |
534 | support strings with different quoting conventions (an ungainly and | |
535 | confusing extension when implemented in other languages), we must adhere | |
536 | to this cumbersome escape syntax. |