@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
@c Copyright (C) 1990-1995, 1998-1999, 2001-2014 Free Software
@c Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@node Searching and Matching
@chapter Searching and Matching
@cindex searching

  GNU Emacs provides two ways to search through a buffer for specified
text: exact string searches and regular expression searches. After a
regular expression search, you can examine the @dfn{match data} to
determine which text matched the whole regular expression or various
portions of it.

@menu
* String Search::         Search for an exact match.
* Searching and Case::    Case-independent or case-significant searching.
* Regular Expressions::   Describing classes of strings.
* Regexp Search::         Searching for a match for a regexp.
* POSIX Regexps::         Searching POSIX-style for the longest match.
* Match Data::            Finding out which part of the text matched,
                            after a string or regexp search.
* Search and Replace::    Commands that loop, searching and replacing.
* Standard Regexps::      Useful regexps for finding sentences, pages,...
@end menu

  The @samp{skip-chars@dots{}} functions also perform a kind of searching.
@xref{Skipping Characters}. To search for changes in character
properties, see @ref{Property Search}.

@node String Search
@section Searching for Strings
@cindex string search

  These are the primitive functions for searching through the text in a
buffer. They are meant for use in programs, but you may call them
interactively. If you do so, they prompt for the search string; the
arguments @var{limit} and @var{noerror} are @code{nil}, and @var{repeat}
is 1. For more details on interactive searching, @pxref{Search,,
Searching and Replacement, emacs, The GNU Emacs Manual}.

  These search functions convert the search string to multibyte if the
buffer is multibyte; they convert the search string to unibyte if the
buffer is unibyte. @xref{Text Representations}.

@deffn Command search-forward string &optional limit noerror repeat
This function searches forward from point for an exact match for
@var{string}. If successful, it sets point to the end of the occurrence
found, and returns the new value of point. If no match is found, the
value and side effects depend on @var{noerror} (see below).

In the following example, point is initially at the beginning of the
line. Then @code{(search-forward "fox")} moves point after the last
letter of @samp{fox}:

@example
@group
---------- Buffer: foo ----------
@point{}The quick brown fox jumped over the lazy dog.
---------- Buffer: foo ----------
@end group

@group
(search-forward "fox")
     @result{} 20

---------- Buffer: foo ----------
The quick brown fox@point{} jumped over the lazy dog.
---------- Buffer: foo ----------
@end group
@end example

The argument @var{limit} specifies the bound to the search, and should
be a position in the current buffer. No match extending after
that position is accepted. If @var{limit} is omitted or @code{nil}, it
defaults to the end of the accessible portion of the buffer.

@kindex search-failed
What happens when the search fails depends on the value of
@var{noerror}. If @var{noerror} is @code{nil}, a @code{search-failed}
error is signaled. If @var{noerror} is @code{t}, @code{search-forward}
returns @code{nil} and does nothing. If @var{noerror} is neither
@code{nil} nor @code{t}, then @code{search-forward} moves point to the
upper bound and returns @code{nil}.
@c I see no prospect of this ever changing, and frankly the current
@c behavior seems better, so there seems no need to mention this.
@ignore
(It would be more consistent now to return the new position of point
in that case, but some existing programs may depend on a value of
@code{nil}.)
@end ignore

The argument @var{noerror} only affects valid searches which fail to
find a match. Invalid arguments cause errors regardless of
@var{noerror}.

If @var{repeat} is a positive number @var{n}, it serves as a repeat
count: the search is repeated @var{n} times, each time starting at the
end of the previous time's match. If these successive searches
succeed, the function succeeds, moving point and returning its new
value. Otherwise the search fails, with results depending on the
value of @var{noerror}, as described above. If @var{repeat} is a
negative number -@var{n}, it serves as a repeat count of @var{n} for a
search in the opposite (backward) direction.
@end deffn
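
The interaction of @var{limit}, @var{noerror} and @var{repeat} can be
tried out in a temporary buffer. The following is a minimal sketch, not
part of Emacs; the function name @code{my-demo-search-forward} is
hypothetical and chosen only for illustration.

```elisp
;; Illustrative sketch; `my-demo-search-forward' is a hypothetical name.
(defun my-demo-search-forward ()
  "Return a list of results from several `search-forward' calls."
  (with-temp-buffer
    (insert "The quick brown fox jumped over the lazy dog.")
    (let (results)
      ;; Plain search: point ends up just after "fox".
      (goto-char (point-min))
      (push (search-forward "fox") results)
      ;; LIMIT 10 excludes the match; NOERROR t returns nil and
      ;; leaves point where it was.
      (goto-char (point-min))
      (push (list (search-forward "fox" 10 t) (point)) results)
      ;; NOERROR neither nil nor t: return nil, move point to the bound.
      (goto-char (point-min))
      (push (list (search-forward "fox" 10 'move) (point)) results)
      ;; REPEAT 2: stop after the second occurrence of "the"
      ;; (the first is "The", since case is ignored here).
      (goto-char (point-min))
      (push (let ((case-fold-search t))
              (search-forward "the" nil t 2))
            results)
      (nreverse results))))
```

Evaluating @code{(my-demo-search-forward)} should return
@code{(20 (nil 1) (nil 10) 36)}.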

@deffn Command search-backward string &optional limit noerror repeat
This function searches backward from point for @var{string}. It is
like @code{search-forward}, except that it searches backwards rather
than forwards. Backward searches leave point at the beginning of the
match.
@end deffn
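
As a small sketch of the difference in where point is left, the
following searches backward from the end of a temporary buffer. The
function name @code{my-demo-search-backward} is hypothetical.

```elisp
;; Illustrative sketch; `my-demo-search-backward' is a hypothetical name.
(defun my-demo-search-backward ()
  "Search backward for \"fox\" and report where point ends up."
  (with-temp-buffer
    (insert "The quick brown fox jumped over the lazy dog.")
    ;; After `insert', point is at the end of the buffer.
    (let ((pos (search-backward "fox")))
      ;; POS is the start of the match, and point is left there.
      (list pos (= pos (point)) (looking-at-p "fox")))))
```

Evaluating @code{(my-demo-search-backward)} should return
@code{(17 t t)}: the returned value is the position of the @samp{f},
which is also where point is left.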

@deffn Command word-search-forward string &optional limit noerror repeat
This function searches forward from point for a ``word'' match for
@var{string}. If it finds a match, it sets point to the end of the
match found, and returns the new value of point.

Word matching regards @var{string} as a sequence of words, disregarding
punctuation that separates them. It searches the buffer for the same
sequence of words. Each word must be distinct in the buffer (searching
for the word @samp{ball} does not match the word @samp{balls}), but the
details of punctuation and spacing are ignored (searching for @samp{ball
boy} does match @samp{ball. Boy!}).

In this example, point is initially at the beginning of the buffer; the
search leaves it between the @samp{y} and the @samp{!}.

@example
@group
---------- Buffer: foo ----------
@point{}He said "Please! Find
the ball boy!"
---------- Buffer: foo ----------
@end group

@group
(word-search-forward "Please find the ball, boy.")
     @result{} 39

---------- Buffer: foo ----------
He said "Please! Find
the ball boy@point{}!"
---------- Buffer: foo ----------
@end group
@end example

If @var{limit} is non-@code{nil}, it must be a position in the current
buffer; it specifies the upper bound to the search. The match found
must not extend after that position.

If @var{noerror} is @code{nil}, then @code{word-search-forward} signals
an error if the search fails. If @var{noerror} is @code{t}, then it
returns @code{nil} instead of signaling an error. If @var{noerror} is
neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the
end of the accessible portion of the buffer) and returns @code{nil}.

If @var{repeat} is non-@code{nil}, then the search is repeated that many
times. Point is positioned at the end of the last match.

@findex word-search-regexp
Internally, @code{word-search-forward} and related functions use the
function @code{word-search-regexp} to convert @var{string} to a
regular expression that ignores punctuation.
@end deffn
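
The word-matching rules can be checked with a short sketch such as the
one below. The function name @code{my-demo-word-search} is hypothetical,
and the search string deliberately ends in a word character so that
the result does not depend on how trailing punctuation is treated.

```elisp
;; Illustrative sketch; `my-demo-word-search' is a hypothetical name.
(defun my-demo-word-search ()
  "Try word searches in a temporary buffer and return the results."
  (with-temp-buffer
    (insert "He said \"Please! Find\nthe ball boy!\"")
    (goto-char (point-min))
    (let ((case-fold-search t))
      (list
       ;; Punctuation, spacing and line breaks between words are ignored.
       (and (word-search-forward "Please find the ball, boy" nil t) t)
       ;; Point is now just after "boy", so the next character is "!".
       (char-after)
       ;; "balls" is a different word from "ball", so this finds nothing.
       (progn (goto-char (point-min))
              (word-search-forward "balls" nil t))))))
```

Evaluating @code{(my-demo-word-search)} should return
@code{(t 33 nil)}, where 33 is the character code of @samp{!}.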

@deffn Command word-search-forward-lax string &optional limit noerror repeat
This command is identical to @code{word-search-forward}, except that
the beginning or the end of @var{string} need not match a word
boundary, unless @var{string} begins or ends in whitespace.
For instance, searching for @samp{ball boy} matches @samp{ball boyee},
but does not match @samp{balls boy}.
@end deffn
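
The following sketch compares the lax and strict variants on the
strings mentioned above. Both function names are hypothetical helpers
introduced only for this illustration.

```elisp
;; Illustrative sketch; both function names are hypothetical.
(defun my-demo-word-search-in (text search-fn string)
  "Insert TEXT in a temporary buffer and search for STRING with SEARCH-FN.
The search starts at the beginning of the buffer and returns nil on failure."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (funcall search-fn string nil t)))

(defun my-demo-word-search-lax ()
  "Compare `word-search-forward-lax' with `word-search-forward'."
  (list
   ;; Lax: the end of "boy" need not be a word boundary.
   (my-demo-word-search-in "ball boyee" #'word-search-forward-lax "ball boy")
   ;; Lax or not, boundaries between the words of STRING still count.
   (my-demo-word-search-in "balls boy" #'word-search-forward-lax "ball boy")
   ;; Strict: "boy" must be a complete word, so "boyee" does not match.
   (my-demo-word-search-in "ball boyee" #'word-search-forward "ball boy")))
```

Evaluating @code{(my-demo-word-search-lax)} should return
@code{(9 nil nil)}.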

@deffn Command word-search-backward string &optional limit noerror repeat
This function searches backward from point for a word match to
@var{string}. This function is just like @code{word-search-forward}
except that it searches backward and normally leaves point at the
beginning of the match.
@end deffn

@deffn Command word-search-backward-lax string &optional limit noerror repeat
This command is identical to @code{word-search-backward}, except that
the beginning or the end of @var{string} need not match a word
boundary, unless @var{string} begins or ends in whitespace.
@end deffn

@node Searching and Case
@section Searching and Case
@cindex searching and case

  By default, searches in Emacs ignore the case of the text they are
searching through; if you specify searching for @samp{FOO}, then
@samp{Foo} or @samp{foo} is also considered a match. This applies to
regular expressions, too; thus, @samp{[aB]} would match @samp{a} or
@samp{A} or @samp{b} or @samp{B}.

  If you do not want this feature, set the variable
@code{case-fold-search} to @code{nil}. Then all letters must match
exactly, including case. This is a buffer-local variable; altering the
variable affects only the current buffer. (@xref{Intro to
Buffer-Local}.) Alternatively, you may change the default value.
In Lisp code, you will more typically use @code{let} to bind
@code{case-fold-search} to the desired value.
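
A minimal sketch of the @code{let} idiom follows; the function name
@code{my-demo-case-fold} is hypothetical. Binding the variable around
the search keeps the change from leaking into the rest of the program.

```elisp
;; Illustrative sketch; `my-demo-case-fold' is a hypothetical name.
(defun my-demo-case-fold ()
  "Search for \"FOO\" with and without case folding."
  (with-temp-buffer
    (insert "Foo bar")
    (list
     ;; Case is ignored: "FOO" matches "Foo", point ends up after it.
     (let ((case-fold-search t))
       (goto-char (point-min))
       (search-forward "FOO" nil t))
     ;; Case is significant: "FOO" does not match "Foo".
     (let ((case-fold-search nil))
       (goto-char (point-min))
       (search-forward "FOO" nil t)))))
```

Evaluating @code{(my-demo-case-fold)} should return @code{(4 nil)}.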

  Note that the user-level incremental search feature handles case
distinctions differently. When the search string contains only lower
case letters, the search ignores case, but when the search string
contains one or more upper case letters, the search becomes
case-sensitive. But this has nothing to do with the searching
functions used in Lisp code. @xref{Incremental Search,,, emacs,
The GNU Emacs Manual}.

@defopt case-fold-search
This buffer-local variable determines whether searches should ignore
case. If the variable is @code{nil} they do not ignore case; otherwise
(and by default) they do ignore case.
@end defopt

@defopt case-replace
This variable determines whether the higher-level replacement
functions should preserve case. If the variable is @code{nil}, that
means to use the replacement text verbatim. A non-@code{nil} value
means to convert the case of the replacement text according to the
text being replaced.

This variable is used by passing it as an argument to the function
@code{replace-match}. @xref{Replacing Match}.
@end defopt

@node Regular Expressions
@section Regular Expressions
@cindex regular expression
@cindex regexp

  A @dfn{regular expression}, or @dfn{regexp} for short, is a pattern that
denotes a (possibly infinite) set of strings. Searching for matches for
a regexp is a very powerful operation. This section explains how to write
regexps; the following section says how to search for them.

@findex re-builder
@cindex regular expressions, developing
  For interactive development of regular expressions, you
can use the @kbd{M-x re-builder} command. It provides a convenient
interface for creating regular expressions, by giving immediate visual
feedback in a separate buffer. As you edit the regexp, all its
matches in the target buffer are highlighted. Each parenthesized
sub-expression of the regexp is shown in a distinct face, which makes
it easier to verify even very complex regexps.

@menu
* Syntax of Regexps::       Rules for writing regular expressions.
* Regexp Example::          Illustrates regular expression syntax.
* Regexp Functions::        Functions for operating on regular expressions.
@end menu

@node Syntax of Regexps
@subsection Syntax of Regular Expressions

  Regular expressions have a syntax in which a few characters are
special constructs and the rest are @dfn{ordinary}. An ordinary
character is a simple regular expression that matches that character
and nothing else. The special characters are @samp{.}, @samp{*},
@samp{+}, @samp{?}, @samp{[}, @samp{^}, @samp{$}, and @samp{\}; no new
special characters will be defined in the future. The character
@samp{]} is special if it ends a character alternative (see later).
The character @samp{-} is special inside a character alternative. A
@samp{[:} and balancing @samp{:]} enclose a character class inside a
character alternative. Any other character appearing in a regular
expression is ordinary, unless a @samp{\} precedes it.

  For example, @samp{f} is not a special character, so it is ordinary, and
therefore @samp{f} is a regular expression that matches the string
@samp{f} and no other string. (It does @emph{not} match the string
@samp{fg}, but it does match a @emph{part} of that string.) Likewise,
@samp{o} is a regular expression that matches only @samp{o}.

  Any two regular expressions @var{a} and @var{b} can be concatenated. The
result is a regular expression that matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.

  As a simple example, we can concatenate the regular expressions @samp{f}
and @samp{o} to get the regular expression @samp{fo}, which matches only
the string @samp{fo}. Still trivial. To do something more powerful, you
need to use one of the special regular expression constructs.
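
  The distinction between matching a whole string and matching part of
it can be seen with @code{string-match}, which returns the index where
a match starts. This is a minimal sketch; the function name
@code{my-demo-ordinary-chars} is hypothetical, and the anchors
@samp{\`} and @samp{\'} (beginning and end of the string) are
backslash constructs not described in this subsection.

```elisp
;; Illustrative sketch; `my-demo-ordinary-chars' is a hypothetical name.
(defun my-demo-ordinary-chars ()
  "Show how regexps made of ordinary characters match."
  (list
   ;; "f" matches a part of "fg", starting at index 0.
   (string-match "f" "fg")
   ;; Anchored at both ends, "f" must match the whole string: no match.
   (string-match "\\`f\\'" "fg")
   ;; The concatenation "fo" matches the whole string "fo".
   (string-match "\\`fo\\'" "fo")))
```

Evaluating @code{(my-demo-ordinary-chars)} should return
@code{(0 nil 0)}.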

@menu
* Regexp Special::      Special characters in regular expressions.
* Char Classes::        Character classes used in regular expressions.
* Regexp Backslash::    Backslash-sequences in regular expressions.
@end menu

@node Regexp Special
@subsubsection Special Characters in Regular Expressions

  Here is a list of the characters that are special in a regular
expression.

@need 800
@table @asis
@item @samp{.}@: @r{(Period)}
@cindex @samp{.} in regexp
is a special character that matches any single character except a newline.
Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.

@item @samp{*}
@cindex @samp{*} in regexp
is not a construct by itself; it is a postfix operator that means to
match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).

@samp{*} always applies to the @emph{smallest} possible preceding
expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
@samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.

@cindex backtracking and regular expressions
The matcher processes a @samp{*} construct by matching, immediately, as
many repetitions as can be found. Then it continues with the rest of
the pattern. If that fails, backtracking occurs, discarding some of the
matches of the @samp{*}-modified construct in the hope that that will
make it possible to match the rest of the pattern. For example, in
matching @samp{ca*ar} against the string @samp{caaar}, the @samp{a*}
first tries to match all three @samp{a}s; but the rest of the pattern is
@samp{ar} and there is only @samp{r} left to match, so this try fails.
The next alternative is for @samp{a*} to match only two @samp{a}s. With
this choice, the rest of the regexp matches successfully.

@strong{Warning:} Nested repetition operators can run for an
indefinitely long time, if they lead to ambiguous matching. For
example, trying to match the regular expression @samp{\(x+y*\)*a}
against the string @samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz} could
take hours before it ultimately fails. Emacs must try each way of
grouping the @samp{x}s before concluding that none of them can work.
Even worse, @samp{\(x*\)*} can match the null string in infinitely
many ways, so it causes an infinite loop. To avoid these problems,
check nested repetitions carefully, to make sure that they do not
cause combinatorial explosions in backtracking.

@item @samp{+}
@cindex @samp{+} in regexp
is a postfix operator, similar to @samp{*} except that it must match
the preceding expression at least once. So, for example, @samp{ca+r}
matches the strings @samp{car} and @samp{caaaar} but not the string
@samp{cr}, whereas @samp{ca*r} matches all three strings.

@item @samp{?}
@cindex @samp{?} in regexp
is a postfix operator, similar to @samp{*} except that it must match the
preceding expression either once or not at all. For example,
@samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.

@item @samp{*?}, @samp{+?}, @samp{??}
@cindex non-greedy repetition characters in regexp
These are ``non-greedy'' variants of the operators @samp{*}, @samp{+}
and @samp{?}. Where those operators match the largest possible
substring (consistent with matching the entire containing expression),
the non-greedy variants match the smallest possible substring
(consistent with matching the entire containing expression).

For example, the regular expression @samp{c[ad]*a} when applied to the
string @samp{cdaaada} matches the whole string; but the regular
expression @samp{c[ad]*?a}, applied to that same string, matches just
@samp{cda}. (The smallest possible match here for @samp{[ad]*?} that
permits the whole expression to match is @samp{d}.)

@item @samp{[ @dots{} ]}
@cindex character alternative (in regexp)
@cindex @samp{[} in regexp
@cindex @samp{]} in regexp
is a @dfn{character alternative}, which begins with @samp{[} and is
terminated by @samp{]}. In the simplest case, the characters between
the two brackets are what this character alternative can match.

Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
(including the empty string). It follows that @samp{c[ad]*r}
matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.

You can also include character ranges in a character alternative, by
writing the starting and ending characters with a @samp{-} between them.
Thus, @samp{[a-z]} matches any lower-case @acronym{ASCII} letter.
Ranges may be intermixed freely with individual characters, as in
@samp{[a-z$%.]}, which matches any lower case @acronym{ASCII} letter
or @samp{$}, @samp{%} or period.

If @code{case-fold-search} is non-@code{nil}, @samp{[a-z]} also
matches upper-case letters. Note that a range like @samp{[a-z]} is
not affected by the locale's collation sequence; it always represents
a sequence in @acronym{ASCII} order.
@c This wasn't obvious to me, since, e.g., the grep manual "Character
@c Classes and Bracket Expressions" specifically notes the opposite
@c behavior. But by experiment Emacs seems unaffected by LC_COLLATE
@c in this regard.

Note also that the usual regexp special characters are not special inside a
character alternative. A completely different set of characters is
special inside character alternatives: @samp{]}, @samp{-} and @samp{^}.

To include a @samp{]} in a character alternative, you must make it the
first character. For example, @samp{[]a]} matches @samp{]} or @samp{a}.
To include a @samp{-}, write @samp{-} as the first or last character of
the character alternative, or put it after a range. Thus, @samp{[]-]}
matches both @samp{]} and @samp{-}. (As explained below, you cannot
use @samp{\]} to include a @samp{]} inside a character alternative,
since @samp{\} is not special there.)

To include @samp{^} in a character alternative, put it anywhere but at
the beginning.

@c What if it starts with a multibyte and ends with a unibyte?
@c That doesn't seem to match anything...?
If a range starts with a unibyte character @var{c} and ends with a
multibyte character @var{c2}, the range is divided into two parts: one
spans the unibyte characters @samp{@var{c}..?\377}, the other the
multibyte characters @samp{@var{c1}..@var{c2}}, where @var{c1} is the
first character of the charset to which @var{c2} belongs.

A character alternative can also specify named character classes
(@pxref{Char Classes}). This is a POSIX feature. For example,
@samp{[[:ascii:]]} matches any @acronym{ASCII} character.
Using a character class is equivalent to mentioning each of the
characters in that class; but the latter is not feasible in practice,
since some classes include thousands of different characters.
GM
429
430@item @samp{[^ @dots{} ]}
431@cindex @samp{^} in regexp
432@samp{[^} begins a @dfn{complemented character alternative}. This
433matches any character except the ones specified. Thus,
434@samp{[^a-z0-9A-Z]} matches all characters @emph{except} letters and
435digits.
436
437@samp{^} is not special in a character alternative unless it is the first
438character. The character following the @samp{^} is treated as if it
439were first (in other words, @samp{-} and @samp{]} are not special there).
440
441A complemented character alternative can match a newline, unless newline is
442mentioned as one of the characters not to match. This is in contrast to
443the handling of regexps in programs such as @code{grep}.
444
ba3bf1d9
CY
445You can specify named character classes, just like in character
446alternatives. For instance, @samp{[^[:ascii:]]} matches any
447non-@acronym{ASCII} character. @xref{Char Classes}.
448
b8d4c8d0
GM
449@item @samp{^}
450@cindex beginning of line in regexp
451When matching a buffer, @samp{^} matches the empty string, but only at the
452beginning of a line in the text being matched (or the beginning of the
453accessible portion of the buffer). Otherwise it fails to match
454anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at the
455beginning of a line.
456
457When matching a string instead of a buffer, @samp{^} matches at the
458beginning of the string or after a newline character.
459
460For historical compatibility reasons, @samp{^} can be used only at the
461beginning of the regular expression, or after @samp{\(}, @samp{\(?:}
462or @samp{\|}.
463
464@item @samp{$}
465@cindex @samp{$} in regexp
466@cindex end of line in regexp
467is similar to @samp{^} but matches only at the end of a line (or the
468end of the accessible portion of the buffer). Thus, @samp{x+$}
469matches a string of one @samp{x} or more at the end of a line.
470
471When matching a string instead of a buffer, @samp{$} matches at the end
472of the string or before a newline character.
473
474For historical compatibility reasons, @samp{$} can be used only at the
475end of the regular expression, or before @samp{\)} or @samp{\|}.
476
477@item @samp{\}
478@cindex @samp{\} in regexp
479has two functions: it quotes the special characters (including
480@samp{\}), and it introduces additional special constructs.
481
482Because @samp{\} quotes special characters, @samp{\$} is a regular
483expression that matches only @samp{$}, and @samp{\[} is a regular
484expression that matches only @samp{[}, and so on.
485
486Note that @samp{\} also has special meaning in the read syntax of Lisp
487strings (@pxref{String Type}), and must be quoted with @samp{\}. For
488example, the regular expression that matches the @samp{\} character is
489@samp{\\}. To write a Lisp string that contains the characters
490@samp{\\}, Lisp syntax requires you to quote each @samp{\} with another
491@samp{\}. Therefore, the read syntax for a regular expression matching
76f1a3c3 492@samp{\} is @code{"\\\\"}.
b8d4c8d0
GM
493@end table
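
Several of the examples in the table above can be checked directly
with @code{string-match}. The following is a minimal sketch, with a
hypothetical function name @code{my-demo-regexp-special}; it shows
greedy and non-greedy repetition, backtracking, and the doubled
backslashes needed in Lisp string syntax.

```elisp
;; Illustrative sketch; `my-demo-regexp-special' is a hypothetical name.
(defun my-demo-regexp-special ()
  "Exercise a few special regexp characters on small strings."
  (let ((case-fold-search nil)
        (text "cdaaada"))
    (list
     ;; Greedy `*' takes as much as it can: the whole string.
     (progn (string-match "c[ad]*a" text)
            (match-string 0 text))
     ;; Non-greedy `*?' takes as little as it can.
     (progn (string-match "c[ad]*?a" text)
            (match-string 0 text))
     ;; `a*' first grabs all three a's, then gives one back so that
     ;; "ar" can match; the match ends at index 5.
     (progn (string-match "ca*ar" "caaar")
            (match-end 0))
     ;; The Lisp string "\\\\" is the regexp \\ and matches a single
     ;; backslash, found here at index 1 of the three-character
     ;; string a\b.
     (string-match "\\\\" "a\\b"))))
```

Evaluating @code{(my-demo-regexp-special)} should return
@code{("cdaaada" "cda" 5 1)}.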

@strong{Please note:} For historical compatibility, special characters
are treated as ordinary ones if they are in contexts where their special
meanings make no sense. For example, @samp{*foo} treats @samp{*} as
ordinary since there is no preceding expression on which the @samp{*}
can act. It is poor practice to depend on this behavior; quote the
special character anyway, regardless of where it appears.

As a @samp{\} is not special inside a character alternative, it can
never remove the special meaning of @samp{-} or @samp{]}. So you
should not quote these characters when they have no special meaning
either. This would not clarify anything, since backslashes can
legitimately precede these characters where they @emph{have} special
meaning, as in @samp{[^\]} (@code{"[^\\]"} for Lisp string syntax),
which matches any single character except a backslash.
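
The rules for placing @samp{]}, @samp{-}, @samp{^} and @samp{\} inside
a character alternative can be verified with a short sketch like the
following; the function name @code{my-demo-char-alternatives} is
hypothetical.

```elisp
;; Illustrative sketch; `my-demo-char-alternatives' is a hypothetical name.
(defun my-demo-char-alternatives ()
  "Return the index of the first match for several character alternatives."
  (list
   ;; A `]' placed first is an ordinary member of the alternative.
   (string-match "[]a]" "x]")
   ;; `]' first and `-' last: both are ordinary members.
   (string-match "[]-]" "ab-")
   ;; Regexp [^\] : any character except a backslash.  The subject
   ;; string is two backslashes followed by x.
   (string-match "[^\\]" "\\\\x")
   ;; `^' anywhere but first is an ordinary member.
   (string-match "[a^]" "^")))
```

Evaluating @code{(my-demo-char-alternatives)} should return
@code{(1 2 2 0)}.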

In practice, most @samp{]} that occur in regular expressions close a
character alternative and hence are special. However, occasionally a
regular expression may try to match a complex pattern of literal
@samp{[} and @samp{]}. In such situations, it sometimes may be
necessary to carefully parse the regexp from the start to determine
which square brackets enclose a character alternative. For example,
@samp{[^][]]} consists of the complemented character alternative
@samp{[^][]} (which matches any single character that is not a square
bracket), followed by a literal @samp{]}.

The exact rules are that at the beginning of a regexp, @samp{[} is
special and @samp{]} not. This lasts until the first unquoted
@samp{[}, after which we are in a character alternative; @samp{[} is
no longer special (except when it starts a character class) but @samp{]}
is special, unless it immediately follows the special @samp{[} or that
@samp{[} followed by a @samp{^}. This lasts until the next special
@samp{]} that does not end a character class. This ends the character
alternative and restores the ordinary syntax of regular expressions;
an unquoted @samp{[} is special again and a @samp{]} not.
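
As a quick check of how @samp{[^][]]} is parsed, the sketch below
applies it to two short strings; the function name
@code{my-demo-bracket-parse} is hypothetical.

```elisp
;; Illustrative sketch; `my-demo-bracket-parse' is a hypothetical name.
(defun my-demo-bracket-parse ()
  "Apply the regexp [^][]] to two strings made mostly of brackets."
  (list
   ;; Needs a non-bracket character followed by a literal "]":
   ;; in "[]a]" that is the "a]" starting at index 2.
   (string-match "[^][]]" "[]a]")
   ;; "[]" contains no non-bracket character, so nothing matches.
   (string-match "[^][]]" "[]")))
```

Evaluating @code{(my-demo-bracket-parse)} should return
@code{(2 nil)}.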

@node Char Classes
@subsubsection Character Classes
@cindex character classes in regexp

  Here is a table of the classes you can use in a character alternative,
and what they mean:

@table @samp
@item [:ascii:]
This matches any @acronym{ASCII} character (codes 0--127).
@item [:alnum:]
This matches any letter or digit. (At present, for multibyte
characters, it matches anything that has word syntax.)
@item [:alpha:]
This matches any letter. (At present, for multibyte characters, it
matches anything that has word syntax.)
@item [:blank:]
This matches space and tab only.
@item [:cntrl:]
This matches any @acronym{ASCII} control character.
@item [:digit:]
This matches @samp{0} through @samp{9}. Thus, @samp{[-+[:digit:]]}
matches any digit, as well as @samp{+} and @samp{-}.
@item [:graph:]
This matches graphic characters---everything except @acronym{ASCII} control
characters, space, and the delete character.
@item [:lower:]
This matches any lower-case letter, as determined by the current case
table (@pxref{Case Tables}). If @code{case-fold-search} is
non-@code{nil}, this also matches any upper-case letter.
@item [:multibyte:]
This matches any multibyte character (@pxref{Text Representations}).
@item [:nonascii:]
This matches any non-@acronym{ASCII} character.
@item [:print:]
This matches printing characters---everything except @acronym{ASCII} control
characters and the delete character.
@item [:punct:]
This matches any punctuation character. (At present, for multibyte
characters, it matches anything that has non-word syntax.)
@item [:space:]
This matches any character that has whitespace syntax
(@pxref{Syntax Class Table}).
@item [:unibyte:]
This matches any unibyte character (@pxref{Text Representations}).
@item [:upper:]
This matches any upper-case letter, as determined by the current case
table (@pxref{Case Tables}). If @code{case-fold-search} is
non-@code{nil}, this also matches any lower-case letter.
@item [:word:]
This matches any character that has word syntax (@pxref{Syntax Class
Table}).
@item [:xdigit:]
This matches the hexadecimal digits: @samp{0} through @samp{9}, @samp{a}
through @samp{f} and @samp{A} through @samp{F}.
@end table
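
Here is a minimal sketch that uses a few of these classes inside
character alternatives. The function name @code{my-demo-char-classes}
is hypothetical, and the sample strings are arbitrary.

```elisp
;; Illustrative sketch; `my-demo-char-classes' is a hypothetical name.
(defun my-demo-char-classes ()
  "Match some named character classes against small sample strings."
  (let ((case-fold-search nil)
        (text "x = -42"))
    (list
     ;; Digits plus the signs "+" and "-": the match starts at index 4.
     (string-match "[-+[:digit:]]+" text)
     ;; The text matched by the search just above.
     (match-string 0 text)
     ;; First non-ASCII character (e with acute accent, U+00E9).
     (string-match "[^[:ascii:]]" "caf\u00e9")
     ;; [:blank:] matches the tab at index 1.
     (string-match "[[:blank:]]" "a\tb"))))
```

Evaluating @code{(my-demo-char-classes)} should return
@code{(4 "-42" 3 1)}.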

@node Regexp Backslash
@subsubsection Backslash Constructs in Regular Expressions
@cindex backslash in regular expressions

  For the most part, @samp{\} followed by any character matches only
that character. However, there are several exceptions: certain
sequences starting with @samp{\} that have special meanings. Here is
a table of the special @samp{\} constructs.

@table @samp
@item \|
@cindex @samp{|} in regexp
@cindex regexp alternative
specifies an alternative.
Two regular expressions @var{a} and @var{b} with @samp{\|} in
between form an expression that matches anything that either @var{a} or
@var{b} matches.

Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
but no other string.

@samp{\|} applies to the largest possible surrounding expressions. Only a
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
@samp{\|}.

If you need full backtracking capability to handle multiple uses of
@samp{\|}, use the POSIX regular expression functions (@pxref{POSIX
Regexps}).

@item \@{@var{m}\@}
is a postfix operator that repeats the previous pattern exactly @var{m}
times. Thus, @samp{x\@{5\@}} matches the string @samp{xxxxx}
and nothing else. @samp{c[ad]\@{3\@}r} matches strings such as
@samp{caaar}, @samp{cdddr}, @samp{cadar}, and so on.

@item \@{@var{m},@var{n}\@}
is a more general postfix operator that specifies repetition with a
minimum of @var{m} repeats and a maximum of @var{n} repeats. If @var{m}
is omitted, the minimum is 0; if @var{n} is omitted, there is no
maximum.

For example, @samp{c[ad]\@{1,2\@}r} matches the strings @samp{car},
@samp{cdr}, @samp{caar}, @samp{cadr}, @samp{cdar}, and @samp{cddr}, and
nothing else.@*
@samp{\@{0,1\@}} or @samp{\@{,1\@}} is equivalent to @samp{?}.@*
@samp{\@{0,\@}} or @samp{\@{,\@}} is equivalent to @samp{*}.@*
@samp{\@{1,\@}} is equivalent to @samp{+}.

@item \( @dots{} \)
@cindex @samp{(} in regexp
@cindex @samp{)} in regexp
@cindex regexp grouping
is a grouping construct that serves three purposes:

@enumerate
@item
To enclose a set of @samp{\|} alternatives for other operations. Thus,
the regular expression @samp{\(foo\|bar\)x} matches either @samp{foox}
or @samp{barx}.

@item
To enclose a complicated expression for the postfix operators @samp{*},
@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
@samp{ba}, @samp{bana}, @samp{banana}, @samp{bananana}, etc., with any
number (zero or more) of @samp{na} strings.

@item
To record a matched substring for future reference with
@samp{\@var{digit}} (see below).
@end enumerate

This last application is not a consequence of the idea of a
parenthetical grouping; it is a separate feature that was assigned as a
second meaning to the same @samp{\( @dots{} \)} construct because, in
practice, there was usually no conflict between the two meanings. But
occasionally there is a conflict, and that led to the introduction of
shy groups.
664
665@item \(?: @dots{} \)
@cindex shy groups
@cindex non-capturing group
@cindex unnumbered group
@cindex @samp{(?:} in regexp
670is the @dfn{shy group} construct. A shy group serves the first two
671purposes of an ordinary group (controlling the nesting of other
672operators), but it does not get a number, so you cannot refer back to
673its value with @samp{\@var{digit}}. Shy groups are particularly
674useful for mechanically-constructed regular expressions, because they
675can be added automatically without altering the numbering of ordinary,
non-shy groups.

Shy groups are also called @dfn{non-capturing} or @dfn{unnumbered
groups}.
680
681@item \(?@var{num}: @dots{} \)
682is the @dfn{explicitly numbered group} construct. Normal groups get
683their number implicitly, based on their position, which can be
684inconvenient. This construct allows you to force a particular group
number. There is no particular restriction on the numbering;
for example, you can have several groups with the same number, in which
case the last one to match (i.e., the rightmost match) wins.
Implicitly numbered groups always get the smallest integer larger than
689the one of any previous group.
690
691@item \@var{digit}
692matches the same text that matched the @var{digit}th occurrence of a
693grouping (@samp{\( @dots{} \)}) construct.
694
695In other words, after the end of a group, the matcher remembers the
696beginning and end of the text matched by that group. Later on in the
697regular expression you can use @samp{\} followed by @var{digit} to
698match that same text, whatever it may have been.
699
700The strings matching the first nine grouping constructs appearing in
701the entire regular expression passed to a search or matching function
702are assigned numbers 1 through 9 in the order that the open
703parentheses appear in the regular expression. So you can use
704@samp{\1} through @samp{\9} to refer to the text matched by the
705corresponding grouping constructs.
706
707For example, @samp{\(.*\)\1} matches any newline-free string that is
708composed of two identical halves. The @samp{\(.*\)} matches the first
709half, which may be anything, but the @samp{\1} that follows must match
710the same exact text.
711
712If a @samp{\( @dots{} \)} construct matches more than once (which can
713happen, for instance, if it is followed by @samp{*}), only the last
714match is recorded.
715
716If a particular grouping construct in the regular expression was never
717matched---for instance, if it appears inside of an alternative that
718wasn't used, or inside of a repetition that repeated zero times---then
719the corresponding @samp{\@var{digit}} construct never matches
anything. To use an artificial example, @samp{\(foo\(b*\)\|lose\)\2}
721cannot match @samp{lose}: the second alternative inside the larger
722group matches it, but then @samp{\2} is undefined and can't match
723anything. But it can match @samp{foobb}, because the first
724alternative matches @samp{foob} and @samp{\2} matches @samp{b}.
725
726@item \w
727@cindex @samp{\w} in regexp
728matches any word-constituent character. The editor syntax table
729determines which characters these are. @xref{Syntax Tables}.
730
731@item \W
732@cindex @samp{\W} in regexp
733matches any character that is not a word constituent.
734
735@item \s@var{code}
736@cindex @samp{\s} in regexp
737matches any character whose syntax is @var{code}. Here @var{code} is a
738character that represents a syntax code: thus, @samp{w} for word
739constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
740etc. To represent whitespace syntax, use either @samp{-} or a space
741character. @xref{Syntax Class Table}, for a list of syntax codes and
742the characters that stand for them.
743
744@item \S@var{code}
745@cindex @samp{\S} in regexp
746matches any character whose syntax is not @var{code}.
747
@cindex category, regexp search for
749@item \c@var{c}
750matches any character whose category is @var{c}. Here @var{c} is a
751character that represents a category: thus, @samp{c} for Chinese
752characters or @samp{g} for Greek characters in the standard category
753table. You can see the list of all the currently defined categories
754with @kbd{M-x describe-categories @key{RET}}. You can also define
755your own categories in addition to the standard ones using the
756@code{define-category} function (@pxref{Categories}).
757
758@item \C@var{c}
759matches any character whose category is not @var{c}.
760@end table
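
  As a brief illustration of several of the constructs above, here is
a sketch using @code{string-match} and @code{match-string}
(@pxref{Regexp Search}, and @ref{Simple Match Data}). The sample
strings are invented for this example. Each backslash is doubled
because the regexps are written in Lisp string syntax.

@example
@group
;; An interval: one or two of `a' or `d' between `c' and `r'.
(string-match "c[ad]\\@{1,2\\@}r" "cadr")
     @result{} 0
(string-match "c[ad]\\@{1,2\\@}r" "caddr")
     @result{} nil
@end group

@group
;; A shy group gets no number, so `bar' is group 1.
(let ((s "foobar"))
  (string-match "\\(?:foo\\)\\(bar\\)" s)
  (match-string 1 s))
     @result{} "bar"
@end group

@group
;; An explicitly numbered group; the next implicit group is 4.
(let ((s "ab"))
  (string-match "\\(?3:a\\)\\(b\\)" s)
  (list (match-string 3 s) (match-string 4 s)))
     @result{} ("a" "b")
@end group

@group
;; A back reference: two identical halves.
(let ((s "abcabc"))
  (string-match "\\(.*\\)\\1" s)
  (list (match-string 1 s) (match-end 0)))
     @result{} ("abc" 6)
@end group
@end example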
761
762 The following regular expression constructs match the empty string---that is,
763they don't use up any characters---but whether they match depends on the
764context. For all, the beginning and end of the accessible portion of
765the buffer are treated as if they were the actual beginning and end of
766the buffer.
767
768@table @samp
769@item \`
770@cindex @samp{\`} in regexp
771matches the empty string, but only at the beginning
772of the buffer or string being matched against.
773
774@item \'
775@cindex @samp{\'} in regexp
776matches the empty string, but only at the end of
777the buffer or string being matched against.
778
779@item \=
780@cindex @samp{\=} in regexp
781matches the empty string, but only at point.
782(This construct is not defined when matching against a string.)
783
784@item \b
785@cindex @samp{\b} in regexp
786matches the empty string, but only at the beginning or
787end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
788@samp{foo} as a separate word. @samp{\bballs?\b} matches
@samp{ball} or @samp{balls} as a separate word.
790
791@samp{\b} matches at the beginning or end of the buffer (or string)
792regardless of what text appears next to it.
793
794@item \B
795@cindex @samp{\B} in regexp
796matches the empty string, but @emph{not} at the beginning or
797end of a word, nor at the beginning or end of the buffer (or string).
798
799@item \<
800@cindex @samp{\<} in regexp
801matches the empty string, but only at the beginning of a word.
802@samp{\<} matches at the beginning of the buffer (or string) only if a
803word-constituent character follows.
804
805@item \>
806@cindex @samp{\>} in regexp
807matches the empty string, but only at the end of a word. @samp{\>}
808matches at the end of the buffer (or string) only if the contents end
809with a word-constituent character.
810
811@item \_<
812@cindex @samp{\_<} in regexp
813matches the empty string, but only at the beginning of a symbol. A
814symbol is a sequence of one or more word or symbol constituent
815characters. @samp{\_<} matches at the beginning of the buffer (or
816string) only if a symbol-constituent character follows.
817
818@item \_>
819@cindex @samp{\_>} in regexp
820matches the empty string, but only at the end of a symbol. @samp{\_>}
821matches at the end of the buffer (or string) only if the contents end
822with a symbol-constituent character.
823@end table
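
  The difference between some of these constructs can be seen in a
small sketch. The sample strings are invented, and the standard
syntax table is selected explicitly because the word and symbol
constructs depend on the syntax table in effect; in that table,
@samp{_} is a symbol constituent but not a word constituent.

@example
@group
(with-syntax-table (standard-syntax-table)
  (list (string-match "\\bfoo\\b" "a foo b")
        (string-match "\\bfoo\\b" "afoo b")
        (string-match "\\<foo\\>" "foo_bar")
        (string-match "\\_<foo\\_>" "foo_bar")))
     @result{} (2 nil 0 nil)
@end group

@group
;; `^' and `$' match at line boundaries inside the string,
;; whereas \` and \' match only at its very ends.
(string-match "^bar$" "foo\nbar\n")
     @result{} 4
(string-match "\\`bar\\'" "foo\nbar\n")
     @result{} nil
@end group
@end example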
824
825@kindex invalid-regexp
826 Not every string is a valid regular expression. For example, a string
that ends inside a character alternative without a terminating @samp{]}
828is invalid, and so is a string that ends with a single @samp{\}. If
829an invalid regular expression is passed to any of the search functions,
830an @code{invalid-regexp} error is signaled.
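
  If a regexp comes from an untrusted source such as user input, one
possible approach is to catch the error with @code{condition-case}.
The following is only a sketch; the message text is our own choice.

@example
@group
(condition-case err
    (string-match "[a-z" "abc")     ; unterminated `['
  (invalid-regexp
   (message "Invalid regexp: %s" (error-message-string err))
   nil))
     @result{} nil
@end group
@end example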
831
832@node Regexp Example
833@subsection Complex Regexp Example
834
835 Here is a complicated regexp which was formerly used by Emacs to
836recognize the end of a sentence together with any whitespace that
837follows. (Nowadays Emacs uses a similar but more complex default
838regexp constructed by the function @code{sentence-end}.
839@xref{Standard Regexps}.)
840
841 Below, we show first the regexp as a string in Lisp syntax (to
842distinguish spaces from tab characters), and then the result of
843evaluating it. The string constant begins and ends with a
844double-quote. @samp{\"} stands for a double-quote as part of the
845string, @samp{\\} for a backslash as part of the string, @samp{\t} for a
846tab and @samp{\n} for a newline.
847
848@example
849@group
850"[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*"
851 @result{} "[.?!][]\"')@}]*\\($\\| $\\| \\|@ @ \\)[
852]*"
853@end group
854@end example
855
856@noindent
In the output, tab and newline appear as themselves.
858
859 This regular expression contains four parts in succession and can be
860deciphered as follows:
861
862@table @code
863@item [.?!]
864The first part of the pattern is a character alternative that matches
865any one of three characters: period, question mark, and exclamation
866mark. The match must begin with one of these three characters. (This
867is one point where the new default regexp used by Emacs differs from
868the old. The new value also allows some non-@acronym{ASCII}
869characters that end a sentence without any following whitespace.)
870
871@item []\"')@}]*
872The second part of the pattern matches any closing braces and quotation
873marks, zero or more of them, that may follow the period, question mark
874or exclamation mark. The @code{\"} is Lisp syntax for a double-quote in
875a string. The @samp{*} at the end indicates that the immediately
876preceding regular expression (a character alternative, in this case) may be
877repeated zero or more times.
878
879@item \\($\\|@ $\\|\t\\|@ @ \\)
880The third part of the pattern matches the whitespace that follows the
881end of a sentence: the end of a line (optionally with a space), or a
882tab, or two spaces. The double backslashes mark the parentheses and
883vertical bars as regular expression syntax; the parentheses delimit a
884group and the vertical bars separate alternatives. The dollar sign is
885used to match the end of a line.
886
887@item [ \t\n]*
888Finally, the last part of the pattern matches any additional whitespace
889beyond the minimum needed to end a sentence.
890@end table
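
  To see the four parts working together, the regexp can be tried on
a short string. This is only an illustration, and the sample text is
invented. The match starts at the period (index 5) and ends after the
two spaces that follow it (index 8).

@example
@group
(let ((s "Hello.  World"))
  (string-match
   "[.?!][]\"')@}]*\\($\\| $\\|\t\\|@ @ \\)[ \t\n]*" s)
  (list (match-beginning 0) (match-end 0)))
     @result{} (5 8)
@end group
@end example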
891
892@node Regexp Functions
893@subsection Regular Expression Functions
894
895 These functions operate on regular expressions.
896
897@defun regexp-quote string
898This function returns a regular expression whose only exact match is
899@var{string}. Using this regular expression in @code{looking-at} will
900succeed only if the next characters in the buffer are @var{string};
901using it in a search function will succeed if the text being searched
contains @var{string}. @xref{Regexp Search}.
903
904This allows you to request an exact string match or search when calling
905a function that wants a regular expression.
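
For instance, in this small sketch (the strings are invented), the
unquoted @samp{.} matches any character, whereas the quoted form
matches only a literal period:

@example
@group
(string-match "a.c" "abc")
     @result{} 0
(string-match (regexp-quote "a.c") "abc")
     @result{} nil
(string-match (regexp-quote "a.c") "xa.c")
     @result{} 1
@end group
@end example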
906
@example
@group
(regexp-quote "^The cat$")
     @result{} "\\^The cat\\$"
@end group
@end example
913
914One use of @code{regexp-quote} is to combine an exact string match with
915context described as a regular expression. For example, this searches
916for the string that is the value of @var{string}, surrounded by
917whitespace:
918
@example
@group
(re-search-forward
 (concat "\\s-" (regexp-quote string) "\\s-"))
@end group
@end example
925@end defun
926
927@defun regexp-opt strings &optional paren
928This function returns an efficient regular expression that will match
929any of the strings in the list @var{strings}. This is useful when you
930need to make matching or searching as fast as possible---for example,
931for Font Lock mode@footnote{Note that @code{regexp-opt} does not
932guarantee that its result is absolutely the most efficient form
933possible. A hand-tuned regular expression can sometimes be slightly
934more efficient, but is almost never worth the effort.}.
@c E.g., see http://debbugs.gnu.org/2816
936
937If the optional argument @var{paren} is non-@code{nil}, then the
938returned regular expression is always enclosed by at least one
939parentheses-grouping construct. If @var{paren} is @code{words}, then
940that construct is additionally surrounded by @samp{\<} and @samp{\>};
941alternatively, if @var{paren} is @code{symbols}, then that construct
942is additionally surrounded by @samp{\_<} and @samp{\_>}
943(@code{symbols} is often appropriate when matching
944programming-language keywords and the like).
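
The exact string returned can vary between Emacs versions, so the
following sketch shows only the effect of the result. The word list
is invented, and @code{words} is passed for @var{paren} so that only
whole words match:

@example
@group
(let ((re (regexp-opt '("cat" "car" "dog") 'words)))
  (list (string-match re "my dog")
        (string-match re "dogma")))
     @result{} (3 nil)
@end group
@end example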
945
946This simplified definition of @code{regexp-opt} produces a
947regular expression which is equivalent to the actual value
948(but not as efficient):
949
@example
(defun regexp-opt (strings &optional paren)
  (let ((open-paren (if paren "\\(" ""))
        (close-paren (if paren "\\)" "")))
    (concat open-paren
            (mapconcat 'regexp-quote strings "\\|")
            close-paren)))
@end example
958@end defun
959
960@defun regexp-opt-depth regexp
961This function returns the total number of grouping constructs
962(parenthesized expressions) in @var{regexp}. This does not include
963shy groups (@pxref{Regexp Backslash}).
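
For example, with an invented regexp containing two ordinary groups
and one shy group:

@example
@group
(regexp-opt-depth "\\(a\\)\\(?:b\\)\\(c\\)")
     @result{} 2
@end group
@end example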
964@end defun
965
966@c Supposedly an internal regexp-opt function, but table.el uses it at least.
967@defun regexp-opt-charset chars
968This function returns a regular expression matching a character in the
969list of characters @var{chars}.
970
@example
(regexp-opt-charset '(?a ?b ?c ?d ?e))
     @result{} "[a-e]"
@end example
975@end defun
976
977@c Internal functions: regexp-opt-group
978
979@node Regexp Search
980@section Regular Expression Searching
981@cindex regular expression searching
982@cindex regexp searching
983@cindex searching for regexp
984
985 In GNU Emacs, you can search for the next match for a regular
986expression either incrementally or not. For incremental search
987commands, see @ref{Regexp Search, , Regular Expression Search, emacs,
988The GNU Emacs Manual}. Here we describe only the search functions
989useful in programs. The principal one is @code{re-search-forward}.
990
991 These search functions convert the regular expression to multibyte if
992the buffer is multibyte; they convert the regular expression to unibyte
993if the buffer is unibyte. @xref{Text Representations}.
994
995@deffn Command re-search-forward regexp &optional limit noerror repeat
996This function searches forward in the current buffer for a string of
997text that is matched by the regular expression @var{regexp}. The
998function skips over any amount of text that is not matched by
999@var{regexp}, and leaves point at the end of the first match found.
1000It returns the new value of point.
1001
1002If @var{limit} is non-@code{nil}, it must be a position in the current
1003buffer. It specifies the upper bound to the search. No match
1004extending after that position is accepted.
1005
1006If @var{repeat} is supplied, it must be a positive number; the search
1007is repeated that many times; each repetition starts at the end of the
1008previous match. If all these successive searches succeed, the search
1009succeeds, moving point and returning its new value. Otherwise the
1010search fails. What @code{re-search-forward} does when the search
1011fails depends on the value of @var{noerror}:
1012
1013@table @asis
1014@item @code{nil}
1015Signal a @code{search-failed} error.
1016@item @code{t}
1017Do nothing and return @code{nil}.
1018@item anything else
1019Move point to @var{limit} (or the end of the accessible portion of the
1020buffer) and return @code{nil}.
1021@end table
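
A common use of a non-@code{nil} @var{noerror} is to loop over every
match in a buffer. The following is a minimal sketch of that idiom,
using a temporary buffer with invented contents:

@example
@group
(with-temp-buffer
  (insert "one two three")
  (goto-char (point-min))
  (let ((count 0))
    ;; NOERROR is t, so a failed search returns nil and ends
    ;; the loop instead of signaling an error.
    (while (re-search-forward "\\w+" nil t)
      (setq count (1+ count)))
    count))
     @result{} 3
@end group
@end example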
1022
1023In the following example, point is initially before the @samp{T}.
1024Evaluating the search call moves point to the end of that line (between
1025the @samp{t} of @samp{hat} and the newline).
1026
@example
@group
---------- Buffer: foo ----------
I read "@point{}The cat in the hat
comes back" twice.
---------- Buffer: foo ----------
@end group

@group
(re-search-forward "[a-z]+" nil t 5)
     @result{} 27

---------- Buffer: foo ----------
I read "The cat in the hat@point{}
comes back" twice.
---------- Buffer: foo ----------
@end group
@end example
1045@end deffn
1046
1047@deffn Command re-search-backward regexp &optional limit noerror repeat
1048This function searches backward in the current buffer for a string of
1049text that is matched by the regular expression @var{regexp}, leaving
1050point at the beginning of the first text found.
1051
1052This function is analogous to @code{re-search-forward}, but they are not
1053simple mirror images. @code{re-search-forward} finds the match whose
1054beginning is as close as possible to the starting point. If
1055@code{re-search-backward} were a perfect mirror image, it would find the
1056match whose end is as close as possible. However, in fact it finds the
1057match whose beginning is as close as possible (and yet ends before the
1058starting point). The reason for this is that matching a regular
1059expression at a given spot always works from beginning to end, and
1060starts at a specified beginning position.
1061
1062A true mirror-image of @code{re-search-forward} would require a special
1063feature for matching regular expressions from end to beginning. It's
1064not worth the trouble of implementing that.
1065@end deffn
1066
1067@defun string-match regexp string &optional start
1068This function returns the index of the start of the first match for
1069the regular expression @var{regexp} in @var{string}, or @code{nil} if
1070there is no match. If @var{start} is non-@code{nil}, the search starts
1071at that index in @var{string}.
1072
1073For example,
1074
@example
@group
(string-match
 "quick" "The quick brown fox jumped quickly.")
     @result{} 4
@end group
@group
(string-match
 "quick" "The quick brown fox jumped quickly." 8)
     @result{} 27
@end group
@end example
1087
1088@noindent
1089The index of the first character of the
1090string is 0, the index of the second character is 1, and so on.
1091
1092After this function returns, the index of the first character beyond
1093the match is available as @code{(match-end 0)}. @xref{Match Data}.
1094
@example
@group
(string-match
 "quick" "The quick brown fox jumped quickly." 8)
     @result{} 27
@end group

@group
(match-end 0)
     @result{} 32
@end group
@end example
1107@end defun
1108
@defun string-match-p regexp string &optional start
1110This predicate function does what @code{string-match} does, but it
1111avoids modifying the match data.
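
As a small illustration with an invented string, the match data set
by an earlier @code{string-match} is still available after calling
@code{string-match-p}:

@example
@group
(let ((s "foo bar"))
  (string-match "bar" s)        ; sets the match data
  (string-match-p "foo" s)      ; returns 0, match data untouched
  (match-beginning 0))
     @result{} 4
@end group
@end example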
1112@end defun
1113
1114@defun looking-at regexp
1115This function determines whether the text in the current buffer directly
1116following point matches the regular expression @var{regexp}. ``Directly
1117following'' means precisely that: the search is ``anchored'' and it can
1118succeed only starting with the first character following point. The
1119result is @code{t} if so, @code{nil} otherwise.
1120
This function does not move point, but it does update the match data.
1122@xref{Match Data}. If you need to test for a match without modifying
1123the match data, use @code{looking-at-p}, described below.
1124
1125In this example, point is located directly before the @samp{T}. If it
1126were anywhere else, the result would be @code{nil}.
1127
@example
@group
---------- Buffer: foo ----------
I read "@point{}The cat in the hat
comes back" twice.
---------- Buffer: foo ----------

(looking-at "The cat in the hat$")
     @result{} t
@end group
@end example
1139@end defun
1140
@defun looking-back regexp &optional limit greedy
1142This function returns @code{t} if @var{regexp} matches the text
1143immediately before point (i.e., ending at point), and @code{nil} otherwise.
1144
1145Because regular expression matching works only going forward, this is
1146implemented by searching backwards from point for a match that ends at
1147point. That can be quite slow if it has to search a long distance.
1148You can bound the time required by specifying @var{limit}, which says
1149not to search before @var{limit}. In this case, the match that is
found must begin at or after @var{limit}. Here's an example:

@example
@group
---------- Buffer: foo ----------
I read "@point{}The cat in the hat
comes back" twice.
---------- Buffer: foo ----------

(looking-back "read \"" 3)
     @result{} t
(looking-back "read \"" 4)
     @result{} nil
@end group
@end example

If @var{greedy} is non-@code{nil}, this function extends the match
backwards as far as possible, stopping when a single additional
previous character cannot be part of a match for @var{regexp}. When the
match is extended, its starting position is allowed to occur before
@var{limit}.
1171
1172@c http://debbugs.gnu.org/5689
1173As a general recommendation, try to avoid using @code{looking-back}
1174wherever possible, since it is slow. For this reason, there are no
1175plans to add a @code{looking-back-p} function.
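
When @code{looking-back} cannot be avoided, one way to bound its cost
is to pass the start of the current line as @var{limit}. This sketch
uses a temporary buffer with invented contents, and assumes that a
match never needs to span more than one line:

@example
@group
(with-temp-buffer
  (insert "first line\n    indented")
  ;; Point is at the end of the buffer.  The search will not
  ;; look further back than the start of the last line.
  (looking-back "indented" (line-beginning-position)))
     @result{} t
@end group
@end example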
1176@end defun
1177
1178@defun looking-at-p regexp
1179This predicate function works like @code{looking-at}, but without
1180updating the match data.
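
In the following sketch (the buffer contents are invented), the match
data from the earlier search survive the call to
@code{looking-at-p}:

@example
@group
(with-temp-buffer
  (insert "foo bar")
  (goto-char (point-min))
  (re-search-forward "foo")         ; sets the match data
  (list (looking-at-p " bar")       ; match data unchanged
        (match-beginning 0)))
     @result{} (t 1)
@end group
@end example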
1181@end defun
1182
1183@defvar search-spaces-regexp
1184If this variable is non-@code{nil}, it should be a regular expression
1185that says how to search for whitespace. In that case, any group of
1186spaces in a regular expression being searched for stands for use of
1187this regular expression. However, spaces inside of constructs such as
1188@samp{[@dots{}]} and @samp{*}, @samp{+}, @samp{?} are not affected by
1189@code{search-spaces-regexp}.
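
As a sketch of the effect, the second search below succeeds because
the space in the regexp stands for @samp{[ \t\n]+} while the variable
is bound. The buffer contents and the chosen whitespace regexp are
invented for this example.

@example
@group
(with-temp-buffer
  (insert "foo \t\n bar")
  (goto-char (point-min))
  (list (re-search-forward "foo bar" nil t)
        (let ((search-spaces-regexp "[ \t\n]+"))
          (re-search-forward "foo bar" nil t))))
     @result{} (nil 11)
@end group
@end example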
1190
Since this variable affects all regular expression search and match
constructs, you should bind it temporarily around as small a part of
the code as possible.
1194@end defvar
1195
1196@node POSIX Regexps
1197@section POSIX Regular Expression Searching
1198
@cindex backtracking and POSIX regular expressions
1200 The usual regular expression functions do backtracking when necessary
1201to handle the @samp{\|} and repetition constructs, but they continue
1202this only until they find @emph{some} match. Then they succeed and
1203report the first match found.
1204
1205 This section describes alternative search functions which perform the
1206full backtracking specified by the POSIX standard for regular expression
1207matching. They continue backtracking until they have tried all
1208possibilities and found all matches, so they can report the longest
match, as required by POSIX@. This is much slower, so use these
1210functions only when you really need the longest match.
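
  The difference shows up with alternatives where an earlier one is a
prefix of a later one. In this sketch (the regexp and string are
invented), the usual function stops at the first alternative that
matches, while the POSIX variant reports the longest match:

@example
@group
(let ((s "foobar"))
  (list (progn (string-match "foo\\|foobar" s)
               (match-end 0))
        (progn (posix-string-match "foo\\|foobar" s)
               (match-end 0))))
     @result{} (3 6)
@end group
@end example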
1211
1212 The POSIX search and match functions do not properly support the
1213non-greedy repetition operators (@pxref{Regexp Special, non-greedy}).
1214This is because POSIX backtracking conflicts with the semantics of
non-greedy repetition.

@deffn Command posix-search-forward regexp &optional limit noerror repeat
1218This is like @code{re-search-forward} except that it performs the full
1219backtracking specified by the POSIX standard for regular expression
1220matching.
@end deffn

@deffn Command posix-search-backward regexp &optional limit noerror repeat
1224This is like @code{re-search-backward} except that it performs the full
1225backtracking specified by the POSIX standard for regular expression
1226matching.
@end deffn
1228
1229@defun posix-looking-at regexp
1230This is like @code{looking-at} except that it performs the full
1231backtracking specified by the POSIX standard for regular expression
1232matching.
1233@end defun
1234
1235@defun posix-string-match regexp string &optional start
1236This is like @code{string-match} except that it performs the full
1237backtracking specified by the POSIX standard for regular expression
1238matching.
1239@end defun
1240
1241@node Match Data
1242@section The Match Data
1243@cindex match data
1244
1245 Emacs keeps track of the start and end positions of the segments of
1246text found during a search; this is called the @dfn{match data}.
1247Thanks to the match data, you can search for a complex pattern, such
1248as a date in a mail message, and then extract parts of the match under
1249control of the pattern.
1250
1251 Because the match data normally describe the most recent search only,
1252you must be careful not to do another search inadvertently between the
1253search you wish to refer back to and the use of the match data. If you
1254can't avoid another intervening search, you must save and restore the
1255match data around it, to prevent it from being overwritten.
1256
1257 Notice that all functions are allowed to overwrite the match data
1258unless they're explicitly documented not to do so. A consequence is
that functions that are run implicitly in the background
1260(@pxref{Timers}, and @ref{Idle Timers}) should likely save and restore
1261the match data explicitly.
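
  As a minimal sketch of saving and restoring (@pxref{Saving Match
Data}), the inner search here would otherwise overwrite the match
data of the outer one. The strings are invented for the example.

@example
@group
(let ((s "abc"))
  (string-match "b" s)
  (save-match-data
    (string-match "x+" "xxx"))      ; an intervening search
  (match-beginning 0))
     @result{} 1
@end group
@end example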
1262
@menu
* Replacing Match::       Replacing a substring that was matched.
* Simple Match Data::     Accessing single items of match data,
                            such as where a particular subexpression started.
* Entire Match Data::     Accessing the entire match data at once, as a list.
* Saving Match Data::     Saving and restoring the match data.
@end menu
1270
1271@node Replacing Match
1272@subsection Replacing the Text that Matched
1273@cindex replace matched text
1274
1275 This function replaces all or part of the text matched by the last
1276search. It works by means of the match data.
1277
1278@cindex case in replacements
1279@defun replace-match replacement &optional fixedcase literal string subexp
1280This function performs a replacement operation on a buffer or string.
1281
1282If you did the last search in a buffer, you should omit the
1283@var{string} argument or specify @code{nil} for it, and make sure that
1284the current buffer is the one in which you performed the last search.
1285Then this function edits the buffer, replacing the matched text with
1286@var{replacement}. It leaves point at the end of the replacement
text.
1288
1289If you performed the last search on a string, pass the same string as
1290@var{string}. Then this function returns a new string, in which the
1291matched text is replaced by @var{replacement}.
1292
1293If @var{fixedcase} is non-@code{nil}, then @code{replace-match} uses
1294the replacement text without case conversion; otherwise, it converts
1295the replacement text depending upon the capitalization of the text to
1296be replaced. If the original text is all upper case, this converts
1297the replacement text to upper case. If all words of the original text
1298are capitalized, this capitalizes all the words of the replacement
1299text. If all the words are one-letter and they are all upper case,
1300they are treated as capitalized words rather than all-upper-case
1301words.
1302
1303If @var{literal} is non-@code{nil}, then @var{replacement} is inserted
1304exactly as it is, the only alterations being case changes as needed.
1305If it is @code{nil} (the default), then the character @samp{\} is treated
1306specially. If a @samp{\} appears in @var{replacement}, then it must be
1307part of one of the following sequences:
1308
1309@table @asis
1310@item @samp{\&}
1311@cindex @samp{&} in replacement
This stands for the entire text being replaced.

@item @samp{\@var{n}}, where @var{n} is a digit
@cindex @samp{\@var{n}} in replacement
1316This stands for the text that matched the @var{n}th subexpression in
1317the original regexp. Subexpressions are those expressions grouped
1318inside @samp{\(@dots{}\)}. If the @var{n}th subexpression never
1319matched, an empty string is substituted.
1320
1321@item @samp{\\}
1322@cindex @samp{\} in replacement
1323This stands for a single @samp{\} in the replacement text.
1324
1325@item @samp{\?}
1326This stands for itself (for compatibility with @code{replace-regexp}
and related commands; @pxref{Regexp Replace,,, emacs, The GNU
Emacs Manual}).
1329@end table
1330
1331@noindent
1332Any other character following @samp{\} signals an error.
1333
1334The substitutions performed by @samp{\&} and @samp{\@var{n}} occur
1335after case conversion, if any. Therefore, the strings they substitute
1336are never case-converted.
1337
1338If @var{subexp} is non-@code{nil}, that says to replace just
1339subexpression number @var{subexp} of the regexp that was matched, not
1340the entire match. For example, after matching @samp{foo \(ba*r\)},
1341calling @code{replace-match} with 1 as @var{subexp} means to replace
1342just the text that matched @samp{\(ba*r\)}.
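
The following sketches operate on strings, so each call returns a new
string. The sample strings are invented, and
@code{case-fold-search} is bound explicitly in the last example so
that @samp{foo} matches @samp{Foo}.

@example
@group
;; \1 and \2 refer to the matched subexpressions.
(let ((s "foo bar"))
  (string-match "\\(foo\\) \\(bar\\)" s)
  (replace-match "\\2 \\1" t nil s))
     @result{} "bar foo"
@end group

@group
;; Replace only subexpression 1, inserting the text literally.
(let ((s "foo baaar"))
  (string-match "foo \\(ba*r\\)" s)
  (replace-match "X" t t s 1))
     @result{} "foo X"
@end group

@group
;; With FIXEDCASE nil, the replacement follows the
;; capitalization of the matched text.
(let ((s "Foo")
      (case-fold-search t))
  (string-match "foo" s)
  (replace-match "bar" nil nil s))
     @result{} "Bar"
@end group
@end example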
1343@end defun
1344
1345@defun match-substitute-replacement replacement &optional fixedcase literal string subexp
This function returns the text that would be inserted into the buffer
by @code{replace-match}, but without modifying the buffer. It is
useful if you want to present the user with the actual replacement result,
1349with constructs like @samp{\@var{n}} or @samp{\&} substituted with
1350matched groups. Arguments @var{replacement} and optional
1351@var{fixedcase}, @var{literal}, @var{string} and @var{subexp} have the
1352same meaning as for @code{replace-match}.
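
For example, in this sketch (using a temporary buffer with invented
contents), the replacement text is computed but the buffer is left
unchanged:

@example
@group
(with-temp-buffer
  (insert "foo bar")
  (goto-char (point-min))
  (re-search-forward "\\(foo\\) \\(bar\\)")
  (list (match-substitute-replacement "\\2 \\1" t)
        (buffer-string)))
     @result{} ("bar foo" "foo bar")
@end group
@end example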
1353@end defun
1354
1355@node Simple Match Data
1356@subsection Simple Match Data Access
1357
1358 This section explains how to use the match data to find out what was
1359matched by the last search or match operation, if it succeeded.
1360
1361 You can ask about the entire matching text, or about a particular
1362parenthetical subexpression of a regular expression. The @var{count}
1363argument in the functions below specifies which. If @var{count} is
1364zero, you are asking about the entire match. If @var{count} is
1365positive, it specifies which subexpression you want.
1366
1367 Recall that the subexpressions of a regular expression are those
1368expressions grouped with escaped parentheses, @samp{\(@dots{}\)}. The
1369@var{count}th subexpression is found by counting occurrences of
1370@samp{\(} from the beginning of the whole regular expression. The first
1371subexpression is numbered 1, the second 2, and so on. Only regular
1372expressions can have subexpressions---after a simple string search, the
1373only information available is about the entire match.
1374
1375 Every successful search sets the match data. Therefore, you should
1376query the match data immediately after searching, before calling any
1377other function that might perform another search. Alternatively, you
1378may save and restore the match data (@pxref{Saving Match Data}) around
1379the call to functions that could perform another search. Or use the
1380functions that explicitly do not modify the match data;
1df7defd 1381e.g., @code{string-match-p}.
b8d4c8d0 1382
fee88ca0
GM
1383@c This is an old comment and presumably there is no prospect of this
1384@c changing now. But still the advice stands.
b8d4c8d0 1385 A search which fails may or may not alter the match data. In the
fee88ca0
GM
1386current implementation, it does not, but we may change it in the
1387future. Don't try to rely on the value of the match data after a
1388failing search.
b8d4c8d0
GM
1389
1390@defun match-string count &optional in-string
1391This function returns, as a string, the text matched in the last search
1392or match operation. It returns the entire text if @var{count} is zero,
1393or just the portion corresponding to the @var{count}th parenthetical
1394subexpression, if @var{count} is positive.
1395
1396If the last such operation was done against a string with
1397@code{string-match}, then you should pass the same string as the
1398argument @var{in-string}. After a buffer search or match,
1399you should omit @var{in-string} or pass @code{nil} for it; but you
1400should make sure that the current buffer when you call
1401@code{match-string} is the one in which you did the searching or
1402matching. Failure to follow this advice will lead to incorrect results.
1403
1404The value is @code{nil} if @var{count} is out of range, or for a
1405subexpression inside a @samp{\|} alternative that wasn't used or a
1406repetition that repeated zero times.
1407@end defun
1408
1409@defun match-string-no-properties count &optional in-string
1410This function is like @code{match-string} except that the result
1411has no text properties.
1412@end defun
1413
1414@defun match-beginning count
1415This function returns the position of the start of the text matched by the
1416last regular expression searched for, or a subexpression of it.
1417
1418If @var{count} is zero, then the value is the position of the start of
1419the entire match. Otherwise, @var{count} specifies a subexpression in
1420the regular expression, and the value of the function is the starting
1421position of the match for that subexpression.
1422
1423The value is @code{nil} for a subexpression inside a @samp{\|}
1424alternative that wasn't used or a repetition that repeated zero times.
1425@end defun
1426
1427@defun match-end count
1428This function is like @code{match-beginning} except that it returns the
1429position of the end of the match, rather than the position of the
1430beginning.
1431@end defun
1432
1433 Here is an example of using the match data, with a comment showing the
1434positions within the text:
1435
1436@example
1437@group
1438(string-match "\\(qu\\)\\(ick\\)"
1439 "The quick fox jumped quickly.")
1440 ;0123456789
1441 @result{} 4
1442@end group
1443
1444@group
1445(match-string 0 "The quick fox jumped quickly.")
1446 @result{} "quick"
1447(match-string 1 "The quick fox jumped quickly.")
1448 @result{} "qu"
1449(match-string 2 "The quick fox jumped quickly.")
1450 @result{} "ick"
1451@end group
1452
1453@group
1454(match-beginning 1) ; @r{The beginning of the match}
1455 @result{} 4 ; @r{with @samp{qu} is at index 4.}
1456@end group
1457
1458@group
1459(match-beginning 2) ; @r{The beginning of the match}
1460 @result{} 6 ; @r{with @samp{ick} is at index 6.}
1461@end group
1462
1463@group
1464(match-end 1) ; @r{The end of the match}
1465 @result{} 6 ; @r{with @samp{qu} is at index 6.}
1466
1467(match-end 2) ; @r{The end of the match}
1468 @result{} 9 ; @r{with @samp{ick} is at index 9.}
1469@end group
1470@end example
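
 The @code{nil} result for an unused @samp{\|} alternative, mentioned
in the description of @code{match-string}, can be seen in a small
sketch like this one (the regexp and string are chosen only for
illustration):

@example
@group
(string-match "\\(a\\)\\|\\(b\\)" "b")
     @result{} 0
(match-string 1 "b")         ; @r{First alternative was not used.}
     @result{} nil
(match-string 2 "b")
     @result{} "b"
(match-beginning 1)
     @result{} nil
@end group
@end example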
1471
1472 Here is another example. Point is initially located at the beginning
1473of the line. Searching moves point to between the space and the word
1474@samp{in}. The beginning of the entire match is at the 9th character of
1475the buffer (@samp{T}), and the beginning of the match for the first
1476subexpression is at the 13th character (@samp{c}).
1477
1478@example
1479@group
1480(list
1481 (re-search-forward "The \\(cat \\)")
1482 (match-beginning 0)
1483 (match-beginning 1))
1484 @result{} (17 9 13)
1485@end group
1486
1487@group
1488---------- Buffer: foo ----------
1489I read "The cat @point{}in the hat comes back" twice.
1490 ^ ^
1491 9 13
1492---------- Buffer: foo ----------
1493@end group
1494@end example
1495
1496@noindent
1497(In this case, the index returned is a buffer position; the first
1498character of the buffer counts as 1.)
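
 The same search can be reproduced without a dedicated buffer by using
@code{with-temp-buffer}. This is only a sketch; the call to
@code{match-string} is added to show the text of the subexpression:

@example
@group
(with-temp-buffer
  (insert "I read \"The cat in the hat comes back\" twice.")
  (goto-char (point-min))
  (list (re-search-forward "The \\(cat \\)")
        (match-beginning 0)
        (match-beginning 1)
        (match-string 1)))
     @result{} (17 9 13 "cat ")
@end group
@end example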
1499
1500@node Entire Match Data
1501@subsection Accessing the Entire Match Data
1502
1503 The functions @code{match-data} and @code{set-match-data} read or
1504write the entire match data, all at once.
1505
1506@defun match-data &optional integers reuse reseat
1507This function returns a list of positions (markers or integers) that
1508record all the information on the text that the last search matched.
1509Element zero is the position of the beginning of the match for the
1510whole expression; element one is the position of the end of the match
1511for the expression. The next two elements are the positions of the
1512beginning and end of the match for the first subexpression, and so on.
1513In general, element
1514@ifnottex
1515number 2@var{n}
1516@end ifnottex
1517@tex
1518number {\mathsurround=0pt $2n$}
1519@end tex
1520corresponds to @code{(match-beginning @var{n})}; and
1521element
1522@ifnottex
1523number 2@var{n} + 1
1524@end ifnottex
1525@tex
1526number {\mathsurround=0pt $2n+1$}
1527@end tex
1528corresponds to @code{(match-end @var{n})}.
1529
1530Normally all the elements are markers or @code{nil}, but if
1531@var{integers} is non-@code{nil}, that means to use integers instead
1532of markers. (In that case, the buffer itself is appended as an
1533additional element at the end of the list, to facilitate complete
1534restoration of the match data.) If the last match was done on a
1535string with @code{string-match}, then integers are always used,
1536since markers can't point into a string.
1537
1538If @var{reuse} is non-@code{nil}, it should be a list. In that case,
1539@code{match-data} stores the match data in @var{reuse}. That is,
1540@var{reuse} is destructively modified. @var{reuse} does not need to
1541have the right length. If it is not long enough to contain the match
1542data, it is extended. If it is too long, the length of @var{reuse}
1543stays the same, but the elements that were not used are set to
1544@code{nil}. The purpose of this feature is to reduce the need for
1545garbage collection.
1546
1547If @var{reseat} is non-@code{nil}, all markers on the @var{reuse} list
1548are reseated to point to nowhere.
1549
1550As always, there must be no possibility of intervening searches between
1551the call to a search function and the call to @code{match-data} that is
1552intended to access the match data for that search.
1553
1554@example
1555@group
1556(match-data)
1557 @result{} (#<marker at 9 in foo>
1558 #<marker at 17 in foo>
1559 #<marker at 13 in foo>
1560 #<marker at 17 in foo>)
1561@end group
1562@end example
1563@end defun
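
 After a @code{string-match}, the list returned by @code{match-data}
holds integers rather than markers. For instance, reusing the sample
string from the earlier example (a sketch, not output copied from a
particular session):

@example
@group
(string-match "\\(qu\\)\\(ick\\)"
              "The quick fox jumped quickly.")
     @result{} 4
(match-data)
     @result{} (4 9 4 6 6 9)
@end group
@end example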
1564
1565@defun set-match-data match-list &optional reseat
1566This function sets the match data from the elements of @var{match-list},
1567which should be a list that was the value of a previous call to
1568@code{match-data}. (More precisely, anything that has the same format
1569will work.)
1570
1571If @var{match-list} refers to a buffer that doesn't exist, you don't get
1572an error; that sets the match data in a meaningless but harmless way.
1573
1574If @var{reseat} is non-@code{nil}, all markers on the @var{match-list} list
1575are reseated to point to nowhere.
1576
1577@c TODO Make it properly obsolete.
1578@findex store-match-data
1579@code{store-match-data} is a semi-obsolete alias for @code{set-match-data}.
1580@end defun
1581
1582@node Saving Match Data
1583@subsection Saving and Restoring the Match Data
1584
1585 When you call a function that may search, you may need to save
1586and restore the match data around that call, if you want to preserve the
1587match data from an earlier search for later use. Here is an example
1588that shows the problem that arises if you fail to save the match data:
1589
1590@example
1591@group
1592(re-search-forward "The \\(cat \\)")
1593 @result{} 48
1594(foo) ; @r{@code{foo} does more searching.}
1595(match-end 0)
1596 @result{} 61 ; @r{Unexpected result---not 48!}
1597@end group
1598@end example
1599
1600 You can save and restore the match data with @code{save-match-data}:
1601
1602@defmac save-match-data body@dots{}
1603This macro executes @var{body}, saving and restoring the match
1604data around it. The return value is the value of the last form in
1605@var{body}.
1606@end defmac
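
 For instance, in the following sketch the inner search is wrapped in
@code{save-match-data}, so the later call to @code{match-string} still
refers to the outer search (the variable name @code{s} is arbitrary):

@example
@group
(let ((s "The quick fox"))
  (string-match "\\(qu\\)\\(ick\\)" s)
  (save-match-data
    (string-match "fox" s))    ; @r{Would otherwise clobber the data.}
  (match-string 2 s))
     @result{} "ick"
@end group
@end example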
1607
1608 You could use @code{set-match-data} together with @code{match-data} to
1609imitate the effect of the macro @code{save-match-data}. Here is
1610how:
1611
1612@example
1613@group
1614(let ((data (match-data)))
1615 (unwind-protect
1616 @dots{} ; @r{Ok to change the original match data.}
1617 (set-match-data data)))
1618@end group
1619@end example
1620
1621 Emacs automatically saves and restores the match data when it runs
1622process filter functions (@pxref{Filter Functions}) and process
1623sentinels (@pxref{Sentinels}).
1624
1625@ignore
1626 Here is a function which restores the match data provided the buffer
1627associated with it still exists.
1628
1629@smallexample
1630@group
1631(defun restore-match-data (data)
1632@c It is incorrect to split the first line of a doc string.
1633@c If there's a problem here, it should be solved in some other way.
1634 "Restore the match data DATA unless the buffer is missing."
1635 (catch 'foo
1636 (let ((d data))
1637@end group
1638 (while d
1639 (and (car d)
1640 (null (marker-buffer (car d)))
1641@group
1642 ;; @file{match-data} @r{buffer is deleted.}
1643 (throw 'foo nil))
1644 (setq d (cdr d)))
1645 (set-match-data data))))
1646@end group
1647@end smallexample
1648@end ignore
1649
1650@node Search and Replace
1651@section Search and Replace
1652@cindex replacement after search
1653@cindex searching and replacing
1654
1655 If you want to find all matches for a regexp in part of the buffer,
1656and replace them, the best way is to write an explicit loop using
1657@code{re-search-forward} and @code{replace-match}, like this:
1658
1659@example
1660(while (re-search-forward "foo[ \t]+bar" nil t)
1661 (replace-match "foobar"))
1662@end example
1663
1664@noindent
1665@xref{Replacing Match,, Replacing the Text that Matched}, for a
1666description of @code{replace-match}.
1667
1668 However, replacing matches in a string is more complex, especially
1669if you want to do it efficiently. So Emacs provides a function to do
1670this.
1671
1672@defun replace-regexp-in-string regexp rep string &optional fixedcase literal subexp start
1673This function copies @var{string} and searches it for matches for
1674@var{regexp}, and replaces them with @var{rep}. It returns the
1675modified copy. If @var{start} is non-@code{nil}, the search for
1676matches starts at that index in @var{string}, so matches starting
1677before that index are not changed.
1678
1679This function uses @code{replace-match} to do the replacement, and it
1680passes the optional arguments @var{fixedcase}, @var{literal} and
1681@var{subexp} along to @code{replace-match}.
1682
1683Instead of a string, @var{rep} can be a function. In that case,
1684@code{replace-regexp-in-string} calls @var{rep} for each match,
1685passing the text of the match as its sole argument. It collects the
1686value @var{rep} returns and passes that to @code{replace-match} as the
1687replacement string. The match data at this point are the result
1688of matching @var{regexp} against a substring of @var{string}.
1689@end defun
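
 Here are a few illustrative calls to @code{replace-regexp-in-string}.
The strings are made up for the example, and the last call follows the
one given in the function's own doc string:

@example
@group
(replace-regexp-in-string "foo[ \t]+bar" "foobar"
                          "foo   bar and foo bar")
     @result{} "foobar and foobar"
@end group

@group
;; @r{A function as the replacement: double each number.}
(replace-regexp-in-string
 "[0-9]+"
 (lambda (m) (number-to-string (* 2 (string-to-number m))))
 "3 apples and 10 pears")
     @result{} "6 apples and 20 pears"
@end group

@group
;; @r{Replace only subexpression 1 of the match.}
(replace-regexp-in-string "\\(foo\\).*\\'" "bar" " foo foo" nil nil 1)
     @result{} " bar foo"
@end group
@end example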
1690
1691 If you want to write a command along the lines of @code{query-replace},
1692you can use @code{perform-replace} to do the work.
1693
1694@defun perform-replace from-string replacements query-flag regexp-flag delimited-flag &optional repeat-count map start end
1695This function is the guts of @code{query-replace} and related
1696commands. It searches for occurrences of @var{from-string} in the
1697text between positions @var{start} and @var{end} and replaces some or
1698all of them. If @var{start} is @code{nil} (or omitted), point is used
1699instead, and the end of the buffer's accessible portion is used for
1700@var{end}.
1701
1702If @var{query-flag} is @code{nil}, it replaces all
1703occurrences; otherwise, it asks the user what to do about each one.
1704
1705If @var{regexp-flag} is non-@code{nil}, then @var{from-string} is
1706considered a regular expression; otherwise, it must match literally. If
1707@var{delimited-flag} is non-@code{nil}, then only matches
1708surrounded by word boundaries are replaced.
1709
1710The argument @var{replacements} specifies what to replace occurrences
1711with. If it is a string, that string is used. It can also be a list of
1712strings, to be used in cyclic order.
1713
1714If @var{replacements} is a cons cell, @w{@code{(@var{function}
1715. @var{data})}}, this means to call @var{function} after each match to
1716get the replacement text. This function is called with two arguments:
1717@var{data}, and the number of replacements already made.
1718
1719If @var{repeat-count} is non-@code{nil}, it should be an integer. Then
1720it specifies how many times to use each of the strings in the
1721@var{replacements} list before advancing cyclically to the next one.
1722
1723If @var{from-string} contains upper-case letters, then
1724@code{perform-replace} binds @code{case-fold-search} to @code{nil}, and
1725it uses the @var{replacements} without altering their case.
1726
1727Normally, the keymap @code{query-replace-map} defines the possible
1728user responses for queries. The argument @var{map}, if
1729non-@code{nil}, specifies a keymap to use instead of
1730@code{query-replace-map}.
1731
1732This function uses one of two functions to search for the next
1733occurrence of @var{from-string}. These functions are specified by the
1734values of two variables: @code{replace-re-search-function} and
1735@code{replace-search-function}. The former is called when the
1736argument @var{regexp-flag} is non-@code{nil}, the latter when it is
1737@code{nil}.
1738@end defun
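
 As a rough sketch, a command that asks about replacing each
@samp{foo} after point with @samp{bar} could be written as follows.
The command name @code{my-query-replace-foo} is hypothetical:

@example
@group
(defun my-query-replace-foo ()
  "Query-replace \"foo\" with \"bar\" from point onward."
  (interactive)
  ;; @r{Literal search, asking the user about each occurrence.}
  (perform-replace "foo" "bar" t nil nil))
@end group
@end example

@noindent
Passing @code{nil} as @var{query-flag} instead would replace every
occurrence without asking.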
1739
1740@defvar query-replace-map
1741This variable holds a special keymap that defines the valid user
1742responses for @code{perform-replace} and the commands that use it, as
1743well as @code{y-or-n-p} and @code{map-y-or-n-p}. This map is unusual
1744in two ways:
1745
1746@itemize @bullet
1747@item
1748The ``key bindings'' are not commands, just symbols that are meaningful
1749to the functions that use this map.
1750
1751@item
1752Prefix keys are not supported; each key binding must be for a
1753single-event key sequence. This is because the functions don't use
1754@code{read-key-sequence} to get the input; instead, they read a single
1755event and look it up ``by hand''.
1756@end itemize
1757@end defvar
1758
1759Here are the meaningful ``bindings'' for @code{query-replace-map}.
1760Several of them are meaningful only for @code{query-replace} and
1761friends.
1762
1763@table @code
1764@item act
1765Do take the action being considered---in other words, ``yes''.
1766
1767@item skip
1768Do not take action for this question---in other words, ``no''.
1769
1770@item exit
1771Answer this question ``no'', and give up on the entire series of
1772questions, assuming that the answers will be ``no''.
1773
1774@item exit-prefix
1775Like @code{exit}, but add the key that was pressed to
1776@code{unread-command-events} (@pxref{Event Input Misc}).
1777
1778@item act-and-exit
1779Answer this question ``yes'', and give up on the entire series of
1780questions, assuming that subsequent answers will be ``no''.
1781
1782@item act-and-show
1783Answer this question ``yes'', but show the results---don't advance yet
1784to the next question.
1785
1786@item automatic
1787Answer this question and all subsequent questions in the series with
1788``yes'', without further user interaction.
1789
1790@item backup
1791Move back to the previous place that a question was asked about.
1792
1793@item edit
1794Enter a recursive edit to deal with this question---instead of any
1795other action that would normally be taken.
1796
1797@item edit-replacement
1798Edit the replacement for this question in the minibuffer.
1799
1800@item delete-and-edit
1801Delete the text being considered, then enter a recursive edit to replace
1802it.
1803
1804@item recenter
1805@itemx scroll-up
1806@itemx scroll-down
1807@itemx scroll-other-window
1808@itemx scroll-other-window-down
1809Perform the specified window scroll operation, then ask the same
1810question again. Only @code{y-or-n-p} and related functions use this
1811answer.
1812
1813@item quit
1814Perform a quit right away. Only @code{y-or-n-p} and related functions
1815use this answer.
1816
1817@item help
1818Display some help, then ask again.
1819@end table
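
 Because the ``bindings'' are plain symbols, they can be inspected
with @code{lookup-key}. The following sketch assumes the standard
bindings set up by Emacs have not been changed:

@example
@group
(lookup-key query-replace-map "y")
     @result{} act
(lookup-key query-replace-map "n")
     @result{} skip
(lookup-key query-replace-map "q")
     @result{} exit
@end group
@end example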
1820
1821@defvar multi-query-replace-map
1822This variable holds a keymap that extends @code{query-replace-map} by
1823providing additional keybindings that are useful in multi-buffer
1824replacements. The additional ``bindings'' are:
1825
1826@table @code
1827@item automatic-all
1828Answer this question and all subsequent questions in the series with
1829``yes'', without further user interaction, for all remaining buffers.
1830
1831@item exit-current
1832Answer this question ``no'', and give up on the entire series of
1833questions for the current buffer. Continue to the next buffer in the
1834sequence.
1835@end table
1836@end defvar
1837
1838@defvar replace-search-function
1839This variable specifies a function that @code{perform-replace} calls
1840to search for the next string to replace. Its default value is
1841@code{search-forward}. Any other value should name a function of 3
1842arguments: the first 3 arguments of @code{search-forward}
1843(@pxref{String Search}).
1844@end defvar
1845
1846@defvar replace-re-search-function
1847This variable specifies a function that @code{perform-replace} calls
1848to search for the next regexp to replace. Its default value is
1849@code{re-search-forward}. Any other value should name a function of 3
1850arguments: the first 3 arguments of @code{re-search-forward}
1851(@pxref{Regexp Search}).
1852@end defvar
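
 For example, a command could bind one of these variables around a
call to @code{perform-replace}. In this sketch the function
@code{my-search-forward-logged} is hypothetical; it simply reports
each search before delegating to @code{search-forward}:

@example
@group
(defun my-search-forward-logged (string &optional bound noerror)
  "Like `search-forward', but announce STRING first."
  (message "Searching for %s" string)
  (search-forward string bound noerror))
@end group

@group
(let ((replace-search-function #'my-search-forward-logged))
  (perform-replace "foo" "bar" nil nil nil))
@end group
@end example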
1853
1854@node Standard Regexps
1855@section Standard Regular Expressions Used in Editing
1856@cindex regexps used standardly in editing
1857@cindex standard regexps used in editing
1858
1859 This section describes some variables that hold regular expressions
1860used for certain purposes in editing:
1861
1862@defopt page-delimiter
1863This is the regular expression describing line-beginnings that separate
1864pages. The default value is @code{"^\014"} (i.e., @code{"^^L"} or
1865@code{"^\C-l"}); this matches a line that starts with a formfeed
1866character.
1867@end defopt
1868
1869 The following two regular expressions should @emph{not} assume the
1870match always starts at the beginning of a line; they should not use
1871@samp{^} to anchor the match. Most often, the paragraph commands do
1872check for a match only at the beginning of a line, which means that
1873@samp{^} would be superfluous. When there is a nonzero left margin,
1874they accept matches that start after the left margin. In that case, a
1875@samp{^} would be incorrect. However, a @samp{^} is harmless in modes
1876where a left margin is never used.
1877
1878@defopt paragraph-separate
1879This is the regular expression for recognizing the beginning of a line
1880that separates paragraphs. (If you change this, you may have to
1881change @code{paragraph-start} also.) The default value is
1882@w{@code{"[@ \t\f]*$"}}, which matches a line that consists entirely of
1883spaces, tabs, and form feeds (after its left margin).
1884@end defopt
1885
1886@defopt paragraph-start
1887This is the regular expression for recognizing the beginning of a line
1888that starts @emph{or} separates paragraphs. The default value is
1889@w{@code{"\f\\|[ \t]*$"}}, which matches a line containing only
1890whitespace or starting with a form feed (after its left margin).
1891@end defopt
1892
1893@defopt sentence-end
1894If non-@code{nil}, the value should be a regular expression describing
1895the end of a sentence, including the whitespace following the
1896sentence. (All paragraph boundaries also end sentences, regardless.)
1897
1898If the value is @code{nil}, as it is by default, then the function
1899@code{sentence-end} constructs the regexp. That is why you
1900should always call the function @code{sentence-end} to obtain the
1901regexp to be used to recognize the end of a sentence.
1902@end defopt
1903
1904@defun sentence-end
1905This function returns the value of the variable @code{sentence-end},
1906if non-@code{nil}. Otherwise it returns a default value based on the
1907values of the variables @code{sentence-end-double-space}
1908(@pxref{Definition of sentence-end-double-space}),
1909@code{sentence-end-without-period}, and
1910@code{sentence-end-without-space}.
1911@end defun
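
 A minimal sketch of its use, searching for the first sentence end in
a temporary buffer (the sample text is arbitrary):

@example
@group
(with-temp-buffer
  (insert "First sentence.  Second one.")
  (goto-char (point-min))
  ;; @r{Call the function rather than using the variable directly.}
  (re-search-forward (sentence-end) nil t)
  (buffer-substring (point) (point-max)))
@end group
@end example

@noindent
With the default settings this should leave point just before
@samp{Second}.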