@c This is part of the Emacs manual.
@c Copyright (C) 1985, 1986, 1987, 1993, 1994, 1995, 1997, 2000, 2001, 2002,
@c   2003, 2004, 2005, 2006 Free Software Foundation, Inc.
@c See file emacs.texi for copying conditions.
@node Search, Fixit, Display, Top
@chapter Searching and Replacement
@cindex searching
@cindex finding strings within text

  Like other editors, Emacs has commands for searching for occurrences of
a string. The principal search command is unusual in that it is
@dfn{incremental}; it begins to search before you have finished typing the
search string. There are also nonincremental search commands more like
those of other editors.

  Besides the usual @code{replace-string} command that finds all
occurrences of one string and replaces them with another, Emacs has a
more flexible replacement command called @code{query-replace}, which
asks interactively which occurrences to replace. There are also
commands to find and operate on all matches for a pattern.

  You can also search multiple files under control of a tags
table (@pxref{Tags Search}) or through the Dired @kbd{A} command
(@pxref{Operating on Files}), or ask the @code{grep} program to do it
(@pxref{Grep Searching}).

@menu
* Incremental Search::     Search happens as you type the string.
* Nonincremental Search::  Specify entire string and then search.
* Word Search::            Search for sequence of words.
* Regexp Search::          Search for match for a regexp.
* Regexps::                Syntax of regular expressions.
* Regexp Backslash::       Regular expression constructs starting with `\'.
* Regexp Example::         A complex regular expression explained.
* Search Case::            To ignore case while searching, or not.
* Replace::                Search, and replace some or all matches.
* Other Repeating Search:: Operating on all matches for some regexp.
@end menu

@node Incremental Search
@section Incremental Search
@cindex incremental search
@cindex isearch

  An incremental search begins searching as soon as you type the first
character of the search string. As you type in the search string, Emacs
shows you where the string (as you have typed it so far) would be
found. When you have typed enough characters to identify the place you
want, you can stop. Depending on what you plan to do next, you may or
may not need to terminate the search explicitly with @key{RET}.

@table @kbd
@item C-s
Incremental search forward (@code{isearch-forward}).
@item C-r
Incremental search backward (@code{isearch-backward}).
@end table

@menu
* Basic Isearch::       Basic incremental search commands.
* Repeat Isearch::      Searching for the same string again.
* Error in Isearch::    When your string is not found.
* Special Isearch::     Special input in incremental search.
* Non-ASCII Isearch::   How to search for non-ASCII characters.
* Isearch Yank::        Commands that grab text into the search string
                        or else edit the search string.
* Highlight Isearch::   Isearch highlights the other possible matches.
* Isearch Scroll::      Scrolling during an incremental search.
* Slow Isearch::        Incremental search features for slow terminals.
@end menu

@node Basic Isearch
@subsection Basics of Incremental Search

@kindex C-s
@findex isearch-forward
  @kbd{C-s} starts a forward incremental search. It reads characters
from the keyboard, and moves point past the next occurrence of those
characters. If you type @kbd{C-s} and then @kbd{F}, that puts the
cursor after the first @samp{F} (the first following the starting point, since
this is a forward search). Then if you type an @kbd{O}, you will see
the cursor move to just after the first @samp{FO} (the @samp{F} in that
@samp{FO} may or may not be the first @samp{F}). After another
@kbd{O}, the cursor moves to just after the first @samp{FOO} after the place
where you started the search. At each step, the buffer text that
matches the search string is highlighted, if the terminal can do that;
the current search string is always displayed in the echo area.

  If you make a mistake in typing the search string, you can cancel
characters with @key{DEL}. Each @key{DEL} cancels the last character of
the search string. This does not happen until Emacs is ready to read another
input character; first it must either find, or fail to find, the character
you want to erase. If you do not want to wait for this to happen, use
@kbd{C-g} as described below.

  When you are satisfied with the place you have reached, you can type
@key{RET}, which stops searching, leaving the cursor where the search
brought it. Also, any command not specially meaningful in searches
stops the searching and is then executed. Thus, typing @kbd{C-a}
would exit the search and then move to the beginning of the line.
@key{RET} is necessary only if the next command you want to type is a
printing character, @key{DEL}, @key{RET}, or another character that is
special within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s},
@kbd{C-y}, @kbd{M-y}, @kbd{M-r}, @kbd{M-c}, @kbd{M-e}, and some other
meta-characters).

  When you exit the incremental search, it sets the mark where point
@emph{was} before the search. That is convenient for moving back
there. In Transient Mark mode, incremental search sets the mark
without activating it, and does so only if the mark is not already
active.

@node Repeat Isearch
@subsection Repeating Incremental Search

  Sometimes you search for @samp{FOO} and find one, but not the one you
expected to find. There was a second @samp{FOO} that you forgot
about, before the one you were aiming for. In this event, type
another @kbd{C-s} to move to the next occurrence of the search string.
You can repeat this any number of times. If you overshoot, you can
cancel some @kbd{C-s} characters with @key{DEL}.

  After you exit a search, you can search for the same string again by
typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
incremental search, and the second @kbd{C-s} means ``search again.''

  If a search is failing and you ask to repeat it by typing another
@kbd{C-s}, it starts again from the beginning of the buffer.
Repeating a failing reverse search with @kbd{C-r} starts again from
the end. This is called @dfn{wrapping around}, and @samp{Wrapped}
appears in the search prompt once this has happened. If you keep on
going past the original starting point of the search, it changes to
@samp{Overwrapped}, which means that you are revisiting matches that
you have already seen.

  To reuse earlier search strings, use the @dfn{search ring}. The
commands @kbd{M-p} and @kbd{M-n} move through the ring to pick a search
string to reuse. These commands leave the selected search ring element
in the minibuffer, where you can edit it. To edit the current search
string in the minibuffer without replacing it with items from the
search ring, type @kbd{M-e}. Type @kbd{C-s} or @kbd{C-r}
to terminate editing the string and search for it.

  You can change to searching backwards with @kbd{C-r}. For instance,
if you are searching forward but you realize you were looking for
something above the starting point, you can do this. Repeated
@kbd{C-r} keeps looking for more occurrences backwards. A @kbd{C-s}
starts going forwards again. @kbd{C-r} in a search can be canceled
with @key{DEL}.

@kindex C-r
@findex isearch-backward
  If you know initially that you want to search backwards, you can use
@kbd{C-r} instead of @kbd{C-s} to start the search, because @kbd{C-r}
as a key runs a command (@code{isearch-backward}) to search backward.
A backward search finds matches that end before the starting point,
just as a forward search finds matches that begin after it.

@node Error in Isearch
@subsection Errors in Incremental Search

  If your string is not found at all, the echo area says @samp{Failing
I-Search}. The cursor is after the place where Emacs found as much of your
string as it could. Thus, if you search for @samp{FOOT}, and there is no
@samp{FOOT}, you might see the cursor after the @samp{FOO} in @samp{FOOL}.
At this point there are several things you can do. If your string was
mistyped, you can rub some of it out and correct it. If you like the place
you have found, you can type @key{RET} or some other Emacs command to
remain there. Or you can type @kbd{C-g}, which
removes from the search string the characters that could not be found (the
@samp{T} in @samp{FOOT}), leaving those that were found (the @samp{FOO} in
@samp{FOOT}). A second @kbd{C-g} at that point cancels the search
entirely, returning point to where it was when the search started.

@cindex quitting (in search)
  The @kbd{C-g} ``quit'' character does special things during searches;
just what it does depends on the status of the search. If the search has
found what you specified and is waiting for input, @kbd{C-g} cancels the
entire search. The cursor moves back to where you started the search. If
@kbd{C-g} is typed when there are characters in the search string that have
not been found---because Emacs is still searching for them, or because it
has failed to find them---then the search string characters which have not
been found are discarded from the search string. With them gone, the
search is now successful and waiting for more input, so a second @kbd{C-g}
will cancel the entire search.

@node Special Isearch
@subsection Special Input for Incremental Search

  An upper-case letter in the search string makes the search
case-sensitive. If you delete the upper-case character from the search
string, it ceases to have this effect. @xref{Search Case}.

  To search for a newline, type @kbd{C-j}. To search for another
control character, such as control-S or carriage return, you must quote
it by typing @kbd{C-q} first. This function of @kbd{C-q} is analogous
to its use for insertion (@pxref{Inserting Text}): it causes the
following character to be treated the way any ``ordinary'' character is
treated in the same context. You can also specify a character by its
octal code: enter @kbd{C-q} followed by a sequence of octal digits.

  @kbd{M-%} typed in incremental search invokes @code{query-replace}
or @code{query-replace-regexp} (depending on search mode) with the
current search string used as the string to replace. @xref{Query
Replace}.

  Entering @key{RET} when the search string is empty launches
nonincremental search (@pxref{Nonincremental Search}).

@vindex isearch-mode-map
  To customize the special characters that incremental search understands,
alter their bindings in the keymap @code{isearch-mode-map}. For a list
of bindings, look at the documentation of @code{isearch-mode} with
@kbd{C-h f isearch-mode @key{RET}}.

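  As a minimal sketch of such a customization, the following line for
your init file makes @kbd{C-h} cancel the last character of the search
string, as @key{DEL} does. The choice of @kbd{C-h} is only an
illustration, and the sketch assumes that @code{isearch-delete-char} is
the command that @key{DEL} runs in @code{isearch-mode-map}; check the
list of bindings described above before relying on it.

@example
;; Hypothetical binding: make C-h act like DEL during incremental search.
(define-key isearch-mode-map "\C-h" 'isearch-delete-char)
@end example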
@node Non-ASCII Isearch
@subsection Isearch for Non-@acronym{ASCII} Characters
@cindex searching for non-@acronym{ASCII} characters
@cindex input method, during incremental search

  To enter non-@acronym{ASCII} characters in an incremental search,
you can use @kbd{C-q} (see the previous section), but it is easier to
use an input method (@pxref{Input Methods}). If an input method is
enabled in the current buffer when you start the search, you can use
it in the search string also. Emacs indicates that by including the
input method mnemonic in its prompt, like this:

@example
I-search [@var{im}]:
@end example

@noindent
@findex isearch-toggle-input-method
@findex isearch-toggle-specified-input-method
where @var{im} is the mnemonic of the active input method.

  You can toggle (enable or disable) the input method while you type
the search string with @kbd{C-\} (@code{isearch-toggle-input-method}).
You can turn on a certain (non-default) input method with @kbd{C-^}
(@code{isearch-toggle-specified-input-method}), which prompts for the
name of the input method. The input method you enable during
incremental search remains enabled in the current buffer afterwards.

@node Isearch Yank
@subsection Isearch Yanking

  The characters @kbd{C-w} and @kbd{C-y} can be used in incremental
search to grab text from the buffer into the search string. This
makes it convenient to search for another occurrence of text at point.
@kbd{C-w} copies the character or word after point as part of the
search string, advancing point over it. (The decision whether to
copy a character or a word is heuristic.) Another @kbd{C-s} to
repeat the search will then search for a string including that
character or word.

  @kbd{C-y} is similar to @kbd{C-w} but copies all the rest of the
current line into the search string. If point is already at the end
of a line, it grabs the entire next line. Both @kbd{C-y} and
@kbd{C-w} convert the text they copy to lower case if the search is
currently not case-sensitive; this is so the search remains
case-insensitive.

  @kbd{C-M-w} and @kbd{C-M-y} modify the search string by only one
character at a time: @kbd{C-M-w} deletes the last character from the
search string and @kbd{C-M-y} copies the character after point to the
end of the search string. An alternative method to add the character
after point into the search string is to enter the minibuffer by
@kbd{M-e} and to type @kbd{C-f} at the end of the search string in the
minibuffer.

  The character @kbd{M-y} copies text from the kill ring into the search
string. It uses the same text that @kbd{C-y} as a command would yank.
@kbd{Mouse-2} in the echo area does the same.
@xref{Yanking}.

@node Highlight Isearch
@subsection Lazy Search Highlighting
@cindex lazy search highlighting
@vindex isearch-lazy-highlight

  When you pause for a little while during incremental search, it
highlights all other possible matches for the search string. This
makes it easier to anticipate where you can get to by typing @kbd{C-s}
or @kbd{C-r} to repeat the search. The short delay before highlighting
other matches helps indicate which match is the current one.
If you don't like this feature, you can turn it off by setting
@code{isearch-lazy-highlight} to @code{nil}.

@cindex faces for highlighting search matches
  You can control how this highlighting looks by customizing the faces
@code{isearch} (used for the current match) and @code{lazy-highlight}
(for all the other matches).

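  For instance, either of the following could go in your init file.
This is only a sketch: the colors are arbitrary examples, and you may
prefer to make the same changes through the Customize interface.

@example
;; Either turn off lazy highlighting of the other matches...
(setq isearch-lazy-highlight nil)

;; ...or leave it on and change how the two faces look.
;; The color names here are illustrative only.
(set-face-background 'isearch "magenta")
(set-face-background 'lazy-highlight "yellow")
@end example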
@node Isearch Scroll
@subsection Scrolling During Incremental Search

  You can enable the use of vertical scrolling during incremental
search (without exiting the search) by setting the customizable
variable @code{isearch-allow-scroll} to a non-@code{nil} value. This
applies to using the vertical scroll-bar and to certain keyboard
commands such as @kbd{@key{PRIOR}} (@code{scroll-down}),
@kbd{@key{NEXT}} (@code{scroll-up}) and @kbd{C-l} (@code{recenter}).
You must run these commands via their key sequences to stay in the
search---typing @kbd{M-x} will terminate the search. You can give
prefix arguments to these commands in the usual way.

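  One way to set the variable, assuming you prefer an init-file line to
the Customize interface, is:

@example
;; Allow scrolling commands without leaving incremental search.
(setq isearch-allow-scroll t)
@end example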
  This feature won't let you scroll the current match out of visibility,
however.

  The feature also affects some other commands, such as @kbd{C-x 2}
(@code{split-window-vertically}) and @kbd{C-x ^}
(@code{enlarge-window}) which don't exactly scroll but do affect where
the text appears on the screen. In general, it applies to any command
whose name has a non-@code{nil} @code{isearch-scroll} property. So you
can control which commands are affected by changing these properties.

  For example, to make @kbd{C-h l} usable within an incremental search
in all future Emacs sessions, use @kbd{C-h c} to find what command it
runs. (You type @kbd{C-h c C-h l}; it says @code{view-lossage}.)
Then you can put the following line in your @file{.emacs} file
(@pxref{Init File}):

@example
(put 'view-lossage 'isearch-scroll t)
@end example

@noindent
This feature can be applied to any command that doesn't permanently
change point, the buffer contents, the match data, the current buffer,
or the selected window and frame. The command must not itself attempt
an incremental search.

@node Slow Isearch
@subsection Slow Terminal Incremental Search

  Incremental search on a slow terminal uses a modified style of display
that is designed to take less time. Instead of redisplaying the buffer at
each place the search gets to, it creates a new single-line window and uses
that to display the line that the search has found. The single-line window
comes into play as soon as point moves outside of the text that is already
on the screen.

  When you terminate the search, the single-line window is removed.
Emacs then redisplays the window in which the search was done, to show
its new position of point.

@vindex search-slow-speed
  The slow terminal style of display is used when the terminal baud rate is
less than or equal to the value of the variable @code{search-slow-speed},
initially 1200. See also the discussion of the variable @code{baud-rate}
(@pxref{baud-rate,, Customization of Display}).

@vindex search-slow-window-lines
  The number of lines to use in slow terminal search display is controlled
by the variable @code{search-slow-window-lines}. Its normal value is 1.

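  As a sketch, these two variables could be set in your init file as
follows. The values 2400 and 3 are arbitrary examples, not
recommendations.

@example
;; Use the slow-terminal display at 2400 baud or below (example value).
(setq search-slow-speed 2400)
;; Show three lines in the temporary window instead of one (example value).
(setq search-slow-window-lines 3)
@end example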
@node Nonincremental Search
@section Nonincremental Search
@cindex nonincremental search

  Emacs also has conventional nonincremental search commands, which require
you to type the entire search string before searching begins.

@table @kbd
@item C-s @key{RET} @var{string} @key{RET}
Search for @var{string}.
@item C-r @key{RET} @var{string} @key{RET}
Search backward for @var{string}.
@end table

  To do a nonincremental search, first type @kbd{C-s @key{RET}}. This
enters the minibuffer to read the search string; terminate the string
with @key{RET}, and then the search takes place. If the string is not
found, the search command signals an error.

  When you type @kbd{C-s @key{RET}}, the @kbd{C-s} invokes incremental
search as usual. That command is specially programmed to invoke
nonincremental search, @code{search-forward}, if the string you
specify is empty. (Such an empty argument would otherwise be
useless.) But it does not call @code{search-forward} right away. First
it checks the next input character to see if it is @kbd{C-w},
which specifies a word search.
@ifnottex
@xref{Word Search}.
@end ifnottex
@kbd{C-r @key{RET}} does likewise, for a reverse incremental search.

@findex search-forward
@findex search-backward
  Forward and backward nonincremental searches are implemented by the
commands @code{search-forward} and @code{search-backward}. These
commands may be bound to keys in the usual manner. The feature that you
can get to them via the incremental search commands exists for
historical reasons, and to avoid the need to find separate key sequences
for them.

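  If you do want separate keys for these commands, a sketch of the
bindings for your init file might look like this. The key sequences
@kbd{C-c s} and @kbd{C-c r} are hypothetical choices; any sequences you
have free will do.

@example
;; Hypothetical key choices for the nonincremental search commands.
(global-set-key "\C-cs" 'search-forward)
(global-set-key "\C-cr" 'search-backward)
@end example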
@node Word Search
@section Word Search
@cindex word search

  Word search searches for a sequence of words without regard to how the
words are separated. More precisely, you type a string of many words,
using single spaces to separate them, and the string can be found even
if there are multiple spaces, newlines, or other punctuation characters
between these words.

  Word search is useful for editing a printed document made with a text
formatter. If you edit while looking at the printed, formatted version,
you can't tell where the line breaks are in the source file. With word
search, you can search without having to know them.

@table @kbd
@item C-s @key{RET} C-w @var{words} @key{RET}
Search for @var{words}, ignoring details of punctuation.
@item C-r @key{RET} C-w @var{words} @key{RET}
Search backward for @var{words}, ignoring details of punctuation.
@end table

  Word search as a special case of nonincremental search is invoked
with @kbd{C-s @key{RET} C-w}. This is followed by the search string,
which must always be terminated with @key{RET}. Being nonincremental,
this search does not start until the argument is terminated. It works
by constructing a regular expression and searching for that; see
@ref{Regexp Search}.

  Use @kbd{C-r @key{RET} C-w} to do backward word search.

  You can also invoke word search with @kbd{C-s M-e C-w} or @kbd{C-r
M-e C-w} followed by the search string and terminated with @key{RET},
@kbd{C-s} or @kbd{C-r}. This puts word search into incremental mode
where you can use all keys available for incremental search. However,
when you type more words in incremental word search, it will fail
until you type complete words.

@findex word-search-forward
@findex word-search-backward
  Forward and backward word searches are implemented by the commands
@code{word-search-forward} and @code{word-search-backward}. These
commands may be bound to keys in the usual manner. They are available
via the incremental search commands both for historical reasons and
to avoid the need to find separate key sequences for them.

@node Regexp Search
@section Regular Expression Search
@cindex regular expression
@cindex regexp

  A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern
that denotes a class of alternative strings to match, possibly
infinitely many. GNU Emacs provides both incremental and
nonincremental ways to search for a match for a regexp. The syntax of
regular expressions is explained in the following section.

@kindex C-M-s
@findex isearch-forward-regexp
@kindex C-M-r
@findex isearch-backward-regexp
  Incremental search for a regexp is done by typing @kbd{C-M-s}
(@code{isearch-forward-regexp}), by invoking @kbd{C-s} with a
prefix argument (whose value does not matter), or by typing @kbd{M-r}
within a forward incremental search. This command reads a
search string incrementally just like @kbd{C-s}, but it treats the
search string as a regexp rather than looking for an exact match
against the text in the buffer. Each time you add text to the search
string, you make the regexp longer, and the new regexp is searched
for. To search backward for a regexp, use @kbd{C-M-r}
(@code{isearch-backward-regexp}), @kbd{C-r} with a prefix argument,
or @kbd{M-r} within a backward incremental search.

  All of the control characters that do special things within an
ordinary incremental search have the same function in incremental regexp
search. Typing @kbd{C-s} or @kbd{C-r} immediately after starting the
search retrieves the last incremental search regexp used; that is to
say, incremental regexp and non-regexp searches have independent
defaults. They also have separate search rings that you can access with
@kbd{M-p} and @kbd{M-n}.

@vindex search-whitespace-regexp
  If you type @key{SPC} in incremental regexp search, it matches any
sequence of whitespace characters, including newlines. If you want to
match just a space, type @kbd{C-q @key{SPC}}. You can control what a
bare space matches by setting the variable
@code{search-whitespace-regexp} to the desired regexp.

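  For example, the following sketch makes a typed space match only runs
of spaces and tabs, so that a match never extends across a newline. The
particular regexp is an illustrative choice; character sets such as
@samp{[ \t]} are explained in the section on regexp syntax.

@example
;; A typed space matches one or more spaces or tabs, but not newlines.
(setq search-whitespace-regexp "[ \t]+")
@end example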
  In some cases, adding characters to the regexp in an incremental regexp
search can make the cursor move back and start again. For example, if
you have searched for @samp{foo} and you add @samp{\|bar}, the cursor
backs up in case the first @samp{bar} precedes the first @samp{foo}.

@findex re-search-forward
@findex re-search-backward
  Nonincremental search for a regexp is done by the functions
@code{re-search-forward} and @code{re-search-backward}. You can invoke
these with @kbd{M-x}, or bind them to keys, or invoke them by way of
incremental regexp search with @kbd{C-M-s @key{RET}} and @kbd{C-M-r
@key{RET}}.

  If you use the incremental regexp search commands with a prefix
argument, they perform ordinary string search, like
@code{isearch-forward} and @code{isearch-backward}. @xref{Incremental
Search}.

@node Regexps
@section Syntax of Regular Expressions
@cindex syntax of regexps

  This manual describes regular expression features that users
typically want to use. There are additional features that are
mainly used in Lisp programs; see @ref{Regular Expressions,,,
elisp, The Emacs Lisp Reference Manual}.

  Regular expressions have a syntax in which a few characters are
special constructs and the rest are @dfn{ordinary}. An ordinary
character is a simple regular expression which matches that same
character and nothing else. The special characters are @samp{$},
@samp{^}, @samp{.}, @samp{*}, @samp{+}, @samp{?}, @samp{[}, and
@samp{\}. The character @samp{]} is special if it ends a character
alternative (see later). The character @samp{-} is special inside a
character alternative. Any other character appearing in a regular
expression is ordinary, unless a @samp{\} precedes it. (When you use
regular expressions in a Lisp program, each @samp{\} must be doubled;
see the example near the end of this section.)

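  To illustrate the doubling, here is a small sketch of a Lisp call
that searches forward for a period followed by @samp{texi}. The regexp
itself is @samp{\.texi}, where the @samp{\} makes the period ordinary;
the search string @samp{texi} is just an example.

@example
;; The regexp \.texi, written as a Lisp string with the backslash doubled.
;; The third argument t makes the function return nil, rather than
;; signal an error, if there is no match.
(re-search-forward "\\.texi" nil t)
@end example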
  For example, @samp{f} is not a special character, so it is ordinary, and
therefore @samp{f} is a regular expression that matches the string
@samp{f} and no other string. (It does @emph{not} match the string
@samp{ff}.) Likewise, @samp{o} is a regular expression that matches
only @samp{o}. (When case distinctions are being ignored, these regexps
also match @samp{F} and @samp{O}, but we consider this a generalization
of ``the same string,'' rather than an exception.)

  Any two regular expressions @var{a} and @var{b} can be concatenated. The
result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.@refill

  As a simple example, we can concatenate the regular expressions @samp{f}
and @samp{o} to get the regular expression @samp{fo}, which matches only
the string @samp{fo}. Still trivial. To do something nontrivial, you
need to use one of the special characters. Here is a list of them.

@table @asis
@item @kbd{.}@: @r{(Period)}
is a special character that matches any single character except a newline.
Using concatenation, we can make regular expressions like @samp{a.b}, which
matches any three-character string that begins with @samp{a} and ends with
@samp{b}.@refill

@item @kbd{*}
is not a construct by itself; it is a postfix operator that means to
match the preceding regular expression repetitively as many times as
possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
@samp{o}s).

@samp{*} always applies to the @emph{smallest} possible preceding
expression. Thus, @samp{fo*} has a repeating @samp{o}, not a repeating
@samp{fo}. It matches @samp{f}, @samp{fo}, @samp{foo}, and so on.

The matcher processes a @samp{*} construct by matching, immediately,
as many repetitions as can be found. Then it continues with the rest
of the pattern. If that fails, backtracking occurs, discarding some
of the matches of the @samp{*}-modified construct in case that makes
it possible to match the rest of the pattern. For example, in matching
@samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
tries to match all three @samp{a}s; but the rest of the pattern is
@samp{ar} and there is only @samp{r} left to match, so this try fails.
The next alternative is for @samp{a*} to match only two @samp{a}s.
With this choice, the rest of the regexp matches successfully.@refill

@item @kbd{+}
is a postfix operator, similar to @samp{*} except that it must match
the preceding expression at least once. So, for example, @samp{ca+r}
matches the strings @samp{car} and @samp{caaaar} but not the string
@samp{cr}, whereas @samp{ca*r} matches all three strings.

@item @kbd{?}
is a postfix operator, similar to @samp{*} except that it can match the
preceding expression either once or not at all. For example,
@samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.

@item @kbd{*?}, @kbd{+?}, @kbd{??}
@cindex non-greedy regexp matching
are non-greedy variants of the operators above. The normal operators
@samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
much as they can, as long as the overall regexp can still match. With
a following @samp{?}, they are non-greedy: they will match as little
as possible.

Thus, both @samp{ab*} and @samp{ab*?} can match the string @samp{a}
and the string @samp{abbbb}; but if you try to match them both against
the text @samp{abbb}, @samp{ab*} will match it all (the longest valid
match), while @samp{ab*?} will match just @samp{a} (the shortest
valid match).

Non-greedy operators match the shortest possible string starting at a
given starting point; in a forward search, though, the earliest
possible starting point for a match is always the one chosen. Thus, if
you search for @samp{a.*?$} against the text @samp{abbab} followed by
a newline, it matches the whole string. Since it @emph{can} match
starting at the first @samp{a}, it does.

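The difference can be checked from Lisp. The following is only an
illustrative sketch; it uses the functions @code{string-match} and
@code{match-end}, which this chapter does not otherwise cover, to show
where each match ends.

@example
(string-match "ab*" "abbb")        ; greedy
(match-end 0)
     @result{} 4
(string-match "ab*?" "abbb")       ; non-greedy
(match-end 0)
     @result{} 1
(string-match "a.*?$" "abbab\n")   ; must extend to the end of the line
(match-end 0)
     @result{} 5
@end example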
f3143102 603@item @kbd{\@{@var{n}\@}}
91cf1909
RS
604is a postfix operator that specifies repetition @var{n} times---that
605is, the preceding regular expression must match exactly @var{n} times
606in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
607and nothing else.
8964fec7 608
f3143102 609@item @kbd{\@{@var{n},@var{m}\@}}
610is a postfix operator that specifies repetition between @var{n} and
611@var{m} times---that is, the preceding regular expression must match
612at least @var{n} times, but no more than @var{m} times. If @var{m} is
613omitted, then there is no upper limit, but the preceding regular
614expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
615equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
616@samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.
8a44227a 617
f3143102 618@item @kbd{[ @dots{} ]}
619is a @dfn{character set}, which begins with @samp{[} and is terminated
620by @samp{]}. In the simplest case, the characters between the two
621brackets are what this set can match.
622
623Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
624@samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
625(including the empty string), from which it follows that @samp{c[ad]*r}
626matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.
627
628You can also include character ranges in a character set, by writing the
629starting and ending characters with a @samp{-} between them. Thus,
76dd3692 630@samp{[a-z]} matches any lower-case @acronym{ASCII} letter. Ranges may be
6bf7aab6 631intermixed freely with individual characters, as in @samp{[a-z$%.]},
76dd3692 632which matches any lower-case @acronym{ASCII} letter or @samp{$}, @samp{%} or
633period.
634
635Note that the usual regexp special characters are not special inside a
636character set. A completely different set of special characters exists
637inside character sets: @samp{]}, @samp{-} and @samp{^}.
638
639To include a @samp{]} in a character set, you must make it the first
640character. For example, @samp{[]a]} matches @samp{]} or @samp{a}. To
641include a @samp{-}, write @samp{-} as the first or last character of the
642set, or put it after a range. Thus, @samp{[]-]} matches both @samp{]}
643and @samp{-}.
644
645To include @samp{^} in a set, put it anywhere but at the beginning of
b20a1c88 646the set. (At the beginning, it complements the set---see below.)
647
648When you use a range in case-insensitive search, you should write both
649ends of the range in upper case, or both in lower case, or both should
650be non-letters. The behavior of a mixed-case range such as @samp{A-z}
651is somewhat ill-defined, and it may change in future Emacs versions.
652
f3143102 653@item @kbd{[^ @dots{} ]}
654@samp{[^} begins a @dfn{complemented character set}, which matches any
655character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
76dd3692 656all characters @emph{except} @acronym{ASCII} letters and digits.
657
658@samp{^} is not special in a character set unless it is the first
659character. The character following the @samp{^} is treated as if it
660were first (in other words, @samp{-} and @samp{]} are not special there).
661
662A complemented character set can match a newline, unless newline is
663mentioned as one of the characters not to match. This is in contrast to
664the handling of regexps in programs such as @code{grep}.
665
f3143102 666@item @kbd{^}
667is a special character that matches the empty string, but only at the
668beginning of a line in the text being matched. Otherwise it fails to
669match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
670the beginning of a line.
671
672For historical compatibility reasons, @samp{^} can be used with this
673meaning only at the beginning of the regular expression, or after
674@samp{\(} or @samp{\|}.
675
f3143102 676@item @kbd{$}
677is similar to @samp{^} but matches only at the end of a line. Thus,
678@samp{x+$} matches a string of one @samp{x} or more at the end of a line.
679
680For historical compatibility reasons, @samp{$} can be used with this
681meaning only at the end of the regular expression, or before @samp{\)}
682or @samp{\|}.
683
f3143102 684@item @kbd{\}
685has two functions: it quotes the special characters (including
686@samp{\}), and it introduces additional special constructs.
687
688Because @samp{\} quotes special characters, @samp{\$} is a regular
689expression that matches only @samp{$}, and @samp{\[} is a regular
690expression that matches only @samp{[}, and so on.
691
692See the following section for the special constructs that begin
693with @samp{\}.
694@end table
695
c118d09e 696 Note: for historical compatibility, special characters are treated as
697ordinary ones if they are in contexts where their special meanings make no
698sense. For example, @samp{*foo} treats @samp{*} as ordinary since there is
699no preceding expression on which the @samp{*} can act. It is poor practice
700to depend on this behavior; it is better to quote the special character anyway,
701regardless of where it appears.
702
As a @samp{\} is not special inside a character set, it can
never remove the special meaning of @samp{-} or @samp{]}.  So you
should not quote these characters when they have no special meaning
either.  This would not clarify anything, since backslashes can
legitimately precede these characters where they @emph{have} special
meaning, as in @samp{[^\]} (@code{"[^\\]"} for Lisp string syntax),
which matches any single character except a backslash.
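
  As a rough illustration of the operators described in this section,
here are a few matches written as Emacs Lisp, using the standard
functions @code{string-match} and @code{match-string}.  The helper name
@code{my-first-match} is made up for this sketch and is not part of
Emacs.  In Lisp string syntax, each backslash in the regexp is doubled.

@example
;; Hypothetical helper: return the first match for REGEXP in TEXT,
;; or nil if there is no match.
(defun my-first-match (regexp text)
  (if (string-match regexp text)
      (match-string 0 text)))

(my-first-match "ab*" "abbb")              ; "abbb" (greedy)
(my-first-match "ab*?" "abbb")             ; "a" (non-greedy)
(my-first-match "x\\@{2,3\\@}" "xxxxx")        ; "xxx"
(my-first-match "c[ad]*r" "a caddaar")     ; "caddaar"
(my-first-match "[^\\]" "\\a")             ; "a"
@end example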
710
711@node Regexp Backslash
712@section Backslash in Regular Expressions
6bf7aab6 713
714 For the most part, @samp{\} followed by any character matches only
715that character. However, there are several exceptions: two-character
716sequences starting with @samp{\} that have special meanings. The
717second character in the sequence is always an ordinary character when
718used on its own. Here is a table of @samp{\} constructs.
719
720@table @kbd
721@item \|
722specifies an alternative. Two regular expressions @var{a} and @var{b}
723with @samp{\|} in between form an expression that matches some text if
724either @var{a} matches it or @var{b} matches it. It works by trying to
725match @var{a}, and if that fails, by trying to match @var{b}.
726
727Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
728but no other string.@refill
729
730@samp{\|} applies to the largest possible surrounding expressions. Only a
731surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
732@samp{\|}.@refill
733
734Full backtracking capability exists to handle multiple uses of @samp{\|}.
735
736@item \( @dots{} \)
737is a grouping construct that serves three purposes:
738
739@enumerate
740@item
741To enclose a set of @samp{\|} alternatives for other operations.
742Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
743
744@item
745To enclose a complicated expression for the postfix operators @samp{*},
746@samp{+} and @samp{?} to operate on. Thus, @samp{ba\(na\)*} matches
747@samp{bananana}, etc., with any (zero or more) number of @samp{na}
748strings.@refill
749
750@item
751To record a matched substring for future reference.
752@end enumerate
753
754This last application is not a consequence of the idea of a
755parenthetical grouping; it is a separate feature that is assigned as a
756second meaning to the same @samp{\( @dots{} \)} construct. In practice
757there is usually no conflict between the two meanings; when there is
758a conflict, you can use a ``shy'' group.
759
760@item \(?: @dots{} \)
761@cindex shy group, in regexp
762specifies a ``shy'' group that does not record the matched substring;
763you can't refer back to it with @samp{\@var{d}}. This is useful
764in mechanically combining regular expressions, so that you
765can add groups for syntactic purposes without interfering with
f44aa267 766the numbering of the groups that are meant to be referred to.
767
768@item \@var{d}
0d8b7acb 769@cindex back reference, in regexp
6bf7aab6 770matches the same text that matched the @var{d}th occurrence of a
771@samp{\( @dots{} \)} construct. This is called a @dfn{back
772reference}.
773
774After the end of a @samp{\( @dots{} \)} construct, the matcher remembers
775the beginning and end of the text matched by that construct. Then,
776later on in the regular expression, you can use @samp{\} followed by the
777digit @var{d} to mean ``match the same text matched the @var{d}th time
778by the @samp{\( @dots{} \)} construct.''
779
780The strings matching the first nine @samp{\( @dots{} \)} constructs
781appearing in a regular expression are assigned numbers 1 through 9 in
782the order that the open-parentheses appear in the regular expression.
783So you can use @samp{\1} through @samp{\9} to refer to the text matched
784by the corresponding @samp{\( @dots{} \)} constructs.
785
786For example, @samp{\(.*\)\1} matches any newline-free string that is
787composed of two identical halves. The @samp{\(.*\)} matches the first
788half, which may be anything, but the @samp{\1} that follows must match
789the same exact text.
790
791If a particular @samp{\( @dots{} \)} construct matches more than once
792(which can easily happen if it is followed by @samp{*}), only the last
793match is recorded.
794
795@item \`
796matches the empty string, but only at the beginning of the string or
797buffer (or its accessible portion) being matched against.
798
799@item \'
800matches the empty string, but only at the end of the string or buffer
801(or its accessible portion) being matched against.
802
803@item \=
804matches the empty string, but only at point.
805
806@item \b
807matches the empty string, but only at the beginning or
808end of a word. Thus, @samp{\bfoo\b} matches any occurrence of
809@samp{foo} as a separate word. @samp{\bballs?\b} matches
810@samp{ball} or @samp{balls} as a separate word.@refill
811
812@samp{\b} matches at the beginning or end of the buffer
813regardless of what text appears next to it.
814
815@item \B
816matches the empty string, but @emph{not} at the beginning or
817end of a word.
818
819@item \<
820matches the empty string, but only at the beginning of a word.
821@samp{\<} matches at the beginning of the buffer only if a
822word-constituent character follows.
823
824@item \>
825matches the empty string, but only at the end of a word. @samp{\>}
826matches at the end of the buffer only if the contents end with a
827word-constituent character.
828
829@item \w
830matches any word-constituent character. The syntax table
831determines which characters these are. @xref{Syntax}.
832
833@item \W
834matches any character that is not a word-constituent.
835
6ccc7fbc 836@item \_<
837matches the empty string, but only at the beginning of a symbol.
838A symbol is a sequence of one or more symbol-constituent characters.
839A symbol-constituent character is a character whose syntax is either
0cb23e1d 840@samp{w} or @samp{_}. @samp{\_<} matches at the beginning of the
49561cf6 841buffer only if a symbol-constituent character follows.
842
843@item \_>
844matches the empty string, but only at the end of a symbol. @samp{\_>}
845matches at the end of the buffer only if the contents end with a
846symbol-constituent character.
6ccc7fbc 847
848@item \s@var{c}
849matches any character whose syntax is @var{c}. Here @var{c} is a
850character that designates a particular syntax class: thus, @samp{w}
851for word constituent, @samp{-} or @samp{ } for whitespace, @samp{.}
852for ordinary punctuation, etc. @xref{Syntax}.
853
854@item \S@var{c}
855matches any character whose syntax is not @var{c}.
856
857@cindex categories of characters
858@cindex characters which belong to a specific language
859@findex describe-categories
860@item \c@var{c}
861matches any character that belongs to the category @var{c}. For
862example, @samp{\cc} matches Chinese characters, @samp{\cg} matches
863Greek characters, etc. For the description of the known categories,
864type @kbd{M-x describe-categories @key{RET}}.
865
866@item \C@var{c}
867matches any character that does @emph{not} belong to category
868@var{c}.
869@end table
870
871 The constructs that pertain to words and syntax are controlled by the
872setting of the syntax table (@pxref{Syntax}).
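
  The following sketch tries several of these constructs from Lisp.
It assumes the standard syntax table, in which letters are word
constituents; @code{my-match-group} is a hypothetical helper defined
only for this illustration.

@example
;; Hypothetical helper: return what group N of REGEXP matched
;; in TEXT, or nil if there is no match.
(defun my-match-group (regexp text n)
  (if (string-match regexp text)
      (match-string n text)))

;; Alternatives inside a group, followed by `x'.
(my-match-group "\\(foo\\|bar\\)x" "a barx" 0)          ; "barx"
;; A shy group is not numbered, so `x+' is group 1.
(my-match-group "\\(?:foo\\|bar\\)\\(x+\\)" "barxx" 1)  ; "xx"
;; Back reference: a word, a space, and the same word again.
(my-match-group "\\(\\w+\\) \\1" "say hello hello" 1)   ; "hello"
;; Word boundaries: the `ball' inside `football' is skipped.
(string-match "\\bballs?\\b" "football balls")          ; 9
@end example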
873
874@node Regexp Example
875@section Regular Expression Example
876
877 Here is a complicated regexp---a simplified version of the regexp
878that Emacs uses, by default, to recognize the end of a sentence
879together with any whitespace that follows. We show its Lisp syntax to
880distinguish the spaces from the tab characters. In Lisp syntax, the
881string constant begins and ends with a double-quote. @samp{\"} stands
882for a double-quote as part of the regexp, @samp{\\} for a backslash as
883part of the regexp, @samp{\t} for a tab, and @samp{\n} for a newline.
884
885@example
"[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"
887@end example
888
889@noindent
890This contains four parts in succession: a character set matching
891period, @samp{?}, or @samp{!}; a character set matching
892close-brackets, quotes, or parentheses, repeated zero or more times; a
893set of alternatives within backslash-parentheses that matches either
894end-of-line, a space at the end of a line, a tab, or two spaces; and a
895character set matching whitespace characters, repeated any number of
896times.
6bf7aab6 897
898 To enter the same regexp in incremental search, you would type
899@key{TAB} to enter a tab, and @kbd{C-j} to enter a newline. You would
900also type single backslashes as themselves, instead of doubling them
901for Lisp syntax. In commands that use ordinary minibuffer input to
902read a regexp, you would quote the @kbd{C-j} by preceding it with a
903@kbd{C-q} to prevent @kbd{C-j} from exiting the minibuffer.
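
  To see the regexp at work, one could apply it to a short string from
Lisp.  This is only a sketch: the function name
@code{my-rest-after-sentence} and the sample text are invented for the
illustration, and the last alternative is written with two spaces, as
the description above says.

@example
;; Hypothetical helper: return the part of TEXT that follows the
;; first sentence end, or nil if no sentence end is found.
(defun my-rest-after-sentence (text)
  (let ((my-sentence-end
         "[.?!][]\"')]*\\($\\| $\\|\t\\|  \\)[ \t\n]*"))
    (if (string-match my-sentence-end text)
        (substring text (match-end 0)))))

;; The match starts at `!', takes the closing quote, and runs
;; through the two spaces.
(my-rest-after-sentence "He said \"Stop!\"  Then he left.")
;; => "Then he left."
@end example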
6bf7aab6 904
f2fd3623 905@node Search Case
906@section Searching and Case
907
  Incremental searches in Emacs normally ignore the case of the text
they are searching through, if you specify the text in lower case.
Thus, if you search for @samp{foo}, both @samp{Foo} and @samp{foo}
match.  Regexps, and in particular character sets, are included:
@samp{[ab]} would match @samp{a} or @samp{A} or @samp{b} or
@samp{B}.@refill
914
915 An upper-case letter anywhere in the incremental search string makes
916the search case-sensitive. Thus, searching for @samp{Foo} does not find
917@samp{foo} or @samp{FOO}. This applies to regular expression search as
918well as to string search. The effect ceases if you delete the
919upper-case letter from the search string.
920
921 Typing @kbd{M-c} within an incremental search toggles the case
922sensitivity of that search. The effect does not extend beyond the
923current incremental search to the next one, but it does override the
924effect of including an upper-case letter in the current search.
925
926@vindex case-fold-search
f2fd3623 927@vindex default-case-fold-search
928 If you set the variable @code{case-fold-search} to @code{nil}, then
929all letters must match exactly, including case. This is a per-buffer
930variable; altering the variable affects only the current buffer, but
931there is a default value in @code{default-case-fold-search} that you
932can also set. @xref{Locals}. This variable applies to nonincremental
933searches also, including those performed by the replace commands
934(@pxref{Replace}) and the minibuffer history matching commands
935(@pxref{Minibuffer History}).
a57bfc9f 936
937 Several related variables control case-sensitivity of searching and
938matching for specific commands or activities. For instance,
939@code{tags-case-fold-search} controls case sensitivity for
940@code{find-tag}. To find these variables, do @kbd{M-x
941apropos-variable @key{RET} case-fold-search @key{RET}}.
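
  Lisp programs see the same variable.  As a small sketch, binding
@code{case-fold-search} around a call to @code{string-match} changes
whether @samp{foo} matches @samp{Foo}.  Note that the rule about
upper-case letters in the search string is applied by the search
commands, not by @code{string-match} itself.  The function name
@code{my-match-p} is hypothetical.

@example
;; Hypothetical helper: match REGEXP against TEXT with case
;; folding turned on or off according to FOLD.
(defun my-match-p (regexp text fold)
  (let ((case-fold-search fold))
    (if (string-match regexp text) t)))

(my-match-p "foo" "Foo" t)     ; t
(my-match-p "foo" "Foo" nil)   ; nil
(my-match-p "Foo" "foo" t)     ; t
@end example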
942
f2fd3623 943@node Replace
944@section Replacement Commands
945@cindex replacement
946@cindex search-and-replace commands
947@cindex string substitution
948@cindex global substitution
949
950 Global search-and-replace operations are not needed often in Emacs,
951but they are available. In addition to the simple @kbd{M-x
f98a8ffd 952replace-string} command which replaces all occurrences,
7860977a 953there is @kbd{M-%} (@code{query-replace}), which presents each occurrence
a76af65d 954of the pattern and asks you whether to replace it.
955
956 The replace commands normally operate on the text from point to the
957end of the buffer; however, in Transient Mark mode (@pxref{Transient
958Mark}), when the mark is active, they operate on the region. The
5c5245f7 959basic replace commands replace one string (or regexp) with one
960replacement string. It is possible to perform several replacements in
961parallel using the command @code{expand-region-abbrevs}
962(@pxref{Expanding Abbrevs}).
963
964@menu
965* Unconditional Replace:: Replacing all matches for a string.
966* Regexp Replace:: Replacing all matches for a regexp.
967* Replacement and Case:: How replacements preserve case of letters.
968* Query Replace:: How to use querying.
969@end menu
970
971@node Unconditional Replace, Regexp Replace, Replace, Replace
972@subsection Unconditional Replacement
973@findex replace-string
974
975@table @kbd
976@item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
977Replace every occurrence of @var{string} with @var{newstring}.
978@end table
979
980 To replace every instance of @samp{foo} after point with @samp{bar},
981use the command @kbd{M-x replace-string} with the two arguments
982@samp{foo} and @samp{bar}. Replacement happens only in the text after
983point, so if you want to cover the whole buffer you must go to the
984beginning first. All occurrences up to the end of the buffer are
985replaced; to limit replacement to part of the buffer, narrow to that
986part of the buffer before doing the replacement (@pxref{Narrowing}).
987In Transient Mark mode, when the region is active, replacement is
988limited to the region (@pxref{Transient Mark}).
989
990 When @code{replace-string} exits, it leaves point at the last
991occurrence replaced. It sets the mark to the prior position of point
992(where the @code{replace-string} command was issued); use @kbd{C-u
993C-@key{SPC}} to move back there.
994
995 A numeric argument restricts replacement to matches that are surrounded
996by word boundaries. The argument's value doesn't matter.
997
998 What if you want to exchange @samp{x} and @samp{y}: replace every @samp{x} with a @samp{y} and vice versa? You can do it this way:
999
1000@example
1001M-x replace-string @key{RET} x @key{RET} @@TEMP@@ @key{RET}
1002M-< M-x replace-string @key{RET} y @key{RET} x @key{RET}
1003M-< M-x replace-string @key{RET} @@TEMP@@ @key{RET} y @key{RET}
1004@end example
1005
1006@noindent
1007This works provided the string @samp{@@TEMP@@} does not appear
1008in your text.
1009
1010@node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
1011@subsection Regexp Replacement
f2fd3623 1012@findex replace-regexp
1013
1014 The @kbd{M-x replace-string} command replaces exact matches for a
1015single string. The similar command @kbd{M-x replace-regexp} replaces
1016any match for a specified pattern.
1017
1018@table @kbd
1019@item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
1020Replace every match for @var{regexp} with @var{newstring}.
1021@end table
1022
0d8b7acb 1023@cindex back reference, in regexp replacement
1024 In @code{replace-regexp}, the @var{newstring} need not be constant:
1025it can refer to all or part of what is matched by the @var{regexp}.
1026@samp{\&} in @var{newstring} stands for the entire match being
1027replaced. @samp{\@var{d}} in @var{newstring}, where @var{d} is a
1028digit, stands for whatever matched the @var{d}th parenthesized
5a7f4c1b 1029grouping in @var{regexp}. (This is called a ``back reference.'')
1030@samp{\#} refers to the count of replacements already made in this
1031command, as a decimal number. In the first replacement, @samp{\#}
1032stands for @samp{0}; in the second, for @samp{1}; and so on. For
1033example,
1034
1035@example
1036M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
1037@end example
1038
1039@noindent
1040replaces (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
1041with @samp{cddr-safe}.
1042
1043@example
1044M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
1045@end example
1046
1047@noindent
1048performs the inverse transformation. To include a @samp{\} in the
1049text to replace with, you must enter @samp{\\}.
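
  The same replacements can be tried on a string from Lisp with
@code{replace-regexp-in-string}, which understands @samp{\&} and
@samp{\@var{d}} in its replacement argument.  This is a sketch: the
function names beginning with @code{my-} are invented here, and the
fourth argument @code{t} turns off case conversion.

@example
;; Hypothetical helpers mirroring the two commands above.
(defun my-add-safe (text)
  (replace-regexp-in-string "c[ad]+r" "\\&-safe" text t))

(defun my-remove-safe (text)
  (replace-regexp-in-string "\\(c[ad]+r\\)-safe" "\\1" text t))

(my-add-safe "(cadr x) (cddr y)")
;; => "(cadr-safe x) (cddr-safe y)"
(my-remove-safe "(cadr-safe x)")
;; => "(cadr x)"
@end example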
1050
1051 If you want to enter part of the replacement string by hand each
1052time, use @samp{\?} in the replacement string. Each replacement will
1053ask you to edit the replacement string in the minibuffer, putting
1054point where the @samp{\?} was.
1055
1056 The remainder of this subsection is intended for specialized tasks
1057and requires knowledge of Lisp. Most readers can skip it.
1058
1059 You can use Lisp expressions to calculate parts of the
1060replacement string. To do this, write @samp{\,} followed by the
1061expression in the replacement string. Each replacement calculates the
1062value of the expression and converts it to text without quoting (if
1063it's a string, this means using the string's contents), and uses it in
1064the replacement string in place of the expression itself. If the
1065expression is a symbol, one space in the replacement string after the
1066symbol name goes with the symbol name, so the value replaces them
1067both.
1068
1069 Inside such an expression, you can use some special sequences.
1070@samp{\&} and @samp{\@var{n}} refer here, as usual, to the entire
1071match as a string, and to a submatch as a string. @var{n} may be
1072multiple digits, and the value of @samp{\@var{n}} is @code{nil} if
1073subexpression @var{n} did not match. You can also use @samp{\#&} and
1074@samp{\#@var{n}} to refer to those matches as numbers (this is valid
1075when the match or submatch has the form of a numeral). @samp{\#} here
1076too stands for the number of already-completed replacements.
1077
1078 Repeating our example to exchange @samp{x} and @samp{y}, we can thus
1079do it also this way:
1080
1081@example
1082M-x replace-regexp @key{RET} \(x\)\|y @key{RET}
1083\,(if \1 "y" "x") @key{RET}
1084@end example
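
  A comparable one-pass exchange can be sketched in Lisp by giving
@code{replace-regexp-in-string} a function as the replacement; the
function is called with each matched string.  The name
@code{my-swap-x-y} is hypothetical.

@example
;; Hypothetical helper: exchange every `x' and `y' in TEXT.
(defun my-swap-x-y (text)
  (replace-regexp-in-string
   "x\\|y"
   (lambda (match) (if (string= match "x") "y" "x"))
   text t t))

(my-swap-x-y "x = y + xx")   ; "y = x + yy"
@end example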
1085
074f1b8b 1086 For computing replacement strings for @samp{\,}, the @code{format}
5485e3ec 1087function is often useful (@pxref{Formatting Strings,,, elisp, The Emacs
074f1b8b 1088Lisp Reference Manual}). For example, to add consecutively numbered
1089strings like @samp{ABC00042} to columns 73 @w{to 80} (unless they are
1090already occupied), you can use
1091
1092@example
1093M-x replace-regexp @key{RET} ^.\@{0,72\@}$ @key{RET}
1094\,(format "%-72sABC%05d" \& \#) @key{RET}
1095@end example
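
  To see what this @code{format} call produces, it can be evaluated on
its own.  In the sketch below, the sample line and the count 42 stand
in for @samp{\&} and @samp{\#}; @code{my-number-line} is a made-up
name.

@example
;; Hypothetical helper: pad LINE on the right to 72 columns,
;; then append the tag and a five-digit count.
(defun my-number-line (line count)
  (format "%-72sABC%05d" line count))

(length (my-number-line "some text" 42))         ; 80
(substring (my-number-line "some text" 42) 72)   ; "ABC00042"
@end example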
1096
1097@node Replacement and Case, Query Replace, Regexp Replace, Replace
1098@subsection Replace Commands and Case
1099
1100 If the first argument of a replace command is all lower case, the
e7ad2d23 1101command ignores case while searching for occurrences to
1102replace---provided @code{case-fold-search} is non-@code{nil}. If
1103@code{case-fold-search} is set to @code{nil}, case is always significant
1104in all searches.
1105
1106@vindex case-replace
1107 In addition, when the @var{newstring} argument is all or partly lower
1108case, replacement commands try to preserve the case pattern of each
1109occurrence. Thus, the command
1110
1111@example
1112M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
1113@end example
1114
1115@noindent
1116replaces a lower case @samp{foo} with a lower case @samp{bar}, an
1117all-caps @samp{FOO} with @samp{BAR}, and a capitalized @samp{Foo} with
@samp{Bar}.  (These three alternatives---lower case, all caps, and
capitalized---are the only ones that @code{replace-string} can
distinguish.)
1121
1122 If upper-case letters are used in the replacement string, they remain
1123upper case every time that text is inserted. If upper-case letters are
1124used in the first argument, the second argument is always substituted
1125exactly as given, with no case conversion. Likewise, if either
1126@code{case-replace} or @code{case-fold-search} is set to @code{nil},
1127replacement is done without case conversion.
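
  Lisp programs get comparable case conversion from
@code{replace-match} when its second argument is omitted or
@code{nil}.  The following sketch, with the invented name
@code{my-replace-foo}, does not consult @code{case-replace}; it only
illustrates the three case patterns.

@example
;; Hypothetical helper: replace `foo' with `bar' in TEXT,
;; letting `replace-match' adapt the case of each replacement.
(defun my-replace-foo (text)
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search t))
      (while (search-forward "foo" nil t)
        (replace-match "bar")))
    (buffer-string)))

(my-replace-foo "foo Foo FOO")   ; "bar Bar BAR"
@end example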
1128
1129@node Query Replace,, Replacement and Case, Replace
1130@subsection Query Replace
1131@cindex query replace
1132
1133@table @kbd
1134@item M-% @var{string} @key{RET} @var{newstring} @key{RET}
1135@itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
1136Replace some occurrences of @var{string} with @var{newstring}.
1137@item C-M-% @var{regexp} @key{RET} @var{newstring} @key{RET}
1138@itemx M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
1139Replace some matches for @var{regexp} with @var{newstring}.
1140@end table
1141
1142@kindex M-%
1143@findex query-replace
1144 If you want to change only some of the occurrences of @samp{foo} to
1145@samp{bar}, not all of them, then you cannot use an ordinary
1146@code{replace-string}. Instead, use @kbd{M-%} (@code{query-replace}).
1147This command finds occurrences of @samp{foo} one by one, displays each
1148occurrence and asks you whether to replace it. Aside from querying,
1149@code{query-replace} works just like @code{replace-string}. It
1150preserves case, like @code{replace-string}, provided
1151@code{case-replace} is non-@code{nil}, as it normally is. A numeric
1152argument means consider only occurrences that are bounded by
1153word-delimiter characters.
1154
1155@kindex C-M-%
1156@findex query-replace-regexp
b20a1c88 1157 @kbd{C-M-%} performs regexp search and replace (@code{query-replace-regexp}).
1158It works like @code{replace-regexp} except that it queries
1159like @code{query-replace}.
1160
1161@cindex faces for highlighting query replace
1162 These commands highlight the current match using the face
1163@code{query-replace}. They highlight other matches using
1164@code{lazy-highlight} just like incremental search (@pxref{Incremental
1165Search}).
6bf7aab6 1166
1167 The characters you can type when you are shown a match for the string
1168or regexp are:
1169
1170@ignore @c Not worth it.
1171@kindex SPC @r{(query-replace)}
1172@kindex DEL @r{(query-replace)}
1173@kindex , @r{(query-replace)}
1174@kindex RET @r{(query-replace)}
1175@kindex . @r{(query-replace)}
1176@kindex ! @r{(query-replace)}
1177@kindex ^ @r{(query-replace)}
1178@kindex C-r @r{(query-replace)}
1179@kindex C-w @r{(query-replace)}
1180@kindex C-l @r{(query-replace)}
1181@end ignore
1182
1183@c WideCommands
1184@table @kbd
1185@item @key{SPC}
1186to replace the occurrence with @var{newstring}.
1187
1188@item @key{DEL}
1189to skip to the next occurrence without replacing this one.
1190
1191@item , @r{(Comma)}
1192to replace this occurrence and display the result. You are then asked
1193for another input character to say what to do next. Since the
1194replacement has already been made, @key{DEL} and @key{SPC} are
1195equivalent in this situation; both move to the next occurrence.
1196
1197You can type @kbd{C-r} at this point (see below) to alter the replaced
1198text. You can also type @kbd{C-x u} to undo the replacement; this exits
1199the @code{query-replace}, so if you want to do further replacement you
1200must use @kbd{C-x @key{ESC} @key{ESC} @key{RET}} to restart
1201(@pxref{Repetition}).
1202
1203@item @key{RET}
1204to exit without doing any more replacements.
1205
1206@item .@: @r{(Period)}
1207to replace this occurrence and then exit without searching for more
1208occurrences.
1209
1210@item !
1211to replace all remaining occurrences without asking again.
1212
1213@item ^
1214to go back to the position of the previous occurrence (or what used to
1215be an occurrence), in case you changed it by mistake or want to
1216reexamine it.
1217
1218@item C-r
1219to enter a recursive editing level, in case the occurrence needs to be
1220edited rather than just replaced with @var{newstring}. When you are
1221done, exit the recursive editing level with @kbd{C-M-c} to proceed to
1222the next occurrence. @xref{Recursive Edit}.
1223
1224@item C-w
1225to delete the occurrence, and then enter a recursive editing level as in
1226@kbd{C-r}. Use the recursive edit to insert text to replace the deleted
1227occurrence of @var{string}. When done, exit the recursive editing level
1228with @kbd{C-M-c} to proceed to the next occurrence.
1229
1230@item e
1231to edit the replacement string in the minibuffer. When you exit the
1232minibuffer by typing @key{RET}, the minibuffer contents replace the
1233current occurrence of the pattern. They also become the new
1234replacement string for any further occurrences.
1235
1236@item C-l
1237to redisplay the screen. Then you must type another character to
1238specify what to do with this occurrence.
1239
1240@item C-h
1241to display a message summarizing these options. Then you must type
1242another character to specify what to do with this occurrence.
1243@end table
1244
  Some other characters are aliases for the ones listed above: @kbd{y},
@kbd{n} and @kbd{q} are equivalent to @key{SPC}, @key{DEL} and
@key{RET}, respectively.
1248
1249 Aside from this, any other character exits the @code{query-replace},
1250and is then reread as part of a key sequence. Thus, if you type
1251@kbd{C-k}, it exits the @code{query-replace} and then kills to end of
1252line.
1253
1254 To restart a @code{query-replace} once it is exited, use @kbd{C-x
1255@key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
1256used the minibuffer to read its arguments. @xref{Repetition, C-x ESC
1257ESC}.
1258
1259 @xref{Operating on Files}, for the Dired @kbd{Q} command which
1260performs query replace on selected files. See also @ref{Transforming
1261File Names}, for Dired commands to rename, copy, or link files by
1262replacing regexp matches in file names.
6bf7aab6 1263
f2fd3623 1264@node Other Repeating Search
1265@section Other Search-and-Loop Commands
1266
1267 Here are some other commands that find matches for a regular
1268expression. They all ignore case in matching, if the pattern contains
1269no upper-case letters and @code{case-fold-search} is non-@code{nil}.
1270Aside from @code{occur} and its variants, all operate on the text from
1271point to the end of the buffer, or on the active region in Transient
1272Mark mode.
1273
1274@findex list-matching-lines
1275@findex occur
8cab3b9a 1276@findex multi-occur
5c5245f7 1277@findex multi-occur-in-matching-buffers
9c99d206 1278@findex how-many
1279@findex delete-non-matching-lines
1280@findex delete-matching-lines
1281@findex flush-lines
1282@findex keep-lines
1283
1284@table @kbd
1285@item M-x occur @key{RET} @var{regexp} @key{RET}
Display a list showing each line in the buffer that contains a match
for @var{regexp}.  To limit the search to part of the buffer, narrow
to that part (@pxref{Narrowing}).  A numeric argument @var{n}
specifies that @var{n} lines of context are to be displayed before and
after each matching line.  Currently, @code{occur} cannot correctly
handle multiline matches.
6bf7aab6
DL
1292
1293@kindex RET @r{(Occur mode)}
74308f28
RS
1294@kindex o @r{(Occur mode)}
1295@kindex C-o @r{(Occur mode)}
The buffer @samp{*Occur*} containing the output serves as a menu for
finding the occurrences in their original context.  Click
@kbd{Mouse-2} on an occurrence listed in @samp{*Occur*}, or position
point there and type @key{RET}; this switches to the buffer that was
searched and moves point to the original of the chosen occurrence.
@kbd{o} and @kbd{C-o} display the match in another window; @kbd{C-o}
does not select it.

After using @kbd{M-x occur}, you can use @code{next-error} to visit
the occurrences found, one by one.  @xref{Compilation Mode}.

@item M-x list-matching-lines
Synonym for @kbd{M-x occur}.

@item M-x multi-occur @key{RET} @var{buffers} @key{RET} @var{regexp} @key{RET}
This command is just like @code{occur}, except that it can search
through multiple buffers.  It asks you to specify the buffer names one
by one.

@item M-x multi-occur-in-matching-buffers @key{RET} @var{bufregexp} @key{RET} @var{regexp} @key{RET}
This command is similar to @code{multi-occur}, except that the buffers
to search are specified by a regular expression that matches visited
file names.  With a prefix argument, it uses the regular expression to
match buffer names instead.

@item M-x how-many @key{RET} @var{regexp} @key{RET}
Print the number of matches for @var{regexp} that exist in the buffer
after point.  In Transient Mark mode, if the region is active, the
command operates on the region instead.

@item M-x flush-lines @key{RET} @var{regexp} @key{RET}
This command deletes each line that contains a match for @var{regexp},
operating on the text after point; it deletes the current line
if it contains a match starting after point.  In Transient Mark mode,
if the region is active, the command operates on the region instead;
it deletes a line partially contained in the region if it contains a
match entirely contained in the region.

If a match is split across lines, @code{flush-lines} deletes all those
lines.  It deletes the lines before starting to look for the next
match; hence, it ignores a match starting on the same line at which
another match ended.

@item M-x keep-lines @key{RET} @var{regexp} @key{RET}
This command deletes each line that @emph{does not} contain a match for
@var{regexp}, operating on the text after point; if point is not at the
beginning of a line, it always keeps the current line.  In Transient
Mark mode, if the region is active, the command operates on the region
instead; it never deletes lines that are only partially contained in
the region (a newline that ends a line counts as part of that line).

If a match is split across lines, this command keeps all those lines.
@end table
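
  The commands in the table above can also be called from Lisp.  The
following is a rough sketch rather than a recommended idiom: it
assumes that the regexp is passed as the first argument, that
@code{occur} accepts the number of context lines as an optional second
argument, and that the other commands, given no further arguments,
operate from point to the end of the buffer.  The buffer name
@samp{search-demo} and its contents are invented for the example.

@example
(let ((buf (get-buffer-create "search-demo")))
  (with-current-buffer buf
    (erase-buffer)
    (insert "alpha\nbeta\nalpha beta\ngamma\n")
    ;; Like `C-u 1 M-x occur': list the matching lines in `*Occur*'
    ;; with one line of context.  `occur' searches the whole buffer.
    (occur "alpha" 1))
  (with-current-buffer buf
    (goto-char (point-min))
    ;; Count the matches after point.
    (message "%d matches" (how-many "alpha"))
    ;; Delete each line after point that contains a match for "beta".
    (flush-lines "beta")
    ;; Of the remaining lines, keep only those that match "^a".
    (goto-char (point-min))
    (keep-lines "^a")
    (buffer-string)))
@end example

  Evaluating this should leave only the line @samp{alpha} in the
buffer.  Moving point to the beginning of the buffer before each
command makes it cover the whole text, since these commands ignore
anything before point.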

@ignore
   arch-tag: fd9d8e77-66af-491c-b212-d80999613e3e
@end ignore